Tomlinson-Harashima Cluster-Based Precoders for Cell-Free MU-MIMO Networks

André R. Flores1, Rodrigo C. de Lamare1,2 and Kumar Vijay Mishra3
1Centre for Telecommunications Studies, Pontifical Catholic University of Rio de Janeiro, Brazil
2Department of Electronic Engineering, University of York, United Kingdom
3United States DEVCOM Army Research Laboratory, Adelphi, MD 20783 USA
Emails: {andre.flores, delamare}@cetuc.puc-rio.br, [email protected]
Abstract

Cell-free (CF) multiple-input multiple-output (MIMO) systems generally employ linear precoding techniques to mitigate the effects of multiuser interference. However, nonlinear precoders that employ perturbation and a modulo operation usually reduce the power loss and improve the efficiency and precoding accuracy relative to linear precoders. In this work, we propose nonlinear user-centric precoders for CF MIMO, wherein different clusters of access points (APs) serve different users in CF multiple-antenna networks. Each cluster of APs is selected based on large-scale fading coefficients. The clustering procedure results in a sparse nonlinear precoder. We further devise a reduced-dimension nonlinear precoder, where clusters of users are created to reduce the complexity of the nonlinear precoder, the amount of required signaling, and the number of users considered in each precoder computation. Numerical experiments show that the proposed nonlinear techniques for CF systems outperform their linear counterparts.

Index Terms:
Cell-free wireless networks, multiple-antenna systems, multiuser interference, nonlinear precoding, Tomlinson-Harashima precoding.

I Introduction

Coordinated base stations (BSs) have been deployed worldwide to establish cellular network services. However, wireless applications are evolving constantly with an increasing demand for more resources [1, 2, 3, 4]. Meeting the high throughput and quality-of-service required by future networks would call for further densification of BSs, but this approach is impractical. As an alternative, cell-free (CF) multiple-input multiple-output (MIMO) systems have emerged as a potential solution to improve the performance and satisfy the throughput requirements of future wireless networks [5, 6].

Compared to conventional BS-based networks, CF multiuser MIMO (MU-MIMO) systems employ multiple APs distributed geographically over the area of interest. A central processing unit (CPU), which may be located at the cloud server, coordinates the APs. The distributed deployment of CF networks yields higher coverage than BSs with collocated antennas [7]. In addition, CF MU-MIMO has been shown to provide increased throughput per user [8, 9] as well as better performance in terms of energy efficiency [10, 11, 12].

Further, like BS-based systems, CF MU-MIMO serves multiple users over the same time-frequency resources. To avoid the multiuser interference (MUI) in the downlink, a precoder is often implemented at the transmitter. Prior works on CF MU-MIMO have focused on linear precoding techniques such as matched filter (MF), zero-forcing (ZF) [13], and minimum mean-square error (MMSE) [14] techniques. However, it is well-known that nonlinear precoders [15, 16, 17] have the potential to outperform their linear counterparts [18, 19, 20, 21, 22, 23].

State-of-the-art works on CF MU-MIMO systems have proposed network-wide (NW) precoders [24, 13, 14, 25], but these techniques entail a very high signaling load. Moreover, NW approaches demand high computational complexity because they require the inversion of a matrix whose size increases with the number of APs and users. To mitigate this problem, NW precoders that employ AP and user clustering have been proposed [26] for lower computational complexity and signaling load. For instance, in [20] the number of APs is curtailed to reduce the signaling load. In [27], scalable MMSE combiners and precoders are developed. Very recently, a regularized ZF precoder based on subsets of users was proposed in [28] to judiciously use the available resources.

Unlike previous works [29, 20, 23], we propose nonlinear precoding techniques for CF MU-MIMO systems. The proposed techniques are based on the well-established Tomlinson-Harashima precoder (THP) [30], which may be interpreted as the transmit analog of the successive interference cancellation (SIC) employed at the receiver [31]. Essentially, THP employs a nonlinear modulo operation that reduces the power penalty associated with the linear precoders thereby enhancing the overall performance. Additionally, a cluster-based approach is devised based on a user selection matrix, resulting in a user-centric nonlinear precoder and addressing the gap in nonlinear structures for cluster-based precoders in CF networks. The resulting precoder is sparse and its complexity is reduced by employing clusters of users, thereby reducing the amount of signaling and the computational load. Our numerical experiments show that the TH precoding techniques outperform their linear counterparts.

The rest of this paper is organized as follows. In the next section, we describe the system model of the CF MU-MIMO communications. We derive the proposed cluster-based nonlinear precoding techniques in Section III. We introduce the metric to evaluate the performance of the proposed precoders in Section IV. We validate our model and methods via numerical experiments in Section V. We conclude in Section VI.

II System Model

Consider the downlink of a CF MIMO system, where $N$ geographically distributed APs serve $K$ users, each equipped with a single omnidirectional antenna. A central processing unit (CPU) located at the cloud server is connected to the APs. The data are transmitted over a flat-fading channel $\mathbf{G}\in\mathbb{C}^{N\times K}$. The $(n,k)$-th element of matrix $\mathbf{G}$ is the channel coefficient between the $n$-th AP and the $k$-th user, i.e., $g_{n,k}=\sqrt{\zeta_{n,k}}h_{n,k}$, where $\zeta_{n,k}$ is the large-scale fading coefficient that models the path loss and shadowing effects, and $h_{n,k}$ represents the small-scale fading coefficient. The coefficients $h_{n,k}$ are modeled as independent and identically distributed (i.i.d.) random variables with complex Gaussian distribution $\mathcal{CN}\left(0,1\right)$.

Denote the transmit signal by $\mathbf{x}\in\mathbb{C}^{N}$, which obeys the transmit power constraint $\mathbb{E}\left[\lVert\mathbf{x}\rVert^{2}\right]\leq P_{t}$, where $\mathbb{E}[\cdot]$ denotes the statistical expectation. Then, the $K\times 1$ received signal vector is

$$\mathbf{y}=\mathbf{G}^{\text{T}}\mathbf{x}+\mathbf{n}, \tag{1}$$

where $(\cdot)^{\text{T}}$ denotes the transpose and $\mathbf{n}\in\mathbb{C}^{K}$ is the additive white Gaussian noise (AWGN) that follows the distribution $\mathcal{CN}\left(\mathbf{0},\sigma_{n}^{2}\mathbf{I}\right)$.

The system employs the time division duplexing (TDD) protocol and therefore the channels can be estimated employing the channel reciprocity property and pilot training [32]. After receiving the pilots, the CPU computes the channel estimate $\hat{\mathbf{G}}^{\text{T}}=\left[\hat{\mathbf{g}}_{1},\hat{\mathbf{g}}_{2},\cdots,\hat{\mathbf{g}}_{K}\right]^{\text{T}}\in\mathbb{C}^{K\times N}$, where the $(n,k)$-th element of $\hat{\mathbf{G}}$ is

$$\hat{g}_{n,k}=\sqrt{\zeta_{n,k}}\left(\sqrt{1-\sigma_{e}^{2}}h_{n,k}+\sigma_{e}\tilde{h}_{n,k}\right), \tag{2}$$

where $\hat{g}_{n,k}$ is the channel estimate between the $n$-th AP and the $k$-th user; $\tilde{h}_{n,k}$ are i.i.d. complex Gaussian random variables that follow the distribution $\mathcal{CN}\left(0,1\right)$ (independent from $h_{n,k}$) and model the errors in the channel estimates; and $\sigma_{e}$ represents the quality of the channel state information (CSI). The error affecting the channel estimate $\hat{g}_{n,k}$ is $\tilde{g}_{n,k}=\sigma_{e}\sqrt{\zeta_{n,k}}\tilde{h}_{n,k}$.
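As an illustration of the channel model and of the estimate in (2), the following minimal sketch draws one realization of the true channel, its estimate, and the estimation error. The function names, the nested-list layout, and the toy values of $\zeta_{n,k}$ and $\sigma_e$ are assumptions made for this example only.

```python
import math
import random

def complex_gaussian(rng):
    """Draw one CN(0, 1) sample: real and imaginary parts each have variance 1/2."""
    return complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))

def draw_channel(zeta, sigma_e, rng):
    """Return (G, G_hat, G_tilde) as N x K nested lists, using g = sqrt(zeta) h and Eq. (2)."""
    G, G_hat, G_tilde = [], [], []
    for zeta_row in zeta:
        g_row, gh_row, gt_row = [], [], []
        for z in zeta_row:
            h = complex_gaussian(rng)        # small-scale fading coefficient
            h_err = complex_gaussian(rng)    # independent estimation-error term
            g_row.append(math.sqrt(z) * h)
            gt_row.append(sigma_e * math.sqrt(z) * h_err)
            gh_row.append(math.sqrt(z) * (math.sqrt(1.0 - sigma_e ** 2) * h + sigma_e * h_err))
        G.append(g_row)
        G_hat.append(gh_row)
        G_tilde.append(gt_row)
    return G, G_hat, G_tilde

rng = random.Random(0)
zeta = [[1.0, 0.25], [0.5, 2.0], [0.1, 0.4]]   # toy large-scale coefficients, N = 3, K = 2
G, G_hat, G_tilde = draw_channel(zeta, sigma_e=0.1, rng=rng)
```

Each entry of `G_hat` equals $\sqrt{1-\sigma_e^2}$ times the corresponding entry of `G` plus the entry of `G_tilde`, which is simply (2) rearranged.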

III Proposed Cluster-Based Nonlinear Precoders

To enhance the performance of the system while reducing the signaling load and computational complexity of NW precoders, we propose cluster-based nonlinear precoders. To this end, we form clusters of APs and users. These clusters are defined based on the large-scale channel coefficients given by $\zeta_{n,k}$. Since only small subsets of APs transmit the most relevant signals for reception, the contribution of the remaining APs is not significant and the transmission over such APs is avoidable. The upshot of this technique is that we discard the APs whose processing is cost-ineffective to reduce the signaling load.

III-A AP selection

The signaling load is brought down by taking into account that each user is served only by a reduced cluster of APs. Consider the pre-fixed scalar $L$ that denotes the number of APs selected per user. Then, for the $k$-th user, the $L$ APs with the largest large-scale fading coefficients are selected and gathered in the set $\mathcal{A}_{k}$. Accordingly, we employ the equivalent channel estimate $\bar{\mathbf{G}}^{\text{T}}=\left[\bar{\mathbf{g}}_{1},\bar{\mathbf{g}}_{2},\cdots,\bar{\mathbf{g}}_{K}\right]^{\text{T}}\in\mathbb{C}^{K\times N}$, where $\bar{\mathbf{G}}$ is a sparse matrix with $(n,k)$-th element

$$\bar{g}_{n,k}=\begin{cases}\hat{g}_{n,k},&n\in\mathcal{A}_{k},\\ 0,&\text{otherwise.}\end{cases} \tag{3}$$
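The AP selection and the sparsification in (3) can be sketched as follows. The helper names `select_aps` and `sparsify`, the 0-based indices, and the toy numbers are illustrative assumptions; ties between equal coefficients are not treated specially here.

```python
def select_aps(zeta, L):
    """For each user k, return the set A_k of the L AP indices with the largest zeta[n][k]."""
    N, K = len(zeta), len(zeta[0])
    clusters = []
    for k in range(K):
        ranked = sorted(range(N), key=lambda n: zeta[n][k], reverse=True)
        clusters.append(set(ranked[:L]))
    return clusters

def sparsify(G_hat, clusters):
    """Apply Eq. (3): keep g_hat[n][k] if AP n is in A_k, otherwise set it to zero."""
    return [[G_hat[n][k] if n in clusters[k] else 0j for k in range(len(clusters))]
            for n in range(len(G_hat))]

zeta = [[0.9, 0.1], [0.5, 0.7], [0.2, 0.8], [0.6, 0.3]]        # toy values, N = 4, K = 2
G_hat = [[1 + 1j, 2j], [0.5, 1 - 1j], [3j, 2.0], [1.0, -1j]]    # toy channel estimate
clusters = select_aps(zeta, L=2)
G_bar = sparsify(G_hat, clusters)
```

In this toy case, user 0 keeps APs 0 and 3 and user 1 keeps APs 1 and 2, so every column of `G_bar` has exactly $L=2$ nonzero entries.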

III-B Sparse TH precoder

Using (3), we compute a sparse TH precoder (TH-SP), which defines how the symbols are transmitted by the selected APs. The conventional THP employs three different filters [33]: a feedback filter $\mathbf{B}\in\mathbb{C}^{K\times K}$, a feedforward filter $\mathbf{F}\in\mathbb{C}^{N\times K}$, and a scaling matrix $\mathbf{C}\in\mathbb{C}^{K\times K}$ [15]. The feedback filter $\mathbf{B}$ deals with the multiuser interference (MUI) by successively subtracting the interference of previous symbols from the current symbol and, therefore, is a matrix with a lower triangular structure. The feedforward filter $\mathbf{F}$ enforces the spatial causality. The scaling matrix $\mathbf{C}$ assigns a weight to each stream and is, therefore, a diagonal matrix. Depending on the position of matrix $\mathbf{C}$, two different THP structures have been suggested: the centralized THP (cTHP) implements the scaling matrix at the transmitter side (at the central processing unit), whereas the decentralized THP (dTHP) considers that $\mathbf{C}$ is included at the receivers.

Our proposed TH-SP attempts to completely remove the MUI. We implement it by applying an LQ decomposition on the equivalent channel estimate $\bar{\mathbf{G}}^{\text{T}}$, i.e., $\bar{\mathbf{G}}^{\text{T}}=\bar{\mathbf{L}}\bar{\mathbf{Q}}$, where $\bar{\mathbf{L}}\in\mathbb{C}^{K\times K}$ is lower triangular and $\bar{\mathbf{Q}}\in\mathbb{C}^{K\times N}$ has orthonormal rows. Denote the $(i,k)$-th element of the matrix $\bar{\mathbf{L}}$ by $\bar{l}_{i,k}$. Then, the respective three THP filters are

$$\mathbf{F}=\bar{\mathbf{Q}}^{H}, \tag{4}$$
$$\mathbf{C}=\text{diag}\left(\bar{l}_{1,1},\bar{l}_{2,2},\cdots,\bar{l}_{K,K}\right), \tag{5}$$
$$\mathbf{B}^{\left(\text{c}\right)}=\bar{\mathbf{L}}\mathbf{C}, \tag{6}$$
$$\mathbf{B}^{\left(\text{d}\right)}=\mathbf{C}\bar{\mathbf{L}}, \tag{7}$$

where $\mathbf{B}^{\left(\text{c}\right)}$ and $\mathbf{B}^{\left(\text{d}\right)}$ denote the feedback filters for the centralized and decentralized architectures, respectively.
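To make the role of the LQ decomposition concrete, the sketch below factorizes a small $K\times N$ matrix with a modified Gram-Schmidt procedure on its rows and forms the feedforward filter as in (4). The choice of Gram-Schmidt, the helper names, and the toy matrix are assumptions of this example (the text does not specify how the factorization is computed); full row rank is assumed.

```python
import math

def lq_decompose(A):
    """LQ factorization A = L Q of a K x N complex matrix (K <= N, full row rank assumed)
    via modified Gram-Schmidt on the rows; Q has orthonormal rows, L is lower triangular."""
    K, N = len(A), len(A[0])
    L = [[0j] * K for _ in range(K)]
    Q = []
    for i in range(K):
        v = list(A[i])
        for j in range(i):
            # projection of the current residual onto the already-built row q_j
            L[i][j] = sum(v[n] * Q[j][n].conjugate() for n in range(N))
            v = [v[n] - L[i][j] * Q[j][n] for n in range(N)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        L[i][i] = complex(norm)
        Q.append([x / norm for x in v])
    return L, Q

def hermitian(M):
    """Conjugate transpose of a nested-list matrix."""
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]

def matmul(A, B):
    """Plain matrix product of nested-list matrices."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

G_bar_T = [[1 + 1j, 0j, 2 + 0j],     # toy sparse channel estimate, K = 2 users, N = 3 APs
           [0j, 1 - 1j, 1j]]
L_bar, Q_bar = lq_decompose(G_bar_T)
F = hermitian(Q_bar)                  # feedforward filter, Eq. (4)
effective = matmul(G_bar_T, F)        # equals L_bar, hence lower triangular
```

Because $\bar{\mathbf{Q}}\bar{\mathbf{Q}}^{H}=\mathbf{I}$, the product $\bar{\mathbf{G}}^{\text{T}}\mathbf{F}$ reduces to $\bar{\mathbf{L}}$: user $k$ only sees contributions from streams $1,\ldots,k$, which is the spatial causality that the feedback filter then exploits.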

Denote the coefficients of the feedback filter by $b_{k,i}$ and the symbols after feedback processing by $\breve{s}_{k}$. Then, the feedback filter subtracts the interference from previous symbols as

$$\breve{s}_{k}=s_{k}-\sum_{i=1}^{k-1}b_{k,i}\breve{s}_{i}. \tag{8}$$

The feedback filter amplifies the power of the transmitted signal. Therefore, a modulo operation is introduced to reduce the power of the transmitted signal as

$$\mathcal{M}\left(\breve{s}_{k}\right)=\breve{s}_{k}-\left\lfloor\frac{\text{Re}\left(\breve{s}_{k}\right)}{\lambda}+\frac{1}{2}\right\rfloor\lambda-j\left\lfloor\frac{\text{Im}\left(\breve{s}_{k}\right)}{\lambda}+\frac{1}{2}\right\rfloor\lambda, \tag{9}$$

where $\text{Re}(\cdot)$ ($\text{Im}(\cdot)$) is the real (imaginary) part of its complex argument and the parameter $\lambda$ depends on the modulation alphabet and the power allocation scheme. Some common values of $\lambda$ when employing symbols with unit variance are $\lambda=2\sqrt{2}$ and $\lambda=4\sqrt{10}/5$ for QPSK and 16-QAM, respectively.
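A minimal sketch of the successive interference subtraction in (8) combined with the modulo operation in (9) is given below. Two assumptions are made for the example: the modulo is applied inside the loop after each subtraction, as is common in THP implementations, and the toy feedback filter `B` is lower triangular with unit diagonal. The function names and numerical values are illustrative.

```python
import math

def modulo(x, lam):
    """Eq. (9): fold the real and imaginary parts of x into [-lam/2, lam/2)."""
    re_shift = math.floor(x.real / lam + 0.5) * lam
    im_shift = math.floor(x.imag / lam + 0.5) * lam
    return complex(x.real - re_shift, x.imag - im_shift)

def thp_feedback(symbols, B, lam):
    """Eq. (8) followed by the modulo operation, applied symbol by symbol.
    B is assumed lower triangular with unit diagonal (an assumption of this sketch)."""
    out = []
    for k, s in enumerate(symbols):
        interference = sum(B[k][i] * out[i] for i in range(k))
        out.append(modulo(s - interference, lam))
    return out

lam = 2 * math.sqrt(2)                                  # QPSK with unit-variance symbols
a = 1 / math.sqrt(2)
symbols = [complex(a, a), complex(-a, a), complex(a, -a)]
B = [[1, 0, 0], [0.8 - 0.3j, 1, 0], [-1.5j, 0.6, 1]]    # toy feedback filter
precoded = thp_feedback(symbols, B, lam)
```

The first symbol passes through unchanged because it suffers no prior interference, while the later outputs are folded back so that their real and imaginary parts stay within $\pm\lambda/2$.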

Unlike linear precoders [4, 34, 35, 36, 37, 38, 39, 40, 41, 42], THP introduces power and modulo losses in the system. The former comes from the energy difference between the original constellation and the transmitted symbols after precoding. The latter is caused by the modulo operation. Both losses can be neglected for analysis purposes and for moderate and large modulation sizes [15, 17].

The modulo operation is modeled as the addition of a perturbation vector $\mathbf{d}\in\mathbb{C}^{K\times 1}$ to the transmitted symbols $\mathbf{s}$. On the other hand, the feedback processing is implemented through the inversion of the matrix $\mathbf{B}$. Thus, with $\mathbf{v}=\mathbf{s}+\mathbf{d}$, the vector of symbols after feedback processing $\breve{\mathbf{s}}\in\mathbb{C}^{K\times 1}$ is

$$\breve{\mathbf{s}}=\mathbf{B}^{-1}\left(\mathbf{s}+\mathbf{d}\right)=\mathbf{B}^{-1}\mathbf{v}. \tag{10}$$

Therefore, the received signal vectors for the centralized and decentralized structures are, respectively,

$$\mathbf{y}^{\left(\text{c}\right)}=\frac{1}{\beta^{\left(\text{c}\right)}}\left(\mathbf{G}^{\text{T}}\beta^{\left(\text{c}\right)}\mathbf{F}\mathbf{C}\breve{\mathbf{s}}+\mathbf{n}\right), \tag{11}$$

and

$$\mathbf{y}^{\left(\text{d}\right)}=\frac{1}{\beta^{\left(\text{d}\right)}}\mathbf{C}\left(\mathbf{G}^{\text{T}}\beta^{\left(\text{d}\right)}\mathbf{F}\breve{\mathbf{s}}+\mathbf{n}\right), \tag{12}$$

where the parameters $\beta^{\left(\text{c}\right)}$ and $\beta^{\left(\text{d}\right)}$ are the scaling factors of the centralized and decentralized structures, respectively, introduced to fulfill the transmit power constraint and defined as

$$\beta^{\left(\text{c}\right)}\approx\sqrt{\frac{P_{t}}{K}},\qquad \beta^{\left(\text{d}\right)}\approx\sqrt{\frac{P_{t}}{\sum\limits_{k=1}^{K}\left(1/\bar{l}_{k,k}^{2}\right)}}. \tag{13}$$
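The two scaling factors in (13) are simple to evaluate once the diagonal of the lower triangular factor is available. The sketch below is only an illustration; the function name `scaling_factors` and the toy inputs are assumptions, and the diagonal entries are taken to be real and positive.

```python
import math

def scaling_factors(P_t, l_diag):
    """Eq. (13): approximate power-scaling factors for cTHP and dTHP.
    l_diag holds the diagonal entries of the lower triangular factor (real, positive)."""
    K = len(l_diag)
    beta_c = math.sqrt(P_t / K)
    beta_d = math.sqrt(P_t / sum(1.0 / l ** 2 for l in l_diag))
    return beta_c, beta_d

beta_c, beta_d = scaling_factors(P_t=4.0, l_diag=[1.0, 1.0, 1.0, 1.0])
beta_c2, beta_d2 = scaling_factors(P_t=4.0, l_diag=[2.0, 0.5])
```

When all diagonal entries equal one, both factors coincide; a small diagonal entry, as in the second call, pulls the decentralized factor down because that stream needs more transmit power.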

Using the channel relation $\mathbf{G}^{\text{T}}=\frac{1}{\tau}\left(\bar{\mathbf{G}}^{\text{T}}-\tilde{\mathbf{G}}^{\text{T}}\right)$, where $\tau=\sqrt{1-\sigma_{e}^{2}}$ so that the relation agrees with (2), and substituting (10) in (11) and (12) yield, respectively,

$$\mathbf{y}^{\left(\text{c}\right)}=\frac{1}{\tau}\mathbf{v}-\frac{1}{\tau}\tilde{\mathbf{G}}^{\text{T}}\mathbf{F}\mathbf{C}\mathbf{B}^{\left(\text{c}\right)^{-1}}\mathbf{v}+\frac{1}{\beta^{\left(\text{c}\right)}}\mathbf{n}, \tag{14}$$
$$\mathbf{y}^{\left(\text{d}\right)}=\frac{1}{\tau}\mathbf{v}-\frac{1}{\tau}\mathbf{C}\tilde{\mathbf{G}}^{\text{T}}\mathbf{F}\mathbf{B}^{\left(\text{d}\right)^{-1}}\mathbf{v}+\frac{1}{\beta^{\left(\text{d}\right)}}\mathbf{C}\mathbf{n}. \tag{15}$$

It follows that the received signal at user $k$ is

$$y^{\left(\text{c}\right)}_{k}=\frac{1}{\tau}v_{k}-\frac{1}{\tau}\tilde{\mathbf{g}}^{\text{T}}_{k}\sum\limits_{i=1}^{K}v_{i}\mathbf{p}_{i}^{\left(\text{c}\right)}+\frac{1}{\beta^{\left(\text{c}\right)}}n_{k}, \tag{16}$$
$$y^{\left(\text{d}\right)}_{k}=\frac{1}{\tau}v_{k}-\frac{c_{k,k}}{\tau}\tilde{\mathbf{g}}^{\text{T}}_{k}\sum\limits_{i=1}^{K}v_{i}\mathbf{p}_{i}^{\left(\text{d}\right)}+\frac{c_{k,k}}{\beta^{\left(\text{d}\right)}}n_{k}, \tag{17}$$

where, to simplify the notation, we have introduced the matrices $\mathbf{P}^{\left(\text{c}\right)}=\mathbf{F}\mathbf{C}\mathbf{B}^{\left(\text{c}\right)^{-1}}\in\mathbb{C}^{N\times K}$ and $\mathbf{P}^{\left(\text{d}\right)}=\mathbf{F}\mathbf{B}^{\left(\text{d}\right)^{-1}}\in\mathbb{C}^{N\times K}$, whose $i$-th columns are $\mathbf{p}^{\left(\text{c}\right)}_{i}$ and $\mathbf{p}^{\left(\text{d}\right)}_{i}$, respectively.

III-C Cluster-based TH precoders

Denote the $K$ clusters of users by $\mathcal{P}_{k}$, $k=1,\cdots,K$. While user $k$ is always included in $\mathcal{P}_{k}$, user $i$, $i\neq k$, is included in $\mathcal{P}_{k}$ if at least $N_{a}$ antennas provide service to user $i$ and all other users in $\mathcal{P}_{k}$. Then, define the user selection matrix $\mathbf{U}_{k}\in\mathbb{R}^{\lvert\mathcal{P}_{k}\rvert\times K}$, where $\lvert\mathcal{P}_{k}\rvert$ is the cardinality of the set $\mathcal{P}_{k}$ and the $j$-th row of $\mathbf{U}_{k}$ is $\mathbf{u}_{j,k}$. In particular, the first row $\mathbf{u}_{1,k}$ contains a one at the $l$-th position, where $l$ is the lowest index in $\mathcal{P}_{k}$, and zeros in all other positions. Similarly, the second row $\mathbf{u}_{2,k}$ contains a one at the position given by the second lowest index in $\mathcal{P}_{k}$, and all other coefficients are equal to zero. The subsequent rows of $\mathbf{U}_{k}$ are defined similarly.
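The construction of the user selection matrix can be illustrated with a few lines of code. The sketch below builds $\mathbf{U}_{k}$ from a given cluster and applies it to a $K\times N$ matrix to keep only the rows of the users in the cluster; the helper names, the 0-based user indices, and the toy matrices are assumptions of this example, and the rule that forms $\mathcal{P}_{k}$ itself is not implemented.

```python
def selection_matrix(cluster, K):
    """Build U_k (|P_k| x K): row j has a single one at the j-th lowest user index in P_k."""
    members = sorted(cluster)
    return [[1 if col == user else 0 for col in range(K)] for user in members]

def reduce_channel(U, G_bar_T):
    """Compute U_k times G_bar^T, i.e. keep only the rows (users) that belong to P_k."""
    return [[sum(U[r][k] * G_bar_T[k][n] for k in range(len(G_bar_T)))
             for n in range(len(G_bar_T[0]))] for r in range(len(U))]

K = 4
P_k = {2, 0}                                                   # toy cluster, 0-based indices
U_k = selection_matrix(P_k, K)
G_bar_T = [[1j, 0, 2], [0, 1, 0], [3, 0, 1 - 1j], [0, 4, 0]]   # toy K x N sparse estimate
G_bar_k_T = reduce_channel(U_k, G_bar_T)
```

Here `U_k` has two rows, with ones in columns 0 and 2, and the reduced matrix consists of rows 0 and 2 of the toy channel estimate.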

The reduced channel matrix is $\bar{\mathbf{G}}^{\text{T}}_{k}=\mathbf{U}_{k}\bar{\mathbf{G}}^{\text{T}}\in\mathbb{C}^{\lvert\mathcal{P}_{k}\rvert\times N}$, which is used to compute the TH precoder with reduced dimensions (THP-RD). Applying an LQ decomposition over the reduced channel matrix, i.e., $\bar{\mathbf{G}}_{k}^{\text{T}}=\bar{\mathbf{L}}_{k}\bar{\mathbf{Q}}_{k}$, where $\bar{\mathbf{L}}_{k}\in\mathbb{C}^{\lvert\mathcal{P}_{k}\rvert\times\lvert\mathcal{P}_{k}\rvert}$ and $\bar{\mathbf{Q}}_{k}\in\mathbb{C}^{\lvert\mathcal{P}_{k}\rvert\times N}$, produces the three THP filters as

$$\mathbf{F}_{k}=\bar{\mathbf{Q}}_{k}^{H}, \tag{18}$$
$$\mathbf{C}_{k}=\text{diag}\left(\bar{l}_{1,1},\bar{l}_{2,2},\cdots,\bar{l}_{\lvert\mathcal{P}_{k}\rvert,\lvert\mathcal{P}_{k}\rvert}\right), \tag{19}$$
$$\mathbf{B}_{k}^{\left(\text{c}\right)}=\bar{\mathbf{L}}_{k}\mathbf{C}_{k}, \tag{20}$$
$$\mathbf{B}_{k}^{\left(\text{d}\right)}=\mathbf{C}_{k}\bar{\mathbf{L}}_{k}. \tag{21}$$

The set $\mathcal{P}_{k}$ is associated with $\bar{\mathbf{G}}^{\text{T}}_{k}$ and with the decoding of the information of user $k$, but the channel matrix $\bar{\mathbf{G}}^{\text{T}}_{k}$ has reduced dimensions. Therefore, we need an index mapping to find the correct precoder. Denote this index by $q$ such that $\mathbf{u}_{q,k}$ contains a one in its $k$-th entry. It follows that the $q$-th column should be employed in the precoders denoted by $\mathbf{P}^{\left(\text{cTHP-RD}\right)}=[\mathbf{p}_{1}^{\left(\text{c}\right)^{\prime}}\ldots\mathbf{p}_{k}^{\left(\text{c}\right)^{\prime}}\ldots\mathbf{p}_{K}^{\left(\text{c}\right)^{\prime}}]$ and $\mathbf{P}^{\left(\text{dTHP-RD}\right)}=[\mathbf{p}_{1}^{\left(\text{d}\right)^{\prime}}\ldots\mathbf{p}_{k}^{\left(\text{d}\right)^{\prime}}\ldots\mathbf{p}_{K}^{\left(\text{d}\right)^{\prime}}]$ for the cTHP and dTHP structures, respectively. Then, the $k$-th columns of the respective precoding matrices are

$$\mathbf{p}^{\left(\text{c}\right)^{\prime}}_{k}=\left[\mathbf{F}_{k}\mathbf{C}_{k}\mathbf{B}^{\left(\text{c}\right)^{-1}}_{k}\right]_{q}, \tag{22}$$
$$\mathbf{p}^{\left(\text{d}\right)^{\prime}}_{k}=\left[\mathbf{F}_{k}\mathbf{B}^{\left(\text{d}\right)^{-1}}_{k}\right]_{q}. \tag{23}$$
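The index mapping can be pictured as finding the rank of user $k$ inside its own cluster and then reading the corresponding column of the reduced precoding matrix. The following sketch assumes 0-based indices; the helper names and the toy matrix standing in for $\mathbf{F}_{k}\mathbf{C}_{k}\mathbf{B}_{k}^{(\text{c})^{-1}}$ are hypothetical.

```python
def column_index(cluster, k):
    """Return q such that row q of U_k has its one in the k-th entry,
    i.e. the rank of user k among the sorted members of P_k (0-based here)."""
    return sorted(cluster).index(k)

def pick_column(P_reduced, q):
    """Extract column q of an N x |P_k| reduced precoding matrix as the precoder of user k."""
    return [row[q] for row in P_reduced]

P_k = {5, 1, 3}                     # toy cluster containing user k = 3 (0-based indices)
q = column_index(P_k, 3)
P_reduced = [[0.1, 0.2, 0.3],       # toy N x |P_k| matrix, N = 2
             [0.4, 0.5, 0.6]]
p_k = pick_column(P_reduced, q)
```

Repeating this for every $k$, each time with the filters computed from $\bar{\mathbf{G}}_{k}^{\text{T}}$, fills the $K$ columns of the reduced-dimension precoder one user at a time.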

IV Sum-rate performance

To evaluate the proposed nonlinear schemes, we employ the ergodic sum-rate (ESR) defined as

$$S_{r}=\mathbb{E}\left[\sum_{k=1}^{K}\bar{R}_{k}\right], \tag{24}$$

where $\bar{R}_{k}=\mathbb{E}\left[R_{k}|\hat{\mathbf{G}}\right]$ is the average rate and $R_{k}$ is the instantaneous rate of the $k$-th user. The rate $\bar{R}_{k}$ averages out the effects of the imperfect CSI at the transmitter (CSIT) because the instantaneous rates are not achievable. Considering Gaussian codebooks, the instantaneous rate is

$$R_{k}=\log_{2}\left(1+\gamma_{k}\right), \tag{25}$$

where $\gamma_{k}$ is the signal-to-interference-plus-noise ratio (SINR) at user $k$.

Denote the SINR for the centralized and decentralized structures by $\gamma_{k}^{\left(\text{c}\right)}$ and $\gamma_{k}^{\left(\text{d}\right)}$, respectively. Then, depending on the specific THP structure used, we employ $\gamma_{k}^{\left(\text{c}\right)}$ or $\gamma_{k}^{\left(\text{d}\right)}$ in (25) to obtain the instantaneous rate. To compute the SINR, we obtain the mean powers of the received signal at user $k$ for centralized and decentralized structures as, respectively,

$$\mathbb{E}\left[\lvert y^{\left(\text{c}\right)}_{k}\rvert^{2}\right]=\frac{1}{\tau^{2}}+\frac{1}{\tau^{2}}\sum\limits_{\begin{subarray}{c}i=1\\ i\neq k\end{subarray}}^{K}\lvert\tilde{\mathbf{g}}_{k}^{\text{T}}\mathbf{p}_{i}^{\left(\text{c}\right)}\rvert^{2}+\frac{1}{\beta^{\left(\text{c}\right)^{2}}}\sigma_{n}^{2}+\frac{1}{\tau^{2}}d_{g}^{\left(\text{c}\right)}, \tag{26}$$

and

$$\mathbb{E}\left[\lvert y^{\left(\text{d}\right)}_{k}\rvert^{2}\right]=\frac{1}{\tau^{2}}+\frac{c_{k,k}^{2}}{\tau^{2}}\sum\limits_{\begin{subarray}{c}i=1\\ i\neq k\end{subarray}}^{K}\lvert\tilde{\mathbf{g}}_{k}^{\text{T}}\mathbf{p}_{i}^{\left(\text{d}\right)}\rvert^{2}+\frac{c_{k,k}^{2}}{\beta^{\left(\text{d}\right)^{2}}}\sigma_{n}^{2}+\frac{1}{\tau^{2}}d_{g}^{\left(\text{d}\right)}, \tag{27}$$

where $d_{g}^{\left(\text{c}\right)}=\lvert\tilde{\mathbf{g}}_{k}^{\text{T}}\mathbf{p}^{\left(\text{c}\right)}_{k}\rvert^{2}-2\text{Re}\left(\tilde{\mathbf{g}}_{k}^{\text{T}}\mathbf{p}^{\left(\text{c}\right)}_{k}\right)$ and $d_{g}^{\left(\text{d}\right)}=c_{k,k}^{2}\lvert\tilde{\mathbf{g}}_{k}^{\text{T}}\mathbf{p}^{\left(\text{d}\right)}_{k}\rvert^{2}-2\text{Re}\left(c_{k,k}\tilde{\mathbf{g}}_{k}^{\text{T}}\mathbf{p}^{\left(\text{d}\right)}_{k}\right)$. The noise terms scale with the squared scaling factors because the noise in (16) and (17) is divided by $\beta^{\left(\text{c}\right)}$ and $\beta^{\left(\text{d}\right)}$, respectively. This yields

$$\gamma_{k}^{\left(\text{c}\right)}=\frac{1}{d_{g}^{\left(\text{c}\right)}+\sum\limits_{\begin{subarray}{c}i=1\\ i\neq k\end{subarray}}^{K}\left\lvert\tilde{\mathbf{g}}_{k}^{\text{T}}\mathbf{p}_{i}^{\left(\text{c}\right)}\right\rvert^{2}+\frac{\tau^{2}}{\beta^{\left(\text{c}\right)^{2}}}\sigma_{n}^{2}}, \tag{28}$$

and

$$\gamma_{k}^{\left(\text{d}\right)}=\frac{1}{d_{g}^{\left(\text{d}\right)}+c_{k,k}^{2}\sum\limits_{\begin{subarray}{c}i=1\\ i\neq k\end{subarray}}^{K}\lvert\tilde{\mathbf{g}}_{k}^{\text{T}}\mathbf{p}_{i}^{\left(\text{d}\right)}\rvert^{2}+\frac{c_{k,k}^{2}\tau^{2}}{\beta^{\left(\text{d}\right)^{2}}}\sigma_{n}^{2}}. \tag{29}$$
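For concreteness, the centralized SINR in (28) and the rate in (25) can be evaluated as in the sketch below. The function names, the nested-list layout of the precoder, and all numerical values are assumptions chosen only to exercise the formula; they are not taken from the experiments.

```python
import math

def sinr_cthp(k, g_err_k, P, tau, beta, noise_var):
    """Eq. (28): g_err_k is the error channel of user k (length N), P is the N x K precoder."""
    K = len(P[0])
    proj = [sum(g_err_k[n] * P[n][i] for n in range(len(P))) for i in range(K)]
    d_g = abs(proj[k]) ** 2 - 2 * proj[k].real
    mui = sum(abs(proj[i]) ** 2 for i in range(K) if i != k)
    return 1.0 / (d_g + mui + tau ** 2 * noise_var / beta ** 2)

def rate(sinr):
    """Eq. (25): instantaneous rate in bits per channel use."""
    return math.log2(1.0 + sinr)

P = [[0.5 + 0j, 0.1j], [0.2 + 0j, 0.4 + 0j]]     # toy precoder, N = 2, K = 2
tau, beta, noise_var = 1.0, 2.0, 0.04
perfect = sinr_cthp(0, [0j, 0j], P, tau, beta, noise_var)            # no estimation error
imperfect = sinr_cthp(0, [-0.05 + 0j, 0.02j], P, tau, beta, noise_var)
```

With a zero error channel the expression collapses to $\beta^{2}/(\tau^{2}\sigma_{n}^{2})$, which is 100 for these toy numbers, and the nonzero error vector used here lowers it. Averaging `rate(...)` over many error realizations for a fixed estimate gives the average rate $\bar{R}_{k}$.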

V Numerical Experiments

We assess the performance of the proposed TH precoders via numerical experiments. Throughout the experiments, the large-scale fading coefficients are set to

$$\zeta_{k,n}=P_{k,n}\cdot 10^{\frac{\sigma^{\left(\text{s}\right)}z_{k,n}}{10}}, \tag{30}$$

where $P_{k,n}$ is the path loss and the scalar $10^{\frac{\sigma^{\left(\text{s}\right)}z_{k,n}}{10}}$ includes the shadowing effect with standard deviation $\sigma^{\left(\text{s}\right)}=8$. The random variable $z_{k,n}$ follows a Gaussian distribution with zero mean and unit variance. The path loss was calculated using a three-slope model as

$$P_{k,n}=\begin{cases}-L-35\log_{10}\left(d_{k,n}\right),&d_{k,n}>d_{1},\\ -L-15\log_{10}\left(d_{1}\right)-20\log_{10}\left(d_{k,n}\right),&d_{0}<d_{k,n}\leq d_{1},\\ -L-15\log_{10}\left(d_{1}\right)-20\log_{10}\left(d_{0}\right),&\text{otherwise,}\end{cases} \tag{31}$$

where $d_{k,n}$ is the distance between the $n$-th AP and the $k$-th user, $d_{1}=50$ m, $d_{0}=10$ m, and the attenuation $L$ is

$$L=46.3+33.9\log_{10}\left(f\right)-13.82\log_{10}\left(h_{\text{AP}}\right)-\left(1.1\log_{10}\left(f\right)-0.7\right)h_{u}+\left(1.56\log_{10}\left(f\right)-0.8\right), \tag{32}$$

where $h_{\text{AP}}=15$ m and $h_{u}=1.65$ m are the heights of the APs and the users above the ground, respectively. We consider a frequency of $f=1900$ MHz. The noise variance is

$$\sigma_{n}^{2}=T_{o}k_{B}BN_{f}, \tag{33}$$

where $T_{o}=290$ K is the noise temperature, $k_{B}=1.381\times 10^{-23}$ J/K is the Boltzmann constant, $B=50$ MHz is the bandwidth and $N_{f}=10$ dB is the noise figure. The signal-to-noise ratio (SNR) is

$$\text{SNR}=\frac{P_{t}\text{Tr}\left(\mathbf{G}^{\text{T}}\mathbf{G}^{*}\right)}{NK\sigma_{n}^{2}}, \tag{34}$$

where $\text{Tr}(\cdot)$ is the trace of its matrix argument.
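The propagation parameters in (31) to (33) can be evaluated numerically as in the following sketch. Two points are assumptions of the example rather than statements from the text: the distance passed to `path_loss_db` is expressed in the same unit as the thresholds $d_{0}$ and $d_{1}$, and the noise figure is converted from dB to linear scale before multiplying. The function names are illustrative.

```python
import math

def attenuation_db(f_mhz=1900.0, h_ap=15.0, h_u=1.65):
    """Eq. (32): attenuation L in dB."""
    lf = math.log10(f_mhz)
    return (46.3 + 33.9 * lf - 13.82 * math.log10(h_ap)
            - (1.1 * lf - 0.7) * h_u + (1.56 * lf - 0.8))

def path_loss_db(d, L, d0=10.0, d1=50.0):
    """Eq. (31): three-slope path loss in dB; d uses the same unit as d0 and d1 (assumed)."""
    if d > d1:
        return -L - 35.0 * math.log10(d)
    if d > d0:
        return -L - 15.0 * math.log10(d1) - 20.0 * math.log10(d)
    return -L - 15.0 * math.log10(d1) - 20.0 * math.log10(d0)

def noise_variance(T0=290.0, k_B=1.381e-23, B=50e6, nf_db=10.0):
    """Eq. (33), with the noise figure converted from dB to linear scale (assumed)."""
    return T0 * k_B * B * 10.0 ** (nf_db / 10.0)

L = attenuation_db()
sigma_n2 = noise_variance()
```

For the stated parameters, the attenuation evaluates to roughly 140.7 dB and the noise variance to about $2\times 10^{-12}$ W. The three branches of the path-loss model join continuously at $d_{1}$, and the loss is constant for distances up to $d_{0}$.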

For all experiments, we have $128$ APs randomly distributed over a square with side equal to $20$ km. The APs serve a total of $24$ users, which are geographically distributed. We considered a total of 10,000 channel realizations to compute the ESR. Specifically, we employed $100$ channel estimates and, for each channel estimate, we considered $100$ error matrices. It follows that each average rate was computed with $100$ error matrices.

We first compare the ESR of the proposed precoders with their linear counterparts. We consider that the error in the channel coefficient estimate has a variance of $0.01$. Fig. 1 shows the sum-rate performance of the proposed nonlinear precoders against their conventional linear counterparts. The dTHP with sparse channel estimate or “dTHP-SP” performs the best, even better than the linear ZF precoder that employs all the APs (ZF-NW).

Figure 1: Sum-rate performance of various precoders versus SNR. Here, $N=128$, $K=24$, $|\mathcal{A}_{k}|=24$, $|\mathcal{P}_{k}|=10$, $\sigma_{e}^{2}=0.01$.
Figure 2: Sum-rate performance of precoders versus CSIT quality. Here, $N=128$, $K=24$, $|\mathcal{A}_{k}|=24$, $|\mathcal{P}_{k}|=10$, $\text{SNR}=15$ dB.

In the second experiment, we assessed the sum-rate performance at an SNR of $15$ dB with respect to CSIT quality (Fig. 2). The proposed dTHP with reduced dimensions or “dTHP-RD” outperforms the ZF-RD precoder. The RD precoding techniques have lower computational complexity than the corresponding SP precoders. We observe that our proposed nonlinear cluster-based precoders generally yield better ESR than their linear counterparts.

VI Summary

We proposed clustered nonlinear precoders based on the nonlinear THP algorithm. Our proposed THP-SP reduces the signaling load and the THP-RD additionally lowers the computational complexity at the expense of performance. Note that the reduction of the computational complexity is critical for practical applications. Numerical experiments showed that the proposed cluster-based nonlinear precoders yield better performance and robustness against CSIT uncertainties than the conventional linear precoders.

References

  • [1] H. Tataria, M. Shafi, A. F. Molisch, M. Dohler, H. Sjöland, and F. Tufvesson, “6G wireless systems: Vision, requirements, challenges, insights, and opportunities,” Proceedings of the IEEE, vol. 109, no. 7, pp. 1166–1199, 2021.
  • [2] M. Giordani, M. Polese, M. Mezzavilla, S. Rangan, and M. Zorzi, “Toward 6G networks: Use cases and technologies,” IEEE Communications Magazine, vol. 58, no. 3, pp. 55–61, 2020.
  • [3] R. C. de Lamare, “Massive MIMO systems: Signal processing challenges and future trends,” URSI Radio Science Bulletin, vol. 2013, no. 347, pp. 8–20, 2013.
  • [4] W. Zhang, H. Ren, C. Pan, M. Chen, R. C. de Lamare, B. Du, and J. Dai, “Large-scale antenna systems with UL/DL hardware mismatch: Achievable rates analysis and calibration,” IEEE Transactions on Communications, vol. 63, no. 4, pp. 1216–1229, 2015.
  • [5] H. A. Ammar, R. Adve, S. Shahbazpanahi, G. Boudreau, and K. V. Srinivas, “User-centric cell-free massive MIMO networks: A survey of opportunities, challenges and solutions,” IEEE Communications Surveys & Tutorials, vol. 24, no. 1, pp. 611–652, 2022.
  • [6] S. Elhoushy, M. Ibrahim, and W. Hamouda, “Cell-free massive MIMO: A survey,” IEEE Communications Surveys & Tutorials, vol. 24, no. 1, pp. 492–523, 2022.
  • [7] M. Attarifar, A. Abbasfar, and A. Lozano, “Subset MMSE receivers for cell-free networks,” IEEE Transactions on Wireless Communications, vol. 19, no. 6, pp. 4183–4194, 2020.
  • [8] H. Yang and T. L. Marzetta, “Energy efficiency of massive MIMO: Cell-free vs. cellular,” in IEEE Vehicular Technology Conference - Spring, 2018.
  • [9] S. Elhoushy and W. Hamouda, “Towards high data rates in dynamic environments using hybrid cell-free massive MIMO/small-cell system,” IEEE Wireless Communications Letters, vol. 10, no. 2, pp. 201–205, 2021.
  • [10] H. Q. Ngo, L.-N. Tran, T. Q. Duong, M. Matthaiou, and E. G. Larsson, “On the total energy efficiency of cell-free massive MIMO,” IEEE Transactions on Green Communications and Networking, vol. 2, no. 1, pp. 25–39, 2018.
  • [11] J. Zhang, S. Chen, Y. Lin, J. Zheng, B. Ai, and L. Hanzo, “Cell-free massive MIMO: A new next-generation paradigm,” IEEE Access, vol. 7, pp. 99878–99888, 2019.
  • [12] S.-N. Jin, D.-W. Yue, and H. H. Nguyen, “Spectral and energy efficiency in cell-free massive MIMO systems over correlated Rician fading,” IEEE Systems Journal, vol. 15, no. 2, pp. 2822–2833, 2021.
  • [13] E. Nayebi, A. Ashikhmin, T. L. Marzetta, H. Yang, and B. D. Rao, “Precoding and power optimization in cell-free massive MIMO systems,” IEEE Transactions on Wireless Communications, vol. 16, no. 7, pp. 4445–4459, 2017.
  • [14] E. Björnson and L. Sanguinetti, “Making cell-free massive MIMO competitive with MMSE processing and centralized implementation,” IEEE Transactions on Wireless Communications, vol. 19, no. 1, pp. 77–90, 2020.
  • [15] K. Zu, R. C. de Lamare, and M. Haardt, “Multi-branch Tomlinson-Harashima precoding design for MU-MIMO systems: Theory and algorithms,” IEEE Transactions on Communications, vol. 62, no. 3, pp. 939–951, 2014.
  • [16] L. Zhang, Y. Cai, R. C. de Lamare, and M. Zhao, “Robust multibranch Tomlinson–Harashima precoding design in amplify-and-forward MIMO relay systems,” IEEE Transactions on Communications, vol. 62, no. 10, pp. 3476–3490, 2014.
  • [17] A. R. Flores, R. C. De Lamare, and B. Clerckx, “Tomlinson-Harashima precoded rate-splitting with stream combiners for MU-MIMO systems,” IEEE Transactions on Communications, vol. 69, no. 6, pp. 3833–3845, 2021.
  • [18] M. Joham, W. Utschick, and J. Nossek, “Linear transmit processing in MIMO communications systems,” IEEE Transactions on Signal Processing, vol. 53, no. 8, pp. 2700–2712, 2005.
  • [19] H. Ruan and R. C. de Lamare, “Distributed robust beamforming based on low-rank and cross-correlation techniques: Design and analysis,” IEEE Transactions on Signal Processing, vol. 67, no. 24, pp. 6411–6423, 2019.
  • [20] V. M. Palhares, R. C. de Lamare, A. R. Flores, and L. T. Landau, “Iterative AP selection, MMSE precoding and power allocation in cell-free massive MIMO systems,” IET Communications, vol. 14, no. 22, pp. 3996–4006, 2020.
  • [21] A. R. Flores, R. C. de Lamare, and B. Clerckx, “Linear precoding and stream combining for rate splitting in multiuser MIMO systems,” IEEE Communications Letters, vol. 24, no. 4, pp. 890–894, 2020.
  • [22] S. Mashdour, R. C. de Lamare, and J. P. S. H. Lima, “Enhanced subset greedy multiuser scheduling in clustered cell-free massive MIMO systems,” IEEE Communications Letters, vol. 27, no. 2, pp. 610–614, 2023.
  • [23] A. R. Flores, R. C. de Lamare, and K. V. Mishra, “Clustered cell-free multi-user multiple-antenna systems with rate-splitting: Precoder design and power allocation,” IEEE Transactions on Communications, vol. 71, no. 10, pp. 5920–5934, 2023.
  • [24] H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, “Cell-free massive MIMO versus small cells,” IEEE Transactions on Wireless Communications, vol. 16, no. 3, pp. 1834–1850, 2017.
  • [25] L. D. Nguyen, T. Q. Duong, H. Q. Ngo, and K. Tourki, “Energy efficiency in cell-free massive MIMO with zero-forcing precoding design,” IEEE Communications Letters, vol. 21, no. 8, pp. 1871–1874, 2017.
  • [26] S. Buzzi, C. D’Andrea, A. Zappone, and C. D’Elia, “User-centric 5G cellular networks: Resource allocation and comparison with the cell-free massive MIMO approach,” IEEE Transactions on Wireless Communications, vol. 19, no. 2, pp. 1250–1264, 2020.
  • [27] E. Björnson and L. Sanguinetti, “Scalable cell-free massive MIMO systems,” IEEE Transactions on Communications, vol. 68, no. 7, pp. 4247–4261, 2020.
  • [28] M. M. Mojahedian and A. Lozano, “Subset regularized zero-forcing precoders for cell-free C-RANs,” in European Signal Processing Conference, 2021, pp. 915–919.
  • [29] M. A. Albreem, A. H. Al Habbash, A. M. Abu-Hudrouss, and S. S. Ikki, “Overview of precoding techniques for massive MIMO,” IEEE Access, vol. 9, pp. 60 764–60 801, 2021.
  • [30] R. Fischer, C. Windpassinger, A. Lampe, and J. Huber, “Tomlinson-Harashima precoding in space-time transmission for low-rate backward channel,” in International Zurich Seminar on Broadband Communications Access - Transmission - Networking, 2002, pp. 7–7.
  • [31] R. C. de Lamare and R. Sampaio-Neto, “Minimum mean-squared error iterative successive parallel arbitrated decision feedback detectors for DS-CDMA systems,” IEEE Transactions on Communications, vol. 56, no. 5, pp. 778–789, 2008.
  • [32] M. Vu and A. Paulraj, “MIMO wireless linear precoding,” IEEE Signal Processing Magazine, vol. 24, no. 5, pp. 86–105, 2007.
  • [33] Y. Chen, “Low complexity precoding schemes for massive MIMO systems,” Ph.D. dissertation, Newcastle University, 2019.
  • [34] K. Zu, R. C. de Lamare, and M. Haardt, “Generalized design of low-complexity block diagonalization type precoding algorithms for multiuser MIMO systems,” IEEE Transactions on Communications, vol. 61, no. 10, pp. 4232–4242, 2013.
  • [35] Y. Cai, R. C. de Lamare, and R. Fa, “Switched interleaving techniques with limited feedback for interference mitigation in DS-CDMA systems,” IEEE Transactions on Communications, vol. 59, no. 7, pp. 1946–1956, 2011.
  • [36] Y. Cai, R. C. de Lamare, and D. Le Ruyet, “Transmit processing techniques based on switched interleaving and limited feedback for interference mitigation in multiantenna MC-CDMA systems,” IEEE Transactions on Vehicular Technology, vol. 60, no. 4, pp. 1559–1570, 2011.
  • [37] W. Zhang, R. C. de Lamare, C. Pan, M. Chen, J. Dai, B. Wu, and X. Bao, “Widely linear precoding for large-scale MIMO with IQI: Algorithms and performance analysis,” IEEE Transactions on Wireless Communications, vol. 16, no. 5, pp. 3298–3312, 2017.
  • [38] Y. Cai, R. C. de Lamare, L.-L. Yang, and M. Zhao, “Robust MMSE precoding based on switched relaying and side information for multiuser MIMO relay systems,” IEEE Transactions on Vehicular Technology, vol. 64, no. 12, pp. 5677–5687, 2015.
  • [39] S. F. B. Pinto and R. C. de Lamare, “Block diagonalization precoding and power allocation for multiple-antenna systems with coarsely quantized signals,” IEEE Transactions on Communications, vol. 69, no. 10, pp. 6793–6807, 2021.
  • [40] A. R. Flores, R. C. de Lamare, and B. Clerckx, “Linear precoding and stream combining for rate splitting in multiuser MIMO systems,” IEEE Communications Letters, vol. 24, no. 4, pp. 890–894, 2020.
  • [41] L. T. N. Landau and R. C. de Lamare, “Branch-and-bound precoding for multiuser MIMO systems with 1-bit quantization,” IEEE Wireless Communications Letters, vol. 6, no. 6, pp. 770–773, 2017.
  • [42] D. M. V. Melo, L. T. N. Landau, R. C. de Lamare, P. F. Neuhaus, and G. P. Fettweis, “Zero-crossing precoding techniques for channels with 1-bit temporal oversampling ADCs,” IEEE Transactions on Wireless Communications, vol. 22, no. 8, pp. 5321–5336, 2023.