Tomlinson-Harashima Cluster-Based Precoders for Cell-Free MU-MIMO Networks
Abstract
Cell-free (CF) multiple-input multiple-output (MIMO) systems generally employ linear precoding techniques to mitigate the effects of multiuser interference. However, nonlinear precoders that employ perturbation and modulo operations usually improve on linear precoders in terms of power loss, efficiency, and precoding accuracy. In this work, we propose nonlinear user-centric precoders for CF MIMO, wherein different clusters of access points (APs) serve different users in CF multiple-antenna networks. Each cluster of APs is selected based on large-scale fading coefficients. The clustering procedure results in a sparse nonlinear precoder. We further devise a reduced-dimension nonlinear precoder, where clusters of users are created to reduce the complexity of the nonlinear precoder, the amount of required signaling, and the number of users. Numerical experiments show that the proposed nonlinear techniques for CF systems outperform their linear counterparts.
Index Terms:
Cell-free wireless networks, multiple-antenna systems, multiuser interference, nonlinear precoding, Tomlinson-Harashima precoding.
I Introduction
Coordinated base stations (BSs) have been deployed worldwide to establish cellular network services. However, wireless applications are evolving constantly with an increasing demand for more resources [1, 2, 3, 4]. To deliver the high throughput and quality of service required by future networks, it is desirable to further densify BSs. However, this approach is impractical. As an alternative, cell-free (CF) multiple-input multiple-output (MIMO) systems have emerged as a potential solution to improve the performance and satisfy the throughput requirements of future wireless networks [5, 6].
Compared to conventional BS-based networks, CF multiuser MIMO (MU-MIMO) systems employ multiple APs distributed geographically over the area of interest. A central processing unit (CPU), which may be located at the cloud server, coordinates the APs. The distributed deployment of CF networks yields higher coverage than BSs with collocated antennas [7]. In addition, CF MU-MIMO has been shown to provide increased throughput per user [8, 9] as well as better energy efficiency [10, 11, 12].
Further, like BS-based systems, CF MU-MIMO serves multiple users over the same time-frequency resources. To mitigate the multiuser interference (MUI) in the downlink, a precoder is often implemented at the transmitter. Prior works on CF MU-MIMO have focused on linear precoding techniques such as matched filter (MF), zero-forcing (ZF) [13], and minimum mean-square error (MMSE) [14] techniques. However, it is well known that nonlinear precoders [15, 16, 17] have the potential to outperform their linear counterparts [18, 19, 20, 21, 22, 23].
State-of-the-art works on CF MU-MIMO systems have proposed network-wide (NW) precoders [24, 13, 14, 25], but these techniques entail a very high signaling load. Moreover, NW approaches demand high computational complexity because they require the inversion of a matrix whose size increases with the number of APs and users. To mitigate this problem, NW precoders that employ AP and user clustering have been proposed [26] for lower computational complexity and signaling load. For instance, in [20] the number of APs is curtailed to reduce the signaling load. In [27], scalable MMSE combiners and precoders are developed. Very recently, a regularized ZF precoder based on subsets of users was proposed in [28] to judiciously use the available resources.
Unlike previous works [29, 20, 23], we propose nonlinear precoding techniques for CF MU-MIMO systems. The proposed techniques are based on the well-established Tomlinson-Harashima precoder (THP) [30], which may be interpreted as the transmit analog of the successive interference cancellation (SIC) employed at the receiver [31]. Essentially, THP employs a nonlinear modulo operation that reduces the power penalty associated with the linear precoders thereby enhancing the overall performance. Additionally, a cluster-based approach is devised based on a user selection matrix, resulting in a user-centric nonlinear precoder and addressing the gap in nonlinear structures for cluster-based precoders in CF networks. The resulting precoder is sparse and its complexity is reduced by employing clusters of users, thereby reducing the amount of signaling and the computational load. Our numerical experiments show that the TH precoding techniques outperform their linear counterparts.
The rest of this paper is organized as follows. In the next section, we describe the system model of the CF MU-MIMO communications. We derive the proposed cluster-based nonlinear precoding techniques in Section III. We introduce the metric to evaluate the performance of the proposed precoders in Section IV. We validate our model and methods via numerical experiments in Section V. We conclude in Section VI.
II System Model
Consider the downlink of a CF MIMO system, where $N$ geographically distributed APs serve $K$ users, each equipped with a single omnidirectional antenna. A central processing unit (CPU) located at the cloud server is connected to the APs. The data are transmitted over a flat-fading channel $\mathbf{G} \in \mathbb{C}^{N \times K}$. The $(n,k)$-th element of matrix $\mathbf{G}$ is the channel coefficient between the $n$-th AP and $k$-th user, i.e., $g_{n,k} = \sqrt{\zeta_{n,k}}\, h_{n,k}$, where $\zeta_{n,k}$ is the large-scale fading coefficient that models the path loss and shadowing effects, and $h_{n,k}$ represents the small-scale fading coefficient. The coefficients $h_{n,k}$ are modeled as independent and identically distributed (i.i.d.) random variables with complex Gaussian distribution $\mathcal{CN}(0,1)$.
Denote the transmit signal by $\mathbf{x} \in \mathbb{C}^{N}$, which obeys the transmit power constraint $\mathbb{E}\left[\|\mathbf{x}\|^{2}\right] \leq P_t$, where $\mathbb{E}[\cdot]$ denotes the statistical expectation. Then, the received signal vector is
(1) $\mathbf{y} = \mathbf{G}^{H}\mathbf{x} + \mathbf{w}$
where $(\cdot)^{H}$ is the conjugate transpose and $\mathbf{w} \in \mathbb{C}^{K}$ is the additive white Gaussian noise (AWGN) that follows the distribution $\mathcal{CN}(\mathbf{0}, \sigma_w^{2}\mathbf{I}_K)$.
The system employs the time division duplexing (TDD) protocol and therefore the channels can be estimated employing the channel reciprocity property and pilot training [32]. After receiving the pilots, the CPU computes the channel estimate , whose -th element is
(2) |
where is the channel estimate between the -th AP and the -th user; are i.i.d complex Gaussian random variables that follow the distribution (independent from ) and model the errors in the channel estimates; and represents the quality of the channel state information (CSI). The error affecting the channel estimate is .
III Proposed Cluster-Based Nonlinear Precoders
To enhance the performance of the system while reducing the signaling load and computational complexity of NW precoders, we propose cluster-based nonlinear precoders. To this end, we form clusters of APs and users. These clusters are defined based on the large-scale fading coefficients. Since only small subsets of APs transmit the most relevant signals for reception, the contribution of the remaining APs is not significant and transmission over such APs can be avoided. The upshot of this technique is that we discard the APs whose processing is not cost-effective, which reduces the signaling load.
III-A AP selection
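The signaling load is reduced by serving each user with only a small cluster of APs. Let the fixed scalar $N_{\mathrm{s}}$ denote the number of APs to be selected per user. Then, for the $k$-th user, the $N_{\mathrm{s}}$ APs with the largest large-scale fading coefficients are selected and gathered in the set $\mathcal{A}_k$. Accordingly, we employ the equivalent channel estimate $\bar{\mathbf{G}}$, which is a sparse matrix with the $(n,k)$-th element given by
(3) $\bar{g}_{n,k} = \begin{cases} \hat{g}_{n,k}, & n \in \mathcal{A}_k, \\ 0, & \text{otherwise}, \end{cases}$
where $\hat{g}_{n,k}$ is the channel estimate between the $n$-th AP and the $k$-th user.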
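The AP selection rule and the sparse channel estimate in (3) can be sketched as follows. This is a minimal illustration in plain Python rather than the authors' implementation. The function names (`select_aps`, `sparsify`), the argument `num_selected`, and the toy values are assumptions introduced here.

```python
from typing import List, Set


def select_aps(zeta: List[List[float]], num_selected: int) -> List[Set[int]]:
    """For each user k, return the indices of the `num_selected` APs with the
    largest large-scale fading coefficient zeta[n][k]."""
    num_aps, num_users = len(zeta), len(zeta[0])
    clusters = []
    for k in range(num_users):
        ranked = sorted(range(num_aps), key=lambda n: zeta[n][k], reverse=True)
        clusters.append(set(ranked[:num_selected]))
    return clusters


def sparsify(g_hat: List[List[complex]], clusters: List[Set[int]]) -> List[List[complex]]:
    """Zero every entry (n, k) of the channel estimate whose AP n lies outside
    the AP cluster of user k; keep the remaining entries unchanged."""
    return [[g if n in clusters[k] else 0j for k, g in enumerate(row)]
            for n, row in enumerate(g_hat)]


# Toy example with 4 APs and 2 users (illustrative values only).
zeta = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.4], [0.05, 0.6]]
g_hat = [[1 + 1j, 2j], [3 + 0j, 1 - 1j], [0.5j, 2 + 2j], [1 + 0j, 4 + 0j]]
clusters = select_aps(zeta, num_selected=2)
g_bar = sparsify(g_hat, clusters)
```

In this toy case, user 0 is served by APs 0 and 2 and user 1 by APs 1 and 3, so half of the entries of the equivalent channel estimate are zero.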
III-B Sparse TH precoder
Using (3), we compute a sparse TH precoder (TH-SP), which defines how the symbols are transmitted by the selected APs. The conventional THP employs three filters [33]: a feedback filter $\mathbf{B}$, a feedforward filter $\mathbf{F}$, and a scaling matrix $\mathbf{C}$ [15]. The feedback filter deals with the MUI by successively subtracting the interference of previous symbols from the current symbol and is, therefore, a lower triangular matrix. The feedforward filter enforces spatial causality. The scaling matrix assigns a weight to each stream and is, therefore, a diagonal matrix. Depending on the position of the matrix $\mathbf{C}$, two different THP structures have been suggested: the centralized THP (cTHP) implements the scaling matrix at the transmitter side (at the CPU), whereas the decentralized THP (dTHP) places $\mathbf{C}$ at the receivers.
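Our proposed TH-SP attempts to completely remove the MUI. For $N$ APs and $K$ users, we implement it by applying an LQ decomposition to the equivalent channel estimate $\bar{\mathbf{G}}$, i.e., $\bar{\mathbf{G}}^{H} = \mathbf{L}\mathbf{Q}$, where $\mathbf{L} \in \mathbb{C}^{K \times K}$ is lower triangular and $\mathbf{Q} \in \mathbb{C}^{K \times N}$ has orthonormal rows. Denote the $(i,j)$-th element of the matrix $\mathbf{L}$ by $l_{i,j}$. Then, the feedforward filter $\mathbf{F}$, the scaling matrix $\mathbf{C}$, and the feedback filters are
(4) $\mathbf{F} = \mathbf{Q}^{H}$
(5) $\mathbf{C} = \mathrm{diag}\left(l_{1,1}^{-1}, \ldots, l_{K,K}^{-1}\right)$
(6) $\mathbf{B}^{(\mathrm{c})} = \mathbf{L}\mathbf{C}$
(7) $\mathbf{B}^{(\mathrm{d})} = \mathbf{C}\mathbf{L}$
where $\mathbf{B}^{(\mathrm{c})}$ and $\mathbf{B}^{(\mathrm{d})}$ denote the feedback filters for the centralized and decentralized architectures, respectively.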
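As a rough sketch of how the filters in (4)-(7) can be obtained, the code below performs an LQ decomposition by Gram-Schmidt orthogonalization of the rows of a K x N effective channel and then forms the three filters. It assumes the standard THP definitions of [15]: feedforward filter equal to the conjugate transpose of Q, scaling matrix equal to the inverse diagonal of L, and feedback filters LC (centralized) and CL (decentralized). The helper names and the toy matrix are hypothetical, and a production implementation would normally rely on a numerically robust library routine instead of classical Gram-Schmidt.

```python
from typing import List, Tuple

Matrix = List[List[complex]]


def lq_decompose(a: Matrix) -> Tuple[Matrix, Matrix]:
    """Classical Gram-Schmidt on the rows of `a` (K x N, full row rank).

    Returns (L, Q) with a = L Q, L lower triangular (K x K) and Q having
    orthonormal rows (K x N).
    """
    k_rows = len(a)
    l = [[0j] * k_rows for _ in range(k_rows)]
    q: Matrix = []
    for i, row in enumerate(a):
        residual = list(row)
        for j, q_j in enumerate(q):
            # Projection coefficient of row i onto the j-th orthonormal row.
            l[i][j] = sum(x * y.conjugate() for x, y in zip(row, q_j))
            residual = [r - l[i][j] * y for r, y in zip(residual, q_j)]
        norm = sum(abs(r) ** 2 for r in residual) ** 0.5
        l[i][i] = complex(norm)
        q.append([r / norm for r in residual])
    return l, q


def thp_filters(a: Matrix):
    """Return (F, diag of C, B_c, B_d) for the effective channel `a`."""
    l, q = lq_decompose(a)
    k, n = len(l), len(q[0])
    f = [[q[i][col].conjugate() for i in range(k)] for col in range(n)]  # F = Q^H
    c = [1 / l[i][i] for i in range(k)]                                  # diagonal of C
    b_c = [[l[i][j] * c[j] for j in range(k)] for i in range(k)]         # L C
    b_d = [[c[i] * l[i][j] for j in range(k)] for i in range(k)]         # C L
    return f, c, b_c, b_d


# Toy effective channel with K = 2 users and N = 3 APs (illustrative values).
a = [[1 + 1j, 2 + 0j, 0.5j], [0.5 + 0j, 1j, 1 - 1j]]
f, c, b_c, b_d = thp_filters(a)
```

Both feedback filters come out lower triangular with a unit diagonal, which is what allows the interference to be cancelled one user at a time.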
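Denote the coefficients of the feedback filter by $b_{i,j}$, the $i$-th data symbol by $s_i$, and the symbols after feedback processing by $\tilde{x}_i$. Then, the feedback filter subtracts the interference from previous symbols as
(8) $\tilde{x}_i = s_i - \sum_{j=1}^{i-1} b_{i,j}\,\tilde{x}_j$
The feedback filter amplifies the power of the transmitted signal. Therefore, a modulo operation is introduced to reduce the power of the transmitted signal as
(9) $M(\tilde{x}_i) = \tilde{x}_i - \left\lfloor \frac{\Re(\tilde{x}_i)}{\lambda} + \frac{1}{2} \right\rfloor \lambda - \mathrm{j}\left\lfloor \frac{\Im(\tilde{x}_i)}{\lambda} + \frac{1}{2} \right\rfloor \lambda$
where $\Re(\cdot)$ ($\Im(\cdot)$) is the real (imaginary) part of its complex argument and the parameter $\lambda$ depends on the modulation alphabet and the power allocation scheme. Some common values of $\lambda$ when employing symbols with unit variance are $\lambda = 2\sqrt{2}$ and $\lambda = 8/\sqrt{10}$ for QPSK and 16-QAM, respectively.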
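The modulo operation in (9) can be illustrated with a few lines of Python. The sketch folds the real and imaginary parts of a complex sample independently into the interval of width equal to the modulo parameter, centered at zero. The function name `thp_modulo` and the constant `LAMBDA_QPSK` are hypothetical, and the QPSK value used here is the one commonly quoted for unit-variance symbols, taken as an assumption.

```python
import math


def thp_modulo(x: complex, lam: float) -> complex:
    """Fold the real and imaginary parts of x into [-lam/2, lam/2)."""
    re = x.real - math.floor(x.real / lam + 0.5) * lam
    im = x.imag - math.floor(x.imag / lam + 0.5) * lam
    return complex(re, im)


# Commonly used modulo parameter for unit-variance QPSK (assumed here).
LAMBDA_QPSK = 2 * math.sqrt(2)

inside = thp_modulo((1 + 1j) / math.sqrt(2), LAMBDA_QPSK)  # already inside: unchanged
folded = thp_modulo(2.0 + 0j, LAMBDA_QPSK)                 # real part wraps by -lam
```

A sample that already lies inside the modulo region is returned unchanged, whereas a sample outside it is shifted by an integer multiple of the modulo parameter in each dimension.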
Unlike linear precoders [4, 34, 35, 36, 37, 38, 39, 40, 41, 42], THP introduces power and modulo losses in the system. The former comes from the energy difference between the original constellation and the transmitted symbols after precoding. The latter is caused by the modulo operation. Both losses can be neglected for analysis purposes and for moderate and large modulation sizes [15, 17].
The modulo operation is modeled as the addition of a perturbation vector $\mathbf{d}$ to the transmitted symbols $\mathbf{s}$. In turn, the feedback processing is equivalent to multiplying by the inverse of the feedback matrix $\mathbf{B}$. Thus, the vector of symbols after feedback processing is
(10) $\tilde{\mathbf{x}} = \mathbf{B}^{-1}(\mathbf{s} + \mathbf{d})$
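To make the link between the successive cancellation in (8), the modulo operation in (9), and the matrix form in (10) concrete, the following sketch runs the feedback loop symbol by symbol. It assumes a lower triangular feedback matrix with unit diagonal. The function names and the toy matrix are illustrative choices, not values from the experiments.

```python
import math
from typing import List


def thp_modulo(x: complex, lam: float) -> complex:
    """Fold the real and imaginary parts of x into [-lam/2, lam/2)."""
    re = x.real - math.floor(x.real / lam + 0.5) * lam
    im = x.imag - math.floor(x.imag / lam + 0.5) * lam
    return complex(re, im)


def thp_feedback(s: List[complex], b: List[List[complex]], lam: float) -> List[complex]:
    """Successively cancel the interference of earlier streams, then apply the
    modulo operation to keep each output inside the modulo region."""
    x_tilde: List[complex] = []
    for i, s_i in enumerate(s):
        interference = sum(b[i][j] * x_tilde[j] for j in range(i))
        x_tilde.append(thp_modulo(s_i - interference, lam))
    return x_tilde


lam = 2 * math.sqrt(2)  # assumed modulo parameter for unit-variance QPSK
s = [(1 + 1j) / math.sqrt(2), (-1 + 1j) / math.sqrt(2), (1 - 1j) / math.sqrt(2)]
b = [[1, 0, 0], [0.8 - 0.3j, 1, 0], [-1.2j, 0.5, 1]]  # toy unit-diagonal feedback matrix
x_tilde = thp_feedback(s, b, lam)

# Multiplying by the feedback matrix gives s + d, where every entry of d is a
# multiple of lam; one more modulo operation therefore recovers s.
recovered = [thp_modulo(sum(b[i][j] * x_tilde[j] for j in range(3)), lam)
             for i in range(3)]
```

The last step mirrors what a receiver does: it applies the same modulo operation to strip the perturbation before detection.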
Therefore, the receive signal vectors for the centralized and decentralized structures are, respectively,
(11) |
and
(12) |
where the parameters () represent scaling factor of the centralized (decentralized) structure introduced to fulfill the transmit power constraint and defined as
(13) |
III-C Cluster-based TH precoders
Denote the clusters of users by , . While the user is always included in , the user , is included in if at least antennas provide service to user and all other users in . Then, define the user selection matrix , where is the cardinality of the set and the -th row of is . In particular, contains zeros in all positions except in the -th, where is the -th lowest index in . Similarly, the second row contains a one at the -th position, where is the second lowest index in and all other coefficients are equal to zero. The subsequent rows of are defined similarly.
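The construction of a user cluster and its selection matrix can be sketched as below. The sketch adopts one reading of the rule above, namely that another user joins the cluster of user k when the two users share at least a given number of serving APs. That reading, the threshold name `min_shared`, and all function names are assumptions made for illustration.

```python
from typing import List, Set


def user_cluster(ap_sets: List[Set[int]], k: int, min_shared: int) -> List[int]:
    """Sorted indices of the users in the cluster of user k: user k itself plus
    every user sharing at least `min_shared` serving APs with user k."""
    return sorted(i for i, aps in enumerate(ap_sets)
                  if i == k or len(aps & ap_sets[k]) >= min_shared)


def selection_matrix(cluster: List[int], num_users: int) -> List[List[int]]:
    """One row per cluster member; each row has a single one at the position of
    that user, with rows ordered by increasing user index."""
    return [[1 if col == user else 0 for col in range(num_users)]
            for user in cluster]


def reduce_rows(u: List[List[int]], h: List[list]) -> List[list]:
    """Product of the selection matrix with a K x N matrix: keeps only the rows
    of h that belong to the cluster."""
    return [[sum(u_ij * h_jn for u_ij, h_jn in zip(u_row, col)) for col in zip(*h)]
            for u_row in u]


# Toy example: AP clusters of 4 users (illustrative values only).
ap_sets = [{0, 1}, {1, 2}, {3, 4}, {0, 1, 2}]
cluster_0 = user_cluster(ap_sets, k=0, min_shared=1)
u_0 = selection_matrix(cluster_0, num_users=4)
h_reduced = reduce_rows(u_0, [[1, 2], [3, 4], [5, 6], [7, 8]])
position_of_user_0 = cluster_0.index(0)  # index mapping into the reduced matrix
```

The last line gives the position of user k inside its own cluster, which is the index needed later to pick the correct column of the reduced-dimension precoder.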
The reduced channel matrix is , which is used to compute the TH precoder with reduced dimensions (THP-RD). Applying an LQ decomposition over the reduced channel matrix, i.e. , where and , produces the three THP filters as
(18) | ||||
(19) | ||||
(20) | ||||
(21) |
The set is associated to and to the decoding of the information of user but the channel matrix has reduced dimensions. Therefore, we need an index mapping to find the correct precoder. Denote this index by such that contains a one in its -th entry. It follows that the -th column should be employed in the precoders denoted by and for the cTHP and dTHP structures, respectively. Then, the -th columns of the respective precoding matrices are
(22) | ||||
(23) |
IV Sum-rate performance
To evaluate the proposed nonlinear schemes, we employ the ergodic sum-rate (ESR) defined as
(24) |
where is the average rate and is the instantaneous rate of the -th user. The rate averages out the effects of the imperfect CSIT because the instantaneous rates are not achievable. Considering Gaussian codebooks, the instantaneous rate is
(25) $R_k = \log_2\left(1 + \gamma_k\right)$
where $\gamma_k$ is the signal-to-interference-plus-noise ratio (SINR) at user $k$.
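The rate computation can be illustrated with a short Monte Carlo sketch: the instantaneous rate of each user follows the Gaussian-codebook expression in (25), and the sum-rate is averaged over the available realizations. The function names and the layout of `sinr_samples` (one row per realization, one column per user) are assumptions introduced here.

```python
import math
from typing import List


def instantaneous_rate(sinr: float) -> float:
    """Rate in bits per channel use for a given linear-scale SINR."""
    return math.log2(1.0 + sinr)


def average_sum_rate(sinr_samples: List[List[float]]) -> float:
    """Average over realizations of the sum of the per-user rates.

    sinr_samples[r][k] is the SINR of user k in realization r.
    """
    total = sum(instantaneous_rate(g) for row in sinr_samples for g in row)
    return total / len(sinr_samples)


# Two realizations, two users (illustrative SINR values on a linear scale).
esr_estimate = average_sum_rate([[1.0, 3.0], [1.0, 3.0]])
```

With SINRs of 1 and 3, the per-user rates are 1 and 2 bits per channel use, so the estimate in this toy case is 3.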
Denote the SINR for the centralized and decentralized structures by and , respectively. Then, depending on the specific THP structure used, we employ or in (25) to obtain the instantaneous rate. To compute the SINR, we obtain the mean powers of the received signal at user for centralized and decentralized structures as, respectively,
(26) |
and
(27) |
where and . This yields
(28) |
and
(29) |
V Numerical Experiments
We assess the performance of the proposed TH precoders via numerical experiments. Throughout the experiments, the large-scale fading coefficients are set to
(30) $\zeta_{n,k} = P_{n,k} \cdot 10^{\frac{\sigma^{(\mathrm{s})} z_{n,k}}{10}}$
where $P_{n,k}$ is the path loss and the scalar $10^{\frac{\sigma^{(\mathrm{s})} z_{n,k}}{10}}$ includes the shadowing effect with standard deviation $\sigma^{(\mathrm{s})}$. The random variable $z_{n,k}$ follows a Gaussian distribution with zero mean and unit variance. The path loss was calculated using a three-slope model as
(31) |
where is the distance between the -th AP and the -th user, m, m, and the attenuation is
(32) |
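A three-slope path-loss model can be sketched as follows. The code assumes the form commonly used in the CF literature (for instance in [24]): a decay of 35 dB per decade beyond the outer breakpoint, 20 dB per decade between the two breakpoints, and a constant value below the inner breakpoint. The slopes, the function name, and the numeric values in the example are assumptions for illustration and are not the parameters of the experiments reported here.

```python
import math


def three_slope_path_loss_db(d: float, d0: float, d1: float,
                             attenuation_db: float) -> float:
    """Path loss in dB at distance d for breakpoints d0 < d1.

    `attenuation_db` is the constant attenuation term; it must be expressed
    for the same distance unit as d, d0, and d1.
    """
    if d > d1:
        return -attenuation_db - 35.0 * math.log10(d)
    if d > d0:
        return -attenuation_db - 15.0 * math.log10(d1) - 20.0 * math.log10(d)
    return -attenuation_db - 15.0 * math.log10(d1) - 20.0 * math.log10(d0)


def path_loss_example(d: float) -> float:
    """Hypothetical parameter set, used only to exercise the function."""
    return three_slope_path_loss_db(d, d0=10.0, d1=50.0, attenuation_db=140.0)
```

The three branches are written so that the curve is continuous at both breakpoints, which offers a quick sanity check on any chosen parameter set.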
where m and m are the positions of the APs and UEs above the ground, respectively. We consider a frequency of MHz. The noise variance is
(33) $\sigma_w^{2} = T_o\, k_B\, B\, N_f$
where $T_o$ is the noise temperature (in K), $k_B = 1.381 \times 10^{-23}$ J/K is the Boltzmann constant, $B$ is the bandwidth (in MHz), and $N_f$ is the noise figure (in dB). The signal-to-noise ratio (SNR) is
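The noise variance in (33) is the product of the noise temperature, the Boltzmann constant, the bandwidth, and the noise figure. The sketch below makes the unit handling explicit: the bandwidth is taken in Hz and the noise figure is converted from dB to a linear factor. The function name and the numeric inputs in the example are hypothetical and are not the values used in the experiments.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def noise_variance(temperature_k: float, bandwidth_hz: float,
                   noise_figure_db: float) -> float:
    """Thermal noise power in watts, scaled by the receiver noise figure."""
    noise_figure_linear = 10.0 ** (noise_figure_db / 10.0)
    return temperature_k * BOLTZMANN_J_PER_K * bandwidth_hz * noise_figure_linear


# Illustrative inputs only: 290 K, 20 MHz, 9 dB noise figure.
sigma2 = noise_variance(290.0, 20e6, 9.0)
sigma2_dbm = 10.0 * math.log10(sigma2 / 1e-3)
```

For these example inputs the noise power is roughly -92 dBm.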
(34) |
where is the trace of its matrix argument.
For all experiments, we have APs randomly distributed over a square with side equal to km. The APs serve a total of users, which are geographically distributed. We considered a total of 10,000 channel realizations to compute the ESR. Specifically, we employed channel estimates and, for each channel estimate, we considered error matrices. It follows that the average rate was computed with error matrices.
We first compare the ESR of the proposed precoders with their linear counterparts. We consider that the error in the channel coefficient estimate has a variance of . Fig. 1 shows the sum-rate performance of the proposed nonlinear precoders against their conventional linear counterparts. The dTHP with sparse channel estimate or “dTHP-SP” performs the best, even better than the linear ZF precoder that employs all the APs (ZF-NW).
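Fig. 1: Sum-rate performance of the proposed nonlinear precoders and their conventional linear counterparts.
Fig. 2: Sum-rate performance with respect to CSIT quality.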
In the second experiment, we assessed the sum-rate performance at SNR dB with respect to CSIT quality (Fig. 2). The proposed dTHP with reduced dimensions or “dTHP-RD” outperforms the ZF-RD precoder. The RD precoding techniques have lower computational complexity than the corresponding SP precoders. We observe that our proposed nonlinear cluster-based precoders generally yield better ESR than their linear counterparts.
VI Summary
We proposed clustered nonlinear precoders based on the nonlinear THP algorithm. Our proposed THP-SP reduces the signaling load and the THP-RD additionally lowers the computational complexity at the expense of performance. Note that the reduction of the computational complexity is critical for practical applications. Numerical experiments showed that the proposed cluster-based nonlinear precoders yield better performance and robustness against CSIT uncertainties than the conventional linear precoders.
References
- [1] H. Tataria, M. Shafi, A. F. Molisch, M. Dohler, H. Sjöland, and F. Tufvesson, “6G wireless systems: Vision, requirements, challenges, insights, and opportunities,” Proceedings of the IEEE, vol. 109, no. 7, pp. 1166–1199, 2021.
- [2] M. Giordani, M. Polese, M. Mezzavilla, S. Rangan, and M. Zorzi, “Toward 6G networks: Use cases and technologies,” IEEE Communications Magazine, vol. 58, no. 3, pp. 55–61, 2020.
- [3] R. C. de Lamare, “Massive MIMO systems: Signal processing challenges and future trends,” URSI Radio Science Bulletin, vol. 2013, no. 347, pp. 8–20, 2013.
- [4] W. Zhang, H. Ren, C. Pan, M. Chen, R. C. de Lamare, B. Du, and J. Dai, “Large-scale antenna systems with UL/DL hardware mismatch: Achievable rates analysis and calibration,” IEEE Transactions on Communications, vol. 63, no. 4, pp. 1216–1229, 2015.
- [5] H. A. Ammar, R. Adve, S. Shahbazpanahi, G. Boudreau, and K. V. Srinivas, “User-centric cell-free massive MIMO networks: A survey of opportunities, challenges and solutions,” IEEE Communications Surveys & Tutorials, vol. 24, no. 1, pp. 611–652, 2022.
- [6] S. Elhoushy, M. Ibrahim, and W. Hamouda, “Cell-free massive MIMO: A survey,” IEEE Communications Surveys & Tutorials, vol. 24, no. 1, pp. 492–523, 2022.
- [7] M. Attarifar, A. Abbasfar, and A. Lozano, “Subset MMSE receivers for cell-free networks,” IEEE Transactions on Wireless Communications, vol. 19, no. 6, pp. 4183–4194, 2020.
- [8] H. Yang and T. L. Marzetta, “Energy efficiency of massive MIMO: Cell-free vs. cellular,” in IEEE Vehicular Technology Conference - Spring, 2018.
- [9] S. Elhoushy and W. Hamouda, “Towards high data rates in dynamic environments using hybrid cell-free massive MIMO/small-cell system,” IEEE Wireless Communications Letters, vol. 10, no. 2, pp. 201–205, 2021.
- [10] H. Q. Ngo, L.-N. Tran, T. Q. Duong, M. Matthaiou, and E. G. Larsson, “On the total energy efficiency of cell-free massive MIMO,” IEEE Transactions on Green Communications and Networking, vol. 2, no. 1, pp. 25–39, 2018.
- [11] J. Zhang, S. Chen, Y. Lin, J. Zheng, B. Ai, and L. Hanzo, “Cell-free massive MIMO: A new next-generation paradigm,” IEEE Access, vol. 7, pp. 99878–99888, 2019.
- [12] S.-N. Jin, D.-W. Yue, and H. H. Nguyen, “Spectral and energy efficiency in cell-free massive MIMO systems over correlated Rician fading,” IEEE Systems Journal, vol. 15, no. 2, pp. 2822–2833, 2021.
- [13] E. Nayebi, A. Ashikhmin, T. L. Marzetta, H. Yang, and B. D. Rao, “Precoding and power optimization in cell-free massive MIMO systems,” IEEE Transactions on Wireless Communications, vol. 16, no. 7, pp. 4445–4459, 2017.
- [14] E. Björnson and L. Sanguinetti, “Making cell-free massive MIMO competitive with MMSE processing and centralized implementation,” IEEE Transactions on Wireless Communications, vol. 19, no. 1, pp. 77–90, 2020.
- [15] K. Zu, R. C. de Lamare, and M. Haardt, “Multi-branch Tomlinson-Harashima precoding design for MU-MIMO systems: Theory and algorithms,” IEEE Transactions on Communications, vol. 62, no. 3, pp. 939–951, 2014.
- [16] L. Zhang, Y. Cai, R. C. de Lamare, and M. Zhao, “Robust multibranch Tomlinson–Harashima precoding design in amplify-and-forward MIMO relay systems,” IEEE Transactions on Communications, vol. 62, no. 10, pp. 3476–3490, 2014.
- [17] A. R. Flores, R. C. de Lamare, and B. Clerckx, “Tomlinson-Harashima precoded rate-splitting with stream combiners for MU-MIMO systems,” IEEE Transactions on Communications, vol. 69, no. 6, pp. 3833–3845, 2021.
- [18] M. Joham, W. Utschick, and J. Nossek, “Linear transmit processing in MIMO communications systems,” IEEE Transactions on Signal Processing, vol. 53, no. 8, pp. 2700–2712, 2005.
- [19] H. Ruan and R. C. de Lamare, “Distributed robust beamforming based on low-rank and cross-correlation techniques: Design and analysis,” IEEE Transactions on Signal Processing, vol. 67, no. 24, pp. 6411–6423, 2019.
- [20] V. M. Palhares, R. C. de Lamare, A. R. Flores, and L. T. Landau, “Iterative AP selection, MMSE precoding and power allocation in cell-free massive MIMO systems,” IET Communications, vol. 14, no. 22, pp. 3996–4006, 2020.
- [21] A. R. Flores, R. C. de Lamare, and B. Clerckx, “Linear precoding and stream combining for rate splitting in multiuser MIMO systems,” IEEE Communications Letters, vol. 24, no. 4, pp. 890–894, 2020.
- [22] S. Mashdour, R. C. de Lamare, and J. P. S. H. Lima, “Enhanced subset greedy multiuser scheduling in clustered cell-free massive MIMO systems,” IEEE Communications Letters, vol. 27, no. 2, pp. 610–614, 2023.
- [23] A. R. Flores, R. C. de Lamare, and K. V. Mishra, “Clustered cell-free multi-user multiple-antenna systems with rate-splitting: Precoder design and power allocation,” IEEE Transactions on Communications, vol. 71, no. 10, pp. 5920–5934, 2023.
- [24] H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, “Cell-free massive MIMO versus small cells,” IEEE Transactions on Wireless Communications, vol. 16, no. 3, pp. 1834–1850, 2017.
- [25] L. D. Nguyen, T. Q. Duong, H. Q. Ngo, and K. Tourki, “Energy efficiency in cell-free massive MIMO with zero-forcing precoding design,” IEEE Communications Letters, vol. 21, no. 8, pp. 1871–1874, 2017.
- [26] S. Buzzi, C. D’Andrea, A. Zappone, and C. D’Elia, “User-centric 5G cellular networks: Resource allocation and comparison with the cell-free massive MIMO approach,” IEEE Transactions on Wireless Communications, vol. 19, no. 2, pp. 1250–1264, 2020.
- [27] E. Björnson and L. Sanguinetti, “Scalable cell-free massive MIMO systems,” IEEE Transactions on Communications, vol. 68, no. 7, pp. 4247–4261, 2020.
- [28] M. M. Mojahedian and A. Lozano, “Subset regularized zero-forcing precoders for cell-free C-RANs,” in European Signal Processing Conference, 2021, pp. 915–919.
- [29] M. A. Albreem, A. H. Al Habbash, A. M. Abu-Hudrouss, and S. S. Ikki, “Overview of precoding techniques for massive MIMO,” IEEE Access, vol. 9, pp. 60764–60801, 2021.
- [30] R. Fischer, C. Windpassinger, A. Lampe, and J. Huber, “Tomlinson-Harashima precoding in space-time transmission for low-rate backward channel,” in International Zurich Seminar on Broadband Communications Access - Transmission - Networking, 2002, pp. 7–7.
- [31] R. C. de Lamare and R. Sampaio-Neto, “Minimum mean-squared error iterative successive parallel arbitrated decision feedback detectors for DS-CDMA systems,” IEEE Transactions on Communications, vol. 56, no. 5, pp. 778–789, 2008.
- [32] M. Vu and A. Paulraj, “MIMO wireless linear precoding,” IEEE Signal Processing Magazine, vol. 24, no. 5, pp. 86–105, 2007.
- [33] Y. Chen, “Low complexity precoding schemes for massive MIMO systems,” Ph.D. dissertation, Newcastle University, 2019.
- [34] K. Zu, R. C. de Lamare, and M. Haardt, “Generalized design of low-complexity block diagonalization type precoding algorithms for multiuser MIMO systems,” IEEE Transactions on Communications, vol. 61, no. 10, pp. 4232–4242, 2013.
- [35] Y. Cai, R. C. de Lamare, and R. Fa, “Switched interleaving techniques with limited feedback for interference mitigation in DS-CDMA systems,” IEEE Transactions on Communications, vol. 59, no. 7, pp. 1946–1956, 2011.
- [36] Y. Cai, R. C. de Lamare, and D. Le Ruyet, “Transmit processing techniques based on switched interleaving and limited feedback for interference mitigation in multiantenna MC-CDMA systems,” IEEE Transactions on Vehicular Technology, vol. 60, no. 4, pp. 1559–1570, 2011.
- [37] W. Zhang, R. C. de Lamare, C. Pan, M. Chen, J. Dai, B. Wu, and X. Bao, “Widely linear precoding for large-scale MIMO with IQI: Algorithms and performance analysis,” IEEE Transactions on Wireless Communications, vol. 16, no. 5, pp. 3298–3312, 2017.
- [38] Y. Cai, R. C. de Lamare, L.-L. Yang, and M. Zhao, “Robust MMSE precoding based on switched relaying and side information for multiuser MIMO relay systems,” IEEE Transactions on Vehicular Technology, vol. 64, no. 12, pp. 5677–5687, 2015.
- [39] S. F. B. Pinto and R. C. de Lamare, “Block diagonalization precoding and power allocation for multiple-antenna systems with coarsely quantized signals,” IEEE Transactions on Communications, vol. 69, no. 10, pp. 6793–6807, 2021.
- [40] A. R. Flores, R. C. de Lamare, and B. Clerckx, “Linear precoding and stream combining for rate splitting in multiuser MIMO systems,” IEEE Communications Letters, vol. 24, no. 4, pp. 890–894, 2020.
- [41] L. T. N. Landau and R. C. de Lamare, “Branch-and-bound precoding for multiuser MIMO systems with 1-bit quantization,” IEEE Wireless Communications Letters, vol. 6, no. 6, pp. 770–773, 2017.
- [42] D. M. V. Melo, L. T. N. Landau, R. C. de Lamare, P. F. Neuhaus, and G. P. Fettweis, “Zero-crossing precoding techniques for channels with 1-bit temporal oversampling ADCs,” IEEE Transactions on Wireless Communications, vol. 22, no. 8, pp. 5321–5336, 2023.