
Learning to Beamform in Heterogeneous Massive MIMO Networks

Minghe Zhu, Tsung-Hui Chang and Mingyi Hong. Part of this work was presented in IEEE SAM 2020 [1]. M. Zhu and T.-H. Chang are with the Shenzhen Research Institute of Big Data, and School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, 518172, China (e-mail: [email protected]; [email protected]). M. Hong is with the Department of Electrical and Computer Engineering, University of Minnesota, Twin Cities, MN, USA (e-mail: [email protected]).
Abstract

It is well-known that the problem of finding the optimal beamformers in massive multiple-input multiple-output (MIMO) networks is challenging because of its non-convexity, and conventional optimization based algorithms suffer from high computational costs. While computationally efficient deep learning based methods have been proposed, their complexity heavily relies upon system parameters such as the number of transmit antennas, and therefore these methods typically do not generalize well when deployed in heterogeneous scenarios where the base stations (BSs) are equipped with different numbers of transmit antennas and have different inter-BS distances. This paper proposes a novel deep learning based beamforming algorithm to address the above challenges. Specifically, we consider the weighted sum rate (WSR) maximization problem in multiple-input single-output (MISO) interference channels, and propose a deep neural network architecture by unfolding a parallel gradient projection algorithm. Somewhat surprisingly, by leveraging the low-dimensional structures of the optimal beamforming solution, our constructed neural network can be made independent of the numbers of transmit antennas and BSs. Moreover, such a design can be further extended to a cooperative multicell network. Numerical results based on both synthetic and ray-tracing channel models show that the proposed neural network can achieve high WSRs with significantly reduced runtime, while exhibiting favorable generalization capability with respect to the antenna number, BS number and the inter-BS distance.

Keywords - Beamforming, deep neural network, MISO interfering channel, cooperative multicell beamforming.

I Introduction

Massive multiple-input multiple-output (MIMO) is an emerging technology that uses antenna arrays with a few hundred antennas simultaneously serving many tens of terminals in the same time-frequency resource [2]. Multiuser beamforming techniques based on massive MIMO can provide high spectral efficiency and have been recognized as a key technology for 5G wireless networks [3].

However, there are many challenges in searching for optimal beamforming strategies that effectively improve the performance of wireless communication systems. First and foremost, the computational complexity brought by the significantly increased numbers of base stations (BSs) and transmit antennas is too high to fulfill the latency requirements of 5G applications. Moreover, relying on massive MIMO and millimeter wave (mmWave) communication technologies [4], ultra-dense cellular networks composed of a large number of small cells and macrocells are expected to be deployed heterogeneously. Different numbers of access points (APs) installed inside buildings, and BSs located outdoors, will be equipped with different numbers of antennas to deal with complex communication environments. Beamforming optimization for these dense and heterogeneous networks has to jointly consider different network sizes and antenna configurations, making it much harder to search for a proper beamforming solution. Therefore, a multiuser beamforming method that is computationally scalable in heterogeneous networks, regardless of the network size, antenna number or user number, is highly desired.

I-A Related Work

Beamforming optimization has been an active research area in the past two decades [5]. Power minimization based beamforming problems can be well solved [5] or well approximated by various convex optimization techniques [6]. By contrast, the weighted sum rate maximization (WSRM) based beamforming problem is difficult to solve and in fact NP-hard in general [7, 8]. Many suboptimal but computationally efficient beamforming algorithms have been proposed in the literature. For example, the paper [9] proposed zero-forcing based beamforming built on the generalized matrix inverse theory. Approximation algorithms based on successive convex approximation techniques are proposed in [10, 11] for efficient resource allocation and multiuser beamforming optimization. The inexact cyclic coordinate descent algorithm proposed in [8] relies on block coordinate descent (BCD) and gradient projection (GP), and can achieve good performance with low complexity. The celebrated weighted minimum mean square error (WMMSE) algorithm proposed in [12, 13] is based on the equivalence between the signal-to-interference-plus-noise ratio (SINR) and the MSE, and solves the resulting problem by the BCD method. The WMMSE algorithm provides state-of-the-art performance and therefore is widely benchmarked in the literature. However, all these existing algorithms are iterative in nature, and their complexities quickly increase with the antenna number and network size.

In recent years, machine learning based approaches that rely on deep neural networks (DNNs) have been considered in a range of wireless communication applications [14]. For instance, in [15], a parallel structured convolutional neural network (CNN) is trained with only the geographical location information of transmitters and receivers to learn the optimal scheduling in dense device-to-device wireless networks. A black-box DNN is trained to approximate the WMMSE algorithm and learn the optimal power control strategy for WSRM in the interference channel [16, 17]. The paper [18] demonstrates the potential of multi-agent deep reinforcement learning techniques to develop a dynamic transmit power allocation scheme in wireless communication systems.

DNN based beamforming schemes have also been proposed for alleviating the computational issues faced in massive MIMO communications. For example, by considering the multiple-input single-output (MISO) broadcast channel, the work [19] proposed a beamforming neural network (BNN) that learns virtual uplink power variables based on the well-known uplink-downlink duality [5] for the power minimization problem and the SINR balancing problem. For the WSRM problem, they proposed a BNN to learn the power variables and Lagrange dual variables based on the optimal beamforming structure. By considering a coordinated beamforming scenario with multiple BSs serving one receiver, the work [20] proposed a black-box DNN to learn the radio-frequency (RF) downlink beamformers directly from the signals received at the distributed BSs during the uplink transmission.

Different from the black-box DNN approach, the deep unfolding technique [21, 22] builds a learning network by approximating a known iterative algorithm with a finite number of iterations. For example, the works [23], [24] and [25] respectively unfold the GP algorithm, the alternating direction method of multipliers (ADMM) and the gradient descent algorithm to build learning networks for MIMO detection. For a single-cell multiuser beamforming problem, the authors in [26] proposed a learning network by unfolding the WMMSE algorithm. To overcome the difficulty of the matrix inversion involved in the WMMSE algorithm, they approximate the matrix inversion by its first-order Taylor expansion. Another recent work [27] considered unfolding the WMMSE algorithm to solve the coordinated beamforming problem in MISO interference channels; the authors avoid the matrix inversion by unfolding a GP algorithm to solve the beamforming subproblem. While these existing works have shown good performance, their learned networks cannot be easily used in a new scenario in which network parameters such as the number of antennas, the number of BSs or the network size differ from the scenarios in which the neural network was trained. This shortcoming makes the current designs unsuitable for heterogeneous wireless environments.

I-B Contributions

In this paper, we consider learning-based beamforming designs for WSRM in MISO interference channels as well as in cooperative multicell networks (see Fig. 1). We propose a beamforming learning network by unfolding the simple parallel GP (PGP) algorithm [28]. The proposed learning network has a recurrent neural network (RNN) structure and is referred to as "RNN-PGP". The first key advantage of the proposed RNN-PGP is that it has a parallel structure and the deployed neural network is identical for all BSs. The second advantage is that the complexity of the neural network can be made independent of both the number of BSs and the number of BS antennas (which is usually large in massive MIMO communications), and thus the number of learnable parameters does not increase with the network size or the antenna number. Consequently, the proposed RNN-PGP has the third advantage of good generalization capability with respect to the cell radius, the number of antennas, and the number of BSs, which means that the proposed RNN-PGP can be easily deployed in heterogeneous networks where the BSs may have different numbers of antennas and non-uniform inter-BS distances. Our specific technical contributions are summarized as follows.

Figure 1: (a) The MISO interference channel [8], and (b) the cooperative multicell network [29]. The blue arrows represent information signals and the red arrows represent the inter-cell interference.
  1.

    We propose an RNN-PGP beamforming learning network by unfolding the PGP algorithm, which is a simple but effective method to handle the WSRM problem. In order to accelerate the algorithm convergence, we employ a multilayer perceptron (MLP) to predict the gradient vector with respect to the beamforming vectors. By showing that the gradient vector lies in a low-dimensional subspace, the MLP simply learns the coefficients required to construct the gradient vector. Since the MLP is identical across the RNN iterations and for all BSs, the parameter space to be learned can be small.

  2.

    We take two key steps to improve the generalization capability of RNN-PGP with respect to the number of transmit antennas and the number of BSs. Firstly, the low-dimensional structure of the optimal beamforming solution is exploited to transform the target WSRM problem into a dimension-reduced problem. This ensures that the proposed RNN-PGP actually solves a problem whose dimension is independent of the number of transmit antennas. Secondly, instead of considering the interference from the whole network, the MLP only considers the signals and interference that are sufficiently strong to predict the beamforming gradient vector. This makes the MLP dimension independent of the whole network size, and thus the RNN-PGP can be robust against the number of BSs.

  3.

    The design of the RNN-PGP is further extended to the cooperative multicell beamforming problem, which is a joint transmission scheme with BS cooperation. The performance of the proposed RNN-PGP is examined on both a synthetic channel dataset and the ray-tracing based DeepMIMO dataset [30]. The results show that the proposed RNN-PGP can well approximate the WMMSE solution. More importantly, the RNN-PGP shows promising generalization capability with respect to the number of BSs, the number of antennas and the inter-BS distance.

Synopsis: Section II presents the system model of the MISO interference channel and formulates the WSRM problem; existing beamforming algorithms are also briefly reviewed. Section III presents the main design of the proposed RNN-PGP, including the approaches used to improve its generalization capability. In Section IV, we extend the RNN-PGP to solve the more complex cooperative multicell beamforming problem. The simulation results are given in Section V, and the paper is concluded in Section VI.

Notations: Column vectors and matrices are written in boldfaced lower-case and upper-case letters, e.g., ${\bm a}$ and ${\bf A}$, respectively. The superscripts $(\cdot)^{\mathsf T}$, $(\cdot)^{*}$ and $(\cdot)^{\mathsf H}$ represent the transpose, conjugate and Hermitian transpose, respectively. ${\bm I}_K$ is the $K\times K$ identity matrix; $\|{\bm a}\|$ denotes the Euclidean norm of vector ${\bm a}$. $\Im(\cdot)$ and $\Re(\cdot)$ represent the imaginary and real parts of a complex value, respectively. $\{a_{jk}\}$ denotes the set of all $a_{jk}$ with subscripts $j,k$ covering all admissible integers, and $\{a_{jk}\}_k$ denotes the set of all $a_{jk}$ with the first subscript equal to $j$.

II Signal Model and Problem Formulation

In this section, we build the signal model of the MISO interference channel, and formulate the beamforming problem for maximizing the network weighted sum rate. Then, we briefly review some existing algorithms for solving the weighted sum-rate maximization (WSRM) problem.

As shown in Fig. 1, the downlink multi-user MISO interference channel has $K$ BSs serving $K$ respective user equipments (UEs) at the same time and over the same spectrum. Each BS is equipped with $N_t$ transmit antennas, and each UE has only one receive antenna. Let $s_k\in\mathbb{C}$, $\mathbb{E}[|s_k|^2]=1$, be the information signal for $\text{UE}_k$, and let ${\bm v}_k\in\mathbb{C}^{N_t}$ denote the beamforming vector used by $\text{BS}_k$, for all $k\in\mathcal{K}\coloneqq\{1,\ldots,K\}$. Moreover, denote ${\bm h}_{jk}\in\mathbb{C}^{N_t}$ as the channel between $\text{BS}_j$ and $\text{UE}_k$. Then, the signal received by $\text{UE}_k$ is given by

$$y_k = {\bm h}_{kk}^{\mathsf H}{\bm v}_k s_k + \sum_{j=1,\,j\neq k}^{K} {\bm h}_{jk}^{\mathsf H}{\bm v}_j s_j + n_k,\quad k\in\mathcal{K}, \qquad (1)$$

where $n_k\in\mathbb{C}$ is the additive white Gaussian noise (AWGN) with zero mean and variance $\sigma_k^2$, i.e., $n_k\sim\mathcal{CN}(0,\sigma_k^2)$. It is assumed that the signals $s_k$, $k\in\mathcal{K}$, are statistically independent of each other and of the AWGN. Thus, the SINR of each $\text{UE}_k$ can be written as

$${\sf SINR}_k(\{{\bm v}_k\},\{{\bm h}_{jk}\}_j)=\frac{|{\bm h}_{kk}^{\mathsf H}{\bm v}_k|^2}{\sum_{j\neq k}|{\bm h}_{jk}^{\mathsf H}{\bm v}_j|^2+\sigma_k^2},\quad k\in\mathcal{K}. \qquad (2)$$

Assume that the BSs have perfect channel state information (CSI). The downlink transmission rate of link $k$ can then be expressed as

$$R_k(\{{\bm v}_k\},\{{\bm h}_{jk}\}_j)=\log_2\left(1+{\sf SINR}_k(\{{\bm v}_k\},\{{\bm h}_{jk}\}_j)\right). \qquad (3)$$

We are interested in designing the beamformers so that the network throughput is maximized. Specifically, the WSRM problem is formulated as

$$\max_{{\bm v}_k\in\mathbb{C}^{N_t},\,k\in\mathcal{K}}~R(\{{\bm v}_k\},\{{\bm h}_{jk}\})\quad {\rm s.t.}~\lVert{\bm v}_k\rVert^2\leq P_k,~k\in\mathcal{K}, \qquad (4)$$

where

$$R(\{{\bm v}_k\},\{{\bm h}_{jk}\})=\sum_{k=1}^{K}\alpha_k R_k(\{{\bm v}_k\},\{{\bm h}_{jk}\}_j), \qquad (5)$$

in which $\alpha_k\geq 0$ is a non-negative weighting coefficient of link $k$, and $P_k$ denotes the maximum power budget of $\text{BS}_k$. The WSRM problem (4) has to be solved before the BSs transmit signals to their receivers. However, (4) is a non-convex problem, and it has been shown to be NP-hard in general [7, 8]. In view of this, suboptimal but computationally efficient algorithms have been proposed for problem (4). Next, we review the WMMSE algorithm [13], the GP algorithm [28, 8], and the POA algorithm [31].

II-A WMMSE Algorithm

The WMMSE algorithm [13] is one of the most popular algorithms for handling the WSRM problem in (4). It reformulates (4) as an equivalent weighted MSE minimization problem via the MMSE-SINR equality [32], and then solves the reformulated problem with the BCD method [33]. The iterative steps of WMMSE are given by: for iteration $r=1,\ldots,$ perform

$$u_k^r=\left(\sum_{j=1}^{K}|{\bm h}_{jk}^{\mathsf H}{\bm v}_j^{r-1}|^2+\sigma_k^2\right)^{-1}{\bm h}_{kk}^{\mathsf H}{\bm v}_k^{r-1}, \qquad (6a)$$
$$w_k^r=\left(1-u_k^r{\bm h}_{kk}^{\mathsf H}{\bm v}_k^{r-1}\right)^{-1}, \qquad (6b)$$
$${\bm v}_k^r=\alpha_k\left(\sum_{j=1}^{K}\alpha_j|u_j^r|^2w_j^r{\bm h}_{kj}{\bm h}_{kj}^{\mathsf H}+\mu_k^*{\bm I}_{N_t}\right)^{-1}u_k^rw_k^r{\bm h}_{kk}, \qquad (6c)$$

for all $k\in\mathcal{K}$. In (6c), $\mu_k^*$ is an optimal dual variable associated with the power budget constraint [13].
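To make (6a)-(6c) concrete, below is a minimal NumPy sketch of one WMMSE iteration; the array layout (`H[j, k]` holding ${\bm h}_{jk}$, `V[k]` holding ${\bm v}_k$) and the bisection search for the dual variable $\mu_k$ are our own illustrative choices rather than details from [13], and we use $u_k^*$ in (6b) so that the MSE weight is real.

```python
import numpy as np

def wmmse_step(H, V, sigma2, alpha, P, bis_iters=30):
    """One WMMSE iteration (6a)-(6c); H: (K, K, Nt), V: (K, Nt)."""
    K, _, Nt = H.shape
    rx = np.einsum('jkn,jn->jk', H.conj(), V)           # rx[j, k] = h_{jk}^H v_j
    total = (np.abs(rx) ** 2).sum(axis=0) + sigma2      # received power + noise at UE_k
    u = np.diag(rx) / total                             # (6a): MMSE receive coefficients
    w = 1.0 / (1.0 - (np.conj(u) * np.diag(rx)).real)   # (6b): MSE weights (real, > 1)
    V_new = np.zeros_like(V)
    for k in range(K):
        A = sum(alpha[j] * abs(u[j]) ** 2 * w[j] * np.outer(H[k, j], H[k, j].conj())
                for j in range(K))                      # matrix inverted in (6c)
        b = alpha[k] * u[k] * w[k] * H[k, k]
        v = np.linalg.solve(A + 1e-12 * np.eye(Nt), b)  # try mu_k = 0 first
        if np.linalg.norm(v) > np.sqrt(P[k]):           # otherwise bisect over mu_k > 0
            lo, hi = 0.0, 1.0
            while np.linalg.norm(np.linalg.solve(A + hi * np.eye(Nt), b)) > np.sqrt(P[k]):
                hi *= 2.0                               # grow bracket until feasible
            for _ in range(bis_iters):
                mu = 0.5 * (lo + hi)
                v = np.linalg.solve(A + mu * np.eye(Nt), b)
                lo, hi = (mu, hi) if np.linalg.norm(v) > np.sqrt(P[k]) else (lo, mu)
        V_new[k] = v
    return V_new
```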

It is worth mentioning that the WMMSE algorithm developed in [13] was designed for a more general multi-user scenario where one BS may serve multiple UEs, and it can be extended to network MIMO scenarios [2]. Theoretically, the WMMSE algorithm has been shown to converge to a stationary solution of (4), and it performs well in practice.

II-B Gradient Ascent Based Algorithm

Since the power budget constraints of problem (4) have a simple structure, gradient ascent based methods, such as the gradient projection (GP) method [28], can be applied. For example, the inexact cyclic coordinate descent (ICCD) algorithm proposed in [8] deals with problem (4) by applying a gradient projection update to each beamformer ${\bm v}_k$ in a sequential and cyclic fashion. Specifically, the ICCD algorithm has the following steps: for iteration $r=1,2,\ldots,$ perform for $k=1,\ldots,K$ sequentially

$$\tilde{\bm v}_k^r={\bm v}_k^{r-1}+s_k^{r-1}\nabla_{{\bm v}_k}R(\{{\bm v}_j^r\}_{j<k},\{{\bm v}_j^{r-1}\}_{j\geq k},\{{\bm h}_{jk}\}),\qquad {\bm v}_k^r=\frac{\tilde{\bm v}_k^r}{\max\{\|\tilde{\bm v}_k^r\|/\sqrt{P_k},\,1\}}, \qquad (7)$$

where $s_k^r>0$ is the step size, and the second step in (7) is the projection onto the power budget constraint in (4). A key advantage of gradient ascent based methods is that they involve simple computation steps. However, compared with the WMMSE algorithm, gradient ascent based methods usually require a larger number of iterations to converge to a solution as good as the WMMSE solution.
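Since the constraint set is a Euclidean ball, the projection in (7) reduces to a simple rescaling. A minimal sketch of one gradient-projection step, assuming the ascent direction is supplied by a gradient oracle, is:

```python
import numpy as np

def gp_step(v, grad, step, P):
    # One step of (7): gradient ascent, then projection onto {v : ||v||^2 <= P}.
    # `grad` is assumed to be supplied by a gradient oracle.
    v_tilde = v + step * grad
    return v_tilde / max(np.linalg.norm(v_tilde) / np.sqrt(P), 1.0)
```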

II-C POA Algorithm

The POA algorithm [34] is a monotonic optimization method which can globally solve problem (4). Specifically, the POA algorithm systematically constructs a sequence of surrogate problems whose feasible sets contain the feasible set of the original problem. The constructed feasible sets shrink iteratively and converge to the true feasible set of the original problem [35], while the objective values of the constructed problems converge to the true optimal value from above asymptotically. Therefore, the POA algorithm provides an upper bound on the optimal value of problem (4). However, the POA algorithm suffers from very high computational complexity, making it impractical for real-time scenarios.

It is worth noting that all of the above algorithms are iterative in their original form. Therefore, all of them suffer from significant computational delays, especially for massive MIMO scenarios where the numbers of cells and transmit antennas are large. To alleviate the computational issues, researchers have proposed the use of DNNs [20] and deep unfolding techniques [21] for approximating the WMMSE beamforming solution in a computationally efficient fashion. While one of the challenges of implementing the WMMSE algorithm by a deep-unfolding neural network lies in the matrix inversion in (6c), the recent work [27] proposed to employ a GP method to handle the update of ${\bm v}_k^r$ in order to avoid explicitly computing the matrix inversion. This results in a double-loop deep-unfolding structure. A related work [26] also proposed a deep unfolding based network for the WMMSE beamforming solution, where the authors approximate the matrix inversion by its first-order Taylor expansion. However, both beamforming designs in [27, 26] have limited generalization capability with respect to the number of transmit antennas or the network size. In particular, the learning networks have to be retrained whenever they are deployed in a scenario with different numbers of BSs or transmit antennas.

In the next section, we present a new beamforming learning network, which is designed to improve the generalization capability of the existing DNN based approaches.

III Proposed Beamforming Learning Network

In this section, we first utilize the low-dimensional structure of the beamforming solution of problem (4) to transform the problem into an equivalent problem with reduced dimension. Then, by unfolding the PGP method, we develop the proposed beamforming learning network, and present approaches to enhance its generalization capability with respect to the number of transmit antennas and the network size.

III-A Problem Dimension Reduction

Since in massive MIMO scenarios the number of transmit antennas $N_t$ is large, it is desirable to avoid handling problem (4) in its original form. It has been shown in [36, Proposition 1] that the optimal beamforming vectors have a low-dimensional structure, as stated below.

Proposition 1

[36, Proposition 1] Suppose that $N_t\geq K$, and that $\{{\bm h}_{kj}\}_j$ are linearly independent and satisfy

$${\bm h}_{kj}^{\mathsf H}{\bm h}_{kj'}\neq 0,~\forall j,j'\in\mathcal{K},~j\neq j'.$$

Then, if ${\bm v}_k$ is a beamforming vector that corresponds to a rate point on the Pareto boundary, there exist complex numbers $\{\xi_{kj}\}_{j=1}^{K}$ such that

$${\bm v}_k=\sum_{j=1}^{K}\xi_{kj}{\bm h}_{kj},\quad \lVert{\bm v}_k\rVert^2=P_k. \qquad (8)$$

By this property, the beamforming vector of each $\text{BS}_k$ lies in the low-dimensional subspace spanned by the channel vectors $\{{\bm h}_{kj}\}_j$. Let ${\bm H}_k=[{\bm h}_{k1},\ldots,{\bm h}_{kK}]\in\mathbb{C}^{N_t\times K}$ and ${\bm\xi}_k=[\xi_{k1},\ldots,\xi_{kK}]^{\mathsf T}\in\mathbb{C}^{K}$. We can let ${\bm v}_k={\bm H}_k{\bm\xi}_k$ for all $k\in\mathcal{K}$, and rewrite problem (4) as

$$\max_{{\bm\xi}_k\in\mathbb{C}^{K},\,k\in\mathcal{K}}~\sum_{k=1}^{K}\alpha_k\log_2\left(1+\frac{|{\bm h}_{kk}^{\mathsf H}{\bm H}_k{\bm\xi}_k|^2}{\sum_{j\neq k}|{\bm h}_{jk}^{\mathsf H}{\bm H}_j{\bm\xi}_j|^2+\sigma_k^2}\right)\quad {\rm s.t.}~\|{\bm H}_k{\bm\xi}_k\|^2\leq P_k,~k\in\mathcal{K}. \qquad (9)$$

To avoid handling the ellipsoidal constraint $\|{\bm H}_k{\bm\xi}_k\|^2\leq P_k$, we consider the eigen-decomposition

$${\bm H}_k^{\mathsf H}{\bm H}_k={\bf U}_k{\bf\Lambda}_k{\bf U}_k^{\mathsf H},$$

where ${\bf U}_k\in\mathbb{C}^{K\times K}$ is the unitary eigenvector matrix and ${\bf\Lambda}_k\in\mathbb{R}^{K\times K}$ is the diagonal eigenvalue matrix. By letting ${\bm w}_k={\bf\Lambda}_k^{1/2}{\bf U}_k^{\mathsf H}{\bm\xi}_k$ and ${\bm g}_{jk}={\bf\Lambda}_j^{-1/2}{\bf U}_j^{\mathsf H}{\bm H}_j^{\mathsf H}{\bm h}_{jk}$, we can write (9) as

$$\max_{{\bm w}_k\in\mathbb{C}^{K},\,k\in\mathcal{K}}~\sum_{k=1}^{K}\alpha_k\log_2\left(1+\frac{|{\bm g}_{kk}^{\mathsf H}{\bm w}_k|^2}{\sum_{j\neq k}|{\bm g}_{jk}^{\mathsf H}{\bm w}_j|^2+\sigma_k^2}\right)\quad {\rm s.t.}~\lVert{\bm w}_k\rVert^2\leq P_k,~k\in\mathcal{K}. \qquad (10)$$

By comparing (10) with (4), the number of unknown parameters is reduced from $O(KN_t)$ to $O(K^2)$. Therefore, when $N_t\gg K$, it is beneficial to deal with the dimension-reduced problem (10). In addition, problem (10) is independent of $N_t$, and therefore a learning network based on (10) inherently has good generalization capability with respect to the number of transmit antennas.
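A minimal sketch of this change of variables is given below, assuming $N_t\geq K$ and linearly independent channels so that ${\bm H}_j^{\mathsf H}{\bm H}_j$ is invertible (as in Proposition 1); the array layout is our own convention.

```python
import numpy as np

def reduce_dimension(H):
    """Sketch of (8)-(10). H[j, k] = h_{jk} (K x K x Nt). Returns G with
    G[j, k] = g_{jk} in C^K and per-BS maps B[j] recovering v_j = B[j] @ w_j."""
    K = H.shape[0]
    G = np.zeros((K, K, K), dtype=complex)
    B = []
    for j in range(K):
        Hj = H[j].T                                # Nt x K, columns h_{j1}, ..., h_{jK}
        lam, U = np.linalg.eigh(Hj.conj().T @ Hj)  # H_j^H H_j = U Lambda U^H
        Li = np.diag(lam ** -0.5)
        B.append(Hj @ U @ Li)                      # v_j = H_j xi_j with xi_j = U Lambda^{-1/2} w_j
        for k in range(K):
            G[j, k] = Li @ U.conj().T @ (Hj.conj().T @ H[j, k])  # g_{jk} in (10)
    return G, B
```

Note that $\|{\bm v}_j\|^2={\bm\xi}_j^{\mathsf H}{\bm H}_j^{\mathsf H}{\bm H}_j{\bm\xi}_j=\|{\bm w}_j\|^2$, which is exactly why the ellipsoidal constraint in (9) becomes the ball constraint in (10).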

Figure 2: Diagram of the proposed RNN-PGP network for the WSRM problem (10), where ${\bm G}=\{{\bm g}_{jk}\}_{j,k\in\mathcal{K}}$ contains all the transformed channel information. The subscript $(\cdot)_k^r$ refers to the parameter of the $k$th BS in the $r$th iteration. $\nabla$, $\mathcal{P}$ and $\angle_{\mathcal{F}}$ indicate the gradient prediction, projection and phase rotation, respectively.

III-B PGP Inspired Learning Network

While the WMMSE algorithm provides state-of-the-art performance, the matrix inversion in its update rule makes it difficult to build a learning network to learn the beamforming solution. In this section, in view of the separable power constraints, we consider the simple PGP method [28] to handle problem (10). Moreover, based on the deep unfolding technique, we show how an effective and computationally efficient beamforming learning network can be built.

The PGP method applied to (10) entails the following iterative steps: for iteration $r=1,\ldots,$ perform for $k=1,\ldots,K$ in parallel

$$\tilde{\bm w}_k^r={\bm w}_k^{r-1}+s_k^{r-1}\nabla_{{\bm w}_k}R(\{{\bm w}_j^{r-1}\},{\bm G}), \qquad (11)$$
$${\bm w}_k^r=\frac{\tilde{\bm w}_k^r}{\max\{\|\tilde{\bm w}_k^r\|/\sqrt{P_k},\,1\}}, \qquad (12)$$

where ${\bm G}=\{{\bm g}_{jk}\}_{j,k\in\mathcal{K}}$ contains all the transformed channel information. Following the deep unfolding idea, one can build an RNN with a finite number of iterations to imitate the iterative updates (11)-(12) of the PGP method. In existing works such as [27], only the step sizes $\{s_k^{r-1}\}$ are treated as learnable parameters. Here, we attempt to learn both the step size $s_k^{r-1}$ and the gradient vector $\nabla_{{\bm w}_k}R(\{{\bm w}_j^{r-1}\},{\bm G})$, aiming to find good ascent directions that may expedite the algorithm convergence.

It is interesting to note that the gradient vector $\nabla_{{\bm w}_k}R(\{{\bm w}_j^{r-1}\},{\bm G})$ in fact lies in the range space of the channel vectors $\{{\bm g}_{kj}\}_j$. Specifically, one has

$$\nabla_{{\bm w}_k}R(\{{\bm w}_j\},{\bm G})=\sum_{j=1}^{K}a_{kj}{\bm g}_{kj}, \qquad (13)$$

where

$$a_{kk}=\alpha_k\left(\sum_{l=1}^{K}|{\bm g}_{lk}^{\mathsf H}{\bm w}_l|^2+\sigma_k^2\right)^{-1}{\bm g}_{kk}^{\mathsf H}{\bm w}_k, \qquad (14a)$$
$$a_{kj}=\frac{-\alpha_j|{\bm g}_{jj}^{\mathsf H}{\bm w}_j|^2\,({\bm g}_{kj}^{\mathsf H}{\bm w}_k)}{\left(\sum_{l=1}^{K}|{\bm g}_{lj}^{\mathsf H}{\bm w}_l|^2+\sigma_j^2\right)\left(\sum_{l\neq j}|{\bm g}_{lj}^{\mathsf H}{\bm w}_l|^2+\sigma_j^2\right)},\quad j\neq k. \qquad (14b)$$

In view of (13), it is sufficient to build a learning network that simply learns the $K$ coefficients $\{a_{kj}\}_j$ rather than learning the gradient vector $\nabla_{{\bm w}_k}R$ directly.
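For concreteness, the following NumPy sketch evaluates the coefficients (14) and assembles the gradient via (13); the array layout (`G[j, k]` holding ${\bm g}_{jk}$, `W[k]` holding ${\bm w}_k$) is our own assumption, and the $1/\ln 2$ factor of the $\log_2$ is taken to be absorbed into the step size.

```python
import numpy as np

def grad_coefficients(G, W, sigma2, alpha):
    """Coefficients {a_kj} of (14). G: (K, K, K) with G[j, k] = g_{jk}; W: (K, K)."""
    K = W.shape[0]
    rx = np.einsum('jkn,jn->jk', G.conj(), W)       # rx[j, k] = g_{jk}^H w_j
    tot = (np.abs(rx) ** 2).sum(axis=0) + sigma2    # sum_l |g_{lk}^H w_l|^2 + sigma_k^2
    inp = tot - np.abs(np.diag(rx)) ** 2            # interference-plus-noise at each UE
    a = np.zeros((K, K), dtype=complex)
    for k in range(K):
        for j in range(K):
            if j == k:                              # (14a)
                a[k, k] = alpha[k] * rx[k, k] / tot[k]
            else:                                   # (14b)
                a[k, j] = (-alpha[j] * np.abs(rx[j, j]) ** 2 * rx[k, j]
                           / (tot[j] * inp[j]))
    return a

def gradient(G, a):
    # (13): grad[k] = sum_j a_{kj} g_{kj}
    return np.einsum('kj,kjn->kn', a, G)
```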

As shown in Fig. 2, we construct a beamforming learning network termed "RNN-PGP" based on the above ideas. The learning network has an RNN structure with $T$ iterations, each of which mimics the iterative PGP update (11)-(12) for solving problem (10). Specifically, each iteration $r$ contains $K$ identical function blocks that produce the beamforming solutions $\{{\bm w}_k^{r+1}\}$ in parallel. Below, we describe the detailed operations of the learning network.

Gradient prediction: In each iteration, a central preprocessor first gathers the channel information $\{{\bm g}_{kj}\}$, the noise variances $\{\sigma_k^2\}$ and the beamforming vectors $\{{\bm w}_k^r\}$ obtained in the previous iteration, and then calculates for each $\text{BS}_k$

$$I_j^r=\sum_{l\neq j}|{\bm g}_{lj}^{\mathsf H}{\bm w}_l^r|^2+\sigma_j^2,~j\in\mathcal{K}, \qquad (15)$$
$$D_j^r=\alpha_j|{\bm g}_{jj}^{\mathsf H}{\bm w}_j^r|^2,~j\in\mathcal{K}, \qquad (16)$$

and $\{{\bm g}_{kj}^{\mathsf H}{\bm w}_k\}_{j\in\mathcal{K}}$. The terms $D_j^r$ and $I_j^r$ stand for the information signal power and the suffered interference-plus-noise power of each $\text{UE}_j$, respectively, while ${\bm g}_{kj}^{\mathsf H}{\bm w}_k$ is the outgoing interference of $\text{BS}_k$ to $\text{UE}_j$. The computed $\{D_j^r,I_j^r,{\bm g}_{kj}^{\mathsf H}{\bm w}_k\}_{j\in\mathcal{K}}$ is used as the input of a multilayer perceptron (MLP) block associated with each $\text{BS}_k$, which is trained to predict the complex coefficients $\{a_{kj}^r\}_j$ in (13) and the step size $s_k^r$. Note that the parallel MLPs of all $K$ BSs have an identical network structure with common parameters ${\bm\theta}\in\mathbb{R}^M$, where $M$ is the model size.

The predicted $\{a_{kj}^r\}_j$ are used to construct the gradient vector through the function block $\nabla$:

$$\nabla_{{\bm w}_k}R^r=\nabla\left(\{a_{kj}^r\}_j,\{{\bm g}_{kj}\}_j\right)=\sum_{j=1}^{K}a_{kj}^r{\bm g}_{kj}. \qquad (17)$$

Gradient ascent: With $\nabla_{{\bm w}_k}R^r$ and $s_k^r$, the gradient ascent update is performed as

$$\tilde{\bm w}_k^{r+1}={\bm w}_k^r+s_k^r\nabla_{{\bm w}_k}R^r.$$

This step is shown in the middle of the enlarged rectangle in Fig. 2, where \otimes and \oplus represent the multiplication and addition operations.

Projection: Following (12), the function block $\mathcal{P}$ projects the beamforming vector $\tilde{\bm w}_k^{r+1}$ onto the feasible set by computing

$$\hat{\bm w}_k^{r+1}=\mathcal{P}(\tilde{\bm w}_k^{r+1})=\frac{\tilde{\bm w}_k^{r+1}}{\max\{\|\tilde{\bm w}_k^{r+1}\|/\sqrt{P_k},\,1\}}. \qquad (18)$$

Phase rotation: It is worth noting that problem (10) does not have a unique solution. In fact, if $\{{\bm w}_k\}$ is an optimal solution of (10), then any phase rotated solution $\{{\bm w}_ke^{i\psi_k}\}$ is also optimal, where $i=\sqrt{-1}$. In order to make sure that the RNN learns a one-to-one mapping, we rotate the phases of $\{\hat{\bm w}_k^{r+1}\}$ so that each rotated $\hat{\bm w}_k^{r+1}$, denoted by ${\bm w}_k^{r+1}$, aligns with ${\bm g}_{kk}$ (i.e., ${\bm g}_{kk}^{\mathsf H}{\bm w}_k^{r+1}$ is real). In particular, we perform via the function block $\angle_{\mathcal{F}}$

$${\bm w}_k^{r+1}=\angle_{\mathcal{F}}(\hat{\bm w}_k^{r+1})=\hat{\bm w}_k^{r+1}\exp\left(-i\tan^{-1}\left(\frac{\Im({\bm g}_{kk}^{\mathsf H}\hat{\bm w}_k^{r+1})}{\Re({\bm g}_{kk}^{\mathsf H}\hat{\bm w}_k^{r+1})}\right)\right). \qquad (19)$$
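A compact sketch of the projection (18) followed by the phase rotation (19) is given below; `np.angle` replaces the explicit arctangent in (19), which is equivalent up to the usual quadrant convention.

```python
import numpy as np

def project_and_rotate(w_tilde, g_kk, P):
    # (18): scale back onto the power ball only if the budget is exceeded
    w_hat = w_tilde / max(np.linalg.norm(w_tilde) / np.sqrt(P), 1.0)
    # (19): remove the phase ambiguity so that g_kk^H w is real
    phi = np.angle(np.vdot(g_kk, w_hat))   # np.vdot conjugates its first argument
    return w_hat * np.exp(-1j * phi)
```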
Figure 3: Illustration of the neighboring "interfering BSs" and "interfered UEs" of ${\rm BS}_5$. The solid red arrows represent the interference ${\rm BS}_5$ gives to its neighboring "interfered UEs", and the solid blue arrows represent the interference ${\rm UE}_5$ receives from neighboring "interfering BSs".
Figure 4: Diagram of the revised RNN-PGP network for the WSRM problem (10), where the MLP input and output are based on the subsets $\mathcal{I}_k^r(c)$ and $\mathcal{O}_k^r(c)$. Only the block of ${\rm BS}_k$ in the $r$th iteration is shown, where $D_j^r=\alpha_j|{\bm g}_{jj}^{\mathsf H}{\bm w}_j|^2$ and $I_j^r(c)=\sum_{l\in\mathcal{I}_j^r(c)}|{\bm g}_{lj}^{\mathsf H}{\bm w}_l|^2+\sigma_j^2$. $\nabla$, $\mathcal{P}$ and $\angle_{\mathcal{F}}$ indicate the gradient prediction, projection and phase rotation, respectively.

III-C Generalization with the Number of BSs

As seen in the previous subsection, the proposed RNN-PGP network has good generalization capability with respect to the number of transmit antennas $N_t$, since neither the MLPs nor the operations in each iteration depend on it. To further make the MLP easy to generalize to settings with different numbers of BSs $K$, we need the MLP input $\{D_j^r,I_j^r,{\bm g}_{kj}^{\mathsf H}{\bm w}_k^r\}_{j\in\mathcal{K}}$ and output $\{a_{kj}^r\}_{j\in\mathcal{K}}$ to be independent of the BS number $K$.

Inspired by [18], for each ${\rm BS}_k$ we define two neighbor subsets. The first is the set of BSs that cause relatively strong interference to ${\rm UE}_k$, i.e.,

$$\mathcal{I}_k^r\coloneqq\left\{j\in\mathcal{K},\,j\neq k~\big|~|{\bm g}_{jk}^{\mathsf H}{\bm w}_j^r|^2>\eta\sigma_k^2\right\}, \qquad (20)$$

where $\eta$ is a preset threshold. We order $|{\bm g}_{jk}^{\mathsf H}{\bm w}_j^r|^2/\sigma_k^2$, $j\in\mathcal{I}_k^r$, in decreasing fashion, and select the first $c$ indices in $\mathcal{I}_k^r$. The selected subset is denoted by $\mathcal{I}_k^r(c)\subseteq\mathcal{I}_k^r$, which contains the $c$ most "interfering" BSs that cause strong interference to ${\rm UE}_k$.

Analogously, for each ${\rm BS}_k$ we define the set of UEs to which ${\rm BS}_k$ causes strong interference:

$$\mathcal{O}_k^r\coloneqq\left\{j\in\mathcal{K},\,j\neq k~\big|~|{\bm g}_{kj}^{\mathsf H}{\bm w}_k^r|^2>\eta\sigma_j^2\right\}, \qquad (21)$$

and further select a subset $\mathcal{O}_k^r(c)\subseteq\mathcal{O}_k^r$ which corresponds to the $c$ most "interfered" UEs of ${\rm BS}_k$. An example is shown in Fig. 3, where $c=3$: the "interfering" neighbors of $\text{BS}_5$ are $\{\text{BS}_1,\text{BS}_2,\text{BS}_6\}$ and its "interfered" UEs are $\{\text{UE}_4,\text{UE}_6,\text{UE}_7\}$.

Then, we consider only the BSs in $\mathcal{I}_k^r(c)$ and the UEs in $\mathcal{O}_k^r(c)$ for computing the input and output of the MLP associated with $\text{BS}_k$. Specifically, instead of $\{D_j^r,I_j^r,{\bm g}_{kj}^{\mathsf H}{\bm w}_k\}_{j\in\mathcal{K}}$, we let the input of the MLP associated with ${\rm BS}_k$ be $\{D_j^r,I_j^r(c),{\bm g}_{kj}^{\mathsf H}{\bm w}_k^r\}_{j\in\{\mathcal{O}_k^r(c),k\}}$, where

$$I_j^r(c)=\sum_{l\in\mathcal{I}_j^r(c)}|{\bm g}_{lj}^{\mathsf H}{\bm w}_l^r|^2+\sigma_j^2. \qquad (22)$$

In addition to the step size $s_k^r$, the output of the MLP associated with ${\rm BS}_k$ is changed to $\{a_{kj}^r\}_{j\in\{\mathcal{O}_k^r(c),k\}}$, which now depends on the set $\mathcal{O}_k^r(c)$ only. Therefore, the input and output of the MLP are independent of the whole network size $K$. The diagram of the revised RNN-PGP network based on $\mathcal{I}_k^r(c)$ and $\mathcal{O}_k^r(c)$ is illustrated in Fig. 4, where only the block associated with ${\rm BS}_k$ at iteration $r$ is plotted. A sketch of the neighbor-set selection is given below.
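A minimal sketch of selecting $\mathcal{I}_k^r(c)$ and $\mathcal{O}_k^r(c)$ according to (20)-(21), under the same array conventions as before, is:

```python
import numpy as np

def neighbor_sets(G, W, sigma2, eta, c):
    """Top-c 'interfering' BSs I_k(c) and 'interfered' UEs O_k(c) per (20)-(21)."""
    K = W.shape[0]
    rx2 = np.abs(np.einsum('jkn,jn->jk', G.conj(), W)) ** 2  # |g_{jk}^H w_j|^2
    I_sets, O_sets = [], []
    for k in range(K):
        r_in = rx2[:, k] / sigma2[k]        # normalized interference received at UE_k
        cand = [j for j in range(K) if j != k and r_in[j] > eta]
        I_sets.append(sorted(cand, key=lambda j: -r_in[j])[:c])
        r_out = rx2[k, :] / sigma2          # normalized interference BS_k causes at UE_j
        cand = [j for j in range(K) if j != k and r_out[j] > eta]
        O_sets.append(sorted(cand, key=lambda j: -r_out[j])[:c])
    return I_sets, O_sets
```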

III-D Hybrid Training Strategy

Suppose that a training data set of size $L$ is given, which contains the (dimension-reduced) CSI $\{{\bm G}^{(\ell)}\}$, the WSR coefficients $\{\alpha_k^{(\ell)}\}_{k\in\mathcal{K}}$, and the beamforming solutions $\{{\bm w}_k^{(\ell)}\}_{k\in\mathcal{K}}$ obtained by an existing algorithm, for $\ell=1,\ldots,L$. Suppose that the RNN-PGP has a total of $T$ iterations (see Fig. 2), and denote $\{{\bm w}_k^{(\ell),r}\}_{k\in\mathcal{K}}$ as the output of RNN-PGP in the $r$th iteration for the $\ell$th data sample. The RNN-PGP network is trained by a two-stage approach. The first stage is based on supervised learning, using the following loss function

$$\text{L}^{\text{S}}({\bm\theta})=\frac{1}{2LK}\sum_{\ell=1}^{L}\sum_{k=1}^{K}\alpha_k^{(\ell)}\left(\gamma\lVert{\bm w}_k^{(\ell)}-{\bm w}_k^{(\ell),T}\rVert^2+(1-\gamma)\sum_{r=1}^{T-1}\lVert{\bm w}_k^{(\ell)}-{\bm w}_k^{(\ell),r}\rVert^2\right), \qquad (23)$$

where $\gamma$ is a penalty parameter. Note that the second term on the right-hand side of (23) encourages the outputs of earlier iterations of RNN-PGP to be close to $\{{\bm w}_k^{(\ell)}\}$, which may help speed up the convergence of RNN-PGP.

By treating the supervised training in the first stage as a pre-training, we can further refine the network in an unsupervised fashion in the second stage. Specifically, we can directly train the network so that the WSR function of (4) is maximized; for example, we consider the following loss function

$$\text{L}^{\text{U}}({\bm\theta})=-\frac{1}{KL}\sum_{\ell=1}^{L}R(\{{\bm w}_k^{(\ell),T}\},{\bm G}^{(\ell)}). \qquad (24)$$

As will be shown in Section V, the two-stage training approach can outperform those that solely use supervised training or unsupervised training [19].
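For reference, the following framework-agnostic NumPy sketch evaluates the two losses (23) and (24) for a single training sample ($L=1$); collecting the per-iteration outputs in a Python list is our own bookkeeping convention.

```python
import numpy as np

def supervised_loss(W_ref, W_iters, alpha, gamma):
    """(23) for one sample: W_ref[k] = target w_k; W_iters[r] = RNN output {w_k^{r+1}}."""
    K = len(alpha)
    sq = [np.sum(alpha * np.sum(np.abs(W_ref - Wr) ** 2, axis=1)) for Wr in W_iters]
    return (gamma * sq[-1] + (1 - gamma) * sum(sq[:-1])) / (2 * K)

def unsupervised_loss(W_T, G, sigma2, alpha):
    """(24) for one sample: negative WSR of the final iterate."""
    rx = np.einsum('jkn,jn->jk', G.conj(), W_T)
    tot = (np.abs(rx) ** 2).sum(axis=0) + sigma2
    sinr = np.abs(np.diag(rx)) ** 2 / (tot - np.abs(np.diag(rx)) ** 2)
    return -np.sum(alpha * np.log2(1.0 + sinr)) / len(alpha)
```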

IV Extension to Cooperative Multicell Beamforming Problem

The MISO interference channel considered in Section II and Section III treats the interference from adjacent cells as noise, resulting in a fundamental limitation on the performance especially for terminals close to cell edges [37]. In recent years, BS coordination has been analyzed as a means of handling inter-cell interference, in which one UE is served by multiple BSs [38].

In this section, we extend the RNN-PGP network to the cooperative multicell scenario. As shown in Fig. 1, the cooperative multicell communication scenario considered herein consists of $K_r$ single-antenna receivers served by $K_t$ BSs, each equipped with $N_t$ antennas. The $j$th transmitter and $k$th receiver are denoted $\text{BS}_j$ and $\text{UE}_k$, respectively; the channel between them is denoted ${\bm h}_{jk}$ for $j\in\mathcal{K}_t\coloneqq\{1,\ldots,K_t\}$ and $k\in\mathcal{K}_r\coloneqq\{1,\ldots,K_r\}$. Let ${\bm v}_{jk}$ be the beamforming vector used by $\text{BS}_j$ for serving $\text{UE}_k$. The SINR at $\text{UE}_k$ is given by

$${\sf SINR}_k=\frac{\big|\sum_{j=1}^{K_t}{\bm h}_{jk}^{\mathsf H}{\bm v}_{jk}\big|^2}{\sum_{l\neq k}^{K_r}\big|\sum_{j=1}^{K_t}{\bm h}_{jk}^{\mathsf H}{\bm v}_{jl}\big|^2+\sigma_k^2}. \qquad (25)$$

The WSRM problem is formulated as

$$\max_{\{{\bm v}_{jk}\}_{j\in\mathcal{K}_t,k\in\mathcal{K}_r}}~\sum_{k=1}^{K_r}\alpha_k\log_2\left(1+{\sf SINR}_k\right) \qquad (26a)$$
$${\rm s.t.}~\sum_{k=1}^{K_r}\lVert{\bm v}_{jk}\rVert^2\leq P_j,~j\in\mathcal{K}_t, \qquad (26b)$$

where (26b) represents the total power constraint of each $\text{BS}_j$.

Analogous to Section III, we employ [39, Theorem 2] to transform problem (26) into a dimension-reduced problem.

Theorem 1

[39, Theorem 2] For each rate tuple on the Pareto boundary of problem (26), the beamformers $\{{\bm v}_{jk}\}$ that achieve it fulfill

$${\bm v}_{jk}\in{\rm span}\Big(\{{\bm h}_{jk}\}\cup\bigcup_{l\neq k}\big\{\Pi_{{\bm h}_{jl}}^{\perp}{\bm h}_{jk}\big\}\Big),\quad\forall j,k, \qquad (27)$$

where $\Pi_{{\bm h}_{jl}}^{\perp}\coloneqq{\bm I}_{N_t}-{\bm h}_{jl}{\bm h}_{jl}^{\mathsf H}/\|{\bm h}_{jl}\|^2$ is the orthogonal projection onto the orthogonal complement of ${\bm h}_{jl}$.

Based on Theorem 1, the optimal beamforming solution ${\bm v}_{jk}$, $j\in\mathcal{K}_t$, $k\in\mathcal{K}_r$, of problem (26) can be expressed as

$${\bm v}_{jk}=\xi_{jk}^{k}{\bm h}_{jk}+\sum_{l\neq k}^{K_r}\xi_{jk}^{l}{\bm h}_{jk}^{l\perp}={\bm H}_{jk}{\bm\xi}_{jk}, \qquad (28)$$

where ${\bm h}_{jk}^{l\perp}\coloneqq\Pi_{{\bm h}_{jl}}^{\perp}{\bm h}_{jk}$, ${\bm\xi}_{jk}=[\xi_{jk}^{1},\ldots,\xi_{jk}^{K_r}]^{\mathsf T}\in\mathbb{C}^{K_r}$ and ${\bm H}_{jk}=[{\bm h}_{jk}^{1\perp},\ldots,{\bm h}_{jk}^{(k-1)\perp},{\bm h}_{jk},{\bm h}_{jk}^{(k+1)\perp},\ldots,{\bm h}_{jk}^{K_r\perp}]\in\mathbb{C}^{N_t\times K_r}$. Further consider the eigenvalue decomposition

$${\bm H}_{jk}^{\mathsf H}{\bm H}_{jk}={\bf U}_{jk}{\bf\Lambda}_{jk}{\bf U}_{jk}^{\mathsf H},$$

and define ${\bm w}_{jk}={\bf\Lambda}_{jk}^{1/2}{\bf U}_{jk}^{\mathsf H}{\bm\xi}_{jk}$ and ${\bm g}_{jk}={\bf\Lambda}_{jk}^{-1/2}{\bf U}_{jk}^{\mathsf H}{\bm H}_{jk}^{\mathsf H}{\bm h}_{jk}$. We can rewrite (26) as

$$\max_{\{{\bm w}_{jk}\}_{j\in\mathcal{K}_t,k\in\mathcal{K}_r}}~\sum_{k=1}^{K_r}\log_2\left(1+\frac{\big|\sum_{j=1}^{K_t}{\bm g}_{jk}^{\mathsf H}{\bm w}_{jk}\big|^2}{\sum_{l\neq k}^{K_r}\big|\sum_{j=1}^{K_t}{\bm g}_{jk}^{\mathsf H}{\bm w}_{jl}\big|^2+\sigma_k^2}\right)\quad{\rm s.t.}~\sum_{k=1}^{K_r}\lVert{\bm w}_{jk}\rVert^2\leq P_j,~j\in\mathcal{K}_t. \qquad (29)$$

Let us slightly abuse notation by defining $R(\{{\bm w}_{jk}\},{\bm G})$ as the WSR in (29). The gradient of $R(\{{\bm w}_{jk}\},{\bm G})$ with respect to each ${\bm w}_{jk}$ has the following form

$$\nabla_{{\bm w}_{jk}}R=\sum_{p=1}^{K_r}\left(a_{jp}^{(k)}{\bm g}_{jp}+b_{jp}^{(k)}{\bm g}_{jp}^{*}\right), \qquad (30)$$

where $a_{jp}^{(k)}=c_{jp}^{(k)}{\bm g}_{jp}^{\mathsf H}{\bm w}_{jk}$ and $b_{jp}^{(k)}=c_{jp}^{(k)}\sum_{q\neq j}^{K_t}{\bm g}_{qp}^{\mathsf T}{\bm w}_{qk}^{*}$, in which $c_{jk}^{(k)}=\big(\sum_{l=1}^{K_r}\big|\sum_{q=1}^{K_t}{\bm g}_{qk}^{\mathsf H}{\bm w}_{ql}\big|^2+\sigma_k^2\big)^{-1}$ and

$$c_{jp}^{(k)}=-\frac{\big|\sum_{q=1}^{K_t}{\bm g}_{qp}^{\mathsf H}{\bm w}_{qp}\big|^2}{\left(\sum_{l\neq p}^{K_r}\big|\sum_{q=1}^{K_t}{\bm g}_{qp}^{\mathsf H}{\bm w}_{ql}\big|^2+\sigma_p^2\right)\left(\sum_{l=1}^{K_r}\big|\sum_{q=1}^{K_t}{\bm g}_{qp}^{\mathsf H}{\bm w}_{ql}\big|^2+\sigma_p^2\right)},$$

for all $p\neq k$. Therefore, similar to Fig. 2, we can build a learning network for problem (29) by unfolding the PGP method and learning the complex coefficients $\{a_{jp}^{(k)},b_{jp}^{(k)}\}$ in (30).

In Fig. 5, we present the block diagram of the RNN-PGP network for problem (29), where we only plot the function blocks associated with ${\rm BS}_j$ at the $r$th iteration.

Figure 5: Diagram of the proposed RNN-PGP network for the WSRM problem (29) of the cooperative multicell scenario. The subscript $(\cdot)_{jp}^{(k),r}$ refers to the parameter in the $j$th BS for the $k$th user in the $r$th iteration, $\hat{\bf W}_j=[\hat{\bm w}_{j1},\ldots,\hat{\bm w}_{jK_r}]$, and ${\bf W}_j=[{\bm w}_{j1},\ldots,{\bm w}_{jK_r}]$. $\nabla$, $\mathcal{P}$ and $\angle_{\mathcal{F}}$ indicate the gradient prediction, projection and phase rotation, respectively.

The RNN-PGP network for the cooperative multicell scenario has a similar structure as that in Fig. 2. All the MLPs again share the same structure and parameters. The input of the MLP of ${\rm BS}_j$ for ${\rm UE}_k$ is $\{|{\bm g}_{jp}^{\mathsf H}{\bm w}_{jk}^r|^2,|\sum_{q\neq j}^{K_t}{\bm g}_{qp}^{\mathsf H}{\bm w}_{qk}^r|^2,D_p^r,I_p^r\}_{p\in\mathcal{K}_r}$, where

$$D_p^r=\Big|\sum_{q=1}^{K_t}{\bm g}_{qp}^{\mathsf H}{\bm w}_{qp}^r\Big|^2,\quad I_p^r=\sum_{l\neq p}^{K_r}\Big|\sum_{q=1}^{K_t}{\bm g}_{qp}^{\mathsf H}{\bm w}_{ql}^r\Big|^2+\sigma_p^2. \qquad (31)$$

The output of the MLP of ${\rm BS}_j$ for ${\rm UE}_k$ in the $r$th iteration is $\{a_{jp}^{(k),r},b_{jp}^{(k),r}\}_{p\in\mathcal{K}_r}$ together with the step size $s_{jk}^{(k),r}$.

Then, the block $\nabla$ constructs the gradient vector according to (30), i.e.,

$$\nabla_{{\bm w}_{jk}}R^r=\nabla(\{a_{jp}^{(k),r},b_{jp}^{(k),r}\}_p,\{{\bm g}_{jp}\}_p)=\sum_{p=1}^{K_r}\left(a_{jp}^{(k),r}{\bm g}_{jp}+b_{jp}^{(k),r}{\bm g}_{jp}^{*}\right),\quad k\in\mathcal{K}_r, \qquad (32)$$

followed by the gradient ascent update $\tilde{\bm w}_{jk}^{r+1}={\bm w}_{jk}^r+s_{jk}^r\nabla_{{\bm w}_{jk}}R^r$, $k\in\mathcal{K}_r$.

All the beamforming vectors $\{\tilde{\bm w}_{jk}^{r+1}\}_k$ of ${\rm BS}_j$ are then collected and jointly projected onto the feasible set of problem (29), which yields

$$\hat{\bm w}_{jk}^{r+1}=\mathcal{P}(\tilde{\bm w}_{jk}^{r+1})=\frac{\tilde{\bm w}_{jk}^{r+1}}{\max\left\{\sqrt{\sum_{k=1}^{K_r}\|\tilde{\bm w}_{jk}^{r+1}\|^2/P_j},\,1\right\}}, \qquad (34)$$

for all $k\in\mathcal{K}_r$. Lastly, the function block $\angle_{\mathcal{F}}$ rotates the phases of $\{\hat{\bm w}_{jk}^{r+1}\}_k$ by

$${\bm w}_{jk}^{r+1}=\angle_{\mathcal{F}}(\hat{\bm w}_{jk}^{r+1})=\hat{\bm w}_{jk}^{r+1}\exp\left(-i\tan^{-1}\left(\frac{\Im({\bm g}_{jk}^{\mathsf H}\hat{\bm w}_{jk}^{r+1})}{\Re({\bm g}_{jk}^{\mathsf H}\hat{\bm w}_{jk}^{r+1})}\right)\right). \qquad (35)$$
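The per-BS operations (34)-(35) differ from the single-cell case only in that the power budget couples all $K_r$ beamformers of $\text{BS}_j$, so one common scaling factor is applied. A minimal sketch, with `W_j` holding ${\bm w}_{j1},\ldots,{\bm w}_{jK_r}$ as rows and `G_j[k]` holding ${\bm g}_{jk}$ (our own layout), is:

```python
import numpy as np

def project_and_rotate_bs(W_j, G_j, P_j):
    # (34): joint projection onto sum_k ||w_{jk}||^2 <= P_j (one common scaling factor)
    W_hat = W_j / max(np.sqrt(np.sum(np.abs(W_j) ** 2) / P_j), 1.0)
    # (35): per-UE phase rotation so that g_{jk}^H w_{jk} is real
    for k in range(W_hat.shape[0]):
        W_hat[k] = W_hat[k] * np.exp(-1j * np.angle(np.vdot(G_j[k], W_hat[k])))
    return W_hat
```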

To train the RNN-PGP network in Fig. 5, we adopt the same hybrid strategy as in Section III-D. The supervised training loss and the unsupervised training loss are respectively given by

$$\text{L}^{\text{S}}({\bm\theta})=\frac{1}{2LK_tK_r}\sum_{\ell=1}^{L}\sum_{j=1}^{K_t}\sum_{k=1}^{K_r}\left(\gamma\lVert{\bm w}_{jk}^{(\ell)}-{\bm w}_{jk}^{(\ell),T}\rVert^2+(1-\gamma)\sum_{r=1}^{T-1}\lVert{\bm w}_{jk}^{(\ell)}-{\bm w}_{jk}^{(\ell),r}\rVert^2\right) \qquad (36)$$

and

$$\text{L}^{\text{U}}({\bm\theta})=-\frac{1}{LK_r}\sum_{\ell=1}^{L}R(\{{\bm w}_{jk}^{(\ell),T}\},{\bm G}^{(\ell)}). \qquad (37)$$

We remark that, similar to the network in Fig. 2, the RNN-PGP network in Fig. 5 for the cooperative multicell scenario inherently has good generalization capability with respect to the number of transmit antennas $N_t$. Besides, since the input and output sizes of the MLPs in Fig. 5 do not depend on the number of cooperative BSs $K_t$, the RNN-PGP network also has good generalization capability with respect to $K_t$. These properties are examined in the simulation results in Section V.

V Simulation Results

In this section, we present numerical results of the proposed RNN-PGP networks in Fig. 2 and Fig. 5. Both the synthetic channel models and the ray-tracing based DeepMIMO dataset [30] are considered to examine the performance of the proposed beamforming learning networks.

V-A Simulation Setup

We first consider the MISO interference channel model in Section II and test the performance of the proposed RNN-PGP network in Fig. 2 under synthetic Rayleigh channel data. As in Fig. 3, we assume that each ${\rm BS}_k$ is located at the center of cell $k$, and that ${\rm UE}_k$ is located randomly according to a uniform distribution within the cell. The half BS-to-BS distance is denoted by $d$ and is chosen between $100$ and $1000$ meters. We set $P_{\max}$, i.e., the maximum transmit power level of each ${\rm BS}_k$, to $38$ dBm over a $10$ MHz frequency band. The path loss between a UE and its associated BS is set as $128.1+37.6\log_{10}(s)$ (dB), where $s$ (km) is the distance between the UE and the BS. The channel coefficients $\{{\bm h}_{kj}\}$ are generated following the independent and identically distributed complex Gaussian distribution with mean equal to the pathloss and variance equal to one. The noise power spectral density is set to the same value $\sigma^2=-174$ dBm/Hz for all UEs. A total of $5000$ training samples ($L=5000$) are generated, which include the CSI $\{{\bm h}_{kj}^{(\ell)}\}$, the rate weights $\{\alpha_k^{(\ell)}\}$, and the beamforming solutions $\{{\bm w}_k^{(\ell)}\}$, $\ell=1,\ldots,L$, obtained either by the PGP method or the POA algorithm. Another $1000$ testing samples are generated in the same way.

To train the RNN-PGP network in Fig. 2, we set the parameter $\eta$ in (20) and (21) to $5$, and the parameter $\gamma$ in (23) and (36) to $0.95$. The function $\tanh$ is used as the activation function in the MLP. The iteration number of RNN-PGP is set to $T=20$, and the Adam optimizer is used for training the RNN-PGP network. The simulation environment is based on Python 3.6.9 with TensorFlow 1.14.0 on a desktop computer with an Intel i7-9800X CPU, one NVIDIA RTX 2080Ti GPU, and 64 GB of RAM. The GPU is used during the training stage but not in the test stage.

In the presented results, we benchmark the proposed RNN-PGP network against existing beamforming algorithms, namely the PGP method, the WMMSE algorithm and the POA algorithm, as well as a black-box DNN approach. Specifically, the compared schemes include

  • PGP, WMMSE and POA: Performance results obtained by applying the three methods to the dimension-reduced problem (10), respectively.

  • RNN-PGP (PGP) and RNN-PGP (POA): The RNN-PGP networks in Fig. 2 trained by the hybrid strategy, and the beamforming solutions used for supervised training are obtained by the PGP method and the POA method, respectively.

  • DNN (PGP) and DNN (POA): The black-box DNNs, which have $5$ layers with the concatenated CSI as the input and the concatenated beamforming solutions as the output, are trained end-to-end by the hybrid training strategy; the beamforming solutions used for supervised training are obtained by the PGP method and the POA method, respectively.

  • RNN-PGP (Unsuper): The RNN-PGP network in Fig. 2 trained solely by the unsupervised cost.

  • RNN-PGP (POA, Stepsize): The MLPs in the RNN-PGP only predict the step size $s_k^r$, and the gradient vector $\nabla_{{\bm w}_k}R^r$ is computed explicitly by (13) and (14). The network is trained by the hybrid strategy, and the beamforming solutions obtained by the POA method are used during the supervised training.

For the presented test results, unless mentioned otherwise, the beamforming solutions of RNN-PGP are obtained by unfolding the network for $T=20$ iterations. In addition to the (weighted) sum rate, we also show the "accuracy" (%), which is the ratio of the (weighted) sum rate achieved by the RNN-PGP to that achieved by the WMMSE solution.

V-B Sum-rate Performance

In this subsection, we evaluate the sum-rate performance of different schemes by setting the weights $\alpha_k$ of all users to $1$. The results are shown in Fig. 6(a), Fig. 6(b) and Tables I, II and III, respectively.

Unless mentioned otherwise, we consider the MISO interference channel with $K=19$ and $N_t=36$, and set the neighbor size $c$ to $18$. For "RNN-PGP (PGP)", "RNN-PGP (POA)" and "RNN-PGP (Unsuper)", the MLP used is a 5-layer network whose input and output layers have $4K$ and $2K+1$ neurons, respectively, and whose three hidden layers have $125$, $100$ and $85$ neurons, respectively. For "DNN (PGP)" and "DNN (POA)", we use a 5-layer DNN where the input and output layers both have $2KN_t$ nodes, and the hidden layers have $1450$, $1250$ and $1325$ nodes, respectively. The input of the MLP in "RNN-PGP (POA, Stepsize)" is the same as that of "RNN-PGP (PGP)", but the hidden layers have $120$, $75$ and $25$ neurons, respectively, and the output layer is reduced to a single neuron.

(a) Sum-rate versus iteration.
(b) Sum-rate versus runtime for $T=55$ iterations.
Figure 6: The sum rates achieved by various schemes versus the iteration number and runtime.

Convergence and runtime: The achieved sum rates versus the iteration number are shown in Fig. 6(a), where the RNN-PGPs are unfolded for $55$ iterations. For the POA algorithm and "DNN (POA)", we simply plot lines indicating the sum rates achieved by the two methods. Firstly, one can observe that the POA algorithm provides a sum-rate upper bound, and that the WMMSE algorithm not only converges faster but also yields a slightly higher sum rate than the PGP method.

Secondly, we can see that "RNN-PGP (POA)" converges faster and yields higher sum rates than "RNN-PGP (PGP)" as well as "RNN-PGP (Unsuper)" and "RNN-PGP (POA, Stepsize)", which shows the benefits of the hybrid training strategy and of predicting the gradient vector. Note from the figure that both "RNN-PGP (POA)" and "RNN-PGP (PGP)" converge well within about 10 iterations and achieve sum rates comparable to the WMMSE and PGP algorithms. Thirdly, except for "RNN-PGP (POA, Stepsize)", the RNN-PGP networks greatly outperform the black-box "DNN (POA)".

TABLE I: The sum-rate performance of the proposed RNN-PGP for $K=19$ and $c=18$; the sizes of the hidden layers of the MLP are 125, 100 and 85, respectively (MLP 125:100:85).

Number of antennas | $N_t=36$ | $N_t=72$ | $N_t=108$ | $N_t$ randomly selected from 16 to 128
PGP | 134.21 | 145.16 | 151.19 | -
RNN-PGP (POA), trained with $N_t=36$ | 129.85 (96.75%) | 137.29 (94.58%) | 142.81 (94.46%) | - (93.78%)
RNN-PGP (POA), trained with mixed $N_t\in\{18,\,36,\,72,\,108\}$ | 128.34 (95.63%) | 138.47 (95.39%) | 143.89 (95.17%) | - (95.27%)
TABLE II: The sum-rate performance of the proposed RNN-PGP for $N_t=64$.

Number of neighbors $c=18$ (MLP 85:73:42):
Number of BS-user links | $K=37$ | $K=61$ | $K=91$ | $K=91$ ($N_t=128$)
RNN-PGP (POA), trained with $K=37$ | 221.88 (96.01%) | 289.21 (93.12%) | 535.33 (91.92%) | 596.57 (91.79%)
RNN-PGP (POA), trained with $K\in\{37,\,61,\,91\}$ | 219.85 (95.13%) | 291.48 (93.85%) | 540.68 (92.84%) | 598.07 (92.02%)

Number of neighbors $c=6$ (MLP 32:21:15):
Number of BS-user links | $K=19$ | $K=37$ | $K=61$ | $K=61$ ($N_t=128$)
RNN-PGP (POA), trained with $K=37$ | 121.83 (90.78%) | 212.89 (92.12%) | 274.53 (88.39%) | 376.83 (87.12%)
RNN-PGP (POA), trained with $K\in\{19,\,37,\,61\}$ | 123.92 (92.34%) | 210.67 (91.16%) | 279.31 (89.93%) | 381.04 (88.09%)

As seen from Fig. 6(b), the RNN-PGP networks have a clear advantage in terms of runtime. Specifically, for running 20 iterations, the average runtimes of “RNN-PGP (POA)” and “RNN-PGP (PGP)” are about 0.0573 s and 0.0576 s, while the runtimes of the PGP, WMMSE and POA algorithms for 20 iterations are 1.064 s, 1.964 s and 23.231 s, respectively.

Impact of training sample size: We examine the achieved accuracy versus the training data size L, as shown in Fig. 7. All schemes improve as the number of training samples increases. Moreover, we compare the performance of the RNN-PGP under different training approaches. There is a gap between hybrid training and unsupervised training, but it shrinks as the training data size increases; this implies that the advantage of hybrid training is most significant when the training set is small. One can also see that the gap between the black-box based DNN schemes and the proposed RNN-PGP is not effectively reduced by increasing the training data size.

Generalization w.r.t. number of transmit antennas: To demonstrate the generalization capability of the proposed RNN-PGP, we train “RNN-PGP (POA)” on the data set with N_t=36 and K=19, but test it on data sets with different values of N_t. The results are shown in the “Trained with N_t=36” row of Table I. The proposed RNN-PGP yields almost the same accuracy when applied to scenarios with N_t=36, 72 and 108. Interestingly, we also test “RNN-PGP (POA)” in a heterogeneous scenario where the BSs have different numbers of antennas, with the antenna number of each BS randomly chosen from 16 to 128. As seen from the table, “RNN-PGP (POA)” still maintains an average accuracy of 93.78%.

In Table I, we also present the results when “RNN-PGP (POA)” is trained on a mixed data set containing equal-sized subsets of samples with N_t ∈ {18, 36, 72, 108}. One can see from the table that this scheme provides slightly higher accuracy.
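Constructing such a mixed training set is straightforward. Below is a small sketch, where generate_sample(Nt) is a hypothetical routine producing one training sample for a given antenna number, and the per-size sample count is illustrative.

```python
import random

def build_mixed_dataset(generate_sample, antenna_sizes=(18, 36, 72, 108),
                        samples_per_size=1250):
    """Pool equal-sized subsets of samples, one per antenna number N_t,
    then shuffle so that mini-batches mix antenna configurations."""
    data = [generate_sample(Nt)
            for Nt in antenna_sizes
            for _ in range(samples_per_size)]
    random.shuffle(data)
    return data
```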

Figure 7: Accuracy for different numbers of training samples.
TABLE III: The sum-rate performance of the proposed RNN-PGP for K=19 and N_t=36.

(a) Distance between BSs d=1 km

| Number of neighbors | c=6 (MLP 32:21:15) | c=6 (N_t=128) | c=9 (MLP 45:38:23) | c=18 (MLP 85:73:42) |
|---|---|---|---|---|
| Trained with d=1 km | 128.65 (95.86%) | 156.02 (97.03%) | 129.31 (96.35%) | 129.85 (96.75%) |
| Trained with d ∈ {0.5, 1} km | 127.98 (95.36%) | 155.86 (96.91%) | 128.35 (95.63%) | 128.96 (96.09%) |

(b) Distance between BSs d=0.5 km

| Number of neighbors | c=6 (MLP 32:21:15) | c=6 (N_t=128) | c=9 (MLP 45:38:23) | c=18 (MLP 85:73:42) |
|---|---|---|---|---|
| Trained with d=1 km | 138.15 (82.47%) | 187.21 (96.95%) | 146.18 (87.27%) | 159.22 (95.05%) |
| Trained with d ∈ {0.5, 1} km | 159.75 (95.37%) | 187.32 (97.01%) | 160.16 (95.61%) | 159.50 (95.22%) |

Generalization w.r.t. number of BSs: To verify the generalization capability of the RNN-PGP with respect to the number of BSs K, we consider training scenarios with N_t=64, K=37 and neighbor sizes c=18 and c=6, respectively. From Table II(a), one can see that with c=18, the trained “RNN-PGP (POA)” yields a test accuracy of 96.01% for K=37, with slightly reduced accuracies when deployed in scenarios with K=61 and K=91. The last column of the table shows the result when the trained RNN-PGP is deployed in a scenario where K and N_t are increased to 91 and 128, respectively; the achieved accuracy is maintained around 91.79%.

In Table II(b), we present another set of results with c=6. The performance degradation is now significant when the trained RNN-PGP is deployed to scenarios with different numbers of BSs. This implies that the neighbor size c used in the RNN-PGP network training should not be too small compared to K.
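A natural way to make the per-BS input independent of K is to restrict attention to the c strongest interfering links. The sketch below illustrates one plausible ranking rule, namely by cross-channel gain; this criterion is our assumption for illustration and is not restated in this section.

```python
import numpy as np

def strongest_neighbors(H, k, c):
    """Return the indices of the c interfering BSs with the largest
    cross-channel gain ||h_kj||^2 towards user k (excluding j = k)."""
    K = H.shape[0]
    gains = np.array([np.linalg.norm(H[k, j]) ** 2 if j != k else -np.inf
                      for j in range(K)])
    return np.argsort(gains)[::-1][:c]  # indices sorted by descending gain
```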

Generalization w.r.t. cell radius: Here, we examine the generalization capability of the RNN-PGP with respect to the cell radius d. We consider training scenarios with N_t=36, K=19 and different numbers of neighbors c=6, 9, 18, where the half inter-BS distance is either fixed to d=1 km or mixed with d ∈ {0.5, 1} km.

Table III(a) shows the results when the trained networks are tested in scenarios with the same cell radius d=1 km, while Table III(b) shows the results obtained when testing with d=0.5 km. Comparing the first data column of the “Trained with d=1 km” rows across the two tables, one can see that the accuracy decreases from 95.86% to 82.47% when the trained RNN-PGP is deployed in a scenario with the cell radius decreased to 0.5 km, while the sum-rate improvement is minor (from 128.65 to 138.15). This implies that the trained network cannot effectively mitigate the increased inter-cell interference. Interestingly, as seen from the second data column (N_t=128) of the two tables, if we apply the RNN-PGP to the scenario with the antenna number increased to N_t=128, then the accuracy degradation due to the decreased cell radius becomes minor and the sum-rate improvement is more evident (from 156.02 to 187.21). Comparing the “Trained with d ∈ {0.5, 1} km” rows of the two tables, one can see that training with mixed cell radii provides good robustness. Lastly, comparing the first data column with the c=9 and c=18 columns of Tables III(a)-(b), one can also see that larger values of c enable the RNN-PGP to achieve higher accuracy for both d=1 km and d=0.5 km.

V-C Weighted Sum-rate Performance

In this part, we consider the MISO interference channel with N_t=36, K=19 and c=18. Each data sample ℓ contains {h_kj^(ℓ), α_k^(ℓ), w_k^(ℓ)}, for ℓ=1,…,L, where the weights {α_k^(ℓ)} are generated randomly and satisfy Σ_{k=1}^{K} α_k^(ℓ) = 1. For “RNN-PGP (POA)” and “RNN-PGP (PGP)”, the sizes of the three hidden layers are 95, 80 and 45, respectively. The black-box based “DNN (POA)” and “DNN (PGP)” have 5 layers with numbers of nodes equal to 1387 (2KN_t + K), 1775, 1775, 1450 and 1368 (2KN_t), respectively.
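One simple way to draw weights that are positive and sum to one, consistent with the data generation described above, is to normalize i.i.d. uniform samples. The exact distribution used is not specified here, so the following sketch is only illustrative.

```python
import numpy as np

def random_weights(K, rng=None):
    """Draw user weights alpha_k > 0 with sum_k alpha_k = 1 by
    normalizing i.i.d. uniform samples."""
    rng = rng or np.random.default_rng()
    alpha = rng.uniform(size=K)
    return alpha / alpha.sum()
```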

Figure 8: WSR versus iteration number for N_t=36, K=19 and c=18.

We can see from Fig. 8 that the proposed RNN-PGP still achieves good WSR performance, especially when trained with the POA solutions. The black-box based DNNs still perform poorly, regardless of which supervised solutions are used for training.

TABLE IV: The sum-rate performance of the proposed RNN-PGP under the DeepMIMO dataset [30] for K=18 and c=17 (MLP 90:75:45).

| Number of antennas (X, Y, Z antennas along the x, y, z axes) | N_t=36 (X=1, Y=6, Z=6) | N_t=72 (X=1, Y=6, Z=12) | N_t=108 (X=3, Y=6, Z=6) | Random N_t selected with X·Y·Z ∈ [16, 128] |
|---|---|---|---|---|
| PGP | 102.04 | 110.36 | 114.94 | - |
| RNN-PGP, trained by POA with X=1, Y=6, Z=6 | 98.49 (96.53%) | 104.15 (94.37%) | 108.18 (94.12%) | - (93.57%) |
| RNN-PGP, trained with N_t ∈ {36, 72, 108} | 97.38 (95.43%) | 105.05 (95.19%) | 109.22 (95.02%) | - (95.09%) |

V-D Performance on DeepMIMO dataset

In this subsection, we test the proposed RNN-PGP on the ray-tracing based DeepMIMO dataset [30]. We consider the outdoor scenario ‘O1’ shown in Fig. 9. The main street (the horizontal one) is 600 m long and 40 m wide, and the second street (the vertical one) is 440 m long and 40 m wide. We consider 18 BSs (K=18), and their served UEs are selected randomly within 50 m of their respective BSs on the streets (the intersection of the black street and the red circle in Fig. 9). Each BS is equipped with an antenna array with X, Y and Z antennas along the x, y and z axes, so the antenna number is N_t = XYZ. Analogous to the synthetic data, we generate 5000 training samples (L=5000) and 1000 test samples.
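As an illustration of the UE drop described above, the following sketch rejection-samples a position uniformly within 50 m of a BS and keeps only points on the street. The names bs_xy, on_street and drop_ue are hypothetical, and the street-geometry test is left as a placeholder.

```python
import numpy as np

def drop_ue(bs_xy, radius=50.0, on_street=lambda p: True, rng=None):
    """Rejection-sample a UE position uniformly within `radius` of its
    serving BS, keeping only points that fall on the street grid."""
    rng = rng or np.random.default_rng()
    while True:
        r = radius * np.sqrt(rng.uniform())   # sqrt gives uniform density over the disc
        theta = rng.uniform(0.0, 2.0 * np.pi)
        p = np.asarray(bs_xy) + r * np.array([np.cos(theta), np.sin(theta)])
        if on_street(p):
            return p
```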

In this experiment, for the proposed RNN-PGP, the numbers of neurons of the 5-layer MLP are 72, 90, 75, 45 and 37, respectively; the black-box based DNNs also have 5 layers, with numbers of neurons equal to 2592, 2845, 2450, 1450 and 1296, respectively. The testing results versus the iteration number are shown in Fig. 10. Again, we observe consistent results: the proposed RNN-PGP achieves good sum-rate performance.

Similar to Table I, we test the generalization capability of the RNN-PGP on the DeepMIMO data set. As seen from Table IV, the proposed RNN-PGP still generalizes well and maintains good accuracies.

Figure 9: The outdoor scenario provided in ‘O1’ of [30].
Figure 10: Sum-rate versus iteration in the DeepMIMO channel for N_t=36 (X=1, Y=6, Z=6), K=18 and c=17.
TABLE V: The sum-rate performance of the proposed RNN-PGP (trained by PGP) under the DeepMIMO dataset [30] for K_r = K_t.

(a) K_t = K_r = 6 (MLP 45:32:28); active BSs {5, 6, 7, 8, 15, 16}

| Number of antennas | N_t=36 (X=1, Y=6, Z=6) | N_t=64 (X=1, Y=8, Z=8) | N_t=216 (X=6, Y=6, Z=6) |
|---|---|---|---|
| Trained with X=1, Y=8, Z=8 | 24.56 (92.49%) | 28.12 (94.56%) | 36.07 (92.47%) |
| Trained with N_t ∈ {36, 64, 216} | 24.84 (93.52%) | 27.80 (93.49%) | 36.47 (93.51%) |

(b) K_t = K_r = 12 (MLP 75:56:50); active BSs {3, 4, 5, 6, 7, 8, 9, 10, 15, 16, 17, 18}

| Number of antennas | N_t=36 (X=1, Y=6, Z=6) | N_t=64 (X=1, Y=8, Z=8) | N_t=216 (X=6, Y=6, Z=6) |
|---|---|---|---|
| Trained with X=1, Y=8, Z=8 | 37.46 (92.44%) | 43.72 (94.23%) | 54.99 (92.40%) |
| Trained with N_t ∈ {36, 64, 216} | 37.88 (93.49%) | 43.35 (93.43%) | 55.42 (93.13%) |
TABLE VI: The sum-rate performance of the proposed RNN-PGP (trained by PGP) under the DeepMIMO dataset [30] for K_t=12, K_r=12 and N_t=64 (X=1, Y=8, Z=8) (MLP 75:56:50).

| Number of BSs K_t | K_t=6 | K_t=12 | K_t=18 | Randomly selected from 6-18 |
|---|---|---|---|---|
| PGP | 39.47 | 46.45 | 69.66 | - |
| RNN-PGP, trained with K_t=12 | 36.57 (92.36%) | 43.76 (94.23%) | 64.21 (92.17%) | - (92.78%) |
| RNN-PGP, trained with K_t ∈ {6, 12, 18} | 36.88 (93.43%) | 43.6 (93.89%) | 64.96 (93.26%) | - (93.31%) |

V-E Performance for Cooperative Multicell Beamforming

In this subsection, we examine the proposed RNN-PGP in Fig. 5 for the cooperative multicell beamforming problem in Section IV. We again consider the outdoor scenario ‘O1’ of the DeepMIMO dataset [30]. Two cases are considered: (1) K_t = K_r = 6, and (2) K_t = K_r = 12. In case (1), BSs 5, 6, 7, 8, 15 and 16 in Fig. 9 are selected, while in case (2) BSs 3, 4, 5, 6, 7, 8, 9, 10, 15, 16, 17 and 18 are selected. The UEs are selected randomly on the streets (the black region of Fig. 9). For both cases, the RNN-PGP is trained on the dataset with N_t=64 (X=1, Y=Z=8) using the PGP solutions.

The test performance results for the two cases are shown in Table V(a) and Table V(b), respectively. One can observe that for both cases the proposed RNN-PGP yields high accuracy and generalizes well when deployed in scenarios with different numbers of transmit antennas.

In Table VI, we further consider the generalization capability with respect to the number of BSs K_t. The RNN-PGP is trained under the setting of case (2) with K_t=12, K_r=12 and N_t=64, and is tested in different scenarios with K_t=6, K_t=18 and randomly selected numbers of BSs. It can be observed from the table that the proposed RNN-PGP maintains good performance and generalizes well w.r.t. the number of BSs.

VI Conclusion

In this paper, we have considered learning-based beamforming design for MISO interference channels and cooperative multicell scenarios. In particular, to overcome the computational issues of massive MIMO beamforming optimization, we have proposed the RNN-PGP network by unfolding the simple PGP method. We have shown that by exploiting the low-dimensional structures of the optimal beamforming solutions as well as of the gradient vector of the WSR, the proposed RNN-PGP network has a low complexity that is independent of the number of transmit antennas. We have also refined the input and output of the MLP in the RNN-PGP to remove its dependence on the number of BSs. Extensive experiments have been conducted on both a synthetic channel dataset and the DeepMIMO dataset. The presented results demonstrate that the proposed RNN-PGP achieves high solution accuracy at the expense of only a small computation time. More importantly, the proposed RNN-PGP has promising generalization capabilities with respect to the number of transmit antennas, the number of BSs, and the cell radius, which is a key ability for deployment in heterogeneous networks.

References

  • [1] M. Zhu and T.-H. Chang, “Optimization inspired learning network for multiuser robust beamforming,” in IEEE 11th Sensor Array and Multichannel Signal Processing Workshop, Hangzhou, China, Jun. 8-11, 2020, pp. 1–5.
  • [2] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 186–195, 2014.
  • [3] S. Parkvall, E. Dahlman, A. Furuskar, and M. Frenne, “NR: The new 5G radio access technology,” IEEE Communications Standards Magazine, vol. 1, no. 4, pp. 24–30, 2017.
  • [4] X. Ge, S. Tu, G. Mao, C.-X. Wang, and T. Han, “5G ultra-dense cellular networks,” IEEE Wirel. Commun., vol. 23, no. 1, pp. 72–79, 2016.
  • [5] A. B. Gershman, N. D. Sidiropoulos, S. Shahbazpanahi, M. Bengtsson, and B. Ottersten, “Convex optimization-based beamforming,” IEEE Signal Process. Mag., vol. 27, no. 3, pp. 62–75, 2010.
  • [6] W.-K. Ma, “Semidefinite relaxation of quadratic optimization problems and applications,” IEEE Signal Process. Mag., vol. 1053, no. 5888/10, 2010.
  • [7] Z.-Q. Luo and S. Zhang, “Dynamic spectrum management: Complexity and duality,” IEEE J. Sel. Topics Signal Process., vol. 2, no. 1, pp. 57–73, 2008.
  • [8] Y.-F. Liu, Y.-H. Dai, and Z.-Q. Luo, “Coordinated beamforming for MISO interference channel: Complexity analysis and efficient algorithms,” IEEE Trans. Signal Process., vol. 59, no. 3, pp. 1142–1157, 2010.
  • [9] A. Wiesel, Y. C. Eldar, and S. Shamai, “Zero-forcing precoding and generalized inverses,” IEEE Trans. Signal Process., vol. 56, no. 9, pp. 4409–4418, 2008.
  • [10] J. Papandriopoulos and J. S. Evans, “SCALE: A low-complexity distributed protocol for spectrum balancing in multiuser DSL networks,” IEEE Trans. Inf. Theory, vol. 55, no. 8, pp. 3711–3724, 2009.
  • [11] C. Shen, W.-C. Li, and T.-H. Chang, “Wireless information and energy transfer in multi-antenna interference channel,” IEEE Trans. Signal Process., vol. 62, no. 23, pp. 6249–6264, 2014.
  • [12] S. S. Christensen, R. Agarwal, E. De Carvalho, and J. M. Cioffi, “Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design,” IEEE Trans. Wireless Commun., vol. 7, no. 12, pp. 4792–4799, 2008.
  • [13] Q. Shi, M. Razaviyayn, Z.-Q. Luo, and C. He, “An iteratively weighted MMSE approach to distributed sum-utility maximization for a MIMO interfering broadcast channel,” IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4331–4340, 2011.
  • [14] T. J. O’Shea, T. Erpek, and T. C. Clancy, “Deep learning based MIMO communications,” arXiv preprint arXiv:1707.07980, 2017.
  • [15] W. Cui, K. Shen, and W. Yu, “Spatial deep learning for wireless scheduling,” IEEE J. Sel. Areas Commun., vol. 37, no. 6, pp. 1248–1261, 2019.
  • [16] H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos, “Learning to optimize: Training deep neural networks for interference management,” IEEE Trans. Signal Process., vol. 66, no. 20, pp. 5438–5453, 2018.
  • [17] F. Liang, C. Shen, W. Yu, and F. Wu, “Towards optimal power control via ensembling deep neural networks,” IEEE Trans. Commun., vol. 68, no. 3, pp. 1760–1776, 2019.
  • [18] Y. S. Nasir and D. Guo, “Multi-agent deep reinforcement learning for dynamic power allocation in wireless networks,” IEEE J. Sel. Areas Commun., vol. 37, no. 10, pp. 2239–2250, 2019.
  • [19] W. Xia, G. Zheng, Y. Zhu, J. Zhang, J. Wang, and A. P. Petropulu, “A deep learning framework for optimization of MISO downlink beamforming,” IEEE Trans. Commun., vol. 68, no. 3, pp. 1866–1880, 2019.
  • [20] A. Alkhateeb, S. Alex, P. Varkey, Y. Li, Q. Qu, and D. Tujkovic, “Deep learning coordinated beamforming for highly-mobile millimeter wave systems,” IEEE Access, vol. 6, pp. 37 328–37 348, 2018.
  • [21] A. Balatsoukas-Stimming and C. Studer, “Deep unfolding for communications systems: A survey and some new directions,” in IEEE International Workshop on Signal Processing Systems, Nanjing, China, Oct. 20-23, 2019, pp. 266–271.
  • [22] J. R. Hershey, J. L. Roux, and F. Weninger, “Deep unfolding: Model-based inspiration of novel deep architectures,” arXiv preprint arXiv:1409.2574, 2014.
  • [23] N. Samuel, T. Diskin, and A. Wiesel, “Learning to detect,” IEEE Trans. Signal Process., vol. 67, no. 10, pp. 2554–2564, 2019.
  • [24] M.-W. Un, M. Shao, W.-K. Ma, and P. Ching, “Deep MIMO detection using ADMM unfolding,” in Proc. IEEE Data Science Workshop (DSW), 2019, pp. 333–337.
  • [25] Y. Wei, M.-M. Zhao, M. Hong, M.-J. Zhao, and M. Lei, “Learned conjugate gradient descent network for massive MIMO detection,” in Proc. IEEE ICC, Dublin, Ireland, Jun. 7-11, 2020, pp. 1–6.
  • [26] Q. Hu, Y. Cai, Q. Shi, K. Xu, G. Yu, and Z. Ding, “Iterative algorithm induced deep-unfolding neural networks: Precoding design for multiuser MIMO systems,” arXiv preprint arXiv:2006.08099, 2020.
  • [27] L. Pellaco, M. Bengtsson, and J. Jaldén, “Deep unfolding of the weighted MMSE beamforming algorithm,” arXiv preprint arXiv:2006.08448, 2020.
  • [28] D. Bertsekas, Nonlinear Programming, 2nd ed.   Athena Scientific Belmont, MA, 1999.
  • [29] E. G. Larsson and E. A. Jorswieck, “Competition versus cooperation on the MISO interference channel,” IEEE J. Sel. Areas Commun., vol. 26, no. 7, pp. 1059–1069, 2008.
  • [30] A. Alkhateeb, “DeepMIMO: A generic deep learning dataset for millimeter wave and massive MIMO applications,” arXiv preprint arXiv:1902.06435, 2019.
  • [31] H. Tuy, “Monotonic optimization: Problems and solution approaches,” SIAM J. Optim., vol. 11, no. 2, pp. 464–494, 2000.
  • [32] S. Verdu et al., Multiuser Detection. Cambridge University Press, 1998.
  • [33] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods. Prentice Hall, Englewood Cliffs, NJ, 1989, vol. 23.
  • [34] W. Utschick and J. Brehmer, “Monotonic optimization framework for coordinated beamforming in multicell networks,” IEEE Trans. Signal Process., vol. 60, no. 4, pp. 1899–1909, 2011.
  • [35] W.-C. Li, T.-H. Chang, and C.-Y. Chi, “Multicell coordinated beamforming with rate outage constraint—Part II: Efficient approximation algorithms,” IEEE Trans. Signal Process., vol. 63, no. 11, pp. 2763–2778, 2015.
  • [36] E. A. Jorswieck, E. G. Larsson, and D. Danev, “Complete characterization of the Pareto boundary for the MISO interference channel,” IEEE Trans. Signal Process., vol. 56, no. 10, pp. 5292–5296, 2008.
  • [37] M. K. Karakayali, G. J. Foschini, and R. A. Valenzuela, “Network coordination for spectrally efficient communications in cellular systems,” IEEE Wirel. Commun., vol. 13, no. 4, pp. 56–61, 2006.
  • [38] H. Zhang and H. Dai, “Cochannel interference mitigation and cooperative processing in downlink multicell multiuser MIMO networks,” EURASIP J. Wirel. Commun. Netw., vol. 2004, no. 2, p. 202654, 2004.
  • [39] E. Bjornson, R. Zakhour, D. Gesbert, and B. Ottersten, “Cooperative multicell precoding: Rate region characterization and distributed strategies with instantaneous and statistical CSI,” IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4298–4310, 2010.