Fluid Antennas-Enabled Multiuser Uplink: A Low-Complexity Gradient Descent for Total Transmit Power Minimization

Guojie Hu, Qingqing Wu, Senior Member, IEEE, Kui Xu, Member, IEEE, Jian Ouyang, Member, IEEE, Jiangbo Si, Senior Member, IEEE, Yunlong Cai, Senior Member, IEEE, and Naofal Al-Dhahir, Fellow, IEEE

This work was supported in part by the Natural Science Foundations of China under Grants 62201606. Guojie Hu is with the College of Communication Engineering, Rocket Force University of Engineering, Xi’an 710025, China (email: [email protected]). Qingqing Wu is with the Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China ([email protected]). Kui Xu is with the College of Communications Engineering, the Army of Engineering University, Nanjing 210007, China ([email protected]). Jian Ouyang is with the Institute of Signal Processing and Transmission, Nanjing University of Posts and Telecommunications, Nanjing 210003, China ([email protected]). Jiangbo Si is with the Integrated Service Networks Lab of Xidian University, Xi’an 710100, China ([email protected]). Yunlong Cai is with the College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China (email: [email protected]). Naofal Al-Dhahir is with the Department of Electrical and Computer Engineering, The University of Texas at Dallas, Richardson, TX 75080 USA ([email protected]).

Abstract

We investigate multiuser uplink communications from multiple single-antenna users to a base station (BS), which is equipped with multiple fluid antennas (FAs) and adopts zero-forcing receivers to decode multiple signals. We aim to optimize antennas’ positions at the BS, to minimize the total transmit power of all users subject to the minimum rate requirement. After applying transformations, we show that the problem is equivalent to minimizing the sum of each eigenvalue’s reciprocal of a matrix, which is a function of all antennas’ positions at the BS. Subsequently, the projected gradient descent (PGD) method is utilized to find a locally optimal solution. In particular, different from the latest related work, we exploit the eigenvalue decomposition to successfully derive a closed-form gradient for the PGD, which facilitates the practical implementation greatly. We demonstrate by simulations that via careful optimization for all antennas’ positions in our proposed design, the total transmit power of all users can be decreased significantly as compared to competitive benchmarks.

Index Terms:

Fluid antennas, multiuser uplink, total transmit power minimization, projected gradient descent.

I Introduction

Beamforming, which exploits the degrees of freedom (DoF) in the spatial domain, is a powerful technique for improving system capacity [1]. In conventional beamforming, the positions of the antennas at the transceivers are fixed, which may limit the beamforming gains depending on channel conditions.

To mitigate the above deficiency, the intelligent reflecting surface (IRS) technique has been proposed and proven to be capable of reconfiguring wireless channels by adjusting passive IRS reflecting coefficients [2]. As another promising technology, fluid antennas (FAs) [3]-[6] have emerged recently. Although their operating principle is different from that of the IRS, FAs can also reshape channel environments artificially, by adaptively adjusting the positions of all antennas (connected to the radio frequency chains via flexible cables) with the support of stepper motors or servos. Unlike antenna selection (AS), which requires more candidate antennas, higher hardware cost and larger channel estimation overhead, and unlike the rotatable uniform linear array (RULA), which just mechanically rotates the transmit/receive array and cannot fully exploit spatial channel variation, FAs fully exploit the channel variation resulting from changes in antennas’ positions to achieve a higher spatial diversity without causing additional hardware or algorithm cost [7]. Driven by these potential advantages, earlier works have applied FAs to further enhance the capacities of multiple-input multiple-output (MIMO) systems [7]-[8], multiuser uplink/downlink communications [9]-[10], physical-layer security systems [11] and interference networks [12].

Figure 1: Illustration of the system model.

In this letter, as in [9], we focus on classical FAs-enabled multiuser uplink communications. Specifically, we assume multiple single-antenna users that intend to concurrently transmit their signals to a base station (BS), which is equipped with FAs and adopts the widely used zero-forcing (ZF) receivers to detect multiple signals. By carefully optimizing the positions of all antennas at the BS, our goal is to minimize the total transmit power of all users subject to the minimum rate requirement for each user. The formulated problem is highly non-convex, and we develop a projected gradient descent (PGD) method to find a locally optimal solution. Unlike [9], which computes the gradient in each iteration from its original limit-based definition, the key contribution of this letter is a closed-form gradient for each iteration, derived with the help of the eigenvalue decomposition. This closed form greatly accelerates the implementation of the PGD method. Numerical results demonstrate that our proposed method with FAs can significantly decrease the total transmit power of all users as compared to competitive benchmarks.

II System Model and Problem Formulation

As shown in Fig. 1, we consider multiuser uplink communications from $M$ single-antenna users $\left\{{{{\rm{U}}_{i}}}\right\}_{i=1}^{M}$ to the BS equipped with $N$ FAs distributed along a linear dimension, with $N\geq M$ . We consider a line-of-sight (LoS) propagation environment. (Footnote 1: As shown in what follows, the simple LoS environment is considered here because we aim to demonstrate that the proposed design of FAs’ movements relies only on the slowly changing statistical channel state information (CSI). In the simulations, we show the effectiveness of the proposed FAs’ movement rule under random Rician fading channels.) Under this model, the channel vector between the BS and ${{{\rm{U}}_{i}}}$ is denoted by

\begin{split}{}{{\bf{h}}_{i}}({\bf{x}})={\left[{{e^{j\frac{{2\pi}}{\lambda}{x_{1}}\sin{\theta_{i}}}},{e^{j\frac{{2\pi}}{\lambda}{x_{2}}\sin{\theta_{i}}}},...,{e^{j\frac{{2\pi}}{\lambda}{x_{N}}\sin{\theta_{i}}}}}\right]^{T}},\end{split}

(1)

where $\lambda$ is the signal wavelength, ${{\theta_{i}}}$ is the angle of arrival (AoA) at the BS of the signal from ${\rm{U}}_{i}$ , and $x_{n}$ denotes the adjustable position of the $n$ -th antenna at the BS, with ${\bf{x}}={\left[{{x_{1}},{x_{2}},...,{x_{N}}}\right]^{T}}\in{{\mathbb{R}}^{N\times 1}}$ . For the multiuser uplink, the received signals ${\bf{y}}\in{{\mathbb{C}}^{M\times 1}}$ at the BS can be expressed as

\begin{split}{}{\bf{y}}={{\bf{W}}^{H}}{\bf{H}}({\bf{x}}){{\bf{P}}^{1/2}}{\bf{s}}+{{\bf{W}}^{H}}{\bf{n}},\end{split}

(2)

where ${\bf{H}}({\bf{x}})=\left[{{{\bf{h}}_{1}}({\bf{x}}),{{\bf{h}}_{2}}({\bf{x}}),...,{{\bf{h}}_{M}}({\bf{x}})}\right]\in{{\mathbb{C}}^{N\times M}}$ , ${{\bf{P}}^{1/2}}={\rm{diag}}\left\{{\left[{\sqrt{{P_{1}}},\sqrt{{P_{2}}},...,\sqrt{{P_{M}}}}\right]}\right\}$ , in which $P_{i}$ denotes the transmit power of ${\rm{U}}_{i}$ , ${\bf{s}}={\left[{{s_{1}},{s_{2}},...,{s_{M}}}\right]^{T}}\in{{\mathbb{C}}^{M\times 1}}$ , in which $s_{i}$ denotes the transmitted signal of ${\rm{U}}_{i}$ and ${\mathbb{E}}\left[{{{\left|{{s_{i}}}\right|}^{2}}}\right]=1,\forall i=1,...,M$ . In addition, ${\bf{W}}=\left[{{{\bf{w}}_{1}},{{\bf{w}}_{2}},...,{{\bf{w}}_{M}}}\right]\in{{\mathbb{C}}^{N\times M}}$ is the receive combining matrix at the BS, in which ${{\bf{w}}_{i}}$ is the combining vector for the signal $s_{i}$ , and ${\bf{n}}={\left[{{n_{1}},{n_{2}},...,{n_{N}}}\right]^{T}}$ , in which $n_{i}$ is the additive white Gaussian noise at the $i$ -th BS antenna, with ${n_{i}}\sim{\cal C}{\cal N}(0,{\sigma^{2}})$ . Based on (2), the received signal-to-interference-plus-noise ratio (SINR) of the signal $s_{i}$ at the BS is derived as

\begin{split}{}{\gamma_{i}}=\frac{{{P_{i}}{{\left|{{\bf{w}}_{i}^{H}{{\bf{h}}_{i}}({\bf{x}})}\right|}^{2}}}}{{\sum\nolimits_{k=1,k\neq i}^{M}{{P_{k}}{{\left|{{\bf{w}}_{i}^{H}{{\bf{h}}_{k}}({\bf{x}})}\right|}^{2}}+\left\|{{{\bf{w}}_{i}}}\right\|_{2}^{2}{\sigma^{2}}}}}.\end{split}

(3)

In this letter, we assume that the BS adopts the widely used linear ZF detector for processing multiple signals, due to its low implementation complexity, especially when the number of antennas at the BS is large. Based on this, the receive combining matrix ${\bf{W}}$ is expressed as

\begin{split}{}{\bf{W}}={\bf{H}}({\bf{x}}){\left({{\bf{H}}{{({\bf{x}})}^{H}}{\bf{H}}({\bf{x}})}\right)^{-1}}.\end{split}

(4)

Substituting (4) into (3), the received SINR of the signal $s_{i}$ is given by

\begin{split}{}{\gamma_{i}}=\frac{{{P_{i}}}}{{\left\|{{{\left[{{\bf{H}}({\bf{x}}){{\left({{\bf{H}}{{({\bf{x}})}^{H}}{\bf{H}}({\bf{x}})}\right)}^{-1}}}\right]}_{:,i}}}\right\|_{2}^{2}{\sigma^{2}}}}.\end{split}

(5)
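The step from (3) to (5) can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: the function names, the antenna positions and the power values are illustrative assumptions. It builds ${\bf{H}}({\bf{x}})$ from (1), forms the ZF combiner in (4), and evaluates the SINR both through the general expression (3) and through the simplified expression (5).

```python
import numpy as np

def steering_matrix(x, theta, wavelength=1.0):
    """H(x) as in (1): entry (n, i) is exp(j*2*pi/lambda * x_n * sin(theta_i))."""
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.exp(1j * 2.0 * np.pi / wavelength * np.outer(x, np.sin(theta)))

def zf_combiner(H):
    """Receive combining matrix W = H (H^H H)^{-1} as in (4)."""
    return H @ np.linalg.inv(H.conj().T @ H)

def sinr_general(H, W, P, sigma2):
    """Per-user SINR from (3), valid for any combining matrix W."""
    gains = np.abs(W.conj().T @ H) ** 2           # gains[i, k] = |w_i^H h_k|^2
    signal = P * np.diag(gains)
    interference = gains @ P - signal             # sum over k != i
    noise = sigma2 * np.sum(np.abs(W) ** 2, axis=0)
    return signal / (interference + noise)

def sinr_zf(H, P, sigma2):
    """Per-user SINR from (5), which holds only for the ZF combiner."""
    W = zf_combiner(H)
    return P / (sigma2 * np.sum(np.abs(W) ** 2, axis=0))

# Illustrative values (assumptions): N = 4 antennas, M = 3 users.
x = np.array([0.0, 0.9, 1.7, 2.5])
theta = np.array([np.pi / 16, np.pi / 10, np.pi / 2])
P = np.array([1.0, 2.0, 0.5])
H = steering_matrix(x, theta)
W = zf_combiner(H)
gamma_from_3 = sinr_general(H, W, P, sigma2=1.0)
gamma_from_5 = sinr_zf(H, P, sigma2=1.0)
```

Because ${{\bf{W}}^{H}}{\bf{H}}({\bf{x}})$ is the identity matrix under ZF, the interference terms vanish and the two SINR vectors agree up to rounding error.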

Our goal is to optimize the positions of FAs at the BS, i.e., ${\bf{x}}$ , to minimize the total transmit power of the $M$ users subject to a minimum achievable rate requirement for each user. (Footnote 2: In this work, only the antennas’ positions are optimized for total power minimization. When receive beamforming and antennas’ positions are jointly optimized, the generalized Benders decomposition [13] can be exploited to obtain the globally optimal solution.) Hence, the optimization problem is formulated as

	$\displaystyle({\rm{P1}}):{\rm{}}\mathop{\min}\limits_{{\bf{x}},{\bf{P}}}\ \sum\nolimits_{i=1}^{M}{{P_{i}}}$		( ${\rm{6a}}$ )
	$\displaystyle\ {\rm{s.t.}}\quad{\log_{2}}(1+{\gamma_{i}})\geq{r_{i}},\forall i=1,...,M,$		( ${\rm{6b}}$ )
	$\displaystyle\quad\quad\ \ {\bf{x}}\in{\cal C},$		( ${\rm{6c}}$ )

where $r_{i}$ in the constraint (6b) denotes the minimum rate requirement for ${\rm{U}}_{i}$ , and ${\cal C}$ in (6c) denotes the feasible moving region for the $N$ antennas at the BS. More specifically, denote the total span for the movement of FAs as $L$ and, without loss of generality, set $0\leq{x_{1}}<{x_{2}}<...<{x_{N}}\leq L$ . Then, considering that i) the minimum distance between any two FAs needed to avoid the coupling effect is ${{d_{\min}}}$ [7], [8], i.e., $\left|{{x_{i}}-{x_{j}}}\right|\geq{d_{\min}}$ , $\forall i\neq j$ , and ii) the movement span should be the same for each antenna, we can conveniently set ${\cal C}\buildrel\Delta\over{=}\left\{{{x_{i}}\in[{F_{i}},{G_{i}}]}\right\}_{i=1}^{N}$ , where

\begin{split}{}{F_{i}}=&\frac{{L-(N-1){d_{\min}}}}{N}(i-1)+(i-1){d_{\min}},\\ {G_{i}}=&\frac{{L-(N-1){d_{\min}}}}{N}i+(i-1){d_{\min}},\end{split}

from which we have $0={F_{1}}<{G_{1}}<{F_{2}}<{G_{2}}<...<{F_{N}}<{G_{N}}=L$ and ${G_{i}}-{F_{i}}=\frac{{L-(N-1){d_{\min}}}}{N}$ , $\forall i=1,...,N$ . The feasible movement region for each FA is illustrated in Fig. 2 for better understanding.
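As a small worked example of the interval bounds $F_{i}$ and $G_{i}$ above, the sketch below computes them with NumPy. The function name `feasible_intervals` and the chosen values ($N=4$, $L=3.5$, $d_{\min}=0.5$) are assumptions made for illustration only.

```python
import numpy as np

def feasible_intervals(N, L, d_min):
    """Return arrays F, G with x_i restricted to [F_i, G_i] for i = 1..N."""
    span = (L - (N - 1) * d_min) / N          # common movement span G_i - F_i
    if span <= 0:
        raise ValueError("L is too small to fit N antennas spaced d_min apart")
    i = np.arange(1, N + 1)
    F = span * (i - 1) + (i - 1) * d_min
    G = span * i + (i - 1) * d_min
    return F, G

# Illustrative values (assumptions): N = 4, L = 3.5, d_min = 0.5.
F, G = feasible_intervals(N=4, L=3.5, d_min=0.5)
gaps = F[1:] - G[:-1]                          # guard distance between neighbours
```

For these values every antenna may move over a span of 0.5, with $F=[0,1,2,3]$ and $G=[0.5,1.5,2.5,3.5]$. Adjacent intervals are separated by exactly $d_{\min}$, so any point of ${\cal C}$ automatically satisfies the minimum-distance requirement.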

\begin{split}{}\frac{{\partial{\lambda_{i}}\left\{{\bf{Z}}\right\}}}{{\partial{x_{n}}}}=&{\rm{Re}}\left[{\frac{{\partial{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}}}{{\partial{x_{n}}}}{\bf{Z}}{{\left[{\bf{V}}\right]}_{:,i}}+{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{\bf{Z}}}}{{\partial{x_{n}}}}{{\left[{\bf{V}}\right]}_{:,i}}+{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}{\bf{Z}}\frac{{\partial{{\left[{\bf{V}}\right]}_{:,i}}}}{{\partial{x_{n}}}}}\right]\\ \mathop{=}\limits^{(a)}&{\rm{Re}}\left[{\frac{{\partial{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}}}{{\partial{x_{n}}}}{\lambda_{i}}\left\{{\bf{Z}}\right\}{{\left[{\bf{V}}\right]}_{:,i}}+{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{\bf{Z}}}}{{\partial{x_{n}}}}{{\left[{\bf{V}}\right]}_{:,i}}+{\lambda_{i}}\left\{{\bf{Z}}\right\}{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{{\left[{\bf{V}}\right]}_{:,i}}}}{{\partial{x_{n}}}}}\right].\end{split}

(13)

\begin{split}{}{\nabla_{{{\bf{x}}^{t}}}}f({\bf{x}})=\left[{\sum\nolimits_{i=1}^{M}{\frac{{-{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{\bf{Z}}}}{{\partial{x_{1}}}}{{\left[{\bf{V}}\right]}_{:,i}}}}{{\lambda_{i}^{2}\left\{{\bf{Z}}\right\}}}},\sum\nolimits_{i=1}^{M}{\frac{{-{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{\bf{Z}}}}{{\partial{x_{2}}}}{{\left[{\bf{V}}\right]}_{:,i}}}}{{\lambda_{i}^{2}\left\{{\bf{Z}}\right\}}}},...,\sum\nolimits_{i=1}^{M}{\frac{{-{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{\bf{Z}}}}{{\partial{x_{N}}}}{{\left[{\bf{V}}\right]}_{:,i}}}}{{\lambda_{i}^{2}\left\{{\bf{Z}}\right\}}}}}\right]_{{\bf{x}}={{\bf{x}}^{t}}}^{T}.\end{split}

(18)

Based on (6b), it can be shown that $P_{i}$ should satisfy

\begin{split}{}{P_{i}}\geq\left\|{{{\left[{{\bf{H}}({\bf{x}}){{\left({{\bf{H}}{{({\bf{x}})}^{H}}{\bf{H}}({\bf{x}})}\right)}^{-1}}}\right]}_{:,i}}}\right\|_{2}^{2}{\varepsilon_{i}}{\sigma^{2}},\end{split}

(7)

where ${\varepsilon_{i}}=({2^{{r_{i}}}}-1)$ . According to (7), we can equivalently replace the objective of (P1) as [9]

\begin{split}{}&\sum\nolimits_{i=1}^{M}{\left\|{{{\left[{{\bf{H}}({\bf{x}}){{\left({{\bf{H}}{{({\bf{x}})}^{H}}{\bf{H}}({\bf{x}})}\right)}^{-1}}}\right]}_{:,i}}}\right\|_{2}^{2}{\varepsilon_{i}}{\sigma^{2}}}\\ =&\left\|{{\bf{H}}({\bf{x}}){{\left({{\bf{H}}{{({\bf{x}})}^{H}}{\bf{H}}({\bf{x}})}\right)}^{-1}}{{\bf{\Omega}}^{1/2}}}\right\|_{\rm{F}}^{2}\\ =&{\rm{tr}}\left\{{{{\left({{{\bf{\Omega}}^{-1}}{\bf{H}}{{({\bf{x}})}^{H}}{\bf{H}}({\bf{x}})}\right)}^{-1}}}\right\}\\ =&\sum\nolimits_{i=1}^{M}{\frac{1}{{{\lambda_{i}}\left\{{{{\bf{\Omega}}^{-1}}{\bf{H}}{{({\bf{x}})}^{H}}{\bf{H}}({\bf{x}})}\right\}}}}\buildrel\Delta\over{=}f({\bf{x}}),\end{split}

(8)

where ${\bf{\Omega}}={\rm{diag}}\left\{{\left[{{\varepsilon_{1}}{\sigma^{2}},{\varepsilon_{2}}{\sigma^{2}},...,{\varepsilon_{M}}{\sigma^{2}}}\right]}\right\}$ and ${{\lambda_{i}}\left\{{{{\bf{\Omega}}^{-1}}{\bf{H}}{{({\bf{x}})}^{H}}{\bf{H}}({\bf{x}})}\right\}}$ denotes the $i$ -th eigenvalue of the matrix ${{\bf{\Omega}}^{-1}}{\bf{H}}{({\bf{x}})^{H}}{\bf{H}}({\bf{x}})\buildrel\Delta\over{=}{\bf{Z}}\in{{\mathbb{C}}^{M\times M}}$ . Therefore, problem (P1) can be equivalently reformulated as

	$\displaystyle({\rm{P2}}):{\rm{}}\mathop{\min}\limits_{{\bf{x}}}\ f({\bf{x}})$		( ${\rm{9a}}$ )
	$\displaystyle\ {\rm{s.t.}}\quad{\bf{x}}\in{\cal C}.$		( ${\rm{9b}}$ )
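The chain of equalities in (8) can be verified numerically. The sketch below is an illustration under assumed values (antenna positions, rate targets, and the helper names are not from the original): it evaluates the total power once by summing the right-hand side of (7) over all users, and once as the sum of reciprocal eigenvalues of ${\bf{Z}}={{\bf{\Omega}}^{-1}}{\bf{H}}{({\bf{x}})^{H}}{\bf{H}}({\bf{x}})$ .

```python
import numpy as np

def steering_matrix(x, theta, wavelength=1.0):
    """H(x) as in (1)."""
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.exp(1j * 2.0 * np.pi / wavelength * np.outer(x, np.sin(theta)))

def total_power_from_7(x, theta, eps, sigma2=1.0):
    """Sum over users of the minimum power in (7)."""
    H = steering_matrix(x, theta)
    W = H @ np.linalg.inv(H.conj().T @ H)
    col_norms = np.sum(np.abs(W) ** 2, axis=0)
    return float(np.sum(col_norms * np.asarray(eps) * sigma2))

def objective_f(x, theta, eps, sigma2=1.0):
    """f(x) in (8): sum of reciprocal eigenvalues of Z = Omega^{-1} H^H H."""
    H = steering_matrix(x, theta)
    omega_diag = np.asarray(eps, dtype=float) * sigma2
    Z = (H.conj().T @ H) / omega_diag[:, None]   # row i scaled by 1/(eps_i sigma^2)
    lam = np.linalg.eigvals(Z)
    return float(np.sum(1.0 / lam).real)

# Illustrative values (assumptions).
x = np.array([0.0, 1.1, 2.3, 3.5])
theta = np.array([np.pi / 16, np.pi / 10, np.pi / 2])
rates = np.array([1.0, 1.5, 2.0])
eps = 2.0 ** rates - 1.0
p_direct = total_power_from_7(x, theta, eps)
p_eig = objective_f(x, theta, eps)
```

Although ${\bf{Z}}$ is not Hermitian in general, it is similar to the Hermitian matrix ${{\bf{\Omega}}^{-1/2}}{\bf{H}}{({\bf{x}})^{H}}{\bf{H}}({\bf{x}}){{\bf{\Omega}}^{-1/2}}$ , so its eigenvalues are real and positive whenever ${\bf{H}}({\bf{x}})$ has full column rank; the `.real` call only discards rounding noise.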

Remark 1: Problem (P2) is highly non-convex because its objective is neither convex nor concave, and hence it cannot be solved via standard convex optimization techniques. Motivated by this, the authors in [9] solve (P2) by resorting to the PGD method, which handles simple unconstrained or constrained problems well and does not rely on concavity or convexity of the objective. However, [9] computes the gradient based on the original definition shown in its equation (12), which incurs a large implementation complexity. In the next section, we show how to reduce the complexity significantly.

Remark 2: Considering the LoS environment, the BS can easily obtain the CSI by just estimating the AoAs of the $M$ users using mature algorithms such as MUSIC. Based on this, the BS can directly optimize FAs’ positions via the proposed algorithm and then feed back to each user the required transmit power based on (7) with the optimized ${\bf{x}}$ . (Footnote 3: In addition, even if general Rician fading is considered, the BS still optimizes FAs’ positions in advance based on the estimated AoAs. Then, in the communication process, the antennas’ positions are not changed and each user sends pilot signals to the BS for uplink channel estimation. Once the BS has estimated the instantaneous CSI, it can tell each user the required transmit power based on (7). Since no antenna movements are involved, the time consumed by estimation and feedback is much smaller than the channel coherence time (CCT), especially in low-mobility scenarios where the CCT is relatively large [14].)

III Algorithm Design for Solving (P2)

In this letter, we still exploit the PGD method to find a locally optimal solution to (P2). Specifically, using PGD, the update rule for ${\bf{x}}$ in the $t+1$ -th iteration is given by

\begin{split}{}{{\bf{x}}^{t+1}}=&{{\bf{x}}^{t}}-\delta{\nabla_{{{\bf{x}}^{t}}}}f({\bf{x}}),\\ {{\bf{x}}^{t+1}}=&{\cal B}\left\{{{{\bf{x}}^{t+1}},{\cal C}}\right\},\end{split}

(10)

where ${{\bf{x}}^{t+1}}$ in the first equation is the original updated ${\bf{x}}$ , and ${{\bf{x}}^{t+1}}$ in the second equation is the additional update (if necessary) via the projection function ${\cal B}\left\{\cdot\right\}$ as explained later, which ensures that the solutions for FAs’ positions in each iteration always satisfy the constraint in (9b). Further, ${\nabla_{{{\bf{x}}^{t}}}}f({\bf{x}})$ denotes the gradient of $f({\bf{x}})$ at ${{{\bf{x}}^{t}}}$ , and $\delta$ is the step size for the gradient descent.

A. Computing ${\nabla_{{{\bf{x}}^{t}}}}f({\bf{x}})$ : Note that ${\nabla_{\bf{x}}}f({\bf{x}})={\left[{\frac{{\partial f({\bf{x}})}}{{\partial{x_{1}}}},...,\frac{{\partial f({\bf{x}})}}{{\partial{x_{N}}}}}\right]^{T}}$ . Using the chain rule, $\frac{{\partial f({\bf{x}})}}{{\partial{x_{n}}}}$ , $\forall n=1,...,N$ , can be derived as

\begin{split}{}\frac{{\partial f({\bf{x}})}}{{\partial{x_{n}}}}=\sum\nolimits_{i=1}^{M}{\frac{{-1}}{{\lambda_{i}^{2}\left\{{\bf{Z}}\right\}}}}\frac{{\partial{\lambda_{i}}\left\{{\bf{Z}}\right\}}}{{\partial{x_{n}}}}.\end{split}

(11)

Based on (11), to compute ${\nabla_{\bf{x}}}f({\bf{x}})$ , the key is to derive a closed-form expression for $\frac{{\partial{\lambda_{i}}\left\{{\bf{Z}}\right\}}}{{\partial{x_{n}}}}$ , $\forall i=1,...,M$ and $n=1,...,N$ .

To proceed, let us denote ${\bf{Z}}={\bf{VD}}{{\bf{V}}^{-1}}$ as the eigenvalue decomposition of the matrix ${\bf{Z}}$ , where ${\bf{V}}\in{{\mathbb{C}}^{M\times M}}$ consists of linearly independent columns with unit norm, and ${\bf{D}}={\rm{diag}}\left\{{\left[{{\lambda_{1}}\left\{{\bf{Z}}\right\},...,{\lambda_{M}}\left\{{\bf{Z}}\right\}}\right]}\right\}$ . Then, we can equivalently express ${\lambda_{i}}\left\{{\bf{Z}}\right\}$ as

\begin{split}{}{\lambda_{i}}\left\{{\bf{Z}}\right\}={\left[{{{\bf{V}}^{-1}}}\right]_{i,:}}{\bf{Z}}{\left[{\bf{V}}\right]_{:,i}}.\end{split}

(12)

Based on (12), $\frac{{\partial{\lambda_{i}}\left\{{\bf{Z}}\right\}}}{{\partial{x_{n}}}}$ can be expanded as in (13), where $\mathop{=}\limits^{(a)}$ is established since ${\bf{Z}}{\left[{\bf{V}}\right]_{:,i}}={\lambda_{i}}\left\{{\bf{Z}}\right\}{\left[{\bf{V}}\right]_{:,i}}$ and ${\left[{{{\bf{V}}^{-1}}}\right]_{i,:}}{\bf{Z}}={\lambda_{i}}\left\{{\bf{Z}}\right\}{\left[{{{\bf{V}}^{-1}}}\right]_{i,:}}$ . Then, further note that the sum of the first and third terms in (13) equals

\begin{split}{}&\frac{{\partial{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}}}{{\partial{x_{n}}}}{\lambda_{i}}\left\{{\bf{Z}}\right\}{\left[{\bf{V}}\right]_{:,i}}+{\lambda_{i}}\left\{{\bf{Z}}\right\}{\left[{{{\bf{V}}^{-1}}}\right]_{i,:}}\frac{{\partial{{\left[{\bf{V}}\right]}_{:,i}}}}{{\partial{x_{n}}}}\\ =&{\lambda_{i}}\left\{{\bf{Z}}\right\}\frac{{\partial\left[{{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}{{\left[{\bf{V}}\right]}_{:,i}}}\right]}}{{\partial{x_{n}}}}\mathop{=}\limits^{(b)}0,\end{split}

(14)

where $\mathop{=}\limits^{(b)}$ is established since ${{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}{{\left[{\bf{V}}\right]}_{:,i}}}$ always equals one (it is the $i$ -th diagonal entry of ${{\bf{V}}^{-1}}{\bf{V}}={\bf{I}}$ ) and is therefore independent of $x_{n}$ . Based on (13) and (14), $\frac{{\partial{\lambda_{i}}\left\{{\bf{Z}}\right\}}}{{\partial{x_{n}}}}$ can be simplified as

\begin{split}{}\frac{{\partial{\lambda_{i}}\left\{{\bf{Z}}\right\}}}{{\partial{x_{n}}}}=&{\rm{Re}}\left[{{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{\bf{Z}}}}{{\partial{x_{n}}}}{{\left[{\bf{V}}\right]}_{:,i}}}\right]\\ \mathop{=}\limits^{(c)}&{{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{\bf{Z}}}}{{\partial{x_{n}}}}{{\left[{\bf{V}}\right]}_{:,i}}},\end{split}

(15)

where $\mathop{=}\limits^{(c)}$ is established since ${{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{\bf{Z}}}}{{\partial{x_{n}}}}{{\left[{\bf{V}}\right]}_{:,i}}}$ is a real number. Recall that ${\bf{Z}}={{\bf{\Omega}}^{-1}}{\bf{H}}{({\bf{x}})^{H}}{\bf{H}}({\bf{x}})$ and ${{\bf{\Omega}}^{-1}}={\rm{diag}}\left\{{\left[{1/({\varepsilon_{1}}{\sigma^{2}}),1/({\varepsilon_{2}}{\sigma^{2}}),...,1/({\varepsilon_{M}}{\sigma^{2}})}\right]}\right\}$ . The element in the $i$ -th row and $j$ -th column of ${\bf{Z}}$ based on (1) can be derived as

\begin{split}{}{\left[{\bf{Z}}\right]_{i,j}}=\frac{1}{{{\varepsilon_{i}}{\sigma^{2}}}}\sum\nolimits_{k=1}^{N}{{e^{j\frac{{2\pi}}{\lambda}{x_{k}}\left({\sin{\theta_{j}}-\sin{\theta_{i}}}\right)}}},\end{split}

(16)

based on which it is easy to derive the element in the $i$ -th row and $j$ -th column of $\frac{{\partial{\bf{Z}}}}{{\partial{x_{n}}}}$ as

\begin{split}{}&{\left[{\frac{{\partial{\bf{Z}}}}{{\partial{x_{n}}}}}\right]_{i,j}}=\frac{{\partial{{\left[{\bf{Z}}\right]}_{i,j}}}}{{\partial{x_{n}}}}\\ =&\frac{1}{{{\varepsilon_{i}}{\sigma^{2}}}}\frac{{2\pi}}{\lambda}\left({\sin{\theta_{j}}-\sin{\theta_{i}}}\right){e^{j\left[{\frac{{2\pi}}{\lambda}{x_{n}}\left({\sin{\theta_{j}}-\sin{\theta_{i}}}\right)+\frac{\pi}{2}}\right]}}.\end{split}

(17)

Finally, by substituting the known ${\frac{{\partial{\bf{Z}}}}{{\partial{x_{n}}}}}$ into (15) and then substituting (15) into (11), the gradient ${\nabla_{\bf{x}}}f({\bf{x}})$ at ${{\bf{x}}^{t}}$ can be computed as in (18).
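The closed-form gradient can be sanity-checked against a finite-difference approximation. The NumPy sketch below is illustrative only: the helper names, the test point and the finite-difference step are assumptions. It builds ${\bf{Z}}$ from (16) and $\partial{\bf{Z}}/\partial{x_{n}}$ from (17), combines them through (15) and (11), and compares the result with central differences of $f({\bf{x}})$ . It assumes that ${\bf{Z}}$ is diagonalizable, which holds here because ${\bf{Z}}$ is similar to a Hermitian matrix.

```python
import numpy as np

def z_and_derivatives(x, theta, eps, sigma2=1.0, wavelength=1.0):
    """Return Z from (16) and the stack dZ[n] = dZ/dx_n from (17)."""
    x = np.asarray(x, dtype=float)
    s = np.sin(np.asarray(theta, dtype=float))
    w = 1.0 / (np.asarray(eps, dtype=float) * sigma2)   # diagonal of Omega^{-1}
    k0 = 2.0 * np.pi / wavelength
    diff = s[None, :] - s[:, None]            # diff[i, j] = sin(theta_j) - sin(theta_i)
    E = np.exp(1j * k0 * x[:, None, None] * diff[None, :, :])   # shape (N, M, M)
    Z = w[:, None] * E.sum(axis=0)
    dZ = (w[:, None] * 1j * k0 * diff)[None, :, :] * E
    return Z, dZ

def objective_f(x, theta, eps):
    Z, _ = z_and_derivatives(x, theta, eps)
    return float(np.sum(1.0 / np.linalg.eigvals(Z)).real)

def closed_form_gradient(x, theta, eps):
    """Gradient (18): sum_i -[V^{-1}]_{i,:} (dZ/dx_n) [V]_{:,i} / lambda_i^2."""
    Z, dZ = z_and_derivatives(x, theta, eps)
    lam, V = np.linalg.eig(Z)
    Vinv = np.linalg.inv(V)
    q = np.einsum("ia,nab,bi->ni", Vinv, dZ, V)   # q[n, i] as in (15)
    return np.real(-(q / lam ** 2).sum(axis=1))

def numerical_gradient(x, theta, eps, h=1e-6):
    """Central-difference approximation of the definition in (20)."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for n in range(x.size):
        e = np.zeros_like(x)
        e[n] = h
        g[n] = (objective_f(x + e, theta, eps) - objective_f(x - e, theta, eps)) / (2 * h)
    return g

# Illustrative test point (assumption).
x = np.array([0.0, 0.9, 1.7, 2.5])
theta = np.array([np.pi / 16, np.pi / 10, np.pi / 2])
eps = 2.0 ** np.array([1.0, 1.5, 2.0]) - 1.0
g_closed = closed_form_gradient(x, theta, eps)
g_numeric = numerical_gradient(x, theta, eps)
```

An independent cross-check is the matrix-calculus identity $\partial\,{\rm{tr}}\{{{\bf{Z}}^{-1}}\}/\partial{x_{n}}=-{\rm{tr}}\{{{\bf{Z}}^{-2}}\,\partial{\bf{Z}}/\partial{x_{n}}\}$ , which gives the same vector without an eigenvalue decomposition.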

Algorithm 1 BLS for a Feasible $\delta$ in the $t$ -th iteration

1: Input: ${{\bf{x}}^{t-1}}$ , $\delta>0$ , $0<\rho<1$ .
2: Repeat:
3: ${{\bf{x}}^{t}}={\cal B}\left\{{{{\bf{x}}^{t-1}}-\delta{\nabla_{{{\bf{x}}^{t}}}}f({\bf{x}}),{\cal C}}\right\}$ ;
4: If $f({{\bf{x}}^{t}})>f({{\bf{x}}^{t-1}})-\delta\left\|{{\nabla_{{{\bf{x}}^{t-1}}}}f({\bf{x}})}\right\|_{2}^{2}$ , update $\delta\leftarrow\rho\delta$ ;
5: End
6: Until: $f({{\bf{x}}^{t}})\leq f({{\bf{x}}^{t-1}})-\delta\left\|{{\nabla_{{{\bf{x}}^{t-1}}}}f({\bf{x}})}\right\|_{2}^{2}$ .

Algorithm 2 The Overall Algorithm for Solving (P2)

1: Input: $t=1$ , ${{\bf{x}}^{1}}\in{\cal C}$ .
2: Repeat:
3: Perform eigenvalue decomposition on ${\left[{\bf{Z}}\right]_{{\bf{x}}={{\bf{x}}^{t}}}}$ and compute ${\left[{\left\{{\partial{\bf{Z}}/\partial{x_{n}}}\right\}_{n=1}^{N}}\right]_{{\bf{x}}={{\bf{x}}^{t}}}}$ to obtain ${\nabla_{{{\bf{x}}^{t}}}}f({\bf{x}})$ ;
4: Determine a feasible $\delta$ based on Algorithm 1;
5: $t\leftarrow t+1$ ;
6: Update ${{\bf{x}}^{t}}={\cal B}\left\{{{{\bf{x}}^{t-1}}-\delta{\nabla_{{{\bf{x}}^{t-1}}}}f({\bf{x}}),{\cal C}}\right\}$ ;
7: End
8: Until: $\left|{f({{\bf{x}}^{t}})-f({{\bf{x}}^{t-1}})}\right|\leq\tau$ .

B. Determining the feasible step size: In the PGD method, a proper setting of the step size in each iteration is important for convergence. Specifically, the feasible $\delta$ in each iteration should satisfy $\delta\leq 1/{L_{\bf{x}}}$ , where ${L_{\bf{x}}}$ is a Lipschitz constant for ${\nabla_{\bf{x}}}f({\bf{x}})$ , which satisfies ${\left\|{{\nabla_{\bf{x}}}f({\bf{x}})-{\nabla_{{\bf{x^{\prime}}}}}f({\bf{x}})}\right\|_{2}}\leq{L_{\bf{x}}}{\left\|{{\bf{x}}-{\bf{x^{\prime}}}}\right\|_{2}}$ , $\forall{\bf{x}},{\bf{x^{\prime}}}\in{\cal C}$ [15]. Since the structure of ${{\nabla_{\bf{x}}}f({\bf{x}})}$ is rather complex, ${L_{\bf{x}}}$ is generally difficult to determine. We therefore exploit the backtracking line search (BLS) [16] to find a feasible $\delta$ . The details are shown in Algorithm 1, where $\rho$ denotes the shrinking factor.
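The backtracking idea can be sketched generically as follows. This is a minimal illustration, not the authors' implementation: the function `backtracking_step`, the decrease constant `c`, the cap `max_shrinks` and the toy quadratic objective are all assumptions. Setting `c = 1` reproduces the decrease test printed in Algorithm 1, whereas textbook backtracking [16] uses a constant below one, so the sketch exposes it as a parameter with default 0.5.

```python
import numpy as np

def backtracking_step(f, grad_f, project, x_prev, delta=1.0, rho=0.5,
                      c=0.5, max_shrinks=50):
    """Shrink delta by rho until the projected step gives sufficient decrease."""
    g = grad_f(x_prev)
    f_prev = f(x_prev)
    g_norm2 = float(np.dot(g, g))
    for _ in range(max_shrinks):
        x_new = project(x_prev - delta * g)
        # Accept when f(x_new) <= f(x_prev) - c * delta * ||grad||^2.
        if f(x_new) <= f_prev - c * delta * g_norm2:
            return delta, x_new
        delta *= rho
    raise RuntimeError("no step size satisfied the decrease test")

# Toy problem (assumption): quadratic bowl inside a box, not the objective in (8).
center = np.array([1.0, 1.0])
f = lambda x: float(np.sum((x - center) ** 2))
grad_f = lambda x: 2.0 * (x - center)
project = lambda x: np.clip(x, -5.0, 5.0)

x0 = np.array([3.0, -2.0])
delta, x1 = backtracking_step(f, grad_f, project, x0, delta=0.8, rho=0.5)
```

On this toy quadratic, the initial step 0.8 overshoots and is rejected, and the halved step 0.4 is accepted. With `c = 1` the test could not be met for any positive step on this particular quadratic, which is one reason the sketch caps the number of shrink steps.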

C. Determining the projection function ${\cal B}\left\{\cdot\right\}$ : Recall that the projection function mainly ensures that $N$ FAs only move in their respective feasible regions. Therefore, according to the rule of nearest distance, ${\cal B}\left\{{{{\bf{x}}^{t+1}},{\cal C}}\right\}$ can be determined as

\begin{split}{}&{\cal B}\left\{{{{\bf{x}}^{t+1}},{\cal C}}\right\}\triangleright x_{i}^{t+1}=\min\left({\max({F_{i}},x_{i}^{t+1}),{G_{i}}}\right).\end{split}

(19)
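Since ${\cal C}$ is a product of intervals, the projection in (19) reduces to element-wise clipping. A minimal sketch with `numpy.clip` follows; the interval bounds and the trial point are illustrative assumptions (they correspond to $N=4$, $L=3.5$ and $d_{\min}=0.5$).

```python
import numpy as np

def project_onto_C(x, F, G):
    """Projection (19): x_i <- min(max(F_i, x_i), G_i) for every antenna."""
    return np.clip(x, F, G)

# Interval bounds for N = 4, L = 3.5, d_min = 0.5 (illustrative assumption).
F = np.array([0.0, 1.0, 2.0, 3.0])
G = np.array([0.5, 1.5, 2.5, 3.5])

x_raw = np.array([-0.2, 1.2, 2.9, 3.1])      # a gradient step that left the region
x_proj = project_onto_C(x_raw, F, G)
```

Here the first and third coordinates are pulled back to the nearest interval end, giving $[0, 1.2, 2.5, 3.1]$ , while the coordinates already inside their intervals are left unchanged.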

D. The algorithm, complexity analysis and comparison: The overall procedure for solving problem (P2) is summarized in Algorithm 2, where $\tau$ denotes the prescribed accuracy. Generally, the PGD-based minimization may lead to slightly different total transmit power for different initializations ${{\bf{x}}^{1}}$ and step sizes $\delta$ . This is mainly because the PGD may converge to a local minimum of the objective, which is unavoidable in non-convex optimization problems. Nevertheless, this effect can be mitigated by randomly generating numerous different ${{\bf{x}}^{1}}$ and then selecting the one that produces the minimum power.
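The overall loop can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names, default parameters and iteration caps are hypothetical, and the acceptance test is the standard projected-gradient variant $f({{\bf{x}}^{\rm{new}}})\leq f({\bf{x}})-(c/\delta)\left\|{{\bf{x}}^{\rm{new}}}-{\bf{x}}\right\|_{2}^{2}$ , which coincides with the form used in Algorithm 1 (up to the constant $c$ ) whenever the projection is inactive.

```python
import numpy as np

def objective_and_gradient(x, s, w, k0):
    """f(x) from (8) and the closed-form gradient (18)."""
    diff = s[None, :] - s[:, None]                    # sin(theta_j) - sin(theta_i)
    E = np.exp(1j * k0 * x[:, None, None] * diff[None, :, :])
    Z = w[:, None] * E.sum(axis=0)                    # (16)
    dZ = (w[:, None] * 1j * k0 * diff)[None, :, :] * E  # (17)
    lam, V = np.linalg.eig(Z)
    Vinv = np.linalg.inv(V)
    q = np.einsum("ia,nab,bi->ni", Vinv, dZ, V)       # (15)
    f = float(np.sum(1.0 / lam).real)
    grad = np.real(-(q / lam ** 2).sum(axis=1))
    return f, grad

def pgd(x1, theta, eps, F, G, sigma2=1.0, wavelength=1.0, delta0=0.1,
        rho=0.5, c=0.5, tau=1e-6, max_outer=500, max_inner=40):
    """Projected gradient descent over the box C = prod_i [F_i, G_i]."""
    s = np.sin(np.asarray(theta, dtype=float))
    w = 1.0 / (np.asarray(eps, dtype=float) * sigma2)
    k0 = 2.0 * np.pi / wavelength
    x = np.clip(np.asarray(x1, dtype=float), F, G)
    f, grad = objective_and_gradient(x, s, w, k0)
    history = [f]
    for _ in range(max_outer):
        delta, accepted = delta0, False
        for _ in range(max_inner):                    # backtracking line search
            x_new = np.clip(x - delta * grad, F, G)   # projection (19)
            f_new, grad_new = objective_and_gradient(x_new, s, w, k0)
            if f_new <= f - (c / delta) * np.sum((x_new - x) ** 2):
                accepted = True
                break
            delta *= rho
        if not accepted:
            break                                     # no acceptable step found
        converged = abs(f - f_new) <= tau
        x, f, grad = x_new, f_new, grad_new
        history.append(f)
        if converged:
            break
    return x, history

# Illustrative setup (assumption): N = 4, L = 3.5, d_min = 0.5, r_i = 1.
L_span = 3.5
F = np.array([0.0, 1.0, 2.0, 3.0])
G = np.array([0.5, 1.5, 2.5, 3.5])
theta = np.array([np.pi / 16, np.pi / 10, np.pi / 2])
eps = np.ones(3)
x_init = np.array([0.0, L_span / 3, 2 * L_span / 3, L_span])
x_opt, history = pgd(x_init, theta, eps, F, G)
```

Wrapping `pgd` in a loop over several random starting points in ${\cal C}$ and keeping the run with the smallest final value of `history` would implement the multi-start remedy described above.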

Complexity Analysis: To simplify the analysis while still capturing the complexity of Algorithm 2, we here focus on the number of complex multiplications required in each iteration. Specifically, the complexity of the eigenvalue decomposition for ${\left[{\bf{Z}}\right]_{{\bf{x}}={{\bf{x}}^{t}}}}$ is about ${\cal O}({M^{3}})$ [9]. Further, calculating $\sum\nolimits_{i=1}^{M}{-{{\left[{{{\bf{V}}^{-1}}}\right]}_{i,:}}\frac{{\partial{\bf{Z}}}}{{\partial{x_{n}}}}{{\left[{\bf{V}}\right]}_{:,i}}/\lambda_{i}^{2}\left\{{\bf{Z}}\right\}}$ , $\forall n=1,...,N$ , requires ${\cal O}({M^{2}})$ complex multiplications, leading to the complexity of computing ${\nabla_{{{\bf{x}}^{t}}}}f({\bf{x}})$ as ${\cal O}({M^{2}}N)$ . In addition, the complexity of finding a feasible $\delta$ is about ${\cal O}({T_{{\rm{inner}}}}N)$ , where $N$ is the complexity of computing $\delta{\nabla_{{{\bf{x}}^{t}}}}f({\bf{x}})$ in step 3 of Algorithm 1, and ${T_{{\rm{inner}}}}$ is the maximum number of iterations for BLS. Hence, the total complexity of Algorithm 2 is about

\begin{split}{}{\cal O}\left({{T_{{\rm{outer}}}}\left({{M^{3}}+{M^{2}}N+{T_{{\rm{inner}}}}N}\right)}\right),\end{split}

where ${{T_{{\rm{outer}}}}}$ is the maximum number of iterations for repeatedly implementing steps 3-5 in Algorithm 2.

Complexity Comparison: As a comparison, if the original definition based method [9] is exploited to compute the gradient, i.e.,

\begin{split}{}&\frac{{\partial f({\bf{x}})}}{{\partial{x_{n}}}}{|_{{\bf{x}}={{\bf{x}}^{t}}}}=\mathop{\lim}\limits_{\varepsilon\to 0}\frac{{f(x_{1}^{t},...,x_{n}^{t}+\varepsilon,...,x_{N}^{t})-f({{\bf{x}}^{t}})}}{\varepsilon},\end{split}

(20)

the corresponding complexity will become larger. Specifically, given ${{{\bf{x}}^{t}}}$ and $\varepsilon$ , using the eigenvalue decomposition to obtain ${f(x_{1}^{t},...,x_{n}^{t}+\varepsilon,...,x_{N}^{t})}$ for all $n=1,...,N$ requires a complexity of ${\cal O}(N{M^{3}})$ . Similarly, the complexity of obtaining ${f({{\bf{x}}^{t}})}$ is ${\cal O}({M^{3}})$ . Therefore, the complexity of obtaining ${\nabla_{{{\bf{x}}^{t}}}}f({\bf{x}})$ is about ${\cal O}((N+1){M^{3}})$ , and then the total complexity of Algorithm 2 becomes

{\cal O}\left({{T_{{\rm{outer}}}}\left({{M^{3}}+{M^{3}}N+{T_{{\rm{inner}}}}N}\right)}\right),

which is clearly higher than the complexity of Algorithm 2 in this work, especially when $M$ is large. We compare the above two complexities versus $M$ in Fig. 3 for better illustration, where we set ${T_{{\rm{outer}}}}={T_{{\rm{inner}}}}=10$ and $N=30$ .
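The two order-of-magnitude expressions can be compared with simple arithmetic. The sketch below evaluates them with constants dropped, so the numbers are indicative only and are not a reproduction of Fig. 3; the function names are assumptions.

```python
def cost_closed_form(M, N, T_outer, T_inner):
    """T_outer * (M^3 + M^2 N + T_inner N): closed-form gradient."""
    return T_outer * (M ** 3 + M ** 2 * N + T_inner * N)

def cost_definition_based(M, N, T_outer, T_inner):
    """T_outer * (M^3 + M^3 N + T_inner N): definition-based gradient as in (20)."""
    return T_outer * (M ** 3 + M ** 3 * N + T_inner * N)

# Settings quoted in the text: T_outer = T_inner = 10 and N = 30.
N, T_outer, T_inner = 30, 10, 10
table = {M: (cost_closed_form(M, N, T_outer, T_inner),
             cost_definition_based(M, N, T_outer, T_inner))
         for M in (3, 10, 30)}
```

For $M=3$ the two counts are 5970 and 11370, and for $M=10$ they are 43000 and 313000, so the gap widens roughly in proportion to $M$ once the ${M^{3}}N$ term dominates.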

IV Simulation Results

In this section, we present numerical results to demonstrate the effectiveness of the proposed design over the general Rician fading, in which the channel vector between the BS and ${\rm{U}}_{i}$ is ${\widehat{\bf{h}}_{i}}({\bf{x}})=\sqrt{K/(K+1)}{{\bf{h}}_{i}}({\bf{x}})+\sqrt{1/(K+1)}{\widetilde{\bf{h}}_{i}}$ , where $K$ is the Rician factor, ${{\bf{h}}_{i}}({\bf{x}})$ is given in (1), and each element of ${\widetilde{\bf{h}}_{i}}\in{{\mathbb{C}}^{N\times 1}}$ is i.i.d. complex Gaussian distributed with zero mean and unit variance. Under this setup, optimal ${\bf{x}}$ (denoted as ${{\bf{x}}^{{\rm{LoS}}}}$ ) is still obtained based on statistical AoAs, while the objective of total transmit power becomes ${\mathbb{E}}\left[{\sum\nolimits_{i=1}^{M}{\frac{1}{{{\lambda_{i}}\left\{{{{\bf{\Omega}}^{-1}}\widehat{\bf{H}}{{({{\bf{x}}^{{\rm{LoS}}}})}^{H}}\widehat{\bf{H}}({{\bf{x}}^{{\rm{LoS}}}})}\right\}}}}}\right]$ , with $\widehat{\bf{H}}({{\bf{x}}^{{\rm{LoS}}}})=\left[{{{\widehat{\bf{h}}}_{1}}({{\bf{x}}^{{\rm{LoS}}}}),...,{{\widehat{\bf{h}}}_{M}}({{\bf{x}}^{{\rm{LoS}}}})}\right]$ . For convincing comparisons, we further consider three widely used benchmarks:

• RPA: The line segment of length $L$ is quantized into $2L+1$ discrete locations with equal spacing $0.5\lambda$ , and $N$ out of these $2L+1$ locations are optimally selected as antenna positions.
• FPA: Each antenna has a fixed position, i.e., ${x_{i}}=(i-1){d_{\min}}$ .
• Minimum mean square error (MMSE) combining: The BS exploits MMSE combining to detect multiple signals, where the positions of all antennas are optimized using the method in [9], but based on statistical AoAs.

For the system parameters, we set the minimum distance between any two adjacent FAs as ${d_{\min}}=0.5\lambda$ , and without loss of generality, $\lambda$ is normalized to 1 for simplicity. We consider $M=3$ users and the AoAs are ${\theta_{1}}=\pi/16$ , ${\theta_{2}}=\pi/10$ and ${\theta_{3}}=\pi/2$ , respectively. In addition, the noise power is set as ${\sigma^{2}}=1$ for normalizing the large-scale channel fading power.
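The Rician evaluation described at the start of this section can be sketched as a Monte Carlo average. The code below is illustrative only: the function names, the number of trials, the seed and the choice of the FPA positions as the evaluation point are assumptions, and no numerical result from the figures is reproduced.

```python
import numpy as np

def los_matrix(x, theta, wavelength=1.0):
    """LoS part H(x) as in (1)."""
    x = np.asarray(x, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.exp(1j * 2.0 * np.pi / wavelength * np.outer(x, np.sin(theta)))

def rician_matrix(x, theta, K, rng):
    """sqrt(K/(K+1)) * LoS + sqrt(1/(K+1)) * i.i.d. CN(0, 1) scattering."""
    H_los = los_matrix(x, theta)
    H_nlos = (rng.standard_normal(H_los.shape)
              + 1j * rng.standard_normal(H_los.shape)) / np.sqrt(2.0)
    return np.sqrt(K / (K + 1.0)) * H_los + np.sqrt(1.0 / (K + 1.0)) * H_nlos

def total_power(H, eps, sigma2=1.0):
    """Sum of reciprocal eigenvalues of Omega^{-1} H^H H for one realization."""
    omega_diag = np.asarray(eps, dtype=float) * sigma2
    Z = (H.conj().T @ H) / omega_diag[:, None]
    return float(np.sum(1.0 / np.linalg.eigvals(Z)).real)

def average_total_power(x, theta, eps, K, trials=200, seed=0):
    """Monte Carlo estimate of the expected total transmit power."""
    rng = np.random.default_rng(seed)
    samples = [total_power(rician_matrix(x, theta, K, rng), eps)
               for _ in range(trials)]
    return float(np.mean(samples))

# Parameters from the text: d_min = 0.5, lambda = 1, sigma^2 = 1, M = 3, r_i = 1.
theta = np.array([np.pi / 16, np.pi / 10, np.pi / 2])
eps = np.ones(3)
x_fpa = 0.5 * np.arange(5)                   # FPA positions for N = 5
p_rician = average_total_power(x_fpa, theta, eps, K=10.0)
p_los = total_power(los_matrix(x_fpa, theta), eps)
p_large_K = average_total_power(x_fpa, theta, eps, K=1e12, trials=20)
```

As $K$ grows, the scattered component vanishes and the average converges to the LoS value `p_los`. When $N-M$ is small, the per-realization power has heavy tails, so a larger number of trials is advisable before drawing conclusions from such an estimate.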

Fig. 4 first illustrates the convergence behavior of our proposed design under the LoS channels for the case of $N=4$ and ${r_{i}}=1$ , $\forall i=1,2,3$ . For each of $L=2.5,3.5,4.5$ , the initial point of the iteration is set as ${{\bf{x}}^{1}}={[0,L/3,2L/3,L]^{T}}$ . As we can observe, the total transmit power of all users rapidly converges to a constant within dozens of iterations. Therefore, the proposed design is computationally efficient, which makes it suitable for practical implementation.

Fig. 5(a) compares the total transmit power of the four schemes with respect to (w.r.t.) the number of antennas at the BS ( $N$ ) for the case of ${r_{i}}=1$ , $\forall i=1,2,3$ and $K=10$ . We can observe that: i) as $N$ increases, the BS can better distinguish signals from different directions and achieve higher reception gains, which in turn allows the users to transmit their signals with less power; ii) compared to FPA and RPA, the proposed design better exploits the additional spatial DoF, so that the resulting total transmit power is lower; iii) as $N$ increases, the performance gap between RPA and the proposed design decreases. The reason is that when $L$ is fixed, each antenna can only move in a smaller region as $N$ increases, which implies that there may not be much performance difference between the discrete position selection in RPA and the FAs’ movements in the proposed design; iv) as reported in [9], due to its more powerful detection ability, MMSE combining slightly outperforms the proposed design.

Fig. 5(b) shows the total transmit power w.r.t. the span of FAs’ movement ( $L$ ) for the case of ${r_{i}}=1$ , $\forall i=1,2,3$ and $K=10$ , from which it is observed that when $L$ increases, the total transmit power of RPA, the proposed design and MMSE combining first becomes smaller and then converges to a constant. This phenomenon reveals that it is not necessary to expand $L$ indefinitely and only a limited span is enough to achieve the optimal performance.

Finally, Fig. 5(c) shows the total transmit power w.r.t. the Rician factor $K$ for the case of ${r_{i}}=1$ , $\forall i=1,2,3$ , $N=5$ and $L=5.5$ , from which it is observed that no matter whether $K$ is large (the LoS component is dominant for each channel between the user and the BS) or small (the random Rayleigh fading is dominant for each channel between the user and the BS), our proposed design with statistical AoAs always achieves better performance than FPA and RPA, indicating that our proposed design is not sensitive to the random fading components in Rician channels.

V Conclusion

This letter considers multiuser uplink communication supported by an FAs-enabled base station, which exploits zero-forcing receivers to decode multiple signals. The objective is to optimize the FAs’ positions at the BS to minimize the total transmit power of all users subject to the minimum rate requirement. We develop a projected gradient descent method to iteratively find a locally optimal solution, at significantly reduced complexity compared to the state of the art since a closed-form gradient is derived. Results show the performance superiority of our proposed design compared to several benchmarks.

References

[1] W. Xia et al., “A deep learning framework for optimization of MISO downlink beamforming,” IEEE Trans. Commun., vol. 68, no. 3, pp. 1866-1880, Mar. 2020.
[2] Q. Wu et al., “Intelligent reflecting surface enhanced wireless network via joint active and passive beamforming,” IEEE Trans. Wireless Commun., vol. 18, no. 11, pp. 5394-5409, Nov. 2019.
[3] L. Zhu et al., “Movable antennas for wireless communication: Opportunities and challenges,” IEEE Commun. Mag., early access. DOI: 10.1109/MCOM.001.2300212.
[4] W. K. New et al., “Fluid antenna system: New insights on outage probability and diversity gain,” IEEE Trans. Wireless Commun., early access. DOI: 10.1109/TWC.2023.3276245.
[5] K. K. Wong et al., “Fluid antenna systems,” IEEE Trans. Wireless Commun., vol. 20, no. 3, pp. 1950-1962, Mar. 2021.
[6] K. K. Wong et al., “Fluid antenna multiple access,” IEEE Trans. Wireless Commun., vol. 21, no. 7, pp. 4801-4815, July 2022.
[7] W. Ma et al., “MIMO capacity characterization for movable antenna systems,” IEEE Trans. Wireless Commun., early access. DOI: 10.1109/TWC.2023.3307696.
[8] Y. Ye et al., “Fluid antenna-assisted MIMO transmission exploiting statistical CSI,” IEEE Commun. Lett., early access. DOI: 10.1109/LCOMM.2023.3336805.
[9] L. Zhu et al., “Movable-antenna enhanced multiuser communication via antenna position optimization,” arXiv:2302.06978, 2023.
[10] Z. Xiao et al., “Multiuser communications with movable-antenna base station: Joint antenna positioning, receive combining, and power control,” arXiv:2308.09512, 2023.
[11] G. Hu et al., “Secure wireless communication via movable-antenna array,” arXiv:2311.07104, 2023.
[12] L. Zhu et al., “Movable-antenna array enhanced beamforming: Achieving full array gain with null steering,” IEEE Commun. Lett., early access. DOI: 10.1109/LCOMM.2023.3323656.
[13] Y. Wu et al., “Movable antenna-enhanced multiuser communication: Optimal discrete antenna positioning and beamforming,” arXiv:2308.02304, 2023.
[14] T. Van Chien et al., “Uplink power control in massive MIMO with double scattering channels,” IEEE Trans. Wireless Commun., vol. 21, no. 3, pp. 1989-2005, Mar. 2022.
[15] A. Beck et al., “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imag. Sci., vol. 2, no. 1, pp. 183-202, 2009.
[16] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.