Beamforming Design for the Distributed RISs-aided THz Communications with Double-Layer True Time Delays

Gangcan Sun, Wencai Yan, Wanming Hao, Member, IEEE, Chongwen Huang, Member, IEEE, Chau Yuen

G. Sun, W. Yan and W. Hao are with the School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China (E-mail: [email protected], [email protected], [email protected]).
C. Huang is with the College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China (E-mail: [email protected]).
C. Yuen is with the Singapore University of Technology and Design, Singapore 487372, Singapore (E-mail: [email protected]).

Abstract

In this paper, we investigate the reconfigurable intelligent surface (RIS)-aided terahertz (THz) communication system with the sparse radio frequency chains antenna structure at the base station (BS). To overcome the beam split of the BS, different from the conventional single-layer true-time-delay (TTD) scheme, we propose a double-layer TTD scheme that can effectively reduce the number of large-range delay devices, which involve additional insertion loss and amplification circuitry. Next, we analyze the system performance under the proposed double-layer TTD scheme. To relieve the beam split of the RIS, we consider multiple distributed RISs to replace an ultra-large size RIS. Based on this, we formulate an achievable rate maximization problem for the distributed RISs-aided THz communications via jointly optimizing the hybrid analog/digital beamforming, time delays of the double-layer TTD network and reflection coefficients of RISs. Considering the practical hardware limitation, the finite-resolution phase shift, time delay and reflection phase are constrained. To solve the formulated problem, we first design an analog beamforming scheme including optimizing phase shift and time delay based on the RISs’ locations. Then, an alternating optimization algorithm is proposed to obtain the digital beamforming and reflection coefficients based on the minimum mean square error and coordinate update techniques. Finally, simulation results show the effectiveness of the proposed scheme.

Index Terms:
THz communication, double-layer TTD network, beam split, reconfigurable intelligent surface, hybrid beamforming.

I Introduction

To achieve data rates of terabits-per-second (Tb/s), future sixth-generation (6G) wireless communications are expected to exploit the terahertz (THz) band (0.1-10 THz) due to its ultra-wide bandwidth [1]-[3]. However, THz signals usually suffer from severe path loss and poor diffraction [4] [5], which leads to limited coverage. Fortunately, the massive multiple-input multiple-output (MIMO) and reconfigurable intelligent surface (RIS) techniques have been developed [6], and they can be applied to THz communications to form high-gain directional beams and improve signal coverage [7]. Furthermore, a virtual line of sight (LoS) link can be constructed via the deployment of RISs, and thus the serious blockage problem can be effectively solved [8][10]. Therefore, RIS-aided massive MIMO is promising for future THz communications.

I-A Related Works

It is well known that the base station (BS) can deploy a large number of antennas within a limited physical size because of the small wavelength of THz signals. However, the fully digital beamforming technique requires a dedicated radio frequency (RF) chain for each antenna, which leads to huge power consumption and hardware complexity [11] [12] and is infeasible in practice. To address this, the hybrid analog/digital beamforming technique is developed [13] [14], where the antennas are connected to a few RF chains via several groups of phase shifters (PSs). It has been proved that an asymptotically optimal performance can be obtained by optimizing the analog/digital beamforming in the narrowband system. However, due to the frequency-independent characteristic of PSs, the beam split will occur for the wideband THz system, leading to serious performance loss [15] [16]. Meanwhile, the RIS also faces a similar problem because of its frequency-independent reflecting elements [17][18].

Currently, there have been several works studying how to mitigate the beam split effect. One straightforward solution to combat beam split is to replace all PSs by frequency-dependent true-time-delays (TTDs) [19], but this causes huge power consumption and hardware complexity due to the use of large numbers of TTDs. Instead, a limited number of TTDs are inserted between the RF chains and PSs to solve the beam split [5][20]-[22], and thus, the traditional one-dimensional analog beamforming is converted into two-dimensional analog beamforming via the joint control of PSs and TTDs. Specifically, a novel THzPrism architecture is designed in [20], where TTDs are arranged in a serial manner. Since the number of TTDs equals the number of antennas, this architecture causes a high hardware complexity. Consequently, the authors of [5] propose two low hardware complexity schemes, including the virtual subarray beamforming scheme and the sparse TTD-based scheme. The former does not need any extra hardware, while its performance is lower than that of the latter. Similarly, a TTD-based delay-phase precoding is proposed in [21], and then the authors extend it to the user position by the beam tracking [22]. However, the TTDs in [5] [21] [22] are arranged in a parallel manner, and each TTD must be configured independently at the price of supporting a large-range delay [23] [24], especially for the large antenna array. In addition, the transceiver RF signal amplification circuit is needed to compensate the loss of the large delay line, which increases the hardware cost. In order to reduce the required delay range, a hybrid TTD architecture is proposed for fast beam training, where the time delays are separately realized by analog TTDs and digital TTDs [25]. Nevertheless, the analog phases and delays are fixed such that the number of required analog TTDs is equal to the number of antennas. To realize the high energy efficiency, a novel energy-efficient dynamic-subarray with fixed TTD architecture is developed [26]. However, the performance improvement offered by the schemes of [25] [26] is very limited. Meanwhile, all the above works assume that TTDs can provide delay with high resolution or even infinite resolution, and this is power-hungry and even infeasible in practice.

To overcome the beam split effect with low hardware complexity at the BS, we propose a double-layer TTD scheme. On the one hand, the proposed scheme can solve the maximum delay compensation problem observed in the traditional single-layer TTD scheme. Specifically, implementing a large-range delay TTD requires serious sacrifices in terms of linearity, noise, power and area, which increase the complexity of the design. Besides, increasing the delay range not only reduces the accuracy, but also deteriorates the nonlinear performance of the system due to the use of a large number of active amplifier devices [27]. Thus, minimizing the number of large-range delay TTDs used in the system would significantly reduce the hardware cost. On the other hand, based on the practical hardware limitation, discrete time delays with different resolutions at each layer of the TTD network are considered. In this way, the TTDs of each layer only need to compensate the propagation delay across the specific subarray aperture. In addition, the PSs are used to compensate for the residual phase shift of the double-layer TTD network and generate a beam toward the target’s physical direction.

Besides, to the best of our knowledge, there is no related work that jointly considers the beam split effect and the beamforming design problem in wideband THz RIS communications. Therefore, we extend the double-layer TTD scheme to the wideband THz distributed RISs communications system to cooperatively mitigate the beam split effect. In fact, the beam split effect of distributed RISs with fewer elements is less severe than that of the centralized RIS [17] [28].

I-B Main Contributions

In this paper, we investigate the antenna structure design and beamforming optimization in the RIS-aided THz communications, and the main contributions are summarized as follows.

  • To overcome the beam split of the BS with low hardware complexity, we propose a double-layer TTD scheme. Different from the conventional single-layer TTD scheme, the required number of large-range delay TTDs is sharply reduced by introducing an additional small-range delay TTD network, which effectively reduces the hardware cost. We analyze the phase compensation error and normalized array gain under the proposed double-layer TTD scheme. The results show that the proposed scheme can obtain almost the same performance as the conventional single-layer TTD scheme, but the overall hardware cost is effectively decreased.

  • Next, we extend the double-layer TTD scheme to the wideband THz distributed RISs communications system to cooperatively mitigate the beam split effect. In fact, the beam gain loss of distributed RISs with fewer elements is less severe than that of the centralized RIS. Then, we formulate an achievable rate maximization problem via jointly optimizing the hybrid analog/digital beamforming, time delays of the double-layer TTD network and reflection coefficients of RISs. Meanwhile, the finite-resolution phase shifter, time delay and reflection phase are considered based on the practical hardware.

  • Since the formulated problem is NP-hard, it is difficult to solve directly. We first design the analog beamforming based on the RISs’ locations and the phase compensation principle. Next, we still need to jointly solve the digital beamforming of the BS and the reflection coefficients of the RISs. We propose an alternating optimization algorithm to deal with it. Specifically, the reflection coefficients are fixed, and we obtain the digital beamforming based on the minimum mean square error (MMSE) technique. After that, the reflection coefficients are solved via the coordinate update approach. The final solutions are obtained by repeating the above procedure until convergence. Furthermore, the robustness of the proposed joint wideband beamforming against the impacts of imperfect CSI is analyzed.

Notations: Lower-case and upper-case boldface letters represent vectors and matrices, respectively. $(\cdot)^{T}$ and $(\cdot)^{H}$ denote the transpose and Hermitian transpose, respectively. $|\cdot|$ denotes the absolute value. $\left\|\cdot\right\|$ is the Frobenius norm. $\lfloor\cdot\rfloor$ and $\lceil\cdot\rceil$ are the floor and ceiling functions, respectively. ${\rm diag}(\cdot)$ represents the diagonal operation. $\mathbb{C}^{x\times y}$ denotes the space of $x\times y$ complex matrices. $\Re\{\cdot\}$ denotes the real part of a complex number. $\mathcal{C}\mathcal{N}\left(A,B\right)$ represents the complex Gaussian distribution with mean $A$ and covariance $B$.

II Basic System Model

We consider a distributed RISs-aided THz communication system, as shown in Fig. 1. Owing to their ultra-high frequency, THz signals diffract poorly and are vulnerable to obstruction [29] [30]. Thus, the direct links between the BS and users are assumed to be blocked by buildings. The BS consists of an $N$-antenna uniform linear array (ULA) and $N_{\rm RF}$ ($N\geq N_{\rm RF}$) RF chains to serve $K$ single-antenna users. Let $\mathcal{R}=\{1,\cdots,R\}$ denote the index set of RISs. We assume that all RISs have the same size with $N_{\rm RIS}=M_{x}\times M_{y}$ elements, where $M_{x}$ and $M_{y}$ represent the number of rows and columns, respectively. Let $\mathcal{M}_{x}=\{1,\cdots,M_{x}\}$ and $\mathcal{M}_{y}=\{1,\cdots,M_{y}\}$ denote the index sets of elements in rows and columns. The orthogonal frequency division multiplexing technique with $M$ subcarriers in total is applied to realize reliable wideband transmission. The frequency of the $m$-th subcarrier can be expressed as $f_{m}=f_{c}+\frac{B}{M}\left(m-1-\frac{M-1}{2}\right)$, $m=1,2,\cdots,M$, where $f_{c}$ and $B$ are the central frequency and bandwidth, respectively.
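
As a small numerical illustration of the subcarrier grid defined above, the following Python sketch evaluates $f_m$ for every subcarrier. The function name `subcarrier_frequencies` and the value $M=128$ are assumptions made only for this example; $f_c=300$ GHz and $B=30$ GHz match the values used for the later figures.

```python
import numpy as np

def subcarrier_frequencies(f_c, bandwidth, num_subcarriers):
    """Return f_m = f_c + (B/M) * (m - 1 - (M - 1)/2) for m = 1, ..., M."""
    m = np.arange(1, num_subcarriers + 1)
    return f_c + (bandwidth / num_subcarriers) * (m - 1 - (num_subcarriers - 1) / 2)

# Hypothetical example: M = 128 is an assumed value, not taken from the paper.
freqs = subcarrier_frequencies(f_c=300e9, bandwidth=30e9, num_subcarriers=128)
print(freqs[0] / 1e9, freqs[-1] / 1e9)  # lowest and highest subcarrier in GHz
```

The grid is symmetric around $f_c$, so the relative frequency $f_m/f_c$ deviates from one by at most roughly $B/(2f_c)$.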

Fig. 1: The distributed RISs-aided THz system.

Thus, the equivalent channel $\mathbf{h}_{m,k}$ between the BS and the $k$-th user on the $m$-th subcarrier can be expressed as

$$\mathbf{h}_{m,k}=\sum_{r=1}^{R}\mathbf{f}_{r,m,k}\mathbf{\Phi}_{r}\mathbf{G}_{r,m}, \tag{1}$$

where $\mathbf{f}_{r,m,k}\in\mathbb{C}^{1\times N_{\rm RIS}}$ denotes the channel vector between the $r$-th RIS and the $k$-th user on the $m$-th subcarrier, and $\mathbf{G}_{r,m}\in\mathbb{C}^{N_{\rm RIS}\times N}$ represents the channel matrix from the BS to the $r$-th RIS on the $m$-th subcarrier. $\mathbf{\Phi}_{r}=\operatorname{diag}\left(\varphi_{r,1,1},\cdots,\varphi_{r,m_{x},m_{y}},\cdots,\varphi_{r,M_{x},M_{y}}\right)$, $r\in\mathcal{R}$, $m_{x}\in\mathcal{M}_{x}$, $m_{y}\in\mathcal{M}_{y}$, is the diagonal reflection coefficient matrix of the $r$-th RIS with $\varphi_{r,m_{x},m_{y}}=\varepsilon_{r,m_{x},m_{y}}e^{j\phi_{r,m_{x},m_{y}}}$. To maximize the reflection efficiency, we set $\varepsilon_{r,m_{x},m_{y}}=1$ for $r\in\mathcal{R}$, $m_{x}\in\mathcal{M}_{x}$, $m_{y}\in\mathcal{M}_{y}$. We apply the Saleh-Valenzuela channel model [31], and thus the channel matrix $\mathbf{G}_{r,m}$ can be expressed as

$$\mathbf{G}_{r,m}=\sum_{l_{1}=1}^{L_{1}}\alpha_{l_{1}}^{r}e^{-j2\pi\tau_{l_{1}}^{r}f_{m}}\mathbf{b}\left(u_{l_{1}}^{r},v_{l_{1}}^{r}\right)\mathbf{a}\left(\theta_{l_{1}}^{r}\right)^{H}, \tag{2}$$

where $L_{1}$ represents the number of paths, and $\alpha_{l_{1}}^{r}$ and $\tau_{l_{1}}^{r}$ respectively denote the gain and delay of the $l_{1}$-th path at the $r$-th RIS. $\mathbf{a}\left(\theta_{l_{1}}^{r}\right)$ and $\mathbf{b}\left(u_{l_{1}}^{r},v_{l_{1}}^{r}\right)$ respectively denote the array response vectors at the BS and RIS, which can be written as

$$\mathbf{a}\left(\theta_{l_{1}}^{r}\right)=\frac{1}{\sqrt{N}}\left[1,\ldots,e^{j2\pi d\frac{f_{m}}{c}n\sin\theta_{l_{1}}^{r}},\ldots,e^{j2\pi d\frac{f_{m}}{c}\left(N-1\right)\sin\theta_{l_{1}}^{r}}\right]^{T}, \tag{3}$$

and

$$\begin{split}\mathbf{b}\left(u_{l_{1}}^{r},v_{l_{1}}^{r}\right)&=\frac{1}{\sqrt{N_{\rm RIS}}}\left[1,\ldots,e^{j2\pi d\frac{f_{m}}{c}(m_{x}\cos u_{l_{1}}^{r}\sin v_{l_{1}}^{r}+m_{y}\cos v_{l_{1}}^{r})},\right.\\ &\left.\ldots,e^{j2\pi d\frac{f_{m}}{c}((M_{x}-1)\cos u_{l_{1}}^{r}\sin v_{l_{1}}^{r}+(M_{y}-1)\cos v_{l_{1}}^{r})}\right]^{T},\end{split} \tag{4}$$

where $c$ and $d$ are the speed of light and the distance between two consecutive antennas, respectively. We set $d=\lambda_{c}/2$, where $\lambda_{c}$ is the wavelength of the central frequency $f_{c}$. $\theta_{l_{1}}^{r}\in[-\pi/2,\pi/2]$ is the physical direction of the $l_{1}$-th path departing from the BS to the $r$-th RIS. $u_{l_{1}}^{r}$ and $v_{l_{1}}^{r}\in[-\pi/2,\pi/2]$ represent the azimuth and elevation angles of arrival (AoAs) of the $l_{1}$-th path at the $r$-th RIS, respectively.

Next, the channel vector from the $r$-th RIS to the $k$-th user on the $m$-th subcarrier is denoted as

$$\mathbf{f}_{r,m,k}=\sum_{l_{2}=1}^{L_{2}}\alpha_{l_{2}}^{r,k}e^{-j2\pi\tau_{l_{2}}^{r,k}f_{m}}\mathbf{b}\left(u_{l_{2}}^{r,k},v_{l_{2}}^{r,k}\right), \tag{5}$$

where $L_{2}$ represents the number of paths, $\alpha_{l_{2}}^{r,k}$ and $\tau_{l_{2}}^{r,k}$ respectively denote the gain and delay of the $l_{2}$-th path from the $r$-th RIS to the $k$-th user, and $\mathbf{b}\left(u_{l_{2}}^{r,k},v_{l_{2}}^{r,k}\right)$ denotes the transmit array response vector at the RIS, namely

$$\begin{split}\mathbf{b}\left(u_{l_{2}}^{r,k},v_{l_{2}}^{r,k}\right)&=\frac{1}{\sqrt{N_{\rm RIS}}}\left[1,\ldots,e^{j2\pi d\frac{f_{m}}{c}(m_{x}\cos u_{l_{2}}^{r,k}\sin v_{l_{2}}^{r,k}+m_{y}\cos v_{l_{2}}^{r,k})},\right.\\ &\left.\ldots,e^{j2\pi d\frac{f_{m}}{c}((M_{x}-1)\cos u_{l_{2}}^{r,k}\sin v_{l_{2}}^{r,k}+(M_{y}-1)\cos v_{l_{2}}^{r,k})}\right]^{T},\end{split} \tag{6}$$

where $u_{l_{2}}^{r,k}$ and $v_{l_{2}}^{r,k}\in[-\pi/2,\pi/2]$ represent the azimuth and elevation angles of departure (AoDs) of the $l_{2}$-th path from the $r$-th RIS to the $k$-th user, respectively.

The above is the basic system model for the distributed RISs-aided THz communications. Next, we first study how to mitigate the beam split effect by designing a more practical antenna structure at the BS, and then investigate the joint beamforming optimization problem.

III Double-layer TTD Scheme and Performance Analysis

To overcome the beam split effect, existing works mainly consider the single-layer TTD scheme. However, it requires each time delay line to provide a large-range delay, especially for a large antenna array, and TTD with large-range delay usually has high power consumption, insertion loss, and hardware complexity, which is impractical for TTD circuits [32]. Therefore, to mitigate the beam split effect with low power consumption and hardware cost, we propose a double-layer TTD scheme at the BS.

Fig. 2: Three different antenna schemes: (a) PS scheme, (b) Single-layer TTD scheme, (c) Double-layer TTD scheme.

III-A Phase Compensation Principle

We first introduce the phase compensation principle. For convenience, we assume that there is only one RF chain connecting to $N$ antennas via PSs at the BS as shown in Fig. 2(a), and one single-antenna user is served. Although there are a few scattering components in THz communications, their power is much lower than that of the LoS component [33]. Therefore, we only consider the LoS component here, and thus the channel vector $\mathbf{\bar{h}}_{m}\in\mathbb{C}^{1\times N}$ of the BS-user link on the $m$-th subcarrier can be expressed as

$$\mathbf{\bar{h}}_{m}=\alpha e^{-j2\pi\tau f_{m}}\mathbf{a}\left(\theta_{0}\right)^{H}, \tag{7}$$

where $\alpha$ and $\tau$ respectively denote the gain and delay, $\theta_{0}\in[-\pi/2,\pi/2]$ is the AoD, and $\mathbf{a}\left(\theta_{0}\right)$ denotes the array response vector at the BS, namely

$$\mathbf{a}\left(\theta_{0}\right)=\frac{1}{\sqrt{N}}\left[1,\ldots,e^{j2\pi d\frac{f_{m}}{c}n\sin\theta_{0}},\ldots,e^{j2\pi d\frac{f_{m}}{c}\left(N-1\right)\sin\theta_{0}}\right]^{T}. \tag{8}$$

To form a beam at direction $\theta_{0}$, the phase $\Psi_{c}$ excited by the $n$-th PS at center frequency $f_{c}$ should be [34]

$$\Psi_{c}=\frac{2\pi f_{c}}{c}(n-1)d\sin\theta_{0}=\omega_{c}\tau_{n},\quad n=1,2,\ldots,N, \tag{9}$$

where

$$\tau_{n}=\frac{(n-1)d}{c}\sin\theta_{0}=(n-1)T_{d}\sin\theta_{0}, \tag{10}$$

is the propagation delay between the first and $n$-th antenna, $\omega_{c}$ is the angular frequency corresponding to the center frequency $f_{c}$, and $T_{d}=\frac{d}{c}$ is the delay between two consecutive antennas. Thus, the analog beam $\mathbf{f}_{\rm ps}$ under the PS scheme can be written as

$$\mathbf{f}_{\rm ps}=\frac{1}{\sqrt{N}}\left[1,\ldots,e^{j2\pi f_{c}nT_{d}\sin\theta_{0}},\ldots,e^{j2\pi f_{c}\left(N-1\right)T_{d}\sin\theta_{0}}\right]^{T}. \tag{11}$$

Wrapping the phase shift to $[0,2\pi]$, we obtain

$$\Psi_{c}^{\prime}=\Psi_{c}-2\pi\,{\rm TRUNC}\left(\frac{\Psi_{c}}{2\pi}\right), \tag{12}$$

where ${\rm TRUNC}$ is a function that represents the integral part of its argument. Due to the frequency-independent property of PSs, the phase adjusted by each PS is common to all frequencies. But when the frequency varies from $f_{c}$ to $f_{m}$, the ideal phase should be

$$\Psi_{p}=2\pi f_{m}(n-1)T_{d}\sin\theta_{0}. \tag{13}$$

Thus, there is a phase difference between the ideal phase and the practical phase, which can be expressed as

$$\Delta\Psi=2\pi(f_{m}-f_{c})(n-1)T_{d}\sin\theta_{0}. \tag{14}$$

The phase difference at $f_{m}$ leads to the beam direction moving to $\theta_{0}^{\prime}$, namely

$$\theta_{0}^{\prime}=\arcsin\left(\frac{f_{c}}{f_{m}}\sin\theta_{0}\right). \tag{15}$$
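
To make the beam split concrete, the following Python sketch evaluates the squinted direction in (15) and the per-antenna phase difference in (14) at the band edges. The function names are hypothetical, the half-wavelength spacing gives $T_d=1/(2f_c)$, and the values $f_c=300$ GHz, $B=30$ GHz and $\theta_0=\pi/4$ are the ones used for the later figures. The sketch assumes $\left|\frac{f_c}{f_m}\sin\theta_0\right|\le 1$; otherwise `np.arcsin` returns NaN.

```python
import numpy as np

def squinted_direction(theta0, f_c, f_m):
    """Direction (rad) in which a PS-only beam designed for theta0 at f_c points at f_m, per (15)."""
    return np.arcsin((f_c / f_m) * np.sin(theta0))

def phase_difference(n, theta0, f_c, f_m, t_d):
    """Phase mismatch (rad) of the n-th antenna (n starts at 1) at f_m, per (14)."""
    return 2 * np.pi * (f_m - f_c) * (n - 1) * t_d * np.sin(theta0)

f_c, B, theta0 = 300e9, 30e9, np.pi / 4
t_d = 1 / (2 * f_c)  # T_d = d / c with d = lambda_c / 2

for f_m in (f_c - B / 2, f_c, f_c + B / 2):
    angle_deg = np.degrees(squinted_direction(theta0, f_c, f_m))
    print(f"f_m = {f_m / 1e9:.0f} GHz -> beam direction {angle_deg:.2f} deg")
```

Frequencies above $f_c$ pull the beam toward broadside and frequencies below $f_c$ push it away, while the phase mismatch grows linearly with the antenna index.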

It is observed that the phase difference increases with the bandwidth and the number of antennas. Since the beam split essentially results from the propagation delay across the antenna array aperture, a reasonable approach to tackle this issue is to compensate the propagation delay. Consequently, the TTD is introduced to thoroughly eliminate the beam split. Fig. 2 (b) shows a typical TTD antenna structure, where $N$ antennas are uniformly divided into $U$ subarrays and each one includes $S=N/U$ antennas. Meanwhile, these antennas are connected to one TTD via $S$ PSs at each subarray, and then all TTDs are connected to the RF chain. Next, as shown in Fig. 3, we define $\tau_{u}$ as the signal transmission time delay difference from the first antenna of the first subarray to the first antenna of the $u$-th subarray, namely

$$\tau_{u}=(u-1)ST_{d}\sin\theta_{0}, \tag{16}$$

where $u=1,2,\ldots,U$. From (16), we have $\tau_{u}\in[0,(U-1)ST_{d}]$ with $\theta_{0}\in[-\pi/2,\pi/2]$. For the single-layer TTD scheme, the transmission signal phase at each antenna is jointly controlled by the frequency-dependent TTD and frequency-independent PS [35]. Therefore, the transmission signal phase $\Psi_{u,s}$ of the $s$-th element in the $u$-th subarray at the frequency $f_{m}$ can be written as

$$\Psi_{u,s}=2\pi f_{m}\left(u-1\right)ST_{d}\sin\theta_{0}+2\pi f_{c}(s-1)T_{d}\sin\theta_{0}, \tag{17}$$

where $s=1,2,\ldots,S$. The corresponding analog beamforming $\mathbf{f}_{\rm ttd}$ can be expressed as

$$\mathbf{f}_{\rm ttd}=\frac{1}{\sqrt{N}}\left[1,\ldots,e^{j\Psi_{u,s}},\ldots,e^{j\Psi_{U,S}}\right]^{T}. \tag{18}$$

To achieve unbiased beam synthesis, the ideal phase at frequency $f_{m}$ should be

$$\hat{\Psi}_{u,s}=2\pi f_{m}[\left(u-1\right)S+(s-1)]T_{d}\sin\theta_{0}. \tag{19}$$
Fig. 3: Configuration of the typical single-layer TTD scheme.

Thus, there exists a phase error under the single-layer TTD scheme, namely

$$\Delta\Psi_{u,s}=2\pi(f_{m}-f_{c})(s-1)T_{d}\sin\theta_{0}. \tag{20}$$

One can observe that the phase error is related to the subarray aperture and that it increases with the number of elements in each subarray. Therefore, to reduce the phase error and the beam split effect, we can increase the number of subarrays (i.e., the number of TTDs), but this will cause a high hardware cost.

Next, to clearly understand the hardware implementation of the TTD, we present an 8-bit TTD with a minimum delay of 1 ps [35] as an example. In order to construct a complete 8-bit TTD, 8 TTD elements should be cascaded for a total delay of 255 ps, with the least significant bit (LSB) equal to 1 ps and the most significant bit (MSB) equal to 128 ps, as shown in the circuit block diagram of Fig. 4. Each discrete unit is realized by cascaded time delay units, reference units, and input and output single-pole-double-throw (SPDT) switches [26] [36], which provide different levels of time delay and time delay selection, respectively. Due to the cascaded structure, the power consumption, insertion loss, and hardware complexity of the TTD are the sums of those of the time delay units and switches. In addition, the insertion loss of the time delay increases with frequency [37]. Thus, in the THz band, using fewer bits reduces the number of time delay units and switches, which further decreases the power consumption, insertion loss, and hardware complexity.

Fig. 4: Circuit block diagram of the 8-bit TTD.
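
The binary-weighted structure of the 8-bit TTD described above can be sketched in Python as follows. This is only an illustrative model of the delay selection, not a description of the actual circuit in [35]: the function name `ttd_bit_settings`, the nearest-code rounding, and the clipping of out-of-range requests are assumptions made here.

```python
def ttd_bit_settings(target_delay_ps, num_bits=8, lsb_ps=1.0):
    """Pick the switch states of a binary-weighted TTD for a requested delay.

    Returns (bits, realized_delay_ps), where bits[0] is the LSB stage (lsb_ps)
    and bits[-1] is the MSB stage (lsb_ps * 2**(num_bits - 1)).
    """
    max_code = 2 ** num_bits - 1                   # 255 for 8 bits
    code = int(round(target_delay_ps / lsb_ps))    # nearest code (assumed rounding rule)
    code = max(0, min(max_code, code))             # clip to the realizable range
    bits = [(code >> b) & 1 for b in range(num_bits)]
    return bits, code * lsb_ps

bits, realized = ttd_bit_settings(130.0)
print(bits, realized)  # stages of 2 ps and 128 ps are switched in
```

Every extra bit adds one more delay stage and one more pair of switches to the cascade, which is why a smaller delay range translates directly into lower insertion loss and power consumption.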

III-B Proposed Double-layer TTD Scheme

Fig. 2 (c) shows our proposed double-layer TTD scheme, where the second layer includes $K_{\rm H}$ TTDs and each TTD is connected to $K_{\rm L}$ TTDs of the first layer. Meanwhile, each TTD of the first layer is connected to $P$ antennas via PSs. We define $\tau_{k_{h},k_{l}}$ as the signal transmission time delay difference from the first antenna of the first subarray to the first antenna of the $k_{l}$-th subarray related to the $k_{h}$-th TTD of the second layer, namely

$$\tau_{k_{h},k_{l}}=\left(k_{l}-1\right)PT_{d}\sin\theta_{0}, \tag{21}$$

where $k_{l}=1,2,\ldots,K_{\rm L}$ and $k_{h}=1,2,\ldots,K_{\rm H}$. Next, we define $\tau_{k_{h}}$ as the signal transmission time delay difference from the first antenna connecting to the first TTD of the second layer to the first antenna connecting to the $k_{h}$-th TTD of the second layer, namely

$$\tau_{k_{h}}=\left(k_{h}-1\right)K_{\rm L}PT_{d}\sin\theta_{0}. \tag{22}$$

From (21) and (22), we have $\tau_{k_{h},k_{l}}\in[0,(K_{\rm L}-1)PT_{d}]$ and $\tau_{k_{h}}\in[0,(K_{\rm H}-1)K_{\rm L}PT_{d}]$ with $\theta_{0}\in[-\pi/2,\pi/2]$, $1\leq k_{l}\leq K_{\rm L}$ and $1\leq k_{h}\leq K_{\rm H}$. Therefore, for the double-layer TTD scheme, the time delay range of the TTD at the first layer is always smaller than that of the TTD at the second layer. The phase of each antenna is controlled by the double-layer network and PSs. The corresponding phase value $\Psi_{h,l,p}$ of the $p$-th element in the $k_{l}$-th TTD under the $k_{h}$-th subarray at the frequency $f_{m}$ is

$$\begin{split}\Psi_{h,l,p}&=2\pi f_{m}\left(k_{h}-1\right)K_{\rm L}PT_{d}\sin\theta_{0}\\ &\quad+2\pi f_{m}\left(k_{l}-1\right)PT_{d}\sin\theta_{0}+2\pi f_{c}(p-1)T_{d}\sin\theta_{0},\end{split} \tag{23}$$

where $p=1,2,\ldots,P$. The phase error of each element compared to the ideal phase is

$$\Delta\Psi_{h,l,p}=2\pi(f_{m}-f_{c})(p-1)T_{d}\sin\theta_{0}. \tag{24}$$

The corresponding analog beamforming $\mathbf{f}_{\rm mttd}$ excited by the double-layer TTD scheme can be expressed as

$$\mathbf{f}_{\rm mttd}=\frac{1}{\sqrt{N}}\left[1,\ldots,e^{j\Psi_{h,l,p}},\ldots,e^{j\Psi_{H,L,P}}\right]^{T}. \tag{25}$$

Next, we derive the normalized array gains under the three different schemes. For the target direction $\theta_{0}$ on the frequency $f_{m}$, the array gain under the traditional PS scheme can be calculated as

$$\begin{aligned}g_{\rm ps}\left(f_{m},\theta_{0}\right)&=\left|\mathbf{a}^{\rm H}\mathbf{f}_{\rm ps}\right|\\ &=\frac{1}{N}\left|\sum_{n=1}^{N}e^{-j2\pi d\frac{f_{m}}{c}(n-1)\sin\theta_{0}}e^{j2\pi d\frac{f_{c}}{c}(n-1)\sin\theta_{0}}\right|\\ &=\frac{1}{N}\left|\Xi_{N}\left(\left(\zeta_{m}-1\right)\sin\theta_{0}\right)\right|,\end{aligned} \tag{26}$$

where $\zeta_{m}=\frac{f_{m}}{f_{c}}$ denotes the relative frequency, and $\Xi_{N}\left(x\right)=\frac{\sin\left(\frac{\pi N}{2}x\right)}{\sin\left(\frac{\pi}{2}x\right)}$ is the Dirichlet sinc function [38] [39]. Next, the array gain under the single-layer TTD scheme is

$$\begin{aligned}g_{\rm ttd}\left(f_{m},\theta_{0}\right)&=\left|\mathbf{a}^{\rm H}\mathbf{f}_{\rm ttd}\right|\\ &=\frac{1}{N}\left|\sum_{u=1}^{U}\sum_{s=1}^{S}e^{j2\pi d\frac{f_{c}}{c}(s-1)\sin\theta_{0}}e^{j\frac{2\pi f_{m}}{c}\left(u-1\right)Sd\sin\theta_{0}}\right.\\ &\quad\left.e^{-j2\pi d\frac{f_{m}}{c}[(u-1)S+(s-1)]\sin\theta_{0}}\right|\\ &=\frac{U}{N}\left|\Xi_{S}\left(\left(\zeta_{m}-1\right)\sin\theta_{0}\right)\right|.\end{aligned} \tag{27}$$

Finally, the array gain under the double-layer TTD scheme can be expressed as

$$\begin{aligned}g_{\rm mttd}\left(f_{m},\theta_{0}\right)&=\left|\mathbf{a}^{\rm H}\mathbf{f}_{\rm mttd}\right|\\ &=\frac{1}{N}\left|\sum_{k_{h}=1}^{K_{\rm H}}\sum_{k_{l}=1}^{K_{\rm L}}\sum_{p=1}^{P}e^{j\pi\frac{f_{m}}{f_{c}}(k_{h}-1)K_{\rm L}P\sin\theta_{0}}e^{j\pi\frac{f_{m}}{f_{c}}(k_{l}-1)P\sin\theta_{0}}\right.\\ &\quad\left.e^{j\pi(p-1)\sin\theta_{0}}e^{-j\pi\frac{f_{m}}{f_{c}}[(k_{h}-1)K_{\rm L}P+(k_{l}-1)P+(p-1)]\sin\theta_{0}}\right|\\ &=\frac{K_{\rm H}K_{\rm L}}{N}\left|\sum_{p=1}^{P}e^{-j\pi(p-1)\left(\frac{f_{m}}{f_{c}}-1\right)\sin\theta_{0}}\right|\\ &=\frac{K_{\rm H}K_{\rm L}}{N}\left|\Xi_{P}\left(\left(\zeta_{m}-1\right)\sin\theta_{0}\right)\right|.\end{aligned} \tag{28}$$

From (26)-(28), one can observe that the difference among the array gains under the three schemes is mainly due to the different numbers of elements in each subarray. In the proposed double-layer TTD scheme, by introducing an additional small-delay TTD network, the required number of large-range delay TTDs can be effectively reduced. Fig. 5 shows the phase compensation of each antenna under different schemes, and we set $f_{c}=300$ GHz, $B=30$ GHz, $\theta_{0}=\pi/4$ and $N=128$.

Fig. 5: Phase compensation of each antenna.
Fig. 6: Phase error of each antenna.

In the single-layer TTD scheme, we set the number of TTDs to $U=32$. In the double-layer TTD scheme, we consider two configurations, i.e., $K_{\rm H}=8,K_{\rm L}=4$ and $K_{\rm H}=8,K_{\rm L}=2$. Without considering the quantization error of the TTD device, we can find that the phase compensation under the single- and double-layer TTD schemes is almost the same for $K_{\rm H}=8,K_{\rm L}=4$. However, the required number of large-range delay TTDs in the double-layer TTD scheme is sharply reduced. When $K_{\rm H}=8,K_{\rm L}=2$, the performance is slightly lower, but the total number of TTDs is further reduced. Fig. 6 depicts the phase error of each antenna under different schemes. One can observe that the phase error under the single- and double-layer TTD schemes is periodic and that the period becomes shorter when the number of subarray antennas is smaller. In contrast, the phase error under the PS scheme increases linearly with the antenna index.

Fig. 7: Normalized array gain.

Fig. 7 illustrates the normalized array gain of each subcarrier under different schemes. It can be observed that the single- and double-layer TTD schemes maintain a high gain across almost the entire bandwidth. Although the gain under the double-layer TTD scheme with $K_{\rm H}=8$ and $K_{\rm L}=2$ is a little lower, the number of large-range delay TTDs is smaller. In contrast, the gain loss under the PS scheme is the largest, which seriously affects the system performance.
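
The closed-form gains in (26)-(28) can be checked numerically. The Python sketch below compares each closed form with a direct evaluation of $\left|\mathbf{a}^{\rm H}\mathbf{f}\right|$ at the upper band edge, assuming ideal (unquantized) delays and $d=\lambda_c/2$. The function names are hypothetical, and using the band edge $f_c+B/2$ as the test frequency is a choice made here for illustration.

```python
import numpy as np

def dirichlet(n, x):
    """Xi_n(x) = sin(pi*n*x/2) / sin(pi*x/2); returns n where the denominator vanishes near x = 0."""
    x = np.asarray(x, dtype=float)
    num = np.sin(np.pi * n * x / 2)
    den = np.sin(np.pi * x / 2)
    safe = np.abs(den) > 1e-12
    return np.where(safe, num / np.where(safe, den, 1.0), float(n))

def closed_form_gain(num_groups, elems_per_group, zeta, theta0):
    """(26)-(28): num_groups TTD-compensated subarrays with elems_per_group PS-only elements each."""
    n_total = num_groups * elems_per_group
    x = (zeta - 1) * np.sin(theta0)
    return float(num_groups / n_total * np.abs(dirichlet(elems_per_group, x)))

def direct_gain(num_groups, elems_per_group, zeta, theta0):
    """|a^H f| computed element by element (zeta = f_m / f_c)."""
    n_total = num_groups * elems_per_group
    idx = np.arange(n_total)
    group_start = (idx // elems_per_group) * elems_per_group  # delay-compensated part
    within = idx - group_start                                # PS-compensated part
    s = np.sin(theta0)
    a = np.exp(1j * np.pi * zeta * idx * s) / np.sqrt(n_total)
    f = np.exp(1j * np.pi * (zeta * group_start + within) * s) / np.sqrt(n_total)
    return float(np.abs(np.vdot(a, f)))  # vdot conjugates its first argument

N, f_c, B, theta0 = 128, 300e9, 30e9, np.pi / 4
zeta_edge = (f_c + B / 2) / f_c
configs = {
    "PS": 1,
    "single-layer TTD (U=32)": 32,
    "double-layer TTD (KH=8, KL=4)": 8 * 4,
    "double-layer TTD (KH=8, KL=2)": 8 * 2,
}
gains = {}
for label, groups in configs.items():
    per_group = N // groups
    gains[label] = (closed_form_gain(groups, per_group, zeta_edge, theta0),
                    direct_gain(groups, per_group, zeta_edge, theta0))
    print(f"{label}: closed form {gains[label][0]:.4f}, direct {gains[label][1]:.4f}")
```

With ideal delays, the double-layer scheme with $K_{\rm H}K_{\rm L}$ first-layer TTDs behaves like a single-layer scheme with $U=K_{\rm H}K_{\rm L}$ subarrays, so only the number of elements per subarray determines the gain.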

Furthermore, considering the practical hardware limitation, the TTD can only realize a limited number of discrete time delays. Therefore, we assume that the TTDs of the first layer and second layer have $2^{P_{\rm L}}$ and $2^{P_{\rm H}}$ discrete values, respectively, where $P_{\rm L}$ and $P_{\rm H}$ denote the numbers of bits. Thus, the sets of discrete time delays $\tau_{k_{h},k_{l}}$ and $\tau_{k_{h}}$ can be given by

$$\tau_{k_{h},k_{l}}\in\mathcal{T}_{1}=\{0,D,2D,\cdots,(2^{P_{\rm L}}-1)D\}, \tag{29}$$
$$\tau_{k_{h}}\in\mathcal{T}_{2}=\{0,D,2D,\cdots,(2^{P_{\rm H}}-1)D\}, \tag{30}$$

where $D$ is the time delay step.

Without loss of generality, we first consider the time delay step $D=T_c$. To ensure that the maximum time delay interval between $P$ array elements is within $T_c$ and each array element can obtain the required optimal time delay, $P$, $P_{\rm L}$ and $P_{\rm H}$ should satisfy the following conditions [40]

$$\tau_{n+P-1}(\theta_0)-\tau_n(\theta_0)<T_c, \tag{31a}$$
$$\tau_{n+P(K_{\rm L}-1)}(\theta_0)-\tau_n(\theta_0)<(2^{P_{\rm L}}-1)T_c, \tag{31b}$$
$$\tau_{N-PK_{\rm L}}(\theta_0)-\tau_0(\theta_0)<(2^{P_{\rm H}}-1)T_c. \tag{31c}$$

Then, the admissible number of array elements $P$ per subarray and the required number of bits at each layer are calculated as

$$P\leq\left\lfloor\frac{T_c}{T_d\sin\theta_0}\right\rfloor, \tag{32a}$$
$$P_{\rm L}\geq\left\lceil\log_2\frac{(K_{\rm L}-1)PT_d\sin\theta_0}{T_c}\right\rceil, \tag{32b}$$
$$P_{\rm H}\geq\left\lceil\log_2\frac{(N-PK_{\rm L})T_d\sin\theta_0}{T_c}\right\rceil, \tag{32c}$$

where $\lfloor x\rfloor$ and $\lceil x\rceil$ are the floor and ceiling functions, respectively. If $P_{\rm L}=0$ and $K_{\rm L}=1$, the double-layer TTD scheme reduces to the single-layer TTD scheme. The required number of bits $P_s$ for the single-layer TTD should satisfy

$$P_s\geq\left\lceil\log_2\frac{(U-1)ST_d\sin\theta_0}{T_c}\right\rceil. \tag{33}$$

Next, we compare these two schemes. It is obvious that the hardware complexity of the TTD device and the antenna system mainly depends on the number of bits in an individual TTD device and the total number of TTDs, respectively. Consequently, the bit ratio can be used to measure the degree of reduction in hardware cost of the double-layer TTD scheme relative to the single-layer TTD scheme, which is defined as

$$\eta=\frac{K_{\rm H}P_{\rm H}+K_{\rm H}K_{\rm L}P_{\rm L}}{UP_s}. \tag{34}$$

We assume $N=128$, $f_c=300$ GHz, $\theta_0=\pi/4$, $U=32$, $K_{\rm H}=8$, $K_{\rm L}=4$, and then have $P_s\geq 6$, $P_{\rm H}\geq 6$, $P_{\rm L}\geq 3$ according to (32) and (33). To minimize the hardware cost, we set $P_s=6$, $P_{\rm H}=6$ and $P_{\rm L}=3$. Then, the total numbers of bits under the single- and double-layer TTD schemes are $B_s=UP_s=192$ and $B_m=K_{\rm H}P_{\rm H}+K_{\rm H}K_{\rm L}P_{\rm L}=144$, respectively. Therefore, the bit ratio is $\eta=75\%$. By introducing an additional small-range delay TTD network, both the total required bits of the TTDs and the number of large-range delay TTDs are reduced under the double-layer TTD scheme, and thus the hardware cost is reduced effectively.
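The arithmetic behind (32b), (32c), (33) and (34) can be reproduced with a short script. This is a minimal sketch rather than the authors' code. It assumes half-wavelength antenna spacing, i.e. $T_d=T_c/2$, and equal-size subarrays with $P=N/(K_{\rm H}K_{\rm L})$ and $S=N/U$; none of these assumptions is stated in this section, and the function names are hypothetical.

```python
import math

def required_bits(delay_span_over_tc: float) -> int:
    """Bit count ceil(log2(span / T_c)), floored at zero bits."""
    if delay_span_over_tc <= 1.0:
        return 0
    return math.ceil(math.log2(delay_span_over_tc))

def bit_budget(N, U, K_H, K_L, theta0, td_over_tc=0.5):
    """Bits per TTD and total bits for both schemes.

    td_over_tc = T_d / T_c is an ASSUMPTION (half-wavelength spacing).
    P and S are ASSUMED to follow from equal-size subarrays.
    """
    P = N // (K_H * K_L)   # antennas per PS subarray (double-layer scheme)
    S = N // U             # antennas per subarray (single-layer scheme)
    s = td_over_tc * math.sin(theta0)
    P_L = required_bits((K_L - 1) * P * s)   # (32b)
    P_H = required_bits((N - P * K_L) * s)   # (32c)
    P_s = required_bits((U - 1) * S * s)     # (33)
    B_s = U * P_s                            # total bits, single layer
    B_m = K_H * P_H + K_H * K_L * P_L        # total bits, double layer
    return {"P_L": P_L, "P_H": P_H, "P_s": P_s,
            "B_s": B_s, "B_m": B_m, "eta": B_m / B_s}   # eta as in (34)

if __name__ == "__main__":
    print(bit_budget(N=128, U=32, K_H=8, K_L=4, theta0=math.pi / 4))
```

Under these assumptions the script returns the bit counts and the bit ratio quoted in the paragraph above.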

IV Problem Formulation and Solutions

In this section, we investigate the joint beamforming optimization problem for the distributed RISs-aided THz communications based on the proposed double-layer TTD scheme. Considering the practical hardware limitations, we adopt finite-resolution phase shifts and time delays. We assume that the number of RF chains $N_{\rm RF}$ is equal to the number of RISs, i.e., $N_{\rm RF}=R$, and consider the fully connected structure shown in Fig. 8.

Fig. 8: Fully connected antenna structure with double-layer TTD scheme.

IV-A Problem Formulation

The received signal of the $k$-th user on the $m$-th subcarrier can be written as

$$y_{m,k}=\mathbf{h}_{m,k}\mathbf{F}\mathbf{d}_{m,k}s_{m,k}+\sum_{j=1,j\neq k}^{K}\mathbf{h}_{m,k}\mathbf{F}\mathbf{d}_{m,j}s_{m,j}+n_{m,k}, \tag{35}$$

where $\mathbf{F}=\mathbf{F}_{\rm A}\mathbf{F}_{\rm L}\mathbf{F}_{\rm H}$ is the analog beamforming matrix realized by the double-layer TTD network and PSs. $\mathbf{F}_{\rm A}=\operatorname{diag}([\mathbf{F}_1,\cdots,\mathbf{F}_{n_{\rm rf}},\cdots,\mathbf{F}_{N_{\rm RF}}])\in\mathbb{C}^{N\times K_{\rm L}K_{\rm H}N_{\rm RF}}$, where $\mathbf{F}_{n_{\rm rf}}=\operatorname{diag}([\mathbf{c}_{n_{\rm rf},1,1},\cdots,\mathbf{c}_{n_{\rm rf},k_h,k_l},\cdots,\mathbf{c}_{n_{\rm rf},K_{\rm H},K_{\rm L}}])\in\mathbb{C}^{PK_{\rm L}K_{\rm H}\times K_{\rm L}K_{\rm H}}$, and $\mathbf{c}_{n_{\rm rf},k_h,k_l}\in\mathbb{C}^{P\times 1}$ denotes the beamforming vector generated by the $P$ PSs connected to the $n_{\rm rf}$-th RF chain via the $k_l$-th subarray related to the $k_h$-th TTD of the second layer. $\mathbf{F}_{\rm L}\in\mathbb{C}^{K_{\rm L}K_{\rm H}N_{\rm RF}\times K_{\rm H}N_{\rm RF}}$ denotes the frequency-dependent phase shifts realized by the first-layer TTD network, and satisfies

$$\mathbf{F}_{\rm L}=\operatorname{diag}\left(\left[e^{j2\pi f_m\bm{\tau}_{1,1}},e^{j2\pi f_m\bm{\tau}_{1,2}},\cdots,e^{j2\pi f_m\bm{\tau}_{N_{\rm RF},K_{\rm H}}}\right]\right), \tag{36}$$

where $\bm{\tau}_{n_{\rm rf},k_h}=\left[\tau_{n_{\rm rf},k_h,1},\tau_{n_{\rm rf},k_h,2},\cdots,\tau_{n_{\rm rf},k_h,K_{\rm L}}\right]^{\rm T}\in\mathbb{C}^{K_{\rm L}\times 1}$ is the time delay vector realized by the $K_{\rm L}$ TTD elements connected to the $k_h$-th TTD of the second layer under the $n_{\rm rf}$-th RF chain. $\mathbf{F}_{\rm H}\in\mathbb{C}^{K_{\rm H}N_{\rm RF}\times N_{\rm RF}}$ denotes the frequency-dependent phase shifts realized by the second-layer TTD network, and satisfies

$$\mathbf{F}_{\rm H}=\operatorname{diag}\left(\left[e^{j2\pi f_m\bm{\tau}_1},e^{j2\pi f_m\bm{\tau}_2},\cdots,e^{j2\pi f_m\bm{\tau}_{N_{\rm RF}}}\right]\right), \tag{37}$$

where $\bm{\tau}_{n_{\rm rf}}=\left[\tau_{n_{\rm rf},1},\tau_{n_{\rm rf},2},\cdots,\tau_{n_{\rm rf},K_{\rm H}}\right]^{\rm T}\in\mathbb{C}^{K_{\rm H}\times 1}$ is the time delay vector realized by the second-layer TTD network connected to the $n_{\rm rf}$-th RF chain. In addition, $\mathbf{d}_{m,k}\in\mathbb{C}^{N_{\rm RF}\times 1}$ denotes the digital beamforming vector, $n_{m,k}\sim\mathcal{CN}\left(0,\sigma_{m,k}^2\right)$ is the zero-mean additive white Gaussian noise (AWGN) with variance $\sigma_{m,k}^2$ at the $k$-th user on the $m$-th subcarrier, and $s_{m,k}$ denotes the transmit symbol for the $k$-th user on the $m$-th subcarrier with $E\left[\left|s_{m,k}\right|^2\right]=1$.

Then, the SINR of the $k$-th user on the $m$-th subcarrier can be calculated as

$$\gamma_{m,k}=\frac{\left|\mathbf{h}_{m,k}\mathbf{F}\mathbf{d}_{m,k}\right|^2}{\sum_{j=1,j\neq k}^{K}\left|\mathbf{h}_{m,k}\mathbf{F}\mathbf{d}_{m,j}\right|^2+\sigma_{m,k}^2}, \tag{38}$$

and the achievable sum rate $R_{\rm sum}$ can be expressed by

$$R_{\rm sum}=\sum_{k=1}^{K}\sum_{m=1}^{M}\log_2\left(1+\gamma_{m,k}\right). \tag{39}$$
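As an illustration of (38) and (39), the SINR and sum rate can be evaluated numerically once the effective channels $\mathbf{h}_{m,k}\mathbf{F}$ are available. The following is a minimal NumPy sketch; the array layout and the function name `sum_rate` are assumptions made for this example only.

```python
import numpy as np

def sum_rate(h_eff, d, noise_var):
    """Evaluate (38) and (39).

    h_eff : (M, K, N_RF) array, row m,k holds the effective channel h_{m,k} F
    d     : (M, K, N_RF) array, row m,k holds the digital beamformer d_{m,k}
    noise_var : scalar or (M, K) array of noise variances
    Returns (R_sum, sinr) with sinr of shape (M, K).
    """
    # g[m, k, j] = h_{m,k} F d_{m,j}
    g = np.einsum("mkn,mjn->mkj", h_eff, d)
    power = np.abs(g) ** 2
    signal = np.einsum("mkk->mk", power)         # j == k terms
    interference = power.sum(axis=2) - signal    # j != k terms
    sinr = signal / (interference + noise_var)
    return np.log2(1.0 + sinr).sum(), sinr

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, K, N_RF = 8, 4, 4
    h = rng.standard_normal((M, K, N_RF)) + 1j * rng.standard_normal((M, K, N_RF))
    d = rng.standard_normal((M, K, N_RF)) + 1j * rng.standard_normal((M, K, N_RF))
    rate, _ = sum_rate(h, d, noise_var=1.0)
    print(f"R_sum = {rate:.3f} bit/s/Hz (random toy channels)")
```

The random channels in the usage example are placeholders and carry no physical meaning.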

Let $b$ denote the number of bits, and thus $2^b$ indicates the number of phase shift levels. Then, the set of discrete phase shifts generated by PSs can be expressed as

$$\mathcal{C}=\frac{1}{\sqrt{P}}\left\{e^{j0},e^{j\frac{2\pi}{2^b}},\cdots,e^{j\frac{2\pi}{2^b}(2^b-1)}\right\}. \tag{40}$$

Similarly, the set of discrete reflection coefficients of the RISs can be written as

$$\mathcal{F}=\left\{e^{j0},e^{j\frac{2\pi}{2^Q}},\cdots,e^{j\frac{2\pi}{2^Q}(2^Q-1)}\right\}, \tag{41}$$

where $Q$ indicates the number of bits. Finally, we formulate the joint beamforming optimization problem as follows

$$\mathrm{P1}:\ \max_{\bm{\Theta},\mathbf{F}_{\rm A},\mathbf{F}_{\rm L},\mathbf{F}_{\rm H},\mathbf{d}_{m,k}} R_{\rm sum} \tag{42a}$$
$$\mathrm{s.t.}\ \ \sum_{k=1}^{K}\sum_{m=1}^{M}\left\|\mathbf{F}_{\rm A}\mathbf{F}_{\rm L}\mathbf{F}_{\rm H}\mathbf{d}_{m,k}\right\|^2\leq P_{\rm max}, \tag{42b}$$
$$\tau_{n_{\rm rf},k_h}\in\{0,D,2D,\cdots,(2^{P_{\rm H}}-1)D\}, \tag{42c}$$
$$\tau_{n_{\rm rf},k_h,k_l}\in\{0,D,2D,\cdots,(2^{P_{\rm L}}-1)D\}, \tag{42d}$$
$$\mathbf{c}_{n_{\rm rf},k_h,k_l}\in\frac{1}{\sqrt{P}}\left\{e^{j0},e^{j\frac{2\pi}{2^b}},\cdots,e^{j\frac{2\pi}{2^b}(2^b-1)}\right\}, \tag{42e}$$
$$\varphi_{r,m_x,m_y}\in\left\{e^{j0},e^{j\frac{2\pi}{2^Q}},\cdots,e^{j\frac{2\pi}{2^Q}(2^Q-1)}\right\}, \tag{42f}$$

where $\bm{\Theta}=\operatorname{diag}\left(\bm{\Phi}_1,\ldots,\bm{\Phi}_R\right)$ and $P_{\rm max}$ is the maximum available transmit power. (42b) is the total transmit power constraint, (42c) and (42d) are the discrete time delay constraints for each TTD, (42e) is the phase shift constraint for each PS, and (42f) represents the discrete reflection coefficient constraint for each RIS element. P1 aims to maximize the achievable rate by jointly optimizing the reflection coefficient matrix $\bm{\Theta}$, the frequency-independent beamforming matrix $\mathbf{F}_{\rm A}$, the frequency-dependent phase shifts $\mathbf{F}_{\rm H}$ and $\mathbf{F}_{\rm L}$, and the digital beamforming vector $\mathbf{d}_{m,k}$. Note that the constraint in (42b) is non-convex due to the coupling of $\mathbf{F}_{\rm A}$, $\mathbf{F}_{\rm H}$, $\mathbf{F}_{\rm L}$, and $\mathbf{d}_{m,k}$. Furthermore, the constraints in (42c)-(42f) restrict the optimization parameters to discrete values. Thus, P1 is generally NP-hard, and there is no standard method to obtain its globally optimal solution efficiently. Next, we propose an effective algorithm to deal with it.

IV-B Problem Solution

In this section, we propose a joint beamforming framework to solve P1. First, based on the RISs' locations, we design the analog beamforming, including phase shifts and time delays. Then, we propose an alternating optimization algorithm to obtain the digital beamforming and reflection coefficients.

IV-B1 Design of Analog Beamforming $\mathbf{F}$

The analog beamforming is defined as $\mathbf{F}=\mathbf{F}_{\rm A}\mathbf{F}_{\rm L}\mathbf{F}_{\rm H}$, and thus we need to design $\mathbf{F}_{\rm A}$, $\mathbf{F}_{\rm L}$ and $\mathbf{F}_{\rm H}$ by optimizing the phase shifts and time delays. To compensate for the severe array gain loss caused by the beam split, the analog beamforming should generate beams aligned with the target physical direction at all subcarriers. The key idea of the double-layer TTD scheme is that the time delays are designed to steer the beams at different subcarrier frequencies toward the target's physical direction, while the PSs compensate for the remaining phase shift of the TTD network so that the beam is aligned with the target's physical direction at the central frequency.

Therefore, we first design the time delays via the double-layer TTD network. Based on the analysis in Sec. III-B, the optimal time delay of the $k_l$-th TTD at the first layer connected to the $k_h$-th TTD at the second layer under the $n_{\rm rf}$-th RF chain can be calculated as

$$\tau_{n_{\rm rf},k_h,k_l}=\left(k_l-1\right)PT_d\sin\theta_l^r. \tag{43}$$

Similarly, the optimal time delay of the $k_h$-th TTD at the second layer connected to the $n_{\rm rf}$-th RF chain can be expressed as

$$\tau_{n_{\rm rf},k_h}=\left(k_h-1\right)K_{\rm L}PT_d\sin\theta_l^r, \tag{44}$$

where $\theta_l^r$ is the physical direction of the LoS path from the BS to the $r$-th RIS. Then, we can obtain the discrete time delays $\tau'_{n_{\rm rf},k_h,k_l}$ and $\tau'_{n_{\rm rf},k_h}$ according to the following approximation,

$$\tau'_{n_{\rm rf},k_h,k_l}=\underset{\tau_1\in\mathcal{T}_1}{\operatorname{argmin}}\left|\tau_{n_{\rm rf},k_h,k_l}-\tau_1\right|, \tag{45}$$
$$\tau'_{n_{\rm rf},k_h}=\underset{\tau_2\in\mathcal{T}_2}{\operatorname{argmin}}\left|\tau_{n_{\rm rf},k_h}-\tau_2\right|. \tag{46}$$
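The delay design in (43)-(44) and the nearest-level projection in (45)-(46) can be sketched as follows. The code works in units of $T_c$ and assumes $T_d=T_c/2$ (half-wavelength spacing) in the usage example, which is not stated in this section; the function names are hypothetical. Because the delay sets are uniform grids, rounding to the nearest grid index and clipping to the valid range gives a minimizer of (45)-(46).

```python
import numpy as np

def ideal_delays(K_H, K_L, P, T_d, theta):
    """Continuous delays of (43) (first layer) and (44) (second layer)."""
    s = np.sin(theta)
    tau_low = np.arange(K_L) * P * T_d * s          # (k_l - 1) P T_d sin(theta)
    tau_high = np.arange(K_H) * K_L * P * T_d * s   # (k_h - 1) K_L P T_d sin(theta)
    return tau_low, tau_high

def quantize_delay(tau, D, bits):
    """Nearest element of {0, D, ..., (2**bits - 1) D}, as in (45)-(46)."""
    levels = np.clip(np.rint(np.asarray(tau) / D), 0, 2 ** bits - 1)
    return levels * D

if __name__ == "__main__":
    T_c = 1.0                      # work in units of T_c
    tau_low, tau_high = ideal_delays(K_H=8, K_L=4, P=4,
                                     T_d=0.5 * T_c,   # ASSUMED spacing
                                     theta=np.pi / 4)
    print(quantize_delay(tau_low, D=T_c, bits=3))
    print(quantize_delay(tau_high, D=T_c, bits=6))
```

A smaller step $D$ with the same number of bits reduces the rounding error but also shortens the reachable delay range, so the two parameters have to be chosen together.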

After obtaining the time delays $\tau'_{n_{\rm rf},k_h,k_l}$, the time delay vector $\hat{\bm{\tau}}_{n_{\rm rf},k_h}$ realized by the $K_{\rm L}$ TTD elements connected to the $k_h$-th TTD of the second layer under the $n_{\rm rf}$-th RF chain is given by

$$\hat{\bm{\tau}}_{n_{\rm rf},k_h}=\left[\tau'_{n_{\rm rf},k_h,1},\tau'_{n_{\rm rf},k_h,2},\cdots,\tau'_{n_{\rm rf},k_h,K_{\rm L}}\right]^{T}\in\mathbb{C}^{K_{\rm L}\times 1}. \tag{47}$$

The corresponding frequency-dependent phase shifts $\mathbf{F}_{\rm L}$ realized by the first layer are calculated as

$$\mathbf{F}_{\rm L}=\operatorname{diag}\left(\left[e^{j2\pi f_m\hat{\bm{\tau}}_{1,1}},e^{j2\pi f_m\hat{\bm{\tau}}_{1,2}},\cdots,e^{j2\pi f_m\hat{\bm{\tau}}_{N_{\rm RF},K_{\rm H}}}\right]\right). \tag{48}$$

Similarly, the time delay vector $\hat{\bm{\tau}}_{n_{\rm rf}}$ realized by the $K_{\rm H}$ TTDs connected to the $n_{\rm rf}$-th RF chain can be expressed as

$$\hat{\bm{\tau}}_{n_{\rm rf}}=\left[\tau'_{n_{\rm rf},1},\tau'_{n_{\rm rf},2},\cdots,\tau'_{n_{\rm rf},K_{\rm H}}\right]^{T}\in\mathbb{C}^{K_{\rm H}\times 1}. \tag{49}$$

The frequency-dependent phase shifts $\mathbf{F}_{\rm H}$ realized by the second layer are formulated as

$$\mathbf{F}_{\rm H}=\operatorname{diag}\left(\left[e^{j2\pi f_m\hat{\bm{\tau}}_1},e^{j2\pi f_m\hat{\bm{\tau}}_2},\cdots,e^{j2\pi f_m\hat{\bm{\tau}}_{N_{\rm RF}}}\right]\right). \tag{50}$$

Here, the design of the time delays depends only on the RISs' locations and the BS antenna structure.

Then, we present the frequency-independent beamforming matrix $\mathbf{F}_{\rm A}$, which is realized by PSs. The PSs are used to compensate for the remaining phase shift of the preceding TTD network and to generate beams aligned with the RISs' physical directions. The beamforming vector $\mathbf{c}_{n_{\rm rf},k_h,k_l}$ generated by the $P$ PSs connected to the $n_{\rm rf}$-th RF chain via the $k_l$-th TTD and $k_h$-th TTD is given by

$$\mathbf{c}_{n_{\rm rf},k_h,k_l}=\frac{1}{\sqrt{P}}\left[1,\ldots,e^{j2\pi f_c pT_d\sin\theta_l^r},\ldots,e^{j2\pi f_c\left(P-1\right)T_d\sin\theta_l^r}\right]^{T}. \tag{51}$$

Similarly, we obtain the discrete $\mathbf{c}'_{n_{\rm rf},k_h,k_l}(p)$, $p=1,2,\cdots,P$, according to the following approximation

$$\mathbf{c}'_{n_{\rm rf},k_h,k_l}(p)=\underset{c\in\mathcal{C}}{\operatorname{argmin}}\left|\mathbf{c}_{n_{\rm rf},k_h,k_l}(p)-c\right|, \tag{52}$$

where $c$ is an element of the set $\mathcal{C}$. Then, $\mathbf{F}_{n_{\rm rf}}$ can be expressed as

$$\mathbf{F}_{n_{\rm rf}}=\operatorname{diag}\left([\underbrace{\mathbf{c}'_{n_{\rm rf},1,1},\mathbf{c}'_{n_{\rm rf},1,2},\cdots,\mathbf{c}'_{n_{\rm rf},K_{\rm H},K_{\rm L}}}_{K_{\rm H}K_{\rm L}\text{ columns}}]\right). \tag{53}$$

Finally, we can obtain the frequency-independent beamforming matrix $\mathbf{F}_{\rm A}$, namely

$$\mathbf{F}_{\rm A}=\operatorname{diag}([\mathbf{F}_1,\cdots,\mathbf{F}_{n_{\rm rf}},\cdots,\mathbf{F}_{N_{\rm RF}}]). \tag{54}$$
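To see how the delays (43)-(44) and the PS vector (51) act together, the sketch below builds the per-antenna analog weights of one RF chain with continuous delays and evaluates the normalized array gain toward the target direction. It rests on several assumptions not stated in this section: a uniform linear array with response $e^{j2\pi f nT_d\sin\theta}$, spacing $T_d=1/(2f_c)$, and antenna index $n=(k_h-1)K_{\rm L}P+(k_l-1)P+p$. All function names are hypothetical. A PS-only baseline is obtained by setting $K_{\rm H}=K_{\rm L}=1$ and $P=N$.

```python
import numpy as np

def ps_vector(P, f_c, T_d, theta):
    """Continuous PS beamforming vector of (51)."""
    p = np.arange(P)
    return np.exp(1j * 2 * np.pi * f_c * p * T_d * np.sin(theta)) / np.sqrt(P)

def quantize_ps(c, b):
    """Entrywise nearest point of the b-bit set C in (40), i.e. (52)."""
    step = 2 * np.pi / 2 ** b
    phase = np.rint(np.angle(c) / step) * step
    return np.exp(1j * phase) / np.sqrt(len(c))

def analog_weights(f, f_c, T_d, theta, K_H, K_L, P):
    """Per-antenna weights of one RF chain at frequency f (continuous delays)."""
    c = ps_vector(P, f_c, T_d, theta)
    s = np.sin(theta)
    blocks = []
    for k_h in range(K_H):
        tau_h = k_h * K_L * P * T_d * s      # (44)
        for k_l in range(K_L):
            tau_l = k_l * P * T_d * s        # (43)
            blocks.append(c * np.exp(1j * 2 * np.pi * f * (tau_h + tau_l)))
    return np.concatenate(blocks)

def normalized_gain(w, f, T_d, theta):
    """|a(f)^H w| / (||a|| ||w||) for an ASSUMED ULA response a(f)."""
    n = np.arange(len(w))
    a = np.exp(1j * 2 * np.pi * f * n * T_d * np.sin(theta))
    return np.abs(np.vdot(a, w)) / (np.linalg.norm(a) * np.linalg.norm(w))

if __name__ == "__main__":
    f_c, theta = 300e9, np.pi / 4
    T_d = 1 / (2 * f_c)                      # ASSUMED half-wavelength spacing
    for f in (f_c, f_c + 15e9):              # centre and upper band edge
        w_ttd = analog_weights(f, f_c, T_d, theta, K_H=8, K_L=4, P=4)
        w_ps = analog_weights(f, f_c, T_d, theta, K_H=1, K_L=1, P=128)
        print(f / 1e9, normalized_gain(w_ttd, f, T_d, theta),
              normalized_gain(w_ps, f, T_d, theta))
```

With continuous delays, the only residual mismatch of the double-layer structure is the frequency-flat phase progression across the $P$ elements inside each subarray, which is why its gain stays close to one at the band edge while the PS-only gain collapses.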

IV-B2 Optimization of $\mathbf{d}_{m,k}$ with Fixed $\bm{\Theta}$

So far, we have obtained the analog beamforming $\mathbf{F}$, and next we solve for the digital beamforming and reflection coefficients. Based on the obtained $\mathbf{F}$, the original problem P1 can be reformulated as follows

$$\mathrm{P2}:\ \max_{\bm{\Theta},\mathbf{d}_{m,k}} R_{\rm sum} \tag{55a}$$
$$\mathrm{s.t.}\ \ \sum_{k=1}^{K}\sum_{m=1}^{M}\left\|\mathbf{F}_{\rm A}\mathbf{F}_{\rm L}\mathbf{F}_{\rm H}\mathbf{d}_{m,k}\right\|^2\leq P_{\rm max}, \tag{55b}$$
$$\varphi_{r,m_x,m_y}\in\left\{e^{j0},e^{j\frac{2\pi}{2^Q}},\cdots,e^{j\frac{2\pi}{2^Q}(2^Q-1)}\right\}. \tag{55c}$$

However, P2 is still difficult to solve due to the non-convex objective function. Next, we propose an alternating optimization algorithm to deal with it.

First, for a given reflection coefficient matrix $\bm{\Theta}$, we propose an iterative algorithm based on the MMSE technique to obtain the digital beamforming $\mathbf{d}_{m,k}$. The equivalent channel vector for the $k$-th user on the $m$-th subcarrier can be written as $\hat{\mathbf{h}}_{m,k}=\mathbf{h}_{m,k}\mathbf{F}$. Besides, based on the extension of the Sherman-Morrison-Woodbury formula [41], we have

$$\left(1+\gamma_{m,k}\right)^{-1}=1-\frac{\left|\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,k}\right|^2}{\sum_{j=1}^{K}\left|\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,j}\right|^2+\sigma_{m,k}^2}. \tag{56}$$

The MMSE receive combining filter at the $k$-th user on the $m$-th subcarrier is given as

$$\chi_{m,k}^{*}=\arg\min_{\chi_{m,k}}\xi_{m,k}, \tag{57}$$

where $\xi_{m,k}=\mathbb{E}\left[\left\|\chi_{m,k}y_{m,k}-s_{m,k}\right\|_2^2\right]$ is the MSE. Substituting (35) into $\xi_{m,k}$, the MSE can be written as

$$\xi_{m,k}=\sum_{j=1}^{K}\left|\chi_{m,k}\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,j}\right|^2-2\Re\left\{\chi_{m,k}\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,k}\right\}+\left|\chi_{m,k}\right|^2\sigma_{m,k}^2+1. \tag{58}$$

Then, taking the partial derivative of (58) with respect to $\chi_{m,k}$ and setting the result to zero, the optimal receive combining filter $\chi_{m,k}^{*}$ at the $k$-th user on the $m$-th subcarrier can be expressed as

$$\chi_{m,k}^{*}=\frac{\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,k}}{\sum_{j=1}^{K}\left|\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,j}\right|^2+\sigma_{m,k}^2}. \tag{59}$$

Substituting (59) into (58), the MMSE can be obtained as

$$\xi_{m,k}^{*}=1-\frac{\left|\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,k}\right|^2}{\sum_{j=1}^{K}\left|\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,j}\right|^2+\sigma_{m,k}^2}, \tag{60}$$

which is equal to $\left(1+\gamma_{m,k}\right)^{-1}$ in (56), i.e.,

$$\left(1+\gamma_{m,k}\right)^{-1}=\min_{\chi_{m,k}}\xi_{m,k}. \tag{61}$$

Then, the achievable rate of the $k$-th user on the $m$-th subcarrier can be transformed as

$$\log_2\left(1+\gamma_{m,k}\right)=\max_{\chi_{m,k}}\left(-\log_2\xi_{m,k}\right). \tag{62}$$

Based on [43], [44], we can obtain

$$\log_2\left(1+\gamma_{m,k}\right)=\max_{\chi_{m,k}}\max_{\varpi_{m,k}>0}\left(-\frac{\varpi_{m,k}\xi_{m,k}}{\ln 2}+\log_2\varpi_{m,k}+\frac{1}{\ln 2}\right), \tag{63}$$

where $\varpi_{m,k}$ is the weight of the data for the $k$-th user on the $m$-th subcarrier and the optimal $\varpi_{m,k}$ is $\varpi_{m,k}^{*}=\frac{1}{\xi_{m,k}}$. Next, $\mathrm{P2}$ can be transformed to the MSE minimization problem, namely

$$\mathrm{P3}:\ \max_{\mathbf{d}_{m,k}}\sum_{k=1}^{K}\sum_{m=1}^{M}\max_{\chi_{m,k}}\max_{\varpi_{m,k}>0}\left(-\frac{\varpi_{m,k}\xi_{m,k}}{\ln 2}+\log_2\varpi_{m,k}+\frac{1}{\ln 2}\right) \tag{64a}$$
$$\mathrm{s.t.}\ \ \sum_{k=1}^{K}\sum_{m=1}^{M}\left\|\mathbf{F}_{\rm A}\mathbf{F}_{\rm L}\mathbf{F}_{\rm H}\mathbf{d}_{m,k}\right\|^2\leq P_{\rm max}. \tag{64b}$$
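The chain of identities (56)-(63) can be checked numerically for a single user and subcarrier. The sketch below is an illustration with hypothetical function names, not part of the proposed algorithm. Note one assumption: the filter is implemented with a complex conjugate in the numerator so that the real-part term in (58) is real and positive; this is how the sketch interprets the convention behind (59).

```python
import numpy as np

def sinr(h, d, k, noise_var):
    """SINR (38) of user k; h has shape (n,), d has shape (n, K)."""
    g = np.abs(h @ d) ** 2
    return g[k] / (g.sum() - g[k] + noise_var)

def mse(chi, h, d, k, noise_var):
    """MSE (58) of user k for a scalar receive filter chi."""
    g = h @ d                                   # g[j] = h d_j
    return (np.sum(np.abs(chi * g) ** 2) - 2 * np.real(chi * g[k])
            + np.abs(chi) ** 2 * noise_var + 1.0)

def mmse_filter(h, d, k, noise_var):
    """Minimizer of (58); the conjugate is an ASSUMED convention."""
    g = h @ d
    return np.conj(g[k]) / (np.sum(np.abs(g) ** 2) + noise_var)

def rate_surrogate(weight, xi):
    """Inner expression of (63) for a given weight and MSE."""
    return -weight * xi / np.log(2) + np.log2(weight) + 1 / np.log(2)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    h = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    d = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
    chi = mmse_filter(h, d, 0, 0.1)
    xi = mse(chi, h, d, 0, 0.1)
    print(xi, 1 / (1 + sinr(h, d, 0, 0.1)))                           # (60)-(61)
    print(rate_surrogate(1 / xi, xi), np.log2(1 + sinr(h, d, 0, 0.1)))  # (63)
```

Each printed pair should agree up to floating-point error, which is what makes the weighted-MSE form an exact reformulation of the rate.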

To address $\mathrm{P3}$, an iterative optimization algorithm is proposed. Based on the obtained $\mathbf{d}_{m,k}^{(i-1)}$ at the $(i-1)$-th iteration, $\chi_{m,k}^{(i)}$ at the $i$-th iteration can be expressed as

$$\chi_{m,k}^{(i)}=\frac{\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,k}^{(i-1)}}{\sum_{j=1}^{K}\left|\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,j}^{(i-1)}\right|^2+\sigma_{m,k}^2}. \tag{65}$$

The optimal $\varpi_{m,k}^{(i)}$ at the $i$-th iteration can be obtained by $\varpi_{m,k}^{(i)}=\frac{1}{\xi_{m,k}^{*(i)}}$, where

$$\xi_{m,k}^{*(i)}=1-\frac{\left|\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,k}^{(i-1)}\right|^2}{\sum_{j=1}^{K}\left|\hat{\mathbf{h}}_{m,k}\mathbf{d}_{m,j}^{(i-1)}\right|^2+\sigma_{m,k}^2}. \tag{66}$$

Finally, the problem $\mathrm{P3}$ is transformed as

$$\mathrm{P4}:\ \min_{\mathbf{d}_{m,k}^{(i)}}\sum_{k=1}^{K}\sum_{m=1}^{M}\left(\frac{\varpi_{m,k}^{(i)}\xi_{m,k}^{(i)}}{\ln 2}-\log_2\varpi_{m,k}^{(i)}-\frac{1}{\ln 2}\right) \tag{67a}$$
$$\mathrm{s.t.}\ \ \sum_{k=1}^{K}\sum_{m=1}^{M}\left\|\mathbf{F}_{\rm A}\mathbf{F}_{\rm L}\mathbf{F}_{\rm H}\mathbf{d}_{m,k}^{(i)}\right\|^2\leq P_{\rm max}. \tag{67b}$$

$\mathrm{P4}$ can be solved by numerical convex program solvers. Since the obtained $\mathbf{d}_{m,k}^{(i)}$, $\varpi_{m,k}^{(i)}$, $\chi_{m,k}^{(i)}$ are the optimal solutions of $\mathrm{P4}$ at the $i$-th iteration, the objective function decreases monotonically over the iterations, and it is also lower bounded. Consequently, the iteration converges to at least a locally optimal solution.

1 Input: Channels $\mathbf{f}_{r,m,k}$, $\mathbf{G}_{r,m}$, analog beamforming $\mathbf{F}$, digital beamforming vector $\mathbf{d}_{m,k}$, maximum iterations $I_o$.
2 Initialization: $\varphi_{n_{\rm ris}}^{(0)}=+1$ for $n_{\rm ris}=1,2,\cdots,RN_{\rm RIS}$, $i=1$.
3 while $i\leq I_o$ do
4     for $n_{\rm ris}=1:RN_{\rm RIS}$ do
5       $\varphi_{n_{\rm ris}}^{(i)}=-1$;
6       $\bm{\phi}_1=\left[\varphi_1^{(i)},\varphi_2^{(i)},\cdots,\varphi_{n_{\rm ris}-1}^{(i)},\varphi_{n_{\rm ris}}^{(i)},\varphi_{n_{\rm ris}+1}^{(i-1)},\cdots,\varphi_{RN_{\rm RIS}}^{(i-1)}\right]^{T}$;
7       Transform the vector $\bm{\phi}_1$ to the reflection matrix $\bm{\Theta}_1$;
8       $\varphi_{n_{\rm ris}}^{(i)}=+1$;
9       $\bm{\phi}_2=\left[\varphi_1^{(i)},\varphi_2^{(i)},\cdots,\varphi_{n_{\rm ris}-1}^{(i)},\varphi_{n_{\rm ris}}^{(i)},\varphi_{n_{\rm ris}+1}^{(i-1)},\cdots,\varphi_{RN_{\rm RIS}}^{(i-1)}\right]^{T}$;
10       Transform the vector $\bm{\phi}_2$ to the reflection matrix $\bm{\Theta}_2$;
11       $q=\arg\max_{q=1,2}\left\{R_{\rm sum}(\bm{\Theta}_q)\right\}$;
12       $\bm{\Theta}=\bm{\Theta}_q$, and set $\varphi_{n_{\rm ris}}^{(i)}$ to the corresponding entry of $\bm{\phi}_q$;
13    end for
14    $i\leftarrow i+1$;
15 end while
Output: Reflection coefficients matrix $\bm{\Theta}$.
Algorithm 1 Coordinate Update Algorithm for Optimizing Reflection Coefficients Matrix

IV-B3 Optimization of $\bm{\Theta}$ with Fixed $\mathbf{d}_{m,k}$

After obtaining the digital beamforming $\mathbf{d}_{m,k}$, we apply the coordinate update algorithm [42] to obtain the reflection coefficients matrix. Since $\mathbf{h}_{m,k}=\sum_{r=1}^{R}\mathbf{f}_{r,m,k}\bm{\Phi}_r\mathbf{G}_{r,m}$, we transform (55) into the following optimization problem:

$$\mathrm{P5}:\ \max_{\bm{\Theta}} R_{\rm sum} \tag{68a}$$
$$\mathrm{s.t.}\ \ \varphi_{r,m_x,m_y}\in\left\{e^{j0},e^{j\frac{2\pi}{2^Q}},\cdots,e^{j\frac{2\pi}{2^Q}(2^Q-1)}\right\}. \tag{68b}$$

Considering the discrete nature of the reflection coefficients, the optimal solution of P5 can be obtained by searching over all possible vectors. However, even for $Q=1$ bit, such an exhaustive search must examine $2^{RN_{\rm RIS}}$ candidate vectors, which involves unaffordable complexity for a large $N_{\rm RIS}$. Therefore, we employ the coordinate update algorithm, which only requires a search complexity of $2RN_{\rm RIS}I_o$ to obtain a suboptimal solution. Specifically, by fixing the other $RN_{\rm RIS}-1$ phase shifts, we alternately optimize each of the $RN_{\rm RIS}$ phase shifts via a one-dimensional search over $\mathcal{F}$ in an iterative manner until convergence.

To simplify the expression, we let $\varphi_{n_{\rm ris}}=\varphi_{r,m_x,m_y}$ and define the vector $\bm{\phi}=\left[\varphi_1,\varphi_2,\cdots,\varphi_{n_{\rm ris}-1},\varphi_{n_{\rm ris}},\varphi_{n_{\rm ris}+1},\cdots,\varphi_{RN_{\rm RIS}}\right]^{T}\in\mathbb{C}^{RN_{\rm RIS}\times 1}$, $n_{\rm ris}=1,2,\cdots,RN_{\rm RIS}$. Without loss of generality, we set $Q=1$ bit. The optimization problem is transformed as

$$\mathrm{P6}:\ \max_{\bm{\phi}} R_{\rm sum} \tag{69a}$$
$$\mathrm{s.t.}\ \ \varphi_{n_{\rm ris}}\in\{1,-1\}. \tag{69b}$$

Based on the coordinate update algorithm, with the other $RN_{\rm RIS}-1$ coefficients held fixed, the optimal reflection coefficient of the $n_{\rm ris}$-th element is given by

$$\varphi_{n_{\rm ris}}^{*}=\arg\max_{\varphi_{n_{\rm ris}}\in\{1,-1\}}\left\{R_{\rm sum}(\varphi_{n_{\rm ris}})\right\}. \tag{70}$$

We summarize the procedure for solving the reflection coefficients in Algorithm 1. Note that problem P1 is the original problem formulation, while P2-P6 are the intermediate problems obtained by transformation and simplification in the solving process. The digital beamforming vector and the reflection coefficients matrix can be obtained by solving P4 and P6, respectively.
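The 1-bit coordinate update can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: the rate is supplied as a black-box function `rate_fn` (hypothetical name) of the $\pm 1$ coefficient vector, and each coordinate step evaluates only the flipped state because the rate of the current state is cached, which is equivalent to comparing the two candidates $+1$ and $-1$. The toy rate in the usage example, a single-user cascaded channel with random coefficients, is a placeholder.

```python
import numpy as np

def coordinate_update(rate_fn, n_elements, max_iters=10):
    """1-bit coordinate update over phi in {+1, -1}^n_elements.

    rate_fn : callable mapping a +/-1 vector to a scalar rate (ASSUMED interface)
    Returns the final coefficient vector and its rate.
    """
    phi = np.ones(n_elements)            # all coefficients start at +1
    best = rate_fn(phi)
    for _ in range(max_iters):
        changed = False
        for n in range(n_elements):
            phi[n] = -phi[n]             # try the other 1-bit state
            trial = rate_fn(phi)
            if trial > best:
                best, changed = trial, True   # keep the flip
            else:
                phi[n] = -phi[n]         # revert to the previous state
        if not changed:                  # a full sweep without improvement
            break
    return phi, best

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 64                               # e.g. R * N_RIS
    c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    toy_rate = lambda phi: np.log2(1 + np.abs(c @ phi) ** 2)   # placeholder rate
    phi, best = coordinate_update(toy_rate, n)
    print(f"rate with all +1: {toy_rate(np.ones(n)):.2f}, after update: {best:.2f}")
```

Since a flip is kept only when it strictly increases the rate, the rate sequence is non-decreasing, and the returned vector cannot be improved by changing any single coefficient once a full sweep makes no change.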

So far, the analog beamforming $\mathbf{F}$, the digital beamforming $\mathbf{d}_{m,k}$ and the reflection coefficients matrix $\bm{\Theta}$ have all been obtained. In the process of solving P1, based on the RISs' locations, we first design the analog beamforming, including the PSs' phase shifts and the time delays of the double-layer TTD network. Then, we propose an alternating optimization algorithm to obtain the digital beamforming and reflection coefficients. Specifically, given fixed reflection coefficients, the MMSE technique is applied to obtain the digital beamforming. Next, we employ the coordinate update algorithm to solve for the reflection coefficients. The above procedure is repeated until convergence. The details of the proposed optimization framework are summarized in Algorithm 2, which obtains a sub-optimal solution that balances performance and computational complexity.

1 Input: Channels $\mathbf{f}_{r,m,k}$, $\mathbf{G}_{r,m}$.
2 Initialization: Digital beamforming vector $\mathbf{d}_{m,k}^{(0)}$, reflection coefficients matrix $\bm{\Theta}^{(0)}$, $i=0$.
3 Calculate the analog beamforming matrix $\mathbf{F}$ according to (48), (50) and (54).
4 while $0\leq i<I_{\rm max}$ do
5     Obtain digital beamforming vector $\mathbf{d}_{m,k}$ by solving P4;
6     Obtain reflection coefficients matrix $\bm{\Theta}$ by solving P6;
7     $i\leftarrow i+1$;
8 end while
Output: Analog beamforming matrix $\mathbf{F}$, digital beamforming vector $\mathbf{d}_{m,k}$, reflection coefficients matrix $\bm{\Theta}$.
Algorithm 2 The Proposed Algorithm for Solving $\mathrm{P1}$

IV-C Computational Complexity

In this subsection, we analyze the computational complexity of the proposed algorithm, which is mainly induced by the MMSE algorithm and the coordinate update algorithm. Specifically, the computational complexity of obtaining the digital beamforming vector $\mathbf{d}_{m,k}$ is $\mathcal{O}\left(I_dMN^2\right)$, where $I_d$ is the required number of iterations of the MMSE algorithm. The computational complexity of solving the reflection coefficients matrix $\bm{\Theta}$ is $\mathcal{O}\left(I_o2^QKNRN_{\rm RIS}\right)$, where $I_o$ is the required number of iterations of the coordinate update algorithm. The overall computational complexity of the proposed Algorithm 2 is $\mathcal{O}\left(I_{\rm max}(I_dMN^2+I_o2^QKNRN_{\rm RIS})\right)$, where $I_{\rm max}$ is the required number of outer iterations.

V Numerical Results

In this section, simulation results are presented to evaluate the performance of the proposed schemes. We assume that the BS is located at (50m, 0m, 3m) and $K=4$ users are randomly distributed in a circle centered at (0, 85m, 0) with a radius of 1m. In addition, we deploy $R=4$ distributed RISs, located at (0, 80m, 6m), (0, 80m, 8m), (0, 85m, 6m) and (0, 85m, 8m), respectively. Since THz communication mainly relies on the LoS path, we set $L_1=L_2=1$ [33]. The other default simulation parameters are listed in Table I.

TABLE I: System parameters
| Parameters | Value |
| --- | --- |
| Number of antennas | $N=128$ |
| Central frequency | $f_c=300$ GHz |
| Bandwidth | $B=30$ GHz |
| Number of subcarriers | $M=8$ |
| Bit of the TTD at the first layer | $P_{\rm L}=4$ |
| Bit of the TTD at the second layer | $P_{\rm H}=8$ |
| Number of RISs | $R=4$ |
| Number of RIS elements | $N_{\rm RIS}=16$ |
| Number of users | $K=4$ |
| Number of RF chains | $N_{\rm RF}=4$ |
| Maximum transmit power | $P_{\rm max}=10$ dBm |
| Noise power | $\sigma_{m,k}^2=-85$ dBm |
Fig. 9: Achievable rate versus the BS transmit power under different time delay steps.

Fig. 9 plots the achievable rate versus the maximum transmit power $P_{\rm max}$ with different time delay steps. Here, we first consider infinite-resolution phase shifts at the PSs. In addition, we mainly focus on the performance of the proposed double-layer TTD scheme at the BS, and thus the RIS is not considered in this simulation. For the single- and double-layer TTD schemes, we set $U=32$, $K_{\rm H}=8$, $K_{\rm L}=4$, $P_s=8$, $P_{\rm H}=8$, $P_{\rm L}=4$. The time delay step $D$ takes the ideal (continuous) case, $0.15T_c$, and $0.25T_c$. One can observe that the achievable rate of the single- and double-layer TTD schemes increases as the time delay step decreases. The reason is that the TTD becomes less accurate with a large time delay step, which results in beam misalignment. Moreover, we find that the double-layer TTD scheme achieves almost the same performance as the single-layer TTD scheme, while the number of large-range delay TTDs and the total number of bits are effectively reduced. In addition, the PS scheme performs worst due to the serious beam split effect.

Fig. 10 (a) and Fig. 10 (b) show the convergence performance of the proposed inner iterative algorithms for solving the digital beamforming and the reflection matrix, respectively, i.e., lines 5 and 6 in Algorithm 2. We set $K_{\rm H}=8$, $K_{\rm L}=4$ in the double-layer TTD scheme. Besides, we assume $D=0.15T_{c}$, $P_{\rm{H}}=8$, $P_{\rm{L}}=4$, $P_{\rm{s}}=8$, $Q=1$ and infinite-resolution phase shifts at the PSs. The legend “$n$-th iteration” in Fig. 10 stands for the outer iteration number. One can observe that the inner iterative algorithm tends to converge after five iterations for each outer iteration. In addition, the gap between the second and third outer iterations is small, whereas the gap between the first and second is large. This means that the outer iterative loop (i.e., Algorithm 2) also converges rapidly.

Fig. 10: Achievable rate versus iteration for solving (a) the digital beamforming, (b) the reflection coefficients matrix.
Fig. 11: Achievable rate versus the iteration.
TABLE II: Hardware complexity and performance comparison
| Architecture | Delay range | Number of large-range delay TTDs | Total number of TTDs | Total number of bits | Achievable rate |
|---|---|---|---|---|---|
| Single-layer TTD scheme ($U=32$) | $\tau_{u}\in[0,62T_{c}]$ | $N_{\rm RF}U=128$ | $N_{\rm RF}U=128$ | $N_{\rm RF}UP_{s}=1024$ | $3.36$ bit/s/Hz |
| Double-layer TTD scheme ($K_{\rm H}=8$, $K_{\rm L}=4$) | $\tau_{n_{\rm rf},k_{h}}\in[0,56T_{c}]$, $\tau_{n_{\rm rf},k_{h},k_{l}}\in[0,6T_{c}]$ | $N_{\rm RF}K_{\rm H}=32$ | $N_{\rm RF}(K_{\rm H}+K_{\rm H}K_{\rm L})=160$ | $N_{\rm RF}(K_{\rm H}P_{\rm H}+K_{\rm H}K_{\rm L}P_{\rm L})=768$ | $3.34$ bit/s/Hz |
| Double-layer TTD scheme ($K_{\rm H}=8$, $K_{\rm L}=2$) | $\tau_{n_{\rm rf},k_{h}}\in[0,56T_{c}]$, $\tau_{n_{\rm rf},k_{h},k_{l}}\in[0,4T_{c}]$ | $N_{\rm RF}K_{\rm H}=32$ | $N_{\rm RF}(K_{\rm H}+K_{\rm H}K_{\rm L})=96$ | $N_{\rm RF}(K_{\rm H}P_{\rm H}+K_{\rm H}K_{\rm L}P_{\rm L})=512$ | $2.93$ bit/s/Hz |

Fig. 11 shows the achievable rate versus the number of outer iterations $I_{\rm{max}}$ under different schemes. The single-layer TTD scheme with $U=32$ serves as the performance upper bound, and the classical PS scheme is adopted as the baseline. For the double-layer TTD scheme, we consider two cases: $K_{\rm H}=8$, $K_{\rm L}=4$ and $K_{\rm H}=8$, $K_{\rm L}=2$. Meanwhile, we assume $D=0.15T_{c}$, $P_{\rm{H}}=8$, $P_{\rm{L}}=4$, $P_{\rm{s}}=8$, $Q=1$ and infinite-resolution phase shifts at the PSs. One can observe that the achievable rate tends to converge after 3 iterations, which demonstrates the effectiveness of the proposed algorithm. Besides, when the number of large-range delay TTDs $K_{\rm H}$ is fixed, the achievable rate increases with the number of small-range delay TTDs. Specifically, the proposed double-layer TTD scheme with only $K_{\rm L}=2$ small-range delay TTDs already enhances the performance significantly. When $K_{\rm L}=4$, the achievable rate of the double-layer TTD scheme is almost the same as that of the single-layer TTD scheme, which shows that the double-layer TTD scheme is a hardware-efficient alternative for mitigating the beam split effect. The reason is that the small-range delay TTDs only need to compensate for the propagation delay across the small subarray aperture.
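
The delay ranges listed in TABLE II can be reproduced by a simple aperture argument. The sketch below is one reading that matches the table, not a formula taken from the text: it assumes half-wavelength antenna spacing, so that adjacent antennas differ in propagation delay by at most $T_c/2$, and that each layer of TTDs must cover the delay between its first and last TTD. The helper names are hypothetical.

```python
def max_delay_in_tc(num_ttds, antennas_per_ttd):
    """Largest delay, in units of T_c, that one layer of TTDs has to provide.

    Assumption (not stated in the text): adjacent antennas differ by at most
    T_c / 2 in delay, and the layer spans (num_ttds - 1) TTD-to-TTD steps of
    antennas_per_ttd antennas each.
    """
    return (num_ttds - 1) * antennas_per_ttd * 0.5


def single_layer_range(n_ant, u):
    """Delay range of the single-layer scheme with U TTDs per RF chain."""
    return max_delay_in_tc(u, n_ant // u)


def double_layer_ranges(n_ant, k_h, k_l):
    """Delay ranges (first layer, second layer) of the double-layer scheme."""
    first = max_delay_in_tc(k_h, n_ant // k_h)
    second = max_delay_in_tc(k_l, n_ant // (k_h * k_l))
    return first, second


if __name__ == "__main__":
    n_ant = 128  # number of antennas from Table I
    print("single-layer, U=32:", single_layer_range(n_ant, 32), "T_c")
    print("double-layer, K_H=8, K_L=4:", double_layer_ranges(n_ant, 8, 4), "T_c")
    print("double-layer, K_H=8, K_L=2:", double_layer_ranges(n_ant, 8, 2), "T_c")
```

Under this reading, the second-layer range depends only on the aperture of one first-layer subarray, which is why it stays at a few $T_c$ while the first-layer range remains $56T_c$.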

Next, we compare the hardware complexity and the achievable rate of the single- and double-layer TTD schemes in TABLE II. One can observe that the number of large-range delay TTDs is reduced from 128 under the single-layer TTD scheme to 32 under the proposed double-layer TTD scheme, a reduction of $75\%$. In addition, for $K_{\rm H}=8$ and $K_{\rm L}=4$, the total number of bits is reduced by $25\%$ (from 1024 to 768), while the achievable rate decreases by only $0.6\%$. When $K_{\rm H}=8$ and $K_{\rm L}=2$, the hardware saving is more pronounced: the total number of bits is reduced by $50\%$ (from 1024 to 512), while the achievable rate decreases from 3.36 to 2.93 bit/s/Hz (about $12.8\%$). Furthermore, although an additional small-range delay TTD network is introduced, its required delay range is much smaller than that of the large-range delay TTD network. Therefore, the proposed scheme can effectively reduce the hardware cost at the expense of a small loss in achievable rate.
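
The hardware counts and percentages quoted from TABLE II can be checked with a few lines of arithmetic. The following sketch only evaluates the expressions given in the table with the parameter values used in this section. The function and dictionary key names are hypothetical.

```python
def single_layer_cost(n_rf, u, p_s):
    """TTD and bit counts of the single-layer scheme (expressions of TABLE II)."""
    ttds = n_rf * u
    return {"large_range_ttds": ttds, "total_ttds": ttds, "total_bits": ttds * p_s}


def double_layer_cost(n_rf, k_h, k_l, p_h, p_l):
    """TTD and bit counts of the double-layer scheme (expressions of TABLE II)."""
    large = n_rf * k_h          # first-layer (large-range) TTDs
    small = n_rf * k_h * k_l    # second-layer (small-range) TTDs
    return {
        "large_range_ttds": large,
        "total_ttds": large + small,
        "total_bits": large * p_h + small * p_l,
    }


def reduction_percent(reference, value):
    """Relative reduction of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


if __name__ == "__main__":
    single = single_layer_cost(n_rf=4, u=32, p_s=8)
    # Achievable rates (bit/s/Hz) are copied from TABLE II, not computed here.
    cases = {4: 3.34, 2: 2.93}
    for k_l, rate in cases.items():
        double = double_layer_cost(n_rf=4, k_h=8, k_l=k_l, p_h=8, p_l=4)
        bits = reduction_percent(single["total_bits"], double["total_bits"])
        loss = reduction_percent(3.36, rate)
        print(f"K_L={k_l}: {double}, bits -{bits:.1f}%, rate -{loss:.1f}%")
```

Note that for $K_{\rm L}=4$ the total number of TTDs grows from 128 to 160; the saving comes from replacing most large-range, 8-bit devices with small-range, 4-bit ones.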

In Fig. 12, we present the achievable rate versus the BS transmit power under different schemes. We set $D=0.15T_{c}$, $P_{\rm{H}}=8$, $P_{\rm{L}}=4$, $P_{\rm{s}}=8$, $Q=1$, and infinite-resolution phase shifts at the PSs. One can observe that the achievable rate increases with the transmit power under all schemes. In addition, the double-layer TTD scheme with $K_{\rm H}=8$ and $K_{\rm L}=4$ achieves almost the same performance as the single-layer TTD scheme, while the achievable rate of the double-layer TTD scheme with $K_{\rm H}=8$ and $K_{\rm L}=2$ is slightly lower. However, according to TABLE II, compared with the single-layer TTD scheme, the proposed double-layer TTD scheme reduces the hardware complexity by a much larger percentage than it reduces the achievable rate.

Fig. 12: Achievable rate versus the BS transmit power under different schemes.

Fig. 13 shows the achievable rate versus the BS transmit power for different numbers of RIS elements $R_{\rm{total}}$. Here, we set $D=0.15T_{c}$, $P_{\rm{H}}=8$, $P_{\rm{L}}=4$, $P_{\rm{s}}=8$, $Q=1$, and infinite-resolution phase shifts at the PSs. As expected, more RIS elements yield a higher rate under the same conditions. Besides, the results demonstrate that the proposed framework can be applied to any number of RIS elements. Fig. 14 plots the achievable rate versus the number of iterations $I_{\rm{max}}$, where we set $D=0.15T_{c}$, $P_{\rm{H}}=8$, $P_{\rm{L}}=4$, $P_{\rm{s}}=8$, $Q=1$, $b=1$. One can observe that the achievable rate also tends to converge after 3 iterations, which demonstrates the effectiveness of the proposed method. In addition, the achievable rate is slightly lower than that obtained with infinite-resolution phase shifts at the PSs in Fig. 11. However, this small degradation in achievable rate is acceptable in exchange for a large reduction in hardware complexity and cost.

Fig. 13: Achievable rate versus the BS transmit power with different RIS elements.
Fig. 14: Achievable rate versus the iteration with $b=1$.

Furthermore, we analyze the robustness of the proposed joint wideband beamforming against the impacts of imperfect CSI. We consider CSI uncertainties that can be modeled as

$$\widetilde{h}=h+e, \tag{71}$$

where $\widetilde{h}$ and $h$ denote the estimated and real channels, respectively, and $e$ represents the independent estimation error following a zero-mean complex Gaussian distribution, i.e., $e\sim\mathcal{CN}\left(0,\sigma_{e}^{2}\right)$. We assume that the variance $\sigma_{e}^{2}$, i.e., the error power, satisfies $\sigma_{e}^{2}\triangleq\delta|h|^{2}$, where $\delta$ denotes the ratio of the error power $\sigma_{e}^{2}$ to the channel gain $|h|^{2}$ and characterizes the level of CSI error.
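
As an illustration of this error model, the following minimal sketch draws estimated channel coefficients from real ones. It assumes that the error is applied independently to each complex coefficient with variance $\delta|h|^{2}$, split equally between the real and imaginary parts. The function name and the use of NumPy's random generator are choices made for this example only.

```python
import numpy as np


def estimated_channel(h, delta, rng):
    """Return h + e with e ~ CN(0, delta * |h|^2), drawn per coefficient.

    h     : array of real (true) complex channel coefficients
    delta : ratio of error power to channel gain (CSI error level)
    rng   : numpy.random.Generator used for reproducibility
    """
    h = np.asarray(h, dtype=complex)
    sigma2 = delta * np.abs(h) ** 2
    # A circularly symmetric complex Gaussian has variance sigma2 / 2 per part.
    noise = rng.standard_normal(h.shape) + 1j * rng.standard_normal(h.shape)
    return h + np.sqrt(sigma2 / 2) * noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = np.full(100_000, 1.0 + 1.0j)
    for delta in (0.0, 0.1, 0.3):
        h_est = estimated_channel(h, delta, rng)
        ratio = np.mean(np.abs(h_est - h) ** 2) / np.mean(np.abs(h) ** 2)
        print(f"delta={delta}: empirical error-to-gain ratio = {ratio:.3f}")
```

With enough samples, the empirical ratio of error power to channel gain approaches the chosen $\delta$, and $\delta=0$ recovers the perfect-CSI case.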

We assume $D=0.15T_{c}$, $P_{\rm{H}}=8$, $P_{\rm{L}}=4$, $P_{\rm{s}}=8$, $Q=1$ and $b=1$. Then, the achievable rate per subcarrier versus the CSI error parameter $\delta$ is shown in Fig. 15. One can observe that the performance loss grows as $\delta$ increases. The reason is that a larger error degrades the accuracy of the estimated angles, which results in beam misalignment. For example, for the “Double-layer TTD scheme, $K_{\rm H}=8$, $K_{\rm L}=4$”, compared with perfect CSI (i.e., $\delta=0$), the system performance suffers a loss of $6\%$ when $\delta=0.1$, and a loss of $30\%$ when $\delta=0.3$. Besides, the proposed double-layer TTD scheme always outperforms the frequency-independent PS scheme without TTDs and remains close to the single-layer TTD scheme at any CSI estimation error level, which validates the effectiveness and robustness of the proposed joint wideband beamforming in dealing with beam split.

Fig. 15: Average achievable rate per subcarrier versus $\delta$.

VI Conclusions

In this paper, we proposed a double-layer TTD scheme with low hardware cost to overcome the beam split of the BS and to solve the maximum delay compensation problem observed in the traditional single-layer TTD scheme. We first analyzed the phase compensation error and normalized array gain under the double-layer TTD scheme. Then, based on the proposed scheme, we investigated the beamforming optimization problem for multiple distributed RISs-aided THz communications and formulated an achievable rate maximization problem via jointly optimizing the hybrid analog/digital beamforming, the time delays of the double-layer TTD network and the reflection coefficients of the RISs. Theoretical analysis and simulation results demonstrated that the double-layer TTD scheme achieves almost the same performance as the single-layer TTD scheme, while the overall hardware cost is effectively decreased. In our future work, to further reduce the time delay range of the TTDs at the BS, we will extend the proposed double-layer TTD network to the sub-connected hybrid beamforming architecture. Besides, to address the beam split at the RIS, we will introduce TTDs to the RIS elements and investigate how to reduce the beam split effect.

References

  • [1] J. Tan and L. Dai, “THz precoding for 6G: challenges, solutions, and opportunities,” IEEE Wireless Commun., doi: 10.1109/MWC.015.2100674.
  • [2] A. M. Elbir, K. V. Mishra, and S. Chatzinotas, “Terahertz-band joint ultra-massive MIMO radar-communications: model-based and model-free hybrid beamforming,” IEEE J. Sel. Top. Signal Process., vol. 15, no. 6, pp. 1468-1483, Nov. 2021.
  • [3] W. Hao, G. Sun, M. Zeng, Z. Chu, Z. Zheng, O. A. Dobre, and P. Xiao, “Robust design for intelligent reflecting surface assisted MIMO-OFDMA Terahertz IoT networks,” IEEE Internet Things J., vol. 8, no. 16, pp. 13052-13064, Aug. 2021.
  • [4] M. Giordani, M. Polese, M. Mezzavilla, S. Rangan, and M. Zorzi, “Toward 6G networks: use cases and technologies,” IEEE Commun. Mag., vol. 58, no. 3, pp. 55-61, Mar. 2020.
  • [5] F. Gao, B. Wang, C. Xing, J. An, and G. Y. Li, “Wideband beamforming for hybrid massive MIMO terahertz communications,” IEEE J. Sel. Areas Commun., vol. 39, no. 6, pp. 1725-1740, Jun. 2021.
  • [6] C. Huang et al., “Multi-hop RIS-empowered terahertz communications: a DRL-based hybrid beamforming design,” IEEE J. Sel. Areas Commun., vol. 39, no. 6, pp. 1663-1677, Jun. 2021.
  • [7] L. Yan, Y. Chen, C. Han, and J. Yuan, “Joint inter-path and intra-path multiplexing for terahertz widely-spaced multi-subarray hybrid beamforming systems,” IEEE Trans. Commun., vol. 70, no. 2, pp. 1391-1406, Feb. 2022.
  • [8] Q. Wu, S. Zhang, B. Zheng, C. You, and R. Zhang, “Intelligent reflecting surface aided wireless communications: a tutorial,” IEEE Trans. Commun., vol. 69, no. 5, pp. 3313-3351, May 2021.
  • [9] C. Huang, S. Hu, G. C. Alexandropoulos, A. Zappone, C. Yuen, R. Zhang, M. D. Renzo, and M. Debbah, “Holographic MIMO surfaces for 6G wireless networks: opportunities, challenges, and trends,” IEEE Wireless Commun., vol. 27, no. 5, pp. 118-125, Oct. 2020.
  • [10] Z. Chen et al., “Terahertz wireless communications for 2030 and beyond: a cutting-edge frontier,” IEEE Commun. Mag., vol. 59, no. 11, pp. 66-72, Nov. 2021.
  • [11] F. Sohrabi and W. Yu, “Hybrid analog and digital beamforming for mmWave OFDM large-scale antenna arrays,” IEEE J. Sel. Areas Commun., vol. 35, no. 7, pp. 1432-1443, Jul. 2017.
  • [12] S. Park, A. Alkhateeb, and R. W. Heath, “Dynamic subarrays for hybrid precoding in wideband mmWave MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2907-2920, May 2017.
  • [13] L. Dai, B. Wang, M. Peng, and S. Chen, “Hybrid precoding-based millimeter-wave massive MIMO-NOMA with simultaneous wireless information and power transfer,” IEEE J. Sel. Areas Commun., vol. 37, no. 1, pp. 131-141, Jan. 2019.
  • [14] H. Li, M. Li, and Q. Liu, “Hybrid beamforming with dynamic subarrays and low-resolution PSs for mmWave MU-MISO systems,” IEEE Trans. Commun., vol. 68, no. 1, pp. 602-614, Jan. 2020.
  • [15] L. Dai, J. Tan, Z. Chen, and H. V. Poor, “Delay-phase precoding for wideband THz massive MIMO,” IEEE Trans. Wireless Commun., vol. 21, no. 9, pp. 7271-7286, Sept. 2022.
  • [16] X. Ai, Z. He, Y. Gao, and C. Han, “A joint beam control method of optical TTDs and phase shifters in phased array antennas,” in Proc. Int. Conf. Commun. Circuits Syst., 2004, pp. 880-882.
  • [17] W. Hao, F. Zhou, M. Zeng, O. A. Dobre, and N. Al-Dhahir, “Ultra wide band THz IRS communications: Applications, challenges, key techniques, and research opportunities,” IEEE Netw., vol. 36, no. 6, pp. 214-220, Dec. 2022.
  • [18] W. Hao, X. You, F. Zhou, Z. Chu, G. Sun, and P. Xiao, “The far-/near-field beam squint and solutions for THz intelligent reflecting surface communications,” IEEE Trans. Veh. Technol., vol. 72, no. 8, pp. 10107-10118, Sept. 2023.
  • [19] H. Hashemi, T. -s. Chu, and J. Roderick, “Integrated true-time-delay-based ultra-wideband array processing,” IEEE Commun. Mag., vol. 46, no. 9, pp. 162-172, Sept. 2008.
  • [20] B. Zhai, A. Tang, C. Peng, and X. Wang, “SS-OFDMA: spatial-spread orthogonal frequency division multiple access for terahertz networks,” IEEE J. Sel. Areas Commun., vol. 39, no. 6, pp. 1678-1692, Jun. 2021.
  • [21] J. Tan and L. Dai, “Delay-phase precoding for THz massive MIMO with beam split,” in Proc. IEEE Global Commun. Conf. (GLOBECOM’19), pp. 1-6, Dec. 2019.
  • [22] J. Tan and L. Dai, “Wideband beam tracking in THz massive MIMO systems,” IEEE J. Sel. Areas Commun., vol. 39, no. 6, pp. 1693-1710, Jun. 2021.
  • [23] G. F. Ricciardi, J. R. Connelly, H. A. Krichene, and M. T. Ho, “A fast-performing error simulation of wideband radiation patterns for large planar phased arrays with overlapped subarray architecture,” IEEE Trans. Antennas Propag., vol. 62, no. 4, pp. 1779-1788, Apr. 2014.
  • [24] F. Hu and K. Mouthaan, “A 1-20 GHz 400 ps true-time delay with small delay error in 0.13 μm CMOS for broadband phased array antennas,” in Proc. IEEE MTT-S Int. Microw. Symp. (IMS’15), 2015, pp. 1-3.
  • [25] V. Boljanovic et al., “Fast beam training with true-time-delay arrays in wideband millimeter-wave systems,” IEEE Trans. Circuits Syst. I: Regular Papers, vol. 68, no. 4, pp. 1727-1739, Apr. 2021.
  • [26] L. Yan, C. Han, and J. Yuan, “Energy-efficient dynamic-subarray with fixed true-time-delay design for Terahertz wideband hybrid beamforming,” IEEE J. Sel. Areas Commun., vol. 40, no. 10, pp. 2840-2854, Oct. 2022.
  • [27] Z. Liang, Y. Lin, H. Sun, Z. Wang, and B. Ding, “Application of multi-layer subarray design in wideband large phased array antenna,” Journal Microwaves, vol. 37, no. 5, pp. 23-28, Oct. 2021.
  • [28] W. Yan, W. Hao, C. Huang, G. Sun, O. Muta, and H. Gacanin, “Beamforming analysis and design for wideband THz reconfigurable intelligent surface communications,” IEEE J. Sel. Areas Commun., vol. 41, no. 8, pp. 2306-2320, Aug. 2023.
  • [29] R. Su, L. Dai, and D. W. K. Ng, “Wideband precoding for RIS-aided THz communications,” IEEE Trans. Commun., doi: 10.1109/TCOMM.2023.3263230.
  • [30] Z. Wang, X. Mu, J. Xu, and Y. Liu, “Simultaneously transmitting and reflecting surface (STARS) for Terahertz communications,” IEEE J. Sel. Topics Signal Process., doi: 10.1109/JSTSP.2023.3279621.
  • [31] Y. Lu, M. Hao, and R. Mackenzie, “Reconfigurable intelligent surface based hybrid precoding for THz communications,” Intelligent Converged Netw., vol. 3, no. 1, pp. 103-118, Mar. 2022.
  • [32] A. Najjar, M. El-Absi and T. Kaiser, “Non-overlapped subarrays based wideband delay-phase hybrid beamforming,” 2022 Fifth International Workshop on Mobile Terahertz Systems (IWMTS), 2022, pp. 1-5.
  • [33] R. Piesiewicz, T. Kleine-Ostmann, N. Krumbholz, D. Mittleman, M. Koch, J. Schoebel, and T. Kurner, “Short-range ultra-broadband terahertz communications: concepts and perspectives,” IEEE Antennas Propag. Mag., vol. 49, no. 6, pp. 24-39, Dec. 2007.
  • [34] R. Rotman, M. Tur, and L. Yaron, “True time delay in phased arrays,” Proc. IEEE, vol. 104, no. 3, pp. 504-518, Mar. 2016.
  • [35] M. Cho, I. Song, and J. D. Cressler, “A true time delay-based SiGe bi-directional T/R chipset for large-scale wideband timed array antennas,” in Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC’18), 2018, pp. 272-275.
  • [36] J. C. Jeong, I. B. Yom, J. D. Kim, W. Y. Lee, and C. H. Lee, “A 6-18-GHz GaAs multifunction chip with 8-bit true time delay and 7-bit amplitude control,” IEEE Trans. Microw. Theory and Techn., vol. 66, no. 5, pp. 2220-2230, May 2018.
  • [37] J. Kim, J. Park, and J. G. Kim, “CMOS true-time delay IC for wideband phased-array antenna,” ETRI J., vol. 40, no. 6, pp. 693-698, May 2018.
  • [38] X. Zheng, L. Liu, H. Guo, and J. Yang, “A study on expanding instantaneous bandwidth of phased array antenna,” Modern Radar, vol. 36, no. 11, pp. 40-44, Nov. 2014.
  • [39] A. M. Sayeed, “Deconstructing multiantenna fading channels,” IEEE Trans. Signal Process., vol. 50, no. 10, pp. 2563-2579, Oct. 2002.
  • [40] J. Zhang, J. Li, and H. Sun, “A study on layered scheme of real-time delayers for the wideband phased array,” Modern Radar, vol. 32, no. 7, pp. 75-78, Jul. 2010.
  • [41] J. R. Magnus and H. Neudecker, Matrix differential calculus with applications in statistics and econometrics. John Wiley & Sons, 2019.
  • [42] X. Qi, G. Xie, and Y. Liu, “Energy-efficient power allocation in multi-user mmWave-NOMA systems with finite resolution analog precoding,” IEEE Trans. Veh. Technol., vol. 71, no. 4, pp. 3750-3759, Apr. 2022.
  • [43] Q. Shi, M. Razaviyayn, Z.-Q. Luo, and C. He, “An iteratively weighted MMSE approach to distributed sum-utility maximization for a MIMO interfering broadcast channel,” IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4331-4340, Sept. 2011.
  • [44] F. Sohrabi and W. Yu, “Hybrid analog and digital beamforming for mmWave OFDM large-scale antenna arrays,” IEEE J. Sel. Areas Commun., vol. 35, no. 7, pp. 1432-1443, Jul. 2017.