Hierarchical-Absolute Reciprocity Calibration for Millimeter-wave Hybrid Beamforming Systems

Li Chen, Rongjiang Nie, Yunfei Chen, and Weidong Wang

Li Chen, Rongjiang Nie, and Weidong Wang are with the CAS Key Laboratory of Wireless Optical Communication, University of Science and Technology of China, Hefei 230027, China (e-mail: [email protected]; [email protected]; [email protected]). Yunfei Chen is with the School of Engineering, University of Warwick, Coventry CV4 7AL, U.K. (e-mail: [email protected]).

Abstract

In time-division duplexing (TDD) millimeter-wave (mmWave) massive multiple-input multiple-output (MIMO) systems, the reciprocity mismatch severely degrades the performance of the hybrid beamforming (HBF). In this work, to mitigate the detrimental effect of the reciprocity mismatch, we investigate reciprocity calibration for the mmWave-HBF system with a fully-connected phase shifter network. To reduce the overhead and computational complexity of reciprocity calibration, we first decouple digital radio frequency (RF) chains and analog RF chains with beamforming design. Then, the entire calibration problem of the HBF system is equivalently decomposed into two subproblems corresponding to the digital-chain calibration and analog-chain calibration. To solve the calibration problems efficiently, a closed-form solution to the digital-chain calibration problem is derived, while an iterative-alternating optimization algorithm for the analog-chain calibration problem is proposed. To measure the performance of the proposed algorithm, we derive the Cramér-Rao lower bound on the errors in estimating mismatch coefficients. The results reveal that the estimation errors of mismatch coefficients of digital and analog chains are uncorrelated, and that the mismatch coefficients of receive digital chains can be estimated perfectly. Simulation results are presented to validate the analytical results and to show the performance of the proposed calibration approach.

Index Terms:

Calibration, hybrid beamforming, massive MIMO, millimeter-wave, reciprocity mismatch.

Introduction

In time-division duplexing (TDD) massive multiple-input multiple-output (MIMO) systems, the base station (BS) estimates the downlink channel state information (CSI) by exploiting the reciprocity of the wireless channel, which relieves the overhead of acquiring CSI [1, 2].

In practice, the estimated CSI is composed of not only the wireless propagation channel response but also the radio frequency (RF) response of RF chains [3]. The transmit and receive RF chains consist of different RF components. The transmit chain is composed of a digital-to-analog converter, a power amplifier, etc., while the receive RF chain consists of an analog-to-digital converter, a low noise amplifier, etc. Due to the different compositions, the RF responses of transmit and receive chains are generally asymmetric, which results in the reciprocity mismatch of the uplink and downlink channels [4].

The study of reciprocity mismatch has attracted extensive attention in the past decade and can be mainly classified into impact analysis and calibration design. To examine the impact of the reciprocity mismatch, some theoretical analyses have been provided for massive MIMO systems with linear precoding techniques, e.g., the zero-forcing (ZF) and matched filter (MF). W. Zhang et al. in [5] studied the performance of the multi-user massive MIMO system with regularized ZF and MF precoding. They found that the reciprocity mismatch hardly caused any performance loss in the low signal-to-noise ratio (SNR) regime but severe performance loss in the high SNR regime. The theoretical results in [6] revealed that the reciprocity mismatch at the BS side was the key contributing factor to the multi-user interference and led to severe system performance degradation for the ZF precoding, while the mismatch at the user equipment (UE) side only led to very slight performance loss. Further, the theoretical comparison of MF and ZF precoding in [7, 8] indicated that the ZF-precoded system was more sensitive to the reciprocity mismatch than the MF-precoded system. The experimental results in [9] verified the theoretical conclusions of the system performance in the presence of the reciprocity mismatch.

Since the reciprocity mismatch causes severe system performance degradation, reciprocity calibration plays an essential role in the deployment of massive MIMO systems. Unlike the CSI estimation error [10], which changes with each channel realization, the reciprocity mismatch coefficients remain constant over hours or even days, so reciprocity calibration can be performed infrequently, e.g., once an hour. Reciprocity calibration techniques can be mainly classified into two categories: hardware-based calibration and over-the-air (OTA) calibration. Hardware-based calibration uses auxiliary circuits and components to connect the transmit RF chains and the receive RF chains. A real-time hardware-based calibration was first proposed in [11] for narrowband conventional MIMO systems, where the transmitted data signals were used to calibrate the antenna array. Then, A. Bourdoux et al. in [12] proposed a calibration approach for wideband systems which calibrated the subcarriers separately. To reduce the transceiver interconnection effort, a daisy-chain interconnection structure of the hardware circuits was proposed in [13], which also reduces the hardware cost of realizing reciprocity calibration to a certain extent. To study the trade-off between the connection structure and the performance of hardware-based calibration, X. Luo et al. in [14] proposed an optimal interconnection by minimizing the Cramér-Rao lower bound (CRLB) of the mismatch coefficients, which revealed that the star structure of hardware circuits was optimal. The hardware cost and circuit complexity of these hardware-based calibration methods increase with the number of antennas and may be unaffordable in massive MIMO systems.

Different from hardware-based calibration, OTA calibration is based on software and protocol design, and uses only air-interface signals between uncalibrated antennas to compute the calibration coefficients [15]. OTA calibration approaches can be divided into full-end OTA calibration, which was mainly used for conventional MIMO systems, and partial-end OTA calibration, which was designed for massive MIMO systems. In conventional systems, OTA calibration requires both the BS and the UE to participate in the operation, and is therefore known as full-end calibration. Full-end reciprocity calibration was first proposed in [16], where the total least squares (LS) algorithm was applied to solve for the calibration coefficients. Then, in [17], full-end calibration was extended to OFDM systems with each subcarrier calibrated independently in the frequency domain. In this case, the overhead and complexity of the reciprocity calibration increased with the number of subcarriers. To reduce the overhead and complexity, B. Kouassi et al. in [18] proposed a time-domain calibration for OFDM systems, because the number of coefficients in the time domain was much smaller than that in the frequency domain. Since the overhead of channel feedback increases with the number of antennas, full-end calibration would impose a heavy overhead in massive MIMO systems.

Thanks to the theoretical and experimental results that the reciprocity mismatch at the single-antenna UE only causes minor performance loss, the OTA calibration only needs to be performed at the BS side, which is known as partial-end calibration or one-side calibration. C. Shepard et al. in [19] proposed a simple one-side calibration for the massive MIMO Argos prototype, which was sensitive to the fading channel and the location of the reference antenna. To avoid this issue of the Argos calibration, a partial-end calibration based on the strong mutual coupling between adjacent antennas was presented in [20]. By summarizing existing partial-end calibration approaches, X. Jiang et al. proposed an OTA calibration framework in [21]. Compared with co-located systems, calibration in distributed systems needs to gather the CSI from access points (APs). To reduce the overhead of gathering the CSI, R. Rogalin et al. in [22] proposed a hierarchical calibration which consisted of the intra-calibration and inter-calibration of APs. In [23], an OTA calibration with supporters was proposed for coordinated multi-point transmission systems to improve the SNR of calibration signals. To combat the path loss between the APs, our work in [24] proposed a beamforming-based OTA calibration for distributed MIMO relaying systems.

Although reciprocity calibration designs for full-digital beamforming (DBF) MIMO systems have been extensively investigated in recent years, they cannot be applied to hybrid analog-digital beamforming (HBF) systems. Because the structure is more complex than that of DBF systems, reciprocity calibration in HBF systems is more challenging. On the one hand, a typical HBF transceiver possesses a hierarchical structure consisting of the digital precoder, digital RF chains, the analog precoder, and analog RF chains [25], which results in more complex modeling of the uplink-downlink channel reciprocity mismatch. On the other hand, the digital RF chains and the analog RF chains are coupled through the analog precoder, e.g., a phase-shifter network [26], so that the digital chains and analog chains cannot transmit signals independently. A reciprocity calibration for the sub-connected phase-shifter network HBF system was proposed in [27], which transformed the sub-connected HBF transceiver into a DBF transceiver by virtually moving the RF components to the front end near the antennas. When it is applied to the fully-connected HBF system, the dimension of the equivalent channel matrix after the transformation becomes much larger than that of the actual channel, which results in a large calibration overhead. Additionally, since this reciprocity calibration can only acquire the ratio of the coefficients of transmit and receive RF chains, mmWave channel estimation approaches, e.g., the approaches in [28, 29, 30], which require the mismatch coefficients rather than their ratios, remain unusable. To reduce the overhead and recover mmWave channel estimation, a relative reciprocity calibration approach was proposed for the fully-connected mmWave HBF system in [31]. Although this approach can reduce the calibration overhead to a certain extent, it requires the UE to feed back the received downlink calibration signals to the BS, which can still cause a large overhead. Further, since the relative calibration cannot construct the equivalent channel, some existing hybrid beamforming designs, e.g., the designs proposed in [32, 33, 34], cannot be applied in the calibrated systems.

Motivated by the above observations, we investigate reciprocity calibration for TDD mmWave-HBF systems with the fully-connected phase shifter network. To reduce the overhead and complexity of reciprocity calibration in the fully-connected HBF system, a hierarchical approach is employed to calibrate the digital and analog RF chains. Since digital and analog RF chains are physically coupled via a phase shifter network, we propose a beamforming design to virtually decouple the reciprocity calibration of digital and analog RF chains. Based on the decoupling operation, the entire reciprocity calibration problem is decomposed into two subproblems corresponding to the calibrations of digital RF chains and analog RF chains. To guarantee the applicability of mmWave channel estimation approaches, we propose an absolute reciprocity calibration approach to estimate the mismatch coefficients of transmit and receive RF chains. The mismatch coefficients of digital RF chains are obtained from the closed-form solution to the digital-chain problem, while the mismatch coefficients of analog chains are jointly estimated with the mmWave channel coefficients. Finally, the CRLB of the mismatch coefficients is derived to measure the performance of the proposed calibration. The main contributions of this work can be summarized as follows.

• Reciprocity mismatch decoupling. Since digital and analog RF chains are physically coupled via a phase shifter network, we propose a beamforming design to virtually decouple the digital and analog RF chains. Then, the entire reciprocity mismatch calibration problem of the HBF system is decomposed into two separate problems of digital-chain calibration and analog-chain calibration.

• Absolute reciprocity calibration. To guarantee the efficacy of mmWave-channel estimation approaches, we propose novel estimation methods to acquire the mismatch coefficients of RF chains. Specifically, the closed-form expression of digital-chain mismatch coefficients is derived, and an iterative-alternating estimation algorithm is proposed for analog-chain mismatch coefficients.

• CRLB for estimating mismatch coefficients. To measure the performance of the proposed algorithms, we derive the CRLB for the mismatch coefficient estimation. The CRLB reveals that the errors in estimating mismatch coefficients of digital chains and analog chains are independent of each other, and the mismatch coefficients of receive digital chains can be estimated perfectly.

The rest of the paper is organized as follows. Section II describes the system model. The hierarchical-absolute reciprocity calibration for the mmWave-HBF system is proposed in Section III. In Section IV, the performance including the overhead, computational complexity, and CRLB of the proposed calibration is derived. Simulation results are given in Section V, and the conclusion is given in Section VI. For readability, some proofs are deferred to the supplementary material.

Throughout the paper, vectors and matrices are denoted in bold lowercase and uppercase respectively, e.g., $\mathbf{a}$ and $\mathbf{A}$ . Let $\mathbf{A}^{T}$ , $\mathbf{A}^{H}$ , and $\mathbf{A}^{-1}$ denote the transpose, conjugate transpose, and inverse of a matrix $\mathbf{A},$ respectively. $\mathrm{tr}(\cdot)$ , $\mathbb{E}(\cdot)$ , and $\mathrm{vec}(\cdot)$ stand for the trace operator, the expectation operation, and column vectorization. Let $|a|$ and $\angle a$ denote the amplitude and phase of the complex number $a$ , and $\|\cdot\|_{\mathrm{F}}$ denotes the Frobenius norm. $\mathrm{diag}(a_{1},\cdots,a_{N})$ denotes an $N$ by $N$ diagonal matrix with diagonal entries given by $a_{1},\cdots,a_{N}$ , and $\mathrm{blkdiag}(\mathbf{a}_{1},\cdots,\mathbf{a}_{N})$ represents a block diagonal matrix. $\otimes$ , $\odot$ , and $\circ$ represent the Kronecker product, Khatri–Rao product, and Hadamard product, respectively. $\mathbb{C}$ and $\mathbb{R}$ stand for the complex numbers and real numbers, respectively. Let $[1:N]$ denote the set $\left\{1,2,\cdots,N\right\}$ , and $a\%b$ denote the remainder of $a$ divided by $b$ .

I System Model

Figure 1: Hybrid beamforming massive MIMO with reciprocity mismatch.

We consider an mmWave massive MIMO system as illustrated in Fig. 1, where the BS is assumed to communicate with a single UE. The BS is equipped with $M_{\mathrm{t}}$ digital RF chains and $N_{\mathrm{t}}$ analog RF chains, and the UE is equipped with $M_{\mathrm{r}}$ digital RF chains and $N_{\mathrm{r}}$ analog RF chains. At both the BS and the UE, each analog chain is connected to an antenna in the uniform linear array (ULA), and the digital chains are connected to the analog chains via a fully-connected phase shifter network.

In mmWave systems, the wireless channel is generally considered to possess limited scattering. Thus, we adopt a geometric channel model with $K\ (K\ll N_{\mathrm{t}},N_{\mathrm{r}})$ scatterers, each of which contributes a single propagation path between the BS and UE. Based on these assumptions, the wireless channel between the BS and the UE can be modeled as

\mathbf{H}=\sqrt{\frac{N_{\mathrm{t}}N_{\mathrm{r}}}{K}}\sum_{k=1}^{K}\alpha_{k}\mathbf{a}_{\mathrm{t}}(\theta_{k})\mathbf{a}_{\mathrm{r}}^{T}(\phi_{k}),

(1)

where $\alpha_{k}\sim\mathcal{CN}(0,\sigma_{\alpha}^{2})$ is the complex gain of the $k$ -th path, $\mathbf{a}_{\mathrm{t}}(\theta_{k})\in\mathbb{C}^{N_{\mathrm{t}}}$ and $\mathbf{a}_{\mathrm{r}}(\phi_{k})\in\mathbb{C}^{N_{\mathrm{r}}}$ denote the array steering vectors of the BS and UE which are given by

\begin{split}\mathbf{a}_{\mathrm{t}}(\theta_{k})&=\left[1,e^{-j\frac{2\pi d}{\lambda}\sin\theta_{k}},\cdots,e^{-j\frac{2\pi d}{\lambda}(N_{\mathrm{t}}-1)\sin\theta_{k}}\right]^{T},\\ \mathbf{a}_{\mathrm{r}}(\phi_{k})&=\left[1,e^{-j\frac{2\pi d}{\lambda}\sin\phi_{k}},\cdots,e^{-j\frac{2\pi d}{\lambda}(N_{\mathrm{r}}-1)\sin\phi_{k}}\right]^{T},\end{split}

(2)

where $\lambda$ is the wavelength of the carrier, $d$ is the spacing between adjacent antennas, set to $\lambda/2$, and $\theta_{k}\in[-\pi/2,\pi/2)$ and $\phi_{k}\in[-\pi/2,\pi/2)$ are the azimuth angles of arrival or departure (AoAs/AoDs) at the BS and UE, respectively.
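As an illustration of (1) and (2), the following NumPy sketch draws one realization of the geometric channel. It is a minimal example rather than the simulation setup of this paper: the function names, the array sizes, and the uniform distribution of the angles over $[-\pi/2,\pi/2)$ are assumptions made only for this example.

```python
import numpy as np

def ula_steering(n_ant, angle, d_over_lambda=0.5):
    """ULA steering vector as in (2): entries exp(-j*2*pi*(d/lambda)*n*sin(angle))."""
    n = np.arange(n_ant)
    return np.exp(-1j * 2 * np.pi * d_over_lambda * n * np.sin(angle))

def geometric_channel(n_t, n_r, n_paths, sigma_alpha=1.0, rng=None):
    """Draw H (n_t x n_r) as in (1) with K = n_paths propagation paths."""
    rng = np.random.default_rng() if rng is None else rng
    # alpha_k ~ CN(0, sigma_alpha^2): half of the variance in each real dimension
    alpha = sigma_alpha * (rng.standard_normal(n_paths)
                           + 1j * rng.standard_normal(n_paths)) / np.sqrt(2)
    # Assumption of this sketch: angles drawn uniformly in [-pi/2, pi/2)
    theta = rng.uniform(-np.pi / 2, np.pi / 2, n_paths)
    phi = rng.uniform(-np.pi / 2, np.pi / 2, n_paths)
    H = np.zeros((n_t, n_r), dtype=complex)
    for a, th, ph in zip(alpha, theta, phi):
        # np.outer(x, y) computes x y^T without conjugation, matching a_t a_r^T
        H += a * np.outer(ula_steering(n_t, th), ula_steering(n_r, ph))
    return np.sqrt(n_t * n_r / n_paths) * H

H = geometric_channel(n_t=16, n_r=8, n_paths=3, rng=np.random.default_rng(0))
print(H.shape, np.linalg.matrix_rank(H, tol=1e-9))
```

Because $\mathbf{H}$ is a sum of $K$ rank-one terms, its rank cannot exceed $K$, which is the limited-scattering property that the model is meant to capture.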

In a practical system, the receive and transmit RF chains are generally asymmetric. Let $\mathbf{T}_{1}/\mathbf{R}_{1}$ and $\mathbf{T}_{2}/\mathbf{R}_{2}$ represent the mismatch matrices of the transmit/receive digital and analog RF chains of the BS, and denote $\mathbf{V}_{1}/\mathbf{U}_{1}$ and $\mathbf{V}_{2}/\mathbf{U}_{2}$ as the mismatch matrices of the transmit/receive digital and analog RF chains of the UE. All of these matrices are diagonal and defined as

\begin{split}&\mathbf{T}_{1}=\mathrm{diag}(t_{1,1},\cdots,t_{1,M_{\mathrm{t}}}),\ \mathbf{T}_{2}=\mathrm{diag}(t_{2,1},\cdots,t_{2,N_{\mathrm{t}}}),\\ &\mathbf{R}_{1}=\mathrm{diag}(r_{1,1},\cdots,r_{1,M_{\mathrm{t}}}),\ \mathbf{R}_{2}=\mathrm{diag}(r_{2,1},\cdots,r_{2,N_{\mathrm{t}}}),\\ &\mathbf{V}_{1}=\mathrm{diag}(v_{1,1},\cdots,v_{1,M_{\mathrm{r}}}),\ \mathbf{V}_{2}=\mathrm{diag}(v_{2,1},\cdots,v_{2,N_{\mathrm{r}}}),\\ &\mathbf{U}_{1}=\mathrm{diag}(u_{1,1},\cdots,u_{1,M_{\mathrm{r}}}),\ \mathbf{U}_{2}=\mathrm{diag}(u_{2,1},\cdots,u_{2,N_{\mathrm{r}}})\end{split}

(3)

where $t_{1,m}/r_{1,m}$ denotes the mismatch coefficient of the $m$ -th ( $m\in[1:M_{\mathrm{t}}]$ ) transmit/receive digital RF chain of the BS, and $t_{2,i}/r_{2,i}$ denotes the mismatch coefficient of the $i$ -th ( $i\in[1:N_{\mathrm{t}}]$ ) transmit/receive analog RF chain of the BS. At the UE side, $v_{1,\bar{m}}/u_{1,\bar{m}}$ denotes the mismatch coefficient of the $\bar{m}$ -th ( $\bar{m}\in[1:M_{\mathrm{r}}]$ ) transmit/receive digital RF chain, and $v_{2,\bar{i}}/u_{2,\bar{i}}$ denotes the mismatch coefficient of the $\bar{i}$ -th ( $\bar{i}\in[1:N_{\mathrm{r}}]$ ) transmit/receive analog RF chain.

As depicted in Fig. 1, the overall channel observed by the baseband processor is the combination of the wireless channel, the digital RF chain, the phase shifter network, and the analog RF chain, which can be expressed by

\tilde{\mathbf{H}}_{\mathrm{UL}}=\tilde{\mathbf{R}}\mathbf{H}\tilde{\mathbf{V}},\quad\tilde{\mathbf{H}}_{\mathrm{DL}}=\tilde{\mathbf{U}}\mathbf{H}^{T}\tilde{\mathbf{T}},

(4)

where $\tilde{\mathbf{R}}=\mathbf{R}_{1}\mathbf{F}_{\mathrm{r}}^{T}\mathbf{R}_{2}$, $\tilde{\mathbf{T}}=\mathbf{T}_{2}\mathbf{F}_{\mathrm{t}}\mathbf{T}_{1}$, $\mathbf{F}_{\mathrm{r}}\in\mathbb{C}^{N_{\mathrm{t}}\times M_{\mathrm{t}}}$ and $\mathbf{F}_{\mathrm{t}}\in\mathbb{C}^{N_{\mathrm{t}}\times M_{\mathrm{t}}}$ are the analog receive and transmit beamforming matrices of the BS, $\tilde{\mathbf{V}}=\mathbf{V}_{2}\mathbf{B}_{\mathrm{t}}\mathbf{V}_{1}$, $\tilde{\mathbf{U}}=\mathbf{U}_{1}\mathbf{B}_{\mathrm{r}}^{T}\mathbf{U}_{2}$, and $\mathbf{B}_{\mathrm{t}}\in\mathbb{C}^{N_{\mathrm{r}}\times M_{\mathrm{r}}}$ and $\mathbf{B}_{\mathrm{r}}\in\mathbb{C}^{N_{\mathrm{r}}\times M_{\mathrm{r}}}$ are the analog transmit and receive beamforming matrices of the UE.
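The effect of the mismatch on (4) can be checked numerically. The sketch below is illustrative only: the matrix sizes, the random mismatch model (amplitudes in $[0.8,1.2]$ with uniform phases), and the helper names are assumptions. The diagonal factors are ordered digital chain, phase shifters, analog chain so that the products are dimensionally conformable, which matches the ordering used in (5). The same analog beamformer is used for both link directions, so any gap between $\tilde{\mathbf{H}}_{\mathrm{DL}}$ and $\tilde{\mathbf{H}}_{\mathrm{UL}}^{T}$ comes from the RF chains alone.

```python
import numpy as np

rng = np.random.default_rng(1)
M_t, N_t, M_r, N_r = 2, 8, 2, 4  # illustrative sizes, not taken from the paper

def rand_mismatch(n):
    """Diagonal mismatch matrix: amplitude near 1, uniform phase (assumed model)."""
    amp = rng.uniform(0.8, 1.2, n)
    phase = rng.uniform(-np.pi, np.pi, n)
    return np.diag(amp * np.exp(1j * phase))

def effective_channels(H, F, B, T1, T2, R1, R2, V1, V2, U1, U2):
    """Baseband uplink (M_t x M_r) and downlink (M_r x M_t) channels as in (4)."""
    H_ul = (R1 @ F.T @ R2) @ H @ (V2 @ B @ V1)
    H_dl = (U1 @ B.T @ U2) @ H.T @ (T2 @ F @ T1)
    return H_ul, H_dl

def reciprocity_gap(H_ul, H_dl):
    """Relative distance between the downlink channel and the transposed uplink channel."""
    return np.linalg.norm(H_dl - H_ul.T) / np.linalg.norm(H_dl)

H = rng.standard_normal((N_t, N_r)) + 1j * rng.standard_normal((N_t, N_r))  # stand-in channel
F = np.exp(1j * rng.uniform(-np.pi, np.pi, (N_t, M_t)))  # BS phase shifters
B = np.exp(1j * rng.uniform(-np.pi, np.pi, (N_r, M_r)))  # UE phase shifters

# Ideal hardware: every mismatch matrix is the identity, so the channel is reciprocal.
I_Mt, I_Nt, I_Mr, I_Nr = (np.eye(n) for n in (M_t, N_t, M_r, N_r))
gap_ideal = reciprocity_gap(*effective_channels(
    H, F, B, I_Mt, I_Nt, I_Mt, I_Nt, I_Mr, I_Nr, I_Mr, I_Nr))

# Mismatched hardware: independent random coefficients for every chain.
gap_mismatch = reciprocity_gap(*effective_channels(
    H, F, B,
    rand_mismatch(M_t), rand_mismatch(N_t), rand_mismatch(M_t), rand_mismatch(N_t),
    rand_mismatch(M_r), rand_mismatch(N_r), rand_mismatch(M_r), rand_mismatch(N_r)))

print(gap_ideal, gap_mismatch)
```

With identity mismatch matrices the gap is at the level of floating-point rounding, whereas independent transmit and receive coefficients make the two channels differ substantially.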

Based on the channel modeling and system setting, the downlink transmission signal received by the UE can be denoted as

\mathbf{y}=\mathbf{D}_{\mathrm{r}}^{T}\mathbf{U}_{1}\mathbf{B}_{\mathrm{r}}^{T}\mathbf{U}_{2}\mathbf{H}^{T}\mathbf{T}_{2}\mathbf{F}_{\mathrm{t}}\mathbf{T}_{1}\mathbf{W}_{\mathrm{t}}\mathbf{s}+\mathbf{D}_{\mathrm{r}}^{T}\mathbf{U}_{\mathrm{1}}\mathbf{B}_{\mathrm{r}}^{T}\mathbf{n},

(5)

where $\mathbf{D}_{\mathrm{r}}\in\mathbb{C}^{M_{\mathrm{r}}\times M_{\mathrm{r}}}$ is the digital combining matrix of the UE, $\mathbf{W}_{\mathrm{t}}\in\mathbb{C}^{M_{\mathrm{t}}\times M_{\mathrm{t}}}$ denotes the digital precoding matrix of the BS, $\mathbf{s}\in\mathbb{C}^{N_{\mathrm{s}}}$ denotes the data vector satisfying $\mathbb{E}\left\{\mathbf{s}\mathbf{s}^{H}\right\}=\rho_{\mathrm{d}}\mathbf{I}_{N_{\mathrm{s}}}$ , $\rho_{\mathrm{d}}$ denotes the average transmit power, $N_{\mathrm{s}}$ is the number of data streams, and $\mathbf{n}\in\mathbb{C}^{N_{\mathrm{r}}}$ represents the additive white Gaussian noise (AWGN) vector with distribution $\mathbf{n}\sim\mathcal{CN}(\mathbf{0},\sigma_{\mathrm{n}}^{2}\mathbf{I}_{N_{\mathrm{r}}})$ .

In TDD mode, the digital and analog beamforming matrices are computed by the BS based on the knowledge of the uplink CSI. According to (4), the estimated uplink CSI is in general not equal to the downlink channel response, which is known as the reciprocity mismatch of the uplink and downlink channels. With the reciprocity mismatch, the existing beamforming approaches for HBF systems, e.g., [35], fail to achieve satisfactory performance. Further, because the reciprocity mismatch coefficients are unknown, mmWave channel estimation approaches like [28] become invalid. Accordingly, reciprocity calibration is essential for mmWave-HBF systems.

II Reciprocity Calibration for mmWave-HBF System

In this section, the reciprocity calibration approach is proposed. We first introduce an existing reciprocity calibration approach for HBF systems and discuss its limitation in applying to the fully-connected structure. Then, an absolute reciprocity calibration for the mmWave-HBF system is proposed, which takes advantage of the particularity of the fully-connected structure to decouple the calibrations of digital RF chains and analog RF chains.

II-A Conventional Reciprocity Calibration Approach of HBF System

The conventional reciprocity calibration (CRC) of HBF was proposed in [27], which is an extension of the relative calibration of the full-digital MIMO system. The CRC treats the HBF system as a virtual full-digital MIMO with $N_{\mathrm{t}}M_{\mathrm{t}}$ virtual antennas and applies OTA signals to estimate the ratio of the transmit and receive mismatch coefficients, which are also called relative calibration coefficients.

In the CRC, the equivalent transmit and receive mismatch coefficients of the BS are defined as $\mathbf{T}_{\mathrm{eq}}=\mathbf{T}_{1}\otimes\mathbf{T}_{2}$ and $\mathbf{R}_{\mathrm{eq}}=\mathbf{R}_{1}\otimes\mathbf{R}_{2}$, and the equivalent mismatch coefficients of the UE are defined as $\mathbf{V}_{\mathrm{eq}}=\mathbf{V}_{1}\otimes\mathbf{V}_{2}$ and $\mathbf{U}_{\mathrm{eq}}=\mathbf{U}_{1}\otimes\mathbf{U}_{2}$. The equivalent uplink and downlink channels are defined as $\mathbf{H}_{\mathrm{UL,eq}}=(\mathbf{r}_{1}\otimes\mathbf{R}_{2})\mathbf{H}(\mathbf{v}_{1}^{T}\otimes\mathbf{V}_{2})$ and $\mathbf{H}_{\mathrm{DL,eq}}=(\mathbf{u}_{1}\otimes\mathbf{U}_{2})\mathbf{H}^{T}(\mathbf{t}_{1}^{T}\otimes\mathbf{T}_{2})$, where $\mathbf{t}_{1}$, $\mathbf{r}_{1}$, $\mathbf{v}_{1}$, and $\mathbf{u}_{1}$ are the column vectors formed by the diagonal entries of $\mathbf{T}_{1}$, $\mathbf{R}_{1}$, $\mathbf{V}_{1}$, and $\mathbf{U}_{1}$, respectively. Based on these definitions, the equation of the CRC can be written as

\mathbf{H}_{\mathrm{DL,eq}}=\underbrace{\mathbf{U}_{\mathrm{eq}}\mathbf{V}_{\mathrm{eq}}^{-1}}_{\mathbf{C}_{\mathrm{UE}}^{-1}}\mathbf{H}_{\mathrm{UL,eq}}^{T}\underbrace{\mathbf{R}_{\mathrm{eq}}^{-1}\mathbf{T}_{\mathrm{eq}}}_{\mathbf{C}_{\mathrm{BS}}},

(6)

where $\mathbf{C}_{\mathrm{BS}}$ and $\mathbf{C}_{\mathrm{UE}}$ represent the relative calibration matrices of the BS and UE.
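The relation (6) follows from the mixed-product property of the Kronecker product, $(\mathbf{A}\otimes\mathbf{B})(\mathbf{C}\otimes\mathbf{D})=\mathbf{A}\mathbf{C}\otimes\mathbf{B}\mathbf{D}$, and can be confirmed numerically. The sketch below is a sanity check under stated assumptions: the sizes and the random coefficients are arbitrary, and the right-hand Kronecker factors of the equivalent channels are taken with row vectors so that $\mathbf{H}_{\mathrm{UL,eq}}$ is $M_{\mathrm{t}}N_{\mathrm{t}}\times M_{\mathrm{r}}N_{\mathrm{r}}$.

```python
import numpy as np

rng = np.random.default_rng(2)
M_t, N_t, M_r, N_r = 2, 3, 2, 4  # illustrative sizes

def rand_coeffs(n):
    """Random nonzero mismatch coefficients (assumed model for this check)."""
    return rng.uniform(0.8, 1.2, n) * np.exp(1j * rng.uniform(-np.pi, np.pi, n))

t1, r1, t2, r2 = rand_coeffs(M_t), rand_coeffs(M_t), rand_coeffs(N_t), rand_coeffs(N_t)
v1, u1, v2, u2 = rand_coeffs(M_r), rand_coeffs(M_r), rand_coeffs(N_r), rand_coeffs(N_r)
H = rng.standard_normal((N_t, N_r)) + 1j * rng.standard_normal((N_t, N_r))

# Equivalent channels: column vector on the left factor, row vector on the right factor.
H_ul_eq = np.kron(r1[:, None], np.diag(r2)) @ H @ np.kron(v1[None, :], np.diag(v2))
H_dl_eq = np.kron(u1[:, None], np.diag(u2)) @ H.T @ np.kron(t1[None, :], np.diag(t2))

# Equivalent diagonal mismatch matrices.
T_eq = np.kron(np.diag(t1), np.diag(t2))
R_eq = np.kron(np.diag(r1), np.diag(r2))
V_eq = np.kron(np.diag(v1), np.diag(v2))
U_eq = np.kron(np.diag(u1), np.diag(u2))

C_ue_inv = U_eq @ np.linalg.inv(V_eq)   # left factor in (6)
C_bs = np.linalg.inv(R_eq) @ T_eq       # right factor in (6)

rhs = C_ue_inv @ H_ul_eq.T @ C_bs
err = np.linalg.norm(H_dl_eq - rhs) / np.linalg.norm(H_dl_eq)
print(H_ul_eq.shape, H_dl_eq.shape, err)
```

The residual is at rounding level, and only the ratios contained in $\mathbf{C}_{\mathrm{BS}}$ and $\mathbf{C}_{\mathrm{UE}}$ enter the relation, which is why this calibration is called relative.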

To obtain the relative calibration coefficients, it is necessary to acquire the equivalent uplink and downlink CSI. To estimate the equivalent downlink CSI, the BS transmits $L_{\mathrm{crc}}$-length pilots by using $Q_{\mathrm{crc}}$ transmit beamforming matrices, and the UE receives the pilots with $P_{\mathrm{crc}}$ receive beamforming matrices, where $P_{\mathrm{crc}}Q_{\mathrm{crc}}=L_{\mathrm{crc}}$. Assume that a $Q_{\mathrm{crc}}$-length pilot sequence is denoted as $\{\mathbf{x}_{1},\mathbf{x}_{2},\cdots,\mathbf{x}_{Q_{\mathrm{crc}}}\}$, where $\mathbf{x}_{q}$ denotes the pilot during the $[(p-1)Q_{\mathrm{crc}}+q]$-th transmission $(p\in[1:P_{\mathrm{crc}}])$ satisfying $\mathbb{E}\left\{\mathbf{x}_{q}\mathbf{x}_{q}^{H}\right\}=\rho_{\mathrm{c}}\mathbf{I}_{M_{\mathrm{t}}}$. During the training, the digital precoding and combining matrices are set to (scaled) identity matrices, i.e., $\mathbf{D}_{\mathrm{r}}=\mathbf{I}_{M_{\mathrm{r}}}$ and $\mathbf{W}_{\mathrm{t}}=\mathbf{I}_{M_{\mathrm{t}}}/\sqrt{M_{\mathrm{t}}}$, and the signal received by the UE can be written as

\begin{split}\mathbf{y}_{\mathrm{UE},p,q}&=\mathbf{U}_{1}\mathbf{B}_{\mathrm{r},p}\mathbf{H}_{\mathrm{DL}}\mathbf{F}_{\mathrm{t},q}\mathbf{T}_{1}\mathbf{x}_{q}+\mathbf{U}_{1}\mathbf{B}_{\mathrm{r},p}\mathbf{n}_{\mathrm{UE},p,q}\\ &=\mathbf{B}_{\mathrm{eq},p}\mathbf{H}_{\mathrm{DL,eq}}\mathbf{F}_{\mathrm{eq},q}\mathbf{x}_{q}+\mathbf{n}_{\mathrm{eq},p,q},\end{split}

(7)

where $\mathbf{H}_{\mathrm{DL}}=\mathbf{U}_{2}\mathbf{H}^{T}\mathbf{T}_{2}$, $\mathbf{F}_{\mathrm{t},q}$ denotes the analog beamforming matrix at the BS, $\mathbf{B}_{\mathrm{r},p}$ is the analog combining matrix at the UE, $\mathbf{F}_{\mathrm{eq},q}=\mathrm{blkdiag}(\mathbf{f}_{q,1},\mathbf{f}_{q,2},\cdots,\mathbf{f}_{q,M_{\mathrm{t}}})$, $\mathbf{f}_{q,m}$ denotes the $m$-th column of $\mathbf{F}_{\mathrm{t},q}$, $\mathbf{B}_{\mathrm{eq},p}=\mathrm{blkdiag}(\mathbf{b}_{p,1},\mathbf{b}_{p,2},\cdots,\mathbf{b}_{p,M_{\mathrm{r}}})$, and $\mathbf{b}_{p,m}$ is the $m$-th row of $\mathbf{B}_{\mathrm{r},p}$.
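The second equality in (7) moves the digital-chain coefficients into the equivalent channel by means of the block-diagonal beamformers. The following sketch checks the noiseless part of that identity numerically. It is an illustration under assumptions: the sizes, random coefficients, and helper names (`blkdiag_rows`, `blkdiag_cols`) are hypothetical, $\mathbf{B}_{\mathrm{r},p}$ is stored with one row per UE digital chain, and the equivalent channel uses a row vector in its right-hand Kronecker factor so that the product is conformable.

```python
import numpy as np

rng = np.random.default_rng(3)
M_t, N_t, M_r, N_r = 2, 4, 2, 3  # illustrative sizes

def rand_coeffs(n):
    """Random nonzero mismatch coefficients (assumed model for this check)."""
    return rng.uniform(0.8, 1.2, n) * np.exp(1j * rng.uniform(-np.pi, np.pi, n))

def blkdiag_rows(B):
    """Put row m of B (M x N) into block m of an M x (M*N) block-diagonal matrix."""
    M, N = B.shape
    out = np.zeros((M, M * N), dtype=complex)
    for m in range(M):
        out[m, m * N:(m + 1) * N] = B[m]
    return out

def blkdiag_cols(F):
    """Put column m of F (N x M) into block m of an (M*N) x M block-diagonal matrix."""
    N, M = F.shape
    out = np.zeros((M * N, M), dtype=complex)
    for m in range(M):
        out[m * N:(m + 1) * N, m] = F[:, m]
    return out

t1, t2 = rand_coeffs(M_t), rand_coeffs(N_t)
u1, u2 = rand_coeffs(M_r), rand_coeffs(N_r)
H = rng.standard_normal((N_t, N_r)) + 1j * rng.standard_normal((N_t, N_r))
F_tq = np.exp(1j * rng.uniform(-np.pi, np.pi, (N_t, M_t)))  # BS analog precoder
B_rp = np.exp(1j * rng.uniform(-np.pi, np.pi, (M_r, N_r)))  # UE analog combiner
x_q = (rng.standard_normal(M_t) + 1j * rng.standard_normal(M_t)) / np.sqrt(2)

# First line of (7), noise omitted.
H_dl = np.diag(u2) @ H.T @ np.diag(t2)
direct = np.diag(u1) @ B_rp @ H_dl @ F_tq @ np.diag(t1) @ x_q

# Second line of (7): block-diagonal beamformers acting on the equivalent channel.
H_dl_eq = np.kron(u1[:, None], np.diag(u2)) @ H.T @ np.kron(t1[None, :], np.diag(t2))
lifted = blkdiag_rows(B_rp) @ H_dl_eq @ blkdiag_cols(F_tq) @ x_q

err = np.linalg.norm(direct - lifted) / np.linalg.norm(direct)
print(H_dl_eq.shape, err)
```

The equivalent channel has $M_{\mathrm{r}}N_{\mathrm{r}}\times M_{\mathrm{t}}N_{\mathrm{t}}$ entries, compared with $N_{\mathrm{r}}\times N_{\mathrm{t}}$ for the physical channel, which is the source of the CRC overhead.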

Stacking all $L_{\mathrm{crc}}$ received signals into the matrix $\mathbf{Y}_{\mathrm{UE}}=[\bar{\mathbf{Y}}_{\mathrm{UE},1}^{T},\cdots,\bar{\mathbf{Y}}_{\mathrm{UE},P_{\mathrm{crc}}}^{T}]^{T}\in\mathbb{C}^{M_{\mathrm{r}}P_{\mathrm{crc}}\times Q_{\mathrm{crc}}}$ with $\bar{\mathbf{Y}}_{\mathrm{UE},p}=[\mathbf{y}_{\mathrm{UE},p,1},\cdots,\mathbf{y}_{\mathrm{UE},p,Q_{\mathrm{crc}}}]$, the received signal model can be written as

\mathbf{Y}_{\mathrm{UE}}=\tilde{\mathbf{B}}_{\mathrm{eq}}\mathbf{H}_{\mathrm{DL,eq}}\tilde{\mathbf{F}}_{\mathrm{eq}}+\mathbf{N}_{\mathrm{eq}},

(8)

where $\tilde{\mathbf{B}}_{\mathrm{eq}}=[\mathbf{B}_{\mathrm{eq},1}^{T},\mathbf{B}_{\mathrm{eq},2}^{T},\cdots,\mathbf{B}_{\mathrm{eq},P_{\mathrm{crc}}}^{T}]^{T}\in\mathbb{C}^{M_{\mathrm{r}}P_{\mathrm{crc}}\times M_{\mathrm{r}}N_{\mathrm{r}}}$, $\tilde{\mathbf{F}}_{\mathrm{eq}}=[\mathbf{F}_{\mathrm{eq},1}\mathbf{x}_{1},\mathbf{F}_{\mathrm{eq},2}\mathbf{x}_{2},\cdots,\mathbf{F}_{\mathrm{eq},Q_{\mathrm{crc}}}\mathbf{x}_{Q_{\mathrm{crc}}}]\in\mathbb{C}^{M_{\mathrm{t}}N_{\mathrm{t}}\times Q_{\mathrm{crc}}}$, $\mathbf{N}_{\mathrm{eq}}=[\bar{\mathbf{N}}_{\mathrm{eq},1}^{T},\cdots,\bar{\mathbf{N}}_{\mathrm{eq},P_{\mathrm{crc}}}^{T}]^{T}$, and $\bar{\mathbf{N}}_{\mathrm{eq},p}=[\mathbf{n}_{\mathrm{eq},p,1},\cdots,\mathbf{n}_{\mathrm{eq},p,Q_{\mathrm{crc}}}]$. By vectorizing the matrix $\mathbf{Y}_{\mathrm{UE}}$, the received signal can be further written as

\mathrm{vec}(\mathbf{Y}_{\mathrm{UE}})=\mathbf{B}_{\mathrm{crc}}\mathrm{vec}(\mathbf{H}_{\mathrm{DL,eq}})+\mathrm{vec}(\mathbf{N}_{\mathrm{eq}}),

(9)

where $\mathbf{B}_{\mathrm{crc}}=(\tilde{\mathbf{F}}_{\mathrm{eq}}^{T}\otimes\tilde{\mathbf{B}}_{\mathrm{eq}})$. Using the LS approach [36], the equivalent downlink channel is estimated as

\mathrm{vec}(\mathbf{H}_{\mathrm{DL,eq}})=(\mathbf{B}_{\mathrm{crc}}^{H}\mathbf{B}_{\mathrm{crc}})^{-1}\mathbf{B}_{\mathrm{crc}}^{H}\mathrm{vec}(\mathbf{Y}_{\mathrm{UE}}).

(10)
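The vectorization step in (9) and the LS estimate in (10) can be sketched in a few lines of NumPy. This is a generic illustration, not the paper's simulation: random Gaussian matrices stand in for $\tilde{\mathbf{B}}_{\mathrm{eq}}$, $\tilde{\mathbf{F}}_{\mathrm{eq}}$, and $\mathbf{H}_{\mathrm{DL,eq}}$, and the sizes and noise level are arbitrary assumptions. Note that $\mathrm{vec}(\cdot)$ stacks columns, which corresponds to `order="F"` in NumPy.

```python
import numpy as np

rng = np.random.default_rng(4)

def crandn(*shape):
    """Complex Gaussian samples (unnormalized), used only as stand-in data."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def vec(A):
    """Column-major vectorization, matching the vec(.) operator."""
    return A.reshape(-1, order="F")

n_rows, n_cols = 4, 6   # size of the unknown matrix (stand-in for H_DL,eq)
P, Q = 8, 12            # stacked combiner rows and pilot columns

X = crandn(n_rows, n_cols)
B_tilde = crandn(P, n_rows)   # needs full column rank, so P >= n_rows
F_tilde = crandn(n_cols, Q)   # needs full row rank, so Q >= n_cols
Y = B_tilde @ X @ F_tilde + 1e-3 * crandn(P, Q)

# vec(B X F) = (F^T kron B) vec(X), which is the model in (9)
B_crc = np.kron(F_tilde.T, B_tilde)
model_err = np.linalg.norm(B_crc @ vec(X) - vec(B_tilde @ X @ F_tilde))

# LS estimate of (10), solved through the normal equations
x_hat = np.linalg.solve(B_crc.conj().T @ B_crc, B_crc.conj().T @ vec(Y))
X_hat = x_hat.reshape(n_rows, n_cols, order="F")
rel_err = np.linalg.norm(X_hat - X) / np.linalg.norm(X)
print(B_crc.shape, rel_err)
```

In practice `np.linalg.lstsq(B_crc, vec(Y), rcond=None)` yields the same estimate without explicitly forming $\mathbf{B}_{\mathrm{crc}}^{H}\mathbf{B}_{\mathrm{crc}}$, which is numerically preferable when the training matrices are poorly conditioned.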

Similarly, to estimate the equivalent uplink channel, the UE transmits the uplink training pilots to the BS. Further, to enable the BS to estimate the calibration coefficients, the UE feeds back the estimated downlink channel to the BS.

After the BS estimates the uplink channel and receives the downlink channel fed back from the UE, the calibration coefficients can be computed by the following proposition.

Proposition 1 (CRC coefficients).

With the knowledge of the equivalent uplink and downlink channels, the CRC coefficients can be computed by

\mathbf{c}=[1,-\mathbf{h}_{\mathrm{CRC},1}^{T}\mathbf{H}_{\mathrm{CRC},2}^{*}(\mathbf{H}_{\mathrm{CRC},2}^{T}\mathbf{H}_{\mathrm{CRC},2}^{*})^{-1}]^{T},

(11)

where $\mathbf{h}_{\mathrm{CRC},1}$ is the first column of matrix $\mathbf{H}_{\mathrm{CRC}}$ , $\mathbf{H}_{\mathrm{CRC},2}$ consists of the second to last columns of matrix $\mathbf{H}_{\mathrm{CRC}}$ , $\mathbf{H}_{\mathrm{CRC}}$ is an $(N_{\mathrm{t}}M_{\mathrm{t}}+N_{\mathrm{r}}M_{\mathrm{r}})$ -order square matrix defined as $\mathbf{H}_{\mathrm{CRC}}=[\mathbf{I}_{M_{\mathrm{t}}N_{\mathrm{t}}}\odot\mathbf{H}_{\mathrm{UL,eq}}^{T},-\mathbf{H}_{\mathrm{DL,eq}}^{T}\odot\mathbf{I}_{M_{\mathrm{r}}N_{\mathrm{r}}}]$ .

Proof:

The results can be derived based on [27] by assuming that the antennas of the BS are divided into the group $\mathcal{A}$ and the antennas of the UE are divided into the group $\mathcal{B}$ . ∎

To measure the cost of the CRC, we further derive its overhead and computational complexity. The overhead of reciprocity calibration is expressed as the number of channel uses for transmitting calibration signals and feeding back the estimated CSI. The computational complexity is measured by the number of multiplications needed to estimate the channel state information and compute the mismatch coefficients.

Remark 1 (Overhead and complexity of CRC).

According to (10), $\tilde{\mathbf{F}}_{\mathrm{eq}}^{T}\otimes\tilde{\mathbf{B}}_{\mathrm{eq}}$ must be a full column rank matrix to estimate the downlink channel. Based on the property of the Kronecker product, the pilots must satisfy $Q_{\mathrm{crc}}\geq N_{\mathrm{t}}M_{\mathrm{t}}$ and $P_{\mathrm{crc}}\geq N_{\mathrm{r}}$, and hence $L_{\mathrm{crc}}\geq N_{\mathrm{t}}M_{\mathrm{t}}N_{\mathrm{r}}$. This result means that the minimum overhead of downlink channel estimation is $N_{\mathrm{t}}M_{\mathrm{t}}N_{\mathrm{r}}$. Since the uplink channel estimation is similar to the downlink channel estimation, the entire overhead of the CRC can be expressed as $N_{\mathrm{t}}N_{\mathrm{r}}(M_{\mathrm{t}}+M_{\mathrm{r}}+1)$. Further, since the computational complexity is mainly determined by computing the inverse matrices of $\mathbf{B}_{\mathrm{crc}}^{H}\mathbf{B}_{\mathrm{crc}}\in\mathbb{C}^{N_{\mathrm{t}}M_{\mathrm{t}}N_{\mathrm{r}}M_{\mathrm{r}}\times N_{\mathrm{t}}M_{\mathrm{t}}N_{\mathrm{r}}M_{\mathrm{r}}}$ and $\mathbf{H}_{\mathrm{CRC},2}^{T}\mathbf{H}_{\mathrm{CRC},2}^{*}\in\mathbb{C}^{(N_{\mathrm{t}}M_{\mathrm{t}}+N_{\mathrm{r}}M_{\mathrm{r}})\times(N_{\mathrm{t}}M_{\mathrm{t}}+N_{\mathrm{r}}M_{\mathrm{r}})}$, the computational complexity of the CRC is $\mathcal{O}(N_{\mathrm{t}}^{3}M_{\mathrm{t}}^{3}N_{\mathrm{r}}^{3}M_{\mathrm{r}}^{3})$.
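To give a sense of scale, the expressions in Remark 1 can be evaluated for a concrete configuration. The helper below simply encodes those expressions; the function name and the example array sizes are hypothetical and are not taken from the simulation settings of this paper.

```python
def crc_requirements(M_t, N_t, M_r, N_r):
    """Evaluate the overhead and matrix-order expressions quoted in Remark 1."""
    q_min = N_t * M_t                           # Q_crc >= N_t M_t
    p_min = N_r                                 # P_crc >= N_r
    return {
        "q_min": q_min,
        "p_min": p_min,
        "dl_pilots": q_min * p_min,             # L_crc >= N_t M_t N_r
        "total_overhead": N_t * N_r * (M_t + M_r + 1),
        "ls_matrix_order": N_t * M_t * N_r * M_r,  # order of B_crc^H B_crc
    }

# Hypothetical configuration: 64 BS antennas with 4 digital chains,
# 16 UE antennas with 2 digital chains.
req = crc_requirements(M_t=4, N_t=64, M_r=2, N_r=16)
print(req)
```

For this configuration, downlink training alone requires 4096 channel uses, the total overhead is 7168 channel uses, and the LS step involves inverting a matrix of order 8192.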

In the CRC, the dimensions of the equivalent channels $\mathbf{H}_{\mathrm{UL,eq}}$ and $\mathbf{H}_{\mathrm{DL,eq}}$ (see (6)) are much larger than that of the actual wireless channel matrix $\mathbf{H}$ (see (1)), which incurs heavy channel-estimation overhead and high computational complexity. Further, since the CRC only estimates the ratio of the mismatch coefficients of transmit chains and receive chains, mmWave channel estimation, which requires knowledge of the individual mismatch coefficients, becomes invalid. Thus, due to these limitations of the CRC in fully-connected mmWave-HBF systems, we propose a hierarchical-absolute calibration (HAC) approach, which decouples the reciprocity calibration of digital RF chains and analog RF chains and estimates the individual mismatch coefficients of transmit chains and receive chains.

II-B Decouple Principle of HAC

To reduce the overhead and complexity of the reciprocity calibration of fully connected HBF systems, the digital RF chains and analog RF chains must be calibrated individually, i.e., hierarchically. To make existing mmWave channel estimation approaches applicable, the individual mismatch coefficients are required rather than their ratio, which calls for absolute reciprocity calibration.

However, due to the fully-connected structure of the HBF system, HAC encounters two challenges. On the one hand, the digital RF chains and analog RF chains are physically coupled via the fully-connected phase shifter network, which results in the decoupling challenge. On the other hand, the fully-connected phase shifter network prevents the RF chains from transmitting and receiving signals independently, which results in the calibration challenge. To a certain degree, these problems can be addressed by using extra auxiliary circuits to assist the reciprocity calibration, e.g., the calibration approaches presented in [37, 38]. However, the auxiliary circuits may introduce extra non-reciprocity, and the accuracy of hardware-circuit calibration depends heavily on the auxiliary circuits [20]. Thus, we propose an OTA-based HAC for mmWave-HBF systems. Specifically, the calibrations of the digital and analog RF chains are decoupled by a targeted beamforming scheme, and the mismatch coefficients are estimated from the OTA training signals between the BS and UE.

In the rest of this section, we introduce the decoupling principle of the proposed HAC. Since the transmitter and receiver employ similar HBF structures, the decoupling operation is first explained for a multiple-input single-output (MISO) system for clarity, and then the concrete design for the general HBF-MIMO system is presented.

MISO system for decoupling digital chains from the analog chains: We consider a MISO system where the transmitter is equipped with the fully-connected HBF structure and the receiver is equipped with a single antenna, as illustrated in Fig. 2a. Thanks to the fully-connected structure, the signal transmitted from each digital RF chain passes through all antennas of the transmitter and all wireless channels. For example, if the $m$-th transmit digital chain transmits a signal $x_{m}$ to the receiver, the received signal can be denoted as

y_{m}=t_{1,m}\underbrace{\mathbf{h}^{T}\mathbf{T}_{2}\mathbf{f}_{m}}_{h_{\mathrm{eq},m}}x_{m}+n_{m}=t_{1,m}h_{\mathrm{eq},m}x_{m}+n_{m},

(12)

where $h_{\mathrm{eq},m}$ denotes the virtual equivalent channel between the $m$-th transmit digital chain and the receive antenna. Based on this, the HBF MISO system can be viewed as a virtual DBF MISO system, as illustrated in Fig. 2b, where the virtual antennas are the transmit digital chains, and the virtual equivalent channels consist of the phase shifter network, the analog RF chains, and the wireless channel. Further, the virtual equivalent channels are equal to each other when the beamforming vectors are identical, i.e., $\mathbf{f}_{1}=\cdots=\mathbf{f}_{M}$. Thus, by applying this analog beamforming design, the digital chains can be decoupled from the analog chains, and the absolute reciprocity calibration can be treated as a calibration with known channel gains.
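As a rough numerical illustration of this decoupling, the sketch below evaluates (12) without noise when all digital chains share one beamforming vector. All sizes, the mismatch statistics, and the unit pilots are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ant, n_chain = 8, 3                      # hypothetical N (antennas) and M (digital chains)


def crandn(*shape):
    """Unit-variance complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


h = crandn(n_ant)                          # wireless channel vector h
t2 = 1.0 + 0.1 * crandn(n_ant)             # diagonal of T_2 (analog-chain mismatch)
t1 = 1.0 + 0.1 * crandn(n_chain)           # t_{1,m} (digital-chain mismatch)
f = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n_ant))  # shared random-phase beam

F = np.tile(f[:, None], (1, n_chain))      # f_1 = ... = f_M
h_eq = (h * t2) @ F                        # h_eq,m = h^T T_2 f_m for every m
x = np.ones(n_chain)                       # unit pilots (assumed), noise-free
y = t1 * h_eq * x                          # eq. (12) without the noise term

ratio_est = y / y[0]                       # the common h_eq cancels in the ratio
ratio_true = t1 / t1[0]
print(np.allclose(h_eq, h_eq[0]), np.allclose(ratio_est, ratio_true))
```

Because every chain sees the same equivalent channel, the received samples differ only through $t_{1,m}$, which is what lets the digital-chain mismatch be isolated.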

MISO system for decoupling the analog chains from the digital chains: Since each digital RF chain is connected to all analog RF chains via the fully-connected phase shifter network, the antenna array can be considered as an ABF system as shown in Fig. 3a. By using only one digital RF chain to transmit and receive calibration signals, the analog RF chains can be decoupled from the digital RF chains as illustrated in Fig. 3b. Based on this design, the analog RF chains can be calibrated with signal processing approaches.

Concrete design: Based on the above MISO systems, the decoupling principle can be extended to the general case where both the BS and UE are equipped with multiple antennas and HBF structures, as illustrated in Fig. 1. For general point-to-point HBF MIMO systems, the HAC training is divided into two phases, downlink training and uplink training, as illustrated in Fig. 4. During training, the calibration training signals and the beamforming matrices of the BS and UE should be designed to decouple the digital and analog RF chains. Taking the downlink training phase as an example, the concrete designs are given as follows.

•

Downlink training pilots: The entire $L_{\mathrm{d}}$-length downlink training pilots consist of $L_{\mathrm{dr}}$-length pilots for calibrating the digital RF chains and $L_{\mathrm{da}}$-length pilots for calibrating the analog RF chains, where $L_{\mathrm{dr}}+L_{\mathrm{da}}=L_{\mathrm{d}}$. To increase the degrees of freedom of the received signals, the $L_{\mathrm{dr}}$-length pilots are transmitted by using $Q_{\mathrm{dr}}$ transmit beamforming matrices and received by using $P_{\mathrm{dr}}$ receive beamforming matrices, where $L_{\mathrm{dr}}=Q_{\mathrm{dr}}P_{\mathrm{dr}}$. The $L_{\mathrm{da}}$-length pilots have the same structure, i.e., $L_{\mathrm{da}}=Q_{\mathrm{da}}P_{\mathrm{da}}$. By using $\{\mathbf{x}_{1},\cdots,\mathbf{x}_{Q_{\mathrm{max}}}\}$ to represent the pilot set, the transmitted pilot $\mathbf{x}_{\mathrm{d},l}$ during the $l$-th transmission can be denoted as $\mathbf{x}_{\mathrm{d},l}=\mathbf{x}_{q}$, where $\mathbb{E}\left\{\mathbf{x}_{q}\mathbf{x}_{q}^{H}\right\}=\rho_{\mathrm{c}}\mathbf{I}_{M_{\mathrm{t}}}$, $q=l\%Q_{\mathrm{dr}}$ when $l\leq L_{\mathrm{dr}}$, $q=(l-L_{\mathrm{dr}})\%Q_{\mathrm{da}}$ when $L_{\mathrm{dr}}<l\leq L_{\mathrm{d}}$, and $Q_{\mathrm{max}}=\max\{Q_{\mathrm{dr}},Q_{\mathrm{da}}\}$.
•

Beamforming design for calibrating digital chains: Let $\mathbf{F}_{\mathrm{dr},q}\in\mathbb{C}^{N_{\mathrm{t}}\times M_{\mathrm{t}}}$ denote the analog transmit beamforming matrix and $\mathbf{B}_{\mathrm{dr},p}\in\mathbb{C}^{N_{\mathrm{r}}\times M_{\mathrm{r}}}$ represent the receive beamforming matrix during the $l$-th transmission, where $l\leq L_{\mathrm{dr}}$, $q=l\%Q_{\mathrm{dr}}$, and $p=l\%P_{\mathrm{dr}}$. To decouple the mismatch of the digital chains from the analog chains, the analog beamforming matrices are designed as $\mathbf{F}_{\mathrm{dr},1}=\cdots=\mathbf{F}_{\mathrm{dr},Q_{\mathrm{dr}}}=\mathbf{f}_{\mathrm{dr}}\mathbf{1}_{M_{\mathrm{t}}}^{T}$ and $\mathbf{B}_{\mathrm{dr},1}=\cdots=\mathbf{B}_{\mathrm{dr},P_{\mathrm{dr}}}=\mathbf{b}_{\mathrm{dr}}\mathbf{1}_{M_{\mathrm{r}}}^{T}$, where each element of $\mathbf{f}_{\mathrm{dr}}\in\mathbb{C}^{N_{\mathrm{t}}}$ and $\mathbf{b}_{\mathrm{dr}}\in\mathbb{C}^{N_{\mathrm{r}}}$ has a random phase. Further, the digital precoding matrix can be designed as $\mathbf{W}_{\mathrm{dr},q}=\mathbf{I}_{M_{\mathrm{t}}}/\sqrt{M_{\mathrm{t}}}$ and the digital receive combining matrix is given by $\mathbf{D}_{\mathrm{dr},p}=\mathbf{I}_{M_{\mathrm{r}}}$ during the downlink training phase.
•

Beamforming design for calibrating analog chains: During the $l$ -th transmission ( $l>L_{\mathrm{dr}}$ ), the analog transmit beamforming matrix $\mathbf{F}_{\mathrm{da},q}$ and receive beamforming matrix $\mathbf{B}_{\mathrm{da},p}$ can be designed as random phase matrices, i.e., the elements of $\mathbf{F}_{\mathrm{da},q}$ and $\mathbf{B}_{\mathrm{da},p}$ possess random phases. Let $\mathbf{W}_{\mathrm{da},q}$ denote the digital precoding matrix and $\mathbf{D}_{\mathrm{da},p}$ denote the digital receive combining matrix, where $q=(l-L_{\mathrm{dr}})\%Q_{\mathrm{da}}$ , and $p=(l-L_{\mathrm{dr}})\%P_{\mathrm{da}}$ . To decouple the mismatch of analog RF chains from the mismatch of digital RF chains, the digital precoding matrix can be designed as $\mathbf{W}_{\mathrm{da},q}=\mathrm{blkdiag}(1,\mathbf{0}_{M_{\mathrm{t}}-1,M_{\mathrm{t}}-1})$ , and the digital receive combining matrix can be given by $\mathbf{D}_{\mathrm{da},p}=\mathrm{blkdiag}(1,\mathbf{0}_{M_{\mathrm{r}}-1,M_{\mathrm{r}}-1})$ .
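The beamforming designs listed above can be sketched as follows. This is an illustrative construction with hypothetical array sizes and helper names (`digital_stage_design`, `analog_stage_design`); unit-modulus entries are assumed for the phase shifters.

```python
import numpy as np

n_t, m_t, n_r, m_r = 8, 2, 4, 2    # hypothetical N_t, M_t, N_r, M_r


def random_phase(rng, *shape):
    """Unit-modulus entries with uniformly random phases."""
    return np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=shape))


def digital_stage_design(rng):
    """Designs used while calibrating the digital chains (l <= L_dr)."""
    f_dr = random_phase(rng, n_t)
    b_dr = random_phase(rng, n_r)
    F_dr = np.outer(f_dr, np.ones(m_t))      # F_dr = f_dr 1^T, identical columns
    B_dr = np.outer(b_dr, np.ones(m_r))      # B_dr = b_dr 1^T, identical columns
    W_dr = np.eye(m_t) / np.sqrt(m_t)        # digital precoder
    D_dr = np.eye(m_r)                       # digital combiner
    return F_dr, B_dr, W_dr, D_dr


def analog_stage_design(rng):
    """Designs used while calibrating the analog chains (l > L_dr)."""
    F_da = random_phase(rng, n_t, m_t)       # random-phase analog precoder
    B_da = random_phase(rng, n_r, m_r)       # random-phase analog combiner
    W_da = np.zeros((m_t, m_t))
    W_da[0, 0] = 1.0                         # blkdiag(1, 0): only digital chain 1 is active
    D_da = np.zeros((m_r, m_r))
    D_da[0, 0] = 1.0
    return F_da, B_da, W_da, D_da


rng = np.random.default_rng(2)
F_dr, B_dr, W_dr, D_dr = digital_stage_design(rng)
F_da, B_da, W_da, D_da = analog_stage_design(rng)
print(np.linalg.matrix_rank(F_dr), np.count_nonzero(W_da))
```

The rank-one analog matrices make every digital chain see the same equivalent channel, while the single nonzero entry of `W_da` and `D_da` restricts the analog stage to one digital chain on each side.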

II-C Problem Formulation and Decomposition of HAC

Since the mismatch coefficients of the transmit chains are independent of those of the receive chains, HAC can be divided into downlink HAC and uplink HAC. The downlink HAC calibrates the transmit chains of the BS and the receive chains of the UE, while the uplink HAC calibrates the receive chains of the BS and the transmit chains of the UE. Since the uplink HAC is similar to the downlink HAC, we introduce the signal model, the problem formulation, and the problem decoupling by taking the downlink HAC as an example.

When the BS transmits the $l$-th pilot to the UE, the signal received by the UE can be modeled as

\mathbf{y}_{\mathrm{d},l}=\mathbf{D}_{\mathrm{r},l}^{T}\mathbf{U}_{1}\mathbf{B}_{\mathrm{r},l}^{T}\mathbf{U}_{2}\mathbf{H}^{T}\mathbf{T}_{2}\mathbf{F}_{\mathrm{t},l}\mathbf{T}_{1}\mathbf{W}_{\mathrm{t},l}\mathbf{x}_{\mathrm{d},l}+\tilde{\mathbf{n}}_{\mathrm{d},l},

(13)

where $\tilde{\mathbf{n}}_{\mathrm{d},l}=\mathbf{D}_{\mathrm{r},l}^{T}\mathbf{U}_{1}\mathbf{B}_{\mathrm{r},l}^{T}\mathbf{n}_{\mathrm{d},l}$. When $l\leq L_{\mathrm{dr}}$, $\mathbf{D}_{\mathrm{r},l}=\mathbf{D}_{\mathrm{dr},p}$, $\mathbf{B}_{\mathrm{r},l}=\mathbf{B}_{\mathrm{dr},p}$, $\mathbf{F}_{\mathrm{t},l}=\mathbf{F}_{\mathrm{dr},q}$, and $\mathbf{W}_{\mathrm{t},l}=\mathbf{W}_{\mathrm{dr},q}$, where $p=l\%P_{\mathrm{dr}}$ and $q=l\%Q_{\mathrm{dr}}$. When $l>L_{\mathrm{dr}}$, $\mathbf{D}_{\mathrm{r},l}=\mathbf{D}_{\mathrm{da},p}$, $\mathbf{B}_{\mathrm{r},l}=\mathbf{B}_{\mathrm{da},p}$, $\mathbf{F}_{\mathrm{t},l}=\mathbf{F}_{\mathrm{da},q}$, and $\mathbf{W}_{\mathrm{t},l}=\mathbf{W}_{\mathrm{da},q}$, where $p=(l-L_{\mathrm{dr}})\%P_{\mathrm{da}}$ and $q=(l-L_{\mathrm{dr}})\%Q_{\mathrm{da}}$.

After the BS transmits $L_{\mathrm{d}}$ -length pilots to the UE, the optimization problem for jointly estimating $\mathbf{U}_{1},\ \mathbf{U}_{2},\ \mathbf{T}_{1},\ \mathbf{T}_{2}$ , and $\mathbf{H}$ can be formulated as

\min_{\mathbf{U}_{\mathrm{1}},\mathbf{T}_{\mathrm{1}},\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H}}\quad\sum_{l=1}^{L_{\mathrm{d}}}\left\|\mathbf{y}_{\mathrm{d},l}-\mathbf{D}_{\mathrm{r},l}^{T}\mathbf{U}_{1}\mathbf{B}_{\mathrm{r},l}^{T}\mathbf{H}_{\mathrm{DL}}\mathbf{F}_{\mathrm{t},l}\mathbf{T}_{1}\tilde{\mathbf{x}}_{\mathrm{d},l}\right\|_{\mathrm{F}}^{2},

(14)

where $\mathbf{H}_{\mathrm{DL}}=\mathbf{U}_{2}\mathbf{H}^{T}\mathbf{T}_{2}$ , $\tilde{\mathbf{x}}_{\mathrm{d},l}=\mathbf{W}_{\mathrm{t},l}\mathbf{x}_{\mathrm{d},l}$ . Thanks to the proposed pilots and training scheme design, the above joint optimization problem can be equivalently decoupled into two subproblems demonstrated in the following proposition.

Proposition 2 (HAC problem decoupling).

Based on the specific pilots and training scheme design in Section II-B, the problem of HAC in (14) can be equivalently decoupled into two independent problems as

	$\displaystyle\mathcal{P}_{1}:\min_{\mathbf{u}_{1},\mathbf{t}_{1}}\quad\left\|\mathbf{Y}_{\mathrm{dr}}-(\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{u}_{1})\mathbf{t}_{1}^{T}\mathbf{X}_{\mathrm{dr}}\right\|_{\mathrm{F}}^{2},$		(15)
	$\displaystyle\mathcal{P}_{2}:\min_{\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H}}\quad\left\|\mathbf{Y}_{\mathrm{da}}-\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}\mathbf{H}^{T}\mathbf{T}_{2}\bar{\mathbf{F}}_{\mathrm{da}}\mathbf{X}_{\mathrm{da}}\right\|_{\mathrm{F}}^{2},$		(16)

where $\mathbf{Y}_{\mathrm{dr}}=[\bar{\mathbf{Y}}_{\mathrm{dr},1}^{T},\cdots,\bar{\mathbf{Y}}_{\mathrm{dr},P_{\mathrm{dr}}}^{T}]^{T}$ , $\bar{\mathbf{Y}}_{\mathrm{dr},p}=[\mathbf{y}_{\mathrm{d},(p-1)Q_{\mathrm{dr}}+1},\cdots,\mathbf{y}_{\mathrm{d},pQ_{\mathrm{dr}}}]$ , $\mathbf{u}_{1}$ consists of the diagonal entries of $\mathbf{U}_{1}$ , $\mathbf{t}_{1}$ is composed of the diagonal entries of $\mathbf{T}_{1}$ , $\mathbf{X}_{\mathrm{dr}}=[\mathbf{x}_{1},\cdots,\mathbf{x}_{Q_{\mathrm{dr}}}]$ , $\mathbf{Y}_{\mathrm{da}}=[\mathbf{y}_{\mathrm{da},1}^{T},\cdots,\mathbf{y}_{\mathrm{da},P_{\mathrm{da}}}^{T}]^{T}$ , $\mathbf{y}_{\mathrm{da},p}=[y_{\mathrm{d},(p-1)Q_{\mathrm{da}}+1,1},\cdots,y_{\mathrm{d},pQ_{\mathrm{da}},1}]$ , $\bar{\mathbf{B}}_{\mathrm{da}}=[\mathbf{b}_{\mathrm{da},1,1},\cdots,\mathbf{b}_{\mathrm{da},P_{\mathrm{da}},1}]$ , $\bar{\mathbf{F}}_{\mathrm{da}}=[\mathbf{f}_{\mathrm{da},1,1},\cdots,\mathbf{f}_{\mathrm{da},Q_{\mathrm{da}},1}]$ , $\mathbf{X}_{\mathrm{da}}=\mathrm{diag}(x_{1,1},\cdots,x_{Q_{\mathrm{da}},1})$ , $\mathbf{b}_{\mathrm{da},p,1}$ is the first column of $\mathbf{B}_{\mathrm{da},p}$ , $\mathbf{f}_{\mathrm{da},q,1}$ denotes the first column of $\mathbf{F}_{\mathrm{da},q}$ , and $x_{q,1}$ represents the first entry of $\mathbf{x}_{q}$ .

Proof:

See Appendix A. ∎

Remark 2 (HAC decoupling).

Since solving the problem $\mathcal{P}_{1}$ yields the mismatch coefficients of the transmit digital RF chains of the BS and those of the receive digital RF chains of the UE, it is referred to as the downlink calibration problem of digital RF chains. Similarly, the problem $\mathcal{P}_{2}$ is the downlink calibration problem of analog RF chains. Thus, Proposition 2 indicates that HAC can be decoupled into the calibration of the digital RF chains and the calibration of the analog RF chains, which is the purpose of the hierarchical calibration.
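The structure behind $\mathcal{P}_{1}$ can be checked numerically: with the rank-one analog beamformers of the digital-chain stage, the noise-free received block from (13) collapses to a scalar multiple of $\mathbf{u}_{1}\mathbf{t}_{1}^{T}\mathbf{X}_{\mathrm{dr}}$. The sketch below assumes hypothetical sizes, random mismatch values, and a channel $\mathbf{H}$ of size $N_{\mathrm{t}}\times N_{\mathrm{r}}$; the scalar `g` is the common gain $\mathbf{b}_{\mathrm{dr}}^{T}\mathbf{H}_{\mathrm{DL}}\mathbf{f}_{\mathrm{dr}}/\sqrt{M_{\mathrm{t}}}$, which (15) does not write out explicitly.

```python
import numpy as np

rng = np.random.default_rng(3)
n_t, m_t, n_r, m_r, q_dr = 8, 3, 4, 2, 5   # hypothetical sizes


def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


def phase(*shape):
    return np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=shape))


u1, t1 = 1 + 0.1 * crandn(m_r), 1 + 0.1 * crandn(m_t)   # digital mismatch (diagonals)
u2, t2 = 1 + 0.1 * crandn(n_r), 1 + 0.1 * crandn(n_t)   # analog mismatch (diagonals)
H = crandn(n_t, n_r)                                     # assumed N_t x N_r channel

F = np.outer(phase(n_t), np.ones(m_t))      # F_dr = f_dr 1^T
B = np.outer(phase(n_r), np.ones(m_r))      # B_dr = b_dr 1^T
W = np.eye(m_t) / np.sqrt(m_t)              # W_dr
D = np.eye(m_r)                             # D_dr
X = crandn(m_t, q_dr)                       # pilots x_1, ..., x_Q as columns

H_dl = np.diag(u2) @ H.T @ np.diag(t2)      # H_DL = U_2 H^T T_2
# Noise-free version of (13), stacked over the Q_dr pilots (M_r x Q_dr block).
Y = D.T @ np.diag(u1) @ B.T @ H_dl @ F @ np.diag(t1) @ W @ X

g = (B[:, 0] @ H_dl @ F[:, 0]) / np.sqrt(m_t)   # common scalar gain
Y_model = g * np.outer(u1, t1 @ X)              # g * u_1 t_1^T X
print(np.allclose(Y, Y_model), np.linalg.matrix_rank(Y))
```

The analog mismatch and the channel enter only through the scalar `g`, so the digital-stage observations carry no information that couples $\mathbf{u}_{1},\mathbf{t}_{1}$ with $\mathbf{U}_{2},\mathbf{T}_{2}$.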

II-D Solution to Calibration Problem of HAC

As $\mathcal{P}_{1}$ and $\mathcal{P}_{2}$ are independent of each other, we first find the solution to $\mathcal{P}_{1}$ and then solve $\mathcal{P}_{2}$. Since the objective of $\mathcal{P}_{1}$ is bilinear in $\mathbf{u}_{1}$ and $\mathbf{t}_{1}$, it can be solved by iterative approaches, but these are inefficient. To solve $\mathcal{P}_{1}$ efficiently, we propose a closed-form solution by regarding the first receive digital chain of the UE as the calibration reference.

By using the auxiliary variables $\bar{\mathbf{X}}_{\mathrm{dr},p}=\mathbf{u}_{1}\mathbf{x}_{\mathrm{dt},p}^{T}$ and $\mathbf{x}_{\mathrm{dt},p}=\mathbf{X}_{\mathrm{dr}}^{T}\mathbf{t}_{1}$, the problem $\mathcal{P}_{1}$ can be reformulated as

\mathcal{P}_{1.1}:\min_{\{\bar{\mathbf{X}}_{\mathrm{dr},p}\}_{p\in[1:P_{\mathrm{dr}}]}}\quad\sum_{p=1}^{P_{\mathrm{dr}}}\left\|\bar{\mathbf{Y}}_{\mathrm{dr},p}-\bar{\mathbf{X}}_{\mathrm{dr},p}\right\|_{\mathrm{F}}^{2},

(17)

By taking the derivative of the objective function of $\mathcal{P}_{1.1}$, the solution can be given by [36]

\bar{\mathbf{X}}_{\mathrm{dr},p}=\bar{\mathbf{Y}}_{\mathrm{dr},p},\quad\forall p\in[1:P_{\mathrm{dr}}].

(18)

Since the first receive digital RF chain of the UE is the reference, its mismatch coefficient can be treated as a known constant, e.g., $u_{1,1}=c_{\mathrm{dr}}\neq 0$. Based on this assumption and (18), $\mathbf{x}_{\mathrm{dt},p}^{T}$ equals the first row of $\bar{\mathbf{X}}_{\mathrm{dr},p}$ divided by $c_{\mathrm{dr}}$, i.e.,

\mathbf{x}_{\mathrm{dt},p}=\frac{1}{c_{\mathrm{dr}}}\mathbf{y}_{\mathrm{dr},(p-1)M_{\mathrm{r}}+1}^{T},\quad\forall p\in[1:P_{\mathrm{dr}}],

(19)

where $\mathbf{y}_{\mathrm{dr},m}$ is the $m$ -th row of $\mathbf{Y}_{\mathrm{dr}}$ .

By substituting (19) into (15), the solution to the problem $\mathcal{P}_{1}$ can be given in the following proposition.

Proposition 3 (Solutions to the problem $\mathcal{P}_{1}$ ).

By assuming that the first receive digital RF chain is set as the reference, i.e., $u_{1,1}=c_{\mathrm{dr}}$ , the solutions to $\mathbf{t}_{1}$ and $\mathbf{u}_{1}$ can be given by

	$\displaystyle\hat{\mathbf{u}}_{1}=c_{\mathrm{dr}}\bigl{[}1,\tilde{\mathbf{y}}_{\mathrm{dr}}^{H}\check{\mathbf{Y}}_{\mathrm{dr}}(\check{\mathbf{Y}}_{\mathrm{dr}}^{H}\check{\mathbf{Y}}_{\mathrm{dr}})^{-1}\bigr{]}^{H},$		(20)
	$\displaystyle\hat{\mathbf{t}}_{1}=\frac{1}{c_{\mathrm{dr}}P_{\mathrm{dr}}}\Bigl{[}\mathbf{1}_{P_{\mathrm{dr}}}^{T}\otimes\bigl{(}\mathbf{X}_{\mathrm{dr}}^{}\mathbf{X}_{\mathrm{dr}}^{T}\bigr{)}^{-1}\mathbf{X}_{\mathrm{dr}}^{}\Bigr{]}\mathbf{y}_{\mathrm{dt}},$		(21)

where $\tilde{\mathbf{y}}_{\mathrm{dr}}=[\mathrm{vec}(\tilde{\mathbf{Y}}_{\mathrm{dr},1})^{T},\cdots,\mathrm{vec}(\tilde{\mathbf{Y}}_{\mathrm{dr},P_{\mathrm{dr}}})^{T}]^{T}\in\mathbb{C}^{P_{\mathrm{dr}}Q_{\mathrm{dr}}(M_{\mathrm{r}}-1)}$ , $\tilde{\mathbf{Y}}_{\mathrm{dr},p}$ consists of the second to the last row of $\bar{\mathbf{Y}}_{\mathrm{dr},p}$ , $\check{\mathbf{Y}}_{\mathrm{dr}}=[(\mathbf{y}_{\mathrm{dr},1}\otimes\mathbf{I}_{M_{\mathrm{r}}-1}),\cdots,(\mathbf{y}_{\mathrm{dr},(P_{\mathrm{dr}}-1)M_{\mathrm{r}}+1}\otimes\mathbf{I}_{M_{\mathrm{r}}-1})]^{T}\in\mathbb{C}^{P_{\mathrm{dr}}Q_{\mathrm{dr}}(M_{\mathrm{r}}-1)\times(M_{\mathrm{r}}-1)}$ , and $\mathbf{y}_{\mathrm{dt}}=[\mathbf{y}_{\mathrm{dr},1},\cdots,\mathbf{y}_{\mathrm{dr},(P_{\mathrm{dr}}-1)M_{\mathrm{r}}+1}]^{T}$ .

Proof:

See Appendix B. ∎

Remark 3 (The special solution to $\mathcal{P}_{1}$ ).

Equations (20) and (21) give the general solutions to $\mathcal{P}_{1}$ and are dependent on the value of $c_{\mathrm{dr}}$ . In practice, it is difficult to determine the value of $c_{\mathrm{dr}}$ . To avoid this issue, the mismatch coefficient of the reference can be set to $1$ , i.e., $c_{\mathrm{dr}}=1$ . In this case, equations (20) and (21) degenerate to a special solution to $\mathcal{P}_{1}$ . Since the vectors parallel to $\mathbf{t}_{1}$ and $\mathbf{u}_{1}$ can be applied to the calibration, the special solution to $\mathcal{P}_{1}$ still works for the reciprocity calibration.
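To make the reference-chain idea concrete, here is a simplified per-block least-squares sketch with $c_{\mathrm{dr}}=1$. It is not a literal transcription of (20) and (21): the function name `estimate_digital_mismatch`, the sizes, and the noise-free data are assumptions for illustration.

```python
import numpy as np


def estimate_digital_mismatch(y_blocks, x_dr, c_dr=1.0):
    """Estimate (u_1, t_1) from M_r x Q_dr blocks Y_p = u_1 t_1^T X_dr.

    The first receive chain is the reference, i.e. u_1[0] = c_dr.
    """
    u_hats, t_hats = [], []
    for y in y_blocks:
        x_dt = y[0, :] / c_dr                                  # first row gives t_1^T X_dr, cf. (19)
        t_hat, *_ = np.linalg.lstsq(x_dr.T, x_dt, rcond=None)  # solve X_dr^T t_1 = x_dt
        u_hat = y @ x_dt.conj() / np.vdot(x_dt, x_dt)          # LS fit of Y_p = u_1 x_dt^T
        u_hats.append(u_hat)
        t_hats.append(t_hat)
    # Average the per-block estimates over the P_dr receive settings.
    return np.mean(u_hats, axis=0), np.mean(t_hats, axis=0)


rng = np.random.default_rng(4)
m_t, m_r, q_dr, p_dr = 3, 4, 6, 2           # hypothetical sizes


def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


x_dr = crandn(m_t, q_dr)
u1_true = 1 + 0.2 * crandn(m_r)
u1_true[0] = 1.0                            # reference chain, c_dr = 1
t1_true = 1 + 0.2 * crandn(m_t)
y_blocks = [np.outer(u1_true, t1_true @ x_dr) for _ in range(p_dr)]  # noise-free

u1_hat, t1_hat = estimate_digital_mismatch(y_blocks, x_dr)
print(np.allclose(u1_hat, u1_true), np.allclose(t1_hat, t1_true))
```

If the true reference coefficient differs from 1, the estimates are scaled versions of $\mathbf{u}_{1}$ and $\mathbf{t}_{1}$, which is the parallel-vector situation described in Remark 3.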

Then, the mismatch coefficients of the analog RF chains can be estimated by solving $\mathcal{P}_{2}$. By exploiting the geometric channel model of mmWave, the calibration problem of the analog chains can be further written as

\mathcal{P}_{2.1}:\min_{\mathbf{U}_{2},\mathbf{T}_{2},\boldsymbol{\Theta},\boldsymbol{\Phi},\mathbf{H}_{\alpha}}\quad\left\|\mathbf{Y}_{\mathrm{da}}-\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}}\mathbf{H}_{\alpha}\mathbf{A}_{\mathrm{t}}^{T}\mathbf{T}_{2}\tilde{\mathbf{X}}_{\mathrm{da}}\right\|_{\mathrm{F}}^{2},

(22)

where $\tilde{\mathbf{X}}_{\mathrm{da}}=\bar{\mathbf{F}}_{\mathrm{da}}\mathbf{X}_{\mathrm{da}}$, $\mathbf{A}_{\mathrm{t}}=[\mathbf{a}_{\mathrm{t}}(\theta_{1}),\cdots,\mathbf{a}_{\mathrm{t}}(\theta_{K})]\in\mathbb{C}^{N_{\mathrm{t}}\times K}$, $\mathbf{A}_{\mathrm{r}}=[\mathbf{a}_{\mathrm{r}}(\phi_{1}),\cdots,\mathbf{a}_{\mathrm{r}}(\phi_{K})]\in\mathbb{C}^{N_{\mathrm{r}}\times K}$, and $\mathbf{H}_{\alpha}=\mathrm{diag}(\alpha_{1},\cdots,\alpha_{K})\sqrt{N_{\mathrm{t}}N_{\mathrm{r}}}/{\sqrt{K}}$. As the variables are coupled with each other, this problem is nonconvex and has no tractable closed-form solution. To solve $\mathcal{P}_{2.1}$ efficiently, inspired by [39], we propose an alternating optimization algorithm that finds a locally optimal solution.

During the $l_{\mathrm{ao}}$-th iteration, we first apply the least-squares algorithm to estimate the diagonal matrices $\mathbf{U}_{2}$, $\mathbf{T}_{2}$, and $\mathbf{H}_{\alpha}$, and then propose an algorithm to estimate the AoA and AoD matrices $\boldsymbol{\Theta}$ and $\boldsymbol{\Phi}$.

Lemma 1 (Solution to the diagonal matrices).

During the $l_{\mathrm{ao}}$ -th iteration, when $\mathbf{T}_{2}^{l_{\mathrm{ao}}-1},\mathbf{U}_{2}^{l_{\mathrm{ao}}-1}$ , $\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1}$ , and $\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1}$ are known, the diagonal elements of $\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}}$ can be estimated by

{\mathbf{h}}_{\alpha}^{l_{\mathrm{ao}}}=\mathrm{arg}\min_{\mathbf{h}_{\alpha}}\bar{g}(\mathbf{T}_{2}^{l_{\mathrm{ao}}-1},\mathbf{U}_{2}^{l_{\mathrm{ao}}-1},\mathbf{H}_{\alpha},\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1},\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1})=(\boldsymbol{\Gamma}_{\mathrm{h}}^{H}\boldsymbol{\Gamma}_{\mathrm{h}})^{-1}\boldsymbol{\Gamma}_{\mathrm{h}}^{H}\mathrm{vec}\left\{\mathbf{Y}_{\mathrm{da}}\right\},

(23)

where $\boldsymbol{\Gamma}_{\mathrm{h}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}^{l_{\mathrm{ao}}-1}\mathbf{A}_{\mathrm{t}}^{l_{\mathrm{ao}}-1}\odot\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}^{l_{\mathrm{ao}}-1}\mathbf{A}_{\mathrm{r}}^{l_{\mathrm{ao}}-1})$ . When $\mathbf{T}_{2}^{l_{\mathrm{ao}}-1},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}}$ , and $\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1}$ , $\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1}$ are known, the diagonal elements of $\mathbf{U}_{2}^{l_{\mathrm{ao}}}$ can be estimated by

\mathbf{u}_{2}^{l_{\mathrm{ao}}}=\mathrm{arg}\min_{\mathbf{u}_{2}}\bar{g}(\mathbf{T}_{2}^{l_{\mathrm{ao}}-1},\mathbf{U}_{2},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}},\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1},\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1})=(\boldsymbol{\Gamma}_{\mathrm{u}}^{H}\boldsymbol{\Gamma}_{\mathrm{u}})^{-1}\boldsymbol{\Gamma}_{\mathrm{u}}^{H}\mathrm{vec}\left\{\mathbf{Y}_{\mathrm{da}}\right\},

(24)

where $\boldsymbol{\Gamma}_{\mathrm{u}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}^{l_{\mathrm{ao}}-1}\mathbf{A}_{\mathrm{t}}^{l_{\mathrm{ao}}-1}\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}}(\mathbf{A}_{\mathrm{r}}^{l_{\mathrm{ao}}-1})^{T}\odot\bar{\mathbf{B}}_{\mathrm{r}})$ . Similarly, by giving $\mathbf{U}_{2}^{l_{\mathrm{ao}}},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}}$ , $\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1}$ , and $\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1}$ , the diagonal entries of $\mathbf{T}_{2}$ can be given by

\mathbf{t}_{2}^{l_{\mathrm{ao}}}=\mathrm{arg}\min_{\mathbf{t}_{2}}\bar{g}(\mathbf{T}_{2},\mathbf{U}_{2}^{l_{\mathrm{ao}}},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}},\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1},\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1})=(\boldsymbol{\Gamma}_{\mathrm{t}}^{H}\boldsymbol{\Gamma}_{\mathrm{t}})^{-1}\boldsymbol{\Gamma}_{\mathrm{t}}^{H}\mathrm{vec}\left\{\mathbf{Y}_{\mathrm{da}}\right\},

(25)

where $\boldsymbol{\Gamma}_{\mathrm{t}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\odot\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}^{l_{\mathrm{ao}}}\mathbf{A}_{\mathrm{r}}^{l_{\mathrm{ao}}-1}\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}}(\mathbf{A}_{\mathrm{t}}^{l_{\mathrm{ao}}-1})^{T})$ .

Algorithm 1 The AoAs/AoDs updating

Input: $\mathbf{U}_{2}^{l_{\mathrm{ao}}}$, $\mathbf{T}_{2}^{l_{\mathrm{ao}}}$, $\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}}$, $\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1}$, $\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1}$, and the convergence condition $\epsilon_{\mathrm{an}}$

1: Initialize $l_{\mathrm{an}}=1$, and $\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}-1}=\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1}$

2: repeat

3: Compute the array steering matrix $\mathbf{A}_{\mathrm{r}}(\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}-1})$ and its gradient matrix $\bar{\mathbf{A}}_{\mathrm{r}}(\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}-1})$;

4: Compute the equivalent receive signal matrix $\mathbf{Y}_{\mathrm{dar}}=\mathbf{Y}_{\mathrm{da}}-\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}^{l_{\mathrm{ao}}}\mathbf{A}_{\mathrm{r}}(\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}-1})\mathbf{H}_{\mathrm{r}}^{l_{\mathrm{ao}}}$;

5: Compute the increment of the direction angles $\boldsymbol{\xi}=\Re\{(\boldsymbol{\Gamma}_{\xi}^{H}\boldsymbol{\Gamma}_{\xi})\}^{-1}\Re\{\boldsymbol{\Gamma}_{\xi}^{H}\mathrm{vec}(\mathbf{Y}_{\mathrm{dar}})\}$;

6: Update the direction angles $\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}}=\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}-1}+\boldsymbol{\xi}$, and set $l_{\mathrm{an}}=l_{\mathrm{an}}+1$;

7: until $\|\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}}-\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}-1}\|_{\mathrm{F}}^{2}<\epsilon_{\mathrm{an}}$

Output: The updated $\boldsymbol{\Phi}^{l_{\mathrm{ao}}}$

Proof:

The complete proof is presented in Appendix C of Supplementary Material. ∎
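The least-squares updates in Lemma 1 all rely on the identity $\mathrm{vec}(\mathbf{A}\,\mathrm{diag}(\mathbf{h})\,\mathbf{B})=(\mathbf{B}^{T}\odot\mathbf{A})\mathbf{h}$, where $\odot$ is the column-wise Khatri-Rao product and $\mathrm{vec}$ stacks columns. The following sketch checks the identity and the resulting estimate of the form (23) on random stand-in matrices; the helper `khatri_rao`, the sizes, and the matrices `left` and `right` are assumptions, not the paper's data.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product: column k equals kron(a[:, k], b[:, k])."""
    rows = a.shape[0] * b.shape[0]
    return np.einsum("ik,jk->ijk", a, b).reshape(rows, a.shape[1])


rng = np.random.default_rng(5)
p_da, q_da, k_paths = 6, 5, 3               # hypothetical P_da, Q_da, K


def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


left = crandn(p_da, k_paths)                # stands in for B_da^T U_2 A_r
right = crandn(k_paths, q_da)               # stands in for A_t^T T_2 X_da
h_true = crandn(k_paths)                    # diagonal of H_alpha

y = left @ np.diag(h_true) @ right          # noise-free Y_da
gamma = khatri_rao(right.T, left)           # Gamma_h, of size (P_da * Q_da) x K
vec_y = y.reshape(-1, order="F")            # column-major vec{Y_da}

# Least-squares solution, equivalent to (Gamma^H Gamma)^{-1} Gamma^H vec{Y_da}.
h_hat, *_ = np.linalg.lstsq(gamma, vec_y, rcond=None)
print(np.allclose(gamma @ h_true, vec_y), np.allclose(h_hat, h_true))
```

Using `lstsq` rather than forming the normal equations explicitly is a numerical choice of this sketch; both give the same estimate when $\boldsymbol{\Gamma}_{\mathrm{h}}$ has full column rank.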

Next, we propose an AoA and AoD updating algorithm for the case where $\mathbf{U}_{2}^{l_{\mathrm{ao}}}$, $\mathbf{T}_{2}^{l_{\mathrm{ao}}}$, $\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}}$, $\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1}$, and $\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1}$ are given. (The update method of AoA/AoD is not restricted to the algorithm proposed in this paper; some existing methods, such as those presented in [30], can be employed.) Since the AoAs and AoDs can be estimated using the same approach, we introduce the updating algorithm by taking the update of the AoDs $\boldsymbol{\Phi}$ as an example. Since the problem of estimating the AoDs is nonlinear, it is difficult to solve for the AoDs directly. To address this issue, as presented in [40], the nonlinear problem is transformed into a series of linear problems by using the first-order Taylor expansion to approximate the array steering vector. Let $\xi_{k}\ (k\in[1:K])$ denote the differences between the estimated AoDs $\bar{\boldsymbol{\Phi}}$ and the real AoDs $\boldsymbol{\Phi}$. By assuming that the differences are small, the array steering vector can be approximated by the first-order Taylor expansion $\mathbf{a}_{\mathrm{r}}(\phi_{k})\approx\mathbf{a}_{\mathrm{r}}(\bar{\phi}_{k})+\bar{\mathbf{a}}_{\mathrm{r}}(\bar{\phi}_{k})\xi_{k}$, where $\bar{\mathbf{a}}_{\mathrm{r}}(\bar{\phi}_{k})=\partial\mathbf{a}_{\mathrm{r}}(\phi_{k})/\partial\phi_{k}|_{\phi_{k}=\bar{\phi}_{k}}=\mathbf{a}_{\mathrm{r}}(\bar{\phi}_{k})\circ[0,-j\frac{2\pi d}{\lambda}\cos\bar{\phi}_{k},\cdots,-j\frac{2\pi d}{\lambda}(N_{\mathrm{r}}-1)\cos\bar{\phi}_{k}]^{T}$. Then, the problem $\mathcal{P}_{2.1}$ can be further formulated as

\mathcal{P}_{2.2}:\min_{\{\xi_{k}\}_{k\in[1:K]}}\quad\left\|\mathbf{Y}_{\mathrm{dar}}^{l_{\mathrm{ao}}}-\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}^{l_{\mathrm{ao}}}\bar{\mathbf{A}}_{\mathrm{r}}(\bar{\boldsymbol{\Phi}})\boldsymbol{\Lambda}\mathbf{H}_{\mathrm{r}}^{l_{\mathrm{ao}}}\right\|_{\mathrm{F}}^{2},

(26)

where $\mathbf{H}_{\mathrm{r}}^{l_{\mathrm{ao}}}=\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}}\mathbf{A}_{\mathrm{t}}^{T}\mathbf{T}_{2}^{l_{\mathrm{ao}}}\tilde{\mathbf{X}}_{\mathrm{da}}$ , $\mathbf{Y}_{\mathrm{dar}}=\mathbf{Y}_{\mathrm{da}}-\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}^{l_{\mathrm{ao}}}\mathbf{A}_{\mathrm{r}}(\bar{\boldsymbol{\Phi}})\mathbf{H}_{\mathrm{r}}^{l_{\mathrm{ao}}}$ , $\boldsymbol{\Lambda}=\mathrm{diag}(\xi_{1},\cdots,\xi_{K})$ , and $\bar{\mathbf{A}}_{\mathrm{r}}(\bar{\boldsymbol{\Phi}})=[\bar{\mathbf{a}}_{\mathrm{r}}(\bar{\phi}_{1}),\cdots,\bar{\mathbf{a}}_{\mathrm{r}}(\bar{\phi}_{K})]$ . The solution can be given by

\boldsymbol{\xi}=\Re\{(\boldsymbol{\Gamma}_{\xi}^{H}\boldsymbol{\Gamma}_{\xi})\}^{-1}\Re\{\boldsymbol{\Gamma}_{\xi}^{H}\mathrm{vec}(\mathbf{Y}_{\mathrm{dar}})\},

(27)

where $\boldsymbol{\Gamma}_{\xi}=(\mathbf{H}_{\mathrm{r}}^{l_{\mathrm{ao}}})^{T}\odot\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}^{l_{\mathrm{ao}}}\bar{\mathbf{A}}_{\mathrm{r}}(\bar{\boldsymbol{\Phi}})$. Since $\xi_{k}$ is assumed to be small, the update requires several iterations, and the iterative updating algorithm is summarized as Algorithm 1.
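The linearized update can be illustrated on a single path. The sketch below assumes a uniform linear array with steering vector entries $e^{-j2\pi(d/\lambda)n\sin\phi}/\sqrt{N_{\mathrm{r}}}$, which is consistent with the derivative given above but whose normalization is an assumption, and it observes the steering vector directly (no beamformers or mismatch). The function `refine_angle` applies the real-valued least-squares step of the form (27) for $K=1$.

```python
import numpy as np


def steering(phi, n, d_over_lambda=0.5):
    """Assumed ULA steering vector a_r(phi) with half-wavelength spacing by default."""
    idx = np.arange(n)
    return np.exp(-1j * 2 * np.pi * d_over_lambda * idx * np.sin(phi)) / np.sqrt(n)


def steering_derivative(phi, n, d_over_lambda=0.5):
    """Derivative of a_r with respect to phi: a_r(phi) times -j*2*pi*(d/lambda)*n*cos(phi)."""
    idx = np.arange(n)
    return steering(phi, n, d_over_lambda) * (-1j * 2 * np.pi * d_over_lambda * idx * np.cos(phi))


def refine_angle(y, alpha, phi0, n, iters=10):
    """Iterate phi <- phi + xi with xi = Re(g^H r) / Re(g^H g)."""
    phi = phi0
    for _ in range(iters):
        residual = y - alpha * steering(phi, n)      # plays the role of Y_dar
        g = alpha * steering_derivative(phi, n)      # plays the role of Gamma_xi
        xi = np.real(np.vdot(g, residual)) / np.real(np.vdot(g, g))
        phi = phi + xi
    return phi


n_r, alpha, phi_true = 8, 0.8 - 0.3j, 0.30           # hypothetical values
y = alpha * steering(phi_true, n_r)                  # noise-free observation
phi_hat = refine_angle(y, alpha, phi0=0.33, n=n_r)

# Size of the first-order Taylor error for a 0.01 rad offset.
taylor_error = np.linalg.norm(
    steering(0.31, n_r) - (steering(0.30, n_r) + 0.01 * steering_derivative(0.30, n_r))
)
print(abs(phi_hat - phi_true), taylor_error)
```

The iteration only behaves well when the initial angle is close to the true one, which is why a coarse direction-finding step is needed for initialization.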

It is worth noting that the initial values of AoAs $\boldsymbol{\Theta}$ and AoDs $\boldsymbol{\Phi}$ can be roughly calculated by direction finding methods, e.g., the modified MUSIC algorithm in [41].

Finally, based on Lemma 1 and Algorithm 1, the problem $\mathcal{P}_{2.1}$ can be solved by an alternating optimization algorithm, which is summarized as Algorithm 2.

Algorithm 2 Alternating Optimization for solving $\mathcal{P}_{2.1}$

Input: The received signals $\mathbf{Y}_{\mathrm{da}}$, and the convergence threshold $\epsilon$

1: Set $l_{\mathrm{ao}}=1$; initialize $\mathbf{U}_{2}^{l_{\mathrm{ao}}-1}$, $\mathbf{T}_{2}^{l_{\mathrm{ao}}-1}$, and $\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}-1}$ randomly; initialize AoAs $\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1}$ and AoDs $\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1}$ by the modified MUSIC algorithm in [41];

2: repeat

3: Estimate the channel gain $\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}}$ by using (23);

4: Estimate the mismatch coefficients $\mathbf{U}_{2}^{l_{\mathrm{ao}}}$ by using (24);

5: Estimate the mismatch coefficients $\mathbf{T}_{2}^{l_{\mathrm{ao}}}$ by using (25);

6: Update the AoAs $\boldsymbol{\Theta}^{l_{\mathrm{ao}}}$ and AoDs $\boldsymbol{\Phi}^{l_{\mathrm{ao}}}$ with Algorithm 1; set $l_{\mathrm{ao}}=l_{\mathrm{ao}}+1$;

7: until $|\bar{g}(\mathbf{T}_{2}^{l_{\mathrm{ao}}-1},\mathbf{U}_{2}^{l_{\mathrm{ao}}-1},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}-1},\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1},\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1})-\bar{g}(\mathbf{T}_{2}^{l_{\mathrm{ao}}},\mathbf{U}_{2}^{l_{\mathrm{ao}}},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}},\boldsymbol{\Theta}^{l_{\mathrm{ao}}},\boldsymbol{\Phi}^{l_{\mathrm{ao}}})|<\epsilon$

Output: The mismatch coefficients $\mathbf{U}_{2}^{l_{\mathrm{ao}}}$ and $\mathbf{T}_{2}^{l_{\mathrm{ao}}}$

Remark 4 (Convergence analysis).

In Algorithm 1, each iteration does not increase the objective of $\mathcal{P}_{2.2}$, i.e.,

\left\|\mathbf{Y}_{\mathrm{dar}}^{l_{\mathrm{ao}}}-\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}^{l_{\mathrm{ao}}}\bar{\mathbf{A}}_{\mathrm{r}}(\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}-1})\boldsymbol{\Lambda}\mathbf{H}_{\mathrm{r}}^{l_{\mathrm{ao}}}\right\|_{\mathrm{F}}^{2}\geq\left\|\mathbf{Y}_{\mathrm{dar}}^{l_{\mathrm{ao}}}-\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}^{l_{\mathrm{ao}}}\bar{\mathbf{A}}_{\mathrm{r}}(\bar{\boldsymbol{\Phi}}^{l_{\mathrm{an}}})\boldsymbol{\Lambda}\mathbf{H}_{\mathrm{r}}^{l_{\mathrm{ao}}}\right\|_{\mathrm{F}}^{2}.

(28)

Thus, Algorithm 1 converges locally. For Algorithm 2, each alternating update does not increase the objective $\bar{g}(\mathbf{T}_{2},\mathbf{U}_{2},\mathbf{H}_{\alpha},\boldsymbol{\Theta},\boldsymbol{\Phi})$. In other words, we have

\begin{split}&\bar{g}(\mathbf{T}_{2}^{l_{\mathrm{ao}}-1},\mathbf{U}_{2}^{l_{\mathrm{ao}}-1},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}-1},\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1},\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1})\geq\bar{g}(\mathbf{T}_{2}^{l_{\mathrm{ao}}-1},\mathbf{U}_{2}^{l_{\mathrm{ao}}-1},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}},\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1},\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1})\\ &\geq\bar{g}(\mathbf{T}_{2}^{l_{\mathrm{ao}}-1},\mathbf{U}_{2}^{l_{\mathrm{ao}}},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}},\boldsymbol{\Theta}^{l_{\mathrm{ao}}-1},\boldsymbol{\Phi}^{l_{\mathrm{ao}}-1})\geq\cdots\geq\bar{g}(\mathbf{T}_{2}^{l_{\mathrm{ao}}},\mathbf{U}_{2}^{l_{\mathrm{ao}}},\mathbf{H}_{\alpha}^{l_{\mathrm{ao}}},\boldsymbol{\Theta}^{l_{\mathrm{ao}}},\boldsymbol{\Phi}^{l_{\mathrm{ao}}}),\end{split}

(29)

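and thus, since $\bar{g}$ is bounded below by zero, the objective value of Algorithm 2 converges. Because $\mathcal{P}_{2.1}$ is nonconvex, the resulting solution is in general only locally optimal.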
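The monotone decrease in (29) is a generic property of alternating exact least-squares updates. As a hedged illustration, the sketch below runs such updates on a small rank-one bilinear fit $\mathbf{Y}\approx\mathbf{u}\mathbf{t}^{T}\mathbf{X}$ (the same form as $\mathcal{P}_{1}$) rather than on the full problem $\mathcal{P}_{2.1}$; sizes, noise level, and initialization are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
m_t, m_r, q = 3, 4, 8                       # hypothetical sizes


def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


x = crandn(m_t, q)
u_true, t_true = 1 + 0.2 * crandn(m_r), 1 + 0.2 * crandn(m_t)
y = np.outer(u_true, t_true @ x) + 0.05 * crandn(m_r, q)   # noisy observations


def objective(u, t):
    """Squared Frobenius error of the bilinear model Y ~ u t^T X."""
    return np.linalg.norm(y - np.outer(u, t @ x)) ** 2


u, t = crandn(m_r), crandn(m_t)             # random initialization
history = [objective(u, t)]
for _ in range(20):
    # Exact LS update of u with t fixed.
    s = t @ x
    u = y @ s.conj() / np.vdot(s, s).real
    history.append(objective(u, t))
    # Exact LS update of t with u fixed: fit s = X^T t to the best row vector for u.
    s_ls = y.T @ u.conj() / np.vdot(u, u).real
    t, *_ = np.linalg.lstsq(x.T, s_ls, rcond=None)
    history.append(objective(u, t))

print(history[0], history[-1])
```

Each block update is the exact minimizer over its own variable, so the recorded objective values form a non-increasing sequence, mirroring the chain of inequalities in (29).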

Similarly, the mismatch coefficients of the receive RF chains of the BS and the transmit RF chains of the UE can be estimated by the uplink calibration. During the uplink calibration, the UE transmits pilots to the BS, and the BS estimates the mismatch coefficients by Proposition 3 and Algorithm 2. With estimated mismatch coefficients, the reciprocity mismatch can be compensated in the digital domain. Thus, the overall procedure of HAC can be summarized as follows.

Step 1

(Downlink calibration) The BS sends downlink pilots to the UE. After receiving the downlink pilots, the UE jointly estimates the mismatch coefficients of the transmit RF chains of BS and the receive RF chains of UE by Algorithm 2;
Step 2

(Uplink calibration) The UE transmits uplink pilots to the BS. Based on the received uplink pilots, the BS calculates the mismatch coefficients of the receive RF chains of BS and the transmit RF chains of UE with Algorithm 2;
Step 3

(Mismatch coefficients feedback) The UE feeds back the estimated mismatch coefficients to the BS, and the BS sends the mismatch coefficients to the UE;
Step 4

(Reciprocity mismatch compensation) During data transmission phases, the CSI $\mathbf{H}$ can be estimated by utilizing the knowledge of the mismatch coefficients and some existing approaches, e.g., the methods presented in [28, 29, 30]. Then, the equivalent downlink CSI can be formulated as $\mathbf{U}_{2}\mathbf{H}^{T}\mathbf{T}_{2}$, and the precoding/combining and beamforming matrices can be designed by some existing approaches, e.g., the methods proposed in [32, 33, 34]. Finally, the mismatch of the digital RF chains can be compensated by multiplying the precoding/combining matrices by the inverse of the mismatch coefficient matrices of the digital chains.
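The digital-domain compensation in Step 4 can be sketched as follows, under the assumption that the inverse of the estimated diagonal matrix $\mathbf{T}_{1}$ is applied on the left of the digital precoder. The sizes, the precoder `W`, and the estimate `t1_hat` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)
m_t, n_streams = 4, 2                       # hypothetical M_t and number of data streams


def crandn(*shape):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


t1_hat = 1 + 0.2 * crandn(m_t)              # estimated digital transmit mismatch (diagonal of T_1)
W = crandn(m_t, n_streams)                  # digital precoder designed for the ideal chains

W_comp = W / t1_hat[:, None]                # T_1^{-1} W for a diagonal T_1
effective = np.diag(t1_hat) @ W_comp        # what the mismatched digital chains actually apply
print(np.allclose(effective, W))
```

With perfect estimates the mismatched chains reproduce the designed precoder exactly; with estimation errors a residual diagonal factor remains.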

II-E Extend HAC to the multi-user scenario

Although the HAC is introduced in a point-to-point HBF system, it can also be applied in multi-user HBF systems, where a single BS serves $G$ UEs simultaneously. Each UE is equipped with an HBF transceiver. As in the single-UE case, the HAC consists of the downlink calibration and the uplink calibration. For both the downlink and uplink calibrations, the pilot and beamforming designs remain the same as in the single-UE case.

Downlink calibration: The BS sends training pilots to the UEs, and the signal received by the $g$-th UE can be denoted as

\mathbf{y}_{\mathrm{d},g,l}=\mathbf{D}_{\mathrm{r},g,l}^{T}\mathbf{U}_{g,1}\mathbf{B}_{\mathrm{r},g,l}^{T}\mathbf{U}_{g,2}\mathbf{H}_{g}^{T}\mathbf{T}_{2}\mathbf{F}_{\mathrm{t},l}\mathbf{T}_{1}\mathbf{W}_{\mathrm{t},l}\mathbf{x}_{\mathrm{d},l}+\tilde{\mathbf{n}}_{\mathrm{d},g,l},

(30)

where $\mathbf{H}_{g}$ denotes the wireless channel between the $g$-th UE and the BS, and $\mathbf{U}_{g,1}$ and $\mathbf{U}_{g,2}$ are the mismatch coefficient matrices of the receive digital and analog chains of the $g$-th UE, respectively. Since (30) has the same form as (13) for each UE, it can be solved by the same approach, i.e., Proposition 2, Proposition 3, and Algorithm 2.

Uplink calibration: The UEs send training pilots to the BS, and the received signal can be given by

\mathbf{y}_{\mathrm{u},l}=\mathbf{W}_{\mathrm{r},l}^{T}\mathbf{R}_{1}\mathbf{F}_{\mathrm{r},l}^{T}\mathbf{R}_{2}\mathbf{H}_{\mathrm{mu}}\bar{\mathbf{V}}_{2}\mathbf{B}_{\mathrm{t},l}\bar{\mathbf{V}}_{1}\mathbf{x}_{\mathrm{u},l}+\tilde{\mathbf{n}}_{\mathrm{u},l},

(31)

where $\mathbf{H}_{\mathrm{mu}}=[\mathbf{H}_{1},\cdots,\mathbf{H}_{G}]$ , $\bar{\mathbf{V}}_{2}=\mathrm{blkdiag}(\mathbf{V}_{1,2},\cdots,\mathbf{V}_{G,2})$ , $\mathbf{B}_{\mathrm{t},l}=\mathrm{blkdiag}(\mathbf{B}_{\mathrm{t},1,l},\cdots,\mathbf{B}_{\mathrm{t},G,l})$ , $\bar{\mathbf{V}}_{1}=\mathrm{blkdiag}(\mathbf{V}_{1,1},\cdots,\mathbf{V}_{G,1})$ , $\mathbf{x}_{\mathrm{u},l}=[\mathbf{x}_{\mathrm{u},1,l}^{T}\mathbf{D}_{\mathrm{t},1,l}^{T},\cdots,\mathbf{x}_{\mathrm{u},G,l}^{T}\mathbf{D}_{\mathrm{t},G,l}^{T}]^{T}$ , $\mathbf{V}_{g,1}$ and $\mathbf{V}_{g,2}$ represent the mismatch matrices of the transmit digital and analog chains of the $g$ -th UE. The uplink calibration problem can be formulated as

\min_{\bar{\mathbf{V}}_{i},\mathbf{R}_{i},\mathbf{H}_{\mathrm{mu}}}\ \sum_{l=1}^{L_{\mathrm{u}}}\left\|\mathbf{y}_{\mathrm{u},l}-\mathbf{W}_{\mathrm{r},l}^{T}\mathbf{R}_{1}\mathbf{F}_{\mathrm{r},l}^{T}\mathbf{R}_{2}\mathbf{H}_{\mathrm{mu}}\bar{\mathbf{V}}_{2}\mathbf{B}_{\mathrm{t},l}\bar{\mathbf{V}}_{1}\mathbf{x}_{\mathrm{u},l}\right\|_{\mathrm{F}}^{2},

(32)

which can be decomposed into the calibration problems of the digital chains and the analog chains by Proposition 2. The uplink calibration problem of the digital chains can be solved by Proposition 3, whereas the problem of the analog chains cannot be solved by Algorithm 2 due to the different structure of $\mathbf{H}_{\mathrm{mu}}$. Fortunately, if the UEs feed back the AoAs and AoDs estimated in the downlink calibration to the BS, the BS only needs to estimate the mismatch coefficients $\bar{\mathbf{V}}_{2}$ and $\mathbf{R}_{2}$, which can be obtained by an alternating optimization approach similar to Algorithm 2.
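The stacked quantities $\mathbf{H}_{\mathrm{mu}}$ and $\bar{\mathbf{V}}_{2}$ used in (31) can be illustrated with a short sketch. It assumes hypothetical sizes (`G`, `N_bs`, `N_ue`), random channels, and diagonal per-UE mismatch matrices; the helpers `crandn` and `blkdiag` are written here for illustration only. The check confirms that the stacked product equals the sum of the per-UE contributions.

```python
import numpy as np

def crandn(rng, shape):
    # Complex standard-normal samples (illustrative helper).
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def blkdiag(blocks):
    # Block-diagonal stacking of a list of 2-D arrays.
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=complex)
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

rng = np.random.default_rng(2)
G, N_bs, N_ue = 3, 8, 4                                   # assumed sizes
H = [crandn(rng, (N_bs, N_ue)) for _ in range(G)]         # per-UE channels H_g
V2 = [np.diag(crandn(rng, N_ue)) for _ in range(G)]       # per-UE analog mismatch V_{g,2}
s = [crandn(rng, N_ue) for _ in range(G)]                 # per-UE signals after analog beamforming

H_mu = np.hstack(H)                                       # [H_1, ..., H_G]
V2_bar = blkdiag(V2)                                      # blkdiag(V_{1,2}, ..., V_{G,2})

stacked = H_mu @ V2_bar @ np.concatenate(s)
per_user = sum(H[g] @ V2[g] @ s[g] for g in range(G))
```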

III Performance Analysis of Reciprocity Calibration

In this section, we will analyze the performance of the proposed HAC. The minimum length of the calibration pilots will be first derived, followed by the overhead and computational complexity analysis. To measure the performance of the proposed calibration approach, the Cramér-Rao lower bound will be derived as the benchmark of the calibration performance.

III-A Overhead and Complexity of HAC

Based on the calibration signal design and estimation approaches, we can derive the requirements of the length of calibration pilots.

Proposition 4 (Length of downlink pilots).

The proposed downlink training and estimation approaches require that the length of pilots meets the following conditions

\begin{cases}Q_{\mathrm{dr}}\geq M_{\mathrm{t}},\\ P_{\mathrm{dr}}\geq 1,\\ Q_{\mathrm{da}}\geq N_{\mathrm{t}}-K+1,\\ P_{\mathrm{da}}\geq N_{\mathrm{r}}-K+1.\end{cases}

(33)

Proof:

The complete proof is presented in Appendix D of Supplementary Material. ∎

Based on the above pilot requirements and the proposed calibration algorithms, the overhead and computational complexity of HAC can be given in the following remark.

Remark 5 (Overhead and complexity of the proposed HAC).

When the pilot lengths exactly meet the requirements in (33), the overhead of downlink training is proportional to $M_{\mathrm{t}}+N_{\mathrm{t}}N_{\mathrm{r}}$. In each iteration, the computational complexity is dominated by matrix inversions. Let $L_{\mathrm{an}}$ denote the total number of iterations for updating the AoAs/AoDs and $L_{\mathrm{ao}}$ represent the number of iterations of Algorithm 2. The complexity of solving problem $\mathcal{P}_{2.1}$ can be given by $\mathcal{O}[L_{\mathrm{ao}}(M_{\mathrm{r}}^{3}+M_{\mathrm{t}}^{3}+N_{\mathrm{r}}^{3}+N_{\mathrm{t}}^{3}+L_{\mathrm{ao}}K^{3})]$. The comparisons between the overhead and complexity of HAC and CRC are shown in Table I, which indicates that the proposed HAC requires less overhead. According to experiments, both Algorithm 1 and Algorithm 2 converge after several iterations, and the complexity of the HAC is also lower than that of the CRC.

TABLE I: Comparison of HAC and CRC

Scheme	Overhead	Complexity
CRC	$N_{\mathrm{t}}M_{\mathrm{t}}N_{\mathrm{r}}$	$\mathcal{O}(N_{\mathrm{t}}^{3}M_{\mathrm{t}}^{3}N_{\mathrm{r}}^{3}M_{\mathrm{r}}^{3})$
HAC	$M_{\mathrm{t}}+N_{\mathrm{t}}N_{\mathrm{r}}$	$\mathcal{O}[L_{\mathrm{ao}}(M_{\mathrm{r}}^{3}+M_{\mathrm{t}}^{3}+N_{\mathrm{r}}^{3}+N_{\mathrm{t}}^{3}+L_{\mathrm{ao}}K^{3})]$
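As a worked example of (33) and the overhead column of Table I, the following sketch evaluates the minimum pilot lengths and the overhead scaling terms for one configuration. The function names `min_pilot_lengths` and `overhead_scaling` are hypothetical, and the chosen sizes ($N_{\mathrm{t}}=128$, $N_{\mathrm{r}}=64$, $M_{\mathrm{t}}=N_{\mathrm{t}}/4$, $K=4$) are taken from the simulation settings only for illustration.

```python
def min_pilot_lengths(N_t, N_r, M_t, K):
    # Smallest pilot lengths that satisfy the four conditions in (33).
    return {"Q_dr": M_t, "P_dr": 1, "Q_da": N_t - K + 1, "P_da": N_r - K + 1}

def overhead_scaling(N_t, N_r, M_t):
    # Overhead scaling terms as listed in Table I (proportionality, not exact counts).
    return {"CRC": N_t * M_t * N_r, "HAC": M_t + N_t * N_r}

lengths = min_pilot_lengths(N_t=128, N_r=64, M_t=32, K=4)
scaling = overhead_scaling(N_t=128, N_r=64, M_t=32)
ratio = scaling["CRC"] / scaling["HAC"]   # how many times larger the CRC term is
```

For this configuration, the bound on $Q_{\mathrm{da}}$ evaluates to $128-4+1=125$, and the CRC overhead term is larger than the HAC term by a factor of roughly $M_{\mathrm{t}}$.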

III-B CRLB of Calibration Coefficients

To verify the performance of the proposed joint estimation approaches, we derive the CRLB of $\tilde{\mathbf{u}}_{1}$, $\mathbf{t}_{1}$, $\mathbf{u}_{2}$, and $\mathbf{t}_{2}$ as the performance benchmark, where $\tilde{\mathbf{u}}_{1}=[u_{1,2},\cdots,u_{1,M_{\mathrm{r}}}]^{T}$. We first define the variable vectors as

\begin{split}\boldsymbol{\eta}=&[\Re\{\tilde{\mathbf{u}}_{1}^{T}\},\Im\{\tilde{\mathbf{u}}_{1}^{T}\},\Re\{\mathbf{t}_{1}^{T}\},\Im\{\mathbf{t}_{1}^{T}\},\Re\{\mathbf{u}_{2}^{T}\},\Im\{\mathbf{u}_{2}^{T}\},\\ &\Re\{\mathbf{t}_{2}^{T}\},\Im\{\mathbf{t}_{2}^{T}\},\Re\{\mathbf{h}_{\alpha}^{T}\},\Im\{\mathbf{h}_{\alpha}^{T}\},\boldsymbol{\Theta}^{T},\boldsymbol{\Phi}^{T}]^{T},\end{split}

(34)

\boldsymbol{\eta}_{\mathrm{ut}}=[\tilde{\mathbf{u}}_{1}^{T},\mathbf{t}_{1}^{T},\mathbf{u}_{2}^{T},\mathbf{t}_{2}^{T}]^{T},

(35)

and the transformation function vector $\mathbf{g}(\boldsymbol{\eta})$ as

\begin{split}\boldsymbol{\eta}_{\mathrm{ut}}=\mathbf{g}(\boldsymbol{\eta})=&\bigl{[}\Re\{\tilde{\mathbf{u}}_{1}^{T}\}+j\Im\{\tilde{\mathbf{u}}_{1}^{T}\},\Re\{\mathbf{t}_{1}^{T}\}+j\Im\{\mathbf{t}_{1}^{T}\},\\ &\Re\{\mathbf{u}_{2}^{T}\}+j\Im\{\mathbf{u}_{2}^{T}\},\Re\{\mathbf{t}_{2}^{T}\}+j\Im\{\mathbf{t}_{2}^{T}\}\bigr{]}^{T}.\end{split}

(36)

Based on this definition, the CRLB of the equivalent mismatch coefficients $\boldsymbol{\eta}_{\mathrm{ut}}$ can be defined as follows.

Definition 1 (CRLB of ${\boldsymbol{\eta}}_{\mathrm{ut}}$ ).

According to the transformation relation in [44], the CRLB of $\boldsymbol{\eta}_{\mathrm{ut}}$ can be given by

\mathrm{CRLB}({\eta}_{\mathrm{ut},i})=\left[\frac{\partial\mathbf{g}(\boldsymbol{\eta})}{\partial\boldsymbol{\eta}^{T}}\boldsymbol{\mathcal{I}}(\boldsymbol{\eta})^{-1}\left(\frac{\partial\mathbf{g}(\boldsymbol{\eta})}{\partial\boldsymbol{\eta}^{T}}\right)^{H}\right]_{i,i},

(37)

where $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta})$ denotes the Fisher information matrix of $\boldsymbol{\eta}$ , and ${\eta}_{\mathrm{ut},i}$ is the $i$ -th entry of $\boldsymbol{\eta}_{\mathrm{ut}}$ .
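The transformation in (37) can be checked numerically for the simplest case of a complex parameter vector whose real and imaginary parts are stacked in $\boldsymbol{\eta}$. The sketch below is an illustration under assumptions: the Fisher information matrix is taken as a scaled identity of the form that appears in Lemma 3 for $\boldsymbol{\eta}_{1,2}$, the numerical values of `rho`, `beta`, `L`, and `sigma2` are hypothetical, and the function name `crlb_transformed` is not from the paper.

```python
import numpy as np

def crlb_transformed(fim, jac):
    # Diagonal of jac @ fim^{-1} @ jac^H, i.e., the bound in (37) for each transformed entry.
    cov = jac @ np.linalg.inv(fim) @ jac.conj().T
    return np.real(np.diag(cov))

M = 3                                             # number of complex coefficients (assumed)
rho, beta, L, sigma2 = 10.0, 0.8 + 0.3j, 16, 1.0  # hypothetical values
fim = 2 * rho * abs(beta) ** 2 * L / sigma2 * np.eye(2 * M)

# g(eta) = Re{t} + j Im{t}, so the Jacobian with respect to [Re{t}, Im{t}] is [I, jI].
jac = np.hstack([np.eye(M), 1j * np.eye(M)])

crlb = crlb_transformed(fim, jac)
expected = sigma2 / (rho * abs(beta) ** 2 * L)
```

The real and imaginary parts each contribute $\sigma_{\mathrm{n}}^{2}/(2\rho_{\mathrm{c}}|\beta_{\mathrm{d}}|^{2}L_{\mathrm{dr}})$, so their sum reproduces the $\sigma_{\mathrm{n}}^{2}/(\rho_{\mathrm{c}}|\beta_{\mathrm{d}}|^{2}L_{\mathrm{dr}})$ entry of (42).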

Lemma 2 (Transformation of the Fisher information matrix).

By dividing the variables into two parts and defining the vectors $\boldsymbol{\eta}_{1}=[\Re\{\tilde{\mathbf{u}}_{1}^{T}\},\Im\{\tilde{\mathbf{u}}_{1}^{T}\},\Re\{\mathbf{t}_{1}^{T}\},\Im\{\mathbf{t}_{1}^{T}\}]^{T}$ and $\boldsymbol{\eta}_{2}=[\Re\{\mathbf{u}_{2}^{T}\},\Im\{\mathbf{u}_{2}^{T}\},\Re\{\mathbf{t}_{2}^{T}\},\Im\{\mathbf{t}_{2}^{T}\},\Re\{\mathbf{h}_{\alpha}^{T}\},\Im\{\mathbf{h}_{\alpha}^{T}\},\boldsymbol{\Theta}^{T},\boldsymbol{\Phi}^{T}]^{T}$, the Fisher information matrix $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta})$ can be further denoted as

\boldsymbol{\mathcal{I}}(\boldsymbol{\eta})=\mathrm{blkdiag}[\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1}),\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})],

(38)

where $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{i})$ denotes the Fisher information matrix of $\boldsymbol{\eta}_{i}$ , $\forall i\in\{1,2\}$ .

Proof:

The complete proof is presented in Appendix E of Supplementary Material. ∎
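A consequence of the block-diagonal structure in (38) is that the inverse Fisher information matrix is also block diagonal, so the bound contains no cross terms between the digital-chain and analog-chain parameters. The following sketch illustrates this with two random positive-definite blocks; the block sizes and the helper `random_spd` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_spd(n):
    # Random symmetric positive-definite matrix (illustrative stand-in for an FIM block).
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

n1, n2 = 4, 6
I1, I2 = random_spd(n1), random_spd(n2)
fim = np.block([[I1, np.zeros((n1, n2))],
                [np.zeros((n2, n1)), I2]])

bound = np.linalg.inv(fim)
cross = bound[:n1, n1:]          # cross terms between the two parameter groups
```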

Thus, to derive the closed-form expressions of the CRLB of $\boldsymbol{\eta}_{\mathrm{ut}}$ , we first derive the closed-form expression of $\mathcal{I}(\boldsymbol{\eta}_{1})$ denoted as follows.

Lemma 3 (Closed-form expression of $\mathcal{I}(\boldsymbol{\eta}_{1})$ ).

The closed-form expression can be given by

\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1})=\mathrm{blkdiag}(\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,1}),\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,2})),

(39)

where $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,1})=\lim_{\gamma\rightarrow 0}2\gamma^{-1}\sum_{p=1}^{P_{\mathrm{dr}}}\|\mathbf{x}_{\mathrm{tn},p}\|^{2}\mathbf{I}_{2M_{\mathrm{r}}-2}$ , $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,2})=2\rho_{\mathrm{c}}|\beta_{\mathrm{d}}|^{2}L_{\mathrm{dr}}\sigma_{\mathrm{n}}^{-2}\mathbf{I}_{2M_{\mathrm{t}}}$ , $\boldsymbol{\eta}_{1,1}=[\Re\{\tilde{\mathbf{u}}_{1}^{T}\},\Im\{\tilde{\mathbf{u}}_{1}^{T}\}]^{T}$ , and $\boldsymbol{\eta}_{1,2}=[\Re\{\mathbf{t}_{1}^{T}\},\Im\{\mathbf{t}_{1}^{T}\}]^{T}$ .

Proof:

The complete proof is presented in Appendix F of Supplementary Material. ∎

Similarly, to derive the closed-form expression of the CRLB of $\boldsymbol{\eta}_{\mathrm{ut}}$, we derive $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})$ in the following lemma.

Lemma 4 (Closed-form expression of $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})$ ).

The closed-form expression of $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})$ can be given by

\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})=\frac{2}{\sigma_{\mathrm{n}}^{2}}\Re\left\{\boldsymbol{\Upsilon}_{\mathrm{\eta}}^{H}\boldsymbol{\Upsilon}_{\mathrm{\eta}}\right\},

(40)

where $\boldsymbol{\Upsilon}_{\mathrm{\eta}}=[\mathbf{\Gamma}_{\mathrm{t}},j\mathbf{\Gamma}_{\mathrm{t}},\mathbf{\Gamma}_{\mathrm{u}},j\mathbf{\Gamma}_{\mathrm{u}},\mathbf{\Gamma}_{\mathrm{h}},j\mathbf{\Gamma}_{\mathrm{h}},\mathbf{\Gamma}_{\mathrm{\theta}},\mathbf{\Gamma}_{\mathrm{\Phi}}]$, and $\mathbf{\Gamma}_{\mathrm{h}}$, $\mathbf{\Gamma}_{\mathrm{u}}$, and $\mathbf{\Gamma}_{\mathrm{t}}$ are defined in Lemma 1. Further, $\mathbf{\Gamma}_{\mathrm{\theta}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\otimes\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}}\mathbf{H}_{\alpha})\mathbf{E}_{\mathrm{x},N_{\mathrm{t}},K}\mathrm{blkdiag}(\bar{\mathbf{a}}_{\mathrm{t}}(\theta_{1}),\cdots,\bar{\mathbf{a}}_{\mathrm{t}}(\theta_{K}))$, $\bar{\mathbf{a}}_{\mathrm{t}}(\theta_{k})=\mathbf{a}_{\mathrm{t}}(\theta_{k})\circ[0,-j\frac{2\pi d}{\lambda}\cos\theta_{k},\cdots,-j\frac{2\pi d}{\lambda}(N_{\mathrm{t}}-1)\cos\theta_{k}]^{T}$, $\mathbf{E}_{\mathrm{x},N_{\mathrm{t}},K}=\sum_{k=1}^{K}(\mathbf{e}_{k}^{T}\otimes\mathbf{I}_{N_{\mathrm{t}}}\otimes\mathbf{e}_{k})$, $\mathbf{e}_{k}$ is the $k$-th column of $\mathbf{I}_{K}$, $\mathbf{\Gamma}_{\mathrm{\Phi}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\mathbf{A}_{\mathrm{t}}\mathbf{H}_{\alpha}\otimes\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2})\mathrm{blkdiag}(\bar{\mathbf{a}}_{\mathrm{r}}(\phi_{1}),\cdots,\bar{\mathbf{a}}_{\mathrm{r}}(\phi_{K}))$, and $\bar{\mathbf{a}}_{\mathrm{r}}(\phi_{k})=\mathbf{a}_{\mathrm{r}}(\phi_{k})\circ[0,-j\frac{2\pi d}{\lambda}\cos\phi_{k},\cdots,-j\frac{2\pi d}{\lambda}(N_{\mathrm{r}}-1)\cos\phi_{k}]^{T}$.

Proof:

The complete proof is presented in Appendix G of Supplementary Material. ∎
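The vectors $\bar{\mathbf{a}}_{\mathrm{t}}(\theta_{k})$ and $\bar{\mathbf{a}}_{\mathrm{r}}(\phi_{k})$ in Lemma 4 have the form of angle derivatives of array steering vectors. The sketch below checks this against a finite difference and evaluates the structure of (40) for a single-angle toy model. It rests on assumptions that are not stated in this section: an unnormalized uniform-linear-array steering vector with entries $e^{-j\frac{2\pi d}{\lambda}n\sin\theta}$, half-wavelength spacing, and the hypothetical function names `steering`, `steering_derivative`, and `fisher_information`.

```python
import numpy as np

def steering(theta, N, d_over_lambda=0.5):
    # Assumed ULA steering vector: exp(-j 2 pi (d/lambda) n sin(theta)), n = 0..N-1.
    n = np.arange(N)
    return np.exp(-1j * 2 * np.pi * d_over_lambda * n * np.sin(theta))

def steering_derivative(theta, N, d_over_lambda=0.5):
    # Analytic derivative: a(theta) Hadamard [0, -j 2 pi (d/lambda) cos(theta), ...].
    n = np.arange(N)
    return steering(theta, N, d_over_lambda) * (-1j * 2 * np.pi * d_over_lambda * n * np.cos(theta))

def fisher_information(jacobian, sigma2):
    # FIM of real parameters for a complex Gaussian model: (2 / sigma^2) Re{J^H J}.
    return 2.0 / sigma2 * np.real(jacobian.conj().T @ jacobian)

theta, N, eps = 0.3, 8, 1e-6
numeric = (steering(theta + eps, N) - steering(theta - eps, N)) / (2 * eps)
analytic = steering_derivative(theta, N)

# Toy model y = a(theta) + noise: the Jacobian is the single column a_bar(theta).
fim = fisher_information(analytic[:, None], sigma2=1.0)
```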

Based on Lemma 3 and Lemma 4, the CRLB of $\boldsymbol{\eta}_{\mathrm{ut}}$ can be given in the following proposition.

Proposition 5 (CRLB of $\boldsymbol{\eta}_{\mathrm{ut}}$ ).

Based on Definition 1, Lemma 3, and Lemma 4, the closed-form expression of the CRLB of $\boldsymbol{\eta}_{\mathrm{ut}}$ can be given by

\mathrm{CRLB}(\eta_{\mathrm{ut},i})=\left[\boldsymbol{\mathcal{J}}(\boldsymbol{\eta}_{\mathrm{ut}})\right]_{i,i},

(41)

with

\boldsymbol{\mathcal{J}}(\boldsymbol{\eta}_{\mathrm{ut}})=\mathrm{blkdiag}\left(\mathbf{0}_{M_{\mathrm{r}}-1},\frac{\sigma_{\mathrm{n}}^{2}}{\rho_{\mathrm{c}}|\beta_{\mathrm{d}}|^{2}L_{\mathrm{dr}}}\mathbf{I}_{M_{\mathrm{t}}},\frac{\sigma_{\mathrm{n}}^{2}}{2}\boldsymbol{\Pi}\left(\Re\left\{\boldsymbol{\Upsilon}_{\eta}^{H}\boldsymbol{\Upsilon}_{\eta}\right\}\right)^{-1}\boldsymbol{\Pi}^{H}\right),

(42)

Proof:

The complete proof is presented in Appendix H of Supplementary Material. ∎

Remark 6 (CRLB analysis).

From (41) and (42), the CRLB of $u_{1,m}\ (m\in[2:M_{\mathrm{r}}])$ is equal to zero. This is because $u_{1,m}$ is estimated from deterministic signals. The CRLB of $t_{1,m}\ (m\in[1:M_{\mathrm{t}}])$ is given by $\sigma_{\mathrm{n}}^{2}/(\rho_{\mathrm{c}}|\beta_{\mathrm{d}}|^{2}L_{\mathrm{dr}})$, which indicates that increasing the number of pilots $L_{\mathrm{dr}}$ improves the estimation accuracy of $t_{1,m}$.

IV Simulation Results and Discussions

In this section, we will provide simulation results to evaluate the performance of the proposed reciprocity calibration approach for the mmWave-HBF system.

The system parameters are set as follows. The numbers of analog RF chains of the BS and the UE and the number of data streams differ across simulations, while the number of digital RF chains equals a quarter of the number of analog RF chains, i.e., $M_{\mathrm{t}}=N_{\mathrm{t}}/4$ and $M_{\mathrm{r}}=N_{\mathrm{r}}/4$. The path number $K$ of the mmWave channel is set to $4$. The variance $\sigma_{\alpha}^{2}$ of the channel gain $\alpha_{k}$ is set to $1$, and the AoAs and AoDs obey the uniform distribution, i.e., $\{\theta_{k},\phi_{k}\}\sim\mathcal{U}(-\pi/2,\pi/2)$. The amplitudes of the reciprocity mismatch coefficients obey the log-normal distribution, i.e., $\{\ln|t_{i,m}|,\ln|r_{i,m}|,\ln|u_{i,m}|,\ln|v_{i,m}|\}\sim\mathcal{N}(0,0.01)$, and the phases of the reciprocity mismatch coefficients follow the uniform distribution, i.e., $\{\angle t_{i,m},\angle r_{i,m},\angle u_{i,m},\angle v_{i,m}\}\sim\mathcal{U}(-\pi/6,\pi/6)$. Further, in accordance with Proposition 4, the lengths of the pilots are set to $Q_{\mathrm{dr}}=M_{\mathrm{t}}$, $P_{\mathrm{dr}}=1$, and $Q_{\mathrm{da}}=P_{\mathrm{da}}=125$. Finally, the variance of the AWGN is $1$, and $\bar{\rho}_{\mathrm{c}}=\rho_{\mathrm{c}}/\sigma_{\mathrm{n}}^{2}$ denotes the average SNR of the received calibration signals during the simulation.
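The mismatch-coefficient model described above can be sampled as in the following sketch. It is an illustration rather than the authors' simulation code: the function name `sample_mismatch`, the seed, and the choice $(N_{\mathrm{t}},N_{\mathrm{r}})=(32,32)$ are assumptions, and the log-amplitude variance $0.01$ is interpreted as a standard deviation of $0.1$.

```python
import numpy as np

def sample_mismatch(n, rng, log_amp_var=0.01, max_phase=np.pi / 6):
    # Log-normal amplitudes (normal log-amplitude) and uniformly distributed phases.
    amp = np.exp(rng.normal(0.0, np.sqrt(log_amp_var), n))
    phase = rng.uniform(-max_phase, max_phase, n)
    return amp * np.exp(1j * phase)

rng = np.random.default_rng(4)
N_t, N_r = 32, 32                 # analog RF chains at the BS and the UE (one of the tested pairs)
M_t, M_r = N_t // 4, N_r // 4     # digital RF chains: a quarter of the analog chains

t1 = sample_mismatch(M_t, rng)    # transmit digital chains of the BS
t2 = sample_mismatch(N_t, rng)    # transmit analog chains of the BS
u1 = sample_mismatch(M_r, rng)    # receive digital chains of the UE
u2 = sample_mismatch(N_r, rng)    # receive analog chains of the UE
```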

IV-A NMSE of Estimated Mismatch Parameters

To illustrate the performance of the proposed HAC approach, we compare the normalized mean square error (NMSE) of the mismatch coefficients with the CRLB. The NMSE of the mismatch coefficients is defined as $\mathrm{NMSE}(\boldsymbol{\eta}_{\mathrm{ut}})=\mathbb{E}\left\{\|\boldsymbol{\eta}_{\mathrm{ut}}-\hat{\boldsymbol{\eta}}_{\mathrm{ut}}\|_{\mathrm{F}}^{2}/\|\boldsymbol{\eta}_{\mathrm{ut}}\|_{\mathrm{F}}^{2}\right\}$. It is worth noting that the Oracle HAC represents the reciprocity calibration with perfect knowledge of the AoAs and AoDs, which serves as a performance benchmark for the proposed HAC.
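The NMSE definition above can be evaluated empirically by replacing the expectation with an average over Monte Carlo trials. The sketch below is a minimal illustration with hypothetical data; the function name `nmse` and the two-trial example are assumptions made only to show the computation.

```python
import numpy as np

def nmse(true_trials, est_trials):
    # Average over trials of ||eta - eta_hat||^2 / ||eta||^2.
    ratios = [np.linalg.norm(t - e) ** 2 / np.linalg.norm(t) ** 2
              for t, e in zip(true_trials, est_trials)]
    return float(np.mean(ratios))

# Two hypothetical trials: a perfect estimate and one with a unit error on a unit-norm vector.
true_trials = [np.array([1.0 + 0j, 0.0]), np.array([1.0 + 0j, 0.0])]
est_trials = [np.array([1.0 + 0j, 0.0]), np.array([1.0 + 0j, 1.0])]
value = nmse(true_trials, est_trials)   # (0 + 1) / 2
```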

Fig. 6 demonstrates the NMSE of the mismatch coefficients versus the SNR $\bar{\rho}_{\mathrm{c}}$ of the calibration signals, where the antenna numbers of the BS and the UE are set to three different configurations given by $(N_{\mathrm{t}},N_{\mathrm{r}})\in\{(32,32),(64,32),(128,64)\}$. It can be seen that the NMSE of the proposed HAC gradually reaches a floor as the calibration SNR increases. This is because the solution to the nonconvex problem gets stuck in local optima. Further, the figure also shows that the floor effect is alleviated when the antenna number increases, because the independence between the array steering vectors increases with the antenna number. Besides, we find that the NMSE increases with the antenna number. This result indicates that a system with more antennas requires a higher calibration SNR to guarantee the same calibration performance as a system with fewer antennas.

Then, the NMSE of the mismatch coefficients versus the length of calibration pilots is illustrated in Fig. 6 with the SNR of calibration signals set to $\bar{\rho}_{\mathrm{c}}=10$ dB. From the figure, it can be found that the NMSE and CRLB decrease with the increase of calibration pilots, which is consistent with the theoretical results shown in Proposition 5. This result indicates that better performance of the proposed HAC can be achieved at the cost of overhead or power. Also, the curves of the proposed HAC gradually approach floors when the length of pilots increases, while the curves of the Oracle HAC gradually converge to the CRLB. Increasing the antenna number can reduce the floor effect.

IV-B NMSE of Channel Estimation

To examine the efficacy of the reciprocity calibration in mmWave-HBF systems, we study the NMSE of the uplink channel estimation by using the two-dimensional MUSIC algorithm proposed in [28], and the pilot block is set to $40$. The NMSE of the estimated channel is defined as

\mathrm{NMSE}(\mathbf{H}_{\mathrm{UL}})=\mathbb{E}\left\{\|\hat{\mathbf{H}}_{\mathrm{UL}}-\mathbf{H}_{\mathrm{UL}}\|_{\mathrm{F}}^{2}/\|\mathbf{H}_{\mathrm{UL}}\|_{\mathrm{F}}^{2}\right\},

(43)

where $\hat{\mathbf{H}}_{\mathrm{UL}}$ denotes the channel estimated from the uplink pilots. It is worth noting that "Perfect Cal." denotes that the mismatch coefficients $\mathbf{U}_{1}$, $\mathbf{T}_{1}$, $\mathbf{U}_{2}$, and $\mathbf{T}_{2}$ are known perfectly, and "Without Cal." represents that the mismatch coefficients are completely unknown.

Fig. 8 demonstrates the NMSE of the estimated uplink channel versus the SNR $\bar{\rho}_{\mathrm{t}}$ of the training signals for the channel estimation, where the SNR $\bar{\rho}_{\mathrm{c}}$ of the calibration signals is set to $10$ dB, $20$ dB, and $30$ dB. It can be seen that the NMSE of the perfect calibration decreases with the increase of the training SNR, while the NMSE of the uncalibrated case remains almost constant. The NMSE of the channel estimation with the proposed HAC also decreases and gradually reaches a floor as the training SNR increases. The floor effect is caused by the estimation error of the mismatch coefficients. When the SNR of the calibration signals increases, the curves of the proposed HAC approach the curve of the perfect calibration, because the estimation error of the mismatch coefficients decreases. Further, since the NMSE of HAC is much less than the NMSE of the uncalibrated case, the proposed HAC can significantly improve the performance of the mmWave-HBF system.

IV-C Achievable Rate of Downlink Transmission

To further examine the efficacy of the reciprocity calibration in mmWave-HBF systems, we study the achievable rate of the downlink transmission. During the downlink transmission, the transmit analog beamforming matrix is set to $\mathbf{F}_{\mathrm{t}}=\angle\bar{\mathbf{V}}_{\mathrm{a}}$, and the receive analog beamforming matrix is set to $\mathbf{B}_{\mathrm{r}}=\angle\bar{\mathbf{U}}_{\mathrm{a}}^{*}$, where $\bar{\mathbf{V}}_{\mathrm{a}}$ consists of the first $M_{\mathrm{t}}$ columns of $\mathbf{V}_{\mathrm{a}}$, $\bar{\mathbf{U}}_{\mathrm{a}}$ consists of the first $M_{\mathrm{r}}$ columns of $\mathbf{U}_{\mathrm{a}}$, and $\mathbf{V}_{\mathrm{a}}$ and $\mathbf{U}_{\mathrm{a}}$ are obtained from the SVD of $\hat{\mathbf{H}}_{\mathrm{DL}}=\hat{\mathbf{U}}_{2}\mathbf{H}^{T}\hat{\mathbf{T}}_{2}$, i.e., $\hat{\mathbf{H}}_{\mathrm{DL}}=\mathbf{U}_{\mathrm{a}}\boldsymbol{\Sigma}_{\mathrm{a}}\mathbf{V}_{\mathrm{a}}^{H}$. The digital precoding and the digital receiver are set to $\mathbf{W}_{\mathrm{t}}=\bar{\mathbf{V}}_{\mathrm{d}}$ and $\mathbf{D}_{\mathrm{r}}=\bar{\mathbf{U}}_{\mathrm{d}}$, where $\bar{\mathbf{V}}_{\mathrm{d}}$ and $\bar{\mathbf{U}}_{\mathrm{d}}$ consist of the first $N_{\mathrm{s}}$ columns of $\mathbf{V}_{\mathrm{d}}$ and $\mathbf{U}_{\mathrm{d}}$, respectively, and $\mathbf{V}_{\mathrm{d}}$ and $\mathbf{U}_{\mathrm{d}}$ are obtained from the SVD of the equivalent downlink channel $\mathbf{H}_{\mathrm{eq}}=\mathbf{U}_{1}\mathbf{B}_{\mathrm{r}}^{T}\bar{\mathbf{H}}_{\mathrm{DL}}\mathbf{F}_{\mathrm{t}}\mathbf{T}_{1}$. Thus, based on the downlink transmission model (5), the sum achievable rate can be denoted as

R_{\mathrm{DL}}=\sum_{n_{\mathrm{s}}=1}^{N_{\mathrm{s}}}\mathbb{E}\left\{\log\left(1+\frac{\rho_{\mathrm{d}}|\bar{h}_{\mathrm{eq},n_{\mathrm{s}},n_{\mathrm{s}}}|^{2}}{\rho_{\mathrm{d}}\sum_{i\neq n_{\mathrm{s}}}^{N_{\mathrm{s}}}|\bar{h}_{\mathrm{eq},n_{\mathrm{s}},i}|^{2}+\bar{\sigma}_{\mathrm{n}}^{2}}\right)\right\},

(44)

where $N_{\mathrm{s}}$ denotes the number of data streams, $\bar{\sigma}_{\mathrm{n}}^{2}=\|\mathbf{d}_{\mathrm{r},n_{\mathrm{s}}}^{T}\mathbf{U}_{1}\mathbf{B}_{\mathrm{r}}^{T}\|_{\mathrm{F}}^{2}\sigma_{\mathrm{n}}^{2}$, $\bar{h}_{\mathrm{eq},n_{\mathrm{s}},i}$ denotes the $i$-th entry of $\bar{\mathbf{h}}_{\mathrm{eq},n_{\mathrm{s}}}$, and $\bar{\mathbf{h}}_{\mathrm{eq},n_{\mathrm{s}}}$ is the $n_{\mathrm{s}}$-th row of $\mathbf{D}_{\mathrm{r}}^{T}\mathbf{U}_{1}\mathbf{B}_{\mathrm{r}}^{T}\mathbf{H}_{\mathrm{DL}}\mathbf{F}_{\mathrm{t}}\mathbf{T}_{1}\mathbf{W}_{\mathrm{t}}$.
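The rate expression (44) can be evaluated for a single channel realization as in the sketch below; averaging the result over realizations approximates the expectation. The code assumes a base-2 logarithm (rates in bits per channel use), takes the $N_{\mathrm{s}}\times N_{\mathrm{s}}$ effective channel as an input, and uses the hypothetical function name `sum_rate`; the small matrices are illustrative values only.

```python
import numpy as np

def sum_rate(H_eff, rho_d, noise_var):
    # H_eff[n, i]: effective gain from stream i to receiver output n.
    # noise_var: per-stream effective noise power (scalar or length-N_s array).
    power = np.abs(H_eff) ** 2
    signal = rho_d * np.diag(power)
    interference = rho_d * (power.sum(axis=1) - np.diag(power))
    sinr = signal / (interference + noise_var)
    return float(np.sum(np.log2(1.0 + sinr)))

# Interference-free case: the rate is the sum of the per-stream log terms.
rate_diag = sum_rate(np.diag([1.0, 2.0]), rho_d=3.0, noise_var=1.0)

# Residual multi-stream interference on the first stream lowers its SINR to 1/2.
rate_interf = sum_rate(np.array([[1.0, 1.0], [0.0, 1.0]]), rho_d=1.0, noise_var=1.0)
```

The second example shows the mechanism discussed for the uncalibrated system: off-diagonal entries of the effective channel act as interference that scales with the transmit power.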

The sum achievable rate of the downlink transmission with reciprocity calibration versus the downlink transmission SNR $\bar{\rho}_{\mathrm{d}}$ is illustrated in Fig. 8. From the figure, it can be observed that the systems with perfect calibration, HAC, CRC, and no calibration achieve the same performance in the low-SNR regime. This is because the impact of the noise is much larger than that of the reciprocity mismatch when the transmission SNR is small. In the high-SNR regime, the achievable rate of the uncalibrated system saturates at an upper limit, which is caused by multi-stream interference. The achievable rate of the system applying HAC also reaches an upper limit when the transmit SNR increases. For the perfectly calibrated system, the achievable rate increases linearly with the logarithm of the transmit SNR. This is because the multi-stream interference can be completely mitigated by hybrid beamforming with perfect knowledge of the mismatch coefficients. Further, we can also find that the achievable rate of HAC with $\bar{\rho}_{\mathrm{c}}=30$ dB is almost twice that of the uncalibrated case. This result implies that the reciprocity calibration can significantly improve the system performance, as expected. Besides, the achievable rate of the system using HAC is larger than that of the system using CRC, which indicates that the proposed HAC outperforms the CRC.

Fig. 10 demonstrates the sum achievable rate versus the SNR of the calibration signals, where the number $N_{\mathrm{s}}$ of data streams is set to $2$ and $4$, and the SNR $\bar{\rho}_{\mathrm{d}}$ of the downlink transmission signals is set to $30$ dB. It can be found that the sum achievable rate increases with the SNR of the calibration signals. This is because the estimation error decreases with the increase of the calibration SNR, and the power of the interference decreases with the estimation error of the mismatch coefficients. The gaps between the rates of the system using the proposed HAC and those of the perfectly calibrated system are much smaller than the gaps between the rates of the system using CRC and those of the perfectly calibrated system. Further, when $N_{\mathrm{s}}=2$, the curve of HAC approaches the curve of the perfect calibration more quickly. This result indicates that a system transmitting more data streams requires a higher calibration SNR to have the same performance loss.

Finally, we examine the sum achievable rate versus the length of the calibration pilots illustrated in Fig. 10, where the SNR $\bar{\rho}_{\mathrm{d}}$ of the downlink transmission signals is set to $20$ dB and $30$ dB, and the SNR of the calibration signals is set to $10$ dB. From the figure, it can be observed that the achievable rate increases with the length of the calibration pilots, because the estimation error of the mismatch coefficients decreases as the number of calibration pilots increases. Further, the system with a higher transmit SNR requires smaller calibration errors to guarantee the same performance loss.

V Conclusion

In this paper, we have proposed a hierarchical-absolute reciprocity calibration for the mmWave-HBF system with a fully-connected phase shifter network. By proposing a specific beamforming design, the reciprocity calibration of the HBF system has been decoupled into the reciprocity calibration of the digital RF chains and that of the analog RF chains. Based on the decoupling, the entire reciprocity calibration problem of the HBF system has been equivalently decomposed into two subproblems corresponding to the reciprocity calibrations of the digital and analog RF chains. Theoretical analysis has revealed that the overhead and computational complexity of the proposed HAC are much smaller than those of the conventional reciprocity calibration of HBF systems due to the decoupling. Further, based on the proposed calibration approach, we have derived the CRLB of the mismatch coefficients, which indicates that the estimation errors of the mismatch coefficients of the digital and analog RF chains are uncorrelated, and that the mismatch coefficients of the receive digital chains can be estimated perfectly. Finally, simulation results have demonstrated that the proposed HAC significantly improves the system performance and outperforms the conventional calibration.

Appendix A Proof of Proposition 2

Based on the specific design of the digital/analog beamforming matrices, when $l\leq L_{\mathrm{dr}}$ , the received signal in (13) can be rewritten as

\begin{split}\mathbf{y}_{\mathrm{d},l}&=\frac{1}{\sqrt{M_{\mathrm{t}}}}\mathbf{u}_{1}\mathbf{b}_{\mathrm{dr}}^{T}\mathbf{H}_{\mathrm{DL}}\mathbf{f}_{\mathrm{dr}}\mathbf{t}_{1}^{T}\mathbf{x}_{q}+\mathbf{u}_{1}\mathbf{b}_{\mathrm{dr}}^{T}\mathbf{n}_{\mathrm{d},l}\\ &=\beta_{\mathrm{d}}\mathbf{u}_{1}\mathbf{t}_{1}^{T}\mathbf{x}_{q}+\mathbf{u}_{1}\mathbf{b}_{\mathrm{dr}}^{T}\mathbf{n}_{\mathrm{d},l},\end{split}

(45)

where $\beta_{\mathrm{d}}=\frac{1}{\sqrt{M_{\mathrm{t}}}}\mathbf{b}_{\mathrm{dr}}^{T}\mathbf{H}_{\mathrm{DL}}\mathbf{f}_{\mathrm{dr}}$ , $\mathbf{u}_{1}$ consists of the diagonal entries of $\mathbf{U}_{1}$ , and $\mathbf{t}_{1}$ is composed of the diagonal entries of $\mathbf{T}_{1}$ . By stacking all $L_{\mathrm{dr}}$ -length signals $\mathbf{y}_{\mathrm{d},l}$ into the matrix form, the received signals can be further denoted as

\mathbf{Y}_{\mathrm{dr}}=\beta_{\mathrm{d}}(\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{u}_{1})\mathbf{t}_{1}^{T}\mathbf{X}_{\mathrm{dr}}+(\mathbf{I}_{P_{\mathrm{dr}}}\otimes\mathbf{u}_{1}\mathbf{b}_{\mathrm{dr}}^{T})\mathbf{N}_{\mathrm{dr}},

(46)

Similarly, when $l>L_{\mathrm{dr}}$ , by substituting the designed beamforming matrices into (13), the received signal can be rewritten as

\mathbf{y}_{\mathrm{d},l}=\mathbf{D}_{\mathrm{da},p}^{T}\mathbf{U}_{1}\mathbf{B}_{\mathrm{da},p}^{T}\mathbf{H}_{\mathrm{DL}}\mathbf{F}_{\mathrm{da},q}\mathbf{T}_{1}\mathbf{W}_{\mathrm{da},q}\mathbf{x}_{q}+\tilde{\mathbf{n}}_{\mathrm{d},l}\overset{(a)}{=}\left[\begin{matrix}u_{1,1}t_{1,1}\mathbf{b}_{\mathrm{da},p,1}^{T}\mathbf{H}_{\mathrm{DL}}\mathbf{f}_{\mathrm{da},q,1}x_{q,1}+\tilde{n}_{\mathrm{d},l}\\ \mathbf{0}_{M_{\mathrm{r}}-1}\end{matrix}\right],

(47)

which indicates that only the signal $y_{\mathrm{d},l,1}$ received by the first receive digital RF chain is valid. By stacking all $L_{\mathrm{da}}$-length signals $y_{\mathrm{d},l,1}$ into matrix form, the received signal can be further denoted as

\mathbf{Y}_{\mathrm{da}}=u_{1,1}t_{1,1}\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{H}_{\mathrm{DL}}\bar{\mathbf{F}}_{\mathrm{da}}\mathbf{X}_{\mathrm{da}}+\mathbf{N}_{\mathrm{da}},

(48)

where $\mathbf{Y}_{\mathrm{da}}=[\mathbf{y}_{\mathrm{da},1}^{T},\cdots,\mathbf{y}_{\mathrm{da},P_{\mathrm{da}}}^{T}]^{T}$, $\mathbf{y}_{\mathrm{da},p}=[y_{\mathrm{d},(p-1)Q_{\mathrm{da}}+1,1},\cdots,y_{\mathrm{d},pQ_{\mathrm{da}},1}]$, $\bar{\mathbf{B}}_{\mathrm{da}}=[\mathbf{b}_{\mathrm{da},1,1},\cdots,\mathbf{b}_{\mathrm{da},P_{\mathrm{da}},1}]$, $\bar{\mathbf{F}}_{\mathrm{da}}=[\mathbf{f}_{\mathrm{da},1,1},\cdots,\mathbf{f}_{\mathrm{da},Q_{\mathrm{da}},1}]$, $\mathbf{X}_{\mathrm{da}}=\mathrm{diag}(x_{1,1},\cdots,x_{Q_{\mathrm{da}},1})$, $\mathbf{N}_{\mathrm{da}}=\mathrm{blkdiag}(\mathbf{b}_{\mathrm{da},1,1}^{T},\cdots,\mathbf{b}_{\mathrm{da},P_{\mathrm{da}},1}^{T})[\bar{\mathbf{N}}_{\mathrm{da},1}^{T},\cdots,\bar{\mathbf{N}}_{\mathrm{da},P_{\mathrm{da}}}^{T}]^{T}$, and $\bar{\mathbf{N}}_{\mathrm{da},p}=[\mathbf{n}_{\mathrm{d},(p-1)Q_{\mathrm{da}}+1},\cdots,\mathbf{n}_{\mathrm{d},pQ_{\mathrm{da}}}]$.

By substituting (46) and (48) into (14), the optimization problem can be rewritten as

\min_{\mathbf{u}_{\mathrm{1}},\mathbf{t}_{\mathrm{1}},\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H}}\quad\underbrace{\left\|\mathbf{Y}_{\mathrm{dr}}-\beta_{\mathrm{d}}(\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{u}_{1})\mathbf{t}_{1}^{T}\mathbf{X}_{\mathrm{dr}}\right\|_{\mathrm{F}}^{2}}_{f(\beta_{\mathrm{d}},\mathbf{u}_{1},\mathbf{t}_{1})}+\underbrace{\left\|\mathbf{Y}_{\mathrm{da}}-u_{1,1}t_{1,1}\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{H}_{\mathrm{DL}}\bar{\mathbf{F}}_{\mathrm{da}}\mathbf{X}_{\mathrm{da}}\right\|_{\mathrm{F}}^{2}}_{g(u_{1,1},t_{1,1},\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H})}.

(49)

From (49), it can be observed that the functions $f(\beta_{\mathrm{d}},\mathbf{u}_{1},\mathbf{t}_{1})$ and $g(u_{1,1},t_{1,1},\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H})$ are coupled to each other through the variables $\beta_{\mathrm{d}}$ , $u_{1,1}$ , and $t_{1,1}$ . Fortunately, any vector $\hat{\mathbf{t}}_{1}$ parallel to the mismatch coefficient vector $\mathbf{t}_{1}$ can be exploited to calibrate the reciprocity mismatch of digital RF chains. This fact indicates that the unknown variable $\beta_{\mathrm{d}}$ in the function $f(\beta_{\mathrm{d}},\mathbf{u}_{1},\mathbf{t}_{1})$ can be regarded as a scale factor of $\mathbf{t}_{1}$ or $\mathbf{u}_{1}$ . Similarly, since any vector $\hat{\mathbf{u}}_{2}$ parallel to the mismatch coefficient vector $\mathbf{u}_{2}$ can be used to calibrate the reciprocity mismatch of analog RF chains, the unknown variable $u_{1,1}t_{1,1}$ in the function $g(u_{1,1},t_{1,1},\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H})$ can be also treated as a scale factor of $\mathbf{U}_{2}$ or $\mathbf{T}_{2}$ . Therefore, (49) can be equivalently written as

\min_{\mathbf{u}_{\mathrm{1}},\mathbf{t}_{\mathrm{1}},\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H}}\quad\underbrace{\left\|\mathbf{Y}_{\mathrm{dr}}-(\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{u}_{1})\mathbf{t}_{1}^{T}\mathbf{X}_{\mathrm{dr}}\right\|_{\mathrm{F}}^{2}}_{\bar{f}(\mathbf{u}_{1},\mathbf{t}_{1})}+\underbrace{\left\|\mathbf{Y}_{\mathrm{da}}-\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}\mathbf{H}^{T}\mathbf{T}_{2}\bar{\mathbf{F}}_{\mathrm{da}}\mathbf{X}_{\mathrm{da}}\right\|_{\mathrm{F}}^{2}}_{\bar{g}(\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H})}.

(50)

It is worth noting that the solutions to the problem (50) are also the solutions to the problem (49). Further, since $\bar{f}(\mathbf{u}_{1},\mathbf{t}_{1})$ and $\bar{g}(\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H})$ are independent of each other, the minimum sum of these two functions is equal to the sum of the minimum of each function, i.e., $\min\{\bar{f}(\mathbf{u}_{1},\mathbf{t}_{1})+\bar{g}(\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H})\}=\min\{\bar{f}(\mathbf{u}_{1},\mathbf{t}_{1})\}+\min\{\bar{g}(\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H})\}$ . Consequently, the problem (50) can be decoupled into two independent subproblems denoted as

\begin{split}&\min_{\ \ \mathbf{u}_{\mathrm{1}},\mathbf{t}_{\mathrm{1}}\ }\quad\bar{f}(\mathbf{u}_{1},\mathbf{t}_{1}),\\ &\min_{\mathbf{T}_{2},\mathbf{U}_{2},\mathbf{H}}\quad\bar{g}(\mathbf{U}_{2},\mathbf{T}_{2},\mathbf{H}).\end{split}

(51)

Thus, Proposition 2 holds.

Appendix B Proof of Proposition 3

By substituting $\mathbf{x}_{\mathrm{dt},p}$ in (19) and $u_{1,1}=1$ into (15), the objective of the problem $\mathcal{P}_{1}$ can be further denoted as

\begin{split}\bar{f}(\mathbf{u}_{1},\mathbf{t}_{1})=&\sum_{p=1}^{P_{\mathrm{dr}}}\Bigl{[}\left\|\mathbf{y}_{\mathrm{dr},(p-1)M_{\mathrm{r}}+1}^{T}-c_{\mathrm{dr}}\mathbf{X}_{\mathrm{dr}}^{T}\mathbf{t}_{1}\right\|_{\mathrm{F}}^{2}+\left\|\tilde{\mathbf{Y}}_{\mathrm{dr},p}-\frac{1}{c_{\mathrm{dr}}}\tilde{\mathbf{u}}_{1}\mathbf{y}_{\mathrm{dr},(p-1)M_{\mathrm{r}}+1}\right\|_{\mathrm{F}}^{2}\Bigr{]}\\ &=\check{f}(\tilde{\mathbf{u}}_{1},\mathbf{t}_{1})\end{split}

(52)

where $\tilde{\mathbf{Y}}_{\mathrm{dr},p}$ consists of the second through the last rows of $\bar{\mathbf{Y}}_{\mathrm{dr},p}$ , and $\tilde{\mathbf{u}}_{1}=[u_{1,2},\cdots,u_{1,M_{\mathrm{r}}}]^{T}$ . By defining $\mathbf{y}_{\mathrm{dt}}=[\mathbf{y}_{\mathrm{dr},1},\cdots,\mathbf{y}_{\mathrm{dr},(P_{\mathrm{dr}}-1)M_{\mathrm{r}}+1}]^{T}$ , $\tilde{\mathbf{y}}_{\mathrm{dr}}=[\mathrm{vec}(\tilde{\mathbf{Y}}_{\mathrm{dr},1})^{T},\cdots,\mathrm{vec}(\tilde{\mathbf{Y}}_{\mathrm{dr},P_{\mathrm{dr}}})^{T}]^{T}$ , and $\check{\mathbf{Y}}_{\mathrm{dr}}=[(\mathbf{y}_{\mathrm{dr},1}\otimes\mathbf{I}_{M_{\mathrm{r}}-1}),\cdots,(\mathbf{y}_{\mathrm{dr},(P_{\mathrm{dr}}-1)M_{\mathrm{r}}+1}\otimes\mathbf{I}_{M_{\mathrm{r}}-1})]^{T}$ , $\check{f}(\tilde{\mathbf{u}}_{1},\mathbf{t}_{1})$ can be rewritten in the matrix form

\check{f}(\tilde{\mathbf{u}}_{1},\mathbf{t}_{1})=\left\|\mathbf{y}_{\mathrm{dt}}-c_{\mathrm{dr}}(\mathbf{1}_{P_{\mathrm{\mathrm{dr}}}}\otimes\mathbf{X}_{\mathrm{dr}}^{T})\mathbf{t}_{1}\right\|_{\mathrm{F}}^{2}+\left\|\tilde{\mathbf{y}}_{\mathrm{dr}}-\frac{1}{c_{\mathrm{dr}}}\check{\mathbf{Y}}_{\mathrm{dr}}\tilde{\mathbf{u}}_{1}\right\|_{\mathrm{F}}^{2}.

(53)

Then, $\mathcal{P}_{1}$ can be equivalently transformed into

\mathcal{P}_{1.2}:\min_{\tilde{\mathbf{u}}_{1},\mathbf{t}_{1}}\quad\check{f}(\tilde{\mathbf{u}}_{1},\mathbf{t}_{1}).

(54)

Since the objective $\check{f}(\tilde{\mathbf{u}}_{1},\mathbf{t}_{1})$ of $\mathcal{P}_{1.2}$ is the sum of two convex functions, $\mathcal{P}_{1.2}$ is an unconstrained convex problem. Thus, to solve for $\tilde{\mathbf{u}}_{1}$ , we take the partial derivative of $\check{f}(\tilde{\mathbf{u}}_{1},\mathbf{t}_{1})$ with respect to $\tilde{\mathbf{u}}_{1}$ as follows

\frac{\partial\check{f}(\tilde{\mathbf{u}}_{1},\mathbf{t}_{1})}{\partial\tilde{\mathbf{u}}_{1}}=\frac{\partial\left\|\tilde{\mathbf{y}}_{\mathrm{dr}}-\frac{1}{c_{\mathrm{dr}}}\check{\mathbf{Y}}_{\mathrm{dr}}\tilde{\mathbf{u}}_{1}\right\|_{\mathrm{F}}^{2}}{\partial\tilde{\mathbf{u}}_{1}}=-\check{\mathbf{Y}}_{\mathrm{dr}}^{T}(\tilde{\mathbf{y}}_{\mathrm{dr}}-\frac{1}{c_{\mathrm{dr}}}\check{\mathbf{Y}}_{\mathrm{dr}}\tilde{\mathbf{u}}_{1})^{*}.

(55)

By setting the derivative equal to zero, the solution for $\tilde{\mathbf{u}}_{1}$ can be given by

\tilde{\mathbf{u}}_{1}\overset{(b)}{=}c_{\mathrm{dr}}(\check{\mathbf{Y}}_{\mathrm{dr}}^{H}\check{\mathbf{Y}}_{\mathrm{dr}})^{-1}\check{\mathbf{Y}}_{\mathrm{dr}}^{H}\tilde{\mathbf{y}}_{\mathrm{dr}},

(56)

where the condition $(b)$ holds when $\check{\mathbf{Y}}_{\mathrm{dr}}$ is a full column rank matrix.

Similarly, by taking the partial derivative of $\check{f}(\tilde{\mathbf{u}}_{1},\mathbf{t}_{1})$ with respect to $\mathbf{t}_{1}$ and setting the partial derivative to zero, we can obtain the solution to $\mathbf{t}_{1}$ denoted as follows

\begin{split}\hat{\mathbf{t}}_{1}&\overset{(c)}{=}\frac{1}{c_{\mathrm{dr}}}\bigl{[}(\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{X}_{\mathrm{dr}}^{T})^{H}(\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{X}_{\mathrm{dr}}^{T})\bigr{]}^{-1}(\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{X}_{\mathrm{dr}}^{T})^{H}\mathbf{y}_{\mathrm{dt}}\\ &=\frac{1}{c_{\mathrm{dr}}P_{\mathrm{dr}}}\Bigl{[}\mathbf{1}_{P_{\mathrm{dr}}}^{T}\otimes\bigl{(}\mathbf{X}_{\mathrm{dr}}^{*}\mathbf{X}_{\mathrm{dr}}^{T}\bigr{)}^{-1}\mathbf{X}_{\mathrm{dr}}^{*}\Bigr{]}\mathbf{y}_{\mathrm{dt}},\end{split}

(57)

where the condition $(c)$ holds when $(\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{X}_{\mathrm{dr}}^{T})$ is full column rank. Thus, Proposition 3 holds.
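As a sanity check on (57), the following minimal sketch assumes orthogonal pilots with $\mathbf{X}_{\mathrm{dr}}^{*}\mathbf{X}_{\mathrm{dr}}^{T}=Q_{\mathrm{dr}}\mathbf{I}_{M_{\mathrm{t}}}$ (rows of a DFT matrix), so that the inverse in (57) collapses to a scalar and the estimator reduces to averaging the matched-filter outputs over the $P_{\mathrm{dr}}$ blocks. The pilot choice, the dimensions, the value of $c_{\mathrm{dr}}$ , the noise-free observation model, and all function names are illustrative assumptions and are not taken from the paper.

```python
import cmath

def dft_pilots(m_t, q):
    """Rows of a Q-point DFT matrix, X[m][k] = exp(-2j*pi*m*k/Q), so X^* X^T = Q I."""
    return [[cmath.exp(-2j * cmath.pi * m * k / q) for k in range(q)]
            for m in range(m_t)]

def observe(x, t1, c_dr):
    """Noise-free block y = c_dr * X^T t1 (a length-Q list); assumed model."""
    m_t, q = len(x), len(x[0])
    return [c_dr * sum(x[m][k] * t1[m] for m in range(m_t)) for k in range(q)]

def estimate_t1(x, blocks, c_dr):
    """Eq. (57) specialised to X^* X^T = Q I: scaled average of X^* y_p."""
    m_t, q = len(x), len(x[0])
    scale = 1.0 / (c_dr * len(blocks) * q)
    return [scale * sum(x[m][k].conjugate() * y[k]
                        for y in blocks for k in range(q))
            for m in range(m_t)]

# Hypothetical example values (not from the paper).
M_T, Q_DR, P_DR, C_DR = 3, 4, 2, 0.5
t1_true = [1.0 + 0.2j, 0.9 - 0.1j, 1.1 + 0.05j]

X = dft_pilots(M_T, Q_DR)
Y = [observe(X, t1_true, C_DR) for _ in range(P_DR)]
t1_hat = estimate_t1(X, Y, C_DR)
max_err = max(abs(a - b) for a, b in zip(t1_hat, t1_true))
```

In this noise-free setting `max_err` is at the level of floating-point rounding. With non-orthogonal pilots, the general form of (57) with the explicit inverse would be needed instead.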

References

[1] Erik G. Larsson, Ove Edfors, Fredrik Tufvesson and Thomas L. Marzetta “Massive MIMO for next Generation Wireless Systems” In IEEE Commun. Mag. 52.2, 2014, pp. 186–195 DOI: 10.1109/MCOM.2014.6736761
[2] Jose Flordelis et al. “Massive MIMO Performance—TDD Versus FDD: What Do Measurements Say?” In IEEE Trans. Wireless Commun. 17.4, 2018, pp. 2247–2261 DOI: 10.1109/TWC.2018.2790912
[3] Chuanqiang Shan, Li Chen, Xiaohui Chen and Weidong Wang “A General Matched Filter Design for Reciprocity Calibration in Multiuser Massive MIMO Systems” In IEEE Trans. Veh. Technol. 67.9, 2018, pp. 8939–8943 DOI: 10.1109/TVT.2018.2839591
[4] Rongjiang Nie et al. “Impact and Calibration of Nonlinear Reciprocity Mismatch in Massive MIMO Systems” In IEEE Trans. Wireless Commun. 20.10, 2021, pp. 6418–6435 DOI: 10.1109/TWC.2021.3073949
[5] Wence Zhang et al. “Large-Scale Antenna Systems With UL/DL Hardware Mismatch: Achievable Rates Analysis and Calibration” In IEEE Trans. Commun. 63.4, 2015, pp. 1216–1229 DOI: 10.1109/TCOMM.2015.2395432
[6] Hao Wei, Dongming Wang, Jiangzhou Wang and Xiaohu You “Impact of RF Mismatches on the Performance of Massive MIMO Systems with ZF Precoding” In Sci. China Inf. Sci. 59.2, 2016, pp. 1–14 DOI: 10.1007/s11432-015-5509-1
[7] Orod Raeesi et al. “Performance Analysis of Multi-User Massive MIMO Downlink Under Channel Non-Reciprocity and Imperfect CSI” In IEEE Trans. Commun. 66.6, 2018, pp. 2456–2471 DOI: 10.1109/TCOMM.2018.2792017
[8] Xiliang Luo “Multiuser Massive MIMO Performance With Calibration Errors” In IEEE Trans. Wireless Commun. 15.7, 2016, pp. 4521–4534
[9] Xiwen Jiang et al. “MIMO-TDD Reciprocity under Hardware Imbalances: Experimental Results” In IEEE Int. Conf. Commun., 2015, pp. 4949–4953 DOI: 10.1109/ICC.2015.7249107
[10] De Mi et al. “Massive MIMO Performance With Imperfect Channel Reciprocity and Channel Estimation Error” In IEEE Trans. Commun. 65.9, 2017, pp. 3734–3749
[11] K. Nishimori, K. Cho, Y. Takatori and T. Hori “Automatic Calibration Method Using Transmitting Signals of an Adaptive Array for TDD Systems” In IEEE Trans. Veh. Technol. 50.6, 2001, pp. 1636–1640 DOI: 10.1109/25.966592
[12] A. Bourdoux, B. Come and N. Khaled “Non-Reciprocal Transceivers in OFDM/SDMA Systems: Impact and Mitigation” In Proc. IEEE Radio Wirel. Conf. (RAWCON) Boston, Massachusetts, USA: IEEE, 2003, pp. 183–186 DOI: 10.1109/RAWCON.2003.1227923
[13] A. Benzin and G. Caire “Internal Self-Calibration Methods for Large Scale Array Transceiver Software-Defined Radios” In Proc. Int. ITG Workshop Smart Antennas (WSA), 2017, pp. 1–8
[14] Xiliang Luo, Fuqian Yang and Hanyu Zhu “Massive MIMO Self-Calibration: Optimal Interconnection for Full Calibration” In IEEE Trans. Veh. Technol. 68.11, 2019, pp. 10357–10371 DOI: 10.1109/TVT.2019.2941544
[15] Florian Kaltenberger, Haiyong Jiang, Maxime Guillaud and Raymond Knopp “Relative Channel Reciprocity Calibration in MIMO/TDD Systems” In Proc. Futur. Netw. Mob. Summit, 2010, pp. 1–10
[16] M. Guillaud, D.T.M. Slock and R. Knopp “A Practical Method for Wireless Channel Reciprocity Exploitation through Relative Calibration” In Proc. 8th Int. Symp. Signal Process. Applic. (ISSPA) 1 Sydney, Australia: IEEE, 2005, pp. 403–406 DOI: 10.1109/ISSPA.2005.1580281
[17] Mark Petermann et al. “Low-Complexity Calibration of Mutually Coupled Non-Reciprocal Multi-Antenna OFDM Transceivers” In Proc. 7th Int. Symp. Wireless Commun. Syst. York: IEEE, 2010, pp. 285–289 DOI: 10.1109/ISWCS.2010.5624277
[18] Boris Kouassi, Irfan Ghauri and Luc Deneire “Estimation of Time-Domain Calibration Parameters to Restore MIMO-TDD Channel Reciprocity” In Proc. 7th Int. ICST Conf. Cogn. Radio Oriented Wirel. Networks Commun. Stockholm, Sweden: IEEE, 2012 DOI: 10.4108/icst.crowncom.2012.248472
[19] Clayton Shepard et al. “Argos: Practical Many-Antenna Base Stations” In Proc. 18th Annu. Int. Conf. Mobile Comput. Networking (Mobicom) Istanbul, Turkey: ACM Press, 2012, pp. 53 DOI: 10.1145/2348543.2348553
[20] Hao Wei et al. “Mutual Coupling Calibration for Multiuser Massive MIMO Systems” In IEEE Trans. Wireless Commun. 15.1, 2016, pp. 606–619 DOI: 10.1109/TWC.2015.2476467
[21] Xiwen Jiang et al. “A Framework for Over-the-Air Reciprocity Calibration for TDD Massive MIMO Systems” In IEEE Trans. Wireless Commun. 17.9, 2018, pp. 5975–5990 DOI: 10.1109/TWC.2018.2853743
[22] R. Rogalin et al. “Hardware-Impairment Compensation for Enabling Distributed Large-Scale MIMO” In Proc. 2013 Inf. Theory Appl. Work. (ITA) San Diego, CA: IEEE, 2013, pp. 1–10 DOI: 10.1109/ITA.2013.6502966
[23] Liyan Su, Chenyang Yang, Gang Wang and Ming Lei “Retrieving Channel Reciprocity for Coordinated Multi-Point Transmission with Joint Processing” In IEEE Trans. Commun. 62.5, 2014, pp. 1541–1553 DOI: 10.1109/TCOMM.2014.031014.130367
[24] Rongjiang Nie et al. “Relaying Systems With Reciprocity Mismatch: Impact Analysis and Calibration” In IEEE Trans. Commun. 68.7, 2020, pp. 4035–4049 DOI: 10.1109/TCOMM.2020.2982632
[25] Andreas F. Molisch et al. “Hybrid Beamforming for Massive MIMO: A Survey” In IEEE Commun. Mag. 55.9, 2017, pp. 134–141 DOI: 10.1109/MCOM.2017.1600400
[26] X. Wei, Y. Jiang, Q. Liu and X. Wang “Calibration of Phase Shifter Network for Hybrid Beamforming in mmWave Massive MIMO Systems” In IEEE Trans. Signal Process. 68, 2020, pp. 2302–2315
[27] Xiwen Jiang and Florian Kaltenberger “Channel Reciprocity Calibration in TDD Hybrid Beamforming Massive MIMO Systems” In IEEE J. Sel. Topics Signal Process. 12.3, 2018, pp. 422–431 DOI: 10.1109/JSTSP.2018.2819118
[28] Z. Guo, X. Wang and W. Heng “Millimeter-Wave Channel Estimation Based on 2-D Beamspace MUSIC Method” In IEEE Trans. Wireless Commun. 16.8, 2017, pp. 5384–5394 DOI: 10.1109/TWC.2017.2710049
[29] Kiran Venugopal, Ahmed Alkhateeb, Nuria González Prelcic and Robert W. Heath “Channel Estimation for Hybrid Architecture-Based Wideband Millimeter Wave Systems” In IEEE J. Sel. Areas Commun. 35.9, 2017, pp. 1996–2009
[30] Wei Zhang, Miaomiao Dong and Taejoon Kim “MMV-Based Sequential AoA and AoD Estimation for Millimeter Wave MIMO Channels” In IEEE Trans. Commun. 70.6, 2022, pp. 4063–4077
[31] Xizixiang Wei, Yi Jiang, Xin Wang and Cong Shen “Tx-Rx Reciprocity Calibration for Hybrid Massive MIMO Systems” In IEEE Wireless Commun. Lett., 2021, pp. 1–1
[32] Foad Sohrabi and Wei Yu “Hybrid Digital and Analog Beamforming Design for Large-Scale Antenna Arrays” In IEEE J. Sel. Topics Signal Process. 10.3, 2016, pp. 501–513
[33] Xianghao Yu, Juei-Chin Shen, Jun Zhang and Khaled B. Letaief “Alternating Minimization Algorithms for Hybrid Precoding in Millimeter Wave MIMO Systems” In IEEE J. Sel. Topics Signal Process. 10.3, 2016, pp. 485–500
[34] Tian Lin et al. “Hybrid Beamforming for Millimeter Wave Systems Using the MMSE Criterion” In IEEE Trans. Commun. 67.5, 2019, pp. 3693–3708
[35] O. El Ayach et al. “Spatially Sparse Precoding in Millimeter Wave MIMO Systems” In IEEE Trans. Wireless Commun. 13.3, 2014, pp. 1499–1513 DOI: 10.1109/TWC.2014.011714.130846
[36] Xian-Da Zhang “Matrix Analysis and Applications” Cambridge University Press, 2017, pp. 325–329
[37] De Mi et al. “Self-Calibration for Massive MIMO with Channel Reciprocity and Channel Estimation Errors” In Proc. IEEE Global Commun. Conf. (GLOBECOM) Abu Dhabi, United Arab Emirates: IEEE, 2018, pp. 1–7
[38] De Mi et al. “Massive MIMO in Mobile Networks: Self-Calibration with Channel Estimation Error” In Proc. ACM MobiArch 15th Workshop Mobility Evolving Internet Archit., MobiArch’20 New York, USA: Association for Computing Machinery, 2020, pp. 30–35
[39] James C. Bezdek and Richard J. Hathaway “Convergence of Alternating Optimization” In Neural Parallel Sci. Comput. 11.4, 2003, pp. 351–368
[40] Zai Yang, Lihua Xie and Cishen Zhang “Off-Grid Direction of Arrival Estimation Using Sparse Bayesian Inference” In IEEE Trans. Signal Process. 61.1, 2013, pp. 38–43
[41] Biqing Qi, Wei Wang and Ben Wang “Off-Grid Compressive Channel Estimation for Mm-Wave Massive MIMO With Hybrid Precoding” In IEEE Commun. Lett. 23.1, 2019, pp. 108–111 DOI: 10.1109/LCOMM.2018.2878557
[42] Steven M. Kay “Fundamentals of Statistical Signal Processing”, Prentice Hall Signal Processing Series Englewood Cliffs, N.J: Prentice-Hall PTR, 1993, pp. 45–47

References

[43] Roger A. Horn and Zai Yang “Rank of a Hadamard Product” In Linear Algebra and its Applications 591, 2020, pp. 87–98 DOI: 10.1016/j.laa.2020.01.005
[44] Steven M. Kay “Fundamentals of Statistical Signal Processing”, Prentice Hall Signal Processing Series Englewood Cliffs, N.J: Prentice-Hall PTR, 1993, pp. 45–47

Supplementary Material for “Hierarchical-Absolute Reciprocity Calibration for Millimeter-wave Hybrid Beamforming Systems”

Li Chen, Rongjiang Nie, Yunfei Chen, and Weidong Wang

Some proofs are omitted in the main paper for readability, and we provide the missing content in this supplementary material for completeness.

Appendix C Proof of Lemma 1

To estimate the diagonal matrix $\mathbf{H}_{\alpha}$ , the objective $\bar{g}(\mathbf{T}_{2},\mathbf{U}_{2},\mathbf{H}_{\alpha},\boldsymbol{\Theta},\boldsymbol{\Phi})$ of $\mathcal{P}_{2.1}$ can be further denoted as

\bar{g}(\mathbf{T}_{2},\mathbf{U}_{2},\mathbf{H}_{\alpha},\boldsymbol{\Theta},\boldsymbol{\Phi})=\left\|\mathrm{vec}\left\{\mathbf{Y}_{\mathrm{da}}\right\}-(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\mathbf{A}_{\mathrm{t}}\odot\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}})\mathbf{h}_{\alpha}\right\|_{\mathrm{F}}^{2}.

(58)

When $\mathbf{U}_{2},\mathbf{T}_{2},\boldsymbol{\Theta},\boldsymbol{\Phi}$ are fixed during the $l_{\mathrm{ao}}$ -th iteration, $\bar{g}(\mathbf{T}_{2},\mathbf{U}_{2},\mathbf{H}_{\alpha},\boldsymbol{\Theta},\boldsymbol{\Phi})$ is a convex function of $\mathbf{h}_{\alpha}$ . Thus, $\mathbf{h}_{\alpha}$ can be estimated by setting the derivative of $\bar{g}(\mathbf{T}_{2},\mathbf{U}_{2},\mathbf{H}_{\alpha},\boldsymbol{\Theta},\boldsymbol{\Phi})$ with respect to $\mathbf{h}_{\alpha}$ to zero, and the solution is given by

\mathbf{h}_{\alpha}=(\boldsymbol{\Gamma}_{\mathrm{h}}^{H}\boldsymbol{\Gamma}_{\mathrm{h}})^{-1}\boldsymbol{\Gamma}_{\mathrm{h}}^{H}\mathrm{vec}\left\{\mathbf{Y}_{\mathrm{da}}\right\},

(59)

where $\boldsymbol{\Gamma}_{\mathrm{h}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\mathbf{A}_{\mathrm{t}}\odot\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}})$ .

Similarly, the diagonal entries of $\mathbf{U}_{2}$ and $\mathbf{T}_{2}$ can be estimated as

\mathbf{u}_{2}=\mathrm{arg}\min_{\mathbf{u}_{2}}\left\|\mathrm{vec}\left\{\mathbf{Y}_{\mathrm{da}}\right\}-\boldsymbol{\Gamma}_{\mathrm{u}}\mathbf{u}_{2}\right\|_{\mathrm{F}}^{2}=(\boldsymbol{\Gamma}_{\mathrm{u}}^{H}\boldsymbol{\Gamma}_{\mathrm{u}})^{-1}\boldsymbol{\Gamma}_{\mathrm{u}}^{H}\mathrm{vec}\left\{\mathbf{Y}_{\mathrm{da}}\right\},

(60)

\mathbf{t}_{2}=\mathrm{arg}\min_{\mathbf{t}_{2}}\left\|\mathrm{vec}\left\{\mathbf{Y}_{\mathrm{da}}\right\}-\boldsymbol{\Gamma}_{\mathrm{t}}\mathbf{t}_{2}\right\|_{\mathrm{F}}^{2}=(\boldsymbol{\Gamma}_{\mathrm{t}}^{H}\boldsymbol{\Gamma}_{\mathrm{t}})^{-1}\boldsymbol{\Gamma}_{\mathrm{t}}^{H}\mathrm{vec}\left\{\mathbf{Y}_{\mathrm{da}}\right\},

(61)

where $\boldsymbol{\Gamma}_{\mathrm{u}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\mathbf{A}_{\mathrm{t}}\mathbf{H}_{\alpha}(\mathbf{A}_{\mathrm{r}})^{T}\odot\bar{\mathbf{B}}_{\mathrm{r}})$ , and $\boldsymbol{\Gamma}_{\mathrm{t}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\odot\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}}\mathbf{H}_{\alpha}(\mathbf{A}_{\mathrm{t}})^{T})$ . Thus, Lemma 1 holds.
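The least-squares forms (59)–(61) rest on the identity $\mathrm{vec}(\mathbf{A}\,\mathrm{diag}(\mathbf{h})\,\mathbf{C}^{T})=(\mathbf{C}\odot\mathbf{A})\mathbf{h}$ , where $\odot$ is the column-wise Kronecker (Khatri-Rao) product. The sketch below checks this identity numerically and then recovers $\mathbf{h}$ by least squares for $K=2$ paths. The matrices `A` and `C` are hypothetical stand-ins for $\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}}$ and $\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\mathbf{A}_{\mathrm{t}}$ ; their values, the noise-free model, and the helper names are assumptions made only for illustration.

```python
def vec(a):
    """Column-major vectorisation of a matrix stored as a list of rows."""
    return [a[i][j] for j in range(len(a[0])) for i in range(len(a))]

def khatri_rao(c, a):
    """Column-wise Kronecker product: column k equals kron(c[:, k], a[:, k])."""
    return [[c[j][k] * a[i][k] for k in range(len(a[0]))]
            for j in range(len(c)) for i in range(len(a))]

def synthesize(a, h, c):
    """Noise-free observation Y = A diag(h) C^T."""
    return [[sum(a[i][k] * h[k] * c[j][k] for k in range(len(h)))
             for j in range(len(c))] for i in range(len(a))]

def ls_two_paths(gamma, y):
    """(G^H G)^{-1} G^H y for a two-column G, using the explicit 2x2 inverse."""
    g = [[sum(r[a].conjugate() * r[b] for r in gamma) for b in range(2)]
         for a in range(2)]
    b = [sum(r[a].conjugate() * v for r, v in zip(gamma, y)) for a in range(2)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [(g[1][1] * b[0] - g[0][1] * b[1]) / det,
            (g[0][0] * b[1] - g[1][0] * b[0]) / det]

# Hypothetical stand-ins (P = Q = 3, K = 2); values are not from the paper.
A = [[1, 1j], [0.5, 1], [1j, -1]]
C = [[1, 1], [1, -1], [1j, 0.5]]
h_true = [0.8 + 0.3j, -0.2 + 0.6j]

gamma_h = khatri_rao(C, A)
y = vec(synthesize(A, h_true, C))
model = [sum(row[k] * h_true[k] for k in range(2)) for row in gamma_h]
identity_gap = max(abs(u - v) for u, v in zip(y, model))
h_hat = ls_two_paths(gamma_h, y)
ls_gap = max(abs(u - v) for u, v in zip(h_hat, h_true))
```

Both `identity_gap` and `ls_gap` should be at the level of rounding error. The same construction, with the roles of the factors permuted, yields $\boldsymbol{\Gamma}_{\mathrm{u}}$ and $\boldsymbol{\Gamma}_{\mathrm{t}}$ .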

Appendix D Proof of Proposition 4

We first derive the pilot requirement for calibrating the digital RF chains. According to (20), to guarantee a unique solution to $\mathbf{u}_{1}$ , the matrix $\check{\mathbf{Y}}_{\mathrm{dr}}$ should have full column rank, which is always satisfied when $P_{\mathrm{dr}}\geq 1$ and $Q_{\mathrm{dr}}\geq 1$ . Similarly, based on (21), for computing $\mathbf{t}_{1}$ , the matrix $\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{X}_{\mathrm{dr}}^{T}$ should have full column rank, i.e., $\mathrm{rank}\{\mathbf{1}_{P_{\mathrm{dr}}}\otimes\mathbf{X}_{\mathrm{dr}}^{T}\}=\mathrm{rank}\{\mathbf{X}_{\mathrm{dr}}^{T}\}=\min\{M_{\mathrm{t}},Q_{\mathrm{dr}}\}\geq M_{\mathrm{t}}$ . Thus, the pilots of digital RF chain calibration should satisfy

Q_{\mathrm{dr}}\geq M_{\mathrm{t}},\text{ and }P_{\mathrm{dr}}\geq 1.

(62)

Then, we derive the pilot requirement for calibrating the analog RF chains. To guarantee a unique solution to $\mathbf{h}_{\alpha}$ , $\boldsymbol{\Gamma}_{\mathrm{h}}$ must have full column rank, which is guaranteed if

\mathrm{krank}(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\mathbf{A}_{\mathrm{t}})+\mathrm{krank}(\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}})\geq K+1.

(63)

Since it is difficult to determine the krank of an arbitrary matrix, we derive a sufficient condition for the above inequality to hold, which is denoted as

Q_{\mathrm{da}}\geq K,\text{ and }P_{\mathrm{da}}\geq K.

(64)

To guarantee the unique solution to $\mathbf{t}_{2}$ during the iteration, $\boldsymbol{\Gamma}_{\mathrm{t}}^{H}\boldsymbol{\Gamma}_{\mathrm{t}}$ must be full rank, i.e., $\mathrm{rank}(\boldsymbol{\Gamma}_{\mathrm{t}}^{H}\boldsymbol{\Gamma}_{\mathrm{t}})=N_{\mathrm{t}}$ . $\boldsymbol{\Gamma}_{\mathrm{t}}^{H}\boldsymbol{\Gamma}_{\mathrm{t}}$ can be further expressed by the Hadamard product denoted as

\boldsymbol{\Gamma}_{\mathrm{t}}^{H}\boldsymbol{\Gamma}_{\mathrm{t}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{*}\tilde{\mathbf{X}}_{\mathrm{da}}^{T})\circ\underbrace{(\mathbf{A}_{\mathrm{t}}^{*}\mathbf{H}_{\alpha}^{H}\mathbf{A}_{\mathrm{r}}^{H}\mathbf{U}_{2}^{*}\bar{\mathbf{B}}_{\mathrm{r}}^{H}\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}}\mathbf{H}_{\alpha}\mathbf{A}_{\mathrm{t}}^{T})}_{\tilde{\boldsymbol{\Gamma}}_{\mathrm{t}}}.

(65)

According to [43, (10)], the rank of the Hadamard product is lower bounded as

\mathrm{rank}(\boldsymbol{\Gamma}_{\mathrm{t}}^{H}\boldsymbol{\Gamma}_{\mathrm{t}})\geq\min\{\mathrm{krank}(\tilde{\mathbf{X}}_{\mathrm{da}}^{*}\tilde{\mathbf{X}}_{\mathrm{da}}^{T})+\mathrm{rank}(\tilde{\boldsymbol{\Gamma}}_{\mathrm{t}})-1,N_{\mathrm{t}}\}.

(66)

As $\tilde{\mathbf{X}}_{\mathrm{da}}$ is artificially designed, we assume that $\mathrm{krank}(\tilde{\mathbf{X}}_{\mathrm{da}}^{*}\tilde{\mathbf{X}}_{\mathrm{da}}^{T})=\mathrm{rank}(\tilde{\mathbf{X}}_{\mathrm{da}}^{*}\tilde{\mathbf{X}}_{\mathrm{da}}^{T})=\min(Q_{\mathrm{da}},N_{\mathrm{t}})$ . Further, because $K\ll N_{\mathrm{t}}$ , we have $\mathrm{rank}(\tilde{\boldsymbol{\Gamma}}_{\mathrm{t}})=\min(P_{\mathrm{da}},K)$ . Accordingly, requiring the lower bound in (66) to reach $N_{\mathrm{t}}$ gives the inequality

\min(Q_{\mathrm{da}},N_{\mathrm{t}})+\min(P_{\mathrm{da}},K)\geq N_{\mathrm{t}}+1.

(67)

Similarly, to guarantee the unique solution to $\mathbf{u}_{2}$ during the iteration, the rank of $\boldsymbol{\Gamma}_{\mathrm{u}}^{H}\boldsymbol{\Gamma}_{\mathrm{u}}$ must be $N_{\mathrm{r}}$ . Since $\bar{\mathbf{B}}_{\mathrm{r}}$ is artificially designed, we can obtain the following inequality denoted as

\min(Q_{\mathrm{da}},K)+\min(P_{\mathrm{da}},N_{\mathrm{r}})\geq N_{\mathrm{r}}+1.

(68)

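Under (64), we have $\min(P_{\mathrm{da}},K)=K$ in (67) and $\min(Q_{\mathrm{da}},K)=K$ in (68). Since $K$ is much smaller than the array sizes, the resulting bounds also imply (64), and the solution to the inequalities (64), (67), and (68) is given by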

Q_{\mathrm{da}}\geq N_{\mathrm{t}}-K+1,\text{ and }P_{\mathrm{da}}\geq N_{\mathrm{r}}-K+1.

(69)

Therefore, Proposition 4 holds.
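The step from (64), (67), and (68) to (69) can be checked by brute force. The sketch below enumerates pilot lengths on a small grid, keeps the pairs that satisfy all three inequalities, and compares the smallest feasible $Q_{\mathrm{da}}$ and $P_{\mathrm{da}}$ with (69). The array sizes, the search limit, and the function names are hypothetical choices made for illustration.

```python
def feasible(q_da, p_da, n_t, n_r, k):
    """True when (Q_da, P_da) satisfies (64), (67), and (68)."""
    c64 = q_da >= k and p_da >= k
    c67 = min(q_da, n_t) + min(p_da, k) >= n_t + 1
    c68 = min(q_da, k) + min(p_da, n_r) >= n_r + 1
    return c64 and c67 and c68

def minimal_pilots(n_t, n_r, k):
    """Smallest feasible Q_da and P_da found by exhaustive search."""
    limit = 2 * (n_t + n_r)  # search bound, an arbitrary but sufficient choice
    points = [(q, p) for q in range(1, limit + 1)
              for p in range(1, limit + 1) if feasible(q, p, n_t, n_r, k)]
    return min(q for q, _ in points), min(p for _, p in points)

def closed_form(n_t, n_r, k):
    """Right-hand sides of (69)."""
    return n_t - k + 1, n_r - k + 1

# Hypothetical sizes with K much smaller than N_t and N_r.
N_T, N_R, K = 16, 8, 3
searched = minimal_pilots(N_T, N_R, K)
predicted = closed_form(N_T, N_R, K)

# Hypothetical sizes where K is not small relative to the arrays.
searched_large_k = minimal_pilots(4, 4, 3)
```

For the first configuration the search agrees with (69). For the second, where $K$ is comparable to the array size, the bound (64) becomes the active one, which illustrates why the reduction to (69) relies on $K$ being small relative to $N_{\mathrm{t}}$ and $N_{\mathrm{r}}$ .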

Appendix E Proof of Lemma 2

We use $\mathbf{y}_{\mathrm{d}}$ to denote the received signal vector for estimating $\boldsymbol{\eta}$ . The corresponding probability density function of $\mathbf{y}_{\mathrm{d}}$ is defined as $p(\mathbf{y}_{\mathrm{d}};\boldsymbol{\eta})$ . Then, the Fisher information matrix $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta})$ can be defined as [44]

\boldsymbol{\mathcal{I}}(\boldsymbol{\eta})=\mathbb{E}\left\{\left[\frac{\partial\ln p(\mathbf{y}_{\mathrm{d}};\boldsymbol{\eta})}{\partial\boldsymbol{\eta}}\right]\left[\frac{\partial\ln p(\mathbf{y}_{\mathrm{d}};\boldsymbol{\eta})}{\partial\boldsymbol{\eta}}\right]^{T}\right\}=-\mathbb{E}\left\{\frac{\partial}{\partial\boldsymbol{\eta}}\left[\frac{\partial}{\partial\boldsymbol{\eta}}\ln p(\mathbf{y}_{\mathrm{d}};\boldsymbol{\eta})\right]^{T}\right\}.

(70)

In Section II-C, the received training signal $\mathbf{y}_{\mathrm{d}}$ consists of two independent parts $\mathbf{y}_{\mathrm{dr}}$ and $\mathbf{y}_{\mathrm{da}}$ , i.e., $\mathbf{y}_{\mathrm{d}}=[\mathbf{y}_{\mathrm{dr}}^{T},\mathbf{y}_{\mathrm{da}}^{T}]^{T}$ , where $\mathbf{y}_{\mathrm{dr}}$ is utilized to estimate $\tilde{\mathbf{u}}_{1}$ and $\mathbf{t}_{1}$ , and $\mathbf{y}_{\mathrm{da}}$ is used to estimate $\mathbf{u}_{2}$ and $\mathbf{t}_{2}$ . Thus, by dividing the vector $\boldsymbol{\eta}$ into two independent parts, i.e., $\boldsymbol{\eta}=[\boldsymbol{\eta}_{1}^{T},\boldsymbol{\eta}_{2}^{T}]^{T}$ , the corresponding probability density function of $\mathbf{y}_{\mathrm{d}}$ can be further denoted as $p(\mathbf{y}_{\mathrm{d}};\boldsymbol{\eta})=p_{1}(\mathbf{y}_{\mathrm{dr}};\boldsymbol{\eta}_{1})p_{2}(\mathbf{y}_{\mathrm{da}};\boldsymbol{\eta}_{2})$ , where $\boldsymbol{\eta}_{1}=[\Re\{\tilde{\mathbf{u}}_{1}^{T}\},\Im\{\tilde{\mathbf{u}}_{1}^{T}\},\Re\{\mathbf{t}_{1}^{T}\},\Im\{\mathbf{t}_{1}^{T}\}]^{T}$ , $\boldsymbol{\eta}_{2}=[\Re\{\mathbf{u}_{2}^{T}\},\Im\{\mathbf{u}_{2}^{T}\},\Re\{\mathbf{t}_{2}^{T}\},\Im\{\mathbf{t}_{2}^{T}\},\Re\{\mathbf{h}_{\alpha}^{T}\},\Im\{\mathbf{h}_{\alpha}^{T}\},\boldsymbol{\Theta}^{T},\boldsymbol{\Phi}^{T}]^{T}$ , and $p_{1}(\mathbf{y}_{\mathrm{dr}};\boldsymbol{\eta}_{1})$ and $p_{2}(\mathbf{y}_{\mathrm{da}};\boldsymbol{\eta}_{2})$ denote the corresponding probability density functions of $\mathbf{y}_{\mathrm{dr}}$ and $\mathbf{y}_{\mathrm{da}}$ . Then, based on (70), the Fisher information matrix can be further given by

\begin{split}\boldsymbol{\mathcal{I}}(\boldsymbol{\eta})&=-\mathbb{E}\left\{\frac{\partial}{\partial\boldsymbol{\eta}}\left[\frac{\partial}{\partial\boldsymbol{\eta}}\bigl{(}\ln p_{1}(\mathbf{y}_{\mathrm{dr}};\boldsymbol{\eta}_{1})+\ln p_{2}(\mathbf{y}_{\mathrm{da}};\boldsymbol{\eta}_{2})\bigr{)}\right]^{T}\right\}\\ &=-\mathbb{E}\left\{\left(\begin{matrix}\frac{\partial^{2}\ln p_{1}(\mathbf{y}_{\mathrm{dr}};\boldsymbol{\eta}_{1})}{\partial\boldsymbol{\eta}_{1}\partial\boldsymbol{\eta}_{1}^{T}}&\frac{\partial^{2}\ln p_{2}(\mathbf{y}_{\mathrm{da}};\boldsymbol{\eta}_{2})}{\partial\boldsymbol{\eta}_{1}\partial\boldsymbol{\eta}_{2}^{T}}\\ \frac{\partial^{2}\ln p_{1}(\mathbf{y}_{\mathrm{dr}};\boldsymbol{\eta}_{1})}{\partial\boldsymbol{\eta}_{2}\partial\boldsymbol{\eta}_{1}^{T}}&\frac{\partial^{2}\ln p_{2}(\mathbf{y}_{\mathrm{da}};\boldsymbol{\eta}_{2})}{\partial\boldsymbol{\eta}_{2}\partial\boldsymbol{\eta}_{2}^{T}}\end{matrix}\right)\right\}\\ &\overset{(a)}{=}\mathrm{blkdiag}[\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1}),\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})],\end{split}

(71)

where $(a)$ holds due to the independence between $\mathbf{y}_{\mathrm{dr}}$ and $\boldsymbol{\eta}_{2}$ as well as the independence between $\mathbf{y}_{\mathrm{da}}$ and $\boldsymbol{\eta}_{1}$ , $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1})=-\mathbb{E}\left\{\frac{\partial^{2}\ln p_{1}(\mathbf{y}_{\mathrm{dr}};\boldsymbol{\eta}_{1})}{\partial\boldsymbol{\eta}_{1}\partial\boldsymbol{\eta}_{1}^{T}}\right\}$ denotes the Fisher information matrix of $\boldsymbol{\eta}_{1}$ , and $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})=-\mathbb{E}\left\{\frac{\partial^{2}\ln p_{2}(\mathbf{y}_{\mathrm{da}};\boldsymbol{\eta}_{2})}{\partial\boldsymbol{\eta}_{2}\partial\boldsymbol{\eta}_{2}^{T}}\right\}$ denotes the Fisher information matrix of $\boldsymbol{\eta}_{2}$ . Thus, Lemma 2 holds.

Appendix F Proof of Lemma 3

In Section II-C, $\mathbf{u}_{1}$ and $\mathbf{t}_{1}$ are estimated from the received signals $\tilde{\mathbf{y}}_{\mathrm{dr}}$ and $\mathbf{y}_{\mathrm{dt}}$ , respectively. According to Proposition 3, $\tilde{\mathbf{y}}_{\mathrm{dr}}$ and $\mathbf{y}_{\mathrm{dt}}$ are denoted as

\tilde{\mathbf{y}}_{\mathrm{dr}}=[\mathrm{vec}(\tilde{\mathbf{Y}}_{\mathrm{dr},1})^{T},\cdots,\mathrm{vec}(\tilde{\mathbf{Y}}_{\mathrm{dr},P_{\mathrm{dr}}})^{T}]^{T},\text{ and }\mathbf{y}_{\mathrm{dt}}=[\mathbf{y}_{\mathrm{dr},1},\cdots,\mathbf{y}_{\mathrm{dr},(P_{\mathrm{dr}}-1)M_{\mathrm{r}}+1}]^{T}.

(72)

Thus, similar to Lemma 2, we have $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1})=\mathrm{blkdiag}(\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,1}),\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,2}))$ due to the independence between $\tilde{\mathbf{y}}_{\mathrm{dr}}$ and $\mathbf{y}_{\mathrm{dt}}$ , where $\boldsymbol{\eta}_{1,1}=[\Re\{\tilde{\mathbf{u}}_{1}^{T}\},\Im\{\tilde{\mathbf{u}}_{1}^{T}\}]^{T}$ , and $\boldsymbol{\eta}_{1,2}=[\Re\{\mathbf{t}_{1}^{T}\},\Im\{\mathbf{t}_{1}^{T}\}]^{T}$ .

To derive the Fisher information matrices, we first model the received signal $\bar{\mathbf{Y}}_{\mathrm{dr},p}$ . Based on (45) and (46), $\bar{\mathbf{Y}}_{\mathrm{dr},p}$ can be denoted as

\bar{\mathbf{Y}}_{\mathrm{dr},p}=\beta_{\mathrm{d}}\mathbf{u}_{1}\mathbf{t}_{1}^{T}\mathbf{X}_{\mathrm{dr}}+\mathbf{u}_{1}\mathbf{b}_{\mathrm{dr}}^{T}\bar{\mathbf{N}}_{\mathrm{dr},p}=\mathbf{u}_{1}\mathbf{x}_{\mathrm{tn},p}^{T},

(73)

where $\mathbf{x}_{\mathrm{tn},p}=\beta_{\mathrm{d}}\mathbf{X}_{\mathrm{dr}}^{T}\mathbf{t}_{1}+\bar{\mathbf{N}}_{\mathrm{dr},p}^{T}\mathbf{b}_{\mathrm{dr}}$ . Then, we have

\mathbf{y}_{\mathrm{dr},(p-1)M_{\mathrm{r}}+1}=u_{1,1}\mathbf{x}_{\mathrm{tn},p},\text{ and }\tilde{\mathbf{Y}}_{\mathrm{dr},p}=\tilde{\mathbf{u}}_{1}\mathbf{x}_{\mathrm{tn},p}^{T}.

(74)

As we set the first receive digital RF chain of the UE to be the reference, i.e., $u_{1,1}=1$ , $\mathbf{x}_{\mathrm{tn},p}$ is equal to the first row of $\bar{\mathbf{Y}}_{\mathrm{dr},p}$ , i.e., $\mathbf{y}_{\mathrm{dr},(p-1)M_{\mathrm{r}}+1}=\mathbf{x}_{\mathrm{tn},p}$ . This result indicates that $\tilde{\mathbf{Y}}_{\mathrm{dr},p}$ is a deterministic, noise-free signal for computing $\tilde{\mathbf{u}}_{1}$ . To derive the closed-form expression of $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,1})$ , we regard computing $\tilde{\mathbf{u}}_{1}$ as an asymptotic case of estimation from a signal in white Gaussian noise with zero mean and vanishing variance. Thus, by using the vectorization of $\tilde{\mathbf{Y}}_{\mathrm{dr},p}$ , (74) can be rewritten as

\mathrm{vec}\{\tilde{\mathbf{Y}}_{\mathrm{dr},p}\}=(\mathbf{x}_{\mathrm{tn},p}\otimes\mathbf{I}_{M_{\mathrm{r}}-1})\tilde{\mathbf{u}}_{1}+\mathbf{n}_{\mathrm{au}},

(75)

where each entry of $\mathbf{n}_{\mathrm{au}}$ is distributed as $\mathcal{CN}(0,\gamma)$ , and $\gamma\longrightarrow 0$ . Thus, $\mathrm{vec}\{\tilde{\mathbf{Y}}_{\mathrm{dr},p}\}$ is distributed as $\mathcal{CN}(\boldsymbol{\mu}_{\mathrm{tn},p},\gamma\mathbf{I}_{Q_{\mathrm{dr}}(M_{\mathrm{r}}-1)})$ , where $\boldsymbol{\mu}_{\mathrm{tn},p}=(\mathbf{x}_{\mathrm{tn},p}\otimes\mathbf{I}_{M_{\mathrm{r}}-1})\tilde{\mathbf{u}}_{1}$ . Since $\frac{\partial\boldsymbol{\mu}_{\mathrm{tn},p}}{\partial\boldsymbol{\eta}_{1,1}^{T}}=[(\mathbf{x}_{\mathrm{tn},p}\otimes\mathbf{I}_{M_{\mathrm{r}}-1}),j(\mathbf{x}_{\mathrm{tn},p}\otimes\mathbf{I}_{M_{\mathrm{r}}-1})]$ and based on [44, (3.31)], the Fisher information matrix $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,1})$ is given by

\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,1})=\frac{2}{\gamma}\sum_{p=1}^{P_{\mathrm{dr}}}\Re\left\{\left(\frac{\partial\boldsymbol{\mu}_{\mathrm{tn},p}}{\partial\boldsymbol{\eta}_{1,1}^{T}}\right)^{H}\frac{\partial\boldsymbol{\mu}_{\mathrm{tn},p}}{\partial\boldsymbol{\eta}_{1,1}^{T}}\right\}=\frac{2\sum_{p=1}^{P_{\mathrm{dr}}}\|\mathbf{x}_{\mathrm{tn},p}\|^{2}}{\gamma}\mathbf{I}_{2M_{\mathrm{r}}-2}.

(76)

Further, to derive the closed-form expression of $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,2})$ , we use the fact that $\mathbf{y}_{\mathrm{dr},(p-1)M_{\mathrm{r}}+1}=\beta_{\mathrm{d}}\mathbf{X}_{\mathrm{dr}}^{T}\mathbf{t}_{1}+\bar{\mathbf{N}}_{\mathrm{dr},p}^{T}\mathbf{b}_{\mathrm{dr}}$ . The received signal $\mathbf{y}_{\mathrm{dr},(p-1)M_{\mathrm{r}}+1}$ is distributed as $\mathcal{CN}(\boldsymbol{\mu}_{\mathrm{dt},p},\sigma_{\mathrm{n}}^{2}\mathbf{I}_{Q_{\mathrm{dr}}})$ , where $\boldsymbol{\mu}_{\mathrm{dt},p}=\beta_{\mathrm{d}}\mathbf{X}_{\mathrm{dr}}^{T}\mathbf{t}_{1}$ . Since $\frac{\partial\boldsymbol{\mu}_{\mathrm{dt},p}}{\partial\boldsymbol{\eta}_{1,2}^{T}}=[\beta_{\mathrm{d}}\mathbf{X}_{\mathrm{dr}}^{T},j\beta_{\mathrm{d}}\mathbf{X}_{\mathrm{dr}}^{T}]$ , the closed-form expression of $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,2})$ can be given by

\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,2})=\frac{2}{\sigma_{\mathrm{n}}^{2}}\sum_{p=1}^{P_{\mathrm{dr}}}\Re\left\{\left(\frac{\partial\boldsymbol{\mu}_{\mathrm{dt},p}}{\partial\boldsymbol{\eta}_{1,2}^{T}}\right)^{H}\frac{\partial\boldsymbol{\mu}_{\mathrm{dt},p}}{\partial\boldsymbol{\eta}_{1,2}^{T}}\right\}=\frac{2P_{\mathrm{dr}}|\beta_{\mathrm{d}}|^{2}}{\sigma_{\mathrm{n}}^{2}}\Re(\mathbf{E}_{\mathrm{au}}\otimes\mathbf{X}_{\mathrm{dr}}^{*}\mathbf{X}_{\mathrm{dr}}^{T})\overset{(a)}{=}\frac{2\rho_{\mathrm{c}}Q_{\mathrm{dr}}P_{\mathrm{dr}}|\beta_{\mathrm{d}}|^{2}}{\sigma_{\mathrm{n}}^{2}}\mathbf{I}_{2M_{\mathrm{t}}},

(77)

where $\mathbf{E}_{\mathrm{au}}=[1,j;-j,1]$ , and step $(a)$ holds by assuming that the rows of $\mathbf{X}_{\mathrm{dr}}$ are orthogonal in the time domain, i.e., $\mathbf{X}_{\mathrm{dr}}^{*}\mathbf{X}_{\mathrm{dr}}^{T}=\rho_{\mathrm{c}}Q_{\mathrm{dr}}\mathbf{I}_{M_{\mathrm{t}}}$ , which is possible since $M_{\mathrm{t}}\leq Q_{\mathrm{dr}}$ . Thus, Lemma 3 holds.
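Step $(a)$ of (77) can be verified numerically. The sketch below builds the Jacobian $[\beta_{\mathrm{d}}\mathbf{X}_{\mathrm{dr}}^{T},j\beta_{\mathrm{d}}\mathbf{X}_{\mathrm{dr}}^{T}]$ for DFT-row pilots, for which $\rho_{\mathrm{c}}=1$ , and evaluates $\frac{2P_{\mathrm{dr}}}{\sigma_{\mathrm{n}}^{2}}\Re\{\mathbf{J}^{H}\mathbf{J}\}$ directly. The pilot design, the numerical values of $\beta_{\mathrm{d}}$ and $\sigma_{\mathrm{n}}^{2}$ , and the function names are assumptions made only for this illustration.

```python
import cmath

def dft_pilots(m_t, q):
    """Rows of a Q-point DFT matrix, so that X^* X^T = Q I (rho_c = 1)."""
    return [[cmath.exp(-2j * cmath.pi * m * k / q) for k in range(q)]
            for m in range(m_t)]

def fim_t1(x, beta_d, sigma2, p_dr):
    """(2 P_dr / sigma^2) Re{J^H J} with J = [beta X^T, j beta X^T]."""
    m_t, q = len(x), len(x[0])
    # Row k of J holds the derivatives of the k-th mean entry w.r.t. [Re t1; Im t1].
    jac = [[beta_d * x[m][k] for m in range(m_t)] +
           [1j * beta_d * x[m][k] for m in range(m_t)] for k in range(q)]
    n = 2 * m_t
    scale = 2.0 * p_dr / sigma2
    return [[scale * sum(jac[k][a].conjugate() * jac[k][b]
                         for k in range(q)).real
             for b in range(n)] for a in range(n)]

# Hypothetical example values (not from the paper).
M_T, Q_DR, P_DR = 2, 4, 3
BETA_D, SIGMA2 = 0.5 + 0.5j, 0.1

fim = fim_t1(dft_pilots(M_T, Q_DR), BETA_D, SIGMA2, P_DR)
expected_diag = 2.0 * Q_DR * P_DR * abs(BETA_D) ** 2 / SIGMA2
off_diag_max = max(abs(fim[a][b]) for a in range(2 * M_T)
                   for b in range(2 * M_T) if a != b)
diag_gap = max(abs(fim[a][a] - expected_diag) for a in range(2 * M_T))
```

The imaginary cross terms between the real and imaginary parts of $\mathbf{t}_{1}$ vanish after taking the real part, so the matrix is a scaled identity, in line with the right-hand side of (77).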

Appendix G Proof of Lemma 4

In Section II-D, $\mathbf{u}_{2}$ and $\mathbf{t}_{2}$ are jointly estimated by using the received signal $\mathbf{y}_{\mathrm{da}}$ . According to (48), $\mathbf{y}_{\mathrm{da}}$ can be given by

\mathbf{y}_{\mathrm{da}}=\underbrace{\mathrm{vec}(\bar{\mathbf{B}}_{\mathrm{da}}^{T}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}}\mathbf{H}_{\alpha}\mathbf{A}_{\mathrm{t}}^{T}\mathbf{T}_{2}\tilde{\mathbf{X}}_{\mathrm{da}})}_{\boldsymbol{\mu}_{\mathrm{da}}}+\mathrm{vec}(\mathbf{N}_{\mathrm{da}}),

(78)

which follows a complex Gaussian distribution, i.e., $\mathbf{y}_{\mathrm{da}}\sim\mathcal{CN}(\boldsymbol{\mu}_{\mathrm{da}},\boldsymbol{\Sigma}_{\mathrm{da}})$ , where

\boldsymbol{\Sigma}_{\mathrm{da}}=\mathbb{E}\left\{\mathrm{vec}(\mathbf{N}_{\mathrm{da}})\mathrm{vec}(\mathbf{N}_{\mathrm{da}})^{H}\right\}=\sigma_{\mathrm{n}}^{2}\mathbf{I}_{Q_{\mathrm{da}}P_{\mathrm{da}}}.

(79)

We first derive the partial derivative of $\boldsymbol{\mu}_{\mathrm{da}}$ with respect to $\boldsymbol{\Phi}$ , which is denoted as

\begin{split}\frac{\partial\boldsymbol{\mu}_{\mathrm{da}}}{\partial\boldsymbol{\Phi}^{T}}&=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\mathbf{A}_{\mathrm{t}}\mathbf{H}_{\alpha}\otimes\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2})\frac{\partial\mathrm{vec}(\mathbf{A}_{\mathrm{r}})}{\partial\boldsymbol{\Phi}^{T}}\\ &=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\mathbf{A}_{\mathrm{t}}\mathbf{H}_{\alpha}\otimes\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2})\mathrm{blkdiag}(\bar{\mathbf{a}}_{\mathrm{r}}(\phi_{1}),\cdots,\bar{\mathbf{a}}_{\mathrm{r}}(\phi_{K}))\\ &=\mathbf{\Gamma_{\phi}},\end{split}

(80)

where $\bar{\mathbf{a}}_{\mathrm{r}}(\phi_{k})=\partial\mathbf{a}_{\mathrm{r}}(\phi_{k})/\partial\phi_{k}=\mathbf{a}_{\mathrm{r}}(\phi_{k})\circ[0,-j\frac{2\pi d}{\lambda}\cos\phi_{k},\cdots,-j\frac{2\pi d}{\lambda}(N_{\mathrm{r}}-1)\cos\phi_{k}]^{T}$ . Similarly, the partial derivative of $\boldsymbol{\mu}_{\mathrm{da}}$ with respect to $\boldsymbol{\Theta}$ is denoted as

\frac{\partial\boldsymbol{\mu}_{\mathrm{da}}}{\partial\boldsymbol{\Theta}^{T}}=(\tilde{\mathbf{X}}_{\mathrm{da}}^{T}\mathbf{T}_{2}\otimes\bar{\mathbf{B}}_{\mathrm{r}}\mathbf{U}_{2}\mathbf{A}_{\mathrm{r}}\mathbf{H}_{\alpha})\mathbf{E}_{\mathrm{x},N_{\mathrm{t}},K}\mathrm{blkdiag}(\bar{\mathbf{a}}_{\mathrm{t}}(\theta_{1}),\cdots,\bar{\mathbf{a}}_{\mathrm{t}}(\theta_{K}))=\mathbf{\Gamma_{\theta}},

(81)

where $\bar{\mathbf{a}}_{\mathrm{t}}(\theta_{k})=\partial\mathbf{a}_{\mathrm{t}}(\theta_{k})/\partial\theta_{k}=\mathbf{a}_{\mathrm{t}}(\theta_{k})\circ[0,-j\frac{2\pi d}{\lambda}\cos\theta_{k},\cdots,-j\frac{2\pi d}{\lambda}(N_{\mathrm{t}}-1)\cos\theta_{k}]^{T}$, $\mathbf{E}_{\mathrm{x},N_{\mathrm{t}}K}=\sum_{k=1}^{K}(\mathbf{e}_{k}^{T}\otimes\mathbf{I}_{N_{\mathrm{t}}}\otimes\mathbf{e}_{k})$ is the commutation matrix that maps $\mathrm{vec}(\mathbf{A}_{\mathrm{t}})$ to $\mathrm{vec}(\mathbf{A}_{\mathrm{t}}^{T})$, and $\mathbf{e}_{k}$ is the $k$-th column of $\mathbf{I}_{K}$.
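The two ingredients of (80) and (81) can be checked numerically. The sketch below is illustrative only: it assumes a uniform linear array whose $n$-th response entry is $e^{-j\frac{2\pi d}{\lambda}n\sin\phi}$ (the form implied by the derivative above, with any normalization constant omitted), half-wavelength spacing, and hypothetical helper names. It compares the closed-form derivative with a central finite difference, and confirms that the matrix defined by $\sum_{k}(\mathbf{e}_{k}^{T}\otimes\mathbf{I}\otimes\mathbf{e}_{k})$ permutes $\mathrm{vec}(\mathbf{A})$ into $\mathrm{vec}(\mathbf{A}^{T})$.

```python
import cmath
import math

def steering(phi, n, d_over_lambda=0.5):
    """Assumed ULA response: k-th entry exp(-j*2*pi*(d/lambda)*k*sin(phi))."""
    return [cmath.exp(-2j * math.pi * d_over_lambda * k * math.sin(phi))
            for k in range(n)]

def steering_derivative(phi, n, d_over_lambda=0.5):
    """Closed form: a(phi) Hadamard [0, -j*2*pi*(d/lambda)*cos(phi), ...]."""
    a = steering(phi, n, d_over_lambda)
    return [a[k] * (-2j * math.pi * d_over_lambda * k * math.cos(phi))
            for k in range(n)]

def commutation(m, n):
    """Entries of sum_k (e_k^T kron I_m kron e_k), filled in by index.

    The k-th term places ones at (row i*n + k, column k*m + i), so the
    result maps vec(A) to vec(A^T) for an m x n matrix A.
    """
    E = [[0] * (m * n) for _ in range(m * n)]
    for k in range(n):
        for i in range(m):
            E[i * n + k][k * m + i] = 1
    return E

def vec(A):
    """Column-major vectorization of a list-of-rows matrix."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]

# Finite-difference check of the steering-vector derivative.
phi, n, h = 0.3, 4, 1e-6
numeric = [(hi - lo) / (2 * h)
           for hi, lo in zip(steering(phi + h, n), steering(phi - h, n))]
max_err = max(abs(x - y) for x, y in zip(numeric, steering_derivative(phi, n)))

# Commutation check on a made-up 3 x 2 matrix.
A = [[1, 2], [3, 4], [5, 6]]
E = commutation(3, 2)
A_transposed = [list(col) for col in zip(*A)]
permuted = [sum(E[r][c] * v for c, v in enumerate(vec(A))) for r in range(6)]
```

Here `max_err` should be far below the size of the derivative entries, and `permuted` should equal `vec(A_transposed)`.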

Based on (58), (60), (61), (80) and (81), the partial derivative of $\boldsymbol{\mu}_{\mathrm{da}}$ with respect to $\boldsymbol{\eta}_{2}$ is $\frac{\partial\boldsymbol{\mu}_{\mathrm{da}}}{\partial\boldsymbol{\eta}_{2}^{T}}=[\mathbf{\Gamma}_{\mathrm{t}},j\mathbf{\Gamma}_{\mathrm{t}},\mathbf{\Gamma}_{\mathrm{u}},j\mathbf{\Gamma}_{\mathrm{u}},\mathbf{\Gamma}_{\mathrm{h}},j\mathbf{\Gamma}_{\mathrm{h}},\mathbf{\Gamma}_{\theta},\mathbf{\Gamma}_{\phi}]=\boldsymbol{\Upsilon}_{\mathrm{\eta}}$. Finally, based on [44, (3.31)], the Fisher information matrix $\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})$ is given by

\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})=2\Re\left\{\left(\frac{\partial\boldsymbol{\mu}_{\mathrm{da}}}{\partial\boldsymbol{\eta}_{2}^{T}}\right)^{H}\boldsymbol{\Sigma}_{\mathrm{da}}^{-1}\frac{\partial\boldsymbol{\mu}_{\mathrm{da}}}{\partial\boldsymbol{\eta}_{2}^{T}}\right\}=\frac{2}{\sigma_{\mathrm{n}}^{2}}\Re\left\{\boldsymbol{\Upsilon}_{\mathrm{\eta}}^{H}\boldsymbol{\Upsilon}_{\mathrm{\eta}}\right\}.

(82)

Accordingly, Lemma 4 holds.
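As a small numerical illustration of (82), the sketch below evaluates $\frac{2}{\sigma_{\mathrm{n}}^{2}}\Re\{\boldsymbol{\Upsilon}^{H}\boldsymbol{\Upsilon}\}$ for a toy Jacobian. The $3\times 1$ matrix `gamma`, the noise variance, and the helper names are made up for illustration and do not come from the system model. It shows that stacking $[\mathbf{\Gamma},j\mathbf{\Gamma}]$ for the real and imaginary parts of a complex parameter yields zero real-valued cross terms between the two parts.

```python
def fisher_information(upsilon, sigma2):
    """Return (2/sigma2) * Re{Upsilon^H Upsilon} for a list-of-rows Jacobian."""
    rows, cols = len(upsilon), len(upsilon[0])
    fim = [[0.0] * cols for _ in range(cols)]
    for a in range(cols):
        for b in range(cols):
            inner = sum(upsilon[r][a].conjugate() * upsilon[r][b]
                        for r in range(rows))
            fim[a][b] = 2.0 / sigma2 * inner.real
    return fim

def complex_block(gamma):
    """Build [Gamma, j*Gamma]: columns for the real and imaginary parts."""
    return [row + [1j * x for x in row] for row in gamma]

# Toy 3 x 1 complex Jacobian (hypothetical values).
gamma = [[1 + 1j], [2 - 1j], [0.5j]]
upsilon = complex_block(gamma)
fim = fisher_information(upsilon, sigma2=0.5)
```

For this toy input the squared column norm is $2+5+0.25=7.25$, so `fim` is $4\times 7.25=29$ on the diagonal and zero off the diagonal.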

Appendix H Proof of Proposition 5

The partial derivative of the transformation function $\mathbf{g}(\boldsymbol{\eta})$ with respect to $\boldsymbol{\eta}$ is

\frac{\partial\mathbf{g}(\boldsymbol{\eta})}{\partial\boldsymbol{\eta}^{T}}=\mathrm{blkdiag}([\mathbf{I}_{M_{\mathrm{r}}-1},j\mathbf{I}_{M_{\mathrm{r}}-1}],[\mathbf{I}_{M_{\mathrm{t}}},j\mathbf{I}_{M_{\mathrm{t}}}],[\mathbf{I}_{N_{\mathrm{r}}},j\mathbf{I}_{N_{\mathrm{r}}}],[\mathbf{I}_{N_{\mathrm{t}}},j\mathbf{I}_{N_{\mathrm{t}}}],\mathbf{0}_{4K,4K}).

(83)

Based on the definition of CRLB, we can obtain

\begin{split}\frac{\partial\mathbf{g}(\boldsymbol{\eta})}{\partial\boldsymbol{\eta}^{T}}\boldsymbol{\mathcal{I}}(\boldsymbol{\eta})^{-1}\left(\frac{\partial\mathbf{g}(\boldsymbol{\eta})}{\partial\boldsymbol{\eta}^{T}}\right)^{H}=&\mathrm{blkdiag}([\mathbf{I}_{M_{\mathrm{r}}-1},j\mathbf{I}_{M_{\mathrm{r}}-1}]\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,1})^{-1}[\mathbf{I}_{M_{\mathrm{r}}-1},j\mathbf{I}_{M_{\mathrm{r}}-1}]^{H},\\ &[\mathbf{I}_{M_{\mathrm{t}}},j\mathbf{I}_{M_{\mathrm{t}}}]\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{1,2})^{-1}[\mathbf{I}_{M_{\mathrm{t}}},j\mathbf{I}_{M_{\mathrm{t}}}]^{H},\boldsymbol{\Pi}\boldsymbol{\mathcal{I}}(\boldsymbol{\eta}_{2})^{-1}\boldsymbol{\Pi}^{H}),\end{split}

(84)

where $\boldsymbol{\Pi}=[\mathrm{blkdiag}([\mathbf{I}_{N_{\mathrm{r}}},j\mathbf{I}_{N_{\mathrm{r}}}],[\mathbf{I}_{N_{\mathrm{t}}},j\mathbf{I}_{N_{\mathrm{t}}}]),\mathbf{0}_{N_{\mathrm{r}}+N_{\mathrm{t}},4K}]$. By substituting (39) and (40) into (84), the CRLB of $\boldsymbol{\eta}_{\mathrm{ut}}$ is given by (41).
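To make the role of the blocks $[\mathbf{I},j\mathbf{I}]$ in (84) concrete, the following sketch applies the same transformation to a single complex parameter. The $2\times 2$ Fisher information matrix for its real and imaginary parts is a made-up example, and the helper names are hypothetical. With the Jacobian $[1,j]$, the transformed bound reduces to the sum of the bounds on the real and imaginary parts, because the cross terms of a symmetric inverse cancel.

```python
def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transformed_bound(jac, fim_inv):
    """Return jac * fim_inv * jac^H for a 1 x n row vector jac."""
    n = len(jac)
    return sum(jac[a] * fim_inv[a][b] * jac[b].conjugate()
               for a in range(n) for b in range(n))

# Made-up Fisher information matrix for the (real, imaginary) parts.
fim = [[4.0, 1.0], [1.0, 2.0]]
crlb_real = inverse_2x2(fim)

# Jacobian of g(eta) = eta_re + j * eta_im with respect to (eta_re, eta_im).
jac = [1 + 0j, 1j]
crlb_complex = transformed_bound(jac, crlb_real)
```

For this input the inverse has diagonal entries $2/7$ and $4/7$, so `crlb_complex` equals $6/7$ with zero imaginary part.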
