
Waveform Optimization for MIMO Joint Communication and Radio Sensing Systems with Training Overhead

Xin Yuan, Zhiyong Feng, J. Andrew Zhang, Wei Ni,
Ren Ping Liu, Zhiqing Wei, and Changqiao Xu
X. Yuan is with the Key Laboratory of the Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, China, and the Global Big Data Technologies Center, University of Technology Sydney, Australia (email: [email protected]). Z. Feng and Z. Wei are with the Key Laboratory of the Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, China (email: {fengzy, weizhiqing}@bupt.edu.cn). J. A. Zhang and R. P. Liu are with the Global Big Data Technologies Center, University of Technology Sydney, Australia (email: {renping.liu, andrew.zhang}@uts.edu.au). W. Ni is with the Commonwealth Scientific and Industrial Research Organization, Australia (email: [email protected]). C. Xu is with the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, China (email: [email protected]).
Abstract

In this paper, we study optimal waveform design to maximize mutual information (MI) for a joint communication and (radio) sensing (JCAS, a.k.a., radar-communication) multi-input multi-output (MIMO) downlink system. We consider a typical packet-based signal structure which includes training and data symbols. We first derive the conditional MI for both sensing and communication under correlated channels by considering the training overhead and channel estimation error (CEE). Then, we derive a lower bound for the channel estimation error and optimize the power allocation between the training and data symbols to minimize the CEE. Based on the optimal power allocation, we provide optimal waveform design methods for three scenarios, including maximizing MI for communication only and for sensing only, and maximizing a weighted sum MI for both communication and sensing. We also present extensive simulation results that provide insights on waveform design and validate the effectiveness of the proposed designs.

Index Terms:
Mutual information, joint communication and sensing, waveform design, training sequence.

I Introduction

I-A Background and Motivation

A joint communication and (radio) sensing (JCAS, a.k.a. radar-communications) system, which shares hardware and signal processing modules between the two functions, can achieve improved spectrum efficiency, enhanced security, and reduced cost, size, and weight [1, 2, 3, 4]. JCAS systems have many potential applications in intelligent transportation, which requires both communication links connecting vehicles and active environment sensing functions [5, 6]. For JCAS systems, it is crucial to use a single waveform that performs both the communication and sensing functions simultaneously, thereby improving the utilization of the limited spectrum resources. To this end, one of the main challenges in JCAS systems lies in designing optimal or adequate waveforms that serve both purposes of data transmission and radio sensing.

Mutual information (MI) is an important measure that can be used for studying waveform designs for joint communication and sensing systems. Specifically, for communications, the MI between wireless channels and the received communication signals can be employed as the waveform optimization criterion, while for sensing, the conditional MI between sensing channels and the reflected sensing signals can be measured [7, 8]. Despite a significant amount of research effort on waveform design in both communication and sensing systems, existing joint waveform designs for JCAS systems are still limited. It is known that the training sequence for channel estimation has a significant impact on communication capacity, particularly for multiple-input multiple-output (MIMO) systems [9, 10]. However, there has been no study on waveform design for JCAS that takes into consideration the typical signal packet structure containing the training sequence.

I-B Related Work

Information theory has been used to design radar waveforms [7, 11, 8, 12, 13]. Bell [7] was the first to apply information theory to optimize radar waveforms to improve target detection. In [12], the optimal radar waveform was proposed to maximize the detection performance of an extended target in a colored noise environment by using MI as the waveform design criterion. Two criteria, namely, the maximization of the conditional MI and the minimization of the minimum mean-square error (MMSE), were studied in [8] to optimize the waveform design for MIMO radars by exploiting the covariance matrix of the extended target impulse response. In [11], the optimal waveform design for MIMO radars in colored noise was also investigated by considering two criteria: maximizing the MI and maximizing the relative entropy between two hypotheses that the target exists or does not exist in the echoes. In [13], a two-stage waveform optimization algorithm was proposed for an adaptive MIMO radar to unify the signal design and selection procedures. The algorithm is based on the constant learning of the radar environment at the receivers and the adaptation of the transmit waveform to the dynamic radar scene. In [14], a robust waveform design based on the Cramér-Rao bound was proposed for co-located MIMO radars to improve the worst-case estimation accuracy in the presence of clutter.

For communication and radar co-existing systems that transmit and process their respective signals, the MI has also been adopted for waveform design to minimize the mutual interference. In [15, 16], inner bounds on both the radar estimation rate for sensing and the data rate for communication were derived for the co-existing systems. Liu et al. [17] studied transmit beamforming for spectrum sharing between downlink MU-MIMO communication and co-located MIMO radar, to maximize the detection probability for sensing while guaranteeing the transmit power for downlink users. In [18], a minimum-estimation-error-variance waveform design method was proposed to optimize the spectral shape of a unimodular radar waveform and maximize the performance of both the radar and communications. In [19], the radar waveform was designed based on a performance bound that is derived from jointly maximizing radar estimation rate and communication data rate.

Only a few studies have investigated the MI for JCAS systems [20, 21, 22]. In [20], considering a JCAS MIMO setup, the expressions for radar mutual information and communication channel capacity were derived. In [21, 22], an integrated waveform design was proposed for OFDM JCAS systems to improve the MI for both communication and sensing by considering extended targets and frequency-selective fading channels.

I-C Contributions

This paper presents information theoretically optimal waveform designs for a JCAS MIMO downlink system with a signal packet structure, including training sequence and information data symbols. In the JCAS MIMO downlink, a node sends MIMO signals to another node for communications and simultaneously uses the reflected signals for sensing the surrounding environment. We first derive the conditional MI for sensing and communication by taking both the training overhead and channel estimation error (CEE) into consideration, and then provide the optimal waveform designs for several different options of maximizing conditional MI. For sensing, both training and data sequences directly contribute to the MI; while for communications, only the data sequence contributes to the MI in the presence of CEE linked to the training sequence.

The key contributions of the paper are summarized as follows.

  1. We derive the MI expressions for both sensing and communication. We reveal the significantly different impact of the training and data sequences on the MI of sensing and communications.

  2. We design the optimal power allocation scheme between the training and data sequences under MMSE estimators for correlated MIMO communication channels.

  3. We provide the optimal waveform designs for three scenarios, including maximizing the MI only for sensing, and only for communication, and maximizing the weighted MI for joint communication and sensing.

  4. We conduct extensive simulations to corroborate the effectiveness of our proposed power allocation and waveform design. The results provide important insights into the trade-off of MI between communication and sensing in a JCAS system, and the non-negligible impact of training sequence on the MI.

I-D Organization

The rest of this paper is organized as follows. In Section II, the system model is introduced. In Section III, we derive the conditional MI for both communication and sensing in the JCAS system. In Section IV, we derive a lower bound for CEE and develop an optimal power allocation strategy between training and data sequences. In Section V, the optimal waveform design methods for optimal communication, optimal sensing, and JCAS are investigated. Section VI presents simulation results. Section VII concludes the paper.

Notation: Lower-case boldface ($\mathbf{x}$) indicates a vector, and upper-case boldface ($\mathbf{X}$) indicates a matrix. For a diagonal matrix $\mathbf{X}$, $\mathbf{X}^{a}$ denotes raising each diagonal element to the power $a$. $\mathbf{I}$ denotes the identity matrix, and $\mathbb{E}(\cdot)$ denotes expectation. $(\cdot)^{T}$, $(\cdot)^{H}$, $(\cdot)^{\ast}$, $(\cdot)^{-1}$ and $(\cdot)^{\dagger}$ denote transpose, conjugate transpose, conjugate, inverse and pseudo-inverse, respectively. $\det(\cdot)$ and $\operatorname{Tr}(\cdot)$ denote the determinant and trace of a matrix, respectively.

II System Model

Figure 1: A joint communication and sensing (JCAS) MIMO downlink system, where node A transmits data to node B, and simultaneously senses the environment to determine, e.g., the locations and speeds of the nearby objects, by using the reflected transmitted signal.

We consider a JCAS MIMO system where two nodes A and B perform point-to-point communications in time division duplex (TDD) mode, and simultaneously sense the environment to determine, e.g., the locations and speeds of nearby objects, as illustrated in Fig. 1. Each node has $N$ antennas configured in the form of a uniform linear array (ULA). While node A is transmitting to node B, we consider downlink sensing, where the reflection of the transmitted signal is used for sensing by node A. The transmitted symbols are known to node A. The sensing and communication channels are correlated but different. To suppress leakage signals from the transmitter and enable the reception of clear sensing signals, each node is assumed to be equipped with two spatially widely separated antenna arrays, i.e., $N$ transmit antennas and $N$ receive antennas configured in the form of two ULAs. Detailed configurations of the transceiver for JCAS systems are beyond the scope of this paper, and readers can refer to [4] and [6] for more details.

Figure 2: Transmit symbols, including training and data symbols. For communications, the non-precoded training symbols are used for synchronization and channel estimation, and the data symbols are typically the precoded data payload. For sensing, both the training and data symbols are used for target detection.

In practice, a communication packet typically includes a data payload together with training signals for synchronization and channel estimation. The training signals can take various forms in different standards and systems. For example, they can be comb pilots or occupy whole resource blocks in 5G New Radio. Without loss of generality, we consider a general data structure that consists of a sequence of $L_{t}$ training symbols and $L_{d}$ data symbols for each spatial stream, as illustrated in Fig. 2. Concatenating the symbols from all $N$ spatial streams into a matrix $\mathbf{X}$, we have $\mathbf{X}=[\mathbf{X}_{t},\mathbf{X}_{d}]$, where $\mathbf{X}_{t}=\left[\mathbf{X}_{t}(1),\cdots,\mathbf{X}_{t}(N)\right]^{T}\in\mathbb{C}^{N\times L_{t}}$ and $\mathbf{X}_{d}=\left[\mathbf{X}_{d}(1),\cdots,\mathbf{X}_{d}(N)\right]^{T}\in\mathbb{C}^{N\times L_{d}}$, with $\mathbf{X}_{t}(n)$ and $\mathbf{X}_{d}(n)$ denoting the training and data symbols transmitted from the $n$-th antenna, respectively. We assume that $\mathbf{X}_{d}(n)\in\mathbb{C}^{L_{d}\times 1}$ consists of independent and identically distributed (i.i.d.) Gaussian variables with zero mean, and that the covariance matrix of the data symbols is $\frac{1}{L_{d}}\mathbb{E}\left\{\mathbf{X}_{d}\mathbf{X}_{d}^{H}\right\}=\varSigma_{\mathbf{X}_{d}}$. Let $\frac{1}{L_{t}}\mathbf{X}_{t}\mathbf{X}_{t}^{H}=\varSigma_{\mathbf{X}_{t}}$. The training sequences $\mathbf{X}_{t}(n)\in\mathbb{C}^{L_{t}\times 1}$ are typically designed to be orthogonal to each other with $L_{t}\geq N$, and hence $\varSigma_{\mathbf{X}_{t}}$ is a scaled diagonal matrix. More advanced designs of training sequences may be possible. The orthogonal design considered here is a typical setting in MIMO communication systems, and it is also typically used in MIMO radar to exploit the degrees of freedom offered by multiple antennas [23]. Most of the results presented in this paper can also be readily extended to systems using other training sequences.
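As a concrete illustration of an orthogonal training design, the following minimal Python sketch builds $\mathbf{X}_{t}$ from the first $N$ rows of an $L_{t}$-point DFT matrix and checks that $\mathbf{X}_{t}\mathbf{X}_{t}^{H}$ is a scaled identity. The DFT construction, the function name `dft_training`, and the numerical values are assumptions made for illustration only; the text merely requires mutually orthogonal sequences with $L_{t}\geq N$.

```python
import numpy as np


def dft_training(N, L_t, sigma_t2):
    """Return an N x L_t training matrix with mutually orthogonal rows.

    Assumption for illustration: rows are taken from an L_t-point DFT matrix
    and scaled so that every symbol has energy sigma_t2.
    """
    if L_t < N:
        raise ValueError("orthogonal training requires L_t >= N")
    k = np.arange(N).reshape(-1, 1)      # antenna (row) index
    l = np.arange(L_t).reshape(1, -1)    # time (column) index
    return np.sqrt(sigma_t2) * np.exp(-2j * np.pi * k * l / L_t)


X_t = dft_training(N=4, L_t=8, sigma_t2=0.5)
gram = X_t @ X_t.conj().T                # expected to equal L_t * sigma_t2 * I_N
print(np.allclose(gram, 8 * 0.5 * np.eye(4)))
```

Any other set of orthogonal rows with equal energy (for example, Hadamard sequences) would satisfy the same check.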

The transmitted signal $\mathbf{X}$, including $\mathbf{X}_{t}$ and $\mathbf{X}_{d}$, is used for both communication and radio sensing operations. Let $P$ be the total energy of the transmit signal, $P_{t}$ the energy of the training signals, and $P_{d}$ the energy of the data signals, so that $P=P_{t}+P_{d}$. The average energies of the training and data symbols are $\sigma_{t}^{2}=\frac{1}{NL_{t}}\sum_{n=1}^{N}\mathbf{X}_{t}(n)^{H}\mathbf{X}_{t}(n)$ and $\sigma_{d}^{2}=\frac{1}{NL_{d}}\sum_{n=1}^{N}\mathbb{E}\left[\mathbf{X}_{d}(n)^{H}\mathbf{X}_{d}(n)\right]$, respectively. We also define a weighting value $\kappa$, $0<\kappa<1$, and have $P_{d}=\kappa P=NL_{d}\sigma_{d}^{2}$ and $P_{t}=(1-\kappa)P=NL_{t}\sigma_{t}^{2}$. We optimize the power allocation between training and data symbols to maximize the communication capacity, as will be described in Section IV-B.
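The energy split can be sketched in a few lines of Python. The function name `split_energy` and the example values are hypothetical; only the relations $P_{d}=\kappa P=NL_{d}\sigma_{d}^{2}$ and $P_{t}=(1-\kappa)P=NL_{t}\sigma_{t}^{2}$ are taken from the text.

```python
def split_energy(P, kappa, N, L_t, L_d):
    """Split total energy P into training and data parts.

    Returns (P_t, P_d, sigma_t2, sigma_d2), where sigma_t2 and sigma_d2 are
    the average per-symbol energies of the training and data symbols.
    """
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie strictly between 0 and 1")
    P_d = kappa * P                  # energy assigned to data symbols
    P_t = (1.0 - kappa) * P          # energy assigned to training symbols
    sigma_d2 = P_d / (N * L_d)
    sigma_t2 = P_t / (N * L_t)
    return P_t, P_d, sigma_t2, sigma_d2


# Illustrative values: 4 antennas, 5 training and 20 data symbols per stream.
P_t, P_d, sigma_t2, sigma_d2 = split_energy(P=100.0, kappa=0.8, N=4, L_t=5, L_d=20)
print(P_t, P_d, sigma_t2, sigma_d2)
```

With these values the training and data symbols end up with roughly the same per-symbol energy, because $\kappa$ matches the fraction $L_{d}/(L_{t}+L_{d})$.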

II-A Communication Model

For communication, the received training and data signals at node B are respectively given by

$$\mathbf{Y}^{t}_{\rm com}=\mathbf{H}\mathbf{X}_{t}+\mathbf{N}_{tc}, \tag{1}$$

\begin{align}
\mathbf{Y}^{d}_{\rm com} &=\mathbf{H}\mathbf{X}_{d}+\mathbf{N}_{dc} \tag{2}\\
&=\left(\hat{\mathbf{H}}+\Delta\mathbf{H}\right)\mathbf{X}_{d}+\mathbf{N}_{dc} \notag\\
&=\hat{\mathbf{H}}\mathbf{X}_{d}+\underbrace{\Delta\mathbf{H}\mathbf{X}_{d}+\mathbf{N}_{dc}}_{\mathbf{N}^{\prime}_{c}}, \notag
\end{align}

where $\mathbf{H}=\left[\mathbf{h}_{1},\cdots,\mathbf{h}_{j},\cdots,\mathbf{h}_{N}\right]\in\mathbb{C}^{N\times N}$ is the channel matrix with $\mathbf{h}_{j}=\left[h_{1,j},h_{2,j},\cdots,h_{N,j}\right]^{T}$ denoting the $j$-th row of $\mathbf{H}$; $\mathbf{N}_{tc}\in\mathbb{C}^{N\times L_{t}}$ and $\mathbf{N}_{dc}\in\mathbb{C}^{N\times L_{d}}$ are both additive white Gaussian noise (AWGN) with zero mean and element-wise variance $\sigma_{n}^{2}$. It is reasonable to assume that $\mathbf{N}_{tc}$, $\mathbf{N}_{dc}$ and $\mathbf{X}_{d}$ are mutually independent. The signal $\mathbf{Y}^{t}_{\rm com}$ is used for channel estimation. We assume that a linear channel estimator based on the minimum mean-square error (MMSE) criterion [24] is applied. In this case, the channel estimate $\hat{\mathbf{H}}$ and the estimation error $\Delta\mathbf{H}$ are uncorrelated [25]. Let $\Delta\mathbf{H}=[\Delta\mathbf{h}_{1},\cdots,\Delta\mathbf{h}_{j},\cdots,\Delta\mathbf{h}_{N}]$, where $\Delta\mathbf{h}_{j}=[\Delta h_{1j},\Delta h_{2j},\cdots,\Delta h_{Nj}]^{T}$ is the $j$-th row of $\Delta\mathbf{H}$. The coefficients $\Delta h_{ij}$ are i.i.d. zero-mean circularly symmetric complex Gaussian random variables with variance $\sigma_{e}^{2}$, i.e., $\mathbb{E}\left[\Delta\mathbf{H}\Delta\mathbf{H}^{H}\right]=N\sigma_{e}^{2}\mathbf{I}_{N}$. We will evaluate $\sigma_{e}^{2}$ and link it to $\mathbf{X}_{t}$ and $\mathbf{N}_{tc}$ in Section IV-A.

The matrix $\mathbf{N}^{\prime}_{c}$ combines the CEE and noise, and can be viewed as an equivalent additive noise with zero mean. Its covariance, and hence the equivalent variance $\sigma_{n'}^{2}$, can be obtained as

\begin{align}
\mathbb{E}[\mathbf{N}^{\prime}_{c}{\mathbf{N}^{\prime}_{c}}^{H}] &=\mathbb{E}[\Delta\mathbf{H}\mathbf{X}_{d}\mathbf{X}_{d}^{H}\Delta\mathbf{H}^{H}]+\mathbb{E}\left[\mathbf{N}_{dc}\mathbf{N}_{dc}^{H}\right] \tag{3a}\\
&=L_{d}\,\mathbb{E}[\Delta\mathbf{H}\varSigma_{\mathbf{X}_{d}}\Delta\mathbf{H}^{H}]+\mathbb{E}\left[\mathbf{N}_{dc}\mathbf{N}_{dc}^{H}\right] \tag{3b}\\
&=L_{d}\,\mathbb{E}\left\{\operatorname{diag}\left\{\Delta\mathbf{h}_{1}^{T}\varSigma_{\mathbf{X}_{d}}\Delta\mathbf{h}_{1}^{\ast},\cdots,\Delta\mathbf{h}_{N}^{T}\varSigma_{\mathbf{X}_{d}}\Delta\mathbf{h}_{N}^{\ast}\right\}\right\}+L_{d}\sigma_{n}^{2}\mathbf{I}_{N} \tag{3c}\\
&=L_{d}\operatorname{diag}\left\{\operatorname{Tr}\left(\varSigma_{\mathbf{X}_{d}}\mathbb{E}[\Delta\mathbf{h}_{1}^{\ast}\Delta\mathbf{h}_{1}^{T}]\right),\cdots,\operatorname{Tr}\left(\varSigma_{\mathbf{X}_{d}}\mathbb{E}[\Delta\mathbf{h}_{N}^{\ast}\Delta\mathbf{h}_{N}^{T}]\right)\right\}+L_{d}\sigma_{n}^{2}\mathbf{I}_{N} \tag{3d}\\
&=NL_{d}\sigma_{d}^{2}\sigma_{e}^{2}\mathbf{I}_{N}+L_{d}\sigma_{n}^{2}\mathbf{I}_{N}=L_{d}\left(\frac{P_{d}}{L_{d}}\sigma_{e}^{2}+\sigma_{n}^{2}\right)\mathbf{I}_{N} \tag{3e}\\
&\triangleq L_{d}\sigma_{n'}^{2}\mathbf{I}_{N}, \tag{3f}
\end{align}

where (3b) uses $\mathbb{E}\{\mathbf{X}_{d}\mathbf{X}_{d}^{H}\}=L_{d}\varSigma_{\mathbf{X}_{d}}$ together with the independence of $\Delta\mathbf{H}$ and $\mathbf{X}_{d}$, (3e) uses $\operatorname{Tr}(\varSigma_{\mathbf{X}_{d}})=N\sigma_{d}^{2}$ and $P_{d}=NL_{d}\sigma_{d}^{2}$, and $\sigma_{n'}^{2}=\frac{P_{d}}{L_{d}}\sigma_{e}^{2}+\sigma_{n}^{2}$.
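The closed form for the equivalent noise variance can be sanity-checked numerically. The sketch below is a Monte Carlo experiment under assumptions that are ours rather than the paper's: white data symbols ($\varSigma_{\mathbf{X}_{d}}=\sigma_{d}^{2}\mathbf{I}_{N}$), illustrative parameter values, and the hypothetical helper names `crandn` and `equivalent_noise_check`.

```python
import numpy as np


def crandn(rng, shape, var):
    """Draw i.i.d. circularly symmetric complex Gaussian samples with variance var."""
    return np.sqrt(var / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


def equivalent_noise_check(N=4, L_d=16, sigma_d2=1.0, sigma_e2=0.05,
                           sigma_n2=0.1, trials=2000, seed=0):
    """Compare a Monte Carlo estimate of sigma_n'^2 with the closed form in (3)."""
    rng = np.random.default_rng(seed)
    dH = crandn(rng, (trials, N, N), sigma_e2)      # channel estimation error
    X_d = crandn(rng, (trials, N, L_d), sigma_d2)   # data symbols (assumed white)
    N_dc = crandn(rng, (trials, N, L_d), sigma_n2)  # receiver noise
    N_eq = dH @ X_d + N_dc                          # equivalent noise N'_c
    estimate = np.mean(np.abs(N_eq) ** 2)           # average energy per entry
    P_d = N * L_d * sigma_d2
    closed_form = P_d / L_d * sigma_e2 + sigma_n2
    return estimate, closed_form


est, ref = equivalent_noise_check()
print(f"Monte Carlo: {est:.4f}, closed form: {ref:.4f}")
```

The two numbers should agree to within the Monte Carlo error, which shrinks as `trials` grows.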

Let $\mathbf{R}_{H}=\frac{1}{N}\mathbb{E}[\mathbf{H}^{H}\mathbf{H}]$ be the channel covariance matrix, which is positive semi-definite. We assume that $\mathbf{R}_{H}$ is known to node A. We can write the random channel matrix as $\mathbf{H}=\mathbf{H}_{0}\mathbf{R}_{H}^{\frac{1}{2}}$, where the entries of $\mathbf{H}_{0}$ are i.i.d. zero-mean circularly symmetric complex Gaussian with unit variance.
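A minimal Python sketch of this channel model follows. It draws $\mathbf{H}=\mathbf{H}_{0}\mathbf{R}_{H}^{1/2}$ and checks empirically that $\frac{1}{N}\mathbb{E}[\mathbf{H}^{H}\mathbf{H}]$ approaches $\mathbf{R}_{H}$. The exponential correlation model, the value of `rho`, and the helper names are assumptions chosen for illustration.

```python
import numpy as np


def sqrtm_psd(R):
    """Hermitian square root of a positive semi-definite matrix via eigendecomposition."""
    w, U = np.linalg.eigh(R)
    w = np.clip(w, 0.0, None)            # guard against tiny negative round-off
    return (U * np.sqrt(w)) @ U.conj().T


def correlated_channel(R_H, rng):
    """Draw H = H_0 R_H^{1/2} with i.i.d. CN(0, 1) entries in H_0."""
    N = R_H.shape[0]
    H_0 = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)
    return H_0 @ sqrtm_psd(R_H)


# Assumed example: exponential correlation with coefficient rho between antennas.
N, rho = 4, 0.6
idx = np.arange(N)
R_H = rho ** np.abs(idx[:, None] - idx[None, :])

rng = np.random.default_rng(1)
trials = 5000
acc = np.zeros((N, N), dtype=complex)
for _ in range(trials):
    H = correlated_channel(R_H, rng)
    acc += H.conj().T @ H
R_emp = acc / (N * trials)               # empirical (1/N) E[H^H H]
print(np.max(np.abs(R_emp - R_H)))       # small for a large number of trials
```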

II-B Sensing Model

Node A uses the reflection of the transmitted signal for sensing. The received signal, denoted by $\mathbf{Y}_{\rm rad}$, is given by

\begin{align}
\mathbf{Y}_{\rm rad} &=\mathbf{G}\mathbf{X}+\mathbf{N}=\mathbf{G}[\mathbf{X}_{t},\mathbf{X}_{d}]+\left[\mathbf{N}_{tr},\mathbf{N}_{dr}\right] \tag{4}\\
&=[\mathbf{G}\mathbf{X}_{t}+\mathbf{N}_{tr},\,\mathbf{G}\mathbf{X}_{d}+\mathbf{N}_{dr}], \notag
\end{align}

where $\mathbf{G}=[\mathbf{g}_{1},\cdots,\mathbf{g}_{N}]$ is the channel matrix to be sensed with its $j$-th column being $\mathbf{g}_{j}=\left[g_{1j},g_{2j},\cdots,g_{Nj}\right]^{T}$, and $\mathbf{g}_{j},\,j=1,\cdots,N$ are independent of each other; $\mathbf{N}_{tr}=[\mathbf{n}_{tr,1},\mathbf{n}_{tr,2},\cdots,\mathbf{n}_{tr,N}]\in\mathbb{C}^{L_{t}\times N}$ and $\mathbf{N}_{dr}=[\mathbf{n}_{dr,1},\mathbf{n}_{dr,2},\cdots,\mathbf{n}_{dr,N}]\in\mathbb{C}^{L_{d}\times N}$ are AWGN with zero mean and covariance matrices $\mathbb{E}\left\{\mathbf{N}_{tr}\mathbf{N}_{tr}^{H}\right\}=N\sigma_{n}^{2}\mathbf{I}_{L_{t}}$ and $\mathbb{E}\left\{\mathbf{N}_{dr}\mathbf{N}_{dr}^{H}\right\}=N\sigma_{n}^{2}\mathbf{I}_{L_{d}}$. Let $\varSigma_{\mathbf{G}}=\frac{1}{N}\mathbb{E}\{\mathbf{G}\mathbf{G}^{H}\}$ be the spatial correlation matrix. It is assumed to be full-rank and also known to node A.

For both the communication and sensing channels, we assume that they remain unchanged during the period of a packet. Note that for both communication and sensing, the channel matrices include large-scale path loss and small-scale fading. The path loss of sensing can vary significantly for different multi-path components depending on the number of nearby objects and their locations, and therefore, we consider the mean path loss herein. Our optimization results only depend on the ratio between the mean path losses of communication and sensing.

III Mutual Information

In this section, we first derive the expression for the MI of sensing by using both the training and data symbols. Then, we present the MI for communications under CEEs.

III-A MI for Sensing

The MI between the sensing channel matrix $\mathbf{G}$ (or the “target impulse response” matrix in radar) and the reflected signals $\mathbf{Y}_{\rm rad}$, given the knowledge of $\mathbf{X}$, can be used to measure the sensing performance [11]. With our model (4), the MI is given by

\begin{align}
I\left(\mathbf{G};\mathbf{Y}_{\rm rad}|\mathbf{X}\right) &=h\left(\mathbf{Y}_{\rm rad}|\mathbf{X}\right)-h\left(\mathbf{Y}_{\rm rad}|\mathbf{X},\mathbf{G}\right) \tag{5}\\
&=h\left(\mathbf{Y}_{\rm rad}|[\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\right)-h\left(\mathbf{Y}_{\rm rad}|[\mathbf{X}_{t},\mathbf{X}_{d}]^{T},\mathbf{G}\right) \notag\\
&=h\left(\mathbf{Y}_{\rm rad}|[\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\right)-h\left(\mathbf{N}_{r}\right), \notag
\end{align}

where $h(\cdot)$ denotes the entropy of a random variable. Provided that the noise vectors $\mathbf{N}_{r,j}=\begin{bmatrix}\mathbf{n}_{tr,j}\\ \mathbf{n}_{dr,j}\end{bmatrix},\,j=1,\cdots,N$ are independent of each other, the conditional probability density function (PDF) of $\mathbf{Y}_{\rm rad}$ conditioned on $\mathbf{X}$ is given by

\begin{align}
p\left(\mathbf{Y}_{\rm rad}|\mathbf{X}\right) &=p\left(\mathbf{Y}_{\rm rad}|[\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\right)=\prod_{j=1}^{N}p\left(\mathbf{y}_{{\rm rad},j}|[\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\right) \tag{6a}\\
&=\prod_{j=1}^{N}\frac{1}{\pi^{L}\det\left([\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\varSigma_{\mathbf{G}}[\mathbf{X}_{t},\mathbf{X}_{d}]^{\ast}+\sigma_{n}^{2}\mathbf{I}_{L}\right)} \notag\\
&\qquad\times\exp\left(-\mathbf{y}_{{\rm rad},j}^{H}\left([\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\varSigma_{\mathbf{G}}[\mathbf{X}_{t},\mathbf{X}_{d}]^{\ast}+\sigma_{n}^{2}\mathbf{I}_{L}\right)^{-1}\mathbf{y}_{{\rm rad},j}\right) \tag{6b}\\
&=\frac{1}{\pi^{LN}\det^{N}\left([\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\varSigma_{\mathbf{G}}[\mathbf{X}_{t},\mathbf{X}_{d}]^{\ast}+\sigma_{n}^{2}\mathbf{I}_{L}\right)} \notag\\
&\qquad\times\exp\left\{-\operatorname{Tr}\left[\left([\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\varSigma_{\mathbf{G}}[\mathbf{X}_{t},\mathbf{X}_{d}]^{\ast}+\sigma_{n}^{2}\mathbf{I}_{L}\right)^{-1}\mathbf{Y}_{\rm rad}\mathbf{Y}^{H}_{\rm rad}\right]\right\}, \tag{6c}
\end{align}

where (6b) is obtained based on the PDF of the circularly symmetric complex Gaussian distribution, and

\begin{align}
\mathbb{E}\{\mathbf{y}_{{\rm rad},j}\mathbf{y}^{H}_{{\rm rad},j}\} &=\mathbb{E}\left\{\begin{bmatrix}\mathbf{X}^{T}_{t}\mathbf{g}_{j}+\mathbf{n}_{tr,j}\\ \mathbf{X}^{T}_{d}\mathbf{g}_{j}+\mathbf{n}_{dr,j}\end{bmatrix}\left[\mathbf{g}_{j}^{H}\mathbf{X}_{t}^{\ast}+\mathbf{n}^{H}_{tr,j},\,\mathbf{g}_{j}^{H}\mathbf{X}^{\ast}_{d}+\mathbf{n}^{H}_{dr,j}\right]\right\} \tag{7a}\\
&=[\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\mathbb{E}\{\mathbf{g}_{j}\mathbf{g}_{j}^{H}\}[\mathbf{X}_{t},\mathbf{X}_{d}]^{\ast}+\mathbb{E}\{\operatorname{diag}\{\mathbf{n}_{tr,j}\mathbf{n}^{H}_{tr,j},\mathbf{n}_{dr,j}\mathbf{n}^{H}_{dr,j}\}\} \tag{7b}\\
&=[\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\varSigma_{\mathbf{G}}[\mathbf{X}_{t},\mathbf{X}_{d}]^{\ast}+\sigma_{n}^{2}\mathbf{I}_{L}, \tag{7c}
\end{align}

where (7b) is conditioned on $\mathbf{X}$, and $\mathbb{E}\{\mathbf{g}_{j}\mathbf{g}_{j}^{H}\}=\frac{1}{N}\mathbb{E}\{\mathbf{G}\mathbf{G}^{H}\}=\varSigma_{\mathbf{G}}$ is used in (7c) since $\mathbf{g}_{j},\,j=1,\cdots,N$ are independent of each other.

Based on (6), the entropy of $\mathbf{Y}_{\rm rad}$ conditional on $\mathbf{X}$ can be obtained as

\begin{align}
h\left(\mathbf{Y}_{\rm rad}|\mathbf{X}\right) &=LN\log_{2}(\pi)+LN+N\log_{2}\left[\det\left([\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\varSigma_{\mathbf{G}}[\mathbf{X}_{t},\mathbf{X}_{d}]^{\ast}+\sigma_{n}^{2}\mathbf{I}_{L}\right)\right] \tag{8a}\\
&=LN\log_{2}(\pi)+LN+N\log_{2}\left[\det\left([\mathbf{X}_{t},\mathbf{X}_{d}]^{\ast}[\mathbf{X}_{t},\mathbf{X}_{d}]^{T}\varSigma_{\mathbf{G}}+\sigma_{n}^{2}\mathbf{I}_{N}\right)\right] \tag{8b}\\
&=LN\log_{2}(\pi)+LN+N\log_{2}\left[(\sigma_{n}^{2})^{L-N}\det\left(\mathbf{X}^{\ast}_{t}\mathbf{X}^{T}_{t}\varSigma_{\mathbf{G}}+\mathbf{X}^{\ast}_{d}\mathbf{X}_{d}^{T}\varSigma_{\mathbf{G}}+\sigma_{n}^{2}\mathbf{I}_{N}\right)\right], \tag{8c}
\end{align}

where (8c) is based on Sylvester's determinant theorem [26], i.e.,

$$\det\left(\mathbf{A}_{M\times N}\mathbf{B}_{N\times M}+\sigma_{n}^{2}\mathbf{I}_{M}\right)=(\sigma_{n}^{2})^{M-N}\det\left(\mathbf{B}_{N\times M}\mathbf{A}_{M\times N}+\sigma_{n}^{2}\mathbf{I}_{N}\right). \tag{9}$$
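The identity in (9) is easy to confirm numerically. The following Python sketch evaluates both sides for random complex matrices; the dimensions, the noise variance, and the helper name `sylvester_sides` are arbitrary choices made for illustration.

```python
import numpy as np


def sylvester_sides(A, B, sigma_n2):
    """Return the left- and right-hand sides of (9) for A (M x N) and B (N x M)."""
    M, N = A.shape
    lhs = np.linalg.det(A @ B + sigma_n2 * np.eye(M))
    rhs = sigma_n2 ** (M - N) * np.linalg.det(B @ A + sigma_n2 * np.eye(N))
    return lhs, rhs


rng = np.random.default_rng(2)
M, N = 6, 3
A = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
B = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))

lhs, rhs = sylvester_sides(A, B, sigma_n2=0.7)
print(np.isclose(lhs, rhs))
```

The practical benefit is that an $L\times L$ determinant is replaced by an $N\times N$ one, which is cheaper whenever the packet length $L$ exceeds the number of antennas $N$.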

The columns of the noise matrix $\mathbf{N}_{r}$ follow the i.i.d. multivariate complex Gaussian distribution with zero mean and covariance matrix $\sigma^{2}_{n}\mathbf{I}_{N}$, and the entropy of $\mathbf{N}_{r}$ is given by

$$h\left(\mathbf{N}_{r}\right)=LN\log_{2}(\pi)+LN+N\log_{2}\left[\det\left(\sigma_{n}^{2}\mathbf{I}_{N}\right)\right]. \tag{10}$$

By substituting (8) and (10) into (5), the MI for sensing can be obtained as

$$I\left(\mathbf{G};\mathbf{Y}_{\rm rad}|\mathbf{X}\right)=N\log_{2}\left[\det\left(\frac{\mathbf{X}^{\ast}_{t}\mathbf{X}^{T}_{t}\varSigma_{\mathbf{G}}+\mathbf{X}^{\ast}_{d}\mathbf{X}_{d}^{T}\varSigma_{\mathbf{G}}}{\left(\sigma_{n}^{2}\right)^{L-N}}+\mathbf{I}_{N}\right)\right]. \tag{11}$$

III-B MI for Communication

The MI for communication is defined as the mutual dependence between the transmit signals of node A and the received signals of node B, conditional on the estimated channel matrix $\hat{\mathbf{H}}$. With the Gaussian assumption of CEE, the conditional PDF of $\mathbf{Y}^{d}_{\rm com}$ on $\hat{\mathbf{H}}$ is given by

\begin{align}
p\left(\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right) &=\prod_{i=1}^{L_{d}}p\left(\mathbf{y}^{d}_{{\rm com},i}|\hat{\mathbf{H}}\right) \tag{12a}\\
&=\prod_{i=1}^{L_{d}}\frac{1}{\pi^{N}\det\left(\hat{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\hat{\mathbf{H}}^{H}+\sigma_{n'}^{2}\mathbf{I}_{N}\right)}\exp\left(-{\mathbf{y}^{d}_{{\rm com},i}}^{H}\left(\hat{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\hat{\mathbf{H}}^{H}+\sigma_{n'}^{2}\mathbf{I}_{N}\right)^{-1}\mathbf{y}^{d}_{{\rm com},i}\right) \tag{12b}\\
&=\frac{1}{\pi^{L_{d}N}\det^{L_{d}}\left(\hat{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\hat{\mathbf{H}}^{H}+\sigma_{n'}^{2}\mathbf{I}_{N}\right)}\exp\left\{-\operatorname{Tr}\left[\left(\hat{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\hat{\mathbf{H}}^{H}+\sigma_{n'}^{2}\mathbf{I}_{N}\right)^{-1}\mathbf{Y}^{d}_{\rm com}{\mathbf{Y}^{d}_{\rm com}}^{H}\right]\right\}, \tag{12c}
\end{align}

where (12a) is under the assumption that the columns of $\mathbf{Y}^{d}_{\rm com}$ (or $\mathbf{X}_{d}$) are i.i.d., and (12b) follows from the PDF of the circularly symmetric complex Gaussian distribution. The columns of the equivalent noise matrix $\mathbf{N}^{\prime}_{c}$ follow the i.i.d. multivariate complex Gaussian distribution with zero mean and covariance matrix $\sigma_{n'}^{2}\mathbf{I}_{N}$. By referring to (8) – (10), the entropy of $\mathbf{N}^{\prime}_{c}$ is given by

$$h\left(\mathbf{N}^{\prime}_{c}\right)=L_{d}N\log_{2}(\pi)+L_{d}N+L_{d}\log_{2}\left[\det\left(\sigma_{n'}^{2}\mathbf{I}_{N}\right)\right]. \tag{13}$$

Therefore, the conditional MI between $\mathbf{X}_{d}$ and $\mathbf{Y}_{\rm com}^{d}$ is obtained as

\begin{align}
I\left(\mathbf{X}_{d};\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right) &=h\left(\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right)-h\left(\mathbf{Y}^{d}_{\rm com}|\mathbf{X}_{d},\hat{\mathbf{H}}\right) \tag{14}\\
&=h\left(\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right)-h\left(\mathbf{N}^{\prime}_{c}\right)=L_{d}\log_{2}\left[\det\left(\frac{\hat{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\hat{\mathbf{H}}^{H}}{\sigma_{n'}^{2}}+\mathbf{I}_{N}\right)\right]. \notag
\end{align}

Compared to the conventional MI results without consideration of CEE [27], we can see that the CEE here contributes through the equivalent noise variance $\sigma_{n'}^{2}=\frac{P_{d}}{L_{d}}\sigma_{e}^{2}+\sigma_{n}^{2}$.
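To make the role of the CEE concrete, the following Python sketch evaluates the log-determinant expression in (14) for a given channel estimate. The function name `comm_mi` and the toy inputs (identity channel estimate and identity data covariance) are illustrative assumptions, not values from the paper.

```python
import numpy as np


def comm_mi(H_hat, Sigma_xd, P_d, L_d, sigma_e2, sigma_n2):
    """Conditional MI of (14), in bits, for a given channel estimate H_hat."""
    N = H_hat.shape[0]
    sigma_np2 = P_d / L_d * sigma_e2 + sigma_n2          # equivalent noise variance
    M = H_hat @ Sigma_xd @ H_hat.conj().T / sigma_np2 + np.eye(N)
    sign, logdet = np.linalg.slogdet(M)                  # natural-log determinant
    return L_d * logdet / np.log(2)                      # convert to base 2


H_hat = np.eye(2)
Sigma = np.eye(2)
# Without CEE: 10 * log2(det(2 I_2)), i.e. close to 20 bits.
print(comm_mi(H_hat, Sigma, P_d=20.0, L_d=10, sigma_e2=0.0, sigma_n2=1.0))
# With CEE the equivalent noise grows, so the MI is smaller.
print(comm_mi(H_hat, Sigma, P_d=20.0, L_d=10, sigma_e2=0.1, sigma_n2=1.0))
```

Using `slogdet` rather than `det` followed by a logarithm avoids overflow or underflow when $N$ is large.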

IV Channel Estimation Error and Optimal Power Allocation

In this section, we first derive a lower bound for CEE with the use of the training symbols. Based on this lower bound, we then propose an optimal scheme for allocating energy between training and data symbols, to maximize an upper bound of the MI for communications. We optimize the power allocation with respect to communication, as its impact on sensing performance is much weaker.

IV-A Channel Estimation Error

With MMSE MIMO channel estimation, the estimated MIMO channel matrix can be expressed as [10]

$$\hat{\mathbf{H}}=\mathbf{H}\mathbf{X}_{t}\mathbf{X}_{t}^{H}\left(\sigma_{n}^{2}\mathbf{I}_{N}+\mathbf{X}_{t}\mathbf{X}_{t}^{H}\right)^{-1}+\mathbf{N}_{tc}\mathbf{X}_{t}^{H}\left(\sigma_{n}^{2}\mathbf{I}_{N}+\mathbf{X}_{t}\mathbf{X}_{t}^{H}\right)^{-1}=\mathbf{H}-\Delta\mathbf{H}, \tag{15}$$

where, as can be recalled, $\mathbf{X}_{t}$ is an $N\times L_{t}$ training symbol matrix whose elements have the average energy $\sigma_{t}^{2}$. Taking the singular value decomposition (SVD) of $\mathbf{R}_{H}$ gives $\mathbf{R}_{H}=\mathbf{U}_{H}\bm{\Lambda}_{H}\mathbf{U}_{H}^{H}$, where the singular value matrix is $\bm{\Lambda}_{H}=\operatorname{diag}(\delta_{1},\delta_{2},\cdots,\delta_{N})$, and $\frac{1}{N}\sum_{i=1}^{N}\delta_{i}=\frac{1}{N}\operatorname{Tr}(\mathbf{R}_{H})\triangleq\sigma_{h}^{2}$. Let $\bm{\Lambda}_{\rm CRLB}$ be the Cramér-Rao lower bound (CRLB) of the channel matrix estimation [28]. We have

\begin{align}
\mathbb{E}\left[\Delta\mathbf{H}\Delta\mathbf{H}^{H}\right] &=\mathbb{E}\left[\left(\mathbf{H}-\hat{\mathbf{H}}\right)\left(\mathbf{H}-\hat{\mathbf{H}}\right)^{H}\right] \tag{16}\\
&\geq\bm{\Lambda}_{\rm CRLB}=\left(\frac{\varSigma_{\mathbf{X}_{t}}}{\sigma_{n}^{2}}+\mathbf{R}_{H}^{-1}\right)^{-1}=\left(\frac{\mathbf{U}_{H}^{H}\varSigma_{\mathbf{X}_{t}}\mathbf{U}_{H}}{\sigma_{n}^{2}}+\bm{\Lambda}_{H}^{-1}\right)^{-1} \notag\\
&=\operatorname{diag}\left(\frac{\sigma_{n}^{2}\delta_{1}}{\sigma_{n}^{2}+L_{t}\sigma_{t}^{2}\delta_{1}},\cdots,\frac{\sigma_{n}^{2}\delta_{i}}{\sigma_{n}^{2}+L_{t}\sigma_{t}^{2}\delta_{i}},\cdots,\frac{\sigma_{n}^{2}\delta_{N}}{\sigma_{n}^{2}+L_{t}\sigma_{t}^{2}\delta_{N}}\right), \notag
\end{align}

where the last equality follows from $\varSigma_{\mathbf{X}_{t}}=\mathbf{X}_{t}\mathbf{X}_{t}^{H}=L_{t}\sigma_{t}^{2}\mathbf{I}_{N}$. Therefore, a lower bound on the total MMSE of the channel estimate, denoted by $\mathcal{C}_{t}$, can be represented as

$$\mathcal{C}_{t}=\operatorname{Tr}(\bm{\Lambda}_{\rm CRLB})=\sum_{i=1}^{N}\frac{\sigma_{n}^{2}\delta_{i}}{\sigma_{n}^{2}+L_{t}\sigma_{t}^{2}\delta_{i}}.\tag{17}$$

Here, $\mathcal{C}_{t}$ is a function of $\delta_{i},\,i=1,\cdots,N$, with the constraint that $\sum_{i=1}^{N}\delta_{i}=\operatorname{Tr}(\mathbf{R}_{H})$. Therefore, we can further obtain the lower bound of $\mathcal{C}_{t}$ by applying the Lagrange multiplier method. The Lagrangian function can be written as

$$\mathcal{L}\left(\bm{\Lambda}_{H}\right)=\sum_{i=1}^{N}\frac{\sigma_{n}^{2}\delta_{i}}{\sigma_{n}^{2}+L_{t}\sigma_{t}^{2}\delta_{i}}+\tau\left(\sum_{i=1}^{N}\delta_{i}-\operatorname{Tr}(\mathbf{R}_{H})\right),\tag{18}$$

where $\tau$ is the Lagrange multiplier. By solving $\frac{\partial\mathcal{L}\left(\bm{\Lambda}_{H}\right)}{\partial\delta_{i}}=0$, we get

$$\frac{\sigma_{n}^{4}}{\left(\sigma_{n}^{2}+\delta_{i}L_{t}\sigma_{t}^{2}\right)^{2}}+\tau=0,$$

which shows that the lower bound is achieved when $\delta_{1}=\cdots=\delta_{i}=\cdots=\delta_{N}$ and $\delta_{i}=\frac{1}{N}\operatorname{Tr}(\mathbf{R}_{H})=\frac{1}{N}\sum_{i=1}^{N}\delta_{i}=\sigma_{h}^{2},\,i=1,\cdots,N$. The lower bound of $\mathcal{C}_{t}$ is then given by

$$\mathcal{C}_{t}\geq\frac{N\sigma_{n}^{2}\sigma_{h}^{2}}{\sigma_{n}^{2}+L_{t}\sigma_{t}^{2}\sigma_{h}^{2}}.\tag{19}$$

Therefore, for the $i$-th diagonal element $\bm{\Lambda}_{\rm CRLB}(i)$ of $\bm{\Lambda}_{\rm CRLB}$, we have

$$\bm{\Lambda}_{\rm CRLB}(i)\geq\frac{\sigma_{n}^{2}\sigma_{h}^{2}}{\sigma_{n}^{2}+L_{t}\sigma_{t}^{2}\sigma_{h}^{2}}\triangleq\mathcal{C}_{e},\;i=1,\cdots,N.\tag{20}$$
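The step from the matrix form in (16) to the scalar sum in (17) can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names, the random correlation matrix `R_H`, and all numerical values are hypothetical. It evaluates $\operatorname{Tr}(\bm{\Lambda}_{\rm CRLB})$ both from the eigenvalues $\delta_{i}$ and from the matrix inverse, and then evaluates the sum at the stationary point $\delta_{i}=\sigma_{h}^{2}$, where it takes the value on the right-hand side of (19).

```python
import numpy as np

def crlb_trace_from_eigs(delta, sigma_n2, L_t, sigma_t2):
    """Scalar sum in (17), evaluated from the eigenvalues delta_i of R_H."""
    delta = np.asarray(delta, dtype=float)
    return float(np.sum(sigma_n2 * delta / (sigma_n2 + L_t * sigma_t2 * delta)))

def crlb_trace_from_matrix(R_H, sigma_n2, L_t, sigma_t2):
    """Trace of (Sigma_Xt / sigma_n^2 + R_H^{-1})^{-1} with Sigma_Xt = L_t sigma_t^2 I."""
    N = R_H.shape[0]
    info = (L_t * sigma_t2 / sigma_n2) * np.eye(N) + np.linalg.inv(R_H)
    return float(np.trace(np.linalg.inv(info)).real)

# Hypothetical correlation matrix (Hermitian positive definite), illustrative only
rng = np.random.default_rng(2)
N, sigma_n2, L_t, sigma_t2 = 4, 1.0, 8, 0.5
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R_H = A @ A.conj().T / N + 0.1 * np.eye(N)

delta = np.linalg.eigvalsh(R_H)          # eigenvalues delta_i
sigma_h2 = delta.mean()                  # (1/N) Tr(R_H)
c_eigs = crlb_trace_from_eigs(delta, sigma_n2, L_t, sigma_t2)
c_mat = crlb_trace_from_matrix(R_H, sigma_n2, L_t, sigma_t2)
# Value of the sum at the stationary point delta_i = sigma_h^2 for all i
c_equal = crlb_trace_from_eigs(np.full(N, sigma_h2), sigma_n2, L_t, sigma_t2)
print(c_eigs, c_mat, c_equal)
```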

IV-B Optimal Power Allocation of Training and Data Symbols

In general, a transmitter is subject to constraints on its maximum and average transmission powers. Under such power constraints, there is a motivation for optimizing the power allocation between the training and data symbols, especially for maximizing the MI for communications. Here, we optimize the power allocation only with respect to the communication MI, because the allocation affects the communication MI much more strongly than the sensing MI. A larger CEE can substantially deteriorate communication performance, whereas the sensing MI is only slightly affected since the training sequence is directly used for sensing.

Since $\hat{\mathbf{H}}=\mathbf{H}-\Delta\mathbf{H}$, we can obtain that $\hat{\mathbf{H}}$ is a random variable with zero mean and variance $\sigma^{2}_{\hat{\mathbf{H}}}=\frac{1}{N^{2}}\mathbb{E}\left[\operatorname{Tr}\{\hat{\mathbf{H}}\hat{\mathbf{H}}^{H}\}\right]$. According to the orthogonality principle for MMSE [9] and the obtained lower bound of CEE, we have $\sigma^{2}_{\hat{\mathbf{H}}}=\sigma_{h}^{2}-\sigma_{e}^{2}$. Therefore, the estimated channel $\hat{\mathbf{H}}$ can be normalized as $\tilde{\mathbf{H}}=\frac{1}{\sigma_{\hat{\mathbf{H}}}}\hat{\mathbf{H}}$, whose elements follow the i.i.d. complex Gaussian distribution $\mathcal{CN}(0,1)$.

The MI in (14) can then be rewritten as

\begin{align}
I\left(\mathbf{X}_{d};\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right)&=L_{d}\log_{2}\left[\det\left(\frac{\sigma^{2}_{\hat{\mathbf{H}}}}{\sigma_{n^{\prime}}^{2}}\tilde{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\tilde{\mathbf{H}}^{H}+\mathbf{I}_{N}\right)\right]\tag{21a}\\
&=L_{d}\log_{2}\left[\det\left(\frac{\sigma_{h}^{2}-\sigma_{e}^{2}}{\frac{P_{d}}{L_{d}}\sigma_{e}^{2}+\sigma_{n}^{2}}\tilde{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\tilde{\mathbf{H}}^{H}+\mathbf{I}_{N}\right)\right]\tag{21b}\\
&\leq L_{d}\log_{2}\left[\det\left(\frac{\sigma_{h}^{2}-\mathcal{C}_{e}}{\frac{P_{d}}{L_{d}}\mathcal{C}_{e}+\sigma_{n}^{2}}\tilde{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\tilde{\mathbf{H}}^{H}+\mathbf{I}_{N}\right)\right],\tag{21c}
\end{align}

where (21c) is due to the lower bound of CEE, i.e., $\sigma_{e}^{2}\geq\mathcal{C}_{e}$.

By substituting $\mathcal{C}_{e}$ of (20) into (21) and exploiting Jensen's inequality, we can obtain the results for optimal power allocation and minimized CEE as summarized in the following theorem.

Theorem 1 (Optimal Power Allocation).

The optimal power allocation for maximizing the channel capacity under the training symbols is given by

$$\kappa_{\rm op}=\begin{cases}\Gamma+\sqrt{\Gamma(\Gamma-1)}, & {\rm if}\;L_{d}<N;\\ \frac{1}{2}, & {\rm if}\;L_{d}=N;\\ \Gamma-\sqrt{\Gamma(\Gamma-1)}, & {\rm if}\;L_{d}>N,\end{cases}\tag{22}$$

where $\Gamma=\frac{L_{d}}{L_{d}-N}\left(1+\frac{N\sigma_{n}^{2}}{P\sigma_{h}^{2}}\right)$.

The lower bound of the CEE can be given by

$$\mathcal{C}_{t}^{l}=\frac{N\sigma_{n}^{2}\sigma_{h}^{2}}{N\sigma_{n}^{2}+\left(1-\kappa_{\rm op}\right)P\sigma_{h}^{2}}.\tag{23}$$
Proof.

The proof is provided in Appendix A. ∎

Corollary 1.

In a high SNR regime, $\Gamma$ and $\kappa_{\rm op}$ can be approximated as

$$\Gamma\approx\frac{L_{d}}{L_{d}-N};\;\kappa_{\rm op}\approx\frac{\sqrt{L_{d}}}{\sqrt{L_{d}}+\sqrt{N}}.$$

In this case, the SNR is approximately $\rho_{\max}=\frac{L_{d}}{(\sqrt{L_{d}}+\sqrt{N})^{2}}\frac{P}{N\sigma_{n}^{2}}$. We can find that, in the high SNR regime, the power allocation between the data and the training symbols depends only on the number of data symbols, $L_{d}$, and the number of antennas, $N$. Moreover, according to the approximation above, $\kappa_{\rm op}$ increases with $L_{d}$ and decreases with $N$.

In a low SNR regime, $\Gamma$ and $\kappa_{\rm op}$ can be approximated as

$$\Gamma\approx\frac{L_{d}N\sigma_{n}^{2}}{(L_{d}-N)P\sigma_{h}^{2}};\;\kappa_{\rm op}\approx\frac{1}{2}.$$

In this case, the SNR is approximately $\rho_{\max}=\frac{P^{2}\sigma_{h}^{4}}{4N^{2}\sigma_{n}^{4}}$. We find that half of the total energy is allocated to the training symbols, and the maximum SNR in the low SNR regime increases quadratically with $\frac{P}{\sigma_{n}^{2}}$.
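The allocation rule (22) and the two asymptotic regimes of Corollary 1 can be evaluated with a few lines of code. The sketch below is a minimal illustration; the function names and the numerical values are assumptions introduced here, not part of the paper.

```python
import math

def gamma_factor(L_d, N, P, sigma_n2=1.0, sigma_h2=1.0):
    """Gamma as defined after (22); only valid for L_d != N."""
    return L_d / (L_d - N) * (1.0 + N * sigma_n2 / (P * sigma_h2))

def kappa_opt(L_d, N, P, sigma_n2=1.0, sigma_h2=1.0):
    """Fraction of the total power P assigned to the data symbols, per (22)."""
    if L_d == N:
        return 0.5
    g = gamma_factor(L_d, N, P, sigma_n2, sigma_h2)
    root = math.sqrt(g * (g - 1.0))
    return g + root if L_d < N else g - root

def kappa_high_snr(L_d, N):
    """High-SNR approximation from Corollary 1."""
    return math.sqrt(L_d) / (math.sqrt(L_d) + math.sqrt(N))

# Illustrative values: L = 128 total symbols, L_t = N = 8 training symbols
L_d, N = 120, 8
k_high = kappa_opt(L_d, N, P=1e9)   # very high SNR
k_low = kappa_opt(L_d, N, P=1e-6)   # very low SNR
print(k_high, kappa_high_snr(L_d, N), k_low)
```

With these toy values, the exact rule agrees with the high-SNR approximation when $P$ is large and tends to $\frac{1}{2}$ when $P$ is small.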

Hereafter, unless otherwise stated, we use the optimal power allocation formula (22) for power allocation and let $\mathcal{C}^{l}_{e}=\frac{\mathcal{C}_{t}^{l}}{N}=\frac{\sigma_{n}^{2}\sigma_{h}^{2}}{N\sigma_{n}^{2}+\left(1-\kappa_{\rm op}\right)P\sigma_{h}^{2}}$.

V Optimal Waveform Design

With the optimized power allocation in Section IV, we investigate the waveform design for three scenarios in this section, which are maximizing the MI for sensing only and for communications only, and maximizing a weighted relative MI jointly for communication and radio sensing.

V-A Optimal Waveform Design for Radio Sensing Only

To achieve the maximum MI for sensing, or in other words, to make the received signals $\mathbf{Y}_{\rm rad}$ (including $\mathbf{Y}^{t}_{\rm rad}$ and $\mathbf{Y}^{d}_{\rm rad}$) contain rich information about $\mathbf{G}$, the transmit signals $\mathbf{X}$ (including the training sequence $\mathbf{X}_{t}$ and the data sequence $\mathbf{X}_{d}$) should be designed according to the sensing channel matrix $\mathbf{G}$. Since the training sequence $\mathbf{X}_{t}$ and the data sequence $\mathbf{X}_{d}$ are independent and have different correlations, the optimization problem for maximizing the MI for sensing can be decoupled into two separate optimization problems. As assumed, $\mathbf{X}_{t}$ contains deterministic orthogonal rows and $\mathbf{X}_{t}\mathbf{X}_{t}^{H}=L_{t}\sigma_{t}^{2}\mathbf{I}_{N}$. Hence, we only need to consider the optimization problem for the data sequence.

For the data sequence, the spatial correlation matrix can be diagonalized through SVD, that is,

$$\varSigma_{\mathbf{G}}=\frac{1}{N}\mathbb{E}\{\mathbf{G}\mathbf{G}^{H}\}=\mathbf{U}_{G}\mathbf{\Lambda}_{G}\mathbf{U}_{G}^{H},\tag{24}$$

where $\mathbf{U}_{G}$ is a unitary matrix and $\mathbf{\Lambda}_{G}=\operatorname{diag}\left\{\lambda_{11},\cdots,\lambda_{ii},\cdots,\lambda_{NN}\right\}$ is a diagonal matrix with $\lambda_{ii}$ being the singular values. The mean channel gain $\sigma_{g}^{2}$ for sensing channels is $\sigma_{g}^{2}=\sum_{i=1}^{N}\lambda_{ii}$. The MI in (11) can be rewritten as (25),

\begin{align}
I\left(\mathbf{G};\mathbf{Y}_{\rm rad}|\mathbf{X}\right)&=N\log_{2}\left[\det\left(\frac{\mathbf{X}^{\ast}_{t}\mathbf{X}_{t}^{T}\mathbf{U}_{G}\mathbf{\Lambda}_{G}\mathbf{U}_{G}^{H}+\mathbf{X}^{\ast}_{d}\mathbf{X}_{d}^{T}\mathbf{U}_{G}\mathbf{\Lambda}_{G}\mathbf{U}_{G}^{H}}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+\mathbf{I}_{N}\right)\right]\tag{25a}\\
&=N\log_{2}\left[\det\left(\frac{\mathbf{\Lambda}_{G}\left(\mathbf{X}^{T}_{t}\mathbf{U}_{G}\right)^{H}\mathbf{X}^{T}_{t}\mathbf{U}_{G}+\mathbf{\Lambda}_{G}\left(\mathbf{X}^{T}_{d}\mathbf{U}_{G}\right)^{H}\mathbf{X}^{T}_{d}\mathbf{U}_{G}}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+\mathbf{I}_{N}\right)\right],\tag{25b}
\end{align}

where (25b) is based on Sylvester's determinant theorem. Define $\mathbf{Q}^{(t)}=\left(\mathbf{X}^{T}_{t}\mathbf{U}_{G}\right)^{H}\mathbf{X}^{T}_{t}\mathbf{U}_{G}$ and $\mathbf{Q}^{(d)}=\left(\mathbf{X}^{T}_{d}\mathbf{U}_{G}\right)^{H}\mathbf{X}^{T}_{d}\mathbf{U}_{G}$, whose $(i,j)$-th entries are $q^{(t)}_{ij}$ and $q^{(d)}_{ij}$, respectively.

According to Hadamard's inequality for the determinant and trace of an $N\times N$ positive semi-definite Hermitian matrix, we have the following inequalities:

$$\det\left(\mathbf{Q}^{(t)}_{N\times N}\right)\leq\prod_{i=1}^{N}q^{(t)}_{ii};\;\det\left(\mathbf{Q}^{(d)}_{N\times N}\right)\leq\prod_{i=1}^{N}q^{(d)}_{ii},$$

and

$$\operatorname{Tr}\left({\mathbf{Q}^{(t)}_{N\times N}}^{-1}\right)\geq\sum_{i=1}^{N}\frac{1}{q^{(t)}_{ii}};\;\operatorname{Tr}\left({\mathbf{Q}^{(d)}_{N\times N}}^{-1}\right)\geq\sum_{i=1}^{N}\frac{1}{q^{(d)}_{ii}},$$

where the equalities are achieved if and only if the corresponding matrix $\mathbf{Q}_{N\times N}$ is diagonal. As a result, the MI can be upper bounded as

$$I\left(\mathbf{G};\mathbf{Y}_{\rm rad}|\mathbf{X}\right)\leq N\log_{2}\left[\prod_{i=1}^{N}\left(\frac{\lambda_{ii}(q^{(t)}_{ii}+q^{(d)}_{ii})}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+1\right)\right].\tag{26}$$

Since $\mathbf{U}_{G}$ is unitary, we can find $\operatorname{Tr}\left(\mathbf{Q}^{(d)}\right)=\operatorname{Tr}\left(\left(\mathbf{X}^{T}_{d}\mathbf{U}_{G}\right)^{H}\mathbf{X}^{T}_{d}\mathbf{U}_{G}\right)=\operatorname{Tr}\left(\mathbf{Z}^{H}\mathbf{Z}\right)$, where $\mathbf{Z}=\mathbf{X}^{T}_{d}\mathbf{U}_{G}$ is an $L_{d}\times N$ matrix. Under the constraint that the total transmit power is finite, we have $\operatorname{Tr}\left(\mathbf{Q}^{(d)}\right)\leq P_{d}=\kappa_{\rm op}P$, where $P_{d}$ is the total power of the transmit data signals. Since $\mathbf{X}_{t}$ satisfies the orthogonality condition, we have $\operatorname{Tr}\left(\mathbf{Q}^{(t)}\right)=\operatorname{Tr}\left(\left(\mathbf{X}^{T}_{t}\mathbf{U}_{G}\right)^{H}\mathbf{X}^{T}_{t}\mathbf{U}_{G}\right)=\operatorname{Tr}\left(\mathbf{X}^{\ast}_{t}\mathbf{X}^{T}_{t}\right)=\operatorname{Tr}\left(\mathbf{X}_{t}\mathbf{X}^{H}_{t}\right)=P_{t}=(1-\kappa_{\rm op})P$ and $q^{(t)}_{ii}=\frac{P_{t}}{N}=\frac{(1-\kappa_{\rm op})P}{N}$. Therefore, the maximum MI can be obtained by solving the following constrained problem:

\begin{align}
F_{r}=\underset{\mathbf{Q}^{(d)}}{\max}\;&\sum_{i=1}^{N}\left\{\log_{2}\left(\frac{\lambda_{ii}\left(\frac{P_{t}}{N}+q^{(d)}_{ii}\right)}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+1\right)\right\},\tag{27}\\
\text{subject to}\;\;&\operatorname{Tr}\left(\mathbf{Q}^{(d)}\right)\leq P_{d};\notag\\
&q^{(d)}_{ii}\geq 0,\;1\leq i\leq N.\notag
\end{align}

We can apply the Lagrange multiplier method to solve (27). The Lagrangian can be written as

$$\mathcal{L}\left(\mathbf{Q}^{(d)}\right)=\sum_{i=1}^{N}\left\{\log_{2}\left(\frac{\lambda_{ii}\left(\frac{P_{t}}{N}+q^{(d)}_{ii}\right)}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+1\right)\right\}+\alpha\sum_{i=1}^{N}q^{(d)}_{ii},\tag{28}$$

where $\alpha$ is the Lagrange multiplier associated with $q^{(d)}_{ii},\,i=1,\cdots,N$. Differentiating $\mathcal{L}(\mathbf{Q}^{(d)})$ with respect to $q^{(d)}_{ii}$, $i=1,\cdots,N$, and setting the first-order derivative to $0$, we can obtain $q^{(d)}_{ii}$ as follows:

$$q^{(d)}_{ii}=-\frac{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}{\alpha\ln 2}-\frac{P_{t}}{N}-\frac{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}{\lambda_{ii}}.\tag{29}$$

The optimality conditions are satisfied if

$$\sum_{i=1}^{N}\left(-\frac{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}{\alpha\ln 2}-\frac{P_{t}}{N}-\frac{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}{\lambda_{ii}}\right)^{+}=\kappa_{\rm op}P\tag{30}$$

holds, and

$$q^{(d)}_{ii}=\left(-\frac{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}{\alpha\ln 2}-\frac{P_{t}}{N}-\frac{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}{\lambda_{ii}}\right)^{+},\;i=1,\cdots,N.\tag{31}$$
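Conditions (30) and (31) have a water-filling structure: the term involving $\alpha$ is the same for every $i$ and acts as a common water level. The following minimal sketch finds that level by bisection. The function name, the parameter `noise_scale` (standing for $(\sigma_{n}^{2})^{\frac{L}{N}}$), the variable `mu` (standing for the constant term that depends on $\alpha$), and the numerical values are all assumptions made for illustration.

```python
import numpy as np

def sensing_waterfill(lam, P_t, P_d, noise_scale, iters=200):
    """Water-filling solution of the form (31): q_i = (mu - P_t/N - noise_scale/lam_i)^+."""
    lam = np.asarray(lam, dtype=float)
    N = lam.size
    floors = P_t / N + noise_scale / lam      # per-direction floor below the water level
    lo = floors.min()                         # at this level no power is allocated
    hi = floors.max() + P_d                   # at this level at least P_d is allocated
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - floors, 0.0).sum() > P_d:
            hi = mu
        else:
            lo = mu
    mu = 0.5 * (lo + hi)
    return np.maximum(mu - floors, 0.0), mu

# Illustrative eigenvalues of the sensing correlation matrix (hypothetical values)
lam = np.array([2.0, 1.0, 0.5, 0.05])
q, mu = sensing_waterfill(lam, P_t=1.0, P_d=2.0, noise_scale=1.0)
print(q, mu)
```

In this toy example the two weakest directions receive no data power, which is the behavior encoded by the $(\cdot)^{+}$ operator in (31).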

Since the diagonal elements of $\mathbf{Q}^{(d)}$ are real and non-negative, ${\mathbf{Q}^{(d)}}^{\frac{1}{2}}$ exists. For any $L_{d}\times N$ matrix $\mathbf{\Psi}$ with orthonormal columns, i.e., $\mathbf{\Psi}^{H}\mathbf{\Psi}=\mathbf{I}_{N}$, we have $\mathbf{Z}=\mathbf{\Psi}{\mathbf{Q}^{(d)}}^{\frac{1}{2}}$. Since $\mathbf{Z}=\mathbf{X}^{H}_{d}\mathbf{U}_{G}$, $\mathbf{X}_{d}$ can be obtained as

$$\mathbf{X}_{d}=\left(\mathbf{\Psi}{\mathbf{Q}^{(d)}}^{\frac{1}{2}}\mathbf{U}_{G}^{H}\right)^{H}.\tag{32}$$

With the obtained optimal $\mathbf{X}_{d}$ for sensing, we can derive the corresponding communication MI, which is not necessarily optimal, as given by

\begin{align}
I\left(\mathbf{X}_{d};\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right)&\leq L_{d}\log_{2}\left[\det\left(\frac{(\sigma_{h}^{2}-\mathcal{C}^{l}_{e})\tilde{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\tilde{\mathbf{H}}^{H}}{\frac{P_{d}}{L_{d}}\mathcal{C}^{l}_{e}+\sigma_{n}^{2}}+\mathbf{I}_{N}\right)\right]\tag{33a}\\
&=L_{d}\log_{2}\left[\det\left(\frac{(\sigma_{h}^{2}-\mathcal{C}^{l}_{e})\mathbf{U}_{G}\mathbf{Q}^{(d)}\mathbf{U}_{G}^{H}\tilde{\mathbf{H}}^{H}\tilde{\mathbf{H}}}{\frac{P_{d}}{L_{d}}\mathcal{C}^{l}_{e}+\sigma_{n}^{2}}+\mathbf{I}_{N}\right)\right],\tag{33b}
\end{align}

where (33b) is obtained due to $\varSigma_{\mathbf{X}_{d}}=\frac{1}{L_{d}}\mathbb{E}\left\{\mathbf{X}_{d}\mathbf{X}_{d}^{H}\right\}=\left(\mathbf{\Psi}{\mathbf{Q}^{(d)}}^{\frac{1}{2}}\mathbf{U}_{G}^{H}\right)^{H}\mathbf{\Psi}{\mathbf{Q}^{(d)}}^{\frac{1}{2}}\mathbf{U}_{G}^{H}=\mathbf{U}_{G}\mathbf{Q}^{(d)}\mathbf{U}_{G}^{H}$.
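The construction in (32) can be illustrated numerically. The sketch below assumes $L_{d}\geq N$, draws a hypothetical unitary $\mathbf{U}_{G}$ and a matrix $\mathbf{\Psi}$ with orthonormal columns (obtained here from a reduced QR factorization, which is one possible choice and not prescribed by the paper), and checks that $\mathbf{X}_{d}\mathbf{X}_{d}^{H}=\mathbf{U}_{G}\mathbf{Q}^{(d)}\mathbf{U}_{G}^{H}$. All numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
N, L_d = 4, 16
q = np.array([1.25, 0.75, 0.0, 0.0])      # illustrative diagonal of Q^(d)

# Hypothetical unitary U_G: eigenvectors of a random Hermitian matrix
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
_, U_G = np.linalg.eigh(A @ A.conj().T)

# Psi: L_d x N matrix with orthonormal columns (reduced QR of a random matrix)
B = rng.standard_normal((L_d, N)) + 1j * rng.standard_normal((L_d, N))
Psi, _ = np.linalg.qr(B)

# Z = Psi Q^{1/2}, and X_d = (Psi Q^{1/2} U_G^H)^H as in (32)
Z = Psi @ np.diag(np.sqrt(q))
X_d = (Z @ U_G.conj().T).conj().T         # N x L_d data matrix

gram = X_d @ X_d.conj().T                 # should equal U_G Q^(d) U_G^H
target = U_G @ np.diag(q) @ U_G.conj().T
print(np.allclose(gram, target), np.trace(gram).real)
```

Any other $\mathbf{\Psi}$ with $\mathbf{\Psi}^{H}\mathbf{\Psi}=\mathbf{I}_{N}$ yields the same Gram matrix, so the choice of $\mathbf{\Psi}$ does not affect the MI expressions.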

V-B Optimal Waveform Design for Communication Only

After obtaining the optimal power allocation between the training and data symbols for maximizing the channel capacity in the presence of CEE, as shown in Theorem 1, we can design the waveform based on the optimal power allocation and the correlated channel matrix $\mathbf{H}$ to maximize the MI for communications (i.e., channel capacity). As derived in Appendix A, the MI maximization problem can be formulated to maximize the upper bound of the MI, as given by

\begin{align}
F_{c}=\max\;&L_{d}\sum_{i=1}^{N}\left\{\log_{2}\left[\frac{\left(\sigma_{h}^{2}-\mathcal{C}^{l}_{e}\right)\mu_{ii}\xi_{ii}}{\frac{P_{d}}{L_{d}}\mathcal{C}^{l}_{e}+\sigma_{n}^{2}}+1\right]\right\}\tag{34}\\
\text{subject to}\;\;&\operatorname{Tr}\left(\mathbf{\Xi}\right)\leq\kappa_{\rm op}P;\notag\\
&\xi_{ii}\geq 0,\;1\leq i\leq N.\notag
\end{align}

The optimality conditions for this problem are satisfied if

$$\sum_{i=1}^{N}\left[-\frac{1}{\beta^{\prime}\ln 2}-\frac{\sigma_{n}^{2}+\frac{P_{d}}{L_{d}}\mathcal{C}^{l}_{e}}{\mu_{ii}\left(\sigma_{h}^{2}-\mathcal{C}^{l}_{e}\right)}\right]^{+}=\kappa_{\rm op}P\tag{35}$$

holds, where $\beta^{\prime}$ is the Lagrange multiplier. The optimal singular values $\xi_{ii}$ of the correlation matrix $\mathbf{\Xi}$ can be obtained as

$$\xi_{ii}=\left[-\frac{1}{\beta^{\prime}\ln 2}-\frac{\sigma_{n}^{2}+\frac{P_{d}}{L_{d}}\mathcal{C}^{l}_{e}}{\mu_{ii}\left(\sigma_{h}^{2}-\mathcal{C}^{l}_{e}\right)}\right]^{+},\;i=1,\cdots,N.\tag{36}$$
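Equations (35) and (36) again describe a water-filling allocation, now over the eigenvalues $\mu_{ii}$ with an effective gain that accounts for the CEE. As an alternative to a search over the multiplier, the water level can be found in closed form by sorting, as in the minimal sketch below. The helper names, the variable `level` (standing for $-\frac{1}{\beta^{\prime}\ln 2}$), and the numerical values are assumptions for illustration.

```python
import numpy as np

def effective_gain(sigma_h2, C_e, P_d, L_d, sigma_n2):
    """Common factor (sigma_h^2 - C_e) / ((P_d / L_d) * C_e + sigma_n^2) in (34)."""
    return (sigma_h2 - C_e) / ((P_d / L_d) * C_e + sigma_n2)

def comm_waterfill(mu_eig, gain, P_d):
    """Solve xi_i = (level - 1 / (gain * mu_i))^+ with sum(xi) = P_d by sorting."""
    mu_eig = np.asarray(mu_eig, dtype=float)
    floors = 1.0 / (gain * mu_eig)
    sorted_floors = np.sort(floors)
    csum = np.cumsum(sorted_floors)
    # Try k active directions, from all of them down to one
    for k in range(floors.size, 0, -1):
        level = (P_d + csum[k - 1]) / k
        if level > sorted_floors[k - 1]:
            break                          # the k lowest floors are all under water
    return np.maximum(level - floors, 0.0), level

# Illustrative values (hypothetical, not from the paper)
gain = effective_gain(sigma_h2=1.0, C_e=0.0, P_d=1.0, L_d=16, sigma_n2=1.0)
xi, level = comm_waterfill([2.0, 1.0, 0.1], gain, P_d=1.0)
print(xi, level)
```

With `C_e = 0` the gain reduces to $\sigma_{h}^{2}/\sigma_{n}^{2}$, which corresponds to the case without CEE.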

Since the diagonal elements of $\mathbf{\Xi}$ are real and positive, ${\mathbf{\Xi}}^{\frac{1}{2}}$ exists. For any $L_{d}\times N$ matrix $\mathbf{\Theta}$ with orthonormal columns, i.e., $\mathbf{\Theta}^{H}\mathbf{\Theta}=\mathbf{I}_{N}$, we have $\mathbf{Y}^{d}_{\rm com}=\mathbf{\Theta}{\mathbf{\Xi}}^{\frac{1}{2}}$. Since $\mathbf{Y}^{d}_{\rm com}=\mathbf{X}_{d}\mathbf{U}_{\tilde{\mathbf{H}}}$, $\mathbf{X}_{d}$ can be obtained as

$$\mathbf{X}_{d}=\left(\mathbf{\Theta}{\mathbf{\Xi}}^{\frac{1}{2}}\mathbf{U}_{\tilde{\mathbf{H}}}^{H}\right)^{H}.\tag{37}$$

Based on the optimal singular values of the covariance matrix for the data signals in (36), we can obtain the MI for sensing under the condition of the maximum MI for communications, as given by

$$I\left(\mathbf{G};\mathbf{Y}_{\rm rad}|\mathbf{X}\right)=N\log_{2}\left[\det\left(\frac{\left(\mathbf{X}^{T}_{t}\mathbf{U}_{G}\right)^{H}\mathbf{X}^{T}_{t}\mathbf{U}_{G}+\mathbf{U}_{\tilde{\mathbf{H}}}\mathbf{\Xi}\mathbf{U}_{\tilde{\mathbf{H}}}^{H}\varSigma_{\mathbf{G}}}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+\mathbf{I}_{N}\right)\right],\tag{38}$$

where (38) is obtained due to $\varSigma_{\mathbf{X}_{d}}=\frac{1}{L_{d}}\mathbb{E}\left\{\mathbf{X}^{\ast}_{d}\mathbf{X}_{d}^{T}\right\}=\frac{1}{L_{d}}\mathbb{E}\left\{\mathbf{X}_{d}\mathbf{X}_{d}^{H}\right\}=\left(\mathbf{\Theta}{\mathbf{\Xi}}^{\frac{1}{2}}\mathbf{U}_{\tilde{\mathbf{H}}}^{H}\right)^{H}\mathbf{\Theta}{\mathbf{\Xi}}^{\frac{1}{2}}\mathbf{U}_{\tilde{\mathbf{H}}}^{H}=\mathbf{U}_{\tilde{\mathbf{H}}}\mathbf{\Xi}\mathbf{U}_{\tilde{\mathbf{H}}}^{H}$.

V-C Joint Maximization of a Weighted Sum of MI

In this section, we optimize the waveform by jointly considering the MI for both communication and radio sensing. Since there is generally no solution that can simultaneously maximize the MI for communication and radio sensing, a weighted sum of them is exploited and given by

$$F_{w}=\frac{w_{r}}{F_{r}}I\left(\mathbf{G};\mathbf{Y}_{\rm rad}|\mathbf{X}\right)+\frac{1-w_{r}}{F_{c}}I\left(\mathbf{X}_{d};\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right).\tag{39}$$

To maximize the weighted sum, the transmitted data signals $\mathbf{X}_{d}$ should be designed according to the correlation matrices of both $\mathbf{H}$ and $\mathbf{G}$. Based on the SVD, $\varSigma_{\mathbf{H}}=\frac{1}{N}\mathbb{E}\{\mathbf{H}\mathbf{H}^{H}\}=\mathbf{U}_{H}\mathbf{\Lambda}_{H}\mathbf{U}_{H}^{H}$ and $\varSigma_{\mathbf{G}}=\frac{1}{N}\mathbb{E}\{\mathbf{G}\mathbf{G}^{H}\}=\mathbf{U}_{G}\mathbf{\Lambda}_{G}\mathbf{U}_{G}^{H}$, (39) can be rewritten as

\begin{align}
F_{w}&=\frac{w_{r}N}{F_{r}}\log_{2}\left[\det\left(\frac{\mathbf{X}^{\ast}_{t}\mathbf{X}_{t}^{T}\varSigma_{\mathbf{G}}+\mathbf{X}^{\ast}_{d}\mathbf{X}_{d}^{T}\varSigma_{\mathbf{G}}}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+\mathbf{I}_{N}\right)\right]\tag{40a}\\
&\qquad+\frac{(1-w_{r})L_{d}}{F_{c}}\log_{2}\left[\det\left(\frac{(\sigma_{h}^{2}-\mathcal{C}^{l}_{e})\tilde{\mathbf{H}}\varSigma_{\mathbf{X}_{d}}\tilde{\mathbf{H}}^{H}}{\frac{P_{d}}{L_{d}}\mathcal{C}^{l}_{e}+\sigma_{n}^{2}}+\mathbf{I}_{N}\right)\right]\tag{40b}\\
&=\frac{w_{r}N}{F_{r}}\log_{2}\left[\det\left(\frac{\mathbf{\Lambda}_{G}\left(\mathbf{X}^{T}_{t}\mathbf{U}_{G}\right)^{H}\mathbf{X}^{T}_{t}\mathbf{U}_{G}+\mathbf{\Lambda}_{G}\left(\mathbf{X}^{T}_{d}\mathbf{U}_{G}\right)^{H}\mathbf{X}^{T}_{d}\mathbf{U}_{G}}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+\mathbf{I}_{N}\right)\right]\notag\\
&\qquad+\frac{(1-w_{r})L_{d}}{F_{c}}\log_{2}\left[\det\left(\frac{(\sigma_{h}^{2}-\mathcal{C}^{l}_{e})\mathbf{\Lambda}_{\tilde{\mathbf{H}}}\left(\mathbf{X}_{d}\mathbf{U}_{\tilde{\mathbf{H}}}\right)^{H}\mathbf{X}_{d}\mathbf{U}_{\tilde{\mathbf{H}}}}{\frac{P_{d}}{L_{d}}\mathcal{C}^{l}_{e}+\sigma_{n}^{2}}+\mathbf{I}_{N}\right)\right].\tag{40c}
\end{align}

Define $\mathbf{\Pi}=(\mathbf{X}^{T}_{d}\mathbf{U}_{H})^{H}\mathbf{X}^{T}_{d}\mathbf{U}_{H}=(\mathbf{X}^{T}_{d}\mathbf{U}_{G})^{H}\mathbf{X}^{T}_{d}\mathbf{U}_{G}$ with the $(i,j)$-th entry $\varpi_{ij}$, and $\operatorname{Tr}\left(\mathbf{\Pi}\right)=\operatorname{Tr}\left(\varSigma_{\mathbf{X}_{d}}\right)$. According to Hadamard's inequality for the determinant and trace of a positive semi-definite Hermitian matrix, we have $\det\left(\mathbf{\Pi}_{N\times N}\right)\leq\prod_{i=1}^{N}\varpi_{ii}$. We can formulate the MI maximization problem as

\begin{align}
F_{w}\leq\underset{\mathbf{\Pi}}{\max}\;&\sum_{i=1}^{N}\left\{\frac{w_{r}}{F_{r}}N\log_{2}\left(\frac{\lambda_{ii}\left(\frac{P_{t}}{N}+\varpi_{ii}\right)}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+1\right)+\frac{1-w_{r}}{F_{c}}L_{d}\log_{2}\left(\frac{\left(\sigma_{h}^{2}-\mathcal{C}_{e}^{l}\right)\mu_{ii}\varpi_{ii}}{\frac{P_{d}}{L_{d}}\mathcal{C}_{e}^{l}+\sigma_{n}^{2}}+1\right)\right\}\tag{41a}\\
\text{subject to}\;\;&\operatorname{Tr}\left(\mathbf{\Pi}\right)\leq P_{d};\tag{41b}\\
&\varpi_{ii}\geq 0,\;1\leq i\leq N,\tag{41c}
\end{align}

where $F_{r}$ and $F_{c}$ are the maximum MI (27) in Section V-A and the communication capacity (34) in Section V-B, respectively.

The objective function in (41a) is concave, since it is a non-negative weighted sum of two concave functions of $\varpi_{ii}$, i.e.,

$$N\log_{2}\left(\frac{\lambda_{ii}\left(\frac{P_{t}}{N}+\varpi_{ii}\right)}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+1\right);\;{\rm and}\;L_{d}\log_{2}\left(\frac{\left(\sigma_{h}^{2}-\mathcal{C}_{e}^{l}\right)\mu_{ii}\varpi_{ii}}{\frac{P_{d}}{L_{d}}\mathcal{C}_{e}^{l}+\sigma_{n}^{2}}+1\right).$$

Besides, the constraints $\operatorname{Tr}\left(\mathbf{\Pi}\right)\leq P_{d}$ and $\varpi_{ii}\geq 0,\;1\leq i\leq N$, are affine. Therefore, the maximization of the concave objective in (41) can be equivalently reformulated as the minimization of a convex objective. The optimization problem can be solved by using the Karush-Kuhn-Tucker (KKT) conditions. Let

$$\nu_{i}=\frac{\lambda_{ii}}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}},\ \varphi_{i}=\frac{\left(\sigma_{h}^{2}-\mathcal{C}_{e}^{l}\right)\mu_{ii}}{\frac{P_{d}}{L_{d}}\mathcal{C}_{e}^{l}+\sigma_{n}^{2}},\ \epsilon=\frac{w_{r}N}{F_{r}\ln 2},\ \text{and}\ \eta=\frac{\left(1-w_{r}\right)L_{d}}{F_{c}\ln 2},$$

we have

\begin{align}
&\zeta-\zeta_{i}=\frac{\epsilon\nu_{i}}{1+\nu_{i}(\frac{P_{t}}{N}+\varpi_{ii})}+\frac{\eta\varphi_{i}}{1+\varphi_{i}\varpi_{ii}};\tag{42a}\\
&\zeta\left(\sum_{i=1}^{N}\varpi_{ii}-P_{d}\right)=0;\tag{42b}\\
&\zeta_{i}\varpi_{ii}=0;\tag{42c}\\
&\zeta\geq 0;\;\zeta_{i}\geq 0,\;i=1,\cdots,N,\tag{42d}
\end{align}

where $\zeta$ and $\zeta_{i},\;i=1,\cdots,N$, are the Lagrange multipliers. The optimal solution for (42) is given by

$$\hat{\varpi}_{ii}=\frac{1}{2}\left[\frac{1}{\zeta}\left(\epsilon+\eta\right)-\left(\frac{P_{t}}{N}+\frac{1}{\nu_{i}}+\frac{1}{\varphi_{i}}\right)+\sqrt{\left[\left(\frac{P_{t}}{N}+\frac{1}{\nu_{i}}-\frac{1}{\varphi_{i}}\right)+\frac{1}{\zeta}\left(\eta-\epsilon\right)\right]^{2}+\frac{4\epsilon\eta}{\zeta^{2}}}\,\right]^{+},\tag{43}$$

where $\left[x\right]^{+}=\max\left\{x,0\right\}$, and $\hat{\varpi}_{ii},\;i=1,\cdots,N$, satisfy the following equality:

$$\sum_{i=1}^{N}\hat{\varpi}_{ii}-P_{d}=0.\tag{44}$$

The positive $\zeta$ can be obtained by a bisection search over the following interval:

$$0<\frac{1}{\zeta}<\frac{1}{\underset{i}{\min}\left\{\frac{\epsilon\nu_{i}}{(\frac{P_{t}}{N}+P_{d})\nu_{i}+1}+\frac{\eta\varphi_{i}}{\varphi_{i}P_{d}+1}\right\}},\;i=1,\cdots,N.$$

Once $\zeta$ is obtained, the optimal covariance matrix of the data signals in (41) can be obtained. Further, we can obtain the maximum relative MI, that is, the sum of the relative communication MI and the relative sensing MI, as given by

$$R_{\rm total}=\sum_{i=1}^{N}\left\{\frac{N}{F_{r}}\log_{2}\left(\frac{\lambda_{ii}(\frac{P_{t}}{N}+\hat{\varpi}_{ii})}{\left(\sigma_{n}^{2}\right)^{\frac{L}{N}}}+1\right)+\frac{L_{d}}{F_{c}}\log_{2}\left[\frac{(\sigma_{h}^{2}-\mathcal{C}_{e}^{l})\mu_{ii}\hat{\varpi}_{ii}}{\frac{P_{d}}{L_{d}}\mathcal{C}_{e}^{l}+\sigma_{n}^{2}}+1\right]\right\}.\tag{45}$$

In the case of $w_{r}=0$, the singular values $\hat{\varpi}_{ii},\,i=1,\cdots,N,$ of the optimal covariance matrix are consistent with (36) in Section V-B. In the case of $w_{r}=1$, similarly, $\hat{\varpi}_{ii},\,i=1,\cdots,N,$ coincide with (31), as described in Section V-A.
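The closed form (43) together with the power constraint (44) can be sketched as a one-dimensional search over $\zeta$. The code below is a minimal illustration under stated assumptions: the function names are hypothetical, the bracket for $\zeta$ is found by repeated halving and doubling rather than from the interval given above, the bisection is carried out on a logarithmic scale, and the values of $\nu_{i}$, $\varphi_{i}$, $\epsilon$ and $\eta$ are toy numbers.

```python
import numpy as np

def varpi_closed_form(zeta, nu, phi, eps, eta, P_t):
    """Evaluate (43) for every eigen-direction at a given multiplier zeta."""
    a = P_t / nu.size
    s = (eps + eta) / zeta - (a + 1.0 / nu + 1.0 / phi)
    d = (a + 1.0 / nu - 1.0 / phi) + (eta - eps) / zeta
    root = np.sqrt(d ** 2 + 4.0 * eps * eta / zeta ** 2)
    return np.maximum(0.5 * (s + root), 0.0)

def solve_joint(nu, phi, eps, eta, P_t, P_d, iters=200):
    """Find zeta such that the allocation from (43) meets the power constraint (44)."""
    nu, phi = np.asarray(nu, dtype=float), np.asarray(phi, dtype=float)
    total = lambda z: varpi_closed_form(z, nu, phi, eps, eta, P_t).sum()
    lo = hi = 1.0
    while total(lo) < P_d:      # smaller zeta allocates more power
        lo *= 0.5
    while total(hi) > P_d:      # larger zeta allocates less power
        hi *= 2.0
    for _ in range(iters):
        mid = np.sqrt(lo * hi)  # bisection on a logarithmic scale
        if total(mid) > P_d:
            lo = mid
        else:
            hi = mid
    zeta = np.sqrt(lo * hi)
    return varpi_closed_form(zeta, nu, phi, eps, eta, P_t), zeta

# Toy values (hypothetical, for illustration only)
nu = np.array([2.0, 1.0, 0.5, 0.1])
phi = np.array([0.3, 1.5, 0.8, 2.0])
eps, eta, P_t, P_d = 0.6, 0.9, 1.0, 3.0
varpi, zeta = solve_joint(nu, phi, eps, eta, P_t, P_d)
print(varpi, zeta)
```

Setting `eps = 0` (that is, $w_{r}=0$) reduces the allocation to a water-filling form over $\varphi_{i}$, in line with the remark above on the two limiting cases.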

VI Simulation Results

In this section, we conduct extensive simulations to numerically verify the effectiveness of the proposed methods. A system with $2$ nodes is considered, and each node is equipped with $8$ antennas.

We consider (correlated) MIMO Rayleigh fading (complex Gaussian) channels for both communications and sensing, and the channels between them are independent. Both channels remain unchanged during the period of transmission. Correlated channels are generated based on the Kronecker model, where the normalized correlation matrix has identical diagonal elements and random off-diagonal elements following uniform distributions between $0$ and a maximal correlation coefficient $\epsilon_{c}$. The values of $\epsilon_{c}$ are set as 0.1 and 0.8 for communication and sensing, respectively. We are particularly interested in the case where communication and sensing channels have the same mean path losses, i.e., $\sigma_{h}^{2}=\sigma_{g}^{2}=1$. This corresponds to the case where the mean sensing distance is approximately the square root of the communication distance.

In all the simulations, the noise is complex AWGN. Other simulation parameters are shown in Tab. I, unless stated otherwise. The value of SNR is $(P/L)/\sigma_{n}^{2}$. For any given SNR and $L$, we compute the value for $P$ with $\sigma_{n}^{2}=1$, and then decide the values for $P_{t}$ and $P_{d}$ according to $\kappa_{\rm op}$ in Theorem 1.
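The mapping from a target SNR to the power budget described above can be sketched as follows. The function names are hypothetical, and the split factor `kappa` is passed in as a plain number here; in the setup described above it would be the value of $\kappa_{\rm op}$ from Theorem 1 evaluated at the resulting $P$.

```python
def total_power(snr_db, L, sigma_n2=1.0):
    """Invert SNR = (P / L) / sigma_n^2 for the total power P (SNR given in dB)."""
    snr_lin = 10.0 ** (snr_db / 10.0)
    return snr_lin * L * sigma_n2

def split_power(P, kappa):
    """Split P into training power P_t = (1 - kappa) P and data power P_d = kappa P."""
    return (1.0 - kappa) * P, kappa * P

# Illustrative values: SNR = 10 dB, L = 128 symbols, and an assumed kappa = 0.75
P = total_power(10.0, 128)
P_t, P_d = split_power(P, 0.75)
print(P, P_t, P_d)
```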

TABLE I: Simulation parameters
Parameter | Value
Noise power $\sigma_{n}^{2}$ | 1
$\sigma_{h}^{2}=\sigma_{g}^{2}$ | 1
Number of antennas | 8
Number of training symbols $L_{t}$ (basic value) | 8
Total number of symbols $L$ (basic value) | 128

To provide results comparable to the communication rate, we introduce the sensing rate as the sensing MI per unit time, which can be viewed as the mutual information between the sensing return and the targets. For simplicity, we assume each symbol lasts 1 unit time. Hence, the sensing rate, like the communication rate, equals the ratio between the respective MI and the total number of transmitted symbols. Note that the term "sensing rate" is not widely used in the literature, because it implicitly assumes that the sensing channel does not change during the period of interest. Hence, the sensing rate is also related to how fast the sensing channel changes.

For convenience, abbreviations of the waveform design schemes proposed in this paper and other comparison schemes for the legends in figures are listed as follows, where all the schemes include optimal power allocation, except for “without power allocation”.

  • OPTC (or OPTC with CEE): the scheme which optimizes communication only in the presence of CEE at the receiver;

  • OPTC without CEE: the scheme which optimizes communication only in the absence of CEE at the receiver;

  • OPTS: the scheme which optimizes sensing only;

  • JCAS: the scheme which optimizes both communication and sensing;

  • Equal: the scheme in which the singular values for data symbol correlation matrix are allocated with equal values;

  • Random: the scheme in which the singular values for data symbol correlation matrix are allocated with random values;

  • “without power allocation”: the scheme in which waveform is optimized for communication only without power allocation between the data and training sequences.

For each result, Monte-Carlo simulations with 5000 independent trials are conducted and the average results are provided.

Figure 3: Communication rate vs. SNR, where $L=128$, $L_{t}=8$.

Figure 4: Sensing rate vs. SNR, where $L=128$, $L_{t}=8$.

Fig. 3 plots the communication rate versus SNR under correlated channels $\mathbf{H}$ with CEE for different waveform design schemes. The communication rate without CEE is also plotted for comparison. We can see that the communication rate with CEE is upper bounded by the rate without CEE, and the gap between them decreases as the SNR grows. The communication rate for OPTC is the maximum in the presence of CEE, as expected. The rate decreases towards that for OPTS as the weighting factor for sensing grows. We can also see that the rate for equal power allocation is the lowest in low SNR regimes, and its gap to the other schemes decreases as the SNR increases. In high SNR regimes, its rate approaches that of OPTC. We also plot the rate without power optimization between the training and data signals. In this case, the rate lies between those for OPTS and Equal in the low SNR regimes; its gap to OPTS decreases as the SNR grows, and it surpasses the rate for OPTS when the SNR is larger than $16$ dB. This is because the difference between these waveforms decreases gradually as the SNR increases.

In Fig. 4, the sensing performance is evaluated under correlated $\mathbf{G}$ for different waveform design schemes. We find that the sensing rate improves as the weighting factor for sensing increases, and approaches the result of OPTS. For OPTC with CEE, the sensing rate is the lowest of the three in the high SNR region, since the communication rate is maximized without taking the sensing rate into consideration. The sensing rates for the other schemes lie between those for OPTS and OPTC with CEE. For OPTC without CEE, the sensing rate is almost the same as, or close to, the rates for the other schemes, including OPTC with CEE, OPTC without power optimization, and JCAS, but gradually surpasses them. Since the correlation coefficient for $\mathbf{G}$ is larger than that for $\mathbf{H}$, the more power is allocated to the training symbols, the smaller the sensing rate is. Therefore, the rate for OPTC without CEE is larger than those for these schemes in the high SNR regime. The Equal scheme achieves a sensing rate that approaches that of OPTS as the SNR increases. The sensing rate for OPTC without power optimization is the lowest among all the schemes in the high SNR regime, and is larger only than that for Equal in the low SNR region. This shows that the optimal power allocation not only minimizes the CEE, but also improves the sensing capability.

Figure 5: Communication rate vs. the ratio of training symbols, where the total number of transmit symbols is $L=160$ and ${\rm SNR}=1$ dB.

Figure 6: Sensing rate vs. the ratio of training symbols, where the total number of transmit symbols is $L=160$ and ${\rm SNR}=1$ dB.

Fig. 5 plots the communication rate against the ratio of the training symbols to the transmit symbols for various waveform design schemes. Different ratios are obtained by changing the lengths of the training and data sequences while keeping the total number of symbols unchanged. We see that the communication rate decreases significantly as the ratio grows. This is because the number of data symbols decreases as the ratio of training symbols increases, which in turn reduces the communication rate. Moreover, from Fig. 5, we see that the communication rate of OPTC without power optimization between the training and data symbols decreases the fastest among all the schemes. In other words, power optimization is important for improving the communication rate.

Fig. 6 depicts the sensing rate against the ratio, where $\mathbf{G}$ remains correlated during the period of $L$ symbols. From the figure, we can see that the sensing rate changes little as the ratio increases for Equal and OPTC without CEE. This is because the orthogonal (training) symbols can achieve the same sensing rate as the (data) symbols with equal power allocation. For Equal, both the orthogonal training symbols and the data symbols with equal power allocation are used for sensing, and therefore changes in the ratio have little impact on the sensing rate. For OPTC without CEE, all power is used for communication and changes in the ratio do not affect the waveform design for communication, and therefore the sensing rate of OPTC without CEE remains unchanged. We also see that the curves of OPTC, OPTS, and JCAS decline slightly and approach the curve for Equal as the ratio increases. This is because the impact of the training symbols on the sensing rate becomes stronger as the ratio grows, causing the sensing rate to approach that for Equal (as the training symbols are orthogonal). However, for OPTC without power optimization, the sensing rate declines rapidly as the ratio grows. This is because the power allocated to the orthogonal training symbols increases linearly with the ratio. The more power is given to the orthogonal training symbols, the smaller the sensing rate is for the strongly correlated $\mathbf{G}$.

(a) Sensing rate & Communication rate vs. the number of transmit symbols $L$.
(b) Mutual information vs. $L$.
(c) Mutual information vs. $L$.
Figure 7: Mutual information and rates vs. the total number of transmitted symbols $L$, where $L_{t}=8$ and ${\rm SNR}=1$ dB.

In the subfigures of Fig. 7, both MI and rates for communication and sensing are plotted against the total number of the transmitted symbols $L$. The number of training symbols $L_{t}$ is fixed to 8, and only the number of data symbols varies. In Fig. 7(a), we can see the MI for both sensing and communication increases with $L$, and the increase rate for communication is much higher than that for sensing. This is because the MI for communication increases almost linearly with the number of data symbols transmitted, but the MI for sensing only increases logarithmically since the sensing channel $\mathbf{G}$ is assumed to remain unchanged and more data symbols only increase the SNR. When $L<16$, the MI for sensing is greater than that for communication. This is because the CEE has a strong impact on the MI for communication.
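The linear-versus-logarithmic scaling argument above can be illustrated with a deliberately simplified scalar model. The two functions below are hypothetical stand-ins, not the MI expressions derived in this paper: the communication MI is assumed to accumulate one fixed-SNR term per data symbol, while the sensing MI is assumed to see a static channel whose effective SNR grows with the number of symbols. The names `toy_comm_mi`, `toy_sens_mi`, and `SNR` are assumptions introduced only for this sketch.

```python
import math

# 1 dB expressed on a linear scale (assumed operating point for the sketch).
SNR = 10 ** (1 / 10)


def toy_comm_mi(L, L_t=8, snr=SNR):
    """Toy communication MI: one log2(1 + snr) term per data symbol,
    so it grows linearly in the number of data symbols L - L_t."""
    return (L - L_t) * math.log2(1.0 + snr)


def toy_sens_mi(L, snr=SNR):
    """Toy sensing MI: the channel is static, so extra symbols only
    raise the effective SNR and the MI grows logarithmically in L."""
    return math.log2(1.0 + L * snr)


for L in (16, 32, 64, 128):
    print(L, toy_comm_mi(L), toy_sens_mi(L))
```

In this toy model, doubling $L$ from 64 to 128 adds 64 per-symbol terms to the communication MI but less than one bit to the sensing MI, which mirrors the qualitative trend described for Fig. 7 without reproducing its values.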

Fig. 7(b) shows the rates for communication and sensing, corresponding to the MI for them shown in Fig. 7(a). We also find that the communication rate increases with $L$, while the sensing rate increases first and then decreases with $L$. This is due to the fact that, as the number of data symbols $L_{d}$ increases (since $L_{t}$ is a constant), the communication capacity, i.e., communication rate, increases. On the contrary, as for sensing, with the increase of $L$, the channel variation rate reduces, leading to a decrease in the sensing rate.

Fig. 7(c) shows the relative sensing and communication MI values, which are normalized to their respective optimal values. We can see that there is a large gap between the optimal and random waveform designs for communications. The optimal waveform design for sensing leads to better performance than the random waveform design scheme. We also see that the performance gap decreases as $L$ grows. This is because the signal overhead reduces as the number of data symbols increases. For sensing, the optimal waveform design can achieve a better performance than the orthogonal waveform that is employed in conventional MIMO radar [29], supposing the channel correlation is known. Both the optimal sensing and JCAS schemes can lead to better MI. We also find that the relative sensing MI for JCAS slightly decreases first and then converges to a constant value. This is because the ratio $L_{t}/L$ decreases rapidly and converges to zero as $L$ increases.

Figure 8: Total relative MI vs. $\omega_{r}$ for different waveform design methods with $\text{SNR}=1$ dB, and $\epsilon_{c}$ for communication and sensing are 0.8 and 0.1, respectively.

Figure 9: Total relative MI vs. the maximal correlation coefficient $\epsilon_{c}$ for communication channels, where $\text{SNR}=1$ dB, and $\epsilon_{c}$ for sensing is 0.3.

Fig. 8 plots the sum of the relative communication and sensing MI against the weighting factor for sensing under different waveform design schemes and correlation coefficients $\epsilon_{c}$. We can see that the total relative MI for JCAS is always the highest in the case of $\epsilon_{c}=0.1$ for communication and $\epsilon_{c}=0.8$ for sensing. The total relative MI for JCAS increases first and then decreases as $w_{r}$ grows, while the total relative MI for the other schemes does not change with $w_{r}$. The total relative MI for the equal power allocation waveform design is the lowest. As expected, the total relative MI values under JCAS with $w_{r}=0$ and $w_{r}=1$ are equal to those for OPTC and OPTS, respectively. In the case of $\epsilon_{c}=0.5$ for both communication and sensing, we find that the weighted sum of the communication and sensing MI is almost the same across the OPTC, OPTS, and JCAS schemes. In other words, a single waveform can be optimized to maximize both communication and sensing MI when the communication and sensing channels have the same correlation characteristics.

Fig. 9 plots the sum of the relative communication and sensing MI against the maximum correlation coefficient $\epsilon_{c}$ for sensing channel $\mathbf{G}$. We can see that the total relative MI decreases as the correlation coefficient grows, especially when the coefficient is large. We also find that the JCAS scheme is less affected by the channel correlation and outperforms the other schemes. On the contrary, the increase in the correlation coefficient has a significant impact on the total relative MI for the random and equal power allocation schemes, and the MI drops drastically as the correlation increases.

Figure 10: Trade-off curve between communication and radio sensing, where $\epsilon_{c}$ for communication and sensing are 0.8 and 0.3, respectively.

Fig. 10 demonstrates how the communication rate and sensing MI change with the weighting factor $\omega_{r}$, which increases from 0 to 1 along the direction of the arrow. We consider two cases where the mean path losses for sensing and communication are (i) the same, i.e., $\sigma_{h}^{2}=\sigma_{g}^{2}=1$, and (ii) different, i.e., $\sigma_{h}^{2}=0.7$ and $\sigma_{g}^{2}=1.3$, or $\sigma_{h}^{2}=0.4$ and $\sigma_{g}^{2}=1.6$. In both cases, we can see that the sensing MI improves as $w_{r}$ grows, and that both the sensing MI and the communication rate are enhanced as the SNR increases. We also see that the JCAS waveform scheme outperforms “Equal”. There is a gap between the trade-off curves for the two cases, which is due to the different path losses for communication and sensing. In addition, as the mean path loss for communication decreases, the variation range of the sensing MI increases. In other words, the stronger the path loss for communication is, or the weaker the path loss for sensing is, the greater the sensing MI is. In practice, the appropriate weighting value can be selected based on the trade-off curve to meet the requirements of both communication and radio sensing.

VII Conclusion

We presented the optimal waveform design methods based on MI for MIMO JCAS systems by considering a typical packet structure, including training and data symbols. We proposed an optimal power allocation scheme under MMSE estimators for MIMO communication channels. Among the three optimization strategies we studied, the design that maximizes the weighted sum of relative MI is shown to achieve the best overall performance for joint sensing and communication. The design is also less affected by varying channel correlation than the other two waveform design methods. The methodologies presented in this paper can be further extended to study waveform optimization in JCAS MIMO uplink and multiuser-MIMO systems.

Other important observations obtained in this paper are summarized as follows:

  • In most cases, the signal waveform cannot be optimized to maximize both the communication and sensing MI at the same time. However, the JCAS waveform design is nearly optimal for both in the low SNR regime where the noise has a dominant impact on the design.

  • The proposed optimal power allocation can efficiently improve the MI for communication and has an insignificant impact on the sensing MI. When there are more data symbols than training symbols, the sensing MI under power allocation is higher than the case without power allocation.

  • The ratio of mean path losses between communication and sensing can have a strong impact on the range of optimized MI values. A higher ratio can lead to a larger range. If the ratio is small enough, our waveform design can approach the optimal MI values for both communication and sensing.

Appendix A Proof of Theorem 1

Let the SVD of the spatial correlation matrix of $\tilde{\mathbf{H}}$ be

$$\varSigma_{\tilde{\mathbf{H}}}=\frac{1}{N}\mathbb{E}\{\tilde{\mathbf{H}}\tilde{\mathbf{H}}^{H}\}=\mathbf{U}_{\tilde{\mathbf{H}}}\mathbf{\Lambda}_{\tilde{\mathbf{H}}}\mathbf{U}_{\tilde{\mathbf{H}}}^{H}, \tag{46}$$

where $\mathbf{U}_{\tilde{\mathbf{H}}}$ is a unitary matrix and $\mathbf{\Lambda}_{\tilde{\mathbf{H}}}=\operatorname{diag}\left\{\mu_{11},\cdots,\mu_{ii},\cdots,\mu_{NN}\right\}$ is a diagonal matrix with $\mu_{ii}$ being the singular values. Based on (14) and the lower bound of CEE, we can get an upper bound for the MI between $\mathbf{X}_{d}$ and $\mathbf{Y}^{d}_{\rm com}$ as

\begin{align}
& I\left(\mathbf{X}_{d};\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right) \tag{47a}\\
&\leq L_{d}\log_{2}\left[\det\left(\frac{\left(\sigma_{h}^{2}-\mathcal{C}_{e}\right)\mathbf{X}_{d}\mathbf{U}_{\tilde{\mathbf{H}}}\mathbf{\Lambda}_{\tilde{\mathbf{H}}}\mathbf{U}_{\tilde{\mathbf{H}}}^{H}\mathbf{X}_{d}^{H}}{\frac{P_{d}}{L_{d}}\mathcal{C}_{e}+\sigma_{n}^{2}}+\mathbf{I}_{N}\right)\right] \tag{47b}\\
&= L_{d}\log_{2}\left[\det\left(\frac{\left(\sigma_{h}^{2}-\mathcal{C}_{e}\right)\mathbf{\Lambda}_{\tilde{\mathbf{H}}}\left(\mathbf{X}_{d}\mathbf{U}_{\tilde{\mathbf{H}}}\right)^{H}\mathbf{X}_{d}\mathbf{U}_{\tilde{\mathbf{H}}}}{\frac{P_{d}}{L_{d}}\mathcal{C}_{e}+\sigma_{n}^{2}}+\mathbf{I}_{N}\right)\right], \tag{47c}
\end{align}

where the equality in (47b) can be achieved when the lower bound CEE $\mathcal{C}_{e}$ is achieved; (47c) is based on Sylvester’s determinant theorem [26].

Let $\mathbf{\Xi}=\left(\mathbf{X}_{d}\mathbf{U}_{\tilde{\mathbf{H}}}\right)^{H}\mathbf{X}_{d}\mathbf{U}_{\tilde{\mathbf{H}}}=(\mathbf{Y}^{d}_{\rm com})^{H}\mathbf{Y}^{d}_{\rm com}$, and its $(i,j)$-th entry is $\xi_{ij}$. Based on Hadamard’s inequality for the determinant and trace of a positive semi-definite Hermitian matrix, we have $\det\left(\mathbf{\Xi}_{N\times N}\right)\leq\prod_{i=1}^{N}\xi_{ii}$. The upper bound of the MI between $\mathbf{X}_{d}$ and $\mathbf{Y}^{d}_{\rm com}$ can be obtained as

$$I\left(\mathbf{X}_{d};\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right)\leq\sum_{i=1}^{N}\underbrace{\left\{L_{d}\log_{2}\left[\frac{\left(\sigma_{h}^{2}-\mathcal{C}_{e}\right)\mu_{ii}\xi_{ii}}{\frac{P_{d}}{L_{d}}\mathcal{C}_{e}+\sigma_{n}^{2}}+1\right]\right\}}_{f\left(\xi_{ii}\right)}, \tag{48}$$

where the equality is achieved if and only if $\mathbf{\Xi}$ is a diagonal matrix.
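The two matrix facts used in this step, Sylvester's determinant identity and Hadamard's inequality, can be checked numerically on a small random example. The sketch below uses only the Python standard library. The helper names (`det`, `matmul`, `herm`, `eye_plus`), the dimensions, and the diagonal values `mu` are assumptions chosen for illustration: the data matrix is taken as $L_d\times N$ with complex Gaussian entries, and the scalar factor in front of $\mathbf{\Lambda}_{\tilde{\mathbf{H}}}$ is absorbed into `mu`.

```python
import random


def det(M):
    """Determinant via Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n, d = len(A), 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        if abs(A[p][k]) == 0:
            return 0.0
        if p != k:
            A[k], A[p] = A[p], A[k]
            d = -d  # a row swap flips the sign of the determinant
        d *= A[k][k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= f * A[k][c]
    return d


def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def herm(A):
    """Conjugate (Hermitian) transpose."""
    return [[x.conjugate() for x in col] for col in zip(*A)]


def eye_plus(M):
    """Return I + M for a square matrix M."""
    n = len(M)
    return [[M[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]


rng = random.Random(0)
L_d, N = 6, 3                      # assumed sizes for the illustration
X = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)] for _ in range(L_d)]
mu = [1.5, 0.7, 0.2]               # assumed non-negative diagonal of Lambda
Lam = [[mu[i] if i == j else 0.0 for j in range(N)] for i in range(N)]
Xi = matmul(herm(X), X)            # N x N Gram matrix, Hermitian PSD

# Sylvester: det(I + X Lam X^H) equals det(I + Lam X^H X).
lhs = det(eye_plus(matmul(matmul(X, Lam), herm(X))))
rhs = det(eye_plus(matmul(Lam, Xi)))

# Hadamard: det(I + Lam Xi) is at most the product of (1 + mu_i * xi_ii).
hadamard_bound = 1.0
for i in range(N):
    hadamard_bound *= 1.0 + mu[i] * Xi[i][i].real

print(abs(lhs - rhs), rhs.real, hadamard_bound)
```

The two determinants should agree up to rounding error, and the determinant should fall strictly below the product bound because a random $\mathbf{\Xi}$ is not diagonal, which is consistent with the equality condition stated for (48).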

It is easy to see that $f\left(\xi_{ii}\right)$ is a monotonically increasing concave function of $\xi_{ii}$. Based on Jensen’s inequality, we can obtain the expectation of $I\left(\mathbf{X}_{d};\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right)$, i.e., $\mathbb{E}\left[I\left(\mathbf{X}_{d};\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right)\right]$, as given in (49), where (49d) is obtained by substituting $\mathbb{E}\left[\xi_{ii}\right]=\frac{1}{N}\operatorname{Tr}\left(\varSigma_{\mathbf{X}_{d}}\right)=\frac{1}{N}\operatorname{Tr}\left(\mathbf{\Xi}\right)=L_{d}\sigma_{d}^{2}=\frac{\kappa P}{N}$ and $(1-\kappa)P=NL_{t}\sigma_{t}^{2}$ into (49c):

\begin{align}
\mathbb{E}\left[I\left(\mathbf{X}_{d};\mathbf{Y}^{d}_{\rm com}|\hat{\mathbf{H}}\right)\right] &\leq \mathbb{E}\left[\sum_{i=1}^{N}f\left(\xi_{ii}\right)\right] \tag{49a}\\
&= \sum_{i=1}^{N}\mathbb{E}\left[f\left(\xi_{ii}\right)\right]\leq\sum_{i=1}^{N}f\left(\mathbb{E}\left[\xi_{ii}\right]\right) \tag{49b}\\
&= L_{d}\sum_{i=1}^{N}\left\{\log_{2}\left[\frac{\left(\sigma_{h}^{2}-\mathcal{C}_{e}\right)\mathbb{E}\left[\xi_{ii}\right]}{\frac{P_{d}}{L_{d}}\mathcal{C}_{e}+\sigma_{n}^{2}}+1\right]\right\} \tag{49c}\\
&= L_{d}\sum_{i=1}^{N}\left\{\log_{2}\left[\frac{PL_{d}}{(L_{d}-N)N\sigma_{n}^{2}}\cdot\frac{\kappa\left(1-\kappa\right)}{-\kappa+\frac{L_{d}}{L_{d}-N}\left(1+\frac{N\sigma_{n}^{2}}{P\sigma_{h}^{2}}\right)}+1\right]\right\}. \tag{49d}
\end{align}
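The Jensen step in (49b) relies only on the concavity of $f$. A minimal numerical sketch is given below, assuming that all constants multiplying $\xi_{ii}$ inside the logarithm are lumped into a single positive placeholder `a`, and that $\xi_{ii}$ is drawn from a gamma distribution purely for illustration. Neither the value of `a` nor the distribution comes from this paper.

```python
import math
import random


def f(xi, a=2.0, L_d=120):
    """f(xi) = L_d * log2(a * xi + 1), with `a` a lumped positive constant."""
    return L_d * math.log2(a * xi + 1.0)


rng = random.Random(1)
# Assumed positive samples standing in for xi_ii (illustrative choice only).
samples = [rng.gammavariate(4.0, 0.25) for _ in range(10000)]

mean_of_f = sum(f(x) for x in samples) / len(samples)   # estimate of E[f(xi)]
f_of_mean = f(sum(samples) / len(samples))              # f(E[xi])

print(mean_of_f, f_of_mean)
```

Because $f$ is concave, the sample average of $f(\xi_{ii})$ cannot exceed $f$ evaluated at the sample average, which is the inequality used to move the expectation inside the logarithm.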

From (49d), we can see that different values of $\kappa$ can lead to different mean MI values for a given total energy $P$ and the number of antennas $N$ through the SNR, denoted by

$$\rho=\frac{L_{d}P}{(L_{d}-N)N\sigma_{n}^{2}}\cdot\frac{\kappa\left(1-\kappa\right)}{-\kappa+\frac{L_{d}}{L_{d}-N}\left(1+\frac{N\sigma_{n}^{2}}{P\sigma_{h}^{2}}\right)}. \tag{50}$$

Referring to the cases considered in [9], to maximize $\rho$ over $0\leq\kappa\leq 1$, we can separately consider the following three cases:

  1. $L_{d}=N$: The maximal $\rho$, denoted by $\rho_{\max}$, is obtained as

    $$\rho_{\max}=\frac{P^{2}\sigma_{h}^{4}}{4N\sigma_{n}^{2}(N\sigma_{n}^{2}+P\sigma_{h}^{2})},$$

    from which it follows that $\kappa_{\rm op}=\frac{1}{2}$.

  2. $L_{d}>N$: We rewrite $\rho$ as

    $$\rho=\frac{L_{d}P}{(L_{d}-N)N\sigma_{n}^{2}}\cdot\frac{\kappa\left(1-\kappa\right)}{-\kappa+\Gamma},$$

    where $\Gamma=\frac{L_{d}}{L_{d}-N}\left(1+\frac{N\sigma_{n}^{2}}{P\sigma_{h}^{2}}\right)>1$. The maximal SNR $\rho_{\max}$ can be obtained as

    $$\rho_{\max}=\frac{L_{d}P}{(L_{d}-N)N\sigma_{n}^{2}}\left(\sqrt{\Gamma}-\sqrt{\Gamma-1}\right)^{2},$$

    and it follows that $\kappa_{\rm op}=\Gamma-\sqrt{\Gamma(\Gamma-1)}$.

  3. $L_{d}<N$: We rewrite $\rho$ as

    $$\rho=\frac{L_{d}P}{(N-L_{d})N\sigma_{n}^{2}}\cdot\frac{\kappa\left(1-\kappa\right)}{\kappa-\Gamma},$$

    where $\Gamma=\frac{L_{d}}{L_{d}-N}\left(1+\frac{N\sigma_{n}^{2}}{P\sigma_{h}^{2}}\right)<0$. The maximal SNR $\rho_{\max}$ can be obtained as

    $$\rho_{\max}=\frac{L_{d}P}{(N-L_{d})N\sigma_{n}^{2}}\left(\sqrt{-\Gamma}-\sqrt{-\Gamma+1}\right)^{2},$$

    and it follows that $\kappa_{\rm op}=\Gamma+\sqrt{\Gamma(\Gamma-1)}$.

Therefore, we can obtain the lower bound of the CEE as presented in (23).
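The three-case solution for $\kappa_{\rm op}$ can be cross-checked against a brute-force search over $\kappa$. The sketch below is an illustration under stated assumptions rather than the simulation code used in this paper. The function names `rho`, `kappa_opt`, and `kappa_grid`, and the parameter values in the loop, are hypothetical. To avoid dividing by zero when $L_d=N$, `rho` evaluates (50) with the $(L_d-N)$ factors multiplied out, which is an algebraically equivalent form.

```python
import math


def rho(kappa, P, N, L_d, sigma_n2=1.0, sigma_h2=1.0):
    """Effective SNR of (50), with the (L_d - N) factors multiplied out so
    that the expression is also well defined when L_d == N."""
    c = 1.0 + N * sigma_n2 / (P * sigma_h2)
    denom = N * sigma_n2 * (L_d * c - (L_d - N) * kappa)
    return L_d * P * kappa * (1.0 - kappa) / denom


def kappa_opt(P, N, L_d, sigma_n2=1.0, sigma_h2=1.0):
    """Closed-form maximiser of rho over 0 <= kappa <= 1 (three cases)."""
    if L_d == N:
        return 0.5
    gamma = L_d / (L_d - N) * (1.0 + N * sigma_n2 / (P * sigma_h2))
    root = math.sqrt(gamma * (gamma - 1.0))
    # L_d > N gives gamma > 1 (minus root); L_d < N gives gamma < 0 (plus root).
    return gamma - root if L_d > N else gamma + root


def kappa_grid(P, N, L_d, steps=1000):
    """Brute-force maximiser on a uniform grid, used only as a cross-check."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda k: rho(k, P, N, L_d))


# Assumed example values: N = 4 antennas, total energy P = 10, unit variances.
for L_d in (2, 4, 120):  # covers L_d < N, L_d == N, and L_d > N
    print(L_d, kappa_opt(10.0, 4, L_d), kappa_grid(10.0, 4, L_d))
```

For each of the three regimes, the closed-form value and the grid maximiser should agree to within the grid spacing, which gives a quick sanity check on the sign chosen in front of the square root.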

References

  • [1] P. Kumari, M. E. Eltayeb, and R. W. Heath, “Sparsity-aware adaptive beamforming design for IEEE 802.11ad-based joint communication-radar,” in 2018 IEEE Radar Conference (RadarConf18), Apr. 2018, pp. 0923–0928.
  • [2] A. R. Chiriyath, B. Paul, and D. W. Bliss, “Radar-communications convergence: Coexistence, cooperation, and co-design,” IEEE Trans. on Cogn. Commun. Netw., vol. 3, no. 1, pp. 1–12, Mar. 2017.
  • [3] B. Paul, A. R. Chiriyath, and D. W. Bliss, “Survey of RF communications and sensing convergence research,” IEEE Access, vol. 5, pp. 252–270, Dec. 2017.
  • [4] M. L. Rahman, J. A. Zhang, X. Huang, Y. J. Guo, and R. W. Heath Jr, “Framework for a perceptive mobile network using joint communication and radar sensing,” IEEE Trans. Aerosp. Electron. Syst., 2019.
  • [5] R. C. Daniels, E. R. Yeh, and R. W. Heath, “Forward collision vehicular radar with IEEE 802.11: Feasibility demonstration through measurements,” IEEE Trans. Veh. Tech., vol. 67, no. 2, pp. 1404–1416, Feb 2018.
  • [6] J. A. Zhang, X. Huang, Y. J. Guo, J. Yuan, and R. W. Heath, “Multibeam for joint communication and radar sensing using steerable analog antenna arrays,” IEEE Trans. Veh. Tech., vol. 68, no. 1, pp. 671–685, 2019.
  • [7] M. R. Bell, “Information theory and radar waveform design,” IEEE Trans. Inf. Theory, vol. 39, no. 5, pp. 1578–1597, Sept. 1993.
  • [8] Y. Yang and R. S. Blum, “MIMO radar waveform design based on mutual information and minimum mean-square error estimation,” IEEE Trans. Aerosp. Electron. Syst., vol. 43, no. 1, pp. 330–343, Jan. 2007.
  • [9] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna wireless links?” IEEE Trans. Inf. Theory, vol. 49, no. 4, pp. 951–963, Apr. 2003.
  • [10] M. Biguesh and A. B. Gershman, “Training-based MIMO channel estimation: a study of estimator tradeoffs and optimal training signals,” IEEE Trans. Signal Process., vol. 54, no. 3, pp. 884–893, Mar. 2006.
  • [11] B. Tang, J. Tang, and Y. Peng, “MIMO radar waveform design in colored noise based on information theory,” IEEE Trans. Signal Process., vol. 58, no. 9, pp. 4684–4697, Sept. 2010.
  • [12] Z. Zhu, S. Kay, and R. S. Raghavan, “Information-theoretic optimal radar waveform design,” IEEE Signal Process. Lett., vol. 24, no. 3, pp. 274–278, Mar. 2017.
  • [13] Y. Chen et al., “Adaptive distributed MIMO radar waveform optimization based on mutual information,” IEEE Trans. Aerosp. Electron. Syst., vol. 49, no. 2, pp. 1374–1385, Apr. 2013.
  • [14] Y. Liu, H. Wang, and J. Wang, “Robust multiple-input multiple-output radar waveform design in the presence of clutter,” IET Radar, Sonar Navigation, vol. 10, no. 7, pp. 1249–1259, 2016.
  • [15] A. R. Chiriyath, B. Paul, and D. W. Bliss, “Joint radar-communications information bounds with clutter: The phase noise menace,” in 2016 IEEE Radar Conference (RadarConf), May 2016, pp. 1–6.
  • [16] A. R. Chiriyath, B. Paul, G. M. Jacyna, and D. W. Bliss, “Inner bounds on performance of radar and communications co-existence,” IEEE Trans. Signal Process., vol. 64, no. 2, pp. 464–474, Jan. 2016.
  • [17] F. Liu, C. Masouros, A. Li, and T. Ratnarajah, “Robust MIMO beamforming for cellular and radar coexistence,” IEEE Wireless Commun. Lett., vol. 6, no. 3, pp. 374–377, Jun. 2017.
  • [18] A. R. Chiriyath, S. Ragi, H. D. Mittelmann, and D. W. Bliss, “Novel radar waveform optimization for a cooperative radar-communications system,” IEEE Trans. Aerosp. Electron. Syst., vol. 55, no. 3, pp. 1160–1173, Jun. 2019.
  • [19] B. Paul, A. R. Chiriyath, and D. W. Bliss, “Joint communications and radar performance bounds under continuous waveform optimization: The waveform awakens,” in 2016 IEEE Radar Conference (RadarConf), May 2016, pp. 1–6.
  • [20] R. Xu, L. Peng, W. Zhao, and Z. Mi, “Radar mutual information and communication channel capacity of integrated radar-communication system using MIMO,” ICT Express, vol. 1, no. 3, pp. 102–105, 2015.
  • [21] Y. Liu, G. Liao, J. Xu, Z. Yang, and Y. Zhang, “Adaptive OFDM integrated radar and communications waveform design based on information theory,” IEEE Commun. Lett., vol. 21, no. 10, pp. 2174–2177, Oct. 2017.
  • [22] Y. Liu, G. Liao, R. S. Blum, and Z. Yang, “Robust OFDM integrated radar and communications waveform design,” arXiv preprint arXiv:1805.01511, 2018.
  • [23] G. Galati and G. Pavan, “Waveforms design for modern and MIMO radar,” in Eurocon 2013, Jul. 2013, pp. 508–513.
  • [24] C. Artigue and P. Loubaton, “On the precoder design of flat fading MIMO systems equipped with MMSE receivers: A large-system approach,” IEEE Trans. Inf. Theory, vol. 57, no. 7, pp. 4138–4155, Jul. 2011.
  • [25] J. Wang, O. Y. Wen, and S. Li, “Near-optimum pilot and data symbols power allocation for MIMO spatial multiplexing system with zero-forcing receiver,” IEEE Signal Process. Lett., vol. 16, no. 5, pp. 358–361, May 2009.
  • [26] G. T. Gilbert, “Positive definite matrices and Sylvester’s criterion,” The American Mathematical Monthly, vol. 98, no. 1, pp. 44–46, 1991.
  • [27] F. Perez-Cruz, M. R. D. Rodrigues, and S. Verdu, “MIMO Gaussian channels with arbitrary inputs: Optimal precoding and power allocation,” IEEE Trans. Inf. Theory, vol. 56, no. 3, pp. 1070–1084, Mar. 2010.
  • [28] L. Berriche, K. Abed-Meraim, and J. C. Belfiore, “Cramer-Rao bounds for MIMO channel estimation,” in 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, May 2004, pp. iv-397–iv-400.
  • [29] X. Song, S. Zhou, and P. Willett, “Reducing the waveform cross correlation of MIMO radar with space-time coding,” IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4213–4224, Aug. 2010.