
Robust Beamforming Design for Covert Communications

Shuai Ma, Member, IEEE, Yunqi Zhang, Hang Li, Songtao Lu, Member, IEEE, Naofal Al-Dhahir, Fellow, IEEE, Sha Zhang and Shiyin Li

S. Ma is with the School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China, and also with National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China. Y. Zhang is with the School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China. H. Li is with the Shenzhen Research Institute of Big Data, Shenzhen 518172, Guangdong, China. S. Lu is with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA. N. Al-Dhahir is with the Electrical and Computer Engineering Department, University of Texas at Dallas, Dallas, TX 75080 USA. S. Zhang is with the Shenzhen Institute of Radio Testing and Tech, Shenzhen 518000. S. Li is with the School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
Abstract

In this paper, we consider a common unicast beamforming network where Alice utilizes the communication to Carol as a cover and covertly transmits a message to Bob without being recognized by Willie. We investigate the beamformer design of Alice to maximize the covert rate to Bob when Alice has either perfect or imperfect knowledge about Willie’s channel state information (WCSI). For the perfect WCSI case, the problem is formulated under the perfect covert constraint, and we develop a covert beamformer by applying semidefinite relaxation and the bisection method. Then, to reduce the computational complexity, we further propose a zero-forcing beamformer design with a single iteration processing. For the case of the imperfect WCSI, the robust beamformer is developed based on a relaxation and restriction approach by utilizing the property of Kullback-Leibler divergence. Furthermore, we derive the optimal decision threshold of Willie, and analyze the false alarm and the missed detection probabilities in this case. Finally, the performance of the proposed beamformer designs is evaluated through numerical experiments.

Index Terms:
Covert communications, covert beamformer design, zero-forcing beamformer design, robust beamformer design.

I Introduction

Due to its broadcast nature, wireless communication is vulnerable to malicious attackers. By exploiting encryption and key exchange techniques, conventional security methods mainly focus on preventing the transmitted wireless signals from being decoded by unintended users [1, 2], but not on concealing them. For many wireless scenarios, such as law enforcement and military communications, the transmitted signals should not be detected, so that undercover missions can be carried out. Therefore, the paradigm of covert communications, also known as low probability of detection (LPD) communications, aims to hide the transmission status and protect the users' privacy.

In a typical covert communication scenario, the sender (Alice) wants to send information to the covert receiver (Bob) without being detected by the eavesdropper (Willie). Here, Willie may or may not be a legitimate receiver, and its purpose is to detect, based on its observations, whether the transmission from Alice to Bob takes place. Mathematically, the ultimate goal for Willie is to distinguish between the two hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$ by applying a specific decision rule, where $\mathcal{H}_0$ denotes the null hypothesis that Alice does not transmit a private data stream to Bob, and $\mathcal{H}_1$ denotes the alternative hypothesis that Alice transmits a private data stream to Bob [3]. In general, the a priori probabilities of hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$ are assumed to be equal, i.e., each equal to $1/2$. As such, the detection error probability of Willie is defined as in [3, 4, 5]

\begin{align}
\xi = \Pr\left(\mathcal{D}_1 \mid \mathcal{H}_0\right) + \Pr\left(\mathcal{D}_0 \mid \mathcal{H}_1\right), \tag{1}
\end{align}

where $\mathcal{D}_1$ indicates Willie's decision that Alice sends information to Bob, and $\mathcal{D}_0$ indicates the opposite decision. Covert communication is achieved for a given $\varepsilon \in [0,1]$ if the detection error probability $\xi$ is no less than $1-\varepsilon$, i.e., $\xi \geq 1-\varepsilon$. Here, $\varepsilon$ is a predetermined value that specifies the covert communication constraint.

Although practical covert communication has been studied for several decades through spread-spectrum technology [6], the information-theoretic limits of covert communication were only recently derived [5, 7, 8]. The achievability of the square root law (SRL) was established in [5]: in order to achieve covert communication over the additive white Gaussian noise (AWGN) channel, Alice can transmit no more than $\mathcal{O}(\sqrt{n})$ bits to Bob in $n$ channel uses. Moreover, the SRL results have been verified in discrete memoryless channels (DMCs) [8, 7], two-hop systems [9], multiple access channels [10] and broadcast channels [11]. In short, these results imply that the average number of covert bits per channel use asymptotically approaches zero despite the noiseless transmission, i.e., $\lim_{n\to\infty} \mathcal{O}(\sqrt{n})/n = 0$.

Fortunately, some works have revealed that Alice can achieve a positive covert rate under certain conditions, such as imperfect knowledge of noise [12, 13, 14]; imperfect channel statistics [15, 16]; unknown transmission time [17, 18, 19]; the presence of random jamming signals [20, 21]; finite blocklengths of transmissions [22, 23]; the existence of a relay [24, 25, 26]; and the injection of artificial noise (AN) [27, 28]. To be more specific, in [25], the authors proposed a power allocation strategy to maximize the secrecy rate under the covert requirements with multiple untrusted relays. Based on the proposed rate-control and power-control strategies, the authors in [24] verified the feasibility of covert transmission in amplify-and-forward one-way relay networks. With a finite number of channel uses, delay-intolerant covert communication was investigated in [22], which demonstrated that random transmit power can enhance covert communications. In addition, the effect of the finite blocklength (i.e., finite $n$) on covert communication was investigated in [28]. By exploiting a full-duplex (FD) receiver, covert communication was examined in [28] under fading channels, where the FD receiver generates artificial noise to confuse Willie. In [4], the optimality of Gaussian signalling was investigated by employing the Kullback-Leibler (KL) divergence as a covert metric. By formulating LPD communications as a quickest detection problem, the authors in [23] investigated covert throughput maximization problems with three different detection methods, i.e., the Shewhart, the cumulative sum (CUSUM), and the Shiryaev-Roberts (SR) tests. With the help of a friendly uninformed jammer, Alice can also communicate $\mathcal{O}(n)$ covert bits to Bob in $n$ channel uses [20, 21]. By producing artificial noise to inhibit Willie's detection, Alice can reliably and covertly transmit information to Bob [27].

Most existing works [5, 7, 8, 9, 10, 11, 20, 21, 24, 27, 22, 28, 4, 23, 29, 30] investigate covert transmission with perfect channel state information (CSI) of all users. However, to the best of our knowledge, covert communication for multiple-antenna beamforming [31] has rarely been studied, and in practical scenarios the perfect CSI of the warden is usually not available. In [31], the authors investigated power allocation to maximize the secrecy rate while satisfying the covert communication requirements. In [32], the three-dimensional (3D) beamformer and the jamming interference beamformer were iteratively optimized to maximize the covert rate. In this work, we show that using multiple antennas allows us to relax the perfect CSI assumption while still guaranteeing covert transmission.

In this paper, we consider a practical scenario where Alice uses the communication link with Carol as a cover, and aims to achieve covert communication with Bob against Willie. The most relevant work to this paper is [16]. In [16], a single-input-single-output (SISO) covert communication scenario was considered, and an exact expression for the optimal threshold of Warden’s detector was derived. The authors then analyzed the achievable rates with outage constraints under imperfect CSI. However, in our work, we focus on a multiple-input-single-output (MISO) covert network for both the perfect CSI and imperfect CSI cases.

Our main contributions are summarized as follows:

  • When Willie’s CSI (WCSI) is perfect at Alice, we study the joint beamformer design problem with the objective of maximizing the achievable rate of Bob, subject to the perfect covert transmission constraint, quality of service (QoS) of Carol, and the total transmit power constraints of Alice. The covert rate maximization problem is shown to be non-convex. Then, by applying the semidefinite relaxation (SDR) technique and the bisection method, we find the solution by solving a series of convex subproblems.

  • Furthermore, to reduce the computational complexity, we propose a low-complexity zero-forcing (ZF) beamformer design with a single iteration processing, which provides a promising tradeoff between complexity and performance. Such design problem is transformed into two decoupled subproblems. To be specific, the design of Willie’s ZF beamformer can be relaxed to a convex problem by SDR, and Bob’s ZF beamformer design can be reformulated as a convex second-order cone program (SOCP) problem.

  • When WCSI is imperfect at Alice, we consider the robust covert rate maximization problem under the QoS constraint of Carol, the covertness constraint, and the total power constraint. To handle this non-convex problem, a restriction and relaxation method is introduced, and a convex semidefinite program (SDP) is obtained by using the S-lemma and SDR. Given that the covert constraint is not perfect, we derive the optimal detection threshold of Willie and the corresponding detection error probability based on the robust beamformer vector. This result can be used as a theoretical benchmark to evaluate the covert performance of the beamformer designs. Our simulation results further reveal the tradeoff between Willie's detection performance and Bob's covert rate.

The rest of this paper is organized as follows. In Section II, we introduce the system model, assumptions and the notations used throughout the paper. In Section III, we discuss the covert beamformer and the ZF beamformer designs with perfect WCSI. In Section IV, we consider a robust beamforming design with imperfect WCSI. In Section V, we present numerical results to evaluate the proposed beamformers, and finally the paper is concluded in Section VI.

Notations: Boldfaced lowercase and uppercase letters represent vectors and matrices, respectively. $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ denote the real part and imaginary part of their argument, respectively. A complex-valued circularly symmetric Gaussian distribution with mean $\mu$ and variance $\sigma^2$ is denoted by $\mathcal{CN}(\mu, \sigma^2)$.

II System Model

Figure 1: Illustration of the covert communication scenario

We consider the scenario illustrated in Fig. 1, in which Alice (base station) transmits the data stream $x_{\mathrm{c}}$ to Carol (regular user) all the time, and occasionally transmits the private data stream $x_{\mathrm{b}}$ to Bob (covert user). For simplicity, let $\mathbb{E}\{|x_{\mathrm{c}}|^2\} = 1$ and $\mathbb{E}\{|x_{\mathrm{b}}|^2\} = 1$. At the same time, Willie (eavesdropper) passively observes the communication environment and tries to identify whether Alice is transmitting to Bob or not. As mentioned in the previous section, Alice may achieve covert communication by using the transmission to Carol as a cover. Suppose that Alice is equipped with $N$ antennas, while Carol, Bob and Willie each have a single antenna. (Footnote 1: Under this setup, Willie only needs to perform energy detection and does not have to know the beamforming vectors.) Let $\mathbf{h}_{\mathrm{b}} \in \mathbb{C}^N$, $\mathbf{h}_{\mathrm{c}} \in \mathbb{C}^N$ and $\mathbf{h}_{\mathrm{w}} \in \mathbb{C}^N$ denote the channel vectors from Alice to Bob, Carol and Willie, respectively. We assume that all channels are modeled as Rayleigh flat fading, i.e., $\mathbf{h}_{\mathrm{b}} \sim \mathcal{CN}(\mathbf{0}, \sigma_1^2 \mathbf{I})$, $\mathbf{h}_{\mathrm{w}} \sim \mathcal{CN}(\mathbf{0}, \sigma_2^2 \mathbf{I})$, and $\mathbf{h}_{\mathrm{c}} \sim \mathcal{CN}(\mathbf{0}, \sigma_3^2 \mathbf{I})$ [16], where $\sigma_1^2$, $\sigma_2^2$ and $\sigma_3^2$ denote the variances of channels $\mathbf{h}_{\mathrm{b}}$, $\mathbf{h}_{\mathrm{w}}$ and $\mathbf{h}_{\mathrm{c}}$, respectively.

Recall that the goal for Willie is to determine which hypothesis ($\mathcal{H}_0$ or $\mathcal{H}_1$) is true by applying a specific decision rule. For convenience, we use $\mathcal{D}_1$ ($\mathcal{D}_0$) to indicate Willie's decision that Alice does (does not) send information to Bob.

II-A Signal Model and Covert Constraints

From Willie’s perspective, Alice’s transmitted signal is given by

\begin{align}
\mathbf{x} = \begin{cases}
\mathbf{w}_{\mathrm{c},0}\, x_{\mathrm{c}}, & \mathcal{H}_0, \\
\mathbf{w}_{\mathrm{c},1}\, x_{\mathrm{c}} + \mathbf{w}_{\mathrm{b}}\, x_{\mathrm{b}}, & \mathcal{H}_1,
\end{cases} \tag{4}
\end{align}

where $\mathbf{w}_{\mathrm{c},0}$ and $\mathbf{w}_{\mathrm{c},1}$ denote the transmit beamformer vectors for $x_{\mathrm{c}}$ under hypothesis $\mathcal{H}_0$ and hypothesis $\mathcal{H}_1$, respectively, and $\mathbf{w}_{\mathrm{b}}$ denotes the transmit beamformer vector for $x_{\mathrm{b}}$. Let $P_{\mathrm{total}}$ denote the maximum transmit power of Alice. Therefore, the beamformer vectors satisfy $\|\mathbf{w}_{\mathrm{c},0}\|^2 \leq P_{\mathrm{total}}$ under $\mathcal{H}_0$ and $\|\mathbf{w}_{\mathrm{c},1}\|^2 + \|\mathbf{w}_{\mathrm{b}}\|^2 \leq P_{\mathrm{total}}$ under $\mathcal{H}_1$.

For Carol, the received signal is given by

\begin{align}
y_{\mathrm{c}} = \begin{cases}
\mathbf{h}_{\mathrm{c}}^H \mathbf{w}_{\mathrm{c},0}\, x_{\mathrm{c}} + z_{\mathrm{c}}, & \mathcal{H}_0, \\
\mathbf{h}_{\mathrm{c}}^H \left(\mathbf{w}_{\mathrm{c},1}\, x_{\mathrm{c}} + \mathbf{w}_{\mathrm{b}}\, x_{\mathrm{b}}\right) + z_{\mathrm{c}}, & \mathcal{H}_1,
\end{cases} \tag{7}
\end{align}

where $z_{\mathrm{c}} \sim \mathcal{CN}(0, \sigma_{\mathrm{c}}^2)$ is the received noise at Carol. (Footnote 2: Here, the inter-cell interference is modeled as white Gaussian noise.)

For Bob, the received signal is given by

\begin{align}
y_{\mathrm{b}} = \begin{cases}
\mathbf{h}_{\mathrm{b}}^H \mathbf{w}_{\mathrm{c},0}\, x_{\mathrm{c}} + z_{\mathrm{b}}, & \mathcal{H}_0, \\
\mathbf{h}_{\mathrm{b}}^H \left(\mathbf{w}_{\mathrm{c},1}\, x_{\mathrm{c}} + \mathbf{w}_{\mathrm{b}}\, x_{\mathrm{b}}\right) + z_{\mathrm{b}}, & \mathcal{H}_1,
\end{cases} \tag{10}
\end{align}

where $z_{\mathrm{b}} \sim \mathcal{CN}(0, \sigma_{\mathrm{b}}^2)$ is the received noise at Bob.

In addition, the signals received by Willie can be written as

\begin{align}
y_{\mathrm{w}} = \begin{cases}
\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{c},0}\, x_{\mathrm{c}} + z_{\mathrm{w}}, & \mathcal{H}_0, \\
\mathbf{h}_{\mathrm{w}}^H \left(\mathbf{w}_{\mathrm{c},1}\, x_{\mathrm{c}} + \mathbf{w}_{\mathrm{b}}\, x_{\mathrm{b}}\right) + z_{\mathrm{w}}, & \mathcal{H}_1,
\end{cases} \tag{13}
\end{align}

where $z_{\mathrm{w}} \sim \mathcal{CN}(0, \sigma_{\mathrm{w}}^2)$ is the received noise at Willie.

According to (7), the instantaneous rates at Carol under $\mathcal{H}_0$ and $\mathcal{H}_1$, denoted by $R_{\mathrm{c},0}(\mathbf{w}_{\mathrm{c},0})$ and $R_{\mathrm{c},1}(\mathbf{w}_{\mathrm{c},1}, \mathbf{w}_{\mathrm{b}})$, respectively, can be written as

\begin{align}
R_{\mathrm{c},0}(\mathbf{w}_{\mathrm{c},0}) &= \log_2\left(1 + \frac{|\mathbf{h}_{\mathrm{c}}^H \mathbf{w}_{\mathrm{c},0}|^2}{\sigma_{\mathrm{c}}^2}\right), \tag{14a} \\
R_{\mathrm{c},1}(\mathbf{w}_{\mathrm{c},1}, \mathbf{w}_{\mathrm{b}}) &= \log_2\left(1 + \frac{|\mathbf{h}_{\mathrm{c}}^H \mathbf{w}_{\mathrm{c},1}|^2}{|\mathbf{h}_{\mathrm{c}}^H \mathbf{w}_{\mathrm{b}}|^2 + \sigma_{\mathrm{c}}^2}\right). \tag{14b}
\end{align}

Similarly, based on (10), the instantaneous rate at Bob under hypothesis $\mathcal{H}_1$, denoted by $R_{\mathrm{b}}(\mathbf{w}_{\mathrm{c},1}, \mathbf{w}_{\mathrm{b}})$, is given by

\begin{align}
R_{\mathrm{b}}(\mathbf{w}_{\mathrm{c},1}, \mathbf{w}_{\mathrm{b}}) = \log_2\left(1 + \frac{|\mathbf{h}_{\mathrm{b}}^H \mathbf{w}_{\mathrm{b}}|^2}{|\mathbf{h}_{\mathrm{b}}^H \mathbf{w}_{\mathrm{c},1}|^2 + \sigma_{\mathrm{b}}^2}\right). \tag{15}
\end{align}
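As a quick numerical illustration of the rate expressions (14a), (14b) and (15), the following minimal sketch evaluates them with the Python standard library only. The function names and the two-antenna channel and beamformer values are assumptions chosen for illustration; they do not come from the paper.

```python
import math

def inner(h, w):
    """Return h^H w for two equal-length complex vectors."""
    return sum(hi.conjugate() * wi for hi, wi in zip(h, w))

def gain(h, w):
    """Return the effective channel gain |h^H w|^2."""
    return abs(inner(h, w)) ** 2

def rate_carol_h0(h_c, w_c0, noise_c):
    """Eq. (14a): Carol's rate when only x_c is transmitted."""
    return math.log2(1.0 + gain(h_c, w_c0) / noise_c)

def rate_carol_h1(h_c, w_c1, w_b, noise_c):
    """Eq. (14b): Bob's stream acts as interference at Carol."""
    return math.log2(1.0 + gain(h_c, w_c1) / (gain(h_c, w_b) + noise_c))

def rate_bob(h_b, w_c1, w_b, noise_b):
    """Eq. (15): Carol's stream acts as interference at Bob."""
    return math.log2(1.0 + gain(h_b, w_b) / (gain(h_b, w_c1) + noise_b))

# Illustrative values (assumed): N = 2 antennas, orthogonal channels.
h_b = [1 + 0j, 0j]
h_c = [0j, 1 + 0j]
w_b = [1 + 0j, 0j]    # aligned with h_b, orthogonal to h_c
w_c1 = [0j, 1 + 0j]   # aligned with h_c, orthogonal to h_b

print(rate_bob(h_b, w_c1, w_b, noise_b=1.0))       # log2(1 + 1/1) = 1.0
print(rate_carol_h1(h_c, w_c1, w_b, noise_c=1.0))  # log2(1 + 1/1) = 1.0
```

In this toy setting the two beams do not interfere with each other, so both rates reduce to $\log_2(1 + 1)$; a beam $\mathbf{w}_{\mathrm{b}}$ that leaks into $\mathbf{h}_{\mathrm{c}}$ would lower Carol's rate through the denominator of (14b).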

Since Willie needs to distinguish between the two hypotheses from its received signal $y_{\mathrm{w}}$, we further characterize the probability distribution of $y_{\mathrm{w}}$. Let $p_0(y_{\mathrm{w}})$ and $p_1(y_{\mathrm{w}})$ denote the likelihood functions of the received signals of Willie under $\mathcal{H}_0$ and $\mathcal{H}_1$, respectively. Based on (13), $p_0(y_{\mathrm{w}})$ and $p_1(y_{\mathrm{w}})$ are given as

\begin{align}
p_0(y_{\mathrm{w}}) &= \frac{1}{\pi \lambda_0} \exp\left(-\frac{|y_{\mathrm{w}}|^2}{\lambda_0}\right), \tag{16a} \\
p_1(y_{\mathrm{w}}) &= \frac{1}{\pi \lambda_1} \exp\left(-\frac{|y_{\mathrm{w}}|^2}{\lambda_1}\right), \tag{16b}
\end{align}

where $\lambda_0 \triangleq |\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{c},0}|^2 + \sigma_{\mathrm{w}}^2$ and $\lambda_1 \triangleq |\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{c},1}|^2 + |\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{b}}|^2 + \sigma_{\mathrm{w}}^2$.

Recall from the previous section that Willie wants to minimize the detection error probability $\xi$ in (1) by applying an optimal detector. To incorporate $\xi$ into our problem formulation, we next specify conditions on the likelihood functions such that covert communication can be achieved with the given $\varepsilon$. First, we let

\begin{align}
\xi = 1 - V_T(p_0, p_1), \tag{17}
\end{align}

where $V_T(p_0, p_1)$ is the total variation between $p_0(y_{\mathrm{w}})$ and $p_1(y_{\mathrm{w}})$. In general, computing $V_T(p_0, p_1)$ analytically is intractable. Thus, we adopt Pinsker's inequality [33], and obtain

\begin{align}
V_T(p_0, p_1) &\leq \sqrt{\frac{1}{2} D(p_0 \| p_1)}, \tag{18a} \\
V_T(p_0, p_1) &\leq \sqrt{\frac{1}{2} D(p_1 \| p_0)}, \tag{18b}
\end{align}

where $D(p_0 \| p_1)$ denotes the KL divergence from $p_0(y_{\mathrm{w}})$ to $p_1(y_{\mathrm{w}})$, and $D(p_1 \| p_0)$ is the KL divergence from $p_1(y_{\mathrm{w}})$ to $p_0(y_{\mathrm{w}})$. Furthermore, $D(p_0 \| p_1)$ and $D(p_1 \| p_0)$ are respectively given as

\begin{align}
D(p_0 \| p_1) &= \int_{-\infty}^{+\infty} p_0(y_{\mathrm{w}}) \ln \frac{p_0(y_{\mathrm{w}})}{p_1(y_{\mathrm{w}})}\, dy = \ln\frac{\lambda_1}{\lambda_0} + \frac{\lambda_0}{\lambda_1} - 1, \tag{19a} \\
D(p_1 \| p_0) &= \int_{-\infty}^{+\infty} p_1(y_{\mathrm{w}}) \ln \frac{p_1(y_{\mathrm{w}})}{p_0(y_{\mathrm{w}})}\, dy = \ln\frac{\lambda_0}{\lambda_1} + \frac{\lambda_1}{\lambda_0} - 1. \tag{19b}
\end{align}

Therefore, to achieve covert communication with the given $\varepsilon$, i.e., $\xi \geq 1-\varepsilon$, the KL divergences of the likelihood functions should satisfy one of the following constraints:

\begin{align}
D(p_0 \| p_1) &\leq 2\varepsilon^2, \tag{20a} \\
D(p_1 \| p_0) &\leq 2\varepsilon^2. \tag{20b}
\end{align}
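To make the chain from the received powers $\lambda_0$, $\lambda_1$ to the covertness test concrete, the sketch below evaluates the closed forms (19a) and (19b) and checks the sufficient conditions (20a) and (20b). The helper names (`willie_power`, `is_covert`) and the numbers are illustrative assumptions, not part of the paper.

```python
import math

def willie_power(h_w, beams, noise_w):
    """Return sigma_w^2 + sum_k |h_w^H w_k|^2, i.e. lambda_0 or lambda_1."""
    total = noise_w
    for w in beams:
        total += abs(sum(h.conjugate() * x for h, x in zip(h_w, w))) ** 2
    return total

def kl_p0_p1(lam0, lam1):
    """Closed form of D(p0 || p1) in Eq. (19a)."""
    return math.log(lam1 / lam0) + lam0 / lam1 - 1.0

def kl_p1_p0(lam0, lam1):
    """Closed form of D(p1 || p0) in Eq. (19b)."""
    return math.log(lam0 / lam1) + lam1 / lam0 - 1.0

def is_covert(lam0, lam1, eps):
    """Sufficient test from (20): at least one KL divergence is <= 2*eps^2."""
    bound = 2.0 * eps ** 2
    return kl_p0_p1(lam0, lam1) <= bound or kl_p1_p0(lam0, lam1) <= bound

# Assumed example: a small power increase under H1 passes the test,
# a doubling of the received power does not.
print(is_covert(2.0, 2.2, eps=0.1))  # True
print(is_covert(2.0, 4.0, eps=0.1))  # False
```

Because the test is derived from Pinsker's inequality, it is sufficient but not necessary: a pair $(\lambda_0, \lambda_1)$ that fails it may still satisfy $\xi \geq 1-\varepsilon$.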

II-B CSI Availability

In this subsection, we assume that Alice can accurately estimate the CSI of Bob and Carol. In most cases, such CSI can be learned at both the receiver side and the transmitter side by training and feedback. However, the WCSI may not always be accessible to Alice because of the potentially limited cooperation between Alice and Willie. As a result, we consider the following two scenarios. (Footnote 3: When Willie is totally passive, the covert communication scheme design may instead exploit the channel distribution information of Willie [34, 31, 32, 35].)

1) Scenario 1. Perfect WCSI: We first consider a scenario that often arises in practice, where Willie is a legitimate user and is only hostile to Bob. In this case, Alice knows the full CSI of the channel $\mathbf{h}_{\mathrm{w}}$, and uses it to help Bob hide from Willie [5, 22, 28].

2) Scenario 2. Imperfect WCSI: We consider a more practical scenario where Willie is a regular user with only limited cooperation with Alice. In this case, Alice has imperfect CSI knowledge due to the passive warden and channel estimation errors [16, 36]. Here, the imperfect WCSI is modeled as

\begin{align}
\mathbf{h}_{\mathrm{w}} = \hat{\mathbf{h}}_{\mathrm{w}} + \Delta\mathbf{h}_{\mathrm{w}}, \tag{21}
\end{align}

where $\hat{\mathbf{h}}_{\mathrm{w}}$ denotes the estimated CSI vector between Alice and Willie, and $\Delta\mathbf{h}_{\mathrm{w}}$ denotes the corresponding CSI error vector. Moreover, the CSI error vector $\Delta\mathbf{h}_{\mathrm{w}}$ is characterized by an ellipsoidal region, i.e.,

\begin{align}
\mathcal{E}_{\mathrm{w}} \triangleq \left\{ \Delta\mathbf{h}_{\mathrm{w}} \,\middle|\, \Delta\mathbf{h}_{\mathrm{w}}^H \mathbf{C}_{\mathrm{w}} \Delta\mathbf{h}_{\mathrm{w}} \leq v_{\mathrm{w}} \right\}, \tag{22}
\end{align}

where $\mathbf{C}_{\mathrm{w}} = \mathbf{C}_{\mathrm{w}}^H \succeq \mathbf{0}$ controls the axes of the ellipsoid, and $v_{\mathrm{w}} > 0$ determines the volume of the ellipsoid [37, 32].
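The bounded-error model in (21) and (22) can be illustrated with a short membership check. This is a minimal sketch under assumed values: the function names, the identity shaping matrix and the radius $v_{\mathrm{w}} = 0.01$ are hypothetical and serve only to show how a candidate error vector is tested against the ellipsoid.

```python
def quad_form(C, d):
    """Return the real value d^H C d for a Hermitian matrix C (list of rows)."""
    n = len(d)
    val = sum(d[i].conjugate() * C[i][j] * d[j]
              for i in range(n) for j in range(n))
    return val.real

def in_uncertainty_set(delta, C_w, v_w):
    """Eq. (22): delta lies in the ellipsoid if delta^H C_w delta <= v_w."""
    return quad_form(C_w, delta) <= v_w

def true_channel(h_hat, delta):
    """Eq. (21): h_w = h_hat + delta."""
    return [a + b for a, b in zip(h_hat, delta)]

# Assumed example with N = 2: C_w = I gives a spherical error region.
C_w = [[1 + 0j, 0j], [0j, 1 + 0j]]
v_w = 0.01
h_hat = [1 + 1j, 0.5 - 0.2j]

small = [0.05 + 0.05j, 0j]   # squared norm 0.005
large = [0.1 + 0.1j, 0j]     # squared norm 0.02
print(in_uncertainty_set(small, C_w, v_w))  # True
print(in_uncertainty_set(large, C_w, v_w))  # False
print(true_channel(h_hat, small))
```

A robust design must satisfy the covertness constraint for every error vector that passes this test, which is why the imperfect-WCSI problem is harder than the perfect-WCSI one.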

III Proposed Covert Transmission for Perfect WCSI

In this section, we consider the perfect WCSI scenario (Scenario 1) and maximize the covert rate to Bob by optimizing the beamformers at Alice. Specifically, we study a joint beamforming design problem with the objective of maximizing the achievable rate of Bob $R_{\mathrm{b}}$, subject to the perfect covert transmission constraint, the QoS of Carol, and the total transmit power constraint of Alice, which can be mathematically formulated as

\begin{align}
\max_{\mathbf{w}_{\mathrm{b}},\, \mathbf{w}_{\mathrm{c},1}} \quad & R_{\mathrm{b}}(\mathbf{w}_{\mathrm{c},1}, \mathbf{w}_{\mathrm{b}}) \tag{23a} \\
\mathrm{s.t.} \quad & R_{\mathrm{c},1}(\mathbf{w}_{\mathrm{c},1}, \mathbf{w}_{\mathrm{b}}) = R_{\mathrm{c},0}(\mathbf{w}_{\mathrm{c},0}), \tag{23b} \\
& D(p_0 \| p_1) = 0, \tag{23c} \\
& \|\mathbf{w}_{\mathrm{b}}\|^2 + \|\mathbf{w}_{\mathrm{c},1}\|^2 \leq P_{\mathrm{total}}. \tag{23d}
\end{align}

Note that in (23c), $p_0$ is a function of $\mathbf{w}_{\mathrm{c},0}$, and $p_1$ is a function of both $\mathbf{w}_{\mathrm{c},1}$ and $\mathbf{w}_{\mathrm{b}}$. Thus, $p_0$ and $p_1$ can be expressed as $p_0(\mathbf{w}_{\mathrm{c},0})$ and $p_1(\mathbf{w}_{\mathrm{c},1}, \mathbf{w}_{\mathrm{b}})$, respectively.

Notice that problem (23) is non-convex and difficult to solve optimally. Moreover, the constraints $D(p_0 \| p_1) = 0$ and $D(p_1 \| p_0) = 0$ are equivalent in the perfect covert transmission case.
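The equivalence of the two zero-divergence constraints can be verified directly from the closed forms (19a) and (19b). The short derivation below is a worked check, with the ratio $t$ introduced here only as shorthand:

```latex
% Let t = \lambda_1 / \lambda_0 > 0 (shorthand introduced for this check).
\begin{align*}
D(p_0 \| p_1) &= \ln t + \frac{1}{t} - 1 =: f(t), &
f'(t) &= \frac{1}{t} - \frac{1}{t^2} = \frac{t-1}{t^2}, \\
D(p_1 \| p_0) &= t - \ln t - 1 =: g(t), &
g'(t) &= 1 - \frac{1}{t} = \frac{t-1}{t}.
\end{align*}
% Both f and g decrease on (0,1), increase on (1,\infty), and vanish at t = 1.
% Hence each divergence is zero if and only if t = 1:
\begin{align*}
D(p_0 \| p_1) = 0
\;\Longleftrightarrow\;
D(p_1 \| p_0) = 0
\;\Longleftrightarrow\;
\lambda_1 = \lambda_0
\;\Longleftrightarrow\;
|\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{c},1}|^2
+ |\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{b}}|^2
= |\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{c},0}|^2 .
\end{align*}
```

In words, perfect covertness requires the total power that Willie receives to be identical under both hypotheses.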

To address the non-convex problem (23), we propose two beamformer design approaches, namely, the proposed covert beamformer design and the proposed ZF beamformer design.

III-A Proposed Covert Beamformer Design

To simplify the derivation, define $\tau_1 \triangleq |\mathbf{h}_{\mathrm{c}}^H \mathbf{w}_{\mathrm{c},0}|^2$ and $\tau_2 \triangleq |\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{c},0}|^2$, and introduce an auxiliary variable $r_{\mathrm{b}}$. Then, problem (23) can be reformulated in the following equivalent form:

\begin{align}
\max_{\mathbf{w}_{\mathrm{b}},\, \mathbf{w}_{\mathrm{c},1},\, r_{\mathrm{b}}} \quad & r_{\mathrm{b}} \tag{24a} \\
\mathrm{s.t.} \quad & \frac{|\mathbf{h}_{\mathrm{b}}^H \mathbf{w}_{\mathrm{b}}|^2}{|\mathbf{h}_{\mathrm{b}}^H \mathbf{w}_{\mathrm{c},1}|^2 + \sigma_{\mathrm{b}}^2} \geq r_{\mathrm{b}}, \tag{24b} \\
& \frac{|\mathbf{h}_{\mathrm{c}}^H \mathbf{w}_{\mathrm{c},1}|^2}{|\mathbf{h}_{\mathrm{c}}^H \mathbf{w}_{\mathrm{b}}|^2 + \sigma_{\mathrm{c}}^2} = \frac{\tau_1}{\sigma_{\mathrm{c}}^2}, \tag{24c} \\
& |\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{c},1}|^2 + |\mathbf{h}_{\mathrm{w}}^H \mathbf{w}_{\mathrm{b}}|^2 = \tau_2, \tag{24d} \\
& \|\mathbf{w}_{\mathrm{b}}\|^2 + \|\mathbf{w}_{\mathrm{c},1}\|^2 \leq P_{\mathrm{total}}. \tag{24e}
\end{align}

Next, we apply the SDR technique [38] to relax problem (24). To this end, by using the following conditions

\begin{align}
\mathbf{W}_{\mathrm{b}} = \mathbf{w}_{\mathrm{b}} \mathbf{w}_{\mathrm{b}}^H &\Leftrightarrow \mathbf{W}_{\mathrm{b}} \succeq \mathbf{0},\ \mathrm{rank}(\mathbf{W}_{\mathrm{b}}) = 1, \tag{25a} \\
\mathbf{W}_{\mathrm{c},1} = \mathbf{w}_{\mathrm{c},1} \mathbf{w}_{\mathrm{c},1}^H &\Leftrightarrow \mathbf{W}_{\mathrm{c},1} \succeq \mathbf{0},\ \mathrm{rank}(\mathbf{W}_{\mathrm{c},1}) = 1, \tag{25b}
\end{align}

and ignoring the rank-one constraints, we can obtain a relaxed version of problem (24) as

\begin{align}
\max_{\mathbf{W}_{\mathrm{b}},\, \mathbf{W}_{\mathrm{c},1},\, r_{\mathrm{b}}} \quad & r_{\mathrm{b}} \tag{26a} \\
\mathrm{s.t.} \quad & \mathrm{Tr}\left(\mathbf{h}_{\mathrm{b}}^H \mathbf{W}_{\mathrm{b}} \mathbf{h}_{\mathrm{b}}\right) \geq r_{\mathrm{b}} \left( \mathrm{Tr}\left(\mathbf{h}_{\mathrm{b}}^H \mathbf{W}_{\mathrm{c},1} \mathbf{h}_{\mathrm{b}}\right) + \sigma_{\mathrm{b}}^2 \right), \tag{26b} \\
& \sigma_{\mathrm{c}}^2\, \mathrm{Tr}\left(\mathbf{h}_{\mathrm{c}}^H \mathbf{W}_{\mathrm{c},1} \mathbf{h}_{\mathrm{c}}\right) = \tau_1 \mathrm{Tr}\left(\mathbf{h}_{\mathrm{c}}^H \mathbf{W}_{\mathrm{b}} \mathbf{h}_{\mathrm{c}}\right) + \tau_1 \sigma_{\mathrm{c}}^2, \tag{26c} \\
& \mathrm{Tr}\left(\mathbf{h}_{\mathrm{w}}^H \mathbf{W}_{\mathrm{c},1} \mathbf{h}_{\mathrm{w}}\right) + \mathrm{Tr}\left(\mathbf{h}_{\mathrm{w}}^H \mathbf{W}_{\mathrm{b}} \mathbf{h}_{\mathrm{w}}\right) = \tau_2, \tag{26d} \\
& \mathrm{Tr}\left(\mathbf{W}_{\mathrm{c},1}\right) + \mathrm{Tr}\left(\mathbf{W}_{\mathrm{b}}\right) \leq P_{\mathrm{total}}, \tag{26e} \\
& \mathbf{W}_{\mathrm{c},1} \succeq \mathbf{0},\ \mathbf{W}_{\mathrm{b}} \succeq \mathbf{0}. \tag{26f}
\end{align}

Note that for any fixed $r_{\mathrm{b}} \geq 0$, problem (26) is an SDP. Therefore, problem (26) is quasi-concave, and its optimal solution can be found by checking its feasibility under any given $r_{\mathrm{b}}$.

After that, it can be checked that the problem of maximizing (26b) is concave with respect to $r_{\mathrm{b}}$. To be more specific, let $\mathcal{W} = \{\mathbf{W}_{\mathrm{b}}, \mathbf{W}_{\mathrm{c},1} \mid \text{(26c)-(26f)}\}$, $\phi(\mathbf{W}_{\mathrm{b}}) := \mathrm{Tr}\left(\mathbf{h}_{\mathrm{b}}^H \mathbf{W}_{\mathrm{b}} \mathbf{h}_{\mathrm{b}}\right)$, and $\theta(\mathbf{W}_{\mathrm{c},1}) := \mathrm{Tr}\left(\mathbf{h}_{\mathrm{b}}^H \mathbf{W}_{\mathrm{c},1} \mathbf{h}_{\mathrm{b}}\right) + \sigma_{\mathrm{b}}^2$. Then, we have the following result.

Lemma 1: The function

\begin{align}
g(r_{\mathrm{b}}) = \max_{\mathbf{W}_{\mathrm{b}},\, \mathbf{W}_{\mathrm{c},1} \in \mathcal{W}} \quad & r_{\mathrm{b}}, \tag{27} \\
\mathrm{s.t.} \quad & \phi(\mathbf{W}_{\mathrm{b}}) \geq r_{\mathrm{b}}\, \theta(\mathbf{W}_{\mathrm{c},1}) \tag{28}
\end{align}

is concave for $r_{\mathrm{b}} \geq 0$.

Proof:

Please see Appendix A. ∎

Thus, we first transform problem (26) into a series of convex subproblems with a given $r_{\mathrm{b}} \geq 0$, which can be optimally solved by standard convex optimization solvers such as CVX. Next, we adopt a bisection search method to find the proposed covert beamformers $\mathbf{W}_{\mathrm{b}}$ and $\mathbf{W}_{\mathrm{c},1}$. The details of the bisection search method are summarized as Algorithm 1 in Table I, which outputs the optimal solutions $\mathbf{W}_{\mathrm{c},1}^*$ and $\mathbf{W}_{\mathrm{b}}^*$. The computational complexity of Algorithm 1 is $\mathcal{O}\left(\max\{4, 2N\}^4 \sqrt{2N} \log(1/\xi_1) \log(1/\zeta_1)\right)$, where $\xi_1 > 0$ is the pre-defined accuracy of problem (26) [38, 39, 40].

Algorithm 1 Proposed covert beamformers design method for problem (26)
1: Choose $\zeta_{1}>0$ (termination parameter), $r_{\rm b,l}$ and $r_{\rm b,u}$ such that $r_{\rm b}^{*}$ lies in $[r_{\rm b,l},r_{\rm b,u}]$;
2: Initialize $r_{\rm b,l}=0$, $r_{\rm b,u}=\hat{r}_{\rm b}$;
3: while $r_{\rm b,u}-r_{\rm b,l}\geq\zeta_{1}$ do
4:     set $r_{\rm b}=(r_{\rm b,l}+r_{\rm b,u})/2$;
5:     if problem (26) is feasible, obtain the solution $\mathbf{W}_{\rm b}$ and $\mathbf{W}_{{\rm c},1}$, and set $r_{\rm b,l}=r_{\rm b}$;
6:     else, set $r_{\rm b,u}=r_{\rm b}$;
7: end while
8: Output $\mathbf{W}_{{\rm c},1}^{*}$, $\mathbf{W}_{\rm b}^{*}$;
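The bisection loop of Algorithm 1 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function `bisection_search` and the feasibility oracle `is_feasible` are hypothetical names, and the toy oracle below merely stands in for solving the convex subproblem of (26) with a fixed $r_{\rm b}$ (which in practice would be delegated to an SDP solver).

```python
from typing import Callable, Optional, Tuple, TypeVar

S = TypeVar("S")  # type of the solution object returned by the oracle


def bisection_search(
    is_feasible: Callable[[float], Optional[S]],
    r_low: float,
    r_high: float,
    tol: float,
) -> Tuple[float, Optional[S]]:
    """Bisection over the target r_b, mirroring the loop of Algorithm 1.

    `is_feasible(r)` is a hypothetical oracle: it returns a solution object
    when the fixed-r subproblem is feasible, and None otherwise.
    """
    best: Optional[S] = None
    while r_high - r_low >= tol:
        r_mid = 0.5 * (r_low + r_high)
        solution = is_feasible(r_mid)
        if solution is not None:
            r_low, best = r_mid, solution  # feasible: raise the lower bound
        else:
            r_high = r_mid                 # infeasible: lower the upper bound
    return r_low, best


def toy_oracle(r: float, power: float = 10.0, noise: float = 1.0) -> Optional[float]:
    """Toy stand-in (assumption): feasible iff power >= r * noise."""
    return power if power >= r * noise else None


r_star, sol = bisection_search(toy_oracle, 0.0, 100.0, 1e-6)
```

In this toy setting the search converges to the largest feasible target, i.e., the ratio of the power budget to the noise level; with a real SDP oracle the returned solution object would carry the matrices $\mathbf{W}_{\rm b}$ and $\mathbf{W}_{{\rm c},1}$ from the last feasible iteration.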

Finally, we can reconstruct the beamformers $\mathbf{w}_{{\rm c},1}$ and $\mathbf{w}_{\rm b}$ from the solutions given by Algorithm 1. Note that, due to the relaxation in SDR, the optimal solutions $\mathbf{W}_{{\rm c},1}^{*}$ and $\mathbf{W}_{\rm b}^{*}$ may not be rank-one, and hence may not be optimal for problem (23) or, equivalently, (24). In particular, if ${\rm rank}(\mathbf{W}_{{\rm c},1}^{*})=1$ and ${\rm rank}(\mathbf{W}_{\rm b}^{*})=1$, then $\mathbf{W}_{{\rm c},1}^{*}$ and $\mathbf{W}_{\rm b}^{*}$ are also the optimal solutions of problem (23), and the optimal beamformers $\mathbf{w}_{{\rm c},1}$ and $\mathbf{w}_{\rm b}$ can be obtained using the singular value decomposition (SVD), i.e., $\mathbf{W}_{{\rm c},1}^{*}=\mathbf{w}_{{\rm c},1}\mathbf{w}_{{\rm c},1}^{H}$ and $\mathbf{W}_{\rm b}^{*}=\mathbf{w}_{\rm b}\mathbf{w}_{\rm b}^{H}$. However, if ${\rm rank}(\mathbf{W}_{{\rm c},1}^{*})>1$ or ${\rm rank}(\mathbf{W}_{\rm b}^{*})>1$, we can adopt the Gaussian randomization procedure [38] to produce a high-quality rank-one solution to problem (23).

It is worth mentioning that the above SDR based beamformers design approach requires solving a series of feasibility subproblems. Therefore, this approach leads to high computational complexity, which motivates us to further develop an alternative approach with less intensive computational complexity.

III-B Proposed Zero-Forcing Beamformers Design

In this subsection, we propose a ZF beamformers design with iterative processing, which is able to achieve a desirable tradeoff between complexity and performance. In particular, the interference signals $\mathbf{h}_{\rm w}^{H}\mathbf{w}_{\rm b}s_{\rm b}$ and $\mathbf{h}_{\rm c}^{H}\mathbf{w}_{\rm b}s_{\rm b}$ are eliminated by designing $\mathbf{w}_{\rm b}$ such that $\mathbf{h}_{\rm w}^{H}\mathbf{w}_{\rm b}=0$ and $\mathbf{h}_{\rm c}^{H}\mathbf{w}_{\rm b}=0$. Meanwhile, the interference signal $\mathbf{h}_{\rm b}^{H}\mathbf{w}_{{\rm c},1}s_{{\rm c},1}$ can be removed by designing $\mathbf{w}_{{\rm c},1}$ such that $\mathbf{h}_{\rm b}^{H}\mathbf{w}_{{\rm c},1}=0$. Note that $\mathbf{w}_{\rm b}$ has to be orthogonal to $\mathbf{h}_{\rm w}$ and $\mathbf{h}_{\rm c}$, while $\mathbf{w}_{{\rm c},1}$ has to be orthogonal to $\mathbf{h}_{\rm b}$ and to have non-zero projections on $\mathbf{h}_{\rm w}$ and $\mathbf{h}_{\rm c}$. Such beamformers exist with probability one, since the probability that $\mathbf{h}_{\rm b}$ falls in the space spanned by $\mathbf{h}_{\rm w}$ and $\mathbf{h}_{\rm c}$ is zero, provided that the number of antennas at Alice is no less than three [41, 42].

Mathematically, applying the ZF beamformers design principle, problem (24) can be reformulated as

\begin{align}
\max_{\mathbf{w}_{\rm b},\mathbf{w}_{{\rm c},1}}\quad & \left|\mathbf{h}_{\rm b}^{H}\mathbf{w}_{\rm b}\right|^{2} \tag{29a}\\
{\rm s.t.}\quad & \mathbf{h}_{\rm w}^{H}\mathbf{w}_{\rm b}=0, \tag{29b}\\
& \mathbf{h}_{\rm c}^{H}\mathbf{w}_{\rm b}=0, \tag{29c}\\
& \mathbf{h}_{\rm b}^{H}\mathbf{w}_{{\rm c},1}=0, \tag{29d}\\
& \left|\mathbf{h}_{\rm c}^{H}\mathbf{w}_{{\rm c},1}\right|^{2}=\tau_{1}, \tag{29e}\\
& \left|\mathbf{h}_{\rm w}^{H}\mathbf{w}_{{\rm c},1}\right|^{2}=\tau_{2}, \tag{29f}\\
& \left\|\mathbf{w}_{\rm b}\right\|^{2}+\left\|\mathbf{w}_{{\rm c},1}\right\|^{2}\leq P_{\rm total}. \tag{29g}
\end{align}

To address the joint ZF beamformers design problem (29), we first optimize the beamformer $\mathbf{w}_{{\rm c},1}$ by minimizing the transmission power $\|\mathbf{w}_{{\rm c},1}\|^{2}$ under constraints (29d), (29e) and (29f). This is because the objective function (29a) does not depend on $\mathbf{w}_{{\rm c},1}$ but increases with the power of beamformer $\mathbf{w}_{\rm b}$, while the total transmission power constraint (29g) couples $\mathbf{w}_{\rm b}$ and $\mathbf{w}_{{\rm c},1}$. Hence, in order to maximize the objective function (29a), the beamformer $\mathbf{w}_{{\rm c},1}$ should be designed with the minimum transmission power, and the ZF beamformer $\mathbf{w}_{{\rm c},1}$ design problem can be formulated as

\begin{align}
\min_{\mathbf{w}_{{\rm c},1}}\quad & \left\|\mathbf{w}_{{\rm c},1}\right\|^{2} \tag{30}\\
{\rm s.t.}\quad & \text{(29d), (29e), (29f)},
\end{align}

which is also non-convex.

To handle the non-convexity issue, we relax problem (30) to a convex form by applying SDR as well, which is similar to the approach we followed in the previous section. Specifically, by relaxing $\mathbf{W}_{{\rm c},1}=\mathbf{w}_{{\rm c},1}\mathbf{w}_{{\rm c},1}^{H}$ to $\mathbf{W}_{{\rm c},1}\succeq\mathbf{0}$, problem (30) can be reformulated as

\begin{align}
\min_{\mathbf{W}_{{\rm c},1}}\quad & {\rm Tr}\left(\mathbf{W}_{{\rm c},1}\right) \tag{31a}\\
{\rm s.t.}\quad & {\rm Tr}\left(\mathbf{W}_{{\rm c},1}\mathbf{h}_{\rm b}\mathbf{h}_{\rm b}^{H}\right)=0, \tag{31b}\\
& {\rm Tr}\left(\mathbf{W}_{{\rm c},1}\mathbf{h}_{\rm c}\mathbf{h}_{\rm c}^{H}\right)=\tau_{1}, \tag{31c}\\
& {\rm Tr}\left(\mathbf{W}_{{\rm c},1}\mathbf{h}_{\rm w}\mathbf{h}_{\rm w}^{H}\right)=\tau_{2}, \tag{31d}\\
& \mathbf{W}_{{\rm c},1}\succeq\mathbf{0}, \tag{31e}
\end{align}

which is a convex SDP.

Let $\mathbf{W}_{{\rm c},1}^{\rm opt}$ denote the optimal solution of problem (31). Due to the relaxation, the rank of $\mathbf{W}_{{\rm c},1}^{\rm opt}$ may not be equal to one. If ${\rm rank}(\mathbf{W}_{{\rm c},1}^{\rm opt})=1$, then $\mathbf{W}_{{\rm c},1}^{\rm opt}$ is also optimal for problem (30), and the optimal beamformer $\mathbf{w}_{{\rm c},1}$ can be obtained by SVD, i.e., $\mathbf{W}_{{\rm c},1}^{\rm opt}=\mathbf{w}_{{\rm c},1}\mathbf{w}_{{\rm c},1}^{H}$. Otherwise, if ${\rm rank}(\mathbf{W}_{{\rm c},1}^{\rm opt})>1$, we can adopt the Gaussian randomization procedure [38] to produce a high-quality rank-one solution to problem (30).

Next, we consider the design of $\mathbf{w}_{\rm b}$. Let $\mathbf{w}_{{\rm c},1}^{\rm opt}$ denote the beamformer obtained from problem (31), and let $P_{c}=\|\mathbf{w}_{{\rm c},1}^{\rm opt}\|^{2}$ denote its transmission power. With the notations just defined, problem (29) can be formulated as

\begin{align}
\max_{\mathbf{w}_{\rm b}}\quad & \left|\mathbf{h}_{\rm b}^{H}\mathbf{w}_{\rm b}\right|^{2} \tag{32a}\\
{\rm s.t.}\quad & \left\|\mathbf{w}_{\rm b}\right\|^{2}+P_{c}\leq P_{\rm total}, \tag{32b}\\
& \text{(29b), (29c)},
\end{align}

which is equivalent to

\begin{align}
\max_{\mathbf{w}_{\rm b}}\quad & {\rm Re}\left\{\mathbf{h}_{\rm b}^{H}\mathbf{w}_{\rm b}\right\} \tag{33a}\\
{\rm s.t.}\quad & {\rm Im}\left\{\mathbf{h}_{\rm b}^{H}\mathbf{w}_{\rm b}\right\}=0, \tag{33b}\\
& \text{(29b), (29c), (32b)}.
\end{align}

Problem (33) is an SOCP that can be optimally solved by standard convex optimization solvers such as CVX [39]. Therefore, the ZF transmit beamformers of problem (29) are finally obtained.

Furthermore, we analyze the multiplexing gain of the covert communication system based on the ZF beamformer design [42]. Specifically, based on the definitions $\mathbf{H}_{\rm w,c}\triangleq\left[\mathbf{h}_{\rm w},\mathbf{h}_{\rm c}\right]$ and $\Pi_{\rm w,c}^{\bot}\triangleq\mathbf{I}-\frac{\mathbf{H}_{\rm w,c}\mathbf{H}_{\rm w,c}^{H}}{\left\|\mathbf{H}_{\rm w,c}\right\|^{2}}$, the ZF beamformer $\mathbf{w}_{\rm b}$ of problem (29c) is given as

\begin{align}
\mathbf{w}_{\rm b}=\alpha\frac{\Pi_{\rm w,c}^{\bot}\mathbf{h}_{\rm b}}{\left\|\Pi_{\rm w,c}^{\bot}\mathbf{h}_{\rm b}\right\|}, \tag{34}
\end{align}

where $\alpha\geq 0$ is a non-negative real-valued scalar. Then, by substituting $\mathbf{w}_{\rm b}$ into problem (29c), the optimal ZF beamformer of Bob is given by

\begin{align}
\mathbf{w}_{\rm b}=\sqrt{P_{\rm total}-P_{c}}\,\frac{\Pi_{\rm w,c}^{\bot}\mathbf{h}_{\rm b}}{\left\|\Pi_{\rm w,c}^{\bot}\mathbf{h}_{\rm b}\right\|}. \tag{35}
\end{align}

Thus, the multiplexing gain of the covert communication system is given as

\begin{align}
&\lim_{P_{\rm total}\to\infty}\frac{R_{b}}{\log_{2}{\rm SNR}} \tag{36a}\\
&=\lim_{\alpha\to\infty}\frac{\log_{2}\left(\frac{P_{\rm total}-P_{c}}{\sigma_{\rm b}^{2}}\right)+\log_{2}\left|\frac{\mathbf{h}_{\rm b}^{H}\Pi_{\rm w,c}^{\bot}\mathbf{h}_{\rm b}}{\left\|\Pi_{\rm w,c}^{\bot}\mathbf{h}_{\rm b}\right\|}\right|^{2}}{\log_{2}\frac{P_{\rm total}-P_{c}}{\sigma_{\rm b}^{2}}} \tag{36b}\\
&=1, \tag{36c}
\end{align}

where ${\rm SNR}\triangleq\frac{\left\|\mathbf{w}_{\rm b}\right\|^{2}}{\sigma_{\rm b}^{2}}$.
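The closed-form ZF beamformer in (34) and (35) can be illustrated numerically with the sketch below. It rests on stated assumptions: the projection onto the orthogonal complement of ${\rm span}\{\mathbf{h}_{\rm w},\mathbf{h}_{\rm c}\}$ is evaluated with a Gram-Schmidt orthonormalization (the standard orthogonal projector), $\mathbf{h}_{\rm w}$ and $\mathbf{h}_{\rm c}$ are linearly independent, and $\mathbf{h}_{\rm b}$ does not lie in their span. The function names and the example channel vectors are hypothetical.

```python
import math
from typing import List

Vector = List[complex]


def inner(a: Vector, b: Vector) -> complex:
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def norm(a: Vector) -> float:
    """Euclidean norm of a complex vector."""
    return math.sqrt(inner(a, a).real)


def project_out(v: Vector, basis: List[Vector]) -> Vector:
    """Remove from v its components along an orthonormal basis."""
    out = list(v)
    for q in basis:
        coeff = inner(q, out)
        out = [x - coeff * y for x, y in zip(out, q)]
    return out


def zf_beamformer(h_b: Vector, h_w: Vector, h_c: Vector,
                  p_total: float, p_c: float) -> Vector:
    """Scale the part of h_b orthogonal to h_w and h_c to power p_total - p_c."""
    q1 = [x / norm(h_w) for x in h_w]      # first orthonormal direction
    r = project_out(h_c, [q1])
    q2 = [x / norm(r) for x in r]          # second orthonormal direction
    d = project_out(h_b, [q1, q2])         # component of h_b outside span{h_w, h_c}
    scale = math.sqrt(p_total - p_c) / norm(d)
    return [scale * x for x in d]


# Hypothetical channels with N = 3 antennas (illustration only).
h_w = [1 + 0j, 0j, 0j]
h_c = [1j, 1 + 0j, 0j]
h_b = [1 + 0j, 1 + 0j, 2 + 0j]
w_b = zf_beamformer(h_b, h_w, h_c, p_total=10.0, p_c=1.0)
```

By construction, the resulting `w_b` satisfies the ZF constraints (29b) and (29c) up to rounding error and uses all of the remaining power $P_{\rm total}-P_{c}$.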

IV Proposed Robust Covert Transmission for Imperfect WCSI

In the previous section, we considered the case with perfect WCSI. In practice, it is common that the obtained CSI is corrupted by certain estimation errors [8, 7]. Hence, we further propose a robust beamforming design for the optimization problem (23) under the imperfect WCSI scenario. In this scenario, the perfect covert transmission, i.e., $D(p_{0}\|p_{1})=0$, is difficult to achieve. Therefore, we adopt $D(p_{0}\|p_{1})\leq 2\varepsilon^{2}$ and $D(p_{1}\|p_{0})\leq 2\varepsilon^{2}$ as covertness constraints [5, 8, 7, 4], according to (20). Moreover, based on the developed robust beamformer, we further study the best situation for Willie where the desired detection error probability of Willie can be achieved.

IV-A Case of $D(p_{0}\|p_{1})\leq 2\varepsilon^{2}$

With imperfect WCSI, we aim to maximize $R_{b}$ via the joint design of the beamformers $\mathbf{w}_{{\rm c},1}$ and $\mathbf{w}_{\rm b}$ under the QoS of Carol, the covertness constraint and the total power constraint. Mathematically, the robust covert rate maximization problem is formulated as

\begin{align}
\max_{\mathbf{w}_{\rm b},\mathbf{w}_{{\rm c},1}}\quad & R_{\rm b}\left(\mathbf{w}_{{\rm c},1},\mathbf{w}_{\rm b}\right) \tag{37a}\\
{\rm s.t.}\quad & R_{{\rm c},1}\left(\mathbf{w}_{{\rm c},1},\mathbf{w}_{\rm b}\right)=R_{{\rm c},0}\left(\mathbf{w}_{{\rm c},0}\right), \tag{37b}\\
& D\left(p_{0}\|p_{1}\right)\leq 2\varepsilon^{2}, \tag{37c}\\
& \left\|\mathbf{w}_{\rm b}\right\|^{2}+\left\|\mathbf{w}_{{\rm c},1}\right\|^{2}\leq P_{\rm total}, \tag{37d}\\
& \mathbf{h}_{\rm w}=\hat{\mathbf{h}}_{\rm w}+\Delta\mathbf{h}_{\rm w},\ \Delta\mathbf{h}_{\rm w}\in\mathcal{E}_{\rm w}. \tag{37e}
\end{align}

Recall from Section II-B that the CSIs of Bob and Carol, i.e., $\mathbf{h}_{\rm b}$ and $\mathbf{h}_{\rm c}$, are perfectly known.

It is clear that problem (37) is not convex, and thereby it is difficult to obtain the optimal solution directly. To deal with this issue, we first reformulate the covertness constraint (37c) by exploiting the property of the function $f(x)=\ln x+\frac{1}{x}-1$ for $x>0$. More specifically, the covertness constraint $D(p_{0}\|p_{1})=\ln\frac{\lambda_{1}}{\lambda_{0}}+\frac{\lambda_{0}}{\lambda_{1}}-1\leq 2\varepsilon^{2}$ can be equivalently transformed as

\begin{align}
\bar{a}\leq\frac{\lambda_{1}}{\lambda_{0}}\leq\bar{b}, \tag{38}
\end{align}

where $\bar{a}$ and $\bar{b}$ are the two roots of the equation $\ln\frac{\lambda_{1}}{\lambda_{0}}+\frac{\lambda_{0}}{\lambda_{1}}-1=2\varepsilon^{2}$. Therefore, constraint (37c) can be equivalently reformulated as

\begin{align}
\bar{a}\leq\frac{\left|\mathbf{h}_{\rm w}^{H}\mathbf{w}_{{\rm c},1}\right|^{2}+\left|\mathbf{h}_{\rm w}^{H}\mathbf{w}_{\rm b}\right|^{2}+\sigma_{\rm w}^{2}}{\left|\mathbf{h}_{\rm w}^{H}\mathbf{w}_{{\rm c},0}\right|^{2}+\sigma_{\rm w}^{2}}\leq\bar{b}. \tag{39}
\end{align}

Here, due to $\Delta\mathbf{h}_{\rm w}\in\mathcal{E}_{\rm w}$, there are infinitely many choices for $\Delta\mathbf{h}_{\rm w}$ in constraint (37e), which makes problem (37) non-convex and intractable. To overcome this challenge, we propose a relaxation and restriction approach. Specifically, in the relaxation step, the nonconvex robust design problem is transformed into a convex SDP, while in the restriction step, the infinite number of complicated constraints is reformulated into a finite number of linear matrix inequalities (LMIs).
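The bounds $\bar{a}$ and $\bar{b}$ in (38) have no simple closed form, but they are easy to compute numerically because $f(x)=\ln x+\frac{1}{x}-1$ decreases on $(0,1)$, vanishes at $x=1$, and increases on $(1,\infty)$. The following sketch finds both roots by bisection; the function names and the bracketing endpoints are illustrative choices, not part of the original design.

```python
import math
from typing import Tuple


def f(x: float) -> float:
    """f(x) = ln x + 1/x - 1, i.e. D(p0||p1) evaluated at x = lambda1/lambda0."""
    return math.log(x) + 1.0 / x - 1.0


def bisect_root(lo: float, hi: float, target: float, iters: int = 200) -> float:
    """Solve f(x) = target on [lo, hi], assuming a sign change of f - target."""
    g_lo = f(lo) - target
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = f(mid) - target
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


def covertness_bounds(two_eps_sq: float) -> Tuple[float, float]:
    """Return (a_bar, b_bar) with a_bar < 1 < b_bar and f = 2*eps^2 at both."""
    a_bar = bisect_root(1e-12, 1.0, two_eps_sq)   # root on the decreasing branch
    hi = 2.0
    while f(hi) < two_eps_sq:                     # grow the bracket until it contains b_bar
        hi *= 2.0
    b_bar = bisect_root(1.0, hi, two_eps_sq)      # root on the increasing branch
    return a_bar, b_bar


a_bar, b_bar = covertness_bounds(0.02)  # threshold 2*eps^2 = 0.02, as used in Fig. 5
```

Because the roots depend only on $\varepsilon$, they can be computed once and reused as constants in the subsequent LMIs.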

For mathematical convenience, let us define $\mathbf{W}_{\rm b}=\mathbf{w}_{\rm b}\mathbf{w}_{\rm b}^{H}$, $\mathbf{W}_{{\rm c},1}=\mathbf{w}_{{\rm c},1}\mathbf{w}_{{\rm c},1}^{H}$, $\widehat{\mathbf{W}}_{1}\triangleq\mathbf{W}_{\rm b}+\mathbf{W}_{{\rm c},1}-\bar{a}\mathbf{w}_{{\rm c},0}\mathbf{w}_{{\rm c},0}^{H}$, and $\widetilde{\mathbf{W}}_{1}\triangleq\mathbf{W}_{{\rm c},1}+\mathbf{W}_{\rm b}-\bar{b}\mathbf{w}_{{\rm c},0}\mathbf{w}_{{\rm c},0}^{H}$. Then, by substituting $\mathbf{h}_{\rm w}=\hat{\mathbf{h}}_{\rm w}+\Delta\mathbf{h}_{\rm w}$, constraint (39) can be equivalently re-expressed as

\begin{align}
\Delta\mathbf{h}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}\Delta\mathbf{h}_{\rm w}+2{\rm Re}\left\{\Delta\mathbf{h}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}\right\}+\hat{\mathbf{h}}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}&\geq\sigma_{\rm w}^{2}\left(\bar{a}-1\right), \tag{40a}\\
\Delta\mathbf{h}_{\rm w}^{H}\widetilde{\mathbf{W}}_{1}\Delta\mathbf{h}_{\rm w}+2{\rm Re}\left\{\Delta\mathbf{h}_{\rm w}^{H}\widetilde{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}\right\}+\hat{\mathbf{h}}_{\rm w}^{H}\widetilde{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}&\leq\sigma_{\rm w}^{2}\left(\bar{b}-1\right). \tag{40b}
\end{align}

By applying SDR, we ignore the rank-one constraints of $\mathbf{W}_{{\rm c},1}$ and $\mathbf{W}_{\rm b}$, which is similar to the approach used in (25) and (26). Then, problem (37) can be relaxed as follows

\begin{align}
\max_{\mathbf{W}_{\rm b},\mathbf{W}_{{\rm c},1},\tilde{r}_{\rm b}}\quad & \tilde{r}_{\rm b} \tag{41a}\\
{\rm s.t.}\quad & {\rm Tr}\left(\mathbf{h}_{\rm b}^{H}\mathbf{W}_{\rm b}\mathbf{h}_{\rm b}\right)\geq\tilde{r}_{\rm b}\left({\rm Tr}\left(\mathbf{h}_{\rm b}^{H}\mathbf{W}_{{\rm c},1}\mathbf{h}_{\rm b}\right)+\sigma_{\rm b}^{2}\right), \tag{41b}\\
& \sigma_{\rm c}^{2}{\rm Tr}\left(\mathbf{h}_{\rm c}^{H}\mathbf{W}_{{\rm c},1}\mathbf{h}_{\rm c}\right)=\tau_{1}{\rm Tr}\left(\mathbf{h}_{\rm c}^{H}\mathbf{W}_{\rm b}\mathbf{h}_{\rm c}\right)+\tau_{1}\sigma_{\rm c}^{2}, \tag{41c}\\
& {\rm Tr}\left(\mathbf{W}_{{\rm c},1}\right)+{\rm Tr}\left(\mathbf{W}_{\rm b}\right)\leq P_{\rm total}, \tag{41d}\\
& \mathbf{W}_{{\rm c},1}\succeq\mathbf{0},\ \mathbf{W}_{\rm b}\succeq\mathbf{0}, \tag{41e}\\
& \Delta\mathbf{h}_{\rm w}^{H}\mathbf{C}_{\rm w}\Delta\mathbf{h}_{\rm w}\leq v_{\rm w}, \tag{41f}\\
& \text{(40a), (40b)},
\end{align}

where $\tilde{r}_{\rm b}\geq 0$ is a slack variable.

Note that the SDR problem (41) is quasi-concave, since the objective function and constraints are linear in $\mathbf{W}_{{\rm c},1}$ and $\mathbf{W}_{\rm b}$. However, problem (41) is still computationally intractable because it involves an infinite number of constraints due to $\Delta\mathbf{h}_{\rm w}\in\mathcal{E}_{\rm w}$.

Next, we employ the S-Procedure to recast the infinitely many constraints as a certain set of LMIs, which is a tractable approximation.

Lemma 2 (S-Procedure [43]): Let a function $f_{m}(\mathbf{x})$, $m\in\{1,2\}$, $\mathbf{x}\in\mathbb{C}^{N\times 1}$, be defined as

\begin{align}
f_{m}(\mathbf{x})=\mathbf{x}^{H}\mathbf{A}_{m}\mathbf{x}+2{\rm Re}\left\{\mathbf{b}_{m}^{H}\mathbf{x}\right\}+c_{m}, \tag{42}
\end{align}

where $\mathbf{A}_{m}\in\mathbb{C}^{N\times N}$ is a complex Hermitian matrix, $\mathbf{b}_{m}\in\mathbb{C}^{N\times 1}$ and $c_{m}\in\mathbb{R}$. Then, the implication relation $f_{1}(\mathbf{x})\leq 0\Rightarrow f_{2}(\mathbf{x})\leq 0$ holds if and only if there exists a variable $\eta\geq 0$ such that

\begin{align}
\eta\begin{bmatrix}\mathbf{A}_{1}&\mathbf{b}_{1}\\ \mathbf{b}_{1}^{H}&c_{1}\end{bmatrix}-\begin{bmatrix}\mathbf{A}_{2}&\mathbf{b}_{2}\\ \mathbf{b}_{2}^{H}&c_{2}\end{bmatrix}\succeq\mathbf{0}. \tag{47}
\end{align}

Consequently, by using the S-Procedure of Lemma 2, constraints (40a) and (40b) can be respectively recast as a finite number of LMIs:

\begin{align}
\begin{bmatrix}\widehat{\mathbf{W}}_{1}+\eta_{1}\mathbf{C}_{\rm w}&\widehat{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}\\ \hat{\mathbf{h}}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}&\hat{\mathbf{h}}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}-\sigma_{\rm w}^{2}\left(\bar{a}-1\right)-\eta_{1}v_{\rm w}\end{bmatrix}&\succeq\mathbf{0}, \tag{48c}\\
\begin{bmatrix}-\widetilde{\mathbf{W}}_{1}+\eta_{2}\mathbf{C}_{\rm w}&-\widetilde{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}\\ -\hat{\mathbf{h}}_{\rm w}^{H}\widetilde{\mathbf{W}}_{1}&-\hat{\mathbf{h}}_{\rm w}^{H}\widetilde{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}+\sigma_{\rm w}^{2}\left(\bar{b}-1\right)-\eta_{2}v_{\rm w}\end{bmatrix}&\succeq\mathbf{0}. \tag{48f}
\end{align}
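As a worked check (a sketch added for illustration, not part of the original derivation), the first LMI (48c) can be recovered from Lemma 2 by taking $\mathbf{x}=\Delta\mathbf{h}_{\rm w}$, letting $f_{1}$ encode the uncertainty set (41f), and letting $f_{2}\leq 0$ encode constraint (40a) after moving all terms to one side. The identification of $(\mathbf{A}_{m},\mathbf{b}_{m},c_{m})$ below assumes that $\widehat{\mathbf{W}}_{1}$ is Hermitian, which holds by its definition.

```latex
\begin{align*}
f_{1}(\Delta\mathbf{h}_{\rm w}) &= \Delta\mathbf{h}_{\rm w}^{H}\mathbf{C}_{\rm w}\Delta\mathbf{h}_{\rm w}-v_{\rm w}\leq 0
&&\Rightarrow\ (\mathbf{A}_{1},\mathbf{b}_{1},c_{1})=(\mathbf{C}_{\rm w},\,\mathbf{0},\,-v_{\rm w}),\\
f_{2}(\Delta\mathbf{h}_{\rm w}) &= -\Delta\mathbf{h}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}\Delta\mathbf{h}_{\rm w}
-2{\rm Re}\{\hat{\mathbf{h}}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}\Delta\mathbf{h}_{\rm w}\}
-\hat{\mathbf{h}}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}
+\sigma_{\rm w}^{2}(\bar{a}-1)\leq 0
&&\Rightarrow\ \mathbf{A}_{2}=-\widehat{\mathbf{W}}_{1},\ \mathbf{b}_{2}=-\widehat{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w},\\
& && \phantom{\Rightarrow\ } c_{2}=-\hat{\mathbf{h}}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}+\sigma_{\rm w}^{2}(\bar{a}-1),\\
\eta_{1}\begin{bmatrix}\mathbf{A}_{1}&\mathbf{b}_{1}\\ \mathbf{b}_{1}^{H}&c_{1}\end{bmatrix}
-\begin{bmatrix}\mathbf{A}_{2}&\mathbf{b}_{2}\\ \mathbf{b}_{2}^{H}&c_{2}\end{bmatrix}
&=\begin{bmatrix}\widehat{\mathbf{W}}_{1}+\eta_{1}\mathbf{C}_{\rm w}&\widehat{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}\\
\hat{\mathbf{h}}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}&\hat{\mathbf{h}}_{\rm w}^{H}\widehat{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}-\sigma_{\rm w}^{2}(\bar{a}-1)-\eta_{1}v_{\rm w}\end{bmatrix}\succeq\mathbf{0}.
\end{align*}
```

The second LMI (48f) follows in the same way with $\mathbf{A}_{2}=\widetilde{\mathbf{W}}_{1}$, $\mathbf{b}_{2}=\widetilde{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}$ and $c_{2}=\hat{\mathbf{h}}_{\rm w}^{H}\widetilde{\mathbf{W}}_{1}\hat{\mathbf{h}}_{\rm w}-\sigma_{\rm w}^{2}(\bar{b}-1)$, which accounts for the sign flips in its entries.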

Thus, we obtain the following conservative approximation of problem (41):

\begin{align}
\max_{\mathbf{W}_{\rm b},\mathbf{W}_{{\rm c},1},\tilde{r}_{\rm b}}\quad & \tilde{r}_{\rm b} \tag{49}\\
{\rm s.t.}\quad & \text{(41b), (41c), (41d), (41e), (48c), (48f)}.
\end{align}

When $\tilde{r}_{\rm b}$ is fixed, problem (49) is a convex SDP which can be efficiently solved by off-the-shelf convex solvers [39]. Therefore, problem (49) can be efficiently solved by the proposed bisection method, which is summarized in Algorithm 2. The computational complexity of Algorithm 2 is $\mathcal{O}\left(\max\{5,2N-1\}^{4}\sqrt{2N-1}\log(1/\xi_{2})\log(1/\zeta_{2})\right)$, where $\xi_{2}>0$ is the pre-defined accuracy of problem (49).

Similarly, if ${\rm rank}(\mathbf{W}_{{\rm c},1}^{*})=1$ and ${\rm rank}(\mathbf{W}_{\rm b}^{*})=1$, then $\mathbf{W}_{{\rm c},1}^{*}$ and $\mathbf{W}_{\rm b}^{*}$ are also the optimal solutions of problem (37), and the optimal beamformers $\mathbf{w}_{{\rm c},1}$ and $\mathbf{w}_{\rm b}$ can be obtained by SVD, i.e., $\mathbf{W}_{{\rm c},1}^{*}=\mathbf{w}_{{\rm c},1}\mathbf{w}_{{\rm c},1}^{H}$ and $\mathbf{W}_{\rm b}^{*}=\mathbf{w}_{\rm b}\mathbf{w}_{\rm b}^{H}$. However, if ${\rm rank}(\mathbf{W}_{{\rm c},1}^{*})>1$ or ${\rm rank}(\mathbf{W}_{\rm b}^{*})>1$, we can adopt the Gaussian randomization procedure [38] to produce a high-quality rank-one solution to problem (37).

Algorithm 2 Proposed robust beamformers design method for problem (49)
1: Choose $\zeta_{2}>0$ (termination parameter), $\tilde{r}_{\rm b,l}$ and $\tilde{r}_{\rm b,u}$ such that $\tilde{r}_{\rm b}^{*}$ lies in $[\tilde{r}_{\rm b,l},\tilde{r}_{\rm b,u}]$;
2: Initialize $\tilde{r}_{\rm b,l}=0$, $\tilde{r}_{\rm b,u}=\hat{r}_{\rm b}$;
3: while $\tilde{r}_{\rm b,u}-\tilde{r}_{\rm b,l}\geq\zeta_{2}$ do
4:     let $\tilde{r}_{\rm b}=(\tilde{r}_{\rm b,l}+\tilde{r}_{\rm b,u})/2$;
5:     if problem (49) is feasible, obtain the solution $\mathbf{W}_{\rm b}$ and $\mathbf{W}_{{\rm c},1}$, and set $\tilde{r}_{\rm b,l}=\tilde{r}_{\rm b}$;
6:     else, let $\tilde{r}_{\rm b,u}=\tilde{r}_{\rm b}$;
7: end while
8: Output the optimal solutions $\mathbf{W}_{{\rm c},1}^{*}$, $\mathbf{W}_{\rm b}^{*}$.

IV-B Case of $D(p_{1}\|p_{0})\leq 2\varepsilon^{2}$

In this subsection, we consider the constraint $D(p_{1}\|p_{0})\leq 2\varepsilon^{2}$, and the corresponding robust covert rate maximization problem can be formulated as

\begin{align}
\max_{\mathbf{w}_{\rm b},\mathbf{w}_{{\rm c},1}}\quad & R_{\rm b}\left(\mathbf{w}_{{\rm c},1},\mathbf{w}_{\rm b}\right) \tag{50a}\\
{\rm s.t.}\quad & R_{{\rm c},1}\left(\mathbf{w}_{{\rm c},1},\mathbf{w}_{\rm b}\right)=R_{{\rm c},0}\left(\mathbf{w}_{{\rm c},0}\right), \tag{50b}\\
& D\left(p_{1}\|p_{0}\right)\leq 2\varepsilon^{2}, \tag{50c}\\
& \left\|\mathbf{w}_{\rm b}\right\|^{2}+\left\|\mathbf{w}_{{\rm c},1}\right\|^{2}\leq P_{\rm total}, \tag{50d}\\
& \mathbf{h}_{\rm w}=\hat{\mathbf{h}}_{\rm w}+\Delta\mathbf{h}_{\rm w},\ \Delta\mathbf{h}_{\rm w}\in\mathcal{E}_{\rm w}, \tag{50e}
\end{align}

where $D(p_{1}\|p_{0})=\ln\frac{\lambda_{0}}{\lambda_{1}}+\frac{\lambda_{1}}{\lambda_{0}}-1$.

Note that problem (50) is similar to problem (37) except for the covertness constraint. The covertness constraint $D(p_{1}\|p_{0})=\ln\frac{\lambda_{0}}{\lambda_{1}}+\frac{\lambda_{1}}{\lambda_{0}}-1\leq 2\varepsilon^{2}$ can be equivalently transformed as

\begin{align}
\bar{c}\leq\frac{\lambda_{0}}{\lambda_{1}}\leq\bar{d}, \tag{51}
\end{align}

where $\bar{c}=\bar{a}$ and $\bar{d}=\bar{b}$ are the two roots of the equation $\ln\frac{\lambda_{0}}{\lambda_{1}}+\frac{\lambda_{1}}{\lambda_{0}}-1=2\varepsilon^{2}$.
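The asymmetry between the two covertness constraints can be seen directly from the closed-form divergences. The sketch below evaluates both expressions for a hypothetical pair of variances ($\lambda_{0}=1$, $\lambda_{1}=1.2$, chosen only for illustration); the function names are assumptions. Since (51) bounds $\lambda_{0}/\lambda_{1}$ rather than $\lambda_{1}/\lambda_{0}$, the admissible range of $\lambda_{1}/\lambda_{0}$ becomes $[1/\bar{b},1/\bar{a}]$ instead of $[\bar{a},\bar{b}]$.

```python
import math


def kl_p0_p1(lam0: float, lam1: float) -> float:
    """D(p0||p1) = ln(lam1/lam0) + lam0/lam1 - 1, as used in Section IV-A."""
    return math.log(lam1 / lam0) + lam0 / lam1 - 1.0


def kl_p1_p0(lam0: float, lam1: float) -> float:
    """D(p1||p0) = ln(lam0/lam1) + lam1/lam0 - 1, as used in Section IV-B."""
    return math.log(lam0 / lam1) + lam1 / lam0 - 1.0


# Hypothetical received-power levels at Willie under H0 and H1.
lam0, lam1 = 1.0, 1.2
d01 = kl_p0_p1(lam0, lam1)
d10 = kl_p1_p0(lam0, lam1)
```

For these values with $\lambda_{1}>\lambda_{0}$, `d01` is smaller than `d10`, so the same pair of beamformers can satisfy one covertness constraint while violating the other, which is consistent with the two cases yielding different achievable covert rates.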

Similar to the previous subsection, we may apply the relaxation and restriction approach to solve problem (50). We omit the detailed derivations for brevity. Note that although the methods are similar, the achievable covert rates are quite different under the two covertness constraints. We will illustrate and discuss this issue in the next section.

IV-C Ideal Detection Performance of Willie

In order to evaluate the above robust beamformer design, we further develop the optimal decision threshold of Willie, and the corresponding false alarm and missed detection probabilities. We consider the ideal case for Willie, i.e., the beamformers $\mathbf{w}_{\rm b}$, $\mathbf{w}_{{\rm c},0}$, and $\mathbf{w}_{{\rm c},1}$ are known by Willie, which is the worst case for Bob.

According to the Neyman-Pearson criterion [3], the optimal rule for Willie to minimize his detection error is the likelihood ratio test [3], i.e.,

\begin{align}
\frac{p_{1}\left(y_{\rm w}\right)}{p_{0}\left(y_{\rm w}\right)}\underset{\mathcal{D}_{0}}{\overset{\mathcal{D}_{1}}{\gtrless}}1, \tag{52}
\end{align}

where $\mathcal{D}_{1}$ and $\mathcal{D}_{0}$ are the binary decisions that correspond to hypotheses $\mathcal{H}_{1}$ and $\mathcal{H}_{0}$, respectively. Furthermore, (52) can be equivalently reformulated as

\begin{align}
\left|y_{\rm w}\right|^{2}\underset{\mathcal{D}_{0}}{\overset{\mathcal{D}_{1}}{\gtrless}}\phi^{*}, \tag{53}
\end{align}

where $\phi^{*}\triangleq\frac{\lambda_{0}\lambda_{1}}{\lambda_{1}-\lambda_{0}}\ln\frac{\lambda_{1}}{\lambda_{0}}$ denotes the optimal detection threshold of Willie. Here, recall that $\lambda_{0}$ and $\lambda_{1}$ are given in (16), and depend on the beamformer vectors $\mathbf{w}_{\rm b}$, $\mathbf{w}_{{\rm c},0}$, and $\mathbf{w}_{{\rm c},1}$.

According to (16), the cumulative distribution functions (CDFs) of $\left|y_{\rm w}\right|^{2}$ under $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$ are respectively given by

\begin{align}
\Pr\left(\left|y_{\rm w}\right|^{2}\,\middle|\,\mathcal{H}_{0}\right)&=1-\exp\left(-\frac{\left|y_{\rm w}\right|^{2}}{\lambda_{0}}\right), \tag{54a}\\
\Pr\left(\left|y_{\rm w}\right|^{2}\,\middle|\,\mathcal{H}_{1}\right)&=1-\exp\left(-\frac{\left|y_{\rm w}\right|^{2}}{\lambda_{1}}\right). \tag{54b}
\end{align}

Therefore, based on the optimal detection threshold $\phi^{*}$, the false alarm probability $P\left(\mathcal{D}_{1}|\mathcal{H}_{0}\right)$ and the missed detection probability $P\left(\mathcal{D}_{0}|\mathcal{H}_{1}\right)$ are given as

\begin{align}
P\left(\mathcal{D}_{1}|\mathcal{H}_{0}\right)&=\Pr\left(\left|y_{\rm w}\right|^{2}\geq\phi^{*}\,\middle|\,\mathcal{H}_{0}\right)=\left(\frac{\lambda_{1}}{\lambda_{0}}\right)^{-\frac{\lambda_{1}}{\lambda_{1}-\lambda_{0}}}, \tag{55a}\\
P\left(\mathcal{D}_{0}|\mathcal{H}_{1}\right)&=\Pr\left(\left|y_{\rm w}\right|^{2}\leq\phi^{*}\,\middle|\,\mathcal{H}_{1}\right)=1-\left(\frac{\lambda_{1}}{\lambda_{0}}\right)^{-\frac{\lambda_{0}}{\lambda_{1}-\lambda_{0}}}. \tag{55b}
\end{align}

Therefore, the ideal detection performance of Willie can be characterized by $\phi^{*}$, $P\left(\mathcal{D}_{1}|\mathcal{H}_{0}\right)$ and $P\left(\mathcal{D}_{0}|\mathcal{H}_{1}\right)$. Such results can be used as the theoretical benchmark to evaluate the covert performance of our robust beamformer designs. We will further discuss the detection performance of Willie in the next section.
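The threshold and error probabilities in (53) and (55) are straightforward to evaluate once $\lambda_{0}$ and $\lambda_{1}$ are known. The sketch below is a minimal illustration that assumes $\lambda_{1}>\lambda_{0}>0$; the function name and the numerical values of $\lambda_{0}$ and $\lambda_{1}$ are hypothetical.

```python
import math
from typing import Tuple


def willie_detection(lam0: float, lam1: float) -> Tuple[float, float, float]:
    """Return (phi_star, p_false_alarm, p_missed_detection) from (53) and (55).

    Assumes lam1 > lam0 > 0, so that the threshold is well defined.
    """
    ratio = lam1 / lam0
    phi_star = lam0 * lam1 / (lam1 - lam0) * math.log(ratio)   # optimal threshold
    p_fa = ratio ** (-lam1 / (lam1 - lam0))                    # (55a)
    p_md = 1.0 - ratio ** (-lam0 / (lam1 - lam0))              # (55b)
    return phi_star, p_fa, p_md


# Hypothetical received-power levels at Willie under H0 and H1.
lam0, lam1 = 1.0, 1.2
phi_star, p_fa, p_md = willie_detection(lam0, lam1)
total_error = p_fa + p_md  # Willie's total detection error probability
```

The closed forms in (55) agree with evaluating the exponential tails in (54) at $\phi^{*}$, i.e., $\exp(-\phi^{*}/\lambda_{0})$ and $1-\exp(-\phi^{*}/\lambda_{1})$, which gives a quick sanity check on any implementation.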

V Numerical Results

In this section, we present and discuss numerical results to assess the performance of the proposed covert beamformers design, ZF beamformers design and robust beamformers design methods for covert communications. In our simulations, we set the number of antennas at Alice to $N=5$, normalize the noise variance of the three users to 1, i.e., $\sigma_{\rm c}^{2}=\sigma_{\rm b}^{2}=\sigma_{\rm w}^{2}=0$ dBW, and set the total transmit power of Alice to $P_{\rm total}=10$ dBW and $\|\mathbf{w}_{{\rm c},0}\|^{2}=1$ dBW. Moreover, we assume that all channels experience Rayleigh flat fading, and $\sigma_{1}=\sigma_{2}=\sigma_{3}=1$ [16].

V-A Evaluation for Scenario 1

We first evaluate the proposed methods in scenario 1, i.e., Alice with perfect WCSI.

Figure 2: $R_{\rm b}$ (bits/sec/Hz) of the proposed covert beamformer design and the proposed ZF beamformer design versus $P_{\rm total}$ (dBW).

Fig. 2 depicts the covert rate of Bob $R_{\rm b}$ with the proposed covert beamformer design and the proposed ZF beamformer design versus the total transmit power $P_{\rm total}$. It can be observed that the covert rate of Bob $R_{\rm b}$ increases as the transmit power of Alice $P_{\rm total}$ increases, while $R_{\rm b}$ of the proposed covert beamforming design is higher than that of the ZF beamformer design. In addition, by comparing the two different transmit powers $\|\mathbf{w}_{{\rm c},0}\|^{2}$ of the beamformer for Carol under $\mathcal{H}_{0}$, we observe that the lower the transmit power $\|\mathbf{w}_{{\rm c},0}\|^{2}$ is, the higher the covert rate of Bob $R_{\rm b}$ will be. This is because when the transmit power $\|\mathbf{w}_{{\rm c},0}\|^{2}$ is lower, more power can be allocated to Bob.

Figure 3: $R_{\rm b}$ of the proposed covert beamformer design and the proposed ZF beamformer design versus different ratios $\frac{\|\mathbf{w}_{{\rm c},0}\|^{2}}{P_{\rm total}}$.

Fig. 3 plots the covert rate $R_{\rm b}$ of the proposed covert beamformer design and the proposed ZF beamformer design versus different ratios $\frac{\|\mathbf{w}_{{\rm c},0}\|^{2}}{P_{\rm total}}$ with $P_{\rm total}=10$ W. In this figure, we observe that for a fixed value of the ratio $\frac{\|\mathbf{w}_{{\rm c},0}\|^{2}}{P_{\rm total}}$, $R_{\rm b}$ of the ZF beamformer design is lower than that of the covert beamformer design, which is consistent with Fig. 2. In addition, as the ratio $\frac{\|\mathbf{w}_{{\rm c},0}\|^{2}}{P_{\rm total}}$ increases, the covert rate of Bob $R_{\rm b}$ decreases, and the rate gap between the covert beamformer design and the ZF beamformer design also decreases. This is because when the ratio is high, the power allocated to the beamformer $\mathbf{w}_{\rm b}$ is close to 0, which drives the rate gap between the covert beamformer design and the ZF design close to 0; when the ratio is low, more power is allocated to the beamformer $\mathbf{w}_{\rm b}$, which results in a rate gap between the covert beamformer design and the ZF design (verified in Fig. 2).

Figure 4: $R_{\rm b}$ versus the number of antennas $N$ for the proposed covert beamformer design and the proposed ZF beamformer design.

In Fig. 4, we plot the covert rate of Bob $R_{\rm b}$ of the proposed covert beamformer design and the proposed ZF beamformer design versus the number of antennas of Alice $N$ with $P_{\rm total}=10$ dBW. It is observed that as the number of antennas $N$ increases, the covert rate of Bob $R_{\rm b}$ increases and the rate gap between the covert beamformer design and the ZF beamformer design also increases. This is because with more antennas, more spatial multiplexing gains can be exploited.

From Figs. 2, 3 and 4, we observe that the covert rate of the proposed covert beamformer design is always higher than that of the proposed ZF beamformer design. However, the computational complexity of the ZF beamformer design is significantly lower than that of the covert beamformer design. Specifically, the computational times of the two designs are compared in Table I, where all simulations are performed using MATLAB 2016b on a machine with dual CPUs (2.30 GHz and 2.29 GHz) and 128 GB of RAM. Table I shows that the computational time of both the covert beamformer design and the ZF beamformer design increases as the number of antennas $N$ increases. More importantly, the computational time of the ZF beamformer design is less than $1/10$ of that of the covert beamformer design.

TABLE I: Comparison of the computational time (in seconds) between the proposed covert beamformer design and the proposed ZF beamformer design

| Method | $N=4$ | $N=6$ | $N=8$ | $N=10$ |
|---|---|---|---|---|
| Covert Design | 10.36 | 10.49 | 10.97 | 11.39 |
| ZF Design | 0.7571 | 0.7593 | 0.7615 | 0.7621 |
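The "less than $1/10$" statement can be checked directly from the entries of Table I. The short Python sketch below only redoes this arithmetic; the variable names are illustrative and the timings are copied from the table.

```python
# Runtimes in seconds, copied from Table I.
antennas = [4, 6, 8, 10]
covert_time = [10.36, 10.49, 10.97, 11.39]
zf_time = [0.7571, 0.7593, 0.7615, 0.7621]

# Ratio of the ZF runtime to the covert-design runtime for each N.
ratios = [zf / cov for zf, cov in zip(zf_time, covert_time)]

for n, r in zip(antennas, ratios):
    print(f"N={n}: ZF/covert = {r:.4f} (speed-up of about {1 / r:.1f}x)")

# True when every ZF runtime is below one tenth of the covert-design runtime.
print("all below 1/10:", all(r < 0.1 for r in ratios))
```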

V-B Evaluation for Scenario 2

In this subsection, we evaluate the proposed robust beamformer design for scenario 2, namely, Alice with imperfect WCSI.

Figure 5: The empirical CDF of (a) $D(p_{0}\|p_{1})$ and (b) $D(p_{1}\|p_{0})$, with the covertness threshold $2\varepsilon^{2}=0.02$ and CSI errors $v_{w}=0.005$.

Fig. 5 shows the empirical cumulative distribution function (CDF) of the KL divergence, where the relative entropy requirement is $D(p_{0}\|p_{1})\leq 0.02$, $\|\mathbf{w}_{{\rm c},0}\|^{2}=8\,{\rm dBW}$ and $v_{w}=0.005$. From these results, we observe that the non-robust design cannot guarantee the KL divergence requirement, whereas the robust beamforming design satisfies it and therefore meets the required detection error probability at Willie.

Fig. 5 (a) and (b) show the empirical CDF of the achieved $D(p_{0}\|p_{1})$ and $D(p_{1}\|p_{0})$, respectively, for both the robust and non-robust designs, where the covertness threshold is $2\varepsilon^{2}=0.02$, i.e., $D(p_{0}\|p_{1})\leq 0.02$ and $D(p_{1}\|p_{0})\leq 0.02$, and the CSI error parameter is $v_{w}=0.005$. Here, the non-robust design refers to the proposed covert design with $\hat{\mathbf{h}}_{\rm w}$ under the same conditions. As can be observed from Fig. 5 (a) and (b), the proposed robust design satisfies the covertness constraints, i.e., $D(p_{0}\|p_{1})\leq 0.02$ and $D(p_{1}\|p_{0})\leq 0.02$. In contrast, the non-robust design cannot satisfy the covertness constraints: about 45% of the resulting $D(p_{0}\|p_{1})$ values and about 50% of the resulting $D(p_{1}\|p_{0})$ values exceed the covertness threshold $2\varepsilon^{2}=0.02$. Fig. 5 (a) and (b) thus verify the necessity and effectiveness of the proposed robust design.
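To illustrate how an exceedance percentage such as those above is read off an empirical CDF, the following Python sketch runs a toy Monte Carlo experiment. It is not the paper's simulation: it assumes that Willie's observation is zero-mean complex Gaussian under both hypotheses, that the CSI error is i.i.d. $\mathcal{CN}(0, v_w)$ per antenna, that Carol's beamformer is the same under both hypotheses, and that the "non-robust" covert beamformer simply lies in the null space of the estimate $\hat{\mathbf{h}}_{\rm w}$. All function names and parameter values (`p_c`, `p_b`, `v_w`, `noise`) are hypothetical.

```python
import math
import random


def kl_cn(var_p, var_q):
    """D(p||q) for p = CN(0, var_p) and q = CN(0, var_q)."""
    ratio = var_p / var_q
    return max(0.0, ratio - 1.0 - math.log(ratio))  # clamp rounding noise


def crandn(rng, var):
    """Draw one sample from CN(0, var)."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def herm_inner(h, w):
    """Return h^H w for two equal-length complex lists."""
    return sum(hi.conjugate() * wi for hi, wi in zip(h, w))


def scale_to_power(w, power):
    """Scale w so that ||w||^2 equals power."""
    norm2 = herm_inner(w, w).real
    return [wi * math.sqrt(power / norm2) for wi in w]


def simulate_kl(n_trials=2000, n_ant=4, p_c=1.0, p_b=10.0,
                v_w=0.05, noise=1.0, seed=0):
    """Toy Monte Carlo: D(p0||p1) samples when only the estimate is nulled."""
    rng = random.Random(seed)
    h_hat = [crandn(rng, 1.0) for _ in range(n_ant)]       # estimated WCSI
    w_c = scale_to_power([crandn(rng, 1.0) for _ in range(n_ant)], p_c)
    # Assumed non-robust covert beamformer: projection onto null(h_hat).
    u = [crandn(rng, 1.0) for _ in range(n_ant)]
    coef = herm_inner(h_hat, u) / herm_inner(h_hat, h_hat).real
    w_b = scale_to_power([ui - coef * hi for ui, hi in zip(u, h_hat)], p_b)
    samples = []
    for _ in range(n_trials):
        h = [hi + crandn(rng, v_w) for hi in h_hat]        # true channel
        var0 = abs(herm_inner(h, w_c)) ** 2 + noise        # H0: Carol only
        var1 = var0 + abs(herm_inner(h, w_b)) ** 2         # H1: Carol + Bob
        samples.append(kl_cn(var0, var1))
    return samples


def empirical_cdf(samples, x):
    """Fraction of samples less than or equal to x."""
    return sum(s <= x for s in samples) / len(samples)


threshold = 0.02  # covertness threshold 2 * eps^2 used in Fig. 5
kl_samples = simulate_kl()
violation = 1.0 - empirical_cdf(kl_samples, threshold)
print(f"fraction of trials with D(p0||p1) > {threshold}: {violation:.3f}")
```

The violation fraction is one minus the empirical CDF evaluated at the threshold, which is how the percentages quoted for Fig. 5 would be obtained from simulated KL values. The toy parameters are chosen only to make violations visible and do not reproduce the figure.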

Figure 6: (a) The covert rate and (b) the detection error probabilities versus the value of $\varepsilon$ with CSI errors $v_{w}=0.005$.

Figure 7: (a) The covert rate and (b) the detection error probabilities versus CSI errors $v_{w}$ with $\varepsilon=0.1$.

Fig. 6 (a) plots the covert rate $R_{\rm b}$ versus the value of $\varepsilon$ for the two KL divergence cases with CSI errors $v_{w}=0.005$. This simulation result is consistent with the theoretical analysis: as $\varepsilon$ becomes larger, the covertness constraint becomes looser, which causes $R_{\rm b}$ to become larger. The rate of improvement, however, decreases with increasing $\varepsilon$. Fig. 6 (b) plots the false alarm probability $P(\mathcal{D}_{1}|\mathcal{H}_{0})$ and the missed detection probability $P(\mathcal{D}_{0}|\mathcal{H}_{1})$ versus the value of $\varepsilon$ with CSI errors $v_{w}=0.005$, where $P_{(p_{0}\|p_{1})}(\mathcal{D}_{1}|\mathcal{H}_{0})$ denotes the false alarm probability $P(\mathcal{D}_{1}|\mathcal{H}_{0})$ in the case of $D(p_{0}\|p_{1})\leq 2\varepsilon^{2}$, and the other notation is defined likewise. We observe that under either covertness constraint, the false alarm probability $P(\mathcal{D}_{1}|\mathcal{H}_{0})$ and the missed detection probability $P(\mathcal{D}_{0}|\mathcal{H}_{1})$ decrease as $\varepsilon$ increases, and $P(\mathcal{D}_{1}|\mathcal{H}_{0})$ is always lower than $P(\mathcal{D}_{0}|\mathcal{H}_{1})$. This implies that when the covertness constraint is looser, the detection performance of Willie becomes better. Moreover, Fig. 6 (b) also verifies the effectiveness of the proposed robust beamformer design in covert communications, i.e., $\Pr(\mathcal{D}_{1}|\mathcal{H}_{0})+\Pr(\mathcal{D}_{0}|\mathcal{H}_{1})\geq 1-\varepsilon$. Therefore, Fig. 6 reveals the tradeoff between Willie's detection performance and Bob's covert rate, and a desired tradeoff can be achieved via a proper robust beamformer design.
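The bound $\Pr(\mathcal{D}_{1}|\mathcal{H}_{0})+\Pr(\mathcal{D}_{0}|\mathcal{H}_{1})\geq 1-\varepsilon$ can be checked numerically in a simplified setting. The Python sketch below assumes, purely for illustration, that Willie decides from a single observation $y$ that is $\mathcal{CN}(0,\sigma_0^2)$ under $\mathcal{H}_0$ and $\mathcal{CN}(0,\sigma_1^2)$ under $\mathcal{H}_1$ with $\sigma_1^2>\sigma_0^2$, so that $|y|^2$ is exponentially distributed. It uses the likelihood-ratio threshold on $|y|^2$ that minimizes the sum of the two error probabilities, and compares this sum with the lower bound $1-\sqrt{D/2}$ given by Pinsker's inequality. The function names and variance values are hypothetical and are not taken from the paper's simulations.

```python
import math


def kl_cn(var_p, var_q):
    """D(p||q) for p = CN(0, var_p) and q = CN(0, var_q)."""
    ratio = var_p / var_q
    return ratio - 1.0 - math.log(ratio)


def optimal_detection(var0, var1):
    """Likelihood-ratio test on |y|^2, assuming var1 > var0.

    Returns (threshold, false alarm probability, missed detection probability).
    """
    thr = var0 * var1 / (var1 - var0) * math.log(var1 / var0)
    p_fa = math.exp(-thr / var0)        # P(|y|^2 > thr | H0)
    p_md = 1.0 - math.exp(-thr / var1)  # P(|y|^2 <= thr | H1)
    return thr, p_fa, p_md


eps = 0.1
var0 = 1.0  # assumed received variance at Willie under H0
for var1 in (1.05, 1.2, 1.5):
    d01 = kl_cn(var0, var1)
    _, p_fa, p_md = optimal_detection(var0, var1)
    pinsker = 1.0 - math.sqrt(d01 / 2.0)
    covert = d01 <= 2.0 * eps ** 2
    print(f"var1={var1}: D(p0||p1)={d01:.4f}, P_FA={p_fa:.3f}, "
          f"P_MD={p_md:.3f}, sum={p_fa + p_md:.3f}, "
          f"Pinsker bound={pinsker:.3f}, constraint met={covert}")
```

Whenever $D(p_{0}\|p_{1})\leq 2\varepsilon^{2}$ holds, the Pinsker bound is at least $1-\varepsilon$, so the sum of the two error probabilities of any detector stays above $1-\varepsilon$ in this simplified model.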

Figure 8: Covert rate $R_{\rm b}$ versus the number of antennas $N$ with CSI errors $v_{w}=0.005$.

Fig. 7 (a) plots the covert rate $R_{\rm b}$ versus the CSI errors $v_{w}$ under the two covertness constraints $D(p_{0}\|p_{1})\leq 2\varepsilon^{2}$ and $D(p_{1}\|p_{0})\leq 2\varepsilon^{2}$. We observe that as $v_{w}$ increases, the covert rates under both constraints decrease, and the gap between them increases. Fig. 7 (b) plots the false alarm probability $P(\mathcal{D}_{1}|\mathcal{H}_{0})$ and the missed detection probability $P(\mathcal{D}_{0}|\mathcal{H}_{1})$ versus the CSI errors $v_{w}$ under the same two covertness constraints. We observe that under both constraints, the false alarm probability $P(\mathcal{D}_{1}|\mathcal{H}_{0})$ and the missed detection probability $P(\mathcal{D}_{0}|\mathcal{H}_{1})$ increase as $v_{w}$ increases, and $P(\mathcal{D}_{1}|\mathcal{H}_{0})$ is always lower than $P(\mathcal{D}_{0}|\mathcal{H}_{1})$. Moreover, Fig. 7 implies that a large error $v_{w}$ may lead to a poor beamformer design in terms of the covert rate $R_{\rm b}$. However, such a beamformer may also confuse Willie's detection, which benefits Bob. This tradeoff should therefore be taken into account in the beamformer design. Finally, the detection error probabilities do not increase continuously with increasing $v_{w}$.

Finally, Fig. 8 shows the covert rate $R_{\rm b}$ versus the number of antennas $N$ for the two covertness constraints $D(p_{0}\|p_{1})\leq 2\varepsilon^{2}$ and $D(p_{1}\|p_{0})\leq 2\varepsilon^{2}$, where $\|\mathbf{w}_{{\rm c},0}\|^{2}=1\,{\rm dBW}$, $\varepsilon=0.1$ and $v_{w}=0.005$. From Fig. 8, we can see that the larger the number of antennas $N$, the higher the achieved covert rate $R_{\rm b}$, which is similar to the case in Fig. 4. From Figs. 6–8, we observe that the rates under the covertness constraint $D(p_{0}\|p_{1})\leq 2\varepsilon^{2}$ are higher than those under the covertness constraint $D(p_{1}\|p_{0})\leq 2\varepsilon^{2}$. This is because $D(p_{1}\|p_{0})\leq 2\varepsilon^{2}$ is stricter than $D(p_{0}\|p_{1})\leq 2\varepsilon^{2}$, a conclusion that is also verified in [4].
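The ordering of the two constraints can be illustrated with a small Python sketch. It assumes, as a simplification not taken from the text above, that Willie's observation is zero-mean complex Gaussian with variance $\sigma_0^2$ under $\mathcal{H}_0$ and $\sigma_1^2=r\,\sigma_0^2$ under $\mathcal{H}_1$ with $r\geq 1$, so both KL divergences depend only on the ratio $r$. The helper names `d01`, `d10` and `max_ratio` are hypothetical.

```python
import math


def d01(r):
    """D(p0||p1) when var1 / var0 = r >= 1."""
    return math.log(r) + 1.0 / r - 1.0


def d10(r):
    """D(p1||p0) when var1 / var0 = r >= 1."""
    return r - 1.0 - math.log(r)


def max_ratio(kl, budget, hi=1e3, iters=200):
    """Largest r in [1, hi] with kl(r) <= budget, found by bisection.

    Relies on kl being nondecreasing in r on [1, hi] with kl(hi) > budget.
    """
    lo = 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if kl(mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


eps = 0.1
budget = 2.0 * eps ** 2
r01 = max_ratio(d01, budget)  # largest ratio allowed by D(p0||p1) <= 2 eps^2
r10 = max_ratio(d10, budget)  # largest ratio allowed by D(p1||p0) <= 2 eps^2
print(f"max var1/var0 under D(p0||p1) <= {budget}: {r01:.4f}")
print(f"max var1/var0 under D(p1||p0) <= {budget}: {r10:.4f}")
```

In this model $D(p_{1}\|p_{0})\geq D(p_{0}\|p_{1})$ for every $r\geq 1$, so the constraint on $D(p_{1}\|p_{0})$ permits a smaller increase of the received power at Willie, which is in line with the lower covert rates observed under that constraint.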

VI Conclusions

In this paper, we designed a covert beamformer, a ZF beamformer and a robust beamformer for covert communication networks, where the communication link with Carol is exploited as a cover. For the perfect WCSI scenario, we developed both the covert beamformer design and the low-complexity ZF beamformer design to maximize the covert rate. Furthermore, to quantify the impact of practical channel estimation errors, we considered the imperfect WCSI scenario and proposed a robust beamformer design that maximizes the covert rate while meeting the covertness requirements. In addition, to evaluate the performance of the robust beamformer design, we derived the optimal decision threshold of Willie, together with expressions for the false alarm and missed detection probabilities. Numerical results illustrated the validity of the proposed beamformer designs and provided useful insights into the impact of the system design parameters on the covert communication performance. In future work, we will investigate covert communications where Willie is equipped with multiple antennas.

Appendix A Proof of Lemma 1

Proof. We rewrite the function $g(r_{\rm b})$ in the following compact form:

\begin{align}
f(x)=\max_{\mathbf{W}\in\mathcal{W}}\quad & x \tag{56a}\\
{\rm s.t.}\quad & a(\mathbf{W})\geq x\,b(\mathbf{W}), \tag{56b}
\end{align}

where $\mathbf{W}:=[\mathbf{W}_{\rm b},\mathbf{W}_{{\rm c},1}]$, $a(\mathbf{W}):=\phi(\mathbf{W}_{\rm b})$, $b(\mathbf{W}):=\theta(\mathbf{W}_{{\rm c},1})$, and $x\geq 0$.

Next, we examine the concavity of the function $f(x)$ over $x\geq 0$ by definition. First, for $0\leq\theta\leq 1$ and $x_{1},x_{2}\geq 0$, we have

\begin{align}
& f\left(\theta x_{1}+(1-\theta)x_{2}\right) \tag{57a}\\
={} & \max_{\mathbf{W}\in\mathcal{W}}\ \theta x_{1}+(1-\theta)x_{2} \tag{57b}\\
& {\rm s.t.}\ a(\mathbf{W})\geq\left(\theta x_{1}+(1-\theta)x_{2}\right)b(\mathbf{W}). \tag{57c}
\end{align}

Then, the functions $\theta f(x_{1})$ and $(1-\theta)f(x_{2})$ are given by

\begin{align}
\theta f(x_{1})=\max_{\mathbf{W}\in\mathcal{W}}\quad & \theta x_{1} \tag{58a}\\
{\rm s.t.}\quad & a(\mathbf{W})\geq x_{1}b(\mathbf{W}), \tag{58b}\\
(1-\theta)f(x_{2})=\max_{\mathbf{W}\in\mathcal{W}}\quad & (1-\theta)x_{2} \tag{59a}\\
{\rm s.t.}\quad & a(\mathbf{W})\geq x_{2}b(\mathbf{W}). \tag{59b}
\end{align}

Let $c(\mathbf{W})\triangleq\frac{a(\mathbf{W})}{b(\mathbf{W})}$. We have

\begin{align}
\theta f(x_{1})+(1-\theta)f(x_{2})=\max_{\mathbf{W}\in\mathcal{W}}\quad & \theta x_{1}+(1-\theta)x_{2} \tag{60a}\\
{\rm s.t.}\quad & 0\leq x_{1}\leq c(\mathbf{W}), \tag{60b}\\
& 0\leq x_{2}\leq c(\mathbf{W}). \tag{60c}
\end{align}

Note that, for $b(\mathbf{W})>0$, constraint (57c) can be equivalently written as

\begin{align}
\theta x_{1}+(1-\theta)x_{2}\leq c(\mathbf{W}), \tag{61}
\end{align}

where $x_{1},x_{2}\geq 0$.

It can be easily checked that when $0\leq\theta\leq 1$, the feasible region of $x_{1}$ and $x_{2}$ defined by (57c) is larger than that defined by (60). Therefore, we have

\begin{align}
\theta f(x_{1})+(1-\theta)f(x_{2})\leq f\left(\theta x_{1}+(1-\theta)x_{2}\right), \tag{62}
\end{align}

implying that $f(x)$ is concave in $x$. In other words, the function in (26) is concave in $r_{\rm b}$. $\blacksquare$

References

  • [1] M. Bloch and J. Barros, Physical-Layer Security: From Information Theory to Security Engineering, U.K.: Cambridge Univ., 2011.
  • [2] M. Letafati, A. Kuhestani, K. K. Wong, and M. J. Piran, “A lightweight secure and resilient transmission scheme for the internet-of-things in the presence of a hostile jammer,” IEEE Internet Things J., 2020, DOI: 10.1109/JIOT.2020.3026475.
  • [3] E. L. Lehmann and J. P. Romano, Testing Statistical Hypotheses, Springer New York, 2005.
  • [4] S. Yan, Y. Cong, S. V. Hanly, and X. Zhou, “Gaussian signalling for covert communications,” IEEE Trans. Wireless Commun., vol. 18, no. 7, pp. 3542–3553, Jul. 2019.
  • [5] B. A. Bash, D. Goeckel, and D. Towsley, “Limits of reliable communication with low probability of detection on AWGN channels,” IEEE J. Sel. Areas Commun., vol. 31, no. 9, pp. 1921–1930, Sep. 2013.
  • [6] M. K. Simon, J. K. Omura, R. A. Scholtz, and B. K. Levitt, Spread Spectrum Communications Handbook, New York, NY, USA: McGraw-Hill, Apr. 1994.
  • [7] M. R. Bloch, “Covert communication over noisy channels: A resolvability perspective,” IEEE Trans. Inf. Theory, vol. 62, no. 5, pp. 2334–2354, May 2016.
  • [8] L. Wang, W. Wornell, and L. Zheng, “Fundamental limits of communication with low probability of detection,” IEEE Trans. Inf. Theory, vol. 62, no. 6, pp. 3493–3503, Jun. 2016.
  • [9] H. Wu, X. Liao, Y. Dang, Y. Shen, and X. Jiang, “Limits of covert communication on two-hop AWGN channels,” in Proc. Int. Conf. Netw. Netw. Appl, pp. 42–47, Oct. 2017.
  • [10] K. S. K. Arumugam and M. R. Bloch, “Covert communication over a $k$-user multiple access channel,” IEEE Trans. Inf. Theory, vol. 65, no. 11, pp. 7020–7044, Nov. 2019.
  • [11] V. Y. F. Tan and S. Lee, “Time-division is optimal for covert communication over some broadcast channels,” IEEE Trans. Inf. Forensics Security, vol. 14, no. 5, pp. 1377–1389, May 2019.
  • [12] S. Lee, R. J. Baxley, M. A. Weitnauer, and B. Walkenhorst, “Achieving undetectable communication,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 7, pp. 1195–1205, Oct. 2015.
  • [13] D. Goeckel, B. Bash, S. Guha, and D. Towsley, “Covert communications when the warden does not know the background noise power,” IEEE Commun. Lett., vol. 20, no. 2, pp. 236–239, Feb. 2016.
  • [14] B. He, S. Yan, X. Zhou, and V. K. N. Lau, “On covert communication with noise uncertainty,” IEEE Commun. Lett., vol. 21, no. 4, pp. 941–944, Apr. 2017.
  • [15] P. H. Che, M. Bakshi, C. Chan, and S. Jaggi, “Reliable deniable communication with channel uncertainty,” in Proc. IEEE Inf. Theory Workshop (ITW), pp. 30–34, Nov. 2014.
  • [16] K. Shahzad, X. Zhou, and S. Yan, “Covert communication in fading channels under channel uncertainty,” in Proc. IEEE 85th Veh. Technol. Conf. (VTC Spring), pp. 1–5, Jun. 2017.
  • [17] B. A. Bash, D. Goeckel, and D. Towsley, “LPD communication when the warden does not know when,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), pp. 606–610, Jun. 2014.
  • [18] K. S. K. Arumugam and M. R. Bloch, “Keyless asynchronous covert communication,” in Proc. Inf. Theory Workshop (ITW), pp. 191–195, Sep. 2016.
  • [19] B. A. Bash, D. Goeckel, and D. Towsley, “Covert communication gains from adversary’s ignorance of transmission time,” IEEE Trans. Wireless Commun., vol. 15, no. 12, pp. 8394–8405, Dec. 2016.
  • [20] T. V. Sobers, B. A. Bash, D. Goeckel, S. Guha, and D. Towsley, “Covert communication with the help of an uninformed jammer achieves positive rate,” in Proc. Asilomar Conf. Signals, Syst., Comput., pp. 625–629, Nov. 2015.
  • [21] T. V. Sobers, B. A. Bash, S. Guha, D. Towsley, and D. Goeckel, “Covert communication in the presence of an uninformed jammer,” IEEE Trans. Wireless Commun., vol. 16, no. 9, pp. 6193–6206, Sep. 2017.
  • [22] S. Yan, B. He, X. Zhou, Y. Cong, and A. L. Swindlehurst, “Delay-intolerant covert communications with either fixed or random transmit power,” IEEE Trans. Inf. Forensics Security, vol. 14, no. 1, pp. 129–140, Jan. 2019.
  • [23] K. Huang, H. Wang, D. Towsley, and H. V. Poor, “LPD communication: A sequential change-point detection perspective,” IEEE Trans. Commun., vol. 68, no. 4, pp. 2474–2490, Apr. 2020.
  • [24] J. Hu, S. Yan, X. Zhou, F. Shu, J. Li, and J. Wang, “Covert communication achieved by a greedy relay in wireless networks,” IEEE Trans. Wireless Commun., vol. 17, no. 7, pp. 4766–4779, Jul. 2018.
  • [25] M. Forouzesh, P. Azmi, A. Kuhestani, and P. L. Yeoh, “Covert communication and secure transmission over untrusted relaying networks in the presence of multiple wardens,” IEEE Trans. Commun., vol. 68, no. 6, pp. 3737–3749, Jun. 2020.
  • [26] J. Wang, W. Tang, Q. Zhu, X. Li, H. Rao, and S. Li, “Covert communication with the help of relay and channel uncertainty,” IEEE Wireless Commun. Lett., vol. 8, no. 1, pp. 317–320, Feb. 2019.
  • [27] R. Soltani, D. Goeckel, D. Towsley, B. A. Bash, and S. Guha, “Covert wireless communication with artificial noise generation,” IEEE Trans. Wireless Commun., vol. 17, no. 11, pp. 7252–7267, Nov. 2018.
  • [28] K. Shahzad, X. Zhou, S. Yan, J. Hu, F. Shu, and J. Li, “Achieving covert wireless communications using a full-duplex receiver,” IEEE Trans. Wireless Commun., vol. 17, no. 12, pp. 8517–8530, Dec. 2018.
  • [29] L. Tao, W. Yang, S. Yan, D. Wu, X. Guan, and D. Chen, “Covert communication in downlink NOMA systems with random transmit power,” IEEE Wireless Commun. Lett., vol. 9, no. 11, pp. 2000–2004, Nov. 2020.
  • [30] Y. Jiang, L. Wang, H. Zhao, and H. H. Chen, “Covert communications in D2D underlaying cellular networks with power domain NOMA,” IEEE Syst. J., vol. 14, no. 3, pp. 3717–3728, Sep. 2020.
  • [31] M. Forouzesh, P. Azmi, A. Kuhestani, and P. L. Yeoh, “Joint information theoretic secrecy and covert communication in the presence of an untrusted user and warden,” IEEE Internet Things J., 2020, DOI: 10.1109/JIOT.2020.3038682.
  • [32] M. Forouzesh, P. Azmi, N. Mokari, and D. Goeckel, “Covert communication using null space and 3D beamforming: Uncertainty of willie’s location information,” IEEE Trans. Veh. Technol., vol. 69, no. 8, pp. 8568–8576, Aug. 2020.
  • [33] T. M. Cover and J. A. Thomas, Elements of Information Theory, New York:Wiley, Jul. 2006.
  • [34] M. Zheng, A. Hamilton, and C. Ling, “Covert communications with a full-duplex receiver in non-coherent rayleigh fading,” IEEE Trans. Commun., 2020, DOI: 10.1109/TCOMM.2020.3041353.
  • [35] M. Forouzesh, P. Azmi, N. Mokari, and D. Goeckel, “Robust power allocation in covert communication: Imperfect CDI,” arXiv:1901.04914, Jan. 2019.
  • [36] A. Vakili, M. Sharif, and B. Hassibi, “The effect of channel estimation error on the throughput of broadcast channels,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), vol. 4, pp. 29–32, May 2006.
  • [37] B. He and X. Zhou, “Secure on-off transmission design with channel estimation errors,” IEEE Trans. Inf. Forensics Security, vol. 8, no. 12, pp. 1923–1936, Dec. 2013.
  • [38] Z. Luo, W. Ma, A. M. So, Y. Ye, and S. Zhang, “Semidefinite relaxation of quadratic optimization problems,” IEEE Signal Process. Mag., vol. 27, no. 3, pp. 20–34, May 2010.
  • [39] M. Grant and S. Boyd, “CVX: Matlab software for disciplined convex programming, version 2.1,” http://cvxr.com/cvx, Mar. 2014.
  • [40] J. F. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optimization Methods and Software, vol. 11, pp. 625–653, 1999.
  • [41] K. P. Jagannathan, S. Borst, P. Whiting, and E. Modiano, “Efficient scheduling of multi-user multi-antenna systems,” in 2006 4th International Symposium on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks, pp. 1–8, 2006.
  • [42] J. Kampeas, A. Cohen, and O. Gurewitz, “The ergodic capacity of the multiple access channel under distributed scheduling - order optimality of linear receivers,” IEEE Trans. Inf. Theory, vol. 64, no. 8, pp. 5898–5919, Aug. 2018.
  • [43] D. W. K. Ng, E. S. Lo, and R. Schober, “Robust beamforming for secure communication in systems with wireless information and power transfer,” IEEE Trans. Wireless Commun., vol. 13, no. 8, pp. 4599–4615, Aug. 2014.