
Capacity of Gaussian
Arbitrarily-Varying Fading Channels

Fatemeh Hosseinigoki and Oliver Kosut. This material is based upon work supported by the National Science Foundation under Grant No. CCF-1453718. This paper was presented in part at the Conference on Information Sciences and Systems (CISS 2019) [1]. School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85287. {fhossei1,okosut}@asu.edu
Abstract

This paper considers an arbitrarily-varying fading channel consisting of one transmitter, one receiver and an arbitrarily-varying adversary. The channel is assumed to have additive Gaussian noise and fast fading of the gain from the legitimate transmitter to the receiver. We study four variants of the problem depending on whether the transmitter and/or the adversary have access to the fading gains; we assume the receiver always knows the fading gains. In the two variants in which the adversary does not have access to the gains, we show that the capacity corresponds to the capacity of a standard point-to-point fading channel with increased noise variance. The capacities of the other two cases, in which the adversary knows the channel gains, are determined by the worst-case noise variance as a function of the channel gain subject to the jammer's power constraint; if the jammer has enough power, then it can imitate the legitimate user's channel, causing the capacity to drop to zero. We also show that the capacity is unchanged whether the channel gains are available causally or non-causally at the encoder and/or the adversary, except in the case where all parties know the channel gains. In that case, if the transmitter knows the gains non-causally while the adversary knows them only causally, then the legitimate users can keep a secret from the adversary, and we show that the capacity is always positive.

Index Terms: Gaussian arbitrarily-varying fading channel, Gaussian arbitrarily-varying channel, fast fading channel, capacity, active adversary

I Introduction

Wireless communication channels are important to study because they pose numerous challenges caused by noise and fading. The wireless environment also allows any uninvited signal to enter the channel. These signals can act as interference or, in the worst case, as a malicious jammer whose aim is to disrupt the communication between the legitimate users. In this work, we explore how these various signals interact with each other to restrict the overall capacity of the channel.

Goldsmith and Varaiya in [2] studied the point-to-point Gaussian channel with fast fading in which the fading coefficients are drawn i.i.d. from a distribution known to all parties. They determined the capacity of the channel when the fading coefficients are available at the receiver and possibly the transmitter. When the fading coefficients are not available at the transmitter, the capacity equals the expected value of the capacity of the corresponding Gaussian channel, with the received signal-to-noise ratio determined by the fading gains. If both the transmitter and receiver know the exact channel gains, the transmitter can maximize the capacity by allocating its signal power as a function of the channel gains.

Another line of work studies channels in the presence of a malicious adversary. The adversary can be an active attacker who sends its own signal into the channel in order to disrupt or restrict the communication between the legitimate users. If the adversary's signal is arbitrarily chosen or is drawn from an unknown distribution, then the channel is called an arbitrarily-varying channel (AVC). We focus on the variant of the problem wherein the adversary's knowledge is limited to the code of the legitimate user, but it has no access to the user's transmitted messages, either exact or noisy. Csiszár and Narayan established the capacity of the discrete AVC with input and jammer power constraints in [3, 4]. They also derived the capacity for a continuous version of the AVC with input and state (jammer) power constraints in [5].

Csiszár and Narayan in [5] characterized the capacity of the Gaussian arbitrarily-varying channel under the average probability of error and average power constraints for the input and the state. It is assumed that the adversary does not have any information about the legitimate signal except the code. It is shown that if the adversary has enough power to forge a message and send it to the channel, then the receiver gets confused and cannot distinguish between the true message and the malicious one. This occurrence is called symmetrizability, and it causes the capacity to drop to zero. In [5], it is shown that the adversary can symmetrize the channel if and only if it has greater power than the legitimate transmitter. However, if the jammer does not have enough power, then the capacity is equal to the capacity of a standard Gaussian channel with the noise variance increased by the power of the jammer.

The problem of AVC capacity has also been studied for discrete and continuous multi-user channels. The authors in [6] considered the discrete arbitrarily-varying multiple-access memoryless channel. We characterized lower and upper bounds for the capacity of the arbitrarily-varying Gaussian interference channel in [7]. List-decoding has also been investigated for the discrete and Gaussian AVCs in [8, 9] and [10], respectively. The list capacity is derived using list-decoding, in which the receiver outputs a list of messages rather than a unique message.

The three elements of Gaussian noise, fading, and adversary have been previously combined to study problems with a (passive) eavesdropping adversary, rather than an (active) transmitting adversary. A secrecy capacity problem with slow fading, in which the fading gains are constant over each block, is considered in [11]. The authors determined the secrecy capacity of the channel where both the transmitter and receiver know the channel state information (CSI) of the main path, but do not have any information about the eavesdropper channel. Furthermore, the capacity is generalized to multiple eavesdroppers in [12]. The problem is also studied in [13] for the specific case of a fast Rayleigh fading eavesdropper channel and a standard Gaussian main channel where the CSI is known only to the eavesdropper.

In this paper, we consider a Gaussian AVC with fast fading on the main path, as illustrated in Fig. 1; we refer to this channel as the Gaussian arbitrarily-varying fading channel (GAVFC). We characterize the capacity of the GAVFC under the average probability of error criterion. As in the Gaussian fading channel, we assume that all parties, including the adversary, know the fading gain distribution, but they may or may not know the realization of the gain sequence. Note that the “arbitrarily-varying” aspect of the channel is the adversary's signal, not the channel gains, which we assume to be random from a known distribution. The receiver always needs the exact fading gains to decode the message, while the adversary and the transmitter may or may not know the exact values of the fading gains. Therefore, we derive the capacity of the GAVFC for four cases, depending on the availability of the channel gains at the transmitter and/or the adversary, as follows:

  • Neither the transmitter nor the adversary knows the channel gains.

  • Only the transmitter knows the channel gains.

  • Only the adversary knows the channel gains.

  • Both the transmitter and the adversary know the channel gains.

If the jammer does not know the channel gains, we show that the capacity is equal to the capacity of the corresponding fading channel with the noise variance increased by the power of the jammer. If the jammer knows the fading gains, then it can choose its signal as a function of the gains, and under some power constraints it can symmetrize the channel and make the capacity zero. Note that if the channel gains are not available at the adversary, it does not have the channel information required to symmetrize the channel. Moreover, all the results still hold whether the adversary and the encoder have the channel gains causally or non-causally, except in one situation: if the adversary knows the channel gains causally while the encoder knows them non-causally, then the adversary cannot symmetrize the channel, since the encoder possesses some extra information that the adversary does not.

The rest of the paper is organized as follows. We describe the GAVFC model and define the various capacities in Sec. II. We state our main theorem, giving the capacity of the GAVFC in all cases, in Sec. III. Before giving the proof of our main theorem, we present some auxiliary lemmas and tools in Sec. IV. In Sections V, VI, VII, and VIII, we provide the converse and achievability proofs for each of the main results in Theorem 1. Finally, in the Appendix, we provide brief proofs of the auxiliary results.

Notation: We use bold letters to indicate $n$-length vectors. We employ $\langle\cdot,\cdot\rangle$ and $\circ$ to denote the inner product and the Hadamard product (element-wise multiplication), respectively. We indicate the positive-part function, the 2-norm and the expectation by $|\cdot|^{+}$, $\|\cdot\|$ and $\mathbb{E}[\cdot]$, respectively. Also, for an integer $N$, $[N]$ stands for the set $\{1,2,3,\ldots,N\}$. The notation $\mathbf{I}_{n}$ represents the identity matrix of size $n$. The $\log(\cdot)$ and $\exp(\cdot)$ functions have base 2. Moreover, $C(x)=\frac{1}{2}\log(1+x)$, and $X\sim\mathcal{N}(\mu,\sigma^{2})$ denotes a Gaussian random variable $X$ with mean $\mu$ and variance $\sigma^{2}$.

II Problem Statement

Figure 1: Gaussian Arbitrarily-Varying Fading Channel.

The Gaussian arbitrarily-varying fading channel (GAVFC) in Fig. 1 is a point-to-point fading channel with additive Gaussian noise and an intelligent adversary who does not have any information about the transmitted signal except the code. The received signal is given by

\mathbf{Y}=\mathbf{G}\circ\mathbf{x}+\mathbf{s}+\mathbf{V} \qquad (1)

where $\mathbf{G}$ is a random sequence of independent and identically distributed (i.i.d.) fast fading channel gains from the legitimate transmitter to the receiver, drawn from a continuous distribution $f_G(g)$ assumed to have positive and finite variance, $\mathbf{x}$ is the $n$-length deterministic vector representing the user's signal, $\mathbf{s}$ is the adversary's signal chosen arbitrarily, and $\mathbf{V}$ is a random $n$-length noise vector distributed as a sequence of i.i.d. zero-mean Gaussian random variables with variance $\sigma^2$, independent of $\mathbf{x}$, $\mathbf{G}$ and $\mathbf{s}$. Note that the receiver always knows the exact fading coefficients $\mathbf{g}$, while the transmitter and the adversary may not know the gains, know them causally, or know them non-causally.
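To make the model concrete, the following is a minimal numerical sketch of one block of the channel (1). The Rayleigh gain distribution, the specific parameter values, and the choice of an i.i.d. Gaussian jamming signal are illustrative assumptions, not part of the model; the adversary may in fact choose $\mathbf{s}$ arbitrarily subject to its power constraint.

```python
import numpy as np

# Minimal sketch of one block of the GAVFC, Y = G o x + s + V, from eq. (1).
# Rayleigh fading and a Gaussian jammer are illustrative choices only.
rng = np.random.default_rng(0)
n, P, Lam, sigma2 = 10_000, 1.0, 1.0, 0.25   # block length and powers (assumed values)

x = rng.normal(0.0, np.sqrt(P), n)           # legitimate signal with power ~ P
g = rng.rayleigh(scale=1.0, size=n)          # i.i.d. fast-fading gains G_i
s = rng.normal(0.0, np.sqrt(Lam), n)         # one admissible jamming signal, power ~ Lambda
v = rng.normal(0.0, np.sqrt(sigma2), n)      # additive Gaussian noise

y = g * x + s + v                            # received block, eq. (1)
print(np.mean(x**2) <= P + 0.05, np.mean(s**2) <= Lam + 0.05)  # empirical power check
```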

Define an (N,n)\left(N,n\right) code for the GAVFC by a message set, an encoding function and a decoding function as follows:

  • Message set =[N]\mathcal{M}=[N],

  • Encoding function (one of the following)

    • (No knowledge) 𝐱(m):n\mathbf{x}(m):\mathcal{M}\to\mathbb{R}^{n} where 𝐱=(x1,,xn)\mathbf{x}=(x_{1},\ldots,x_{n}),

    • (Causal) xi(m,𝐠i):×ix_{i}(m,\mathbf{g}^{i})\!\!:\!\mathcal{M}\times\mathbb{R}^{i}\to\mathbb{R} where 𝐠i=(g1,,gi)\mathbf{g}^{i}\!=\!(g_{1},\ldots,g_{i}) and 𝐱=(x1,,xn)\mathbf{x}\!=\!(x_{1},\ldots,x_{n}) for i[n]i\in[n],

    • (Non-causal) xi(m,𝐠):×nx_{i}(m,\mathbf{g})\!\!:\!\mathcal{M}\times\mathbb{R}^{n}\!\to\!\mathbb{R} where 𝐠=(g1,,gn)\mathbf{g}\!=\!(g_{1},\ldots,g_{n}) and 𝐱=(x1,,xn)\mathbf{x}\!=\!(x_{1},\ldots,x_{n}) for i[n]i\!~{}\in~{}[n],

  • Decoding function Θ(𝐲,𝐠):n×n\Theta(\mathbf{y},\mathbf{g}):\mathbb{R}^{n}\times\mathbb{R}^{n}\to\mathcal{M},

where the rate of the code is R=1nlog(N)R=\frac{1}{n}\log(N). The message mm is drawn uniformly from the set \mathcal{M}. If the encoder does not know the channel gains, it maps the message to 𝐱(m)n\mathbf{x}(m)\in\mathbb{R}^{n}. If the encoder knows the channel gains causally, then it maps the message to xi(m,𝐠i)x_{i}(m,\mathbf{g}^{i})\in\mathbb{R}, and if the encoder knows the channel gains non-causally, then it maps the message to xi(m,𝐠)x_{i}(m,\mathbf{g})\in\mathbb{R} where 𝐱=(x1,,xn)\mathbf{x}=(x_{1},\ldots,x_{n}). Given channel gains 𝐠\mathbf{g} at the receiver, the signal 𝐲\mathbf{y} is decoded by function Θ(𝐲,𝐠)\Theta(\mathbf{y},\mathbf{g}) to the message m^\hat{m}. Moreover, we assume that if the channel gains are available at the transmitter then the transmitter’s signal satisfies the expected power constraints 𝔼[𝐗(m,𝐆)2]nP\mathbb{E}\left[\|\mathbf{X}(m,\mathbf{G})\|^{2}\right]\leq nP for any message mm\in\mathcal{M}. Otherwise, the power constraint is 𝐱(m)2nP\|\mathbf{x}(m)\|^{2}\leq nP. The same definition applies to the adversary’s signal power constraint, i.e. if the adversary knows the channel gains, the constraint is 𝔼[𝐒(𝐆)2]nΛ\mathbb{E}\left[\|\mathbf{S}(\mathbf{G})\|^{2}\right]\leq n\Lambda; otherwise, it is 𝐬2nΛ\|\mathbf{s}\|^{2}\leq n\Lambda. The three parameters PP, Λ\Lambda, and σ2\sigma^{2} as well as the distribution of fading gains fG(g)f_{G}(g) are known to all parties.

The probability of error e(𝐬,m)e(\mathbf{s},m) for the message mm\in\mathcal{M} in the presence of adversary signal 𝐬n\mathbf{s}\in\mathbb{R}^{n} is now given by the probability that m^m\hat{m}\neq m. Thus, the average probability of error for a specific 𝐬n\mathbf{s}\in\mathbb{R}^{n} is

e¯(𝐬)=1Nm=1Ne(𝐬,m).\displaystyle\bar{e}(\mathbf{s})=\frac{1}{N}\sum_{m=1}^{N}e(\mathbf{s},m). (2)

If the adversary knows the channel gains non-causally, then its signal is given by functions $s_i(\mathbf{g})$ for $i\in[n]$. Alternatively, if the adversary knows the gains causally, then its action is given by functions $s_i(\mathbf{g}^i)$ for $i\in[n]$, where $\mathbf{s}=(s_1,\ldots,s_n)$ and $\mathbf{g}^i=(g_1,\ldots,g_i)$. Therefore, the average probability of error for this choice of mapping $\mathbf{s}(\cdot)$ is

e¯(𝐬())=1Nm=1N𝔼e(𝐬(𝐆),m).\displaystyle\bar{e}(\mathbf{s}(\cdot))=\frac{1}{N}\sum_{m=1}^{N}\mathbb{E}e(\mathbf{s}(\mathbf{G}),m). (3)

Finally, the overall probability of error $P_e^{(n)}$ is the maximum over all possible choices of the jammer's signal $\mathbf{s}$ satisfying either $\mathbb{E}[\|\mathbf{S}\|^2]\leq n\Lambda$ or $\|\mathbf{s}\|^2\leq n\Lambda$. Rate $R$ is achievable if there exists a sequence of $(2^{nR},n)$ codes with $\lim_{n\to\infty}P_e^{(n)}=0$. The capacity is the supremum of all achievable rates. We denote the capacity of the GAVFC by $C_{\alpha,\beta}$, where $\alpha$ denotes the transmitter's knowledge and $\beta$ denotes the adversary's knowledge; $\alpha$ and $\beta$ can be U, C, or N depending on whether the corresponding party does not know the gains (U = unknown), knows the gains causally (C), or knows the gains non-causally (N). For example, $C_{\text{U,N}}$ is the capacity when the transmitter does not know the gains and the adversary knows them non-causally.

III Main Results

We present our results for the capacity of the GAVFC, for all cases in which the fading channel gains $\mathbf{G}$ are available (causally or non-causally) or unavailable at the encoder and/or the adversary (the decoder always knows the gains), in the following theorem.

Theorem 1

The capacities of the GAVFC are given by

CU,U=𝔼G[C(G2PΛ+σ2)],\displaystyle C_{\text{U,U}}=\mathbb{E}_{G}\left[C\left(\frac{G^{2}P}{\Lambda+\sigma^{2}}\right)\right], (4)
CN,U=CC,U=maxφ(g):𝔼φ(G)P𝔼G[C(G2φ(G)Λ+σ2)],\displaystyle C_{\text{N,U}}=C_{\text{C,U}}=\underset{\begin{subarray}{c}\varphi(g):\mathbb{E}\varphi(G)\leq P\end{subarray}}{\max}\mathbb{E}_{G}\left[C\left(\frac{G^{2}\varphi(G)}{\Lambda+\sigma^{2}}\right)\right], (5)
CU,N=CU,C={minψ(g):𝔼ψ(G)Λ𝔼G[C(G2Pψ(G)+σ2)],𝔼G2P>Λ0,𝔼G2PΛ\displaystyle C_{\text{U,N}}=C_{\text{U,C}}=\begin{cases}\underset{\psi(g):\mathbb{E}\psi(G)\leq\Lambda}{\min}\mathbb{E}_{G}\left[C\left(\frac{G^{2}P}{\psi(G)+\sigma^{2}}\right)\right],\!&\mathbb{E}G^{2}P\!>\!\Lambda\\ 0,\!&\mathbb{E}G^{2}P\!\leq\!\Lambda\end{cases} (6)
CN,N=CC,C=CC,N=\displaystyle C_{\text{N,N}}=\!C_{\text{C,C}}\!=\!C_{\text{C,N}}=
{maxφ(g):𝔼φ(G)P,𝔼G2φ(G)Λminψ(g):𝔼ψ(G)Λ𝔼G[C(G2φ(G)ψ(G)+σ2)],ifmaxφ(g):𝔼φ(G)P𝔼G2φ(G)>Λ0,ifmaxφ(g):𝔼φ(G)P𝔼G2φ(G)Λ\displaystyle\begin{cases}\underset{\begin{subarray}{c}\varphi(g):\mathbb{E}\varphi(G)\leq P,\\ \mathbb{E}G^{2}\varphi(G)\geq\Lambda\end{subarray}}{\max}\ \ \underset{\psi(g):\mathbb{E}\psi(G)\leq\Lambda}{\min}\mathbb{E}_{G}\left[C\!\left(\frac{G^{2}\varphi(G)}{\psi(G)+\sigma^{2}}\right)\!\right],&\text{if}\underset{\varphi(g):\mathbb{E}\varphi(G)\leq P}{\max}\mathbb{E}G^{2}\varphi(G)>\Lambda\\ 0,&\text{if}\underset{\varphi(g):\mathbb{E}\varphi(G)\leq P}{\max}\mathbb{E}G^{2}\varphi(G)\leq\Lambda\end{cases} (7)
CN,C=maxφ(g):𝔼φ(G)Pminψ(g):𝔼ψ(G)Λ𝔼G[C(G2φ(G)ψ(G)+σ2)].\displaystyle C_{\text{N,C}}=\underset{\begin{subarray}{c}\varphi(g):\mathbb{E}\varphi(G)\leq P\end{subarray}}{\max}\ \ \underset{\psi(g):\mathbb{E}\psi(G)\leq\Lambda}{\min}\mathbb{E}_{G}\left[C\!\left(\frac{G^{2}\varphi(G)}{\psi(G)+\sigma^{2}}\right)\!\right]. (8)

Note that when the encoder knows the gains, as in (5), (7), and (8), the capacity expression includes a maximization of the input power as a function $\varphi(\cdot)$ of the gain, similar to the result in [2]. Similarly, when the jammer knows the gains, as in (6)–(8), the capacity expression includes a minimization that represents the jammer's choice of noise power as a function $\psi(\cdot)$ of the gain. Moreover, when the jammer knows the gains, with enough power it can symmetrize the channel by mimicking the legitimate signal, thus reducing the capacity to zero. However, in (8) we have assumed that the adversary knows the gains causally while the encoder and the decoder know the gains non-causally. Thus, the encoder and decoder effectively share a secret (the channel gains at the end of the block) unknown to the adversary, so the adversary cannot symmetrize the channel. It is also worth mentioning that for the other cases (except (8)) our proof works exactly the same whether the transmitter and/or the adversary know the gain sequence causally, non-causally, or even memorylessly (i.e., at time $i$, only the gain at time $i$ is known).

While we have stated the theorem by writing the capacities in terms of optimizations over the functions $\varphi$ and/or $\psi$, these expressions can be computed by solving for the optimizing functions. In particular, the optimal $\varphi^*(g)$ in (5) is $\left|\lambda-\frac{\Lambda+\sigma^2}{g^2}\right|^+$ where $\lambda$ is chosen so that $\mathbb{E}[\varphi^*(G)]=P$, and the optimal $\psi^*(\cdot)$ in (6) is the following function of the gain $g$:

ψ(g)=|2σ2g2P+g4P2+2g2Pλ2|+,\displaystyle\psi^{*}(g)=\left|\frac{-2\sigma^{2}-g^{2}P+\sqrt{g^{4}P^{2}+\frac{2g^{2}P}{\lambda}}}{2}\right|^{+}, (9)

and λ\lambda is obtained by solving 𝔼ψ(G)=Λ\mathbb{E}\psi^{*}(G)=\Lambda. Moreover, the optimum values of φ(g)\varphi^{*}(g) and ψ(g)\psi^{*}(g) in (7) are

φ(g)\displaystyle\varphi^{*}(g) =|12(λ1g2λ3)(1+λ1g2λ3g2λ2)|+\displaystyle=\left|\frac{1}{2(\lambda_{1}-g^{2}\lambda_{3})\left(1+\frac{\lambda_{1}-g^{2}\lambda_{3}}{g^{2}\lambda_{2}}\right)}\right|^{+} (10)
ψ(g)\displaystyle\psi^{*}(g) =|g22g2λ2+2(λ1g2λ3)σ2|+,\displaystyle=\left|\frac{g^{2}}{2g^{2}\lambda_{2}+2(\lambda_{1}-g^{2}\lambda_{3})}-\sigma^{2}\right|^{+}, (11)

where λ1\lambda_{1}, λ2\lambda_{2} and λ3\lambda_{3} are found by solving 𝔼φ(G)=P\mathbb{E}\varphi(G)=P, 𝔼ψ(G)=Λ\mathbb{E}\psi(G)=\Lambda and 𝔼G2φ(G)=Λ\mathbb{E}G^{2}\varphi(G)=\Lambda, respectively. Finally, the optimum values of φ(g)\varphi^{*}(g) and ψ(g)\psi^{*}(g) in (8) are

φ(g)\displaystyle\varphi^{*}(g) =|12λ1(1+λ1g2λ2)|+\displaystyle=\left|\frac{1}{2\lambda_{1}\left(1+\frac{\lambda_{1}}{g^{2}\lambda_{2}}\right)}\right|^{+} (12)
ψ(g)\displaystyle\psi^{*}(g) =|g22g2λ2+2λ1σ2|+,\displaystyle=\left|\frac{g^{2}}{2g^{2}\lambda_{2}+2\lambda_{1}}-\sigma^{2}\right|^{+}, (13)

where λ1\lambda_{1} and λ2\lambda_{2} can be obtained by solving 𝔼φ(G)=P\mathbb{E}\varphi(G)=P and 𝔼ψ(G)=Λ\mathbb{E}\psi(G)=\Lambda, respectively.
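As an illustration of how these optimizations can be carried out numerically, the following sketch solves the water-filling form of $\varphi^*(g)$ in (5) for Rayleigh fading by bisection on $\lambda$, and then evaluates the resulting capacity by a Monte Carlo expectation. The Rayleigh distribution, the sample size, and the bisection bracket are assumptions made for this example only.

```python
import numpy as np

# Sketch: solve E[phi*(G)] = P for the water level lambda in
# phi*(g) = |lambda - (Lambda + sigma^2)/g^2|^+  (optimizer of (5)),
# then estimate C_{N,U} by Monte Carlo.  Rayleigh(1) fading is assumed.
rng = np.random.default_rng(1)
P, Lam, sigma2 = 1.0, 1.0, 0.25
G = rng.rayleigh(scale=1.0, size=200_000)          # samples of the fading gain

def phi(lam, g):
    return np.maximum(lam - (Lam + sigma2) / g**2, 0.0)

lo, hi = 0.0, 100.0                                 # bracket for the water level
for _ in range(100):                                # bisection: E[phi] is increasing in lambda
    lam = 0.5 * (lo + hi)
    lo, hi = (lam, hi) if np.mean(phi(lam, G)) < P else (lo, lam)

C_NU = np.mean(0.5 * np.log2(1.0 + G**2 * phi(lam, G) / (Lam + sigma2)))
print(lam, C_NU)                                    # water level and capacity estimate (bits/use)
```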

In Fig. 2, the capacity of the GAVFC with Rayleigh fading is shown for $P=1$, $\sigma^2=0.25$, $0<\Lambda<5$ and Rayleigh scale parameter $\sigma_R=1$, for each combination of channel-gain knowledge at the encoder and/or the adversary. For comparison, $C$ is the capacity of the Gaussian arbitrarily-varying channel with no fading. Note that as the transmitter's knowledge increases, the capacity increases, whereas as the adversary's knowledge increases, the capacity decreases. In particular, when the adversary knows the channel gains, its knowledge may decrease the capacity, and if its power exceeds $\mathbb{E}G^2 P = 2$, the capacity is zero by symmetrizability.

The proofs of the different capacity variants all follow a similar pattern, so we have attempted to reduce the redundancy in our presentation. We provide essentially one converse proof for each capacity expression in the theorem:

  1. Sec. V-A: Converse for $C_{U,U}$.

  2. Sec. VI-A: Converse for $C_{N,U}$. This also bounds $C_{C,U}$, which is trivially upper bounded by $C_{N,U}$.

  3. Sec. VII-A: Converse for $C_{U,C}$. This also bounds $C_{U,N}$, which is trivially upper bounded by $C_{U,C}$.

  4. Sec. VIII-A: Converse for $C_{C,C}$. This also bounds $C_{C,N}$, which is trivially upper bounded by $C_{C,C}$. Essentially the same proof also works for $C_{N,N}$. The case $C_{N,C}$ requires a different proof, also covered in this section.

We provide essentially three achievability proofs:

  1. Sec. VI-B: Achievability for $C_{C,U}$. This also bounds $C_{N,U}$, which is trivially lower bounded by $C_{C,U}$. It also provides a bound for $C_{U,U}$, because the same proof works assuming $\varphi(g)=P$ (i.e., the encoder's power is independent of the channel gain).

  2. Sec. VIII-B: Achievability for $C_{C,N}$. This also bounds $C_{N,N}$ and $C_{C,C}$, which are trivially lower bounded by $C_{C,N}$. It also provides a bound for $C_{U,N}$ and $C_{U,C}$, because the same proof works again assuming $\varphi(g)=P$.

  3. Sec. VIII-C: Achievability for $C_{N,C}$. This case is different from all others in that there is effectively a shared secret between the encoder and decoder.

Figure 2: GAVFC capacities for $P=1$, $\sigma^2=0.25$, $0<\Lambda<5$ with Rayleigh fading. $C$ is the capacity of the standard Gaussian channel without fading.

IV Auxiliary Results and Tools

Before proceeding to the proofs, we first define the typical set for continuous random variables X1,,XkX_{1},\ldots,X_{k} with probability density function fX1,,Xk(x1,,xk)f_{X_{1},\ldots,X_{k}}(x_{1},\ldots,x_{k}) as follows:

𝒯ϵ(n)(X1,,Xk)={(x1,,xk):|1nlogfXA(xA)h(XA)|ϵ for all A[k]}\mathcal{T}_{\epsilon}^{(n)}(X_{1},\ldots,X_{k})=\bigg{\{}(x_{1},\ldots,x_{k})\!:\left|-\frac{1}{n}\log f_{X_{A}}(x_{A})-h(X_{A})\right|\leq\epsilon\text{ for all }A\!\subset\![k]\!\bigg{\}} (14)

where $h(X_A)$ is the differential entropy of $(X_i : i\in A)$. Next, we define the typical set for continuous random variables $X_1,\ldots,X_k$ with probability density function $f_{X_1,\ldots,X_k}(x_1,\ldots,x_k)$ and a discrete random variable $\tilde{G}$ with probability mass function $P_{\tilde{G}}(\tilde{g})$ as follows:

𝒯ϵ(n)(X1,,Xk,G~)={(x1,,xk,g^):|1nlogPG~(g~)H(G~)|ϵ,|1nlogfXA(xA)h(XA)|ϵ,|1nlogfXA|G~(xA|g~)h(XA|G~)|ϵ, for all A[k]},\mathcal{T}_{\epsilon}^{(n)}(X_{1},\ldots,X_{k},\tilde{G})=\bigg{\{}(x_{1},\ldots,x_{k},\hat{g})\!:\left|-\frac{1}{n}\log P_{\tilde{G}}(\tilde{g})-H(\tilde{G})\right|\leq\epsilon,\\ \left|-\frac{1}{n}\log f_{X_{A}}(x_{A})-h(X_{A})\right|\leq\epsilon,\left|-\frac{1}{n}\log f_{X_{A}|\tilde{G}}(x_{A}|\tilde{g})-h(X_{A}|\tilde{G})\right|\leq\epsilon,\text{ for all }A\!\subset\![k]\!\bigg{\}}, (15)

where $H(\tilde{G})$ and $h(X_A|\tilde{G})$ denote the entropy of $\tilde{G}$ and the conditional differential entropy of $X_A$ given $\tilde{G}$, respectively.
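As a small worked instance of definition (14) (added here for illustration), take $k=1$ and $X\sim\mathcal{N}(0,1)$, so that $h(X)=\frac{1}{2}\log(2\pi e)$. Then
$$-\frac{1}{n}\log f_X(\mathbf{x})=\frac{1}{2}\log(2\pi)+\frac{\|\mathbf{x}\|^2}{2n}\log e,$$
so the condition $\left|-\frac{1}{n}\log f_X(\mathbf{x})-h(X)\right|\leq\epsilon$ is equivalent to $\left|\frac{\|\mathbf{x}\|^2}{n}-1\right|\leq\frac{2\epsilon}{\log e}$; that is, a sequence is typical exactly when its empirical power is close to the variance of $X$.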

Throughout the achievability proofs, we will utilize several lemmas, including the joint typicality lemma and conditional typicality lemma for Gaussian random variables given in [10]. In addition, we will need the following two lemmas; they show that with high probability a Gaussian codebook satisfies several desirable properties. The proofs are given in the Appendix.

Lemma 2

Fix ϵ>0\epsilon^{\prime}>0. There exists γ>0\gamma>0 such that the following holds. Let 𝐗(m)\mathbf{X}(m) for m[N]m\in[N], N=2nRN=2^{nR} be a zero mean Gaussian codebook with variance 1γ1-\gamma. Let GG be drawn from probability density function fG(g)f_{G}(g). With probability approaching 1 as nn\to\infty, for any 𝐬,𝐠\mathbf{s},\mathbf{g} where 𝐬2nΛ\|\mathbf{s}\|^{2}\leq n\Lambda, there exists a function δ(ϵ)>0\delta(\epsilon^{\prime})>0 such that

1N|{m:(𝐱(m),𝐬,𝐠)X independent of (S,G):EX2=1,ES2Λ𝒯ϵ(n)(X,S,G)}|exp(nδ(ϵ)),\displaystyle\frac{1}{N}\left|\left\{m:(\mathbf{x}(m),\mathbf{s},\mathbf{g})\notin\bigcup_{\begin{subarray}{c}X\text{ independent of }(S,G):\\ EX^{2}=1,ES^{2}\leq\Lambda\end{subarray}}\mathcal{T}^{(n)}_{\epsilon^{\prime}}(X,S,G)\right\}\right|\leq\exp(-n\delta(\epsilon^{\prime})), (16)

where the union is over zero mean conditionally Gaussian random vectors (X,S)(X,S) given GG.

Lemma 3

Fix ϵ>0\epsilon>0. There exists γ>0\gamma>0 such that the following holds. Let 𝐗(m)\mathbf{X}(m) for m[N]m\in[N], N=2nRN=2^{nR} be a zero mean Gaussian codebook with variance 1γ1-\gamma. Let GG be drawn from probability density function fG(g)f_{G}(g). With probability approaching 1 as nn\to\infty, for any

  • zero-mean conditionally Gaussian random vector (X,X,S)(X,X^{\prime},S) given GG where 𝔼X2=𝔼X2=1\mathbb{E}X^{2}=\mathbb{E}X^{\prime 2}=1 and 𝔼S2Λ\mathbb{E}S^{2}\leq\Lambda,

  • 𝐱,𝐬,𝐠\mathbf{x},\mathbf{s},\mathbf{g} where 𝐬2nΛ\|\mathbf{s}\|^{2}\leq n\Lambda,

there exists a function δ(ϵ)>0\delta(\epsilon)>0 such that

{|{(𝐱(m),𝐬,𝐆)𝒯ϵ(n)(X,S,G) for some m}|}2exp{nδ(ϵ)/2},\displaystyle\mathbb{P}\left\{\big{|}\big{\{}(\mathbf{x}(m^{\prime}),\mathbf{s},\mathbf{G})\!\in\!\mathcal{T}_{\epsilon}^{(n)}\!(X^{\prime},S,G)\text{ for some }m^{\prime}\big{\}}\big{|}\right\}\leq 2\exp\{\!-n\delta(\epsilon)/2\},
if I(G;XS)|RI(X;S)|++δ(ϵ),\displaystyle\hskip 200.0003pt\text{if }I(G;X^{\prime}S)\!\geq\!|R\!-\!I(X^{\prime};S)|^{+}\!\!+\!\delta(\epsilon), (17)
|{m:(𝐱(m),𝐬)𝒯ϵ(n)(X,S)}|exp{n[|RI(X;S)|++δ(ϵ)]},\displaystyle\big{|}\big{\{}m^{\prime}:(\mathbf{x}(m^{\prime}),\mathbf{s})\in\mathcal{T}_{\epsilon}^{(n)}(X^{\prime},S)\big{\}}\big{|}\leq\exp\big{\{}n\big{[}|R-I(X^{\prime};S)|^{+}+\delta(\epsilon)\big{]}\big{\}}, (18)
|{m:(𝐱,𝐱(m),𝐬,𝐠)𝒯ϵ(n)(X,X,S,G)}|exp{n[|RI(X;XSG)|++δ(ϵ)]},\displaystyle\big{|}\big{\{}m^{\prime}:(\mathbf{x},\mathbf{x}(m^{\prime}),\mathbf{s},\mathbf{g})\in\mathcal{T}_{\epsilon}^{(n)}(X,X^{\prime},S,G)\big{\}}\big{|}\leq\exp\big{\{}n\big{[}|R-I(X^{\prime};XSG)|^{+}+\delta(\epsilon)\big{]}\big{\}}, (19)
1N|{m:(𝐱(m),𝐱(m),𝐬,𝐠)𝒯ϵ(n) for some mm}|2exp{nδ(ϵ)/2},\displaystyle\frac{1}{N}\big{|}\big{\{}\!m\!:\!(\mathbf{x}(m),\mathbf{x}(m^{\prime}),\mathbf{s},\mathbf{g})\!\in\!\mathcal{T}_{\epsilon}^{(n)}\text{ for some }m^{\prime}\!\neq\!m\!\big{\}}\big{|}\leq\!2\exp\{\!-n\delta(\epsilon)/2\},
if I(X;XSG)|RI(X;SG)|++δ(ϵ).\displaystyle\hskip 190.00029pt\text{if }I(X;X^{\prime}SG)\!\geq\!|R\!-\!I(X^{\prime};SG)|^{+}\!\!+\!\delta(\epsilon). (20)

V Capacity Proof with Gains Available at Decoder

V-A Converse Proof

We initially assume that for any arbitrary adversary strategy there is a sequence of (2nR,n)(2^{nR},n) codes with vanishing probability of error. The adversary can generate a Gaussian sequence with variance Λγ\Lambda-\gamma for any γ>0\gamma>0; if this sequence has power less than Λ\Lambda, it is transmitted, otherwise, the adversary sends the all-zero sequence. Note that the power of this Gaussian sequence exceeds Λ\Lambda only with small probability by the law of large numbers. With this choice of adversary, the channel corresponds to a standard Gaussian fading channel with the noise variance Λ+σ2γ\Lambda+\sigma^{2}-\gamma where the channel gains are available only at the decoder. Therefore, using capacity of a non-adversarial Gaussian fading channel [14] for arbitrarily small γ\gamma, we may upper bound the capacity by

C𝔼G[C(G2PΛ+σ2)].C\leq\mathbb{E}_{G}\left[C\left(\frac{G^{2}P}{\Lambda+\sigma^{2}}\right)\right]. (21)

V-B Achievability Proof

The achievability proof for this case is a special case of the proof for $C_{N,U}$ in Sec. VI-B, where both encoder and decoder know the channel gains. Since here the encoder does not know the channel gains, there is no power-allocation function $\varphi(g)$ at the encoder. In other words, the achievability proof for this case is identical to that in Sec. VI-B with $\varphi(g)=P$.

VI Capacity Proof with Gains Available at Encoder and Decoder

VI-A Converse Proof

As in the previous case, the adversary can simply send Gaussian noise with variance $\Lambda-\gamma$. By the law of large numbers, the resulting channel is, with high probability, equivalent to a standard Gaussian fading channel with noise variance $\Lambda+\sigma^2-\gamma$ and knowledge of the gains at both encoder and decoder. Thus, since $\gamma$ can be chosen arbitrarily small, from the capacity of a non-adversarial Gaussian fading channel [2], we have

Cmaxφ(g):𝔼φ(G)P𝔼G[C(G2φ(G)Λ+σ2)].C\leq\underset{\varphi(g):\mathbb{E}\varphi(G)\leq P}{\max}\mathbb{E}_{G}\left[C\left(\frac{G^{2}\varphi(G)}{\Lambda+\sigma^{2}}\right)\right]. (22)

VI-B Achievability Proof

For simplicity we assume $P=1$. Consider any function $\varphi(G)$ that satisfies $\mathbb{E}\varphi(G)\leq 1$ and $\operatorname{Var}(G\sqrt{\varphi(G)})>0$. We further assume that $G^2\varphi(G)$ has positive variance. Note that this is only a concern if the optimum is $\varphi^*(G)=\frac{c}{G^2}$; in this case, we can instead take $\varphi(G)=\frac{c}{(G-d)^2}$ where $c,d$ are two positive constants and $d$ can be chosen arbitrarily small. Let

R<𝔼G[C(G2φ(G)Λ+σ2)].\displaystyle R<\mathbb{E}_{G}\left[C\left(\frac{G^{2}\varphi(G)}{\Lambda+\sigma^{2}}\right)\right]. (23)

We now propose a (2nR,n)(2^{nR},n) code sequence, and prove that using this code the probability of error tends to zero as nn\to\infty.

Codebook generation: Fix ϵ>ϵ>γ>0\epsilon>\epsilon^{\prime}>\gamma>0. We generate 2nR2^{nR} i.i.d zero mean Gaussian sequences 𝐗(m)\mathbf{X}(m) with variance (1γ)(1-\gamma) for each m[2nR]m\in[2^{nR}]. By Lemma 2 and Lemma 3, we assume that the deterministic codebook satisfies (16)–(20).

Encoding: Since the transmitter knows the channel gains, it sends φ(𝐠)𝐱(m)\sqrt{\varphi(\mathbf{g})}\circ\mathbf{x}(m) (at time ii signal φ(gi)xi(m)\sqrt{\varphi(g_{i})}x_{i}(m) is sent) if its power is less than 11, otherwise it sends zero.

Decoding: Given 𝐲\mathbf{y}, let 𝒮\mathscr{S} be the set of messages m^\hat{m} such that (𝐱(m^),𝐠,𝐲)𝒯ϵ(n)(X,G,Y)(\mathbf{x}(\hat{m}),\mathbf{g},\mathbf{y})\in\mathcal{T}_{\epsilon}^{(n)}(X^{\prime},G,Y) for some random variables X𝒩(0,1)X^{\prime}\sim\mathcal{N}(0,1), GfG(g)G\sim f_{G}(g) and zero mean Gaussian YGφ(G)XY-G\sqrt{\varphi(G)}X^{\prime} where (X,G,YGφ(G)X)(X^{\prime},G,Y-G\sqrt{\varphi(G)}X^{\prime}) are mutually independent.

Now, we define the decoding function as

Θ(𝐲,𝐠)\displaystyle\Theta(\mathbf{y},\mathbf{g}) =argminm^𝒮𝐲𝐠φ(𝐠)𝐱(m^)2.\displaystyle=\operatorname*{arg\,min}_{\hat{m}\in\mathscr{S}}\left\|\mathbf{y}-\mathbf{g}\circ\sqrt{\varphi(\mathbf{g})}\circ\mathbf{x}(\hat{m})\right\|^{2}. (24)

Analysis of the probability of error: Suppose the true message sent by the legitimate user is message MM with the power constraint 𝐱(M)2n(1γ)\|\mathbf{x}(M)\|^{2}\leq n(1-\gamma). Then, the overall probability of error is upper bounded by Pe(n)P0+P1P_{e}^{(n)}\leq P_{0}+P_{1} where

P0\displaystyle P_{0} ={M𝒮},\displaystyle=\mathbb{P}\left\{M\notin\mathscr{S}\right\}, (25)
P1\displaystyle P_{1} ={𝐘𝐆φ(𝐆)𝐱(m^)2𝐬+𝐕2 for some m^𝒮{M}}.\displaystyle=\mathbb{P}\bigg{\{}\left\|\mathbf{Y}-\mathbf{G}\circ\sqrt{\varphi(\mathbf{G})}\circ\mathbf{x}(\hat{m})\right\|^{2}\leq\left\|\mathbf{s}+\mathbf{V}\right\|^{2}\text{ for some }\hat{m}\in\mathscr{S}\setminus\{M\}\bigg{\}}. (26)

Consider any state sequence 𝐬\mathbf{s}. By (16), with high probability (𝐱(M),𝐬,𝐆)𝒯ϵ(n)(X,S,G)(\mathbf{x}(M),\mathbf{s},\mathbf{G})\in\mathcal{T}_{\epsilon^{\prime}}^{(n)}(X,S,G) where (X,S,G)(X,S,G) are independent, and 𝔼X2=1,𝔼S2Λ\mathbb{E}X^{2}=1,\allowbreak\mathbb{E}S^{2}\leq\Lambda. By the conditional typicality lemma, for every ϵ>ϵ\epsilon>\epsilon^{\prime} with high probability (𝐱(M),𝐬,𝐆,𝐕)𝒯ϵ(n)(X,S,G,V)(\mathbf{x}(M),\mathbf{s},\mathbf{G},\mathbf{V})\in\mathcal{T}_{\epsilon}^{(n)}(X,S,G,V) where (X,S,G,V)(X,S,G,V) are mutually independent, and 𝔼V2=σ2\mathbb{E}V^{2}=\sigma^{2}. Thus, according to the definition of 𝒮\mathscr{S}, with high probability M𝒮M\in\mathscr{S} and P0P_{0} tends to zero as nn\to\infty.

Define the shorthand X=(XXSGV)\vec{X}=(XX^{\prime}SGV). Let 𝒱\mathcal{V} be a finite ϵ\epsilon-dense subset in the set of all distributions of random vectors X\vec{X} that are determined by fG(g)f_{G}(g) and jointly zero mean Gaussian vector (XXSV)(XX^{\prime}SV) independent of GG with bounded covariances at most (1,1,Λ,σ2)(1,1,\Lambda,\sigma^{2}). Note that because the distribution of fG(g)f_{G}(g) is fixed, the overall distribution of X\vec{X} can be determined by the covariance matrix of (XXSV)(XX^{\prime}SV), so 𝒱\mathcal{V} only needs to cover a compact set. Now, we may upper bound P1P_{1} by

X𝒱1Nm=1N𝔼G[eX(m,𝐬,𝐆)]\sum_{\vec{X}\in\mathcal{V}}\frac{1}{N}\sum_{m=1}^{N}\mathbb{E}_{G}[e_{\vec{X}}(m,\mathbf{s},\mathbf{G})] (27)

where

eX(m,𝐬,𝐠)={(𝐱(m),𝐱(m^),𝐬,𝐠,𝐕)𝒯ϵ(n)(X),𝐠φ(𝐠)𝐱(m)+𝐬+𝐕𝐠φ(𝐠)𝐱(m^)2𝐬+𝐕2 for some m^𝒮{m}}.e_{\vec{X}}(m,\mathbf{s},\mathbf{g})=\mathbb{P}\bigg{\{}(\mathbf{x}(m),\mathbf{x}(\hat{m}),\mathbf{s},\mathbf{g},\mathbf{V})\in\mathcal{T}_{\epsilon}^{(n)}(\vec{X}),\\ \|\mathbf{g}\circ\sqrt{\varphi(\mathbf{g})}\circ\mathbf{x}(m)+\mathbf{s}+\mathbf{V}-\mathbf{g}\circ\sqrt{\varphi(\mathbf{g})}\circ\mathbf{x}(\hat{m})\|^{2}\leq\|\mathbf{s}+\mathbf{V}\|^{2}\text{ for some }\hat{m}\in\mathscr{S}\setminus\{m\}\bigg{\}}. (28)

We will show that 1Nm=1NeX(m,𝐬,𝐠)0\frac{1}{N}\sum_{m=1}^{N}e_{\vec{X}}(m,\mathbf{s},\mathbf{g})\to 0 for all vectors 𝐠\mathbf{g} and all vectors (XXSV)(XX^{\prime}SV) which are Gaussian given GG (whether or not they are in 𝒱\mathcal{V}). Let Z=Gφ(G)X+S+VGφ(G)XZ=G\sqrt{\varphi(G)}X+S+V-G\sqrt{\varphi(G)}X^{\prime}. We may restrict ourselves to X\vec{X} where

(X,S,G,V) are mutually independent,\displaystyle(X,S,G,V)\text{ are mutually independent}, (29)
(X,X,S,V) are zero mean Gaussian,\displaystyle(X,X^{\prime},S,V)\text{ are zero mean Gaussian}, (30)
𝔼X2=𝔼X2=1,𝔼V2=σ2,𝔼S2Λ,\displaystyle\mathbb{E}X^{2}=\mathbb{E}X^{\prime 2}=1,\quad\mathbb{E}V^{2}=\sigma^{2},\quad\mathbb{E}S^{2}\leq\Lambda, (31)
(X,G,Z) are independent,\displaystyle\left(X^{\prime},G,Z\right)\!\text{ are independent}, (32)
𝔼[Z2]Λ+σ2,\displaystyle\mathbb{E}\left[Z^{2}\right]\leq\Lambda+\sigma^{2}, (33)

where (29) holds since the input XX, adversary SS, fading gains GG and noise VV are all generated independently, (30)–(31) follows from m,m^𝒮m,\hat{m}\in\mathscr{S}, and X𝒱\vec{X}\in\mathcal{V}, (32) holds since we have (X,G,YGX)(X^{\prime},G,Y-GX^{\prime}) are mutually independent using 𝐱(m^)𝒮\mathbf{x}(\hat{m})\in\mathscr{S}, and (33) corresponds to 𝔼[(YGφ(G)X)2]\mathbb{E}\left[\left(Y-G\sqrt{\varphi(G)}X^{\prime}\right)^{2}\right] which is less than Λ+σ2\Lambda+\sigma^{2} from (28).

Observe that if I(X,V,G;X,S)=0I(X,V,G;X^{\prime},S)=0, then we would have

0\displaystyle 0 =𝔼[XZ]\displaystyle=\mathbb{E}[X^{\prime}Z] (34)
=𝔼[X(Gφ(G)X+S+VGφ(G)X)]\displaystyle=\mathbb{E}[X^{\prime}(G\sqrt{\varphi(G)}X+S+V-G\sqrt{\varphi(G)}X^{\prime})] (35)
=𝔼[X(SGφ(G)X)]\displaystyle=\mathbb{E}[X^{\prime}(S-G\sqrt{\varphi(G)}X^{\prime})] (36)
=𝔼[XS]𝔼Gφ(G),\displaystyle=\mathbb{E}[X^{\prime}S]-\mathbb{E}G\sqrt{\varphi(G)}, (37)

where (34) follows from (32), (36) holds because (X,G,X,V)(X^{\prime},G,X,V) are all mutually independent by the assumption I(X,V,G;X,S)=0I(X,V,G;X^{\prime},S)=0 and (29), and the last equality holds since XX^{\prime} is independent of GG and because 𝔼[X2]=1\mathbb{E}[X^{\prime 2}]=1. Therefore, 𝔼[XS]=𝔼Gφ(G)\mathbb{E}[X^{\prime}S]=\mathbb{E}G\sqrt{\varphi(G)}.

Moreover, from (28) we have

𝔼(S+V)2\displaystyle\mathbb{E}(S+V)^{2} 𝔼(Gφ(G)X+S+VGφ(G)X)2\displaystyle\geq\mathbb{E}(G\sqrt{\varphi(G)}X\!+\!S\!+\!V\!-\!G\sqrt{\varphi(G)}X^{\prime})^{2} (38)
=𝔼G2φ(G)(XX)2+2𝔼Gφ(G)(XX)(S+V)+𝔼(S+V)2\displaystyle=\mathbb{E}G^{2}\varphi(G)(X-X^{\prime})^{2}+2\mathbb{E}G\sqrt{\varphi(G)}(X-X^{\prime})(S+V)+\mathbb{E}(S+V)^{2} (39)
=𝔼G2φ(G)𝔼X2+𝔼G2φ(G)𝔼X22𝔼Gφ(G)XS+𝔼(S+V)2\displaystyle=\mathbb{E}G^{2}\varphi(G)\mathbb{E}X^{2}+\mathbb{E}G^{2}\varphi(G)\mathbb{E}X^{\prime 2}-2\mathbb{E}G\sqrt{\varphi(G)}X^{\prime}S+\mathbb{E}(S+V)^{2} (40)
=2𝔼G2φ(G)2𝔼Gφ(G)𝔼XS+𝔼(S+V)2,\displaystyle=2\mathbb{E}G^{2}\varphi(G)-2\mathbb{E}G\sqrt{\varphi(G)}\mathbb{E}X^{\prime}S+\mathbb{E}(S+V)^{2}, (41)

where (40) holds because 𝔼X=𝔼X=𝔼V=0\mathbb{E}X=\mathbb{E}X^{\prime}=\mathbb{E}V=0, (X,X,G)(X,X^{\prime},G) are mutually independent, (X,S,V)(X,S,V) are mutually independent, and (X,V)(X^{\prime},V) are independent by (29), (30) and the assumption I(X,V,G;X,S)=0I(X,V,G;X^{\prime},S)=0. Canceling 𝔼(S+V)2\mathbb{E}(S+V)^{2} from both sides of (41) gives us

𝔼G2φ(G)𝔼Gφ(G)𝔼XS0.\displaystyle\mathbb{E}G^{2}\varphi(G)-\mathbb{E}G\sqrt{\varphi(G)}\mathbb{E}X^{\prime}S\leq 0. (42)

Now, if we apply the result from (37) to (42), we get

𝔼G2φ(G)𝔼Gφ(G)𝔼XS\displaystyle\mathbb{E}G^{2}\varphi(G)-\mathbb{E}G\sqrt{\varphi(G)}\mathbb{E}X^{\prime}S =𝔼G2φ(G)𝔼Gφ(G)𝔼Gφ(G)\displaystyle=\mathbb{E}G^{2}\varphi(G)-\mathbb{E}G\sqrt{\varphi(G)}\mathbb{E}G\sqrt{\varphi(G)} (43)
=𝔼G2φ(G)𝔼2Gφ(G)\displaystyle=\mathbb{E}G^{2}\varphi(G)-\mathbb{E}^{2}G\sqrt{\varphi(G)} (44)
=VarGφ(G)\displaystyle=\operatorname{Var}{G\sqrt{\varphi(G)}} (45)
0.\displaystyle\leq 0. (46)

which is a contradiction since we assume Var(Gφ(G))\operatorname{Var}{(G\sqrt{\varphi(G)})} is always positive. Thus, there exists an η>0\eta>0 such that

ηI(XVG;XS).\eta\leq I(XVG;X^{\prime}S). (47)

Also, by (20), we may restrict ourselves to distributions where

I(X;XSG)<|RI(X;SG)|++δ(ϵ)I(X;X^{\prime}SG)<|R-I(X^{\prime};SG)|^{+}+\delta(\epsilon) (48)

and

I(G;XS)<|RI(X;S)|++δ(ϵ).I(G;X^{\prime}S)<|R-I(X^{\prime};S)|^{+}+\delta(\epsilon). (49)

Note that I(X;XSG)=I(X;X|SG)I(X;X^{\prime}SG)=I(X;X^{\prime}|SG). We also have the upper bound

eX(m,𝐬,𝐠)\displaystyle e_{\vec{X}}(m,\mathbf{s},\mathbf{g}) m^:(𝐱(m),𝐱(m^),𝐬,𝐠)𝒯ϵ(n)(X,X,S,G){(𝐱(m),𝐱(m^),𝐬,𝐠,𝐕)𝒯ϵ(n)(X,X,S,G,V)}\displaystyle\leq\sum_{\hat{m}:(\mathbf{x}(m),\mathbf{x}(\hat{m}),\mathbf{s},\mathbf{g})\in\mathcal{T}_{\epsilon}^{(n)}(X,X^{\prime},S,G)}\mathbb{P}\left\{\!(\mathbf{x}(m),\mathbf{x}(\hat{m}),\mathbf{s},\mathbf{g},\mathbf{V})\!\in\!\mathcal{T}_{\epsilon}^{(n)}(X,X^{\prime},S,G,V)\!\right\} (50)
exp{n[|RI(X;XSG)|+I(V;X|XSG)+δ(ϵ)]\displaystyle\leq\exp\big{\{}n\big{[}|R\!-\!I(X^{\prime};XSG)|^{+}\!\!-I(V;X^{\prime}|XSG)+\delta(\epsilon)\big{]} (51)

where (51) follows from I(V;XSG)=0I(V;XSG)=0, (19) and the joint typicality lemma.

Now, let us consider two cases as follows:

Case (a): $R<I(X';S)$, which implies $R<I(X';XSG)$. From (51), for any $m,\mathbf{s},\mathbf{g}$,

eX(m,𝐬,𝐠)\displaystyle e_{\vec{X}}(m,\mathbf{s},\mathbf{g}) exp{n(I(V;X|XSG)δ(ϵ))}\displaystyle\leq\exp\left\{-n\left(I(V;X^{\prime}|XSG)-\delta(\epsilon)\right)\right\} (52)
=exp{n(I(XV;X|SG)I(X;X|SG)I(XV;S|G)δ(ϵ))}\displaystyle=\exp\{-n(I(XV;X^{\prime}|SG)-I(X;X^{\prime}|SG)-I(XV;S|G)-\delta(\epsilon))\} (53)
=exp{n(I(XV;XS|G)I(X;X|SG)δ(ϵ))}\displaystyle=\exp\{-n(I(XV;X^{\prime}S|G)\!-\!I(X;X^{\prime}|SG)-\delta(\epsilon))\} (54)
=exp{n(I(XVG;XS)I(G;XS)I(X;X|SG)δ(ϵ))}\displaystyle=\exp\{-n(I(XVG;X^{\prime}S)-I(G;X^{\prime}S)-I(X;X^{\prime}|SG)-\delta(\epsilon))\} (55)
exp{n(ηδ(ϵ)δ(ϵ))}\displaystyle\leq\exp\{-n(\eta-\delta(\epsilon)-\delta^{\prime}(\epsilon))\} (56)

where (56) follows from (47), (48) and (49). Therefore, eX(m,𝐬,𝐠)e_{\vec{X}}(m,\mathbf{s},\mathbf{g}) vanishes exponentially fast if δ(ϵ)\delta(\epsilon) is sufficiently small.

Case (b): I(X;S)RI(X^{\prime};S)\leq R. Since RI(X;S)R\geq I(X^{\prime};S) and I(G;S)=0I(G;S)=0, from (49) we have

R\displaystyle R >I(G;XS)+I(X;S)δ(ϵ)\displaystyle>I(G;X^{\prime}S)+I(X^{\prime};S)-\delta(\epsilon) (57)
=I(G;S)+I(G;X|S)+I(X;S)δ(ϵ)\displaystyle=I(G;S)+I(G;X^{\prime}|S)+I(X^{\prime};S)-\delta(\epsilon) (58)
=I(X;SG)δ(ϵ).\displaystyle=I(X^{\prime};SG)-\delta(\epsilon). (59)

Using this result in (48), we have

I(X;XSG)<RI(X;SG)+δ(ϵ)+δ(ϵ).\displaystyle I(X;X^{\prime}SG)<R-I(X^{\prime};SG)+\delta(\epsilon)+\delta(\epsilon). (60)

Therefore,

R\displaystyle R >I(X;XSG)+I(X;SG)2δ(ϵ)\displaystyle>I(X;X^{\prime}SG)+I(X^{\prime};SG)-2\delta(\epsilon) (61)
I(X;XSG)2δ(ϵ).\displaystyle\geq I(X^{\prime};XSG)-2\delta(\epsilon). (62)

Now, from (51), we have for any m,𝐬,𝐠m,\mathbf{s},\mathbf{g}

eX(m,𝐬,𝐠)\displaystyle e_{\vec{X}}(m,\mathbf{s},\mathbf{g}) exp{n[|RI(X;XSG)|+I(V;X|XSG)+δ(ϵ)]\displaystyle\leq\exp\big{\{}n\big{[}|R\!-\!I(X^{\prime};XSG)|^{+}\!\!-I(V;X^{\prime}|XSG)+\delta(\epsilon)\big{]} (63)
exp{n[RI(X;XSG)+2δ(ϵ)I(V;X|XSG)+δ(ϵ)]\displaystyle\leq\exp\big{\{}n\big{[}R\!-I(X^{\prime};XSG)+2\delta(\epsilon)-\!I(V;X^{\prime}|XSG)\!+\!\delta(\epsilon)\big{]} (64)
=exp(n[RI(X;XSGV)+3δ(ϵ)])\displaystyle=\!\exp(n[R-I(X^{\prime};XSGV)+3\delta(\epsilon)]) (65)

where (64) follows from (62). We now lower bound I(X;XSVG)I(X^{\prime};XSVG) as follows:

I(X;XSVG)\displaystyle I(X^{\prime};XSVG) =I(X;XSV|G)+I(X;G)\displaystyle=I(X^{\prime};XSV|G)+I(X^{\prime};G) (66)
I(X;Gφ(G)X+S+V|G)\displaystyle\geq I(X^{\prime};G\sqrt{\varphi(G)}X\!+\!S\!+\!V|G) (67)
=I(X;Z+Gφ(G)X|G)\displaystyle=I(X^{\prime};Z+G\sqrt{\varphi(G)}X^{\prime}|G) (68)
=h(Z+Gφ(G)X|G)h(Z+Gφ(G)X|G,X)\displaystyle=h(Z+G\sqrt{\varphi(G)}X^{\prime}|G)-h(Z+G\sqrt{\varphi(G)}X^{\prime}|G,X^{\prime}) (69)
=𝔼[12log2πe(G2φ(G)+𝔼[Z2|G])12log2πe𝔼[Z2|G]]\displaystyle=\mathbb{E}\bigg{[}\frac{1}{2}\log 2\pi e\left(G^{2}\varphi(G)+\mathbb{E}[Z^{2}|G]\right)-\frac{1}{2}\log 2\pi e\mathbb{E}[Z^{2}|G]\bigg{]} (70)
=𝔼[C(G2φ(G)𝔼[Z2|G])]\displaystyle=\mathbb{E}\left[C\left(\frac{G^{2}\varphi(G)}{\mathbb{E}[Z^{2}|G]}\right)\right] (71)
𝔼[C(G2φ(G)Λ+σ2)]\displaystyle\geq\mathbb{E}\left[C\left(\frac{G^{2}\varphi(G)}{\Lambda+\sigma^{2}}\right)\right] (72)

where (72) follows from (32) and (33). Replacing this result in (65), we obtain

eX(m,𝐬,𝐠)exp{n[R𝔼[C(G2φ(G)Λ+σ2)]+3δ(ϵ)]}e_{\vec{X}}(m,\mathbf{s},\mathbf{g})\leq\exp\left\{n\left[R-\mathbb{E}\left[C\left(\frac{G^{2}\varphi(G)}{\Lambda+\sigma^{2}}\right)\right]+3\delta(\epsilon)\right]\right\} (73)

meaning that eX(m,𝐬,𝐠)e_{\vec{X}}(m,\mathbf{s},\mathbf{g}) is exponentially vanishing if δ(ϵ)\delta(\epsilon) is sufficiently small, and (23) holds.

VII Capacity Proof with Gains Available at Decoder and Jammer

VII-A Converse Proof

Consider a sequence of (2nR,n)(2^{nR},n) codes with vanishing probability of error that must function for arbitrary jamming signals. Because we are proving the converse, we may assume the best case scenario from the legitimate user’s perspective; in particular, that the adversary only knows the channel gains causally.

We begin with the case that Λ𝔼G2P\Lambda\leq\mathbb{E}G^{2}P. Given any function ψ(g)\psi(g) satisfying 𝔼ψ(G)Λ\mathbb{E}\psi(G)\leq\Lambda, we may obtain an upper bound by assuming that the jammer transmits a random sequence 𝐒=(S1,,Sn)\mathbf{S}=(S_{1},\cdots,S_{n}) where SiS_{i} is Gaussian with mean zero and variance ψ(Gi)\psi(G_{i}) for i=1,,ni=1,\cdots,n. Note that

𝔼[𝐒2]\displaystyle\mathbb{E}[\|\mathbf{S}\|^{2}] =𝔼i=1nSi2\displaystyle=\mathbb{E}\sum_{i=1}^{n}S_{i}^{2} (74)
=i=1n𝔼Si2\displaystyle=\sum_{i=1}^{n}\mathbb{E}S_{i}^{2} (75)
=i=1n𝔼ψ(Gi)\displaystyle=\sum_{i=1}^{n}\mathbb{E}\psi(G_{i}) (76)
nΛ.\displaystyle\leq n\Lambda. (77)

The resulting channel is equivalent to a standard Gaussian fading channel with the knowledge of gains only at the decoder and noise variance ψ(g)+σ2\psi(g)+\sigma^{2}. From the capacity of a non-adversarial Gaussian fading channel

C𝔼G[C(G2Pψ(G)+σ2)].C\leq\mathbb{E}_{G}\left[C\left(\frac{G^{2}P}{\psi(G)+\sigma^{2}}\right)\right]. (78)

Therefore, the capacity is also upper bounded by the minimum over all $\psi(g)$ satisfying $\mathbb{E}\psi(G)\leq\Lambda$:

Cminψ(G):𝔼ψ(G)Λ𝔼G[C(G2Pψ(G)+σ2)].C\leq\min_{\psi(G):\mathbb{E}\psi(G)\leq\Lambda}\mathbb{E}_{G}\left[C\left(\frac{G^{2}P}{\psi(G)+\sigma^{2}}\right)\right]. (79)

For the case $\Lambda>\mathbb{E}G^2 P$, we first show that the adversary has enough power to choose a codeword and send it into the channel, thereby symmetrizing the channel. Let $\tilde{M}$ be a message chosen uniformly at random by the adversary and $M$ be the true message sent by the legitimate transmitter. Suppose the adversary chooses $\mathbf{S}=\mathbf{G}\circ\mathbf{x}(\tilde{M})$; then the adversary's power constraint is satisfied as follows:

𝔼[𝐒2]\displaystyle\mathbb{E}\left[\|\mathbf{S}\|^{2}\right] =𝔼[𝐆𝐱(M~)2]\displaystyle=\mathbb{E}\left[\|\mathbf{G}\circ\mathbf{x}(\tilde{M})\|^{2}\right] (80)
=𝔼[i=1nGi2xi2(M~)]\displaystyle=\mathbb{E}\left[\sum_{i=1}^{n}G_{i}^{2}x_{i}^{2}(\tilde{M})\right] (81)
<i=1nxi2(M~)ΛP\displaystyle<\sum_{i=1}^{n}x_{i}^{2}(\tilde{M})\frac{\Lambda}{P} (82)
nΛ\displaystyle\leq n\Lambda (83)

where (82) follows from the assumption $\Lambda>\mathbb{E}G^2 P$, and (83) follows from the codebook power constraint $\|\mathbf{x}\|^2\leq nP$. Given this choice of $\mathbf{S}$, $\mathbf{Y}=\mathbf{G}\circ\mathbf{x}(M)+\mathbf{G}\circ\mathbf{x}(\tilde{M})+\mathbf{V}$. Thus, with high probability the decoder cannot decode the message, since it cannot tell whether the true message is $M$ or $\tilde{M}$.
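The symmetrizing attack above can be visualized with a small numerical sketch; the Rayleigh gains, the random codebook, and the minimum-distance metric used below are illustrative assumptions rather than the exact construction in the proof.

```python
import numpy as np

# Sketch of the symmetrizing attack: a gain-aware jammer with Lambda > E[G^2] P
# sends G o x(M_tilde) for a random message M_tilde, so the channel output is
# symmetric in M and M_tilde and the decoder cannot tell which one is genuine.
rng = np.random.default_rng(3)
n, N, P, sigma2 = 512, 64, 1.0, 0.25

codebook = rng.normal(0.0, np.sqrt(P), (N, n))
g = rng.rayleigh(scale=1.0, size=n)                  # E[G^2] = 2, so Lambda > 2P suffices
v = rng.normal(0.0, np.sqrt(sigma2), n)

M, M_tilde = 3, 41
y = g * codebook[M] + g * codebook[M_tilde] + v      # jammer imitates the legitimate channel

d = np.sum((y[None, :] - g * codebook) ** 2, axis=1) # distance metric for each candidate
print(np.argsort(d)[:2], d[M], d[M_tilde])           # the two closest codewords are M and M_tilde
```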

VII-B Achievability Proof

The achievability proof for this case is very similar to the achievability proof of Sec. VIII-B, where the encoder, the decoder and the adversary all know the channel gains. Here, the transmitter does not know the channel gains, so it cannot leverage this knowledge to choose its transmit power. Accordingly, the achievability proof for this case is identical to that in Sec. VIII-B except that the transmitter's power function is constant, i.e., $\varphi(g)=1$.

VIII Capacity Proof with Gains Available at Encoder, Decoder, and Jammer

In this section, we first provide the converse proof for the case in which the channel gains are available at the encoder, the decoder and the adversary in Sec. VIII-A. The converse proof covers all four cases in which the encoder and the adversary each know the fading gains causally or non-causally. In Sec. VIII-B, we give the achievability proof for the case in which the channel gains are available non-causally at the adversary and causally at the encoder. This proof also works for the two cases in which the channel gains are available causally at both the adversary and the encoder, or non-causally at both. Finally, we provide the achievability proof for the last case, in which the channel gains are causally available at the adversary and non-causally available at the encoder, in Sec. VIII-C.

VIII-A Converse Proof

Consider a sequence of $(2^{nR},n)$ codes with vanishing probability of error. Since in this case both the encoder and the adversary know the channel gains, we consider four cases for the converse, depending on whether each of them knows the fading gains causally or non-causally.

First assume that both the encoder and adversary know the channel gains causally. Let φi(g)=1Nm=1N𝔼[Xi2(m,Gi)|Gi=g]\varphi_{i}(g)=\frac{1}{N}\sum_{m=1}^{N}\mathbb{E}[X_{i}^{2}(m,G^{i})|G_{i}=g] and φ(g)=1ni=1nφi(g)\varphi(g)=\frac{1}{n}\sum_{i=1}^{n}\varphi_{i}(g) where Gi=(G1,,Gi)G^{i}=(G_{1},\ldots,G_{i}), for i[n]i\in[n]. Thus, φ(g)\varphi(g) satisfies 𝔼φ(G)P\mathbb{E}\varphi(G)\leq P as follows:

𝔼φ(G)\displaystyle\mathbb{E}\varphi(G) =𝔼[1ni=1nφi(G)]\displaystyle=\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n}\varphi_{i}(G)\right] (84)
=1Nm=1N1n𝔼[i=1nXi2(m,Gi)]\displaystyle=\frac{1}{N}\sum_{m=1}^{N}\frac{1}{n}\mathbb{E}\left[\sum_{i=1}^{n}X_{i}^{2}(m,G^{i})\right] (85)
P\displaystyle\leq P (86)

where (86) follows by the power constraint for the input signal.

Now, similar to the previous case, where the adversary and decoder know the channel gains, we again have symmetrizable and non-symmetrizable regimes, but with different conditions. We first show the symmetrizable case: if $\Lambda\geq\mathbb{E}G^2\varphi(G)$, then the jammer can symmetrize the channel. Suppose the adversary chooses a message $\tilde{M}$ uniformly at random and sends $S_i=G_iX_i(\tilde{M},G^i)$ where $G^i=(G_1,\ldots,G_i)$ for $i\in[n]$. Note that this selection of the jamming signal is a causal function of the channel gains. Then we have

𝔼[𝐒2]\displaystyle\mathbb{E}\left[\|\mathbf{S}\|^{2}\right] =𝔼[i=1nSi2]\displaystyle=\mathbb{E}\left[\sum_{i=1}^{n}S_{i}^{2}\right] (87)
=1Nm~=1N𝔼[i=1nGi2Xi2(m~,Gi)]\displaystyle=\frac{1}{N}\sum_{\tilde{m}=1}^{N}\mathbb{E}\left[\sum_{i=1}^{n}G_{i}^{2}X_{i}^{2}(\tilde{m},G^{i})\right] (88)
=i=1n𝔼G[G21Nm~=1N𝔼[Xi2(m~,Gi)|Gi=G]]\displaystyle=\sum_{i=1}^{n}\mathbb{E}_{G}\left[G^{2}\frac{1}{N}\sum_{\tilde{m}=1}^{N}\mathbb{E}\left[X_{i}^{2}(\tilde{m},G^{i})|G_{i}=G\right]\right] (89)
=i=1n𝔼G[G2φi(G)]\displaystyle=\sum_{i=1}^{n}\mathbb{E}_{G}\left[G^{2}\varphi_{i}(G)\right] (90)
=𝔼G[G2i=1nφi(G)]\displaystyle=\mathbb{E}_{G}\left[G^{2}\sum_{i=1}^{n}\varphi_{i}(G)\right] (91)
=n𝔼G[G2φ(G)]\displaystyle=n\mathbb{E}_{G}\left[G^{2}\varphi(G)\right] (92)
nΛ.\displaystyle\leq n\Lambda. (93)

Therefore, this choice of jamming signal satisfies the adversary's power constraint. Given $\mathbf{Y}=\mathbf{g}\circ\mathbf{x}(M,\mathbf{g})+\mathbf{g}\circ\mathbf{x}(\tilde{M},\mathbf{g})+\mathbf{V}$, with high probability the decoder cannot determine whether the true message is $M$ or the adversary's message $\tilde{M}$. Thus, the probability of error is bounded away from zero. By the above argument, if $\mathbb{E}G^2\varphi(G)\leq\Lambda$ for all $\varphi(g)$ with $\mathbb{E}\varphi(G)\leq P$, then the adversary can always symmetrize the channel, so the capacity is 0.

On the other hand, consider the case where there exists some function φ(g)\varphi(g) where 𝔼G2φ(G)>Λ\mathbb{E}G^{2}\varphi(G)>\Lambda and 𝔼φ(G)P\mathbb{E}\varphi(G)\leq P. Let ψi(g)\psi_{i}(g) be given by

ψi(g)=argminψ(g):𝔼ψ(G)Λ𝔼[C(G2φi(G)σ2+ψ(G))].\displaystyle\psi_{i}(g)=\operatorname*{arg\,min}_{\psi(g):\mathbb{E}\psi(G)\leq\Lambda}\mathbb{E}\left[C\left(\frac{G^{2}\varphi_{i}(G)}{\sigma^{2}+\psi(G)}\right)\right]. (94)

Since the transmitted codes should work for arbitrary jamming signals, an outer bound may be obtained by assuming the adversary sends Si𝒩(0,ψi(G))S_{i}\sim\mathcal{N}(0,\psi_{i}(G)). By the assumption that 𝔼ψi(G)Λ\mathbb{E}\psi_{i}(G)\leq~{}\Lambda, the jammer’s expected power constraint is satisfied. Therefore, the rate is upper bounded by

nR\displaystyle nR i=1nI(Xi;Yi|Gi)\displaystyle\leq\sum_{i=1}^{n}I(X_{i};Y_{i}|G_{i}) (95)
=i=1nI(Xi;GiXi+Si+Vi|Gi)\displaystyle=\sum_{i=1}^{n}I(X_{i};G_{i}X_{i}+S_{i}+V_{i}|G_{i}) (96)
i=1n𝔼Gi[C(Gi2φi(Gi)ψi(Gi)+σ2)]\displaystyle\leq\sum_{i=1}^{n}\mathbb{E}_{G_{i}}\left[C\left(\frac{G_{i}^{2}\varphi_{i}(G_{i})}{\psi_{i}(G_{i})+\sigma^{2}}\right)\right] (97)
=i=1nminψ(g):𝔼ψ(g)Λ𝔼G[C(G2φi(G)ψ(G)+σ2)]\displaystyle=\sum_{i=1}^{n}\min_{\psi(g):\mathbb{E}\psi(g)\leq\Lambda}\mathbb{E}_{G}\left[C\left(\frac{G^{2}\varphi_{i}(G)}{\psi(G)+\sigma^{2}}\right)\right] (98)
nminψ(g):𝔼ψ(g)Λ𝔼G[C(G21ni=1nφi(G)ψ(G)+σ2)]\displaystyle\leq n\min_{\psi(g):\mathbb{E}\psi(g)\leq\Lambda}\mathbb{E}_{G}\left[C\left(\frac{G^{2}\frac{1}{n}\sum_{i=1}^{n}\varphi_{i}(G)}{\psi(G)+\sigma^{2}}\right)\right] (99)
nmaxφ(g):𝔼φ(G)P𝔼G2φ(G)Λminψ(g):𝔼ψ(g)Λ𝔼G[C(G2φ(G)ψ(G)+σ2)]\displaystyle\leq n\underset{\begin{subarray}{c}\varphi(g):\mathbb{E}\varphi(G)\leq P\\ \mathbb{E}G^{2}\varphi(G)\geq\Lambda\end{subarray}}{\max}\ \min_{\psi(g):\mathbb{E}\psi(g)\leq\Lambda}\mathbb{E}_{G}\left[C\left(\frac{G^{2}\varphi(G)}{\psi(G)+\sigma^{2}}\right)\right] (100)

where (97) follows since the mutual information is at most the capacity of the equivalent standard fading channel with noise variance $\psi_i(G_i)+\sigma^2$ and gains available at both encoder and decoder, (98) follows by the definition of $\psi_i(g)$, (99) follows by the concavity of $C(\cdot)$ with respect to $\varphi_i(g)$ and Jensen's inequality, and (100) follows since we have established that $\varphi(g)=\frac{1}{n}\sum_{i=1}^n\varphi_i(g)$ satisfies $\mathbb{E}\varphi(G)\leq P$ and $\mathbb{E}G^2\varphi(G)\geq\Lambda$.

Moreover, if the encoder knows the channel gains causally, and the adversary knows them non-causally, then the adversary is stronger than in the previous case, so exactly the same bound holds. If both encoder and adversary know the channel gains non-causally, then we instead assume

φi(g)=1Nm=1N𝔼[Xi2(m,𝐆)|Gi=g]\displaystyle\varphi_{i}(g)=\frac{1}{N}\sum_{m=1}^{N}\mathbb{E}\left[X_{i}^{2}(m,\mathbf{G})|G_{i}=g\right] (101)

where 𝐆=(G1,,Gn)\mathbf{G}=(G_{1},\ldots,G_{n}) and Si=GiXi(m~,𝐆)S_{i}=G_{i}X_{i}(\tilde{m},\mathbf{G}), so we get the same upper bound.

However, the case wherein the encoder knows the channel gains non-causally and the adversary knows them causally is somewhat different. In this case, the encoder may send $X_i(m,\mathbf{G})$, while the adversary has no access to $(G_{i+1},\ldots,G_n)$ and so cannot construct $S_i=G_iX_i(\tilde{m},\mathbf{G})$ to mimic the legitimate user. Thus, the adversary cannot symmetrize the channel; for this case, we derive a converse bound based on the adversary sending Gaussian noise. Hence, we obtain the following bound:

Rmaxφ(g):𝔼φ(G)Pminψ(g):𝔼ψ(g)Λ𝔼G[C(G2φ(G)ψ(G)+σ2)].\displaystyle R\leq\underset{\begin{subarray}{c}\varphi(g):\mathbb{E}\varphi(G)\leq P\end{subarray}}{\max}\ \min_{\psi(g):\mathbb{E}\psi(g)\leq\Lambda}\mathbb{E}_{G}\left[C\left(\frac{G^{2}\varphi(G)}{\psi(G)+\sigma^{2}}\right)\right]. (102)

Note that here we are not making the assumption that 𝔼G2φ(G)Λ\mathbb{E}G^{2}\varphi(G)\geq\Lambda.

VIII-B Achievability Proof (Gains Available Non-causally at Adversary and Causally at Encoder)

We first quantize GG in the following way. Fix ν>0\nu>0. Given the assumption that GG has finite variance, there exists a real-valued random variable G~\tilde{G} with a finite support such that G~\tilde{G} is a deterministic function of GG and 𝔼[(GG~)2|G~=g~]ν\mathbb{E}[(G-\tilde{G})^{2}|\tilde{G}=\tilde{g}]\leq\nu for each g~\tilde{g}. We further assume that G~\tilde{G} is the expected value of GG within each quantization set; that is, 𝔼[G|G~]=G~\mathbb{E}[G|\tilde{G}]=\tilde{G}.
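A concrete (and simplified) way to build such a quantizer is to bin $G$ into intervals of width $2\sqrt{\nu}$ and let $\tilde{G}$ be the conditional mean of $G$ within each bin; by Popoviciu's inequality, the conditional variance within any bounded bin is then at most $\nu$. The sketch below estimates the conditional means empirically for Rayleigh fading and glosses over the unbounded tail, which in the paper is handled using the finite variance of $G$.

```python
import numpy as np

# Sketch of the quantization step: G_tilde = E[G | bin], with bins of width
# 2*sqrt(nu), so E[(G - G_tilde)^2 | G_tilde] <= nu within each bounded bin.
# Empirical conditional means are used; the tail handling is only illustrative.
rng = np.random.default_rng(4)
nu = 0.01
G = rng.rayleigh(scale=1.0, size=200_000)

edges = np.arange(0.0, G.max() + 2 * np.sqrt(nu), 2 * np.sqrt(nu))
idx = np.digitize(G, edges)                              # bin index of each sample
means = {k: G[idx == k].mean() for k in np.unique(idx)}  # conditional mean per bin
G_tilde = np.array([means[k] for k in idx])              # quantized gains

per_bin_mse = [np.mean((G[idx == k] - means[k]) ** 2) for k in np.unique(idx)]
print(max(per_bin_mse) <= nu)                            # conditional MSE bound per bin
```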

Without loss of generality, assume P=1P=1. Let RR be a rate and φ(g~)\varphi(\tilde{g}) be any function satisfying

𝔼φ(G~)1,\displaystyle\mathbb{E}\varphi(\tilde{G})\leq 1, (103)
Λ<𝔼G~2φ(G~),\displaystyle\Lambda<\mathbb{E}\tilde{G}^{2}\varphi(\tilde{G}), (104)
R<minψ(g~):𝔼ψ(G~)Λ𝔼G~[C(G~2φ(G~)ψ(G~)+σ2)].\displaystyle R<\underset{\psi(\tilde{g}):\mathbb{E}\psi(\tilde{G})\leq\Lambda}{\min}\mathbb{E}_{\tilde{G}}\left[C\left(\frac{\tilde{G}^{2}\varphi(\tilde{G})}{\psi(\tilde{G})+\sigma^{2}}\right)\right]. (105)

We construct a (2nR,n)(2^{nR},n) code as follows:

Codebook generation: Fix $\epsilon>\epsilon''>\epsilon'>\gamma>0$. Generate $2^{nR}$ i.i.d. zero-mean Gaussian sequences $\mathbf{X}(m)$ with variance $(1-\gamma)$ for each $m\in[2^{nR}]$. By Lemmas 2 and 3, we may assume that the deterministic codebook satisfies (16)–(20).

Encoding: Given message mm and gain sequence 𝐠\mathbf{g}, the transmitter computes g~\tilde{g} from the quantization function, and then sends φ(𝐠~)𝐱(m)\sqrt{\varphi(\tilde{\mathbf{g}})}\circ\mathbf{x}(m) (at time ii signal φ(gi~)xi(m)\sqrt{\varphi(\tilde{g_{i}})}x_{i}(m) is sent) if φ(𝐠~)𝐱(m)2n\|\sqrt{\varphi(\tilde{\mathbf{g}})}\circ\mathbf{x}(m)\|^{2}\leq n; otherwise, it sends zero. Note that here we assume that the encoder knows the channel gains causally.

Decoding: Given $\mathbf{y}$ and $\mathbf{g}$, let $\nu<\epsilon$ and let $\mathscr{S}$ be the set of messages $\hat{m}$ such that $(\mathbf{x}(\hat{m}),\tilde{\mathbf{g}},\mathbf{y})\in\mathcal{T}_{\epsilon}^{(n)}(X^{\prime},\tilde{G},Y)$, where $\tilde{G}$ is the quantized random variable obtained from $G$ and $(X^{\prime},Y)$ are random variables that are conditionally Gaussian given $\tilde{G}=\tilde{g}$ with zero mean and covariance

\operatorname{Cov}\left(X^{\prime},Y\Big|\tilde{G}=\tilde{g}\right)=\begin{bmatrix}1&\tilde{g}\sqrt{\varphi(\tilde{g})}\\ \tilde{g}\sqrt{\varphi(\tilde{g})}&a_{\tilde{g}}\end{bmatrix} \qquad (108)

where $a_{\tilde{g}}\geq\tilde{g}^{2}\varphi(\tilde{g})+\sigma^{2}$. Note that the following can be shown from (108):

X is independent of G~,\displaystyle X^{\prime}\text{ is independent of }\tilde{G}, (109)
𝔼X2=1,\displaystyle\mathbb{E}X^{\prime 2}=1, (110)
YG~φ(G~)X is independent of X given G~,\displaystyle Y-\tilde{G}\sqrt{\varphi(\tilde{G})}X^{\prime}\text{ is independent of }X^{\prime}\text{ given }\tilde{G}, (111)
Var(YG~φ(G~)X|G~)σ2.\displaystyle\operatorname{Var}\left(Y-\tilde{G}\sqrt{\varphi(\tilde{G})}X^{\prime}\bigg{|}\tilde{G}\right)\geq\sigma^{2}. (112)

Now, we define the decoding function as

Θ(𝐲,𝐠~)\displaystyle\Theta(\mathbf{y},\tilde{\mathbf{g}}) =argminm^𝒮𝐲𝐠~φ(𝐠~)𝐱(m^)2.\displaystyle=\operatorname*{arg\,min}_{\hat{m}\in\mathscr{S}}\left\|\mathbf{y}-\tilde{\mathbf{g}}\circ\sqrt{\varphi(\tilde{\mathbf{g}})}\circ\mathbf{x}(\hat{m})\right\|^{2}. (113)
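As a rough illustration only, the following Python sketch mimics the decoder (113): the joint-typicality test is replaced by a simplified check of the empirical second moments against the form of (108) with a hypothetical tolerance, and the surviving message with the smallest scaled distance to $\mathbf{y}$ is returned. It composes with the encoder sketch above.

```python
# Simplified decoder sketch for (113); tolerances and structure are assumptions.
import numpy as np

def decode(y, g_tilde, codebook, phi, sigma2, eps=0.1):
    """codebook: list of length-n codewords; returns the decoded message index or None."""
    survivors = []
    for m_hat, x in enumerate(codebook):
        ok = True
        for gt in np.unique(g_tilde):
            idx = (g_tilde == gt)
            xs, ys = x[idx], y[idx]
            # empirical second moments conditioned on G_tilde = gt
            exx, exy, eyy = np.mean(xs**2), np.mean(xs*ys), np.mean(ys**2)
            target_xy = gt * np.sqrt(phi(gt))                 # off-diagonal of (108)
            if (abs(exx - 1) > eps or abs(exy - target_xy) > eps
                    or eyy < gt**2 * phi(gt) + sigma2 - eps):  # a_g >= g^2 phi(g) + sigma^2
                ok = False
                break
        if ok:
            survivors.append(m_hat)
    if not survivors:
        return None
    dist = [np.sum((y - g_tilde*np.sqrt(phi(g_tilde))*codebook[m])**2) for m in survivors]
    return survivors[int(np.argmin(dist))]
```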

Analysis of the probability of error: Assume the legitimate transmitter sends message $M$. Then we can upper bound the probability of error by the sum of the following error probabilities:

P0\displaystyle P_{0} ={M𝒮},\displaystyle=\mathbb{P}\left\{M\notin\mathscr{S}\right\}, (114)
P1\displaystyle P_{1} ={𝐘𝐆~φ(𝐆~)𝐱(m^)2𝐬+𝐕2 for some m^𝒮{M}}.\displaystyle=\mathbb{P}\bigg{\{}\left\|\mathbf{Y}-\tilde{\mathbf{G}}\circ\sqrt{\varphi(\tilde{\mathbf{G}})}\circ\mathbf{x}(\hat{m})\right\|^{2}\leq\left\|\mathbf{s}+\mathbf{V}\right\|^{2}\text{ for some }\hat{m}\in\mathscr{S}\setminus\{M\}\bigg{\}}. (115)

We can show that, with high probability,

1n𝐱φ(𝐆~)(𝐆𝐆~)2\displaystyle\frac{1}{n}\left\|\mathbf{x}\circ\sqrt{\varphi(\tilde{\mathbf{G}})}\circ(\mathbf{G}-\tilde{\mathbf{G}})\right\|^{2} =1ni=1n(xiφ(G~i)(GiG~i))2\displaystyle=\frac{1}{n}\sum_{i=1}^{n}\left(x_{i}\sqrt{\varphi(\tilde{G}_{i})}\left(G_{i}-\tilde{G}_{i}\right)\right)^{2} (116)
1ni=1nxi2𝔼G~i[𝔼Gi[φ(G~i)(GiG~i)2|G~i]]+ν\displaystyle\leq\frac{1}{n}\sum_{i=1}^{n}x_{i}^{2}\mathbb{E}_{\tilde{G}_{i}}\left[\mathbb{E}_{G_{i}}\left[\varphi(\tilde{G}_{i})\left(G_{i}-\tilde{G}_{i}\right)^{2}\Big{|}\tilde{G}_{i}\right]\right]+\nu (117)
1ni=1nxi2𝔼G~i[φ(G~i)]ν+ν\displaystyle\leq\frac{1}{n}\sum_{i=1}^{n}x_{i}^{2}\mathbb{E}_{\tilde{G}_{i}}\left[\varphi(\tilde{G}_{i})\right]\nu+\nu (118)
2ν\displaystyle\leq 2\nu (119)

where (117) follows from the law of large numbers for the independent, non-identically distributed random variables $x_{i}^{2}\varphi(\tilde{G}_{i})(G_{i}-\tilde{G}_{i})^{2}$, (118) follows from the assumption $\mathbb{E}[(G_{i}-\tilde{G}_{i})^{2}|\tilde{G}_{i}=\tilde{g}_{i}]\leq\nu$, and (119) follows from the assumptions $\frac{1}{n}\|\mathbf{x}\|^{2}\leq 1$ and $\mathbb{E}\varphi(\tilde{G})\leq 1$.
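The bound (119) can also be checked numerically; the short Monte Carlo sketch below (with a hypothetical bounded gain, midpoint quantizer, and power allocation) verifies that the per-symbol energy of the quantization-error term stays below $2\nu$.

```python
# Monte Carlo check of (116)-(119) under hypothetical parameters.
import numpy as np

rng = np.random.default_rng(3)
n, nu, trials = 10_000, 0.01, 20
phi = lambda gt: np.minimum(1.5, 1.0 / gt)      # hypothetical allocation, E[phi] ~ 1

for _ in range(trials):
    G = rng.uniform(0.2, 2.0, size=n)           # bounded fading gains (assumption)
    width = np.sqrt(nu)
    G_tilde = (np.floor(G / width) + 0.5) * width   # midpoint quantizer: MSE <= nu
    x = rng.normal(0.0, 1.0, size=n)                # codeword with (1/n)||x||^2 ~ 1
    err = np.mean((x * np.sqrt(phi(G_tilde)) * (G - G_tilde))**2)
    assert err <= 2 * nu
print('all trials satisfied the 2*nu bound of (119)')
```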

Consider any jammer sequence $\mathbf{s}$. We may assume the sequence $\mathbf{G}$ is typical since it is drawn i.i.d. from the distribution $f_{G}(g)$. Similarly, $\tilde{\mathbf{G}}$ is also typical because it follows the corresponding discrete distribution $P_{\tilde{G}}(\tilde{g})$. Thus, $(\mathbf{s},\tilde{\mathbf{G}})$ is also typical with respect to some distribution $P_{\tilde{G}}(\tilde{g})f_{S|\tilde{G}}(s|\tilde{g})$, where $f_{S|\tilde{G}}(s|\tilde{g})$ is conditionally Gaussian. Note that we can make no assumptions about the conditional variances defining $f_{S|\tilde{G}}$, because the adversary is assumed to know $G$ in its choice of $s$. By (16), with high probability $(\mathbf{x}(M),\mathbf{s},\tilde{\mathbf{G}})\in\mathcal{T}_{\epsilon^{\prime}}^{(n)}(X,S,\tilde{G})$, where $X$ is independent of $(S,\tilde{G})$, $\mathbb{E}X^{2}=1$, and $\mathbb{E}S^{2}\leq\Lambda$. Thus, by the conditional typicality lemma, with high probability $(\mathbf{x},\mathbf{s},\tilde{\mathbf{G}},\mathbf{V})\in\mathcal{T}_{\epsilon^{\prime\prime}}^{(n)}(X,S,\tilde{G},V)$, where $X,S,\tilde{G}$ are independent of $V$ and $\mathbb{E}V^{2}=\sigma^{2}$. Hence, using (119), we have $(\mathbf{x},\mathbf{s},\tilde{\mathbf{G}},\mathbf{V}+\mathbf{x}\circ(\mathbf{G}-\tilde{\mathbf{G}})\circ\sqrt{\varphi(\tilde{\mathbf{G}})})\in\mathcal{T}_{\epsilon}^{(n)}(X,S,\tilde{G},V)$, provided $\nu$ is sufficiently small compared to $\epsilon$. Also, since $\mathbf{Y}-\mathbf{x}\circ\tilde{\mathbf{G}}\circ\sqrt{\varphi(\tilde{\mathbf{G}})}-\mathbf{s}-\mathbf{V}=\mathbf{x}\circ(\mathbf{G}-\tilde{\mathbf{G}})\circ\sqrt{\varphi(\tilde{\mathbf{G}})}$, by (119) we may effectively take $\mathbf{Y}=\mathbf{x}\circ\tilde{\mathbf{G}}\circ\sqrt{\varphi(\tilde{\mathbf{G}})}+\mathbf{s}+\mathbf{V}$ and obtain $(\mathbf{x},\mathbf{s},\tilde{\mathbf{G}},\mathbf{Y})\in\mathcal{T}_{\epsilon}^{(n)}(X,S,\tilde{G},Y)$. Moreover, to complete the proof that $M\in\mathscr{S}$ with high probability, we also need to compute the covariance matrix of $(X,Y)$ given $\tilde{G}=\tilde{g}$, where $Y=\tilde{G}\sqrt{\varphi(\tilde{G})}X+S+V$, and show that it is of the form (108). First, $\mathbb{E}(X^{2}|\tilde{G}=\tilde{g})=\mathbb{E}X^{2}=1$ since $X$ is independent of $\tilde{G}$,

𝔼(X(G~φ(G~)X+S+V)|G~=g~)\displaystyle\mathbb{E}\left(\!X\!\left(\!\tilde{G}\sqrt{\varphi(\tilde{G})}X\!+\!S\!+\!V\right)\Big{|}\tilde{G}\!=\!\tilde{g}\right)\! =g~φ(g~)𝔼X2+𝔼(XS|G~=g~)+𝔼(XV|G~=g~)\displaystyle=\tilde{g}\sqrt{\varphi(\tilde{g})}\mathbb{E}X^{2}\!+\!\mathbb{E}\left(XS|\tilde{G}\!=\!\tilde{g}\right)\!+\!\mathbb{E}\left(XV|\tilde{G}\!=\!\tilde{g}\right)\! (120)
=g~φ(g~)\displaystyle=\tilde{g}\sqrt{\varphi(\tilde{g})} (121)

where $\mathbb{E}(XS|\tilde{G}=\tilde{g})=0$ follows from the weak union rule since $X$ is independent of $(S,G)$. Furthermore,

𝔼\displaystyle\mathbb{E} ((G~φ(G~)X+S+V)2|G~=g~)=𝔼(G~2φ(G~)X2|G~=g~)+𝔼(S2|G~=g~)\displaystyle\left(\left(\tilde{G}\sqrt{\varphi(\tilde{G})}X+S+V\right)^{2}\bigg{|}\tilde{G}=\tilde{g}\right)=\mathbb{E}\left(\tilde{G}^{2}{\varphi(\tilde{G})}X^{2}\bigg{|}\tilde{G}=\tilde{g}\right)+\mathbb{E}\left(S^{2}\Big{|}\tilde{G}=\tilde{g}\right)
+𝔼(V2|G~=g~)+2𝔼(G~φ(G~)XS|G~=g~)+2𝔼(G~Xφ(G~)V|G~=g~)+2𝔼(SV|G~=g~)\displaystyle+\mathbb{E}\left(V^{2}\Big{|}\tilde{G}=\tilde{g}\right)+\!2\mathbb{E}\left(\tilde{G}\sqrt{\varphi(\tilde{G})}XS\bigg{|}\tilde{G}\!=\!\tilde{g}\right)\!+\!2\mathbb{E}\left(\tilde{G}X\sqrt{\varphi(\tilde{G})}V\bigg{|}\tilde{G}\!=\!\tilde{g}\right)\!+\!2\mathbb{E}\left(SV\Big{|}\tilde{G}\!=\!\tilde{g}\right) (122)
=g~2φ(g~)+𝔼(S2|G~=g~)+σ2+2g~φ(g~)𝔼(XS|G~=g~)+2g~φ(g~)𝔼(XV|G~=g~)\displaystyle=\tilde{g}^{2}{\varphi(\tilde{g})}\!+\!\mathbb{E}\left(S^{2}\Big{|}\tilde{G}\!=\!\tilde{g}\right)\!+\!\sigma^{2}\!+\!2\tilde{g}\sqrt{\varphi(\tilde{g})}\mathbb{E}\left(XS\Big{|}\tilde{G}\!=\!\tilde{g}\right)\!+\!2\tilde{g}\sqrt{\varphi(\tilde{g})}\mathbb{E}\left(XV\Big{|}\tilde{G}\!=\!\tilde{g}\right)\!
+2𝔼(SV|G~=g~)\displaystyle\quad{}+\!2\mathbb{E}\left(SV\Big{|}\tilde{G}\!=\!\tilde{g}\right) (123)
=g~2φ(g~)+𝔼(S2|G~=g~)+σ2\displaystyle=\tilde{g}^{2}{\varphi(\tilde{g})}+\mathbb{E}\left(S^{2}\Big{|}\tilde{G}=\tilde{g}\right)+\sigma^{2} (124)
g~2φ(g~)+σ2\displaystyle\geq\tilde{g}^{2}{\varphi(\tilde{g})}+\sigma^{2} (125)

where (124) follows from the weak union rule, since $X$ is independent of $(S,\tilde{G})$ and $V$ is independent of $(S,\tilde{G})$. Therefore, the conditional covariance matrix of $(X,Y)$ can be obtained from $\mathbb{E}X^{2}=1$, (121) and (125), and is of the same form as (108). Now, since $(\mathbf{x}(M),\tilde{\mathbf{g}},\mathbf{y})\in\mathcal{T}_{\epsilon}^{(n)}(X,\tilde{G},Y)$ and the conditional covariance matrix of $(X(M),Y)$ satisfies (108), with high probability $M\in\mathscr{S}$, and $P_{0}$ vanishes as $n\to\infty$.

Using (119) and the triangle inequality, we may upper bound $P_{1}$ as follows:

P1{𝐱(m)𝐆~φ(𝐆~)+𝐬+𝐕𝐱(m^)𝐆~φ(𝐆~)2𝐬+𝐕2+2nν for some m^𝒮{m}}.P_{1}\leq\\ \mathbb{P}\left\{\!\left\|\mathbf{x}(m)\!\circ\!\tilde{\mathbf{G}}\sqrt{\varphi(\tilde{\mathbf{G}})}\!+\!\mathbf{s}\!+\!\mathbf{V}\!-\!\mathbf{x}(\hat{m})\!\circ\!\tilde{\mathbf{G}}\sqrt{\varphi(\tilde{\mathbf{G}})}\right\|^{2}\!\!\!\leq\!\|\mathbf{s}\!+\!\mathbf{V}\|^{2}\!+\!2n\nu\text{ for some }\!\hat{m}\!\in\!\mathscr{S}\setminus\!\{m\}\!\right\}. (126)

Define the shorthand $\vec{X}=(XX^{\prime}S\tilde{G}V)$. Let $\mathcal{V}$ denote a finite $\epsilon$-dense subset of the set of all distributions of random vectors $\vec{X}$ that are determined by $P_{\tilde{G}}(\tilde{g})$ and a random vector $(XX^{\prime}SV)$ that is conditionally zero-mean Gaussian given $\tilde{G}$ with variances at most $(1,1,\Lambda,\sigma^{2})$. Note that because the distribution $P_{\tilde{G}}(\tilde{g})$ is completely known, the overall distribution of $\vec{X}$ is determined by the conditional covariance matrix of $(XX^{\prime}SV)$ given $\tilde{G}=\tilde{g}$ for each of the finitely many realizations $\tilde{g}$, so $\mathcal{V}$ only needs to cover a compact set. Now, we may upper bound $P_{1}$ by

X𝒱1Nm=1N𝔼G~[eX(m,𝐬,𝐆~)]\sum_{\vec{X}\in\mathcal{V}}\frac{1}{N}\sum_{m=1}^{N}\mathbb{E}_{\tilde{G}}\left[e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{G}})\right] (127)

where

eX(m,𝐬,𝐠~)={(𝐱(m),𝐱(m^),𝐬,𝐠~,𝐕)𝒯ϵ(n)(X),𝐠~φ(𝐠~)𝐱(m)+𝐬+𝐕𝐠~φ(𝐠~)𝐱(m^)2𝐬+𝐕2+2nν for some m^𝒮{m}}.e_{\vec{X}}\left(m,\mathbf{s},\tilde{\mathbf{g}}\right)=\mathbb{P}\bigg{\{}\left(\mathbf{x}(m),\mathbf{x}(\hat{m}),\mathbf{s},\tilde{\mathbf{g}},\mathbf{V}\right)\in\mathcal{T}_{\epsilon}^{(n)}\left(\vec{X}\right),\\ \left\|\tilde{\mathbf{g}}\!\circ\!\sqrt{\varphi(\tilde{\mathbf{g}})}\!\circ\!\mathbf{x}(m)\!+\!\mathbf{s}\!+\!\mathbf{V}\!-\!\tilde{\mathbf{g}}\!\circ\!\sqrt{\varphi(\tilde{\mathbf{g}})}\!\circ\!\mathbf{x}(\hat{m})\right\|^{2}\!\leq\!\|\mathbf{s}\!+\!\mathbf{V}\|^{2}\!+\!2n\nu\text{ for some }\ \hat{m}\in\mathscr{S}\setminus\{m\}\bigg{\}}. (128)

We will show that $\frac{1}{N}\sum_{m=1}^{N}e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}})\to 0$ for all vectors $\tilde{\mathbf{g}}$ and all vectors $(XX^{\prime}SV)$ which are Gaussian given $\tilde{G}$ (whether or not they are in $\mathcal{V}$). Let $Z=\tilde{G}\sqrt{\varphi(\tilde{G})}X+S+V-\tilde{G}\sqrt{\varphi(\tilde{G})}X^{\prime}$. We may restrict ourselves to $\vec{X}$ where

I(X;XSG~)<|RI(X;SG~)|++δ(ϵ),\displaystyle I(X;X^{\prime}S\tilde{G})<|R-I(X^{\prime};S\tilde{G})|^{+}+\delta(\epsilon), (129)
G~PG~(g~),\displaystyle\tilde{G}\sim P_{\tilde{G}}(\tilde{g}), (130)
(X,X,S,V) are zero mean Gaussian given G~,\displaystyle(X,X^{\prime},S,V)\text{ are zero mean Gaussian given }\tilde{G}, (131)
X,(S,G~),V are mutually independent,\displaystyle X,(S,\tilde{G}),V\text{ are mutually independent}, (132)
X,G~ are independent,\displaystyle X^{\prime},\tilde{G}\text{ are independent}, (133)
𝔼X2=𝔼X2=1,𝔼S2Λ,𝔼V2=σ2,\displaystyle\mathbb{E}X^{2}=\mathbb{E}X^{\prime 2}=1,\mathbb{E}S^{2}\leq\Lambda,\mathbb{E}V^{2}=\sigma^{2}, (134)
X,Z are independent given G~,\displaystyle X^{\prime},Z\text{ are independent given }\tilde{G}, (135)
𝔼[Z2|G~]σ2,\displaystyle\mathbb{E}\left[Z^{2}\Big{|}\tilde{G}\right]\geq\sigma^{2}, (136)
Var(Z)σ2+Λ+2ν.\operatorname{Var}(Z)\leq\sigma^{2}+\Lambda+2\nu. (137)

Note that using (20), we only need to consider distributions that satisfy (129). In addition, (130)–(131) are obtained from the definition of $\mathscr{S}$, (132) holds since the codebook $X$, the Gaussian noise $V$ and the fading gains $\tilde{G}$ are generated independently, and the adversary signal $S$ may depend on $\tilde{G}$ but not on the others, (133) follows from (109), (134) follows from the power constraints of the codebook and the adversary and from the distribution of the noise, (135)–(136) follow from (111)–(112), and (137) follows from (128). Let $\psi(\tilde{g})=\mathbb{E}[Z^{2}|\tilde{G}=\tilde{g}]-\sigma^{2}$. Therefore, using (136) we have $\psi(\tilde{g})\geq 0$, and by (137) we get $\mathbb{E}\psi(\tilde{G})=\operatorname{Var}(Z)-\sigma^{2}\leq\Lambda+2\nu$.

Observe that if $I(XV;X^{\prime}S|\tilde{G})=0$, then we would have

0\displaystyle 0 =𝔼[XZ|G~]\displaystyle=\mathbb{E}\left[X^{\prime}Z|\tilde{G}\right] (138)
=𝔼[X(G~φ(G~)X+S+VG~φ(G~)X)|G~]\displaystyle=\mathbb{E}\left[X^{\prime}\left(\tilde{G}\sqrt{\varphi(\tilde{G})}X+S+V-\tilde{G}\sqrt{\varphi(\tilde{G})}X^{\prime}\right)\bigg{|}\tilde{G}\right] (139)
=𝔼[XS|G~]𝔼[G~φ(G~)X2|G~]\displaystyle=\mathbb{E}\left[X^{\prime}S\Big{|}\tilde{G}\right]-\mathbb{E}\left[\tilde{G}\sqrt{\varphi(\tilde{G})}X^{\prime 2}\bigg{|}\tilde{G}\right] (140)
=𝔼[XS|G~]G~φ(G~)\displaystyle=\mathbb{E}\left[X^{\prime}S\Big{|}\tilde{G}\right]-\tilde{G}\sqrt{\varphi(\tilde{G})} (141)

where (138) follows from (135), (140) follows from the assumption $I(XV;X^{\prime}S|\tilde{G})=0$, which implies that $X^{\prime}$ is independent of $(X,V)$ given $\tilde{G}$, and (141) holds since $X^{\prime}$ is independent of $\tilde{G}$. Therefore, $\mathbb{E}[X^{\prime}S|\tilde{G}]=\tilde{G}\sqrt{\varphi(\tilde{G})}$, and the covariance matrix of $(S,X^{\prime})$ given $\tilde{G}$ is equal to

Cov(S,X|G~)=[𝔼[S2|G~]G~φ(G~)G~φ(G~)1].\displaystyle\operatorname{Cov}\left(S,X^{\prime}\Big{|}\tilde{G}\right)=\begin{bmatrix}\mathbb{E}\left[S^{2}\Big{|}\tilde{G}\right]&\tilde{G}\sqrt{\varphi(\tilde{G})}\\ \tilde{G}\sqrt{\varphi(\tilde{G})}&1\end{bmatrix}. (142)

The determinant of $\operatorname{Cov}(S,X^{\prime}|\tilde{G})$ is $\mathbb{E}[S^{2}|\tilde{G}]-\tilde{G}^{2}\varphi(\tilde{G})$, which must be non-negative since the covariance matrix is positive semi-definite. Thus, its expectation is also non-negative:

0\displaystyle 0 𝔼S2𝔼G~2φ(G~).\displaystyle\leq\mathbb{E}S^{2}-\mathbb{E}\tilde{G}^{2}\varphi(\tilde{G}). (143)

However, since $\mathbb{E}S^{2}\leq\Lambda$, (143) contradicts the initial assumption on $\varphi$ in (104). Thus, there exists $\eta>0$ such that

ηI(XV;XS|G~)=I(XV;X|SG~)\eta\leq I(XV;X^{\prime}S|\tilde{G})=I(XV;X^{\prime}|S\tilde{G}) (144)

where we have used the fact that $I(XV;S|\tilde{G})=0$, which follows from (132).

The probability $e_{\vec{X}}$ may be upper bounded by

eX(m,𝐬,𝐠~)\displaystyle e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}}) m^:(𝐱(m),𝐱(m^),𝐬,𝐠~)𝒯ϵ(n)(X,X,S,G~){(𝐱(m),𝐱(m^),𝐬,𝐠~,𝐕)𝒯ϵ(n)(X,X,S,G~,V)}\displaystyle\leq\sum_{\hat{m}:(\mathbf{x}(m),\mathbf{x}(\hat{m}),\mathbf{s},\tilde{\mathbf{g}})\in\mathcal{T}_{\epsilon}^{(n)}(X,X^{\prime},S,\tilde{G})}\mathbb{P}\left\{\!(\mathbf{x}(m),\mathbf{x}(\hat{m}),\mathbf{s},\tilde{\mathbf{g}},\mathbf{V})\!\in\!\mathcal{T}_{\epsilon}^{(n)}(X,X^{\prime},S,\tilde{G},V)\!\right\} (145)
exp{n[|RI(X;XSG~)|+I(V;X|XSG~)+δ(ϵ)]\displaystyle\leq\exp\big{\{}n\big{[}|R\!-\!I(X^{\prime};XS\tilde{G})|^{+}\!\!-I(V;X^{\prime}|XS\tilde{G})\!+\!\delta(\epsilon)\big{]} (146)

where (146) follows from (19) and the joint typicality lemma.

We consider the following two cases.

Case (a): $R<I(X^{\prime};S\tilde{G})$. Applying this condition to (129), we get

δ(ϵ)\displaystyle\delta(\epsilon) >I(X;XSG~)\displaystyle>I(X;X^{\prime}S\tilde{G}) (147)
=I(X;X|SG~).\displaystyle=I(X;X^{\prime}|S\tilde{G}). (148)

Since $I(X^{\prime};S\tilde{G})\leq I(X^{\prime};XS\tilde{G})$, we have $R-I(X^{\prime};XS\tilde{G})<0$. Considering (146), for any $m,\mathbf{s},\tilde{\mathbf{g}}$ we have

eX(m,𝐬,𝐠~)\displaystyle e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}}) exp{n(I(V;X|XSG~)δ(ϵ))}\displaystyle\leq\exp\left\{-n\left(I(V;X^{\prime}|XS\tilde{G})-\delta(\epsilon)\right)\right\} (149)
=exp{n(I(XV;X|SG~)I(X;X|SG~)δ(ϵ))}\displaystyle=\exp\{-n(I(XV;X^{\prime}|S\tilde{G})-I(X;X^{\prime}|S\tilde{G})-\delta(\epsilon))\} (150)
exp{n(η2δ(ϵ))}\displaystyle\leq\exp\{-n(\eta-2\delta(\epsilon))\} (151)

where (151) follows from (144) and (148). Therefore, $e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}})$ vanishes exponentially fast if $\delta(\epsilon)$ is sufficiently small.

Case (b): $R\geq I(X^{\prime};S\tilde{G})$. Then we may apply this condition to (129) as

R\displaystyle R >I(X;XSG~)+I(X;SG~)δ(ϵ)\displaystyle>I(X;X^{\prime}S\tilde{G})+I(X^{\prime};S\tilde{G})-\delta(\epsilon) (152)
I(X;X|SG~)+I(X;SG~)δ(ϵ)\displaystyle\geq I(X;X^{\prime}|S\tilde{G})+I(X^{\prime};S\tilde{G})-\delta(\epsilon) (153)
=I(X;XSG~)δ(ϵ).\displaystyle=I(X^{\prime};XS\tilde{G})-\delta(\epsilon). (154)

Since $R-I(X^{\prime};XS\tilde{G})+\delta(\epsilon)>0$, we may upper bound (146) by

eX(m,𝐬,𝐠~)\displaystyle e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}}) exp(n[RI(X;XSG~)I(V;X|XSG~)+2δ(ϵ)])\displaystyle\leq\exp\left(n\left[R\!-\!I(X^{\prime};XS\tilde{G})\!-\!I(V;X^{\prime}|XS\tilde{G})\!+\!2\delta(\epsilon)\right]\right) (155)
=exp(n[RI(X;XSG~V)+2δ(ϵ)])\displaystyle=\!\exp(n[R-I(X^{\prime};XS\tilde{G}V)+2\delta(\epsilon)]) (156)
exp(n[RI(X;XSV|G~)+2δ(ϵ)])\displaystyle\leq\!\exp(n[R-I(X^{\prime};XSV|\tilde{G})+2\delta(\epsilon)]) (157)

where the last step holds since, by (133), we have $I(X^{\prime};\tilde{G})=0$. In the following, we find a lower bound on the mutual information in (157).

I(X;XSV|G~)\displaystyle I\left(X^{\prime};XSV\Big{|}\tilde{G}\right) I(X;G~φ(G~)X+S+V|G~)\displaystyle\geq I\left(X^{\prime};\tilde{G}\sqrt{\varphi(\tilde{G})}X\!+\!S\!\!+\!\!V\Big{|}\tilde{G}\right) (158)
=I(X;Z+G~φ(G~)X|G~)\displaystyle=I\left(X^{\prime};Z+\tilde{G}\sqrt{\varphi(\tilde{G})}X^{\prime}\Big{|}\tilde{G}\right) (159)
=h(Z+G~φ(G~)X|G~)h(Z+G~φ(G~)X|G~,X)\displaystyle=h\left(Z+\tilde{G}\sqrt{\varphi(\tilde{G})}X^{\prime}\Big{|}\tilde{G}\right)-h\left(Z+\tilde{G}\sqrt{\varphi(\tilde{G})}X^{\prime}\Big{|}\tilde{G},X^{\prime}\right) (160)
=𝔼G~[12log2πe(G~2φ(G~)+𝔼[Z2|G~])12log2πe𝔼[Z2|G~]]\displaystyle=\mathbb{E}_{\tilde{G}}\left[\frac{1}{2}\log 2\pi e\left(\tilde{G}^{2}{\varphi(\tilde{G})}+\mathbb{E}\left[Z^{2}\Big{|}\tilde{G}\right]\right)-\frac{1}{2}\log 2\pi e\mathbb{E}\left[Z^{2}\Big{|}\tilde{G}\right]\right] (161)
=𝔼G~[C(G~2φ(G~)𝔼[Z2|G~])]\displaystyle=\mathbb{E}_{\tilde{G}}\left[C\left(\frac{\tilde{G}^{2}\varphi(\tilde{G})}{\mathbb{E}\left[Z^{2}\Big{|}\tilde{G}\right]}\right)\right] (162)
=𝔼G~[C(G~2φ(G~)ψ(G~)+σ2+2ν)]\displaystyle=\mathbb{E}_{\tilde{G}}\left[C\left(\frac{\tilde{G}^{2}\varphi(\tilde{G})}{\psi(\tilde{G})+\sigma^{2}+2\nu}\right)\right] (163)

where (158) follows from the data processing inequality, (162) follows from the standard argument for the capacity of the Gaussian channel, and (163) follows from the definition of $\psi$. Therefore, by the assumptions on $R$ and $\Lambda$ in (104)–(105), $R<I(X^{\prime};XSV|\tilde{G})$, so by (157) $e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}})$ vanishes exponentially if $\delta(\epsilon)$ and $\nu$ are sufficiently small.

It is worth mentioning that this achievability proof also works for the case where both the adversary and the encoder know the channel gains causally, or where both know the gains non-causally. Since in all three cases the encoder knows no more than the adversary, the jammer is able to impersonate the legitimate transmitter and thereby symmetrize the channel, depending on the power allocation; this is why the condition (104) is imposed.

VIII-C Achievability Proof (Gains Available Causally at Adversary and Non-causally at Encoder)

In this case, both the encoder and the decoder know the channel gains non-causally, meaning that they know the entire sequence $\mathbf{g}=(g_{1},g_{2},\cdots,g_{n})$. However, the adversary only knows the gains causally, so at time $i$ it only has access to $(g_{1},g_{2},\cdots,g_{i})$. Therefore, the encoder and the decoder share some extra common information $(g_{i+1},g_{i+2},\cdots,g_{n})$ that the adversary does not know. In particular, the encoder and the decoder know $g_{n}$ immediately, whereas the adversary learns it only at time $n$. Hence, we can leverage this common knowledge between the encoder and the decoder as common randomness that is unknown to the jammer. Moreover, by the assumption that $G$ is a continuous random variable with positive variance, $G_{n}$ alone has infinite entropy, and can thus be viewed as a source of an unlimited number of bits of common randomness. Therefore, we proceed to provide an achievability proof in which the encoder and decoder are assumed to share an infinite source of common randomness. However, note that implementing this approach would require measuring $G_{n}$ to an arbitrary level of precision, which is not practical. Even so, the random code reduction technique of, for example, [15, Lemma 12.8], can be used to show that only $O(\log n)$ bits of common randomness need to be extracted from $G_{n}$ (or perhaps $G_{n-k},\ldots,G_{n}$ for some $k$) in order to achieve the same rate.
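As a toy illustration of this idea (not the scheme used in the proof, and ignoring the measurement-precision caveat above), the following Python sketch extracts $k=O(\log n)$ shared bits from the binary expansion of the fractional part of $G_{n}$, which both the encoder and decoder can compute but the causal adversary cannot predict.

```python
# Toy extraction of O(log n) common-randomness bits from the last gain value.
import numpy as np

def shared_bits(g_n: float, k: int) -> list[int]:
    frac = g_n - np.floor(g_n)                  # fractional part of the last gain
    return [int(np.floor(frac * 2**(i + 1)) % 2) for i in range(k)]

n = 1024
k = int(np.ceil(np.log2(n)))                    # O(log n) bits suffice (per [15, Lemma 12.8])
g_n = 1.7372593                                 # a hypothetical gain realization
print(shared_bits(g_n, k))                      # identical lists at encoder and decoder
```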

A large part of the achievability proof is identical to that of Sec. VIII-B. The main difference is that the codebook is based on common randomness between the encoder and decoder, so we denote the codebook by random variables $\mathbf{X}(m)$ which are independent of the jammer signal. As a consequence, the symmetrizability condition $\Lambda<\mathbb{E}\tilde{G}^{2}\varphi(\tilde{G})$ is not needed, and the proof is somewhat simpler. In particular, we fix a $\varphi(\cdot)$ satisfying (103) and a rate satisfying (105), and we prove achievability as follows.

Codebook generation: Let $\mathbf{X}(m)$ be a Gaussian codebook with variance $1-\gamma$ satisfying (16). This random codebook is generated from the infinite source of common randomness, so it is unknown to the adversary.

Encoding: Given message $m$ and gain sequence $\mathbf{g}$, the transmitter first computes $\tilde{\mathbf{g}}$ from the quantization function, and then sends $\sqrt{\varphi(\tilde{\mathbf{g}})}\circ\mathbf{X}(m)$ (at time $i$, the signal $\sqrt{\varphi(\tilde{g}_{i})}X_{i}(m)$ is sent) if $\|\sqrt{\varphi(\tilde{\mathbf{g}})}\circ\mathbf{X}(m)\|^{2}\leq n$; otherwise, it sends zero.

Decoding: Given $\mathbf{y}$ and $\mathbf{g}$, let $\nu<\epsilon$ and let $\mathscr{S}$ be the set of messages $\hat{m}$ such that $(\mathbf{X}(\hat{m}),\tilde{\mathbf{g}},\mathbf{y})\in\mathcal{T}_{\epsilon}^{(n)}(X^{\prime},\tilde{G},Y)$, where $\tilde{G}$ is the quantized random variable obtained from $G$ and $(X^{\prime},Y)$ are conditionally Gaussian given $\tilde{G}=\tilde{g}$ with zero mean and covariance matrix $\Sigma_{\tilde{g}}$ as follows:

\Sigma_{\tilde{g}}=\operatorname{Cov}\left(X^{\prime},Y\Big|\tilde{G}=\tilde{g}\right)=\begin{bmatrix}1&\tilde{g}\sqrt{\varphi(\tilde{g})}\\ \tilde{g}\sqrt{\varphi(\tilde{g})}&a_{\tilde{g}}\end{bmatrix} \qquad (166)

where $a_{\tilde{g}}\geq\tilde{g}^{2}\varphi(\tilde{g})+\sigma^{2}$. Note that the following can be shown from (166):

X is independent of G~,\displaystyle X^{\prime}\text{ is independent of }{\tilde{G}}, (167)
𝔼X2=1,\displaystyle\mathbb{E}X^{\prime 2}=1, (168)
YG~φ(G~)X is independent of X given G~,\displaystyle Y-{\tilde{G}}\sqrt{\varphi({\tilde{G}})}X^{\prime}\text{ is independent of }X^{\prime}\text{ given }{\tilde{G}}, (169)
Var(YG~φ(G~)X|G~)σ2.\displaystyle\operatorname{Var}\left(Y-{\tilde{G}}\sqrt{\varphi({\tilde{G}})}X^{\prime}\bigg{|}{\tilde{G}}\right)\geq\sigma^{2}. (170)

Now, we define the decoding function as

Θ(𝐲,𝐠~)\displaystyle\Theta(\mathbf{y},\tilde{\mathbf{g}}) =argminm^𝒮𝐲𝐠~φ(𝐠~)𝐗(m^)2.\displaystyle=\operatorname*{arg\,min}_{\hat{m}\in\mathscr{S}}\left\|\mathbf{y}-\tilde{\mathbf{g}}\circ\sqrt{\varphi(\tilde{\mathbf{g}})}\circ\mathbf{X}(\hat{m})\right\|^{2}. (171)

Analysis of the probability of error: Assume the legitimate transmitter sends message $M$. Then we can upper bound the probability of error by the sum of the following error probabilities:

P0\displaystyle P_{0} ={M𝒮},\displaystyle=\mathbb{P}\left\{M\notin\mathscr{S}\right\}, (172)
P1\displaystyle P_{1} ={𝐘𝐆~φ(𝐆~)𝐗(m^)2𝐬+𝐕2 for some m^𝒮{M}}.\displaystyle=\mathbb{P}\bigg{\{}\left\|\mathbf{Y}-\tilde{\mathbf{G}}\circ\sqrt{\varphi(\tilde{\mathbf{G}})}\circ\mathbf{X}(\hat{m})\right\|^{2}\leq\left\|\mathbf{s}+\mathbf{V}\right\|^{2}\text{ for some }\hat{m}\in\mathscr{S}\setminus\{M\}\bigg{\}}. (173)

Using the same argument as in Sec. VIII-B, we can show that $P_{0}$ tends to zero as $n\to\infty$. Using (119) and the triangle inequality, we may upper bound $P_{1}$ as follows:

P1{𝐗(m)𝐆~φ(𝐆~)+𝐬+𝐕𝐗(m^)𝐆~φ(𝐆~)2𝐬+𝐕2+2nν for some m^𝒮{m}}.P_{1}\leq\\ \mathbb{P}\left\{\!\left\|\mathbf{X}(m)\!\circ\!\tilde{\mathbf{G}}\sqrt{\varphi(\tilde{\mathbf{G}})}\!+\!\mathbf{s}\!+\!\mathbf{V}\!-\!\mathbf{X}(\hat{m})\!\circ\!\tilde{\mathbf{G}}\sqrt{\varphi(\tilde{\mathbf{G}})}\right\|^{2}\!\!\!\leq\!\|\mathbf{s}\!+\!\mathbf{V}\|^{2}\!+\!2n\nu\text{ for some }\!\hat{m}\!\in\!\mathscr{S}\setminus\!\{m\}\!\!\right\}. (174)

Defining $\vec{X}=(XX^{\prime}S\tilde{G}V)$ and $\mathcal{V}$ as in Sec. VIII-B, we may now upper bound $P_{1}$ by

X𝒱1Nm=1N𝔼G~[eX(m,𝐬,𝐆~)]\sum_{\vec{X}\in\mathcal{V}}\frac{1}{N}\sum_{m=1}^{N}\mathbb{E}_{{\tilde{G}}}\left[e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{G}})\right] (175)

where

e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}})=\mathbb{P}\bigg\{(\mathbf{X}(m),\mathbf{X}(\hat{m}),\mathbf{s},\tilde{\mathbf{g}},\mathbf{V})\in\mathcal{T}_{\epsilon}^{(n)}(\vec{X}),\ \left\|\tilde{\mathbf{g}}\circ\sqrt{\varphi(\tilde{\mathbf{g}})}\circ\mathbf{X}(m)+\mathbf{s}+\mathbf{V}-\tilde{\mathbf{g}}\circ\sqrt{\varphi(\tilde{\mathbf{g}})}\circ\mathbf{X}(\hat{m})\right\|^{2}\leq\|\mathbf{s}+\mathbf{V}\|^{2}+2n\nu\text{ for some }\hat{m}\in\mathscr{S}\setminus\{m\}\bigg\}, \qquad (176)

and $\vec{X}$ satisfies the same properties as in (129)–(137). Now, it suffices to show that $\frac{1}{N}\sum_{m=1}^{N}e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}})$ vanishes for all typical vectors $\mathbf{g}$ and all vectors $(XX^{\prime}SV)$ which are Gaussian given $\tilde{G}$ (whether or not they are in $\mathcal{V}$).

Using the joint typicality lemma in [14, Remark 2.2], we may upper bound $e_{\vec{X}}$ in (176) (i.e., (128) with the random codewords $\mathbf{X}(m)$ and $\mathbf{X}(\hat{m})$) as follows:

eX(m,𝐬,𝐠~)\displaystyle e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}}) m^𝒮{m}{(𝐗(m),𝐗(m^),𝐬,𝐠~,𝐕)𝒯ϵ(n)(X,X,S,G~,V)}\displaystyle\leq\sum_{\hat{m}\in\mathscr{S}\setminus\{m\}}\mathbb{P}\left\{\!(\mathbf{X}(m),\mathbf{X}(\hat{m}),\mathbf{s},\tilde{\mathbf{g}},\mathbf{V})\!\in\!\mathcal{T}_{\epsilon}^{(n)}(X,X^{\prime},S,{\tilde{G}},V)\!\right\} (177)
exp{n(RI(X;XSVG~)+ϵ)}\displaystyle\leq\exp\{n(R-I(X^{\prime};XSV{\tilde{G}})+\epsilon)\} (178)
=exp{n(RI(X;XSV|G~)+ϵ)}\displaystyle=\!\exp\{n(R-I(X^{\prime};XSV|{\tilde{G}})+\epsilon)\} (179)

where in (178) we have used the fact that $\mathbf{X}(\hat{m})$ is independent of $(\mathbf{X}(m),\mathbf{s},\tilde{\mathbf{g}},\mathbf{V})$, and (179) follows from (167). We now lower bound the mutual information in (179) as follows.

I(X;XSV|G~)\displaystyle I(X^{\prime};XSV|{\tilde{G}}) I(X;G~φ(G~)X+S+V|G~)\displaystyle\geq I\left(X^{\prime};{\tilde{G}}\sqrt{\varphi({\tilde{G}})}X+S+V\Big{|}{\tilde{G}}\right) (180)
=I(X;Z+G~φ(G~)X|G~)\displaystyle=I\left(X^{\prime};Z+{\tilde{G}}\sqrt{\varphi({\tilde{G}})}X^{\prime}\Big{|}{\tilde{G}}\right) (181)
=h(Z+G~φ(G~)X|G~)h(Z+G~φ(G~)X|G~,X)\displaystyle=h\left(Z+{\tilde{G}}\sqrt{\varphi({\tilde{G}})}X^{\prime}\Big{|}{\tilde{G}}\right)-h\left(Z+{\tilde{G}}\sqrt{\varphi({\tilde{G}})}X^{\prime}\Big{|}{\tilde{G}},X^{\prime}\right) (182)
=𝔼G~[12log2πe(G~2φ(G~)𝔼[X2|G~]+𝔼[Z2|G~])12log2πe𝔼[Z2|G~]]\displaystyle=\mathbb{E}_{\tilde{G}}\bigg{[}\frac{1}{2}\log 2\pi e\left({\tilde{G}}^{2}\varphi({\tilde{G}})\mathbb{E}\left[X^{\prime 2}\Big{|}{\tilde{G}}\right]+\mathbb{E}\left[Z^{2}\Big{|}{\tilde{G}}\right]\right)-\frac{1}{2}\log 2\pi e\mathbb{E}\left[Z^{2}\Big{|}{\tilde{G}}\right]\bigg{]} (183)
=𝔼G~[C(G~2φ(G~)𝔼[Z2|G~])]\displaystyle=\mathbb{E}_{\tilde{G}}\left[C\left(\frac{{\tilde{G}}^{2}\varphi({\tilde{G}})}{\mathbb{E}\left[Z^{2}\Big{|}{\tilde{G}}\right]}\right)\right] (184)
=𝔼G~[C(G~2φ(G~)ψ(G~)+σ2+2ν)]\displaystyle=\mathbb{E}_{\tilde{G}}\left[C\left(\frac{{\tilde{G}}^{2}\varphi({\tilde{G}})}{\psi({\tilde{G}})+\sigma^{2}+2\nu}\right)\right] (185)

where (180) follows from the data processing inequality, (184) follows from the standard argument for the capacity of the Gaussian channel, and (185) follows from the definition of $\psi$. Therefore, by the assumptions on $R$ and $\Lambda$ in (103) and (105), $R<I(X^{\prime};XSV|\tilde{G})$, so by (179) $e_{\vec{X}}(m,\mathbf{s},\tilde{\mathbf{g}})$ vanishes exponentially if $\delta(\epsilon)$ and $\nu$ are sufficiently small.

IX Conclusion

This paper studied together two phenomena that are usually studied separately: active adversaries and fading. We derived the capacity of Gaussian arbitrarily-varying fading channels where the adversary knows the transmitter's code but not the exact transmitted signal. The adversary affects the capacity by increasing the noise variance by an amount related to the adversary's power. The capacity also depends on whether the transmitter and/or the adversary know the fading gains. The transmitter uses its knowledge of the gains to maximize the capacity, while the adversary uses its knowledge to minimize it. Furthermore, if the adversary's knowledge is at least that of the transmitter, then the adversary can drive the capacity to zero given enough power. In this paper, we have focused on the scenario where fading applies to the transmitter-to-receiver path, but not the adversary-to-receiver path. Future work could include considering fading along both paths; such an alternative model would present somewhat different challenges. Other directions include considering fading and adversaries in network settings, or an adversary with some direct control over the fading gains.

Acknowledgment

This material is based upon work supported by the National Science Foundation under Grant No. CCF-1453718.

Appendix A Proof of Lemma 3

In order to prove (16), we use our proof in [16, Lemma 6] for a single codebook. Moreover, to obtain (19)–(20), we apply the corresponding proofs of [8, Lemma 1] to Gaussian distributions. Note that [8] focuses on discrete alphabets, but the same proofs can be extended to Gaussian distributions by quantizing the continuous random variables in the following way.

Let $\mathbf{X}_{i}$ be i.i.d. Gaussian $n$-length random vectors (the codebook), independent of each other, with $\operatorname{Var}(X)=1$. First, let $\mathbf{g}\in\mathbb{R}^{n}$ be a typical realization of $n$ i.i.d. copies of the continuous random variable $G$ with probability density function $f_{G}(g)$. Next, we quantize the set of all $\mathbf{g}\in\mathbb{R}^{n}$ into a $\nu$-dense subset $\mathcal{G}^{n}$. For a fixed $\mathbf{g}\in\mathcal{G}^{n}$, fix $\mathbf{x}\in\mathcal{T}_{\epsilon}^{(n)}(X)$, $\mathbf{s}\in\mathscr{U}^{n}$ and a covariance matrix $\operatorname{Cov}(X,X^{\prime},S|G=g)\in\mathcal{V}^{3\times 3}$, where $\mathscr{U}^{n}$ is a $\nu$-dense subset of $\mathbb{R}^{n}$ for the vectors $\mathbf{s}$ with $\|\mathbf{s}\|^{2}\leq n\Lambda$, and $\mathcal{V}^{3\times 3}$ is a $\nu$-dense subset of $\mathbb{R}^{3\times 3}$ for positive definite covariance matrices with diagonal entries at most $(1,1,\Lambda)$.

Using a proof similar to that of [8, Lemma 1], we obtain that for given $(\mathbf{x},\mathbf{s},\mathbf{g})$ and covariance matrix $\operatorname{Cov}(X,X^{\prime},S|G=g)$, the complement of each event in (19)–(20) occurs with doubly-exponentially decaying probability for sufficiently large $n$; that is,

{|{m:(𝐱,𝐱(m),𝐬,𝐠)𝒯ϵ(n)(X,X,S,G)}|exp{n[|RI(X;XSG)|++δ(ϵ)]}}\displaystyle\mathbb{P}\Big{\{}\!\big{|}\big{\{}\!m^{\prime}\!:\!(\mathbf{x},\mathbf{x}(m^{\prime}),\mathbf{s},\mathbf{g})\!\in\!\mathcal{T}_{\epsilon}^{(n)}(X,X^{\prime},S,G)\!\big{\}}\!\big{|}\!\leq\!\exp\big{\{}n\big{[}|R-I(X^{\prime};XSG)|^{+}+\delta(\epsilon)\big{]}\big{\}}\Big{\}}
<exp(exp(nσ(ϵ))),\displaystyle\hskip 260.0004pt<\exp(-\exp(n\sigma(\epsilon))), (186)
{1N|{m:(𝐱(m),𝐱(m),𝐬,𝐠)𝒯ϵ(n)(X,X,S,G) for some mm}|2exp{nδ(ϵ)/2}}\displaystyle\mathbb{P}\bigg{\{}\frac{1}{N}\big{|}\big{\{}\!m\!:\!(\mathbf{x}(m),\mathbf{x}(m^{\prime}),\mathbf{s},\mathbf{g})\!\in\!\mathcal{T}_{\epsilon}^{(n)}\!(X,X^{\prime}\!,S,G)\text{ for some }m^{\prime}\!\neq\!m\!\big{\}}\big{|}\leq\!2\exp\{\!-n\delta(\epsilon)/2\!\}\bigg{\}}
<exp(exp(nσ(ϵ))), if I(X;XSG)|RI(X;SG)|++δ(ϵ).\displaystyle<\exp(-\exp(n\sigma(\epsilon))),\text{ if }I(X;X^{\prime}SG)\!\geq\!|R\!-\!I(X^{\prime};SG)|^{+}\!\!+\!\delta(\epsilon). (187)

Then, to complete the proof, note that for any fixed $\nu$ the cardinality of the finite set $\mathscr{U}^{n}$ grows only exponentially in $n$, and the set $\mathcal{V}^{3\times 3}$ is finite; combined with the doubly-exponentially decaying probabilities in (186)–(187), this implies that, with probability approaching $1$, all inequalities in (19)–(20) hold simultaneously for sufficiently large $n$. Since these inequalities hold for every element of the finite sets $\mathscr{U}^{n}$ and $\mathcal{V}^{3\times 3}$, for any vectors $\mathbf{s},\mathbf{x}$ and any given covariance matrix $\operatorname{Cov}(X,X^{\prime},S|G=g)$ (with $\|\mathbf{x}\|^{2}=n$, $\|\mathbf{s}\|^{2}\leq n\Lambda$) not in the corresponding $\nu$-dense subset, there exists a point in that subset within distance $\nu$ of it. Now, by the continuity of all quantities involved, we may conclude that (19)–(20) also hold for any point which is not in the $\nu$-dense subset.

References

  • [1] F. Hosseinigoki and O. Kosut, “Capacity of Gaussian arbitrarily-varying fading channels,” in 2019 53rd Annual Conference on Information Sciences and Systems (CISS), March 2019, pp. 1–6.
  • [2] A. J. Goldsmith and P. P. Varaiya, “Capacity of fading channels with channel side information,” IEEE Transactions on Information Theory, vol. 43, no. 6, pp. 1986–1992, Nov 1997.
  • [3] I. Csiszár and P. Narayan, “The capacity of the arbitrarily varying channel revisited: positivity, constraints,” IEEE Transactions on Information Theory, vol. 34, no. 2, pp. 181–193, March 1988.
  • [4] ——, “Arbitrarily varying channels with constrained inputs and states,” IEEE Transactions on Information Theory, vol. 34, no. 1, pp. 27–34, Jan 1988.
  • [5] ——, “Capacity of the Gaussian arbitrarily varying channel,” IEEE Transactions on Information Theory, vol. 37, no. 1, pp. 18–26, Jan 1991.
  • [6] J. A. Gubner, “On the capacity region of the discrete additive multiple-access arbitrarily varying channel,” IEEE Transactions on Information Theory, vol. 38, no. 4, pp. 1344–1347, July 1992.
  • [7] F. Hosseinigoki and O. Kosut, “The Gaussian interference channel in the presence of a malicious jammer,” in 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Sept 2016, pp. 679–686.
  • [8] B. L. Hughes, “The smallest list for the arbitrarily varying channel,” IEEE Transactions on Information Theory, vol. 43, no. 3, pp. 803–815, May 1997.
  • [9] A. D. Sarwate and M. Gastpar, “List-decoding for the arbitrarily varying channel under state constraints,” IEEE Transactions on Information Theory, vol. 58, no. 3, pp. 1372–1384, March 2012.
  • [10] F. Hosseinigoki and O. Kosut, “Capacity of the Gaussian arbitrarily-varying channel with list decoding,” in 2018 IEEE International Symposium on Information Theory (ISIT), June 2018, pp. 471–475.
  • [11] J. Barros and M. R. D. Rodrigues, “Secrecy capacity of wireless channels,” in 2006 IEEE International Symposium on Information Theory, July 2006, pp. 356–360.
  • [12] P. Wang, G. Yu, and Z. Zhang, “On the secrecy capacity of fading wireless channel with multiple eavesdroppers,” in 2007 IEEE International Symposium on Information Theory, June 2007, pp. 1301–1305.
  • [13] Z. Li, R. Yates, and W. Trappe, “Achieving secret communication for fast Rayleigh fading channels,” IEEE Transactions on Wireless Communications, vol. 9, no. 9, pp. 2792–2799, September 2010.
  • [14] A. El Gamal and Y.-H. Kim, Network Information Theory.   Cambridge University Press, 2011.
  • [15] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems.   Cambridge University Press, 2011.
  • [16] F. Hosseinigoki and O. Kosut, “The Gaussian interference channel in the presence of malicious jammers,” [Online] arXiv:1712.04133, Dec. 2017, submitted to IEEE Trans. on Information Theory.