
Optimal Correlators and Waveforms for Mismatched Detection

Neri Merhav
Abstract

We consider the classical Neyman–Pearson hypothesis testing problem of signal detection, where under the null hypothesis (${\cal H}_0$), the received signal is white Gaussian noise, and under the alternative hypothesis (${\cal H}_1$), the received signal also includes an additional non-Gaussian random signal, which in turn can be viewed as a deterministic waveform plus zero-mean, non-Gaussian noise. However, instead of the classical likelihood ratio test detector, which might be difficult to implement in general, we impose a (mismatched) correlation detector, which is relatively easy to implement, and we characterize the optimal correlator weights in the sense of the best trade-off between the false-alarm error exponent and the missed-detection error exponent. Those optimal correlator weights depend (non-linearly, in general) on the underlying deterministic waveform under ${\cal H}_1$. We then assume that the deterministic waveform may also be free to be optimized (subject to a power constraint), jointly with the correlator, and show that both the optimal waveform and the optimal correlator weights may take on values in a small finite set of typically no more than two to four levels, depending on the distribution of the non-Gaussian noise component. Finally, we outline an extension of the scope to a wider class of detectors that are based on linear combinations of the correlation and the energy of the received signal.

Index terms: hypothesis testing, signal detection, correlation detection, error exponent.

The Andrew & Erna Viterbi Faculty of Electrical and Computer Engineering

Technion - Israel Institute of Technology

Technion City, Haifa 32000, ISRAEL

E–mail: [email protected]

1 Introduction

The topic of detection of signals corrupted by noise has a very long history of active research efforts, as it has an extremely wide spectrum of engineering applications in the areas of communications and signal processing. These include radar, sonar, light detection and ranging (LIDAR), object recognition in images and video streams, diagnosis based on biomedical signals, watermark detection in images and audio signals, seismological signal detection related to geophysical activity, and object detection using multispectral/hyperspectral imaging, just to name a few. One of the most problematic and frequently encountered issues in signal detection scenarios is mismatch between the signal model and the detector design, which is based upon certain assumptions on that model. Accordingly, the topic of mismatched signal detection has received considerable attention in the literature, see, e.g., [1], [10], [11], [18], [19], [20], [21], [28], and [30], for a non-exhaustive list of relevant references. The common theme in most of these works is the possible presence of uncertainties in the desired signal to be detected, in the steering vector, in the transfer function of the propagation medium, and/or in the distributions of the various kinds of noise, interference and clutter. Accordingly, adaptive detection mechanisms with tunable parameters have been developed and proposed in order to combat those types of mismatch.

Another line of earlier relevant research activity is associated with the notion of robust detection techniques, where the common theme is generally directed towards a worst-case design of the detector against small non-parametric uncertainties around some nominal noise distribution, most notably, a Gaussian distribution. See, e.g., [2], [4], [5], [9], [12], [13], [15], [16], [17], [22], [24], and [25]. See also [14] for a survey on the subject.

Last but not least, when the uncertainty is only in a finite number of parameters of the model, the problem is normally treated in the framework of composite hypothesis testing, where the popular approach is the well–known generalized likelihood ratio test (GLRT) [27], which is often (but not always) asymptotically optimal in the error exponent sense, see, for example, [3], [6], [7], and [29]. The GLRT is applied also in some of the above cited articles on mismatched detection, among many others. Another approach to composite hypothesis testing is the competitive minimax approach, proposed in [8].

Our objective in this work is partially related to those studies, but it is different. It is associated with mismatched detection, except that the origin of this mismatch is not quite due to uncertainty in the signal-plus-noise model, but it comes from practical considerations: the optimal likelihood ratio test (LRT) detector might be difficult to implement in many application examples, especially in the case of sensors that are built on small, mobile devices which are subjected to severe limitations on power and computational resources. In such situations, it is desirable that the detector would be as simple as possible, e.g., a correlation detector, or a detector that is based on correlation and energy. Within this framework, the number of arithmetic operations (especially the multiplications) should be made as small as possible. Clearly, a detector from this class cannot be optimal, unless the noise is Gaussian, hence the mismatch. Nonetheless, we would like to find the best correlator weights in the sense of optimizing the trade-off between the false-alarm (FA) and the missed–detection (MD) rates. This would partially compensate for the mismatch in case the noise is not purely Gaussian.

More precisely, consider the following signal detection problem, of distinguishing between two hypotheses:

${\cal H}_0:~~Y_t = N_t, \qquad t = 1,2,\ldots,n$ (1)
${\cal H}_1:~~Y_t = X_t + N_t, \qquad t = 1,2,\ldots,n$ (2)

where $\{N_t\}$ is an independent and identically distributed (IID), zero-mean Gaussian noise process with variance $\sigma_N^2$, independent of $\{X_t\}$, which is another random process that we decompose as $X_t = s_t + Z_t$, with $s_t = E\{X_t\}$ being a deterministic waveform and $Z_t = X_t - s_t$ being an IID, zero-mean noise process, which is not necessarily Gaussian in general. The non-Gaussian noise component, $\{Z_t\}$, can be thought of as signal-induced noise (SIN), which may stem from several possible mechanisms, such as: echoes of the desired signal, multiplicative noise, cross-talk from parallel channels conveying correlated signals, interference by jammers, and, in the case of optical detection using avalanche photo-diodes (APDs), shot noise plus multiplicative noise due to the random gain of the device (see, e.g., [23] and references therein for more details). In general, $\{Z_t\}$ may also designate randomness that could be attributed to uncertainty associated with the transmitted signal.
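To make the model concrete, the following sketch (with hypothetical parameter values; the Laplacian law of Case 2 in Section 2 is used for the SIN component) draws $Y_1,\ldots,Y_n$ under either hypothesis:

```python
import math
import random

def sample_observation(n, s, sigma_N, q, under_h1, rng):
    """Draw Y_1,...,Y_n under H0 (Gaussian noise only) or under H1
    (deterministic s_t plus Laplacian SIN Z_t plus Gaussian N_t)."""
    y = []
    for t in range(n):
        noise = rng.gauss(0.0, sigma_N)
        if under_h1:
            u = rng.random() - 0.5          # uniform on (-1/2, 1/2)
            # Laplacian variate with parameter q, by inverting its CDF
            z = -math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u)) / q
            y.append(s[t] + z + noise)      # Y_t = s_t + Z_t + N_t
        else:
            y.append(noise)                 # Y_t = N_t
    return y
```

Any other zero-mean law for $Z_t$ could be substituted for the Laplacian here.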

As mentioned above, the optimal LRT detector might be considerably difficult to implement in practice, since the probability density function (PDF) of $\{Y_t\}$ under ${\cal H}_1$ involves the convolution between the Gaussian PDF of $N_t$ and the (non-Gaussian) PDF of $Z_t$, which is typically complicated. As said, a reasonable practical compromise, valid when the underlying signal $\{s_t\}$ is not identically zero, is a correlation detector, which compares the correlation, $\sum_{t=1}^n w_t Y_t$, to a threshold, where $w_1,\ldots,w_n$ are referred to as the correlator weights, and the threshold controls the trade-off between the FA probability and the MD probability. Our first objective is to characterize the best correlator weights, $w_1^*,\ldots,w_n^*$, in the sense of the optimal trade-off between the FA probability and the MD probability, or more precisely, between the asymptotic exponential rates of decay of these probabilities as functions of the sample size $n$, i.e., the FA exponent and the MD exponent. Clearly, the optimal correlation detector is, in general, not as good as the optimal LRT detector, but it is the best compromise between performance and practical implementability within the framework of correlation detectors. A very similar study was already carried out in [23], in the context of optical signal detection using photo-detectors, where the optimal correlator waveform was characterized in terms of the optical transmitted signal in continuous time, and was found to be given by a certain non-linear function of the optical signal.

Here, we study the problem in a more general framework, in the sense that the PDF of the SIN, $Z_t$, is arbitrary. Moreover, we expand the scope in several directions, in addition to the study that is directly parallel to that of [23].

  1. We consider the possibility of limiting the number of levels of $\{w_t\}$ to be finite (e.g., binary, ternary, etc.), with the motivation of significantly reducing the number of multiplications needed to calculate the correlation, $\sum_t w_t Y_t$.

  2. We jointly optimize both the signal, $\{s_t\}$, and the correlator, $\{w_t\}$. Interestingly, here both the optimal signal and the optimal correlator weights turn out to have a finite number of levels even if this number is not restricted a priori. The number of levels depends on the PDF of $Z_t$, and it is typically very small (e.g., two to four levels). Moreover, the optimal $\{s_t\}$ and $\{w_t\}$ turn out to be proportional to each other, in contrast to the non-linear relation that results when only $\{w_t\}$ is optimized while $\{s_t\}$ is given.

  3. We outline an extension to a wider class of detectors that are based on linear combinations of the correlation, $\sum_t w_t Y_t$, and the energy, $\sum_t Y_t^2$, with the motivation that this is, in fact, the structure of the optimal detector when $Z_t$ is Gaussian noise, and that it is reasonable regardless, since under ${\cal H}_1$ the power (or the variance) of the received signal is larger than under ${\cal H}_0$ (in fact, when $s_t \equiv 0$, the correlation term becomes useless altogether and the energy term becomes necessary). We also address the possibility of replacing the energy term by the sum of absolute values, $\sum_t |Y_t|$, which is another measure of signal intensity, with the practical advantage that its calculation does not require multiplications.

The outline of the remaining part of this work is as follows. In Section 2, we formalize the problem rigorously and spell out our basic assumptions. In Section 3, we characterize the optimal correlator, $\{w_t^*\}$, for a given signal, $\{s_t\}$, subject to the power constraint. In Section 4, we address the problem of joint optimization of both $\{w_t\}$ and $\{s_t\}$, both under power constraints, and finally, in Section 5, we outline extensions to wider classes of detectors that are based on correlation and energy.

2 Assumptions and Preliminaries

Consider the signal detection model described in the fifth paragraph of the Introduction. We assume that $Z_1,\ldots,Z_n$ are independent copies of a zero-mean random variable (RV), $Z$, whose PDF, $f_Z(z)$, is symmetric around the origin (the symmetry assumption is imposed mostly for convenience; the results can be extended to address non-symmetric PDFs as well), and that it has a finite cumulant generating function (CGF),

$C(v) \stackrel{\Delta}{=} \ln E\{e^{vZ}\}$, (3)

at least in a certain interval of the real-valued variable $v$. Note that since $f_Z(\cdot)$ is assumed symmetric around the origin, so is $C(\cdot)$. We also assume that $C(\cdot)$ is twice differentiable within the range where it exists. It is well known to be a convex function, because its second derivative cannot be negative, as it can be viewed as the variance of $Z$ under the tilted PDF proportional to $f_Z(z)e^{vz}$. Further assumptions on $Z$ and its CGF will be spelled out in the sequel, at the places where they are needed. The following simple special cases will accompany our derivations and discussions repeatedly in the sequel:

Case 1. $Z$ is a zero-mean, Gaussian RV with variance $\sigma_Z^2$:

$C(v) = \frac{\sigma_Z^2 v^2}{2}$. (4)

Case 2. $Z$ is a Laplacian RV with parameter $q$, i.e., $f_Z(z) = \frac{q}{2}e^{-q|z|}$:

$C(v) = -\ln\left(1 - \frac{v^2}{q^2}\right)$. (5)

Case 3. $Z$ is a binary RV, taking values in $\{-z_0, +z_0\}$ with equal probabilities:

$C(v) = \ln\cosh(z_0 v)$. (6)

Case 4. $Z$ is a uniformly distributed RV over the interval $[-z_0, +z_0]$:

$C(v) = \ln\left(\frac{\sinh(z_0 v)}{z_0 v}\right)$. (7)
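As a numerical sanity check on these closed-form CGFs (a sketch, not part of the derivation), the expression of Case 4 can be compared against a brute-force midpoint-rule evaluation of $\ln E\{e^{vZ}\}$, and the symmetry $C(v) = C(-v)$ can be verified for Case 3:

```python
import math

def cgf_binary(v, z0):
    # Case 3, eq. (6): C(v) = ln cosh(z0 v)
    return math.log(math.cosh(z0 * v))

def cgf_uniform(v, z0):
    # Case 4, eq. (7): C(v) = ln( sinh(z0 v) / (z0 v) )
    return math.log(math.sinh(z0 * v) / (z0 * v))

def cgf_uniform_numeric(v, z0, m=100000):
    # brute-force ln E{e^{vZ}} for Z ~ Uniform[-z0, z0], midpoint rule
    h = 2.0 * z0 / m
    total = sum(math.exp(v * (-z0 + (i + 0.5) * h)) for i in range(m)) * h
    return math.log(total / (2.0 * z0))
```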

The signal vector, $s = (s_1,\ldots,s_n)$, $s_t \in \mathbb{R}$, $t = 1,\ldots,n$, is assumed known, and we denote its power by $P(s)$, that is,

$P(s) \stackrel{\Delta}{=} \frac{1}{n}\sum_{t=1}^n s_t^2$. (8)

Consider the class of correlation detectors, i.e., detectors that compare the correlation, $\sum_{t=1}^n w_t Y_t$, to a certain threshold, $T$, where $w = (w_1,\ldots,w_n)$ is a vector of real-valued correlator coefficients, henceforth referred to as the correlator, for short. The decision rule is as follows: if $\sum_{t=1}^n w_t Y_t < T$, accept the null hypothesis, ${\cal H}_0$; otherwise, accept the alternative, ${\cal H}_1$. The threshold, $T$, controls the trade-off between the FA probability and the MD probability of the detector. To allow exponential decay (as $n$ grows without bound) of both types of error probabilities, we let $T$ vary linearly with $n$, and denote $T = \theta n$, where $\theta$ is a real-valued constant, independent of $n$.

In order to have a well-defined asymptotic FA exponent, we assume that the correlator, $w$, has a fixed power,

$P(w) = \frac{1}{n}\sum_{t=1}^n w_t^2$, (9)

which is independent of $n$, or, more generally, that the right-hand side (RHS) of eq. (9) tends to a certain fixed positive power level as $n \to \infty$ (otherwise, the normalized logarithm of the FA probability would oscillate indefinitely, without a limit). Indeed, the FA probability of the correlation detector is given by

$P_{\mbox{\tiny FA}} = \Pr\left\{\sum_{t=1}^n w_t N_t \geq \theta n\right\} = Q\left(\frac{\theta n}{\sigma_N \|w\|}\right) \stackrel{\cdot}{=} \exp\left\{-\frac{\theta^2 n}{2\sigma_N^2 P(w)}\right\}$, (10)

where $Q$ is the well-known $Q$-function,

$Q(u) \stackrel{\Delta}{=} \frac{1}{\sqrt{2\pi}}\int_u^\infty e^{-x^2/2}\,\mbox{d}x$, (11)

and $\stackrel{\cdot}{=}$ denotes equivalence on the exponential scale; in other words, the notation $a_n \stackrel{\cdot}{=} b_n$, for two positive sequences $\{a_n\}$ and $\{b_n\}$, means that $\lim_{n\to\infty}\frac{1}{n}\log\frac{a_n}{b_n} = 0$. It follows from (10) that the FA exponent is given by

$E_{\mbox{\tiny FA}}(\theta) = \frac{\theta^2}{2\sigma_N^2 P(w)}$. (12)

Thus, the FA exponent depends on $w$ only via $P(w)$. It follows that for a given $\theta$, if we wish to achieve a given, prescribed FA exponent, $E_{\mbox{\tiny FA}}(\theta) \geq E_{\mbox{\tiny FA}}$ (where $E_{\mbox{\tiny FA}}$ is a given positive number), we must have

$P(w) \leq P_w \stackrel{\Delta}{=} \frac{\theta^2}{2\sigma_N^2 E_{\mbox{\tiny FA}}}$. (13)

In other words, a constraint on the FA exponent amounts to a corresponding constraint that the asymptotic power of $w$ be no larger than $P_w$.
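To make eqs. (10)–(13) concrete, here is a small numeric sketch (the values of $\theta$, $\sigma_N$ and $E_{\mbox{\tiny FA}}$ are arbitrary illustrative choices) that computes the power budget $P_w$ of eq. (13) and checks that the normalized log of the exact FA probability approaches the exponent of eq. (12) as $n$ grows:

```python
import math

def Q(u):
    # Gaussian tail function of eq. (11), via the complementary error function
    return 0.5 * math.erfc(u / math.sqrt(2.0))

def fa_exponent(theta, sigma_N, P_w):
    # eq. (12): E_FA(theta) = theta^2 / (2 sigma_N^2 P(w))
    return theta ** 2 / (2.0 * sigma_N ** 2 * P_w)

def power_budget(theta, sigma_N, E_FA):
    # eq. (13): largest correlator power compatible with a prescribed E_FA
    return theta ** 2 / (2.0 * sigma_N ** 2 * E_FA)

theta, sigma_N, E_FA = 0.8, 1.0, 0.1
P_w = power_budget(theta, sigma_N, E_FA)
n = 4000
# P_FA = Q(theta*n / (sigma_N*||w||)) with ||w|| = sqrt(n*P_w), as in eq. (10)
normalized_log = -math.log(Q(theta * math.sqrt(n) / (sigma_N * math.sqrt(P_w)))) / n
```

The sub-exponential factor of the $Q$-function accounts for the small residual gap at finite $n$.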

In order to have a well-defined MD exponent, our assumptions concerning the asymptotic behavior of $w$ and $s$ will have to be more restrictive: we will assume that as $n \to \infty$, the pairs $\{(w_t, s_t)\}_{t=1}^n$ obey a certain joint PDF, $f_{WS}(w,s)$, in the following sense: for every $\lambda \geq 0$,

$\lim_{n\to\infty}\left\{\lambda\left(\frac{1}{n}\sum_{t=1}^n w_t s_t - \theta\right) - \frac{1}{n}\sum_{t=1}^n C(\lambda w_t) - \frac{\lambda^2\sigma_N^2}{2}\cdot\frac{1}{n}\sum_{t=1}^n w_t^2\right\}$ (14)
$= \lambda\left(E\{W\cdot S\} - \theta\right) - E\{C(\lambda W)\} - \frac{\lambda^2\sigma_N^2}{2}\cdot E\{W^2\}$,

where $E_{WS}\{\cdot\}$ denotes expectation with respect to (w.r.t.) $f_{WS}$. Whenever there is no room for confusion, the subscript $WS$ will be omitted and the expectation will be denoted simply by $E\{\cdot\}$. The function $f_{WS}(\cdot,\cdot)$ will be referred to as the asymptotic empirical joint PDF of $w$ and $s$. (In the sequel, we will encounter one scenario where the asymptotic empirical PDF is irrelevant, but that scenario will be handled separately, in the original domain of $n$-dimensional vectors.)

The MD probability is now upper bounded, exponentially tightly, by the Chernoff bound, as follows. Denoting the Gaussian random variable $U \stackrel{\Delta}{=} \sum_{t=1}^n w_t N_t$, we have

$P_{\mbox{\tiny MD}} = \Pr\left\{\sum_{t=1}^n w_t s_t + \sum_t w_t Z_t + U \leq \theta n\right\}$ (15)
$\leq \inf_{\lambda\geq 0} E\left(\exp\left\{\lambda\left[\theta n - \sum_{t=1}^n w_t s_t - \sum_t w_t Z_t - U\right]\right\}\right)$
$= \inf_{\lambda\geq 0} \exp\left\{\lambda\left[\theta n - \sum_{t=1}^n w_t s_t\right]\right\}\cdot E\exp\{-\lambda U\}\cdot E\exp\left\{-\lambda\sum_{t=1}^n w_t Z_t\right\}$
$= \inf_{\lambda\geq 0} \exp\left\{\lambda\left[\theta n - \sum_{t=1}^n w_t s_t\right]\right\}\cdot \exp\left\{\frac{n\lambda^2\sigma_N^2 P(w)}{2}\right\}\cdot \prod_{t=1}^n E\exp\{-\lambda w_t Z_t\}$
$= \inf_{\lambda\geq 0} \exp\left\{\lambda\left[\theta n - \sum_{t=1}^n w_t s_t\right]\right\}\cdot \exp\left\{\frac{n\lambda^2\sigma_N^2 P(w)}{2}\right\}\cdot \prod_{t=1}^n \exp\{C(-\lambda w_t)\}$
$= \inf_{\lambda\geq 0} \exp\left\{\lambda\left[\theta n - \sum_{t=1}^n w_t s_t\right]\right\}\cdot \exp\left\{\frac{n\lambda^2\sigma_N^2 P(w)}{2}\right\}\cdot \prod_{t=1}^n \exp\{C(\lambda w_t)\}$,

where the last step is due to the symmetry of $C(\cdot)$. The resulting MD exponent is therefore given by

$E_{\mbox{\tiny MD}}(\theta) = \sup_{\lambda\geq 0}\left\{\lambda(E\{W\cdot S\} - \theta) - E\{C(\lambda W)\} - \frac{\lambda^2\sigma_N^2}{2}\cdot E\{W^2\}\right\}$, (16)

which is a functional of $f_{WS}$.
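For the Gaussian SIN of Case 1, the supremum in eq. (16) admits a closed form, which provides a check on direct numerical maximization over $\lambda$. The sketch below (specialized to single-level $W$ and $S$, with arbitrary illustrative parameter values) does exactly that:

```python
def md_exponent(w, s, theta, sigma_N2, C, lam_max=5.0, steps=20000):
    # eq. (16), specialized to single-level (W, S) = (w, s):
    # sup_{lam >= 0} { lam*(w*s - theta) - C(lam*w) - (lam^2 sigma_N^2 / 2) w^2 }
    best = 0.0
    for i in range(steps + 1):
        lam = lam_max * i / steps
        val = lam * (w * s - theta) - C(lam * w) - 0.5 * lam ** 2 * sigma_N2 * w ** 2
        best = max(best, val)
    return best

# Case 1: Gaussian Z, for which the sup evaluates to
# (w*s - theta)^2 / (2 (sigma_Z^2 + sigma_N^2) w^2) whenever w*s > theta
sigma_Z2, sigma_N2 = 0.5, 1.0
w, s, theta = 1.0, 2.0, 0.5
numeric = md_exponent(w, s, theta, sigma_N2, lambda v: 0.5 * sigma_Z2 * v * v)
closed_form = (w * s - theta) ** 2 / (2.0 * (sigma_Z2 + sigma_N2) * w ** 2)
```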

The problem of optimal correlator design for a given $s$ is equivalent to the problem of finding a conditional density, $f_{W|S}$, that maximizes the MD exponent subject to the power constraint $E\{W^2\} \leq P_w$. The problem of joint design of both $w$ and $s$ is asymptotically equivalent to the problem of maximizing the MD exponent over $\{f_{WS}\}$ subject to the power constraints $E\{W^2\} \leq P_w$ and $E\{S^2\} \leq P_s$, for some given $P_s > 0$. The first problem is relevant when the detector designer has no control over the transmitted signal, for example, when the transmitter and the receiver are hostile parties, which is typically the case in military applications. The second problem is relevant when the transmitter and the receiver cooperate. In radar applications, for example, the transmitter and the receiver are the same party. In Sections 3 and 4, we address the first problem and the second problem, respectively.

Comment 1. Instead of maximizing the MD exponent for a fixed threshold, $\theta$, and a fixed power constraint, $P_w$, chosen to fit a prescribed FA exponent, there is, in principle, an alternative approach: maximize the MD exponent directly for a given FA exponent, by substituting $\theta = \sigma_N\sqrt{2P_w E_{\mbox{\tiny FA}}}$ into the MD exponent expression. Not surprisingly, in this case, the MD exponent becomes invariant to scaling of $W$ (as any scaling of $W$ can be absorbed in $\lambda$ in all terms of the MD exponent), and so there would be no need for the $P_w$-constraint; but this invariance property holds only after maximizing over $\lambda$, not for a given $\lambda$. However, maximizing over $\lambda$ as a first step of the calculation does not seem to lend itself to closed-form analysis in general, and consequently, it would make the subsequent optimization extremely difficult, if not impossible, to carry out. We therefore opt to fix both $\theta$ and $P_w$ throughout our derivations.

3 Optimum Correlator for a Given Signal

In view of the discussion in Section 2, we wish to find the optimal conditional density, $f_{W|S}$, in the sense of maximizing

$\lambda E\{W\cdot S\} - E\{C(\lambda W)\} - \frac{\lambda^2\sigma_N^2}{2}E\{W^2\} = \int_{-\infty}^{+\infty} f_S(s)\cdot E\left\{\lambda sW - C(\lambda W) - \frac{\lambda^2\sigma_N^2}{2}\cdot W^2 \,\Big|\, S = s\right\}\mbox{d}s$, (17)

subject to the power constraint,

$E\{W^2\} \equiv \int_{-\infty}^{+\infty} f_S(s)\, E\{W^2 | S = s\}\,\mbox{d}s \leq P_w$. (18)

At this stage, we carry out this optimization for a given $\lambda \geq 0$, but with the understanding that eventually, $\lambda$ will be subjected to optimization as well. To this end, let us denote the derivative of $C(v)$ by $\dot{C}(v)$, and for a given $\rho \geq 0$, define the function

$g(w|\rho,\lambda) \stackrel{\Delta}{=} \dot{C}(\lambda w) + \left(\frac{\rho}{\lambda} + \sigma_N^2\lambda\right)\cdot w$. (19)

Observe that since $C$ is convex, $\dot{C}$ is monotonically non-decreasing, and so $g(\cdot|\rho,\lambda)$ is strictly increasing, which in turn implies that it has an inverse. We denote the inverse of $g(\cdot|\rho,\lambda)$ by $g^{-1}(\cdot|\rho,\lambda)$. Also, since $Z$ is assumed zero-mean, $\dot{C}(0) = 0$, and hence also $g(0|\rho,\lambda) = 0$ and $g^{-1}(0|\rho,\lambda) = 0$. Note also that $g(\cdot|\rho,\lambda)$ (and hence also $g^{-1}(\cdot|\rho,\lambda)$) is a linear function if and only if $Z$ is Gaussian. The following theorem characterizes the optimal $f_{W|S}$.
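Since $g(\cdot|\rho,\lambda)$ is strictly increasing, its inverse can always be computed numerically by bisection, even when no closed form exists. A minimal sketch (using the binary SIN of Case 3 and illustrative parameter values):

```python
import math

def g(w, rho, lam, sigma_N2, Cdot):
    # eq. (19): g(w|rho,lam) = C'(lam w) + (rho/lam + sigma_N^2 lam) w
    return Cdot(lam * w) + (rho / lam + sigma_N2 * lam) * w

def g_inv(x, rho, lam, sigma_N2, Cdot, lo=-1e6, hi=1e6, iters=200):
    # bisection on the strictly increasing g; iters=200 is overkill but harmless
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid, rho, lam, sigma_N2, Cdot) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Case 3 (binary Z): C(v) = ln cosh(z0 v), so C'(v) = z0 tanh(z0 v)
z0, rho, lam, sigma_N2 = 1.0, 0.2, 0.5, 1.0
Cdot = lambda v: z0 * math.tanh(z0 * v)
```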

Theorem 1

Let the assumptions of Section 2 hold. Assume further that $P_w$ is such that there exists $\rho \geq 0$ (possibly depending on $\lambda$) with $E\{[g^{-1}(S|\rho,\lambda)]^2\} = P_w$. Otherwise, if $E\{[g^{-1}(S|0,\lambda)]^2\} < P_w$, set $\rho = 0$. Then, the optimal conditional density, $f_{W|S}$, is given by

$f_{W|S}^*(w|s) = \delta(w - g^{-1}(s|\rho,\lambda))$, (20)

where $\delta(\cdot)$ is the Dirac delta function.

The theorem tells us that the best correlator, $w^* = (w_1^*,\ldots,w_n^*)$, for a given $s = (s_1,\ldots,s_n)$, is obtained by the relation

$w_t^* = g^{-1}(s_t|\rho,\lambda), \qquad t = 1,\ldots,n$, (21)

which means that $w_t^*$ is given by a function of $s_t$, which is non-linear unless $Z$ is Gaussian. To gain initial insight regarding the condition on $\rho$, consider the Gaussian example (Case 1, eq. (4)). In this case, $g(W|\rho,\lambda) = [(\sigma_N^2+\sigma_Z^2)\lambda + \rho/\lambda]W$, and so $g^{-1}(S|\rho,\lambda) = \lambda S/[(\sigma_N^2+\sigma_Z^2)\lambda^2 + \rho]$, whose power is $P_w$ for $\rho = \lambda\sqrt{E\{S^2\}/P_w} - (\sigma_N^2+\sigma_Z^2)\lambda^2$, which is non-negative as long as $P_w \leq E\{S^2\}/[(\sigma_N^2+\sigma_Z^2)^2\lambda^2]$. In general, the exact choice of $P_w$ is not crucial, as the prescribed FA exponent can be achieved by adjusting $\theta$ proportionally to $\sqrt{P_w}$. However, once $P_w$ is chosen, we will keep it fixed throughout (see Comment 1 above).
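Continuing the Gaussian example, a short numeric check (with arbitrary illustrative values of $\lambda$, the variances, $E\{S^2\}$ and $P_w$) confirms that this choice of $\rho$ indeed makes the power of $g^{-1}(S|\rho,\lambda) = \lambda S/[(\sigma_N^2+\sigma_Z^2)\lambda^2+\rho]$ equal to $P_w$:

```python
lam, sigma_N2, sigma_Z2 = 0.8, 1.0, 0.5
ES2, P_w = 4.0, 1.0                      # E{S^2} and the correlator power budget
rho = lam * (ES2 / P_w) ** 0.5 - (sigma_N2 + sigma_Z2) * lam ** 2
coeff = lam / ((sigma_N2 + sigma_Z2) * lam ** 2 + rho)   # g^{-1}(S) = coeff * S
power = coeff ** 2 * ES2                 # E{ [g^{-1}(S)]^2 }
```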

Proof of Theorem 1. Consider the following chain of equalities and inequalities.

$\sup_{\{f_{W|S}:\;E\{W^2\}\leq P_w\}} \int_{-\infty}^{+\infty} f_S(s)\cdot E\left\{\lambda sW - C(\lambda W) - \frac{\lambda^2\sigma_N^2}{2}\cdot W^2 \,\Big|\, S = s\right\}\mbox{d}s$ (22)
$= \sup_{f_{W|S}} \inf_{\varrho\geq 0}\bigg\{\int_{-\infty}^{+\infty} f_S(s)\cdot E\left\{\lambda sW - C(\lambda W) - \frac{\lambda^2\sigma_N^2}{2}\cdot W^2 \,\Big|\, S = s\right\}\mbox{d}s + \frac{\varrho}{2}\left[P_w - \int_{-\infty}^{+\infty} f_S(s)\, E\{W^2|S = s\}\,\mbox{d}s\right]\bigg\}$
$\stackrel{\mbox{\tiny(a)}}{=} \inf_{\varrho\geq 0} \sup_{f_{W|S}}\bigg\{\int_{-\infty}^{+\infty} f_S(s)\cdot E\left\{\lambda sW - C(\lambda W) - \left(\frac{\lambda^2\sigma_N^2}{2}+\frac{\varrho}{2}\right)\cdot W^2 \,\Big|\, S = s\right\}\mbox{d}s + \frac{\varrho P_w}{2}\bigg\}$
$\stackrel{\mbox{\tiny(b)}}{=} \inf_{\varrho\geq 0}\left\{\int_{-\infty}^{+\infty} f_S(s)\cdot \sup_w\left\{\lambda sw - C(\lambda w) - \left(\frac{\lambda^2\sigma_N^2}{2}+\frac{\varrho}{2}\right)\cdot w^2\right\}\mbox{d}s + \frac{\varrho P_w}{2}\right\}$
$\stackrel{\mbox{\tiny(c)}}{=} \inf_{\varrho\geq 0}\left\{\int_{-\infty}^{+\infty} f_S(s)\cdot\left\{\lambda s g^{-1}(s|\varrho,\lambda) - C(\lambda g^{-1}(s|\varrho,\lambda)) - \left(\frac{\lambda^2\sigma_N^2}{2}+\frac{\varrho}{2}\right)\cdot[g^{-1}(s|\varrho,\lambda)]^2\right\}\mbox{d}s + \frac{\varrho P_w}{2}\right\}$
$= \inf_{\varrho\geq 0}\bigg\{\int_{-\infty}^{+\infty} f_S(s)\cdot\left\{\lambda s g^{-1}(s|\varrho,\lambda) - C(\lambda g^{-1}(s|\varrho,\lambda)) - \frac{\lambda^2\sigma_N^2}{2}\cdot[g^{-1}(s|\varrho,\lambda)]^2\right\}\mbox{d}s + \frac{\varrho}{2}\left[P_w - \int_{-\infty}^{+\infty} f_S(s)\cdot[g^{-1}(s|\varrho,\lambda)]^2\,\mbox{d}s\right]\bigg\}$
$\stackrel{\mbox{\tiny(d)}}{\leq} \int_{-\infty}^{+\infty} f_S(s)\cdot\left\{\lambda s g^{-1}(s|\rho,\lambda) - C(\lambda g^{-1}(s|\rho,\lambda)) - \frac{\lambda^2\sigma_N^2}{2}\cdot[g^{-1}(s|\rho,\lambda)]^2\right\}\mbox{d}s + \frac{\rho}{2}\left[P_w - \int_{-\infty}^{+\infty} f_S(s)\cdot[g^{-1}(s|\rho,\lambda)]^2\,\mbox{d}s\right]$
$\stackrel{\mbox{\tiny(e)}}{=} \int_{-\infty}^{+\infty} f_S(s)\cdot\left\{\lambda s g^{-1}(s|\rho,\lambda) - C(\lambda g^{-1}(s|\rho,\lambda)) - \frac{\lambda^2\sigma_N^2}{2}\cdot[g^{-1}(s|\rho,\lambda)]^2\right\}\mbox{d}s$
$= E\left\{\lambda S g^{-1}(S|\rho,\lambda) - C(\lambda g^{-1}(S|\rho,\lambda)) - \frac{\lambda^2\sigma_N^2}{2}\cdot[g^{-1}(S|\rho,\lambda)]^2\right\}$,

where (a) holds since the objective is affine in both $f_{W|S}$ and $\varrho$ (and hence concave in $f_{W|S}$ and convex in $\varrho$), so the supremum and the infimum may be interchanged; (b) holds since the unconstrained maximum of the conditional expectation of a function of $W$ given $S = s$ is attained when $f_{W|S}$ puts all its mass on the maximizer of that function; (c) holds because the maximum is of a concave function of $w$, attained at the point of zero derivative, $w = g^{-1}(s|\varrho,\lambda)$; (d) is by the postulate that $\rho \geq 0$; and (e) holds because either $\rho = 0$ or $P_w - E\{[g^{-1}(S|\rho,\lambda)]^2\} = 0$. The upper bound on the constrained maximum in the first line of the above chain is therefore attained by $W = g^{-1}(S|\rho,\lambda)$ with probability one, which is equivalent to (20). This completes the proof of Theorem 1. $\Box$

Optimal correlator weights within a finite set. There is a practical motivation to consider the case where $w = (w_1,\ldots,w_n)$ is restricted to be a binary vector with bipolar components, taking the values $+\sqrt{P_w}$ and $-\sqrt{P_w}$ only. The reason is that in such a case, the implementation of the correlation detector involves no multiplications at all, as it is equivalent to the comparison of the difference

$\sum_{\{t:\;w_t = \sqrt{P_w}\}} Y_t - \sum_{\{t:\;w_t = -\sqrt{P_w}\}} Y_t$

to $\theta n/\sqrt{P_w}$. Here, the maximization over $w$ (step (b) in the proof of Theorem 1) is carried out just over the two allowed values, $+\sqrt{P_w}$ and $-\sqrt{P_w}$. As $C(\cdot)$ is symmetric, the maximum is readily seen to be attained by $W = \sqrt{P_w}\cdot\mbox{sgn}(S)$, which means $w_t^* = \sqrt{P_w}\cdot\mbox{sgn}(s_t)$.
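The resulting detector can be implemented with additions only. A minimal sketch (hypothetical data; ties $s_t = 0$ are assigned to the positive group by convention):

```python
import math

def sign_correlator_detect(y, s, theta_n, P_w):
    # w_t = sqrt(P_w) * sgn(s_t): decide H1 iff the difference of the two
    # partial sums of Y_t is at least theta*n / sqrt(P_w) -- no multiplications
    plus = sum(yt for yt, st in zip(y, s) if st >= 0)
    minus = sum(yt for yt, st in zip(y, s) if st < 0)
    return (plus - minus) >= theta_n / math.sqrt(P_w)
```

This is equivalent to comparing $\sum_t w_t Y_t$ to $\theta n$ directly.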

Suppose, more generally, that $\{w_t\}$ is constrained to take on values in a finite set whose cardinality $k$ is fixed, independent of $n$. This can be considered a compromise between the above two extremes of performance and computational complexity, since the number of multiplications need not be larger than $k-1$. The design of such a correlator is very similar to the scalar quantizer design problem. A finite-alphabet $w_t$ is defined as follows. Let $s_{\min} \equiv a_0 < a_1 < \ldots < a_{k-1} < a_k \equiv s_{\max}$, where $s_{\min} = \min_t s_t$ and $s_{\max} = \max_t s_t$, and let ${\cal I}_i \stackrel{\Delta}{=} [a_i, a_{i+1})$, $i = 0,1,\ldots,k-1$, be given. Define

$W = \sum_{i=0}^{k-1} \omega_i \cdot 1\{S \in {\cal I}_i\}$, (23)

for some given $\omega_0, \omega_1, \ldots, \omega_{k-1}$. We wish to maximize

$\Delta = \sum_{i=0}^{k-1}\int_{a_i}^{a_{i+1}} \mbox{d}s\cdot f_S(s)\left[\lambda\omega_i s - C(\lambda\omega_i) - \frac{1}{2}\lambda^2\sigma_N^2\omega_i^2 + \frac{\rho}{2}(P_w - \omega_i^2)\right]$ (24)

over {ai}\{a_{i}\}, i=1,,k1i=1,\ldots,k-1, and {ωi}\{\omega_{i}\}, i=0,1,,k1i=0,1,\ldots,k-1. The necessary conditions for optimality are obtained by equating to zero all partial derivatives w.r.t. {ai}\{a_{i}\}, i=1,,k1i=1,\ldots,k-1, and {ωi}\{\omega_{i}\}, i=0,1,,k1i=0,1,\ldots,k-1. This results in the following sets of equations:

λωi1aiC(λωi1)(ρ2+λ2σN22)ωi12\displaystyle\lambda\omega_{i-1}a_{i}-C(\lambda\omega_{i-1})-\left(\frac{\rho}{2}+\frac{\lambda^{2}\sigma_{N}^{2}}{2}\right)\omega_{i-1}^{2} =\displaystyle= λωiaiC(λωi)(ρ2+λ2σN22)ωi2,i=1,2,,k1\displaystyle\lambda\omega_{i}a_{i}-C(\lambda\omega_{i})-\left(\frac{\rho}{2}+\frac{\lambda^{2}\sigma_{N}^{2}}{2}\right)\omega_{i}^{2},~{}~{}i=1,2,\ldots,k-1
C˙(λωi)+(ρλ+λσN2)ωi\displaystyle\dot{C}(\lambda\omega_{i})+\left(\frac{\rho}{\lambda}+\lambda\sigma_{N}^{2}\right)\omega_{i} =\displaystyle= 𝑬{S|Si},i=0,1,,k1.\displaystyle\mbox{\boldmath$E$}\{S|S\in{\cal I}_{i}\},~{}~{}i=0,1,\ldots,k-1.

Alternatively, we may represent these equations as:

ai\displaystyle a_{i} =\displaystyle= C(λωi)C(λωi1)+(ρ+λ2σN2)(ωi2ωi12)/2λ(ωiωi1),i=1,2,,k1\displaystyle\frac{C(\lambda\omega_{i})-C(\lambda\omega_{i-1})+(\rho+\lambda^{2}\sigma_{N}^{2})(\omega_{i}^{2}-\omega_{i-1}^{2})/2}{\lambda(\omega_{i}-\omega_{i-1})},~{}~{}~{}~{}i=1,2,\ldots,k-1 (25)
\omega_{i}=g^{-1}[\mbox{\boldmath$E$}\{S|S\in{\cal I}_{i}\}|\rho,\lambda],~{}~{}~{}~{}i=0,1,\ldots,k-1, (26)

where ρ\rho is tuned such that

\sum_{i=0}^{k-1}P({\cal I}_{i})\cdot(g^{-1}[\mbox{\boldmath$E$}\{S|S\in{\cal I}_{i}\}|\rho,\lambda])^{2}\leq P_{w} (27)

as before. The first set of equations parallels the nearest-neighbor condition of optimal quantizer design, and the second set corresponds to the centroid condition. The optimal signal can therefore be designed iteratively, in the spirit of the Lloyd-Max algorithm for quantizer design, by alternating between the two sets of equations.
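The alternation between eqs. (25) and (26) can be sketched as follows. For concreteness we take the Gaussian case treated in Example 1 below, where g^{-1} is linear in s (eq. (29)), and we hold ρ fixed rather than tuning it to the power constraint; both simplifications are ours, for illustration only:

```python
import math

def design_levels(s, k, lam, sigmaN2, sigmaZ2, rho, iters=50):
    """Lloyd-style alternation between the 'nearest-neighbor' condition
    (eq. (25)) and the 'centroid' condition (eq. (26)), for Gaussian Z,
    where g^{-1}(s|rho,lam) = c*s with c = lam/((sigmaN2+sigmaZ2)*lam**2+rho).
    rho is held fixed here (not tuned to the power constraint)."""
    c = lam / ((sigmaN2 + sigmaZ2) * lam**2 + rho)
    s = sorted(s)
    # initialize cell boundaries uniformly over the empirical support of s
    a = [s[0] + (s[-1] - s[0]) * i / k for i in range(k + 1)]
    omega = [0.0] * k
    for _ in range(iters):
        # centroid step: omega_i = g^{-1}(E{S | S in I_i})
        for i in range(k):
            cell = [x for x in s if a[i] <= x <= a[i + 1]]
            m = sum(cell) / len(cell) if cell else 0.5 * (a[i] + a[i + 1])
            omega[i] = c * m
        # nearest-neighbor step: for quadratic C, eq. (25) reduces to midpoints
        for i in range(1, k):
            a[i] = (omega[i] + omega[i - 1]) / (2 * c)
    return a[1:-1], omega
```

As in the Lloyd-Max algorithm, each step can only improve the objective, so the iteration converges to a stationary point (not necessarily the global optimum).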

Example 1. Consider the case where Z𝒩(0,σZ2)Z\sim{\cal N}(0,\sigma_{Z}^{2}) (i.e., Case 1). In this case, C(v)=σZ2v2/2C(v)=\sigma_{Z}^{2}v^{2}/2, and so, C˙(v)=σZ2v\dot{C}(v)=\sigma_{Z}^{2}v, which leads to

g(w|ρ,λ)=σZ2λw+(ρλ+σN2λ)w=[(σN2+σZ2)λ+ρλ]w,g(w|\rho,\lambda)=\sigma_{Z}^{2}\lambda w+\left(\frac{\rho}{\lambda}+\sigma_{N}^{2}\lambda\right)\cdot w=\left[(\sigma_{N}^{2}+\sigma_{Z}^{2})\lambda+\frac{\rho}{\lambda}\right]\cdot w, (28)

and so,

g1(s|ρ,λ)=λs(σN2+σZ2)λ2+ρ.g^{-1}(s|\rho,\lambda)=\frac{\lambda s}{(\sigma_{N}^{2}+\sigma_{Z}^{2})\lambda^{2}+\rho}. (29)

Choosing

\rho=\lambda\sqrt{\frac{\mbox{\boldmath$E$}\{S^{2}\}}{P_{w}}}-\lambda^{2}(\sigma_{N}^{2}+\sigma_{Z}^{2}), (30)

yields

wt=Pw𝑬{S2}st,w_{t}^{*}=\sqrt{\frac{P_{w}}{\mbox{\boldmath$E$}\{S^{2}\}}}\cdot s_{t}, (31)

which results in

EMD(θ)={(Pw𝑬{S2}θ)22(σN2+σZ2)Pwθ<Pw𝑬{S2}0θPw𝑬{S2}E_{\mbox{\tiny MD}}(\theta)=\left\{\begin{array}[]{ll}\frac{(\sqrt{P_{w}\mbox{\boldmath$E$}\{S^{2}\}}-\theta)^{2}}{2(\sigma_{N}^{2}+\sigma_{Z}^{2})P_{w}}&\theta<\sqrt{P_{w}\mbox{\boldmath$E$}\{S^{2}\}}\\ 0&\theta\geq\sqrt{P_{w}\mbox{\boldmath$E$}\{S^{2}\}}\end{array}\right. (32)

If wtw_{t} is constrained to be binary, then as we already saw, wt=Pwsgn(st)w_{t}^{*}=\sqrt{P_{w}}\cdot\mbox{sgn}(s_{t}) and then

EMD(θ)={(Pw𝑬{|S|}θ)22(σN2+σZ2)Pwθ<Pw𝑬{|S|}0θPw𝑬{|S|}E_{\mbox{\tiny MD}}(\theta)=\left\{\begin{array}[]{ll}\frac{(\sqrt{P_{w}}\cdot\mbox{\boldmath$E$}\{|S|\}-\theta)^{2}}{2(\sigma_{N}^{2}+\sigma_{Z}^{2})P_{w}}&\theta<\sqrt{P_{w}}\cdot\mbox{\boldmath$E$}\{|S|\}\\ 0&\theta\geq\sqrt{P_{w}}\cdot\mbox{\boldmath$E$}\{|S|\}\end{array}\right. (33)
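Eqs. (32) and (33) are easy to compare numerically; here is a small sketch (the function names are ours). For instance, for S uniform on [-1,1], E{S²}=1/3 exceeds (E{|S|})²=1/4, so the matched correlator dominates the bipolar one, as expected:

```python
import math

def emd_matched(theta, P_w, ES2, sigmaN2, sigmaZ2):
    """Eq. (32): MD exponent for w_t proportional to s_t (Gaussian Z)."""
    m = math.sqrt(P_w * ES2)
    return (m - theta) ** 2 / (2 * (sigmaN2 + sigmaZ2) * P_w) if theta < m else 0.0

def emd_bipolar(theta, P_w, EabsS, sigmaN2, sigmaZ2):
    """Eq. (33): MD exponent for w_t = sqrt(P_w)*sgn(s_t) (Gaussian Z)."""
    m = math.sqrt(P_w) * EabsS
    return (m - theta) ** 2 / (2 * (sigmaN2 + sigmaZ2) * P_w) if theta < m else 0.0
```

By the Cauchy-Schwarz inequality, E{S²} ≥ (E{|S|})² always, so the matched exponent is never smaller than the bipolar one at any θ.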

For the more general quantization, we obtain

a_{i}=\frac{[(\sigma_{N}^{2}+\sigma_{Z}^{2})\lambda^{2}/2+\rho/2](\omega_{i}^{2}-\omega_{i-1}^{2})}{\lambda(\omega_{i}-\omega_{i-1})}=\left[(\sigma_{N}^{2}+\sigma_{Z}^{2})\lambda+\frac{\rho}{\lambda}\right]\cdot\frac{\omega_{i}+\omega_{i-1}}{2}. (34)

For simplicity, let us assume that fSf_{S} is uniform across the interval [A,+A][-A,+A]. In this case, 𝑬{S|Si}=(ai+ai+1)/2\mbox{\boldmath$E$}\{S|S\in{\cal I}_{i}\}=(a_{i}+a_{i+1})/2, and so,

\omega_{i}=\frac{\lambda(a_{i}+a_{i+1})}{2[(\sigma_{N}^{2}+\sigma_{Z}^{2})\lambda^{2}+\rho]}. (35)

It follows that \{a_{i}\} are uniformly spaced across the support of SS, that is, a_{i}=(2i/k-1)A, i=0,1,\ldots,k. Accordingly,

ωi=λA[(2i+1)/k1](σN2+σZ2)λ2+ρ,\omega_{i}=\frac{\lambda A[(2i+1)/k-1]}{(\sigma_{N}^{2}+\sigma_{Z}^{2})\lambda^{2}+\rho}, (36)

where ρ\rho is chosen such that

1ki=0k1λ2A2[(2i+1)/k1]2[(σN2+σZ2)λ2+ρ]2=Pw.\frac{1}{k}\sum_{i=0}^{k-1}\frac{\lambda^{2}A^{2}[(2i+1)/k-1]^{2}}{[(\sigma_{N}^{2}+\sigma_{Z}^{2})\lambda^{2}+\rho]^{2}}=P_{w}. (37)

The binary case considered above is the special case of k=2k=2. This concludes Example 1. \Box
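A useful by-product of eqs. (36)-(37) is that, once ρ is tuned so that the power constraint holds with equality, the levels are simply the uniform cell midpoints rescaled to power P_w. A minimal sketch (the function name is ours):

```python
import math

def uniform_quantizer_weights(A, k, P_w):
    """Levels omega_i of eq. (36) for S uniform on [-A, A], with rho tuned
    so that the power constraint (37) is met with equality: by (36)-(37),
    they are the uniform cell midpoints rescaled to average power P_w."""
    mids = [A * ((2 * i + 1) / k - 1) for i in range(k)]
    scale = math.sqrt(P_w / (sum(m * m for m in mids) / k))
    return [scale * m for m in mids]
```

For k=2 this recovers the bipolar levels ±√P_w, in agreement with the binary case above.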

If \{s_{t}\} is itself a finite-alphabet signal, then the optimal \{w_{t}^{*}\} is also a finite-alphabet signal with the same alphabet size, even without restricting it to be so in the first place. If this alphabet is small enough and/or exhibits a strong degree of symmetry, one might as well optimize the levels of \{w_{t}\} directly, subject to the power constraint. Consider, for example, the case of a 4-ASK signal, s_{t}\in\{-3a,-a,+a,+3a\}, for some given a>0. Then, since the PDF of Z is assumed symmetric, the alphabet of the optimal \{w_{t}\} must be of the form \{-\beta,-\alpha,+\alpha,+\beta\} for some 0<\alpha<\beta. Assuming that s_{t}=\pm a during half of the time and s_{t}=\pm 3a during the other half, then w_{t}=\pm\alpha and w_{t}=\pm\beta in the corresponding halves, and so, \frac{1}{2}\alpha^{2}+\frac{1}{2}\beta^{2}=P_{w}, or \beta=\sqrt{2P_{w}-\alpha^{2}}. Thus, the MD exponent should be maximized over one parameter only (beyond the optimization over \lambda), namely \alpha\in[0,\sqrt{2P_{w}}]. In particular,

EMD(θ)\displaystyle E_{\mbox{\tiny MD}}(\theta) =\displaystyle= supλ0max0α2Pw{12λaα+32λa2Pwα2\displaystyle\sup_{\lambda\geq 0}\max_{0\leq\alpha\leq\sqrt{2P_{w}}}\bigg{\{}\frac{1}{2}\lambda a\alpha+\frac{3}{2}\lambda a\sqrt{2P_{w}-\alpha^{2}}- (38)
12C(λα)12C(λ2Pwα2)λθλ2σN2Pw2}.\displaystyle\frac{1}{2}C(\lambda\alpha)-\frac{1}{2}C(\lambda\sqrt{2P_{w}-\alpha^{2}})-\lambda\theta-\frac{\lambda^{2}\sigma_{N}^{2}P_{w}}{2}\bigg{\}}.

We next examine this expression in several examples.

Example 2. Let ZZ be a binary symmetric source, taking values ±z0\pm z_{0} for some z0>0z_{0}>0 (Case 3). Then, owing to eq. (6), eq. (38) becomes

EMD(θ)\displaystyle E_{\mbox{\tiny MD}}(\theta) =\displaystyle= supλ0max0α2Pw{12λaα+32λa2Pwα2\displaystyle\sup_{\lambda\geq 0}\max_{0\leq\alpha\leq\sqrt{2P_{w}}}\bigg{\{}\frac{1}{2}\lambda a\alpha+\frac{3}{2}\lambda a\sqrt{2P_{w}-\alpha^{2}}- (40)
12lncosh(z0λα)12lncosh(z0λ2Pwα2)λθλ2σN2Pw2}\displaystyle\frac{1}{2}\ln\cosh(z_{0}\lambda\alpha)-\frac{1}{2}\ln\cosh(z_{0}\lambda\sqrt{2P_{w}-\alpha^{2}})-\lambda\theta-\frac{\lambda^{2}\sigma_{N}^{2}P_{w}}{2}\bigg{\}}
=\displaystyle= 12supλ0max0α2Pw{λaα+3λa2Pwα2\displaystyle\frac{1}{2}\sup_{\lambda\geq 0}\max_{0\leq\alpha\leq\sqrt{2P_{w}}}\bigg{\{}\lambda a\alpha+3\lambda a\sqrt{2P_{w}-\alpha^{2}}-
lncosh(z0λα)lncosh(z0λ2Pwα2)2λθλ2σN2Pw}\displaystyle\ln\cosh(z_{0}\lambda\alpha)-\ln\cosh(z_{0}\lambda\sqrt{2P_{w}-\alpha^{2}})-2\lambda\theta-\lambda^{2}\sigma_{N}^{2}P_{w}\bigg{\}}

The ‘classical’ correlator, where wtstw_{t}\propto s_{t}, corresponds to the choice α=Pw/5\alpha=\sqrt{P_{w}/5} instead of maximizing over α\alpha. In Fig. 1, we compare the two curves of the MD exponent as functions of θ\theta. Since they share the same level of PwP_{w}, the FA exponents are the same for a given θ\theta. As can be seen, the optimal correlator significantly outperforms the classical one, which is optimal in the Gaussian case only. This concludes Example 2. \Box
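The comparison behind Fig. 1 can be reproduced by a simple grid search over λ and α in eq. (40). The sketch below (grid ranges and resolutions are ad-hoc choices of ours) also evaluates the classical choice α=√(P_w/5):

```python
import math

def emd_binary_interference(theta, P_w, z0, a, sigmaN2,
                            lam_max=5.0, n_lam=400, n_alpha=80):
    """Grid evaluation of eq. (40) for 4-ASK signaling and binary Z.
    Returns (optimized exponent, exponent of the classical alpha)."""
    def obj(lam, alpha):
        beta = math.sqrt(max(2.0 * P_w - alpha * alpha, 0.0))
        return 0.5 * (lam * a * alpha + 3.0 * lam * a * beta
                      - math.log(math.cosh(z0 * lam * alpha))
                      - math.log(math.cosh(z0 * lam * beta))
                      - 2.0 * lam * theta - lam * lam * sigmaN2 * P_w)
    alpha_classical = math.sqrt(P_w / 5.0)
    best, classical = 0.0, 0.0
    for i in range(n_lam + 1):
        lam = lam_max * i / n_lam
        classical = max(classical, obj(lam, alpha_classical))
        for j in range(n_alpha + 1):
            alpha = math.sqrt(2.0 * P_w) * j / n_alpha
            best = max(best, obj(lam, alpha))
    return best, classical
```

With the parameter values of Fig. 1 (P_w=1, z_0=7, a=4, σ_N²=1), the optimized exponent comes out well above the classical one, consistent with the figure.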

Refer to caption
Figure 1: Graphs for binary interference: MD error exponents as functions of θ\theta pertaining to the classical correlator (red curve) and the optimal correlator (blue curve) for the following parameter values: Pw=1P_{w}=1, z0=7z_{0}=7, a=4a=4, and σN2=1\sigma_{N}^{2}=1.

Example 3. We conduct a similar comparison for the case where ZZ is distributed uniformly over [z0,+z0][-z_{0},+z_{0}] (Case 4), which corresponds to eq. (7). The results are displayed in Fig. 2, and as can be seen, here too, the optimal correlator significantly improves upon the classical one. This concludes Example 3. \Box

Refer to caption
Figure 2: Graphs for uniformly distributed interference: MD error exponents as functions of θ\theta pertaining to the classical correlator (red curve) and the optimal correlator (blue curve) for the following parameter values: Pw=1P_{w}=1, z0=7z_{0}=7, a=4a=4, and σN2=1\sigma_{N}^{2}=1.

It is interesting to note that in both Examples 2 and 3, for large θ\theta, the two graphs approach each other faster than they approach zero. A possible intuitive explanation is that for large θ\theta, what counts is the behavior of the PDF of twtZt\sum_{t}w_{t}Z_{t}, fairly close to its peak, where the regime of the central limit theorem is quite relevant, and so, there is no significant difference from Case 1, where ZZ is Gaussian and the classical correlator is good. Mathematically, as θ\theta grows, the optimum λ\lambda decreases, and so, it ‘samples’ the function C(λwt)C(\lambda w_{t}) in the vicinity of the origin, where it is well approximated by a quadratic function, just like in the Gaussian case (Case 1).

Example 4. Finally, consider the case where ZZ is Laplacian (Case 2). In this case, the differences turned out to be rather minor – see Fig. 3. A plausible intuition is that the Laplacian PDF is much ‘closer’ to the Gaussian PDF, relative to the binary distribution and the uniform distribution of Examples 2 and 3. This concludes Example 4. \Box

Refer to caption
Figure 3: Graphs for Laplace-distributed interference: MD error exponents as functions of θ\theta pertaining to the classical correlator (red curve) and the optimal correlator (blue curve) for the following parameter values: Pw=1P_{w}=1, q=0.1q=0.1, a=4a=4, and σN2=1\sigma_{N}^{2}=1.

The loss relative to the optimal LRT detector depends on the relative intensity of the process {Zt}\{Z_{t}\} compared to the Gaussian noise component.

4 Joint Optimization of the Correlator and the Signal

So far, we have concerned ourselves with the optimization of the correlator waveform, {wt}\{w_{t}\} for a given signal, {st}\{s_{t}\}. But what would be the optimal signal {st}\{s_{t}\} (subject to a power constraint) when it is jointly optimized with {wt}\{w_{t}\}? Mathematically, we are interested in the problem,

sup{fS:𝑬{S2}Ps}sup{fW|S:𝑬{W2}Pw}EMD(θ)\displaystyle\sup_{\{f_{S}:~{}\mbox{\boldmath$E$}\{S^{2}\}\leq P_{s}\}}\sup_{\{f_{W|S}:~{}\mbox{\boldmath$E$}\{W^{2}\}\leq P_{w}\}}E_{\mbox{\tiny MD}}(\theta) (41)
=\displaystyle= sup{fS:𝑬{S2}Ps}sup{fW|S:𝑬{W2}Pw}supλ0[𝑬{λ(WSθ)C(λW)λ2σN2W22}]\displaystyle\sup_{\{f_{S}:~{}\mbox{\boldmath$E$}\{S^{2}\}\leq P_{s}\}}\sup_{\{f_{W|S}:~{}\mbox{\boldmath$E$}\{W^{2}\}\leq P_{w}\}}\sup_{\lambda\geq 0}\left[\mbox{\boldmath$E$}\left\{\lambda(W\cdot S-\theta)-C(\lambda W)-\frac{\lambda^{2}\sigma_{N}^{2}W^{2}}{2}\right\}\right]
=\displaystyle= sup{fW:𝑬{W2}Pw}supλ0sup{fS|W:𝑬{S2}Ps}[λ𝑬{WS}𝑬{C(λW)}λθλ2σN2𝑬{W2}2]\displaystyle\sup_{\{f_{W}:~{}\mbox{\boldmath$E$}\{W^{2}\}\leq P_{w}\}}\sup_{\lambda\geq 0}\sup_{\{f_{S|W}:~{}\mbox{\boldmath$E$}\{S^{2}\}\leq P_{s}\}}\left[\lambda\mbox{\boldmath$E$}\{W\cdot S\}-\mbox{\boldmath$E$}\{C(\lambda W)\}-\lambda\theta-\frac{\lambda^{2}\sigma_{N}^{2}\mbox{\boldmath$E$}\{W^{2}\}}{2}\right]
\stackrel{\mbox{\tiny(a)}}{=} \sup_{\{f_{W}:~{}\mbox{\boldmath$E$}\{W^{2}\}\leq P_{w}\}}\sup_{\lambda\geq 0}\left[\lambda\mbox{\boldmath$E$}\left\{W\cdot\sqrt{\frac{P_{s}}{\mbox{\boldmath$E$}\{W^{2}\}}}\cdot W\right\}-\mbox{\boldmath$E$}\{C(\lambda W)\}-\lambda\theta-\frac{\lambda^{2}\sigma_{N}^{2}\mbox{\boldmath$E$}\{W^{2}\}}{2}\right]
=\displaystyle= sup{fW:𝑬{W2}Pw}supλ0{λPs𝑬{W2}𝑬{C(λW)}λθλ2σN2𝑬{W2}2}\displaystyle\sup_{\{f_{W}:~{}\mbox{\boldmath$E$}\{W^{2}\}\leq P_{w}\}}\sup_{\lambda\geq 0}\left\{\lambda\sqrt{P_{s}\mbox{\boldmath$E$}\{W^{2}\}}-\mbox{\boldmath$E$}\{C(\lambda W)\}-\lambda\theta-\frac{\lambda^{2}\sigma_{N}^{2}\mbox{\boldmath$E$}\{W^{2}\}}{2}\right\}
=\displaystyle= supλ0supPPw{λPsPmin{fW:𝑬{W2}=P}𝑬{C(λW)}λθλ2σN2P2},\displaystyle\sup_{\lambda\geq 0}\sup_{P\leq P_{w}}\left\{\lambda\sqrt{P_{s}P}-\min_{\{f_{W}:~{}\mbox{\boldmath$E$}\{W^{2}\}=P\}}\mbox{\boldmath$E$}\{C(\lambda W)\}-\lambda\theta-\frac{\lambda^{2}\sigma_{N}^{2}P}{2}\right\},

where in (a) we have used the simple fact that, for a given W and P_{s}, the correlation \mbox{\boldmath$E$}\{W\cdot S\} is maximized by S=\sqrt{P_{s}/\mbox{\boldmath$E$}\{W^{2}\}}\cdot W. Earlier, we maximized the MD exponent w.r.t. W for a given S and found that the optimal W is given by a function, g^{-1}(S|\rho,\lambda), which is, in general, non-linear (unless Z is Gaussian). Now, on the other hand, the optimal S for a given W turns out to be a linear function of W. These two findings are mutually consistent if and only if W takes values only in the set of solutions, {\cal S}(\zeta), to the equation

C˙(λW)+ρλW=ζW,\dot{C}(\lambda W)+\frac{\rho}{\lambda}\cdot W=\zeta\cdot W, (42)

for some ζ>0\zeta>0 (and then SS takes the corresponding values according to their relationship). The two sides of the equation represent the non-linear and the linear relations, respectively. Note that 𝒮(ζ){\cal S}(\zeta) always includes at least the solution W=0W=0. Once ζ\zeta is chosen, WW is allowed to take on values only within 𝒮(ζ){\cal S}(\zeta). The inner minimization over fWf_{W} in the last line of (41) is obviously lower bounded by C~λ(P)\tilde{C}_{\lambda}(P), which is defined as

C~λ(P)=Δinfζ>0inf{μ():𝒮2(ζ)pμ(p)dp=P}𝒮2(ζ)μ(p)C(λp)dp,\tilde{C}_{\lambda}(P)\stackrel{{\scriptstyle\Delta}}{{=}}\inf_{\zeta>0}\inf_{\{\mu(\cdot):~{}\int_{{\cal S}^{2}(\zeta)}p\cdot\mu(p)\mbox{d}p=P\}}\int_{{\cal S}^{2}(\zeta)}\mu(p)C(\lambda\sqrt{p})\mbox{d}p, (43)

where 𝒮2(ζ)={w2:w𝒮(ζ)}{\cal S}^{2}(\zeta)=\{w^{2}:~{}w\in{\cal S}(\zeta)\}, and μ()\mu(\cdot) is understood to be a weight function over 𝒮2(ζ){\cal S}^{2}(\zeta), i.e., μ(p)0\mu(p)\geq 0 for all p𝒮2(ζ)p\in{\cal S}^{2}(\zeta) and 𝒮2(ζ)μ(p)dp=1\int_{{\cal S}^{2}(\zeta)}\mu(p)\mbox{d}p=1. While this expression appears complicated, there are two facts that help to simplify it significantly. The first is that, in most cases, 𝒮(ζ){\cal S}(\zeta) is a finite set (unless C()C(\cdot) is linear, or contains linear segments), and the second is that only two members of 𝒮(ζ){\cal S}(\zeta) suffice, i.e., eq. (43) simplifies to

C~λ(P)=Δinfζ>0min{p0,p1𝒮2(ζ),α[0,1]:(1α)p0+αp1=P}{(1α)C(λp0)+αC(λp1)}.\tilde{C}_{\lambda}(P)\stackrel{{\scriptstyle\Delta}}{{=}}\inf_{\zeta>0}\min_{\{p_{0},p_{1}\in{\cal S}^{2}(\zeta),~{}\alpha\in[0,1]:~{}(1-\alpha)p_{0}+\alpha p_{1}=P\}}\{(1-\alpha)C(\lambda\sqrt{p_{0}})+\alpha C(\lambda\sqrt{p_{1}})\}. (44)

The function C~λ(P)\tilde{C}_{\lambda}(P) has the flavor of a lower convex envelope for the function C(λ)C(\lambda\sqrt{\cdot}), but with the exception that the support of the convex combinations is limited to 𝒮2(ζ){\cal S}^{2}(\zeta). Finally, the optimal MD exponent is given by

EMD(θ)=supλ0supPPw{λ(PsPθ)C~λ(P)λ2σN2P2}.E_{\mbox{\tiny MD}}(\theta)=\sup_{\lambda\geq 0}\sup_{P\leq P_{w}}\left\{\lambda(\sqrt{P_{s}P}-\theta)-\tilde{C}_{\lambda}(P)-\frac{\lambda^{2}\sigma_{N}^{2}P}{2}\right\}. (45)

The optimal WW is one that achieves C~λ(P)\tilde{C}_{\lambda}(P) for the maximizing λ\lambda and PP, that is, the components of {|wt|}\{|w_{t}|\} take only two values in 𝒮(ζ){\cal S}(\zeta^{*}), with relative frequencies given by α\alpha^{*} and 1α1-\alpha^{*}, where ζ\zeta^{*} and α\alpha^{*} are the achievers of C~λ(P)\tilde{C}_{\lambda}(P^{*}), PP^{*} being the optimal PP. In other words, the optimal signal has at most four levels, ±a\pm a and ±b\pm b, for some a0a\geq 0 and b>0b>0.

Comment 2. By a simple change of variables, q=λ2pq=\lambda^{2}p, it is readily seen that C~λ(P)\tilde{C}_{\lambda}(P) depends on λ\lambda and PP only via the quantity λP\lambda\sqrt{P}, and so, it might as well be denoted as C~(λP)\tilde{C}(\lambda\sqrt{P}). \Box

Observe that while the function C()C(\cdot) is always convex, nothing general can be asserted regarding convexity or concavity properties of the function C(λ)C(\lambda\sqrt{\cdot}), as the internal square root function, which is concave, may or may not destroy the convexity of the composite function, depending on the function C()C(\cdot). In other words, C(λ)C(\lambda\sqrt{\cdot}) may either be convex, or concave, or neither. For example, if Z{z0,+z0}Z\in\{-z_{0},+z_{0}\} with equal probabilities (as in Case 3), then C(λp)=lncosh(z0λp)C(\lambda\sqrt{p})=\ln\cosh(z_{0}\lambda\sqrt{p}) which is concave in pp. On the other hand, if ZZ is Laplacian with parameter qq (Case 2), then C(λp)=ln(1λ2p/q2)C(\lambda\sqrt{p})=-\ln(1-\lambda^{2}p/q^{2}), which is convex in pp. By mixing these two distributions, we can also make it neither convex, nor concave, as will be shown in the sequel.

Let us examine now several special cases, where the form of C~λ(P)\tilde{C}_{\lambda}(P) can be determined more explicitly.

1. Consider first the Gaussian case (Case 1), where C(λp)=12σZ2λ2pC(\lambda\sqrt{p})=\frac{1}{2}\sigma_{Z}^{2}\lambda^{2}p, namely, it is linear in pp. In this case, for ζ=σZ2λ+ρ/λ\zeta=\sigma_{Z}^{2}\lambda+\rho/\lambda, 𝒮(ζ)=IR+{\cal S}(\zeta)={\rm I\!R}^{+}, the choice of μ\mu is immaterial, and C~λ(P)=12σZ2λ2P\tilde{C}_{\lambda}(P)=\frac{1}{2}\sigma_{Z}^{2}\lambda^{2}P. In this case, any signal 𝒘w with power PP is equally good, as expected.

2. Consider next the case where C(\lambda\sqrt{\cdot}) is convex. Then,

𝑬{C(λW)}\displaystyle\mbox{\boldmath$E$}\{C(\lambda W)\} =\displaystyle= 𝑬{C(λW2)}\displaystyle\mbox{\boldmath$E$}\left\{C\left(\lambda\sqrt{W^{2}}\right)\right\} (46)
\displaystyle\geq C(λ𝑬{W2})\displaystyle C\left(\lambda\sqrt{\mbox{\boldmath$E$}\{W^{2}\}}\right) (47)
=\displaystyle= C(λP),\displaystyle C(\lambda\sqrt{P}), (48)

where the inequality is achieved with equality whenever W2=constW^{2}=\mbox{const} with probability one, and then this constant must be PP. So, here C~λ(P)=C(λP)\tilde{C}_{\lambda}(P)=C(\lambda\sqrt{P}), 𝒮2(ζ)={0,P}{\cal S}^{2}(\zeta)=\{0,P\} and μ(p)=δ(pP)\mu(p)=\delta(p-P), which is expected, because in the convex case, there is no need for any non-trivial convex combinations. The optimal signal vector 𝒘w is any member of {P,+P}n\{-\sqrt{P^{*}},+\sqrt{P^{*}}\}^{n}, and then 𝒔s is the corresponding member of {Ps,+Ps}n\{-\sqrt{P_{s}},+\sqrt{P_{s}}\}^{n}. It is interesting to note that they both turn out to be DC or bipolar signals, which is good news from the practical point of view, as discussed in Section 1.
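In the convex case, eq. (54) reduces to an easy two-dimensional numerical optimization. As an illustrative sketch, take Laplacian Z (Case 2), for which C(v) = -ln(1 - v²/q²) for |v| < q, so that C(λ√·) is convex in its argument; the grid ranges below are ad-hoc choices of ours:

```python
import math

def emd_convex_laplacian(theta, P_s, P_w, q, sigmaN2, n_grid=200):
    """Grid evaluation of eq. (54) with C(v) = -ln(1 - v^2/q^2) (Laplacian Z).
    lambda is restricted so that lambda*sqrt(P) < q (finite log-MGF)."""
    best = 0.0
    for j in range(1, n_grid + 1):
        P = P_w * j / n_grid
        for i in range(1, n_grid):
            lam = 0.999 * q / math.sqrt(P) * i / n_grid
            val = (lam * (math.sqrt(P_s * P) - theta)
                   + math.log(1.0 - lam * lam * P / (q * q))
                   - lam * lam * sigmaN2 * P / 2.0)
            best = max(best, val)
    return best
```

Note that the exponent is identically zero once θ ≥ √(P_s P_w), in line with the general behavior of eq. (54).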

3. We now move on to the case where C(λ)C(\lambda\sqrt{\cdot}) is concave. In this case, it is instructive to return temporarily to the original domain of vectors {𝒘}\{\mbox{\boldmath$w$}\} of finite dimension nn, find the optimal solution in that domain, and finally, take the limit of large nn (see footnote no. 3). We therefore wish to minimize 1nt=1nC(λwt)\frac{1}{n}\sum_{t=1}^{n}C(\lambda w_{t}) s.t. t=1nwt2=nP\sum_{t=1}^{n}w_{t}^{2}=nP. Since C(λ0)=0C(\lambda\sqrt{0})=0 and each wt2w_{t}^{2} is limited to the range [0,nP][0,nP], we can lower bound the function C(λwt2)C(\lambda\sqrt{w_{t}^{2}}) (which is concave as a function of wt2w_{t}^{2}), by a linear function of wt2w_{t}^{2}, as follows:

C(λwt2)C(λnP)nPwt2,C\left(\lambda\sqrt{w_{t}^{2}}\right)\geq\frac{C(\lambda\sqrt{nP})}{nP}\cdot w_{t}^{2}, (49)

with equality at wt2=0w_{t}^{2}=0 and wt2=nPw_{t}^{2}=nP. Consequently,

1nt=1nC(λwt)\displaystyle\frac{1}{n}\sum_{t=1}^{n}C(\lambda w_{t}) =\displaystyle= 1nt=1nC(λwt2)\displaystyle\frac{1}{n}\sum_{t=1}^{n}C\left(\lambda\sqrt{w_{t}^{2}}\right) (50)
\displaystyle\geq 1nt=1nC(λnP)nPwt2\displaystyle\frac{1}{n}\sum_{t=1}^{n}\frac{C(\lambda\sqrt{nP})}{nP}\cdot w_{t}^{2} (51)
=\displaystyle= C(λnP)nPP\displaystyle\frac{C(\lambda\sqrt{nP})}{nP}\cdot P (52)
=\displaystyle= C(λnP)n,\displaystyle\frac{C(\lambda\sqrt{nP})}{n}, (53)

with equality if one of the components of 𝒘w is equal to ±nP\pm\sqrt{nP} and all other components vanish, and then, the same component of 𝒔s is ±nPs\pm\sqrt{nP_{s}} (and, of course, all other vanish), correspondingly. Here, we have 𝒮2(ζ)={0,nP}{\cal S}^{2}(\zeta)=\{0,nP\} and μ(p)=(11n)δ(p)+1nδ(pnP)\mu(p)=\left(1-\frac{1}{n}\right)\delta(p)+\frac{1}{n}\delta(p-nP). Asymptotically, as nn grows without bound, C~λ(P)=limnC(λnP)/n\tilde{C}_{\lambda}(P)=\lim_{n\to\infty}C(\lambda\sqrt{nP})/n, and the limit exists since C(λnP)/nC(\lambda\sqrt{nP})/n is monotonically non-increasing by the assumed concavity of C(λ)C(\lambda\sqrt{\cdot}). If this limit happens to vanish (like in Case 3, for instance), then the interference {Zt}\{Z_{t}\} has no impact whatsoever on the MD exponent for the optimal 𝒔s and 𝒘w. Here too, the optimal signaling is binary.
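The vanishing of the limit in Case 3 is easy to check numerically: since C(v) = ln cosh(z_0 v) grows only linearly in v, C(λ√(nP))/n = O(1/√n). A small sketch (the parameter values are arbitrary):

```python
import math

# C(v) = ln cosh(z0*v) grows linearly in v, so concentrating the entire
# energy n*P in a single sample gives C(lam*sqrt(n*P))/n = O(1/sqrt(n)) -> 0:
# asymptotically, the binary interference costs nothing in the MD exponent.
z0, lam, P = 7.0, 1.0, 1.0
seq = [math.log(math.cosh(z0 * lam * math.sqrt(n * P))) / n
       for n in (1, 10, 100, 10000)]
```

The sequence is monotonically decreasing, consistent with the claimed non-increasing behavior of C(λ√(nP))/n under concavity.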

We now summarize our findings, in this section so far, in the following theorem.

Theorem 2

Let the assumptions of Section 2 hold. Then, wtw_{t}^{*} and sts_{t}^{*} are proportional to each other with |wt||w_{t}^{*}| and |st||s_{t}^{*}| taking values in a finite set of size at most two (t=1,,nt=1,\ldots,n), and the MD exponent is given by eq. (45).

  1.

    If the function C(λ)C(\lambda\sqrt{\cdot}) is convex, then both 𝒘\mbox{\boldmath$w$}^{*} and 𝒔\mbox{\boldmath$s$}^{*} are either DC or bipolar, and the MD exponent is given by

    EMD(θ)=supλ0supPPw{λ(PsPθ)C(λP)λ2σN2P2}.E_{\mbox{\tiny MD}}(\theta)=\sup_{\lambda\geq 0}\sup_{P\leq P_{w}}\left\{\lambda(\sqrt{P_{s}P}-\theta)-C\left(\lambda\sqrt{P}\right)-\frac{\lambda^{2}\sigma_{N}^{2}P}{2}\right\}. (54)
  2.

    If the function C(λ)C(\lambda\sqrt{\cdot}) is concave, then the components of both 𝒘\mbox{\boldmath$w$}^{*} and 𝒔\mbox{\boldmath$s$}^{*} are all zero, except for one component which exploits their entire energy. The MD exponent is given by

    EMD(θ)=supλ0supPPw{λ(PsPθ)limnC(λPn)nλ2σN2P2}.E_{\mbox{\tiny MD}}(\theta)=\sup_{\lambda\geq 0}\sup_{P\leq P_{w}}\left\{\lambda(\sqrt{P_{s}P}-\theta)-\lim_{n\to\infty}\frac{C\left(\lambda\sqrt{Pn}\right)}{n}-\frac{\lambda^{2}\sigma_{N}^{2}P}{2}\right\}. (55)

Finally, we should consider the case where C(\lambda\sqrt{\cdot}) is neither convex nor concave. Here, we will not carry out the full calculations, but we will demonstrate that {\cal S}(\zeta) may include more than one positive solution, in addition to the trivial solution at the origin. Consider, for example, a mixture of the binary PDF and the Laplacian PDF with weights \delta and 1-\delta, respectively (\delta\in(0,1)). In this case,

C(\lambda w)=C\left(\lambda\sqrt{w^{2}}\right)=\ln\left[\delta\cdot\cosh\left(z_{0}\lambda\sqrt{w^{2}}\right)+\frac{1-\delta}{1-\lambda^{2}w^{2}/q^{2}}\right]. (56)

If δ\delta is close to 1, the hyperbolic cosine term is dominant for small and moderate values of ww, where C(λ()C(\lambda(\sqrt{\cdot}) is concave. In contrast, when w2w^{2} approaches q2/λ2q^{2}/\lambda^{2}, the second term tends steeply to infinity and hence must be convex. So in this example, C^\hat{C} is concave in a certain range of relatively small w2w^{2} and at some point it becomes convex. Now, the derivative w.r.t. ww is given by

\dot{C}(\lambda w)=\frac{\delta z_{0}\sinh(z_{0}\lambda w)+(1-\delta)\cdot 2\lambda q^{2}w/(q^{2}-\lambda^{2}w^{2})^{2}}{\delta\cosh(z_{0}\lambda w)+(1-\delta)q^{2}/(q^{2}-\lambda^{2}w^{2})}. (57)

As discussed above, the first step is to solve the equation

C˙(λw)=(ζρλ)w.\dot{C}(\lambda w)=\left(\zeta-\frac{\rho}{\lambda}\right)w. (58)

As there is no apparent closed-form analytical solution to this equation, we demonstrate the solutions graphically. In Fig. 4, we plot the functions \dot{C}(\lambda w) and (\zeta-\rho/\lambda)\cdot w vs. w for the following parameter values: \delta=0.95, q=5, z_{0}=0.5, \lambda=1, and \zeta-\rho/\lambda=0.13. As can be seen, in this example, there are two positive solutions (in addition to the trivial solution, w_{0}=0), approximately w_{1}=3.71 and w_{2}=4.58. Thus, in this case, {\cal S}(\zeta)=\{0,3.71,4.58\}, which corresponds to the set of power levels, {\cal S}^{2}(\zeta)=\{0,13.7641,20.9764\}. According to the above discussion, optimal signaling is associated with time-sharing between two out of these three signal levels. Given this simple fact, the optimal signal levels, say, a\geq 0 and b>0, and the optimal weight parameter, \alpha, can also be found directly by maximizing the MD error exponent expression with respect to these parameters, subject to the power constraint, (1-\alpha)a^{2}+\alpha b^{2}=P_{w}, as was done earlier in the example of the 4-ASK signal in Section 3.
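The intersection points shown in Fig. 4 can also be recovered by elementary root-finding on C˙(λw) − (ζ−ρ/λ)w. The sketch below uses plain bisection with hand-picked brackets (an assumption on where the sign changes occur, read off the figure):

```python
import math

def cdot(w, delta=0.95, q=5.0, z0=0.5, lam=1.0):
    """Derivative C'(v) at v = lam*w for the binary/Laplacian mixture,
    C(v) = ln[delta*cosh(z0*v) + (1-delta)*q^2/(q^2-v^2)]."""
    v = lam * w
    num = (delta * z0 * math.sinh(z0 * v)
           + (1.0 - delta) * 2.0 * q * q * v / (q * q - v * v) ** 2)
    den = delta * math.cosh(z0 * v) + (1.0 - delta) * q * q / (q * q - v * v)
    return num / den

def bisect(f, lo, hi, iters=100):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (flo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

slope = 0.13                           # zeta - rho/lambda, as in Fig. 4
h = lambda w: cdot(w) - slope * w
w1 = bisect(h, 3.0, 4.0)               # down-crossing, approx. 3.71
w2 = bisect(h, 4.0, 4.9)               # up-crossing, approx. 4.58
```

Both roots agree with the values read off Fig. 4, and w = 0 is a solution by inspection.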

Refer to caption
Figure 4: The functions C˙(λw)\dot{C}(\lambda w) (blue curve) and (ζρ/λ)w(\zeta-\rho/\lambda)\cdot w (red straight line) for the example of eqs. (56) and (57) with parameter values: δ=0.95\delta=0.95, q=5q=5, z0=0.5z_{0}=0.5, λ=1\lambda=1, and ζρ/λ=0.13\zeta-\rho/\lambda=0.13. As can be seen, these two graphs meet at three points, w0=0w_{0}=0, w13.71w_{1}\approx 3.71 and w24.58w_{2}\approx 4.58.

5 Detectors Based on Linear Combinations of Correlation and Energy

In this section, we provide a brief outline of a possible extension of the scope to a broader class of detectors that compare the test statistic

t=1nwtYt+αt=1nYt2\sum_{t=1}^{n}w_{t}Y_{t}+\alpha\sum_{t=1}^{n}Y_{t}^{2}

to a threshold, T=θnT=\theta n. The motivation stems from the fact that the two hypotheses, 0{\cal H}_{0} and 1{\cal H}_{1}, differ not only in the presence of the signal, {st}\{s_{t}\}, but also in the presence of the SIN, {Zt}\{Z_{t}\}, which adds to the energy (or the variance) of the received signal. In the extreme case, where st0s_{t}\equiv 0, the simple correlation detector we examined so far (corresponding to α=0\alpha=0) would be useless, but still, one expects to be able to distinguish between the two hypotheses thanks to the different energies of the received signal. Indeed, if {Zt}\{Z_{t}\} is Gaussian white noise (Case 1), the optimal LRT detector obeys this structure with α=σZ2/[2(σN2+σZ2)]\alpha=\sigma_{Z}^{2}/[2(\sigma_{N}^{2}+\sigma_{Z}^{2})].

For practical reasons, it would also be relevant to consider detectors that are based on

t=1nwtYt+αt=1n|Yt|,\sum_{t=1}^{n}w_{t}Y_{t}+\alpha\sum_{t=1}^{n}|Y_{t}|,

where the second term is another measure of the signal intensity, but with the advantage that its calculation does not require multiplications. We shall consider both classes of detectors, but provide merely the basic derivations of the MD exponent, without attempting to arrive at full, explicit solutions. Nevertheless, we will offer some observations on the structure of those solutions.

We begin with the first class of detectors mentioned above. The FA probability is readily bounded by

PFA(θ)\displaystyle P_{\mbox{\tiny FA}}(\theta) =\displaystyle= Pr{t=1nwtNt+αt=1nNt2θn}\displaystyle\mbox{Pr}\left\{\sum_{t=1}^{n}w_{t}N_{t}+\alpha\sum_{t=1}^{n}N_{t}^{2}\geq\theta n\right\} (59)
\displaystyle\leq exp{nsupλ0[λθλ2σN2Pw2(12αλσN2)+12ln(12αλσN2)]},\displaystyle\exp\left\{-n\sup_{\lambda\geq 0}\left[\lambda\theta-\frac{\lambda^{2}\sigma_{N}^{2}P_{w}}{2(1-2\alpha\lambda\sigma_{N}^{2})}+\frac{1}{2}\ln(1-2\alpha\lambda\sigma_{N}^{2})\right]\right\},

which depends on 𝒘w only via PwP_{w}, as before.
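The exponent of eq. (59) is a one-dimensional maximization over λ, restricted to 2αλσ_N² < 1 so that the logarithm is defined. A minimal numerical sketch (the grid range is an ad-hoc choice of ours), which for α = 0 recovers the pure-correlation value θ²/(2σ_N²P_w):

```python
import math

def fa_exponent(theta, P_w, alpha, sigmaN2, n_grid=4000):
    """Grid evaluation of the exponent in eq. (59):
    sup over lambda of lam*theta
      - lam^2*sigmaN2*P_w/(2*(1 - 2*alpha*lam*sigmaN2))
      + 0.5*ln(1 - 2*alpha*lam*sigmaN2)."""
    if alpha > 0:
        lam_max = 0.999 / (2.0 * alpha * sigmaN2)   # keep the log argument > 0
    else:
        lam_max = 10.0 * max(theta, 1.0) / (sigmaN2 * P_w)
    best = 0.0
    for i in range(n_grid + 1):
        lam = lam_max * i / n_grid
        d = 1.0 - 2.0 * alpha * lam * sigmaN2
        best = max(best, lam * theta
                   - lam * lam * sigmaN2 * P_w / (2.0 * d)
                   + 0.5 * math.log(d))
    return best
```

Since the logarithmic term is negative for α > 0 and the quadratic penalty is inflated by 1/(1−2αλσ_N²), the FA exponent can only decrease as α grows, for fixed θ and P_w.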

As for the MD probability, we define

A=1nt=1n(wtst+αst2)A=\frac{1}{n}\sum_{t=1}^{n}(w_{t}s_{t}+\alpha s_{t}^{2}) (60)

and

ut=wt+2αst,t=1,2,,n.u_{t}=w_{t}+2\alpha s_{t},~{}~{}~{}~{}~{}t=1,2,\ldots,n. (61)

Then,

PMD(θ)\displaystyle P_{\mbox{\tiny MD}}(\theta) =\displaystyle= Pr{t=1nwt(st+Zt+Nt)+αt=1n(st+Zt+Nt)2<θn}\displaystyle\mbox{Pr}\left\{\sum_{t=1}^{n}w_{t}(s_{t}+Z_{t}+N_{t})+\alpha\sum_{t=1}^{n}(s_{t}+Z_{t}+N_{t})^{2}<\theta n\right\} (62)
=\displaystyle= Pr{nA+t=1nut(Zt+Nt)+αt=1n(Zt+Nt)2<nθ}\displaystyle\mbox{Pr}\left\{nA+\sum_{t=1}^{n}u_{t}(Z_{t}+N_{t})+\alpha\sum_{t=1}^{n}(Z_{t}+N_{t})^{2}<n\theta\right\}
\displaystyle\leq 𝑬{exp[λn(θA)λt=1nut(Zt+Nt)αλt=1n(Zt+Nt)2]}\displaystyle\mbox{\boldmath$E$}\left\{\exp\left[\lambda n(\theta-A)-\lambda\sum_{t=1}^{n}u_{t}(Z_{t}+N_{t})-\alpha\lambda\sum_{t=1}^{n}(Z_{t}+N_{t})^{2}\right]\right\}
=\displaystyle= 𝑬{exp[λn(θA)λt=1nut(Zt+Nt)]×\displaystyle\mbox{\boldmath$E$}\bigg{\{}\exp\left[\lambda n(\theta-A)-\lambda\sum_{t=1}^{n}u_{t}(Z_{t}+N_{t})\right]\times
t=1nexp[αλ(Zt+Nt)2]}\displaystyle\prod_{t=1}^{n}\exp\left[-\alpha\lambda(Z_{t}+N_{t})^{2}\right]\bigg{\}}
=(a)\displaystyle\stackrel{{\scriptstyle\mbox{\tiny(a)}}}{{=}} 𝑬{exp[λn(θA)λt=1nut(Zt+Nt)]×\displaystyle\mbox{\boldmath$E$}\bigg{\{}\exp\left[\lambda n(\theta-A)-\lambda\sum_{t=1}^{n}u_{t}(Z_{t}+N_{t})\right]\times
t=1n[(4παλ)1/2exp{jqt(Zt+Nt)qt24αλ}dqt]}\displaystyle\prod_{t=1}^{n}\left[(4\pi\alpha\lambda)^{-1/2}\int_{-\infty}^{\infty}\exp\left\{-jq_{t}(Z_{t}+N_{t})-\frac{q_{t}^{2}}{4\alpha\lambda}\right\}\mbox{d}q_{t}\right]\bigg{\}}
=\displaystyle= eλn(θA)t=1n[(4παλ)1/2𝑬{exp[(λut+jqt)(Zt+Nt)]}exp(qt24αλ)dqt]\displaystyle e^{\lambda n(\theta-A)}\prod_{t=1}^{n}\left[(4\pi\alpha\lambda)^{-1/2}\int_{-\infty}^{\infty}\mbox{\boldmath$E$}\left\{\exp\left[-(\lambda u_{t}+jq_{t})(Z_{t}+N_{t})\right]\right\}\exp\left(-\frac{q_{t}^{2}}{4\alpha\lambda}\right)\mbox{d}q_{t}\right]
=\displaystyle= eλn(θA)t=1n[(4παλ)1/2𝑬{exp[(λut+jqt)Zt]}×\displaystyle e^{\lambda n(\theta-A)}\prod_{t=1}^{n}\bigg{[}(4\pi\alpha\lambda)^{-1/2}\int_{-\infty}^{\infty}\mbox{\boldmath$E$}\left\{\exp\left[-(\lambda u_{t}+jq_{t})Z_{t}\right]\right\}\times
𝑬{exp[(λut+jqt)Nt]}exp(qt24αλ)dqt]\displaystyle\mbox{\boldmath$E$}\left\{\exp\left[-(\lambda u_{t}+jq_{t})N_{t}\right]\right\}\cdot\exp\left(-\frac{q_{t}^{2}}{4\alpha\lambda}\right)\mbox{d}q_{t}\bigg{]}
=\displaystyle= eλn(θA)t=1n[(4παλ)1/2𝑬{exp(λutZt)ejqtZt}×\displaystyle e^{\lambda n(\theta-A)}\prod_{t=1}^{n}\bigg{[}(4\pi\alpha\lambda)^{-1/2}\int_{-\infty}^{\infty}\mbox{\boldmath$E$}\left\{\exp\left(-\lambda u_{t}Z_{t}\right)e^{-jq_{t}Z_{t}}\right\}\times
exp(12σN2[λut+jqt]2)exp(qt24αλ)dqt]\displaystyle\exp\left(\frac{1}{2}\sigma_{N}^{2}[\lambda u_{t}+jq_{t}]^{2}\right)\cdot\exp\left(-\frac{q_{t}^{2}}{4\alpha\lambda}\right)\mbox{d}q_{t}\bigg{]}
=\displaystyle= eλn(θA)t=1n[(4παλ)1/2𝑬{exp(λutZt)ejqtZt}×\displaystyle e^{\lambda n(\theta-A)}\prod_{t=1}^{n}\bigg{[}(4\pi\alpha\lambda)^{-1/2}\int_{-\infty}^{\infty}\mbox{\boldmath$E$}\left\{\exp\left(-\lambda u_{t}Z_{t}\right)e^{-jq_{t}Z_{t}}\right\}\times
exp(12σN2λ2ut2)ejσN2λutqtexp{(σN22+14αλ)qt2}dqt]\displaystyle\exp\left(\frac{1}{2}\sigma_{N}^{2}\lambda^{2}u_{t}^{2}\right)e^{j\sigma_{N}^{2}\lambda u_{t}q_{t}}\cdot\exp\left\{-\left(\frac{\sigma_{N}^{2}}{2}+\frac{1}{4\alpha\lambda}\right)q_{t}^{2}\right\}\mbox{d}q_{t}\bigg{]}
=\displaystyle= exp{λn(θA)+12σN2λ2t=1nut2}×\displaystyle\exp\bigg{\{}\lambda n(\theta-A)+\frac{1}{2}\sigma_{N}^{2}\lambda^{2}\sum_{t=1}^{n}u_{t}^{2}\bigg{\}}\times
t=1n[(4παλ)1/2𝑬{exp(λutZt)cos((ZtσN2λut)qt)}×\displaystyle\prod_{t=1}^{n}\bigg{[}(4\pi\alpha\lambda)^{-1/2}\int_{-\infty}^{\infty}\mbox{\boldmath$E$}\left\{\exp\left(-\lambda u_{t}Z_{t}\right)\cos((Z_{t}-\sigma_{N}^{2}\lambda u_{t})q_{t})\right\}\times
exp{(σN22+14αλ)qt2}dqt],\displaystyle\exp\left\{-\left(\frac{\sigma_{N}^{2}}{2}+\frac{1}{4\alpha\lambda}\right)q_{t}^{2}\right\}\mbox{d}q_{t}\bigg{]},

where j=1j=\sqrt{-1} and (a) is due to the identity

eax2=(4πa)1/2ejqxexp{q24a}dq,a>0,e^{-ax^{2}}=(4\pi a)^{-1/2}\int_{-\infty}^{\infty}e^{-jqx}\exp\left\{-\frac{q^{2}}{4a}\right\}\mbox{d}q,~{}~{}~{}~{}~{}a>0, (63)

which is the characteristic function of a zero-mean Gaussian random variable with variance $2a$. (Alternatively, it can be viewed as the Fourier transform relation between two Gaussians, one in the domain of $x$ and one in the domain of $q$.) We now define

C_{\alpha}(v)\stackrel{\Delta}{=}\ln\left[\frac{1}{\sqrt{4\pi\alpha\lambda}}\int_{-\infty}^{\infty}\mbox{\boldmath$E$}\left\{\exp\left(-vZ\right)\cos((Z-\sigma_{N}^{2}v)q)\right\}\cdot\exp\left\{-\left(\frac{\sigma_{N}^{2}}{2}+\frac{1}{4\alpha\lambda}\right)q^{2}\right\}\mbox{d}q\right],

and we arrive at the following expression for the MD exponent:

E_{\mbox{\tiny MD}}(\theta) = \sup_{\lambda\geq 0}\lim_{n\to\infty}\left\{\lambda(A-\theta)-\frac{1}{2}\lambda^{2}\sigma_{N}^{2}\cdot\frac{1}{n}\sum_{t=1}^{n}u_{t}^{2}-\frac{1}{n}\sum_{t=1}^{n}C_{\alpha}(\lambda u_{t})\right\} (64)
= \sup_{\lambda\geq 0}\bigg\{\lambda\left(\mbox{\boldmath$E$}\{S\cdot U\}-\alpha P_{s}-\theta\right)-\frac{1}{2}\lambda^{2}\sigma_{N}^{2}\cdot\mbox{\boldmath$E$}\{U^{2}\}-\mbox{\boldmath$E$}\{C_{\alpha}(\lambda U)\}\bigg\},

where $U=W+2\alpha S$. Note that this expression is of the same form as the one we had earlier, except that $W$ is replaced by $U$, $\theta$ is replaced by $\theta+\alpha P_{s}$, and $C$ is replaced by $C_{\alpha}$. (It is easy to verify that $\frac{1}{2}\lambda^{2}\sigma_{N}^{2}u^{2}+C_{\alpha}(\lambda u)$ is convex in $u$, simply because $\ln\mbox{\boldmath$E$}\left\{\exp\left[\lambda(\theta+\alpha P_{s}-su)-\lambda u(Z+N)-\alpha\lambda(Z+N)^{2}\right]\right\}$ is such. Therefore, its derivative is monotonically non-decreasing.) This expression should now be jointly maximized w.r.t. $f_{US}$ subject to the power constraints, $\mbox{\boldmath$E$}\{S^{2}\}\leq P_{s}$ and $\mbox{\boldmath$E$}\{(U-2\alpha S)^{2}\}\leq P_{w}$. Using the same techniques as before, it is not difficult to infer that the optimal $S$ for a given $U$ is linear in $U$, whereas the optimal $U$ for a given $S$ is given by a non-linear equation. Whenever the number of simultaneous solutions to both equations is finite, the signal levels can be optimized directly, as before. As for the optimization of $\alpha$: among all pairs $\{(\alpha,P_{w})\}$ that give rise to the same value of the FA exponent, one chooses the one that maximizes the MD exponent.
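Since the single-letter expression in (64) reduces the MD exponent to a one-dimensional optimization over $\lambda$, it is easy to evaluate numerically once the distributions are fixed. The following Python sketch does so under purely illustrative assumptions: a hypothetical two-point law for $Z$, a hypothetical symmetric two-point law for $(U,S)$, and assumed values of $\sigma_{N}^{2}$ and $\alpha$ (none of these come from the paper); $C_{\alpha}(\cdot)$ is computed by trapezoidal quadrature directly from its definition.

```python
import math

# ---- illustrative assumptions (not from the paper) ----
SIGMA2 = 1.0                                  # sigma_N^2
ALPHA = 0.5                                   # weight of the energy term
Z_SUPPORT = [(0.0, 0.5), (1.0, 0.5)]          # hypothetical two-point law of Z
US_SUPPORT = [((1.0, 1.0), 0.5), ((-1.0, -1.0), 0.5)]  # hypothetical (U, S) law
P_S = sum(p * s * s for (_, s), p in US_SUPPORT)       # E{S^2}

def C_alpha(v, lam, half_range=12.0, n=2400):
    """C_alpha(v) by trapezoidal quadrature, straight from its definition:
    ln[(4 pi alpha lam)^{-1/2} * int E{e^{-vZ} cos((Z - sigma^2 v) q)}
       * exp(-(sigma^2/2 + 1/(4 alpha lam)) q^2) dq]."""
    h = 2.0 * half_range / n
    decay = SIGMA2 / 2.0 + 1.0 / (4.0 * ALPHA * lam)
    total = 0.0
    for k in range(n + 1):
        q = -half_range + k * h
        weight = 0.5 if k in (0, n) else 1.0
        ez = sum(p * math.exp(-v * z) * math.cos((z - SIGMA2 * v) * q)
                 for z, p in Z_SUPPORT)
        total += weight * ez * math.exp(-decay * q * q)
    return math.log(total * h / math.sqrt(4.0 * math.pi * ALPHA * lam))

def E_MD(theta):
    """Grid search over lambda >= 0 in the single-letter expression (64)."""
    ESU = sum(p * s * u for (u, s), p in US_SUPPORT)   # E{S U}
    EU2 = sum(p * u * u for (u, s), p in US_SUPPORT)   # E{U^2}
    best = 0.0                                         # lambda = 0 gives 0
    for k in range(1, 31):
        lam = 0.1 * k
        eca = sum(p * C_alpha(lam * u, lam) for (u, _), p in US_SUPPORT)
        best = max(best, lam * (ESU - ALPHA * P_S - theta)
                   - 0.5 * lam * lam * SIGMA2 * EU2 - eca)
    return best
```

The grid over $\lambda$ is deliberately coarse; since the objective is linear in $\lambda$ minus convex terms, and hence concave, a ternary search or a stationarity condition could replace the grid in practice.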

Moving on to the second class of detectors, the analysis can be carried out using the same technique as above, where this time we use the Fourier transform identity,

e^{-a|x|}=\frac{a}{\pi}\int_{-\infty}^{+\infty}\frac{e^{-jqx}\,\mbox{d}q}{q^{2}+a^{2}},\qquad a>0, (65)

which, as before, enables us to exploit the independence between $Z_{t}$ and $N_{t}$ once the expectation operator is commuted with the inverse Fourier transform integral over $q$. Equipped with this identity, we have

P_{\mbox{\tiny MD}}(\theta) = \mbox{Pr}\left\{\sum_{t=1}^{n}w_{t}(s_{t}+Z_{t}+N_{t})+\alpha\sum_{t=1}^{n}|s_{t}+Z_{t}+N_{t}|<\theta n\right\} (66)
\leq \mbox{\boldmath$E$}\left\{\exp\left[\lambda\left(n\theta-\sum_{t=1}^{n}w_{t}(s_{t}+Z_{t}+N_{t})-\alpha\sum_{t=1}^{n}|s_{t}+Z_{t}+N_{t}|\right)\right]\right\}
= e^{\lambda n\theta}\prod_{t=1}^{n}\left(\mbox{\boldmath$E$}\left\{\exp\left[-\lambda w_{t}(s_{t}+Z_{t}+N_{t})\right]\cdot\exp\left[-\alpha\lambda|s_{t}+Z_{t}+N_{t}|\right]\right\}\right)
= e^{\lambda n\theta}\prod_{t=1}^{n}\left(\mbox{\boldmath$E$}\left\{\exp\left[-\lambda w_{t}(s_{t}+Z_{t}+N_{t})\right]\cdot\frac{\alpha\lambda}{\pi}\int_{-\infty}^{\infty}\frac{e^{-jq_{t}(s_{t}+Z_{t}+N_{t})}\mbox{d}q_{t}}{q_{t}^{2}+\alpha^{2}\lambda^{2}}\right\}\right)
= \exp\left\{\lambda\left(n\theta-\sum_{t=1}^{n}w_{t}s_{t}\right)\right\}\prod_{t=1}^{n}\left(\mbox{\boldmath$E$}\left\{\frac{\alpha\lambda}{\pi}\int_{-\infty}^{\infty}\frac{e^{-jq_{t}s_{t}}e^{-(\lambda w_{t}+jq_{t})(Z_{t}+N_{t})}\mbox{d}q_{t}}{q_{t}^{2}+\alpha^{2}\lambda^{2}}\right\}\right)
= \exp\left\{\lambda\left(n\theta-\sum_{t=1}^{n}w_{t}s_{t}\right)\right\}\times
\prod_{t=1}^{n}\left(\frac{\alpha\lambda}{\pi}\int_{-\infty}^{\infty}\frac{e^{-jq_{t}s_{t}}\mbox{\boldmath$E$}\left\{e^{-(\lambda w_{t}+jq_{t})Z_{t}}\right\}\mbox{\boldmath$E$}\left\{e^{-(\lambda w_{t}+jq_{t})N_{t}}\right\}\mbox{d}q_{t}}{q_{t}^{2}+\alpha^{2}\lambda^{2}}\right)
= \exp\left\{\lambda\left(n\theta-\sum_{t=1}^{n}w_{t}s_{t}\right)\right\}\times
\prod_{t=1}^{n}\left(\frac{\alpha\lambda}{\pi}\int_{-\infty}^{\infty}\frac{e^{-jq_{t}s_{t}}\mbox{\boldmath$E$}\left\{e^{-(\lambda w_{t}+jq_{t})Z_{t}}\right\}\exp\left\{\frac{1}{2}(\lambda w_{t}+jq_{t})^{2}\sigma_{N}^{2}\right\}\mbox{d}q_{t}}{q_{t}^{2}+\alpha^{2}\lambda^{2}}\right)
= \exp\left\{\lambda\left(n\theta-\sum_{t=1}^{n}w_{t}s_{t}\right)+\frac{1}{2}\lambda^{2}\sigma_{N}^{2}\sum_{t=1}^{n}w_{t}^{2}\right\}\times
\prod_{t=1}^{n}\left(\frac{\alpha\lambda}{\pi}\int_{-\infty}^{\infty}\frac{\mbox{\boldmath$E$}\left\{e^{-\lambda w_{t}Z_{t}}\exp\{jq_{t}(Z_{t}+\sigma_{N}^{2}\lambda w_{t}-s_{t})\}\right\}e^{-q_{t}^{2}\sigma_{N}^{2}/2}\mbox{d}q_{t}}{q_{t}^{2}+\alpha^{2}\lambda^{2}}\right)
= \exp\left\{\lambda\left(n\theta-\sum_{t=1}^{n}w_{t}s_{t}\right)+\frac{1}{2}\lambda^{2}\sigma_{N}^{2}\sum_{t=1}^{n}w_{t}^{2}\right\}\times
\prod_{t=1}^{n}\left(\frac{\alpha\lambda}{\pi}\int_{-\infty}^{\infty}\frac{\mbox{\boldmath$E$}\left\{e^{-\lambda w_{t}Z_{t}}\cos((Z_{t}+\sigma_{N}^{2}\lambda w_{t}-s_{t})q_{t})\right\}e^{-q_{t}^{2}\sigma_{N}^{2}/2}\mbox{d}q_{t}}{q_{t}^{2}+\alpha^{2}\lambda^{2}}\right).

Thus, defining

C_{\alpha}(v,s)=\ln\left[\frac{\alpha\lambda}{\pi}\int_{-\infty}^{\infty}\frac{\mbox{\boldmath$E$}\left\{e^{-vZ}\cos((Z+\sigma_{N}^{2}v-s)q)\right\}e^{-q^{2}\sigma_{N}^{2}/2}\mbox{d}q}{q^{2}+\alpha^{2}\lambda^{2}}\right], (67)

the MD exponent is

E_{\mbox{\tiny MD}}(\theta)=\sup_{\lambda\geq 0}\left\{\lambda\left(\mbox{\boldmath$E$}\{W\cdot S\}-\theta\right)-\frac{1}{2}\lambda^{2}\sigma_{N}^{2}\mbox{\boldmath$E$}\{W^{2}\}-\mbox{\boldmath$E$}\{C_{\alpha}(\lambda W,S)\}\right\}. (68)
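To make (65) and (67) concrete, the following Python sketch first verifies the Cauchy-kernel identity (65) by trapezoidal quadrature (only the cosine part survives the integration, by symmetry), and then evaluates $C_{\alpha}(v,s)$ by the same quadrature. All parameter values and the two-point law of $Z$ are illustrative assumptions, not values taken from the paper.

```python
import math

# ---- illustrative assumptions (not from the paper) ----
ALPHA, LAM, SIGMA2 = 0.5, 1.0, 1.0           # alpha, lambda, sigma_N^2
Z_SUPPORT = [(0.0, 0.5), (1.0, 0.5)]         # hypothetical two-point law of Z

def trapz(f, half_range, n):
    """Trapezoidal rule for the integral of f over [-half_range, half_range]."""
    h = 2.0 * half_range / n
    s = 0.5 * (f(-half_range) + f(half_range))
    s += sum(f(-half_range + k * h) for k in range(1, n))
    return s * h

def laplace_rhs(a, x, half_range=400.0, n=200000):
    """Right-hand side of (65): (a/pi) * int e^{-jqx} dq / (q^2 + a^2);
    the imaginary (sine) part cancels, so only cos(qx) is integrated."""
    return (a / math.pi) * trapz(
        lambda q: math.cos(q * x) / (q * q + a * a), half_range, n)

def C_alpha(v, s, half_range=25.0, n=10000):
    """C_alpha(v, s) of (67), by the same quadrature."""
    a = ALPHA * LAM
    def integrand(q):
        ez = sum(p * math.exp(-v * z) * math.cos((z + SIGMA2 * v - s) * q)
                 for z, p in Z_SUPPORT)
        return ez * math.exp(-q * q * SIGMA2 / 2.0) / (q * q + a * a)
    return math.log((a / math.pi) * trapz(integrand, half_range, n))
```

The truncation range for (65) must be generous because the Cauchy kernel decays only like $1/q^{2}$; in (67) the Gaussian factor $e^{-q^{2}\sigma_{N}^{2}/2}$ makes a much shorter range sufficient.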

However, in this case, there is an additional complication, which stems from the fact that the FA exponent depends on $\mbox{\boldmath$w$}$ not only via $P_{w}$. A standard Chernoff-bound analysis yields

E_{\mbox{\tiny FA}}(\theta) = \sup_{\lambda\geq 0}\bigg(\lambda\theta-\frac{1}{2}\lambda^{2}\sigma_{N}^{2}(\mbox{\boldmath$E$}\{W^{2}\}+\alpha^{2})- (69)
\mbox{\boldmath$E$}\left\{\ln\left[e^{\lambda\alpha W}\left[1-Q\left(\frac{\lambda(W+\alpha)}{\sigma}\right)\right]+e^{-\lambda\alpha W}Q\left(\frac{\lambda(W-\alpha)}{\sigma}\right)\right]\right\}\bigg).

Therefore, the maximization of the MD exponent will have to incorporate the full asymptotic PDF of $W$ and not just its second moment.
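Once the asymptotic law of $W$ is fixed, (69) is again a one-dimensional optimization over $\lambda$. The sketch below implements (69) as printed, with $Q(x)=\frac{1}{2}\mathrm{erfc}(x/\sqrt{2})$ and $\sigma$ taken as the noise standard deviation; the symmetric two-level law of $W$ and all numerical values are hypothetical, chosen only for illustration.

```python
import math

# ---- illustrative assumptions (not from the paper) ----
SIGMA = 1.0                                  # noise standard deviation sigma
ALPHA = 0.5                                  # weight of the energy term
W_SUPPORT = [(1.0, 0.5), (-1.0, 0.5)]        # hypothetical two-level W

def Q(x):
    """Gaussian tail function, Q(x) = 1 - Phi(x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def E_FA(theta):
    """Grid search over lambda >= 0 in the FA exponent (69), as printed."""
    EW2 = sum(p * w * w for w, p in W_SUPPORT)
    best = 0.0                               # lambda = 0 gives the value 0
    for k in range(1, 101):
        lam = 0.05 * k
        elog = sum(
            p * math.log(
                math.exp(lam * ALPHA * w)
                * (1.0 - Q(lam * (w + ALPHA) / SIGMA))
                + math.exp(-lam * ALPHA * w)
                * Q(lam * (w - ALPHA) / SIGMA))
            for w, p in W_SUPPORT)
        best = max(best, lam * theta
                   - 0.5 * lam * lam * SIGMA ** 2 * (EW2 + ALPHA ** 2) - elog)
    return best
```

Because the bracketed expectation depends on the full law of $W$ rather than on $\mbox{\boldmath$E$}\{W^{2}\}$ alone, this evaluation must be repeated for every candidate correlator distribution, which is precisely the complication noted above.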

References

  • [1] F. Bandiera, D. Orlando, and G. Ricci, Advanced Radar Detection Schemes Under Mismatched Signal Models, Synthesis Lectures in Signal Processing, Morgan & Claypool Publishers, 2009.
  • [2] J. Capon, “On the asymptotic efficiency of locally optimum detectors,” IRE Trans. Inform. Theory, pp. 67–71, 1961.
  • [3] E. Conte and G. Ricci, “Sensitivity study of GLRT detection in compound Gaussian clutter,” IEEE Trans. on Aerospace and Electronic Systems, vol. 34, no. 1, pp. 308–316, January 1998.
  • [4] A. H. El-Sawy and V. D. Vandelinde, “Robust detection of known signals,” IEEE Transactions on Information Theory, vol. IT–23, no. 6, pp. 722–727, November 1977.
  • [5] A. H. El-Sawy and V. D. Vandelinde, “Robust sequential detection of signals in noise,” IEEE Transactions on Information Theory, vol. IT–25, no. 3, pp. 346–353, November 1979.
  • [6] E. Erez and M. Feder, “The generalized likelihood ratio decoder can be uniformly improved for Gaussian intersymbol interference channels,” Proc. ISITA 2000, Honolulu, Hawaii, November 2000.
  • [7] E. Erez and M. Feder, “Uniformly improving the generalized likelihood ratio test for linear Gaussian channels,” Proc. Allerton Conference on Communications Control and Computing, pp. 498–507, September 2000.
  • [8] M. Feder and N. Merhav, “Universal composite hypothesis testing: a competitive minimax approach,” IEEE Trans. Inform. Theory, special issue in memory of Aaron D. Wyner, vol. 48, no. 6, pp. 1504–1517, June 2002.
  • [9] E. A. Geraniotis, “Performance bounds for discrimination problems with uncertain statistics,” IEEE Transactions on Information Theory, vol. IT–31, no. 5, pp. 703–707, September 1985.
  • [10] F. Gini, M. V. Greco, A. Farina, and P. Lombardo, “Optimum and mismatched detection against $K$-distributed plus Gaussian clutter,” IEEE Trans. Aerospace and Electronic Systems, vol. 34, no. 3, pp. 860–876, July 1998.
  • [11] C. Hao, B. Liu, S. Yan, and L. Cai, “Parametric adaptive radar detector with enhanced mismatched signals rejection capabilities,” EURASIP Journal on Advances in Signal Processing, vol. 2010, Article ID 375136, 2010.
  • [12] S. A. Kassam, “Robust hypothesis testing for bounded classes of probability densities,” IEEE Transactions on Information Theory, vol. IT–27, no. 2, pp. 242–247, March 1981.
  • [13] S. A. Kassam, G. Moustakides, and J. G. Shin, “Robust detection of known signals in asymmetric noise,” IEEE Transactions on Information Theory, vol. IT–28, no. 1, pp. 84–91, January 1982.
  • [14] S. A. Kassam and H. V. Poor, “Robust techniques for signal processing: a survey,” Proc. IEEE, vol. 73, no. 3, pp. 433–481, March 1985.
  • [15] S. A. Kassam and J. B. Thomas, “Asymptotically robust detection of a known signal in contaminated non-Gaussian noise,” IEEE Transactions on Information Theory, vol. IT–22, no. 1, pp. 22–26, January 1976.
  • [16] S. M. Kay, “Robust detection by autoregressive spectrum analysis,” IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. ASSP–30, no. 2, pp. 256–269, April 1982.
  • [17] V. M. Krasnenker, “Stable (robust) detection methods for signal against a noise background,” Automation and Remote Control, vol. 41, no. 5, pt. 1, pp. 640–659, May 1980.
  • [18] J. Liu and J. Li, “Robust detection in MIMO Radar with steering vector mismatches,” IEEE Trans. Signal Processing, vol. 67, no. 20, October 2019.
  • [19] W. Liu, J. Liu, Y. Gao, G. Wang, and Y. Wang, “Multichannel signal detection in interference and noise when signal mismatch happens,” Signal Processing, vol. 166, Article ID 107268, 2020.
  • [20] W. Liu, W. Xie, R. Li, F. Gao, X. Hu, and Y. Wang, “Adaptive detection in the presence of signal mismatch,” Journal of Systems Engineering and Electronics, vol. 26, no. 1, pp. 38–43, February 2015.
  • [21] W. Liu, W. Xie, and Y. Wang, “Parametric detector in the situation of mismatched signals,” IET Radar, Sonar and Navigation, vol. 8, no. 1, pp. 48–53, 2014.
  • [22] R. D. Martin and S. C. Schwartz, “Robust detection of a known signal in nearly Gaussian noise,” IEEE Transactions on Information Theory, vol. IT–17, no. 1, pp. 50–56, January 1971.
  • [23] N. Merhav, “Optimal correlators for detection and estimation in optical receivers,” IEEE Trans. Inform. Theory, vol. 67, no. 8, pp. 5200–5210, August 2021.
  • [24] G. V. Moustakides, “Robust detection of signals: a large deviations approach,” IEEE Transactions on Information Theory, vol. IT–31, no. 6, pp. 822–825, November 1985.
  • [25] G. V. Moustakides and J. B. Thomas, “Min-max detection of weak signals in phi-mixing noise,” IEEE Transactions on Information Theory, vol. IT–30, no. 3, pp. 529–537, May 1984.
  • [26] X. Shuwen, S. Xingyu, and S. Penglang, “An adaptive detector with mismatched signals rejection in compound Gaussian clutter,” Journal of Radars, vol. 8, no. 3, pp. 326–334, 2019.
  • [27] H. van Trees, Detection, Estimation and Modulation Theory, part I, John Wiley & Sons, New York, 1968.
  • [28] L. Wei-jian, W. Li-cai, D. Yuan-shui, J. Tao, X. Dang, and W. Yong-liang, “Adaptive energy detector and its application for mismatched signal detection,” Journal of Radars, vol. 4, no. 2, pp. 149–159, April 2015.
  • [29] O. Zeitouni, J. Ziv, and N. Merhav, “When is the generalized likelihood ratio test optimal?” IEEE Trans. Inform. Theory, vol. 38, no. 5, pp. 1597–1602, September 1992.
  • [30] D. Zhao, W. Gu, L. Jie, and L. Jing, “Weighted detector for distributed target detection in the presence of signal mismatch,” 2020 Proc. IET International Radar Conference (IET IRC 2020), pp. 960–963, doi: 10.1049/icp.2021.0816.