On the detection of low-rank signal in the presence of spatially uncorrelated noise: a frequency domain approach.
Abstract
This paper analyzes the detection of an M-dimensional useful signal modeled as the output of a MIMO filter driven by a K-dimensional white Gaussian noise, and corrupted by an M-dimensional Gaussian noise with mutually uncorrelated components. The study is focused on frequency domain test statistics based on the eigenvalues of an estimate of the spectral coherence matrix (SCM), obtained as a renormalization of the frequency-smoothed periodogram of the observed signal. If N denotes the sample size and B the smoothing span, it is proved that in the high-dimensional regime where M, B and N converge to infinity while K remains fixed, the SCM behaves as a certain correlated Wishart matrix. Exploiting well-known results on the behaviour of the eigenvalues of such matrices, it is deduced that the standard tests based on linear spectral statistics of the SCM fail to detect the presence of the useful signal in the high-dimensional regime. A new test based on the SCM, which is proved to be consistent, is also proposed, and its statistical performance is evaluated through numerical simulations.
Index Terms:
detection, spectral coherence matrix, periodogram, high-dimensional statistics, Random Matrix Theory
I Introduction
Detecting the presence of an unknown multivariate signal corrupted by noise is one of the fundamental problems in signal processing, which is found in many applications including array and radar processing, wireless communications, radio-astronomy or seismology among others. In a statistical framework, this problem is usually formulated as the following binary hypothesis test, where the objective is to discriminate between the null hypothesis H0 and the alternative hypothesis H1 defined as
H0: y_n = v_n, n = 1, …, N,  versus  H1: y_n = u_n + v_n, n = 1, …, N, (1)
where (y_n)_{n∈Z} is the M-variate observed signal, and where (u_n)_{n∈Z} and (v_n)_{n∈Z} represent a non-observable signal of interest and the noise respectively, both modeled in this paper as mutually independent zero-mean complex Gaussian stationary time series.
Without further knowledge on the covariance function of (u_n) and/or (v_n), or access to “noise only” samples, the test problem (1) is ill-posed, even for temporally white time series (u_n) and (v_n), and one needs to exploit additional information on the covariance structure of the useful signal and noise. One common assumption, widely used in the context of array processing and multi-antenna communications, is to consider that the noise is spatially uncorrelated. Moreover, when the receive antennas are not calibrated, it is reasonable to assume that the spectral densities of the components of the noise may not coincide, see e.g. [2], [3], [4], [5]. This will be the context of the present paper.
A first class of tests is based on the observation that the noise is spatially uncorrelated if and only if its covariance matrices R_v(k) = E[v_{n+k} v_n*] are diagonal for all k ∈ Z, whereas if the useful signal is assumed spatially correlated, R_y(k) = E[y_{n+k} y_n*] is non-diagonal for some k. Under this assumption, the problem in (1) can be formulated as the following correlation test:
H0: R_y(k) = R_y(k) ⊙ I_M for all k ∈ Z,  versus  H1: R_y(k) ≠ R_y(k) ⊙ I_M for some k ∈ Z, (2)
where ⊙ is the element-wise (Hadamard) product and I_M the M × M identity matrix. A number of previous works developed lag domain tests that specifically tackle the above problem, see e.g. [6], [7], [8], [9], [10], [11]. Also relevant are the approaches in [2] and [3], where the possible useful signal is supposed to be the output of a filter driven by a low-dimensional white noise sequence.
Our focus here is on another type of formulation, referred to as the frequency domain approach, which consists in rewriting problem (1) as
H0: S_y(ν) is diagonal for all ν ∈ [0, 1),  versus  H1: S_y(ν) is non-diagonal for some ν ∈ [0, 1), (3)
where S_y(ν) is the spectral density matrix of (y_n) at frequency ν, defined by
S_y(ν) = Σ_{k∈Z} R_y(k) e^{−2iπkν}.
This problem is equivalent to testing whether the spectral coherence matrix (see for instance [12, Chapter 7-6], [13, Chapter 5.5])
C_y(ν) = diag(S_y(ν))^{−1/2} S_y(ν) diag(S_y(ν))^{−1/2} (4)
is equal to I_M for all frequencies ν. In this approach, usual test statistics are mostly based on consistent sample estimates of S_y(ν) or C_y(ν) that are compared to a diagonal matrix or to the identity respectively. Previous works that developed this approach include [14], [15], [16], [17]. In particular, [14] considered the frequency-smoothed periodogram estimator Ŝ_y(ν) defined by
Ŝ_y(ν) = (1/(B+1)) Σ_{b=−B/2}^{B/2} ξ_y(ν + b/N) ξ_y(ν + b/N)*, (5)
with ξ_y(ν) = (1/√N) Σ_{n=1}^{N} y_n e^{−2iπ(n−1)ν} the renormalized finite Fourier transform of (y_n), B the smoothing span, assumed to be an even number, and where ξ* denotes the conjugate transpose of the vector ξ. [14] was devoted to the study of the limiting distribution of a test statistic built from this estimator over some properly defined subset of frequencies. [16] considered a general kernel estimator of S_y(ν):
where the kernel is a weight function satisfying certain specific properties, together with a test statistic of the form
which is proven to be, after proper recentring and renormalization, asymptotically normally distributed. Finally, [15] and [17] considered a more general class of test statistics, defined by:
for some well-defined functions, and where ‖·‖ denotes the Euclidean norm. They proved that these quantities asymptotically follow normal distributions. In the present paper, we focus on the natural estimator (see e.g. [12, Chapter 7-6], [13, Chapter 8-4]) of C_y(ν), defined by
Ĉ_y(ν) = diag(Ŝ_y(ν))^{−1/2} Ŝ_y(ν) diag(Ŝ_y(ν))^{−1/2}, (6)
where Ŝ_y(ν) is the frequency-smoothed periodogram estimate defined by (5). Note that adding a weight to the matrices averaged in (5) leads to a more general class of estimators of C_y(ν). The study of this more general class of estimators involves different techniques and random matrix models than the ones used here, and is therefore out of the scope of this paper.
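To fix ideas, the estimators (5) and (6) can be sketched in a few lines of NumPy. This is an illustrative implementation under the notation above, not the code used for the simulations, and the function names are ours:

```python
import numpy as np

def smoothed_periodogram(y, nu, B):
    """Frequency-smoothed periodogram (5): average of B+1 rank-one
    periodograms at the shifted frequencies nu + b/N, |b| <= B/2.
    y has shape (M, N); B is assumed even."""
    M, N = y.shape
    n = np.arange(N)
    S = np.zeros((M, M), dtype=complex)
    for b in range(-B // 2, B // 2 + 1):
        # renormalized finite Fourier transform at frequency nu + b/N
        xi = y @ np.exp(-2j * np.pi * (nu + b / N) * n) / np.sqrt(N)
        S += np.outer(xi, np.conj(xi))
    return S / (B + 1)

def spectral_coherence(y, nu, B):
    """Sample spectral coherence (6): the smoothed periodogram renormalized
    by the inverse square roots of its diagonal entries."""
    S = smoothed_periodogram(y, nu, B)
    d = 1.0 / np.sqrt(np.real(np.diag(S)))
    return d[:, None] * S * d[None, :]
```

By construction, the output of `spectral_coherence` is Hermitian with unit diagonal, which is the defining property of a coherence matrix.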
I-A Low vs High-dimensional regime
The performance of the test statistics developed in the above mentioned previous works is usually studied in the low-dimensional regime where N → +∞ while M is fixed. It is well known (see for instance [12]) that Ŝ_y(ν) and Ĉ_y(ν) are consistent estimates if B → +∞ and B/N → 0. Under mild assumptions on the memory of the time series (y_n), Ĉ_y(ν) is a consistent and asymptotically normal estimate of C_y(ν), which can in turn be used to study the asymptotic performance of the various tests based on Ĉ_y(ν). In practice, the above asymptotic regime allows to predict the actual performance of the tests quite accurately, provided the ratio M/N is small enough. If this condition is not met, test statistics based on Ĉ_y(ν) may be of delicate use, as the choice of the smoothing span B must meet the constraints B/M much larger than 1 (because Ĉ_y(ν) is supposed to converge towards C_y(ν)) as well as B/N small enough (because Ŝ_y(ν) is supposed to converge towards S_y(ν)).
Nowadays, in many practical applications involving high-dimensional signals and/or a moderate sample size, the ratio M/N may not be small enough to allow a choice of B meeting both B/M much larger than 1 and B/N small enough. Therefore, the results obtained in the low-dimensional regime may fail to provide accurate predictions of the behaviour of the aforementioned test statistics. In this situation, one may rely on the more relevant high-dimensional regime in which M, B and N converge to infinity such that M/B converges to a positive constant while B/N converges to zero.
In comparison to the low-dimensional regime, the literature concerning frequency domain correlation tests in the high-dimensional regime is quite scarce. Recent results obtained in [18] show that under hypothesis H0, the empirical eigenvalue distribution of the spectral coherence estimate Ĉ_y(ν) behaves in the high-dimensional regime as the well-known Marcenko-Pastur distribution [19]. The result of [18] allows to predict the performance under H0 of a large class of test statistics based on
(1/M) Σ_{m=1}^{M} f(λ_m(Ĉ_y(ν))),
where λ_1(Ĉ_y(ν)), …, λ_M(Ĉ_y(ν)) are the eigenvalues of Ĉ_y(ν), and f belongs to a certain functional class. Such families of statistics, called linear spectral statistics (LSS) of Ĉ_y(ν), include in particular the choice f(λ) = log λ, i.e. (1/M) log det Ĉ_y(ν), and the choice f(λ) = (λ − 1)², i.e. (1/M) ‖Ĉ_y(ν) − I_M‖_F², where ‖·‖_F represents the Frobenius norm.
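Since a linear spectral statistic is a function of the eigenvalues only, it can be sketched in a couple of lines, with the two choices mentioned above (an illustration; the names are ours):

```python
import numpy as np

def lss(C, f):
    """Linear spectral statistic (1/M) sum_i f(lambda_i) of a sample
    spectral coherence matrix C (Hermitian with unit diagonal)."""
    return float(np.mean(f(np.linalg.eigvalsh(C))))

# the two classical choices discussed in the text:
frob   = lambda lam: (lam - 1.0) ** 2   # (1/M) ||C - I||_F^2
logdet = lambda lam: np.log(lam)        # (1/M) log det C
```

Both statistics vanish exactly when the coherence matrix equals the identity, which is the behaviour expected under spatially uncorrelated noise.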
In this paper, we consider the study of the eigenvalues of Ĉ_y(ν) in the high-dimensional regime under the special alternative H1 for which the useful signal is modeled as the output of a stable MIMO filter driven by a K-dimensional white complex Gaussian noise. In the context where the intrinsic dimension K is fixed while M, B and N converge to infinity, it is shown that the empirical eigenvalue distribution of Ĉ_y(ν) still converges to the Marcenko-Pastur distribution, showing that any test statistic based on an LSS of Ĉ_y(ν) is unable to discriminate between hypotheses H0 and H1 in the high-dimensional regime. Nevertheless, we also prove that, provided the signal-to-noise ratio is large enough, the largest eigenvalue of Ĉ_y(ν) asymptotically splits from the support of the Marcenko-Pastur distribution. We can therefore exploit this result to design a new frequency domain test statistic, which is shown to be consistent in the high-dimensional regime. This result is connected to the widely studied spiked models in Random Matrix Theory, defined as low rank perturbations of large random matrices. These models were extensively studied in the context of sample covariance matrices of independent identically distributed high-dimensional vectors, see e.g. [20]. We notice however that papers addressing the behaviour of the corresponding sample correlation matrices are quite scarce, see [21] for the case where the low rank perturbation affects only the first components of the observations.
I-B Related works
Although the asymptotic framework differs from the high-dimensional regime considered here, we also mention the series of studies [22, 23] in the econometrics field, which consider a similar model under H1. In these works, it is assumed that M and N converge to infinity in such a way that the ratio M/N remains bounded, while the non-zero eigenvalues of the spectral density of (u_n) are assumed to converge towards infinity at rate M. This last assumption, which ensures that the Signal-to-Noise Ratio (SNR) remains bounded away from 0 as M → +∞, significantly facilitates the design of consistent detection methods. Nevertheless, while relevant in the domain of econometrics, this assumption may be unrealistic in several applications of array processing, where the challenge is to manage situations in which the SNR converges towards 0 at rate 1/M. This situation is the one considered in this paper and, in that case, the results of [22, 23] cannot be used. We discuss this point further in Section II below.
The rest of the paper is organized as follows. In Section II, we formally introduce the signal model used in the remainder, as well as the required technical assumptions. In Section III, we introduce informally the proposed test statistic, and illustrate its behaviour in order to provide some intuition before a more rigorous presentation. In Section IV, we establish some approximation results for the spectral coherence matrix, which are useful to study the linear spectral statistics considered here. This study is then used in Section V to introduce a new test statistic that is consistent in the high-dimensional regime. Finally, Section VI provides some simulations illustrating its performance and comparisons against other relevant approaches.
Notations. For a complex matrix A, we denote by A* its conjugate transpose, and by ‖A‖ and ‖A‖_F its spectral and Frobenius norms respectively. If A is a square complex matrix, we denote by tr A its trace, and by λ_1(A), …, λ_M(A) its eigenvalues; if moreover A is Hermitian, they are sorted in decreasing order λ_1(A) ≥ … ≥ λ_M(A). The identity matrix is denoted I_M. The expectation of a complex random variable Z is denoted E[Z]. The complex circular Gaussian distribution with variance σ² is denoted NC(0, σ²), and a random vector x of C^M follows the NC(0, R) distribution if a*x follows the NC(0, a*Ra) distribution for all deterministic (column) vectors a and a fixed positive definite matrix R. Finally, C¹(I) (resp. C¹_c(I)) represents the set of continuously differentiable functions (resp. continuously differentiable functions with compact support) on an open set I.
II Model and assumptions
Let us consider an M-dimensional observed time series (y_n)_{n∈Z} defined as
y_n = u_n + v_n, n ∈ Z, (7)
where (u_n)_{n∈Z} represents a useful signal and where (v_n)_{n∈Z} represents an additive noise. The useful signal is modeled as the output of an unknown stable MIMO filter (H_k)_{k∈Z} driven by a non-observable K-dimensional complex Gaussian white noise (ε_n)_{n∈Z} with E[ε_n ε_n*] = I_K, i.e.
u_n = Σ_{k∈Z} H_k ε_{n−k},
where the series converges with probability one. We notice that K represents the number of sources in the context of array processing. The noise (v_n) is modeled as an M-dimensional stationary complex Gaussian time series whose component time series ((v_{m,n})_{n∈Z})_{m=1,…,M} are mutually independent.
For each m ∈ {1, …, M}, we denote by (r_m(k))_{k∈Z} the covariance function of (v_{m,n})_{n∈Z}, i.e. r_m(k) = E[v_{m,n+k} conj(v_{m,n})], which verifies the following memory assumption.
Assumption 1.
The covariance coefficients decay sufficiently fast in the lag domain, in the sense that
sup_{m} Σ_{k∈Z} (1 + |k|) |r_m(k)| < +∞. (8)
In particular, Assumption 1 implies that the spectral density of (v_{m,n}), given by
s_m(ν) = Σ_{k∈Z} r_m(k) e^{−2iπkν},
verifies sup_{m} sup_{ν∈[0,1)} s_m(ν) < +∞.
Assumption 1 is in particular verified as soon as the condition
|r_m(k)| ≤ C ρ^{|k|} (9)
holds for each k ∈ Z and each m ∈ {1, …, M}, where C > 0 and ρ ∈ (0, 1) are constants independent of m. As the autocovariance function of an ARMA signal decreases exponentially towards 0, Assumption 1 thus holds if the time series (v_{m,n})_{m=1,…,M} are ARMA signals, provided some extra purely technical conditions allowing to manage the supremum over m in (8) are met. As the spectral coherence matrix of (v_n) involves a renormalization by the inverse of the spectral densities s_1, …, s_M, we also need that s_m does not vanish for each m.
Assumption 2.
The spectral densities s_1, …, s_M are uniformly bounded away from zero, that is
inf_{m} inf_{ν∈[0,1)} s_m(ν) > 0.
Assumptions 1 and 2 also imply that the total noise power satisfies
C_1 ≤ (1/M) Σ_{m=1}^{M} r_m(0) ≤ C_2 for some constants 0 < C_1 ≤ C_2 < +∞. (10)
The next assumption is related to the signal part (u_n). For each ν, we denote by H(ν) the Fourier transform of (H_k)_{k∈Z}, i.e.
H(ν) = Σ_{k∈Z} H_k e^{−2iπkν},
and by h_1(ν), …, h_M(ν) the rows of H(ν).
Assumption 3.
The MIMO filter coefficient matrices (H_k)_{k∈Z} are such that
sup_N sup_{ν∈[0,1)} ‖H(ν)‖ < +∞ (11)
and
sup_{ν∈[0,1)} max_{m=1,…,M} ‖h_m(ν)‖ → 0 as N → +∞. (12)
When K is fixed while M → +∞, condition (11) in Assumption 3 implies that the total useful signal power remains bounded, i.e.
tr E[u_n u_n*] = ∫_0^1 tr(H(ν) H(ν)*) dν = O(1), (13)
so that, using (10), the SNR vanishes at rate 1/M, i.e.
SNR := tr E[u_n u_n*] / tr E[v_n v_n*] = O(1/M). (14)
Likewise, condition (12) in Assumption 3 implies that the SNR per time series vanishes, i.e.
max_{m=1,…,M} E[|u_{m,n}|²] / E[|v_{m,n}|²] → 0 (15)
as N → +∞. We finally notice that (11) is stronger than (13). While (11) is a rather fundamental assumption that allows to specify the behaviour of the signal-to-noise ratio, the extra condition (12) is essentially motivated by technical reasons (it is needed to establish Theorem 2). It is however clearly not restrictive in practice.
Remark 1.
Conditions (11) and (12) in Assumption 3 are especially relevant in the context of array processing, where M represents the number of sensors, which may be large [24, 25]. In this context, (14) represents the SNR before matched filtering, while (15) represents the SNR per sensor. The use of spatial filtering techniques, which combine the observations across the M sensors, allows to increase the SNR by a factor M when the second order statistics of (u_n) are known, which leads to an SNR after matched filtering of the order of magnitude O(1). Thus, despite the apparent low SNR, reliable information on the useful signal can potentially still be extracted from the observed signal (y_n).
Let S(ν) denote the spectral density of (y_n), given by
S(ν) = S_u(ν) + S_v(ν),
where S_u(ν) = H(ν) H(ν)* and S_v(ν) = diag(s_1(ν), …, s_M(ν)). To estimate S(ν), we consider in this paper the frequency-smoothed periodogram Ŝ(ν), which we defined in (5). In the classical low-dimensional regime where N → +∞ while M and B remain fixed, it is well-known [12] that the bias of Ŝ(ν) vanishes when B/N → 0, while its variance vanishes when B → +∞.
Thus, in this regime, Ŝ(ν) is a consistent estimator of S(ν) as long as B → +∞ and B/N → 0. Likewise, the sample Spectral Coherence Matrix (SCM, not to be confused with the sample covariance matrix, which will not be used in this paper) Ĉ(ν) defined in (6) is a consistent estimator of the true SCM C(ν) defined in (4). When M → +∞ and N → +∞, it can be shown that, under some additional mild extra assumptions, the consistency of Ŝ(ν) and Ĉ(ν) in the spectral norm sense still holds provided that B is chosen in such a way that M/B → 0 and B/N → 0. In practice, for finite values of M, B and N, the above asymptotic regime will allow to predict the performance of various inference schemes in situations where it is possible to choose B in such a way that M/B and B/N are both small enough. Nevertheless, when the dimension M is large and the sample size N is not unlimited, or equivalently if M/N is not small enough, such a choice of B may be impossible. In such a context, it seems more relevant to consider asymptotic regimes for which B/N converges towards zero while M/B converges towards a positive constant. In the following, we will consider the following asymptotic regime.
Assumption 4.
M = M(N) and B = B(N) are both functions of N such that, for some constant c > 0,
M/(B+1) → c and B/N → 0 as N → +∞,
while K is fixed with respect to N.
As M/B does not converge towards 0, the consistency of Ŝ(ν) and Ĉ(ν) is lost. This can be explained in a simple way when s_m = 1 for each m and the signals (v_{m,n})_{m=1,…,M} are mutually independent i.i.d. sequences. In this context, for each Fourier frequency ν, the renormalized Fourier transform vectors ξ_v(ν + b/N), b = −B/2, …, B/2, are mutually independent NC(0, I_M) random vectors. The spectral density estimate Ŝ(ν) defined by (5) thus coincides with the sample covariance matrix of these B+1 M-dimensional vectors. If M and B are of the same order of magnitude, it cannot be expected that Ŝ(ν) converges towards I_M, because the true covariance matrix to be estimated depends on M² parameters, while the number of available scalar observations used to estimate it, namely M(B+1), is also O(M²). Despite the loss of convergence of the estimators Ŝ(ν) and Ĉ(ν), we will see that one can still rely on the high-dimensional structure of these matrices to design relevant test statistics.
III Informal presentation of the proposed test statistic
Mathematical details will reveal later that for each ν, Ĉ(ν) behaves as a spiked-model covariance matrix, whose eigenvalues are precisely described by [20]. More precisely, we will see that, in some sense, the eigenvalues of Ĉ(ν) that are due to the noise belong to the interval [λ−, λ+], where λ− = (1 − √c)² and λ+ = (1 + √c)², and that in the presence of signal, some eigenvalues of Ĉ(ν) may be strictly greater than λ+ if an SNR criterion is met. For the remainder, we define
V_N = {k/N : k = 0, …, N−1}, (16)
the set of Fourier frequencies. A natural way to test for H0 against H1 is to compute the largest eigenvalue of Ĉ(ν) over the frequencies of V_N, and compare it with λ+ = (1 + √c)². This leads to the following test statistic:
T_N = max_{ν∈V_N} λ_1(Ĉ(ν)). (17)
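Assuming the notation above, the statistic (17) can be sketched as follows. This is an illustration rather than the authors' code; the FFT grid plays the role of the Fourier frequencies V_N, and the function name is ours:

```python
import numpy as np

def largest_eig_test(y, B):
    """Sketch of the test statistic (17): the largest eigenvalue of the
    sample spectral coherence, maximized over the Fourier frequencies k/N,
    returned together with the Marcenko-Pastur right edge (1 + sqrt(c))^2,
    c = M/(B+1).  y has shape (M, N); B is assumed even."""
    M, N = y.shape
    Y = np.fft.fft(y, axis=1) / np.sqrt(N)       # renormalized Fourier transforms
    c = M / (B + 1)
    edge = (1 + np.sqrt(c)) ** 2
    T = -np.inf
    for k in range(N):
        idx = (k + np.arange(-B // 2, B // 2 + 1)) % N
        X = Y[:, idx]                            # M x (B+1) frequency window
        S = X @ X.conj().T / (B + 1)             # smoothed periodogram (5)
        d = 1.0 / np.sqrt(np.real(np.diag(S)))
        C = d[:, None] * S * d[None, :]          # sample coherence (6)
        T = max(T, float(np.linalg.eigvalsh(C)[-1]))
    return T, edge
```

The test then rejects H0 when the returned statistic exceeds the right edge (possibly inflated by a small margin to control the type I error at finite M).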
We will prove later that, under a proper assumption on the SNR, this test statistic is consistent in the present high-dimensional regime. Before describing the mathematical details leading to T_N, we now provide some numerical illustrations of its behaviour. The general settings are given as follows. Each component of the noise is generated as a Gaussian AR(1) process having spectral density
s_m(ν) = σ² / |1 − ρ e^{−2iπν}|² (18)
for all m ∈ {1, …, M}, with ρ ∈ (0, 1), whereas for the useful signal, we also consider an AR(1) process by choosing K = 1 and defining the filter coefficients through
(19)
with an AR parameter of modulus strictly less than one and α a positive constant used to adjust the SNR.
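The simulation setup just described can be sketched as follows. This is a real-valued illustration with hypothetical parameter values; the paper uses complex Gaussian processes and tunes the constant α to a prescribed SNR:

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1(rho, n, m=1):
    """m independent Gaussian AR(1) series of length n: x_t = rho*x_{t-1} + e_t.
    (The value of rho and the unit innovation variance are illustrative.)"""
    x = rng.standard_normal((m, n))
    for t in range(1, n):
        x[:, t] += rho * x[:, t - 1]
    return x

# hypothetical sizes: M sensors, N samples, one source (K = 1)
M, N = 20, 400
v = ar1(0.5, N, M)                        # spatially uncorrelated AR(1) noise
h = rng.standard_normal(M) / np.sqrt(M)   # hypothetical mixing vector
alpha = 1.0                               # constant tuning the SNR
u = alpha * np.outer(h, ar1(0.5, N)[0])   # rank-one useful signal, AR(1) driven
y = u + v                                 # observation under H1
```

Feeding `y` (or `v` alone for H0) to the statistic of (17) reproduces the kind of comparison shown in Figures 1 and 2.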
In order to understand how the test statistic T_N discriminates between H0 and H1, we show in Figure 1 the largest eigenvalue of Ĉ(ν) for ν ∈ V_N in the presence of signal, and compare it to the threshold (1 + √c)². We see that for some frequencies, the largest eigenvalue of Ĉ(ν) deviates significantly from (1 + √c)². As we will see later, it is possible to evaluate the asymptotic behaviour of the largest eigenvalue of Ĉ(ν), and to establish that it converges towards φ(1 + p(ν)), where φ is a certain function and where p(ν) can be interpreted as a signal-to-noise ratio at frequency ν. The function ν ↦ φ(1 + p(ν)) is also represented in Figure 1, and it is seen to be close to the largest eigenvalue of Ĉ(ν). In Figure 2, we compare the empirical distribution of T_N under H0 and H1 over 10000 repetitions. We see that the distribution of our test statistic is able to discriminate the scenarios where the data are generated under H0 or H1, and that T_N is typically over the threshold (1 + √c)² under H1.
IV Approximation results for Ĉ(ν) in the high-dimensional regime
In this section we present the mathematical details which lead to the test statistic (17). More specifically, we provide useful approximation results for Ĉ(ν), which basically show that Ĉ(ν) behaves as a certain Wishart matrix in the high-dimensional regime. These approximation results are the keystone for the study of the behaviour of the eigenvalues of Ĉ(ν) and of the detection test proposed in Section V.
We first study separately the signal-free case (i.e. y_n = v_n) and the noise-free case (i.e. y_n = u_n).
IV-A Signal-free case
Let
ξ_v(ν) = (1/√N) Σ_{n=1}^{N} v_n e^{−2iπ(n−1)ν}
denote the discrete (time-limited) Fourier transform of (v_n), and define the M × (B+1) matrix Σ_v(ν) as
Σ_v(ν) = [ξ_v(ν − B/(2N)), …, ξ_v(ν + B/(2N))].
The following result, derived in [18], reveals an interesting behaviour of the frequency-smoothed periodogram of the noise.
Theorem 1.
Informally speaking, Theorem 1 shows that the random vectors ξ_v(ν − B/(2N)), …, ξ_v(ν + B/(2N)) asymptotically behave as a family of i.i.d. NC(0, S_v(ν)) vectors, for all ν ∈ V_N. Moreover, if
Ŝ_v(ν) = (1/(B+1)) Σ_{b=−B/2}^{B/2} ξ_v(ν + b/N) ξ_v(ν + b/N)*
denotes the frequency-smoothed periodogram of the noise observations v_1, …, v_N, we deduce that Ŝ_v(ν) asymptotically behaves as a complex Gaussian Wishart matrix with covariance matrix S_v(ν), thanks to the following corollary.
Corollary 1.
Under the assumptions of Theorem 1, it holds that
Proof:
The proof is deferred to Appendix D-A. ∎
It is worth noticing that Corollary 1 implies in particular that ‖Ŝ_v(ν) − S_v(ν)‖ converges towards 0, and consequently that Ŝ_v(ν) is a consistent estimator of the noise spectral density S_v(ν) in the operator norm sense, at each Fourier frequency ν ∈ V_N. This convergence may be directly obtained using Lemma 1 in Appendix A, and we omit the details since this result is well-known.
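The Wishart behaviour described above is easy to check numerically: for white complex Gaussian noise, the Fourier coefficients at distinct Fourier frequencies are exactly independent NC(0, I_M) vectors, so the eigenvalues of the renormalized smoothed periodogram should fill the Marcenko-Pastur support [(1 − √c)², (1 + √c)²]. A quick sanity check (not the proof), with sizes of our choosing:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, B = 40, 1600, 80
# i.i.d. complex Gaussian noise (the simplest signal-free case)
v = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
Y = np.fft.fft(v, axis=1) / np.sqrt(N)          # renormalized Fourier transforms
X = Y[:, :B + 1]                                # B+1 adjacent Fourier coefficients
S = X @ X.conj().T / (B + 1)                    # frequency-smoothed periodogram
d = 1.0 / np.sqrt(np.real(np.diag(S)))
lam = np.linalg.eigvalsh(d[:, None] * S * d[None, :])  # coherence eigenvalues
c = M / (B + 1)                                 # shape parameter, about 0.49 here
```

The eigenvalues `lam` concentrate between (1 − √c)² and (1 + √c)², and their mean is exactly 1 since the coherence matrix has unit diagonal.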
IV-B Noise-free case
Let
ξ_u(ν) = (1/√N) Σ_{n=1}^{N} u_n e^{−2iπ(n−1)ν},
and let Σ_u(ν) be the M × (B+1) matrix defined as
Σ_u(ν) = [ξ_u(ν − B/(2N)), …, ξ_u(ν + B/(2N))].
In the same way, we also denote by ξ_ε(ν) the normalized discrete (time-limited) Fourier transform of (ε_n), and consider the K × (B+1) matrix Σ_ε(ν) defined as Σ_ε(ν) = [ξ_ε(ν − B/(2N)), …, ξ_ε(ν + B/(2N))]. We then have the following important approximation result.
Proof:
The proof is deferred to Appendix B. ∎
As in Theorem 1, Theorem 2 shows that the random vectors ξ_u(ν + b/N), b = −B/2, …, B/2, asymptotically behave as the i.i.d. vectors H(ν) ξ_ε(ν + b/N), b = −B/2, …, B/2, for all ν ∈ V_N.
Remark 2.
The type of approximation given in Theorem 2 is well-known in the low-dimensional regime in which M and B are fixed while N → +∞. Indeed, in that case, [12, Th. 4.5.2] shows that ξ_u(ν) is well approximated by H(ν) ξ_ε(ν) as N → +∞.
In the high-dimensional regime where M and B also converge to infinity as described in Assumption 4, the result of Theorem 2 cannot be obtained from [12, Th. 4.5.2] and thus requires a new study.
We also deduce the following approximation result on the frequency-smoothed periodogram of the signal observations u_1, …, u_N, given by
Ŝ_u(ν) = (1/(B+1)) Σ_{b=−B/2}^{B/2} ξ_u(ν + b/N) ξ_u(ν + b/N)*.
Corollary 2.
Under the assumptions of Theorem 2, it holds that
Proof:
The proof is deferred to Appendix D-B. ∎
As a result of Corollary 2, we deduce that the frequency-smoothed periodogram Ŝ_u(ν) is a consistent estimator of the spectral density S_u(ν) of (u_n) in the high-dimensional regime, for each ν ∈ V_N.
Having characterized the pure noise and pure signal cases, we are now in position to study the high-dimensional behaviour of the spectral coherence matrix Ĉ(ν).
IV-C The signal-plus-noise case
First, using Corollaries 1 and 2, we deduce the high-dimensional behaviour of the frequency-smoothed periodogram Ŝ(ν). The following result shows that, as could be expected, the frequency-smoothed periodogram essentially behaves as a colored Wishart matrix in the high-dimensional asymptotic regime.
Proposition 1.
For all ν ∈ V_N, there exists an M × (B+1) matrix X(ν) with i.i.d. NC(0, 1/(B+1)) entries such that
(20)
Proof:
The proof is deferred to Appendix D-C. ∎
We finally consider the study of the spectral coherence Ĉ(ν). From condition (12) in Assumption 3 on the SNR, it turns out (cf. the proof of Theorem 3 below, where the result is shown) that
(21)
This approximation result regarding the normalization term in the SCM naturally leads to the following theorem, which is the key result of this paper.
Proof:
The proof is deferred to Appendix C. ∎
Let us make a few important comments regarding the result of Theorem 3.
First, used in conjunction with Weyl’s inequalities [26, Th. 4.3.1], Theorem 3 implies in particular that each eigenvalue of the SCM Ĉ(ν) behaves as its counterpart of the Wishart matrix W(ν) appearing in the approximation of Theorem 3, that is
max_{k=1,…,M} |λ_k(Ĉ(ν)) − λ_k(W(ν))| → 0. (22)
Second, Theorem 3 has an important consequence regarding the behaviour of linear spectral statistics of Ĉ(ν), that is statistics of the type
(1/M) Σ_{k=1}^{M} f(λ_k(Ĉ(ν))), (23)
where f belongs to a certain class of functions.
Corollary 3.
Proof:
The proof is deferred to Appendix E. ∎
Therefore, Corollary 3 shows that linear spectral statistics of the SCM converge to the same limit regardless of whether the observations contain only pure noise or signal-plus-noise contributions. This shows that any test statistic relying solely on a linear spectral statistic of the SCM is unable to distinguish between the absence or presence of the useful signal, and cannot be consistent in the high-dimensional regime. Nevertheless, in the next section we will see that we can exploit Theorem 3 to build a new test statistic based on the largest eigenvalue of Ĉ(ν), which is proved to be consistent in the high-dimensional regime.
Remark 3.
Corollary 1, Corollary 2 and Theorem 3 may be interpreted in the context of array processing. Indeed, in the time domain model (7), usually referred to as “wideband”, the signal contribution, modeled as a linear process, is in general not confined to a low-dimensional subspace (i.e. with dimension less than M). However, in the frequency domain, Corollary 1 and Corollary 2 show that we can retrieve, in the high-dimensional regime, a “narrowband” model, since the useful signal is confined to a K-dimensional subspace of C^M. Thus, standard narrowband techniques used in array processing for detection may be used, see e.g. [27].
V A new consistent test statistic
As we have seen in Theorem 3 and the related comments, the SCM behaves in the high-dimensional regime as a Wishart matrix whose scale is a fixed rank perturbation of the identity matrix. The behaviour of the eigenvalues of such matrix models is well-known since [20] (and other related works such as the well-known BBP phase transition [28] or [29]), and the rest of this section is devoted to the application of the results from [20] in our frequency-domain detection context. A crucial point is to choose the particular frequency at which the above mentioned results will be used in order to obtain information on the behaviour of T_N. For this, we first need to define some notations. We consider the fundamental function φ which already appears in [20]:
φ(w) = w + c w/(w − 1) for w > 1 + √c, and φ(w) = (1 + √c)² for w ∈ [1, 1 + √c],
where we recall that c = lim M/(B+1) (see Corollary 3). We notice that φ(w) ≥ (1 + √c)² for all w ≥ 1. Define p(ν) as the largest eigenvalue of the finite rank perturbation associated with the useful signal at frequency ν, that is
p(ν) = λ_1(S_v(ν)^{−1/2} S_u(ν) S_v(ν)^{−1/2}), (24)
and let ν* be a frequency such that
p(ν*) = max_{ν∈[0,1)} p(ν).
We remark that p(ν) may be interpreted as a certain SNR metric in the frequency domain. In the following, we study the behaviour of the largest eigenvalue of Ĉ(ν), which requires the following additional assumption on p(ν*).
Assumption 5.
There exists ε > 0 such that p(ν*) > √c + ε for all N large enough.
Theorem 3 implies that the eigenvalues of Ĉ(ν*) have the same asymptotic behaviour as the corresponding eigenvalues of the associated Wishart matrix. Under Assumption 5, [28], [20] or [29] immediately imply the following result. Note that since ν* is unknown in practice, this proposition is an intermediate theoretical result that will justify the detection test statistic introduced below.
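The spike map φ can be sketched as follows. The closed form w + cw/(w − 1) above the phase transition is reconstructed from the standard spiked-model literature ([20], [28]); its exact form, as far as the paper is concerned, should be treated as an assumption:

```python
import numpy as np

def phi(w, c):
    """Spiked-model map (cf. [20], [28]): a spike eigenvalue w = 1 + p of the
    scale matrix produces a sample eigenvalue converging to w + c*w/(w - 1)
    when w > 1 + sqrt(c) (i.e. p > sqrt(c)), and sticking to the
    Marcenko-Pastur right edge (1 + sqrt(c))**2 otherwise.  Defined for w > 1."""
    w = np.asarray(w, dtype=float)
    edge = (1.0 + np.sqrt(c)) ** 2
    return np.where(w > 1.0 + np.sqrt(c), w + c * w / (w - 1.0), edge)
```

Note that the map is continuous at the transition point w = 1 + √c, where w(1 + √c) equals the right edge; this is why spikes below the threshold are asymptotically invisible among the bulk eigenvalues.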
Proposition 2.
Since neither the intrinsic dimensionality K of the useful signal nor the frequency ν* are known in practice, we use the largest eigenvalue of the SCM maximized over all Fourier frequencies as a test statistic. This leads to the test statistic defined previously in (17), which we recall here:
T_N = max_{ν∈V_N} λ_1(Ĉ(ν)).
It turns out that this test statistic is consistent in the high-dimensional regime, as stated in the following result.
Proposition 3.
VI Simulations
In this section, we provide some numerical illustrations of the approximation results of Section IV. We first consider the case where the rank K of the signal is equal to one, and then the case where K is strictly greater than one.
VI-A Case K = 1
As in the numerical simulation presented in Section III, each component of the noise is generated as a Gaussian AR(1) process with parameter ρ. The expression of its spectral density, for all m, is still given by (18). The useful signal is generated as an AR(1) process with K = 1, defined by (19), and α is again a positive constant used to tune the SNR. Note that, in this context, the SNR at frequency ν defined in (24) admits a closed form expression.
Figures 3 and 4 illustrate the signal-free case y_n = v_n. In Figure 3, we plot the histogram of the eigenvalues of Ĉ(ν) at a fixed Fourier frequency ν.
As predicted by Corollary 3 in the signal-free case, the empirical eigenvalue distribution of Ĉ(ν) is well approximated by the Marcenko-Pastur distribution with shape parameter c. Figure 4 further illustrates this convergence, where the cumulative distribution function (cdf) of the Marcenko-Pastur distribution is plotted against the two following quantities:
These two functions represent the maximum deviations (from above and below), over the Fourier frequencies, of the empirical spectral distribution of Ĉ(ν) against the Marcenko-Pastur distribution. As suggested by the uniform convergence in the frequency domain in Corollary 3, the Marcenko-Pastur approximation in the high-dimensional regime is reliable over the whole set of Fourier frequencies. Note that the statement of Corollary 3 does not exactly match the setting used in Figure 4, as the test function used here is not continuously differentiable with compact support.
To illustrate the signal-plus-noise case and the results of Corollary 3 and Proposition 2, we plot in Figure 5 the histogram of the eigenvalues of Ĉ(ν*). We see that the largest eigenvalue deviates from the right edge (1 + √c)² and is located around the value φ(1 + p(ν*)), as predicted by Proposition 2, while all the other eigenvalues spread as the Marcenko-Pastur distribution, as predicted by Corollary 3.
In order to compare the test statistic (17) with other frequency domain methods based on the SCM, we consider:
-
•
the new test statistic (17), denoted as LE (for largest eigenvalue),
-
•
the two LSS-based test statistics built from (1/M) ‖Ĉ(ν) − I_M‖_F² and (1/M) log det Ĉ(ν), denoted as LSS Frob. and LSS logdet respectively,
-
•
a test statistic based on the largest off-diagonal entry of the SCM, denoted as MCC (for Maximum of Cross Coherence),
each statistic being compared to a suitable threshold. In Table I, we provide, via Monte-Carlo simulations, the power of each of the four tests, calibrated so that the empirical type I error is equal to 5%. The results are provided for various values of (N, M, B) chosen so that N = M² and B = 2M, and the SNR in the frequency domain is set to the same fixed value in all cases.
The LE test presents the best detection performance among the four candidates, whereas the MCC test does not seem to be adapted to the detection of this alternative. While it is proved in Corollary 3 that test statistics based on an LSS of Ĉ(ν) cannot asymptotically distinguish between H0 and H1, they remain sensitive to a large variation of a single eigenvalue for finite values of M. Consider for instance the Frobenius LSS test, where the test statistic is based on (1/M) Σ_{k=1}^{M} (λ_k(Ĉ(ν)) − 1)², which converges to a deterministic limit by explicit computation. An O(1) variation of λ_1(Ĉ(ν)), the largest eigenvalue of Ĉ(ν), will only lead to a variation of order O(1/M) of this quantity. Therefore, the discriminating power of an LSS based test asymptotically vanishes, while remaining non-zero for finite values of M, as is visible in the results of Table I.
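The calibration used for the power tables can be sketched generically: draw the test statistic under H0 many times and use an empirical quantile as the threshold. In the sketch below, `stat_fn` and `sample_h0` are placeholders of our own for any of the four statistics and a pure-noise generator:

```python
import numpy as np

def calibrate_threshold(stat_fn, sample_h0, n_draws=2000, level=0.05):
    """Monte-Carlo calibration of a detection threshold: the empirical
    (1 - level)-quantile of the statistic computed on data drawn under H0,
    so that the empirical type I error is approximately `level`."""
    draws = np.array([stat_fn(sample_h0()) for _ in range(n_draws)])
    return float(np.quantile(draws, 1.0 - level))

# toy illustration with a hypothetical scalar statistic under H0
rng = np.random.default_rng(0)
thr = calibrate_threshold(lambda x: float(np.abs(x).max()),
                          lambda: rng.standard_normal(16))
```

The empirical power reported in the tables is then the fraction of H1 draws whose statistic exceeds the calibrated threshold.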
| N | M | B | LSS Frob. | LSS logdet | MCC | LE |
|---|---|---|---|---|---|---|
| 400 | 20 | 40 | 0.09 | 0.07 | 0.06 | 0.15 |
| 1600 | 40 | 80 | 0.15 | 0.08 | 0.06 | 0.37 |
| 3600 | 60 | 120 | 0.19 | 0.08 | 0.06 | 0.68 |
| 6400 | 80 | 160 | 0.25 | 0.08 | 0.06 | 0.87 |
| 10000 | 100 | 200 | 0.26 | 0.07 | 0.06 | 0.96 |
| 14400 | 120 | 240 | 0.25 | 0.06 | 0.06 | 0.99 |
| 19600 | 140 | 280 | 0.28 | 0.06 | 0.06 | 1.00 |
| 25600 | 160 | 320 | 0.30 | 0.06 | 0.06 | 1.00 |
| 32400 | 180 | 360 | 0.31 | 0.06 | 0.06 | 1.00 |
VI-B Case K > 1
We eventually consider a model which has the flexibility to accommodate a signal with an arbitrary value of K. We assume that the matrices (H_k) verify H_k = 0 if |k| > L for a certain integer L, and that the sequence of matrices is defined by:
where the vectors used to build the H_k are generated as independent realisations of M-dimensional vectors uniformly distributed on the unit sphere of C^M, and where positive constants are used to tune the SNR of each of the K sources at the desired level. Moreover, as the columns of each matrix H_k coincide with realisations of mutually independent random vectors, the columns of H(ν) are easily seen to be nearly orthogonal and to nearly share the same norm for each ν if M is large enough. As the spectral densities of the components of the noise all coincide, the non-zero eigenvalues of the perturbation involved in Section V converge, when M increases, towards limits controlled by the SNR constants of the sources. Therefore, the signal obtained by this model satisfies Assumption 3. Rather than just providing the performance of the test based on the maximum of the largest eigenvalue of Ĉ(ν) proposed in this paper, we compare it in the following with a statistic T_N^{(K)} defined by
which depends on the K largest eigenvalues of Ĉ(ν) rather than on the largest one only. It is easy to generalize Proposition 2 and Proposition 3 in order to study the asymptotic properties of T_N^{(K)}. More precisely, for each k ∈ {1, …, K}, we define p_k(ν) by
(29)
and denote by ν_k* one of the frequencies such that p_k(ν_k*) = max_ν p_k(ν). p_1(ν) can of course be seen as a generalization of the quantity defined by (24). Then, under the extra assumption that for k = 1, …, K, p_k(ν_k*) converges towards a finite limit (a condition which holds in the context of the present experiment), λ_k(Ĉ(ν_k*)) converges towards φ(1 + p_k(ν_k*)) if p_k(ν_k*) > √c, and towards (1 + √c)² otherwise. It is easy to check that if p_K(ν*) > √c, then the statistic T_N^{(K)} also leads to a consistent test. While in practice the number of sources K is unknown, it is interesting to evaluate the performance provided by T_N^{(K)}, which can be considered as an ideal reference. Intuitively, T_N^{(K)} could lead to a better performance than T_N^{(1)} when the p_k(ν*) nearly coincide for k = 1, …, K, because, in this context, if ν̂ is a frequency that maximises λ_1(Ĉ(ν)), then the K largest eigenvalues of Ĉ(ν̂) all separate from the Marcenko-Pastur support. Therefore, the K largest eigenvalues of Ĉ(ν̂) bring useful information to the detection of the useful signal.
In order to evaluate numerically the compared performance of and when is known, we first consider the case , , and where . Concerning the value of , we consider the two following cases: and . These correspond respectively to the case where both sources contribute exactly the same amount to each sensor, and to the case where the first source contributes much more than the second one. Tables II and III report the power of the proposed test (LE(1) represents and LE(2) represents ) against the LSS tests and the MCC test, with a type I error fixed at 5%. When , it can be expected that the strongest source is dominant, and that . Therefore, is likely to stay close to for each , so that the use of should not bring any extra performance. This intuition is confirmed by Table II. When , and should both be close to , thus suggesting that the two largest eigenvalues of at the maximizing frequency should also nearly coincide, and should escape from . While the second eigenvalue brings some information here, Table III tends to indicate that has better performance than . In the next experiment, . For , the largest eigenvalue of is likely to remain dominant for each , and Table IV confirms the better performance of . When , and should both be close to the detectability threshold , and Table V this time shows that the use of leads to some improvement. For comparison, we also report the results of for in Table VI.
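The powers reported in the tables below are obtained by fixing the type I error at 5%: the threshold of each test is calibrated as the empirical 95% quantile of the statistic under the null, and the power is the rejection frequency under the alternative. A generic sketch of this Monte Carlo calibration, with a hypothetical scalar statistic standing in for the actual tests:

```python
import numpy as np

rng = np.random.default_rng(1)

def mc_threshold(stat_fn, sample_h0, n_mc=2000, alpha=0.05):
    """Empirical (1 - alpha)-quantile of the statistic under H0."""
    null_stats = np.array([stat_fn(sample_h0()) for _ in range(n_mc)])
    return np.quantile(null_stats, 1 - alpha)

def mc_power(stat_fn, sample_h1, thr, n_mc=2000):
    """Rejection frequency P(stat > thr) under H1."""
    alt_stats = np.array([stat_fn(sample_h1()) for _ in range(n_mc)])
    return float(np.mean(alt_stats > thr))

# toy illustration: detecting a mean shift with the sample-mean statistic
stat = lambda x: float(np.mean(x))
thr = mc_threshold(stat, lambda: rng.standard_normal(50))
power = mc_power(stat, lambda: 0.5 + rng.standard_normal(50), thr)
```

The same two-step procedure applies to each of the statistics compared in the tables, with the null samples drawn under the noise-only hypothesis.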
Table II
N | M | B | LSS Fr. | LSS ld | MCC | LE(1) | LE(2) |
---|---|---|---|---|---|---|---|
100 | 10 | 20 | 0.31 | 0.18 | 0.16 | 0.42 | 0.37 |
400 | 20 | 40 | 0.79 | 0.39 | 0.45 | 0.94 | 0.89 |
900 | 30 | 60 | 0.94 | 0.49 | 0.53 | 1.00 | 0.99 |
1600 | 40 | 80 | 0.98 | 0.50 | 0.55 | 1.00 | 1.00 |
2500 | 50 | 100 | 0.99 | 0.52 | 0.55 | 1.00 | 1.00 |
3600 | 60 | 120 | 1.00 | 0.51 | 0.43 | 1.00 | 1.00 |
4900 | 70 | 140 | 1.00 | 0.55 | 0.37 | 1.00 | 1.00 |
6400 | 80 | 160 | 1.00 | 0.54 | 0.28 | 1.00 | 1.00 |
Table III
N | M | B | LSS Fr. | LSS ld | MCC | LE(1) | LE(2) |
---|---|---|---|---|---|---|---|
100 | 10 | 20 | 0.38 | 0.22 | 0.16 | 0.48 | 0.46 |
400 | 20 | 40 | 0.58 | 0.30 | 0.30 | 0.75 | 0.73 |
900 | 30 | 60 | 0.67 | 0.30 | 0.28 | 0.91 | 0.89 |
1600 | 40 | 80 | 0.74 | 0.29 | 0.18 | 0.96 | 0.97 |
2500 | 50 | 100 | 0.79 | 0.30 | 0.16 | 0.99 | 0.99 |
3600 | 60 | 120 | 0.79 | 0.24 | 0.13 | 1.00 | 1.00 |
4900 | 70 | 140 | 0.85 | 0.28 | 0.12 | 1.00 | 1.00 |
Table IV
N | M | B | LSS Fr. | LSS ld | MCC | LE(1) | LE(2) |
---|---|---|---|---|---|---|---|
100 | 10 | 20 | 0.15 | 0.10 | 0.10 | 0.21 | 0.20 |
400 | 20 | 40 | 0.33 | 0.15 | 0.12 | 0.55 | 0.50 |
900 | 30 | 60 | 0.39 | 0.15 | 0.17 | 0.75 | 0.71 |
1600 | 40 | 80 | 0.52 | 0.16 | 0.14 | 0.94 | 0.90 |
2500 | 50 | 100 | 0.54 | 0.15 | 0.14 | 0.98 | 0.97 |
3600 | 60 | 120 | 0.56 | 0.13 | 0.13 | 1.00 | 0.99 |
4900 | 70 | 140 | 0.55 | 0.13 | 0.10 | 1.00 | 1.00 |
6400 | 80 | 160 | 0.62 | 0.11 | 0.10 | 1.00 | 1.00 |
Table V
N | M | B | LSS Fr. | LSS ld | MCC | LE(1) | LE(2) |
---|---|---|---|---|---|---|---|
400 | 20 | 40 | 0.17 | 0.11 | 0.08 | 0.27 | 0.27 |
1600 | 40 | 80 | 0.18 | 0.10 | 0.08 | 0.45 | 0.48 |
3600 | 60 | 120 | 0.15 | 0.07 | 0.07 | 0.58 | 0.62 |
6400 | 80 | 160 | 0.16 | 0.07 | 0.08 | 0.69 | 0.75 |
10000 | 100 | 200 | 0.13 | 0.05 | 0.07 | 0.76 | 0.83 |
14400 | 120 | 240 | 0.10 | 0.03 | 0.07 | 0.82 | 0.86 |
19600 | 140 | 280 | 0.09 | 0.04 | 0.07 | 0.86 | 0.89 |
25600 | 160 | 320 | 0.10 | 0.03 | 0.06 | 0.89 | 0.93 |
32400 | 180 | 360 | 0.09 | 0.03 | 0.06 | 0.87 | 0.93 |
Table VI
N | M | B | LSS Fr. | LSS ld | MCC | LE(1) | LE(2) |
---|---|---|---|---|---|---|---|
100 | 10 | 20 | 0.19 | 0.12 | 0.12 | 0.26 | 0.22 |
400 | 20 | 40 | 0.43 | 0.19 | 0.14 | 0.66 | 0.59 |
900 | 30 | 60 | 0.51 | 0.19 | 0.19 | 0.88 | 0.83 |
1600 | 40 | 80 | 0.62 | 0.20 | 0.15 | 0.97 | 0.95 |
2500 | 50 | 100 | 0.65 | 0.18 | 0.17 | 0.99 | 0.99 |
3600 | 60 | 120 | 0.68 | 0.16 | 0.12 | 1.00 | 1.00 |
4900 | 70 | 140 | 0.71 | 0.16 | 0.13 | 1.00 | 1.00 |
6400 | 80 | 160 | 0.75 | 0.17 | 0.12 | 1.00 | 1.00 |
This discussion tends to indicate that, even when is assumed known, the use of the maximum over of the largest eigenvalue of does not introduce any significant loss of performance.
VII Conclusion
In this paper, we have studied the statistical behaviour of certain frequency-domain detection test statistics, based on the eigenvalues of a sample estimate of the SCM, in the high-dimensional regime in which both the dimension of the underlying signals and the number of samples converge to infinity at certain rates. In particular, we have proved various approximation results showing that the sample SCM asymptotically behaves as a Wishart matrix. These results have been exploited to prove that test statistics based on LSS of the sample SCM are not consistent in the high-dimensional regime. A new test statistic relying on the largest eigenvalue of the sample SCM has also been proposed and proved to be consistent in the high-dimensional regime. Finally, numerical results have demonstrated that this new test statistic provides reasonable performance and outperforms other standard test statistics in situations where the dimension and the number of samples are large.
References
- [1] A. Rosuel, P. Vallet, P. Loubaton, and X. Mestre, “On the frequency domain detection of high dimensional time series,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020, pp. 8782–8786.
- [2] D. Ramírez, G. Vazquez-Vilar, R. López-Valcarce, J. Vía, and I. Santamaría, “Detection of rank-p signals in cognitive radio networks with uncalibrated multiple antennas,” IEEE Trans. Signal Process., vol. 59, no. 8, pp. 3764–3774, 2011.
- [3] J. Sala-Alvarez, G. Vázquez-Vilar, R. López-Valcarce, S. Sedighi, and A. Taherpour, “Multiantenna GLR detection of rank-one signals with known power spectral shape under spatially uncorrelated noise,” IEEE Trans. Signal Process., vol. 64, no. 23, pp. 6269–6283, 2016.
- [4] A.-J. Boonstra and A.-J. Van der Veen, “Gain calibration methods for radio telescope arrays,” IEEE Trans. Signal Process., vol. 51, no. 1, pp. 25–38, 2003.
- [5] A. Leshem and A.-J. Van der Veen, “Multichannel detection of Gaussian signals with uncalibrated receivers,” IEEE Signal Process. Lett., vol. 8, no. 4, pp. 120–122, 2001.
- [6] L. D. Haugh, “Checking the independence of two covariance-stationary time series: a univariate residual cross-correlation approach,” Journal of the American Statistical Association, vol. 71, no. 354, pp. 378–385, 1976.
- [7] Y. Hong, “Testing for independence between two covariance stationary time series,” Biometrika, vol. 83, no. 3, pp. 615–625, 1996.
- [8] W. Li and Y. Hui, “Robust residual cross correlation tests for lagged relations in time series,” Journal of Statistical Computation and Simulation, vol. 49, no. 1-2, pp. 103–109, 1994.
- [9] K. El Himdi, R. Roy, and P. Duchesne, “Tests for non-correlation of two multivariate time series: A nonparametric approach,” Lecture Notes-Monograph Series, vol. 42, pp. 397–416, 2003.
- [10] D. Ramirez, J. Via, I. Santamaria, and L. Scharf, “Detection of spatially correlated Gaussian time series,” IEEE Trans. Signal Process., vol. 58, no. 10, pp. 5006–5015, 2010.
- [11] N. Klausner, M. Azimi-Sadjadi, and L. Scharf, “Detection of spatially correlated time series from a network of sensor arrays,” IEEE Trans. Signal Process., vol. 62, no. 6, pp. 1396–1407, 2014.
- [12] D. Brillinger, Time series: data analysis and theory. Classics in Applied Mathematics, SIAM, 2001, vol. 36.
- [13] L. Koopmans, The spectral analysis of time series. Probability and Mathematical Statistics, Academic Press, 1995, vol. 22.
- [14] G. Wahba, “Some tests of independence for stationary multivariate time series,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 33, no. 1, pp. 153–166, 1971.
- [15] M. Taniguchi, M. L. Puri, and M. Kondo, “Nonparametric approach for non-Gaussian vector stationary processes,” Journal of Multivariate Analysis, vol. 56, no. 2, pp. 259–283, 1996.
- [16] M. Eichler, “A frequency-domain based test for non-correlation between stationary time series,” Metrika, vol. 65, no. 2, pp. 133–157, 2007.
- [17] ——, “Testing nonparametric and semiparametric hypotheses in vector stationary processes,” Journal of Multivariate Analysis, vol. 99, no. 5, pp. 968–1009, 2008.
- [18] P. Loubaton and A. Rosuel, “Large random matrix approach for testing independence of a large number of gaussian time series,” arXiv preprint arXiv:2007.08806, 2020.
- [19] V. Marcenko and L. Pastur, “Distribution of eigenvalues for some sets of random matrices,” Mathematics of the USSR-Sbornik, vol. 1, p. 457, 1967.
- [20] J. Baik and J. Silverstein, “Eigenvalues of large sample covariance matrices of spiked population models,” J. Multivariate Anal., vol. 97, no. 6, pp. 1382–1408, 2006.
- [21] D. Morales-Jimenez, I. Johnstone, M. MacKay, and J. Yang, “Asymptotics of eigenstructure of sample correlation matrices for high-dimensional spiked models,” To appear in Statistica Sinica, 2019, preprint arXiv:1810.10214v3.
- [22] M. Forni, M. Hallin, M. Lippi, and L. Reichlin, “The generalized dynamic-factor model: Identification and estimation,” Rev. Econ. Stat., vol. 82, no. 4, pp. 540–554, 2000.
- [23] ——, “The generalized dynamic factor model: consistency and rates,” J. Econom., vol. 119, no. 2, pp. 231–255, 2004.
- [24] P. Vallet, P. Loubaton, and X. Mestre, “Improved Subspace Estimation for Multivariate Observations of High Dimension: The Deterministic Signal Case,” IEEE Trans. Inf. Theory, vol. 58, no. 2, 2012.
- [25] P. Vallet, X. Mestre, and P. Loubaton, “Performance Analysis of an Improved MUSIC DoA Estimator,” IEEE Trans. Signal Process., vol. 63, no. 23, pp. 6407–6422, Dec. 2015.
- [26] R. Horn and C. Johnson, Matrix analysis. Cambridge university press, 2005.
- [27] M. Wax and T. Kailath, “Detection of signals by information theoretic criteria,” IEEE Trans. Acoust., Speech, Signal Process., vol. 33, no. 2, pp. 387–392, 1985.
- [28] J. Baik, G. B. Arous, S. Péché et al., “Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices,” Annals of Probability, vol. 33, no. 5, pp. 1643–1697, 2005.
- [29] F. Benaych-Georges and R. R. Nadakuditi, “The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices,” Advances in Mathematics, vol. 227, no. 1, pp. 494–521, 2011.
- [30] M. Rudelson and R. Vershynin, “Hanson-Wright inequality and sub-Gaussian concentration,” Electron. Commun. Probab., vol. 18, 2013.
- [31] U. Haagerup and S. Thorbjørnsen, “Random matrices with complex Gaussian entries,” Expo. Math., vol. 21, no. 4, pp. 293–337, 2003.
- [32] T. Tao, Topics in random matrix theory. American Mathematical Soc., 2012, vol. 132.
- [33] A. Guionnet and O. Zeitouni, “Concentration of the spectral measure for large matrices,” Electron. Commun. Probab., vol. 5, pp. 119–136, 2000. [Online]. Available: https://doi.org/10.1214/ECP.v5-1026
- [34] W. Hachem, O. Khorunzhiy, P. Loubaton, J. Najim, and L. Pastur, “A new approach for capacity analysis of large dimensional multi-antenna channels,” IEEE Trans. Inf. Theory, vol. 54, no. 9, pp. 3987–4004, 2008.
Appendix A Useful results
In this section, we recall some useful results which will be used repeatedly in the proofs developed in the following sections.
The first result is based on a Chernoff bound for the distribution, and is also a special case of the well-known Hanson-Wright inequality describing the concentration of sub-Gaussian quadratic forms around their means (see [30]).
Lemma 1.
Let and let be a deterministic complex matrix. Then there exists a constant independent of and such that for all ,
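The concentration asserted by Lemma 1 can be illustrated numerically: for a standard complex Gaussian vector x and a deterministic matrix A, the quadratic form x*Ax concentrates around its mean tr(A) at the scale of the Frobenius norm of A. A generic sketch; the matrix and dimensions below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)

M = 400
A = rng.standard_normal((M, M)) / M      # deterministic matrix with ||A||_F close to 1
tr_A = np.trace(A)

def quad_form(A, rng):
    # standard complex Gaussian vector with E|x_i|^2 = 1
    x = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
    return (x.conj() @ A @ x).real

# deviations of x* A x around E[x* A x] = tr(A) stay at the scale ||A||_F
devs = np.array([quad_form(A, rng) - tr_A for _ in range(500)])
frob = np.linalg.norm(A, "fro")
```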
The second result describes the behaviour of the largest and smallest eigenvalues of a standard Wishart matrix.
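The behaviour in question is classical: for a standard complex Wishart matrix XX*/B, with X of size M x B and c = M/B, the extreme eigenvalues converge almost surely to the Marchenko-Pastur edges (1 - sqrt(c))^2 and (1 + sqrt(c))^2. A quick numerical check; the dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

M, B = 100, 400
c = M / B
# standard complex Gaussian entries with unit variance
X = (rng.standard_normal((M, B)) + 1j * rng.standard_normal((M, B))) / np.sqrt(2)
W = X @ X.conj().T / B                   # standard Wishart matrix
eig = np.linalg.eigvalsh(W)              # ascending order
lower, upper = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
```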
Appendix B Proof of Theorem 2
B-A Reduction to
First, note that we may assume without loss of generality. Indeed, consider the decomposition
where and where and are the -th column of and the -th entry of respectively. Moreover, Assumption 3 implies that
From the fact that is fixed with respect to (Assumption 4) and
where , , are defined as , , respectively, Theorem 2 is proved if we can show that
for all . Therefore, we assume for the remainder of the proof that
where
• is a filter, with and such that
(30)
• is a scalar standard complex Gaussian white noise.
B-B Reduction to
Let and
From (30) and Assumption 4, a first-order Taylor expansion of at leads to
Moreover, from Lemma 1 applied to the random vector
and matrix , there exists some constant independent of such that for all ,
and the Borel-Cantelli lemma together with Assumption 4 imply
with probability one. Defining
as well as
with , we therefore have the control
Finally, since the spectral norm of a matrix is bounded by its Frobenius norm,
Theorem 2 is proven if we show that
B-C Periodization
For every integer , let denote the integer contained in such that , and define
where represents the circular convolution between and . If , then the equality
holds for all . It is straightforward to check that
where
and
Theorem 2 is proved if we can show that
(31)
and
(32)
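The periodization above replaces the linear convolution defining the signal by a circular one; the two coincide up to edge terms which are negligible when the filter is summable. The circular convolution and its wrap-around relation to the linear convolution can be sketched as follows (the filter and lengths are illustrative):

```python
import numpy as np

def circular_convolution(h, x):
    """N-point circular convolution of h and x computed via the DFT."""
    N = len(x)
    return np.fft.ifft(np.fft.fft(h, N) * np.fft.fft(x, N))

N = 16
h = np.array([1.0, 0.5, 0.25])           # short causal filter
x = np.random.default_rng(4).standard_normal(N)

y_circ = circular_convolution(h, x)
y_lin = np.convolve(h, x)                # linear convolution, length N + len(h) - 1
# the circular convolution is the linear one with its tail wrapped onto the head
y_wrap = y_lin[:N].copy()
y_wrap[: len(h) - 1] += y_lin[N:]
```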
In the remainder, we only prove (31) and omit the details for (32), whose treatment is similar. To that end, we define
B-D Control of
For , let
Then are i.i.d. and by rearranging the sums in , we have
with
Therefore, with
Moreover,
and a straightforward rearrangement together with (30) leads to
where we used that for . Additionally,
and
Using Lemma 1, there exists a constant independent of such that for all ,
Applying Assumption 4 and the Borel-Cantelli lemma, it follows that
Finally, we deduce that
B-E Control of
We first split in the following two parts
where
We remark that only involves the i.i.d. random variables and that
with defined as
It is clear that
and from (30),
Thus with and
as . Using Lemma 1, as in the control of in the previous section, we end up with
We now consider the term , which involves the sequence of random variables . For all , set
and consider the sequence . Using Assumption 3,
since for any , by the Gaussianity of the , converges almost surely towards as by the law of large numbers, so it remains almost surely bounded for any finite . This implies that the family is a.s. absolutely summable. Therefore, we can rearrange the series defining and write
with probability one, where this time is defined for all as
Again,
(33)
, where
and such that . Thus, using Lemma 1 also yields
This concludes the proof of Theorem 2.
Appendix C Proof of Theorem 3
To prove Theorem 3, we need, as a preliminary step, to study the behaviour of the renormalization by in the SCM.
Proof:
To prove (34), we establish successively
(36)
as well as
(37)
Using (20), we have the bound
with
and
Denoting , where is the -th vector of the canonical basis of , as well as the i.i.d. column vectors of , we have for all ,
From Assumption 1, Assumption 2 and condition (11) from Assumption 3, we have
Setting in the statement of Lemma 1
and as the block-diagonal matrix
with denoting the Kronecker product, we obtain
where is a constant independent of , which in turn implies that
and that (36) holds. In order to check (37), we use Assumption 3 eq. (12) to get that
and from the fact that
We also need the following lemma on the boundedness of matrix .
Proof:
Appendix D Proof of Corollary 1, Corollary 2 and Proposition 1
D-A Proof of Corollary 1
D-B Proof of Corollary 2
D-C Proof of Proposition 1
Appendix E Proof of Corollary 3
We first prove that all the eigenvalues of the SCM asymptotically concentrate in a compact set with probability one for all large . Indeed, considering matrix
defined through Theorem 3 and using Lemma 2 in conjunction with the Borel-Cantelli lemma, we deduce that there exist constants such that
and
(40)
with probability one, where verify, thanks to Assumption 3,
and
Using (22), we obtain similarly
and
with probability one. Let and such that
Then it follows that
Thus, without loss of generality, we may assume for the remainder of the proof that . Using (22), we deduce that
Next, consider the two functions
and
defined for all , and where for all Borel set ,
and
denote the empirical eigenvalue distributions of matrices and respectively, and is the Dirac measure at point . Functions and coincide with the Stieltjes transforms of measures and respectively (see [32] for a review of the main properties of the Stieltjes transform). Since
and using the fact that for non-singular matrices , we have
it follows from Assumptions 2, 3, 4 and Lemma 2 that
(41)
for all . In the following, we fix a realization in an event of probability one for which (41) holds for all and consider
Then as , for all . Since the pointwise convergence on of a sequence of Stieltjes transforms is equivalent to the weak convergence of the associated sequence of probability measures (see e.g. [32, Ex. 2.4.10]), we deduce that
To conclude the proof of Corollary 3, it remains to prove that
Consider the decomposition
where
and
Using the concentration inequality of [33, Cor. 1.8(b)], it is straightforward to show that
Moreover, using again the properties of the Stieltjes transform, it can be deduced from e.g. [34] that
This concludes the proof of Corollary 3.
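The Stieltjes-transform argument used in this proof can be made concrete: for a Hermitian matrix with eigenvalues lam_i, the Stieltjes transform of the empirical eigenvalue distribution equals both the normalized trace of the resolvent and the average of 1/(lam_i - z), and it has positive imaginary part whenever Im z > 0. A small sketch; the matrix model below is purely illustrative:

```python
import numpy as np

def stieltjes(eigvals, z):
    """Stieltjes transform of the empirical eigenvalue distribution at z."""
    return np.mean(1.0 / (eigvals - z))

rng = np.random.default_rng(5)
M = 200
H = rng.standard_normal((M, M))
W = (H + H.T) / np.sqrt(2 * M)           # Hermitian matrix with bounded spectrum
lam = np.linalg.eigvalsh(W)

z = 0.3 + 1.0j                           # point in the upper half-plane
s_resolvent = np.trace(np.linalg.inv(W - z * np.eye(M))) / M
s_spectral = stieltjes(lam, z)
```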
Appendix F Proof of Proposition 2
Convergences (25), (26) and (27) are straightforward consequences of (22) and of the results of [20, Th. 1.1] on the behaviour of the largest eigenvalues of so-called multiplicative spiked random matrix models. To prove (28), we use the bound
Then, from the fact that and Lemma 2, we finally obtain
The proof is concluded by invoking again convergence (22).
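The spiked-model results of [20] invoked above exhibit a phase transition: for a sample covariance matrix whose population covariance is a rank-one perturbation I + omega * u u* of the identity, the largest sample eigenvalue escapes the Marchenko-Pastur bulk edge (1 + sqrt(c))^2 if and only if omega > sqrt(c), in which case it converges to (1 + omega)(1 + c/omega). A numerical illustration with real Gaussian data and illustrative dimensions:

```python
import numpy as np

rng = np.random.default_rng(6)

def largest_eig(omega, M=200, B=800):
    """Largest eigenvalue of a rank-one spiked sample covariance matrix."""
    c = M / B
    u = np.zeros((M, 1)); u[0] = 1.0
    # square root of the population covariance I + omega * u u^T
    sqrt_cov = np.eye(M) + (np.sqrt(1.0 + omega) - 1.0) * (u @ u.T)
    X = sqrt_cov @ rng.standard_normal((M, B))
    W = X @ X.T / B
    return np.linalg.eigvalsh(W)[-1], (1 + np.sqrt(c)) ** 2

lam_sub, edge = largest_eig(omega=0.1)   # below the threshold sqrt(c) = 0.5
lam_sup, _ = largest_eig(omega=2.0)      # above the threshold: escapes the bulk
```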