Rank-adaptive covariance testing with applications to
genomics and neuroimaging
2Department of Statistics, University of Wisconsin - Madison, Madison, Wisconsin, U.S.A.
3Department of Psychology, University of Toronto, Toronto, ON, Canada
∗Equal contributions
Abstract
In biomedical studies, testing for differences in covariance offers scientific insights beyond mean differences, especially when differences are driven by complex joint behavior between features. However, when differences in joint behavior are weakly dispersed across many dimensions and arise from differences in low-rank structures within the data, as is often the case in genomics and neuroimaging, existing two-sample covariance testing methods may suffer from power loss. The Ky-Fan($k$) norm, defined as the sum of the top $k$ singular values, is a simple and intuitive matrix norm able to capture signals caused by differences in low-rank structures between matrices, but its statistical properties in hypothesis testing have not been well studied. In this paper, we investigate the behavior of the Ky-Fan($k$) norm in two-sample covariance testing. Ultimately, we propose a novel methodology, Rank-Adaptive Covariance Testing (RACT), which leverages differences in low-rank structures found in the covariance matrices of two groups in order to maximize power. RACT uses permutation for statistical inference, ensuring exact Type I error control. We validate RACT in simulation studies and evaluate its performance when testing for differences in gene expression networks between two types of lung cancer, as well as when testing for covariance heterogeneity in diffusion tensor imaging (DTI) data taken on two different scanner types.
Keywords: Ky-Fan($k$) norm; adaptive testing; permutation; two-sample covariance testing; genomics; neuroimaging.
1 Introduction
1.1 Two-sample covariance testing in biomedical data
In biomedical studies, comparing covariances often offers scientific insights beyond what is inferred by comparing means. In particular, it can help determine whether complex joint behavior differs between two groups of samples. In this paper, we present two specific applications of two-sample covariance testing within genomics and neuroimaging.
In genomics, gene expression networks, quantified by the covariance of expression levels of multiple genes, provide a better understanding of the genetic drivers of cellular behavior, particularly when the individual contributions from numerous genes are weak but their joint co-expression levels are strong (Eisen \BOthers., \APACyear1998; Stuart \BOthers., \APACyear2003). Testing for differences in gene expression networks between tissue types (e.g., tumor tissue and normal tissue), molecular subtypes of a cancer (e.g., basal and HER2 subtypes in breast cancer), or cancer types (e.g., breast cancer versus ovarian cancer) can be used to identify important genomic biomarkers of complex diseases via these co-expression levels.
In the second motivating example, we consider the ‘batch effect’ problem in genomics, microbiome, and neuroimaging, where non-biological variation is induced by collecting data from multiple labs/sites, platforms, preprocessing methods, or scanners. Within neuroimaging, a number of methods have been proposed to estimate and remove these effects (Hu \BOthers., \APACyear2023); they have the potential to increase the reliability of scientific findings through the larger sample sizes and more diverse groups of subjects that come from combining datasets. Recently, Chen \BOthers. (\APACyear2022), Zhang \BOthers. (\APACyear2023), and Zhang \BOthers. (\APACyear2024) have empirically observed batch effects reflected in the heterogeneity of covariances of observations taken at different sites or using different scanners. Despite these observed differences, there has been limited work on testing whether these observed covariance heterogeneities are statistically significant, and hence whether preprocessing the data to mitigate batch effects in covariances is justified.
1.2 Leveraging low-rank structure
In both genomics and neuroimaging, techniques which leverage the low-rank structure of the often high-dimensional data have been found useful in characterizing and understanding variation within the data. These low-rank structures are correspondingly reflected in the spiked structure of the singular values of the covariance matrix. Using tools from differential co-expression analysis (Chowdhury \BOthers., \APACyear2019), low-rank structures have been empirically observed in the differences between gene expression networks (Amar \BOthers., \APACyear2013; Meng \BOthers., \APACyear2018; Yu \BBA Luo, \APACyear2021). In addition, several model-based approaches in cancer genomics have demonstrated the utility of leveraging low-rank structures in studying differences between tumor types (Park \BBA Lock, \APACyear2020; Lock \BOthers., \APACyear2022). Likewise, the batch-effect-induced covariance heterogeneities in neuroimaging appear to be driven by differences in low-rank structures (Veronese \BOthers., \APACyear2019; Carmon \BOthers., \APACyear2020; Chen \BOthers., \APACyear2022; Zhang \BOthers., \APACyear2023). When these low-rank structures differ between two groups, a low-rank structure is expected to explain most of the difference in covariances between the groups. Therefore, there is potential utility in constructing statistics which leverage the low-rank structure of the difference in covariance matrices between two groups to improve statistical power.
1.3 Literature review
In both of the aforementioned application areas, a low-rank structure is likely to be useful in characterizing the differences in covariance matrices between populations; however, to date few two-sample covariance testing methods, which test $H_0: \Sigma_1 = \Sigma_2$, have been developed which are explicitly able to leverage this type of low-rank structure. Schott (\APACyear2007) and Li \BBA Chen (\APACyear2012) consider test statistics based on the squared Frobenius norm of $\hat\Sigma_1 - \hat\Sigma_2$. Srivastava \BBA Yanagihara (\APACyear2010) develop an estimator based on traces of the sample covariance matrices and of their squares. These tests which involve the Frobenius norm or trace can be seen as functions of all of the singular values of $\hat\Sigma_1 - \hat\Sigma_2$, or of $\hat\Sigma_1$ and $\hat\Sigma_2$, and hence do not exploit the data’s potential low-rank structure. Tests which are more powerful in specific low-rank settings include: Cai \BOthers. (\APACyear2013), a test powerful against sparse alternatives based on the maximum of standardized elementwise differences of the sample covariance matrices between two groups; Danaher \BOthers. (\APACyear2015), a biological-pathway-inspired test which uses the leading eigenvalues and traces of the sample covariance matrices; Zhu \BOthers. (\APACyear2017), which uses the sparse leading eigenvalue of $\hat\Sigma_1 - \hat\Sigma_2$; He \BBA Chen (\APACyear2018), whose test statistic is based on differences in superdiagonals between the sample covariance matrices and is particularly powerful when $\Sigma_1$ and $\Sigma_2$ have a bandable structure; and Ding \BOthers. (\APACyear2024), which compares a subset of eigenvalues between the two sample covariance matrices.
1.4 Our contributions
A selection of the two-sample methods listed in Section 1.3 leverage some form of low-rank structure in either the covariance matrices, or the difference in covariance matrices, between groups; however, each makes strong assumptions about the form of the low-rank structure. This can be problematic in the two application areas this paper examines, where it is unclear in advance what low-rank structures exist, let alone which low-rank structures to use in order to maximize power for two-sample covariance testing. In this paper we bridge this gap and propose a two-sample covariance testing method which adapts to the form of the low-rank structures within the data, without placing stringent prior assumptions on their form. Specifically, we construct a test statistic that is adaptive to the form of the low-rank structure found in the difference of two sample covariance matrices, and utilize a permutation scheme to ensure strict Type I error control in the finite-sample setting.
The remainder of the paper is organized as follows. In Section 2 the adaptive test statistic underlying RACT is presented, and its permutation procedure introduced. Section 3 investigates the asymptotic behavior of the individual test statistics included in RACT’s adaptive test statistic, and discusses the signal-to-noise trade-off of these individual test statistics. A simulation study is presented in Section 4, demonstrating RACT’s adaptivity to various forms of covariance differences. We then apply RACT to the two application areas of interest in Section 5, and compare its performance to a selection of competing two-sample covariance testing methods. Finally in Section 6 we discuss potential extensions, and limitations, of RACT.
2 Methodology
2.1 Notation and setup
Denote $x_{11}, \dots, x_{1n_1} \in \mathbb{R}^p$ and $x_{21}, \dots, x_{2n_2} \in \mathbb{R}^p$ as observed features from two groups of sample sizes $n_1$ and $n_2$, respectively, with equal population means (for practical purposes, as a preprocessing step the data can be centered using the sample mean of each group), and let $\Sigma_1, \Sigma_2$ and $\hat\Sigma_1, \hat\Sigma_2$ be the population and sample covariances of these groups. In addition, we define $n = n_1 + n_2$ to be the total number of samples and $\hat\Sigma$ to be the pooled sample covariance. In this section we test for differences in the covariance matrix; however, if one preferred to test for differences in the correlation matrix, this could be accomplished by redefining these quantities as population and sample correlation matrices. We define our null and alternative hypotheses as $H_0: \Sigma_1 = \Sigma_2$ versus $H_1: \Sigma_1 \neq \Sigma_2$.
2.2 Ky-Fan($k$) statistic with fixed $k$
Testing $H_0$ requires quantifying the difference between $\hat\Sigma_1$ and $\hat\Sigma_2$. As noted in Section 1.3, several existing two-sample covariance testing methods are based on test statistics which utilize the Frobenius norm or leading singular values to detect differences in covariance structures. However, using either of these test statistics individually would be limited against certain covariance differences and could lead to underpowered results. For example, a test statistic based on only a single singular value would be underpowered when several singular values drive the difference in covariance. On the other hand, the Frobenius norm $\|\hat\Sigma_1 - \hat\Sigma_2\|_F$ is well-suited when each entry of $\hat\Sigma_1 - \hat\Sigma_2$ has non-zero expectation, but it could be underpowered when the entries with non-zero expected values are sparse. Also, even when the signals are dense, considering that the Frobenius norm of a matrix is equivalent to the square root of the sum of squares of all of its singular values, it may be underpowered when $\Sigma_1 - \Sigma_2$ is low-rank.
For a fixed $k$, one way to characterize the difference between sample covariance matrices is through the Ky-Fan($k$) norm, giving the statistic

$$ T_k = \|\hat\Sigma_1 - \hat\Sigma_2\|_{(k)} = \sum_{i=1}^{k} \sigma_i(\hat\Sigma_1 - \hat\Sigma_2), \qquad (1) $$

where $1 \le k \le p$ and $\sigma_i(M)$ is the $i$th largest singular value of a matrix $M$. If $\Sigma_1 - \Sigma_2$ is low-rank, there will likely exist a $k$ such that this Ky-Fan($k$) norm captures most of the variation of the signal and, by excluding the bottom $p - k$ singular values, ignores noise introduced through finite-sample variability.
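As a concrete sketch, the statistic can be computed directly from the singular values of the difference of sample covariance matrices (Python/NumPy; function and variable names are illustrative, not from the paper's implementation):

```python
import numpy as np

def ky_fan_stat(X1, X2, k):
    """Ky-Fan(k) statistic: sum of the top-k singular values of the
    difference of the two sample covariance matrices (an illustrative
    sketch, not the paper's implementation)."""
    S1 = np.cov(X1, rowvar=False)   # rows are samples, columns are features
    S2 = np.cov(X2, rowvar=False)
    # Singular values of the symmetric difference, in descending order.
    sv = np.linalg.svd(S1 - S2, compute_uv=False)
    return sv[:k].sum()
```

For $k = 1$ this reduces to the operator norm of the difference, and for $k = p$ to its nuclear norm.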
2.3 Adaptive Ky-Fan statistic
Although $T_k$ is a well-motivated test statistic for a prespecified $k$, it is unclear in advance what value of $k$ will maximize its power. If $k$ is chosen too small, then $T_k$ may fail to capture signal which exists outside of the top $k$ singular values of $\hat\Sigma_1 - \hat\Sigma_2$. Alternatively, if $k$ is chosen too large, then some of the singular values included in $T_k$ will be noise, decreasing the signal-to-noise ratio of $T_k$. Section 3 further discusses this signal-to-noise trade-off in the asymptotic setting.
The problem of selecting the ‘optimal’ $k$ motivates the proposed method, rank-adaptive covariance testing (RACT), which considers the adaptive test statistic:

$$ T = \max_{k \in \mathcal{K}} \frac{T_k - \mu_k}{\sigma_{0,k}}, \qquad (2) $$

where $\mathcal{K} = \{1, \dots, K\}$ indexes the collection of Ky-Fan($k$) norms from $1$ to $K$, and $\mu_k$, $\sigma_{0,k}^2$ are the expected value and variance of $T_k$ under $H_0$.
As discussed in Section 3, for normal data in the asymptotic setting, we show that the signal-to-noise ratio of $T_k$ depends on $k$ and the unknown covariances $\Sigma_1$ and $\Sigma_2$. Therefore, it is difficult to ascertain in advance which $k$ will attain the maximum signal-to-noise ratio for $T_k$. By taking a maximum across different values of $k$ for an appropriately normalized $T_k$, under the alternative the behavior of $T$ will resemble the behavior of the signal-to-noise-maximizing $T_k$, and similarly the power of $T$ will be close to the power of this $T_k$ (our simulations in Section 4 reflect this adaptivity). By including a diverse set of norms in $\mathcal{K}$, an investigator is freed from having to make a potentially consequential decision as to what single matrix norm should be used for testing $H_0$, while at the same time potentially benefiting from the inclusion of norms which are sensitive to certain types of structures in $\Sigma_1 - \Sigma_2$.
The maximum $K$ can be chosen either as $p$, or as a smaller value that reflects prior knowledge of the data or computational considerations (a truncated SVD for the top $K$ singular values is cheaper than a full SVD). In this paper, we choose $K$ to be the smallest $k$ such that the variation of the pooled sample covariance $\hat\Sigma$ explained by its top $k$ singular values exceeds 80%. This is done as the bottom 20% of the singular values are unlikely to provide much of the signal for our test statistic in the high-dimensional setting, and by using observations from both groups we ensure $K$ is the same for all permutations of the data.
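The 80% rule can be sketched as a small helper acting on the pooled sample covariance (Python/NumPy; the helper name is illustrative):

```python
import numpy as np

def choose_K(S_pool, threshold=0.80):
    """Smallest K such that the top-K singular values of the pooled
    sample covariance S_pool explain more than `threshold` of the total
    singular-value mass (an illustrative sketch of the rule above)."""
    sv = np.linalg.svd(S_pool, compute_uv=False)   # descending order
    frac = np.cumsum(sv) / sv.sum()
    # First index whose cumulative fraction strictly exceeds the threshold.
    return int(np.searchsorted(frac, threshold, side="right") + 1)
```

For instance, a covariance with singular values (8.5, 1.0, 0.5) gives $K = 1$, since the leading singular value alone explains 85% of the total.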
2.4 Permutation testing
We use permutation both to estimate $\mu_k$ and $\sigma_{0,k}$ and to set a critical value for conducting hypothesis testing based on $T$. The use of permutation to set a critical value is attractive due to the dependencies inherent in the statistics $T_1, \dots, T_K$. Even if the asymptotic distribution of $T$ were known, the critical values from this distribution could be inaccurate for Type I error control at a finite $n$. We create $B$ permuted datasets (permuting subjects between the two groups) and take the empirical means and standard deviations of $T_k$ for all $k \in \mathcal{K}$ to estimate $\mu_k$ and $\sigma_{0,k}$. Then, using these same permuted datasets, we set the critical value at level $\alpha$ for testing $H_0$ as the $(1-\alpha)$th quantile of $T^{(1)}, \dots, T^{(B)}$, where each $T^{(b)}$ is calculated in the same way as in (2) except each $T_k$ is calculated using the $b$th permuted dataset. We then reject $H_0$ if $T$ is larger than this critical value.
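The full permutation scheme can be sketched end-to-end as follows (Python/NumPy; an illustrative sketch with assumed names, not the paper's implementation):

```python
import numpy as np

def ky_fan_profile(X1, X2, K):
    """T_k for k = 1..K: partial sums of the singular values of the
    difference of the two sample covariance matrices."""
    diff = np.cov(X1, rowvar=False) - np.cov(X2, rowvar=False)
    sv = np.linalg.svd(diff, compute_uv=False)
    return np.cumsum(sv[:K])

def ract_pvalue(X1, X2, K=5, B=500, seed=0):
    """Permutation sketch of the adaptive test: standardize each T_k by
    its permutation mean/sd, take the max over k, and compare against
    the same max statistic computed on each permuted dataset."""
    rng = np.random.default_rng(seed)
    n1 = X1.shape[0]
    Z = np.vstack([X1, X2])
    T_obs = ky_fan_profile(X1, X2, K)
    T_perm = np.empty((B, K))
    for b in range(B):
        idx = rng.permutation(Z.shape[0])        # relabel subjects
        T_perm[b] = ky_fan_profile(Z[idx[:n1]], Z[idx[n1:]], K)
    mu = T_perm.mean(axis=0)
    sd = T_perm.std(axis=0, ddof=1)
    stat = np.max((T_obs - mu) / sd)             # adaptive statistic, cf. (2)
    null = np.max((T_perm - mu) / sd, axis=1)    # permutation null of the max
    return (1 + np.sum(null >= stat)) / (1 + B)
```

Under the null the returned $p$-value is approximately uniform, while a strong covariance difference drives it toward its smallest attainable value $1/(1+B)$.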
Remark 1
While Section 3 suggests that for normally distributed data in the asymptotic setting these statistics are, for certain values of $k$, marginally normal, making the standardized $T_k$ comparable across different $k$, for finite sample sizes we observe these statistics often deviate substantially from normality, particularly for small $k$. We find that a minimum $p$-value approach is more robust in smaller samples, where the test statistic is defined by $T_{\min p} = \min_{k \in \mathcal{K}} p_k$, with $p_k$ the permutation $p$-value corresponding to $T_k$. Further details are provided in the Appendix.
3 Asymptotic analysis
Understanding the (asymptotic) properties of the Ky-Fan norms is necessary for obtaining deeper insights into the power under the alternative hypothesis. Assumptions on the underlying true distributions will be made to derive simple analytical forms below, but they are not required for Type I error control, because permutation effectively controls the Type I error at any finite sample size (Van der Vaart, \APACyear2000).
Throughout this section, we consider two-sample observations from the two groups and assume the sample-size ratios $n_1/n$ and $n_2/n$ are bounded away from zero as $n \to \infty$. We will next elucidate a trade-off between the signal and noise increments that determines an optimal choice of $k$ when using the Ky-Fan($k$) norms.
For ease of discussion, we next consider scenarios where the eigendecomposition of $\Delta = \Sigma_1 - \Sigma_2$ is unique, so that all of the sample Ky-Fan($k$) norms converge to well-defined limits. Suppose $\Delta = U \Lambda V^\top$ is the singular value decomposition with ordered singular values $\sigma_1 \ge \cdots \ge \sigma_p \ge 0$. By matrix properties, since $\Delta$ is symmetric, we can write the eigendecomposition $\Delta = \sum_{i=1}^{p} \lambda_i u_i u_i^\top$ with $\sigma_i = |\lambda_i|$ and $v_i = \operatorname{sign}(\lambda_i)\, u_i$ for $i = 1, \dots, p$.
Theorem 1

For $k = 1, \dots, p$, define

$$ D_k = \|\Sigma_1 - \Sigma_2\|_{(k)} = \sum_{i=1}^{k} \sigma_i. \qquad (3) $$

Let $T_k = \|\hat\Sigma_1 - \hat\Sigma_2\|_{(k)}$.

- (i) When $\sigma_k > 0$,

$$ \sqrt{n}\,\bigl(T_k - D_k\bigr) \xrightarrow{d} \mathcal{N}\bigl(0, \varsigma_k^2\bigr), $$

where $\xrightarrow{d}$ represents convergence in distribution.

- (ii) When $\sigma_k = 0$, $\sqrt{n}\,(T_k - D_k)$ converges in distribution to a functional of a random matrix $W$, where $W$ is jointly normal with mean zero and covariance determined by $\Sigma_1$, $\Sigma_2$, and the sample-size ratio.
By Theorem 1, under $H_1$, the statistics $T_k$ with $\sigma_k > 0$ are asymptotically jointly normal with asymptotic covariances $\varsigma_{kl}$, where we define $\varsigma_{kk} = \varsigma_k^2$ for $k = 1, \dots, p$. In a special case where the $u_i$'s are also eigenvectors of $\Sigma_1$ and $\Sigma_2$, we have $\varsigma_{kl} = \varsigma_k^2$ when $k \le l$, suggesting that the successive increments of the $T_k$'s are independent.
Remark 2
By the properties of the normal distribution, the limiting mean and variance of $T_k$ change with respect to $k$. This motivates us to consider the standardization used in Section 2.3. In this way, the Ky-Fan statistics are put on a standardized scale under $H_0$, so that taking the maximum yields a fair comparison across $k$.
Based on Theorem 1, we study the asymptotic power of the individual Ky-Fan statistic $T_k$. Let $q_{k,\alpha}$ represent the upper $\alpha$-level critical value of $T_k$ under $H_0$. For a given $k$, suppose we reject $H_0$ if and only if $T_k > q_{k,\alpha}$. Let $\mathbb{P}_{H_1}$ denote the probability measure under $H_1$. Then the test power is

$$ \mathbb{P}_{H_1}\bigl(T_k > q_{k,\alpha}\bigr). \qquad (4) $$

When $\sigma_k > 0$, by Part (i) of Theorem 1, (4) can be approximated by

$$ 1 - \Phi\!\left( \sqrt{n}\,\frac{q_{k,\alpha} - D_k}{\varsigma_k} \right), \qquad (5) $$

where $\Phi$ represents the cumulative distribution function of $\mathcal{N}(0,1)$. We define the signal-to-noise ratio as $\mathrm{SNR}_k = D_k / \varsigma_k$. When $n$ is large such that $q_{k,\alpha}$ is very small relative to $D_k$, the power formula (5) is dominated by $\sqrt{n}\,\mathrm{SNR}_k$, and a higher value of $\mathrm{SNR}_k$ indicates a larger value of (5), i.e., higher asymptotic power.
We next compare the asymptotic power of different Ky-Fan statistics by comparing their signal-to-noise ratios.
Proposition 2

For two indexes $k < l$, define

$$ \Delta_S = \frac{D_l - D_k}{D_k}, \qquad \Delta_N = \frac{\varsigma_l - \varsigma_k}{\varsigma_k}, $$

as the relative increments of signal and noise, respectively. We have $\mathrm{SNR}_l \ge \mathrm{SNR}_k$ if and only if $\Delta_S \ge \Delta_N$.
We next discuss the interpretation of Proposition 2. As an example, when $\sigma_{k+1} = 0$ and $l = k+1$, we have $\Delta_S = 0$ (no signal increment) whereas $\Delta_N > 0$ (a positive noise increment), so that $\mathrm{SNR}_{k+1} < \mathrm{SNR}_k$ by Proposition 2. This suggests that the test power/SNR of Ky-Fan($k+1$) would be smaller than that of Ky-Fan($k$); thus it is likely that including sample singular values beyond the rank of $\Sigma_1 - \Sigma_2$ would not help enhance the test power. Moreover, in this case, we anticipate that Ky-Fan($k$) might have higher power than the popular Frobenius norm statistic. When $\sigma_{k+1} > 0$, the condition $\Delta_S > \Delta_N$ can be interpreted as the relative signal increment being larger than the relative noise increment. This suggests that even when the signal has a positive increment, whether the test power can be enhanced depends on the trade-off between signal and noise increments. Such an intricate trade-off partially shows the difficulty of determining an optimal $k$ exactly in observational data, and partially justifies the proposed adaptive version which integrates different Ky-Fan statistics in Equation (2).
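The trade-off can be illustrated with a small Monte Carlo sketch (Python/NumPy; the populations, sample size, and replication count are toy values, and $D_k$ denotes the population Ky-Fan($k$) norm of the covariance difference with $T_k$ its sample counterpart). For a rank-2 difference, the signal $D_k$ stops growing after $k = 2$ while the sampling variability of $T_k$ continues to change, so the Monte Carlo SNR typically peaks near the true rank:

```python
import numpy as np

rng = np.random.default_rng(7)
p, n, reps = 20, 200, 300
# Toy populations: Sigma2 adds a rank-2 perturbation to Sigma1 = I,
# so the population signal D_k is flat for k >= 2.
V = np.linalg.qr(rng.standard_normal((p, 2)))[0]
Sigma1 = np.eye(p)
Sigma2 = np.eye(p) + 2.0 * V @ V.T

D = np.cumsum(np.linalg.svd(Sigma2 - Sigma1, compute_uv=False))  # signal D_k
L = np.linalg.cholesky(Sigma2)

T = np.empty((reps, p))
for i in range(reps):
    X1 = rng.standard_normal((n, p))          # sample from N(0, Sigma1)
    X2 = rng.standard_normal((n, p)) @ L.T    # sample from N(0, Sigma2)
    sv = np.linalg.svd(np.cov(X1, rowvar=False) - np.cov(X2, rowvar=False),
                       compute_uv=False)
    T[i] = np.cumsum(sv)                      # T_k for k = 1..p

snr = D / T.std(axis=0, ddof=1)   # Monte Carlo signal-to-noise per k
```

Plotting `snr` against $k$ for constructions like this gives an empirical picture of the signal-noise trade-off discussed above.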
4 Simulation study
4.1 Simulation setup
We conducted extensive simulation studies to better understand the performance of RACT. All simulations, unless otherwise noted, were run with the same sample sizes and dimension. For simulations related to the null hypothesis, the covariance matrix was the same for all observations; otherwise $\Sigma_1$ and $\Sigma_2$ represent the covariance matrices for group 1 and group 2. We simulated data from covariance matrices with changing low-rank and low-rank block structures, to reflect the block structures commonly found in biomedical data. In the data simulation settings S1-S3 below, we let $\Sigma_1 = \Sigma_0 + B_1 B_1^\top$ and $\Sigma_2 = \Sigma_0 + B_2 B_2^\top$, where $B_1$ and $B_2$ are low-rank matrices, so that $\Sigma_1 - \Sigma_2$ is low-rank. We define $r_1$ and $r_2$ as the ranks of $B_1$ and $B_2$; in our simulations the ranks are set so that $\Sigma_1 - \Sigma_2$ has rank at most 4 or at most 10. To randomly generate appropriately sized blocks of the low-rank components of the covariance matrices, we generated the columns of $B_1$ and $B_2$ independently using the first singular vectors from randomly generated matrices with i.i.d. standard normal entries. Figure 1 provides graphical illustrations of our simulation settings.

- S1. (LowRank): We set $\Sigma_1$ and $\Sigma_2$ to differ by a dense low-rank perturbation. Here, all elements of the covariance matrix experience a change with high probability.

- S2. (LowRankBlockSmall): We set the change to occur within a small block of the covariance matrix.

- S3. (LowRankBlockLarge): A similar setup to S2, except the changing block is large.

- S4. (OffDiagonal): Here $\Sigma_1$ has an equicorrelation structure, with unit variances on the diagonal and a common correlation off the diagonal. $\Sigma_2$ is equal to $\Sigma_1$ except that the covariances between two sets of dimensions are altered.
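An S4-style construction can be sketched as follows (Python/NumPy; the correlation, dimension, and altered block are assumed values for illustration, since the paper's exact constants are not reproduced here). Note that the resulting difference $\Sigma_1 - \Sigma_2$ has exactly two non-zero singular values, consistent with the two very large singular values noted in the results for this setting:

```python
import numpy as np

def equicorrelation(p, rho):
    """Equicorrelation matrix: unit variances, common correlation rho."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))

# Illustrative S4-style setup (rho, p, and the block are assumed values).
p, rho = 100, 0.3
Sigma1 = equicorrelation(p, rho)
Sigma2 = Sigma1.copy()
Sigma2[:10, 10:20] = 0.0   # remove covariances between two sets of dimensions
Sigma2[10:20, :10] = 0.0
```

The difference matrix is non-zero only on one symmetric pair of off-diagonal blocks, each of rank one, which is what produces the two large singular values.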
To evaluate Type I error, we generated 10000 datasets and used 2000 permutations for each dataset. For power analysis we used 1000 datasets and 1000 permutations.
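The low-rank construction used in settings S1-S3 can be sketched as follows (Python/NumPy; the dimension, rank, and scaling are illustrative values rather than the paper's settings):

```python
import numpy as np

def low_rank_factor(p, r, rng):
    """p x r matrix whose columns are the first r left singular vectors
    of a random Gaussian matrix, mirroring the generation described in
    the setup above (illustrative, not the paper's exact code)."""
    G = rng.standard_normal((p, p))
    U, _, _ = np.linalg.svd(G)
    return U[:, :r]

rng = np.random.default_rng(1)
p, r, delta = 100, 4, 1.5               # assumed values, not the paper's
V = low_rank_factor(p, r, rng)
Sigma1 = np.eye(p)
Sigma2 = np.eye(p) + delta * V @ V.T    # rank-r difference between groups
```

Sampling multivariate normal data with these two covariances then gives datasets whose covariance difference has exactly $r$ non-zero singular values.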
4.2 Simulation results
4.2.1 Control of Type I error
RACT provided reliable Type I error control in all of the simulation setups, as expected from the permutation scheme. Across all simulation setups, RACT's empirical Type I error at the 0.05 level fell between 0.047 and 0.054.
4.2.2 Two-sample testing

The performance of RACT in two-sample testing allows us to understand how the power of individual norms drives the overall power of RACT. Figure 2 shows the power of RACT when specific single norms were used, as well as when multiple norms were included as described in Section 2.3. The individual norms we present are the operator norm (i.e., Ky-Fan(1)), Ky-Fan(4), Ky-Fan(10), and Ky-Fan(25), as they help demonstrate, in our simulation settings, the impact varying $k$ has on the power of the Ky-Fan($k$) norm. Recall that because RACT selects $K$ based on the pooled covariance matrix, RACT may not include all of the individual norms we compare it to in all simulations.
- S1. (LowRank): For this covariance structure, in the lower-rank setting RACT's power was similar to that of the Ky-Fan(4) norm, which we would expect to have high power as $\Sigma_1 - \Sigma_2$ is of rank at most 4. Increasing the rank led to an improvement in the performance of the higher Ky-Fan($k$) norms and a relative underperformance of the operator norm. Interestingly, in the higher-rank case the Ky-Fan(4) norm still performed quite strongly, reflecting the difficulty of ascertaining in advance the ideal set of Ky-Fan($k$) norms for RACT to include in the finite-sample setting.

- S2. (LowRankBlockSmall): As the change is relegated to a small block, we saw a deterioration in the performance of the operator norm and an increase in the performance of the higher Ky-Fan($k$) norms. This setting demonstrates the way in which a very small changing block is in some ways an adversarial setting for RACT. Here the norms we examined which provided the greatest individual power were the Ky-Fan(10) and Ky-Fan(25) norms. However, RACT's power was lower than these, as it suboptimally chose $K$ due to the size, and hence large singular values, associated with the unchanging block. For example, at the effect size associated with RACT's power equal to 0.5, $K$ was chosen to be 7.46 on average, implying that the higher-power Ky-Fan($k$) norms were often excluded from consideration.

- S3. (LowRankBlockLarge): Here we saw that when a large block experienced a change, the operator norm markedly underperformed. At the effect size associated with RACT's power equal to 0.5, on average RACT chose $K$ as 19.6 and 11.6 in the two rank settings, respectively. These choices of $K$ contributed to RACT's high power, as the Ky-Fan($k$) norms for large values of $k$ were particularly powerful.

- S4. (OffDiagonal): This setting produces two very large singular values in $\Sigma_1 - \Sigma_2$, and this is reflected in the strong performance of the operator norm. Notable was the very poor performance of the Ky-Fan(25) norm, an indication that including too many Ky-Fan($k$) norms may lead to a deterioration of performance.
One interesting result seen in S2 and S3, but not in S1, is the strong performance of the Ky-Fan(25) norm, despite the fact that the rank of $\Sigma_1 - \Sigma_2$ is at most 4 and 10 in the two rank settings, respectively. This can be attributed to the fact that for S2 and S3, in the finite-sample setting, the top singular values of $\hat\Sigma_1 - \hat\Sigma_2$ can be driven by the unchanging block of $\Sigma_1$ and $\Sigma_2$ for small effect sizes. This led the signal arising from the changing block to be excluded from the Ky-Fan($k$) norms for small values of $k$, reducing their power. This points to the difficulty of ascertaining the optimal $k$ in advance, as the interaction of the covariance structures, the effect size, and the sample size seems to determine which Ky-Fan($k$) norms produce the highest power.
5 Real data analysis
We examine the performance of RACT on two separate datasets and compare it to five competitors: SY (Srivastava \BBA Yanagihara, \APACyear2010), LC (Li \BBA Chen, \APACyear2012), CLX (Cai \BOthers., \APACyear2013), HC (He \BBA Chen, \APACyear2018), and DHW (Ding \BOthers., \APACyear2024). The competing methods are implemented using the R package UHDtst available at https://github.com/xcding1212/UHDtst. Since all of the competing methods are based on asymptotic results and have reported inflated Type I error rates at small sample sizes, we implement permutation-based versions of SY, LC, CLX, and HC for a fair comparison. We do not implement a permutation-based version of DHW given its high computational cost. For the hyperparameters of each method we follow the default implementation in UHDtst; namely, HC tests its default number of superdiagonals, and for DHW the data-splitting algorithm is repeated 100 times.
For both applications, we compare empirical Type I error rates by conducting two-sample tests when both samples are from the same group, and power when they are from different groups, for different sizes of subsamples from the full sample. When comparing RACT to other methods we select $K$ as described in Section 2.3, and when comparing the power of individual Ky-Fan($k$) norms relative to the power of RACT we fix $K$ for comparability across simulations. We present results where we conduct two-sample tests using the covariance matrix, as results when using correlation are broadly similar. We note that the Type I error and power of RACT are nearly identical for covariance and correlation, whereas there is more variability for the competing methods. We provide a visualization of the difference in covariance matrices between groups, and their low-rank approximations, for both datasets in Figure 3.

5.1 TCGA lung cancer data
The Cancer Genome Atlas (TCGA) contains 11,000 samples from 33 tumor types for a range of data modalities and has enabled a better understanding of the molecular changes which cancer causes (Hutter \BBA Zenklusen, \APACyear2018). Of the data modalities available from TCGA, we focus on gene expression, which serves as a vital link in understanding the mechanism through which DNA mutations can lead to cancer. The study of gene expression, recently aided by the development of new sequencing technologies such as RNA-Seq (Wang \BOthers., \APACyear2009), has led to a better understanding of the differences between cancer and normal cells (and therefore serves as a potential biomarker), and has helped in the identification of new disease subtypes (Gerstung \BOthers., \APACyear2015). Specifically, we analyze gene expression networks in two types of lung cancer: lung squamous cell carcinoma (LUSC), a common type of lung cancer (The Cancer Genome Atlas Research Network, \APACyear2012), and lung adenocarcinoma (LUAD), a leading cause of cancer death (The Cancer Genome Atlas Research Network, \APACyear2014). Gene expression for protein-coding genes for these tumors was downloaded using the BiocManager R package (Morgan \BBA Ramos, \APACyear2024), for which LUSC and LUAD tumor samples were available for 19962 genes. We log-transform the data and keep the genes which exhibit the most variability across both tumor types. Finally, within each tumor type, we regressed out age and sex from the data and then used the residuals for our evaluations.
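This preprocessing pipeline can be sketched as follows (Python/NumPy; the log2(x + 1) transform, the number of retained genes, and all names are assumptions for illustration, not the paper's exact choices):

```python
import numpy as np

def preprocess(expr, age, sex, n_top=1000):
    """Sketch of the preprocessing described above: log-transform,
    keep the n_top most variable genes, then regress out age and sex
    within a group and return the residuals. The log2(x + 1) transform
    and n_top are assumed values, not the paper's exact choices."""
    X = np.log2(expr + 1.0)
    keep = np.argsort(X.var(axis=0))[::-1][:n_top]      # most variable genes
    X = X[:, keep]
    Z = np.column_stack([np.ones(len(age)), age, sex])  # intercept, age, sex
    beta, *_ = np.linalg.lstsq(Z, X, rcond=None)        # per-gene regression
    return X - Z @ beta                                 # residuals
```

The returned residual matrix, computed separately within each tumor type, is what the two-sample covariance tests are applied to.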

Using all samples, we see in the difference of covariance matrices between tumor types some evidence of a low-rank structure: the first singular value and the sum of the first 93 singular values represent 16% and 80%, respectively, of the total sum of all singular values. This suggests that when testing the equality of these covariance matrices, much of the signal can be found in a limited number of low-rank structures. This is reflected in Figure 4, where we examine the empirical power of various Ky-Fan($k$) norms for a fixed subsample size. A rapid increase in power is seen as $k$ increases, before power peaks and decreases modestly thereafter. The adaptivity of RACT is demonstrated in that its empirical power, which takes into account multiple $k$, settles near the maximum power across all $k$.
For comparing RACT to other methods, we first note that for moderate subsample sizes of between 15 and 50 we see LC and HC exhibit slightly inflated Type I error of approximately 10%, while CLX and DHW experience more severe Type I error inflation. Hence we use permutation-based versions of LC, HC, and CLX, and exclude DHW from our power comparison. RACT's Type I error is controlled close to the nominal level, while SY is slightly conservative. Figure 4 shows RACT exhibits increased power compared to SY, LC, and CLX, and power comparable to that of HC.
5.2 SPINS diffusion tensor imaging data
The second dataset we analyze is from the Social Processes Initiative in Neurobiology of the Schizophrenia(s) (SPINS) group (Hawco \BOthers., \APACyear2021). This dataset consists of diffusion tensor imaging (DTI) measurements of fractional anisotropy (FA) and mean diffusivity (MD). DTI is a non-invasive technique which measures the diffusion of water molecules within neural tissue. MD relates to the amount of diffusion which is occurring, while FA is a summary statistic related to the architecture of the tissue being examined, influenced by how uniform diffusion is across different directions (O’Donnell \BBA Westin, \APACyear2011; Van Hecke \BOthers., \APACyear2016). DTI has been found useful in a clinical setting in providing biomarkers for amyotrophic lateral sclerosis, and as a diagnostic tool for Parkinson’s disease (Tae \BOthers., \APACyear2018). Zhang \BOthers. (\APACyear2023) investigated the biases that are introduced in DTI imaging in the SPINS study when measurements are taken at different sites or using different scanners, specifically inter-scanner covariance heterogeneity. In the SPINS study most sites began with General Electric 3T (GE) scanners, and all ended with Siemens Prisma 3T (SP) scanners. In the analysis below we provide further evidence of inter-scanner covariance heterogeneity in the SPINS study. Along the lines of Zhang \BOthers. (\APACyear2023), we use linear regression to remove the effects of age, age$^2$, gender, diagnosis, age $\times$ gender, and age $\times$ diagnosis, and then use the residuals to test for differences in the covariance matrix of observations for each scanner type. For both FA and MD, each observation has the same dimension, and we have observations from both GE and SP scanners.
An examination of the difference in covariance matrices using all samples reveals a low-rank structure for both FA and MD measurements. For FA, the first singular value and the sum of the first 19 singular values represent 30% and 80% of the total sum of all singular values. The low-rank structure is more pronounced for MD, where the first singular value represents 59% of the total sum, and the sum of the top 6 singular values represents 81% of the total sum. Using a fixed subsample size of 30 and 15 for FA and MD respectively, we see in Figure 4 that the power of the individual Ky-Fan($k$) norms is maximized at an intermediate $k$ for both FA and MD. Reflecting the lower-rank structure of the MD data, we see a less steep increase in power as $k$ increases relative to FA. Similar to the TCGA data, we see across both FA and MD datasets inflated Type I errors for LC, CLX, HC, and DHW; therefore we use permutation-based versions of LC, CLX, and HC and exclude DHW from the power analysis. RACT's Type I error control is accurate, while SY is very conservative (Type I error of 0%). Relative to the other methods we see results similar to those for the TCGA dataset, with HC once again performing most comparably to RACT.
5.3 Relative performance of competing methods
Across TCGA, SPINS FA, and SPINS MD, we note that RACT's best relative performance appears for the SPINS MD dataset, where its empirical power is strictly higher than the other methods' for subsample sizes up to 35. This potentially reflects the relatively lower-rank structure of the SPINS MD data, and correspondingly RACT's ability to leverage this structure better than other methods. It is also notable that RACT's performance is consistently strong across all datasets, whereas the performance of SY, CLX, and LC is more variable; this would be expected given RACT's adaptive test statistic, as compared to the test statistics of the other methods, which are more sensitive to the specific form the difference in covariance takes. The method performing most similarly to RACT is HC. However, HC's superdiagonal-based test statistic causes its performance to vary based on the ordering of the dimensions, and an arbitrary reordering of the dimensions of these datasets will produce different power results. This could particularly affect HC's results on SPINS FA, which we observe in Figure 3 to have several blocks along the diagonal of the difference in covariance matrices. This contrasts with RACT's test statistic, which is invariant to permutation of the dimensions.
6 Discussion
In this paper we propose a novel two-sample covariance testing method that improves power by leveraging the low-rank structures commonly found in genomics and neuroimaging data. Underlying RACT is the use of the Ky-Fan(k) matrix norm, which is novel in the setting of two-sample covariance testing. In Section 3 we investigate the asymptotic properties of the Ky-Fan(k) norm and discover a delicate signal-to-noise trade-off that emerges for different values of k. This trade-off is reflected in our power simulations, where the Ky-Fan(k) norm that maximizes power is seen to differ between scenarios.
RACT utilizes an adaptive test statistic composed of a series of individual Ky-Fan(k) norms. RACT is able to adapt to differences in low-rank structures by virtue of the fact that, for an appropriate k, the Ky-Fan(k) norm captures most of the signal found in these low-rank differences. Figures 2 and 4 show that the power of RACT generally approaches that of the best-performing individual Ky-Fan(k) norm. Notably, in Figure 2 we see the potential downside of setting k too high, as there are cases where the power of the Ky-Fan(25) norm trails substantially behind that of smaller Ky-Fan(k) norms. This demonstrates the importance of RACT’s adaptivity, as one would not be able to replicate its performance using only a single Ky-Fan(k) norm with large k. However, we also see in Figure 4 that in our real data applications the Ky-Fan(k) norms for very small k have substantially reduced power (and in Figure 2 the operator norm generally does not maximize power). To address this, one could select a lower bound such that only Ky-Fan(k) norms with k at or above that bound are included, potentially choosing the bound in a fashion similar to how the upper bound is selected in Section 2.3. Another potential extension would be to take a maximum over test statistics other than Ky-Fan(k) norms. These test statistics could be chosen to maximize power for the application at hand; for example, if the differences in covariances were expected to have a banded structure, then a test statistic similar to that of He and Chen (2018) could be included.
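The adaptive idea described above can be sketched as follows. Because the Ky-Fan(k) norm is monotone in k, a raw maximum over k would be trivial, so each per-k norm is standardized before the maximum is taken; this sketch is a simplified stand-in for, not the exact definition of, the RACT statistic, and all names are ours:

```python
import numpy as np

def ky_fan_norm(M, k):
    # Ky-Fan(k) norm: sum of the k largest singular values of M
    s = np.linalg.svd(M, compute_uv=False)
    return float(s[:k].sum())

def adaptive_stat(X, Y, ks, n_perm=200, seed=None):
    """Per-k Ky-Fan norms of the difference in sample covariances,
    standardized by their permutation mean/sd, combined via a maximum.
    Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    n1 = X.shape[0]
    Z = np.vstack([X, Y])

    def norms(A, B):
        D = np.cov(A, rowvar=False) - np.cov(B, rowvar=False)
        return np.array([ky_fan_norm(D, k) for k in ks])

    obs = norms(X, Y)
    perm = np.empty((n_perm, len(ks)))
    for b in range(n_perm):
        idx = rng.permutation(Z.shape[0])  # shuffle group labels
        perm[b] = norms(Z[idx[:n1]], Z[idx[n1:]])
    z = (obs - perm.mean(axis=0)) / perm.std(axis=0)
    return float(z.max())  # adaptive: best standardized k wins
```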
The applications to gene expression networks and batch effects in neuroimaging presented in Section 5 show that RACT exhibits strong power relative to competing two-sample covariance testing methods and attains power close to the maximum over individual Ky-Fan(k) norms in each case (Figure 4). We see in Figure 3 the distinct low-rank nature of the differences in the covariance matrices for each dataset, and note RACT’s consistently strong performance compared to other methods whose relative performance is more volatile between datasets.
Section 4 showed that the rank of the true covariance difference has a significant effect on the relative power of the individual Ky-Fan(k) norms. Generally, the Ky-Fan(k) norms that maximize power were those where k is close to the rank of the difference. In biomedical data analysis, techniques such as principal component analysis are often used for dimension reduction. With this in mind, the power-maximizing k of RACT could be used to help guide the selection of how many principal components to include in downstream analysis.
In Section 3 we present the asymptotic distribution of the Ky-Fan(k) norm for a fixed k. While the limiting distribution of the adaptive test statistic remains an open question, we are able to control the Type I error via permutation. While this lack of a closed-form distribution is a limitation, we also note that Section 5 shows that, with finite samples, permutation may be necessary for many other methods to control the Type I error. A further extension could investigate the performance of RACT in the asymptotic setting where the dimension is not fixed, which would be valuable given the high-dimensional nature of many biomedical problems.
Acknowledgements
We would like to thank the SPINS group for providing access to the brain imaging data from the SPINS study. DV was partially supported by a doctoral fellowship from the University of Toronto’s Data Sciences Institute. YH was partially supported by the Wisconsin Alumni Research Foundation. JYP was partially supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2022-04831), the University of Toronto’s Data Sciences Institute, the McLaughlin Centre, and the Connaught Fund. The computing resources were enabled in part by support provided by the University of Toronto and the Digital Research Alliance of Canada (alliancecan.ca).
References
- Amar, D., Safer, H., & Shamir, R. (2013). Dissection of regulatory networks that are altered in disease via differential co-expression. PLoS Computational Biology, 9(3), e1002955.
- Cai, T., Liu, W., & Xia, Y. (2013). Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. Journal of the American Statistical Association, 108(501), 265–277.
- Carmon, J., Heege, J., Necus, J. H., Owen, T. W., Pipa, G., Kaiser, M., … Wang, Y. (2020). Reliability and comparability of human brain structural covariance networks. NeuroImage, 220, 117104.
- Chen, A. A., Beer, J. C., Tustison, N. J., Cook, P. A., Shinohara, R. T., Shou, H., & Alzheimer’s Disease Neuroimaging Initiative. (2022). Mitigating site effects in covariance for machine learning in neuroimaging data. Human Brain Mapping, 43(4), 1179–1195.
- Chowdhury, H. A., Bhattacharyya, D. K., & Kalita, J. K. (2019). (Differential) co-expression analysis of gene expression: A survey of best practices. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 17(4), 1154–1173.
- Danaher, P., Paul, D., & Wang, P. (2015). Covariance-based analyses of biological pathways. Biometrika, 102(3), 533–544.
- Ding, X., Hu, Y., & Wang, Z. (2024). Two sample test for covariance matrices in ultra-high dimension. Journal of the American Statistical Association, just-accepted, 1–22.
- Eisen, M. B., Spellman, P. T., Brown, P. O., & Botstein, D. (1998). Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences, 95(25), 14863–14868.
- Gerstung, M., Pellagatti, A., Malcovati, L., Giagounidis, A., Porta, M., Jädersten, M., … Boultwood, J. (2015). Combining gene mutation with gene expression data improves outcome prediction in myelodysplastic syndromes. Nature Communications, 6, 5901.
- Hawco, C., Dickie, E., Herman, G., Turner, J., Argyan, M., Homan, P., … Voineskos, A. (2021). Social Processes Initiative in Neurobiology of the Schizophrenia(s) Traveling Human Phantoms. OpenNeuro.
- He, J., & Chen, S. X. (2018). High-dimensional two-sample covariance matrix testing via super-diagonals. Statistica Sinica, 28(4), 2671–2696.
- Hu, F., Chen, A. A., Horng, H., Bashyam, V., Davatzikos, C., Alexander-Bloch, A., … Shinohara, R. T. (2023). Image harmonization: A review of statistical and deep learning methods for removing batch effects and evaluation metrics for effective harmonization. NeuroImage, 274, 120125.
- Hutter, C., & Zenklusen, J. C. (2018). The Cancer Genome Atlas: Creating lasting value beyond its data. Cell, 173(2), 283–285.
- Li, J., & Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2), 908–940.
- Lock, E. F., Park, J. Y., & Hoadley, K. A. (2022). Bidimensional linked matrix factorization for pan-omics pan-cancer analysis. The Annals of Applied Statistics, 16(1), 193.
- Meng, S., Liu, G., Su, L., Sun, L., Wu, D., Wang, L., & Zheng, Z. (2018). Functional clusters analysis and research based on differential coexpression networks. Biotechnology & Biotechnological Equipment, 32(1), 171–182.
- Morgan, M., & Ramos, M. (2024). BiocManager: Access the Bioconductor Project Package Repository. R package version 1.30.25. https://CRAN.R-project.org/package=BiocManager
- O’Donnell, L. J., & Westin, C.-F. (2011). An introduction to diffusion tensor image analysis. Neurosurgery Clinics, 22(2), 185–196.
- Pan, W., Kim, J., Zhang, Y., Shen, X., & Wei, P. (2014). A powerful and adaptive association test for rare variants. Genetics, 197(4), 1081–1095.
- Park, J. Y., & Lock, E. F. (2020). Integrative factorization of bidimensionally linked matrices. Biometrics, 76(1), 61–74.
- Schott, J. R. (2007). A test for the equality of covariance matrices when the dimension is large relative to the sample sizes. Computational Statistics & Data Analysis, 51(12), 6535–6542.
- Srivastava, M. S., & Yanagihara, H. (2010). Testing the equality of several covariance matrices with fewer observations than the dimension. Journal of Multivariate Analysis, 101(6), 1319–1329.
- Stuart, J. M., Segal, E., Koller, D., & Kim, S. K. (2003). A gene-coexpression network for global discovery of conserved genetic modules. Science, 302(5643), 249–255.
- Tae, W.-S., Ham, B.-J., Pyun, S.-B., Kang, S.-H., & Kim, B.-J. (2018). Current clinical applications of diffusion-tensor imaging in neurological disorders. Journal of Clinical Neurology, 14(2), 129–140.
- The Cancer Genome Atlas Research Network. (2012). Comprehensive genomic characterization of squamous cell lung cancers. Nature, 489(7417), 519.
- The Cancer Genome Atlas Research Network. (2014). Comprehensive molecular profiling of lung adenocarcinoma. Nature, 511(7511), 543.
- Van der Vaart, A. W. (2000). Asymptotic Statistics (Vol. 3). Cambridge University Press.
- Van Hecke, W., Emsell, L., & Sunaert, S. (2016). Diffusion Tensor Imaging: A Practical Handbook (Vol. 1). Springer.
- Veronese, M., Moro, L., Arcolin, M., Dipasquale, O., Rizzo, G., Expert, P., … Turkheimer, F. (2019). Covariance statistics and network analysis of brain PET imaging studies. Scientific Reports, 9(1), 2496.
- Wang, Z., Gerstein, M., & Snyder, M. (2009). RNA-Seq: A revolutionary tool for transcriptomics. Nature Reviews Genetics, 10(1), 57–63.
- Yu, J., & Luo, X. (2021). Recovering spatially-varying cell-specific gene co-expression networks for single-cell spatial expression data. Frontiers in Genetics, 12, 656637.
- Zhang, R., Chen, L., Oliver, L. D., Voineskos, A. N., & Park, J. Y. (2024). SAN: Mitigating spatial covariance heterogeneity in cortical thickness data collected from multiple scanners or sites. Human Brain Mapping, 45(7), e26692.
- Zhang, R., Oliver, L. D., Voineskos, A. N., & Park, J. Y. (2023). RELIEF: A structured multivariate approach for removal of latent inter-scanner effects. Imaging Neuroscience, 1, 1–16.
- Zhu, L., Lei, J., Devlin, B., & Roeder, K. (2017). Testing high-dimensional covariance matrices, with application to detecting schizophrenia risk genes. The Annals of Applied Statistics, 11(3), 1810.
Appendix
Appendix A Proofs of Results in Section 3
A.1 Proof of Theorem 1
Let denote the th largest singular value of . By matrix properties, we can write the eigendecomposition with . Similarly, let denote the th largest singular value of , and we can write the eigendecomposition with . To prove Theorem 1, we first provide Lemma 3 below, which gives the joint limiting distribution of the eigenvalues of .
Lemma 3 (Limits of eigenvalues)
where the limiting covariance matrix with defined the same as in (3).
For , , and . For ,
Then we have
where is jointly normal with mean zero and covariance .
It remains to prove Lemma 3. Let and . Then we have
Write with , where i.i.d. stands for independent and identically distributed, and . By , we have , where
It follows that . By , and the central limit theorem, we have
where with
(6)
It remains to derive below.
We first consider the first term in (6); the second can be derived similarly. For simplicity of notation, let
Then , and
Applying similar analysis to the second term in (6), we can obtain
A.1.1 Proof of Proposition 2
By the definition of , if and only if
Appendix B Minimum p-value Method
Below we outline the minimum p-value approach from Section 2.4. For this approach we calculate the critical value threshold in a fashion similar to Pan et al. (2014): for each individual k we use the permuted datasets to calculate a p-value, and then take a minimum across all k to obtain the minimum p-value
where the statistic is calculated in the same way as in (1), except with the data in permuted order instead of the original order. We note that only one set of permutations is needed: once the statistic has been calculated for all k and all permutations, the minimum p-value for each permutation can be computed directly from these values without conducting a second set of permutations.
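Assuming the permuted statistics for each k are stored column-wise in a matrix, the reuse of a single set of permutations described above can be sketched as follows (a hypothetical helper, not the paper's implementation):

```python
import numpy as np

def min_p_threshold(perm_stats, alpha=0.05):
    """perm_stats: (B, K) array of permuted statistics, one column per k.
    Converts each permuted statistic to a within-column p-value, takes
    the minimum over k for each permutation, and returns the alpha-quantile
    of that minimum-p null distribution (reject if observed min-p <= it)."""
    B, K = perm_stats.shape
    # rank within each column: 0 = smallest statistic
    ranks = perm_stats.argsort(axis=0).argsort(axis=0)
    # larger statistic -> smaller p-value
    p_perm = (B - ranks) / B
    min_p = p_perm.min(axis=1)  # minimum p-value per permutation
    return float(np.quantile(min_p, alpha))
```

This avoids a second, nested layer of permutations: the same B permuted datasets supply both the per-k p-values and the reference distribution of their minimum.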