
Statistical Inference for Qualitative Interactions with Applications to Precision Medicine and Differential Network Analysis

Aaron Hudson and Ali Shojaie
Department of Biostatistics, University of Washington, Seattle, Washington
Abstract

Qualitative interactions occur when a treatment effect or measure of association varies in sign by sub-population. Of particular interest in many biomedical settings are absence/presence qualitative interactions, which occur when an effect is present in one sub-population but absent in another. Absence/presence interactions arise in emerging applications in precision medicine, where the objective is to identify a set of predictive biomarkers that have prognostic value for clinical outcomes in some sub-population but not others. They also arise naturally in gene regulatory network inference, where the goal is to identify differences in networks corresponding to diseased and healthy individuals, or to different subtypes of disease; such differences lead to identification of network-based biomarkers for diseases. In this paper, we argue that while the absence/presence hypothesis is important, developing a statistical test for this hypothesis is an intractable problem. To overcome this challenge, we approximate the problem in a novel inference framework. In particular, we propose to make inferences about absence/presence interactions by quantifying the relative difference in effect size, reasoning that when the relative difference is large, an absence/presence interaction occurs. The proposed methodology is illustrated through a simulation study as well as an analysis of breast cancer data from the Cancer Genome Atlas.

1 Introduction

An objective of many biomedical studies is to identify and test for interactions, which arise when a measure of effect or association between variables differs by sub-population. Precision medicine and genetic network inference provide examples of areas in which interactions are of interest. For instance, researchers in precision medicine seek to understand how patients’ characteristics are associated with heterogeneity in response to treatment. In genetics studies, it is of interest to determine how genetic networks, which summarize the associations between genes, depend on phenotype.

Interactions may lack clinical or scientific significance when differences in effect are small. In addition to detecting interactions, it is important to identify which are meaningful. For example, in precision medicine, the most important differences in treatment effect may be those in which some sub-populations of patients benefit from the treatment, while other sub-populations are harmed or unaffected. Additionally, one may want to identify differences among sub-populations in the set of biomarkers that have prognostic value for a health outcome — that is, to determine whether some biomarkers are predictive of the outcome in only a subset of the full population. Genetic network inference provides another example: when comparing sub-population-level genetic networks, it may be of primary interest to identify pairs of genes that share an association in some sub-populations but share no association in others, or to identify pairs that have a positive association in one sub-population and a negative association in another. This is known as differential network biology (Ideker and Krogan, 2012).

Such qualitative interactions are the focus of this paper. Qualitative interactions occur when a measure of effect differs in sign by sub-population. We consider two types of qualitative interactions: positive/negative interactions — also known in the literature as cross-over interactions (Gail and Simon, 1985) — and absence/presence interactions — sometimes referred to as pure interactions (VanderWeele, 2019). Positive/negative interactions occur when an effect is positive in one sub-population and negative in another, and absence/presence interactions occur when the effect is present in one population but absent in another.

Our objective is to formally test for qualitative interactions, given independent samples from each sub-population. Testing for positive/negative interactions is well-studied (Gail and Simon, 1985; Piantadosi and Gail, 1993; Pan and Wolfe, 1997; Silvapulle, 2001; Li and Chan, 2006), while testing for absence/presence interactions has received substantially less attention. Naïve approaches, to be discussed in the sequel, require an untenable minimum signal strength condition — that if an effect is present in any sub-population, it is large enough to be detected with absolute certainty. No approaches exist, to the best of our knowledge, that obviate this assumption.

In this paper, we propose a novel framework for inference about absence/presence interactions. Our proposed methodology allows for well-calibrated hypothesis testing under mild assumptions. We also introduce a numerical summary that measures the strength of absence/presence interactions, while accounting for the uncertainty associated with parameter estimation. Additionally, we describe methods for simultaneous inference about absence/presence and positive/negative interactions. The methodology we introduce provides an effective and flexible inference tool in precision medicine and genetic network analysis, as we illustrate in simulations and an analysis of breast cancer data from The Cancer Genome Atlas (TCGA).

2 Background

2.1 Notation

As we begin to formalize the problem, we first introduce some notation. We consider two sub-populations, labeled by g{1,2}g\in\{1,2\}. Let θg\theta_{g}\in\mathds{R} denote a measure of association in sub-population gg. When convenient, we write θ=(θ1,θ2)\theta=(\theta_{1},\theta_{2}). We can consider various measures of association, such as: correlation coefficients, indicating the strength of linear relationship between two variables of interest; log odds ratios, describing the relationship between predictors and a binary outcome; and log hazard ratios, describing the association between predictors and a time-to-event outcome.

We assume that given i.i.d. samples of size n1n_{1} and n2n_{2} from each sub-population, ng\sqrt{n_{g}}-consistent and asymptotically normal estimates θ^g\hat{\theta}_{g} of θg\theta_{g} are available, i.e.,

ng(θ^gθg)dN(0,σg2),\sqrt{n_{g}}\left(\hat{\theta}_{g}-\theta_{g}\right)\to_{d}N\left(0,{\sigma_{g}^{2}}\right),

with σg2>0{\sigma^{2}_{g}}>0 denoting the asymptotic variance. For expositional simplicity, we assume balanced sample sizes n1=n2=nn_{1}=n_{2}=n (key results are stated more generally in the Appendix). We also assume σg2{\sigma^{2}_{g}} is known, though we can instead use a consistent estimate, as is commonly done in practice.

We now formally state the null hypotheses of no positive/negative interactions and no absence/presence interaction, labeled H0P/NH_{0}^{\text{P/N}} and H0A/PH_{0}^{\text{A/P}}, respectively:

H0P/N:θg0 for g{1,2} or θg0 for g{1,2}\displaystyle H_{0}^{\text{P/N}}:\theta_{g}\geq 0\text{ for }g\in\{1,2\}\textbf{ or }\theta_{g}\leq 0\text{ for }g\in\{1,2\} (1)
H0A/P:θg=0 for g{1,2} or θg0 for g{1,2}.\displaystyle H_{0}^{\text{A/P}}:\theta_{g}=0\text{ for }g\in\{1,2\}\textbf{ or }\theta_{g}\neq 0\text{ for }g\in\{1,2\}. (2)

We let \Theta_{0}^{\text{P/N}},\Theta_{0}^{\text{A/P}} and \Theta_{1}^{\text{P/N}},\Theta_{1}^{\text{A/P}} denote the corresponding null and alternative regions of the parameter space, depicted in Figure 1. (Recall that the null region is the set of parameters such that the null hypothesis holds, and the alternative region is the complement of the null region.) The positive/negative null region is the union of the non-negative and non-positive orthants, and the absence/presence null region is the union of all open orthants and the origin.

Figure 1: Null and alternative regions of parameter space for positive/negative, absence/presence, and relative difference hypotheses.

2.2 Testing Composite Null Hypotheses

Our goal is to use the estimate θ^\hat{\theta} to perform tests of H0P/NH_{0}^{\text{P/N}} and H0A/PH_{0}^{\text{A/P}} such that the size is controlled asymptotically under mild assumptions. Recall that for a null hypothesis H0H_{0} with accompanying null region Θ0\Theta_{0}, the size of a test is defined as

supθ0Θ0(“Reject the null hypothesis”|θ=θ0).\sup_{\theta_{0}\in\Theta_{0}}\mathbb{P}(\text{``Reject the null hypothesis''}|\theta=\theta_{0}).

In words, the size is the largest possible type-I error rate that could be achieved under any probability distribution given that θ\theta belongs to the null region.

Here, we describe our general approach for controlling the size at a pre-specified level α(0,1)\alpha\in(0,1). We first define a test statistic TT, a map from observable data to a real-valued number, with larger values of TT corresponding to more evidence against the null hypothesis. We then calculate the test statistic on the observed data, which we denote by tt. We write (T>t|θ=θ0)\mathbb{P}(T>t|\theta=\theta_{0}) as the probability of observing a random test statistic at least as large as the observed value tt, assuming θ=θ0\theta=\theta_{0}. We reject the null hypothesis if

ρ(t)supθ0Θ0(T>t|θ=θ0)<α.\rho(t)\equiv\sup_{\theta_{0}\in\Theta_{0}}\mathbb{P}(T>t|\theta=\theta_{0})<\alpha.

One can think of (T>t|θ=θ0)\mathbb{P}(T>t|\theta=\theta_{0}) as the p-value under a specific null distribution θ=θ0\theta=\theta_{0}; ρ(t)\rho(t) is then the largest of all such p-values. We reject H0H_{0} when there is sufficient evidence to reject all hypotheses θ=θ0\theta=\theta_{0}. We can view ρ(t)\rho(t) as a generalization of the usual p-value for simple null hypotheses to tests with composite null hypotheses, and will simply refer to ρ(t)\rho(t) as “p-value”. Tests of the above form are guaranteed to control the size (Casella and Berger, 2002).
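
The rejection rule can be sketched schematically as follows; null_grid and tail_prob are hypothetical placeholders that must be supplied for a specific problem, and for the tests considered below the supremum is in fact available analytically.

```python
def sup_p_value(t_obs, null_grid, tail_prob):
    """Generalized p-value for a composite null hypothesis: the largest tail
    probability P(T > t_obs | theta = theta_0) over candidate null values.

    `null_grid` is an iterable of points approximating the null region
    Theta_0, and `tail_prob(t, theta_0)` returns P(T > t | theta = theta_0)."""
    return max(tail_prob(t_obs, theta_0) for theta_0 in null_grid)

def composite_test(t_obs, null_grid, tail_prob, alpha=0.05):
    """Reject H_0 when the supremum p-value rho(t) falls below alpha."""
    return sup_p_value(t_obs, null_grid, tail_prob) < alpha
```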

2.3 Existing Methodology

We now review existing approaches to test for qualitative interactions. We first discuss testing positive/negative interactions before moving to absence/presence interactions.

Gail and Simon (1985) developed the most widely used procedure to test for positive/negative interactions. Though Gail and Simon proposed a general KK-sample test, we focus on the two-sample problem in this paper. We note that various KK-sample tests for positive/negative interactions have been proposed (Piantadosi and Gail, 1993; Silvapulle, 2001; Li and Chan, 2006), but these procedures are essentially equivalent to the Gail-Simon test in the two-sample setting.

Gail and Simon’s approach is to perform a likelihood ratio test based on the asymptotic sampling distribution of θ^\hat{\theta}. The likelihood ratio test rejects H0P/NH_{0}^{\text{P/N}} for large values of

sup(a1,a2)Θ0P/Ng{1,2}σg1ϕ{nσg1(θ^gag)}sup(b1,b2)2g{1,2}σg1ϕ{nσg1(θ^gbg)},\displaystyle-\,\frac{\sup_{(a_{1},a_{2})\in\Theta_{0}^{\text{P/N}}}\prod_{g\in\{1,2\}}\sigma_{g}^{-1}\phi\left\{\sqrt{n}\sigma_{g}^{-1}(\hat{\theta}_{g}-a_{g})\right\}}{\sup_{(b_{1},b_{2})\in\mathds{R}^{2}}\prod_{g\in\{1,2\}}\sigma_{g}^{-1}\phi\left\{\sqrt{n}\sigma_{g}^{-1}(\hat{\theta}_{g}-b_{g})\right\}},

where \phi(\cdot) is the standard normal density. By performing algebraic manipulations, one can show that the likelihood ratio test equivalently rejects the null for large values of

TP/N=min(a1,a2)Θ0P/Ng{1,2}n{σg1(θ^gag)}2.\displaystyle T^{\text{P/N}}=\min_{(a_{1},a_{2})\in\Theta_{0}^{\text{P/N}}}\sum_{g\in\{1,2\}}n\left\{\sigma_{g}^{-1}\left(\hat{\theta}_{g}-a_{g}\right)\right\}^{2}.

The test statistic TP/NT^{\text{P/N}} can be interpreted as the shortest distance between θ^\hat{\theta} and the null region, where the distance is inversely weighted by the asymptotic variances of the estimates, as illustrated in Figure 2.

Figure 2: Geometric interpretation of likelihood ratio statistic for positive/negative and absence/presence hypotheses.

Gail and Simon show that the test statistic TP/NT^{\text{P/N}} can be calculated as

TP/N=ming{1,2}{n(θ^g/σg)2}𝟙(θ^Θ1P/N),T^{\text{P/N}}=\min_{g\in\{1,2\}}\left\{n\left(\hat{\theta}_{g}/{\sigma_{g}}\right)^{2}\right\}\mathds{1}\left(\hat{\theta}\in\Theta_{1}^{\text{P/N}}\right),

where 𝟙()\mathds{1}(\cdot) denotes the indicator function. Furthermore, one can verify that for all t>0t>0, the p-value is easily calculated as

ρP/N(t)=supθ0Θ0P/Nlimn(TP/N>t|θ=θ0)=12(χ12>t).{\rho}^{\text{P/N}}(t)=\sup_{\theta_{0}\in\Theta_{0}^{\text{P/N}}}\lim_{n\to\infty}\mathbb{P}\left(T^{\text{P/N}}>t|\theta=\theta_{0}\right)=\frac{1}{2}\mathbb{P}\left(\chi^{2}_{1}>t\right).

The likelihood ratio test is quite intuitive and rejects the null when a positive estimate of association is observed in one population, a negative estimate is observed in another population, and both associations are statistically significant.
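
A minimal implementation of the two-sample Gail-Simon test is sketched below, using the expression for T^{\text{P/N}} and the tail formula above; the function name and interface are ours.

```python
import numpy as np
from scipy import stats

def gail_simon_test(theta_hat, sigma2, n):
    """Two-sample Gail-Simon likelihood ratio test for positive/negative
    (cross-over) interactions.

    theta_hat : length-2 array of sub-population estimates
    sigma2    : length-2 array of asymptotic variances sigma_g^2
    n         : common per-group sample size (balanced design)
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    if theta_hat[0] * theta_hat[1] >= 0:
        # The estimate lies in the null region: signs do not differ.
        return 0.0, 1.0
    t_stat = float(np.min(n * theta_hat**2 / sigma2))
    p_value = 0.5 * stats.chi2.sf(t_stat, df=1)
    return t_stat, p_value
```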

Now, we discuss approaches to test for absence/presence interactions. While one might be tempted to perform a likelihood ratio test for absence/presence interactions, the likelihood ratio test fails in the sense that it can never reject the null. To see this, we first recognize that, similar to the positive/negative interaction test, the absence/presence likelihood ratio test would reject for large values of

TA/P=min(a1,a2)Θ0A/Pg{1,2}n{σg1(θ^gag)}2.\displaystyle T^{\text{A/P}}=\min_{(a_{1},a_{2})\in\Theta_{0}^{\text{A/P}}}\sum_{g\in\{1,2\}}n\left\{\sigma_{g}^{-1}\left(\hat{\theta}_{g}-a_{g}\right)\right\}^{2}.

Again, the test statistic TA/PT^{\text{A/P}} is the shortest distance between θ^\hat{\theta} and the null region Θ0A/P\Theta_{0}^{\text{A/P}}. Because the alternative region Θ1A/P\Theta_{1}^{\text{A/P}} has zero area, θ^\hat{\theta} lies in the null region with probability one. Therefore, the test statistic is always 0, and the likelihood ratio test has no power.

One might alternatively attempt to test the absence/presence null by separately testing the null hypotheses H0g:θg=0H_{0}^{g}:\theta_{g}=0 for g{1,2}g\in\{1,2\} and rejecting the absence/presence null when H0gH_{0}^{g} is rejected for one gg and not rejected for the other. To control the size of a test of this form, we need to simultaneously control the type-I error under two scenarios: (1) there is an association in both sub-populations, and (2) there is no association in either population. When there is an association in both populations, a type-I error occurs when we incorrectly fail to reject one of H0gH_{0}^{g}. When there is no association in either population, we make a type-I error when we incorrectly reject one of H0gH_{0}^{g}. Thus, controlling the size of the test for H0A/PH_{0}^{\text{A/P}} using this approach requires simultaneous control of the type-I error rate and type-II error rate of tests for H0gH_{0}^{g}. If the tests for H0gH_{0}^{g} are consistent — that is, the type-II error rates tend to zero with sufficiently large samples — this approach is asymptotically valid, as only the type-I error rates for tests of H0gH_{0}^{g} need to be controlled. However, with even moderately large samples, we will not be able to correctly reject false H0gH_{0}^{g} with absolute certainty unless the true association is strong and hence easy to detect. This would make the test of H0A/PH_{0}^{\text{A/P}} unreliable in the presence of weak signal.

Specifically, we require that any nonzero \theta_{g} be of larger order than n^{-1/2}, i.e., that \sqrt{n}|\theta_{g}|\to\infty. To see this, we construct a more formal argument. For simplicity, suppose \sigma_{1}=\sigma_{2}=1. We consider tests of H_{0}^{g} of the form

ψg={Reject;if n|θ^g|>aAccept;if n|θ^g|<a,\psi_{g}=\begin{cases}\text{Reject;}&\text{if }\sqrt{n}|\hat{\theta}_{g}|>a\\ \text{Accept;}&\text{if }\sqrt{n}|\hat{\theta}_{g}|<a\end{cases},

where aa is a constant that would be selected to control the size. The probability of rejecting the absence/presence null is

(Reject H0A/P)=(ψ1=Reject)(ψ2=Accept)+(ψ1=Accept)(ψ2=Reject).\mathbb{P}\left(\text{Reject }H_{0}^{\text{A/P}}\right)=\mathbb{P}\left(\psi_{1}=\text{Reject}\right)\mathbb{P}\left(\psi_{2}=\text{Accept}\right)+\mathbb{P}\left(\psi_{1}=\text{Accept}\right)\mathbb{P}\left(\psi_{2}=\text{Reject}\right).

Suppose that \theta_{1} and \theta_{2}, whenever both are nonzero, are guaranteed to be of larger order than n^{-1/2}. Then, for \theta_{1},\theta_{2}\neq 0, \sqrt{n}|\hat{\theta}_{g}|\to\infty for both g, and \mathbb{P}\left(\text{Reject }H_{0}^{\text{A/P}}\right)\to 0. Therefore, to control the size, we only need to select the critical value a so that the type-I error is controlled when \theta_{1}=\theta_{2}=0; this can be done by taking a as the (1-\alpha/4) quantile of the standard normal distribution. However, if signals of order n^{-1/2} or smaller are permitted, the type-I error rate can be drastically inflated. For instance, for a small \epsilon>0, let \theta_{1}=n^{-1/2+\epsilon} and \theta_{2}=n^{-1/2-\epsilon}. Then \sqrt{n}|\hat{\theta}_{1}|\to\infty, so H_{0}^{1} is rejected with probability tending to one, whereas \sqrt{n}\theta_{2}\to 0, so H_{0}^{2} is rejected with probability tending to only \alpha/2. Consequently, \mathbb{P}\left(\text{Reject }H_{0}^{\text{A/P}}\right)\to 1-\alpha/2, which far exceeds the nominal level. Thus, when small signals are permitted, tests of this form are asymptotically anti-conservative.
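
The following Monte Carlo sketch (with illustrative parameter values and \sigma_{1}=\sigma_{2}=1, as above) makes this anti-conservativeness concrete: with one strong and one weak nonzero effect, the naive procedure rejects the absence/presence null far more often than the nominal level.

```python
import numpy as np
from scipy import stats

def naive_rejection_rate(n, theta1, theta2, alpha=0.05, n_sim=100_000, seed=1):
    """Monte Carlo rejection rate of the naive absence/presence procedure that
    rejects when exactly one of the per-group nulls H_0^g : theta_g = 0 is
    rejected, with sigma_1 = sigma_2 = 1."""
    rng = np.random.default_rng(seed)
    a = stats.norm.ppf(1 - alpha / 4)                  # per-group critical value
    z1 = np.sqrt(n) * theta1 + rng.normal(size=n_sim)  # sqrt(n) * theta_hat_1
    z2 = np.sqrt(n) * theta2 + rng.normal(size=n_sim)  # sqrt(n) * theta_hat_2
    rej1, rej2 = np.abs(z1) > a, np.abs(z2) > a
    return float(np.mean(rej1 != rej2))

# Both effects are nonzero, so the absence/presence null holds, yet the weak
# second effect drives the rejection rate far above alpha = 0.05.
print(naive_rejection_rate(n=400, theta1=0.5, theta2=0.02))
```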

Both approaches discussed above for testing absence/presence interactions fail for a similar reason: it is difficult to gather evidence supporting that a measure of association is exactly equal to zero. This is captured by the alternative region having zero area, which causes the failure of the first approach. In the second approach, to obtain evidence supporting that an association is zero, we require that H_{0}^{g} is only accepted when \theta_{g}=0; for this, we rely upon a minimum signal strength condition to guarantee that any non-zero association is detected.

3 Proposed Methodology

3.1 Refinement of Absence/Presence Hypothesis

To mitigate the challenges described in Section 2, we consider a refinement of the absence/presence null hypothesis. The key idea is that in practice, absence/presence interactions can be approximated by considering the settings where an association is at least moderately large in one population and negligible or near zero in the other; or when one association is substantially stronger than the other. This means that we can expand the alternative region to include neighborhoods of zero in a way that the absence/presence interpretation is preserved.

Recall that when there exists an absence/presence interaction, the ratio of the maximum of the absolute value of the θg\theta_{g} to the minimum is infinite. We cannot test that the ratio is infinite because we will never have evidence to support that the denominator is exactly zero. However, we can test that the ratio is large because we may have evidence to support that the denominator is very small. Motivated by this intuition, we propose to test whether the relative difference between sub-population measures of association is greater than a large pre-specified constant κ>1\kappa>1. Formally, let θmax=maxg|θg|\theta_{\max}=\max_{g}|\theta_{g}| and θmin=ming|θg|\theta_{\min}=\min_{g}|\theta_{g}|. We define the new relative difference null hypothesis H0κH_{0}^{\kappa} as

H0κ:θmax/θminκ or θmax=θmin=0.H_{0}^{\kappa}:\theta_{\max}/\theta_{\min}\leq\kappa\text{ or }\theta_{\max}=\theta_{\min}=0.

Equivalently,

H0κ:θmaxκθmin0.\displaystyle H_{0}^{\kappa}:\theta_{\max}-\kappa\theta_{\min}\leq 0. (3)

The null region Θ0κ\Theta_{0}^{\kappa}, illustrated in Figure 1, is the union of four linear subspaces — each residing in a separate orthant of 2\mathds{R}^{2}; the boundary of each subspace is the union of the spans of vectors with absolute direction (κ,1)(\kappa,1)^{\prime} and (1,κ)(1,\kappa)^{\prime}.

The relative difference null region can be viewed as a relaxation of the absence/presence null. For a large choice of \kappa, rejecting either the original or the refined null hypothesis carries the same interpretation: the greater measure of association is substantially larger than the lesser. However, we find it appealing that H_{0}^{\kappa} has a reasonable interpretation for any choice of \kappa; under the null, the multiplicative difference in strength of association is no larger than \kappa.

To motivate defining the refined null hypothesis in terms of relative differences rather than absolute differences, we argue that testing for relative differences has at least the following benefits. First, relative differences are unitless, so the relative difference null is compatible with unitless measures of association such as the Pearson correlation coefficient, which are often preferred in the analysis of biological data. Second, a reasonable κ\kappa can be selected without prior knowledge of ranges of strength of association.

Of course, the relative difference null hypothesis depends on the choice of \kappa. For small values of \kappa, the relative difference null may be too dissimilar from the absence/presence null to retain its interpretation; for large \kappa, the alternative region becomes very small, and the resulting test may be overly conservative. In the following subsections, we first construct a test of the relative difference null hypothesis for a pre-specified \kappa and then describe an approach to identify the set of \kappa such that the test rejects the null hypothesis, so as to circumvent tuning parameter selection.

3.2 Likelihood Ratio Test for Relative Difference Hypothesis

We now develop a testing procedure for the new relative difference null hypothesis. The relative difference null region, unlike the absence/presence null region, has non-zero area, so a likelihood ratio test will not fail in the same manner as the likelihood ratio test for absence/presence interactions. Similar to the previously discussed examples, the likelihood ratio test statistic is

Tκ\displaystyle T^{\kappa} =min(a1,a2)Θ0κg{1,2}n{σg1(θ^gag)}2,\displaystyle=\min_{(a_{1},a_{2})\in\Theta_{0}^{\kappa}}\sum_{g\in\{1,2\}}n\left\{\sigma_{g}^{-1}\left(\hat{\theta}_{g}-a_{g}\right)\right\}^{2},

and can be interpreted as the shortest (weighted) distance between θ^\hat{\theta} and the null region Θ0κ\Theta_{0}^{\kappa}. Clearly, the test statistic is zero whenever θ^\hat{\theta} lies in the null region. Otherwise, TκT^{\kappa} is the shortest distance between θ^\hat{\theta} and the closest of the four linear subspaces that define Θ0κ\Theta_{0}^{\kappa}. The test statistic can be calculated as the distance between (θ^max,θ^min)(\hat{\theta}_{\max},\hat{\theta}_{\min})^{\prime} and its projection onto the span of the vector (κ,1)(\kappa,1)^{\prime}. The test statistic’s geometric interpretation is illustrated in Figure 2.

The likelihood ratio test statistic is straightforward to calculate. Let θ^max=maxg|θ^g|\hat{\theta}_{\max}=\max_{g}|\hat{\theta}_{g}| and θ^min=ming|θ^g|\hat{\theta}_{\min}=\min_{g}|\hat{\theta}_{g}| be the strongest and weakest estimated absolute association. In the following lemma, we state that TκT^{\kappa} is equal to the difference between θ^max\hat{\theta}_{\max} and κθ^min\kappa\hat{\theta}_{\min} divided by a normalizing constant. Thus, the test statistic can be viewed as a plug-in of θ^\hat{\theta} into (3) with an additional normalizing constant and closely resembles the test statistic proposed by Fieller (1940) to conduct inference about ratios of means.

Lemma 1.

The likelihood ratio test statistic TκT^{\kappa} can be written as

T^{\kappa}=\frac{\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}}{\sqrt{\hat{\tau}_{\max}+\kappa^{2}\hat{\tau}_{\min}}},

where τ^max=n1σ12𝟙(|θ^1|=θ^max)+n1σ22𝟙(|θ^2|=θ^max)\hat{\tau}_{\max}=n^{-1}\sigma^{2}_{1}\mathds{1}\left(|\hat{\theta}_{1}|=\hat{\theta}_{\max}\right)+n^{-1}\sigma^{2}_{2}\mathds{1}\left(|\hat{\theta}_{2}|=\hat{\theta}_{\max}\right), and τ^min=n1σ12𝟙(|θ^1|=θ^min)+n1σ22𝟙(|θ^2|=θ^min)\hat{\tau}_{\min}=n^{-1}\sigma^{2}_{1}\mathds{1}\left(|\hat{\theta}_{1}|=\hat{\theta}_{\min}\right)+n^{-1}\sigma^{2}_{2}\mathds{1}\left(|\hat{\theta}_{2}|=\hat{\theta}_{\min}\right).
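
A direct implementation of Lemma 1 for the balanced case might look as follows; the function name is ours, and a tie in the absolute estimates is broken arbitrarily.

```python
import numpy as np

def relative_difference_statistic(theta_hat, sigma2, n, kappa):
    """Likelihood ratio statistic T^kappa of Lemma 1 (balanced samples).

    theta_hat : length-2 array of sub-population estimates
    sigma2    : length-2 array of asymptotic variances sigma_g^2
    n         : common per-group sample size
    kappa     : pre-specified relative-difference threshold (> 1)
    """
    abs_est = np.abs(np.asarray(theta_hat, dtype=float))
    tau = np.asarray(sigma2, dtype=float) / n        # variances of the estimates
    i_max, i_min = int(np.argmax(abs_est)), int(np.argmin(abs_est))
    if i_max == i_min:                               # |theta_hat_1| = |theta_hat_2|
        i_min = 1 - i_max
    num = abs_est[i_max] - kappa * abs_est[i_min]
    den = np.sqrt(tau[i_max] + kappa**2 * tau[i_min])
    return float(num / den)
```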

We now discuss how to obtain a p-value for the relative difference hypothesis. First, we obtain an observed test statistic tt, a realization of TκT^{\kappa} calculated from the data. Following the approach described in Section 2.2, we define the p-value as ρκ(t)\rho^{\kappa}(t), the largest of all asymptotic tail probabilities limn(Tκ>t|θ=θ0)\lim_{n\to\infty}\mathbb{P}(T^{\kappa}>t|\theta=\theta_{0}) such that θ0\theta_{0} belongs to the null region. To determine the maximum tail probability, we characterize the limiting distribution of TκT^{\kappa} assuming θ=θ0\theta=\theta_{0} for all θ0\theta_{0} in the null region.

Though the null region contains an infinite number of values, TκT^{\kappa} can only attain one of three limiting distributions corresponding to the following three cases:

  1. The true association is in the interior of the null region, i.e., \theta_{\max}-\kappa\theta_{\min}<0.

  2. The true association is on the boundary of the null region, but both associations are non-zero, i.e., \theta_{\max}=\kappa\theta_{\min}>0.

  3. The true association is zero in both sub-populations, i.e., \theta_{1}=\theta_{2}=0.

In Proposition 1, we describe the asymptotic behavior of T^{\kappa} for cases 1 and 2 above. We provide here some intuition for the result and reserve a formal argument for the Appendix. In case 1, it is easy to argue that because \hat{\theta} is consistent, T^{\kappa} is negative with probability tending to one, and therefore never provides evidence against the null. In case 2, T^{\kappa} asymptotically follows a standard normal distribution. To see this, we note that because both associations are non-zero, consistency and asymptotic normality of \hat{\theta} imply that, for large n, the sign and order of the estimates are deterministic. That the signs are asymptotically deterministic implies that |\hat{\theta}| is asymptotically normal (speaking loosely, |\hat{\theta}_{g}|\to\text{sign}(\theta_{g})\hat{\theta}_{g}), and that the order is asymptotically deterministic implies that \hat{\theta}_{\max} and \hat{\theta}_{\min} are asymptotically independent. Therefore, the difference between \hat{\theta}_{\max} and \kappa\hat{\theta}_{\min}, suitably standardized, is asymptotically equivalent to a difference between two independent normal random variables with equal means. Dividing by the asymptotic standard deviation gives the claimed result.

Proposition 1.

In the interior of the null region, i.e., when θmaxκθmin<0\theta_{\max}-\kappa\theta_{\min}<0, TκT^{\kappa} converges in distribution to -\infty. At all nonzero boundary points of the null region, i.e., when θmax=κθmin>0\theta_{\max}=\kappa\theta_{\min}>0, TκT^{\kappa} converges in distribution to a standard normal random variable.

In case 3, when both associations are zero, the asymptotic distribution of the test statistic is more complicated. More specifically, θg=0\theta_{g}=0 implies that, asymptotically, |θ^g||\hat{\theta}_{g}| follows a half-normal distribution instead of a normal distribution. Moreover, the order of |θ^||\hat{\theta}| remains random in the limit. This gives rise to a non-standard limiting distribution. In particular, unlike cases 1 and 2, the limiting distribution depends on the asymptotic variances of the sub-population estimates (and also the ratio of the sample sizes in the unbalanced case). We are nonetheless able to derive an analytic expression for the distribution function, stated in Proposition 2. We reserve this statement for the Appendix, as the expression is cumbersome.

By Propositions 1 and 2, we can calculate the p-value as the maximum of the tail probabilities in cases 2 and 3. That is,

ρκ(t)=max{1Φ(t),1Fκ(t)},\displaystyle\rho^{\kappa}(t)=\max\left\{1-\Phi(t),1-F^{\kappa}(t)\right\}, (4)

where \Phi(\cdot) denotes the standard normal distribution function and F^{\kappa}(t)\equiv\lim_{n\to\infty}\mathbb{P}(T^{\kappa}<t|\theta=0) is the limiting distribution function of the test statistic when \theta=0. Though the limiting distribution under \theta=0 is non-standard, tail probabilities can be calculated easily, as we show in the Appendix (Remark 1). The p-value is therefore simple to calculate.
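
The p-value (4) can be sketched as below. As a deliberate simplification, the limiting distribution function F^{\kappa} at \theta=0 is approximated here by Monte Carlo rather than by the analytic expression reserved for the Appendix.

```python
import numpy as np
from scipy import stats

def relative_difference_p_value(t_obs, sigma2, kappa, n_sim=100_000, seed=0):
    """Supremum p-value rho^kappa(t) of (4): the larger of the standard normal
    tail (nonzero boundary points, Proposition 1) and the tail of the limiting
    distribution at theta = 0, approximated by simulation."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(np.asarray(sigma2, dtype=float))
    z = np.abs(rng.normal(size=(n_sim, 2)) * sigma)  # |sqrt(n) theta_hat| under theta = 0
    i_max = np.argmax(z, axis=1)
    i_min = 1 - i_max
    rows = np.arange(n_sim)
    num = z[rows, i_max] - kappa * z[rows, i_min]
    den = np.sqrt(sigma[i_max]**2 + kappa**2 * sigma[i_min]**2)
    tail_at_zero = np.mean(num / den > t_obs)
    return float(max(stats.norm.sf(t_obs), tail_at_zero))
```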

We have derived an analytic approximation for the power of the likelihood ratio test. In Proposition 2, we more generally characterize the limiting distribution of the test statistic under hypotheses of the form (θ1,θ2)=n1/2(c1,c2)(\theta_{1},\theta_{2})=n^{-1/2}(c_{1},c_{2}). The local asymptotic power of the proposed test at the level α\alpha is available by considering βακ(c1,c2)limn(Tκ>t1α|θ=n1/2(c1,c2))\beta^{\kappa}_{\alpha}(c_{1},c_{2})\equiv\lim_{n\to\infty}\mathbb{P}\left(T^{\kappa}>t^{*}_{1-\alpha}|\theta=n^{-1/2}(c_{1},c_{2})\right), where we define t1αt^{*}_{1-\alpha} as the maximum of the 1α1-\alpha quantiles of the limiting distribution of TκT^{\kappa} under scenarios 2 and 3 described above. An analytic finite-sample approximation of the power can be calculated as βακ(n1/2θ1,n1/2θ2)\beta^{\kappa}_{\alpha}\left(n^{1/2}\theta_{1},n^{1/2}\theta_{2}\right).

A contour plot of the local asymptotic power is given in Figure 3. We consider both cases of equal and unequal variance of the estimators. We observe that the likelihood ratio test has low power when the strongest effect max{|c1|,|c2|}\max\left\{|c_{1}|,|c_{2}|\right\} is small, and power improves considerably when the strongest effect grows. Additionally, we find that in the presence of unequal variance, the test has greater power when the weakest effect is estimated with higher precision than the strongest effect.

Figure 3: Contour plot of local asymptotic power of relative difference likelihood ratio test with κ=2\kappa=2 and α=.05\alpha=.05. Bold lines represent the boundary of the null region. Settings of equal and unequal asymptotic variance of estimators are represented.

3.3 Quantifying Relative Difference in Effect by Inverting Likelihood Ratio Test

Rather than perform the likelihood ratio test for a pre-specified \kappa, one may prefer to directly estimate the relative difference in effect. Naïvely, one might consider estimating the relative difference in effect as \hat{\theta}_{\max}/\hat{\theta}_{\min}. However, this estimate behaves poorly when \theta_{1}=\theta_{2}=0 and should therefore not be reported in practice.

To overcome this issue, we propose to quantify the relative difference in effect size by inverting the likelihood ratio test, similar to Fieller (1940). We define

κmaxαsup{κ:κ>1,ρκ(t)<α}\kappa_{\max}^{\alpha}\equiv\sup\{\kappa:\kappa>1,\rho^{\kappa}(t)<\alpha\}

as the largest \kappa>1 such that the likelihood ratio test rejects the null hypothesis at the \alpha level. When the likelihood ratio test fails to reject for all \kappa>1, we use the convention \kappa_{\max}^{\alpha}=1. We find it appealing that \kappa^{\alpha}_{\max} converges to 1 when \theta_{\max}=\theta_{\min}, and otherwise approaches, but does not exceed, \theta_{\max}/\theta_{\min}. We discuss calculation of \kappa^{\alpha}_{\max} in the Appendix.
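
A simple approximation of \kappa_{\max}^{\alpha} is a grid search over \kappa that reuses the statistic and p-value sketches above; the grid and its upper bound are arbitrary illustrative choices, and the exact calculation is the one described in the Appendix.

```python
import numpy as np

def kappa_max(theta_hat, sigma2, n, alpha=0.05, kappa_grid=None):
    """Grid-search approximation of kappa_max^alpha: the largest kappa > 1 at
    which the relative difference test rejects at level alpha; returns 1.0
    when the test rejects for no kappa on the grid."""
    if kappa_grid is None:
        kappa_grid = np.linspace(1.001, 20.0, 400)   # upper bound is arbitrary
    rejected = [float(k) for k in kappa_grid
                if relative_difference_p_value(
                    relative_difference_statistic(theta_hat, sigma2, n, k),
                    sigma2, k, n_sim=20_000) < alpha]
    return max(rejected) if rejected else 1.0
```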

3.4 Simultaneous Test of Qualitative Interactions

When it is of interest to identify both absence/presence and positive/negative qualitative interactions, it may be desirable to test for both simultaneously. In this section, we construct an omnibus test that achieves exact control of the size asymptotically.

We define the omnibus qualitative interaction null hypothesis as

H0P/N,κ: Both H0P/N and H0κ hold.H_{0}^{\text{P/N},\kappa}:\text{ Both }H_{0}^{\text{P/N}}\text{ and }H_{0}^{\kappa}\text{ hold.}

The null region Θ0P/N,κ\Theta_{0}^{\text{P/N},\kappa} is the intersection of the positive/negative and relative difference null regions, as depicted in Figure 4. We observe that as κ\kappa\to\infty, the omnibus null and alternative regions tend to the positive/negative null and alternative regions.

Figure 4: (Left) Null and alternative region for omnibus qualitative interaction hypothesis. (Right) Geometric interpretation of likelihood ratio statistic for omnibus qualitative interaction hypothesis.

To construct the likelihood ratio test, we proceed using similar arguments to those presented in Section 3.2. The likelihood ratio statistic TP/N,κT^{\text{P/N},\kappa} is the distance between the estimate θ^\hat{\theta} and its projection onto the null region Θ0P/N,κ\Theta_{0}^{\text{P/N},\kappa}, inversely weighted by the asymptotic variance of θ^\hat{\theta}. A simple expression for TP/N,κT^{\text{P/N},\kappa} is given in Lemma 2.

Lemma 2.

The likelihood ratio statistic TP/N,κT^{\text{{P/N}},\kappa} can be written as

TP/N,κ=min{(θ^1κθ^2)2n1(σ12+κ2σ22),(κθ^1θ^2)2n1(κ2σ12+σ22)}𝟙(θ^Θ1P/N,κ).T^{\text{{P/N}},\kappa}=\min\left\{\frac{\left(\hat{\theta}_{1}-\kappa\hat{\theta}_{2}\right)^{2}}{n^{-1}(\sigma_{1}^{2}+\kappa^{2}\sigma^{2}_{2})},\frac{\left(\kappa\hat{\theta}_{1}-\hat{\theta}_{2}\right)^{2}}{n^{-1}(\kappa^{2}\sigma_{1}^{2}+\sigma_{2}^{2})}\right\}\mathds{1}\left(\hat{\theta}\in\Theta_{1}^{\text{{P/N}},\kappa}\right).

Unsurprisingly, the likelihood ratio statistic for the omnibus test approaches the Gail-Simon likelihood ratio statistic for positive/negative interactions in the limit of large \kappa. The two tests will, therefore, be nearly identical for sufficiently large \kappa.

To characterize the asymptotic behavior of the omnibus test statistic at each point in the null region, we use arguments similar to those in the previous section. If \theta belongs to the interior of the null region, T^{\text{P/N},\kappa} converges in probability to zero. If \theta belongs to the boundary of the null region and is non-zero, T^{\text{P/N},\kappa} converges weakly to an equal mixture of a point mass at zero and the chi-squared distribution with one degree of freedom. If \theta is zero, the limiting distribution of T^{\text{P/N},\kappa} is non-standard, though it can nonetheless be characterized. Formal statements of the asymptotic properties of T^{\text{P/N},\kappa} are given in Proposition 3 and Remark 2 (the statement of Remark 2 is also cumbersome and is reserved for the Appendix).

Proposition 3.

If θ\theta belongs to the interior of the null region Θ0P/N,κ\Theta_{0}^{\text{P/N},\kappa}, TP/N,κT^{\text{{P/N}},\kappa} converges in distribution to zero. If θ\theta is on the boundary of the null region, but θ0\theta\neq 0, (TP/N,κ>t)12P(χ12>t)\mathbb{P}(T^{\text{{P/N}},\kappa}>t)\to\frac{1}{2}P\left(\chi^{2}_{1}>t\right) as nn\to\infty.

Calculating the p-value for the omnibus test is no more difficult than calculating the p-value for the relative difference test. Defining F^{\text{P/N},\kappa}(t)\equiv\lim_{n\to\infty}\mathbb{P}(T^{\text{P/N},\kappa}<t|\theta=0), the p-value can be calculated as

ρκ,P/N(t)=max{(χ12>t),1FP/N,κ(t)},\displaystyle\rho^{\kappa,\text{P/N}}(t)=\max\left\{\mathbb{P}\left(\chi^{2}_{1}>t\right),1-F^{\text{P/N},\kappa}(t)\right\}, (5)

where tt is the value of the test statistic TP/N,κT^{\text{P/N},\kappa} calculated on the observed data. We characterize the local asymptotic power of the omnibus likelihood ratio test in Proposition 4 in the Appendix.
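
For the balanced case, the omnibus statistic of Lemma 2 and the p-value (5) can be sketched as follows; as before, the limiting distribution at \theta=0 is approximated by Monte Carlo in place of the analytic form in the Appendix.

```python
import numpy as np
from scipy import stats

def omnibus_statistic(theta_hat, sigma2, n, kappa):
    """Likelihood ratio statistic T^{P/N,kappa} of Lemma 2 (balanced samples)."""
    th1, th2 = float(theta_hat[0]), float(theta_hat[1])
    s1, s2 = float(sigma2[0]), float(sigma2[1])
    t_max, t_min = max(abs(th1), abs(th2)), min(abs(th1), abs(th2))
    # Null region: signs agree (or an estimate is zero) and the relative
    # difference in absolute association is at most kappa.
    if th1 * th2 >= 0 and t_max <= kappa * t_min:
        return 0.0
    return min((th1 - kappa * th2) ** 2 / ((s1 + kappa**2 * s2) / n),
               (kappa * th1 - th2) ** 2 / ((kappa**2 * s1 + s2) / n))

def omnibus_p_value(t_obs, sigma2, kappa, n_sim=50_000, seed=0):
    """p-value (5), with the limiting distribution at theta = 0 approximated
    by Monte Carlo."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(np.asarray(sigma2, dtype=float))
    z = rng.normal(size=(n_sim, 2)) * sigma          # sqrt(n) theta_hat under theta = 0
    sims = np.array([omnibus_statistic(row, sigma2, 1, kappa) for row in z])
    return float(max(stats.chi2.sf(t_obs, df=1), np.mean(sims > t_obs)))
```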

4 Simulation Study

In a Monte Carlo simulation study, we examine how type-I error probabilities and statistical power of the likelihood ratio tests for qualitative interactions are affected by signal strength, sample size, and selection of κ\kappa. Additionally, we examine how κmaxα\kappa^{\alpha}_{\max} depends on the true sub-population effects and the sample size.

We generate random observations (Yg,Xg)(Y_{g},X_{g}) in sub-population gg under the linear model:

Yg=θgXg+ϵ;XgN(0,1);ϵN(0,1).Y_{g}=\theta_{g}X_{g}+\epsilon;\,X_{g}\sim N(0,1);\,\epsilon\sim N(0,1).

Here, X_{g} is the predictor of interest, Y_{g} is the response, and \epsilon is white noise. The measure of association in which we are interested is the regression coefficient \theta_{g}. We fix \theta_{1}=1 and consider \theta_{2}\in\{-1,-.9,\ldots,.9,1\}. A total of 1000 synthetic data sets are randomly generated for each \theta_{2} and n\in\{50,100\}.

For each synthetic data set, we perform both the relative difference likelihood ratio test and the omnibus qualitative interaction likelihood ratio test with κ=2\kappa=2 and κ=4\kappa=4; we use a significance level of α=.05\alpha=.05. We additionally calculate κmaxα\kappa^{\alpha}_{\max} with α=.05\alpha=.05. Parameter estimation is performed with ordinary least squares, and model-based estimates of the standard error are used.
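
One cell of this simulation design can be sketched as follows, reusing the relative difference statistic and p-value functions from Section 3.2; this is an illustrative reimplementation, not the code used to produce the reported results.

```python
import numpy as np

def simulate_rejection_rate(theta2, n=100, kappa=2.0, alpha=0.05,
                            n_rep=1000, theta1=1.0, seed=2):
    """Monte Carlo rejection rate of the relative difference test under the
    linear model, with slopes estimated by OLS without an intercept
    (matching the generating model) and model-based variance estimates."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        est, var = [], []
        for theta_g in (theta1, theta2):
            x = rng.normal(size=n)
            y = theta_g * x + rng.normal(size=n)
            slope = np.sum(x * y) / np.sum(x**2)
            resid = y - slope * x
            # Estimate of the asymptotic variance of sqrt(n)(theta_hat_g - theta_g).
            var_g = n * np.sum(resid**2) / (n - 1) / np.sum(x**2)
            est.append(slope)
            var.append(var_g)
        t = relative_difference_statistic(est, var, n, kappa)
        if relative_difference_p_value(t, var, kappa, n_sim=20_000) < alpha:
            rejections += 1
    return rejections / n_rep
```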

In Figure 5, we plot the Monte Carlo estimate of the rejection probability of the relative difference likelihood ratio test over the range of θ2\theta_{2}. We see that the test achieves control of size for both choices of κ\kappa and both sample sizes. We observe that power is largest when |θ2||\theta_{2}| is near zero, as we would expect. Power is moderately strong when κ=2\kappa=2 but fairly poor when κ=4\kappa=4. With κ=4\kappa=4, the likelihood ratio test has almost zero power when the sample size is small, though we see modest improvement when the sample size is larger.

Figure 5 also shows the estimated rejection probabilities of the omnibus test over θ2\theta_{2}. Control of size is achieved in both large and small samples, as expected. We note that for the omnibus test, power increases as θ2\theta_{2} tends to 1-1, and power is uniformly larger when κ=2\kappa=2 than when κ=4\kappa=4.

Figure 5: Monte Carlo estimate of rejection probability of the relative difference and omnibus likelihood ratio tests. The shaded light grey and dark grey areas correspond to sets of θ2\theta_{2} such that the null hypothesis holds with κ=2\kappa=2 and κ=4\kappa=4, respectively. The dashed red line denotes the specified size α=.05\alpha=.05. A color version of this figure is available in the online version of the article.

In Figure 6, we plot the quantiles of the κmaxα\kappa_{\max}^{\alpha} values from 1000 synthetic data sets for each θ2\theta_{2}. We expect that for a fixed θ2\theta_{2}, most κmaxα\kappa_{\max}^{\alpha} should approach but not exceed θ1/|θ2|\theta_{1}/|\theta_{2}| as sample size increases; our simulations are consistent with this expectation. We note that when the sample size is small, κmaxα\kappa^{\alpha}_{\max} tends to underestimate the relative difference in effect size.

Figure 6: Distribution of \kappa_{\max}^{\alpha} calculated on synthetic data. The .10, .50, and .90 quantiles are displayed. The grey curve represents |\theta_{1}|/|\theta_{2}|, the value \kappa_{\max}^{\alpha} is expected to approach for a given \theta_{2}. A color version of this figure is available in the online version of the article.

5 Data Example

In this example, we investigate genetic differences in breast cancer sub-types. Classification of breast cancer based on expression of estrogen receptor (ER) is known to be associated with clinical outcomes. Approximately 70% of breast cancers are estrogen receptor positive (ER+) cancers, meaning that estrogen causes cancer cells to grow (Lumachi et al., 2013); breast cancers are otherwise estrogen receptor negative (ER-). Patients with ER+ breast cancer tend to experience better clinical outcomes than ER- patients (Carey et al., 2006).

We conduct an analysis using publicly available data from The Cancer Genome Atlas (TCGA) (Weinstein et al., 2013). We use clinical data and gene expression data from a total of 806 ER+ patients and 237 ER- patients.

We first investigate the differences between the genetic networks in ER+ and ER- breast cancer. Both ER+ and ER- breast cancer are expected to have similar pathways, but identifying differences between them may be key to understanding the underlying disease mechanism. We then conduct an analysis to assess whether any genes in a set known to be associated with breast cancer are strongly prognostic of disease outcomes in only one of the estrogen receptor groups.

5.1 Differential Network Analysis

Our objective is to determine whether there are any pairs of genes that are much more strongly associated in one estrogen receptor group than the other. We consider the set of p=145p=145 genes in the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa and Goto, 2000) breast cancer pathway and measure the association between gene expression levels using the Pearson correlation.

We test the relative difference null hypothesis for each pair of genes with \kappa=2. In Figure 7, we display the pairs of genes that are statistically significant at the \alpha=.10 level after a Bonferroni adjustment. We find that each of the genes progesterone receptor (PGR), insulin-like growth factor 1 receptor (IGF1R), and estrogen receptor 1 (ESR1) has multiple differential connections; each belongs to at least two pairs in which the association is more than twice as strong in the ER+ group as in the ER- group. These genes have been shown in the literature to be associated with sub-type and prognosis (Farabaugh et al., 2015; Reinert et al., 2019; Kurozumi et al., 2017).

5.2 Prognostic Value of Biomarkers

The goal of this analysis is to assess whether any of the KEGG genes have a stronger association with time to death in one estrogen receptor group than in the other. For each gene, we fit a univariate Cox proportional hazards model with time to death as the outcome in each of the estrogen receptor groups separately; we measure association using the log hazard ratio. A total of 64 deaths occurred in the ER+ group, and 33 deaths occurred in the ER- group. We calculate \kappa^{\alpha}_{\max} with \alpha=.10 for each gene.

In Figure 7, we compare the log hazard ratios of the ER+ and ER- groups in a scatterplot. Though the log hazard ratios for most genes are similar between subgroups, there are twelve genes with κmaxα\kappa^{\alpha}_{\max} larger than one. A complete list is available in Table 1. The two genes with the strongest interactions are Growth Factor Receptor-bound Protein 2 (GRB2; κmaxα=2.04\kappa^{\alpha}_{\max}=2.04), which has a stronger association in the ER- group, and Adenomatous Polyposis Coli (APC; κmaxα=1.91\kappa^{\alpha}_{\max}=1.91), which has a stronger association in the ER+ group. Both genes have been hypothesized to be associated with breast cancer carcinogenesis (Daly et al., 1994; Jin et al., 2001).

Gene ER+ Log HR (SE) ER- Log HR (SE) \kappa^{\alpha}_{\max}
GRB2 -0.06 (0.31) -1.66 (0.68) 2.04
APC 1.34 (0.32) -0.09 (0.33) 1.91
BAX -1.05 (0.24) 0.04 (0.36) 1.53
PIK3CA 1.13 (0.28) 0.14 (0.32) 1.51
SOS2 1.13 (0.36) -0.1 (0.37) 1.33
MAP2K2 -0.87 (0.27) 0.03 (0.35) 1.22
GADD45G -0.52 (0.13) -0.07 (0.19) 1.21
HES5 0.02 (0.2) 0.51 (0.18) 1.19
WNT2 -0.36 (0.09) 0 (0.17) 1.14
DLL4 0.09 (0.2) 0.68 (0.27) 1.10
FRAT2 -1.22 (0.31) -0.45 (0.29) 1.08
SOS1 1.19 (0.3) -0.34 (0.42) 1.01
Table 1: KEGG genes that are more strongly associated with time to death in one ER group than the other, i.e., κmaxα>1\kappa^{\alpha}_{\max}>1 with α=.10\alpha=.10.
Figure 7: (Left) Pairs of genes in the KEGG breast cancer pathway for which we reject the relative difference hypothesis with κ=2\kappa=2. Blue edges indicate associations that are stronger in the ER+ group, and red edges indicate associations that are stronger in the ER- group. (Right) Log hazard ratios for KEGG genes in ER+ and ER- groups. The gray dashed line represents the 45-degree line. Blue diamonds and red triangles indicate genes for which κmaxα>1\kappa^{\alpha}_{\max}>1 with α=.10\alpha=.10, where the largest log hazard ratio is in the ER+ group and ER- group, respectively. A color version of this figure is available in the online version of the article.

6 Discussion

We have proposed a general framework for inference about absence/presence qualitative interactions. We argued that naïve procedures rely upon untenable conditions because the absence/presence hypothesis is ill-posed. We thus proposed to relax the problem in order to conduct well-calibrated inference that maintains the absence/presence interpretation and only requires mild assumptions.

In simulations, we found that our methodology has low power when signal is weak or sample sizes are small. To an extent, this is just a feature of the problem; naturally, one would require even more information to detect qualitative interactions than what is required to detect quantitative interactions. However, we provide no guarantee that our methodology is optimal, as tests for composite hypotheses based upon supremum p-values can be conservative in practice (Bayarri and Berger, 2000).

Nonetheless, our framework is interpretable and provides a natural approach for quantifying differences in strength of association by sub-population in general settings. Though we only considered measures of marginal association in our examples, our method can be used with conditional measures of association as well; we only require that asymptotically normal estimates are available. In particular, our approach remains valid in the high-dimensional setting, where asymptotically normal estimates can be obtained using the techniques of, e.g., van de Geer et al. (2014) and Zhang and Zhang (2014). Finally, our method has good utility in the analysis of genomics data, as we demonstrated in our real data example.

7 Acknowledgements

The authors gratefully acknowledge the support of the NSF Graduate Research Fellowship Program under grant DGE-1762114 as well as NSF grant DMS-1561814 and NIH grant R01-GM114029. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies.

References

  • Bayarri and Berger (2000) Bayarri, M. and Berger, J. O. (2000). P values for composite null models. Journal of the American Statistical Association 95, 1127–1142.
  • Carey et al. (2006) Carey, L. A., Perou, C. M., Livasy, C. A., Dressler, L. G., Cowan, D., Conway, K., Karaca, G., Troester, M. A., Tse, C. K., Edmiston, S., et al. (2006). Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. JAMA 295, 2492–2502.
  • Casella and Berger (2002) Casella, G. and Berger, R. L. (2002). Statistical inference, volume 2. Duxbury Pacific Grove, CA.
  • Daly et al. (1994) Daly, R. J., Binder, M. D., and Sutherland, R. L. (1994). Overexpression of the grb2 gene in human breast cancer cell lines. Oncogene 9, 2723–2727.
  • Farabaugh et al. (2015) Farabaugh, S. M., Boone, D. N., and Lee, A. V. (2015). Role of igf1r in breast cancer subtypes, stemness, and lineage differentiation. Frontiers in endocrinology 6, 59.
  • Fieller (1940) Fieller, E. C. (1940). The biological standardization of insulin. Supplement to the Journal of the Royal Statistical Society 7, 1–64.
  • Gail and Simon (1985) Gail, M. and Simon, R. (1985). Testing for qualitative interactions between treatment effects and patient subsets. Biometrics 41, 361–372.
  • Ideker and Krogan (2012) Ideker, T. and Krogan, N. J. (2012). Differential network biology. Molecular systems biology 8, 565.
  • Jin et al. (2001) Jin, Z., Tamura, G., Tsuchiya, T., Sakata, K., Kashiwaba, M., Osakabe, M., and Motoyama, T. (2001). Adenomatous polyposis coli (apc) gene promoter hypermethylation in primary breast cancers. British journal of cancer 85, 69–73.
  • Kanehisa and Goto (2000) Kanehisa, M. and Goto, S. (2000). Kegg: kyoto encyclopedia of genes and genomes. Nucleic acids research 28, 27–30.
  • Kurozumi et al. (2017) Kurozumi, S., Matsumoto, H., Hayashi, Y., Tozuka, K., Inoue, K., Horiguchi, J., Takeyoshi, I., Oyama, T., and Kurosumi, M. (2017). Power of pgr expression as a prognostic factor for er-positive/her2-negative breast cancer patients at intermediate risk classified by the ki67 labeling index. BMC cancer 17, 354.
  • Li and Chan (2006) Li, J. and Chan, I. S. (2006). Detecting qualitative interactions in clinical trials: an extension of range test. Journal of Biopharmaceutical statistics 16, 831–841.
  • Lumachi et al. (2013) Lumachi, F., Brunello, A., Maruzzo, M., Basso, U., and Mm Basso, S. (2013). Treatment of estrogen receptor-positive breast cancer. Current medicinal chemistry 20, 596–604.
  • Müller (2001) Müller, A. (2001). Stochastic ordering of multivariate normal distributions. Annals of the Institute of Statistical Mathematics 53, 567–575.
  • Pan and Wolfe (1997) Pan, G. and Wolfe, D. A. (1997). Test for qualitative interaction of clinical significance. Statistics in medicine 16, 1645–1652.
  • Piantadosi and Gail (1993) Piantadosi, S. and Gail, M. (1993). A comparison of the power of two tests for qualitative interactions. Statistics in Medicine 12, 1239–1248.
  • Reinert et al. (2019) Reinert, T., Coelho, G. P., Mandelli, J., Zimermann, E., Zaffaroni, F., Bines, J., Barrios, C. H., and Graudenz, M. S. (2019). Association of ESR1 mutations and visceral metastasis in patients with estrogen receptor-positive advanced breast cancer from Brazil. Journal of Oncology 2019.
  • Silvapulle (2001) Silvapulle, M. J. (2001). Tests against qualitative interaction: Exact critical values and robust tests. Biometrics 57, 1157–1165.
  • van de Geer et al. (2014) van de Geer, S., Bühlmann, P., Ritov, Y., Dezeure, R., et al. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics 42, 1166–1202.
  • VanderWeele (2019) VanderWeele, T. J. (2019). The interaction continuum. Epidemiology (Cambridge, Mass.) 30, 648.
  • Weinstein et al. (2013) Weinstein, J. N., Collisson, E. A., Mills, G. B., Shaw, K. R. M., Ozenberger, B. A., Ellrott, K., Shmulevich, I., Sander, C., Stuart, J. M., Network, C. G. A. R., et al. (2013). The cancer genome atlas pan-cancer analysis project. Nature genetics 45, 1113.
  • Zhang and Zhang (2014) Zhang, C.-H. and Zhang, S. S. (2014). Confidence intervals for low dimensional parameters in high dimensional linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76, 217–242.

Software Availability

Implementation of the proposed methodology and code to reproduce results from the data analysis are available at https://github.com/awhudson/QualitativeInteractions. The TCGA data are available using the public R package RTCGA.

Appendix

Theoretical Results for Relative Difference Likelihood Ratio Test

Below, we generalize Lemma 1 and Proposition 1 to the setting of unbalanced sample sizes n1n2n_{1}\neq n_{2}.

Lemma 1.

The likelihood ratio test statistic TκT^{\kappa} can be written as

Tκ=θ^maxκθ^minτ^max+κ2τ^min,T^{\kappa}=\frac{\hat{\theta}_{\max}-\kappa\hat{\theta}_{\text{min}}}{\sqrt{\hat{\tau}_{\max}+\kappa^{2}\hat{\tau}_{\min}}},

where \hat{\tau}_{\max}=n_{1}^{-1}\sigma^{2}_{1}\mathds{1}(|\hat{\theta}_{1}|=\hat{\theta}_{\max})+n_{2}^{-1}\sigma^{2}_{2}\mathds{1}(|\hat{\theta}_{2}|=\hat{\theta}_{\max}), and \hat{\tau}_{\min}=n_{1}^{-1}\sigma^{2}_{1}\mathds{1}(|\hat{\theta}_{1}|=\hat{\theta}_{\min})+n_{2}^{-1}\sigma^{2}_{2}\mathds{1}(|\hat{\theta}_{2}|=\hat{\theta}_{\min}).

Proof of Lemma 1.

We define \hat{\theta}_{\text{ord}} and \hat{\Sigma}_{\text{ord}} as

\hat{\theta}_{\text{ord}}=\begin{pmatrix}\hat{\theta}_{\max}\\ \hat{\theta}_{\min}\end{pmatrix}\quad\quad\hat{\Sigma}_{\text{ord}}=\begin{pmatrix}\hat{\tau}_{\max}&0\\ 0&\hat{\tau}_{\min}\end{pmatrix}.

Now, we define \theta^{*}_{\text{ord}} as the projection of \hat{\theta}_{\text{ord}} onto the null region. If \hat{\theta} is in the null region \Theta_{0}^{\kappa}, the projection is equal to \hat{\theta}_{\text{ord}}. Otherwise, \theta^{*}_{\text{ord}} is the projection of \hat{\theta}_{\text{ord}} onto the space spanned by (\kappa,1)^{\prime}, with distance weighted by \hat{\Sigma}_{\text{ord}}^{-1}. We proceed by finding an expression for \theta^{*}_{\text{ord}}:

\theta^{*}_{\text{ord}}=\begin{pmatrix}\kappa\\ 1\end{pmatrix}\left[\begin{pmatrix}\kappa&1\end{pmatrix}\begin{pmatrix}\hat{\tau}_{\max}^{-1}&0\\ 0&\hat{\tau}_{\min}^{-1}\end{pmatrix}\begin{pmatrix}\kappa\\ 1\end{pmatrix}\right]^{-1}\begin{pmatrix}\kappa&1\end{pmatrix}\begin{pmatrix}\hat{\tau}_{\max}^{-1}&0\\ 0&\hat{\tau}_{\min}^{-1}\end{pmatrix}\begin{pmatrix}\hat{\theta}_{\max}\\ \hat{\theta}_{\min}\end{pmatrix}=\frac{\kappa\hat{\tau}_{\max}^{-1}\hat{\theta}_{\max}+\hat{\tau}_{\min}^{-1}\hat{\theta}_{\min}}{\kappa^{2}\hat{\tau}_{\max}^{-1}+\hat{\tau}_{\min}^{-1}}\begin{pmatrix}\kappa\\ 1\end{pmatrix}.

Now, the weighted distance between \hat{\theta}_{\text{ord}} and \theta^{*}_{\text{ord}} is

(\hat{\theta}_{\text{ord}}-\theta^{*}_{\text{ord}})^{\prime}\hat{\Sigma}_{\text{ord}}^{-1}(\hat{\theta}_{\text{ord}}-\theta^{*}_{\text{ord}})=
\hat{\tau}^{-1}_{\max}\left(\hat{\theta}_{\max}-\frac{\kappa^{2}\hat{\tau}_{\max}^{-1}\hat{\theta}_{\max}+\kappa\hat{\tau}_{\min}^{-1}\hat{\theta}_{\min}}{\kappa^{2}\hat{\tau}_{\max}^{-1}+\hat{\tau}_{\min}^{-1}}\right)^{2}+\hat{\tau}^{-1}_{\min}\left(\hat{\theta}_{\min}-\frac{\kappa\hat{\tau}_{\max}^{-1}\hat{\theta}_{\max}+\hat{\tau}_{\min}^{-1}\hat{\theta}_{\min}}{\kappa^{2}\hat{\tau}_{\max}^{-1}+\hat{\tau}_{\min}^{-1}}\right)^{2}=
\hat{\tau}_{\max}^{-1}\left(\frac{\hat{\tau}_{\min}^{-1}\hat{\theta}_{\max}-\kappa\hat{\tau}_{\min}^{-1}\hat{\theta}_{\min}}{\kappa^{2}\hat{\tau}_{\max}^{-1}+\hat{\tau}_{\min}^{-1}}\right)^{2}+\hat{\tau}_{\min}^{-1}\left(\frac{\kappa\hat{\tau}_{\max}^{-1}\hat{\theta}_{\max}-\kappa^{2}\hat{\tau}_{\max}^{-1}\hat{\theta}_{\min}}{\kappa^{2}\hat{\tau}_{\max}^{-1}+\hat{\tau}_{\min}^{-1}}\right)^{2}=
\frac{\kappa^{2}\hat{\tau}^{-2}_{\max}\hat{\tau}_{\min}^{-1}+\hat{\tau}_{\max}^{-1}\hat{\tau}^{-2}_{\min}}{(\kappa^{2}\hat{\tau}_{\max}^{-1}+\hat{\tau}_{\min}^{-1})^{2}}\left(\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}\right)^{2}=
\frac{\left(\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}\right)^{2}}{\hat{\tau}_{\max}+\kappa^{2}\hat{\tau}_{\min}}.

Hence, the likelihood ratio test rejects the null for large values of

(θ^maxκθ^min)2τ^max+κ2τ^min𝟙(θ^maxκθ^min>0),\frac{(\hat{\theta}_{\max}-\kappa\hat{\theta}_{\text{min}})^{2}}{\hat{\tau}_{\max}+\kappa^{2}\hat{\tau}_{\min}}\mathds{1}\left(\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}>0\right),

or equivalently, for large positive values of Tκ{T}^{\kappa}, where

Tκ=θ^maxκθ^minτ^max+κ2τ^min.T^{\kappa}=\frac{\hat{\theta}_{\max}-\kappa\hat{\theta}_{\text{min}}}{\sqrt{\hat{\tau}_{\max}+\kappa^{2}\hat{\tau}_{\min}}}.

Proposition 1.

Let n1+n2=Nn_{1}+n_{2}=N, and assume n1/Nλ>0n_{1}/N\to\lambda>0 as n1,n2n_{1},n_{2}\to\infty. Then,

  (i) In the interior of the null region, i.e., when \theta_{\max}-\kappa\theta_{\min}<0, T^{\kappa} converges in distribution to -\infty.

  (ii) At all nonzero boundary points of the null region, i.e., when \theta_{\max}=\kappa\theta_{\min}>0, T^{\kappa} converges in distribution to a standard normal random variable.

Proof of Proposition 1.

First we prove (i). Suppose θmaxκθmin=b<0\theta_{\max}-\kappa\theta_{\min}=b<0. Then consistency of θ^\hat{\theta} and the continuous mapping theorem imply that θ^maxκθ^minpb\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}\to_{p}b. Now, because n11σ12n_{1}^{-1}\sigma_{1}^{2} and n21σ22n_{2}^{-1}\sigma_{2}^{2} both tend to zero, 1τ^max+κ2τ^minp\frac{1}{\sqrt{\hat{\tau}_{\max}+\kappa^{2}\hat{\tau}_{\min}}}\to_{p}\infty. Therefore, Tκp.T^{\kappa}\to_{p}-\infty.

Now we prove (ii). Suppose θmax=κθmin>0\theta_{\max}=\kappa\theta_{\min}>0. By applying the delta method,

\begin{align*}
&\sqrt{\frac{n_{1}n_{2}}{N}}\left(\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}\right)\\
={}&\sqrt{\frac{n_{1}n_{2}}{N}}\left[\left(\max\left\{|\hat{\theta}_{1}|,|\hat{\theta}_{2}|\right\}-\kappa\min\left\{|\hat{\theta}_{1}|,|\hat{\theta}_{2}|\right\}\right)-\left(\max\left\{|\theta_{1}|,|\theta_{2}|\right\}-\kappa\min\left\{|\theta_{1}|,|\theta_{2}|\right\}\right)\right]\\
={}&\sqrt{\frac{n_{1}n_{2}}{N}}\,M\begin{pmatrix}\hat{\theta}_{1}-\theta_{1}\\ \hat{\theta}_{2}-\theta_{2}\end{pmatrix}+o_{p}(1),
\end{align*}

where the row vector $M$ is

\[
M=\begin{pmatrix}(-\kappa)^{\mathds{1}(|\theta_{1}|=\theta_{\min})}\,\text{sign}(\theta_{1})&(-\kappa)^{\mathds{1}(|\theta_{2}|=\theta_{\min})}\,\text{sign}(\theta_{2})\end{pmatrix}.
\]

Now, Slutsky’s theorem implies that

\[
\sqrt{\frac{n_{1}n_{2}}{N}}\,M\begin{pmatrix}\hat{\theta}_{1}-\theta_{1}\\ \hat{\theta}_{2}-\theta_{2}\end{pmatrix}+o_{p}(1)\to_{d}N\left(0,\,(1-\lambda)\left(\kappa^{2}\right)^{\mathds{1}(|\theta_{1}|=\theta_{\min})}\sigma^{2}_{1}+\lambda\left(\kappa^{2}\right)^{\mathds{1}(|\theta_{2}|=\theta_{\min})}\sigma^{2}_{2}\right).
\]

The test statistic $T^{\kappa}$ can be written as

\begin{align*}
T^{\kappa}&=\frac{\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}}{\sqrt{\hat{\tau}_{\max}+\kappa^{2}\hat{\tau}_{\min}}}\\
&=\frac{\sqrt{\frac{n_{1}n_{2}}{N}}\left(\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}\right)}{\sqrt{\frac{n_{1}n_{2}}{N}\hat{\tau}_{\max}+\frac{n_{1}n_{2}}{N}\kappa^{2}\hat{\tau}_{\min}}},
\end{align*}

with $\hat{\tau}_{\max}$ and $\hat{\tau}_{\min}$ as defined in Proposition 1. By the continuous mapping theorem, the denominator converges in probability to

\[
\sqrt{(1-\lambda)\left(\kappa^{2}\right)^{\mathds{1}(|\theta_{1}|=\theta_{\min})}\sigma^{2}_{1}+\lambda\left(\kappa^{2}\right)^{\mathds{1}(|\theta_{2}|=\theta_{\min})}\sigma^{2}_{2}}.
\]

Thus, Slutsky’s theorem implies that

\[
T^{\kappa}\to_{d}N(0,1).
\]
∎
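As an informal numerical illustration of part (ii), one can simulate simple sample means at a boundary point $\theta_{\max}=\kappa\theta_{\min}>0$ and check that the statistic is approximately standard normal. The sketch below reuses the hypothetical relative_difference_stat() helper from above; all numerical values are arbitrary choices for illustration, not quantities from the paper.

  # Illustrative Monte Carlo sketch (not from the paper): at a boundary point
  # theta_max = kappa * theta_min > 0, T^kappa should be approximately N(0, 1).
  set.seed(1)
  kappa <- 2; n1 <- 4000; n2 <- 6000; sigma1 <- 1; sigma2 <- 1.5
  theta2 <- 0.3; theta1 <- kappa * theta2        # |theta1| = kappa * |theta2| > 0
  tstat <- replicate(5000, {
    th1 <- mean(rnorm(n1, theta1, sigma1))       # simple sample-mean estimators
    th2 <- mean(rnorm(n2, theta2, sigma2))
    relative_difference_stat(th1, th2, sigma1^2 / n1, sigma2^2 / n2, kappa)
  })
  c(mean(tstat), sd(tstat))                      # should be close to 0 and 1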

In Proposition 2, we characterize the local asymptotic behavior of the relative difference likelihood ratio test.

Proposition 2.

Let $n_{1}+n_{2}=N$, and assume $n_{1}/N\to\lambda>0$ as $n_{1},n_{2}\to\infty$. Suppose $\theta_{1}=n^{-1/2}_{1}c_{1}$ and $\theta_{2}=n_{2}^{-1/2}c_{2}$. Further, assume that $\hat{\theta}$ is locally regular in the sense that $\sqrt{n_{g}}\hat{\theta}_{g}\to_{d}N(c_{g},\sigma^{2}_{g})$ for $g\in\{1,2\}$. Then as $n_{1},n_{2}\to\infty$, for all $t>0$, $\mathbb{P}\left(T^{\kappa}>t\right)$ converges to

\[
\mathbb{P}(W_{11}>t,W_{12}>t)+\mathbb{P}(W_{11}<-t,W_{12}<-t)+\mathbb{P}(W_{21}>t,W_{22}>t)+\mathbb{P}(W_{21}<-t,W_{22}<-t),
\]

where $(W_{11},W_{12})$ follows a bivariate normal distribution with mean $(\tilde{c}_{11},\tilde{c}_{12})$, unit variance, and correlation $\nu_{1}$, and $(W_{21},W_{22})$ follows a bivariate normal distribution with mean $(\tilde{c}_{21},\tilde{c}_{22})$, unit variance, and correlation $\nu_{2}$, where $\tilde{c}_{11},\tilde{c}_{12},\tilde{c}_{21},\tilde{c}_{22},\nu_{1},\nu_{2}$ are defined as

\begin{align*}
\tilde{c}_{11}&=\frac{(1-\lambda)c_{1}-\kappa\lambda c_{2}}{\sqrt{(1-\lambda)\sigma_{1}^{2}+\kappa^{2}\lambda\sigma_{2}^{2}}}\\
\tilde{c}_{12}&=\frac{(1-\lambda)c_{1}+\kappa\lambda c_{2}}{\sqrt{(1-\lambda)\sigma_{1}^{2}+\kappa^{2}\lambda\sigma_{2}^{2}}}\\
\tilde{c}_{21}&=\frac{\lambda c_{2}-\kappa(1-\lambda)c_{1}}{\sqrt{\kappa^{2}(1-\lambda)\sigma_{1}^{2}+\lambda\sigma_{2}^{2}}}\\
\tilde{c}_{22}&=\frac{\lambda c_{2}+\kappa(1-\lambda)c_{1}}{\sqrt{\kappa^{2}(1-\lambda)\sigma_{1}^{2}+\lambda\sigma_{2}^{2}}}\\
\nu_{1}&=\frac{(1-\lambda)\sigma_{1}^{2}-\kappa^{2}\lambda\sigma_{2}^{2}}{(1-\lambda)\sigma_{1}^{2}+\kappa^{2}\lambda\sigma_{2}^{2}}\\
\nu_{2}&=\frac{\lambda\sigma_{2}^{2}-\kappa^{2}(1-\lambda)\sigma_{1}^{2}}{\kappa^{2}(1-\lambda)\sigma_{1}^{2}+\lambda\sigma_{2}^{2}}.
\end{align*}
Remark 1.

Setting $c_{1}=c_{2}=0$, we obtain the limiting distribution of $T^{\kappa}$ when $\theta_{1}=\theta_{2}=0$, which is critical for calculating the p-value $\rho^{\kappa}(t)$.
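A hedged R sketch of this limiting tail probability is given below; the helper name is ours, the inputs $(\lambda,\sigma_{1}^{2},\sigma_{2}^{2})$ are assumed known or consistently estimated, and the bivariate normal probabilities are evaluated with pmvnorm from the mvtnorm package.

  # Hedged sketch of the limiting null tail probability in Remark 1 (our own
  # helper); intended for t > 0, as in Proposition 2.
  library(mvtnorm)
  rho_kappa <- function(t, kappa, lambda, sigma1sq, sigma2sq) {
    nu1 <- ((1 - lambda) * sigma1sq - kappa^2 * lambda * sigma2sq) /
           ((1 - lambda) * sigma1sq + kappa^2 * lambda * sigma2sq)
    nu2 <- (lambda * sigma2sq - kappa^2 * (1 - lambda) * sigma1sq) /
           (kappa^2 * (1 - lambda) * sigma1sq + lambda * sigma2sq)
    upper_orthant <- function(nu) {
      pmvnorm(lower = c(t, t), upper = c(Inf, Inf), mean = c(0, 0),
              corr = matrix(c(1, nu, nu, 1), 2, 2))[1]
    }
    # With c1 = c2 = 0 the lower-orthant terms equal the upper-orthant terms,
    # so the four probabilities in Proposition 2 collapse to the sum below.
    2 * (upper_orthant(nu1) + upper_orthant(nu2))
  }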

Proof of Proposition 2.

We first re-write the tail probability as

\begin{align*}
\mathbb{P}\left(\frac{\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}}{\sqrt{\hat{\tau}_{\max}+\kappa^{2}\hat{\tau}_{\min}}}>t\right)
={}&\mathbb{P}\left(\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}>t,\ \frac{\hat{\theta}_{1}+\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}>t\right)\\
&+\mathbb{P}\left(\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}<-t,\ \frac{\hat{\theta}_{1}+\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}<-t\right)\\
&+\mathbb{P}\left(\frac{\hat{\theta}_{2}-\kappa\hat{\theta}_{1}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}>t,\ \frac{\hat{\theta}_{2}+\kappa\hat{\theta}_{1}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}>t\right)\\
&+\mathbb{P}\left(\frac{\hat{\theta}_{2}-\kappa\hat{\theta}_{1}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}<-t,\ \frac{\hat{\theta}_{2}+\kappa\hat{\theta}_{1}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}<-t\right).
\end{align*}

Now, by the local regularity of $\hat{\theta}$,

\[
\sqrt{\frac{n_{1}n_{2}}{N}}\begin{pmatrix}\hat{\theta}_{1}\\ \hat{\theta}_{2}\end{pmatrix}\to_{d}N\left(\begin{pmatrix}(1-\lambda)c_{1}\\ \lambda c_{2}\end{pmatrix},\begin{pmatrix}(1-\lambda)\sigma_{1}^{2}&0\\ 0&\lambda\sigma_{2}^{2}\end{pmatrix}\right).
\]

Hence,

\[
\sqrt{\frac{n_{1}n_{2}}{N}}\begin{pmatrix}1&-\kappa\\ 1&\kappa\end{pmatrix}\begin{pmatrix}\hat{\theta}_{1}\\ \hat{\theta}_{2}\end{pmatrix}\to_{d}N\left(\begin{pmatrix}(1-\lambda)c_{1}-\kappa\lambda c_{2}\\ (1-\lambda)c_{1}+\kappa\lambda c_{2}\end{pmatrix},\begin{pmatrix}(1-\lambda)\sigma_{1}^{2}+\kappa^{2}\lambda\sigma_{2}^{2}&(1-\lambda)\sigma_{1}^{2}-\kappa^{2}\lambda\sigma_{2}^{2}\\ (1-\lambda)\sigma_{1}^{2}-\kappa^{2}\lambda\sigma_{2}^{2}&(1-\lambda)\sigma_{1}^{2}+\kappa^{2}\lambda\sigma_{2}^{2}\end{pmatrix}\right).
\]

Now,

\[
\begin{pmatrix}\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}\\ \frac{\hat{\theta}_{1}+\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}\end{pmatrix}=\begin{pmatrix}\frac{\sqrt{\frac{n_{1}n_{2}}{N}}(\hat{\theta}_{1}-\kappa\hat{\theta}_{2})}{\sqrt{n_{2}N^{-1}\sigma_{1}^{2}+\kappa^{2}n_{1}N^{-1}\sigma_{2}^{2}}}\\ \frac{\sqrt{\frac{n_{1}n_{2}}{N}}(\hat{\theta}_{1}+\kappa\hat{\theta}_{2})}{\sqrt{n_{2}N^{-1}\sigma_{1}^{2}+\kappa^{2}n_{1}N^{-1}\sigma_{2}^{2}}}\end{pmatrix}\to_{d}N\left(\begin{pmatrix}\tilde{c}_{11}\\ \tilde{c}_{12}\end{pmatrix},\begin{pmatrix}1&\nu_{1}\\ \nu_{1}&1\end{pmatrix}\right).
\]

Similarly,

\[
\begin{pmatrix}\frac{\hat{\theta}_{2}-\kappa\hat{\theta}_{1}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}\\ \frac{\hat{\theta}_{2}+\kappa\hat{\theta}_{1}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}\end{pmatrix}\to_{d}N\left(\begin{pmatrix}\tilde{c}_{21}\\ \tilde{c}_{22}\end{pmatrix},\begin{pmatrix}1&\nu_{2}\\ \nu_{2}&1\end{pmatrix}\right).
\]

Applying these two limits to the four summands in the decomposition of the tail probability above yields the stated limit. ∎

Theoretical Results for the Simultaneous Test for Qualitative Interactions

In what follows, we generalize Lemma 2 and Proposition 3 to the case of unbalanced sample sizes.

Lemma 2.

The likelihood ratio statistic $T^{\textnormal{P/N},\kappa}$ can be written as

\[
T^{\textnormal{P/N},\kappa}=\min\left\{\frac{\left(\hat{\theta}_{1}-\kappa\hat{\theta}_{2}\right)^{2}}{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma^{2}_{2}},\ \frac{\left(\kappa\hat{\theta}_{1}-\hat{\theta}_{2}\right)^{2}}{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}\right\}\mathds{1}\left(\hat{\theta}\in\Theta_{1}^{\textnormal{P/N},\kappa}\right).
\]
Proof of Lemma 2.

If $\hat{\theta}$ belongs to the null region, $\Theta_{0}^{\textnormal{P/N},\kappa}$, the likelihood ratio statistic is clearly zero. If $\hat{\theta}_{1}>\kappa\hat{\theta}_{2}>0$ or $\hat{\theta}_{1}<\kappa\hat{\theta}_{2}<0$, the likelihood ratio statistic is the minimum of the distances between $\hat{\theta}$ and its projections onto the span of $(1,\kappa)^{\prime}$ and the span of $(-\kappa,-1)^{\prime}$. Algebra (similar to the proof of Lemma 1) gives the desired result; a similar argument applies if $\hat{\theta}_{1}>-\kappa\hat{\theta}_{2}>0$ or $\hat{\theta}_{1}<-\kappa\hat{\theta}_{2}<0$. ∎
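The statistic in Lemma 2 can likewise be computed directly. The R sketch below is our own illustration (not the paper's implementation): the alternative-region indicator is coded from the representation of $\Theta_{1}^{\textnormal{P/N},\kappa}$ used in the proof of Proposition 3 below, and $\sigma_{1}^{2},\sigma_{2}^{2}$ are treated as known.

  # Illustrative sketch (our own helper) of the statistic in Lemma 2, assuming
  # known variances sigma1sq, sigma2sq and group sizes n1, n2.
  pn_stat <- function(theta1, theta2, sigma1sq, sigma2sq, n1, n2, kappa) {
    q1 <- (theta1 - kappa * theta2)^2 / (sigma1sq / n1 + kappa^2 * sigma2sq / n2)
    q2 <- (kappa * theta1 - theta2)^2 / (kappa^2 * sigma1sq / n1 + sigma2sq / n2)
    # Indicator of the alternative region, coded from the representation of
    # Theta_1^{P/N, kappa} used in the proof of Proposition 3.
    in_alt <- (theta1 > kappa * theta2 & theta1 > 0) |
              (-theta1 > -kappa * theta2 & theta1 < 0) |
              (theta2 > kappa * theta1 & theta2 > 0) |
              (-theta2 > -kappa * theta1 & theta2 < 0)
    min(q1, q2) * as.numeric(in_alt)
  }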

Proposition 3.

Let $n_{1}+n_{2}=N$, and assume $n_{1}/N\to\lambda>0$. Then,

(i) If $\theta$ belongs to the interior of the null region $\Theta_{0}^{\textnormal{P/N},\kappa}$, $T^{\textnormal{P/N},\kappa}$ converges in distribution to zero.

(ii) If $\theta$ is on the boundary of the null region, but $\theta\neq 0$, then $\mathbb{P}(T^{\textnormal{P/N},\kappa}>t)\to\frac{1}{2}\mathbb{P}\left(\chi^{2}_{1}>t\right)$ as $n_{1},n_{2}\to\infty$.

Proof of Proposition 3.

We first prove (i). If $\theta$ belongs to the interior of the null region, consistency of $\hat{\theta}$ and the continuous mapping theorem imply that $\mathds{1}\left(\hat{\theta}\in\Theta_{1}^{\textnormal{P/N},\kappa}\right)$ converges in probability to zero. Then, by Slutsky's theorem, $T^{\textnormal{P/N},\kappa}$ converges in distribution to zero.

To prove (ii), recall that we can write the null and alternative regions as

\begin{align*}
\Theta_{0}^{\textnormal{P/N},\kappa}&=\{0<\kappa^{-1}\theta_{1}<\theta_{2}<\kappa\theta_{1}\}\cup\{\kappa\theta_{1}<\theta_{2}<\kappa^{-1}\theta_{1}<0\}\\
\Theta_{1}^{\textnormal{P/N},\kappa}&=\left\{\{\theta_{1}>\kappa\theta_{2}\}\cap\{\theta_{1}>0\}\right\}\cup\left\{\{-\theta_{1}>-\kappa\theta_{2}\}\cap\{\theta_{1}<0\}\right\}\cup\\
&\qquad\left\{\{\theta_{2}>\kappa\theta_{1}\}\cap\{\theta_{2}>0\}\right\}\cup\left\{\{-\theta_{2}>-\kappa\theta_{1}\}\cap\{\theta_{2}<0\}\right\}.
\end{align*}

Now, we write

\begin{align*}
&\mathbb{P}\left(T^{\textnormal{P/N},\kappa}>t\right)\\
&\quad=\mathbb{P}\left(\frac{\left(\hat{\theta}_{1}-\kappa\hat{\theta}_{2}\right)^{2}}{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)>t,\ \frac{\left(\kappa\hat{\theta}_{1}-\hat{\theta}_{2}\right)^{2}}{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)>t\right)\\
&\quad=\mathbb{P}\left(\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)>\sqrt{t},\ \frac{\kappa\hat{\theta}_{1}-\hat{\theta}_{2}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)>\sqrt{t}\right)\\
&\qquad+\mathbb{P}\left(\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)>\sqrt{t},\ \frac{\kappa\hat{\theta}_{1}-\hat{\theta}_{2}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)<-\sqrt{t}\right)\\
&\qquad+\mathbb{P}\left(\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)<-\sqrt{t},\ \frac{\kappa\hat{\theta}_{1}-\hat{\theta}_{2}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)>\sqrt{t}\right)\\
&\qquad+\mathbb{P}\left(\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)<-\sqrt{t},\ \frac{\kappa\hat{\theta}_{1}-\hat{\theta}_{2}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}\mathds{1}\left(\hat{\theta}\in\Theta_{1}\right)<-\sqrt{t}\right).
\end{align*}

The second and third summands above are exactly equal to zero. To see this, note that in the second term, $\hat{\theta}_{1}>\kappa\hat{\theta}_{2}$ and $\hat{\theta}_{1}<\kappa^{-1}\hat{\theta}_{2}$ together imply that $\hat{\theta}\in\Theta_{0}^{\textnormal{P/N},\kappa}$, and similarly for the third term. In the first and last summands, we can ignore the term $\mathds{1}\left(\hat{\theta}\in\Theta_{1}^{\textnormal{P/N},\kappa}\right)$. To see this, note that in the first term, $\hat{\theta}_{1}>\kappa\hat{\theta}_{2}$ and $\hat{\theta}_{1}>\kappa^{-1}\hat{\theta}_{2}$ imply that $\hat{\theta}\in\Theta_{1}^{\textnormal{P/N},\kappa}$; a similar argument holds for the fourth term. Thus,

\begin{align}
\mathbb{P}\left(T^{\textnormal{P/N},\kappa}>t\right)={}&\mathbb{P}\left(\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}>\sqrt{t},\ \frac{\kappa\hat{\theta}_{1}-\hat{\theta}_{2}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}>\sqrt{t}\right)\nonumber\\
&+\mathbb{P}\left(\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}<-\sqrt{t},\ \frac{\kappa\hat{\theta}_{1}-\hat{\theta}_{2}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}<-\sqrt{t}\right).\tag{6}
\end{align}

Suppose, for the moment, that $\theta_{1}=\kappa\theta_{2}>0$. Then

\[
\sqrt{\frac{n_{1}n_{2}}{N}}\begin{pmatrix}\hat{\theta}_{1}-\theta_{1}\\ \hat{\theta}_{2}-\theta_{2}\end{pmatrix}\to_{d}N\left(0,\begin{pmatrix}(1-\lambda)\sigma_{1}^{2}&0\\ 0&\lambda\sigma_{2}^{2}\end{pmatrix}\right).
\]

Therefore,

\begin{align*}
\sqrt{\frac{n_{1}n_{2}}{N}}\begin{pmatrix}\hat{\theta}_{1}-\kappa\hat{\theta}_{2}\\ \kappa\hat{\theta}_{1}-\hat{\theta}_{2}\end{pmatrix}&=\sqrt{\frac{n_{1}n_{2}}{N}}\begin{pmatrix}\hat{\theta}_{1}-\kappa\hat{\theta}_{2}\\ \kappa(\hat{\theta}_{1}-\theta_{1})-(\hat{\theta}_{2}-\theta_{2})+(\kappa\theta_{1}-\theta_{2})\end{pmatrix}\\
&\to_{d}N\left(0,\begin{pmatrix}(1-\lambda)\sigma_{1}^{2}+\kappa^{2}\lambda\sigma_{2}^{2}&\kappa(1-\lambda)\sigma_{1}^{2}+\kappa\lambda\sigma_{2}^{2}\\ \kappa(1-\lambda)\sigma_{1}^{2}+\kappa\lambda\sigma_{2}^{2}&\kappa^{2}(1-\lambda)\sigma_{1}^{2}+\lambda\sigma_{2}^{2}\end{pmatrix}\right)+\begin{pmatrix}0\\ \infty\end{pmatrix}.
\end{align*}

(Here the second coordinate diverges to $+\infty$ because $\kappa\theta_{1}-\theta_{2}=(\kappa^{2}-1)\theta_{2}>0$.)

Thus,

\[
\mathbb{P}\left(T^{\textnormal{P/N},\kappa}>t\right)\to 1-\Phi\left(\sqrt{t}\right)=\frac{1}{2}\mathbb{P}\left(\chi^{2}_{1}>t\right).
\]

A similar argument follows if $\theta_{1}=\kappa\theta_{2}<0$, $\theta_{2}=\kappa\theta_{1}>0$, or $\theta_{2}=\kappa\theta_{1}<0$. ∎

In Proposition 4, we characterize the local asymptotic behavior of the omnibus test for qualitative interactions.

Proposition 4.

Let $n_{1}+n_{2}=N$, and assume $n_{1}/N\to\lambda>0$ as $n_{1},n_{2}\to\infty$. Suppose $\theta_{1}=n_{1}^{-1/2}c_{1}$ and $\theta_{2}=n_{2}^{-1/2}c_{2}$ with $c_{1},c_{2}\geq 0$. Further, assume $\hat{\theta}$ is locally regular in the sense that $\sqrt{n_{g}}\hat{\theta}_{g}\to_{d}N(c_{g},\sigma_{g}^{2})$ for $g\in\{1,2\}$. Then as $n_{1},n_{2}\to\infty$, for all $t>0$, $\mathbb{P}\left(T^{\textnormal{P/N},\kappa}>t\right)$ converges to

\[
\mathbb{P}\left(V_{1}>\sqrt{t},V_{2}>\sqrt{t}\right)+\mathbb{P}\left(V_{1}<-\sqrt{t},V_{2}<-\sqrt{t}\right),
\]

where $(V_{1},V_{2})$ follows a bivariate normal distribution with mean $(\tilde{c}_{1},\tilde{c}_{2})$, unit variance, and correlation $\nu$, where

\begin{align*}
\tilde{c}_{1}&=\frac{(1-\lambda)c_{1}-\kappa\lambda c_{2}}{\sqrt{(1-\lambda)\sigma_{1}^{2}+\kappa^{2}\lambda\sigma_{2}^{2}}}\\
\tilde{c}_{2}&=\frac{\kappa(1-\lambda)c_{1}-\lambda c_{2}}{\sqrt{\kappa^{2}(1-\lambda)\sigma_{1}^{2}+\lambda\sigma_{2}^{2}}}\\
\nu&=\frac{\kappa(1-\lambda)\sigma_{1}^{2}+\kappa\lambda\sigma_{2}^{2}}{\sqrt{(1-\lambda)\sigma_{1}^{2}+\kappa^{2}\lambda\sigma_{2}^{2}}\sqrt{\kappa^{2}(1-\lambda)\sigma_{1}^{2}+\lambda\sigma_{2}^{2}}}.
\end{align*}
Remark 2.

Setting $c_{1}=c_{2}=0$, we obtain the limiting distribution of $T^{\textnormal{P/N},\kappa}$ when $\theta_{1}=\theta_{2}=0$, which is critical for calculating the p-value $\rho^{\textnormal{P/N},\kappa}(t)$.
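As with Remark 1, this limiting null p-value can be approximated numerically. The sketch below is our own helper (not the paper's implementation) and again relies on pmvnorm from the mvtnorm package; it is intended for an observed statistic $t>0$.

  # Hedged sketch of the limiting null p-value in Remark 2 (our own helper).
  library(mvtnorm)
  rho_pn_kappa <- function(t, kappa, lambda, sigma1sq, sigma2sq) {
    v1 <- (1 - lambda) * sigma1sq + kappa^2 * lambda * sigma2sq
    v2 <- kappa^2 * (1 - lambda) * sigma1sq + lambda * sigma2sq
    nu <- (kappa * (1 - lambda) * sigma1sq + kappa * lambda * sigma2sq) / sqrt(v1 * v2)
    upper <- pmvnorm(lower = c(sqrt(t), sqrt(t)), upper = c(Inf, Inf),
                     mean = c(0, 0), corr = matrix(c(1, nu, nu, 1), 2, 2))[1]
    2 * upper   # at c1 = c2 = 0 the two orthant terms in Proposition 4 are equal
  }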

Proof of Proposition 4.

First,

\[
\sqrt{\frac{n_{1}n_{2}}{N}}\begin{pmatrix}\hat{\theta}_{1}\\ \hat{\theta}_{2}\end{pmatrix}\to_{d}N\left(\begin{pmatrix}(1-\lambda)c_{1}\\ \lambda c_{2}\end{pmatrix},\begin{pmatrix}(1-\lambda)\sigma_{1}^{2}&0\\ 0&\lambda\sigma_{2}^{2}\end{pmatrix}\right).
\]

Also,

\[
\sqrt{\frac{n_{1}n_{2}}{N}}\begin{pmatrix}1&-\kappa\\ \kappa&-1\end{pmatrix}\begin{pmatrix}\hat{\theta}_{1}\\ \hat{\theta}_{2}\end{pmatrix}\to_{d}N\left(\begin{pmatrix}(1-\lambda)c_{1}-\kappa\lambda c_{2}\\ \kappa(1-\lambda)c_{1}-\lambda c_{2}\end{pmatrix},\begin{pmatrix}(1-\lambda)\sigma_{1}^{2}+\kappa^{2}\lambda\sigma_{2}^{2}&\kappa(1-\lambda)\sigma_{1}^{2}+\kappa\lambda\sigma_{2}^{2}\\ \kappa(1-\lambda)\sigma_{1}^{2}+\kappa\lambda\sigma_{2}^{2}&\kappa^{2}(1-\lambda)\sigma_{1}^{2}+\lambda\sigma_{2}^{2}\end{pmatrix}\right).
\]

Thus,

\[
\begin{pmatrix}\frac{\hat{\theta}_{1}-\kappa\hat{\theta}_{2}}{\sqrt{n_{1}^{-1}\sigma_{1}^{2}+\kappa^{2}n_{2}^{-1}\sigma_{2}^{2}}}\\ \frac{\kappa\hat{\theta}_{1}-\hat{\theta}_{2}}{\sqrt{\kappa^{2}n_{1}^{-1}\sigma_{1}^{2}+n_{2}^{-1}\sigma_{2}^{2}}}\end{pmatrix}=\begin{pmatrix}\frac{\sqrt{\frac{n_{1}n_{2}}{N}}(\hat{\theta}_{1}-\kappa\hat{\theta}_{2})}{\sqrt{n_{2}N^{-1}\sigma_{1}^{2}+\kappa^{2}n_{1}N^{-1}\sigma_{2}^{2}}}\\ \frac{\sqrt{\frac{n_{1}n_{2}}{N}}(\kappa\hat{\theta}_{1}-\hat{\theta}_{2})}{\sqrt{\kappa^{2}n_{2}N^{-1}\sigma_{1}^{2}+n_{1}N^{-1}\sigma_{2}^{2}}}\end{pmatrix}\to_{d}N\left(\begin{pmatrix}\tilde{c}_{1}\\ \tilde{c}_{2}\end{pmatrix},\begin{pmatrix}1&\nu\\ \nu&1\end{pmatrix}\right).
\]

Application of (6) completes the argument. ∎

Calculation of $\kappa^{\alpha}_{\max}$

Define $t^{\kappa}$ as the realization of the likelihood ratio statistic $T^{\kappa}$ for the relative difference hypothesis, calculated on the observed data. For a pre-specified $\alpha<1$, recall that $\kappa_{\max}^{\alpha}\equiv\sup\{\kappa:\kappa>1,\rho^{\kappa}(t^{\kappa})<\alpha\}$ is defined as the largest $\kappa>1$ such that the likelihood ratio test rejects the relative difference null. Defining $F^{\kappa}(t)\equiv\lim_{n_{1},n_{2}\to\infty}\mathbb{P}\left(T^{\kappa}\leq t\,|\,\theta=0\right)$, we can evaluate $\kappa^{\alpha}_{\max}$ as

\begin{align*}
\kappa^{\alpha}_{\max}&=\min\left\{\sup\left\{\kappa:\kappa>1,1-\Phi\left(t^{\kappa}\right)<\alpha\right\},\ \sup\left\{\kappa:\kappa>1,1-F^{\kappa}(t^{\kappa})<\alpha\right\}\right\}\\
&\equiv\min\{\pi_{1},\pi_{2}\}.
\end{align*}

Therefore, we are only required to calculate $\pi_{1}$ and $\pi_{2}$.

Both $\pi_{1}$ and $\pi_{2}$ can be calculated with root-finding algorithms. To see this, first observe that $t^{\kappa}$ is monotone in $\kappa$, recalling that the numerator of $T^{\kappa}=\frac{\hat{\theta}_{\max}-\kappa\hat{\theta}_{\min}}{\sqrt{\hat{\tau}_{\max}+\kappa^{2}\hat{\tau}_{\min}}}$ decreases in $\kappa$ while the denominator increases. Monotonicity of $\Phi(\cdot)$ then implies that $1-\Phi(t^{\kappa})$ is monotone in $\kappa$. Next, applying Proposition 2,

\[
1-F^{\kappa}(t)=2\left\{\mathbb{P}\left(W_{11}>t,W_{12}>t\right)+\mathbb{P}\left(W_{21}>t,W_{22}>t\right)\right\},
\]

where $(W_{11},W_{12})$ and $(W_{21},W_{22})$ follow bivariate normal distributions with mean zero, unit variance, and correlations $\nu_{1}$ and $\nu_{2}$, respectively, as defined in Proposition 2. Both $\nu_{1}$ and $\nu_{2}$ are monotone decreasing in $\kappa$, so Theorem 8 in Müller (2001) implies that, for fixed $t$, $1-F^{\kappa}(t)$ is monotone in $\kappa$. Combined with the monotonicity of $t^{\kappa}$, it follows that $1-F^{\kappa}(t^{\kappa})$ is monotone in $\kappa$.

Monotonicity of $1-\Phi(t^{\kappa})$ and $1-F^{\kappa}(t^{\kappa})$ implies that, if the likelihood ratio test rejects the null hypothesis for some $\kappa>1$, then $\pi_{1}$ and $\pi_{2}$ are the unique roots of

\begin{align*}
f_{1}(\kappa)&=1-\Phi(t^{\kappa})-\alpha\\
f_{2}(\kappa)&=1-F^{\kappa}(t^{\kappa})-\alpha,
\end{align*}

respectively. These roots can be easily calculated via, e.g., the bisection method. In our implementation, we use the uniroot function in R.
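A minimal sketch of this computation is given below, assuming the hypothetical helpers sketched earlier (relative_difference_stat() and rho_kappa()) are available; all input values are illustrative choices of ours, not quantities from the paper.

  # Sketch of the root-finding step, under the assumptions stated above.
  alpha <- 0.05; lambda <- 0.5
  n1 <- 500; n2 <- 500; sigma1sq <- 1; sigma2sq <- 1
  theta1_hat <- 0.9; theta2_hat <- 0.2            # illustrative estimates
  t_kappa <- function(kappa) {
    relative_difference_stat(theta1_hat, theta2_hat,
                             sigma1sq / n1, sigma2sq / n2, kappa)
  }
  f1 <- function(kappa) (1 - pnorm(t_kappa(kappa))) - alpha
  f2 <- function(kappa) rho_kappa(t_kappa(kappa), kappa, lambda, sigma1sq, sigma2sq) - alpha
  # uniroot() requires a bracketing interval on which the sign changes; the
  # upper endpoint 20 is an arbitrary illustrative choice.
  pi1 <- uniroot(f1, interval = c(1, 20))$root
  pi2 <- uniroot(f2, interval = c(1, 20))$root
  kappa_max_alpha <- min(pi1, pi2)

The bracketing search performed by uniroot mirrors the monotonicity argument above: each $f_{j}$ crosses zero exactly once on the interval when the test rejects for some $\kappa>1$.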