
Impact of existence and nonexistence of pivot on the coverage of empirical best linear prediction intervals for small areas

Yuting Chen¹, Masayo Y. Hirose², and Partha Lahiri¹

¹Joint Program in Survey Methodology, University of Maryland, College Park, United States
²Institute of Mathematics for Industry, Kyushu University, Japan
Abstract

We advance the theory of the parametric bootstrap in constructing highly efficient empirical best (EB) prediction intervals of small area means. The coverage error of such a prediction interval is of the order $O(m^{-3/2})$, where $m$ is the number of small areas to be pooled using a linear mixed normal model. In the context of an area-level model where the random effects follow a non-normal known distribution, except possibly for unknown hyperparameters, we analytically show that the order of coverage error of the empirical best linear (EBL) prediction interval remains the same even if we relax the normality of the random effects, provided a pivot exists for suitably standardized random effects when the hyperparameters are known. Recognizing the challenge of establishing the existence of a pivot, we develop a simple moment-based method for establishing the non-existence of a pivot. We show that the existing parametric bootstrap EBL prediction interval fails to achieve the desired order of coverage error, i.e., $O(m^{-3/2})$, in the absence of a pivot. We obtain the surprising result that the order $O(m^{-1})$ term is always positive under certain conditions, indicating possible overcoverage of the existing parametric bootstrap EBL prediction interval. In general, we analytically show for the first time that the coverage problem can be corrected by adopting a suitably devised double parametric bootstrap. Our Monte Carlo simulations show that our proposed single bootstrap method performs reasonably well when compared to rival methods.

Keywords: Small area estimation, empirical Bayes, linear mixed model, best linear predictor

1 Introduction

The following two-level model, commonly referred to as the area-level model, has been extensively used in small area applications.

The area-level model. For $i=1,\cdots,m$,

Level 1 (Sampling model): $y_i\,|\,\theta_i\overset{\mathrm{ind}}{\sim}\mathcal{N}(\theta_i,D_i)$;

Level 2 (Linking model): $\theta_i\overset{\mathrm{ind}}{\sim}\mathcal{G}(\bm{x}_i^{\prime}\bm{\beta},A,\bm{\phi})$.

Here, $m$ represents the number of small areas, and $\mathcal{G}$ is a fully parametric distribution, not necessarily normal, with mean $\text{E}(\theta_i)=\bm{x}_i^{\prime}\bm{\beta}$, variance $\text{Var}(\theta_i)=A\geq 0$, and any additional parameters $\bm{\phi}$. Assuming the same normal distribution at level 1, Datta and Lahiri (1995) assumed that $\mathcal{G}$ is a scale mixture of normal distributions. Later, Bell and Huang (2006) and Xie et al. (2007) applied a $t$ distribution, a specific case of the scale mixture of normals, at level 2, with mean $\bm{x}_i^{\prime}\bm{\beta}$, variance $A$ and degrees of freedom $\phi$, to mitigate the influence of outliers. Fabrizi and Trivisano (2010) introduced two robust area-level models: the first assumes that $\theta_i$ follows an exponential power distribution with mean $\bm{x}_i^{\prime}\bm{\beta}$, variance $A$ and shape parameter $\phi$; the second assumes a skewed exponential power distribution at level 2, with mean $\bm{x}_i^{\prime}\bm{\beta}$, variance $A$, shape parameter $\psi$ and skewness parameter $\lambda$. Thus, in this case, the parameter vector $\bm{\phi}$ can be defined as $(\psi,\lambda)^{\prime}$.

In the above model, level 1 accounts for the sampling distribution of the direct estimates $y_i$, which are weighted averages of observations from small area $i$. Level 2 links the true small area means $\theta_i$ to a vector of $p$ known auxiliary variables $\bm{x}_i=(x_{i1},\cdots,x_{ip})^{\prime}$, often obtained from various administrative records. The parameters $\bm{\beta}$ and $A$ of the linking model are generally unknown and are estimated from the available data. As in other papers on the area-level model, the sampling variances $D_i$ are assumed to be known. In practice, the $D_i$'s are estimated using a smoothing technique such as the ones given in Fay and Herriot (1979), Otto and Bell (1995), Ha et al. (2014) and Hawala and Lahiri (2018).

The two-level model can be rewritten as the following simple linear mixed model

$$y_i=\theta_i+e_i=\bm{x}_i^{\prime}\bm{\beta}+u_i+e_i,\quad i=1,\cdots,m, \qquad (1.1)$$

where the random effect $u_i\overset{\mathrm{iid}}{\sim}\mathcal{G}(0,A,\bm{\phi})$ and the sampling error $e_i\overset{\mathrm{ind}}{\sim}\mathcal{N}(0,D_i)$ are independent. When $\mathcal{G}$ differs from the normal distribution, the best predictor (BP) of $\theta_i$, $\tilde{\theta}_i^{\rm BP}=\text{E}(\theta_i|y_i)$, may not have a closed form. In contrast, the best linear predictor (BLP) of $\theta_i$ always has the explicit form

$$\tilde{\theta}_i^{\rm BLP}=(1-B_i)y_i+B_i\bm{x}_i^{\prime}\bm{\beta}, \qquad (1.2)$$

where $B_i=D_i/(A+D_i)$, and the BLP minimizes the mean squared prediction error (MSPE) among all linear predictors of $\theta_i$. The variance of the prediction error $(\theta_i-\tilde{\theta}_i^{\rm BLP})$ is denoted by $g_{1i}(A)\coloneqq\text{Var}(\theta_i-\tilde{\theta}_i^{\rm BLP})=AD_i/(A+D_i)$. If $A$ is known, one can obtain the standard weighted least squares estimator of $\bm{\beta}$, denoted by $\tilde{\bm{\beta}}(A)$. Replacing $\bm{\beta}$ in (1.2) by $\tilde{\bm{\beta}}=\tilde{\bm{\beta}}(A)$, a best linear unbiased prediction (BLUP) estimator of $\theta_i$ is given by

$$\tilde{\theta}_i^{\rm BLUP}=(1-B_i)y_i+B_i\bm{x}_i^{\prime}\tilde{\bm{\beta}}, \qquad (1.3)$$

which does not require the normality assumptions typically used in small area estimation models. In practice, it is common that both $\bm{\beta}$ and $A$ are unknown and need to be estimated from the data. After plugging in their estimators, an empirical best linear unbiased predictor (EBLUP) of $\theta_i$ is given by:

$$\hat{\theta}_i^{\rm EBLUP}=(1-\hat{B}_i)y_i+\hat{B}_i\bm{x}_i^{\prime}\hat{\bm{\beta}}, \qquad (1.4)$$

where $\hat{B}_i=D_i/(\hat{A}+D_i)$, $\hat{\bm{\beta}}=\tilde{\bm{\beta}}(\hat{A})$, and $\hat{A}$ is a consistent estimator of $A$ for large $m$. After plugging in the estimator $\hat{A}$, we have $g_{1i}(\hat{A})=\hat{A}D_i/(\hat{A}+D_i)$. To simplify the notation, throughout the remainder of this paper we will use $g_{1i}$ and $\hat{g}_{1i}$ to denote $g_{1i}(A)$ and $g_{1i}(\hat{A})$, respectively. Point prediction using the EBLUP and the associated mean squared prediction error (MSPE) estimation have been studied extensively. See Rao and Molina (2015) and Jiang and Nguyen (2007) for a detailed discussion of this subject.
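To make (1.3)-(1.4) concrete, here is a minimal sketch (our own illustrative Python, not code from the paper) that computes the EBLUP and $\hat{g}_{1i}$ for a given variance estimate $\hat{A}$:

```python
import numpy as np

def eblup(y, X, D, A_hat):
    """EBLUP (1.4): y and D are length-m arrays, X is an (m, p) design
    matrix, and A_hat is a consistent estimator of the model variance A."""
    w = 1.0 / (A_hat + D)                            # diagonal of Sigma^{-1}
    beta_hat = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    B_hat = D / (A_hat + D)                          # estimated shrinkage weights
    theta_hat = (1.0 - B_hat) * y + B_hat * (X @ beta_hat)
    g1_hat = A_hat * D / (A_hat + D)                 # g_{1i}(A_hat)
    return theta_hat, g1_hat, beta_hat
```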

In this paper, we consider prediction interval estimation for the small area means $\theta_i$. A prediction interval of $\theta_i$, denoted by $I_i$, is called a $100(1-\alpha)\%$ interval for $\theta_i$ if $\mathrm{P}(\theta_i\in I_i|\bm{\beta},A,\bm{\phi})=1-\alpha$ for any fixed $\bm{\beta}\in\mathcal{R}^p$, $A\in\mathcal{R}^+$ and $\bm{\phi}\in\Theta_{\bm{\phi}}$, the parameter space of $\bm{\phi}$. The probability $\mathrm{P}$ is with respect to the area-level model. There are several options for constructing interval estimates for $\theta_i$. The prediction interval based only on the level 1 model for the observed data is given by $y_i\pm z_{\alpha/2}\sqrt{D_i}$, where $z_{\alpha/2}$ is the $100(1-\alpha/2)$th standard normal percentile. While the coverage probability of this direct interval is $1-\alpha$, it is not efficient when $D_i$ is large, as is the case in small area estimation. Hall and Maiti (2006) considered an interval based on the regression synthetic estimator: $(\bm{x}_i^{\prime}\hat{\bm{\beta}}+b_{\hat{\alpha}_l^s}\sqrt{\hat{A}},\;\bm{x}_i^{\prime}\hat{\bm{\beta}}+b_{\hat{\alpha}_u^s}\sqrt{\hat{A}})$, where $b_{\hat{\alpha}_l^s}$ and $b_{\hat{\alpha}_u^s}$ are obtained using a parametric bootstrap method (described in detail in a later section). However, this interval is synthetic in the sense that it is constructed from the synthetic regression estimator of $\theta_i$ and its associated uncertainty measure, which are not area-specific in the outcome variable and hence may result in wider confidence intervals.

It is important to combine both levels of the model in the interval estimation. An effective approach is to use empirical best methodology. We call an interval an empirical best (EB) prediction interval if it is based on the empirical best predictor of $\theta_i$. For the special case of the area-level model in which $\theta_i$ follows a normal distribution, Cox (1975) initiated the exact empirical Bayes interval: $\hat{\theta}_i^{\rm EB}\pm z_{\alpha/2}\sqrt{\hat{g}_{1i}}$, where $\hat{\theta}_i^{\rm EB}$ is the empirical Bayes estimator of $\theta_i$. Although the Cox interval is always shorter than the direct interval, its coverage error is of the order $O(m^{-1})$, not accurate enough for most small area applications. Yoshimori and Lahiri (2014) improved the Cox-type empirical Bayes interval by using a carefully devised adjusted residual maximum likelihood (ARML) estimator of $A$. Their interval has a coverage error of order $O(m^{-3/2})$. Additionally, they analytically showed that their interval is always shorter than the corresponding direct interval. However, the properties of both the ARML estimator of $A$ and the associated interval have not yet been explored for cases involving non-normally distributed random effects.

A function of the data and parameters is called a pivot if its distribution does not depend on any unknown quantity (Shao, 2008; Hall, 2013). Under the linear mixed normal model (1.1), $(\theta_i-\tilde{\theta}_i^{\rm BLP})/\sqrt{g_{1i}}$ is a standard normal pivot. The traditional method of interval estimation for $\theta_i$ is of the form $\text{EBLUP}\pm z_{\alpha/2}\sqrt{\text{mspe}}$, where mspe is an estimate of the true mean squared prediction error (MSPE) of the EBLUP. Unfortunately, $(\theta_i-\hat{\theta}_i^{\rm EBLUP})/\sqrt{\hat{g}_{1i}}$ is not a pivot, and this traditional approach produces intervals that are too short or too long. The coverage error of such an interval is of the order $O(m^{-1})$, not accurate enough for most small area applications. Recognizing that $(\theta_i-\hat{\theta}_i^{\rm EBLUP})/\sqrt{\hat{g}_{1i}}$ does not follow a standard normal distribution, Chatterjee et al. (2008) and Li and Lahiri (2010) developed a parametric bootstrap method to approximate its distribution and obtained an EB prediction interval for $\theta_i$ in linear mixed normal models. They showed that such an interval has a coverage error of the order $O(m^{-3/2})$. However, this property remains unknown for non-normally distributed random effects.

In this paper, one main aim is to bring out the virtues of pivoting, or rescaling, which can decrease the dependence of our statistics on unknown parameters and yield improved prediction intervals for small areas under the general model (1.1). Analogous to Chatterjee et al. (2008), we propose parametric bootstrap methods to approximate the distribution of a suitably centered and scaled EBLUP under the general model (1.1), and apply it to construct a $100(1-\alpha)\%$ prediction interval for $\theta_i$. Here, we define an interval based on the EBLUP of $\theta_i$ as an empirical best linear (EBL) prediction interval. Specifically, we introduce two key quantities: one is the centered and scaled EBLUP, which follows the $F_{1i}$ distribution and can be expressed as

$$H_i(\hat{\bm{\beta}},\hat{A})\coloneqq(\theta_i-\hat{\theta}_i^{\rm EBLUP})/\sqrt{\hat{g}_{1i}}\sim F_{1i},$$

and the other is based on the BLP, which has the $F_{2i}$ distribution and can be expressed as

$$H_i(\bm{\beta},A)\coloneqq(\theta_i-\tilde{\theta}_i^{\rm BLP})/\sqrt{g_{1i}}\sim F_{2i}.$$

If $F_{2i}$ does not depend on any unknown parameters, $H_i(\bm{\beta},A)$ can be referred to as a pivot; otherwise, $H_i(\bm{\beta},A)$ is not a pivot and we rewrite $F_{2i}$ as $F_{2i}(\bm{\nu})$, where $\bm{\nu}=(\nu_1,\cdots,\nu_s)^{\prime}$ is the unknown parameter vector determining the distribution of $H_i(\bm{\beta},A)$. We define $q_\alpha$ as the $\alpha$-level quantile of the distribution $F_{2i}(\bm{\nu})$ if $\mathrm{P}(H_i(\bm{\beta},A)\leq q_\alpha\,|\,\bm{\nu})=\alpha$ for any fixed $\bm{\nu}$.
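For example, under the normal linking model $\mathcal{G}=\mathcal{N}(\bm{x}_i^{\prime}\bm{\beta},A)$ noted above, $H_i(\bm{\beta},A)\sim\mathcal{N}(0,1)$ for every $(\bm{\beta},A)$, so $F_{2i}=\Phi$ involves no unknown parameters, $H_i(\bm{\beta},A)$ is a pivot, and $q_\alpha=z_\alpha$, the $100\alpha$th standard normal percentile.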

In Section 2, we introduce a list of notation and regularity conditions used throughout the paper. In Section 3, we propose a single parametric bootstrap EBL prediction interval for small area means in the context of an area-level model, where the random effects follow a general (non-normal) known distribution, except possibly for unknown hyperparameters. We analytically demonstrate that the coverage error order of the EBL prediction interval remains the same as in Chatterjee et al. (2008), even when relaxing the normality of the random effects, provided a pivot exists for suitably standardized random effects when the hyperparameters are known. However, when a pivot does not exist, the EBL prediction interval fails to achieve the desired coverage error order, i.e., $O(m^{-3/2})$. Surprisingly, we find that the order $O(m^{-1})$ term is always positive under certain conditions, indicating potential overcoverage of the current single parametric bootstrap EBL prediction interval. Recognizing the challenge of establishing the existence of a pivot, we develop a simple moment-based method for establishing the non-existence of a pivot. In Section 4, we propose a double parametric bootstrap method under a general area-level model and, for the first time, analytically show that this approach can correct the coverage problem. In Section 5, we compare our proposed EBL prediction interval methods with the direct method, other traditional approaches, and the parametric bootstrap prediction interval proposed by Hall and Maiti (2006).

2 A list of notations and regularity conditions

We use the following notations throughout the paper:

$\bm{Y}=(y_1,\cdots,y_m)^{\prime}$, an $m\times 1$ column vector of direct estimates;

$\bm{X}^{\prime}=(\bm{x}_1,\cdots,\bm{x}_m)$, a $p\times m$ known matrix of rank $p$;

$\bm{\Sigma}={\rm diag}(A+D_1,\cdots,A+D_m)$, an $m\times m$ diagonal matrix;

$\tilde{\bm{\beta}}=(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{Y}$, the weighted least squares estimator of $\bm{\beta}$ with known $A$;

$\bm{P}=\bm{\Sigma}^{-1}-\bm{\Sigma}^{-1}\bm{X}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\bm{\Sigma}^{-1}$;

$\partial F_{2i}(x;\tilde{\bm{\nu}})/\partial\bm{\nu}=\partial F_{2i}(x;\bm{\nu})/\partial\bm{\nu}\big|_{\bm{\nu}=\tilde{\bm{\nu}}}$, the derivative with respect to $\bm{\nu}$ evaluated at $\tilde{\bm{\nu}}$.

We assume the following regularity conditions for proving the various results presented in this paper:

  • R1:

The rank of $\bm{X}$, $\text{rank}(\bm{X})=p$, is bounded for large $m$;

  • R2:

$0<\inf_{i\geq 1}D_i\leq\sup_{i\geq 1}D_i<\infty$, $A\in(0,\infty)$, and the true $\bm{\phi}\in\Theta_{\bm{\phi}}^{0}$, the interior of $\Theta_{\bm{\phi}}$;

  • R3:

$\sup_{i\geq 1}h_{ii}\equiv\bm{x}_i^{\prime}(\bm{X}^{\prime}\bm{X})^{-1}\bm{x}_i=O(m^{-1})$;

  • R4:

$\text{E}|u_i|^{8+\delta}<\infty$ for some $0<\delta<1$;

  • R5:

The distribution function $F_{2i}(\cdot)$ is three times continuously differentiable with respect to its argument, and its third derivative is uniformly bounded. When $H_i(\bm{\beta},A)$ is not a pivot, $F_{2i}(\cdot;\bm{\nu})$ is three times continuously differentiable with respect to the parameter vector $\bm{\nu}$, and its third derivative is uniformly bounded;

  • R6:

When $H_i(\bm{\beta},A)$ is not a pivot, we assume the estimator $\hat{\bm{\nu}}$ satisfies:

$$\begin{split}\operatorname{E}\left[(\hat{\bm{\nu}}-\bm{\nu})^{\prime}\,\partial F_{2i}(x;\bm{\nu})/\partial\bm{\nu}\right]&=O(m^{-1}),\\ \operatorname{E}\left[(\hat{\bm{\nu}}-\bm{\nu})^{\prime}\left\{\partial^{2}F_{2i}(x;\bm{\nu})/\partial\bm{\nu}\partial\bm{\nu}^{\prime}\right\}(\hat{\bm{\nu}}-\bm{\nu})\right]&=O(m^{-1}),\\ \operatorname{E}\lVert\hat{\bm{\nu}}-\bm{\nu}\rVert^{3}=\sum_{i,j,k}\operatorname{E}\left[(\hat{\nu}_{i}-\nu_{i})(\hat{\nu}_{j}-\nu_{j})(\hat{\nu}_{k}-\nu_{k})\right]&=o(m^{-1}),\end{split}$$

where the expectation is taken at the true $\bm{\nu}$. For the special case $\bm{\nu}=A$, using the method of moments of Prasad and Rao (1990) to estimate $A$, Lahiri and Rao (1995) showed that $\text{E}(\hat{A}_{\rm PR}-A)=o(m^{-1})$. Moreover, they provided Lemma C.1, which states that under the regularity conditions R1-R5, $\text{E}|\hat{A}_{\rm PR}-A|^{2q}=O(m^{-q})$ for any $q$ satisfying $0<q\leq 2+\delta^{\prime}$ with $0<\delta^{\prime}<\frac{1}{4}\delta$.
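For reference, here is a minimal sketch of the Prasad-Rao method-of-moments estimator mentioned above (our own illustrative Python; it returns the untruncated value so that negative estimates, as tabulated in Section 5, can be recorded):

```python
import numpy as np

def prasad_rao_A(y, X, D):
    """Prasad-Rao (1990) moment estimator of A based on OLS residuals:
    A_hat = [sum(r_i^2) - sum(D_i * (1 - h_ii))] / (m - p)."""
    m, p = X.shape
    H = X @ np.linalg.solve(X.T @ X, X.T)     # hat matrix; h_ii = diag(H)
    r = y - H @ y                             # OLS residuals
    return (np.sum(r**2) - np.sum(D * (1.0 - np.diag(H)))) / (m - p)
```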

3 Single parametric bootstrap

For the remainder of this paper, without further explicit mention, we will use $\tilde{\theta}_i$ to represent the BLP of $\theta_i$ and $\hat{\theta}_i$ to denote the EBLUP of $\theta_i$. The single parametric bootstrap method has been widely studied for its simplicity in obtaining prediction intervals directly from the bootstrap histogram. Ideally, a prediction interval of $\theta_i$ can be constructed based on the distribution of $(\theta_i-\hat{\theta}_i)/\sqrt{\hat{g}_{1i}}$, although this distribution function $F_{1i}$ is complex and difficult to approximate analytically. In this paper, we first follow the procedure introduced by Chatterjee et al. (2008) and Li and Lahiri (2010) to provide a bootstrap approximation of $F_{1i}$ by using a parametric bootstrap method. The implementation is straightforward, following these steps (a code sketch is given after the algorithm):

  • 1.

Conditionally on the data $(\bm{X},\bm{Y})$, draw $\theta_i^*$ for $i=1,\cdots,m$, independently from the distribution $\mathcal{G}(\bm{x}_i^{\prime}\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})$;

  • 2.

Given $\theta_i^*$, draw $y_i^*$ from the distribution $\mathcal{N}(\theta_i^*,D_i)$;

  • 3.

Construct the bootstrap estimators $\hat{\bm{\beta}}^*$, $\hat{A}^*$ using the data $(\bm{X},\bm{Y}^*)$, and then obtain $\hat{\theta}_i^*=(1-\hat{B}_i^*)y_i^*+\hat{B}_i^*\bm{x}_i^{\prime}\hat{\bm{\beta}}^*$ and $\hat{g}_{1i}^*=\hat{A}^*D_i/(\hat{A}^*+D_i)$, where $\hat{B}_i^*=D_i/(\hat{A}^*+D_i)$;

  • 4.

Calibrate on $\alpha$ using the bootstrap method. Let $(q_{\alpha_l},q_{\alpha_u})=(q_{\hat{\alpha}_l^s},q_{\hat{\alpha}_u^s})$ be the solution of the following equations:

$$\begin{split}\mathrm{P}^{*}(H_i(\hat{\bm{\beta}}^*,\hat{A}^*)\leq q_{\alpha_u}\,|\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})&=1-\alpha/2,\\ \mathrm{P}^{*}(H_i(\hat{\bm{\beta}}^*,\hat{A}^*)\leq q_{\alpha_l}\,|\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})&=\alpha/2,\end{split} \qquad (3.1)$$

where $H_i(\hat{\bm{\beta}}^*,\hat{A}^*)=(\theta_i^*-\hat{\theta}_i^*)/\sqrt{\hat{g}_{1i}^*}$;

  • 5.

    The single bootstrap calibrated prediction interval is constructed by:

$$\hat{I}_i^s=\left(\hat{\theta}_i+q_{\hat{\alpha}_l^s}\sqrt{\hat{g}_{1i}}\,,\;\hat{\theta}_i+q_{\hat{\alpha}_u^s}\sqrt{\hat{g}_{1i}}\right).$$
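The following minimal sketch implements steps 1-5 (our own illustrative Python). The helpers `estimate_A` (e.g., a Fay-Herriot or Prasad-Rao estimator) and `draw_G` (a sampler from the fitted level-2 distribution $\mathcal{G}$) are assumed to be supplied by the user; the names are ours, not from the paper.

```python
import numpy as np

def fit_eblup(y, X, D, estimate_A):
    """Refit the area-level model; returns (theta_hat, g1_hat, beta_hat, A_hat)."""
    A = estimate_A(y, X, D)
    w = 1.0 / (A + D)
    beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    B = D / (A + D)
    return (1.0 - B) * y + B * (X @ beta), A * D / (A + D), beta, A

def single_bootstrap_interval(y, X, D, estimate_A, draw_G, alpha=0.1, R=400):
    """Single parametric bootstrap EBL prediction intervals for all areas."""
    m = len(y)
    theta_hat, g1_hat, beta_hat, A_hat = fit_eblup(y, X, D, estimate_A)
    H = np.empty((R, m))
    for r in range(R):
        theta_b = draw_G(X @ beta_hat, A_hat)                    # step 1: theta_i*
        y_b = theta_b + np.sqrt(D) * np.random.randn(m)          # step 2: y_i*
        th_hat_b, g1_b, _, _ = fit_eblup(y_b, X, D, estimate_A)  # step 3: refit
        H[r] = (theta_b - th_hat_b) / np.sqrt(g1_b)
    q_l = np.quantile(H, alpha / 2.0, axis=0)                    # step 4: calibrate
    q_u = np.quantile(H, 1.0 - alpha / 2.0, axis=0)
    return (theta_hat + q_l * np.sqrt(g1_hat),                   # step 5: interval
            theta_hat + q_u * np.sqrt(g1_hat))
```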

One of our main results for the algorithm above is that if we relax the normality of the random effects but a pivot $H_i(\bm{\beta},A)$ exists, the prediction interval $\hat{I}_i^s$ obtained above still has a high degree of coverage accuracy. That is, it brings the coverage error down to $O(m^{-3/2})$.

Theorem 3.1.

Under regularity conditions, for a preassigned $\alpha\in(0,1)$ and arbitrary $i=1,\cdots,m$, when $H_i(\bm{\beta},A)$ is a pivot, we have

$$\mathrm{P}(\theta_i\in\hat{I}_i^s)=\mathrm{P}\left(\hat{\theta}_i+q_{\hat{\alpha}_l^s}\sqrt{\hat{g}_{1i}}\leq\theta_i\leq\hat{\theta}_i+q_{\hat{\alpha}_u^s}\sqrt{\hat{g}_{1i}}\right)=1-\alpha+O(m^{-3/2}), \qquad (3.2)$$

where $q_{\hat{\alpha}_l^s}$ and $q_{\hat{\alpha}_u^s}$ are determined via the single parametric bootstrap procedure described above.

The proof of Theorem 3.1 is given in Appendix A. An example of Theorem 3.1 is that when $\mathcal{G}(\bm{x}_i^{\prime}\bm{\beta},A,\bm{\phi})$ is a normal distribution, $\mathcal{N}(\bm{x}_i^{\prime}\bm{\beta},A)$, using Theorem 3.2 of Chatterjee et al. (2008), we have

$$\mathrm{P}(\theta_i\in\hat{I}_i^s)=1-\alpha+O(m^{-3/2}). \qquad (3.3)$$

Proposition 3.1.

When $H_i(\bm{\beta},A)$ is not a pivot, we have

$$\mathrm{P}(\theta_i\in\hat{I}_i^s)=1-\alpha+O(m^{-1}), \qquad (3.4)$$

where $\hat{I}_i^s$ is obtained from the single parametric bootstrap procedure described above.

The proof is given in Appendix B.

Proposition 3.2.

Suppose that:

  • (i)

The random effects $u_i$ are symmetrically distributed;

  • (ii)

$\partial F_{2i}(x;\bm{\nu})/\partial\nu_j>0$ for $1\leq j\leq s$, and $\lambda_{\rm max}(\partial^2F_{2i}(x;\bm{\nu})/\partial\bm{\nu}\partial\bm{\nu}^{\prime})<0$, where $\lambda_{\rm max}$ denotes the largest eigenvalue.
    This condition is satisfied by some continuous distributions, for instance when $u_i$ follows a logistic or $t$ distribution with known degrees of freedom. In those cases the only unknown parameter of $F_{2i}(x;\bm{\nu})$ is the variance $A$. Since Remark 1 indicates that the kurtosis of $H_i(\bm{\beta},A)$ is a decreasing function of $A$, it is not difficult to show that $\partial F_{2i}(x;A)/\partial A>0$ and $\partial^2F_{2i}(x;A)/\partial A^2<0$;

  • (iii)

The estimators of the unknown parameters, $\hat{\bm{\nu}}=(\hat{\nu}_1,\cdots,\hat{\nu}_s)^{\prime}$, are either second-order unbiased or negatively biased; that is, $\text{E}(\hat{\nu}_j-\nu_j)=b_j+o(m^{-1})$ with $b_j\leq 0$, for $j=1,\cdots,s$.

Under these conditions, the prediction interval in (3.4) has an overcoverage property. More specifically, we can rewrite (3.4) as below:

$$\mathrm{P}(\theta_i\in\hat{I}_i^s)=1-\alpha+T_1+o(m^{-1}), \qquad (3.5)$$

where $T_1$ is of the order $O(m^{-1})$ with $T_1>0$.

See the detailed proof in Appendix C. The proposition indicates that, under the regularity conditions, the prediction intervals constructed by the proposed single parametric bootstrap can produce coverage higher than the nominal level with a non-pivot $H_i(\bm{\beta},A)$, up to the order $O(m^{-1})$, which could be beneficial for practitioners when other properties of prediction intervals are not a concern.

Remark 1.

When $\mathcal{G}(\bm{x}_i^{\prime}\bm{\beta},A,\bm{\phi})$ is not a normal distribution, it is challenging to obtain the explicit form of the distribution of $H_i(\bm{\beta},A)$. Consequently, it is difficult to verify whether $H_i(\bm{\beta},A)$ is a pivot. Note that for $H_i(\bm{\beta},A)$ to be a pivot, its moments must not depend on any unknown parameters. Based on this fact, we develop a simple moment-based method for establishing the non-existence of a pivot. Under the symmetry assumption on $u_i$, the odd moments of $H_i(\bm{\beta},A)$ are zero if they exist, and the second moment equals 1 because $H_i(\bm{\beta},A)$ is standardized. To verify whether $H_i(\bm{\beta},A)$ is a pivot, we calculate its fourth moment as follows:

$$\begin{split}\text{E}\left[(\theta_i-\tilde{\theta}_i)/\sqrt{g_{1i}}\right]^4&=\frac{1}{g_{1i}^2}\text{E}\left[\bm{x}_i^{\prime}\bm{\beta}+u_i-(1-B_i)y_i-B_i\bm{x}_i^{\prime}\bm{\beta}\right]^4\\ &=\frac{1}{g_{1i}^2}\text{E}\left[B_iu_i-(1-B_i)e_i\right]^4\\ &=\frac{1}{g_{1i}^2}\text{E}\left[B_i^4u_i^4-4B_i^3u_i^3(1-B_i)e_i+6B_i^2u_i^2(1-B_i)^2e_i^2-4B_iu_i(1-B_i)^3e_i^3+(1-B_i)^4e_i^4\right]\\ &=\frac{1}{g_{1i}^2}\text{E}\left[B_i^4u_i^4+6B_i^2u_i^2(1-B_i)^2e_i^2+(1-B_i)^4e_i^4\right]\\ &=3+\frac{D_i^2}{(A+D_i)^2}\left[\frac{\text{E}(u_i^4)}{A^2}-3\right],\end{split} \qquad (3.6)$$

where $\left[\text{E}(u_i^4)/A^2-3\right]$ is the excess kurtosis of $u_i$. Note that when $u_i$ is normally distributed, $\left[\text{E}(u_i^4)/A^2-3\right]$ is zero. When the distribution of $u_i$ is other than normal, such as the $t$, double exponential, or logistic, $\left[\text{E}(u_i^4)/A^2-3\right]$ is a nonzero constant, indicating that the fourth moment of $H_i(\bm{\beta},A)$ depends on the unknown parameter $A$, and thus $H_i(\bm{\beta},A)$ is not a pivot. Moreover, in these cases, the fourth moment of $H_i(\bm{\beta},A)$ is a decreasing function of $A$.
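As a concrete instance of this check (our own worked example, using the standard excess-kurtosis value of the $t$ distribution), suppose $u_i$ follows a scaled $t_\nu$ distribution with variance $A$ and $\nu>4$, so that $\text{E}(u_i^4)/A^2-3=6/(\nu-4)$. Then (3.6) becomes

$$\text{E}\left[(\theta_i-\tilde{\theta}_i)/\sqrt{g_{1i}}\right]^4=3+\frac{6}{\nu-4}\cdot\frac{D_i^2}{(A+D_i)^2}.$$

For the $t_9$ distribution used in Section 5, the excess kurtosis is $6/5=1.2$, and the fourth moment of $H_i(\bm{\beta},A)$ varies with $A$ for any fixed $D_i>0$; hence $H_i(\bm{\beta},A)$ cannot be a pivot.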

4 Double parametric bootstrap

Hall and Maiti (2006) considered parametric bootstrap methods to approximate the distribution of $(\theta_i-\bm{x}_i^{\prime}\hat{\bm{\beta}})/\sqrt{\hat{A}}$. The prediction interval can then be constructed as $(\bm{x}_i^{\prime}\hat{\bm{\beta}}+b_{\hat{\alpha}_l^s}\sqrt{\hat{A}},\;\bm{x}_i^{\prime}\hat{\bm{\beta}}+b_{\hat{\alpha}_u^s}\sqrt{\hat{A}})$, where $b_{\hat{\alpha}_l^s}$ and $b_{\hat{\alpha}_u^s}$ are obtained from a single bootstrap approximation based on their algorithm. Their prediction interval is based on a synthetic, or regression, model, which does not permit approximation of the conditional distribution of $\theta_i$ given the data $\bm{Y}$. As a consequence, it is likely to underweight the area-specific data. When the level 2 distribution $\mathcal{G}$ is determined only by $\bm{x}_i^{\prime}\bm{\beta}$ and $A$, it is easy to see that the quantity $(\theta_i-\bm{x}_i^{\prime}\bm{\beta})/\sqrt{A}$ is a pivot. As Hall and Maiti (2006) stated, their prediction interval is then effective, since $\mathrm{P}(\bm{x}_i^{\prime}\hat{\bm{\beta}}+b_{\hat{\alpha}_l^s}\sqrt{\hat{A}}\leq\theta_i\leq\bm{x}_i^{\prime}\hat{\bm{\beta}}+b_{\hat{\alpha}_u^s}\sqrt{\hat{A}})=1-\alpha+O(m^{-2})$. However, when additional parameters are involved in the distribution of the random effects, $(\theta_i-\bm{x}_i^{\prime}\bm{\beta})/\sqrt{A}$ might not be a pivot and this effectiveness may be lost.

Proposition 4.1.

When considering a general distribution $\mathcal{G}(\bm{x}_i^{\prime}\bm{\beta},A,\bm{\phi})$, such as a $t$ distribution with unknown degrees of freedom, we have

$$\mathrm{P}(\bm{x}_i^{\prime}\hat{\bm{\beta}}+b_{\hat{\alpha}_l^s}\sqrt{\hat{A}}\leq\theta_i\leq\bm{x}_i^{\prime}\hat{\bm{\beta}}+b_{\hat{\alpha}_u^s}\sqrt{\hat{A}})=1-\alpha+O(m^{-1}), \qquad (4.1)$$

where $b_{\hat{\alpha}_l^s}$ and $b_{\hat{\alpha}_u^s}$ are obtained from a single bootstrap approximation based on the Hall and Maiti (2006) algorithm.

See the proof in Appendix D, which is similar to that of Proposition 3.1.

Hall and Maiti (2006) proposed a double-bootstrap method to calibrate $(\hat{\alpha}_l^s,\hat{\alpha}_u^s)$ from single parametric bootstraps, which can achieve a high degree of coverage accuracy, $O(m^{-3})$. However, their calibration approach can overcorrect and produce a calibrated $\hat{\alpha}$ greater than 1, which makes the implementation infeasible.

While the order $O(m^{-1})$ term $T_1$ in (3.5) is theoretically positive under certain conditions, it remains unclear whether this positiveness holds when $u_i$ is asymmetrically distributed. Unlike Hall and Maiti (2006), we introduce a new double bootstrap method, which requires neither a pivot nor symmetric $u_i$. This method reduces the coverage error to $o(m^{-1})$, even when $u_i$ is asymmetrically distributed. Our double bootstrap approach is based on the algorithm of Shi (1992), where the double bootstrap was proposed to obtain accurate and efficient confidence intervals for parameters of interest in both nonparametric and univariate parametric distribution settings. Later on, McCullough and Vinod (1998) discussed the theory of the double bootstrap both with and without pivoting, and provided implementations for some nonlinear production functions. In this paper, we develop the double bootstrap in the context of our mixed effects model and apply it to obtain the EBL prediction intervals of $\theta_i$. The framework of our double parametric bootstrap is as follows (a code sketch is given after the algorithm):

  • 1.

    First-stage bootstrap

    • 1.1

Conditionally on the data $(\bm{X},\bm{Y})$, draw $\theta_i^*$, $i=1,\cdots,m$, from the distribution $\mathcal{G}(\bm{x}_i^{\prime}\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})$;

    • 1.2

Given $\theta_i^*$, draw $y_i^*$ from the distribution $\mathcal{N}(\theta_i^*,D_i)$;

    • 1.3

Compute $\hat{\bm{\beta}}^*$, $\hat{A}^*$, $\hat{\bm{\phi}}^*$ from the data $(\bm{X},\bm{Y}^*)$, and obtain $\hat{\theta}_i^*$ and $\hat{g}_{1i}^*$;

  • 2.

    Second-stage bootstrap

    • 2.1

Given $\hat{\bm{\beta}}^*$, $\hat{A}^*$, $\hat{\bm{\phi}}^*$, draw $\theta_i^{**}$ from the distribution $\mathcal{G}(\bm{x}_i^{\prime}\hat{\bm{\beta}}^*,\hat{A}^*,\hat{\bm{\phi}}^*)$;

    • 2.2

Given $\theta_i^{**}$, draw $y_i^{**}$ from the distribution $\mathcal{N}(\theta_i^{**},D_i)$;

    • 2.3

Compute $\hat{\bm{\beta}}^{**}$, $\hat{A}^{**}$ from the data $(\bm{X},\bm{Y}^{**})$, and also obtain $\hat{\theta}_i^{**}$ and $\hat{g}_{1i}^{**}$;

    • 2.4

Consider obtaining a $(1-\alpha)$-level, two-sided, equal-tailed prediction interval. Define

$$\hat{G}^*(z)\equiv\mathrm{P}^{**}(H_i(\hat{\bm{\beta}}^{**},\hat{A}^{**})\leq z\,|\,\hat{\bm{\beta}}^*,\hat{A}^*,\hat{\bm{\phi}}^*). \qquad (4.2)$$

To find the upper limit, we first solve the following system of equations to obtain $\hat{\alpha}_u$ such that

$$\left\{\begin{array}{l}\mathrm{P}^{*}(H_i(\hat{\bm{\beta}}^*,\hat{A}^*)\leq q_{\hat{\alpha}_u}\,|\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})=1-\alpha/2,\\ \mathrm{P}^{**}(H_i(\hat{\bm{\beta}}^{**},\hat{A}^{**})\leq q_{\hat{\alpha}_u}\,|\,\hat{\bm{\beta}}^*,\hat{A}^*,\hat{\bm{\phi}}^*)=\hat{\alpha}_u.\end{array}\right. \qquad (4.3)$$

Using the definition (4.2), we have $q_{\hat{\alpha}_u}=\hat{G}^{*-1}(\hat{\alpha}_u)$. The above system of equations is then equivalent to

$$\mathrm{P}^*\left(H_i(\hat{\bm{\beta}}^*,\hat{A}^*)\leq\hat{G}^{*-1}(\hat{\alpha}_u)\,|\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}}\right)=1-\alpha/2. \qquad (4.4)$$

      Rewriting (4.4) gives

$$\mathrm{P}^*\left[\mathrm{P}^{**}\left(H_i(\hat{\bm{\beta}}^{**},\hat{A}^{**})\leq H_i(\hat{\bm{\beta}}^*,\hat{A}^*)\,|\,\hat{\bm{\beta}}^*,\hat{A}^*,\hat{\bm{\phi}}^*\right)\leq\hat{\alpha}_u\,|\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}}\right]=1-\alpha/2. \qquad (4.5)$$

Note that the inner probability, $\mathrm{P}^{**}(H_i(\hat{\bm{\beta}}^{**},\hat{A}^{**})\leq H_i(\hat{\bm{\beta}}^*,\hat{A}^*)\,|\,\hat{\bm{\beta}}^*,\hat{A}^*,\hat{\bm{\phi}}^*)$, is a function of the first-stage bootstrap resample. More specifically, for the $j$th first-stage bootstrap sample, after all $K$ second-stage bootstrap operations are completed, let $Z_j$ be the proportion of times that $H_i(\hat{\bm{\beta}}^{**},\hat{A}^{**})\leq H_i(\hat{\bm{\beta}}^*,\hat{A}^*)$, i.e.,

$$Z_j=\mathrm{P}^{**}\left(H_i(\hat{\bm{\beta}}^{**},\hat{A}^{**})\leq H_i(\hat{\bm{\beta}}^*,\hat{A}^*)\,|\,\hat{\bm{\beta}}^*,\hat{A}^*,\hat{\bm{\phi}}^*\right)\approx\#\left(H_i(\hat{\bm{\beta}}^{**},\hat{A}^{**})\leq H_i(\hat{\bm{\beta}}^*,\hat{A}^*)\right)/K.$$

We will use the $Z_j$ to adjust the first-stage intervals. After all bootstrap operations are complete, we have estimates $Z_j$, $j=1,2,\cdots,J$, where $J$ is the number of first-stage bootstrap operations. Sort the $Z_j$ and choose their $1-\alpha/2$ quantile as the percentile point $\hat{\alpha}_u$ defining the double-bootstrap upper limit.

    • 2.5

After completing step 2.4, we have all estimates $H_i^{(j)}(\hat{\bm{\beta}}^*,\hat{A}^*)$, $j=1,\cdots,J$, and $\hat{\alpha}_u$. Choose $q_{\alpha_u}=q_{\hat{\alpha}_u^d}$ such that

$$\mathrm{P}^*(H_i(\hat{\bm{\beta}}^*,\hat{A}^*)\leq q_{\alpha_u}\,|\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})=\hat{\alpha}_u. \qquad (4.6)$$
    • 2.6

Take $q_{\hat{\alpha}_u^d}$ as the upper limit of the double bootstrap prediction interval. Similar operations determine the lower limit, $q_{\hat{\alpha}_l^d}$. Finally, construct the prediction interval of $\theta_i$ as

$$\hat{I}_i^d=\left(\hat{\theta}_i+q_{\hat{\alpha}_l^d}\sqrt{\hat{g}_{1i}}\,,\;\hat{\theta}_i+q_{\hat{\alpha}_u^d}\sqrt{\hat{g}_{1i}}\right).$$
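The following minimal sketch implements the two-stage algorithm for a single area $i$ (our own illustrative Python; `estimate_A` and `draw_G` are the same user-supplied stand-ins as in the single bootstrap sketch, and the mirrored treatment of the lower limit is our reading of "similar operations" in step 2.6):

```python
import numpy as np

def fit_eblup(y, X, D, estimate_A):
    """Refit the area-level model; returns (theta_hat, g1_hat, beta_hat, A_hat)."""
    A = estimate_A(y, X, D)
    w = 1.0 / (A + D)
    beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    B = D / (A + D)
    return (1.0 - B) * y + B * (X @ beta), A * D / (A + D), beta, A

def double_bootstrap_interval(y, X, D, i, estimate_A, draw_G,
                              alpha=0.1, J=400, K=100):
    """Double parametric bootstrap EBL prediction interval for area i."""
    m = len(y)
    theta_hat, g1_hat, beta_hat, A_hat = fit_eblup(y, X, D, estimate_A)
    H1 = np.empty(J)   # first-stage statistics H_i(beta*, A*)
    Z = np.empty(J)    # second-stage proportions Z_j of step 2.4
    for j in range(J):
        th1 = draw_G(X @ beta_hat, A_hat)                          # 1.1
        y1 = th1 + np.sqrt(D) * np.random.randn(m)                 # 1.2
        th_hat1, g11, beta1, A1 = fit_eblup(y1, X, D, estimate_A)  # 1.3
        H1[j] = (th1[i] - th_hat1[i]) / np.sqrt(g11[i])
        H2 = np.empty(K)
        for k in range(K):                                         # 2.1-2.3
            th2 = draw_G(X @ beta1, A1)
            y2 = th2 + np.sqrt(D) * np.random.randn(m)
            th_hat2, g12, _, _ = fit_eblup(y2, X, D, estimate_A)
            H2[k] = (th2[i] - th_hat2[i]) / np.sqrt(g12[i])
        Z[j] = np.mean(H2 <= H1[j])                                # 2.4: Z_j
    alpha_u = np.quantile(Z, 1.0 - alpha / 2.0)   # calibrated upper level
    alpha_l = np.quantile(Z, alpha / 2.0)         # assumed mirror for lower
    q_u = np.quantile(H1, alpha_u)                # 2.5: empirical (4.6)
    q_l = np.quantile(H1, alpha_l)
    return (theta_hat[i] + q_l * np.sqrt(g1_hat[i]),               # 2.6
            theta_hat[i] + q_u * np.sqrt(g1_hat[i]))
```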

The algorithm above shows that the single parametric bootstrap interval $\hat{I}_i^s$ is calibrated by the second-stage bootstrap. In general, such a calibration improves the coverage accuracy to $o(m^{-1})$, even when a pivot does not exist. A further advantage of our double bootstrap algorithm is that it avoids the problem of overcorrection, making it more practical than that of Hall and Maiti (2006).

Theorem 4.1.

Under regularity conditions, for a preassigned $\alpha\in(0,1)$ and arbitrary $i=1,\cdots,m$, we have

$$\mathrm{P}\left(\hat{\theta}_i+q_{\hat{\alpha}_l^d}\sqrt{\hat{g}_{1i}}\leq\theta_i\leq\hat{\theta}_i+q_{\hat{\alpha}_u^d}\sqrt{\hat{g}_{1i}}\right)=1-\alpha+o(m^{-1}), \qquad (4.7)$$

where $q_{\hat{\alpha}_l^d}$ and $q_{\hat{\alpha}_u^d}$ are obtained from the double parametric bootstrap procedure described above.

See the proof of the theorem in Appendix E.

5 Monte Carlo Simulations

In this section, we compare the performance of the proposed parametric bootstrap methods with their competitors, where available, using Monte Carlo simulation studies. To maintain comparability with existing studies, we adopt part of the simulation framework of Chatterjee et al. (2008). We consider an area-level model with $\bm{x}_i^{\prime}\bm{\beta}=0$ and five groups of small areas. Within each group, the sampling variances $D_i$ remain the same. Two patterns for the $D_i$'s are considered: (i) $(4.0,0.6,0.5,0.4,0.2)$; (ii) $(8.0,1.2,1.0,0.8,0.4)$. For pattern (i), we take $A=1$; for pattern (ii), we take $A=2$, doubling the variances of pattern (i) while preserving the ratios $B_i=D_i/(A+D_i)$. To examine the effect of $m$, we consider $m=15$ and 50. As $m$ increases, all methods improve and get closer to one another, supporting our asymptotic theory. Since we obtained virtually identical results under the two patterns, for the full study we confined attention to pattern (i); the results for pattern (ii) are provided in Appendix G.

The Prasad-Rao method-of-moments (Prasad and Rao, 1990) and the Fay-Herriot method of estimating the variance component $A$ are considered. For the Fay-Herriot estimator of $A$, we employ the method of scoring to solve the estimating equation, which has been shown to be more stable (Datta et al., 2005) than the Newton-Raphson method originally used in Fay and Herriot (1979). The estimating equation of the Fay-Herriot estimator is

$$f(A)=\sum_{i=1}^m\frac{(y_i-\bm{x}_i^{\prime}\tilde{\bm{\beta}})^2}{A+D_i}-(m-p)=0.$$

Here, $f(A)$ is a decreasing function of $A$, and the expectation of its first derivative is $\text{E}(f^{\prime}(A))=-{\rm tr}(\bm{P})<0$. To improve computational efficiency, we implemented a slight modification to the algorithm. Specifically, the revised Fay-Herriot algorithm begins by calculating $f(A)$ at the initial point $A_0=0$, as in Fay and Herriot (1979). If $f(A_0)<0$, we truncate the estimator to $\hat{A}_{\rm FH}=0.01$. Otherwise, the iterative process continues to search for a positive solution, with $\hat{A}_{\rm FH}=0.01$ also applied if no positive solution is found. This revised Fay-Herriot algorithm further enhances computational efficiency compared to the original method.
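A minimal sketch of this revised scoring iteration (our own illustrative Python; following the text, the scoring denominator is ${\rm tr}(\bm{P})=\text{E}(-f^{\prime}(A))$, and negative or failed searches are truncated at 0.01):

```python
import numpy as np

def fay_herriot_A(y, X, D, tol=1e-6, max_iter=100, A_floor=0.01):
    """Revised Fay-Herriot scoring iteration for A (a sketch).

    Solves f(A) = sum_i r_i^2/(A + D_i) - (m - p) = 0 with the update
    A <- A + f(A)/tr(P); truncates at A_floor when f(0) < 0 or when no
    positive solution is found."""
    m, p = X.shape

    def f_and_trP(A):
        w = 1.0 / (A + D)
        Xw = X * w[:, None]                       # Sigma^{-1} X
        XtWX_inv = np.linalg.inv(X.T @ Xw)
        beta = XtWX_inv @ (Xw.T @ y)              # WLS estimator beta_tilde(A)
        r = y - X @ beta
        f = np.sum(w * r**2) - (m - p)
        trP = np.sum(w) - np.trace(XtWX_inv @ (Xw.T @ Xw))  # tr(P)
        return f, trP

    f0, _ = f_and_trP(0.0)
    if f0 < 0:                                    # f is decreasing in A
        return A_floor
    A = 0.0
    for _ in range(max_iter):
        f, trP = f_and_trP(A)
        A_new = A + f / trP                       # Fisher scoring step
        if A_new <= 0:
            return A_floor
        if abs(A_new - A) < tol:
            return A_new
        A = A_new
    return max(A, A_floor)
```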

5.1 Simulations on symmetric cases

First, we consider the scenario in which $u_i$ is symmetrically distributed. Specifically, we assume $\mathcal{G}$ is a $t$ distribution with 9 degrees of freedom. In this setting, we compare the coverage probabilities and average lengths of the following seven prediction intervals of $\theta_i$:

Two prediction intervals based on the proposed single parametric bootstrap method with two different variance estimators, $\hat{A}_{\rm FH}$ and $\hat{A}_{\rm PR}$, denoted by SB.FH and SB.PR, respectively;

Two prediction intervals based on the single parametric bootstrap method proposed by Hall and Maiti (2006), using the same variance estimators, denoted by HM.FH and HM.PR;

Two traditional prediction intervals of the form $\hat{\theta}_i\pm z_{\alpha/2}\sqrt{\text{mspe}(\hat{\theta}_i)}$, where the variance estimator $\hat{A}_{\rm FH}$ is used along with the MSPE estimator of Datta et al. (2005), denoted by FH, and the variance estimator $\hat{A}_{\rm PR}$ is used with the Prasad-Rao MSPE estimator (Prasad and Rao, 1990), denoted by PR;

The direct confidence interval (DIRECT), given by $y_i\pm z_{\alpha/2}\sqrt{D_i}$.

Each reported result is based on 1000 simulation runs. For all cases, we consider a single bootstrap sample of size 400 and three different nominal coverages: $\alpha\in(80\%,90\%,95\%)$.
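For concreteness, here is a minimal sketch of one simulation replicate under this design (our own illustrative Python; we assume equal group sizes $m/5$ and the mean-zero, variance-$A$ scaling of the $t$ distribution):

```python
import numpy as np

rng = np.random.default_rng(2024)  # seed is arbitrary

def draw_t_effects(m, A, df=9):
    """Random effects u_i with mean 0 and variance A from a scaled t:
    Var(t_df) = df/(df-2), so multiply by sqrt(A*(df-2)/df)."""
    return np.sqrt(A * (df - 2) / df) * rng.standard_t(df, size=m)

m, A = 50, 1.0
D = np.repeat([4.0, 0.6, 0.5, 0.4, 0.2], m // 5)  # pattern (i), five groups
theta = 0.0 + draw_t_effects(m, A)                # x_i' beta = 0 here
y = theta + np.sqrt(D) * rng.standard_normal(m)   # level 1 sampling errors
```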

We report the percentages of negative estimates of $A$ under the $t_9$ distribution in Table 1. For $m=15$, the Prasad-Rao method-of-moments approach yields as high as about 13% negative estimates of $A$ under pattern (i) of the $D_i$. The Fay-Herriot method produces significantly fewer negative estimates than the Prasad-Rao method-of-moments approach across all scenarios.

Table 1: Percentages of negative estimates of $A$ ($\hat{A}^*$) for different estimation methods and different patterns of $D_i$ under $t_9$.

        Pattern (i)              Pattern (ii)
 m      FH          PR           FH          PR
 15     1.6 (6.4)   13.1 (25.0)  1.1 (5.7)   11.8 (23.2)
 50     0 (0.07)    1.2 (6.6)    0 (0.07)    1.2 (6.1)

Table 2 presents the coverage probabilities and average lengths of each prediction interval method with $m=50$ and the $t_9$ distribution under pattern (i) of the $D_i$. The SB.PR and HM.PR prediction intervals consistently over-cover. The SB.PR prediction intervals are shorter than the HM.PR intervals, especially for groups 2-5 with smaller sampling variances $D_i$. The SB.FH and HM.FH prediction intervals perform well in terms of coverage error. Specifically, their coverage probabilities are very close to all three nominal coverages. The SB.FH method produces the shortest average lengths among these four single parametric bootstrap methods. The FH intervals tend to undercover for group G1 when $\alpha=90\%$ and $95\%$. The PR intervals have an undercoverage issue for group G1 across all three nominal coverages.

Table 2: Coverage probabilities (average length) of different intervals under $t_9$ distribution with $m$ = 50 and Pattern (i)

        SB.FH         HM.FH         SB.PR         HM.PR         FH            PR            DIRECT
α = 80%
G1      79.94 (2.31)  80.30 (2.53)  83.70 (2.59)  84.45 (2.84)  79.89 (2.30)  78.66 (2.31)  79.43 (5.13)
G2      80.15 (1.59)  80.31 (2.49)  83.63 (1.79)  84.54 (2.81)  80.38 (1.59)  81.55 (1.67)  79.85 (1.99)
G3      79.71 (1.50)  79.53 (2.48)  83.12 (1.69)  83.75 (2.80)  79.86 (1.50)  80.88 (1.58)  79.99 (1.81)
G4      80.47 (1.39)  80.03 (2.48)  84.06 (1.57)  83.95 (2.80)  80.49 (1.39)  82.06 (1.47)  80.29 (1.62)
G5      79.65 (1.05)  80.48 (2.46)  83.32 (1.20)  84.89 (2.78)  79.99 (1.06)  81.80 (1.14)  80.32 (1.15)
α = 90%
G1      89.88 (3.05)  90.35 (3.39)  94.13 (3.79)  94.25 (4.21)  88.90 (2.96)  87.82 (2.97)  89.42 (6.58)
G2      90.17 (2.05)  90.37 (3.30)  93.25 (2.59)  94.27 (4.14)  89.89 (2.04)  90.58 (2.14)  90.19 (2.55)
G3      90.06 (1.94)  89.34 (3.30)  93.38 (2.44)  93.76 (4.14)  90.15 (1.92)  90.78 (2.03)  90.11 (2.33)
G4      90.07 (1.79)  89.50 (3.29)  93.41 (2.26)  94.24 (4.13)  90.12 (1.78)  91.19 (1.89)  90.24 (2.08)
G5      90.29 (1.36)  90.28 (3.26)  93.33 (1.72)  94.20 (4.09)  90.26 (1.36)  91.56 (1.47)  90.32 (1.47)
α = 95%
G1      95.13 (3.75)  95.04 (4.22)  97.30 (5.31)  97.42 (5.91)  93.57 (3.52)  92.65 (3.53)  94.77 (7.84)
G2      95.11 (2.47)  95.35 (4.07)  96.99 (3.61)  97.50 (5.79)  94.87 (2.43)  95.52 (2.55)  95.38 (3.04)
G3      94.96 (2.32)  94.45 (4.06)  97.05 (3.39)  97.42 (5.77)  94.70 (2.29)  95.51 (2.41)  95.00 (2.77)
G4      95.06 (2.14)  95.03 (4.04)  96.86 (3.14)  97.58 (5.75)  95.12 (2.12)  95.86 (2.25)  95.07 (2.48)
G5      95.31 (1.62)  94.98 (3.99)  96.87 (2.38)  97.42 (5.69)  95.28 (1.62)  96.01 (1.75)  95.17 (1.75)

Table 3 reports the results for $m=15$. As illustrated in Table 1, the Prasad-Rao method produces an extremely high percentage of negative (truncated) estimates for a small $m=15$. Thus, it is not surprising that the SB.PR intervals have a severe undercoverage problem when $\alpha=90\%$ and 95%. The high percentage of negative estimates of $A$ might also contribute to the similar undercoverage problem of the HM.PR intervals at $\alpha=90\%$ and 95%, as well as the large average lengths of both the SB.PR and HM.PR intervals. The SB.FH and HM.FH intervals uniformly tend to overcover. Still, the SB.FH intervals have the shortest average lengths among the four types of parametric bootstrap intervals. The FH intervals significantly undercover for group G1 at all three nominal coverages and slightly undercover for groups G2 and G3 when $\alpha=90\%$ and 95%. The PR intervals undercover for group G1 and switch to overcovering for the rest of the groups at all three nominal coverages.

Table 3: Coverage probabilities (average length) of different intervals under $t_9$ distribution with $m$ = 15 and Pattern (i)

        SB.FH         HM.FH         SB.PR         HM.PR         FH            PR            DIRECT
α = 80%
G1      82.43 (2.68)  82.50 (2.94)  81.50 (3.37)  82.30 (3.74)  75.07 (2.31)  76.23 (2.45)  81.37 (5.13)
G2      82.80 (1.78)  83.50 (2.78)  80.77 (2.25)  82.13 (3.60)  79.43 (1.63)  88.97 (2.19)  80.03 (1.99)
G3      81.40 (1.67)  81.37 (2.76)  80.57 (2.12)  82.27 (3.58)  79.07 (1.54)  88.20 (2.15)  80.40 (1.81)
G4      84.07 (1.55)  84.17 (2.74)  81.50 (1.95)  82.97 (3.54)  81.40 (1.43)  90.47 (2.10)  80.67 (1.62)
G5      81.80 (1.17)  83.17 (2.66)  79.33 (1.47)  82.23 (3.42)  80.73 (1.11)  88.50 (1.98)  80.00 (1.15)
α = 90%
G1      93.07 (3.89)  92.87 (4.33)  87.00 (5.77)  87.30 (6.45)  83.97 (2.97)  85.80 (3.14)  90.33 (6.58)
G2      93.00 (2.51)  93.10 (3.96)  86.30 (3.78)  87.27 (6.03)  89.37 (2.09)  95.70 (2.81)  90.87 (2.55)
G3      92.63 (2.35)  92.50 (3.91)  86.50 (3.57)  87.77 (6.01)  88.37 (1.98)  95.27 (2.76)  90.20 (2.33)
G4      93.90 (2.16)  93.83 (3.86)  87.53 (3.28)  87.87 (5.94)  90.17 (1.84)  96.67 (2.70)  90.93 (2.08)
G5      91.83 (1.60)  92.87 (3.67)  86.03 (2.42)  87.53 (5.62)  90.03 (1.43)  94.73 (2.54)  89.60 (1.47)
α = 95%
G1      97.13 (5.46)  97.07 (6.11)  88.77 (8.91)  88.80 (9.98)  88.73 (3.54)  90.73 (3.75)  94.73 (7.84)
G2      96.33 (3.41)  96.83 (5.36)  88.43 (5.73)  88.87 (9.17)  93.70 (2.49)  97.87 (3.35)  95.70 (3.04)
G3      96.60 (3.18)  96.83 (5.26)  89.20 (5.39)  89.50 (9.10)  93.33 (2.36)  97.90 (3.29)  94.77 (2.77)
G4      97.23 (2.90)  97.37 (5.15)  89.87 (4.93)  89.90 (8.95)  95.33 (2.19)  98.83 (3.22)  95.63 (2.48)
G5      96.30 (2.11)  96.93 (4.77)  88.93 (3.59)  89.30 (8.38)  95.13 (1.70)  97.80 (3.03)  94.57 (1.75)

5.2 Further simulations on asymmetric cases

While some of our theoretical results for the single bootstrap are based on the symmetry assumption, we also use simulation studies to assess our proposed parametric bootstrap methods under asymmetric conditions. Specifically, for the same models and parameter choices as above, we change $\mathcal{G}$ to a shifted exponential (SE) distribution. Besides the seven prediction intervals mentioned before, we also include our proposed double parametric bootstrap intervals in this subsection, expecting them to improve the potential coverage error under asymmetric conditions. Below, the double bootstrap method based on the Fay-Herriot variance estimator is denoted by DB.FH, and the double bootstrap method based on the Prasad-Rao variance estimator is denoted by DB.PR. We keep a bootstrap sample of size 400 in the first stage and apply two different sizes, $B_2\in(50,100)$, in the second stage. For $B_2=50$ and $B_2=100$, we obtained very similar results; moreover, the two patterns of $D_i$ gave similar results. In the following, we provide detailed discussions only for $B_2=100$ under pattern (i); the results for $B_2=50$ are provided in Appendix G.
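A minimal sketch of the corresponding level-2 sampler (our own illustrative Python; the mean-zero, variance-$A$ parametrization $u_i=\sqrt{A}(E_i-1)$ with $E_i\sim\mathrm{Exp}(1)$ is our assumption, as the paper does not spell out the SE parametrization):

```python
import numpy as np

rng = np.random.default_rng(2024)

def draw_se_effects(m, A):
    """Shifted exponential random effects with mean 0 and variance A,
    under our assumed parametrization: u_i = sqrt(A) * (E_i - 1)."""
    return np.sqrt(A) * (rng.exponential(1.0, size=m) - 1.0)
```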

Table 4 shows the percentages of negative estimates of $A$ under the SE distribution. When $m=15$, the Prasad-Rao method-of-moments approach yields very high percentages of negative estimates at both the first and second stages of bootstrapping. Although the Fay-Herriot method results in fewer negative estimates than the Prasad-Rao approach, the occurrence of negative estimates remains notable, especially when the number of small areas is small (e.g., $m=15$).

Table 4: Percentages of negative estimates of $A$ ($\hat{A}^*$) [$\hat{A}^{**}$] for different estimation methods and different patterns of $D_i$ under the SE distribution.

        Pattern (i)                                 Pattern (ii)
 m      FH                   PR                     FH                   PR
 15     3.6 (10.72) [16.32]  16.7 (27.98) [31.95]   3.9 (10.93) [16.46]  18.1 (28.49) [31.97]
 50     0 (0.37) [1.44]      1.80 (7.89) [12.82]    0 (0.44) [1.56]      2.80 (8.89) [13.68]

Table 5 displays the coverage probabilities and average lengths of each prediction interval for $m=50$. The parametric bootstrap methods based on Prasad-Rao estimators of $A$ (SB.PR, HM.PR and DB.PR) consistently tend to over-cover at $\alpha=80\%$ and 90%. The DB.PR intervals bring the coverage probabilities closer to the nominal coverages but are longer than the single bootstrap SB.PR intervals. Even with $m=50$, the relatively high percentage of negative estimates of $A$ in the second bootstrap stage, shown in Table 4, might contribute to the increased interval length of DB.PR. When the Fay-Herriot method is applied to estimate $A$, our proposed single parametric bootstrap intervals SB.FH already show good performance, and DB.FH has little or no effect. The FH intervals perform well, except for overcoverage at $\alpha=80\%$. The PR intervals have an undercoverage problem for group G1 at $\alpha=95\%$. Overall, the SB.FH and FH intervals perform best in terms of both coverage probabilities and average lengths in this setting.

Table 5: Coverage probabilities (average length) of different intervals under shifted exponential distribution with $m$ = 50, $B_2$ = 100 and Pattern (i)

        SB.FH         HM.FH         DB.FH         SB.PR         HM.PR         DB.PR         FH            PR            DIRECT
α = 80%
G1      80.29 (2.20)  80.75 (2.35)  79.58 (2.22)  84.50 (2.50)  87.12 (2.68)  83.16 (2.86)  83.45 (2.28)  81.94 (2.30)  79.90 (5.13)
G2      79.99 (1.55)  81.01 (2.31)  79.42 (1.56)  83.66 (1.77)  87.17 (2.65)  82.99 (2.07)  80.56 (1.57)  82.99 (1.67)  79.94 (1.99)
G3      80.06 (1.47)  81.27 (2.30)  79.62 (1.47)  83.41 (1.68)  87.39 (2.65)  83.09 (1.97)  80.38 (1.48)  82.61 (1.59)  79.86 (1.81)
G4      80.60 (1.36)  80.68 (2.30)  80.20 (1.37)  83.71 (1.56)  87.00 (2.64)  82.96 (1.83)  81.05 (1.37)  83.78 (1.48)  80.15 (1.62)
G5      80.81 (1.05)  80.52 (2.29)  80.34 (1.05)  83.79 (1.20)  87.13 (2.63)  83.23 (1.40)  80.95 (1.05)  83.54 (1.17)  80.69 (1.15)
α = 90%
G1      90.46 (3.01)  91.81 (3.26)  89.56 (3.03)  93.49 (3.79)  94.70 (4.16)  92.15 (5.53)  90.98 (2.92)  90.23 (2.95)  90.01 (6.58)
G2      90.22 (2.04)  91.74 (3.16)  89.53 (2.05)  93.01 (2.61)  94.91 (4.10)  92.09 (4.05)  90.14 (2.02)  92.22 (2.14)  89.81 (2.55)
G3      90.00 (1.93)  91.93 (3.15)  89.48 (1.93)  92.51 (2.46)  94.89 (4.10)  91.69 (3.85)  90.20 (1.90)  91.69 (2.03)  89.80 (2.33)
G4      90.48 (1.78)  91.74 (3.14)  89.80 (1.79)  92.72 (2.28)  94.65 (4.09)  92.09 (3.59)  90.47 (1.76)  92.06 (1.90)  90.17 (2.08)
G5      90.69 (1.36)  91.34 (3.10)  90.24 (1.37)  92.97 (1.74)  94.64 (4.04)  92.23 (2.76)  90.71 (1.35)  92.35 (1.50)  90.29 (1.47)
α = 95%
G1      95.43 (3.83)  96.68 (4.18)  94.37 (3.90)  96.45 (5.44)  96.80 (6.02)  95.73 (9.56)  94.27 (3.48)  93.58 (3.52)  95.04 (7.84)
G2      95.32 (2.51)  96.69 (3.98)  94.77 (2.58)  96.37 (3.69)  96.91 (5.90)  95.88 (6.82)  94.91 (2.40)  96.04 (2.55)  95.07 (3.04)
G3      94.87 (2.36)  96.54 (3.96)  94.35 (2.43)  96.34 (3.49)  97.19 (5.89)  95.72 (6.48)  94.72 (2.27)  95.51 (2.42)  94.79 (2.77)
G4      95.41 (2.17)  96.36 (3.93)  94.87 (2.23)  96.35 (3.22)  97.01 (5.86)  95.87 (6.06)  95.13 (2.10)  95.92 (2.27)  95.17 (2.48)
G5      95.34 (1.64)  96.27 (3.86)  95.16 (1.69)  96.34 (2.43)  96.99 (5.79)  95.92 (4.60)  95.31 (1.61)  96.30 (1.78)  95.44 (1.75)

For $m=15$, similar to the cases under the $t_9$ distribution, the Prasad-Rao variance estimator produces very high percentages of negative estimates of $A$, especially at the bootstrap stages; see Table 4. Thus, it is not abnormal that the three parametric bootstrap intervals based on $\hat{A}_{\rm PR}$ have very low coverage probabilities when the nominal coverage is $\alpha=90\%$ or 95%; see Table 6 for the $m=15$ cases. The other three parametric bootstrap intervals, based on $\hat{A}_{\rm FH}$, overcover when the nominal coverage is 80% or 90%. When $\alpha=95\%$, the SB.FH intervals perform well in terms of coverage error, as does DB.FH, while the SB.FH intervals have shorter average lengths than those of HM.FH and DB.FH. The FH intervals suffer from severe undercoverage for group G1 at all nominal coverages, as do the PR intervals at $\alpha=95\%$. At $\alpha=80\%$ and 90%, the PR intervals have longer average lengths than the SB.FH intervals for groups G2-G5.

Table 6: Coverage probabilities (average length) of different intervals under shifted exponential distribution with $m$ = 15, $B_2$ = 100 and Pattern (i)

        SB.FH         HM.FH         DB.FH          SB.PR         HM.PR          DB.PR          FH            PR            DIRECT
α = 80%
G1      84.63 (2.66)  86.00 (2.90)  83.63 (2.68)   79.77 (3.31)  81.33 (3.69)   80.60 (4.89)   77.17 (2.23)  80.43 (2.43)  79.80 (5.13)
G2      82.40 (1.78)  85.50 (2.74)  81.63 (1.80)   79.50 (2.20)  81.33 (3.56)   79.57 (3.26)   79.43 (1.59)  88.60 (2.21)  79.23 (1.99)
G3      82.77 (1.68)  85.87 (2.72)  81.97 (1.70)   80.17 (2.08)  81.53 (3.55)   80.40 (3.07)   80.13 (1.51)  89.27 (2.18)  78.70 (1.81)
G4      84.13 (1.55)  84.90 (2.70)  82.97 (1.57)   79.63 (1.92)  80.83 (3.51)   79.63 (2.80)   81.83 (1.41)  90.90 (2.15)  82.03 (1.62)
G5      82.80 (1.18)  84.23 (2.61)  81.70 (1.18)   79.70 (1.44)  80.67 (3.38)   80.00 (2.00)   81.40 (1.12)  88.93 (2.07)  80.60 (1.15)
α = 90%
G1      93.77 (4.08)  94.33 (4.54)  93.13 (4.83)   85.83 (5.84)  86.10 (6.59)   86.07 (13.91)  86.13 (2.86)  89.37 (3.12)  90.43 (6.58)
G2      92.70 (2.59)  94.00 (4.06)  91.53 (2.93)   86.00 (3.78)  86.33 (6.19)   85.97 (8.66)   89.07 (2.04)  95.13 (2.84)  89.67 (2.55)
G3      92.83 (2.43)  93.97 (4.01)  91.67 (2.69)   86.80 (3.55)  86.80 (6.14)   86.73 (8.16)   89.50 (1.94)  95.77 (2.80)  90.20 (2.33)
G4      93.20 (2.23)  93.83 (3.94)  92.00 (2.48)   85.93 (3.26)  86.37 (6.07)   85.50 (7.39)   91.43 (1.81)  96.20 (2.76)  91.60 (2.08)
G5      92.17 (1.65)  93.10 (3.71)  91.00 (1.75)   86.53 (2.39)  86.57 (5.74)   86.33 (5.20)   91.30 (1.43)  95.23 (2.66)  90.23 (1.47)
α = 95%
G1      96.60 (6.02)  96.77 (6.72)  96.50 (10.31)  87.97 (9.34)  88.07 (10.58)  88.40 (22.55)  90.33 (3.41)  93.63 (3.71)  95.07 (7.84)
G2      96.17 (3.64)  96.67 (5.67)  95.43 (5.73)   88.80 (5.86)  88.80 (9.64)   88.93 (13.23)  93.73 (2.43)  97.57 (3.38)  95.00 (3.04)
G3      96.07 (3.38)  96.30 (5.53)  95.57 (5.13)   89.13 (5.48)  89.00 (9.53)   89.80 (12.69)  93.97 (2.31)  98.00 (3.34)  95.67 (2.77)
G4      95.93 (3.09)  96.63 (5.41)  95.40 (4.59)   88.53 (5.03)  88.30 (9.42)   88.67 (11.26)  95.33 (2.16)  98.03 (3.28)  96.00 (2.48)
G5      96.17 (2.22)  96.40 (4.95)  95.47 (2.99)   88.83 (3.63)  88.63 (8.76)   89.07 (8.17)   95.77 (1.71)  97.77 (3.16)  94.90 (1.75)

6 Discussion

In this study, we put forward parametric bootstrap approaches to construct prediction intervals in the context of small area estimation under a general (non-normal) area-level model. Our simulation results show that the proposed single bootstrap method with the Fay-Herriot variance estimator performs well in all cases. Moreover, it is more efficient in terms of average lengths than the existing parametric bootstrap method proposed by Hall and Maiti (2006).

When the number of areas is small, there is a high likelihood of obtaining negative estimates of the variance $A$ when applying the Prasad-Rao method. Throughout our simulation studies, we noticed that the likelihood of obtaining negative variance estimates might affect the performance of the parametric bootstrap methods in terms of both coverage probabilities and average lengths. One might consider a better variance estimator, such as the Fay-Herriot estimator we used, or a sensible truncation of negative estimates. In this study, we arbitrarily truncated the negative estimates of the variance $A$ at 0.01, similar to Datta et al. (2005). To this end, in the future we will consider an extension of adjusted maximum likelihood estimators of $A$, such as the ones considered by Li and Lahiri (2010) or Hirose and Lahiri (2018), to the model proposed in this paper.

Another issue is that even though the double bootstrap calibration can bring the coverage accuracy to $o(m^{-1})$ regardless of the existence of a pivot, our simulations suggest that it is not always beneficial to attempt to boost the theoretical coverage probability via the double bootstrap while disregarding other properties of the interval. Specifically, the variability of calibrated intervals is greater than that of uncalibrated ones, the minimum length property is almost never preserved, and the results are quite dependent on the parameters and fixed constants of the problem, such as the estimation of the variance component $A$ in this study. For instance, Table 5 shows that for $m=50$ the proposed single bootstrap intervals already produce good performance, and thus the calibration using the double bootstrap has little or no effect. When $m$ is relatively small, i.e., $m=15$, the double bootstrap improves the coverage probability marginally but produces much larger interval lengths than the corresponding single bootstrap method.

Appendices

Appendix A Appendix

Proof of Theorem 3.1: Under regularity conditions, with a pivot H_{i}(\bm{\beta},A), we have

\begin{split}F_{1i}(q_{\alpha})&=\mathrm{P}((\theta_{i}-\hat{\theta}_{i})/\sqrt{\hat{g}_{1i}}\leq q_{\alpha})\\ &=E\left[\mathrm{P}\left((\theta_{i}-\tilde{\theta}_{i})/\sqrt{g_{1i}}\leq q_{\alpha}+(\hat{\theta}_{i}-\tilde{\theta}_{i}+q_{\alpha}(\sqrt{\hat{g}_{1i}}-\sqrt{g_{1i}}))/\sqrt{g_{1i}}\right)\right]\\ &=E\left[F_{2i}(q_{\alpha}+Q(q_{\alpha},\bm{Y}))\right]\\ &=F_{2i}(q_{\alpha})+m^{-1}\gamma(\bm{\beta},A;q_{\alpha})+O(m^{-3/2}),\\ \end{split} (A.1)

where Q(q_{\alpha},\bm{Y})=(\hat{\theta}_{i}-\tilde{\theta}_{i}+q_{\alpha}(\sqrt{\hat{g}_{1i}}-\sqrt{g_{1i}}))/\sqrt{g_{1i}}, and \gamma is a smooth function of order O(1). The last equality follows from results similar to those in Chapter 4 of Li (2007); see Appendix F for details. Then, from equations (3.1) and (A.1), we have

\begin{split}\text{E}\left[\mathrm{P}^{*}(H_{i}(\hat{\bm{\beta}}^{*},\hat{A}^{*})\leq q_{\hat{\alpha}_{u}^{s}}\,\big{|}\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})\right]&=\text{E}\left[F_{2i}(q_{\hat{\alpha}_{u}^{s}})+m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{u}^{s}})\right]+O(m^{-3/2})\\ &=1-\alpha/2,\\ \text{E}\left[\mathrm{P}^{*}(H_{i}(\hat{\bm{\beta}}^{*},\hat{A}^{*})\leq q_{\hat{\alpha}_{l}^{s}}\,\big{|}\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})\right]&=\text{E}\left[F_{2i}(q_{\hat{\alpha}_{l}^{s}})+m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{l}^{s}})\right]+O(m^{-3/2})\\ &=\alpha/2.\end{split} (A.2)

Therefore, we have

E[F2i(qα^us)]=1α/2E[m1γ(𝜷^,A^;qα^us)]+O(m3/2),\begin{split}\text{E}\left[F_{2i}(q_{\hat{\alpha}_{u}^{s}})\right]=1-\alpha/2-\text{E}\left[m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{u}^{s}})\right]+O(m^{-3/2}),\end{split} (A.3)

and

E[F2i(qα^ls)]=α/2E[m1γ(𝜷^,A^;qα^ls)]+O(m3/2).\begin{split}\text{E}\left[F_{2i}(q_{\hat{\alpha}_{l}^{s}})\right]=\alpha/2-\text{E}\left[m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{l}^{s}})\right]+O(m^{-3/2}).\end{split} (A.4)
Combining (A.1) with (A.3) and (A.4), we obtain

\begin{split}\mathrm{P}&\left(\hat{\theta}_{i}+q_{\hat{\alpha}_{l}^{s}}\sqrt{\hat{g}_{1i}}\leq\theta_{i}\leq\hat{\theta}_{i}+q_{\hat{\alpha}_{u}^{s}}\sqrt{\hat{g}_{1i}}\right)\\ =&\mathrm{P}\left(H_{i}(\hat{\bm{\beta}},\hat{A})\leq q_{\hat{\alpha}_{u}^{s}}\right)-\mathrm{P}\left(H_{i}(\hat{\bm{\beta}},\hat{A})\leq q_{\hat{\alpha}_{l}^{s}}\right)\\ =&\text{E}\left[F_{2i}(q_{\hat{\alpha}_{u}^{s}})+m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}_{u}^{s}})\right]-\text{E}\left[F_{2i}(q_{\hat{\alpha}_{l}^{s}})+m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}_{l}^{s}})\right]+O(m^{-3/2})\\ =&1-\alpha/2+\text{E}\left[m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}_{u}^{s}})-m^{-1}\gamma(\bm{\hat{\beta}},\hat{A};q_{\hat{\alpha}_{u}^{s}})\right]-\alpha/2\\ &+\text{E}\left[m^{-1}\gamma(\bm{\hat{\beta}},\hat{A};q_{\hat{\alpha}_{l}^{s}})-m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}_{l}^{s}})\right]+O(m^{-3/2})\\ =&1-\alpha+O(m^{-3/2}).\end{split} (A.5)

Appendix B Appendix

Proof of Proposition 3.1: Suppose H_{i}(\bm{\beta},A) is not a pivot; that is, F_{2i} depends on some unknown true parameters \bm{\nu}. Here, we rewrite the distribution function of H_{i}(\bm{\beta},A) as F_{2i}(\bm{\nu}), suppressing the fixed argument, and Taylor expand F_{2i}(\bm{\nu}) around \bm{\nu}=\bm{\nu}_{1}, obtaining

F2i(𝝂)=F2i(𝝂1)+(𝝂𝝂𝟏)F2i(𝝂1)𝝂+12(𝝂𝝂𝟏)2F2i(𝝂1)𝝂𝝂(𝝂𝝂𝟏)+𝐑(𝝂),\begin{split}F_{2i}(\bm{\nu})&=F_{2i}(\bm{\nu}_{1})+(\bm{\nu}-\bm{\nu_{1}})^{\prime}\frac{\partial F_{2i}(\bm{\nu}_{1})}{\partial\bm{\nu}}+\frac{1}{2}(\bm{\nu}-\bm{\nu_{1}})^{\prime}\frac{\partial^{2}F_{2i}(\bm{\nu}_{1})}{\partial\bm{\nu}\partial\bm{\nu^{\prime}}}(\bm{\nu}-\bm{\nu_{1}})+\mathbf{R}(\bm{\nu}),\end{split} (B.1)

where

\mathbf{R}(\bm{\nu})=\frac{1}{6}\sum_{j,k,l}\frac{\partial^{3}F_{2i}}{\partial\nu_{j}\partial\nu_{k}\partial\nu_{l}}\bigg{|}_{\bm{\nu}^{*}}(\nu_{j}-\nu_{1j})(\nu_{k}-\nu_{1k})(\nu_{l}-\nu_{1l})

with 𝝂\bm{\nu^{*}} lying between 𝝂\bm{\nu} and 𝝂𝟏\bm{\nu_{1}}.

Under the regularity conditions, we have,

\begin{split}\text{E}\left[(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\frac{\partial F_{2i}(\bm{\nu})}{\partial\bm{\nu}}\right]+\text{E}\left[\frac{1}{2}(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\frac{\partial^{2}F_{2i}(\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}(\bm{\hat{\nu}}-\bm{\nu})\right]=m^{-1}c(\bm{\nu}),\end{split} (B.2)

where cc denotes a smooth function of order O(1)O(1), and

\text{E}\left[\mathbf{R}(\bm{\hat{\nu}})\right]=\frac{1}{6}\sum_{j,k,l}\text{E}\left[\frac{\partial^{3}F_{2i}(\bm{\nu}^{*})}{\partial\nu_{j}\partial\nu_{k}\partial\nu_{l}}(\hat{\nu}_{j}-\nu_{j})(\hat{\nu}_{k}-\nu_{k})(\hat{\nu}_{l}-\nu_{l})\right]=o(m^{-1}) (B.3)

with 𝝂\bm{\nu^{*}} lying between 𝝂^\bm{\hat{\nu}} and 𝝂\bm{\nu}.

Thus, using equations (3.1) and (A.1), when H_{i}(\bm{\beta},A) is not a pivot, we have

E[P(Hi(𝜷^,A^)qα^us|𝜷^,A^,ϕ^)]=E[F2i(qα^us;𝝂^)+m1γ(𝜷^,A^;qα^us)]+o(m1)=E[F2i(qα^us;𝝂)+m1c(𝝂;qα^us)+m1γ(𝜷^,A^;qα^us)]+o(m1)=1α/2,\begin{split}\text{E}\left[\mathrm{P}^{*}(H_{i}(\hat{\bm{\beta}}^{*},\hat{A}^{*})\leq q_{\hat{\alpha}_{u}^{s}}\,\big{|}\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})\right]&=\text{E}\left[F_{2i}(q_{\hat{\alpha}_{u}^{s}};\hat{\bm{\nu}})+m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{u}^{s}})\right]+o(m^{-1})\\ &=\text{E}\left[F_{2i}(q_{\hat{\alpha}_{u}^{s}};\bm{\nu})+m^{-1}c(\bm{\nu};q_{\hat{\alpha}_{u}^{s}})+m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{u}^{s}})\right]+o(m^{-1})\\ &=1-\alpha/2,\\ \end{split} (B.4)
E[P(Hi(𝜷^,A^)qα^ls|𝜷^,A^,ϕ^)]=E[F2i(qα^ls;𝝂^)+m1γ(𝜷^,A^;qα^ls)]+o(m1)=E[F2i(qα^ls;𝝂)+m1c(𝝂;qα^ls)+m1γ(𝜷^,A^;qα^ls)]+o(m1)=α/2.\begin{split}\text{E}\left[\mathrm{P}^{*}(H_{i}(\hat{\bm{\beta}}^{*},\hat{A}^{*})\leq q_{\hat{\alpha}_{l}^{s}}\,\big{|}\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})\right]&=\text{E}\left[F_{2i}(q_{\hat{\alpha}_{l}^{s}};\hat{\bm{\nu}})+m^{-1}\gamma(\bm{\hat{\beta}},\hat{A};q_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ &=\text{E}\left[F_{2i}(q_{\hat{\alpha}_{l}^{s}};\bm{\nu})+m^{-1}c(\bm{\nu};q_{\hat{\alpha}_{l}^{s}})+m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ &=\alpha/2.\end{split} (B.5)

Finally, we obtain,

P(θ^i+qα^lsg^1iθiθ^i+qα^usg^1i)=E[F2i(qα^us;𝝂)+m1γ(𝜷,A;qα^us)]E[F2i(qα^ls;𝝂)+m1γ(𝜷,A;qα^ls)]+o(m1)=E[1α/2m1c(𝝂;qα^us)m1γ(𝜷^,A^;qα^us)+m1γ(𝜷,A;qα^us)]E[α/2m1c(𝝂;qα^ls)m1γ(𝜷^,A^;qα^ls)+m1γ(𝜷,A;qα^ls)]+o(m1)=1αE[m1c(𝝂;qα^us)m1c(𝝂;qα^ls)]+o(m1)=1α+O(m1).\begin{split}\mathrm{P}&\left(\hat{\theta}_{i}+q_{\hat{\alpha}_{l}^{s}}\sqrt{\hat{g}_{1i}}\leq\theta_{i}\leq\hat{\theta}_{i}+q_{\hat{\alpha}_{u}^{s}}\sqrt{\hat{g}_{1i}}\right)\\ =&\text{E}\left[F_{2i}(q_{\hat{\alpha}_{u}^{s}};\bm{\nu})+m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}_{u}^{s}})\right]-\text{E}\left[F_{2i}(q_{\hat{\alpha}_{l}^{s}};\bm{\nu})+m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ =&\text{E}\left[1-\alpha/2-m^{-1}c(\bm{\nu};q_{\hat{\alpha}_{u}^{s}})-m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{u}^{s}})+m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}_{u}^{s}})\right]\\ &-\text{E}\left[\alpha/2-m^{-1}c(\bm{\nu};q_{\hat{\alpha}_{l}^{s}})-m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{l}^{s}})+m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ =&1-\alpha-\text{E}\left[m^{-1}c(\bm{\nu};q_{\hat{\alpha}_{u}^{s}})-m^{-1}c(\bm{\nu};q_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ =&1-\alpha+O(m^{-1}).\end{split} (B.6)

Appendix C Appendix

Proof of Proposition 3.2: From the third equality of (B.6) and from (B.2), we have

\begin{split}T_{1}&=-\text{E}\left[m^{-1}c(\bm{\nu};q_{\hat{\alpha}_{u}^{s}})-m^{-1}c(\bm{\nu};q_{\hat{\alpha}_{l}^{s}})\right]\\ &=-\text{E}\left[(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\Bigl{\{}\frac{\partial F_{2i}(q_{\hat{\alpha}_{u}^{s}};\bm{\nu})}{\partial\bm{\nu}}-\frac{\partial F_{2i}(q_{\hat{\alpha}_{l}^{s}};\bm{\nu})}{\partial\bm{\nu}}\Bigr{\}}+\frac{1}{2}(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\Bigl{\{}\frac{\partial^{2}F_{2i}(q_{\hat{\alpha}_{u}^{s}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}-\frac{\partial^{2}F_{2i}(q_{\hat{\alpha}_{l}^{s}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}\Bigr{\}}(\bm{\hat{\nu}}-\bm{\nu})\right]\\ &=-\text{E}\left[(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\frac{\partial}{\partial\bm{\nu}}\Bigl{\{}F_{2i}(q_{\alpha_{u}};\bm{\nu})+\frac{\partial F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}^{\prime}}(\bm{\hat{\nu}}-\bm{\nu})\Bigr{\}}-(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\frac{\partial}{\partial\bm{\nu}}\Bigl{\{}F_{2i}(q_{\alpha_{l}};\bm{\nu})+\frac{\partial F_{2i}(q_{\alpha_{l}};\bm{\nu})}{\partial\bm{\nu}^{\prime}}(\bm{\hat{\nu}}-\bm{\nu})\Bigr{\}}\right]\\ &\quad-\text{E}\left[\frac{1}{2}(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\Bigl{\{}\frac{\partial^{2}F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}-\frac{\partial^{2}F_{2i}(q_{\alpha_{l}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}\Bigr{\}}(\bm{\hat{\nu}}-\bm{\nu})\right]+o(m^{-1})\\ &=-\text{E}\left[(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\Bigl{\{}\frac{\partial F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}}-\frac{\partial F_{2i}(q_{\alpha_{l}};\bm{\nu})}{\partial\bm{\nu}}\Bigr{\}}\right]\\ &\quad-\frac{3}{2}\text{E}\left[(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\Bigl{\{}\frac{\partial^{2}F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}-\frac{\partial^{2}F_{2i}(q_{\alpha_{l}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}\Bigr{\}}(\bm{\hat{\nu}}-\bm{\nu})\right]+o(m^{-1}),\end{split} (C.1)

where q_{\alpha_{u}} and q_{\alpha_{l}} are unknown but fixed quantities.

In view of the above, we can write

T1=T11+T12+o(m1),T_{1}=T_{11}+T_{12}+o(m^{-1}), (C.2)

where

\begin{split}T_{11}&=-\text{E}\left[(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\Bigl{\{}\frac{\partial F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}}-\frac{\partial F_{2i}(q_{\alpha_{l}};\bm{\nu})}{\partial\bm{\nu}}\Bigr{\}}\right],\\ T_{12}&=-\frac{3}{2}\text{E}\left[(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\Bigl{\{}\frac{\partial^{2}F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}-\frac{\partial^{2}F_{2i}(q_{\alpha_{l}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}\Bigr{\}}(\bm{\hat{\nu}}-\bm{\nu})\right].\end{split}

With the assumption of symmetrically distributed random effects uiu_{i}, it is easy to show that F2i(qαu;𝝂)=1F2i(qαl;𝝂)F_{2i}(q_{\alpha_{u}};\bm{\nu})=1-F_{2i}(q_{\alpha_{l}};\bm{\nu}). Hence, we have

\begin{split}T_{11}&=-2\text{E}\left[(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\frac{\partial F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}}\right],\\ T_{12}&=-3\text{E}\left[(\bm{\hat{\nu}}-\bm{\nu})^{\prime}\frac{\partial^{2}F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}(\bm{\hat{\nu}}-\bm{\nu})\right].\end{split}
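To spell out the sign flip used here: differentiating the symmetry identity with respect to \bm{\nu} (the quantiles q_{\alpha_{u}} and q_{\alpha_{l}} are fixed) gives

F_{2i}(q_{\alpha_{u}};\bm{\nu})=1-F_{2i}(q_{\alpha_{l}};\bm{\nu})\;\Longrightarrow\;\frac{\partial F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}}=-\frac{\partial F_{2i}(q_{\alpha_{l}};\bm{\nu})}{\partial\bm{\nu}},\qquad\frac{\partial^{2}F_{2i}(q_{\alpha_{u}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}}=-\frac{\partial^{2}F_{2i}(q_{\alpha_{l}};\bm{\nu})}{\partial\bm{\nu}\partial\bm{\nu}^{\prime}},

so each braced difference above equals twice the corresponding upper-quantile term, which produces the factors 2 and 3=(3/2)\cdot 2 in T_{11} and T_{12}.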

Given the assumptions (i)(i)-(iii)(iii), we have

T11+T12=O(m1)>0,T_{11}+T_{12}=O(m^{-1})>0,

under the regularity conditions. Thus, T1>0T_{1}>0.

Appendix D Appendix

Proof of Proposition 4.1: Let M_{i}(\bm{\beta},A)=(\theta_{i}-f_{i}(\bm{\beta}))/\sqrt{A}\sim J_{2i}(\bm{\delta}) and M_{i}(\hat{\bm{\beta}},\hat{A})=(\theta_{i}-f_{i}(\hat{\bm{\beta}}))/\sqrt{\hat{A}}\sim J_{1i}(\bm{\delta}), where \bm{\delta} is a vector of unknown parameters. We have

\begin{split}\text{E}\left[\mathrm{P}^{*}(M_{i}(\hat{\bm{\beta}}^{*},\hat{A}^{*})\leq b_{\hat{\alpha}_{u}^{s}}\,\big{|}\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})\right]&=\text{E}\left[J_{2i}(b_{\hat{\alpha}_{u}^{s}};\hat{\bm{\delta}})+m^{-1}c_{1}(\hat{\bm{\beta}},\hat{A};b_{\hat{\alpha}_{u}^{s}})\right]+o(m^{-1})\\ &=\text{E}\left[J_{2i}(b_{\hat{\alpha}_{u}^{s}};\bm{\delta})+m^{-1}c_{2}(\bm{\delta};b_{\hat{\alpha}_{u}^{s}})+m^{-1}c_{1}(\hat{\bm{\beta}},\hat{A};b_{\hat{\alpha}_{u}^{s}})\right]+o(m^{-1})\\ &=1-\alpha/2,\\ \end{split} (D.1)
E[P(Mi(𝜷^,A^)bα^ls|𝜷^,A^,ϕ^)]=E[J2i(bα^ls;𝜹^)+m1c1(𝜷^,A^;bα^ls)]+o(m1)=E[J2i(bα^ls;𝜹)+m1c2(𝜹;bα^ls)+m1c1(𝜷^,A^;bα^ls)]+o(m1)=α/2,\begin{split}\text{E}\left[\mathrm{P}^{*}(M_{i}(\hat{\bm{\beta}}^{*},\hat{A}^{*})\leq b_{\hat{\alpha}_{l}^{s}}\,\big{|}\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})\right]&=\text{E}\left[J_{2i}(b_{\hat{\alpha}_{l}^{s}};\hat{\bm{\delta}})+m^{-1}c_{1}(\bm{\hat{\beta}},\hat{A};b_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ &=\text{E}\left[J_{2i}(b_{\hat{\alpha}_{l}^{s}};\bm{\delta})+m^{-1}c_{2}(\bm{\delta};b_{\hat{\alpha}_{l}^{s}})+m^{-1}c_{1}(\hat{\bm{\beta}},\hat{A};b_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ &=\alpha/2,\end{split} (D.2)

where c1c_{1} and c2c_{2} are smooth functions of order O(1)O(1).

Eventually, we have

\begin{split}\mathrm{P}&(f_{i}(\hat{\bm{\beta}})+b_{\hat{\alpha}_{l}^{s}}\hat{A}^{1/2}\leq\theta_{i}\leq f_{i}(\hat{\bm{\beta}})+b_{\hat{\alpha}_{u}^{s}}\hat{A}^{1/2})\\ =&\text{E}\left[J_{2i}(b_{\hat{\alpha}_{u}^{s}};\bm{\delta})+m^{-1}c_{1}(\bm{\beta},A;b_{\hat{\alpha}_{u}^{s}})\right]-\text{E}\left[J_{2i}(b_{\hat{\alpha}_{l}^{s}};\bm{\delta})+m^{-1}c_{1}(\bm{\beta},A;b_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ =&\text{E}\left[1-\alpha/2-m^{-1}c_{2}(\bm{\delta};b_{\hat{\alpha}_{u}^{s}})-m^{-1}c_{1}(\hat{\bm{\beta}},\hat{A};b_{\hat{\alpha}_{u}^{s}})+m^{-1}c_{1}(\bm{\beta},A;b_{\hat{\alpha}_{u}^{s}})\right]\\ &-\text{E}\left[\alpha/2-m^{-1}c_{2}(\bm{\delta};b_{\hat{\alpha}_{l}^{s}})-m^{-1}c_{1}(\hat{\bm{\beta}},\hat{A};b_{\hat{\alpha}_{l}^{s}})+m^{-1}c_{1}(\bm{\beta},A;b_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ =&1-\alpha-\text{E}\left[m^{-1}c_{2}(\bm{\delta};b_{\hat{\alpha}_{u}^{s}})-m^{-1}c_{2}(\bm{\delta};b_{\hat{\alpha}_{l}^{s}})\right]+o(m^{-1})\\ =&1-\alpha+O(m^{-1}).\end{split} (D.3)

Appendix E Appendix

Proof of Theorem 4.1: With equations (4.6) and (A.1), for the upper limit, we can obtain

E[α^u]=E[P(Hi(𝜷^,A^)qα^ud|𝜷^,A^,ϕ^)]=E[F1i(qα^ud;𝝂^)]=E[F2i(qα^ud;𝝂^)+m1γ(𝜷^,A^;qα^ud)]+O(m3/2)=E[F2i(qα^ud;𝝂)+m1c(𝝂;qα^ud)+m1γ(𝜷^,A^;qα^ud)]+o(m1).\begin{split}\text{E}\left[\hat{\alpha}_{u}\right]&=\text{E}\left[\mathrm{P}^{*}(H_{i}(\hat{\bm{\beta}}^{*},\hat{A}^{*})\leq q_{\hat{\alpha}^{d}_{u}}\,|\,\hat{\bm{\beta}},\hat{A},\hat{\bm{\phi}})\right]\\ &=\text{E}\left[F_{1i}(q_{\hat{\alpha}^{d}_{u}};\hat{\bm{\nu}})\right]\\ &=\text{E}\left[F_{2i}(q_{\hat{\alpha}^{d}_{u}};\hat{\bm{\nu}})+m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}^{d}_{u}})\right]+O(m^{-3/2})\\ &=\text{E}\left[F_{2i}(q_{\hat{\alpha}^{d}_{u}};\bm{\nu})+m^{-1}c(\bm{\nu};q_{\hat{\alpha}^{d}_{u}})+m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}^{d}_{u}})\right]+o(m^{-1}).\end{split} (E.1)

Replacing αu\alpha_{u} by α^u\hat{\alpha}_{u} in the system of equations (4.3), we obtain

E[1α/2α^u]=E[F1i(qα^u;𝝂^)F1i(qα^u;𝝂^)]=E[F2i(qα^u;𝝂^)F2i(qα^u;𝝂^)+m1γ(𝜷^,A^;qα^u)m1γ(𝜷^,A^;qα^u)]+O(m3/2)=E[F2i(qα^u;𝝂^)F2i(qα^u;𝝂^)]+O(m3/2)=E[F2i(qα^u;𝝂^)F2i(qα^u;𝝂^)m1c(𝝂^;qα^u)]+o(m1)=E[m1c(𝝂^;qα^u)]+o(m1).\begin{split}&\text{E}\left[1-\alpha/2-\hat{\alpha}_{u}\right]\\ &=\text{E}\left[F_{1i}(q_{\hat{\alpha}_{u}};\hat{\bm{\nu}})-F_{1i}(q_{\hat{\alpha}_{u}};\hat{\bm{\nu}}^{*})\right]\\ &=\text{E}\left[F_{2i}(q_{\hat{\alpha}_{u}};\hat{\bm{\nu}})-F_{2i}(q_{\hat{\alpha}_{u}};\hat{\bm{\nu}}^{*})+m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}_{u}})-m^{-1}\gamma(\hat{\bm{\beta}}^{*},\hat{A}^{*};q_{\hat{\alpha}_{u}})\right]+O(m^{-3/2})\\ &=\text{E}\left[F_{2i}(q_{\hat{\alpha}_{u}};\hat{\bm{\nu}})-F_{2i}(q_{\hat{\alpha}_{u}};\hat{\bm{\nu}}^{*})\right]+O(m^{-3/2})\\ &=\text{E}\left[F_{2i}(q_{\hat{\alpha}_{u}};\hat{\bm{\nu}})-F_{2i}(q_{\hat{\alpha}_{u}};\hat{\bm{\nu}})-m^{-1}c(\hat{\bm{\nu}};q_{\hat{\alpha}_{u}})\right]+o(m^{-1})\\ &=\text{E}\left[-m^{-1}c(\hat{\bm{\nu}};q_{\hat{\alpha}_{u}})\right]+o(m^{-1}).\end{split} (E.2)

Therefore,

E[α^u]=1α/2+E[m1c(𝝂^;qα^u)]+o(m1).\text{E}\left[\hat{\alpha}_{u}\right]=1-\alpha/2+\text{E}\left[m^{-1}c(\hat{\bm{\nu}};q_{\hat{\alpha}_{u}})\right]+o(m^{-1}). (E.3)

For the upper limit, using the last equation of (E.1) and (E.3), we have

P(θiθ^i+qα^udg^1i)=E[F1i(qα^ud)]=E[F2i(qα^ud;𝝂)+m1γ(𝜷,A;qα^ud)]+O(m3/2)=E[1α/2+m1c(𝝂^;qα^ud)m1c(𝝂;qα^ud)m1γ(𝜷^,A^;qα^ud)+m1γ(𝜷,A;qα^ud)]+o(m1)=1α/2+o(m1).\begin{split}&\mathrm{P}(\theta_{i}\leq\hat{\theta}_{i}+q_{\hat{\alpha}^{d}_{u}}\sqrt{\hat{g}_{1i}})\\ &=\text{E}\left[F_{1i}(q_{\hat{\alpha}^{d}_{u}})\right]\\ &=\text{E}\left[F_{2i}(q_{\hat{\alpha}^{d}_{u}};\bm{\nu})+m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}^{d}_{u}})\right]+O(m^{-3/2})\\ &=\text{E}\left[1-\alpha/2+m^{-1}c(\hat{\bm{\nu}};q_{\hat{\alpha}^{d}_{u}})-m^{-1}c(\bm{\nu};q_{\hat{\alpha}^{d}_{u}})-m^{-1}\gamma(\hat{\bm{\beta}},\hat{A};q_{\hat{\alpha}^{d}_{u}})+m^{-1}\gamma(\bm{\beta},A;q_{\hat{\alpha}^{d}_{u}})\right]+o(m^{-1})\\ &=1-\alpha/2+o(m^{-1}).\end{split} (E.4)

Similarly, for the lower limit, we have

P(θiθ^i+qα^ldg^1i)=E[F1i(qα^ld)]=α/2+o(m1).\begin{split}\mathrm{P}(\theta_{i}\leq\hat{\theta}_{i}+q_{\hat{\alpha}^{d}_{l}}\sqrt{\hat{g}_{1i}})&=\text{E}\left[F_{1i}(q_{\hat{\alpha}^{d}_{l}})\right]\\ &=\alpha/2+o(m^{-1}).\end{split} (E.5)

Finally, we obtain

P(θ^i+qα^ldg^1iθiθ^i+qα^udg^1i)=E[F1i(qα^ud)F1i(qα^ld)]=1α/2α/2+o(m1)=1α+o(m1).\begin{split}&\mathrm{P}(\hat{\theta}_{i}+q_{\hat{\alpha}^{d}_{l}}\sqrt{\hat{g}_{1i}}\leq\theta_{i}\leq\hat{\theta}_{i}+q_{\hat{\alpha}^{d}_{u}}\sqrt{\hat{g}_{1i}})\\ &=\text{E}\left[F_{1i}(q_{\hat{\alpha}^{d}_{u}})-F_{1i}(q_{\hat{\alpha}^{d}_{l}})\right]\\ &=1-\alpha/2-\alpha/2+o(m^{-1})\\ &=1-\alpha+o(m^{-1}).\end{split} (E.6)

Appendix F Appendix

Let f_{2i}(\cdot) denote the probability density function of H_{i}(\bm{\beta},A), and let f^{\prime}_{2i} and f^{\prime\prime}_{2i} be its first and second derivatives. Then (A.1) can be expressed as:

\begin{split}F_{1i}(q_{\alpha})&=\mathrm{P}((\theta_{i}-\hat{\theta}_{i})/\sqrt{\hat{g}_{1i}}\leq q_{\alpha})\\ &=E\left[F_{2i}(q_{\alpha}+Q(q_{\alpha},\bm{Y}))\right]\\ &=F_{2i}(q_{\alpha})+f_{2i}(q_{\alpha})\text{E}Q+\frac{1}{2}f^{\prime}_{2i}(q_{\alpha})\text{E}Q^{2}+\frac{1}{2}\text{E}\left[\int_{q_{\alpha}}^{q_{\alpha}+Q}f^{\prime\prime}_{2i}(t)(q_{\alpha}+Q-t)^{2}dt\right],\end{split} (F.1)

where Q=Q(q_{\alpha},\bm{Y}). Note that for t\in(q_{\alpha},q_{\alpha}+Q), we have 0\leq|q_{\alpha}+Q-t|\leq|Q|. Assuming that f^{\prime\prime}_{2i}, the third derivative of F_{2i}, is uniformly bounded, we have

Eqαqα+Qf2i′′(t)(qα+Qt)2𝑑tCE|Q|3\text{E}\int_{q_{\alpha}}^{q_{\alpha}+Q}f^{\prime\prime}_{2i}(t)(q_{\alpha}+Q-t)^{2}dt\leq C\text{E}|Q|^{3} (F.2)

for some constant C>0. Then, establishing the last equality of (A.1) reduces to evaluating \text{E}Q, \text{E}Q^{2}, and \text{E}|Q|^{3}. In particular, if \text{E}Q=O(m^{-1}), \text{E}Q^{2}=O(m^{-1}) and \text{E}Q^{4}=O(m^{-2}), then it follows that

E|Q|3(EQ4)3/4=O(m3/2),\text{E}|Q|^{3}\leq(\text{E}Q^{4})^{3/4}=O(m^{-3/2}), (F.3)

by the Lyapunov inequality so that

F1i(qα)=F2i(qα)+m1γ(𝜷,A;qα)+O(m3/2).F_{1i}(q_{\alpha})=F_{2i}(q_{\alpha})+m^{-1}\gamma(\bm{\beta},A;q_{\alpha})+O(m^{-3/2}).

First, note that

𝜷^=(𝑿𝚺^𝟏𝑿)𝟏𝑿𝚺^𝟏𝒀=𝜷+(𝑿𝚺^𝟏𝑿)𝟏𝑿𝚺^𝟏(𝒖+𝒆).\bm{\hat{\beta}}=(\bm{X^{\prime}\hat{\Sigma}^{-1}X)^{-1}X^{\prime}\hat{\Sigma}^{-1}Y}=\bm{\beta}+(\bm{X^{\prime}\hat{\Sigma}^{-1}X)^{-1}X^{\prime}\hat{\Sigma}^{-1}(u+e)}.

We now simplify the expression for Q=Q(q_{\alpha},\bm{Y}):

θ^iθ~i=A^A^+Diyi+DiA^+Di𝒙i𝜷^AA+DiyiDiA+Di𝒙i𝜷=DiA+Di𝒙i(𝑿𝚺1𝑿)1𝑿𝚺1(𝒖+𝒆)+DiA^+Di𝒙i(𝑿𝚺^1𝑿)1𝑿𝚺^1(𝒖+𝒆)DiA+Di𝒙i(𝑿Σ1𝑿)1𝑿Σ1(𝒖+𝒆)+(A^A^+DiAA+Di)(ui+ei).\begin{split}\hat{\theta}_{i}-\tilde{\theta}_{i}=&\frac{\hat{A}}{\hat{A}+D_{i}}y_{i}+\frac{D_{i}}{\hat{A}+D_{i}}\bm{x}_{i}^{\prime}\bm{\hat{\beta}}-\frac{A}{A+D_{i}}y_{i}-\frac{D_{i}}{A+D_{i}}\bm{x}_{i}^{\prime}\bm{\beta}\\ =&\frac{D_{i}}{A+D_{i}}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\bm{\Sigma}^{-1}(\bm{u}+\bm{e})\\ &+\frac{D_{i}}{\hat{A}+D_{i}}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}(\bm{u}+\bm{e})-\frac{D_{i}}{A+D_{i}}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\Sigma^{-1}\bm{X})^{-1}\bm{X}^{\prime}\Sigma^{-1}(\bm{u}+\bm{e})\\ &+(\frac{\hat{A}}{\hat{A}+D_{i}}-\frac{A}{A+D_{i}})(u_{i}+e_{i}).\\ \end{split} (F.4)

In view of the above, similar to Li (2007) and Saegusa et al. (2022), we can write

Q(qα,𝒀)=Q1+Q2(𝒀)+Q3(𝒀)+Q4(𝒀).Q(q_{\alpha},\bm{Y})=Q_{1}+Q_{2}(\bm{Y})+Q_{3}(\bm{Y})+Q_{4}(\bm{Y}).

where

Q1=1g1iDiA+Di𝒙i(𝑿𝚺1𝑿)1𝑿𝚺1(𝒖+𝒆),Q2(𝒀)=1g1i[DiA^+Di𝒙i(𝑿𝚺^1𝑿)1𝑿𝚺^1(𝒖+𝒆)DiA+Di𝒙i(𝑿Σ1𝑿)1𝑿Σ1(𝒖+𝒆)],Q3(𝒀)=1g1i[(A^A^+DiAA+Di)(ui+ei)],Q4(𝒀)=qα(g^1ig1i)/g1i.\begin{split}Q_{1}=&\frac{1}{\sqrt{g_{1i}}}\frac{D_{i}}{A+D_{i}}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\bm{\Sigma}^{-1}(\bm{u}+\bm{e}),\\ Q_{2}(\bm{Y})=&\frac{1}{\sqrt{g_{1i}}}\left[\frac{D_{i}}{\hat{A}+D_{i}}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}(\bm{u}+\bm{e})\right.\\ &\left.-\frac{D_{i}}{A+D_{i}}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\Sigma^{-1}\bm{X})^{-1}\bm{X}^{\prime}\Sigma^{-1}(\bm{u}+\bm{e})\right],\\ Q_{3}(\bm{Y})=&\frac{1}{\sqrt{g_{1i}}}\left[(\frac{\hat{A}}{\hat{A}+D_{i}}-\frac{A}{A+D_{i}})(u_{i}+e_{i})\right],\\ Q_{4}(\bm{Y})=&q_{\alpha}(\sqrt{\hat{g}_{1i}}-\sqrt{g_{1i}})/\sqrt{g_{1i}}.\end{split}

Therefore, in order to obtain the last equality of (F.1), we need to show that \text{E}Q_{i}=O(m^{-1}), \text{E}Q_{i}^{2}=O(m^{-1}) and \text{E}Q_{i}^{4}=O(m^{-2}), i=1,\cdots,4. First, it is easy to show that \text{E}Q_{1}=0. Using the regularity condition that \text{sup}_{i\geq 1}h_{ii}\equiv\bm{x_{i}^{\prime}(X^{\prime}X)^{-1}x_{i}}=O(m^{-1}), we have

\begin{split}\text{E}Q_{1}^{2}&=\frac{D_{i}}{A(A+D_{i})}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\bm{\Sigma}^{-1}\text{E}\left[(\bm{u}+\bm{e})(\bm{u}+\bm{e})^{\prime}\right]\bm{\Sigma}^{-1}\bm{X}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{x}_{i}\\ &=\frac{D_{i}}{A(A+D_{i})}h_{ii}\\ &=O(m^{-1}).\end{split} (F.5)

Under the assumption h_{ii}=O(m^{-1}), it is easy to see that

\begin{split}\tilde{h}_{ii}&=\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\bm{\Sigma}^{-2}\bm{X}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{x}_{i}\\ &=\bm{d_{i}^{\prime}P_{X}P_{X}^{\prime}d_{i}}\\ &=O(m^{-1}),\end{split}

where \bm{d_{i}} is a column vector with the i^{th} element unity and the rest zeros, and P_{X}=\bm{X(X^{\prime}\Sigma^{-1}X)^{-1}X^{\prime}\Sigma^{-1}}. An application of the Cauchy-Schwarz inequality yields

\begin{split}\text{E}Q_{1}^{4}&=\frac{D_{i}^{2}}{A^{2}(A+D_{i})^{2}}\text{E}(\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\bm{\Sigma}^{-1}(\bm{u}+\bm{e}))^{4}\\ &=\frac{D_{i}^{2}}{A^{2}(A+D_{i})^{2}}\text{E}(\bm{d_{i}^{\prime}P_{X}P_{X}(u+e)})^{4}\\ &\leq\frac{D_{i}^{2}}{A^{2}(A+D_{i})^{2}}(\bm{d_{i}^{\prime}P_{X}P_{X}^{\prime}d_{i}})^{2}\{\text{E}[\bm{(u+e)^{\prime}P_{X}^{\prime}P_{X}(u+e)}]\}^{2}\\ &=Cm^{-2}\{\text{tr}(\bm{X(X^{\prime}\Sigma^{-1}X)^{-1}X^{\prime}})\}^{2}\\ &=O(m^{-2}).\end{split} (F.6)

To evaluate the fourth moments of the remaining terms in Q, we need to evaluate the moments of (\hat{A}+D_{i})^{-1}-(A+D_{i})^{-1} and \hat{\bm{\Sigma}}^{-1}-\bm{\Sigma}^{-1}. To see this, for example, we can break Q_{2} down into more tractable variables and remainder terms:

Q2(𝒀)=1g1i[DiA^+Di𝒙i(𝑿𝚺^1𝑿)1𝑿𝚺^1(𝒖+𝒆)DiA+Di𝒙i(𝑿Σ1𝑿)1𝑿𝚺1(𝒖+𝒆)]=1g1i[(DiA^+DiDiA+Di)𝒙i((𝑿𝚺^1𝑿)1(𝑿𝚺1𝑿)1)𝑿(𝚺^1𝚺1)(𝒖+𝒆)+(DiA^+DiDiA+Di)𝒙i((𝑿𝚺^1𝑿)1(𝑿𝚺1𝑿)1)𝑿𝚺1(𝒖+𝒆)+(DiA^+DiDiA+Di)𝒙i(𝑿𝚺1𝑿)1𝑿(𝚺^1𝚺1)(𝒖+𝒆)+(DiA^+DiDiA+Di)𝒙i(𝑿𝚺1𝑿)1𝑿𝚺1(𝒖+𝒆)+DiA+Di𝒙i((𝑿𝚺^1𝑿)1(𝑿𝚺1𝑿)1)𝑿(𝚺^1𝚺1)(𝒖+𝒆)+DiA+Di𝒙i((𝑿𝚺^1𝑿)1(𝑿𝚺1𝑿)1)𝑿𝚺1(𝒖+𝒆)+DiA+Di𝒙i(𝑿𝚺1𝑿)1𝑿(𝚺^1𝚺1)(𝒖+𝒆)].\begin{split}Q_{2}(\bm{Y})=&\frac{1}{\sqrt{g_{1i}}}\left[\frac{D_{i}}{\hat{A}+D_{i}}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}(\bm{u}+\bm{e})-\frac{D_{i}}{A+D_{i}}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\Sigma^{-1}\bm{X})^{-1}\bm{X}^{\prime}\bm{\Sigma}^{-1}(\bm{u}+\bm{e})\right]\\ =&\frac{1}{\sqrt{g_{1i}}}\bigg{[}\left(\frac{D_{i}}{\hat{A}+D_{i}}-\frac{D_{i}}{A+D_{i}}\right)\bm{x}_{i}^{\prime}\left((\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}\bm{X})^{-1}-(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\right)\bm{X}^{\prime}\left(\hat{\bm{\Sigma}}^{-1}-\bm{\Sigma}^{-1}\right)(\bm{u+e})\\ &+\left(\frac{D_{i}}{\hat{A}+D_{i}}-\frac{D_{i}}{A+D_{i}}\right)\bm{x}_{i}^{\prime}\left((\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}\bm{X})^{-1}-(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\right)\bm{X}^{\prime}\bm{\Sigma}^{-1}(\bm{u+e})\\ &+\left(\frac{D_{i}}{\hat{A}+D_{i}}-\frac{D_{i}}{A+D_{i}}\right)\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\left(\hat{\bm{\Sigma}}^{-1}-\bm{\Sigma}^{-1}\right)(\bm{u+e})\\ &+\left(\frac{D_{i}}{\hat{A}+D_{i}}-\frac{D_{i}}{A+D_{i}}\right)\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\bm{\Sigma}^{-1}(\bm{u+e})\\ &+\frac{D_{i}}{A+D_{i}}\bm{x}_{i}^{\prime}\left((\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}\bm{X})^{-1}-(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\right)\bm{X}^{\prime}\left(\hat{\bm{\Sigma}}^{-1}-\bm{\Sigma}^{-1}\right)(\bm{u+e})\\ &+\frac{D_{i}}{A+D_{i}}\bm{x}_{i}^{\prime}\left((\bm{X}^{\prime}\hat{\bm{\Sigma}}^{-1}\bm{X})^{-1}-(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\right)\bm{X}^{\prime}\bm{\Sigma}^{-1}(\bm{u+e})\\ &+\frac{D_{i}}{A+D_{i}}\bm{x}_{i}^{\prime}(\bm{X}^{\prime}\bm{\Sigma}^{-1}\bm{X})^{-1}\bm{X}^{\prime}\left(\hat{\bm{\Sigma}}^{-1}-\bm{\Sigma}^{-1}\right)(\bm{u+e})\bigg{]}.\end{split} (F.7)

Using the Taylor series expansion, we have

DiA^+DiDiA+Di=(A^A)Di(A+Di)2+(A^A)2Di(A+Di)3(A^A)3Di(A+Di)4,\begin{split}&\frac{D_{i}}{\hat{A}+D_{i}}-\frac{D_{i}}{A+D_{i}}\\ =&-(\hat{A}-A)\frac{D_{i}}{(A+D_{i})^{2}}+(\hat{A}-A)^{2}\frac{D_{i}}{(A+D_{i})^{3}}-(\hat{A}-A)^{3}\frac{D_{i}}{(A^{*}+D_{i})^{4}},\end{split}
𝚺^1𝚺1=(A^A)diag{1(A+Di)2}i=1m+(A^A)2diag{1(A+Di)3}i=1m(A^A)3diag{1(A+Di)4}i=1m,\begin{split}&\hat{\bm{\Sigma}}^{-1}-\bm{\Sigma}^{-1}\\ =&-(\hat{A}-A)\text{diag}\Bigl{\{}\frac{1}{(A+D_{i})^{2}}\Bigr{\}}_{i=1}^{m}+(\hat{A}-A)^{2}\text{diag}\Bigl{\{}\frac{1}{(A+D_{i})^{3}}\Bigr{\}}_{i=1}^{m}\\ &-(\hat{A}-A)^{3}\text{diag}\Bigl{\{}\frac{1}{(A^{*}+D_{i})^{4}}\Bigr{\}}_{i=1}^{m},\end{split}

and for Q3Q_{3}, we have

A^A^+DiAA+Di=(A^A)Di(A+Di)2(A^A)2Di(A+Di)3+(A^A)3Di(A+Di)4,\begin{split}&\frac{\hat{A}}{\hat{A}+D_{i}}-\frac{A}{A+D_{i}}\\ =&(\hat{A}-A)\frac{D_{i}}{(A+D_{i})^{2}}-(\hat{A}-A)^{2}\frac{D_{i}}{(A+D_{i})^{3}}+(\hat{A}-A)^{3}\frac{D_{i}}{(A^{*}+D_{i})^{4}},\end{split}

where AA^{*} lies between AA and A^\hat{A}.

For the last term Q4Q_{4}, letting W=g1i1(g^1ig1i)W=g_{1i}^{-1}(\hat{g}_{1i}-g_{1i}), we have

Q4(qα,𝒀)=qα(g^1ig1i)/g1i=qα(g^1i/g1i1)=qα[{g1i1(g^1ig1i)+1}1/21]=qα[{W+1}1/21]=qα[W/2W2/8+rn],\begin{split}Q_{4}(q_{\alpha},\bm{Y})&=q_{\alpha}(\sqrt{\hat{g}_{1i}}-\sqrt{g_{1i}})/\sqrt{g_{1i}}\\ &=q_{\alpha}(\sqrt{\hat{g}_{1i}}/\sqrt{g_{1i}}-1)\\ &=q_{\alpha}\bigg{[}\{g_{1i}^{-1}(\hat{g}_{1i}-g_{1i})+1\}^{1/2}-1\bigg{]}\\ &=q_{\alpha}\bigg{[}\{W+1\}^{1/2}-1\bigg{]}\\ &=q_{\alpha}\bigg{[}W/2-W^{2}/8+r_{n}\bigg{]},\end{split} (F.8)

where the remainder r_{n}=O(W^{3}). Then, applying a Taylor expansion to W:

W=g1i1(g^1ig1i)=g1i1(A^DiA^+DiADiA+Di)=g1i1((A^A)Di2(A+Di)2(A^A)2Di2(A+Di)3+(A^A)3Di2(A+Di)4).\begin{split}W&=g_{1i}^{-1}(\hat{g}_{1i}-g_{1i})\\ &=g_{1i}^{-1}\bigg{(}\frac{\hat{A}D_{i}}{\hat{A}+D_{i}}-\frac{AD_{i}}{A+D_{i}}\bigg{)}\\ &=g_{1i}^{-1}\bigg{(}(\hat{A}-A)\frac{D_{i}^{2}}{(A+D_{i})^{2}}-(\hat{A}-A)^{2}\frac{D_{i}^{2}}{(A+D_{i})^{3}}+(\hat{A}-A)^{3}\frac{D_{i}^{2}}{(A^{*}+D_{i})^{4}}\bigg{)}.\\ \end{split} (F.9)

As pointed out in Chatterjee et al. (2008) and Saegusa et al. (2022), the evaluation of (\hat{A}+D_{i})^{-1}-(A+D_{i})^{-1} and \hat{\bm{\Sigma}}^{-1}-\bm{\Sigma}^{-1} involves numerous elementary calculations. In the end, both fourth moments reduce to the fourth moment of \hat{A}-A. When the Prasad-Rao method-of-moments estimator \hat{A}_{\rm{PR}} is used, Lahiri and Rao (1995) showed that \text{E}(\hat{A}_{\rm{PR}}-A)=o(m^{-1}) and \text{E}|\hat{A}_{\rm{PR}}-A|^{2q}=O(m^{-q}) for any q satisfying 0<q\leq 2+\delta^{\prime} with 0<\delta^{\prime}<\frac{1}{4}\delta, under regularity conditions.
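As a purely illustrative numerical check (ours, not part of the proof), these moment orders can be examined by simulation; the sketch below reuses the hypothetical prasad_rao_A function from the Discussion and reports the bias and the m^{2}-scaled fourth moment, which should stabilize as m grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def check_moments(m, A=1.0, reps=2000):
    """Monte Carlo estimates of E(A_hat - A) and E(A_hat - A)^4 for the
    Prasad-Rao estimator under a simple two-covariate design."""
    X = np.column_stack([np.ones(m), rng.uniform(size=m)])
    D = rng.uniform(0.5, 2.0, size=m)
    beta = np.array([1.0, 2.0])
    diffs = np.empty(reps)
    for r in range(reps):
        theta = X @ beta + np.sqrt(A) * rng.standard_normal(m)
        y = theta + np.sqrt(D) * rng.standard_normal(m)
        diffs[r] = prasad_rao_A(y, X, D) - A
    return diffs.mean(), np.mean(diffs**4)

for m in (15, 50, 200):
    bias, m4 = check_moments(m)
    print(f"m={m:4d}  bias={bias:+.4f}  E|A_hat-A|^4={m4:.5f}  m^2*E=|{m*m*m4:.2f}|")
```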

Therefore, the above arguments support the final expression:

F1i(qα)=F2i(qα)+m1γ(𝜷,A;qα)+O(m3/2).F_{1i}(q_{\alpha})=F_{2i}(q_{\alpha})+m^{-1}\gamma(\bm{\beta},A;q_{\alpha})+O(m^{-3/2}).

Appendix G Appendix

Table 7: Coverage probabilities (average length) of different intervals under the t_{9} distribution with m = 50 and Pattern (ii)

Columns: SB.FH, HM.FH, SB.PR, HM.PR, FH, PR, DIRECT; entries are coverage probabilities in % (average lengths in parentheses).

α = 80%
G1  79.94 (3.26)  80.30 (3.58)  84.56 (3.76)  85.44 (4.12)  79.89 (3.26)  78.65 (3.27)  79.43 (7.25)
G2  80.15 (2.25)  80.31 (3.52)  84.51 (2.61)  85.40 (4.07)  80.38 (2.25)  81.56 (2.36)  79.85 (2.81)
G3  79.71 (2.12)  79.53 (3.51)  83.85 (2.47)  84.58 (4.07)  79.86 (2.12)  80.88 (2.23)  79.99 (2.56)
G4  80.47 (1.96)  80.05 (3.51)  84.72 (2.29)  84.72 (4.06)  80.49 (1.96)  82.06 (2.08)  80.29 (2.29)
G5  79.65 (1.49)  80.48 (3.48)  84.04 (1.76)  85.73 (4.03)  79.99 (1.49)  81.80 (1.62)  80.32 (1.62)

α = 90%
G1  89.89 (4.31)  90.38 (4.80)  94.62 (5.73)  94.68 (6.33)  88.90 (4.18)  87.79 (4.19)  89.42 (9.30)
G2  90.18 (2.91)  90.37 (4.67)  93.44 (3.96)  94.56 (6.24)  89.89 (2.89)  90.58 (3.03)  90.19 (3.60)
G3  90.06 (2.74)  89.34 (4.66)  93.68 (3.74)  94.17 (6.23)  90.15 (2.72)  90.78 (2.87)  90.11 (3.29)
G4  90.09 (2.53)  89.50 (4.65)  93.69 (3.46)  94.65 (6.21)  90.12 (2.52)  91.18 (2.67)  90.24 (2.94)
G5  90.29 (1.92)  90.28 (4.60)  93.56 (2.65)  94.73 (6.15)  90.26 (1.92)  91.56 (2.07)  90.32 (2.08)

α = 95%
G1  95.13 (5.30)  95.04 (5.97)  97.45 (8.42)  97.61 (9.31)  93.57 (4.98)  92.61 (5.00)  94.77 (11.09)
G2  95.12 (3.49)  95.37 (5.76)  97.15 (5.82)  97.62 (9.13)  94.87 (3.44)  95.51 (3.60)  95.38 (4.29)
G3  94.97 (3.29)  94.46 (5.74)  97.22 (5.48)  97.60 (9.08)  94.70 (3.24)  95.51 (3.42)  95.00 (3.92)
G4  95.07 (3.03)  95.03 (5.71)  96.93 (5.08)  97.68 (9.04)  95.12 (3.00)  95.86 (3.18)  95.07 (3.51)
G5  95.31 (2.29)  95.01 (5.64)  96.99 (3.88)  97.52 (8.94)  95.28 (2.29)  96.02 (2.47)  95.17 (2.48)

Table 8: Coverage probabilities (average length) of different intervals under the t_{9} distribution with m = 15 and Pattern (ii)

Columns: SB.FH, HM.FH, SB.PR, HM.PR, FH, PR, DIRECT; entries are coverage probabilities in % (average lengths in parentheses).

α = 80%
G1  84.30 (3.89)  84.53 (4.26)  82.80 (5.42)  84.10 (5.98)  76.07 (3.30)  76.50 (3.48)  79.97 (7.25)
G2  83.43 (2.59)  83.77 (4.03)  81.33 (3.65)  83.77 (5.74)  79.57 (2.32)  87.10 (3.09)  79.17 (2.81)
G3  84.20 (2.43)  84.53 (3.99)  82.73 (3.42)  84.57 (5.67)  80.00 (2.19)  88.30 (3.03)  80.03 (2.56)
G4  83.07 (2.25)  84.50 (3.97)  81.87 (3.17)  83.67 (5.65)  79.77 (2.04)  89.33 (2.97)  80.57 (2.29)
G5  82.53 (1.69)  84.87 (3.87)  80.73 (2.38)  84.00 (5.42)  79.17 (1.57)  87.20 (2.82)  78.13 (1.62)

α = 90%
G1  94.50 (5.78)  94.97 (6.43)  87.57 (10.07)  87.80 (11.18)  85.23 (4.23)  85.83 (4.47)  89.50 (9.30)
G2  93.73 (3.77)  94.00 (5.88)  87.57 (6.69)  87.87 (10.47)  88.80 (2.98)  94.87 (3.96)  90.03 (3.60)
G3  93.83 (3.52)  95.10 (5.80)  87.60 (6.26)  88.33 (10.32)  88.93 (2.81)  95.37 (3.89)  89.47 (3.29)
G4  93.87 (3.23)  94.50 (5.72)  87.60 (5.81)  88.53 (10.25)  89.83 (2.61)  95.50 (3.81)  89.83 (2.94)
G5  91.73 (2.39)  94.47 (5.42)  87.17 (4.28)  88.70 (9.66)  89.17 (2.02)  93.73 (3.62)  88.50 (2.08)

α = 95%
G1  98.10 (8.48)  97.90 (9.45)  89.27 (16.44)  89.37 (18.35)  90.40 (5.04)  91.17 (5.33)  94.40 (11.09)
G2  97.73 (5.36)  97.73 (8.30)  89.33 (10.69)  89.43 (16.85)  93.80 (3.55)  98.50 (4.72)  94.77 (4.29)
G3  97.50 (4.97)  97.73 (8.10)  89.33 (9.96)  89.50 (16.57)  93.67 (3.35)  98.27 (4.64)  94.53 (3.92)
G4  97.23 (4.55)  97.80 (7.92)  89.67 (9.20)  89.87 (16.39)  94.20 (3.11)  97.93 (4.54)  94.90 (3.51)
G5  96.50 (3.29)  97.40 (7.30)  89.93 (6.70)  90.53 (15.28)  94.57 (2.40)  97.13 (4.32)  93.40 (2.48)

Table 9: Coverage probabilities (average length) of different intervals under the shifted exponential distribution with m = 50, B_{2} = 50 and Pattern (i)

Columns: SB.FH, HM.FH, DB.FH, SB.PR, HM.PR, DB.PR, FH, PR, DIRECT; entries are coverage probabilities in % (average lengths in parentheses).

α = 80%
G1  80.41 (2.22)  81.98 (2.36)  80.46 (2.27)  84.82 (2.51)  87.88 (2.69)  84.59 (3.08)  83.69 (2.29)  82.36 (2.32)  79.51 (5.13)
G2  79.91 (1.56)  80.96 (2.33)  80.36 (1.59)  83.76 (1.77)  87.22 (2.66)  84.04 (2.26)  80.64 (1.58)  82.97 (1.67)  80.08 (1.99)
G3  79.99 (1.48)  81.47 (2.32)  80.06 (1.50)  83.42 (1.67)  87.41 (2.65)  83.80 (2.06)  80.62 (1.49)  82.90 (1.59)  79.56 (1.81)
G4  80.32 (1.37)  81.26 (2.32)  80.89 (1.40)  83.69 (1.55)  87.52 (2.65)  83.83 (1.96)  80.67 (1.38)  83.41 (1.48)  80.17 (1.62)
G5  80.10 (1.05)  81.69 (2.30)  80.41 (1.07)  83.22 (1.20)  87.29 (2.63)  83.32 (1.51)  80.14 (1.05)  82.80 (1.17)  80.30 (1.15)

α = 90%
G1  90.86 (3.03)  92.24 (3.28)  90.70 (3.16)  93.66 (3.78)  95.04 (4.14)  93.44 (6.27)  91.23 (2.94)  90.27 (2.97)  89.62 (6.58)
G2  90.53 (2.05)  91.43 (3.19)  90.85 (2.13)  92.97 (2.59)  94.46 (4.09)  93.16 (4.54)  90.63 (2.02)  92.15 (2.15)  89.99 (2.55)
G3  89.86 (1.93)  91.99 (3.18)  90.27 (2.00)  93.28 (2.44)  95.21 (4.08)  92.79 (4.27)  89.95 (1.91)  91.67 (2.04)  89.71 (2.33)
G4  90.34 (1.79)  91.91 (3.17)  90.64 (1.85)  92.80 (2.26)  94.41 (4.07)  92.70 (4.08)  90.20 (1.77)  91.98 (1.90)  89.74 (2.08)
G5  90.28 (1.36)  92.04 (3.13)  90.80 (1.41)  92.83 (1.72)  94.90 (4.03)  92.48 (3.21)  90.47 (1.35)  92.10 (1.50)  90.20 (1.47)

α = 95%
G1  95.90 (3.85)  96.88 (4.21)  95.72 (4.42)  96.67 (5.39)  97.25 (5.96)  96.93 (11.17)  94.28 (3.51)  93.54 (3.54)  94.54 (7.84)
G2  95.51 (2.51)  96.10 (4.00)  95.72 (2.82)  96.56 (3.64)  96.98 (5.84)  96.83 (7.92)  94.98 (2.41)  95.36 (2.56)  94.81 (3.04)
G3  94.99 (2.36)  96.76 (3.98)  95.51 (2.65)  96.43 (3.42)  97.28 (5.82)  96.70 (7.51)  94.85 (2.27)  95.94 (2.43)  94.68 (2.77)
G4  95.34 (2.17)  96.54 (3.96)  95.64 (2.43)  96.40 (3.17)  96.99 (5.82)  96.30 (7.07)  94.98 (2.11)  95.91 (2.27)  94.84 (2.48)
G5  95.51 (1.64)  96.57 (3.89)  95.76 (1.83)  96.41 (2.40)  97.08 (5.74)  96.50 (5.53)  95.13 (1.61)  96.10 (1.78)  95.21 (1.75)

Table 10: Coverage probabilities (average length) of different intervals under the shifted exponential distribution with m = 15, B_{2} = 50 and Pattern (i)

Columns: SB.FH, HM.FH, DB.FH, SB.PR, HM.PR, DB.PR, FH, PR, DIRECT; entries are coverage probabilities in % (average lengths in parentheses).

α = 80%
G1  83.90 (2.66)  85.60 (2.90)  84.43 (2.77)  79.27 (3.29)  80.23 (3.67)  80.27 (5.36)  76.83 (2.23)  80.90 (2.43)  79.83 (5.13)
G2  83.00 (1.78)  85.90 (2.74)  82.90 (1.85)  79.73 (2.19)  80.67 (3.54)  80.20 (3.49)  79.77 (1.59)  89.60 (2.22)  79.40 (1.99)
G3  83.40 (1.68)  85.47 (2.71)  83.23 (1.74)  79.67 (2.06)  80.77 (3.52)  80.10 (3.28)  80.57 (1.51)  89.90 (2.19)  79.90 (1.81)
G4  83.23 (1.55)  85.70 (2.69)  82.93 (1.60)  78.90 (1.90)  80.77 (3.48)  78.97 (2.98)  81.30 (1.41)  90.30 (2.15)  79.40 (1.62)
G5  83.17 (1.18)  85.00 (2.60)  82.73 (1.21)  80.40 (1.43)  81.03 (3.36)  80.50 (2.08)  83.20 (1.12)  89.50 (2.08)  80.77 (1.15)

α = 90%
G1  93.30 (4.08)  93.93 (4.54)  93.40 (5.54)  85.07 (5.83)  85.27 (6.58)  85.53 (15.60)  85.53 (2.87)  89.27 (3.12)  90.10 (6.58)
G2  92.87 (2.59)  93.83 (4.06)  92.63 (3.31)  85.77 (3.75)  86.07 (6.14)  86.43 (9.76)  89.83 (2.04)  95.43 (2.84)  90.33 (2.55)
G3  92.87 (2.42)  93.50 (3.99)  92.60 (3.04)  85.33 (3.53)  85.67 (6.12)  85.90 (9.26)  89.83 (1.94)  95.83 (2.81)  89.67 (2.33)
G4  92.77 (2.22)  94.53 (3.92)  92.87 (2.70)  85.70 (3.23)  86.37 (6.02)  86.07 (8.26)  90.43 (1.81)  95.57 (2.76)  89.53 (2.08)
G5  92.23 (1.65)  93.40 (3.71)  92.00 (1.88)  86.87 (2.38)  87.03 (5.73)  87.13 (5.79)  91.47 (1.43)  95.30 (2.67)  90.60 (1.47)

α = 95%
G1  96.33 (5.99)  96.57 (6.68)  96.70 (12.50)  87.33 (9.30)  87.43 (10.53)  88.53 (24.21)  89.53 (3.41)  92.73 (3.71)  94.93 (7.84)
G2  96.17 (3.64)  96.60 (5.65)  96.10 (7.03)  87.97 (5.82)  87.90 (9.59)  89.23 (14.30)  93.93 (2.44)  97.77 (3.39)  95.33 (3.04)
G3  95.93 (3.38)  96.13 (5.53)  96.30 (6.24)  87.70 (5.46)  87.70 (9.51)  88.87 (13.53)  94.07 (2.31)  97.90 (3.34)  94.97 (2.77)
G4  96.47 (3.08)  97.07 (5.39)  96.53 (5.67)  87.93 (5.01)  88.20 (9.39)  89.40 (12.21)  95.03 (2.16)  98.13 (3.29)  94.63 (2.48)
G5  96.07 (2.22)  96.50 (4.95)  96.23 (3.69)  89.17 (3.60)  88.90 (8.72)  90.27 (8.69)  96.00 (1.71)  98.00 (3.19)  95.07 (1.75)

References

  • Bell and Huang (2006) Bell, W. R. and E. T. Huang (2006). Using the t-distribution to deal with outliers in small area estimation. In Proceedings of Statistics Canada Symposium.
  • Chatterjee et al. (2008) Chatterjee, S., P. Lahiri, and H. Li (2008). Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models. The Annals of Statistics 36(3), 1221 – 1245.
  • Cox (1975) Cox, D. (1975). Prediction intervals and empirical Bayes confidence intervals. Journal of Applied Probability 12(S1), 47–55.
  • Datta and Lahiri (1995) Datta, G. S. and P. Lahiri (1995). Robust hierarchical Bayes estimation of small area characteristics in the presence of covariates and outliers. Journal of Multivariate Analysis 54(2), 310–328.
  • Datta et al. (2005) Datta, G. S., J. Rao, and D. D. Smith (2005). On measuring the variability of small area estimators under a basic area level model. Biometrika 92(1), 183–196.
  • Fabrizi and Trivisano (2010) Fabrizi, E. and C. Trivisano (2010). Robust linear mixed models for small area estimation. Journal of Statistical Planning and Inference 140(2), 433–443.
  • Fay and Herriot (1979) Fay, R. E. and R. A. Herriot (1979). Estimates of income for small places: an application of James-Stein procedures to census data. Journal of the American Statistical Association 74(366a), 269–277.
  • Ha et al. (2014) Ha, N. S., P. Lahiri, and V. Parsons (2014). Methods and results for small area estimation using smoking data from the 2008 National Health Interview Survey. Statistics in Medicine 33(22), 3932–3945.
  • Hall (2013) Hall, P. (2013). The bootstrap and Edgeworth expansion. Springer Science & Business Media.
  • Hall and Maiti (2006) Hall, P. and T. Maiti (2006). On parametric bootstrap methods for small area prediction. Journal of the Royal Statistical Society Series B: Statistical Methodology 68(2), 221–238.
  • Hawala and Lahiri (2018) Hawala, S. and P. Lahiri (2018). Variance modeling for domains. Statistics and Applications 16(1), 399–409.
  • Hirose and Lahiri (2018) Hirose, M. Y. and P. Lahiri (2018). Estimating variance of random effects to solve multiple problems simultaneously. The Annals of Statistics 46(4), 1721–1741.
  • Jiang and Nguyen (2007) Jiang, J. and T. Nguyen (2007). Linear and generalized linear mixed models and their applications, Volume 1. Springer.
  • Lahiri and Rao (1995) Lahiri, P. and J. Rao (1995). Robust estimation of mean squared error of small area estimators. Journal of the American Statistical Association 90(430), 758–766.
  • Li (2007) Li, H. (2007). Small area estimation: An empirical best linear unbiased prediction approach. University of Maryland, College Park.
  • Li and Lahiri (2010) Li, H. and P. Lahiri (2010). An adjusted maximum likelihood method for solving small area estimation problems. Journal of Multivariate Analysis 101(4), 882–892.
  • McCullough and Vinod (1998) McCullough, B. and H. Vinod (1998). Implementing the double bootstrap. Computational Economics 12, 79–95.
  • Otto and Bell (1995) Otto, M. C. and W. R. Bell (1995). Sampling error modelling of poverty and income statistics for states. In American Statistical Association, Proceedings of the Section on Government Statistics, pp. 160–165.
  • Prasad and Rao (1990) Prasad, N. N. and J. N. Rao (1990). The estimation of the mean squared error of small-area estimators. Journal of the American Statistical Association 85(409), 163–171.
  • Rao and Molina (2015) Rao, J. N. and I. Molina (2015). Small area estimation. John Wiley & Sons.
  • Saegusa et al. (2022) Saegusa, T., S. Sugasawa, and P. Lahiri (2022). Parametric bootstrap confidence intervals for the multivariate Fay-Herriot model. Journal of Survey Statistics and Methodology 10(1), 115–130.
  • Shao (2008) Shao, J. (2008). Mathematical statistics. Springer Science & Business Media.
  • Shi (1992) Shi, S. G. (1992). Accurate and efficient double-bootstrap confidence limit method. Computational Statistics & Data Analysis 13(1), 21–32.
  • Xie et al. (2007) Xie, D., T. E. Raghunathan, and J. M. Lepkowski (2007). Estimation of the proportion of overweight individuals in small areas—a robust extension of the Fay-Herriot model. Statistics in Medicine 26(13), 2699–2715.
  • Yoshimori and Lahiri (2014) Yoshimori, M. and P. Lahiri (2014). A second-order efficient empirical Bayes confidence interval. The Annals of Statistics 42(4), 1233 – 1261.