Gaussian Approximation for Lag-Window Estimators and the Construction of Confidence Bands for the Spectral Density

Jens-Peter Kreiß, Institut für Mathematische Stochastik, TU Braunschweig, 38106 Braunschweig, Germany.
Anne Leucht, Institut für Statistik, Universität Bamberg, 96052 Bamberg, Germany.
Efstathios Paparoditis, Cyprus Academy of Sciences, Letters and Arts, P.O.Box 22554, CY-1522 Nicosia, Cyprus.
TU Braunschweig, Universität Bamberg, University of Cyprus
Abstract

In this paper we consider the construction of simultaneous confidence bands for the spectral density of a stationary time series using a Gaussian approximation for classical lag-window spectral density estimators evaluated at the set of all positive Fourier frequencies. The Gaussian approximation opens up the possibility to verify asymptotic validity of a multiplier bootstrap procedure and, even further, to derive the corresponding rate of convergence. A small simulation study sheds light on the finite sample properties of this bootstrap proposal.

MSC classification: 62G20, 62G09, 62G15.

Keywords: Bootstrap, Confidence Bands, Gaussian Approximation, Sample Autocovariance, Spectral Density.

1 Introduction

We consider the problem of constructing (simultaneous) confidence bands for the spectral density of a stationary time series. Toward this goal we develop Gaussian approximation results for the maximum deviation over all (positive) Fourier frequencies of a lag-window spectral density estimator and for its bootstrap counterpart. Based on observations $X_{1},X_{2},\ldots,X_{T}$ stemming from a strictly stationary and centered stochastic process $\{X_{t},t\in\mathbb{Z}\}$, a lag-window estimator of the spectral density $f(\lambda)$, $\lambda\in[0,\pi]$, is given by

\[
\widehat{f}_{T}(\lambda)=\frac{1}{2\pi}\sum_{|j|\leq M_{T}}w(j/M_{T})\,\mathrm{e}^{-ij\lambda}\,\widehat{\gamma}(j),\quad\lambda\in[0,\pi]. \tag{1}
\]

In (1) and for $j=0,1,2,\ldots,M_{T}<T$,

\[
\widehat{\gamma}(j)=\frac{1}{T}\sum_{t=j+1}^{T}X_{t}\,X_{t-j} \tag{2}
\]

are estimators of the autocovariances $\gamma(j)=\mathrm{Cov}(X_{0},X_{j})$ of the process $\{X_{t},t\in\mathbb{Z}\}$ and $\widehat{\gamma}(j)=\widehat{\gamma}(-j)$ for $j<0$. The function $w:[-1,1]\rightarrow\mathbb{R}$ is a so-called lag-window, which assigns weights to the $M_{T}$ sample autocovariances effectively used in the calculation of the estimator for $f(\lambda)$ and which satisfies some assumptions to be specified later.
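The estimator defined in (1) and (2) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names `sample_autocov`, `parzen` and `lag_window_estimate` are hypothetical, the Parzen window is just one possible choice of $w$, and the series is assumed to be centered. Because $\widehat{\gamma}(j)=\widehat{\gamma}(-j)$ and $w$ is symmetric, the complex sum in (1) reduces to a cosine sum.

```python
import numpy as np

def sample_autocov(x, max_lag):
    """gamma_hat(j) = (1/T) * sum_{t=j+1}^{T} x_t * x_{t-j} for j = 0..max_lag.

    Assumes the series is already centered, as in equation (2).
    """
    x = np.asarray(x, dtype=float)
    T = x.size
    return np.array([np.dot(x[j:], x[:T - j]) / T for j in range(max_lag + 1)])

def parzen(u):
    """Parzen lag-window on [-1, 1] (an illustrative choice of w)."""
    u = np.abs(np.asarray(u, dtype=float))
    return np.where(u <= 0.5, 1 - 6 * u**2 + 6 * u**3,
                    np.where(u <= 1.0, 2 * (1 - u)**3, 0.0))

def lag_window_estimate(x, M, window=parzen):
    """Evaluate the lag-window estimator at the Fourier frequencies 2*pi*k/T."""
    x = np.asarray(x, dtype=float)
    T = x.size
    N = T // 2                                   # N_T = floor(T/2)
    freqs = 2 * np.pi * np.arange(1, N + 1) / T  # lambda_{k,T}, k = 1..N_T
    gam = sample_autocov(x, M)
    j = np.arange(1, M + 1)
    w = window(j / M)
    # Symmetry in j turns the complex exponential sum into a cosine sum;
    # the j = 0 term uses w(0) = 1.
    cos_mat = np.cos(np.outer(freqs, j))         # shape (N_T, M)
    f_hat = (gam[0] + 2 * cos_mat @ (w * gam[1:])) / (2 * np.pi)
    return freqs, f_hat
```

A call such as `lag_window_estimate(x, M=10)` returns the frequency grid and the estimates on it; the truncation lag `M` has to be chosen by the user.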

It is well known that for a fixed frequency $\lambda$, under suitable assumptions on the dependence structure of the underlying time series and for $M_{T}$ converging not too fast to infinity, asymptotic normality of lag-window estimators can be shown. To elaborate, assuming sufficient smoothness of $f$, which is equivalent to assuming a sufficiently fast decay of the underlying autocovariances $\gamma(h)$ as $|h|\to\infty$, one can typically show, under certain conditions on the lag-window $w$ and for $M_{T}\to\infty$ such that $T/M_{T}^{5}\to C^{2}\geq 0$ as $T\to\infty$, that

\[
\sqrt{\frac{T}{M_{T}}}\Big(\widehat{f}_{T}(\lambda)-f(\lambda)\Big)\xrightarrow{\mathcal{D}}\mathcal{N}\Big(C\,W\,f^{\prime\prime}(\lambda),\,f^{2}(\lambda)\int_{-1}^{1}w^{2}(u)\,du\Big), \tag{3}
\]

for $0<\lambda<\pi$, where $f^{\prime\prime}$ denotes the second derivative of $f$ and $W=\lim_{u\rightarrow 0}(1-w(u))/u^{2}$, where the latter is assumed to exist and to be positive.

Moreover, for the estimator $\widehat{f}_{T}(\lambda)$ with suitably chosen so-called flat-top lag-windows $w$ and under the additional assumption that $\sum_{j\in\mathbb{Z}}|j|^{r}|\gamma(j)|<\infty$ for some $r\geq 1$, the mean squared error rate $\mbox{MSE}(\widehat{f}_{T}(\lambda))=O(T^{-2r/(2r+1)})$ can be achieved, which corresponds to $M_{T}\sim T^{1/(2r+1)}$. See \citeasnoun{PolitisRomano87} and \citeasnoun{BergPolitis2009} for details.

Instead of point-wise inference we aim in this paper for simultaneous inference using the estimator $\widehat{f}_{T}$. More precisely, we aim for confidence bands covering the spectral density uniformly at all positive Fourier frequencies $\lambda_{k,T}=2\pi k/T$, $k=1,\ldots,\lfloor T/2\rfloor=:N_{T}$, with a desired (high) probability. Hence, we keep all information contained in $X_{1},\dots,X_{T}$ in the sense that the discrete Fourier transform of any vector $x\in\mathbb{R}^{T}$ is a linear combination of trigonometric polynomials at exactly these frequencies.

Note that so far simultaneous confidence bands for the spectral density have only been derived for special cases, in particular for autoregressive processes relying on parametric spectral density estimates (see e.g. \citeasnoun{NP84} and \citeasnoun{T87}), while \citeasnoun{NP08} proposed a bootstrap-aided approach for Gaussian time series using the integrated periodogram to construct simultaneous confidence bands for (a smoothed version of) the spectral density. Generalizing results from \citeasnoun{LiuWu2010}, \citeasnoun{YZ22} derived simultaneous confidence bands for the spectral density on a grid with mesh size wider than $2\pi/T$ and for locally stationary Bernoulli shifts satisfying a geometric moment contraction condition. To this end, they proved that the maximum deviation of a suitably standardized lag-window spectral density estimator over such a grid asymptotically possesses a Gumbel distribution. Motivated by the slow rate of convergence of the maximum deviation to a Gumbel variable, they propose an asymptotically valid bootstrap method to improve the finite sample behavior of their confidence regions.

As mentioned above, we aim to construct simultaneous confidence bands on the finer grid of all (positive) Fourier frequencies. To achieve this goal, we establish a Gaussian approximation result for the distribution of a properly standardized statistic based on the random quantity

\[
\max_{1\leq k\leq N_{T}}|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|.
\]

In this context and in order to stabilize the asymptotic variance and to construct confidence bands that automatically adapt to the local variability of $f(\lambda)$, we are particularly interested in deriving a Gaussian approximation for the normalized statistic

\[
\max_{1\leq k\leq N_{T}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})}.
\]

Various Gaussian approximations of max-type statistics in the time domain have been derived during the last decades. For means of high dimensional time series data we refer the reader to \citeasnoun{ZW17} as well as \citeasnoun{ZC18} and references therein. In the context of i.i.d. functional data, \citeasnoun{CCH22} show that Gaussian approximation results for means of high dimensional vectors can be adapted to establish a Gaussian approximation for the maximum of the periodogram. To elaborate, they use that the periodogram at frequency $\lambda_{k,T}$ is the squared norm of $\frac{1}{\sqrt{2\pi T}}\sum_{t=1}^{T}X_{t}\,e^{-it\lambda_{k,T}}$. However, their arguments do not directly carry over to lag-window estimators as the latter have a more complex structure. It is shown in (5) below that they still have a mean-type representation, but the degree of dependence of the summands increases with the sample size, which prevents the application of the aforementioned results. A Gaussian approximation result for high dimensional spectral density matrices of $\alpha$-mixing time series has been derived in \citeasnoun{Changetal23}. However, they require the dimension of the process to increase with sample size, which makes their results not applicable for our purposes.

The paper is organized as follows. Section 2 summarizes key assumptions for our results. The core material on Gaussian approximation for spectral density estimators is contained in Section 3. In Section 4 we discuss application of the Gaussian approximation results to the construction of simultaneous confidence bands over the positive Fourier frequencies. Section 5 reports the results of a small simulation study. Auxiliary lemmas and proofs are deferred to Section 6.

2 Set up and Assumptions

We begin by imposing the following assumption on the dependence structure of the stochastic process generating the observed time series $X_{1},X_{2},\ldots,X_{T}$.

Assumption 1: $\{X_{t},t\in\mathbb{Z}\}$ is a strictly stationary, centered process and $X_{t}=g(e_{t},e_{t-1},\ldots)$ for some measurable function $g$ and an i.i.d. process $\{e_{t},t\in\mathbb{Z}\}$ with mean zero and variance $0<\sigma^{2}_{e}<\infty$. For $s\geq 0$, denote by $\mathcal{F}_{r,s}=\sigma(e_{r},e_{r-1},\dots,e_{r-s})$ the $\sigma$-algebra generated by the random variables $\{e_{r-j},0\leq j\leq s\}$ and assume that $E|X_{t}|^{m}<\infty$ for some $m>16$. For a random variable $X$ let $\|X\|_{m}:=(E|X|^{m})^{1/m}$ and define for an independent copy $\{e_{t}^{\prime},t\in\mathbb{Z}\}$ of $\{e_{t},t\in\mathbb{Z}\}$,

\[
\delta_{m}(k):=\|X_{t}-g(e_{t},e_{t-1},\ldots,e_{t-k+1},e^{\prime}_{t-k},e_{t-k-1},\ldots)\|_{m}.
\]

The assumption is that

\[
\delta_{m}(k)\leq C\,(1+k)^{-\alpha} \tag{4}
\]

for some $\alpha>3$ and a constant $C<\infty$.

Assumption 1 implies that the autocovariance function $\gamma(h)=\mathrm{Cov}(X_{0},X_{h})$, $h\in\mathbb{Z}$, is absolutely summable, so that the process $\{X_{t},t\in\mathbb{Z}\}$ possesses a continuous and bounded spectral density $f(\lambda)=(2\pi)^{-1}\sum_{h\in\mathbb{Z}}\gamma(h)e^{-ih\lambda}$, $\lambda\in(-\pi,\pi]$.
It can even be shown that (4) implies $\sum_{j\in\mathbb{Z}}|j|^{r}|\gamma(j)|<\infty$ for $r<\alpha-1$, see Lemma 1 in Section 6. As mentioned in the Introduction, this allows for convergence rates for $\widehat{f}_{T}(\lambda)$ of order $O(T^{-r/(2r+1)})$, $r<\alpha-1$. Note that our assumptions on the decay of the dependence coefficients are less restrictive than those in \citeasnoun{YZ22}, where exponentially decaying coefficients are presumed. We assume for simplicity that $EX_{t}=0$. Our results can be generalized to non-centered time series using the modified autocovariance estimator $\widetilde{\gamma}(j)=\frac{1}{T}\sum_{t=j+1}^{T}(X_{t}-\overline{X})(X_{t-j}-\overline{X})$ with $\overline{X}=\sum_{t=1}^{T}X_{t}/T$, instead of the estimator $\widehat{\gamma}(j)$ defined in (2).

In addition to Assumption 1 we also require the following boundedness condition for the spectral density $f$.

Assumption 2: The spectral density $f$ satisfies $\inf_{\lambda\in(-\pi,\pi]}f(\lambda)>0$.

Finally, we impose the following conditions on the lag-window function $w$ used in obtaining the estimator $\widehat{f}_{T}$, which are standard in the literature; see \citeasnoun{Priestley81}, Chapter 6.

Assumption 3: The lag-window $w\colon[-1,1]\to\mathbb{R}$ is assumed to be a differentiable and symmetric function with $\int_{-1}^{1}w(u)\,du=1$ and $w(0)=1$.

3 Gaussian Approximation for Spectral Density Estimators

For the following considerations and in order to separate any bias related problems, we consider the centered sequence

\[
\begin{aligned}
\sqrt{\frac{T}{M_{T}}}\Big(\widehat{f}_{T}(\lambda_{k,T})-E\,\widehat{f}_{T}(\lambda_{k,T})\Big)
&=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\sum_{j=0}^{M_{T}}a_{k,j}\big(X_{t}X_{t-j}-\gamma(j)\big)\,\mathbbm{1}_{\{t>j\}}\\
&=:\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k},
\end{aligned} \tag{5}
\]

with an obvious abbreviation for $Z_{t,k}$ and where for $k=1,\ldots,N_{T}$,

\[
a_{k,j}:=\begin{cases}
\dfrac{1}{2\pi}\cdot\dfrac{1}{\sqrt{M_{T}}}, & j=0,\\[1ex]
\dfrac{1}{\pi}\cdot\dfrac{1}{\sqrt{M_{T}}}\,w(j/M_{T})\cos(j\lambda_{k,T}), & j\geq 1.
\end{cases}
\]

Due to the differentiability of the lag-window $w$ (cf. Assumption 3), we have

\[
\begin{aligned}
\max_{k=1,\ldots,N_{T}}\sum_{j=0}^{M_{T}}a_{k,j}^{2}
&\leq\frac{1}{4\pi^{2}}\frac{1}{M_{T}}+\frac{1}{\pi^{2}M_{T}}\sum_{j=1}^{M_{T}}w^{2}(j/M_{T})\\
&=\frac{1}{\pi^{2}}\int_{0}^{1}w^{2}(u)\,du+\mathcal{O}(M_{T}^{-1}).
\end{aligned}
\]
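The weights $a_{k,j}$ and the first line of the bound on their squared sums can be checked numerically. The following is a small sketch under stated assumptions: the names `weights` and `squared_weight_bound` are hypothetical, and the Parzen window is used only as an example of a lag-window.

```python
import numpy as np

def parzen(u):
    """Parzen lag-window on [-1, 1] (illustrative choice of w)."""
    u = np.abs(np.asarray(u, dtype=float))
    return np.where(u <= 0.5, 1 - 6 * u**2 + 6 * u**3,
                    np.where(u <= 1.0, 2 * (1 - u)**3, 0.0))

def weights(T, M, window=parzen):
    """Matrix with entries a_{k,j}, rows k = 1..N_T, columns j = 0..M."""
    N = T // 2
    lam = 2 * np.pi * np.arange(1, N + 1) / T
    j = np.arange(1, M + 1)
    a = np.empty((N, M + 1))
    a[:, 0] = 1.0 / (2 * np.pi * np.sqrt(M))
    a[:, 1:] = window(j / M) * np.cos(np.outer(lam, j)) / (np.pi * np.sqrt(M))
    return a

def squared_weight_bound(M, window=parzen):
    """Right-hand side of the first inequality for max_k sum_j a_{k,j}^2."""
    j = np.arange(1, M + 1)
    return 1 / (4 * np.pi**2 * M) + np.sum(window(j / M)**2) / (np.pi**2 * M)
```

Comparing `(weights(T, M)**2).sum(axis=1).max()` with `squared_weight_bound(M)` illustrates the inequality, which holds simply because $\cos^{2}\leq 1$.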

The following theorem is our first result and establishes a valid Gaussian approximation for the maximum over all positive Fourier frequencies of the centered estimator given in (1).

Theorem 1.

Suppose that $\{X_{t},t\in\mathbb{Z}\}$ fulfils Assumptions 1 and 2. Let $\widehat{f}_{T}$ be a lag-window estimator of $f$ as given in (1), where the lag-window $w$ satisfies Assumption 3, and let $M_{T}\rightarrow\infty$ with $M_{T}\sim T^{a_{s}}$. Assume further that

\[
\Big\|\,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k}\,\Big\|_{2}^{2}>c>0. \tag{6}
\]

Let $\xi_{k}$, $k=1,\ldots,N_{T}$, be jointly normally distributed random variables with zero mean and covariance $E\xi_{k_{1}}\xi_{k_{2}}$ equal to

\[
\frac{1}{T}E\Big[\Big(\sum_{j=0}^{M_{T}}\sum_{t=j+1}^{T}a_{k_{1},j}\big(X_{t}X_{t-j}-\gamma(j)\big)\Big)\Big(\sum_{j=0}^{M_{T}}\sum_{t=j+1}^{T}a_{k_{2},j}\big(X_{t}X_{t-j}-\gamma(j)\big)\Big)\Big].
\]

Let $\lambda$, $a_{s}$, and $a_{l}$ be positive constants such that

\[
\alpha>\min\Big\{1+\frac{a_{l}}{2a_{s}}\,,\,\frac{1-a_{l}-\frac{12}{m}-4\lambda-\max\{\frac{4}{m},\,2\lambda\}}{2a_{s}}+\frac{3}{2}\Big\} \tag{7}
\]

and

\[
0<a_{s}+\max\{4\lambda,\,2\lambda+4/m\}<a_{l}<1-6\lambda-12/m. \tag{8}
\]

Then,

\[
\begin{aligned}
&\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\Big|\widehat{f}_{T}(\lambda_{k,T})-E\,\widehat{f}_{T}(\lambda_{k,T})\Big|\leq x\Big)-P\Big(\max_{k=1,\ldots,N_{T}}|\xi_{k}|\leq x\Big)\Big|\\
&\qquad=\mathcal{O}\Big(T^{-\kappa}+T^{-\lambda}\big(\log(N_{T})\big)^{3/2}\Big),
\end{aligned} \tag{9}
\]

where $\kappa=\min\{\kappa_{1},\kappa_{2}\}$ with

\[
\kappa_{1}=a_{l}/2-a_{s}/2-\lambda-\max\{2/m\,,\,\lambda\}\quad\mbox{and}\quad\kappa_{2}=1/2-a_{l}/2-6/m-3\lambda.
\]

Some remarks regarding Theorem 1 are in order.

Remark 1.
  • (i)

    The lower bound in (6) is valid for sufficiently large $T$. This results from our assumption that $f$ is bounded away from zero (see Assumption 2), together with

    \[
    \sup_{k\in\{1,\dots,N_{T}\}}\Big|\,\Big\|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k}\Big\|_{2}^{2}-f^{2}(\lambda_{k,T})\,\frac{1}{M_{T}}\sum_{j=-M_{T}}^{M_{T}}w^{2}\Big(\frac{j}{M_{T}}\Big)\,(1+\cos(2j\,\lambda_{k,T}))\Big|=o(1),
    \]

    see Lemma 2 in Section 6.

  • (ii)

    In order to verify that the max-statistic of interest can be approximated by the maximum of absolute values of Gaussian random variables inheriting the covariance structure of lag-window estimators, we combine smoothing techniques, Lindeberg's method, and a blocking approach. The parameter $\lambda$ appearing in Theorem 1 is related to the smoothness of the functions used in the proof to approximate indicator functions. The parameters $a_{l}$ and $a_{s}$ determine the sizes of the big and small blocks. The sizes of these blocks are tailor-made to handle the growing degree of dependence in the $Z_{t,k}$'s with increasing sample size.

Remark 2.
  • (i)

    Notice first that the rate in the Gaussian approximation (9) does not depend on the rate of decay of the dependence coefficients. Moreover, as can be seen by an inspection of the proof of Theorem 1, even if the underlying time series $\{X_{t},t\in\mathbb{Z}\}$ consists of i.i.d. observations, i.e. $X_{t}=e_{t}$ for all $t$, we do not achieve better approximation rates using our method of proof.

  • (ii)

    If we assume that the number $m$ of moments assumed to exist and the exponent $a_{s}=1/(2r+1)$, $r\in\mathbb{N}$, which governs the convergence rate of the underlying lag-window estimator, are fixed, then we need to optimize the rate in the Gaussian approximation with respect to $\lambda$ and $a_{l}$. To do so we first ignore log-terms and secondly we divide the consideration into two cases according to the parameter $m$, namely Case I: $2/m\geq\lambda$ (moderate values of $m$) and Case II: $2/m\leq\lambda$ (large values of $m$). For Case I the resulting rate, which is the minimum over $\lambda$ and $a_{l}$ of $\max\{-\lambda,\ \lambda+2/m+(a_{s}-a_{l})/2,\ 3\lambda+6/m+(a_{l}-1)/2\}$, is found by balancing the three terms. This leads to $a_{l}=1/3+2a_{s}/3-4/(3m)$, $\lambda=(1-a_{s})/12-4/(3m)$ and the resulting rate $T^{-r/(6(2r+1))+4/(3m)}$.
    Exactly along the same lines we obtain for Case II that $\lambda=(1-a_{s})/14-6/(7m)$ and the resulting rate $T^{-r/(7(2r+1))+6/(7m)}$. Both choices are in line with (8).
    Hence, for usual lag-window estimators, i.e. $a_{s}=1/5$, and if sufficiently high moments of the time series are assumed to exist, we can achieve rates up to $T^{-2/35}$ for the Gaussian approximation.
    In the limit $a_{s}\to 0$, we achieve the rate $T^{-1/14+6/(7m)}$, which, if sufficiently high moments exist, comes arbitrarily close to the rate $T^{-1/14}$.

  • (iii)

    If, instead of sample autocovariances, we considered sample means of i.i.d. observations, which would make the small blocks/large blocks considerations in the proof of Theorem 1 superfluous, we could achieve with the presented method of proof a Gaussian approximation with rate $T^{-1/8+3/(2m)}$.
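The balancing argument in Remark 2 (ii) can be turned into a small calculator. The sketch below only evaluates the closed-form values of $\lambda$ stated in the remark and picks the case whose side condition holds; the function name `gaussian_approx_exponent` is hypothetical, and log-terms are ignored as in the remark.

```python
import math

def gaussian_approx_exponent(a_s, m):
    """Return (case, lam) such that the rate is T**(-lam), ignoring log terms.

    a_s : exponent of the truncation lag, M_T ~ T**a_s
    m   : number of finite moments (may be math.inf)
    """
    lam_case_1 = (1 - a_s) / 12 - 4 / (3 * m)   # Case I:  2/m >= lambda
    lam_case_2 = (1 - a_s) / 14 - 6 / (7 * m)   # Case II: 2/m <= lambda
    if 2 / m >= lam_case_1:
        return "I", lam_case_1
    return "II", lam_case_2

# Usual lag-window estimators (a_s = 1/5) with all moments finite:
case, lam = gaussian_approx_exponent(0.2, math.inf)
```

Both side conditions switch at the same value $m=40/(1-a_{s})$, so exactly one case applies away from that boundary. For $a_{s}=1/5$ and $m=\infty$ the function returns Case II with $\lambda=2/35$, matching the rate $T^{-2/35}$ quoted in the remark.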

4 Confidence Bands for the Spectral Density

To construct confidence bands that appropriately take into account the local variability of the lag-window estimator, we make use of the following Gaussian approximation result for the properly standardized lag-window estimators.

Theorem 2.

Suppose that the assumptions of Theorem 1 are satisfied. Further, assume that $M_{T}^{3}/T\to 0$ and that, in addition to the conditions given in (7),

\[
\lambda+\kappa\leq\min\{1/2-a_{s}/2-4/m\,,\,2a_{s}-4/m\} \tag{10}
\]

holds true. Moreover, we assume for the bias of $\widehat{f}_{T}$ that

\[
E\widehat{f}_{T}(\lambda)-f(\lambda)=\mathcal{O}(M_{T}^{-2}) \tag{11}
\]

uniformly in $\lambda$. Then,

\[
\begin{aligned}
&\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\Big)-P\Big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\Big)\Big|\\
&\qquad=\mathcal{O}\Big(T^{-\kappa}+T^{-\lambda}\big(\log(N_{T})\big)^{3/2}\Big),
\end{aligned}
\]

where $\widetilde{\xi}_{k}$, $k=1,\ldots,N_{T}$, are jointly normally distributed random variables with zero mean and covariance $E\widetilde{\xi}_{k_{1}}\widetilde{\xi}_{k_{2}}$ equal to

\[
\begin{aligned}
C_{T}(k_{1},k_{2})=\frac{1}{f(\lambda_{k_{1},T})f(\lambda_{k_{2},T})}&\sum_{j_{1},j_{2}=0}^{M_{T}}a_{k_{1},j_{1}}a_{k_{2},j_{2}}\\
&\times\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}E\Big[\big(X_{t}X_{t-j_{1}}-\gamma(j_{1})\big)\big(X_{s}X_{s-j_{2}}-\gamma(j_{2})\big)\Big].
\end{aligned} \tag{12}
\]
Remark 3.
  • (i)

    Choosing $a_{s}$ MSE optimal, that is $a_{s}=1/5$, the corresponding optimal values $\lambda=\kappa=2/35$ (see Remark 2) satisfy (10).

  • (ii)

    Under the additional assumption $\lim_{u\to 0}(1-w(u))/u^{2}>0$ for the lag-window, validity of (11) is shown in \citeasnoun[Theorem 9.3.3]{A71}.
    For the case $M_{T}\sim T^{1/5}$, \citeasnoun{BergPolitis2009} have shown in their Theorem 1 that, under our assumptions, for lag-window estimators with a so-called flat-top lag-window $w$ even

    \[
    \sup_{\lambda}|E\widehat{f}_{T}(\lambda)-f(\lambda)|=o(M_{T}^{-2})
    \]

    holds. This implies that the results of Theorem 2 also hold for the important class of flat-top lag-window estimators introduced in \citeasnoun{PolitisRomano87}. It is worth mentioning that flat-top lag-windows fail to fulfill $\lim_{u\to 0}(1-w(u))/u^{2}>0$.

Theorem 2 motivates the following multiplier bootstrap procedure to construct a confidence band for the smoothed spectral density

\[
\widetilde{f}_{T}(\cdot):=E(\widehat{f}_{T}(\cdot)).
\]
Step 1.

For $k_{1},k_{2}\in\{1,2,\ldots,N_{T}\}$, let $\widehat{C}_{T}(k_{1},k_{2})$ be an estimator of the covariance $C_{T}(k_{1},k_{2})$ given in (12) which ensures that the $N_{T}\times N_{T}$ matrix $\widehat{\Sigma}_{N_{T}}$ with $(i,j)$-th element equal to $\widehat{C}_{T}(i,j)$ is non-negative definite.

Step 2.

Generate random variables $\xi^{\ast}_{1},\xi^{\ast}_{2},\ldots,\xi^{\ast}_{N_{T}}$, where

\[
\big(\xi^{\ast}_{1},\xi^{\ast}_{2},\ldots,\xi^{\ast}_{N_{T}}\big)^{\top}\sim\mathcal{N}\Big(0_{N_{T}},\widehat{\Sigma}_{N_{T}}\Big),
\]

with $0_{N_{T}}$ an $N_{T}$-dimensional vector of zeros and covariance matrix $\widehat{\Sigma}_{N_{T}}$. Let $\xi^{\ast}_{\max}=\max\big\{|\xi^{\ast}_{1}|,|\xi^{\ast}_{2}|,\ldots,|\xi^{\ast}_{N_{T}}|\big\}$ and, for $\alpha\in(0,1)$ given, denote by $q^{\ast}_{1-\alpha}$ the upper $(1-\alpha)$ percentage point of the distribution of $\xi^{\ast}_{\max}$.

Step 3.

A simultaneous $(1-\alpha)$-confidence band for $\widetilde{f}_{T}(\lambda_{j,T})$, $j=1,2,\ldots,N_{T}$, is then given by

\[
\Big\{\Big[\widehat{f}_{T}(\lambda_{j,T})\Big(1-q^{\ast}_{1-\alpha}\sqrt{\frac{M_{T}}{T}}\Big),\ \ \widehat{f}_{T}(\lambda_{j,T})\Big(1+q^{\ast}_{1-\alpha}\sqrt{\frac{M_{T}}{T}}\Big)\Big],\ j=1,2,\ldots,N_{T}\Big\}. \tag{13}
\]
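Steps 2 and 3 can be sketched as a short Monte-Carlo routine. This is an illustration under assumptions, not the authors' code: the names `max_abs_quantile` and `confidence_band` are hypothetical, the covariance matrix estimate from Step 1 is taken as given, and the quantile is approximated from a finite number of Gaussian draws. An eigendecomposition is used for the matrix square root so that a singular but non-negative definite matrix is handled as well.

```python
import numpy as np

def max_abs_quantile(sigma_hat, alpha, n_draws=1000, rng=None):
    """Monte-Carlo estimate of the (1 - alpha) quantile of max_k |xi*_k|,
    where xi* ~ N(0, sigma_hat)."""
    rng = np.random.default_rng(rng)
    sigma_hat = np.asarray(sigma_hat, dtype=float)
    vals, vecs = np.linalg.eigh(sigma_hat)
    # Clip tiny negative eigenvalues caused by rounding before taking roots.
    root = vecs * np.sqrt(np.clip(vals, 0.0, None))   # root @ root.T == sigma_hat
    z = rng.standard_normal((n_draws, sigma_hat.shape[0]))
    xi = z @ root.T                                   # rows are draws of xi*
    xi_max = np.abs(xi).max(axis=1)
    return np.quantile(xi_max, 1 - alpha)

def confidence_band(f_hat, q, M, T):
    """Lower and upper limits of the band (13) at each Fourier frequency."""
    f_hat = np.asarray(f_hat, dtype=float)
    half_width = q * np.sqrt(M / T)
    return f_hat * (1 - half_width), f_hat * (1 + half_width)
```

The band is multiplicative in the estimate, so it is wider where the estimated spectral density is large, which reflects the studentization by $\widehat{f}_{T}(\lambda_{k,T})$ in Theorem 2.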

Notice that the distribution of $\xi^{\ast}_{\max}$ in Step 2 can be estimated via Monte-Carlo simulation. While $f(\lambda_{k_{i},T})$, $i=1,2$, appearing in the expression for $C_{T}(k_{1},k_{2})$ can be replaced by the estimator $\widehat{f}_{T}(\lambda_{k_{i},T})$, the important step in the proposed bootstrap algorithm is the estimation of the covariances

\[
\sigma_{T}(j_{1},j_{2})=\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}E\Big[\big(X_{t}X_{t-j_{1}}-\gamma(j_{1})\big)\big(X_{s}X_{s-j_{2}}-\gamma(j_{2})\big)\Big]
\]

appearing in $C_{T}(k_{1},k_{2})$. To obtain an estimator for this quantity which has some desired consistency properties, we follow \citeasnoun{Zhang_etal2022}. In particular, let $K$ be a kernel function satisfying the following conditions.

Assumption 4: $K:\mathbb{R}\to[0,+\infty)$ is symmetric, continuously differentiable and decreasing on $[0,+\infty)$ with $K(0)=1$ and $\int K(x)\,dx<\infty$. The Fourier transform of $K$ is integrable and nonnegative on $\mathbb{R}$.

Consider next the estimator

\[
\widehat{\sigma}_{T}(j_{1},j_{2})=\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}K\Big(\frac{t-s}{b_{T}}\Big)\big(X_{t}X_{t-j_{1}}-\widehat{\gamma}(j_{1})\big)\big(X_{s}X_{s-j_{2}}-\widehat{\gamma}(j_{2})\big) \tag{14}
\]

of $\sigma_{T}(j_{1},j_{2})$, where $b_{T}>0$ is a bandwidth parameter satisfying $b_{T}\rightarrow\infty$ as $T\rightarrow\infty$. Note that the conditions on the Fourier transform of $K$ in Assumption 4 guarantee positive semi-definiteness of $\widehat{\sigma}_{T}(j_{1},j_{2})$. This assumption is satisfied if $K$ is, for instance, the Gaussian kernel.
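A direct, unoptimized sketch of the estimator (14) is given below. It rests on assumptions beyond the text: the names `gaussian_kernel`, `sigma_hat` and `c_hat` are hypothetical, the kernel is parametrized as $K(x)=\exp(-x^{2}/2)$, and the full $T\times T$ kernel matrix is formed, which is only practical for moderate $T$. The helper `c_hat` assembles the covariance matrix from the weights $a_{k,j}$ with $f$ replaced by its estimate, as suggested above.

```python
import numpy as np

def gaussian_kernel(x):
    """Gaussian kernel with K(0) = 1 (one kernel satisfying Assumption 4)."""
    return np.exp(-0.5 * np.asarray(x, dtype=float)**2)

def sigma_hat(x, M, b, kernel=gaussian_kernel):
    """Matrix of sigma_hat_T(j1, j2), j1, j2 = 0..M, following (14)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    # Y[j, t-1] = X_t X_{t-j} - gamma_hat(j) for t > j, and 0 otherwise,
    # so the restricted sums over t > j1 and s > j2 become full sums.
    Y = np.zeros((M + 1, T))
    for j in range(M + 1):
        prod = x[j:] * x[:T - j]
        Y[j, j:] = prod - prod.sum() / T      # prod.sum() / T equals gamma_hat(j)
    t = np.arange(T)
    K = kernel((t[:, None] - t[None, :]) / b)  # K((t - s) / b_T)
    return Y @ K @ Y.T / T

def c_hat(a, S, f_hat):
    """Plug-in covariance: (a S a^T)_{k1,k2} / (f_hat_{k1} * f_hat_{k2})."""
    f_hat = np.asarray(f_hat, dtype=float)
    return (a @ S @ a.T) / np.outer(f_hat, f_hat)
```

Writing the double sum as the quadratic form `Y @ K @ Y.T` makes the non-negative definiteness visible: it is inherited from the kernel matrix `K`.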

The following consistency result for the estimator proposed in (14) can be established.

Proposition 1.

Suppose that Assumption 1 and Assumption 4 hold true. Let $b_{T}\sim T^{c}$ with $c<1/2-8a_{s}/m$, where $a_{s}$ is as in Theorem 2. Then,

\[
\sup_{1\leq j_{1},j_{2}\leq M_{T}}\big|\widehat{\sigma}_{T}(j_{1},j_{2})-\sigma_{T}(j_{1},j_{2})\big|=O_{P}\Big(\frac{1}{b_{T}}+\frac{b_{T}M_{T}^{8/m}}{\sqrt{T}}\Big).
\]

Let, as usual, $P^{*}$ denote the conditional probability given the time series $X_{1},\dots,X_{T}$. We then have the following result, which proves consistency of the proposed multiplier bootstrap procedure.

Theorem 3.

Suppose that the conditions of Theorem 2 hold true and that $b_{T}\sim T^{c}$, where

\[
0<a_{s}<c<\frac{1}{2}-a_{s}\Big(1+\frac{8}{m}\Big).
\]

Let $\xi_{k}^{\ast}$, $k=1,2,\ldots,N_{T}$, be Gaussian random variables generated as in Step 2 of the multiplier bootstrap algorithm with $\widehat{\Sigma}_{N_{T}}$ the $N_{T}\times N_{T}$ matrix the $(k_{1},k_{2})$-th element of which equals

\[
\widehat{C}_{T}(k_{1},k_{2})=\frac{1}{\widehat{f}_{T}(\lambda_{k_{1},T})\widehat{f}_{T}(\lambda_{k_{2},T})}\sum_{j_{1},j_{2}=0}^{M_{T}}a_{k_{1},j_{1}}a_{k_{2},j_{2}}\widehat{\sigma}_{T}(j_{1},j_{2}), \tag{15}
\]

and $\widehat{\sigma}_{T}(j_{1},j_{2})$ given in (14). Then, as $T\rightarrow\infty$,

\[
\begin{aligned}
&\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\Big)-P^{\ast}\Big(\max_{k=1,\ldots,N_{T}}|\xi^{\ast}_{k}|\leq x\Big)\Big|\\
&\qquad=\mathcal{O}\Big(T^{-\kappa}+T^{-\lambda}\big(\log(N_{T})\big)^{3/2}\Big)
+o_{P}\Big(\Big\{\sup_{k=1,\ldots,N_{T}}|\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})|\Big\}^{1/6}+T^{-\rho/6}\Big)
\end{aligned} \tag{16}
\]

for $\rho=\min\{c-a_{s},\,1/2-c-a_{s}(1+8/m)\}$.

Remark 4.
  • (i)

    So far we have considered the construction of (simultaneous) confidence bands for the (smoothed) spectral density $\widetilde{f}_{T}(\lambda_{j,T})=\mathrm{E}(\widehat{f}_{T}(\lambda_{j,T}))$ over the Fourier frequencies $\lambda_{j,T}$, $j=1,2,\ldots,N_{T}$. To extend the proposed procedure to one that also delivers an asymptotically valid simultaneous confidence band for the spectral density $f$ itself, a (uniformly) consistent estimator of the (rescaled) bias term $B_{T}(\lambda_{j,T})=\sqrt{T/M_{T}}\big(\mathrm{E}(\widehat{f}_{T}(\lambda_{j,T}))-f(\lambda_{j,T})\big)/\widehat{f}_{T}(\lambda_{j,T})$ for $j=1,2,\ldots,N_{T}$ is needed, provided $B_{T}(\lambda_{j,T})$ does not vanish asymptotically; see (3). In fact, it can be easily seen that a (theoretically) valid confidence band for $f$, which takes into account the bias in estimating $f$, is given by

    \[
    \begin{aligned}
    \Big\{\Big[\widehat{f}_{T}(\lambda_{j,T})&\Big(1-\sqrt{\frac{M_{T}}{T}}\big(q^{\ast}_{1-\alpha}+B_{T}(\lambda_{j,T})\big)\Big),\\
    &\widehat{f}_{T}(\lambda_{j,T})\Big(1+\sqrt{\frac{M_{T}}{T}}\big(q^{\ast}_{1-\alpha}-B_{T}(\lambda_{j,T})\big)\Big)\Big],\ j=1,2,\ldots,N_{T}\Big\};
    \end{aligned} \tag{17}
    \]

    compare to (13). As in many other nonparametric inference problems, different approaches can be considered for this purpose; see \citeasnoun{Calonicoetal2018} and the references therein for the cases of nonparametric density and regression estimation in an i.i.d. set up. One approach is an explicit bias correction that uses a plug-in type estimator of $B_{T}(\lambda_{j,T})$ based on the fact that, under certain conditions, see (3), $\sqrt{T/M_{T}}\big(\mathrm{E}(\widehat{f}_{T}(\lambda_{j,T}))-f(\lambda_{j,T})\big)=CWf^{\prime\prime}(\lambda_{j,T})+o(1)$. This approach requires a consistent (nonparametric) estimator of the second order derivative $f^{\prime\prime}$ of the spectral density. A different approach proposed in the literature is so-called 'undersmoothing'. The idea here is to make the bias term $\sqrt{T/M_{T}}\big(\mathrm{E}(\widehat{f}_{T}(\lambda_{j,T}))-f(\lambda_{j,T})\big)$ asymptotically negligible. This can be achieved by using a truncation lag $M_{T}$ which increases to infinity at a rate faster than the (MSE optimal) rate $T^{1/5}$. An alternative approach is to perform a bias correction by using flat-top kernels. See \citeasnoun{Politis24} for a recent discussion, where an alternative approach to variance stabilization, respectively studentization, called the confidence region method, has also been proposed. Although the problem of properly incorporating the (possible) bias in the construction of (simultaneous) confidence bands for the spectral density $f$ is an interesting one, we do not pursue it further in this paper.

  • (ii)

    Equation (16) contains, as part of the convergence rate for the multiplier bootstrap, the sup-distance of the lag-window estimator to the spectral density over all Fourier frequencies $\lambda_{k,T}$. Assuming a geometric rate for the decay of the physical dependence coefficients $\delta_{m}(k)$, one obtains from \citeasnoun{WuZaffaroni2018}, Theorem 6, the uniform convergence rate $(M_{T}\log(M_{T})/T)^{1/2}$ over all frequencies for lag-window estimators, which dominates the uniform term over all Fourier frequencies in (16) and usually converges to zero faster than the other parts of the bound therein.

5 Simulations

In this section we investigate, by means of simulations, the finite sample performance of the Gaussian approximation and of the corresponding multiplier bootstrap procedure proposed for the construction of simultaneous confidence bands. For this purpose, time series of length $T=256$, $512$ and $1024$ have been generated from the following three time series models:

  1. Model I:  $X_{t}=0.8X_{t-1}+\varepsilon_{t}$,

  2. Model II:  $X_{t}=1.3X_{t-1}-0.75X_{t-2}+u_{t}$ with $u_{t}=\varepsilon_{t}\sqrt{1+0.25u^{2}_{t-1}}$,

  3. Model III:  $X_{t}=\big(0.4+0.1\varepsilon_{t-1}\big)X_{t-1}+\varepsilon_{t}$.

In all models the innovations εt\varepsilon_{t} are chosen to be i.i.d. standard Gaussian. The empirical coverage over R=500R=500 repetitions of each model has been calculated for two different nominal coverages, 90%90\% and 95%95\%. The lag-window estimator of the spectral density used has been obtained using the Parzen lag-window and truncation lag MTM_{T}. The Gauss kernel with different values of the parameter bTb_{T} has been used for obtaining the covariance estimators σ^T(j1,j2)\widehat{\sigma}_{T}(j_{1},j_{2}) given in (14) and which are used for the calculation of the covariance matrix C^T(k1,k2)\widehat{C}_{T}(k_{1},k_{2}); see equation (15). All bootstrap approximations are based on B=1,000B=1,000 replications. Table 1 presents empirical coverages as well as mean lengths of the confidence bands obtained, where the mean lengths are calculated as

ML:=2MTT1NTj=1NTf^T(λj,T)1R=1Rq1α,(),ML:=2\sqrt{\frac{M_{T}}{T}}\frac{1}{N_{T}}\sum_{j=1}^{N_{T}}\widehat{f}_{T}(\lambda_{j,T})\frac{1}{R}\sum_{\ell=1}^{R}q_{1-\alpha}^{\ast,(\ell)},

with RR the number of repetitions and q1α,()q^{\ast,(\ell)}_{1-\alpha} the upper (1α)(1-\alpha) percentage point of the distribution of ξmax\xi^{\ast}_{\max} obtained in the \ell-th repetition; also see expression (13). Figure 1 shows averaged 90% and 95% confidence bands obtained using the method proposed in this paper for sample sizes of T=512T=512 and T=1024T=1024 and for Model II.
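For orientation, a lag-window estimate with the Parzen window can be computed along the following lines. This is a sketch under stated assumptions: the estimator is taken in the usual form $\widehat{f}_{T}(\lambda)=(2\pi)^{-1}\sum_{|j|\leq M_{T}}w(j/M_{T})\widehat{\gamma}(j)\cos(j\lambda)$ with $\widehat{\gamma}(j)=T^{-1}\sum_{t=j+1}^{T}X_{t}X_{t-j}$ for centered data, and the Fourier frequencies are taken as $2\pi k/T$, $k=1,\ldots,\lfloor T/2\rfloor$. The function names `parzen` and `lag_window_estimate` are hypothetical.

```python
import numpy as np

def parzen(u):
    """Parzen lag window w(u), supported on [-1, 1]."""
    u = np.abs(np.asarray(u, dtype=float))
    return np.where(u <= 0.5, 1.0 - 6.0 * u**2 + 6.0 * u**3,
                    np.where(u <= 1.0, 2.0 * (1.0 - u)**3, 0.0))

def lag_window_estimate(x, M):
    """Return (frequencies, estimate) on the assumed grid 2*pi*k/T, k=1..T//2."""
    x = np.asarray(x, dtype=float)
    T = x.size
    # Sample autocovariances for a centered series (no mean correction).
    gamma = np.array([x[j:] @ x[:T - j] / T for j in range(M + 1)])
    lam = 2.0 * np.pi * np.arange(1, T // 2 + 1) / T
    j = np.arange(1, M + 1)
    w = parzen(j / M)
    # The weighted sum is symmetric in j, so the lags j >= 1 enter twice.
    f_hat = (gamma[0] + 2.0 * np.cos(np.outer(lam, j)) @ (w * gamma[1:])) / (2.0 * np.pi)
    return lam, f_hat
```

For Gaussian white noise with unit variance the values returned should fluctuate around $1/(2\pi)$.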

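|  |  | Model I |  |  | Model II |  |  | Model III |  |  |
|---|---|---|---|---|---|---|---|---|---|---|
| $T=256$, $M_{T}=10$ | $b_{T}$ | 1.0 | 1.5 | 2.0 | 6.5 | 7.0 | 7.5 | 1.0 | 1.5 | 2.0 |
|  | 90% Cov | 92.6 | 90.6 | 89.2 | 90.8 | 90.0 | 90.0 | 86.0 | 82.8 | 82.2 |
|  | 90% ML | 0.56 | 0.47 | 0.43 | 1.46 | 1.42 | 1.38 | 0.18 | 0.17 | 0.17 |
|  | 95% Cov | 94.0 | 92.4 | 91.4 | 94.4 | 94.0 | 92.8 | 91.0 | 89.4 | 87.6 |
|  | 95% ML | 0.63 | 0.53 | 0.49 | 1.68 | 1.64 | 1.59 | 0.20 | 0.19 | 0.19 |
| $T=512$, $M_{T}=14$ | $b_{T}$ | 1.5 | 2.0 | 2.5 | 9.5 | 10.0 | 10.5 | 1.0 | 1.5 | 2.0 |
|  | 90% Cov | 91.6 | 88.8 | 87.6 | 90.4 | 90.2 | 89.8 | 87.8 | 84.2 | 83.0 |
|  | 90% ML | 0.41 | 0.37 | 0.36 | 1.05 | 1.03 | 1.02 | 0.15 | 0.14 | 0.15 |
|  | 95% Cov | 94.6 | 91.8 | 91.2 | 93.8 | 94.0 | 93.2 | 92.2 | 89.8 | 88.0 |
|  | 95% ML | 0.46 | 0.42 | 0.40 | 1.19 | 1.17 | 1.16 | 0.17 | 0.16 | 0.16 |
| $T=1024$, $M_{T}=18$ | $b_{T}$ | 2.0 | 2.5 | 3.0 | 11.0 | 11.5 | 12.0 | 1.0 | 1.5 | 2.0 |
|  | 90% Cov | 92.8 | 90.8 | 88.6 | 90.0 | 89.4 | 89.2 | 89.0 | 86.4 | 86.0 |
|  | 90% ML | 0.31 | 0.29 | 0.28 | 0.79 | 0.78 | 0.77 | 0.12 | 0.12 | 0.12 |
|  | 95% Cov | 96.0 | 94.8 | 94.0 | 94.0 | 94.2 | 93.8 | 94.6 | 93.2 | 92.0 |
|  | 95% ML | 0.34 | 0.32 | 0.31 | 0.88 | 0.88 | 0.87 | 0.14 | 0.13 | 0.13 |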

Table 1. Empirical coverages (Cov) and mean lengths (ML) of simultaneous confidence bands (13) for different sample sizes and different values of the parameters MTM_{T} and bTb_{T}.

Figure 1: Plot of f~(λj,T)\widetilde{f}(\lambda_{j,T}) (solid line) together with 90%90\% and 95%95\% averaged confidence bands (dotted and dashed lines, respectively), for time series of length T=512T=512 and bT=10b_{T}=10 (top) and T=1024T=1024 and bT=11.5b_{T}=11.5 (bottom) stemming from Model II.

We also compare the performance of the approach proposed in this paper with that based on a Gumbel-type approximation of the maximum deviation of the centered lag-window spectral density estimator evaluated over a much coarser grid of frequencies in the interval (0,π](0,\pi]. To elaborate and based on Theorems 3-5 of \citeasnounLiuWu2010 derived under different conditions, an asymptotically (1α)(1-\alpha) simultaneous confidence band for E(f^T(λs))\mathrm{E}(\widehat{f}_{T}(\lambda_{s})) over the set of frequencies λs=sπ/MT\lambda_{s}=s\pi/M_{T} for s=1,2,,MTs=1,2,\ldots,M_{T}, is given by

{[f^T(λs)C^α,T,f^T(λs)+C^α,T],s=1,2,,MT},\displaystyle\Big{\{}\Big{[}\widehat{f}_{T}(\lambda_{s})-\widehat{C}_{\alpha,T},\widehat{f}_{T}(\lambda_{s})+\widehat{C}_{\alpha,T}\Big{]},\ s=1,2,\ldots,M_{T}\Big{\}}, (18)

where

C^α,T=MTT(c1α+μT)f^T2(λs)W2,μT=2log(MT)log(πlog(MT))\widehat{C}_{\alpha,T}=\sqrt{\frac{M_{T}}{T}\big{(}c_{1-\alpha}+\mu_{T}\big{)}\widehat{f}^{2}_{T}(\lambda_{s})W_{2}},\ \ \mu_{T}=2\log(M_{T})-\log(\pi\log(M_{T})) (19)

and W2=11w2(u)𝑑u=151/280W_{2}=\int_{-1}^{1}w^{2}(u)du=151/280 in the case of the Parzen lag-window. Furthermore, c1αc_{1-\alpha} denotes the (1α)(1-\alpha) percentage point of the standard Gumbel distribution. Table 2 summarizes empirical coverages and mean lengths of the confidence bands (18)-(19) obtained over R=500R=500 repetitions for the same models and sample sizes considered in Table 1. Additionally and in order to see the effect of dependence of the time series at hand, we also report results for the case of an i.i.d. process with Xt𝒩(0,1)X_{t}\sim{\mathcal{N}}(0,1). Since the set of frequencies λs\lambda_{s} captured by the confidence band (18)-(19) solely depends on the truncation lag MTM_{T}, we present results for different values of this parameter.
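The half-width in (19) is straightforward to evaluate numerically. The sketch below assumes that the standard Gumbel distribution has distribution function $\exp(-e^{-x})$, so that $c_{1-\alpha}=-\log(-\log(1-\alpha))$, and that the estimate passed in is non-negative; the function name `gumbel_half_width` is hypothetical.

```python
import numpy as np

PARZEN_W2 = 151.0 / 280.0   # integral of w^2 over [-1, 1] for the Parzen window

def gumbel_half_width(f_hat, M, T, alpha, W2=PARZEN_W2):
    """Half-width of the band (18)-(19) at each frequency.

    f_hat : estimate(s) of the spectral density at the frequencies lambda_s
    M, T  : truncation lag and sample size
    alpha : one minus the nominal coverage
    """
    c = -np.log(-np.log(1.0 - alpha))                  # assumed Gumbel quantile
    mu = 2.0 * np.log(M) - np.log(np.pi * np.log(M))   # centering term mu_T
    return np.sqrt(M / T * (c + mu) * W2) * np.abs(np.asarray(f_hat, dtype=float))
```

The half-width is proportional to the estimate itself, so the band is wider where the estimated spectral density is larger.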

| $T$ | $M_{T}$ |  | i.i.d. 90% | i.i.d. 95% | Model I 90% | Model I 95% | Model II 90% | Model II 95% | Model III 90% | Model III 95% |
|---|---|---|---|---|---|---|---|---|---|---|
| 256 | 10 | Cov | 84.0 | 89.4 | 76.2 | 84.0 | 68.4 | 75.8 | 60.2 | 65.8 |
|  |  | ML | 0.12 | 0.13 | 0.25 | 0.28 | 0.77 | 0.84 | 0.16 | 0.18 |
|  | 14 | Cov | 79.0 | 85.0 | 74.8 | 80.8 | 71.2 | 76.6 | 59.6 | 66.0 |
|  |  | ML | 0.15 | 0.16 | 0.32 | 0.35 | 0.98 | 1.07 | 0.21 | 0.22 |
|  | 22 | Cov | 69.0 | 74.4 | 68.2 | 74.8 | 65.6 | 71.4 | 55.8 | 64.0 |
|  |  | ML | 0.20 | 0.21 | 0.45 | 0.49 | 1.32 | 1.43 | 0.28 | 0.30 |
| 512 | 14 | Cov | 84.0 | 87.8 | 81.6 | 85.2 | 75.4 | 80.0 | 59.0 | 63.8 |
|  |  | ML | 0.11 | 0.12 | 0.23 | 0.26 | 0.69 | 0.76 | 0.15 | 0.16 |
|  | 18 | Cov | 82.8 | 86.6 | 82.4 | 85.6 | 74.2 | 80.2 | 59.4 | 64.6 |
|  |  | ML | 0.12 | 0.14 | 0.28 | 0.30 | 0.82 | 0.89 | 0.17 | 0.19 |
|  | 26 | Cov | 78.6 | 84.0 | 78.2 | 84.0.8 | 71.2 | 72.2 | 58.4 | 66.2 |
|  |  | ML | 0.15 | 0.17 | 0.37 | 0.39 | 1.03 | 1.12 | 0.22 | 0.24 |
| 1024 | 18 | Cov | 88.0 | 91.0 | 85.6 | 88.6 | 76.0 | 82.4 | 59.6 | 66.2 |
|  |  | ML | 0.09 | 0.10 | 0.20 | 0.22 | 0.58 | 0.63 | 0.12 | 0.13 |
|  | 22 | Cov | 85.2 | 90.6 | 83.6 | 80.8 | 76.2 | 82.6 | 60.2 | 67.8 |
|  |  | ML | 0.10 | 0.11 | 0.23 | 0.25 | 0.66 | 0.72 | 0.14 | 0.15 |
|  | 30 | Cov | 83.0 | 88.0 | 82.2 | 86.6 | 75.2 | 81.2 | 62.4 | 68.4 |
|  |  | ML | 0.12 | 0.13 | 0.29 | 0.31 | 0.81 | 0.87 | 0.17 | 0.18 |

Table 2. Empirical coverages (Cov) and mean lengths (ML) of the simultaneous confidence bands (18) for different sample sizes and different values of the truncation-lag MTM_{T}.

As can be seen from Table 1, the empirical coverages of our confidence bands are, in general, close to the desired levels and they improve as the sample size increases. While the choice of the parameter $M_{T}$, which specifies the number of empirical autocovariances effectively used in obtaining $\widehat{f}_{T}$, seems not to be important for the performance of the method based on Gaussian approximation, this method is more sensitive to the choice of the bandwidth parameter $b_{T}$ used in the estimation of the covariance matrix of the approximating Gaussian variables. This parameter should be chosen larger for processes with a stronger dependence structure than for rather weakly dependent data. This is a standard observation in the context of covariance estimation, see e.g. \citeasnounA91. Differences in the empirical coverages between the models can also be seen, where the bilinear Model III seems to be a rather difficult case. This model clearly needs larger sample sizes than the other two models in order to obtain coverages which are close to the nominal ones. An inspection of Table 2 shows that the method based on the asymptotic Gumbel approximation, which uses a much coarser grid of frequencies, has difficulties in achieving the desired confidence levels even in the simplest case of a Gaussian i.i.d. process, while for Model III this method leads to quite low empirical coverages even for $T=1024$. Although, overall, the empirical coverages for the i.i.d. case, Model I and Model II improve slowly as the sample size increases, the results obtained depend heavily on the choice of the truncation lag $M_{T}$, and the coverages achieved stay in most cases well below the desired level even for the largest sample size used in the simulation study.

6 Auxiliary Lemmas and Proofs

Throughout this section, $C$ denotes a generic constant that may vary from line to line. We first state the following useful lemmas. See also \citeasnounXiaoWu2014 for results related to Lemma 1.

Lemma 1.

Under Assumption 1, the following assertions hold true:

  (i) $|\gamma(j)|\leq C(1+|j|)^{-\alpha}$,

  (ii) $\sum_{j\in\mathbb{Z}}|j|^{r}|\gamma(j)|<\infty$ for all $r<\alpha-1$,

  (iii) $\max_{j_{1},j_{2}\geq 0}|E(Y_{t,j_{1}}Y_{s,j_{2}})|\leq C(1+|s-t|)^{-\alpha}$, where $Y_{t,j}=X_{t}X_{t+j}-\gamma(j)$.

Proof.
  • (i)

    We make use of the bound

    P0(Xs)mδm(s),s0,\|P_{0}(X_{s})\|_{m}\leq\delta_{m}(s),s\geq 0, (20)

    of the so-called projection operator Pjs(Xj):=E[Xj|js]E[Xj|js1]P_{j-s}(X_{j}):=E[X_{j}|{\cal F}_{j-s}]-E[X_{j}|{\cal F}_{j-s-1}] for s0s\geq 0 and jj\in\mathbb{Z}, where i:=σ(ei,ei1,){\cal F}_{i}:=\sigma(e_{i},e_{i-1},\ldots). For a proof of (20) we refer to \citeasnounWu2005, Theorem 1.
    Note that Xt=s=0Pts(Xt)X_{t}=\sum_{s=0}^{\infty}P_{t-s}(X_{t}) a.s. and in L1L_{1}. This is the case because E[Xt|ts]E[X_{t}|{\cal F}_{t-s}] converges for ss\to\infty by the backward martingale convergence theorem a.s. and in L1L_{1} towards a limit measurable with respect to {\cal F}_{-\infty}, which is trivial because of the i.i.d. structure of (ei)(e_{i}). Therefore the limit is constant and coincides with the mean of XtX_{t} which is assumed to be zero. Then (i) follows since,

    |γ(j)|=\displaystyle|\gamma(j)|= |E(X0Xj)|=|s1,s2=0E(Ps1(X0)Pjs2(Xj))|\displaystyle|E(X_{0}\,X_{j})|=\Big{|}\sum_{s_{1},s_{2}=0}^{\infty}E(P_{-s_{1}}(X_{0})P_{j-s_{2}}(X_{j}))\Big{|}
    =\displaystyle= |s=0E(Ps(X0)Ps(Xj))|,because E(Ps1(X0)Pjs2(Xj))=0,s1js2\displaystyle\Big{|}\sum_{s=0}^{\infty}E(P_{-s}(X_{0})P_{-s}(X_{j}))\Big{|},\ \text{because }E(P_{-s_{1}}(X_{0})P_{j-s_{2}}(X_{j}))=0,-s_{1}\neq j-s_{2}
    \displaystyle\leq s=0E|P0(Xs)P0(Xj+s)|,because of stationarity\displaystyle\sum_{s=0}^{\infty}E|P_{0}(X_{s})P_{0}(X_{j+s})|,\ \text{because of stationarity}
    \displaystyle\leq s=0P0(Xs)mP0(Xj+s)m,where 1/m+1/m=1\displaystyle\sum_{s=0}^{\infty}\|P_{0}(X_{s})\|_{m^{\prime}}\|P_{0}(X_{j+s})\|_{m},\ \text{where }1/m+1/m^{\prime}=1
    \displaystyle\leq s=0P0(Xs)mP0(Xj+s)m,because mm for m2\displaystyle\sum_{s=0}^{\infty}\|P_{0}(X_{s})\|_{m}\|P_{0}(X_{j+s})\|_{m},\ \text{because }m^{\prime}\leq m\text{ for }m\geq 2
    \displaystyle\leq s=0δm(s)δm(j+s), by (20).\displaystyle\sum_{s=0}^{\infty}\delta_{m}(s)\delta_{m}(j+s),\text{ by }\eqref{B-13a}.

    From this bound we get for j0j\geq 0,

    |γ(j)|Cs=01(1+s)α(1+s+j)αC1(1+j)αs=01(1+s)αC(1+j)α.\displaystyle|\gamma(j)|\leq C\sum_{s=0}^{\infty}\frac{1}{(1+s)^{\alpha}(1+s+j)^{\alpha}}\leq C\frac{1}{(1+j)^{\alpha}}\sum_{s=0}^{\infty}\frac{1}{(1+s)^{\alpha}}\leq C(1+j)^{-\alpha}.
  • (ii)

    The assertion follows because

    j=1jr|γ(j)|Cj=1jr(1+j)αCj=11(1+j)αr<\sum_{j=1}^{\infty}j^{r}|\gamma(j)|\leq C\sum_{j=1}^{\infty}j^{r}(1+j)^{-\alpha}\leq C\sum_{j=1}^{\infty}\frac{1}{(1+j)^{\alpha-r}}<\infty

    for r<α1r<\alpha-1.

  • (iii)

    We assume without loss of generality that $t>s$ and define $X_{r}^{(k)}:=E[X_{r}\mid\mathcal{F}_{r,k}]$. The following three cases can then occur. If $j_{1}>t-s$, then

    \begin{align*}
    |E(Y_{t,j_{1}}Y_{s,j_{2}})| &\leq |E(X_{t}X_{t-j_{1}}X_{s}X_{s-j_{2}})|+|\gamma(j_{1})\gamma(j_{2})| \\
    &= \big|E\big((X_{t}-X_{t}^{(t-s-1)})X_{t-j_{1}}X_{s}X_{s-j_{2}}\big)\big|+|\gamma(j_{1})\gamma(j_{2})| \\
    &\leq \|X_{t-j_{1}}X_{s}X_{s-j_{2}}\|_{m/(m-1)}\,\|X_{t}-X_{t}^{(t-s-1)}\|_{m}+C(1+j_{1})^{-\alpha} \\
    &\leq C\,(1+t-s)^{-\alpha}.
    \end{align*}

    If j1[(ts)/2,ts]j_{1}\in[(t-s)/2,t-s], then

    \begin{align*}
    |E(Y_{t,j_{1}}Y_{s,j_{2}})| &\leq |E(X_{t}X_{t-j_{1}}X_{s}X_{s-j_{2}})|+|\gamma(j_{1})\gamma(j_{2})| \\
    &= \big|E\big((X_{t}-X_{t}^{(t-j_{1}-1)})X_{t-j_{1}}X_{s}X_{s-j_{2}}\big)\big|+|\gamma(j_{1})\gamma(j_{2})| \\
    &\leq \|X_{t-j_{1}}X_{s}X_{s-j_{2}}\|_{m/(m-1)}\,\|X_{t}-X_{t}^{(t-j_{1}-1)}\|_{m}+C(1+j_{1})^{-\alpha} \\
    &\leq C(1+j_{1})^{-\alpha}\leq C(1+(t-s)/2)^{-\alpha}\leq C2^{\alpha}(1+t-s)^{-\alpha}.
    \end{align*}

    Finally, if j1[0,(ts)/2]j_{1}\in[0,(t-s)/2], then

    |E(Yt,j1Ys,j2)|\displaystyle|E(Y_{t,j_{1}}Y_{s,j_{2}})| |E((Yt,j1Yt,j1(ts1))Ys,j2)|\displaystyle\leq|E((Y_{t,j_{1}}-Y_{t,j_{1}}^{(t-s-1)})Y_{s,j_{2}})|
    Yt,j1Yt,j1(ts1)m/(m2)Ys,j2m/2\displaystyle\leq\|Y_{t,j_{1}}-Y_{t,j_{1}}^{(t-s-1)}\|_{m/(m-2)}\|Y_{s,j_{2}}\|_{m/2}
    Ck=tsδm(k)+Ck=tsj1δm(k)\displaystyle\leq C\sum_{k=t-s}^{\infty}\delta_{m}(k)+C\sum_{k=t-s-j_{1}}^{\infty}\delta_{m}(k)
    C(1+ts)α+Ck=(ts)/2δm(k)\displaystyle\leq C(1+t-s)^{-\alpha}+C\sum_{k=(t-s)/2}^{\infty}\delta_{m}(k)
    C 2α(1+ts)α.\displaystyle\leq C\,2^{\alpha}(1+t-s)^{-\alpha}.

Lemma 2.

Suppose that Assumptions 1 to 3 hold. Then, we have

1Tj=0MTt=j+1Tak,j(XtXtjγ(j))m/2C\displaystyle\big{\|}\frac{1}{\sqrt{T}}\sum_{j=0}^{M_{T}}\sum_{t=j+1}^{T}a_{k,j}\big{(}X_{t}X_{t-j}-\gamma(j)\big{)}\big{\|}_{m/2}\leq\mbox{C} (21)

and

supk{1,,NT}|1Tt=1TZt,k22f2(λk,T)MTj=MTMTw2(jMT)(1+cos(2jλk,T))|\displaystyle\sup_{k\in\{1,\dots,N_{T}\}}\Big{|}\big{\|}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k}\big{\|}_{2}^{2}-\,\frac{f^{2}(\lambda_{k,T})}{M_{T}}\sum_{j=-M_{T}}^{M_{T}}w^{2}\big{(}\frac{j}{M_{T}}\big{)}\,(1+\cos(2j\,\lambda_{k,T}))\Big{|}
=o(1).\displaystyle=o(1). (22)

Further

1Tt=1T(Zt,kZ~t,k(s))m/2Cds,m\displaystyle\Big{\|}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big{\|}_{m/2}\,\leq\,\mbox{C}\cdot d_{s,m} (23)

with the mm-dependent random variables (m=2sm=2s) Z~t,k(s)\widetilde{Z}_{t,k}^{(s)}, k=1,2,,NT,k=1,2,\ldots,N_{T}, defined as:

Z~t,k(s):=j=0MTak,j(Xt(s)Xtj(s)E[Xt(s)Xtj(s)]) 1t>j,\widetilde{Z}_{t,k}^{(s)}\,:=\,\sum_{j=0}^{M_{T}}a_{k,j}\,\left(X_{t}^{(s)}X_{t-j}^{(s)}\,-\,E[X_{t}^{(s)}X_{t-j}^{(s)}]\right)\,\mathbbm{1}_{t>j}\,, (24)

where Xr(s):=E[Xrr,s]X_{r}^{(s)}\,:=\,E[X_{r}\mid\mathcal{F}_{r,s}] and sMTs\geq M_{T}, and with

ds,m:=h=0min{δm(h),(j=s+1δm2(j))1/2}.d_{s,m}\,:=\,\sum_{h=0}^{\infty}\min\Big{\{}\delta_{m}(h)\,,\,\big{(}\sum_{j=s+1}^{\infty}\delta_{m}^{2}(j)\big{)}^{1/2}\Big{\}}. (25)
Proof.

Inequality (21) is a direct consequence of (S8) from the supplement of \citeasnounXiaoWu2014 since

j=0MTt=j+1Tak,j(XtXtjγ(j))m/2\displaystyle\big{\|}\sum_{j=0}^{M_{T}}\sum_{t=j+1}^{T}a_{k,j}\big{(}X_{t}X_{t-j}-\gamma(j)\big{)}\big{\|}_{m/2}
=\displaystyle= i,j=1,ij{0,,MT}Tak,ij(XiXjγ(ij))m/2\displaystyle\big{\|}\sum_{i,j=1,i-j\in\{0,\ldots,M_{T}\}}^{T}a_{k,i-j}\big{(}X_{i}X_{j}-\gamma(i-j)\big{)}\big{\|}_{m/2}
\displaystyle\leq C(h=0δm(h))2maxk=1,,NT(j=0MTak,j2)1/2T.\displaystyle\mbox{C}\cdot\Big{(}\sum_{h=0}^{\infty}\delta_{m}(h)\Big{)}^{2}\max_{k=1,\ldots,N_{T}}\left(\sum_{j=0}^{M_{T}}a_{k,j}^{2}\right)^{1/2}\,\sqrt{T}.

For (22), first note that Theorem S.6 in the supplement of \citeasnounXiaoWu2014 assures summability of the joint fourth order cumulants. Hence,

\begin{align*}
\Big\|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k}\Big\|_{2}^{2}
&=\frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=|j_{1}|+1}^{T}\sum_{t_{2}=|j_{2}|+1}^{T}w\big(\frac{j_{1}}{M_{T}}\big)w\big(\frac{j_{2}}{M_{T}}\big)\cos(j_{1}\lambda_{k,T})\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2})\,\gamma(t_{1}-t_{2}+|j_{2}|-|j_{1}|) \\
&\quad+\frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=|j_{1}|+1}^{T}\sum_{t_{2}=|j_{2}|+1}^{T}w\big(\frac{j_{1}}{M_{T}}\big)w\big(\frac{j_{2}}{M_{T}}\big)\cos(j_{1}\lambda_{k,T})\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2}+|j_{2}|)\,\gamma(t_{1}-t_{2}-|j_{1}|)+R^{(1)}_{T,k} \tag{26}
\end{align*}

with supk|RT,k(1)|=o(1)\sup_{k}|R_{T,k}^{(1)}|=o(1). For the first summand on the r.h.s. we get from kk|γ(k)|<\sum_{k\in\mbox{$\mathbb{N}$}}k|\gamma(k)|<\infty that

\begin{align*}
&\frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=|j_{1}|+1}^{T}\sum_{t_{2}=|j_{2}|+1}^{T}w\big(\frac{j_{1}}{M_{T}}\big)w\big(\frac{j_{2}}{M_{T}}\big)\cos(j_{1}\lambda_{k,T})\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2})\,\gamma(t_{1}-t_{2}+|j_{2}|-|j_{1}|) \\
&=\frac{1}{\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=1}^{M_{T}}\sum_{t_{1}=j_{1}+1}^{T}\sum_{t_{2}=j_{2}+1}^{T}w\big(\frac{j_{1}}{M_{T}}\big)w\big(\frac{j_{2}}{M_{T}}\big)\cos(j_{1}\lambda_{k,T})\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2})\,\gamma(t_{1}-t_{2}+j_{2}-j_{1})+R^{(2)}_{T,k} \\
&=\frac{1}{\pi^{2}M_{T}}\sum_{t\in\mathbb{Z}}\sum_{j\in\mathbb{Z}}\gamma(t)\gamma(t+j)\sum_{j_{1}=1\vee(1-j)}^{M_{T}\wedge(M_{T}-j)}w\big(\frac{j_{1}}{M_{T}}\big)w\big(\frac{j+j_{1}}{M_{T}}\big)\cos(j_{1}\lambda_{k,T}) \\
&\qquad\times\cos((j+j_{1})\lambda_{k,T})+R^{(3)}_{T,k} \\
&=\frac{1}{2\pi^{2}M_{T}}\sum_{t\in\mathbb{Z}}\sum_{j\in\mathbb{Z}}\gamma(t)\gamma(t+j)\sum_{j_{1}=1}^{M_{T}}w^{2}\big(\frac{j_{1}}{M_{T}}\big) \\
&\qquad\times\big[\cos(j\lambda_{k,T})+\cos((j+2j_{1})\lambda_{k,T})\big]+R^{(4)}_{T,k} \\
&=f^{2}(\lambda_{k,T})\,\frac{2}{M_{T}}\sum_{j_{1}=1}^{M_{T}}w^{2}\big(\frac{j_{1}}{M_{T}}\big)\,(1+\cos(2j_{1}\lambda_{k,T}))+R^{(5)}_{T,k},
\end{align*}

where supk|RT,k()|=o(1),=2,,5.\sup_{k}|R_{T,k}^{(\ell)}|=o(1),~{}\ell=2,\dots,5. Similarly, we obtain from kk2|γ(k)|<\sum_{k\in\mbox{$\mathbb{N}$}}k^{2}|\gamma(k)|<\infty for the second summand on the r.h.s. of (26)

\begin{align*}
&\frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=|j_{1}|+1}^{T}\sum_{t_{2}=|j_{2}|+1}^{T}w\big(\frac{j_{1}}{M_{T}}\big)w\big(\frac{j_{2}}{M_{T}}\big)\cos(j_{1}\lambda_{k,T})\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2}+|j_{2}|)\,\gamma(t_{1}-t_{2}-|j_{1}|) \\
&=\frac{1}{\pi^{2}M_{T}}\sum_{t\in\mathbb{Z}}\sum_{j=2}^{2M_{T}}\gamma(t)\gamma(t-j)\sum_{j_{1}=1\vee(j-M_{T})}^{M_{T}\wedge(j-1)}w\big(\frac{j_{1}}{M_{T}}\big)w\big(\frac{j-j_{1}}{M_{T}}\big)\cos(j_{1}\lambda_{k,T}) \\
&\qquad\times\cos((j-j_{1})\lambda_{k,T})+R_{T,k}^{(6)} \\
&=\frac{1}{\pi^{2}M_{T}}\sum_{t\in\mathbb{Z}}\sum_{j=2}^{\sqrt{M_{T}}}\gamma(t)\gamma(t-j)\sum_{j_{1}=1}^{\sqrt{M_{T}}-1}w\big(\frac{j_{1}}{M_{T}}\big)w\big(\frac{j-j_{1}}{M_{T}}\big)\cos(j_{1}\lambda_{k,T}) \\
&\qquad\times\cos((j-j_{1})\lambda_{k,T})+R_{T,k}^{(7)} \\
&=R_{T,k}^{(8)},
\end{align*}

where $\sup_{k}|R_{T,k}^{(\ell)}|=o(1)$, $\ell=6,\,7,\,8$. This finishes the proof of (22).
Inequality (23) in turn can be deduced from Proposition 1 in \citeasnounLiuWu2010 as follows

t=1T(Zt,kZ~t,k(s))m/2CTds,mmaxk=1,,NT(j=0MTak,j2)1/2h=0δm(h).\Big{\|}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big{\|}_{m/2}\leq\mbox{C}\,\sqrt{T}\,d_{s,m}\,\max_{k=1,\ldots,N_{T}}\left(\sum_{j=0}^{M_{T}}a_{k,j}^{2}\right)^{1/2}\,\sum_{h=0}^{\infty}\delta_{m}(h).

Proof of Theorem 1.

We mainly follow the strategy of the proof of Theorem 1 in \citeasnounZhang_etal2022 although some of the conditions used there are not fulfilled in our case. In particular, a different mm-dependent approximation Z~t,k(s)\widetilde{Z}^{(s)}_{t,k}, i.e. m=2sm=2s, is used in our proof which then leads to an improved rate of convergence. Using the notation hτ,τ,xh_{\tau,\tau,x} as in \citeasnounZhang_etal2022, we get the bound

supx|P{maxk=1,,NTTMT|f^T(λk,T)Ef^T(λk,T)|x}\displaystyle\sup_{x\in\mathbb{R}}\Big{|}P\Big{\{}\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\Big{|}\widehat{f}_{T}(\lambda_{k,T})-E\,\widehat{f}_{T}(\lambda_{k,T})\Big{|}\leq x\Big{\}}
P{maxk=1,,NT|ξk|x}|\displaystyle\quad\quad-P\Big{\{}\max_{k=1,\ldots,N_{T}}|\xi_{k}|\leq x\Big{\}}\Big{|}
\displaystyle\leq supx|Ehτ,τ,x(1Tt=1TZt,1,,1Tt=1TZt,NT)Ehτ,τ,x(ξ1,,ξNT)|\displaystyle\sup_{x\in\mbox{$\mathbb{R}$}}\,\Big{|}Eh_{\tau,\tau,x}\Big{(}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,N_{T}}\Big{)}-Eh_{\tau,\tau,x}\left(\xi_{1},\dots,\xi_{N_{T}}\right)\Big{|}
+Ct(1+log(NT)+|log(t)|),\displaystyle\quad\quad+C\,t\,\left(1+\sqrt{\log(N_{T})}+\sqrt{|\log(t)|}\right), (27)

in view of (6) and (21), where t=(1+log(2NT))/τt=(1+\log(2N_{T}))/\tau for some τ>0\tau>0 to be specified later. For the mm-dependent random variables (with m=2sm=2s) Z~t,k(s)\widetilde{Z}_{t,k}^{(s)}, k=1,2,,NTk=1,2,\ldots,N_{T}, defined in (24), by the properties of the function hτ,τ,xh_{\tau,\tau,x} and from Lemma 2, we get

\begin{align*}
&\sup_{x\in\mathbb{R}}\Big|Eh_{\tau,\tau,x}\Big(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,N_{T}}\Big) \\
&\qquad\qquad-Eh_{\tau,\tau,x}\Big(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,N_{T}}\Big)\Big| \\
&\leq C\,\tau\,E\Big[\max_{k=1,\dots,N_{T}}\Big|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big|\Big] \\
&\leq C\,\tau\,N_{T}^{2/m}\,\max_{k=1,\dots,N_{T}}\Big\|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big\|_{m/2} \\
&\leq C\,\tau\,N_{T}^{2/m}\,d_{s,m} \tag{28}
\end{align*}

with ds,md_{s,m} given in (25). The introduction of the random variables Z~t,k(s)\widetilde{Z}^{(s)}_{t,k} allows us to proceed with the classical big block/small block technique. Toward this, we define for any integer l>2sl>2s big and small blocks of the Z~t,k(s)\widetilde{Z}_{t,k}^{(s)} as

Sj,k:=1Tt=2(j1)(s+l)+12(j1)(s+l)+2lZ~t,k(s)andUj,k:=1Tt=2(j1)(s+l)+2l+1(2j(s+l))TZ~t,k(s)S_{j,k}\,:=\,\frac{1}{\sqrt{T}}\sum_{t=2(j-1)(s+l)+1}^{2(j-1)(s+l)+2l}\widetilde{Z}_{t,k}^{(s)}\quad\text{and}\quad U_{j,k}\,:=\,\frac{1}{\sqrt{T}}\sum_{t=2(j-1)(s+l)+2l+1}^{(2j(s+l))\wedge T}\widetilde{Z}_{t,k}^{(s)}

for $j=1,\dots,\lceil\frac{T}{2(s+l)}\rceil=:V_{T}$. Smoothness of $h_{\tau,\tau,x}$, Rosenthal’s inequality and similar arguments as in (21) yield

\begin{align*}
&\sup_{x\in\mathbb{R}}\Big|Eh_{\tau,\tau,x}\Big(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,N_{T}}\Big) \\
&\qquad\qquad-Eh_{\tau,\tau,x}\Big(\sum_{j=1}^{V_{T}}S_{j,1},\dots,\sum_{j=1}^{V_{T}}S_{j,N_{T}}\Big)\Big| \\
&\leq C\,\tau\,N_{T}^{2/m}\,\max_{k=1,\dots,N_{T}}\Big\|\sum_{j=1}^{V_{T}}U_{j,k}\Big\|_{m/2} \\
&\leq C\,\tau\,N_{T}^{2/m}\,\sqrt{V_{T}}\,\max_{k=1,\dots,N_{T}}\max_{j=1,\dots,V_{T}}\|U_{j,k}\|_{m/2} \\
&\leq C\,\tau\,N_{T}^{2/m}\,\sqrt{\frac{V_{T}\,s}{T}} \\
&\leq C\,\tau\,N_{T}^{2/m}\,\sqrt{\frac{s}{l}}. \tag{29}
\end{align*}
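The index sets behind the big blocks $S_{j,k}$ and the small blocks $U_{j,k}$ can be listed explicitly, which makes the block lengths $2l$ and $2s$ visible. In the sketch below the function name `blocks` is hypothetical, and cutting the last big block at $T$ is an assumption made so that the blocks partition $\{1,\dots,T\}$.

```python
import math

def blocks(T, s, l):
    """Time indices (1-based) of the big and small blocks.

    Big block j   : 2(j-1)(s+l)+1, ..., 2(j-1)(s+l)+2l   (length 2l)
    Small block j : 2(j-1)(s+l)+2l+1, ..., min(2j(s+l), T) (length at most 2s)
    """
    V = math.ceil(T / (2 * (s + l)))      # number of block pairs V_T
    big, small = [], []
    for j in range(1, V + 1):
        start = 2 * (j - 1) * (s + l)
        # Assumption: the last big block is truncated at T as well.
        big.append(range(start + 1, min(start + 2 * l, T) + 1))
        small.append(range(start + 2 * l + 1, min(2 * j * (s + l), T) + 1))
    return big, small
```

Consecutive big blocks are separated by $2s$ time points, which matches the range of dependence of the variables $\widetilde{Z}_{t,k}^{(s)}$ stated in the text.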

Next, we define centred, jointly normal random variables $(S^{*}_{j,k})_{j=1,\dots,V_{T},\,k=1,\dots,N_{T}}$ with $E[S^{*}_{j,k_{1}}S^{*}_{j,k_{2}}]=E[S_{j,k_{1}}S_{j,k_{2}}]$ and such that $(S^{*}_{j,1},\dots,S^{*}_{j,N_{T}})$, $j=1,\dots,V_{T}$, are independent. Further, these variables are constructed such that they are independent of $(S_{j,k})_{j=1,\dots,V_{T},\,k=1,\dots,N_{T}}$. We obtain from Lindeberg’s method and using

\[
\|S_{j,k}^{\ast}\|_{m/2}\leq C_{m}\|S^{\ast}_{j,k}\|_{2}=C_{m}\|S_{j,k}\|_{2}\leq C_{m}\|S_{j,k}\|_{m/2},
\]

that,

supx|Ehτ,τ,x(j=1VTSj,1,,j=1VTSj,NT)Ehτ,τ,x(j=1VTSj,1,,j=1VTSj,NT)|\displaystyle\sup_{x\in\mbox{$\mathbb{R}$}}\,\Big{|}Eh_{\tau,\tau,x}\Big{(}\sum_{j=1}^{V_{T}}S_{j,1},\dots,\sum_{j=1}^{V_{T}}S_{j,N_{T}}\Big{)}-Eh_{\tau,\tau,x}\Big{(}\sum_{j=1}^{V_{T}}S_{j,1}^{*},\dots,\sum_{j=1}^{V_{T}}S_{j,N_{T}}^{*}\Big{)}\Big{|} (30)
C(1+τ3)j=1VTEmaxk=1,,NT[|Sj,k|3+|Sj,k|3]\displaystyle\leq C\,(1+\tau^{3})\,\sum_{j=1}^{V_{T}}E\max_{k=1,\dots,N_{T}}[|S_{j,k}|^{3}\,+\,|S_{j,k}^{*}|^{3}]
Cm(1+τ3)NT6/mVT(lT)3/2.\displaystyle\leq C^{\prime}_{m}\,(1+\tau^{3})\,\,N_{T}^{6/m}\,V_{T}\,\left(\frac{l}{T}\right)^{3/2}.

Using the bounds

1Ti=1T(Zi,kZ~i,k(s))m/2Cds,m,1Ti=1TZi,km/2C,\|\frac{1}{\sqrt{T}}\sum_{i=1}^{T}(Z_{i,k}-\widetilde{Z}_{i,k}^{(s)})\|_{m/2}\leq Cd_{s,m},\ \ \|\frac{1}{\sqrt{T}}\sum_{i=1}^{T}Z_{i,k}\|_{m/2}\leq C,

derived in Lemma 2,

\begin{align*}
\Big|\sum_{j_{1}=1}^{V_{T}}\sum_{j_{2}=1}^{V_{T}}ES_{j_{1},k_{1}}U_{j_{2},k_{2}}\Big| &\leq \Big\|\sum_{j=1}^{V_{T}}S_{j,k_{1}}\Big\|_{m/2}\,\Big\|\sum_{j=1}^{V_{T}}U_{j,k_{2}}\Big\|_{m/2} \\
&\leq V_{T}\max_{j=1,\ldots,V_{T}}\|S_{j,k_{1}}\|_{m/2}\max_{j=1,\ldots,V_{T}}\|U_{j,k_{2}}\|_{m/2} \\
&\leq C\,V_{T}\,\frac{\sqrt{ls}}{T}
\end{align*}

and

|j1=1VTj2=1VTEUj1,k1Uj2,k2|VTmaxj=1,,VTUj,k1m/2maxj=1,,VTUj,k2m/2CVTsT,|\sum_{j_{1}=1}^{V_{T}}\sum_{j_{2}=1}^{V_{T}}EU_{j_{1},k_{1}}U_{j_{2},k_{2}}|\leq V_{T}\max_{j=1,\ldots,V_{T}}\|U_{j,k_{1}}\|_{m/2}\max_{j=1,\ldots,V_{T}}\|U_{j,k_{2}}\|_{m/2}\leq CV_{T}\frac{s}{T},

we get

supx|Ehτ,τ,x(j=1VTSj,1,,j=1VTSj,NT)\displaystyle\sup_{x\in\mbox{$\mathbb{R}$}}\,\Big{|}Eh_{\tau,\tau,x}\Big{(}\sum_{j=1}^{V_{T}}S^{\ast}_{j,1},\dots,\sum_{j=1}^{V_{T}}S^{\ast}_{j,N_{T}}\Big{)} Ehτ,τ,x(ξ1,,ξNT)|\displaystyle-Eh_{\tau,\tau,x}\Big{(}\xi_{1},\ldots,\xi_{N_{T}}\Big{)}\Big{|} (31)
Cτ2(ds,m+VTslT).\displaystyle\leq C\,\tau^{2}\,\big{(}d_{s,m}+V_{T}\frac{\sqrt{sl}}{T}\big{)}.

Now, let $\tau\sim T^{\lambda}$, $s\sim T^{a_{s}}$ and $l\sim T^{a_{l}}$ for some positive numbers $\lambda$, $a_{s}$ and $a_{l}$. For (27), (28), (29), (30) and (31) to vanish asymptotically, the following conditions are sufficient:

  (i) $t\big(1+\sqrt{\log(N_{T})}+\sqrt{|\log(t)|}\big)\rightarrow 0$,

  (ii) $2\lambda+4/m+a_{s}<a_{l}$,   $6\lambda+12/m+a_{l}<1$,   $4\lambda+a_{s}<a_{l}$ and

  (iii) $\big(T^{\lambda+2/m}+T^{2\lambda}\big)d_{s,m}\rightarrow 0$.
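The three inequalities in (ii) only involve the exponents $\lambda$, $a_{s}$, $a_{l}$ and the moment order $m$, so whether an admissible $a_{l}$ exists can be checked mechanically. The following sketch uses the hypothetical function name `admissible_al` and returns the open interval of admissible values, or `None` if it is empty.

```python
def admissible_al(m, lam, a_s):
    """Open interval (lower, upper) of exponents a_l satisfying condition (ii).

    The two lower bounds come from 2*lam + 4/m + a_s < a_l and
    4*lam + a_s < a_l, the upper bound from 6*lam + 12/m + a_l < 1.
    """
    lower = a_s + max(4.0 * lam, 2.0 * lam + 4.0 / m)
    upper = 1.0 - 6.0 * lam - 12.0 / m
    return (lower, upper) if lower < upper else None
```

For instance, `admissible_al(32, 0.01, 0.1)` gives an interval of roughly (0.245, 0.565), while for `m = 16` and `lam = 0.05` the upper bound is already negative and no admissible exponent exists.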

For (i) and because t=(1+log(2NT))/τt=(1+\log(2N_{T}))/\tau, it is easily seen that

t(1+log(NT)+|log(t)|)CTλ(log(NT))3/2,t\big{(}1+\sqrt{\log(N_{T})}+\sqrt{|\log(t)|}\big{)}\leq CT^{-\lambda}\big{(}\log(N_{T})\big{)}^{3/2},

which converges to zero provided λ>0\lambda>0. Furthermore, since m>16m>16, (ii) is satisfied if as+max{4λ,2λ+4/m}<al<16λ12/ma_{s}+\max\{4\lambda,2\lambda+4/m\}<a_{l}<1-6\lambda-12/m. Finally, for (iii) notice first that for 0<δ<α10<\delta<\alpha-1 and because sTass\sim T^{a_{s}},

(j=s+1δm2(j))1/2\displaystyle\big{(}\sum_{j=s+1}^{\infty}\delta^{2}_{m}(j)\big{)}^{1/2} =(j=s+1j2α+1+2δj(1+2δ))1/2\displaystyle=\big{(}\sum_{j=s+1}^{\infty}j^{-2\alpha+1+2\delta}\cdot j^{-(1+2\delta)}\big{)}^{1/2}
Csα+δ+1/2CTas(αδ1/2).\displaystyle\leq Cs^{-\alpha+\delta+1/2}\leq CT^{-a_{s}(\alpha-\delta-1/2)}.

Now, for r>0r>0 we have,

h=0min{(1+h)α,Tas(α1/2δ)}\displaystyle\sum_{h=0}^{\infty}\min\{(1+h)^{-\alpha},T^{-a_{s}(\alpha-1/2-\delta)}\}
h=0TrTas(α1/2δ)+h=Tr(1+h)α+1+δh1δ\displaystyle\leq\sum_{h=0}^{T^{r}}T^{-a_{s}(\alpha-1/2-\delta)}+\sum_{h=T^{r}}^{\infty}(1+h)^{-\alpha+1+\delta}h^{-1-\delta}
Tras(α1/2δ)+CTr(α1δ).\displaystyle\leq T^{r-a_{s}(\alpha-1/2-\delta)}+CT^{-r(\alpha-1-\delta)}.

Balancing both terms in the last bound above yields,

ras(α1/2δ)=r(α1δ)r=as(α1/2δ)αδr-a_{s}(\alpha-1/2-\delta)=-r(\alpha-1-\delta)\ \Longleftrightarrow r=\frac{a_{s}(\alpha-1/2-\delta)}{\alpha-\delta}

and, therefore,

h=0min{δm(h),(j=s+1δm2(j))1/2}=Tas(α1/2δ)(α1δ)αδ0.\sum_{h=0}^{\infty}\min\big{\{}\delta_{m}(h),\big{(}\sum_{j=s+1}^{\infty}\delta^{2}_{m}(j)\big{)}^{1/2}\big{\}}=T^{-\frac{\displaystyle a_{s}(\alpha-1/2-\delta)(\alpha-1-\delta)}{\displaystyle\alpha-\delta}}\rightarrow 0.
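The balancing step can be verified numerically for concrete exponents. The sketch below uses the hypothetical function names `balance` and `margin_32`; the parameter values in the usage note are arbitrary illustrations satisfying $0<\delta<\alpha-1$ and are not taken from the paper.

```python
def balance(a_s, alpha, delta):
    """Return r and the two exponents of T that are balanced by this r."""
    r = a_s * (alpha - 0.5 - delta) / (alpha - delta)
    exp_first = r - a_s * (alpha - 0.5 - delta)   # exponent of the first term
    exp_second = -r * (alpha - 1.0 - delta)       # exponent of the second term
    return r, exp_first, exp_second

def margin_32(a_s, alpha, delta, lam, m):
    """Left-hand side of (32); a positive value means condition (iii) holds."""
    _, _, exponent = balance(a_s, alpha, delta)
    return -exponent - max(lam + 2.0 / m, 2.0 * lam)
```

With `a_s = 0.2`, `alpha = 3` and `delta = 0.5` both exponents equal $-0.24$, and `margin_32(0.2, 3, 0.5, 0.01, 32)` is positive.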

Hence for (iii) to be satisfied,

as(α1/2δ)(α1δ)αδmax{λ+2/m, 2λ}>0\frac{\displaystyle a_{s}(\alpha-1/2-\delta)(\alpha-1-\delta)}{\displaystyle\alpha-\delta}-\max\{\lambda+2/m\,,\,2\lambda\}>0 (32)

should hold true. Notice that for $\alpha$ satisfying condition (7), (32) holds true and, at the same time, the rate in (32) is larger than or equal to $\min\{\kappa_{1},\kappa_{2}\}$. ∎

Proof of Theorem 2.

Let

RT(λk,T)=TMTf^T(λk,T)Ef^T(λk,T)f(λk,T)(f(λk,T)f^T(λk,T)1)R_{T}(\lambda_{k,T})=\sqrt{\frac{T}{M_{T}}}\,\frac{\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})}{f(\lambda_{k,T})}\,\left(\frac{f(\lambda_{k,T})}{\widehat{f}_{T}(\lambda_{k,T})}-1\right)

and t=(1+log(2NT))/τt=(1+\log(2N_{T}))/\tau with τTλ\tau\sim T^{\lambda}. Then we can split up

P(maxk=1,,NT|ξ~k|x)P(maxk=1,,NTTMT|f^T(λk,T)Ef^T(λk,T)|f^T(λk,T)x)\displaystyle P\big{(}\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big{)}-P\big{(}\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\big{)}
P(maxk=1,,NT|ξ~k|x)P(maxk=1,,NT|ξ~k|xt)\displaystyle\leq P\big{(}\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big{)}-P\big{(}\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x-t\big{)}
+P(maxk=1,,NT|ξ~k|xt)P(maxk=1,,NTTMT|f^T(λk,T)Ef^T(λk,T)|f(λk,T)xt)\displaystyle\quad+\,P\big{(}\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x-t\big{)}-P\big{(}\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{f(\lambda_{k,T})}\leq x-t\big{)}
+P(maxk=1,,NT|RT(λk,T)|>t).\displaystyle\quad+\,P\big{(}\max_{k=1,\ldots,N_{T}}|R_{T}(\lambda_{k,T})|>t\big{)}.

One can proceed analogously to obtain a lower bound which then results in

supx|P(maxk=1,,NT|ξ~k|x)P(maxk=1,,NTTMT|f^T(λk,T)Ef^T(λk,T)|f^T(λk,T)x)|\displaystyle\sup_{x\in\mbox{$\mathbb{R}$}}\big{|}\,P\big{(}\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big{)}-P\big{(}\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\big{)}\,\big{|}
supx|P(maxk=1,,NT|ξ~k|x)P(maxk=1,,NT|ξ~k|xt)|\displaystyle\leq\sup_{x\in\mbox{$\mathbb{R}$}}\big{|}P\big{(}\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big{)}-P\big{(}\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x-t\big{)}\big{|}
+supx|P(maxk=1,,NT|ξ~k|x)P(maxk=1,,NTTMT|f^T(λk,T)Ef^T(λk,T)|f(λk,T)x)|\displaystyle\quad+\,\sup_{x\in\mbox{$\mathbb{R}$}}\big{|}P\big{(}\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big{)}-P\big{(}\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{f(\lambda_{k,T})}\leq x\big{)}|
+supx|P(maxk=1,,NT|RT(λk,T)|>t)|\displaystyle\quad+\,\sup_{x\in\mbox{$\mathbb{R}$}}\big{|}P\big{(}\max_{k=1,\ldots,N_{T}}|R_{T}(\lambda_{k,T})|>t\big{)}\big{|}
=P1+P2+P3\displaystyle=P_{1}+P_{2}+P_{3}

with obvious abbreviations for P1,P2,P_{1},~{}P_{2}, and P3P_{3}. Lemma A.1 in \citeasnounZhang_etal2022 gives

P1Ct(1+log(NT)+|log(t)|).P_{1}\,\leq\,C\,t\,\left(1+\sqrt{\log(N_{T})}+\sqrt{|\log(t)|}\right).

The desired rate for P2P_{2} can be derived with exactly the same arguments as in the proof of Theorem 1 since the spectral density is assumed to be uniformly bounded from below.

Finally, for P3P_{3} we make use of the assumed bias property (11) of f^T\widehat{f}_{T}, that is of

supλ|Ef^T(λ)f(λ)|=O(MT2).\sup_{\lambda}|E\widehat{f}_{T}(\lambda)-f(\lambda)|=O(M_{T}^{-2}).

This allows us to bound $P_{3}$ as follows. We have

|RT(λk,T)|\displaystyle|R_{T}(\lambda_{k,T})| =TMT|f^T(λk,T)Ef^T(λk,T)|f(λk,T)|f(λk,T)f^T(λk,T)1|\displaystyle=\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{f(\lambda_{k,T})}\,\Big{|}\frac{f(\lambda_{k,T})}{\widehat{f}_{T}(\lambda_{k,T})}-1\Big{|}
=TMT|f^T(λk,T)Ef^T(λk,T)||f^T(λk,T)f(λk,T)|f^T(λk,T)f(λk,T).\displaystyle=\sqrt{\frac{T}{M_{T}}}\,\big{|}\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})\big{|}\frac{|\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})\,f(\lambda_{k,T})}\,.

We distinguish whether $\max_k|\widehat{f}_T(\lambda_{k,T})-f(\lambda_{k,T})|$ is at most, or larger than, $\inf_{\lambda}f(\lambda)/2$. In the first case $\widehat{f}_T(\lambda_{k,T})\geq\inf_{\lambda}f(\lambda)/2$ for all $k$, so that $1/(\widehat{f}_T(\lambda_{k,T})f(\lambda_{k,T}))\leq 2/\inf_{\lambda}f(\lambda)^2$. This results in

\begin{align*}
&P\Big(\max_{k=1,\ldots,N_T}|R_T(\lambda_{k,T})|>t\Big) \\
&\leq P\Big(\sqrt{\frac{T}{M_T}}\max_{k=1,\ldots,N_T}|\widehat{f}_T(\lambda_{k,T})-E\widehat{f}_T(\lambda_{k,T})|\,\frac{2}{\inf_{\lambda}f(\lambda)^2}\,|\widehat{f}_T(\lambda_{k,T})-f(\lambda_{k,T})|>t\Big) \\
&\quad+P\Big(\max_{k=1,\ldots,N_T}|\widehat{f}_T(\lambda_{k,T})-f(\lambda_{k,T})|>\inf_{\lambda}f(\lambda)/2\Big)\,.
\end{align*}

The first summand is bounded through

\begin{align*}
&Ct^{-1}\sqrt{\frac{T}{M_T}}\Big(E\max_{k=1,\ldots,N_T}|\widehat{f}_T(\lambda_{k,T})-E\widehat{f}_T(\lambda_{k,T})|^2
+E\max_{k=1,\ldots,N_T}|\widehat{f}_T(\lambda_{k,T})-E\widehat{f}_T(\lambda_{k,T})|\cdot M_T^{-2}\Big) \\
&\leq\,Ct^{-1}\Big(N_T^{4/m}\sqrt{\frac{M_T}{T}}+N_T^{2/m}M_T^{-2}\Big)\,,
\end{align*}

where for arbitrary random variables $U_k$ we make use of $E\max_{k=1,\ldots,N_T}|U_k|\leq\big(E\sum_{k=1}^{N_T}|U_k|^r\big)^{1/r}\leq N_T^{1/r}\max_{k=1,\ldots,N_T}\|U_k\|_r$, $r\geq 1$, and the last inequality follows from Lemma 2.
A similar consideration for the second summand finally leads to

\[
P\Big(\max_{k=1,\ldots,N_T}|R_T(\lambda_{k,T})|>t\Big)\leq\,C\big(t^{-1}\,N_T^{4/m}+N_T^{2/m}\big)\,\Big(\sqrt{\frac{M_T}{T}}+M_T^{-2}\Big)\,.
\]
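The elementary maximal inequality used above can be checked numerically. The following is a minimal sketch, not part of the proof: the function name `max_moment_chain` and the choice of i.i.d. standard normal $U_k$ are illustrative assumptions. It computes Monte Carlo versions of $E\max_k|U_k|$, $(E\sum_k|U_k|^r)^{1/r}$ and $N^{1/r}\max_k\|U_k\|_r$.

```python
import numpy as np


def max_moment_chain(samples, r):
    """Monte Carlo versions of the three quantities in the maximal inequality.

    samples has shape (n_rep, N): each row is one draw of (U_1, ..., U_N).
    """
    abs_u = np.abs(samples)
    n_vars = samples.shape[1]
    # E max_k |U_k|
    lhs = abs_u.max(axis=1).mean()
    # (E sum_k |U_k|^r)^(1/r)
    mid = (abs_u ** r).sum(axis=1).mean() ** (1.0 / r)
    # N^(1/r) * max_k ||U_k||_r
    norms = (abs_u ** r).mean(axis=0) ** (1.0 / r)
    rhs = n_vars ** (1.0 / r) * norms.max()
    return lhs, mid, rhs


rng = np.random.default_rng(0)
draws = rng.standard_normal((20_000, 50))  # illustrative assumption: i.i.d. N(0, 1)
lhs, mid, rhs = max_moment_chain(draws, r=4)
print(f"E max = {lhs:.3f} <= {mid:.3f} <= {rhs:.3f}")
```

Because the empirical distribution of the draws is itself a probability measure, the chain of inequalities holds for the simulated quantities up to floating-point error, whatever the distribution of the $U_k$.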

Using $M_T\sim T^{a_s}$ and $t^{-1}=T^{\lambda}\big/(1+\log(2N_T))$, the above bound for $P(\max_{k=1,\ldots,N_T}|R_T(\lambda_{k,T})|>t)$ implies the order

\[
\frac{T^{\lambda+a_s/2}\,T^{4/m}}{T^{1/2}\,(1+\log(2N_T))}+\frac{T^{\lambda}\,T^{4/m}}{T^{2a_s}\,(1+\log(2N_T))}
\]

for $P_3$, which for $\lambda+\kappa\leq\min\{1/2-a_s/2-4/m\,,\,2a_s-4/m\}$ leads to $P_3=O(T^{-\kappa}/\log(2N_T))$. ∎
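To get a feeling for the exponent condition at the end of this proof, one can tabulate the right-hand side $\min\{1/2-a_s/2-4/m,\,2a_s-4/m\}$. The sketch below is purely illustrative: the helper name `max_rate_exponent` and the values of $a_s$ and $m$ are assumptions and are not taken from the paper. Ignoring any other restrictions on $a_s$, the two arguments of the minimum coincide at $a_s=1/5$.

```python
def max_rate_exponent(a_s, m):
    """Largest admissible value of lambda + kappa for given a_s and m."""
    return min(0.5 - a_s / 2.0 - 4.0 / m, 2.0 * a_s - 4.0 / m)


# Hypothetical values, chosen only to show the trade-off in a_s.
for a_s in (0.1, 0.2, 0.3):
    print(a_s, round(max_rate_exponent(a_s, m=40), 4))
```

A small $a_s$ makes the bias-driven term $2a_s-4/m$ binding, while a large $a_s$ makes the variance-driven term $1/2-a_s/2-4/m$ binding.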

Proof of Proposition 1.

First, we split up

\begin{align*}
&\big|\widehat{\sigma}_T(j_1,j_2)\,-\,\sigma_T(j_1,j_2)\big| \\
&\leq\Big|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}K\Big(\frac{t-s}{b_T}\Big)\big(X_tX_{t-j_1}-\gamma(j_1)\big)\big(X_sX_{s-j_2}-\gamma(j_2)\big)\,-\,\sigma(j_1,j_2)\Big| \\
&\quad+\,|\widehat{\gamma}(j_1)-\gamma(j_1)|\,\Big|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}K\Big(\frac{t-s}{b_T}\Big)\big(X_sX_{s-j_2}-\gamma(j_2)\big)\Big| \\
&\quad+\,|\widehat{\gamma}(j_2)-\gamma(j_2)|\,\Big|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}K\Big(\frac{t-s}{b_T}\Big)\big(X_tX_{t-j_1}-\gamma(j_1)\big)\Big| \\
&\quad+\,|\widehat{\gamma}(j_1)-\gamma(j_1)|\,|\widehat{\gamma}(j_2)-\gamma(j_2)|\,\Big|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}K\Big(\frac{t-s}{b_T}\Big)\Big|. \tag{33}
\end{align*}

While the last summand is of order $O(b_T\,T^{-1})$ (see (E.4) in Zhang et al. (2022)), the two middle terms can be bounded from above by $O(b_T\,T^{-1/2})$ using the Cauchy-Schwarz inequality. Both bounds hold uniformly in $j_1$ and $j_2$. Let $Y_{t,j}=X_tX_{t-j}-\gamma(j)$ and

\[
\widetilde{\sigma}_T(j_1,j_2)=\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}K\Big(\frac{t-s}{b_T}\Big)Y_{t,j_1}Y_{s,j_2}.
\]
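The double sum defining $\widetilde{\sigma}_T(j_1,j_2)$ can be evaluated directly. The following is a minimal sketch under stated assumptions: the function names `bartlett` and `sigma_tilde` are hypothetical, the Bartlett kernel is only one possible choice of $K$, and the true autocovariances $\gamma(j_1)$, $\gamma(j_2)$ are passed in as known quantities (in practice they are unknown, which is why the proof compares with the version based on $\widehat{\gamma}$).

```python
import numpy as np


def bartlett(u):
    """Bartlett kernel K(u) = max(1 - |u|, 0); an illustrative choice only."""
    return np.clip(1.0 - np.abs(u), 0.0, None)


def sigma_tilde(x, j1, j2, gamma_j1, gamma_j2, b, kernel=bartlett):
    """Kernel-weighted double sum defining sigma-tilde_T(j1, j2).

    x holds X_1, ..., X_T; gamma_j1 and gamma_j2 are the (assumed known)
    autocovariances gamma(j1) and gamma(j2); b is the bandwidth b_T.
    """
    x = np.asarray(x, dtype=float)
    T = x.size
    # Y_{t,j} = X_t X_{t-j} - gamma(j) for t = j+1, ..., T (1-based time index)
    y1 = x[j1:] * x[: T - j1] - gamma_j1
    y2 = x[j2:] * x[: T - j2] - gamma_j2
    t = np.arange(j1 + 1, T + 1)
    s = np.arange(j2 + 1, T + 1)
    # Matrix of kernel weights K((t - s) / b_T); O(T^2) memory, fine for small T
    weights = kernel((t[:, None] - s[None, :]) / b)
    return float(y1 @ weights @ y2) / T


rng = np.random.default_rng(1)
x = rng.standard_normal(400)  # white noise, so gamma(j) = 0 for j >= 1
value = sigma_tilde(x, j1=1, j2=2, gamma_j1=0.0, gamma_j2=0.0, b=10.0)
print(value)
```

The dense weight matrix keeps the code close to the formula; for long series one would instead sum over the lag $\ell=t-s$, as is done in the proof.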

Then the first term on the right-hand side of the bound given in (33) equals $|\widetilde{\sigma}_T(j_1,j_2)-\sigma(j_1,j_2)|$, and for this we have

\begin{align*}
|\widetilde{\sigma}_T(j_1,j_2)-\sigma(j_1,j_2)|
&\leq\Big|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}\Big(K\Big(\frac{t-s}{b_T}\Big)-1\Big)E(Y_{t,j_1}Y_{s,j_2})\Big| \\
&\quad+\Big|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}\Big(Y_{t,j_1}Y_{s,j_2}-E(Y_{t,j_1}Y_{s,j_2})\Big)K\Big(\frac{t-s}{b_T}\Big)\Big|. \tag{34}
\end{align*}

Using Lemma 1(iii), the first term on the right-hand side of (34) can be bounded by

\[
C\,\frac{1}{T}\sum_{t=1}^{T}\sum_{s=1}^{T}\Big(1-K\Big(\frac{t-s}{b_T}\Big)\Big)\frac{1}{(1+|t-s|)^{\alpha}}\leq 2C\sum_{s=0}^{\infty}\Big(1-K\Big(\frac{s}{b_T}\Big)\Big)\frac{1}{(1+s)^{\alpha}}.
\]

Let $S=b_T$. Using $|1-K(s/b_T)|\leq\sup_{u\in[0,1]}|K^{\prime}(u)|\,s/b_T$ for $0\leq s\leq S$ and $|1-K(s/b_T)|\leq 1$ for $s\geq S+1$, we get

\begin{align*}
\sum_{s=0}^{\infty}\Big(1-K\Big(\frac{s}{b_T}\Big)\Big)\frac{1}{(1+s)^{\alpha}}
&\leq\frac{\sup_{u\in[0,1]}|K^{\prime}(u)|}{b_T}\sum_{s=0}^{S}\frac{1}{(1+s)^{\alpha-1}}+\sum_{s=S+1}^{\infty}\frac{1}{(1+s)^{\alpha}},
\end{align*}

where the first term is $O(b_T^{-1})$ because $\alpha>2$. For the second term we get

\[
\sum_{s=S+1}^{\infty}\frac{1}{(1+s)^{\alpha}}\leq\int_{S+1}^{\infty}\frac{1}{x^{\alpha}}\,dx=\frac{1}{(\alpha-1)(S+1)^{\alpha-1}}=O(b_T^{-1}).
\]

Hence

\[
\max_{1\leq j_1,j_2\leq M_T}\Big|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}\Big(K\Big(\frac{t-s}{b_T}\Big)-1\Big)E(Y_{t,j_1}Y_{s,j_2})\Big|=O(b_T^{-1}).
\]
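The $O(b_T^{-1})$ order of the kernel bias sum can be illustrated numerically. The sketch below makes several assumptions that are not in the text: it uses the Bartlett kernel (for which $\sup_u|K'(u)|=1$), the decay exponent $\alpha=3$, an integer bandwidth so that $S=b_T$, and a truncation of the infinite sum at `n_terms`. The names `bias_sum` and `proof_bound` are hypothetical.

```python
import numpy as np


def bartlett(u):
    """Bartlett kernel K(u) = max(1 - |u|, 0); an illustrative choice only."""
    return np.clip(1.0 - np.abs(u), 0.0, None)


def bias_sum(b, alpha, n_terms=200_000):
    """Truncated version of sum_{s>=0} (1 - K(s/b)) / (1 + s)^alpha."""
    s = np.arange(n_terms, dtype=float)
    return float(np.sum((1.0 - bartlett(s / b)) / (1.0 + s) ** alpha))


def proof_bound(b, alpha, sup_dk=1.0):
    """Upper bound from the proof with S = b (b assumed to be an integer)."""
    s = np.arange(b + 1, dtype=float)
    head = sup_dk / b * np.sum((1.0 + s) ** (-(alpha - 1.0)))
    tail = 1.0 / ((alpha - 1.0) * (b + 1.0) ** (alpha - 1.0))
    return float(head + tail)


for b in (10, 100, 1000):
    lhs, rhs = bias_sum(b, alpha=3.0), proof_bound(b, alpha=3.0)
    # b * lhs staying bounded is what the O(1/b_T) statement asserts
    print(b, lhs, rhs, b * lhs)
```

In this example the product of $b_T$ and the sum stays bounded as $b_T$ grows, in line with the stated order.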

Consider next the second term of the bound given in (34) and observe that this term can be bounded by

\[
M_T^{8/m}\max_{1\leq j_1,j_2\leq M_T}\Big\|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}\Big(Y_{t,j_1}Y_{s,j_2}-E(Y_{t,j_1}Y_{s,j_2})\Big)K\Big(\frac{t-s}{b_T}\Big)\Big\|_{m/4}, \tag{35}
\]

where

\begin{align*}
&\Big\|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}\Big(Y_{t,j_1}Y_{s,j_2}-E(Y_{t,j_1}Y_{s,j_2})\Big)K\Big(\frac{t-s}{b_T}\Big)\Big\|_{m/4} \\
&\leq\frac{1}{T}\sum_{\ell=0}^{T-1}K(\ell/b_T)\Big\|\sum_{t=1}^{T-\ell}\big(Y_{t,j_2}Y_{t+\ell,j_1}-E(Y_{t,j_2}Y_{t+\ell,j_1})\big)\Big\|_{m/4} \\
&\quad+\frac{1}{T}\sum_{\ell=1}^{T-1}K(\ell/b_T)\Big\|\sum_{s=1}^{T-\ell}\big(Y_{s,j_1}Y_{s+\ell,j_2}-E(Y_{s,j_1}Y_{s+\ell,j_2})\big)\Big\|_{m/4}.
\end{align*}

Recall that for $s\geq 0$, ${\mathcal{F}}_{r,s}$ denotes the $\sigma$-algebra generated by the set of random variables $\{e_r,e_{r-1},\ldots,e_{r-s}\}$. Note that, by Assumption A.1, $Y_{t,j_2}Y_{t+\ell,j_1}=H(e_{t+\ell},e_{t+\ell-1},\ldots)$ for some measurable function $H$, and that the $e_t$'s are i.i.d. Then

\begin{align*}
Y_{t,j_2}Y_{t+\ell,j_1}-E(Y_{t,j_2}Y_{t+\ell,j_1})
&=E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,\infty})-E(Y_{t,j_2}Y_{t+\ell,j_1}) \\
&=\lim_{q\rightarrow\infty}E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,q})-E(Y_{t,j_2}Y_{t+\ell,j_1}) \\
&=E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,0})-E(Y_{t,j_2}Y_{t+\ell,j_1}) \\
&\quad+\sum_{r=1}^{\infty}\big\{E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r-1})\big\},
\end{align*}

a.s. Therefore,

\begin{align*}
&\Big\|\sum_{t=1}^{T-\ell}\big(Y_{t,j_2}Y_{t+\ell,j_1}-E(Y_{t,j_2}Y_{t+\ell,j_1})\big)\Big\|_{m/4} \\
&\leq\Big\|\sum_{t=1}^{T-\ell}\big\{E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,0})-E(Y_{t,j_2}Y_{t+\ell,j_1})\big\}\Big\|_{m/4} \\
&\quad+\sum_{r=1}^{\infty}\Big\|\sum_{t=1}^{T-\ell}\big\{E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r-1})\big\}\Big\|_{m/4} \\
&=S_{1,n}+S_{2,n},
\end{align*}

with an obvious notation for $S_{1,n}$ and $S_{2,n}$. Consider $S_{1,n}$ and observe that $E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,0})=\widetilde{g}(e_{t+\ell})$ for some measurable function $\widetilde{g}$ and, therefore, $E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,0})$ and $E(Y_{s,j_2}Y_{s+\ell,j_1}|{\mathcal{F}}_{s+\ell,0})$ are independent for $t\neq s$. Hence,

\begin{align*}
S_{1,n}
&\leq C\sqrt{\sum_{t=1}^{T-\ell}\big\|E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,0})-E(Y_{t,j_2}Y_{t+\ell,j_1})\big\|_{m/4}^{2}} \\
&\leq C\sqrt{T}\,\big\|E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,0})-E(Y_{t,j_2}Y_{t+\ell,j_1})\big\|_{m/4}.
\end{align*}

To bound the term $S_{2,n}$, define first

\[
W_u=\sum_{t=T-\ell-u+1}^{T-\ell}\big\{E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r-1})\big\}
\]

and denote by ${\mathcal{A}}_u$ the $\sigma$-algebra generated by the set $\{e_T,e_{T-1},\ldots,e_{T-u+1-r}\}$. Notice that $W_u$ is measurable with respect to ${\mathcal{A}}_u$ and that ${\mathcal{A}}_u\subset{\mathcal{A}}_{u+1}$. Furthermore,

\begin{align*}
&E(W_{u+1}-W_u|{\mathcal{A}}_u) \\
&=E\Big[E\big(Y_{T-\ell-u,j_2}Y_{T-u,j_1}|{\mathcal{F}}_{T-u,r}\big)-E\big(Y_{T-\ell-u,j_2}Y_{T-u,j_1}|{\mathcal{F}}_{T-u,r-1}\big)\Big|{\mathcal{A}}_u\Big] \\
&=E\big(Y_{T-\ell-u,j_2}Y_{T-u,j_1}|{\mathcal{F}}_{T-u,r-1}\big)-E\big(Y_{T-\ell-u,j_2}Y_{T-u,j_1}|{\mathcal{F}}_{T-u,r-1}\big) \\
&=0.
\end{align*}

Since $(W_u)_u$ forms a martingale with respect to $({\mathcal{A}}_u)_u$, we get

\[
\|W_{T-\ell}\|_{m/4}\leq C\sqrt{\sum_{t=1}^{T-\ell}\big\|E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r-1})\big\|^{2}_{m/4}}.
\]

Recall that $Y_{t,j_2}Y_{t+\ell,j_1}=H(e_{t+\ell},e_{t+\ell-1},\ldots)$ for some measurable function $H$ and let

\[
Y_{t,j_2}(r-\ell)\,Y_{t+\ell,j_1}(r)=H(e_{t+\ell},e_{t+\ell-1},\ldots,e_{t+\ell-r+1},e^{\prime}_{t+\ell-r},e_{t+\ell-r-1},\ldots),
\]

where $\{e_t^{\prime},t\in\mathbb{Z}\}$ is an independent copy of $\{e_t,t\in\mathbb{Z}\}$. Then,

\begin{align*}
&\|E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r-1})\|_{m/4} \\
&=\|E(Y_{t,j_2}Y_{t+\ell,j_1}|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_2}(r-\ell)Y_{t+\ell,j_1}(r)|{\mathcal{F}}_{t+\ell,r})\|_{m/4} \\
&\leq\|Y_{t,j_2}Y_{t+\ell,j_1}-Y_{t,j_2}(r-\ell)Y_{t+\ell,j_1}(r)\|_{m/4},
\end{align*}

that is,

\[
\|W_{T-\ell}\|_{m/4}\leq C\sqrt{T}\,\|Y_{t,j_2}Y_{t+\ell,j_1}-Y_{t,j_2}(r-\ell)Y_{t+\ell,j_1}(r)\|_{m/4}.
\]

Hence,

\begin{align*}
S_{2,n}&=\sum_{r=1}^{\infty}\|W_{T-\ell}\|_{m/4} \\
&\leq C\sqrt{T}\sum_{r=1}^{\infty}\|Y_{t,j_2}Y_{t+\ell,j_1}-Y_{t,j_2}(r-\ell)Y_{t+\ell,j_1}(r)\|_{m/4} \\
&\leq C\sqrt{T}\Big\{\sum_{r=1}^{\infty}\|Y_{t+\ell,j_1}\|_{m/2}\,\|Y_{t,j_2}(r-\ell)-Y_{t,j_2}\|_{m/2} \\
&\qquad\qquad+\sum_{r=1}^{\infty}\|Y_{t,j_2}(r-\ell)\|_{m/2}\,\|Y_{t+\ell,j_1}(r)-Y_{t+\ell,j_1}\|_{m/2}\Big\}\ \leq\ C\sqrt{T},
\end{align*}

since

\begin{align*}
\|Y_{t,j_2}(r-\ell)-Y_{t,j_2}\|_{m/2}
&=\|X_t(r-\ell)X_{t-j_2}(r-\ell-j_2)-X_tX_{t-j_2}\|_{m/2} \\
&\leq\|X_t(r-\ell)-X_t\|_{m}\,\|X_{t-j_2}(r-\ell-j_2)\|_{m} \\
&\quad+\|X_t\|_{m}\,\|X_{t-j_2}(r-\ell-j_2)-X_{t-j_2}\|_{m} \\
&\leq C(r-\ell)^{-\alpha}
\end{align*}

and an analogous argument for $\|Y_{t+\ell,j_1}(r)-Y_{t+\ell,j_1}\|_{m/2}$. Therefore,

\begin{align*}
&\Big\|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}\Big(Y_{t,j_1}Y_{s,j_2}-E(Y_{t,j_1}Y_{s,j_2})\Big)K\Big(\frac{t-s}{b_T}\Big)\Big\|_{m/4} \\
&\leq\frac{1}{T}\sum_{\ell=0}^{T-1}K(\ell/b_T)\Big\|\sum_{t=1}^{T-\ell}\big(Y_{t,j_2}Y_{t+\ell,j_1}-E(Y_{t,j_2}Y_{t+\ell,j_1})\big)\Big\|_{m/4} \\
&\quad+\frac{1}{T}\sum_{\ell=1}^{T-1}K(\ell/b_T)\Big\|\sum_{t=1}^{T-\ell}\big(Y_{t,j_1}Y_{t+\ell,j_2}-E(Y_{t,j_1}Y_{t+\ell,j_2})\big)\Big\|_{m/4} \\
&\leq CT^{-1/2}\sum_{\ell=0}^{T-1}K(\ell/b_T) \\
&\leq CT^{-1/2}\Big(K(0)+\sum_{\ell=1}^{\infty}K(\ell/b_T)\Big) \\
&\leq CT^{-1/2}\Big(K(0)+\int_{0}^{\infty}K(x/b_T)\,dx\Big) \\
&\leq Cb_T/\sqrt{T}.
\end{align*}

From this and recalling (35) we conclude that

\[
\Big|\frac{1}{T}\sum_{t=j_1+1}^{T}\sum_{s=j_2+1}^{T}\Big(Y_{t,j_1}Y_{s,j_2}-E(Y_{t,j_1}Y_{s,j_2})\Big)K\Big(\frac{t-s}{b_T}\Big)\Big|=O_P\big(M_T^{8/m}\,b_T/\sqrt{T}\big).
\]
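The step in the proof above that bounds $\sum_{\ell}K(\ell/b_T)$ by $K(0)+\int_0^{\infty}K(x/b_T)\,dx=K(0)+b_T\int_0^{\infty}K(u)\,du$ is a sum-integral comparison. The sketch below checks it numerically, assuming a nonnegative kernel that is nonincreasing on $[0,\infty)$; the Bartlett kernel, with $\int_0^{\infty}K(u)\,du=1/2$, is used purely as an example, and the function names are hypothetical.

```python
import numpy as np


def bartlett(u):
    """Bartlett kernel K(u) = max(1 - |u|, 0); an illustrative choice only."""
    return np.clip(1.0 - np.abs(u), 0.0, None)


def kernel_sum(b, T, kernel=bartlett):
    """sum_{l=0}^{T-1} K(l / b)."""
    ell = np.arange(T, dtype=float)
    return float(np.sum(kernel(ell / b)))


def integral_bound(b, kernel_integral=0.5, k0=1.0):
    """K(0) + b * int_0^inf K(u) du (the integral is 1/2 for Bartlett)."""
    return k0 + b * kernel_integral


for b in (5, 50, 500):
    print(b, kernel_sum(b, T=10_000), integral_bound(b))
```

Both quantities grow linearly in $b_T$, which is the source of the factor $b_T$ in the final bound $Cb_T/\sqrt{T}$.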

Proof of Theorem 3.

Let $\widetilde{\xi}_k$, $k=1,2,\ldots,N_T$, be Gaussian random variables as in Theorem 2. By the triangle inequality,

\begin{align*}
&\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_T}\sqrt{\frac{T}{M_T}}\frac{|\widehat{f}_T(\lambda_{k,T})-E\widehat{f}_T(\lambda_{k,T})|}{\widehat{f}_T(\lambda_{k,T})}\leq x\Big)-P^{\ast}\Big(\max_{k=1,\ldots,N_T}|\xi^{\ast}_k|\leq x\Big)\Big| \\
&\leq\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_T}\sqrt{\frac{T}{M_T}}\frac{|\widehat{f}_T(\lambda_{k,T})-E\widehat{f}_T(\lambda_{k,T})|}{\widehat{f}_T(\lambda_{k,T})}\leq x\Big)-P\Big(\max_{k=1,\ldots,N_T}|\widetilde{\xi}_k|\leq x\Big)\Big| \\
&\quad+\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_T}|\widetilde{\xi}_k|\leq x\Big)-P^{\ast}\Big(\max_{k=1,\ldots,N_T}|\xi^{\ast}_k|\leq x\Big)\Big|. \tag{36}
\end{align*}

The first term is bounded by Theorem 2. To handle the second term, we first show that

\begin{align*}
\Delta_T&:=\max_{1\leq k_1,k_2\leq N_T}\big|\widehat{C}_T(k_1,k_2)-C_T(k_1,k_2)\big| \\
&=O_P\Big(\max_{k=1,\ldots,N_T}|\widehat{f}_T(\lambda_{k,T})-f(\lambda_{k,T})|+\frac{b_TM_T}{\sqrt{T}}+\frac{M_T}{b_T}\,+\,M_Tb_TT^{8a_s/m-1/2}\Big). \tag{37}
\end{align*}

We have

\begin{align*}
\Delta_T\leq{}&\max_{1\leq k_1,k_2\leq N_T}\Big|\Big(\frac{1}{\widehat{f}_T(\lambda_{k_1,T})\widehat{f}_T(\lambda_{k_2,T})}-\frac{1}{f(\lambda_{k_1,T})f(\lambda_{k_2,T})}\Big)
\sum_{j_1=1}^{M_T}\sum_{j_2=1}^{M_T}a_{k_1,j_1}a_{k_2,j_2}\sigma_T(j_1,j_2)\Big| \\
&+\max_{1\leq k_1,k_2\leq N_T}\Big|\frac{1}{\widehat{f}_T(\lambda_{k_1,T})\widehat{f}_T(\lambda_{k_2,T})}
\sum_{j_1=1}^{M_T}\sum_{j_2=1}^{M_T}a_{k_1,j_1}a_{k_2,j_2}\big(\widehat{\sigma}_T(j_1,j_2)-\sigma_T(j_1,j_2)\big)\Big| \\
={}&\Delta_{1,T}+\Delta_{2,T},
\end{align*}

with an obvious notation for $\Delta_{1,T}$ and $\Delta_{2,T}$.

For $\Delta_{1,T}$ notice that, by the boundedness of the spectral density $f$ and Assumption 2, we get that

\begin{align*}
&\max_{1\leq k_1,k_2\leq N_T}\Big|\frac{1}{\widehat{f}_T(\lambda_{k_1,T})\widehat{f}_T(\lambda_{k_2,T})}-\frac{1}{f(\lambda_{k_1,T})f(\lambda_{k_2,T})}\Big| \\
&\leq\Big(\frac{1}{\min_{1\leq k\leq N_T}\widehat{f}_T(\lambda_{k,T})}\Big)^{2}\Big(\frac{1}{\min_{1\leq k\leq N_T}f(\lambda_{k,T})}\Big)^{2} \\
&\quad\times\Big(\max_{1\leq k\leq N_T}f(\lambda_{k,T})+\max_{1\leq k\leq N_T}\widehat{f}_T(\lambda_{k,T})\Big)\max_{1\leq k\leq N_T}\big|\widehat{f}_T(\lambda_{k,T})-f(\lambda_{k,T})\big| \\
&={\mathcal{O}}_P\Big(\max_{1\leq k\leq N_T}\big|\widehat{f}_T(\lambda_{k,T})-f(\lambda_{k,T})\big|\Big).
\end{align*}

Furthermore, recalling the definition of the Gaussian random variables $\xi_k$, we have

\begin{align*}
\max_{1\leq k_1,k_2\leq N_T}\Big|\sum_{j_1=1}^{M_T}\sum_{j_2=1}^{M_T}a_{k_1,j_1}a_{k_2,j_2}\sigma_T(j_1,j_2)\Big|
&=\max_{1\leq k_1,k_2\leq N_T}|E(\xi_{k_1}\xi_{k_2})| \\
&\leq\Big(\max_{1\leq k\leq N_T}\|\xi_k\|_2\Big)^{2}={\mathcal{O}}(1);
\end{align*}

see Remark 1. Hence $\Delta_{1,T}={\mathcal{O}}_P\big(\max_{1\leq k\leq N_T}\big|\widehat{f}_T(\lambda_{k,T})-f(\lambda_{k,T})\big|\big)$.

For $\Delta_{2,T}$ we have

\begin{align*}
\Delta_{2,T}
&\leq\max_{1\leq j_1,j_2\leq M_T}\big|\widehat{\sigma}_T(j_1,j_2)-\sigma_T(j_1,j_2)\big|\,\Big(\frac{1}{\min_{1\leq k\leq N_T}\widehat{f}_T(\lambda_{k,T})}\Big)^{2}
\Big(\max_{1\leq k\leq N_T}\sum_{j=0}^{M_T}a_{k,j}\Big)^{2} \\
&={\mathcal{O}}_P\Big(\frac{M_Tb_T}{\sqrt{T}}+\frac{M_T}{b_T}+M_Tb_TT^{8a_s/m-1/2}\Big),
\end{align*}

by Proposition 1 and the fact that $\sum_{j=0}^{M_T}a_{k,j}=O(\sqrt{M_T})$ uniformly in $1\leq k\leq N_T$. This establishes (37).

Recall next that for $1\leq k\leq N_T$,

\[
E|\xi_k|^2=f^2(\lambda_{k,T})\int_{-1}^{1}w^2(u)\,du+o(1),
\]

and that $E|\widetilde{\xi}_k|^2=E|\xi_k|^2/f^2(\lambda_{k,T})$. From these expressions and by Assumption 2 and the boundedness of $f$ we get that $\min_{1\leq k\leq N_T}C_T(k,k)$ and $\max_{1\leq k\leq N_T}C_T(k,k)$ are, for $T$ large enough, bounded from below and from above, respectively, by positive constants. Using Lemma A.1 of Zhang et al. (2022) we then get

\begin{align*}
&\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_T}|\widetilde{\xi}_k|\leq x\Big)-P^{\ast}\Big(\max_{k=1,\ldots,N_T}|\xi^{\ast}_k|\leq x\Big)\Big| \\
&={\mathcal{O}}_P\Big(\frac{\Delta_T^{1/6}}{(1+\log(N_T))^{1/4}}+\Delta_T^{1/3}\log^{3}(N_T)\Big) \\
&=o_P(\Delta_T^{1/6}).
\end{align*}

Acknowledgements The authors are grateful to the Co-Editor and to two referees for their valuable comments that led to an improved version of this paper. They are also thankful to Panagiotis Maouris and Alexander Braumann for helping to carry out the numerical work of Section 5.

References

  • [1] Anderson, T. W. (1971). The Statistical Analysis of Time Series. John Wiley & Sons, Inc., New York-London-Sydney.
  • [2] Andrews, D.W.K. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59, 817–858.
  • [3] Berg, A. and Politis, D.N. (2009). Higher-order accurate polyspectral estimation with flat-top lag-windows. Ann. Inst. Stat. Math. 61, 477–498.
  • [4] Brockwell, P.J. and Davis, R.A. (1991). Time Series: Theory and Methods. Second Edition. Springer, New York.
  • [5] Calonico, S., Cattaneo, M.D. and Farrell, M.H. (2018). On the effect of bias estimation on coverage accuracy in nonparametric inference. J. Am. Stat. Assoc. 113, 767–779.
  • [6] Cerovecki, C., Characiejus, V. and Hörmann, S. (2022). The maximum of the periodogram of a sequence of functional data. J. Am. Stat. Assoc. 118, 2712–2720.
  • [7] Chang, J., Jiang, Q., McElroy, T. S. and Shao, X. (2023). Statistical inference for high-dimensional spectral density matrix. Preprint, doi: 10.48550/arXiv.2212.13686.
  • [8] Jirak, M. (2011). On the maximum of covariance estimators. J. Mult. Analysis 102, 1032–1046.
  • [9] Liu, W. and Wu, W.B. (2010). Asymptotics of spectral density estimates. Econometric Theory 26, 1218–1245.
  • [10] Neumann, M. H. and Paparoditis, E. (2008). Simultaneous confidence bands in spectral density estimation. Biometrika 95, 381–397.
  • [11] Newton, J. H. and Pagano, M. (1984). Simultaneous confidence bands for autoregressive spectra. Biometrika 71, 197–202.
  • [12] Politis, D. N. and Romano, J.P. (1995). Bias-corrected nonparametric spectral estimation. J. Time Ser. Anal. 16, 67–103.
  • [13] Politis, D. N. (2024). Studentization vs. variance stabilization: A simple way out of an old dilemma. Statistical Science 39, 409–427.
  • [14] Priestley, M. B. (1981). Spectral Analysis and Time Series. Academic Press, London.
  • [15] Tomàšek, L. (1987). Asymptotic simultaneous confidence bands for autoregressive spectral density. Journal of Time Series Analysis 8, 469–491.
  • [16] Woodroofe, M.B. and Van Ness, J.W. (1967). The maximum deviation of sample spectral densities. Ann. Math. Statist. 38, 1558–1569.
  • [17] Wu, W.B. (2005). Nonlinear system theory: Another look at dependence. Proceedings of the National Academy of Sciences USA 102, 14150–14154.
  • [18] Wu, W.B. (2009). An asymptotic theory for sample covariances of Bernoulli shifts. Stochastic Processes and Their Applications 120, 2412–2431.
  • [19] Wu, W.B. (2011). Asymptotic theory for stationary time series. Statistics and its Interface 4, 207–226.
  • [20] Xiao, H. and Wu, W.B. (2014). Portmanteau test and simultaneous inference for serial covariances. Statistica Sinica 24, 577–599.
  • [21] Wu, W.B. and Zaffaroni, P. (2018). Asymptotic theory for spectral density estimates of general multivariate time series. Econometric Theory 34, 1–22.
  • [22] Yang, J. and Zhou, Z. (2022). Spectral inference under complex temporal dynamics. J. Am. Stat. Assoc. 117, 133–155.
  • [23] Zhang, X. and Cheng, G. (2018). Gaussian approximation for high dimensional vector under physical dependence. Bernoulli 24, 2640–2675.
  • [24] Zhang, Y., Paparoditis, E. and Politis, D.N. (2022). Simultaneous statistical inference for second order parameters of time series under weak conditions. Preprint, doi: 10.48550/arXiv.2110.14067.
  • [25] Zhang, D. and Wu, W.B. (2017). Gaussian approximation for high dimensional time series. Annals of Statistics 45, 1895–1919.