Gaussian Approximation for Lag-Window Estimators and the Construction of Confidence Bands for the Spectral Density

Jens-Peter Kreiß, Anne Leucht, Efstathios Paparoditis

Institut für Mathematische Stochastik, TU Braunschweig, 38106 Braunschweig, Germany. E-mail: [email protected]
Institut für Statistik, Universität Bamberg, 96052 Bamberg, Germany. E-mail: [email protected]
Cyprus Academy of Sciences, Letters and Arts, P.O.Box 22554, CY-1522 Nicosia, Cyprus. E-mail: [email protected]

TU Braunschweig, Universität Bamberg, University of Cyprus

Abstract

In this paper we consider the construction of simultaneous confidence bands for the spectral density of a stationary time series using a Gaussian approximation for classical lag-window spectral density estimators evaluated at the set of all positive Fourier frequencies. The Gaussian approximation opens up the possibility to verify asymptotic validity of a multiplier bootstrap procedure and, even further, to derive the corresponding rate of convergence. A small simulation study sheds light on the finite sample properties of this bootstrap proposal.

MSC: 62G20; 62G09, 62G15

Keywords: Bootstrap, Confidence Bands, Gaussian Approximation, Sample Autocovariance, Spectral Density

1 Introduction

We consider the problem of constructing (simultaneous) confidence bands for the spectral density of a stationary time series. Toward this goal we develop Gaussian approximation results for the maximum deviation over all (positive) Fourier frequencies of a lag-window spectral density estimator and for its bootstrap counterpart. Based on observations $X_{1},X_{2},\ldots,X_{T}$ stemming from a strictly stationary and centered stochastic process $\{X_{t},t\in\mbox{$\mathbb{Z}$}\}$ , a lag-window estimator of the spectral density $f(\lambda)$ , $\lambda\in[0,\pi]$ , is given by

\widehat{f}_{T}(\lambda)=\frac{1}{2\pi}\sum_{|j|\leq M_{T}}w(j/M_{T})\,\mathrm{e}^{-ij\lambda}\,\widehat{\gamma}(j),\quad\lambda\in[0,\pi].

(1)

In (1) and for $j=0,1,2,\ldots,M_{T}<T$ ,

\widehat{\gamma}(j)=\frac{1}{T}\sum_{t=j+1}^{T}X_{t}\,X_{t-j}

(2)

are estimators of the autocovariances $\gamma(j)=\mathrm{Cov}(X_{0},X_{j})$ of the process $\{X_{t},t\in\mbox{$\mathbb{Z}$}\}$ and $\widehat{\gamma}(j)=\widehat{\gamma}(-j)$ for $j<0$ . The function $w:[-1,1]\rightarrow{\mathbb{R}}$ is a so-called lag-window, which assigns weights to the $M_{T}$ sample autocovariances effectively used in the calculation of the estimator for $f(\lambda)$ and which satisfies some assumptions to be specified later.
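The definitions (1) and (2) can be sketched numerically as follows. This is a minimal illustration and not the authors' code: the function names (`parzen`, `sample_autocov`, `lag_window_estimate`, `fourier_freqs`) are hypothetical, the Parzen window is used only because it is the window employed later in the simulations, and the data are assumed to be centered.

```python
import numpy as np

def parzen(u):
    """Parzen lag-window on [-1, 1], set to zero outside (illustrative choice)."""
    a = np.abs(np.asarray(u, dtype=float))
    return np.where(a <= 0.5, 1 - 6 * a**2 + 6 * a**3,
                    np.where(a <= 1.0, 2 * (1 - a)**3, 0.0))

def sample_autocov(x, max_lag):
    """gamma_hat(j) = (1/T) sum_{t=j+1}^T X_t X_{t-j} for j = 0..max_lag, as in (2)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    return np.array([np.dot(x[j:], x[:T - j]) / T for j in range(max_lag + 1)])

def fourier_freqs(T):
    """Positive Fourier frequencies 2*pi*k/T, k = 1..floor(T/2)."""
    k = np.arange(1, T // 2 + 1)
    return 2 * np.pi * k / T

def lag_window_estimate(x, M, freqs, window=parzen):
    """Evaluate f_hat_T(lambda) from (1) at the given frequencies."""
    g = sample_autocov(x, M)
    j = np.arange(1, M + 1)
    w = window(j / M)
    # Symmetry of gamma_hat and w turns the complex sum into a cosine sum.
    cos_terms = np.cos(np.outer(freqs, j)) @ (w * g[1:])
    return (g[0] + 2.0 * cos_terms) / (2.0 * np.pi)
```

A call such as `lag_window_estimate(x, M=14, freqs=fourier_freqs(len(x)))` then returns the estimator on the grid of positive Fourier frequencies.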

It is well known that, for a fixed frequency $\lambda$ , under suitable assumptions on the dependence structure of the underlying time series and for $M_{T}$ converging to infinity not too fast, asymptotic normality of lag-window estimators can be shown. To elaborate, assuming sufficient smoothness of $f$ , which is equivalent to assuming a sufficiently fast decay of the underlying autocovariances $\gamma(h)$ as $|h|\to\infty$ , one can typically show under certain conditions on the lag-window $w$ and for $M_{T}\to\infty$ such that $T/M_{T}^{5}\to C^{2}\geq 0$ as $T\to\infty$ , that

\sqrt{\frac{T}{M_{T}}}\Big{(}\widehat{f}_{T}(\lambda)-f(\lambda)\Big{)}\xrightarrow{\mathcal{D}}{\mathcal{N}}\Big{(}C\,W\,f^{\prime\prime}(\lambda),\,f^{2}(\lambda)\int_{-1}^{1}w^{2}(u)\,du\Big{)},

(3)

for $0<\lambda<\pi$ , where $f^{\prime\prime}$ denotes the second derivative of $f$ and $W=\lim_{u\rightarrow 0}(1-w(u))/u^{2}$ , where the latter is assumed to exist and to be positive.

Moreover, for the estimator $\widehat{f}_{T}(\lambda)$ with suitably chosen so-called flat-top lag-windows $w$ and under the additional assumption that $\sum_{j\in\mathbb{Z}}|j|^{r}|\gamma(j)|<\infty$ for some $r\geq 1$ , the convergence rate $\mbox{MSE}(\widehat{f}_{T}(\lambda))=O(T^{-2r/(2r+1)})$ for the mean squared error can be achieved, which corresponds to $M_{T}\sim T^{1/(2r+1)}$ . See \citeasnounPolitisRomano87 and \citeasnounBergPolitis2009 for details.

Instead of point-wise inference we aim in this paper for simultaneous inference using the estimator $\widehat{f}_{T}$ . More precisely, we aim for confidence bands covering the spectral density uniformly at all positive Fourier frequencies $\lambda_{k,T}=2\pi k/T,k=1,\ldots,\lfloor T/2\rfloor=:N_{T}$ with a desired (high) probability. Hence, we keep all information contained in $X_{1},\dots,X_{T}$ in the sense that the discrete Fourier transform of any vector $x\in\mbox{$\mathbb{R}$}^{T}$ is a linear combination of trigonometric polynomials at exactly these frequencies.

Note that so far simultaneous confidence bands for the spectral density have only been derived for special cases, in particular for autoregressive processes relying on parametric spectral density estimates (see e.g. \citeasnounNP84 and \citeasnounT87), while \citeasnounNP08 proposed a bootstrap-aided approach for Gaussian time series using the integrated periodogram to construct simultaneous confidence bands for (a smoothed version of) the spectral density. Generalizing results from \citeasnounLiuWu2010, \citeasnounYZ22 derived simultaneous confidence bands for the spectral density on a grid with mesh size wider than $2\pi/T$ and for locally stationary Bernoulli shifts satisfying a geometric moment contraction condition. To this end, they proved that the maximum deviation of a suitably standardized lag-window spectral density estimator over such a grid asymptotically possesses a Gumbel distribution. Motivated by the slow rate of convergence of the maximum deviation to a Gumbel variable, they proposed an asymptotically valid bootstrap method to improve the finite sample behavior of their confidence regions.

As mentioned above, we aim to construct simultaneous confidence bands on the finer grid of all (positive) Fourier frequencies. To achieve this goal, we establish a Gaussian approximation result for the distribution of a properly standardized statistic based on the random quantity

\max_{1\leq k\leq N_{T}}|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|.

In this context and in order to stabilize the asymptotic variance and to construct confidence bands that automatically adapt to the local variability of $f(\lambda)$ , we are particularly interested in deriving a Gaussian approximation for the normalized statistic

\max_{1\leq k\leq N_{T}}\frac{\displaystyle|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\displaystyle\widehat{f}_{T}(\lambda_{k,T})}.

Various Gaussian approximations of max-type statistics in the time domain have been derived during the last decades. For means of high dimensional time series data we refer the reader to \citeasnounZW17 as well as \citeasnounZC18 and references therein. In the context of i.i.d. functional data, \citeasnounCCH22 show that Gaussian approximation results for means of high dimensional vectors can be adapted to establish a Gaussian approximation for the maximum of the periodogram. To elaborate, they use that the periodogram at frequency $\lambda_{k,T}$ is the squared norm of $\frac{1}{\sqrt{T2\pi}}\sum_{t=1}^{T}X_{t}\,e^{-it\lambda_{k,T}}$ . However, their arguments do not directly carry over to lag-window estimators as the latter have a more complex structure. It is shown in (5) below that they still have a mean-type representation, but the degree of dependence of the summands increases with the sample size, which prevents the application of the aforementioned results. A Gaussian approximation result for high dimensional spectral density matrices of $\alpha$ -mixing time series has been derived in \citeasnounChangetal23. However, they require the dimension of the process to increase with the sample size, which makes their results not applicable for our purposes.

The paper is organized as follows. Section 2 summarizes key assumptions for our results. The core material on Gaussian approximation for spectral density estimators is contained in Section 3. In Section 4 we discuss application of the Gaussian approximation results to the construction of simultaneous confidence bands over the positive Fourier frequencies. Section 5 reports the results of a small simulation study. Auxiliary lemmas and proofs are deferred to Section 6.

2 Set up and Assumptions

We begin by imposing the following assumption on the dependence structure of the stochastic process generating the observed time series $X_{1},X_{2},\ldots,X_{T}$ .

Assumption 1: $\{X_{t},t\in\mbox{$\mathbb{Z}$}\}$ is a strictly stationary, centered process and $X_{t}=g(e_{t},e_{t-1},\ldots)$ for some measurable function $g$ and an i.i.d. process $\{e_{t},t\in\mbox{$\mathbb{Z}$}\}$ with mean zero and variance $0<\sigma^{2}_{e}<\infty$ . For $s\geq 0$ , denote by $\mathcal{F}_{r,s}=\sigma(e_{r},e_{r-1},\dots,e_{r-s})$ the $\sigma$ -algebra generated by the random variables $\{e_{r-j},0\leq j\leq s\}$ and assume that $E|X_{t}|^{m}<\infty$ for some $m>16$ . For a random variable $X$ let $\|X\|_{m}:=(E|X|^{m})^{1/m}$ and define for an independent copy $\{e_{t}^{\prime},t\in\mbox{$\mathbb{Z}$}\}$ of $\{e_{t};t\in\mbox{$\mathbb{Z}$}\}$ ,

\delta_{m}(k):=\|X_{t}-g(e_{t},e_{t-1},\ldots,e_{t-k+1},e^{\prime}_{t-k},e_{t-k-1},\ldots)\|_{m}.

The assumption is that

\delta_{m}(k)\leq C\,(1+k)^{-\alpha}

(4)

for some $\alpha>3$ and a constant $C<\infty$ .

Assumption 1 implies that the autocovariance function $\gamma(h)=\mathrm{Cov}(X_{0},X_{h}),$ $h\in\mbox{$\mathbb{Z}$},$ is absolutely summable and hence that the process $\{X_{t},t\in\mbox{$\mathbb{Z}$}\}$ possesses a continuous and bounded spectral density $f(\lambda)=(2\pi)^{-1}\sum_{h\in\mbox{$\mathbb{Z}$}}\gamma(h)e^{-ih\lambda}$ , $\lambda\in(-\pi,\pi]$ .
It can even be shown that (4) implies $\sum_{j\in\mathbb{Z}}|j|^{r}|\gamma(j)|<\infty$ for $r<\alpha-1$ , see Lemma 1 in Section 6. As mentioned in the Introduction, this allows for convergence rates for $\widehat{f}_{T}(\lambda)$ of order $O(T^{-r/(2r+1)})$ , $r<\alpha-1$ . Note that our assumptions on the decay of the dependence coefficients are less restrictive than those in \citeasnounYZ22, where exponentially decaying coefficients are presumed. We assume for simplicity that $EX_{t}=0$ . Our results can be generalized to non-centered time series using the modified autocovariance estimator $\widetilde{\gamma}(j)=\frac{1}{T}\sum_{t=j+1}^{T}(X_{t}-\overline{X})(X_{t-j}-\overline{X})\ \ \mbox{with}\ \ \overline{X}=\sum_{t=1}^{T}X_{t}/T$ , instead of the estimator $\widehat{\gamma}(j)$ defined in (2).

In addition to Assumption 1, we also require the following boundedness condition for the spectral density $f$ .

Assumption 2: The spectral density $f$ satisfies $\inf_{\lambda\in(-\pi,\pi]}f(\lambda)>0$ .

Finally, we impose the following conditions on the lag-window function $w$ used in obtaining the estimator $\widehat{f}_{T}$ and which are standard in the literature; see \citeasnounPriestley81, Chapter 6.

Assumption 3: The lag-window $w\colon[-1,1]\to\mbox{$\mathbb{R}$}$ is assumed to be a differentiable and symmetric function with $\int_{-1}^{1}w(u)\,du=1$ and $w(0)=1$ .

3 Gaussian Approximation for Spectral Density Estimators

For the following considerations and in order to separate any bias related problems, we consider the centered sequence

\begin{aligned}\sqrt{\frac{T}{M_{T}}}\Big(\widehat{f}_{T}(\lambda_{k,T})-E\,\widehat{f}_{T}(\lambda_{k,T})\Big)&=\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\sum_{j=0}^{M_{T}}a_{k,j}\big(X_{t}X_{t-j}-\gamma(j)\big)\,\mathbbm{1}_{\{t>j\}}\\ &=:\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k},\end{aligned}

(5)

with an obvious abbreviation for $Z_{t,k}$ and where for $\ k=1,\ldots,N_{T}$ ,

a_{k,j}:=\begin{cases}\frac{\displaystyle 1}{\displaystyle 2\pi}\cdot\frac{\displaystyle 1}{\displaystyle\sqrt{M_{T}}}&,\;j=0\\ \frac{\displaystyle 1}{\displaystyle\pi}\cdot\frac{\displaystyle 1}{\displaystyle\sqrt{M_{T}}}w(j/M_{T})\cos(j\lambda_{k,T})&,\;j\geq 1.\end{cases}

Due to the differentiability of the lag-window $w$ (cf. Assumption 3), we have

\begin{aligned}\max_{k=1,\ldots,N_{T}}\sum_{j=0}^{M_{T}}a_{k,j}^{2}&\leq\frac{1}{4\pi^{2}}\frac{1}{M_{T}}+\frac{1}{\pi^{2}M_{T}}\sum_{j=1}^{M_{T}}w^{2}(j/M_{T})\\ &=\frac{1}{\pi^{2}}\int_{0}^{1}w^{2}(u)\,du+{\mathcal{O}}(M_{T}^{-1}).\end{aligned}
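The weights $a_{k,j}$ and the bound above can be checked numerically. The following sketch assumes the Parzen lag-window (for which $\int_{0}^{1}w^{2}(u)\,du=151/560$) and the illustrative values $T=512$, $M_{T}=14$; the names `coeff_matrix`, `lhs`, `rhs` and `limit` are hypothetical.

```python
import numpy as np

def parzen(u):
    """Parzen lag-window on [-1, 1], zero outside (illustrative choice)."""
    a = np.abs(np.asarray(u, dtype=float))
    return np.where(a <= 0.5, 1 - 6 * a**2 + 6 * a**3,
                    np.where(a <= 1.0, 2 * (1 - a)**3, 0.0))

def coeff_matrix(T, M, window=parzen):
    """Matrix with rows k = 1..N_T and columns j = 0..M holding the weights a_{k,j}."""
    N = T // 2
    lam = 2 * np.pi * np.arange(1, N + 1) / T
    j = np.arange(1, M + 1)
    A = np.empty((N, M + 1))
    A[:, 0] = 1.0 / (2 * np.pi * np.sqrt(M))
    A[:, 1:] = window(j / M) * np.cos(np.outer(lam, j)) / (np.pi * np.sqrt(M))
    return A

T, M = 512, 14
A = coeff_matrix(T, M)
lhs = (A**2).sum(axis=1).max()                       # max_k sum_j a_{k,j}^2
jj = np.arange(1, M + 1)
rhs = 1 / (4 * np.pi**2 * M) + (parzen(jj / M)**2).sum() / (np.pi**2 * M)
limit = (151 / 560) / np.pi**2                       # (1/pi^2) * int_0^1 w^2 for Parzen
print(lhs, rhs, limit)
```

The inequality `lhs <= rhs` holds because $\cos^{2}\leq 1$, and `rhs` differs from `limit` only by a Riemann-sum error of order $M_{T}^{-1}$.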

The following theorem is our first result and establishes a valid Gaussian approximation for the maximum over all positive Fourier frequencies of the centered estimator given in (1).

Theorem 1.

Suppose that $\{X_{t},t\in\mathbb{Z}\}$ fulfils Assumptions 1 and 2. Let $\widehat{f}_{T}$ be a lag-window estimator of $f$ as given in (1), where the lag-window $w$ satisfies Assumption 3, and let $M_{T}\rightarrow\infty$ with $M_{T}\sim T^{a_{s}}$ . Assume further that

\displaystyle\Big{\|}\,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k}\,\Big{\|}_{2}^{2}>c>0.

(6)

Let $\xi_{k},k=1,\ldots,N_{T}$ , be jointly normally distributed random variables with zero mean and covariance $E\xi_{k_{1}}\xi_{k_{2}}$ equal to

\frac{1}{T}E\Big{[}\Big{(}\sum_{j=0}^{M_{T}}\sum_{t=j+1}^{T}a_{k_{1},j}\big{(}X_{t}X_{t-j}-\gamma(j)\big{)}\Big{)}\Big{(}\sum_{j=0}^{M_{T}}\sum_{t=j+1}^{T}a_{k_{2},j}\big{(}X_{t}X_{t-j}-\gamma(j)\big{)}\Big{)}\Big{]}.

Let $\lambda$ , $a_{s}$ , and $a_{l}$ be positive constants such that

\alpha>\min\Big{\{}1+\frac{a_{l}}{2a_{s}}\,,\,\frac{1-a_{l}-\frac{12}{m}-4\lambda-\max\{\frac{4}{m},\,2\lambda\}}{2a_{s}}+\frac{3}{2}\Big{\}}

(7)

and

0<a_{s}+\max\{4\lambda,2\lambda+4/m\}<a_{l}<1-6\lambda-12/m.

(8)

Then,

\begin{aligned}&\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\big|\widehat{f}_{T}(\lambda_{k,T})-E\,\widehat{f}_{T}(\lambda_{k,T})\big|\leq x\Big)-P\Big(\max_{k=1,\ldots,N_{T}}|\xi_{k}|\leq x\Big)\Big|\\ &\qquad={\mathcal{O}}\Big(T^{-\kappa}+T^{-\lambda}\big(\log(N_{T})\big)^{3/2}\Big),\end{aligned}

(9)

where $\kappa=\min\{\kappa_{1},\kappa_{2}\}$ with

\kappa_{1}=a_{l}/2-a_{s}/2-\lambda-\max\{2/m\,,\,\lambda\}\ \mbox{and}\ \ \kappa_{2}=1/2-a_{l}/2-6/m-3\lambda.

Some remarks regarding Theorem 1 are in order.

Remark 1.

(i)

The lower bound in (6) is valid for sufficiently large $T$ . This results from our assumption that $f$ is bounded away from zero (see Assumption 2), together with

\begin{aligned}&\sup_{k\in\{1,\dots,N_{T}\}}\Big|\,\Big\|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k}\Big\|_{2}^{2}-f^{2}(\lambda_{k,T})\,\frac{1}{M_{T}}\sum_{j=-M_{T}}^{M_{T}}w^{2}\Big(\frac{j}{M_{T}}\Big)\,(1+\cos(2j\,\lambda_{k,T}))\Big|\\ &\qquad=o(1),\end{aligned}

see Lemma 2 in Section 6.

(ii)

In order to verify that the max-statistic of interest can be approximated by the maximum of absolute values of Gaussian random variables inheriting the covariance structure of lag-window estimators, we combine smoothing techniques, Lindeberg’s method, and a blocking approach. The parameter $\lambda$ appearing in Theorem 1 is related to the smoothness of the functions used in the proof to approximate indicator functions. The parameters $a_{l}$ and $a_{s}$ determine the sizes of the big and small blocks. The sizes of these blocks are tailor-made to handle the growing degree of dependence in the $Z_{t,k}$'s with increasing sample size.

Remark 2.

(i)

Notice first that the rate in the Gaussian approximation (9) does not depend on the rate of decay of the dependence coefficients. Moreover, as can be seen by an inspection of the proof of Theorem 1, even if the underlying time series $\{X_{t},t\in\mbox{$\mathbb{Z}$}\}$ consists of i.i.d. observations, i.e. $X_{t}=e_{t}$ for all $t$ , we do not achieve better approximation rates using our method of proof.
(ii)

If the number $m$ of moments assumed to exist and the exponent $a_{s}=1/(2r+1),\;r\in\mathbb{N},$ which determines the convergence rate of the underlying lag-window estimator, are fixed, then the rate in the Gaussian approximation has to be optimized with respect to $\lambda$ and $a_{l}$ . To do so, we first ignore log-terms and then distinguish two cases according to the parameter $m$ , namely Case I: $2/m\geq\lambda$ (moderate values of $m$ ) and Case II: $2/m\leq\lambda$ (large values of $m$ ). For Case I the resulting rate exponent, which is the minimum over $\lambda$ and $a_{l}$ of $\max\{-\lambda,\lambda+2/m+(a_{s}-a_{l})/2,3\lambda+6/m+(a_{l}-1)/2\}$ , is found by balancing the three terms. This leads to $a_{l}=1/3+2/3\,a_{s}-4/(3m)$ , $\lambda=1/12(1-a_{s})-4/(3m)$ and the resulting rate $T^{-r/(6(2r+1))+4/(3m)}$ .
Exactly along the same lines we obtain for Case II that $\lambda=1/14(1-a_{s})-6/(7m)$ and the resulting rate $T^{-r/(7(2r+1))+6/(7m)}$ . Both choices are in line with (8).
Hence, for usual lag-window estimators, i.e. $a_{s}=1/5$ , and if sufficiently high moments of the time series are assumed to exist, then we can achieve rates up to $T^{-2/35}$ for the Gaussian approximation.
In the limit $a_{s}\to 0$ , we achieve the rate $T^{-1/14+6/(7m)}$ , which reaches, if sufficiently high moments exist, almost the rate $T^{-1/14}$ .
(iii)

If, instead of sample autocovariances, we considered sample means of i.i.d. observations, which would make the small blocks/large blocks considerations in the proof of Theorem 1 superfluous, the presented method of proof would yield a Gaussian approximation with rate $T^{-1/8+3/(2m)}$ .
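The balancing argument of Remark 2(ii) can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, log-terms are ignored as in the remark, and no check of condition (8) or of positivity of $\lambda$ is performed. The formula $a_{l}=a_{s}+6\lambda$ used for Case II and the switching point $m=40/(1-a_{s})$ between the two cases are not stated in the text; they follow from solving $\lambda=\kappa_{1}=\kappa_{2}$ and are to be read as assumptions of this sketch.

```python
import math

def kappas(a_s, a_l, lam, inv_m):
    """Exponents kappa_1 and kappa_2 as defined after (9)."""
    k1 = a_l / 2 - a_s / 2 - lam - max(2 * inv_m, lam)
    k2 = 0.5 - a_l / 2 - 6 * inv_m - 3 * lam
    return k1, k2

def balanced_choice(a_s, m=math.inf):
    """Return (lambda, a_l) solving lambda = kappa_1 = kappa_2 (log terms ignored)."""
    inv_m = 0.0 if math.isinf(m) else 1.0 / m
    if m <= 40.0 / (1.0 - a_s):
        # Case I: 2/m >= lambda, formulas as stated in Remark 2(ii)
        lam = (1 - a_s) / 12 - 4 * inv_m / 3
        a_l = 1 / 3 + 2 * a_s / 3 - 4 * inv_m / 3
    else:
        # Case II: 2/m <= lambda; a_l derived here from lambda = kappa_1
        lam = (1 - a_s) / 14 - 6 * inv_m / 7
        a_l = a_s + 6 * lam
    return lam, a_l

lam, a_l = balanced_choice(0.2)          # a_s = 1/5, all moments finite
print(lam, a_l, kappas(0.2, a_l, lam, 0.0))
```

For $a_{s}=1/5$ and $m=\infty$ this gives $\lambda=\kappa_{1}=\kappa_{2}=2/35$, in agreement with the rate $T^{-2/35}$ quoted in the remark.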

4 Confidence Bands for the Spectral Density

To construct confidence bands that appropriately take into account the local variability of the lag-window estimator, we make use of the following Gaussian approximation result for the properly standardized lag-window estimators.

Theorem 2.

Suppose that the assumptions of Theorem 1 are satisfied. Further, assume that $M_{T}^{3}/T\to 0$ and that, in addition to the conditions given in (7),

\lambda+\kappa\leq\min\{1/2-a_{s}/2-4/m\,,\,2a_{s}-4/m\}

(10)

holds true. Moreover, we assume for the bias of $\widehat{f}_{T}$ that

E\widehat{f}_{T}(\lambda)-f(\lambda)={\mathcal{O}}(M_{T}^{-2})

(11)

uniformly in $\lambda$ . Then,

\begin{aligned}&\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\Big)-P\Big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\Big)\Big|\\ &\qquad={\mathcal{O}}\Big(T^{-\kappa}+T^{-\lambda}\big(\log(N_{T})\big)^{3/2}\Big),\end{aligned}

where $\widetilde{\xi}_{k},k=1,\ldots,N_{T}$ , are jointly normally distributed random variables with zero mean and covariance $E\widetilde{\xi}_{k_{1}}\widetilde{\xi}_{k_{2}}$ equal to

\begin{aligned}C_{T}(k_{1},k_{2})=\frac{1}{f(\lambda_{k_{1},T})f(\lambda_{k_{2},T})}&\sum_{j_{1},j_{2}=0}^{M_{T}}a_{k_{1},j_{1}}a_{k_{2},j_{2}}\\ &\times\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}E\Big[\big(X_{t}X_{t-j_{1}}-\gamma(j_{1})\big)\big(X_{s}X_{s-j_{2}}-\gamma(j_{2})\big)\Big].\end{aligned}

(12)

Remark 3.

(i)

Choosing $a_{s}$ MSE optimal, that is $a_{s}=1/5$ , the corresponding optimal $\lambda=\kappa=2/35$ (see Remark 2) satisfy (10).
(ii)

Under the additional assumption $\lim_{u\to 0}(1-w(u))/u^{2}>0$ for the lag-window, validity of (11) is shown in \citeasnoun[Theorem 9.3.3]A71.
For the case $M_{T}\sim T^{1/5}$ \citeasnounBergPolitis2009 have shown in their Theorem 1 that, under our assumptions, for lag-window estimators with a so-called flat-top lag-window $w$ even

$\sup_{\lambda}|E\widehat{f}_{T}(\lambda)-f(\lambda)|=o(M_{T}^{-2})$

holds. This implies that the results of Theorem 2 also hold for the important class of flat-top lag-window estimators introduced in \citeasnounPolitisRomano87. It is worth mentioning that flat-top lag windows fail to fulfill $\lim_{u\to 0}(1-w(u))/u^{2}>0$ .

Theorem 2 motivates the following multiplier bootstrap procedure to construct a confidence band for the smoothed spectral density

\widetilde{f}_{T}(\cdot):=E(\widehat{f}_{T}(\cdot)).

Step 1.

For $k_{1},k_{2}\in\{1,2,\ldots,N_{T}\}$ , let $\widehat{C}_{T}(k_{1},k_{2})$ be an estimator of the covariance $C_{T}(k_{1},k_{2})$ given in (12) which ensures that the $N_{T}\times N_{T}$ matrix $\widehat{\Sigma}_{N_{T}}$ with the $(i,j)$ -th element equal to $\widehat{C}_{T}(i,j)$ is non-negative definite.

Step 2.

Generate random variables $\xi^{\ast}_{1},\xi^{\ast}_{2},\ldots,\xi^{\ast}_{N_{T}}$ , where

\big{(}\xi^{\ast}_{1},\xi^{\ast}_{2},\ldots,\xi^{\ast}_{N_{T}}\big{)}^{\top}\sim{\mathcal{N}}\Big{(}0_{N_{T}},\widehat{\Sigma}_{N_{T}}\Big{)},

with $0_{N_{T}}$ an $N_{T}$ -dimensional vector of zeros and covariance matrix $\widehat{\Sigma}_{N_{T}}$ . Let $\xi^{\ast}_{\max}=\max\big{\{}|\xi^{\ast}_{1}|,|\xi^{\ast}_{2}|,\ldots,|\xi^{\ast}_{N_{T}}|\big{\}}$ and, for $\alpha\in(0,1)$ given, denote by $q^{\ast}_{1-\alpha}$ the upper $(1-\alpha)$ percentage point of the distribution of $\xi^{\ast}_{\max}$ .

Step 3.

A simultaneous $(1-\alpha)$ -confidence band for $\widetilde{f}(\lambda_{j,T})$ , $j=1,2,\ldots,N_{T}$ , is then given by

\Big{\{}\Big{[}\widehat{f}(\lambda_{j,T})\Big{(}1-q^{\ast}_{1-\alpha}\sqrt{\frac{M_{T}}{T}}\Big{)},\ \ \widehat{f}(\lambda_{j,T})\Big{(}1+q^{\ast}_{1-\alpha}\sqrt{\frac{M_{T}}{T}}\Big{)}\Big{]},\ j=1,2,\ldots,N_{T}\Big{\}}.

(13)
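Steps 2 and 3 can be sketched as follows, with the distribution of $\xi^{\ast}_{\max}$ approximated by Monte-Carlo simulation. This is a minimal illustration under stated assumptions: the function name `bootstrap_band` and its arguments are hypothetical, $q^{\ast}_{1-\alpha}$ is taken to be the empirical $(1-\alpha)$-quantile of the simulated maxima, and the matrix $\widehat{\Sigma}_{N_{T}}$ from Step 1 is assumed to be supplied by the user.

```python
import numpy as np

def bootstrap_band(f_hat, Sigma_hat, M, T, alpha=0.05, B=1000, rng=None):
    """Band (13): f_hat * (1 -/+ q * sqrt(M/T)), with q simulated from N(0, Sigma_hat)."""
    rng = np.random.default_rng(rng)
    f_hat = np.asarray(f_hat, dtype=float)
    # A symmetric eigendecomposition gives a square root of Sigma_hat and
    # tolerates rank-deficient (merely non-negative definite) matrices.
    vals, vecs = np.linalg.eigh(Sigma_hat)
    root = vecs * np.sqrt(np.clip(vals, 0.0, None))
    xi = rng.standard_normal((B, f_hat.size)) @ root.T   # rows ~ N(0, Sigma_hat)
    xi_max = np.abs(xi).max(axis=1)                      # B draws of xi*_max
    q = np.quantile(xi_max, 1 - alpha)
    half = q * np.sqrt(M / T)
    return f_hat * (1 - half), f_hat * (1 + half), q
```

The function returns the lower and upper limits at all frequencies together with the simulated quantile; the band has a relative width, so it is wider where $\widehat{f}_{T}$ is larger.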

Notice that the distribution of $\xi^{\ast}_{\max}$ in Step 2 can be estimated via Monte-Carlo simulation. While $f(\lambda_{k_{i},T})$ , $i=1,2$ , appearing in the expression for $C_{T}(k_{1},k_{2})$ can be replaced by the estimator $\widehat{f}_{T}(\lambda_{k_{i},T})$ , the important step in the proposed bootstrap algorithm is the estimation of the covariances

\sigma_{T}(j_{1},j_{2})=\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}E\Big{[}\big{(}X_{t}X_{t-j_{1}}-\gamma(j_{1})\big{)}\big{(}X_{s}X_{s-j_{2}}-\gamma(j_{2})\big{)}\Big{]}

appearing in $C_{T}(k_{1},k_{2})$ . To obtain an estimator for this quantity which has some desired consistency properties, we follow \citeasnounZhang_etal2022. In particular, let $K$ be a kernel function satisfying the following conditions.

Assumption 4: $K:\mbox{$\mathbb{R}$}\to[0,+\infty)$ is symmetric, continuously differentiable and decreasing on $[0,+\infty)$ with $K(0)=1$ and $\int K(x)\,dx<\infty$ . The Fourier transform of $K$ is integrable and nonnegative on $\mathbb{R}$ .

Consider next the estimator

\widehat{\sigma}_{T}(j_{1},j_{2})=\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}K\Big{(}\frac{t-s}{b_{T}}\Big{)}\big{(}X_{t}X_{t-j_{1}}-\widehat{\gamma}(j_{1})\big{)}\big{(}X_{s}X_{s-j_{2}}-\widehat{\gamma}(j_{2})\big{)},

(14)

of $\sigma_{T}(j_{1},j_{2})$ , where $b_{T}>0$ is a bandwidth parameter satisfying $b_{T}\rightarrow\infty$ as $T\rightarrow\infty$ . Note that the conditions on the Fourier transform of $K$ in Assumption 4 guarantee positive semi-definiteness of $\widehat{\sigma}_{T}(j_{1},j_{2})$ . This assumption is satisfied if $K$ is, for instance, the Gaussian kernel.
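A direct implementation of the estimator (14) with the Gaussian kernel might look as follows. The function name `sigma_hat` and the default kernel $K(x)=\exp(-x^{2}/2)$ are assumptions of this sketch, and the data are assumed to be centered. The implementation builds a full kernel matrix and therefore needs memory of order $T^{2}$.

```python
import numpy as np

def sigma_hat(x, j1, j2, b, kernel=lambda u: np.exp(-0.5 * u**2)):
    """Kernel-weighted estimator (14) of sigma_T(j1, j2) with bandwidth b."""
    x = np.asarray(x, dtype=float)
    T = x.size
    # 0-based indices j..T-1 correspond to t = j+1..T in the paper's notation.
    t = np.arange(j1, T)
    s = np.arange(j2, T)
    p1 = x[j1:] * x[:T - j1]            # X_t X_{t-j1}
    p2 = x[j2:] * x[:T - j2]            # X_s X_{s-j2}
    y1 = p1 - p1.sum() / T              # subtract gamma_hat(j1) from (2)
    y2 = p2 - p2.sum() / T              # subtract gamma_hat(j2)
    K = kernel((t[:, None] - s[None, :]) / b)
    return float(y1 @ K @ y2) / T
```

For a very small bandwidth only the terms with $t=s$ survive, whereas for a very large bandwidth all pairs receive weight close to one; the conditions on $b_{T}$ in the results below balance these two extremes.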

The following consistency result for the estimator proposed in (14) can be established.

Proposition 1.

Suppose that Assumption 1 and Assumption 4 hold true. Let $b_{T}\sim T^{c}$ with $c<1/2-8a_{s}/m$ , where $a_{s}$ is as in Theorem 2. Then,

\sup_{1\leq j_{1},j_{2}\leq M_{T}}\big{|}\widehat{\sigma}_{T}(j_{1},j_{2})-\sigma_{T}(j_{1},j_{2})\big{|}=O_{P}\Big{(}\frac{1}{b_{T}}+\frac{b_{T}M_{T}^{8/m}}{\sqrt{T}}\Big{)}.

Let, as usual, $P^{*}$ denote the conditional probability given the time series $X_{1},\dots,$ $X_{T}$ . We then have the following result which proves consistency of the multiplier bootstrap procedure proposed.

Theorem 3.

Suppose that the conditions of Theorem 2 hold true and that $b_{T}\sim T^{c}$ , where

0<a_{s}<c<\frac{1}{2}-a_{s}\big{(}1+\frac{8}{m}\big{)}.

Let $\xi_{k}^{\ast}$ , $k=1,2,\ldots,N_{T}$ , be Gaussian random variables generated as in Step 2 of the multiplier bootstrap algorithm with $\widehat{\Sigma}_{N_{T}}$ the $N_{T}\times N_{T}$ matrix the $(k_{1},k_{2})$ -th element of which equals

\widehat{C}_{T}(k_{1},k_{2})=\frac{1}{\widehat{f}(\lambda_{k_{1},T})\widehat{f}(\lambda_{k_{2},T})}\sum_{j_{1},j_{2}=0}^{M_{T}}a_{k_{1},j_{1}}a_{k_{2},j_{2}}\widehat{\sigma}_{T}(j_{1},j_{2}),

(15)

and $\widehat{\sigma}_{T}(j_{1},j_{2})$ given in (14). Then, as $T\rightarrow\infty$ ,

\begin{aligned}&\sup_{x\in\mathbb{R}}\Big|P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\Big)-P^{\ast}\Big(\max_{k=1,\ldots,N_{T}}|\xi^{\ast}_{k}|\leq x\Big)\Big|\\ &\qquad={\mathcal{O}}\Big(T^{-\kappa}+T^{-\lambda}\big(\log(N_{T})\big)^{3/2}\Big)+o_{P}\Big(\Big\{\sup_{k=1,\ldots,N_{T}}|\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})|\Big\}^{1/6}+T^{-\rho/6}\Big)\end{aligned}

(16)

for $\rho=\min\{c-a_{s},1/2-c-a_{s}(1+8/m)\}$ .
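The matrix with entries (15) can be assembled in one pass by writing it as a quadratic form in a kernel matrix. The sketch below is illustrative only: the function name `c_hat` is hypothetical, the Gaussian kernel is assumed, the matrix `A` of weights $a_{k,j}$ (rows $k$, columns $j=0,\ldots,M_{T}$) and the vector `f_hat` of spectral density estimates are assumed to be given, and memory of order $T^{2}$ is used.

```python
import numpy as np

def c_hat(x, A, f_hat, b):
    """Matrix with entries C_hat_T(k1, k2) from (15), Gaussian kernel, bandwidth b."""
    x = np.asarray(x, dtype=float)
    T = x.size
    M = A.shape[1] - 1
    # Y[j, t] = (X_t X_{t-j} - gamma_hat(j)) for t > j and 0 otherwise,
    # so that sigma_hat_T(j1, j2) = Y[j1] @ K @ Y[j2] / T as in (14).
    Y = np.zeros((M + 1, T))
    for j in range(M + 1):
        prod = x[j:] * x[:T - j]
        Y[j, j:] = prod - prod.sum() / T
    d = np.arange(T)
    K = np.exp(-0.5 * ((d[:, None] - d[None, :]) / b) ** 2)
    U = A @ Y                          # N_T x T, combines the lags with weights a_{k,j}
    C = U @ K @ U.T / T                # sum_{j1,j2} a a sigma_hat(j1, j2)
    return C / np.outer(f_hat, f_hat)  # standardize by f_hat(lambda_k1) f_hat(lambda_k2)
```

Since the Gaussian kernel matrix is non-negative definite, the resulting matrix is non-negative definite up to rounding error, which is the property required in Step 1.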

Remark 4.

(i)

So far we have considered the construction of (simultaneous) confidence bands for the (smoothed) spectral density $\widetilde{f}(\lambda_{j,T})=\mathrm{E}(\widehat{f}_{T}(\lambda_{j,T}))$ over the Fourier frequencies $\lambda_{j,T}$ , $j=1,2,\ldots,N_{T}$ . To extend the proposed procedure to one that also delivers an asymptotically valid simultaneous confidence band for the spectral density $f$ itself, a (uniformly) consistent estimator of the (rescaled) bias term $B_{T}(\lambda_{j,T})=\sqrt{T/M_{T}}\big{(}\mathrm{E}(\widehat{f}_{T}(\lambda_{j,T}))-f(\lambda_{j,T})\big{)}/\widehat{f}_{T}(\lambda_{j,T})$ for $j=1,2,\ldots,N_{T}$ , is needed, provided $B_{T}(\lambda_{j,T})$ does not vanish asymptotically; see (3). In fact, it can be easily seen that a (theoretically) valid confidence band for $f$ , which takes into account the bias in estimating $f$ , is given by

\begin{aligned}\Big\{\Big[&\widehat{f}(\lambda_{j,T})\Big(1-\sqrt{\frac{M_{T}}{T}}\big(q^{\ast}_{1-\alpha}+B_{T}(\lambda_{j,T})\big)\Big),\\ &\widehat{f}(\lambda_{j,T})\Big(1+\sqrt{\frac{M_{T}}{T}}\big(q^{\ast}_{1-\alpha}-B_{T}(\lambda_{j,T})\big)\Big)\Big],\ j=1,2,\ldots,N_{T}\Big\};\end{aligned}

(17)

compare to (13). As in many other nonparametric inference problems, different approaches can be considered for this purpose; see \citeasnounCalonicoetal2018 and the references therein for the cases of nonparametric density and regression estimation in an i.i.d. set up. One approach is an explicit bias correction that uses a plug-in type estimator of $B_{T}(\lambda_{j,T})$ based on the fact that, under certain conditions, see (3), $\sqrt{T/M_{T}}\big{(}\mathrm{E}(\widehat{f}_{T}(\lambda_{j,T}))-f(\lambda_{j,T})\big{)}=CWf^{\prime\prime}(\lambda_{j,T})+o(1)$ . This approach requires a consistent (nonparametric) estimator of the second order derivative $f^{\prime\prime}$ of the spectral density. A different approach proposed in the literature is so-called 'undersmoothing'. The idea here is to make the bias term $\sqrt{T/M_{T}}\big{(}\mathrm{E}(\widehat{f}_{T}(\lambda_{j,T}))-f(\lambda_{j,T})\big{)}$ asymptotically negligible. This can be achieved by using a truncation lag $M_{T}$ which increases to infinity at a rate faster than the (MSE optimal) rate $T^{1/5}$ . An alternative approach is to perform a bias correction by using flat-top kernels. See \citeasnounPolitis24 for a recent discussion, where an alternative approach to variance stabilization (respectively, studentization), called the confidence region method, has also been proposed. Although the problem of properly incorporating the (possible) bias in the construction of (simultaneous) confidence bands for the spectral density $f$ is an interesting one, we do not pursue it further in this paper.

(ii)

Equation (16) contains, as part of the convergence rate for the multiplier bootstrap, the sup-distance of the lag-window estimator to the spectral density over all Fourier frequencies $\lambda_{k,T}$ . Assuming a geometric rate for the decay of the physical dependence coefficients $\delta_{m}(k)$ , one obtains from \citeasnounWuZaffaroni2018, Theorem 6, the uniform convergence rate $(M_{T}\log(M_{T})/T)^{1/2}$ over all frequencies for lag-window estimators, which dominates the uniform term over all Fourier frequencies in (16) and usually converges to zero faster than the other parts of the bound therein.

5 Simulations

In this section we investigate by means of simulations, the finite sample performance of the Gaussian approximation and of the corresponding multiplier bootstrap procedure proposed for the construction of simultaneous confidence bands. For this purpose, time series of length $T=256,\;512$ and $1024$ have been generated from the following three time series models:

Model I: $X_{t}=0.8X_{t-1}+\varepsilon_{t}$ ,
Model II: $X_{t}=1.3X_{t-1}-0.75X_{t-2}+u_{t}$ with $u_{t}=\varepsilon_{t}\sqrt{1+0.25u^{2}_{t-1}}$ ,
Model III: $X_{t}=\big{(}0.4+0.1\varepsilon_{t-1}\big{)}X_{t-1}+\varepsilon_{t}$ .
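The three data generating processes can be simulated along the following lines, using i.i.d. standard Gaussian innovations. This is a sketch and not the simulation code behind the reported results: the function name `simulate`, the zero starting values and the burn-in length of 500 observations are assumptions made here.

```python
import numpy as np

def simulate(model, T, burn=500, rng=None):
    """Generate a series of length T from Model I, II or III (burn-in discarded)."""
    rng = np.random.default_rng(rng)
    n = T + burn
    eps = rng.standard_normal(n)
    x = np.zeros(n)
    if model == "I":        # AR(1)
        for t in range(1, n):
            x[t] = 0.8 * x[t - 1] + eps[t]
    elif model == "II":     # AR(2) driven by ARCH(1)-type innovations u_t
        u = np.zeros(n)
        for t in range(1, n):
            u[t] = eps[t] * np.sqrt(1 + 0.25 * u[t - 1] ** 2)
        for t in range(2, n):
            x[t] = 1.3 * x[t - 1] - 0.75 * x[t - 2] + u[t]
    elif model == "III":    # random coefficient autoregression
        for t in range(1, n):
            x[t] = (0.4 + 0.1 * eps[t - 1]) * x[t - 1] + eps[t]
    else:
        raise ValueError("model must be 'I', 'II' or 'III'")
    return x[burn:]
```

For example, `simulate("II", 512, rng=1)` returns one realization of Model II with $T=512$.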

In all models the innovations $\varepsilon_{t}$ are chosen to be i.i.d. standard Gaussian. The empirical coverage over $R=500$ repetitions of each model has been calculated for two different nominal coverages, $90\%$ and $95\%$ . The lag-window estimator of the spectral density used has been obtained using the Parzen lag-window and truncation lag $M_{T}$ . The Gauss kernel with different values of the parameter $b_{T}$ has been used for obtaining the covariance estimators $\widehat{\sigma}_{T}(j_{1},j_{2})$ given in (14) and which are used for the calculation of the covariance matrix $\widehat{C}_{T}(k_{1},k_{2})$ ; see equation (15). All bootstrap approximations are based on $B=1,000$ replications. Table 1 presents empirical coverages as well as mean lengths of the confidence bands obtained, where the mean lengths are calculated as

ML:=2\sqrt{\frac{M_{T}}{T}}\frac{1}{N_{T}}\sum_{j=1}^{N_{T}}\widehat{f}_{T}(\lambda_{j,T})\frac{1}{R}\sum_{\ell=1}^{R}q_{1-\alpha}^{\ast,(\ell)},

with $R$ the number of repetitions and $q^{\ast,(\ell)}_{1-\alpha}$ the upper $(1-\alpha)$ percentage point of the distribution of $\xi^{\ast}_{\max}$ obtained in the $\ell$ -th repetition; also see expression (13). Figure 1 shows averaged 90% and 95% confidence bands obtained using the method proposed in this paper for sample sizes of $T=512$ and $T=1024$ and for Model II.

			Model I			Model II			Model III
T=256
$M_{T}=10$	$b_{T}=$		1.0	1.5	2.0	6.5	7.0	7.5	1.0	1.5	2.0
	90%	Cov	92.6	90.6	89.2	90.8	90.0	90.0	86.0	82.8	82.2
		ML	0.56	0.47	0.43	1.46	1.42	1.38	0.18	0.17	0.17
	95%	Cov	94.0	92.4	91.4	94.4	94.0	92.8	91.0	89.4	87.6
		ML	0.63	0.53	0.49	1.68	1.64	1.59	0.20	0.19	0.19
T=512
$M_{T}=14$	$b_{T}=$		1.5	2.0	2.5	9.5	10.0	10.5	1.0	1.5	2.0
	90%	Cov	91.6	88.8	87.6	90.4	90.2	89.8	87.8	84.2	83.0
		ML	0.41	0.37	0.36	1.05	1.03	1.02	0.15	0.14	0.15
	95%	Cov	94.6	91.8	91.2	93.8	94.0	93.2	92.2	89.8	88.0
		ML	0.46	0.42	0.40	1.19	1.17	1.16	0.17	0.16	0.16
T=1024
$M_{T}=18$	$b_{T}=$		2.0	2.5	3.0	11.0	11.5	12.0	1.0	1.5	2.0
	90%	Cov	92.8	90.8	88.6	90.0	89.4	89.2	89.0	86.4	86.0
		ML	0.31	0.29	0.28	0.79	0.78	0.77	0.12	0.12	0.12
	95%	Cov	96.0	94.8	94.0	94.0	94.2	93.8	94.6	93.2	92.0
		ML	0.34	0.32	0.31	0.88	0.88	0.87	0.14	0.13	0.13

Table 1. Empirical coverages (Cov) and mean lengths (ML) of simultaneous confidence bands (13) for different sample sizes and different values of the parameters $M_{T}$ and $b_{T}$ .

Figure 1: Plot of $\widetilde{f}(\lambda_{j,T})$ (solid line) together with $90\%$ and $95\%$ averaged confidence bands (dotted and dashed lines, respectively), for time series of length $T=512$ and $b_{T}=10$ (top) and $T=1024$ and $b_{T}=11.5$ (bottom) stemming from Model II.

We also compare the performance of the approach proposed in this paper with that based on a Gumbel-type approximation of the maximum deviation of the centered lag-window spectral density estimator evaluated over a much coarser grid of frequencies in the interval $(0,\pi]$ . To elaborate and based on Theorems 3-5 of \citeasnounLiuWu2010 derived under different conditions, an asymptotically $(1-\alpha)$ simultaneous confidence band for $\mathrm{E}(\widehat{f}_{T}(\lambda_{s}))$ over the set of frequencies $\lambda_{s}=s\pi/M_{T}$ for $s=1,2,\ldots,M_{T}$ , is given by

\displaystyle\Big{\{}\Big{[}\widehat{f}_{T}(\lambda_{s})-\widehat{C}_{\alpha,T},\widehat{f}_{T}(\lambda_{s})+\widehat{C}_{\alpha,T}\Big{]},\ s=1,2,\ldots,M_{T}\Big{\}},

(18)

where

\widehat{C}_{\alpha,T}=\sqrt{\frac{M_{T}}{T}\big{(}c_{1-\alpha}+\mu_{T}\big{)}\widehat{f}^{2}_{T}(\lambda_{s})W_{2}},\ \ \mu_{T}=2\log(M_{T})-\log(\pi\log(M_{T}))

(19)

and $W_{2}=\int_{-1}^{1}w^{2}(u)du=151/280$ in the case of the Parzen lag-window. Furthermore, $c_{1-\alpha}$ denotes the $(1-\alpha)$ percentage point of the standard Gumbel distribution. Table 2 summarizes empirical coverages and mean lengths of the confidence bands (18)-(19) obtained over $R=500$ repetitions for the same models and sample sizes considered in Table 1. Additionally and in order to see the effect of dependence of the time series at hand, we also report results for the case of an i.i.d. process with $X_{t}\sim{\mathcal{N}}(0,1)$ . Since the set of frequencies $\lambda_{s}$ captured by the confidence band (18)-(19) solely depends on the truncation lag $M_{T}$ , we present results for different values of this parameter.

			i.i.d.		Model I		Model II		Model III
			90%	95%	90%	95%	90%	95%	90%	95%
T=256
	$M_{T}=10$	Cov	84.0	89.4	76.2	84.0	68.4	75.8	60.2	65.8
		ML	0.12	0.13	0.25	0.28	0.77	0.84	0.16	0.18
	$M_{T}=14$	Cov	79.0	85.0	74.8	80.8	71.2	76.6	59.6	66.0
		ML	0.15	0.16	0.32	0.35	0.98	1.07	0.21	0.22
	$M_{T}=22$	Cov	69.0	74.4	68.2	74.8	65.6	71.4	55.8	64.0
		ML	0.20	0.21	0.45	0.49	1.32	1.43	0.28	0.30
T=512
	$M_{T}=14$	Cov	84.0	87.8	81.6	85.2	75.4	80.0	59.0	63.8
		ML	0.11	0.12	0.23	0.26	0.69	0.76	0.15	0.16
	$M_{T}=18$	Cov	82.8	86.6	82.4	85.6	74.2	80.2	59.4	64.6
		ML	0.12	0.14	0.28	0.30	0.82	0.89	0.17	0.19
	$M_{T}=26$	Cov	78.6	84.0	78.2	84.0.8	71.2	72.2	58.4	66.2
		ML	0.15	0.17	0.37	0.39	1.03	1.12	0.22	0.24
T=1024
	$M_{T}=18$	Cov	88.0	91.0	85.6	88.6	76.0	82.4	59.6	66.2
		ML	0.09	0.10	0.20	0.22	0.58	0.63	0.12	0.13
	$M_{T}=22$	Cov	85.2	90.6	83.6	80.8	76.2	82.6	60.2	67.8
		ML	0.10	0.11	0.23	0.25	0.66	0.72	0.14	0.15
	$M_{T}=30$	Cov	83.0	88.0	82.2	86.6	75.2	81.2	62.4	68.4
		ML	0.12	0.13	0.29	0.31	0.81	0.87	0.17	0.18

Table 2. Empirical coverages (Cov) and mean lengths (ML) of the simultaneous confidence bands (18) for different sample sizes and different values of the truncation-lag $M_{T}$ .
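The half-width in (19) is simple enough to evaluate by hand, and the following Python sketch does so for a vector of estimates $\widehat{f}_{T}(\lambda_{s})$ . It is only an illustration: the function names are hypothetical, the Parzen window is written in its usual piecewise form, and the standard Gumbel law is assumed to have distribution function $\exp(-\exp(-x))$ , so that $c_{1-\alpha}=-\log(-\log(1-\alpha))$ .

```python
import math

def parzen_window(u):
    """Parzen lag window w(u); zero outside [-1, 1]."""
    a = abs(u)
    if a <= 0.5:
        return 1.0 - 6.0 * a**2 + 6.0 * a**3
    if a <= 1.0:
        return 2.0 * (1.0 - a) ** 3
    return 0.0

def gumbel_quantile(p):
    """p-quantile of the Gumbel law with cdf exp(-exp(-x)) (assumed normalisation)."""
    return -math.log(-math.log(p))

def gumbel_half_width(f_hat_s, M_T, T, alpha, W2=151.0 / 280.0):
    """Half-width C_hat of (19) at a single frequency lambda_s."""
    mu_T = 2.0 * math.log(M_T) - math.log(math.pi * math.log(M_T))
    c = gumbel_quantile(1.0 - alpha)
    return math.sqrt(M_T / T * (c + mu_T) * f_hat_s**2 * W2)

def gumbel_band(f_hat, T, alpha):
    """Band (18); f_hat[s-1] is the estimate at lambda_s = s*pi/M_T, s = 1..M_T."""
    M_T = len(f_hat)
    band = []
    for value in f_hat:
        half = gumbel_half_width(value, M_T, T, alpha)
        band.append((value - half, value + half))
    return band
```

Because the half-width is proportional to $\widehat{f}_{T}(\lambda_{s})$ , the band is wider where the estimated spectral density is large, and it depends on the data only through the $M_{T}$ values of the estimator.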

As can be seen from Table 1, the empirical coverages of our confidence bands are, in general, close to the desired levels and they improve as the sample size increases. While the choice of the parameter $M_{T}$ , which specifies the number of empirical autocovariances effectively used in obtaining $\widehat{f}_{T}$ , seems not to be important for the performance of the method based on the Gaussian approximation, this method is more sensitive to the choice of the bandwidth parameter $b_{T}$ used in the estimation of the covariance matrix of the approximating Gaussian variables. This parameter should be chosen larger for processes with a stronger dependence structure than for rather weakly dependent data. This is a standard observation in the context of covariance estimation, see e.g. \citeasnounA91. Differences in the empirical coverages between the models can also be seen, with the bilinear Model III being a rather difficult case. This model clearly needs larger sample sizes than the other two models in order to obtain coverages close to the nominal ones. An inspection of Table 2 shows that the method based on the asymptotic Gumbel approximation, which uses a much coarser grid of frequencies, has difficulties in achieving the desired confidence levels even in the simplest case of a Gaussian i.i.d. process, while for Model III it leads to quite low empirical coverages even for $T=1024$ . Although, overall, the empirical coverages for the i.i.d. case, Model I and Model II improve slowly as the sample size increases, the results depend heavily on the choice of the truncation lag $M_{T}$ , and in most cases the coverages stay well below the desired level even for the largest sample size used in the simulation study.

6 Auxiliary Lemmas and Proofs

Throughout this section, $C$ denotes a generic constant that may vary from line to line. We first state the following useful lemmas. See also \citeasnounXiaoWu2014 for related results to Lemma 1.

Lemma 1.

Under Assumption 1, the following assertions hold true:

(i)

$|\gamma(j)|\leq C(1+|j|)^{-\alpha}$ ,
(ii)

$\sum_{j\in\mbox{$\mathbb{Z}$}}|j|^{r}|\gamma(j)|<\infty~{}\forall r<\alpha-1$ ,
(iii)

$\max_{j_{1},j_{2}\geq 0}|E(Y_{t,j_{1}}Y_{s,j_{2}})|\leq C(1+|s-t|)^{-\alpha}$ , where $Y_{t,j}=X_{t}X_{t-j}-\gamma(j)$ .

Proof.

(i)

We make use of the bound

\|P_{0}(X_{s})\|_{m}\leq\delta_{m}(s),s\geq 0,

(20)

of the so-called projection operator $P_{j-s}(X_{j}):=E[X_{j}|{\cal F}_{j-s}]-E[X_{j}|{\cal F}_{j-s-1}]$ for $s\geq 0$ and $j\in\mathbb{Z}$ , where ${\cal F}_{i}:=\sigma(e_{i},e_{i-1},\ldots)$ . For a proof of (20) we refer to \citeasnounWu2005, Theorem 1.
Note that $X_{t}=\sum_{s=0}^{\infty}P_{t-s}(X_{t})$ a.s. and in $L_{1}$ . This is the case because $E[X_{t}|{\cal F}_{t-s}]$ converges for $s\to\infty$ by the backward martingale convergence theorem a.s. and in $L_{1}$ towards a limit measurable with respect to ${\cal F}_{-\infty}$ , which is trivial because of the i.i.d. structure of $(e_{i})$ . Therefore the limit is constant and coincides with the mean of $X_{t}$ which is assumed to be zero. Then (i) follows since,

\begin{aligned}
|\gamma(j)| &= |E(X_{0}\,X_{j})|=\Big|\sum_{s_{1},s_{2}=0}^{\infty}E\big(P_{-s_{1}}(X_{0})P_{j-s_{2}}(X_{j})\big)\Big| \\
&= \Big|\sum_{s=0}^{\infty}E\big(P_{-s}(X_{0})P_{-s}(X_{j})\big)\Big|,\quad\text{because }E\big(P_{-s_{1}}(X_{0})P_{j-s_{2}}(X_{j})\big)=0\text{ for }-s_{1}\neq j-s_{2} \\
&\leq \sum_{s=0}^{\infty}E|P_{0}(X_{s})P_{0}(X_{j+s})|,\quad\text{because of stationarity} \\
&\leq \sum_{s=0}^{\infty}\|P_{0}(X_{s})\|_{m^{\prime}}\|P_{0}(X_{j+s})\|_{m},\quad\text{where }1/m+1/m^{\prime}=1 \\
&\leq \sum_{s=0}^{\infty}\|P_{0}(X_{s})\|_{m}\|P_{0}(X_{j+s})\|_{m},\quad\text{because }m^{\prime}\leq m\text{ for }m\geq 2 \\
&\leq \sum_{s=0}^{\infty}\delta_{m}(s)\delta_{m}(j+s),\quad\text{by (20).}
\end{aligned}

From this bound we get for $j\geq 0$ ,

\displaystyle|\gamma(j)|\leq C\sum_{s=0}^{\infty}\frac{1}{(1+s)^{\alpha}(1+s+j)^{\alpha}}\leq C\frac{1}{(1+j)^{\alpha}}\sum_{s=0}^{\infty}\frac{1}{(1+s)^{\alpha}}\leq C(1+j)^{-\alpha}.

(ii)

The assertion follows because

\sum_{j=1}^{\infty}j^{r}|\gamma(j)|\leq C\sum_{j=1}^{\infty}j^{r}(1+j)^{-\alpha}\leq C\sum_{j=1}^{\infty}\frac{1}{(1+j)^{\alpha-r}}<\infty

for $r<\alpha-1$ .

(iii)

We assume without loss of generality that $t>s$ and define $X_{r}^{(k)}\,:=\,E[X_{r}\mid\mathcal{F}_{r,k}]$ . The following three cases can then occur. If $j_{1}\geq t-s$ , then

\begin{aligned}
|E(Y_{t,j_{1}}Y_{s,j_{2}})| &\leq |E(X_{t}X_{t-j_{1}}X_{s}X_{s-j_{2}})|+|\gamma(j_{1})\gamma(j_{2})| \\
&= \big|E\big((X_{t}-X_{t}^{(t-s-1)})X_{t-j_{1}}X_{s}X_{s-j_{2}}\big)\big|+|\gamma(j_{1})\gamma(j_{2})| \\
&\leq \|X_{t-j_{1}}X_{s}X_{s-j_{2}}\|_{m/(m-1)}\,\|X_{t}-X_{t}^{(t-s-1)}\|_{m}+C(1+j_{1})^{-\alpha} \\
&\leq C\,(1+t-s)^{-\alpha}.
\end{aligned}

If $j_{1}\in[(t-s)/2,t-s]$ , then

\begin{aligned}
|E(Y_{t,j_{1}}Y_{s,j_{2}})| &\leq |E(X_{t}X_{t-j_{1}}X_{s}X_{s-j_{2}})|+|\gamma(j_{1})\gamma(j_{2})| \\
&= \big|E\big((X_{t}-X_{t}^{(t-j_{1}-1)})X_{t-j_{1}}X_{s}X_{s-j_{2}}\big)\big|+|\gamma(j_{1})\gamma(j_{2})| \\
&\leq \|X_{t-j_{1}}X_{s}X_{s-j_{2}}\|_{m/(m-1)}\,\|X_{t}-X_{t}^{(t-j_{1}-1)}\|_{m}+C(1+j_{1})^{-\alpha} \\
&\leq C(1+j_{1})^{-\alpha}\leq C(1+(t-s)/2)^{-\alpha}\leq C2^{\alpha}(1+t-s)^{-\alpha}.
\end{aligned}

Finally, if $j_{1}\in[0,(t-s)/2]$ , then

\begin{aligned}
|E(Y_{t,j_{1}}Y_{s,j_{2}})| &\leq \big|E\big((Y_{t,j_{1}}-Y_{t,j_{1}}^{(t-s-1)})Y_{s,j_{2}}\big)\big| \\
&\leq \|Y_{t,j_{1}}-Y_{t,j_{1}}^{(t-s-1)}\|_{m/(m-2)}\,\|Y_{s,j_{2}}\|_{m/2} \\
&\leq C\sum_{k=t-s}^{\infty}\delta_{m}(k)+C\sum_{k=t-s-j_{1}}^{\infty}\delta_{m}(k) \\
&\leq C(1+t-s)^{-\alpha}+C\sum_{k=(t-s)/2}^{\infty}\delta_{m}(k) \\
&\leq C\,2^{\alpha}(1+t-s)^{-\alpha}.
\end{aligned}

∎
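The elementary series bounds used in parts (i) and (ii) of the proof can be checked numerically. The Python sketch below is only a sanity check under the assumption $\delta_{m}(s)=(1+s)^{-\alpha}$ ; the function names and the truncation point `n_terms` are illustrative choices, not part of the paper.

```python
def autocov_bound_terms(j, alpha, n_terms=10_000):
    """Truncated versions of both sides of the bound in part (i).

    lhs: sum_s (1+s)^(-alpha) (1+s+j)^(-alpha)
    rhs: (1+j)^(-alpha) * sum_s (1+s)^(-alpha)
    The inequality lhs <= rhs holds term by term since 1+s+j >= 1+j.
    """
    lhs = sum((1 + s) ** (-alpha) * (1 + s + j) ** (-alpha) for s in range(n_terms))
    rhs = (1 + j) ** (-alpha) * sum((1 + s) ** (-alpha) for s in range(n_terms))
    return lhs, rhs

def partial_weighted_sum(r, alpha, n_terms):
    """Partial sum of j^r (1+j)^(-alpha), j = 1..n_terms, as in part (ii)."""
    return sum(j**r * (1 + j) ** (-alpha) for j in range(1, n_terms + 1))
```

For $r<\alpha-1$ the partial sums returned by `partial_weighted_sum` stabilise as `n_terms` grows, which is the numerical counterpart of the summability claimed in (ii).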

Lemma 2.

Suppose that Assumptions 1 to 3 hold. Then, we have

\displaystyle\big{\|}\frac{1}{\sqrt{T}}\sum_{j=0}^{M_{T}}\sum_{t=j+1}^{T}a_{k,j}\big{(}X_{t}X_{t-j}-\gamma(j)\big{)}\big{\|}_{m/2}\leq\mbox{C}

(21)

and

\sup_{k\in\{1,\dots,N_{T}\}}\Big|\,\Big\|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k}\Big\|_{2}^{2}-\,\frac{f^{2}(\lambda_{k,T})}{M_{T}}\sum_{j=-M_{T}}^{M_{T}}w^{2}\Big(\frac{j}{M_{T}}\Big)\,(1+\cos(2j\,\lambda_{k,T}))\Big|=o(1).

(22)

Further

\displaystyle\Big{\|}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big{\|}_{m/2}\,\leq\,\mbox{C}\cdot d_{s,m}

(23)

with the $m$ -dependent random variables ( $m=2s$ ) $\widetilde{Z}_{t,k}^{(s)}$ , $k=1,2,\ldots,N_{T},$ defined as:

\widetilde{Z}_{t,k}^{(s)}\,:=\,\sum_{j=0}^{M_{T}}a_{k,j}\,\left(X_{t}^{(s)}X_{t-j}^{(s)}\,-\,E[X_{t}^{(s)}X_{t-j}^{(s)}]\right)\,\mathbbm{1}_{t>j}\,,

(24)

where $X_{r}^{(s)}\,:=\,E[X_{r}\mid\mathcal{F}_{r,s}]$ and $s\geq M_{T}$ , and with

d_{s,m}\,:=\,\sum_{h=0}^{\infty}\min\Big{\{}\delta_{m}(h)\,,\,\big{(}\sum_{j=s+1}^{\infty}\delta_{m}^{2}(j)\big{)}^{1/2}\Big{\}}.

(25)

Proof.

Inequality (21) is a direct consequence of (S8) from the supplement of \citeasnounXiaoWu2014 since

\begin{aligned}
\Big\|\sum_{j=0}^{M_{T}}\sum_{t=j+1}^{T}a_{k,j}\big(X_{t}X_{t-j}-\gamma(j)\big)\Big\|_{m/2} &= \Big\|\sum_{i,j=1,\,i-j\in\{0,\ldots,M_{T}\}}^{T}a_{k,i-j}\big(X_{i}X_{j}-\gamma(i-j)\big)\Big\|_{m/2} \\
&\leq C\cdot\Big(\sum_{h=0}^{\infty}\delta_{m}(h)\Big)^{2}\max_{k=1,\ldots,N_{T}}\Big(\sum_{j=0}^{M_{T}}a_{k,j}^{2}\Big)^{1/2}\,\sqrt{T}.
\end{aligned}

For (22), first note that Theorem S.6 in the supplement of \citeasnounXiaoWu2014 assures summability of the joint fourth order cumulants. Hence,

\begin{aligned}
\Big\|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k}\Big\|_{2}^{2} &= \frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=|j_{1}|+1}^{T}\sum_{t_{2}=|j_{2}|+1}^{T}w\Big(\frac{j_{1}}{M_{T}}\Big)w\Big(\frac{j_{2}}{M_{T}}\Big)\,\cos(j_{1}\lambda_{k,T})\,\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2})\gamma(t_{1}-t_{2}+|j_{2}|-|j_{1}|) \\
&\quad+\frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=|j_{1}|+1}^{T}\sum_{t_{2}=|j_{2}|+1}^{T}w\Big(\frac{j_{1}}{M_{T}}\Big)w\Big(\frac{j_{2}}{M_{T}}\Big)\,\cos(j_{1}\lambda_{k,T})\,\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2}+|j_{2}|)\gamma(t_{1}-t_{2}-|j_{1}|)+R^{(1)}_{T,k}
\end{aligned}

(26)

with $\sup_{k}|R_{T,k}^{(1)}|=o(1)$ . For the first summand on the r.h.s. we get from $\sum_{k\in\mbox{$\mathbb{N}$}}k|\gamma(k)|<\infty$ that

\begin{aligned}
&\frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=|j_{1}|+1}^{T}\sum_{t_{2}=|j_{2}|+1}^{T}w\Big(\frac{j_{1}}{M_{T}}\Big)w\Big(\frac{j_{2}}{M_{T}}\Big)\,\cos(j_{1}\lambda_{k,T})\,\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2})\gamma(t_{1}-t_{2}+|j_{2}|-|j_{1}|) \\
&=\frac{1}{\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=1}^{M_{T}}\sum_{t_{1}=j_{1}+1}^{T}\sum_{t_{2}=j_{2}+1}^{T}w\Big(\frac{j_{1}}{M_{T}}\Big)w\Big(\frac{j_{2}}{M_{T}}\Big)\,\cos(j_{1}\lambda_{k,T})\,\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2})\gamma(t_{1}-t_{2}+j_{2}-j_{1})+R^{(2)}_{T,k} \\
&=\frac{1}{\pi^{2}M_{T}}\sum_{t\in\mathbb{Z}}\sum_{j\in\mathbb{Z}}\gamma(t)\gamma(t+j)\sum_{j_{1}=1\vee(1-j)}^{M_{T}\wedge(M_{T}-j)}w\Big(\frac{j_{1}}{M_{T}}\Big)w\Big(\frac{j+j_{1}}{M_{T}}\Big)\,\cos(j_{1}\lambda_{k,T}) \\
&\qquad\times\,\cos((j+j_{1})\lambda_{k,T})+R^{(3)}_{T,k} \\
&=\frac{1}{2\pi^{2}M_{T}}\sum_{t\in\mathbb{Z}}\sum_{j\in\mathbb{Z}}\gamma(t)\gamma(t+j)\sum_{j_{1}=1}^{M_{T}}w^{2}\Big(\frac{j_{1}}{M_{T}}\Big) \\
&\qquad\times\,\big[\cos(j\lambda_{k,T})\,+\,\cos((j+2j_{1})\lambda_{k,T})\big]+R^{(4)}_{T,k} \\
&=f^{2}(\lambda_{k,T})\,\frac{2}{M_{T}}\sum_{j_{1}=1}^{M_{T}}w^{2}\Big(\frac{j_{1}}{M_{T}}\Big)\,(1+\cos(2j_{1}\lambda_{k,T}))+R^{(5)}_{T,k},
\end{aligned}

where $\sup_{k}|R_{T,k}^{(\ell)}|=o(1),~{}\ell=2,\dots,5.$ Similarly, we obtain from $\sum_{k\in\mbox{$\mathbb{N}$}}k^{2}|\gamma(k)|<\infty$ for the second summand on the r.h.s. of (26)

\begin{aligned}
&\frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=|j_{1}|+1}^{T}\sum_{t_{2}=|j_{2}|+1}^{T}w\Big(\frac{j_{1}}{M_{T}}\Big)w\Big(\frac{j_{2}}{M_{T}}\Big)\,\cos(j_{1}\lambda_{k,T})\,\cos(j_{2}\lambda_{k,T}) \\
&\qquad\times\gamma(t_{1}-t_{2}+|j_{2}|)\gamma(t_{1}-t_{2}-|j_{1}|) \\
&=\frac{1}{\pi^{2}M_{T}}\sum_{t\in\mathbb{Z}}\sum_{j=2}^{2M_{T}}\gamma(t)\gamma(t-j)\,\sum_{j_{1}=1\vee(j-M_{T})}^{M_{T}\wedge(j-1)}w\Big(\frac{j_{1}}{M_{T}}\Big)w\Big(\frac{j-j_{1}}{M_{T}}\Big)\,\cos(j_{1}\lambda_{k,T}) \\
&\qquad\times\,\cos((j-j_{1})\lambda_{k,T})\,+R_{T,k}^{(6)} \\
&=\frac{1}{\pi^{2}M_{T}}\sum_{t\in\mathbb{Z}}\sum_{j=2}^{\sqrt{M_{T}}}\gamma(t)\gamma(t-j)\,\sum_{j_{1}=1}^{\sqrt{M_{T}}-1}w\Big(\frac{j_{1}}{M_{T}}\Big)w\Big(\frac{j-j_{1}}{M_{T}}\Big)\,\cos(j_{1}\lambda_{k,T}) \\
&\qquad\times\,\cos((j-j_{1})\lambda_{k,T})\,+R_{T,k}^{(7)} \\
&=R_{T,k}^{(8)},
\end{aligned}

where $\sup_{k}|R_{T,k}^{(\ell)}|=o(1),~{}\ell=6,\,7,\,8.$ This finishes the proof of (22).
Inequality (23) in turn can be deduced from Proposition 1 in \citeasnounLiuWu2010 as follows

\Big{\|}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big{\|}_{m/2}\leq\mbox{C}\,\sqrt{T}\,d_{s,m}\,\max_{k=1,\ldots,N_{T}}\left(\sum_{j=0}^{M_{T}}a_{k,j}^{2}\right)^{1/2}\,\sum_{h=0}^{\infty}\delta_{m}(h).

∎

Proof of Theorem 1.

We mainly follow the strategy of the proof of Theorem 1 in \citeasnounZhang_etal2022 although some of the conditions used there are not fulfilled in our case. In particular, a different $m$ -dependent approximation $\widetilde{Z}^{(s)}_{t,k}$ , i.e. $m=2s$ , is used in our proof which then leads to an improved rate of convergence. Using the notation $h_{\tau,\tau,x}$ as in \citeasnounZhang_etal2022, we get the bound

\begin{aligned}
&\sup_{x\in\mathbb{R}}\Big|P\Big\{\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\Big|\widehat{f}_{T}(\lambda_{k,T})-E\,\widehat{f}_{T}(\lambda_{k,T})\Big|\leq x\Big\}-P\Big\{\max_{k=1,\ldots,N_{T}}|\xi_{k}|\leq x\Big\}\Big| \\
&\quad\leq\sup_{x\in\mathbb{R}}\,\Big|Eh_{\tau,\tau,x}\Big(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,N_{T}}\Big)-Eh_{\tau,\tau,x}\left(\xi_{1},\dots,\xi_{N_{T}}\right)\Big| \\
&\qquad+C\,t\,\left(1+\sqrt{\log(N_{T})}+\sqrt{|\log(t)|}\right),
\end{aligned}

(27)

in view of (6) and (21), where $t=(1+\log(2N_{T}))/\tau$ for some $\tau>0$ to be specified later. For the $m$ -dependent random variables (with $m=2s$ ) $\widetilde{Z}_{t,k}^{(s)}$ , $k=1,2,\ldots,N_{T}$ , defined in (24), by the properties of the function $h_{\tau,\tau,x}$ and from Lemma 2, we get

\begin{aligned}
&\sup_{x\in\mathbb{R}}\,\Big|Eh_{\tau,\tau,x}\Big(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,N_{T}}\Big)-Eh_{\tau,\tau,x}\Big(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,N_{T}}\Big)\Big| \\
&\leq C\,\tau\,E\Big[\max_{k=1,\dots,N_{T}}\Big|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big|\Big] \\
&\leq C\,\tau\,N_{T}^{2/m}\,\max_{k=1,\dots,N_{T}}\Big\|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big\|_{m/2} \\
&\leq C\,\tau\,N_{T}^{2/m}\,d_{s,m}
\end{aligned}

(28)

with $d_{s,m}$ given in (25). The introduction of the random variables $\widetilde{Z}^{(s)}_{t,k}$ allows us to proceed with the classical big block/small block technique. Toward this, we define for any integer $l>2s$ big and small blocks of the $\widetilde{Z}_{t,k}^{(s)}$ as

S_{j,k}\,:=\,\frac{1}{\sqrt{T}}\sum_{t=2(j-1)(s+l)+1}^{2(j-1)(s+l)+2l}\widetilde{Z}_{t,k}^{(s)}\quad\text{and}\quad U_{j,k}\,:=\,\frac{1}{\sqrt{T}}\sum_{t=2(j-1)(s+l)+2l+1}^{(2j(s+l))\wedge T}\widetilde{Z}_{t,k}^{(s)}

for $j=1,\dots,\lceil\frac{T}{2(s+l)}\rceil=:V_{T}$ . Smoothness of $h_{\tau,\tau,x}$ , Rosenthal’s inequality and similar arguments as in (21) yield

\begin{aligned}
&\sup_{x\in\mathbb{R}}\,\Big|Eh_{\tau,\tau,x}\Big(\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,N_{T}}\Big)-Eh_{\tau,\tau,x}\Big(\sum_{j=1}^{V_{T}}S_{j,1},\dots,\sum_{j=1}^{V_{T}}S_{j,N_{T}}\Big)\Big| \\
&\leq C\,\tau\,N_{T}^{2/m}\,\max_{k=1,\dots,N_{T}}\Big\|\sum_{j=1}^{V_{T}}U_{j,k}\Big\|_{m/2} \\
&\leq C\,\tau\,N_{T}^{2/m}\,\sqrt{V_{T}}\,\max_{k=1,\dots,N_{T}}\max_{j=1,\dots,V_{T}}\|U_{j,k}\|_{m/2} \\
&\leq C\,\tau\,N_{T}^{2/m}\,\sqrt{\frac{V_{T}\,s}{T}} \\
&\leq C\,\tau\,N_{T}^{2/m}\,\sqrt{\frac{s}{l}}.
\end{aligned}

(29)
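To make the index bookkeeping of the big block/small block construction concrete, the following Python sketch lists the time indices entering each $S_{j,k}$ (length $2l$ ) and each $U_{j,k}$ (length $2s$ ). It is an illustration only: the function name is hypothetical, and truncating the last big block at $T$ is an assumption made here so that the blocks exactly partition $\{1,\dots,T\}$ .

```python
import math

def big_small_blocks(T, s, l):
    """Return (big, small): lists of 1-based inclusive (start, end) index pairs.

    Big block j covers 2(j-1)(s+l)+1 .. 2(j-1)(s+l)+2l,
    small block j covers 2(j-1)(s+l)+2l+1 .. min(2j(s+l), T).
    """
    period = 2 * (s + l)
    V_T = math.ceil(T / period)
    big, small = [], []
    for j in range(1, V_T + 1):
        base = (j - 1) * period
        big_start, big_end = base + 1, min(base + 2 * l, T)  # truncation at T is an assumption
        small_start, small_end = base + 2 * l + 1, min(j * period, T)
        big.append((big_start, big_end))
        if small_start <= small_end:  # the last small block may be empty
            small.append((small_start, small_end))
    return big, small
```

Since the $\widetilde{Z}^{(s)}_{t,k}$ are $2s$ -dependent and consecutive big blocks are separated by $2s$ indices, this spacing is what allows the big blocks to be handled by Lindeberg-type arguments for independent summands.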

Next, we define centred, joint normal random variables $(S^{*}_{j,k})_{j=1,\dots,V_{T},k=1,\dots,N_{T}}$ with $E[S^{*}_{j,k_{1}}S^{*}_{j,k_{2}}]=E[S_{j,k_{1}}S_{j,k_{2}}]$ and such that $(S^{*}_{j,1},\dots,S^{*}_{j,N_{T}}),$ $j=1,\dots,V_{T}$ are independent. Further these variables are constructed such that they are independent of $(S_{j,k})_{j=1,\dots,V_{T},k=1,\dots,N_{T}}$ . We obtain from Lindeberg’s method and using

\|S_{j,k}^{\ast}\|_{m/2}\leq C_{m}\|S^{\ast}_{j,k}\|_{2}=C_{m}\|S_{j,k}\|_{2}\leq C_{m}\|S_{j,k}\|_{m/2},

that,

\begin{aligned}
&\sup_{x\in\mathbb{R}}\,\Big|Eh_{\tau,\tau,x}\Big(\sum_{j=1}^{V_{T}}S_{j,1},\dots,\sum_{j=1}^{V_{T}}S_{j,N_{T}}\Big)-Eh_{\tau,\tau,x}\Big(\sum_{j=1}^{V_{T}}S_{j,1}^{\ast},\dots,\sum_{j=1}^{V_{T}}S_{j,N_{T}}^{\ast}\Big)\Big| \\
&\leq C\,(1+\tau^{3})\,\sum_{j=1}^{V_{T}}E\max_{k=1,\dots,N_{T}}\big[|S_{j,k}|^{3}\,+\,|S_{j,k}^{\ast}|^{3}\big] \\
&\leq C^{\prime}_{m}\,(1+\tau^{3})\,N_{T}^{6/m}\,V_{T}\,\left(\frac{l}{T}\right)^{3/2}.
\end{aligned}

(30)

Using the bounds

\|\frac{1}{\sqrt{T}}\sum_{i=1}^{T}(Z_{i,k}-\widetilde{Z}_{i,k}^{(s)})\|_{m/2}\leq Cd_{s,m},\ \ \|\frac{1}{\sqrt{T}}\sum_{i=1}^{T}Z_{i,k}\|_{m/2}\leq C,

derived in Lemma 2,

\begin{aligned}
\Big|\sum_{j_{1}=1}^{V_{T}}\sum_{j_{2}=1}^{V_{T}}ES_{j_{1},k_{1}}U_{j_{2},k_{2}}\Big| &\leq \Big\|\sum_{j=1}^{V_{T}}S_{j,k_{1}}\Big\|_{m/2}\Big\|\sum_{j=1}^{V_{T}}U_{j,k_{2}}\Big\|_{m/2} \\
&\leq V_{T}\max_{j=1,\ldots,V_{T}}\|S_{j,k_{1}}\|_{m/2}\max_{j=1,\ldots,V_{T}}\|U_{j,k_{2}}\|_{m/2} \\
&\leq C\,V_{T}\,\frac{\sqrt{ls}}{T}
\end{aligned}

and

|\sum_{j_{1}=1}^{V_{T}}\sum_{j_{2}=1}^{V_{T}}EU_{j_{1},k_{1}}U_{j_{2},k_{2}}|\leq V_{T}\max_{j=1,\ldots,V_{T}}\|U_{j,k_{1}}\|_{m/2}\max_{j=1,\ldots,V_{T}}\|U_{j,k_{2}}\|_{m/2}\leq CV_{T}\frac{s}{T},

we get

\sup_{x\in\mathbb{R}}\,\Big|Eh_{\tau,\tau,x}\Big(\sum_{j=1}^{V_{T}}S^{\ast}_{j,1},\dots,\sum_{j=1}^{V_{T}}S^{\ast}_{j,N_{T}}\Big)-Eh_{\tau,\tau,x}\Big(\xi_{1},\ldots,\xi_{N_{T}}\Big)\Big|\leq C\,\tau^{2}\,\Big(d_{s,m}+V_{T}\frac{\sqrt{sl}}{T}\Big).

(31)

Now, let $\tau\sim T^{\lambda}$ , $s\sim T^{a_{s}}$ and $l\sim T^{a_{l}}$ for some positive numbers $\lambda$ , $a_{s}$ and $a_{l}$ . For (27), (28), (29), (30) and (31) to vanish asymptotically, the following conditions are sufficient:

(i)

$t\big{(}1+\sqrt{\log(N_{T})}+\sqrt{|\log(t)|}\big{)}\rightarrow 0$ ,
(ii)

$2\lambda+4/m+a_{s}<a_{l}$ , $6\lambda+12/m+a_{l}<1$ , $4\lambda+a_{s}<a_{l}$ and
(iii)

$\big{(}T^{\lambda+2/m}+T^{2\lambda}\big{)}d_{s,m}\rightarrow 0$ .

For (i) and because $t=(1+\log(2N_{T}))/\tau$ , it is easily seen that

t\big{(}1+\sqrt{\log(N_{T})}+\sqrt{|\log(t)|}\big{)}\leq CT^{-\lambda}\big{(}\log(N_{T})\big{)}^{3/2},

which converges to zero provided $\lambda>0$ . Furthermore, since $m>16$ , (ii) is satisfied if $a_{s}+\max\{4\lambda,2\lambda+4/m\}<a_{l}<1-6\lambda-12/m$ . Finally, for (iii) notice first that for $0<\delta<\alpha-1$ and because $s\sim T^{a_{s}}$ ,

	$\displaystyle\big{(}\sum_{j=s+1}^{\infty}\delta^{2}_{m}(j)\big{)}^{1/2}$	$\displaystyle=\big{(}\sum_{j=s+1}^{\infty}j^{-2\alpha+1+2\delta}\cdot j^{-(1+2\delta)}\big{)}^{1/2}$
		$\displaystyle\leq Cs^{-\alpha+\delta+1/2}\leq CT^{-a_{s}(\alpha-\delta-1/2)}.$

Now, for $r>0$ we have,

	$\displaystyle\sum_{h=0}^{\infty}\min\{(1+h)^{-\alpha},T^{-a_{s}(\alpha-1/2-\delta)}\}$
	$\displaystyle\leq\sum_{h=0}^{T^{r}}T^{-a_{s}(\alpha-1/2-\delta)}+\sum_{h=T^{r}}^{\infty}(1+h)^{-\alpha+1+\delta}h^{-1-\delta}$
	$\displaystyle\leq T^{r-a_{s}(\alpha-1/2-\delta)}+CT^{-r(\alpha-1-\delta)}.$

Balancing both terms in the last bound above yields,

r-a_{s}(\alpha-1/2-\delta)=-r(\alpha-1-\delta)\ \Longleftrightarrow r=\frac{a_{s}(\alpha-1/2-\delta)}{\alpha-\delta}

and, therefore,

\sum_{h=0}^{\infty}\min\big{\{}\delta_{m}(h),\big{(}\sum_{j=s+1}^{\infty}\delta^{2}_{m}(j)\big{)}^{1/2}\big{\}}=T^{-\frac{\displaystyle a_{s}(\alpha-1/2-\delta)(\alpha-1-\delta)}{\displaystyle\alpha-\delta}}\rightarrow 0.

Hence for (iii) to be satisfied,

\frac{\displaystyle a_{s}(\alpha-1/2-\delta)(\alpha-1-\delta)}{\displaystyle\alpha-\delta}-\max\{\lambda+2/m\,,\,2\lambda\}>0

(32)

should hold true. Notice that for $\alpha$ satisfying condition (7), (32) holds true and at the same time the rate in (32) is larger or equal to $\min\{\kappa_{1},\kappa_{2}\}$ . ∎
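The balancing step that leads to (32) can be reproduced with a few lines of arithmetic. The Python sketch below computes the balancing exponent $r$ , the resulting polynomial rate of $d_{s,m}$ , and checks condition (32) for given exponents; the function names and the numerical values in the usage lines are illustrative assumptions and are not taken from the paper.

```python
def balanced_r(a_s, alpha, delta):
    """Exponent r solving r - a_s(alpha-1/2-delta) = -r(alpha-1-delta)."""
    return a_s * (alpha - 0.5 - delta) / (alpha - delta)

def d_rate(a_s, alpha, delta):
    """Exponent kappa such that d_{s,m} is of order T^(-kappa) for s ~ T^(a_s)."""
    return balanced_r(a_s, alpha, delta) * (alpha - 1.0 - delta)

def condition_32(a_s, alpha, delta, lam, m):
    """Check condition (32): kappa - max(lam + 2/m, 2*lam) > 0."""
    return d_rate(a_s, alpha, delta) - max(lam + 2.0 / m, 2.0 * lam) > 0

# Illustrative values only.
kappa = d_rate(a_s=0.2, alpha=6.0, delta=0.1)
ok = condition_32(a_s=0.2, alpha=6.0, delta=0.1, lam=0.05, m=32)
```

The sketch makes visible that a faster decay of the dependence measure (larger $\alpha$ ) increases the rate of $d_{s,m}$ and therefore leaves more room for the choice of $\lambda$ .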

Proof of Theorem 2.

Let

R_{T}(\lambda_{k,T})=\sqrt{\frac{T}{M_{T}}}\,\frac{\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})}{f(\lambda_{k,T})}\,\left(\frac{f(\lambda_{k,T})}{\widehat{f}_{T}(\lambda_{k,T})}-1\right)

and $t=(1+\log(2N_{T}))/\tau$ with $\tau\sim T^{\lambda}$ . Then we can split up

\begin{aligned}
&P\big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big)-P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\Big) \\
&\leq P\big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big)-P\big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x-t\big) \\
&\quad+\,P\big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x-t\big)-P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{f(\lambda_{k,T})}\leq x-t\Big) \\
&\quad+\,P\big(\max_{k=1,\ldots,N_{T}}|R_{T}(\lambda_{k,T})|>t\big).
\end{aligned}

One can proceed analogously to obtain a lower bound which then results in

\begin{aligned}
&\sup_{x\in\mathbb{R}}\Big|\,P\big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big)-P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\Big)\,\Big| \\
&\leq\sup_{x\in\mathbb{R}}\Big|P\big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big)-P\big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x-t\big)\Big| \\
&\quad+\,\sup_{x\in\mathbb{R}}\Big|P\big(\max_{k=1,\ldots,N_{T}}|\widetilde{\xi}_{k}|\leq x\big)-P\Big(\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{f(\lambda_{k,T})}\leq x\Big)\Big| \\
&\quad+\,\sup_{x\in\mathbb{R}}\Big|P\big(\max_{k=1,\ldots,N_{T}}|R_{T}(\lambda_{k,T})|>t\big)\Big| \\
&=P_{1}+P_{2}+P_{3}
\end{aligned}

with obvious abbreviations for $P_{1},~{}P_{2},$ and $P_{3}$ . Lemma A.1 in \citeasnounZhang_etal2022 gives

P_{1}\,\leq\,C\,t\,\left(1+\sqrt{\log(N_{T})}+\sqrt{|\log(t)|}\right).

The desired rate for $P_{2}$ can be derived with exactly the same arguments as in the proof of Theorem 1 since the spectral density is assumed to be uniformly bounded from below.

Finally, for $P_{3}$ we make use of the assumed bias property (11) of $\widehat{f}_{T}$ , that is of

\sup_{\lambda}|E\widehat{f}_{T}(\lambda)-f(\lambda)|=O(M_{T}^{-2}).

This allows to bound $P_{3}$ as follows. We have

\begin{aligned}
|R_{T}(\lambda_{k,T})| &= \sqrt{\frac{T}{M_{T}}}\frac{|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|}{f(\lambda_{k,T})}\,\Big|\frac{f(\lambda_{k,T})}{\widehat{f}_{T}(\lambda_{k,T})}-1\Big| \\
&= \sqrt{\frac{T}{M_{T}}}\,\big|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})\big|\,\frac{|\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})|}{\widehat{f}_{T}(\lambda_{k,T})\,f(\lambda_{k,T})}\,.
\end{aligned}

A division of the considerations depending on whether $\max_{k}|\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})|$ is less or equal or larger than $\inf_{\lambda}f(\lambda)/2$ results in

\begin{aligned}
&P\big(\max_{k=1,\dots,N_{T}}|R_{T}(\lambda_{k,T})|>t\big) \\
&\leq P\Big(\sqrt{\frac{T}{M_{T}}}\max_{k=1,\dots,N_{T}}|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|\,\frac{2}{\inf_{\lambda}f(\lambda)^{2}}\,|\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})|>t\Big) \\
&\quad+P\big(\max_{k=1,\dots,N_{T}}|\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})|>\inf_{\lambda}f(\lambda)/2\big)\,.
\end{aligned}

The first summand is bounded through

\begin{aligned}
&Ct^{-1}\sqrt{\frac{T}{M_{T}}}\Big(E\max_{k=1,\dots,N_{T}}|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|^{2}+E\max_{k=1,\dots,N_{T}}|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})|\cdot M_{T}^{-2}\Big) \\
&\leq\,Ct^{-1}\Big(N_{T}^{4/m}\sqrt{\frac{M_{T}}{T}}+N_{T}^{2/m}M_{T}^{-2}\Big)\,,
\end{aligned}

where for arbitrary random variables $U_{k}$ we make use of $E\max_{k=1,\dots,N_{T}}|U_{k}|\leq(E\sum_{k=1}^{N_{T}}|U_{k}|^{r})^{1/r}\leq N_{T}^{1/r}\max_{k=1,\dots,N_{T}}\|U_{k}\|_{r}$ , $r\geq 1,$ and the last inequality follows from Lemma 2.
A similar consideration for the second summand finally leads to

P(\max_{k=1,\dots,N_{T}}|R_{T}(\lambda_{k,T})|>t)\leq\,C(t^{-1}\,N_{T}^{4/m}+N_{T}^{2/m})\,\Big{(}\sqrt{\frac{M_{T}}{T}}+M_{T}^{-2}\Big{)}\,.

Using $M_{T}\sim T^{a_{s}}$ and $t^{-1}=T^{\lambda}\big{/}(1+\log(2N_{T}))$ , the above bound for $P(\max_{k=1,\dots,N_{T}}|R_{T}(\lambda_{k,T})|>t)$ implies the order

\frac{\displaystyle T^{\lambda+a_{s}/2}T^{4/m}}{\displaystyle T^{1/2}(1+\log(2N_{T}))}+\frac{\displaystyle T^{\lambda}T^{4/m}}{\displaystyle T^{2a_{s}}(1+\log(2N_{T}))},

for $P_{3}$ , which for $\lambda+\kappa\leq\min\{1/2-a_{s}/2-4/m\,,\,2a_{s}-4/m\}$ leads to $P_{3}=O(T^{-\kappa}/\log(2N_{T}))$ . ∎
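The maximal inequality $E\max_{k}|U_{k}|\leq(E\sum_{k}|U_{k}|^{r})^{1/r}\leq N_{T}^{1/r}\max_{k}\|U_{k}\|_{r}$ used in this proof also holds under any empirical measure, so it can be checked exactly on simulated data. The Python sketch below is such a check; the function name and the Gaussian test data are illustrative assumptions.

```python
import random

def maximal_inequality_chain(samples, r):
    """Return the three quantities of the chain under the empirical measure.

    samples[i][k] is the i-th draw of U_k; r >= 1 is the moment order.
    """
    n, N = len(samples), len(samples[0])
    # E max_k |U_k|
    mean_max = sum(max(abs(u) for u in row) for row in samples) / n
    # (E sum_k |U_k|^r)^(1/r)
    middle = (sum(sum(abs(u) ** r for u in row) for row in samples) / n) ** (1.0 / r)
    # N^(1/r) * max_k ||U_k||_r
    norms = [(sum(abs(row[k]) ** r for row in samples) / n) ** (1.0 / r) for k in range(N)]
    upper = N ** (1.0 / r) * max(norms)
    return mean_max, middle, upper

rng = random.Random(0)
draws = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(500)]
mean_max, middle, upper = maximal_inequality_chain(draws, r=4)
```

In the proof the inequality is applied with $r=m/2$ , which is where the factors $N_{T}^{2/m}$ and $N_{T}^{4/m}$ come from.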

Proof of Proposition 1.

First, we split up

\begin{aligned}
&\big|\widehat{\sigma}_{T}(j_{1},j_{2})\,-\,\sigma_{T}(j_{1},j_{2})\big| \\
&\leq\Big|\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}K\Big(\frac{t-s}{b_{T}}\Big)\big(X_{t}X_{t-j_{1}}-\gamma(j_{1})\big)\big(X_{s}X_{s-j_{2}}-\gamma(j_{2})\big)\,-\,\sigma(j_{1},j_{2})\Big| \\
&\quad+\,|\widehat{\gamma}(j_{1})-\gamma(j_{1})|\,\Big|\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}K\Big(\frac{t-s}{b_{T}}\Big)\big(X_{s}X_{s-j_{2}}-\gamma(j_{2})\big)\Big| \\
&\quad+\,|\widehat{\gamma}(j_{2})-\gamma(j_{2})|\,\Big|\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}K\Big(\frac{t-s}{b_{T}}\Big)\big(X_{t}X_{t-j_{1}}-\gamma(j_{1})\big)\Big| \\
&\quad+\,|\widehat{\gamma}(j_{1})-\gamma(j_{1})|\,|\widehat{\gamma}(j_{2})-\gamma(j_{2})|\,\Big|\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}K\Big(\frac{t-s}{b_{T}}\Big)\Big|.
\end{aligned}

(33)

While the last summand is of order $O(b_{T}\,T^{-1})$ (see (E.4) in \citeasnounZhang_etal2022), the two middle terms can be bounded from above by $O(b_{T}\,T^{-1/2})$ using the Cauchy-Schwarz inequality. Both bounds hold uniformly in $j_{1}$ and $j_{2}$ . Let $Y_{t,j}=X_{t}X_{t-j}-\gamma(j)$ and

\widetilde{\sigma}_{T}(j_{1},j_{2})=\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}K\Big{(}\frac{t-s}{b_{T}}\Big{)}Y_{t,j_{1}}Y_{s,j_{2}}.

Then the first term on the right-hand side of the bound given in (33) equals $|\widetilde{\sigma}_{T}(j_{1},j_{2})-\sigma(j_{1},j_{2})|$ , and for this we have

\begin{aligned}
|\widetilde{\sigma}_{T}(j_{1},j_{2})-\sigma(j_{1},j_{2})| &\leq \Big|\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}\Big(K\Big(\frac{t-s}{b_{T}}\Big)-1\Big)E(Y_{t,j_{1}}Y_{s,j_{2}})\Big| \\
&\quad+\Big|\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}\Big(Y_{t,j_{1}}Y_{s,j_{2}}-E(Y_{t,j_{1}}Y_{s,j_{2}})\Big)K\Big(\frac{t-s}{b_{T}}\Big)\Big|.
\end{aligned}

(34)

Using Lemma 1(iii), the first term on the right-hand side of (34) can be bounded by

C\frac{1}{T}\sum_{t=1}^{T}\sum_{s=1}^{T}\Big(1-K\Big(\frac{t-s}{b_{T}}\Big)\Big)\frac{1}{(1+|t-s|)^{\alpha}}\leq 2C\sum_{s=0}^{\infty}\Big(1-K\Big(\frac{s}{b_{T}}\Big)\Big)\frac{1}{(1+s)^{\alpha}}.

Let $S=b_{T}$ . Using $|1-K(s/b_{T})|\leq\sup_{u\in[0,1]}|K^{\prime}(u)|s/b_{T}$ for $0\leq s\leq S$ and $|1-K(s/b_{T})|\leq 1$ for $s\geq S+1$ , we get

\sum_{s=0}^{\infty}\Big(1-K\Big(\frac{s}{b_{T}}\Big)\Big)\frac{1}{(1+s)^{\alpha}}\leq\frac{\sup_{u\in[0,1]}|K^{\prime}(u)|}{b_{T}}\sum_{s=0}^{S}\frac{1}{(1+s)^{\alpha-1}}+\sum_{s=S+1}^{\infty}\frac{1}{(1+s)^{\alpha}},

where the first term is $O(b_{T}^{-1})$ because $\alpha>2$ . For the second term we get

\sum_{s=S+1}^{\infty}\frac{1}{(1+s)^{\alpha}}\leq\int_{S+1}^{\infty}\frac{1}{x^{\alpha}}dx=\frac{1}{(\alpha-1)(S+1)^{\alpha-1}}=O(b_{T}^{-1}).

Hence

\max_{1\leq j_{1},j_{2}\leq M_{T}}\Big{|}\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}\Big{(}K\Big{(}\frac{t-s}{b_{T}}\Big{)}-1\Big{)}E(Y_{t,j_{1}}Y_{s,j_{2}})\Big{|}=O(b_{T}^{-1}).
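The $O(b_{T}^{-1})$ order of this truncation bias can be illustrated numerically. The Python sketch below evaluates the series $\sum_{s\geq 0}(1-K(s/b_{T}))(1+s)^{-\alpha}$ for the Bartlett kernel $K(u)=(1-|u|)_{+}$ ; this kernel, the function names and the truncation point `n_terms` are assumptions made for the illustration and are not the choices made in the paper.

```python
def bartlett_kernel(u):
    """Bartlett kernel K(u) = max(0, 1 - |u|); an illustrative choice only."""
    return max(0.0, 1.0 - abs(u))

def truncation_bias_sum(b_T, alpha, n_terms=100_000):
    """Truncated series sum_s (1 - K(s/b_T)) / (1+s)^alpha, s = 0..n_terms-1."""
    return sum((1.0 - bartlett_kernel(s / b_T)) / (1 + s) ** alpha
               for s in range(n_terms))

# For alpha > 2 the product b_T * truncation_bias_sum(b_T, alpha) should stay bounded.
scaled = [b * truncation_bias_sum(b, 3.0) for b in (10, 100, 1000)]
```

If the scaled values remain bounded as $b_{T}$ grows, this is in line with the bias bound derived above, whereas for $\alpha\leq 2$ the first sum in the bound is no longer uniformly bounded in $S$ .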

Consider next the second term of the bound given in (34) and observe that this term can be bounded by

M_{T}^{8/m}\max_{1\leq j_{1},j_{2}\leq M_{T}}\Big{\|}\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}\Big{(}Y_{t,j_{1}}Y_{s,j_{2}}-E(Y_{t,j_{1}}Y_{s,j_{2}})\Big{)}K\Big{(}\frac{t-s}{b_{T}}\Big{)}\Big{\|}_{m/4},

(35)

where

\begin{aligned}
&\Big\|\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}\Big(Y_{t,j_{1}}Y_{s,j_{2}}-E(Y_{t,j_{1}}Y_{s,j_{2}})\Big)K\Big(\frac{t-s}{b_{T}}\Big)\Big\|_{m/4} \\
&\leq\frac{1}{T}\sum_{\ell=0}^{T-1}K(\ell/b_{T})\Big\|\sum_{t=1}^{T-\ell}\big(Y_{t,j_{2}}Y_{t+\ell,j_{1}}-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}})\big)\Big\|_{m/4} \\
&\quad+\frac{1}{T}\sum_{\ell=1}^{T-1}K(\ell/b_{T})\Big\|\sum_{s=1}^{T-\ell}\big(Y_{s,j_{1}}Y_{s+\ell,j_{2}}-E(Y_{s,j_{1}}Y_{s+\ell,j_{2}})\big)\Big\|_{m/4}.
\end{aligned}

Recall that for $s\geq 0$ , ${\mathcal{F}}_{r,s}$ denotes the $\sigma$ -algebra generated by the set of random variables $\{e_{r},e_{r-1},\ldots,e_{r-s}\}$ . Note that, by Assumption A.1, $Y_{t,j_{2}}Y_{t+\ell,j_{1}}=H(e_{t+\ell},e_{t+\ell-1},\ldots)$ for some measurable function $H$ and that the $e_{t}$ ’s are i.i.d. Then

\begin{aligned}
Y_{t,j_{2}}Y_{t+\ell,j_{1}}-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}) &= E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\mid{\mathcal{F}}_{t+\ell,\infty})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}) \\
&=\lim_{q\rightarrow\infty}E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\mid{\mathcal{F}}_{t+\ell,q})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}) \\
&=E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\mid{\mathcal{F}}_{t+\ell,0})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}) \\
&\quad+\sum_{r=1}^{\infty}\big\{E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\mid{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\mid{\mathcal{F}}_{t+\ell,r-1})\big\},
\end{aligned}

a.s. Therefore,

	$\displaystyle\big{\\|}\sum_{t=1}^{T-\ell}($	$\displaystyle Y_{t,j_{2}}Y_{t+\ell,j_{1}}-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}))\big{\\|}_{m/4}$
	$\displaystyle\leq$	$\displaystyle\big{\\|}\sum_{t=1}^{T-\ell}\big{\{}E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\|{\mathcal{F}}_{t+\ell,0})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}})\big{\}}\big{\\|}_{m/4}$
		$\displaystyle+\sum_{r=1}^{\infty}\big{\\|}\sum_{t=1}^{T-\ell}\big{\{}E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\|{\mathcal{F}}_{t+\ell,r-1})\big{\}}\big{\\|}_{m/4}$
	$\displaystyle=$	$\displaystyle S_{1,n}+S_{2,n},$

with an obvious notation for $S_{1,n}$ and $S_{2,n}$ . Consider $S_{1,n}$ and observe that $E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}|{\mathcal{F}}_{t+\ell,0})=\widetilde{g}(e_{t+\ell})$ and therefore $E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}|{\mathcal{F}}_{t+\ell,0})$ and $E(Y_{s,j_{2}}Y_{s+\ell,j_{1}}|{\mathcal{F}}_{s+\ell,0})$ are independent for $t\neq s$ . Hence,

	$\displaystyle S_{1,n}$	$\displaystyle\leq C\sqrt{\sum_{t=1}^{T-\ell}\big{\\|}\big{\{}E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\|{\mathcal{F}}_{t+\ell,0})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}})\big{\}}\big{\\|}^{2}_{m/4}}$
		$\displaystyle\leq C\sqrt{T}\big{\\|}E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\|{\mathcal{F}}_{t+\ell,0})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}})\big{\\|}_{m/4}.$

To bound the term $S_{2,n}$ define first

W_{u}=\sum_{t=T-\ell-u+1}^{T-\ell}\big{\{}E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}|{\mathcal{F}}_{t+\ell,r-1})\big{\}}

and denote by ${\mathcal{A}}_{u}$ the $\sigma$ -algebra generated by the set $\{e_{T},e_{T-1},\ldots,e_{T-u+1-r}\}$ . Notice that $W_{u}$ is measurable with respect to ${\mathcal{A}}_{u}$ and that ${\mathcal{A}}_{u}\subset{\mathcal{A}}_{u+1}$ . Furthermore,

	$\displaystyle E($	$\displaystyle W_{u+1}-W_{u}\|{\mathcal{A}}_{u})$
		$\displaystyle=E\Big{[}E\big{(}Y_{T-\ell-u,j_{2}}Y_{T-u,j_{1}}\|{\mathcal{F}}_{T-u,r}\big{)}-E\big{(}Y_{T-\ell-u,j_{2}}Y_{T-u,j_{1}}\|{\mathcal{F}}_{T-u,r-1}\big{)}\Big{\|}{\mathcal{A}}_{u}\Big{]}$
		$\displaystyle=E\big{(}Y_{T-\ell-u,j_{2}}Y_{T-u,j_{1}}\|{\mathcal{F}}_{T-u,r-1}\big{)}-E\big{(}Y_{T-\ell-u,j_{2}}Y_{T-u,j_{1}}\|{\mathcal{F}}_{T-u,r-1}\big{)}$
		$\displaystyle=0.$

Since $\{W_{u}\}$ forms a martingale with respect to $\{{\mathcal{A}}_{u}\}$ , we get

\|W_{T-\ell}\|_{m/4}\leq C\sqrt{\sum_{t=1}^{T-\ell}\|E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}|{\mathcal{F}}_{t+\ell,r-1})\|^{2}_{m/4}}.

Recall that $Y_{t,j_{2}}Y_{t+\ell,j_{1}}=H(e_{t+\ell},e_{t+\ell-1},\ldots)$ for some measurable function $H$ and let

Y_{t,j_{2}}(r-\ell)Y_{t+\ell,j_{1}}(r)=H(e_{t+\ell},e_{t+\ell-1},\ldots,e_{t+\ell-r+1},e^{\prime}_{t+\ell-r},e_{t+\ell-r-1},\ldots),

where $\{e_{t}^{\prime},t\in\mbox{$\mathbb{Z}$}\}$ is an independent copy of $\{e_{t},t\in\mbox{$\mathbb{Z}$}\}$ . Then,

	$\displaystyle\\|E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\|{\mathcal{F}}_{t+\ell,r})$	$\displaystyle-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\|{\mathcal{F}}_{t+\ell,r-1})\\|_{m/4}$
		$\displaystyle=\\|E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}\|{\mathcal{F}}_{t+\ell,r})-E(Y_{t,j_{2}}(r-\ell)Y_{t+\ell,j_{1}}(r)\|{\mathcal{F}}_{t+\ell,r})\\|_{m/4}$
		$\displaystyle\leq\\|Y_{t,j_{2}}Y_{t+\ell,j_{1}}-Y_{t,j_{2}}(r-\ell)Y_{t+\ell,j_{1}}(r)\\|_{m/4},$

that is,

\|W_{T-\ell}\|_{m/4}\leq C\sqrt{T}\|Y_{t,j_{2}}Y_{t+\ell,j_{1}}-Y_{t,j_{2}}(r-\ell)Y_{t+\ell,j_{1}}(r)\|_{m/4}.

Hence,

	$\displaystyle S_{2,n}=$	$\displaystyle\sum_{r=1}^{\infty}\\|W_{T-\ell}\\|_{m/4}$
	$\displaystyle\leq$	$\displaystyle\ C\sqrt{T}\sum_{r=1}^{\infty}\\|Y_{t,j_{2}}Y_{t+\ell,j_{1}}-Y_{t,j_{2}}(r-\ell)Y_{t+\ell,j_{1}}(r)\\|_{m/4}$
	$\displaystyle\leq$	$\displaystyle C\sqrt{T}\Big{\{}\sum_{r=1}^{\infty}\\|Y_{t+\ell,j_{1}}\\|_{m/2}\\|Y_{t,j_{2}}(r-\ell)-Y_{t,j_{2}}\\|_{m/2}$
		$\displaystyle+\sum_{r=1}^{\infty}\\|Y_{t,j_{2}}(r-\ell)\\|_{m/2}\\|Y_{t+\ell,j_{1}}(r)-Y_{t+\ell,j_{1}}\\|_{m/2}\Big{\}}\ \leq\ C\sqrt{T},$

since

	$\displaystyle\\|Y_{t,j_{2}}(r-\ell)-Y_{t,j_{2}}\\|_{m/2}$	$\displaystyle=\\|X_{t}(r-\ell)X_{t-j_{2}}(r-\ell-j_{2})-X_{t}X_{t-j_{2}}\\|_{m/2}$
		$\displaystyle\leq\\|X_{t}(r-\ell)-X_{t}\\|_{m}\\|X_{t-j_{2}}(r-\ell-j_{2})\\|_{m}$
		$\displaystyle\ \ +\\|X_{t}\\|_{m}\\|X_{t-j_{2}}(r-\ell-j_{2})-X_{t-j_{2}}\\|_{m}$
		$\displaystyle\leq C(r-\ell)^{-\alpha}$

and an analogous argument for $\|Y_{t+\ell,j_{1}}(r)-Y_{t+\ell,j_{1}}\|_{m/2}$ . Therefore,

		$\displaystyle\Big{\\|}\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}\Big{(}Y_{t,j_{1}}Y_{s,j_{2}}-E(Y_{t,j_{1}}Y_{s,j_{2}})\Big{)}K\Big{(}\frac{t-s}{b_{T}}\Big{)}\Big{\\|}_{m/4}$
	$\displaystyle\leq$	$\displaystyle\ \frac{1}{T}\sum_{\ell=0}^{T-1}K(\ell/b_{T})\big{\\|}\sum_{t=1}^{T-\ell}(Y_{t,j_{2}}Y_{t+\ell,j_{1}}-E(Y_{t,j_{2}}Y_{t+\ell,j_{1}}))\big{\\|}_{m/4}$
		$\displaystyle+\frac{1}{T}\sum_{\ell=1}^{T-1}K(\ell/b_{T})\big{\\|}\sum_{s=1}^{T-\ell}(Y_{s,j_{1}}Y_{s+\ell,j_{2}}-E(Y_{s,j_{1}}Y_{s+\ell,j_{2}}))\big{\\|}_{m/4}$
	$\displaystyle\leq$	$\displaystyle\ CT^{-1/2}\sum_{\ell=0}^{T-1}K(\ell/b_{T})$
	$\displaystyle\leq$	$\displaystyle\ CT^{-1/2}\big{(}K(0)+\sum_{\ell=1}^{\infty}K(\ell/b_{T})\big{)}$
	$\displaystyle\leq$	$\displaystyle\ CT^{-1/2}\big{(}K(0)+\int_{0}^{\infty}K(x/b_{T})dx\big{)}$
	$\displaystyle\leq$	$\displaystyle\ Cb_{T}/\sqrt{T}.$

From this and recalling (35) we conclude that

\Big{|}\frac{1}{T}\sum_{t=j_{1}+1}^{T}\sum_{s=j_{2}+1}^{T}\Big{(}Y_{t,j_{1}}Y_{s,j_{2}}-E(Y_{t,j_{1}}Y_{s,j_{2}})\Big{)}K\Big{(}\frac{t-s}{b_{T}}\Big{)}\Big{|}=O_{P}\big{(}M_{T}^{8/m}b_{T}/\sqrt{T}\big{)}.

∎
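
The last steps of the preceding proof rest on the fact that $\sum_{\ell\geq 0}K(\ell/b_{T})$ grows at most linearly in $b_{T}$ . The following Python sketch illustrates this scaling for the Bartlett (triangular) kernel. The kernel choice, the function names and the values of $b_{T}$ and $T$ are assumptions made for illustration only; the paper's kernel $K$ need not be the Bartlett kernel.

```python
import math


def bartlett(x: float) -> float:
    """Bartlett (triangular) kernel, a hypothetical stand-in for K."""
    return max(0.0, 1.0 - abs(x))


def kernel_sum(kernel, b_T: float, T: int) -> float:
    """sum_{l=0}^{T-1} K(l / b_T), the sum bounded at the end of the proof."""
    return sum(kernel(l / b_T) for l in range(T))


def scaled_kernel_sum(kernel, b_T: float, T: int) -> float:
    """T^{-1/2} * sum_{l=0}^{T-1} K(l / b_T), to be compared with b_T / sqrt(T)."""
    return kernel_sum(kernel, b_T, T) / math.sqrt(T)


if __name__ == "__main__":
    T = 10_000
    for b_T in (10, 50, 250):
        s = kernel_sum(bartlett, b_T, T)
        # The ratio s / b_T stays bounded as b_T grows.
        print(b_T, s, s / b_T, scaled_kernel_sum(bartlett, b_T, T))
```

For the Bartlett kernel with integer $b_{T}<T$ the sum equals $(b_{T}+1)/2$ , so the ratio to $b_{T}$ stays below one. The comparison of the sum with the integral in the proof uses that $K$ is nonnegative and nonincreasing on $[0,\infty)$ , which holds for this kernel.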

Proof of Theorem 3.

Let $\widetilde{\xi}_{k}$ , $k=1,2,\ldots,N_{T},$ be Gaussian random variables as in Theorem 2. By the triangle inequality,

	$\displaystyle\sup_{x\in\mathbb{R}}\Big{\|}P\big{(}\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{\|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})\|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\big{)}-P^{\ast}\big{(}\max_{k=1,\ldots,N_{T}}\|\xi^{\ast}_{k}\|\leq x\big{)}\Big{\|}$
	$\displaystyle\leq\sup_{x\in\mathbb{R}}\Big{\|}P\big{(}\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\frac{\|\widehat{f}_{T}(\lambda_{k,T})-E\widehat{f}_{T}(\lambda_{k,T})\|}{\widehat{f}_{T}(\lambda_{k,T})}\leq x\big{)}-P\big{(}\max_{k=1,\ldots,N_{T}}\|\widetilde{\xi}_{k}\|\leq x\big{)}\Big{\|}$
	$\displaystyle+\sup_{x\in\mathbb{R}}\Big{\|}P\big{(}\max_{k=1,\ldots,N_{T}}\|\widetilde{\xi}_{k}\|\leq x\big{)}-P^{\ast}\big{(}\max_{k=1,\ldots,N_{T}}\|\xi^{\ast}_{k}\|\leq x\big{)}\Big{\|}.$		(36)

The first term is bounded by Theorem 2. To handle the second term, we first show that

		$\displaystyle\Delta_{T}:=\max_{1\leq k_{1},k_{2}\leq N_{T}}\big{\|}\widehat{C}_{T}(k_{1},k_{2})-C_{T}(k_{1},k_{2})\big{\|}$
	$\displaystyle=$	$\displaystyle\ O_{P}\Big{(}\max_{k=1,\ldots,N_{T}}\|\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})\|+\frac{b_{T}M_{T}}{\sqrt{T}}+\frac{M_{T}}{b_{T}}\,+\,M_{T}b_{T}T^{8a_{s}/m-1/2}\Big{)}.$		(37)

We have

	$\displaystyle\Delta_{T}\leq$	$\displaystyle\max_{1\leq k_{1},k_{2}\leq N_{T}}\Big{\|}\Big{(}\frac{1}{\widehat{f}_{T}(\lambda_{k_{1},T})\widehat{f}_{T}(\lambda_{k_{2},T})}-\frac{1}{f(\lambda_{k_{1},T})f(\lambda_{k_{2},T})}\Big{)}$
		$\displaystyle\times\sum_{j_{1}=1}^{M_{T}}\sum_{j_{2}=1}^{M_{T}}a_{k_{1},j_{1}}a_{k_{2},j_{2}}\sigma_{T}(j_{1},j_{2})\Big{\|}+\max_{1\leq k_{1},k_{2}\leq N_{T}}\Big{\|}\frac{1}{\widehat{f}_{T}(\lambda_{k_{1},T})\widehat{f}_{T}(\lambda_{k_{2},T})}$
		$\displaystyle\times\sum_{j_{1}=1}^{M_{T}}\sum_{j_{2}=1}^{M_{T}}a_{k_{1},j_{1}}a_{k_{2},j_{2}}\big{(}\widehat{\sigma}_{T}(j_{1},j_{2})-\sigma_{T}(j_{1},j_{2})\big{)}\Big{\|}$
	$\displaystyle=$	$\displaystyle\ \Delta_{1,T}+\Delta_{2,T},$

with an obvious notation for $\Delta_{1,T}$ and $\Delta_{2,T}$ .

For $\Delta_{1,T}$ notice that by the boundedness of the spectral density $f$ and Assumption 2 we get that

		$\displaystyle\max_{1\leq k_{1},k_{2}\leq N_{T}}\Big{\|}\frac{1}{\widehat{f}_{T}(\lambda_{k_{1},T})\widehat{f}_{T}(\lambda_{k_{2},T})}-\frac{1}{f(\lambda_{k_{1},T})f(\lambda_{k_{2},T})}\Big{\|}$
	$\displaystyle\leq$	$\displaystyle\Big{(}\frac{1}{\min_{1\leq k\leq N_{T}}\widehat{f}_{T}(\lambda_{k,T})}\Big{)}^{2}\Big{(}\frac{1}{\min_{1\leq k\leq N_{T}}f(\lambda_{k,T})}\Big{)}^{2}$
		$\displaystyle\ \ \times\big{(}\max_{1\leq k\leq N_{T}}f(\lambda_{k,T})+\max_{1\leq k\leq N_{T}}\widehat{f}_{T}(\lambda_{k,T})\big{)}\max_{1\leq k\leq N_{T}}\big{\|}\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})\big{\|}$
		$\displaystyle={\mathcal{O}}_{P}(\max_{1\leq k\leq N_{T}}\big{\|}\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})\big{\|}).$
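
The bound on the difference of the reciprocal products is purely algebraic: for positive numbers, $|ab-\widehat{a}\widehat{b}|\leq(a+\widehat{b})\max\{|a-\widehat{a}|,|b-\widehat{b}|\}$ and $\widehat{a}\widehat{b}ab\geq\min(\widehat{a},\widehat{b})^{2}\min(a,b)^{2}$ . The following Python sketch checks the resulting inequality on randomly drawn positive values that stand in for the spectral density and its estimator at two frequencies. The function names and the sampling range are illustrative assumptions.

```python
import random


def lhs(f1: float, f2: float, fh1: float, fh2: float) -> float:
    """|1/(fh1*fh2) - 1/(f1*f2)| for positive inputs."""
    return abs(1.0 / (fh1 * fh2) - 1.0 / (f1 * f2))


def rhs(f1: float, f2: float, fh1: float, fh2: float) -> float:
    """Right-hand side of the bound, with min/max taken over the two points."""
    min_hat = min(fh1, fh2)
    min_true = min(f1, f2)
    max_sum = max(f1, f2) + max(fh1, fh2)
    max_err = max(abs(fh1 - f1), abs(fh2 - f2))
    return (1.0 / min_hat) ** 2 * (1.0 / min_true) ** 2 * max_sum * max_err


def check(n_trials: int = 10_000, seed: int = 0) -> bool:
    """Return True if the inequality holds in all random trials."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        f1, f2, fh1, fh2 = (rng.uniform(0.1, 5.0) for _ in range(4))
        # Small relative slack guards against floating-point rounding only.
        if lhs(f1, f2, fh1, fh2) > rhs(f1, f2, fh1, fh2) * (1.0 + 1e-12):
            return False
    return True


if __name__ == "__main__":
    print(check())
```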

Furthermore, recalling the definition of the Gaussian random variables $\xi_{k}$ , we have

		$\displaystyle\max_{1\leq k_{1},k_{2}\leq N_{T}}\Big{\|}\sum_{j_{1}=1}^{M_{T}}\sum_{j_{2}=1}^{M_{T}}a_{k_{1},j_{1}}a_{k_{2},j_{2}}\sigma_{T}(j_{1},j_{2})\Big{\|}=\max_{1\leq k_{1},k_{2}\leq N_{T}}\|E(\xi_{k_{1}}\xi_{k_{2}})\|$
	$\displaystyle\leq$	$\displaystyle\ \big{(}\max_{1\leq k\leq N_{T}}\\|\xi_{k}\\|_{2}\big{)}^{2}={\mathcal{O}}(1);$

see Remark 1. Hence $\Delta_{1,T}={\mathcal{O}}_{P}\big{(}\max_{1\leq k\leq N_{T}}\big{|}\widehat{f}_{T}(\lambda_{k,T})-f(\lambda_{k,T})\big{|}\big{)}$ .

For $\Delta_{2,T}$ we have

	$\displaystyle\Delta_{2,T}$	$\displaystyle\leq\max_{1\leq j_{1},j_{2}\leq M_{T}}\Big{\|}\widehat{\sigma}_{T}(j_{1},j_{2})-\sigma_{T}(j_{1},j_{2})\Big{\|}\Big{(}\frac{1}{\min_{1\leq k\leq N_{T}}\widehat{f}_{T}(\lambda_{k,T})}\Big{)}^{2}$
		$\displaystyle\ \ \ \ \times\Big{(}\max_{1\leq k\leq N_{T}}\sum_{j=0}^{M_{T}}a_{k,j}\Big{)}^{2}$
		$\displaystyle={\mathcal{O}}_{P}\Big{(}\frac{M_{T}b_{T}}{\sqrt{T}}+\frac{M_{T}}{b_{T}}+M_{T}b_{T}T^{8a_{s}/m-1/2}\Big{)},$

by Proposition 1 and the fact that $\sum_{j=0}^{M_{T}}a_{k,j}=O(\sqrt{M_{T}})$ uniformly in $1\leq k\leq N_{T}$ . This establishes (37).

Recall next that for $1\leq k\leq N_{T}$ ,

E|\xi_{k}|^{2}=f^{2}(\lambda_{k,T})\int_{-1}^{1}w^{2}(u)du+o(1),

and that $E|\widetilde{\xi}_{k}|^{2}=E|\xi_{k}|^{2}/f^{2}(\lambda_{k,T})$ . From these expressions and by Assumption 2 and the boundedness of $f$ we get that $\min_{1\leq k\leq N_{T}}C_{T}(k,k)$ and $\max_{1\leq k\leq N_{T}}C_{T}(k,k)$ are, for $T$ large enough, bounded from below and from above, respectively, by positive constants. Using Lemma A.1 of Zhang et al. (2022) we then get

	$\displaystyle\sup_{x\in\mathbb{R}}\Big{\|}P\big{(}\max_{k=1,\ldots,N_{T}}\|\widetilde{\xi}_{k}\|\leq x\big{)}-$	$\displaystyle P^{\ast}\big{(}\max_{k=1,\ldots,N_{T}}\|\xi^{\ast}_{k}\|\leq x\big{)}\Big{\|}$
	$\displaystyle=$	$\displaystyle{\mathcal{O}}_{P}\Big{(}\frac{\Delta_{T}^{1/6}}{(1+\log(N_{T}))^{1/4}}+\Delta_{T}^{1/3}\log^{3}(N_{T})\Big{)}$
	$\displaystyle=$	$\displaystyle o_{P}(\Delta_{T}^{1/6}).$

∎
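
The second term in (36) compares the maxima of two centered Gaussian vectors whose covariance matrices differ entrywise by at most $\Delta_{T}$ . The following Python sketch illustrates this comparison by Monte Carlo. It is only a sketch under stated assumptions: the Toeplitz matrix and its diagonal perturbation are hypothetical stand-ins and are not the matrices $C_{T}$ and $\widehat{C}_{T}$ of the paper, and all function names are introduced here for illustration.

```python
import math
import random


def toeplitz(n, rho):
    """Hypothetical covariance matrix with entries rho^|i-j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def perturb(C, delta):
    """Add delta to the diagonal, so the max entrywise difference is delta."""
    return [[C[i][j] + (delta if i == j else 0.0) for j in range(len(C))]
            for i in range(len(C))]


def max_entry_diff(A, B):
    """Entrywise maximum of |A - B|, the analogue of Delta_T."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def cholesky(C):
    """Lower-triangular L with L L^T = C for symmetric positive definite C."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L


def sample_max_abs(C, n_draws, rng):
    """Draws of max_k |xi_k| for xi ~ N(0, C)."""
    L, n = cholesky(C), len(C)
    out = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xi = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        out.append(max(abs(v) for v in xi))
    return out


def sup_cdf_distance(x, y):
    """Kolmogorov distance between the empirical cdfs of samples x and y."""
    xs, ys = sorted(x), sorted(y)
    i = j = 0
    d = 0.0
    while i < len(xs) and j < len(ys):
        v = min(xs[i], ys[j])
        while i < len(xs) and xs[i] <= v:
            i += 1
        while j < len(ys) and ys[j] <= v:
            j += 1
        d = max(d, abs(i / len(xs) - j / len(ys)))
    return d


if __name__ == "__main__":
    rng = random.Random(0)
    C = toeplitz(6, 0.5)
    base = sample_max_abs(C, 3000, rng)
    for delta in (0.01, 0.1, 1.0):
        other = sample_max_abs(perturb(C, delta), 3000, rng)
        print(delta, sup_cdf_distance(base, other))
```

The empirical distance contains Monte Carlo error in addition to the effect of the perturbation, so for small perturbations it mainly reflects simulation noise.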

Acknowledgements The authors are grateful to the Co-Editor and to two referees for their valuable comments that led to an improved version of this paper. They are also thankful to Panagiotis Maouris and Alexander Braumann for helping to carry out the numerical work of Section 5.

References

[1] Anderson, T. W. (1971). The Statistical Analysis of Time Series. John Wiley & Sons, Inc., New York-London-Sydney.
[2] Andrews, D.W.K. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59, 817–858.
[3] Berg, A. and Politis, D.N. (2009). Higher-order accurate polyspectral estimation with flat-top lag-windows. Ann. Inst. Stat. Math. 61, 477–498.
[4] Brockwell, P.J. and Davis, R.A. (1991). Time Series: Theory and Methods. Second Edition. Springer, New York.
[5] Calonico, S., Cattaneo, M.D. and Farrell, M.H. (2018). On the effect of bias estimation on coverage accuracy in nonparametric inference. J. Am. Stat. Assoc. 113, 767–779.
[6] Cerovecki, C., Characiejus, V. and Hörmann, S. (2022). The maximum of the periodogram of a sequence of functional data. J. Am. Stat. Assoc. 118, 2712–2720.
[7] Chang, J., Jiang, Q., McElroy, T. S. and Shao, X. (2023). Statistical inference for high-dimensional spectral density matrix. Preprint, doi: 10.48550/arXiv.2212.13686.
[8] Jirak, M. (2011). On the maximum of covariance estimators. J. Mult. Analysis 102, 1032–1046.
[9] Liu, W. and Wu, W.B. (2010). Asymptotics of spectral density estimates. Econometric Theory 26, 1218–1245.
[10] Neumann, M. H. and Paparoditis, E. (2008). Simultaneous confidence bands in spectral density estimation. Biometrika 95, 381–397.
[11] Newton, J. H. and Pagano, M. (1984). Simultaneous confidence bands for autoregressive spectra. Biometrika 71, 197–202.
[12] Politis, D. N. and Romano, J.P. (1995). Bias-corrected nonparametric spectral estimation. J. Time Ser. Anal. 16, 67–103.
[13] Politis, D. N. (2024). Studentization vs. variance stabilization: A simple way out of an old dilemma. Statistical Science 39, 409–427.
[14] Priestley, M. B. (1981). Spectral Analysis and Time Series. Academic Press, London.
[15] Tomàšek, L. (1987). Asymptotic simultaneous confidence bands for autoregressive spectral density. Journal of Time Series Analysis 8, 469–491.
[16] Woodroofe, M.B. and Van Ness, J.W. (1967). The maximum deviation of sample spectral densities. Ann. Math. Statist. 38, 1558–1569.
[17] Wu, W.B. (2005). Nonlinear system theory: Another look at dependence. Proceedings of the National Academy of Sciences USA 102, 14150–14154.
[18] Wu, W.B. (2009). An asymptotic theory for sample covariances of Bernoulli shifts. Stochastic Processes and Their Applications 120, 2412–2431.
[19] Wu, W.B. (2011). Asymptotic theory for stationary time series. Statistics and its Interface 4, 207–226.
[20] Xiao, H. and Wu, W.B. (2014). Portmanteau test and simultaneous inference for serial covariances. Statistica Sinica 24, 577–599.
[21] Wu, W.B. and Zaffaroni, P. (2018). Asymptotic theory for spectral density estimates of general multivariate time series. Econometric Theory 34, 1–22.
[22] Yang, J. and Zhou, Z. (2022). Spectral inference under complex temporal dynamics. J. Am. Stat. Assoc. 117, 133–155.
[23] Zhang, X. and Cheng, G. (2018). Gaussian approximation for high dimensional vector under physical dependence. Bernoulli 24, 2640–2675.
[24] Zhang, Y., Paparoditis, E. and Politis, D.N. (2022). Simultaneous statistical inference for second order parameters of time series under weak conditions. Preprint, doi: 10.48550/arXiv.2110.14067.
[25] Zhang, D. and Wu, W.B. (2017). Gaussian approximation for high dimensional time series. Annals of Statistics 45, 1895–1919.

	$\displaystyle\|E(Y_{t,j_{1}}Y_{s,j_{2}})\|$	$\displaystyle\leq\|E(X_{t}X_{t-j_{1}}X_{s}X_{s-j_{2}})\|+\|\gamma(j_{1})\gamma(j_{2})\|$
		$\displaystyle=\|E\big{(}(X_{t}-X_{t}^{(t-s-1)})X_{t-j_{1}}X_{s}X_{s-j_{2}}\big{)}\|+\|\gamma(j_{1})\gamma(j_{2})\|$
		$\displaystyle\leq\\|X_{t-j_{1}}X_{s}X_{s-j_{2}}\\|_{m/(m-1)}\\|X_{t}-X_{t}^{(t-s-1)}\\|_{m}+C(1+j_{1})^{-\alpha}$
		$\displaystyle\leq C\,(1+t-s)^{-\alpha}.$

	$\displaystyle\|E(Y_{t,j_{1}}Y_{s,j_{2}})\|$	$\displaystyle\leq\|E(X_{t}X_{t-j_{1}}X_{s}X_{s-j_{2}})\|+\|\gamma(j_{1})\gamma(j_{2})\|$
		$\displaystyle=\|E\big{(}(X_{t}-X_{t}^{(t-j_{1}-1)})X_{t-j_{1}}X_{s}X_{s-j_{2}}\big{)}\|+\|\gamma(j_{1})\gamma(j_{2})\|$
		$\displaystyle\leq\\|X_{t-j_{1}}X_{s}X_{s-j_{2}}\\|_{m/(m-1)}\\|X_{t}-X_{t}^{(t-j_{1}-1)}\\|_{m}+C(1+j_{1})^{-\alpha}$
		$\displaystyle\leq C(1+j_{1})^{-\alpha}\leq C(1+(t-s)/2)^{-\alpha}\leq C2^{\alpha}(1+t-s)^{-\alpha}.$

		$\displaystyle\big{\\|}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,k}\big{\\|}_{2}^{2}$		(26)
		$\displaystyle=\frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=\|j_{1}\|+1}^{T}\sum_{t_{2}=\|j_{2}\|+1}^{T}w\big{(}\frac{j_{1}}{M_{T}}\big{)}w\big{(}\frac{j_{2}}{M_{T}}\big{)}\,\cos(j_{1}\lambda_{k,T})\,\cos(j_{2}\lambda_{k,T})$
		$\displaystyle\qquad\times\gamma(t_{1}-t_{2})\gamma(t_{1}-t_{2}+\|j_{2}\|-\|j_{1}\|)$
		$\displaystyle\quad+\frac{1}{4\pi^{2}TM_{T}}\sum_{j_{1},j_{2}=-M_{T}}^{M_{T}}\sum_{t_{1}=\|j_{1}\|+1}^{T}\sum_{t_{2}=\|j_{2}\|+1}^{T}w\big{(}\frac{j_{1}}{M_{T}}\big{)}w\big{(}\frac{j_{2}}{M_{T}}\big{)}\,\cos(j_{1}\lambda_{k,T})\,\cos(j_{2}\lambda_{k,T})$
		$\displaystyle\qquad\times\gamma(t_{1}-t_{2}+\|j_{2}\|)\gamma(t_{1}-t_{2}-\|j_{1}\|)+R^{(1)}_{T,k}$

	$\displaystyle\sup_{x\in\mathbb{R}}\Big{\|}P\Big{\{}\max_{k=1,\ldots,N_{T}}\sqrt{\frac{T}{M_{T}}}\Big{\|}\widehat{f}_{T}(\lambda_{k,T})-E\,\widehat{f}_{T}(\lambda_{k,T})\Big{\|}\leq x\Big{\}}$
	$\displaystyle\quad\quad-P\Big{\{}\max_{k=1,\ldots,N_{T}}\|\xi_{k}\|\leq x\Big{\}}\Big{\|}$
$\displaystyle\leq$	$\displaystyle\sup_{x\in\mbox{$\mathbb{R}$}}\,\Big{\|}Eh_{\tau,\tau,x}\Big{(}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,N_{T}}\Big{)}-Eh_{\tau,\tau,x}\left(\xi_{1},\dots,\xi_{N_{T}}\right)\Big{\|}$
	$\displaystyle\quad\quad+C\,t\,\left(1+\sqrt{\log(N_{T})}+\sqrt{\|\log(t)\|}\right),$	(27)

		$\displaystyle\sup_{x\in\mbox{$\mathbb{R}$}}\,\Big{\|}Eh_{\tau,\tau,x}\Big{(}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}Z_{t,N_{T}}\Big{)}$
		$\displaystyle\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ -Eh_{\tau,\tau,x}\Big{(}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,1},\dots,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\widetilde{Z}^{(s)}_{t,N_{T}}\Big{)}\Big{\|}$
		$\displaystyle\leq C\,\tau\,E\Big{[}\max_{k=1,\dots,N_{T}}\Big{\|}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big{\|}\Big{]}$
		$\displaystyle\leq C\,\tau\,N_{T}^{2/m}\,\max_{k=1,\dots,N_{T}}\Big{\\|}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(Z_{t,k}-\widetilde{Z}_{t,k}^{(s)})\Big{\\|}_{m/2}$
		$\displaystyle\leq C\,\tau\,N_{T}^{2/m}\,d_{s,m}$		(28)