Empirical process theory for locally stationary processes

Abstract

We provide a framework for empirical process theory of locally stationary processes using the functional dependence measure. Our results extend known results for stationary Markov chains and mixing sequences by another common way to measure dependence and allow for additional time dependence. Our main result is a functional central limit theorem for locally stationary processes. Moreover, maximal inequalities for expectations of sums are developed. We show the applicability of our theory in some examples; for instance, we provide uniform convergence rates for nonparametric regression with locally stationary noise.

Nathawut Phandoidaen, Stefan Richter

Institut für angewandte Mathematik, Im Neuenheimer Feld 205, Universität Heidelberg

To appear in Bernoulli

1 Introduction

Empirical process theory is a powerful tool to prove uniform convergence rates and weak convergence of composite functionals. The theory for independent variables is well-studied (cf. [19], [23], [44] or [43] for an overview) based on the original ideas of [13], [17], [18], [37] and [34] among others. For random variables with dependence structure, various approaches have been discussed. There exists a well-developed empirical process theory and large deviation results for Harris-recurrent Markov chains based on regenerative schemes (cf. [28], [41], [21] and [1] among others) or geometric ergodicity (cf. [27]). To quantify the speed of convergence in maximal inequalities, additional assumptions like $\beta$-recurrence (cf. [26]) have to be imposed. The theory covers a rich class of Markov chains but, for instance, cannot handle linear processes.

An empirical process theory for stationary processes under high-level assumptions on the moments of means was derived in [12] and further discussion papers. In the paradigm of weak dependence (which measures the size of covariances of Lipschitz functions of the random variables), [16] derived Bernstein-type inequalities. Focusing on the analysis of the empirical distribution function (EDF), many more techniques were discussed: for instance, [20, Theorem 4] provide uniform convergence of the EDF by using bounds for covariances of Hölder functions of the random variables. Another abstract concept was introduced by [4] via S-mixing (for stationary mixing), which imposes the existence of $m$-dependent approximations of the original observations. They then derive strong approximations and uniform central limit theorems for the EDF.

A different idea to measure dependence of random variables is given by mixing coefficients. Here, several concepts were introduced, the most common (with increasing strength) being $\alpha$-, $\beta$- and $\phi$-mixing (for an overview about mixing coefficients, cf. [15]). Large deviation results and uniform central limit theorems for general classes of functions (not only EDF) were derived by using coupling techniques, cf. [38], [30] for $\alpha$-mixing, [3], [50], and successively refined by [14], [39], [11], [9] (the last two developed for EDFs only) and [40] for $\beta$-mixing, and [10], [6] for $\beta$- and $\phi$-mixing. See also [2], [10] and [40] for comprehensive overviews.

In [10] it is argued that $\beta$-mixing is the weakest mixing assumption that allows for a “complete” empirical process theory which incorporates maximal inequalities and uniform central limit theorems. There exist explicit upper bounds for $\beta$-mixing coefficients for Markov chains (cf. [25]) and for so-called V-geometric mixing coefficients (cf. [32]). For several stationary time series models like linear processes (cf. [35] for $\alpha$-mixing), ARMA (cf. [33]), nonlinear AR (cf. [26]) and GARCH processes (cf. [22]) there also exist upper bounds on mixing coefficients. A common assumption in these results is that the observed process or, more often, the innovations of the corresponding process, have a continuous distribution. This is a crucial assumption to handle the relatively complicated mixing coefficients defined over a supremum over two different sigma-algebras. A relaxation of $\beta$-mixing coefficients was investigated by [11, Theorem 1] and is specifically designed for the analysis of the EDF. In contrast to coefficients based on sigma-algebras, these smaller coefficients are defined with conditional expectations of certain classes of functions and are easier to upper bound for a wide range of time series models.

In recent years, another measure for dependence, the so-called functional dependence measure, became popular (cf. [46]), which uses a Bernoulli shift representation (see (1.1) below) and decomposition into martingales and $m$-dependent sequences. It has been shown in various applications that the functional dependence measure allows, when combined with the rich theory of martingales, for sharp large deviation inequalities (cf. [49] or [51]). In [47] and [31], uniform central limit theorems for the EDF were derived for stationary and piecewise locally stationary processes.

Up to now, no general empirical process theory (allowing for general classes of functions) using the functional dependence measure is available. In this paper we will fill this gap and prove maximal inequalities and functional central limit theorems under functional dependence. Furthermore, we will draw connections and compare our results to the already existing empirical process concepts for dependent data mentioned above. While the empirical process theory for Markov chains and mixing cited above was developed for stationary processes, we will work in the framework of locally stationary processes and therefore automatically provide the first general empirical process theory in this setting ([7] investigated spectral empirical processes for linear processes, [31] proved a functional central limit theorem for a localized empirical distribution function). Locally stationary processes allow for a smooth change of the distribution over time but can be locally approximated by stationary processes. Therefore, they provide more flexible time series models (cf. [8] for an introduction).

The functional dependence measure uses a representation of the given process as a Bernoulli shift process and quantifies dependence with an $L^{\nu}$-norm. More precisely, we assume that $X_{i}=(X_{ij})_{j=1,...,d}$, $i=1,...,n$, is a $d$-dimensional process of the form

$$X_{i}=J_{i,n}(\mathcal{G}_{i}), \tag{1.1}$$

where $\mathcal{G}_{i}=\sigma(\varepsilon_{i},\varepsilon_{i-1},...)$ is the sigma-algebra generated by $\varepsilon_{i}$, $i\in\mathbb{Z}$, a sequence of i.i.d. random variables in $\mathbb{R}^{\tilde{d}}$ ($d,\tilde{d}\in\mathbb{N}$), and some measurable function $J_{i,n}:(\mathbb{R}^{\tilde{d}})^{\mathbb{N}_{0}}\to\mathbb{R}$, $i=1,...,n$, $n\in\mathbb{N}$. For a real-valued random variable $W$ and some $\nu>0$, we define $\|W\|_{\nu}:=\mathbb{E}[|W|^{\nu}]^{1/\nu}$. If $\varepsilon_{k}^{*}$ is an independent copy of $\varepsilon_{k}$, independent of $\varepsilon_{i}$, $i\in\mathbb{Z}$, we define $\mathcal{G}_{i}^{*(i-k)}:=(\varepsilon_{i},...,\varepsilon_{i-k+1},\varepsilon_{i-k}^{*},\varepsilon_{i-k-1},...)$ and $X_{i}^{*(i-k)}:=J_{i,n}(\mathcal{G}_{i}^{*(i-k)})$. The uniform functional dependence measure is given by

$$\delta_{\nu}^{X}(k)=\sup_{i=1,...,n}\sup_{j=1,...,d}\big\|X_{ij}-X_{ij}^{*(i-k)}\big\|_{\nu}. \tag{1.2}$$

Graphically, $\delta_{\nu}^{X}$ measures the impact of $\varepsilon_{0}$ on $X_{k}$. Although representation (1.1) appears to be rather restrictive, it does cover a large variety of processes. In [5] it was motivated that the set of all processes of the form $X_{i}=J(\varepsilon_{i},\varepsilon_{i-1},...)$ should be equal to the set of all stationary and ergodic processes. We additionally allow $J$ to vary with $i$ and $n$ to cover processes which change their stochastic behavior over time. This is exactly the form of the so-called locally stationary processes discussed in [8].

Since we are working in the time series context, many applications ask for functions $f$ that not only depend on the actual observation of the process but on the whole (infinite) past $Z_{i}:=(X_{i},X_{i-1},X_{i-2},...)$. In the course of this paper, we aim to derive asymptotic properties of the empirical process

$$\mathbb{G}_{n}(f):=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\big\{f(Z_{i},\frac{i}{n})-\mathbb{E}f(Z_{i},\frac{i}{n})\big\},\quad f\in\mathcal{F}, \tag{1.3}$$

where

$$\mathcal{F}\subset\{f:(\mathbb{R}^{d})^{\mathbb{N}_{0}}\times[0,1]\to\mathbb{R}\text{ measurable}\}.$$

Let $\mathbb{H}(\varepsilon,\mathcal{F},\|\cdot\|)$ denote the bracketing entropy, that is, the logarithm of the number of $\varepsilon$-brackets with respect to some distance $\|\cdot\|$ that are necessary to cover $\mathcal{F}$ (this is made precise at the end of this section). We will define a distance $V_{n}$ which guarantees weak convergence of (1.3) if the corresponding bracketing entropy integral $\int_{0}^{1}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}d\varepsilon$ is finite.

The definition of the functional dependence measure for locally stationary processes is similar to its stationary version and is easy to calculate for many time series models. It does not rely on the stationarity assumption but on the representation of the process as a Bernoulli shift. Therefore, many well-known upper bounds for stationary time series given in [48], including recursively defined models and linear models, directly carry over to the locally stationary case (1.2). It seems reasonable to use it as a starting point to generalize empirical process theory for stationary processes to the more general setting of local stationarity. While the other two paradigms mentioned above should also allow such a generalization in principle, there are open questions:

  • The theory for Harris-recurrent Markov chains relies on stationarity and intrinsically needs some knowledge about the whole time series due to the assumption of null-recurrence. There exist generalizations to locally stationary Markov chains (cf. for instance [42]), but the corresponding recurrence properties and examples of locally stationary time series are not worked out yet. Furthermore, it is not directly clear how to deal with processes $Z_{i}$ which incorporate the infinite past of $X_{i}$, and linear processes cannot be discussed easily.

  • Absolutely regular $\beta$-mixing is shown in most examples by assuming some continuity in the distribution or in the corresponding innovations. Especially for linear processes, the bounds are quite hard to obtain and seem not to be optimal. Moreover, there exist no “invariance rules” which would directly allow one to transfer the mixing properties of $X_{i}$ to $f(Z_{i},\frac{i}{n})$, which incorporates infinitely many lags of $X_{i}$.

  • Many of the more elaborate dependence concepts developed in e.g. [11], [6], [20], [4] are restricted (at least in their original formulation) to the discussion of the EDF or connected one-dimensional indexed function classes.

Contrary to $\beta$-mixing, the functional dependence measure can easily deal with $f(Z_{i},\frac{i}{n})$ by using Hölder-type assumptions on $f$. Furthermore, it can easily be calculated in many situations and is not restricted to continuous distributions of $X_{i}$. Also, linear processes $X_{i}$ are covered.

However, there are also some peculiarities in using the functional dependence measure (1.2). While for Harris-recurrent Markov chains and $\beta$-mixing, the empirical process theory is independent of the function class considered, the situation for the functional dependence measure is more complicated. In order to quantify the dependence of $f(Z_{i},\frac{i}{n})$ by $\delta_{\nu}^{X}$, we have to impose smoothness conditions on $f$ in direction of its first argument. The distance $V_{n}$ therefore will not only change with the dependence structure of $X$, but also has to be “compatible” with the function class $\mathcal{F}$. The smoothness condition on $f$ also poses a challenging issue when considering chaining procedures where rare events are excluded by (non-smooth) indicator functions.

Our main contributions in this paper are the following:

  • We derive maximal inequalities for $\mathbb{G}_{n}(f)$ for classes of functions $\mathcal{F}$,

  • a chaining device which preserves smoothness during the chaining procedure and

  • conditions to ensure asymptotic tightness and functional convergence of $\mathbb{G}_{n}(f)$, $f\in\mathcal{F}$.

The paper is organized as follows. In Section 2, we present our main result Theorem 2.3, the functional central limit theorem under minimal moment conditions. As a special case, we derive a version for stationary processes. We give a discussion on the distance $V_{n}$ and compare our result with the empirical process theory for $\beta$-mixing. Some assumptions are postponed to Section 3, where a new multivariate central limit theorem for locally stationary processes is presented. In Section 4, we provide new maximal inequalities for $\mathbb{G}_{n}(f)$ for both finite and infinite $\mathcal{F}$. In Section 5, we apply our theory to prove uniform convergence rates and weak convergence of several estimators. The aim of that section is to highlight the wide range of applicability of our theory and to provide the typical conditions which have to be imposed as well as some discussion. In Section 6, a conclusion is drawn. We illustrate the main steps of the proofs in the Appendix of the article but postpone all detailed proofs to the Supplementary Material.

We now introduce some basic notation. For $a,b\in\mathbb{R}$, let $a\wedge b:=\min\{a,b\}$, $a\vee b:=\max\{a,b\}$. For $k\in\mathbb{N}$, let

$$H(k):=1\vee\log(k), \tag{1.4}$$

which naturally appears in large deviation inequalities. For a given finite class $\mathcal{F}$, let $|\mathcal{F}|$ denote its cardinality. We use the abbreviation

$$H=H(|\mathcal{F}|)=1\vee\log|\mathcal{F}| \tag{1.5}$$

if no confusion arises. For some distance $\|\cdot\|$, let $\mathbb{N}(\varepsilon,\mathcal{F},\|\cdot\|)$ denote the bracketing numbers, that is, the smallest number of $\varepsilon$-brackets $[l_{j},u_{j}]:=\{f\in\mathcal{F}:l_{j}\leq f\leq u_{j}\}$ (i.e. measurable functions $l_{j},u_{j}:(\mathbb{R}^{d})^{\mathbb{N}_{0}}\times[0,1]\to\mathbb{R}$ with $\|u_{j}-l_{j}\|\leq\varepsilon$ for all $j$) to cover $\mathcal{F}$. Let $\mathbb{H}(\varepsilon,\mathcal{F},\|\cdot\|):=\log\mathbb{N}(\varepsilon,\mathcal{F},\|\cdot\|)$ denote the bracketing entropy. For $\nu\geq 1$, let

$$\|f\|_{\nu,n}:=\Big(\frac{1}{n}\sum_{i=1}^{n}\big\|f\big(Z_{i},\frac{i}{n}\big)\big\|_{\nu}^{\nu}\Big)^{1/\nu}.$$

2 A new functional central limit theorem

Roughly speaking, a process $X_{i}$, $i=1,...,n$, is called locally stationary if for each $u\in[0,1]$, there exists a stationary process $\tilde{X}_{i}(u)$, $i=1,...,n$, such that $X_{i}\approx\tilde{X}_{i}(u)$ if $|u-\frac{i}{n}|$ is small (cf. [8]). Typical estimators are of the form

$$\frac{1}{nh}\sum_{i=1}^{n}K\big(\frac{i/n-u}{h}\big)\bar{f}(Z_{i},\frac{i}{n}),$$

where $K$ is a kernel function and $h=h_{n}>0$ is a bandwidth. Clearly, such a localization changes the convergence rate. To cover these cases, we suppose that any $f\in\mathcal{F}$ has a representation

$$f(z,u)=D_{f,n}(u)\cdot\bar{f}(z,u),\quad\quad z\in(\mathbb{R}^{d})^{\mathbb{N}_{0}},\ u\in[0,1], \tag{2.1}$$

where $\bar{f}$ is independent of $n$ and $D_{f,n}(u)$ is independent of $z$. We put

$$\bar{\mathcal{F}}:=\{\bar{f}:f\in\mathcal{F}\}. \tag{2.2}$$

The function class $\bar{\mathcal{F}}$ is considered to consist of Hölder-continuous functions in direction of $z$. For $s\in(0,1]$, a sequence $z=(z_{i})_{i\in\mathbb{N}_{0}}$ of elements of $\mathbb{R}^{d}$ (equipped with the maximum norm $|\cdot|_{\infty}$) and an absolutely summable sequence $\chi=(\chi_{i})_{i\in\mathbb{N}_{0}}$ of nonnegative real numbers, we set

$$|z|_{\chi,s}:=\Big(\sum_{i=0}^{\infty}\chi_{i}|z_{i}|_{\infty}^{s}\Big)^{1/s}$$

and $|z|_{\chi}:=|z|_{\chi,1}$.
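As a small illustration of the weighted sequence norm $|z|_{\chi,s}$, the following sketch evaluates it for finitely many lags. The function name `weighted_seq_norm` and the truncation to finite lists are assumptions made for this example only; the definition in the text uses infinite sequences.

```python
def weighted_seq_norm(z, chi, s=1.0):
    """Return |z|_{chi,s} = (sum_i chi_i * |z_i|_inf**s)**(1/s) for finite lists.

    z   : list of d-dimensional points (tuples); z[i] plays the role of z_i
    chi : list of nonnegative weights chi_i of the same length
    s   : Hoelder exponent in (0, 1]
    Truncating the infinite sequences to finite lists is an assumption of this sketch.
    """
    if not 0.0 < s <= 1.0:
        raise ValueError("s must lie in (0, 1]")
    # |z_i|_inf is the maximum norm of the i-th point.
    total = sum(w * max(abs(c) for c in point) ** s for w, point in zip(chi, z))
    return total ** (1.0 / s)


z = [(3.0, -4.0), (1.0, 0.0)]   # two lags of a 2-dimensional sequence
chi = [1.0, 0.5]                # hypothetical weights

print(weighted_seq_norm(z, chi))          # s = 1: 1*4 + 0.5*1 = 4.5
print(weighted_seq_norm(z, chi, s=0.5))   # s = 1/2: (1*2 + 0.5*1)**2 = 6.25
```

For $s=1$ the quantity is a weighted $\ell^{1}$-type norm of the coordinatewise maxima; for $s<1$ large entries are damped before summation.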

Definition 2.1.

$\bar{\mathcal{F}}$ is called an $(L_{\mathcal{F}},s,R,C)$-class if $L_{\mathcal{F}}=(L_{\mathcal{F},i})_{i\in\mathbb{N}_{0}}$ is a sequence of nonnegative real numbers, $s\in(0,1]$ and $R:(\mathbb{R}^{d})^{\mathbb{N}_{0}}\times[0,1]\to[0,\infty)$ satisfies for all $u\in[0,1]$, $z,z^{\prime}\in(\mathbb{R}^{d})^{\mathbb{N}_{0}}$, $\bar{f}\in\bar{\mathcal{F}}$,

$$|\bar{f}(z,u)-\bar{f}(z^{\prime},u)|\leq|z-z^{\prime}|_{L_{\mathcal{F}},s}^{s}\cdot\big[R(z,u)+R(z^{\prime},u)\big].$$

Furthermore, $C=(C_{R},C_{\bar{f}})\in(0,\infty)^{2}$ satisfies $\sup_{u}|\bar{f}(0,u)|\leq C_{\bar{f}}$, $\sup_{u}|R(0,u)|\leq C_{R}$.

The basic assumption for our main result is the following compatibility condition on $\mathcal{F}$.

Assumption 2.2.

$\bar{\mathcal{F}}$ is an $(L_{\mathcal{F}},s,R,C)$-class. There exist $p\in(1,\infty]$, $C_{X}>0$ such that

$$\sup_{i,u}\|R(Z_{i},u)\|_{2p}\leq C_{R},\quad\quad\sup_{i,j}\|X_{ij}\|_{\frac{2sp}{p-1}}\leq C_{X}. \tag{2.3}$$

Let $\mathbb{D}_{n}\geq 0$, $\Delta(k)\geq 0$ be such that for all $k\in\mathbb{N}_{0}$,

$$2dC_{R}\cdot\sum_{j=0}^{k}L_{\mathcal{F},j}(\delta_{\frac{2sp}{p-1}}^{X}(k-j))^{s}\leq\Delta(k),\quad\quad\sup_{f\in\mathcal{F}}\Big(\frac{1}{n}\sum_{i=1}^{n}\big|D_{f,n}(\frac{i}{n})\big|^{2}\Big)^{1/2}\leq\mathbb{D}_{n}.$$

While (2.3) summarizes moment assumptions on $X_{ij}$ which are balanced by $p$, the sequence $\Delta(k)$ reflects the intrinsic dependence of $f(Z_{i},\frac{i}{n})$ and $\mathbb{D}_{n}$ measures the influence of the factor $D_{f,n}(u)$ on the convergence rate of $\mathbb{G}_{n}(f)$.

Based on Assumption 2.2, we define for $f\in\mathcal{F}$

$$V_{n}(f):=\|f\|_{2,n}+\sum_{k=1}^{\infty}\min\{\|f\|_{2,n},\mathbb{D}_{n}\Delta(k)\}. \tag{2.4}$$

Clearly, $V_{n}$ satisfies a triangle inequality. Therefore, $V_{n}(f-g)$ is a distance between $f,g\in\mathcal{F}$. We are now able to state our main result. The weak convergence takes place in the normed space

$$\ell^{\infty}(\mathcal{F})=\{\mathbb{G}:\mathcal{F}\to\mathbb{R}\,|\,\|\mathbb{G}\|_{\infty}:=\sup_{f\in\mathcal{F}}|\mathbb{G}(f)|<\infty\}, \tag{2.5}$$

cf. [43] for a detailed discussion of this space.

Theorem 2.3.

Let $\mathcal{F}$ satisfy Assumptions 2.2, 3.1, 3.2 and 3.3. Assume that

$$\sup_{n\in\mathbb{N}}\int_{0}^{1}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}d\varepsilon<\infty.$$

Then it holds in $\ell^{\infty}(\mathcal{F})$ that

$$\big[\mathbb{G}_{n}(f)\big]_{f\in\mathcal{F}}\overset{d}{\to}\big[\mathbb{G}(f)\big]_{f\in\mathcal{F}},$$

where $(\mathbb{G}(f))_{f\in\mathcal{F}}$ is a centered Gaussian process with covariances

$$\mathrm{Cov}(\mathbb{G}(f),\mathbb{G}(g))=\lim_{n\to\infty}\mathrm{Cov}(\mathbb{G}_{n}(f),\mathbb{G}_{n}(g)).$$

The proof of Theorem 2.3 consists of two ingredients: convergence of the finite-dimensional distributions (cf. Theorem 3.4) and asymptotic tightness (cf. Corollary 4.5). The more challenging part is asymptotic tightness; its proof only relies on Assumption 2.2 and rests on a new maximal inequality presented in Theorem 4.1 which may be of independent interest. To ensure convergence of the finite-dimensional distributions, we have to formalize local stationarity (Assumption 3.1) and pose conditions in time direction on $\bar{f}(z,\cdot)$ (cf. Assumption 3.2) and $D_{f,n}(\cdot)$ (cf. Assumption 3.3), which is done in Section 3. In particular, it is needed that $D_{f,n}(u)$ is properly normalized.

Let us note that in the case that $X_{i}$ is stationary, $\bar{f}(z,u)=\bar{f}(z)$ and $D_{f,n}(u)=1$, Assumptions 3.1, 3.2 and 3.3 are directly fulfilled. That is, in the stationary case, Assumption 2.2 is sufficient for Theorem 2.3. We formulate this finding as a simple corollary. Let

$$\tilde{\mathbb{G}}_{n}(h):=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\big\{h(X_{i})-\mathbb{E}h(X_{i})\big\},$$

where $X_{i}=J(\mathcal{G}_{i})$, $i=1,...,n$, is a stationary process and $h$ are functions from

$$\mathcal{H}\subset\{h:\mathbb{R}^{d}\to\mathbb{R}\text{ measurable}\}$$

with the property that for all $x,y\in\mathbb{R}^{d}$, $|h(x)-h(y)|\leq L_{\mathcal{H}}|x-y|_{\infty}^{s}$.

Corollary 2.4.

Suppose that $\|X_{1}\|_{2s}<\infty$. Let $\Delta(k):=2dL_{\mathcal{H}}\delta_{2s}^{X}(k)^{s}$ and $\mathbb{D}_{n}:=1$. Assume that

$$\sup_{n\in\mathbb{N}}\int_{0}^{1}\sqrt{\mathbb{H}(\varepsilon,\mathcal{H},V_{n})}d\varepsilon<\infty. \tag{2.6}$$

Then it holds in $\ell^{\infty}(\mathcal{H})$ that

$$\big[\tilde{\mathbb{G}}_{n}(h)\big]_{h\in\mathcal{H}}\overset{d}{\to}\big[\tilde{\mathbb{G}}(h)\big]_{h\in\mathcal{H}},$$

where $(\tilde{\mathbb{G}}(h))_{h\in\mathcal{H}}$ is a centered Gaussian process with covariances

$$\mathrm{Cov}(\tilde{\mathbb{G}}(h_{1}),\tilde{\mathbb{G}}(h_{2}))=\sum_{k\in\mathbb{Z}}\mathrm{Cov}(h_{1}(X_{0}),h_{2}(X_{k})).$$

2.1 Form of $V_{n}$ and discussion on $\Delta(k)$

2.1.1 Form of $V_{n}$

Suppose that $\mathbb{D}_{n}\in(0,\infty)$ is independent of $n\in\mathbb{N}$. Based on decay rates of $\Delta(k)$, simpler forms of $V_{n}$ can be derived and are given in Table 1. These results are elementary and are proved in Lemmas 7.11, 7.12 in the Supplementary Material.

| $\Delta(j)$ | $cj^{-\alpha}$, $\alpha>1$, $c>0$ | $c\rho^{j}$, $\rho\in(0,1)$, $c>0$ |
|---|---|---|
| $V_{n}(f)$ | $\lVert f\rVert_{2,n}\max\{\lVert f\rVert_{2,n}^{-\frac{1}{\alpha}},1\}$ | $\lVert f\rVert_{2,n}\max\{\log(\lVert f\rVert_{2,n}^{-1}),1\}$ |
| $\int_{0}^{\sigma}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}d\varepsilon$ | $\int_{0}^{\tilde{\sigma}}\varepsilon^{-\frac{1}{\alpha}}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},\lVert\cdot\rVert_{2,n})}d\varepsilon$ | $\int_{0}^{\tilde{\sigma}}\log(\varepsilon^{-1})\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},\lVert\cdot\rVert_{2,n})}d\varepsilon$ |

Table 1: Equivalent expressions of $V_{n}$ and the corresponding entropy integral under the condition that $\mathbb{D}_{n}\in(0,\infty)$ is independent of $n$. We omitted the lower and upper bound constants, which depend only on $c,\rho,\alpha$ and $\mathbb{D}_{n}$. Furthermore, $\tilde{\sigma}=\tilde{\sigma}(\sigma)$ fulfills $\tilde{\sigma}\to 0$ for $\sigma\to 0$.

If $f(Z_{i},\frac{i}{n})$, $i=1,...,n$, are independent, we can choose $\Delta(k)=0$ for $k\geq 1$ and thus $V_{n}(f)$ is proportional to $\|f\|_{2,n}$. We therefore exactly recover the case of independent variables with our theory.
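To make the definition (2.4) of $V_{n}$ concrete, the following sketch evaluates it in closed form for geometric decay $\Delta(k)=c\rho^{k}$ and compares it with the shape $\|f\|_{2,n}\max\{\log(\|f\|_{2,n}^{-1}),1\}$ listed in Table 1. The function name `v_n_geometric` and all numerical values are hypothetical choices for illustration; the comparison is only up to constants, as in the table.

```python
import math


def v_n_geometric(norm_f, d_n, c, rho):
    """Evaluate V_n(f) from (2.4) when Delta(k) = c * rho**k with 0 < rho < 1.

    norm_f stands for ||f||_{2,n}, d_n for D_n; all arguments are plain floats.
    """
    if norm_f <= 0.0 or c <= 0.0 or d_n <= 0.0:
        # Every minimum in the sum vanishes, so V_n(f) reduces to ||f||_{2,n}.
        return norm_f
    # k0 = smallest k >= 1 with d_n * Delta(k) < ||f||_{2,n}.
    k0 = 1
    while d_n * c * rho ** k0 >= norm_f:
        k0 += 1
    head = (k0 - 1) * norm_f                   # terms k < k0: the minimum is ||f||_{2,n}
    tail = d_n * c * rho ** k0 / (1.0 - rho)   # terms k >= k0: geometric series
    return norm_f + head + tail


# Independent case: Delta(k) = 0 gives back ||f||_{2,n}.
print(v_n_geometric(0.3, 1.0, 0.0, 0.5))

# Ratio of V_n(f) to the Table 1 expression for a few hypothetical norms.
for x in (0.5, 0.1, 0.01, 0.001):
    table_form = x * max(math.log(1.0 / x), 1.0)
    print(x, v_n_geometric(x, 1.0, 1.0, 0.5) / table_form)
```

The ratios printed in the loop indicate how the exact value relates to the simplified expression for this particular choice of $c$, $\rho$ and $\mathbb{D}_{n}$; they are not a substitute for the lemmas cited above.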

2.1.2 Discussion on $\Delta(k)$

Assumption 2.2 asks $\Delta(k)$ to upper bound

$$\sum_{j=0}^{k}L_{\mathcal{F},j}(\delta_{\frac{2sp}{p-1}}^{X}(k-j))^{s},$$

which is a convolution of the uniform Hölder constants $L_{\mathcal{F},j}$ of $f\in\mathcal{F}$ and the dependence measure $\delta_{\frac{2sp}{p-1}}^{X}(k)$ of $X$. Therefore, the specific form of $f\in\mathcal{F}$ has an impact on the dependence structure which is then introduced in $V_{n}$. This is contrary to other typical chaining approaches for Harris-recurrent Markov chains or $\beta$-mixing sequences where the dependence structure of $X_{i}$ simply transfers to functions $f(X_{i})$ without further conditions.

Furthermore, contrary to other chaining approaches, we have to ask for the existence of moments of $X_{i}$ in Assumption 2.2 even though $\mathbb{G}_{n}(f)$ only involves $f(X_{i})$. This is due to the linear nature of the functional dependence measure (1.2). If $f$ is Lipschitz continuous with respect to its first argument ($s=1$ in Assumption 2.2), we have to impose $\sup_{i,j}\|X_{ij}\|_{2}<\infty$. However, these moment assumptions can be relaxed at the cost of larger $\Delta(k)$ as follows. Let us consider the special case that $f(Z_{i},\frac{i}{n})$ only depends on $X_{i},\frac{i}{n}$, that is, $f(z,u)=f(z_{0},u)$. If $f$ is bounded and Lipschitz continuous with respect to its first argument with Lipschitz constant $L$, then for any $s\in(0,1]$,

$$|f(z_{0},u)-f(z_{0}^{\prime},u)|\leq\min\{2\|f\|_{\infty},L|z_{0}-z_{0}^{\prime}|\}\leq(2\|f\|_{\infty})^{1-s}L^{s}|z_{0}-z_{0}^{\prime}|^{s}.$$

Thus, $\Delta(k)$ can be chosen proportional to $\delta_{2s}^{X}(k)^{s}$. This means that we can reduce the moment assumption to $\sup_{i,j}\|X_{ij}\|_{2s}<\infty$ at the cost of having a larger norm $V_{n}$.
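The second inequality in the bound for $|f(z_{0},u)-f(z_{0}^{\prime},u)|$ above is an elementary interpolation step. A short derivation, written with generic nonnegative numbers $a,b$ that are introduced here only for illustration, reads:

```latex
% For a, b \ge 0 and s \in (0,1], both exponents 1-s and s are nonnegative, hence
\min\{a,b\} \;=\; \min\{a,b\}^{1-s}\cdot\min\{a,b\}^{s} \;\le\; a^{1-s}\, b^{s}.
% Applying this with a = 2\|f\|_\infty and b = L\,|z_0 - z_0'| gives
\min\{2\|f\|_{\infty},\, L|z_{0}-z_{0}'|\}
  \;\le\; (2\|f\|_{\infty})^{1-s}\, L^{s}\, |z_{0}-z_{0}'|^{s}.
```

In other words, a bounded Lipschitz function is Hölder continuous of every order $s\in(0,1]$, with a constant that mixes the sup-norm and the Lipschitz constant.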

2.2 Comparison to empirical process theory with $\beta$-mixing

In this section, we compare our functional central limit theorem for stationary processes from Corollary 2.4 under functional dependence with similar results obtained under $\beta$-mixing. Unfortunately, we were not able to find a general setting under which the functional dependence measure $\delta_{2}^{X}$ can be compared with the $\beta$-mixing coefficients $\beta^{X}$ of $X_{i}$, $i=1,...,n$. However, in some special cases, both quantities can be upper bounded.

2.2.1 Upper bounds for dependence coefficients of linear processes

Consider the linear process

$$X_{i}=\sum_{k=0}^{\infty}a_{k}\varepsilon_{i-k},\quad i=1,...,n,$$

with an absolutely summable sequence $a_{k}$, $k\in\mathbb{N}_{0}$, and i.i.d. $\varepsilon_{k}$, $k\in\mathbb{Z}$, with $\mathbb{E}\varepsilon_{1}=0$. Then it is immediate that

$$\delta_{2}^{X}(k)\leq 2|a_{k}|\cdot\|\varepsilon_{1}\|_{2}.$$
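The bound above follows because replacing $\varepsilon_{i-k}$ by an independent copy changes only one summand, so that $X_{i}-X_{i}^{*(i-k)}=a_{k}(\varepsilon_{i-k}-\varepsilon_{i-k}^{*})$. The following Monte Carlo sketch checks this numerically. It assumes standard Gaussian innovations, a truncation of the linear process to 30 lags and the coefficients $a_{j}=(j+1)^{-2}$; none of these choices come from the text, and the function names are hypothetical.

```python
import math
import random


def coupled_difference(a, k, rng):
    """Draw one sample of X_i - X_i^{*(i-k)} for X_i = sum_j a[j] * eps_{i-j}."""
    m = len(a)
    eps = [rng.gauss(0.0, 1.0) for _ in range(m)]   # eps[j] plays the role of eps_{i-j}
    eps_star = list(eps)
    eps_star[k] = rng.gauss(0.0, 1.0)               # independent copy of eps_{i-k}
    x = sum(a[j] * eps[j] for j in range(m))
    x_star = sum(a[j] * eps_star[j] for j in range(m))
    return x - x_star


def delta2_monte_carlo(a, k, n_rep=20000, seed=0):
    """Monte Carlo estimate of || X_i - X_i^{*(i-k)} ||_2."""
    rng = random.Random(seed)
    second_moment = sum(coupled_difference(a, k, rng) ** 2 for _ in range(n_rep)) / n_rep
    return math.sqrt(second_moment)


alpha = 2.0
a = [(j + 1) ** (-alpha) for j in range(30)]   # hypothetical coefficients, truncated at 30 lags

for k in (0, 1, 5):
    estimate = delta2_monte_carlo(a, k)
    exact = math.sqrt(2.0) * abs(a[k])   # ||a_k (eps - eps*)||_2 for unit-variance noise
    bound = 2.0 * abs(a[k])              # bound from the text with ||eps_1||_2 = 1
    print(k, estimate, exact, bound)
```

For centered unit-variance innovations the exact value is $\sqrt{2}\,|a_{k}|$, so the factor $2$ in the bound comes from the triangle inequality and is not sharp in this example.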

From [35] (cf. also [15], Section 2.3.1), we have the following result. If for some $\nu\geq 1$, $\|\varepsilon_{1}\|_{\nu}<\infty$, $\varepsilon_{1}$ has a Lipschitz-continuous Lebesgue-density and the process $X_{i}$ is invertible, then for some constant $\zeta>0$,

$$\beta^{X}(k)\leq\zeta\cdot\big(\sum_{m=k}^{\infty}|A_{m,\nu}|^{\frac{1}{1+\nu}}\big)\vee\big(\sum_{m=k}^{\infty}L(A_{m,2})\big),$$

where $A_{m,s}:=\sum_{k=m}^{\infty}|a_{k}|^{s}$ and $L(u)=\sqrt{u(1\vee|\log(u)|)}$. If $a_{k}=O(k^{-\alpha})$ for some $\alpha>1$,

$$\delta_{2}^{X}(k)=O(k^{-\alpha}),\quad\quad\beta^{X}(k)=O\big(k^{-\alpha+\frac{1+\alpha}{1+\nu}+1}\vee(k^{-\alpha+\frac{3}{2}}\log(k)^{1/2})\big). \tag{2.7}$$

Note that even for this specific example, the calculation of the functional dependence measure is much easier and possible with far fewer assumptions. Moreover, bounds for $\beta^{X}(k)$ are typically larger than those for $\delta_{2}^{X}(k)$. The reason is the simple structure of $\delta_{2}^{X}$ compared to the much more involved formulation of dependence through sigma-algebras in the $\beta$-mixing coefficients. For recursively defined processes with a finite number of lags, $\delta_{2}^{X}$ is typically upper bounded by geometrically decaying coefficients (cf. [48], [8]); the same holds true for $\beta^{X}(k)$ under additional continuity assumptions (cf. [15], Section 2.4, or [27], [25] among others).

2.2.2 Entropy integral

In [14] (cf. also [10]), it was shown that if $X_{i}$, $i=1,...,n$, is stationary and $\beta$-mixing with coefficients $\beta(k)$, $k\in\mathbb{N}_{0}$, then

$$\int_{0}^{1}\sqrt{\mathbb{H}(\varepsilon,\mathcal{H},\|\cdot\|_{2,\beta})}d\varepsilon<\infty \tag{2.8}$$

implies weak convergence of $(\mathbb{G}_{n}(h))_{h\in\mathcal{H}}$ in $\ell^{\infty}(\mathcal{H})$. Here, the $\|\cdot\|_{2,\beta}$-norm is defined as follows. If $\beta^{-1}$ denotes the càdlàg inverse of the decreasing function $t\mapsto\beta(\lfloor t\rfloor)$ and $Q_{h}$ the càdlàg inverse of the tail function $t\mapsto\mathbb{P}(h(X_{1})>t)$, then

$$\|h\|_{2,\beta}:=\int_{0}^{1}\beta^{-1}(u)Q_{h}(u)^{2}du.$$

Condition (2.8) was later relaxed in [40, Theorem 8.3]. It could be shown that if $\mathcal{F}$ consists of indicator functions of specific classes of sets (in particular, $\mathcal{F}$ corresponds to the empirical distribution function), weak convergence can be obtained under less restrictive conditions than (2.8). Since our theory does not directly allow us to analyze indicator functions because $\mathcal{F}$ has to be an $(L_{\mathcal{F}},s,R,C)$-class, we do not discuss their generalization here in detail.

In the special cases of polynomial and geometric decay, simple upper bounds for $\|h\|_{2,\beta}$ are available (cf. [10]). If $\sum_{k=0}^{\infty}k^{b-1}\beta(k)<\infty$ for some $b\geq 1$, then $\|\cdot\|_{2,\beta}$ is upper bounded by $\|\cdot\|_{\frac{2b}{b-1}}$.

Generally speaking, (2.8) asks for $\frac{2b}{b-1}$ moments of the process $f(X_{i})$ to exist while our condition in (2.6) only asks for $2$ moments of $f(X_{i})$ but allows for smaller function classes through the additional factors given in the entropy integral (cf. Table 1). In specific examples (cf. (2.7)) it may occur that the entropy integral (2.6) is finite while (2.8) is infinite due to missing summability of $\beta^{X}(k)$.

To give a precise comparison, consider the situation of linear processes from (2.7). If $\nu>2\alpha+1$, we can choose $b=\alpha-\frac{3}{2}$. Then, the two entropy integrals from Corollary 2.4 (left) and (2.8) (right) read

$$\int_{0}^{1}\varepsilon^{-\frac{1}{\alpha}}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},\|\cdot\|_{2})}d\varepsilon\quad\quad\text{vs.}\quad\quad\int_{0}^{1}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},\|\cdot\|_{\frac{4\alpha-6}{2\alpha-5}})}d\varepsilon.$$

Here, the entropy integral for mixing only exists if $\alpha>\frac{5}{2}$. The difference in the behavior is due to different bounds used for the variance of $\mathbb{G}_{n}(f)$.
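The index of the norm in the mixing entropy integral can be traced back to the choice $b=\alpha-\frac{3}{2}$ by a short calculation. The numerical value $\alpha=3$ in the comment below is an arbitrary illustration and does not appear in the original comparison.

```latex
\frac{2b}{b-1}\bigg|_{b=\alpha-\frac{3}{2}}
  \;=\; \frac{2\alpha-3}{\alpha-\frac{5}{2}}
  \;=\; \frac{4\alpha-6}{2\alpha-5},
\qquad\text{and}\qquad b>1 \iff \alpha>\tfrac{5}{2}.
% Illustration: \alpha = 3 gives b = 3/2 and 2b/(b-1) = 6, so the mixing entropy
% integral is taken with respect to \|\cdot\|_{6}, while the integral from
% Corollary 2.4 uses \|\cdot\|_{2} together with the weight \varepsilon^{-1/3}.
```

This makes the trade-off explicit: the mixing approach pays with higher moments of $f(X_{i})$, the functional dependence approach with an additional weight in the entropy integral.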

2.3 Integration into other empirical process results for the empirical distribution function of dependent data

While our approach does allow for a general empirical process theory for Hölder continuous function classes, some more general dependence concepts already have been introduced (only) for the discussion of the empirical distribution function (EDF) based on the one-dimensional class $\mathcal{F}=\{x\mapsto\mathbbm{1}_{\{x\leq t\}}:t\in\mathbb{R}\}$. We mention [6], [9], [4] and [20]. The conditions therein cover the case where $X_{i}=J(\mathcal{G}_{i})$ is a stationary Bernoulli shift with $\mathcal{G}_{i}=(\varepsilon_{i},\varepsilon_{i-1},...)$ and dependence is measured with the (stationary) functional dependence measure

$$\delta_{\nu}^{X}(k)=\|X_{i}-X_{i}^{*(i-k)}\|_{\nu}$$

and its summed up version, $D_{\nu}(k):=\sum_{j=k+1}^{\infty}\delta_{\nu}^{X}(j)$. [6] introduces so-called $\nu$-approximation coefficients $a_{k}$, $k\in\mathbb{N}$, which can be viewed as another formulation of functional dependence. However, their final result [6, Theorem 5] for the convergence of the EDF is stated with summability conditions both on $a_{k}$ and on absolutely regular mixing coefficients; we therefore do not discuss it in detail here. On the other hand, [9, Theorem 2.1] in combination with [11, Section 6.1] show convergence of the EDF if for some $\nu>1$ and $\gamma>0$,

$$D_{\nu}(k)=O(k^{-\frac{(1+\gamma)(\nu+1)}{\nu}}).$$

This is done by introducing simplified $\beta$-mixing coefficients which can then be upper bounded by $D_{\nu}(k)$. By using independent approximations of the original process, [4, Theorem 1, Corollary 1] obtain convergence of the EDF if for some $\nu\geq 1$ and $A>4$, $(D_{\nu}(k)k^{A})^{\nu}\leq k^{-A}$, or equivalently,

$$D_{\nu}(k)\leq k^{-\frac{A(\nu+1)}{\nu}}.$$

[20] discusses convergence of the EDF under a general growth condition imposed on the moments of $\sum_{i=1}^{n}\{h(X_{i})-\mathbb{E}h(X_{i})\}$ where $h\in\mathcal{H}_{\alpha}$, the set of all Hölder-continuous functions with exponent $\alpha\in(0,1]$. Their condition is fulfilled if

\[ \sum_{k=1}^{\infty}k^{2p-2}D_{\nu}(k)^{\alpha}<\infty \]

for some $p>\frac{\nu}{\nu-1}$, $\nu>1$.

3 A general central limit theorem for locally stationary processes

In this section, we introduce the remaining assumptions needed in Theorem 2.3 which pose regularity conditions on the process $X_{i}$ and the function class $\mathcal{F}$ in time direction. They are used to derive a multivariate central limit theorem for $(\mathbb{G}_{n}(f_{1}),...,\mathbb{G}_{n}(f_{k}))$ under minimal moment conditions in Theorem 3.4. Comparable results in different and more specific contexts were shown in [8] or [42].

We first formalize the property of $X_{i}$ to be locally stationary (cf. [8]). We ask that for each $u\in[0,1]$ there exists a stationary process $\tilde{X}_{i}(u)$, $i=1,...,n$, such that $X_{i}\approx\tilde{X}_{i}(u)$ if $|u-\frac{i}{n}|$ is small. Recall $R(\cdot),s,p$ from Assumption 2.2.

Assumption 3.1.

For each $u\in[0,1]$, there exists a process $\tilde{X}_{i}(u)=J(\mathcal{G}_{i},u)$, $i\in\mathbb{Z}$, where $J$ is a measurable function. Furthermore, there exists some $C_{X}>0$, $\varsigma\in(0,1]$ such that for every $i\in\{1,...,n\}$, $u_{1},u_{2}\in[0,1]$,

\[ \|X_{i}-\tilde{X}_{i}(\tfrac{i}{n})\|_{\frac{2sp}{p-1}}\leq C_{X}n^{-\varsigma},\quad\quad\|\tilde{X}_{i}(u_{1})-\tilde{X}_{i}(u_{2})\|_{\frac{2sp}{p-1}}\leq C_{X}|u_{1}-u_{2}|^{\varsigma}. \]

For $\tilde{Z}_{i}(u)=(\tilde{X}_{i}(u),\tilde{X}_{i-1}(u),...)$ it holds that $\sup_{v,u}\|R(\tilde{Z}_{0}(v),u)\|_{2p}<\infty$.

The behavior of the functions $f(z,u)=D_{f,n}(u)\cdot\bar{f}(z,u)$ of the class $\mathcal{F}$ in the direction of time $u\in[0,1]$ is controlled by the following two continuity assumptions which state conditions on $\bar{f}(z,\cdot)$ and $D_{f,n}(\cdot)$ separately.

Assumption 3.2.

There exists some $\varsigma\in(0,1]$ such that for every $\bar{f}\in\bar{\mathcal{F}}$,

\[ \sup_{v,u_{1},u_{2}}\Big\|\frac{|\bar{f}(\tilde{Z}_{0}(v),u_{1})-\bar{f}(\tilde{Z}_{0}(v),u_{2})|}{|u_{1}-u_{2}|^{\varsigma}}\Big\|_{2}<\infty. \]

For $f\in\mathcal{F}$, let $D^{\infty}_{f,n}:=\sup_{i=1,...,n}D_{f,n}(\frac{i}{n})$.

Assumption 3.3.

For all $f\in\mathcal{F}$, the function $\frac{D_{f,n}(\cdot)}{D_{f,n}^{\infty}}$ has bounded variation uniformly in $n$, and

\[ \sup_{n\in\mathbb{N}}\frac{1}{n}\sum_{i=1}^{n}D_{f,n}(\tfrac{i}{n})^{2}<\infty,\quad\quad\frac{D_{f,n}^{\infty}}{\sqrt{n}}\to 0. \tag{3.1} \]

One of the following two cases holds.

  (i) Case $\mathbb{K}=1$ (global): For all $f,g\in\mathcal{F}$, $u\mapsto\mathbb{E}[\mathbb{E}[\bar{f}(\tilde{Z}_{j_{1}}(u),u)|\mathcal{G}_{0}]\cdot\mathbb{E}[\bar{g}(\tilde{Z}_{j_{2}}(u),u)|\mathcal{G}_{0}]]$ has bounded variation for all $j_{1},j_{2}\in\mathbb{N}_{0}$ and the following limit exists:

    \[ \Sigma_{fg}^{(1)}:=\lim_{n\to\infty}\int_{0}^{1}D_{f,n}(u)D_{g,n}(u)\cdot\sum_{j\in\mathbb{Z}}\mathrm{Cov}(\bar{f}(\tilde{Z}_{0}(u),u),\bar{g}(\tilde{Z}_{j}(u),u))\,du. \]

  (ii) Case $\mathbb{K}=2$ (local): There exists a sequence $h_{n}>0$ and $v\in[0,1]$ such that $\mathrm{supp}\,D_{f,n}(\cdot)\subset[v-h_{n},v+h_{n}]$. It holds that

    \[ h_{n}\to 0,\quad\quad\sup_{n\in\mathbb{N}}(h_{n}^{1/2}\cdot D_{f,n}^{\infty})<\infty. \]

    The following limit exists for all $f,g\in\mathcal{F}$:

    \[ \Sigma_{fg}^{(2)}:=\lim_{n\to\infty}\int_{0}^{1}D_{f,n}(u)D_{g,n}(u)\,du\cdot\sum_{j\in\mathbb{Z}}\mathrm{Cov}(\bar{f}(\tilde{Z}_{0}(v),v),\bar{g}(\tilde{Z}_{j}(v),v)). \]

Assumption 3.3 looks rather technical. The first part including (3.1) guarantees the right normalization of $D_{f,n}(\cdot)$. The second part ensures the convergence of the asymptotic variances $\mathrm{Var}(\mathbb{G}_{n}(f))$ and covariances $\mathrm{Cov}(\mathbb{G}_{n}(f),\mathbb{G}_{n}(g))$.

We obtain the following central limit theorem.

Theorem 3.4.

Let $\mathcal{F}$ satisfy Assumptions 2.2, 3.1, 3.2 and 3.3. Let $m\in\mathbb{N}$ and $f_{1},...,f_{m}\in\mathcal{F}$. Let $\Sigma^{(\mathbb{K})}=(\Sigma^{(\mathbb{K})}_{f_{k}f_{l}})_{k,l=1,...,m}$. Then

\[ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Big\{\begin{pmatrix}f_{1}(Z_{i},\frac{i}{n})\\ \vdots\\ f_{m}(Z_{i},\frac{i}{n})\end{pmatrix}-\mathbb{E}\begin{pmatrix}f_{1}(Z_{i},\frac{i}{n})\\ \vdots\\ f_{m}(Z_{i},\frac{i}{n})\end{pmatrix}\Big\}\overset{d}{\to}N(0,\Sigma^{(\mathbb{K})}). \]

Theorem 3.4 generalizes the one-dimensional central limit theorem from [8]. We now comment on the assumptions.

Remark 3.5.

Assumptions 3.1, 3.2 and 3.3 allow for very general structures of $f\in\mathcal{F}$. However, in many special cases, a subset of them is automatically fulfilled:

  • If $X_{i}$ is stationary, then Assumption 2.2 already implies Assumption 3.1.

  • If $\bar{f}(z,u)=\bar{f}(z)$ does not depend on $u$, Assumption 3.2 is fulfilled.

Regarding Assumption 3.3 we have:

  • If $D_{f,n}(u)=1$, $X_{i}$ is stationary and $\bar{f}(z,u)=\bar{f}(z)$, then Assumption 2.2 already implies Assumption 3.3(i) with $\Sigma_{fg}^{(1)}=\sum_{j\in\mathbb{Z}}\mathrm{Cov}(\bar{f}(Z_{0}),\bar{g}(Z_{j}))$.

  • If $D_{f,n}(u)=1$ and Assumptions 2.2, 3.2 hold with $s=\varsigma=1$, then Assumption 3.3(i) holds with $\Sigma_{fg}^{(1)}=\int_{0}^{1}\sum_{j\in\mathbb{Z}}\mathrm{Cov}(\bar{f}(\tilde{Z}_{0}(u),u),\bar{g}(\tilde{Z}_{j}(u),u))\,du$.

  • If $h_{n}\to 0$, $nh_{n}\to\infty$ and $D_{f,n}(u)=\frac{1}{\sqrt{h_{n}}}K(\frac{u-v}{h_{n}})$ with some Lipschitz-continuous kernel $K:\mathbb{R}\to\mathbb{R}$ with support $\subset[-1,1]$ and fixed $v\in(0,1)$, then Assumption 3.3(ii) holds with $\Sigma_{fg}^{(2)}=\int_{-1}^{1}K(x)^{2}dx\cdot\sum_{j\in\mathbb{Z}}\mathrm{Cov}(\bar{f}(\tilde{Z}_{0}(v),v),\bar{g}(\tilde{Z}_{j}(v),v))$.
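The kernel case in the last item can be checked by a change of variables. The following computation is only a sketch; it assumes (for illustration) that $n$ is large enough that $[v-h_{n},v+h_{n}]\subset[0,1]$, which holds eventually because $v\in(0,1)$ is fixed and $h_{n}\to 0$.

```latex
\begin{align*}
\int_{0}^{1}D_{f,n}(u)^{2}\,du
  &= \frac{1}{h_{n}}\int_{0}^{1}K\Big(\frac{u-v}{h_{n}}\Big)^{2}du
   = \int_{-v/h_{n}}^{(1-v)/h_{n}}K(x)^{2}\,dx
   = \int_{-1}^{1}K(x)^{2}\,dx, \\
h_{n}^{1/2}\cdot D_{f,n}^{\infty}
  &= \sup_{i=1,...,n}K\Big(\frac{\frac{i}{n}-v}{h_{n}}\Big)
   \leq |K|_{\infty} < \infty .
\end{align*}
```

The first line gives the factor in front of the long-run covariance in $\Sigma_{fg}^{(2)}$, and the second line is the normalization condition required in Assumption 3.3(ii).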

4 Maximal inequalities and asymptotic tightness under functional dependence

In this section, we provide the necessary ingredients for the proof of asymptotic tightness of $\mathbb{G}_{n}(f)$. We derive a new maximal inequality for finite $\mathcal{F}$ under functional dependence in Theorem 4.1. We then generalize this bound to arbitrary $\mathcal{F}$ using chaining techniques in Section 4.2.

4.1 Maximal inequalities

We first derive a maximal inequality which is a main ingredient for chaining devices but is also of independent interest. To state the result, let

\[ \beta(q):=\sum_{j=q}^{\infty}\Delta(j) \]

and define

\[ q^{*}(x):=\min\{q\in\mathbb{N}:\beta(q)\leq q\cdot x\}. \]

Set $D_{n}^{\infty}(u):=\sup_{f\in\mathcal{F}}|D_{f,n}(u)|$. For $\nu\geq 2$, choose $\mathbb{D}_{\nu,n}^{\infty}$ such that

\[ \Big(\frac{1}{n}\sum_{i=1}^{n}D_{n}^{\infty}(\tfrac{i}{n})^{\nu}\Big)^{1/\nu}\leq\mathbb{D}_{\nu,n}^{\infty}. \tag{4.1} \]

Put $\mathbb{D}_{n}^{\infty}=\mathbb{D}_{2,n}^{\infty}$. Recall that $H=H(|\mathcal{F}|)=1\vee\log|\mathcal{F}|$ as in (1.5).

Theorem 4.1.

Suppose that $\mathcal{F}$ satisfies $|\mathcal{F}|<\infty$ and Assumption 2.2. Then there exists some universal constant $c>0$ such that the following holds: If $\sup_{f\in\mathcal{F}}\|f\|_{\infty}\leq M$ and $\sup_{f\in\mathcal{F}}V_{n}(f)\leq\sigma$, then

\[ \mathbb{E}\max_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big|\leq c\cdot\min_{q\in\{1,...,n\}}\Big[\sigma\sqrt{H}+\sqrt{H}\cdot\mathbb{D}_{n}^{\infty}\beta(q)+\frac{qMH}{\sqrt{n}}\Big] \tag{4.2} \]

and

\[ \mathbb{E}\max_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big|\leq 2c\cdot\Big(\sigma\sqrt{H}+q^{*}\big(\frac{M\sqrt{H}}{\sqrt{n}\mathbb{D}_{n}^{\infty}}\big)\frac{MH}{\sqrt{n}}\Big). \tag{4.3} \]

Clearly, the second bound (4.3) is a corollary of (4.2) which balances the two terms which involve $q$. Values of $q^{*}(\cdot)$ for the two prominent cases that $\Delta(\cdot)$ is polynomially or exponentially decaying can be found in Table 2. The proof of Theorem 4.1 relies on a decomposition of $\mathbb{G}_{n}(f)$ into i.i.d. parts and a residual term of martingale structure. Similar decompositions are also the core of empirical process results for Harris-recurrent Markov chains (cf. [29]) and mixing sequences (cf. [10]).
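The balancing step behind (4.3) can be written out as follows. This is a sketch of the argument rather than the authors' proof; the shorthand $x$ and the side condition $q^{*}(x)\leq n$ (so that $q^{*}(x)$ is admissible in the minimum of (4.2)) are assumptions made here for illustration.

```latex
\begin{align*}
&\text{Let } x:=\frac{M\sqrt{H}}{\sqrt{n}\,\mathbb{D}_{n}^{\infty}}
 \text{ and } q:=q^{*}(x), \text{ so that } \beta(q)\leq q\cdot x. \text{ Then} \\
&\sqrt{H}\cdot\mathbb{D}_{n}^{\infty}\beta(q)
  \leq \sqrt{H}\cdot\mathbb{D}_{n}^{\infty}\cdot q\cdot\frac{M\sqrt{H}}{\sqrt{n}\,\mathbb{D}_{n}^{\infty}}
  = \frac{qMH}{\sqrt{n}}, \\
&\sigma\sqrt{H}+\sqrt{H}\cdot\mathbb{D}_{n}^{\infty}\beta(q)+\frac{qMH}{\sqrt{n}}
  \leq \sigma\sqrt{H}+2\,q^{*}(x)\frac{MH}{\sqrt{n}}
  \leq 2\Big(\sigma\sqrt{H}+q^{*}(x)\frac{MH}{\sqrt{n}}\Big).
\end{align*}
```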

| $\Delta(j)$ | $Cj^{-\alpha}$, $\alpha>1$ | $C\rho^{j}$, $\rho\in(0,1)$ |
|---|---|---|
| $q^{*}(x)$ | $\max\{x^{-\frac{1}{\alpha}},1\}$ | $\max\{\log(x^{-1}),1\}$ |
| $r(\delta)$ | $\min\{\delta^{\frac{\alpha}{\alpha-1}},\delta\}$ | $\min\{\frac{\delta}{\log(\delta^{-1})},\delta\}$ |

Table 2: Equivalent expressions of $q^{*}(\cdot)$ and $r(\cdot)$ taken from Lemma 7.10 in Section 7.9. We omitted the lower and upper bound constants, which depend only on $C,\rho,\alpha$.

In the next subsections, we will prove asymptotic tightness for $\mathbb{G}_{n}(f)$ under the condition that $\mathbb{D}_{n}^{\infty}$, $\mathbb{D}_{n}$ do not depend on $n$. However, uniform convergence rates of $\mathbb{G}_{n}(f)$ for finite $\mathcal{F}$ (growing with $n$) can be obtained without these conditions but with additional moment assumptions, which is done in the following Corollary 4.3. To incorporate the additional moment assumptions, we use a slightly stronger assumption than Assumption 2.2.

Assumption 4.2.

$\bar{\mathcal{F}}$ is a $(L_{\mathcal{F}},s,R,C)$-class. There exists $\nu\geq 2$, $p\in(1,\infty]$, $C_{X}>0$ such that

\[ \sup_{i,u}\|R(Z_{i},u)\|_{\nu p}\leq C_{R},\quad\quad\sup_{i,j}\|X_{ij}\|_{\frac{\nu sp}{p-1}}\leq C_{X}. \tag{4.4} \]

Let $\mathbb{D}_{n}\geq 0$, $\Delta(k)\geq 0$ be such that for all $k\in\mathbb{N}_{0}$,

\[ 2dC_{R}\cdot\sum_{j=0}^{k}L_{\mathcal{F},j}(\delta_{\frac{\nu sp}{p-1}}^{X}(k-j))^{s}\leq\Delta(k),\quad\quad\sup_{f\in\mathcal{F}}\Big(\frac{1}{n}\sum_{i=1}^{n}\big|D_{f,n}(\tfrac{i}{n})\big|^{2}\Big)^{1/2}\leq\mathbb{D}_{n}. \]

Note that Assumption 2.2 is obtained by taking $\nu=2$. For $\delta>0$, let

\[ r(\delta):=\max\{r>0:q^{*}(r)r\leq\delta\}, \]

cf. Table 2 for values of $r(\cdot)$ in special cases.
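To get a feeling for the quantities $\beta(\cdot)$, $q^{*}(\cdot)$ and $r(\cdot)$, they can be evaluated numerically for a given decay sequence. The following is a minimal sketch, not part of the paper: the function names `tail_sums`, `q_star` and `r_of`, the truncation of the infinite sum at a finite `horizon`, and the grid search for $r(\delta)$ are all assumptions made for illustration.

```python
def tail_sums(delta, horizon):
    """Return a list s with s[q] ~ beta(q) = sum_{j >= q} delta(j).

    The infinite sum is truncated at `horizon` (an approximation).
    """
    s = [0.0] * (horizon + 2)
    for j in range(horizon, 0, -1):  # accumulate the smallest terms first
        s[j] = s[j + 1] + delta(j)
    return s


def q_star(x, s):
    """Smallest q >= 1 with beta(q) <= q * x, searched within the horizon."""
    for q in range(1, len(s) - 1):
        if s[q] <= q * x:
            return q
    raise ValueError("horizon too short for this x")


def r_of(delta_val, s, grid=2000):
    """Grid approximation of r(delta) = max{r > 0 : q*(r) * r <= delta}.

    Since q*(r) >= 1, every admissible r satisfies r <= delta,
    so it suffices to search a grid on (0, delta].
    """
    best = 0.0
    for k in range(1, grid + 1):
        r = delta_val * k / grid
        if q_star(r, s) * r <= delta_val:
            best = r
    return best


# Two hypothetical decay sequences matching the two columns of Table 2.
poly = tail_sums(lambda j: j ** -2.0, horizon=100_000)  # Delta(j) = j^(-2)
expo = tail_sums(lambda j: 0.5 ** j, horizon=200)       # Delta(j) = 0.5^j

for x in (0.1, 0.01, 0.001):
    # Compare q*(x) with the polynomial-case order x^(-1/alpha), alpha = 2.
    print(x, q_star(x, poly), x ** -0.5, q_star(x, expo))
print(r_of(0.5, poly), r_of(0.5, expo))
```

The printed values of `q_star` for the polynomial sequence can be compared with $x^{-1/\alpha}$ from Table 2; they should agree only up to constants, since the table omits them.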

Corollary 4.3 (Uniform convergence rates).

Suppose that $\mathcal{F}$ satisfies $|\mathcal{F}|<\infty$ and Assumption 4.2 for some $\nu\geq 2$. Let $C_{\Delta}:=4d\cdot|L_{\mathcal{F}}|_{1}\cdot C_{X}^{s}C_{R}+C_{\bar{f}}$. Furthermore, suppose that

\[ \sup_{n\in\mathbb{N}}\sup_{f\in\mathcal{F}}V_{n}(f)<\infty,\quad\quad\sup_{n\in\mathbb{N}}\frac{\mathbb{D}_{\nu,n}^{\infty}}{\mathbb{D}_{n}^{\infty}}<\infty,\quad\quad\sup_{n\in\mathbb{N}}\frac{C_{\Delta}H}{n^{1-\frac{2}{\nu}}r(\frac{\sigma}{\mathbb{D}_{n}^{\infty}})^{2}}<\infty. \tag{4.5} \]

Then,

\[ \max_{f\in\mathcal{F}}|\mathbb{G}_{n}(f)|=O_{p}(\sqrt{H}). \]

The first condition in (4.5) guarantees that $\mathbb{G}_{n}(f)$ is properly normalized. The second and third conditions are needed to prove that the “rare events”, where $|f(Z_{i},\frac{i}{n})|$ exceeds some threshold $M_{n}\in(0,\infty)$, are of the same order as $\sqrt{H}$. For this, we may need more than two moments of $f(Z_{i},\frac{i}{n})$, that is, $\nu>2$, depending on $\sqrt{H}$ and the behavior of $\mathbb{D}_{n}^{\infty}$.

Corollary 4.3 can be used to prove (optimal) convergence rates for kernel density and regression estimators as well as maximum likelihood estimators under dependence. We give an example in Section 5.

4.2 Asymptotic tightness

In this section, we extend the maximal inequality from Theorem 4.1 to arbitrary (infinite) classes $\mathcal{F}$. Since Assumption 2.2 forces $f\in\mathcal{F}$ to be Hölder-continuous with respect to its first argument $z$, classical chaining approaches which use indicator functions do not apply here. We provide a new chaining technique which preserves continuity in Section 7.2.

For $n\in\mathbb{N}$, $\delta>0$ and $k\in\mathbb{N}$ define $H(k)=1\vee\log(k)$ and

\[ m(n,\delta,k):=r(\frac{\delta}{\mathbb{D}_{n}})\cdot\frac{\mathbb{D}_{n}^{\infty}n^{1/2}}{H(k)^{1/2}}. \tag{4.6} \]

Here, $m(n,\delta,k)$ represents the threshold for rare events in the chaining procedure. We have the following result.

Theorem 4.4.

Let $\mathcal{F}$ satisfy Assumption 2.2 and let $F$ be some envelope function of $\mathcal{F}$, that is, for each $f\in\mathcal{F}$ it holds that $|f|\leq F$. Let $\sigma>0$ and assume that $\sup_{f\in\mathcal{F}}V_{n}(f)\leq\sigma$. Then there exists some universal constant $\tilde{c}>0$ such that

\[ \mathbb{E}\sup_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big|\leq\tilde{c}\Big[\Big(1+\frac{\mathbb{D}_{n}^{\infty}}{\mathbb{D}_{n}}+\frac{\mathbb{D}_{n}}{\mathbb{D}_{n}^{\infty}}\Big)\int_{0}^{\sigma}\sqrt{\mathbb{H}\big(\varepsilon,\mathcal{F},V_{n}\big)}\,\mathrm{d}\varepsilon+\sqrt{n}\big\|F\mathbbm{1}_{\{F>\frac{1}{4}m(n,\sigma,\mathbb{N}(\frac{\sigma}{2},\mathcal{F},V_{n}))\}}\big\|_{1,n}\Big]. \]

As a corollary, we obtain asymptotic equicontinuity of $\mathbb{G}_{n}(f)$. Here, we use Assumptions 3.1 and 3.2 only to discuss the remainder term in Theorem 4.4 without imposing the existence of additional moments.

Corollary 4.5.

Let $\mathcal{F}$ satisfy Assumptions 2.2, 3.1 and 3.2. Suppose that

\[ \sup_{n\in\mathbb{N}}\int_{0}^{1}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}\,d\varepsilon<\infty. \tag{4.7} \]

Furthermore, assume that $\mathbb{D}_{n},\mathbb{D}_{n}^{\infty}\in(0,\infty)$ are independent of $n$, and

\[ \sup_{i=1,...,n}\frac{D_{n}^{\infty}(\frac{i}{n})}{\sqrt{n}}\to 0. \tag{4.8} \]

Then, the process $\mathbb{G}_{n}(f)$ is equicontinuous with respect to $V_{n}$, that is, for every $\eta>0$,

\[ \lim_{\sigma\to 0}\limsup_{n\to\infty}\mathbb{P}\Big(\sup_{f,g\in\mathcal{F},V_{n}(f-g)\leq\sigma}|\mathbb{G}_{n}(f)-\mathbb{G}_{n}(g)|\geq\eta\Big)=0. \]

5 Applications

In this section, we provide some applications of the main results (Corollary 4.3 and Theorem 2.3). We will focus on locally stationary processes and therefore use localization in our functionals, but the results also hold for stationary processes, accordingly.

Let $K:\mathbb{R}\to\mathbb{R}$ be some bounded kernel function which is Lipschitz continuous with Lipschitz constant $L_{K}$, $\int K(u)du=1$, $\int K(u)^{2}du\in(0,\infty)$ and support $\subset[-\frac{1}{2},\frac{1}{2}]$. For some bandwidth $h>0$, put $K_{h}(\cdot):=\frac{1}{h}K(\frac{\cdot}{h})$.

In the first example we consider the nonparametric kernel estimator in the context of nonparametric regression with fixed design and locally stationary noise. We show that under conditions on the bandwidth $h$, which are common in the presence of dependence (cf. [24] or [45]), we obtain the optimal uniform convergence rate $\sqrt{\frac{\log(n)}{nh}}$. Write $a_{n}\gtrsim b_{n}$ for sequences $a_{n},b_{n}$ if there exists some constant $c>0$ such that $a_{n}\geq cb_{n}$ for all $n\in\mathbb{N}$.

Example 5.1 (Nonparametric Regression).

Let $X_{i}$ be some arbitrary process of the form (1.1) with $\sum_{k=0}^{\infty}\delta^{X}_{2}(k)<\infty$ which fulfills $\sup_{i=1,...,n}\|X_{i}\|_{\nu}\leq C_{X}\in(0,\infty)$ for some $\nu>2$. Suppose that we observe $Y_{i}$, $i=1,...,n$ given by

\[ Y_{i}=g(\tfrac{i}{n})+X_{i}, \]

where $g:[0,1]\to\mathbb{R}$ is some function. Estimation of $g$ is performed via

\[ \hat{g}_{n,h}(v):=\frac{1}{n}\sum_{i=1}^{n}K_{h}(\tfrac{i}{n}-v)Y_{i}. \]

Suppose that either

  • $\delta_{2}^{X}(j)\leq\kappa j^{-\alpha}$ with some $\kappa>0,\alpha>1$, and $h\gtrsim(\frac{\log(n)}{n^{1-\frac{2}{\nu}}})^{\frac{\alpha-1}{\alpha}}$, or

  • $\delta_{2}^{X}(j)\leq\kappa\rho^{j}$ with some $\kappa>0,\rho\in(0,1)$ and $h\gtrsim\frac{\log(n)^{3}}{n^{1-\frac{2}{\nu}}}$.

From (5.1) and (5.2) below it follows that

\[ \sup_{v\in[0,1]}|\hat{g}_{n,h}(v)-\mathbb{E}\hat{g}_{n,h}(v)|=O_{p}\big(\sqrt{\frac{\log(n)}{nh}}\big). \]

First note that due to Lipschitz continuity of $K$ with Lipschitz constant $L_{K}$, we have

\[ \begin{aligned} &\sup_{|v-v^{\prime}|\leq n^{-3}}\big|(\hat{g}_{n,h}(v)-\mathbb{E}\hat{g}_{n,h}(v))-(\hat{g}_{n,h}(v^{\prime})-\mathbb{E}\hat{g}_{n,h}(v^{\prime}))\big| \\ &\quad\leq\frac{L_{K}n^{-3}}{nh^{2}}\sum_{i=1}^{n}\big(|X_{i}|+\mathbb{E}|X_{i}|\big)=O_{p}(n^{-1}). \end{aligned} \tag{5.1} \]

For the grid $V_{n}=\{in^{-3},i=1,...,n^{3}\}$, which discretizes $[0,1]$ up to distances $n^{-3}$, we obtain by Corollary 4.3 that

\[ \sqrt{nh}\sup_{v\in V_{n}}|\hat{g}_{n,h}(v)-\mathbb{E}\hat{g}_{n,h}(v)|=\sup_{f\in\mathcal{F}}|\mathbb{G}_{n}(f)|=O_{p}\big(\sqrt{\log|V_{n}|}\big)=O_{p}\big(\log(n)^{1/2}\big), \tag{5.2} \]

where

\[ \mathcal{F}=\{f_{v}(x,u)=\frac{1}{\sqrt{h}}K(\frac{u-v}{h})x:v\in V_{n}\}. \]

The conditions of Corollary 4.3 are easily verified: It holds that $f_{v}(x,u)=D_{f,n}(u)\cdot\bar{f}_{v}(x,u)$ with $D_{f,n}(u)=\frac{1}{\sqrt{h}}K(\frac{u-v}{h})$ and $\bar{f}_{v}(x,u)=x$. Thus, Assumption 4.2 is satisfied with $\Delta(k)=2\delta_{2}^{X}(k)$, $p=\infty$, $R(\cdot)=C_{R}=1$. Furthermore, $\mathbb{D}_{n}=|K|_{\infty}$, $\mathbb{D}_{\nu,n}^{\infty}=\frac{|K|_{\infty}}{\sqrt{h}}$, and

\[ \|f_{v}\|_{2,n}\leq\frac{1}{\sqrt{h}}\Big(\frac{1}{n}\sum_{i=1}^{n}K\big(\frac{\frac{i}{n}-v}{h}\big)^{2}\|X_{i}\|_{2}^{2}\Big)^{1/2}\leq C_{X}|K|_{\infty}, \]

which shows that $\sup_{f\in\mathcal{F}}V_{n}(f)=O(1)$. The conditions on $h$ emerge from the last condition in (4.5) and using the bounds for $r(\cdot)$ from Table 2.
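For readers who want to experiment with the estimator $\hat{g}_{n,h}$ from Example 5.1, the following is a minimal simulation sketch. It is not taken from the paper: the triangular kernel, the AR(1) noise model, and the names `kernel`, `g_hat` and `simulate` are assumptions chosen for illustration. The kernel $K(u)=2-4|u|$ on $[-\frac{1}{2},\frac{1}{2}]$ is Lipschitz, integrates to one and has the required support.

```python
import math
import random


def kernel(u):
    """Triangular kernel on [-1/2, 1/2]; integrates to 1 (illustrative choice)."""
    return 2.0 - 4.0 * abs(u) if abs(u) <= 0.5 else 0.0


def g_hat(y, v, h):
    """Kernel estimator (1/n) * sum_i K_h(i/n - v) * Y_i with K_h = K(./h)/h."""
    n = len(y)
    return sum(kernel(((i + 1) / n - v) / h) / h * y[i] for i in range(n)) / n


def simulate(n, g, phi=0.5, seed=0):
    """Observations Y_i = g(i/n) + X_i with AR(1) noise X_i (hypothetical model)."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for i in range(1, n + 1):
        x = phi * x + rng.gauss(0.0, 1.0)
        ys.append(g(i / n) + x)
    return ys


n, h = 2000, 0.1
g = lambda u: math.sin(2.0 * math.pi * u)
y = simulate(n, g)
signal = [g((i + 1) / n) for i in range(n)]  # noise-free version, used for centering

# Sup-deviation of the estimator from its mean over a coarse grid of interior
# points, next to the rate sqrt(log(n) / (n h)) from the example.
grid = [0.1 + 0.8 * k / 50 for k in range(51)]
sup_dev = max(abs(g_hat(y, v, h) - g_hat(signal, v, h)) for v in grid)
print(sup_dev, math.sqrt(math.log(n) / (n * h)))
```

Because the estimator is linear in the observations, applying `g_hat` to the noise-free signal gives $\mathbb{E}\hat{g}_{n,h}(v)$ exactly, so `sup_dev` is the centered quantity that the example bounds (here only over a coarse grid rather than all of $[0,1]$).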

For the following two examples we suppose that the underlying process $X_{i}$ is locally stationary in the sense of Assumption 3.1. Similar assumptions are posed in [8] and are fulfilled for a large variety of locally stationary processes.

In the same spirit as in Example 5.1, it is possible to derive uniform rates of convergence for M-estimators of parameters $\theta$ in models of locally stationary processes. Furthermore, weak Bahadur representations can be obtained. The following results apply for instance to maximum likelihood estimation of parameters in tvARMA or tvGARCH processes. The main tool is to prove uniform convergence of the corresponding objective functions and their derivatives. Since the rest of the proof is standard, the details are postponed to the Supplementary Material, Section 7.8. Let $\nabla_{\theta}^{j}$ denote the $j$-th derivative with respect to $\theta$. To apply empirical process theory, we ask for the objective functions to be $(L_{\mathcal{F}},1,R,C)$-classes in (A1) and Lipschitz with respect to $\theta$ in (A2).

Lemma 5.2 (M-estimation, uniform results).

Let $\Theta\subset\mathbb{R}^{d_{\Theta}}$ be compact and $\theta_{0}:[0,1]\to\text{interior}(\Theta)$. For each $\theta\in\Theta$, let $\ell_{\theta}:\mathbb{R}^{k}\to\mathbb{R}$ be some measurable function which is twice continuously differentiable. Let $Z_{i}=(X_{i},...,X_{i-k+1})$, and define for $v\in[0,1]$,

\[ \hat{\theta}_{n,h}(v):=\arg\min_{\theta\in\Theta}L_{n,h}(v,\theta),\quad\quad L_{n,h}(v,\theta):=\frac{1}{n}\sum_{i=k}^{n}K_{h}\big(\tfrac{i}{n}-v\big)\cdot\ell_{\theta}(Z_{i}). \]

Suppose that there exists $C_{\Theta}>0$ such that for $j\in\{0,1,2\}$,

  • (A1) $\bar{\mathcal{F}}_{j}=\{\nabla_{\theta}^{j}\ell_{\theta}:\theta\in\Theta\}$ is an $(L_{\mathcal{F}},1,R,C)$-class with $R(z)=1+|z|_{1}^{M-1}$ for some $M\geq 1$ and Assumption 3.1 for $\bar{\mathcal{F}}_{j}$ is fulfilled with $s=1$, $p=\frac{M}{M-1}$.

  • (A2) for all $z\in\mathbb{R}^{k}$, $\theta,\theta^{\prime}\in\Theta$,

    \[ \big|\nabla_{\theta}^{j}\ell_{\theta}(z)-\nabla_{\theta}^{j}\ell_{\theta^{\prime}}(z)\big|_{\infty}\leq C_{\Theta}(1+|z|_{1}^{M})\cdot|\theta-\theta^{\prime}|_{2}, \]

  • (A3) $\theta\mapsto\mathbb{E}\ell_{\theta}(\tilde{Z}_{0}(v))$ attains its global minimum in $\theta_{0}(v)$ with positive definite $I(v):=\mathbb{E}\nabla_{\theta}^{2}\ell_{\theta}(\tilde{Z}_{0}(v))$.

Furthermore, suppose that either

  • $\delta_{2M}^{X}(j)\leq\kappa j^{-\alpha}$ with some $\kappa>0,\alpha>1$, and $h\gtrsim(\frac{\log(n)}{n^{1-\frac{2}{\nu}}})^{\frac{\alpha-1}{\alpha}}$, or

  • $\delta_{2M}^{X}(j)\leq\kappa\rho^{j}$ with some $\kappa>0,\rho\in(0,1)$ and $h\gtrsim\frac{\log(n)^{3}}{n^{1-\frac{2}{\nu}}}$.

Define $\tau_{n}:=\sqrt{\frac{\log(n)}{nh}}$ and $B_{h}:=\sup_{v\in[0,1]}|\mathbb{E}\nabla_{\theta}L_{n,h}(v,\theta_{0}(v))|$ (the bias). Then, $B_{h}=O(h^{\varsigma})$, and as $nh\to\infty$,

\[ \sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\big|\hat{\theta}_{n,h}(v)-\theta_{0}(v)\big|=O_{p}\big(\tau_{n}+B_{h}\big) \]

and

\[ \sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\big|\{\hat{\theta}_{n,h}(v)-\theta_{0}(v)\}-I(v)^{-1}\nabla_{\theta}L_{n,h}(v,\theta_{0}(v))\big|=O_{p}((\tau_{n}+h^{\varsigma})(\tau_{n}+B_{h})). \]
Remark 5.3.
  • In the tvAR(1) case $X_{i}=a(i/n)X_{i-1}+\varepsilon_{i}$, we can use for instance

    \[ \ell_{\theta}(x_{1},x_{0})=(x_{1}-ax_{0})^{2}, \]

    which for $a\in(-1,1)$ is a $((1,a),1,|x_{0}|+|x_{1}|,(0,1))$-class.

  • With more smoothness assumptions on $\nabla_{\theta}\ell$ or using a local linear estimation method for $\hat{\theta}_{n,h}$, the bias term $B_{h}$ can be shown to be of smaller order, for instance $O(h^{2})$ (cf. [8]).

  • The theory derived in this paper can also be used to prove asymptotic properties of M-estimators based on objective functions $\ell_{\theta}$ which are only almost everywhere differentiable in the Lebesgue sense by following the theory of Chapter 5 in [43]. This is of utmost interest for $\ell_{\theta}$ that have additional analytic properties, such as convexity. Since these properties are also needed in the proofs, we will not discuss this in detail.
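The tvAR(1) item of the remark can be made concrete with a small sketch of the localized least squares estimator. This is an illustration under stated assumptions, not the authors' implementation: the names `kernel`, `local_ar1` and `simulate_tvar1`, the triangular kernel, and the coefficient curve $a(u)=0.2+0.6u$ are hypothetical. For the quadratic objective $\ell_{\theta}(x_{1},x_{0})=(x_{1}-ax_{0})^{2}$ the unconstrained minimizer of $L_{n,h}(v,\cdot)$ has a closed form, which is what the code uses; the restriction to a compact $\Theta$ is ignored here.

```python
import random


def kernel(u):
    """Triangular kernel on [-1/2, 1/2]; integrates to 1 (illustrative choice)."""
    return 2.0 - 4.0 * abs(u) if abs(u) <= 0.5 else 0.0


def local_ar1(x, v, h):
    """Unconstrained minimizer of sum_i K((i/n - v)/h) * (X_i - a * X_{i-1})^2.

    x[0] is the initial value X_0 and x[i] = X_i for i = 1..n.
    The common factor 1/(n h) cancels in the ratio and is omitted.
    """
    n = len(x) - 1
    num = den = 0.0
    for i in range(1, n + 1):
        w = kernel((i / n - v) / h)
        num += w * x[i] * x[i - 1]
        den += w * x[i - 1] ** 2
    return num / den


def simulate_tvar1(n, a, seed=0):
    """Simulate X_i = a(i/n) * X_{i-1} + eps_i with standard normal eps_i."""
    rng = random.Random(seed)
    x = [0.0]
    for i in range(1, n + 1):
        x.append(a(i / n) * x[-1] + rng.gauss(0.0, 1.0))
    return x


a = lambda u: 0.2 + 0.6 * u  # hypothetical coefficient curve
x = simulate_tvar1(5000, a)
for v in (0.25, 0.5, 0.75):
    print(v, a(v), local_ar1(x, v, h=0.2))
```

The printed estimates can be compared with the true values $a(v)$; the discrepancy combines the stochastic term $\tau_{n}$ and the bias $B_{h}$ from Lemma 5.2.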

We give an easy application of the functional central limit theorem from Theorem 2.3 by inspecting a locally stationary version of Example 19.25 in [43].

Example 5.4 (Local mean absolute deviation).

For fixed $v\in(0,1)$, put $\overline{X_{n}}(v):=\frac{1}{n}\sum_{i=1}^{n}K_{h}\big(\frac{i}{n}-v\big)X_{i}$ and define the mean absolute deviation

\[ \text{mad}_{n}(v):=\frac{1}{n}\sum_{i=1}^{n}K_{h}\big(\tfrac{i}{n}-v\big)|X_{i}-\overline{X_{n}}(v)|. \]

Let Assumption 3.1 hold with $s=1$, $p=\infty$. Suppose that $\mathbb{P}(\tilde{X}_{0}(v)=\mathbb{E}\tilde{X}_{0}(v))=0$ and that for some $\kappa>0,\alpha>1$, $\delta_{2}^{X}(j)\leq\kappa j^{-\alpha}$. We show that if $nh\to\infty$ and $nh^{1+2\varsigma}\to 0$,

\[ \sqrt{nh}\big(\text{mad}_{n}(v)-\mathbb{E}|\tilde{X}_{0}(v)-\mu|\big)\overset{d}{\to}N(0,\sigma^{2}), \tag{5.3} \]

where $\mu=\mathbb{E}\tilde{X}_{0}(v)$, $G$ denotes the distribution function of $\tilde{X}_{0}(v)$ and

\[ \sigma^{2}=\int K(u)^{2}du\cdot\sum_{j\in\mathbb{Z}}\mathrm{Cov}\big(|\tilde{X}_{0}(v)-\mu|+(2G(\mu)-1)\tilde{X}_{0}(v),\ |\tilde{X}_{j}(v)-\mu|+(2G(\mu)-1)\tilde{X}_{j}(v)\big). \]

The result is obtained by using the decomposition

\[ \begin{aligned} \sqrt{nh}\big(\text{mad}_{n}(v)-\mathbb{E}|\tilde{X}_{0}(v)-\mu|\big)&=\mathbb{G}_{n}(f_{\overline{X_{n}}(v)}-f_{\mu})+\mathbb{G}_{n}(f_{\mu})+A_{n}, \\ A_{n}&=\frac{\sqrt{nh}}{n}\sum_{i=1}^{n}K_{h}\big(\tfrac{i}{n}-v\big)\big\{\mathbb{E}|X_{i}-\theta|-\mathbb{E}|\tilde{X}_{0}(v)-\mu|\big\}\Big|_{\theta=\overline{X_{n}}(v)}, \end{aligned} \]

where $\Theta=\{\theta\in\mathbb{R}:|\theta-\mu|\leq 1\}$ and

\[ \mathcal{F}=\{f_{\theta}(x,u)=\sqrt{h}K_{h}(u-v)|x-\theta|:\theta\in\Theta\}. \]

By the triangle inequality, $\mathcal{F}$ satisfies Assumption 2.2 with $\bar{f}_{\theta}(x,u)=|x-\theta|$, $R(\cdot)=C_{R}=1$, $p=\infty$, $s=1$ and $\Delta(k)=2\delta_{2}^{X}(k)$. Assumption 3.2 is trivially fulfilled since $\bar{f}$ does not depend on $u$. Since $\mathcal{F}$ is a one-dimensional Lipschitz class, $\sup_{n\in\mathbb{N}}\mathbb{H}(\varepsilon,\mathcal{F},\|\cdot\|_{2,n})=O(\log(\varepsilon^{-1}\vee 1))$. By Theorem 2.3, we obtain that there exists some process $[\mathbb{G}(f_{\theta})]_{\theta\in\Theta}$ such that for $h\to 0$, $nh\to\infty$,

\[ \big[\mathbb{G}_{n}(f_{\theta})\big]_{\theta\in\Theta}\overset{d}{\to}\big[\mathbb{G}(f_{\theta})\big]_{\theta\in\Theta}\quad\quad\text{ in }\ell^{\infty}(\Theta). \tag{5.4} \]

Furthermore, by Assumption 3.1,

\[ \begin{aligned} \|f_{\overline{X_{n}}(v)}(X_{i})-f_{\mu}(X_{i})\|_{2}&\leq\|\overline{X_{n}}(v)-\mu\|_{2}\leq\|\overline{X_{n}}(v)-\mathbb{E}\overline{X_{n}}(v)\|_{2}+\|\mathbb{E}\overline{X_{n}}(v)-\mu\|_{2} \\ &\leq\frac{1}{\sqrt{nh}}\Big(\frac{1}{nh}\sum_{i=1}^{n}K\big(\frac{\frac{i}{n}-v}{h}\big)^{2}\Big)^{1/2}\sum_{j=0}^{\infty}\delta_{2}^{X}(j)+\frac{1}{n}\sum_{i=1}^{n}K_{h}(\tfrac{i}{n}-v)\big|\mathbb{E}X_{i}-\mathbb{E}\tilde{X}_{0}(v)\big| \\ &=O((nh)^{-1/2}+h^{\varsigma}). \end{aligned} \tag{5.5} \]

By Lemma 19.24 in [43], we conclude from (5.4) and (5.5) that

\[ \mathbb{G}_{n}(f_{\overline{X_{n}}(v)}-f_{\mu})\overset{p}{\to}0. \tag{5.6} \]

By Assumption 3.1 and bounded variation of $K$,

\[ A_{n}=\sqrt{nh}\big\{\mathbb{E}|\tilde{X}_{0}(v)-\theta|\big|_{\theta=\overline{X_{n}}(v)}-\mathbb{E}|\tilde{X}_{0}(v)-\mu|\big\}+O_{p}((nh)^{-1/2}+(nh)^{1/2}h^{\varsigma}). \tag{5.7} \]

Due to $\mathbb{P}(\tilde{X}_{0}(v)=\mu)=0$, $g(\theta)=\mathbb{E}|\tilde{X}_{0}(v)-\theta|$ is differentiable at $\theta=\mu$ with derivative $2G(\mu)-1$. The Delta method delivers

\[ \sqrt{nh}\big\{\mathbb{E}|\tilde{X}_{0}(v)-\theta|\big|_{\theta=\overline{X_{n}}(v)}-\mathbb{E}|\tilde{X}_{0}(v)-\mu|\big\}=(2G(\mu)-1)\sqrt{nh}(\overline{X_{n}}(v)-\mu)+o_{p}(1). \tag{5.8} \]

From (5.6), (5.7) and (5.8) we obtain

\[ \sqrt{nh}\big(\text{mad}_{n}(v)-\mathbb{E}|\tilde{X}_{0}(v)-\mu|\big)=\mathbb{G}_{n}(f_{\mu}+(2G(\mu)-1)\text{id})+o_{p}(1). \]

Theorem 3.4 now yields (5.3).
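The statistic of Example 5.4 is straightforward to compute. The sketch below is illustrative only: the function names `kernel` and `local_mad` and the triangular kernel are assumptions, and no claim is made about finite-sample accuracy.

```python
def kernel(u):
    """Triangular kernel on [-1/2, 1/2]; integrates to 1 (illustrative choice)."""
    return 2.0 - 4.0 * abs(u) if abs(u) <= 0.5 else 0.0


def local_mad(x, v, h):
    """Localized mean absolute deviation around the kernel-weighted mean.

    Weights are K_h(i/n - v) = K((i/n - v)/h) / h for i = 1..n, matching the
    definitions of the localized mean and mad_n(v) in the example.
    """
    n = len(x)
    w = [kernel(((i + 1) / n - v) / h) / h for i in range(n)]
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / n           # localized mean
    return sum(wi * abs(xi - xbar) for wi, xi in zip(w, x)) / n


# Usage on a toy sequence: values alternate between -1 and +1.
x = [(-1.0) ** i for i in range(1000)]
print(local_mad(x, v=0.5, h=0.1))
```

Note that the weights only sum to approximately $n$ in general, so `xbar` is a kernel-weighted sum rather than a normalized average; this mirrors the definition used in the example.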

6 Conclusion

In this paper, we have developed a new empirical process theory for locally stationary processes with the functional dependence measure. We have proven a functional central limit theorem and maximal inequalities. A general empirical process theory for locally stationary processes is a key step to derive asymptotic and nonasymptotic results for M-estimates or testing based on $L^{2}$- or $L^{\infty}$-statistics. We have given an example in nonparametric estimation where our theory is applicable. Due to the possibility to analyze the size of the function class and the stochastic properties of the underlying process separately, we conjecture that our theory also permits an extension of various results from i.i.d. to dependent data, such as empirical risk minimization.

From a technical point of view, the linear and moment-based nature of the functional dependence measure has forced us to modify several approaches from empirical process theory for i.i.d. or mixing variables. A main issue was given by the fact that the dependence measure only transfers decay rates for continuous functions. We therefore have provided a new chaining technique which preserves continuity of the arguments of the empirical process.

In principle, a similar empirical process theory for locally stationary processes can be established under mixing conditions such as absolute regularity. This would be a generalization of the results found in [38] and [10]. As we have seen in Section 2.2, such a theory would pose additional moment conditions on $f(Z_{i},\frac{i}{n})$. Contrary to that, our framework only requires second moments of $f(Z_{i},\frac{i}{n})$, but the entropy integral is enlarged by some factor which increases with stronger dependence. Moreover, in nearly all models the derivation of a bound for mixing coefficients needs continuity of the innovation process which may not be suitable in several examples. Therefore, we consider our theory as a valuable addition to this existing theory even in the stationary case.

One could also think of an extension of our empirical process theory to functions $f(z,u)$ which are noncontinuous with respect to $z$. This can in principle be done by using a martingale decomposition and assuming continuity of $z\mapsto\mathbb{E}[f(Z_{i},\frac{i}{n})|\mathcal{G}_{i-1}=z]$ instead. However, one typically can only expect continuity of this functional if either $z\mapsto f(z,u)$ was already continuous or $\varepsilon_{i}$ has a continuous density. In the latter case, the sequence might also be $\beta$-mixing, and a more detailed discussion about advantages of our formulation is necessary.

Acknowledgements

The authors would like to thank the associate editor and two anonymous referees for their remarks, which helped to provide a much more concise version of the paper.

7 Appendix

In the appendix, we present the basic ideas used to prove the maximal inequalities of Section 4. We first consider the finite version in Section 7.1 and then present a chaining approach which preserves continuity in Section 7.2.

7.1 Proof idea: A maximal inequality for finite $\mathcal{F}$, Theorem 4.1

We provide an approach to obtain maximal inequalities for sums of random variables $W_{i}(f)$, $i=1,...,n$, indexed by $f\in\mathcal{F}$, by using a decomposition into independent random variables. An approach with similar intentions is presented in [10] (Section 4.3 therein) for absolutely regular sequences and in [29] for Harris-recurrent Markov chains. For convenience, we abbreviate

\[
W_{i}(f)=f(Z_{i},\frac{i}{n})
\]

and put $S_{n}(f):=\sum_{i=1}^{n}W_{i}(f)$.

To approximate $W_{i}(f)$ by independent variables, we use a technique from [49] which was refined in [51]. This decomposition is much more involved than the ones for Harris-recurrent Markov chains or mixing sequences since no direct coupling method is available. Define

\[
W_{i,j}(f):=\mathbb{E}[W_{i}(f)|\varepsilon_{i-j},\varepsilon_{i-j+1},...,\varepsilon_{i}],\quad\quad j\in\mathbb{N},
\]

and

\[
S_{n}(f):=\sum_{i=1}^{n}\{W_{i}(f)-\mathbb{E}W_{i}(f)\},\quad\quad S_{n,j}(f):=\sum_{i=1}^{n}\{W_{i,j}(f)-\mathbb{E}W_{i,j}(f)\}.
\]

Let $q\in\{1,...,n\}$ be arbitrary. Put $L:=\lfloor\frac{\log(q)}{\log(2)}\rfloor$ and $\tau_{l}:=2^{l}$ ($l=0,...,L-1$), $\tau_{L}:=q$. Then we have

\[
W_{i}(f)=W_{i}(f)-W_{i,q}(f)+\sum_{l=1}^{L}(W_{i,\tau_{l}}(f)-W_{i,\tau_{l-1}}(f))+W_{i,1}(f)
\]

(in the case $q=1$, the sum in the middle does not appear) and thus

\[
S_{n}(f)=\big[S_{n}(f)-S_{n,q}(f)\big]+\sum_{l=1}^{L}\big[S_{n,\tau_{l}}(f)-S_{n,\tau_{l-1}}(f)\big]+S_{n,1}(f).
\]
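The dyadic levels and the telescoping identity can be checked mechanically. The following is a minimal sketch, not part of the original proof: the function names `dyadic_levels` and `telescoped` are hypothetical, and the values standing in for $W_{i}(f)$ and $W_{i,j}(f)$ are arbitrary integers chosen only to exercise the algebra.

```python
def dyadic_levels(q):
    """Return [tau_0, ..., tau_L] with tau_l = 2**l for l < L and tau_L = q,
    where L = floor(log(q) / log(2))."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    L = q.bit_length() - 1  # exact integer version of floor(log2(q))
    return [2 ** l for l in range(L)] + [q]


def telescoped(w_full, w_trunc, taus):
    """Rebuild W_i(f) from its levels.

    w_full     stands in for W_i(f),
    w_trunc[j] stands in for W_{i,j}(f),
    taus       is the list returned by dyadic_levels(q).
    """
    q = taus[-1]
    middle = sum(w_trunc[taus[l]] - w_trunc[taus[l - 1]]
                 for l in range(1, len(taus)))
    return (w_full - w_trunc[q]) + middle + w_trunc[1]
```

For example, `dyadic_levels(5)` gives the levels 1, 2, 5, and for $q=1$ the middle sum is empty, in line with the remark above. Integer arithmetic is used so that the identity can be compared exactly.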

We write

\[
S_{n,\tau_{l}}(f)-S_{n,\tau_{l-1}}(f)=\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}T_{i,l}(f),\quad\quad T_{i,l}(f):=\sum_{k=(i-1)\tau_{l}+1}^{(i\tau_{l})\wedge n}\big[W_{k,\tau_{l}}(f)-W_{k,\tau_{l-1}}(f)\big].
\]

The random variables $T_{i,l}(f),T_{i^{\prime},l}(f)$ are independent if $|i-i^{\prime}|>1$. This leads to the decomposition

\[
\begin{aligned}
\max_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big| \leq{}& \max_{f\in\mathcal{F}}\frac{1}{\sqrt{n}}\big|S_{n}(f)-S_{n,q}(f)\big|\\
&+\sum_{l=1}^{L}\Big[\max_{f\in\mathcal{F}}\Big|\frac{1}{\sqrt{\frac{n}{\tau_{l}}}}\underset{i\text{ even}}{\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big|+\max_{f\in\mathcal{F}}\Big|\frac{1}{\sqrt{\frac{n}{\tau_{l}}}}\underset{i\text{ odd}}{\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big|\Big]\\
&+\max_{f\in\mathcal{F}}\frac{1}{\sqrt{n}}\big|S_{n,1}(f)\big|.
\end{aligned}
\tag{7.1}
\]
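The independence of non-adjacent blocks, which justifies the split into even and odd indices, can be made concrete by listing the innovations each block may depend on. This is a minimal sketch under the single assumption that $W_{k,\tau}(f)$ is a function of $\varepsilon_{k-\tau},...,\varepsilon_{k}$ only, as in the definition of $W_{i,j}(f)$; the helper names are hypothetical.

```python
def innovation_range(i, tau, n):
    """Indices of the innovations that the block T_{i,l} can depend on.

    tau plays the role of tau_l. The block sums over
    k = (i-1)*tau + 1, ..., min(i*tau, n), and each summand is a function of
    eps_{k-tau}, ..., eps_k (the coarser level tau_{l-1} uses a subset).
    """
    first_k = (i - 1) * tau + 1
    last_k = min(i * tau, n)
    return range(first_k - tau, last_k + 1)


def blocks_share_innovations(i, i_prime, tau, n):
    """True if the two index ranges overlap, i.e. independence is not guaranteed."""
    a = innovation_range(i, tau, n)
    b = innovation_range(i_prime, tau, n)
    return max(a.start, b.start) < min(a.stop, b.stop)
```

With $n=20$ and $\tau_{l}=4$, blocks 1 and 2 share the innovations $\varepsilon_{1},...,\varepsilon_{4}$, while blocks 1 and 3 use disjoint sets of i.i.d. innovations.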

While the first term in (7.1) can be made small by assumptions on the dependence of $W_{i}(f)$ and by the use of a large deviation inequality for martingales in Banach spaces from [36], the second and third terms allow the application of Rosenthal-type bounds due to the independence of the summands $T_{i,l}(f)$ and $W_{i,1}(f)$, respectively. Since the first term in (7.1) allows for a stronger bound in terms of $n$ than is the case for mixing, we obtain a theory which only needs second moments of $W_{i}(f)=f(Z_{i},\frac{i}{n})$. By Assumption 2.2, we can show the following results (cf. Lemma 7.3 in the Supplementary Material and recall (4.1) for the definition of $D_{n}^{\infty}$). For each $i=1,...,n$, $j\in\mathbb{N}$, $s\in\mathbb{N}\cup\{\infty\}$, $f\in\mathcal{F}$,

\begin{align}
\Big\|\sup_{f\in\mathcal{F}}\big|W_{i}(f)-W_{i}(f)^{*(i-j)}\big|\,\Big\|_{2} &\leq D_{n}^{\infty}(\frac{i}{n})\Delta(j), \tag{7.2}\\
\big\|W_{i}(f)-W_{i}(f)^{*(i-j)}\big\|_{2} &\leq |D_{f,n}(\frac{i}{n})|\cdot\Delta(j), \tag{7.3}\\
\big\|W_{i}(f)\big\|_{s} &\leq \big\|f(Z_{i},\frac{i}{n})\big\|_{s}. \tag{7.4}
\end{align}

We now summarize the proof of Theorem 4.1. The detailed proof is found in the Supplementary Material. Denote

\[
A_{1}:=\max_{f\in\mathcal{F}}\frac{1}{\sqrt{n}}\big|S_{n}(f)-S_{n,q}(f)\big|,\quad\quad A_{2}:=\sum_{l=1}^{L}\max_{f\in\mathcal{F}}\Big|\frac{1}{\sqrt{\frac{n}{\tau_{l}}}}\underset{i\text{ even}}{\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big|.
\]

The remaining terms in (7.1) are discussed similarly or are special cases. We first have

\[
\mathbb{E}A_{1}\leq\sum_{j=q}^{\infty}\frac{1}{\sqrt{n}}\mathbb{E}\max_{f\in\mathcal{F}}\Big|\sum_{i=1}^{n}E_{i,j}(f)\Big|,
\]

where $E_{i,j}(f)=\mathbb{E}[W_{i}(f)|\varepsilon_{i-j},...,\varepsilon_{i}]-\mathbb{E}[W_{i}(f)|\varepsilon_{i-j+1},...,\varepsilon_{i}]$ is a martingale difference sequence with respect to $\mathcal{G}^{i}=\sigma(\varepsilon_{i-j},\varepsilon_{i-j+1},...)$. For $x=(x_{f})_{f\in\mathcal{F}}$ and $s:=2\vee\log|\mathcal{F}|$, define $|x|_{s}:=(\sum_{f\in\mathcal{F}}|x_{f}|^{s})^{1/s}$. Then, writing $E_{i,j}=(E_{i,j}(f))_{f\in\mathcal{F}}$,

\[
\mathbb{E}\max_{f\in\mathcal{F}}\Big|\sum_{i=1}^{n}E_{i,j}(f)\Big|\leq\mathbb{E}\Big|\sum_{i=1}^{n}E_{i,j}\Big|_{s}\leq\Big\|\Big|\sum_{i=1}^{n}E_{i,j}\Big|_{s}\Big\|_{2}.
\]

By Theorem 4.1 in [36] there exists an absolute constant $c_{1}>0$ such that for $s>1$,

\[
\Big\|\Big|\sum_{i=1}^{n}E_{i,j}\Big|_{s}\Big\|_{2}\leq c_{1}\sqrt{s}\Big(\sum_{i=1}^{n}\big\||E_{i,j}|_{s}\big\|_{2}^{2}\Big)^{1/2}.
\]

By using (7.2) and standard techniques for the functional dependence measure, we conclude that

\[
\big\||E_{i,j}|_{s}\big\|_{2}\leq s^{1/s}\big\|\sup_{f\in\mathcal{F}}|E_{i,j}(f)|\big\|_{2}\leq eD_{n}^{\infty}(\frac{i}{n})\Delta(j).
\]

Summarizing the results, we obtain

\[
\mathbb{E}A_{1}\leq 2ec_{1}\cdot\sqrt{H}\cdot\mathbb{D}_{n}^{\infty}\beta(q). \tag{7.5}
\]

Regarding $A_{2}$, we have

\[
\mathbb{E}A_{2}\leq\sum_{l=1}^{L}\Big[\mathbb{E}\max_{f\in\mathcal{F}}\Big|\frac{1}{\sqrt{\frac{n}{\tau_{l}}}}\underset{i\text{ even}}{\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big|\Big].
\]

Using the fact that martingale differences are uncorrelated, (7.3) and the simple bound $|W_{i}(f)|\leq M$, one can show that

\[
\|T_{i,l}\|_{2}\leq\sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\big\{\|f\|_{2,n},\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\big\},\quad\quad\frac{1}{\sqrt{\tau_{l}}}|T_{i,l}(f)|\leq 2\sqrt{\tau_{l}}M.
\]

A maximal inequality for independent random variables based on Bernstein’s inequality yields that there exists some universal constant $c_{2}>0$ such that

\begin{align*}
&\mathbb{E}\max_{f\in\mathcal{F}}\Big|\frac{1}{\sqrt{\frac{n}{\tau_{l}}}}\underset{i\text{ even}}{\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big|\\
&\leq c_{2}\cdot\Big[\max_{f}\Big(\frac{1}{\frac{n}{\tau_{l}}}\underset{i\text{ even}}{\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\Big\|\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big\|_{2}^{2}\Big)^{1/2}\sqrt{H}+\frac{\sup_{f\in\mathcal{F}}\big\|\frac{1}{\sqrt{\tau_{l}}}|T_{i,l}(f)|\big\|_{\infty}H}{\sqrt{\frac{n}{\tau_{l}}}}\Big]\\
&\leq 4c_{2}\Big[\Big(\sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\{\max_{f\in\mathcal{F}}\|f\|_{2,n},\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\}\Big)\cdot\sqrt{H}+\frac{\tau_{l}MH}{\sqrt{n}}\Big].
\end{align*}

By monotonicity of the first term with respect to $\|f\|_{2,n}$ and

\[
\sum_{l=1}^{L}\sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\{\|f\|_{2,n},\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\}\leq 2V_{n}(f),\quad\quad\sum_{l=1}^{L}\tau_{l}\leq 2q,
\]

we obtain with some universal constant $c_{3}>0$,

\[
\mathbb{E}A_{2}\leq c_{3}\Big(\sup_{f\in\mathcal{F}}V_{n}(f)\sqrt{H}+\frac{qMH}{\sqrt{n}}\Big)\leq c_{3}\Big(\sigma\sqrt{H}+\frac{qMH}{\sqrt{n}}\Big),
\]

which together with (7.5) provides the result of Theorem 4.1.

7.2 An elementary chaining approach which preserves continuity

In this section, we provide a chaining approach which preserves continuity of the functions inside the empirical process. Typical chaining approaches work with indicator functions, which are not suitable for an application of Theorem 4.1. We replace the indicator functions by suitably chosen truncations. For $m>0$, define $\varphi_{m}^{\wedge}:\mathbb{R}\to\mathbb{R}$ and the corresponding “peaky” residual $\varphi_{m}^{\vee}:\mathbb{R}\to\mathbb{R}$ via

\[
\varphi_{m}^{\wedge}(x):=(x\vee(-m))\wedge m,\quad\quad\varphi_{m}^{\vee}(x):=x-\varphi_{m}^{\wedge}(x).
\]
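The two maps are easy to state in code. The following is a small sketch with hypothetical function names `phi_wedge` and `phi_vee`; it only restates the definition above for real numbers.

```python
def phi_wedge(x, m):
    """Truncation of x to the interval [-m, m], i.e. (x v (-m)) ^ m."""
    return min(max(x, -m), m)


def phi_vee(x, m):
    """'Peaky' residual x - phi_wedge(x, m): the part of x beyond level m."""
    return x - phi_wedge(x, m)
```

For $m=2$, the value $x=3$ is split into the truncated part 2 and the residual 1, while $x=1$ is left unchanged with residual 0. Unlike an indicator, both maps are continuous (in fact 1-Lipschitz) in $x$, which is the property the chaining argument relies on.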

In the following, assume that for each $j\in\mathbb{N}_{0}$ there exists a decomposition $\mathcal{F}=\bigcup_{k=1}^{N_{j}}\mathcal{F}_{jk}$, where $(\mathcal{F}_{jk})_{k=1,...,N_{j}}$, $j\in\mathbb{N}_{0}$, is a sequence of nested partitions. For each $j\in\mathbb{N}_{0}$ and $k\in\{1,...,N_{j}\}$, choose a fixed element $f_{jk}\in\mathcal{F}_{jk}$. For $j\in\mathbb{N}_{0}$, define $\pi_{j}f:=f_{jk}$ if $f\in\mathcal{F}_{jk}$.

Assume furthermore that there exists a sequence $(\Delta_{j}f)_{j\in\mathbb{N}_{0}}$ such that for all $j\in\mathbb{N}_{0}$, $\sup_{f,g\in\mathcal{F}_{jk}}|f-g|\leq\Delta_{j}f$. Finally, let $(m_{j})_{j\in\mathbb{N}_{0}}$ be a decreasing sequence which will serve as a truncation sequence.

For $j\in\mathbb{N}_{0}$, we use the decomposition

\[
f-\pi_{j}f=\varphi_{m_{j}}^{\wedge}(f-\pi_{j}f)+\varphi_{m_{j}}^{\vee}(f-\pi_{j}f).
\]

Since

\[
\begin{aligned}
f-\pi_{j}f &= f-\pi_{j+1}f+\pi_{j+1}f-\pi_{j}f\\
&= \varphi_{m_{j+1}}^{\wedge}(f-\pi_{j+1}f)+\varphi_{m_{j+1}}^{\vee}(f-\pi_{j+1}f)\\
&\quad\quad+\varphi_{m_{j}-m_{j+1}}^{\wedge}(\pi_{j+1}f-\pi_{j}f)+\varphi_{m_{j}-m_{j+1}}^{\vee}(\pi_{j+1}f-\pi_{j}f),
\end{aligned}
\tag{7.6}
\]

we can write

\[
\varphi_{m_{j}}^{\wedge}(f-\pi_{j}f)=\varphi_{m_{j+1}}^{\wedge}(f-\pi_{j+1}f)+\varphi_{m_{j}-m_{j+1}}^{\wedge}(\pi_{j+1}f-\pi_{j}f)+R(j), \tag{7.7}
\]

where

\[
R(j):=\varphi_{m_{j}}^{\wedge}(f-\pi_{j}f)-\varphi_{m_{j}}^{\wedge}(\varphi_{m_{j+1}}^{\wedge}(f-\pi_{j+1}f))-\varphi_{m_{j}}^{\wedge}(\varphi_{m_{j}-m_{j+1}}^{\wedge}(\pi_{j+1}f-\pi_{j}f)).
\]

To bound R(j)R(j), we use (i) of the following elementary Lemma 7.1 which is proved in Section 7.6 included in the Supplementary Material.

Lemma 7.1.

Let $y,x,x_{1},x_{2},x_{3}$ and $m,m^{\prime}>0$ be real numbers. Then the following assertions hold:

  (i) If $|x_{1}|+|x_{2}|\leq m$, then

    \[
    \big|\varphi_{m}^{\wedge}(x_{1}+x_{2}+x_{3})-\varphi_{m}^{\wedge}(x_{1})-\varphi_{m}^{\wedge}(x_{2})\big|\leq\min\{|x_{3}|,2m\}.
    \]

  (ii) $|\varphi_{m}^{\wedge}(x)|\leq\min\{|x|,m\}$ and, if $|x|<y$,

    \[
    |\varphi_{m}^{\vee}(x)|\leq\varphi_{m}^{\vee}(y)\leq y\mathbbm{1}_{\{y>m\}}.
    \]

  (iii) If $\mathcal{F}$ fulfills Assumption 4.2, then Assumption 4.2 also holds for $\{\varphi_{m}^{\wedge}(f):f\in\mathcal{F}\}$ and $\{\varphi_{m}^{\vee}(f):f\in\mathcal{F}\}$.
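Assertions (i) and (ii) of Lemma 7.1 are purely about real numbers and can be sanity-checked on random inputs. The sketch below is not a proof; the helper names and the sampling scheme are assumptions made for illustration, and exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
import random
from fractions import Fraction


def phi_wedge(x, m):
    """Truncation of x to [-m, m]."""
    return min(max(x, -m), m)


def phi_vee(x, m):
    """Residual x - phi_wedge(x, m)."""
    return x - phi_wedge(x, m)


def check_lemma_i(x1, x2, x3, m):
    """Inequality of Lemma 7.1(i); the caller must ensure |x1| + |x2| <= m."""
    lhs = abs(phi_wedge(x1 + x2 + x3, m) - phi_wedge(x1, m) - phi_wedge(x2, m))
    return lhs <= min(abs(x3), 2 * m)


def check_lemma_ii(x, y, m):
    """Inequalities of Lemma 7.1(ii); the caller must ensure |x| < y."""
    first = abs(phi_wedge(x, m)) <= min(abs(x), m)
    indicator_bound = y if y > m else 0
    second = abs(phi_vee(x, m)) <= phi_vee(y, m) <= indicator_bound
    return first and second


def run_checks(trials, seed):
    """Evaluate both assertions on randomly drawn rational inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        m = Fraction(rng.randint(1, 20), rng.randint(1, 5))
        # |x1| <= m/2 and |x2| <= m/2, hence |x1| + |x2| <= m
        x1 = m * Fraction(rng.randint(-10, 10), 20)
        x2 = m * Fraction(rng.randint(-10, 10), 20)
        x3 = Fraction(rng.randint(-100, 100), 7)
        x = Fraction(rng.randint(-100, 100), 7)
        y = abs(x) + Fraction(rng.randint(1, 50), 3)  # strictly larger than |x|
        if not (check_lemma_i(x1, x2, x3, m) and check_lemma_ii(x, y, m)):
            return False
    return True
```

Calling `run_checks(2000, 0)` evaluates both assertions on 2000 random draws and returns whether all of them held.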

Because the partitions are nested, we have $|\pi_{j+1}f-\pi_{j}f|\leq\Delta_{j}f$. Since $|\varphi_{m_{j+1}}^{\wedge}(\cdot)|\leq m_{j+1}$ and $|\varphi_{m_{j}-m_{j+1}}^{\wedge}(\cdot)|\leq m_{j}-m_{j+1}$ by Lemma 7.1(ii), the condition of Lemma 7.1(i) is satisfied with $m=m_{j}$. By Lemma 7.1 and (7.6), we have

\[
\begin{aligned}
|R(j)| &\leq \min\big\{\big|\varphi_{m_{j+1}}^{\vee}(f-\pi_{j+1}f)+\varphi_{m_{j}-m_{j+1}}^{\vee}(\pi_{j+1}f-\pi_{j}f)\big|,2m_{j}\big\}\\
&\leq \min\big\{\big|\varphi_{m_{j+1}}^{\vee}(\Delta_{j+1}f)\big|,2m_{j}\big\}+\min\big\{\big|\varphi_{m_{j}-m_{j+1}}^{\vee}(\Delta_{j}f)\big|,2m_{j}\big\}.
\end{aligned}
\tag{7.8}
\]

Let $\tau\in\mathbb{N}$. We then have with iterated application of (7.7) and linearity of $f\mapsto W_{i}(f)$,

\[
\begin{aligned}
&\mathbb{G}_{n}(\varphi_{m_{0}}^{\wedge}(f-\pi_{0}f))\\
&= \mathbb{G}_{n}(\varphi_{m_{1}}^{\wedge}(f-\pi_{1}f))+\mathbb{G}_{n}(\varphi_{m_{0}-m_{1}}^{\wedge}(\pi_{1}f-\pi_{0}f))+\mathbb{G}_{n}(R(0))\\
&= \mathbb{G}_{n}(\varphi_{m_{\tau}}^{\wedge}(f-\pi_{\tau}f))+\sum_{j=0}^{\tau-1}\mathbb{G}_{n}(\varphi_{m_{j}-m_{j+1}}^{\wedge}(\pi_{j+1}f-\pi_{j}f))+\sum_{j=0}^{\tau-1}\mathbb{G}_{n}(R(j)),
\end{aligned}
\tag{7.9}
\]

which in combination with (7.8) can now be used for chaining. The following lemma provides the necessary balancing between the truncated versions of $\mathbb{G}_{n}(f)$ and the rare events excluded. Recall that $H(k)=1\vee\log(k)$ as in (1.4).

Lemma 7.2 (Compatibility lemma).

If $\mathcal{F}$ fulfills $|\mathcal{F}|\leq k$ and Assumption 4.2, then $\sup_{f\in\mathcal{F}}V_{n}(f)\leq\delta$, $\sup_{f\in\mathcal{F}}\|f\|_{\infty}\leq m(n,\delta,k)$ imply

\[
\mathbb{E}\max_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big|\leq c(1+\frac{\mathbb{D}_{n}^{\infty}}{\mathbb{D}_{n}})\delta\sqrt{H(k)}, \tag{7.10}
\]

and $\sup_{f\in\mathcal{F}}V_{n}(f)\leq\delta$ implies that for each $\gamma>0$,

\[
\sqrt{n}\|f\mathbbm{1}_{\{f>\gamma\cdot m(n,\delta,k)\}}\|_{1,n}\leq\frac{1}{\gamma}\frac{\mathbb{D}_{n}}{\mathbb{D}_{n}^{\infty}}\delta\sqrt{H(k)}. \tag{7.11}
\]
Proof of Lemma 7.2.

For $q\in\mathbb{N}$, put $\beta_{norm}(q):=\frac{\beta(q)}{q}$. By Theorem 4.1 and the definition of $r(\cdot)$,

\begin{align*}
\mathbb{E}\max_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big| &\leq c\Big(\delta\sqrt{H(k)}+q^{*}\Big(\frac{m(n,\delta,k)\sqrt{H(k)}}{\sqrt{n}\mathbb{D}_{n}^{\infty}}\Big)\frac{m(n,\delta,k)H(k)}{\sqrt{n}}\Big)\\
&= c\Big(\delta\sqrt{H(k)}+\mathbb{D}_{n}^{\infty}q^{*}(r(\frac{\delta}{\mathbb{D}_{n}}))r(\frac{\delta}{\mathbb{D}_{n}})\sqrt{H(k)}\Big)\\
&= c(1+\frac{\mathbb{D}_{n}^{\infty}}{\mathbb{D}_{n}})\delta\sqrt{H(k)},
\end{align*}

which shows (7.10). Since

\[
\|f(Z_{i},\frac{i}{n})\mathbbm{1}_{\{f(Z_{i},\frac{i}{n})>\gamma m(n,\delta,k)\}}\|_{1}\leq\frac{1}{\gamma m(n,\delta,k)}\|f(Z_{i},\frac{i}{n})^{2}\|_{1}=\frac{1}{\gamma m(n,\delta,k)}\|f(Z_{i},\frac{i}{n})\|_{2}^{2},
\]

for all $f\in\mathcal{F}$ with $V_{n}(f)\leq\delta$, it holds that

\[
\sqrt{n}\|f\mathbbm{1}_{\{f>\gamma m(n,\delta,k)\}}\|_{1,n}\leq\frac{\sqrt{n}}{\gamma m(n,\delta,k)}\|f\|_{2,n}^{2}\leq\frac{1}{\gamma}\frac{\|f\|_{2,n}^{2}}{\mathbb{D}_{n}^{\infty}r(\frac{\delta}{\mathbb{D}_{n}})}\sqrt{H(k)}. \tag{7.12}
\]
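Both steps in (7.12) reduce to elementary facts. The following worked lines are a sketch; the explicit form of $m(n,\delta,k)$ used in the second line is an assumption read off from the first two lines of the display leading to (7.10), since the definition (4.6) is not repeated here.

```latex
% Step 1: a Markov-type bound, valid for every real x and every threshold c > 0,
% because x / c > 1 on the event {x > c}:
\[
x\,\mathbbm{1}_{\{x>c\}}\;\leq\;\frac{x^{2}}{c}\,\mathbbm{1}_{\{x>c\}}\;\leq\;\frac{x^{2}}{c}.
\]
% Step 2: assuming m(n,\delta,k)\sqrt{H(k)} = \sqrt{n}\,\mathbb{D}_{n}^{\infty}\,r(\delta/\mathbb{D}_{n}),
\[
\frac{\sqrt{n}}{\gamma\,m(n,\delta,k)}
\;=\;\frac{1}{\gamma}\,\frac{\sqrt{H(k)}}{\mathbb{D}_{n}^{\infty}\,r(\frac{\delta}{\mathbb{D}_{n}})}.
\]
```

Applying the first line with $x=f(Z_{i},\frac{i}{n})$ and $c=\gamma m(n,\delta,k)$, taking expectations and averaging over $i$ gives the first inequality in (7.12); the second line then turns the prefactor into the stated form.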

If $\|f\|_{2,n}\geq\mathbb{D}_{n}\Delta(1)$, we have

\[
V_{n}(f)=\|f\|_{2,n}+\mathbb{D}_{n}\sum_{j=1}^{\infty}\Delta(j)\geq\|f\|_{2,n}+\mathbb{D}_{n}\beta(1). \tag{7.13}
\]

In the case $\|f\|_{2,n}<\mathbb{D}_{n}\Delta(1)$, the fact that $\Delta(\cdot)$ is decreasing implies that $a^{*}=\max\{j\in\mathbb{N}:\|f\|_{2,n}<\mathbb{D}_{n}\Delta(j)\}$ is well-defined. We conclude that

\[
\begin{aligned}
V_{n}(f) &= \|f\|_{2,n}+\sum_{j=1}^{\infty}\|f\|_{2,n}\wedge(\mathbb{D}_{n}\Delta(j))=\|f\|_{2,n}+\sum_{j=1}^{a^{*}}\|f\|_{2,n}+\mathbb{D}_{n}\sum_{j=a^{*}+1}^{\infty}\Delta(j)\\
&= \|f\|_{2,n}(a^{*}+1)+\mathbb{D}_{n}\beta(a^{*})\geq\|f\|_{2,n}a^{*}+\mathbb{D}_{n}\beta(a^{*}).
\end{aligned}
\tag{7.14}
\]

Summarizing the results (7.13) and (7.14), we have

\[
V_{n}(f)\geq\|f\|_{2,n}(a^{*}\vee 1)+\mathbb{D}_{n}\beta(a^{*}\vee 1).
\]

We conclude that

\[
V_{n}(f)\geq\min_{a\in\mathbb{N}}\big[\|f\|_{2,n}a+\mathbb{D}_{n}\beta(a)\big]\geq\|f\|_{2,n}\hat{a}+\mathbb{D}_{n}\beta(\hat{a}),
\]

where $\hat{a}=\arg\min_{j\in\mathbb{N}}\big\{\|f\|_{2,n}\cdot j+\mathbb{D}_{n}\beta(j)\big\}$.

Since $\delta\geq V_{n}(f)$, we have $\delta\geq\mathbb{D}_{n}\beta(\hat{a})=\mathbb{D}_{n}\beta_{norm}(\hat{a})\hat{a}$. Thus $\beta_{norm}(\hat{a})\leq\frac{\delta}{\mathbb{D}_{n}\hat{a}}$. By definition of $q^{*}$, $q^{*}(\frac{\delta}{\mathbb{D}_{n}\hat{a}})\leq\hat{a}$. Thus $q^{*}(\frac{\delta}{\mathbb{D}_{n}\hat{a}})\frac{\delta}{\mathbb{D}_{n}\hat{a}}\leq\frac{\delta}{\mathbb{D}_{n}}$. By definition of $r(\cdot)$, $r(\frac{\delta}{\mathbb{D}_{n}})\geq\frac{\delta}{\mathbb{D}_{n}\hat{a}}$. We conclude with $\|f\|_{2,n}\leq V_{n}(f)\leq\delta$ that

\[
\frac{\|f\|_{2,n}^{2}}{\mathbb{D}_{n}^{\infty}r(\frac{\delta}{\mathbb{D}_{n}})}\leq\frac{\mathbb{D}_{n}\hat{a}\|f\|_{2,n}^{2}}{\mathbb{D}_{n}^{\infty}\delta}\leq\frac{\mathbb{D}_{n}V_{n}(f)\|f\|_{2,n}}{\mathbb{D}_{n}^{\infty}\delta}\leq\frac{\mathbb{D}_{n}}{\mathbb{D}_{n}^{\infty}}\|f\|_{2,n}\leq\frac{\mathbb{D}_{n}}{\mathbb{D}_{n}^{\infty}}\delta. \tag{7.15}
\]

Inserting the result into (7.12), we finally obtain that for all $f\in\mathcal{F}$ with $V_{n}(f)\leq\delta$ it holds that

\[
\sqrt{n}\|f\mathbbm{1}_{\{f>\gamma m(n,\delta,k)\}}\|_{1,n}\leq\frac{\sqrt{n}}{\gamma m(n,\delta,k)}\|f\|_{2,n}^{2}\leq\frac{1}{\gamma}\frac{\|f\|_{2,n}^{2}}{\mathbb{D}_{n}^{\infty}r(\frac{\delta}{\mathbb{D}_{n}})}\sqrt{H(k)}\leq\frac{1}{\gamma}\frac{\mathbb{D}_{n}}{\mathbb{D}_{n}^{\infty}}\delta\sqrt{H(k)},
\]

which shows (7.11). ∎

7.3 Proof idea: A maximal inequality for infinite $\mathcal{F}$, Theorem 4.4

The details of the proof are given in Section 7.5 in the Supplementary Material. In the following, we abbreviate $\mathbb{H}(\delta)=\mathbb{H}(\delta,\mathcal{F},V_{n})$ and $\mathbb{N}(\delta)=\mathbb{N}(\delta,\mathcal{F},V_{n})$. Choose $\delta_{0}=\sigma$ and $\delta_{j}=2^{-j}\delta_{0}$.

For each $j\in\mathbb{N}_{0}$, we choose a covering by brackets $\mathcal{F}_{jk}:=[l_{jk},u_{jk}]\cap\mathcal{F}$, $k=1,...,N_{j}:=\mathbb{N}(\delta_{j})$ such that $V_{n}(u_{jk}-l_{jk})\leq\delta_{j}$ and $\sup_{f,g\in\mathcal{F}_{jk}}|f-g|\leq u_{jk}-l_{jk}=:\Delta_{jk}$. We may assume w.l.o.g. that $l_{jk},u_{jk},\Delta_{jk}\in\mathcal{F}$ and that $(\mathcal{F}_{jk})_{k}$ are nested.

In each $\mathcal{F}_{jk}$, fix some $f_{jk}\in\mathcal{F}$, and define $\pi_{j}f:=f_{jk}$ and $\Delta_{j}f:=\Delta_{jk}$. Put

\[
I(\sigma):=\int_{0}^{\sigma}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}d\varepsilon,\quad\quad\tau:=\min\Big\{j\geq 0:\delta_{j}\leq\frac{I(\sigma)}{\sqrt{n}}\Big\}\vee 1.
\]
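The chaining depth $\tau$ is determined by $\sigma$, $n$ and the value of the entropy integral. A minimal sketch, in which the function name `chaining_depth` is hypothetical and the entropy integral $I(\sigma)$ is simply passed in as a number (computing it would require the bracketing entropy of a concrete class $\mathcal{F}$):

```python
import math


def chaining_depth(sigma, entropy_integral, n):
    """Smallest j >= 0 with 2**(-j) * sigma <= entropy_integral / sqrt(n),
    but never smaller than 1 (the final 'v 1' in the definition of tau)."""
    if sigma <= 0 or entropy_integral <= 0 or n < 1:
        raise ValueError("sigma, entropy_integral must be positive and n >= 1")
    threshold = entropy_integral / math.sqrt(n)
    j = 0
    delta_j = sigma  # delta_0 = sigma
    while delta_j > threshold:
        j += 1
        delta_j = sigma * 2.0 ** (-j)  # delta_j = 2^{-j} * delta_0
    return max(j, 1)
```

For instance, with $\sigma=1$, $I(\sigma)=1$ and $n=16$ the threshold is $1/4$, so the loop stops at $\delta_{2}=1/4$. Because $\delta_{j}$ halves at each step, $\tau$ grows only logarithmically in $\sqrt{n}\,\sigma/I(\sigma)$.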

The chaining procedure is now applied with $m_{j}:=\frac{1}{2}m(n,\delta_{j},N_{j+1})$ ($m(\cdot)$ from (4.6)). Choose $M_{n}=\frac{1}{2}m_{0}$. We then have

\[
\mathbb{E}\sup_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big|\leq\mathbb{E}\sup_{f\in\mathcal{F}(M_{n})}\big|\mathbb{G}_{n}(f)\big|+\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\mathbb{E}\big[W_{i}(F\mathbbm{1}_{\{F>M_{n}\}})\big], \tag{7.16}
\]

where $\mathcal{F}(M_{n}):=\{\varphi_{M_{n}}^{\wedge}(f):f\in\mathcal{F}\}$.

By (7.8), (7.9) and the bound $|\mathbb{G}_{n}(f)|\leq\mathbb{G}_{n}(g)+\frac{2}{\sqrt{n}}\sum_{i=1}^{n}\|W_{i}(g)\|_{1}\leq\mathbb{G}_{n}(g)+2\sqrt{n}\|g\|_{1,n}$ for $|f|\leq g$, we obtain the decomposition

\[
\begin{aligned}
\sup_{f\in\mathcal{F}(M_{n})}|\mathbb{G}_{n}(f)| \leq{}& \sup_{f\in\mathcal{F}}|\mathbb{G}_{n}(\pi_{0}f)|\\
&+\Big\{\sup_{f\in\mathcal{F}}|\mathbb{G}_{n}(\varphi_{m_{\tau}}^{\wedge}(\Delta_{\tau}f))|+2\sqrt{n}\sup_{f\in\mathcal{F}}\|\Delta_{\tau}f\|_{1,n}\Big\}\\
&+\sum_{j=0}^{\tau-1}\sup_{f\in\mathcal{F}}\Big|\mathbb{G}_{n}(\varphi_{m_{j}-m_{j+1}}^{\wedge}(\pi_{j+1}f-\pi_{j}f))\Big|\\
&+\sum_{j=0}^{\tau-1}\Big\{\sup_{f\in\mathcal{F}}\Big|\mathbb{G}_{n}(\min\big\{\big|\varphi_{m_{j+1}}^{\vee}(\Delta_{j+1}f)\big|,2m_{j}\big\})\Big|+2\sqrt{n}\sup_{f\in\mathcal{F}}\|\Delta_{j+1}f\mathbbm{1}_{\{\Delta_{j+1}f>m_{j+1}\}}\|_{1,n}\Big\}\\
&+\sum_{j=0}^{\tau-1}\Big\{\sup_{f\in\mathcal{F}}\Big|\mathbb{G}_{n}(\min\big\{\big|\varphi_{m_{j}-m_{j+1}}^{\vee}(\Delta_{j}f)\big|,2m_{j}\big\})\Big|+2\sqrt{n}\sup_{f\in\mathcal{F}}\|\Delta_{j}f\mathbbm{1}_{\{\Delta_{j}f>m_{j}-m_{j+1}\}}\|_{1,n}\Big\}\\
=:{}& R_{1}+R_{2}+R_{3}+R_{4}+R_{5}.
\end{aligned}
\tag{7.17}
\]

The terms $R_{i}$, $i=1,...,5$, can now be discussed separately with Lemma 7.2 and give the upper bounds (with constants $C>0$):

\[
\mathbb{E}R_{1}\leq C\delta_{0}\sqrt{\mathbb{H}(\delta_{1})},\quad\mathbb{E}R_{2}\leq 2I(\sigma)+C\delta_{\tau}\sqrt{\mathbb{H}(\delta_{\tau+1})},\quad\mathbb{E}R_{3},\mathbb{E}R_{4},\mathbb{E}R_{5}\leq C\sum_{j=0}^{\tau}\delta_{j}\sqrt{\mathbb{H}(\delta_{j+1})},
\]

and thus $\mathbb{E}\sup_{f\in\mathcal{F}(M_{n})}|\mathbb{G}_{n}(f)|\leq C^{\prime}I(\sigma)$. Together with (7.16), the result follows.

References

  • [1] Radosław Adamczak. A tail inequality for suprema of unbounded empirical processes with applications to Markov chains. Electron. J. Probab., 13:no. 34, 1000–1034, 2008.
  • [2] Donald W. K. Andrews and David Pollard. An introduction to functional central limit theorems for dependent stochastic processes. International Statistical Review / Revue Internationale de Statistique, 62(1):119–132, 1994.
  • [3] M. A. Arcones and B. Yu. Central limit theorems for empirical and $U$-processes of stationary mixing sequences. J. Theoret. Probab., 7(1):47–71, 1994.
  • [4] István Berkes, Siegfried Hörmann, and Johannes Schauer. Asymptotic results for the empirical process of stationary sequences. Stochastic Process. Appl., 119(4):1298–1324, 2009.
  • [5] Vivek S. Borkar. White-noise representations in stochastic realization theory. SIAM J. Control Optim., 31(5):1093–1102, 1993.
  • [6] Svetlana Borovkova, Robert Burton, and Herold Dehling. Limit theorems for functionals of mixing processes with applications to $U$-statistics and dimension estimation. Trans. Amer. Math. Soc., 353(11):4261–4318, 2001.
  • [7] Rainer Dahlhaus and Wolfgang Polonik. Empirical spectral processes for locally stationary time series. Bernoulli, 15(1):1–39, 2009.
  • [8] Rainer Dahlhaus, Stefan Richter, and Wei Biao Wu. Towards a general theory for nonlinear locally stationary processes. Bernoulli, 25(2):1013–1044, 2019.
  • [9] J. Dedecker. An empirical central limit theorem for intermittent maps. Probab. Theory Related Fields, 148(1-2):177–195, 2010.
  • [10] Jérôme Dedecker and Sana Louhichi. Maximal inequalities and empirical central limit theorems. In Empirical process techniques for dependent data, pages 137–159. Birkhäuser Boston, Boston, MA, 2002.
  • [11] Jérôme Dedecker and Clémentine Prieur. An empirical central limit theorem for dependent sequences. Stochastic Process. Appl., 117(1):121–142, 2007.
  • [12] Herold Dehling, Olivier Durieu, and Dalibor Volny. New techniques for empirical processes of dependent data. Stochastic Process. Appl., 119(10):3699–3718, 2009.
  • [13] Monroe D. Donsker. Justification and extension of Doob’s heuristic approach to the Kolmogorov-Smirnov theorems. Ann. Math. Statistics, 23:277–281, 1952.
  • [14] P. Doukhan, P. Massart, and E. Rio. Invariance principles for absolutely regular empirical processes. Ann. Inst. H. Poincaré Probab. Statist., 31(2):393–427, 1995.
  • [15] Paul Doukhan. Mixing, volume 85 of Lecture Notes in Statistics. Springer-Verlag, New York, 1994. Properties and examples.
  • [16] Paul Doukhan and Michael H Neumann. Probability and moment inequalities for sums of weakly dependent random variables, with applications. Stochastic Processes and their Applications, 117(7):878–903, 2007.
  • [17] R. M. Dudley. Weak convergences of probabilities on nonseparable metric spaces and empirical measures on Euclidean spaces. Illinois J. Math., 10:109–126, 1966.
  • [18] R. M. Dudley. Central limit theorems for empirical measures. Ann. Probab., 6(6):899–929 (1979), 1978.
  • [19] R. M. Dudley. Uniform central limit theorems, volume 142 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, New York, second edition, 2014.
  • [20] Olivier Durieu and Marco Tusche. An empirical process central limit theorem for multidimensional dependent data. J. Theoret. Probab., 27(1):249–277, 2014.
  • [21] Richard S. Ellis and Aaron D. Wyner. Uniform large deviation property of the empirical process of a Markov chain. Ann. Probab., 17(3):1147–1151, 1989.
  • [22] Christian Francq and Jean-Michel Zakoïan. Mixing properties of a general class of GARCH(1,1) models without moment assumptions on the observed process. Econometric Theory, 22(5):815–834, 2006.
  • [23] Evarist Giné and Richard Nickl. Mathematical foundations of infinite-dimensional statistical models. Cambridge Series in Statistical and Probabilistic Mathematics, [40]. Cambridge University Press, New York, 2016.
  • [24] Bruce E. Hansen. Uniform convergence rates for kernel estimation with dependent data. Econometric Theory, 24(3):726–748, 2008.
  • [25] Lothar Heinrich. Bounds for the absolute regularity coefficient of a stationary renewal process. Yokohama Math. J., 40(1):25–33, 1992.
  • [26] Hans Arnfinn Karlsen and Dag Tjøstheim. Nonparametric estimation in null recurrent time series. Ann. Statist., 29(2):372–416, 2001.
  • [27] Rafał Kulik, Philippe Soulier, and Olivier Wintenberger. The tail empirical process of regularly varying functions of geometrically ergodic Markov chains. Stochastic Process. Appl., 129(11):4209–4238, 2019.
  • [28] Shlomo Levental. Uniform limit theorems for Harris recurrent Markov chains. Probab. Theory Related Fields, 80(1):101–118, 1988.
  • [29] Degui Li, Dag Tjøstheim, and Jiti Gao. Estimation in nonlinear regression with Harris recurrent Markov chains. Ann. Statist., 44(5):1957–1987, 2016.
  • [30] Eckhard Liebscher. Strong convergence of sums of $\alpha$-mixing random variables with applications to density estimation. Stochastic Processes and their Applications, 65(1):69–80, 1996.
  • [31] Ulrike Mayer, Henryk Zähle, and Zhou Zhou. Functional weak limit theorem for a local empirical process of non-stationary time series and its application. Bernoulli, 26(3):1891–1911, 2020.
  • [32] Sean Meyn and Richard L. Tweedie. Markov chains and stochastic stability. Cambridge University Press, Cambridge, second edition, 2009. With a prologue by Peter W. Glynn.
  • [33] Abdelkader Mokkadem. Mixing properties of ARMA processes. Stochastic Process. Appl., 29(2):309–315, 1988.
  • [34] Mina Ossiander. A central limit theorem under metric entropy with $L_{2}$ bracketing. Ann. Probab., 15(3):897–919, 1987.
  • [35] Tuan D. Pham and Lanh T. Tran. Some mixing properties of time series models. Stochastic Process. Appl., 19(2):297–303, 1985.
  • [36] Iosif Pinelis. Optimum bounds for the distributions of martingales in Banach spaces. Ann. Probab., 22(4):1679–1706, 1994.
  • [37] David Pollard. A central limit theorem for empirical processes. J. Austral. Math. Soc. Ser. A, 33(2):235–248, 1982.
  • [38] Emmanuel Rio. The functional law of the iterated logarithm for stationary strongly mixing sequences. Ann. Probab., 23(3):1188–1203, 1995.
  • [39] Emmanuel Rio. Processus empiriques absolument réguliers et entropie universelle. Probab. Theory Related Fields, 111(4):585–608, 1998.
  • [40] Emmanuel Rio. Asymptotic theory of weakly dependent random processes, volume 80 of Probability Theory and Stochastic Modelling. Springer, Berlin, 2017. Translated from the 2000 French edition [MR2117923].
  • [41] Jorge D. Samur. A regularity condition and a limit theorem for Harris ergodic Markov chains. Stochastic Process. Appl., 111(2):207–235, 2004.
  • [42] Lionel Truquet. A perturbation analysis of Markov chains models with time-varying parameters. Bernoulli, 26(4):2876–2906, 2020.
  • [43] A. W. van der Vaart. Asymptotic statistics, volume 3 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 1998.
  • [44] Aad W. van der Vaart and Jon A. Wellner. Weak convergence and empirical processes. Springer Series in Statistics. Springer-Verlag, New York, 1996. With applications to statistics.
  • [45] Michael Vogt. Nonparametric regression for locally stationary time series. Ann. Statist., 40(5):2601–2633, 2012.
  • [46] Wei Biao Wu. Nonlinear system theory: another look at dependence. Proc. Natl. Acad. Sci. USA, 102(40):14150–14154, 2005.
  • [47] Wei Biao Wu. Empirical processes of stationary sequences. Statistica Sinica, 18(1):313–333, 2008.
  • [48] Wei Biao Wu. Asymptotic theory for stationary processes. Stat. Interface, 4(2):207–226, 2011.
  • [49] Wei Biao Wu, Weidong Liu, and Han Xiao. Probability and moment inequalities under dependence. Statist. Sinica, 23(3):1257–1272, 2013.
  • [50] Bin Yu. Rates of convergence for empirical processes of stationary mixing sequences. Ann. Probab., 22(1):94–116, 1994.
  • [51] Danna Zhang and Wei Biao Wu. Gaussian approximation for high dimensional time series. Ann. Statist., 45(5):1895–1919, 2017.

Supplementary Material

This material contains some details of the proofs in the paper as well as the proofs of the examples.

7.4 Proofs of Section 4.1

Lemma 7.3.

Let Assumption 4.2 hold for some $\nu\geq 2$. Then,

\begin{align*}
\delta_{\nu}^{f(Z,u)}(k) &\leq |D_{f,n}(u)|\cdot\Delta(k),\\
\sup_{i}\Big\|\sup_{f\in\mathcal{F}}\big|f(Z_{i},u)-f(Z_{i}^{*(i-k)},u)\big|\,\Big\|_{\nu} &\leq D_{n}^{\infty}(u)\cdot\Delta(k),\\
\sup_{i}\|f(Z_{i},u)\|_{\nu} &\leq |D_{f,n}(u)|\cdot C_{\Delta},
\end{align*}

where $C_{\Delta}:=4d\cdot|L_{\mathcal{F}}|_{1}\cdot C_{X}^{s}C_{R}+C_{\bar{f}}$.

Proof of Lemma 7.3.

We have for each $f\in\mathcal{F}$ and $\nu\geq 2$ that

\begin{align*}
&\sup_{i}\Big\|\bar{f}(Z_{i},u)-\bar{f}(Z_{i}^{*(i-k)},u)\Big\|_{\nu}\\
&\leq \sup_{i}\Big\||Z_{i}-Z_{i}^{*(i-k)}|_{L_{\mathcal{F},s}}^{s}\big(R(Z_{i},u)+R(Z_{i}^{*(i-k)},u)\big)\Big\|_{\nu}\\
&\leq \sup_{i}\Big\|\big|Z_{i}-Z_{i}^{*(i-k)}\big|_{L_{\mathcal{F},s}}^{s}\Big\|_{\frac{p}{p-1}\nu}\Big\|R(Z_{i},u)+R(Z_{i}^{*(i-k)},u)\Big\|_{p\nu}\\
&\leq \sup_{i}\Big\|\sum_{j=0}^{\infty}L_{\mathcal{F},j}\big|X_{i-j}-X_{i-j}^{*(i-k)}\big|_{\infty}^{s}\Big\|_{\frac{p}{p-1}\nu}\Big(\big\|R(Z_{i},u)\big\|_{p\nu}+\big\|R(Z_{i}^{*(i-k)},u)\big\|_{p\nu}\Big)\\
&\leq 2dC_{R}\sum_{j=0}^{k}L_{\mathcal{F},j}(\delta_{\frac{p}{p-1}\nu s}^{X}(k-j))^{s}.
\end{align*}

This shows the first assertion. Due to

\[
\sup_{f\in\mathcal{F}}\big|\bar{f}(Z_{i},u)-\bar{f}(Z_{i}^{*(i-k)},u)\big|\leq|Z_{i}-Z_{i}^{*(i-k)}|_{L_{\mathcal{F}},s}^{s}\big(R(Z_{i},u)+R(Z_{i}^{*(i-k)},u)\big),
\]

the second assertion follows similarly. The last assertion follows from

\[
|\bar{f}(z,u)|\leq|\bar{f}(z,u)-\bar{f}(0,u)|+|\bar{f}(0,u)|\leq|z|_{L_{\mathcal{F}},s}^{s}\cdot(R(z,u)+R(0,u))+|\bar{f}(0,u)|,
\]

which implies

\begin{align*}
\|\bar{f}(Z_{i},u)\|_{\nu} &\leq \Big\|\sum_{j=0}^{\infty}L_{\mathcal{F},j}|Z_{i-j}|_{\infty}^{s}\Big\|_{\frac{p}{p-1}\nu}\big(\big\|R(Z_{i},u)\big\|_{p\nu}+R(0,u)\big)+|\bar{f}(0,u)|\\
&\leq 2d\cdot|L_{\mathcal{F}}|_{1}\cdot C_{X}^{s}\cdot(C_{R}+|R(0,u)|)+|\bar{f}(0,u)|\\
&\leq 4d\cdot|L_{\mathcal{F}}|_{1}\cdot C_{X}^{s}\cdot C_{R}+C_{\bar{f}}.
\end{align*}

Proof of Theorem 4.1.

Denote the three terms on the right-hand side of (7.1) by $A_{1},A_{2},A_{3}$. We now discuss the three terms separately. First, we have

\[
\mathbb{E}A_{1}\leq\sum_{j=q}^{\infty}\frac{1}{\sqrt{n}}\mathbb{E}\max_{f\in\mathcal{F}}\Big|\sum_{i=1}^{n}(W_{i,j+1}(f)-W_{i,j}(f))\Big|.
\]

For fixed $j$, the sequence

\begin{align*}
E_{i,j}:=(E_{i,j}(f))_{f\in\mathcal{F}} &= \big((W_{i,j+1}(f)-W_{i,j}(f))\big)_{f\in\mathcal{F}}\\
&= (\mathbb{E}[W_{i}(f)|\varepsilon_{i-j},...,\varepsilon_{i}]-\mathbb{E}[W_{i}(f)|\varepsilon_{i-j+1},...,\varepsilon_{i}])_{f\in\mathcal{F}}
\end{align*}

is a $|\mathcal{F}|$-dimensional martingale difference vector with respect to $\mathcal{G}^{i}=\sigma(\varepsilon_{i-j},\varepsilon_{i-j+1},...)$. For a vector $x=(x_{f})_{f\in\mathcal{F}}$ and $s\geq 1$, write $|x|_{s}:=(\sum_{f\in\mathcal{F}}|x_{f}|^{s})^{1/s}$. By Theorem 4.1 in [36] there exists an absolute constant $c_{1}>0$ such that for $s>1$,

\[
\Big\|\Big|\sum_{i=1}^{n}E_{i,j}\Big|_{s}\Big\|_{2}\leq c_{1}\Big\{2\Big\|\sup_{i=1,...,n}|E_{i,j}|_{s}\Big\|_{2}+\sqrt{2(s-1)}\Big\|\Big(\sum_{i=1}^{n}\mathbb{E}[|E_{i,j}|_{s}^{2}|\mathcal{G}^{i-1}]\Big)^{1/2}\Big\|_{2}\Big\}. \tag{7.18}
\]

We have

\[
\Big\|\sup_{i=1,...,n}|E_{i,j}|_{s}\Big\|_{2}=\Big\|\big(\sup_{i=1,...,n}|E_{i,j}|_{s}^{2}\big)^{1/2}\Big\|_{2}\leq\Big\|\big(\sum_{i=1}^{n}|E_{i,j}|_{s}^{2}\big)^{1/2}\Big\|_{2},
\]

therefore both terms in (7.18) are of the same order and it is enough to bound the second term in (7.18). We have

\begin{align*}
\Big\|\Big(\sum_{i=1}^{n}\mathbb{E}[|E_{i,j}|_{s}^{2}|\mathcal{G}^{i-1}]\Big)^{1/2}\Big\|_{2} &= \Big\|\sum_{i=1}^{n}\mathbb{E}[|E_{i,j}|_{s}^{2}|\mathcal{G}^{i-1}]\Big\|_{1}^{1/2} \tag{7.19}\\
&\leq \Big(\sum_{i=1}^{n}\big\|\mathbb{E}[|E_{i,j}|_{s}^{2}|\mathcal{G}^{i-1}]\big\|_{1}\Big)^{1/2}\\
&\leq \Big(\sum_{i=1}^{n}\big\||E_{i,j}|_{s}\big\|_{2}^{2}\Big)^{1/2}.
\end{align*}

Note that

\begin{align*}
E_{i,j}(f) &= W_{i,j+1}(f)-W_{i,j}(f)=\mathbb{E}[W_{i}(f)|\varepsilon_{i-j},...,\varepsilon_{i}]-\mathbb{E}[W_{i}(f)|\varepsilon_{i-j+1},...,\varepsilon_{i}] \tag{7.20}\\
&= \mathbb{E}[W_{i}(f)^{**(i-j)}-W_{i}(f)^{**(i-j+1)}|\mathcal{G}_{i}],
\end{align*}

where $H(\mathcal{F}_{i})^{**(i-j)}:=H(\mathcal{F}_{i}^{**(i-j)})$ and $\mathcal{F}_{i}^{**(i-j)}=(\varepsilon_{i},\varepsilon_{i-1},...,\varepsilon_{i-j},\varepsilon_{i-j-1}^{*},\varepsilon_{i-j-2}^{*},...)$.

By Jensen's inequality, Lemma 7.3 and the fact that $(W_{i}(f)^{**(i-j)},W_{i}(f)^{**(i-j+1)})$ has the same distribution as $(W_{i}(f),W_{i}(f)^{*(i-j)})$,

\begin{align*}
\big\||E_{i,j}|_{s}\big\|_{2} &= \Big\|\Big(\sum_{f\in\mathcal{F}}|E_{i,j}(f)|^{s}\Big)^{1/s}\Big\|_{2} \tag{7.21}\\
&\leq |\mathcal{F}|^{1/s}\Big\|\sup_{f\in\mathcal{F}}\big|\mathbb{E}[W_{i}(f)^{**(i-j)}-W_{i}(f)^{**(i-j+1)}|\mathcal{G}_{i}]\big|\Big\|_{2}\\
&\leq e\cdot\Big\|\mathbb{E}\big[\sup_{f\in\mathcal{F}}\big|W_{i}(f)^{**(i-j)}-W_{i}(f)^{**(i-j+1)}\big|\,\big|\mathcal{G}_{i}\big]\Big\|_{2}\\
&\leq e\cdot\Big\|\sup_{f\in\mathcal{F}}\big|W_{i}(f)^{**(i-j)}-W_{i}(f)^{**(i-j+1)}\big|\,\Big\|_{2}\\
&= e\cdot\Big\|\sup_{f\in\mathcal{F}}\big|W_{i}(f)-W_{i}(f)^{*(i-j)}\big|\,\Big\|_{2}\\
&\leq e\cdot D_{n}^{\infty}(\frac{i}{n})\Delta(j).
\end{align*}

Here the step to the third line uses $|\mathcal{F}|^{1/s}\leq e$, which holds whenever $s\geq\log|\mathcal{F}|$.

Inserting (7.21) into (7.19) delivers

\[
\Big(\sum_{i=1}^{n}\big\||E_{i,j}|_{s}\big\|_{2}^{2}\Big)^{1/2}\leq e\Big(\sum_{i=1}^{n}D_{n}^{\infty}(\frac{i}{n})^{2}\Big)^{1/2}\Delta(j).
\]

Inserting this bound into (7.18), we obtain

\[
\Big\|\Big|\sum_{i=1}^{n}E_{i,j}\Big|_{s}\Big\|_{2}\leq 4ec_{1}s^{1/2}n^{1/2}\Big(\frac{1}{n}\sum_{i=1}^{n}D_{n}^{\infty}(\frac{i}{n})^{2}\Big)^{1/2}\Delta(j).
\]

We conclude with $s:=2\vee\log|\mathcal{F}|$ that

\begin{align*}
\mathbb{E}A_{1} &\leq \frac{1}{\sqrt{n}}\sum_{j=q}^{\infty}\Big\|\Big|\sum_{i=1}^{n}E_{i,j}\Big|_{s}\Big\|_{2} \tag{7.22}\\
&\leq 4ec_{1}\cdot\sqrt{2\vee\log|\mathcal{F}|}\cdot\Big(\frac{1}{n}\sum_{i=1}^{n}D_{n}^{\infty}(\frac{i}{n})^{2}\Big)^{1/2}\sum_{j=q}^{\infty}\Delta(j)\\
&\leq 8ec_{1}\cdot\sqrt{H}\cdot\mathbb{D}_{n}^{\infty}\beta(q).
\end{align*}
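The choice $s=2\vee\log|\mathcal{F}|$ can be illustrated numerically: for this exponent, the $\ell_s$ norm of a finite vector exceeds its largest entry by at most the factor $e$. The following is a minimal sketch with hypothetical input vectors (they are illustrative only and not taken from the paper).

```python
import math


def ell_s_norm(values, s):
    """Return (sum_f |x_f|^s)^(1/s) for a finite vector."""
    return sum(abs(v) ** s for v in values) ** (1.0 / s)


def chosen_exponent(num_functions):
    """Exponent s = 2 v log|F| used in the proof."""
    return max(2.0, math.log(num_functions))


def norm_ratio(values):
    """Ratio |x|_s / max_f |x_f| for s = 2 v log|F|."""
    s = chosen_exponent(len(values))
    return ell_s_norm(values, s) / max(abs(v) for v in values)


# Hypothetical vectors, chosen only for illustration.
examples = [
    [1.0] * 1000,                              # worst case: all entries equal
    [0.5, -2.0, 1.5, 0.1],                     # small class, s = 2
    [(-1) ** k / (k + 1) for k in range(50)],  # decaying entries
]
ratios = [norm_ratio(x) for x in examples]
```

The first vector is the extremal case, where the ratio equals $|\mathcal{F}|^{1/s}=e$ up to rounding.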

We now discuss $\mathbb{E}A_{2}$. If $M_{Q},\sigma_{Q}>0$ are constants and $Q_{i}(f)$, $i=1,...,m$, are mean-zero independent variables (depending on $f\in\mathcal{F}$) with $|Q_{i}(f)|\leq M_{Q}$ and $(\frac{1}{m}\sum_{i=1}^{m}\|Q_{i}(f)\|_{2}^{2})^{1/2}\leq\sigma_{Q}$, then there exists some universal constant $c_{2}>0$ such that

\[
\mathbb{E}\max_{f\in\mathcal{F}}\frac{1}{\sqrt{m}}\Big|\sum_{i=1}^{m}\big[Q_{i}(f)-\mathbb{E}Q_{i}(f)\big]\Big|\leq c_{2}\cdot\Big(\sigma_{Q}\sqrt{H}+\frac{M_{Q}H}{\sqrt{m}}\Big) \tag{7.23}
\]

(see e.g. [10], equation (4.3) in Section 4.1 therein).

Note that $(W_{k,j}-W_{k,j-1})_{k}$ is a martingale difference sequence and $W_{k,\tau_{l}}-W_{k,\tau_{l-1}}=\sum_{j=\tau_{l-1}+1}^{\tau_{l}}(W_{k,j}-W_{k,j-1})$. Furthermore, we have

\[
\|W_{k,j}-W_{k,j-1}\|_{2}\leq\|W_{k}-\mathbb{E}[W_{k}|\varepsilon_{k-j+1}]\|_{2}\leq\|W_{k}\|_{2}
\]

and

\begin{align*}
\|W_{k,j}-W_{k,j-1}\|_{2} &= \|\mathbb{E}[W_{k}^{**(k-j+1)}-W_{k}^{**(k-j+2)}|\mathcal{G}_{k}]\|_{2}\\
&\leq \|W_{k}^{**(k-j+1)}-W_{k}^{**(k-j+2)}\|_{2}\\
&= \|W_{k}-W_{k}^{*(k-j+1)}\|_{2}=\delta_{2}^{W_{k}}(j-1),
\end{align*}

thus

\[
\|W_{k,j}-W_{k,j-1}\|_{2}\leq\min\{\|W_{k}\|_{2},\delta^{W_{k}}_{2}(j-1)\}.
\]

We conclude with the elementary inequality $\min\{a_{1},b_{1}\}+\min\{a_{2},b_{2}\}\leq\min\{a_{1}+a_{2},b_{1}+b_{2}\}$ that

\begin{align*}
\|T_{i,l}\|_{2} &= \Big\|\sum_{k=(i-1)\tau_{l}+1}^{(i\tau_{l})\wedge n}(W_{k,\tau_{l}}-W_{k,\tau_{l-1}})\Big\|_{2}\\
&= \Big\|\sum_{j=\tau_{l-1}+1}^{\tau_{l}}\sum_{k=(i-1)\tau_{l}+1}^{(i\tau_{l})\wedge n}(W_{k,j}-W_{k,j-1})\Big\|_{2}\\
&\leq \sum_{j=\tau_{l-1}+1}^{\tau_{l}}\Big\|\sum_{k=(i-1)\tau_{l}+1}^{(i\tau_{l})\wedge n}(W_{k,j}-W_{k,j-1})\Big\|_{2}\\
&\leq \sum_{j=\tau_{l-1}+1}^{\tau_{l}}\Big(\sum_{k=(i-1)\tau_{l}+1}^{(i\tau_{l})\wedge n}\|W_{k,j}-W_{k,j-1}\|_{2}^{2}\Big)^{1/2}\\
&\leq \sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\Big\{\Big(\sum_{k=(i-1)\tau_{l}+1}^{(i\tau_{l})\wedge n}\|W_{k}\|_{2}^{2}\Big)^{1/2},\Big(\sum_{k=(i-1)\tau_{l}+1}^{(i\tau_{l})\wedge n}(\delta_{2}^{W_{k}}(j-1))^{2}\Big)^{1/2}\Big\}.
\end{align*}
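For completeness, the elementary inequality can be verified in one line; the following is a short sketch and not part of the original argument. For real numbers $a_{1},a_{2},b_{1},b_{2}$,

\[
\min\{a_{1},b_{1}\}+\min\{a_{2},b_{2}\}\leq a_{1}+a_{2}
\quad\text{and}\quad
\min\{a_{1},b_{1}\}+\min\{a_{2},b_{2}\}\leq b_{1}+b_{2},
\]

so the left-hand side is bounded by $\min\{a_{1}+a_{2},b_{1}+b_{2}\}$. By induction over the number of summands, applied with $a_{k}=\|W_{k}\|_{2}^{2}$ and $b_{k}=(\delta_{2}^{W_{k}}(j-1))^{2}$, this yields

\[
\sum_{k}\min\{a_{k},b_{k}\}\leq\min\Big\{\sum_{k}a_{k},\sum_{k}b_{k}\Big\},
\]

which is the form used in the last step of the display.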

Put

\[
\sigma_{i,l}:=\Big(\frac{1}{\tau_{l}}\sum_{k=(i-1)\tau_{l}+1}^{(i\tau_{l})\wedge n}\|W_{k}\|_{2}^{2}\Big)^{1/2},\quad\quad\Delta_{i,j,l}:=\Big(\frac{1}{\tau_{l}}\sum_{k=(i-1)\tau_{l}+1}^{(i\tau_{l})\wedge n}\delta_{2}^{W_{k}}(j-1)^{2}\Big)^{1/2}.
\]

Then

\begin{align*}
&\Big(\frac{1}{\frac{n}{\tau_{l}}}\underset{i\text{ even}}{\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\frac{1}{\tau_{l}}\|T_{i,l}(f)\|_{2}^{2}\Big)^{1/2} \tag{7.24}\\
&\quad\leq \Big(\frac{1}{\frac{n}{\tau_{l}}}\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}\Big(\sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\{\sigma_{i,l},\Delta_{i,j,l}\}\Big)^{2}\Big)^{1/2}\\
&\quad\leq \Big(\frac{1}{\frac{n}{\tau_{l}}}\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}\big(\tau_{l}-\tau_{l-1}\big)^{2}\min\{\sigma_{i,l}^{2},\Delta_{i,\tau_{l-1}+1,l}^{2}\}\Big)^{1/2}\\
&\quad\leq \big(\tau_{l}-\tau_{l-1}\big)\cdot\Big(\min\Big\{\frac{1}{\frac{n}{\tau_{l}}}\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}\sigma_{i,l}^{2},\frac{1}{\frac{n}{\tau_{l}}}\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}\Delta_{i,\tau_{l-1}+1,l}^{2}\Big\}\Big)^{1/2}\\
&\quad\leq \sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\Big\{\Big(\frac{1}{\frac{n}{\tau_{l}}}\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}\sigma_{i,l}^{2}\Big)^{1/2},\Big(\frac{1}{\frac{n}{\tau_{l}}}\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}\Delta_{i,\tau_{l-1}+1,l}^{2}\Big)^{1/2}\Big\}\\
&\quad\leq \sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\Big\{\|f\|_{2,n},\Big(\frac{1}{n}\sum_{i=1}^{n}\delta_{2}^{W_{i}}(\tau_{l-1})^{2}\Big)^{1/2}\Big\}\\
&\quad\leq \sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\{\|f\|_{2,n},\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\}.
\end{align*}

With $\frac{1}{\sqrt{\tau_{l}}}\big|T_{i,l}(f)\big|\leq 2\sqrt{\tau_{l}}\|f\|_{\infty}\leq 2\sqrt{\tau_{l}}M$ and (7.23), we obtain

\begin{align*}
&\sum_{l=1}^{L}\Big[\mathbb{E}\max_{f\in\mathcal{F}}\Big|\frac{1}{\sqrt{\frac{n}{\tau_{l}}}}\underset{i\text{ even}}{\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big|\Big]\\
&\quad\leq c_{2}\sum_{l=1}^{L}\Big[\sup_{f\in\mathcal{F}}\Big(\frac{1}{\frac{n}{\tau_{l}}}\underset{i\text{ even}}{\sum_{i=1}^{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\Big\|\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big\|_{2}^{2}\Big)^{1/2}\sqrt{H}+\frac{2\sqrt{\tau_{l}}MH}{\sqrt{\frac{n}{\tau_{l}}}}\Big],
\end{align*}

and a similar assertion for the second term ($i$ odd) in $A_{2}$. With (7.24), we conclude that

\begin{align*}
\mathbb{E}A_{2} &\leq \sum_{l=1}^{L}\Big[\mathbb{E}\max_{f\in\mathcal{F}}\frac{1}{\sqrt{\frac{n}{\tau_{l}}}}\Big|\sum_{1\leq i\leq\lfloor\frac{n}{\tau_{l}}\rfloor+1,\,i\text{ odd}}\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big| \tag{7.25}\\
&\qquad\qquad+\mathbb{E}\max_{f\in\mathcal{F}}\frac{1}{\sqrt{\frac{n}{\tau_{l}}}}\Big|\sum_{1\leq i\leq\lfloor\frac{n}{\tau_{l}}\rfloor+1,\,i\text{ even}}\frac{1}{\sqrt{\tau_{l}}}T_{i,l}(f)\Big|\Big]\\
&\leq 4c_{2}\sum_{l=1}^{L}\Big[\Big(\sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\{\max_{f\in\mathcal{F}}\|f\|_{2,n},\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\}\Big)\cdot\sqrt{H}+\frac{\sqrt{\tau_{l}}MH}{\sqrt{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\Big].
\end{align*}

Note that

\[
\sum_{l=1}^{L}\frac{\sqrt{\tau_{l}}}{\sqrt{\lfloor\frac{n}{\tau_{l}}\rfloor+1}}\leq\sum_{l=1}^{L}\frac{\sqrt{\tau_{l}}}{\sqrt{\frac{n}{\tau_{l}}}}=\frac{1}{\sqrt{n}}\sum_{l=1}^{L}\tau_{l}=\frac{1}{\sqrt{n}}\Big(\sum_{l=1}^{L-1}2^{l}+q\Big)\leq\frac{1}{\sqrt{n}}(2^{L}+q)\leq\frac{2q}{\sqrt{n}}. \tag{7.26}
\]

Furthermore, we have by Lemma 7.4 that

\begin{align*}
\sum_{l=1}^{L}\sum_{j=\tau_{l-1}+1}^{\tau_{l}}\min\{\max_{f\in\mathcal{F}}\|f\|_{2,n},\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\} &\leq \sum_{j=2}^{\infty}\min\{\max_{f\in\mathcal{F}}\|f\|_{2,n},\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\} \tag{7.27}\\
&\leq 2\bar{V}_{n}(\max_{f\in\mathcal{F}}\|f\|_{2,n})\\
&= 2\max_{f\in\mathcal{F}}\bar{V}_{n}(\|f\|_{2,n})=2\max_{f\in\mathcal{F}}V_{n}(f),
\end{align*}

where

\[
\bar{V}_{n}(x)=x+\sum_{j=1}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(j)\} \tag{7.28}
\]

and the second-to-last equality holds since $x\mapsto\bar{V}_{n}(x)$ is increasing.
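The monotonicity of $x\mapsto\bar{V}_{n}(x)$ can be checked numerically. The sketch below evaluates a truncated version of (7.28) under the assumption of a polynomial decay $\Delta(j)=j^{-\alpha}$; this decay, the truncation level and the value of $\mathbb{D}_{n}$ are hypothetical choices made only for illustration.

```python
def v_bar(x, d_n, delta, num_terms=10_000):
    """Truncated version of x + sum_{j>=1} min{x, d_n * delta(j)}."""
    return x + sum(min(x, d_n * delta(j)) for j in range(1, num_terms + 1))


def poly_delta(alpha):
    """Hypothetical dependence decay Delta(j) = j^(-alpha)."""
    def delta(j):
        return j ** (-alpha)
    return delta


# Illustrative parameters (assumptions, not from the paper).
delta = poly_delta(alpha=2.0)
grid = [0.01, 0.05, 0.1, 0.5, 1.0]
values = [v_bar(x, d_n=1.0, delta=delta) for x in grid]
```

Each summand $\min\{x,\mathbb{D}_{n}\Delta(j)\}$ is nondecreasing in $x$ and the leading term $x$ is strictly increasing, so `values` is strictly increasing along `grid`.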

Inserting (7.26) and (7.27) into (7.25), we conclude that with some universal $c_{3}>0$,

\[
\mathbb{E}A_{2}\leq c_{3}\Big(\sup_{f\in\mathcal{F}}V_{n}(f)\sqrt{H}+\frac{qMH}{\sqrt{n}}\Big)\leq c_{3}\Big(\sigma\sqrt{H}+\frac{qMH}{\sqrt{n}}\Big). \tag{7.29}
\]

Since $S_{n,1}^{W}=\sum_{i=1}^{n}W_{i,1}(f)$ is a sum of independent variables with $|W_{i,1}(f)|\leq\|f\|_{\infty}\leq M$ and $\|W_{i,1}(f)\|_{2}\leq 2\|f\|_{2}\leq 2V_{n}(f)\leq 2\sigma$, we obtain from (7.23) again

\[
\mathbb{E}A_{3}\leq c_{2}\Big(\sigma\sqrt{H}+\frac{MH}{\sqrt{n}}\Big). \tag{7.30}
\]

If we insert the bounds (7.22), (7.29) and (7.30) into (7.1), we obtain the result (4.2).

We now show (4.3). If $q^{*}(\frac{M\sqrt{H}}{\sqrt{n}\mathbb{D}_{n}^{\infty}})\frac{H}{n}\leq 1$, we have $q^{*}(\frac{M\sqrt{H}}{\sqrt{n}\mathbb{D}_{n}^{\infty}})\in\{1,...,n\}$ and thus by (4.2):

\begin{align*}
\mathbb{E}\max_{f\in\mathcal{F}}\Big|\frac{1}{\sqrt{n}}S_{n}(f)\Big| &\leq c\Big(\sqrt{H}\mathbb{D}_{n}^{\infty}\beta\Big(q^{*}\Big(\frac{M\sqrt{H}}{\sqrt{n}\mathbb{D}_{n}^{\infty}}\Big)\Big)+q^{*}\Big(\frac{M\sqrt{H}}{\sqrt{n}\mathbb{D}_{n}^{\infty}}\Big)\frac{MH}{\sqrt{n}}+\sigma\sqrt{H}\Big) \tag{7.31}\\
&\leq 2c\Big(q^{*}\Big(\frac{M\sqrt{H}}{\sqrt{n}\mathbb{D}_{n}^{\infty}}\Big)\frac{MH}{\sqrt{n}}+\sigma\sqrt{H}\Big)\\
&= 2c\Big(\sqrt{n}M\cdot\min\Big\{q^{*}\Big(\frac{M\sqrt{H}}{\sqrt{n}\mathbb{D}_{n}^{\infty}}\Big)\frac{H}{n},1\Big\}+\sigma\sqrt{H}\Big).
\end{align*}

If $q^{*}(\frac{M\sqrt{H}}{\sqrt{n}\mathbb{D}_{n}^{\infty}})\frac{H}{n}\geq 1$, we note that the simple bound

\begin{align*}
\mathbb{E}\max_{f\in\mathcal{F}}\Big|\frac{1}{\sqrt{n}}S_{n}(f)\Big| &\leq 2\sqrt{n}M \tag{7.32}\\
&\leq 2c\Big(\sqrt{n}M\min\Big\{q^{*}\Big(\frac{M\sqrt{H}}{\sqrt{n}\mathbb{D}_{n}^{\infty}}\Big)\frac{H}{n},1\Big\}+\sigma\sqrt{H}\Big)
\end{align*}

holds. Putting the two bounds (7.31) and (7.32) together, we obtain the result (4.3).

Lemma 7.4.

Let $\omega(k)$ be an increasing sequence in $k$. Then, for any $x>0$,

\[
\sum_{j=2}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\}\omega(j)\leq 2\sum_{j=1}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(j)\}\omega(2j+1).
\]

In particular, in the case $\omega(k)=1$,

\[
\sum_{j=2}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\}\leq 2\sum_{j=1}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(j)\}.
\]

Proof of Lemma 7.4.

It holds that

\begin{align*}
&\sum_{j=2}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(\lfloor\frac{j}{2}\rfloor)\}\omega(j)\\
&\quad= \sum_{k=1}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(\lfloor\frac{2k}{2}\rfloor)\}\omega(2k)+\sum_{k=1}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(\lfloor\frac{2k+1}{2}\rfloor)\}\omega(2k+1)\\
&\quad= \sum_{k=1}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(k)\}\cdot\{\omega(2k)+\omega(2k+1)\}\\
&\quad\leq 2\sum_{k=1}^{\infty}\min\{x,\mathbb{D}_{n}\Delta(k)\}\cdot\omega(2k+1).
\end{align*}
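The pairing of the indices $j\in\{2k,2k+1\}$ with $k$ can be checked numerically on truncated sums. The sketch below is illustrative only: the decay $\Delta(j)=j^{-3/2}$, the weight $\omega(k)=\log(1+k)$ and all parameter values are hypothetical choices, and both sums are truncated consistently so that the pairing remains exact.

```python
import math


def lemma_sides(x, d_n, delta, omega, num_terms):
    """Return (lhs, rhs) of Lemma 7.4 with consistently truncated sums.

    lhs runs over j = 2, ..., 2*num_terms + 1 and rhs over k = 1, ..., num_terms,
    so every pair {2k, 2k+1} on the left is matched with the index k on the right.
    """
    lhs = sum(
        min(x, d_n * delta(j // 2)) * omega(j)
        for j in range(2, 2 * num_terms + 2)
    )
    rhs = 2 * sum(
        min(x, d_n * delta(k)) * omega(2 * k + 1)
        for k in range(1, num_terms + 1)
    )
    return lhs, rhs


def poly_decay(j):
    """Hypothetical decay Delta(j) = j^(-3/2)."""
    return j ** (-1.5)


def log_weight(k):
    """Hypothetical increasing weight omega(k) = log(1 + k)."""
    return math.log(1 + k)


def unit_weight(k):
    """Constant weight omega(k) = 1 (the special case of the lemma)."""
    return 1.0


lhs, rhs = lemma_sides(0.3, 2.0, poly_decay, log_weight, num_terms=500)
lhs_unit, rhs_unit = lemma_sides(0.3, 2.0, poly_decay, unit_weight, num_terms=500)
```

With the constant weight, the two truncated sides coincide up to rounding, which reflects that the only inequality in the proof is $\omega(2k)\leq\omega(2k+1)$.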

Proof of Corollary 4.3.

Let $\sigma:=\sup_{n\in\mathbb{N}}\sup_{f\in\mathcal{F}}V_{n}(f)<\infty$. For $Q\geq 1$, define

\[
M_{n}=\frac{\sqrt{n}}{\sqrt{H}}r(\frac{\sigma Q^{1/2}}{\mathbb{D}_{n}^{\infty}})\mathbb{D}_{n}^{\infty}.
\]

Let $\bar{F}=\sup_{f\in\mathcal{F}}\bar{f}$, and $F(z,u)=D_{n}^{\infty}(u)\cdot\bar{F}(z,u)$. Then $F$ is an envelope function of $\mathcal{F}$. We furthermore have

\[
\mathbb{P}(\sup_{i=1,...,n}F(Z_{i},\frac{i}{n})>M_{n})\leq\mathbb{P}\Big(\big(\frac{1}{n}\sum_{i=1}^{n}F(Z_{i},\frac{i}{n})^{\nu}\big)^{1/\nu}>\frac{M_{n}}{n^{1/\nu}}\Big)\leq\frac{n}{M_{n}^{\nu}}\cdot\|F\|_{\nu,n}^{\nu}. \tag{7.33}
\]
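The two inequalities in (7.33) can be spelled out as follows; this is a brief sketch, assuming $F\geq 0$ and the convention $\|F\|_{\nu,n}^{\nu}=\frac{1}{n}\sum_{i=1}^{n}\|F(Z_{i},\frac{i}{n})\|_{\nu}^{\nu}$. Since the supremum of nonnegative terms is bounded by their sum,

\[
\sup_{i=1,...,n}F(Z_{i},\tfrac{i}{n})^{\nu}\leq\sum_{i=1}^{n}F(Z_{i},\tfrac{i}{n})^{\nu}=n\cdot\frac{1}{n}\sum_{i=1}^{n}F(Z_{i},\tfrac{i}{n})^{\nu},
\]

which gives the first inequality after taking $\nu$-th roots. The second follows from Markov's inequality:

\[
\mathbb{P}\Big(\frac{1}{n}\sum_{i=1}^{n}F(Z_{i},\tfrac{i}{n})^{\nu}>\frac{M_{n}^{\nu}}{n}\Big)\leq\frac{n}{M_{n}^{\nu}}\cdot\frac{1}{n}\sum_{i=1}^{n}\mathbb{E}\big[F(Z_{i},\tfrac{i}{n})^{\nu}\big]=\frac{n}{M_{n}^{\nu}}\cdot\|F\|_{\nu,n}^{\nu}.
\]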

Inserting the bound

\[
\|F\|_{\nu,n}^{\nu}=\frac{1}{n}\sum_{i=1}^{n}D_{n}^{\infty}(\frac{i}{n})^{\nu}\|\bar{F}(Z_{i},\frac{i}{n})\|_{\nu}^{\nu}\leq C_{\Delta}^{\nu}\cdot\frac{1}{n}\sum_{i=1}^{n}D_{n}^{\infty}(\frac{i}{n})^{\nu}\leq C_{\Delta}^{\nu}\cdot(\mathbb{D}_{\nu,n}^{\infty})^{\nu}
\]

into (7.33) and using $r(\gamma a)\geq\gamma r(a)$ for $\gamma\geq 1,a>0$ (this is proven similarly to Lemma 7.5), we obtain

\begin{align*}
\mathbb{P}(\sup_{i=1,...,n}F(Z_{i},\frac{i}{n})>M_{n}) &\leq \Big(\frac{H}{n^{1-\frac{2}{\nu}}r(\frac{\sigma Q^{1/2}}{\mathbb{D}_{n}^{\infty}})^{2}}\Big)^{\nu/2}\cdot\Big(\frac{C_{\Delta}\mathbb{D}_{\nu,n}^{\infty}}{\mathbb{D}_{n}^{\infty}}\Big)^{\nu} \tag{7.34}\\
&\leq \frac{1}{Q^{\nu/2}}\Big(\frac{H}{n^{1-\frac{2}{\nu}}r(\frac{\sigma}{\mathbb{D}_{n}^{\infty}})^{2}}\Big)^{\nu/2}\cdot\Big(\frac{C_{\Delta}\mathbb{D}_{\nu,n}^{\infty}}{\mathbb{D}_{n}^{\infty}}\Big)^{\nu}.
\end{align*}

Using the rough bound $\|f\|_{\nu,n}\leq\|F\|_{\nu,n}$ and $r(a)\leq a$ for $a>0$ from Lemma 7.5, we obtain

\begin{align*}
\max_{f\in\mathcal{F}}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\mathbb{E}[f(Z_{i},\frac{i}{n})\mathbbm{1}_{\{|f(Z_{i},\frac{i}{n})|>M_{n}\}}] &\leq \frac{1}{\sqrt{n}M_{n}^{\nu-1}}\max_{f\in\mathcal{F}}\sum_{i=1}^{n}\mathbb{E}[|f(Z_{i},\frac{i}{n})|^{\nu}] \tag{7.35}\\
&\leq \frac{n}{M_{n}^{\nu}}\cdot\frac{M_{n}}{\sqrt{n}}\max_{f\in\mathcal{F}}\|f\|_{\nu,n}^{\nu}\\
&\leq \Big(\frac{C_{\Delta}^{2}H}{n^{1-\frac{2}{\nu}}r(\frac{\sigma Q^{1/2}}{\mathbb{D}_{n}^{\infty}})^{2}}\Big)^{\nu/2}\cdot\frac{\sigma Q^{1/2}}{\sqrt{H}}\cdot\Big(\frac{\mathbb{D}_{\nu,n}^{\infty}}{\mathbb{D}_{n}^{\infty}}\Big)^{\nu}\\
&\leq \frac{\sigma}{Q^{\frac{\nu-2}{2}}\sqrt{H}}\Big(\frac{C_{\Delta}^{2}H}{n^{1-\frac{2}{\nu}}r(\frac{\sigma}{\mathbb{D}_{n}^{\infty}})^{2}}\Big)^{\nu/2}\cdot\Big(\frac{\mathbb{D}_{\nu,n}^{\infty}}{\mathbb{D}_{n}^{\infty}}\Big)^{\nu}.
\end{align*}

Abbreviate

\[
C_{n}:=\Big(\frac{C_{\Delta}^{2}H}{n^{1-\frac{2}{\nu}}r(\frac{\sigma}{\mathbb{D}_{n}^{\infty}})^{2}}\Big)^{\nu/2}\cdot\Big(\frac{\mathbb{D}_{\nu,n}^{\infty}}{\mathbb{D}_{n}^{\infty}}\Big)^{\nu}.
\]

By assumption, $\sup_{n\in\mathbb{N}}C_{n}<\infty$. By Theorem 4.1, (7.34) and (7.35),

\begin{align*}
&\mathbb{P}\Big(\max_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big|>Q\sqrt{H}\Big)\\
&\quad\leq \mathbb{P}\Big(\max_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big|>Q\sqrt{H},\sup_{i=1,...,n}F(Z_{i},\frac{i}{n})\leq M_{n}\Big)+\mathbb{P}\Big(\sup_{i=1,...,n}F(Z_{i},\frac{i}{n})>M_{n}\Big)\\
&\quad\leq \mathbb{P}\Big(\max_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(\max\{\min\{f,M_{n}\},-M_{n}\})\big|>Q\sqrt{H}/2\Big)\\
&\quad\qquad+\mathbb{P}\Big(\max_{f\in\mathcal{F}}\Big|\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\mathbb{E}[f(Z_{i},\frac{i}{n})\mathbbm{1}_{\{|f(Z_{i},\frac{i}{n})|>M_{n}\}}]\Big|>Q\sqrt{H}/2\Big)\\
&\quad\qquad+\mathbb{P}\Big(\sup_{i=1,...,n}F(Z_{i},\frac{i}{n})>M_{n}\Big)\\
&\quad\leq \frac{2c}{Q\sqrt{H}}\Big[\sigma\sqrt{H}+q^{*}\Big(r(\frac{\sigma Q^{1/2}}{\mathbb{D}_{n}^{\infty}})\Big)r(\frac{\sigma Q^{1/2}}{\mathbb{D}_{n}^{\infty}})\mathbb{D}_{n}^{\infty}\Big]+\Big(\frac{1}{Q^{\frac{\nu}{2}}}+\frac{2\sigma}{Q^{\frac{\nu}{2}}H}\Big)C_{n}\\
&\quad\leq \frac{4c\sigma}{Q^{1/2}}+\Big(\frac{1}{Q^{\frac{\nu}{2}}}+\frac{2\sigma}{Q^{\frac{\nu}{2}}H}\Big)C_{n}.
\end{align*}

Since $\sup_{n\in\mathbb{N}}C_{n}<\infty$ and $\sigma$ is independent of $n$, the assertion follows for $Q\to\infty$. ∎
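The truncation $f\mapsto\max\{\min\{f,M\},-M\}$ used in the proof above splits a value into a bounded part and a remainder that vanishes unless $|f|>M$. The following is a minimal sketch of this decomposition on scalar values; the function names `clip` and `tail` and the sample values are hypothetical and serve only as illustration.

```python
def clip(value, level):
    """Truncation max{min{value, level}, -level}; the result lies in [-level, level]."""
    return max(min(value, level), -level)


def tail(value, level):
    """Remainder value - clip(value, level); it is zero whenever |value| <= level."""
    return value - clip(value, level)


# Illustrative sample values (assumptions, not from the paper).
level = 1.0
samples = [-3.0, -1.0, 0.0, 0.5, 2.5]
decomposition = [(clip(v, level), tail(v, level)) for v in samples]
```

The remainder satisfies $|f-\max\{\min\{f,M\},-M\}|\leq|f|\mathbbm{1}_{\{|f|>M\}}$, which is why the truncated process and the expectation of the tail can be controlled separately.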

7.5 Proofs of Section 4.2

Proof of Theorem 4.4.

In the following, we abbreviate $\mathbb{H}(\delta)=\mathbb{H}(\delta,\mathcal{F},V_{n})$ and $\mathbb{N}(\delta)=\mathbb{N}(\delta,\mathcal{F},V_{n})$. Choose $\delta_{0}=\sigma$ and $\delta_{j}=2^{-j}\delta_{0}$.

For each $j\in\mathbb{N}_{0}$, we choose a covering by brackets $\mathcal{F}_{jk}^{pre}:=[l_{jk},u_{jk}]\cap\mathcal{F}$, $k=1,...,\mathbb{N}(\delta_{j})$, such that $V_{n}(u_{jk}-l_{jk})\leq\delta_{j}$ and $\sup_{f,g\in\mathcal{F}_{jk}^{pre}}|f-g|\leq u_{jk}-l_{jk}=:\Delta_{jk}$. We may assume w.l.o.g. that $l_{jk},u_{jk},\Delta_{jk}\in\mathcal{F}$.

If $l_{jk},u_{jk}$ do not belong to $\mathcal{F}$, we can simply define new brackets by

\[
\tilde{l}_{jk}(z,u):=\inf_{f\in[l_{jk},u_{jk}]}f(z,u),\quad\quad\tilde{u}_{jk}(z,u):=\sup_{f\in[l_{jk},u_{jk}]}f(z,u)
\]

which fulfill $[l_{jk},u_{jk}]\cap\mathcal{F}=[\tilde{l}_{jk},\tilde{u}_{jk}]\cap\mathcal{F}$, and

\[
|\tilde{l}_{jk}(z,u)-\tilde{l}_{jk}(z^{\prime},u)|\leq\sup_{f\in[l_{jk},u_{jk}]}|f(z,u)-f(z^{\prime},u)|.
\]

Thus, we can add $\tilde{l}_{jk},\tilde{u}_{jk}$ to $\mathcal{F}$ without changing the bracketing numbers $\mathbb{N}(\varepsilon,\mathcal{F},\|\cdot\|)$ and the validity of Assumption 4.2.

We now inductively construct a new nested sequence of partitions $(\mathcal{F}_{jk})_{k}$ of $\mathcal{F}$ from $(\mathcal{F}_{jk}^{pre})_{k}$ in the following way: For each fixed $j\in\mathbb{N}_{0}$, put

\[
\{\mathcal{F}_{jk}:k\}:=\Big\{\bigcap_{i=0}^{j}\mathcal{F}_{ik_{i}}^{pre}:k_{i}\in\{1,...,\mathbb{N}(\delta_{i})\},i\in\{0,...,j\}\Big\}
\]

as the intersections of all previous partitions and the $j$-th partition. Then $|\{\mathcal{F}_{jk}:k\}|\leq N_{j}:=\mathbb{N}(\delta_{0})\cdot...\cdot\mathbb{N}(\delta_{j})$. By monotonicity of $V_{n}$, we have

\[
\sup_{f,g\in\mathcal{F}_{jk}}|f-g|\leq\Delta_{jk},\quad\quad V_{n}(\Delta_{jk})\leq\delta_{j}.
\]
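The construction of nested partitions by successive intersections can be illustrated on a finite ground set. The sketch below is a toy example under simplifying assumptions: the "function class" is the set $\{0,...,11\}$, and the three initial partitions (by parity, by residue modulo 3, by halves) are hypothetical and already disjoint, which bracket coverings need not be.

```python
from itertools import product


def refine(partitions):
    """Common refinement: all nonempty intersections of one cell per partition."""
    cells = []
    for choice in product(*partitions):
        cell = frozenset.intersection(*choice)
        if cell:
            cells.append(cell)
    return cells


def nested_sequence(partitions):
    """The j-th entry is the common refinement of partitions 0, ..., j."""
    return [refine(partitions[: j + 1]) for j in range(len(partitions))]


# Toy ground set and initial partitions (assumptions for illustration only).
ground = range(12)
partitions = [
    [frozenset(k for k in ground if k % 2 == r) for r in range(2)],
    [frozenset(k for k in ground if k % 3 == r) for r in range(3)],
    [frozenset(range(0, 6)), frozenset(range(6, 12))],
]
nested = nested_sequence(partitions)
sizes = [len(level) for level in nested]
```

Here `sizes` is bounded level by level by the products $2$, $2\cdot 3$, $2\cdot 3\cdot 2$ of the initial partition sizes, mirroring the bound $N_{j}=\mathbb{N}(\delta_{0})\cdot...\cdot\mathbb{N}(\delta_{j})$, and every cell at level $j+1$ is contained in a cell at level $j$.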

In each $\mathcal{F}_{jk}$, fix some $f_{jk}\in\mathcal{F}$, and define $\pi_{j}f:=f_{j,\psi_{j}f}$ where $\psi_{j}f:=\min\{i\in\{1,...,N_{j}\}:f\in\mathcal{F}_{ji}\}$. Put $\Delta_{j}f:=\Delta_{j,\psi_{j}f}$ and

\[
I(\sigma):=\int_{0}^{\sigma}\sqrt{1\vee\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}\,d\varepsilon,
\]

and set

\[
\tau:=\min\Big\{j\geq 0:\delta_{j}\leq\frac{I(\sigma)}{\sqrt{n}}\Big\}\vee 1. \tag{7.36}
\]
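The chaining scales $\delta_{j}=2^{-j}\sigma$ and the stopping index $\tau$ from (7.36) are straightforward to compute once a value for the entropy integral $I(\sigma)$ is available. The sketch below treats $I(\sigma)$ as a given positive number; the numerical values of $\sigma$, $I(\sigma)$ and $n$ are hypothetical.

```python
import math


def chaining_scales(sigma, num_levels):
    """Return [delta_0, ..., delta_num_levels] with delta_j = 2^(-j) * sigma."""
    return [sigma * 2.0 ** (-j) for j in range(num_levels + 1)]


def stopping_index(sigma, entropy_integral, n):
    """tau = min{j >= 0 : 2^(-j) * sigma <= I(sigma) / sqrt(n)} v 1.

    entropy_integral stands in for I(sigma) and must be positive.
    """
    if entropy_integral <= 0:
        raise ValueError("entropy_integral must be positive")
    threshold = entropy_integral / math.sqrt(n)
    j = 0
    while sigma * 2.0 ** (-j) > threshold:
        j += 1
    return max(j, 1)


# Illustrative values (assumptions, not from the paper).
tau = stopping_index(sigma=1.0, entropy_integral=2.0, n=10_000)
scales = chaining_scales(sigma=1.0, num_levels=tau)
```

With these values the threshold is $I(\sigma)/\sqrt{n}=0.02$, so the chaining stops at the first scale below it, $\delta_{6}=1/64$.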

Put

\[
m_{j}:=\frac{1}{2}m(n,\delta_{j},N_{j+1}),
\]

($m(\cdot)$ from Lemma 7.2). Choose $M_{n}=\frac{1}{2}m_{0}$. We then have

\[
\mathbb{E}\sup_{f\in\mathcal{F}}\big|\mathbb{G}_{n}(f)\big|\leq\mathbb{E}\sup_{f\in\mathcal{F}(M_{n})}\big|\mathbb{G}_{n}(f)\big|+\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\mathbb{E}\big[W_{i}(F\mathbbm{1}_{\{F>M_{n}\}})\big],
\]

where $\mathcal{F}(M_{n}):=\{\varphi_{M_{n}}^{\wedge}(f):f\in\mathcal{F}\}$. Due to Lemma 7.1(iii), $\mathcal{F}(M_{n})$ still fulfills Assumption 4.2.

Since $|f|\leq g$ implies $|W_{i}(f)|\leq W_{i}(g)$ and $\|W_{i}(g)\|_{1}\leq\|g(Z_{i},\frac{i}{n})\|_{1}$, it holds that

\begin{align*}
|\mathbb{G}_{n}(f)| &\leq \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\big|W_{i}(f)-\mathbb{E}W_{i}(f)\big|\\
&\leq \mathbb{G}_{n}(g)+\frac{2}{\sqrt{n}}\sum_{i=1}^{n}\|W_{i}(g)\|_{1}\leq\mathbb{G}_{n}(g)+2\sqrt{n}\|g\|_{1,n}.
\end{align*}

By (7.8) and (7.9) and the fact that $\|f-\pi_{0}f\|_{\infty}\leq 2M_{n}\leq m_{0}$, we have the decomposition

\begin{align*}
\sup_{f\in\mathcal{F}}|\mathbb{G}_{n}(f)| &\leq \sup_{f\in\mathcal{F}}|\mathbb{G}_{n}(\pi_{0}f)| \tag{7.37}\\
&\quad+\sup_{f\in\mathcal{F}}|\mathbb{G}_{n}(\varphi_{m_{\tau}}^{\wedge}(f-\pi_{\tau}f))|+\sum_{j=0}^{\tau-1}\sup_{f\in\mathcal{F}}\Big|\mathbb{G}_{n}(\varphi_{m_{j}-m_{j+1}}^{\wedge}(\pi_{j+1}f-\pi_{j}f))\Big|\\
&\quad+\sum_{j=0}^{\tau-1}\sup_{f\in\mathcal{F}}|\mathbb{G}_{n}(R(j))|\\
&\leq \sup_{f\in\mathcal{F}}|\mathbb{G}_{n}(\pi_{0}f)|\\
&\quad+\Big\{\sup_{f\in\mathcal{F}}|\mathbb{G}_{n}(\varphi_{m_{\tau}}^{\wedge}(\Delta_{\tau}f))|+2\sqrt{n}\sup_{f\in\mathcal{F}}\|\Delta_{\tau}f\|_{1,n}\Big\}\\
&\quad+\sum_{j=0}^{\tau-1}\sup_{f\in\mathcal{F}}\Big|\mathbb{G}_{n}(\varphi_{m_{j}-m_{j+1}}^{\wedge}(\pi_{j+1}f-\pi_{j}f))\Big|\\
&\quad+\sum_{j=0}^{\tau-1}\Big\{\sup_{f\in\mathcal{F}}\Big|\mathbb{G}_{n}(\min\big\{\big|\varphi_{m_{j+1}}^{\vee}(\Delta_{j+1}f)\big|,2m_{j}\big\})\Big|\\
&\qquad\qquad\qquad+2\sqrt{n}\sup_{f\in\mathcal{F}}\|\Delta_{j+1}f\mathbbm{1}_{\{\Delta_{j+1}f>m_{j+1}\}}\|_{1,n}\Big\}\\
&\quad+\sum_{j=0}^{\tau-1}\Big\{\sup_{f\in\mathcal{F}}\Big|\mathbb{G}_{n}(\min\big\{\big|\varphi_{m_{j}-m_{j+1}}^{\vee}(\Delta_{j}f)\big|,2m_{j}\big\})\Big|\\
&\qquad\qquad\qquad+2\sqrt{n}\sup_{f\in\mathcal{F}}\|\Delta_{j}f\mathbbm{1}_{\{\Delta_{j}f>m_{j}-m_{j+1}\}}\|_{1,n}\Big\}\\
&=: R_{1}+R_{2}+R_{3}+R_{4}+R_{5}.
\end{align*}

We now discuss the terms $R_{i}$, $i\in\{1,...,5\}$, from (7.37). To this end, put $C_{n}:=c(1+\frac{\mathbb{D}_{n}^{\infty}}{\mathbb{D}_{n}})+\frac{\mathbb{D}_{n}}{\mathbb{D}_{n}^{\infty}}$.

Since $\Delta_{jk}=u_{jk}-l_{jk}$ with $l_{jk},u_{jk}\in\mathcal{F}$, the class $\{\frac{1}{2}\Delta_{jk}:k\in\{1,...,\mathbb{N}(\delta_{j})\}\}$ still fulfills Assumption 4.2. We conclude by Lemma 7.1(iii) that for arbitrary $m,\tilde{m}>0$, the classes

\begin{align*}
&\{\tfrac{1}{2}\varphi_{m}^{\wedge}(\Delta_{jk}):k\in\{1,...,\mathbb{N}(\delta_{j})\}\},\\
&\{\tfrac{1}{2}\min\{\varphi_{m}^{\vee}(\Delta_{jk}),2\tilde{m}\}:k\in\{1,...,\mathbb{N}(\delta_{j})\}\},\\
&\{\tfrac{1}{2}\varphi_{m}^{\wedge}(\pi_{j+1}f-\pi_{j}f):k\in\{1,...,\mathbb{N}(\delta_{j})\}\}
\end{align*}

fulfill Assumption 4.2.

  • Since |{π0f:f(Mn)}|(δ0)=(σ)|\{\pi_{0}f:f\in\mathcal{F}(M_{n})\}|\leq\mathbb{N}(\delta_{0})=\mathbb{N}(\sigma), π0fMnm(n,δ0,(δ1))\|\pi_{0}f\|_{\infty}\leq M_{n}\leq m(n,\delta_{0},\mathbb{N}(\delta_{1})) and Vn(π0f)σ=δ0V_{n}(\pi_{0}f)\leq\sigma=\delta_{0} (by assumption, every ff\in\mathcal{F} fulfills Vn(f)σV_{n}(f)\leq\sigma), we have by (7.10):

    𝔼R1=𝔼supf(Mn)|𝔾n(π0f)|Cnδ01log(δ1).\mathbb{E}R_{1}=\mathbb{E}\sup_{f\in\mathcal{F}(M_{n})}|\mathbb{G}_{n}(\pi_{0}f)|\leq C_{n}\delta_{0}\sqrt{1\vee\log\mathbb{N}(\delta_{1})}.
  • It holds that |{φmτ(Δτf):f(Mn)}|Nτ|\{\varphi^{\wedge}_{m_{\tau}}(\Delta_{\tau}f):f\in\mathcal{F}(M_{n})\}|\leq N_{\tau}. If g:=φmτ(Δτf)g:=\varphi^{\wedge}_{m_{\tau}}(\Delta_{\tau}f), then gmτm(n,δτ,Nτ+1)\|g\|_{\infty}\leq m_{\tau}\leq m(n,\delta_{\tau},N_{\tau+1}) and Vn(g)Vn(Δτf)δτV_{n}(g)\leq V_{n}(\Delta_{\tau}f)\leq\delta_{\tau}. We conclude by (7.10) that:

    𝔼supf(Mn)|𝔾n(φmτ(Δτf))|Cnδτ1logNτ+1.\mathbb{E}\sup_{f\in\mathcal{F}(M_{n})}|\mathbb{G}_{n}(\varphi^{\wedge}_{m_{\tau}}(\Delta_{\tau}f))|\leq C_{n}\delta_{\tau}\cdot\sqrt{1\vee\log N_{\tau+1}}. (7.38)

    For the second term, we have by definition of $\tau$ in (7.36) and the Cauchy-Schwarz inequality:

    nΔτf1,nnΔτf2,nnVn(Δτf)nδτI(σ).\sqrt{n}\|\Delta_{\tau}f\|_{1,n}\leq\sqrt{n}\|\Delta_{\tau}f\|_{2,n}\leq\sqrt{n}V_{n}(\Delta_{\tau}f)\leq\sqrt{n}\delta_{\tau}\leq I(\sigma). (7.39)

    From (7.38) and (7.39) we obtain

    𝔼R2Cnδτ1logNτ+1+2I(σ).\mathbb{E}R_{2}\leq C_{n}\delta_{\tau}\sqrt{1\vee\log N_{\tau+1}}+2\cdot I(\sigma).
  • Since the partitions are nested, it holds that |{φmjmj+1(πj+1fπjf):f(Mn)}|Nj+1|\{\varphi^{\wedge}_{m_{j}-m_{j+1}}(\pi_{j+1}f-\pi_{j}f):f\in\mathcal{F}(M_{n})\}|\leq N_{j+1}. If g:=φmjmj+1(πj+1fπjf)g:=\varphi^{\wedge}_{m_{j}-m_{j+1}}(\pi_{j+1}f-\pi_{j}f), we have gmjmj+1mjm(n,δj,Nj+1)\|g\|_{\infty}\leq m_{j}-m_{j+1}\leq m_{j}\leq m(n,\delta_{j},N_{j+1}) and

    |g||πj+1fπjf|Δjf.|g|\leq|\pi_{j+1}f-\pi_{j}f|\leq\Delta_{j}f.

    Furthermore, Vn(g)Vn(Δjf)δjV_{n}(g)\leq V_{n}(\Delta_{j}f)\leq\delta_{j}. We conclude by (7.10) that:

    𝔼R3j=0τ1𝔼supf(Mn)|𝔾n(φmjmj+1(πj+1fπjf))|Cnj=0τ1δj1logNj+1.\mathbb{E}R_{3}\leq\sum_{j=0}^{\tau-1}\mathbb{E}\sup_{f\in\mathcal{F}(M_{n})}|\mathbb{G}_{n}(\varphi^{\wedge}_{m_{j}-m_{j+1}}(\pi_{j+1}f-\pi_{j}f))|\leq C_{n}\sum_{j=0}^{\tau-1}\delta_{j}\sqrt{1\vee\log N_{j+1}}.
  • It holds that |{min{φmj+1(Δj+1f),2mj}:f(Mn)}|Nj+1|\{\min\{\varphi^{\vee}_{m_{j+1}}(\Delta_{j+1}f),2m_{j}\}:f\in\mathcal{F}(M_{n})\}|\leq N_{j+1}. If g:=min{φmj+1(Δj+1f),2mj}g:=\min\{\varphi^{\vee}_{m_{j+1}}(\Delta_{j+1}f),2m_{j}\}, we have g2mj=m(n,δj,Nj+1)\|g\|_{\infty}\leq 2m_{j}=m(n,\delta_{j},N_{j+1}) and

    |g|Δj+1f.|g|\leq\Delta_{j+1}f.

    By monotonicity of VnV_{n}, we have Vn(g)Vn(Δj+1f)δj+1δjV_{n}(g)\leq V_{n}(\Delta_{j+1}f)\leq\delta_{j+1}\leq\delta_{j}. We conclude by (7.10) that:

    j=0τ1𝔼supf(Mn)|𝔾n(min{φmj+1(Δj+1f),2mj})|Cnj=0τ1δj1logNj+1.\sum_{j=0}^{\tau-1}\mathbb{E}\sup_{f\in\mathcal{F}(M_{n})}|\mathbb{G}_{n}(\min\{\varphi^{\vee}_{m_{j+1}}(\Delta_{j+1}f),2m_{j}\})|\leq C_{n}\sum_{j=0}^{\tau-1}\delta_{j}\sqrt{1\vee\log N_{j+1}}. (7.40)

    Note that Vn(Δj+1f)δj+1V_{n}(\Delta_{j+1}f)\leq\delta_{j+1} and mj+1=12m(n,δj+1,Nj+2)m_{j+1}=\frac{1}{2}m(n,\delta_{j+1},N_{j+2}). By (7.11), we have

    nΔj+1f𝟙{Δj+1f>mj+1}12δj+11logNj+2.\sqrt{n}\|\Delta_{j+1}f\mathbbm{1}_{\{\Delta_{j+1}f>m_{j+1}\}}\|_{1}\leq 2\delta_{j+1}\sqrt{1\vee\log N_{j+2}}. (7.41)

    From (7.40) and (7.41) we obtain

    𝔼R4(Cn+4)j=0τδj1logNj+1.\mathbb{E}R_{4}\leq(C_{n}+4)\sum_{j=0}^{\tau}\delta_{j}\sqrt{1\vee\log N_{j+1}}.
  • It holds that |{min{φmjmj+1(Δjf),2mj}:f(Mn)}|Nj+1|\{\min\{\varphi^{\vee}_{m_{j}-m_{j+1}}(\Delta_{j}f),2m_{j}\}:f\in\mathcal{F}(M_{n})\}|\leq N_{j+1}. If g:=min{φmjmj+1(Δjf),2mj}g:=\min\{\varphi^{\vee}_{m_{j}-m_{j+1}}(\Delta_{j}f),2m_{j}\}, we have g2mj=m(n,δj,Nj+1)\|g\|_{\infty}\leq 2m_{j}=m(n,\delta_{j},N_{j+1}) and

    |g|Δjf.|g|\leq\Delta_{j}f.

    Thus, Vn(g)Vn(Δjf)δjV_{n}(g)\leq V_{n}(\Delta_{j}f)\leq\delta_{j}. We conclude by (7.10) that:

    $$\sum_{j=0}^{\tau-1}\mathbb{E}\sup_{f\in\mathcal{F}(M_{n})}\big|\mathbb{G}_{n}\big(\min\{\varphi^{\vee}_{m_{j}-m_{j+1}}(\Delta_{j}f),2m_{j}\}\big)\big|\leq C_{n}\sum_{j=0}^{\tau-1}\delta_{j}\cdot\sqrt{1\vee\log N_{j+1}}.\tag{7.42}$$

    Note that Vn(Δjf)δjV_{n}(\Delta_{j}f)\leq\delta_{j} and

    $$\begin{aligned}
    2(m_{j}-m_{j+1}) &= m(n,\delta_{j},N_{j+1})-m(n,\delta_{j+1},N_{j+2})\\
    &= \mathbb{D}_{n}^{\infty}n^{1/2}\Big[\frac{r(\frac{\delta_{j}}{\mathbb{D}_{n}})}{\sqrt{1\vee\log N_{j+1}}}-\frac{r(\frac{\delta_{j+1}}{\mathbb{D}_{n}})}{\sqrt{1\vee\log N_{j+2}}}\Big]\\
    &\geq \frac{\mathbb{D}_{n}^{\infty}n^{1/2}}{\sqrt{1\vee\log N_{j+1}}}\big[r(\tfrac{\delta_{j}}{\mathbb{D}_{n}})-r(\tfrac{\delta_{j+1}}{\mathbb{D}_{n}})\big]\\
    &\geq \frac{1}{2}\frac{\mathbb{D}_{n}^{\infty}n^{1/2}}{\sqrt{1\vee\log N_{j+1}}}r(\tfrac{\delta_{j}}{\mathbb{D}_{n}})=m_{j},
    \end{aligned}$$

    where the last inequality is due to Lemma 7.5. By (7.11) we have

    nΔjf𝟙{Δjf>mjmj+1}1,nnΔjf𝟙{Δjf>mj2}1,nmj=12m(n,δj,Nj+1)4δj1logNj+1.\sqrt{n}\|\Delta_{j}f\mathbbm{1}_{\{\Delta_{j}f>m_{j}-m_{j+1}\}}\|_{1,n}\leq\sqrt{n}\|\Delta_{j}f\mathbbm{1}_{\{\Delta_{j}f>\frac{m_{j}}{2}\}}\|_{1,n}\overset{m_{j}=\frac{1}{2}m(n,\delta_{j},N_{j+1})}{\leq}4\delta_{j}\sqrt{1\vee\log N_{j+1}}. (7.43)

    From (7.42) and (7.43) we obtain

    $$\mathbb{E}R_{5}\leq(C_{n}+8)\sum_{j=0}^{\tau-1}\delta_{j}\sqrt{1\vee\log N_{j+1}}.$$

Summarizing the bounds for RiR_{i}, i=1,,5i=1,...,5, we obtain that with some universal constant c~>0\tilde{c}>0,

𝔼supf(Mn)|𝔾n(f)|c~Cn[j=0τδj1logNj+1+I(σ)].\mathbb{E}\sup_{f\in\mathcal{F}(M_{n})}\Big{|}\mathbb{G}_{n}(f)\Big{|}\leq\tilde{c}\cdot C_{n}\Big{[}\sum_{j=0}^{\tau}\delta_{j}\sqrt{1\vee\log N_{j+1}}+I(\sigma)\Big{]}. (7.44)

We have $(1\vee\log N_{j})^{1/2}=\big(1\vee\sum_{i=0}^{j}\log\mathbb{N}(\delta_{i})\big)^{1/2}\leq\big(\sum_{i=0}^{j}(1\vee\mathbb{H}(\delta_{i}))\big)^{1/2}\leq\sum_{i=0}^{j}(1\vee\mathbb{H}(\delta_{i}))^{1/2}$, thus

$$\begin{aligned}
\sum_{j=0}^{\tau}\delta_{j}\sqrt{1\vee\log N_{j+1}} &\leq \sum_{j=0}^{\infty}\delta_{j}\sum_{i=0}^{j}\sqrt{1\vee\mathbb{H}(\delta_{i+1})}\leq\sum_{i=0}^{\infty}\Big(\sum_{j=i}^{\infty}\delta_{j}\Big)\sqrt{1\vee\mathbb{H}(\delta_{i+1})}\\
&= 2\sum_{i=0}^{\infty}\delta_{i}\sqrt{1\vee\mathbb{H}(\delta_{i+1})}\leq 4\sum_{i=0}^{\infty}\delta_{i+1}\sqrt{1\vee\mathbb{H}(\delta_{i+1})}.
\end{aligned}\tag{7.45}$$

Since $\mathbb{H}(\cdot)$ is decreasing and $\delta_{i}-\delta_{i+1}=\delta_{i+1}$, we obtain

$$\begin{aligned}
\sum_{i=0}^{\infty}\delta_{i+1}\sqrt{1\vee\mathbb{H}(\delta_{i+1})} &\leq \sum_{i=0}^{\infty}\delta_{i}\sqrt{1\vee\mathbb{H}(\delta_{i})}=2\sum_{i=0}^{\infty}\delta_{i+1}\sqrt{1\vee\mathbb{H}(\delta_{i})}\\
&= 2\sum_{i=0}^{\infty}\int_{\delta_{i+1}}^{\delta_{i}}\sqrt{1\vee\mathbb{H}(\delta_{i})}\,d\varepsilon\\
&\leq 2\sum_{i=0}^{\infty}\int_{\delta_{i+1}}^{\delta_{i}}\sqrt{1\vee\mathbb{H}(\varepsilon)}\,d\varepsilon=2\int_{0}^{\sigma}\sqrt{1\vee\mathbb{H}(\varepsilon)}\,d\varepsilon=2\cdot I(\sigma).
\end{aligned}\tag{7.46}$$

Inserting (7.46) into (7.45) and then into (7.44), we obtain the result. ∎
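The comparison between the dyadic sum and the entropy integral in (7.46) can be checked numerically. The following Python sketch assumes the dyadic choice $\delta_{i}=2^{-i}\sigma$ (consistent with $\delta_{i}=2\delta_{i+1}$ used above) and a hypothetical entropy $\mathbb{H}(\varepsilon)=\varepsilon^{-1}$, which is chosen for illustration only and is not taken from the paper. For $\sigma\leq 1$ this choice gives $I(\sigma)=2\sqrt{\sigma}$ in closed form. The function names are likewise illustrative.

```python
import math


def entropy(eps: float) -> float:
    # Hypothetical bracketing entropy H(eps) = 1/eps (decreasing in eps).
    return 1.0 / eps


def dyadic_sum(sigma: float, terms: int = 200) -> float:
    # Truncated sum over i >= 0 of delta_i * sqrt(1 v H(delta_i)),
    # with delta_i = sigma * 2^{-i}.
    total = 0.0
    for i in range(terms):
        delta = sigma * 2.0 ** (-i)
        total += delta * math.sqrt(max(1.0, entropy(delta)))
    return total


def entropy_integral(sigma: float) -> float:
    # I(sigma) = int_0^sigma sqrt(1 v H(eps)) d(eps).
    # For H(eps) = 1/eps and sigma <= 1 this equals 2 * sqrt(sigma).
    if sigma > 1.0:
        raise ValueError("closed form only valid for sigma <= 1")
    return 2.0 * math.sqrt(sigma)


sigma = 0.5
lhs = dyadic_sum(sigma)
rhs = 2.0 * entropy_integral(sigma)
print(lhs <= rhs, lhs, rhs)
```

For this particular entropy the left-hand side is a geometric series, so the bound $\sum_{i}\delta_{i}\sqrt{1\vee\mathbb{H}(\delta_{i})}\leq 2I(\sigma)$ can also be verified by hand.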

Proof of Corollary 4.5.

Define $\tilde{\mathcal{F}}:=\{f-g:f,g\in\mathcal{F}\}$. It is easily seen that $\mathbb{N}(\varepsilon,\tilde{\mathcal{F}},V_{n})\leq\mathbb{N}(\frac{\varepsilon}{2},\mathcal{F},V_{n})^{2}$ (cf. [43], Theorem 19.5), thus

(ε,~,Vn)2(ε2,,Vn)\mathbb{H}(\varepsilon,\tilde{\mathcal{F}},V_{n})\leq 2\mathbb{H}(\frac{\varepsilon}{2},\mathcal{F},V_{n}) (7.47)

Let σ>0\sigma>0. Define

F(z,u):=2Dn(u)F¯(z,u),F¯(z,u):=supf|f¯(z,u)|.F(z,u):=2D_{n}^{\infty}(u)\cdot\bar{F}(z,u),\quad\quad\bar{F}(z,u):=\sup_{f\in\mathcal{F}}|\bar{f}(z,u)|.

Then obviously, FF is an envelope function of ~\tilde{\mathcal{F}}.

By Markov’s inequality, Theorem 4.4 and (7.47),

(supVn(fg)σ,f,g|𝔾n(f)𝔾n(g)|η)\displaystyle\mathbb{P}\Big{(}\sup_{V_{n}(f-g)\leq\sigma,\,f,g\in\mathcal{F}}|\mathbb{G}_{n}(f)-\mathbb{G}_{n}(g)|\geq\eta\Big{)}
\displaystyle\leq 1η𝔼supVn(fg)σ,f,g|𝔾n(f)𝔾n(g)|\displaystyle\frac{1}{\eta}\mathbb{E}\sup_{V_{n}(f-g)\leq\sigma,\,f,g\in\mathcal{F}}|\mathbb{G}_{n}(f)-\mathbb{G}_{n}(g)|
=\displaystyle= 1η𝔼supf~~,Vn(f~)σ|𝔾n(f~)|\displaystyle\frac{1}{\eta}\mathbb{E}\sup_{\tilde{f}\in\tilde{\mathcal{F}},V_{n}(\tilde{f})\leq\sigma}|\mathbb{G}_{n}(\tilde{f})|
\displaystyle\leq c~η[(1+𝔻n𝔻n+𝔻n𝔻n)0σ1(ε,~,Vn)dε+nF𝟙{F>14m(n,σ,(σ2))}1]\displaystyle\frac{\tilde{c}}{\eta}\Big{[}(1+\frac{\mathbb{D}_{n}^{\infty}}{\mathbb{D}_{n}}+\frac{\mathbb{D}_{n}}{\mathbb{D}_{n}^{\infty}})\int_{0}^{\sigma}\sqrt{1\vee\mathbb{H}(\varepsilon,\tilde{\mathcal{F}},V_{n})}d\varepsilon+\sqrt{n}\big{\|}F\mathbbm{1}_{\{F>\frac{1}{4}m(n,\sigma,\mathbb{N}(\frac{\sigma}{2}))\}}\big{\|}_{1}\Big{]}
\displaystyle\leq c~η[22(1+𝔻n𝔻n+𝔻n𝔻n)0σ/21(u,,Vn)du+41(σ2)r(σ𝔻n)F2𝟙{F>14n1/2r(σ)1(σ2)}1,n].\displaystyle\frac{\tilde{c}}{\eta}\Big{[}2\sqrt{2}(1+\frac{\mathbb{D}_{n}^{\infty}}{\mathbb{D}_{n}}+\frac{\mathbb{D}_{n}}{\mathbb{D}_{n}^{\infty}})\int_{0}^{\sigma/2}\sqrt{1\vee\mathbb{H}(u,\mathcal{F},V_{n})}du+\frac{4\sqrt{1\vee\mathbb{H}(\frac{\sigma}{2})}}{r(\frac{\sigma}{\mathbb{D}_{n}})}\big{\|}F^{2}\mathbbm{1}_{\{F>\frac{1}{4}n^{1/2}\frac{r(\sigma)}{\sqrt{1\vee\mathbb{H}(\frac{\sigma}{2})}}\}}\big{\|}_{1,n}\Big{]}.

The first term converges to 0 by (4.7) and (4.8) for σ0\sigma\to 0 (uniformly in nn).

We now discuss the second term. The continuity conditions from Assumption 4.2 and Assumption 3.2 transfer to F¯\bar{F} by the inequality

$$|\bar{F}(z_{1},u_{1})-\bar{F}(z_{2},u_{2})|=\Big|\sup_{f\in\mathcal{F}}|\bar{f}(z_{1},u_{1})|-\sup_{f\in\mathcal{F}}|\bar{f}(z_{2},u_{2})|\Big|\leq\sup_{f\in\mathcal{F}}|\bar{f}(z_{1},u_{1})-\bar{f}(z_{2},u_{2})|.$$

We therefore have as in Lemma 7.8(ii) that for all u,u1,u2,v1,v2[0,1]u,u_{1},u_{2},v_{1},v_{2}\in[0,1],

$$\|\bar{F}(Z_{i},u)-\bar{F}(\tilde{Z}_{i}(\tfrac{i}{n}),u)\|_{2}\leq C_{cont}\cdot n^{-\alpha s},\tag{7.48}$$

$$\|\bar{F}(\tilde{Z}_{i}(v_{1}),u_{1})-\bar{F}(\tilde{Z}_{i}(v_{2}),u_{2})\|_{2}\leq C_{cont}\cdot\big(|v_{1}-v_{2}|^{\alpha s}+|u_{1}-u_{2}|^{\alpha s}\big).\tag{7.49}$$

Put cn=18n1/2supi=1,,nDn(in)r(σ)1(σ2)c_{n}=\frac{1}{8}\frac{n^{1/2}}{\sup_{i=1,...,n}D_{n}^{\infty}(\frac{i}{n})}\frac{r(\sigma)}{\sqrt{1\vee\mathbb{H}(\frac{\sigma}{2})}}. Then by Lemma 7.6(ii) and (7.48),

F2𝟙{F>14n1/2r(σ)1(σ2)}1,n\displaystyle\|F^{2}\mathbbm{1}_{\{F>\frac{1}{4}n^{1/2}\frac{r(\sigma)}{\sqrt{1\vee\mathbb{H}(\frac{\sigma}{2})}}\}}\|_{1,n} (7.50)
\displaystyle\leq 4ni=1nDn(in)2𝔼[F¯(Zi,in)2𝟙{|F¯(Zi,in)|>cn}]\displaystyle\frac{4}{n}\sum_{i=1}^{n}D_{n}^{\infty}(\frac{i}{n})^{2}\cdot\mathbb{E}\Big{[}\bar{F}(Z_{i},\frac{i}{n})^{2}\mathbbm{1}_{\{|\bar{F}(Z_{i},\frac{i}{n})|>c_{n}\}}\Big{]}
\displaystyle\leq 16ni=1nDn(in)2𝔼[F¯(Z~i(in),in)2𝟙{|F¯(Z~i(in),in)|>cn}]\displaystyle\frac{16}{n}\sum_{i=1}^{n}D_{n}^{\infty}(\frac{i}{n})^{2}\cdot\mathbb{E}\Big{[}\bar{F}(\tilde{Z}_{i}(\frac{i}{n}),\frac{i}{n})^{2}\mathbbm{1}_{\{|\bar{F}(\tilde{Z}_{i}(\frac{i}{n}),\frac{i}{n})|>c_{n}\}}\Big{]}
+16Ccontnαs(𝔻n)2.\displaystyle\quad\quad\quad\quad+16C_{cont}\cdot n^{-\alpha s}\cdot(\mathbb{D}_{n}^{\infty})^{2}.

Put W~i(u):=F¯(Z~i(u),u)\tilde{W}_{i}(u):=\bar{F}(\tilde{Z}_{i}(u),u) and an(u):=(Dn(u))2a_{n}(u):=(D_{n}^{\infty}(u))^{2}. By (7.49), W~i(u1)W~i(u2)22Ccont|u1u2|αs\|\tilde{W}_{i}(u_{1})-\tilde{W}_{i}(u_{2})\|_{2}\leq 2C_{cont}|u_{1}-u_{2}|^{\alpha s}. By the assumptions on Df,n()D_{f,n}(\cdot), cnc_{n}\to\infty and lim supn1ni=1n|an(in)|=lim supn(𝔻n)2<\limsup_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}|a_{n}(\frac{i}{n})|=\limsup_{n\to\infty}(\mathbb{D}_{n}^{\infty})^{2}<\infty. We conclude with Lemma 7.7(i) that

16ni=1nDn(in)2𝔼[F¯(Z~i(in),in)2𝟙{|F¯(Z~i(in),in)|>cn}]0,\frac{16}{n}\sum_{i=1}^{n}D_{n}^{\infty}(\frac{i}{n})^{2}\cdot\mathbb{E}\Big{[}\bar{F}(\tilde{Z}_{i}(\frac{i}{n}),\frac{i}{n})^{2}\mathbbm{1}_{\{|\bar{F}(\tilde{Z}_{i}(\frac{i}{n}),\frac{i}{n})|>c_{n}\}}\Big{]}\to 0,

that is, the first summand in (7.50) tends to 0. Since lim supn𝔻n<\limsup_{n\to\infty}\mathbb{D}_{n}^{\infty}<\infty, we obtain that (7.50) tends to 0. ∎

7.6 Proofs of Section 7.2

Proof of Lemma 7.1.
  1. (i)

    Since |x1|+|x2|m|x_{1}|+|x_{2}|\leq m implies |x1|,|x2|m|x_{1}|,|x_{2}|\leq m, we have

    I:=|φm(x1+x2+x3)φm(x1)φm(x2)|=|φm(x1+x2+x3)x1x2|.I:=\big{|}\varphi_{m}^{\wedge}(x_{1}+x_{2}+x_{3})-\varphi_{m}^{\wedge}(x_{1})-\varphi_{m}^{\wedge}(x_{2})\big{|}=\big{|}\varphi_{m}^{\wedge}(x_{1}+x_{2}+x_{3})-x_{1}-x_{2}|.

    Case 1: x1+x2+x3>mx_{1}+x_{2}+x_{3}>m. Then, since |x1|+|x2|m|x_{1}|+|x_{2}|\leq m, we have I=|mx1x2|=mx1x2<x3|x3|I=|m-x_{1}-x_{2}|=m-x_{1}-x_{2}<x_{3}\leq|x_{3}|.
    Case 2: x1+x2+x3[m,m]x_{1}+x_{2}+x_{3}\in[-m,m]. Then I=|x1+x2+x3x1x2|=|x3|I=|x_{1}+x_{2}+x_{3}-x_{1}-x_{2}|=|x_{3}|.
    Case 3: x1+x2+x3<mx_{1}+x_{2}+x_{3}<-m. Then, since |x1|+|x2|m|x_{1}|+|x_{2}|\leq m, we have I=|mx1x2|=m+x1+x2<x3|x3|I=|-m-x_{1}-x_{2}|=m+x_{1}+x_{2}<-x_{3}\leq|x_{3}|.
    Furthermore, $I\leq|\varphi_{m}^{\wedge}(x_{1}+x_{2}+x_{3})|+|x_{1}+x_{2}|\leq m+m=2m$.

  2. (ii)

    The first assertion is obvious. If |x|y|x|\leq y, we have

    |φm(x)|\displaystyle|\varphi_{m}^{\vee}(x)| =\displaystyle= {xm,x>m0,x[m,m]xm,x<m={|x|m,x>m0,x[m,m]|x|m,x<m=(|x|m)𝟙|x|>m\displaystyle\begin{cases}x-m,&x>m\\ 0,&x\in[-m,m]\\ -x-m,&x<-m\end{cases}=\begin{cases}|x|-m,&x>m\\ 0,&x\in[-m,m]\\ |x|-m,&x<-m\end{cases}=(|x|-m)\mathbbm{1}_{|x|>m}
    \displaystyle\leq (ym)𝟙y>m=(ym)0=(ym)𝟙{ym>0}y𝟙y>m,\displaystyle(y-m)\mathbbm{1}_{y>m}=(y-m)\vee 0=(y-m)\mathbbm{1}_{\{y-m>0\}}\leq y\mathbbm{1}_{y>m},

    which shows the second assertion.

  3. (iii)

    We will show that for all z,zz,z^{\prime}\in\mathbb{R}^{\mathbb{N}} it holds that

    |φm(f)(z)φm(f)(z)||f(z)f(z)|,|φm(f)(z)φm(f)(z)||f(z)f(z)||\varphi_{m}^{\wedge}(f)(z)-\varphi_{m}^{\wedge}(f)(z^{\prime})|\leq|f(z)-f(z^{\prime})|,\quad\quad|\varphi_{m}^{\vee}(f)(z)-\varphi_{m}^{\vee}(f)(z^{\prime})|\leq|f(z)-f(z^{\prime})| (7.51)

    from which the assertion follows. For real numbers ai,bia_{i},b_{i}, we have

    maxi{ai}=maxi{aibi+bi}maxi{aibi}+maxi{bi},\max_{i}\{a_{i}\}=\max_{i}\{a_{i}-b_{i}+b_{i}\}\leq\max_{i}\{a_{i}-b_{i}\}+\max_{i}\{b_{i}\},

    thus |maxi{ai}maxi{bi}|maxi|aibi||\max_{i}\{a_{i}\}-\max_{i}\{b_{i}\}|\leq\max_{i}|a_{i}-b_{i}|. This implies |max{a,y}max{a,y}||yy||\max\{a,y\}-\max\{a,y^{\prime}\}|\leq|y-y^{\prime}| and therefore

    |φm(f)(z)φm(f)(z)|\displaystyle|\varphi_{m}^{\wedge}(f)(z)-\varphi_{m}^{\wedge}(f)(z^{\prime})| =\displaystyle= |(m)(f(z)m)(m)(f(z)m)||f(z)mf(z)m|\displaystyle|(-m)\vee(f(z)\wedge m)-(-m)\vee(f(z^{\prime})\wedge m)|\leq|f(z)\wedge m-f(z^{\prime})\wedge m|
    =\displaystyle= |(f(z))(m)(f(z))(m)||f(z)f(z)|.\displaystyle|(-f(z^{\prime}))\vee(-m)-(-f(z))\vee(-m)|\leq|f(z)-f(z^{\prime})|.

    For the second inequality in (7.51), note that

    φm(f)(z)=(f(z)m)0+(f(z)+m)0.\varphi_{m}^{\vee}(f)(z)=(f(z)-m)\vee 0+(f(z)+m)\wedge 0.

    We therefore have

    |φm(f)(z)φm(f)(z)|=|(f(z)m)0(f(z)m)0+(f(z)+m)0(f(z)+m)0|.|\varphi_{m}^{\vee}(f)(z)-\varphi_{m}^{\vee}(f)(z^{\prime})|=\big{|}(f(z)-m)\vee 0-(f(z^{\prime})-m)\vee 0+(f(z)+m)\wedge 0-(f(z^{\prime})+m)\wedge 0|.

    If f(z),f(z)mf(z),f(z^{\prime})\geq m, then

    |φm(f)(z)φm(f)(z)||(f(z)m)0(f(z)m)0||f(z)f(z)|.|\varphi_{m}^{\vee}(f)(z)-\varphi_{m}^{\vee}(f)(z^{\prime})|\leq\big{|}(f(z)-m)\vee 0-(f(z^{\prime})-m)\vee 0|\leq|f(z)-f(z^{\prime})|.

    A similar result is obtained for f(z),f(z)mf(z),f(z^{\prime})\leq-m. If f(z)mf(z)\geq m, f(z)<mf(z^{\prime})<m, then

    |φm(f)(z)φm(f)(z)|\displaystyle|\varphi_{m}^{\vee}(f)(z)-\varphi_{m}^{\vee}(f)(z^{\prime})|
    \displaystyle\leq |(f(z)m)(f(z)+m)0|\displaystyle\big{|}(f(z)-m)-(f(z^{\prime})+m)\wedge 0|
    =\displaystyle= {|f(z)f(z)2m|=f(z)f(z)2mf(z)f(z),f(z)m,|f(z)m|=f(z)mf(z)f(z),f(z)>m.\displaystyle\begin{cases}|f(z)-f(z^{\prime})-2m|=f(z)-f(z^{\prime})-2m\leq f(z)-f(z^{\prime}),&f(z^{\prime})\leq-m,\\ |f(z)-m|=f(z)-m\leq f(z)-f(z^{\prime}),&f(z^{\prime})>-m\end{cases}.

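    A similar result is obtained for $f(z)\leq-m$, $f(z^{\prime})>-m$, and the remaining mixed cases follow by interchanging the roles of $z$ and $z^{\prime}$. If $f(z),f(z^{\prime})\in(-m,m)$, the left-hand side vanishes. This proves (7.51).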
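The three assertions of Lemma 7.1 are elementary and can be sanity-checked on random inputs. The following Python sketch uses the representations $\varphi_{m}^{\wedge}(x)=(-m)\vee(x\wedge m)$ and $\varphi_{m}^{\vee}(x)=(x-m)\vee 0+(x+m)\wedge 0$ that appear in the proof above. The function names, sampling ranges and tolerance are illustrative assumptions, and a random check is of course no substitute for the proof.

```python
import random


def phi_wedge(x: float, m: float) -> float:
    # Truncation at level m: (-m) v (x ^ m).
    return max(-m, min(x, m))


def phi_vee(x: float, m: float) -> float:
    # Remainder after truncation: (x - m) v 0 + (x + m) ^ 0.
    return max(x - m, 0.0) + min(x + m, 0.0)


def check_lemma_7_1(trials: int = 20000, seed: int = 0, tol: float = 1e-12) -> bool:
    rng = random.Random(seed)
    for _ in range(trials):
        m = rng.uniform(0.1, 3.0)

        # (i): for |x1| + |x2| <= m the error I is bounded by min(|x3|, 2m).
        x1 = rng.uniform(-m, m)
        slack = m - abs(x1)
        x2 = rng.uniform(-slack, slack)
        x3 = rng.uniform(-5.0, 5.0)
        i_val = abs(phi_wedge(x1 + x2 + x3, m) - phi_wedge(x1, m) - phi_wedge(x2, m))
        if i_val > min(abs(x3), 2.0 * m) + tol:
            return False

        # (ii): |x| <= y implies |phi_vee(x)| <= y * 1{y > m}.
        x = rng.uniform(-5.0, 5.0)
        y = abs(x) + rng.uniform(0.0, 1.0)
        bound = y if y > m else 0.0
        if abs(phi_vee(x, m)) > bound + tol:
            return False

        # (iii): both maps are 1-Lipschitz, which is the content of (7.51).
        a, b = rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)
        if abs(phi_wedge(a, m) - phi_wedge(b, m)) > abs(a - b) + tol:
            return False
        if abs(phi_vee(a, m) - phi_vee(b, m)) > abs(a - b) + tol:
            return False
    return True


print(check_lemma_7_1())
```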

Lemma 7.5 (Properties of r()r(\cdot)).

$r(\cdot)$ is well-defined and for each $a>0$, $\frac{r(a)}{2}\geq r(\frac{a}{2})$ and $r(a)\leq a$.

Proof.

$q^{*}(\cdot)$ and $r(\cdot)$ are well-defined since $\beta_{norm}(\cdot)$ is decreasing (at a rate $\ll q^{-1}$), $r\mapsto q^{*}(r)r$ is increasing (at a rate $\ll r$) and $\lim_{r\downarrow 0}q^{*}(r)r=0$.

Let a>0a>0. We show that r=2r(a2)r=2r(\frac{a}{2}) fulfills q(r)raq^{*}(r)r\leq a. By definition of r(a)r(a), we obtain r(a)r=2r(a2)r(a)\geq r=2r(\frac{a}{2}) which gives the result. Since βnorm\beta_{norm} is decreasing, qq^{*} is decreasing. We conclude that

q(r)r=2q(2r(a2))r(a2)2q(r(a2))r(a2)2a2=a.q^{*}(r)r=2\cdot q^{*}(2r(\frac{a}{2}))r(\frac{a}{2})\leq 2\cdot q^{*}(r(\frac{a}{2}))r(\frac{a}{2})\leq 2\cdot\frac{a}{2}=a.

The second inequality r(a)ar(a)\leq a follows from the fact that q(r)rq^{*}(r)r is increasing and q(a)aaq^{*}(a)a\geq a. ∎
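The halving property of Lemma 7.5 can be illustrated on a grid. The definitions of $q^{*}$ and $r$ are not restated in this section, so the following Python sketch rests on assumptions: it takes $q^{*}(x)=\min\{q\in\mathbb{N}:\beta_{norm}(q)\leq qx\}$ and $r(a)=\sup\{r>0:q^{*}(r)r\leq a\}$, together with a hypothetical polynomial decay $\beta_{norm}(q)=q^{-3/2}$. The supremum is approximated from below on the grid $r=ak/K$, $k=1,\dots,K$.

```python
def beta_norm(q: int, alpha: float = 1.5) -> float:
    # Hypothetical polynomially decaying dependence tail (illustration only).
    return float(q) ** (-alpha)


def q_star(r: float) -> int:
    # Assumed definition: smallest natural number q with beta_norm(q) <= q * r.
    q = 1
    while beta_norm(q) > q * r:
        q += 1
    return q


def r_of(a: float, grid: int = 1000) -> float:
    # Grid approximation (from below) of sup{r > 0 : q_star(r) * r <= a}.
    # Since q_star >= 1, every feasible r satisfies r <= a, so the search
    # is restricted to the grid points r = a * k / grid in (0, a].
    best = 0.0
    for k in range(1, grid + 1):
        r = a * k / grid
        if q_star(r) * r <= a:
            best = r
    return best


for a in (0.05, 0.3, 1.0):
    print(a, r_of(a), 2.0 * r_of(a / 2.0))
```

The grids for $a$ and $a/2$ are chosen so that doubling a grid point for $a/2$ yields a grid point for $a$. This mirrors the argument of the proof, which shows that $2r(\frac{a}{2})$ is feasible for $a$.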

7.7 Proofs of Section 3

Proof of Theorem 3.4.

Denote Wi(f):=f(Zi,in)W_{i}(f):=f(Z_{i},\frac{i}{n}) and 𝕎i:=(Wi(f1),,Wi(fm))\mathbb{W}_{i}:=(W_{i}(f_{1}),...,W_{i}(f_{m}))^{\prime}. Let a=(a1,,am)m\{0}a=(a_{1},...,a_{m})^{\prime}\in\mathbb{R}^{m}\backslash\{0\}. We use the decomposition

1ni=1na(𝕎i𝔼𝕎i)=j=01ni=1naPij𝕎i.\displaystyle\frac{1}{\sqrt{n}}\sum_{i=1}^{n}a^{\prime}(\mathbb{W}_{i}-\mathbb{E}\mathbb{W}_{i})=\sum_{j=0}^{\infty}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}a^{\prime}P_{i-j}\mathbb{W}_{i}.

For fixed J{}J\in\mathbb{N}\cup\{\infty\}, put

$$(S_{n}(J)_{k})_{k=1,\dots,m}:=S_{n}(J):=\sum_{j=0}^{J-1}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}P_{i-j}\mathbb{W}_{i}.$$

Then, since PijWi(fk)P_{i-j}W_{i}(f_{k}), i=1,,ni=1,...,n is a martingale difference sequence and by Lemma 7.8(i),

Sn()kSn(J)k2\displaystyle\|S_{n}(\infty)_{k}-S_{n}(J)_{k}\|_{2} \displaystyle\leq j=J1ni=1nPijWi(fk)2=j=J(1ni=1nPijWi(fk)22)1/2\displaystyle\sum_{j=J}^{\infty}\big{\|}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}P_{i-j}W_{i}(f_{k})\big{\|}_{2}=\sum_{j=J}^{\infty}\Big{(}\frac{1}{n}\sum_{i=1}^{n}\|P_{i-j}W_{i}(f_{k})\|_{2}^{2}\Big{)}^{1/2}
\displaystyle\leq (1ni=1nDfk,2,n(in)2)1/2j=JΔ(j),\displaystyle\Big{(}\frac{1}{n}\sum_{i=1}^{n}D_{f_{k},2,n}(\frac{i}{n})^{2}\Big{)}^{1/2}\cdot\sum_{j=J}^{\infty}\Delta(j),

thus

lim supJ,nSn()kSn(J)k2supn(1ni=1nDfk,2,n(in)2)1/2lim supJj=JΔ(j)=0.\limsup_{J,n\to\infty}\|S_{n}(\infty)_{k}-S_{n}(J)_{k}\|_{2}\leq\sup_{n\in\mathbb{N}}\Big{(}\frac{1}{n}\sum_{i=1}^{n}D_{f_{k},2,n}(\frac{i}{n})^{2}\Big{)}^{1/2}\cdot\limsup_{J\to\infty}\sum_{j=J}^{\infty}\Delta(j)=0. (7.52)

Define

(Sn(J)k)k=1,,m:=Sn(J):=1ni=1nJ+1j=0J1Pi𝕎i+j.(S_{n}^{\circ}(J)_{k})_{k=1,...,m}:=S_{n}^{\circ}(J):=\frac{1}{\sqrt{n}}\sum_{i=1}^{n-J+1}\sum_{j=0}^{J-1}P_{i}\mathbb{W}_{i+j}.

Then we have

Sn(J)kSn(J)k2\displaystyle\|S_{n}^{\circ}(J)_{k}-S_{n}(J)_{k}\|_{2} \displaystyle\leq j=0J11ni=1jPijWi(fk)2+1nj=0J1i=nJ+j+1nPijWi(fk)2\displaystyle\sum_{j=0}^{J-1}\|\frac{1}{\sqrt{n}}\sum_{i=1}^{j}P_{i-j}W_{i}(f_{k})\|_{2}+\frac{1}{\sqrt{n}}\sum_{j=0}^{J-1}\|\sum_{i=n-J+j+1}^{n}P_{i-j}W_{i}(f_{k})\|_{2}
\displaystyle\leq 2J2nsupi=1,,n+jPijWi(fk)2\displaystyle\frac{2J^{2}}{\sqrt{n}}\cdot\sup_{i=1,...,n+j}\|P_{i-j}W_{i}(f_{k})\|_{2}
\displaystyle\leq 2J2nsupi=1,,n+jfk(Zi,in)2.\displaystyle\frac{2J^{2}}{\sqrt{n}}\cdot\sup_{i=1,...,n+j}\|f_{k}(Z_{i},\frac{i}{n})\|_{2}.

By Lemma 7.8(i),

supi=1,,n+jfk(Zi,in)2CΔ,2D2,n(in),\sup_{i=1,...,n+j}\|f_{k}(Z_{i},\frac{i}{n})\|_{2}\leq C_{\Delta,2}\cdot D_{2,n}(\frac{i}{n}),

which gives

limnSn(J)kSn(J)k2=0.\lim_{n\to\infty}\|S_{n}^{\circ}(J)_{k}-S_{n}(J)_{k}\|_{2}=0. (7.53)

Stationary approximation: Put S~n(J)=(S~n(J)k)k=1,,m\tilde{S}_{n}^{\circ}(J)=(\tilde{S}_{n}^{\circ}(J)_{k})_{k=1,...,m}, where

S~n(J)k:=1ni=1nJ+1j=0J1Pifk(Z~i+j(in),in).\tilde{S}_{n}^{\circ}(J)_{k}:=\frac{1}{\sqrt{n}}\sum_{i=1}^{n-J+1}\sum_{j=0}^{J-1}P_{i}f_{k}(\tilde{Z}_{i+j}(\frac{i}{n}),\frac{i}{n}).

Then we have

Sn(J)kS~n(J)k2\displaystyle\|S_{n}^{\circ}(J)_{k}-\tilde{S}_{n}^{\circ}(J)_{k}\|_{2}
\displaystyle\leq j=0J1(1ni=1nJ+1Pifk(Zi+j,i+jn)Pifk(Z~i+j(in),in)22)1/2.\displaystyle\sum_{j=0}^{J-1}\Big{(}\frac{1}{n}\sum_{i=1}^{n-J+1}\Big{\|}P_{i}f_{k}(Z_{i+j},\frac{i+j}{n})-P_{i}f_{k}(\tilde{Z}_{i+j}(\frac{i}{n}),\frac{i}{n})\Big{\|}_{2}^{2}\Big{)}^{1/2}.

For each j,kj,k, it holds that

$$\begin{aligned}
&\frac{1}{n}\sum_{i=1}^{n-J+1}\Big\|P_{i}f_{k}(Z_{i+j},\tfrac{i+j}{n})-P_{i}f_{k}(\tilde{Z}_{i+j}(\tfrac{i}{n}),\tfrac{i}{n})\Big\|_{2}^{2}\\
&\leq \frac{2}{n}\sum_{i=1}^{n-J+1}\Big(D_{f_{k},n}(\tfrac{i+j}{n})-D_{f_{k},n}(\tfrac{i}{n})\Big)^{2}\cdot\sup_{i}\|\bar{f}_{k}(Z_{i+j},\tfrac{i+j}{n})\|_{2}^{2}\\
&\quad+\frac{2}{n}\sum_{i=1}^{n-J+1}D_{f_{k},n}(\tfrac{i}{n})^{2}\cdot\sup_{i}\Big\|\bar{f}_{k}(Z_{i+j},\tfrac{i+j}{n})-\bar{f}_{k}(\tilde{Z}_{i+j}(\tfrac{i}{n}),\tfrac{i}{n})\Big\|_{2}^{2}.
\end{aligned}$$

By Lemma 7.8, we have supif¯(Zi+j,i+jn)22<\sup_{i}\|\bar{f}(Z_{i+j},\frac{i+j}{n})\|_{2}^{2}<\infty. Since 1nDfk,n()\frac{1}{\sqrt{n}}D_{f_{k},n}(\cdot) has bounded variation uniformly in nn,

1ni=1nJ+1(Dfk,n(i+jn)Dfk,n(in))2supi=1,,n1nDfk,n(in)1ni=1nJ+1|Dfk,n(i+jn)Dfk,n(in)|0.\frac{1}{n}\sum_{i=1}^{n-J+1}\Big{(}D_{f_{k},n}(\frac{i+j}{n})-D_{f_{k},n}(\frac{i}{n})\Big{)}^{2}\leq\sup_{i=1,...,n}\frac{1}{\sqrt{n}}D_{f_{k},n}(\frac{i}{n})\cdot\frac{1}{\sqrt{n}}\sum_{i=1}^{n-J+1}\Big{|}D_{f_{k},n}(\frac{i+j}{n})-D_{f_{k},n}(\frac{i}{n})\Big{|}\to 0.

By Lemma 7.8(ii),

supif¯k(Zi+j,i+jn)f¯k(Z~i+j(in),in)20.\displaystyle\sup_{i}\Big{\|}\bar{f}_{k}(Z_{i+j},\frac{i+j}{n})-\bar{f}_{k}(\tilde{Z}_{i+j}(\frac{i}{n}),\frac{i}{n})\Big{\|}_{2}\to 0.

We therefore obtain

Sn(J)kS~n(J)k20.\|S_{n}^{\circ}(J)_{k}-\tilde{S}_{n}^{\circ}(J)_{k}\|_{2}\to 0. (7.54)

Note that

$$M_{i,k}:=\frac{1}{\sqrt{n}}\sum_{j=0}^{J-1}P_{i}f_{k}(\tilde{Z}_{i+j}(\tfrac{i}{n}),\tfrac{i}{n}),\quad i=1,\dots,n$$

is a martingale difference sequence with respect to 𝒢i1\mathcal{G}_{i-1}, and

S~n(J)k=i=1nJ+1Mi,k.\tilde{S}_{n}^{\circ}(J)_{k}=\sum_{i=1}^{n-J+1}M_{i,k}.

We can therefore apply a central limit theorem for martingale difference sequences to aS~n(J)=i=1nJ+1(k=1makMi,k)a^{\prime}\tilde{S}_{n}^{\circ}(J)=\sum_{i=1}^{n-J+1}(\sum_{k=1}^{m}a_{k}M_{i,k}).

The Lindeberg condition: Let ς>0\varsigma>0. Iterated application of Lemma 7.6(i) yields that there are constants c1,c2>0c_{1},c_{2}>0 only depending on m,Jm,J such that

i=1nJ+1𝔼[(k=1makMi,k)2𝟙{|k=1makMi,k|>ςn}]\displaystyle\sum_{i=1}^{n-J+1}\mathbb{E}[(\sum_{k=1}^{m}a_{k}M_{i,k})^{2}\mathbbm{1}_{\{|\sum_{k=1}^{m}a_{k}M_{i,k}|>\varsigma\sqrt{n}\}}]
\displaystyle\leq c1l=0,1j=0J1k=1m|ak|21ni=1nJ𝔼[𝔼[fk(Z~i+j(in),in)|𝒢il]2𝟙{|𝔼[fk(Z~i+j(in),in)|𝒢il]|>nςc2|a|}].\displaystyle c_{1}\sum_{l=0,1}\sum_{j=0}^{J-1}\sum_{k=1}^{m}|a_{k}|^{2}\cdot\frac{1}{n}\sum_{i=1}^{n-J}\mathbb{E}\Big{[}\mathbb{E}[f_{k}(\tilde{Z}_{i+j}(\frac{i}{n}),\frac{i}{n})|\mathcal{G}_{i-l}]^{2}\mathbbm{1}_{\{|\mathbb{E}[f_{k}(\tilde{Z}_{i+j}(\frac{i}{n}),\frac{i}{n})|\mathcal{G}_{i-l}]|>\sqrt{n}\frac{\varsigma}{c_{2}|a|_{\infty}}\}}\Big{]}.

For each l,j,kl,j,k, we have

1ni=1nJ𝔼[𝔼[fk(Z~i+j(in),in)|𝒢il]2𝟙{|𝔼[fk(Z~i+j(in),in)|𝒢il]|>nςc2|a|}]\displaystyle\frac{1}{n}\sum_{i=1}^{n-J}\mathbb{E}\Big{[}\mathbb{E}[f_{k}(\tilde{Z}_{i+j}(\frac{i}{n}),\frac{i}{n})|\mathcal{G}_{i-l}]^{2}\mathbbm{1}_{\{|\mathbb{E}[f_{k}(\tilde{Z}_{i+j}(\frac{i}{n}),\frac{i}{n})|\mathcal{G}_{i-l}]|>\sqrt{n}\frac{\varsigma}{c_{2}|a|_{\infty}}\}}\Big{]} (7.55)
=\displaystyle= 1ni=1nJDfk,n(in)2𝔼[𝔼[f¯k(Z~i(in),in)|𝒢il]2𝟙{|𝔼[f¯k(Z~i(in),in)|𝒢il]|>nsupi=1,,n|Df,n(in)|ςc2|a|}]\displaystyle\frac{1}{n}\sum_{i=1}^{n-J}D_{f_{k},n}(\frac{i}{n})^{2}\mathbb{E}\Big{[}\mathbb{E}[\bar{f}_{k}(\tilde{Z}_{i}(\frac{i}{n}),\frac{i}{n})|\mathcal{G}_{i-l}]^{2}\mathbbm{1}_{\{|\mathbb{E}[\bar{f}_{k}(\tilde{Z}_{i}(\frac{i}{n}),\frac{i}{n})|\mathcal{G}_{i-l}]|>\frac{\sqrt{n}}{\sup_{i=1,...,n}|D_{f,n}(\frac{i}{n})|}\frac{\varsigma}{c_{2}|a|_{\infty}}\}}\Big{]}
=\displaystyle= 1ni=1nJDfk,n(in)2𝔼[W~i(in)2𝟙{|W~i(in)|>cn}],\displaystyle\frac{1}{n}\sum_{i=1}^{n-J}D_{f_{k},n}(\frac{i}{n})^{2}\mathbb{E}\Big{[}\tilde{W}_{i}(\frac{i}{n})^{2}\mathbbm{1}_{\{|\tilde{W}_{i}(\frac{i}{n})|>c_{n}\}}\Big{]},

where we have put

W~i(u):=𝔼[f¯k(Z~i(u),u)|𝒢il],cn:=nsupi=1,,n|Df,n(in)|ςc2|a|.\tilde{W}_{i}(u):=\mathbb{E}[\bar{f}_{k}(\tilde{Z}_{i}(u),u)|\mathcal{G}_{i-l}],\quad\quad c_{n}:=\frac{\sqrt{n}}{\sup_{i=1,...,n}|D_{f,n}(\frac{i}{n})|}\frac{\varsigma}{c_{2}|a|_{\infty}}.

By Lemma 7.8(ii), W~i(u)\tilde{W}_{i}(u) satisfies the assumptions (7.59) of Lemma 7.7. By assumption, cnc_{n}\to\infty. With an(u):=Dfk,n(u)2a_{n}(u):=D_{f_{k},n}(u)^{2}, we obtain from Lemma 7.7 that (7.55) converges to 0, which shows that the Lindeberg condition is satisfied.

Convergence of the variance: We have

$$\begin{aligned}
&\sum_{i=1}^{n-J+1}\mathbb{E}\Big[\Big(\sum_{k=1}^{m}a_{k}M_{i,k}\Big)^{2}\Big|\mathcal{G}_{i-1}\Big]\\
&= \sum_{j_{1},j_{2}=0}^{J-1}\sum_{k,l=1}^{m}a_{k}a_{l}\cdot\frac{1}{n}\sum_{i=1}^{n-J+1}D_{f_{k},n}(\tfrac{i}{n})D_{f_{l},n}(\tfrac{i}{n})\cdot\mathbb{E}\big[P_{i}\bar{f}_{k}(\tilde{Z}_{i+j_{1}}(\tfrac{i}{n}),\tfrac{i}{n})\cdot P_{i}\bar{f}_{l}(\tilde{Z}_{i+j_{2}}(\tfrac{i}{n}),\tfrac{i}{n})\big|\mathcal{G}_{i-1}\big].
\end{aligned}$$

For each $j_{1},j_{2},k,l$, we define

W~i(u):=𝔼[Pif¯k(Z~i+j1(u),u)Pif¯l(Z~i+j2(u),u)|𝒢i1],an(u):=Dfk,n(u)Dfl,n(u).\tilde{W}_{i}(u):=\mathbb{E}\big{[}P_{i}\bar{f}_{k}(\tilde{Z}_{i+j_{1}}(u),u)\cdot P_{i}\bar{f}_{l}(\tilde{Z}_{i+j_{2}}(u),u)|\mathcal{G}_{i-1}\big{]},\quad\quad a_{n}(u):=D_{f_{k},n}(u)D_{f_{l},n}(u).

Then

1ni=1nJ+1Dfk,n(in)Dfl,n(in)𝔼[Pif¯k(Z~i+j1(in),in)Pif¯l(Z~i+j2(in),in)|𝒢i1]\displaystyle\frac{1}{n}\sum_{i=1}^{n-J+1}D_{f_{k},n}(\frac{i}{n})D_{f_{l},n}(\frac{i}{n})\cdot\mathbb{E}\big{[}P_{i}\bar{f}_{k}(\tilde{Z}_{i+j_{1}}(\frac{i}{n}),\frac{i}{n})\cdot P_{i}\bar{f}_{l}(\tilde{Z}_{i+j_{2}}(\frac{i}{n}),\frac{i}{n})|\mathcal{G}_{i-1}\big{]}
=\displaystyle= 1ni=1nJ+1an(in)W~i(in).\displaystyle\frac{1}{n}\sum_{i=1}^{n-J+1}a_{n}(\frac{i}{n})\tilde{W}_{i}(\frac{i}{n}).

By Lemma 7.8(i),(ii), we have

W~0(u)W~0(v)1\displaystyle\|\tilde{W}_{0}(u)-\tilde{W}_{0}(v)\|_{1} \displaystyle\leq f¯k(Z~0(u),u)f¯k(Z~0(v),v)2f¯l(Z~0(u))2\displaystyle\|\bar{f}_{k}(\tilde{Z}_{0}(u),u)-\bar{f}_{k}(\tilde{Z}_{0}(v),v)\|_{2}\cdot\|\bar{f}_{l}(\tilde{Z}_{0}(u))\|_{2}
+f¯l(Z~0(u),u)f¯l(Z~0(v),v)2f¯k(Z~0(v))2\displaystyle\quad\quad+\|\bar{f}_{l}(\tilde{Z}_{0}(u),u)-\bar{f}_{l}(\tilde{Z}_{0}(v),v)\|_{2}\cdot\|\bar{f}_{k}(\tilde{Z}_{0}(v))\|_{2}
\displaystyle\leq 2CcontCf¯|uv|ςs/2\displaystyle 2C_{cont}C_{\bar{f}}\cdot|u-v|^{\varsigma s/2}

Let An:=supi=1,,n|an(in)|A_{n}:=\sup_{i=1,...,n}|a_{n}(\frac{i}{n})|. Since Df,n()Df,n\frac{D_{f,n}(\cdot)}{D_{f,n}^{\infty}} has bounded variation uniformly in nn, it follows that an()An\frac{a_{n}(\cdot)}{A_{n}} has bounded variation uniformly in nn. From Df,nn0\frac{D_{f,n}^{\infty}}{\sqrt{n}}\to 0 we conclude Ann0\frac{A_{n}}{n}\to 0.

By assumption and the Cauchy-Schwarz inequality,

supn[1ni=1n|an(in)|]supn(1ni=1nDfk,n(in)2)1/2(1ni=1nDfl,n(in)2)1/2<.\sup_{n}\Big{[}\frac{1}{n}\sum_{i=1}^{n}|a_{n}(\frac{i}{n})|\Big{]}\leq\sup_{n}\Big{(}\frac{1}{n}\sum_{i=1}^{n}D_{f_{k},n}(\frac{i}{n})^{2}\Big{)}^{1/2}\cdot\Big{(}\frac{1}{n}\sum_{i=1}^{n}D_{f_{l},n}(\frac{i}{n})^{2}\Big{)}^{1/2}<\infty.

It holds that supn(hnAn)supn(hn1/2Dfk,n)supn(hn1/2Dfl,n)<\sup_{n}(h_{n}\cdot A_{n})\leq\sup_{n}(h_{n}^{1/2}D_{f_{k},n}^{\infty})\cdot\sup_{n}(h_{n}^{1/2}D_{f_{l},n}^{\infty})<\infty, and

|vu|>hnDfk,n(u)=0,Dfl,n(u)=0,an(u)=0.|v-u|>h_{n}\quad\Rightarrow\quad D_{f_{k},n}(u)=0,D_{f_{l},n}(u)=0,\quad\Rightarrow\quad a_{n}(u)=0.

Thus, Lemma 7.7(ii) is applicable.

Case 𝕂=1\mathbb{K}=1: If u𝔼[P0f¯k(Z~j1(u),u)P0f¯l(Z~j2(u),u)]u\mapsto\mathbb{E}[P_{0}\bar{f}_{k}(\tilde{Z}_{j_{1}}(u),u)\cdot P_{0}\bar{f}_{l}(\tilde{Z}_{j_{2}}(u),u)] has bounded variation, we have

1ni=1nJ+1Dfk,n(in)Dfl,n(in)𝔼[Pif¯k(Z~i+j1(in),in)Pif¯l(Z~i+j2(in),in)|𝒢i1]\displaystyle\frac{1}{n}\sum_{i=1}^{n-J+1}D_{f_{k},n}(\frac{i}{n})D_{f_{l},n}(\frac{i}{n})\cdot\mathbb{E}\big{[}P_{i}\bar{f}_{k}(\tilde{Z}_{i+j_{1}}(\frac{i}{n}),\frac{i}{n})\cdot P_{i}\bar{f}_{l}(\tilde{Z}_{i+j_{2}}(\frac{i}{n}),\frac{i}{n})|\mathcal{G}_{i-1}\big{]}
𝑝\displaystyle\overset{p}{\to} limn01Dfk,n(u)Dfl,n(u)𝔼[P0f¯k(Z~j1(u),u)P0f¯l(Z~j2(u),u)]du.\displaystyle\lim_{n\to\infty}\int_{0}^{1}D_{f_{k},n}(u)D_{f_{l},n}(u)\cdot\mathbb{E}[P_{0}\bar{f}_{k}(\tilde{Z}_{j_{1}}(u),u)\cdot P_{0}\bar{f}_{l}(\tilde{Z}_{j_{2}}(u),u)]du.

and thus

$$\begin{aligned}
&\sum_{i=1}^{n-J+1}\mathbb{E}\Big[\Big(\sum_{k=1}^{m}a_{k}M_{i,k}\Big)^{2}\Big|\mathcal{G}_{i-1}\Big]\\
&\overset{p}{\to} \sum_{k,l=1}^{m}a_{k}a_{l}\cdot\lim_{n\to\infty}\int_{0}^{1}D_{f_{k},n}(u)D_{f_{l},n}(u)\cdot\sum_{j_{1},j_{2}=0}^{J-1}\mathbb{E}[P_{0}\bar{f}_{k}(\tilde{Z}_{j_{1}}(u),u)\cdot P_{0}\bar{f}_{l}(\tilde{Z}_{j_{2}}(u),u)]\,du\\
&= a^{\prime}\Sigma_{kl}^{(1)}(J)a.
\end{aligned}$$

Here, for f,gf,g\in\mathcal{F}, we have that 𝔼[P0f¯(Z~j1(u),u)P0g¯(Z~j2(u),u)]\mathbb{E}[P_{0}\bar{f}(\tilde{Z}_{j_{1}}(u),u)\cdot P_{0}\bar{g}(\tilde{Z}_{j_{2}}(u),u)] can be written as

$$\begin{aligned}
&\mathbb{E}[P_{0}\bar{f}(\tilde{Z}_{j_{1}}(u),u)\cdot P_{0}\bar{g}(\tilde{Z}_{j_{2}}(u),u)]\\
&= \mathbb{E}\big[\mathbb{E}[\bar{f}(\tilde{Z}_{j_{1}}(u),u)|\mathcal{G}_{0}]\cdot\mathbb{E}[\bar{g}(\tilde{Z}_{j_{2}}(u),u)|\mathcal{G}_{0}]\big]-\mathbb{E}\big[\mathbb{E}[\bar{f}(\tilde{Z}_{j_{1}}(u),u)|\mathcal{G}_{-1}]\cdot\mathbb{E}[\bar{g}(\tilde{Z}_{j_{2}}(u),u)|\mathcal{G}_{-1}]\big],
\end{aligned}$$

which shows that the condition stated in the assumption guarantees the bounded variation of u𝔼[P0f¯(Z~j1(u),u)P0g¯(Z~j2(u),u)]u\mapsto\mathbb{E}[P_{0}\bar{f}(\tilde{Z}_{j_{1}}(u),u)\cdot P_{0}\bar{g}(\tilde{Z}_{j_{2}}(u),u)].

Case 𝕂=2\mathbb{K}=2: If hn0h_{n}\to 0, then we obtain similarly

$$\begin{aligned}
&\sum_{i=1}^{n-J+1}\mathbb{E}\Big[\Big(\sum_{k=1}^{m}a_{k}M_{i,k}\Big)^{2}\Big|\mathcal{G}_{i-1}\Big]\\
&\overset{p}{\to} \sum_{k,l=1}^{m}a_{k}a_{l}\cdot\lim_{n\to\infty}\int_{0}^{1}D_{f_{k},n}(u)D_{f_{l},n}(u)\,du\cdot\sum_{j_{1},j_{2}=0}^{J-1}\mathbb{E}[P_{0}\bar{f}_{k}(\tilde{Z}_{j_{1}}(v),v)\cdot P_{0}\bar{f}_{l}(\tilde{Z}_{j_{2}}(v),v)]\\
&= a^{\prime}\Sigma_{kl}^{(2)}(J)a.
\end{aligned}$$

By the martingale central limit theorem and (7.53), (7.54), we obtain that

aSn(J)𝑑N(0,aΣkl(𝕂)(J)a).a^{\prime}S_{n}(J)\overset{d}{\to}N(0,a^{\prime}\Sigma_{kl}^{(\mathbb{K})}(J)a). (7.56)

Conclusion: For 𝕂{1,2}\mathbb{K}\in\{1,2\}, we have

aΣkl(𝕂)(J)aaΣkl(𝕂)()a(J)a^{\prime}\Sigma_{kl}^{(\mathbb{K})}(J)a\to a^{\prime}\Sigma_{kl}^{(\mathbb{K})}(\infty)a\quad\quad(J\to\infty) (7.57)

due to

j1,j2:max{j1,j2}JP0f¯k(Z~j1(u),u)P0f¯l(Z~j2(u),u)1\displaystyle\sum_{j_{1},j_{2}:\max\{j_{1},j_{2}\}\geq J}\|P_{0}\bar{f}_{k}(\tilde{Z}_{j_{1}}(u),u)\cdot P_{0}\bar{f}_{l}(\tilde{Z}_{j_{2}}(u),u)\|_{1}
\displaystyle\leq j1,j2:max{j1,j2}JP0f¯k(Z~j1(u),u)2P0f¯l(Z~j2(u),u)20(J)\displaystyle\sum_{j_{1},j_{2}:\max\{j_{1},j_{2}\}\geq J}\|P_{0}\bar{f}_{k}(\tilde{Z}_{j_{1}}(u),u)\|_{2}\|P_{0}\bar{f}_{l}(\tilde{Z}_{j_{2}}(u),u)\|_{2}\to 0\quad(J\to\infty)

uniformly in nn and

supn01|Dfk,n(u)Dfl,n(u)|dusupn(01Dfk,n(u)2du)1/2(01Dfl,n(u)2du)1/2<.\sup_{n}\int_{0}^{1}|D_{f_{k},n}(u)D_{f_{l},n}(u)|du\leq\sup_{n}\big{(}\int_{0}^{1}D_{f_{k},n}(u)^{2}du\big{)}^{1/2}\big{(}\int_{0}^{1}D_{f_{l},n}(u)^{2}du\big{)}^{1/2}<\infty.

By (7.52), (7.56), (7.57),

jCov(f¯k(Z~0(u),u),f¯l(Z~j(u),u))=j1,j2=0𝔼[P0f¯k(Z~j1(u),u)P0f¯l(Z~j2(u),u)]\sum_{j\in\mathbb{Z}}\text{Cov}(\bar{f}_{k}(\tilde{Z}_{0}(u),u),\bar{f}_{l}(\tilde{Z}_{j}(u),u))=\sum_{j_{1},j_{2}=0}^{\infty}\mathbb{E}[P_{0}\bar{f}_{k}(\tilde{Z}_{j_{1}}(u),u)\cdot P_{0}\bar{f}_{l}(\tilde{Z}_{j_{2}}(u),u)]

and the Cramér-Wold device, the assertion of the theorem follows. ∎
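As an illustration of the limit variance in the case $\mathbb{K}=1$, consider a special case that is not taken from the paper: a time-varying AR(1) process $X_{i}=a(\frac{i}{n})X_{i-1}+\varepsilon_{i}$ with standard normal innovations, $f(z,u)=z_{0}$ and $D_{f,n}\equiv 1$. Assuming this process satisfies the conditions of Theorem 3.4, the stationary approximation at time $u$ is an AR(1) process with coefficient $a(u)$, whose long-run variance is $(1-a(u))^{-2}$, so the limit variance would be $\int_{0}^{1}(1-a(u))^{-2}du$. The following Python sketch compares this value with a Monte Carlo estimate. The coefficient function, sample sizes and seed are hypothetical choices.

```python
import math
import random


def coef(u: float) -> float:
    # Hypothetical time-varying AR(1) coefficient a(u) on [0, 1].
    return 0.2 + 0.4 * u


def normalized_sum(n: int, rng: random.Random) -> float:
    # n^{-1/2} * sum_{i=1}^n X_i for X_i = a(i/n) X_{i-1} + eps_i,
    # with eps_i ~ N(0, 1) and X_0 = 0.
    x, total = 0.0, 0.0
    for i in range(1, n + 1):
        x = coef(i / n) * x + rng.gauss(0.0, 1.0)
        total += x
    return total / math.sqrt(n)


def limit_variance(grid: int = 10000) -> float:
    # Midpoint rule for int_0^1 (1 - a(u))^{-2} du.
    return sum((1.0 - coef((k + 0.5) / grid)) ** -2 for k in range(grid)) / grid


def simulated_variance(n: int = 400, reps: int = 600, seed: int = 1) -> float:
    # Sample variance of the normalized sum over independent replications.
    rng = random.Random(seed)
    draws = [normalized_sum(n, rng) for _ in range(reps)]
    mean = sum(draws) / reps
    return sum((d - mean) ** 2 for d in draws) / (reps - 1)


print(limit_variance(), simulated_variance())
```

The two numbers should agree up to Monte Carlo error and finite-sample effects. The comparison only concerns the variance and does not test asymptotic normality.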

Lemma 7.6.

Let cc\in\mathbb{R}, c>0c>0.

  1. (i)

    For x,yx,y\in\mathbb{R}, it holds that

    (x+y)2𝟙{|x+y|>c}8x2𝟙{|x|>c2}+8y2𝟙{|y|>c2}.(x+y)^{2}\mathbbm{1}_{\{|x+y|>c\}}\leq 8x^{2}\mathbbm{1}_{\{|x|>\frac{c}{2}\}}+8y^{2}\mathbbm{1}_{\{|y|>\frac{c}{2}\}}.
  2. (ii)

    For random variables W,W~W,\tilde{W}, it holds that

    𝔼[W2𝟙{|W|>c}]4𝔼[(WW~)2]+4𝔼[W~2𝟙{|W~|>c2}].\mathbb{E}[W^{2}\mathbbm{1}_{\{|W|>c\}}]\leq 4\mathbb{E}[(W-\tilde{W})^{2}]+4\mathbb{E}[\tilde{W}^{2}\mathbbm{1}_{\{|\tilde{W}|>\frac{c}{2}\}}].
Proof of Lemma 7.6.
  1. (i)

    It holds that

    (x+y)2𝟙{|x+y|>c}\displaystyle(x+y)^{2}\mathbbm{1}_{\{|x+y|>c\}} \displaystyle\leq 2[x2+y2]𝟙{|x|>c2 or |y|>c2}\displaystyle 2\big{[}x^{2}+y^{2}\big{]}\mathbbm{1}_{\{|x|>\frac{c}{2}\text{ or }|y|>\frac{c}{2}\}}
    \displaystyle\leq 2[x2+y2]{2𝟙{|x|>c2,|y|>c2}+𝟙{|x|>c2,|y|c2}+𝟙{|x|c2,|y|>c2}}\displaystyle 2\big{[}x^{2}+y^{2}\big{]}\big{\{}2\mathbbm{1}_{\{|x|>\frac{c}{2},|y|>\frac{c}{2}\}}+\mathbbm{1}_{\{|x|>\frac{c}{2},|y|\leq\frac{c}{2}\}}+\mathbbm{1}_{\{|x|\leq\frac{c}{2},|y|>\frac{c}{2}\}}\big{\}}
    \displaystyle\leq 4[x2𝟙{|x|>c2}+y2𝟙{|y|>c2}]+4x2𝟙{|x|>c2}+4y2𝟙{|y|>c2}\displaystyle 4\big{[}x^{2}\mathbbm{1}_{\{|x|>\frac{c}{2}\}}+y^{2}\mathbbm{1}_{\{|y|>\frac{c}{2}\}}\big{]}+4x^{2}\mathbbm{1}_{\{|x|>\frac{c}{2}\}}+4y^{2}\mathbbm{1}_{\{|y|>\frac{c}{2}\}}
    \displaystyle\leq 8x2𝟙{|x|>c2}+8y2𝟙{|y|>c2}.\displaystyle 8x^{2}\mathbbm{1}_{\{|x|>\frac{c}{2}\}}+8y^{2}\mathbbm{1}_{\{|y|>\frac{c}{2}\}}.
  2. (ii)

    We have

    \begin{align*}
    \mathbb{E}[W^{2}\mathbbm{1}_{\{|W|>c\}}] &\leq 2\mathbb{E}[(W-\tilde{W})^{2}\mathbbm{1}_{\{|W|>c\}}]+2\mathbb{E}[\tilde{W}^{2}\mathbbm{1}_{\{|W|>c\}}] \tag{7.58}\\
    &\leq 2\mathbb{E}[(W-\tilde{W})^{2}]+2\mathbb{E}[\tilde{W}^{2}\mathbbm{1}_{\{|W-\tilde{W}|+|\tilde{W}|>c\}}].
    \end{align*}

    Furthermore, with Markov’s inequality,

    𝔼[W~2𝟙{|WW~|+|W~|>c}]\displaystyle\mathbb{E}[\tilde{W}^{2}\mathbbm{1}_{\{|W-\tilde{W}|+|\tilde{W}|>c\}}]
    \displaystyle\leq 𝔼[W~2𝟙{|WW~|>c2}]+𝔼[W~2𝟙{|W~|>c2}]\displaystyle\mathbb{E}[\tilde{W}^{2}\mathbbm{1}_{\{|W-\tilde{W}|>\frac{c}{2}\}}]+\mathbb{E}[\tilde{W}^{2}\mathbbm{1}_{\{|\tilde{W}|>\frac{c}{2}\}}]
    \displaystyle\leq (c2)2(|WW~|>c2)+𝔼[W~2𝟙{|WW~|>c2}𝟙{|W~|>c2}]+𝔼[W~2𝟙{|W~|>c2}]\displaystyle(\frac{c}{2})^{2}\mathbb{P}(|W-\tilde{W}|>\frac{c}{2})+\mathbb{E}[\tilde{W}^{2}\mathbbm{1}_{\{|W-\tilde{W}|>\frac{c}{2}\}}\mathbbm{1}_{\{|\tilde{W}|>\frac{c}{2}\}}]+\mathbb{E}[\tilde{W}^{2}\mathbbm{1}_{\{|\tilde{W}|>\frac{c}{2}\}}]
    \displaystyle\leq 𝔼[(WW~)2]+2𝔼[W~2𝟙{|W~|>c2}].\displaystyle\mathbb{E}[(W-\tilde{W})^{2}]+2\mathbb{E}[\tilde{W}^{2}\mathbbm{1}_{\{|\tilde{W}|>\frac{c}{2}\}}].

    Inserting this inequality into (7.58), we obtain the assertion.
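
    Both inequalities of Lemma 7.6 also hold pointwise, so they can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates (i) on a finite grid and (ii) for the uniform distribution on the same grid of pairs $(W,\tilde{W})$. The grid, the thresholds and all function names are illustrative assumptions.

```python
import itertools

def indicator(condition):
    """Return 1.0 if the condition holds and 0.0 otherwise."""
    return 1.0 if condition else 0.0

def lemma_i_gap(x, y, c):
    """Right-hand side minus left-hand side of Lemma 7.6(i)."""
    lhs = (x + y) ** 2 * indicator(abs(x + y) > c)
    rhs = (8 * x ** 2 * indicator(abs(x) > c / 2)
           + 8 * y ** 2 * indicator(abs(y) > c / 2))
    return rhs - lhs

def lemma_ii_gap(pairs, c):
    """Right-hand side minus left-hand side of Lemma 7.6(ii), with the
    expectation taken under the uniform law on the given (W, W~) pairs."""
    m = len(pairs)
    lhs = sum(w ** 2 * indicator(abs(w) > c) for w, _ in pairs) / m
    rhs = (4 * sum((w - wt) ** 2 for w, wt in pairs) / m
           + 4 * sum(wt ** 2 * indicator(abs(wt) > c / 2) for _, wt in pairs) / m)
    return rhs - lhs

# Illustrative grid: multiples of 1/4 in [-5, 5] (exactly representable floats).
grid = [k / 4 for k in range(-20, 21)]
thresholds = [0.5, 1.0, 2.5]
pairs = list(itertools.product(grid, grid))

min_gap_i = min(lemma_i_gap(x, y, c) for x, y in pairs for c in thresholds)
min_gap_ii = min(lemma_ii_gap(pairs, c) for c in thresholds)
```

    A nonnegative `min_gap_i` and `min_gap_ii` is consistent with the lemma on this grid; it is of course no substitute for the proof.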

The following lemma generalizes some results from [8] using similar techniques as therein.

Lemma 7.7.

Let q{1,2}q\in\{1,2\}. Let W~i(u)\tilde{W}_{i}(u) be a stationary sequence with

supu[0,1]W~0(u)q<,W~0(u)W~0(v)qCW|uv|ς.\sup_{u\in[0,1]}\|\tilde{W}_{0}(u)\|_{q}<\infty,\quad\quad\|\tilde{W}_{0}(u)-\tilde{W}_{0}(v)\|_{q}\leq C_{W}|u-v|^{\varsigma}. (7.59)

Let an:[0,1]a_{n}:[0,1]\to\mathbb{R} be some sequence of functions with lim supn1ni=1n|an(in)|<\limsup_{n\to\infty}\frac{1}{n}\sum_{i=1}^{n}|a_{n}(\frac{i}{n})|<\infty.

  1. (i)

    Let q=2q=2. Let cnc_{n} be some sequence with cnc_{n}\to\infty. Then

    1ni=1n|an(in)|𝔼[W~i(in)2𝟙{|W~i(in)|>cn}]0,\frac{1}{n}\sum_{i=1}^{n}|a_{n}(\frac{i}{n})|\cdot\mathbb{E}[\tilde{W}_{i}(\frac{i}{n})^{2}\mathbbm{1}_{\{|\tilde{W}_{i}(\frac{i}{n})|>c_{n}\}}]\to 0,
  2. (ii)

    Let q=1q=1. Suppose that there exists hn>0,v[0,1]h_{n}>0,v\in[0,1] such that for all u[0,1]u\in[0,1], |vu|>hn|v-u|>h_{n} implies an(u)=0a_{n}(u)=0. Put An=supi=1,,n|an(in)|A_{n}=\sup_{i=1,...,n}|a_{n}(\frac{i}{n})| and suppose that

    supn(hnAn)<,Ann0,an()An has bounded variation uniformly in n.\sup_{n\in\mathbb{N}}(h_{n}\cdot A_{n})<\infty,\quad\quad\frac{A_{n}}{n}\to 0,\quad\quad\frac{a_{n}(\cdot)}{A_{n}}\text{ has bounded variation uniformly in $n$}.

    Suppose that the limits on the following right hand sides exist. If u𝔼W~0(u)u\mapsto\mathbb{E}\tilde{W}_{0}(u) has bounded variation, then

    1ni=1nan(in)W~i(in)𝑝limn01an(u)𝔼W~0(u)du.\frac{1}{n}\sum_{i=1}^{n}a_{n}(\frac{i}{n})\tilde{W}_{i}(\frac{i}{n})\overset{p}{\to}\lim_{n\to\infty}\int_{0}^{1}a_{n}(u)\mathbb{E}\tilde{W}_{0}(u)du.

    If hn0h_{n}\to 0, then

    1ni=1nan(in)W~i(in)𝑝limn01an(u)du𝔼W~0(v).\frac{1}{n}\sum_{i=1}^{n}a_{n}(\frac{i}{n})\tilde{W}_{i}(\frac{i}{n})\overset{p}{\to}\lim_{n\to\infty}\int_{0}^{1}a_{n}(u)du\cdot\mathbb{E}\tilde{W}_{0}(v).
Proof of Lemma 7.7.

Let $J\in\mathbb{N}$ be fixed and assume that $n\geq 2\cdot 2^{J}$. For $j\in\{1,\dots,2^{J}\}$, define $I_{j,J,n}:=\{i\in\{1,\dots,n\}:\frac{i}{n}\in(\frac{j-1}{2^{J}},\frac{j}{2^{J}}]\}$. Then $(I_{j,J,n})_{j}$ forms a decomposition of $\{1,\dots,n\}$ in the sense that the sets are pairwise disjoint and $\bigcup_{j=1}^{2^{J}}I_{j,J,n}=\{1,\dots,n\}$. Since $\frac{i}{n}\in(\frac{j-1}{2^{J}},\frac{j}{2^{J}}]\Longleftrightarrow\frac{j-1}{2^{J}}\cdot n<i\leq\frac{j}{2^{J}}\cdot n$, we conclude that $\frac{n}{2^{J}}-1\leq|I_{j,J,n}|\leq\frac{n}{2^{J}}+1$. Thus, since $n\geq 2\cdot 2^{J}$,

\[
\Big|\frac{|I_{j,J,n}|}{n}-\frac{1}{2^{J}}\Big|\leq\frac{1}{n},\quad\quad|I_{j,J,n}|\geq\frac{1}{2}\frac{n}{2^{J}}. \tag{7.60}
\]

Let $w_{i}$, $i\in\mathbb{N}$, be an arbitrary sequence. Then it holds that

\begin{align*}
\Big|\frac{1}{n}\sum_{i=1}^{n}w_{i}-\frac{1}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}w_{i}\Big| &\leq \sum_{j=1}^{2^{J}}\Big|\frac{|I_{j,J,n}|}{n}-\frac{1}{2^{J}}\Big|\cdot\Big|\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}w_{i}\Big| \tag{7.61}\\
&\leq \frac{1}{n}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}|w_{i}|\\
&\leq \frac{2\cdot 2^{J}}{n^{2}}\sum_{i=1}^{n}|w_{i}|,
\end{align*}

where the last two steps use the first and the second bound in (7.60), respectively.
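
The block decomposition and the averaging bound can be checked in exact rational arithmetic. The sketch below is an illustration under stated assumptions, not part of the original proof: the function names are hypothetical, and the bound is checked with the constant $2\cdot 2^{J}/n^{2}$, which is what the two inequalities in (7.60) deliver. The example $n=11$, $J=2$ with weight concentrated on the first block shows that the factor $2$ cannot be dropped in general.

```python
from fractions import Fraction

def blocks(n, J):
    """Index sets I_{j,J,n}: i/n in ((j-1)/2^J, j/2^J], via integer arithmetic."""
    size = 2 ** J
    return [[i for i in range(1, n + 1) if (j - 1) * n < i * size <= j * n]
            for j in range(1, size + 1)]

def sizes_ok(n, J):
    """Check that the blocks partition {1,...,n} and satisfy both bounds in (7.60)."""
    size = 2 ** J
    parts = blocks(n, J)
    covers = sorted(i for p in parts for i in p) == list(range(1, n + 1))
    return covers and all(
        abs(Fraction(len(p), n) - Fraction(1, size)) <= Fraction(1, n)
        and 2 * size * len(p) >= n
        for p in parts
    )

def block_average_gap(n, J, w):
    """Return (left-hand side of (7.61), bound 2 * 2^J / n^2 * sum |w_i|)."""
    size = 2 ** J
    parts = blocks(n, J)
    full = Fraction(sum(w), n)
    blockwise = sum(Fraction(sum(w[i - 1] for i in p), len(p)) for p in parts) / size
    bound = Fraction(2 * size, n * n) * sum(abs(x) for x in w)
    return abs(full - blockwise), bound

# All admissible n >= 2 * 2^J up to an arbitrary illustrative limit.
all_sizes_ok = all(sizes_ok(n, J) for J in (1, 2, 3) for n in range(2 * 2 ** J, 60))

# Integer weights concentrated on the first (smallest) block.
gap, bound = block_average_gap(11, 2, [1, 1] + [0] * 9)
```

Here `gap` lies strictly between `bound / 2` and `bound`, so the bound holds but would fail with half the constant.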
  1. (i)

    Application of (7.61) with $w_{i}=|a_{n}(\frac{i}{n})|\cdot\mathbb{E}[\tilde{W}_{i}(\frac{i}{n})^{2}\mathbbm{1}_{\{|\tilde{W}_{i}(\frac{i}{n})|>c_{n}\}}]$, together with $|w_{i}|\leq|a_{n}(\frac{i}{n})|\cdot\sup_{u}\|\tilde{W}_{0}(u)\|_{2}^{2}$, yields

    \begin{align*}
    &\frac{1}{n}\sum_{i=1}^{n}|a_{n}(\tfrac{i}{n})|\cdot\mathbb{E}[\tilde{W}_{i}(\tfrac{i}{n})^{2}\mathbbm{1}_{\{|\tilde{W}_{i}(\frac{i}{n})|>c_{n}\}}] \tag{7.62}\\
    \leq{}&\frac{1}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}|a_{n}(\tfrac{i}{n})|\cdot\mathbb{E}[\tilde{W}_{i}(\tfrac{i}{n})^{2}\mathbbm{1}_{\{|\tilde{W}_{i}(\frac{i}{n})|>c_{n}\}}]+\frac{2\cdot 2^{J}}{n}\cdot\frac{1}{n}\sum_{i=1}^{n}|a_{n}(\tfrac{i}{n})|\cdot\sup_{u}\|\tilde{W}_{0}(u)\|_{2}^{2}.
    \end{align*}

    By stationarity, Lemma 7.6(ii) applied with $W=\tilde{W}_{i}(\frac{i}{n})$, $\tilde{W}=\tilde{W}_{i}(\frac{j}{2^{J}})$ and $c=c_{n}$, and (7.59) together with $|\frac{i}{n}-\frac{j}{2^{J}}|\leq 2^{-J}$ for $i\in I_{j,J,n}$,

    \begin{align*}
    &\frac{1}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}|a_{n}(\tfrac{i}{n})|\cdot\mathbb{E}[\tilde{W}_{i}(\tfrac{i}{n})^{2}\mathbbm{1}_{\{|\tilde{W}_{i}(\frac{i}{n})|>c_{n}\}}] \tag{7.63}\\
    \leq{}&\frac{4}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}|a_{n}(\tfrac{i}{n})|\cdot\mathbb{E}[\tilde{W}_{0}(\tfrac{j}{2^{J}})^{2}\mathbbm{1}_{\{|\tilde{W}_{0}(\frac{j}{2^{J}})|>\frac{c_{n}}{2}\}}]\\
    &\quad\quad\quad+\frac{4}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}|a_{n}(\tfrac{i}{n})|\cdot\big\|\tilde{W}_{0}(\tfrac{i}{n})-\tilde{W}_{0}(\tfrac{j}{2^{J}})\big\|_{2}^{2}\\
    \leq{}&4\Big[\sup_{j=1,\dots,2^{J}}\mathbb{E}[\tilde{W}_{0}(\tfrac{j}{2^{J}})^{2}\mathbbm{1}_{\{|\tilde{W}_{0}(\frac{j}{2^{J}})|>\frac{c_{n}}{2}\}}]+C_{W}^{2}(2^{-J})^{2\varsigma}\Big]\cdot\frac{1}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}|a_{n}(\tfrac{i}{n})|.
    \end{align*}

    By (7.60),

    12Jj=12J1|Ij,J,n|iIj,J,n|an(in)|2ni=1n|an(in)|.\frac{1}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}|a_{n}(\frac{i}{n})|\leq\frac{2}{n}\sum_{i=1}^{n}|a_{n}(\frac{i}{n})|.

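    By the dominated convergence theorem, for each fixed $j\in\{1,\dots,2^{J}\}$,

    \[
    \limsup_{n\to\infty}\mathbb{E}[\tilde{W}_{0}(\tfrac{j}{2^{J}})^{2}\mathbbm{1}_{\{|\tilde{W}_{0}(\frac{j}{2^{J}})|>c_{n}\}}]=0,
    \]

    and the same holds with $c_{n}$ replaced by $\frac{c_{n}}{2}$, since $\frac{c_{n}}{2}\to\infty$ as well.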

    Furthermore, lim supn2JnsupuW~0(u)22=0\limsup_{n\to\infty}\frac{2^{J}}{n}\cdot\sup_{u}\|\tilde{W}_{0}(u)\|_{2}^{2}=0. Inserting (7.63) into (7.62) and applying lim supn\limsup_{n\to\infty} and afterwards, lim supJ\limsup_{J\to\infty}, yields the assertion.

  2. (ii)

    Since (7.59) also holds with $\tilde{W}_{0}(u)$ replaced by $\tilde{W}_{0}(u)-\mathbb{E}\tilde{W}_{0}(u)$, we may assume w.l.o.g. in the following that $\mathbb{E}\tilde{W}_{0}(u)=0$.

    By (7.61) applied to $w_{i}=a_{n}(\frac{i}{n})\tilde{W}_{i}(\frac{i}{n})$, we obtain

    \begin{align*}
    &\Big\|\frac{1}{n}\sum_{i=1}^{n}a_{n}(\tfrac{i}{n})\tilde{W}_{i}(\tfrac{i}{n})-\frac{1}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}a_{n}(\tfrac{i}{n})\tilde{W}_{i}(\tfrac{i}{n})\Big\|_{1} \tag{7.64}\\
    \leq{}&\frac{2\cdot 2^{J}}{n}\cdot\frac{1}{n}\sum_{i=1}^{n}|a_{n}(\tfrac{i}{n})|\cdot\sup_{u}\|\tilde{W}_{0}(u)\|_{1}\to 0\quad(n\to\infty).
    \end{align*}

    We furthermore have

    12Jj=12J1|Ij,J,n|iIj,J,nan(in)W~i(in)\displaystyle\Big{\|}\frac{1}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}a_{n}(\frac{i}{n})\tilde{W}_{i}(\frac{i}{n}) (7.65)
    12Jj=12J1|Ij,J,n|iIj,J,nan(in)W~i(j12J)1\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad-\frac{1}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}a_{n}(\frac{i}{n})\tilde{W}_{i}(\frac{j-1}{2^{J}})\Big{\|}_{1}
    \displaystyle\leq 12Jj=12J1|Ij,J,n|iIj,J,n|an(in)|W~0(in)W~0(j12J)1\displaystyle\frac{1}{2^{J}}\sum_{j=1}^{2^{J}}\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}|a_{n}(\frac{i}{n})|\cdot\big{\|}\tilde{W}_{0}(\frac{i}{n})-\tilde{W}_{0}(\frac{j-1}{2^{J}})\big{\|}_{1}
    \displaystyle\leq 2ni=1n|an(in)|CW(2J)ς.\displaystyle\frac{2}{n}\sum_{i=1}^{n}|a_{n}(\frac{i}{n})|\cdot C_{W}(2^{-J})^{\varsigma}.

    Fix $j\in\{1,\dots,2^{J}\}$. Put $u_{j}:=\frac{j-1}{2^{J}}$ and, for a real number $x\geq 0$, define $[x]:=\min\{k\in\mathbb{N}:k>x\}$, so that $[u_{j}n]$ is the smallest element of $I_{j,J,n}$. By stationarity, the following equality holds in distribution:

    1|Ij,J,n|iIj,J,nan(in)W~i(uj)=𝑑1|Ij,J,n|i=1|Ij,J,n|an(in+[ujn]1n)W~i(uj).\frac{1}{|I_{j,J,n}|}\sum_{i\in I_{j,J,n}}a_{n}(\frac{i}{n})\tilde{W}_{i}(u_{j})\overset{d}{=}\frac{1}{|I_{j,J,n}|}\sum_{i=1}^{|I_{j,J,n}|}a_{n}(\frac{i}{n}+\frac{[u_{j}n]-1}{n})\tilde{W}_{i}(u_{j}). (7.66)

    Put $\tilde{W}_{i}(u)^{\circ}:=\tilde{W}_{i}(u)\mathbbm{1}_{\{\frac{i}{n}+\frac{[u_{j}n]-1}{n}\in[\underline{r}_{n},\overline{r}_{n}]\}}$, where $[\underline{r}_{n},\overline{r}_{n}]:=[v-h_{n},v+h_{n}]$ contains the support of $a_{n}$, so that replacing $\tilde{W}_{i}(u_{j})$ by $\tilde{W}_{i}(u_{j})^{\circ}$ does not change the right hand side of (7.66). By partial summation and since $\frac{a_{n}(\cdot)}{A_{n}}$ has bounded variation $B_{a}$ uniformly in $n$,

    \begin{align*}
    &\Big|\frac{1}{|I_{j,J,n}|}\sum_{i=1}^{|I_{j,J,n}|}a_{n}\big(\tfrac{i}{n}+\tfrac{[u_{j}n]-1}{n}\big)\tilde{W}_{i}(u_{j})\Big| \tag{7.67}\\
    ={}&\Big|\frac{1}{|I_{j,J,n}|}\sum_{i=1}^{|I_{j,J,n}|-1}\big\{a_{n}\big(\tfrac{i}{n}+\tfrac{[u_{j}n]-1}{n}\big)-a_{n}\big(\tfrac{i+1}{n}+\tfrac{[u_{j}n]-1}{n}\big)\big\}\sum_{l=1}^{i}\tilde{W}_{l}(u_{j})^{\circ}\\
    &\quad\quad+\frac{1}{|I_{j,J,n}|}a_{n}\big(\tfrac{|I_{j,J,n}|}{n}+\tfrac{[u_{j}n]-1}{n}\big)\cdot\sum_{l=1}^{|I_{j,J,n}|}\tilde{W}_{l}(u_{j})^{\circ}\Big|\\
    \leq{}&\frac{B_{a}+1}{|I_{j,J,n}|}A_{n}\cdot\sup_{i=1,\dots,|I_{j,J,n}|}\Big|\sum_{l=1}^{i}\tilde{W}_{l}(u_{j})^{\circ}\Big|.
    \end{align*}

    By stationarity, we have

    \begin{align*}
    &\sup_{i=1,\dots,|I_{j,J,n}|}\Big|\sum_{l=1}^{i}\tilde{W}_{l}(u_{j})^{\circ}\Big|\\
    ={}&\sup_{i=1,\dots,|I_{j,J,n}|}\Big|\sum_{l=1\vee(\lceil n(v-h_{n})\rceil-[u_{j}n]+1)}^{i\wedge(\lfloor n(v+h_{n})\rfloor-[u_{j}n]+1)}\tilde{W}_{l}(u_{j})\Big|\overset{d}{=}\sup_{i=1,\dots,m_{n}}\Big|\sum_{l=1}^{i}\tilde{W}_{l}(u_{j})\Big|,
    \end{align*}

    since (|Ij,J,n|(n(v+hn)[ujn]+1))(1(n(vhn)[ujn]+1))mn:=2nhn(|I_{j,J,n}|\wedge(\lfloor n(v+h_{n})\rfloor-[u_{j}n]+1))-(1\vee(\lceil n(v-h_{n})\rceil-[u_{j}n]+1))\leq m_{n}:=2nh_{n}. By assumption, mn=2nAnAnhnm_{n}=\frac{2n}{A_{n}}\cdot A_{n}h_{n}\to\infty.

    By the ergodic theorem,

    limm|1ml=1mW~l(uj)|=0a.s.\lim_{m\to\infty}\Big{|}\frac{1}{m}\sum_{l=1}^{m}\tilde{W}_{l}(u_{j})\Big{|}=0\quad a.s.

    and especially (1ml=1mW~l(uj))m(\frac{1}{m}\sum_{l=1}^{m}\tilde{W}_{l}(u_{j}))_{m} is bounded a.s. We conclude that

    1mnsupi=1,,mn|l=1iW~l(uj)|\displaystyle\frac{1}{m_{n}}\sup_{i=1,...,m_{n}}\Big{|}\sum_{l=1}^{i}\tilde{W}_{l}(u_{j})\Big{|}
    \displaystyle\leq 1mnsupi=1,,mn|1il=1iW~l(uj)|+supi=mn+1,,mn|1il=1iW~l(uj)|0.\displaystyle\frac{1}{\sqrt{m_{n}}}\sup_{i=1,...,\sqrt{m_{n}}}\Big{|}\frac{1}{i}\sum_{l=1}^{i}\tilde{W}_{l}(u_{j})\Big{|}+\sup_{i=\sqrt{m_{n}}+1,...,m_{n}}\Big{|}\frac{1}{i}\sum_{l=1}^{i}\tilde{W}_{l}(u_{j})\Big{|}\to 0.

    We conclude from (7.67) and $|I_{j,J,n}|\geq\frac{1}{2}\frac{n}{2^{J}}$ that

    \begin{align*}
    &\Big|\frac{1}{|I_{j,J,n}|}\sum_{i=1}^{|I_{j,J,n}|}a_{n}\big(\tfrac{i}{n}+\tfrac{[u_{j}n]-1}{n}\big)\tilde{W}_{i}(u_{j})\Big| \tag{7.68}\\
    \leq{}&2\cdot 2^{J}(B_{a}+1)\cdot A_{n}\cdot\frac{m_{n}}{n}\cdot\frac{1}{m_{n}}\sup_{i=1,\dots,|I_{j,J,n}|}\Big|\sum_{l=1}^{i}\tilde{W}_{l}(u_{j})^{\circ}\Big|\to 0.
    \end{align*}

    Combining (7.64), (7.65), (7.66) and (7.68) and applying $\limsup_{n\to\infty}$ and afterwards $\limsup_{J\to\infty}$, we obtain

    1ni=1nan(in){W~i(in)𝔼W~0(in)}𝑝0.\frac{1}{n}\sum_{i=1}^{n}a_{n}(\frac{i}{n})\big{\{}\tilde{W}_{i}(\frac{i}{n})-\mathbb{E}\tilde{W}_{0}(\frac{i}{n})\big{\}}\overset{p}{\to}0.

If u𝔼W~0(u)u\mapsto\mathbb{E}\tilde{W}_{0}(u) has bounded variation, we have with some intermediate value ξi,n[i1n,in]\xi_{i,n}\in[\frac{i-1}{n},\frac{i}{n}],

|1ni=1nan(in)𝔼W~0(in)01an(u)𝔼W~0(u)du|\displaystyle\Big{|}\frac{1}{n}\sum_{i=1}^{n}a_{n}(\frac{i}{n})\mathbb{E}\tilde{W}_{0}(\frac{i}{n})-\int_{0}^{1}a_{n}(u)\mathbb{E}\tilde{W}_{0}(u)du\Big{|}
\displaystyle\leq 1ni=1n|an(in)𝔼W~0(in)an(ξi,n)𝔼W~0(ξi,n)|\displaystyle\frac{1}{n}\sum_{i=1}^{n}\big{|}a_{n}(\frac{i}{n})\mathbb{E}\tilde{W}_{0}(\frac{i}{n})-a_{n}(\xi_{i,n})\mathbb{E}\tilde{W}_{0}(\xi_{i,n})\big{|}
\displaystyle\leq Ann1Ani=1n|an(in)an(ξi,n)|supuW~0(u)1\displaystyle\frac{A_{n}}{n}\cdot\frac{1}{A_{n}}\sum_{i=1}^{n}|a_{n}(\frac{i}{n})-a_{n}(\xi_{i,n})|\cdot\sup_{u}\|\tilde{W}_{0}(u)\|_{1}
+Anni=1n|𝔼W~0(in)𝔼W~0(ξi,n)|0.\displaystyle\quad\quad+\frac{A_{n}}{n}\sum_{i=1}^{n}\big{|}\mathbb{E}\tilde{W}_{0}(\frac{i}{n})-\mathbb{E}\tilde{W}_{0}(\xi_{i,n})\big{|}\to 0.

If instead hn0h_{n}\to 0, we have with some intermediate value ξi,n[i1n,in]\xi_{i,n}\in[\frac{i-1}{n},\frac{i}{n}],

|1ni=1nan(in)𝔼W~0(in)1ni=1nan(in)𝔼W~0(v)|\displaystyle\Big{|}\frac{1}{n}\sum_{i=1}^{n}a_{n}(\frac{i}{n})\mathbb{E}\tilde{W}_{0}(\frac{i}{n})-\frac{1}{n}\sum_{i=1}^{n}a_{n}(\frac{i}{n})\mathbb{E}\tilde{W}_{0}(v)\Big{|}
\displaystyle\leq 1ni=1n|an(in)|sup|uv|hnW~0(u)W~0(v)10.\displaystyle\frac{1}{n}\sum_{i=1}^{n}|a_{n}(\frac{i}{n})|\cdot\sup_{|u-v|\leq h_{n}}\|\tilde{W}_{0}(u)-\tilde{W}_{0}(v)\|_{1}\to 0.

Since an()An\frac{a_{n}(\cdot)}{A_{n}} has bounded variation uniformly in nn,

|1ni=1nan(in)01an(u)du|Ann1Ani=1n|an(in)an(ξi,n)|0.\Big{|}\frac{1}{n}\sum_{i=1}^{n}a_{n}(\frac{i}{n})-\int_{0}^{1}a_{n}(u)du\Big{|}\leq\frac{A_{n}}{n}\cdot\frac{1}{A_{n}}\sum_{i=1}^{n}|a_{n}(\frac{i}{n})-a_{n}(\xi_{i,n})|\to 0.
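
The Riemann-sum step above rests on the elementary fact that, for a function $a$ of total variation $V$ on $[0,1]$, the right-endpoint sum $\frac{1}{n}\sum_{i=1}^{n}a(\frac{i}{n})$ differs from $\int_{0}^{1}a(u)du$ by at most $\frac{V}{n}$. The following sketch illustrates this for a localized weight; the triangular shape, the values of `v` and `h`, and all names are assumptions chosen for illustration and do not come from the paper. In the notation of Lemma 7.7 this weight has $A_{n}\leq\frac{1}{h}$, and $\frac{a_{n}(\cdot)}{A_{n}}$ has total variation $2$.

```python
def triangular_weight(u, v, h):
    """a(u) = max(0, 1 - |u - v| / h) / h: support [v - h, v + h],
    integral 1 (if the support lies in [0, 1]), total variation 2 / h."""
    return max(0.0, 1.0 - abs(u - v) / h) / h

def riemann_error(n, v, h):
    """|(1/n) * sum_i a(i/n) - integral of a|, using that the integral equals 1."""
    riemann_sum = sum(triangular_weight(i / n, v, h) for i in range(1, n + 1)) / n
    return abs(riemann_sum - 1.0)

v, h = 0.37, 0.1                      # illustrative location and bandwidth
sample_sizes = (50, 200, 1000, 5000)  # illustrative values of n

errors = {n: riemann_error(n, v, h) for n in sample_sizes}
bounds = {n: (2.0 / h) / n for n in sample_sizes}  # total variation divided by n
```

In each case `errors[n]` stays below `bounds[n]`, matching the rate $\frac{A_{n}}{n}$ that appears in the proof.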

Lemma 7.8.

Let \mathcal{F} satisfy Assumptions 3.1, 3.2 and 2.2. Then there exist constants Ccont>0,Cf¯>0C_{cont}>0,C_{\bar{f}}>0 such that for any ff\in\mathcal{F},

  1. (i)

    for any j1j\geq 1,

    Pijf(Zi,u)2\displaystyle\|P_{i-j}f(Z_{i},u)\|_{2} \displaystyle\leq Df,n(u)Δ(j),\displaystyle D_{f,n}(u)\Delta(j),
    supi=1,,nf(Zi,u)2\displaystyle\sup_{i=1,...,n}\|f(Z_{i},u)\|_{2} \displaystyle\leq CΔDf,n(u),\displaystyle C_{\Delta}\cdot D_{f,n}(u),
    supi,uf¯(Zi,u)2Cf¯,\displaystyle\sup_{i,u}\|\bar{f}(Z_{i},u)\|_{2}\leq C_{\bar{f}}, supv,uf¯(Z~0(v),u)2Cf¯.\displaystyle\sup_{v,u}\|\bar{f}(\tilde{Z}_{0}(v),u)\|_{2}\leq C_{\bar{f}}.
  2. (ii)
    f¯(Zi,u)f¯(Z~i(in),u)2\displaystyle\|\bar{f}(Z_{i},u)-\bar{f}(\tilde{Z}_{i}(\frac{i}{n}),u)\|_{2} \displaystyle\leq Ccontnςs,\displaystyle C_{cont}\cdot n^{-\varsigma s}, (7.69)
    f¯(Z~i(v1),u1)f¯(Z~i(v2),u2)2\displaystyle\|\bar{f}(\tilde{Z}_{i}(v_{1}),u_{1})-\bar{f}(\tilde{Z}_{i}(v_{2}),u_{2})\|_{2} \displaystyle\leq Ccont(|v1v2|ςs+|u1u2|ςs).\displaystyle C_{cont}\cdot\big{(}|v_{1}-v_{2}|^{\varsigma s}+|u_{1}-u_{2}|^{\varsigma s}\big{)}. (7.70)
Proof of Lemma 7.8.
  1. (i)

    If Assumption 2.2 is satisfied, we have by Lemma 7.3 that

    Pijf(Zi,u)2f(Zi,u)f(Zi(ij),u)2=δ2f(Z,u)(j)Df,n(u)Δ(j).\|P_{i-j}f(Z_{i},u)\|_{2}\leq\|f(Z_{i},u)-f(Z_{i}^{*(i-j)},u)\|_{2}=\delta_{2}^{f(Z,u)}(j)\leq D_{f,n}(u)\Delta(j).

    The second assertion follows from Lemma 7.3.

  2. (ii)

    Let $\bar{C}_{R}:=\sup_{v,u_{1},u_{2}}\big\|\frac{|\bar{f}(\tilde{Z}_{0}(v),u_{1})-\bar{f}(\tilde{Z}_{0}(v),u_{2})|}{|u_{1}-u_{2}|^{\varsigma}}\big\|_{2}<\infty$ (by Assumption 3.2) and
    $C_{R}:=\max\{\sup_{i,u}\|R(Z_{i},u)\|_{2},\sup_{u,v}\|R(\tilde{Z}_{0}(v),u)\|_{2}\}$. Then

    f¯(Z~i(v),u1)f¯(Z~i(v),u2)2C¯R|u1u2|ς.\displaystyle\|\bar{f}(\tilde{Z}_{i}(v),u_{1})-\bar{f}(\tilde{Z}_{i}(v),u_{2})\|_{2}\leq\bar{C}_{R}|u_{1}-u_{2}|^{\varsigma}. (7.71)

    We then have

    \begin{align*}
    \|\bar{f}(Z_{i},u)-\bar{f}(\tilde{Z}_{i}(v),u)\|_{2} &\leq \big\||Z_{i}-\tilde{Z}_{i}(v)|_{L_{\mathcal{F}},s}^{s}\big(R(Z_{i},u)+R(\tilde{Z}_{i}(v),u)\big)\big\|_{2}\\
    &\leq \||Z_{i}-\tilde{Z}_{i}(v)|_{L_{\mathcal{F}},s}^{s}\|_{\frac{2p}{p-1}}\big(\|R(Z_{i},u)\|_{2p}+\|R(\tilde{Z}_{i}(v),u)\|_{2p}\big)\\
    &\leq 2C_{R}\||Z_{i}-\tilde{Z}_{i}(v)|_{L_{\mathcal{F}},s}^{s}\|_{\frac{2p}{p-1}}.
    \end{align*}

    Furthermore,

    \begin{align*}
    \||Z_{i}-\tilde{Z}_{i}(v)|_{L_{\mathcal{F}},s}^{s}\|_{\frac{2p}{p-1}} &\leq \sum_{l=0}^{\infty}L_{\mathcal{F},l}\||X_{i-l}-\tilde{X}_{i-l}(v)|^{s}\|_{\frac{2p}{p-1}}\\
    &= \sum_{l=0}^{i}L_{\mathcal{F},l}\|X_{i-l}-\tilde{X}_{i-l}(v)\|_{\frac{2ps}{p-1}}^{s}\\
    &\leq \sum_{l=0}^{i}L_{\mathcal{F},l}C_{X}^{s}\big(|v-\tfrac{i}{n}|^{\varsigma}+l^{\varsigma}n^{-\varsigma}\big)^{s}\\
    &\leq |v-\tfrac{i}{n}|^{\varsigma s}\cdot C_{X}|L_{\mathcal{F}}|_{1}+n^{-\varsigma s}\cdot C_{X}\sum_{l=0}^{\infty}L_{\mathcal{F},l}l^{\varsigma s}.
    \end{align*}

    We obtain with Ccont:=2C¯R+2CRCX{|L|1+j=0L,jjςs}C_{cont}:=2\bar{C}_{R}+2C_{R}C_{X}\big{\{}|L_{\mathcal{F}}|_{1}+\sum_{j=0}^{\infty}L_{\mathcal{F},j}j^{\varsigma s}\big{\}} that

    f¯(Zi,u)f¯(Z~i(v),u)2Ccont[|vin|ςs+nςs].\|\bar{f}(Z_{i},u)-\bar{f}(\tilde{Z}_{i}(v),u)\|_{2}\leq C_{cont}\cdot\Big{[}|v-\frac{i}{n}|^{\varsigma s}+n^{-\varsigma s}\Big{]}. (7.72)

    Furthermore, as above,

    \begin{align*}
    \|\bar{f}(\tilde{Z}_{i}(v_{1}),u)-\bar{f}(\tilde{Z}_{i}(v_{2}),u)\|_{2} &\leq 2C_{R}\||\tilde{Z}_{0}(v_{1})-\tilde{Z}_{0}(v_{2})|_{L_{\mathcal{F}},s}^{s}\|_{\frac{2p}{p-1}} \tag{7.73}\\
    &\leq 2C_{R}\sum_{l=0}^{i}L_{\mathcal{F},l}\|\tilde{X}_{0}(v_{1})-\tilde{X}_{0}(v_{2})\|_{\frac{2ps}{p-1}}^{s}\\
    &\leq 2C_{R}C_{X}|L_{\mathcal{F}}|_{1}\cdot|v_{1}-v_{2}|^{\varsigma s}.
    \end{align*}

    From (7.72), we obtain (7.69) with v=inv=\frac{i}{n}. From (7.71) and (7.73), we conclude (7.70).
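
    The passage from $(|v-\frac{i}{n}|^{\varsigma}+l^{\varsigma}n^{-\varsigma})^{s}$ to a sum of two separate powers uses the subadditivity $(a+b)^{s}\leq a^{s}+b^{s}$ for $a,b\geq 0$, which holds when $s\in(0,1]$. That range of $s$ is an assumption made here for illustration (it is not restated in this section). A minimal numerical sketch, with an illustrative grid and hypothetical names:

```python
def subadditivity_gap(a, b, s):
    """a^s + b^s - (a + b)^s, which is nonnegative for a, b >= 0 and 0 < s <= 1."""
    return a ** s + b ** s - (a + b) ** s

# Illustrative grid of nonnegative arguments and exponents in (0, 1].
values = [k / 10 for k in range(0, 31)]
exponents = [0.1, 0.25, 0.5, 0.75, 1.0]

min_gap = min(subadditivity_gap(a, b, s)
              for a in values for b in values for s in exponents)

# For s > 1 the inequality reverses, e.g. (1 + 1)^2 = 4 > 1^2 + 1^2 = 2.
gap_for_s_equal_2 = subadditivity_gap(1.0, 1.0, 2.0)
```

    The second quantity is negative, which shows that the restriction $s\leq 1$ is needed for this step.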

7.8 Proofs of Section 5

Proof of Lemma 5.2.

Put Dv,n(u)=hKh(uv)D_{v,n}(u)=\sqrt{h}K_{h}(u-v). From (A1) and Assumption 3.1 we obtain that Δ(k)=O(δ2MX(k))\Delta(k)=O(\delta_{2M}^{X}(k)), CR=1+kmax{CX,1}2MC_{R}=1+k\max\{C_{X},1\}^{2M}.

Since KK is Lipschitz continuous and (A2) holds, we have

sup|vv|n3,|θθ|2n3|(θjLn,h(v,θ)𝔼θjLn,h(v,θ))\displaystyle\sup_{|v-v^{\prime}|\leq n^{-3},|\theta-\theta^{\prime}|_{2}\leq n^{-3}}\big{|}\big{(}\nabla_{\theta}^{j}L_{n,h}(v,\theta)-\mathbb{E}\nabla_{\theta}^{j}L_{n,h}(v,\theta)\big{)}
(θjLn,h(v,θ)𝔼θjLn,h(v,θ))|\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad-\big{(}\nabla_{\theta}^{j}L_{n,h}(v^{\prime},\theta^{\prime})-\mathbb{E}\nabla_{\theta}^{j}L_{n,h}(v^{\prime},\theta^{\prime})\big{)}\big{|}_{\infty}
\displaystyle\leq sup|vv|n3,|θθ|2n3CRh2[LK|vv|+CΘ|θθ|2]\displaystyle\sup_{|v-v^{\prime}|\leq n^{-3},|\theta-\theta^{\prime}|_{2}\leq n^{-3}}\frac{C_{R}}{h^{2}}\big{[}L_{K}|v-v^{\prime}|+C_{\Theta}|\theta-\theta^{\prime}|_{2}\big{]}
×1ni=kn(1+|Zi|1M+𝔼|Zi|1M)\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\times\frac{1}{n}\sum_{i=k}^{n}\big{(}1+|Z_{i}|_{1}^{M}+\mathbb{E}|Z_{i}|_{1}^{M}\big{)}
=\displaystyle= Op(n1).\displaystyle O_{p}(n^{-1}).

Let $\Theta_{n}$ be a grid approximation of $\Theta$ such that for any $\theta\in\Theta$, there exists some $\theta^{\prime}\in\Theta_{n}$ such that $|\theta-\theta^{\prime}|_{2}\leq n^{-3}$. Since $\Theta\subset\mathbb{R}^{d_{\Theta}}$, it is possible to choose $\Theta_{n}$ such that $|\Theta_{n}|=O(n^{6d_{\Theta}})$. Furthermore, define $V_{n}:=\{in^{-3}:i=1,\dots,n^{3}\}$ as an approximation of $[0,1]$.

As in Example 5.1, Corollary 4.3 applied to

j={fv,θ:θΘn,vVn}\mathcal{F}_{j}^{\prime}=\{f_{v,\theta}:\theta\in\Theta_{n},v\in V_{n}\}

yields for $j\in\{0,1,2\}$ that

\[
\sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\sup_{\theta\in\Theta}\big|\nabla_{\theta}^{j}L_{n,h}(v,\theta)-\mathbb{E}\nabla_{\theta}^{j}L_{n,h}(v,\theta)\big|_{\infty}=O_{p}\big(\tau_{n}\big). \tag{7.74}
\]

Put L~n,h(v,θ)=1ni=1nKh(i/nv)θ(Z~i(v))\tilde{L}_{n,h}(v,\theta)=\frac{1}{n}\sum_{i=1}^{n}K_{h}(i/n-v)\ell_{\theta}(\tilde{Z}_{i}(v)). With (A1) it is easy to see that

|𝔼θjLn,h(v,θ)𝔼θjL~n,h(v,θ)|\displaystyle\big{|}\mathbb{E}\nabla_{\theta}^{j}L_{n,h}(v,\theta)-\mathbb{E}\nabla_{\theta}^{j}\tilde{L}_{n,h}(v,\theta)\big{|}_{\infty} (7.75)
\displaystyle\leq dΘjCRni=1n|Kh(i/nv)||ZiZ~i(v)|1M\displaystyle\frac{d_{\Theta}^{j}C_{R}}{n}\sum_{i=1}^{n}|K_{h}(i/n-v)|\cdot\||Z_{i}-\tilde{Z}_{i}(v)|_{1}\|_{M}
×(1+|Zi|1MM1+|Z~i(v)|1MM1)\displaystyle\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\quad\times\big{(}1+\||Z_{i}|_{1}\|_{M}^{M-1}+\||\tilde{Z}_{i}(v)|_{1}\|_{M}^{M-1}\big{)}
\displaystyle\leq dΘjCR|K|CX(1+2CXM1)(n1+h).\displaystyle d_{\Theta}^{j}C_{R}|K|_{\infty}C_{X}(1+2C_{X}^{M-1})\big{(}n^{-1}+h\big{)}.

Finally, since KK has bounded variation and K(u)du=1\int K(u)du=1, uniformly in v[h2,1h2]v\in[\frac{h}{2},1-\frac{h}{2}] it holds that

𝔼θjL~n,h(v,θ)=1ni=1nKh(i/nv)𝔼θjθ(Z~1(v))=𝔼θjθ(Z~1(v))+O((nh)1).\mathbb{E}\nabla_{\theta}^{j}\tilde{L}_{n,h}(v,\theta)=\frac{1}{n}\sum_{i=1}^{n}K_{h}(i/n-v)\mathbb{E}\nabla_{\theta}^{j}\ell_{\theta}(\tilde{Z}_{1}(v))=\mathbb{E}\nabla_{\theta}^{j}\ell_{\theta}(\tilde{Z}_{1}(v))+O((nh)^{-1}). (7.76)

From (7.74), (7.75) and (7.76) we obtain

supv[h2,1h2]supθΘ|θjLn,h(v,θ)𝔼θjθ(Z~1(v))|=Op(τn(j)),\sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\sup_{\theta\in\Theta}\big{|}\nabla_{\theta}^{j}L_{n,h}(v,\theta)-\mathbb{E}\nabla_{\theta}^{j}\ell_{\theta}(\tilde{Z}_{1}(v))\big{|}_{\infty}=O_{p}(\tau_{n}^{(j)}), (7.77)

where

τn(j):=τn+(nh)1+h,j{0,2},τn(1):=τn+(nh)1+Bh.\tau_{n}^{(j)}:=\tau_{n}+(nh)^{-1}+h,\quad j\in\{0,2\},\quad\quad\tau_{n}^{(1)}:=\tau_{n}+(nh)^{-1}+B_{h}.

By (A3) and (7.77) for j=0j=0, we obtain with standard arguments that if τn(0)=o(1)\tau_{n}^{(0)}=o(1),

supv[h2,1h2]|θ^n,h(v)θ0(v)|=op(1).\sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\big{|}\hat{\theta}_{n,h}(v)-\theta_{0}(v)\big{|}_{\infty}=o_{p}(1).

Since θ^n,h(v)\hat{\theta}_{n,h}(v) is a minimizer of θLn,h(v,θ)\theta\mapsto L_{n,h}(v,\theta) and θ\ell_{\theta} is twice continuously differentiable, we have the representation

θ^n,h(v)θ0(v)=θ2Ln,h(v,θ¯v)1θLn,h(v,θ0(v)),\hat{\theta}_{n,h}(v)-\theta_{0}(v)=-\nabla_{\theta}^{2}L_{n,h}(v,\bar{\theta}_{v})^{-1}\nabla_{\theta}L_{n,h}(v,\theta_{0}(v)), (7.78)

where θ¯vΘ\bar{\theta}_{v}\in\Theta fulfills |θ¯vθ0(v)||θ^n,h(v)θ0(v)|=op(1)|\bar{\theta}_{v}-\theta_{0}(v)|_{\infty}\leq|\hat{\theta}_{n,h}(v)-\theta_{0}(v)|_{\infty}=o_{p}(1).

By (A2), we have

|𝔼θ2θ(Z~0(v))|θ=θ0(v)𝔼θ2θ(Z~0(v))|θ=θ¯v|=O(|θ0(v)θ¯v|2)=op(1).\big{|}\mathbb{E}\nabla_{\theta}^{2}\ell_{\theta}(\tilde{Z}_{0}(v))\big{|}_{\theta=\theta_{0}(v)}-\mathbb{E}\nabla_{\theta}^{2}\ell_{\theta}(\tilde{Z}_{0}(v))\big{|}_{\theta=\bar{\theta}_{v}}\big{|}_{\infty}=O(|\theta_{0}(v)-\bar{\theta}_{v}|_{2})=o_{p}(1).

Thus, with (7.77),

supv[h2,1h2]|θ2Ln,h(v,θ¯v)𝔼θ2θ(Z~1(v))|θ=θ0(v)|=Op(τn(2))+op(1).\sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\big{|}\nabla_{\theta}^{2}L_{n,h}(v,\bar{\theta}_{v})-\mathbb{E}\nabla_{\theta}^{2}\ell_{\theta}(\tilde{Z}_{1}(v))\big{|}_{\theta=\theta_{0}(v)}\big{|}_{\infty}=O_{p}(\tau_{n}^{(2)})+o_{p}(1). (7.79)

By (A3) and the dominated convergence theorem, 𝔼θ(Z~0(v))=θ𝔼(Z~0(v))=0\mathbb{E}\nabla_{\theta}\ell(\tilde{Z}_{0}(v))=\nabla_{\theta}\mathbb{E}\ell(\tilde{Z}_{0}(v))=0. By (7.77),

supv[h2,1h2]|θLn,h(v,θ0(v))|\displaystyle\sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\big{|}\nabla_{\theta}L_{n,h}(v,\theta_{0}(v))\big{|}_{\infty} =\displaystyle= supv[h2,1h2]|θLn,h(v,θ0(v))𝔼θ(Z~0(v))|\displaystyle\sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\big{|}\nabla_{\theta}L_{n,h}(v,\theta_{0}(v))-\mathbb{E}\nabla_{\theta}\ell(\tilde{Z}_{0}(v))\big{|}_{\infty} (7.80)
=\displaystyle= Op(τn(1)).\displaystyle O_{p}(\tau_{n}^{(1)}).

Inserting (7.79) and (7.80) into (7.78), we obtain

\[
\sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\big|\hat{\theta}_{n,h}(v)-\theta_{0}(v)\big|_{\infty}=O_{p}(\tau_{n}^{(1)}).
\]

This yields an improved version of (7.79):

supv[h2,1h2]|θ2Ln,h(v,θ¯v)𝔼θ2θ(Z~1(v))|θ=θ0(v)|=Op(τn(2)).\sup_{v\in[\frac{h}{2},1-\frac{h}{2}]}\big{|}\nabla_{\theta}^{2}L_{n,h}(v,\bar{\theta}_{v})-\mathbb{E}\nabla_{\theta}^{2}\ell_{\theta}(\tilde{Z}_{1}(v))\big{|}_{\theta=\theta_{0}(v)}\big{|}_{\infty}=O_{p}(\tau_{n}^{(2)}). (7.81)

Inserting (7.80) and (7.81) into (7.78), we obtain the assertion. ∎
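
The representation (7.78) can be made concrete in the simplest case of a quadratic loss, where the Hessian does not depend on $\theta$ and the expansion is exact. The sketch below is an illustration only and does not reproduce the setting of Lemma 5.2: the loss $\ell_{\theta}(x,y)=(y-\theta x)^{2}$, the Epanechnikov kernel, the deterministic data and all names are assumptions.

```python
import math

def epanechnikov(x):
    """Epanechnikov kernel K(x) = 0.75 * (1 - x^2) on [-1, 1]."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0

def local_fit(xs, ys, v, h, theta0):
    """Local least squares at time v for the loss (y - theta * x)^2.

    Returns the minimizer of L(theta) = (1/n) sum_i K_h(i/n - v) (y_i - theta x_i)^2
    together with the gradient of L at theta0 and the (constant) Hessian of L.
    """
    n = len(xs)
    weights = [epanechnikov((i / n - v) / h) / h for i in range(1, n + 1)]
    sxx = sum(w * x * x for w, x in zip(weights, xs)) / n
    sxy = sum(w * x * y for w, x, y in zip(weights, xs, ys)) / n
    theta_hat = sxy / sxx
    gradient = -2.0 * (sxy - theta0 * sxx)
    hessian = 2.0 * sxx
    return theta_hat, gradient, hessian

# Deterministic illustrative data with a slowly varying coefficient 0.3 + 0.4 * i / n.
n = 400
xs = [math.sin(i) for i in range(1, n + 1)]
ys = [(0.3 + 0.4 * i / n) * x + 0.1 * math.cos(3 * i)
      for i, x in zip(range(1, n + 1), xs)]

v, h = 0.5, 0.2
theta0 = 0.3 + 0.4 * v
theta_hat, gradient, hessian = local_fit(xs, ys, v, h, theta0)

# Analogue of (7.78): theta_hat - theta0 = -(Hessian)^{-1} * gradient at theta0.
newton_step = -gradient / hessian
```

For non-quadratic losses the Hessian has to be evaluated at an intermediate point $\bar{\theta}_{v}$, which is why the proof needs the uniform control (7.81).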

7.9 Form of the VnV_{n}-norm and connected quantities

Lemma 7.9 (Summation of polynomial and geometric decay).

Let α>1\alpha>1 and qq\in\mathbb{N}. Then it holds that

  1. (i)
    $\frac{1}{\alpha-1}q^{-\alpha+1}\leq\sum_{j=q}^{\infty}j^{-\alpha}\leq\frac{\max\{\alpha,2^{\alpha-1}\}}{\alpha-1}q^{-\alpha+1}.$
  2. (ii)

    For $\sigma>0$, $\kappa_{2}\geq 1$ and $\rho\in(0,1)$,

    bρ,κ2,lσlog(σ1)j=1min{σ,κ2ρj}\displaystyle b_{\rho,\kappa_{2},l}\cdot\sigma\cdot\log(\sigma^{-1})\leq\sum_{j=1}^{\infty}\min\{\sigma,\kappa_{2}\rho^{j}\} \displaystyle\leq bρ,κ2σlog(σ1e),\displaystyle b_{\rho,\kappa_{2}}\cdot\sigma\cdot\log(\sigma^{-1}\vee e),
    bα,κ2,lσσ1αj=1min{σ,κ2jα}\displaystyle b_{\alpha,\kappa_{2},l}\cdot\sigma\cdot\sigma^{-\frac{1}{\alpha}}\leq\sum_{j=1}^{\infty}\min\{\sigma,\kappa_{2}j^{-\alpha}\} \displaystyle\leq bα,κ2σmax{σ1α,1},\displaystyle b_{\alpha,\kappa_{2}}\cdot\sigma\cdot\max\{\sigma^{-\frac{1}{\alpha}},1\},

    where bρ,κ2,bρ,κ2,lb_{\rho,\kappa_{2}},b_{\rho,\kappa_{2},l}, bα,κ2,bα,κ2,lb_{\alpha,\kappa_{2}},b_{\alpha,\kappa_{2},l} are constants only depending on ρ,κ2,α\rho,\kappa_{2},\alpha.
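
The bounds of Lemma 7.9 can be sanity-checked numerically. The sketch below is illustrative and makes some assumptions: the function names and parameter grids are hypothetical, the infinite sums are enclosed or truncated as described in the comments, and for (i) the upper bound is checked with the constant $\frac{\alpha}{\alpha-1}$, which is never larger than the constant stated in (i). For (ii) only the upper bound under geometric decay is checked, with $b_{\rho,\kappa_{2}}$ as defined in the proof.

```python
import math

def tail_sum_enclosure(q, alpha, n_terms=100000):
    """Enclose sum_{j >= q} j^(-alpha): partial sum plus integral bounds on the rest."""
    last = q + n_terms - 1
    partial = sum(j ** (-alpha) for j in range(q, last + 1))
    lo = partial + (last + 1) ** (1 - alpha) / (alpha - 1)
    hi = partial + last ** (1 - alpha) / (alpha - 1)
    return lo, hi

def geometric_min_sum(sigma, kappa2, rho, n_terms=5000):
    """Truncated sum_{j >= 1} min(sigma, kappa2 * rho^j); the neglected tail is tiny."""
    return sum(min(sigma, kappa2 * rho ** j) for j in range(1, n_terms + 1))

def b_rho(kappa2, rho):
    """Constant b_{rho, kappa2} from the proof of (ii)."""
    log_inv = math.log(1.0 / rho)
    return 2.0 * max(math.log(kappa2), 1.0) / log_inv * (1.0 + 2.0 * log_inv / (1.0 - rho))

# Part (i): lower bound and the upper bound with constant alpha / (alpha - 1).
checks_i = []
for alpha in (1.2, 1.5, 2.0, 3.0):
    for q in (1, 2, 5, 10):
        lo, hi = tail_sum_enclosure(q, alpha)
        lower = q ** (1 - alpha) / (alpha - 1)
        upper = alpha / (alpha - 1) * q ** (1 - alpha)
        checks_i.append(lower <= lo and hi <= upper)

# Part (ii), geometric decay: upper bound b * sigma * log(max(1 / sigma, e)).
checks_ii = []
for sigma in (1e-3, 0.05, 0.5, 2.0):
    for kappa2 in (1.0, 5.0):
        for rho in (0.3, 0.9):
            total = geometric_min_sum(sigma, kappa2, rho)
            bound = b_rho(kappa2, rho) * sigma * math.log(max(1.0 / sigma, math.e))
            checks_ii.append(total <= bound)
```

All entries of `checks_i` and `checks_ii` being true is consistent with the lemma for these parameter values.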

Proof of Lemma 7.9.
  1. (i)

    Upper bound: If $q\geq 2$, then

    \begin{align*}
    \sum_{j=q}^{\infty}j^{-\alpha} &= \sum_{j=q}^{\infty}\int_{j-1}^{j}j^{-\alpha}dx\leq\sum_{j=q}^{\infty}\int_{j-1}^{j}x^{-\alpha}dx=\int_{q-1}^{\infty}x^{-\alpha}dx=\frac{1}{-\alpha+1}x^{-\alpha+1}\Big|_{q-1}^{\infty}\\
    &= \frac{1}{\alpha-1}(q-1)^{-\alpha+1}=\frac{1}{\alpha-1}q^{-\alpha+1}\cdot\Big(\frac{q-1}{q}\Big)^{-\alpha+1}\leq\frac{2^{\alpha-1}}{\alpha-1}q^{-\alpha+1},
    \end{align*}

    since $\frac{q-1}{q}\geq\frac{1}{2}$ for $q\geq 2$.

    If q=1q=1, then j=qjα=1+j=q+1jα1+1α1qα+1=αα1\sum_{j=q}^{\infty}j^{-\alpha}=1+\sum_{j=q+1}^{\infty}j^{-\alpha}\leq 1+\frac{1}{\alpha-1}q^{-\alpha+1}=\frac{\alpha}{\alpha-1}.

    Lower bound: Using similar decomposition arguments as above, we have

    j=qjα\displaystyle\sum_{j=q}^{\infty}j^{-\alpha} \displaystyle\geq j=qjj+1xαdx=qxαdx=1α+1xα+1|q=1α1qα+1.\displaystyle\sum_{j=q}^{\infty}\int_{j}^{j+1}x^{-\alpha}dx=\int_{q}^{\infty}x^{-\alpha}dx=\frac{1}{-\alpha+1}x^{-\alpha+1}\Big{|}_{q}^{\infty}=\frac{1}{\alpha-1}q^{-\alpha+1}.
  2. (ii)
    • Exponential decay: Upper bound: First let a:=max{log(σ/κ2)log(ρ),0}+1a:=\max\{\lfloor\frac{\log(\sigma/\kappa_{2})}{\log(\rho)}\rfloor,0\}+1. Then we have

      j=0min{σ,κ2ρj}\displaystyle\sum_{j=0}^{\infty}\min\{\sigma,\kappa_{2}\rho^{j}\} \displaystyle\leq j=0a1σ+κ2j=aρj=aσ+κ2ρa1ρ\displaystyle\sum_{j=0}^{a-1}\sigma+\kappa_{2}\sum_{j=a}^{\infty}\rho^{j}=a\sigma+\kappa_{2}\frac{\rho^{a}}{1-\rho}
      \displaystyle\leq aσ+κ21ρmin{σκ2,1}aσ+σ1ρ\displaystyle a\sigma+\frac{\kappa_{2}}{1-\rho}\min\{\frac{\sigma}{\kappa_{2}},1\}\leq a\sigma+\frac{\sigma}{1-\rho}
      \displaystyle\leq σ[1log(ρ1)max{log(κ2/σ),0}+21ρ]\displaystyle\sigma\cdot\Big{[}\frac{1}{\log(\rho^{-1})}\max\{\log(\kappa_{2}/\sigma),0\}+\frac{2}{1-\rho}\Big{]}
      \displaystyle\leq σ[1log(ρ1)max{log(σ1),0}+log(κ2)0log(ρ1)+21ρ]\displaystyle\sigma\cdot\Big{[}\frac{1}{\log(\rho^{-1})}\max\{\log(\sigma^{-1}),0\}+\frac{\log(\kappa_{2})\vee 0}{\log(\rho^{-1})}+\frac{2}{1-\rho}\Big{]}
      \displaystyle\leq bρ,κ2σlog(σ1e),\displaystyle b_{\rho,\kappa_{2}}\cdot\sigma\cdot\log(\sigma^{-1}\vee e),

      where bρ,κ2:=2(log(κ2)1)1log(ρ1)[1+2log(ρ1)1ρ]b_{\rho,\kappa_{2}}:=2(\log(\kappa_{2})\vee 1)\cdot\frac{1}{\log(\rho^{-1})}\big{[}1+\frac{2\log(\rho^{-1})}{1-\rho}\big{]}.

      Lower Bound: Put β(q)=κ2j=qρj=κ21ρρq\beta(q)=\kappa_{2}\sum_{j=q}^{\infty}\rho^{j}=\frac{\kappa_{2}}{1-\rho}\rho^{q}. Then

      j=1min{σ,κ2ρj}\displaystyle\sum_{j=1}^{\infty}\min\{\sigma,\kappa_{2}\rho^{j}\} \displaystyle\geq σ(q^1)+β(q^),\displaystyle\sigma(\hat{q}-1)+\beta(\hat{q}),

      where q^=min{q:σκ2ρq}\hat{q}=\min\{q\in\mathbb{N}:\frac{\sigma}{\kappa_{2}}\geq\rho^{q}\}. We have q^log(σ/κ2)log(ρ)=:q¯\hat{q}\geq\frac{\log(\sigma/\kappa_{2})}{\log(\rho)}=:\underline{q} and q^q¯+1\hat{q}\leq\underline{q}+1. Thus

      j=1min{σ,κ2ρj}σ(q¯1)+β(q¯+1).\sum_{j=1}^{\infty}\min\{\sigma,\kappa_{2}\rho^{j}\}\geq\sigma(\underline{q}-1)+\beta(\underline{q}+1).

      Now consider the case σκ2<ρ2\frac{\sigma}{\kappa_{2}}<\rho^{2}, that is, log(σ/κ2)log(ρ)2\frac{\log(\sigma/\kappa_{2})}{\log(\rho)}\geq 2. Then, q¯112q¯\underline{q}-1\geq\frac{1}{2}\underline{q}, and q¯2log(σ/κ2)log(ρ)\overline{q}\leq 2\frac{\log(\sigma/\kappa_{2})}{\log(\rho)}. We obtain

      j=1min{σ,κ2ρj}\displaystyle\sum_{j=1}^{\infty}\min\{\sigma,\kappa_{2}\rho^{j}\} \displaystyle\geq 12σlog(σ/κ2)log(ρ)+κ2ρ1ρρlog(σ/κ2)log(ρ)=12σlog(σ/κ2)log(ρ)+ρ1ρσ\displaystyle\frac{1}{2}\sigma\frac{\log(\sigma/\kappa_{2})}{\log(\rho)}+\frac{\kappa_{2}\rho}{1-\rho}\rho^{\frac{\log(\sigma/\kappa_{2})}{\log(\rho)}}=\frac{1}{2}\sigma\frac{\log(\sigma/\kappa_{2})}{\log(\rho)}+\frac{\rho}{1-\rho}\sigma
      \displaystyle\geq 12(ρ1ρ+1log(ρ1))σlog(σ1κ2),\displaystyle\frac{1}{2}\Big{(}\frac{\rho}{1-\rho}+\frac{1}{\log(\rho^{-1})}\Big{)}\sigma\log(\sigma^{-1}\kappa_{2}),

      that is, the assertion holds with bρ,κ2,l:=12(ρ1ρ+1log(ρ1))b_{\rho,\kappa_{2},l}:=\frac{1}{2}\big{(}\frac{\rho}{1-\rho}+\frac{1}{\log(\rho^{-1})}\big{)}.

    • Polynomial decay: Upper bound: Let $a:=\lfloor(\frac{\sigma}{\kappa_{2}})^{-\frac{1}{\alpha}}\rfloor+1\geq(\frac{\sigma}{\kappa_{2}})^{-\frac{1}{\alpha}}$. Then we have by (i):

      j=1min{σ,κ2jα}\displaystyle\sum_{j=1}^{\infty}\min\{\sigma,\kappa_{2}j^{-\alpha}\} \displaystyle\leq j=1aσ+κ2j=a+1jα=aσ+κ2α1aα+1\displaystyle\sum_{j=1}^{a}\sigma+\kappa_{2}\sum_{j=a+1}^{\infty}j^{-\alpha}=a\sigma+\frac{\kappa_{2}}{\alpha-1}a^{-\alpha+1}
      \displaystyle\leq aσ+κ21αα1σα1α\displaystyle a\sigma+\frac{\kappa_{2}^{\frac{1}{\alpha}}}{\alpha-1}\sigma^{\frac{\alpha-1}{\alpha}}
      \displaystyle\leq σ[κ21ασ1α+1+κ21αα1σ1α]\displaystyle\sigma\cdot\Big{[}\kappa_{2}^{\frac{1}{\alpha}}\sigma^{-\frac{1}{\alpha}}+1+\frac{\kappa_{2}^{\frac{1}{\alpha}}}{\alpha-1}\sigma^{-\frac{1}{\alpha}}\Big{]}
      \displaystyle\leq σ[αα1κ21ασ1α+1]\displaystyle\sigma\cdot\Big{[}\frac{\alpha}{\alpha-1}\kappa_{2}^{\frac{1}{\alpha}}\sigma^{-\frac{1}{\alpha}}+1\Big{]}
      \displaystyle\leq bα,κ2σmax{σ1α,1},\displaystyle b_{\alpha,\kappa_{2}}\cdot\sigma\cdot\max\{\sigma^{-\frac{1}{\alpha}},1\},

      where $b_{\alpha,\kappa_{2}}:=2\frac{\alpha}{\alpha-1}(\kappa_{2}\vee 1)^{\frac{1}{\alpha}}$.

      Lower bound: Put $\beta(q)=\kappa_{2}\sum_{j=q}^{\infty}j^{-\alpha}$. By (i), $\beta(q)\geq\frac{\kappa_{2}}{\alpha-1}q^{-\alpha+1}$. Then

      j=1min{σ,κ2jα}\displaystyle\sum_{j=1}^{\infty}\min\{\sigma,\kappa_{2}j^{-\alpha}\} \displaystyle\geq minq{σq+β(q)}\displaystyle\min_{q\in\mathbb{N}}\{\sigma q+\beta(q)\}
      \displaystyle\geq minq{σq+κ2α1qα+1}.\displaystyle\min_{q\in\mathbb{N}}\{\sigma q+\frac{\kappa_{2}}{\alpha-1}q^{-\alpha+1}\}.

      Elementary analysis yields that the minimum is achieved for $q=\kappa_{2}^{\frac{1}{\alpha}}\cdot\sigma^{-\frac{1}{\alpha}}=(\frac{\kappa_{2}}{\sigma})^{\frac{1}{\alpha}}$, that is,

      \[\sum_{j=1}^{\infty}\min\{\sigma,\kappa_{2}j^{-\alpha}\}\geq\frac{\alpha}{\alpha-1}\kappa_{2}^{\frac{1}{\alpha}}\cdot\sigma^{\frac{\alpha-1}{\alpha}},\]

      and the assertion holds with $b_{\alpha,\kappa_{2},l}:=\frac{\alpha}{\alpha-1}\kappa_{2}^{\frac{1}{\alpha}}$.
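The upper bounds of this proof and the minimisation step can be checked numerically. The following is a minimal sketch, not part of the original argument: all function names are hypothetical, the infinite sums are replaced by a finite partial sum plus an explicit tail bound, and the parameter grids used for checking are arbitrary choices. It only verifies the two upper bounds and the closed-form value of the continuous minimum of $q\mapsto\sigma q+\frac{\kappa_{2}}{\alpha-1}q^{-\alpha+1}$; it does not test the lower bounds on the sums themselves.

```python
import math


def capped_sum_upper(sigma, kappa2, decay, tail_bound, n_terms):
    """Upper estimate of sum_{j>=1} min{sigma, kappa2 * decay(j)}.

    The first n_terms terms are summed exactly; the remainder is replaced
    by kappa2 * tail_bound(n_terms), an upper bound on the omitted tail.
    """
    partial = sum(min(sigma, kappa2 * decay(j)) for j in range(1, n_terms + 1))
    return partial + kappa2 * tail_bound(n_terms)


def poly_sum_upper(sigma, kappa2, alpha, n_terms=20000):
    # Tail: sum_{j>N} j^(-alpha) <= integral_N^infinity x^(-alpha) dx.
    return capped_sum_upper(
        sigma, kappa2,
        decay=lambda j: j ** (-alpha),
        tail_bound=lambda n: n ** (1 - alpha) / (alpha - 1),
        n_terms=n_terms,
    )


def geo_sum_upper(sigma, kappa2, rho, n_terms=2000):
    # Tail: sum_{j>N} rho^j = rho^(N+1) / (1 - rho).
    return capped_sum_upper(
        sigma, kappa2,
        decay=lambda j: rho ** j,
        tail_bound=lambda n: rho ** (n + 1) / (1 - rho),
        n_terms=n_terms,
    )


def poly_bound(sigma, kappa2, alpha):
    """b_{alpha,kappa2} * sigma * max{sigma^(-1/alpha), 1}."""
    b = 2 * alpha / (alpha - 1) * max(kappa2, 1.0) ** (1 / alpha)
    return b * sigma * max(sigma ** (-1 / alpha), 1.0)


def geo_bound(sigma, kappa2, rho):
    """b_{rho,kappa2} * sigma * log(max{1/sigma, e})."""
    log_inv_rho = math.log(1 / rho)
    b = (2 * max(math.log(kappa2), 1.0) / log_inv_rho
         * (1 + 2 * log_inv_rho / (1 - rho)))
    return b * sigma * math.log(max(1 / sigma, math.e))


def g(q, sigma, kappa2, alpha):
    """Objective sigma*q + kappa2/(alpha-1) * q^(1-alpha) from the lower bound."""
    return sigma * q + kappa2 / (alpha - 1) * q ** (1 - alpha)


def g_min_closed_form(sigma, kappa2, alpha):
    """Value of g at q = (kappa2/sigma)^(1/alpha)."""
    return alpha / (alpha - 1) * kappa2 ** (1 / alpha) * sigma ** ((alpha - 1) / alpha)
```

Setting the derivative $\sigma-\kappa_{2}q^{-\alpha}$ to zero gives the minimiser $q=(\kappa_{2}/\sigma)^{1/\alpha}$, which is what `g_min_closed_form` evaluates.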

Lemma 7.10 (Values of $q^{*}$, $r(\delta)$).
  • Polynomial decay $\Delta(j)=\kappa j^{-\alpha}$ ($\alpha>1$). Then there exist constants $c_{\alpha,\kappa}^{(i)},C_{\alpha,\kappa}^{(i)}>0$, $i=1,2$, only depending on $\kappa,\alpha$ such that

    \[c_{\alpha,\kappa}^{(1)}\max\{x^{-\frac{1}{\alpha}},1\}\leq q^{*}(x)\leq C_{\alpha,\kappa}^{(1)}\max\{x^{-\frac{1}{\alpha}},1\},\]

    and

    \[c_{\alpha,\kappa}^{(2)}\min\{\delta^{\frac{\alpha}{\alpha-1}},\delta\}\leq r(\delta)\leq C_{\alpha,\kappa}^{(2)}\min\{\delta^{\frac{\alpha}{\alpha-1}},\delta\}.\]
  • Geometric decay $\Delta(j)=\kappa\rho^{j}$ ($\rho\in(0,1)$). Then there exist constants $c_{\rho,\kappa}^{(i)},C_{\rho,\kappa}^{(i)}>0$, $i=1,2$, only depending on $\kappa,\rho$ such that

    \[c_{\rho,\kappa}^{(1)}\max\{\log(x^{-1}),1\}\leq q^{*}(x)\leq C_{\rho,\kappa}^{(1)}\max\{\log(x^{-1}),1\},\]

    and

    \[c_{\rho,\kappa}^{(2)}\frac{\delta}{\log(\delta^{-1}\vee e)}\leq r(\delta)\leq C_{\rho,\kappa}^{(2)}\frac{\delta}{\log(\delta^{-1}\vee e)}.\]
Proof of Lemma 7.10.
  1. (i)

    By Lemma 7.9(i), $\beta_{norm}(q)=\frac{\beta(q)}{q}\in[c_{\alpha,\kappa}q^{-\alpha},C_{\alpha,\kappa}q^{-\alpha}]$ with $c_{\alpha,\kappa}=\frac{\kappa}{\alpha-1}$, $C_{\alpha,\kappa}=\kappa\frac{\max\{\alpha,2^{-\alpha+1}\}}{\alpha-1}$. In the following we assume w.l.o.g. that $C_{\alpha,\kappa}>1$ and $c_{\alpha,\kappa}<1$.

    • $q^{*}(x)$ Upper bound: For any $x>0$,

      \[q^{*}(x)=\min\{q\in\mathbb{N}:\beta_{norm}(q)\leq x\}\leq\min\{q\in\mathbb{N}:q\geq(\frac{x}{C_{\alpha,\kappa}})^{-\frac{1}{\alpha}}\}=\lceil(\frac{x}{C_{\alpha,\kappa}})^{-\frac{1}{\alpha}}\rceil.\]

      In particular, we obtain $q^{*}(x)\leq(\frac{x}{C_{\alpha,\kappa}})^{-\frac{1}{\alpha}}+1\leq 2C_{\alpha,\kappa}^{\frac{1}{\alpha}}\max\{x^{-\frac{1}{\alpha}},1\}$. The assertion holds with $C_{\alpha,\kappa}^{(1)}:=2\max\{C_{\alpha,\kappa},1\}^{\frac{1}{\alpha}}$.

    • q(x)q^{*}(x) Lower bound: Similarly to above,

      q(x)(xcα,κ)1α(xcα,κ)1α=cα,κ1αx1α.q^{*}(x)\geq\lceil(\frac{x}{c_{\alpha,\kappa}})^{-\frac{1}{\alpha}}\rceil\geq\big{(}\frac{x}{c_{\alpha,\kappa}}\big{)}^{-\frac{1}{\alpha}}=c_{\alpha,\kappa}^{\frac{1}{\alpha}}x^{-\frac{1}{\alpha}}.

      On the other hand, q(x)1cα,κ1αq^{*}(x)\geq 1\geq c_{\alpha,\kappa}^{\frac{1}{\alpha}}, which yields the assertion with cα,κ(1)=min{cα,κ,1}1αc_{\alpha,\kappa}^{(1)}=\min\{c_{\alpha,\kappa},1\}^{\frac{1}{\alpha}}.

    • r(δ)r(\delta) Upper bound: Put r=2αα1cα,κ1α1δαα1r=2^{\frac{\alpha}{\alpha-1}}c_{\alpha,\kappa}^{-\frac{1}{\alpha-1}}\delta^{\frac{\alpha}{\alpha-1}}. Then we have

      q(r)r(rcα,κ)1αr=2αα1cα,κ1α121α1cα,κ1α1δ1α1δαα12δ>δ.q^{*}(r)r\geq\lceil(\frac{r}{c_{\alpha,\kappa}})^{-\frac{1}{\alpha}}\rceil r=2^{\frac{\alpha}{\alpha-1}}c_{\alpha,\kappa}^{-\frac{1}{\alpha-1}}\lceil 2^{-\frac{1}{\alpha-1}}c_{\alpha,\kappa}^{\frac{1}{\alpha-1}}\delta^{-\frac{1}{\alpha-1}}\rceil\delta^{\frac{\alpha}{\alpha-1}}\geq 2\delta>\delta.

      By definition of r()r(\cdot), r(δ)rr(\delta)\leq r. It was already shown in Lemma 7.5 that r(δ)δr(\delta)\leq\delta holds for all δ>0\delta>0. We obtain the assertion with Cα,κ(2)=2αα1cα,κ1α1C_{\alpha,\kappa}^{(2)}=2^{\frac{\alpha}{\alpha-1}}c_{\alpha,\kappa}^{-\frac{1}{\alpha-1}}.

    • r(δ)r(\delta) Lower bound: First consider the case δ<Cα,κ\delta<C_{\alpha,\kappa}.

      Put r=2αα1Cα,κ1α1δαα1r=2^{-\frac{\alpha}{\alpha-1}}C_{\alpha,\kappa}^{-\frac{1}{\alpha-1}}\delta^{\frac{\alpha}{\alpha-1}}. Since x:=21α1Cα,κ1α1δ1α1>1x:=2^{\frac{1}{\alpha-1}}C_{\alpha,\kappa}^{\frac{1}{\alpha-1}}\delta^{-\frac{1}{\alpha-1}}>1, x2x\lceil x\rceil\leq 2x and thus

      q(r)r(rCα,κ)1αr=2αα1Cα,κ1α121α1Cα,κ1α1δ1α1δαα1221δδ.q^{*}(r)r\leq\lceil(\frac{r}{C_{\alpha,\kappa}})^{-\frac{1}{\alpha}}\rceil r=2^{-\frac{\alpha}{\alpha-1}}C_{\alpha,\kappa}^{-\frac{1}{\alpha-1}}\lceil 2^{\frac{1}{\alpha-1}}C_{\alpha,\kappa}^{\frac{1}{\alpha-1}}\delta^{-\frac{1}{\alpha-1}}\rceil\delta^{\frac{\alpha}{\alpha-1}}\leq 2\cdot 2^{-1}\delta\leq\delta.

      By definition of r()r(\cdot), r(δ)r=2αα1min{(δCα,κ)1α1,1}δr(\delta)\geq r=2^{-\frac{\alpha}{\alpha-1}}\min\{(\frac{\delta}{C_{\alpha,\kappa}})^{\frac{1}{\alpha-1}},1\}\delta.

      In the case $\delta\geq C_{\alpha,\kappa}$, we have

      \[q^{*}(\delta)\delta\leq\lceil(\frac{\delta}{C_{\alpha,\kappa}})^{-\frac{1}{\alpha}}\rceil\delta=1\cdot\delta=\delta,\]

      thus r(δ)δ=min{(δCα,κ)1α1,1}δ2αα1min{(δCα,κ)1α1,1}δr(\delta)\geq\delta=\min\{(\frac{\delta}{C_{\alpha,\kappa}})^{\frac{1}{\alpha-1}},1\}\delta\geq 2^{-\frac{\alpha}{\alpha-1}}\min\{(\frac{\delta}{C_{\alpha,\kappa}})^{\frac{1}{\alpha-1}},1\}\delta. We conclude that the assertion holds with cα,κ(2)=2αα1Cα,κ1α1c_{\alpha,\kappa}^{(2)}=2^{-\frac{\alpha}{\alpha-1}}C_{\alpha,\kappa}^{-\frac{1}{\alpha-1}}.

  2. (ii)

    We have $\beta_{norm}(q)=\frac{\beta(q)}{q}=C_{\rho,\kappa}\frac{\rho^{q}}{q}$, where $C_{\rho,\kappa}=\frac{\kappa\rho}{1-\rho}$. In the following we assume w.l.o.g. that $C_{\rho,\kappa}>8$.

    • q(x)q^{*}(x) Upper bound: Put ψ(x)=max{log(x1),1}\psi(x)=\max\{\log(x^{-1}),1\}. Define q~=ψ(xCρ,κlog(ρ1))log(ρ1)\tilde{q}=\lceil\frac{\psi(\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})})}{\log(\rho^{-1})}\rceil. Then we have

      βnorm(q~)Cρ,κρlog((xCρ,κlog(ρ1))1)/log(ρ1)q~xlog(ρ1)q~xψ(xCρ,κlog(ρ1))x,\beta_{norm}(\tilde{q})\leq C_{\rho,\kappa}\frac{\rho^{\log(\big{(}\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})}\big{)}^{-1})/\log(\rho^{-1})}}{\tilde{q}}\leq\frac{\frac{x}{\log(\rho^{-1})}}{\tilde{q}}\leq\frac{x}{\psi(\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})})}\leq x,

      thus

      q(x)=min{q:βnorm(q)x}q~=ψ(xCρ,κlog(ρ1))log(ρ1).q^{*}(x)=\min\{q\in\mathbb{N}:\beta_{norm}(q)\leq x\}\leq\tilde{q}=\Big{\lceil}\frac{\psi(\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})})}{\log(\rho^{-1})}\Big{\rceil}.

      In particular, we obtain

      \[q^{*}(x)\leq\frac{1}{\log(\rho^{-1})}\big(\psi(x)+\log(C_{\rho,\kappa}\log(\rho^{-1}))\big)+1\leq\frac{2(1+\log(C_{\rho,\kappa}\log(\rho^{-1})))}{\log(\rho^{-1})}\psi(x),\]

      that is, the assertion holds with $C_{\rho,\kappa}^{(1)}=\frac{2(1+\log(C_{\rho,\kappa}\log(\rho^{-1})))}{\log(\rho^{-1})}$.

    • q(x)q^{*}(x) Lower Bound: Case 1: Assume that x<Cρ,κlog(ρ1)ρ4x<C_{\rho,\kappa}\log(\rho^{-1})\rho^{4}. Define q~=14log((xCρ,κlog(ρ1))1)log(ρ1)1\tilde{q}=\lceil\frac{1}{4}\frac{\log((\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})})^{-1})}{\log(\rho^{-1})}\rceil\geq 1. Then q~12log((xCρ,κlog(ρ1))1)log(ρ1)\tilde{q}\leq\frac{1}{2}\frac{\log((\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})})^{-1})}{\log(\rho^{-1})}, and thus

      \[\beta_{norm}(\tilde{q})\geq C_{\rho,\kappa}\frac{\Big(\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})}\Big)^{1/2}}{\tilde{q}}\geq(C_{\rho,\kappa}\log(\rho^{-1}))^{1/2}\frac{x^{1/2}}{\log((\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})})^{-1/2})}>x\]

      since

      (xCρ,κlog(ρ1))1/2>log((xCρ,κlog(ρ1))1/2).(\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})})^{-1/2}>\log((\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})})^{-1/2}).

      We have therefore shown that for x<Cρ,κlog(ρ1)ρ4x<C_{\rho,\kappa}\log(\rho^{-1})\rho^{4},

      \[q^{*}(x)\geq\tilde{q}=\max\{1,\tilde{q}\}. \tag{7.82}\]

      Case 2: If xCρ,κlog(ρ1)ρ4x\geq C_{\rho,\kappa}\log(\rho^{-1})\rho^{4}, then q~1\tilde{q}\leq 1, that is,

      q(x)1=max{1,q~}.q^{*}(x)\geq 1=\max\{1,\tilde{q}\}.

      We have shown that for all x>0x>0,

      q(x)max{1,q~}.q^{*}(x)\geq\max\{1,\tilde{q}\}.

      Since

      q~\displaystyle\tilde{q} \displaystyle\geq 14log((xCρ,κlog(ρ1))1)log(ρ1)14log(ρ1)[log(x1)+log(Cρ,κlog(ρ1))]\displaystyle\frac{1}{4}\frac{\log((\frac{x}{C_{\rho,\kappa}\log(\rho^{-1})})^{-1})}{\log(\rho^{-1})}\geq\frac{1}{4\log(\rho^{-1})}\big{[}\log(x^{-1})+\log(C_{\rho,\kappa}\log(\rho^{-1}))\big{]}
      \displaystyle\geq 14log(ρ1)log(x1),\displaystyle\frac{1}{4\log(\rho^{-1})}\log(x^{-1}),

      the assertion follows with cρ,κ(1)=14log(ρ1)c_{\rho,\kappa}^{(1)}=\frac{1}{4\log(\rho^{-1})}.

    • r(δ)r(\delta) Upper bound: Put r~=2(cρ,κ(1))1δlog((21cρ,κ(1)δ1)e)\tilde{r}=\frac{2(c_{\rho,\kappa}^{(1)})^{-1}\delta}{\log((2^{-1}c_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)}. Then we have

      q(r~)r~\displaystyle q^{*}(\tilde{r})\tilde{r} \displaystyle\geq cρ,κ(1)log(r~1e)r~\displaystyle c_{\rho,\kappa}^{(1)}\log(\tilde{r}^{-1}\vee e)\cdot\tilde{r}
      =\displaystyle= 2δlog((21cρ,κ(1)δ1)e)log([21cρ,κ(1)δ1log((21cρ,κ(1)δ1)e)]e)\displaystyle\frac{2\delta}{\log((2^{-1}c_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)}\cdot\log([2^{-1}c_{\rho,\kappa}^{(1)}\delta^{-1}\log((2^{-1}c_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)]\vee e)
      \displaystyle\geq 2δlog((21cρ,κ(1)δ1)e)log([21cρ,κ(1)δ1]e)=2δ>δ.\displaystyle\frac{2\delta}{\log((2^{-1}c_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)}\cdot\log([2^{-1}c_{\rho,\kappa}^{(1)}\delta^{-1}]\vee e)=2\delta>\delta.

      By definition of r()r(\cdot), we obtain

      r(δ)r~.r(\delta)\leq\tilde{r}.

      For $a\in(0,1)$, the function $(0,\infty)\to(0,\infty)$, $x\mapsto\frac{\log(x^{-1}\vee e)}{\log((ax^{-1})\vee e)}$ attains its maximum at $x=ae^{-1}$ with maximum value $1+\log(a^{-1})$. Applying this with $a=2^{-1}c_{\rho,\kappa}^{(1)}$ and $x=\delta$ yields

      \[\tilde{r}\leq 2(c_{\rho,\kappa}^{(1)})^{-1}(1+\log(2(c_{\rho,\kappa}^{(1)})^{-1}))\cdot\frac{\delta}{\log(\delta^{-1}\vee e)},\]

      that is, the assertion holds with $C_{\rho,\kappa}^{(2)}=2(c_{\rho,\kappa}^{(1)})^{-1}(1+\log(2(c_{\rho,\kappa}^{(1)})^{-1}))$.

    • r(δ)r(\delta) Lower Bound: Put r~=21(Cρ,κ(1))1δlog((2Cρ,κ(1)δ1)e)\tilde{r}=\frac{2^{-1}(C_{\rho,\kappa}^{(1)})^{-1}\delta}{\log((2C_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)}. Then

      \[
      \begin{aligned}
      q^{*}(\tilde{r})\tilde{r} &\leq C_{\rho,\kappa}^{(1)}\log(\tilde{r}^{-1}\vee e)\cdot\tilde{r}\\
      &= \frac{2^{-1}\delta}{\log((2C_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)}\cdot\log([2C_{\rho,\kappa}^{(1)}\delta^{-1}\log((2C_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)]\vee e)\\
      &\leq \frac{2^{-1}\delta}{\log((2C_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)}\cdot\big[\log((2C_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)+\log\log((2C_{\rho,\kappa}^{(1)}\delta^{-1})\vee e)\big]\\
      &\leq \delta,
      \end{aligned}
      \]

      where the last step is due to log(x)+loglog(x)2log(x)\log(x)+\log\log(x)\leq 2\log(x) for xex\geq e. By definition of r()r(\cdot), we obtain

      r(δ)r~.r(\delta)\geq\tilde{r}.

      For a>1a>1, the function (0,)(0,),xlog(x1e)log((ax1)e)(0,\infty)\to(0,\infty),x\mapsto\frac{\log(x^{-1}\vee e)}{\log((ax^{-1})\vee e)} attains its minimum at x=e1x=e^{-1} with minimum value 11+log(a)\frac{1}{1+\log(a)}. We therefore obtain

      r~(Cρ,κ(1))12(1+log(2Cρ,κ(1)))δlog(δ1e),\tilde{r}\geq\frac{(C_{\rho,\kappa}^{(1)})^{-1}}{2(1+\log(2C_{\rho,\kappa}^{(1)}))}\frac{\delta}{\log(\delta^{-1}\vee e)},

      that is, the assertion holds with cρ,κ(2)=(Cρ,κ(1))12(1+log(2Cρ,κ(1)))c_{\rho,\kappa}^{(2)}=\frac{(C_{\rho,\kappa}^{(1)})^{-1}}{2(1+\log(2C_{\rho,\kappa}^{(1)}))}.
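Two steps of part (ii) of this proof can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, $q^{*}(x)$ is found by direct search using the formula $\beta_{norm}(q)=C_{\rho,\kappa}\rho^{q}/q$ from the proof, and the parameter values used for checking are arbitrary. It compares $q^{*}(x)$ with the candidate $\tilde{q}$ of the upper-bound step and evaluates the ratio $\frac{\log(x^{-1}\vee e)}{\log((ax^{-1})\vee e)}$ whose extreme values are used in the bounds for $r(\delta)$.

```python
import math


def beta_norm(q, kappa, rho):
    """beta_norm(q) = C * rho^q / q with C = kappa * rho / (1 - rho)."""
    c_const = kappa * rho / (1 - rho)
    return c_const * rho ** q / q


def q_star(x, kappa, rho, q_max=100000):
    """Smallest natural number q with beta_norm(q) <= x, by direct search."""
    for q in range(1, q_max + 1):
        if beta_norm(q, kappa, rho) <= x:
            return q
    raise ValueError("no q <= q_max satisfies beta_norm(q) <= x")


def q_tilde(x, kappa, rho):
    """Candidate from the upper-bound step: ceil(psi(x / (C * L)) / L),

    where L = log(1/rho) and psi(y) = max{log(1/y), 1}.
    """
    c_const = kappa * rho / (1 - rho)
    log_inv_rho = math.log(1 / rho)
    y = x / (c_const * log_inv_rho)
    psi = max(math.log(1 / y), 1.0)
    return math.ceil(psi / log_inv_rho)


def log_ratio(x, a):
    """log(max{1/x, e}) / log(max{a/x, e})."""
    return math.log(max(1 / x, math.e)) / math.log(max(a / x, math.e))
```

For $a\in(0,1)$ the ratio peaks at $x=ae^{-1}$ with value $1+\log(a^{-1})$, and for $a>1$ it is smallest at $x=e^{-1}$ with value $\frac{1}{1+\log(a)}$, which can be confirmed on a grid of $x$ values with `log_ratio`.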

Lemma 7.11 (Form of $V_{n}$).
  1. (i)

    Polynomial decay $\Delta(j)=\kappa j^{-\alpha}$ (where $\alpha>1$): Then there exist some constants $C_{\alpha,\kappa}^{(3)},c_{\alpha,\kappa}^{(3)}$ only depending on $\kappa,\alpha,\mathbb{D}_{n}$ such that

    \[c_{\alpha,\kappa}^{(3)}\|f\|_{2,n}\max\{\|f\|_{2,n}^{-\frac{1}{\alpha}},1\}\leq V_{n}(f)\leq C_{\alpha,\kappa}^{(3)}\|f\|_{2,n}\max\{\|f\|_{2,n}^{-\frac{1}{\alpha}},1\}.\]
  2. (ii)

    Geometric decay $\Delta(j)=\kappa\rho^{j}$ (where $\rho\in(0,1)$): Then there exist some constants $c_{\rho,\kappa}^{(3)},C_{\rho,\kappa}^{(3)}$ only depending on $\kappa,\rho,\mathbb{D}_{n}$ such that

    \[c_{\rho,\kappa}^{(3)}\|f\|_{2,n}\max\{\log(\|f\|_{2,n}^{-1}),1\}\leq V_{n}(f)\leq C_{\rho,\kappa}^{(3)}\|f\|_{2,n}\max\{\log(\|f\|_{2,n}^{-1}),1\}.\]
Proof of Lemma 7.11.

The assertions follow from Lemma 7.9(ii) by taking $\kappa_{2}=\kappa\mathbb{D}_{n}$. The maximum in the lower bounds is obtained due to the additional summand $\|f\|_{2,n}$ in $V_{n}(f)$.
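The two-sided rate in part (ii) can be made tangible with a small numerical experiment. The sketch below rests on an assumption that is not restated in this section: that $V_{n}(f)$ has the form $\|f\|_{2,n}+\sum_{j\geq 1}\min\{\|f\|_{2,n},\mathbb{D}_{n}\Delta(j)\}$, as suggested by the "additional summand" mentioned in the proof. The function names, the truncation of the sum, and the values $\kappa_{2}=1$, $\rho=0.5$ are hypothetical choices for illustration only.

```python
import math


def v_geometric(sigma, kappa2, rho, n_terms=2000):
    """sigma + sum_{j=1}^{n_terms} min{sigma, kappa2 * rho^j}.

    Assumed form of V_n(f) with sigma = ||f||_{2,n} and kappa2 = kappa * D_n;
    the infinite sum is truncated, which is harmless for geometric decay.
    """
    return sigma + sum(min(sigma, kappa2 * rho ** j) for j in range(1, n_terms + 1))


def rate_geometric(sigma):
    """The claimed rate sigma * max{log(1/sigma), 1}."""
    return sigma * max(math.log(1 / sigma), 1.0)


def ratio_range(kappa2, rho, sigmas):
    """Smallest and largest value of v_geometric / rate_geometric over sigmas."""
    ratios = [v_geometric(s, kappa2, rho) / rate_geometric(s) for s in sigmas]
    return min(ratios), max(ratios)
```

If the ratio stays between two positive constants over many orders of magnitude of `sigma`, this is consistent with the lemma; it is of course no substitute for the proof.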

The following lemma bounds the entropy integral by an integral over the well-known bracketing numbers with respect to the $\|\cdot\|_{2,n}$-norm in the case that $\sup_{n\in\mathbb{N}}\mathbb{D}_{n}<\infty$. For this, we use the upper bounds of $V_{n}$ given in Lemma 7.11.

Lemma 7.12.
  1. (i)

    Polynomial decay $\Delta(j)=\kappa j^{-\alpha}$ (where $\alpha>1$). Then for any $\sigma\in(0,C_{\alpha,\kappa}^{(3)})$,

    \[\int_{0}^{\sigma}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}d\varepsilon\leq C_{\alpha,\kappa}^{(3)}\frac{\alpha-1}{\alpha}\int_{0}^{(\frac{\sigma}{C_{\alpha,\kappa}^{(3)}})^{\frac{\alpha}{\alpha-1}}}u^{-\frac{1}{\alpha}}\sqrt{\mathbb{H}(u,\mathcal{F},\|\cdot\|_{2,n})}du,\]

    where $C_{\alpha,\kappa}^{(3)}$ is from Lemma 7.11.

  2. (ii)

    Geometric decay $\Delta(j)=\kappa\rho^{j}$ (where $\rho\in(0,1)$). Then for any $\sigma\in(0,e^{-1}C_{\rho,\kappa}^{(3)})$,

    \[\int_{0}^{\sigma}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}d\varepsilon\leq C_{\rho,\kappa}^{(3)}\int_{0}^{E^{-}(\frac{\sigma}{C_{\rho,\kappa}^{(3)}})}\log(u^{-1})\sqrt{\mathbb{H}(u,\mathcal{F},\|\cdot\|_{2,n})}du,\]

    where $E^{-}(x)=\frac{x}{\log(x^{-1})}$ and $C_{\rho,\kappa}^{(3)}$ is from Lemma 7.11.

Proof of Lemma 7.12.
  1. (i)

    By Lemma 7.11, Vn(f)Cα,κ(3)f2,nmax{f2,n1α,1}V_{n}(f)\leq C_{\alpha,\kappa}^{(3)}\|f\|_{2,n}\max\{\|f\|_{2,n}^{-\frac{1}{\alpha}},1\}. We abbreviate c=Cα,κ(3)c=C_{\alpha,\kappa}^{(3)} in the following.

    Let ε(0,c)\varepsilon\in(0,c) and (lj,uj)(l_{j},u_{j}), j=1,,Nj=1,...,N brackets such that ujlj2,n(εc)αα1\|u_{j}-l_{j}\|_{2,n}\leq(\frac{\varepsilon}{c})^{\frac{\alpha}{\alpha-1}}. Then

    Vn(ujlj)cmax{ujlj2,n,ujlj2,nα1α}cmax{(εc)αα1,εc}cεc=ε.V_{n}(u_{j}-l_{j})\leq c\max\{\|u_{j}-l_{j}\|_{2,n},\|u_{j}-l_{j}\|_{2,n}^{\frac{\alpha-1}{\alpha}}\}\leq c\max\Big{\{}(\frac{\varepsilon}{c})^{\frac{\alpha}{\alpha-1}},\frac{\varepsilon}{c}\Big{\}}\leq c\cdot\frac{\varepsilon}{c}=\varepsilon.

    Therefore, the bracketing numbers fulfill the relation

    (ε,,Vn)((εc)αα1,,2,n).\mathbb{N}(\varepsilon,\mathcal{F},V_{n})\leq\mathbb{N}\Big{(}(\frac{\varepsilon}{c})^{\frac{\alpha}{\alpha-1}},\mathcal{F},\|\cdot\|_{2,n}\Big{)}.

    We conclude that for σ(0,c)\sigma\in(0,c),

    0σ(ε,,Vn)dε\displaystyle\int_{0}^{\sigma}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}d\varepsilon \displaystyle\leq 0σ((εc)αα1,,2,n)dε\displaystyle\int_{0}^{\sigma}\sqrt{\mathbb{H}\Big{(}(\frac{\varepsilon}{c})^{\frac{\alpha}{\alpha-1}},\mathcal{F},\|\cdot\|_{2,n}\Big{)}}d\varepsilon
    =\displaystyle= cα1α0(σc)αα1u1α(u,,2,n)du.\displaystyle c\frac{\alpha-1}{\alpha}\int_{0}^{(\frac{\sigma}{c})^{\frac{\alpha}{\alpha-1}}}u^{-\frac{1}{\alpha}}\sqrt{\mathbb{H}(u,\mathcal{F},\|\cdot\|_{2,n})}du.

    In the last step, we used the substitution u=(εc)αα1u=(\frac{\varepsilon}{c})^{\frac{\alpha}{\alpha-1}} which leads to dudε=αα11c(εc)1α1=αα11cu1α\frac{du}{d\varepsilon}=\frac{\alpha}{\alpha-1}\cdot\frac{1}{c}\cdot(\frac{\varepsilon}{c})^{\frac{1}{\alpha-1}}=\frac{\alpha}{\alpha-1}\cdot\frac{1}{c}\cdot u^{\frac{1}{\alpha}}.

  2. (ii)

    By Lemma 7.11, Vn(f)Cρ,κ(3)E(f2,n)V_{n}(f)\leq C_{\rho,\kappa}^{(3)}E(\|f\|_{2,n}) with E(x)=xmax{log(x1),1}E(x)=x\max\{\log(x^{-1}),1\}. We abbreviate c=Cρ,κ(3)c=C_{\rho,\kappa}^{(3)} in the following.

    We first collect some properties of EE. Put E(x)=xlog(x1e)E^{-}(x)=\frac{x}{\log(x^{-1}\vee e)}. In the case x>e1x>e^{-1}, we have E(E(x))=xE(E^{-}(x))=x. In the case xe1x\leq e^{-1}, we have

    E(E(x))=xlog(x1)log(x1log(x1)1)xlog(x1)log(x1)=x.E(E^{-}(x))=\frac{x}{\log(x^{-1})}\cdot\log\Big{(}\frac{x^{-1}}{\log(x^{-1})^{-1}}\Big{)}\leq\frac{x}{\log(x^{-1})}\log(x^{-1})=x.

    This shows that for all x>0x>0,

    E(E(x))x.E(E^{-}(x))\leq x. (7.83)

    Furthermore, for x<e1x<e^{-1},

    log(E(x)1)=log(x1log(x1))log(x1).\log(E^{-}(x)^{-1})=\log(x^{-1}\log(x^{-1}))\geq\log(x^{-1}). (7.84)

    Now let ε(0,1)\varepsilon\in(0,1) and (lj,uj)(l_{j},u_{j}), j=1,,Nj=1,...,N brackets such that ujlj2,nE(εc)\|u_{j}-l_{j}\|_{2,n}\leq E^{-}(\frac{\varepsilon}{c}). Then by (7.83),

    Vn(ujlj)cE(E(εc))cεc=ε.V_{n}(u_{j}-l_{j})\leq cE(E^{-}(\frac{\varepsilon}{c}))\leq c\cdot\frac{\varepsilon}{c}=\varepsilon.

    Therefore, we have the following relation between the bracketing numbers

    (ε,,Vn)(E(εc),,2,n).\mathbb{N}(\varepsilon,\mathcal{F},V_{n})\leq\mathbb{N}\Big{(}E^{-}(\frac{\varepsilon}{c}),\mathcal{F},\|\cdot\|_{2,n}\Big{)}.

    We conclude that for σ(0,ce1)\sigma\in(0,ce^{-1}),

    \[\int_{0}^{\sigma}\sqrt{\mathbb{H}(\varepsilon,\mathcal{F},V_{n})}d\varepsilon\leq\int_{0}^{\sigma}\sqrt{\mathbb{H}\Big(E^{-}(\frac{\varepsilon}{c}),\mathcal{F},\|\cdot\|_{2,n}\Big)}d\varepsilon\leq c\int_{0}^{E^{-}(\frac{\sigma}{c})}\log(u^{-1})\sqrt{\mathbb{H}(u,\mathcal{F},\|\cdot\|_{2,n})}du.\]

    In the last step, we used the substitution u=E(εc)u=E^{-}(\frac{\varepsilon}{c}) which leads to dudε=1c1+log((ε/c)1)log((ε/c)1)2\frac{du}{d\varepsilon}=\frac{1}{c}\cdot\frac{1+\log((\varepsilon/c)^{-1})}{\log((\varepsilon/c)^{-1})^{2}}, and with (7.84) we obtain

    dε=clog((ε/c)1)21+log((ε/c)1)duclog((ε/c)1)duclog(E(εc)1)du=clog(u1)du.d\varepsilon=c\frac{\log((\varepsilon/c)^{-1})^{2}}{1+\log((\varepsilon/c)^{-1})}du\leq c\log((\varepsilon/c)^{-1})du\leq c\log(E^{-}(\frac{\varepsilon}{c})^{-1})du=c\log(u^{-1})du.
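The change of variables used in part (i) of this proof, $u=(\frac{\varepsilon}{c})^{\frac{\alpha}{\alpha-1}}$, can be sanity-checked by numerical integration. The sketch below assumes a toy entropy $\mathbb{H}(u,\mathcal{F},\|\cdot\|_{2,n})=\log(u^{-1})$ for $u\in(0,1)$, which is a hypothetical choice and not taken from the paper; the function names, the midpoint rule, and the values $\sigma=1$, $c=2$, $\alpha=3$ are likewise illustrative. It compares $\int_{0}^{\sigma}\sqrt{\mathbb{H}((\varepsilon/c)^{\frac{\alpha}{\alpha-1}})}d\varepsilon$ with $c\frac{\alpha-1}{\alpha}\int_{0}^{(\sigma/c)^{\frac{\alpha}{\alpha-1}}}u^{-\frac{1}{\alpha}}\sqrt{\mathbb{H}(u)}du$, which should agree up to discretisation error.

```python
import math


def midpoint(f, a, b, n):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def sqrt_entropy(u):
    """Square root of the toy entropy H(u) = log(1/u), valid for 0 < u < 1."""
    return math.sqrt(math.log(1 / u))


def lhs(sigma, c, alpha, n=100000):
    """Integral over epsilon in (0, sigma) before the substitution."""
    p = alpha / (alpha - 1)
    return midpoint(lambda eps: sqrt_entropy((eps / c) ** p), 0.0, sigma, n)


def rhs(sigma, c, alpha, n=100000):
    """Integral over u after the substitution u = (epsilon / c)^(alpha/(alpha-1))."""
    p = alpha / (alpha - 1)
    upper = (sigma / c) ** p
    integral = midpoint(lambda u: u ** (-1 / alpha) * sqrt_entropy(u), 0.0, upper, n)
    return c / p * integral  # c / p equals c * (alpha - 1) / alpha
```

The midpoint rule never evaluates the integrands at $0$, so the integrable singularities at the origin cause no division errors, only a slower convergence that the tolerance of the comparison has to absorb.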