
Asymptotically optimal Wasserstein couplings for the small-time stable domain of attraction

Jorge González Cázares, David Kramer-Bang & Aleksandar Mijatović. IIMAS, UNAM, Mexico. Department of Mathematics, Aarhus University, Denmark. Department of Statistics, University of Warwick & The Alan Turing Institute, UK.
Abstract.

We develop two novel couplings between general pure-jump Lévy processes in $\mathbb{R}^{d}$ and apply them to obtain upper bounds on the rate of convergence in an appropriate Wasserstein distance on the path space for a wide class of Lévy processes attracted to a multidimensional stable process in the small-time regime. We also establish general lower bounds based on certain universal properties of slowly varying functions and the relationship between the Wasserstein and Toscani–Fourier distances of the marginals. Our upper and lower bounds typically have matching rates. In particular, the rate of convergence is polynomial for the domain of normal attraction and slower than a slowly varying function for the domain of non-normal attraction.

Key words and phrases:
small-time scaling limit, stable domain of attraction, Wasserstein distance, Coupling of Lévy processes
2020 Mathematics Subject Classification:
Primary 60F05, 60G51; Secondary 60G52, 60F25.

1. Introduction

Stable processes arise naturally as universal scaling limits of a vast class of stochastic processes at either small or large times. In particular, in the small-time regime, stable processes arise as weak limits of discretisation errors of widely used models in theoretical and applied probability [34, 27, 1]. Most of these models are based on Lévy processes in the small-time domain of attraction of a stable process [26, 7, 12]. In contrast with the more classical long-time regime of Lamperti, where the literature is abundant (see, e.g. [25, 24, 28, 9, 33]), the study of the convergence in the small-time regime, which is the focus of this paper, has been underdeveloped. In the long-time regime, the convergence is a consequence of heavy tails with regularly varying tail probabilities or a finite second moment. On the other hand, in the small-time regime, the convergence depends on the activity of the small jumps of the underlying Lévy process and does not depend on the behaviour of the tail probabilities [26]. However, having a heavy-tailed limit may severely deteriorate the convergence speed as uniform integrability typically fails. Quantifying such an error is a fundamental problem, crucial in a number of disparate application areas, such as controlling the bias of discretised models in mathematical finance and elsewhere (see [26] and the references therein), quantifying the model misspecification risk [8] or asserting the convergence properties of estimators for the index of variation, such as Hill’s estimator, which is known to require a second order condition for the convergence to have good properties [17, p. 193–195].

The main aim of the present paper is to establish lower and upper bounds in Wasserstein distance on the convergence rate of multivariate Lévy processes attracted to a stable process in the domains of both normal and non-normal attraction (see definition in Section 2 below). Moreover, we will show that our bounds are often sharp. Our upper bounds are applicable to a large class of Lévy processes that are attracted to a multivariate $\alpha$-stable process (which is Gaussian if $\alpha=2$ and heavy-tailed if $\alpha\in(0,2)$), while our lower bounds are universal within the small-time regime. To establish the upper bounds on the path supremum norm, we construct two couplings between any two arbitrary Lévy processes, inspired by the stochastic representations in [13], and bound the $L^{p}$-norm of the maximum distance between the paths of the resulting processes. The lower bounds for the domain of normal attraction are obtained by comparing the Wasserstein distance with the Toscani–Fourier distance between the marginals and, in the case of non-normal attraction, by using a universal property of slowly varying functions.

We show that in the domain of normal attraction to heavy-tailed laws, under suitable second order assumptions, the rates of convergence of the upper and lower bounds are polynomial and agree for the $L^{q}$-norm in the cases $q<\alpha<1$ and $q=1<\alpha$, making our couplings rate-optimal in this sense. In the domain of non-normal attraction (to either Gaussian or a heavy-tailed stable law), the upper and lower bounds are both ‘slow’ and, in particular, the convergence is never faster than $\log^{-1-\varepsilon}(1/t)$ as $t\downarrow 0$ for any $\varepsilon>0$. Moreover, for large subclasses of Lévy processes, the upper and lower bounds on the convergence rate agree in the case $q=1<\alpha$ (see e.g. Corollary 2.4). In the domain of normal attraction to the Gaussian law, our upper and lower bounds are also polynomial and dependent on the Blumenthal–Getoor index of the attracted process. The bounds on the convergence rates in this case often agree and, when they do not, the gap between them is small (see Figure 2 below). A short YouTube presentation [21] describes our results, including the ideas behind the proofs.

1.1. Summary of our results in the heavy-tailed stable domain of attraction

In preparation for the summary of our results in Table 1, we introduce some notation: $f(t)\lesssim g(t)$ as $t\downarrow 0$ holds for two functions $f,g\geqslant 0$ if there exist $c,t_{0}>0$ satisfying $f(t)\leqslant cg(t)$ for all $t\in(0,t_{0}]$. An eventually positive function $G$ is slowly varying at infinity, $G\in\mbox{SV}_{\infty}$, if $\lim_{x\to\infty}G(cx)/G(x)=1$ for all $c>0$.
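The definition of slow variation can be checked numerically. The following is a minimal sketch, not taken from the paper: the helper names `variation_ratio`, `log_like` and `power_like` are hypothetical, and $\log(e+x)$ is chosen only as a convenient example of a slowly varying function. It contrasts the ratio $G(cx)/G(x)$ for a slowly varying $G$ with the same ratio for the regularly varying function $x^{1/2}$.

```python
import math


def variation_ratio(G, c, x):
    """Return G(c*x)/G(x); slow variation at infinity means this tends to 1."""
    return G(c * x) / G(x)


def log_like(x):
    # Illustrative slowly varying function (an assumption of this sketch).
    return math.log(math.e + x)


def power_like(x):
    # Regularly (not slowly) varying comparison: the ratio stays at c**0.5.
    return math.sqrt(x)


if __name__ == "__main__":
    for x in (1e2, 1e8, 1e32, 1e128):
        slow = variation_ratio(log_like, 2.0, x)
        power = variation_ratio(power_like, 2.0, x)
        print(f"x={x:.0e}  log-ratio={slow:.6f}  power-ratio={power:.6f}")
```

The logarithmic ratio approaches $1$ only at a logarithmic speed, which is the same mechanism behind the slow rates in the domain of non-normal attraction.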

Table 1 summarises our results on the convergence rates established here for processes in both domains of attraction of stable processes. Recall that an $\alpha$-stable process has a finite $q$-moment if and only if $q<\alpha$. Due to this technical constraint, our upper bounds on the $L^{q}$-Wasserstein distance, defined in (1) below, always require $q<\alpha$ for the corresponding domains of attraction. We remark that, in this case, both lower and upper bounds are typically asymptotically equivalent up to a multiplicative constant, making our methods and couplings rate optimal. Indeed, in the domain of normal attraction, this occurs for any admissible $q>0$ if $\alpha\in(0,1)$ and for the $L^{1}$-Wasserstein distance if $\alpha\in(1,2)$. Our bounds for the domain of non-normal attraction are also seen to be rate optimal when $q=1$ (and hence $\alpha>1$) and $G$ is sufficiently regular (see the discussion following Theorem 2.3 and Corollary 2.4 below).

Domain of attraction: bounds for $\alpha\in(0,2)\setminus\{1\}$, $q\in(0,\alpha)\cap(0,1]$ and $t\downarrow 0$.

normal: $t^{1-q/\alpha}\lesssim\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})\lesssim t^{1-q/\alpha}\mathds{1}_{\{\alpha<1\}}+t^{q(1-1/\alpha)}\mathds{1}_{\{\alpha>1\}}$.

non-normal: $L(t)\lesssim\max\{\mathcal{W}_{q}(\bm{X}^{t},\bm{Z}),\mathcal{W}_{q}(\bm{X}^{2t},\bm{Z})\}\lesssim L(t)^{q}$, where $L(t)\coloneqq t^{-1}|G'(t^{-1})|/G(t^{-1})$ cannot be bounded above by any non-decreasing integrable function $\ell>0$: if $\int_{0}^{1}\ell(t)t^{-1}\mathrm{d}t<\infty$, then $\limsup_{t\downarrow 0}L(t)/\ell(t)=\infty$.

Table 1. Summary of the results in Theorem 2.1 for the domain of normal attraction and Theorem 2.3 and Corollary 2.4 for the domain of non-normal attraction. The domain of normal attraction requires that $\lim_{t\to 0}G(1/t)$ exists in $(0,\infty)$, where $g(t)=t^{1/\alpha}G(1/t)$ is the normalising function in the scaling limit $\bm{X}_{t}/g(t)\xrightarrow{d}\bm{Z}_{1}$ and $\bm{Z}$ is the stable process of index $\alpha$. Otherwise, $\bm{X}$ is in the domain of non-normal attraction. An example of an increasing function $\ell$ satisfying $\int_{0}^{1}\ell(t)t^{-1}\mathrm{d}t<\infty$ is $\ell(t)=|\log t|^{-1-\varepsilon}$ for any $\varepsilon>0$.

More precisely, in Table 1 we let $\bm{X}=(\bm{X}_{t})_{t\in[0,1]}$ be a Lévy process in $\mathbb{R}^{d}$ attracted to an $\alpha$-stable process $\bm{Z}=(\bm{Z}_{t})_{t\in[0,1]}$ with normalising function $g$. That is, $\bm{X}^{t}=(\bm{X}_{s}^{t})_{s\in[0,1]}\coloneqq(\bm{X}_{st}/g(t))_{s\in[0,1]}$, $t\in(0,1]$, satisfies $\bm{X}^{t}_{1}=\bm{X}_{t}/g(t)\xrightarrow{d}\bm{Z}_{1}$ as $t\downarrow 0$. The table gives asymptotic bounds on the distance $\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})$ as $t\downarrow 0$ in both regimes of attraction. We let the assumptions of either Theorem 2.1 (with $p=1$, for the domain of normal attraction) or Theorem 2.3 (for the domain of non-normal attraction) hold for $\alpha\in(0,2)\setminus\{1\}$ and pick $q\in(0,\alpha)\cap(0,1]$ satisfying $\mathds{E}[|\bm{X}_{1}|^{q}]<\infty$. We stress that the lower bounds in Table 1 in both domains of normal and non-normal attraction require no assumptions beyond the existence of the scaling limit (see Theorem 5.1 below for the precise description of the class of Lévy processes $\bm{X}$ attracted to a stable process $\bm{Z}$). In particular, as explained in the caption of Table 1, for any $\bm{X}$ in the domain of non-normal attraction and arbitrary $\varepsilon>0$, there exists a positive decreasing sequence $(t_{k})_{k\in\mathbb{N}}$ tending to zero, such that the lower bound on the $L^{q}$-Wasserstein distance satisfies $L(t_{k})\geqslant|\log t_{k}|^{-1-\varepsilon}$ for all $k\in\mathbb{N}$. In Example 3.2 below, we show that, even if the slowly varying function $G(1/t)=g(t)t^{-1/\alpha}$ in the scaling limit grows arbitrarily slowly, the lower bound $L$ may be asymptotically equivalent to it and bounded below by $1/\log^{1+\varepsilon}(1/t)$ for all small $t>0$.

Recall that the slowly varying function $t\mapsto G(1/t)$ in the scaling limit $\bm{X}_{t}/g(t)\xrightarrow{d}\bm{Z}_{1}$ (as $t\downarrow 0$) is uniquely determined up to asymptotic equivalence only. Interestingly, our results imply that the rate of convergence in the Wasserstein distance can be affected by different choices of $G$, see Remark 2.5(IV) below for more details. Finally, we note that the couplings yielding the upper bounds in Table 1 require some structural assumptions on the Lévy measure of $\bm{X}$ discussed in Sections 2 and 5 below.

1.2. A heuristic account of our couplings of Lévy processes

One of the main purposes of this article is to introduce two couplings between two arbitrary multivariate Lévy processes $\bm{X}$ and $\bm{Y}$ and analyse their properties. Both coupling constructions are centered around coupling the respective Poisson jump measures $\Xi_{\bm{X}}$ and $\Xi_{\bm{Y}}$. As the Brownian components of $\bm{X}$ and $\bm{Y}$ are coupled synchronously under both couplings, the couplings are named after the techniques involved in coupling of the Poisson jump measures $\Xi_{\bm{X}}$ and $\Xi_{\bm{Y}}$: the first is the thinning coupling, as it is based on Poisson thinning (see Subsection 4.1), and the second is the comonotonic coupling, based on the minimal transport coupling of real-valued random variables and LePage’s simulation method (see Subsection 4.2). As illustrated in Figure 1, the thinning coupling aims to maximise the intersection of the Poisson jump measures, whereas the comonotonic coupling aims to establish an optimal one-to-one correspondence between the atoms of the Poisson jump measures.

[Figure 1: panel (a) Comonotonic coupling; panel (b) Thinning coupling.]
Figure 1. $\Xi_{\bm{X}}$ and $\Xi_{\bm{Y}}$ are Poisson random measures of jumps of the Lévy processes $\bm{X}$ and $\bm{Y}$, respectively. Panel (a) depicts atoms of $\Xi_{\bm{X}}$ and $\Xi_{\bm{Y}}$ under the comonotonic coupling, where the angular component of each jump of $\bm{X}$ and $\bm{Y}$ coincides, while their magnitudes are coupled in the comonotonic fashion. In panel (b), the atoms of $\Xi_{\bm{X}}$ and $\Xi_{\bm{Y}}$ from the thinning coupling are sampled: either each jump of $\bm{X}$ coincides with a jump of $\bm{Y}$ or the two jumps are sampled independently.

The assumptions and constructions of both couplings are rather different. In the thinning coupling, we consider a common dominating Lévy measure (say, the sum of both Lévy measures) such that the Lévy measures of both processes are absolutely continuous with respect to it with a bounded density. Then, we consider a Poisson measure with mean measure given by the dominating Lévy measure and then thin the Poisson measure appropriately to produce coupled Poisson measures with mean measures given by the Lévy measures of the processes. This maximises the common jumps of both processes.
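The thinning idea can be illustrated in the simplest finite-activity setting. The sketch below is an illustration under stated assumptions rather than the construction of Subsection 4.1: the function name `thinning_coupling` is hypothetical, both Lévy measures are assumed to be finite and supported on finitely many real jump sizes (given as dictionaries mapping size to rate), and the dominating measure is taken to be their sum. Each atom of the dominating Poisson measure carries one uniform mark, and the same mark decides whether the atom is kept for each process, so that shared atoms occur at rate equal to the minimum of the two rates.

```python
import random


def thinning_coupling(nu_x, nu_y, horizon=1.0, seed=0):
    """Couple the jumps of two compound Poisson processes by thinning.

    nu_x, nu_y: dicts {jump_size: rate} (finite Levy measures; an assumption).
    Returns two sorted lists of (time, size) atoms on [0, horizon].
    """
    rng = random.Random(seed)
    jumps_x, jumps_y = [], []
    for size in sorted(set(nu_x) | set(nu_y)):
        rate_x = nu_x.get(size, 0.0)
        rate_y = nu_y.get(size, 0.0)
        mu = rate_x + rate_y  # rate of the dominating measure at this size
        if mu <= 0.0:
            continue
        f_x, f_y = rate_x / mu, rate_y / mu  # bounded densities w.r.t. mu
        t = rng.expovariate(mu)  # first arrival of the dominating process
        while t <= horizon:
            u = rng.random()  # one mark per atom, shared by both thinnings
            if u < f_x:
                jumps_x.append((t, size))
            if u < f_y:
                jumps_y.append((t, size))
            t += rng.expovariate(mu)
    return sorted(jumps_x), sorted(jumps_y)


if __name__ == "__main__":
    jx, jy = thinning_coupling({1.0: 3.0, -0.5: 1.0}, {1.0: 2.0, 0.7: 1.0}, horizon=2.0)
    common = set(jx) & set(jy)
    print(len(jx), len(jy), len(common))
```

When the two measures coincide, the two thinned atom sets are identical. When their supports are disjoint, no atoms are shared and the jumps of the two processes are independent.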

For the comonotonic coupling, we assume that both processes have a radial decomposition with their own angular measures. We then construct a radial decomposition for both with respect to a common angular measure. We use this measure and LePage’s method to construct a one-to-one correspondence of jumps in which both processes jump in the same direction but with different magnitudes. Indeed, both Poisson measures are a transformation of a standard Poisson measure with independent decorations that select the direction of the jump. With our assumption, we construct such a transformation explicitly with the following properties. The decorations of both processes agree. Conditionally given a direction, the jumps of both processes, when ordered by decreasing magnitude, are in a one-to-one correspondence that mimics the relationship of real random variables under the minimal transport (or comonotonic) coupling. More precisely, the magnitudes are expressed as the right inverse of the radial tail Lévy measure evaluated on the epochs of a standard Poisson process.
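The last sentence above can be made concrete in dimension one. The following is a minimal sketch under simplifying assumptions that are not from the paper: the function name `comonotonic_jumps` is hypothetical, both radial tail measures are assumed to be pure power laws $r\mapsto c\,r^{-\alpha}$ (so the right inverse has the closed form $(\Gamma/c)^{-1/\alpha}$), the common angular measure puts mass `p_up` on $+1$ and the rest on $-1$, and jump times are omitted. Both processes share the Poisson epochs $\Gamma_{1}<\Gamma_{2}<\cdots$ and the directions, so the $k$-th largest jump of one is paired with the $k$-th largest jump of the other.

```python
import random


def comonotonic_jumps(alpha_x, c_x, alpha_y, c_y, n_jumps=10, p_up=0.5, seed=0):
    """Return the n_jumps largest coupled jumps as (jump_x, jump_y) pairs.

    Assumes radial tails c * r**(-alpha) for both processes (illustrative).
    """
    rng = random.Random(seed)
    gamma = 0.0
    pairs = []
    for _ in range(n_jumps):
        gamma += rng.expovariate(1.0)  # next epoch of a standard Poisson process
        direction = 1.0 if rng.random() < p_up else -1.0  # shared decoration
        # Right inverse of the tail r -> c * r**(-alpha), evaluated at gamma.
        r_x = (gamma / c_x) ** (-1.0 / alpha_x)
        r_y = (gamma / c_y) ** (-1.0 / alpha_y)
        pairs.append((direction * r_x, direction * r_y))
    return pairs


if __name__ == "__main__":
    for jump_x, jump_y in comonotonic_jumps(0.8, 1.0, 1.5, 2.0, n_jumps=5):
        print(f"{jump_x:+.4f}  {jump_y:+.4f}")
```

Because the epochs increase, the magnitudes of both processes decrease along the list, which is the ordering used in the one-to-one correspondence.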

1.3. Comparison with the literature

Few results identifying the small-time convergence rate exist in the multivariate setting even when the limit is Gaussian. Indeed, most results in this regime are restricted to dimension $1$ and often require finite jump activity [11, 10, 14, 15]. In those situations, the limit law of the rescaled error can be identified for some functionals, leading to accurate estimates of the resulting bias and the celebrated continuity corrections [10, 15].

For heavy-tailed stable limits (i.e. non-Gaussian) and infinite activity Lévy processes attracted to the Gaussian law, again in one dimension, the literature is more scarce and only a fraction of the analogous results exist (see [7] for the convergence of certain path statistics to heavy-tailed limits). There are several complications in developing such results for small-time. First, the Berry–Esseen type bounds (see e.g. [24, 37]), commonly used to establish convergence rates to the Gaussian law, fail to give convergent upper bounds since the jump intensity is vanishing in the small-time regime. Second, the rescaled variables often either fail to be uniformly integrable or their uniform integrability is difficult to prove (see, e.g. [7]).

In [16], the authors consider estimating the density of a discretely observed Lévy process satisfying Orey’s condition. Under the assumption that sufficiently large jumps are identifiable and removable in the sample, the estimation attains a minimax rate that is optimal up to a logarithmic factor if the Blumenthal–Getoor index is known. This regime is different from our situation, as the authors assume that we may remove all sufficiently large jumps. In fact, under this kind of assumption, the residual small-jump process may not be attracted to a stable process but to a Brownian motion [2].

In [34, 30], the authors introduce couplings between Lévy processes to bound the Wasserstein distance between them. The coupling in [34] is generic and pays special attention to the small jumps. However, the bounds fail to converge to zero when applied to a stable process and a Lévy process in its small-time stable domain of attraction. In contrast to the coupling used in [34], where the authors couple the big-jump components based on the magnitude of the jumps (i.e. based on a common threshold), we couple these components matching their jump intensities. Moreover, in [34] the authors couple the small jumps through an artificial Brownian motion, while we instead couple the compensated Poisson measures directly. On the other hand, the coupling in [30], based on McCann’s coupling and Rogers’ results on random walks, is the optimal Markovian coupling. However, the coupling requires such tight control of the infinitesimal dynamics of the processes that the coupling could only be constructed for Lévy processes with finitely many jumps on compact intervals, excluding all heavy-tailed stable processes and most processes in the small-time domain of attraction of Brownian motion.

Although the slow convergence phenomenon under the presence of a slowly varying function that does not converge to a positive constant has been observed in some specific settings such as in the case of Hill’s estimator (see [17, p. 193–195]), to the best of our knowledge it was first documented rigorously in [9] in an elementary general setting. The authors in [9] lower bound the Prokhorov distance between the marginals of the limit and that of a random walk in its domain of non-normal attraction with a function $b(n)$, satisfying $\limsup_{n\to\infty}b(n)\log^{1+\varepsilon}(n)=\infty$ for any $\varepsilon>0$. However, as is often the case with lower bounds in the form of upper limits, the sparsity of the sequence of times along which the divergence holds remains unclear. The present paper extends the applicability of such a lower bound and strengthens the conclusions. In particular we show that the function analogous to $b(n)$ is typically slowly varying and provide some explicit asymptotically equivalent lower bounds.

1.4. Organisation of the article

In Section 2 we introduce the main results of the paper, namely, upper and lower bounds on the convergence rate for processes in the domains of normal and non-normal attraction. Subsection 2.5 explains why the existing literature cannot be directly applied to obtain the bounds presented in Section 2. We present two examples in Section 3 in which our main results are applied to tempered stable processes. The two couplings for general Lévy processes on $\mathbb{R}^{d}$ used to prove the upper bounds on the Wasserstein distance in Section 2 are introduced in Section 4. General upper bounds (in $L^{p}$) for each component in the Lévy–Itô decomposition of coupled Lévy processes are also established in Section 4. The upper bounds for processes in the domain of (normal and non-normal) attraction of a stable process (Gaussian and heavy-tailed) are established in Section 5, while the lower bounds are established in Section 6. The proofs of the results stated in Section 2 are given in Section 7. Section 8 concludes the paper.

2. Main results

The $L^{q}$-Wasserstein distance $\mathcal{W}_{q}(\bm{\xi},\bm{\zeta})$, for any $q\in(0,\infty)$, between the laws of $\mathbb{R}^{d}$-valued random vectors $\bm{\xi}$ and $\bm{\zeta}$ equals $\inf_{\bm{\xi}'\overset{d}{=}\bm{\xi},\,\bm{\zeta}'\overset{d}{=}\bm{\zeta}}\mathds{E}[|\bm{\xi}'-\bm{\zeta}'|^{q}]^{1/(q\vee 1)}$, where the infimum is taken over all couplings $(\bm{\xi}',\bm{\zeta}')$ with $\bm{\xi}'\overset{d}{=}\bm{\xi}$ and $\bm{\zeta}'\overset{d}{=}\bm{\zeta}$ (throughout $|\cdot|$ denotes the Euclidean norm in $\mathbb{R}^{d}$ and $x\vee y\coloneqq\max\{x,y\}$ for $x,y\in\mathbb{R}$; see footnote 1). For $\mathbb{R}^{d}$-valued stochastic processes $\bm{\mathcal{X}}=(\bm{\mathcal{X}}_{t})_{t\in[0,1]}$ and $\bm{\mathcal{Y}}=(\bm{\mathcal{Y}}_{t})_{t\in[0,1]}$, the $L^{q}$-Wasserstein distance, based on the distance between the paths in the uniform norm, is given by:

Footnote 1. For $q\in(0,1)$ and any $u>0$, $y\geqslant 0$ we have $q(u+y)^{q-1}\leqslant qu^{q-1}$. Hence, by integrating in $u\in(0,x)$, we obtain $(x+y)^{q}\leqslant x^{q}+y^{q}$ for all $x,y\geqslant 0$, thus implying that $\mathcal{W}_{q}$ is a metric.

\[
\mathcal{W}_{q}(\bm{\mathcal{X}},\bm{\mathcal{Y}})\coloneqq\inf_{\bm{\mathcal{X}}'\overset{d}{=}\bm{\mathcal{X}},\,\bm{\mathcal{Y}}'\overset{d}{=}\bm{\mathcal{Y}}}\mathds{E}\bigg[\sup_{t\in[0,1]}|\bm{\mathcal{X}}'_{t}-\bm{\mathcal{Y}}'_{t}|^{q}\bigg]^{1/(q\vee 1)},\qquad q>0, \tag{1}
\]

where the infimum is taken over all couplings $(\bm{\mathcal{X}}',\bm{\mathcal{Y}}')$ with $\bm{\mathcal{X}}'\overset{d}{=}\bm{\mathcal{X}}$ and $\bm{\mathcal{Y}}'\overset{d}{=}\bm{\mathcal{Y}}$, where $\bm{\mathcal{X}}'\overset{d}{=}\bm{\mathcal{X}}$ means that $\bm{\mathcal{X}}'$ and $\bm{\mathcal{X}}$ are equal in law as processes. The case $q\in(0,1)$ is important in our setting because the stable limit does not necessarily possess the first moment.
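For intuition about the marginal distance $\mathcal{W}_{q}$, the infimum over couplings can be computed exactly in a toy case. The sketch below is not part of the paper: the function name `empirical_wasserstein_1d` is hypothetical, it treats only real-valued samples of equal size, and it is restricted to $q\geqslant 1$, where the cost $|x-y|^{q}$ is convex and the monotone (sorted) pairing of the two samples is an optimal coupling of the empirical measures. For $q<1$ the cost is concave and the sorted pairing need not be optimal, so that case is deliberately rejected.

```python
def empirical_wasserstein_1d(xs, ys, q=1.0):
    """W_q between two equal-size empirical measures on the real line, q >= 1.

    Uses the sorted (comonotonic) pairing and the normalisation
    E[|x - y|**q] ** (1 / max(q, 1)) from the definition of W_q.
    """
    if q < 1.0:
        raise ValueError("sorted pairing is only guaranteed optimal for q >= 1")
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    cost = sum(abs(x - y) ** q for x, y in pairs) / len(xs)
    return cost ** (1.0 / max(q, 1.0))


if __name__ == "__main__":
    print(empirical_wasserstein_1d([0.0, 1.0, 2.0], [3.0, 2.0, 1.0], q=1.0))
    print(empirical_wasserstein_1d([0.0, 0.0], [3.0, 4.0], q=2.0))
```

The path-space distance in (1) replaces $|x-y|$ by the supremum distance between paths, for which no such closed-form optimal coupling is available in general.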

Let $\bm{X}=(\bm{X}_{t})_{t\in[0,1]}$ and $\bm{Z}=(\bm{Z}_{t})_{t\in[0,1]}$ be Lévy processes in $\mathbb{R}^{d}$ (see [40, Ch. 1, Def. 1.6] for definition), where $\bm{Z}$ is $\alpha$-stable (see Section 5.1 below for definition; see also footnote 2). We say $\bm{X}$ is in the small-time domain of attraction of $\bm{Z}$ if $(\bm{X}_{st}/g(t))_{s\in[0,1]}\xrightarrow{d}(\bm{Z}_{s})_{s\in[0,1]}$ as $t\downarrow 0$ in the Skorokhod space for some normalising positive function $g:(0,1]\to(0,\infty)$. Then, it is well known that $\bm{Z}$ is $\alpha$-stable for some $\alpha\in(0,2]$ and the normalising function admits the representation $g(t)=t^{1/\alpha}G(t^{-1})$ where $G$ is a slowly varying function at infinity (see [26, Eq. (8)]) that is asymptotically unique (see Theorem 5.1 below for the description of all Lévy processes $\bm{X}$ attracted to $\bm{Z}$). We say $\bm{X}$ is in the domain of normal attraction when the slowly varying function $G(x)$ converges to a positive finite constant as $x\to\infty$ (see [20, p. 181]). Otherwise, we say $\bm{X}$ is in the domain of non-normal attraction. Throughout the paper we denote $\bm{X}^{t}=(\bm{X}_{s}^{t})_{s\in[0,1]}\coloneqq(\bm{X}_{st}/g(t))_{s\in[0,1]}$ for $t\in(0,1]$.

Footnote 2. Note that $\bm{Z}$ need not be isotropic: the angular component of its Lévy measure is not necessarily a uniform probability measure on the unit sphere $\mathbb{S}^{d-1}$ in $\mathbb{R}^{d}$, see Section 5.1 for details.

2.1. Heavy-tailed stable domain of normal attraction

For a Lévy process $\bm{X}$ to be in the domain of attraction of an $\alpha$-stable process $\bm{Z}$, the necessary condition in (25) of Theorem 5.1 suggests the Lévy measure of $\bm{X}$ around the origin should be “asymptotically absolutely continuous” with respect to the $\alpha$-stable Lévy measure of $\bm{Z}$ (see also Remark 5.3(b) below). Assumption (T) quantifies the regularity of the corresponding density at the origin $\bm{0}$ via the parameter $p>0$ (the larger $p$ is, the more asymptotic regularity there is). Assumption (T) is required for the upper bound on the rate of convergence in the scaling limit in Theorem 2.1 and is stated in Section 5.2 below. Moreover, Assumption (T), widely satisfied in practice (e.g. in the class of tempered stable processes [39] with $p=1$; cf. Section 3 below for specific examples), can be seen as quantifying the speed of convergence in the necessary condition (25) for $\bm{X}$ to be in the stable domain of attraction.

Throughout the paper, for positive functions $f_{1}$ and $f_{2}$, we use the notation $f_{1}(x)=\mathcal{O}(f_{2}(x))$ as $x\downarrow 0$ if $\limsup_{x\downarrow 0}f_{1}(x)/f_{2}(x)<\infty$, and $f_{1}(x)=\mathrm{o}(f_{2}(x))$ as $x\downarrow 0$ if $\lim_{x\downarrow 0}f_{1}(x)/f_{2}(x)=0$.

Theorem 2.1.

Let $\alpha\in(0,2)\setminus\{1\}$, $\bm{Z}$ be $\alpha$-stable and $\bm{X}$ be in the domain of normal attraction of $\bm{Z}$.
(a) Let Assumption (T) hold for $p=1$. Then for any $q\in(0,1]\cap(0,\alpha)$ with $\mathds{E}[|\bm{X}_{1}|^{q}]<\infty$, as $t\downarrow 0$,

\[
\mathcal{W}_{q}\big(\bm{X}^{t},\bm{Z}\big)=\begin{dcases}\mathcal{O}\big(t^{1-q/\alpha}\big),&\alpha<1,\\ \mathcal{O}\big(t^{q(1-1/\alpha)}\big),&\alpha>1.\end{dcases}
\]

(b) If $\bm{X}$ does not have the law of $\bm{Z}$, then for any $q\in(0,1]\cap(0,\alpha)$ there exists some $C_{q}>0$ satisfying

\[
\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})\geqslant\mathcal{W}_{q}(\bm{X}_{1}^{t},\bm{Z}_{1})\geqslant C_{q}t^{1-q/\alpha},\quad\text{for all sufficiently small }t>0.
\]
Remark 2.2.

(I) The upper bounds in Theorem 2.1(a) are based on the thinning coupling in Section 4.1 below. The upper bounds are asymptotically proportional to the lower bounds of Theorem 2.1(b) when either $\alpha<1$, or $\alpha>1$ and $q=1$, making the thinning coupling rate-optimal with respect to these Wasserstein distances. Coincidentally, the upper bounds decay the fastest for small values of $q$ when $\alpha<1$ and for $q=1$ when $\alpha>1$. Moreover, the multiplicative constants in $\mathcal{O}$ can be made explicit and depend on the dimension $d$ only through the characteristics of $\bm{X}$ and $\bm{Z}$. The lower bounds are based on the lower bound on the Toscani–Fourier distance, see Section 6.2 below for details.
(II) Note that $\mathds{E}[|\bm{Z}_{1}|^{q}]<\infty$ for $q<\alpha$ and that most models in practice satisfy Assumption (T) with $p=1$. Theorem 2.1 thus focuses on the case $p=1$ in order to simplify the exposition, while retaining the key message of the paper. Our technical result Theorem 5.5 in Section 5 (resp. Lemma 6.4 in Subsection 6.2), used to prove part (a) (resp. part (b)) of Theorem 2.1, covers all parameters $p>0$ and $\mathcal{W}_{q}$-distances with $q\in(0,1]\cap(0,\alpha)$. The statement of the corresponding general version of Theorem 2.1 is omitted for brevity. ∎

2.2. Stable domain of non-normal attraction

Consider the case where the slowly varying function $G$ in the scaling limit $(\bm{X}_{st}/(t^{1/\alpha}G(1/t)))_{s\in[0,1]}\xrightarrow{d}(\bm{Z}_{s})_{s\in[0,1]}$, as $t\downarrow 0$, is not asymptotically equivalent to a positive constant. In this section, we show that the lower bound on the $\mathcal{W}_{q}$-distance cannot be upper bounded by a positive non-decreasing function $\phi$ satisfying $\int_{0}^{1}\phi(t)t^{-1}\mathrm{d}t<\infty$. The lower bound requires no assumptions (beyond $\bm{X}$ being in the domain of attraction), while the assumptions for the upper bounds give us multiplicative non-asymptotic control over the distance from $G(x/t)/G(1/t)$ to $1$ for small $t>0$ and any $x>0$.

Assumption (S).

There exist $G_{1},\,G_{2}:[0,\infty)\to[0,\infty)$, such that $G_{2}$ is bounded with $G_{2}(t)\to 0$ as $t\downarrow 0$, $G_{1}$ is a slowly varying function both at $0$ and at infinity and

\[
\left|G(x/t)/G(1/t)-1\right|\leqslant G_{1}(x)G_{2}(t),\quad\text{ for all }x>0\text{ and all sufficiently small }t>0.
\]

The upper bound in Theorem 2.3(a) below requires an additional technical Assumption (C), see Section 5 below. Intuitively, these assumptions require non-parametric structural properties of the Lévy measure of $\bm{X}$ that allow us to compare it to the Lévy measure of the stable limit $\bm{Z}$. Indeed, the necessary condition in (25) of Theorem 5.1 suggests the Lévy measure of $\bm{X}$ around the origin should “asymptotically admit a radial decomposition that is close to that of the stable process”. Assumption (C) states precisely this and specifies the proximity of the corresponding radial decomposition to that of the stable process via the parameters $p,\delta>0$ (as before, the larger $p$ and $\delta$ are, the closer the radial decompositions are). Moreover, both conditions are widely satisfied with $p=1$, e.g. for the class of tempered $\alpha$-stable processes (see [39] and, for specific examples, Section 3 below).

Theorem 2.3.

Let $\bm{X}$ be in the domain of non-normal attraction of an $\alpha$-stable process $\bm{Z}$.
(a) Let $\alpha\in(0,2)\setminus\{1\}$ and Assumptions (C) and (S) hold for some $p\in(0,\infty)\setminus\{\alpha-1\}$, $\delta>0$ and a function $G_{2}$ that is slowly varying at $0$. Then $\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})=\mathcal{O}(G_{2}(t)^{q})$ as $t\downarrow 0$ for any $q\in(0,1]\cap(0,\alpha)\setminus\{\alpha/(p+1),\alpha/(\alpha\delta+1)\}$ with $\mathds{E}[|\bm{X}_{1}|^{q}]<\infty$.
(b) Let $\alpha\in(0,2]$ and define $a(t)\coloneqq G(1/(2t))/G(1/t)$ for $t>0$. Then for any $q\in(0,1]\cap(0,\alpha)$,

\[
\max\big\{\mathcal{W}_{q}(\bm{X}^{t}_{1},\bm{Z}_{1}),\mathcal{W}_{q}(\bm{X}^{2t}_{1},\bm{Z}_{1})\big\}\geqslant|1-a(t)^{q}|\,\mathds{E}\big[|\bm{Z}_{1}|^{q}\big]/3,\quad\text{for all sufficiently small }t>0.
\]

Moreover, $|1-a(t)^{q}|$ cannot be upper bounded by a non-decreasing function $\phi$ with $\int_{0}^{1}\phi(t)t^{-1}\mathrm{d}t<\infty$.

Since $\left|G(1/(2t))/G(1/t)-1\right|\leqslant G_{1}(1/2)G_{2}(t)$ for all small $t>0$ and $G$ is not asymptotically equivalent to a positive finite constant, Lemma 7.2 below (which extends [9, Prop., p. 683]) implies that $G_{2}$ cannot be upper bounded by any non-decreasing function $\phi$ satisfying $\int_{0}^{1}\phi(t)t^{-1}\mathrm{d}t<\infty$. The assumption on the slow variation of $G_{2}$ in Theorem 2.3(a) is not essential and may be replaced by assuming that $G_{2}$ dominates any (positive) power at zero. However, by Lemma 7.1 below, in most cases of interest such a function $G_{2}$ will be slowly varying.

Given a slowly varying function $G$, the construction of functions $G_{1}$ and $G_{2}$ satisfying Assumption (S) is not immediately clear. However, in most cases and for a sufficiently regular $G$, by virtue of Lemma 7.1, Assumption (S) will be satisfied by choosing $G_{2}(t)\sim t^{-1}|G'(1/t)|/G(1/t)$ as $t\downarrow 0$ (the function $L$ of Table 1) and a slowly varying $G_{1}$ (at $0$ and $\infty$) with $G_{1}(x)\geqslant|\log x|\cdot\sup_{t>0,y\in[x\wedge 1,x\vee 1]}G'(y/t)/G'(1/t)$ (for $a,b\in\mathbb{R}$, we denote $a\wedge b\coloneqq\min\{a,b\}$). In such cases, the lower bound in Theorem 2.3 is (by Lemma 7.1) proportional to $G_{2}$, i.e. $G_{2}(t)\sim|1-a(t)|/\log 2$ as $t\downarrow 0$, making the comonotonic coupling rate optimal with respect to the $\mathcal{W}_{1}$-distance when $\alpha>1$. The following corollary makes this precise and shows that this is the case for a large class of processes in the domain of non-normal attraction.
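The relation between the lower-bound quantity $|1-a(t)|$ and the function $L(t)=t^{-1}|G'(1/t)|/G(1/t)$ from Table 1 can be checked numerically in a simple case. The sketch below is an illustration only: the choice $G(x)=\log(e+x)$ and the Python names `G`, `G_prime`, `L` and `a` are assumptions made here, not objects defined in the paper's code. It evaluates $|1-a(t)|/(L(t)\log 2)$ for small $t$, and compares $L(t)$ with a polynomial rate and with $|\log t|^{-1-\varepsilon}$.

```python
import math


def G(x):
    # Illustrative slowly varying function that diverges (non-normal attraction).
    return math.log(math.e + x)


def G_prime(x):
    # Derivative of G.
    return 1.0 / (math.e + x)


def L(t):
    # L(t) = t^{-1} |G'(1/t)| / G(1/t), as in Table 1.
    return abs(G_prime(1.0 / t)) / (t * G(1.0 / t))


def a(t):
    # a(t) = G(1/(2t)) / G(1/t), as in Theorem 2.3(b).
    return G(1.0 / (2.0 * t)) / G(1.0 / t)


if __name__ == "__main__":
    for t in (1e-3, 1e-6, 1e-12, 1e-30):
        ratio = abs(1.0 - a(t)) / (math.log(2.0) * L(t))
        print(f"t={t:.0e}  L(t)={L(t):.5f}  ratio={ratio:.6f}  "
              f"t^0.1={t ** 0.1:.5f}  |log t|^-1.5={abs(math.log(t)) ** -1.5:.5f}")
```

For this $G$ the ratio is close to $1$ for small $t$, while $L(t)$ behaves like $1/\log(1/t)$: it eventually dominates both the polynomial $t^{0.1}$ and the integrable benchmark $|\log t|^{-1.5}$.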

Corollary 2.4.

Let 𝐗\bm{X} be in the domain of non-normal attraction of an α\alpha-stable process 𝐙\bm{Z}.
(a) Let α(0,2){1}\alpha\in(0,2)\setminus\{1\} and Assumption ( (C).) hold for some δ>0\delta>0 and pα1p\neq\alpha-1. Suppose GG is C1C^{1} with derivative equal to G~(t)/(c+t)\widetilde{G}(t)/(c+t), where c0c\geqslant 0 and |G~|SV|\widetilde{G}|\in\mbox{SV}_{\infty} is eventually positive. Further suppose there exists a slowly varying function ϕ(x)\phi(x) both at zero and infinity satisfying ϕ(x)supt>0,y[x1,x1]G~(yt)/G~(t)\phi(x)\geqslant\sup_{t>0,y\in[x\wedge 1,x\vee 1]}\widetilde{G}(yt)/\widetilde{G}(t) for x>0x>0. Define L(t)|G~(1/t)|/G(1/t)L(t)\coloneqq|\widetilde{G}(1/t)|/G(1/t), then, for any q(0,1](0,α){α/2,α/(αδ+1)}q\in(0,1]\cap(0,\alpha)\setminus\{\alpha/2,\alpha/(\alpha\delta+1)\} with 𝔼[|𝐗1|q]<\mathds{E}[|\bm{X}_{1}|^{q}]<\infty, we have 𝒲q(𝐗t,𝐙)=𝒪(L(t)q)\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})=\mathcal{O}(L(t)^{q}) as t0t\downarrow 0 and

\[
\max\{\mathcal{W}_q(\bm{X}^t_1,\bm{Z}_1),\mathcal{W}_q(\bm{X}^{2t}_1,\bm{Z}_1)\}\geqslant\frac{q\log 2}{3}\mathds{E}\big[|\bm{Z}_1|^q\big]\cdot L(t),\quad\text{for all sufficiently small }t>0.
\]

(b) Define iteratively the functions 1(t)=log(e+t)\ell_{1}(t)=\log(e+t) and n+1(t)=log(e+n(t))\ell_{n+1}(t)=\log(e+\ell_{n}(t)) for t0t\geqslant 0 and nn\in\mathbb{N}. Suppose GG is eventually equal to n(t)qnm(t)qm\ell_{n}(t)^{q_{n}}\cdots\ell_{m}(t)^{q_{m}} where 1nm1\leqslant n\leqslant m in \mathbb{N} and either qn,,qm0q_{n},\ldots,q_{m}\geqslant 0 with qn,qm>0q_{n},q_{m}>0 or qn,,qm0q_{n},\ldots,q_{m}\leqslant 0 with qn,qm<0q_{n},q_{m}<0. Then GG satisfies the assumptions of Part (a).
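To make the rate function of part (a) concrete, the following Python sketch treats the simplest case of part (b), $G=\ell_1^{r}$ with a hypothetical exponent `r`. In this case $G'(t)=\widetilde{G}(t)/(e+t)$ with $c=e$ and $\widetilde{G}(t)=r\ell_1(t)^{r-1}$, so that $L(t)=|r|/\ell_1(1/t)$. The function names and the finite-difference step are illustrative assumptions and do not come from the paper.

```python
import math

def ell1(t):
    """First iterated logarithm: ell_1(t) = log(e + t)."""
    return math.log(math.e + t)

def G(t, r):
    """Slowly varying function G = ell_1^r (case n = m = 1 of part (b))."""
    return ell1(t) ** r

def G_tilde(t, r, rel_step=1e-6):
    """Approximate G~(t) = (c + t) G'(t) with c = e via a central difference."""
    h = rel_step * max(t, 1.0)
    derivative = (G(t + h, r) - G(t - h, r)) / (2.0 * h)
    return (math.e + t) * derivative

def L(t, r):
    """Rate function L(t) = |G~(1/t)| / G(1/t) from part (a)."""
    x = 1.0 / t
    return abs(G_tilde(x, r)) / G(x, r)

if __name__ == "__main__":
    # Compare the numerical value of L(t) with the closed form |r| / ell_1(1/t).
    for t in (1e-2, 1e-4, 1e-8):
        print(t, L(t, r=2.0), 2.0 / ell1(1.0 / t))
```

In this sketch the rate $L(t)$ decays only logarithmically in $1/t$, whatever the sign or size of `r`.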

Remark 2.5.

(I) The upper bound in Theorem 2.3(a) is based on the comonotonic coupling in Section 4.2 below. Since the bound is independent of both $p$ and $\delta$, the restrictions $p\neq\alpha-1$ and $q\notin\{\alpha/(p+1),\alpha/(\alpha\delta+1)\}$ are nonessential. Indeed, if $p$ and $\delta$ satisfy Assumption (C), then any $p'\in(0,p)$ and $\delta'\in(0,\delta)$ also satisfy Assumption (C). Moreover, the multiplicative constants in $\mathcal{O}$ can be made explicit and depend on the dimension $d$ only through the characteristics of $\bm{X}$ and $\bm{Z}$.
(II) The lower bounds are based on elementary estimates and a universal property of slowly varying functions, see Section 6.1 below for details. When α=2\alpha=2, despite the fact that we do not have an upper bound in Theorem 2.3(a) for this case, the lower bound of Theorem 2.3(b) ensures the nonexistence of a coupling that makes the 𝒲q\mathcal{W}_{q}-distance decay polynomially.
(III) Corollary 2.4 is a consequence of Theorem 2.3 and Lemmas 7.1–7.3 below. We stress that the resulting upper and lower bounds may converge slowly and at a rate that is, in some sense, “bounded away from polynomials” even for a very slow function $G=\ell_n$ or $G=1/\ell_n$, $n\in\mathbb{N}$, see Example 3.2 below. Furthermore, given any $\ell\in\mathrm{SV}_\infty$ with $\ell(1/t)\to 0$ as $t\downarrow 0$ and $\int_1^\infty\ell(t)t^{-1}\mathrm{d}t=\infty$, the functions $G_\pm(t)\coloneqq\exp(\pm\int_1^t\ell(s)s^{-1}\mathrm{d}s)$ are slowly varying, $G_+(t)\to\infty$ and $G_-(t)\to 0$ as $t\to\infty$, and the corresponding functions $G_2(t)$ are proportional to $\ell(1/t)$ as $t\downarrow 0$. Thus, we may construct processes $\bm{X}$ such that $\max\{\mathcal{W}_q(\bm{X}^t,\bm{Z}),\mathcal{W}_q(\bm{X}^{2t},\bm{Z})\}$ is asymptotically bounded above and below by multiples of $\ell(1/t)$ as $t\downarrow 0$, see Lemma 7.2 and Example 3.2 below.
(IV) For a given process 𝑿\bm{X}, we may choose two asymptotically equivalent slowly varying functions GG and G^\hat{G} that have different convergence properties. Indeed, if GG^{\prime} is not asymptotically equivalent to G^\hat{G^{\prime}}, then the resulting bounds will change (recall that GG is only unique up to asymptotic equivalence and that 𝑿st=𝑿st/(t1/αG(1/t))\bm{X}^{t}_{s}=\bm{X}_{st}/(t^{1/\alpha}G(1/t))). For instance, fix r1,r2(0,1)r_{1},r_{2}\in(0,1), denote (t)=(logt)r1\ell(t)=(\log t)^{r_{1}} and let

\begin{align*}
G'(t)&\coloneqq\ell'(t)(1+(1-\ell'(t)^{r_2})\cos(\ell(t))), & \hat{G}'(t)&\coloneqq\ell'(t)(1+(1-\ell'(t)^{r_2})\sin(\ell(t))),\\
G(t)&=\ell(t)+\sin(\ell(t))+\mathcal{O}((\log t)^{(r_1-1)(r_2+1)+1}), & \hat{G}(t)&=\ell(t)-\cos(\ell(t))+\mathcal{O}((\log t)^{(r_1-1)(r_2+1)+1}).
\end{align*}

Then tG(t)tG^{\prime}(t) and tG^(t)t\hat{G}^{\prime}(t) are slowly varying and G(t)/G^(t)1G(t)/\hat{G}(t)\to 1 as tt\to\infty but lim suptG(t)/G^(t)=\limsup_{t\to\infty}G^{\prime}(t)/\hat{G}^{\prime}(t)=\infty and lim inftG(t)/G^(t)=0\liminf_{t\to\infty}G^{\prime}(t)/\hat{G}^{\prime}(t)=0. Optimising the convergence rate within this class appears to be a very difficult task; however, the limitations imposed by Lemma 7.2 would apply to any choice of GG. A similar phenomenon was also observed recently in the standard central limit theorem for Lévy processes in [4], where the Kolmogorov distance is shown to satisfy (resp. fail) an integral condition for a non-standard (resp. standard) scaling.
(V) Theorem 2.3 makes full use of Assumption (S). However, a more detailed analysis that does not require $G_2$ to be slowly varying can be found in our technical result, Theorem 5.9 in Section 5 below.
(VI) We note that a lower bound via the Toscani–Fourier distance is plausible but appears suboptimal since the rate has a polynomial factor. Moreover, we believe the slow lower bound in part (b) to hold for 𝒲q(𝑿1t,𝒁1)\mathcal{W}_{q}(\bm{X}^{t}_{1},\bm{Z}_{1}) alone (i.e. without taking the maximum value between times tt and 2t2t). However, this remains a conjecture. ∎

2.3. Selecting the coupling

The main idea behind the proofs of Theorems 2.1–2.3 is a good coupling between $\bm{X}$ and $\bm{Z}$. The two couplings we apply in this article are the thinning coupling and the comonotonic coupling, introduced in Sections 4.1–4.2. In Theorem 2.3 we solely apply the comonotonic coupling, since this yields clear and concise results. One could apply the thinning coupling to get a similar result in the domain of non-normal attraction. However, since this would require a lengthy argument and would distract from the main results, it has been left out of the paper. In comparison, it is easier to use the comonotonic coupling to give bounds for processes in the domain of normal attraction.

Proposition 2.6.

Let $\bm{Z}$ be $\alpha$-stable with $\alpha\in(0,2)$ and $\bm{X}$ be in its domain of normal attraction. Let Assumption (C) (with constant $H\equiv G$ and $Q\equiv 1$) hold for some $p\in(0,\infty)\setminus\{\alpha-1\}$. Then, for any $q\in(0,1]\cap(0,\alpha)$ with $\mathds{E}[|\bm{X}_1|^q]<\infty$, we have, as $t\downarrow 0$,

\[
\mathcal{W}_q\big(\bm{X}^t,\bm{Z}\big)=\begin{dcases}\mathcal{O}\big(t^{\min\{pq/\alpha,1-q/\alpha\}}\big),&\alpha\in(0,1),\\ \mathcal{O}\big(t^{q\min\{p/\alpha,1-1/\alpha\}}\big),&\alpha\in(1,2).\end{dcases}
\]
Remark 2.7.

Proposition 2.6 follows from Theorem 5.9 (see Remark 5.10). The assumptions in Theorem 2.1(a) and Proposition 2.6 are significantly different, making it necessary to split the upper bounds into two statements. Indeed, as seen in Example 3.4 below, Assumption (C) is slightly stricter than Assumption (T): there exist processes for which Assumption (T) holds but Assumption (C) does not. In the case where Assumptions (T) and (C) are valid simultaneously with the same parameter $p=1$, Theorem 2.1 yields an upper bound that is never worse than that of Proposition 2.6. ∎

2.4. Gaussian domain of attraction

The domain of attraction to Brownian motion is substantially different as the previously described couplings are inapplicable. Obtaining a coupling between Brownian motion and other Lévy processes that reduces the $L^p$-distance in uniform norm has been the subject of a large body of literature (which we review in Subsection 2.5 below). In this paper, we use a simple independent coupling, which, heuristically, compares the pure-jump component of $\bm{X}$ with the null process $\bm{0}$. Let $\langle\cdot,\cdot\rangle$ denote the Euclidean inner product on $\mathbb{R}^d$, $\bm{0}$ denote the zero-vector in $\mathbb{R}^d$ as well as the zero-matrix in $\mathbb{R}^{d\times d}$ and let $\mathbb{R}^d_{\bm{0}}\coloneqq\mathbb{R}^d\setminus\{\bm{0}\}$. Let $\varphi_{\bm{X}}(\bm{u})\coloneqq\mathds{E}[e^{i\langle\bm{u},\bm{X}\rangle}]$ for $\bm{u}\in\mathbb{R}^d$ denote the characteristic function of $\bm{X}$. Furthermore, let $\psi_{\bm{S}}$ be the Lévy–Khintchine exponent of $\bm{S}$, given by $\psi_{\bm{S}}(\bm{u})\coloneqq t^{-1}\log\varphi_{\bm{S}_t}(\bm{u})$ for $\bm{u}\in\mathbb{R}^d$ and $t>0$.

Theorem 2.8.

Let 𝚺\bm{\Sigma} be a symmetric non-negative definite matrix on d×d\mathbb{R}^{d\times d} and define the process 𝐗t=((𝚺𝐁st+𝐒st)/t)s[0,1]\bm{X}^{t}=((\bm{\Sigma}\bm{B}_{st}+\bm{S}_{st})/\sqrt{t})_{s\in[0,1]} for t(0,1]t\in(0,1] where (𝐁t)t[0,1](\bm{B}_{t})_{t\in[0,1]} is a standard Brownian motion on d\mathbb{R}^{d} independent of the pure-jump Lévy process 𝐒\bm{S} with Blumenthal–Getoor index β\beta (defined in (27)).
(a) Suppose β[0,2)\beta\in[0,2) and fix any β(β,2]\beta_{*}\in(\beta,2] when 𝐒\bm{S} is of infinite variation and β=1\beta_{*}=1 otherwise. Then for any q>0q>0 with 𝔼[|𝐗1|q]<\mathds{E}[|\bm{X}_{1}|^{q}]<\infty, we have

\[
\mathcal{W}_q\big(\bm{X}^t,\bm{\Sigma}\bm{B}\big)=\mathcal{O}\big(t^{(q\wedge 1)(\min\{1/q,1/\beta_*\}-1/2)}\big),\qquad\text{as }t\downarrow 0.
\]

(b) Pick any 𝐮𝟎d\bm{u}_{*}\in\mathbb{R}^{d}_{\bm{0}} and define C|𝐮|1|ψ𝐒(𝐮)|>0C_{*}\coloneqq|\bm{u}_{*}|^{-1}|\psi_{\bm{S}}(\bm{u}_{*})|>0. Then for all q1q\geqslant 1, we have

\[
\mathcal{W}_q(\bm{X}^t_1,\bm{\Sigma}\bm{B}_1)\geqslant\frac{C_*}{\sqrt{2}}\sqrt{t}+\mathcal{O}(t^{3/2}),\quad\text{as }t\downarrow 0.
\]

(c) Let λ\lambda be the largest eigenvalue of 𝚺2\bm{\Sigma}^{2}. Suppose there exist δ[1,2)\delta\in[1,2) and vectors (𝐮r)r(0,)(\bm{u}_{r})_{r\in(0,\infty)} with |𝐮r|=r|\bm{u}_{r}|=r satisfying c:=infr>1rδ|ψ𝐒(𝐮r)|>0c:=\inf_{r>1}r^{-\delta}|\psi_{\bm{S}}(\bm{u}_{r})|>0. Then for any C(0,ceλ/2)C_{*}\in(0,ce^{-\lambda/2}) we have 𝒲q(𝐗1t,𝚺𝐁1)(C/2)t1δ/2\mathcal{W}_{q}(\bm{X}^{t}_{1},\bm{\Sigma}\bm{B}_{1})\geqslant(C_{*}/\sqrt{2})t^{1-\delta/2} for all sufficiently small t>0t>0.

Parts (a) and (c) of Theorem 2.8, with q=1q=1, imply that for processes whose pure jump part is in the domain of attraction of a β\beta-stable process, the upper and lower bounds are essentially proportional to t1/max{1,β}1/2t^{1/\max\{1,\beta\}-1/2} and t1max{1,β}/2t^{1-\max\{1,\beta\}/2}, respectively. These agree in the finite variation case with rate t\sqrt{t} and also as β2\beta\to 2 with an arbitrarily deteriorating convergence rate. As shown in Figure 2 these bounds are not far from each other for fixed β\beta, and the powers of tt from the rates are also not far. In the ‘limiting case’ where 𝑺\bm{S} is itself attracted to a Brownian motion and β=2\beta=2, the rescaled process (𝑺st/t)s[0,1](\bm{S}_{st}/\sqrt{t})_{s\in[0,1]} is distributionally close to (t)𝑩\ell(t)\bm{B} (see e.g. [26, Thm 2]) for a slowly varying function \ell satisfying limt0(t)=0\lim_{t\downarrow 0}\ell(t)=0. It is thus natural to expect that the convergence, in this case, is slow as in Theorem 2.3 above, see Example 6.7.

Figure 2. In the left picture we see polynomials with the dominant powers from the upper and lower bounds for 𝒲1(𝑿1t,𝚺𝑩1)\mathcal{W}_{1}(\bm{X}_{1}^{t},\bm{\Sigma}\bm{B}_{1}) from Theorem 2.8 with β=3/2\beta=3/2. In the right picture, we see the dominant powers of tt in the upper and lower bounds as a function of β[1,2]\beta\in[1,2].
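The dominant powers of $t$ discussed above can be tabulated directly. The Python sketch below evaluates the exponents $1/\max\{1,\beta\}-1/2$ (upper bound, in the limit $\beta_*\downarrow\max\{1,\beta\}$) and $1-\max\{1,\beta\}/2$ (lower bound) quoted in the text for $q=1$. The function names are illustrative assumptions.

```python
def upper_exponent(beta):
    """Power of t in the upper bound of Theorem 2.8(a) for q = 1,
    in the limit beta_* -> max(1, beta)."""
    return 1.0 / max(1.0, beta) - 0.5

def lower_exponent(beta):
    """Power of t in the lower bound of Theorem 2.8(c) with delta = max(1, beta)."""
    return 1.0 - max(1.0, beta) / 2.0

if __name__ == "__main__":
    # The gap between the two powers vanishes at beta <= 1 and at beta = 2.
    for beta in (0.5, 1.0, 1.25, 1.5, 1.75, 2.0):
        up, low = upper_exponent(beta), lower_exponent(beta)
        print(f"beta={beta:4.2f}  upper={up:.4f}  lower={low:.4f}  gap={low - up:.4f}")
```

For $\beta=3/2$ this gives the powers $1/6$ and $1/4$, the case shown in the left panel of Figure 2.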

2.5. Classical bounds are hard to apply!

The couplings and methods used to achieve these bounds are crucial, and they differ significantly from the classical methods used to find rates of convergence. Indeed, if we tried to use standard methods (namely, the Berry–Esseen theorem or [34]) to construct bounds for the small-time domain of attraction, the bounds would not converge as $t\downarrow 0$.

The Berry–Esseen theorem exploits an increase in the activity of the process to obtain bounds on the distance between a random walk and the limit law. In the small-time regime, the activity is instead decreasing, explaining the unsuitability of this tool in this context (see details below). In fact, the bound would diverge to infinity as $t\downarrow 0$ (and in particular does not go to $0$), unlike the bounds introduced in this paper. The coupling in [34] couples corresponding components of the Lévy–Itô decompositions for a common small-jump cutoff level. When the time horizon is fixed and the Lévy measure is supported on $[-\varepsilon,\varepsilon]$, the bounds of [34] are asymptotically sharp as $\varepsilon\downarrow 0$. However, for general Lévy measures and as time tends to $0$, no time-dependent cutoff level $\varepsilon_t$ can be used to obtain convergent bounds. The lack of convergent bounds in the small-time domain of attraction of stable processes is mainly caused by a difference in the jump intensities of the large-jump components (see details below).

We first explain why the Berry–Esseen theorem does not yield suitable bounds even when the limit is Gaussian (see [18, 22]). For the explanation, it is enough to consider the one-dimensional case. Let $(X_t)_{t\in[0,1]}$ be a zero-mean Lévy process on $\mathbb{R}$ with characteristic triplet $(\gamma,\sigma^2,\nu_X)$ (see [40, Def. 8.2]) and finite fourth moment. The variance of $X$ is given by $\mathds{E}[X_t^2]=(\sigma^2+\mu_2)t$, where $\mu_2\coloneqq\int_{\mathbb{R}\setminus\{0\}}x^2\nu_X(\mathrm{d}x)$. Then, $X_1^t=X_t/\sqrt{(\sigma^2+\mu_2)t}$ is attracted to a standard Gaussian random variable $Z$ as $t\downarrow 0$. Denote by $\nu_t$ the Lévy measure of $X_1^t$. The Berry–Esseen theorem then implies that there exists a universal constant $C>0$ such that

\[
\mathcal{W}_2(X_1^t,Z)\leqslant C\int_{\mathbb{R}\setminus\{0\}}x^4\nu_t(\mathrm{d}x)=C\int_{\mathbb{R}\setminus\{0\}}x^4t\nu_X\big(\mathrm{d}(\sqrt{(\sigma^2+\mu_2)t}\,x)\big)=\frac{C}{t(\sigma^2+\mu_2)^2}\int_{\mathbb{R}\setminus\{0\}}x^4\nu_X(\mathrm{d}x),
\]

for all t(0,1]t\in(0,1]. As we can see above, this upper bound will tend to \infty as t0t\downarrow 0 and is therefore not an informative bound in the small-time regime.
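The $1/t$ blow-up of this bound can be checked numerically. The Python sketch below assumes a hypothetical Lévy measure $\nu_X(\mathrm{d}x)=e^{-|x|}\mathrm{d}x$ (so $\mu_2=4$ and $\int x^4\nu_X(\mathrm{d}x)=48$) and compares a midpoint-rule evaluation of $\int x^4\nu_t(\mathrm{d}x)$, where $\nu_t$ has density $t\,s\,e^{-|sx|}$ with $s=\sqrt{(\sigma^2+\mu_2)t}$, against the closed form $t^{-1}(\sigma^2+\mu_2)^{-2}\int x^4\nu_X(\mathrm{d}x)$ obtained by the change of variables. The example measure, the function names and the quadrature parameters are all assumptions made for illustration.

```python
import math

def scaled_fourth_moment_closed_form(t, sigma2, mu2, mu4):
    """Closed form mu4 / (t (sigma2 + mu2)^2) from the change of variables."""
    return mu4 / (t * (sigma2 + mu2) ** 2)

def scaled_fourth_moment_numeric(t, sigma2, mu2, density, cutoff=40.0, steps=200_000):
    """Midpoint-rule approximation of int x^4 nu_t(dx), where nu_t has
    density x -> t * s * density(s * x) and s = sqrt((sigma2 + mu2) * t)."""
    s = math.sqrt((sigma2 + mu2) * t)
    upper = cutoff / s                 # truncate where density(s * x) is negligible
    h = 2.0 * upper / steps
    total = 0.0
    for i in range(steps):
        x = -upper + (i + 0.5) * h
        total += x ** 4 * t * s * density(s * x)
    return total * h

def example_density(x):
    """Hypothetical Levy density exp(-|x|): mu2 = 4 and fourth moment 48."""
    return math.exp(-abs(x))

if __name__ == "__main__":
    for t in (1.0, 0.1, 0.01):
        exact = scaled_fourth_moment_closed_form(t, 1.0, 4.0, 48.0)
        approx = scaled_fourth_moment_numeric(t, 1.0, 4.0, example_density)
        print(t, exact, approx)
```

Halving $t$ doubles the value, so the quantity on the right-hand side grows without bound as $t\downarrow 0$.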

For Lévy processes in the domain of attraction of an α\alpha-stable law, an application of the bounds in [34] does not yield convergent bounds. The proofs of the bounds in [34] rely on the coupling of small jumps to a Gaussian law. Again, it is enough to consider the one-dimensional case. Let XX be symmetric and in the domain of attraction of the symmetric α\alpha-stable random variable ZZ with α>1\alpha>1. Suppose their Lévy measures satisfy νX((x,x))=xα+x(α+1)/2\nu_{X}(\mathbb{R}\setminus(-x,x))=x^{-\alpha}+x^{-(\alpha+1)/2} and νZ((x,x))=xα\nu_{Z}(\mathbb{R}\setminus(-x,x))=x^{-\alpha} for x>0x>0. In this case we have X1t=Xt/t1/α𝑑ZX^{t}_{1}=X_{t}/t^{1/\alpha}\xrightarrow{d}Z as t0t\downarrow 0.

Let η=(α1)/2>0\eta=(\alpha-1)/2>0 and apply [34, Thm 11] (at time 11 and cutoff εt\varepsilon_{t}) to obtain:

\begin{align*}
\mathcal{W}_1(X^t_1,Z)&\leqslant C\varepsilon_t+\Big(\big(\tfrac{1}{2-\alpha}\varepsilon_t^{2-\alpha}+\tfrac{1}{3/2-\alpha/2}t^{\eta/\alpha}\varepsilon_t^{3\eta}\big)^{1/2}-\big(\tfrac{1}{2-\alpha}\varepsilon_t^{2-\alpha}\big)^{1/2}\Big)\\
&\qquad+2\big(\varepsilon_t^{-\alpha}+t^{\eta/\alpha}\varepsilon_t^{-\eta-1}\big)\int_{\varepsilon_t}^{\infty}\bigg|\frac{x^{-\alpha}}{\varepsilon_t^{-\alpha}}-\frac{x^{-\alpha}+x^{-(\alpha+1)/2}}{\varepsilon_t^{-\alpha}+t^{\eta/\alpha}\varepsilon_t^{-\eta-1}}\bigg|\mathrm{d}x\\
&\qquad+2t^{\eta/\alpha}\varepsilon_t^{-\eta-1}\int_{\varepsilon_t}^{\infty}x\frac{\alpha x^{-\alpha-1}\mathrm{d}x}{\varepsilon_t^{-\alpha}},\quad\text{for all }t>0,
\end{align*}

where we used the formula for the L1L^{1}-Wasserstein distance in [19, p. 8]. For the first line in the display above to vanish as t0t\downarrow 0, we require εt=o(1)\varepsilon_{t}=\mathrm{o}(1). The term in the middle line of the display above equals

\begin{align*}
2\int_{\varepsilon_t}^{\infty}\big|x^{-\alpha}t^{\eta/\alpha}\varepsilon_t^{-\eta}-x^{-\eta-1}\big|\mathrm{d}x&\geqslant 2\bigg|t^{\eta/\alpha}\varepsilon_t^{-\eta}\int_{\varepsilon_t}^{c}x^{-\alpha}\mathrm{d}x-\int_{\varepsilon_t}^{c}x^{-\eta-1}\mathrm{d}x\bigg|\\
&=2\big|\tfrac{1}{\alpha-1}(t^{\eta/\alpha}\varepsilon_t^{-3\eta}-t^{\eta/\alpha}\varepsilon_t^{-\eta}c^{-2\eta})-\tfrac{2}{\alpha-1}(\varepsilon_t^{-\eta}-c^{-\eta})\big|,
\end{align*}

where $c\in(0,\infty]$ is an arbitrary number and the inequality holds for all $t>0$ for which $\varepsilon_t<c$. For the right-hand side of the display to vanish as $t\downarrow 0$ with $c=\infty$, we must have $\varepsilon_t^{-\eta}(t^{\eta/\alpha}\varepsilon_t^{-2\eta}-2)=\mathrm{o}(1)$. Then, for $c=1$, the right-hand side of the display above converges to the constant $4/(\alpha-1)$. In particular, the bound implied by [34, Thm 11] cannot vanish for any choice of $\varepsilon_t$.
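The obstruction can be illustrated numerically. The Python sketch below evaluates the last expression of the preceding display for $c=1$ along a hypothetical cutoff $\varepsilon_t$ chosen so that $t^{\eta/\alpha}\varepsilon_t^{-2\eta}=2$ exactly, which satisfies the constraint required for $c=\infty$ with zero error. Along this choice the expression equals $4(1-\varepsilon_t^{\eta})/(\alpha-1)$ and therefore approaches $4/(\alpha-1)$ instead of vanishing. The function names and the particular cutoff are assumptions made for illustration.

```python
import math

def lower_bound_rhs(t, eps, alpha, c=1.0):
    """Evaluate 2|(s eps^{-3 eta} - s eps^{-eta} c^{-2 eta})/(alpha-1)
    - 2 (eps^{-eta} - c^{-eta})/(alpha-1)| with s = t^{eta/alpha}."""
    eta = (alpha - 1.0) / 2.0
    s = t ** (eta / alpha)
    first = (s * eps ** (-3.0 * eta) - s * eps ** (-eta) * c ** (-2.0 * eta)) / (alpha - 1.0)
    second = 2.0 * (eps ** (-eta) - c ** (-eta)) / (alpha - 1.0)
    return 2.0 * abs(first - second)

def balancing_cutoff(t, alpha):
    """Hypothetical cutoff solving t^{eta/alpha} eps^{-2 eta} = 2."""
    eta = (alpha - 1.0) / 2.0
    return (t ** (eta / alpha) / 2.0) ** (1.0 / (2.0 * eta))

if __name__ == "__main__":
    alpha = 1.5
    for t in (1e-6, 1e-12, 1e-30, 1e-60):
        eps = balancing_cutoff(t, alpha)
        print(t, eps, lower_bound_rhs(t, eps, alpha), 4.0 / (alpha - 1.0))
```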

3. Examples

In this section, we apply some of the main results from Section 2 to tempered $\alpha$-stable processes [39, Def. 2.1] that are in the domain of attraction of $\alpha$-stable processes. We say that a process $(\bm{X}_t)_{t\in[0,1]}$ is a tempered $\alpha$-stable process if it has no Gaussian component and its Lévy measure $\nu_{\bm{X}}$ has the form

\[
\nu_{\bm{X}}(A)=\int_{\mathbb{S}^{d-1}}\int_0^\infty\mathds{1}_A(x\bm{v})\alpha x^{-\alpha-1}q(x,\bm{v})\mathrm{d}x\,\sigma(\mathrm{d}\bm{v}),\quad\text{for }A\in\mathcal{B}(\mathbb{R}^d_{\bm{0}}),\tag{2}
\]

where $\mathbb{S}^{d-1}$ is the unit sphere in $\mathbb{R}^d$, $q:(0,\infty)\times\mathbb{S}^{d-1}\to(0,\infty)$ is a Borel function such that $q(\cdot,\bm{v})$ is completely monotone (see [39, p. 680]) with $q(\infty,\bm{v})=0$ for all $\bm{v}\in\mathbb{S}^{d-1}$, and $\mathcal{B}(\mathbb{R}^d_{\bm{0}})$ is the Borel $\sigma$-algebra on $\mathbb{R}^d\setminus\{\bm{0}\}$. In Examples 3.1 and 3.2 the process $\bm{X}$ is a multidimensional tempered $\alpha$-stable process in the stable domain of attraction. Both examples are easily seen to fulfil Assumption (C) or (T). Example 3.3 deals with a Gaussian perturbation of a tempered $\alpha$-stable process, while Example 3.4 shows that Assumption (T) does not imply Assumption (C).

Example 3.1.

Assume that $(\bm{Z}_t)_{t\in[0,1]}$ is an $\alpha$-stable process on $\mathbb{R}^d$ and that $(\bm{X}_t)_{t\in[0,1]}$ is a tempered stable process with Lévy measure as in (2). Assume that $q(x,\bm{v})=e^{-\lambda(\bm{v})x}$, for all $x>0$ and $\bm{v}\in\mathbb{S}^{d-1}$, where $\lambda$ is a bounded non-negative function. Thus, for some constant $K>0$ (e.g. $K=\max\{1,\sup_{\bm{v}\in\mathbb{S}^{d-1}}\lambda(\bm{v})\}$),

\[
|e^{-\lambda(\bm{v})x}-1|\leqslant(1\wedge|x||\lambda(\bm{v})|)\leqslant K(1\wedge|x|),\quad\text{for all }(x,\bm{v})\in(0,\infty)\times\mathbb{S}^{d-1}.
\]

If α>1\alpha>1, then Theorem 2.1 with p=1p=1 implies that 𝒲1(𝑿t/t1/α,𝒁1)=𝒪(t11/α)\mathcal{W}_{1}(\bm{X}_{t}/t^{1/\alpha},\bm{Z}_{1})=\mathcal{O}(t^{1-1/\alpha}), with lower bound given by 𝒲1(𝑿t/t1/α,𝒁1)Ct11/α+𝒪(t21/α)\mathcal{W}_{1}(\bm{X}_{t}/t^{1/\alpha},\bm{Z}_{1})\geqslant Ct^{1-1/\alpha}+\mathcal{O}(t^{2-1/\alpha}) as t0t\downarrow 0, for some finite constant C>0C>0. Thus, the upper and lower bounds have the same rate in this case, yielding a rate-optimal bound. ∎
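The elementary inequality used in this example can be checked on a grid. The Python sketch below tests $|e^{-\lambda x}-1|\leqslant 1\wedge(\lambda x)\leqslant K(1\wedge x)$ with the hypothetical choice $K=\max\{1,\max\lambda\}$ over a few illustrative values of $\lambda$ and $x$. The grid, the function names and the small floating-point tolerance are assumptions.

```python
import math

def tempering_gap(x, lam):
    """|q(x, v) - 1| for the exponential tempering q(x, v) = exp(-lam * x)."""
    return abs(math.exp(-lam * x) - 1.0)

def check_bounds(lam_values, x_values, tol=1e-15):
    """Check |exp(-lam x) - 1| <= min(1, lam x) <= K min(1, x) on a grid,
    with the assumed constant K = max(1, max(lam_values))."""
    K = max(1.0, max(lam_values))
    for lam in lam_values:
        for x in x_values:
            middle = min(1.0, lam * x)
            if tempering_gap(x, lam) > middle + tol:
                return False
            if middle > K * min(1.0, x) + tol:
                return False
    return True

if __name__ == "__main__":
    grid = [10.0 ** k for k in range(-6, 4)]
    print(check_bounds([0.0, 0.3, 1.0, 2.5], grid))
```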

Next, we give an example where the function GG is non-constant, and see how the rates deteriorate in these cases, as Theorem 2.3 implies. Throughout the paper, we use the notation f(x)g(x)f(x)\sim g(x) as xax\to a, if limxaf(x)/g(x)=1\lim_{x\to a}f(x)/g(x)=1.

Example 3.2.

Assume that (𝒁t)t[0,1](\bm{Z}_{t})_{t\in[0,1]} is an α\alpha-stable process on d\mathbb{R}^{d} with α(1,2)\alpha\in(1,2) and that 𝑿\bm{X} is a tempered stable process with Lévy measure as in (2). We assume that ρ𝑿𝖼([x,),𝒗)=H(x)αxα\rho_{\bm{X}}^{\mathsf{c}}([x,\infty),\bm{v})=H(x)^{\alpha}x^{-\alpha} for all x>0x>0 and 𝒗𝕊d1\bm{v}\in\mathbb{S}^{d-1} (see (32)), where HH is differentiable and slowly varying, implying q(x,𝒗)H(x)αq(x,\bm{v})\sim H(x)^{\alpha} as x0x\downarrow 0. Then ρ𝑿𝖼(x,𝒗)\rho^{\mathsf{c}\leftarrow}_{\bm{X}}(x,\bm{v}) does not depend on 𝒗𝕊d1\bm{v}\in\mathbb{S}^{d-1} and its value, denoted ρ𝑿𝖼(x)\rho^{\mathsf{c}\leftarrow}_{\bm{X}}(x), satisfies ρ𝑿𝖼(x)x1/αH(x1/α)\rho^{\mathsf{c}\leftarrow}_{\bm{X}}(x)\sim x^{-1/\alpha}H(x^{-1/\alpha}) as xx\to\infty by [6, Cor. 2.3.4]. For any 𝒗𝕊d1\bm{v}\in\mathbb{S}^{d-1}, by (33), we have

\[
G(x)=\int_{\mathbb{S}^{d-1}}H(\rho^{\leftarrow}_{\bm{X}}(x,\bm{u}))\sigma(\mathrm{d}\bm{u})=H(\rho^{\leftarrow}_{\bm{X}}(x))\sim H(x^{-1/\alpha}),\qquad\text{as }x\to\infty.
\]

Theorem 2.3 now yields both the upper and lower bound in terms of GG and related functions.

Let n\ell_{n} be recursively defined as in Lemma 7.3 below: 1(t)log(e+t)\ell_{1}(t)\coloneqq\log(e+t) and n+1(t)=log(e+n(t))\ell_{n+1}(t)=\log(e+\ell_{n}(t)) for t>0t>0. If either H(x)=n(1/x)H(x)=\ell_{n}(1/x) or H(x)=n(1/x)1H(x)=\ell_{n}(1/x)^{-1} for some nn\in\mathbb{N} (i.e. G(x)n(x1/α)G(x)\sim\ell_{n}(x^{1/\alpha}) or G(x)n(x1/α)1G(x)\sim\ell_{n}(x^{1/\alpha})^{-1} as xx\to\infty), then Lemma 7.3 shows that for small t>0t>0 we have

\[
G_2(t)\coloneqq\prod_{k=1}^{n}(e+\ell_k(1/t))^{-1}\geqslant|1-G(1/(2t))/G(1/t)|/\log 2.
\]

Moreover, by Lemma 7.1, Assumption (S) holds with this $G_2$. Thus, by Theorem 2.3, there exist constants $0<C_1<C_2$ such that, for all small enough $t>0$, we have

\[
\frac{C_1}{\prod_{k=1}^{n}(e+\ell_k(1/t))}\leqslant\max\{\mathcal{W}_1(\bm{X}_1^t,\bm{Z}),\mathcal{W}_1(\bm{X}_1^{2t},\bm{Z})\}\leqslant\frac{C_2}{\prod_{k=1}^{n}(e+\ell_k(1/t))}.
\]

Thus, despite the function n\ell_{n} being “nearly constant” for large nn\in\mathbb{N}, the convergence rates of the upper and lower bounds match and are slower than log(1/t)1ε\log(1/t)^{-1-\varepsilon} for any ε>0\varepsilon>0.
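To get a feel for how slowly this rate decays, the Python sketch below evaluates the iterated logarithms $\ell_k$ and the product $\prod_{k=1}^{n}(e+\ell_k(1/t))^{-1}$, and compares it with $\log(1/t)^{-1-\varepsilon}$. The function names and the parameter values $n=2$, $\varepsilon=1/2$ are illustrative assumptions. The comparison is asymptotic: for these values the product only overtakes the logarithmic power at extremely small $t$.

```python
import math

def iterated_logs(t, n):
    """Return [ell_1(t), ..., ell_n(t)], where ell_1(t) = log(e + t)
    and ell_{k+1}(t) = log(e + ell_k(t))."""
    values, current = [], t
    for _ in range(n):
        current = math.log(math.e + current)
        values.append(current)
    return values

def G2(t, n):
    """Rate function prod_{k=1}^{n} (e + ell_k(1/t))^{-1}."""
    return 1.0 / math.prod(math.e + v for v in iterated_logs(1.0 / t, n))

def log_power(t, eps):
    """Comparison rate log(1/t)^{-1-eps}."""
    return math.log(1.0 / t) ** (-1.0 - eps)

if __name__ == "__main__":
    for t in (1e-10, 1e-50, 1e-100, 1e-300):
        print(t, G2(t, 2), log_power(t, 0.5), G2(t, 2) / log_power(t, 0.5))
```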

Now consider any SV\ell\in\mbox{SV}_{\infty} with (t)0\ell(t)\downarrow 0 as tt\to\infty and 1(t)t1dt=\int_{1}^{\infty}\ell(t)t^{-1}\mathrm{d}t=\infty. Then G±(t)exp(±1t(s)s1ds)G_{\pm}(t)\coloneqq\exp(\pm\int_{1}^{t}\ell(s)s^{-1}\mathrm{d}s) are slowly varying, G+(t)G_{+}(t)\to\infty, G(t)0G_{-}(t)\to 0 and |1G±(t/2)/G±(t)|(t)log2|1-G_{\pm}(t/2)/G_{\pm}(t)|\sim\ell(t)\log 2 as tt\to\infty by Lemma 7.1. Thus, by Theorem 2.3, for any q(0,α)(0,1]q\in(0,\alpha)\cap(0,1], we have 𝒲q(𝑿t,𝒁)𝒲q(𝑿2t,𝒁)C(1/t)\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})\vee\mathcal{W}_{q}(\bm{X}^{2t},\bm{Z})\geqslant C^{*}\ell(1/t) for some C>0C^{*}>0 and all sufficiently small t>0t>0. In particular, by taking an appropriate \ell, e.g., (t)=1/n(t)\ell(t)=1/\ell_{n}(t) where n\ell_{n} is as in the previous paragraph and nn\in\mathbb{N} is large, the convergence in Wasserstein distance may be arbitrarily slow. ∎

Next, we consider the setting of Theorem 2.8, where the pure-jump Lévy process is a tempered $\alpha$-stable process.

Example 3.3.

Let $\bm{\Sigma}$ be a positive definite matrix on $\mathbb{R}^{d\times d}$ and set $\bm{X}^t\coloneqq((\bm{\Sigma}\bm{B}_{st}+\bm{S}_{st})/\sqrt{t})_{s\in[0,1]}$ for $t\in(0,1]$, where $(\bm{B}_t)_{t\in[0,1]}$ is a standard Brownian motion on $\mathbb{R}^d$ independent of the pure-jump tempered $\alpha$-stable Lévy process $\bm{S}$. Assume $\alpha\in[1,2)$ and that $\bm{S}$ has zero mean, and fix any $\beta_*\in(\alpha,2]$. Then, by Theorem 2.8(a), we have the upper bound $\mathcal{W}_1\big(\bm{X}^t,\bm{\Sigma}\bm{B}\big)=\mathcal{O}\big(t^{1/\beta_*-1/2}\big)$ as $t\downarrow 0$. To find the lower bound, we let $\lambda$ be the largest eigenvalue of $\bm{\Sigma}^2$, and define $c\coloneqq\inf_{r>1}r^{-\alpha}|\psi_{\bm{S}}(r\bm{u})|>0$, for some $\bm{u}\in\mathbb{R}^d$ with $|\bm{u}|=1$. Then, for any $C_*\in(0,ce^{-\lambda/2})$, Theorem 2.8(c) implies that $\mathcal{W}_1(\bm{X}^t_1,\bm{\Sigma}\bm{B}_1)\geqslant(C_*/\sqrt{2})t^{1-\alpha/2}$ for all sufficiently small $t>0$.

Note that, as α\alpha approaches 11, the gap between the lower and upper bound decreases. Indeed, for α=1\alpha=1, we have β=1+ε\beta_{*}=1+\varepsilon for some small ε>0\varepsilon>0, so the upper bound is of the rate t1/(1+ε)1/2t^{1/(1+\varepsilon)-1/2}, while the lower bound has the rate t\sqrt{t}, making the quotient of the two bounds proportional to tε/(1+ε)t^{\varepsilon/(1+\varepsilon)}. ∎

Example 3.4.

In this example, we exhibit a process that satisfies Assumption (T) but not Assumption (C). Let $\alpha\in(1,2)$ and $\alpha'\in(1,\alpha)$. Next, let $X$ be a one-dimensional $\alpha$-stable process and $Y$ be an $\alpha'$-stable process that is spectrally negative, with Lévy measures $\nu_X(\mathrm{d}x)=c_1|x|^{-1-\alpha}\mathrm{d}x$ and $\nu_Y(\mathrm{d}x)=c_2\mathds{1}_{(-\infty,0)}(x)|x|^{-1-\alpha'}\mathrm{d}x$ for some constants $c_1,c_2>0$. We note that $X+Y$ has Lévy measure $\nu_{X+Y}(\mathrm{d}x)=[c_2\mathds{1}_{(-\infty,0)}(x)|x|^{-1-\alpha'}+c_1|x|^{-1-\alpha}]\mathrm{d}x$, showing that Assumption (T) is indeed fulfilled. However, Assumption (C) cannot be fulfilled, since the necessary radial decomposition of $\nu_{X+Y}$ does not exist. ∎

4. Two couplings of Lévy processes

Let 𝑿=(𝑿t)t0\bm{X}=(\bm{X}_{t})_{t\geqslant 0} be a Lévy process on d\mathbb{R}^{d} with generating triplet (𝜸𝑿,𝚺𝑿𝚺𝑿,ν𝑿)(\bm{\gamma_{X}},\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\scalebox{0.6}{$\top$}},\nu_{\bm{X}}) (also called characteristic triplet, see [40, Def. 8.2]) with respect to the cutoff function 𝒘𝟙B𝟎(1)(𝒘)\bm{w}\mapsto\mathds{1}_{B_{\bm{0}}(1)}(\bm{w}), where 𝜸𝑿d\bm{\gamma_{X}}\in\mathbb{R}^{d}, 𝚺𝑿d×d\bm{\Sigma_{X}}\in\mathbb{R}^{d\times d} (with transpose 𝚺𝑿d×d\bm{\Sigma_{X}}^{\scalebox{0.6}{$\top$}}\in\mathbb{R}^{d\times d}) and 𝚺𝑿𝚺𝑿\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\scalebox{0.6}{$\top$}} a symmetric non-negative definite matrix and ν𝑿\nu_{\bm{X}} a Lévy measure on d\mathbb{R}^{d}. Throughout, we denote by |||\cdot| the Euclidean norm of appropriate dimension and let B𝟎(r){𝒙d:|𝒙|<r}B_{\bm{0}}(r)\coloneqq\{\bm{x}\in\mathbb{R}^{d}:|\bm{x}|<r\} be the open ball in d\mathbb{R}^{d} of radius r>0r>0, centered at the origin 𝟎d\bm{0}\in\mathbb{R}^{d}. Fix any κ(0,1]\kappa\in(0,1] and consider the Lévy–Itô decomposition of 𝑿\bm{X} given by

\[
\bm{X}_t=\bm{\gamma}_{\bm{X},\kappa}t+\bm{\Sigma_X}\bm{B^X}_t+\bm{D}^{\bm{X},\kappa}_t+\bm{J}^{\bm{X},\kappa}_t,\quad t\geqslant 0,\tag{3}
\]

where $\bm{\gamma}_{\bm{X},\kappa}\coloneqq\bm{\gamma}_{\bm{X}}-\int_{\mathbb{R}^d}\bm{w}\mathds{1}_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(\kappa)}(\bm{w})\nu_{\bm{X}}(\mathrm{d}\bm{w})$, $\bm{B^X}$ is a standard Brownian motion on $\mathbb{R}^d$, $\bm{D}^{\bm{X},\kappa}$ is the small-jump martingale containing all the jumps of $\bm{X}$ of magnitude less than $\kappa$, $\bm{J}^{\bm{X},\kappa}$ is the driftless compound Poisson process containing all the jumps of $\bm{X}$ of magnitude at least $\kappa$, and all three processes $\bm{B^X}$, $\bm{D}^{\bm{X},\kappa}$ and $\bm{J}^{\bm{X},\kappa}$ are independent. Moreover, the pure-jump component $\bm{D}^{\bm{X},\kappa}+\bm{J}^{\bm{X},\kappa}$ of $\bm{X}$ is a Lévy process with paths of finite variation (i.e. the jumps are summable on any compact time interval) if and only if $\int_{\mathbb{R}^d_{\bm{0}}}|\bm{w}|\mathds{1}_{B_{\bm{0}}(1)}(\bm{w})\nu_{\bm{X}}(\mathrm{d}\bm{w})<\infty$ [40, Thm 21.9]. In particular, $(\bm{\gamma_X},\bm{0},\nu_{\bm{X}})$ is a characteristic triplet of a Lévy process $\bm{X}$ without a Gaussian component. Thus, if $\bm{X}$ has finite variation, then $\bm{X}$ has zero natural drift (i.e. the process equals the sum of its jumps) if and only if $\bm{\gamma}_{\bm{X}}=\int_{\mathbb{R}^d_{\bm{0}}}\bm{w}\mathds{1}_{B_{\bm{0}}(1)}(\bm{w})\nu_{\bm{X}}(\mathrm{d}\bm{w})$.

Similarly, we let 𝒀=(𝒀t)t0\bm{Y}=(\bm{Y}_{t})_{t\geqslant 0} be a Lévy process on d\mathbb{R}^{d} with characteristic triplet (𝜸𝒀,𝚺𝒀𝚺𝒀,ν𝒀)(\bm{\gamma}_{\bm{Y}},\bm{\Sigma_{Y}}\bm{\Sigma_{Y}}^{\scalebox{0.6}{$\top$}},\nu_{\bm{Y}}) with respect to the cutoff function 𝒘𝟙B𝟎(1)(𝒘)\bm{w}\mapsto\mathds{1}_{B_{\bm{0}}(1)}(\bm{w}) and whose corresponding Lévy–Itô decomposition is given by 𝒀t=𝜸𝒀,κt+𝚺𝒀𝑩𝒀t+𝑫t𝒀,κ+𝑱t𝒀,κ\bm{Y}_{t}=\bm{\gamma}_{\bm{Y},\kappa}t+\bm{\Sigma_{Y}}\bm{B^{Y}}_{t}+\bm{D}^{\bm{Y},\kappa}_{t}+\bm{J}^{\bm{Y},\kappa}_{t}, defined as above. The following elementary inequality will be used throughout: for any q(0,2]q\in(0,2],

(4) 𝒲q(𝑿,𝒀)|𝜸𝑿,κ𝜸𝒀,κ|q1+(2d)q1|𝚺𝑿𝚺𝒀|q1+𝒲q(𝑫𝑿,κ,𝑫𝒀,κ)+𝒲q(𝑱𝑿,κ,𝑱𝒀,κ),\mathcal{W}_{q}(\bm{X},\bm{Y})\leqslant|\bm{\gamma}_{\bm{X},\kappa}-\bm{\gamma}_{\bm{Y},\kappa}|^{q\wedge 1}+(2\sqrt{d})^{q\wedge 1}|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|^{q\wedge 1}+\mathcal{W}_{q}\big{(}\bm{D}^{\bm{X},\kappa},\bm{D}^{\bm{Y},\kappa}\big{)}+\mathcal{W}_{q}\big{(}\bm{J}^{\bm{X},\kappa},\bm{J}^{\bm{Y},\kappa}\big{)},

where $|\cdot|$ in the second term denotes the Frobenius norm on $\mathbb{R}^{d\times d}$ (i.e. $|\bm{\Sigma}|^2=\sum_{i,j=1}^{d}\Sigma_{i,j}^2$ for $\bm{\Sigma}\in\mathbb{R}^{d\times d}$). For completeness, we give a proof of (4) in Appendix C below.

Let Ξ𝑿\Xi_{\bm{X}} and Ξ𝒀\Xi_{\bm{Y}} be the Poisson random measures on [0,)×𝟎d[0,\infty)\times\mathbb{R}^{d}_{\bm{0}} of the jumps of 𝑿\bm{X} and 𝒀\bm{Y}, respectively, with corresponding compensated measures Ξ~𝑿=Ξ𝑿Lebν𝑿\widetilde{\Xi}_{\bm{X}}=\Xi_{\bm{X}}-\text{Leb}\otimes\nu_{\bm{X}} and Ξ~𝒀=Ξ𝒀Lebν𝒀\widetilde{\Xi}_{\bm{Y}}=\Xi_{\bm{Y}}-\text{Leb}\otimes\nu_{\bm{Y}}, where Leb denotes the Lebesgue measure on [0,)[0,\infty). Since, for every t0t\geqslant 0, we have

(5) 𝑫t𝑿,κ=[0,t]×𝟎d𝟙B𝟎(κ)(𝒘)𝒘Ξ~𝑿(ds,d𝒘),𝑱t𝑿,κ=[0,t]×𝟎d𝟙dB𝟎(κ)(𝒘)𝒘Ξ𝑿(ds,d𝒘),𝑫t𝒀,κ=[0,t]×𝟎d𝟙B𝟎(κ)(𝒘)𝒘Ξ~𝒀(ds,d𝒘),𝑱t𝒀,κ=[0,t]×𝟎d𝟙dB𝟎(κ)(𝒘)𝒘Ξ𝒀(ds,d𝒘),\displaystyle\begin{aligned} \bm{D}^{\bm{X},\kappa}_{t}&=\int_{[0,t]\times\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa)}(\bm{w})\bm{w}\widetilde{\Xi}_{\bm{X}}(\mathrm{d}s,\mathrm{d}\bm{w}),\quad\bm{J}^{\bm{X},\kappa}_{t}=\int_{[0,t]\times\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})\bm{w}\Xi_{\bm{X}}(\mathrm{d}s,\mathrm{d}\bm{w}),\\ \bm{D}^{\bm{Y},\kappa}_{t}&=\int_{[0,t]\times\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa)}(\bm{w})\bm{w}\widetilde{\Xi}_{\bm{Y}}(\mathrm{d}s,\mathrm{d}\bm{w}),\quad\bm{J}^{\bm{Y},\kappa}_{t}=\int_{[0,t]\times\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})\bm{w}\Xi_{\bm{Y}}(\mathrm{d}s,\mathrm{d}\bm{w}),\end{aligned}

the problem of coupling the jump components of 𝑿\bm{X} and 𝒀\bm{Y} is reduced to coupling the Poisson random measures Ξ𝑿\Xi_{\bm{X}} and Ξ𝒀\Xi_{\bm{Y}}. Sections 4.1 and 4.2 below each describe such a coupling.

4.1. Thinning

Choose any Lévy measure μ\mu on 𝟎d\mathbb{R}^{d}_{\bm{0}} that dominates both ν𝑿\nu_{\bm{X}} and ν𝒀\nu_{\bm{Y}} with Radon-Nikodym derivatives bounded by 11 μ\mu-a.e., i.e. f𝑿=dν𝑿/dμ1f_{\bm{X}}=\mathrm{d}\nu_{\bm{X}}/\mathrm{d}\mu\leqslant 1 and f𝒀=dν𝒀/dμ1f_{\bm{Y}}=\mathrm{d}\nu_{\bm{Y}}/\mathrm{d}\mu\leqslant 1 μ\mu-a.e. For instance, a possible choice of μ\mu is ν𝑿+ν𝒀\nu_{\bm{X}}+\nu_{\bm{Y}}. Let Ξ=nδ(Un,𝑽n)\Xi=\sum_{n\in\mathbb{N}}\delta_{(U_{n},\bm{V}_{n})} be a Poisson random measure on (0,1]×d(0,1]\times\mathbb{R}^{d}, with mean measure Lebμ\text{Leb}\otimes\mu and the corresponding compensated Poisson random measure Ξ~(ds,d𝒘)=Ξ(ds,d𝒘)dsμ(d𝒘)\widetilde{\Xi}(\mathrm{d}s,\mathrm{d}\bm{w})=\Xi(\mathrm{d}s,\mathrm{d}\bm{w})-\mathrm{d}s\otimes\mu(\mathrm{d}\bm{w}). Assume the sequence (ϑn)n(\vartheta_{n})_{n\in\mathbb{N}} of iid uniform random variables on [0,1][0,1] is independent of Ξ\Xi. The Marking and Mapping Theorems [31] imply that the following Poisson random measures

(6) Ξ𝑿=n𝟙{ϑnf𝑿(𝑽n)}δ(Un,𝑽n),andΞ𝒀=n𝟙{ϑnf𝒀(𝑽n)}δ(Un,𝑽n),\Xi_{\bm{X}}=\sum_{n\in\mathbb{N}}\mathds{1}_{\{\vartheta_{n}\leqslant f_{\bm{X}}(\bm{V}_{n})\}}\delta_{(U_{n},\bm{V}_{n})},\qquad\text{and}\qquad\Xi_{\bm{Y}}=\sum_{n\in\mathbb{N}}\mathds{1}_{\{\vartheta_{n}\leqslant f_{\bm{Y}}(\bm{V}_{n})\}}\delta_{(U_{n},\bm{V}_{n})},

have mean measures Lebν𝑿\text{Leb}\otimes\nu_{\bm{X}} and Lebν𝒀\text{Leb}\otimes\nu_{\bm{Y}}, respectively. We couple 𝑿\bm{X} and 𝒀\bm{Y} by choosing 𝑩𝑿=𝑩𝒀\bm{B^{X}}=\bm{B^{Y}} in their Lévy–Itô decompositions and couple their jump parts from (5) via the coupling of the Poisson random measures in (6).
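The construction in (6) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes $d=1$ and finite Lévy measures supported on a common finite set of atoms (so both processes are compound Poisson), and it takes $\mu=\nu_{\bm{X}}+\nu_{\bm{Y}}$, the choice suggested above. The function name `thinning_coupling` and all parameter names are hypothetical.

```python
import numpy as np

def thinning_coupling(atoms, nu_x, nu_y, rng):
    """Sample the points of a common Poisson random measure on [0, 1] x atoms
    and thin them as in (6). Assumes every atom has positive mass under
    mu = nu_x + nu_y (a simplifying assumption of this sketch)."""
    atoms = np.asarray(atoms, dtype=float)
    nu_x = np.asarray(nu_x, dtype=float)
    nu_y = np.asarray(nu_y, dtype=float)
    mu = nu_x + nu_y                      # dominating measure
    f_x = nu_x / mu                       # Radon-Nikodym derivative of nu_x
    f_y = nu_y / mu                       # Radon-Nikodym derivative of nu_y
    n = rng.poisson(mu.sum())             # number of points of the common measure
    times = rng.uniform(0.0, 1.0, size=n)                    # U_n
    idx = rng.choice(len(atoms), size=n, p=mu / mu.sum())    # index of V_n
    marks = rng.uniform(0.0, 1.0, size=n)                    # theta_n
    keep_x = marks <= f_x[idx]            # points retained for X
    keep_y = marks <= f_y[idx]            # points retained for Y
    return times, atoms[idx], keep_x, keep_y

rng = np.random.default_rng(0)
times, jumps, keep_x, keep_y = thinning_coupling(
    [-1.0, 0.5, 2.0], [3.0, 1.0, 0.5], [1.0, 1.0, 2.0], rng
)
x_1 = jumps[keep_x].sum()   # jump part of X at time 1
y_1 = jumps[keep_y].sum()   # jump part of Y at time 1
```

Because the same mark $\vartheta_n$ decides retention for both processes, a point is shared whenever $\vartheta_n\leqslant\min\{f_{\bm{X}}(\bm{V}_n),f_{\bm{Y}}(\bm{V}_n)\}$; in this sketch, the two processes only differ at atoms where the masses of $\nu_{\bm{X}}$ and $\nu_{\bm{Y}}$ differ.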

Proposition 4.1.

The coupling (𝐃𝐗,κ,𝐃𝐘,κ,𝐉𝐗,κ,𝐉𝐘,κ)(\bm{D}^{\bm{X},\kappa},\bm{D}^{\bm{Y},\kappa},\bm{J}^{\bm{X},\kappa},\bm{J}^{\bm{Y},\kappa}) defined in (5) and (6) satisfies

(7) 𝔼[supt[0,1]|𝑫t𝑿,κ𝑫t𝒀,κ|2]4𝟎d𝟙B𝟎(κ)(𝒘)|𝒘|2|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘).\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{D}^{\bm{X},\kappa}_{t}-\bm{D}^{\bm{Y},\kappa}_{t}\big{|}^{2}\bigg{]}\leqslant 4\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{2}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w}).

Moreover, if ν𝐗(d𝐰)𝟙dB𝟎(κ)(𝐰)\nu_{\bm{X}}(\mathrm{d}\bm{w})\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w}) and ν𝐘(d𝐰)𝟙dB𝟎(κ)(𝐰)\nu_{\bm{Y}}(\mathrm{d}\bm{w})\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w}) have a finite second moment, then

𝔼[supt[0,1]|𝑫t𝑿,κ+𝑱t𝑿,κ(𝑫t𝒀,κ+𝑱t𝒀,κ)𝒎κt|2]4𝟎d|𝒘|2|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘)and\displaystyle\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{D}^{\bm{X},\kappa}_{t}+\bm{J}^{\bm{X},\kappa}_{t}-(\bm{D}^{\bm{Y},\kappa}_{t}+\bm{J}^{\bm{Y},\kappa}_{t})-\bm{m}_{\kappa}t\big{|}^{2}\bigg{]}\leqslant 4\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|^{2}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w})\quad\text{and}
𝒲2(𝑿,𝒀)|𝜸𝑿,κ𝜸𝒀,κ+𝒎κ|+2d1/2|𝚺𝑿𝚺𝒀|+2(𝟎d|𝒘|2|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘))1/2,\displaystyle\mathcal{W}_{2}(\bm{X},\bm{Y})\leqslant|\bm{\gamma}_{\bm{X},\kappa}-\bm{\gamma}_{\bm{Y},\kappa}+\bm{m}_{\kappa}|+2d^{1/2}|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|+2\bigg{(}\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|^{2}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w})\bigg{)}^{1/2},

where the mean 𝐦κ:=𝔼[𝐉1𝐗,κ𝐉1𝐘,κ]=𝟎d𝐰𝟙dB𝟎(κ)(𝐰)(f𝐗(𝐰)f𝐘(𝐰))μ(d𝐰)\bm{m}_{\kappa}:=\mathds{E}[\bm{J}^{\bm{X},\kappa}_{1}-\bm{J}^{\bm{Y},\kappa}_{1}]=\int_{\mathbb{R}^{d}_{\bm{0}}}\bm{w}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})(f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w}))\mu(\mathrm{d}\bm{w}) is finite.

Proof.

Denote f+:=max{0,f}f^{+}:=\max\{0,f\} for any function mapping into \mathbb{R}. Define the Poisson random measures

(8) Λ+n𝟙{f𝒀(𝑽n)<ϑnf𝑿(𝑽n)}δ(Un,𝑽n)andΛn𝟙{f𝑿(𝑽n)<ϑnf𝒀(𝑽n)}δ(Un,𝑽n),\Lambda_{+}\coloneqq\sum_{n\in\mathbb{N}}\mathds{1}_{\{f_{\bm{Y}}(\bm{V}_{n})<\vartheta_{n}\leqslant f_{\bm{X}}(\bm{V}_{n})\}}\delta_{(U_{n},\bm{V}_{n})}\qquad\text{and}\qquad\Lambda_{-}\coloneqq\sum_{n\in\mathbb{N}}\mathds{1}_{\{f_{\bm{X}}(\bm{V}_{n})<\vartheta_{n}\leqslant f_{\bm{Y}}(\bm{V}_{n})\}}\delta_{(U_{n},\bm{V}_{n})},

with mean measures $\text{Leb}\otimes(f_{\bm{X}}-f_{\bm{Y}})^{+}\mu$ and $\text{Leb}\otimes(f_{\bm{Y}}-f_{\bm{X}})^{+}\mu$, respectively. Thus $\Xi_{\bm{X}}-\Xi_{\bm{Y}}=\Lambda_{+}-\Lambda_{-}$. Note that $\Lambda_{+}$ is independent of $\Lambda_{-}$ since they are both thinnings of the same Poisson random measure and have disjoint supports. Let $\widetilde{\Lambda}_{+}$ and $\widetilde{\Lambda}_{-}$ denote their respective compensated Poisson random measures and define the Lévy processes $\bm{D}^{\pm}=(\bm{D}^{\pm}_{t})_{t\geqslant 0}$ by $\bm{D}^{\pm}_{t}:=\int_{(0,t]\times\mathbb{R}^{d}_{\bm{0}}}\bm{w}\mathds{1}_{B_{\bm{0}}(\kappa)}(\bm{w})\widetilde{\Lambda}_{\pm}(\mathrm{d}s,\mathrm{d}\bm{w})$, where $\pm\in\{+,-\}$. By construction, $\bm{D}^{+}$ and $\bm{D}^{-}$ are independent square-integrable martingales, satisfying $\mathds{E}\big[\bm{D}^{+}_{t}\big]=\mathds{E}\big[\bm{D}^{-}_{t}\big]=\bm{0}$ and $\bm{D}^{\bm{X},\kappa}_{t}-\bm{D}^{\bm{Y},\kappa}_{t}=\bm{D}^{+}_{t}-\bm{D}^{-}_{t}$ for all $t\in\mathbb{R}_{+}$. In particular, we have $\mathds{E}\big[\langle\bm{D}^{+}_{t},\bm{D}^{-}_{t}\rangle\big]=0$ and, by Campbell’s formula [31, p. 28],

𝔼[|𝑫t+|2]\displaystyle\mathds{E}\left[|\bm{D}^{+}_{t}|^{2}\right] =t𝟎d𝟙B𝟎(κ)(𝒘)|𝒘|2(f𝑿(𝒘)f𝒀(𝒘))+μ(d𝒘),\displaystyle=t\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{2}(f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w}))^{+}\mu(\mathrm{d}\bm{w}),
𝔼[|𝑫t|2]\displaystyle\mathds{E}\left[|\bm{D}^{-}_{t}|^{2}\right] =t𝟎d𝟙B𝟎(κ)(𝒘)|𝒘|2(f𝒀(𝒘)f𝑿(𝒘))+μ(d𝒘).\displaystyle=t\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{2}(f_{\bm{Y}}(\bm{w})-f_{\bm{X}}(\bm{w}))^{+}\mu(\mathrm{d}\bm{w}).

Doob’s maximal inequality [29, Prop. 7.16], applied to the submartingale |𝑫+𝑫||\bm{D}^{+}-\bm{D}^{-}|, and the independence of martingales 𝑫+\bm{D}^{+} and 𝑫\bm{D}^{-} yield

𝔼[supt[0,1]|𝑫t𝑿,κ𝑫t𝒀,κ|2]\displaystyle\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{D}^{\bm{X},\kappa}_{t}-\bm{D}^{\bm{Y},\kappa}_{t}\big{|}^{2}\bigg{]} =𝔼[supt[0,1]|𝑫t+𝑫t|2]4𝔼[|𝑫1+𝑫1|2]=4𝔼[|𝑫1+|2]+4𝔼[|𝑫1|2]\displaystyle=\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{D}^{+}_{t}-\bm{D}^{-}_{t}\big{|}^{2}\bigg{]}\leqslant 4\mathds{E}\big{[}\big{|}\bm{D}^{+}_{1}-\bm{D}^{-}_{1}\big{|}^{2}\big{]}=4\mathds{E}\big{[}\big{|}\bm{D}^{+}_{1}|^{2}\big{]}+4\mathds{E}\big{[}\big{|}\bm{D}^{-}_{1}|^{2}\big{]}
=4𝟎d𝟙B𝟎(κ)(𝒘)|𝒘|2|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘).\displaystyle=4\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{2}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w}).

Assume that $\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|^{2}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})\nu_{\bm{X}}(\mathrm{d}\bm{w})<\infty$ and $\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|^{2}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})\nu_{\bm{Y}}(\mathrm{d}\bm{w})<\infty$, and define the Lévy processes $\bm{J}^{\pm}=(\bm{J}^{\pm}_{t})_{t\geqslant 0}$ by $\bm{J}^{\pm}_{t}:=\int_{(0,t]\times\mathbb{R}^{d}_{\bm{0}}}\bm{w}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})\Lambda_{\pm}(\mathrm{d}s,\mathrm{d}\bm{w})$. By the integrability assumption and construction, $\bm{J}^{+}$ and $\bm{J}^{-}$ are independent square-integrable processes with $\mathds{E}\big[\bm{J}^{+}_{t}-\bm{J}^{-}_{t}\big]=t\bm{m}_{\kappa}$ and $\bm{J}^{\bm{X},\kappa}_{t}-\bm{J}^{\bm{Y},\kappa}_{t}=\bm{J}^{+}_{t}-\bm{J}^{-}_{t}$ for all $t\in\mathbb{R}_{+}$. Thus, Campbell’s formula [31, p. 28] yields

𝔼[|𝑫t++𝑱t+𝔼[𝑱t+]|2]\displaystyle\mathds{E}\big{[}|\bm{D}^{+}_{t}+\bm{J}^{+}_{t}-\mathds{E}[\bm{J}^{+}_{t}]|^{2}\big{]} =t𝟎d|𝒘|2(f𝑿(𝒘)f𝒀(𝒘))+μ(d𝒘),\displaystyle=t\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|^{2}(f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w}))^{+}\mu(\mathrm{d}\bm{w}),
𝔼[|𝑫t+𝑱t𝔼[𝑱t]|2]\displaystyle\mathds{E}\big{[}|\bm{D}^{-}_{t}+\bm{J}^{-}_{t}-\mathds{E}[\bm{J}^{-}_{t}]|^{2}\big{]} =t𝟎d|𝒘|2(f𝒀(𝒘)f𝑿(𝒘))+μ(d𝒘).\displaystyle=t\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|^{2}(f_{\bm{Y}}(\bm{w})-f_{\bm{X}}(\bm{w}))^{+}\mu(\mathrm{d}\bm{w}).

Next, Doob’s maximal inequality applied to the submartingale |𝑫t++𝑱t+(𝑫t+𝑱t)t𝒎κ||\bm{D}^{+}_{t}+\bm{J}^{+}_{t}-(\bm{D}^{-}_{t}+\bm{J}^{-}_{t})-t\bm{m}_{\kappa}|, and the independence between 𝑫t++𝑱t+𝔼[𝑱t+]\bm{D}^{+}_{t}+\bm{J}^{+}_{t}-\mathds{E}[\bm{J}^{+}_{t}] and 𝑫t+𝑱t𝔼[𝑱t]\bm{D}^{-}_{t}+\bm{J}^{-}_{t}-\mathds{E}[\bm{J}^{-}_{t}], yield

𝔼[supt[0,1]|𝑫t𝑿,κ+𝑱t𝑿,κ(𝑫t𝒀,κ+𝑱t𝒀,κ)𝒎κt|2]\displaystyle\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{D}^{\bm{X},\kappa}_{t}+\bm{J}^{\bm{X},\kappa}_{t}-\big{(}\bm{D}^{\bm{Y},\kappa}_{t}+\bm{J}^{\bm{Y},\kappa}_{t}\big{)}-\bm{m}_{\kappa}t\big{|}^{2}\bigg{]} =𝔼[supt[0,1]|𝑫t++𝑱t+(𝑫t+𝑱t)𝒎κt|2]\displaystyle=\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{D}^{+}_{t}+\bm{J}^{+}_{t}-\big{(}\bm{D}^{-}_{t}+\bm{J}^{-}_{t}\big{)}-\bm{m}_{\kappa}t\big{|}^{2}\bigg{]}
4𝔼[|𝑫1++𝑱1+(𝑫1+𝑱1)𝒎κ|2]\displaystyle\leqslant 4\mathds{E}\big{[}\big{|}\bm{D}^{+}_{1}+\bm{J}^{+}_{1}-(\bm{D}^{-}_{1}+\bm{J}^{-}_{1})-\bm{m}_{\kappa}\big{|}^{2}\big{]}
=4𝟎d|𝒘|2|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘).\displaystyle=4\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|^{2}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w}).\qed
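Since $\nu_{\bm{X}}-\nu_{\bm{Y}}=(f_{\bm{X}}-f_{\bm{Y}})\mu$, the measure $|f_{\bm{X}}-f_{\bm{Y}}|\mu$ in (7) is the total variation measure of $\nu_{\bm{X}}-\nu_{\bm{Y}}$, so the bound does not depend on the choice of $\mu$. The following sketch evaluates the right-hand side of (7) in the illustrative case of purely atomic measures on a common finite set of atoms; it assumes $B_{\bm{0}}(\kappa)$ is the open ball, and the function name `small_jump_bound` is hypothetical.

```python
import numpy as np

def small_jump_bound(atoms, nu_x, nu_y, kappa):
    """Right-hand side of (7) for atomic measures:
    4 * sum over atoms with |w| < kappa of |w|^2 * |nu_x{w} - nu_y{w}|."""
    atoms = np.asarray(atoms, dtype=float)
    nu_x = np.asarray(nu_x, dtype=float)
    nu_y = np.asarray(nu_y, dtype=float)
    # Euclidean norm of each atom; atoms may have shape (n,) or (n, d).
    norms = np.abs(atoms) if atoms.ndim == 1 else np.linalg.norm(atoms, axis=1)
    small = norms < kappa                    # atoms inside the open ball (assumed)
    tv_mass = np.abs(nu_x - nu_y)            # |f_X - f_Y| mu at each atom
    return 4.0 * float(np.sum(norms[small] ** 2 * tv_mass[small]))

bound = small_jump_bound([0.1, 0.5, 2.0], [3.0, 1.0, 0.5], [1.0, 1.0, 2.0], kappa=1.0)
```

In this example only the atom at $0.1$ contributes, since the masses at $0.5$ coincide and the atom at $2.0$ lies outside the ball.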

The following bound is required when the big jump components have infinite variance.

Proposition 4.2.

Consider the coupling (𝐃𝐗,κ,𝐃𝐘,κ,𝐉𝐗,κ,𝐉𝐘,κ)(\bm{D}^{\bm{X},\kappa},\bm{D}^{\bm{Y},\kappa},\bm{J}^{\bm{X},\kappa},\bm{J}^{\bm{Y},\kappa}) defined in (5) and (6). Then

(9) 𝔼[supt[0,1]|𝑱t𝑿,κ𝑱t𝒀,κ|q]𝟎d𝟙dB𝟎(κ)(𝒘)|𝒘|q|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘),for any q(0,1].\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{J}^{\bm{X},\kappa}_{t}-\bm{J}^{\bm{Y},\kappa}_{t}\big{|}^{q}\bigg{]}\leqslant\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w}),\,\text{for any }q\in(0,1].

In particular, the following inequality holds for any q(0,1]q\in(0,1]:

(10) 𝒲q(𝑿,𝒀)|𝜸𝑿,κ𝜸𝒀,κ|q+(4𝟎d𝟙B𝟎(κ)(𝒘)|𝒘|2|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘))q/2+2qdq/2|𝚺𝑿𝚺𝒀|q+𝟎d𝟙dB𝟎(κ)(𝒘)|𝒘|q|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘).\begin{split}\mathcal{W}_{q}(\bm{X},\bm{Y})&\leqslant|\bm{\gamma}_{\bm{X},\kappa}-\bm{\gamma}_{\bm{Y},\kappa}|^{q}+\bigg{(}4\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{2}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w})\bigg{)}^{q/2}\\ &\qquad+2^{q}d^{q/2}|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|^{q}+\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w}).\end{split}

Note that (9) & (10) hold without assuming ν𝑿(d𝒘)𝟙dB𝟎(κ)(𝒘)\nu_{\bm{X}}(\mathrm{d}\bm{w})\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w}) and ν𝒀(d𝒘)𝟙dB𝟎(κ)(𝒘)\nu_{\bm{Y}}(\mathrm{d}\bm{w})\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w}) have a finite qq-moment. If this holds, however, the bound is non-trivial because the big jumps in Proposition 4.2 are then controlled by 𝟎d𝟙dB𝟎(κ)(𝒘)|𝒘|qν𝑿(d𝒘)+𝟎d𝟙dB𝟎(κ)(𝒘)|𝒘|qν𝒀(d𝒘)<\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}\nu_{\bm{X}}(\mathrm{d}\bm{w})+\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}\nu_{\bm{Y}}(\mathrm{d}\bm{w})<\infty.

Proof.

Recall that κ(0,1]\kappa\in(0,1] and let κ(κ,)\kappa^{\prime}\in(\kappa,\infty). Next, for t0t\geqslant 0, we define the processes 𝑱t,κ𝑿,κ[0,t]×𝟎d𝟙B𝟎(κ)B𝟎(κ)(𝒘)𝒘Ξ𝑿(ds,d𝒘)\bm{J}^{\bm{X},\kappa}_{t,\kappa^{\prime}}\coloneqq\int_{[0,t]\times\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa^{\prime})\setminus B_{\bm{0}}(\kappa)}(\bm{w})\bm{w}\Xi_{\bm{X}}(\mathrm{d}s,\mathrm{d}\bm{w}) and 𝑱t,κ𝒀,κ[0,t]×𝟎d𝟙B𝟎(κ)B𝟎(κ)(𝒘)𝒘Ξ𝒀(ds,d𝒘)\bm{J}^{\bm{Y},\kappa}_{t,\kappa^{\prime}}\coloneqq\int_{[0,t]\times\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa^{\prime})\setminus B_{\bm{0}}(\kappa)}(\bm{w})\bm{w}\Xi_{\bm{Y}}(\mathrm{d}s,\mathrm{d}\bm{w}). Let Λ±\Lambda_{\pm} be as in (8), and define 𝑱,κ±=(𝑱t,κ±)t0\bm{J}^{\pm}_{\cdot,\kappa^{\prime}}=(\bm{J}^{\pm}_{t,\kappa^{\prime}})_{t\geqslant 0} as 𝑱t,κ±(0,t]×𝟎d𝒘𝟙B𝟎(κ)B𝟎(κ)(𝒘)Λ±(ds,d𝒘)\bm{J}^{\pm}_{t,\kappa^{\prime}}\coloneqq\int_{(0,t]\times\mathbb{R}^{d}_{\bm{0}}}\bm{w}\mathds{1}_{B_{\bm{0}}(\kappa^{\prime})\setminus B_{\bm{0}}(\kappa)}(\bm{w})\Lambda_{\pm}(\mathrm{d}s,\mathrm{d}\bm{w}) (both being Lévy processes), and note that 𝑱t,κ+𝑱t,κ=𝑱t,κ𝑿,κ𝑱t,κ𝒀,κ\bm{J}^{+}_{t,\kappa^{\prime}}-\bm{J}^{-}_{t,\kappa^{\prime}}=\bm{J}^{\bm{X},\kappa}_{t,\kappa^{\prime}}-\bm{J}^{\bm{Y},\kappa}_{t,\kappa^{\prime}} for t+t\in\mathbb{R}_{+} and κ(κ,)\kappa^{\prime}\in(\kappa,\infty). Note that ν𝑿(d𝒘)𝟙B𝟎(κ)B𝟎(κ)(𝒘)\nu_{\bm{X}}(\mathrm{d}\bm{w})\mathds{1}_{B_{\bm{0}}(\kappa^{\prime})\setminus B_{\bm{0}}(\kappa)}(\bm{w}) and ν𝒀(d𝒘)𝟙B𝟎(κ)B𝟎(κ)(𝒘)\nu_{\bm{Y}}(\mathrm{d}\bm{w})\mathds{1}_{B_{\bm{0}}(\kappa^{\prime})\setminus B_{\bm{0}}(\kappa)}(\bm{w}) have a finite qq-moment for all κ(κ,)\kappa^{\prime}\in(\kappa,\infty). Moreover, by the triangle inequality and the fact (x+y)qxq+yq(x+y)^{q}\leqslant x^{q}+y^{q} for all x,y0x,y\geqslant 0, we have

(11) supt[0,1]|𝑱t,κ+𝑱t,κ|q(0,1]×𝟎d𝟙B𝟎(κ)B𝟎(κ)(𝒘)|𝒘|q(Λ+(ds,d𝒘)+Λ(ds,d𝒘)).\displaystyle\begin{split}\sup_{t\in[0,1]}\big{|}\bm{J}^{+}_{t,\kappa^{\prime}}-\bm{J}^{-}_{t,\kappa^{\prime}}\big{|}^{q}&\leqslant\int_{(0,1]\times\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa^{\prime})\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}(\Lambda_{+}(\mathrm{d}s,\mathrm{d}\bm{w})+\Lambda_{-}(\mathrm{d}s,\mathrm{d}\bm{w})).\end{split}

Recall that Leb(f𝑿f𝒀)+μ\text{Leb}\otimes(f_{\bm{X}}-f_{\bm{Y}})^{+}\mu and Leb(f𝒀f𝑿)+μ\text{Leb}\otimes(f_{\bm{Y}}-f_{\bm{X}})^{+}\mu are the mean measures of Λ+\Lambda_{+} and Λ\Lambda_{-}, respectively. Thus, by taking expectations in (11) and applying Campbell’s formula [31, p. 28], we get

$$\begin{aligned}
\mathds{E}\bigg[\sup_{t\in[0,1]}\big|\bm{J}^{\bm{X},\kappa}_{t,\kappa^{\prime}}-\bm{J}^{\bm{Y},\kappa}_{t,\kappa^{\prime}}\big|^{q}\bigg]
&=\mathds{E}\bigg[\sup_{t\in[0,1]}\big|\bm{J}^{+}_{t,\kappa^{\prime}}-\bm{J}^{-}_{t,\kappa^{\prime}}\big|^{q}\bigg]\\
&\leqslant\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa^{\prime})\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w}).
\end{aligned}$$

Due to the monotone convergence theorem, it follows that, as κ\kappa^{\prime}\to\infty,

B𝟎(κ)B𝟎(κ)|𝒘|q|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘)𝟎d𝟙dB𝟎(κ)(𝒘)|𝒘|q|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘).\int_{B_{\bm{0}}(\kappa^{\prime})\setminus B_{\bm{0}}(\kappa)}|\bm{w}|^{q}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w})\to\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w}).

Furthermore, Fatou’s lemma together with the above observations imply that

𝔼[lim infκsupt[0,1]|𝑱t,κ𝑿,κ𝑱t,κ𝒀,κ|q]\displaystyle\mathds{E}\bigg{[}\liminf_{\kappa^{\prime}\to\infty}\sup_{t\in[0,1]}\big{|}\bm{J}^{\bm{X},\kappa}_{t,\kappa^{\prime}}-\bm{J}^{\bm{Y},\kappa}_{t,\kappa^{\prime}}\big{|}^{q}\bigg{]} lim infκ𝔼[supt[0,1]|𝑱t,κ𝑿,κ𝑱t,κ𝒀,κ|q]\displaystyle\leqslant\liminf_{\kappa^{\prime}\to\infty}\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{J}^{\bm{X},\kappa}_{t,\kappa^{\prime}}-\bm{J}^{\bm{Y},\kappa}_{t,\kappa^{\prime}}\big{|}^{q}\bigg{]}
𝟎d𝟙dB𝟎(κ)(𝒘)|𝒘|q|f𝑿(𝒘)f𝒀(𝒘)|μ(d𝒘).\displaystyle\leqslant\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}|f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\mu(\mathrm{d}\bm{w}).

We have $\liminf_{\kappa^{\prime}\to\infty}\sup_{t\in[0,1]}\big|\bm{J}^{\bm{X},\kappa}_{t,\kappa^{\prime}}-\bm{J}^{\bm{Y},\kappa}_{t,\kappa^{\prime}}\big|^{q}=\sup_{t\in[0,1]}\big|\bm{J}^{\bm{X},\kappa}_{t}-\bm{J}^{\bm{Y},\kappa}_{t}\big|^{q}$ a.s., since the largest jumps of $\bm{J}^{\bm{X},\kappa}$ and $\bm{J}^{\bm{Y},\kappa}$ on the time interval $[0,1]$ are finite. This implies (9).

Since 𝒲q(𝑫𝑿,𝑫𝒀)𝒲2(𝑫𝑿,𝑫𝒀)q\mathcal{W}_{q}(\bm{D}^{\bm{X}},\bm{D}^{\bm{Y}})\leqslant\mathcal{W}_{2}(\bm{D}^{\bm{X}},\bm{D}^{\bm{Y}})^{q}, the inequality in (10) follows from (4), (7) and (9). ∎
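The first inequality in this argument can be spelled out as follows. The sketch assumes, as the form of (10) suggests, that for $q\in(0,1]$ the distance $\mathcal{W}_{q}$ is the infimum over couplings of the $q$-th moment of the uniform distance, without a $q$-th root; that convention is fixed outside this section, so it is an assumption here. The symbol $Z$ is introduced only for this sketch.

```latex
% Z := \sup_{t\in[0,1]} |\bm{D}^{\bm{X},\kappa}_t - \bm{D}^{\bm{Y},\kappa}_t| under an arbitrary coupling.
% Jensen's inequality for the concave map x \mapsto x^{q/2} on [0,\infty), with q \in (0,1]:
\mathds{E}\big[Z^{q}\big]
  = \mathds{E}\big[(Z^{2})^{q/2}\big]
  \leqslant \big(\mathds{E}\big[Z^{2}\big]\big)^{q/2}.
% Taking the infimum over couplings and then applying (7):
\mathcal{W}_{q}(\bm{D}^{\bm{X}},\bm{D}^{\bm{Y}})
  \leqslant \mathcal{W}_{2}(\bm{D}^{\bm{X}},\bm{D}^{\bm{Y}})^{q}
  \leqslant \bigg(4\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{2}
      |f_{\bm{X}}(\bm{w})-f_{\bm{Y}}(\bm{w})|\,\mu(\mathrm{d}\bm{w})\bigg)^{q/2}.
```

Under the stated convention, the right-hand side is the second term in (10).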

4.2. Comonotonic coupling

In this section, we introduce the $d$-dimensional comonotonic coupling of jumps for any $d\geqslant 1$. We use two ingredients to construct this coupling of the Lévy processes $\bm{X}$ and $\bm{Y}$: (I) the comonotonic coupling of real-valued random variables $\xi$ and $\zeta$, given by $(\xi,\zeta)=(F_{\xi}^{\leftarrow}(U),F_{\zeta}^{\leftarrow}(U))$, where $U$ is uniform on $(0,1)$ and the functions $F_{\xi}^{\leftarrow}$ and $F_{\zeta}^{\leftarrow}$ are the right inverses of the distribution functions $F_{\xi}$ and $F_{\zeta}$; (II) LePage’s representation of the Poisson random measure of the jumps of a Lévy process (see [38, p. 4]).

The comonotonic coupling of the real-valued variables in (I) is optimal for the $L^{p}$-Wasserstein distance (see [36, Ex. 3.2.14]): $\mathcal{W}_{p}(\xi,\zeta)^{p}=\int_{0}^{1}|F_{\xi}^{\leftarrow}(u)-F_{\zeta}^{\leftarrow}(u)|^{p}\mathrm{d}u=\mathds{E}\big[|F_{\xi}^{\leftarrow}(U)-F_{\zeta}^{\leftarrow}(U)|^{p}\big]$ for $p\geqslant 1$. The representation in (II) decomposes each jump of a Lévy process into its magnitude (i.e. norm) and its angular component. The main idea behind our coupling of the Lévy processes $\bm{X}$ and $\bm{Y}$ is to couple their respective Poisson random measures of jumps via a comonotonic coupling of the magnitudes of jumps, while simultaneously aligning their angular components. We now describe this construction.
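The quantile formula for $\mathcal{W}_{p}$ in ingredient (I) is easy to evaluate numerically. The sketch below approximates $\int_{0}^{1}|F_{\xi}^{\leftarrow}(u)-F_{\zeta}^{\leftarrow}(u)|^{p}\mathrm{d}u$ by a midpoint rule; the function name `comonotonic_wasserstein`, the grid size and the choice of exponential laws are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def comonotonic_wasserstein(q_xi, q_zeta, p=2.0, n_grid=200_000):
    """Approximate W_p between two laws on the real line from their quantile
    functions, using the comonotonic coupling (F_xi^{-1}(U), F_zeta^{-1}(U))."""
    u = (np.arange(n_grid) + 0.5) / n_grid          # midpoints of a uniform grid on (0, 1)
    cost = np.abs(q_xi(u) - q_zeta(u)) ** p         # |F_xi^{-1}(u) - F_zeta^{-1}(u)|^p
    return float(np.mean(cost) ** (1.0 / p))        # p-th root of the integral

# Quantile functions of exponential laws with rates 1 and 2.
q_rate_1 = lambda u: -np.log1p(-u)
q_rate_2 = lambda u: -np.log1p(-u) / 2.0

w1 = comonotonic_wasserstein(q_rate_1, q_rate_2, p=1.0)
w2 = comonotonic_wasserstein(q_rate_1, q_rate_2, p=2.0)
```

For these two laws the quantile functions differ by the factor $1/2$, so the exact values are $\mathcal{W}_{1}=1/2$ and $\mathcal{W}_{2}=\sqrt{1/2}$, which the numerical values should match up to discretisation error.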

Recall that the Lévy processes 𝑿\bm{X} and 𝒀\bm{Y} in d\mathbb{R}^{d} have characteristic triplets (𝜸𝑿,𝚺𝑿𝚺𝑿,ν𝑿)(\bm{\gamma_{X}},\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\scalebox{0.6}{$\top$}},\nu_{\bm{X}}) and (𝜸𝒀,𝚺𝒀𝚺𝒀,ν𝒀)(\bm{\gamma_{Y}},\bm{\Sigma_{Y}}\bm{\Sigma_{Y}}^{\scalebox{0.6}{$\top$}},\nu_{\bm{Y}}), respectively. Suppose the Lévy measure ν𝑿\nu_{\bm{X}} (resp. ν𝒀\nu_{\bm{Y}}) of 𝑿\bm{X} (resp. 𝒀\bm{Y}) admits a radial decomposition (see [32, p. 282]), that is, there exists a probability measure σ𝑿\sigma_{\bm{X}} (resp. σ𝒀\sigma_{\bm{Y}}) on the unit sphere 𝕊d1\mathbb{S}^{d-1} (with convention 𝕊0{1,1}\mathbb{S}^{0}\coloneqq\{-1,1\}) such that:

ν𝑿(B)=𝕊d10𝟙B(x𝒗)ρ𝑿0(dx,𝒗)σ𝑿(d𝒗),(resp. ν𝒀(B)=𝕊d10𝟙B(x𝒗)ρ𝒀0(dx,𝒗)σ𝒀(d𝒗)),\nu_{\bm{X}}(B)=\int_{\mathbb{S}^{d-1}}\int_{0}^{\infty}\mathds{1}_{B}(x\bm{v})\rho_{\bm{X}}^{0}(\mathrm{d}x,\bm{v})\sigma_{\bm{X}}(\mathrm{d}\bm{v}),\,\,\bigg{(}\text{resp. }\nu_{\bm{Y}}(B)=\int_{\mathbb{S}^{d-1}}\int_{0}^{\infty}\mathds{1}_{B}(x\bm{v})\rho_{\bm{Y}}^{0}(\mathrm{d}x,\bm{v})\sigma_{\bm{Y}}(\mathrm{d}\bm{v})\bigg{)},

for any B(d{𝟎})B\in\mathcal{B}(\mathbb{R}^{d}\setminus\{\bm{0}\}), where {ρ𝑿0(,𝒗)}𝒗𝕊d1\{\rho^{0}_{\bm{X}}(\cdot,\bm{v})\}_{\bm{v}\in\mathbb{S}^{d-1}} (resp. {ρ𝒀0(,𝒗)}𝒗𝕊d1\{\rho^{0}_{\bm{Y}}(\cdot,\bm{v})\}_{\bm{v}\in\mathbb{S}^{d-1}}) is a measurable family of Lévy measures on (0,)(0,\infty). Define the probability measure σ(σ𝑿+σ𝒀)/2\sigma\coloneqq(\sigma_{\bm{X}}+\sigma_{\bm{Y}})/2 on 𝕊d1\mathbb{S}^{d-1} and the Radon-Nikodym derivatives f𝑿σ(𝒗)σ𝑿(d𝒗)/σ(d𝒗)2f^{\sigma}_{\bm{X}}(\bm{v})\coloneqq\sigma_{\bm{X}}(\mathrm{d}\bm{v})/\sigma(\mathrm{d}\bm{v})\leqslant 2 and f𝒀σ(𝒗)σ𝒀(d𝒗)/σ(d𝒗)2f^{\sigma}_{\bm{Y}}(\bm{v})\coloneqq\sigma_{\bm{Y}}(\mathrm{d}\bm{v})/\sigma(\mathrm{d}\bm{v})\leqslant 2 for 𝒗𝕊d1\bm{v}\in\mathbb{S}^{d-1}. Consider the following radial decompositions of ν𝑿\nu_{\bm{X}} and ν𝒀\nu_{\bm{Y}}:

(12) ν𝑿(B)=𝕊d10𝟙B(x𝒗)ρ𝑿(dx,𝒗)σ(d𝒗),ν𝒀(B)=𝕊d10𝟙B(x𝒗)ρ𝒀(dx,𝒗)σ(d𝒗),\nu_{\bm{X}}(B)=\int_{\mathbb{S}^{d-1}}\int_{0}^{\infty}\mathds{1}_{B}(x\bm{v})\rho_{\bm{X}}(\mathrm{d}x,\bm{v})\sigma(\mathrm{d}\bm{v}),\quad\nu_{\bm{Y}}(B)=\int_{\mathbb{S}^{d-1}}\int_{0}^{\infty}\mathds{1}_{B}(x\bm{v})\rho_{\bm{Y}}(\mathrm{d}x,\bm{v})\sigma(\mathrm{d}\bm{v}),

for B(d{𝟎})B\in\mathcal{B}(\mathbb{R}^{d}\setminus\{\bm{0}\}), where ρ𝑿(,𝒗)f𝑿σ(𝒗)ρ𝑿0(,𝒗)\rho_{\bm{X}}(\cdot,\bm{v})\coloneqq f^{\sigma}_{\bm{X}}(\bm{v})\rho_{\bm{X}}^{0}(\cdot,\bm{v}) and ρ𝒀(,𝒗)f𝒀σ(𝒗)ρ𝒀0(,𝒗)\rho_{\bm{Y}}(\cdot,\bm{v})\coloneqq f^{\sigma}_{\bm{Y}}(\bm{v})\rho_{\bm{Y}}^{0}(\cdot,\bm{v}) for 𝒗𝕊d1\bm{v}\in\mathbb{S}^{d-1}. The advantage of the decomposition in (12), compared to the one in the display above, is that the angular components of jumps are sampled from the same measure σ\sigma on 𝕊d1\mathbb{S}^{d-1}, making it possible to couple the jumps of 𝑿\bm{X} and 𝒀\bm{Y} by coupling their magnitudes.

For every $\bm{v}\in\mathbb{S}^{d-1}$, let $u\mapsto\rho_{\bm{X}}^{\leftarrow}(u,\bm{v})$ (resp. $u\mapsto\rho_{\bm{Y}}^{\leftarrow}(u,\bm{v})$) be the right inverse of $x\mapsto\rho_{\bm{X}}([x,\infty),\bm{v})$ (resp. $x\mapsto\rho_{\bm{Y}}([x,\infty),\bm{v})$). Let $(U_{n})_{n\in\mathbb{N}}$ be a sequence of iid uniform random variables on $[0,1]$, and let $(\Gamma_{n})_{n\in\mathbb{N}}$ be a sequence of partial sums of iid standard exponentially distributed random variables that is independent of $(U_{n})_{n\in\mathbb{N}}$. Next, independent of $(U_{n},\Gamma_{n})_{n\in\mathbb{N}}$, we denote by $(\bm{V}_{n})_{n\in\mathbb{N}}$ a sequence of iid random vectors on $\mathbb{S}^{d-1}$ with common distribution $\sigma$. Define the Poisson random measure $\Xi$ on $[0,1]\times(0,\infty)\times\mathbb{S}^{d-1}$, with mean measure $\text{Leb}\otimes\text{Leb}\otimes\sigma$, and the corresponding compensated Poisson random measure $\widetilde{\Xi}(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v})$ as follows:

(13) Ξnδ(Un,Γn,𝑽n),Ξ~(ds,dx,d𝒗)=Ξ(ds,dx,d𝒗)dsdxσ(d𝒗).\Xi\coloneqq\sum_{n\in\mathbb{N}}\delta_{(U_{n},\Gamma_{n},\bm{V}_{n})},\qquad\widetilde{\Xi}(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v})=\Xi(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v})-\mathrm{d}s\otimes\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).

Next, we note that (by Proposition 4.3 below) for any ε(0,)\varepsilon\in(0,\infty) (and even ε=\varepsilon=\infty when 𝑿\bm{X} and 𝒀\bm{Y} both have jumps of finite variation), the small-jump components of 𝑿\bm{X} and 𝒀\bm{Y} take the form

(14) 𝑴𝑿t[0,t]×[ε,)×𝕊d1𝒗ρ𝑿(x,𝒗)Ξ~(ds,dx,d𝒗),𝑴𝒀t[0,t]×[ε,)×𝕊d1𝒗ρ𝒀(x,𝒗)Ξ~(ds,dx,d𝒗).\bm{M^{X}}_{t}\!\coloneqq\!\!\int_{[0,t]\times[\varepsilon,\infty)\times\mathbb{S}^{d-1}}\!\!\bm{v}\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})\widetilde{\Xi}(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v}),\enskip\bm{M^{Y}}_{t}\!\coloneqq\!\!\int_{[0,t]\times[\varepsilon,\infty)\times\mathbb{S}^{d-1}}\!\!\bm{v}\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})\widetilde{\Xi}(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v}).

The big-jump components of 𝑿\bm{X} and 𝒀\bm{Y} can similarly be expressed as

(15) 𝑳𝑿t[0,t]×(0,ε)×𝕊d1𝒗ρ𝑿(x,𝒗)Ξ(ds,dx,d𝒗),𝑳𝒀t[0,t]×(0,ε)×𝕊d1𝒗ρ𝒀(x,𝒗)Ξ(ds,dx,d𝒗).\bm{L^{X}}_{t}\coloneqq\int_{[0,t]\times(0,\varepsilon)\times\mathbb{S}^{d-1}}\bm{v}\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})\Xi(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v}),\enskip\bm{L^{Y}}_{t}\coloneqq\int_{[0,t]\times(0,\varepsilon)\times\mathbb{S}^{d-1}}\bm{v}\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})\Xi(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v}).
Proposition 4.3.

Let the Lévy processes $\bm{X}$ and $\bm{Y}$ have characteristic triplets $(\bm{\gamma_{X}},\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\top},\nu_{\bm{X}})$ and $(\bm{\gamma_{Y}},\bm{\Sigma_{Y}}\bm{\Sigma_{Y}}^{\top},\nu_{\bm{Y}})$, respectively. Assume that the Lévy measures $\nu_{\bm{X}}$ and $\nu_{\bm{Y}}$ admit the radial decomposition in (12) and construct the processes $(\bm{M^{X}},\bm{M^{Y}},\bm{L^{X}},\bm{L^{Y}})$ by (14) and (15), independent of standard Brownian motions $\bm{B^{X}}$ and $\bm{B^{Y}}$ on $\mathbb{R}^{d}$. Then there exist constants $\bm{\varpi_{X}},\bm{\varpi_{Y}}\in\mathbb{R}^{d}$ such that $\bm{X}_{t}\overset{d}{=}\bm{\varpi_{X}}t+\bm{\Sigma_{X}}\bm{B}^{\bm{X}}_{t}+\bm{M}^{\bm{X}}_{t}+\bm{L}^{\bm{X}}_{t}$ and $\bm{Y}_{t}\overset{d}{=}\bm{\varpi_{Y}}t+\bm{\Sigma_{Y}}\bm{B}^{\bm{Y}}_{t}+\bm{M}^{\bm{Y}}_{t}+\bm{L}^{\bm{Y}}_{t}$ for all $t\in[0,1]$. Moreover, this coupling of $\bm{X}$ and $\bm{Y}$ satisfies

(16) 𝔼[supt[0,1]|𝑴𝑿t𝑴𝒀t|2]\displaystyle\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{M^{X}}_{t}-\bm{M^{Y}}_{t}\big{|}^{2}\bigg{]} 4[ε,)×𝕊d1(ρ𝑿(x,𝒗)ρ𝒀(x,𝒗))2dxσ(d𝒗).\displaystyle\leqslant 4\int_{[\varepsilon,\infty)\times\mathbb{S}^{d-1}}(\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v}))^{2}\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).

Furthermore, if 𝟎d|𝐰|2𝟙dB𝟎(1)(𝐰)ν𝐗(d𝐰)<\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|^{2}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}(\bm{w})\nu_{\bm{X}}(\mathrm{d}\bm{w})<\infty and 𝟎d|𝐰|2𝟙dB𝟎(1)(𝐰)ν𝐘(d𝐰)<\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|^{2}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}(\bm{w})\nu_{\bm{Y}}(\mathrm{d}\bm{w})<\infty, then

(17) 𝔼[supt[0,1]|𝑴𝑿t+𝑳𝑿t(𝑴𝒀t+𝑳𝒀t)𝒎t|2]4(0,)×𝕊d1(ρ𝑿(x,𝒗)ρ𝒀(x,𝒗))2dxσ(d𝒗),\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{M^{X}}_{t}+\bm{L^{X}}_{t}-(\bm{M^{Y}}_{t}+\bm{L^{Y}}_{t})-\bm{m}t\big{|}^{2}\bigg{]}\leqslant 4\int_{(0,\infty)\times\mathbb{S}^{d-1}}(\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v}))^{2}\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}),

where we define $\bm{m}\coloneqq\mathds{E}[\bm{L^{X}}_{1}-\bm{L^{Y}}_{1}]=\int_{(0,\varepsilon)\times\mathbb{S}^{d-1}}\bm{v}(\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v}))\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v})\in\mathbb{R}^{d}$. In particular,

(18) 𝒲2(𝑿,𝒀)|ϖ𝑿ϖ𝒀+𝒎|+2d1/2|𝚺𝑿𝚺𝒀|+2((0,)×𝕊d1(ρ𝑿(x,𝒗)ρ𝒀(x,𝒗))2dxσ(d𝒗))1/2.\begin{split}\mathcal{W}_{2}(\bm{X},\bm{Y})&\leqslant|\bm{\varpi_{X}}-\bm{\varpi_{Y}}+\bm{m}|+2d^{1/2}|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|\\ &\qquad+2\bigg{(}\int_{(0,\infty)\times\mathbb{S}^{d-1}}(\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v}))^{2}\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v})\bigg{)}^{1/2}.\end{split}

Coupling the jumps of 𝑿\bm{X} and 𝒀\bm{Y} via (14) and (15) is based on the idea behind the one-dimensional comonotonic coupling, applied to the magnitudes of the jumps of 𝑿\bm{X} and 𝒀\bm{Y}. Indeed, in the coupling of Proposition 4.3, we align the angular components of the jumps and then couple the magnitudes via the right inverses ρ𝑿(,𝒗)\rho_{\bm{X}}^{\leftarrow}(\cdot,\bm{v}) and ρ𝒀(,𝒗)\rho_{\bm{Y}}^{\leftarrow}(\cdot,\bm{v}) (of possibly unbounded functions xρ𝑿([x,),𝒗)x\mapsto\rho_{\bm{X}}([x,\infty),\bm{v}) and xρ𝒀([x,),𝒗)x\mapsto\rho_{\bm{Y}}([x,\infty),\bm{v})) evaluated along the sequence (Γn)n(\Gamma_{n})_{n\in\mathbb{N}} of partial sums of iid standard exponentially distributed random variables. Note that this construction is analogous to the one-dimensional comonotonic coupling of real random variables described above, but allows for the functions xρ𝑿([x,),𝒗)x\mapsto\rho_{\bm{X}}([x,\infty),\bm{v}) and xρ𝒀([x,),𝒗)x\mapsto\rho_{\bm{Y}}([x,\infty),\bm{v}) to be unbounded.
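The coupling described above can be sketched numerically by truncating the series in (13) to finitely many terms. The sketch below is an illustration under strong assumptions that are not taken from the paper: both angular measures are uniform on the sphere (so $\sigma_{\bm{X}}=\sigma_{\bm{Y}}=\sigma$), and the radial tails are of pure power type, $\rho_{\bm{X}}([x,\infty),\bm{v})=c_{\bm{X}}x^{-\alpha_{\bm{X}}}$ and $\rho_{\bm{Y}}([x,\infty),\bm{v})=c_{\bm{Y}}x^{-\alpha_{\bm{Y}}}$, whose right inverses are $u\mapsto(u/c)^{-1/\alpha}$. The function name `lepage_coupled_jumps` and its parameters are hypothetical.

```python
import numpy as np

def lepage_coupled_jumps(n_terms, c_x, alpha_x, c_y, alpha_y, rng, d=2):
    """Truncated version of the coupling in (13)-(15): common jump times U_n,
    common directions V_n, and magnitudes obtained by evaluating the two
    right inverses along the same arrival times Gamma_n."""
    times = rng.uniform(0.0, 1.0, size=n_terms)            # U_n
    gammas = np.cumsum(rng.exponential(size=n_terms))      # Gamma_n, partial sums of Exp(1)
    gauss = rng.standard_normal((n_terms, d))
    dirs = gauss / np.linalg.norm(gauss, axis=1, keepdims=True)  # V_n, uniform on the sphere (assumed)
    mag_x = (gammas / c_x) ** (-1.0 / alpha_x)             # rho_X^{<-}(Gamma_n), power tail assumed
    mag_y = (gammas / c_y) ** (-1.0 / alpha_y)             # rho_Y^{<-}(Gamma_n), power tail assumed
    return times, dirs, mag_x, mag_y

rng = np.random.default_rng(0)
times, dirs, mag_x, mag_y = lepage_coupled_jumps(1000, 1.0, 1.5, 2.0, 1.2, rng)
jumps_x = mag_x[:, None] * dirs     # jumps of X, ordered by decreasing magnitude
jumps_y = mag_y[:, None] * dirs     # jumps of Y, parallel to those of X
```

Small values of $\Gamma_{n}$ correspond to large jumps, which matches the split in (14) and (15): arrival times below $\varepsilon$ produce the big-jump components and those above $\varepsilon$ the small-jump components. The truncation to `n_terms` points discards the smallest jumps and is a simplification of this sketch.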

Proof.

We start by showing that there exist $\bm{\varpi_{X}},\bm{\varpi_{Y}}\in\mathbb{R}^{d}$ such that $\bm{X}_{t}\overset{d}{=}\bm{\varpi_{X}}t+\bm{\Sigma_{X}}\bm{B}^{\bm{X}}_{t}+\bm{M}^{\bm{X}}_{t}+\bm{L}^{\bm{X}}_{t}$ and $\bm{Y}_{t}\overset{d}{=}\bm{\varpi_{Y}}t+\bm{\Sigma_{Y}}\bm{B}^{\bm{Y}}_{t}+\bm{M}^{\bm{Y}}_{t}+\bm{L}^{\bm{Y}}_{t}$ for all $t\in[0,1]$. The proof of this fact is essentially given in [38, p. 4]; we outline it here for completeness. By the symmetry of the construction, it is sufficient to prove the first equality in law only. Since $\bm{X}$ is a Lévy process, the random measure $\Xi_{\bm{X}}=\sum_{\{t:\Delta\bm{X}_{t}\neq 0\}}\delta_{(t,\Delta\bm{X}_{t})}$ of the jumps of $\bm{X}$ is a Poisson random measure on $[0,1]\times\mathbb{R}^{d}_{\bm{0}}$ with mean measure $\text{Leb}\otimes\nu_{\bm{X}}$ [40, Thm 19.2]. By (14) and (15), the equality in law $\bm{X}_{t}\overset{d}{=}\bm{\varpi_{X}}t+\bm{\Sigma_{X}}\bm{B}^{\bm{X}}_{t}+\bm{M}^{\bm{X}}_{t}+\bm{L}^{\bm{X}}_{t}$ holds for some $\bm{\varpi_{X}}\in\mathbb{R}^{d}$ if

(19) Ξ𝑿=𝑑n=1δ(Un,ρ𝑿(Γn,𝑽n)𝑽n).\Xi_{\bm{X}}\overset{d}{=}\sum_{n=1}^{\infty}\delta_{(U_{n},\rho_{\bm{X}}^{\leftarrow}(\Gamma_{n},\bm{V}_{n})\bm{V}_{n})}.

To prove this, consider the Poisson random measure Ξ\Xi on [0,1]×(0,)×𝕊d1[0,1]\times(0,\infty)\times\mathbb{S}^{d-1}, with mean measure LebLebσ\text{Leb}\otimes\text{Leb}\otimes\sigma, defined in (13). Define h:[0,1]×(0,)×𝕊d1[0,1]×dh:[0,1]\times(0,\infty)\times\mathbb{S}^{d-1}\to[0,1]\times\mathbb{R}^{d} by h(t,x,𝒗)(t,ρ𝑿(x,𝒗)𝒗)h(t,x,\bm{v})\coloneqq(t,\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})\bm{v}). Crucially, by construction, we have (LebLebσ)h1=Lebν𝑿(\text{Leb}\otimes\text{Leb}\otimes\sigma)\circ h^{-1}=\text{Leb}\otimes\nu_{\bm{X}} on (𝟎d)\mathcal{B}(\mathbb{R}^{d}_{\bm{0}}). Thus, by the Mapping Theorem [31, Sec. 2.3], we get Ξh1=𝑑Ξ𝑿\Xi\circ h^{-1}\overset{d}{=}\Xi_{\bm{X}}. Moreover, since n=1δ(Un,ρ𝑿(Γn,𝑽n)𝑽n)=Ξh1\sum_{n=1}^{\infty}\delta_{(U_{n},\rho_{\bm{X}}^{\leftarrow}(\Gamma_{n},\bm{V}_{n})\bm{V}_{n})}=\Xi\circ h^{-1} by construction, the equality in law in (19) follows.

Next, we prove that

$$\bm{M^{X}}_{t}-\bm{M^{Y}}_{t}=\int_{[0,t]\times[\varepsilon,\infty)\times\mathbb{S}^{d-1}}\bm{v}(\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v}))\widetilde{\Xi}(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v})$$

is a square-integrable martingale. Let $\mathcal{F}_{t}$ be the $\sigma$-field generated by $\Xi((0,s]\times A)$ for $0\leqslant s\leqslant t$ and $A\in\mathcal{B}([\varepsilon,\infty)\times\mathbb{S}^{d-1})$. Then $\bm{M^{X}}-\bm{M^{Y}}$ is adapted to $(\mathcal{F}_{t})_{t\geqslant 0}$ and fulfils the martingale property by virtue of being an integral with respect to a compensated Poisson random measure. Furthermore, by the triangle inequality, $\bm{M^{X}}_{t}-\bm{M^{Y}}_{t}$ is square integrable since both $\bm{M^{X}}_{t}$ and $\bm{M^{Y}}_{t}$ are square integrable. Since the process $|\bm{M^{X}}_{t}-\bm{M^{Y}}_{t}|$ is a submartingale, Doob’s maximal inequality [29, Prop. 7.16] and Campbell’s formula [31, p. 28] imply

$$\mathds{E}\bigg[\sup_{t\in[0,1]}\big|\bm{M^{X}}_{t}-\bm{M^{Y}}_{t}\big|^{2}\bigg]\leqslant 4\mathds{E}\big[\big|\bm{M^{X}}_{1}-\bm{M^{Y}}_{1}\big|^{2}\big]=4\int_{[\varepsilon,\infty)\times\mathbb{S}^{d-1}}(\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v}))^{2}\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).$$

If 𝟙dB𝟎(1)(𝒘)ν𝑿(d𝒘)\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}(\bm{w})\nu_{\bm{X}}(\mathrm{d}\bm{w}) and 𝟙dB𝟎(1)(𝒘)ν𝒀(d𝒘)\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}(\bm{w})\nu_{\bm{Y}}(\mathrm{d}\bm{w}) have finite second moment, a similar bound can be established for the big-jump components using Doob’s maximal inequality and Campbell’s formula:

𝔼[supt[0,1]|𝑴𝑿t+𝑳𝑿t(𝑴𝒀t+𝑳𝒀t)𝒎t|2]\displaystyle\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{M^{X}}_{t}+\bm{L^{X}}_{t}-(\bm{M^{Y}}_{t}+\bm{L^{Y}}_{t})-\bm{m}t\big{|}^{2}\bigg{]} 4𝔼[|𝑴𝑿1+𝑳𝑿1(𝑴𝒀1+𝑳𝒀1)𝒎|2]\displaystyle\leqslant 4\mathds{E}\big{[}\big{|}\bm{M^{X}}_{1}+\bm{L^{X}}_{1}-(\bm{M^{Y}}_{1}+\bm{L^{Y}}_{1})-\bm{m}\big{|}^{2}\big{]}
=4(0,)×𝕊d1(ρ𝑿(x,𝒗)ρ𝒀(x,𝒗))2dxσ(d𝒗).\displaystyle=4\int_{(0,\infty)\times\mathbb{S}^{d-1}}(\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v}))^{2}\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).

Finally, (18) follows directly from (17) and the standard arguments given in Appendix C. ∎

Proposition 4.4.

Pick $q\in(0,1]$. Assume that the Lévy measures $\nu_{\bm{X}}$ and $\nu_{\bm{Y}}$ admit the radial decomposition in (12) and construct the processes $(\bm{L^{X}},\bm{L^{Y}})$ by (15). For any $\varepsilon\in(0,\infty)$ (we may have $\varepsilon=\infty$ when $\bm{X}$ and $\bm{Y}$ are of finite variation), the coupling $(\bm{L^{X}},\bm{L^{Y}})$ satisfies

(20) 𝔼[supt[0,1]|𝑳𝑿t𝑳𝒀t|q](0,ε)×𝕊d1|ρ𝑿(x,𝒗)ρ𝒀(x,𝒗)|qdxσ(d𝒗).\mathds{E}\bigg{[}\sup_{t\in[0,1]}\big{|}\bm{L^{X}}_{t}-\bm{L^{Y}}_{t}\big{|}^{q}\bigg{]}\leqslant\int_{(0,\varepsilon)\times\mathbb{S}^{d-1}}|\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})|^{q}\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).

In particular, the following inequality holds

(21) 𝒲q(𝑿,𝒀)|ϖ𝑿ϖ𝒀|q+(4[ε,)×𝕊d1(ρ𝑿(x,𝒗)ρ𝒀(x,𝒗))2dxσ(d𝒗))q/2+2qdq/2|𝚺𝑿𝚺𝒀|q+(0,ε)×𝕊d1|ρ𝑿(x,𝒗)ρ𝒀(x,𝒗)|qdxσ(d𝒗).\begin{split}\mathcal{W}_{q}(\bm{X},\bm{Y})&\leqslant|\bm{\varpi_{X}}-\bm{\varpi_{Y}}|^{q}+\bigg{(}4\int_{[\varepsilon,\infty)\times\mathbb{S}^{d-1}}(\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v}))^{2}\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v})\bigg{)}^{q/2}\\ &\qquad+2^{q}d^{q/2}|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|^{q}+\int_{(0,\varepsilon)\times\mathbb{S}^{d-1}}|\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})|^{q}\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).\end{split}

As was the case for the thinning coupling, (20) and (21) hold even without assuming that $\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}(\bm{w})\nu_{\bm{X}}(\mathrm{d}\bm{w})$ and $\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}(\bm{w})\nu_{\bm{Y}}(\mathrm{d}\bm{w})$ have a finite $q$-moment. However, under such an assumption, the upper bounds are finite since the integral on the right of (20) is bounded by $\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}\nu_{\bm{X}}(\mathrm{d}\bm{w})+\int_{\mathbb{R}^{d}_{\bm{0}}}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)}(\bm{w})|\bm{w}|^{q}\nu_{\bm{Y}}(\mathrm{d}\bm{w})<\infty$ for some $\kappa>0$.

Proof.

For $\kappa\in(0,\varepsilon)$, we denote by $\bm{L^{X}}_{t,\kappa}$ and $\bm{L^{Y}}_{t,\kappa}$ the truncated large jumps of $\bm{X}$ and $\bm{Y}$, given by

\[
\bm{L^{X}}_{t,\kappa}\coloneqq\int_{[0,t]\times(\kappa,\varepsilon)\times\mathbb{S}^{d-1}}\bm{v}\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})\,\Xi(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v}),\quad\text{and}\quad\bm{L^{Y}}_{t,\kappa}\coloneqq\int_{[0,t]\times(\kappa,\varepsilon)\times\mathbb{S}^{d-1}}\bm{v}\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})\,\Xi(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v}).
\]

Note that $\bm{L^{X}}_{t,\kappa}-\bm{L^{Y}}_{t,\kappa}=\int_{[0,t]\times(\kappa,\varepsilon)\times\mathbb{S}^{d-1}}\bm{v}(\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v}))\,\Xi(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v})$. Since $x\mapsto x^{q}$ is concave on $[0,\infty)$ and vanishes at $0$, it is subadditive, i.e. $(a+b)^{q}\leqslant a^{q}+b^{q}$ for $a,b\geqslant 0$, and thus

\[
\sup_{t\in[0,1]}\big|\bm{L^{X}}_{t,\kappa}-\bm{L^{Y}}_{t,\kappa}\big|^{q}\leqslant\int_{[0,1]\times(\kappa,\varepsilon)\times\mathbb{S}^{d-1}}|\bm{v}|^{q}|\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})|^{q}\,\Xi(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v}).
\]

Since $\bm{v}\in\mathbb{S}^{d-1}$, we have $|\bm{v}|^{q}=1$, and Campbell’s theorem [31, p. 28] then implies that

\[
\mathds{E}\bigg[\int_{[0,1]\times(\kappa,\varepsilon)\times\mathbb{S}^{d-1}}|\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})|^{q}\,\Xi(\mathrm{d}s,\mathrm{d}x,\mathrm{d}\bm{v})\bigg]=\int_{(\kappa,\varepsilon)\times\mathbb{S}^{d-1}}|\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})|^{q}\,\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).
\]

Thus, altogether, this implies that

\[
\mathds{E}\bigg[\sup_{t\in[0,1]}\big|\bm{L^{X}}_{t,\kappa}-\bm{L^{Y}}_{t,\kappa}\big|^{q}\bigg]\leqslant\int_{(\kappa,\varepsilon)\times\mathbb{S}^{d-1}}|\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})|^{q}\,\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).
\]

By the monotone convergence theorem, it follows, as $\kappa\downarrow 0$, that

\[
\int_{(\kappa,\varepsilon)\times\mathbb{S}^{d-1}}|\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})|^{q}\,\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v})\to\int_{(0,\varepsilon)\times\mathbb{S}^{d-1}}|\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})|^{q}\,\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).
\]

Furthermore, Fatou’s lemma together with the above observations implies that

\begin{align*}
\mathds{E}\bigg[\liminf_{\kappa\downarrow 0}\sup_{t\in[0,1]}\big|\bm{L^{X}}_{t,\kappa}-\bm{L^{Y}}_{t,\kappa}\big|^{q}\bigg]&\leqslant\liminf_{\kappa\downarrow 0}\mathds{E}\bigg[\sup_{t\in[0,1]}\big|\bm{L^{X}}_{t,\kappa}-\bm{L^{Y}}_{t,\kappa}\big|^{q}\bigg]\\
&\leqslant\int_{(0,\varepsilon)\times\mathbb{S}^{d-1}}|\rho_{\bm{X}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Y}}^{\leftarrow}(x,\bm{v})|^{q}\,\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v}).
\end{align*}

We can now conclude (20), as $\liminf_{\kappa\downarrow 0}\sup_{t\in[0,1]}\big|\bm{L^{X}}_{t,\kappa}-\bm{L^{Y}}_{t,\kappa}\big|^{q}=\sup_{t\in[0,1]}\big|\bm{L^{X}}_{t}-\bm{L^{Y}}_{t}\big|^{q}$ a.s., since the largest jumps of $\bm{L^{X}}_{t,\kappa}$ and $\bm{L^{Y}}_{t,\kappa}$ are finite on the time interval $[0,1]$.

Inequality (21) then follows from (16), (20) and the elementary arguments in Appendix C. ∎
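The pathwise step in the proof of (20) rests on the subadditivity of $x\mapsto x^{q}$ for $q\in(0,1]$. The following sketch checks that inequality numerically in a simplified one-dimensional setting. It is an illustration only, not the construction in (15): the jump sequences `jumps_x` and `jumps_y` are hypothetical stand-ins for coupled jumps arriving in the same order.

```python
import random


def sup_partial_sum_power(diffs, q):
    """Return max_k |sum_{i<=k} diffs[i]|**q, i.e. the running supremum
    of the q-th power of the partial sums of the jump differences."""
    running, best = 0.0, 0.0
    for d in diffs:
        running += d
        best = max(best, abs(running) ** q)
    return best


def sum_of_powers(diffs, q):
    """Return sum_i |diffs[i]|**q, the pathwise upper bound."""
    return sum(abs(d) ** q for d in diffs)


def check_subadditivity(n_jumps=200, q=0.5, seed=0):
    """Compare both sides of the pathwise bound for synthetic jumps
    (hypothetical data, uniformly distributed on [-1, 1])."""
    rng = random.Random(seed)
    jumps_x = [rng.uniform(-1.0, 1.0) for _ in range(n_jumps)]
    jumps_y = [rng.uniform(-1.0, 1.0) for _ in range(n_jumps)]
    diffs = [a - b for a, b in zip(jumps_x, jumps_y)]
    return sup_partial_sum_power(diffs, q), sum_of_powers(diffs, q)
```

For any $q\in(0,1]$ and any finite sequence of differences, the first returned value should not exceed the second. Taking expectations of the right-hand side is what Campbell’s theorem handles in the proof.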

5. Upper bounds on the Wasserstein distance in the domain of attraction

The main aim of this section is to prove the upper bounds in Theorems 2.1, 2.3 and 2.8 above. In Section 5.1 we characterise, in terms of their generating triplets, the Lévy processes in $\mathbb{R}^{d}$ that are in the stable domain of attraction. The proof of the upper bounds in Theorem 2.1, based on the thinning coupling, is given in Section 5.2. The upper bounds in Theorem 2.3 are established in Section 5.3 using the comonotonic coupling. In Section 5.4, we prove the upper bounds of Theorem 2.8 for the Brownian limit. In the proofs, we will rely on the following consequence of Jensen’s inequality:

\[
\mathcal{W}_{q}(\bm{\mathcal{X}},\bm{\mathcal{Y}})\leqslant\mathcal{W}_{q^{\prime}}(\bm{\mathcal{X}},\bm{\mathcal{Y}})^{\frac{q\wedge 1}{q^{\prime}\wedge 1}}\qquad\text{for any $0<q<q^{\prime}$.} \tag{22}
\]

5.1. Small-time domain of attraction for Lévy processes

We start by defining the attractor.

Definition.

For any $\alpha\in(0,2]$, the law of an $\alpha$-stable Lévy process $\bm{Z}$ is given by a generating triplet $(\bm{\gamma_{Z}},\bm{\Sigma_{Z}}\bm{\Sigma_{Z}}^{\top},\nu_{\bm{Z}})$ (for the cutoff function $\bm{w}\mapsto\mathds{1}_{B_{\bm{0}}(1)}(\bm{w})$) as follows: the Lévy measure equals

\[
\nu_{\bm{Z}}(A)\coloneqq c_{\alpha}\int_{0}^{\infty}\int_{\mathbb{S}^{d-1}}\mathds{1}_{A}(r\bm{v})\,\sigma(\mathrm{d}\bm{v})\,r^{-\alpha-1}\mathrm{d}r,\quad A\in\mathcal{B}(\mathbb{R}^{d}_{\bm{0}}), \tag{23}
\]

where $\sigma$ is a probability measure on $\mathcal{B}(\mathbb{S}^{d-1})$ and $c_{\alpha}\in[0,\infty)$ is an “intensity” parameter, satisfying

  • $\alpha=2$ [Brownian motion with zero drift]: $\bm{\Sigma_{Z}}\neq\bm{0}$, $\bm{\gamma_{Z}}=\bm{0}$ and $c_{\alpha}=0$ (i.e. $\nu_{\bm{Z}}\equiv 0$);

  • $\alpha\in(1,2)$ [infinite variation, zero-mean process]: $c_{\alpha}>0$, $\bm{\gamma_{Z}}=-\int_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}\bm{x}\,\nu_{\bm{Z}}(\mathrm{d}\bm{x})$ and $\bm{\Sigma_{Z}}=\bm{0}$;

  • $\alpha=1$ [Cauchy process]: either $c_{\alpha}>0$, with symmetric angular component $\int_{\mathbb{S}^{d-1}}\bm{v}\,\sigma(\mathrm{d}\bm{v})=\bm{0}$, or $c_{\alpha}=0$ and the process $\bm{Z}$ is a deterministic nonzero linear drift, i.e. $\bm{Z}_{t}=\bm{\gamma_{Z}}t$ for all times $t$;

  • $\alpha\in(0,1)$ [finite variation and zero natural drift]: $c_{\alpha}>0$ and $\bm{\gamma_{Z}}=\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}\bm{x}\,\nu_{\bm{Z}}(\mathrm{d}\bm{x})$.

It follows from the definition that an $\alpha$-stable process $\bm{Z}$ satisfies the scaling property $(\bm{Z}_{st})_{s\in[0,1]}\overset{d}{=}(t^{1/\alpha}\bm{Z}_{s})_{s\in[0,1]}$ for $t>0$. Moreover, for $\alpha\in[1,2)$ (resp. $\alpha\in(0,1)$), a non-deterministic $\alpha$-stable process $\bm{Z}$ is of infinite (resp. finite) variation by [40, Thm 21.9], since (23) implies $\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}|\bm{x}|\,\nu_{\bm{Z}}(\mathrm{d}\bm{x})=\infty$ (resp. $\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}|\bm{x}|\,\nu_{\bm{Z}}(\mathrm{d}\bm{x})<\infty$). Note also that in the case of the Cauchy process (stability index $\alpha=1$), $\bm{\gamma_{Z}}$ can be arbitrary if $c_{\alpha}>0$ and satisfies $\bm{\gamma_{Z}}\in\mathbb{R}^{d}_{\bm{0}}$ if $c_{\alpha}=0$.
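The two facts just stated can be read off from the radial form of (23). The sketch below assumes $\alpha\in(0,2)$ and $c_{\alpha}>0$, and uses the closed forms $\nu_{\bm{Z}}(\{|\bm{x}|\geqslant r\})=c_{\alpha}r^{-\alpha}/\alpha$ and $\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}|\bm{x}|\,\nu_{\bm{Z}}(\mathrm{d}\bm{x})=c_{\alpha}/(1-\alpha)$ for $\alpha<1$, which follow by integrating $r^{-\alpha-1}$ since $\sigma$ is a probability measure. The function names are illustrative.

```python
import math


def radial_tail(r, alpha, c_alpha):
    """nu_Z({|x| >= r}) for the Levy measure in (23); sigma has total
    mass one, so only the radial integral of r**(-alpha-1) remains."""
    return c_alpha * r ** (-alpha) / alpha


def small_jump_first_moment(alpha, c_alpha):
    """Integral of |x| over the punctured unit ball with respect to nu_Z.
    It is finite exactly when alpha < 1 (finite variation)."""
    if alpha < 1:
        return c_alpha / (1.0 - alpha)
    return math.inf


def scaled_tail(t, r, alpha, c_alpha):
    """t * nu_Z({|x| >= t**(1/alpha) * r}): the radial tail of the Levy
    measure of the rescaled process (Z_{st} / t**(1/alpha))_s."""
    return t * radial_tail(t ** (1.0 / alpha) * r, alpha, c_alpha)
```

At the level of Lévy measures, the scaling property corresponds to `scaled_tail(t, r, alpha, c_alpha)` agreeing with `radial_tail(r, alpha, c_alpha)` for every `t > 0`.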

For any $\bm{a}\in\mathbb{S}^{d-1}$, define $\mathscr{L}_{\bm{a}}(r)\coloneqq\{\bm{x}\in\mathbb{R}^{d}:\langle\bm{a},\bm{x}\rangle\geqslant r\}$ for any $r>0$. The following known result characterises the Lévy processes in the domain of attraction of an $\alpha$-stable process defined above. It is a consequence of [29, Thm 15.14] and [26, Thm 2]; see Appendix B below for the proof.

Theorem 5.1 (Small-time domains of attraction).

Let $\bm{X}=(\bm{X}_{t})_{t\in[0,1]}$ and $\bm{Z}=(\bm{Z}_{t})_{t\in[0,1]}$ be Lévy processes in $\mathbb{R}^{d}$. Then $(\bm{X}_{st}/g(t))_{s\in[0,1]}\xrightarrow{d}(\bm{Z}_{s})_{s\in[0,1]}$ as $t\downarrow 0$ in the Skorokhod space for some positive normalising function $g:(0,1]\to(0,\infty)$ if and only if $\bm{Z}$ is $\alpha$-stable for some $\alpha\in(0,2]$, the normalising function admits the representation $g(t)=t^{1/\alpha}G(t^{-1})$, where $G$ is a slowly varying function at infinity, and the generating triplets $(\bm{\gamma}_{\bm{X}},\bm{\Sigma_{X}}\bm{\Sigma}_{\bm{X}}^{\top},\nu_{\bm{X}})$ and $(\bm{\gamma}_{\bm{Z}},\bm{\Sigma_{Z}}\bm{\Sigma}_{\bm{Z}}^{\top},\nu_{\bm{Z}})$ (for the cutoff function $\bm{w}\mapsto\mathds{1}_{B_{\bm{0}}(1)}(\bm{w})$) of $\bm{X}$ and $\bm{Z}$, respectively, are related as follows:

  • if $\alpha=2$ (attraction to Brownian motion), then

    \[
    G(t^{-1})^{-2}\bigg(\bm{\Sigma_{X}}\bm{\Sigma}_{\bm{X}}^{\top}+\int_{B_{\bm{0}}(g(t))\setminus\{\bm{0}\}}\bm{x}\bm{x}^{\top}\nu_{\bm{X}}(\mathrm{d}\bm{x})\bigg)\to\bm{\Sigma_{Z}}\bm{\Sigma}_{\bm{Z}}^{\top},\quad\text{as }t\downarrow 0; \tag{24}
    \]
  • if $\alpha\in(1,2)$, we have $\bm{\Sigma_{X}}=\bm{0}$ and

    \[
    t\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))\to\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1)),\quad\text{as }t\downarrow 0,\quad\text{for any }\bm{v}\in\mathbb{S}^{d-1}; \tag{25}
    \]
  • if $\alpha=1$ (attraction to Cauchy process), then (25) holds,

    \[
    G(t^{-1})^{-1}\bigg(\bm{\gamma_{X}}-\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(g(t))}\bm{x}\,\nu_{\bm{X}}(\mathrm{d}\bm{x})\bigg)\to\bm{\gamma_{Z}},\quad\text{as }t\downarrow 0, \tag{26}
    \]

    and, for any $\bm{v}\in\mathbb{S}^{d-1}$ such that $\langle\bm{v},\bm{X}\rangle$ has finite variation (i.e. $\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}|\langle\bm{v},\bm{x}\rangle|\,\nu_{\bm{X}}(\mathrm{d}\bm{x})<\infty$) and $\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1))>0$, the process $\langle\bm{v},\bm{X}\rangle$ has zero natural drift: $\langle\bm{v},\bm{\gamma_{X}}\rangle=\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}\langle\bm{v},\bm{x}\rangle\,\nu_{\bm{X}}(\mathrm{d}\bm{x})$;

  • if $\alpha\in(0,1)$, then (25) holds, $\bm{X}$ has finite variation (i.e. $\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}|\bm{x}|\,\nu_{\bm{X}}(\mathrm{d}\bm{x})<\infty$) and zero natural drift (i.e. $\bm{\gamma_{X}}=\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}\bm{x}\,\nu_{\bm{X}}(\mathrm{d}\bm{x})$).

Moreover, the function $g$ satisfying the weak limit above is asymptotically unique at $0$: a positive function $\widetilde{g}$ satisfies $(\bm{X}_{st}/\widetilde{g}(t))_{s\in[0,1]}\xrightarrow{d}(\bm{Z}_{s})_{s\in[0,1]}$ as $t\downarrow 0$ if and only if $\widetilde{g}(t)/g(t)\to 1$ as $t\downarrow 0$.

Note that in the case $\alpha=2$ in Theorem 5.1, we may have $\bm{\Sigma_{X}}=\bm{0}$ (see Example 6.7 below), but in this case the function $G$ cannot be asymptotically equal to a positive constant. Moreover, in the case $\alpha\in(1,2)$, the process $\bm{X}$ does not require centering since its mean is linear in time and thus disappears in the scaling limit. However, in the finite variation case (i.e. when $\alpha\in(0,1)$), the process $\bm{X}$ must have zero natural drift for the scaling limit to exist.

5.2. Domain of normal attraction: the thinning coupling

Let $(\bm{\gamma_{X}},\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\top},\nu_{\bm{X}})$ denote the generating triplet [40, Def. 8.2] of $\bm{X}$ with respect to the cutoff function $\bm{w}\mapsto\mathds{1}_{B_{\bm{0}}(1)}(\bm{w})$ on $\bm{w}\in\mathbb{R}^{d}$. Define the Blumenthal–Getoor (BG) index $\beta$ of $\bm{X}$ by

\[
\beta:=\inf\{p>0:I_{p}<\infty\}\in[0,2],\qquad I_{p}:=\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}|\bm{w}|^{p}\,\nu_{\bm{X}}(\mathrm{d}\bm{w}). \tag{27}
\]

Fix $\beta_{+}\in[\beta,2]$ as follows: $\beta_{+}\coloneqq\beta$ if $I_{\beta}<\infty$; if $I_{\beta}=\infty$ and $\beta<1$, then pick $\beta_{+}\in(\beta,1)$; if $I_{\beta}=\infty$ and $\beta\geqslant 1$, then $\beta<2$ and hence choose $\beta_{+}\in(\beta,2)$. In particular, note that $I_{\beta_{+}}<\infty$ and $\beta_{+}>0$. Furthermore, if $\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}|\bm{w}|\,\nu_{\bm{X}}(\mathrm{d}\bm{w})<\infty$ (or, equivalently, if the pure-jump component of the Lévy–Itô decomposition (3) of $\bm{X}$ is of finite variation), we say that $\bm{X}$ has zero natural drift if $\bm{\gamma_{X}}=\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}\bm{w}\,\nu_{\bm{X}}(\mathrm{d}\bm{w})$ and nonzero natural drift otherwise. Moreover, if $\nu_{\bm{X}}(B_{\bm{0}}(1)\setminus\{\bm{0}\})<\infty$ (or, equivalently, the pure-jump component of $\bm{X}$ is of finite activity, i.e. a compound Poisson process), then $\beta=\beta_{+}=0$. If $\beta_{+}=0$, throughout the paper we use the convention $1/\beta_{+}\coloneqq\infty$.
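As an illustration of the choice of $\beta_{+}$, consider the stable Lévy measure in (23) with $c_{\alpha}>0$, for which $I_{p}=c_{\alpha}/(p-\alpha)$ when $p>\alpha$ and $I_{p}=\infty$ otherwise, so that $\beta=\alpha$ and $I_{\beta}=\infty$. The sketch below encodes the selection rule above. The midpoint used when $I_{\beta}=\infty$ is an arbitrary illustrative choice, since any point of the open interval is admissible.

```python
import math


def I_p_stable(p, alpha, c_alpha):
    """I_p from (27) for the stable Levy measure (23): the radial
    integral of r**(p - alpha - 1) over (0, 1), times c_alpha."""
    if p > alpha:
        return c_alpha / (p - alpha)
    return math.inf


def beta_plus(beta, i_beta_finite):
    """Pick beta_+ following the rule stated after (27).

    If I_beta is finite, beta_+ equals beta. Otherwise beta_+ must lie
    in (beta, 1) when beta < 1 and in (beta, 2) when beta >= 1; the
    midpoint of that interval is returned (an arbitrary choice)."""
    if i_beta_finite:
        return beta
    if beta >= 2:
        # I_2 is finite for every Levy measure, so this cannot occur.
        raise ValueError("I_beta is always finite when beta = 2")
    upper = 1.0 if beta < 1 else 2.0
    return 0.5 * (beta + upper)
```

For instance, for a stable measure with $\alpha=0.5$ one has `I_p_stable(0.5, 0.5, 1.0)` infinite, and `beta_plus(0.5, False)` returns a value in $(0.5,1)$.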

The following lemma gives an upper bound on the moments of the supremum of the norm of a general Lévy process. Lemma 5.2 plays an important role in the proofs of Section 5.

Lemma 5.2.

Let $\bm{X}$ be a Lévy process with generating triplet $(\bm{\gamma_{X}},\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\top},\nu_{\bm{X}})$. Recall the Blumenthal–Getoor index $\beta$ from (27) and the associated quantity $\beta_{+}\in[\beta,2]$. Assume that, for some $p>0$, we have $\int_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}|\bm{w}|^{p}\,\nu_{\bm{X}}(\mathrm{d}\bm{w})<\infty$. Then there exist constants $C_{i}\in[0,\infty)$, $i=1,\ldots,4$, such that

\[
\mathds{E}\bigg[\sup_{s\in[0,t]}|\bm{X}_{s}|^{p}\bigg]\leqslant\mathds{1}_{\{\bm{\Sigma_{X}}\neq\bm{0}\}}C_{1}t^{p/2}+C_{2}t^{p}+C_{3}t^{\min\{1,p/\beta_{+}\}},\quad\text{ for }t\in[0,1].
\]

If $\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}|\bm{w}|\,\nu_{\bm{X}}(\mathrm{d}\bm{w})<\infty$ and $\bm{X}$ has zero natural drift, i.e. $\bm{\gamma_{X}}=\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}\bm{w}\,\nu_{\bm{X}}(\mathrm{d}\bm{w})$, then $C_{2}=0$ in the inequality above.

Note that, by the definition of $\beta_{+}$ above, the pure-jump component of $\bm{X}$ is a compound Poisson process if and only if $\beta_{+}=0$. In particular, if in this case $\bm{X}$ additionally has zero natural drift and no Gaussian component, then $\bm{X}$ itself is a compound Poisson process. The term $t^{p/2}$ in the bound of Lemma 5.2 is present only if $\bm{X}$ has a non-trivial Gaussian component.

Lemma 5.2 is a multidimensional generalisation of [23, Lem. 2]. The proof of Lemma 5.2, given in Appendix A below, is likewise a multidimensional generalisation of the arguments in the proof of [23, Lem. 2]. As in [23, Lem. 2], the constants $C_{i}$, $i=1,\ldots,4$, can be given explicitly in terms of the characteristic triplet of $\bm{X}$.

Consider a Lévy process $\bm{X}$ in $\mathbb{R}^{d}$ in the domain of normal attraction of the $\alpha$-stable process $\bm{Z}$. Thus we may assume that $\bm{X}^{t}=(\bm{X}_{st}/t^{1/\alpha})_{s\in[0,1]}$ converges weakly to $\bm{Z}$ as $t\downarrow 0$. We will now apply the thinning coupling, described in (5) and (6) of Subsection 4.1 above, to quantify this convergence in terms of the Wasserstein distance under the following assumption.

Assumption (T).

Let the Lévy process $\bm{X}$ be in the small-time domain of attraction of a stable process $\bm{Z}$. Assume $\bm{X}$ has no Gaussian component (i.e. $\bm{\Sigma_{X}}=\bm{0}$) and its Lévy measure has a decomposition $\nu_{\bm{X}}=\nu_{\bm{X}}^{\mathsf{c}}+\nu_{\bm{X}}^{\mathrm{d}}$ satisfying the following: $\nu_{\bm{X}}^{\mathrm{d}}$ is arbitrary with finite mass $\nu_{\bm{X}}^{\mathrm{d}}(\mathbb{R}^{d}_{\bm{0}})<\infty$ and

\[
\nu_{\bm{X}}^{\mathsf{c}}(\mathrm{d}\bm{w})=c^{-1}f_{\bm{S}}(\bm{w})\,\nu_{\bm{Z}}(\mathrm{d}\bm{w})\quad\text{and}\quad|f_{\bm{S}}(\bm{w})-c|\leqslant K_{T}(1\wedge|\bm{w}|^{p}),\quad\text{for all}\quad\bm{w}\in\mathbb{R}^{d}_{\bm{0}},
\]

for some measurable function $f_{\bm{S}}:\mathbb{R}^{d}_{\bm{0}}\to[0,1]$ and constants $K_{T}\in[0,\infty)$, $p\in(0,\infty)$ and $c\in(0,1]$.

Remark 5.3.

(a) Condition (25) in Theorem 5.1 suggests that the Lévy measure of the process $\bm{X}$, which is in the domain of attraction of a stable process $\bm{Z}$, possesses a decomposition of the type $\nu_{\bm{X}}=\nu_{\bm{X}}^{\mathsf{c}}+\nu_{\bm{X}}^{\mathrm{d}}$. Since Assumption (T) stipulates the regularity of the density $f_{\bm{S}}$ of $\nu_{\bm{X}}^{\mathsf{c}}$ with respect to $\nu_{\bm{Z}}$, it may be interpreted as specifying the rate of convergence in the limit in (25) of Theorem 5.1.

(b) Assumption (T) implies $\int_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}|\bm{w}|^{q}\,\nu_{\bm{X}}^{\mathsf{c}}(\mathrm{d}\bm{w})\leqslant(1+K_{T}c^{-1})\int_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}|\bm{w}|^{q}\,\nu_{\bm{Z}}(\mathrm{d}\bm{w})<\infty$ for all $q\in(0,\alpha)$. Hence, by [40, Thm 25.3], the component $\bm{S}$ of $\bm{X}$ with Lévy measure $\nu^{\mathsf{c}}_{\bm{X}}$ has as many moments as the limit $\bm{Z}$. Note that this is not a restriction on $\bm{X}$ since $\nu_{\bm{X}}^{\mathrm{d}}$ may contain all the mass of $\nu_{\bm{X}}$ outside of some neighborhood of $\bm{0}$. ∎

Remark 5.4.

Under Assumption (T), we may decompose the process $\bm{X}$ as the sum $\bm{S}+\bm{R}$ of independent Lévy processes $\bm{S}$ and $\bm{R}$ with generating triplets $(\bm{\gamma_{S}},\bm{0},\nu_{\bm{X}}^{\mathsf{c}})$ and $(\bm{\gamma_{R}},\bm{0},\nu_{\bm{X}}^{\mathrm{d}})$, respectively, such that, when $\alpha\in(0,1)$, both processes have zero natural drift (note that for $\alpha\in(0,1)$, Assumption (T) and Theorem 5.1 imply that $\bm{X}$ has zero natural drift), and when $\alpha\in(1,2)$, $\bm{R}$ has zero natural drift. For $t\in(0,1]$, let $\bm{S}^{t}=(\bm{S}_{st}/t^{1/\alpha})_{s\in[0,1]}$ and $\bm{R}^{t}=(\bm{R}_{st}/t^{1/\alpha})_{s\in[0,1]}$ and note that $\bm{X}^{t}$ has the same law as $\bm{S}^{t}+\bm{R}^{t}$. We couple $\bm{S}^{t}$ and $\bm{Z}$ via the coupling $(\bm{D}^{\bm{S}^{t},\kappa},\bm{D}^{\bm{Z},\kappa},\bm{J}^{\bm{S}^{t},\kappa},\bm{J}^{\bm{Z},\kappa})$ given in (5) and (6) of Subsection 4.1. ∎

Theorem 5.5.

Let $\alpha\in(0,2)\setminus\{1\}$ and Assumption (T) hold for some $p>0$. Then, for any $q\in(0,1]$ with $\int_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}|\bm{w}|^{q}\,\nu_{\bm{X}}^{\mathrm{d}}(\mathrm{d}\bm{w})<\infty$, we have $\mathds{E}\big[\sup_{s\in[0,1]}|\bm{R}_{st}/t^{1/\alpha}|^{q}\big]=\mathcal{O}\big(t^{1-q/\alpha}\big)$ as $t\downarrow 0$. Moreover, for any $q\in(0,\alpha)\cap(0,1]$, we let $\kappa(t)\coloneqq t^{r}$ for $t\in(0,1]$ and some $r\geqslant-1/\alpha$. Then, as $t\downarrow 0$, we have

\begin{align}
\mathcal{W}_{q}\big(\bm{J}^{\bm{S}^{t},\kappa(t)},\bm{J}^{\bm{Z},\kappa(t)}\big)&=\begin{dcases}\mathcal{O}\big(t^{1-q/\alpha}\big(1+\log(1/t)\mathds{1}_{\{p+q=\alpha,\,r\neq-1/\alpha\}}\big)\big),&\text{for }p+q\geqslant\alpha,\\ \mathcal{O}\big(t^{p/\alpha+r(p+q-\alpha)}\big),&\text{for }p+q<\alpha,\end{dcases} \tag{28}\\
\mathcal{W}_{2}\big(\bm{D}^{\bm{S}^{t},\kappa(t)},\bm{D}^{\bm{Z},\kappa(t)}\big)&=\mathcal{O}\big(t^{p/\alpha+r(p-\alpha+2)}\big), \tag{29}\\
|\bm{\gamma}_{\bm{S}^{t},\kappa(t)}-\bm{\gamma}_{\bm{Z},\kappa(t)}|&=\begin{dcases}\mathcal{O}\big(t^{1-1/\alpha}(1+\mathds{1}_{\{p+1=\alpha,\,r\neq-1/\alpha\}}\log(1/t))\big),&p\geqslant\alpha-1>0,\\ \mathcal{O}(t^{p/\alpha+r(p-\alpha+1)}),&p<\alpha-1\text{ or }\alpha\in(0,1).\end{dcases} \tag{30}
\end{align}
Remark 5.6.

By (4) and (22), we have

\[
\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})\leqslant\mathds{E}\bigg[\sup_{s\in[0,1]}|\bm{R}_{s}^{t}|^{q}\bigg]+\mathcal{W}_{q}\big(\bm{J}^{\bm{S}^{t},\kappa(t)},\bm{J}^{\bm{Z},\kappa(t)}\big)+\mathcal{W}_{2}\big(\bm{D}^{\bm{S}^{t},\kappa(t)},\bm{D}^{\bm{Z},\kappa(t)}\big)^{q}+|\bm{\gamma}_{\bm{S}^{t},\kappa(t)}-\bm{\gamma}_{\bm{Z},\kappa(t)}|^{q}.
\]

A careful case-by-case analysis reveals that the upper bound implied by Theorem 5.5 on the distance above (which decreases as fast as the slowest of the terms on the right) decreases the fastest when $r$ is chosen as follows (recall $\alpha\in(0,2)\setminus\{1\}$):

\[
r=\begin{dcases}0,&\alpha>1,\\ \frac{p}{\alpha(\alpha-p)},&\alpha<1,\,\alpha>p+q,\\ \frac{\alpha-q(p+1)}{\alpha q(p+1-\alpha)},&\alpha<1,\,\alpha\leqslant p+q.\end{dcases}
\]

Moreover, in that case, we have

\[
\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})=\begin{dcases}\mathcal{O}\big(t^{\min\{\alpha/q-1,\,p,\,\alpha-1\}q/\alpha}\big(1+|\log t|\mathds{1}_{\{p+q=\alpha\}}+|\log t|^{q}\mathds{1}_{\{p+1=\alpha\}}\big)\big),&\alpha>1,\\ \mathcal{O}\big(t^{\min\{1-q/\alpha,\,pq/(\alpha(\alpha-p))\}}\big),&\alpha<1,\,\alpha>p+q,\\ \mathcal{O}\big(t^{1-q/\alpha}\big(1+|\log t|\mathds{1}_{\{p+q=\alpha\}}\big)\big),&\alpha<1,\,\alpha\leqslant p+q.\end{dcases}
\]

Since the above bounds are not easily interpretable because of the multiple cases depending on the parameters $(\alpha,p,q)$, we present only the case $p=1$ in Theorem 2.1 above. In particular, this removes the possibility of a logarithmic term appearing in the upper bound. ∎
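To make the case distinction in Remark 5.6 easier to explore, the following sketch simply transcribes the stated choice of $r$ and the polynomial exponent of the resulting bound, ignoring the logarithmic factors. The function names and the input validation are illustrative assumptions and do not add anything to the remark itself.

```python
def _check_parameters(alpha, p, q):
    """Validate the parameter ranges used in Theorem 5.5."""
    if not (0 < alpha < 2) or alpha == 1:
        raise ValueError("alpha must lie in (0, 2) and differ from 1")
    if p <= 0:
        raise ValueError("p must be positive")
    if not (0 < q <= 1 and q < alpha):
        raise ValueError("q must lie in (0, alpha) and in (0, 1]")


def optimal_r(alpha, p, q):
    """Exponent r in kappa(t) = t**r, as listed in Remark 5.6."""
    _check_parameters(alpha, p, q)
    if alpha > 1:
        return 0.0
    if alpha > p + q:
        return p / (alpha * (alpha - p))
    return (alpha - q * (p + 1)) / (alpha * q * (p + 1 - alpha))


def rate_exponent(alpha, p, q):
    """Polynomial exponent e such that the bound in Remark 5.6 reads
    O(t**e), up to the logarithmic factors in the boundary cases."""
    _check_parameters(alpha, p, q)
    if alpha > 1:
        return min(alpha / q - 1, p, alpha - 1) * q / alpha
    if alpha > p + q:
        return min(1 - q / alpha, p * q / (alpha * (alpha - p)))
    return 1 - q / alpha
```

For example, with $\alpha=1.5$ and $p=q=1$ the transcription gives $r=0$ and exponent $1/3$.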

Proof of Theorem 5.5.

The bound on $\bm{R}^{t}$ follows directly from Lemma 5.2 with $\beta_{+}=0$ and the construction of $\bm{R}^{t}$.

We now consider the process $\bm{S}^{t}$. Define the measure

\[
\mu(A)\coloneqq c^{-1}\nu_{\bm{Z}}(A)=\frac{c_{\alpha}}{c}\int_{\mathbb{S}^{d-1}}\int_{0}^{\infty}\mathds{1}_{A}(x\bm{v})\frac{\mathrm{d}x}{x^{\alpha+1}}\,\sigma(\mathrm{d}\bm{v}),\qquad\text{ for }A\in\mathcal{B}(\mathbb{R}^{d}_{\bm{0}}),
\]

where $c$ (resp. $c_{\alpha}$) is as in Assumption (T) (resp. as in (23) of the definition of $\bm{Z}$). The Radon–Nikodym derivative $f_{\bm{S}^{t}}(\bm{w}):=\nu_{\bm{S}^{t}}(\mathrm{d}\bm{w})/\mu(\mathrm{d}\bm{w})$ equals $f_{\bm{S}}(t^{1/\alpha}\bm{w})$ on the support of $\mu$, since the Lévy–Khintchine exponent satisfies $t\psi_{\bm{S}}(\bm{u}/t^{1/\alpha})=\psi_{\bm{S}^{t}}(\bm{u})$ and hence

\[
\nu_{\bm{S}^{t}}(\mathrm{d}\bm{w})=t^{1+d/\alpha}\nu_{\bm{X}}^{\mathsf{c}}(\mathrm{d}(t^{1/\alpha}\bm{w}))=t^{1+d/\alpha}f_{\bm{S}}(t^{1/\alpha}\bm{w})\mu(\mathrm{d}(t^{1/\alpha}\bm{w}))=f_{\bm{S}}(t^{1/\alpha}\bm{w})\mu(\mathrm{d}\bm{w}).
\]

First we bound the large-jump component $\bm{J}^{\bm{S}^{t},\kappa(t)}-\bm{J}^{\bm{Z},\kappa(t)}$: inequality (9) of Proposition 4.2 yields

\[
\mathds{E}\bigg[\sup_{s\in[0,1]}|\bm{J}^{\bm{S}^{t},\kappa(t)}_{s}-\bm{J}^{\bm{Z},\kappa(t)}_{s}|^{q}\bigg]\leqslant\frac{c_{\alpha}}{c}\int_{\mathbb{S}^{d-1}}\int_{\kappa(t)}^{\infty}|f_{\bm{S}}(t^{1/\alpha}x\bm{v})-c|\,x^{q-\alpha-1}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v}).
\]

Recall that $f(x)\lesssim g(x)$ as $x\downarrow 0$ means that there exist constants $c_{0},x_{0}>0$ such that $f(x)\leqslant c_{0}g(x)$ for all $x\leqslant x_{0}$. Using Assumption (T), as $t\downarrow 0$, we obtain

\[
\mathds{E}\bigg[\sup_{s\in[0,1]}\big|\bm{J}^{\bm{S}^{t},\kappa(t)}_{s}-\bm{J}^{\bm{Z},\kappa(t)}_{s}\big|^{q}\bigg]\lesssim\int_{t^{r}}^{t^{-1/\alpha}}t^{p/\alpha}x^{p+q-\alpha-1}\mathrm{d}x+\int_{t^{-1/\alpha}}^{\infty}x^{q-\alpha-1}\mathrm{d}x,
\]

where $\int_{t^{-1/\alpha}}^{\infty}x^{q-\alpha-1}\mathrm{d}x=\mathcal{O}(t^{1-q/\alpha})$. Next, as $t\downarrow 0$, we note that

\[
\int_{t^{r}}^{t^{-1/\alpha}}t^{p/\alpha}x^{p+q-\alpha-1}\mathrm{d}x=\begin{dcases}\mathcal{O}\big(t^{1-q/\alpha}\big),&\text{for }p+q>\alpha,\\ \mathcal{O}\big(t^{1-q/\alpha}\big(1+\log(1/t)\mathds{1}_{\{r\neq-1/\alpha\}}\big)\big),&\text{for }p+q=\alpha,\\ \mathcal{O}\big(t^{p/\alpha+r(p+q-\alpha)}\big),&\text{for }p+q<\alpha.\end{dcases}
\]

Thus, since $r\geqslant-1/\alpha$, altogether we have, as $t\downarrow 0$,

\[
\mathds{E}\bigg[\sup_{s\in[0,1]}\big|\bm{J}^{\bm{S}^{t},\kappa(t)}_{s}-\bm{J}^{\bm{Z},\kappa(t)}_{s}\big|^{q}\bigg]=\begin{dcases}\mathcal{O}\big(t^{1-q/\alpha}\big(1+\log(1/t)\mathds{1}_{\{p+q=\alpha,\,r\neq-1/\alpha\}}\big)\big),&\text{for }p+q\geqslant\alpha,\\ \mathcal{O}\big(t^{p/\alpha+r(p+q-\alpha)}\big),&\text{for }p+q<\alpha.\end{dcases}
\]
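The three regimes of the integral $\int_{t^{r}}^{t^{-1/\alpha}}t^{p/\alpha}x^{p+q-\alpha-1}\mathrm{d}x$ can be checked against its elementary antiderivative. The sketch below evaluates the closed form and estimates the polynomial exponent from two small values of $t$. The evaluation points `t1` and `t2` are arbitrary illustrative choices, and the slope estimate is only approximate.

```python
import math


def middle_integral(t, alpha, p, q, r):
    """Closed form of the integral of t**(p/alpha) * x**(p+q-alpha-1)
    over x in [t**r, t**(-1/alpha)], assuming r >= -1/alpha."""
    s = p + q - alpha
    lo, hi = t ** r, t ** (-1.0 / alpha)
    if s == 0:
        # Boundary case p + q = alpha: the integrand is t**(p/alpha) / x.
        return t ** (p / alpha) * math.log(hi / lo)
    return t ** (p / alpha) * (hi ** s - lo ** s) / s


def empirical_exponent(alpha, p, q, r, t1=1e-8, t2=1e-10):
    """Estimate e in integral ~ t**e from the log-log slope between
    two small values of t (a rough numerical check only)."""
    i1 = middle_integral(t1, alpha, p, q, r)
    i2 = middle_integral(t2, alpha, p, q, r)
    return math.log(i1 / i2) / math.log(t1 / t2)
```

With $r=0$, the estimate should be close to $1-q/\alpha$ when $p+q>\alpha$ and close to $p/\alpha+r(p+q-\alpha)=p/\alpha$ when $p+q<\alpha$.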

Next, we find the rate for the small-jump component $\bm{D}^{\bm{S}^{t},\kappa(t)}-\bm{D}^{\bm{Z},\kappa(t)}$. Assumption (T) and (7) of Proposition 4.1 imply that, as $t\downarrow 0$,

\begin{align*}
\mathds{E}\bigg[\sup_{s\in[0,1]}\big|\bm{D}^{\bm{S}^{t},\kappa(t)}_{s}-\bm{D}^{\bm{Z},\kappa(t)}_{s}\big|^{2}\bigg]&\leqslant 4\frac{c_{\alpha}}{c}\int_{\mathbb{S}^{d-1}}\int_{0}^{\kappa(t)}|f_{\bm{S}}(t^{1/\alpha}x\bm{v})-c|\,x^{1-\alpha}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\\
&\leqslant 4K_{T}\frac{c_{\alpha}}{c}\int_{0}^{\kappa(t)}t^{p/\alpha}x^{p-\alpha+1}\mathrm{d}x=\mathcal{O}\big(t^{p/\alpha+r(p-\alpha+2)}\big).
\end{align*}

Next, we control the difference $|\bm{\gamma}_{\bm{S}^{t},\kappa(t)}-\bm{\gamma}_{\bm{Z},\kappa(t)}|$ of the drift terms. First, consider the infinite variation case $\alpha\in(1,2)$. Since $\bm{Z}$ has zero mean, representation (3) implies

\begin{align*}
\bm{\gamma}_{\bm{S}^{t},\kappa(t)}&=t^{1-1/\alpha}\mathds{E}[\bm{S}_{1}]-\int_{\mathbb{R}^{d}}\bm{w}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa(t))}(\bm{w})f_{\bm{S}}(t^{1/\alpha}\bm{w})\,\mu(\mathrm{d}\bm{w}),\quad\text{ and }\\
\bm{\gamma}_{\bm{Z},\kappa(t)}&=-\int_{\mathbb{R}^{d}}\bm{w}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa(t))}(\bm{w})\,c\,\mu(\mathrm{d}\bm{w}).
\end{align*}

Thus, we obtain

\[
\bm{\gamma}_{\bm{S}^{t},\kappa(t)}-\bm{\gamma}_{\bm{Z},\kappa(t)}=t^{1-1/\alpha}\mathds{E}[\bm{S}_{1}]-\int_{\mathbb{R}^{d}}\bm{w}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa(t))}(\bm{w})(f_{\bm{S}}(t^{1/\alpha}\bm{w})-c)\,\mu(\mathrm{d}\bm{w}). \tag{31}
\]

By Assumption (T), the integral in the display satisfies

\begin{align*}
&\bigg|\int_{\mathbb{R}^{d}}\bm{w}\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa(t))}(\bm{w})(f_{\bm{S}}(t^{1/\alpha}\bm{w})-c)\,\mu(\mathrm{d}\bm{w})\bigg|\\
&\quad\leqslant\int_{\mathbb{R}^{d}}|\bm{w}|\mathds{1}_{\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa(t))}(\bm{w})|f_{\bm{S}}(t^{1/\alpha}\bm{w})-c|\,\mu(\mathrm{d}\bm{w})=\frac{c_{\alpha}}{c}\int_{\mathbb{S}^{d-1}}\int_{\kappa(t)}^{\infty}x^{-\alpha}|f_{\bm{S}}(t^{1/\alpha}x\bm{v})-c|\,\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\\
&\quad\leqslant K_{T}\frac{c_{\alpha}}{c}\bigg(t^{p/\alpha}\int_{\kappa(t)}^{t^{-1/\alpha}}x^{p-\alpha}\mathrm{d}x+\int_{t^{-1/\alpha}}^{\infty}x^{-\alpha}\mathrm{d}x\bigg)=\begin{dcases}\mathcal{O}(t^{1-1/\alpha}),&p>\alpha-1,\\ \mathcal{O}(t^{1-1/\alpha}(1+\log(1/t)\mathds{1}_{\{r\neq-1/\alpha\}})),&p=\alpha-1,\\ \mathcal{O}(t^{p/\alpha+r(p-\alpha+1)}),&p<\alpha-1,\end{dcases}
\end{align*}

where we used the fact that $r\geqslant-1/\alpha$. By (31), we obtain

\[
|\bm{\gamma}_{\bm{S}^{t},\kappa(t)}-\bm{\gamma}_{\bm{Z},\kappa(t)}|=\begin{dcases}\mathcal{O}\big(t^{1-1/\alpha}(1+\mathds{1}_{\{p+1=\alpha,\,r\neq-1/\alpha\}}\log(1/t))\big),&p\geqslant\alpha-1,\\ \mathcal{O}(t^{p/\alpha+r(p-\alpha+1)}),&p<\alpha-1.\end{dcases}
\]

In the finite variation case $\alpha\in(0,1)$, recall that $\bm{S}$ and $\bm{Z}$ have zero natural drift, so that

\[
\bm{\gamma}_{\bm{S}^{t},\kappa(t)}=\int_{\mathbb{R}^{d}}\bm{w}\mathds{1}_{B_{\bm{0}}(\kappa(t))}(\bm{w})f_{\bm{S}}(t^{1/\alpha}\bm{w})\,\mu(\mathrm{d}\bm{w}),\quad\bm{\gamma}_{\bm{Z},\kappa(t)}=\int_{\mathbb{R}^{d}}\bm{w}\mathds{1}_{B_{\bm{0}}(\kappa(t))}(\bm{w})\,c\,\mu(\mathrm{d}\bm{w}).
\]

Thus, by Assumption (T), we have

\begin{align*}
|\bm{\gamma}_{\bm{S}^{t},\kappa(t)}-\bm{\gamma}_{\bm{Z},\kappa(t)}|&\leqslant\int_{\mathbb{R}^{d}}|\bm{w}|\mathds{1}_{B_{\bm{0}}(\kappa(t))}(\bm{w})|f_{\bm{S}}(t^{1/\alpha}\bm{w})-c|\,\mu(\mathrm{d}\bm{w})\\
&\leqslant K_{T}\frac{c_{\alpha}}{c}\int_{0}^{\kappa(t)}t^{p/\alpha}x^{p-\alpha}\mathrm{d}x=\mathcal{O}(t^{p/\alpha+r(p-\alpha+1)}).\qed
\end{align*}

5.3. Domain of non-normal attraction: the comonotonic coupling

Let $\bm{Z}$ be an $\alpha$-stable process on $\mathbb{R}^{d}$ for some $\alpha\in(0,2)$, defined as in Section 5.1, with “intensity” parameter $c_{\alpha}$, probability measure $\sigma$ on $\mathcal{B}(\mathbb{S}^{d-1})$ and Lévy measure $\nu_{\bm{Z}}$ in (23). Define the measure $\rho_{\bm{Z}}(\mathrm{d}x,\bm{v})\coloneqq c_{\alpha}x^{-\alpha-1}\mathrm{d}x$ on $\mathcal{B}((0,\infty))$ and note that the right inverse of its tail $x\mapsto\rho_{\bm{Z}}([x,\infty),\bm{v})$ is given by $\rho_{\bm{Z}}^{\leftarrow}(x,\bm{v})=(c_{\alpha}/\alpha)^{1/\alpha}x^{-1/\alpha}$ for all $x>0$ and $\bm{v}\in\mathbb{S}^{d-1}$. The comonotonic coupling of $\bm{Z}$ and a Lévy process $\bm{X}$ requires the following assumption on the generating triplet $(\bm{\gamma_{X}},\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\top},\nu_{\bm{X}})$ of $\bm{X}$.

Assumption (C).

$\bm{X}$ has no Gaussian component, i.e. $\bm{\Sigma_{X}}=\bm{0}$, and $\nu_{\bm{X}}=\nu_{\bm{X}}^{\mathsf{c}}+\nu_{\bm{X}}^{\mathrm{d}}$, where the measure $\nu_{\bm{X}}^{\mathrm{d}}$ is arbitrary with finite mass $\nu_{\bm{X}}^{\mathrm{d}}(\mathbb{R}^{d}_{\bm{0}})<\infty$ and the Lévy measure $\nu_{\bm{X}}^{\mathsf{c}}$ can be expressed as

\[
\nu_{\bm{X}}^{\mathsf{c}}(B)=\int_{\mathbb{S}^{d-1}}\int_{0}^{\infty}\mathds{1}_{B}(x\bm{v})\rho_{\bm{X}}^{\mathsf{c}}(\mathrm{d}x,\bm{v})\sigma(\mathrm{d}\bm{v})\quad\&\quad\rho_{\bm{X}}^{\mathsf{c}}([x,\infty),\bm{v})=\frac{c_{\alpha}}{\alpha}(1+h(x,\bm{v}))H(x)^{\alpha}x^{-\alpha}\tag{32}
\]

for all $B\in\mathcal{B}(\mathbb{R}^{d}_{\bm{0}})$, $x>0$, $\bm{v}\in\mathbb{S}^{d-1}$ and some monotonic function $H:(0,\infty)\to(0,\infty)$, slowly varying at $0$, and a measurable $h:(0,\infty)\times\mathbb{S}^{d-1}\to[-1,\infty)$. Assume that the functions $H$, $h$ and

\[
G(x)\coloneqq\int_{\mathbb{S}^{d-1}}H(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{u}))\sigma(\mathrm{d}\bm{u}),\qquad x>0,\tag{33}
\]

where $\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(\cdot,\bm{v})$ is the right inverse of $x\mapsto\rho_{\bm{X}}^{\mathsf{c}}([x,\infty),\bm{v})$, satisfy

\[
|h(x,\bm{v})|\leqslant K_{h}(1\wedge x^{p})\quad\&\quad\left|H(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))/G(x)-1\right|\leqslant K_{Q}(1\wedge x^{-\delta})\quad\text{for all }x>0,\ \bm{v}\in\mathbb{S}^{d-1}\tag{34}
\]

and some constants $p,\delta>0$ and $K_{h},K_{Q}\geqslant 0$.

Under Assumption (C), $G$ is monotonic and slowly varying at $\infty$. In fact, for any $\bm{v}\in\mathbb{S}^{d-1}$, we have $G(x)\sim H(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))$ as $x\to\infty$ by virtue of (34), which is slowly varying by [6, Prop. 1.5.7(ii)] (since $x\mapsto\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v})$ is regularly varying and $H$ is slowly varying). Note also that $H$ may be either non-increasing or non-decreasing.
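As a numerical illustration of the right inverse $\rho_{\bm{Z}}^{\leftarrow}$ of the stable radial tail introduced before Assumption (C), the following Python sketch compares the closed form $(c_{\alpha}/\alpha)^{1/\alpha}x^{-1/\alpha}$ with an inverse obtained by bisection. The function names, the bracket used in the bisection and the parameter values are hypothetical and serve only as an illustration; for a continuous, strictly decreasing tail the right inverse coincides with the ordinary inverse, which is the case treated here.

```python
import math


def stable_tail(x, c_alpha, alpha):
    """Radial tail x -> rho_Z([x, inf), v) = (c_alpha / alpha) * x**(-alpha)."""
    return (c_alpha / alpha) * x ** (-alpha)


def stable_right_inverse(y, c_alpha, alpha):
    """Closed form (c_alpha / alpha)**(1/alpha) * y**(-1/alpha) from the text."""
    return (c_alpha / alpha) ** (1.0 / alpha) * y ** (-1.0 / alpha)


def bisect_inverse(tail, y, lo=1e-12, hi=1e12, iters=200):
    """Solve tail(x) = y for a continuous, strictly decreasing tail.

    Bisection is performed on a logarithmic scale (geometric midpoints).
    The bracket [lo, hi] is an illustrative assumption.
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if tail(mid) > y:
            lo = mid  # tail still too large: the solution lies to the right
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Hypothetical parameters, chosen only for illustration.
C_ALPHA, ALPHA = 2.0, 1.5
for level in (0.1, 1.0, 10.0):
    closed = stable_right_inverse(level, C_ALPHA, ALPHA)
    numeric = bisect_inverse(lambda x: stable_tail(x, C_ALPHA, ALPHA), level)
    print(level, closed, numeric)
```

The same bisection routine could in principle be applied to a tail of the form (32) once concrete choices of $H$ and $h$ are fixed, in which case no closed form is available in general.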

Remark 5.7.

Condition (25) in Theorem 5.1 states that the Lévy measure of the process $\bm{X}$ in the domain of attraction of a stable process $\bm{Z}$ behaves as the Lévy measure of $\bm{Z}$ in every half-space of the form $\mathscr{L}_{\bm{v}}(c)$ for every $\bm{v}\in\mathbb{S}^{d-1}$ and small $c>0$. Assumptions (S) and (C) may thus be interpreted as a refinement of this condition, requiring the Lévy measure of $\bm{X}$ to satisfy an analogue of (25) on every ray directed by $\bm{v}\in\mathbb{S}^{d-1}$ and quantifying how fast such a limit holds. In particular, under Assumptions (S) and (C) and for $G$ defined in (33), we let $g(t):=t^{1/\alpha}G(1/t)$ for $t\in(0,1]$. For such a $g$ it follows that $(\bm{X}_{st}/g(t))_{s\in[0,1]}\xrightarrow{d}(\bm{Z}_{s})_{s\in[0,1]}$, i.e. $\bm{X}$ is in the small-time domain of non-normal attraction of $\bm{Z}$ (by Theorem 5.1 above). ∎

Remark 5.8.

We decompose the process $\bm{X}$ as the sum $\bm{S}+\bm{R}$ of the independent Lévy processes $\bm{S}$ and $\bm{R}$ with generating triplets $(\bm{\gamma_{S}},\bm{0},\nu_{\bm{X}}^{\mathsf{c}})$ and $(\bm{\gamma_{R}},\bm{0},\nu_{\bm{X}}^{\mathrm{d}})$, respectively, such that both processes have zero natural drift when $\alpha\in(0,1)$, and $\bm{R}$ has zero natural drift when $\alpha\in(1,2)$. Let the processes $(\bm{M}^{\bm{S}^{t}},\bm{M^{Z}},\bm{L}^{\bm{S}^{t}},\bm{L^{Z}})$ be coupled as in (14) and (15) from Subsection 4.2, where $\bm{S}^{t}=(\bm{S}_{st}/g(t))_{s\in[0,1]}$ and $\bm{R}^{t}=(\bm{R}_{st}/g(t))_{s\in[0,1]}$ for $t\in(0,1]$. Note that $\bm{X}^{t}$ has the same law as $\bm{S}^{t}+\bm{R}^{t}$ for $t\in(0,1]$ and that, under Assumption (C), $\bm{S}$ has a finite $q$-moment for every $q\in(0,\alpha)$ by (32). ∎

Theorem 5.9.

Let $\alpha\in(0,2)\setminus\{1\}$, let $\bm{X}$ and $\bm{Z}$ be as above and let Assumptions (S) & (C) hold. Then, for every $q\in(0,1]$ with $\int_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}|\bm{w}|^{q}\nu_{\bm{X}}^{\mathrm{d}}(\mathrm{d}\bm{w})<\infty$, we have $\mathds{E}\big[\sup_{s\in[0,1]}|\bm{R}^{t}_{s}|^{q}\big]=\mathcal{O}\big(t^{1-q/\alpha}G(1/t)^{-q}\big)$ as $t\downarrow 0$ and, if $p\neq\alpha-1$ and $q\in(0,\alpha)\cap(0,1]$ satisfy $q\notin\{\alpha/(p+1),\alpha/(\alpha\delta+1)\}$, we have, as $t\downarrow 0$,

\begin{align}
\mathcal{W}_{2}\big(\bm{M}^{\bm{S}^{t}},\bm{M^{Z}}\big)&=\mathcal{O}\big(G_{2}(t)+(1+G(1/t)^{p})t^{\min\{p/\alpha,\delta\}}\big),\tag{35}\\
\mathcal{W}_{q}\big(\bm{L}^{\bm{S}^{t}},\bm{L}^{\bm{Z}}\big)&=\mathcal{O}\big(G_{2}(t)^{q}+(1+G(1/t)^{pq})(1+G_{1}(t))^{q(1+p)}t^{\min\{pq/\alpha,q\delta,1-q/\alpha\}}\big),\qquad\text{and}\tag{36}\\
|\bm{\varpi}_{\bm{S}^{t}}-\bm{\varpi}_{\bm{Z}}|&=\begin{dcases}\mathcal{O}\big(G_{2}(t)+t^{1-1/\alpha}G(1/t)^{-1}+(1+G_{1}(t))t^{\min\{1-1/\alpha,\delta\}}\\ \qquad+G(1/t)^{p}(1+G_{1}(t))^{1+p}t^{\min\{1-1/\alpha,p/\alpha\}}\big),&\alpha\in(1,2),\\ \mathcal{O}\big(G_{2}(t)+t^{p/\alpha}G(1/t)^{p}+t^{\delta}\big),&\alpha\in(0,1).\end{dcases}\tag{37}
\end{align}
Remark 5.10.

If $H\equiv 1$, then $G$ is constant and hence Theorem 5.9 is also applicable to the domain of normal attraction: set $G_{1}\equiv 1$, $G_{2}\equiv 0$ and $\delta$ arbitrarily large. Then, for $p\neq\alpha-1$ and $q\in(0,\alpha)\cap(0,1]$ with $\int_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}|\bm{w}|^{q}\nu_{\bm{X}}^{\mathrm{d}}(\mathrm{d}\bm{w})<\infty$ and $q\neq\alpha/(p+1)$, we have, as $t\downarrow 0$,

\[
\begin{aligned}
\mathcal{W}_{2}\big(\bm{M}^{\bm{S}^{t}},\bm{M^{Z}}\big)&=\mathcal{O}\big(t^{p/\alpha}\big),\quad\mathcal{W}_{q}\big(\bm{L}^{\bm{S}^{t}},\bm{L^{Z}}\big)=\mathcal{O}\big(t^{\min\{pq/\alpha,1-q/\alpha\}}\big),\quad\text{and}\\
|\bm{\varpi}_{\bm{S}^{t}}-\bm{\varpi}_{\bm{Z}}|&=\begin{dcases}\mathcal{O}\big(t^{\min\{1-1/\alpha,p/\alpha\}}\big),&\alpha\in(1,2),\\ \mathcal{O}(t^{p/\alpha}),&\alpha\in(0,1).\end{dcases}
\end{aligned}
\]

Note that, when $q(p+1)>\alpha$, these rates match the ones in Theorem 5.5, established under more general conditions using the thinning coupling. ∎
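The exponents in Remark 5.10 can be tabulated mechanically. The following Python sketch simply evaluates the three exponents of $t$ stated in the remark for given $\alpha$, $p$ and $q$; the function name and the dictionary keys are hypothetical, and the input checks only mirror the parameter restrictions stated in the remark.

```python
import math


def normal_attraction_exponents(alpha, p, q):
    """Exponents of t in the three bounds of Remark 5.10 (case H = 1).

    Returns a dict with the exponents for W_2(M^{S^t}, M^Z),
    W_q(L^{S^t}, L^Z) and |varpi_{S^t} - varpi_Z|.
    """
    if not 0 < alpha < 2 or alpha == 1:
        raise ValueError("alpha must lie in (0, 2) without 1")
    if p <= 0 or not (0 < q <= 1 and q < alpha):
        raise ValueError("need p > 0 and q in (0, alpha) intersected with (0, 1]")
    if math.isclose(p, alpha - 1) or math.isclose(q, alpha / (p + 1)):
        raise ValueError("boundary cases p = alpha - 1 and q = alpha/(p+1) are excluded")
    # The drift exponent depends on whether alpha is above or below 1.
    drift = min(1 - 1 / alpha, p / alpha) if alpha > 1 else p / alpha
    return {
        "W2_M": p / alpha,
        "Wq_L": min(p * q / alpha, 1 - q / alpha),
        "drift": drift,
    }


# Hypothetical parameter choices, for illustration only.
print(normal_attraction_exponents(alpha=1.5, p=1.0, q=1.0))
print(normal_attraction_exponents(alpha=0.5, p=1.0, q=0.4))
```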

Lemma 5.11.

Under Assumption (C), there exist a function $\widetilde{h}:(0,\infty)\times\mathbb{S}^{d-1}\to(-1,\infty)$ and a constant $K_{\widetilde{h}}\geqslant 0$ such that, for all $x>0$ and $\bm{v}\in\mathbb{S}^{d-1}$,

\[
\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v})=\left(c_{\alpha}/\alpha\right)^{1/\alpha}x^{-1/\alpha}G(x)(1+\widetilde{h}(x,\bm{v}))\quad\text{and}\quad|\widetilde{h}(x,\bm{v})|\leqslant K_{\widetilde{h}}\big(1\wedge(x^{-p/\alpha}G(x)^{p}+x^{-\delta})\big).\tag{38}
\]
Proof.

Note that $\rho_{\bm{X}}^{\mathsf{c}}([\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}),\infty),\bm{v})=x$ for all $\bm{v}\in\mathbb{S}^{d-1}$ and $x>0$. Hence, for all $\bm{v}\in\mathbb{S}^{d-1}$ and $x>0$, $x=\tfrac{c_{\alpha}}{\alpha}(1+h(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}),\bm{v}))H(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))^{\alpha}\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v})^{-\alpha}$, implying that

\[
\begin{aligned}
\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v})&=\bigg(\frac{c_{\alpha}}{\alpha}\bigg)^{1/\alpha}x^{-1/\alpha}H(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))\big(1+h(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))\big)^{1/\alpha}\\
&=\bigg(\frac{c_{\alpha}}{\alpha}\bigg)^{1/\alpha}x^{-1/\alpha}G(x)\frac{H(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))}{G(x)}\big(1+h(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))\big)^{1/\alpha}.
\end{aligned}
\]

Thus, the first part of (38) holds with $\widetilde{h}(x,\bm{v})\coloneqq(H(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))/G(x))\big(1+h(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))\big)^{1/\alpha}-1\in(-1,\infty)$.

Suppose now that (34) in Assumption (C) holds for some $p,\delta>0$. Since $h$ is bounded by $K_{h}$, (34) yields $\widetilde{h}(x,\bm{v})\leqslant(K_{Q}+1)(1+K_{h})^{1/\alpha}-1$ for all $x>0$ and $\bm{v}\in\mathbb{S}^{d-1}$. Moreover, the elementary inequality $|(1+x)^{r}-1|\leqslant|x|$, valid for any $r\in[0,1]$ and $x\geqslant-1$, and the triangle inequality yield $|(1+y)(1+x)^{r}-1|\leqslant|x|+|y|(1+x)^{r}$, which implies, for all $x>0$ and $\bm{v}\in\mathbb{S}^{d-1}$, that

\[
\begin{aligned}
|\widetilde{h}(x,\bm{v})|&\leqslant|h(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))|+\bigg|\frac{H(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))}{G(x)}-1\bigg|(1+K_{h})^{1/\alpha}\\
&\leqslant K_{h}\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v})^{p}+K_{Q}(1+K_{h})^{1/\alpha}x^{-\delta}\\
&\leqslant K_{h}(c_{\alpha}/\alpha)^{p/\alpha}x^{-p/\alpha}G(x)^{p}\left(\frac{H(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))}{G(x)}\right)^{p}\big(1+h(\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x,\bm{v}))\big)^{p/\alpha}+K_{Q}(1+K_{h})^{1/\alpha}x^{-\delta}\\
&\leqslant K_{h}(1+K_{h})^{p/\alpha}(1+K_{Q})^{p}(c_{\alpha}/\alpha)^{p/\alpha}x^{-p/\alpha}G(x)^{p}+K_{Q}(1+K_{h})^{1/\alpha}x^{-\delta}.
\end{aligned}
\]

Choosing $K_{\widetilde{h}}\coloneqq\max\{(c_{\alpha}/\alpha)^{p/\alpha}K_{h}(1+K_{h})^{p/\alpha}(1+K_{Q})^{p},(K_{Q}+1)(1+K_{h})^{1/\alpha},1\}$ yields the second part of (38). ∎
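The elementary inequality $|(1+y)(1+x)^{r}-1|\leqslant|x|+|y|(1+x)^{r}$ used in the proof above can be sanity-checked numerically. The Python sketch below evaluates both sides on a uniform grid of $x,y\in[-1,5]$ and $r\in[0,1]$; the grid, the tolerance and the function names are hypothetical choices, and a grid check is of course only an illustration and not a proof.

```python
def lhs(x, y, r):
    """|(1 + y)(1 + x)^r - 1|."""
    return abs((1.0 + y) * (1.0 + x) ** r - 1.0)


def rhs(x, y, r):
    """|x| + |y| (1 + x)^r."""
    return abs(x) + abs(y) * (1.0 + x) ** r


def holds_on_grid(n=60, tol=1e-12):
    """Check lhs <= rhs for x, y in [-1, 5] and r in [0, 1] on a uniform grid."""
    xs = [-1.0 + 6.0 * i / n for i in range(n + 1)]
    rs = [j / 10.0 for j in range(11)]
    return all(
        lhs(x, y, r) <= rhs(x, y, r) + tol
        for x in xs
        for y in xs
        for r in rs
    )


print(holds_on_grid())
```

The check relies on the decomposition $(1+y)(1+x)^{r}-1=\big((1+x)^{r}-1\big)+y(1+x)^{r}$, which is exactly the step where the triangle inequality enters.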

Lemma 5.12.

Let $\alpha\in(0,2)\setminus\{1\}$ and $q\in(0,\alpha)\cap(0,1]$ satisfy $\int_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}|\bm{w}|^{q}\nu_{\bm{X}}(\mathrm{d}\bm{w})<\infty$, where $q\neq\alpha/(p+1)$ and $q\neq\alpha/(\alpha\delta+1)$. Then, under Assumptions (S) & (C), we have, as $t\downarrow 0$,

\[
\mathcal{W}_{q}\big(\bm{L}^{\bm{S}^{t}},\bm{L}^{\bm{Z}}\big)=\mathcal{O}\big(G_{2}(t)^{q}+(1+G_{1}(t))^{q}t^{\min\{1-q/\alpha,q\delta\}}+G(1/t)^{pq}(1+G_{1}(t))^{q(1+p)}t^{\min\{1-q/\alpha,pq/\alpha\}}\big).
\]
Proof.

By Lemma 5.11, for all $x>0$, $t\in(0,1]$ and $\bm{v}\in\mathbb{S}^{d-1}$, it holds that

\[
\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})=\rho_{\bm{X}}^{\mathsf{c}\leftarrow}(x/t,\bm{v})/g(t)=\bigg(\frac{c_{\alpha}}{\alpha}\bigg)^{1/\alpha}x^{-1/\alpha}\frac{G(x/t)}{G(1/t)}\big(1+\widetilde{h}(x/t,\bm{v})\big).\tag{39}
\]

Hence, Proposition 4.4 implies that

\[
\begin{aligned}
\mathcal{W}_{q}\big(\bm{L}^{\bm{S}^{t}},\bm{L}^{\bm{Z}}\big)&\leqslant\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}|\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Z}}^{\leftarrow}(x,\bm{v})|^{q}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\\
&=\bigg(\frac{c_{\alpha}}{\alpha}\bigg)^{q/\alpha}\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}\bigg|\frac{G(x/t)}{G(1/t)}\big(1+\widetilde{h}(x/t,\bm{v})\big)-1\bigg|^{q}x^{-q/\alpha}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\eqqcolon I(t).
\end{aligned}
\]

To bound $I(t)$, we use the triangle inequality and the fact that $x\mapsto x^{q}$ is concave (and hence subadditive on $[0,\infty)$) to obtain

\[
\bigg(\frac{\alpha}{c_{\alpha}}\bigg)^{q/\alpha}I(t)\leqslant\int_{0}^{\varepsilon}x^{-q/\alpha}\left|\frac{G(x/t)}{G(1/t)}-1\right|^{q}\mathrm{d}x+\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}x^{-q/\alpha}\left|\frac{G(x/t)}{G(1/t)}\widetilde{h}(x/t,\bm{v})\right|^{q}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v}).
\]

We consider each integral on its own. Assumption (S) implies that the first integral $I_{1}(t)$ in the display above is bounded by $G_{2}(t)^{q}\int_{0}^{\varepsilon}x^{-q/\alpha}G_{1}(x)^{q}\mathrm{d}x<\infty$. Next, we bound the second integral $I_{2}(t)$ in the display above. Assumption (S) and (34) yield, as $t\downarrow 0$,

\[
\begin{aligned}
I_{2}(t)&\lesssim\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}(1+G_{1}(x)G_{2}(t))^{q}x^{-q/\alpha}|\widetilde{h}(x/t,\bm{v})|^{q}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\\
&\lesssim\int_{0}^{t}(1+G_{1}(x))^{q}x^{-q/\alpha}\mathrm{d}x+t^{q\delta}\int_{t}^{\varepsilon}(1+G_{1}(x))^{q}x^{-q/\alpha-q\delta}\mathrm{d}x\\
&\qquad+t^{pq/\alpha}G(1/t)^{pq}\int_{t}^{\varepsilon}(1+G_{1}(x))^{q}x^{-q/\alpha-pq/\alpha}\frac{G(x/t)^{pq}}{G(1/t)^{pq}}\mathrm{d}x\\
&\lesssim(1+G_{1}(t))^{q}t^{\min\{1-q/\alpha,q\delta\}}+t^{pq/\alpha}G(1/t)^{pq}\int_{t}^{\varepsilon}(1+G_{1}(x))^{q(1+p)}x^{-q/\alpha-pq/\alpha}\mathrm{d}x\\
&=\mathcal{O}\big((1+G_{1}(t))^{q}t^{\min\{1-q/\alpha,q\delta\}}+G(1/t)^{pq}(1+G_{1}(t))^{q(1+p)}t^{\min\{1-q/\alpha,pq/\alpha\}}\big).\qed
\end{aligned}
\]
Lemma 5.13.

Let $\alpha\in(0,2)\setminus\{1\}$. Under Assumptions (S) & (C), we have

\[
\mathcal{W}_{2}\big(\bm{M}^{\bm{S}^{t}},\bm{M^{Z}}\big)=\mathcal{O}\big(G_{2}(t)+t^{p/\alpha}G(1/t)^{p}+t^{\delta}\big),\qquad\text{as }t\downarrow 0.\tag{40}
\]
Proof.

Proposition 4.3 together with (39) shows that

\[
\begin{aligned}
\mathcal{W}_{2}\big(\bm{M}^{\bm{S}^{t}},\bm{M^{Z}}\big)^{2}&\leqslant 4\int_{[\varepsilon,\infty)\times\mathbb{S}^{d-1}}(\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Z}}^{\leftarrow}(x,\bm{v}))^{2}\mathrm{d}x\otimes\sigma(\mathrm{d}\bm{v})\\
&=4\bigg(\frac{c_{\alpha}}{\alpha}\bigg)^{2/\alpha}\int_{\mathbb{S}^{d-1}}\int_{\varepsilon}^{\infty}\left(\frac{G(x/t)}{G(1/t)}\big(1+\widetilde{h}(x/t,\bm{v})\big)-1\right)^{2}x^{-2/\alpha}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\eqqcolon I(t).
\end{aligned}
\]

To bound $I(t)$, we use the elementary inequality $(x+y)^{2}\leqslant 2(x^{2}+y^{2})$, which implies

\[
\frac{1}{8}\bigg(\frac{\alpha}{c_{\alpha}}\bigg)^{2/\alpha}I(t)\leqslant\int_{\varepsilon}^{\infty}x^{-2/\alpha}\left(\frac{G(x/t)}{G(1/t)}-1\right)^{2}\mathrm{d}x+\int_{\mathbb{S}^{d-1}}\int_{\varepsilon}^{\infty}x^{-2/\alpha}\left(\frac{G(x/t)}{G(1/t)}\widetilde{h}(x/t,\bm{v})\right)^{2}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v}).
\]

By Assumption (S), the first integral $I_{1}(t)$ above is bounded by $G_{2}(t)^{2}\int_{\varepsilon}^{\infty}x^{-2/\alpha}G_{1}(x)^{2}\mathrm{d}x<\infty$. Assumption (S) and (38) imply, as $t\downarrow 0$, the following bound on the second integral $I_{2}(t)$:

\[
\begin{aligned}
I_{2}(t)&\lesssim t^{2p/\alpha}G(1/t)^{2p}\int_{\varepsilon}^{\infty}(1+G_{1}(x))^{2}\frac{G(x/t)^{2p}}{G(1/t)^{2p}}x^{-2/\alpha-2p/\alpha}\mathrm{d}x+t^{2\delta}\int_{\varepsilon}^{\infty}(1+G_{1}(x))^{2}x^{-2/\alpha-2\delta}\mathrm{d}x\\
&\lesssim t^{2p/\alpha}G(1/t)^{2p}\int_{\varepsilon}^{\infty}(1+G_{1}(x))^{2(1+p)}x^{-2/\alpha-2p/\alpha}\mathrm{d}x+t^{2\delta}=\mathcal{O}(t^{2p/\alpha}G(1/t)^{2p}+t^{2\delta}).\qed
\end{aligned}
\]

In the following lemma, we find at what rate the drifts converge.

Lemma 5.14.

(a) Let $\alpha\in(1,2)$ and $p,\delta>0$, where $p\neq\alpha-1$ and $\delta\neq(\alpha-1)/\alpha$. Then, under Assumptions (S) & (C), we have, as $t\downarrow 0$,

\[
\begin{aligned}
|\bm{\varpi}_{\bm{S}^{t}}-\bm{\varpi}_{\bm{Z}}|&=\mathcal{O}\big(G_{2}(t)+t^{1-1/\alpha}G(1/t)^{-1}+(1+G_{1}(t))t^{\min\{1-1/\alpha,\delta\}}\\
&\qquad+G(1/t)^{p}(1+G_{1}(t))^{1+p}t^{\min\{1-1/\alpha,p/\alpha\}}\big).
\end{aligned}
\]

(b) Let $\alpha\in(0,1)$. Then, under Assumptions (S) & (C), we have, as $t\downarrow 0$,

\[
|\bm{\varpi}_{\bm{S}^{t}}-\bm{\varpi}_{\bm{Z}}|=\mathcal{O}\big(G_{2}(t)+t^{p/\alpha}G(1/t)^{p}+t^{\delta}\big).
\]
Proof.

First, assume that $\alpha\in(1,2)$. The proof in this setting follows the steps of the proof of Lemma 5.12. Note that

\[
\begin{aligned}
\bm{\varpi}_{\bm{S}^{t}}&=t^{1-1/\alpha}G(1/t)^{-1}(\bm{\varpi}_{\bm{S}}-\mathds{E}[\bm{S}_{1}])-\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})\mathrm{d}x\,\sigma(\mathrm{d}\bm{v}),\\
\bm{\varpi}_{\bm{Z}}&=-\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}\rho_{\bm{Z}}^{\leftarrow}(x,\bm{v})\mathrm{d}x\,\sigma(\mathrm{d}\bm{v}),
\end{aligned}
\]

and since $\bm{S}$ has a finite first moment, it follows that

\[
\bm{\varpi}_{\bm{S}^{t}}-\bm{\varpi}_{\bm{Z}}=t^{1-1/\alpha}G(1/t)^{-1}(\bm{\varpi}_{\bm{S}}-\mathds{E}[\bm{S}_{1}])-\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}\big(\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Z}}^{\leftarrow}(x,\bm{v})\big)\mathrm{d}x\,\sigma(\mathrm{d}\bm{v}).
\]

Recall from (39) that $\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})=(c_{\alpha}/\alpha)^{1/\alpha}x^{-1/\alpha}\big(1+\widetilde{h}(x/t,\bm{v})\big)G(x/t)/G(1/t)$, which implies that

\[
\begin{aligned}
&\bigg|\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}\big(\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Z}}^{\leftarrow}(x,\bm{v})\big)\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\bigg|\leqslant\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}|\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Z}}^{\leftarrow}(x,\bm{v})|\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\\
&\qquad=\bigg(\frac{c_{\alpha}}{\alpha}\bigg)^{1/\alpha}\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}\bigg|\frac{G(x/t)}{G(1/t)}\big(1+\widetilde{h}(x/t,\bm{v})\big)-1\bigg|x^{-1/\alpha}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\eqqcolon I(t).
\end{aligned}
\]

The triangle inequality now implies that

\[
\bigg(\frac{\alpha}{c_{\alpha}}\bigg)^{1/\alpha}I(t)\leqslant\int_{0}^{\varepsilon}x^{-1/\alpha}\left|\frac{G(x/t)}{G(1/t)}-1\right|\mathrm{d}x+\int_{\mathbb{S}^{d-1}}\int_{0}^{\varepsilon}x^{-1/\alpha}\left|\frac{G(x/t)}{G(1/t)}\widetilde{h}(x/t,\bm{v})\right|\mathrm{d}x\,\sigma(\mathrm{d}\bm{v}).
\]

The two terms in the upper bound are denoted by $I_{1}(t)$ and $I_{2}(t)$. Following the calculations made in the proof of Lemma 5.12, we see by Assumption (S) and (38) that $I_{1}(t)$ is bounded by $G_{2}(t)\int_{0}^{\varepsilon}x^{-1/\alpha}G_{1}(x)\mathrm{d}x<\infty$, and, as $t\downarrow 0$,

\[
I_{2}(t)=\mathcal{O}\big((1+G_{1}(t))t^{\min\{1-1/\alpha,\delta\}}+G(1/t)^{p}(1+G_{1}(t))^{1+p}t^{\min\{1-1/\alpha,p/\alpha\}}\big).
\]

Assume $\alpha\in(0,1)$. Recall $\bm{\varpi}_{\bm{S}^{t}}=\int_{\mathbb{S}^{d-1}}\int_{\varepsilon}^{\infty}\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})$ and $\bm{\varpi}_{\bm{Z}}=\int_{\mathbb{S}^{d-1}}\int_{\varepsilon}^{\infty}\rho_{\bm{Z}}^{\leftarrow}(x,\bm{v})\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})$ and hence, by Lemma 5.11 and (39),

\[
\begin{aligned}
|\bm{\varpi}_{\bm{S}^{t}}-\bm{\varpi}_{\bm{Z}}|&\leqslant\int_{\mathbb{S}^{d-1}}\int_{\varepsilon}^{\infty}|\rho_{\bm{S}^{t}}^{\leftarrow}(x,\bm{v})-\rho_{\bm{Z}}^{\leftarrow}(x,\bm{v})|\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\\
&=\bigg(\frac{c_{\alpha}}{\alpha}\bigg)^{1/\alpha}\int_{\mathbb{S}^{d-1}}\int_{\varepsilon}^{\infty}\left|\frac{G(x/t)}{G(1/t)}\big(1+\widetilde{h}(x/t,\bm{v})\big)-1\right|x^{-1/\alpha}\mathrm{d}x\,\sigma(\mathrm{d}\bm{v})\eqqcolon I(t).
\end{aligned}
\]

Bounding $I(t)$ using the triangle inequality yields

\[
\bigg(\frac{\alpha}{c_{\alpha}}\bigg)^{1/\alpha}I(t)\leqslant\int_{\varepsilon}^{\infty}x^{-1/\alpha}\left|\frac{G(x/t)}{G(1/t)}-1\right|\mathrm{d}x+\int_{\mathbb{S}^{d-1}}\int_{\varepsilon}^{\infty}x^{-1/\alpha}\left|\frac{G(x/t)}{G(1/t)}\widetilde{h}(x/t,\bm{v})\right|\mathrm{d}x\,\sigma(\mathrm{d}\bm{v}),
\]

where the upper bound is denoted $I_{1}(t)+I_{2}(t)$. Assumption (S) and (34) with Lemma 5.11 imply that $I_{1}(t)\leqslant G_{2}(t)\int_{\varepsilon}^{\infty}x^{-1/\alpha}G_{1}(x)\mathrm{d}x<\infty$ and

\[
\begin{aligned}
I_{2}(t)&\lesssim t^{p/\alpha}G(1/t)^{p}\int_{\varepsilon}^{\infty}(1+G_{1}(x))\frac{G(x/t)^{p}}{G(1/t)^{p}}x^{-1/\alpha-p/\alpha}\mathrm{d}x+t^{\delta}\int_{\varepsilon}^{\infty}(1+G_{1}(x))x^{-1/\alpha-\delta}\mathrm{d}x\\
&=\mathcal{O}\big(t^{p/\alpha}G(1/t)^{p}+t^{\delta}\big),\qquad\text{as }t\downarrow 0.\qed
\end{aligned}
\]
Proof of Theorem 5.9.

The bound on $\bm{R}^{t}$ follows directly from its definition and Lemma 5.2 with $\beta_{+}=0$. The bounds on the big-jump components, the small-jump components and the drifts follow directly from Lemmas 5.12, 5.13 and 5.14, respectively. ∎

5.4. Brownian limits: upper bounds

In this subsection, we establish upper bounds on the distance between a Lévy process with nonzero Gaussian component and its attracting Brownian motion. Recall that $(\bm{\gamma_{X}},\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\top},\nu_{\bm{X}})$ denotes the characteristic triplet [40, Def. 8.2] of $\bm{X}$ with respect to the cutoff function $\bm{w}\mapsto\mathds{1}_{B_{\bm{0}}(1)}(\bm{w})$ on $\bm{w}\in\mathbb{R}^{d}$ and $\beta_{+}$ is given in terms of the BG index defined in (27).

Proposition 5.15.

Let $\bm{X}$ be a Lévy process on $\mathbb{R}^{d}$ with the characteristic triplet $(\bm{\gamma_{X}},\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\top},\nu_{\bm{X}})$. Let $\bm{X}^{t}=(\bm{X}_{st}/\sqrt{t})_{s\in[0,1]}$ for $t\in(0,1]$ and assume $\int_{\mathbb{R}^{d}\setminus B_{\bm{0}}(1)}|\bm{w}|^{q}\nu_{\bm{X}}(\mathrm{d}\bm{w})<\infty$ for some $q\in(0,2]$. Let $\bm{\Sigma_{X}}\bm{B}$ be the Gaussian component of $\bm{X}$ in its Lévy–Itô decomposition (3) and define $\bm{S}\coloneqq\bm{X}-\bm{\Sigma_{X}}\bm{B}$.

(a) If $\bm{S}$ is of infinite variation or has finite variation with infinite activity and zero natural drift, then

\[
\mathcal{W}_{q}\big(\bm{X}^{t},\bm{\Sigma_{X}}\bm{B}\big)=\mathcal{O}\big(t^{(q\wedge 1)(\min\{1/q,1/\beta_{+}\}-1/2)}\big),\qquad\text{as }t\downarrow 0.
\]

(b) If $\bm{S}$ has finite variation and nonzero natural drift, then

\[
\mathcal{W}_{q}\big(\bm{X}^{t},\bm{\Sigma_{X}}\bm{B}\big)=\mathcal{O}\big(t^{(q\wedge 1)\min\{1/q-1/2,1/2\}}\big),\qquad\text{as }t\downarrow 0.
\]

Note that, if $\bm{S}$ has infinite activity, we have $\beta_{+}>0$, and if the BG index satisfies $\beta<2$, then $1/\beta_{+}>1/2$. Hence, Proposition 5.15 provides bounds on the rate of convergence in the appropriate $L^{q}$-Wasserstein distance for the weak limit in Theorem 5.1 (case $\alpha=2$ and $G$ asymptotically constant). In the case $\beta=2$, it is well-known that $\bm{X}^{t}_{1}$ converges weakly to the Gaussian law of $\bm{\Sigma_{X}}\bm{B}_{1}$ (see e.g. [5, Prop. I.2(i)]), but the convergence of $\mathcal{W}_{q}\big(\bm{X}^{t},\bm{\Sigma_{X}}\bm{B}\big)$ could be arbitrarily slow, see Example 6.7 below. It is thus not surprising that Proposition 5.15 gives no information about the rate of convergence in this case. Note also that Proposition 5.15 covers the case $\bm{\Sigma_{X}}=\bm{0}$. Moreover, when the BG index is less than one, the bound on the $L^{q}$-Wasserstein distance is sharper if the natural drift is zero than if it is not.
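To make the comparison between parts (a) and (b) of Proposition 5.15 concrete, the following Python sketch evaluates the exponent of $t$ in each bound. The function name and the `case` argument are hypothetical; the formulas are copied from the statement of the proposition, and in case (a) the sketch assumes $\beta_{+}\in(0,2]$.

```python
def brownian_rate_exponent(q, beta_plus=None, case="a"):
    """Exponent of t in the bound of Proposition 5.15.

    case "a": infinite variation, or finite variation with infinite
              activity and zero natural drift (needs beta_plus in (0, 2]).
    case "b": finite variation and nonzero natural drift.
    """
    if not 0 < q <= 2:
        raise ValueError("q must lie in (0, 2]")
    if case == "a":
        if beta_plus is None or not 0 < beta_plus <= 2:
            raise ValueError("case 'a' needs beta_plus in (0, 2]")
        return min(q, 1.0) * (min(1.0 / q, 1.0 / beta_plus) - 0.5)
    if case == "b":
        return min(q, 1.0) * min(1.0 / q - 0.5, 0.5)
    raise ValueError("case must be 'a' or 'b'")


# Hypothetical inputs: for q = 1/2 and beta_plus = 1/2, zero natural drift
# (case "a") gives a larger exponent than nonzero natural drift (case "b").
print(brownian_rate_exponent(0.5, beta_plus=0.5, case="a"))
print(brownian_rate_exponent(0.5, case="b"))
```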

Proof of Proposition 5.15.

Fix $t\in(0,1]$, let $\bm{B}$ be the Brownian motion in the Lévy–Itô decomposition (3) of the Lévy process $\bm{X}$ and recall $\bm{X}^{t}=(\bm{X}_{st}/\sqrt{t})_{s\in[0,1]}$. Since the Brownian motion $\bm{B}$ satisfies the identity in law $(t^{-1/2}\bm{B}_{st})_{s\in[0,1]}\overset{d}{=}(\bm{B}_{s})_{s\in[0,1]}$ by self-similarity (Lévy’s characterisation theorem), there exists a coupling $(\bm{X}^{t},\bm{B}^{\prime})$ such that $\bm{B}^{\prime}=(t^{-1/2}\bm{B}_{st})_{s\in[0,1]}$ and $\bm{B}^{\prime}\overset{d}{=}\bm{B}$. Recalling $\bm{S}=\bm{X}-\bm{\Sigma}_{\bm{X}}\bm{B}$, we obtain

\[
\mathcal{W}_{q}(\bm{X}^{t},\bm{\Sigma}_{\bm{X}}\bm{B})^{q\vee 1}\leqslant\mathds{E}\bigg[\sup_{s\in[0,1]}|\bm{\Sigma}_{\bm{X}}\bm{B}_{st}/\sqrt{t}+\bm{S}_{st}/\sqrt{t}-\bm{\Sigma}_{\bm{X}}\bm{B}^{\prime}_{s}|^{q}\bigg]=t^{-q/2}\mathds{E}\bigg[\sup_{s\in[0,t]}|\bm{S}_{s}|^{q}\bigg].\tag{41}
\]

Note that the characteristic triplet $(\bm{\gamma_{S}},\bm{0},\nu_{\bm{S}})$ of $\bm{S}$ is given by $\bm{\gamma_{S}}=\bm{\gamma_{X}}$ and $\nu_{\bm{S}}=\nu_{\bm{X}}$. In particular, the BG index of $\bm{S}$ equals that of $\bm{X}$.

Part (a). Assume $\bm{S}$ is of infinite variation. Since $\bm{S}$ has no Gaussian component, by [40, Thm 21.9] we have $\int_{\mathbb{R}^{d}_{\bm{0}}}|\bm{w}|\mathds{1}_{B_{\bm{0}}(1)}(\bm{w})\nu_{\bm{X}}(\mathrm{d}\bm{w})=\infty$, implying that the BG index of $\bm{S}$ satisfies $\beta\geqslant 1$. Hence the associated quantity $\beta_{+}\in[\beta,2]$ satisfies $\min\{1/q,1/\beta_{+}\}-1/2\leqslant 1/\beta_{+}-1/2\leqslant 1/2$. Thus, by Lemma 5.2, we have $t^{-q/2}\mathds{E}\big[\sup_{s\in[0,t]}|\bm{S}_{s}|^{q}\big]\leqslant C_{2}t^{q/2}+C_{3}t^{q(\min\{1/q,1/\beta_{+}\}-1/2)}$, implying that $\mathcal{W}_{q}(\bm{X}^{t},\bm{\Sigma}_{\bm{X}}\bm{B})^{q\vee 1}\leqslant 2\max\{C_{2},C_{3}\}t^{q(\min\{1/\beta_{+},1/q\}-1/2)}$ for $t\in(0,1]$.

If $\bm{S}$ has finite variation and zero natural drift, then the bound in Lemma 5.2 with $C_{2}=0$ yields $\mathcal{W}_{q}(\bm{X}^{t},\bm{\Sigma}_{\bm{X}}\bm{B})^{q\vee 1}\leqslant C_{3}t^{q(\min\{1/q,1/\beta_{+}\}-1/2)}$. Since $q/(q\vee 1)=q\wedge 1$, Part (a) follows.

Part (b). Since $\bm{S}$ has finite variation, by definition (27) and [40, Thm 21.9], we have $\beta\in[0,1]$ and $I_{1}<\infty$, thus implying $\beta_{+}\in[0,1]$. By Lemma 5.2 applied to $\bm{S}$, we find

\[
t^{-q/2}\mathds{E}\bigg[\sup_{s\in[0,t]}|\bm{S}_{s}|^{q}\bigg]\leqslant C_{2}t^{q/2}+C_{3}t^{q(\min\{1/q,1/\beta_{+}\}-1/2)},\quad t\in(0,1].
\]

Since $1\leqslant 1/\beta_{+}$, we have $1/2\leqslant 1/\beta_{+}-1/2$ and $\min\{1/q-1/2,1/2\}\leqslant\min\{1/q,1/\beta_{+}\}-1/2$. Thus, for any $\beta_{+}\in[0,1]$, by (41) we get $\mathcal{W}_{q}(\bm{X}^{t},\bm{\Sigma}_{\bm{X}}\bm{B})^{q\vee 1}\leqslant 2\max\{C_{2},C_{3}\}t^{q\min\{1/q-1/2,1/2\}}$. As in Part (a), note that $q/(q\vee 1)=q\wedge 1$, implying the claim in Part (b). ∎

6. Lower bounds on the Wasserstein distance in the domain of attraction

In this section we prove the lower bounds from Theorems 2.1, 2.3 and 2.8. We first cover the domain of non-normal attraction and then turn to the domain of normal attraction.

6.1. Domain of non-normal attraction

The lower bound on $\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})$ decays much more slowly than any polynomial rate as $t\downarrow 0$ when the scaling function $g(t)=t^{1/\alpha}G(1/t)$ is such that $G$, which is slowly varying at $0$, does not converge to a positive constant (i.e. the process $\bm{X}$ is in the domain of non-normal attraction). To show this, we start with the following result, which can be viewed as an extension of [9, Thm 1] from random walks to multidimensional Lévy processes, stated for the $L^{q}$-Wasserstein distance. (We remark here that an extension for the Prokhorov distance, used in [9], is also possible in this context.) Our proof below was inspired by that of [9, Thm 1]. This result is our main tool in the proof of the lower bound in Theorem 2.3.

Proposition 6.1.

Let $\bm{X}$ be in the domain of non-normal attraction of an $\alpha$-stable process $\bm{Z}$ with $\alpha\in(0,2]$ and define $a(t)\coloneqq G(1/(2t))/G(1/t)$ for $t>0$. Then, for all $t>0$ and $q\in(0,1]\cap(0,\alpha)$,

\[
2^{1-q/\alpha}\mathcal{W}_{q}(\bm{X}^{t}_{1},\bm{Z}_{1})+a(t)^{q}\mathcal{W}_{q}(\bm{X}^{2t}_{1},\bm{Z}_{1})\geqslant|1-a(t)^{q}|\mathds{E}\big[|\bm{Z}_{1}|^{q}\big].
\]
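To illustrate how slowly the right-hand side of Proposition 6.1 may decay, the following Python sketch evaluates the factor $|1-a(t)^{q}|$ for one hypothetical choice, $G(x)=\log(\mathrm{e}+x)$, which is slowly varying at infinity and does not converge to a positive constant. This particular $G$ and the function names are assumptions made only for the illustration; for this choice $|1-a(t)|$ behaves like $\log 2/\log(1/t)$ as $t\downarrow 0$, which is slower than any positive power of $t$.

```python
import math


def G(x):
    """Hypothetical slowly varying function at infinity with G(x) -> infinity."""
    return math.log(math.e + x)


def a(t):
    """a(t) = G(1/(2t)) / G(1/t), as in Proposition 6.1."""
    return G(1.0 / (2.0 * t)) / G(1.0 / t)


def lower_bound_factor(t, q):
    """|1 - a(t)^q|, the factor multiplying E|Z_1|^q in Proposition 6.1."""
    return abs(1.0 - a(t) ** q)


# Compare the factor (q = 1) with the logarithmic rate log(2) / log(1/t).
for t in (1e-2, 1e-4, 1e-8, 1e-16):
    print(t, lower_bound_factor(t, 1.0), math.log(2.0) / math.log(1.0 / t))
```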

In Lemma 6.2 we state some well-known facts used in the proof of Proposition 6.1.

Lemma 6.2.

(a) Let $\bm{\xi}$ be a random vector in $L^{q}$, i.e. $\mathds{E}[|\bm{\xi}|^{q}]<\infty$, for some $q\in(0,1]$. Then,

\[
\mathcal{W}_{q}(\bm{\xi},a\bm{\xi})\geqslant|1-a^{q}|\mathds{E}\big[|\bm{\xi}|^{q}\big]\quad\text{for any constant }a\in(0,\infty).
\]

(b) Assume that the random vectors $\bm{\xi}_{1},\bm{\xi}_{2},\bm{\zeta}_{1},\bm{\zeta}_{2}$ are in $L^{q}$, for some $q\in(0,1]$, and that $(\bm{\xi}_{1},\bm{\zeta}_{1})$ and $(\bm{\xi}_{2},\bm{\zeta}_{2})$ are independent. Then the following inequality holds:

\[
\mathcal{W}_{q}(\bm{\xi}_{1}+\bm{\xi}_{2},\bm{\zeta}_{1}+\bm{\zeta}_{2})\leqslant\mathcal{W}_{q}(\bm{\xi}_{1},\bm{\zeta}_{1})+\mathcal{W}_{q}(\bm{\xi}_{2},\bm{\zeta}_{2}).
\]
Proof.

(a) By the subadditivity of $t\mapsto t^{q}$ on $\mathbb{R}_{+}$, we have $|\bm{x}|^{q}\leqslant(|\bm{y}|+|\bm{x}-\bm{y}|)^{q}\leqslant|\bm{y}|^{q}+|\bm{x}-\bm{y}|^{q}$ for any $\bm{x},\bm{y}\in\mathbb{R}^{d}$. A similar inequality holds by reversing the roles of $\bm{x}$ and $\bm{y}$, implying $|\bm{x}-\bm{y}|^{q}\geqslant\big||\bm{x}|^{q}-|\bm{y}|^{q}\big|$. Hence, we have

\[
\mathcal{W}_{q}(\bm{\xi},a\bm{\xi})=\inf_{(\bm{\xi},\bm{\zeta}),\,\bm{\zeta}\overset{d}{=}a\bm{\xi}}\mathds{E}\big[|\bm{\xi}-\bm{\zeta}|^{q}\big]\geqslant\inf_{(\bm{\xi},\bm{\zeta}),\,\bm{\zeta}\overset{d}{=}a\bm{\xi}}\big|\mathds{E}\big[|\bm{\xi}|^{q}\big]-\mathds{E}\big[|\bm{\zeta}|^{q}\big]\big|=|1-a^{q}|\mathds{E}\big[|\bm{\xi}|^{q}\big].
\]

(b) By [41, Thm 4.1] and [35, Main Thm] there exist minimal couplings $(\bm{\xi}_{1},\bm{\zeta}_{1})$ and $(\bm{\xi}_{2},\bm{\zeta}_{2})$, satisfying $\mathds{E}[|\bm{\xi}_{1}-\bm{\zeta}_{1}|^{q}]=\mathcal{W}_{q}(\bm{\xi}_{1},\bm{\zeta}_{1})$ and $\mathds{E}[|\bm{\xi}_{2}-\bm{\zeta}_{2}|^{q}]=\mathcal{W}_{q}(\bm{\xi}_{2},\bm{\zeta}_{2})$. The product of these two probability spaces yields a coupling of all four vectors $\bm{\xi}_{1},\bm{\xi}_{2},\bm{\zeta}_{1},\bm{\zeta}_{2}$, such that $(\bm{\xi}_{1},\bm{\zeta}_{1})$ and $(\bm{\xi}_{2},\bm{\zeta}_{2})$ are independent. Thus,

𝒲q(𝝃1+𝝃2,𝜻1+𝜻2)\displaystyle\mathcal{W}_{q}(\bm{\xi}_{1}+\bm{\xi}_{2},\bm{\zeta}_{1}+\bm{\zeta}_{2}) 𝔼[|𝝃1+𝝃2𝜻1𝜻2|q]\displaystyle\leqslant\mathds{E}[|\bm{\xi}_{1}+\bm{\xi}_{2}-\bm{\zeta}_{1}-\bm{\zeta}_{2}|^{q}]
𝔼[|𝝃1𝜻1|q]+𝔼[|𝝃2𝜻2|q]=𝒲q(𝝃1,𝜻1)+𝒲q(𝝃2,𝜻2).\displaystyle\leqslant\mathds{E}[|\bm{\xi}_{1}-\bm{\zeta}_{1}|^{q}]+\mathds{E}[|\bm{\xi}_{2}-\bm{\zeta}_{2}|^{q}]=\mathcal{W}_{q}(\bm{\xi}_{1},\bm{\zeta}_{1})+\mathcal{W}_{q}(\bm{\xi}_{2},\bm{\zeta}_{2}).\qed
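The pointwise inequality $|\bm{x}-\bm{y}|^{q}\geqslant\big||\bm{x}|^{q}-|\bm{y}|^{q}\big|$ behind part (a) is easy to sanity-check numerically. The following is a minimal Python sketch, not part of the argument: the helper names `norm`, `reverse_triangle_gap` and `min_gap` are ours, and Gaussian sampling is an arbitrary illustrative choice.

```python
import math
import random

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))

def reverse_triangle_gap(x, y, q):
    """Return |x - y|^q - | |x|^q - |y|^q |, expected to be >= 0 for q in (0, 1]."""
    diff = [a - b for a, b in zip(x, y)]
    return norm(diff) ** q - abs(norm(x) ** q - norm(y) ** q)

def min_gap(q, dim=3, trials=2000, seed=0):
    """Smallest gap observed over randomly sampled pairs of Gaussian vectors."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(dim)]
        y = [rng.gauss(0, 1) for _ in range(dim)]
        worst = min(worst, reverse_triangle_gap(x, y, q))
    return worst
```

Taking expectations of this pointwise bound over any coupling is exactly the step that produces the middle inequality in the proof of part (a).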
Proof of Proposition 6.1.

Recall 𝑿1t=𝑿t/g(t)\bm{X}^{t}_{1}=\bm{X}_{t}/g(t), and note that 𝑿2t=𝑿2t𝑿t+𝑿t\bm{X}_{2t}=\bm{X}_{2t}-\bm{X}_{t}+\bm{X}_{t}, where 𝑿2t𝑿t\bm{X}_{2t}-\bm{X}_{t} and 𝑿t\bm{X}_{t} are independent and equal in distribution. Furthermore let 𝒁(1),𝒁(2)\bm{Z}^{(1)},\bm{Z}^{(2)} be independent copies of 𝒁\bm{Z}. Recall that g(t)=t1/αG(1/t)g(t)=t^{1/\alpha}G(1/t) and note that 𝒁1=𝑑21/α𝒁1(1)+21/α𝒁1(2)\bm{Z}_{1}\overset{d}{=}2^{-1/\alpha}\bm{Z}_{1}^{(1)}+2^{-1/\alpha}\bm{Z}^{(2)}_{1}. This together with Lemma 6.2(b) implies that

𝒲q(G(1/(2t))G(1/t)𝑿12t,𝒁1)\displaystyle\mathcal{W}_{q}\left(\frac{G(1/(2t))}{G(1/t)}\bm{X}^{2t}_{1},\bm{Z}_{1}\right) =𝒲q(𝑿2t𝑿t(2t)1/αG(1/t)+𝑿t(2t)1/αG(1/t),𝒁1(1)21/α+𝒁1(2)21/α)\displaystyle=\mathcal{W}_{q}\left(\frac{\bm{X}_{2t}-\bm{X}_{t}}{(2t)^{1/\alpha}G(1/t)}+\frac{\bm{X}_{t}}{(2t)^{1/\alpha}G(1/t)},\frac{\bm{Z}^{(1)}_{1}}{2^{1/\alpha}}+\frac{\bm{Z}^{(2)}_{1}}{2^{1/\alpha}}\right)
2𝒲q(𝑿t(2t)1/αG(1/t),𝒁1(1)21/α)=21q/α𝒲q(𝑿1t,𝒁1).\displaystyle\leqslant 2\mathcal{W}_{q}\left(\frac{\bm{X}_{t}}{(2t)^{1/\alpha}G(1/t)},\frac{\bm{Z}^{(1)}_{1}}{2^{1/\alpha}}\right)=2^{1-q/\alpha}\mathcal{W}_{q}\left(\bm{X}^{t}_{1},\bm{Z}_{1}\right).

The scaling property for the 𝒲q\mathcal{W}_{q}-distance implies that 𝒲q(a(t)𝑿12t,a(t)𝒁1)=a(t)q𝒲q(𝑿12t,𝒁1)\mathcal{W}_{q}(a(t)\bm{X}^{2t}_{1},a(t)\bm{Z}_{1})=a(t)^{q}\mathcal{W}_{q}(\bm{X}^{2t}_{1},\bm{Z}_{1}). Putting everything together and applying the triangle inequality with Lemma 6.2(a), yields

21q/α𝒲q(𝑿1t,𝒁1)+a(t)q𝒲q(𝑿12t,𝒁1)\displaystyle 2^{1-q/\alpha}\mathcal{W}_{q}(\bm{X}^{t}_{1},\bm{Z}_{1})+a(t)^{q}\mathcal{W}_{q}(\bm{X}^{2t}_{1},\bm{Z}_{1}) 𝒲q(a(t)𝑿12t,𝒁1)+𝒲q(a(t)𝑿12t,a(t)𝒁1)\displaystyle\geqslant\mathcal{W}_{q}\left(a(t)\bm{X}^{2t}_{1},\bm{Z}_{1}\right)+\mathcal{W}_{q}\left(a(t)\bm{X}^{2t}_{1},a(t)\bm{Z}_{1}\right)
𝒲q(𝒁1,a(t)𝒁1)|1a(t)q|𝔼[|𝒁1|q].\displaystyle\geqslant\mathcal{W}_{q}\left(\bm{Z}_{1},a(t)\bm{Z}_{1}\right)\geqslant|1-a(t)^{q}|\mathds{E}\big{[}|\bm{Z}_{1}|^{q}\big{]}.\qed

6.2. Domain of normal attraction and the Toscani–Fourier lower bounds

We begin with the following technical result, used in the proofs of Theorems 2.1 and 2.8. Given two $d$-dimensional random vectors $\bm{\xi}$ and $\bm{\zeta}$ with characteristic functions $\varphi_{\bm{\xi}}(\bm{u})\coloneqq\mathds{E}[\exp(i\langle\bm{u},\bm{\xi}\rangle)]$ and $\varphi_{\bm{\zeta}}(\bm{u})\coloneqq\mathds{E}[\exp(i\langle\bm{u},\bm{\zeta}\rangle)]$, respectively, define the Toscani–Fourier distance (see [3, Eq. (1)]) as

Ts(𝝃,𝜻)sup𝒖𝟎d|φ𝝃(𝒖)φ𝜻(𝒖)||𝒖|s, for s>0.T_{s}(\bm{\xi},\bm{\zeta})\coloneqq\sup_{\bm{u}\in\mathbb{R}^{d}_{\bm{0}}}\frac{|\varphi_{\bm{\xi}}(\bm{u})-\varphi_{\bm{\zeta}}(\bm{u})|}{|\bm{u}|^{s}},\quad\text{ for }s>0.

The following lemma is an extension of [34, Prop. 2] to the multivariate case and to LqL^{q}-Wasserstein distances for q(0,1]q\in(0,1], and the proof is inspired by the proof in the one-dimensional case. For completeness, we give a simple proof below.

Lemma 6.3.

For any random vectors 𝛏,𝛇\bm{\xi},\bm{\zeta} and q(0,1]q\in(0,1], we have 𝒲q(𝛏,𝛇)2q1Tq(𝛏,𝛇)\mathcal{W}_{q}(\bm{\xi},\bm{\zeta})\geqslant 2^{q-1}T_{q}(\bm{\xi},\bm{\zeta}).

Proof.

Fix q(0,1]q\in(0,1]. Since the map ψ:xeix\psi:x\mapsto e^{ix}, xx\in\mathbb{R}, satisfies |1ψ(x)|2min{|x/2|,1}2|x/2|q|1-\psi(x)|\leqslant 2\min\{|x/2|,1\}\leqslant 2|x/2|^{q} for xx\in\mathbb{R}, we have |ψ(x)ψ(y)|21q|xy|q|\psi(x)-\psi(y)|\leqslant 2^{1-q}|x-y|^{q} for any x,yx,y\in\mathbb{R}. Hence, for any 𝒖d{𝟎}\bm{u}\in\mathbb{R}^{d}\setminus\{\bm{0}\},

\[
\mathds{E}\big[2^{1-q}|\bm{\xi}-\bm{\zeta}|^{q}\big]\geqslant\frac{\mathds{E}\big[2^{1-q}|\langle\bm{u},\bm{\xi}\rangle-\langle\bm{u},\bm{\zeta}\rangle|^{q}\big]}{|\bm{u}|^{q}}\geqslant\frac{\mathds{E}\big[|\psi(\langle\bm{u},\bm{\xi}\rangle)-\psi(\langle\bm{u},\bm{\zeta}\rangle)|\big]}{|\bm{u}|^{q}}\geqslant\frac{|\varphi_{\bm{\xi}}(\bm{u})-\varphi_{\bm{\zeta}}(\bm{u})|}{|\bm{u}|^{q}}.
\]

Since 𝒖d{𝟎}\bm{u}\in\mathbb{R}^{d}\setminus\{\bm{0}\} is arbitrary, the result follows. ∎
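The Hölder-type estimate $|e^{ix}-e^{iy}|\leqslant 2^{1-q}|x-y|^{q}$ used at the start of the proof can be checked on a grid. The sketch below is only a numerical illustration under our own choices (the function names `holder_gap` and `worst_gap`, the grid size and the range are assumptions, not taken from the paper).

```python
import cmath

def holder_gap(x, y, q):
    """Return 2^(1-q)|x-y|^q - |e^{ix} - e^{iy}|, expected to be >= 0 for q in (0, 1]."""
    lhs = abs(cmath.exp(1j * x) - cmath.exp(1j * y))
    rhs = 2 ** (1 - q) * abs(x - y) ** q
    return rhs - lhs

def worst_gap(q, grid=200, span=10.0):
    """Smallest gap over a regular grid of pairs (x, y) in [-span, span]^2."""
    pts = [-span + 2 * span * k / grid for k in range(grid + 1)]
    return min(holder_gap(x, y, q) for x in pts for y in pts)
```

A nonnegative value of `worst_gap(q)` (up to rounding) is consistent with the bound; it does not replace the short analytic argument given in the proof.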

6.2.1. Heavy-tailed domain of normal attraction

Let (𝑿t)t0(\bm{X}_{t})_{t\geqslant 0} be a Lévy process on d\mathbb{R}^{d} in the domain of attraction of an α\alpha-stable process 𝒁\bm{Z}, such that 𝑿1t=𝑿t/t1/α𝑑𝒁1\bm{X}^{t}_{1}=\bm{X}_{t}/t^{1/\alpha}\xrightarrow{d}\bm{Z}_{1} as t0t\downarrow 0.

Lemma 6.4.

Let 𝐗\bm{X} be a Lévy process that differs in law from the α\alpha-stable process 𝐙\bm{Z}, α(0,2]\alpha\in(0,2]. Let ψ𝐗\psi_{\bm{X}} and ψ𝐙\psi_{\bm{Z}} denote their Lévy–Khintchine exponents. Pick any q(0,1](0,α)q\in(0,1]\cap(0,\alpha) and 𝐮d{𝟎}\bm{u}_{*}\in\mathbb{R}^{d}\setminus\{\bm{0}\} for which C2q1|𝐮|q|ψ𝐗1(𝐮)ψ𝐙1(𝐮)|>0C_{*}\coloneqq 2^{q-1}|\bm{u}_{*}|^{-q}|\psi_{\bm{X}_{1}}(\bm{u}_{*})-\psi_{\bm{Z}_{1}}(\bm{u}_{*})|>0. Then, we have

𝒲q(𝑿1t,𝒁1)Ct1q/α+𝒪(t2q/α),as t0.\displaystyle\mathcal{W}_{q}(\bm{X}^{t}_{1},\bm{Z}_{1})\geqslant C_{*}t^{1-q/\alpha}+\mathcal{O}(t^{2-q/\alpha}),\qquad\text{as }t\downarrow 0.
Proof.

First, Lemma 6.3 implies that

𝒲q(𝑿1t,𝒁1)2q1Tq(𝑿1t,𝒁1)2q1|𝒖|q|φ𝑿1t(𝒖)φ𝒁1(𝒖)|,for any 𝒖𝟎d,\displaystyle\mathcal{W}_{q}(\bm{X}^{t}_{1},\bm{Z}_{1})\geqslant 2^{q-1}T_{q}(\bm{X}^{t}_{1},\bm{Z}_{1})\geqslant\frac{2^{q-1}}{|\bm{u}|^{q}}|\varphi_{\bm{X}^{t}_{1}}(\bm{u})-\varphi_{\bm{Z}_{1}}(\bm{u})|,\qquad\text{for any }\bm{u}\in\mathbb{R}^{d}_{\bm{0}},

where φ𝝃\varphi_{\bm{\xi}} denotes the characteristic function of the random vector 𝝃\bm{\xi}. Second, set 𝒖=t1/α𝒖\bm{u}=t^{1/\alpha}\bm{u}_{*} with 𝒖d{𝟎}\bm{u}_{*}\in\mathbb{R}^{d}\setminus\{\bm{0}\} as in the statement of the lemma and note that

φ𝑿1t(𝒖)=𝔼[exp(i𝑿1t,t1/α𝒖)]=𝔼[exp(i𝑿t/t1/α,t1/α𝒖)]=etψ𝑿(𝒖).\varphi_{\bm{X}^{t}_{1}}(\bm{u})=\mathds{E}\big{[}\exp(i\langle\bm{X}^{t}_{1},t^{1/\alpha}\bm{u}_{*}\rangle)\big{]}=\mathds{E}\big{[}\exp(i\langle\bm{X}_{t}/t^{1/\alpha},t^{1/\alpha}\bm{u}_{*}\rangle)\big{]}=e^{t\psi_{\bm{X}}(\bm{u}_{*})}.

Similarly, since 𝒁1=𝑑𝒁t/t1/α\bm{Z}_{1}\overset{d}{=}\bm{Z}_{t}/t^{1/\alpha}, we have φ𝒁1(𝒖)=exp(tψ𝒁(𝒖))\varphi_{\bm{Z}_{1}}(\bm{u})=\exp(t\psi_{\bm{Z}}(\bm{u}_{*})) and hence

𝒲q(𝑿1t,𝒁1)2q1|𝒖|qtq/α|etψ𝑿(𝒖)etψ𝒁(𝒖)|.\displaystyle\mathcal{W}_{q}(\bm{X}^{t}_{1},\bm{Z}_{1})\geqslant\frac{2^{q-1}}{|\bm{u}_{*}|^{q}t^{q/\alpha}}\big{|}e^{t\psi_{\bm{X}}(\bm{u}_{*})}-e^{t\psi_{\bm{Z}}(\bm{u}_{*})}\big{|}.

Since for any zz\in\mathbb{C} we have ez=1+z+𝒪(z2)e^{z}=1+z+\mathcal{O}(z^{2}) as |z|0|z|\to 0, it follows that |eatebt|=|atbt+𝒪(t2)|=|ab|t+𝒪(t2)|e^{at}-e^{bt}|=|at-bt+\mathcal{O}(t^{2})|=|a-b|t+\mathcal{O}(t^{2}) for a=ψ𝑿(𝒖)a=\psi_{\bm{X}}(\bm{u}_{*}) and b=ψ𝒁(𝒖)b=\psi_{\bm{Z}}(\bm{u}_{*}). The result then follows. ∎
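The first-order expansion $|e^{at}-e^{bt}|=|a-b|t+\mathcal{O}(t^{2})$ in the last step can be observed numerically. In the Python sketch below the values of `a` and `b` are hypothetical stand-ins for $\psi_{\bm{X}}(\bm{u}_*)$ and $\psi_{\bm{Z}}(\bm{u}_*)$, chosen only for illustration (with nonpositive real parts, as for Lévy–Khintchine exponents).

```python
import cmath

def scaled_difference(a, b, t):
    """Return |exp(a t) - exp(b t)| / t, which tends to |a - b| as t -> 0."""
    return abs(cmath.exp(a * t) - cmath.exp(b * t)) / t

# Hypothetical exponent values, for illustration only.
a, b = complex(-1.0, 2.0), complex(-0.5, -1.0)

# Deviation from the limit |a - b|; it should shrink roughly linearly in t.
errors = [abs(scaled_difference(a, b, t) - abs(a - b)) for t in (1e-2, 1e-3, 1e-4)]
```

The roughly tenfold decrease of successive entries of `errors` reflects the $\mathcal{O}(t^{2})$ remainder, which after division by $t^{q/\alpha}$ gives the $\mathcal{O}(t^{2-q/\alpha})$ term in the lemma.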

6.2.2. Brownian domain of normal attraction

The domain of normal attraction to a Brownian motion consists of the class of Lévy processes with a nontrivial Brownian component (see e.g. [26] and [5, Prop. I.2(i)]). To construct a lower bound on the distance between the Lévy process and its Brownian limit, we require the following lower estimates.

Lemma 6.5.

Let 𝐘\bm{Y} be a nonzero pure-jump Lévy process on d\mathbb{R}^{d}, let ψ𝐘(𝐮)\psi_{\bm{Y}}(\bm{u}) denote its Lévy-Khintchine exponent and ν𝐘\nu_{\bm{Y}} its Lévy measure.
(a) If $\bm{Y}$ has finite variation and nonzero drift with direction $\bm{u}_{*}\in\mathbb{S}^{d-1}$, then $|\psi_{\bm{Y}}(r\bm{u}_{*})|\geqslant cr$ for some $c>0$ and all sufficiently large $r>0$.
(b) Suppose there exist a locally finite measure ρ\rho on (0,)(0,\infty) and a probability measure σ\sigma on 𝕊d1\mathbb{S}^{d-1} with

ν𝒀(A)𝕊d1(0,)𝟙A(r𝒗)ρ(dr)σ(d𝒗),A(𝟎d).\nu_{\bm{Y}}(A)\geqslant\int_{\mathbb{S}^{d-1}}\int_{(0,\infty)}\mathds{1}_{A}(r\bm{v})\rho(\mathrm{d}r)\sigma(\mathrm{d}\bm{v}),\qquad A\in\mathcal{B}(\mathbb{R}^{d}_{\bm{0}}).

Define $\upsilon(r):=\int_{(0,r)}s^{2}\rho(\mathrm{d}s)$ for $r>0$, and, given any $c\in(0,1)$, let $\bm{u}_{*}\in\mathbb{S}^{d-1}$ be such that the set $C_{c,\bm{u}_{*}}:=\{\bm{v}\in\mathbb{S}^{d-1}:|\langle\bm{u}_{*},\bm{v}\rangle|\geqslant c\}$ has positive $\sigma$-measure $m:=\sigma(C_{c,\bm{u}_{*}})>0$. Then we have

|ψ𝒀(r𝒖)|c2m3r2υ(r1)for all r>0.|\psi_{\bm{Y}}(r\bm{u}_{*})|\geqslant\frac{c^{2}m}{3}r^{2}\upsilon(r^{-1})\qquad\text{for all }r>0.

In particular, if cδ:=infr(0,1)rδ2υ(r)>0c_{\delta}:=\inf_{r\in(0,1)}r^{\delta-2}\upsilon(r)>0 for some δ(0,2)\delta\in(0,2), then |ψ𝐘(r𝐮)|(cδc2m/3)rδ|\psi_{\bm{Y}}(r\bm{u}_{*})|\geqslant(c_{\delta}c^{2}m/3)r^{\delta}, r>1r>1.

Proof.

Let z\Re z and z\Im z denote the real and imaginary parts of zz\in\mathbb{C}, respectively.
(a) Since 𝒀\bm{Y} has finite variation, it is clear from the Lévy-Khintchine formula without compensator that |ψ𝒀(r𝒖)||ψ𝒀(r𝒖)|cr|\psi_{\bm{Y}}(r\bm{u}_{*})|\geqslant|\Im\psi_{\bm{Y}}(r\bm{u}_{*})|\geqslant cr for some c>0c>0 and all sufficiently large r>0r>0. Indeed, this follows from [5, Prop. 2(ii)] applied to the finite variation Lévy process 𝒖,𝒀\langle\bm{u_{*}},\bm{Y}\rangle.
(b) Note at first, that 1eix=1cos(x)13x2𝟙{|x|<1}1-\Re e^{ix}=1-\cos(x)\geqslant\tfrac{1}{3}x^{2}\mathds{1}_{\{|x|<1\}} for all xx\in\mathbb{R}. Thus, the Lévy-Khintchine formula applied to ψ\Re\psi yields

3|ψ𝒀(r𝒖)|\displaystyle 3|\psi_{\bm{Y}}(r\bm{u}_{*})| 3|ψ(r𝒖)|𝟎d|r𝒖,𝒘|2𝟙{|r𝒖,𝒘|<1}ν𝒀(d𝒘)\displaystyle\geqslant 3|\Re\psi(r\bm{u}_{*})|\geqslant\int_{\mathbb{R}^{d}_{\bm{0}}}|\langle r\bm{u}_{*},\bm{w}\rangle|^{2}\mathds{1}_{\{|\langle r\bm{u}_{*},\bm{w}\rangle|<1\}}\nu_{\bm{Y}}(\mathrm{d}\bm{w})
𝟎d|r𝒖,𝒘|2𝟙{r|𝒘|<1}ν𝒀(d𝒘)01/rCc,𝒖|r𝒖,s𝒗|2σ(d𝒗)ρ(ds)\displaystyle\geqslant\int_{\mathbb{R}^{d}_{\bm{0}}}|\langle r\bm{u}_{*},\bm{w}\rangle|^{2}\mathds{1}_{\{r|\bm{w}|<1\}}\nu_{\bm{Y}}(\mathrm{d}\bm{w})\geqslant\int_{0}^{1/r}\int_{C_{c,\bm{u}_{*}}}|\langle r\bm{u}_{*},s\bm{v}\rangle|^{2}\sigma(\mathrm{d}\bm{v})\rho(\mathrm{d}s)
01/rCc,𝒖c2r2s2σ(d𝒗)ρ(ds)c2m01/rr2s2ρ(ds)=c2mr2υ(r1).\displaystyle\geqslant\int_{0}^{1/r}\int_{C_{c,\bm{u}_{*}}}c^{2}r^{2}s^{2}\sigma(\mathrm{d}\bm{v})\rho(\mathrm{d}s)\geqslant c^{2}m\int_{0}^{1/r}r^{2}s^{2}\rho(\mathrm{d}s)=c^{2}mr^{2}\upsilon(r^{-1}).

This proves the first claim in Part (b). The second claim follows from the additional assumption. ∎
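To make part (b) concrete, the following Python sketch checks the elementary bound $1-\cos x\geqslant x^{2}/3$ for $|x|<1$ and evaluates the resulting lower bound for one illustrative radial measure. The choice $\rho(\mathrm{d}s)=s^{-1-\delta}\mathds{1}_{(0,1)}(s)\mathrm{d}s$ and all function names are our assumptions; for this $\rho$ one has $\upsilon(r)=r^{2-\delta}/(2-\delta)$ for $r\in(0,1]$, so that $c_{\delta}=1/(2-\delta)$.

```python
import math

def cosine_gap(x):
    """Return (1 - cos x) - x^2/3, expected to be >= 0 whenever |x| < 1."""
    return (1.0 - math.cos(x)) - x * x / 3.0

def upsilon_power_law(r, delta):
    """upsilon(r) for the illustrative rho(ds) = s^(-1-delta) ds on (0, 1), with r in (0, 1]."""
    return r ** (2.0 - delta) / (2.0 - delta)

def lower_bound(r, delta, c, m):
    """Right-hand side (c^2 m / 3) r^2 upsilon(1/r) of the bound in part (b), for r >= 1."""
    return c * c * m / 3.0 * r * r * upsilon_power_law(1.0 / r, delta)
```

For this power-law example `lower_bound(r, delta, c, m)` equals $c^{2}m\,r^{\delta}/(3(2-\delta))$, matching the polynomial growth stated in the second claim of part (b).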

Lemma 6.6.

Let 𝐗\bm{X} be a Lévy process on d\mathbb{R}^{d} with the characteristic triplet (𝛄𝐗,𝚺𝐗𝚺𝐗,ν𝐗)(\bm{\gamma_{X}},\bm{\Sigma_{X}}\bm{\Sigma_{X}}^{\scalebox{0.6}{$\top$}},\nu_{\bm{X}}). Let 𝐗t=(𝐗st/t)s[0,1]\bm{X}^{t}=(\bm{X}_{st}/\sqrt{t})_{s\in[0,1]} for t(0,1]t\in(0,1]. Moreover, let 𝚺𝐗𝐁\bm{\Sigma_{X}}\bm{B} be the Gaussian component of 𝐗\bm{X} in its Lévy–Itô decomposition (3) and define 𝐒𝐗𝚺𝐗𝐁\bm{S}\coloneqq\bm{X}-\bm{\Sigma_{X}}\bm{B} with Lévy–Khintchine exponent ψ𝐒\psi_{\bm{S}}.
(a) Pick any 𝐮𝟎d\bm{u}_{*}\in\mathbb{R}^{d}_{\bm{0}} and define C|𝐮|1|ψ𝐒1(𝐮)|>0C_{*}\coloneqq|\bm{u}_{*}|^{-1}|\psi_{\bm{S}_{1}}(\bm{u}_{*})|>0. Then, we have

𝒲1(𝑿1t,𝚺𝑿𝑩1)Ct+𝒪(t3/2),as t0.\displaystyle\mathcal{W}_{1}(\bm{X}^{t}_{1},\bm{\Sigma}_{\bm{X}}\bm{B}_{1})\geqslant C_{*}\sqrt{t}+\mathcal{O}(t^{3/2}),\quad\text{as }t\downarrow 0.

(b) Let λ\lambda be the largest eigenvalue of 𝚺𝐗𝚺𝐗\bm{\Sigma}_{\bm{X}}\bm{\Sigma}^{\scalebox{0.6}{$\top$}}_{\bm{X}}. Suppose there exist δ[1,2)\delta\in[1,2) and vectors (𝐮r)r(0,)(\bm{u}_{r})_{r\in(0,\infty)} on 𝟎d\mathbb{R}^{d}_{\bm{0}} satisfying |𝐮r|=r|\bm{u}_{r}|=r and c:=infr>1rδ|ψ𝐒(𝐮r)|>0c:=\inf_{r>1}r^{-\delta}|\psi_{\bm{S}}(\bm{u}_{r})|>0. Then for any C(0,ceλ/2)C_{*}\in(0,ce^{-\lambda/2}) we have

𝒲1(𝑿1t,𝚺𝑿𝑩1)Ct1δ/2for all sufficiently small t>0.\mathcal{W}_{1}(\bm{X}^{t}_{1},\bm{\Sigma}_{\bm{X}}\bm{B}_{1})\geqslant C_{*}t^{1-\delta/2}\qquad\text{for all sufficiently small }t>0.
Proof.

Since 𝑩\bm{B} and 𝑺\bm{S} are independent, we have φ𝑿1t(𝒖)=φ𝚺𝑿𝑩1(𝒖)φ𝑺t(𝒖/t)\varphi_{\bm{X}^{t}_{1}}(\bm{u})=\varphi_{\bm{\Sigma}_{\bm{X}}\bm{B}_{1}}(\bm{u})\varphi_{\bm{S}_{t}}(\bm{u}/\sqrt{t}). Hence, Lemma 6.3 gives, for any 𝒖𝟎d\bm{u}\in\mathbb{R}^{d}_{\bm{0}},

(42) 𝒲1(𝑿1t,𝚺𝑿𝑩1)T1(𝑿1t,𝚺𝑿𝑩1)1|𝒖||φ𝚺𝑿𝑩1(𝒖)||φ𝑺t(𝒖/t)1|,\mathcal{W}_{1}(\bm{X}^{t}_{1},\bm{\Sigma}_{\bm{X}}\bm{B}_{1})\geqslant T_{1}(\bm{X}^{t}_{1},\bm{\Sigma}_{\bm{X}}\bm{B}_{1})\geqslant\frac{1}{|\bm{u}|}|\varphi_{\bm{\Sigma}_{\bm{X}}\bm{B}_{1}}(\bm{u})|\cdot|\varphi_{\bm{S}_{t}}(\bm{u}/\sqrt{t})-1|,

(a) Let 𝒖d\bm{u}_{*}\in\mathbb{R}^{d} be as in the statement. The result then follows from Lemma 6.4.

(b) The proof follows as in that of Lemma 6.4. The main idea is to use the fact that, if at0a_{t}\to 0 as t0t\downarrow 0 and supt>0|bt|<\sup_{t>0}|b_{t}|<\infty, then

|eat+btebt|=|eat1||ebt||at|infs>0|ebs|+𝒪(|at|2),as t0.|e^{a_{t}+b_{t}}-e^{b_{t}}|=|e^{a_{t}}-1|\cdot|e^{b_{t}}|\geqslant|a_{t}|\cdot\inf_{s>0}|e^{b_{s}}|+\mathcal{O}(|a_{t}|^{2}),\quad\text{as }t\downarrow 0.

Set bt=(t/2)𝒖t1/2𝚺𝑿2𝒖t1/2b_{t}=-(t/2)\bm{u}_{t^{-1/2}}^{\scalebox{0.6}{$\top$}}\bm{\Sigma}^{2}_{\bm{X}}\bm{u}_{t^{-1/2}} and at=tψ𝑺(𝒖t1/2)a_{t}=t\psi_{\bm{S}}(\bm{u}_{t^{-1/2}}) for t>0t>0. Note that bt=𝒪(1)b_{t}=\mathcal{O}(1) as t0t\downarrow 0 and |ebt|eλ/2|e^{b_{t}}|\geqslant e^{-\lambda/2}. Since 𝑺\bm{S} does not have a Brownian component, we have lim|𝒖||𝒖|2|ψ𝑺(𝒖)|=0\lim_{|\bm{u}|\to\infty}|\bm{u}|^{-2}\cdot|\psi_{\bm{S}}(\bm{u})|=0 (see, e.g. [40, Lem. 43.11]) and thus at0a_{t}\to 0 as t0t\downarrow 0. Moreover, |at|ct1δ/2|a_{t}|\geqslant ct^{1-\delta/2} for t<1t<1 by assumption. Thus, applying (42) with 𝒖=t𝒖t1/2\bm{u}=\sqrt{t}\bm{u}_{t^{-1/2}} yields Part (b). ∎

Example 6.7.

Consider an example inspired by [26, Ex. 4.2.1]. Let SS be a real-valued martingale Lévy process with Lévy measure ν(dy)=py3log1p(y)𝟙(0,1)(y)dy\nu(\mathrm{d}y)=py^{-3}\log^{-1-p}(y)\mathds{1}_{(0,1)}(y)\mathrm{d}y for some p>0p>0 and all yy\in\mathbb{R}. Let B=(Bs)s[0,1]B=(B_{s})_{s\in[0,1]} be a standard Brownian motion independent of SS and let σ2>0\sigma^{2}>0. Define Xt(Sst/t+σBst/t)s[0,1]X^{t}\coloneqq(S_{st}/\sqrt{t}+\sigma B_{st}/\sqrt{t})_{s\in[0,1]} for t>0t>0. Choose gg to satisfy g(t)2log(1/g(t))ptg(t)^{2}\log(1/g(t))^{p}\sim t as t0t\downarrow 0. The function gg is regularly varying at 0 with index 1/21/2, and therefore g(t)t/log(1/t)pg(t)\sim\sqrt{t/\log(1/t)^{p}} as t0t\downarrow 0 by [6, Thm 1.5.12]. Note that St/g(t)S_{t}/g(t) converges in law to a standard normal distribution by [26, Thm 2(i)]. Hence, the Lévy–Khintchine exponent ψS\psi_{S} of SS satisfies tψS(u/g(t))u2/2t\psi_{S}(u/g(t))\to-u^{2}/2 as t0t\downarrow 0 for any uu\in\mathbb{R}. In particular, by taking t=g1(s)slog(1/s)pt=g^{-1}(\sqrt{s})\sim s\log(1/s)^{p}, which tends to 0 as s0s\downarrow 0, we obtain slog(1/s)pψS(1/s)1/2s\log(1/s)^{p}\psi_{S}(1/\sqrt{s})\to-1/2 as s0s\downarrow 0. Then, a slight modification of the argument in the proof of Lemma 6.6(b) shows that lim inft0log(1/t)p𝒲1(X1t,σB1)>0\liminf_{t\downarrow 0}\log(1/t)^{p}\mathcal{W}_{1}(X^{t}_{1},\sigma B_{1})>0. ∎

7. Proofs of Section 2

In this section we give the proofs of the results stated in Section 2.

Proof of Theorem 2.1.

Part (a). Recall from Assumption (T) and Remark 5.4 that we may decompose $\bm{X}$ as the sum $\bm{S}+\bm{R}$ of independent Lévy processes $\bm{S}$ and $\bm{R}$ with generating triplets $(\bm{\gamma_{S}},\bm{0},\nu_{\bm{X}}^{\mathsf{c}})$ and $(\bm{\gamma_{R}},\bm{0},\nu_{\bm{X}}^{\mathrm{d}})$, respectively. For $t\in(0,1]$, denote $\bm{S}^{t}=(\bm{S}_{st}/t^{1/\alpha})_{s\in[0,1]}$ and $\bm{R}^{t}=(\bm{R}_{st}/t^{1/\alpha})_{s\in[0,1]}$. Let $\kappa(t)\coloneqq t^{r}$ for $t\in(0,1]$ and some $r\geqslant-1/\alpha$. Assume the processes $(\bm{D}^{\bm{S}^{t},\kappa(t)},\bm{D}^{\bm{Z},\kappa(t)},\bm{J}^{\bm{S}^{t},\kappa(t)},\bm{J}^{\bm{Z},\kappa(t)})$ are coupled as in Subsection 4.1 (i.e. (5) and (6)).

Note that $\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})\leqslant\mathcal{W}_{q}(\bm{S}^{t},\bm{Z})+\mathds{E}[\sup_{s\in[0,1]}|\bm{R}_{s}^{t}|^{q}]$ by the triangle inequality and the definition of $\mathcal{W}_{q}$. Theorem 5.5 (with $p=1$ and $q\in(0,1]\cap(0,\alpha)$) yields a bound on $\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})$ via (4) as follows:

\begin{align*}
&\mathds{E}\bigg[\sup_{s\in[0,1]}\big|\bm{R}_{st}/t^{1/\alpha}\big|^{q}\bigg]=\mathcal{O}\big(t^{1-q/\alpha}\big),\\
&\mathcal{W}_{q}\big(\bm{J}^{\bm{S}^{t},\kappa(t)},\bm{J}^{\bm{Z},\kappa(t)}\big)=\begin{dcases}\mathcal{O}\big(t^{1/\alpha+r(q+1-\alpha)}\big),&q<\alpha-1,\\ \mathcal{O}\big(t^{1-q/\alpha}(1+\log(1/t)\mathds{1}_{\{q=\alpha-1,\,r\neq-1/\alpha\}})\big),&q\geqslant\alpha-1,\end{dcases}\quad\text{by (28),}\\
&\mathcal{W}_{q}\big(\bm{D}^{\bm{S}^{t},\kappa(t)},\bm{D}^{\bm{Z},\kappa(t)}\big)\leqslant\mathcal{W}_{2}\big(\bm{D}^{\bm{S}^{t},\kappa(t)},\bm{D}^{\bm{Z},\kappa(t)}\big)^{q}=\mathcal{O}\big(t^{q/\alpha+rq(3-\alpha)}\big),\quad\text{by (22) and (29),}\\
&|\bm{\gamma}_{\bm{S}^{t},\kappa(t)}-\bm{\gamma}_{\bm{Z},\kappa(t)}|^{q}=\begin{cases}\mathcal{O}\big(t^{q/\alpha+rq(2-\alpha)}\big),&\alpha\in(0,1),\\ \mathcal{O}\big(t^{q(1-1/\alpha)}\big),&\alpha\in(1,2),\end{cases}\quad\text{by (30).}
\end{align*}

Part (a) can now be deduced by optimising rr as follows. If α<1\alpha<1 then q>0>α1q>0>\alpha-1 and taking rr sufficiently large makes all terms become 𝒪(t1q/α)\mathcal{O}(t^{1-q/\alpha}). If α>1\alpha>1 and q<α1q<\alpha-1, then the bounds are 𝒪(t1q/α)\mathcal{O}(t^{1-q/\alpha}), 𝒪(t1/α+r(q+1α))\mathcal{O}(t^{1/\alpha+r(q+1-\alpha)}), 𝒪(tq/α+rq(3α))\mathcal{O}(t^{q/\alpha+rq(3-\alpha)}) and 𝒪(tq(11/α))\mathcal{O}(t^{q(1-1/\alpha)}). Since q1q\leqslant 1 and 11/α1/α1-1/\alpha\leqslant 1/\alpha, all these bounds can be made 𝒪(tq(11/α))\mathcal{O}(t^{q(1-1/\alpha)}) by picking r=0r=0. If α>1\alpha>1 and qα1q\geqslant\alpha-1, then the bounds are 𝒪(t1q/α)\mathcal{O}(t^{1-q/\alpha}), 𝒪(t1q/α(1+log(1/t)𝟙{q=α1,r1/α}))\mathcal{O}(t^{1-q/\alpha}(1+\log(1/t)\mathds{1}_{\{q=\alpha-1,\,r\neq-1/\alpha\}})), 𝒪(tq/α+rq(3α))\mathcal{O}(t^{q/\alpha+rq(3-\alpha)}) and 𝒪(tq(11/α))\mathcal{O}(t^{q(1-1/\alpha)}) and, as before, these can all be made 𝒪(tq(11/α))\mathcal{O}(t^{q(1-1/\alpha)}) by picking r=0r=0. Note here that the logarithmic term never arises in the dominant term, as it would require 1=q=α11=q=\alpha-1, but α<2\alpha<2.
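The exponent bookkeeping for $\alpha\in(1,2)$ can be double-checked mechanically. The Python sketch below simply encodes the exponents of $t$ appearing in the four bounds above (ignoring the logarithmic factor) and returns the smallest one, which is the dominant term as $t\downarrow 0$; the function names are ours and the code is only a consistency check, not part of the proof.

```python
def bound_exponents(alpha, q, r):
    """Exponents of t in the four bounds for alpha in (1, 2); logarithmic factors are ignored."""
    residual = 1 - q / alpha                         # supremum of the rescaled process R
    if q < alpha - 1:
        jumps = 1 / alpha + r * (q + 1 - alpha)      # bound (28), case q < alpha - 1
    else:
        jumps = 1 - q / alpha                        # bound (28), case q >= alpha - 1
    small = q / alpha + r * q * (3 - alpha)          # bound (29)
    drift = q * (1 - 1 / alpha)                      # bound (30), alpha in (1, 2)
    return residual, jumps, small, drift

def dominant_exponent(alpha, q, r=0.0):
    """Smallest exponent, i.e. the slowest-decaying term as t -> 0."""
    return min(bound_exponents(alpha, q, r))
```

With $r=0$ and admissible pairs such as $(\alpha,q)=(1.5,0.3)$ or $(1.5,0.8)$, `dominant_exponent` agrees with $q(1-1/\alpha)$, in line with the choice made in the proof.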

Part (b). The claim follows from a direct application of Lemma 6.4. ∎

Proof of Theorem 2.8.

Proposition 5.15 gives Part (a). Parts (b) and (c) follow from Lemma 6.6. ∎

In preparation for the proofs of Theorem 2.3 and Corollary 2.4, we establish Lemmas 7.1, 7.2 and 7.3 about slowly varying functions.

Lemma 7.1.

Let \ell be C1C^{1} and slowly varying such that its derivative equals t~(t)/(c+t)t\mapsto\widetilde{\ell}(t)/(c+t) for some c0c\geqslant 0, where |~||\widetilde{\ell}| is positive and slowly varying at infinity. Then, for each x>0x>0, we have ((t)(xt))/~(t)logx(\ell(t)-\ell(xt))/\widetilde{\ell}(t)\to-\log x as tt\to\infty.

For $x>0$, define $L_{x}(t)\coloneqq 1-\ell(xt)/\ell(t)$ for all $t>0$. Then $|L_{x}(t)|\sim|\widetilde{\ell}(t)\log x|/\ell(t)$ as $t\to\infty$ and, if $x\neq 1$, the function $|L_{x}|$ is slowly varying at infinity. Moreover, for $x>0$, the function $\Sigma(x)\coloneqq\sup_{t>0,y\in[x\wedge 1,x\vee 1]}\widetilde{\ell}(yt)/\widetilde{\ell}(t)$ satisfies

|Lx(t)|=|(xt)(t)1||~(t)|(t)Σ(x)|logx| for all t,x>0.\big{|}L_{x}(t)\big{|}=\bigg{|}\frac{\ell(xt)}{\ell(t)}-1\bigg{|}\leqslant\frac{|\widetilde{\ell}(t)|}{\ell(t)}\cdot\Sigma(x)|\log x|\qquad\text{ for all }t,x>0.
Proof.

Since |~||\widetilde{\ell}| is positive and ~\widetilde{\ell} is continuous, ~\widetilde{\ell} is either eventually positive or negative and either ~\widetilde{\ell} or ~-\widetilde{\ell} is slowly varying at infinity, respectively. By [6, Thm 1.2.1], we have supy[a,b]|~(yt)/~(t)1|0\sup_{y\in[a,b]}|\widetilde{\ell}(yt)/\widetilde{\ell}(t)-1|\to 0 as tt\to\infty for any 0<a<b<0<a<b<\infty. Thus for all sufficiently large t>0t>0 we have

~(ty)~(t)1c/t+y(1+supz[x1,x1]|~(zt)/~(t)1|)1y2yfor all y[x1,x1].\frac{\widetilde{\ell}(ty)}{\widetilde{\ell}(t)}\frac{1}{c/t+y}\leqslant\big{(}1+\sup_{z\in[x\wedge 1,x\vee 1]}|\widetilde{\ell}(zt)/\widetilde{\ell}(t)-1|\big{)}\frac{1}{y}\leqslant\frac{2}{y}\qquad\text{for all $y\in[x\wedge 1,x\vee 1]$.}

The dominated convergence theorem now yields

(t)(xt)~(t)=1~(t)xtt~(y)dyc+y=x1~(ty)~(t)dyc/t+yx1dyy=logx,as t,\frac{\ell(t)-\ell(xt)}{\widetilde{\ell}(t)}=\frac{1}{\widetilde{\ell}(t)}\int_{xt}^{t}\widetilde{\ell}(y)\frac{\mathrm{d}y}{c+y}=\int_{x}^{1}\frac{\widetilde{\ell}(ty)}{\widetilde{\ell}(t)}\frac{\mathrm{d}y}{c/t+y}\to\int_{x}^{1}\frac{\mathrm{d}y}{y}=-\log x,\quad\text{as }t\to\infty,

which establishes the first claim. Since Lx(t)=(((t)(xt))/~(t))(~(t)/(t))L_{x}(t)=\big{(}(\ell(t)-\ell(xt))/\widetilde{\ell}(t)\big{)}\big{(}\widetilde{\ell}(t)/\ell(t)\big{)}, the function |Lx||L_{x}| is positive on a neighbourhood of infinity and asymptotically equivalent to |~(t)logx|/(t)|\widetilde{\ell}(t)\log x|/\ell(t) by the limit in the previous display. Moreover, since x>0x>0 in the limit was arbitrary, for any λ>0\lambda>0 we have

Lx(λt)Lx(t)=(t)(λt)~(λt)1((λt)(λtx))~(t)1((t)(xt))~(λt)~(t)1,as t,\frac{L_{x}(\lambda t)}{L_{x}(t)}=\frac{\ell(t)}{\ell(\lambda t)}\cdot\frac{\widetilde{\ell}(\lambda t)^{-1}(\ell(\lambda t)-\ell(\lambda tx))}{\widetilde{\ell}(t)^{-1}(\ell(t)-\ell(xt))}\cdot\frac{\widetilde{\ell}(\lambda t)}{\widetilde{\ell}(t)}\to 1,\quad\text{as }t\to\infty,

implying that |Lx||L_{x}| is slowly varying at infinity.

To establish the non-asymptotic inequality in the lemma, fix x>0x>0 and note that

|(xt)(t)||txt~(y)dyc+y||txt~(y)dyy|[x1,x1]|~(yt)|dyy|~(t)|Σ(x)|logx|.|\ell(xt)-\ell(t)|\leqslant\bigg{|}\int_{t}^{xt}\widetilde{\ell}(y)\frac{\mathrm{d}y}{c+y}\bigg{|}\leqslant\bigg{|}\int_{t}^{xt}\widetilde{\ell}(y)\frac{\mathrm{d}y}{y}\bigg{|}\leqslant\int_{[x\wedge 1,x\vee 1]}\big{|}\widetilde{\ell}(yt)\big{|}\frac{\mathrm{d}y}{y}\leqslant|\widetilde{\ell}(t)|\cdot\Sigma(x)|\log x|.\qed

Note that the assumption in Lemma 7.1 requires \ell to be eventually strictly monotone. Moreover, if \ell satisfies the conditions of Lemma 7.1, then so does q\ell^{q} for any q>0q>0.
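As a simple illustration of Lemma 7.1, consider $\ell(t)=\log(e+t)$, for which $\ell'(t)=1/(e+t)$, so that one may take $c=e$, $\widetilde{\ell}\equiv 1$ and $\Sigma\equiv 1$. The Python sketch below (our own example; the function names are illustrative) evaluates the normalised increment and the quantity $L_{x}(t)$ for this choice.

```python
import math

def ell(t):
    """Illustrative slowly varying function: ell(t) = log(e + t), so ell'(t) = 1/(e + t)."""
    return math.log(math.e + t)

def normalised_increment(t, x):
    """(ell(t) - ell(x t)) / ell_tilde(t) with ell_tilde = 1; tends to -log x as t -> infinity."""
    return ell(t) - ell(x * t)

def L(t, x):
    """L_x(t) = 1 - ell(x t) / ell(t)."""
    return 1.0 - ell(x * t) / ell(t)
```

For this example the non-asymptotic inequality of the lemma reads $|L_{x}(t)|\leqslant|\log x|/\log(e+t)$, which can be compared directly with the values returned by `L`.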

Lemma 7.2.

(a) Let \ell be slowly varying at infinity. Suppose that, for some λ(0,){1}\lambda\in(0,\infty)\setminus\{1\} and non-increasing function ϕλ\phi_{\lambda}, we have ϕλ(t)|1(λt)/(t)|\phi_{\lambda}(t)\geqslant|1-\ell(\lambda t)/\ell(t)| for all t1t\geqslant 1 and 1ϕλ(t)t1dt<\int_{1}^{\infty}\phi_{\lambda}(t)t^{-1}\mathrm{d}t<\infty. Then \ell has a positive finite limit at infinity.
(b) Let ϕ\phi be slowly varying at infinity with ϕ(t)0\phi(t)\to 0 as tt\to\infty and 1ϕ(t)t1dt=\int_{1}^{\infty}\phi(t)t^{-1}\mathrm{d}t=\infty. Then the functions ±(t)exp(±1tϕ(s)s1ds)\ell_{\pm}(t)\coloneqq\exp(\pm\int_{1}^{t}\phi(s)s^{-1}\mathrm{d}s) are slowly varying at infinity, +(t)\ell_{+}(t)\to\infty, (t)0\ell_{-}(t)\to 0 and |1±(λt)/±(t)||logλ|ϕ(t)|1-\ell_{\pm}(\lambda t)/\ell_{\pm}(t)|\sim|\log\lambda|\phi(t) as tt\to\infty for any λ>0\lambda>0.

Note that the smallest non-increasing function $\varphi_{\lambda}$ satisfying $\varphi_{\lambda}(t)\geqslant|1-\ell(\lambda t)/\ell(t)|$ for all $t\geqslant 1$ is given by $\varphi_{\lambda}(t)\coloneqq\sup_{s\geqslant t}|1-\ell(\lambda s)/\ell(s)|$.

Proof of Lemma 7.2.

Part (a). First assume $\lambda>1$. Define $U_{\lambda}(t)\coloneqq\sup_{x\in[1,\lambda]}|1-\ell(xt)/\ell(t)|$ and note that $U_{\lambda}(t)\to 0$ as $t\to\infty$ by the uniform convergence theorem [6, Thm 1.2.1]. As $\phi_{\lambda}$ is non-increasing and $\int_{1}^{\infty}\phi_{\lambda}(t)t^{-1}\mathrm{d}t<\infty$, we also have $\phi_{\lambda}(t)\to 0$ as $t\to\infty$, making $\eta\coloneqq\inf\{t\geqslant 1:\max\{\phi_{\lambda}(s),U_{\lambda}(s)\}<1/2\text{ for all }s\geqslant t\}$ finite. Since $1+U_{\lambda}(t)\geqslant\ell(xt)/\ell(t)\geqslant 1-U_{\lambda}(t)$ and $U_{\lambda}(t)\leqslant 1/2$ for all $t>\eta$ and $x\in[1,\lambda]$, we obtain

Uλ(t)log(1+Uλ(t))log((xt)(t))log(1Uλ(t))2Uλ(t), implying |log((xt)(t))|2Uλ(t) U_{\lambda}(t)\geqslant\log(1+U_{\lambda}(t))\geqslant\log\bigg{(}\frac{\ell(xt)}{\ell(t)}\bigg{)}\geqslant\log(1-U_{\lambda}(t))\geqslant-2U_{\lambda}(t),\text{ implying $\bigg{|}\log\bigg{(}\frac{\ell(xt)}{\ell(t)}\bigg{)}\bigg{|}\leqslant 2U_{\lambda}(t)$ }

for all t>ηt>\eta and x[1,λ]x\in[1,\lambda]. Similarly, for t>ηt>\eta we have |log((λt)/(t))|2ϕλ(t)|\log(\ell(\lambda t)/\ell(t))|\leqslant 2\phi_{\lambda}(t). For any T>tηT>t\geqslant\eta^{\prime} set nlog(T/t)/logλn\coloneqq\lfloor\log(T/t)/\log\lambda\rfloor, implying T/(λnt)[1,λ)T/(\lambda^{n}t)\in[1,\lambda). By the monotonicity of ϕλ\phi_{\lambda} we obtain

|log(T)log(t)|\displaystyle|\log\ell(T)-\log\ell(t)| |log((T)(λnt))|+k=1n|log((λkt)(λk1t))|\displaystyle\leqslant\bigg{|}\log\bigg{(}\frac{\ell(T)}{\ell(\lambda^{n}t)}\bigg{)}\bigg{|}+\sum_{k=1}^{n}\bigg{|}\log\bigg{(}\frac{\ell(\lambda^{k}t)}{\ell(\lambda^{k-1}t)}\bigg{)}\bigg{|}
2Uλ(λnt)+k=1n2ϕλ(λk1t)\displaystyle\leqslant 2U_{\lambda}(\lambda^{n}t)+\sum_{k=1}^{n}2\phi_{\lambda}\big{(}\lambda^{k-1}t\big{)}
2Uλ(λnt)+k=1n2logλλk2tλk1tϕλ(s)dss\displaystyle\leqslant 2U_{\lambda}(\lambda^{n}t)+\sum_{k=1}^{n}\frac{2}{\log\lambda}\int_{\lambda^{k-2}t}^{\lambda^{k-1}t}\phi_{\lambda}\big{(}s\big{)}\frac{\mathrm{d}s}{s}
(43) 2Uλ(λnt)+2logλλ1tϕλ(s)dsst0 (uniformly in T[t,)).\displaystyle\leqslant 2U_{\lambda}(\lambda^{n}t)+\frac{2}{\log\lambda}\int_{\lambda^{-1}t}^{\infty}\phi_{\lambda}\big{(}s\big{)}\frac{\mathrm{d}s}{s}\overset{t\to\infty}{\longrightarrow}0\qquad\text{ (uniformly in $T\in[t,\infty)$).}

If we had lim suptlog(t)>lim inftlog(t)\limsup_{t\to\infty}\log\ell(t)>\liminf_{t\to\infty}\log\ell(t), there would exist an increasing sequence (tk)k(t_{k})_{k\in\mathbb{N}} and ϵ>0\epsilon>0 such that tkt_{k}\to\infty and |log(tk+1)log(tk)|>ϵ|\log\ell(t_{k+1})-\log\ell(t_{k})|>\epsilon for all kk\in\mathbb{N}, contradicting (43). Hence the limit limtlog(t)\lim_{t\to\infty}\log\ell(t) exists. By taking the limit as TT\to\infty on the left-hand side of (43) for any fixed tt, it follows that limt|log(t)|\lim_{t\to\infty}|\log\ell(t)|\neq\infty. Thus \ell has a finite and positive limit. The case λ<1\lambda<1 can be established in a similar way.

Part (b). The statement follows from a direct application of Lemma 7.1. ∎

Lemma 7.3.

Define iteratively the functions 1(t)=log(e+t)\ell_{1}(t)=\log(e+t) and n+1(t)=log(e+n(t))\ell_{n+1}(t)=\log(e+\ell_{n}(t)) for t0t\geqslant 0 and nn\in\mathbb{N}. Then, the following statements hold.
(a) For any x>0x>0 and c0c\geqslant 0, we have (c+n(xt))/(c+n(t))1+𝟙{x>1}logx(c+\ell_{n}(xt))/(c+\ell_{n}(t))\leqslant 1+\mathds{1}_{\{x>1\}}\log x.
(b) We have $(e+t)\ell_{n}^{\prime}(t)=\prod_{k=1}^{n-1}(e+\ell_{k}(t))^{-1}$.
(c) Suppose (t)=n(t)qnm(t)qm\ell(t)=\ell_{n}(t)^{q_{n}}\cdots\ell_{m}(t)^{q_{m}} for some 1nm1\leqslant n\leqslant m in \mathbb{N} and either qn,,qm0q_{n},\ldots,q_{m}\geqslant 0 with qn,qm>0q_{n},q_{m}>0 or qn,,qm0q_{n},\ldots,q_{m}\leqslant 0 with qn,qm<0q_{n},q_{m}<0. Set ~(t)(e+t)(t)\widetilde{\ell}(t)\coloneqq(e+t)\ell^{\prime}(t), then we have

Σ(x)supt>0,y[x1,x1]~(yt)~(t){(1+logx)j=nmqj+,x1,(1+|logx|)m+j=nmqj,x<1.\Sigma(x)\coloneqq\sup_{t>0,\,y\in[x\wedge 1,x\vee 1]}\frac{\widetilde{\ell}(yt)}{\widetilde{\ell}(t)}\leqslant\begin{cases}(1+\log x)^{\sum_{j=n}^{m}q_{j}^{+}},&x\geqslant 1,\\ (1+|\log x|)^{m+\sum_{j=n}^{m}q_{j}^{-}},&x<1.\end{cases}
Proof.

For x<1x<1 we have n(xt)n(t)\ell_{n}(xt)\leqslant\ell_{n}(t). For x>1x>1, we have

n(xt)n(t)=txt1k=1n1(e+k(s))dse+s1k=1n1(e+k(t))txtdss=logxk=1n1(e+k(t)).\ell_{n}(xt)-\ell_{n}(t)=\int_{t}^{xt}\frac{1}{\prod_{k=1}^{n-1}(e+\ell_{k}(s))}\frac{\mathrm{d}s}{e+s}\leqslant\frac{1}{\prod_{k=1}^{n-1}(e+\ell_{k}(t))}\int_{t}^{xt}\frac{\mathrm{d}s}{s}=\frac{\log x}{\prod_{k=1}^{n-1}(e+\ell_{k}(t))}.

In particular, we may add c0c\geqslant 0 to n(xt)\ell_{n}(xt) and n(t)\ell_{n}(t) and divide by c+n(t)c+\ell_{n}(t) to obtain

c+n(xt)c+n(t)1+logx(c+n(t))k=1n1(e+k(t))1+logx,n,x>1,t>0,\frac{c+\ell_{n}(xt)}{c+\ell_{n}(t)}\leqslant 1+\frac{\log x}{(c+\ell_{n}(t))\prod_{k=1}^{n-1}(e+\ell_{k}(t))}\leqslant 1+\log x,\quad n\in\mathbb{N},\,x>1,\,t>0,

implying Part (a). Part (b) is obvious, so we need only establish Part (c).
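Part (a) lends itself to a quick numerical check. The following Python sketch (with our own helper names `ell_n` and `ratio_gap`, and an arbitrary test grid) evaluates the iterated logarithms and the slack in the inequality $(c+\ell_{n}(xt))/(c+\ell_{n}(t))\leqslant 1+\mathds{1}_{\{x>1\}}\log x$.

```python
import math

def ell_n(n, t):
    """Iterated logarithm: ell_1(t) = log(e + t), ell_{k+1}(t) = log(e + ell_k(t))."""
    value = t
    for _ in range(n):
        value = math.log(math.e + value)
    return value

def ratio_gap(n, x, t, c=0.0):
    """Return 1 + 1{x > 1} log x - (c + ell_n(x t)) / (c + ell_n(t)), expected to be >= 0."""
    bound = 1.0 + (math.log(x) if x > 1 else 0.0)
    return bound - (c + ell_n(n, x * t)) / (c + ell_n(n, t))
```

Nonnegative values of `ratio_gap` over a range of $n$, $x$, $t$ and $c$ are consistent with Part (a); the slack is typically large for $n\geqslant 2$, since the product in the denominator of the preceding display is then at least $e+1$.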

It is simple to show that for a1,a2,b1,b2>0a_{1},a_{2},b_{1},b_{2}>0, the fraction (a1+a2)/(b1+b2)(a_{1}+a_{2})/(b_{1}+b_{2}) lies between a1/b1a_{1}/b_{1} and a2/b2a_{2}/b_{2}. An inductive argument implies that for any a1,,ak,b1,,bk>0a_{1},\ldots,a_{k},b_{1},\ldots,b_{k}>0, we have

minj{1,,k}ajbjj=1kajj=1kbjmaxj{1,,k}ajbj.\min_{j\in\{1,\ldots,k\}}\frac{a_{j}}{b_{j}}\leqslant\frac{\sum_{j=1}^{k}a_{j}}{\sum_{j=1}^{k}b_{j}}\leqslant\max_{j\in\{1,\ldots,k\}}\frac{a_{j}}{b_{j}}.
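The sandwich inequality for ratios of sums in the preceding display is elementary; a minimal Python sketch (the function name `mediant_sandwich` is ours, for illustration only) makes it concrete.

```python
def mediant_sandwich(a, b):
    """Return (min a_j/b_j, sum(a)/sum(b), max a_j/b_j) for positive sequences a and b."""
    ratios = [x / y for x, y in zip(a, b)]
    return min(ratios), sum(a) / sum(b), max(ratios)
```

For instance, with $a=(1,2,3)$ and $b=(4,5,6)$ the three returned values are the smallest ratio $1/4$, the ratio of sums $6/15$ and the largest ratio $1/2$.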

Thus, by virtue of Parts (a) and (b) and denoting ~j(t)(e+t)j(t)\widetilde{\ell}_{j}(t)\coloneqq(e+t)\ell_{j}^{\prime}(t), we have

Σ(x)\displaystyle\Sigma(x) =supt>0,y[x1,x1](yt)j=nm|qj|~j(yt)/j(yt)(t)j=nm|qj|~j(t)/j(t)supt>0,y[x1,x1]maxj{n,,m},qj0(yt)~j(yt)/j(yt)(t)~j(t)/j(t)\displaystyle=\sup_{t>0,\,y\in[x\wedge 1,x\vee 1]}\frac{\ell(yt)\sum_{j=n}^{m}|q_{j}|\widetilde{\ell}_{j}(yt)/\ell_{j}(yt)}{\ell(t)\sum_{j=n}^{m}|q_{j}|\widetilde{\ell}_{j}(t)/\ell_{j}(t)}\leqslant\sup_{t>0,\,y\in[x\wedge 1,x\vee 1]}\max_{j\in\{n,\ldots,m\},\,q_{j}\neq 0}\frac{\ell(yt)\widetilde{\ell}_{j}(yt)/\ell_{j}(yt)}{\ell(t)\widetilde{\ell}_{j}(t)/\ell_{j}(t)}
=supt>0,y[x1,x1]maxj{n,,m},qj0i=1j1e+i(t)e+i(yt)i=nm(i(yt)i(t))qi𝟙{i=j}\displaystyle=\sup_{t>0,\,y\in[x\wedge 1,x\vee 1]}\max_{j\in\{n,\ldots,m\},\,q_{j}\neq 0}\prod_{i=1}^{j-1}\frac{e+\ell_{i}(t)}{e+\ell_{i}(yt)}\cdot\prod_{i=n}^{m}\bigg{(}\frac{\ell_{i}(yt)}{\ell_{i}(t)}\bigg{)}^{q_{i}-\mathds{1}_{\{i=j\}}}
{(1+logx)j=nmqj+,x1,(1+|logx|)m+j=nmqj,x<1.\displaystyle\leqslant\begin{cases}(1+\log x)^{\sum_{j=n}^{m}q_{j}^{+}},&x\geqslant 1,\\ (1+|\log x|)^{m+\sum_{j=n}^{m}q_{j}^{-}},&x<1.\end{cases}\qed
Proof of Theorem 2.3.

Part (a). Recall from Remark 5.8 that we can decompose $\bm{X}$ as the sum $\bm{S}+\bm{R}$ of the independent processes $\bm{S}$ and $\bm{R}$ with generating triplets $(\bm{\gamma_{S}},\bm{0},\nu_{\bm{X}}^{\mathsf{c}})$ and $(\bm{\gamma_{R}},\bm{0},\nu_{\bm{X}}^{\mathrm{d}})$, respectively. For $t\in(0,1]$, let $\bm{S}^{t}=(\bm{S}_{st}/g(t))_{s\in[0,1]}$ and $\bm{R}^{t}=(\bm{R}_{st}/g(t))_{s\in[0,1]}$. Assume that $(\bm{M}^{\bm{S}^{t}},\bm{M^{Z}},\bm{L}^{\bm{S}^{t}},\bm{L^{Z}})$ is coupled as in (14) and (15).

Note that $\mathcal{W}_{q}(\bm{X}^{t},\bm{Z})\leqslant\mathcal{W}_{q}(\bm{S}^{t},\bm{Z})+\mathds{E}[\sup_{s\in[0,1]}|\bm{R}_{s}^{t}|^{q}]$ by the triangle inequality. Next, we apply (4) with $p=1$ to $\mathcal{W}_{q}(\bm{S}^{t},\bm{Z})$ and use Theorem 5.9 and Potter’s bounds [6, Thm 1.5.6] (applied to the slowly varying function $G_{2}$) to show that each resulting term is $\mathcal{O}(G_{2}(t)^{q})$:

  • $\mathcal{W}_{q}\big(\bm{M}^{\bm{S}^{t}},\bm{M^{Z}}\big)\leqslant\mathcal{W}_{2}\big(\bm{M}^{\bm{S}^{t}},\bm{M^{Z}}\big)^{q}$ by (22), and (35) then yields the bound;

  • $\mathcal{W}_{q}\big(\bm{L}^{\bm{S}^{t}},\bm{L}^{\bm{Z}}\big)$ is bounded by (36);

  • $|\bm{\varpi}_{\bm{S}^{t}}-\bm{\varpi}_{\bm{Z}}|^{q}$ is bounded by (37).

Similarly, by Potter’s bounds and Theorem 5.9, $\mathds{E}[\sup_{s\in[0,1]}|\bm{R}_{s}^{t}|^{q}]=\mathcal{O}(t^{1-q/\alpha}G(1/t)^{-q})=\mathcal{O}(G_{2}(t)^{q})$.

Part (b). Recall $a(t)=G(1/(2t))/G(1/t)\to 1$ as $t\to 0$. Directly from Proposition 6.1, we see that

\[3\max\{\mathcal{W}_{q}(\bm{X}^{t}_{1},\bm{Z}_{1}),\mathcal{W}_{q}(\bm{X}^{2t}_{1},\bm{Z}_{1})\}\geqslant|1-a(t)^{q}|\mathds{E}[|\bm{Z}_{1}|^{q}]\quad\text{for all sufficiently small $t>0$,}\]

yielding the first claim of part (b). The second claim follows from Lemma 7.2 since $G$ is assumed not to have a positive finite limit. ∎
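To get a feel for how slowly the factor $|1-a(t)^{q}|$ in the lower bound can decay, the sketch below evaluates it for the purely illustrative choice $G(x)=\log x$ (an assumption made here for the example, not a case singled out in the proof). For this $G$, $a(t)=1-\log 2/\log(1/t)$, so $|1-a(t)^{q}|$ behaves like $q\log 2/\log(1/t)$ as $t\downarrow 0$.

```python
import math


def a(t, G=math.log):
    """a(t) = G(1/(2t)) / G(1/t); G = log is an illustrative assumption."""
    return G(1.0 / (2.0 * t)) / G(1.0 / t)


def lower_bound_factor(t, q, G=math.log):
    """The factor |1 - a(t)^q| appearing in the lower bound of Part (b)."""
    return abs(1.0 - a(t, G) ** q)


def normalised_factor(t, q):
    """Factor divided by its heuristic leading term q*log(2)/log(1/t), valid for G = log."""
    return lower_bound_factor(t, q) * math.log(1.0 / t) / (q * math.log(2.0))


q = 0.5  # illustrative value of q
table = [(t, lower_bound_factor(t, q), normalised_factor(t, q))
         for t in (1e-3, 1e-6, 1e-12, 1e-100)]
```

Under this assumption the normalised factor approaches $1$, so the decay is logarithmic in $1/t$ rather than polynomial in $t$.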

Proof of Corollary 2.4.

Part (b) follows from Lemma 7.3 above. Given Theorem 2.3, it suffices to show that the assumptions in Corollary 2.4(a) imply those of Theorem 2.3 and that the upper and lower bounds have the desired form. These facts follow from Lemmas 7.1–7.3. Indeed, for instance, the function $G_{1}$ in Assumption (S) is given by the upper bound on $x\mapsto\Sigma(x)|\log x|$ given in Lemma 7.3, where $\Sigma$ is as in Lemma 7.1. ∎

8. Concluding remarks

Over small time horizons, a Lévy process may be attracted to an $\alpha$-stable process with heavy tails (i.e. $\alpha\in(0,2)$) or Brownian motion (i.e. $\alpha=2$). In this paper, we established upper and lower bounds on the rate of convergence in $L^{q}$-Wasserstein distance in both regimes, as listed below.

  • For $\alpha\in(0,2]\setminus\{1\}$ and processes in the domain of non-normal attraction, the Wasserstein distance is bounded above and below by slowly varying functions (see Theorem 2.3), both of which are slower than any power of logarithm greater than $1$.

  • For $\alpha\in(0,2)\setminus\{1\}$ and processes in the domain of normal attraction, we establish upper and lower bounds that are polynomial in $t$ (see Theorem 2.1). The established bounds are often rate optimal in $L^{q}$-Wasserstein distance for $q<\alpha<1$ or $q=1<\alpha$ and proportional to $t^{1-q/\alpha}$.

  • For $\alpha=2$ (i.e. Brownian limit) and processes in the domain of normal attraction, the established upper and lower bounds are also polynomial in $t$ (see Theorem 2.8). In this case, the bounds are rate optimal when the Blumenthal–Getoor index $\beta\leqslant 1$ and otherwise there is a polynomial gap. This suggests at least one of the bounds is not sharp. Establishing sharper bounds in this special case is nontrivial (as classical tools such as the Berry–Esseen theorem fail to provide converging bounds) and is therefore left for future work.

The process $\bm{R}$ in Assumption (T) (resp. (C)) in the thinning (resp. comonotonic) coupling is assumed to have finitely many jumps on compact time intervals. Our results can be extended to the case where this process has infinitely many jumps on compact time intervals and a Blumenthal–Getoor index $\beta<\alpha$. For such an extension, the moments of $\sup_{s\in[0,1]}|\bm{R}_{s}^{t}|$, as a function of $t$, can be controlled via Lemma 5.2 and would result in worse convergence rates as $\beta\uparrow\alpha$. We chose not to include this simple extension as the convergence rates would be much harder to express in terms of all the model parameters, resulting in a less concise presentation of our results.

The tools developed in Section 4 could be used for the omitted case $\alpha=1$ to establish upper and lower bounds on the Wasserstein distance in the domains of normal and non-normal attraction. However, as multiple cases would arise, requiring careful treatment of the emerging slowly varying functions, we leave such an extension for future work.

The upper bounds in the heavy-tailed case $\alpha\in(0,2)\setminus\{1\}$ are based on two distinct couplings introduced in Section 4: the comonotonic and thinning couplings. We mention briefly that it is possible to combine both couplings. Consider two Lévy processes $\bm{X}$ and $\bm{Y}$ with Lévy measures $f_{\bm{X}}\mathrm{d}\mu$ and $f_{\bm{Y}}\mathrm{d}\mu$ where $0\leqslant f_{\bm{X}}\leqslant 1$ and $0\leqslant f_{\bm{Y}}\leqslant 1$ are measurable and $\mu$ is a Lévy measure. It is then possible to first apply the thinning coupling to synchronise the jumps arising from the Lévy measure $g\mathrm{d}\mu$, where $g\coloneqq\min\{f_{\bm{X}},f_{\bm{Y}}\}$, and then apply the comonotonic coupling to the remaining jumps of $\bm{X}$ and $\bm{Y}$ with corresponding Lévy measures $(f_{\bm{X}}-g)\mathrm{d}\mu$ and $(f_{\bm{Y}}-g)\mathrm{d}\mu$. It appears, however, that this combined coupling does not yield improved rates of convergence to heavy-tailed stable limits as each coupling already attains optimal rates of convergence in most cases. We expect such a combined coupling to reduce the $L^{p}$-distance between coupled processes by a constant factor.
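The density splitting behind the combined coupling can be illustrated as follows. This is only a sketch of the decomposition $f_{\bm{X}}=g+(f_{\bm{X}}-g)$ and $f_{\bm{Y}}=g+(f_{\bm{Y}}-g)$ with $g=\min\{f_{\bm{X}},f_{\bm{Y}}\}$, not of the coupling construction itself; the one-dimensional densities `f_X` and `f_Y` below are hypothetical examples taking values in $[0,1]$.

```python
import math


def f_X(w):
    """Hypothetical density of nu_X with respect to mu, with values in (0, 1]."""
    return math.exp(-w)


def f_Y(w):
    """Hypothetical density of nu_Y with respect to mu, with values in (0, 1]."""
    return 1.0 / (1.0 + w * w)


def split_densities(w):
    """Return (g, f_X - g, f_Y - g) at the point w, where g = min(f_X, f_Y)."""
    fx, fy = f_X(w), f_Y(w)
    g = min(fx, fy)
    return g, fx - g, fy - g


# Evaluate the split on an illustrative grid of jump sizes w > 0.
grid = [0.05 * k for k in range(1, 201)]
pieces = [split_densities(w) for w in grid]
```

At every point at least one of the two residual densities vanishes, so the jumps left over for the comonotonic step come from Lévy measures whose densities have disjoint supports.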

When $\alpha\in(0,2)$, the $\alpha$-stable limit has heavy tails and its $q$-moment is finite if and only if $q<\alpha$, making it impossible to obtain general converging bounds in the $L^{2}$-Wasserstein distance, which play a key role in various applications including Multilevel Monte Carlo. However, substituting the standard Euclidean metric on $\mathbb{R}^{d}$ with an equivalent bounded metric would remove this obstruction. We expect our couplings to perform well and have a fast converging $L^{2}$-Wasserstein distance under the bounded metric on $\mathbb{R}^{d}$. Such an extension of our results is also left for future work.

The present work focused on the small-time stable domain of attraction where the small jumps of the process dominate the activity. Finally, we remark that it is natural to expect that the couplings developed in this paper could also typically achieve asymptotically optimal convergence rates in the Wasserstein distance in the scaling limits of the long-time stable domain of attraction. This is because in the long-time horizon regime the activity in the limit is dominated by the large jumps of the Lévy process, which are also efficiently coupled under the couplings of Section 4 above.

Acknowledgements

JGC and AM were supported by EPSRC grant EP/V009478/1 and The Alan Turing Institute under the EPSRC grant EP/N510129/1; AM was also supported by EPSRC grant EP/W006227/1; DKB was funded by the CDT in Mathematics and Statistics at The University of Warwick and is supported by AUFF NOVA grant AUFF-E-2022-9-39. The authors would like to thank the Isaac Newton Institute for Mathematical Sciences, Cambridge, for support during the INI satellite programme Heavy tails in machine learning, hosted by The Alan Turing Institute, London, and the INI programme Stochastic systems for anomalous diffusion hosted at INI in Cambridge, where work on this paper was undertaken. This work was supported by EPSRC grant EP/R014604/1.

References

  • [1] S. Asmussen and J. Ivanovs. Discretization error for a two-sided reflected Lévy process. Queueing Syst., 89(1-2):199–212, 2018.
  • [2] S. Asmussen and J. Rosiński. Approximations of small jumps of Lévy processes with a view towards simulation. J. Appl. Probab., 38(2):482–493, 2001.
  • [3] G. Auricchio, A. Codegoni, S. Gualandi, G. Toscani, and M. Veneroni. The equivalence of Fourier-based and Wasserstein metrics on imaging problems. Atti Accad. Naz. Lincei Rend. Lincei Mat. Appl., 31(3):627–649, 2020.
  • [4] D. Bang, J. González Cázares, and A. Mijatović. A Gaussian approximation theorem for Lévy processes. Statist. Probab. Lett., 178:Paper No. 109187, 4, 2021.
  • [5] J. Bertoin. Lévy processes, volume 121 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 1996.
  • [6] N. H. Bingham, C. M. Goldie, and J. L. Teugels. Regular variation, volume 27 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1989.
  • [7] K. Bisewski and J. Ivanovs. Zooming-in on a Lévy process: failure to observe threshold exceedance over a dense grid. Electron. J. Probab., 25:Paper No. 113, 33, 2020.
  • [8] J. Blanchet and K. Murthy. Quantifying distributional model risk via optimal transport. Math. Oper. Res., 44(2):565–600, 2019.
  • [9] C. Börgers and C. Greengard. Slow convergence in generalized central limit theorems. C. R. Math. Acad. Sci. Paris, 356(6):679–685, 2018.
  • [10] M. Broadie, P. Glasserman, and S. Kou. A continuity correction for discrete barrier options. Math. Finance, 7(4):325–349, 1997.
  • [11] M. Broadie, P. Glasserman, and S. G. Kou. Connecting discrete and continuous path-dependent options. Finance Stoch., 3(1):55–82, 1999.
  • [12] M. E. Caballero, J. C. Pardo, and J. L. Pérez. On Lamperti stable processes. Probab. Math. Statist., 30(1):1–28, 2010.
  • [13] S. Cohen and J. Rosiński. Gaussian approximation of multivariate Lévy processes with applications to simulation of tempered stable processes. Bernoulli, 13(1):195–210, 2007.
  • [14] E. H. A. Dia and D. Lamberton. Connecting discrete and continuous lookback or hindsight options in exponential Lévy models. Adv. in Appl. Probab., 43(4):1136–1165, 2011.
  • [15] E. H. A. Dia and D. Lamberton. Continuity correction for barrier options in jump-diffusion models. SIAM J. Financial Math., 2(1):866–900, 2011.
  • [16] C. Duval, T. Jalal, and E. Mariucci. Nonparametric density estimation for the small jumps of Lévy processes. 2024.
  • [17] P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling extremal events, volume 33 of Applications of Mathematics (New York). Springer-Verlag, Berlin, 1997. For insurance and finance.
  • [18] V. Fomichov, J. González Cázares, and J. Ivanovs. Implementable coupling of Lévy process and Brownian motion. Stochastic Process. Appl., 142:407–431, 2021.
  • [19] A. L. Gibbs and F. E. Su. On choosing and bounding probability metrics. Internat. Statist. Rev., pages 419–435, 2002.
  • [20] B. V. Gnedenko and A. N. Kolmogorov. Limit distributions for sums of independent random variables. Addison-Wesley Publishing Co., Reading, Mass.-London-Don Mills., Ont., revised edition, 1968. Translated from the Russian, annotated, and revised by K. L. Chung, With appendices by J. L. Doob and P. L. Hsu.
  • [21] J. González Cázares, D. Kramer-Bang, and A. Mijatović. Presentation on “asymptotically optimal Wasserstein couplings for the small-time stable domain of attraction”. YouTube presentation on the YouTube channel Prob-AM, 2024.
  • [22] J. González Cázares and A. Mijatović. Simulation of the drawdown and its duration in Lévy models via stick-breaking Gaussian approximation. Finance Stoch., 26(4):671–732, 2022.
  • [23] J. I. González Cázares, A. Mijatović, and G. Uribe Bravo. Geometrically convergent simulation of the extrema of Lévy processes. Math. Oper. Res., 47(2):1141–1168, 2022.
  • [24] F. Götze. On the rate of convergence in the multivariate CLT. Ann. Probab., 19(2):724–739, 1991.
  • [25] P. Hall. Two-sided bounds on the rate of convergence to a stable law. Z. Wahrsch. Verw. Gebiete, 57(3):349–364, 1981.
  • [26] J. Ivanovs. Zooming in on a Lévy process at its supremum. Ann. Appl. Probab., 28(2):912–940, 2018.
  • [27] J. Ivanovs and J. D. Thøstesen. Discretization of the Lamperti representation of a positive self-similar Markov process. Stochastic Process. Appl., 137:200–221, 2021.
  • [28] O. Johnson and R. Samworth. Central limit theorem and convergence to stable laws in Mallows distance. Bernoulli, 11(5):829–845, 2005.
  • [29] O. Kallenberg. Foundations of modern probability. Probability and its Applications (New York). Springer-Verlag, New York, second edition, 2002.
  • [30] W. S. Kendall, M. B. Majka, and A. Mijatović. Optimal Markovian coupling for finite activity Lévy processes. Bernoulli, 30(4):2821–2845, 2024.
  • [31] J. F. C. Kingman. Poisson processes, volume 3 of Oxford Studies in Probability. The Clarendon Press, Oxford University Press, New York, 1993. Oxford Science Publications.
  • [32] R. LePage. Multidimensional infinitely divisible variables and processes. II. In Probability in Banach spaces, III (Medford, Mass., 1980), volume 860 of Lecture Notes in Math., pages 279–284. Springer, Berlin-New York, 1981.
  • [33] S. Manou-Abi. Rate of convergence to alpha stable law using Zolotarev distance. Journal of Statistics: Advances in Theory and Applications, 18:166–177, 2017.
  • [34] E. Mariucci and M. Reiß. Wasserstein and total variation distance between marginals of Lévy processes. Electron. J. Stat., 12(2):2482–2514, 2018.
  • [35] P. Pegon, F. Santambrogio, and D. Piazzoli. Full characterization of optimal transport plans for concave costs. Discrete Contin. Dyn. Syst., 35(12):6113–6132, 2015.
  • [36] S. T. Rachev and L. Rüschendorf. Mass transportation problems. Vol. I. Probability and its Applications (New York). Springer-Verlag, New York, 1998. Theory.
  • [37] E. Rio. Upper bounds for minimal distances in the central limit theorem. Ann. Inst. Henri Poincaré Probab. Stat., 45(3):802–817, 2009.
  • [38] J. Rosiński. Series representations of Lévy processes from the perspective of point processes. In Lévy processes, pages 401–415. Birkhäuser Boston, Boston, MA, 2001.
  • [39] J. Rosiński. Tempering stable processes. Stochastic Process. Appl., 117(6):677–707, 2007.
  • [40] K.-i. Sato. Lévy processes and infinitely divisible distributions, volume 68 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2013. Translated from the 1990 Japanese original, Revised edition of the 1999 English translation.
  • [41] C. Villani. Optimal Transport: Old and New. Grundlehren der mathematischen Wissenschaften. Springer Berlin Heidelberg, 2008.

Appendix A Proof of Lemma 5.2

Given any $\kappa\in(0,1]$ consider the Lévy–Itô decomposition $\bm{X}_{t}=\bm{\gamma}_{\bm{X}}^{(\kappa)}t+\bm{\Sigma_{X}}\bm{B}_{t}+\bm{D}^{(\kappa)}_{t}+\bm{J}^{(\kappa)}_{t}$ given in (3), where $\bm{B}$ is a standard Brownian motion, $\bm{D}^{(\kappa)}$ is the pure-jump martingale containing all the jumps of $\bm{X}$ of magnitude less than $\kappa$ and $\bm{J}^{(\kappa)}$ is the driftless compound Poisson process containing all the jumps of $\bm{X}$ of magnitude at least $\kappa$. Since $|\bm{\Sigma_{X}}\bm{B}_{s}|\leqslant|\bm{\Sigma_{X}}|\cdot|\bm{B}_{s}|$, we have

\[\sup_{s\in[0,t]}|\bm{X}_{s}|\leqslant|\bm{\gamma}_{\bm{X}}^{(\kappa)}|t+|\bm{\Sigma_{X}}|\sup_{s\in[0,t]}|\bm{B}_{s}|+\sup_{s\in[0,t]}|\bm{D}^{(\kappa)}_{s}|+\sup_{s\in[0,t]}|\bm{J}^{(\kappa)}_{s}|,\quad\text{ for }t\in[0,1].\tag{44}\]

By the elementary bound $(\sum_{i=1}^{n}x_{i})^{p}\leqslant n^{(p-1)^{+}}\sum_{i=1}^{n}x_{i}^{p}$, $p>0$, $(p-1)^{+}=\max\{p-1,0\}$ and $x_{i}\geqslant 0$, $i\in\{1,\ldots,n\}$, we only need to bound the $p$-th moment of each summand on the right-hand side of the display above. Recall that $\beta_{+}$ is the quantity associated to the BG index of $\bm{X}$ defined in (27).
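The elementary bound $(\sum_{i}x_{i})^{p}\leqslant n^{(p-1)^{+}}\sum_{i}x_{i}^{p}$ used above is easy to test numerically. The sketch below does so on random non-negative inputs; the helper names and the sampling ranges are illustrative assumptions.

```python
import random


def power_of_sum(xs, p):
    """Left-hand side: (x_1 + ... + x_n)^p."""
    return sum(xs) ** p


def scaled_sum_of_powers(xs, p):
    """Right-hand side: n^{(p-1)^+} * (x_1^p + ... + x_n^p)."""
    n = len(xs)
    return n ** max(p - 1.0, 0.0) * sum(x ** p for x in xs)


# Track the largest observed ratio LHS / RHS; the bound says it never exceeds 1.
rng = random.Random(1)
worst_ratio = 0.0
for _ in range(5000):
    n = rng.randint(1, 8)
    p = rng.uniform(0.1, 5.0)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    bound = scaled_sum_of_powers(xs, p)
    if bound > 0.0:
        worst_ratio = max(worst_ratio, power_of_sum(xs, p) / bound)
```

Equality is attained, for example, when $p\geqslant 1$ and all $x_{i}$ coincide, which is why the factor $n^{(p-1)^{+}}$ cannot be dropped.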

Case $\beta_{+}>0$. Define $\kappa\coloneqq t^{1/\beta_{+}}$. To bound the drift term $|\bm{\gamma}_{\bm{X}}^{(\kappa)}|t$, first assume $\beta_{+}\geqslant 1$ and note that

\begin{align*}
|\bm{\gamma}_{\bm{X}}^{(\kappa)}|=\bigg|\bm{\gamma}_{\bm{X}}-\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(\kappa)}\bm{w}\nu_{\bm{X}}(\mathrm{d}\bm{w})\bigg|&\leqslant|\bm{\gamma}_{\bm{X}}|+\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(\kappa)}|\bm{w}|\nu_{\bm{X}}(\mathrm{d}\bm{w})\\
&\leqslant|\bm{\gamma}_{\bm{X}}|+\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(\kappa)}\kappa^{1-\beta_{+}}|\bm{w}|^{\beta_{+}}\nu_{\bm{X}}(\mathrm{d}\bm{w})\leqslant|\bm{\gamma}_{\bm{X}}|+\kappa^{1-\beta_{+}}I_{\beta_{+}}.
\end{align*}

Thus, $(|\bm{\gamma}_{\bm{X}}^{(\kappa)}|t)^{p}$ is bounded by a constant multiple of $t^{p}+t^{p/\beta_{+}}$. If $\beta_{+}\in(0,1)$ and the natural drift of $\bm{X}$ is zero (i.e. $\bm{\gamma_{X}}=\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}\bm{w}\nu_{\bm{X}}(\mathrm{d}\bm{w})$), then $|\bm{\gamma}_{\bm{X}}^{(\kappa)}|$ is bounded (and convergent) as $t\downarrow 0$,

\begin{align*}
|\bm{\gamma}_{\bm{X}}^{(\kappa)}|=\bigg|\int_{B_{\bm{0}}(\kappa)\setminus\{\bm{0}\}}\bm{w}\nu_{\bm{X}}(\mathrm{d}\bm{w})\bigg|&\leqslant\int_{B_{\bm{0}}(\kappa)\setminus\{\bm{0}\}}|\bm{w}|\nu_{\bm{X}}(\mathrm{d}\bm{w})\\
&\leqslant\int_{B_{\bm{0}}(\kappa)\setminus\{\bm{0}\}}\kappa^{1-\beta_{+}}|\bm{w}|^{\beta_{+}}\nu_{\bm{X}}(\mathrm{d}\bm{w})\leqslant\kappa^{1-\beta_{+}}I_{\beta_{+}},
\end{align*}

making $(|\bm{\gamma}_{\bm{X}}^{(\kappa)}|t)^{p}$ bounded by a multiple of $t^{p/\beta_{+}}$. Hence, in this case, we may take $C_{2}=0$ in Lemma 5.2 (it will become clear from the remainder of the proof that none of the other summands on the right-hand side of the inequality in (44) will produce a term of order $t^{p}$).

The $p$-th moment of the Brownian term is easily bounded by a constant multiple of $t^{p/2}$ since we have $\sup_{s\in[0,t]}|\bm{B}_{s}|\overset{d}{=}t^{1/2}\sup_{s\in[0,1]}|\bm{B}_{s}|$. If $\bm{\Sigma_{X}}$ is a zero matrix, then $|\bm{\Sigma_{X}}|=0$ and hence $C_{1}=0$.

Next, we bound the big-jump term $\bm{J}^{(\kappa)}$. Let $A_{\kappa}:=\mathbb{R}^{d}\setminus B_{\bm{0}}(\kappa)$ and recall that $\bm{J}^{(\kappa)}_{t}=\sum_{k=1}^{N_{t}}\bm{R}_{k}$ for some Poisson random variable $N_{t}$ with mean $t\nu_{\bm{X}}(A_{\kappa})$ and iid random vectors $(\bm{R}_{n})_{n\in\mathbb{N}}$ independent of $N_{t}$ with law $\nu_{\bm{X}}(\cdot\cap A_{\kappa})/\nu_{\bm{X}}(A_{\kappa})$. Recall, from the formula for the moments of a Poisson random variable, that $\mathds{E}[N_{t}^{k}]=\sum_{j=1}^{k}\genfrac{\{}{\}}{0pt}{}{k}{j}(t\nu_{\bm{X}}(A_{\kappa}))^{j}$, where $\genfrac{\{}{\}}{0pt}{}{k}{j}$ denotes the Stirling number of the second kind. Note that the triangle inequality of the Euclidean norm $|\cdot|$ implies that

\[|\bm{J}^{(\kappa)}_{s}|^{p}\leqslant\left(\sum_{k=1}^{N_{s}}|\bm{R}_{k}|\right)^{p}\leqslant\left(\sum_{k=1}^{N_{t}}|\bm{R}_{k}|\right)^{p}\leqslant N_{t}^{(p-1)^{+}}\sum_{k=1}^{N_{t}}|\bm{R}_{k}|^{p},\quad\text{ for every }s\in[0,t].\]

Denote $\lceil p\rceil:=\inf\{n\in\mathbb{N}\,:\,n\geqslant p\}$, and note that since $\bm{R}_{k}$, $k\in\mathbb{N}$, are iid and independent of $N_{t}$ and $1\leqslant(p-1)^{+}+1\leqslant\lceil p\rceil$, we find

\begin{align*}
\mathds{E}\bigg[\sup_{s\in[0,t]}|\bm{J}^{(\kappa)}_{s}|^{p}\bigg]&\leqslant\mathds{E}\big[|\bm{R}_{1}|^{p}\big]\mathds{E}\Big[N_{t}^{\lceil p\rceil}\Big]=\int_{A_{\kappa}}|\bm{w}|^{p}\frac{\nu_{\bm{X}}(\mathrm{d}\bm{w})}{\nu_{\bm{X}}(A_{\kappa})}\cdot\sum_{k=1}^{\lceil p\rceil}\genfrac{\{}{\}}{0pt}{}{\lceil p\rceil}{k}(t\nu_{\bm{X}}(A_{\kappa}))^{k}\tag{45}\\
&=t\int_{A_{\kappa}}|\bm{w}|^{p}\nu_{\bm{X}}(\mathrm{d}\bm{w})\cdot\sum_{k=1}^{\lceil p\rceil}\genfrac{\{}{\}}{0pt}{}{\lceil p\rceil}{k}(t\nu_{\bm{X}}(A_{\kappa}))^{k-1}.
\end{align*}
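The Poisson moment formula $\mathds{E}[N^{k}]=\sum_{j=1}^{k}\genfrac{\{}{\}}{0pt}{}{k}{j}\lambda^{j}$ used in (45) can be cross-checked against a direct summation of the Poisson series. The following sketch does this for a generic mean `lam` (standing in for $t\nu_{\bm{X}}(A_{\kappa})$); the function names and the truncation level `terms` are illustrative assumptions.

```python
import math


def stirling2(k, j):
    """Stirling number of the second kind via S(n, m) = m*S(n-1, m) + S(n-1, m-1)."""
    if j < 0 or j > k:
        return 0
    row = [1]  # row for n = 0: S(0, 0) = 1
    for n in range(1, k + 1):
        new = [0] * (n + 1)
        for m in range(1, n + 1):
            prev_same = row[m] if m < len(row) else 0  # S(n-1, m)
            new[m] = m * prev_same + row[m - 1]
        row = new
    return row[j]


def poisson_moment_stirling(k, lam):
    """k-th moment (k >= 1) of a Poisson(lam) variable via Stirling numbers."""
    return sum(stirling2(k, j) * lam ** j for j in range(1, k + 1))


def poisson_moment_series(k, lam, terms=200):
    """k-th moment (k >= 1) by summing n^k * P(N = n) over n < terms."""
    total = 0.0
    weight = math.exp(-lam)  # P(N = 0)
    for n in range(1, terms):
        weight *= lam / n    # now equals P(N = n)
        total += n ** k * weight
    return total
```

For example, for $k=3$ and $\lambda=2$ both routines give $\lambda^{3}+3\lambda^{2}+\lambda=22$ (the series up to truncation and rounding error).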

Note that $\nu_{\bm{X}}(A_{\kappa})\leqslant\nu_{\bm{X}}(A_{1})+\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(\kappa)}\kappa^{-\beta_{+}}|\bm{w}|^{\beta_{+}}\nu_{\bm{X}}(\mathrm{d}\bm{w})\leqslant\nu_{\bm{X}}(A_{1})+\kappa^{-\beta_{+}}I_{\beta_{+}}$ and hence $t\nu_{\bm{X}}(A_{\kappa})$ is bounded in $t\in[0,1]$ (recall that $\kappa=t^{1/\beta_{+}}$), making the sum in the display also bounded. Denote $I_{p}^{\prime}=\int_{A_{1}}|\bm{w}|^{p}\nu_{\bm{X}}(\mathrm{d}\bm{w})$, which we assumed finite, and hence

\begin{align*}
\int_{A_{\kappa}}|\bm{w}|^{p}\nu_{\bm{X}}(\mathrm{d}\bm{w})&\leqslant I_{p}^{\prime}+\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(\kappa)}|\bm{w}|^{p}\nu_{\bm{X}}(\mathrm{d}\bm{w})\\
&\leqslant I_{p}^{\prime}+\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(\kappa)}\kappa^{-(\beta_{+}-p)^{+}}|\bm{w}|^{\max\{\beta_{+},p\}}\nu_{\bm{X}}(\mathrm{d}\bm{w})\leqslant I_{p}^{\prime}+\kappa^{-(\beta_{+}-p)^{+}}I_{\max\{\beta_{+},p\}}.
\end{align*}

Thus, there is a finite constant $C>0$ such that

\[\mathds{E}\bigg[\sup_{s\in[0,t]}|\bm{J}^{(\kappa)}_{s}|^{p}\bigg]\leqslant Ct(I_{p}^{\prime}+\kappa^{-(\beta_{+}-p)^{+}}I_{\max\{\beta_{+},p\}})=C(I_{p}^{\prime}t+I_{\max\{\beta_{+},p\}}t^{\min\{1,p/\beta_{+}\}}).\]
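The exponent in the last equality may not be obvious at first sight, so here is the short computation behind it, written out as a sketch. It uses only $\kappa=t^{1/\beta_{+}}$ and the identity $1-(1-u)^{+}=\min\{1,u\}$ for $u\geqslant 0$.

```latex
\begin{align*}
t\,\kappa^{-(\beta_{+}-p)^{+}}
  &= t\cdot t^{-(\beta_{+}-p)^{+}/\beta_{+}}
   = t^{\,1-(1-p/\beta_{+})^{+}}
   = t^{\min\{1,\,p/\beta_{+}\}} .
\end{align*}
```

In particular, for $p\geqslant\beta_{+}$ the big-jump term is of order $t$, while for $p<\beta_{+}$ it is of order $t^{p/\beta_{+}}$.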

In the case where $\beta_{+}>0$, it remains to bound the small-jump term $\bm{D}^{(\kappa)}$. In this case, we show that the $p$-th moment is bounded by a multiple of $t^{p/\beta_{+}}$. We may assume without loss of generality that $p>1$: the other cases follow by Jensen’s inequality, since $\mathds{E}[|\bm{\xi}|^{q}]\leqslant\mathds{E}[|\bm{\xi}|^{p}]^{q/p}$ for any $q\leqslant p$. Since $|\bm{D}^{(\kappa)}_{t}|$ is a submartingale, Doob’s maximal inequality and the elementary inequality $|x|^{p}\leqslant(p/e)^{p}e^{|x|}$ imply

\[\mathds{E}\bigg[\sup_{s\in[0,t]}|\bm{D}^{(\kappa)}_{s}|^{p}\bigg]\leqslant\Big(\frac{p}{p-1}\Big)^{p}\mathds{E}\big[|\bm{D}^{(\kappa)}_{t}|^{p}\big]=\Big(\frac{\kappa p}{p-1}\Big)^{p}\mathds{E}\big[|\kappa^{-1}\bm{D}^{(\kappa)}_{t}|^{p}\big]\leqslant\Big(\frac{\kappa p^{2}/e}{p-1}\Big)^{p}\mathds{E}\Big[e^{\kappa^{-1}|\bm{D}^{(\kappa)}_{t}|}\Big],\]

for $t\in[0,1]$. Thus, to complete the proof it suffices to show that the expectation on the right is bounded as $t\downarrow 0$. Let $\{\bm{e}_{i}\}_{i=1}^{2^{d}}$ be the vertices of the hypercube centered at the origin with sides parallel to the axes and side length $2$ (e.g., the vectors $(1,1,\ldots,1)$ and $(-1,-1,\ldots,-1)$ are opposite vertices of this hypercube). Note that $e^{|\bm{w}|}\leqslant e^{|\bm{w}|_{1}}\leqslant\sum_{i=1}^{2^{d}}e^{\langle\bm{e}_{i},\bm{w}\rangle}$ where $|(s_{1},\ldots,s_{d})|_{1}\coloneqq\sum_{i=1}^{d}|s_{i}|$ denotes the $\ell^{1}$-norm in $\mathbb{R}^{d}$. Hence, it suffices to show that $\mathds{E}[\exp(\langle\kappa^{-1}\bm{e}_{i},\bm{D}^{(\kappa)}_{t}\rangle)]$ is bounded as $t\downarrow 0$ for each $i\in\{1,\ldots,2^{d}\}$. The Lévy–Khintchine formula, the elementary inequality $e^{x}-1-x\leqslant e^{c}x^{2}$ for $x\in[-c,c]$, $c>0$, and the Cauchy–Schwarz inequality $|\langle\kappa^{-1}\bm{e}_{i},\bm{w}\rangle|\leqslant|\bm{e}_{i}|=\sqrt{d}$ for all $\bm{w}\in B_{\bm{0}}(\kappa)$ and $i\in\{1,\ldots,2^{d}\}$ yield

\begin{align*}
\log\mathds{E}\big[e^{\langle\kappa^{-1}\bm{e}_{i},\bm{D}^{(\kappa)}_{t}\rangle}\big]&=t\int_{B_{\bm{0}}(\kappa)\setminus\{\bm{0}\}}\big(e^{\langle\kappa^{-1}\bm{e}_{i},\bm{w}\rangle}-1-\langle\kappa^{-1}\bm{e}_{i},\bm{w}\rangle\big)\nu_{\bm{X}}(\mathrm{d}\bm{w})\\
&\leqslant t\int_{B_{\bm{0}}(\kappa)\setminus\{\bm{0}\}}e^{\sqrt{d}}\kappa^{-2}\langle\bm{e}_{i},\bm{w}\rangle^{2}\nu_{\bm{X}}(\mathrm{d}\bm{w})\leqslant t\int_{B_{\bm{0}}(\kappa)\setminus\{\bm{0}\}}de^{\sqrt{d}}\kappa^{-2}|\bm{w}|^{2}\nu_{\bm{X}}(\mathrm{d}\bm{w})\\
&\leqslant t\int_{B_{\bm{0}}(\kappa)\setminus\{\bm{0}\}}de^{\sqrt{d}}\kappa^{-\beta_{+}}|\bm{w}|^{\beta_{+}}\nu_{\bm{X}}(\mathrm{d}\bm{w})\leqslant de^{\sqrt{d}}I_{\beta_{+}},
\end{align*}

completing the proof in the case $\beta_{+}>0$.
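The three elementary inequalities invoked in the small-jump bound, namely $|x|^{p}\leqslant(p/e)^{p}e^{|x|}$, $e^{x}-1-x\leqslant e^{c}x^{2}$ on $[-c,c]$, and $e^{|\bm{w}|}\leqslant\sum_{i}e^{\langle\bm{e}_{i},\bm{w}\rangle}$ over the hypercube vertices, can be spot-checked numerically. The sketch below does this on random inputs; the function names, sampling ranges and rounding slacks are illustrative assumptions.

```python
import itertools
import math
import random


def power_le_exp(x, p):
    """Check |x|^p <= (p/e)^p * e^{|x|}, with a small relative slack for rounding."""
    return abs(x) ** p <= (p / math.e) ** p * math.exp(abs(x)) * (1.0 + 1e-12)


def exp_remainder_le_quadratic(x, c):
    """Check e^x - 1 - x <= e^c * x^2 for x in [-c, c], with a tiny absolute slack."""
    return math.exp(x) - 1.0 - x <= math.exp(c) * x * x + 1e-15


def exp_norm_le_vertex_sum(w):
    """Check e^{|w|} <= sum over the 2^d vertices e in {-1, 1}^d of e^{<e, w>}."""
    euclid = math.sqrt(sum(wi * wi for wi in w))
    vertex_sum = sum(
        math.exp(sum(ei * wi for ei, wi in zip(e, w)))
        for e in itertools.product((-1.0, 1.0), repeat=len(w))
    )
    return math.exp(euclid) <= vertex_sum


rng = random.Random(2)
all_ok = True
for _ in range(2000):
    p = rng.uniform(1.0, 6.0)
    x = rng.uniform(-30.0, 30.0)
    c = rng.uniform(0.1, 3.0)
    y = rng.uniform(-c, c)
    w = [rng.uniform(-3.0, 3.0) for _ in range(rng.randint(1, 4))]
    all_ok = (all_ok and power_le_exp(x, p)
              and exp_remainder_le_quadratic(y, c)
              and exp_norm_le_vertex_sum(w))
```

The first inequality is tight at $|x|=p$, which is where the function $x\mapsto x^{p}e^{-x}$ attains its maximum on $[0,\infty)$.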

Case $\beta_{+}=0$. Note that, in this case, the pure-jump component of $\bm{X}$ is compound Poisson. Thus, as in (44), we have that $\sup_{s\in[0,t]}|\bm{X}_{s}|\leqslant|\bm{\gamma}_{\bm{X}}|t+|\bm{\Sigma_{X}}|\sup_{s\in[0,t]}|\bm{B}_{s}|+\sup_{s\in[0,t]}|\widetilde{\bm{J}}_{s}|$ for all $t\in[0,1]$, where $\widetilde{\bm{J}}_{t}=\bm{X}_{t}-\bm{\gamma}_{\bm{X}}t-\bm{\Sigma_{X}}\bm{B}_{t}$ for all $t\in[0,1]$. The bound on the $p$-moment of the Brownian term follows exactly as in the case of $\beta_{+}>0$ and is a constant multiple of $t^{p/2}$. From the term $(|\bm{\gamma}_{\bm{X}}|t)^{p}$, we get a multiple of $t^{p}$. Note that $\widetilde{\bm{J}}$ is a compound Poisson process with finitely many jumps on $\mathbb{R}^{d}\setminus\{\bm{0}\}$, with $\beta=0$, and hence $\widetilde{\bm{J}}_{t}=\sum_{k=1}^{N_{t}}\widetilde{\bm{R}}_{k}$ for some Poisson random variable $N_{t}$ with mean $t\nu_{\bm{X}}(\mathbb{R}^{d}\setminus\{\bm{0}\})$ and iid random vectors $(\widetilde{\bm{R}}_{n})_{n\in\mathbb{N}}$ independent of $N_{t}$ with law $\nu_{\bm{X}}(\cdot\cap(\mathbb{R}^{d}\setminus\{\bm{0}\}))/\nu_{\bm{X}}(\mathbb{R}^{d}\setminus\{\bm{0}\})$. For the term $\mathds{E}[\sup_{s\in[0,t]}|\widetilde{\bm{J}}_{s}|^{p}]$, we now use the same proof as in the case of $\beta_{+}>0$, until (45). Hence, we see that

\[\mathds{E}\bigg[\sup_{s\in[0,t]}|\widetilde{\bm{J}}_{s}|^{p}\bigg]\leqslant d^{p/2}t\int_{\mathbb{R}^{d}\setminus\{\bm{0}\}}|\bm{w}|^{p}\nu_{\bm{X}}(\mathrm{d}\bm{w})\cdot\sum_{k=1}^{\lceil p\rceil}\genfrac{\{}{\}}{0pt}{}{\lceil p\rceil}{k}(t\nu_{\bm{X}}(\mathbb{R}^{d}\setminus\{\bm{0}\}))^{k-1}.\]

Since $\widetilde{\bm{J}}$ has finite activity, it follows that $\int_{\mathbb{R}^{d}\setminus\{\bm{0}\}}|\bm{w}|^{p}\nu_{\bm{X}}(\mathrm{d}\bm{w})<\infty$. Moreover, since the sum in the display above is bounded in $t\in[0,1]$, we get that $\mathds{E}[\sup_{s\in[0,t]}|\widetilde{\bm{J}}_{s}|^{p}]$ is bounded by a multiple of $t$, concluding the proof of Lemma 5.2.

Appendix B Small-time domains of attraction - proof of Theorem 5.1

The proof is essentially a consequence of [29, Thm 15.14] and [26, Thm 2]. Recall that $B_{\bm{a}}(r)=\{\bm{x}\in\mathbb{R}^{d}:|\bm{x}-\bm{a}|<r\}$ denotes the open ball in $\mathbb{R}^{d}$ with center $\bm{a}\in\mathbb{R}^{d}$ and radius $r>0$. Denote by $\mathbb{S}^{d-1}$ the unit sphere in $\mathbb{R}^{d}$ and define $\mathscr{L}_{\bm{a}}(r)\coloneqq\{\bm{x}\in\mathbb{R}^{d}:\langle\bm{a},\bm{x}\rangle\geqslant r\}$.

Since $\bm{X}$ and $\bm{Z}$ are Lévy processes, the stated weak convergence is equivalent to $\langle\bm{\lambda},\bm{X}_{t}\rangle/g(t)\xrightarrow{d}\langle\bm{\lambda},\bm{Z}_{1}\rangle$ as $t\downarrow 0$ for any $\bm{\lambda}\in\mathbb{R}^{d}$ by [29, Cor. 15.7]. By [26, Thm 2], it follows that, for some $\alpha\in(0,2]$, $g(t)=t^{1/\alpha}G(t^{-1})$ for $t>0$ where $G$ is a slowly varying function at infinity and, moreover, $\langle\bm{\lambda},\bm{Z}_{1}\rangle$ is $\alpha$-stable for all $\bm{\lambda}\in\mathbb{R}^{d}$. Thus, $\bm{Z}$ is itself $\alpha$-stable. We then have the following cases.

If $\alpha=2$, then, by [26, Thm 2(i)], the weak convergence in the direction $\bm{v}\in\mathbb{S}^{d-1}$ is equivalent to

\[G(t^{-1})^{-2}\bigg(|\bm{v}^{\top}\bm{\Sigma}_{\bm{X}}|^{2}+\int_{B_{\bm{0}}(g(t))\setminus\{\bm{0}\}}|\langle\bm{v},\bm{x}\rangle|^{2}\nu_{\bm{X}}(\mathrm{d}\bm{x})\bigg)\to|\bm{v}^{\top}\bm{\Sigma}_{\bm{Z}}|^{2},\quad\text{as }t\downarrow 0,\]

so the weak convergence in $\mathbb{R}^{d}$ is equivalent to (24), completing the proof in this case.

If $\alpha\in(1,2)$, the weak convergence in the direction $\bm{v}\in\mathbb{S}^{d-1}$ is equivalent to (25) by [26, Thm 2(iii)], completing the proof in this case.

If $\alpha\in(0,1)$, the weak convergence in the direction $\bm{v}\in\mathbb{S}^{d-1}$ is equivalent to (25) and $\langle\bm{v},\bm{X}\rangle$ having zero natural drift by [26, Thm 2(iii)]. Since the latter condition is required for all $\bm{v}\in\mathbb{R}^{d}$, it is equivalent to $\bm{X}$ having zero natural drift $\bm{\gamma_{X}}=\int_{B_{\bm{0}}(1)\setminus\{\bm{0}\}}\bm{x}\nu_{\bm{X}}(\mathrm{d}\bm{x})$, completing the proof in this case.

If α=1\alpha=1, the weak convergence in the direction 𝒗𝕊d1\bm{v}\in\mathbb{S}^{d-1} may be different depending on the behaviour of the limiting process in this direction. If ν𝒁(𝒗(1))=0\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1))=0 then 𝒗,𝒁\langle\bm{v},\bm{Z}\rangle is a linear drift and the weak convergence, by [26, Thm 2(ii)], is equivalent to the following two limits as t0t\downarrow 0:

tg(t)(𝒗,𝜸𝑿B𝟎(1)B𝟎(g(t))𝒗,𝒙ν(d𝒙))𝒗,𝜸𝒁,and\displaystyle\frac{t}{g(t)}\bigg{(}\langle\bm{v},\bm{\gamma_{X}}\rangle-\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(g(t))}\langle\bm{v},\bm{x}\rangle\nu(\mathrm{d}\bm{x})\bigg{)}\to\langle\bm{v},\bm{\gamma_{Z}}\rangle,\quad\text{and}
g(t)ν𝑿(𝒗(g(t)))(𝒗,𝜸𝑿B𝟎(1)B𝟎(g(t))𝒗,𝒙ν(d𝒙))10,whenever 𝒗,𝜸𝒁0,\displaystyle g(t)\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))\bigg{(}\langle\bm{v},\bm{\gamma_{X}}\rangle-\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(g(t))}\langle\bm{v},\bm{x}\rangle\nu(\mathrm{d}\bm{x})\bigg{)}^{-1}\to 0,\quad\text{whenever }\langle\bm{v},\bm{\gamma_{Z}}\rangle\neq 0,

(where we recall that t/g(t)=1/G(t1)t/g(t)=1/G(t^{-1})). By the first limit, the second limit is equivalent to tν𝑿(𝒗(g(t)))0=ν𝒁(𝒗(1))t\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))\to 0=\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1)). If instead ν𝒁(𝒗(1))>0\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1))>0, then, by [26, Thm 2(ii)], the weak limit is equivalent to the following. The process 𝒗,𝑿\langle\bm{v},\bm{X}\rangle has zero natural drift whenever it has finite variation and 𝒗,𝒁\langle\bm{v},\bm{Z}\rangle and the following two limits hold as t0t\downarrow 0:

\begin{align*}
&\frac{\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1))}{g(t)\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))}\bigg(\langle\bm{v},\bm{\gamma_{X}}\rangle-\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(g(t))}\langle\bm{v},\bm{x}\rangle\nu(\mathrm{d}\bm{x})\bigg)\to\langle\bm{v},\bm{\gamma_{Z}}\rangle,\quad\text{and}\\
&t\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))\to\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1)).
\end{align*}

By the second limit, the first limit can be rewritten as the first limit of the case $\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1))=0$ above. Thus, in either case, the conditions are equivalent to those stated in Theorem 5.1 in the direction $\bm{v}$. Since the directional limits are equivalent to the corresponding limits in $\mathbb{R}^{d}$, the result follows.∎
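The two rewritings used in the case $\alpha=1$ of the proof above can be spelled out as follows. This is only a sketch: the shorthand $D_{\bm{v}}(t)$ for the bracketed term is introduced here for illustration and is not notation from the paper, and the first identity is stated only for directions with $\langle\bm{v},\bm{\gamma_{Z}}\rangle\neq 0$.

```latex
% Shorthand (an assumption for this sketch, not the paper's notation):
% D_v(t) denotes the bracketed term appearing in the limits of the proof.
\begin{align*}
D_{\bm{v}}(t)
&:=\langle\bm{v},\bm{\gamma_{X}}\rangle-\int_{B_{\bm{0}}(1)\setminus B_{\bm{0}}(g(t))}\langle\bm{v},\bm{x}\rangle\nu(\mathrm{d}\bm{x}),\\
% Case nu_Z(L_v(1)) = 0 with <v, gamma_Z> != 0: divide and multiply by t.
\frac{g(t)\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))}{D_{\bm{v}}(t)}
&=\frac{t\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))}{(t/g(t))\,D_{\bm{v}}(t)},
\qquad\text{where }\frac{t}{g(t)}D_{\bm{v}}(t)\to\langle\bm{v},\bm{\gamma_{Z}}\rangle\neq 0,\\
% Case nu_Z(L_v(1)) > 0: factor out the ratio controlled by the second limit.
\frac{\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1))}{g(t)\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))}D_{\bm{v}}(t)
&=\frac{\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1))}{t\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))}\cdot\frac{t}{g(t)}D_{\bm{v}}(t),
\qquad\text{where }\frac{\nu_{\bm{Z}}(\mathscr{L}_{\bm{v}}(1))}{t\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))}\to 1.
\end{align*}
```

In the first identity the denominator converges to a non-zero constant, so the left-hand side vanishes exactly when $t\nu_{\bm{X}}(\mathscr{L}_{\bm{v}}(g(t)))\to 0$. In the second identity the prefactor tends to $1$, so the two forms of the first limit hold or fail together.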

Appendix C Proof of the inequality in (4)

Recall that the two Lévy processes $\bm{X}$ and $\bm{Y}$ in $\mathbb{R}^{d}$ have the Lévy–Itô decompositions $\bm{X}_{t}=\bm{\gamma}_{\bm{X},\kappa}t+\bm{\Sigma_{X}}\bm{B^{X}}_{t}+\bm{D}^{\bm{X},\kappa}_{t}+\bm{J}^{\bm{X},\kappa}_{t}$ and $\bm{Y}_{t}=\bm{\gamma}_{\bm{Y},\kappa}t+\bm{\Sigma_{Y}}\bm{B^{Y}}_{t}+\bm{D}^{\bm{Y},\kappa}_{t}+\bm{J}^{\bm{Y},\kappa}_{t}$, see Section 4. Recall that we chose the coupling $\bm{B^{X}}=\bm{B^{Y}}$, implying $|(\bm{\Sigma_{X}}-\bm{\Sigma_{Y}})\bm{B^{X}}_{t}|\leqslant|\bm{\Sigma_{X}}\bm{B^{X}}_{t}-\bm{\Sigma_{Y}}\bm{B^{Y}}_{t}|\leqslant|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|\cdot|\bm{B^{X}}_{t}|$ (where $|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|$ is the Frobenius norm of the matrix $\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}$). Applying Doob’s maximal inequality, we obtain

\begin{equation}\tag{46}
\begin{split}
\mathcal{W}_{q}\big(\bm{\Sigma_{X}}\bm{B^{X}},\bm{\Sigma_{Y}}\bm{B^{Y}}\big)&\leqslant\mathcal{W}_{2}\big(\bm{\Sigma_{X}}\bm{B^{X}},\bm{\Sigma_{Y}}\bm{B^{Y}}\big)^{q\wedge 1}\\
&\leqslant\mathds{E}\bigg[\sup_{t\in[0,1]}|\bm{\Sigma_{X}}\bm{B^{X}}_{t}-\bm{\Sigma_{Y}}\bm{B^{Y}}_{t}|^{2}\bigg]^{(q\wedge 1)/2}\leqslant(2\sqrt{d}|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|)^{q\wedge 1}.
\end{split}
\end{equation}
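The last inequality in (46) can be checked along the following lines. This is a sketch under the assumption that $\bm{B^{X}}$ is a standard $d$-dimensional Brownian motion (so that $\mathds{E}[|\bm{B^{X}}_{1}|^{2}]=d$), combined with the coupling $\bm{B^{X}}=\bm{B^{Y}}$ and the Frobenius norm bound recalled above.

```latex
% Assumption for this sketch: B^X is a standard d-dimensional Brownian motion,
% so |B^X| is a non-negative submartingale and E|B^X_1|^2 = d.
\begin{align*}
\mathds{E}\bigg[\sup_{t\in[0,1]}|\bm{\Sigma_{X}}\bm{B^{X}}_{t}-\bm{\Sigma_{Y}}\bm{B^{Y}}_{t}|^{2}\bigg]
&\leqslant|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|^{2}\,\mathds{E}\bigg[\sup_{t\in[0,1]}|\bm{B^{X}}_{t}|^{2}\bigg]
&&\text{(Frobenius norm bound)}\\
&\leqslant 4\,|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|^{2}\,\mathds{E}\big[|\bm{B^{X}}_{1}|^{2}\big]
&&\text{(Doob's $L^{2}$ maximal inequality)}\\
&=4d\,|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|^{2}.
\end{align*}
```

Raising both sides to the power $(q\wedge 1)/2$ then gives the bound $(2\sqrt{d}|\bm{\Sigma_{X}}-\bm{\Sigma_{Y}}|)^{q\wedge 1}$.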

Applying the triangle inequality, we obtain

\begin{multline*}
\sup_{t\in[0,1]}|\bm{\gamma}_{\bm{X},\kappa}t+\bm{\Sigma_{X}}\bm{B}^{\bm{X}}_{t}+\bm{D}^{\bm{X},\kappa}_{t}+\bm{J}^{\bm{X},\kappa}_{t}-\bm{\gamma}_{\bm{Y},\kappa}t-\bm{\Sigma_{Y}}\bm{B}^{\bm{Y}}_{t}-\bm{D}^{\bm{Y},\kappa}_{t}-\bm{J}^{\bm{Y},\kappa}_{t}|\\
\leqslant|\bm{\gamma}_{\bm{X},\kappa}-\bm{\gamma}_{\bm{Y},\kappa}|+\sup_{t\in[0,1]}|\bm{\Sigma_{X}}\bm{B}^{\bm{X}}_{t}-\bm{\Sigma_{Y}}\bm{B}^{\bm{Y}}_{t}|+\sup_{t\in[0,1]}|\bm{D}^{\bm{X},\kappa}_{t}-\bm{D}^{\bm{Y},\kappa}_{t}|+\sup_{t\in[0,1]}|\bm{J}^{\bm{X},\kappa}_{t}-\bm{J}^{\bm{Y},\kappa}_{t}|.
\end{multline*}

For $q\in(0,1]$ (resp. $q\in(1,2]$) inequality (4) follows by subadditivity $(a+b)^{q}\leqslant a^{q}+b^{q}$ for $a,b\geqslant 0$ (resp. Minkowski’s inequality).
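The final step can be written out schematically as follows. The symbols $S_{\gamma},S_{B},S_{D},S_{J}$ are hypothetical shorthand, introduced only for this sketch, for the four non-negative terms on the right-hand side of the triangle inequality above, in the order in which they appear. How the resulting bounds assemble into (4) depends on the definition of $\mathcal{W}_{q}$ given earlier in the paper.

```latex
% Hypothetical shorthand for this sketch: S_gamma, S_B, S_D, S_J >= 0 are the
% drift, Brownian, small-jump and large-jump terms of the triangle inequality.
\begin{align*}
q\in(0,1]:\quad
&\mathds{E}\big[(S_{\gamma}+S_{B}+S_{D}+S_{J})^{q}\big]
\leqslant S_{\gamma}^{q}+\mathds{E}\big[S_{B}^{q}\big]+\mathds{E}\big[S_{D}^{q}\big]+\mathds{E}\big[S_{J}^{q}\big]
&&\text{(subadditivity)},\\
q\in(1,2]:\quad
&\mathds{E}\big[(S_{\gamma}+S_{B}+S_{D}+S_{J})^{q}\big]^{1/q}
\leqslant S_{\gamma}+\mathds{E}\big[S_{B}^{q}\big]^{1/q}+\mathds{E}\big[S_{D}^{q}\big]^{1/q}+\mathds{E}\big[S_{J}^{q}\big]^{1/q}
&&\text{(Minkowski)}.
\end{align*}
```

Here $S_{\gamma}=|\bm{\gamma}_{\bm{X},\kappa}-\bm{\gamma}_{\bm{Y},\kappa}|$ is deterministic, which is why it appears without an expectation.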