Asymptotically optimal Wasserstein couplings for the small-time stable domain of attraction
Abstract.
We develop two novel couplings between general pure-jump Lévy processes in $\mathbb{R}^d$ and apply them to obtain upper bounds on the rate of convergence in an appropriate Wasserstein distance on the path space for a wide class of Lévy processes attracted to a multidimensional stable process in the small-time regime. We also establish general lower bounds based on certain universal properties of slowly varying functions and the relationship between the Wasserstein and Toscani–Fourier distances of the marginals. Our upper and lower bounds typically have matching rates. In particular, the rate of convergence is polynomial for the domain of normal attraction and slower than a slowly varying function for the domain of non-normal attraction.
Key words and phrases:
small-time scaling limit, stable domain of attraction, Wasserstein distance, Coupling of Lévy processes
2020 Mathematics Subject Classification:
Primary 60F05, 60G51; Secondary 60G52, 60F25.
1. Introduction
Stable processes arise naturally as universal scaling limits of a vast class of stochastic processes at either small or large times. In particular, in the small-time regime, stable processes arise as weak limits of discretisation errors of widely used models in theoretical and applied probability [34, 27, 1]. Most of these models are based on Lévy processes in the small-time domain of attraction of a stable process [26, 7, 12]. In contrast with the more classical long-time regime of Lamperti, where the literature is abundant (see, e.g. [25, 24, 28, 9, 33]), the study of the convergence in the small-time regime, which is the focus of this paper, has been underdeveloped. In the long-time regime, the convergence is a consequence of heavy tails with regularly varying tail probabilities or a finite second moment. On the other hand, in the small-time regime, the convergence depends on the activity of the small jumps of the underlying Lévy process and does not depend on the behaviour of the tail probabilities [26]. However, having a heavy-tailed limit may severely deteriorate the convergence speed as uniform integrability typically fails. Quantifying such an error is a fundamental problem, crucial in a number of disparate application areas, such as controlling the bias of discretised models in mathematical finance and elsewhere (see [26] and the references therein), quantifying the model misspecification risk [8] or asserting the convergence properties of estimators for the index of variation, such as Hill’s estimator, which is known to require a second order condition for the convergence to have good properties [17, p. 193–195].
The main aim of the present paper is to establish lower and upper bounds in Wasserstein distance on the convergence rate of multivariate Lévy processes attracted to a stable process in the domains of both normal and non-normal attraction (see definition in Section 2 below). Moreover, we will show that our bounds are often sharp. Our upper bounds are applicable to a large class of Lévy processes that are attracted to a multivariate -stable process (which is Gaussian if and heavy-tailed if ), while our lower bounds are universal within the small-time regime. To establish the upper bounds on the path supremum norm, we construct two couplings between any two arbitrary Lévy processes, inspired by the stochastic representations in [13], and bound the -norm of the maximum distance between the paths of the resulting processes. The lower bounds for the domain of normal attraction are obtained by comparing the Wasserstein distance with the Toscani–Fourier distance between the marginals and in the case of non-normal attraction, using a universal property of slowly varying functions.
We show that in the domain of normal attraction to heavy-tailed laws, under suitable second order assumptions, the rates of convergence in the upper and lower bounds are polynomial and agree for the -norm in the cases and , making our couplings rate-optimal in this sense. In the domain of non-normal attraction (to either a Gaussian or a heavy-tailed stable law), the upper and lower bounds are both ‘slow’ and, in particular, the convergence is never faster than as for any . Moreover, for large subclasses of Lévy processes, the upper and lower bounds on the convergence rate agree in the case (see e.g. Corollary 2.4). In the domain of normal attraction to the Gaussian law, our upper and lower bounds are also polynomial and dependent on the Blumenthal–Getoor index of the attracted process. The bounds on the convergence rates in this case often agree and, when they do not, the gap between them is small (see Figure 2 below). A short YouTube presentation [21] describes our results, including the ideas behind the proofs.
1.1. Summary of our results in the heavy-tailed stable domain of attraction
In preparation for the summary of our results in Table 1, we introduce some notation: as holds for two functions if there exists satisfying for all . An eventually positive function is slowly varying at infinity, , if for all .
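Slow variation can be checked numerically. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the standard definition that a function f is slowly varying at infinity when f(cx)/f(x) tends to 1 as x grows, for every fixed c > 0. It compares the logarithm and the iterated logarithm with a small positive power, which is regularly but not slowly varying.

```python
import math


def scaling_ratio(f, x, c):
    """Return f(c * x) / f(x); for slowly varying f this tends to 1 as x grows."""
    return f(c * x) / f(x)


def log_fn(x):
    # Slowly varying at infinity.
    return math.log(x)


def iterated_log(x):
    # Slowly varying at infinity, and growing even more slowly than log.
    return math.log(math.log(x))


def power_fn(x):
    # Regularly varying with index 0.1, hence NOT slowly varying.
    return x ** 0.1


if __name__ == "__main__":
    for x in (1e10, 1e50, 1e100):
        print(
            x,
            scaling_ratio(log_fn, x, 10.0),
            scaling_ratio(iterated_log, x, 10.0),
            scaling_ratio(power_fn, x, 10.0),
        )
```

For the two logarithmic functions the ratio drifts towards 1, while for the power function it stays at 10 ** 0.1 however large x becomes.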
Table 1 summarises our results on the convergence rates established here for processes in both domains of attraction of stable processes. Recall that an -stable process has a finite -moment if and only if . Due to this technical constraint, our upper bounds on the -Wasserstein distance, defined in (1) below, always require for the corresponding domains of attraction. We remark that, in this case, both lower and upper bounds are typically asymptotically equivalent up to a multiplicative constant, making our methods and couplings rate optimal. Indeed, in the domain of normal attraction, this occurs for any admissible if and for the -Wasserstein distance if . Our bounds for the domain of non-normal attraction are also seen to be rate optimal when is sufficiently regular (see discussion following Theorem 2.3 and Corollary 2.4 below) when (and hence ).
Domain of attraction | , and |
---|---|
normal | |
non-normal | , where cannot be bounded above by any non-decreasing integrable function : if , then |
More precisely, in Table 1 we let be a Lévy process in attracted to an -stable process with normalising function . That is, , , satisfies as . The table gives asymptotic bounds on the distance as in both regimes of attraction. We let the assumptions of either Theorem 2.1 (with , for the domain of normal attraction) or Theorem 2.3 (for the domain of non-normal attraction) hold for and pick satisfying . We stress that the lower bounds in Table 1 in both domains of normal and non-normal attraction require no assumptions beyond the existence of the scaling limit (see Theorem 5.1 below for the precise description of the class of Lévy processes attracted to a stable process ). In particular, as explained in the caption of Table 1, for any in the domain of non-normal attraction and arbitrary , there exists a positive increasing sequence tending to infinity, such that the lower bound on the -Wasserstein distance satisfies for all . In Example 3.2 below, we show that, even if the slowly varying function in the scaling limit grows arbitrarily slowly, the lower bound may be asymptotically equivalent to it and bounded below by for all small .
Recall that the slowly varying function in the scaling limit (as ) is uniquely determined up to asymptotic equivalence only. Interestingly, our results imply that the rate of convergence in the Wasserstein distance can be affected by different choices of , see Remark 2.5(IV) below for more details. Finally, we note that the couplings yielding the upper bounds in Table 1 require some structural assumptions on the Lévy measure of discussed in Sections 2 and 5 below.
1.2. A heuristic account of our couplings of Lévy processes
One of the main purposes of this article is to introduce two couplings between two arbitrary multivariate Lévy processes and and analyse their properties. Both coupling constructions are centered around coupling the respective Poisson jump measures and . As the Brownian components of and are coupled synchronously under both couplings, the couplings are named after the techniques involved in coupling of the Poisson jump measures and : the first is the thinning coupling, as it is based on Poisson thinning (see Subsection 4.1), and the second is the comonotonic coupling, based on the minimal transport coupling of real-valued random variables and LePage’s simulation method (see Subsection 4.2). As illustrated in Figure 1, the thinning coupling aims to maximise the intersection of the Poisson jump measures, whereas the comonotonic coupling aims to establish an optimal one-to-one correspondence between the atoms of the Poisson jump measures.


The assumptions and constructions of both couplings are rather different. In the thinning coupling, we consider a common dominating Lévy measure (say, the sum of both Lévy measures) such that the Lévy measures of both processes are absolutely continuous with respect to it with a bounded density. Then, we consider a Poisson measure with mean measure given by the dominating Lévy measure and then thin the Poisson measure appropriately to produce coupled Poisson measures with mean measures given by the Lévy measures of the processes. This maximises the common jumps of both processes.
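The thinning step can be sketched in a finite-activity toy setting. This is a minimal illustration under stated assumptions, not the construction of Subsection 4.1: the function names, the choice of a uniform dominating measure on [0.1, 1] and the two densities are hypothetical. Each atom of the dominating Poisson sample carries one uniform mark, and the same mark decides whether the atom is kept for each process, so an atom is shared whenever the mark falls below both densities.

```python
import math
import random


def sample_poisson(rng, mean):
    """Sample a Poisson(mean) count by counting unit-rate arrivals in [0, mean]."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival <= mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def thinning_coupling(rng, total_mass, sample_atom, density_x, density_y):
    """Thin one dominating Poisson sample into two coupled jump collections.

    density_x and density_y are the densities of the two (truncated) Levy
    measures with respect to the dominating measure, assumed to lie in [0, 1].
    """
    jumps_x, jumps_y = [], []
    for _ in range(sample_poisson(rng, total_mass)):
        z = sample_atom(rng)  # atom of the dominating Poisson measure
        u = rng.random()      # a single uniform mark shared by both thinnings
        if u < density_x(z):
            jumps_x.append(z)
        if u < density_y(z):
            jumps_y.append(z)
    return jumps_x, jumps_y


def demo_thinning(seed=0):
    # Hypothetical toy example: dominating measure 40 * Uniform[0.1, 1],
    # density 1 for the first process, exponentially tempered for the second.
    rng = random.Random(seed)
    return thinning_coupling(
        rng,
        40.0,
        lambda r: r.uniform(0.1, 1.0),
        lambda z: 1.0,
        lambda z: math.exp(-z),
    )


if __name__ == "__main__":
    jumps_x, jumps_y = demo_thinning()
    print(len(jumps_x), len(jumps_y))
```

In this toy example the second density is dominated by the first, so every jump of the second process is also a jump of the first, which is the extreme case of maximising common jumps.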
For the comonotonic coupling, we assume that both processes have a radial decomposition with their own angular measures. We then construct a radial decomposition for both with respect to a common angular measure. We use this measure and LePage’s method to construct a one-to-one correspondence of jumps in which both processes jump in the same direction but with different magnitudes. Indeed, both Poisson measures are a transformation of a standard Poisson measure with independent decorations that select the direction of the jump. With our assumption, we construct such a transformation explicitly with the following properties. The decorations of both processes agree. Conditionally given a direction, the jumps of both processes, when ordered by decreasing magnitude, are in a one-to-one correspondence that mimics the relationship of real random variables under the minimal transport (or comonotonic) coupling. More precisely, the magnitudes are expressed as right inverse of the radial tail Lévy measure evaluated on the epochs of a standard Poisson process.
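The one-to-one correspondence of jumps can be sketched in one dimension. This is a hedged illustration rather than the construction of Subsection 4.2: the function names, the truncation to the n largest jumps, and the two radial tails r ↦ r^(-α) and r ↦ 2 r^(-α) (chosen because their right inverses are explicit) are assumptions made for the example. Both processes use the same Poisson epochs and the same direction marks, and differ only in the inverse tail applied to each epoch.

```python
import random

ALPHA = 1.5  # hypothetical stability index used in the toy example


def poisson_epochs(rng, n):
    """Return the first n arrival times of a unit-rate Poisson process."""
    epochs, total = [], 0.0
    for _ in range(n):
        total += rng.expovariate(1.0)
        epochs.append(total)
    return epochs


def comonotonic_jumps(rng, n, inv_tail_x, inv_tail_y, sample_direction):
    """Pair the n largest jumps of two processes in decreasing order of magnitude."""
    jumps_x, jumps_y = [], []
    for gamma in poisson_epochs(rng, n):
        v = sample_direction(rng)              # shared decoration (direction)
        jumps_x.append(inv_tail_x(gamma) * v)  # magnitude from the first radial tail
        jumps_y.append(inv_tail_y(gamma) * v)  # magnitude from the second radial tail
    return jumps_x, jumps_y


def demo_comonotonic(seed=0, n=10):
    rng = random.Random(seed)
    return comonotonic_jumps(
        rng,
        n,
        lambda g: g ** (-1.0 / ALPHA),          # inverse of r -> r^(-ALPHA)
        lambda g: (g / 2.0) ** (-1.0 / ALPHA),  # inverse of r -> 2 r^(-ALPHA)
        lambda r: r.choice((-1.0, 1.0)),        # directions on the "sphere" {-1, 1}
    )


if __name__ == "__main__":
    for x, y in zip(*demo_comonotonic()):
        print(x, y)
```

Paired jumps always share a sign and appear in decreasing order of magnitude. In this particular example the magnitudes differ by the constant factor 2 ** (1 / ALPHA), since the two tails differ only by a multiplicative constant.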
1.3. Comparison with the literature
Few results identifying the small-time convergence rate exist in the multivariate setting even when the limit is Gaussian. Indeed, most results in this regime are restricted to dimension one and often require finite jump activity [11, 10, 14, 15]. In those situations, the limit law of the rescaled error can be identified for some functionals, leading to accurate estimates of the resulting bias and the celebrated continuity corrections [10, 15].
For heavy-tailed stable limits (i.e. non-Gaussian) and infinite activity Lévy processes attracted to the Gaussian law, again in one dimension, the literature is scarcer and only a fraction of the analogous results exist (see [7] for the convergence of certain path statistics to heavy-tailed limits). There are several complications in developing such results in the small-time regime. First, the Berry–Esseen type bounds (see e.g. [24, 37]), commonly used to establish convergence rates to the Gaussian law, fail to give convergent upper bounds since the jump intensity is vanishing in the small-time regime. Second, the rescaled variables often either fail to be uniformly integrable or their uniform integrability is difficult to prove (see, e.g. [7]).
In [16], the authors consider estimating the density of a discretely observed Lévy process satisfying Orey’s condition. Under the assumption that sufficiently large jumps are identifiable and removable in the sample, the estimation attains a minimax rate that is optimal up to a logarithmic factor if the Blumenthal–Getoor index is known. This regime is different from our situation, as the authors assume that we may remove all sufficiently large jumps. In fact, under this kind of assumption, the residual small-jump process may not be attracted to a stable process but to a Brownian motion [2].
In [34, 30], the authors introduce couplings between Lévy processes to bound the Wasserstein distance between them. The coupling in [34] is generic and pays special attention to the small jumps. However, the bounds fail to converge to zero when applied to a stable process and a Lévy process in its small-time stable domain of attraction. In contrast to the coupling used in [34], where the authors couple the big-jump components based on the magnitude of the jumps (i.e. based on a common threshold), we couple these components matching their jump intensities. Moreover, in [34] the authors couple the small jumps through an artificial Brownian motion, while we instead couple the compensated Poisson measures directly. On the other hand, the coupling in [30], based on McCann’s coupling and Rogers’ results on random walks, is the optimal Markovian coupling. However, the coupling requires such tight control of the infinitesimal dynamics of the processes that the coupling could only be constructed for Lévy processes with finitely many jumps on compact intervals, excluding all heavy-tailed stable processes and most processes in the small-time domain of attraction of Brownian motion.
Although the slow convergence phenomenon under the presence of a slowly varying function that does not converge to a positive constant has been observed in some specific settings such as in the case of Hill’s estimator (see [17, p. 193–195]), to the best of our knowledge it was first documented rigorously in [9] in an elementary general setting. The authors in [9] lower bound the Prokhorov distance between the marginals of the limit and that of a random walk in its domain of non-normal attraction with a function , satisfying for any . However, as is often the case with lower bounds in the form of upper limits, the sparsity of the sequence of times along which the divergence holds remains unclear. The present paper extends the applicability of such a lower bound and strengthens the conclusions. In particular we show that the function analogous to is typically slowly varying and provide some explicit asymptotically equivalent lower bounds.
1.4. Organisation of the article
In Section 2 we introduce the main results of the paper, namely, upper and lower bounds on the convergence rate for processes in the domains of normal and non-normal attraction. Subsection 2.5 explains why the existing literature cannot be directly applied to obtain the bounds presented in Section 2. We present two examples in Section 3 in which our main results are applied to tempered stable processes. The two couplings for general Lévy processes on used to prove the upper bounds on the Wasserstein distance in Section 2 are introduced in Section 4. General upper bounds (in ) for each component in the Lévy–Itô decomposition of coupled Lévy processes are also established in Section 4. The upper bounds for processes in the domain of (normal and non-normal) attraction of a stable process (Gaussian and heavy-tailed) are established in Section 5, while the lower bounds are established in Section 6. The proofs of the results stated in Section 2 are given in Section 7. Section 8 concludes the paper.
2. Main results
The -Wasserstein distance , for any , between the laws of -valued random vectors and equals where the infimum is taken over all couplings with and (throughout denotes the Euclidean norm in and for ). (Footnote 1: For and any , we have . Hence, by integrating in , we obtain for all , thus implying that is a metric.) For -valued stochastic processes and , the -Wasserstein distance, based on the distance between the paths in the uniform norm, is given by:
(1) |
where the infimum is taken over all couplings with and , where means that and are equal in law as processes. The case is important in our setting because the stable limit does not necessarily possess the first moment.
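For intuition about the marginal distance, the classical real-valued case with exponent q ≥ 1 admits a simple computation: between two empirical samples of equal size, the infimum over couplings is attained by pairing order statistics (the comonotonic coupling). The sketch below covers only that special case; the function name is hypothetical, and it handles neither the path-space distance in (1) nor exponents below one.

```python
def empirical_wasserstein_1d(xs, ys, q=1.0):
    """Wasserstein distance of order q >= 1 between two equal-size real samples.

    Uses the classical fact that, on the real line and for q >= 1, pairing
    the sorted samples (comonotonic coupling) is optimal.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    if q < 1:
        raise ValueError("this sketch only covers q >= 1")
    total = sum(abs(a - b) ** q for a, b in zip(sorted(xs), sorted(ys)))
    return (total / len(xs)) ** (1.0 / q)


if __name__ == "__main__":
    print(empirical_wasserstein_1d([0.0, 1.0, 2.0], [3.0, 1.0, 2.0], q=1.0))
    print(empirical_wasserstein_1d([0.0, 0.0], [1.0, 3.0], q=2.0))
```

The first call compares a sample with its shift by one, so every sorted pair is at distance one; the second shows how a larger exponent weights the bigger discrepancy more heavily.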
Let and be Lévy processes in (see [40, Ch. 1, Def. 1.6] for definition), where is -stable (see Section 5.1 below for definition). (Footnote 2: Note that need not be isotropic: the angular component of its Lévy measure is not necessarily a uniform probability measure on the unit sphere in , see Section 5.1 for details.) We say is in the small-time domain of attraction of if as in the Skorokhod space for some normalising positive function . Then, it is well known that is -stable for some and the normalising function admits the representation where is a slowly varying function at infinity (see [26, Eq. (8)]) that is asymptotically unique (see Theorem 5.1 below for the description of all Lévy processes attracted to ). We say is in the domain of normal attraction when the slowly varying function converges to a positive finite constant as (see [20, p. 181]). Otherwise, we say is in the domain of non-normal attraction. Throughout the paper we denote for .
2.1. Heavy-tailed stable domain of normal attraction
For a Lévy process to be in the domain of attraction of an -stable process , the necessary condition in (25) of Theorem 5.1 suggests the Lévy measure of around the origin should be “asymptotically absolutely continuous” with respect to the -stable Lévy measure of (see also Remark 5.3(b) below). Assumption (T) quantifies the regularity of the corresponding density at the origin via the parameter (the larger is, the more asymptotic regularity there is). Assumption (T) is required for the upper bound on the rate of convergence in the scaling limit in Theorem 2.1 and is stated in Section 5.2 below. Moreover, Assumption (T), widely satisfied in practice (e.g. in the class of tempered stable processes [39] with ; cf. Section 3 below for specific examples), can be seen as quantifying the speed of convergence in the necessary condition (25) for to be in the stable domain of attraction.
Throughout the paper, for positive functions and , we use the notation as if , and as if .
Theorem 2.1.
Let , be -stable and be in the domain of normal attraction of .
(a) Let Assumption (T) hold for . Then for any with , as ,
(b) If does not have the law of , then for any there exists some satisfying
Remark 2.2.
(I) The upper bounds in Theorem 2.1(a) are based on the thinning coupling in Section 4.1 below. The upper bounds are asymptotically proportional to the lower bounds of Theorem 2.1(b) when either or when , making the thinning coupling rate-optimal with respect to these Wasserstein distances. Coincidentally, the upper bounds decay the fastest for small values of when and for when .
Moreover, the multiplicative constants in can be made explicit and depend on the dimension only through the characteristics of and . The lower bounds are based on the lower bound on the Toscani–Fourier distance, see Section 6.2 below for details.
(II) Note that for and that most models in practice satisfy Assumption (T) with . Theorem 2.1 thus focuses on the case in order to simplify the exposition, while retaining the key message of the paper. Our technical result Theorem 5.5 in Section 5 (resp. Lemma 6.4 in Subsection 6.2), used to prove part (a) (resp. part (b)) of Theorem 2.1, covers all parameters and -distances with . The statement of the corresponding general version of Theorem 2.1 is omitted for brevity.
∎
2.2. Stable domain of non-normal attraction
Consider the case where the slowly varying function in the scaling limit , as , is not asymptotically equivalent to a positive constant. In this section, we show that the lower bound on the -distance cannot be upper bounded by a positive non-decreasing function satisfying . The lower bound requires no assumptions (beyond being in the domain of attraction), while the assumptions for the upper bounds give us multiplicative non-asymptotic control over the distance from to for small and any .
Assumption (S).
There exist , such that is bounded with as , is a slowly varying function both at and at infinity and
The upper bound in Theorem 2.3(a) below requires an additional technical Assumption (C), see Section 5 below. Intuitively, these assumptions require non-parametric structural properties of the Lévy measure of that allow us to compare it to the Lévy measure of the stable limit . Indeed, the necessary condition in (25) of Theorem 5.1 suggests the Lévy measure of around the origin should “asymptotically admit a radial decomposition that is close to that of the stable process”. Assumption (C) states precisely this and specifies the proximity of the corresponding radial decomposition to that of the stable process via the parameters (as before, the larger and are, the closer the radial decompositions are). Moreover, both conditions are widely satisfied with , e.g. for the class of tempered -stable processes (see [39] and, for specific examples, Section 3 below).
Theorem 2.3.
Let be in the domain of non-normal attraction of an -stable process .
(a) Let and Assumptions (C) and (S) hold for some , and a function that is slowly varying at . Then
as for any with .
(b) Let and define for . Then for any ,
Moreover, cannot be upper bounded by a non-decreasing function with .
Since bounds for all small and is not asymptotically equivalent to a positive finite constant, Lemma 7.2 below (which extends [9, Prop., p. 683]) implies cannot be upper bounded by any non-decreasing function satisfying . The assumption on the slow variation of in Theorem 2.3(a) is not essential and may be replaced by assuming that dominates any (positive) power at zero. However, by Lemma 7.1 below, in most cases of interest such a function will be slowly varying.
Given a slowly varying function , the construction of functions and satisfying Assumption (S) is not immediately clear. However, in most cases and for a sufficiently regular , by virtue of Lemma 7.1, Assumption (S) will be satisfied by choosing as and a slowly varying (at and ) with (for , we denote ). In such cases, the lower bound in Theorem 2.3 is (by Lemma 7.1) proportional to , i.e. as , making the comonotonic coupling rate optimal with respect to the -distance when . The following corollary makes this precise and shows that this is the case for a large class of processes in the domain of non-normal attraction.
Corollary 2.4.
Let be in the domain of non-normal attraction of an -stable process .
(a) Let and Assumption (C) hold for some and . Suppose is with derivative equal to , where and is eventually positive.
Further suppose there exists a slowly varying function both at zero and infinity satisfying for . Define , then, for any with , we have as and
(b) Define iteratively the functions and for and . Suppose is eventually equal to where in and either with or with . Then satisfies the assumptions of Part (a).
Remark 2.5.
(I) The upper bound in Theorem 2.3(a) is based on the comonotonic coupling in Section 4.2 below. Since the bound is independent of both and , the restriction in and is nonessential. Indeed, if and satisfy Assumption (C), then any and also satisfy Assumption (C). Moreover, the multiplicative constants in can be made explicit and depend on the dimension only through the characteristics of and .
(II) The lower bounds are based on elementary estimates and a universal property of slowly varying functions, see Section 6.1 below for details.
When , despite the fact that we do not have an upper bound in Theorem 2.3(a) for this case, the lower bound of Theorem 2.3(b) ensures the nonexistence of a coupling that makes the -distance decay polynomially.
(III) Corollary 2.4 is a consequence of Theorem 2.3 and Lemmas 7.1 & 7.3 below. Furthermore, we stress that the resulting upper and lower bounds may converge slowly and at a rate that is, in some sense, “bounded away from polynomials” even for a very slow function or , , see Example 3.2 below. Furthermore, given any with as and , the functions are slowly varying, , and the corresponding functions are proportional as . Thus, we may construct processes such that is asymptotically bounded above and below by multiples of as , see Lemma 7.2 and Example 3.2 below.
(IV) For a given process , we may choose two asymptotically equivalent slowly varying functions and that have different convergence properties. Indeed, if is not asymptotically equivalent to , then the resulting bounds will change (recall that is only unique up to asymptotic equivalence and that ). For instance, fix , denote and let
Then and are slowly varying and as but and . Optimising the convergence rate within this class appears to be a very difficult task; however, the limitations imposed by Lemma 7.2 would apply to any choice of . A similar phenomenon was also observed recently in the standard central limit theorem for Lévy processes in [4], where the Kolmogorov distance is shown to satisfy (resp. fail) an integral condition for a non-standard (resp. standard) scaling.
(V) Theorem 2.3 makes full use of Assumption (S); however, a more detailed analysis that does not require to be slowly varying can be found in our technical result Theorem 5.9 in Section 5 below.
(VI) We note that a lower bound via the Toscani–Fourier distance is plausible but appears suboptimal since the rate has a polynomial factor. Moreover, we believe the slow lower bound in part (b) to hold for alone (i.e. without taking the maximum value between times and ). However, this remains a conjecture.
∎
2.3. Selecting the coupling
The main idea behind the proofs of Theorems 2.1 & 2.3 is a good coupling between and . The two couplings we apply in this article are the thinning coupling and the comonotonic coupling introduced in Sections 4.1 & 4.2. In Theorem 2.3 we solely apply the comonotonic coupling, since this yields clear and concise results. Note that one could apply the thinning coupling to get a similar result in the domain of non-normal attraction. However, since this would require a lengthy argument and would obscure the main results, it has been left out of the paper. In comparison, it is easier to use the comonotonic coupling to give bounds for processes in the domain of normal attraction.
Proposition 2.6.
Let be -stable with and be in its domain of normal attraction. Let Assumption (C) (with constant and ) hold for some . Then, for any with , we have, as ,
Remark 2.7.
Proposition 2.6 follows from Theorem 5.9 (see Remark 5.10). The assumptions in Theorem 2.1(a) and Proposition 2.6 are significantly different, making it necessary to split the upper bounds into two statements. Indeed, as seen in Example 3.4 below, Assumption (C) is slightly stricter than Assumption (T): there exist processes for which Assumption (T) holds but Assumption (C) fails. In the case where Assumptions (T) and (C) are valid simultaneously with the same parameter , Theorem 2.1 yields an upper bound that is never worse than that of Proposition 2.6. ∎
2.4. Gaussian domain of attraction
The domain of attraction to Brownian motion is substantially different as the previously described couplings are inapplicable. Obtaining a coupling between Brownian motion and other Lévy processes that reduced the -distance in uniform norm has been the work of a large body of literature (which we review in Subsection 2.5 below). In this paper, we use a simple independent coupling, which, heuristically, compares the pure-jump component of with the null process . Let denote the Euclidean inner product on , denote the zero-vector in as well as the zero-matrix in and let . Let for denote the characteristic function of . Furthermore, let be the Lévy-Khintchine exponent of , given by for and .
Theorem 2.8.
Let be a symmetric non-negative definite matrix on and define the process for where is a standard Brownian motion on independent of the pure-jump Lévy process with Blumenthal–Getoor index (defined in (27)).
(a) Suppose and fix any when is of infinite variation and otherwise. Then for any with , we have
(b) Pick any and define . Then for all , we have
(c) Let be the largest eigenvalue of . Suppose there exist and vectors with satisfying . Then for any we have for all sufficiently small .
Parts (a) and (c) of Theorem 2.8, with , imply that for processes whose pure jump part is in the domain of attraction of a -stable process, the upper and lower bounds are essentially proportional to and , respectively. These agree in the finite variation case with rate and also as with an arbitrarily deteriorating convergence rate. As shown in Figure 2 these bounds are not far from each other for fixed , and the powers of from the rates are also not far. In the ‘limiting case’ where is itself attracted to a Brownian motion and , the rescaled process is distributionally close to (see e.g. [26, Thm 2]) for a slowly varying function satisfying . It is thus natural to expect that the convergence, in this case, is slow as in Theorem 2.3 above, see Example 6.7.


2.5. Classical bounds are hard to apply!
The couplings and methods used to achieve these bounds are crucial, and they differ significantly from the classical methods used to find rates of convergence. Indeed, if we tried to use standard methods (namely, the Berry–Esseen theorem or [34]) to construct bounds for the small-time domain of attraction, the bounds would not converge as .
The Berry–Esseen theorem exploits an increase in the activity of the process to obtain bounds on the distance between a random walk and the limit law. In the small-time regime, the activity is instead decreasing, explaining the unsuitability of this tool in this context (see details below). In fact, the bound would converge to infinity as (and in particular does not go to zero) unlike the bounds introduced in this paper. The coupling in [34] couples corresponding components of the Lévy–Itô decompositions for a common small-jump cutoff level. When the time horizon is fixed and the Lévy measure is supported on , the bounds of [34] are asymptotically sharp as . However, for general Lévy measures and as time tends to , no time-dependent cutoff level can be used to obtain convergent bounds. The lack of convergent bounds in the small-time domain of attraction of stable processes is mainly caused by a difference in the jump intensities of the large-jump components (see details below).
We first explain why the Berry–Esseen theorem does not yield suitable bounds even when the limit is Gaussian (see [18, 22]). For the explanation, it is enough to consider the one-dimensional case. Let be a zero-mean Lévy process on with characteristic triplet (see [40, Def. 8.2]) and finite fourth moment. The variance of is given by , where . Then, is attracted to a standard Gaussian random variable as . Denote by the Lévy measure of , the Berry–Esseen theorem thus implies that there exists some universal constant , such that
for all . As we can see above, this upper bound will tend to infinity as and is therefore not an informative bound in the small-time regime.
For Lévy processes in the domain of attraction of an -stable law, an application of the bounds in [34] does not yield convergent bounds. The proofs of the bounds in [34] rely on the coupling of small jumps to a Gaussian law. Again, it is enough to consider the one-dimensional case. Let be symmetric and in the domain of attraction of the symmetric -stable random variable with . Suppose their Lévy measures satisfy and for . In this case we have as .
Let and apply [34, Thm 11] (at time and cutoff ) to obtain:
where we used the formula for the -Wasserstein distance in [19, p. 8]. For the first line in the display above to vanish as , we require . The term in the middle line of the display above equals
where is an arbitrary number and the inequality holds for all for which . For the right-hand side of the display to vanish at with we must have . Then, for , the display above will converge to the constant . In particular, the bound implied by [34, Thm 11] cannot vanish for any choice of .
3. Examples
In this section, we apply some of the main results from Section 2 to tempered -stable processes [39, Def. 2.1], which are in the domain of attraction of -stable processes. We say that a process is a tempered -stable process if it has no Gaussian component, and its Lévy measure has the form
(2) |
where is the unit sphere in and is a completely monotone Borel function (see [39, p. 680]) with for all and is the Borel -algebra on . In Examples 3.1 and 3.2 the process is a multidimensional tempered -stable process in the stable domain of attraction. Both examples are easily seen to fulfil Assumption (C) or (T). Example 3.4 shows that Assumption (T) does not imply (C), while Example 3.3 deals with a Gaussian perturbation of a tempered -stable process.
Example 3.1.
Assume that is an -stable process on and that is a tempered stable process with Lévy measure as in (2). Assume that , for all and , where is a bounded non-negative function. Thus,
If , then Theorem 2.1 with implies that , with lower bound given by as , for some finite constant . Thus, the upper and lower bounds have the same rate in this case, yielding a rate-optimal bound. ∎
Next, we give an example where the function is non-constant, and see how the rates deteriorate in these cases, as Theorem 2.3 implies. Throughout the paper, we use the notation as , if .
Example 3.2.
Assume that is an -stable process on with and that is a tempered stable process with Lévy measure as in (2). We assume that for all and (see (32)), where is differentiable and slowly varying, implying as . Then does not depend on and its value, denoted , satisfies as by [6, Cor. 2.3.4]. For any , by (33), we have
Theorem 2.3 now yields both the upper and lower bound in terms of and related functions.
Let be recursively defined as in Lemma 7.3 below: and for . If either or for some (i.e. or as ), then Lemma 7.3 shows that for small we have
Moreover, by Lemma 7.1, Assumption (S) holds with this . Thus, by Theorem 2.3, there exist constants such that, for all small enough , we have
Thus, despite the function being “nearly constant” for large , the convergence rates of the upper and lower bounds match and are slower than for any .
Now consider any with as and . Then are slowly varying, , and as by Lemma 7.1. Thus, by Theorem 2.3, for any , we have for some and all sufficiently small . In particular, by taking an appropriate , e.g., where is as in the previous paragraph and is large, the convergence in Wasserstein distance may be arbitrarily slow. ∎
Next, we consider the setting of Theorem 2.8, where the pure-jump Lévy process is a tempered -stable process.
Example 3.3.
Let be a positive definite matrix on and set for where is a standard Brownian motion on independent of the pure-jump tempered -stable Lévy process . Assume , that has zero mean, and fix any . Then, by Theorem 2.8(a), we have the upper bound as . To find the lower bound, we let be the largest eigenvalue of , and define , for some with . Then, for any , Theorem 2.8(c) implies that for all sufficiently small .
Note that, as approaches , the gap between the lower and upper bound decreases. Indeed, for , we have for some small , so the upper bound is of the rate , while the lower bound has the rate , making the quotient of the two bounds proportional to . ∎
Example 3.4.
In this example, we exhibit a process that satisfies Assumption (T) but not Assumption (C). Let and . Next, let be a one-dimensional -stable process and be a -stable process that is spectrally negative, with Lévy measures and for some constants . We note that has Lévy measure , showing that Assumption (T) is indeed fulfilled. However, Assumption (C) cannot be fulfilled, since the necessary radial decomposition of does not exist. ∎
4. Two couplings of Lévy processes
Let be a Lévy process on with generating triplet (also called characteristic triplet, see [40, Def. 8.2]) with respect to the cutoff function , where , (with transpose ) and a symmetric non-negative definite matrix and a Lévy measure on . Throughout, we denote by the Euclidean norm of appropriate dimension and let be the open ball in of radius , centered at the origin . Fix any and consider the Lévy–Itô decomposition of given by
(3) |
where , is a standard Brownian motion on , is the small-jump martingale containing all the jumps of of magnitude less than , is the driftless compound Poisson process containing all the jumps of of magnitude at least and all three processes , and are independent. Moreover, the pure-jump component of is a Lévy process with paths of finite variation (i.e. the jumps are summable on any compact time interval) if and only if [40, Thm 21.9]. In particular, is a characteristic triplet of a Lévy process without a Gaussian component. Thus, if has finite variation, then has zero natural drift (i.e. the process equals the sum of its jumps) if and only if .
Similarly, we let be a Lévy process on with characteristic triplet with respect to the cutoff function and whose corresponding Lévy–Itô decomposition is given by , defined as above. The following elementary inequality will be used throughout: for any ,
(4) |
where in the last term denotes the Frobenius norm on (i.e. for ). For completeness, we give a proof of (4) in Appendix C below.
Let and be the Poisson random measures on of the jumps of and , respectively, with corresponding compensated measures and , where Leb denotes the Lebesgue measure on . Since, for every , we have
(5) |
the problem of coupling the jump components of and is reduced to coupling the Poisson random measures and . Sections 4.1 and 4.2 below each describe such a coupling.
4.1. Thinning
Choose any Lévy measure on that dominates both and with Radon-Nikodym derivatives bounded by -a.e., i.e. and -a.e. For instance, a possible choice of is . Let be a Poisson random measure on , with mean measure and the corresponding compensated Poisson random measure . Assume the sequence of iid uniform random variables on is independent of . The Marking and Mapping Theorems [31] imply that the following Poisson random measures
(6) |
have mean measures and , respectively. We couple and by choosing in their Lévy–Itô decompositions and couple their jump parts from (5) via the coupling of the Poisson random measures in (6).
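The thinning construction can be illustrated numerically. The following is a minimal sketch under assumptions that are not taken from the text: the processes are one-sided with the hypothetical Lévy densities x^(-1-alpha) and exp(-x) x^(-1-alpha), only atoms of magnitude at least eps are simulated, and the dominating measure is chosen equal to the first Lévy measure, so that the two Radon–Nikodym derivatives are 1 and exp(-x). The function names are illustrative only.

```python
import math
import random


def sample_poisson_atoms(rate, horizon, sample_mark, rng):
    """Atoms (time, mark) of a Poisson random measure on [0, horizon] x marks
    with mean measure Leb x (rate * law of sample_mark)."""
    atoms, t = [], rng.expovariate(rate)
    while t <= horizon:
        atoms.append((t, sample_mark(rng)))
        t += rng.expovariate(rate)
    return atoms


def thinning_coupling(alpha, eps, horizon, seed=0):
    """Couple the jumps of size >= eps of two hypothetical one-sided processes:
    X with Levy density x^(-1-alpha) and Y with density exp(-x) * x^(-1-alpha).
    Assumption: mu = nu_X dominates both, with f_X = 1 and f_Y(x) = exp(-x)."""
    rng = random.Random(seed)
    rate = eps ** (-alpha) / alpha  # total mass mu([eps, infinity))

    def pareto(r):
        # Inverse-transform sample from mu restricted to [eps, infinity), normalised.
        return eps * (1.0 - r.random()) ** (-1.0 / alpha)

    jumps_x, jumps_y = [], []
    for t, x in sample_poisson_atoms(rate, horizon, pareto, rng):
        u = rng.random()            # independent uniform mark of the atom
        if u <= 1.0:                # f_X(x) = 1: X keeps every atom
            jumps_x.append((t, x))
        if u <= math.exp(-x):       # f_Y(x) = exp(-x): Y keeps a thinned subset
            jumps_y.append((t, x))
    return jumps_x, jumps_y
```

In this sketch every jump of Y is also a jump of X at the same time and of the same size, so the difference of the two large-jump components consists only of the atoms rejected for Y. This is the mechanism that the bounds of this subsection quantify.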
Proposition 4.1.
Proof.
Denote for any function mapping into . Define the Poisson random measures
(8) |
with mean measures and , respectively. Thus . Note that is independent of since they are both thinnings of the same Poisson random measure and have disjoint supports. Let and denote their respective compensated Poisson random measures and define the Lévy processes by , where . By construction, and are independent square-integrable martingales, satisfying and for all . In particular, we have and, by Campbell’s formula [31, p. 28],
Doob’s maximal inequality [29, Prop. 7.16], applied to the submartingale , and the independence of martingales and yield
Assume that and , and define the Lévy processes by . By the integrability assumption and construction, and are independent square-integrable processes with and for all . Thus, Campbell’s formula [31, p. 28] yields
Next, Doob’s maximal inequality applied to the submartingale , and the independence between and , yield
The following bound is required when the big jump components have infinite variance.
Proposition 4.2.
Note that (9) & (10) hold without assuming and have a finite -moment. If this holds, however, the bound is non-trivial because the big jumps in Proposition 4.2 are then controlled by .
Proof.
Recall that and let . Next, for , we define the processes and . Let be as in (8), and define as (both being Lévy processes), and note that for and . Note that and have a finite -moment for all . Moreover, by the triangle inequality and the fact for all , we have
(11) |
Recall that and are the mean measures of and , respectively. Thus, by taking expectations in (11) and applying Campbell’s formula [31, p. 28], we get
Due to the monotone convergence theorem, it follows that, as ,
Furthermore, Fatou’s lemma together with the above observations imply that
We have a.s., since the largest jumps of and are finite on the time interval . This implies (9).
4.2. Comonotonic coupling
In this section, we introduce the -dimensional comonotonic coupling of jumps for any . We use two ingredients to construct this coupling of the Lévy processes and : (I) the comonotonic coupling of real-valued random variables and , given by , where is uniform on and the functions and are the right inverses of the functions and ; (II) LePage’s representation of the Poisson random measures of a Lévy process (see [38, p. 4]).
The comonotonic coupling of the real-valued variables in (I) is optimal for the -Wasserstein distance (see [36, Ex. 3.2.14]), for . The representation in (II) decomposes each jump of a Lévy process into its magnitude (i.e. norm) and angular component. The main idea behind our coupling of the Lévy processes and is to couple their respective Poisson random measures of jumps via a comonotonic coupling of the magnitudes of jumps, while simultaneously aligning their angular components. We now describe this construction.
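Ingredient (I) can be sketched in a few lines. The example below assumes, purely for illustration, that the two real-valued variables are exponential with hypothetical rates `rate_x` and `rate_y`, since their quantile functions are explicit; a single uniform variable is passed through both right inverses.

```python
import math
import random


def exp_quantile(u, rate):
    """Right inverse of the distribution function of an Exp(rate) variable."""
    return -math.log1p(-u) / rate


def comonotonic_pairs(n, rate_x, rate_y, seed=0):
    """Draw n comonotonically coupled pairs (F_X^{-1}(U), F_Y^{-1}(U))."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()  # one uniform variable drives both coordinates
        pairs.append((exp_quantile(u, rate_x), exp_quantile(u, rate_y)))
    return pairs


def mean_cost(pairs, q):
    """Monte Carlo estimate of E|X - Y|^q under the given coupling."""
    return sum(abs(x - y) ** q for x, y in pairs) / len(pairs)
```

With rates 1 and 2 the coupled pair satisfies Y = X/2, so that E|X - Y| = 1/2 in this toy case; `mean_cost(comonotonic_pairs(n, 1.0, 2.0), 1.0)` estimates this quantity.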
Recall that the Lévy processes and in have characteristic triplets and , respectively. Suppose the Lévy measure (resp. ) of (resp. ) admits a radial decomposition (see [32, p. 282]), that is, there exists a probability measure (resp. ) on the unit sphere (with convention ) such that:
for any , where (resp. ) is a measurable family of Lévy measures on . Define the probability measure on and the Radon-Nikodym derivatives and for . Consider the following radial decompositions of and :
(12) |
for , where and for . The advantage of the decomposition in (12), compared to the one in the display above, is that the angular components of jumps are sampled from the same measure on , making it possible to couple the jumps of and by coupling their magnitudes.
For every , let (resp. ) be the right inverse of (resp. ). Let be a sequence of iid uniform random variables on , and let be a sequence of partial sums of iid standard exponentially distributed random variables that is independent of . Next, independent of , we denote by a sequence of iid random vectors on with common distribution . Define the Poisson point process on with measure and the compensated Poisson random measure as follows:
(13) |
Next, we note that (by Proposition 4.3 below) for any (and even when and both have jumps of finite variation), the small-jump components of and take the form
(14) |
The big-jump components of and can similarly be expressed as
(15) |
Proposition 4.3.
Let Lévy processes and have characteristic triplets and , respectively. Assume that the Lévy measures of and admit the radial decomposition in (12) and construct the processes by (14) and (15), independent of standard Brownian motions and on . Then there exist constants , such that and for all . Moreover, this coupling of and satisfies
(16) |
Furthermore, if and , then
(17) |
where we define . In particular,
(18) |
Coupling the jumps of and via (14) and (15) is based on the idea behind the one-dimensional comonotonic coupling, applied to the magnitudes of the jumps of and . Indeed, in the coupling of Proposition 4.3, we align the angular components of the jumps and then couple the magnitudes via the right inverses and (of possibly unbounded functions and ) evaluated along the sequence of partial sums of iid standard exponentially distributed random variables. Note that this construction is analogous to the one-dimensional comonotonic coupling of real random variables described above, but allows for the functions and to be unbounded.
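The construction described in the preceding paragraph can be sketched as follows. This is only an illustration under stated assumptions, not a reproduction of (13)–(15): the time horizon is taken to be 1, the dimension is 2, the common angular law is uniform on the circle, and both radial tails are of the hypothetical pure power form c x^(-alpha), with intensities `c_x` and `c_y`, so that their right inverses are explicit.

```python
import math
import random


def stable_tail_inverse(u, c, alpha):
    """Right inverse of the radial tail x -> c * x^(-alpha)."""
    return (u / c) ** (-1.0 / alpha)


def coupled_big_jumps(alpha, c_x, c_y, kappa, seed=0):
    """Jumps of magnitude >= kappa on [0, 1] for two planar processes whose
    radial tails are c_x * x^(-alpha) and c_y * x^(-alpha) in every direction."""
    rng = random.Random(seed)
    jumps_x, jumps_y = [], []
    # Only levels Gamma_n below the tail at kappa can produce a jump >= kappa.
    level_max = max(c_x, c_y) * kappa ** (-alpha)
    gamma = rng.expovariate(1.0)                  # Gamma_1
    while gamma <= level_max:
        time = rng.random()                       # jump time, uniform on [0, 1]
        angle = rng.uniform(0.0, 2.0 * math.pi)
        v = (math.cos(angle), math.sin(angle))    # shared direction on the circle
        r_x = stable_tail_inverse(gamma, c_x, alpha)
        r_y = stable_tail_inverse(gamma, c_y, alpha)
        if r_x >= kappa:
            jumps_x.append((time, (r_x * v[0], r_x * v[1])))
        if r_y >= kappa:
            jumps_y.append((time, (r_y * v[0], r_y * v[1])))
        gamma += rng.expovariate(1.0)             # Gamma_{n+1}
    return jumps_x, jumps_y
```

Both processes read their jump magnitudes off the same increasing sequence of partial sums, so the n-th largest jump of one is paired with the n-th largest jump of the other, at the same time and in the same direction. In this power-law toy case the paired magnitudes differ by the constant factor (c_y / c_x)^(1/alpha).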
Proof.
We start by showing that there exist , such that and for all . The proof of this fact is essentially given in [38, p. 4]; we outline it here for completeness. By the symmetry of the construction, it is sufficient to prove the first equality in law only. Since is a Lévy process, is a Poisson random measure on of the jumps of with mean measure [40, Thm 19.2]. By (14) and (15), the equality in law holds for some if
(19) |
To prove this, consider the Poisson random measure on , with mean measure , defined in (13). Define by . Crucially, by construction, we have on . Thus, by the Mapping Theorem [31, Sec. 2.3], we get . Moreover, since by construction, the equality in law in (19) follows.
Next, we prove that
is a square-integrable martingale. Let be the -field generated by for and . Then is adapted w.r.t. and fulfils the martingale property by virtue of being an integral with respect to a compensated Poisson random measure. Furthermore, by the triangle inequality, is square integrable since both and are square integrable. Since the process is a submartingale, Doob’s maximal inequality [29, Prop. 7.16] and Campbell’s formula [31, p. 28] imply
If and have finite second moment, a similar bound can be established for the big-jump components using Doob’s maximal inequality and Campbell’s formula:
Proposition 4.4.
As was the case for the thinning coupling, we can again note that (20) & (21) hold even without assuming that and have a finite -moment. However, under such an assumption, the upper bounds are finite since the integral on the right of (20) is bounded by for some .
Proof.
For , we denote by and the truncated large jumps of and , given by
Note that , and thus, from the concavity of for , it follows that
Since we have that , and Campbell’s theorem [31, p. 28] then implies that
Thus, altogether, this implies that
Due to the monotone convergence theorem, it follows, as , that
Furthermore, Fatou’s lemma together with the above observations imply that
We can now conclude (20), as a.s., since the largest jumps of and are finite on the time interval .
5. Upper bounds on the Wasserstein distance in the domain of attraction
The main aim of this section is to prove the upper bounds in Theorems 2.1, 2.3 & 2.8 above. In Section 5.1 we give the characterisation, in terms of their generating triplets, of the Lévy processes in that are in the stable domain of attraction. The proof of the upper bounds in Theorem 2.1, based on the thinning coupling, is given in Section 5.2. The upper bounds in Theorem 2.3 are established in Section 5.3 using the comonotonic coupling. In Section 5.4, we prove the upper bounds of Theorem 2.8 for the Brownian limit. In the proofs, we will rely on the following consequence of Jensen’s inequality:
(22) |
5.1. Small-time domain of attraction for Lévy processes
We start by defining the attractor.
Definition.
For any , the law of an -stable Lévy process is given by a generating triplet (for the cutoff function ) as follows: the Lévy measure equals
(23) |
where is a probability measure on and an “intensity” parameter, satisfying
• [Brownian motion with zero drift]: , and (i.e. );
• [infinite variation, zero-mean process]: , and ;
• [Cauchy process]: either , with symmetric angular component , or and the process is a deterministic nonzero linear drift, i.e. for all times ;
• [finite variation and zero natural drift]: and .
It follows from the definition that an -stable process satisfies the scaling property for . Moreover, for (resp. ), a non-deterministic -stable process is of infinite (resp. finite) variation by [40, Thm 21.9], since (23) implies (resp. ). Note also that in the case of the Cauchy process (stability index ), can be arbitrary if and satisfies if .
For any , define for any . The following known result characterises the Lévy processes in the domain of attraction of an -stable process defined above. It is a consequence of [29, Thm 15.14] and [26, Thm 2], see Appendix B below for the proof.
Theorem 5.1 (Small-time domains of attraction).
Let and be Lévy processes in . Then as in the Skorokhod space for some positive normalising function if and only if is -stable for some , the normalising function admits the representation , where is a slowly varying function at infinity, and the generating triplets and (for the cutoff function ) of and , respectively, are related as follows:
• if (attraction to Brownian motion), then
(24)
• if , we have and
(25)
• if (attraction to Cauchy process), then (25) holds,
(26)
and, for any , such that has finite variation (i.e. ) and , the process has zero natural drift: .
• if , then (25) holds, has finite variation (i.e. ) and zero natural drift (i.e. ).
Moreover, the function satisfying the weak limit above is asymptotically unique at : a positive function satisfies as if and only if as .
Note that in the case in Theorem 5.1, we may have (see Example 6.7 below), but in this case the function cannot be asymptotically equal to a positive constant. Moreover, in the case , the process does not require centering since its mean is linear in time and thus disappears in the scaling limit. However, in the finite variation case (i.e. when ), the process must have zero natural drift for the scaling limit to exist.
5.2. Domain of normal attraction: the thinning coupling
Let denote the generating triplet [40, Def. 8.2] of with respect to the cutoff function on . Define the Blumenthal–Getoor (BG) index of by
(27) |
Fix as follows: if ; if and , then pick ; if and , then and hence choose . In particular, note that and . Furthermore, if (or, equivalently, if the pure-jump component of the Lévy–Itô decomposition (3) of is of finite variation), we say that has zero natural drift if and nonzero natural drift otherwise. Moreover, if (or, equivalently, the pure-jump component of is of finite activity, i.e. a compound Poisson process), then . If , throughout the paper we use the convention .
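To make the role of the Blumenthal–Getoor index concrete, the following sketch assumes the usual definition of the index as the infimum of the exponents p for which the integral of |x|^p against the Lévy measure over the unit ball is finite, together with a hypothetical one-sided Lévy density x^(-1-alpha) on (0, 1). For this density the truncated integral over [eps, 1) is available in closed form and stays bounded as eps tends to 0 exactly when p > alpha, so the index equals alpha.

```python
import math


def truncated_moment(p, alpha, eps):
    """Integral of x^p * x^(-1-alpha) over [eps, 1), in closed form."""
    if math.isclose(p, alpha):
        return -math.log(eps)
    return (1.0 - eps ** (p - alpha)) / (p - alpha)


def moment_profile(p, alpha, eps_values):
    """Truncated moments along a decreasing sequence of cutoffs eps."""
    return [truncated_moment(p, alpha, eps) for eps in eps_values]
```

For instance, `moment_profile(1.2, 1.0, [1e-2, 1e-4, 1e-6])` stays below 1/(p - alpha) = 5, whereas `moment_profile(0.8, 1.0, [1e-2, 1e-4, 1e-6])` grows without bound as the cutoff decreases.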
The following lemma gives an upper bound on the moments of the supremum of the norm of a general Lévy process. Lemma 5.2 plays an important role in the proofs of Section 5.
Lemma 5.2.
Let be a Lévy process with generating triplet . Recall the Blumenthal–Getoor index from (27) and the associated quantity . Assume that, for some , we have . Then there exist constants , , such that
If and has zero natural drift, i.e. , then in the inequality above.
Note that, by the definition of above, the pure-jump component of is a compound Poisson process if and only if . In particular, if in addition in this case we have zero natural drift, then the pure-jump component of is a compound Poisson process. The term in the bound of Lemma 5.2 is present only if has a non-trivial Gaussian component.
Lemma 5.2 is a multidimensional generalisation of [23, Lem. 2]. The proof of Lemma 5.2, given in Appendix A below, is likewise a multidimensional generalisation of the arguments in the proof of [23, Lem. 2]. As in [23, Lem. 2], the constants , , can be given explicitly in terms of the characteristic triplet of .
Consider a Lévy process in in the domain of normal attraction of the -stable process . Thus we may assume that converges weakly to as . We will now apply the thinning coupling, described in (5) and (6) of Subsection 4.1 above, to quantify this convergence in terms of the Wasserstein distance under the following assumption.
Assumption (T).
Let the Lévy process be in the small-time domain of attraction of a stable process . Assume has no Gaussian component (i.e. ) and its Lévy measure has a decomposition satisfying the following: is arbitrary with finite mass and
a measurable function and constants , and .
Remark 5.3.
(a) Condition (25) in Theorem 5.1 suggests that the Lévy measure of the process , which is in the domain of attraction of a stable process , possesses a decomposition of the type . Since Assumption (T) stipulates the regularity of the density of with respect to , it may be interpreted as specifying the rate of convergence in the limit in (25) of Theorem 5.1.
Remark 5.4.
Under Assumption (T), we may decompose the process as the sum of independent Lévy processes and with generating triplets and , respectively, such that, when , both processes have zero natural drift (note that for , Assumption (T) and Theorem 5.1 imply that has zero natural drift), and when then has zero natural drift. For , let and and note that has the same law as . We couple and via the coupling given in (5) and (6) of Subsection 4.1. ∎
Theorem 5.5.
Let and Assumption (T) hold for some . Then, for any with , we have as . Moreover, for any , we let for and some . Then, as , we have
(28) | |||
(29) | |||
(30) |
Remark 5.6.
A careful case-by-case analysis reveals that the upper bound implied by Theorem 5.5 on the distance above (which decreases as fast as the slowest of the terms on the right) decreases the fastest when is chosen as follows (recall ):
Moreover, in that case, we have
Since the above bounds are not easily interpretable because of the multiple cases depending on the parameters , we decided to only present the case in Theorem 2.1 above. In particular, this removes the possibility of a logarithmic term appearing in the upper bound. ∎
Proof of Theorem 5.5.
The bound on follows directly from Lemma 5.2 with and the construction of .
We now consider the process . Define the measure
where (resp. ) is in Assumption (T) (resp. in (23) of the definition of ). The Radon–Nikodym derivative equals on the support of , since the Lévy–Khintchine exponent satisfies and hence
First we bound the large-jump component : inequality (9) of Proposition 4.2 yields
Recall that as means that there exists some such that for all . Using Assumption (T), as , we obtain
where . Next, as , we note that
Thus, since , altogether we have, as ,
Next, we find the rate for the small-jump component . Assumption (T) and (7) of Proposition 4.1 imply that, as ,
Next, we control the difference of the drift terms. First, consider the infinite variation case . Since has zero mean, representation (3) implies
Thus, we obtain
(31) |
By Assumption (T), the integral in the display satisfies
where we used the fact that . By (31), we obtain
In the finite variation case , recall that and have zero natural drift, so that
Thus, we have, by Assumption (T),
5.3. Domain of non-normal attraction: the comonotonic coupling
Let be an -stable process on for some , defined as in Section 5.1, with “intensity” parameter , probability measure on and Lévy measure in (23). Define the measure on and note that the right inverse of its tail is given by for all and . The comonotonic coupling of and a Lévy process requires the following assumption on the generating triplet of .
Assumption (C).
has no Gaussian component and , where the measure is arbitrary with finite mass and the Lévy measure can be expressed as
(32) |
for all , , and some monotonic function , slowly varying at , and a measurable . Assume that the functions , and
(33) |
where is the right inverse of , satisfy
(34) |
and some constants and .
Under Assumption (C), is monotonic and slowly varying at . In fact, for any , we have as by virtue of (34), which is slowly varying by [6, Prop. 1.5.7(ii)] (since is regularly varying and is slowly varying). Note also that may be either non-increasing or non-decreasing.
Remark 5.7.
Condition (25) in Theorem 5.1 states that the Lévy measure of the process in the domain of attraction of a stable process behaves as the Lévy measure of in every half-space of the form for every and small . Assumptions (S) and (C) may thus be interpreted as a refinement of this condition, requiring the Lévy measure of to satisfy an analogue of (25) but on every ray directed by and quantifying how fast such a limit holds. In particular, under Assumptions (S) and (C) and for defined in (33), we let with . For such a it follows that , i.e. is in the small-time domain of non-normal attraction of (by Theorem 5.1 above). ∎
Remark 5.8.
We decompose the process as the sum of the independent Lévy processes and with generating triplets and , respectively, such that, when , both processes have zero natural drift, and when then has zero natural drift. Let the processes be coupled as in (14) and (15) from Subsection 4.2, where and for . Note that has the same law as for and that, under Assumption (C), has a finite -moment for every by (32). ∎
Theorem 5.9.
Remark 5.10.
Lemma 5.11.
Under Assumption (C), there exists a function and a constant , such that, for all and ,
(38) |
Proof.
Note that for all and . Hence, for all and , , implying that
Thus, the first part of (38) holds if .
Proof.
By Lemma 5.11, for all , and , it holds that
(39) |
Hence, Proposition 4.4 now implies that
To bound , we use the triangle inequality and the fact that is concave, to obtain
We consider each integral on its own. Assumption (S) implies that the first integral in the display above is bounded by . Next, we bound the second integral in the display above. Assumption (S) and (34) yield, as ,
Proof.
In the following lemma, we find at what rate the drifts converge.
Lemma 5.14.
Proof.
First, assume that . The proof in this setting follows the steps of the proof of Lemma 5.12. Note that
and since has a finite first moment, it follows that
Recall from (39) that , which implies that
The triangle inequality now implies that
The two terms in the upper bound are denoted by and . Following the calculations made in the proof of Lemma 5.12, we see by Assumption (S) and (38) that in the display above is bounded by , and, as ,
5.4. Brownian limits: upper bounds
In this subsection, we construct upper bounds on the distance between a Lévy process with nonzero Gaussian component and its attracting Brownian motion. Recall that denotes the characteristic triplet [40, Def. 8.2] of with respect to the cutoff function on and is given in terms of the BG index defined in (27).
Proposition 5.15.
Let be a Lévy process on with the characteristic triplet . Let for and assume for some . Let be the Gaussian component of in its Lévy–Itô decomposition (3) and define .
(a) If is of infinite variation or has finite variation with infinite activity and zero natural drift, then
(b) If has finite variation and nonzero natural drift, then
Note that, if has infinite activity we have and if the BG index , then . Hence, Proposition 5.15 provides bounds on the rate of convergence in the appropriate -Wasserstein distance for the weak limit in Theorem 5.1 (case and asymptotically constant). In the case , it is well-known that converges weakly to the Gaussian law of (see e.g. [5, Prop. I.2(i)]), but the convergence of could be arbitrarily slow, see Example 6.7 below. It is thus not surprising that Proposition 5.15 gives no information about the rate of convergence. Note also that Proposition 5.15 covers the case . Moreover, the bound on the -Wasserstein distance when the BG index is less than one is sharper if the natural drift is zero, than if it is not.
Proof of Proposition 5.15.
Fix , let be the Brownian motion in the Lévy–Itô decomposition (3) of the Lévy process and recall . Since the Brownian motion satisfies the identity in law by self-similarity (Lévy’s characterisation theorem), there exists a coupling , such that and . Recalling , we obtain
(41) |
Note that the characteristic triplet of is given by and . In particular, the BG index of equals that of .
Part (a). Assume is of infinite variation. Since has no Gaussian component, by [40, Thm 21.9] we have , implying that the BG index of satisfies . Hence the associated quantity satisfies: . Thus, by Lemma 5.2, we have , implying that for .
If has finite variation and zero natural drift, then the bound in Lemma 5.2 with yields . Noting that implies Part (a).
6. Lower bounds on the Wasserstein distance in the domain of attraction
In this section we prove the lower bounds from Theorems 2.1, 2.3 & 2.8. We first cover the domain of non-normal attraction and then turn to the domain of normal attraction.
6.1. Domain of non-normal attraction
The lower bound on the rate of decay of as is much greater than polynomial when the scaling function is such that , which is slowly varying at , does not converge to a positive constant (i.e. the process is in the domain of non-normal attraction). To show this, we start with the following result, which can be viewed as an extension of [9, Thm 1] from random walks to multidimensional Lévy processes, stated for the -Wasserstein distance. (We remark here that an extension for the Prokhorov distance, used in [9], is also possible in this context.) Our proof below was inspired by that of [9, Thm 1]. This result is our main tool in the proof of the lower bound in Theorem 2.3.
Proposition 6.1.
Let be in the domain of non-normal attraction of an -stable process with and define for . Then, for all and ,
Lemma 6.2.
(a) Let be a random vector in , i.e. , for some . Then,
(b) Assume that the random vectors are in , for some , and that and are independent. Then the following inequality holds:
Proof.
(a) By the subadditivity of on , we have for any . A similar inequality holds by reversing the roles of and , implying . Hence, we have
Proof of Proposition 6.1.
Recall , and note that , where and are independent and equal in distribution. Furthermore, let be independent copies of . Recall that and note that . This together with Lemma 6.2(b) implies that
The scaling property for the -distance implies that . Putting everything together and applying the triangle inequality with Lemma 6.2(a), yields
6.2. Domain of normal attraction and the Toscani–Fourier lower bounds
We begin with the following technical result, used in the proofs of Theorems 2.1 & 2.8. Given two -dimensional random vectors and with characteristic functions and , respectively, define the Toscani–Fourier distance (see [3, Eq. (1)]) as
The following lemma is an extension of [34, Prop. 2] to the multivariate case and to -Wasserstein distances for , and the proof is inspired by the proof in the one-dimensional case. For completeness, we give a simple proof below.
Lemma 6.3.
For any random vectors and , we have .
Proof.
Fix . Since the map , , satisfies for , we have for any . Hence, for any ,
Since is arbitrary, the result follows. ∎
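The mechanism behind Lemma 6.3 can be checked numerically in a toy case. The sketch below rests on two assumptions not spelled out in the surviving text: the Toscani–Fourier distance of order s is taken in its standard form, the supremum over nonzero u of |phi_X(u) - phi_Y(u)| / |u|^s, and the elementary estimate is taken to be |e^{ix} - e^{iy}| <= 2^(1-q) |x - y|^q for q in (0, 1], which may differ from the source in its constant. The Gaussian pair is a hypothetical example chosen because its characteristic functions are explicit.

```python
import cmath
import math


def phase_gap(x, y):
    """|exp(ix) - exp(iy)| for real x, y."""
    return abs(cmath.exp(1j * x) - cmath.exp(1j * y))


def holder_bound(x, y, q):
    """Right-hand side 2^(1-q) * |x - y|^q of the elementary inequality."""
    return 2.0 ** (1.0 - q) * abs(x - y) ** q


def toscani_fourier_gaussian(sigma, s, grid):
    """Grid approximation of sup_u |phi_X(u) - phi_Y(u)| / |u|^s for
    X ~ N(0, 1) and Y ~ N(0, sigma^2) on the real line (grid must avoid 0)."""
    def ratio(u):
        gap = abs(math.exp(-0.5 * u * u) - math.exp(-0.5 * (sigma * u) ** 2))
        return gap / abs(u) ** s
    return max(ratio(u) for u in grid)
```

For s = q = 1 and the coupling Y = sigma X, taking expectations in the elementary inequality bounds the Fourier quantity by E|X - Y| = |sigma - 1| sqrt(2/pi), which the grid approximation respects in this example.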
6.2.1. Heavy-tailed domain of normal attraction
Let be a Lévy process on in the domain of attraction of an -stable process , such that as .
Lemma 6.4.
Let be a Lévy process that differs in law from the -stable process , . Let and denote their Lévy–Khintchine exponents. Pick any and for which . Then, we have
Proof.
First, Lemma 6.3 implies that
where denotes the characteristic function of the random vector . Second, set with as in the statement of the lemma and note that
Similarly, since , we have and hence
Since for any we have as , it follows that for and . The result then follows. ∎
6.2.2. Brownian domain of normal attraction
The domain of normal attraction to a Brownian motion consists of the class of Lévy processes with a nontrivial Brownian component (see e.g. [26] and [5, Prop. I.2(i)]). Constructing a lower bound on the distance between the Lévy process and its Brownian limit requires the following lower estimates.
Lemma 6.5.
Let be a nonzero pure-jump Lévy process on , let denote its Lévy–Khintchine exponent and its Lévy measure.
(a) If has finite variation and nonzero drift with direction , then for some and all sufficiently large .
(b) Suppose there exist a locally finite measure on and a probability measure on with
Define for , and, given any , let be such that the set has positive -measure . Then we have
In particular, if for some , then , .
Proof.
Let and denote the real and imaginary parts of , respectively.
(a) Since has finite variation, it is clear from the Lévy–Khintchine formula without compensator that for some and all sufficiently large . Indeed, this follows from [5, Prop. 2(ii)] applied to the finite variation Lévy process .
(b) Note first that for all . Thus, the Lévy–Khintchine formula applied to yields
This proves the first claim in Part (b). The second claim follows from the additional assumption. ∎
Lemma 6.6.
Let be a Lévy process on with the characteristic triplet . Let for . Moreover, let be the Gaussian component of in its Lévy–Itô decomposition (3) and define with Lévy–Khintchine exponent .
(a) Pick any and define . Then, we have
(b) Let be the largest eigenvalue of . Suppose there exist and vectors on satisfying and . Then for any we have
Proof.
Since and are independent, we have . Hence, Lemma 6.3 gives, for any ,
(42)
(a) Let be as in the statement. The result then follows from Lemma 6.4.
(b) The proof follows as in that of Lemma 6.4. The main idea is to use the fact that, if as and , then
Set and for . Note that as and . Since does not have a Brownian component, we have (see, e.g. [40, Lem. 43.11]) and thus as . Moreover, for by assumption. Thus, applying (42) with yields Part (b). ∎
Example 6.7.
Consider an example inspired by [26, Ex. 4.2.1]. Let be a real-valued martingale Lévy process with Lévy measure for some and all . Let be a standard Brownian motion independent of and let . Define for . Choose to satisfy as . The function is regularly varying at with index , and therefore as by [6, Thm 1.5.12]. Note that converges in law to a standard normal distribution by [26, Thm 2(i)]. Hence, the Lévy–Khintchine exponent of satisfies as for any . In particular, by taking , which tends to as , we obtain as . Then, a slight modification of the argument in the proof of Lemma 6.6(b) shows that . ∎
7. Proofs of Section 2
In this section we give the proofs of the results stated in Section 2.
Proof of Theorem 2.1.
Part (a). Recall from Assumption (T) and Remark 5.4 that we may decompose as the sum of independent Lévy processes and with generating triplets and , respectively. For , denote and . Let for and some . Assume the processes are coupled as in Subsection 4.1 (i.e. (5) and (6)).
Note that by the triangle inequality and the definition of . Theorem 5.5 (with and ) yields a bound on via (4) as follows:
Part (a) can now be deduced by optimising as follows. If then and taking sufficiently large makes all terms become . If and , then the bounds are , , and . Since and , all these bounds can be made by picking . If and , then the bounds are , , and and, as before, these can all be made by picking . Note here that the logarithmic term never arises in the dominant term, as it would require , but .
Part (b). The claim follows from a direct application of Lemma 6.4. ∎
In preparation for the proofs of Theorem 2.3 and Corollary 2.4, we establish Lemmas 7.1, 7.2 and 7.3 about slowly varying functions.
Lemma 7.1.
Let be and slowly varying such that its derivative equals for some , where is positive and slowly varying at infinity. Then, for each , we have as .
For , define for all . The function is asymptotically equivalent to as and, if , slowly varying at infinity. Moreover, for , the function satisfies
Proof.
Since is positive and is continuous, is either eventually positive or negative and either or is slowly varying at infinity, respectively. By [6, Thm 1.2.1], we have as for any . Thus for all sufficiently large we have
The dominated convergence theorem now yields
which establishes the first claim. Since , the function is positive on a neighbourhood of infinity and asymptotically equivalent to by the limit in the previous display. Moreover, since in the limit was arbitrary, for any we have
implying that is slowly varying at infinity.
To establish the non-asymptotic inequality in the lemma, fix and note that
Note that the assumption in Lemma 7.1 requires to be eventually strictly monotone. Moreover, if satisfies the conditions of Lemma 7.1, then so does for any .
Lemma 7.2.
(a) Let be slowly varying at infinity. Suppose that, for some and non-increasing function , we have for all and . Then has a positive finite limit at infinity.
(b) Let be slowly varying at infinity with as and . Then the functions are slowly varying at infinity, , and as for any .
Note that the smallest non-increasing function satisfying for all is given by .
Proof of Lemma 7.2.
Part (a). First assume . Define and note that as by the uniform convergence theorem [6, Thm 1.2.1]. As , we also have as , making finite. Since and for all and , we obtain
for all and . Similarly, for we have . For any set , implying . By the monotonicity of we obtain
(43)
If we had , there would exist an increasing sequence and such that and for all , contradicting (43). Hence the limit exists. By taking the limit as on the left-hand side of (43) for any fixed , it follows that . Thus has a finite and positive limit. The case can be established in a similar way.
Part (b). The statement follows from a direct application of Lemma 7.1. ∎
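The notion of slow variation at infinity used throughout Lemmas 7.1 and 7.2 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the two test functions (the logarithm and a small positive power) and the helper name `ratio` are chosen here for demonstration and do not appear in the paper. It evaluates the ratio of the function at 2x and at x for increasingly large x: for the logarithm the ratio approaches 1, as the definition of slow variation requires, whereas for the power function it stays at a constant different from 1, so that function is regularly varying with a nonzero index rather than slowly varying.

```python
from math import log


def ratio(ell, lam, x):
    """Return ell(lam * x) / ell(x), the quantity appearing in the definition of slow variation."""
    return ell(lam * x) / ell(x)


def ell_log(x):
    # Illustrative slowly varying function (an assumption of this sketch).
    return log(x)


def ell_power(x):
    # Illustrative regularly varying function with index 0.1, hence not slowly varying.
    return x ** 0.1


# Increasingly large arguments at which the ratios are evaluated.
xs = [10.0 ** k for k in (2, 8, 32, 128)]

log_ratios = [ratio(ell_log, 2.0, x) for x in xs]
power_ratios = [ratio(ell_power, 2.0, x) for x in xs]

for x, r_log, r_pow in zip(xs, log_ratios, power_ratios):
    print(f"x = {x:.0e}: log ratio = {r_log:.6f}, power ratio = {r_pow:.6f}")
```

The slow decay of the logarithmic ratio towards 1 is consistent with the general theme of Section 7 that quantities governed by slowly varying functions converge slower than any polynomial rate.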
Lemma 7.3.
Define iteratively the functions and for and . Then, the following statements hold.
(a) For any and , we have .
(b) We have .
(c) Suppose for some in and either with or with . Set , then we have
Proof.
For we have . For , we have
In particular, we may add to and and divide by to obtain
implying Part (a). Part (b) is obvious, so we need only establish Part (c).
It is simple to show that for , the fraction lies between and . An inductive argument implies that for any , we have
Thus, by virtue of Parts (a) and (b) and denoting , we have
Proof of Theorem 2.3.
Part (a). Recall from Remark 5.8 that we can decompose as the sum of the independent processes and with generating triplets and , respectively. For , let and . Assume that is coupled as in (14) and (15).
Proof of Corollary 2.4.
Part (b) follows from Lemma 7.3 above. Given Theorem 2.3, it suffices to show that the assumptions in Corollary 2.4(a) imply those of Theorem 2.3 and that the upper and lower bounds have the desired form. These facts follow from Lemmas 7.1 and 7.3. Indeed, for instance, the function in Assumption (S) is given by the upper bound on given in Lemma 7.3, where is as in Lemma 7.1. ∎
8. Concluding remarks
Over small time horizons, a Lévy process may be attracted to an -stable process with heavy tails (i.e. ) or Brownian motion (i.e. ). In this paper, we established upper and lower bounds on the rate of convergence in -Wasserstein distance in both regimes, as listed below.
- For and processes in the domain of non-normal attraction, the Wasserstein distance is bounded above and below by slowly varying functions (see Theorem 2.3), both of which are slower than any power of logarithm greater than .
- For and processes in the domain of normal attraction, we establish upper and lower bounds that are polynomial in (see Theorem 2.1). The established bounds are often rate optimal in -Wasserstein distance for or and proportional to .
- For (i.e. Brownian limit) and processes in the domain of normal attraction, the established upper and lower bounds are also polynomial in (see Theorem 2.8). In this case, the bounds are rate optimal when the Blumenthal–Getoor index and otherwise there is a polynomial gap. This suggests at least one of the bounds is not sharp. Establishing sharper bounds in this special case is nontrivial (as classical tools such as the Berry–Esseen theorem fail to provide converging bounds) and is therefore left for future work.
The process in Assumption (T) (resp. (C)) in the thinning (resp. comonotonic) coupling is assumed to have finitely many jumps on compact time intervals. Our results can be extended to the case where this process has infinitely many jumps on compact time intervals and a Blumenthal–Getoor index . For such an extension, the moments of , as a function of , can be controlled via Lemma 5.2 and would result in worse convergence rates as . We chose not to include this simple extension as the convergence rates would be much harder to express in terms of all the model parameters, resulting in a less concise presentation of our results.
The tools developed in Section 4 could be used for the omitted case to establish upper and lower bounds on the Wasserstein distance in the domains of normal and non-normal attraction. However, as multiple cases would arise, requiring careful treatment of the emerging slowly varying functions, we leave such an extension for future work.
The upper bounds in the heavy-tailed case are based on two distinct couplings introduced in Section 4: the comonotonic and thinning couplings. We mention briefly that it is possible to combine both couplings. Consider two Lévy processes and with Lévy measures and where and are measurable and is a Lévy measure. It is then possible to first apply the thinning coupling to synchronise the jumps arising from the Lévy measure , where , and then apply the comonotonic coupling to the remaining jumps of and with corresponding Lévy measures and . It appears, however, that this combined coupling does not yield improved rates of convergence to heavy-tailed stable limits as each coupling already attains optimal rates of convergence in most cases. We expect such a combined coupling to reduce the -distance between coupled processes by a constant factor.
When , the -stable limit has heavy tails and its -moment is finite if and only if , making it impossible to obtain general converging bounds in the -Wasserstein distance, which play a key role in various applications including Multilevel Monte Carlo. However, substituting the standard Euclidean metric on with an equivalent bounded metric would remove this obstruction. We expect our couplings to perform well and have a fast converging -Wasserstein distance under the bounded metric on . Such an extension of our results is also left for future work.
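The effect of replacing the Euclidean metric by a bounded one can be sketched in a toy one-dimensional experiment. Everything in the block below is an assumption made for illustration: the truncated metric min(|x - y|, 1) is one possible choice of bounded metric inducing the same topology, the symmetric Pareto-type sampler with tail index 1 stands in for a heavy-tailed law, and the pairing of order statistics is merely a convenient coupling (it gives an upper bound on the empirical transport cost and is not claimed to be optimal for the bounded metric). The point is only that the cost under the bounded metric can never exceed 1, whatever the tails, while the Euclidean cost is driven by the largest sampled values.

```python
import random


def euclid(x, y):
    # Standard Euclidean distance on the real line.
    return abs(x - y)


def bounded(x, y):
    # Truncated distance: a bounded metric (an illustrative choice, not from the paper).
    return min(abs(x - y), 1.0)


def sorted_coupling_cost(xs, ys, dist, q):
    """Average of dist(x, y)**q when the order statistics of the two samples are paired."""
    pairs = zip(sorted(xs), sorted(ys))
    return sum(dist(x, y) ** q for x, y in pairs) / len(xs)


def heavy_tailed(rng, alpha=1.0):
    """Symmetric Pareto-type variable with tail index alpha (hypothetical test law)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * u ** (-1.0 / alpha)


rng = random.Random(0)  # fixed seed for reproducibility
n = 10_000
xs = [heavy_tailed(rng) for _ in range(n)]
ys = [heavy_tailed(rng) for _ in range(n)]

q = 2.0
cost_euclid = sorted_coupling_cost(xs, ys, euclid, q)
cost_bounded = sorted_coupling_cost(xs, ys, bounded, q)
print(f"Euclidean cost: {cost_euclid:.4g}, bounded-metric cost: {cost_bounded:.4g}")
```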
The present work focused on the small-time stable domain of attraction where the small jumps of the process dominate the activity. Finally, we remark that it is natural to expect that the couplings developed in this paper could also typically achieve asymptotically optimal convergence rates in the Wasserstein distance in the scaling limits of the long-time stable domain of attraction. This is because in the long-time horizon regime the activity in the limit is dominated by the large jumps of the Lévy process, which are also efficiently coupled under the couplings of Section 4 above.
Acknowledgements
JGC and AM were supported by EPSRC grant EP/V009478/1 and The Alan Turing Institute under the EPSRC grant EP/N510129/1; AM was also supported by EPSRC grant EP/W006227/1; DKB was funded by the CDT in Mathematics and Statistics at The University of Warwick and is supported by AUFF NOVA grant AUFF-E-2022-9-39. The authors would like to thank the Isaac Newton Institute for Mathematical Sciences, Cambridge, for support during the INI satellite programme Heavy tails in machine learning, hosted by The Alan Turing Institute, London, and the INI programme Stochastic systems for anomalous diffusion hosted at INI in Cambridge, where work on this paper was undertaken. This work was supported by EPSRC grant EP/R014604/1.
References
- [1] S. Asmussen and J. Ivanovs. Discretization error for a two-sided reflected Lévy process. Queueing Syst., 89(1-2):199–212, 2018.
- [2] S. Asmussen and J. Rosiński. Approximations of small jumps of Lévy processes with a view towards simulation. J. Appl. Probab., 38(2):482–493, 2001.
- [3] G. Auricchio, A. Codegoni, S. Gualandi, G. Toscani, and M. Veneroni. The equivalence of Fourier-based and Wasserstein metrics on imaging problems. Atti Accad. Naz. Lincei Rend. Lincei Mat. Appl., 31(3):627–649, 2020.
- [4] D. Bang, J. González Cázares, and A. Mijatović. A Gaussian approximation theorem for Lévy processes. Statist. Probab. Lett., 178:Paper No. 109187, 4, 2021.
- [5] J. Bertoin. Lévy processes, volume 121 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 1996.
- [6] N. H. Bingham, C. M. Goldie, and J. L. Teugels. Regular variation, volume 27 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1989.
- [7] K. Bisewski and J. Ivanovs. Zooming-in on a Lévy process: failure to observe threshold exceedance over a dense grid. Electron. J. Probab., 25:Paper No. 113, 33, 2020.
- [8] J. Blanchet and K. Murthy. Quantifying distributional model risk via optimal transport. Math. Oper. Res., 44(2):565–600, 2019.
- [9] C. Börgers and C. Greengard. Slow convergence in generalized central limit theorems. C. R. Math. Acad. Sci. Paris, 356(6):679–685, 2018.
- [10] M. Broadie, P. Glasserman, and S. Kou. A continuity correction for discrete barrier options. Math. Finance, 7(4):325–349, 1997.
- [11] M. Broadie, P. Glasserman, and S. G. Kou. Connecting discrete and continuous path-dependent options. Finance Stoch., 3(1):55–82, 1999.
- [12] M. E. Caballero, J. C. Pardo, and J. L. Pérez. On Lamperti stable processes. Probab. Math. Statist., 30(1):1–28, 2010.
- [13] S. Cohen and J. Rosiński. Gaussian approximation of multivariate Lévy processes with applications to simulation of tempered stable processes. Bernoulli, 13(1):195–210, 2007.
- [14] E. H. A. Dia and D. Lamberton. Connecting discrete and continuous lookback or hindsight options in exponential Lévy models. Adv. in Appl. Probab., 43(4):1136–1165, 2011.
- [15] E. H. A. Dia and D. Lamberton. Continuity correction for barrier options in jump-diffusion models. SIAM J. Financial Math., 2(1):866–900, 2011.
- [16] C. Duval, T. Jalal, and E. Mariucci. Nonparametric density estimation for the small jumps of Lévy processes. 2024.
- [17] P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling extremal events, volume 33 of Applications of Mathematics (New York). Springer-Verlag, Berlin, 1997. For insurance and finance.
- [18] V. Fomichov, J. González Cázares, and J. Ivanovs. Implementable coupling of Lévy process and Brownian motion. Stochastic Process. Appl., 142:407–431, 2021.
- [19] A. L. Gibbs and F. E. Su. On choosing and bounding probability metrics. Internat. Statist. Rev., pages 419–435, 2002.
- [20] B. V. Gnedenko and A. N. Kolmogorov. Limit distributions for sums of independent random variables. Addison-Wesley Publishing Co., Reading, Mass.-London-Don Mills., Ont., revised edition, 1968. Translated from the Russian, annotated, and revised by K. L. Chung, With appendices by J. L. Doob and P. L. Hsu.
- [21] J. González Cázares, D. Kramer-Bang, and A. Mijatović. Presentation on “asymptotically optimal Wasserstein couplings for the small-time stable domain of attraction”. YouTube presentation on the YouTube channel Prob-AM, 2024.
- [22] J. González Cázares and A. Mijatović. Simulation of the drawdown and its duration in Lévy models via stick-breaking Gaussian approximation. Finance Stoch., 26(4):671–732, 2022.
- [23] J. I. González Cázares, A. Mijatović, and G. Uribe Bravo. Geometrically convergent simulation of the extrema of Lévy processes. Math. Oper. Res., 47(2):1141–1168, 2022.
- [24] F. Götze. On the rate of convergence in the multivariate CLT. Ann. Probab., 19(2):724–739, 1991.
- [25] P. Hall. Two-sided bounds on the rate of convergence to a stable law. Z. Wahrsch. Verw. Gebiete, 57(3):349–364, 1981.
- [26] J. Ivanovs. Zooming in on a Lévy process at its supremum. Ann. Appl. Probab., 28(2):912–940, 2018.
- [27] J. Ivanovs and J. D. Thøstesen. Discretization of the Lamperti representation of a positive self-similar Markov process. Stochastic Process. Appl., 137:200–221, 2021.
- [28] O. Johnson and R. Samworth. Central limit theorem and convergence to stable laws in Mallows distance. Bernoulli, 11(5):829–845, 2005.
- [29] O. Kallenberg. Foundations of modern probability. Probability and its Applications (New York). Springer-Verlag, New York, second edition, 2002.
- [30] W. S. Kendall, M. B. Majka, and A. Mijatović. Optimal Markovian coupling for finite activity Lévy processes. Bernoulli, 30(4):2821–2845, 2024.
- [31] J. F. C. Kingman. Poisson processes, volume 3 of Oxford Studies in Probability. The Clarendon Press, Oxford University Press, New York, 1993. Oxford Science Publications.
- [32] R. LePage. Multidimensional infinitely divisible variables and processes. II. In Probability in Banach spaces, III (Medford, Mass., 1980), volume 860 of Lecture Notes in Math., pages 279–284. Springer, Berlin-New York, 1981.
- [33] S. Manou-Abi. Rate of convergence to alpha stable law using Zolotarev distance. Journal of Statistics: Advances in Theory and Applications, 18:166–177, 2017.
- [34] E. Mariucci and M. Reiß. Wasserstein and total variation distance between marginals of Lévy processes. Electron. J. Stat., 12(2):2482–2514, 2018.
- [35] P. Pegon, F. Santambrogio, and D. Piazzoli. Full characterization of optimal transport plans for concave costs. Discrete Contin. Dyn. Syst., 35(12):6113–6132, 2015.
- [36] S. T. Rachev and L. Rüschendorf. Mass transportation problems. Vol. I. Probability and its Applications (New York). Springer-Verlag, New York, 1998. Theory.
- [37] E. Rio. Upper bounds for minimal distances in the central limit theorem. Ann. Inst. Henri Poincaré Probab. Stat., 45(3):802–817, 2009.
- [38] J. Rosiński. Series representations of Lévy processes from the perspective of point processes. In Lévy processes, pages 401–415. Birkhäuser Boston, Boston, MA, 2001.
- [39] J. Rosiński. Tempering stable processes. Stochastic Process. Appl., 117(6):677–707, 2007.
- [40] K.-i. Sato. Lévy processes and infinitely divisible distributions, volume 68 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2013. Translated from the 1990 Japanese original, Revised edition of the 1999 English translation.
- [41] C. Villani. Optimal Transport: Old and New. Grundlehren der mathematischen Wissenschaften. Springer Berlin Heidelberg, 2008.
Appendix A Proof of Lemma 5.2
Given any consider the Lévy–Itô decomposition given in (3), where is a standard Brownian motion, is the pure-jump martingale containing all the jumps of of magnitude less than and is the driftless compound Poisson process containing all the jumps of of magnitude at least . Since , we have
(44)
By the elementary bound , , and , , we only need to bound the -th moment of each summand on the right-hand side of the display above. Recall that is the quantity associated to the BG index of defined in (27).
Case . Define . To bound the drift term , first assume and note that
Thus, is bounded by a constant multiple of . If and the natural drift of is zero (i.e. ), then is bounded (and convergent) as ,
making bounded by a multiple of . Hence, in this case, we may take in Lemma 5.2 (it will become clear from the remainder of the proof that none of the other summands on the right-hand side of the inequality in (44) will produce a term of order ).
The -th moment of the Brownian term is easily bounded by a constant multiple of since we have . If is a zero matrix, then and hence .
Next, we bound the big-jump term . Let and recall that for some Poisson random variable with mean and iid random vectors independent of with law . Recall, from the formula for the moments of a Poisson random variable, that , where denotes the Stirling number of the second kind. Note that the triangle inequality of the Euclidean norm implies that
Denote , and note that since , , are iid and independent of and , we find
(45)
Note that and hence is bounded in (recall that ), making the sum in the display also bounded. Denote , which we assumed finite, and hence
Thus, there is a finite constant such that
In the case where , it remains to bound the small-jump term . In this case, we show that the -th moment is bounded by a multiple of . We may assume without loss of generality that , as the remaining cases then follow from Jensen’s inequality, since for any . Since is a submartingale, Doob’s maximal inequality and the elementary inequality imply
for . Thus, to complete the proof it suffices to show that the expectation on the right is bounded as . Let be the vertices of the hypercube centered at the origin with sides parallel to the axes and side length (e.g., the vectors and are opposite vertices of this hypercube). Note that where denotes the -norm in . Hence, it suffices to show that is bounded as for each . The Lévy–Khintchine formula, the elementary inequality for , , and the Cauchy-Schwarz inequality for all and yield
completing the proof in the case .
Case . Note that, in this case, the pure-jump component of is compound Poisson. Thus, as in (44), we have that for all , where for all . The bound on the -moment of the Brownian term follows exactly as in the case of and is a constant multiple of . From the term , we get a multiple of . Note that is a compound Poisson process with finitely many jumps on , with , and hence for some Poisson random variable with mean and iid random vectors independent of with law . For the term , we now use the same proof as in the case of , until (45). Hence, we see that
Since has finite activity, it follows that . Moreover, since the sum in the display above is bounded in , we get that is bounded by a multiple of , concluding the proof of Lemma 5.2.
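The Poisson moment identity recalled in the big-jump step of the proof states that the k-th moment of a Poisson random variable with mean lam equals the sum over j from 0 to k of S(k, j) lam^j, where S(k, j) is the Stirling number of the second kind. The following minimal sketch checks this identity numerically; the function names, the truncation level of the series and the chosen mean are illustrative assumptions and are not notation from the paper.

```python
from math import exp


def stirling2(k: int, j: int) -> int:
    """Stirling number of the second kind S(k, j), via S(a, b) = b S(a-1, b) + S(a-1, b-1)."""
    table = [[0] * (j + 1) for _ in range(k + 1)]
    table[0][0] = 1
    for a in range(1, k + 1):
        for b in range(1, min(a, j) + 1):
            table[a][b] = b * table[a - 1][b] + table[a - 1][b - 1]
    return table[k][j]


def poisson_moment(k: int, lam: float) -> float:
    """k-th moment of a Poisson(lam) variable from the Stirling-number formula."""
    return sum(stirling2(k, j) * lam ** j for j in range(k + 1))


def poisson_moment_series(k: int, lam: float, terms: int = 200) -> float:
    """The same moment computed directly by summing n**k * P(N = n) over n < terms."""
    total = 0.0
    pmf = exp(-lam)  # P(N = 0)
    for n in range(terms):
        total += n ** k * pmf
        pmf *= lam / (n + 1)  # recursion P(N = n + 1) = P(N = n) * lam / (n + 1)
    return total


lam = 2.0
for k in range(1, 5):
    print(k, poisson_moment(k, lam), poisson_moment_series(k, lam))
```

Since the formula is a polynomial in the mean, it makes explicit how the moments of the number of big jumps scale as the mean tends to zero, which is the property exploited in the proof.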
Appendix B Small-time domains of attraction - proof of Theorem 5.1
The proof is essentially a consequence of [29, Thm 15.14] and [26, Thm 2]. Recall that denotes the open ball in with center and radius , denote by the unit sphere in and define .
Since and are Lévy processes, the stated weak convergence is equivalent to as for any by [29, Cor. 15.7]. By [26, Thm 2], it follows that, for some , for where is a slowly varying function at infinity and, moreover, is -stable for all . Thus, is itself -stable. We then have the following cases.
If , then, by [26, Thm 2(i)], the weak convergence in the direction is equivalent to
so the weak convergence in is equivalent to (24), completing the proof in this case.
If , the weak convergence in the direction is equivalent to (25) by [26, Thm 2(iii)], completing the proof in this case.
If , the weak convergence in the direction is equivalent to (25) and having zero natural drift by [26, Thm 2(iii)]. Since the latter condition is required for all , it is equivalent to having zero natural drift , completing the proof in this case.
If , the weak convergence in the direction may be different depending on the behaviour of the limiting process in this direction. If then is a linear drift and the weak convergence, by [26, Thm 2(ii)], is equivalent to the following two limits as :
(where we recall that ). By the first limit, the second limit is equivalent to . If instead , then, by [26, Thm 2(ii)], the weak limit is equivalent to the following. The process has zero natural drift whenever it has finite variation and and the following two limits hold as :
By the second limit, the first limit can be rewritten as the first limit in the display above. Thus, in either case, the conditions are equivalent to those stated in Theorem 5.1 in the direction . Since the directional limits are equivalent to the corresponding limits in , the result follows.∎
Appendix C Proof of the inequality in (4)
Recall that the two Lévy processes and in have the Lévy–Itô decompositions and , see Section 4. Recall that we chose the coupling , implying (where is the Frobenius norm of the matrix ). Applying Doob’s maximal inequality, we obtain
(46)
Applying the triangle inequality, we obtain
For (resp. ) inequality (4) follows by subadditivity for (resp. Minkowski’s inequality).
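For orientation, the Gaussian step of this argument rests on two standard facts, sketched below in LaTeX. The sketch assumes that both Gaussian components are driven by the same standard Brownian motion B in d dimensions; the symbols A, Sigma and Sigma' are illustrative names introduced here, since the original displays are not reproduced in this text, and the constants in the paper's inequality (46) may be stated differently.

```latex
% Assumption of this sketch: A := \Sigma - \Sigma' is a deterministic d x d matrix
% and B is a standard d-dimensional Brownian motion, so that E[B_t B_t^\top] = t I_d.
\[
  \mathbb{E}\,|A B_t|^2
  = \mathbb{E}\big[ B_t^\top A^\top A\, B_t \big]
  = \operatorname{tr}\!\big( A^\top A \,\mathbb{E}[ B_t B_t^\top ] \big)
  = t \operatorname{tr}\!\big( A^\top A \big)
  = t\, \|A\|_F^2 .
\]
% Since (A B_s)_{s \ge 0} is a square-integrable martingale, Doob's L^2 maximal
% inequality (applied to the nonnegative submartingale |A B_s|) gives
\[
  \mathbb{E}\Big[ \sup_{s \in [0,t]} |A B_s|^2 \Big]
  \le 4\, \mathbb{E}\,|A B_t|^2
  = 4 t\, \|A\|_F^2 .
\]
```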