On the rate of convergence of the martingale central limit theorem in Wasserstein distances
Abstract
For martingales with a wide range of integrability, we will quantify the rate of convergence of the central limit theorem via Wasserstein distances of order , . Our bounds are in terms of Lyapunov’s coefficients and the fluctuation of the total conditional variances. We will show that our Wasserstein-1 bound is optimal up to a multiplicative constant.
1 Introduction
In this paper, we consider the rate of convergence of the martingale central limit theorem (CLT) under Wasserstein distances. Let be a square integrable martingale difference sequence (mds) with respect to -fields . Here denotes the trivial -field. For , let and
(1) |
For a mds , when in probability and Lindeberg’s condition is satisfied, it is well-known that
where denotes a standard normal random variable.
To quantify such convergence in distribution, one of the most important metrics is the Wasserstein distance, which has deep roots in optimal transport theory [42]. Recall that, for two probability measures on , their Wasserstein distance (also called minimal distance) of order , , is defined by
where . By abuse of notation, we may use random variables as synonyms for their distributions. E.g., for , we may write as .
When , recall that admits the following alternative representations:
where denotes the set of all 1-Lipschitz functions, and .
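For concreteness, the classical forms of these definitions (standard formulas from optimal transport, see [42], which the displays above presumably correspond to) are:

```latex
% Classical definitions: for probability measures \mu, \nu on \mathbb{R},
W_p(\mu,\nu) \;=\; \Bigl( \inf_{\pi \in \Pi(\mu,\nu)}
    \int_{\mathbb{R}\times\mathbb{R}} |x-y|^p \, \pi(dx,dy) \Bigr)^{1/p},
    \qquad p \ge 1,
% where \Pi(\mu,\nu) is the set of couplings of \mu and \nu.
% For p = 1, Kantorovich--Rubinstein duality and the quantile
% representation give
W_1(\mu,\nu) \;=\; \sup_{f \in \mathrm{Lip}_1}
    \Bigl| \int f \, d\mu - \int f \, d\nu \Bigr|
  \;=\; \int_{\mathbb{R}} \bigl| F_\mu(x) - F_\nu(x) \bigr| \, dx ,
```

where F denotes the cumulative distribution function of the corresponding measure.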
Throughout the paper, we use to denote positive constants which may change from line to line. Unless otherwise stated, denote constants depending only on the parameter . We write if , and if and . We also use notations , to indicate that the multiplicative constant depends on the parameter .
1.1 Earlier results in the literature
There is an immense literature on the convergence rate of the CLT for independent random variables. Such results are often phrased in terms of Lyapunov’s coefficient (i.e., the term in the Lyapunov condition). To be specific, for , set (footnote: the meaning of our notation is slightly different from that of [2, 27, 25, 38, 4] in the literature, whose means the in our paper)
(2) |
Note that typically is of size .
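For orientation, a standard normalization of Lyapunov's coefficient (an illustrative sketch of ours; the paper's exact display (2) may differ by constants or normalization) is:

```latex
% A standard form of Lyapunov's coefficient, p > 2:
L_{p,n} \;=\; \sum_{i=1}^{n} \mathbb{E}\,|X_i|^{p} .
% If each increment X_i is of size n^{-1/2} in L^p, then
% \mathbb{E}|X_i|^p \asymp n^{-p/2}, and hence
% L_{p,n} \asymp n \cdot n^{-p/2} = n^{1-p/2}.
```

This is consistent with the typical size noted above.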
When is a centered independent sequence, a nonuniform estimate of Bikjalis [2] implies , , extending the bounds of Esseen [16], Zolotarev [45], and Ibragimov [29]. For , Sakhanenko [40] proved that which is optimal for independent -integrable variables. For , it is established by Rio [37, 38] that
(3) |
In particular, when have roughly the same moments, the bound in (3) achieves the optimal rate . Bobkov [4] confirmed Rio’s conjecture that (3) holds for all . Cf. [38, 4] and references therein. For further developments, see e.g. [44, 20, 8, 22, 14, 6, 33] for work in the multivariate setting and e.g. [21, 33] for results on random variables with local dependence.
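A quick numerical sanity check of this rate in the i.i.d. setting (our own illustration, not part of the paper's argument): for Rademacher increments, the estimated Wasserstein-1 distance between the normalized sum and the standard normal shrinks roughly like the inverse square root of the sample size. The function name and parameters below are ours.

```python
# Monte Carlo illustration (ours): estimate W_1(S_n / sqrt(n), N(0,1))
# for i.i.d. Rademacher increments, by comparing empirical quantiles of
# the normalized sum with exact standard normal quantiles.
import random
from statistics import NormalDist

def w1_to_normal(n, reps=8000, seed=0):
    """Estimate W_1 between S_n / sqrt(n) and N(0,1)."""
    rng = random.Random(seed)
    samples = sorted(
        sum(rng.choice((-1.0, 1.0)) for _ in range(n)) / n ** 0.5
        for _ in range(reps)
    )
    nd = NormalDist()
    # W_1 on the line equals the L^1 distance between quantile functions.
    return sum(abs(x - nd.inv_cdf((i + 0.5) / reps))
               for i, x in enumerate(samples)) / reps

for n in (16, 64, 256):
    print(n, round(w1_to_normal(n), 4))  # distance shrinks as n grows
```

With the seed fixed, the printed distances decrease as n grows, consistent with the n^{-1/2} rate.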
It is natural to ask whether the martingale CLT can be quantified by similar Wasserstein metric bounds in terms of . However, despite its theoretical importance, there are very few such results for (general) martingales compared to the independent case, let alone questions on optimal rates.
When is a mds, the nonuniform bound on the distribution functions by Joos [31] (which generalizes Haeusler and Joos [26]) implies that
(4) |
In the case , the nonuniform bound of Joos [30] implies
(5) |
Still for the case, Dung, Son, Tien [12] improved the last term in (5) to for any . Under the condition
(6) |
Röllin’s result [39, (2.1)] implies that (see also [18, Lemma 2.1])
(7) |
Fan and Ma [18] dropped the condition (6) and obtained
(8) |
Recently, assuming (6), Fan, Su [19, Corollary 2.5] implicitly extended (7) to
(9) |
Dedecker, Merlevède, and Rio [11] proved bounds, , that involve the quantities (instead of in our bounds), , which can be suitably bounded in many situations.
Readers may refer to [10, 9, 39, 19, 11] for Wasserstein bounds for the martingale CLT under other special conditions (e.g., the sequence being stationary, ’s being close to deterministic, ’s being uniformly bounded from below, or certain variants of the case, etc.).
Remark 1.
Unlike the Wasserstein metric, Kolmogorov distance bounds for the martingale CLT have been thoroughly investigated since the 1970s. One of the earliest results is due to Heyde and Brown [28], which states that, for ,
(10) |
When the mds is , i.e., , Bolthausen [5] showed that
(11) |
and that the first term is optimal for the case . Haeusler [24, 25] generalized (10) to and showed that the first term is exact. Joos [31] proved that the second term in (10) can be replaced by , . Mourrat [34] improved the second term in (11) to , . Cf. also [26, 31, 23, 35, 15, 17, 19] and references therein for work in this direction.
1.2 Motivation and our contributions
Our paper concerns the convergence rates for the martingale CLT, .
Let us comment on some weaknesses of the aforementioned bounds.
When the martingale differences are at least -integrable, , the best rates given by (7), (9), (8) are typically , leaving a big gap compared with the rate in the case (5), not to mention that the condition (6) imposed in (9) is too restrictive for general martingales. Compared to (9), the exponent of in (4) is clearly not optimal, at least for . But (4) does not assume (6), and it offers a typical rate which is better than for . However, the constant in (4) is expected to grow linearly as , rendering (4) useless when is bigger than .
Notice that all of the results discussed above are bounds. There are hardly any results on the bounds, , in terms of for (general) martingales.
Can we obtain bounds, , in terms of the Lyapunov coefficient (2) for the martingale CLT that generalize all of the previous results (4), (5), (7), (8), (9)? Further, what are the optimal rates and how to justify their optimality?
Is it possible to get Wasserstein distance bounds for the CLT so that the constant coefficients do not blow up as the integrability of the martingale increases? Such estimates would be important when we have a limited sample size, and they allow us to exploit the integrability of the martingale to obtain better rates.
In theory and in applications, there are numerous stochastic processes that do not fit into any of these categories, , cf. [41, 32, 43]. Can we quantify the CLT for martingales with a much wider spectrum of integrability than , ?
Motivated by these questions, we will prove the following results.
-
(1)
We will obtain , , convergence rates for martingales which are Orlicz-integrable. For instance, if
(12) for an appropriate convex function that grows at most polynomially fast, then
(13) where denotes the inverse function of .
-
(2)
We will derive Wasserstein bounds whose constant coefficients do not depend on the integrability of the variables. Taking the distance for example, if (12) holds for in a wide class of convex functions, we show that
(14) This explains the presence of the term in (5), and it implies that if the mds is bounded, i.e., , then
The novelty of this result lies not only in the fact that it encompasses an even larger spectrum (all the way up to ) of integrability than our first bound (13), but also in the fact that it yields a better bound than (13) when the order of the “integrability” is bigger than the logarithm of the sample size (i.e., beyond ).
-
(3)
Similar bounds for the distances in terms of the Lyapunov coefficient and , , will be established as well.
1.3 Structure of the paper
The organization of our paper is as follows.
Subsection 1.4 contains definitions of N-functions and the corresponding Orlicz norm for random sequences. In Section 2, we present our main Wasserstein distance bounds (Theorems 5, 7, and 8) and the optimality of the rate (Proposition 10). In Section 3, using Taylor expansion and Lindeberg’s telescopic sum argument, we will derive bounds in terms of the conditional moments for martingales with a.s. As consequences, bounds (Corollary 15) will be obtained for martingales that satisfy certain special conditions.
Section 4 is devoted to the proof of our main results. Our proof consists of the following components. First, we truncate the martingale following Haeusler [25] and elongate it following Dvoretzky [13] to turn it into a sequence with bounded increments and . We will bound the error of this modification in terms of the Lyapunov coefficient and . (See Section 4.1.) Then, as a crucial technical step, we use Young’s inequality within conditional expectations to “decouple” the Lyapunov coefficient and ’s from the conditional moments, and turn the bound into an optimization problem over three parameters. This argument is robust enough for us to deal with martingales with very flexible integrability. Another key observation of ours is that there should be different bounds for the two different scenarios when the martingale is “at most ” and “more integrable than ”. By possibly sacrificing a small factor (e.g. ), we can make the constants of the bound independent of the integrability , which leads to bounds of type (14).
1.4 Preliminaries: N-functions and an Orlicz-norm for sequences
To generalize the notion of integrability, we recall the definitions of N-function and the corresponding Orlicz-norm in this subsection.
Definition 2.
A convex function is called an N-function if it satisfies , , and for . For two functions , we write
if is non-decreasing on .
Note that every N-function satisfies . See [1].
Denote the Fenchel-Legendre transform of by , i.e.,
Then is still an N-function. Young’s inequality states that
(15) |
Another useful relation between the pair is
(16) |
See [1] for a proof and for more properties of N-functions.
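In standard form (our rendering of the classical facts that the displays (15) and (16) presumably state; see [1]), the Fenchel-Legendre transform, Young's inequality, and the companion relation read:

```latex
% Fenchel--Legendre transform of an N-function \phi:
\phi^{*}(y) \;=\; \sup_{x \ge 0} \bigl( xy - \phi(x) \bigr),
\qquad y \ge 0 .
% Young's inequality (presumably display (15)):
xy \;\le\; \phi(x) + \phi^{*}(y), \qquad x, y \ge 0 .
% A standard relation between the pair (presumably display (16)):
t \;\le\; \phi^{-1}(t)\,(\phi^{*})^{-1}(t) \;\le\; 2t, \qquad t \ge 0 .
```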
Definition 3.
For any N-function and , the -Orlicz norm of a random sequence with length is defined by
(17) |
In particular, when , i.e., is a single random variable, we still write
When , , we simply write as . Notice that
(18) |
(19) |
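As an illustration of Definition 3 in the single-variable case (our own sketch; the Luxemburg-type norm in (17) presumably reduces to the classical Orlicz norm when the sequence has length one), the norm can be computed by bisection over the feasibility condition:

```python
# Illustrative sketch (ours, not from the paper): for a single random
# variable X, the Luxemburg/Orlicz norm under an N-function phi is
#     ||X||_phi = inf{ t > 0 : E[ phi(|X|/t) ] <= 1 },
# computed here for an empirical sample by bisection in t.
def orlicz_norm(sample, phi, lo=1e-9, hi=1e9):
    """Bisection for inf{t > 0 : mean(phi(|x|/t)) <= 1} over a sample."""
    def mean_phi(t):
        return sum(phi(abs(x) / t) for x in sample) / len(sample)
    for _ in range(200):  # interval shrinks by 2^200: ample precision
        mid = (lo + hi) / 2
        if mean_phi(mid) <= 1:
            hi = mid  # feasible, try a smaller t
        else:
            lo = mid
    return hi

# For phi(x) = x^2 the Orlicz norm reduces to the L^2 norm:
print(orlicz_norm([1.0, 2.0, 3.0], lambda x: x * x))  # ~ sqrt(14/3)
```

The bisection is valid because the feasibility map is monotone in t.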
Remark 4.
-
(a)
Using the property for for N-functions, it is easily seen that the integrability condition (12) implies .
-
(b)
Since this paper concerns square-integrable martingales, we are only interested in sup-quadratic N-functions . For instance, , (with ), , , (with ) are among such N-functions.
- (c)
- (d)
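Regarding Remark 4(b), typical examples of sup-quadratic N-functions (an illustrative list of our own; the paper's specific examples may differ) include:

```latex
% Standard sup-quadratic N-functions (x \ge 0):
\phi(x) = x^{p} \ (p \ge 2), \qquad
\phi(x) = x^{2}\log(1+x), \qquad
\phi(x) = x^{2} e^{x}, \qquad
\phi(x) = e^{x^{2}} - 1 .
% In each case x \mapsto \phi(x)/x^{2} is non-decreasing on (0,\infty).
```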
2 Main results
Recall in (19). For the mds and an N-function , we generalize the notation in (2) to
(20) |
where denotes the inverse function of .
Theorem 5 ( bounds).
-
(i)
For any N-function ,
(21) -
(ii)
For any N-function with , ,
(22)
As consequences, for any ,
In particular, for the case, by (19),
Remark 6.
- (a)
-
(b)
Evidently, the second bound (22) gives better rates for mds which are at most polynomially integrable. E.g., for the “barely more than square integrable” case , (22) yields as the first term in the bound. However, (22) becomes a trivial bound for .
Although the first bound (21) is seemingly a factor worse than the latter, it is applicable to martingales with more general integrability. For instance, if the martingale is exponentially integrable, taking , Theorem 5(i) yields , where the first term is better than the rates offered by (22) for any .
-
(c)
In some sense, the term quantifies the extent of decorrelation of the process. For independent sequences, is . In general, could converge to arbitrarily fast, depending on how decorrelated the process is.
-
(d)
The term is essential for bounds of type
(23) i.e., we cannot allow to be . For example, let be such that and let be i.i.d. variables. Define . Then . On the other hand, for , as . Thus .
Our next main results concern the and convergence rates.
Theorem 7 ( bounds).
Let be a martingale difference sequence.
-
(i)
For any N-function such that is convex,
-
(ii)
For any N-function ,
where denotes the inverse function of .
As consequences, for ,
In particular, taking ,
(24) |
Theorem 8 ( bound).
Let be a martingale difference sequence. Then
Remark 9.
For ease of presentation, we only state results in terms of the metrics, . Readers can use the interpolation inequality
to easily infer other bounds, .
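The interpolation inequality in question is presumably the standard one: since on the real line the quantile coupling is simultaneously optimal for every order, Hölder's inequality applied to that single coupling gives

```latex
W_q(\mu,\nu)^{q} \;\le\;
    W_1(\mu,\nu)^{\frac{p-q}{p-1}}\,
    W_p(\mu,\nu)^{\frac{p(q-1)}{p-1}},
\qquad 1 \le q \le p ,
```

which lets one infer intermediate-order bounds from the first- and p-th-order ones.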
The following proposition justifies that the terms and the exponent of within the bound (22) are optimal.
Proposition 10.
(Optimality of the rates).
-
(1)
For any , there exists a constant depending on such that for any , we can find a martingale which satisfies
-
•
, a.s.;
-
•
;
-
•
.
-
•
-
(2)
For any , there exists a martingale that satisfies
-
•
;
-
•
.
-
•
3 Martingales with
In this section we will derive bounds, , for martingales with . As in [39, 18, 11, 19], we will use Lindeberg’s argument and Taylor expansion to bound the Wasserstein distance by a sum involving third-order conditional moments. Our derivation of the bounds for also relies on an observation of Rio [38] that relates the Wasserstein distances to Zolotarev’s ideal metric (25).
Note that [39, 18, 19] use Stein’s method to express the bound in terms of derivatives of the solution to Stein’s equation. But as observed in [11], Gaussian smoothing already provides enough regularity to carry out Taylor expansions. We follow [11] in this respect.
3.1 Wasserstein distance bounds via Lindeberg’s argument
For any -th order differentiable function and , denote its -Hölder constant by
We say that if .
Definition 11.
Suppose , and , are probability measures on . Let and be the unique numbers such that . We let . The Zolotarev distance is defined by
Recall that by the Kantorovich-Rubinstein Theorem, for . When , it is shown by Rio [38, Theorem 3.1] that
(25) |
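For the reader's orientation, the classical definition of Zolotarev's ideal metric and a comparison of the type Rio established (our rendering, stated under the assumption that the relevant range is 1 &lt; p ≤ 2; the constant is unspecified) read:

```latex
% Zolotarev's ideal metric of order r = m + \alpha, \alpha \in (0,1]:
\zeta_r(\mu,\nu) \;=\; \sup\Bigl\{ \Bigl|\int f\,d\mu - \int f\,d\nu\Bigr| :
    f \in \mathcal{C}^{m}, \ [f^{(m)}]_{\alpha} \le 1 \Bigr\} ,
% where [\,\cdot\,]_\alpha is the Hölder seminorm defined above.
% A comparison of Rio's type (presumably display (25)): for 1 < p \le 2,
W_p(\mu,\nu)^{p} \;\le\; c_p\, \zeta_p(\mu,\nu) .
```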
For and any function , let
(26) |
where denotes a standard normal random variable. Direct integration by parts yields the following regularity estimate for the Gaussian smoothing, which is a special case of [10, Lemma 6.1].
Lemma 12.
For the reader’s convenience, we include a proof of Lemma 12 in the Appendix.
Recall the notations in (1). For any random variable , we write the conditional expectation given as , i.e.,
(27) |
Proposition 13.
Let , , be a martingale difference sequence and let . Assume that almost surely. For any , set
Then, for any such that is convex,
(28) |
(29) |
Proof of Proposition 13:.
Without loss of generality, assume .
Let be random variables with distributions and , so that the triple are independent. For , , and any random variable which is independent of ,
The triangle inequality and (25) then yield, for ,
(30) |
In what follows we will use Lindeberg’s argument to derive bounds for , . Recall notations in (1), (27).
Note that is -measurable. Recall the function in (26). Recall the operation in (27). For , we consider the following telescopic sum:
where .
Note that
(31) |
Hence, for , , using the fact , we have
(32) |
When , notice that is irrelevant. So, letting , we immediately get (29). It remains to consider .
Using the fact that for ,
we get, for any event and any , ,
and similarly
Further, since , we have for . Hence
Since , we have for . Thus
and so, for ,
Similarly, for , ,
Since is convex, by Jensen’s inequality we know that
(33) |
Thus, we obtain, for , ,
This inequality, together with (32), (30) and (25), yields the Proposition. ∎
3.2 The rates of some special cases
From Proposition 13 we can obtain Wasserstein distance bounds for some special cases (with ). See Corollary 15 below. These cases usually yield better rates than the typical cases. They are not only of interest in their own right, but can also serve as important references when we construct counterexamples.
The bounds within Corollary 15 can be considered generalizations of some of the results in [39, Corollaries 2.2, 2.3] and [19, Corollaries 2.5, 2.6, 2.7] to the integrable cases.
Corollary 15.
Assume that almost surely. Let be an N-function such that and is convex. Then the following statements hold.
-
(i)
for .
-
(ii)
If there exists such that
then, writing (and write as when ),
-
•
.
-
•
.
-
•
-
(iii)
If there exists a constant such that almost surely,
then
-
•
-
•
.
In particular, when , we have
-
•
Proof.
(i) For ease of notation, we write . Note that is still an N-function that satisfies the conditions of Proposition 13. Recall the conditional expectation in (27). By Proposition 13,
where in the second inequality we used the fact that for all and that is decreasing. Taking , the , bounds are proved.
(ii) Recall in Proposition 13. When for all we have
Thus, by Proposition 13, we get, for any , , ,
(34) |
The first bound follows when , , .
Now consider the case , , . Since and is increasing, we get, for any ,
Taking such that for appropriate constant and recalling (3.2), we get (the case , )
the rest of the bounds in (ii) follow. Note that this is a trivial inequality (since ) if . If , it remains to justify that such a choice of satisfies . Indeed, since and , we have, for ,
and so, for ,
where in the last inequality we used the fact that .
(iii) First, notice that there exists such that , . Indeed,
where is an increasing function on . Thus
Clearly, if and only if . Moreover,
Thus we get, for any , ,
where in the first inequality we used the fact that is decreasing. When , taking , the desired bound follows. For the case , , the bound of this integral can be handled the same way as in (ii). Then the bounds in (iii) are proved.
It remains to prove the bound. By the inequality above, for any ,
where in the last inequality we used the fact that is increasing. Taking , the bound in (iii) follows. ∎
4 Proofs of the main bounds
Note that the Taylor expansion results (Proposition 13), which are in terms of the second and third conditional moments, are exactly suited to martingales (with ) with integrability between and , and so, not surprisingly, the bounds of Corollary 15 for martingales, , follow quite easily from Proposition 13. However, to obtain Wasserstein bounds for martingales with more general integrability, we need new insights.
In Subsection 4.1, we will modify the mds into a bounded sequence with deterministic total conditional variance . To this end, we first truncate the martingale into a bounded sequence with , and then lengthen it to have . Such tricks of elongation and truncation of martingales were already used by Bolthausen [5] (the idea goes back to Dvoretzky [13]) and Haeusler [25] in the study of the Kolmogorov distance of the martingale CLT. The main result of Subsection 4.1 is a control of the error due to this modification in terms of and . See Proposition 17. As an application, we prove the bound in Theorem 8.
In Subsections 4.2 and 4.3, we prove the estimates in Theorems 5 and 7 by bounding the corresponding metrics for the modified martingale. A crucial step of our method is to use Young’s inequality inside the conditional expectations to “decouple” the Lyapunov coefficient and the conditional variances from the summation within Proposition 13. This will turn the bounds into an optimization problem over three parameters: the truncation parameter (), the smoothing parameter (), and an Orlicz parameter (). To this end, tools from the theory of N-functions will be employed to compare N-functions to polynomials.
4.1 A modified martingale, and Proof of Theorem 8 ( bounds)
The goal of this subsection is to modify the original mds into a new mds which is uniformly bounded except for the last term. Throughout this subsection, we let be any fixed constant.
Define a martingale difference sequence as
(36) |
Note that for all and, with ,
(37) |
Define a stopping time (with the convention )
(38) |
Since is -measurable, is also a stopping time.
Definition 16.
Let and let be a standard normal which is independent of . Define the modified martingale difference sequence as follows: ,
(39) |
Enlarging the -fields if necessary, is still a mds. Let
Note that for all and recall that . We write, for , .
The main feature of the mds is that a.s. for , and
(40) |
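In symbols, the construction can be sketched as follows (a plausible reconstruction consistent with the description in (36)-(40); the paper's exact displays may differ in details such as the truncation level and indexing):

```latex
% Truncation at level a (cf. (36)): a bounded mds
X_i' \;=\; X_i \mathbf{1}_{\{|X_i| \le a\}}
   - \mathbb{E}\bigl[ X_i \mathbf{1}_{\{|X_i| \le a\}} \,\big|\,
     \mathcal{F}_{i-1} \bigr], \qquad |X_i'| \le 2a .
% Stopping time (cf. (38)): stop before the cumulative conditional
% variance exceeds 1,
\tau \;=\; \min\Bigl\{ k : \sum_{i=1}^{k+1}
   \mathbb{E}\bigl[ (X_i')^2 \,\big|\, \mathcal{F}_{i-1} \bigr]
   > 1 \Bigr\} \wedge n .
% Elongation (cf. (39)): with \xi \sim N(0,1) independent of the rest,
\widetilde{X}_i = X_i' \mathbf{1}_{\{i \le \tau\}} \ (1 \le i \le n),
\qquad
\widetilde{X}_{n+1} = \Bigl( 1 - \sum_{i=1}^{\tau}
   \mathbb{E}\bigl[ (X_i')^2 \,\big|\, \mathcal{F}_{i-1} \bigr]
   \Bigr)^{1/2} \xi ,
% so that the total conditional variance is exactly 1 (cf. (40)).
```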
Proposition 17.
Estimate (b) is slightly better than (a) but requires a slightly stronger condition. Statement (b) is more useful for the case .
Lemma 18.
Proof.
Proof of Proposition 17.
Throughout, notice that and .
Without loss of generality, assume . Set
(42) |
To prove Proposition 17, it suffices to show that
(43) |
Indeed, trivially we have . If is convex, we claim that
(44) |
To this end, we apply Lemma A.2 to get
Notice that, when is convex, using Jensen’s inequality as in (33), we have which is bounded by by Lemma 18. Claim (44) follows.
The rest of the proof is devoted to obtaining (43).
Recall in (38). Since and coincide up to , we have
(45) |
Step 1. We will bound the first term in (45) by
(46) |
Since is a stopping time, form a martingale difference sequence. Hence, by Burkholder’s inequality (Theorem A.3) and Lemma A.2, for ,
(47) |
Further, by the definition of the stopping time ,
Step 2. Consider the second term in (45). We will show that
(48) |
Indeed, this inequality is trivial for . When , note that is a martingale difference sequence. For , by Burkholder’s inequality,
Further, by Hölder’s inequality, Jensen’s inequality, and the fact , we get, for ,
Since , we have
(49) |
where the last inequality used the definition of . Hence
Inequality (48) is proved.
When , we have
If , then
Hence, we have
(50) |
4.2 Proof of Theorem 5 ( bounds)
In what follows we let be constants to be determined later. Let the mds be as in Definition 16, and set
(52) |
Throughout this subsection, we simply write
(53) |
Note that , and is -measurable for .
The following estimate of the summation within (54) will be crucially employed in the derivation of our bounds, .
Proposition 19.
Let be an N-function with , and set . Recall above, and . Then, for any ,
Proof.
Without loss of generality, assume .
Recall that is -measurable. Recall that as in (27).
By Young’s inequality (15), for any ,
Note that is an N-function and so . Using the fact that and that for , we have, almost surely,
Thus we further have (note that for )
To prove Theorem 7 and Theorem 8, we will need the following lemma to compare N-functions to power functions.
Lemma 20.
Let and let denote its Hölder conjugate. Let be an N-function. Set .
-
(1)
If , then
-
(2)
If , then
Note that if , and (16) guarantees that in general.
Proof of Theorem 5:.
Without loss of generality, assume . Set .
Since is increasing, we have for . Hence
This inequality, together with Proposition 19, yields
Taking in the above inequality, we obtain
(56) |
where in the last inequality we used the fact
(57) |
Further, putting
(58) |
then inequality (56) and (54) imply
This inequality, together with Proposition 17(a), yields (recall in (42))
4.3 Proof of Theorem 7 ( bounds)
In some sense, our proof of the bound in Theorem 7 is an interpolation between the proofs of Theorem 5 and Theorem 8. We observe that, to bound , a better estimate than Proposition 19 can be achieved by simply bounding by when is larger than some “threshold” .
Proof of Theorem 7(i):.
Without loss of generality, assume . Set .
Define the mds and the martingale as in Definition 16. Note that for all .
Proof of Theorem 7(ii):.
Without loss of generality, assume .
Let be a constant to be determined. We define the mds and the martingale as in Definition 16. Set .
Step 1. Applying Proposition 13 and Remark 14 to the case , we get
(61) |
Let be a constant to be determined later, and define an event
By Proposition 19,
(62) |
Since , by Lemma 20, for . Thus
This inequality, together with (62), yields
Taking such that , i.e.,
we obtain
(63) |
5 Optimality of the rates: Proof of Proposition 10
For any , we will construct a mds such that and . Note that in our examples, as , justifying the optimality of the term in Theorem 5.
We choose
(Actually, for any , any such that as would work.)
Example 21.
For , let be independent random variables such that
-
•
,
-
•
are i.i.d. normal random variables,
-
•
has distribution
We let and define a set as
Define
Clearly, is a martingale, and
Of course, almost surely, and
Proof of Proposition 10(1).
Let be the mds in Example 21. We consider the function
Clearly, . Recall the definition of the function in (26). Then
(64) |
where denotes the standard normal density. Moreover,
By the definition of , we have
Thus
(65) |
Since the set is symmetric, i.e., , and , we get . Hence
Note that , and, writing ,
(66) |
Therefore, and so
justifying the optimality of the power of in Theorem 5. ∎
Example 22.
Proof of Proposition 10(2).
Let the mds be as in Example 22.
6 Some open questions
- 1.
- 2.
-
3.
Can we say anything about the optimality of the rates in Theorem 7 when the mds is integrable, for any ?
-
4.
Is there a better (unified) formula than the “piecewise” bound in Theorem 7 for martingales, ?
- 5.
-
6.
How to obtain the Wasserstein rates for the CLT of multi-dimensional martingales? Can we say something on the dependence of the rates on the dimension?
-
7.
How to obtain the Wasserstein- convergence rates, , for the martingale CLT in terms of ?
Appendix A Appendix
A.1 Comparison between Orlicz norms
Recall the definition of the (mean-)Orlicz norm for a sequence in Definition 3.
Lemma A.1.
Let and let be a sequence of random variables with . If an N-function satisfies , then
Proof.
Without loss of generality assume . Since is increasing,
Thus, by the definition of ,
The lemma follows. ∎
A.2 Regularity of Gaussian smoothing: Proof of Lemma 12
The proof is exactly as in [10, Lemma 6.1].
Proof.
Since , integration by parts yields
The lemma follows by taking and using the fact . ∎
A.3 Moment bound of the maximum
Lemma A.2.
Let be an N-function, . For any sequence of random variables , we have
Proof.
Without loss of generality, assume . For any ,
where we used the fact that is increasing. Hence
Taking , we get . ∎
A.4 Proof of Lemma 20
A.5 A Burkholder inequality
Theorem A.3.
Let be a mds and let , . Recall the notation in (1). Then, for ,
References
- [1] R. A. Adams, J.J.F. Fournier, Sobolev Spaces. Second edition. Pure Appl. Math. (Amst.), 140. Elsevier/Academic Press, Amsterdam, 2003.
- [2] A. Bikjalis, Estimates of the remainder term in the central limit theorem. Litovsk. Mat. Sb. 6 (1966), 323-346.
- [3] S. G. Bobkov, Entropic approach to E. Rio’s central limit theorem for transport distance. Stat. Probab. Lett. 83(7), 1644-1648 (2013).
- [4] S. G. Bobkov, Berry-Esseen bounds and Edgeworth expansions in the central limit theorem for transport distances. Probab. Theory Relat. Fields (2018) 170:229-262.
- [5] E. Bolthausen, Exact convergence rates in some martingale central limit theorems. Ann. Probab. 10 (1982), 672-688.
- [6] T. Bonis, Stein’s method for normal approximation in Wasserstein distances with application to the multivariate central limit theorem. Probab. Theory Related Fields 178 (2020), no. 3-4, 827–860.
- [7] D. L. Burkholder, B. J. Davis, R. F. Gundy, Integral inequalities for convex functions of operators on martingales. University of California Press, Berkeley, CA, 1972, pp. 223-240.
- [8] T. Courtade, M. Fathi, A. Pananjady, Existence of Stein kernels under a spectral gap, and discrepancy bounds. Ann. Inst. Henri Poincaré Probab. Stat. 55 (2019), no. 2, 777-790.
- [9] J. Dedecker, F. Merlevède, Rates of convergence in the central limit theorem for linear statistics of martingale differences. Stochastic Process. Appl. 121 (2011) 1013-1043.
- [10] J. Dedecker, F. Merlevède, E. Rio, Rates of convergence for minimal distances in the central limit theorem under projective criteria. Electron. J. Probab. 14 (2009), no. 35, 978-1011.
- [11] J. Dedecker, F. Merlevède, E. Rio, Rates of convergence in the central limit theorem for martingales in the non stationary setting. Ann. Inst. Henri Poincaré Probab. Stat. 58 (2022), no. 2, 945-966.
- [12] L.V. Dung, T.C. Son, N.D. Tien, bounds for some martingale central limit theorems. Lith. Math. J. 54, 48-60 (2014).
- [13] A. Dvoretzky, Asymptotic normality for sums of dependent random variables. Proc. Sixth Berkeley Symp. Math. Statist. Probab. 2 513-535. Univ. California Press, 1972.
- [14] R. Eldan, D. Mikulincer, A. Zhai, The CLT in high dimensions: quantitative bounds via martingale embedding. Ann. Probab. 48 (2020), no. 5, 2494-2524.
- [15] M. El Machkouri, L. Ouchti, Exact convergence rates in the central limit theorem for a class of martingales. Bernoulli 13 (4) (2007) 981-999.
- [16] C.-G. Esseen, On mean central limit theorem. Kungl. Tekn. Högsk. Handl. Stockholm, 121 (1958).
- [17] X. Fan, Exact rates of convergence in some martingale central limit theorems. J. Math. Anal. Appl. 469 (2019), no. 2, 1028-1044.
- [18] X. Fan, X. Ma, On the Wasserstein distance for a martingale central limit theorem. Statist. Probab. Lett. 167 (2020), 108892, 6 pp.
- [19] X. Fan, Z. Su, Rates of convergence in the distances of Kolmogorov and Wasserstein for standardized martingales. (2023), arXiv:2309.08189v1
- [20] X. Fang, Q.-M. Shao, L. Xu, Multivariate approximations in Wasserstein distance by Stein’s method and Bismut’s formula. Probab. Theory Related Fields 174(2019), no.3-4, 945-979.
- [21] X. Fang, Wasserstein-2 bounds in normal approximation under local dependence. Electron. J. Probab. 24 (2019), no. 35, 1-14.
- [22] M. Fathi, Stein kernels and moment maps. Ann. Probab. 47 (2019), no. 4, 2172-2185.
- [23] I. Grama, E. Haeusler, Large deviations for martingales via Cramér’s method. Stochastic Process. Appl. 85 (2000) 279-293.
- [24] E. Haeusler, A note on the rate of convergence in the martingale central limit theorem. Ann. Probab. 12 (1984) 635-639.
- [25] E. Haeusler, On the rate of convergence in the central limit theorem for martingales with discrete and continuous time. Ann. Probab. 16 (1) (1988) 275-299.
- [26] E. Haeusler, K. Joos, A nonuniform bound on the rate of convergence in the martingale central limit theorem. Ann. Probab. 16 (1988), no. 4, 1699-1720.
- [27] P. Hall, C.C. Heyde, Martingale Limit Theory and Its Applications. Academic, New York, 1980.
- [28] C.C. Heyde, B.M. Brown, On the departure from normality of a certain class of martingales. Ann. Math. Stat. 41 (1970) 2161-2165.
- [29] I. Ibragimov, On the accuracy of Gaussian approximation to the distribution functions of sums of independent random variables. Theory Probab. Appl. 11 (1966) 559-579.
- [30] K. Joos, Nonuniform convergence rates in the central limit theorem for martingales. J. Multivariate Anal. 36 (1991), no. 2, 297-313.
- [31] K. Joos, Nonuniform convergence rates in the central limit theorem for martingales. Studia Sci. Math. Hungar. 28 (1-2) (1993) 145-158.
- [32] R. Kulik, P. Soulier, Heavy-tailed time series. Springer Ser. Oper. Res. Financ. Eng. Springer, New York, 2020.
- [33] T. Liu, M. Austern, Wasserstein-p bounds in the central limit theorem under local dependence. Electron. J. Probab. 28 (2023), no. 117, 1-47.
- [34] J.C. Mourrat, On the rate of convergence in the martingale central limit theorem. Bernoulli 19 (2) (2013) 633-645.
- [35] L. Ouchti, On the rate of convergence in the central limit theorem for martingale difference sequences. Ann. Inst. Henri Poincaré Probab. Stat. 41 (1) (2005) 35-43.
- [36] V.V. Petrov, Limit theorems of probability theory. Oxford Stud. Probab., 4. The Clarendon Press, Oxford University Press, New York, 1995.
- [37] E. Rio, Distances minimales et distances idéales. C. R. Acad. Sci. Paris 326 (1998) 1127-1130.
- [38] E. Rio, Upper bounds for minimal distances in the central limit theorem. Ann. Inst. Henri Poincaré Probab. Stat. 45(3), 802–817 (2009)
- [39] A. Röllin, On quantitative bounds in the mean martingale central limit theorem. Statist. Probab. Lett. 138 (2018), 171-176.
- [40] A. I. Sakhanenko, Estimates in an invariance principle. (Russian) Limit theorems of probability theory, 27–44, 175, Trudy Inst. Mat., 5, “Nauka” Sibirsk. Otdel., Novosibirsk (1985).
- [41] S. van de Geer, J. Lederer, The Bernstein-Orlicz norm and deviation inequalities. Probab. Theory Related Fields 157 (2013), no. 1-2, 225-250.
- [42] C. Villani, Optimal Transport: Old and New. Grundlehren Math. Wiss., 338, Springer, Berlin (2009)
- [43] M. Vladimirova, S. Girard, H. Nguyen, J. Arbel, Sub-Weibull distributions: generalizing sub-Gaussian and sub-exponential properties to heavier tailed distributions. Stat 9 (2020), e318.
- [44] A. Zhai, A high-dimensional CLT in distance with near optimal convergence rate. Probab. Theory Related Fields 170 (2018), 1-25.
- [45] V. M. Zolotarev, On asymptotically best constants in refinements of mean central limit theorems. Theory Probab. Appl. 9 (1964) 268-276.
E-mail address, Xiaoqin Guo: [email protected]