Averaging principle for McKean-Vlasov SDEs driven by multiplicative fractional noise with highly oscillatory drift coefficient

Bin Pei, Lifang Feng, Min Han

School of Mathematics and Statistics, Northwestern Polytechnical University, Xi’an, 710072, China

Chongqing Technology Innovation Center, Northwestern Polytechnical University, Chongqing, 401120, China

Abstract

In this paper, we study the averaging principle for a class of McKean-Vlasov stochastic differential equations (SDEs) that contain multiplicative fractional noise with Hurst parameter $H>1/2$ and a highly oscillatory drift coefficient. Here the integral with respect to fractional Brownian motion is the generalized Riemann-Stieltjes integral. Using Khasminskii’s time discretization technique, we prove that the solution of the original system converges strongly to the solution of the averaged system as the time scale $\epsilon$ goes to zero in the supremum- and Hölder-topologies, which sharpens existing results in the classical McKean-Vlasov SDEs framework.

Keywords. Multiplicative fractional noise, highly oscillatory drift, stochastic averaging, McKean-Vlasov SDEs

Mathematics subject classification. 60G22, 60H10, 60H05, 34C29

1 Introduction

The present paper focuses on the following McKean-Vlasov stochastic differential equations (SDEs) with highly oscillatory drift coefficient driven by multiplicative fractional noise in related path spaces, namely with supremum- and Hölder-topologies

\mathrm{d}X_{t}^{\epsilon}=b({t}/{\epsilon},X_{t}^{\epsilon},\mathscr{L}_{X_{t}^{\epsilon}})\mathrm{d}t+\sigma(X_{t}^{\epsilon})\mathrm{d}B_{t}^{H},\quad X_{0}^{\epsilon}=x_{0},\quad t\in[0,T]

(1.1)

where the parameter $0<\epsilon\ll 1$, the initial value $x_{0}\in\mathbb{R}$ is arbitrary, non-random and fixed, the coefficients $b:[0,T]\times\mathbb{R}\times\mathcal{P}_{2}(\mathbb{R})\rightarrow\mathbb{R}$ and $\sigma:\mathbb{R}\rightarrow\mathbb{R}$ are measurable functions, and $\mathscr{L}_{X_{t}^{\epsilon}}$ is the law of $X_{t}^{\epsilon}$. Here $\mathcal{P}_{2}(\mathbb{R})$ is the space of probability measures on $\mathbb{R}$ with finite second moment, which will be introduced in Section 2. $\{B_{t}^{H},t\geq 0\}$ is a one-dimensional fractional Brownian motion (FBM) with Hurst parameter $H>1/2$, that is, a centered Gaussian process with the covariance function

R_{H}(t,s)=\frac{1}{2}(t^{2H}+s^{2H}-|t-s|^{2H}).
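Since a centered Gaussian process is determined by its covariance, the formula for $R_{H}$ is already enough to sample $B^{H}$ on a finite grid. The following is a minimal sketch, not a method used in this paper: it factorises the covariance matrix by a hand-written Cholesky decomposition and multiplies by independent standard normals. The function names, the grid and the parameter values are illustrative assumptions.

```python
import math
import random


def fbm_covariance(t, s, hurst):
    """Covariance R_H(t, s) of fractional Brownian motion."""
    return 0.5 * (t ** (2 * hurst) + s ** (2 * hurst) - abs(t - s) ** (2 * hurst))


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_fbm(hurst, horizon, n_steps, rng):
    """Sample B^H at the grid points k * horizon / n_steps, k = 0, ..., n_steps."""
    # The point t = 0 is excluded from the factorisation because B^H_0 = 0.
    times = [horizon * (k + 1) / n_steps for k in range(n_steps)]
    cov = [[fbm_covariance(t, s, hurst) for s in times] for t in times]
    lower = cholesky(cov)
    normals = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
    values = [sum(lower[i][k] * normals[k] for k in range(i + 1))
              for i in range(n_steps)]
    return [0.0] + times, [0.0] + values


if __name__ == "__main__":
    grid, path = sample_fbm(hurst=0.7, horizon=1.0, n_steps=32, rng=random.Random(0))
    print(len(grid), len(path))
```

The cost of this approach grows cubically in the number of grid points, so it is only meant for small illustrative grids.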

McKean-Vlasov SDEs, also known as distribution dependent SDEs or mean-field SDEs, are equations whose evolution is determined by both the microcosmic location and the macrocosmic distribution of the particle, see e.g. [16], [22], [3]. Because their coefficients depend on the law of the solution, they can describe many models better than classical SDEs. Stochastic systems of the form (1.1) are of independent interest and appear widely in applications including granular materials dynamics, mean-field games, as well as complex networked systems, see e.g. [2], [19], [14].

Now, we remind the reader what an averaging principle is. Because of the highly oscillating component, it is relatively difficult to solve (1.1). The main goal of the averaging principle is to find a simplified system which simulates and predicts the evolution of the original system (1.1) over a long time scale by averaging the highly oscillating drift coefficient under suitable conditions. The averaging principle for deterministic systems has a long history, which can be traced back to the results of Krylov, Bogolyubov and Mitropolsky, see e.g. [18], [1]. After that, [17] established an averaging principle for SDEs driven by Brownian motion (BM). Up to now, several methods have been developed for studying the averaging principle, such as the techniques of time discretization and Poisson equation and the weak convergence method, see e.g. [20], [21], [26], [27], [30] for SDEs, and see e.g. [4], [10], [7], [11] for stochastic partial differential equations (SPDEs).

In recent years there has been considerable research interest in averaging for McKean-Vlasov stochastic (partial) differential equations (S(P)DEs). [28] established the averaging principle for slow-fast McKean-Vlasov SDEs by the techniques of time discretization and Poisson equation. [13] investigated the strong convergence rate of the averaging principle for slow-fast McKean-Vlasov SPDEs based on the variational approach and the technique of time discretization. [6] studied the averaging principle for distribution dependent SDEs with localized $L^{p}$ drift using Zvonkin’s transformation and estimates for Kolmogorov equations. [29] obtained strong convergence without a rate for distribution dependent SDEs with a highly oscillating component driven by FBM and standard BM, which requires the FBM term to be additive.

However, the aforementioned references all focus on McKean-Vlasov S(P)DEs with additive noise or multiplicative white noise. Up to now, there is no work concentrating on averaging for McKean-Vlasov SDEs driven by multiplicative fractional noise. In this work, we aim to close this gap. It is known that FBMs are not semimartingales. Therefore, the beautiful classical stochastic analysis is not applicable to fractional noises for $H\neq 1/2$. It is a non-trivial task to extend the results of classical stochastic analysis to multiplicative fractional noises, whereas one can use the Wiener integral for additive fractional noise because the diffusion term is a deterministic function. Note that the diffusion term of the McKean-Vlasov SDEs in this paper is state-dependent. Within the Riemann-Stieltjes integral framework, we cannot use Gronwall’s lemma or the generalized Gronwall’s lemma directly to prove the convergence of $X^{\epsilon}$ to $X$ as in [24], [23]. So, we will use the $\lambda$-equivalent Hölder norm (see Section 4.3) to overcome this problem.

From the above motivations, we consider the strong convergence in the averaging principle for McKean-Vlasov SDEs driven by multiplicative fractional noise in the present paper. The problem is solved efficiently by the fractional approach and a Khasminskii-type averaging principle. Moreover, our averaging result in the supremum- and Hölder-topologies sharpens existing ones in the classical McKean-Vlasov S(P)DEs framework.

The paper is organized as follows. Section 2 presents some necessary notations and assumptions. Stochastic averaging principles for such McKean-Vlasov SDEs are then established in Section 3. Note that $C$ and $C_{\mathrm{x}}$ denote some positive constants which may change from line to line throughout this paper, where $\mathrm{x}$ is one or more than one parameter and $C_{\mathrm{x}}$ is used to emphasize that the constant depends on the corresponding parameter, for example, $C_{\alpha,\beta,\gamma,T,|x_{0}|}$ depends on $\alpha,\beta,\gamma,T$ and $|x_{0}|$ .

2 Preliminaries

In this section, we recall some basic facts on definitions and properties of the fractional calculus. For more details, we refer to [12] and [25]. Firstly, we introduce some necessary spaces and norms. In the rest of this section, let $a,b\in\mathbb{R}$, $a<b$. For $\gamma\in(0,1)$, let $C^{\gamma}((a,b),\mathbb{R})$ be the space of $\gamma$-Hölder continuous functions $f:[a,b]\rightarrow\mathbb{R}$, equipped with the norm

\|f\|_{\gamma,a,b}=\|f\|_{\infty,a,b}+\||f\||_{\gamma,a,b}

with

\|f\|_{\infty,a,b}:=\sup_{t\in[a,b]}|f(t)|,\quad\||f\||_{\gamma,a,b}=\sup_{a\leq s<t\leq b}\frac{|f(t)-f(s)|}{(t-s)^{\gamma}}.

For simplicity, let $\|f\|_{\beta}:=\|f\|_{\beta,0,T},\|f\|_{\infty}:=\|f\|_{\infty,0,T}$ and $\||f\||_{\beta}:=\||f\||_{\beta,0,T}$.

The following proposition provides an explicit expression for the integral $\int_{a}^{b}f\mathrm{d}g$ when $f\in C^{\gamma}((a,b),\mathbb{R})$ and $g\in C^{\beta}((a,b),\mathbb{R})$ with $\beta+\gamma>1,\beta,\gamma\in(0,1)$ in terms of fractional derivatives, see [31].

Proposition 2.1.

(Remark 4.1 in Nualart and Răşcanu, 2002). Suppose that $f\in C^{\gamma}((a,b),\mathbb{R})$ and $g\in C^{\beta}((a,b),\mathbb{R})$ with $\beta+\gamma>1,\beta,\gamma\in(0,1)$. Let $\alpha\in(0,1)$, $\gamma>\alpha$ and $\beta>1-\alpha$. Then the Riemann-Stieltjes integral $\int_{a}^{b}f\mathrm{d}g$ exists and it can be expressed as

\int_{a}^{b}f\mathrm{d}g=(-1)^{\alpha}\int_{a}^{b}D_{a+}^{\alpha}f(t)D_{b-}^{1-\alpha}g_{b-}(t)\mathrm{d}t

(2.1)

where $g_{b-}(t)=g(t)-g(b)$ and for $a\leq t\leq b$ the Weyl derivatives of $f$ are defined by formulas

\begin{split}D_{a+}^{\alpha}f(t)&=\frac{1}{\Gamma(1-\alpha)}\bigg(\frac{f(t)}{(t-a)^{\alpha}}+\alpha\int_{a}^{t}\frac{f(t)-f(s)}{(t-s)^{\alpha+1}}\mathrm{d}s\bigg),\\ D_{b-}^{\alpha}f(t)&=\frac{(-1)^{\alpha}}{\Gamma(1-\alpha)}\bigg(\frac{f(t)}{(b-t)^{\alpha}}+\alpha\int_{t}^{b}\frac{f(t)-f(s)}{(s-t)^{\alpha+1}}\mathrm{d}s\bigg)\end{split}

and $\Gamma$ denotes the Gamma function.

Lemma 2.2.

(Theorem 2 in Hu and Nualart, 2007) Suppose that $f\in C^{\gamma}([0,T],\mathbb{R})$ and $g\in C^{\beta}([0,T],\mathbb{R})$ with $\beta+\gamma>1$ and $1>\gamma>\alpha,\,1>\beta>1-\alpha$ , for all $s,t\in[0,T]$ , one has

\displaystyle\bigg{|}\int_{s}^{t}f(r)\mathrm{d}g(r)\bigg{|}\leq C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||g\||_{\beta}(t-s)^{\beta}.

Proof.

By using the fractional integration given in (2.1), for all $s,t\in[0,T]$ , we have

\begin{split}\bigg|\int_{s}^{t}f(r)\mathrm{d}g(r)\bigg|&\leq\int_{s}^{t}\big|D_{s+}^{\alpha}f(r)\big|\cdot\big|D_{t-}^{1-\alpha}g_{t-}(r)\big|\mathrm{d}r\\ &\leq\int_{s}^{t}C_{\alpha,\gamma,T}\|f\|_{\gamma,s,t}(r-s)^{-\alpha}C_{\alpha,\beta}\||g\||_{\beta}(t-r)^{\alpha+\beta-1}\mathrm{d}r\\ &\leq C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||g\||_{\beta}\int_{s}^{t}(r-s)^{-\alpha}(t-r)^{\alpha+\beta-1}\mathrm{d}r\\ &\leq C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||g\||_{\beta}(t-s)^{\beta}.\end{split}

This completes the proof. ∎

Remark 2.3.

(Lemma 7.5 in Nualart and Răşcanu, 2002) The trajectories of $B^{H}$ are locally $\beta$-Hölder continuous a.s. for all $\beta\in(0,H)$ and $\||B^{H}\||_{\beta}$ has moments of all orders.

Remark 2.4.

Suppose that $f\in C^{\gamma}([0,T],\mathbb{R})$ and $B^{H}\in C^{\beta}([0,T],\mathbb{R})$ with $\beta+\gamma>1$ and $1>\gamma>\alpha,\,1>\beta>1-\alpha$ , for all $s,t\in[0,T]$ , one has

\displaystyle\bigg{|}\int_{s}^{t}f(r)\mathrm{d}B^{H}_{r}\bigg{|}\leq C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||B^{H}\||_{\beta}(t-s)^{\beta},\,\,\,{\rm a.s.}

Lemma 2.5.

For any positive constants $a$ , $d$ , if $a,d<1$ , one has

\displaystyle\int_{s}^{t}(r-s)^{-a}(t-r)^{-d}\mathrm{d}r\leq(t-s)^{1-a-d}B(1-a,1-d)

where $s\in(0,t)$ and $\mathit{B}$ is the Beta function.

Proof.

By a change of variable $y=(r-s)/(t-s)$ , we have

\begin{split}\int_{s}^{t}(r-s)^{-a}(t-r)^{-d}\mathrm{d}r&=\int_{0}^{1}(y(t-s))^{-a}(t-s-y(t-s))^{-d}(t-s)\mathrm{d}y\\ &=(t-s)^{1-a-d}\int_{0}^{1}y^{-a}(1-y)^{-d}\mathrm{d}y\\ &=(t-s)^{1-a-d}B(1-a,1-d).\end{split}

This completes the proof. ∎
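The identity in Lemma 2.5 can be checked numerically. The sketch below compares a midpoint-rule approximation of the singular integral with the closed form $(t-s)^{1-a-d}B(1-a,1-d)$, using $B(p,q)=\Gamma(p)\Gamma(q)/\Gamma(p+q)$. The helper names, the values $a=0.3$, $d=0.4$, $s=0.5$, $t=2$ and the number of quadrature nodes are illustrative assumptions, and the midpoint rule is only a rough approximation near the endpoint singularities.

```python
import math


def beta(p, q):
    """Beta function B(p, q) expressed through the Gamma function."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


def singular_integral(s, t, a, d, n=200_000):
    """Midpoint-rule approximation of int_s^t (r - s)^(-a) (t - r)^(-d) dr."""
    h = (t - s) / n
    total = 0.0
    for k in range(n):
        r = s + (k + 0.5) * h  # midpoints avoid the singular endpoints
        total += (r - s) ** (-a) * (t - r) ** (-d)
    return total * h


if __name__ == "__main__":
    s, t, a, d = 0.5, 2.0, 0.3, 0.4
    closed_form = (t - s) ** (1 - a - d) * beta(1 - a, 1 - d)
    print(singular_integral(s, t, a, d), closed_form)
```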

Let $\mathcal{P}(\mathbb{R})$ be the collection of all probability measures on $\mathbb{R}$, and $\mathcal{P}_{2}(\mathbb{R})$ be the space of probability measures on $\mathbb{R}$ with finite second moment, i.e.,

\mathcal{P}_{2}(\mathbb{R})=\bigg{\{}\mu\in\mathcal{P}(\mathbb{R}):\;\mu(|\cdot|^{2}):=\int_{\mathbb{R}}|x|^{2}\mu(\mathrm{d}x)<\infty\bigg{\}}.

We define the $L^{2}$ -Wasserstein distance on $\mathcal{P}_{2}(\mathbb{R})$ by

\mathbb{W}_{2}\left(\mu_{1},\mu_{2}\right):=\inf_{\pi\in\mathcal{C}_{\mu_{1},\mu_{2}}}\left(\int_{\mathbb{R}\times\mathbb{R}}|x-y|^{2}\pi(dx,dy)\right)^{1/2},\quad\mu_{1},\mu_{2}\in\mathcal{P}_{2}(\mathbb{R})

where $\mathcal{C}_{\mu_{1},\mu_{2}}$ is the set of probability measures on $\mathbb{R}\times\mathbb{R}$ with marginals $\mu_{1}$ and $\mu_{2}$ . It is well-known that $(\mathcal{P}_{2}(\mathbb{R}),\mathbb{W}_{2})$ is a Polish space.

Note that for any $x\in\mathbb{R}$, the Dirac measure $\delta_{x}$ belongs to $\mathcal{P}_{2}(\mathbb{R})$; in particular, $\delta_{0}$ is the Dirac measure at the point 0. Moreover, if $\mu_{1}=\mathscr{L}_{X}$ and $\mu_{2}=\mathscr{L}_{Y}$ are the distributions of random variables $X$ and $Y$, respectively, then

\mathbb{W}_{2}^{2}(\mu_{1},\mu_{2})\leq\int_{\mathbb{R}\times\mathbb{R}}|x-y|^{2}\mathscr{L}_{(X,Y)}(dx,dy)=\mathbb{E}[|X-Y|^{2}]

in which $\mathscr{L}_{(X,Y)}$ represents the joint distribution of the random pair $(X,Y)$. Then for arbitrarily fixed $T>0$, let $C([0,T];\mathbb{R})$ be the Banach space of all $\mathbb{R}$-valued continuous functions on $[0,T]$, endowed with the supremum norm. Furthermore, we let $L^{2}(\Omega;C([0,T];\mathbb{R}))$ be the totality of $C([0,T];\mathbb{R})$-valued random variables $X$ satisfying $\mathbb{E}[\sup_{0\leq t\leq T}|X(t)|^{2}]<\infty$. Then, $L^{2}(\Omega;C([0,T];\mathbb{R}))$ is a Banach space under the norm

\|X\|_{L^{2}}:=\Big{(}\mathbb{E}\Big{[}\sup_{0\leq t\leq T}|X(t)|^{2}\Big{]}\Big{)}^{1/2}.
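The coupling bound $\mathbb{W}_{2}^{2}(\mu_{1},\mu_{2})\leq\mathbb{E}[|X-Y|^{2}]$ stated above can be illustrated on empirical measures. The sketch below relies on a standard fact that is not part of this paper: on $\mathbb{R}$, for two empirical measures with the same number of equally weighted atoms, the optimal coupling pairs the sorted samples. The function names and the sample construction are illustrative assumptions.

```python
import random


def empirical_w2_squared(xs, ys):
    """Squared W_2 distance between two empirical measures on R with equally
    many atoms of equal weight (the optimal coupling pairs sorted samples)."""
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum((x - y) ** 2 for x, y in pairs) / len(xs)


def coupling_cost(xs, ys):
    """E|X - Y|^2 under the particular coupling that pairs xs[i] with ys[i]."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    rng = random.Random(1)
    xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
    ys = [x + rng.gauss(0.5, 0.3) for x in xs]
    # The optimal cost never exceeds the cost of the given coupling.
    print(empirical_w2_squared(xs, ys), coupling_cost(xs, ys))
```

Taking the second sample identically zero recovers $\mathbb{W}_{2}^{2}(\mu,\delta_{0})=\mu(|\cdot|^{2})$ for the empirical measure $\mu$.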

3 Assumptions and main result

3.1 Assumptions

To derive a unique solution to (1.1), we first impose the following assumptions on the coefficients $b$ and $\sigma$:

(H1)

There exists a constant $L_{b}>0$ , such that for any $t\in[0,T]$ , $x_{1},x_{2}\in\mathbb{R}$ and $\mu_{1},\mu_{2}\in\mathcal{P}_{2}(\mathbb{R})$ ,

$\displaystyle|b(t,x_{1},\mu_{1})-b(t,x_{2},\mu_{2})|\leq L_{b}\big{(}|x_{1}-x_{2}|+\mathbb{W}_{2}(\mu_{1},\mu_{2})\big{)}.$

Moreover, $b$ is bounded by a positive constant $M_{b}$ , i.e.,

$\sup_{(t,x,\mu)\in[0,T]\times\mathbb{R}\times\mathcal{P}_{2}(\mathbb{R})}|b(t,x,\mu)|\leq M_{b}.$

(H2)

There exist constants $M_{\sigma}>K_{\sigma}>0$ and $L_{\sigma}>0$ such that for any $x,x_{1},x_{2}\in\mathbb{R}$

\displaystyle|\sigma(x_{1})-\sigma(x_{2})|\leq L_{\sigma}|x_{1}-x_{2}|\,{\rm and}\,\,K_{\sigma}\leq|\sigma(x)|\leq M_{\sigma}.

Under assumptions (H1) and (H2) above, one can deduce from Theorem 3.3 in [8] that the system (1.1) admits a unique solution via a Lamperti transform.

Lemma 3.1.

Suppose that (H1) and (H2) hold and $1/2<H<1$ , then (1.1) has a unique solution $X\in L^{2}(\Omega;C([0,T];\mathbb{R}))$ .

In order to establish the averaging principle, besides conditions (H1) and (H2), we further assume:

(H3)

$b$ is Lipschitz continuous with respect to $t$, i.e., there exists a positive constant $L^{\prime}_{b}$ such that for any $t_{1},t_{2}\in[0,T]$, $x\in\mathbb{R}$ and $\mu\in\mathcal{P}_{2}(\mathbb{R})$,

$\displaystyle|b(t_{1},x,\mu)-b(t_{2},x,\mu)|\leq L^{\prime}_{b}|t_{1}-t_{2}|.$

(H4)

The function $\sigma(x)$ is of class $C^{1}(\mathbb{R})$ . There exists a constant $M^{\prime}_{\sigma}>0$ such that for any $x,x_{1},x_{2}\in\mathbb{R}$ ,

\displaystyle|\nabla\sigma(x_{1})-\nabla\sigma(x_{2})|\leq M^{\prime}_{\sigma}|x_{1}-x_{2}|\,{\rm and}\,\,|\nabla\sigma(x)|\leq M^{\prime}_{\sigma}

hold. Here, $\nabla$ is the standard gradient operator on $\mathbb{R}$ .

(H5)

There exist a bounded positive function $\varphi:\mathbb{R}^{+}\rightarrow\mathbb{R}^{+}$ and a measurable function $\bar{b}:\mathbb{R}\times\mathcal{P}_{2}(\mathbb{R})\rightarrow\mathbb{R}$ , such that for any $x\in\mathbb{R}$ , $\mu\in\mathcal{P}_{2}(\mathbb{R})$ it holds that

\displaystyle\sup_{t\geq 0}\Bigg{|}\frac{1}{T}\int_{t}^{t+T}(b(s,x,\mu)-\bar{b}(x,\mu))\mathrm{d}s\Bigg{|}\leq\varphi(T)\big{(}1+|x|+\mu(|\cdot|^{2})\big{)}

where $\varphi(T)$ satisfies $\lim_{T\rightarrow\infty}\varphi(T)=0$ .
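To make (H5) concrete, consider the hypothetical drift $b(s,x,\mu)=\sin(s)\cos(x)+\tanh(x-m_{\mu})$, where $m_{\mu}$ denotes the mean of $\mu$. This example is an assumption made for illustration and does not appear in the paper. Its candidate averaged drift is $\bar{b}(x,\mu)=\tanh(x-m_{\mu})$, and since $|\int_{t}^{t+T}\sin(s)\mathrm{d}s|\leq 2$, one may take $\varphi(T)=\min\{1,2/T\}$. The sketch below evaluates the time average in (H5) numerically with a midpoint rule.

```python
import math


def drift(s, x, mean):
    """Hypothetical drift b(s, x, mu) = sin(s) cos(x) + tanh(x - mean(mu))."""
    return math.sin(s) * math.cos(x) + math.tanh(x - mean)


def averaged_drift(x, mean):
    """Candidate averaged drift: the oscillating part sin(s) cos(x) averages out."""
    return math.tanh(x - mean)


def phi(window):
    """Candidate rate function phi(T) = min(1, 2 / T), which tends to 0."""
    return min(1.0, 2.0 / window)


def time_average_gap(t, window, x, mean, n=4000):
    """Midpoint-rule value of |(1/T) int_t^{t+T} (b - b_bar) ds| with T = window."""
    h = window / n
    total = 0.0
    for k in range(n):
        s = t + (k + 0.5) * h
        total += drift(s, x, mean) - averaged_drift(x, mean)
    return abs(total * h / window)


if __name__ == "__main__":
    for window in (0.5, 5.0, 50.0, 500.0):
        print(window, time_average_gap(1.3, window, 0.3, -0.2), phi(window))
```

In this example the bound holds uniformly in the starting time $t$, which is the role of the supremum over $t\geq 0$ in (H5).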

Remark 3.2.

It follows from the conditions (H1) and (H5) that $\bar{b}$ satisfies

\begin{split}|\bar{b}(x,\mu)|&\leq L_{\bar{b}},\\ |\bar{b}(x_{1},\mu_{1})-\bar{b}(x_{2},\mu_{2})|&\leq L_{\bar{b}}\big(|x_{1}-x_{2}|+\mathbb{W}_{2}(\mu_{1},\mu_{2})\big)\end{split}

for any $x,x_{1},x_{2}\in\mathbb{R}$ and $\mu,\mu_{1},\mu_{2}\in\mathcal{P}_{2}(\mathbb{R})$ , where $L_{\bar{b}}$ is a positive constant.

Proof.

We have that

\begin{split}|\bar{b}(x,\mu)|&\leq\Bigg|\frac{1}{T}\int_{0}^{T}(b(s,x,\mu)-\bar{b}(x,\mu))\mathrm{d}s\Bigg|+\Bigg|\frac{1}{T}\int_{0}^{T}b(s,x,\mu)\mathrm{d}s\Bigg|\\ &\leq\varphi(T)\big(1+|x|+\mu(|\cdot|^{2})\big)+M_{b}\end{split}

and

\begin{split}|\bar{b}(x_{1},\mu_{1})-\bar{b}(x_{2},\mu_{2})|&\leq\Bigg|\frac{1}{T}\int_{0}^{T}(b(s,x_{1},\mu_{1})-\bar{b}(x_{1},\mu_{1}))\mathrm{d}s\Bigg|+\Bigg|\frac{1}{T}\int_{0}^{T}(b(s,x_{2},\mu_{2})-\bar{b}(x_{2},\mu_{2}))\mathrm{d}s\Bigg|\\ &\quad+\Bigg|\frac{1}{T}\int_{0}^{T}(b(s,x_{1},\mu_{1})-b(s,x_{2},\mu_{2}))\mathrm{d}s\Bigg|\\ &\leq\varphi(T)\big(1+|x_{1}|+|x_{2}|+\mu_{1}(|\cdot|^{2})+\mu_{2}(|\cdot|^{2})\big)+L_{b}\big(|x_{1}-x_{2}|+\mathbb{W}_{2}(\mu_{1},\mu_{2})\big).\end{split}

Taking $T\rightarrow\infty$, we see that there exists a constant $L_{\bar{b}}>0$ such that the estimates in Remark 3.2 hold. ∎

Remark 3.3.

Note that

\displaystyle\sup_{t\geq 0}\Bigg{|}\frac{1}{T}\int_{t}^{t+T}(b(s,x,\mu)-\bar{b}(x,\mu))\mathrm{d}s\Bigg{|}\leq\sup_{t\geq 0}\frac{1}{T}\int_{t}^{t+T}|b(s,x,\mu)-\bar{b}(x,\mu)|\mathrm{d}s.

This shows that the averaging condition (H5) is weaker than the following averaging condition

\displaystyle\sup_{t\geq 0}\frac{1}{T}\int_{t}^{t+T}|b(s,x,\mu)-\bar{b}(x,\mu)|\mathrm{d}s\leq\varphi(T)\big{(}1+|x|+\mu(|\cdot|^{2})\big{)}.

3.2 Main result

Now, we define the averaged equation:

\mathrm{d}\bar{X}_{t}=\bar{b}(\bar{X}_{t},\mathscr{L}_{\bar{X}_{t}})\mathrm{d}t+\sigma(\bar{X}_{t})\mathrm{d}B_{t}^{H},\quad\bar{X}_{0}=x_{0}

(3.1)

where $\bar{b}$ is given in (H5). Using Theorem 3.3 in [8] again, we obtain the existence and uniqueness of the solution to (3.1).

Lemma 3.4.

Suppose that (H1)-(H5) hold, then Eq. (3.1) has a unique solution $\bar{X}\in L^{2}(\Omega;C([0,T];\mathbb{R}))$ .

Theorem 3.5.

Suppose that (H1)-(H5) hold, then we obtain

\lim_{\epsilon\rightarrow 0}\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\big{]}=0.

The proof of Theorem 3.5 will be given in Section 4.

Remark 3.6.

The averaging principle result (Theorem 3.5) is also applicable to the following system

\displaystyle\begin{split}\mathrm{d}X_{t}=\epsilon b(t,X_{t},\mathscr{L}_{X_{t}})\mathrm{d}t+\epsilon^{H}\sigma(X_{t})\mathrm{d}B^{H}_{t}.\end{split}

(3.2)

Under the time change $t\mapsto{\frac{t}{\epsilon}}$, define $Y_{t}^{\epsilon}:=X_{t/\epsilon}$ and $B^{\epsilon,H}_{t}:=\epsilon^{H}B^{H}_{t/\epsilon}$ for all $t\in\mathbb{R}^{+}$; then (3.2) can be rewritten as

\displaystyle\begin{split}&\mathrm{d}Y_{t}^{\epsilon}=b(t/\epsilon,Y_{t}^{\epsilon},\mathscr{L}_{Y_{t}^{\epsilon}})\mathrm{d}t+\sigma(Y_{t}^{\epsilon})\mathrm{d}B^{\epsilon,H}_{t}.\end{split}

(3.3)

Since, by the self-similarity of FBM, $B^{\epsilon,H}$ is again an FBM with Hurst parameter $H$, we can then consider the following system

\displaystyle\begin{split}\mathrm{d}\tilde{X}_{t}^{\epsilon}=b(t/\epsilon,\tilde{X}_{t}^{\epsilon},\mathscr{L}_{\tilde{X}_{t}^{\epsilon}})\mathrm{d}t+\sigma(\tilde{X}_{t}^{\epsilon})\mathrm{d}B^{H}_{t}.\end{split}

(3.4)
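The rescaling in Remark 3.6 rests on the fact that $B^{\epsilon,H}_{t}=\epsilon^{H}B^{H}_{t/\epsilon}$ has the same covariance $R_{H}$ as $B^{H}$; since both processes are centered and Gaussian, they then have the same law. The following short sketch checks the covariance identity numerically. The function names and the sample values of $t$, $s$, $H$ and $\epsilon$ are illustrative assumptions.

```python
def fbm_covariance(t, s, hurst):
    """Covariance R_H(t, s) of fractional Brownian motion."""
    return 0.5 * (t ** (2 * hurst) + s ** (2 * hurst) - abs(t - s) ** (2 * hurst))


def rescaled_covariance(t, s, hurst, eps):
    """Covariance of the rescaled process eps^H * B^H_{t / eps}."""
    return eps ** (2 * hurst) * fbm_covariance(t / eps, s / eps, hurst)


if __name__ == "__main__":
    for eps in (1.0, 0.1, 0.01):
        gap = rescaled_covariance(0.7, 0.2, 0.75, eps) - fbm_covariance(0.7, 0.2, 0.75)
        print(eps, abs(gap))
```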

4 The proof of main result

4.1 Some a priori estimates of the solution

Lemma 4.1.

Suppose that (H1)-(H5) hold. Then, for $t\in[0,T]$ , we have

\|X^{\epsilon}\|_{\gamma}+\|\bar{X}\|_{\gamma}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\big{(}(1+\||B^{H}\||_{\beta})\vee(1+\||B^{H}\||_{\beta})^{\frac{1}{\gamma}}\big{)},\,\,{\rm a.s.}

Proof.

Like the proof of Theorem 2.2 in [15] and Exercise 4.5 in [9], for any $0\leq s<t\leq T$ , we have

\displaystyle\||X^{\epsilon}\||_{\gamma,s,t}\leq C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big{(}1+\||X^{\epsilon}\||_{\gamma,s,t}(t-s)^{\gamma}\big{)}.

Suppose that $\Delta$ satisfies $\Delta=\big(2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big)^{-\frac{1}{\gamma}}.$ Then, for all $s$ and $t$ such that $t-s\leq\Delta$, we have

\displaystyle\||X^{\epsilon}\||_{\gamma,s,t}\leq 2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta}).

(4.1)

Therefore we can obtain

\displaystyle\|X^{\epsilon}\|_{\infty,s,t}\leq|X^{\epsilon}_{s}|+\||X^{\epsilon}\||_{\gamma,s,t}(t-s)^{\gamma}\leq|X^{\epsilon}_{s}|+2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\Delta^{\gamma}.

(4.2)

If $\Delta\geq T$ , from (4.1) and (4.2), we obtain the estimate

\displaystyle\|X^{\epsilon}\|_{\infty,s,t}\leq|x_{0}|+2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})T^{\gamma}\,{\rm and}\,\,\||X^{\epsilon}\||_{\gamma}\leq 2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta}).

(4.3)

While if $\Delta\leq T$ , then from (4.2) we get

\displaystyle\|X^{\epsilon}\|_{\infty,s,t}\leq|X^{\epsilon}_{s}|+2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})(2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta}))^{-{\frac{1}{\gamma}}\cdot\gamma}\leq|X^{\epsilon}_{s}|+1.

(4.4)

Dividing the interval $[0,T]$ into $n=[\frac{T}{\Delta}]+1$ subintervals and using the estimate (4.4) on every subinterval, we obtain

\displaystyle\|X^{\epsilon}\|_{\infty}\leq|x_{0}|+n\leq|x_{0}|+T\Delta^{-1}+1\leq|x_{0}|+2T\big{(}2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big{)}^{\frac{1}{\gamma}}\leq|x_{0}|+2^{\frac{1+\gamma}{\gamma}}TC_{\alpha,\beta,\gamma,T}^{\frac{1}{\gamma}}(1+\||B^{H}\||_{\beta})^{\frac{1}{\gamma}}

(4.5)

and from (4.1), we know that $\||X^{\epsilon}\||_{\gamma,s,t}\leq 2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})$ when $t-s\leq\Delta$. When $t-s\geq\Delta$, define $t_{i}=(s+i\Delta)\wedge t$ for $i=0,1,\dots,N$, and note that $t_{N}=t$ for $N\geq|t-s|/\Delta$ and that $t_{i+1}-t_{i}\leq\Delta$ for all $i$. Then

\displaystyle|X^{\epsilon}_{t}-X^{\epsilon}_{s}|\leq\sum_{0\leq i<|t-s|/\Delta}|X^{\epsilon}_{t_{i+1}}-X^{\epsilon}_{t_{i}}|\leq\big{(}|t-s|/\Delta+1\big{)}\Delta^{\gamma}=\Delta^{\gamma-1}(\Delta+|t-s|)\leq 2\Delta^{\gamma-1}|t-s|.

So we have

\begin{split}\||X^{\epsilon}\||_{\gamma,s,t}&\leq 2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})(1\vee 2\Delta^{\gamma-1})=\Big\{2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\vee 2\Delta^{-1}\Big\}\qquad(4.6)\\ &=\Big\{2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\vee 2\big(2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big)^{\frac{1}{\gamma}}\Big\}.\qquad(4.7)\end{split}

Thus, from (4.3), (4.5) and (4.6), we get the desired estimate.

Using similar techniques, we can prove

\|\bar{X}\|_{\gamma}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\big{(}(1+\||B^{H}\||_{\beta})\vee(1+\||B^{H}\||_{\beta})^{\frac{1}{\gamma}}\big{)},\,\,{\rm a.s.}

Here we omit the proof. ∎

Lemma 4.2.

Suppose that (H1)-(H5) hold. Then, if $0\leq t\leq t+h\leq T$ , and $h\in(0,1)$ , we have

|X^{\epsilon}_{t+h}-X^{\epsilon}_{t}|\leq C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})(1+\|X^{\epsilon}\|_{\gamma})h^{\beta},\,\,{\rm a.s.}

Proof.

From (1.1), by (H1)-(H5), Hölder inequality and Remark 2.4, we have

\begin{split}|X^{\epsilon}_{t+h}-X^{\epsilon}_{t}|&\leq\bigg|\int_{t}^{t+h}b({s}/{\epsilon},X^{\epsilon}_{s},\mathscr{L}_{X^{\epsilon}_{s}})\mathrm{d}s\bigg|+\bigg|\int_{t}^{t+h}\sigma(X^{\epsilon}_{s})\mathrm{d}B^{H}_{s}\bigg|\\ &\leq\int_{t}^{t+h}|b({s}/{\epsilon},X^{\epsilon}_{s},\mathscr{L}_{X^{\epsilon}_{s}})|\mathrm{d}s+C_{\alpha,\beta,\gamma,T}\||B^{H}\||_{\beta}\Big(\sup_{t\leq r\leq t+h}|\sigma(X^{\epsilon}_{r})|+\sup_{t\leq u<r\leq t+h}\frac{|\sigma(X^{\epsilon}_{r})-\sigma(X^{\epsilon}_{u})|}{(r-u)^{\gamma}}\Big)h^{\beta}\\ &\leq C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big(1+\|X^{\epsilon}\|_{\gamma}\big)h^{\beta}.\end{split}

This completes the proof. ∎

4.2 The proof of Theorem 3.5

For each $R>1$, we define the stopping time $\tau_{R}$ by

\displaystyle\tau_{R}:=\inf\{t\geq 0:\||B^{H}\||_{\beta,0,t}>R\}\wedge T.

(4.8)

Firstly, we have

\displaystyle\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\big{]}\leq\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\mathbf{1}_{\{\tau_{R}\geq T\}}\big{]}+\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\mathbf{1}_{\{\tau_{R}<T\}}\big{]}

(4.9)

where $\mathbf{1}_{\cdot}$ is an indicator function. For the first term on the right-hand side of inequality (4.9), denote $D:=\{\||B^{H}\||_{\beta}\leq R\}$. Now for $\lambda\geq 1$, an equivalent norm on $C^{\gamma}([0,T],\mathbb{R})$ with $\gamma\in(0,1)$ is defined by

\|f\|_{\gamma,\lambda,0,T}:=\|f\|_{\infty,\lambda,0,T}+\||f\||_{\gamma,\lambda,0,T}:=\sup_{t\in[0,T]}e^{-\lambda t/2}|f(t)|+\sup_{0\leq s<t\leq T}e^{-\lambda t/2}\frac{|f(t)-f(s)|}{(t-s)^{\gamma}}.

For simplicity, let $\|f\|_{\gamma,\lambda}:=\|f\|_{\gamma,\lambda,0,T},\|f\|_{\infty,\lambda}:=\|f\|_{\infty,\lambda,0,T}$ and $\||f\||_{\gamma,\lambda}:=\||f\||_{\gamma,\lambda,0,T}$.
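The equivalence of the weighted and unweighted Hölder norms follows from $e^{-\lambda T/2}\leq e^{-\lambda t/2}\leq 1$ on $[0,T]$, which gives $e^{-\lambda T/2}\|f\|_{\gamma}\leq\|f\|_{\gamma,\lambda}\leq\|f\|_{\gamma}$. The sketch below computes a grid analogue of the weighted norm and checks both bounds. The function name, the test function $f(t)=\sqrt{t}+\sin(5t)$ and the grid are illustrative assumptions, and the grid version only approximates the true suprema.

```python
import math


def holder_norm(values, times, gamma, lam=0.0):
    """Grid analogue of ||f||_{gamma,lambda,0,T}; lam = 0 gives ||f||_gamma."""
    sup_part = max(math.exp(-lam * t / 2) * abs(v) for t, v in zip(times, values))
    seminorm = 0.0
    for j in range(1, len(times)):
        weight = math.exp(-lam * times[j] / 2)  # weight uses the later time t
        for i in range(j):
            quotient = abs(values[j] - values[i]) / (times[j] - times[i]) ** gamma
            seminorm = max(seminorm, weight * quotient)
    return sup_part + seminorm


if __name__ == "__main__":
    times = [k / 100 for k in range(101)]
    values = [math.sqrt(t) + math.sin(5 * t) for t in times]
    plain = holder_norm(values, times, gamma=0.4)
    weighted = holder_norm(values, times, gamma=0.4, lam=8.0)
    print(math.exp(-4.0) * plain, weighted, plain)
```

Increasing $\lambda$ shrinks the contribution of late times, which is how terms on the right-hand side of the estimates can be absorbed into the left-hand side for $\lambda$ large.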

In what follows we fix $0<\alpha<\gamma<\beta$ , $\frac{1}{2}<\beta<H$ , $\gamma+\beta>1$ . We will show that for every $\rho_{0}>0$ there exists an $\epsilon_{0}>0$ so that for $\epsilon<\epsilon_{0}$ , $\lambda>\lambda_{0}$ we have

\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}\leq\rho_{0}.

(4.10)

Note that the norm here is equivalent to the norm in the conclusion. Here $\delta\in(0,1)$ is a parameter depending on $\rho_{0}$ . To estimate all the terms in the following inequality we have to consider 3 cases. For the first case the right hand side will be absorbed by the left hand side of the inequality when $\lambda$ is sufficiently large. The second case includes terms providing estimates like $C\delta^{2-2\gamma}$ where $C$ is a priori determined by $\alpha,\beta,\gamma,T,|x_{0}|$ but independent of $\rho_{0},\lambda,\delta,\epsilon$ , then we choose fixed $\delta$ so that $C\delta^{2-2\gamma}<\zeta\rho_{0}$ , $\zeta>0$ sufficiently small. The third case contains terms providing an estimate $\sqrt{R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]}$ , which can be made arbitrarily small when $R$ is sufficiently large.

Let $\mathbf{A}:=\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big{]},$ and divide $[0,T]$ into intervals of size $\delta$, where $\delta\in(0,1)$ is a fixed positive number. For $s\in[0,T]$, let $s(\delta)=\lfloor\frac{s}{\delta}\rfloor\delta$, where $\lfloor\frac{s}{\delta}\rfloor$ is the integer part of $\frac{s}{\delta}$, so that $s(\delta)=k\delta$ for $s\in[k\delta,\min\{(k+1)\delta,T\})$. From (1.1) and (3.1), we have

\begin{split}\mathbf{A}&\leq 5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(b({s}/{\epsilon},X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-b({s}/{\epsilon},X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}}))\mathrm{d}s\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]\\ &\quad+5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(b({s}/{\epsilon},X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}}))\mathrm{d}s\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]\\ &\quad+5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}}))\mathrm{d}s\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]\\ &\quad+5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-\bar{b}(\bar{X}_{s},\mathscr{L}_{\bar{X}_{s}}))\mathrm{d}s\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]\\ &\quad+5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(\sigma(X^{\epsilon}_{s})-\sigma(\bar{X}_{s}))\mathrm{d}B^{H}_{s}\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]=:\sum_{i=1}^{5}\mathbf{A}_{i}.\end{split}

By Hölder’s inequality, it is easy to obtain

\displaystyle\bigg{\|}\int_{0}^{\cdot}f(s)\mathrm{d}s\bigg{\|}_{\gamma,\lambda}^{2}\leq C_{\gamma,T}\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}\frac{|f(s)|^{2}}{(t-s)^{\gamma}}\mathrm{d}s.

(4.11)

By (H2), (4.11) and Lemma 4.2, we obtain

\begin{split}\mathbf{A}_{1}+\mathbf{A}_{3}&\leq C_{\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}|b({s}/{\epsilon},X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-b({s}/{\epsilon},X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})|^{2}(t-s)^{-\gamma}\mathrm{d}s\mathbf{1}_{D}\bigg]\\ &\quad+C_{\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}|\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})|^{2}(t-s)^{-\gamma}\mathrm{d}s\mathbf{1}_{D}\bigg]\qquad(4.13)\\ &\leq C_{\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}(|X_{s(\delta)}^{\epsilon}-X^{\epsilon}_{s}|^{2}+\mathbb{E}[|X_{s(\delta)}^{\epsilon}-X^{\epsilon}_{s}|^{2}])(t-s)^{-\gamma}\mathrm{d}s\mathbf{1}_{D}\bigg]\qquad(4.14)\\ &\leq C_{\alpha,\beta,\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}((1+\||B^{H}\||_{\beta}^{2})(1+\|X^{\epsilon}\|_{\gamma}^{2})\delta^{2\beta}+\mathbb{E}[(1+\||B^{H}\||_{\beta}^{2})(1+\|X^{\epsilon}\|_{\gamma}^{2})\delta^{2\beta}])(t-s)^{-\gamma}\mathrm{d}s\mathbf{1}_{D}\bigg]\qquad(4.15)\\ &\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2\beta}.\qquad(4.16)\end{split}

By an elementary inequality, we have

\begin{split}\mathbf{A}_{2}&\leq C\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\bigg|\int_{0}^{t}(b({s}/{\epsilon},X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\mathbf{1}_{D}\bigg]\\ &\quad+C\mathbb{E}\bigg[\sup_{0\leq s<t\leq T}e^{-\lambda t}\frac{\big|\int_{s}^{t}(b({r}/{\epsilon},X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big|^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{D}\bigg]\\ &=:\mathbf{A}_{21}+\mathbf{A}_{22}.\end{split}

For $\mathbf{A}_{21}$ , by (H1)-(H5), Hölder inequality and Lemma 4.1, we have

\begin{split}\mathbf{A}_{21}&\leq C\mathbb{E}\bigg[\sup_{t\in[0,T]}\bigg|\sum_{k=0}^{\lfloor\frac{t}{\delta}\rfloor-1}\int_{k\delta}^{(k+1)\delta}(b({s}/{\epsilon},X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\ &\quad+C\mathbb{E}\bigg[\sup_{t\in[0,T]}\bigg|\int_{\lfloor\frac{t}{\delta}\rfloor\delta}^{t}(b({s}/{\epsilon},X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\ &\leq C\mathbb{E}\bigg[\sup_{t\in[0,T]}\lfloor\frac{t}{\delta}\rfloor\sum_{k=0}^{\lfloor\frac{t}{\delta}\rfloor-1}\bigg|\int_{k\delta}^{(k+1)\delta}(b({s}/{\epsilon},X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\ &\quad+C\mathbb{E}\bigg[\sup_{t\in[0,T]}\Big(t-\lfloor\frac{t}{\delta}\rfloor\delta\Big)\int_{\lfloor\frac{t}{\delta}\rfloor\delta}^{t}(|b({s}/{\epsilon},X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})|^{2}+|\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})|^{2})\mathrm{d}s\bigg]\\ &\leq C_{T}\delta^{2}+\frac{C_{T}}{\delta^{2}}\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\mathbb{E}\bigg[\bigg|\epsilon\int_{\frac{k\delta}{\epsilon}}^{\frac{(k+1)\delta}{\epsilon}}(b(s,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\ &\leq C_{T}\delta^{2}+\frac{C_{T}}{\delta^{2}}\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\mathbb{E}\bigg[\bigg|\frac{T\epsilon}{\delta(k+1)}\int_{0}^{\frac{(k+1)\delta}{\epsilon}}(b(s,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\ &\quad+\frac{C_{T}}{\delta^{2}}\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\mathbb{E}\bigg[\bigg|\frac{T\epsilon}{\delta k}\int_{0}^{\frac{k\delta}{\epsilon}}(b(s,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg].\end{split}

As $\epsilon\rightarrow 0$, we have $\frac{\delta(k+1)}{\epsilon}\rightarrow\infty$ for every $k$ with $1\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1$. In addition, the maximum is taken over finitely many elements, whose number is determined by $T$ and the fixed $\delta$. By (H5), we therefore have

\displaystyle\begin{split}\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\bigg{|}\frac{\epsilon}{\delta(k+1)}&\int_{0}^{\frac{(k+1)\delta}{\epsilon}}(b(s,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg{|}\\ \leq&\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\varphi\bigg{(}\frac{(k+1)\delta}{\epsilon}\bigg{)}\big{(}1+|X_{k\delta}^{\epsilon}|^{2}+\mathbb{W}_{2}(\mathscr{L}_{X_{k\delta}^{\epsilon}},\delta_{0})\big{)}\leq C_{\epsilon}\end{split}

(4.17)

where $C_{\epsilon}\rightarrow 0$ as $\epsilon\rightarrow 0$. Thus, for $\epsilon$ sufficiently small and the given $\delta$, we have

\mathbf{A}_{21}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2}.

(4.18)
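The mechanism behind this estimate can be illustrated numerically. The following is a minimal sketch, not the paper's setting: it assumes a hypothetical scalar oscillation $g(s)=\cos(s)$ standing in for $b(s,x,\mu)-\bar{b}(x,\mu)$ with the arguments $x,\mu$ frozen, and the helper name `block_integral` is introduced here for illustration only. For this choice the block integral $\int_{k\delta}^{(k+1)\delta}g(s/\epsilon)\mathrm{d}s=\epsilon\int_{k\delta/\epsilon}^{(k+1)\delta/\epsilon}g(s)\mathrm{d}s$ is bounded by $2\epsilon$ in absolute value, so for fixed $\delta$ the maximum over the finitely many blocks vanishes as $\epsilon\rightarrow 0$.

```python
import math

def block_integral(g, k, delta, eps, n=20000):
    """Midpoint-rule approximation of int_{k*delta}^{(k+1)*delta} g(s/eps) ds."""
    a = k * delta
    h = delta / n
    return h * sum(g((a + (i + 0.5) * h) / eps) for i in range(n))

# Hypothetical zero-mean oscillation playing the role of b(s, x, mu) - b_bar(x, mu)
# with x and mu frozen (an assumption made for this illustration only).
g = math.cos

delta = 0.1   # fixed block length
T = 1.0       # time horizon, so there are T / delta = 10 blocks
num_blocks = round(T / delta)

blocks = {}
for eps in (1e-2, 1e-3, 1e-4):
    # Largest block contribution over the finitely many indices k.
    blocks[eps] = max(abs(block_integral(g, k, delta, eps)) for k in range(num_blocks))
    print(f"eps={eps:g}  max_k |block integral| = {blocks[eps]:.3e}")
```

Here $\delta$ is kept fixed while $\epsilon$ decreases, mirroring the order in which the two parameters are treated in the estimate of $\mathbf{A}_{21}$.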

For $\mathbf{A}_{22}$, by (H1)-(H5), Hölder's inequality and Lemma 4.1 again, we have

$\displaystyle\mathbf{A}_{22}$	$\displaystyle\leq$	$\displaystyle C\mathbb{E}\bigg{[}\bigg{(}\sup_{0\leq s<t\leq T}\frac{\big{\|}\int_{s}^{t}(b({r}/{\epsilon},X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big{\|}}{(t-s)^{\gamma}}\bigg{)}^{2}\mathbf{1}_{\ell}\bigg{]}$
		$\displaystyle+C\mathbb{E}\bigg{[}\bigg{(}\sup_{0\leq s<t\leq T}\frac{\big{\|}\int_{s}^{t}(b({r}/{\epsilon},X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big{\|}}{(t-s)^{\gamma}}\bigg{)}^{2}\mathbf{1}_{\ell^{c}}\bigg{]}$
	$\displaystyle=:$	$\displaystyle\mathbf{A}_{221}+\mathbf{A}_{222}$

where $\ell:=\{t<(\lfloor\frac{s}{\delta}\rfloor+2)\delta\}$ and $\ell^{c}:=\{t\geq(\lfloor\frac{s}{\delta}\rfloor+2)\delta\}$. For $\mathbf{A}_{221}$, by (H2), (H3) and the fact that on $\ell$ we have $t-s<\lfloor\frac{s}{\delta}\rfloor\delta-s+2\delta\leq 2\delta$, we obtain

\displaystyle\mathbf{A}_{221}\leq C\mathbb{E}\Big{[}\sup_{0\leq s<t\leq T}(t-s)^{2-2\gamma}\mathbf{1}_{\ell}\Big{]}\leq C\delta^{2-2\gamma}.

By (H1)-(H5) and the fact that $\lfloor\lambda_{1}\rfloor-\lfloor\lambda_{2}\rfloor\leq\lambda_{1}-\lambda_{2}+1,$ for $\lambda_{1}\geq\lambda_{2}\geq 0$ , we have

$\displaystyle\mathbf{A}_{222}$	$\displaystyle\leq$	$\displaystyle C\mathbb{E}\bigg{[}\sup_{0\leq s<t\leq T}\frac{\big{\|}\int_{s}^{(\lfloor\frac{s}{\delta}\rfloor+1)\delta}(b({r}/{\epsilon},X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big{\|}^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg{]}$
		$\displaystyle+C\mathbb{E}\bigg{[}\sup_{0\leq s<t\leq T}\frac{\big{\|}\int_{\lfloor\frac{t}{\delta}\rfloor\delta}^{t}(b({r}/{\epsilon},X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big{\|}^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg{]}$
		$\displaystyle+C\mathbb{E}\bigg{[}\sup_{0\leq s<t\leq T}\frac{\big{\|}\sum_{k=\lfloor\frac{s}{\delta}\rfloor+1}^{\lfloor\frac{t}{\delta}\rfloor-1}\int_{k\delta}^{(k+1)\delta}(b({r}/{\epsilon},X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}r\big{\|}^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg{]}$
	$\displaystyle\leq$	$\displaystyle C\mathbb{E}\bigg{[}\sup_{0\leq s<t\leq T}\frac{\big{\|}\int_{s}^{(\lfloor\frac{s}{\delta}\rfloor+1)\delta}(b({r}/{\epsilon},X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big{\|}^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg{]}$
		$\displaystyle+C\mathbb{E}\bigg{[}\sup_{0\leq s<t\leq T}\frac{\big{\|}\int_{\lfloor\frac{t}{\delta}\rfloor\delta}^{t}(b({r}/{\epsilon},X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big{\|}^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg{]}$
		$\displaystyle+C\mathbb{E}\bigg{[}\sup_{0\leq s<t\leq T}\frac{(\lfloor\frac{t}{\delta}\rfloor-\lfloor\frac{s}{\delta}\rfloor-1)}{(t-s)^{2\gamma}}\sum_{k=\lfloor\frac{s}{\delta}\rfloor+1}^{\lfloor\frac{t}{\delta}\rfloor-1}\bigg{\|}\int_{k\delta}^{(k+1)\delta}(b({r}/{\epsilon},X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}r\bigg{\|}^{2}\mathbf{1}_{\ell^{c}}\bigg{]}$
	$\displaystyle\leq$	$\displaystyle C\delta^{2-2\gamma}+\frac{C_{\gamma,T}}{\delta^{2}}\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\mathbb{E}\bigg{[}\bigg{\|}\int_{k\delta}^{(k+1)\delta}(b({r}/{\epsilon},X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}r\bigg{\|}^{2}\mathbf{1}_{\ell^{c}}\bigg{]}.$

Using (H5) again, the remaining term on the right-hand side can be estimated similarly to $\mathbf{A}_{21}$, see (4.17). We have

\mathbf{A}_{22}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2-2\gamma}.

(4.19)

For $\mathbf{A}_{4}$ , by (4.11), we have

$\displaystyle\mathbf{A}_{4}$	$\displaystyle\leq$	$\displaystyle C_{\gamma,T}\mathbb{E}\bigg{[}\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}\|\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-\bar{b}(\bar{X}_{s},\mathscr{L}_{\bar{X}_{s}})\|^{2}(t-s)^{-\gamma}\mathbf{1}_{D}\mathrm{d}s\bigg{]}$	(4.20)
	$\displaystyle\leq$	$\displaystyle C_{\gamma,T}\mathbb{E}\bigg{[}\sup_{t\in[0,T]}\int_{0}^{t}e^{-\lambda(t-s)}e^{-\lambda s}(t-s)^{-\gamma}\|\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-\bar{b}(\bar{X}_{s},\mathscr{L}_{\bar{X}_{s}})\|^{2}\mathbf{1}_{D}\mathrm{d}s\bigg{]}$	(4.21)
	$\displaystyle\leq$	$\displaystyle C_{\gamma,T}\mathbb{E}\bigg{[}\sup_{t\in[0,T]}\int_{0}^{t}e^{-\lambda(t-s)}(t-s)^{-\gamma}e^{-\lambda s}\big{(}\|X^{\epsilon}_{s}-\bar{X}_{s}\|^{2}+\mathbb{E}[\|X^{\epsilon}_{s}-\bar{X}_{s}\|^{2}]\big{)}\mathbf{1}_{D}\mathrm{d}s\bigg{]}$	(4.22)
	$\displaystyle\leq$	$\displaystyle C_{\gamma,T}\mathbb{E}\big{[}\sup_{t\in[0,T]}e^{-\lambda t}\big{(}\|X^{\epsilon}_{t}-\bar{X}_{t}\|^{2}+\mathbb{E}[\|X^{\epsilon}_{t}-\bar{X}_{t}\|^{2}]\big{)}\mathbf{1}_{D}\big{]}\sup_{t\in[0,T]}\int_{0}^{t}e^{-\lambda(t-s)}(t-s)^{-\gamma}\mathrm{d}s$	(4.23)
	$\displaystyle\leq$	$\displaystyle C_{\gamma,T}\lambda^{\gamma-1}\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big{]}+C_{\gamma,T}\lambda^{\gamma-1}\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D^{c}}\big{]}.$	(4.24)
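The factor $\lambda^{\gamma-1}$ in (4.24) comes from an elementary bound on the kernel integral appearing in (4.23). As a sketch, assuming $0<\gamma<1$ and $\lambda>0$, and with the constant $\Gamma(1-\gamma)$ absorbed into $C_{\gamma,T}$, the substitution $u=\lambda(t-s)$ gives:

```latex
\int_{0}^{t}e^{-\lambda(t-s)}(t-s)^{-\gamma}\,\mathrm{d}s
  =\lambda^{\gamma-1}\int_{0}^{\lambda t}e^{-u}u^{-\gamma}\,\mathrm{d}u
  \leq\lambda^{\gamma-1}\int_{0}^{\infty}e^{-u}u^{-\gamma}\,\mathrm{d}u
  =\Gamma(1-\gamma)\,\lambda^{\gamma-1}.
```

The bound is uniform in $t\in[0,T]$ and tends to zero as $\lambda\rightarrow\infty$ because $\gamma<1$.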

To proceed, we have

	$\displaystyle\mathbf{A}_{5}$	$\displaystyle=$	$\displaystyle\mathbb{E}\bigg{[}\sup_{t\in[0,T]}e^{-\lambda t}\bigg{\|}\int_{0}^{t}(\sigma(X^{\epsilon}_{s})-\sigma(\bar{X}_{s}))\mathrm{d}B^{H}_{s}\bigg{\|}^{2}\bigg{]}+\mathbb{E}\bigg{[}\sup_{0\leq s<t\leq T}\frac{e^{-\lambda t}}{(t-s)^{2\gamma}}\bigg{\|}\int_{s}^{t}(\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r}))\mathrm{d}B^{H}_{r}\bigg{\|}^{2}\bigg{]}$
		$\displaystyle=:$	$\displaystyle\mathbf{A}_{51}+\mathbf{A}_{52}.$

By (2.1) and Lemma 2.5, we have

	$\displaystyle e^{-\lambda t}\bigg{\|}\int_{s}^{t}(\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r}))\mathrm{d}B^{H}_{r}\bigg{\|}^{2}$	(4.25)
$\displaystyle\leq$	$\displaystyle C_{\alpha,\beta}e^{-\lambda t}\bigg{(}\int_{s}^{t}\bigg{(}\frac{\|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})\|}{(r-s)^{\alpha}}+\int_{s}^{r}\frac{\|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})-\sigma(X^{\epsilon}_{u})+\sigma(\bar{X}_{u})\|}{(r-u)^{\alpha+1}}\mathrm{d}u\bigg{)}\frac{\||B^{H}\||_{\beta}}{(t-r)^{-\alpha-\beta+1}}\mathrm{d}r\bigg{)}^{2}$	(4.26)
$\displaystyle\leq$	$\displaystyle C_{\alpha,\beta}\||B^{H}\||_{\beta}^{2}\bigg{(}\int_{s}^{t}e^{-\frac{\lambda(t-r)}{2}}\frac{e^{-\frac{\lambda r}{2}}\|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})\|}{(r-s)^{\alpha}}(t-r)^{\alpha+\beta-1}\mathrm{d}r$	(4.27)
	$\displaystyle+\int_{s}^{t}\int_{s}^{r}e^{-\frac{\lambda(t-r)}{2}}\frac{e^{-\frac{\lambda r}{2}}\|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})-\sigma(X^{\epsilon}_{u})+\sigma(\bar{X}_{u})\|(r-u)^{\gamma}}{(r-u)^{\gamma}(r-u)^{\alpha+1}}\mathrm{d}u(t-r)^{\alpha+\beta-1}\mathrm{d}r\bigg{)}^{2}$	(4.28)
$\displaystyle\leq$	$\displaystyle C_{\alpha,\beta}\||B^{H}\||_{\beta}^{2}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\bigg{(}\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r\bigg{)}\bigg{(}\int_{s}^{t}(r-s)^{-\alpha}(t-r)^{\alpha+2\beta-1}\mathrm{d}r\bigg{)}$	(4.29)
	$\displaystyle+C_{\alpha,\beta,\gamma}\||B^{H}\||_{\beta}^{2}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}(1+\|X^{\epsilon}\|_{\gamma}^{2}+\|\bar{X}\|_{\gamma}^{2})\bigg{(}\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r\bigg{)}\bigg{(}\int_{s}^{t}(r-s)^{2\gamma-\alpha}(t-r)^{\alpha+2\beta-1}\mathrm{d}r\bigg{)}$	(4.30)
$\displaystyle\leq$	$\displaystyle C_{\alpha,\beta}\||B^{H}\||_{\beta}^{2}(t-s)^{2\beta}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r$	(4.31)
	$\displaystyle+C_{\alpha,\beta,\gamma}\||B^{H}\||_{\beta}^{2}(t-s)^{2(\beta+\gamma)}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}(1+\|X^{\epsilon}\|_{\gamma}^{2}+\|\bar{X}\|_{\gamma}^{2})\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r$	(4.32)

where Lemma 7.1 in [25] implies that

\displaystyle|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})-\sigma(X^{\epsilon}_{u})+\sigma(\bar{X}_{u})|\leq|X^{\epsilon}_{r}-\bar{X}_{r}-X^{\epsilon}_{u}+\bar{X}_{u}|+|X^{\epsilon}_{r}-\bar{X}_{r}|(|X^{\epsilon}_{r}-X^{\epsilon}_{u}|+|\bar{X}_{r}-\bar{X}_{u}|)

and by a change of variable $v=\frac{r-s}{t-s}$ , from Lemma 8 in [5] and the fact that $\gamma<\beta$ , it is easy to see that

	$\displaystyle(t-s)^{2\beta}\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r$
	$\displaystyle=(t-s)^{2\gamma}(t-s)^{2(\beta-\gamma)}\int_{0}^{1}e^{-\lambda(t-s)(1-v)}v^{-\alpha}(1-v)^{\alpha-1}\mathrm{d}v\leq(t-s)^{2\gamma}K(\lambda)$

where $K(\lambda)\rightarrow 0$ as $\lambda\rightarrow\infty$ .
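The decay of $K(\lambda)$ can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the parameter values $\alpha=0.35$, $\beta=0.7$, $\gamma=0.5$ and $T=1$ are hypothetical choices, the function names `kernel_integral` and `K` are introduced here, and $K(\lambda)$ is taken to be the supremum over $h=t-s\in(0,T]$ of $h^{2(\beta-\gamma)}\int_{0}^{1}e^{-\lambda h(1-v)}v^{-\alpha}(1-v)^{\alpha-1}\mathrm{d}v$, approximated on a grid. The endpoint singularities are removed by power substitutions before a midpoint rule is applied.

```python
import math

def kernel_integral(x, alpha, n=4000):
    """Approximate I(x) = int_0^1 exp(-x(1-v)) v^(-alpha) (1-v)^(alpha-1) dv."""
    # Left half [0, 1/2]: v = w**(1/(1-alpha)), so v^(-alpha) dv = dw / (1-alpha).
    p = 1.0 / (1.0 - alpha)
    h = 0.5 ** (1.0 - alpha) / n
    left = 0.0
    for i in range(n):
        v = ((i + 0.5) * h) ** p
        left += math.exp(-x * (1.0 - v)) * (1.0 - v) ** (alpha - 1.0)
    left *= h / (1.0 - alpha)
    # Right half [1/2, 1]: u = 1 - v = w**(1/alpha), so u^(alpha-1) du = dw / alpha.
    q = 1.0 / alpha
    h = 0.5 ** alpha / n
    right = 0.0
    for i in range(n):
        u = ((i + 0.5) * h) ** q
        right += math.exp(-x * u) * (1.0 - u) ** (-alpha)
    right *= h / alpha
    return left + right

def K(lam, alpha, beta, gamma, T=1.0, grid=200):
    """Grid approximation of sup over 0 < h <= T of h^(2(beta-gamma)) * I(lam * h)."""
    return max(
        (T * j / grid) ** (2.0 * (beta - gamma))
        * kernel_integral(lam * T * j / grid, alpha, n=500)
        for j in range(1, grid + 1)
    )

# Hypothetical exponents, chosen only so that gamma < beta and 0 < alpha < 1.
alpha, beta, gamma = 0.35, 0.7, 0.5
lams = [1.0, 10.0, 100.0, 1000.0]
K_values = [K(lam, alpha, beta, gamma) for lam in lams]
for lam, val in zip(lams, K_values):
    print(f"lambda={lam:g}  K(lambda) ~ {val:.4f}")
```

For $\lambda=0$ the integral reduces to the Beta function $B(1-\alpha,\alpha)=\pi/\sin(\pi\alpha)$, which gives a convenient sanity check on the quadrature.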

Then, by Lemma 4.1, we have

\displaystyle\mathbf{A}_{52}\leq C_{\alpha,\beta,\gamma,T}K(\lambda)\mathbb{E}[\||B^{H}\||_{\beta}^{2}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}(1+\|X^{\epsilon}\|_{\gamma}^{2}+\|\bar{X}\|_{\gamma}^{2})\mathbf{1}_{D}]\leq C_{\alpha,\beta,\gamma,T,R}K(\lambda)\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big{]}.

(4.33)

Similarly, for the first expression $\mathbf{A}_{51}$, we obtain

\displaystyle\mathbf{A}_{51}\leq C_{\alpha,\beta,\gamma,T,R}K(\lambda)\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big{]}.

Thus, we have

\displaystyle\mathbf{A}_{5}\leq C_{\alpha,\beta,\gamma,T,R}K(\lambda)\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big{]}.

(4.34)

Combining (4.13), (4.18), (4.19), (4.20) and (4.34) with the fact that $\mathbb{P}\big{\{}\tau_{R}<T\big{\}}\leq R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]$ (see Lemma 4.7 in [27]), we obtain

$\displaystyle\mathbf{A}$	$\displaystyle\leq$	$\displaystyle C_{\alpha,\beta,\gamma,T,\|x_{0}\|}\delta^{2\beta}+C_{\alpha,\beta,\gamma,T,\|x_{0}\|}\delta^{2}+C_{\alpha,\beta,\gamma,T,\|x_{0}\|}\delta^{2-2\gamma}+C_{\alpha,\beta,\gamma,T,R}K(\lambda)\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big{]}$
		$\displaystyle+C_{\gamma,T}\lambda^{\gamma-1}\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big{]}+C_{\gamma,T}\lambda^{\gamma-1}\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D^{c}}\big{]}$
	$\displaystyle\leq$	$\displaystyle C_{\alpha,\beta,\gamma,T,\|x_{0}\|}\delta^{2-2\gamma}+C_{\alpha,\beta,\gamma,T,R,\|x_{0}\|}\big{(}\lambda^{\gamma-1}+K(\lambda)\big{)}\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big{]}+C_{\gamma,T}\lambda^{\gamma-1}\sqrt{R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]}.$

Taking $\lambda$ large enough, such that $C_{\alpha,\beta,\gamma,T,R,|x_{0}|}\big{(}\lambda^{\gamma-1}+K(\lambda)\big{)}\vee C_{\gamma,T}\lambda^{\gamma-1}<1$ , we have

\displaystyle\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\mathbf{1}_{D}\big{]}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}e^{\lambda T}\delta^{2-2\gamma}+e^{\lambda T}\sqrt{R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]}.

Next, we return to the second supremum on the right-hand side of inequality (4.9). By the Cauchy-Schwarz inequality, Lemma 4.1 and Lemma 4.7 in [27] again, we have

\displaystyle\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\mathbf{1}_{\{\tau_{R}<T\}}\big{]}\leq\Big{(}\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma}^{4}\big{]}\Big{)}^{1/2}\mathbb{P}\big{\{}\tau_{R}<T\big{\}}^{1/2}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\sqrt{R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]}.

Combining the above estimates and letting $R\rightarrow\infty$, we have

\displaystyle\lim_{\epsilon\rightarrow 0}\mathbb{E}\big{[}\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\big{]}=0.

This completes the proof.∎

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (NSF) under Grant No. 12172285, the NSF of Chongqing under Grant No. cstc2021jcyj-msxmX0296, the Shaanxi Fundamental Science Research Project for Mathematics and Physics under Grant No. 22JSQ027, and the Fundamental Research Funds for the Central Universities.

References

Bogoliubov and Mitropolski [1963] Bogoliubov, N.N., Mitropolski, Y.A.. Asymptotic methods in the theory of non-linear oscillations. Phys Today 1963;16(2):61–61.
Bolley et al. [2013] Bolley, F., Gentil, I., Guillin, A.. Uniform convergence to equilibrium for granular media. Arch Ration Mech Anal 2013;208:429–445.
Braun and Hepp [1977] Braun, W., Hepp, K.. The Vlasov dynamics and its fluctuations in the 1/N limit of interacting classical particles. Commun Math Phys 1977;56(2):101–113.
Cerrai and Freidlin [2009] Cerrai, S., Freidlin, M.. Averaging principle for a class of stochastic reaction–diffusion equations. Probab Theory Relat Fields 2009;144:137–177.
Chen et al. [2013] Chen, Y., Gao, H., Garrido-Atienza, M.J., Schmalfuss, B.. Pathwise solutions of SPDEs driven by Hölder-continuous integrators with exponent larger than $1/2$ and random dynamical systems. Discrete Contin Dyn Syst Ser A 2013;34(1):79–98.
Cheng et al. [2022] Cheng, M., Hao, Z., Röckner, M.. Strong and weak convergence for averaging principle of DDSDE with singular drift. arXiv preprint arXiv:2207.12108 2022.
Dong et al. [2018] Dong, Z., Sun, X., Xiao, H., Zhai, J.. Averaging principle for one dimensional stochastic Burgers equation. J Differ Equ 2018;265(10):4749–4797.
Fan et al. [2022] Fan, X., Huang, X., Suo, Y., Yuan, C.. Distribution dependent SDEs driven by fractional Brownian motions. Stoch Process Their Appl 2022;151:23–67.
Friz and Hairer [2020] Friz, P.K., Hairer, M.. A course on rough paths. Springer, 2020.
Fu and Liu [2011] Fu, H., Liu, J.. Strong convergence in stochastic averaging principle for two time-scales stochastic partial differential equations. J Math Anal Appl 2011;384(1):70–86.
Gao [2018] Gao, P.. Averaging principle for the higher order nonlinear Schrödinger equation with a random fast oscillation. J Stat Phys 2018;171(5):897–926.
Guerra and Nualart [2008] Guerra, J., Nualart, D.. Stochastic differential equations driven by fractional Brownian motion and standard Brownian motion. Stoch Anal Appl 2008;26(5):1053–1075.
Hong et al. [2022] Hong, W., Li, S., Liu, W.. Strong convergence rates in averaging principle for slow-fast McKean-Vlasov SPDEs. J Differ Equ 2022;316:94–135.
Hu et al. [2021] Hu, K., Ren, Z., Šiška, D., Szpruch, Ł.. Mean-field Langevin dynamics and energy landscape of neural networks. In: Annales de l’Institut Henri Poincare (B) Probabilites et statistiques. Institut Henri Poincaré; volume 57; 2021. p. 2043–2065.
Hu and Nualart [2007] Hu, Y., Nualart, D.. Differential equations driven by Hölder continuous functions of order greater than 1/2. Stoch Anal Appl 2007;2:399–413.
Kac [1956] Kac, M.. Foundations of kinetic theory. In: Proceedings of The third Berkeley symposium on mathematical statistics and probability. volume 3; 1956. p. 171–197.
Khasminskii [1968] Khasminskii, R.. On an averaging principle for Itô stochastic differential equations. Kybernetika 1968;4(3):260–279.
Krylov and Bogoliubov [1950] Krylov, N.M., Bogoliubov, N.N.. Introduction to non-linear mechanics. Number 11. Princeton university press, 1950.
Lasry and Lions [2007] Lasry, J.M., Lions, P.L.. Mean field games. Japanese J Math 2007;2(1):229–260.
Liu [2010] Liu, D.. Strong convergence of principle of averaging for multiscale stochastic dynamical systems. Commun Math Sci 2010;8(4):999–1020.
Liu et al. [2020] Liu, W., Röckner, M., Sun, X., Xie, Y.. Averaging principle for slow-fast stochastic differential equations with time dependent locally Lipschitz coefficients. J Differ Equ 2020;268(6):2910–2948.
McKean [1966] McKean, P.. A class of Markov processes associated with nonlinear parabolic equations. Proceedings of the National Academy of Sciences of the United States of America 1966;56(6):1907–1907.
Mishura and Shevchenko [2012] Mishura, Y., Shevchenko, G.. Mixed stochastic differential equations with long-range dependence: Existence, uniqueness and convergence of solutions. Comput Math with Appl 2012;64(10):3217–3227.
Mishura and Posashkova [2011] Mishura, Y.S., Posashkova, S.. Stochastic differential equations driven by a Wiener process and fractional Brownian motion: Convergence in besov space with respect to a parameter. Comput Math with Appl 2011;62(3):1166–1180.
Nualart and Răşcanu [2002] Nualart, D., Răşcanu, A.. Differential equations driven by fractional Brownian motion. Collect Math 2002;53(1):55–81.
Pardoux and Veretennikov [2001] Pardoux, É., Veretennikov, Y.. On the Poisson equation and diffusion approximation. I. Ann Probab 2001;29(3):1061–1085.
Pei et al. [2020] Pei, B., Inahama, Y., Xu, Y.. Averaging principles for mixed fast-slow systems driven by fractional Brownian motion. arXiv preprint arXiv:2001.06945 2020.
Röckner and Zhang [2021] Röckner, M., Zhang, X.. Well-posedness of distribution dependent SDEs with singular drifts. Bernoulli 2021;27(2):1131–1158.
Shen et al. [2022] Shen, G., Xiang, J., Wu, J.L.. Averaging principle for distribution dependent stochastic differential equations driven by fractional Brownian motion and standard Brownian motion. J Differ Equ 2022;321:381–414.
Sun et al. [2022] Sun, X., Xie, L., Xie, Y.. Strong and weak convergence rates for slow–fast stochastic differential equations driven by $\alpha$ -stable process. Bernoulli 2022;28(1):343–369.
Zähle [1998] Zähle, M.. Integration with respect to fractal functions and stochastic calculus. I. Probab Theory Relat Fields 1998;111:333–374.

$\displaystyle\bigg{\|}\int_{s}^{t}f(r)\mathrm{d}g(r)\bigg{\|}$	$\displaystyle\leq$	$\displaystyle\int_{s}^{t}\big{\|}D_{s+}^{\alpha}f(r)\big{\|}\cdot\big{\|}D_{t-}^{1-\alpha}g_{t-}(r)\big{\|}\mathrm{d}r$
	$\displaystyle\leq$	$\displaystyle\int_{s}^{t}C_{\alpha,\gamma,T}\|f\|_{\gamma,s,t}(r-s)^{-\alpha}C_{\alpha,\beta}\||g\||_{\beta}(t-r)^{\alpha+\beta-1}\mathrm{d}r$
	$\displaystyle\leq$	$\displaystyle C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||g\||_{\beta}\int_{s}^{t}(r-s)^{-\alpha}(t-r)^{\alpha+\beta-1}\mathrm{d}r$
	$\displaystyle\leq$	$\displaystyle C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||g\||_{\beta}(t-s)^{\beta}.$

	$\displaystyle\|\bar{b}(x,\mu)\|$	$\displaystyle\leq L_{\bar{b}},$
	$\displaystyle\|\bar{b}(x_{1},\mu_{1})-\bar{b}(x_{2},\mu_{2})\|$	$\displaystyle\leq L_{\bar{b}}\big{(}\|x_{1}-x_{2}\|+\mathbb{W}_{2}(\mu_{1},\mu_{2})\big{)}$

	$\displaystyle\|\bar{b}(x,\mu)\|$	$\displaystyle\leq$	$\displaystyle\Bigg{\|}\frac{1}{T}\int_{0}^{T}(b(s,x,\mu)-\bar{b}(x,\mu))\mathrm{d}s\Bigg{\|}+\Bigg{\|}\frac{1}{T}\int_{0}^{T}b(s,x,\mu)\mathrm{d}s\Bigg{\|}$
		$\displaystyle\leq$	$\displaystyle\varphi(T)\big{(}1+\|x\|+\mu(\|\cdot\|^{2})\big{)}+M_{b}$

$\displaystyle\|\bar{b}(x_{1},\mu_{1})-\bar{b}(x_{2},\mu_{2})\|$	$\displaystyle\leq$	$\displaystyle\Bigg{\|}\frac{1}{T}\int_{0}^{T}(b(s,x_{1},\mu_{1})-\bar{b}(x_{1},\mu_{1}))\mathrm{d}s\Bigg{\|}+\Bigg{\|}\frac{1}{T}\int_{0}^{T}(b(s,x_{2},\mu_{2})-\bar{b}(x_{2},\mu_{2}))\mathrm{d}s\Bigg{\|}$
		$\displaystyle+\Bigg{\|}\frac{1}{T}\int_{0}^{T}(b(s,x_{1},\mu_{1})-b(s,x_{2},\mu_{2}))\mathrm{d}s\Bigg{\|}$
	$\displaystyle\leq$	$\displaystyle\varphi(T)\big{(}1+\|x_{1}\|+\|x_{2}\|+\mu_{1}(\|\cdot\|^{2})+\mu_{2}(\|\cdot\|^{2})\big{)}+L_{b}\big{(}\|x_{1}-x_{2}\|+\mathbb{W}_{2}(\mu_{1},\mu_{2})\big{)}.$

$\displaystyle\|X^{\epsilon}_{t+h}-X^{\epsilon}_{t}\|$	$\displaystyle\leq$	$\displaystyle\bigg{\|}\int_{t}^{t+h}b({s}/{\epsilon},X^{\epsilon}_{s},\mathscr{L}_{X^{\epsilon}_{s}})\mathrm{d}s\bigg{\|}+\bigg{\|}\int_{t}^{t+h}\sigma(X^{\epsilon}_{s})\mathrm{d}B^{H}_{s}\bigg{\|}$
	$\displaystyle\leq$	$\displaystyle\int_{t}^{t+h}\|b({s}/{\epsilon},X^{\epsilon}_{s},\mathscr{L}_{X^{\epsilon}_{s}})\|\mathrm{d}s+C_{\alpha,\beta,\gamma,T}\||B^{H}\||_{\beta}\Big{(}\sup_{t\leq r\leq t+h}\|\sigma(X^{\epsilon}_{r})\|+\sup_{t\leq u<r\leq t+h}\frac{\|\sigma(X^{\epsilon}_{r})-\sigma(X^{\epsilon}_{u})\|}{(r-u)^{\gamma}}\Big{)}h^{\beta}$
	$\displaystyle\leq$	$\displaystyle C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big{(}1+\|X^{\epsilon}\|_{\gamma}\big{)}h^{\beta}.$