
Averaging principle for McKean-Vlasov SDEs driven by multiplicative fractional noise with highly oscillatory drift coefficient

Bin Pei, Lifang Feng, Min Han
School of Mathematics and Statistics, Northwestern Polytechnical University, Xi’an, 710072, China; Chongqing Technology Innovation Center, Northwestern Polytechnical University, Chongqing, 401120, China
Abstract

In this paper, we study the averaging principle for a class of McKean-Vlasov stochastic differential equations (SDEs) driven by multiplicative fractional noise with Hurst parameter $H>1/2$ and with a highly oscillatory drift coefficient. Here the integral with respect to fractional Brownian motion is understood as a generalized Riemann-Stieltjes integral. Using Khasminskii’s time discretization technique, we prove that the solution of the original system converges strongly to the solution of the averaged system as the time scale $\epsilon$ goes to zero, in the supremum and Hölder topologies, which sharpens existing results in the classical McKean-Vlasov SDE framework.

Keywords. Multiplicative fractional noise, highly oscillatory drift, stochastic averaging, McKean-Vlasov SDEs Mathematics subject classification. 60G22, 60H10, 60H05, 34C29

1 Introduction

The present paper focuses on the following McKean-Vlasov stochastic differential equations (SDEs) with highly oscillatory drift coefficient driven by multiplicative fractional noise in related path spaces, namely with supremum- and Hölder-topologies

$$\mathrm{d}X_{t}^{\epsilon}=b({t}/{\epsilon},X_{t}^{\epsilon},\mathscr{L}_{X_{t}^{\epsilon}})\mathrm{d}t+\sigma(X_{t}^{\epsilon})\mathrm{d}B_{t}^{H},\quad X_{0}^{\epsilon}=x_{0},\quad t\in[0,T] \tag{1.1}$$

where the parameter $0<\epsilon\ll 1$, $x_{0}\in\mathbb{R}$ is arbitrary and non-random but fixed, the coefficients $b:[0,T]\times\mathbb{R}\times\mathcal{P}_{2}(\mathbb{R})\rightarrow\mathbb{R}$ and $\sigma:\mathbb{R}\rightarrow\mathbb{R}$ are measurable functions, and $\mathscr{L}_{X_{t}^{\epsilon}}$ is the law of $X_{t}^{\epsilon}$. Here $\mathcal{P}_{2}(\mathbb{R})$ is the space of probability measures on $\mathbb{R}$ with finite second moment, which will be introduced in Section 2, and $\{B_{t}^{H},t\geq 0\}$ is a one-dimensional fractional Brownian motion (FBM) with Hurst parameter $H>1/2$, that is, a centered Gaussian process with covariance function

$$R_{H}(t,s)=\frac{1}{2}(t^{2H}+s^{2H}-|t-s|^{2H}).$$
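The covariance function $R_{H}$ determines the law of the FBM on any finite time grid. The following is a minimal sketch, not part of the paper's argument, that evaluates $R_{H}$ and draws a sample on a grid by a Cholesky factorization of the covariance matrix. The function names `fbm_cov`, `cholesky` and `sample_fbm` are hypothetical, and the grid is assumed to consist of distinct, strictly positive times so that the covariance matrix is positive definite.

```python
import math
import random


def fbm_cov(t, s, hurst):
    """Covariance R_H(t, s) of fractional Brownian motion."""
    return 0.5 * (t ** (2 * hurst) + s ** (2 * hurst) - abs(t - s) ** (2 * hurst))


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - acc)
            else:
                lower[i][j] = (matrix[i][j] - acc) / lower[j][j]
    return lower


def sample_fbm(times, hurst, rng):
    """Draw (B_t) for t in `times` (assumed distinct and strictly positive)."""
    cov = [[fbm_cov(t, s, hurst) for s in times] for t in times]
    lower = cholesky(cov)
    # Independent standard normals, mixed by the Cholesky factor.
    z = [rng.gauss(0.0, 1.0) for _ in times]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(len(times))]
```

For instance, `sample_fbm([0.1 * k for k in range(1, 11)], 0.7, random.Random(0))` returns ten values of one sample path. The exact Cholesky approach costs cubic time in the grid size, so it is only meant for small illustrative grids.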

McKean-Vlasov SDEs, also known as distribution dependent SDEs or mean-field SDEs, describe the evolution of a particle that is determined by both its microscopic location and its macroscopic distribution, see e.g. [16], [22], [3]. Because their coefficients depend on the law of the solution, they can describe many models better than classical SDEs. Stochastic systems of the form (1.1) are of independent interest and appear widely in applications including granular materials dynamics, mean-field games, and complex networked systems, see e.g. [2], [19], [14].

We now recall what an averaging principle is. Because of the highly oscillating component, it is relatively difficult to solve (1.1). The main goal of the averaging principle is to find a simplified system which simulates and predicts the evolution of the original system (1.1) over a long time scale by averaging the highly oscillating drift coefficient under suitable conditions. The averaging principle for deterministic systems has a long history, which can be traced back to the results of Krylov, Bogolyubov and Mitropolsky, see e.g. [18], [1]. After that, [17] established an averaging principle for SDEs driven by Brownian motion (BM). Several methods for studying the averaging principle have since been developed, such as time discretization, the Poisson equation technique and the weak convergence method, see e.g. [20], [21], [26], [27], [30] for SDEs, and [4], [10], [7], [11] for stochastic partial differential equations (SPDEs).

In recent years there has been considerable research interest in averaging for McKean-Vlasov stochastic (partial) differential equations (S(P)DEs). [28] established the averaging principle for slow-fast McKean-Vlasov SDEs by the techniques of time discretization and the Poisson equation. [13] investigated the strong convergence rate of the averaging principle for slow-fast McKean-Vlasov SPDEs based on the variational approach and the technique of time discretization. [6] studied the averaging principle for distribution dependent SDEs with localized $L^{p}$ drift using Zvonkin’s transformation and estimates for Kolmogorov equations. [29] obtained strong convergence without a rate for distribution dependent SDEs with a highly oscillating component driven by FBM and standard BM, which requires the FBM term to be additive.

However, the aforementioned references all focus on McKean-Vlasov S(P)DEs with additive noise or multiplicative white noise. Up to now, there is no work concentrating on averaging for McKean-Vlasov SDEs driven by multiplicative fractional noise. In this work, we aim to close this gap. It is known that FBMs are not semimartingales for $H\neq 1/2$. Therefore, classical stochastic analysis is not applicable to fractional noise. It is a non-trivial task to extend the results of classical stochastic analysis to multiplicative fractional noise, whereas for additive fractional noise one can use the Wiener integral because the diffusion term is a deterministic function. Since the diffusion term of the McKean-Vlasov SDEs in this paper depends on the state variable, within the Riemann-Stieltjes integral framework we cannot use Gronwall’s lemma or the generalized Gronwall lemma directly to prove the convergence of $X^{\epsilon}$ to $X$ as in [24], [23]. Instead, we will use the $\lambda$-equivalent Hölder norm (see Section 4.3) to overcome this problem.

Motivated by the above, we consider strong convergence in the averaging principle for McKean-Vlasov SDEs driven by multiplicative fractional noise. The problem is solved by combining the fractional calculus approach with a Khasminskii type averaging principle. Moreover, our averaging result in the supremum and Hölder topologies sharpens existing ones in the classical McKean-Vlasov S(P)DEs framework.

The paper is organized as follows. Section 2 presents some necessary notations and assumptions. Stochastic averaging principles for such McKean-Vlasov SDEs are then established in Section 3. Throughout this paper, $C$ and $C_{\mathrm{x}}$ denote positive constants which may change from line to line, where $\mathrm{x}$ stands for one or more parameters and $C_{\mathrm{x}}$ emphasizes that the constant depends on the corresponding parameters; for example, $C_{\alpha,\beta,\gamma,T,|x_{0}|}$ depends on $\alpha,\beta,\gamma,T$ and $|x_{0}|$.

2 Preliminaries

In this section, we recall some basic definitions and properties of fractional calculus. For more details, we refer to [12] and [25]. We first introduce some necessary spaces and norms. Throughout the rest of this section, let $a,b\in\mathbb{R}$ with $a<b$. For $\gamma\in(0,1)$, let $C^{\gamma}((a,b),\mathbb{R})$ be the space of $\gamma$-Hölder continuous functions $f:[a,b]\rightarrow\mathbb{R}$, equipped with the norm

$$\|f\|_{\gamma,a,b}=\|f\|_{\infty,a,b}+\||f\||_{\gamma,a,b}$$

with

$$\|f\|_{\infty,a,b}:=\sup_{t\in[a,b]}|f(t)|,\quad\||f\||_{\gamma,a,b}=\sup_{a\leq s<t\leq b}\frac{|f(t)-f(s)|}{(t-s)^{\gamma}}.$$

For simplicity, let $\|f\|_{\beta}:=\|f\|_{\beta,0,T}$, $\|f\|_{\infty}:=\|f\|_{\infty,0,T}$ and $\||f\||_{\beta}:=\||f\||_{\beta,0,T}$.

The following proposition provides an explicit expression for the integral $\int_{a}^{b}f\mathrm{d}g$ in terms of fractional derivatives when $f\in C^{\gamma}((a,b),\mathbb{R})$ and $g\in C^{\beta}((a,b),\mathbb{R})$ with $\beta+\gamma>1$, $\beta,\gamma\in(0,1)$, see [31].

Proposition 2.1.

(Remark 4.1 in Nualart and Răşcanu, 2002). Suppose that $f\in C^{\gamma}((a,b),\mathbb{R})$ and $g\in C^{\beta}((a,b),\mathbb{R})$ with $\beta+\gamma>1$, $\beta,\gamma\in(0,1)$. Let $\alpha\in(0,1)$, $\gamma>\alpha$ and $\beta>1-\alpha$. Then the Riemann-Stieltjes integral $\int_{a}^{b}f\mathrm{d}g$ exists and it can be expressed as

$$\int_{a}^{b}f\mathrm{d}g=(-1)^{\alpha}\int_{a}^{b}D_{a+}^{\alpha}f(t)D_{b-}^{1-\alpha}g_{b-}(t)\mathrm{d}t \tag{2.1}$$

where $g_{b-}(t)=g(t)-g(b)$ and, for $a\leq t\leq b$, the Weyl derivatives of $f$ are defined by the formulas

$$\begin{aligned}
D_{a+}^{\alpha}f(t)&=\frac{1}{\Gamma(1-\alpha)}\bigg(\frac{f(t)}{(t-a)^{\alpha}}+\alpha\int_{a}^{t}\frac{f(t)-f(s)}{(t-s)^{\alpha+1}}\mathrm{d}s\bigg),\\
D_{b-}^{\alpha}f(t)&=\frac{(-1)^{\alpha}}{\Gamma(1-\alpha)}\bigg(\frac{f(t)}{(b-t)^{\alpha}}+\alpha\int_{t}^{b}\frac{f(t)-f(s)}{(s-t)^{\alpha+1}}\mathrm{d}s\bigg)
\end{aligned}$$

and $\Gamma$ denotes the Gamma function.

Lemma 2.2.

(Theorem 2 in Hu and Nualart, 2007) Suppose that $f\in C^{\gamma}([0,T],\mathbb{R})$ and $g\in C^{\beta}([0,T],\mathbb{R})$ with $\beta+\gamma>1$ and $1>\gamma>\alpha$, $1>\beta>1-\alpha$. Then, for all $s,t\in[0,T]$, one has

$$\bigg|\int_{s}^{t}f(r)\mathrm{d}g(r)\bigg|\leq C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||g\||_{\beta}(t-s)^{\beta}.$$
Proof.

By using the fractional integration formula (2.1), for all $s,t\in[0,T]$, we have

$$\begin{aligned}
\bigg|\int_{s}^{t}f(r)\mathrm{d}g(r)\bigg| &\leq \int_{s}^{t}\big|D_{s+}^{\alpha}f(r)\big|\cdot\big|D_{t-}^{1-\alpha}g_{t-}(r)\big|\mathrm{d}r\\
&\leq \int_{s}^{t}C_{\alpha,\gamma,T}\|f\|_{\gamma,s,t}(r-s)^{-\alpha}C_{\alpha,\beta}\||g\||_{\beta}(t-r)^{\alpha+\beta-1}\mathrm{d}r\\
&\leq C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||g\||_{\beta}\int_{s}^{t}(r-s)^{-\alpha}(t-r)^{\alpha+\beta-1}\mathrm{d}r\\
&\leq C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||g\||_{\beta}(t-s)^{\beta}.
\end{aligned}$$

This completes the proof. ∎

Remark 2.3.

(Lemma 7.5 in Nualart and Răşcanu, 2002) The trajectories of $B^{H}$ are locally $\beta$-Hölder continuous a.s. for all $\beta\in(0,H)$ and $\||B^{H}\||_{\beta}$ has moments of all orders.

Remark 2.4.

Suppose that $f\in C^{\gamma}([0,T],\mathbb{R})$ and $B^{H}\in C^{\beta}([0,T],\mathbb{R})$ with $\beta+\gamma>1$ and $1>\gamma>\alpha$, $1>\beta>1-\alpha$. Then, for all $s,t\in[0,T]$, one has

$$\bigg|\int_{s}^{t}f(r)\mathrm{d}B^{H}_{r}\bigg|\leq C_{\alpha,\beta,\gamma,T}\|f\|_{\gamma,s,t}\||B^{H}\||_{\beta}(t-s)^{\beta},\quad{\rm a.s.}$$
Lemma 2.5.

For any positive constants $a$, $d$ with $a,d<1$, one has

$$\int_{s}^{t}(r-s)^{-a}(t-r)^{-d}\mathrm{d}r\leq(t-s)^{1-a-d}B(1-a,1-d)$$

where $s\in(0,t)$ and $B$ is the Beta function.

Proof.

By the change of variable $y=(r-s)/(t-s)$, we have

$$\begin{aligned}
\int_{s}^{t}(r-s)^{-a}(t-r)^{-d}\mathrm{d}r &= \int_{0}^{1}(y(t-s))^{-a}(t-s-y(t-s))^{-d}(t-s)\mathrm{d}y\\
&= (t-s)^{1-a-d}\int_{0}^{1}y^{-a}(1-y)^{-d}\mathrm{d}y\\
&= (t-s)^{1-a-d}B(1-a,1-d).
\end{aligned}$$

This completes the proof. ∎
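The identity in Lemma 2.5 can be checked numerically. The sketch below is only an illustration under assumed parameter values: it compares a plain midpoint-rule approximation of the singular integral with the closed form $(t-s)^{1-a-d}B(1-a,1-d)$, where the Beta function is computed from `math.gamma`. The function names are hypothetical. Because of the endpoint singularities the midpoint rule converges slowly, so only agreement up to a modest tolerance should be expected.

```python
import math


def beta_fn(x, y):
    """Beta function B(x, y) = Gamma(x) Gamma(y) / Gamma(x + y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


def singular_integral(s, t, a, d, n=100_000):
    """Midpoint-rule approximation of int_s^t (r - s)^(-a) (t - r)^(-d) dr."""
    h = (t - s) / n
    total = 0.0
    for i in range(n):
        r = s + (i + 0.5) * h  # midpoints avoid the singular endpoints
        total += (r - s) ** (-a) * (t - r) ** (-d)
    return total * h


def closed_form(s, t, a, d):
    """Right-hand side of Lemma 2.5."""
    return (t - s) ** (1 - a - d) * beta_fn(1 - a, 1 - d)
```

For example, with the assumed values `s, t, a, d = 0.2, 1.5, 0.3, 0.4`, the two functions `singular_integral` and `closed_form` should agree to within about one percent.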

Let $\mathcal{P}(\mathbb{R})$ be the collection of all probability measures on $\mathbb{R}$, and $\mathcal{P}_{2}(\mathbb{R})$ be the space of probability measures on $\mathbb{R}$ with finite second moment, i.e.,

$$\mathcal{P}_{2}(\mathbb{R})=\bigg\{\mu\in\mathcal{P}(\mathbb{R}):\;\mu(|\cdot|^{2}):=\int_{\mathbb{R}}|x|^{2}\mu(\mathrm{d}x)<\infty\bigg\}.$$

We define the $L^{2}$-Wasserstein distance on $\mathcal{P}_{2}(\mathbb{R})$ by

$$\mathbb{W}_{2}\left(\mu_{1},\mu_{2}\right):=\inf_{\pi\in\mathcal{C}_{\mu_{1},\mu_{2}}}\left(\int_{\mathbb{R}\times\mathbb{R}}|x-y|^{2}\pi(\mathrm{d}x,\mathrm{d}y)\right)^{1/2},\quad\mu_{1},\mu_{2}\in\mathcal{P}_{2}(\mathbb{R})$$

where $\mathcal{C}_{\mu_{1},\mu_{2}}$ is the set of probability measures on $\mathbb{R}\times\mathbb{R}$ with marginals $\mu_{1}$ and $\mu_{2}$. It is well-known that $(\mathcal{P}_{2}(\mathbb{R}),\mathbb{W}_{2})$ is a Polish space.

Note that for any $x\in\mathbb{R}$, the Dirac measure $\delta_{x}$ belongs to $\mathcal{P}_{2}(\mathbb{R})$; in particular, $\delta_{0}$ is the Dirac measure at the point $0$. If $\mu_{1}=\mathscr{L}_{X}$ and $\mu_{2}=\mathscr{L}_{Y}$ are the distributions of random variables $X$ and $Y$ respectively, then

$$\mathbb{W}_{2}^{2}(\mu_{1},\mu_{2})\leq\int_{\mathbb{R}\times\mathbb{R}}|x-y|^{2}\mathscr{L}_{(X,Y)}(\mathrm{d}x,\mathrm{d}y)=\mathbb{E}[|X-Y|^{2}]$$

in which $\mathscr{L}_{(X,Y)}$ represents the joint distribution of the random pair $(X,Y)$. Next, for arbitrary fixed $T>0$, let $C([0,T];\mathbb{R})$ be the Banach space of all $\mathbb{R}$-valued continuous functions on $[0,T]$, endowed with the supremum norm. Furthermore, let $L^{2}(\Omega;C([0,T];\mathbb{R}))$ be the space of all $C([0,T];\mathbb{R})$-valued random variables $X$ satisfying $\mathbb{E}[\sup_{0\leq t\leq T}|X(t)|^{2}]<\infty$. Then $L^{2}(\Omega;C([0,T];\mathbb{R}))$ is a Banach space under the norm

$$\|X\|_{L^{2}}:=\Big(\mathbb{E}\Big[\sup_{0\leq t\leq T}|X(t)|^{2}\Big]\Big)^{1/2}.$$
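The bound $\mathbb{W}_{2}^{2}(\mathscr{L}_{X},\mathscr{L}_{Y})\leq\mathbb{E}[|X-Y|^{2}]$ stated earlier in this section can be illustrated on empirical measures. The following is a minimal sketch under the assumption that both measures are uniform over samples of equal size; in that one-dimensional case the optimal coupling pairs the sorted samples, while any other pairing of the samples is just one admissible coupling. The function names are hypothetical.

```python
def w2_squared_empirical(xs, ys):
    """Squared W_2 distance between two equal-size empirical measures on R.

    In one dimension the optimal coupling pairs the sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    return sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


def mean_squared_gap(xs, ys):
    """E|X - Y|^2 under the coupling that pairs xs[i] with ys[i]."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

For any two lists `xs`, `ys` of the same length, `w2_squared_empirical(xs, ys)` should not exceed `mean_squared_gap(xs, ys)`, which mirrors the inequality used throughout the paper to control the law-dependence of the drift.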

3 Assumptions and main result

3.1 Assumptions

To derive a unique solution to (1.1), we first introduce the following assumptions on the coefficients $b$ and $\sigma$:

  1. (H1)

    There exists a constant $L_{b}>0$ such that for any $t\in[0,T]$, $x_{1},x_{2}\in\mathbb{R}$ and $\mu_{1},\mu_{2}\in\mathcal{P}_{2}(\mathbb{R})$,

    $$|b(t,x_{1},\mu_{1})-b(t,x_{2},\mu_{2})|\leq L_{b}\big(|x_{1}-x_{2}|+\mathbb{W}_{2}(\mu_{1},\mu_{2})\big).$$

    Moreover, $b$ is bounded by a positive constant $M_{b}$, i.e.,

    $$\sup_{(t,x,\mu)\in[0,T]\times\mathbb{R}\times\mathcal{P}_{2}(\mathbb{R})}|b(t,x,\mu)|\leq M_{b}.$$
  1. (H2)

    There exist constants $M_{\sigma}>K_{\sigma}>0$ and $L_{\sigma}>0$ such that for any $x,x_{1},x_{2}\in\mathbb{R}$,

    $$|\sigma(x_{1})-\sigma(x_{2})|\leq L_{\sigma}|x_{1}-x_{2}|\quad{\rm and}\quad K_{\sigma}\leq|\sigma(x)|\leq M_{\sigma}.$$

Under assumptions (H1) and (H2) above, one can deduce from Theorem 3.3 in [8] that the system (1.1) admits a unique solution via a Lamperti transform.

Lemma 3.1.

Suppose that (H1) and (H2) hold and $1/2<H<1$. Then (1.1) has a unique solution $X\in L^{2}(\Omega;C([0,T];\mathbb{R}))$.

In order to establish the averaging principle, besides conditions (H1) and (H2), we further assume:

  1. (H3)

    $b$ is Lipschitz continuous with respect to $t$, i.e., there exists a positive constant $L^{\prime}_{b}$ such that for any $t_{1},t_{2}\in[0,T]$, $x\in\mathbb{R}$ and $\mu\in\mathcal{P}_{2}(\mathbb{R})$,

    $$|b(t_{1},x,\mu)-b(t_{2},x,\mu)|\leq L^{\prime}_{b}|t_{1}-t_{2}|.$$
  2. (H4)

    The function $\sigma(x)$ is of class $C^{1}(\mathbb{R})$. There exists a constant $M^{\prime}_{\sigma}>0$ such that for any $x,x_{1},x_{2}\in\mathbb{R}$,

    $$|\nabla\sigma(x_{1})-\nabla\sigma(x_{2})|\leq M^{\prime}_{\sigma}|x_{1}-x_{2}|\quad{\rm and}\quad|\nabla\sigma(x)|\leq M^{\prime}_{\sigma}$$

    hold. Here, $\nabla$ is the standard gradient operator on $\mathbb{R}$.

  3. (H5)

    There exist a bounded positive function $\varphi:\mathbb{R}^{+}\rightarrow\mathbb{R}^{+}$ and a measurable function $\bar{b}:\mathbb{R}\times\mathcal{P}_{2}(\mathbb{R})\rightarrow\mathbb{R}$ such that for any $x\in\mathbb{R}$ and $\mu\in\mathcal{P}_{2}(\mathbb{R})$ it holds that

    $$\sup_{t\geq 0}\Bigg|\frac{1}{T}\int_{t}^{t+T}(b(s,x,\mu)-\bar{b}(x,\mu))\mathrm{d}s\Bigg|\leq\varphi(T)\big(1+|x|+\mu(|\cdot|^{2})\big)$$

    where $\varphi(T)$ satisfies $\lim_{T\rightarrow\infty}\varphi(T)=0$.

Remark 3.2.

It follows from the conditions (H1) and (H5) that $\bar{b}$ satisfies

$$\begin{aligned}
|\bar{b}(x,\mu)| &\leq L_{\bar{b}},\\
|\bar{b}(x_{1},\mu_{1})-\bar{b}(x_{2},\mu_{2})| &\leq L_{\bar{b}}\big(|x_{1}-x_{2}|+\mathbb{W}_{2}(\mu_{1},\mu_{2})\big)
\end{aligned}$$

for any $x,x_{1},x_{2}\in\mathbb{R}$ and $\mu,\mu_{1},\mu_{2}\in\mathcal{P}_{2}(\mathbb{R})$, where $L_{\bar{b}}$ is a positive constant.

Proof.

We have that

$$\begin{aligned}
|\bar{b}(x,\mu)| &\leq \Bigg|\frac{1}{T}\int_{0}^{T}(b(s,x,\mu)-\bar{b}(x,\mu))\mathrm{d}s\Bigg|+\Bigg|\frac{1}{T}\int_{0}^{T}b(s,x,\mu)\mathrm{d}s\Bigg|\\
&\leq \varphi(T)\big(1+|x|+\mu(|\cdot|^{2})\big)+M_{b}
\end{aligned}$$

and

$$\begin{aligned}
|\bar{b}(x_{1},\mu_{1})-\bar{b}(x_{2},\mu_{2})| &\leq \Bigg|\frac{1}{T}\int_{0}^{T}(b(s,x_{1},\mu_{1})-\bar{b}(x_{1},\mu_{1}))\mathrm{d}s\Bigg|+\Bigg|\frac{1}{T}\int_{0}^{T}(b(s,x_{2},\mu_{2})-\bar{b}(x_{2},\mu_{2}))\mathrm{d}s\Bigg|\\
&\quad+\Bigg|\frac{1}{T}\int_{0}^{T}(b(s,x_{1},\mu_{1})-b(s,x_{2},\mu_{2}))\mathrm{d}s\Bigg|\\
&\leq \varphi(T)\big(1+|x_{1}|+|x_{2}|+\mu_{1}(|\cdot|^{2})+\mu_{2}(|\cdot|^{2})\big)+L_{b}\big(|x_{1}-x_{2}|+\mathbb{W}_{2}(\mu_{1},\mu_{2})\big).
\end{aligned}$$

Letting $T\rightarrow\infty$ and using $\lim_{T\rightarrow\infty}\varphi(T)=0$, we see that the estimates in Remark 3.2 hold with $L_{\bar{b}}=M_{b}\vee L_{b}$. ∎

Remark 3.3.

Noting that

$$\sup_{t\geq 0}\Bigg|\frac{1}{T}\int_{t}^{t+T}(b(s,x,\mu)-\bar{b}(x,\mu))\mathrm{d}s\Bigg|\leq\sup_{t\geq 0}\frac{1}{T}\int_{t}^{t+T}|b(s,x,\mu)-\bar{b}(x,\mu)|\mathrm{d}s.$$

This shows that the averaging condition (H5) is weaker than the following averaging condition

$$\sup_{t\geq 0}\frac{1}{T}\int_{t}^{t+T}|b(s,x,\mu)-\bar{b}(x,\mu)|\mathrm{d}s\leq\varphi(T)\big(1+|x|+\mu(|\cdot|^{2})\big).$$
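To make the averaging condition (H5) concrete, the sketch below uses a hypothetical drift that is not taken from the paper: $b(s,x,\mu)=(1+\cos s)\sin(x-m_{\mu})$, where $m_{\mu}$ denotes the mean of $\mu$. Under this assumption the natural candidate for the averaged drift is $\bar{b}(x,\mu)=\sin(x-m_{\mu})$, and since $|\int_{t}^{t+T}\cos s\,\mathrm{d}s|\leq 2$, one may take $\varphi(T)=\min\{1,2/T\}$. The code approximates the left-hand side of (H5) by the midpoint rule; all function names are illustrative.

```python
import math


def drift(s, x, mean):
    """Hypothetical oscillatory drift b(s, x, mu); mu enters only through its mean."""
    return (1.0 + math.cos(s)) * math.sin(x - mean)


def averaged_drift(x, mean):
    """Time average b_bar(x, mu) of the hypothetical drift above."""
    return math.sin(x - mean)


def averaging_defect(t, horizon, x, mean, n=20_000):
    """|(1/T) int_t^{t+T} (b(s, x, mu) - b_bar(x, mu)) ds| by the midpoint rule."""
    h = horizon / n
    total = 0.0
    for i in range(n):
        s = t + (i + 0.5) * h
        total += drift(s, x, mean) - averaged_drift(x, mean)
    return abs(total * h / horizon)


def phi(horizon):
    """Rate function for this example: phi(T) = min(1, 2/T)."""
    return min(1.0, 2.0 / horizon)
```

For this example the defect equals $|\sin(x-m_{\mu})|\,|\sin(t+T)-\sin t|/T$, so `averaging_defect(t, T, x, m)` stays below `phi(T)` for every starting time `t` and decays as the window length `T` grows.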

3.2 Main result

Now, we define the averaged equation:

$$\mathrm{d}\bar{X}_{t}=\bar{b}(\bar{X}_{t},\mathscr{L}_{\bar{X}_{t}})\mathrm{d}t+\sigma(\bar{X}_{t})\mathrm{d}B_{t}^{H},\quad\bar{X}_{0}=x_{0} \tag{3.1}$$

where $\bar{b}$ is given in (H5). Using Theorem 3.3 in [8] again, we obtain the existence and uniqueness of the solution to (3.1).

Lemma 3.4.

Suppose that (H1)-(H5) hold. Then Eq. (3.1) has a unique solution $\bar{X}\in L^{2}(\Omega;C([0,T];\mathbb{R}))$.

Theorem 3.5.

Suppose that (H1)-(H5) hold. Then

$$\lim_{\epsilon\rightarrow 0}\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\big]=0.$$

The proof of Theorem 3.5 will be given in Section 4.

Remark 3.6.

The averaging principle result (Theorem 3.5) is also applicable to the following system

$$\mathrm{d}X_{t}=\epsilon b(t,X_{t},\mathscr{L}_{X_{t}})\mathrm{d}t+\epsilon^{H}\sigma(X_{t})\mathrm{d}B^{H}_{t}. \tag{3.2}$$

Under the time change $t\mapsto t/\epsilon$, define $Y_{t}^{\epsilon}:=X_{t/\epsilon}$ and $B^{\epsilon,H}_{t}:=\epsilon^{H}B^{H}_{t/\epsilon}$ for all $t\in\mathbb{R}^{+}$; then (3.2) can be rewritten as

$$\mathrm{d}Y_{t}^{\epsilon}=b(t/\epsilon,Y_{t}^{\epsilon},\mathscr{L}_{Y_{t}^{\epsilon}})\mathrm{d}t+\sigma(Y_{t}^{\epsilon})\mathrm{d}B^{\epsilon,H}_{t}. \tag{3.3}$$

Since $B^{\epsilon,H}$ is again an FBM with Hurst parameter $H$ by self-similarity, we can then consider the following system

$$\mathrm{d}\tilde{X}_{t}^{\epsilon}=b(t/\epsilon,\tilde{X}_{t}^{\epsilon},\mathscr{L}_{\tilde{X}_{t}^{\epsilon}})\mathrm{d}t+\sigma(\tilde{X}_{t}^{\epsilon})\mathrm{d}B^{H}_{t}. \tag{3.4}$$

4 The proof of main result

4.1 Some a priori estimates of the solution

Lemma 4.1.

Suppose that (H1)-(H5) hold. Then, for $t\in[0,T]$, we have

$$\|X^{\epsilon}\|_{\gamma}+\|\bar{X}\|_{\gamma}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\big((1+\||B^{H}\||_{\beta})\vee(1+\||B^{H}\||_{\beta})^{\frac{1}{\gamma}}\big),\quad{\rm a.s.}$$
Proof.

As in the proof of Theorem 2.2 in [15] and Exercise 4.5 in [9], for any $0\leq s<t\leq T$, we have

$$\||X^{\epsilon}\||_{\gamma,s,t}\leq C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big(1+\||X^{\epsilon}\||_{\gamma,s,t}(t-s)^{\gamma}\big).$$

Suppose that $\Delta$ satisfies $\Delta=\big(2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big)^{-\frac{1}{\gamma}}$. Then, for all $s$ and $t$ such that $t-s\leq\Delta$, we have

$$\||X^{\epsilon}\||_{\gamma,s,t}\leq 2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta}). \tag{4.1}$$

Therefore we can obtain

$$\|X^{\epsilon}\|_{\infty,s,t}\leq|X^{\epsilon}_{s}|+\||X^{\epsilon}\||_{\gamma,s,t}(t-s)^{\gamma}\leq|X^{\epsilon}_{s}|+2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\Delta^{\gamma}. \tag{4.2}$$

If $\Delta\geq T$, from (4.1) and (4.2), we obtain the estimate

$$\|X^{\epsilon}\|_{\infty,s,t}\leq|x_{0}|+2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})T^{\gamma}\quad{\rm and}\quad\||X^{\epsilon}\||_{\gamma}\leq 2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta}). \tag{4.3}$$

While if $\Delta\leq T$, then from (4.2) we get

$$\|X^{\epsilon}\|_{\infty,s,t}\leq|X^{\epsilon}_{s}|+2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big(2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big)^{-\frac{1}{\gamma}\cdot\gamma}\leq|X^{\epsilon}_{s}|+1. \tag{4.4}$$

Dividing the interval $[0,T]$ into $n=[\frac{T}{\Delta}]+1$ subintervals and using the estimate (4.4) on every subinterval, we obtain

$$\|X^{\epsilon}\|_{\infty}\leq|x_{0}|+n\leq|x_{0}|+T\Delta^{-1}+1\leq|x_{0}|+2T\big(2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big)^{\frac{1}{\gamma}}\leq|x_{0}|+2^{\frac{1+\gamma}{\gamma}}TC_{\alpha,\beta,\gamma,T}^{\frac{1}{\gamma}}(1+\||B^{H}\||_{\beta})^{\frac{1}{\gamma}}. \tag{4.5}$$

Moreover, from (4.1), when $t-s\leq\Delta$ we have $\||X^{\epsilon}\||_{\gamma,s,t}\leq 2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})$. When $t-s\geq\Delta$, define $t_{i}=(s+i\Delta)\wedge t$ for $i=0,1,\dots,N$, noting that $t_{N}=t$ for $N\geq|t-s|/\Delta$ and that $t_{i+1}-t_{i}\leq\Delta$ for all $i$; then

$$|X^{\epsilon}_{t}-X^{\epsilon}_{s}|\leq\sum_{0\leq i<|t-s|/\Delta}|X^{\epsilon}_{t_{i+1}}-X^{\epsilon}_{t_{i}}|\leq\big(|t-s|/\Delta+1\big)\Delta^{\gamma}=\Delta^{\gamma-1}(\Delta+|t-s|)\leq 2\Delta^{\gamma-1}|t-s|.$$

So we have

\begin{align}
\||X^{\epsilon}\||_{\gamma,s,t} &\leq 2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})(1\vee 2\Delta^{\gamma-1})=\Big\{2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\vee 2\Delta^{-1}\Big\} \tag{4.6}\\
&= \Big\{2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\vee 2\big(2C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big)^{\frac{1}{\gamma}}\Big\}. \tag{4.7}
\end{align}

Thus, from (4.3), (4.5) and (4.6), we get the desired estimate.

Using similar techniques, we can prove

$$\|\bar{X}\|_{\gamma}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\big((1+\||B^{H}\||_{\beta})\vee(1+\||B^{H}\||_{\beta})^{\frac{1}{\gamma}}\big),\quad{\rm a.s.}$$

Here we omit the proof. ∎

Lemma 4.2.

Suppose that (H1)-(H5) hold. Then, if $0\leq t\leq t+h\leq T$ and $h\in(0,1)$, we have

$$|X^{\epsilon}_{t+h}-X^{\epsilon}_{t}|\leq C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})(1+\|X^{\epsilon}\|_{\gamma})h^{\beta},\quad{\rm a.s.}$$
Proof.

From (1.1), by (H1)-(H5), Hölder’s inequality and Remark 2.4, we have

$$\begin{aligned}
|X^{\epsilon}_{t+h}-X^{\epsilon}_{t}| &\leq \bigg|\int_{t}^{t+h}b({s}/{\epsilon},X^{\epsilon}_{s},\mathscr{L}_{X^{\epsilon}_{s}})\mathrm{d}s\bigg|+\bigg|\int_{t}^{t+h}\sigma(X^{\epsilon}_{s})\mathrm{d}B^{H}_{s}\bigg|\\
&\leq \int_{t}^{t+h}|b({s}/{\epsilon},X^{\epsilon}_{s},\mathscr{L}_{X^{\epsilon}_{s}})|\mathrm{d}s+C_{\alpha,\beta,\gamma,T}\||B^{H}\||_{\beta}\Big(\sup_{t\leq r\leq t+h}|\sigma(X^{\epsilon}_{r})|+\sup_{t\leq u<r\leq t+h}\frac{|\sigma(X^{\epsilon}_{r})-\sigma(X^{\epsilon}_{u})|}{(r-u)^{\gamma}}\Big)h^{\beta}\\
&\leq C_{\alpha,\beta,\gamma,T}(1+\||B^{H}\||_{\beta})\big(1+\|X^{\epsilon}\|_{\gamma}\big)h^{\beta}.
\end{aligned}$$

This completes the proof. ∎

4.2 The proof of Theorem 3.5

For each $R>1$, we define the stopping time $\tau_{R}$ by

$$\tau_{R}:=\inf\{t\geq 0:\||B^{H}\||_{\beta,0,t}>R\}\wedge T. \tag{4.8}$$

Firstly, we have

$$\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\big]\leq\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\mathbf{1}_{\{\tau_{R}\geq T\}}\big]+\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\mathbf{1}_{\{\tau_{R}<T\}}\big] \tag{4.9}$$

where $\mathbf{1}_{\cdot}$ is an indicator function. For the first term on the right-hand side of inequality (4.9), denote $D:=\{\||B^{H}\||_{\beta}\leq R\}$. Now, for $\lambda\geq 1$, an equivalent norm on $C^{\gamma}([0,T],\mathbb{R})$ with $\gamma\in(0,1)$ is defined by

$$\|f\|_{\gamma,\lambda,0,T}:=\|f\|_{\infty,\lambda,0,T}+\||f\||_{\gamma,\lambda,0,T}:=\sup_{t\in[0,T]}e^{-\lambda t/2}|f(t)|+\sup_{0\leq s<t\leq T}e^{-\lambda t/2}\frac{|f(t)-f(s)|}{(t-s)^{\gamma}}.$$

For simplicity, let $\|f\|_{\gamma,\lambda}:=\|f\|_{\gamma,\lambda,0,T}$, $\|f\|_{\infty,\lambda}:=\|f\|_{\infty,\lambda,0,T}$ and $\||f\||_{\gamma,\lambda}:=\||f\||_{\gamma,\lambda,0,T}$.
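The equivalence of the weighted and unweighted Hölder norms follows from $e^{-\lambda T/2}\leq e^{-\lambda t/2}\leq 1$ on $[0,T]$, which gives $e^{-\lambda T/2}\|f\|_{\gamma}\leq\|f\|_{\gamma,\lambda}\leq\|f\|_{\gamma}$. The sketch below evaluates a discrete analogue of both norms on a time grid, replacing the suprema by maxima over grid points; this discretization and the function name `holder_norm` are assumptions made purely for illustration.

```python
import math


def holder_norm(values, times, gamma, lam=0.0):
    """Discrete analogue of ||f||_{gamma,lambda,0,T}; lam=0 gives ||f||_gamma."""
    # Weighted supremum part: max over grid points of exp(-lam * t / 2) * |f(t)|.
    sup_part = max(math.exp(-lam * t / 2) * abs(v) for v, t in zip(values, times))
    # Weighted Hoelder seminorm: max over grid pairs s < t, weighted at the later time t.
    seminorm = 0.0
    for j in range(1, len(times)):
        weight = math.exp(-lam * times[j] / 2)
        for i in range(j):
            quotient = abs(values[j] - values[i]) / (times[j] - times[i]) ** gamma
            seminorm = max(seminorm, weight * quotient)
    return sup_part + seminorm
```

On any grid contained in $[0,T]$, the value returned with `lam > 0` lies between `math.exp(-lam * T / 2)` times the unweighted value and the unweighted value itself. A large $\lambda$ damps contributions from late times, which is what allows terms on the right-hand side of the estimates to be absorbed into the left-hand side.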

In what follows we fix $0<\alpha<\gamma<\beta$, $\frac{1}{2}<\beta<H$, $\gamma+\beta>1$. We will show that for every $\rho_{0}>0$ there exists an $\epsilon_{0}>0$ so that for $\epsilon<\epsilon_{0}$, $\lambda>\lambda_{0}$ we have

$$\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}\leq\rho_{0}. \tag{4.10}$$

Note that the norm here is equivalent to the norm in the conclusion. Here $\delta\in(0,1)$ is a parameter depending on $\rho_{0}$. To estimate all the terms in the following inequality we have to consider three cases. In the first case, the right-hand side is absorbed by the left-hand side of the inequality when $\lambda$ is sufficiently large. The second case includes terms with estimates of the form $C\delta^{2-2\gamma}$, where $C$ is determined a priori by $\alpha,\beta,\gamma,T,|x_{0}|$ but is independent of $\rho_{0},\lambda,\delta,\epsilon$; we then fix $\delta$ so that $C\delta^{2-2\gamma}<\zeta\rho_{0}$ with $\zeta>0$ sufficiently small. The third case contains terms with an estimate of the form $\sqrt{R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]}$, which can be made arbitrarily small when $R$ is sufficiently large.

Let $\mathbf{A}:=\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big]$, and divide $[0,T]$ into intervals of size $\delta$, where $\delta\in(0,1)$ is a fixed positive number. For $t\in[k\delta,\min\{(k+1)\delta,T\}]$, set $s(\delta)=\lfloor\frac{s}{\delta}\rfloor\delta$, where $\lfloor\frac{s}{\delta}\rfloor$ is the integer part of $\frac{s}{\delta}$. From (1.1) and (3.1), we have

$$\begin{aligned}
\mathbf{A} &\leq 5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(b({s}/{\epsilon},X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-b({s}/{\epsilon},X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}}))\mathrm{d}s\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]\\
&\quad+5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(b({s}/{\epsilon},X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}}))\mathrm{d}s\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]\\
&\quad+5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}}))\mathrm{d}s\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]\\
&\quad+5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-\bar{b}(\bar{X}_{s},\mathscr{L}_{\bar{X}_{s}}))\mathrm{d}s\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]\\
&\quad+5\mathbb{E}\bigg[\bigg\|\int_{0}^{\cdot}(\sigma(X^{\epsilon}_{s})-\sigma(\bar{X}_{s}))\mathrm{d}B^{H}_{s}\bigg\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\bigg]=:\sum_{i=1}^{5}\mathbf{A}_{i}.
\end{aligned}$$

By Hölder’s inequality, it is easy to obtain

$$\bigg\|\int_{0}^{\cdot}f(s)\mathrm{d}s\bigg\|_{\gamma,\lambda}^{2}\leq C_{\gamma,T}\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}\frac{|f(s)|^{2}}{(t-s)^{\gamma}}\mathrm{d}s. \tag{4.11}$$

By (H2), (4.11) and Lemma 4.2, we obtain

\begin{align}
\mathbf{A}_{1}+\mathbf{A}_{3} &\leq C_{\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}|b({s}/{\epsilon},X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-b({s}/{\epsilon},X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})|^{2}(t-s)^{-\gamma}\mathrm{d}s\mathbf{1}_{D}\bigg] \tag{4.13}\\
&\quad+C_{\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}|\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})|^{2}(t-s)^{-\gamma}\mathrm{d}s\mathbf{1}_{D}\bigg] \notag\\
&\leq C_{\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}(|X_{s(\delta)}^{\epsilon}-X^{\epsilon}_{s}|^{2}+\mathbb{E}[|X_{s(\delta)}^{\epsilon}-X^{\epsilon}_{s}|^{2}])(t-s)^{-\gamma}\mathrm{d}s\mathbf{1}_{D}\bigg] \tag{4.14}\\
&\leq C_{\alpha,\beta,\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}((1+\||B^{H}\||_{\beta}^{2})(1+\|X^{\epsilon}\|_{\gamma}^{2})\delta^{2\beta}+\mathbb{E}[(1+\||B^{H}\||_{\beta}^{2})(1+\|X^{\epsilon}\|_{\gamma}^{2})\delta^{2\beta}])(t-s)^{-\gamma}\mathrm{d}s\mathbf{1}_{D}\bigg] \tag{4.15}\\
&\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2\beta}. \tag{4.16}
\end{align}

By an elementary inequality, we have

\begin{align*}
\mathbf{A}_{2}
&\leq C\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\bigg|\int_{0}^{t}(b(s/\epsilon,X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\mathbf{1}_{D}\bigg]\\
&\quad+C\mathbb{E}\bigg[\sup_{0\leq s<t\leq T}e^{-\lambda t}\frac{\big|\int_{s}^{t}(b(r/\epsilon,X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big|^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{D}\bigg]\\
&=: \mathbf{A}_{21}+\mathbf{A}_{22}.
\end{align*}

For $\mathbf{A}_{21}$, by (H1)-(H5), Hölder's inequality and Lemma 4.1, we have

\begin{align*}
\mathbf{A}_{21}
&\leq C\mathbb{E}\bigg[\sup_{t\in[0,T]}\bigg|\sum_{k=0}^{\lfloor\frac{t}{\delta}\rfloor-1}\int_{k\delta}^{(k+1)\delta}(b(s/\epsilon,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\
&\quad+C\mathbb{E}\bigg[\sup_{t\in[0,T]}\bigg|\int_{\lfloor\frac{t}{\delta}\rfloor\delta}^{t}(b(s/\epsilon,X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})-\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\
&\leq C\mathbb{E}\bigg[\sup_{t\in[0,T]}\Big\lfloor\frac{t}{\delta}\Big\rfloor\sum_{k=0}^{\lfloor\frac{t}{\delta}\rfloor-1}\bigg|\int_{k\delta}^{(k+1)\delta}(b(s/\epsilon,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\
&\quad+C\mathbb{E}\bigg[\sup_{t\in[0,T]}\Big(t-\Big\lfloor\frac{t}{\delta}\Big\rfloor\delta\Big)\int_{\lfloor\frac{t}{\delta}\rfloor\delta}^{t}(|b(s/\epsilon,X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})|^{2}+|\bar{b}(X_{s(\delta)}^{\epsilon},\mathscr{L}_{X_{s(\delta)}^{\epsilon}})|^{2})\mathrm{d}s\bigg]\\
&\leq C_{T}\delta^{2}+\frac{C_{T}}{\delta^{2}}\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\mathbb{E}\bigg[\bigg|\epsilon\int_{\frac{k\delta}{\epsilon}}^{\frac{(k+1)\delta}{\epsilon}}(b(s,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\
&\leq C_{T}\delta^{2}+\frac{C_{T}}{\delta^{2}}\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\mathbb{E}\bigg[\bigg|\frac{T\epsilon}{\delta(k+1)}\int_{0}^{\frac{(k+1)\delta}{\epsilon}}(b(s,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg]\\
&\quad+\frac{C_{T}}{\delta^{2}}\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\mathbb{E}\bigg[\bigg|\frac{T\epsilon}{\delta k}\int_{0}^{\frac{k\delta}{\epsilon}}(b(s,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|^{2}\bigg].
\end{align*}
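The last two inequalities in the estimate of $\mathbf{A}_{21}$ rest on a time rescaling followed by a splitting of the integral at the origin. A sketch of that step, where the shorthand $g_{k}$ and $a_{k}$ is introduced here purely for illustration and does not appear in the paper:

```latex
% Hypothetical shorthand (not from the paper):
%   g_k(u) := b(u, X_{k\delta}^{\epsilon}, \mathscr{L}_{X_{k\delta}^{\epsilon}})
%             - \bar{b}(X_{k\delta}^{\epsilon}, \mathscr{L}_{X_{k\delta}^{\epsilon}}),
%   a_k    := k\delta/\epsilon .
\begin{align*}
\int_{k\delta}^{(k+1)\delta}\big(b(s/\epsilon,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})
-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})\big)\,\mathrm{d}s
&=\epsilon\int_{a_{k}}^{a_{k+1}}g_{k}(u)\,\mathrm{d}u
&&(s=\epsilon u)\\
&=(k+1)\delta\cdot\frac{1}{a_{k+1}}\int_{0}^{a_{k+1}}g_{k}(u)\,\mathrm{d}u
-k\delta\cdot\frac{1}{a_{k}}\int_{0}^{a_{k}}g_{k}(u)\,\mathrm{d}u
&&(k\geq 1).
\end{align*}
% For k = 0 the second term is absent. Since k\delta \leq (k+1)\delta \leq T
% and (x-y)^2 \leq 2x^2 + 2y^2:
\begin{align*}
\bigg|\epsilon\int_{a_{k}}^{a_{k+1}}g_{k}(u)\,\mathrm{d}u\bigg|^{2}
\leq 2T^{2}\bigg|\frac{\epsilon}{\delta(k+1)}\int_{0}^{a_{k+1}}g_{k}(u)\,\mathrm{d}u\bigg|^{2}
+2T^{2}\bigg|\frac{\epsilon}{\delta k}\int_{0}^{a_{k}}g_{k}(u)\,\mathrm{d}u\bigg|^{2}.
\end{align*}
```

Each term on the right is a long-time average of $b-\bar{b}$ over a window starting at $0$, which is the quantity an averaging condition such as (H5) is designed to control.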

As $\epsilon\rightarrow 0$, we have $\frac{\delta(k+1)}{\epsilon}\rightarrow\infty$ for any $k$ with $1\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1$. In addition, the maximum is taken over finitely many elements, determined by the fixed number $\delta$ and by $T$. Following (H5), we have for the elements under the maximum

\begin{align*}
&\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\bigg|\frac{\epsilon}{\delta(k+1)}\int_{0}^{\frac{(k+1)\delta}{\epsilon}}(b(s,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}s\bigg|\\
&\quad\leq\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\varphi\bigg(\frac{(k+1)\delta}{\epsilon}\bigg)\big(1+|X_{k\delta}^{\epsilon}|^{2}+\mathbb{W}_{2}(\mathscr{L}_{X_{k\delta}^{\epsilon}},\delta_{0})\big)\leq C_{\epsilon} \tag{4.17}
\end{align*}

where $C_{\epsilon}\rightarrow 0$ as $\epsilon\rightarrow 0$. Thus, for $\epsilon$ sufficiently small and the given $\delta$, we have

\begin{align*}
\mathbf{A}_{21}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2}. \tag{4.18}
\end{align*}

For $\mathbf{A}_{22}$, by (H1)-(H5), Hölder's inequality and Lemma 4.1 again, we have

\begin{align*}
\mathbf{A}_{22}
&\leq C\mathbb{E}\bigg[\bigg(\sup_{0\leq s<t\leq T}\frac{\big|\int_{s}^{t}(b(r/\epsilon,X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big|}{(t-s)^{\gamma}}\bigg)^{2}\mathbf{1}_{\ell}\bigg]\\
&\quad+C\mathbb{E}\bigg[\bigg(\sup_{0\leq s<t\leq T}\frac{\big|\int_{s}^{t}(b(r/\epsilon,X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big|}{(t-s)^{\gamma}}\bigg)^{2}\mathbf{1}_{\ell^{c}}\bigg]\\
&=: \mathbf{A}_{221}+\mathbf{A}_{222}
\end{align*}

where $\ell:=\{t<(\lfloor\frac{s}{\delta}\rfloor+2)\delta\}$ and $\ell^{c}:=\{t\geq(\lfloor\frac{s}{\delta}\rfloor+2)\delta\}$. For $\mathbf{A}_{221}$, by (H2) and (H3), and since on $\ell=\{t<(\lfloor\frac{s}{\delta}\rfloor+2)\delta\}$ we have $t-s<\lfloor\frac{s}{\delta}\rfloor\delta-s+2\delta\leq 2\delta$, we obtain

\begin{align*}
\mathbf{A}_{221}\leq C\mathbb{E}\Big[\sup_{0\leq s<t\leq T}(t-s)^{2-2\gamma}\mathbf{1}_{\ell}\Big]\leq C\delta^{2-2\gamma}.
\end{align*}
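The exponent $2-2\gamma$ in the bound for $\mathbf{A}_{221}$ can be traced as follows. This is a sketch that assumes the integrand $b-\bar{b}$ is bounded by some constant $M$ along the paths considered; $M$ is a hypothetical constant introduced here, standing in for whatever bound the hypotheses (H2) and (H3) provide.

```latex
% Assumption (illustrative): |b - \bar{b}| \leq M along the paths considered.
% On \ell, \lfloor s/\delta \rfloor \delta \leq s gives t - s < 2\delta.
\begin{align*}
\frac{\big|\int_{s}^{t}(b-\bar{b})\,\mathrm{d}r\big|^{2}}{(t-s)^{2\gamma}}
\leq \frac{M^{2}(t-s)^{2}}{(t-s)^{2\gamma}}
= M^{2}(t-s)^{2-2\gamma}
\leq M^{2}(2\delta)^{2-2\gamma}
\qquad\text{on } \ell ,
\end{align*}
% where the last step uses 2 - 2\gamma > 0, i.e. \gamma < 1.
```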

By (H1)-(H5) and the fact that $\lfloor\lambda_{1}\rfloor-\lfloor\lambda_{2}\rfloor\leq\lambda_{1}-\lambda_{2}+1$ for $\lambda_{1}\geq\lambda_{2}\geq 0$, we have

\begin{align*}
\mathbf{A}_{222}
&\leq C\mathbb{E}\bigg[\sup_{0\leq s<t\leq T}\frac{\big|\int_{s}^{(\lfloor\frac{s}{\delta}\rfloor+1)\delta}(b(r/\epsilon,X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big|^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg]\\
&\quad+C\mathbb{E}\bigg[\sup_{0\leq s<t\leq T}\frac{\big|\int_{\lfloor\frac{t}{\delta}\rfloor\delta}^{t}(b(r/\epsilon,X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big|^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg]\\
&\quad+C\mathbb{E}\bigg[\sup_{0\leq s<t\leq T}\frac{\big|\sum_{k=\lfloor\frac{s}{\delta}\rfloor+1}^{\lfloor\frac{t}{\delta}\rfloor-1}\int_{k\delta}^{(k+1)\delta}(b(r/\epsilon,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}r\big|^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg]\\
&\leq C\mathbb{E}\bigg[\sup_{0\leq s<t\leq T}\frac{\big|\int_{s}^{(\lfloor\frac{s}{\delta}\rfloor+1)\delta}(b(r/\epsilon,X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big|^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg]\\
&\quad+C\mathbb{E}\bigg[\sup_{0\leq s<t\leq T}\frac{\big|\int_{\lfloor\frac{t}{\delta}\rfloor\delta}^{t}(b(r/\epsilon,X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}})-\bar{b}(X_{r(\delta)}^{\epsilon},\mathscr{L}_{X_{r(\delta)}^{\epsilon}}))\mathrm{d}r\big|^{2}}{(t-s)^{2\gamma}}\mathbf{1}_{\ell^{c}}\bigg]\\
&\quad+C\mathbb{E}\bigg[\sup_{0\leq s<t\leq T}\frac{(\lfloor\frac{t}{\delta}\rfloor-\lfloor\frac{s}{\delta}\rfloor-1)}{(t-s)^{2\gamma}}\sum_{k=\lfloor\frac{s}{\delta}\rfloor+1}^{\lfloor\frac{t}{\delta}\rfloor-1}\bigg|\int_{k\delta}^{(k+1)\delta}(b(r/\epsilon,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}r\bigg|^{2}\mathbf{1}_{\ell^{c}}\bigg]\\
&\leq C\delta^{2-2\gamma}+\frac{C_{\gamma,T}}{\delta^{2}}\max_{0\leq k\leq\lfloor\frac{T}{\delta}\rfloor-1}\mathbb{E}\bigg[\bigg|\int_{k\delta}^{(k+1)\delta}(b(r/\epsilon,X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}})-\bar{b}(X_{k\delta}^{\epsilon},\mathscr{L}_{X_{k\delta}^{\epsilon}}))\mathrm{d}r\bigg|^{2}\mathbf{1}_{\ell^{c}}\bigg].
\end{align*}
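The prefactor $C_{\gamma,T}/\delta^{2}$ in the last line comes from counting the full blocks between $s$ and $t$. A pathwise sketch of this counting, where $n$ and $I_{k}$ are shorthand introduced here for illustration only:

```latex
% Hypothetical shorthand (not from the paper):
%   n   := \lfloor t/\delta \rfloor - \lfloor s/\delta \rfloor - 1   (number of full blocks),
%   I_k := \int_{k\delta}^{(k+1)\delta} (b - \bar{b})\,\mathrm{d}r    (contribution of block k).
% Floor inequality with \lambda_1 = t/\delta, \lambda_2 = s/\delta:
\begin{align*}
n=\Big\lfloor\frac{t}{\delta}\Big\rfloor-\Big\lfloor\frac{s}{\delta}\Big\rfloor-1\leq\frac{t-s}{\delta}.
\end{align*}
% Cauchy-Schwarz for finite sums, then bound each summand by the maximum:
\begin{align*}
\frac{1}{(t-s)^{2\gamma}}\bigg|\sum_{k}I_{k}\bigg|^{2}
\leq \frac{n}{(t-s)^{2\gamma}}\sum_{k}|I_{k}|^{2}
\leq \frac{n^{2}}{(t-s)^{2\gamma}}\max_{k}|I_{k}|^{2}
\leq \frac{(t-s)^{2-2\gamma}}{\delta^{2}}\max_{k}|I_{k}|^{2}
\leq \frac{T^{2-2\gamma}}{\delta^{2}}\max_{k}|I_{k}|^{2}.
\end{align*}
```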

Using (H5) again, the remaining term on the right-hand side can be estimated similarly to $\mathbf{A}_{21}$, see (4.17). We have

\begin{align*}
\mathbf{A}_{22}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2-2\gamma}. \tag{4.19}
\end{align*}

For $\mathbf{A}_{4}$, by (4.11), we have

\begin{align*}
\mathbf{A}_{4}
&\leq C_{\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\int_{0}^{t}|\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-\bar{b}(\bar{X}_{s},\mathscr{L}_{\bar{X}_{s}})|^{2}(t-s)^{-\gamma}\mathbf{1}_{D}\mathrm{d}s\bigg] \tag{4.20}\\
&\leq C_{\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}\int_{0}^{t}e^{-\lambda(t-s)}e^{-\lambda s}(t-s)^{-\gamma}|\bar{b}(X_{s}^{\epsilon},\mathscr{L}_{X_{s}^{\epsilon}})-\bar{b}(\bar{X}_{s},\mathscr{L}_{\bar{X}_{s}})|^{2}\mathbf{1}_{D}\mathrm{d}s\bigg] \tag{4.21}\\
&\leq C_{\gamma,T}\mathbb{E}\bigg[\sup_{t\in[0,T]}\int_{0}^{t}e^{-\lambda(t-s)}(t-s)^{-\gamma}e^{-\lambda s}\big(|X^{\epsilon}_{s}-\bar{X}_{s}|^{2}+\mathbb{E}[|X^{\epsilon}_{s}-\bar{X}_{s}|^{2}]\big)\mathbf{1}_{D}\mathrm{d}s\bigg] \tag{4.22}\\
&\leq C_{\gamma,T}\mathbb{E}\big[\sup_{t\in[0,T]}e^{-\lambda t}\big(|X^{\epsilon}_{t}-\bar{X}_{t}|^{2}+\mathbb{E}[|X^{\epsilon}_{t}-\bar{X}_{t}|^{2}]\big)\mathbf{1}_{D}\big]\sup_{t\in[0,T]}\int_{0}^{t}e^{-\lambda(t-s)}(t-s)^{-\gamma}\mathrm{d}s \tag{4.23}\\
&\leq C_{\gamma,T}\lambda^{\gamma-1}\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big]+C_{\gamma,T}\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D^{c}}\big]. \tag{4.24}
\end{align*}
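The factor $\lambda^{\gamma-1}$ in the last line comes from the weighted kernel integral. A short worked computation, assuming only $0<\gamma<1$ and $\lambda>0$ (here $\Gamma$ denotes the Gamma function, which the text does not name explicitly):

```latex
% Substitute u = t - s, enlarge the domain to [0,\infty), then rescale w = \lambda u.
\begin{align*}
\int_{0}^{t}e^{-\lambda(t-s)}(t-s)^{-\gamma}\,\mathrm{d}s
=\int_{0}^{t}e^{-\lambda u}u^{-\gamma}\,\mathrm{d}u
\leq\int_{0}^{\infty}e^{-\lambda u}u^{-\gamma}\,\mathrm{d}u
=\lambda^{\gamma-1}\int_{0}^{\infty}e^{-w}w^{-\gamma}\,\mathrm{d}w
=\lambda^{\gamma-1}\,\Gamma(1-\gamma).
\end{align*}
% The bound is uniform in t \in [0,T] and tends to 0 as \lambda \to \infty because \gamma < 1.
```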

To proceed, we have

\begin{align*}
\mathbf{A}_{5}
&= \mathbb{E}\bigg[\sup_{t\in[0,T]}e^{-\lambda t}\bigg|\int_{0}^{t}(\sigma(X^{\epsilon}_{s})-\sigma(\bar{X}_{s}))\mathrm{d}B^{H}_{s}\bigg|^{2}\bigg]+\mathbb{E}\bigg[\sup_{0\leq s<t\leq T}\frac{e^{-\lambda t}}{(t-s)^{2\gamma}}\bigg|\int_{s}^{t}(\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r}))\mathrm{d}B^{H}_{r}\bigg|^{2}\bigg]\\
&=: \mathbf{A}_{51}+\mathbf{A}_{52}.
\end{align*}

By (2.1) and Lemma 2.5, we have

\begin{align*}
&e^{-\lambda t}\bigg|\int_{s}^{t}(\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r}))\mathrm{d}B^{H}_{r}\bigg|^{2} \tag{4.25}\\
&\leq C_{\alpha,\beta}e^{-\lambda t}\bigg(\int_{s}^{t}\bigg(\frac{|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})|}{(r-s)^{\alpha}}+\int_{s}^{r}\frac{|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})-\sigma(X^{\epsilon}_{u})+\sigma(\bar{X}_{u})|}{(r-u)^{\alpha+1}}\mathrm{d}u\bigg)\frac{\||B^{H}\||_{\beta}}{(t-r)^{-\alpha-\beta+1}}\mathrm{d}r\bigg)^{2} \tag{4.26}\\
&\leq C_{\alpha,\beta}\||B^{H}\||_{\beta}^{2}\bigg(\int_{s}^{t}e^{-\frac{\lambda(t-r)}{2}}\frac{e^{-\frac{\lambda r}{2}}|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})|}{(r-s)^{\alpha}}(t-r)^{\alpha+\beta-1}\mathrm{d}r \tag{4.27}\\
&\quad+\int_{s}^{t}\int_{s}^{r}e^{-\frac{\lambda(t-r)}{2}}\frac{e^{-\frac{\lambda r}{2}}|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})-\sigma(X^{\epsilon}_{u})+\sigma(\bar{X}_{u})|(r-u)^{\gamma}}{(r-u)^{\gamma}(r-u)^{\alpha+1}}\mathrm{d}u(t-r)^{\alpha+\beta-1}\mathrm{d}r\bigg)^{2} \tag{4.28}\\
&\leq C_{\alpha,\beta}\||B^{H}\||_{\beta}^{2}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\bigg(\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r\bigg)\bigg(\int_{s}^{t}(r-s)^{-\alpha}(t-r)^{\alpha+2\beta-1}\mathrm{d}r\bigg) \tag{4.29}\\
&\quad+C_{\alpha,\beta,\gamma}\||B^{H}\||_{\beta}^{2}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}(1+\|X^{\epsilon}\|_{\gamma}^{2}+\|\bar{X}\|_{\gamma}^{2})\bigg(\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r\bigg)\bigg(\int_{s}^{t}(r-s)^{2\gamma-\alpha}(t-r)^{\alpha+2\beta-1}\mathrm{d}r\bigg) \tag{4.30}\\
&\leq C_{\alpha,\beta}\||B^{H}\||_{\beta}^{2}(t-s)^{2\beta}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r \tag{4.31}\\
&\quad+C_{\alpha,\beta,\gamma}\||B^{H}\||_{\beta}^{2}(t-s)^{2(\beta+\gamma)}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}(1+\|X^{\epsilon}\|_{\gamma}^{2}+\|\bar{X}\|_{\gamma}^{2})\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r \tag{4.32}
\end{align*}

where Lemma 7.1 in [25] implies that

\begin{align*}
|\sigma(X^{\epsilon}_{r})-\sigma(\bar{X}_{r})-\sigma(X^{\epsilon}_{u})+\sigma(\bar{X}_{u})|\leq|X^{\epsilon}_{r}-\bar{X}_{r}-X^{\epsilon}_{u}+\bar{X}_{u}|+|X^{\epsilon}_{r}-\bar{X}_{r}|(|X^{\epsilon}_{r}-X^{\epsilon}_{u}|+|\bar{X}_{r}-\bar{X}_{u}|)
\end{align*}

and, by the change of variable $v=\frac{r-s}{t-s}$, Lemma 8 in [5] and the fact that $\gamma<\beta$, it is easy to see that

\begin{align*}
&(t-s)^{2\beta}\int_{s}^{t}e^{-\lambda(t-r)}(r-s)^{-\alpha}(t-r)^{\alpha-1}\mathrm{d}r\\
&=(t-s)^{2\gamma}(t-s)^{2(\beta-\gamma)}\int_{0}^{1}e^{-\lambda(t-s)(1-v)}v^{-\alpha}(1-v)^{\alpha-1}\mathrm{d}v\leq(t-s)^{2\gamma}K(\lambda)
\end{align*}

where $K(\lambda)\rightarrow 0$ as $\lambda\rightarrow\infty$.
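One way to see that such a $K(\lambda)$ exists is sketched here. It is not necessarily the argument of Lemma 8 in [5]; it assumes $0<\alpha<1$ and $\gamma<\beta$, and the function $I$, the variable $\tau$ and the exponent $\theta$ are notation introduced only for this sketch.

```latex
% Hypothetical notation: \tau := t - s \in (0,T], \theta := \beta - \gamma > 0, and
\begin{align*}
I(x):=\int_{0}^{1}e^{-x(1-v)}v^{-\alpha}(1-v)^{\alpha-1}\,\mathrm{d}v ,\qquad x\geq 0 .
\end{align*}
% I is nonincreasing in x, I(0) = B(1-\alpha,\alpha) < \infty (Beta function),
% and I(x) \to 0 as x \to \infty by dominated convergence.
% Split according to the size of \tau:
\begin{align*}
\tau^{2\theta}I(\lambda\tau)\leq
\begin{cases}
\lambda^{-\theta}\,B(1-\alpha,\alpha), & \tau\leq\lambda^{-1/2},\\[2pt]
T^{2\theta}\,I(\lambda^{1/2}), & \tau>\lambda^{-1/2},
\end{cases}
\end{align*}
% so one admissible choice is
\begin{align*}
K(\lambda):=\max\big\{\lambda^{-\theta}B(1-\alpha,\alpha),\;T^{2\theta}I(\lambda^{1/2})\big\}\longrightarrow 0
\quad\text{as }\lambda\rightarrow\infty .
\end{align*}
```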

Then, by Lemma 4.1, we have

\begin{align*}
\mathbf{A}_{52}\leq C_{\alpha,\beta,\gamma,T}K(\lambda)\mathbb{E}[\||B^{H}\||_{\beta}^{2}\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}(1+\|X^{\epsilon}\|_{\gamma}^{2}+\|\bar{X}\|_{\gamma}^{2})\mathbf{1}_{D}]\leq C_{\alpha,\beta,\gamma,T,R}K(\lambda)\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big]. \tag{4.33}
\end{align*}

Arguing in the same way as for $\mathbf{A}_{52}$, we obtain for the first term $\mathbf{A}_{51}$

\begin{align*}
\mathbf{A}_{51}\leq C_{\alpha,\beta,\gamma,T,R}K(\lambda)\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big].
\end{align*}

Thus, we have

\begin{align*}
\mathbf{A}_{5}\leq C_{\alpha,\beta,\gamma,T,R}K(\lambda)\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big]. \tag{4.34}
\end{align*}

Summing up (4.13), (4.18), (4.19), (4.20) and (4.34), and using the fact that $\mathbb{P}\big\{\tau_{R}<T\big\}\leq R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]$ (see Lemma 4.7 in [27]), we obtain

\begin{align*}
\mathbf{A}
&\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2\beta}+C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2}+C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2-2\gamma}+C_{\alpha,\beta,\gamma,T,R}K(\lambda)\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\big]\\
&\quad+C_{\gamma,T}\lambda^{\gamma-1}\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big]+C_{\gamma,T}\lambda^{\gamma-1}\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D^{c}}\big]\\
&\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\delta^{2-2\gamma}+C_{\alpha,\beta,\gamma,T,R,|x_{0}|}\big(\lambda^{\gamma-1}+K(\lambda)\big)\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}\big]+C_{\gamma,T}\lambda^{\gamma-1}\sqrt{R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]}.
\end{align*}

Taking $\lambda$ large enough that $C_{\alpha,\beta,\gamma,T,R,|x_{0}|}\big(\lambda^{\gamma-1}+K(\lambda)\big)\vee C_{\gamma,T}\lambda^{\gamma-1}<1$, we have

\begin{align*}
\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\mathbf{1}_{D}\big]\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}e^{\lambda T}\delta^{2-2\gamma}+e^{\lambda T}\sqrt{R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]}.
\end{align*}
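The passage from the estimate of $\mathbf{A}$ to the bound above is an absorption argument. A sketch, assuming that $\mathbf{A}$ denotes $\mathbb{E}[\|X^{\epsilon}-\bar{X}\|_{\gamma,\lambda}^{2}\mathbf{1}_{D}]$ and that the weighted norm carries the factor $e^{-\lambda t}$ inside its suprema; the symbols $\theta$, $c_{1}$, $c_{2}$ and $\rho$ are placeholders introduced here, not notation from the paper.

```latex
% Placeholders: \theta := C(\lambda^{\gamma-1} + K(\lambda)) < 1,
%               \rho   := \sqrt{R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]}.
\begin{align*}
\mathbf{A}\leq c_{1}\delta^{2-2\gamma}+\theta\,\mathbf{A}+c_{2}\rho
\quad\Longrightarrow\quad
\mathbf{A}\leq\frac{c_{1}\delta^{2-2\gamma}+c_{2}\rho}{1-\theta}.
\end{align*}
% Removing the exponential weight: e^{-\lambda t} \geq e^{-\lambda T} on [0,T], hence
\begin{align*}
\|g\|_{\gamma}^{2}\leq e^{\lambda T}\|g\|_{\gamma,\lambda}^{2},
\qquad\text{so}\qquad
\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\mathbf{1}_{D}\big]\leq e^{\lambda T}\mathbf{A}.
\end{align*}
% The factor 1/(1-\theta) is absorbed into the constants.
```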

Next, we return to the second supremum on the right-hand side of inequality (4.9). By the Cauchy-Schwarz inequality, Lemma 4.1 and Lemma 4.7 in [27] again, we have

\begin{align*}
\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\mathbf{1}_{\{\tau_{R}<T\}}\big]\leq\Big(\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma}^{4}\big]\Big)^{1/2}\mathbb{P}\big\{\tau_{R}<T\big\}^{1/2}\leq C_{\alpha,\beta,\gamma,T,|x_{0}|}\sqrt{R^{-1}\mathbb{E}[\||B^{H}\||_{\beta}]}.
\end{align*}

Combining the above estimates and letting $R\rightarrow\infty$, we have

\begin{align*}
\lim_{\epsilon\rightarrow 0}\mathbb{E}\big[\|X^{\epsilon}-\bar{X}\|_{\gamma}^{2}\big]=0.
\end{align*}

This completes the proof.∎

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (NSF) under Grant No. 12172285, the NSF of Chongqing under Grant No. cstc2021jcyj-msxmX0296, the Shaanxi Fundamental Science Research Project for Mathematics and Physics under Grant No. 22JSQ027, and the Fundamental Research Funds for the Central Universities.

References

  • Bogoliubov and Mitropolski [1963] Bogoliubov, N.N., Mitropolski, Y.A.. Asymptotic methods in the theory of non-linear oscillations. Phys Today 1963;16(2):61–61.
  • Bolley et al. [2013] Bolley, F., Gentil, I., Guillin, A.. Uniform convergence to equilibrium for granular media. Arch Ration Mech Anal 2013;208:429–445.
  • Braun and Hepp [1977] Braun, W., Hepp, K.. The Vlasov dynamics and its fluctuations in the 1/N limit of interacting classical particles. Commun Math Phys 1977;56(2):101–113.
  • Cerrai and Freidlin [2009] Cerrai, S., Freidlin, M.. Averaging principle for a class of stochastic reaction–diffusion equations. Probab Theory Relat Fields 2009;144:137–177.
  • Chen et al. [2013] Chen, Y., Gao, H., Garrido-Atienza, M.J., Schmalfuss, B.. Pathwise solutions of SPDEs driven by Hölder-continuous integrators with exponent larger than 1/2 and random dynamical systems. Discrete Contin Dyn Syst Ser A 2013;34(1):79–98.
  • Cheng et al. [2022] Cheng, M., Hao, Z., Röckner, M.. Strong and weak convergence for averaging principle of DDSDE with singular drift. arXiv preprint arXiv:2207.12108 2022.
  • Dong et al. [2018] Dong, Z., Sun, X., Xiao, H., Zhai, J.. Averaging principle for one dimensional stochastic Burgers equation. J Differ Equ 2018;265(10):4749–4797.
  • Fan et al. [2022] Fan, X., Huang, X., Suo, Y., Yuan, C.. Distribution dependent SDEs driven by fractional Brownian motions. Stoch Process Their Appl 2022;151:23–67.
  • Friz and Hairer [2020] Friz, P.K., Hairer, M.. A course on rough paths. Springer, 2020.
  • Fu and Liu [2011] Fu, H., Liu, J.. Strong convergence in stochastic averaging principle for two time-scales stochastic partial differential equations. J Math Anal Appl 2011;384(1):70–86.
  • Gao [2018] Gao, P.. Averaging principle for the higher order nonlinear Schrödinger equation with a random fast oscillation. J Stat Phys 2018;171(5):897–926.
  • Guerra and Nualart [2008] Guerra, J., Nualart, D.. Stochastic differential equations driven by fractional Brownian motion and standard Brownian motion. Stoch Anal Appl 2008;26(5):1053–1075.
  • Hong et al. [2022] Hong, W., Li, S., Liu, W.. Strong convergence rates in averaging principle for slow-fast McKean-Vlasov SPDEs. J Differ Equ 2022;316:94–135.
  • Hu et al. [2021] Hu, K., Ren, Z., Šiška, D., Szpruch, Ł.. Mean-field Langevin dynamics and energy landscape of neural networks. In: Annales de l’Institut Henri Poincare (B) Probabilites et statistiques. Institut Henri Poincaré; volume 57; 2021. p. 2043–2065.
  • Hu and Nualart [2007] Hu, Y., Nualart, D.. Differential equations driven by Hölder continuous functions of order greater than 1/2. Stoch Anal Appl 2007;2:399–413.
  • Kac [1956] Kac, M.. Foundations of kinetic theory. In: Proceedings of The third Berkeley symposium on mathematical statistics and probability. volume 3; 1956. p. 171–197.
  • Khasminskii [1968] Khasminskii, R.. On an averaging principle for Itô stochastic differential equations. Kybernetika 1968;4(3):260–279.
  • Krylov and Bogoliubov [1950] Krylov, N.M., Bogoliubov, N.N.. Introduction to non-linear mechanics. Number 11. Princeton university press, 1950.
  • Lasry and Lions [2007] Lasry, J.M., Lions, P.L.. Mean field games. Japanese J Math 2007;2(1):229–260.
  • Liu [2010] Liu, D.. Strong convergence of principle of averaging for multiscale stochastic dynamical systems. Commun Math Sci 2010;8(4):999–1020.
  • Liu et al. [2020] Liu, W., Röckner, M., Sun, X., Xie, Y.. Averaging principle for slow-fast stochastic differential equations with time dependent locally Lipschitz coefficients. J Differ Equ 2020;268(6):2910–2948.
  • McKean [1966] McKean, P.. A class of Markov processes associated with nonlinear parabolic equations. Proceedings of the National Academy of Sciences of the United States of America 1966;56(6):1907–1907.
  • Mishura and Shevchenko [2012] Mishura, Y., Shevchenko, G.. Mixed stochastic differential equations with long-range dependence: Existence, uniqueness and convergence of solutions. Comput Math with Appl 2012;64(10):3217–3227.
  • Mishura and Posashkova [2011] Mishura, Y.S., Posashkova, S.. Stochastic differential equations driven by a Wiener process and fractional Brownian motion: Convergence in besov space with respect to a parameter. Comput Math with Appl 2011;62(3):1166–1180.
  • Nualart and Răşcanu [2002] Nualart, D., Răşcanu, A.. Differential equations driven by fractional Brownian motion. Collect Math 2002;53(1):55–81.
  • Pardoux and Veretennikov [2001] Pardoux, É., Veretennikov, Y.. On the Poisson equation and diffusion approximation. I. Ann Probab 2001;29(3):1061–1085.
  • Pei et al. [2020] Pei, B., Inahama, Y., Xu, Y.. Averaging principles for mixed fast-slow systems driven by fractional Brownian motion. arXiv preprint arXiv:2001.06945 2020.
  • Röckner and Zhang [2021] Röckner, M., Zhang, X.. Well-posedness of distribution dependent SDEs with singular drifts. Bernoulli 2021;27(2):1131–1158.
  • Shen et al. [2022] Shen, G., Xiang, J., Wu, J.L.. Averaging principle for distribution dependent stochastic differential equations driven by fractional Brownian motion and standard Brownian motion. J Differ Equ 2022;321:381–414.
  • Sun et al. [2022] Sun, X., Xie, L., Xie, Y.. Strong and weak convergence rates for slow–fast stochastic differential equations driven by $\alpha$-stable process. Bernoulli 2022;28(1):343–369.
  • Zähle [1998] Zähle, M.. Integration with respect to fractal functions and stochastic calculus. I. Probab Theory Relat Fields 1998;111:333–374.