
Moment stability of stochastic processes with applications to control systems

Arnab Ganguly Department of Mathematics, Louisiana State University, USA. [email protected]  and  Debasish Chatterjee Indian Institute of Technology Bombay, Systems & Control Engineering, India. [email protected]
Abstract.

We establish new conditions for obtaining uniform bounds on the moments of discrete-time stochastic processes. Our results require a weak negative-drift criterion along with a state-dependent restriction on the sizes of the one-step jumps of the processes. The state-dependent feature of the results makes them suitable for a large class of multiplicative-noise processes. Under the additional assumption of the Markov property, a new result on ergodicity is also proved. There are several applications to iterative systems, control systems, and other dynamical systems with state-dependent multiplicative noise, and we include illustrative examples to demonstrate the applicability of our results.

Key words and phrases:
moment bound, stability, ergodicity, Markov processes, control systems.
2010 Mathematics Subject Classification:
60J05, 60J20, 60F17, 93D05, 93E15
Research of A. Ganguly is supported in part by NSF DMS - 1855788 and Louisiana Board of Regents through the Board of Regents Support Fund (contract number: LEQSF(2016-19)-RD-A-04)

1. Introduction

The paper studies stability properties of a general class of discrete-time stochastic systems. Assessment of stability of dynamical systems is an important research area that has been studied extensively over the years. For example, in control theory a primary objective is to design suitable control policies that ensure appropriate stability properties (e.g., bounded variance) of the underlying controlled system. There are various notions of stability of a system. In mathematics, stability often refers to equilibrium stability, which, for deterministic dynamical systems, is mainly concerned with the qualitative behavior of trajectories of the system that start near the equilibrium point. For the stochastic counterpart, in the Markovian setting it usually involves the study of existence of invariant distributions and associated convergence and ergodic properties. A comprehensive source of results on different ergodicity properties for discrete-time Markov chains using Foster-Lyapunov functions is [15] (also see the references therein). Several extensions of such results have since been explored quite extensively in the literature (for example, see [17, 18]). Another important book in this area is [9], which uses expected occupation measures of the chain to identify conditions for stability.

The primary objective of the paper is to study moment stability, which concerns itself with uniform bounds on the moments of a general stochastic process X_{n} or, more generally, on expectations of the form \mathbb{E}(V(X_{n})) for a given function V. This is a bit different from the usual notions of stability in the Markovian setting mentioned in the previous paragraph, but they are not unrelated. Indeed, if the process \{X_{n}\} has a certain underlying Lyapunov structure, a strong form of Markovian stability holds which in particular implies moment stability. The result, which is based on the Foster-Lyapunov criterion, can be described as follows. Given a Markov chain \{X_{n}\}_{n\in\mathbb{N}} taking values in a Polish space \mathcal{S} with a transition probability kernel \mathcal{P}, suppose there exists a non-negative measurable function u:\mathcal{S}\rightarrow[0,\infty), called a Foster-Lyapunov function, such that the process \{u(X_{n})\}_{n\in\mathbb{N}} satisfies the following negative drift condition: for some constants b\geqslant 0, \theta>0, a set A\subset\mathcal{S}, and a function V:\mathcal{S}\rightarrow[0,\infty),

(1.1) \mathbb{E}\left[u(X_{n+1})-u(X_{n})\,|\,X_{n}=x\right]\equiv\int_{\mathcal{S}}\mathcal{P}(x,dy)u(y)-u(x)\leqslant-\theta V(x)+b\mathds{1}_{\{x\in A\}}.

If the set A is petite (roughly speaking, a petite set has the property that every set B is 'equally accessible' from any point inside it; for the definition and more details, see [16, 15]), the process \{X_{n}\} has a unique invariant distribution \pi, and moreover \pi(V)=\int_{\mathcal{S}}\pi(dx)V(x)<\infty. Furthermore, under aperiodicity, it can be concluded that the chain is Harris ergodic, that is,

\|\mathcal{P}^{n}(x,\cdot)-\pi\|_{V}\rightarrow 0,\quad\text{ as }n\rightarrow\infty,

where \|\cdot\|_{V} is the V-norm (see the definition at the end of the introduction) [15, Chapter 14]. In particular, one has \mathbb{E}[V(X_{n})]\rightarrow\pi(V) as n\rightarrow\infty (which of course implies boundedness of \mathbb{E}[V(X_{n})]). Thus for a Markov process \{X_{n}\}, one way to get a uniform bound on \mathbb{E}[V(X_{n})] is to find a Foster-Lyapunov function u such that (1.1) holds.
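As a quick numerical illustration (our own construction, not an example from the paper), consider the chain X_{n+1}=X_{n}/2+\xi_{n+1} with standard Gaussian noise and u(x)=V(x)=x^{2}. A Monte Carlo estimate of the left-hand side of (1.1) recovers the analytic drift -(3/4)x^{2}+1, so (1.1) holds with \theta=1/2, A=\{|x|\leqslant 2\}, and b=1:

```python
import numpy as np

rng = np.random.default_rng(0)

def drift(x: float, n_samples: int = 200_000) -> float:
    """Monte Carlo estimate of E[u(X_{n+1}) - u(X_n) | X_n = x] for u(x) = x^2."""
    xi = rng.standard_normal(n_samples)
    x_next = x / 2 + xi
    return float(np.mean(x_next**2)) - x**2

# Analytically the drift is -(3/4)x^2 + 1, so (1.1) holds with theta = 1/2,
# A = {|x| <= 2}, b = 1; check the inequality at a few points outside A.
for x in (3.0, 5.0, 10.0):
    assert drift(x) <= -0.5 * x**2 + 0.2
```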

The objective of the first part of the paper is to explore scenarios where a strong negative drift condition like (1.1) does not hold, or at least where such a Lyapunov function is not easy to find for a specific V. We note that the required conditions in our results are formulated in terms of the target function V itself. One pleasing aspect of this feature is that a search for a suitable Lyapunov function u is not required for applying these results.

Our main result, Theorem 2.2, deals with the general regime where the state process \{X_{n}\} is a general stochastic process and not necessarily Markovian. While past studies on stability mostly concern homogeneous Markov processes, the literature on more general processes, including non-homogeneous Markov processes and processes with long-range dependence, is rather limited. The starting point in Theorem 2.2 is a weaker negative-drift-like condition:

(1.2) \mathbb{E}\left(V(X_{n+1})-V(X_{n})\,|\,\mathcal{F}_{n}\right)\leqslant-A,\quad X_{n}\notin\mathcal{D},

which, if X_{n} is a homogeneous Markov chain, is of course equivalent to \mathcal{P}V(x)-V(x)\leqslant-A for x outside \mathcal{D}. As can be seen by comparing (1.2) with (1.1), even in the Markovian setting the results of [15, Chapter 14] will not imply \sup_{n}\mathbb{E}(V(X_{n}))<\infty. In fact, condition (1.2) is not enough to guarantee such an assertion even in a deterministic setting. For example, consider the sequence \{x_{n}\} on \mathbb{N} defined by

x_{n+1}=\begin{cases}x_{n}-1,&\text{ if }x_{n}>1,\\ n+1,&\text{ if }x_{n}=1.\end{cases}

Clearly, \sup_{n\geqslant 1}x_{n}=\infty, even though the negative drift condition is satisfied for \mathcal{D}=\{1\}. But we show in Theorem 2.2 that under a state-dependent restriction on the conditional moments of V(X_{n+1}) given \mathcal{F}_{n} (see Assumption 2.1 for details), the desired uniform moment bound can be achieved. Note that the above sequence \{x_{n}\} fails (2.1-c) of Assumption 2.1 but satisfies the other two conditions.
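The counterexample can be checked in a few lines (an illustrative script, not part of the paper):

```python
# The deterministic counterexample: unit negative drift outside D = {1},
# yet the jumps out of D grow with n, so sup_n x_n = infinity.
def step(x: int, n: int) -> int:
    return x - 1 if x > 1 else n + 1

x, peak = 1, 1
for n in range(1, 10_000):
    x = step(x, n)
    peak = max(peak, x)

print(peak)  # the running maximum keeps growing with the horizon
```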

In the (homogeneous) Markovian framework, Theorem 2.2 leads to a new result (cf. Theorem 2.8) on Harris ergodicity of Markov chains, which will be useful on occasions when a Foster-Lyapunov drift criterion in the form of (1.1) does not hold. Importantly, Theorem 2.8 does not require \mathcal{D} to be petite or prior checking of aperiodicity of the chain.

Theorem 2.2 is partly influenced by a result of Pemantle and Rosenthal [21], which established a uniform bound on \mathbb{E}(V^{r}(X_{n})) under (1.2) and the additional assumption of a constant bound on the conditional p-th moment of the one-step jumps of the process given \mathcal{F}_{n}, that is, on \mathbb{E}\left[|V(X_{n+1})-V(X_{n})|^{p}\,|\,\mathcal{F}_{n}\right]. However, for a large class of stochastic systems the latter requirement of a uniform bound on conditional moments of jump sizes cannot be fulfilled. In particular, our work is motivated by some problems on stability of a class of stochastic systems with multiplicative noise, where such conditions on one-step jumps are almost always state-dependent and can never be bounded by a constant. Our work generalizes the result of [21] in two important directions: it uses a different “metric” to control the one-step jumps, and it allows such jumps to be bounded by a suitable state-dependent function. Specifically, instead of \mathbb{E}\left[|V(X_{n+1})-V(X_{n})|^{p}\,|\,\mathcal{F}_{n}\right], we control the centered conditional p-th moment of V(X_{n+1}), that is, \mathbb{E}\left[\left|V(X_{n+1})-\mathbb{E}(V(X_{n+1})\,|\,\mathcal{F}_{n})\right|^{p}\,\Big|\,\mathcal{F}_{n}\right], in a state-dependent way. The latter quantity can be viewed as a distance between the actual position at time n+1, V(X_{n+1}), and the expected position at that time given the past information, \mathbb{E}(V(X_{n+1})\,|\,\mathcal{F}_{n}), while [21] uses the distance between the actual positions at times n+1 and n. These extensions require a different approach involving different auxiliary estimates. The advantages of this new ‘jump metric’ and the state-dependency feature are discussed in detail after the proof of Theorem 2.2. Together, they significantly increase the applicability of our result to a large class of stochastic systems.

This is demonstrated in Section 3, where a broad class of systems with multiplicative noise is studied and new stability results (see Proposition 3.2 and Corollary 3.4) are obtained. This, in particular, includes stochastic switching systems and Markov processes of the form X_{n+1}=H(X_{n})+G(X_{n})\xi_{n+1}. The last part of this section is devoted to the important problem of stabilization of stochastic linear systems with bounded control inputs. The problem of interest here is to find conditions which guarantee L^{2}-boundedness of a stochastic linear system of the form X_{n+1}=AX_{n}+Bu_{n}+\xi_{n+1} with bounded control input. This has been studied in a previous work of the second author (see [24] and references therein for more background on the problem), where it was shown that when (A,B) is stabilizable, there exists a k-history-dependent control policy which ensures bounded variance of such a system provided the norm of the control is sufficiently large. This requirement of a sufficiently large control norm is an artificial restriction on the design, and it was conjectured in [24] that it is not required, although a proof could not be provided. Here we show that this conjecture is indeed true (cf. Proposition 3.7), and the artificial restriction on the control norm can be lifted, largely owing to the new “metric” in Theorem 2.2. In fact, as Proposition 3.2 and Corollary 3.4 indicate, this stabilization result can easily be extended to cover more general classes of stochastic control systems, including those with multiplicative noise.
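As a rough illustration of the kind of boundedness at stake (a toy simulation with a memoryless saturated feedback of our own choosing, not the k-history policy of [24]), consider the scalar marginally stable system X_{n+1}=X_{n}+u_{n}+\xi_{n+1} with a clipped control and small Gaussian noise; the empirical second moment stays bounded:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy saturated-feedback simulation (memoryless policy, chosen here only for
# illustration): X_{n+1} = X_n + u_n + xi_{n+1}, u_n = -clip(X_n, -1, 1),
# xi ~ N(0, 0.25).  The empirical second moment settles without blow-up.
n_paths, n_steps = 50_000, 500
x = np.zeros(n_paths)
second_moments = []
for _ in range(n_steps):
    u = -np.clip(x, -1.0, 1.0)
    x = x + u + 0.5 * rng.standard_normal(n_paths)
    second_moments.append(float(np.mean(x**2)))

assert max(second_moments[100:]) < 5.0   # no blow-up after burn-in
```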

The article is organized as follows. The mathematical framework and the main results are described in Section 2. Section 3 discusses potential applications of our results to a large class of stochastic systems, including switching systems and multiplicative Markov models, which are especially relevant to control theory.

Notation and terminology: For a probability kernel P on \mathcal{S}\times\mathcal{S} and a function f:\mathcal{S}\rightarrow[0,\infty), the function Pf:\mathcal{S}\rightarrow[0,\infty) is defined by Pf(x)=\int_{\mathcal{S}}f(y)P(x,dy). In a similar spirit, for a measure \mu on \mathcal{S}, \mu(f) is defined by \mu(f)=\int_{\mathcal{S}}f(x)\mu(dx). For a signed measure \mu on \mathcal{S}, the corresponding total variation measure is denoted by |\mu|=\mu^{+}+\mu^{-}, where \mu=\mu^{+}-\mu^{-} is the Jordan decomposition. If \mu=\nu_{1}-\nu_{2}, where \nu_{1} and \nu_{2} are probability measures, the total variation distance \|\nu_{1}-\nu_{2}\|_{TV} is given by

\|\nu_{1}-\nu_{2}\|_{TV}=|\mu|(\mathcal{S})=2\sup_{A\in\mathcal{B}(\mathcal{S})}|\nu_{1}(A)-\nu_{2}(A)|.

More generally, if g:\mathcal{S}\rightarrow[0,\infty) is a measurable function, the g-norm of \mu=\nu_{1}-\nu_{2} is defined by \|\mu\|_{g}=\sup\{|\mu(f)|:f\text{ measurable and }0\leqslant f\leqslant g\}.

Throughout, we will work on an abstract probability space (\Omega,\mathcal{F},\mathbb{P}), and \mathbb{E} will denote the expectation operator under \mathbb{P}. In the context of the process \{X_{n}\}, \mathbb{E}_{x} will denote the conditional expectation given X_{0}=x.

2. Mathematical framework and main results

This section presents two main results: Theorem 2.2, on uniform bounds for functions of a general stochastic process, and Theorem 2.8, on ergodicity properties in the homogeneous Markovian setting. The mathematical framework pertains to a stochastic process \{X_{n}\} taking values in a topological space \mathcal{S} and involves negative drift conditions outside a set \mathcal{D}, together with a state-dependent control on the size of the one-step jumps of \{X_{n}\}.

2.1. Uniform bounds for moments of stochastic processes

Assumption 2.1.

There exist a constant A>0, measurable functions V:\mathcal{S}\rightarrow[0,\infty) and \varphi:\mathcal{S}\rightarrow[0,\infty), and a set \mathcal{D}\subset\mathcal{S} such that

  1. (2.1-a)

    for all n\in\mathbb{N},

    \mathbb{E}_{x_{0}}[V(X_{n+1})-V(X_{n})\mid\mathcal{F}_{n}]\leqslant-A\ \text{ on }\ \{X_{n}\notin\mathcal{D}\};
  2. (2.1-b)

    for all n\in\mathbb{N} and some p>2, the centered conditional p-th moment \Xi_{n} of V(X_{n+1}) given \mathcal{F}_{n} satisfies

    \Xi_{n}\doteq\mathbb{E}_{x_{0}}\Big[|V(X_{n+1})-\mathbb{E}(V(X_{n+1})|\mathcal{F}_{n})|^{p}\,\Big|\,\mathcal{F}_{n}\Big]\leqslant\varphi(X_{n}),

    where \varphi(x)\leqslant\mathscr{C}_{\varphi}(1+V^{s}(x)) for some 0\leqslant s<p/2-1 and some constant \mathscr{C}_{\varphi}>0.

  3. (2.1-c)

    \sup_{x\in\mathcal{D}}V(x)<\infty, and, for all n\in\mathbb{N} and some constant \bar{\mathscr{B}}_{0}({x_{0}}),

    \mathbb{E}_{x_{0}}\left[\left(\mathbb{E}[V(X_{n+1})|\mathcal{F}_{n}]\right)^{p}\mathds{1}_{\{X_{n}\in\mathcal{D}\}}\right]<\bar{\mathscr{B}}_{0}({x_{0}}).
Theorem 2.2.

Suppose that Assumption 2.1 holds for the process \{X_{n}\} with X_{0}={x_{0}}. Then

\mathscr{B}_{r}({x_{0}})\doteq\sup_{n\in\mathbb{N}}\mathbb{E}_{x_{0}}\bigl[V(X_{n})^{r}\bigr]<\infty

for any 0\leqslant r<\varsigma(s,p), where

\varsigma(s,p)=\begin{cases}p\left(1-\frac{s}{p-2}\right)-1,&\text{ for }s\in[0,(p-2)^{2}/2p)\cup[1-2/p,\,p/2-1)\text{ when }2<p<4;\\&\text{ for all }s\in[0,p/2-1)\text{ when }p\geqslant 4;\\ p-2,&\text{ for }(p-2)^{2}/2p\leqslant s<1-2/p\text{ when }2<p<4.\end{cases}
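For quick reference, the exponent \varsigma(s,p) can be transcribed directly into code (a plain transcription of the cases above, added for the reader's convenience):

```python
def varsigma(s: float, p: float) -> float:
    """Transcription of the moment exponent from Theorem 2.2."""
    assert p > 2 and 0 <= s < p / 2 - 1
    # middle band for 2 < p < 4 gives the flat value p - 2
    if 2 < p < 4 and (p - 2) ** 2 / (2 * p) <= s < 1 - 2 / p:
        return p - 2
    # otherwise (including all of p >= 4) the linear-in-s branch applies
    return p * (1 - s / (p - 2)) - 1

assert varsigma(0.0, 4.0) == 3.0      # p >= 4: always the first branch
assert varsigma(0.2, 3.0) == 1.0      # middle band for 2 < p < 4
```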
Remark 2.3.
  • The proof is a combination of Proposition 2.5 and Proposition 2.6. Proposition 2.5 first establishes a weaker version of the above assertion by showing that \sup_{n\in\mathbb{N}}\mathbb{E}_{x_{0}}\bigl[V(X_{n})^{r}\bigr]<\infty for all r<p/2-1. However, the extension of the result from there to all r<\varsigma(s,p) (notice that \varsigma(s,p)\geqslant p/2-1) requires a substantial amount of extra work and is achieved through Proposition 2.6.

  • Note that (2.1-c) is implied by the simpler condition: \mathbb{E}_{x_{0}}[V(X_{n+1})|\mathcal{F}_{n}]\leqslant\bar{\mathscr{B}}_{0} on \{X_{n}\in\mathcal{D}\} for some constant \bar{\mathscr{B}}_{0}.

Proof of Theorem 2.2.

From Proposition 2.5 and the growth assumption on \varphi, it follows that for any 1\leqslant\theta<(p-2)/2s, \sup_{n}\|\Xi_{n}\|_{\theta}\leqslant\sup_{n}\left(\mathbb{E}_{x_{0}}(\varphi^{\theta}(X_{n}))\right)^{1/\theta}<\infty, where \|\cdot\|_{\theta} is the \mathcal{L}^{\theta}(\Omega,\mathbb{P})-norm (cf. Proposition 2.6). The result now follows from Proposition 2.6 by letting \theta\uparrow(p-2)/2s. If s=0, that is, \Xi_{n}\leqslant\mathscr{C} a.s. for some constant \mathscr{C}, we take \theta=\infty in Proposition 2.6. ∎

At this stage it is instructive to compare Theorem 2.2 with [21, Theorem 1] and note precisely some of the improvements the former offers. The first significant extension is that Theorem 2.2 allows the jump-size bound in (2.1-b) to be state-dependent, whereas [21] requires

(†) \mathbb{E}_{x_{0}}\left[|V(X_{n+1})-V(X_{n})|^{p}\,|\,\mathcal{F}_{n}\right]\leqslant B,

for some constant B>0. The resulting benefits are obvious, as this in particular allows the result to be applicable to the large class of multiplicative systems of the form

X_{n+1}=H(X_{n})+G(X_{n})\xi_{n+1},

which [21, Theorem 1] does not cover. The second notable distinction is in the ‘metric’ used in (2.1-b) for controlling jump sizes: while [21] involves \mathbb{E}\left[|V(X_{n+1})-V(X_{n})|^{p}\,|\,\mathcal{F}_{n}\right], our result only requires controlling the centered conditional p-th moment of V(X_{n+1}) given \mathcal{F}_{n}, namely, \mathbb{E}_{x}\left[\big|V(X_{n+1})-\mathbb{E}[V(X_{n+1})|\mathcal{F}_{n}]\big|^{p}\,\Big|\,\mathcal{F}_{n}\right]. Of course, the latter leads to a weaker hypothesis, as

\mathbb{E}_{x_{0}}\left[\big|V(X_{n+1})-\mathbb{E}[V(X_{n+1})|\mathcal{F}_{n}]\big|^{p}\,\Big|\,\mathcal{F}_{n}\right]\leqslant 2^{p}\,\mathbb{E}_{x_{0}}\left[|V(X_{n+1})-V(X_{n})|^{p}\,|\,\mathcal{F}_{n}\right].

It is important to emphasize the advantages of the weaker hypothesis, as condition (†) precludes [21, Theorem 1] from being applicable even to some additive models. To illustrate this with a simple example, consider a [0,\infty)-valued process \{X_{n}\} given by

X_{n+1}=X_{n}/2+\xi_{n+1},\quad X_{0}\geqslant 0,

where the \xi_{n} are [0,\infty)-valued random variables with \mu_{p}=\sup_{n}\mathbb{E}(\xi_{n}^{p})<\infty for some p>2. Since X_{n+1}-X_{n}=-X_{n}/2+\xi_{n+1}, the negative drift condition (cf. (2.1-a)) clearly holds with V(x)=|x|, but for the jump sizes we can only have

\mathbb{E}_{x_{0}}\left[|X_{n+1}-X_{n}|^{p}\,|\,\mathcal{F}_{n}\right]=O(X_{n}^{p}).

This means that [21, Theorem 1] cannot be used to get \sup_{n}\mathbb{E}_{x}(X_{n})<\infty for this simple additive system, a fact which easily follows from an elementary iteration argument (note that \mathbb{E}_{x}(X_{n})\rightarrow 2\mu_{1} as n\rightarrow\infty). On the other hand, our theorem clearly covers such cases, as

\mathbb{E}_{x_{0}}\left[|X_{n+1}-\mathbb{E}\left(X_{n+1}|\mathcal{F}_{n}\right)|^{p}\,\Big|\,\mathcal{F}_{n}\right]\leqslant\bar{\mu}_{p},\quad\bar{\mu}_{p}=\sup_{n}\mathbb{E}|\xi_{n}-\mathbb{E}(\xi_{n})|^{p}.

It should be noted that had Theorem 2.2 simply controlled the jump sizes by imposing the more restrictive condition \mathbb{E}\left[|X_{n+1}-X_{n}|^{p}\,|\,\mathcal{F}_{n}\right]\leqslant\varphi(X_{n}), the state-dependency feature alone would not have been enough to salvage the moment bound for the above additive system (because of the requirement \varphi(x)=O(V^{s}(x)) for s<p/2-1). It is interesting to note that the results of [15] based on Foster-Lyapunov drift conditions also cannot be used directly in this simple example, as \{X_{n}\} is not necessarily Markov (since the \xi_{n} are not assumed to be i.i.d.). To summarize, the weaker jump metric coupled with the state-dependency feature makes Theorem 2.2 a rather powerful tool for understanding stability of a broad class of stochastic systems. Some important results in this direction for switching systems are discussed in the applications section.
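As a sanity check of the discussion above (an illustrative simulation, not part of the paper), one can simulate the additive system X_{n+1}=X_{n}/2+\xi_{n+1} and observe \mathbb{E}(X_{n}) settling near 2\mu_{1}; here the \xi_{n} are taken i.i.d. exponential with mean \mu_{1}=1, an assumption made only for this simulation:

```python
import numpy as np

rng = np.random.default_rng(1)

# X_{n+1} = X_n/2 + xi_{n+1} with xi i.i.d. Exp(1) (so mu_1 = 1, all moments
# finite); the iteration argument gives E_x(X_n) -> 2*mu_1 = 2.
n_paths, n_steps = 100_000, 60
x = np.full(n_paths, 10.0)          # X_0 = 10
for _ in range(n_steps):
    x = x / 2 + rng.exponential(1.0, n_paths)

print(float(np.mean(x)))  # close to 2
```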

The following lemma will be used in various necessary estimates.

Lemma 2.4.

Let \{M_{n}\} be a martingale relative to the filtration \{\mathcal{F}_{n}\}, let

(2.3) \gamma_{n}\stackrel{\text{def}}{=}\mathbb{E}\bigl[\lvert M_{n+1}-M_{n}\rvert^{p}\,\big|\,\mathcal{F}_{n}\bigr],\quad n\geqslant 0,

let \Theta be a non-negative random variable, and let b>0 be a constant. Then for some constants \mathscr{C}_{0} and \mathscr{C}_{00}:

  1. (a)

    \mathbb{E}\left[|M_{n}-M_{k}|^{p}\,|\,\mathcal{F}_{k}\right]\leqslant\mathscr{C}_{0}(n-k)^{\frac{p}{2}-1}\sum_{m=k}^{n-1}\mathbb{E}[\gamma_{m}|\mathcal{F}_{k}];

  2. (b)

    for 0\leqslant r<p, \mathbb{E}\left[(|M_{n}-M_{k}|+\Theta)^{r}\mathds{1}_{\{|M_{n}-M_{k}|+\Theta>b\}}\,|\,\mathcal{F}_{k}\right]\leqslant\mathscr{C}_{00}\left((n-k)^{\frac{p}{2}-1}\sum_{m=k}^{n-1}\mathbb{E}\left[\gamma_{m}|\mathcal{F}_{k}\right]+\mathbb{E}\left[\Theta^{p}\,|\,\mathcal{F}_{k}\right]\right)b^{r-p}.

Proof.

Note that by Burkholder's inequality (see, e.g., [23]), there exists c_{p}>0 such that

\mathbb{E}\bigl[\lvert M_{n}-M_{k}\rvert^{p}\,\big|\,\mathcal{F}_{k}\bigr]\leqslant c_{p}\,\mathbb{E}\left[\biggl(\sum_{m=k}^{n-1}\lvert M_{m+1}-M_{m}\rvert^{2}\biggr)^{p/2}\,\bigg|\,\mathcal{F}_{k}\right].

Now by Hölder’s inequality and by (2.3)

\mathbb{E}\bigl[\lvert M_{n}-M_{k}\rvert^{p}\,\big|\,\mathcal{F}_{k}\bigr]\leqslant c_{p}(n-k)^{\frac{p}{2}-1}\sum_{m=k}^{n-1}\mathbb{E}\left[\lvert M_{m+1}-M_{m}\rvert^{p}\,|\,\mathcal{F}_{k}\right]
(2.4) \leqslant c_{p}(n-k)^{\frac{p}{2}-1}\sum_{m=k}^{n-1}\mathbb{E}\left[\gamma_{m}\,|\,\mathcal{F}_{k}\right].

Now observe that for a random variable Y_{n}, by Hölder's inequality and Markov's inequality (\mathbb{P}(|Y_{n}|>b\,|\,\mathcal{F}_{k})\leqslant\mathbb{E}\left[|Y_{n}|^{p}\,|\,\mathcal{F}_{k}\right]/b^{p}), we have for r<p and n\geqslant k

\mathbb{E}\left[|Y_{n}|^{r}\mathds{1}_{\{|Y_{n}|>b\}}\,|\,\mathcal{F}_{k}\right]\leqslant\mathbb{E}\left[|Y_{n}|^{p}\,|\,\mathcal{F}_{k}\right]^{r/p}\,\mathbb{P}(|Y_{n}|>b\,|\,\mathcal{F}_{k})^{(p-r)/p}\leqslant\mathbb{E}\left[|Y_{n}|^{p}\,|\,\mathcal{F}_{k}\right]/b^{p-r}.

Taking Y_{n}=|M_{n}-M_{k}|+\Theta, we have

\mathbb{E}\left[(|M_{n}-M_{k}|+\Theta)^{r}\mathds{1}_{\{|M_{n}-M_{k}|+\Theta>b\}}\,|\,\mathcal{F}_{k}\right]\leqslant 2^{p-1}\left(\mathbb{E}\left[|M_{n}-M_{k}|^{p}\,|\,\mathcal{F}_{k}\right]+\mathbb{E}\left[\Theta^{p}\,|\,\mathcal{F}_{k}\right]\right)/b^{p-r},

and part (b) follows from (2.4). ∎
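The scaling in part (a) can be checked numerically on the simplest example (a sketch with our own choice of walk, not part of the proof): for a \pm 1 random walk, \gamma_{m}=1 and, for p=4, the exact fourth moment \mathbb{E}M_{n}^{4}=3n^{2}-2n is consistent with the n^{p/2} growth of the bound:

```python
import numpy as np

rng = np.random.default_rng(4)

# For a +/-1 random walk M_n one has gamma_m = 1, and for p = 4 the exact
# fourth moment is E M_n^4 = 3n^2 - 2n, matching the n^{p/2} scaling of (a).
n, p = 50, 4
steps = rng.choice([-1.0, 1.0], size=(200_000, n))
m_n = steps.sum(axis=1)
empirical = float(np.mean(np.abs(m_n) ** p))

assert abs(empirical - (3 * n**2 - 2 * n)) < 500   # Monte Carlo tolerance
```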

We now prove the two propositions which form the backbone of our main result, Theorem 2.2.

Proposition 2.5.

Suppose that Assumption 2.1 holds. Then for any 0\leqslant r<p/2-1,

\mathscr{B}_{r}({x_{0}})\doteq\sup_{n\in\mathbb{N}}\mathbb{E}_{x_{0}}\bigl[V(X_{n})^{r}\bigr]<\infty.
Proof of Proposition 2.5.

Fix an r\in(s,p/2-1); observe that it is enough to prove the result for such an r. Writing \varphi(x)=\varphi(x)\mathds{1}_{\{|V(x)|\leqslant M\}}+(\varphi(x)/V^{r}(x))V^{r}(x)\mathds{1}_{\{|V(x)|>M\}}, we can say, because of the growth assumption on \varphi (cf. (2.1-b)) and since s<r, that for every \varepsilon>0 there exists a constant \mathscr{C}_{1}(\varepsilon) such that \varphi(x)\leqslant\mathscr{C}_{1}(\varepsilon)+\varepsilon V^{r}(x).

The constants appearing in the various estimates below will be denoted by \mathscr{C}_{i}; they will not depend on n but may depend on the parameters of the system and the initial position x_{0}.

Define \mathscr{M}_{0}=0 and

\mathscr{M}_{n}=\sum_{j=0}^{n-1}\bigl(V(X_{j+1})-\mathbb{E}_{x_{0}}[V(X_{j+1})|\mathcal{F}_{j}]\bigr),\quad n\geqslant 1.

Then \mathscr{M}_{n} is a martingale. Fix N\in\mathbb{N}, and define the last time \{X_{k}\} is in \mathcal{D} up to time N:

\eta\equiv\max\{k\leqslant N\mid X_{k}\in\mathcal{D}\},\quad\text{with }\eta=-\infty\text{ if }X_{k}\notin\mathcal{D}\text{ for all }k\leqslant N.

Notice that \{\eta=k\}=\{X_{k}\in\mathcal{D}\}\cap\bigcap_{j=k+1}^{N}\{X_{j}\notin\mathcal{D}\}. On \{\eta=k\}, for k<n\leqslant N,

\mathscr{M}_{n}-\mathscr{M}_{k}= V(X_{n})-V(X_{k})-\sum_{j=k}^{n-1}\left(\mathbb{E}_{x_{0}}[V(X_{j+1})|\mathcal{F}_{j}]-V(X_{j})\right)
\geqslant V(X_{n})-V(X_{k})-\left(\mathbb{E}_{x_{0}}[V(X_{k+1})|\mathcal{F}_{k}]-V(X_{k})\right)+A(n-k-1)
(2.5) = V(X_{n})+A(n-k-1)-\mathbb{E}_{x_{0}}[V(X_{k+1})|\mathcal{F}_{k}].

It follows that on \{\eta=k\},

V(X_{N})^{r}\leqslant\left(|\mathscr{M}_{N}-\mathscr{M}_{k}|+\xi_{k}\right)^{r},\quad\text{and}\quad A(N-k-1)\leqslant|\mathscr{M}_{N}-\mathscr{M}_{k}|+\xi_{k},

where \xi_{k}=\mathbb{E}_{x_{0}}[V(X_{k+1})|\mathcal{F}_{k}]\mathds{1}_{\{X_{k}\in\mathcal{D}\}}.

On \{\eta=-\infty\}, which corresponds to the case that the chain, starting outside \mathcal{D}, never enters \mathcal{D} by time N, we have

V(X_{N})^{r}\leqslant\left(|\mathscr{M}_{N}-\mathscr{M}_{0}|+V({x_{0}})\right)^{r},\quad\text{and}\quad AN\leqslant|\mathscr{M}_{N}-\mathscr{M}_{0}|+V({x_{0}}).

Thus for k\leqslant N-2,

\mathbb{E}_{x_{0}}[V(X_{N})^{r}\mathds{1}_{\{\eta=k\}}]\leqslant \mathbb{E}_{x_{0}}\left[\left(|\mathscr{M}_{N}-\mathscr{M}_{k}|+\xi_{k}\right)^{r}\mathds{1}_{\{\eta=k\}}\right]
\leqslant \mathbb{E}_{x_{0}}\left[\left(|\mathscr{M}_{N}-\mathscr{M}_{k}|+\xi_{k}\right)^{r}\mathds{1}_{\{|\mathscr{M}_{N}-\mathscr{M}_{k}|+\xi_{k}\geqslant A(N-k-1)\}}\right]
\leqslant 2^{p-1}\left(c_{p}(N-k)^{\frac{p}{2}-1}\sum_{m=k}^{N-1}\mathbb{E}_{x_{0}}\left[\varphi(X_{m})\right]+\mathbb{E}_{x_{0}}\left[\xi_{k}^{p}\right]\right)/(A(N-k-1))^{p-r}
\leqslant \mathscr{C}_{2}\left((N-k)^{r-1-p/2}\sum_{m=k}^{N-1}\mathbb{E}_{x_{0}}\left[\varphi(X_{m})\right]+(N-k)^{r-p}\right),

where we used (a) (2.1-c), (b) Lemma 2.4 together with the observation that

\mathbb{E}_{x_{0}}\left[\lvert\mathscr{M}_{n+1}-\mathscr{M}_{n}\rvert^{p}\right]=\mathbb{E}_{x_{0}}\left[\lvert V(X_{n+1})-\mathbb{E}[V(X_{n+1})|\mathcal{F}_{n}]\rvert^{p}\right]\leqslant\mathbb{E}_{x_{0}}[\varphi(X_{n})]

and (c) the fact that \sup_{m\geqslant 2}m/(m-1)=2.

Similarly, on \{\eta=-\infty\},

\mathbb{E}_{x_{0}}[V(X_{N})^{r}\mathds{1}_{\{\eta=-\infty\}}]\leqslant \mathbb{E}_{x_{0}}\left[\left(|\mathscr{M}_{N}|+V({x_{0}})\right)^{r}\mathds{1}_{\{|\mathscr{M}_{N}|+V({x_{0}})\geqslant AN\}}\right]
\leqslant 2^{p-1}\left(c_{p}N^{\frac{p}{2}-1}\sum_{m=0}^{N-1}\mathbb{E}_{x_{0}}\left[\varphi(X_{m})\right]+V({x_{0}})^{p}\right)/N^{p-r}
\leqslant 2^{p-1}\left(c_{p}N^{r-1-\frac{p}{2}}\sum_{m=0}^{N-1}\mathbb{E}_{x_{0}}\left[\varphi(X_{m})\right]+V({x_{0}})^{p}N^{r-p}\right).

Next, note that because of (2.1-b),

\mathbb{E}_{x_{0}}[V^{p}(X_{N})|\mathcal{F}_{N-1}]\mathds{1}_{\{X_{N-1}\in\mathcal{D}\}}\leqslant 2^{p-1}\left(\left(\mathbb{E}_{x_{0}}[V(X_{N})|\mathcal{F}_{N-1}]\right)^{p}\mathds{1}_{\{X_{N-1}\in\mathcal{D}\}}+\sup_{x\in\mathcal{D}}\varphi(x)\right),

which by (2.1-c) of course implies that for any q\leqslant p,

\mathbb{E}_{x_{0}}[V(X_{N})^{q}\mathds{1}_{\{\eta=N-1\}}]\leqslant\mathbb{E}_{x_{0}}[V^{q}(X_{N})\mathds{1}_{\{X_{N-1}\in\mathcal{D}\}}]\leqslant\mathscr{C}_{3}.

Lastly,

\mathbb{E}_{x_{0}}[V(X_{N})^{r}\mathds{1}_{\{\eta=N\}}]\leqslant\mathbb{E}_{x_{0}}[V(X_{N})^{r}\mathds{1}_{\{X_{N}\in\mathcal{D}\}}]\leqslant\sup_{z\in\mathcal{D}}V^{r}(z).

Thus,

\mathbb{E}_{x_{0}}[V(X_{N})^{r}]= \sum_{k=0}^{N}\mathbb{E}_{x_{0}}[V(X_{N})^{r}\mathds{1}_{\{\eta=k\}}]+\mathbb{E}_{x_{0}}[V(X_{N})^{r}\mathds{1}_{\{\eta=-\infty\}}]
\leqslant \sum_{k=0}^{N-2}\mathbb{E}_{x_{0}}[V(X_{N})^{r}\mathds{1}_{\{\eta=k\}}]+\mathscr{C}_{3}+\sup_{z\in\mathcal{D}}V^{r}(z)+\mathbb{E}_{x_{0}}[V(X_{N})^{r}\mathds{1}_{\{\eta=-\infty\}}]
\leqslant \mathscr{C}_{4}(1+V^{p}({x_{0}}))\zeta(p-r)+\mathscr{C}_{4}\sum_{k=0}^{N-2}(N-k)^{r-1-p/2}\sum_{m=k}^{N-1}\mathbb{E}_{x_{0}}\left[\varphi(X_{m})\right]
\leqslant \mathscr{C}_{5}(1+V^{p}({x_{0}}))+\mathscr{C}_{4}\sum_{k=0}^{N-2}(N-k)^{r-1-p/2}\sum_{m=k}^{N-1}\mathbb{E}_{x_{0}}\left[\mathscr{C}_{1}(\varepsilon)+\varepsilon V^{r}(X_{m})\right]
\leqslant \mathscr{C}_{6}(\varepsilon)(1+V^{p}({x_{0}}))+\mathscr{C}_{4}\varepsilon\sum_{m=0}^{N-1}\beta^{N}_{m}\mathbb{E}_{x_{0}}\left[V^{r}(X_{m})\right],

where \zeta denotes the Riemann zeta function, the choice of \varepsilon will be specified shortly, and \beta^{N}_{m}=\sum_{k=0}^{m}(N-k)^{r-1-p/2}. Iterating, we have

(2.6) \mathbb{E}_{x_{0}}\bigl[V(X_{N})^{r}\bigr] \leqslant\mathscr{C}_{6}(\varepsilon)(1+V^{p}({x_{0}}))\biggl(1+\mathscr{C}_{4}\varepsilon\sum_{l_{1}=0}^{N-1}\beta^{N}_{l_{1}}+(\mathscr{C}_{4}\varepsilon)^{2}\sum_{l_{1}=0}^{N-1}\beta^{N}_{l_{1}}\sum_{l_{2}=0}^{l_{1}-1}\beta_{l_{2}}^{l_{1}}+\cdots
+(\mathscr{C}_{4}\varepsilon)^{N-1}\beta^{N}_{N-1}\beta^{N-1}_{N-2}\ldots\beta^{2}_{1}\beta^{1}_{0}\biggr)(1+V^{r}({x_{0}})).

Notice that for any k>0, since r<p/2-1,

\sum_{l=0}^{k-1}\beta^{k}_{l}=\sum_{l=0}^{k-1}\sum_{j=0}^{l}(k-j)^{r-1-p/2}=\sum_{j=0}^{k-1}(k-j)^{r-p/2}\leqslant\zeta(p/2-r).

Choosing \varepsilon so that \mathscr{C}_{4}\varepsilon\zeta(p/2-r)<1, (2.6) yields

\mathbb{E}_{x_{0}}\bigl[V(X_{N})^{r}\bigr]\leqslant\frac{\mathscr{C}_{6}(\varepsilon)(1+V^{p}({x_{0}}))(1+V^{r}({x_{0}}))}{1-\mathscr{C}_{4}\varepsilon\zeta(p/2-r)},

and the assertion follows.
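The partial-sum bound used in the last step can be sanity-checked numerically. The following Python sketch (illustrative only; the values of p, r and k are hypothetical choices satisfying r < p/2 - 1) compares the double sum of the beta's with a truncation of the Riemann zeta series:

```python
def beta_sum_vs_zeta(p=6.0, r=1.5, k=50):
    """Compare sum_{l=0}^{k-1} beta_l^k, where beta_l^k = sum_{j=0}^{l}
    (k - j)^(r - 1 - p/2), against a truncation of zeta(p/2 - r); the
    bound requires r < p/2 - 1 so that the zeta series converges."""
    assert r < p / 2.0 - 1.0
    beta_total = sum((k - j) ** (r - 1.0 - p / 2.0)
                     for l in range(k) for j in range(l + 1))
    zeta_trunc = sum(n ** (-(p / 2.0 - r)) for n in range(1, 200000))
    return beta_total, zeta_trunc
```

Swapping the order of summation collapses the double sum to a sum of i^(r - p/2) over i = 1, ..., k, which is why the zeta bound appears.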

The next proposition helps extend the above result from any r<p/21r<p/2-1 to ς(s,p)\varsigma(s,p) as stipulated in Theorem 2.2. However, it is also a stand-alone result that is applicable to certain models where Theorem 2.2 is not directly applicable. These are cases where one does not directly have a good estimate of the conditional centered moment Ξn\Xi_{n} as required in Theorem 2.6, but has suitable upper bounds for its θ\|\cdot\|_{\theta} norm. As a simple example, let XnX_{n} be a stochastic process taking values in [𝔠0,)[-\mathfrak{c}_{0},\infty), whose temporal evolution is given by

Xn+1=𝔠1+Xn/2+YnX_{n+1}=\mathfrak{c}_{1}+X_{n}/2+Y_{n}

where 𝔠0\mathfrak{c}_{0} and 𝔠1\mathfrak{c}_{1} are (real-valued) constants, and {Yn}\{Y_{n}\} is an n\mathcal{F}_{n}-adapted martingale difference process (that is, 𝔼(Yn+1|n)=0\mathbb{E}(Y_{n+1}|\mathcal{F}_{n})=0) with supn𝔼(|Yn|p)<\sup_{n}\mathbb{E}(|Y_{n}|^{p})<\infty for some p>2p>2. Then Theorem 2.2 is not applicable, but the following proposition can be applied with θ=1\theta=1 to V(x)=x+𝔠0.V(x)=x+\mathfrak{c}_{0}.
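To illustrate, a small Monte Carlo sketch of this recursion (not from the paper; the noise distribution and all parameter values are hypothetical choices with uniformly bounded p-th moments for p = 3) tracks the empirical r-th moment of |X_n| for r below the resulting threshold p/2 - 1:

```python
import random

def empirical_moment(r=0.4, c1=1.0, n_steps=200, n_paths=2000, seed=0):
    """Monte Carlo sketch of the recursion X_{n+1} = c1 + X_n/2 + Y_n,
    with Y_n mean-zero and heavy-tailed so that sup_n E|Y_n|^p < inf for
    p = 3 (here: symmetrized Pareto-type noise with tail index 3.5, an
    illustrative choice).  Returns the largest empirical value of
    E|X_n|^r seen over the run."""
    rng = random.Random(seed)

    def draw_noise():
        mag = rng.random() ** (-1.0 / 3.5) - 1.0    # Pareto(3.5)-type magnitude
        return mag if rng.random() < 0.5 else -mag  # symmetrize => mean zero

    xs = [0.0] * n_paths
    worst = 0.0
    for _ in range(n_steps):
        xs = [c1 + x / 2.0 + draw_noise() for x in xs]
        worst = max(worst, sum(abs(x) ** r for x in xs) / n_paths)
    return worst
```

With r = 0.4 below the threshold p/2 - 1 = 1/2, the running maximum stabilizes instead of growing with the horizon, in line with the proposition below applied with θ = 1.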

Proposition 2.6.

Let Ξn𝔼x0[|V(Xn+1)𝔼(V(Xn+1)|n)|p|n]\Xi_{n}\equiv\mathbb{E}_{x_{0}}\left[|V(X_{n+1})-\mathbb{E}(V(X_{n+1})|\mathcal{F}_{n})|^{p}|\mathcal{F}_{n}\right] denote the centered conditional pp-th moment of V(Xn+1)V(X_{n+1}) given n\mathcal{F}_{n}. Assume that (2.1-a) and (2.1-c) of Assumption 2.1 hold, and that for some p>2p>2, some θ[1,]\theta\in[1,\infty] and some constant 0<¯θ(x0)<0<\bar{\mathscr{B}}_{\theta}({x_{0}})<\infty,

Ξnθ=𝔼x0[Ξnθ]1/θ¯θ(x0), for all n0.\displaystyle\|\Xi_{n}\|_{\theta}=\mathbb{E}_{x_{0}}\left[\Xi_{n}^{\theta}\right]^{1/\theta}\leqslant\bar{\mathscr{B}}_{\theta}({x_{0}}),\quad\text{ for all }n\geqslant 0.

Then r(x0)supn𝔼x0[V(Xn)r]<\displaystyle\mathscr{B}_{r}({x_{0}})\doteq\sup_{n\in\mathbb{N}}\mathbb{E}_{x_{0}}\bigl{[}V(X_{n})^{r}\bigr{]}<\infty for 0r<ς¯(θ,p),0\leqslant r<\bar{\varsigma}(\theta,p), where

ς¯(θ,p)={p(112θ)1,for θ[1,p2](pp2,] when 2<p<4,for any θ1 when p4;p2,for θ(p2,pp2] when 2<p<4.\displaystyle\bar{\varsigma}(\theta,p)=\begin{cases}p\left(1-\frac{1}{2\theta}\right)-1,&\quad\text{for }\theta\in\left[1,\frac{p}{2}\right]\cup\left(\frac{p}{p-2},\infty\right]\text{ when }2<p<4,\\ &\quad\text{for any }\ \theta\geqslant 1\text{ when }p\geqslant 4;\\ p-2,&\quad\text{for }\theta\in\left(\frac{p}{2},\frac{p}{p-2}\right]\text{ when }2<p<4.\end{cases}

Here θ=\theta=\infty corresponds to the case that Ξn=𝔼x0[|V(Xn+1)𝔼(V(Xn+1)|n)|p|n]¯\Xi_{n}=\mathbb{E}_{x_{0}}\left[|V(X_{n+1})-\mathbb{E}(V(X_{n+1})|\mathcal{F}_{n})|^{p}|\mathcal{F}_{n}\right]\leqslant\bar{\mathscr{B}} a.s., for some constant ¯>0\bar{\mathscr{B}}>0.
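For convenience, the case analysis defining the threshold can be encoded directly; the helper below is an illustrative transcription (not part of the original text; `float('inf')` encodes θ = ∞):

```python
def sigma_bar(theta, p):
    """Moment-exponent threshold from Proposition 2.6: uniform bounds on
    E[V(X_n)^r] hold for all 0 <= r < sigma_bar(theta, p)."""
    if p <= 2 or theta < 1:
        raise ValueError("requires p > 2 and theta >= 1")
    general = p * (1.0 - 1.0 / (2.0 * theta)) - 1.0  # theta = inf gives p - 1
    if 2 < p < 4 and p / 2.0 < theta <= p / (p - 2.0):
        return p - 2.0   # the intermediate range of theta when 2 < p < 4
    return general
```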

Proof of Proposition 2.6.

The constants appearing in various estimates below (besides the ones that appeared before) will be denoted by C^i\hat{C}_{i}’s. They will not depend on nn but may depend on the parameters of the system and the initial position x0{x_{0}}.

Define n\mathscr{M}_{n}, η\eta and ξk\xi_{k} as in the proof of Proposition 2.5. Fix NN, 0kN0\leqslant k\leqslant N, and define ςς(N,k)\varsigma\equiv\varsigma(N,k) by

ς=inf{jk:jk+ξkA(Nk1)/2}.\displaystyle\varsigma=\inf\{j\geqslant k:\mathscr{M}_{j}-\mathscr{M}_{k}+\xi_{k}\geqslant A(N-k-1)/2\}.

Clearly, ςN\varsigma\leqslant N. For j>kj>k, notice that on {ς=j}\{\varsigma=j\},

j1k+ξkA(Nk1)/2,\displaystyle\mathscr{M}_{j-1}-\mathscr{M}_{k}+\xi_{k}\leqslant A(N-k-1)/2,

and hence on {η=k}{ς=j}\{\eta=k\}\cap\{\varsigma=j\}

Nj1A(Nk1)/2+V(XN).\displaystyle\mathscr{M}_{N}-\mathscr{M}_{j-1}\geqslant A(N-k-1)/2+V(X_{N}).

It follows that for j>kj>k

𝔼x0[V(XN)𝟙{η=k}𝟙{ς=j}]𝔼x0[|Nj1|r𝟙{|Nj1|>A(Nk1)/2}𝟙{|j1k|+ξk>A(jk2)0}𝟙{ς=j}].\displaystyle\mathbb{E}_{x_{0}}[V(X_{N})\mathds{1}_{\{\eta=k\}}\mathds{1}_{\{\varsigma=j\}}]\leqslant\mathbb{E}_{x_{0}}\left[|\mathscr{M}_{N}-\mathscr{M}_{j-1}|^{r}\mathds{1}_{\{|\mathscr{M}_{N}-\mathscr{M}_{j-1}|>A(N-k-1)/2\}}\mathds{1}_{\{|\mathscr{M}_{j-1}-\mathscr{M}_{k}|+\xi_{k}>A(j-k-2)\vee 0\}}\mathds{1}_{\{\varsigma=j\}}\right].

Notice that 𝒮j𝔼x0[|Nj1|r𝟙|Nj1|>A(Nk1)/2|j]\mathcal{S}_{j}\equiv\mathbb{E}_{x_{0}}\left[|\mathscr{M}_{N}-\mathscr{M}_{j-1}|^{r}\mathds{1}_{|\mathscr{M}_{N}-\mathscr{M}_{j-1}|>A(N-k-1)/2}|\mathcal{F}_{j}\right] can be estimated by Lemma 2.4 as

𝒮j\displaystyle\mathcal{S}_{j}\leqslant (2/A(Nk1))pr𝔼x0[|Nj1|p|j]\displaystyle\ (2/A(N-k-1))^{p-r}\mathbb{E}_{x_{0}}\left[|\mathscr{M}_{N}-\mathscr{M}_{j-1}|^{p}|\mathcal{F}_{j}\right]
\displaystyle\leqslant C^0[𝔼x0[|Nj|p|j]+|jj1|p]/(Nk1)pr\displaystyle\ \hat{C}_{0}\left[\mathbb{E}_{x_{0}}\left[|\mathscr{M}_{N}-\mathscr{M}_{j}|^{p}|\mathcal{F}_{j}\right]+|\mathscr{M}_{j}-\mathscr{M}_{j-1}|^{p}\right]/(N-k-1)^{p-r}
\displaystyle\leqslant C^0[𝒞0(Nj)p21𝔼x0[l=jN1Ξl|j]+|jj1|p]/(Nk1)pr.\displaystyle\ \hat{C}_{0}\left[\mathscr{C}_{0}(N-j)^{\frac{p}{2}-1}\mathbb{E}_{x_{0}}\left[\sum_{l=j}^{N-1}\Xi_{l}|\mathcal{F}_{j}\right]+|\mathscr{M}_{j}-\mathscr{M}_{j-1}|^{p}\right]/(N-k-1)^{p-r}.

Also, for ς=k\varsigma=k by Lemma 2.4,

𝔼x0[V(XN)𝟙{η=k}𝟙{ς=k}]\displaystyle\mathbb{E}_{x_{0}}[V(X_{N})\mathds{1}_{\{\eta=k\}}\mathds{1}_{\{\varsigma=k\}}]\leqslant 𝔼x0[𝟙{ς=k}𝔼x0[(|Nk|+ξk)r𝟙||Nk|+ξk>A(Nk1)|k]]\displaystyle\ \mathbb{E}_{x_{0}}\left[\mathds{1}_{\{\varsigma=k\}}\mathbb{E}_{x_{0}}\left[(|\mathscr{M}_{N}-\mathscr{M}_{k}|+\xi_{k})^{r}\mathds{1}_{||\mathscr{M}_{N}-\mathscr{M}_{k}|+\xi_{k}>A(N-k-1)}|\mathcal{F}_{k}\right]\right]
\displaystyle\leqslant C^1𝔼x0[𝟙{ς=k}((Nk1)rp/21l=kN1𝔼x0[Ξl|k]+(Nk1)rp|ξk|p)].\displaystyle\ \hat{C}_{1}\mathbb{E}_{x_{0}}\left[\mathds{1}_{\{\varsigma=k\}}\left((N-k-1)^{r-p/2-1}\sum_{l=k}^{N-1}\mathbb{E}_{x_{0}}\left[\Xi_{l}|\mathcal{F}_{k}\right]+(N-k-1)^{r-p}|\xi_{k}|^{p}\right)\right].

Hence,

𝔼x0[V(XN)𝟙{η=k}]=\displaystyle{}\mathbb{E}_{x_{0}}[V(X_{N})\mathds{1}_{\{\eta=k\}}]= j=kN𝔼x0[V(XN)𝟙{η=k}𝟙{ς=j}]C^1(Nk1)rp/21𝔼x0[𝟙{ς=k}l=kN1𝔼x0[Ξl|k]]\displaystyle\ \sum_{j=k}^{N}\mathbb{E}_{x_{0}}[V(X_{N})\mathds{1}_{\{\eta=k\}}\mathds{1}_{\{\varsigma=j\}}]\leqslant\ \hat{C}_{1}(N-k-1)^{r-p/2-1}\mathbb{E}_{x_{0}}\left[\mathds{1}_{\{\varsigma=k\}}\sum_{l=k}^{N-1}\mathbb{E}_{x_{0}}\left[\Xi_{l}|\mathcal{F}_{k}\right]\right]
+C^1(Nk1)rp¯(x0)+j=k+1N𝔼x0[𝟙{ς=j}𝟙{|j1k|+ξk>A(jk2)}𝒮j]\displaystyle\hskip 14.22636pt+\hat{C}_{1}(N-k-1)^{r-p}\bar{\mathscr{B}}({x_{0}})+\sum_{j=k+1}^{N}\mathbb{E}_{x_{0}}\left[\mathds{1}_{\{\varsigma=j\}}\mathds{1}_{\{|\mathscr{M}_{j-1}-\mathscr{M}_{k}|+\xi_{k}>A(j-k-2)\}}\mathcal{S}_{j}\right]
\displaystyle{}\leqslant C^2[(Nk1)rp21j=kNl=jN1𝔼x0[Ξl𝟙{ς=j}]+(Nk1)rp\displaystyle\ \hat{C}_{2}\Big{[}(N-k-1)^{r-\frac{p}{2}-1}\sum_{j=k}^{N}\sum_{l=j}^{N-1}\mathbb{E}_{x_{0}}\left[\Xi_{l}\mathds{1}_{\{\varsigma=j\}}\right]+(N-k-1)^{r-p}
(2.7) +(Nk1)rpj=k+1N|jj1|p𝟙{|j1k|+ξk>A(jk2)0}].\displaystyle\hskip 8.5359pt+(N-k-1)^{r-p}\sum_{j=k+1}^{N}|\mathscr{M}_{j}-\mathscr{M}_{j-1}|^{p}\mathds{1}_{\{|\mathscr{M}_{j-1}-\mathscr{M}_{k}|+\xi_{k}>A(j-k-2)\vee 0\}}\Big{]}.

We next estimate the above two terms separately; for that we need the following bound, which is an immediate consequence of Doob’s maximal inequality and the assumption:

(2.8) x0(maxkjl|jk|+ξk>Υ)\displaystyle\mathbb{P}_{x_{0}}\left(\max_{k\leqslant j\leqslant l}|\mathscr{M}_{j}-\mathscr{M}_{k}|+\xi_{k}>\Upsilon\right)\leqslant 𝔼x0[maxkjl|jk|+ξk]p/ΥpC^3((lk)p/2+¯0(x))/Υp\displaystyle\ \mathbb{E}_{x_{0}}\left[\max_{k\leqslant j\leqslant l}|\mathscr{M}_{j}-\mathscr{M}_{k}|+\xi_{k}\right]^{p}/\Upsilon^{p}\leqslant\hat{C}_{3}\left((l-k)^{p/2}+\bar{\mathscr{B}}_{0}(x)\right)/\Upsilon^{p}

Now notice that

j=kN1l=jN1𝔼x0[Ξl𝟙{ς=j}]=\displaystyle{}\sum_{j=k}^{N-1}\sum_{l=j}^{N-1}\mathbb{E}_{x_{0}}\left[\Xi_{l}\mathds{1}_{\{\varsigma=j\}}\right]= l=kN1j=kl𝔼x0[Ξl𝟙{ς=j}]=l=kN1𝔼x0[Ξl𝟙{ςl}]l=kN1Ξlθx0(ςl)1/θ\displaystyle\ \sum_{l=k}^{N-1}\sum_{j=k}^{l}\mathbb{E}_{x_{0}}\left[\Xi_{l}\mathds{1}_{\{\varsigma=j\}}\right]=\sum_{l=k}^{N-1}\mathbb{E}_{x_{0}}\left[\Xi_{l}\mathds{1}_{\{\varsigma\leqslant l\}}\right]\leqslant\ \sum_{l=k}^{N-1}\|\Xi_{l}\|_{\theta}\mathbb{P}_{x_{0}}(\varsigma\leqslant l)^{1/\theta^{*}}
\displaystyle{}\leqslant ¯θ(x)l=kN1x0(maxkjl(|jk|+ξk)>A(Nk1)/2)1/θ\displaystyle\ \bar{\mathscr{B}}_{\theta}(x)\sum_{l=k}^{N-1}\mathbb{P}_{x_{0}}\left(\max_{k\leqslant j\leqslant l}\left(|\mathscr{M}_{j}-\mathscr{M}_{k}|+\xi_{k}\right)>A(N-k-1)/2\right)^{1/\theta^{*}}
\displaystyle{}\leqslant (2A)p/θ¯θ(x)l=kN1(C^3((lk)p/2+¯0(x))(Nk1)p)1/θ\displaystyle\left(\frac{2}{A}\right)^{p/\theta^{*}}\bar{\mathscr{B}}_{\theta}(x)\sum_{l=k}^{N-1}\left(\frac{\hat{C}_{3}\left((l-k)^{p/2}+\bar{\mathscr{B}}_{0}(x)\right)}{(N-k-1)^{p}}\right)^{1/\theta^{*}}
(2.9) \displaystyle\leqslant C^4(Nk1)p2θ+1=C^4(Nk1)p2(11θ)+1,\displaystyle\ \hat{C}_{4}(N-k-1)^{-\frac{p}{2\theta^{*}}+1}=\hat{C}_{4}(N-k-1)^{-\frac{p}{2}\left(1-\frac{1}{\theta}\right)+1},

where 1θ+1θ=1\frac{1}{\theta}+\frac{1}{\theta^{*}}=1.

Next notice that the term

𝒜j=k+1N𝔼x0[|jj1|p𝟙{|j1k|+ξk>A(jk2)}]\mathcal{A}\equiv\sum_{j=k+1}^{N}\mathbb{E}_{x_{0}}\left[|\mathscr{M}_{j}-\mathscr{M}_{j-1}|^{p}\mathds{1}_{\{|\mathscr{M}_{j-1}-\mathscr{M}_{k}|+\xi_{k}>A(j-k-2)\}}\right]

can be estimated as

𝒜\displaystyle{}\mathcal{A}\leqslant Ξk+1θ+Ξk+2θ+j=k+3N𝔼x0[Ξj1𝟙{|j1k|+ξk>A(jk2)}]\displaystyle\ \|\Xi_{k+1}\|_{\theta}+\|\Xi_{k+2}\|_{\theta}+\sum_{j=k+3}^{N}\mathbb{E}_{x_{0}}\left[\Xi_{j-1}\mathds{1}_{\{|\mathscr{M}_{j-1}-\mathscr{M}_{k}|+\xi_{k}>A(j-k-2)\}}\right]
\displaystyle{}\leqslant 2¯θ(x)+j=k+3NΞj1θx0(|j1k|+ξk>A(jk2))1/θ\displaystyle\ 2\bar{\mathscr{B}}_{\theta}(x)+\sum_{j=k+3}^{N}\|\Xi_{j-1}\|_{\theta}\mathbb{P}_{x_{0}}\left(|\mathscr{M}_{j-1}-\mathscr{M}_{k}|+\xi_{k}>A(j-k-2)\right)^{1/\theta^{*}}
\displaystyle{}\leqslant ¯θ(x)[2+Ap/θj=k+3N(C^3((jk1)p/2+¯0(x))/(jk2)p)1/θ],\displaystyle\ \bar{\mathscr{B}}_{\theta}(x)\left[2+A^{-p/\theta^{*}}\sum_{j=k+3}^{N}\left(\hat{C}_{3}\left((j-k-1)^{p/2}+\bar{\mathscr{B}}_{0}(x)\right)/(j-k-2)^{p}\right)^{1/\theta^{*}}\right],
\displaystyle{}\leqslant C^5[1+j=k+3N1/(jk2)p/2θ]\displaystyle\ \hat{C}_{5}\left[1+\sum_{j=k+3}^{N}1/(j-k-2)^{p/2\theta^{*}}\right]
(2.10) \displaystyle\leqslant {C^6, if p/2θ=p(11/θ)/2>1,C^7(Nk1), otherwise,\displaystyle\ \begin{cases}\hat{C}_{6},&\quad\text{ if }p/2\theta^{*}=p(1-1/\theta)/2>1,\\ \hat{C}_{7}(N-k-1),&\quad\text{ otherwise},\end{cases}

where the third inequality is by (2.8). We now consider some cases.

Case 1: θp/2\theta\leqslant p/2: Suppose that r<p(112θ)1r<p\left(1-\frac{1}{2\theta}\right)-1. Notice that in this case, this implies pr1>p2θ1.p-r-1>\frac{p}{2\theta}\geqslant 1. It follows from (2.7), (2.9) and (2.10) (second case) that

𝔼x0[Vr(XN)]=\displaystyle\mathbb{E}_{x_{0}}[V^{r}(X_{N})]= k=0N𝔼x0[Vr(XN)𝟙{η=k}]k=0N2𝔼x0[Vr(XN)𝟙{η=k}]+𝔼x0[Vr(XN)𝟙{XN1𝒟}]+supx𝒟Vr(x)\displaystyle\ \sum_{k=0}^{N}\mathbb{E}_{x_{0}}[V^{r}(X_{N})\mathds{1}_{\{\eta=k\}}]\leqslant\sum_{k=0}^{N-2}\mathbb{E}_{x_{0}}[V^{r}(X_{N})\mathds{1}_{\{\eta=k\}}]+\mathbb{E}_{x_{0}}[V^{r}(X_{N})\mathds{1}_{\{X_{N-1}\in\mathcal{D}\}}]+\sup_{x\in\mathcal{D}}V^{r}(x)
\displaystyle\leqslant C^8[k=0N2(Nk1)rp21p2(11θ)+1+k=0N2(Nk1)rp+k=0N2(Nk1)rp+1]\displaystyle\ \hat{C}_{8}\left[\sum_{k=0}^{N-2}(N-k-1)^{r-\frac{p}{2}-1-\frac{p}{2}\left(1-\frac{1}{\theta}\right)+1}+\sum_{k=0}^{N-2}(N-k-1)^{r-p}+\sum_{k=0}^{N-2}(N-k-1)^{r-p+1}\right]
+𝒞3+supx𝒟Vr(x)\displaystyle\ +\mathscr{C}_{3}+\sup_{x\in\mathcal{D}}V^{r}(x)
=\displaystyle= C^8(ζ(p(112θ)r)+ζ(pr)+ζ(pr1))+𝒞3+supx𝒟Vr(x).\displaystyle\ \hat{C}_{8}\left(\zeta\left(p\left(1-\frac{1}{2\theta}\right)-r\right)+\zeta(p-r)+\zeta(p-r-1)\right)+\mathscr{C}_{3}+\sup_{x\in\mathcal{D}}V^{r}(x).

Case 2: θ>p/2\theta>p/2 and p4p\geqslant 4: Suppose that r<p(112θ)1r<p\left(1-\frac{1}{2\theta}\right)-1. Notice that θ>p/2\theta>p/2 and p4p\geqslant 4 imply that p/2θ=p(11/θ)/2>1p/2\theta^{*}=p(1-1/\theta)/2>1. As in the previous case, it follows from (2.7), (2.9) and (2.10) (first case) that

𝔼x0[Vr(XN)]\displaystyle\mathbb{E}_{x_{0}}[V^{r}(X_{N})]\leqslant k=0N2𝔼x0[Vr(XN)𝟙{η=k}]+𝔼x0[Vr(XN)𝟙{XN1𝒟}]+supx𝒟Vr(x)\displaystyle\ \sum_{k=0}^{N-2}\mathbb{E}_{x_{0}}[V^{r}(X_{N})\mathds{1}_{\{\eta=k\}}]+\mathbb{E}_{x_{0}}[V^{r}(X_{N})\mathds{1}_{\{X_{N-1}\in\mathcal{D}\}}]+\sup_{x\in\mathcal{D}}V^{r}(x)
\displaystyle\leqslant C^9(k=0N2(Nk)rp(112θ)+k=0N1(Nk1)rp)+𝒞3+supx𝒟Vr(x)C^10.\displaystyle\ \hat{C}_{9}\left(\sum_{k=0}^{N-2}(N-k)^{r-p\left(1-\frac{1}{2\theta}\right)}+\sum_{k=0}^{N-1}(N-k-1)^{r-p}\right)+\mathscr{C}_{3}+\sup_{x\in\mathcal{D}}V^{r}(x)\leqslant\ \hat{C}_{10}.

The other cases in the assertion follow similarly once we observe that θ>p/(p2)p/2θ>1\theta>p/(p-2)\Leftrightarrow p/2\theta^{*}>1 and for 2<p<42<p<4, p/2<p/(p2)p/2<p/(p-2).
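The equivalence invoked in this last step is elementary algebra with the Hölder conjugate θ* = θ/(θ-1); a quick numerical confirmation over an illustrative grid:

```python
def case_boundary_equivalent(p, theta):
    """Check theta > p/(p-2)  <=>  p/(2*theta_star) > 1, where theta_star
    is the Holder conjugate of theta (1/theta + 1/theta_star = 1)."""
    theta_star = theta / (theta - 1.0)   # requires theta > 1
    return (theta > p / (p - 2.0)) == (p / (2.0 * theta_star) > 1.0)
```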

2.2. Ergodicity of Markov processes

Theorem 2.2 leads to the following result on Harris ergodicity of Markov processes.

Definition 2.7.

A function V:[0,)V:\mathcal{E}\rightarrow[0,\infty) is inf-compact if the level sets 𝒦m={x:V(x)m}\mathcal{K}_{m}=\{x:V(x)\leqslant m\} are compact for all m0m\geqslant 0.

Note that an inf-compact function VV is lower-semicontinuous.

Theorem 2.8.

Let {Xn}\{X_{n}\} be a Markov process taking values in a locally compact separable space \mathcal{E} with transition kernel 𝒫\mathcal{P}. Suppose for an inf-compact function V:[0,)V:\mathcal{E}\rightarrow[0,\infty), the following conditions hold:

  1. (2.8-a)

    for all nn\in\mathbb{N},

    𝒫V(x)V(x)A,on {x𝒟};\mathcal{P}V(x)-V(x)\leqslant-A,\quad\text{on }\ \{x\notin\mathcal{D}\};
  2. (2.8-b)

    for some p>2p>2

    𝒫|V()𝒫V(x)|p(x)=|V(y)𝒫V(x)|p𝒫(x,dy)φ(x),\mathcal{P}|V(\cdot)-\mathcal{P}V(x)|^{p}(x)=\int|V(y)-\mathcal{P}V(x)|^{p}\mathcal{P}(x,dy)\leqslant\varphi(x),

    where φ:[0,]\varphi:\mathcal{E}\rightarrow[0,\infty] satisfies φ(x)𝒞φ(1+Vs(x))\varphi(x)\leqslant\mathscr{C}_{\varphi}(1+V^{s}(x)) for some s<p/21s<p/2-1 and some constant 𝒞φ>0\mathscr{C}_{\varphi}>0. This is of course the same as requiring 𝔼[|V(Xn+1)𝒫V(Xn)|p|n]φ(Xn).\mathbb{E}\left[\big{|}V(X_{n+1})-\mathcal{P}V(X_{n})\big{|}^{p}\Big{|}\mathcal{F}_{n}\right]\leqslant\varphi(X_{n}).

  3. (2.8-c)

    supx𝒟V(x)<,\sup_{x\in\mathcal{D}}V(x)<\infty, and supx𝒟𝒫V(x)<,\sup_{x\in\mathcal{D}}\mathcal{P}V(x)<\infty,

Also, suppose that

  1. (2.8-d)

    𝒫\mathcal{P} is weak Feller, ψ\psi-irreducible, and admits a density qq with respect to some Radon measure μ\mu, that is, 𝒫(x,dy)=q(x,y)μ(dy)\mathcal{P}(x,dy)=q(x,y)\mu(dy), and that for every compact set 𝒦\mathcal{K}, there exists a constant 𝔠𝒦,0\mathfrak{c}_{\mathcal{K},0} such that

    supy𝒦q(x,y)𝔠𝒦,0(1+Vr(x)).\sup_{y\in\mathcal{K}}q(x,y)\leqslant\mathfrak{c}_{\mathcal{K},0}\left(1+V^{r}(x)\right).

Then

  1. (i)

    Under (2.8-a) - (2.8-c), supn𝔼x0(Vr(Xn))supn𝒫nVr(x0)<\sup_{n}\mathbb{E}_{x_{0}}(V^{r}(X_{n}))\equiv\sup_{n}\mathcal{P}^{n}V^{r}({x_{0}})<\infty for any 0r<ς(s,p)0\leqslant r<\varsigma(s,p), where ς(s,p)\varsigma(s,p) is as in Theorem 2.2.

  2. (ii)

    Under the additional assumption (2.8-d), {Xn}\{X_{n}\} is positive Harris recurrent (PHR) and aperiodic with a unique invariant distribution π\pi, and for any x0x_{0} and r(0,ς(s,p))r\in(0,\varsigma(s,p)),

    (2.11) (Vr+1)d|𝒫n(x0,)π|0as n;\displaystyle\int(V^{r}+1)d|\mathcal{P}^{n}({x_{0}},\cdot)-\pi|\rightarrow 0\quad\text{as }n\rightarrow\infty;

    or equivalently,

    (2.12) 𝒫n(x0,)πVr+1supf:|f|Vr+1|𝒫nf(x0)π(f)|0,as n.\displaystyle\|\mathcal{P}^{n}({x_{0}},\cdot)-\pi\|_{V^{r}+1}\doteq\sup_{f:|f|\leqslant V^{r}+1}|\mathcal{P}^{n}f({x_{0}})-\pi(f)|\rightarrow 0,\quad\text{as }n\rightarrow\infty.
Proof.

(i) follows from Theorem 2.2. Since VV is inf-compact, it follows from (i) that for every x0x_{0}, the family {𝒫n(x0,)}\{\mathcal{P}^{n}(x_{0},\cdot)\} is tight; let π\pi be one of its limit points. Since 𝒫\mathcal{P} is weak Feller, by the Krylov-Bogolyubov theorem [22, Theorem 7.1], π\pi is invariant for 𝒫\mathcal{P}, and uniqueness of π\pi follows from the assumption of ψ\psi-irreducibility [9, Proposition 4.2.2]. Hence, for every x0x_{0}, 𝒫n(x0,)π\mathcal{P}^{n}(x_{0},\cdot)\Rightarrow\pi (along the full sequence) as nn\rightarrow\infty.

For (ii) we start by establishing the following claim.

Claim: Suppose that fVr+1f\leqslant V^{r}+1 for some r(0,ς(s,p))r\in(0,\varsigma(s,p)). Then 𝒫nf(x0)π(f)\mathcal{P}^{n}f(x_{0})\rightarrow\pi(f) as nn\rightarrow\infty for any x0x_{0}\in\mathcal{E}.

Since VV is lower semi-continuous we have by (generalized) Fatou’s lemma,

π(Vr)lim infn𝒫nVr(x0)r(x0)\displaystyle\pi(V^{r})\leqslant\liminf_{n\rightarrow\infty}\mathcal{P}^{n}V^{r}(x_{0})\leqslant\mathscr{B}_{r}(x_{0})

for any r(0,ς(s,p))r\in(0,\varsigma(s,p)). Now let fVr+1f\leqslant V^{r}+1 for some r(0,ς(s,p))r\in(0,\varsigma(s,p)) and fix ε>0\varepsilon>0.

Since {𝒫n(x0,)}\{\mathcal{P}^{n}(x_{0},\cdot)\} is tight, for a given ε~>0\tilde{\varepsilon}>0, there exists a compact set 𝒦\mathcal{K} (which depends on x0x_{0} and which we take of the form 𝒦m={x:V(x)m}\mathcal{K}_{m}=\{x:V(x)\leqslant m\} for sufficiently large mm) such that

supn𝒫n(x0,𝒦c)ε~, and π(𝒦c)ε~.\sup_{n}\mathcal{P}^{n}(x_{0},\mathcal{K}^{c})\leqslant\tilde{\varepsilon},\quad\text{ and }\quad\pi(\mathcal{K}^{c})\leqslant\tilde{\varepsilon}.

Now by Hölder’s inequality

𝒫nf1𝒦c(x0)=\displaystyle{}\mathcal{P}^{n}f1_{\mathcal{K}^{c}}(x_{0})= f(y)1𝒦c(y)𝒫n(x0,dy)(Vr(y)+1)1𝒦c(y)𝒫n(x0,dy)\displaystyle\ \int f(y)1_{\mathcal{K}^{c}}(y)\mathcal{P}^{n}(x_{0},dy)\leqslant\int(V^{r}(y)+1)1_{\mathcal{K}^{c}}(y)\mathcal{P}^{n}(x_{0},dy)
\displaystyle{}\leqslant (Vr(y)𝒫n(x0,dy))r/r(𝟙𝒦c(y)𝒫n(x0,dy))1r/r+𝒫n(x0,𝒦c),\displaystyle\ \left(\int V^{r^{\prime}}(y)\mathcal{P}^{n}(x_{0},dy)\right)^{r/r^{\prime}}\left(\int\mathds{1}_{\mathcal{K}^{c}}(y)\mathcal{P}^{n}(x_{0},dy)\right)^{1-r/r^{\prime}}+\mathcal{P}^{n}(x_{0},\mathcal{K}^{c}),
(2.13) \displaystyle\leqslant rr/r(x0)ε~1r/r+ε~\displaystyle\ \mathscr{B}_{r^{\prime}}^{r/r^{\prime}}(x_{0})\tilde{\varepsilon}^{1-r/r^{\prime}}+\tilde{\varepsilon}

for some r<r<ς(s,p)r<r^{\prime}<\varsigma(s,p). Similarly, π(f𝟙𝒦c)rr/r(x)ε~1r/r+ε~\pi(f\mathds{1}_{\mathcal{K}^{c}})\leqslant\mathscr{B}_{r^{\prime}}^{r/r^{\prime}}(x)\tilde{\varepsilon}^{1-r/r^{\prime}}+\tilde{\varepsilon}.

Since f𝟙𝒦L1(μ)f\mathds{1}_{\mathcal{K}}\in L^{1}(\mu), there exist {hm}Cc(,)\{h_{m}\}\subset C_{c}(\mathcal{E},\mathbb{R}) such that hmf1𝒦h_{m}\rightarrow f1_{\mathcal{K}} in L1(μ)L^{1}(\mu) as mm\rightarrow\infty, and supx|hm(x)|supx𝒦|f(x)|\sup_{x}|h_{m}(x)|\leqslant\sup_{x\in\mathcal{K}}|f(x)| for m1m\geqslant 1. In fact, we can choose {hm}\{h_{m}\} such that supp(hm)𝒦𝒦supp(h_{m})\subset\mathcal{K}^{\prime}\supset\mathcal{K} for some compact set 𝒦\mathcal{K}^{\prime}.

Observe that for xx\in\mathcal{E} and y𝒦y\in\mathcal{K}^{\prime},

qn(x,y)=\displaystyle q^{n}(x,y)= qn1(x,z)q(z,y)𝑑μ(z)qn1(x,z)𝔠𝒦,0(1+Vr(z))𝑑μ(z)\displaystyle\ \int q^{n-1}(x,z)q(z,y)d\mu(z)\leqslant\int q^{n-1}(x,z)\mathfrak{c}_{\mathcal{K^{\prime}},0}\left(1+V^{r}(z)\right)d\mu(z)
\displaystyle\leqslant 𝔠𝒦,0(1+𝔼x(Vr(Xn1)))𝔠𝒦,0(1+r(x))𝒞𝒦(x).\displaystyle\ \mathfrak{c}_{\mathcal{K^{\prime}},0}\left(1+\mathbb{E}_{x}(V^{r}(X_{n-1}))\right)\leqslant\mathfrak{c}_{\mathcal{K^{\prime}},0}\left(1+\mathscr{B}_{r}(x)\right)\equiv\mathscr{C}_{\mathcal{K}^{\prime}}(x).

Hence

supn|𝒫nf𝟙𝒦(x0)𝒫nhm|\displaystyle{}\sup_{n}|\mathcal{P}^{n}f\mathds{1}_{\mathcal{K}}(x_{0})-\mathcal{P}^{n}h_{m}|\leqslant 𝒦|f(y)𝟙𝒦(y)hm(y)|qn(x0,y)𝑑μ(y)\displaystyle\ \int_{\mathcal{K}^{\prime}}|f(y)\mathds{1}_{\mathcal{K}}(y)-h_{m}(y)|q^{n}(x_{0},y)d\mu(y)
(2.14) \displaystyle\leqslant 𝒞𝒦(x0)f1𝒦hm1.\displaystyle\ \mathscr{C}_{\mathcal{K}^{\prime}}(x_{0})\|f1_{\mathcal{K}}-h_{m}\|_{1}.

Next, notice that π\pi is absolutely continuous with respect to μ\mu. Indeed, if μ(A)=0\mu(A)=0, then 𝒫(x,A)=0\mathcal{P}(x,A)=0, and hence π(A)=π(dx)𝒫(x,A)=0\pi(A)=\int\pi(dx)\mathcal{P}(x,A)=0. Let g=dπ/dμg=d\pi/d\mu. For any M>0M>0,

|π(hm)π(f1𝒦)|\displaystyle{}|\pi(h_{m})-\pi(f1_{\mathcal{K}})|\leqslant M|hmf𝟙𝒦|𝟙{gM}𝑑μ+|hmf𝟙𝒦|g𝟙{gM}𝑑μ\displaystyle\ M\int|h_{m}-f\mathds{1}_{\mathcal{K}}|\mathds{1}_{\{g\leqslant M\}}d\mu+\int|h_{m}-f\mathds{1}_{\mathcal{K}}|g\mathds{1}_{\{g\geqslant M\}}d\mu
(2.15) \displaystyle\leqslant Mhmf𝟙𝒦1+2supx𝒦|f(x)|g𝟙{gM}𝑑μ.\displaystyle M\|h_{m}-f\mathds{1}_{\mathcal{K}}\|_{1}+2\sup_{x\in\mathcal{K}}|f(x)|\int g\mathds{1}_{\{g\geqslant M\}}d\mu.

Write

𝒫nf(x0)π(f)=\displaystyle{}\mathcal{P}^{n}f(x_{0})-\pi(f)= (𝒫nf1𝒦(x0)𝒫nhm(x0))+(𝒫nhm(x0)π(hm))+(π(hm)π(f1𝒦(x0)))\displaystyle\ \left(\mathcal{P}^{n}f1_{\mathcal{K}}(x_{0})-\mathcal{P}^{n}h_{m}(x_{0})\right)+\left(\mathcal{P}^{n}h_{m}(x_{0})-\pi(h_{m})\right)+\left(\pi(h_{m})-\pi(f1_{\mathcal{K}}(x_{0}))\right)
(2.16) +𝒫nf1𝒦c(x0)π(f1𝒦c(x0)),\displaystyle\ +\mathcal{P}^{n}f1_{\mathcal{K}^{c}}(x_{0})-\pi(f1_{\mathcal{K}^{c}}(x_{0})),

and choose 𝒦\mathcal{K} such that (2.13) holds for ε~\tilde{\varepsilon} where ε~\tilde{\varepsilon} is chosen such that rr/r(x0)ε~1r/r+ε~ε/10\mathscr{B}_{r^{\prime}}^{r/r^{\prime}}(x_{0})\tilde{\varepsilon}^{1-r/r^{\prime}}+\tilde{\varepsilon}\leqslant\varepsilon/10. Since g𝑑μ=1\int gd\mu=1, choose sufficiently large MM such that g𝟙{gM}𝑑με/(20supx𝒦|f(x)|)\int g\mathds{1}_{\{g\geqslant M\}}d\mu\leqslant\varepsilon/(20\sup_{x\in\mathcal{K}}|f(x)|), then a sufficiently large mm such that

f1𝒦hm1(ε/5𝒞𝒦(x0))(ε/10M).\|f1_{\mathcal{K}}-h_{m}\|_{1}\leqslant(\varepsilon/5\mathscr{C}_{\mathcal{K}^{\prime}}(x_{0}))\wedge(\varepsilon/10M).

Finally, since 𝒫n(x0,)π\mathcal{P}^{n}(x_{0},\cdot)\Rightarrow\pi, and hmCc(,)h_{m}\in C_{c}(\mathcal{E},\mathbb{R}), we have (𝒫nhm(x0)π(hm))0\left(\mathcal{P}^{n}h_{m}(x_{0})-\pi(h_{m})\right)\rightarrow 0 as nn\rightarrow\infty. Hence, we can choose a sufficiently large nn such that |𝒫nhm(x0)π(hm)|ε/5|\mathcal{P}^{n}h_{m}(x_{0})-\pi(h_{m})|\leqslant\varepsilon/5, and thus from (2.13), (2.14), (2.15) and (2.16),

|𝒫nf(x0)π(f)|\displaystyle|\mathcal{P}^{n}f(x_{0})-\pi(f)|\leqslant ε.\displaystyle\varepsilon.

This proves the claim, which in particular says that for any x0x_{0}\in\mathcal{E} and any Borel set AA, 𝒫n(x,A)nπ(A)\mathcal{P}^{n}(x,A)\stackrel{{\scriptstyle n\rightarrow\infty}}{{\rightarrow}}\pi(A). By [9, Theorem 4.3.4] (also see [8]), {Xn}\{X_{n}\} is aperiodic and PHR, and by the same result this implies 𝒫n(x,)πTV0.\|\mathcal{P}^{n}(x,\cdot)-\pi\|_{TV}\rightarrow 0. The equivalence of the setwise convergence of 𝒫n(x,)\mathcal{P}^{n}(x,\cdot) and convergence in total-variation norm is a unique feature of PHR chains. Now note that by Hölder’s inequality for some r(r,ς(s,p))r^{\prime}\in(r,\varsigma(s,p))

(Vr(y)+1)d|𝒫n(x,)π|(y)\displaystyle\int(V^{r}(y)+1)d|\mathcal{P}^{n}(x,\cdot)-\pi|(y)\leqslant (Vr(y)(𝒫n(x,dy)+π(dy)))r/r𝒫n(x,)πTV1r/r\displaystyle\ \left(\int V^{r^{\prime}}(y)(\mathcal{P}^{n}(x,dy)+\pi(dy))\right)^{r/r^{\prime}}\|\mathcal{P}^{n}(x,\cdot)-\pi\|_{TV}^{1-r/r^{\prime}}
+𝒫n(x,)πTV\displaystyle\hskip 11.38092pt+\|\mathcal{P}^{n}(x,\cdot)-\pi\|_{TV}
\displaystyle\leqslant 2r(x)r/r𝒫n(x,)πTV1r/r+𝒫n(x,)πTVn0.\displaystyle\ 2\mathscr{B}_{r^{\prime}}(x)^{r/r^{\prime}}\|\mathcal{P}^{n}(x,\cdot)-\pi\|_{TV}^{1-r/r^{\prime}}+\|\mathcal{P}^{n}(x,\cdot)-\pi\|_{TV}\stackrel{{\scriptstyle n\rightarrow\infty}}{{\rightarrow}}0.

The equivalence of (2.11) and (2.12) follows from Lemma 2.9 below.
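To see Theorem 2.8 at work on a concrete chain, consider the reflected random walk below (an illustrative sketch, not an example from the paper): with V(x) = x and q < 1/2 it satisfies the drift condition (2.8-a) off 𝒟 = {0} with A = 1 - 2q, the jumps are bounded so (2.8-b) holds with a constant φ for every p > 2, and the invariant law is geometric, π(k) = (1 - ρ)ρ^k with ρ = q/(1 - q).

```python
import random

def reflected_walk_stats(q=0.3, n_steps=20000, seed=1):
    """Simulate X_{n+1} = max(X_n + xi_n, 0) with xi_n = +1 w.p. q and -1
    w.p. 1 - q (q < 1/2).  Return the empirical second moment along the
    path together with the exact stationary value rho(1+rho)/(1-rho)^2,
    where rho = q/(1-q)."""
    rng = random.Random(seed)
    x, second_moment = 0, 0.0
    for _ in range(n_steps):
        x = max(x + (1 if rng.random() < q else -1), 0)
        second_moment += x * x
    rho = q / (1.0 - q)
    exact = rho * (1.0 + rho) / (1.0 - rho) ** 2
    return second_moment / n_steps, exact
```

The empirical second moment along a single path approaches the stationary value, consistent with (2.12): with bounded jumps all moments of order r < ς(s, p) converge.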

Lemma 2.9.

Let ν\nu be a signed measure on a complete separable metric space \mathcal{E}. Suppose that g:[0,)g:\mathcal{E}\rightarrow[0,\infty) is a measurable function such that |ν|(g)=gd|ν|<|\nu|(g)=\int gd|\nu|<\infty. Then

12|ν|(g)νg|ν|(g),\frac{1}{2}|\nu|(g)\leqslant\|\nu\|_{g}\leqslant|\nu|(g),

where we recall that νg=supf:|f|g|ν(f)|.\|\nu\|_{g}=\sup_{f:|f|\leqslant g}|\nu(f)|.

Proof.

The last inequality is trivial since for any measurable ff with |f|g|f|\leqslant g, |ν(f)||ν|(|f|)|ν|(g)|\nu(f)|\leqslant|\nu|(|f|)\leqslant|\nu|(g). For the first inequality, let =𝒴𝒩\mathcal{E}=\mathcal{Y}\cup\mathcal{N} be the Hahn decomposition for ν\nu (in particular, 𝒴𝒩=\mathcal{Y}\cap\mathcal{N}=\varnothing), with the corresponding Jordan decomposition ν=ν+ν\nu=\nu^{+}-\nu^{-} (i.e., supp(ν+)𝒴(\nu^{+})\subset\mathcal{Y} and supp(ν)𝒩)(\nu^{-})\subset\mathcal{N}). Choose f=g𝟙𝒴f=g\mathds{1}_{\mathcal{Y}}. Then

νg|ν(g𝟙𝒴)|=|ν+(g𝟙𝒴)ν(g𝟙𝒴)|=ν+(g𝟙𝒴)=ν+(g),\displaystyle\|\nu\|_{g}\geqslant|\nu(g\mathds{1}_{\mathcal{Y}})|=|\nu^{+}(g\mathds{1}_{\mathcal{Y}})-\nu^{-}(g\mathds{1}_{\mathcal{Y}})|=\nu^{+}(g\mathds{1}_{\mathcal{Y}})=\nu^{+}(g),

where the last equality is because supp(ν+)𝒴(\nu^{+})\subset\mathcal{Y}. Similarly, choosing f=g𝟙𝒩f=g\mathds{1}_{\mathcal{N}}, we have νgν(g)\|\nu\|_{g}\geqslant\nu^{-}(g), whence it follows that 2νg|ν|(g).2\|\nu\|_{g}\geqslant|\nu|(g).
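For a signed measure on finitely many atoms, the two inequalities of Lemma 2.9 can be checked by brute force: the supremum defining the weighted norm is attained at an extreme point f with f_i = ±g_i. A small illustrative check:

```python
from itertools import product

nu = [0.4, -0.7, 0.25, -0.1]   # a signed measure on four atoms
g = [1.0, 2.0, 0.5, 3.0]       # the weight function g >= 0

total_variation_g = sum(gi * abs(vi) for gi, vi in zip(g, nu))   # |nu|(g)
weighted_norm = max(abs(sum(s * gi * vi for s, gi, vi in zip(signs, g, nu)))
                    for signs in product([-1.0, 1.0], repeat=len(nu)))

# Lemma 2.9: (1/2)|nu|(g) <= ||nu||_g <= |nu|(g)
assert 0.5 * total_variation_g <= weighted_norm <= total_variation_g + 1e-12
```

Here the upper bound is attained, since f may take either sign at each atom; the lower bound comes from testing with f = g·1_𝒴 and f = g·1_𝒩 separately, as in the proof.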

3. Applications

This section is devoted to understanding the stability of a broad class of multiplicative systems through applications of the previous theorems.

3.1. Discrete time switching systems

Let \mathbb{H} be a Hilbert space and \mathcal{E} a Polish space. Suppose there exists a sequence of measurable maps Pn:××[0,1]P_{n}:\mathbb{H}\times\mathcal{E}\times\mathcal{E}\rightarrow[0,1] such that for each xx\in\mathbb{H}, the function Pn(x,,)P_{n}(x,\cdot,\cdot) is a transition probability kernel. Consider a discrete-time n\mathcal{F}_{n}-adapted process {Zn}{(Xn,Yn)}\{Z_{n}\}\equiv\{(X_{n},Y_{n})\} taking values in ×\mathbb{H}\times\mathcal{E}, whose dynamics is defined by the following rule: given the state (Xn,Yn)=(xn,yn)(X_{n},Y_{n})=(x_{n},y_{n}),

  1. (SS-1)

    first, Yn+1Y_{n+1} is selected randomly according to the (possibly) time-inhomogeneous transition probability distribution Pn(xn,yn,)Pn,xn(yn,)P_{n}(x_{n},y_{n},\cdot)\equiv P_{n,x_{n}}(y_{n},\cdot),

  2. (SS-2)

    next given Yn+1=yn+1Y_{n+1}=y_{n+1},

    Xn+1=Hn(xn,yn+1,ξn+1),X_{n+1}=H_{n}(x_{n},y_{n+1},\xi_{n+1}),

    where {ξk:k=1,}\{\xi_{k}:k=1,\ldots\} is a sequence of independent random variables taking values in a Banach space 𝔹\mathbb{B}, ξn+1\xi_{n+1} is independent of σ{n,Yn+1}\sigma\{\mathcal{F}_{n},Y_{n+1}\} and Hn:××𝔹H_{n}:\mathbb{H}\times\mathcal{E}\times\mathbb{B}\rightarrow\mathbb{H}.

In general, {(Xn,Yn)}\{(X_{n},Y_{n})\} is a (possibly) time-inhomogeneous Markov process, but clearly neither {Xn}\{X_{n}\} nor {Yn}\{Y_{n}\} is Markovian on its own. The stochastic system {(Xn,Yn)}\{(X_{n},Y_{n})\} is known as a discrete-time switching system or a stochastic hybrid system (and sometimes also as an iterated function system with place-dependent probabilities [1]). Stochastic hybrid systems are extensively used to model practical phenomena where system parameters are subject to sudden changes. These systems have found widespread applications in various disciplines including synthesis of fractals, modeling of biological networks [12], target tracking [19], communication networks [10], and control theory [2, 3, 4], to name a few. There is a considerable literature addressing classical weak stability questions concerning the existence and uniqueness of invariant measures of iterated function systems; see, e.g., [20, 13, 25, 5, 11] and the references therein. Comprehensive sources studying various properties of these systems, including results on stability in both continuous and discrete time, can be found in [14, 28] (also see the references therein). In most of these works, YnY_{n} is assumed to be a stand-alone finite or countable state-space Markov chain.

We consider a broad class of coupled switching or hybrid systems whose dynamics is described by (SS-1) and (SS-2) with HnH_{n} of the form

Hn(x,y,z)=Ln(x,y)+Fn(x,y)+Gn(x,y,z),H_{n}(x,y,z)=L_{n}(x,y)+F_{n}(x,y)+G_{n}(x,y,z),

where Ln,Fn:×L_{n},F_{n}:\mathbb{H}\times\mathcal{E}\rightarrow\mathbb{H} and Gn:××𝔹G_{n}:\mathbb{H}\times\mathcal{E}\times\mathbb{B}\rightarrow\mathbb{H}. In other words, {Xn}\{X_{n}\} satisfies

(3.17) Xn+1=Ln(Xn,Yn+1)+Fn(Xn,Yn+1)+Gn(Xn,Yn+1,ξn+1)\displaystyle X_{n+1}=L_{n}(X_{n},Y_{n+1})+F_{n}(X_{n},Y_{n+1})+G_{n}(X_{n},Y_{n+1},\xi_{n+1})

where the ξn\xi_{n} are 𝔹\mathbb{B}-valued random variables. For example, (3.17) includes multiplicative systems of the form

Xn+1=Xn+Fn(Xn,Yn+1)+Gn0(Xn,Yn+1)ξn+1.\displaystyle X_{n+1}=X_{n}+F_{n}(X_{n},Y_{n+1})+G^{0}_{n}(X_{n},Y_{n+1})\xi_{n+1}.
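As a toy instance of (3.17) (purely illustrative; the maps, parameters, and switching rule below are hypothetical choices, with H = ℝ and E = {0, 1}), one can simulate the coupled dynamics and observe uniformly bounded second moments:

```python
import math
import random

def simulate_switching(m0=1.0, gamma=0.5, g0=0.4, n_steps=300, n_paths=400, seed=2):
    """Toy switching system: the mode Y flips with a state-dependent
    probability (SS-1), and the state updates as X_{n+1} = L + F + G
    (SS-2), with L(x, y) = x, F(x, y) = -m0*(1 + 0.5*y)*sign(x)*|x|^gamma,
    and G(x, y, z) = (1 + |x|)^g0 * z for standard normal z.  Returns the
    largest empirical mean-square of X seen over the run."""
    rng = random.Random(seed)
    xs = [5.0] * n_paths
    ys = [0] * n_paths
    worst = 0.0
    for _ in range(n_steps):
        for i in range(n_paths):
            x, y = xs[i], ys[i]
            # (SS-1): switch mode with state-dependent probability
            if rng.random() < 0.5 / (1.0 + abs(x)):
                y = 1 - y
            # (SS-2): X_{n+1} = L + F + G
            drift = -m0 * (1.0 + 0.5 * y) * math.copysign(abs(x) ** gamma, x)
            noise = (1.0 + abs(x)) ** g0 * rng.gauss(0.0, 1.0)
            xs[i], ys[i] = x + drift + noise, y
        worst = max(worst, sum(v * v for v in xs) / n_paths)
    return worst
```

Here the drift term satisfies ⟨F(x, y), x⟩ ⩽ -m0·|x|^(1+γ), mirroring the pointwise form of the drift condition discussed below, while the noise gain grows like (1 + |x|)^g0 with g0 < γ ∧ 1/2.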

We will make the following assumptions on the above system.

Assumption 3.1.
  1. (SS-7)

    For x>B\|x\|>B, and any yy\in\mathcal{E},

    Pn,xFn(x,),Ln(x,)(y)=Fn(x,y),Ln(x,y)Pn,x(y,dy)𝔪0x1+γ,P_{n,x}\langle F_{n}(x,\cdot),L_{n}(x,\cdot)\rangle(y)=\int\langle F_{n}(x,y^{\prime}),L_{n}(x,y^{\prime})\rangle P_{n,x}(y,dy^{\prime})\leqslant-\mathfrak{m}_{0}\|x\|^{1+\gamma},

    for some constant 𝔪0>0\mathfrak{m}_{0}>0 and exponent γ0\gamma\geqslant 0.

  2. (SS-8)

    The following growth conditions hold:

    • Ln(x,y)𝔪L,1(y)x+𝔪L,2(y)\|L_{n}(x,y)\|\leqslant\mathfrak{m}_{L,1}(y)\|x\|+\mathfrak{m}_{L,2}(y)and L¯n(x,y)𝔪L¯(y)(1+x)l1,\displaystyle\|\bar{L}_{n}(x,y)\|\leqslant\mathfrak{m}_{\bar{L}}(y)(1+\|x\|)^{l_{1}}, where
      L¯n(x,y)=Ln(x,y)Pn,xLn(x,)(y).\bar{L}_{n}(x,y)=L_{n}(x,y)-P_{n,x}L_{n}(x,\cdot)(y).

    • Fn(x,y)𝔪F(y)(1+x)f0,\displaystyle\|F_{n}(x,y)\|\leqslant\mathfrak{m}_{F}(y)(1+\|x\|)^{f_{0}}, F¯n(x,y)𝔪F¯(y)(1+x)f1\|\bar{F}_{n}(x,y)\|\leqslant\mathfrak{m}_{\bar{F}}(y)(1+\|x\|)^{f_{1}},
      Gn(x,y,z)𝔪G(y)(1+x)g0Ψ(z),\|G_{n}(x,y,z)\|\leqslant\mathfrak{m}_{G}(y)(1+\|x\|)^{g_{0}}\Psi(z), where Ψ:𝔹[0,)\Psi:\mathbb{B}\rightarrow[0,\infty) and F¯n(x,y)=Fn(x,y)Pn,xFn(x,)(y).\bar{F}_{n}(x,y)=F_{n}(x,y)-P_{n,x}F_{n}(x,\cdot)(y).

    • For any p>0p>0, the constants 𝔪¯F,p,𝔪¯F¯,p,𝔪¯G,p,𝔪¯L,1,p,𝔪¯L,2,p\bar{\mathfrak{m}}_{F,p},\bar{\mathfrak{m}}_{\bar{F},p},\bar{\mathfrak{m}}_{G,p},\bar{\mathfrak{m}}_{L,1,p},\bar{\mathfrak{m}}_{L,2,p} and 𝔪¯L¯,p\bar{\mathfrak{m}}_{\bar{L},p} are finite, and 𝔪¯L,1,21\bar{\mathfrak{m}}_{L,1,2}\leqslant 1, where the above constants are defined as

      (3.18) 𝔪¯χ,psupn,x,z𝔪χp(y)Pn,x(z,dy),χ=F,F¯,G,{L,1},{L,2},L¯.\displaystyle\bar{\mathfrak{m}}_{\chi,p}\doteq\sup_{n,x,z}\int\mathfrak{m}^{p}_{\chi}(y)P_{n,x}(z,dy),\quad\chi=F,\bar{F},G,\{L,1\},\{L,2\},\bar{L}.
  3. (SS-9)

    The exponents satisfy:

    • (a) f0<(1+γ)/2f_{0}<(1+\gamma)/2, or (b) f0=(1+γ)/2f_{0}=(1+\gamma)/2 and 𝔪¯F,22𝔪0\bar{\mathfrak{m}}_{F,2}\leqslant 2\mathfrak{m}_{0};

    • g0<γ1/2\ g_{0}<\gamma\wedge 1/2, and l1f1<1/2l_{1}\vee f_{1}<1/2.

  4. (SS-10)

    The ξn\xi_{n} are independent 𝔹\mathbb{B}-valued random variables with distribution νn\nu_{n}; for each nn, ξn+1\xi_{n+1} is independent of σ{n,Yn+1}\sigma\{\mathcal{F}_{n},Y_{n+1}\}, and for any p>0p>0, mp=supn𝔼(Ψ(ξn)p)<m_{*}^{p}=\sup_{n}\mathbb{E}(\Psi(\xi_{n})^{p})<\infty.

Proposition 3.2.

Under Assumption 3.1, supn𝔼x0Xnm<\sup_{n}\mathbb{E}_{x_{0}}\|X_{n}\|^{m}<\infty for any m>0m>0 and x0x_{0}\in\mathbb{H}. If the functions GnG_{n} are centered with respect to the variable zz in the sense that G^n(x,y)𝔹Gn(x,y,z)νn+1(dz)=0\displaystyle\hat{G}_{n}(x,y)\doteq\ \int_{\mathbb{B}}G_{n}(x,y,z)\nu_{n+1}(dz)=0 for all n1n\geqslant 1, xx\in\mathbb{H} and yy\in\mathcal{E}, then we only need g0<1/2g_{0}<1/2 instead of g0<γ1/2\ g_{0}<\gamma\wedge 1/2 in (SS-9) for the above assertion to be true.

Remark 3.3.

A few comments are in order.

  • Because of the growth assumption on GnG_{n} in (SS-8) and the condition (SS-10), for each n,xn,x and yy, the function zGn(x,y,z)z\rightarrow G_{n}(x,y,z) is Bochner integrable, and hence G^n(x,y)𝔹Gn(x,y,z)νn+1(dz)\hat{G}_{n}(x,y)\doteq\int_{\mathbb{B}}G_{n}(x,y,z)\nu_{n+1}(dz) is well defined (the integral is defined in Bochner sense).

  • One scenario where the functions GnG_{n} are centered (with respect to the variable zz) occurs when considering multiplicative stochastic systems driven by zero-mean random variables. Specifically, in such models the GnG_{n} are of the form Gn(x,y,z)=Gn0(x,y)zG_{n}(x,y,z)=G^{0}_{n}(x,y)z and the ξn\xi_{n} are mean-zero random variables. Also notice that for these models, Ψ(z)=z𝔹.\Psi(z)=\|z\|_{\mathbb{B}}.

  • Suppose that the GnG_{n} are not centered in the variable zz. If γ<1/2\gamma<1/2, (SS-9) requires that the growth exponent g0g_{0} of GnG_{n} satisfy g0<γg_{0}<\gamma. However, this could be extended to the boundary case g0=γg_{0}=\gamma (when γ<1/2\gamma<1/2) provided the averaged growth constants 𝔪¯χ,p\bar{\mathfrak{m}}_{\chi,p} (cf. (3.18)) meet certain conditions. If g0=γg_{0}=\gamma and f0<(1+γ)/2f_{0}<(1+\gamma)/2, then the assertion of Proposition 3.2 is true provided (𝔪¯G,2m2)1/2<𝔪0\left(\bar{\mathfrak{m}}_{G,2}m^{2}_{*}\right)^{1/2}<\mathfrak{m}_{0}. If g0=γg_{0}=\gamma and f0=(1+γ)/2f_{0}=(1+\gamma)/2, then the same assertion holds provided (𝔪¯G,2m2)1/2+𝔪¯F,2/2<𝔪0\left(\bar{\mathfrak{m}}_{G,2}m^{2}_{*}\right)^{1/2}+\bar{\mathfrak{m}}_{F,2}/2<\mathfrak{m}_{0}.

  • Condition (SS-7) is implied by the simpler condition:

    Fn(x,y),Ln(x,y)𝔪0x1+γ,x>B,y.\langle F_{n}(x,y),L_{n}(x,y)\rangle\leqslant-\mathfrak{m}_{0}\|x\|^{1+\gamma},\quad\|x\|>B,\ \forall y.

    Similarly, for many models a stronger (but easier to check) form of the condition (SS-8), where the ‘constants’ 𝔪χ\mathfrak{m}_{\chi} (for χ=F,F¯,G,{L,1},{L,2},L¯\chi=F,\bar{F},G,\{L,1\},\{L,2\},\bar{L}) do not depend on yy, suffices. In that case the corresponding averaged constants (given by (3.18)) are simply 𝔪¯χ,p=𝔪χp\bar{\mathfrak{m}}_{\chi,p}=\mathfrak{m}_{\chi}^{p}, and are therefore trivially finite.

  • One common example of LnL_{n} is Ln(x,y)Ln(x)=xL_{n}(x,y)\equiv L_{n}(x)=x or UnxU_{n}x for some unitary operator UnU_{n}. If Ln(x,y)Ln(x)L_{n}(x,y)\equiv L_{n}(x), then LnL_{n} is centered, that is, L¯n0\bar{L}_{n}\equiv 0, and the condition on the corresponding growth exponent l1l_{1} is trivially satisfied.

  • Clearly, f1f0f_{1}\leqslant f_{0}, where recall that f1f_{1} and f0f_{0} are the growth rates of F¯n(x,y)=Fn(x,y)PxFn(x,)(y)\bar{F}_{n}(x,y)=F_{n}(x,y)-P_{x}F_{n}(x,\cdot)(y) (centered FnF_{n}) and FnF_{n}, respectively. In some models, without any other information or suitable estimates on F¯\bar{F}, f1f_{1} may simply have to be taken equal to f0f_{0}, in which case condition (SS-9) implies that the above result on uniform moment bounds applies to systems with f0<1/2f_{0}<1/2 (and not (1+γ)/2(1+\gamma)/2). However, in other models the optimal growth rate f1f_{1} of F¯n\bar{F}_{n} can indeed be lower than that of FnF_{n}. For example, as we noted before for the function LnL_{n}, if Fn(x,y)Fn(x)F_{n}(x,y)\equiv F_{n}(x), then F¯n(x,y)0\bar{F}_{n}(x,y)\equiv 0 (in particular, f1=0f_{1}=0), and this along with Theorem 2.8 leads to Corollary 3.4 on Harris ergodicity of a large class of multiplicative Markovian systems.

Proof of Proposition 3.2.

Apart from the parameters in Assumption 3.1, other constants appearing in the various estimates below will be denoted by 𝔪i\mathfrak{m}_{i}; they do not depend on nn but may depend on the parameters of the system.

For the proof, we only consider the case (SS-9)-(a), where f0<(1+γ)/2f_{0}<(1+\gamma)/2; the proofs in the case of (SS-9)-(b) and of the second point in Remark 3.3 follow from (3.21) with minor modifications of the arguments. For each nn, define the functions G^n:×\hat{G}_{n}:\mathbb{H}\times\mathcal{E}\rightarrow\mathbb{H} and G~n,G¯n:××𝔹\tilde{G}_{n},\ \bar{G}_{n}:\mathbb{H}\times\mathcal{E}\times\mathbb{B}\rightarrow\mathbb{H} by

G^n(x,y)=𝔹Gn(x,y,z)νn+1(dz),G~n(x,y,z)=Gn(x,y,z)G^n(x,y), and\displaystyle\ \hat{G}_{n}(x,y)=\ \int_{\mathbb{B}}G_{n}(x,y,z)\nu_{n+1}(dz),\quad\tilde{G}_{n}(x,y,z)=G_{n}(x,y,z)-\hat{G}_{n}(x,y),\quad\text{ and}
\bar{G}_{n}(x,y,z)=\ G_{n}(x,y,z)-P_{n,x}\hat{G}_{n}(x,\cdot)(y)=G_{n}(x,y,z)-\mathbb{E}\left(G_{n}(X_{n},Y_{n+1},\xi_{n+1})\,\middle|\,(X_{n},Y_{n})=(x,y)\right)

(recall that νn\nu_{n} is the distribution measure of ξn\xi_{n}), and notice that by (SS-8) and (SS-10) for any p>0p>0,

\mathbb{E}\left[\|\hat{G}_{n}(X_{n},Y_{n+1})\|^{p}\,\middle|\,\mathcal{F}_{n}\right]=\int_{\mathcal{E}}\left\|\int_{\mathbb{B}}G_{n}(X_{n},y,z)\,\nu_{n+1}(dz)\right\|^{p}P_{n,X_{n}}(Y_{n},dy)
\displaystyle{}\leqslant 𝔹𝔪Gp(y)(1+Xn)pg0Ψ(z)pνn+1(dz)Pn,Xn(Yn,dy)\displaystyle\ \int_{\mathcal{E}}\int_{\mathbb{B}}\mathfrak{m}_{G}^{p}(y)(1+\|X_{n}\|)^{pg_{0}}\Psi(z)^{p}\nu_{n+1}(dz)P_{n,X_{n}}(Y_{n},dy)
(3.19) \displaystyle\leqslant 𝔪¯G^,p(1+Xn)pg0,\displaystyle\ \bar{\mathfrak{m}}_{\hat{G},p}(1+\|X_{n}\|)^{pg_{0}},

where 𝔪¯G^,p=𝔪¯G,pmp\bar{\mathfrak{m}}_{\hat{G},p}=\bar{\mathfrak{m}}_{G,p}m_{*}^{p} (recall mp=supk𝔼[Ψ(ξk)p]<m_{*}^{p}=\sup_{k}\mathbb{E}\left[\Psi(\xi_{k})^{p}\right]<\infty). It now easily follows that G¯n\bar{G}_{n} and G~n\tilde{G}_{n} satisfy the following growth conditions:

G~n(x,y,z)𝔪G~(y)(1+x)g0Ψ(z),andG¯n(x,y,z)𝔪G¯(y)(1+x)g0Ψ(z)\displaystyle\|\tilde{G}_{n}(x,y,z)\|\leqslant\mathfrak{m}_{\tilde{G}}(y)(1+\|x\|)^{g_{0}}\Psi(z),\quad\text{and}\quad\|\bar{G}_{n}(x,y,z)\|\leqslant\mathfrak{m}_{\bar{G}}(y)(1+\|x\|)^{g_{0}}\Psi(z)

for some functions 𝔪G¯(y)\mathfrak{m}_{\bar{G}}(y) and 𝔪G~(y)\mathfrak{m}_{\tilde{G}}(y) (depending on yy), where 𝔪¯χ,p<\bar{\mathfrak{m}}_{\chi,p}<\infty for χ=G~,G¯\chi=\tilde{G},\bar{G} (see (3.18) for definition of 𝔪¯χ,p\bar{\mathfrak{m}}_{\chi,p}). Consequently, for any p>0p>0

𝔼[G~n(Xn,Yn+1,ξn+1)p|n]\displaystyle\mathbb{E}\left[\|\tilde{G}_{n}(X_{n},Y_{n+1},\xi_{n+1})\|^{p}\big{|}\mathcal{F}_{n}\right]\leqslant 𝔪¯G~,pmp(1+Xn)pg0,\displaystyle\ \bar{\mathfrak{m}}_{\tilde{G},p}m_{*}^{p}(1+\|X_{n}\|)^{pg_{0}},
𝔼[G¯n(Xn,Yn+1,ξn+1)p|n]\displaystyle\mathbb{E}\left[\|\bar{G}_{n}(X_{n},Y_{n+1},\xi_{n+1})\|^{p}\big{|}\mathcal{F}_{n}\right]\leqslant 𝔪¯G¯,pmp(1+Xn)pg0.\displaystyle\ \bar{\mathfrak{m}}_{\bar{G},p}m_{*}^{p}(1+\|X_{n}\|)^{pg_{0}}.

Also,

(3.20) 𝔼[Ln(Xn,Yn+1)2|n]Xn2+2𝔪¯L,2,21/2Xn+𝔪¯L,2,2=(𝔪¯L,2,21/2+Xn)2𝔼[Fn(Xn,Yn+1)2|n]𝔪¯F,2(1+Xn)2f0.\displaystyle\begin{aligned} \mathbb{E}\left[\|L_{n}(X_{n},Y_{n+1})\|^{2}\big{|}\mathcal{F}_{n}\right]\leqslant&\ \|X_{n}\|^{2}+2\bar{\mathfrak{m}}_{L,2,2}^{1/2}\|X_{n}\|+\bar{\mathfrak{m}}_{L,2,2}=\left(\bar{\mathfrak{m}}_{L,2,2}^{1/2}+\|X_{n}\|\right)^{2}\\ \mathbb{E}\left[\|F_{n}(X_{n},Y_{n+1})\|^{2}\big{|}\mathcal{F}_{n}\right]\leqslant&\ \bar{\mathfrak{m}}_{F,2}(1+\|X_{n}\|)^{2f_{0}}.\end{aligned}

Now writing G(Xn,Yn+1,ξn+1)=G^n(Xn,Yn+1)+G~(Xn,Yn+1,ξn+1)G(X_{n},Y_{n+1},\xi_{n+1})=\hat{G}_{n}(X_{n},Y_{n+1})+\tilde{G}(X_{n},Y_{n+1},\xi_{n+1}), we have

Xn+12=\displaystyle\|X_{n+1}\|^{2}= Ln(Xn,Yn+1)2+Fn(Xn,Yn+1)2+G^n(Xn,Yn+1)2+G~(Xn,Yn+1,ξn+1)2\displaystyle\ \|L_{n}(X_{n},Y_{n+1})\|^{2}+\|F_{n}(X_{n},Y_{n+1})\|^{2}+\|\hat{G}_{n}(X_{n},Y_{n+1})\|^{2}+\|\tilde{G}(X_{n},Y_{n+1},\xi_{n+1})\|^{2}
+2Ln(Xn,Yn+1),Fn(Xn,Yn+1)+2(Ln+Fn+G^n)(Xn,Yn+1),G~(Xn,Yn+1,ξn+1)\displaystyle\ +2\langle L_{n}(X_{n},Y_{n+1}),F_{n}(X_{n},Y_{n+1})\rangle+2\langle(L_{n}+F_{n}+\hat{G}_{n})(X_{n},Y_{n+1}),\tilde{G}(X_{n},Y_{n+1},\xi_{n+1})\rangle
+2(Ln+Fn)(Xn,Yn+1),G^n(Xn,Yn+1).\displaystyle+2\langle(L_{n}+F_{n})(X_{n},Y_{n+1}),\hat{G}_{n}(X_{n},Y_{n+1})\rangle.

Denoting the term (Ln+Fn+G^n)(Xn,Yn+1),G~(Xn,Yn+1,ξn+1)\langle(L_{n}+F_{n}+\hat{G}_{n})(X_{n},Y_{n+1}),\tilde{G}(X_{n},Y_{n+1},\xi_{n+1})\rangle by Jn+1J_{n+1}, we have

𝔼[Jn+1|n]=\displaystyle\mathbb{E}\left[J_{n+1}|\mathcal{F}_{n}\right]= 𝔹(Ln+Fn+G^n)(Xn,y),G~(Xn,y,z)Pn,Xn(Yn,dy)νn+1(dz)\displaystyle\ \int_{\mathbb{B}}\int_{\mathcal{E}}\langle(L_{n}+F_{n}+\hat{G}_{n})(X_{n},y),\tilde{G}(X_{n},y,z)\rangle P_{n,X_{n}}(Y_{n},dy)\nu_{n+1}(dz)
=\displaystyle= (Ln+Fn+G^n)(Xn,y),𝔹G~(Xn,y,z)νn+1(dz)Pn,Xn(Yn,dy)=0.\displaystyle\ \int_{\mathcal{E}}\left\langle(L_{n}+F_{n}+\hat{G}_{n})(X_{n},y),\int_{\mathbb{B}}\tilde{G}(X_{n},y,z)\nu_{n+1}(dz)\right\rangle P_{n,X_{n}}(Y_{n},dy)=0.

Also, by the Cauchy-Schwarz inequality, (3.19) and (3.20),

𝔼[|Fn(Xn,Yn+1),G^n(Xn,Yn+1)||n]\displaystyle\mathbb{E}\left[|\langle F_{n}(X_{n},Y_{n+1}),\hat{G}_{n}(X_{n},Y_{n+1})\rangle|\big{|}\mathcal{F}_{n}\right]\leqslant (𝔼[Fn(Xn,Yn+1)2|n])1/2(𝔼[G^n(Xn,Yn+1)2|n])1/2\displaystyle\ \left(\mathbb{E}\left[\|F_{n}(X_{n},Y_{n+1})\|^{2}\big{|}\mathcal{F}_{n}\right]\right)^{1/2}\left(\mathbb{E}\left[\|\hat{G}_{n}(X_{n},Y_{n+1})\|^{2}\big{|}\mathcal{F}_{n}\right]\right)^{1/2}
\displaystyle\leqslant 𝔪¯F,21/2𝔪¯G^,21/2(1+Xn)f0+g0,\displaystyle\ \bar{\mathfrak{m}}_{F,2}^{1/2}\bar{\mathfrak{m}}_{\hat{G},2}^{1/2}(1+\|X_{n}\|)^{f_{0}+g_{0}},

and similarly,

𝔼[|Ln(Xn,Yn+1),G^n(Xn,Yn+1)||n]\displaystyle\mathbb{E}\left[|\langle L_{n}(X_{n},Y_{n+1}),\hat{G}_{n}(X_{n},Y_{n+1})\rangle|\big{|}\mathcal{F}_{n}\right]\leqslant 𝔪¯G^,21/2(𝔪¯L,2,21/21+Xn)1+g0.\displaystyle\ \bar{\mathfrak{m}}_{\hat{G},2}^{1/2}\left(\bar{\mathfrak{m}}_{L,2,2}^{1/2}\vee 1+\|X_{n}\|\right)^{1+g_{0}}.

Hence, on {Xn>B}\{\|X_{n}\|>B\}

\mathbb{E}\left[\|X_{n+1}\|^{2}|\mathcal{F}_{n}\right]\leqslant\ \|X_{n}\|^{2}+2\bar{\mathfrak{m}}_{L,2,2}^{1/2}\|X_{n}\|+\bar{\mathfrak{m}}_{L,2,2}+\bar{\mathfrak{m}}_{F,2}(1+\|X_{n}\|)^{2f_{0}}+(\bar{\mathfrak{m}}_{\hat{G},2}+\bar{\mathfrak{m}}_{\tilde{G},2}m_{*}^{2})(1+\|X_{n}\|)^{2g_{0}}
(3.21) -2\mathfrak{m}_{0}\|X_{n}\|^{1+\gamma}+2\bar{\mathfrak{m}}_{F,2}^{1/2}\bar{\mathfrak{m}}_{\hat{G},2}^{1/2}(1+\|X_{n}\|)^{f_{0}+g_{0}}+2\bar{\mathfrak{m}}_{\hat{G},2}^{1/2}\left(\bar{\mathfrak{m}}_{L,2,2}^{1/2}\vee 1+\|X_{n}\|\right)^{1+g_{0}}.

Since δ02(f0g0)(f0+g0)(1+g0)<1+γ\delta_{0}\doteq 2(f_{0}\vee g_{0})\vee(f_{0}+g_{0})\vee(1+g_{0})<1+\gamma by (SS-9), it follows from the above inequality that we can choose C>BC>B large enough so that on Xn>C\|X_{n}\|>C,

𝔼[Xn+12Xn2|n]\displaystyle\mathbb{E}\left[\|X_{n+1}\|^{2}-\|X_{n}\|^{2}\left|\right.\mathcal{F}_{n}\right]\leqslant 𝔪1(Xnδ0Xn1+γ)<0.\displaystyle\ \mathfrak{m}_{1}\left(\|X_{n}\|^{\delta_{0}}-\|X_{n}\|^{1+\gamma}\right)<0.

Also notice that choosing C>B1C>B\vee 1 we have for Xn>C\|X_{n}\|>C

\sqrt{\mathbb{E}\left[\|X_{n+1}\|^{2}|\mathcal{F}_{n}\right]}+\|X_{n}\|\leqslant\mathfrak{m}_{2}(1+\|X_{n}\|)^{1\vee\delta_{0}/2}\leqslant 2^{1\vee\delta_{0}/2}\mathfrak{m}_{2}\|X_{n}\|^{1\vee\delta_{0}/2}.

Therefore for Xn>C\|X_{n}\|>C,

\mathbb{E}\left[\|X_{n+1}\|\,|\,\mathcal{F}_{n}\right]-\|X_{n}\|\leqslant\sqrt{\mathbb{E}\left[\|X_{n+1}\|^{2}|\mathcal{F}_{n}\right]}-\|X_{n}\|=\frac{\mathbb{E}\left[\|X_{n+1}\|^{2}|\mathcal{F}_{n}\right]-\|X_{n}\|^{2}}{\sqrt{\mathbb{E}\left[\|X_{n+1}\|^{2}|\mathcal{F}_{n}\right]}+\|X_{n}\|}\leqslant\mathfrak{m}_{3}\left(\|X_{n}\|^{\delta_{0}-1\vee\delta_{0}/2}-\|X_{n}\|^{1+\gamma-1\vee\delta_{0}/2}\right).

Because of assumption (SS-9), notice that

\|x\|^{\delta_{0}-1\vee\delta_{0}/2}-\|x\|^{1+\gamma-1\vee\delta_{0}/2}\stackrel{\|x\|\rightarrow\infty}{\longrightarrow}\begin{cases}-\infty,&\quad\gamma>0,\\ -1,&\quad\gamma=0\ (\text{since then }\delta_{0}<1).\end{cases}

In either case, there exists a constant A>0A>0 and a sufficiently large CC such that

(3.22) 𝔼[Xn+1|n]Xn\displaystyle\mathbb{E}\left[\|X_{n+1}\||\mathcal{F}_{n}\right]-\|X_{n}\|\leqslant A, on Xn>C.\displaystyle-A,\quad\text{ on }\|X_{n}\|>C.

Next, notice that

|Xn+1𝔼[Xn+1|n]|\displaystyle\Big{|}\|X_{n+1}\|-\mathbb{E}\left[\|X_{n+1}\|\big{|}\mathcal{F}_{n}\right]\Big{|}\leqslant |Xn+1𝔼[Xn+1|n]|+|𝔼[Xn+1|n]𝔼[Xn+1|n]|\displaystyle\ \Big{|}\|X_{n+1}\|-\|\mathbb{E}[X_{n+1}\big{|}\mathcal{F}_{n}]\|\Big{|}+\Big{|}\|\mathbb{E}[X_{n+1}|\mathcal{F}_{n}]\|-\mathbb{E}\left[\|X_{n+1}\|\big{|}\mathcal{F}_{n}\right]\Big{|}
\displaystyle\leqslant Xn+1𝔼[Xn+1|n]+|𝔼[Xn+1𝔼[Xn+1|n]|n]|\displaystyle\|X_{n+1}-\mathbb{E}\left[X_{n+1}\big{|}\mathcal{F}_{n}\right]\|+\Big{|}\mathbb{E}\left[\|X_{n+1}\|-\|\mathbb{E}[X_{n+1}\big{|}\mathcal{F}_{n}]\|\big{|}\mathcal{F}_{n}\right]\Big{|}
\displaystyle\leqslant Xn+1𝔼[Xn+1|n]+𝔼[Xn+1𝔼[Xn+1|n]|n].\displaystyle\|X_{n+1}-\mathbb{E}[X_{n+1}|\mathcal{F}_{n}]\|+\mathbb{E}\left[\|X_{n+1}-\mathbb{E}[X_{n+1}|\mathcal{F}_{n}]\|\big{|}\mathcal{F}_{n}\right].

Hence,

(3.23) Ξn=\displaystyle\Xi_{n}= 𝔼[|Xn+1𝔼[Xn+1|n]|p|n] 2p𝔼[Xn+1𝔼[Xn+1|n]p|n]\displaystyle\ \mathbb{E}\left[\left|\|X_{n+1}\|-\mathbb{E}\left[\|X_{n+1}\|\big{|}\mathcal{F}_{n}\right]\right|^{p}\Big{|}\mathcal{F}_{n}\right]\leqslant\ 2^{p}\mathbb{E}\left[\|X_{n+1}-\mathbb{E}[X_{n+1}|\mathcal{F}_{n}]\|^{p}\Big{|}\mathcal{F}_{n}\right]
=\displaystyle= 2p𝔼[L¯(Xn,Yn+1)+F¯(Xn,Yn+1)+G¯(Xn,Yn+1,ξn+1)p|n]\displaystyle\ 2^{p}\mathbb{E}\left[\|\bar{L}(X_{n},Y_{n+1})+\bar{F}(X_{n},Y_{n+1})+\bar{G}(X_{n},Y_{n+1},\xi_{n+1})\|^{p}\Big{|}\mathcal{F}_{n}\right]
\displaystyle\leqslant 𝔪4(1+Xn)p(l1f1g0)ϕp(Xn),\displaystyle\ \mathfrak{m}_{4}(1+\|X_{n}\|)^{p(l_{1}\vee f_{1}\vee g_{0})}\equiv\phi_{p}(X_{n}),

where ϕp(x)𝔪4(1+x)p(l1f1g0)\phi_{p}(x)\doteq\mathfrak{m}_{4}(1+\|x\|)^{p(l_{1}\vee f_{1}\vee g_{0})}. Since l1f1g0<1/2l_{1}\vee f_{1}\vee g_{0}<1/2, for large enough pp, we have p(l1f1g0)<p/21p(l_{1}\vee f_{1}\vee g_{0})<p/2-1. It now follows from Theorem 2.2 (using V(x)=xV(x)=\|x\|) that for any r(0,ς(s=p(l1f1g0),p))r\in(0,\varsigma(s=p(l_{1}\vee f_{1}\vee g_{0}),p)), supn𝔼Xnr<\sup_{n}\mathbb{E}\|X_{n}\|^{r}<\infty. Since p>0p>0 is arbitrarily large, the assertion follows.

If the Gn(x,y,z)G_{n}(x,y,z) are centered, that is, if G^n0\hat{G}_{n}\equiv 0, then of course 𝔪¯G^,p\bar{\mathfrak{m}}_{\hat{G},p} can be taken to be 0 for all p>0p>0, and from (3.21), δ0=2(f0g0)\delta_{0}=2(f_{0}\vee g_{0}). Consequently, we do not need g0<γg_{0}<\gamma to have δ0<1+γ\delta_{0}<1+\gamma. ∎
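To see the moment bound of Proposition 3.2 in action, the following sketch (our own toy instance; all parameter choices are hypothetical and not from the paper) simulates a scalar recursion X_{n+1} = L(X_n) + F(X_n) + G(X_n) ξ_{n+1} with L(x) = x and F(x) = −sign(x) min(|x|, 1), so that ⟨F(x), L(x)⟩ ⩽ −|x| for |x| > 1 (a drift condition of the type (SS-7) with γ = 0 and 𝔪₀ = 1), and G(x) = 0.5 (1 + |x|)^{0.25}, whose growth exponent g₀ = 0.25 < 1/2. The noise is centered, so the weaker condition g₀ < 1/2 of Proposition 3.2 applies, and the empirical second moment stays uniformly bounded in n.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps = 2000, 500
X = np.zeros(n_paths)                 # X_0 = 0 for every path
second_moments = []
for _ in range(n_steps):
    xi = rng.standard_normal(n_paths)             # centered noise, all moments finite
    F = -np.sign(X) * np.minimum(np.abs(X), 1.0)  # negative drift outside the unit ball
    G = 0.5 * (1.0 + np.abs(X)) ** 0.25           # state-dependent amplitude, g_0 = 0.25
    X = X + F + G * xi                            # L(x) = x here
    second_moments.append(float(np.mean(X ** 2)))

max_second_moment = max(second_moments)           # stays bounded uniformly in n
```

With these (hypothetical) parameters the running second moment settles well below 1, consistent with the uniform bound asserted by the proposition.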

Corollary 3.4.

Consider the class of {n}\{\mathcal{F}_{n}\}-adapted Markov processes taking values in d\mathbb{R}^{d}, whose dynamics is defined by

(3.24) Xn+1=L(Xn)+F(Xn)+G(Xn)ξn+1,\displaystyle X_{n+1}=L(X_{n})+F(X_{n})+G(X_{n})\xi_{n+1},

where F,L:ddF,L:\mathbb{R}^{d}\rightarrow\mathbb{R}^{d}, G:d𝕄d×dG:\mathbb{R}^{d}\rightarrow\mathbb{M}^{d\times d^{\prime}} are continuous functions, and ddd\leqslant d^{\prime}. Assume that

  1. (M-1)

    FF, GG and LL satisfy the growth conditions (a) L(x)x\|L(x)\|\leqslant\|x\| for x>B\|x\|>B, (b)  F(x)𝔪F(1+x)γ0,\displaystyle\|F(x)\|\leqslant\mathfrak{m}_{F}(1+\|x\|)^{\gamma_{0}}, and (c) G(x)𝔪G(1+x)g0;\|G(x)\|\leqslant\mathfrak{m}_{G}(1+\|x\|)^{g_{0}};

  2. (M-2)

    for some constants 𝔪0,B>0\mathfrak{m}_{0},B>0 and an exponent γ0\gamma\geqslant 0,

    F(x),L(x)𝔪0x1+γ, for x>B;\langle F(x),L(x)\rangle\leqslant-\mathfrak{m}_{0}\|x\|^{1+\gamma},\quad\text{ for }\|x\|>B;
  3. (M-3)

    the exponents satisfy: (a) γ0<(1+γ)/2\gamma_{0}<(1+\gamma)/2, or  γ0=(1+γ)/2\gamma_{0}=(1+\gamma)/2 and 𝔪F𝔪0/2\mathfrak{m}_{F}\leqslant\mathfrak{m}_{0}/2; (b) g0<1/2g_{0}<1/2;

  4. (M-4)

    the ξn\xi_{n} are i.i.d. d\mathbb{R}^{d^{\prime}}-valued random variables with density ρ\rho with respect to the Lebesgue measure λleb\lambda_{\text{leb}}; ρ(z)>0\rho(z)>0 for all zdz\in\mathbb{R}^{d^{\prime}}, supzdρ(z)<\sup_{z\in\mathbb{R}^{d^{\prime}}}\rho(z)<\infty, and for each p>0p>0, mp=𝔼(ξ1p)<m_{*}^{p}=\mathbb{E}(\|\xi_{1}\|^{p})<\infty;

  5. (M-5)

    for some θ0\theta\geqslant 0 and ε0>0\varepsilon_{0}>0,

    uTG(x)G(x)Tuε0uTu/(1+x)θ,u,xd.u^{T}G(x)G(x)^{T}u\geqslant\varepsilon_{0}u^{T}u/(1+\|x\|)^{\theta},\quad\forall\ u,x\in\mathbb{R}^{d}.

If in addition 𝔼(ξ1)=0\mathbb{E}(\xi_{1})=0, then (a) {Xn}\{X_{n}\} is PHR and aperiodic with a unique invariant distribution π\pi, (b) supn𝔼x0(Xnr)𝔼πXnr<\sup_{n}\mathbb{E}_{x_{0}}(\|X_{n}\|^{r})\vee\mathbb{E}_{\pi}\|X_{n}\|^{r}<\infty, and (c) (2.11), or equivalently (2.12), holds with V(u)=uV(u)=\|u\|, for any x0x_{0} and r>0r>0. If 𝔼(ξ1)0\mathbb{E}(\xi_{1})\neq 0, then the same assertions are true provided g0<γ1/2g_{0}<\gamma\wedge 1/2.

Proof.

Since L,FL,F and GG are continuous, it follows by the dominated convergence theorem that {Xn}\{X_{n}\} is weak Feller. From assumption (M-5), it follows that G(x)GT(x)G(x)G^{T}(x) is positive definite (in particular, nonsingular), and det(G(x)GT(x))ε0d/(1+x)θd\det(G(x)G^{T}(x))\geqslant\varepsilon^{d}_{0}/(1+\|x\|)^{\theta d}. Note that 𝒫(x,)\mathcal{P}(x,\cdot) admits a density q(x,)q(x,\cdot). Specifically,

q(x,y)=\frac{1}{\sqrt{\det(G(x)G^{T}(x))}}\,\rho\left(G(x)^{-}_{R}\left(y-L(x)-F(x)\right)\right)\leqslant\sup_{z}\rho(z)\,(1+\|x\|)^{\theta d/2}/\varepsilon^{d/2}_{0},

where G(x)R=GT(x)(G(x)G(x)T)1G(x)^{-}_{R}=G^{T}(x)\left(G(x)G(x)^{T}\right)^{-1} is the Moore-Penrose pseudoinverse (in particular, right inverse) of G(x)G(x). Moreover, since ρ(z)>0\rho(z)>0 for all zz, for each xx we have q(x,y)>0q(x,y)>0 for λleb\lambda_{\text{leb}}-a.e. yy, and consequently XnX_{n} is λleb\lambda_{\text{leb}}-irreducible. This shows that condition (2.8-d) of Theorem 2.8 holds. The various assertions now follow from Theorem 2.8 and Proposition 3.2. ∎
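The density construction above is easy to sanity-check numerically. In the sketch below (our own construction; the particular G is hypothetical, chosen so that (M-5) holds with θ = 1 and ε₀ = 1), we verify that G(x)G(x)ᵀ is nonsingular and that G(x)ᵣ⁻ = Gᵀ(x)(G(x)G(x)ᵀ)⁻¹ is indeed a right inverse of G(x).

```python
import numpy as np

def right_inverse(G):
    # Moore-Penrose right inverse G_R^- = G^T (G G^T)^{-1} of a full-row-rank matrix
    return G.T @ np.linalg.inv(G @ G.T)

def G(x):
    # hypothetical G : R^2 -> M^{2x3}; here G(x)G(x)^T >= I/(1+||x||),
    # i.e. (M-5) holds with theta = 1 and eps_0 = 1
    base = np.array([[1.0, 0.0,  0.5],
                     [0.0, 1.0, -0.5]])
    return base / np.sqrt(1.0 + np.linalg.norm(x))

x = np.array([3.0, -4.0])
Gx = G(x)
GR = right_inverse(Gx)
residual = np.linalg.norm(Gx @ GR - np.eye(2))  # G(x) G(x)_R^- = I, so residual ~ 0
det_val = np.linalg.det(Gx @ Gx.T)              # strictly positive by (M-5)
```

Note that G need not be square: as in the corollary, d ⩽ d′ and full row rank of G(x) is what makes the right inverse, and hence the density formula, well defined.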

Remark 3.5.

The condition (M-5) is much weaker than the uniform ellipticity condition that is sometimes imposed on GGTGG^{T} for these kinds of models; the latter requires that for some ε0>0\varepsilon_{0}>0, uTG(x)G(x)Tuε0uTuu^{T}G(x)G(x)^{T}u\geqslant\varepsilon_{0}u^{T}u for all u,xd.u,x\in\mathbb{R}^{d}.

The above corollary also holds, with some possible minor modifications, for systems of the form (3.24) taking values in other locally compact spaces with ξn\xi_{n} admitting a density ρ\rho with respect to the Haar measure. In particular, for such systems taking values in a countable state space like d\mathbb{Z}^{d} or d\mathbb{Q}^{d}, notice that the transition probability mass function (density with respect to counting measure) q(x,y)q(x,y) naturally exists and q(x,y)1q(x,y)\leqslant 1, that is, the bound on qq in condition (2.8-d) of Theorem 2.8 is trivially satisfied. Hence condition (M-5) in Corollary 3.4 is not needed in this case. However, depending on the specific model, one might still require GG to have full row rank for establishing irreducibility of the chain.

As an important application, the above corollary can be used to establish ergodicity of numerical schemes of stochastic differential equations (SDEs).

Example 3.6.

Euler-Maruyama scheme for ergodic SDEs: Consider the SDE

X(t)=X(0)+0tF(X(s))𝑑s+0tG(X(s))𝑑W(s),\displaystyle X(t)=X(0)+\int_{0}^{t}F(X(s))ds+\int_{0}^{t}G(X(s))dW(s),

and suppose that XX is ergodic with invariant (equilibrium) distribution π\pi, which is typically unknown. Approximating this equilibrium distribution is an important computational problem in various areas including statistical physics, machine learning, and mathematical finance. Since numerically solving the corresponding (stationary) Kolmogorov PDE for π\pi is computationally expensive even when the dimension is as low as 33, one commonly resorts to discretization schemes like the Euler-Maruyama method:

XΔ(tn+1)=XΔ(tn)+F(XΔ(tn))Δ+Δ1/2G(XΔ(tn))ξn+1.\displaystyle X^{\Delta}(t_{n+1})=X^{\Delta}(t_{n})+F(X^{\Delta}(t_{n}))\Delta+\Delta^{1/2}G(X^{\Delta}(t_{n}))\xi_{n+1}.

Here the ξn\xi_{n} are i.i.d. N(0,I)N(0,I) random variables, and {tn}\{t_{n}\} is a partition of [0,)[0,\infty) with tn+1tn=Δt_{n+1}-t_{n}=\Delta, the step size of the discretization. However, the use of such discretization techniques in approximating π\pi is justified provided one can establish (a) ergodicity of the discretized chain {XΔ(tn)}\{X^{\Delta}(t_{n})\} with a unique invariant distribution πΔ\pi^{\Delta}, and (b) convergence of πΔ\pi^{\Delta} to π\pi as Δ0\Delta\rightarrow 0. This is a hard problem involving an infinite time horizon, and the usual error analyses of Euler-Maruyama schemes, which have of course been well studied in the literature, are not useful here, as they apply over finite time intervals. In comparison, much less is available on theoretical error analyses of these types of infinite-time-horizon approximation problems; some important results in this direction have been obtained by Talay [27, 26, 7]. A recent paper [6] (also see the references therein for more background on the problem) conducts a thorough large deviation error analysis of the problem in an appropriate scaling regime.

This short example does not attempt to address both points (a) and (b) of this problem, as that requires a separate paper-length treatment. Here we are only interested in point (a), the ergodicity of the discretized chain {XΔ(tn)}\{X^{\Delta}(t_{n})\}. It is well known that ergodicity of XX does not guarantee ergodicity of the discretized chain XΔX^{\Delta}; discretization can destroy the underlying Lyapunov structure of an ergodic SDE.

In [27, 26], among several other important results, Talay et al. showed in particular that the chain {XΔ(tn)}\{X^{\Delta}(t_{n})\} is ergodic with unique invariant measure πΔ\pi^{\Delta} and 𝔼(f(XΔ(tn)))πΔ(f)\mathbb{E}(f(X^{\Delta}(t_{n})))\rightarrow\pi^{\Delta}(f) as nn\rightarrow\infty for any fC(d,)f\in C^{\infty}(\mathbb{R}^{d},\mathbb{R}) such that ff and all its derivatives have polynomial growth, under the assumptions (i) F(x),x𝔪0x2\langle F(x),x\rangle\leqslant-\mathfrak{m}_{0}\|x\|^{2} for x>B\|x\|>B, (ii) FF and GG are CC^{\infty} with bounded derivatives of all orders, and (iii) GGTGG^{T} is uniformly elliptic and bounded. An application of Corollary 3.4 shows that this result can be significantly improved, with stronger convergence results under weaker hypotheses (cf. (M-1)-(M-5)). In particular, the uniform ellipticity and boundedness conditions on GGTGG^{T}, which are quite restrictive for many models, can be removed.
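As a minimal numerical illustration of point (a) (our own example, with hypothetical parameter choices), consider the Euler-Maruyama chain for the ergodic Ornstein-Uhlenbeck equation dX = −X dt + √2 dW, whose invariant law is N(0, 1). The discretized chain X^Δ(t_{n+1}) = (1 − Δ) X^Δ(t_n) + √(2Δ) ξ_{n+1} is itself ergodic, and its long-run empirical variance is close to 1 for small Δ.

```python
import numpy as np

rng = np.random.default_rng(1)
delta = 0.01                          # step size Delta (hypothetical choice)
n_burn, n_samp = 10_000, 200_000
x = 0.0
samples = np.empty(n_samp)
for i in range(n_burn + n_samp):
    # Euler-Maruyama step with F(x) = -x and G = sqrt(2)
    x = (1.0 - delta) * x + np.sqrt(2.0 * delta) * rng.standard_normal()
    if i >= n_burn:
        samples[i - n_burn] = x

empirical_var = float(samples.var())  # variance under pi^Delta; tends to 1 as Delta -> 0
```

For this linear chain the invariant variance can be computed in closed form, 2Δ/(1 − (1 − Δ)²) = 1/(1 − Δ/2), so the discretization bias is O(Δ); the simulation simply confirms this.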

3.2. Moment stability of linear stochastic control systems

Consider the system

(3.25) Xn+1=AXn+Bun+ξn+1.\displaystyle X_{n+1}=AX_{n}+Bu_{n}+\xi_{n+1}.

We are interested in the problem of finding conditions under which a linear stochastic system with possibly unbounded additive stochastic noise is globally stabilizable with bounded control inputs {un}\{u_{n}\}. Stabilization of stochastic linear systems with bounded control is a topic of significant interest in control engineering because of its importance in diverse fields; suboptimal control strategies such as receding-horizon control, and rollout algorithms, among others, can be easily constructed incorporating such constraints, and have become popular in applications. Here we simply refer to [24] and references therein for a detailed background on this topic.

Of course, boundedness of some moments of the noise component is necessary for attaining (moment) stability of the system. Specifically, we consider the following problem:

Problem: Suppose 𝕌{zm:zUmax}\mathbb{U}\doteq\left\{z\in\mathbb{R}^{m}:\|z\|\leqslant U_{\max}\right\}. We consider admissible kk-history-dependent control policies of the type π={πn}\pi=\{\pi_{n}\}, where πn:d×k𝕌\pi_{n}:\mathbb{R}^{d\times k}\rightarrow\mathbb{U}, that is, πn(y1,,yk)𝕌\pi_{n}(y_{1},\ldots,y_{k})\in\mathbb{U} for every y1,,ykdy_{1},\ldots,y_{k}\in\mathbb{R}^{d}. Given r1r\geqslant 1 and Umax>0U_{\max}>0, find an admissible policy π={πn}n\pi=\{\pi_{n}\}_{n\in\mathbb{N}} with control authority UmaxU_{\max} such that the system (3.25) with un=πn(Xnk+1,,Xn1,Xn)u_{n}=\pi_{n}(X_{n-k+1},\ldots,X_{n-1},X_{n}) is rr-th moment stable, that is, for every initial condition X0=x0X_{0}=x_{0}, supn𝔼x0Xnr<\sup_{n}\mathbb{E}_{x_{0}}\|X_{n}\|^{r}<\infty.

It is known that mean square boundedness holds for systems with bounded controls when AA is Schur stable, that is, when all eigenvalues of AA are contained in the open unit disk (the proof uses Foster-Lyapunov techniques from [15]). In the more general framework, under the assumption that the pair (A,B)(A,B) is only stabilizable (which in particular allows eigenvalues of AA to lie on the closed unit disk), [24] shows that there exists a kk-history-dependent control policy that ensures moment stability of (3.25), provided the control authority UmaxU_{\max} is chosen sufficiently large. It was conjectured in [24] that the lower bound on UmaxU_{\max} can be lifted with newer techniques, and here we demonstrate that this is indeed the case. The following result is an easy corollary of Proposition 3.2. For simplicity, we assume that AA is orthogonal and (A,B)(A,B) is reachable in kk steps. The steps from there to the more general case are similar to those in [24]. In case BB has full row rank, it will follow that kk can be taken to be 11, that is, the resulting policy is a stationary feedback.
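The k-step reachability assumption is straightforward to check numerically. The sketch below (our own helper function; the rotation example is hypothetical) builds the matrix R_k = [B AB … A^{k−1}B] and verifies rank(R_k) = d.

```python
import numpy as np

def reachability_matrix(A, B, k):
    # R_k = [B  AB  A^2 B ... A^{k-1} B], blocks stacked horizontally
    blocks, M = [], B.copy()
    for _ in range(k):
        blocks.append(M)
        M = A @ M
    return np.hstack(blocks)

# hypothetical example: a planar rotation (orthogonal A) with one input channel
theta = 0.3
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
B = np.array([[1.0],
              [0.0]])
Rk = reachability_matrix(A, B, 2)
rank = int(np.linalg.matrix_rank(Rk))  # equals d = 2, so (A, B) is reachable in 2 steps
```

Here B does not have full row rank, so k = 1 does not suffice; the rotation spreads the single input direction over two steps, which is exactly the situation the k-history-dependent policy below is designed for.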

Proposition 3.7.

Consider the system defined by (3.25). Suppose that AA is orthogonal and the pair (A,B)(A,B) is reachable in kk steps (that is, rank(k)=d\operatorname{rank}(\mathcal{R}_{k})=d, where k=[BABA2BAk1B]\mathcal{R}_{k}=[B\ AB\ A^{2}B\ \ldots\ A^{k-1}B]). Then for any Umax>0U_{\max}>0, there exists a kk-history dependent policy π={πn}\pi=\{\pi_{n}\} such that given (Xnk+1,,Xn1,Xn)=(xnk+1,,xn1,xn)(X_{n-k+1},\ldots,X_{n-1},X_{n})=(x_{n-k+1},\ldots,x_{n-1},x_{n}), πn(xnk+1,,xn1,xn)fn mod k(xn/kk)\pi_{n}(x_{n-k+1},\ldots,x_{n-1},x_{n})\doteq f_{n\text{ \rm mod }k}(x_{\left\lfloor{n/k}\right\rfloor k}) for some functions f0,f1,,fk1:dmf_{0},f_{1},\ldots,f_{k-1}:\mathbb{R}^{d}\rightarrow\mathbb{R}^{m} where fi(x)Umax\|f_{i}(x)\|\leqslant U_{\max} for i=0,1,2,,k1i=0,1,2,\ldots,k-1, and for which supn𝔼x0Xnr<\sup_{n}\mathbb{E}_{x_{0}}\|X_{n}\|^{r}<\infty for any x0dx_{0}\in\mathbb{R}^{d}.

Proof.

Define X^n(k)=Xnk\hat{X}^{(k)}_{n}=X_{nk}, and notice that by iterating (3.25) we get

\hat{X}^{(k)}_{n+1}=A^{k}\hat{X}^{(k)}_{n}+\mathcal{R}_{k}\begin{pmatrix}u_{(n+1)k-1}\\ \vdots\\ u_{nk+1}\\ u_{nk}\end{pmatrix}+\sum_{j=1}^{k}A^{k-j}\xi_{nk+j}\equiv A^{k}\hat{X}^{(k)}_{n}+\mathcal{R}_{k}\hat{u}^{(k)}_{n}+\hat{\xi}^{(k)}_{n}.

Notice that 𝔼(ξ^n(k))=0\mathbb{E}(\hat{\xi}^{(k)}_{n})=0 and supn𝔼ξ^n(k)p𝒞^k,p\sup_{n}\mathbb{E}\|\hat{\xi}^{(k)}_{n}\|^{p}\leqslant\hat{\mathscr{C}}_{k,p} for some constant 𝒞^k,p>0\hat{\mathscr{C}}_{k,p}>0. Since k\mathcal{R}_{k} has full row rank, it has a right inverse k\mathcal{R}_{k}^{-}. Define

sat(y)={y,yB(0,U^max)U^maxy/y,otherwise\operatorname{sat}(y)=\begin{cases}y,&\quad y\in B(0,\hat{U}_{\max})\\ \hat{U}_{\max}\ y/\|y\|,&\quad\text{otherwise}\end{cases}

and choose u^n(k)=kAksat(X^n(k))\hat{u}^{(k)}_{n}=-\mathcal{R}_{k}^{-}A^{k}\operatorname{sat}(\hat{X}^{(k)}_{n}), where U^max\hat{U}_{\max} is such that kAkU^maxUmax.\|\mathcal{R}_{k}^{-}A^{k}\|\hat{U}_{\max}\leqslant U_{\max}. This yields the system

X^n+1(k)=AkX^n(k)Aksat(X^n(k))+ξ^n(k).\displaystyle\hat{X}^{(k)}_{n+1}=A^{k}\hat{X}^{(k)}_{n}-A^{k}\operatorname{sat}(\hat{X}^{(k)}_{n})+\hat{\xi}^{(k)}_{n}.

Since Akz,Aksat(z)=U^maxz\langle A^{k}z,-A^{k}\operatorname{sat}(z)\rangle=-\hat{U}_{\max}\|z\| for z>U^max\|z\|>\hat{U}_{\max} (recall that AA is orthogonal and sat(z)=U^maxz/z\operatorname{sat}(z)=\hat{U}_{\max}z/\|z\| there), we have from Proposition 3.2 that there exists a constant 𝔠0(k,r)\mathfrak{c}^{(k,r)}_{0} such that

\sup_{n}\mathbb{E}\|\hat{X}^{(k)}_{n}\|^{r}=\sup_{n}\mathbb{E}\|X_{nk}\|^{r}<\mathfrak{c}^{(k,r)}_{0}.

It is now immediate by a sequential argument that for any =1,,k1\ell=1,\ldots,k-1, 𝔼Xnk+r𝔠(k,r)\mathbb{E}\|X_{nk+\ell}\|^{r}\leqslant\mathfrak{c}^{(k,r)}_{\ell}, where 𝔠(k,r)=3r1(Ar𝔠1(k,r)+kAkrUmaxr+𝔪r)\mathfrak{c}^{(k,r)}_{\ell}=3^{r-1}\left(\|A\|^{r}\mathfrak{c}^{(k,r)}_{\ell-1}+\|\mathcal{R}_{k}^{-}A^{k}\|^{r}U^{r}_{\max}+\mathfrak{m}^{r}_{*}\right).

Notice that the original controls unu_{n} are given by

un=Ek(n mod k)TkAksat(Xn/kk),u_{n}=-E_{k-(n\text{ \rm mod }k)}^{T}\mathcal{R}_{k}^{-}A^{k}\operatorname{sat}(X_{\left\lfloor{n/k}\right\rfloor k}),

where the matrices Ej𝕄m×km,j=1,2,,kE_{j}\in\mathbb{M}_{m\times km},\ j=1,2,\ldots,k, are defined by

Ej=[𝟎m×m𝟎m×m𝑰m×mj-th block𝟎m×m𝟎m×m]\displaystyle E_{j}=\begin{bmatrix}\bm{0}_{m\times m}&\ldots&\bm{0}_{m\times m}&\underbrace{\bm{I}_{m\times m}}_{j\text{-th block}}&\bm{0}_{m\times m}&\ldots&\bm{0}_{m\times m}\end{bmatrix}

In particular, from the state at time nknk, the present and the next k1k-1 controls uj,j=nk,nk+1,,nk+k1u_{j},j=nk,nk+1,\ldots,nk+k-1 can be computed. ∎
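The construction in the proof can be simulated directly. The sketch below (our own instantiation, with hypothetical parameters) takes d = 2, an orthogonal rotation A, and B = I, so that k = 1 and R₁⁻ = I; the closed loop is then X_{n+1} = A(X_n − sat(X_n)) + ξ_{n+1}, and the empirical second moment remains uniformly bounded even with the modest control authority U_max = 1.

```python
import numpy as np

def sat(x, u_max):
    # identity inside B(0, u_max), radial projection onto the ball outside
    nrm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x * (u_max / np.maximum(nrm, u_max))

rng = np.random.default_rng(2)
theta, u_max = 0.5, 1.0
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal state matrix
n_paths, n_steps = 1000, 400
X = np.zeros((n_paths, 2))
second_moments = []
for _ in range(n_steps):
    xi = rng.standard_normal((n_paths, 2))
    X = (X - sat(X, u_max)) @ A.T + xi            # X_{n+1} = A(X_n - sat(X_n)) + xi
    second_moments.append(float(np.mean(np.sum(X ** 2, axis=1))))

max_second_moment = max(second_moments)           # bounded despite ||u_n|| <= u_max
```

Without the saturated feedback the state would be a random walk (A is norm-preserving), so its second moment would grow linearly in n; the bounded control is what produces the uniform bound.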


References

  • Barnsley et al. [1988] M. F. Barnsley, S. G. Demko, J. H. Elton, and J. S. Geronimo. Invariant measures for Markov processes arising from iterated function systems with place-dependent probabilities. Annales de l’Institut Henri Poincaré. Probabilités et Statistique, 24(3):367–394, 1988. Erratum in ibid., 24 (1989), no. 4, 589–590.
  • Chatterjee and Pal [2011] D. Chatterjee and S. Pal. An excursion-theoretic approach to stability of discrete-time stochastic hybrid systems. Applied Mathematics & Optimization, 63(2):217–237, 2011. http://dx.doi.org/10.1007/s00245-010-9117-6.
  • Chatterjee et al. [2011] D. Chatterjee, E. Cinquemani, and J. Lygeros. Maximizing the probability of attaining a target prior to extinction. Nonlinear Analysis: Hybrid Systems, 5(2):367 – 381, 2011. Special Issue related to IFAC Conference on Analysis and Design of Hybrid Systems (ADHS’09) - IFAC ADHS’09, http://dx.doi.org/10.1016/j.nahs.2010.12.003.
  • Costa et al. [2005] O. L. V. Costa, M. D. Fragoso, and R. P. Marques. Discrete-Time Markov Jump Linear Systems. Probability and its Applications (New York). Springer-Verlag London, Ltd., London, 2005.
  • Diaconis and Freedman [1999] P. Diaconis and D. Freedman. Iterated random functions. SIAM Review, 41(1):45–76 (electronic), 1999.
  • Ganguly and Sundar [2021] Arnab Ganguly and P. Sundar. Inhomogeneous functionals and approximations of invariant distributions of ergodic diffusions: central limit theorem and moderate deviation asymptotics. Stochastic Process. Appl., 133:74–110, 2021. ISSN 0304-4149.
  • Graham and Talay [2013] Carl Graham and Denis Talay. Stochastic simulation and Monte Carlo methods, volume 68 of Stochastic Modelling and Applied Probability. Springer, Heidelberg, 2013. ISBN 978-3-642-39362-4; 978-3-642-39363-1. Mathematical foundations of stochastic simulation.
  • Hernández-Lerma and Lasserre [2001] Onésimo Hernández-Lerma and Jean B. Lasserre. Further criteria for positive Harris recurrence of Markov chains. Proc. Amer. Math. Soc., 129(5):1521–1524, 2001. ISSN 0002-9939.
  • Hernández-Lerma and Lasserre [2003] Onésimo Hernández-Lerma and Jean Bernard Lasserre. Markov chains and invariant probabilities, volume 211 of Progress in Mathematics. Birkhäuser Verlag, Basel, 2003. ISBN 3-7643-7000-9.
  • Hespanha [2005] J. P. Hespanha. A model for stochastic hybrid systems with application to communication networks. Nonlinear Anal., 62(8):1353–1383, 2005. ISSN 0362-546X.
  • Jarner and Tweedie [2001] S. F. Jarner and R. L. Tweedie. Locally contracting iterated functions and stability of Markov chains. Journal of Applied Probability, 38(2):494–507, 2001.
  • Lasota and Mackey [1994] A. Lasota and M. C. Mackey. Chaos, Fractals, and Noise, volume 97 of Applied Mathematical Sciences. Springer-Verlag, New York, 2 edition, 1994.
  • Lasota and Yorke [1994] A. Lasota and J. A. Yorke. Lower bound technique for Markov operators and iterated function systems. Random & Computational Dynamics, 2(1):41–77, 1994.
  • Mao and Yuan [2006] Xuerong Mao and Chenggui Yuan. Stochastic differential equations with Markovian switching. Imperial College Press, London, 2006. ISBN 1-86094-701-8.
  • Meyn and Tweedie [2009] S. P. Meyn and R. L. Tweedie. Markov Chains and Stochastic Stability. Cambridge University Press, London, 2 edition, 2009.
  • Meyn and Tweedie [1992] Sean P. Meyn and R. L. Tweedie. Stability of Markovian processes. I. Criteria for discrete-time chains. Adv. in Appl. Probab., 24(3):542–574, 1992. ISSN 0001-8678.
  • Meyn and Tweedie [1993a] Sean P. Meyn and R. L. Tweedie. Stability of Markovian processes. II. Continuous-time processes and sampled chains. Adv. in Appl. Probab., 25(3):487–517, 1993a. ISSN 0001-8678.
  • Meyn and Tweedie [1993b] Sean P. Meyn and R. L. Tweedie. Stability of Markovian processes. III. Foster-Lyapunov criteria for continuous-time processes. Adv. in Appl. Probab., 25(3):518–548, 1993b. ISSN 0001-8678.
  • Mariton [1990] Michel Mariton. Jump Linear Systems in Automatic Control. Marcel Dekker, New York, 1990.
  • Peigné [1993] M. Peigné. Iterated function systems and spectral decomposition of the associated Markov operator. In Fascicule de probabilités, volume 1993 of Publ. Inst. Rech. Math. Rennes, page 28. Univ. Rennes I, Rennes, 1993.
  • Pemantle and Rosenthal [1999] R. Pemantle and J. S. Rosenthal. Moment conditions for a sequence with negative drift to be uniformly bounded in L^r. Stochastic Processes and their Applications, 82(1):143–155, 1999.
  • Da Prato [2006] G. Da Prato. An Introduction to Infinite-Dimensional Analysis. Universitext. Springer-Verlag, Berlin, 2006. Revised and extended from the 2001 original by Da Prato.
  • Protter [2005] Philip E. Protter. Stochastic Integration and Differential Equations, volume 21 of Stochastic Modelling and Applied Probability. Springer-Verlag, Berlin, 2005. ISBN 3-540-00313-4. Second edition. Version 2.1, Corrected third printing.
  • Ramponi et al. [2010] F. Ramponi, D. Chatterjee, A. Milias-Argeitis, P. Hokayem, and J. Lygeros. Attaining mean square boundedness of a marginally stable stochastic linear system with a bounded control input. IEEE Transactions on Automatic Control, 55(10):2414–2418, 2010. http://arxiv.org/abs/0907.1436.
  • Szarek [2003] T. Szarek. Invariant measures for nonexpansive Markov operators on Polish spaces. Dissertationes Mathematicae (Rozprawy Matematyczne), 415, 2003. Dissertation, Polish Academy of Sciences, Warsaw, 2003.
  • Talay [1990] Denis Talay. Second-order discretization schemes of stochastic differential systems for the computation of the invariant law. Stochastics and Stochastic Reports, 29(1):13–36, 1990.
  • Talay and Tubaro [1990] Denis Talay and Luciano Tubaro. Expansion of the global error for numerical schemes solving stochastic differential equations. Stochastic Anal. Appl., 8(4):483–509 (1991), 1990. ISSN 0736-2994.
  • Yin and Zhu [2010] G. George Yin and Chao Zhu. Hybrid switching diffusions, volume 63 of Stochastic Modelling and Applied Probability. Springer, New York, 2010. ISBN 978-1-4419-1104-9. Properties and applications.