
Expanding on Average Diffeomorphisms of Surfaces: Exponential Mixing

Jonathan DeWitt and Dmitry Dolgopyat
Department of Mathematics, The University of Maryland, College Park, MD 20742, USA
Abstract.

We show that the Bernoulli random dynamical system associated to an expanding on average tuple of volume preserving diffeomorphisms of a closed surface is exponentially mixing.

1. Introduction

1.1. The main result

In this paper, we prove exponential equidistribution and mixing results for expanding on average random dynamical systems. Suppose that $M$ is a closed Riemannian surface with a smooth area, and $(f_1,\ldots,f_m)$ is a tuple of diffeomorphisms in $\operatorname{Diff}^2_{\operatorname{vol}}(M)$. We then define a random dynamical system, where at each time step we choose uniformly at random an index $i\in\{1,\ldots,m\}$ and apply $f_i$ to $M$. We call this the (uniform Bernoulli) random dynamical system on $M$ associated to the tuple $(f_1,\ldots,f_m)$. A realization of the randomness is then given by a word from $\Sigma=\{1,\ldots,m\}^{\mathbb{N}}$. As usual, we equip $\Sigma$ with the distance $d(\omega',\omega'')=2^{-k}$, where $k=\max\{N:\omega'_n=\omega''_n\text{ for }n<N\}$. We let $\sigma\colon\Sigma\to\Sigma$ denote the left shift and let $\mu$ be the uniform Bernoulli product measure on $\Sigma$.

For such random dynamical systems, mixing does not hold for all tuples $(f_1,\ldots,f_m)$, so we introduce an additional hypothesis. We say that a tuple $(f_1,\ldots,f_m)$ is expanding on average if there exist $\lambda>0$ and $n_0\in\mathbb{N}$ such that for all $v\in T^1M$, the unit tangent bundle of $M$,

\[\frac{1}{n_0}\mathbb{E}\left[\ln\|Df^{n_0}_{\omega}v\|\right]\geq\lambda>0. \tag{1.1}\]

Note that (1.1) is a $C^1$-open condition on the tuple $(f_1,\ldots,f_m)$, so in principle it could be checked on a computer (cf. [Chu20]).

The main result of our paper is that the systems satisfying (1.1) enjoy exponential mixing.

Theorem 1.1.

(Quenched Exponential Mixing) Suppose that $M$ is a closed surface and that $(f_1,\ldots,f_m)$ is an expanding on average tuple of diffeomorphisms in $\operatorname{Diff}^2_{\operatorname{vol}}(M)$. Let $\beta\in(0,1)$ be a Hölder regularity. There exists $\eta>0$ such that for a.e. $\omega\in\Sigma$, there exists $C_\omega$ such that for any $\phi,\psi\in C^\beta(M)$,

\[\left|\int\phi\,\psi\circ f^n_\omega\,d\operatorname{vol}-\int\phi\,d\operatorname{vol}\int\psi\,d\operatorname{vol}\right|\leq C_\omega e^{-\eta n}\|\phi\|_{C^\beta}\|\psi\|_{C^\beta}, \tag{1.2}\]

where $f^i_{\sigma^j(\omega)}=f_{\omega_{j+i}}\cdots f_{\omega_{j+1}}$. Further, there exists $D_1>0$ such that

\[\mu(\omega:C_\omega\geq C)\leq D_1C^{-1}. \tag{1.3}\]

In fact, the tail bound (1.3) implies a related result, annealed exponential mixing for the associated skew product. We give the proof of the following in §11.4.

Corollary 1.2.

(Annealed Exponential Mixing) Let $M$ be a closed surface, let $(f_1,\ldots,f_m)$ be an expanding on average tuple in $\operatorname{Diff}^2_{\operatorname{vol}}(M)$, and let $\beta\in(0,1)$ be a Hölder regularity. Let $F\colon\Sigma\times M\to\Sigma\times M$ be the skew product defined by

\[F(\omega,x)=(\sigma(\omega),f_{\omega_0}(x)).\]

Then $F$ is exponentially mixing, that is, there exist $\bar{\eta}>0$ and $D$ such that for any $\Phi,\Psi\in C^\beta(\Sigma\times M)$,

\[\left|\iint\Phi\,(\Psi\circ F^n)\,d\mu\,d\operatorname{vol}-\iint\Phi\,d\mu\,d\operatorname{vol}\iint\Psi\,d\mu\,d\operatorname{vol}\right|\leq De^{-\bar{\eta}n}\|\Phi\|_{C^\beta}\|\Psi\|_{C^\beta}.\]
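To indicate why a tail bound of the form (1.3) is relevant here, the following is a sketch in a simplified special case only: we assume that $\Phi=\phi$ and $\Psi=\psi$ depend on $x\in M$ alone and that $\operatorname{vol}$ is normalized to be a probability measure. The symbol $\rho_\omega(n)$ is notation introduced for this sketch; the actual proof for general observables is the one given in §11.4.

```latex
% Sketch under the assumptions stated above; \rho_\omega(n) is the quenched
% correlation appearing on the left-hand side of (1.2).
\begin{align*}
\left|\int \rho_\omega(n)\,d\mu(\omega)\right|
 &\le \int_{\{C_\omega < C\}} |\rho_\omega(n)|\,d\mu(\omega)
    + \int_{\{C_\omega \ge C\}} |\rho_\omega(n)|\,d\mu(\omega) \\
 &\le \left(C e^{-\eta n} + 2 D_1 C^{-1}\right)
      \|\phi\|_{C^\beta}\|\psi\|_{C^\beta}.
\end{align*}
% First term: the quenched bound (1.2) with C_\omega < C.
% Second term: the trivial bound |\rho_\omega(n)| \le 2\|\phi\|_\infty\|\psi\|_\infty
% together with the tail estimate (1.3).
% Taking C = e^{\eta n/2} gives the rate (1 + 2D_1)\, e^{-\eta n/2}.
```

In this special case one may therefore take $\bar{\eta}=\eta/2$; no claim is made that this is the constant obtained in the general argument.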

Before we proceed to discussing the relationship of this work with the existing literature, we will look at some examples of systems satisfying (1.1).

Remark 1.3.

Although we have written this paper for a finite tuple $(f_1,\ldots,f_m)$ of diffeomorphisms to emphasize the discreteness of the noise, one can consider random dynamics generated by any probability measure $\mu$ on $\operatorname{Diff}^2_{\operatorname{vol}}(M)$. Arguments similar to the ones we present here imply that the analogous conclusions hold for random dynamics generated by a measure $\mu$ with compact support on $\operatorname{Diff}^2_{\operatorname{vol}}(M)$, where $M$ is a closed surface.

1.2. Examples

There are a number of sources of tuples $(f_1,\ldots,f_m)$ that are expanding on average. The random dynamics arising from such tuples may exhibit uniform or non-uniform hyperbolicity. One of the simplest and archetypal examples is the following.

Example 1.4.

Suppose that $(A_1,\ldots,A_m)$ is a tuple of matrices in $\operatorname{SL}(2,\mathbb{Z})$ satisfying the hypotheses of Furstenberg's theorem, namely the tuple is strongly irreducible and contracting. Then the Bernoulli random product of these matrices has a positive top Lyapunov exponent. It follows from the proof of Furstenberg's theorem, see, e.g., [BL85, Thm. III.4.3], that there exist $N$ and $\lambda>0$ such that for all unit vectors $v\in\mathbb{R}^2$,

\[N^{-1}\mathbb{E}\left[\ln\|A^N_\omega v\|\right]\geq\lambda>0.\]

Each $A_i\in\operatorname{SL}(2,\mathbb{Z})$ acts on $\mathbb{T}^2=\mathbb{R}^2/\mathbb{Z}^2$, and the associated random dynamics on $\mathbb{T}^2$ is uniformly expanding on average. Because this is an open condition, we see that any sufficiently small volume preserving perturbation of the $A_i$ is also uniformly expanding on average. Thus, our theorem applies to a class of non-linear systems that do not exhibit any uniform hyperbolicity.
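For a tuple of matrices, the expectation in the inequality above is a finite average over words of length $N$, so it can be evaluated directly. The following is a minimal sketch of such a computation, not a method taken from this paper: the function names, the grid of test vectors, and the example matrices `A1`, `A2` are all hypothetical, and we do not verify here that this particular pair satisfies the hypotheses of Furstenberg's theorem. A positive minimum over a finite grid is only numerical evidence, since the condition must hold for every unit vector.

```python
import itertools
import math


def mat_vec(a, v):
    """Apply a 2x2 matrix (nested tuples) to a vector."""
    return (a[0][0] * v[0] + a[0][1] * v[1],
            a[1][0] * v[0] + a[1][1] * v[1])


def expected_log_growth(matrices, n, v):
    """Value of (1/n) E[ln ||A_w^n v||] over uniform words w of length n,
    computed by enumerating all len(matrices)**n words."""
    total = 0.0
    for word in itertools.product(range(len(matrices)), repeat=n):
        w = v
        for i in word:  # the first letter of the word is applied first
            w = mat_vec(matrices[i], w)
        total += math.log(math.hypot(w[0], w[1]))
    return total / (n * len(matrices) ** n)


def min_growth_on_grid(matrices, n, grid=360):
    """Minimum of the expected growth over a grid of unit vectors.

    Only half of the circle is scanned because v and -v have equal norms.
    """
    best = math.inf
    for j in range(grid):
        theta = math.pi * j / grid
        v = (math.cos(theta), math.sin(theta))
        best = min(best, expected_log_growth(matrices, n, v))
    return best


# Hypothetical example tuple in SL(2, Z), chosen only for illustration.
A1 = ((2, 1), (1, 1))
A2 = ((1, 1), (1, 2))

if __name__ == "__main__":
    for n in (1, 2, 4, 6):
        print(n, min_growth_on_grid((A1, A2), n))
```

The cost grows like $m^N$, so this brute-force enumeration is only practical for short words; for nonlinear diffeomorphisms one would additionally have to scan the base point, as the condition (1.1) is required on the whole unit tangent bundle.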

In addition, the expanding on average property generalizes to many other random walks on homogeneous spaces, see for example [EL, Def. 1.4], which uses this property to study stiffness of stationary measures of random walks on homogeneous spaces.

Expanding on average systems also arise as perturbations of isometric systems.

Example 1.5.

Perhaps the first example where this condition was considered for nonlinear diffeomorphisms was the paper of Dolgopyat and Krikorian [DK07]. Suppose that $(R_1,\ldots,R_m)$ is a tuple of isometries of $S^2$ that generates a dense subgroup of $\operatorname{SO}(3)$. Then [DK07] shows that there exists $k_0$ such that if $(f_1,\ldots,f_m)$ is a sufficiently $C^{k_0}$ small volume preserving perturbation of $(R_1,\ldots,R_m)$, and the tuple $(f_1,\ldots,f_m)$ has a stationary measure with non-zero Lyapunov exponents, then $(f_1,\ldots,f_m)$ is expanding on average. See also DeWitt [DeW24].

Other work has explored how ubiquitous expanding on average systems are, in some cases studying whether expanding on average systems can be realized by perturbing a known system of interest.

Example 1.6.

Chung [Chu20] gives a proof that certain random perturbations of the standard map are expanding on average (see also [BXY17, BXY18] which studies the size of Lyapunov exponents for perturbations of the standard map with a large coupling constant). [Chu20] also presents convincing numerical simulations showing that certain actions on character varieties are expanding on average as well.

There are also some results that construct expanding on average systems densely in a weak* sense.

Example 1.7.

The paper [Pot22] shows that for every open set $\mathcal{U}\subseteq\operatorname{Diff}^{\infty}_{\operatorname{vol}}(M)$, where $M$ is a surface, there exists a finitely supported measure on $\mathcal{U}$ that is expanding on average. This result was generalized to higher dimensions in [ES23].

1.3. Relationship with other works

Exponential mixing plays a central role in the study of statistical properties of dynamical systems. In particular, multiple exponential mixing implies several probabilistic results including the Central Limit Theorem [Che06, BG20], the Poisson Limit Theorem [DFL22], and the dynamical Borel–Cantelli Lemma [Gal10], among others. Further, exponential mixing was recently shown to imply Bernoullicity [DKRH24].

For deterministic systems, however, robust exponential mixing has only been established for a limited class of systems: uniformly hyperbolic systems in both smooth and piecewise smooth settings [CM06, Via99, You98], and partially hyperbolic systems where all Lyapunov exponents in the central direction have the same sign [dCJ02, CV13, Dol00]. Here we say that a certain property holds robustly if it holds for a given system as well as for its small perturbations. In contrast, if additional symmetries are present then there are many other cases where exponential mixing is known, see [GS14, KM96, Liv04, TZ23]. There are also checkable conditions for exponential mixing in the nonuniformly hyperbolic setting, see [You98, You99]. However, except for the aforementioned examples, these conditions hold for individual systems rather than open sets. On the other hand, KAM theory tells us that away from (partially) hyperbolic systems one has open sets of non-ergodic systems, so one cannot expect chaotic behavior to be generic.

The situation is different for random systems. In fact, if the supply of random maps is rich enough then one can show that exponential mixing and other statistical properties hold generically. Such results are known for stochastic flows of diffeomorphisms [DKK04] as well as for random deterministic shear flows [BCZG23]. It is therefore natural to ask how large the set of random diffeomorphisms must be so that the corresponding random dynamical system exhibits random behavior. The following conjecture is formulated in [DK07].

Conjecture 1.8.

For each closed manifold $M$ with volume and regularity $k\geq 1$, there exists $m$ such that the space of tuples $(f_1,\ldots,f_m)$ that are stably ergodic is open and dense in $\left(\operatorname{Diff}^k_{\operatorname{vol}}(M)\right)^m$.

The point of this conjecture is that only a tiny bit of randomness, perhaps even the minimum amount, should be sufficient to ensure robust ergodic and statistical properties for a dynamical system. Consequently, the situation where the driving measure has uniformly small, finite support on $\operatorname{Diff}^2_{\operatorname{vol}}(M)$ is the most interesting and hardest case in which to consider this question. The obvious approach to this conjecture is first to show that an open and dense set of tuples is expanding on average.

Other papers have significantly extended the properties of expanding on average systems. One of the first is [BRH17], which shows a strong stiffness property of these systems: any stationary measure for the Markov process that is not finitely supported is volume [BRH17, Thm. 3.4]. Thus, in some sense, volume is the only measure whose statistical properties are interesting to study. The only statistical property beyond ergodicity studied before for expanding on average systems is large deviations for ergodic sums established in [Liu16, Thm. 4.1.1]. Our paper provides an additional contribution to this topic by showing that expanding on average systems enjoy exponential mixing. In fact, Conjecture 1.8 provides an additional motivation for this work, because it shows that should the conjecture be true, then exponential mixing is a generic property for random dynamical systems.

Some work has been done towards showing that uniform expansion is a generic property. In particular, [OP22] shows that one may obtain positive integrated Lyapunov exponent for conservative random systems on surfaces. This work differs from the papers [Pot22] and [ES23] as [OP22] does not require an arbitrarily large number of diffeomorphisms to obtain its result.

Returning to deterministic systems, it is natural to ask for conditions for strong statistical properties to hold in a robust way. Optimal conditions are not yet well understood. While there are strong indications that at least a dominated splitting is necessary [Pal00], the best available results pertain to partially hyperbolic systems. A well known conjecture of Pugh and Shub [Shu06] states that stably ergodic systems contain an open and dense subset of partially hyperbolic systems. Currently the best results on this problem are due to [BW10], which can be consulted for a detailed discussion on this subject. In fact, the methods of Pugh and Shub also give the $K$-property [BW10]. Going beyond the $K$-property remains an outstanding challenge even in the partially hyperbolic setting. In view of the strong consequences of exponential mixing it is natural to conjecture the following.

Conjecture 1.9.

Exponential mixing holds for an open and dense set of volume preserving partially hyperbolic systems.

Currently there are two possible ways to attack this conjecture. The first one is based on the theory of weighted Banach spaces [AGT06, CL22, GL06, Tsu01, TZ23]. To describe the second approach, recall that the papers [Via08, AV10] show that partially hyperbolic systems often have non-zero exponents. It is therefore natural to try to extend the methods used in proving exponential mixing in non-uniformly hyperbolic systems to handle the partially hyperbolic setting. As mentioned above, this approach was successful in handling the case where the central exponents have the same sign. In the present paper we consider a skew product with a shift in the base and where the Lyapunov exponents in the central direction have different signs. We hope that a similar approach could be useful for studying more general skew products, and could provide a blueprint for studying mixing in partially hyperbolic systems.

In summary, the present work is the first step in extending mixing to a large class of smooth systems both random and deterministic, and we hope that various extensions will be addressed in future works.

Acknowledgments: The first author was supported by the National Science Foundation under Award No. DMS-2202967. The second author was supported by the National Science Foundation under award No. DMS-2246983. The authors are grateful to Matheus Manzatto de Castro for comments on an earlier version of the manuscript.

2. Setting and basic definitions

2.1. Random dynamics and skew products

In this section, we will state some basic definitions that will be used throughout the paper. Although we introduce many of these definitions and notations here, we will recall and reintroduce them when they are used; this section is just an overview.

We begin by recalling the main definition of our setup.

Definition 2.1.

We say that a tuple $(f_1,\ldots,f_m)\in\operatorname{Diff}^1(M)$ is expanding on average if there exist some $n_0\in\mathbb{N}$ and $\lambda_0>0$ such that for all $v\in T^1M$,

\[\mathbb{E}\left[n_0^{-1}\ln\|Df^{n_0}_\omega v\|\right]\geq\lambda_0>0. \tag{2.1}\]

Throughout the paper, $(f_1,\ldots,f_m)$ typically denotes a uniformly expanding on average tuple of volume preserving diffeomorphisms of a closed surface $M$. However, in some cases, we are merely referring to a tuple and do not make use of any further assumptions.

We write $(\Sigma,\sigma)$ for the one sided shift on $m$ symbols, i.e. $\Sigma=\{1,\ldots,m\}^{\mathbb{N}}$ with $\sigma$ being the left shift. We endow this space with the measure $\mu$, which is the uniform Bernoulli measure on $\Sigma$. Write $\hat{\Sigma}$ and $\hat{\mu}$ for the two-sided shift and the invariant Bernoulli measure over $\mu$.

We may view the random dynamics in two ways. First, as a Markov process on $M$. The second way, as mentioned in the statement of Corollary 1.2, is as the skew product $F\colon\Sigma\times M\to\Sigma\times M$. This skew product preserves the product measure $\mu\otimes\operatorname{vol}$. When we say that the tuple $(f_1,\ldots,f_m)$ is ergodic, we mean that the skew product $F$ is ergodic for the measure $\mu\otimes\operatorname{vol}$. This is equivalent to the absence of almost surely invariant Borel subsets of $M$ of intermediate measure. See [Kif86] for more discussion of the relationship between the skew product and the random dynamics on $M$.

For a word $\omega\in\Sigma$, we write $f^n_\omega\colon M\to M$ for the composition $f_{\omega_n}\cdots f_{\omega_1}$. We use the same notation for finite words $\omega$. For a sequence of linear maps $(A_i)_{1\leq i\leq n}$, we write $A^i=A_i\cdots A_1$. We do not always start this product with the first matrix, so we also have the notation

\[A_i^k=A_{i+k}\cdots A_{i+1}.\]

Note that this is compatible with the notation $f^i_{\sigma^j(\omega)}=f_{\omega_{j+i}}\cdots f_{\omega_{j+1}}$ from above.
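The ordering convention in these compositions is easy to get backwards, so here is a small sketch that spells it out. It is only an illustration: the helper names `torus_map` and `compose_word` are hypothetical, the maps are linear torus automorphisms chosen for convenience, and words are stored as tuples of 1-based indices so that `word[n - 1]` plays the role of $\omega_n$.

```python
from fractions import Fraction


def torus_map(matrix):
    """Return the map x -> matrix * x (mod 1) on the torus.

    With integer matrices and Fraction coordinates the arithmetic is exact.
    """
    (a, b), (c, d) = matrix

    def f(point):
        x, y = point
        return ((a * x + b * y) % 1, (c * x + d * y) % 1)

    return f


def compose_word(maps, word, point, start=0, length=None):
    """Apply f_{w_{start+length}} ... f_{w_{start+1}} to `point`.

    With start=0 this is f^length_w; with start=j it is the shifted
    composition f^length_{sigma^j(w)}. The letter w_{start+1} acts first.
    """
    if length is None:
        length = len(word) - start
    for n in range(start + 1, start + length + 1):
        point = maps[word[n - 1] - 1](point)
    return point


if __name__ == "__main__":
    maps = [torus_map(((1, 1), (0, 1))), torus_map(((1, 0), (1, 1)))]
    x = (Fraction(0), Fraction(1, 2))
    print(compose_word(maps, (1, 2), x))  # f_2(f_1(x))
    print(compose_word(maps, (2, 1), x))  # f_1(f_2(x))
```

With this convention the cocycle identity $f^{j+i}_\omega=f^i_{\sigma^j(\omega)}\circ f^j_\omega$ holds by construction, which is the compatibility noted above.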

2.2. Stable subspaces

For a sequence of linear maps, we will frequently use the singular value decomposition when it is defined. If we have a sequence of matrices $A_1,A_2,\ldots$ then, when it is defined, we write $E^s_n$ for the most contracted singular direction of $A^n$. We usually apply this to the sequence of linear maps $D_xf^n_\omega$. We write $E^s_i(\omega,x)$ for the most contracted singular direction of $D_xf^i_\omega$, and we write $E^u_i(\omega,x)$ for the most expanded singular direction of $D_xf^i_\omega$, should these directions be well defined. Often we will suppress the $x$ and $\omega$ and just write $E^s_i$; other times we will write $E^s_\omega(x)$.
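For a single $2\times 2$ matrix $A$, the most contracted and most expanded singular directions are the eigendirections of $A^{T}A$, and in dimension two they have a closed form. The sketch below computes them; the function name `singular_directions` and the tolerance are assumptions made for the illustration and do not appear in the paper. It returns `None` in the conformal case, where the directions are not well defined.

```python
import math


def singular_directions(a, tol=1e-15):
    """Return (e_s, e_u), unit vectors spanning the most contracted and the
    most expanded singular directions of the 2x2 matrix `a`.

    Returns None when `a` stretches all directions equally.
    """
    (a11, a12), (a21, a22) = a
    # Entries of the symmetric matrix S = a^T a = [[p, q], [q, r]].
    p = a11 * a11 + a21 * a21
    q = a11 * a12 + a21 * a22
    r = a12 * a12 + a22 * a22
    if abs(q) < tol and abs(p - r) < tol:
        return None
    # v(theta)^T S v(theta) = (p + r)/2 + ((p - r)/2) cos 2theta + q sin 2theta,
    # which is maximal when 2 * theta = atan2(2q, p - r).
    theta_u = 0.5 * math.atan2(2.0 * q, p - r)
    e_u = (math.cos(theta_u), math.sin(theta_u))
    e_s = (-math.sin(theta_u), math.cos(theta_u))  # orthogonal to e_u
    return e_s, e_u


if __name__ == "__main__":
    shear = ((1.0, 1.0), (0.0, 1.0))
    print(singular_directions(shear))
```

To obtain $E^s_i(\omega,x)$ in a concrete example one would apply this to the matrix of $D_xf^i_\omega$ in orthonormal coordinates at $x$ and at $f^i_\omega(x)$.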

Throughout the paper we will consider sets $\Lambda^\omega_n$, which are the sets of points $x\in M$ that are $(C,\lambda,\epsilon)$-tempered for the word $\omega$ up until time $n$, where temperedness is defined in §4.1. These points are essentially the finite time analogue of a Pesin block, cf. [BP07].

2.3. Stable manifolds

The most important dynamical objects we will consider are the stable manifolds and fake stable manifolds. Given a point $x\in M$, we define its stable manifold to be the set of points

\[W^s(\omega,x)=\{y\in M:d(f^n_\omega(x),f^n_\omega(y))\to 0\text{ exponentially fast}\}.\]

Note that the stable manifold depends on $\omega$. We denote a segment of length $2\delta$ centered at $x$ in $W^s(\omega,x)$ by $W^s_\delta(\omega,x)$. The properties of these “true” stable manifolds are discussed in Section 5. For general information about stable manifolds in random dynamical systems, see [LQ95].

As alluded to above, we will not only work with the stable manifolds, but also with finite time versions of stable manifolds. We will denote by $W^s_{n,\delta_0}(\omega,x)$ the time $n$ fake stable manifold of $x$ for the word $\omega$ restricted to a segment of radius $\delta_0$ centered at $x$. The point of the fake stable manifolds is that up to time $n$, they have similar contraction properties to an actual stable manifold. In the limit, they converge to the true stable manifold. Their definition is somewhat technical, but a detailed treatment of the fake stable manifolds is given in Appendix B, which essentially concerns itself with a quantified, finite time version of Pesin theory.

An important application of stable manifolds, fake or otherwise, is their holonomy. Suppose that we have two curves $\gamma_1,\gamma_2$ and a locally defined lamination $\mathcal{W}$ such that each leaf of $\mathcal{W}$ intersects $\gamma_1$ and $\gamma_2$ at a unique point. Let $I_1$ and $I_2$ be the sets of points of intersection of $\mathcal{W}$ with $\gamma_1$ and $\gamma_2$. Then $\mathcal{W}$ defines a holonomy map $H^{\mathcal{W}}\colon I_1\to I_2$ by carrying the unique point of intersection with a particular plaque of the lamination to the corresponding point in the other curve.

An important property that such a holonomy may satisfy is absolute continuity with respect to volume, which means that it carries Riemannian volume of $\gamma_1$ restricted to $I_1$ to a measure equivalent to the restriction to $I_2$ of Riemannian volume on $\gamma_2$. These properties will be discussed in more detail in Appendix B.

2.4. Norms

In this paper, we will use many estimates from calculus.

First we consider the norms of curves. An unparametrized curve in a manifold does not come equipped with any $C^1$ norm, as the $C^1$ norm of a curve is dependent on parametrization. Consequently, we will always view such a curve with its arclength parametrization. For $x\in\gamma$, we may consider the norm of the second derivative of $\gamma$ at the origin when we view $\gamma$ as a graph over its tangent in an exponential chart. We then define $\|\gamma\|_{C^2}$ as the supremum of this norm over all $x\in\gamma$. Note that this is essentially the same thing as the supremum of the extrinsic curvature of $\gamma$ at $x$ over all points $x\in\gamma$.

Throughout the proof, we will be interested in studying the log Hölder norms of some densities along curves. We will be slightly unconventional and write $\|\ln\rho\|_{C^\alpha}$ for the Hölder constant of $\ln\rho$, where $\rho$ is a density. Note that this does not include an estimate on $\|\ln\rho\|_\infty$, as such a norm usually does. This is because the magnitude of the density is infrequently the important thing in our arguments.

When we work in coordinates, we will write $\|\phi\|_i$ for the supremum of all the $i$th partial derivatives of the function $\phi$. For example, if $\phi\colon\mathbb{R}^2\to\mathbb{R}$, then we define

\[\|\phi\|_2=\sup_{x\in\mathbb{R}^2}\max\left\{\left|\frac{\partial^2\phi}{\partial x\,\partial y}\right|,\left|\frac{\partial^2\phi}{\partial x^2}\right|,\left|\frac{\partial^2\phi}{\partial y^2}\right|\right\}.\]

2.5. Probability facts

In the course of the paper we will use some facts from probability, which we state here for the convenience of readers who are familiar with dynamics but not as much with probability. Sometimes we will write something like $\mathbb{P}_\omega(A)$ for the measure $\mu(A)$ when we are thinking probabilistically. Also, we will often write $\mathbb{E}\left[\ldots\right]$ when we are taking expectations with respect to $\mu$, as $\mu$ is the measure driving the random dynamics.

The following concentration inequality is very useful for us.

Theorem 2.2.

[Ste97, Thm. 1.3.1] (Azuma-Hoeffding inequality) Suppose that $X_1,X_2,\ldots$ is a martingale difference sequence. Then

\[\mathbb{P}\left(\left|\sum_{i=1}^{n}X_i\right|\geq\lambda\right)\leq 2\exp\left(\frac{-\lambda^2}{2\sum_{i=1}^{n}\|X_i\|^2_{L^\infty}}\right). \tag{2.2}\]
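As a sanity check on the shape of (2.2), one can compare the bound with a simulated tail in the simplest case of a martingale difference sequence, namely independent fair $\pm 1$ signs, for which $\|X_i\|_{L^\infty}=1$. The sketch below is illustrative only; the function names and the choice of parameters are hypothetical and are not taken from the paper.

```python
import math
import random


def azuma_bound(lam, sup_norms):
    """Right-hand side of (2.2), given the sup norms ||X_i||_{L^infty}."""
    return 2.0 * math.exp(-lam ** 2 / (2.0 * sum(c * c for c in sup_norms)))


def empirical_tail(lam, n, trials, seed=0):
    """Fraction of trials with |X_1 + ... + X_n| >= lam, where the X_i are
    independent fair +-1 signs (a martingale difference sequence)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(rng.choice((-1, 1)) for _ in range(n))
        if abs(s) >= lam:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n = 100
    for lam in (10, 20, 30):
        print(lam, empirical_tail(lam, n, trials=2000), azuma_bound(lam, [1.0] * n))
```

For deviations $\lambda$ of order $n$ the bound is exponentially small in $n$, which is the regime used when controlling sums of increments of $\ln\|Df^n_\omega v\|$ along a random word.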

3. Outline of the paper

3.1. Quenched and annealed properties

The main technical result of this paper is a type of “annealed” coupling theorem, Proposition 7.7. From this theorem we deduce, after a small amount of additional work, quenched exponential equidistribution (Proposition 11.9) as well as quenched exponential mixing, which, in turn, implies annealed exponential mixing (see Corollary 1.2).

Before proceeding, let us recall what is meant, in the probabilistic sense, by an annealed as opposed to a quenched limit theorem for a random dynamical system defined by Bernoulli random application of maps $(f_1,\ldots,f_m)$. In an annealed limit theorem, we average over the entire ensemble, whereas in a quenched limit theorem one obtains a limit theorem for almost every realization of the random dynamics. For example, in the case of equidistribution consider $\phi\colon M\to\mathbb{R}$ a Hölder observable and $\nu$ a probability measure on $M$, such as a curve with density. Then annealed equidistribution says:

\[\frac{1}{m^n}\sum_{\omega^n\in\{1,\ldots,m\}^n}\int\phi\circ f^n_\omega\,d\nu\to\int\phi\,d\operatorname{vol},\]

whereas quenched equidistribution says that for almost every $\omega\in\Sigma$ with respect to the Bernoulli measure $\mu$ on $\Sigma$,

\[\int\phi\circ f^n_\omega\,d\nu\to\int\phi\,d\operatorname{vol}.\]

Note that the annealed result follows from the mixing of the skew product studied in §6.2.

While the two notions are not always equivalent, our annealed coupling theorem comes with such fast rates that by the Fubini theorem, we can deduce quenched limit theorems. This reduction happens in Section 11.
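The difference between the two averages can be made concrete in a toy example. The sketch below is not taken from the paper: it uses a hypothetical pair of linear torus automorphisms, takes $\nu$ to be normalized arclength on a short horizontal segment, and uses the observable $\phi(x,y)=\cos(2\pi x)$, whose integral against area is zero. The quenched quantity fixes one word; the annealed quantity averages the quenched ones over all words of a given length.

```python
import itertools
import math

# Hypothetical toy tuple: two hyperbolic automorphisms of the torus.
MAPS = (((2, 1), (1, 1)), ((1, 1), (1, 2)))


def apply_word(word, point):
    """Apply f_{w_n} ... f_{w_1} to a point of the torus (0-based letters)."""
    x, y = point
    for i in word:
        (a, b), (c, d) = MAPS[i]
        x, y = (a * x + b * y) % 1.0, (c * x + d * y) % 1.0
    return x, y


def phi(point):
    """Observable with zero mean with respect to area on the torus."""
    return math.cos(2.0 * math.pi * point[0])


def quenched_average(word, samples=4000, length=0.3):
    """Midpoint-rule approximation of the integral of phi(f^n_w) against
    normalized arclength on the segment {(t, 0): 0 <= t <= length}."""
    total = 0.0
    for j in range(samples):
        t = length * (j + 0.5) / samples
        total += phi(apply_word(word, (t, 0.0)))
    return total / samples


def annealed_average(n, samples=4000):
    """Average of the quenched integrals over all words of length n."""
    words = itertools.product(range(len(MAPS)), repeat=n)
    return sum(quenched_average(w, samples) for w in words) / len(MAPS) ** n


if __name__ == "__main__":
    for n in range(0, 7, 2):
        print(n, quenched_average((0, 1) * (n // 2)), annealed_average(n))
```

In this linear example every word expands the segment, so both quantities decay; the example is meant only to fix the definitions, not to illustrate the non-uniformly hyperbolic behavior that the paper addresses.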

3.2. Description of the key step

The main results of this paper follow from our annealed exponentially fast coupling proposition, Proposition 7.7, which says the following. Suppose we have two standard pairs $\hat{\gamma}_1$ and $\hat{\gamma}_2$. Each standard pair is a $C^2$ curve $\gamma_i$ along with a density $\rho_i$ defined along $\gamma_i$. Suppose that $\omega\in\Sigma$ is a random word. We say that two points $x\in\gamma_1$ and $y\in\gamma_2$ are “coupled” at time $k$ if:

  (1) $f^k_\omega(x)\in W^s_{loc}(\sigma^k(\omega),f^k_\omega(y))$,

  (2) The stable manifold $W^s_{loc}(\sigma^k(\omega),f^k_\omega(y))$ contracts uniformly exponentially quickly, so that $f^k_\omega(x)$ and $f^k_\omega(y)$ attract uniformly exponentially fast, independent of $x,y,\omega$.

In other words, after two points couple at time $k$ they attract uniformly quickly. In fact, in our coupling procedure if $x$ and $y$ couple at time $k$ then $f^k_\omega(x)$ and $f^k_\omega(y)$ both lie in a uniformly $(C,\lambda,\epsilon)$-tempered stable manifold (see Definition 5.1). Proposition 7.7 constructs a coupling which occurs exponentially quickly, in the sense that the set of points where the coupling time is greater than $k$ has exponentially small measure.

The first step towards constructing the coupling is to show that for two “nice” standard pairs $\hat{\gamma}_1$ and $\hat{\gamma}_2$ that are quite close, there exist uniform $\epsilon_0,\epsilon_1>0$ such that, with probability at least $\epsilon_0$, a proportion at least $\epsilon_1$ of the mass of $\hat{\gamma}_1$ couples at time $0$. Namely, with probability at least $\epsilon_0$, the stable manifolds $W^s_\omega$ intersect $\hat{\gamma}_1$ and $\hat{\gamma}_2$ in sets of uniformly large measure, so those points can be coupled. This fact implies that a positive proportion of the mass on $\hat{\gamma}_1$ can be coupled at the first attempt.

The complement of the pairs that couple is the disjoint union of a potentially large number of very small curves. For these “leftover” curves we will wait a potentially long time for them to grow and smoothen and then equidistribute at small scale so that we can try coupling them again. We refer to this growth and smoothening as “recovery” and the equidistribution as “precoupling.” As a positive proportion of the remaining mass gets coupled during each attempt at coupling, we expect only an exponentially small amount of mass to remain uncoupled after $n$ attempts.
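The heuristic in the last sentence can be written out as follows. This is an idealized sketch only: we assume, purely for illustration, that every attempt couples at least a fixed fraction $\epsilon\in(0,1)$ of the mass that is still uncoupled, and we write $m_n$ for the uncoupled mass after $n$ attempts (both symbols are introduced here). The sketch ignores the random waiting times between attempts, which is where the real work lies.

```latex
% Idealized bookkeeping: m_n is the mass still uncoupled after n attempts,
% and \epsilon is a hypothetical uniform lower bound on the coupled fraction.
\begin{align*}
m_{n+1} &\le (1-\epsilon)\, m_n, \\
m_n &\le (1-\epsilon)^n\, m_0 = e^{-n\,|\ln(1-\epsilon)|}\, m_0 .
\end{align*}
```

To convert a bound in terms of the number of attempts into a bound in terms of time, one also needs control on how long recovery and precoupling take.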

The actual argument is much more complicated for a fairly simple reason: we cannot determine if two points $x$ and $y$ lie in the same stable manifold until we have seen the entire word $\omega$. However, we do not want to look into the future at the entire word $\omega$, since then we would lose the Markov character of the dynamics and would not be able to use many estimates that rely on the Markov property. Consequently, we define a “stopping” time for each pair $(x,\omega)$ which tells us when to “give up” on trying to couple during the current attempt and switch to recovery. For the moment, we regard the coupling argument as having three main steps:

  (1) (Local Coupling) Attempt to couple two uniformly smooth nearby curves $\hat{\gamma}_1$ and $\hat{\gamma}_2$.

  (2) (Recovery) Show that pieces of curve that fail to couple recover quickly so that their images become long and smooth.

  (3) (Precoupling) There is a time $N_0$ such that given two long smooth curves we can divide them into subcurves such that for most of the subcurves their images $N_0$ units of time later are close to each other, so we can then try to locally couple them again.

We now describe the outline of the rest of the paper and how its different sections relate to the three main steps described above.

The first goal of the paper is to show that, for any point $x\in M$ and for most words $\omega\in\Sigma$, the stable manifolds $W^s(\omega,x)$ have good properties, including good distribution of their tangent vector, controlled $C^2$ norm, and fast contraction. To do this, we will need to obtain good estimates on $Df^n_\omega$. We show that for typical words $\omega$, $Df^n_\omega$ has a putative stable direction that has all of the properties that the stable direction of a Pesin regular point would have. We formalize these properties with our notion of $(C,\lambda,\epsilon)$-temperedness, which is described in detail in §4.1. We remark, however, that this notion is weaker than the usual notion of $\epsilon$-temperedness used in Pesin theory. We show that there exist $\lambda,\epsilon>0$ such that for almost every word $\omega$ the trajectory exhibits $(C(\omega),\lambda,\epsilon)$-temperedness for some $C(\omega)>0$. Further, we obtain estimates for the tail of $C(\omega)$. We then also study the distribution of $E^s_\omega(x)$, the stable direction for the word $\omega$ at the point $x$, and obtain estimates on the regularity of this measure, which show that the distribution of $E^s(\omega)$, and hence of the stable manifolds, is not concentrated in any particular direction, see Proposition 4.11. This discussion occupies Section 4. Through the application of Azuma's inequality, we are able to show that a typical trajectory exhibits temperedness.

In Section 6, we study the mixing properties of the skew product map FF. The proofs rely on the properties of stable manifolds that are recalled in Section 5. Mixing plays a crucial role in the Finite Time Mixing Proposition given in Section 9. This plays an important role at the precoupling stage.

Section 7 contains the precise statement of the main coupling Proposition 7.7. We then divide the proof into three main parts: the Local Coupling Lemma 7.10, the Coupled Recovery Lemma 7.9, and the Finite Time Mixing Proposition 7.11, which correspond to steps (1)–(3) in the outline above. Lemma 7.9 is proven in Section 8, Proposition 7.11 is proven in Section 9, and Lemma 7.10 is proven in Section 10.

Finally, in Section 11 we derive our main results from the main coupling proposition: we derive Theorem 1.1 and Corollary 1.2 from Proposition 7.7.

The paper contains two appendices. Appendix A describes how the smoothness of a curve which is transversal to the stable direction improves under the dynamics, while Appendix B discusses fake stable manifolds and their holonomy. In particular, we show that these objects converge exponentially fast to true stable manifolds and holonomies respectively. While the estimates in the appendices are similar to several results in Pesin theory, we provide the proofs in our paper since we could not find exact references in the existing literature. This is partially due to the fact that we put a greater emphasis on the finite time estimates, because we want to preserve the Markov property of the dynamics and hence cannot base our coupling algorithm on the knowledge of the future behavior of orbits.

3.3. Mixing in hyperbolic dynamics

We now compare our work with strategies used in other works. Historically the first mixing results for hyperbolic systems relied on symbolic dynamics, see [Bow75, Rue78, Sin72, PP90]. Currently the most flexible realization of this approach is via symbolic dynamics given by Young towers ([You98]). Later, several methods working directly with the hyperbolic systems were developed. In particular, we would like to mention weighted Banach spaces developed in [GL06] (see [Bal00] for a review) as well as the coupling approach developed in [You99]. We note that most hyperbolic systems could be analyzed by each of these methods but a different amount of work is required in different cases. For example, a recent paper [DL23] constructs weighted Banach spaces suitable for the billiard dynamics. However, these spaces are necessarily complicated reflecting the complexity of billiards systems.

In our work, we use the coupling approach. This method was originally used in [You99] to handle symbolic systems, while the modifications which allow working directly on the phase space are due to [Dol00, CM06]. The two papers mentioned above implemented the coupling methods for systems with dominated splitting. In our case, we have to deal with the general non-uniformly hyperbolic situation and this significantly expands the potential applications of the coupling method.

An attractive feature of our result is that we make only one assumption (1.1) which is, in fact, open. Our result is an example of a successful implementation of the line of research asking which dynamical properties follow just from the existence of a hyperbolic set with controlled geometry. This direction is exemplified by a conjecture of Viana [Via98], which asks if the existence of a positive measure hyperbolic set implies the existence of a physical measure. While several important recent results have made progress on this question (see [BO21, Bur24, BCS23, CLP22] as well as [BCS22] which deals with a measure of maximal entropy), much less is known about qualitative properties. In the present (and a follow-up) paper we are able to get a full package of statistical properties starting from a simple assumption (1.1).

Below we list key ingredients of our approach since similar ideas could be useful in studying other hyperbolic systems.

  1. (1)

    Using martingale large deviation bounds, we demonstrate an abundance of times where the orbit of a given vector is backward tempered.

  2. (2)

    Using two dimensionality and volume preservation, we promote exponential growth of the norm to existence of a hyperbolic splitting.

  3. (3)

    Using Pesin theory we show that a hyperbolic set cannot have gaps of too small a size since these gaps would be filled with orbits of slightly weaker hyperbolicity.

  4. (4)

    We use fake stable manifolds and quantitative estimates on their convergence to construct a finite time “fake” coupling.

  5. (5)

    Using a Mañé type argument we show that a fake coupling converges quickly to a real coupling for most trajectories.

Finally, we would like to mention that recently a different approach to quenched mixing based on random Young towers has been developed, see [ABR22, ABRV23]. So far, the authors have proved the existence of random towers for relatively simple systems where hyperbolicity is uniform at least in one direction. It might be possible to obtain exponential mixing in our case by verifying the conditions of [ABRV23]; however, this would not simplify our analysis. Indeed, the main ingredients of the Young tower construction are the following: the existence of a positive measure horseshoe, an exponential tail on the return time, and a finite time mixing estimate. The last ingredient is already established in our paper. To construct a large horseshoe would require estimates similar to our local coupling lemma of Section 10, while having an exponential tail on return times would be similar to our recovery lemma of Section 8. In addition, there are several technical properties of Young towers whose verification would require additional space and effort. For this reason we prefer to give a direct proof of exponential mixing in our setting rather than deducing our result by a lengthy verification of the conditions of the deep recent work of [ABRV23].

4. Estimates on the growth of vectors and temperedness

In this section, we study infinitesimal properties of uniformly expanding random dynamical systems. The main results of this section are a proof that the sequence of linear maps Dfω0,Dfω1,,DfωnDf_{\omega_{0}},Df_{\omega_{1}},\ldots,Df_{\omega_{n}} applied along the trajectory of a point xx typically has a splitting with most of the same properties as a point in a Pesin block has. Moreover, we give quantitative estimates on the angle between the vectors in the splitting, as well as the probability that the splitting experiences a renewal.

4.1. Tempered vectors and sequences of linear maps

In this subsection we discuss some notions of tempering for sequences of linear maps. We remark that typical notions of tempering used in Pesin theory involve both lower and upper bounds, i.e. they involve a statement like eλϵA|Eueλ+ϵe^{\lambda-\epsilon}\leq\|A|_{E^{u}}\|\leq e^{\lambda+\epsilon}. We will only take one of these two bounds to avoid having to do more estimates than necessary. Further, the version of tempering used in Pesin theory is often adapted so that the value of λ\lambda is a particular Lyapunov exponent for a particular measure. In such a context, a tempered splitting will have expansion at rate eλϵe^{\lambda-\epsilon} rather than at rate eλe^{\lambda}, as we have below. Compare for example, with the definition of (λ,μ,ϵ)(\lambda,\mu,\epsilon)-tempered in [BP07, Def. 1.2.]. In the language of this section, points that are (λ,μ,ϵ)(\lambda,\mu,\epsilon)-tempered in the sense of [BP07], have a splitting that is (C,λϵ,ϵ)(C,\lambda-\epsilon,\epsilon)-tempered in our sense.

Before we get to our ultimate notion of a tempered splitting, Definition 4.2, we first record several estimates and introduce intermediate notions.

Definition 4.1.

Consider a finite or infinite sequence of linear maps (An)nI(A_{n})_{n\in I} between a sequence of normed 22-dimensional vector spaces ViV_{i}, where II is either \mathbb{N} or a set of the form {1,,n}\{1,\ldots,n\}, and Ai:ViVi+1A_{i}\colon V_{i}\to V_{i+1}.

  1. (1)

    We say that (An)(A_{n}) has (C,λ,ϵ)(C,\lambda,\epsilon)-subtempered norms when

    Ai+jeCeλieϵjAj,\|A^{i+j}\|\geq e^{C}e^{\lambda i}e^{-\epsilon j}\|A^{j}\|,

    for all i1i\geq 1, j0j\geq 0, with i+jIi+j\in I.

  2. (2)

    We say that a vector vv is (C,λ,ϵ)(C,\lambda,\epsilon)-subtempered for the sequence of linear transformations AiA_{i} if

    (4.1) AkmvkeCeλmeϵk,\|A^{m}_{k}v^{k}\|\geq e^{C}e^{\lambda m}e^{-\epsilon k},

    where Akm=Ak+mAk+1A^{m}_{k}=A_{k+m}\cdots A_{k+1} and vk=Akv/Akvv^{k}=A^{k}v/\|A^{k}v\|, for all k,mk,m\in\mathbb{N} with k+mIk+m\in I.

  3. (3)

    We say that the vector vv is (C,λ,ϵ)(C,\lambda,\epsilon)-supertempered if

    (4.2) AkmvkeCeλmeϵk,\|A^{m}_{k}v^{k}\|\leq e^{C}e^{\lambda m}e^{\epsilon k},

    for all m,km,k and vkv^{k} as above.

  4. (4)

    Similarly, we may speak of a vector vTxMv\in T_{x}M being sub or super tempered for a sequence of diffeomorphisms (fn)nI(f_{n})_{n\in I} if it sub or super tempered for the sequence of differentials Dxf1,Df1(x)f2,D_{x}f_{1},D_{f_{1}(x)}f_{2},\ldots, etc.

Finally, we say that a sequence of maps has a $(C,\lambda,\epsilon)$-tempered splitting if there exists a pair of directions $e^{u}$ and $e^{s}$ such that the action of the maps is $(-C,\lambda,\epsilon)$-subtempered on $e^{u}$ and $(C,-\lambda,\epsilon)$-supertempered on $e^{s}$. In addition, we impose a lower bound on the angle between these two directions. Note that we do not require the angle itself to be tempered in the sense that it locally decays slowly: we just require that it stay bounded below by a slowly decaying function.

Definition 4.2.

We say that a finite or infinite sequence A1,,AnA_{1},\ldots,A_{n} of linear maps Ai:ViVi+1A_{i}\colon V_{i}\to V_{i+1} of 22-dimensional inner product spaces has a (C,λ,ϵ)(C,\lambda,\epsilon)-tempered splitting if there exists a pair of unit vectors es,euV1e^{s},e^{u}\in V_{1} such that

\begin{align}
\|A^{m}_{k}(A^{k}e^{u})\|/\|A^{k}e^{u}\| &\geq e^{-C}e^{\lambda m}e^{-\epsilon k}, \tag{4.3}\\
\|A^{m}_{k}(A^{k}e^{s})\|/\|A^{k}e^{s}\| &\leq e^{C}e^{-\lambda m}e^{+\epsilon k}, \tag{4.4}\\
\angle(A^{k}e^{s},A^{k}e^{u}) &\geq e^{-C}e^{-\epsilon k}. \tag{4.5}
\end{align}

Similarly, we say that this sequence of maps has a reverse tempered splitting, if the sequence of maps An1,,A11A_{n}^{-1},\ldots,A_{1}^{-1} has a tempered splitting.

In the rest of this section we will show that typically the sequence of differentials along a random orbit has a tempered splitting.
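
To make Definition 4.2 concrete, the following minimal sketch checks the three conditions for a finite list of $2\times 2$ matrices and a candidate pair of directions. It is an illustration only: the function names are hypothetical, matrices are plain nested tuples, and the numerator of (4.4) is taken to be the image of $e^{s}$, as the description of supertemperedness on $e^{s}$ preceding the definition requires.

```python
import math

def mat_vec(A, v):
    """Apply the 2x2 matrix A (nested tuples) to the vector v."""
    (a, b), (c, d) = A
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def norm(v):
    return math.hypot(v[0], v[1])

def angle(v, w):
    """Angle in [0, pi/2] between the lines spanned by v and w."""
    cross = abs(v[0] * w[1] - v[1] * w[0])
    dot = abs(v[0] * w[0] + v[1] * w[1])
    return math.atan2(cross, dot)

def orbit(mats, v):
    """The vectors A^k v for k = 0..n, where A^k = A_k ... A_1."""
    out = [v]
    for A in mats:
        out.append(mat_vec(A, out[-1]))
    return out

def has_tempered_splitting(mats, es, eu, C, lam, eps):
    """Check (4.3)-(4.5) for all k >= 0, m >= 1 with k + m <= n."""
    n = len(mats)
    s, u = orbit(mats, es), orbit(mats, eu)
    for k in range(n + 1):
        if angle(s[k], u[k]) < math.exp(-C - eps * k):      # (4.5)
            return False
        for m in range(1, n - k + 1):
            grow = norm(u[k + m]) / norm(u[k])               # left side of (4.3)
            shrink = norm(s[k + m]) / norm(s[k])             # left side of (4.4)
            if grow < math.exp(-C + lam * m - eps * k):
                return False
            if shrink > math.exp(C - lam * m + eps * k):
                return False
    return True
```

For instance, for six copies of the diagonal matrix with entries $2$ and $1/2$, the coordinate axes give a $(0,0.6,0.01)$-tempered splitting, since $\ln 2>0.6$, but not a $(0,0.8,0.01)$-tempered one.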

4.2. Temperedness of sums of real valued random variables

In order to study the temperedness of vectors, we will first study additive sequences of real random variables. This will be sufficient for our purposes because one may think of the norm of a vector acted upon by matrices as the sum of random variables of the form lnAvv1\ln\|Av\|\|v\|^{-1}.

In what follows, we will be studying tempered sequences of sums of real valued random variables. The results of this subsection will be used in the proof of Proposition 4.16, which says that tempered times occur exponentially fast.

Definition 4.3.

If X1,,XnX_{1},\ldots,X_{n} is a finite or infinite sequence of real numbers then we say that this sequence is (C,λ,ϵ)(C,\lambda,\epsilon)-tempered if for each 0j<kn0\leq j<k\leq n, we have that

(4.6) i=j+1kXiλ(kj)+jϵC.\sum_{i=j+1}^{k}X_{i}-\lambda(k-j)+j\epsilon\geq C.

We also say that a finite sequence X1,XnX_{1},\ldots X_{n} is (C,λ,ϵ)(C,\lambda,\epsilon)-reverse tempered if the sequence Xn,,X1X_{n},\ldots,X_{1} is (C,λ,ϵ)(C,\lambda,\epsilon)-tempered.

Note that for fixed λ,ϵ>0\lambda,\epsilon>0 every finite sequence is (C,λ,ϵ)(C,\lambda,\epsilon)-tempered for a sufficiently negative choice of CC. Further, note that this condition is harder to satisfy for large positive CC, and easier to satisfy for very negative CC.
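
As an illustration of Definition 4.3, the following sketch computes the largest constant $C$ for which a given finite sequence is $(C,\lambda,\epsilon)$-tempered, namely the minimum over $0\leq j<k\leq n$ of the left-hand side of (4.6). The function names are hypothetical and the sequence is assumed to be non-empty.

```python
from itertools import accumulate

def temperedness_constant(xs, lam, eps):
    """Largest C such that xs = (X_1, ..., X_n) is (C, lam, eps)-tempered.

    This is the minimum over 0 <= j < k <= n of
        sum_{i=j+1}^k X_i - lam * (k - j) + j * eps,
    the left-hand side of (4.6). Requires n >= 1.
    """
    # prefix[k] = X_1 + ... + X_k, with prefix[0] = 0.
    prefix = [0.0] + list(accumulate(xs))
    n = len(xs)
    return min(
        prefix[k] - prefix[j] - lam * (k - j) + j * eps
        for j in range(n)
        for k in range(j + 1, n + 1)
    )

def is_tempered(xs, C, lam, eps):
    """True when every inequality (4.6) holds with the constant C."""
    return temperedness_constant(xs, lam, eps) >= C
```

The quadratic loop over pairs $(j,k)$ mirrors the definition directly; it is meant for small examples rather than long sequences.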

We are interested in finding tempered times for sequences of random variables.

Proposition 4.4.

Fix constants c>λ0>λ1>0c>\lambda_{0}>\lambda_{1}>0 and ϵ>0\epsilon>0. Then there exist D1,D2>0D_{1},D_{2}>0 such that the following hold. Suppose that X1,X2,X_{1},X_{2},\ldots is a submartingale difference sequence with respect to a filtration (n)n(\mathcal{F}_{n})_{n\in\mathbb{N}} such that

  1. (1)

    |Xi|c\left|X_{i}\right|\leq c;

  2. (2)

    𝔼[Xi|i1]λ0\mathbb{E}\left[{X_{i}|\mathcal{F}_{i-1}}\right]\geq\lambda_{0}.

Then the temperedness constant of the random sequence has an exponential tail. Namely, for C0C\geq 0,

(4.7) (X1,X2,, is not (C,λ1,ϵ)-tempered)D1exp(D2C).\mathbb{P}(X_{1},X_{2},\ldots,\text{ is not }(-C,\lambda_{1},\epsilon)\text{-tempered})\leq D_{1}\exp(-D_{2}C).

Under the same assumptions on a finite sequence, (4.7) holds with the same constants.

Proof.

For a fixed CC, for the sequence to be (C,λ1,ϵ)(-C,\lambda_{1},\epsilon)-tempered, for each pair of indices 0j<k0\leq j<k the following inequality must be satisfied:

(4.8) Xk++Xj+1(kj)λ1+jϵC.X_{k}+\cdots+X_{j+1}-(k-j)\lambda_{1}+j\epsilon\geq-C.

To estimate the probability of this event consider $\chi_{k+1}=\mathbb{E}\left[X_{k+1}|\mathcal{F}_{k}\right]$, and let $\hat{X}_{k+1}=X_{k+1}-\chi_{k+1}$. Then the sequence $\hat{X}_{k}$ is a martingale difference sequence. Then,

\begin{align*}
&\mathbb{P}\Big(X_{k}+\cdots+X_{j+1}-(k-j)\lambda_{1}+j\epsilon\leq -C\Big)=\mathbb{P}\Big(\hat{X}_{k}+\cdots+\hat{X}_{j+1}+\sum_{i=j+1}^{k}\chi_{i}-(k-j)\lambda_{1}+j\epsilon\leq -C\Big)\\
&\quad\leq\mathbb{P}\left(\left|\sum_{i=j+1}^{k}\hat{X}_{i}\right|\geq\left|-\sum_{i=j+1}^{k}\chi_{i}+(k-j)\lambda_{1}-j\epsilon-C\right|\right)\leq\mathbb{P}\left(\left|\sum_{i=j+1}^{k}\hat{X}_{i}\right|\geq\left|-(k-j)(\lambda_{0}-\lambda_{1})-j\epsilon-C\right|\right)
\end{align*}

because we know that the term in the right hand absolute value is negative and χiλ0>λ1\chi_{i}\geq\lambda_{0}>\lambda_{1}. Then by Azuma’s inequality (Thm. 2.2),

(4.9) (Xk++Xj+1(kj)λ1+jϵC)2exp((m(λ0λ1)+jϵ+C)22mc2)\mathbb{P}\left(X_{k}+\cdots+X_{j+1}-(k-j)\lambda_{1}+j\epsilon\leq-C\right)\leq 2\exp\left(-\frac{(m(\lambda_{0}-\lambda_{1})+j\epsilon+C)^{2}}{2mc^{2}}\right)
2exp(m(λ0λ1)2+2(jϵ+C)(λ0λ1)2c2),\leq 2\exp\left(-\frac{m(\lambda_{0}-\lambda_{1})^{2}+2(j\epsilon+C)(\lambda_{0}-\lambda_{1})}{2c^{2}}\right),

where m=kjm=k-j. Summing over jj and mm we obtain that there exist D1,D2>0D_{1},D_{2}>0 independent of nn such that:

(4.10) kj+1n(Xk++Xj+1(kj)λ1+jϵC)D1exp(D2C),\sum_{k\geq j+1}^{n}\mathbb{P}(X_{k}+\cdots+X_{j+1}-(k-j)\lambda_{1}+j\epsilon\leq-C)\leq D_{1}\exp(-D_{2}C),

which gives the needed conclusion. ∎
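
The summation step can be made explicit. Writing $a=\lambda_{0}-\lambda_{1}$, the second bound in (4.9) factors as $2e^{-aC/c^{2}}e^{-ma^{2}/(2c^{2})}e^{-j\epsilon a/c^{2}}$, so summing over $j\geq 0$ and $m\geq 1$ gives a product of two geometric series. The sketch below computes one admissible pair $(D_{1},D_{2})$ obtained this way and compares it with a truncated double sum. It takes the right-hand side of (4.9) exactly as printed; the function names are hypothetical, and the resulting constants are just one valid choice, not necessarily the ones the authors have in mind.

```python
import math

def azuma_union_constants(lam0, lam1, eps, c):
    """Return (D1, D2) such that the sum over j >= 0, m >= 1 of the
    second bound in (4.9) equals D1 * exp(-D2 * C)."""
    a = lam0 - lam1
    D2 = a / c**2
    # sum_{m >= 1} exp(-m a^2 / (2 c^2)) = 1 / (exp(a^2 / (2 c^2)) - 1)
    sum_m = 1.0 / math.expm1(a**2 / (2 * c**2))
    # sum_{j >= 0} exp(-j eps a / c^2) = 1 / (1 - exp(-eps a / c^2))
    sum_j = 1.0 / -math.expm1(-eps * a / c**2)
    return 2.0 * sum_m * sum_j, D2

def truncated_union_bound(C, lam0, lam1, eps, c, n):
    """Sum of the second bound in (4.9) over 0 <= j < j + m <= n."""
    a = lam0 - lam1
    return sum(
        2.0 * math.exp(-(m * a**2 + 2 * (j * eps + C) * a) / (2 * c**2))
        for j in range(n)
        for m in range(1, n - j + 1)
    )
```

Since every term is positive, the truncated sum is bounded by $D_{1}e^{-D_{2}C}$ for every $n$, which is the uniformity in $n$ used in (4.10).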

We now estimate the probability that a sequence of random variables as above first fails to be tempered at a time nn. This will be used to ensure that failure times in the local coupling lemma have an exponential tail.

Proposition 4.5.

Fix constants c>λ0>λ1>0c>\lambda_{0}>\lambda_{1}>0 and ϵ>0\epsilon>0. Then there exists η>0\eta>0 such that the following holds. For each CC there exists D1D_{1} such that if X1,X2,X_{1},X_{2},\ldots is a submartingale difference sequence with respect to a filtration (n)n(\mathcal{F}_{n})_{n\in\mathbb{N}} and

  1. (1)

    |Xi|c\left|X_{i}\right|\leq c;

  2. (2)

    𝔼[Xi|i1]λ0\mathbb{E}\left[{X_{i}|\mathcal{F}_{i-1}}\right]\geq\lambda_{0},

then if 𝒮\mathcal{S} is the first nn such that X1,X2,,XnX_{1},X_{2},\ldots,X_{n} is not (C,λ1,ϵ)(C,\lambda_{1},\epsilon)-tempered then:

(𝒮n)D1eηn.\mathbb{P}(\mathcal{S}\geq n)\leq D_{1}e^{-\eta n}.
Proof.

To prove the proposition we show that, except on a set of exponentially small probability, the sequence $X_{1},\ldots,X_{n}$ satisfies better estimates than $(C,\lambda_{1},\epsilon)$-temperedness requires for the constraints involving $X_{n+1}$. In fact, these estimates are so much better than what is needed that, regardless of what $X_{n+1}$ is, the sequence will remain $(C,\lambda_{1},\epsilon)$-tempered as long as $X_{1},\ldots,X_{n}$ is $(C,\lambda_{1},\epsilon)$-tempered. Hence the sequence fails to be tempered for the first time at time $n+1$ with exponentially small probability.

We claim that there exist η,D1>0\eta,D_{1}>0 such that with probability at least 1D1enη1-D_{1}e^{-n\eta}, for all 0j<n0\leq j<n,

(4.11) i=j+1nXiλ1(nj)+jϵC+(nj)(λ0λ1)/2.\sum_{i=j+1}^{n}X_{i}-\lambda_{1}(n-j)+j\epsilon\geq C+(n-j)(\lambda_{0}-\lambda_{1})/2.

We now estimate, for each $0\leq j<n$, the probability that (4.11) fails. This is the same as estimating the probability that

i=j+1nXi<λ1(nj)jϵ+C+(nj)(λ0λ1)/2.\sum_{i=j+1}^{n}X_{i}<\lambda_{1}(n-j)-j\epsilon+C+(n-j)(\lambda_{0}-\lambda_{1})/2.

Note that this is the same inequality as (4.8), with (nj)(λ0λ1)/2(n-j)(\lambda_{0}-\lambda_{1})/2 added to the constant CC appearing there. Thus (4.9) gives

(i=j+1nXi<λ1(nj)jϵ+C+(nj)(λ0λ1)2)2exp(((nj)(λ0λ1)/2+jϵ+C)22(nj)c2)\!\!\!\mathbb{P}\left(\!\!\sum_{\;\;i=j+1}^{n}\!\!\!X_{i}<\lambda_{1}(n-j)\!-\!j\epsilon+\!C\!+\frac{(n-j)(\lambda_{0}-\lambda_{1})}{2}\right)\!\!\leq\!\!2\exp\left(\!\!-\frac{((n-j)(\lambda_{0}-\lambda_{1})/2\!+\!j\epsilon+C)^{2}}{2(n-j)c^{2}}\right)

As at least one of jj and njn-j exceeds n/2n/2 in size, we see that there exists a>0a>0 such that

(i=j+1nXi<λ1(nj)jϵ+C+(nj)(λ0λ1)/2)ean.\mathbb{P}\left(\sum_{i=j+1}^{n}X_{i}<\lambda_{1}(n-j)-j\epsilon+C+(n-j)(\lambda_{0}-\lambda_{1})/2\right)\leq e^{-an}.

Hence there exists D1>0D_{1}>0 such that

j=0n1(i=j+1nXi<λ1(nj)jϵ+C+(nj)(λ0λ1)/2)D1e(a/2)n.\sum_{j=0}^{n-1}\mathbb{P}\left(\sum_{i=j+1}^{n}X_{i}<\lambda_{1}(n-j)-j\epsilon+C+(n-j)(\lambda_{0}-\lambda_{1})/2\right)\leq D_{1}e^{-(a/2)n}.

Thus we see that there is a set of probability 1D1e(a/2)n1-D_{1}e^{-(a/2)n} such that the inequalities (4.11) all hold. In particular as long as nn is sufficiently large, for a realization X1,,XnX_{1},\ldots,X_{n} in this set, it follows that X1,,Xn,Xn+1X_{1},\ldots,X_{n},X_{n+1} is necessarily also (C,λ1,ϵ)(C,\lambda_{1},\epsilon)-tempered if X1,,XnX_{1},\ldots,X_{n} is.

This implies that the probability of X1,X2X_{1},X_{2}\ldots failing to be (C,λ1,ϵ)(C,\lambda_{1},\epsilon)-tempered for the first time at time nn is at most D1e(a/2)nD_{1}e^{-(a/2)n}, and the proposition follows. ∎

4.3. Tempered splittings from tempered norms

In this subsection, we show that one may obtain a tempered splitting for a sequence of matrices in $\operatorname{SL}(2,\mathbb{R})$ when the norms of the matrix products are themselves tempered. Namely, we show that if a sequence of matrices has subtempered norms in the sense of Definition 4.1, then the sequence has a hyperbolic splitting. The proof consists of several steps. The first step is to show that there is a stable subspace on which the product’s action is super-tempered.

As before, we write An=AnA1A^{n}=A_{n}\cdots A_{1}. We denote by sns_{n} the most contracted singular direction of AnA^{n} and by unu_{n} the most expanded singular direction. Recall that for ASL(2,)A\in\operatorname{SL}(2,\mathbb{R}) we have As=A1\|As\|=\|A\|^{-1} where ss is a unit vector in the most contracted singular direction.

Before proceeding to the next proof, we see how the most contracted singular direction changes as we compose more matrices. Note that the following computation does not use any temperedness assumptions. Define αn\alpha_{n} as follows:

(4.12) sn=cosαnsn+1+sinαnun+1.s_{n}=\cos\alpha_{n}s_{n+1}+\sin\alpha_{n}u_{n+1}.

Then we can compute that

An+1sn=An+12cos2αn+An+12sin2αnAn+1sinαn.\|A^{n+1}s_{n}\|=\sqrt{\|A^{n+1}\|^{-2}\cos^{2}\alpha_{n}+\|A^{n+1}\|^{2}\sin^{2}\alpha_{n}}\geq\|A^{n+1}\|\sin\alpha_{n}.

But we also have the estimate:

An+1snAn+1An(sn)=An+1An1.\|A^{n+1}s_{n}\|\leq\|A_{n+1}\|\|A^{n}(s_{n})\|=\|A_{n+1}\|\|A^{n}\|^{-1}.

Thus

(4.13) sinαnAn+1An+1An.\sin\alpha_{n}\leq\frac{\|A_{n+1}\|}{\|A^{n+1}\|\|A^{n}\|}.
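
The estimate (4.13) is easy to test numerically. The sketch below, which uses only the Python standard library and hypothetical function names, computes the operator norm and the most contracted singular direction of a $2\times 2$ matrix from the symmetric matrix $A^{T}A$, and then returns the pairs $(|\sin\alpha_{n}|,\ \|A_{n+1}\|/(\|A^{n+1}\|\|A^{n}\|))$ along a product of matrices that are assumed to have determinant one.

```python
import math

def mat_mul(A, B):
    """Product AB of 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def norm_and_contracted_dir(A):
    """Operator norm of A and a unit vector in its most contracted
    singular direction (an arbitrary unit vector if A is conformal)."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    top = (p + r) / 2 + math.hypot((p - r) / 2, q)   # largest eigenvalue of A^T A
    theta = 0.5 * math.atan2(2 * q, p - r)           # angle of the most expanded direction
    return math.sqrt(top), (-math.sin(theta), math.cos(theta))

def check_4_13(mats):
    """Pairs (|sin alpha_n|, right-hand side of (4.13)) for n = 1, 2, ..."""
    out = []
    prod = mats[0]                                    # A^1
    for nxt in mats[1:]:
        norm_n, s_n = norm_and_contracted_dir(prod)
        prod = mat_mul(nxt, prod)                     # A^{n+1} = A_{n+1} A^n
        norm_n1, s_n1 = norm_and_contracted_dir(prod)
        # |sin alpha_n| is the absolute value of the cross product of s_n and s_{n+1}.
        sin_alpha = abs(s_n[0] * s_n1[1] - s_n[1] * s_n1[0])
        bound = norm_and_contracted_dir(nxt)[0] / (norm_n1 * norm_n)
        out.append((sin_alpha, bound))
    return out
```

Because the derivation of (4.13) uses no temperedness assumption, the first entry of each pair should not exceed the second, up to rounding, for any choice of determinant one matrices.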

We now observe that if the sequence (An)n(A_{n})_{n\in\mathbb{N}} has a well defined stable direction EsE^{s}, then snEss_{n}\to E^{s} and we can estimate their distance by

(4.14) (Es,sn)Dmnαm.\angle(E^{s},s_{n})\leq D\sum_{m\geq n}\alpha_{m}.

This is good because we expect this sum to be dominated by its first term in the presence of non-trivial Lyapunov exponents.

Now consider a sequence of matrices A1,A2,A_{1},A_{2},\ldots whose norm is (C,λ,ϵ)(C,\lambda,\epsilon)-tempered and such that each matrix has norm bounded above by Λ>0\Lambda>0. If we have AnveCenλ\|A^{n}v\|\geq e^{C}e^{n\lambda} for some unit vector vv, then

(4.15) (Es,sn)Dmne2CΛe2mλeD2CΛe2nλ,\angle(E^{s},s_{n})\leq D\sum_{m\geq n}e^{-2C}\Lambda e^{-2m\lambda}\leq e^{D^{\prime}-2C}\Lambda e^{-2n\lambda},

for some DD^{\prime} depending only on λ\lambda.

Proposition 4.6.

Suppose that $C_{0},\lambda,\epsilon,\Lambda>0$ are fixed. Then there exist $D$ and $N\in\mathbb{N}$ such that if $A_{1},\ldots,A_{n}$, $n\geq N$, is a sequence of matrices in $\operatorname{SL}(2,\mathbb{R})$ with $(C,\lambda,\epsilon)$-subtempered norms, then:

  1. (1)

    There exist perpendicular vectors ss and uu so that (Ai)1in(A_{i})_{1\leq i\leq n} has a (max{0,3C}+D,λ2ϵ,3ϵ)(\max\{0,-3C\}+D,\lambda-2\epsilon,3\epsilon) tempered splitting in the sense of Definition 4.2. In the case that An>1\|A^{n}\|>1, we may take ss and uu to be the most contracted and expanded singular directions of AnA^{n}, respectively.

  2. (2)

    In the case of an infinite sequence $(A_{i})_{i\in\mathbb{N}}$ with subtempered norms there exists an orthogonal pair of unit vectors $s$ and $u$ that defines such a splitting. Further, there exists a unique one dimensional subspace $E^{s}$ such that any non-zero vector $v$ that satisfies $\displaystyle\limsup_{n\to\infty}n^{-1}\ln\|A^{n}v\|<0$ is in $E^{s}$.

  3. (3)

    Finally, there exists N0(C)=(C+ln(2))/λN_{0}(C)=\lceil(C+\ln(2))/\lambda\rceil and DD^{\prime} such that for nN0n\geq N_{0} and m2m1N0m_{2}\!\geq\!m_{1}\!\geq\!N_{0}, and any (C,λ,ϵ)(C,\lambda,\epsilon)-tempered sequence of matrices (Ai)1in(A_{i})_{1\leq i\leq n} as above, Am1A^{m_{1}} and Am2A^{m_{2}} have unique contracted singular directions Em1sE^{s}_{m_{1}} and Em2sE^{s}_{m_{2}} and moreover,

    (sm1,sm2)e4C+De2(λϵ)m1.\angle(s_{m_{1}},s_{m_{2}})\leq e^{-4C+D^{\prime}}e^{-2(\lambda-\epsilon)m_{1}}.

    The analogous statement also holds for n=n=\infty.

Proof.

If An=1\|A^{n}\|=1, choose arbitrarily a vector sns_{n}. Otherwise, let sns_{n} be a unit vector most contracted by AnA^{n}. Let sms_{m} be the most contracted vector for AmA^{m}. If sms_{m} does not exist because Am=1\|A^{m}\|=1, then there is no most contracted direction, and we instead set sm=sns_{m}=s_{n}. Let unu_{n} be a unit vector in the orthogonal complement of sns_{n}. We show that unu_{n} and sns_{n} define a tempered splitting. This requires estimating three things: the contraction of sns_{n}, the growth of unu_{n}, and the decay of the angle between them.

We now proceed with the proof of (1). First, we will show that the action on the vector sns_{n} is super-tempered. Define αm\alpha_{m} as in (4.12). Then there exists some D1D_{1} such that

(4.16) sinαmD1AmAmAm+1.\sin\alpha_{m}\leq D_{1}\frac{\|A_{m}\|}{\|A^{m}\|\|A^{m+1}\|}.

Indeed for indices mm where sms_{m} and sm+1s_{m+1} are both defined by the actual most contracting directions, this follows as in (4.13). Otherwise, note that one of AmA^{m} or Am+1A^{m+1} has norm 11, hence the right hand side is uniformly bounded below by e2Λe^{-2\Lambda}, and thus there exists such a D1D_{1}.

From (4.16), it is immediate that there exists D2>0D_{2}>0 such that

(4.17) (sm,sn)D2mj<nAjAjAj+1.\angle(s_{m},s_{n})\leq D_{2}\sum_{m\leq j<n}\frac{\|A_{j}\|}{\|A^{j}\|\|A^{j+1}\|}.

From (C,λ,ϵ)(C,\lambda,\epsilon)-subtempered norms we have for all m+lnm+l\leq n,

(4.18) Am+leCeλlAmeϵm.\|A^{m+l}\|\geq e^{C}e^{\lambda l}\|A^{m}\|e^{-\epsilon m}.

Combining (4.17) and (4.18), and the uniform bound AeΛ\|A\|\leq e^{\Lambda}, we get

(4.19) (sm,sn)D2e2C+2ΛAm2e2ϵm0l<nme2λlD2Dλe2C+2ΛAm2e2ϵm\angle(s_{m},s_{n})\leq D_{2}e^{-2C+2\Lambda}\|A^{m}\|^{-2}e^{2\epsilon m}\sum_{0\leq l<n-m}e^{-2\lambda l}\leq D_{2}D_{\lambda}e^{-2C+2\Lambda}\|A^{m}\|^{-2}e^{2\epsilon m}

Hence there exists D3>0D_{3}>0 such that for all 0mn0\leq m\leq n,

Amsn\displaystyle\|A^{m}s_{n}\| Am1+sin(sn,sm)AmAm1+D3Dλe2C+2ΛAm1e2ϵm\displaystyle\leq\|A^{m}\|^{-1}+\sin\angle(s_{n},s_{m})\|A^{m}\|\leq\|A^{m}\|^{-1}+D_{3}D_{\lambda}e^{-2C+2\Lambda}\|A^{m}\|^{-1}e^{2\epsilon m}
(4.20) (1+D3Dλe2C+2Λe2ϵm)Am1.\displaystyle\leq(1+D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon m})\|A^{m}\|^{-1}.

We now check that sns_{n} is supertempered. This is more complicated. Write s^nk\hat{s}_{n}^{k} for Aksn/AksnA^{k}s_{n}/\|A^{k}s_{n}\|. For all j+knj+k\leq n, we have

Akjs^nkAksn=Aj+ksn.\|A^{j}_{k}\hat{s}^{k}_{n}\|\|A^{k}s_{n}\|=\|A^{j+k}s_{n}\|.

Thus

Akjs^nkAj+ksnAksn1.\|A^{j}_{k}\hat{s}^{k}_{n}\|\leq\|A^{j+k}s_{n}\|\|A^{k}s_{n}\|^{-1}.

Applying (4.20) with m=j+km=j+k we get

(4.21) Akjs^nk\displaystyle\|A^{j}_{k}\hat{s}^{k}_{n}\| (1+D3Dλe2C+2Λe2ϵ(j+k))Aj+k1Aksn1\displaystyle\leq(1+D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon(j+k)})\|A^{j+k}\|^{-1}\|A^{k}s_{n}\|^{-1}

By subtemperedness, Aj+keCejλekϵAk\|A^{j+k}\|\geq e^{C}e^{j\lambda}e^{-k\epsilon}\|A^{k}\|, thus

Akjs^nkeCejλekϵ(1+D3Dλe2C+2Λe2ϵ(j+k)).\|A^{j}_{k}\hat{s}_{n}^{k}\|\leq e^{-C}e^{-j\lambda}e^{k\epsilon}(1+D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon(j+k)}).

Hence there exists D4D_{4} such that

\[ \|A^{j}_{k}\hat{s}_{n}^{k}\|\leq e^{-\min\{C,3C\}+D_{4}}e^{-j(\lambda-2\epsilon)}e^{3k\epsilon}. \tag{4.22} \]

Thus $s_{n}$ is $(\max\{0,-3C\}+D_{4},\lambda-2\epsilon,3\epsilon)$-supertempered.

Next we estimate how fast the angle between sns_{n} and un=(sn)u_{n}=(s_{n})^{\perp} decays. This will lead to a growth estimate on unu_{n}. Consider the angle θm\theta_{m} between AmsnA^{m}s_{n} and AmunA^{m}u_{n}. Because the maps are in SL(2,)\operatorname{SL}(2,\mathbb{R}),

(4.23) 1=AmsnAmunsinθm.1=\|A^{m}s_{n}\|\|A^{m}u_{n}\|\sin\theta_{m}.

Hence by (4.20),

(4.24) sinθm1AmsnAm(1+D3Dλe2C+2Λe2ϵm)1.\sin\theta_{m}\geq\frac{1}{\|A^{m}s_{n}\|\|A^{m}\|}\geq(1+D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon m})^{-1}.

For 0D3Dλe2C+2Λe2ϵm10\leq D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon m}\leq 1,

(4.25) sinθm1/2.\sin\theta_{m}\geq 1/2.

Otherwise, as 1/(1+x)1/(2x)1/(1+x)\geq 1/(2x) for x1x\geq 1,

(4.26) sinθm(2D3)1Dλ1e2C2Λe2ϵm.\sin\theta_{m}\geq(2D_{3})^{-1}D_{\lambda}^{-1}e^{2C-2\Lambda}e^{-2\epsilon m}.

In both cases, we see that there exists D5D_{5} such that

(4.27) sinθmemin{2C,0}D5e2ϵm.\sin\theta_{m}\geq e^{\min\{2C,0\}-D_{5}}e^{-2\epsilon m}.

Finally, we estimate the rate of growth of unu_{n}. First, note that because sns_{n} and unu_{n} are orthogonal, applying (4.24) and (4.20) to (4.23) gives

Amun=(sinθm)1Amsn11(1+D3Dλe2C+2Λe2ϵm)1Am.\|A^{m}u_{n}\|=(\sin\theta_{m})^{-1}\|A^{m}s_{n}\|^{-1}\geq 1\cdot(1+D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon m})^{-1}\|A^{m}\|.

Then letting u^nk=Akun/Akun\hat{u}_{n}^{k}=A^{k}u_{n}/\|A^{k}u_{n}\|, we can estimate Akju^nk\|A^{j}_{k}\hat{u}_{n}^{k}\| as before:

(4.28) Akju^nk\displaystyle\|A^{j}_{k}\hat{u}_{n}^{k}\| =Aj+kunAkun1\displaystyle=\|A^{j+k}u_{n}\|\|A^{k}u_{n}\|^{-1}
(4.29) (1+D3Dλe2C+2Λe2ϵ(j+k))1Aj+kAk1\displaystyle\geq(1+D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon(j+k)})^{-1}\|A^{j+k}\|\|A^{k}\|^{-1}
(4.30) (1+D3Dλe2C+2Λe2ϵ(j+k))1eCeϵkeλjAkAk1\displaystyle\geq(1+D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon(j+k)})^{-1}e^{C}e^{-\epsilon k}e^{\lambda j}\|A^{k}\|\|A^{k}\|^{-1}
(4.31) =(1+D3Dλe2C+2Λe2ϵ(j+k))1eCeϵkeλj.\displaystyle=(1+D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon(j+k)})^{-1}e^{C}e^{-\epsilon k}e^{\lambda j}.

If D3Dλe2C+2Λe2ϵ(j+k)<1D_{3}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon(j+k)}<1, then

(4.32) Akju^nk12eCeϵkeλj.\|A^{j}_{k}\hat{u}_{n}^{k}\|\geq\frac{1}{2}e^{C}e^{-\epsilon k}e^{\lambda j}.

Otherwise, as 1/(1+x)1/(2x)1/(1+x)\geq 1/(2x) for x1x\geq 1, we see that there exists D5>0D_{5}>0 such that:

(4.33) Akju^nk(2D3)1Dλ1e2C2Λe2ϵ(j+k)eCeϵkeλjeD5e3C2Λe3ϵke(λ2ϵ)j.\|A^{j}_{k}\hat{u}_{n}^{k}\|\geq{(2D_{3})}^{-1}D_{\lambda}^{-1}e^{2C-2\Lambda}e^{-2\epsilon(j+k)}e^{C}e^{-\epsilon k}e^{\lambda j}\\ \geq e^{D_{5}}e^{3C-2\Lambda}e^{-3\epsilon k}e^{(\lambda-2\epsilon)j}.

So, we see that there exists D6D_{6} such that

(4.34) Akju^nkemin{C,3C}+D6e(λ2ϵ)je3ϵk,\|A^{j}_{k}\hat{u}_{n}^{k}\|\geq e^{\min\{C,3C\}+D_{6}}e^{(\lambda-2\epsilon)j}e^{-3\epsilon k},

which shows that unu_{n} is (max{0,3C}+D6,λ2ϵ,3ϵ)(\max\{0,-3C\}+D_{6},\lambda-2\epsilon,3\epsilon)-subtempered.

We can now conclude by reading off the constants for the splitting we just obtained from equations (4.22), (4.27), and (4.34) and comparing with Definition 4.2. Thus there is $D_{7}$ depending only on $\lambda,\Lambda,\epsilon$, such that $s_{n}$ and $u_{n}$ define a tempered splitting with constants:

\[ (\max\{0,-3C\}+D_{7},\lambda-2\epsilon,3\epsilon). \tag{4.35} \]

This finishes the proof of the first conclusion of the proposition.

The proof of (2) is straightforward, similar to part (1), and very similar to a usual proof of the Oseledets theorem [Via14, Ch. 4], so we omit it.

Item (3) also follows from the above proof once we know that NN is large enough that the stable subspace is well defined. This certainly holds if n(C+ln(2))/λn\geq\lceil(C+\ln(2))/\lambda\rceil since then An2\|A^{n}\|\geq 2. Then from equation (4.19) and temperedness of the norm, if m1m2m_{1}\leq m_{2}, we have that

(sm1,sm2)\displaystyle\angle(s_{m_{1}},s_{m_{2}}) D2Dλe2C+2ΛAm12e2ϵm1\displaystyle\leq D_{2}D_{\lambda}e^{-2C+2\Lambda}\|A^{m_{1}}\|^{-2}e^{2\epsilon m_{1}}
D2Dλe2C+2Λe2ϵm1(e2Ce2m1λ)e4C+D8e2(λϵ)m1,\displaystyle\leq D_{2}D_{\lambda}e^{-2C+2\Lambda}e^{2\epsilon m_{1}}(e^{-2C}e^{-2m_{1}\lambda})\leq e^{-4C+D_{8}}e^{-2(\lambda-\epsilon)m_{1}},

for some D8D_{8}, which gives item (3). ∎

4.4. Tempered splittings for expanding on average diffeomorphisms

In this subsection, we apply the above developments to describe hyperbolicity of expanding on average random dynamical systems. There are two main results, the first is Proposition 4.8, which is a quantitative estimate on the probability that DxfωnD_{x}f^{n}_{\omega} has a (C,λ,ϵ)(C,\lambda,\epsilon)-tempered splitting. The second estimate is Proposition 4.14, which controls the stable direction for this splitting.

To begin, we estimate the probability that the sequence Dxfn\|D_{x}f^{n}\| is tempered.

Proposition 4.7.

For a closed surface MM, suppose that (f1,,fm)(f_{1},\ldots,f_{m}) is a uniformly expanding on average tuple in Diffvol(M)\operatorname{Diff}_{\operatorname{vol}}(M) with constants n0n_{0} and λ0\lambda_{0}. Then for all 0<λ1<λ00<\lambda_{1}<\lambda_{0} and all sufficiently small ϵ>0\epsilon>0, there exists D,α>0D,\alpha>0 such that for all xMx\in M,

(4.36) μ({ω:Dxfωnisnot(C,λ1,ϵ)subtempered})DeαC.\mu(\{\omega:\|D_{x}f^{n}_{\omega}\|{\ \rm is\ not\ }(-C,\lambda_{1},\epsilon){\rm-subtempered}\})\leq De^{-\alpha C}.
Proof.

This follows from the estimates on temperedness obtained for submartingales. Essentially, for a fixed $v\in T^{1}_{x}M$, the sequence $\ln\|D_{x}f^{nn_{0}}_{\omega}v\|$ is a submartingale with respect to a filtration $\mathcal{F}_{n}$ generated by the coordinates of $\omega$, and its increments $X_{n}$ satisfy $\mathbb{E}\left[X_{n}|\mathcal{F}_{n-1}\right]\geq\lambda_{0}$. Thus Proposition 4.4 gives that for all sufficiently small $\epsilon>0$, and $0<\lambda_{1}<\lambda_{0}$, there exist $D_{1},D_{2}>0$ such that:

(Dxfnn0 is not (C,λ1,ϵ)-tempered)D1eD2C.\mathbb{P}(\|D_{x}f^{nn_{0}}\|\text{ is not }(-C,\lambda_{1},\epsilon)\text{-tempered})\leq D_{1}e^{-D_{2}C}.

Then to obtain temperedness along the entire sequence, not just times of the form nn0nn_{0}, note that we have a uniform bound on the norm and conorm of all Dxfωi\|D_{x}f_{\omega_{i}}\|, 1im1\leq i\leq m. ∎

Since a tempered sequence of norms implies the existence of a tempered splitting by Proposition 4.6, the following is immediate.

Proposition 4.8.

Suppose that $M$ is a closed surface and $(f_{1},\ldots,f_{m})$ is a uniformly expanding on average tuple of diffeomorphisms in $\operatorname{Diff}_{\operatorname{vol}}^{2}(M)$ with expansion constant $\lambda_{0}$. Then for all $0<\lambda_{1}<\lambda_{0}$, and sufficiently small $\epsilon>0$, there exist $D,\alpha>0$ such that for all $x\in M$,

(4.37) μ({ω:Dxfωndoesnothavea(C,λ,ϵ)temperedsplitting})DeαC.\mu(\{\omega:D_{x}f^{n}_{\omega}{\rm\ does\ not\ have\ a\ }(C,\lambda,\epsilon)-{\rm tempered\ splitting}\})\leq De^{-\alpha C}.

In particular, for all xMx\in M and almost every ω\omega, DxfωnD_{x}f^{n}_{\omega} has a well defined one-dimensional stable subspace Eωs(x)E^{s}_{\omega}(x).

Below, it will be important to consider the probability that a trajectory that is (C,λ,ϵ)(C,\lambda,\epsilon)-tempered suddenly fails to be tempered. In order to quantify this we will introduce an auxiliary quantity for (C,λ,ϵ)(C,\lambda,\epsilon)-tempered orbits of length nn. We call this the cushion of the orbit and it measures how far the inequalities from Definition 4.1(1) are from failing.

Definition 4.9.

If the sequence of matrices A1,,AnA_{1},\ldots,A_{n} is (C0,λ,ϵ)(C_{0},\lambda,\epsilon)-tempered, then we define its cushion UU to be

U=min0k<n[lnAnlnAkC0(nk)λ+ϵk]U=\min_{0\leq k<n}\left[\ln\|A^{n}\|-\ln\|A^{k}\|-C_{0}-(n-k)\lambda+\epsilon k\right]

Note that a trajectory can have such a large cushion that whatever happens at the next iterate, the trajectory will not fail to be tempered. The cushion reflects the only inequalities relevant to tempering that the term An+1A_{n+1} would affect, should it be added to the sequence.
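As a concrete illustration of Definition 4.9, the following is a minimal Python sketch that evaluates the cushion of a finite product of $2\times 2$ matrices directly from the displayed formula. It assumes the convention $A^{k}=A_{k}\cdots A_{1}$ with $A^{0}=\mathrm{Id}$ and uses the operator norm; the helper names `op_norm`, `matmul`, and `cushion` are ours and do not come from the paper.

```python
import math

def op_norm(m):
    """Operator (spectral) norm of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    s = a * a + b * b + c * c + d * d      # trace of M^T M
    det = a * d - b * c                    # det(M^T M) = det^2
    disc = max(s * s - 4.0 * det * det, 0.0)
    return math.sqrt((s + math.sqrt(disc)) / 2.0)

def matmul(p, q):
    """Product p*q of two 2x2 matrices."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def cushion(mats, c0, lam, eps):
    """Cushion U of A_1, ..., A_n (assumed convention: A^k = A_k ... A_1)."""
    n = len(mats)
    prod = ((1.0, 0.0), (0.0, 1.0))
    logs = [0.0]                           # logs[k] = ln ||A^k||
    for a in mats:
        prod = matmul(a, prod)
        logs.append(math.log(op_norm(prod)))
    return min(logs[n] - logs[k] - c0 - (n - k) * lam + eps * k
               for k in range(n))

# Toy example: three copies of diag(2, 1/2).
A = ((2.0, 0.0), (0.0, 0.5))
U = cushion([A, A, A], c0=0.0, lam=0.5, eps=0.1)
```

In this toy example the minimum is attained at $k=2$, so the cushion equals $\ln 2-0.3$.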

The following proposition is a large deviations estimate that says that typically the cushion is quite large.

Proposition 4.10.

For a closed surface MM, suppose that (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M) with expansion constant λ0>0\lambda_{0}>0. For fixed C0C_{0}, let U(n,ω,x)U(n,\omega,x) be the cushion of DxfnD_{x}f^{n} when viewed as a (C0,λ,ϵ)(C_{0},\lambda,\epsilon)-tempered trajectory.

Then for any C0C_{0}, λ<λ0\lambda<\lambda_{0}, and ϵ>0\epsilon>0, there exist δ,η,D>0\delta,\eta,D>0 such that

(U(n,ω,x)<nδ|Dxfωn is (C0,λ,ϵ)-tempered)Deηn.\mathbb{P}(U(n,\omega,x)<n\delta|D_{x}f^{n}_{\omega}\text{ is }(C_{0},\lambda,\epsilon)\text{-tempered})\leq De^{-\eta n}.
Proof.

The proof is straightforward: we are just estimating the difference between lnDxfωn\ln\|D_{x}f^{n}_{\omega}\| and lnDxfωi\ln\|D_{x}f^{i}_{\omega}\|.

Note that in order for a given trajectory to fail to have a cushion of size $\bar{\epsilon}n$, there must be some $0\leq k<n$ such that

(4.38) ϵ¯n>lnDxfωnlnDxfωkC0λ(nk)+ϵk.\bar{\epsilon}n>\ln\|D_{x}f^{n}_{\omega}\|-\ln\|D_{x}f^{k}_{\omega}\|-C_{0}-\lambda(n-k)+\epsilon k.

Call this event Ωn,k\Omega_{n,k}. Note that this event is a subset of the event that

ϵ¯n+C0lnDxfωnlnDxfωkλ(nk)\overline{\epsilon}n+C_{0}\geq\ln\|D_{x}f^{n}_{\omega}\|-\ln\|D_{x}f^{k}_{\omega}\|-\lambda(n-k)

As before, $\ln\|D_{x}f^{n}_{\omega}\|-\ln\|D_{x}f^{k}_{\omega}\|-\lambda(n-k)$ is a submartingale with differences bounded by some $\Lambda>0$. Hence as $\bar{\epsilon}n+C_{0}$ is positive for $n$ sufficiently large, it is less than the expectation of $\ln\|D_{x}f^{n}_{\omega}\|-\ln\|D_{x}f^{k}_{\omega}\|-\lambda(n-k)$. Thus Azuma’s inequality gives

\begin{align*}
\mathbb{P}(\Omega_{n,k})&\leq\mathbb{P}\left(\left|\ln\|A^{n}\|-\ln\|A^{k}\|-\mathbb{E}\left[\ln\|A^{n}\|-\ln\|A^{k}\|\right]\right|>\bar{\epsilon}n+C_{0}\right)\\
&\leq 2\exp\left(-\frac{(\bar{\epsilon}n+C_{0})^{2}}{2\Lambda n}\right)\leq C_{1}\exp\left(-\frac{\bar{\epsilon}^{2}}{2\Lambda}n\right).
\end{align*}

Summing over $k$, we find that the probability that at least one of the events $\Omega_{n,k}$, $0\leq k<n$, occurs is exponentially small in $n$, which gives the result. ∎

Next, we study the distribution of the stable subspaces in an expanding on average system. We obtain two estimates. First, we obtain an estimate on the distribution of all stable subspaces through a point, Proposition 4.11. Second, in Proposition 4.14, we show that the empirical distribution of stable subspaces converges quickly to the actual distribution of the true stable subspaces.

Proposition 4.11.

Suppose that MM is a closed surface and that (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple of diffeomorphisms in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M). Then there exist constants C,α>0C,\alpha>0 such that if νxs\nu^{s}_{x} denotes the distribution of stable subspaces through the point xx, then for each vTxMv\in\mathbb{P}T_{x}M,

νxs({zd(z,v)ϵ})Cϵα,\nu^{s}_{x}(\{z\mid d(z,v)\leq\epsilon\})\leq C\epsilon^{\alpha},

where dd is the angle between those points and (TxM)\mathbb{P}(T_{x}M) denotes the projectivization of TxMT_{x}M.

Naturally, before proceeding with the proof, we must show for $v\in T^{1}M$ that the norm of $D_{x}f^{n}_{\omega}v$ along a typical trajectory does grow exponentially. In fact, we show that growth slower than the exponential rate $e^{\lambda_{0}n/3}$ is exponentially unlikely.

Lemma 4.12.

In the setting of Proposition 4.11, suppose that (1.1) holds with constants n0n_{0}\in\mathbb{N} and λ0>0\lambda_{0}>0. Then there exist γ,C>0\gamma,C>0 such that if vT1Mv\in T^{1}M, then

(4.39) ω(Dfωnveλ0n/3)Ceγn.\mathbb{P}_{\omega}(\|Df^{n}_{\omega}v\|\leq e^{\lambda_{0}n/3})\leq Ce^{-\gamma n}.
Proof.

First, by considering the Taylor expansion of $e^{-t}$, we see that for all sufficiently small $t>0$ and all $v\in T^{1}M$,

𝔼[etlnDfωn0v](1(n0λ0/2)t).\mathbb{E}\left[{e^{-t\ln\|Df^{n_{0}}_{\omega}v\|}}\right]\leq(1-(n_{0}\lambda_{0}/2)t).

Next, observe that writing v¯\overline{v} for v/vv/\|v\|,

𝔼[etlnDfω2n0v]\displaystyle\mathbb{E}\left[{e^{-t\ln\|Df^{2n_{0}}_{\omega}v\|}}\right] =𝔼[etlnDfωn0vetlnDfσn0(ω)n0(Dfωn0v¯)]\displaystyle=\mathbb{E}\left[{e^{-t\ln\|Df^{n_{0}}_{\omega}v\|}e^{-t\ln\|Df^{n_{0}}_{\sigma^{n_{0}}(\omega)}(\overline{Df^{n_{0}}_{\omega}v})\|}}\right]
𝔼[etlnDfωn0v(1(n0λ0/2)t)](1(n0λ0/2)t)2,\displaystyle\leq\mathbb{E}\left[{e^{-t\ln\|Df^{n_{0}}_{\omega}v\|}(1-(n_{0}\lambda_{0}/2)t)}\right]\leq\left(1-(n_{0}\lambda_{0}/2)t\right)^{2},

where we have used the independence of $\sigma^{n_{0}}\omega$ from $\omega_{i}$ for $i<n_{0}$. Similarly, by boundedness of the $C^{1}$ norm of the $f_{i}$, we see inductively that there exists $D>0$ such that for all $n$,

\[\mathbb{E}\left[e^{-t\ln\|Df^{n}_{\omega}v\|}\right]\leq D\left(1-(n_{0}\lambda_{0}/2)t\right)^{n/n_{0}}\leq De^{-n\lambda_{0}t/2},\]

since $1-s\leq e^{-s}$ for all real $s$. By Markov’s inequality,

\begin{align*}
\mathbb{P}(\|Df^{n}_{\omega}v\|\leq e^{\lambda_{0}n/3})&\leq\mathbb{P}(e^{-t\ln\|Df^{n}_{\omega}v\|}\geq e^{-t\lambda_{0}n/3})\\
&\leq\frac{\mathbb{E}\left[e^{-t\ln\|Df^{n}_{\omega}v\|}\right]}{e^{-t\lambda_{0}n/3}}\leq D\frac{\left(1-(n_{0}\lambda_{0}/2)t\right)^{n/n_{0}}}{e^{-t\lambda_{0}n/3}}\leq De^{-n\lambda_{0}t/2+\lambda_{0}nt/3}=De^{-n\lambda_{0}t/6},
\end{align*}

which is (4.39) with $\gamma=\lambda_{0}t/6$ and $C=D$. ∎
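The Chernoff-type argument in the proof of Lemma 4.12 can be illustrated on a toy scalar model. The sketch below is only an analogy under stated assumptions: it replaces $\ln\|Df^{n}_{\omega}v\|$ by a sum $S_{n}$ of i.i.d. steps equal to $+1$ with probability $p$ and $-1$ otherwise, so that the drift $2p-1$ plays the role of $\lambda_{0}$, and it compares the exact lower tail of $S_{n}$ with the bound obtained from Markov's inequality applied to $e^{-tS_{n}}$. The function names and the toy model are ours, not the paper's.

```python
import math

def exact_tail(n, p, a):
    """P(S_n <= a) for S_n a sum of n i.i.d. steps: +1 w.p. p, -1 w.p. 1-p."""
    total = 0.0
    for j in range(n + 1):                 # j = number of +1 steps
        if 2 * j - n <= a:
            total += math.comb(n, j) * p ** j * (1 - p) ** (n - j)
    return total

def chernoff_bound(n, p, a, t):
    """Markov bound: P(S_n <= a) <= E[exp(-t X)]^n * exp(t a), for t > 0."""
    mgf = p * math.exp(-t) + (1 - p) * math.exp(t)
    return mgf ** n * math.exp(t * a)

p = 0.75
lam0 = 2 * p - 1            # drift of the toy walk (stand-in for lambda_0)
n = 60
a = lam0 * n / 3            # threshold analogous to lambda_0 * n / 3
t = 0.3                     # any small positive t works; 0.3 is arbitrary
tail = exact_tail(n, p, a)
bound = chernoff_bound(n, p, a, t)
```

Doubling $n$ (and the threshold with it) squares the bound, which is the exponential decay in $n$ asserted by (4.39) in this simplified setting.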

For vT1Mv\in T^{1}M, let Bϵ(v)B_{\epsilon}(v) be the set of directions ww with sin((v,w))ϵ\sin(\angle(v,w))\leq\epsilon and Λ\Lambda be the maximum of the norm of Dxfi\|D_{x}f_{i}\| over the set of all 1im1\leq i\leq m and xMx\in M.

Lemma 4.13.

For all σ>0\sigma>0 sufficiently small there exist 0<θ<10<\theta<1 such that for any v(TxM)v\in\mathbb{P}(T_{x}M) and sufficiently small ϵ>0\epsilon>0, if λ06Λln(ϵ)nλ03Λln(ϵ)-\frac{\lambda_{0}}{6\Lambda}\ln(\epsilon)\leq n\leq-\frac{\lambda_{0}}{3\Lambda}\ln(\epsilon), and

δ=maxuBϵ(v)sin(Dfωnu,Dfωnv),\delta=\max_{u\in B_{\epsilon}(v)}\sin\angle(Df^{n}_{\omega}u,Df^{n}_{\omega}v),

then

(δϵ1+σ and for all uBϵ(v),Dfωnu21enλ0/3u)1ϵθ.\mathbb{P}(\delta\leq\epsilon^{1+\sigma}\text{ and for all }u\in B_{\epsilon}(v),\,\,\|Df^{n}_{\omega}u\|\geq 2^{-1}e^{n\lambda_{0}/3}\|u\|)\geq 1-\epsilon^{\theta}.
Proof.

By Lemma 4.12, for each nn we have Dfωnveλ0n/3\|Df^{n}_{\omega}v\|\geq e^{\lambda_{0}n/3} on a set of measure 1Ceγn1-Ce^{-\gamma n}. Then for any unit vector uu with sin((v,u))ϵ\sin(\angle(v,u))\leq\epsilon,

DfωnuDfωnvDfωn(uv)eλ0n/3ϵeΛneλ0n/3/2,\|Df^{n}_{\omega}u\|\geq\|Df^{n}_{\omega}v\|-\|Df^{n}_{\omega}(u-v)\|\geq e^{\lambda_{0}n/3}-\epsilon e^{\Lambda n}\geq e^{\lambda_{0}n/3}/2,

as long as ϵ\epsilon is sufficiently small and nn satisfies nλ03Λln(ϵ).n\leq-\frac{\lambda_{0}}{3\Lambda}\ln(\epsilon).

Since the fif_{i} are volume preserving, the areas of the triangles between vectors are preserved. Since all vectors in Bϵ(v)B_{\epsilon}(v) are stretched, we see that

sin(Dfωnv,Dfωnu)=ϵDfωnv1Dfωnu12ϵe(2/3)λ0n.\sin\angle(Df^{n}_{\omega}v,Df^{n}_{\omega}u)=\epsilon\|Df^{n}_{\omega}v\|^{-1}\|Df^{n}_{\omega}u\|^{-1}\leq 2\epsilon e^{-(2/3)\lambda_{0}n}.

But if nλ06Λln(ϵ)n\geq-\frac{\lambda_{0}}{6\Lambda}\ln(\epsilon) and ϵ\epsilon is sufficiently small, then sin(Dfωnv,Dfωnu)2ϵe23λ0λ06Λ(ln(ϵ)).\displaystyle\sin\angle(Df^{n}_{\omega}v,Df^{n}_{\omega}u)\leq 2\epsilon e^{-\frac{2}{3}\lambda_{0}\frac{\lambda_{0}}{6\Lambda}(-\ln(\epsilon))}. Thus we see that for sufficiently small ϵ\epsilon and σ>0\sigma>0 that for nn satisfying

λ06Λln(ϵ)nλ03Λln(ϵ)-\frac{\lambda_{0}}{6\Lambda}\ln(\epsilon)\leq n\leq-\frac{\lambda_{0}}{3\Lambda}\ln(\epsilon)

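it holds that $\sin\angle(Df^{n}_{\omega}v,Df^{n}_{\omega}u)\leq\epsilon^{1+\sigma}$ for all $\omega$ in a set of measure at least $1-Ce^{-\gamma n}$. Since $n\geq-\frac{\lambda_{0}}{6\Lambda}\ln(\epsilon)$, we have $Ce^{-\gamma n}\leq C\epsilon^{\gamma\lambda_{0}/(6\Lambda)}$, which is at most $\epsilon^{\theta}$ for a suitable $0<\theta<1$ once $\epsilon$ is sufficiently small. ∎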
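The identity behind the angle estimate in the proof of Lemma 4.13 is that a linear map of determinant one preserves the area spanned by two vectors, so that $\|Au\|\,\|Av\|\sin\angle(Au,Av)=\|u\|\,\|v\|\sin\angle(u,v)$. The following Python sketch checks this numerically for one hypothetical matrix $A$ of determinant $1$ standing in for $D_{x}f^{n}_{\omega}$; the matrix, vectors, and helper names are illustrative assumptions.

```python
import math

def act(m, x):
    """Apply the 2x2 matrix m = ((a, b), (c, d)) to the vector x."""
    (a, b), (c, d) = m
    return (a * x[0] + b * x[1], c * x[0] + d * x[1])

def norm(x):
    return math.hypot(x[0], x[1])

def sin_angle(x, y):
    """Absolute value of the sine of the angle between two nonzero vectors."""
    return abs(x[0] * y[1] - x[1] * y[0]) / (norm(x) * norm(y))

A = ((2.0, 1.0), (1.0, 1.0))               # determinant 1 (area preserving)
v = (1.0, 0.0)
u = (math.cos(0.05), math.sin(0.05))       # unit vector at angle 0.05 from v
Au, Av = act(A, u), act(A, v)

lhs = sin_angle(Au, Av) * norm(Au) * norm(Av)
rhs = sin_angle(u, v)                      # u and v are unit vectors
```

Since both $u$ and $v$ are stretched by this $A$, the angle between their images is smaller than the original angle, which is the mechanism the lemma exploits.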

Proof of Proposition 4.11..

Using Lemma 4.13 we may now conclude. Fix some $\sigma>0$ as in the lemma and $\lambda_{0}/(6\Lambda)<\alpha<\lambda_{0}/(3\Lambda)$, and let $\epsilon>0$ be small enough that the lemma applies. Let $\epsilon_{0}=\epsilon$ and more generally define $\epsilon_{k}=\epsilon^{(1+\sigma)^{k}}$. Let $b_{k}=\lfloor-\alpha(1+\sigma)^{k}\ln(\epsilon)\rfloor$ and let $n_{k}=\sum_{j=0}^{k-1}b_{j}$ be an increasing sequence of times. By our choice of $\alpha$ we may apply the lemma to each additional block of iterations of $f_{\omega}$ of length $b_{k}$ with $\epsilon=\epsilon_{k}$. We then define:

ηkω(ϵ,v)\displaystyle\eta_{k}^{\omega}(\epsilon,v) =maxwBϵk(Dfωnk1v)sin(Dfωbkw,Dfωbkv),\displaystyle=\max_{w\in B_{{\epsilon_{k}}}(Df_{\omega}^{n_{k-1}}v)}\sin\angle(Df^{b_{k}}_{\omega}w,Df^{b_{k}}_{\omega}v),
τkω(ϵ,v)\displaystyle\tau_{k}^{\omega}(\epsilon,v) =infwBϵk(Dfωnk1v)Dfσnk1ωbkw.\displaystyle=\inf_{w\in B_{\epsilon_{k}}(Df^{n_{k-1}}_{\omega}v)}\|Df^{b_{k}}_{\sigma^{n_{k-1}}\omega}w\|.

Lemma 4.13 asserts that for every vv and kk that

(ηkω(ϵk,v)ϵk1+σ and τkω(ϵk,v)21eλ0(nknk1)/3)1ϵkθ.\mathbb{P}(\eta_{k}^{\omega}(\epsilon_{k},v)\leq\epsilon_{k}^{1+\sigma}\text{ and }\tau_{k}^{\omega}(\epsilon_{k},v)\geq 2^{-1}e^{\lambda_{0}(n_{k}-n_{k-1})/3})\geq 1-\epsilon_{k}^{\theta}.

As the dynamics is IID and the above estimate is independent of the vector v(TM)v\in\mathbb{P}(TM), we see that there exists C>0C>0 such that:

\[\mathbb{P}\left(\text{for all }k,\ \eta_{k}^{\omega}(\epsilon,v)\leq\epsilon_{k}\text{ and }\tau_{k}^{\omega}(\epsilon,v)\geq\frac{e^{\lambda_{0}n_{k}/3}}{2}\right)\geq\prod_{k=1}^{\infty}\left(1-\epsilon_{k}^{\theta}\right)\geq 1-C\epsilon^{\theta}.\tag{4.40}\]

By Proposition 4.8, at the point xx almost every word ω\omega has a well defined stable subspace Eωs(x)E^{s}_{\omega}(x). If a vector vTx1Mv\in T^{1}_{x}M satisfies (4.40), then for any wBϵ(v)w\in B_{\epsilon}(v), Dfωnkweλ0nk/32k\|Df^{n_{k}}_{\omega}w\|\geq e^{\lambda_{0}n_{k}/3}2^{-k}, which grows rapidly in kk as long as ϵ\epsilon was chosen sufficiently small. Thus this vector cannot be in Eωs(x)E^{s}_{\omega}(x). Thus (Eωs(x)Bϵ(v))Cϵθ,\mathbb{P}(E^{s}_{\omega}(x)\in B_{\epsilon}(v))\leq C\epsilon^{\theta}, and we are done. ∎

Next we check that the distribution of stable subspaces for finite time realizations of the dynamics converges quickly to the stationary stable distribution. Essentially this should be true for the same reason that it is true for IID matrix products. The proof is a slight extension of the argument that appears above.

Proposition 4.14.

Suppose that MM is a closed surface and (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M). There exist c0,C,θc_{0},C,\theta such that for any xMx\in M and vTx1Mv\in T^{1}_{x}M, if N0c0|ln(ϵ)|N_{0}\geq c_{0}\left|\ln(\epsilon)\right| the following holds. Let Ens(ω)E^{s}_{n}(\omega) be the maximally contracted subspace of the product DxfωnD_{x}f^{n}_{\omega}. Then:

(4.41) (for some n>N0,Ens(ω)Bϵ(v) or Ens(ω) does not exist)Cϵθ.\mathbb{P}(\text{for some }n>N_{0},E^{s}_{n}(\omega)\in B_{\epsilon}(v)\text{ or }E^{s}_{n}(\omega)\text{ does not exist})\leq C\epsilon^{\theta}.
Proof.

The proof is essentially a corollary of the estimates obtained in the proof of Proposition 4.11.

We apply that same proof and choose sufficiently small $0<\sigma<\lambda_{0}/(3\Lambda)$, where $\lambda_{0}$ and $\Lambda$ are as in that proof, as are $b_{k}$ and $n_{k}$. Then we find that there exist $C,\theta$ such that for all sufficiently small $\epsilon>0$, we have equation (4.40), so for $\epsilon_{k}=\epsilon^{(1+\sigma)^{k}}$,

\[\mathbb{P}(\text{for all }k,\ \eta_{k}^{\omega}(\epsilon,v)\leq\epsilon_{k}\text{ and }\tau_{k}^{\omega}(\epsilon,v)\geq 2^{-1}e^{\lambda_{0}n_{k}/3})\geq 1-C\epsilon^{\theta}.\tag{4.42}\]

This shows as before that at the times nkn_{k}, that we have the estimate

Dfωnkweλ0nk/32k\|Df^{n_{k}}_{\omega}w\|\geq e^{\lambda_{0}n_{k}/3}2^{-k}

for all wBϵ(v)w\in B_{\epsilon}(v) on a set of measure 1Cϵθ1-C\epsilon^{\theta}. In particular, as we chose σ\sigma quite small, for k2k\geq 2, we see that for any time nn from nk1n_{k-1} to nkn_{k}, that

DfωnwDfωnk1we(nnk1)Λenk1λ0/3(nnk1)Λ.\|Df^{n}_{\omega}w\|\geq\|Df^{n_{k-1}}_{\omega}w\|e^{-{(n-n_{k-1})}\Lambda}\geq e^{n_{k-1}\lambda_{0}/3-(n-n_{k-1})\Lambda}.

But by choice of $\sigma$, that exponent is at least

\[\left((1+\sigma)^{k-1}\lambda_{0}/3-((1+\sigma)^{k}-(1+\sigma)^{k-1})\Lambda\right)\left|\ln(\epsilon)\right|=(1+\sigma)^{k-1}(\lambda_{0}/3-\sigma\Lambda)\left|\ln(\epsilon)\right|>0.\]

Thus from the definition of the $n_{k}$ in the proof of Proposition 4.11, we see that on a set of probability $1-C\epsilon^{\theta}$, for any $n>n_{1}=\left|\alpha(1+\sigma)\ln(\epsilon)\right|$, the subspace $E^{s}_{n}(\omega)$ does not lie in $B_{\epsilon}(v)$, and the result follows. ∎

4.5. Reverse tempered sequences

We are interested in reverse tempered times since they are key for proving smoothing lemmas. The main result of this subsection is Proposition 4.18, which shows that the waiting time until a reverse tempered time occurs has an exponential tail.

The following lemma estimates how much the temperedness of a sequence improves when we prepend entries on it. Note that by reversing the order of the sequence, this gives the corresponding estimate for reverse temperedness.

Lemma 4.15.

Suppose that $a_{1},\ldots,a_{n}$ is a $(C,\lambda_{0},\epsilon)$-tempered sequence and $b_{1},\ldots,b_{m}$ is a $(D,\lambda_{1},\epsilon/2)$-tempered sequence, where $\lambda_{1}-\lambda_{0}>\epsilon$. Then $b_{1},\ldots,b_{m},a_{1},\ldots,a_{n}$ is a $(\min\{D,m\epsilon/2+C+D,m\epsilon+C\},\lambda_{0},\epsilon)$-tempered sequence.

Proof.

Let c1,,cm+nc_{1},\ldots,c_{m+n} denote the new joined sequence and let CC^{\prime} be the (λ0,ϵ)(\lambda_{0},\epsilon) temperedness constant for this sequence. Each pair of indices 0j<kn+m0\leq j<k\leq n+m gives a constraint on the constant of temperedness:

(4.43) C=min0j<kn+mjϵ+i=j+1k(ciλ0).C^{\prime}=\min_{0\leq j<k\leq n+m}j\epsilon+\sum_{i=j+1}^{k}(c_{i}-\lambda_{0}).

Note that the only pairs of indices that offer a non-trivial constraint are those with at least one of j+1,km+1j+1,k\geq m+1. The constraint arising from a pair of indices with j,kmj,k\leq m, is certainly satisfied as long as the temperedness constant is at most DD. This leaves two cases.

For a pair of indices j<m<kj<m<k, we obtain the constraint that

(4.44) Cjϵ+i=j+1m(biλ0)+i=m+1k(aiλ0).C^{\prime}\leq j\epsilon+\sum_{i=j+1}^{m}(b_{i}-\lambda_{0})+\sum_{i=m+1}^{k}(a_{i}-\lambda_{0}).

But by temperedness, we can bound the right hand side below:

jϵ+i=j+1m(biλ0)+i=m+1k(aiλ0)D+jϵ2+(mj)(λ1λ0)+Cmϵ/2+D+C.j\epsilon+\sum_{i=j+1}^{m}(b_{i}-\lambda_{0})+\sum_{i=m+1}^{k}(a_{i}-\lambda_{0})\geq D+\frac{j\epsilon}{2}+(m-j)(\lambda_{1}-\lambda_{0})+C\geq m\epsilon/2+D+C.

If both $j+1,k\geq m+1$, then as the sequence $a_{1},\ldots,a_{n}$ is already $(C,\lambda_{0},\epsilon)$-tempered, the constraint on these entries of the sequence improves by $m\epsilon$, as they are now additionally offset by $m$ from $0$. So, they give the constraint $C^{\prime}\leq C+m\epsilon$.

Taking the minimum over the three bounds above gives the result. ∎
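Lemma 4.15 can be sanity-checked on explicit sequences. The sketch below takes (4.43) as the definition of the best $(\lambda,\epsilon)$-temperedness constant of a finite real sequence (an assumption on our part, since Definition 4.1 is not restated here) and compares the constant of a concatenation with the lower bound from the lemma. The sample sequences and parameter values are arbitrary.

```python
def tempered_constant(seq, lam, eps):
    """Best constant in (4.43): the minimum over 0 <= j < k <= len(seq) of
    j*eps + sum_{i=j+1}^{k} (seq_i - lam)."""
    n = len(seq)
    prefix = [0.0]
    for x in seq:
        prefix.append(prefix[-1] + (x - lam))
    return min(j * eps + prefix[k] - prefix[j]
               for j in range(n) for k in range(j + 1, n + 1))

def concatenation_bound(a, b, lam0, lam1, eps):
    """Lower bound of Lemma 4.15 for the sequence b followed by a."""
    C = tempered_constant(a, lam0, eps)
    D = tempered_constant(b, lam1, eps / 2)
    m = len(b)
    return min(D, m * eps / 2 + C + D, m * eps + C)

# Arbitrary sample data with lam1 - lam0 > eps, as the lemma requires.
a = [0.2, -0.5, 1.0, 0.3]
b = [1.5, 0.1, 2.0]
lam0, lam1, eps = 0.5, 0.9, 0.2

joined = tempered_constant(b + a, lam0, eps)
bound = concatenation_bound(a, b, lam0, lam1, eps)
```

For these sample values the joined sequence has a strictly better constant than the bound, which is consistent with the lemma giving only a lower estimate.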

Using the above, we will now prove that for submartingale difference sequences the renewals of backward temperedness have exponential tails.

Proposition 4.16.

(Exponential return times to the tempered set) Fix c>λ0>λ>0c>\lambda_{0}>\lambda>0 and pick 0<ϵ<(λ0λ)/30<\epsilon<(\lambda_{0}-\lambda)/3. There exist C0,D1,D2>0C_{0},D_{1},D_{2}>0 such that the following holds. Let X1,X2,X_{1},X_{2},\ldots be a submartingale difference sequence with respect to a filtration (n)n(\mathcal{F}_{n})_{n\in\mathbb{N}} such that for all nn\in\mathbb{N},

  1. (1)

    |Xn|<c\left|X_{n}\right|<c;

  2. (2)

    𝔼[Xn|n1]λ0\mathbb{E}\left[{X_{n}|\mathcal{F}_{n-1}}\right]\geq\lambda_{0}.

Fix $N\in\mathbb{N}$ and let $T$ denote the first time after $N$ such that $X_{1},\ldots,X_{T}$ is $(C_{0},\lambda,\epsilon)$-reverse tempered. Then

(4.45) (T>N+k)D1eD2k.\mathbb{P}(T>N+k)\leq D_{1}e^{-D_{2}k}.
Proof.

The proof has essentially two steps. First, in the following claim, we study how long it takes for a sequence with bad temperedness constant to recover. This happens with linear speed because we are studying a submartingale sequence with 𝔼[Xn|n1]\mathbb{E}\left[{X_{n}|\mathcal{F}_{n-1}}\right] uniformly bounded away from zero. We estimate how fast the reverse-temperedness constant improves as we append blocks of a fixed size Δ0\Delta_{0}. As a sequence of length NN might have a bad temperedness constant, to obtain the result we then apply the tail estimate on the temperedness constant for sequences of length NN. As each of these things has an exponential tail, we obtain the result.

The main claim is the following.

Claim 4.17.

There exist $C_{0}$ and $A,B>0$ independent of $N$, such that if $R\geq 0$, $X_{1},\ldots,X_{N}$ is $(-R,\lambda,\epsilon)$-tempered, and $T$ is the first time greater than $N$ that is $(C_{0},\lambda,\epsilon)$-reverse tempered, then

\[\mathbb{P}(T>N+k\mid X_{1},\ldots,X_{N}\text{ is }(-R,\lambda,\epsilon)\text{-tempered})\leq Ae^{R-Bk}.\]
Proof.

Let λ1=(λ+λ0)/2\lambda_{1}=(\lambda+\lambda_{0})/2 and denote by Bi,ΔB_{i,\Delta} the backwards (λ1,ϵ/2)(\lambda_{1},\epsilon/2)-temperedness constant of the sequence Xi+1,,Xi+ΔX_{i+1},\ldots,X_{i+\Delta}. By Proposition  4.4, there exist A2,B2A_{2},B_{2} (independent of ii and Δ\Delta) such that for C0C\geq 0,

(Xi+1,,Xi+Δ is not (C,λ,ϵ)-tempered)A2eB2C.\mathbb{P}(X_{i+1},\ldots,X_{i+\Delta}\text{ is not }(-C,\lambda,\epsilon)\text{-tempered})\leq A_{2}e^{-B_{2}C}.

As this tail on the temperedness constant is independent of ii and Δ\Delta, we see that there exists Δ0\Delta_{0} sufficiently large and δ>0\delta>0 such that for any ii\in\mathbb{N},

(4.46) 𝔼[Δ0ϵ/2+Bi,Δ0|i]>δ>0.\mathbb{E}\left[{\Delta_{0}\epsilon/2+B_{i,\Delta_{0}}|\mathcal{F}_{i}}\right]>\delta>0.

We now check how much appending a block of length Δ0\Delta_{0} improves temperedness. Let CiC_{i}^{\prime} denote the backwards (λ,ϵ)(\lambda,\epsilon)-temperedness constant of the sequence

X1,,XN,XN+1,,XN+iΔ0.X_{1},\ldots,X_{N},X_{N+1},\ldots,X_{N+i\Delta_{0}}.

and let DiD_{i} denote the (λ1,ϵ/2)(\lambda_{1},\epsilon/2) backwards tempered constant of the sequence

XN+(i1)Δ0+1,,XN+iΔ0.X_{N+(i-1)\Delta_{0}+1},\ldots,X_{N+i\Delta_{0}}.

Then by Lemma 4.15,

Ci+1=min{Di+1,ϵΔ0/2+Di+1+Ci,ϵΔ0+Ci}.C^{\prime}_{i+1}=\min\{D_{i+1},\epsilon\Delta_{0}/2+D_{i+1}+C^{\prime}_{i},\epsilon\Delta_{0}+C^{\prime}_{i}\}.

We also define C0^=C0\hat{C_{0}}=C^{\prime}_{0} and

C^i+1=min{ϵΔ0/2+Di+1+C^i,ϵΔ0+C^i}.\hat{C}_{i+1}=\min\{\epsilon\Delta_{0}/2+D_{i+1}+\hat{C}_{i},\epsilon\Delta_{0}+\hat{C}_{i}\}.

Note that by (4.46) there exists δ>0\delta>0 depending only on c,λ,λ1,ϵc,\lambda,\lambda_{1},\epsilon, such that

(4.47) 𝔼[C^i+1|N+iΔ0]C^iδ>0.\mathbb{E}\left[{\hat{C}_{i+1}|\mathcal{F}_{N+i\Delta_{0}}}\right]-\hat{C}_{i}\geq\delta>0.

Suppose that we stop at the first index $i$ with $C^{\prime}_{i}\geq-\epsilon\Delta_{0}/2$. Observe that if $i+1$ is the first index such that $\hat{C}_{i+1}\geq 0$, then because

\[\hat{C}_{i+1}\leq\epsilon\Delta_{0}/2+D_{i+1}+\hat{C}_{i},\]

and $\hat{C}_{i}<0$, we must have that $D_{i+1}\geq-\epsilon\Delta_{0}/2$. Thus

\[C^{\prime}_{i+1}\geq\min\{D_{i+1},\epsilon\Delta_{0}/2+D_{i+1}+C^{\prime}_{i},\epsilon\Delta_{0}+C^{\prime}_{i}\}\geq-\epsilon\Delta_{0}/2.\tag{4.48}\]

Let $C_{0}=-\epsilon\Delta_{0}/2$. Thus if $k$ is the first index such that $\hat{C}_{k}\geq 0$, then $T\leq N+\Delta_{0}k$. Thus we need to obtain a bound for the first time $\hat{C}_{i}\geq 0$.

We now bound the tail on the first time $\hat{C}_{i}\geq 0$. Note that $\hat{C}_{i}$ is a submartingale. Further let $M$ be an upper bound on $|\hat{C}_{i+1}-\hat{C}_{i}|$ over all $i$ (an upper bound exists because $\left|X_{i}\right|<c$). Let $\chi_{i}=\mathbb{E}[\hat{C}_{i+1}-\hat{C}_{i}\mid\mathcal{F}_{N+i\Delta_{0}}]\geq\delta>0$, as in (4.47). Then $\beta_{i}=\hat{C}_{i+1}-\hat{C}_{i}-\chi_{i}$ is a martingale difference sequence. We now estimate:

\[\mathbb{P}(\hat{C}_{k}\leq 0)\leq\mathbb{P}\left(-R+\sum_{i=0}^{k-1}\beta_{i}\leq-\sum_{i=0}^{k-1}\chi_{i}\right)\leq\mathbb{P}\left(\sum_{i=0}^{k-1}\beta_{i}\leq-k\delta+R\right).\]

Thus for kR/δk\geq R/\delta, by Azuma’s inequality (Theorem 2.2),

(C^k0)2exp((kδR)22kM2)2exp(kδ22M2+RδM2R22kM2)2exp(kδ22M2+R(δM2)).\mathbb{P}(\hat{C}_{k}\leq 0)\!\!\leq 2\exp\!\!\left(\!\!-\frac{(k\delta-R)^{2}}{2kM^{2}}\right)\!\!\leq\!\!2\exp\!\!\left(\!\!-\frac{k\delta^{2}}{2M^{2}}+\frac{R\delta}{M^{2}}-\frac{R^{2}}{2kM^{2}}\right)\!\!\leq\!\!2\exp\!\!\left(\!\!-k\frac{\delta^{2}}{2M^{2}}\!+\!R\left(\frac{\delta}{M^{2}}\right)\!\right).

If δ/M21\delta/M^{2}\leq 1, then we are already done with B=δ2/(2M2Δ0)B=\delta^{2}/(2M^{2}\Delta_{0}). Otherwise, if δ/M2>1\delta/M^{2}>1, then for k2R/δk\geq 2R/\delta, which is the only range where the bound is less than 11, the right hand side is bounded above by

2exp(kδ22M2+R(δM2))2exp(Rkδ22M2M2δ),2\exp\left(-k\frac{\delta^{2}}{2M^{2}}+R\left(\frac{\delta}{M^{2}}\right)\right)\leq 2\exp\left(R-k\frac{\delta^{2}}{2M^{2}}\frac{M^{2}}{\delta}\right),

and thus the estimate holds with B=δ/(2Δ0)B=\delta/(2\Delta_{0}) in this case as well. This finishes the proof of the claim. ∎
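The two recursions used in the proof of Claim 4.17 are easy to iterate by hand, and doing so shows how the auxiliary sequence $\hat{C}_{i}$ controls the stopping rule for $C^{\prime}_{i}$. The sketch below is a hedged illustration: the block constants $D_{i}$ are supplied as a hypothetical list rather than computed from a random sequence, and the function name is ours.

```python
def recovery_indices(c_init, d_blocks, eps, delta0):
    """Iterate the recursions for C'_i and hat C_i from the proof of Claim 4.17.

    c_init   -- common starting value C'_0 = hat C_0
    d_blocks -- hypothetical block constants D_1, D_2, ...
    Returns (first i with C'_i >= -eps*delta0/2, first i with hat C_i >= 0);
    an entry is None if the corresponding threshold is never reached.
    """
    step = eps * delta0
    c_prime = c_hat = c_init
    first_prime = 0 if c_prime >= -step / 2 else None
    first_hat = 0 if c_hat >= 0 else None
    for i, d in enumerate(d_blocks, start=1):
        c_prime = min(d, step / 2 + d + c_prime, step + c_prime)
        c_hat = min(step / 2 + d + c_hat, step + c_hat)
        if first_prime is None and c_prime >= -step / 2:
            first_prime = i
        if first_hat is None and c_hat >= 0:
            first_hat = i
    return first_prime, first_hat

# Arbitrary sample data: start at -3 and append six blocks.
result = recovery_indices(-3.0, [-0.2, 0.5, -1.0, 0.3, 0.45, 0.1],
                          eps=0.2, delta0=5)
```

Here $C^{\prime}_{i}$ crosses $-\epsilon\Delta_{0}/2$ at $i=5$ and $\hat{C}_{i}$ becomes nonnegative at $i=6$, in line with the proof's conclusion that the stopping time occurs no later than the first index at which $\hat{C}_{i}\geq 0$.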

Let $A,B$ and $C_{0}$ be as in the claim. From Proposition 4.4, there exist $D_{1},D_{2}$ such that for all $C\geq 0$,

(X1,,XN is (C,λ,ϵ)-tempered)1D1exp(D2C).\mathbb{P}(X_{1},\ldots,X_{N}\text{ is }(-C,\lambda,\epsilon)\text{-tempered})\geq 1-D_{1}\exp(-D_{2}C).

From the claim we know that if $X_{1},\ldots,X_{N}$ is $(-C,\lambda,\epsilon)$-tempered and $T$ is the waiting time for a future $(C_{0},\lambda,\epsilon)$-reverse tempered time, then

(T>N+k)AeCBk.\mathbb{P}(T>N+k)\leq Ae^{C-Bk}.

Combining these two estimates we see that

\begin{align*}
\mathbb{P}(T>N+k)&\leq\mathbb{P}(X_{1},\ldots,X_{N}\text{ is }(-Bk/2,\lambda,\epsilon)\text{-tempered and }T>N+k)\\
&\quad+\mathbb{P}(X_{1},\ldots,X_{N}\text{ is not }(-Bk/2,\lambda,\epsilon)\text{-tempered})\\
&\leq A\exp(Bk/2-Bk)+D_{1}\exp(-D_{2}Bk/2)\\
&=A\exp(-Bk/2)+D_{1}\exp(-D_{2}Bk/2).
\end{align*}

The conclusion is now immediate. ∎

The above results imply that expanding on average diffeomorphisms have frequent reverse tempered times.

Proposition 4.18.

Suppose that $(f_{1},\ldots,f_{m})$ is an expanding on average tuple of diffeomorphisms in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$. There exists $\lambda>0$ such that for all sufficiently small $\epsilon>0$, there exist $C_{0},C,\alpha$ such that for all $x\in M$ and $N\in\mathbb{N}$, if we let $T(x)$ be the first $(C_{0},\lambda,\epsilon)$-reverse tempered time for $\|D_{x}f^{n}_{\omega}\|$ that is greater than or equal to $N$, then

(T(x)N+k)1Ceαk,\mathbb{P}(T(x)\leq N+k)\geq 1-Ce^{-\alpha k},

and DxfωT(x)D_{x}f^{T(x)}_{\omega} has a well defined splitting into maximally expanded and contracted singular directions.

Proof.

Xn=Dxfnn0X_{n}=\|D_{x}f^{nn_{0}}\| is a submartingale satisfying the hypotheses of Proposition 4.16, hence XnX_{n} satisfies the required estimate on reverse tempered times. The last claim follows from Proposition 4.6. ∎

Proposition 4.18 shows that there is a uniformly large density subset of points such that DxfωnD_{x}f^{n}_{\omega} is reverse tempered. We now show that the stable direction of the resulting tempered splitting does not lie too close to any particular vector vv.

Lemma 4.19.

Suppose that $(f_{1},\ldots,f_{m})$ is an expanding on average tuple in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$, for $M$ a closed surface. There exist $D,\alpha,c_{0}$ and $C,\lambda>0$ such that for all sufficiently small $\epsilon>0$, all $x\in M$, and every interval $I\subset T^{1}_{x}M$, the following holds. Let $\left|I\right|$ denote the length of $I$, suppose that $n\geq c_{0}\left|\ln\left|I\right|\right|$, and let $T(x)$ be the first time greater than $n$ at which the sequence $D_{x}f^{n}_{\omega}$ has a $(C,\lambda,\epsilon)$-reverse tempered splitting. Denoting the most contracted direction of $D_{x}f^{T(x)}_{\omega}$ by $E^{s}_{T}$, we have

(ETsI|T(x)n+k)C|I|α.\mathbb{P}(E^{s}_{T}\in I|T(x)\leq n+k)\leq C\left|I\right|^{\alpha}.
Proof.

This probability equals $\displaystyle\frac{\mathbb{P}(E^{s}_{T}\in I\text{ and }T(x)\leq n+k)}{\mathbb{P}(T(x)\leq n+k)}$. By Proposition 4.18, the denominator is at least $1-C_{1}e^{-kC_{2}}$, for some $C_{1},C_{2}$. If $c_{0}$ is as in Proposition 4.14, then for $n\geq c_{0}\left|\ln\left|I\right|\right|$ the numerator is bounded above by $\mathbb{P}(E^{s}_{T}\in I)\leq C_{3}\left|I\right|^{\alpha}$. ∎

5. Stable manifolds of expanding on average systems

In this section we show Proposition 5.3, which says that with probability $1-C^{-\alpha}$ a point has a stable manifold of length at least $C^{-1}$. The proof has two parts. First we state an abstract proposition that gives the existence of a stable manifold with good properties through a point $x$ provided that there exists a tempered hyperbolic splitting along the orbit of $x$. We then estimate the probability that this criterion holds.

In §2.3 we introduced the stable manifolds for the random dynamics. We now introduce a quantitative property of them that will be of use later.

Definition 5.1.

We say that a stable manifold Ws(ω,z)W^{s}(\omega,z) is (C,λ,ϵ)(C,\lambda,\epsilon)-tempered if the length of Ws(ω,z)W^{s}(\omega,z) is at least C1C^{-1} and the points in the stable manifold attract uniformly quickly: for x,yfωn(WC1s(ω,z))\displaystyle x,y\in f^{n}_{\omega}(W^{s}_{C^{-1}}(\omega,z)),

dfωn+m(WC1s(ω,z))(fσn(ω)m(x),fσn(ω)m(y))Ceλmeϵn.d_{f^{n+m}_{\omega}(W^{s}_{C^{-1}}(\omega,z))}({f^{m}_{\sigma^{n}(\omega)}(x),f^{m}_{\sigma^{n}(\omega)}(y)})\leq Ce^{-\lambda m}e^{\epsilon n}.

Now we give a quantitative estimate on the number of stable curves of a given $C^{2}$ norm and length. This result follows from a careful reading of the construction of stable manifolds in the book of Liu and Qian [LQ95], in particular, Theorem III.3.1, which constructs stable manifolds of random dynamical systems lying in a certain type of Pesin block that the authors denote by $\Lambda^{l,r}_{a,b,k,\epsilon}$. In the case that the random dynamics only arises from a finite collection of diffeomorphisms (i.e. has bounded $C^{2}$ norm), the constraint from the $r$ parameter does not matter: $r$ essentially measures how small a neighborhood of $x$ one must look at for the map in an exponential chart to be uniformly close to its derivative. In our setting, once we pick sufficiently large $r_{0}>0$ there is no constraint. The number $k$ in our case also does not matter: it specifies the dimension of the splitting we are considering.

In the $2$-dimensional setting a point $x\in M$ lies in $\Lambda^{l,r}_{a,b,k,\epsilon}$ for the sequence of diffeomorphisms $f_{1},f_{2},\ldots$ if, writing $f^{n+k}_{n}=f_{n+k}\cdots f_{n+1}$, we have an invariant splitting $E^{s}_{f^{n}(x)}\oplus E^{u}_{f^{n}(x)}$ along the trajectory such that for the reference metric on the manifold we have that:

\begin{align*}
\left|Df^{n+k}_{n}(f^{n}(x))|_{E^{s}}\right|&\leq le^{\epsilon n}e^{(a+\epsilon)k},\\
\left|Df^{n+k}_{n}(f^{n}(x))|_{E^{u}}\right|&\geq l^{-1}e^{-\epsilon n}e^{(b-\epsilon)k},\\
\angle(E^{s}_{f_{1}^{n}(x)},E^{u}_{f_{1}^{n}(x)})&\geq l^{-1}e^{-\epsilon n}.
\end{align*}

This is defined at the beginning of [LQ95, Sec. 3]. In the language we have been using above, a (C,λ,ϵ)(-C,\lambda,\epsilon)-tempered trajectory belongs to the set Λλ,λ,1,ϵeC,r0\Lambda^{e^{C},r_{0}}_{\lambda,-\lambda,1,\epsilon}. From [LQ95, Thm. III.3.1], we may now deduce the following proposition.

Proposition 5.2.

Suppose that (f1,,fm)(f_{1},\ldots,f_{m}) is a tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M), where MM is a closed surface. Fix λ,ϵ>0\lambda,\epsilon>0. Then there exist constants D1,D2D_{1},D_{2} such that if (ω,x)(\omega,x) is a (C,λ,ϵ)(-C,\lambda,\epsilon)-tempered trajectory, then Wωs(x)W^{s}_{\omega}(x) exists and is at least D1e2CD_{1}e^{-2C} long. Further, on this interval, its C2C^{2} norm is at most D2e6CD_{2}e^{6C} (when viewed as a graph over its tangent space at xx). Moreover these estimates are e7ϵe^{7\epsilon}-tempered along the trajectory.

Proof.

From the above discussion, a $(-C,\lambda,\epsilon)$-tempered point lies in $\Lambda^{e^{C},r_{0}}_{\lambda,-\lambda,1,\epsilon}$. So, we just need to recover the estimates from the proof of [LQ95, Thm. III.3.1]. In fact these estimates are stated there. As we are keeping $\lambda,\epsilon$ fixed, the conclusion will follow once we compute the quantities $\alpha_{n}$ and $\beta_{n}$ appearing in that theorem given our particular choices. Although [LQ95] only shows the stable manifolds are $C^{1,1}$, the estimates provided there on the Lipschitz constant of the derivative are enough for controlling the $C^{2}$ norm because we know that the stable manifolds are in fact as smooth as the dynamics, which is $C^{2}$ [Arn98, Rem. 7.3.20].

First we explain how to estimate $\beta_{n}$, which controls the norm. The first quantity that gets defined in the proof is $c_{0}=4Ar^{\prime}e^{2\epsilon}$. Here, $A$ is the quantity appearing in the proof of [LQ95, Lem. 1.3], which is equal to $4l^{2}(1-e^{-2\epsilon})^{-1/2}$. Thus $c_{0}\leq C_{1}e^{2C}$. Therefore the quantity $D=(1-e^{-2\epsilon})^{-3}(1+e^{-2\epsilon})^{2}c_{0}e^{-a}$ on p. 66 of [LQ95] is at most $C_{2}e^{2C}$. Hence $\beta_{n}$, which is defined on p. 68 of [LQ95] as $2DA^{2}e^{7\epsilon n}$ and controls the norm of the stable curve, is at most $C_{3}e^{6C}e^{7\epsilon n}$.

The length of the curve is given by the quantity $\alpha_{n}$, defined on p. 68 of [LQ95] to be $A^{-1}r_{0}e^{-5\epsilon n}$. From the definition of $A$ given above, this is bounded below by $C_{4}e^{-2C}e^{-5\epsilon n}$. We are done. ∎
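The exponent bookkeeping in this proof can be checked mechanically. The sketch below evaluates the quantities $A$, $c_{0}$, $D$, $\beta_{n}$, and $\alpha_{n}$ from the formulas quoted above with $l=e^{C}$. It reads the factor in $A$ as $(1-e^{-2\epsilon})^{-1/2}$, which is an assumption on our part, and the numerical values chosen for $\epsilon$, $a$, $r^{\prime}$, and $r_{0}$ are arbitrary placeholders rather than values from [LQ95].

```python
import math

def lq_constants(C, eps, a, r_prime, r0, n=0):
    """Return (alpha_n, beta_n) from the formulas quoted in the proof,
    with l = e^C.  All inputs other than C are placeholder parameters."""
    l = math.exp(C)
    A = 4 * l ** 2 * (1 - math.exp(-2 * eps)) ** -0.5
    c0 = 4 * A * r_prime * math.exp(2 * eps)
    D = ((1 - math.exp(-2 * eps)) ** -3
         * (1 + math.exp(-2 * eps)) ** 2 * c0 * math.exp(-a))
    beta_n = 2 * D * A ** 2 * math.exp(7 * eps * n)
    alpha_n = r0 / A * math.exp(-5 * eps * n)
    return alpha_n, beta_n

# Increasing C by 1 should scale beta by e^6 and alpha by e^{-2}.
alpha1, beta1 = lq_constants(1.0, eps=0.1, a=-0.5, r_prime=1.0, r0=1.0)
alpha2, beta2 = lq_constants(2.0, eps=0.1, a=-0.5, r_prime=1.0, r0=1.0)
```

The ratios `beta2 / beta1` and `alpha2 / alpha1` recover the factors $e^{6}$ and $e^{-2}$, matching the $e^{6C}$ and $e^{-2C}$ dependence stated in Proposition 5.2.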

We then estimate the probability that a stable manifold is (C,λ,ϵ)(C,\lambda,\epsilon)-tempered.

Proposition 5.3.

Suppose that $(f_{1},\ldots,f_{m})\in\operatorname{Diff}_{\operatorname{vol}}(M)$ is a uniformly expanding on average tuple, where $M$ is a closed surface. Then there exist $\lambda,\epsilon,\alpha>0$ such that for all $C>0$

μ({ω:Wωs(x)isnot(C,λ,ϵ)-tempered})Cα.\mu(\{\omega:W^{s}_{\omega}(x){\rm\ is\ not\ }(C,\lambda,\epsilon)\text{-tempered}\})\leq C^{-\alpha}.
Proof.

As the maps $f_{1},\ldots,f_{m}$ are uniformly $C^{1+\text{H\"{o}lder}}$ and uniformly expanding on average, the trajectory is $(-C,\lambda,\epsilon)$-tempered with probability $1-De^{-\alpha C}$ by Proposition 4.8. This stable curve is at least $D_{1}e^{-2C}$ long from Proposition 5.2. Since both the length bound and the failure probability are exponential in $C$, the probability of failing to have a stable manifold of length $K^{-1}$ is polynomially small in $K$. The contraction along the stable manifold required by Definition 5.1 then follows from a standard graph transform argument, appearing in Chapter 7 of [BP07] or [LQ95, Lem. 3.2], or from keeping track of the contraction in the graph transform arguments in §A.4. ∎

6. Exactness of the skew product

We now consider measure theoretic properties of the skew product F:Σ×MΣ×MF\colon\Sigma\times M\to\Sigma\times M. We begin with the most basic property, ergodicity, in Proposition 6.1. Then we show that this system is exact in Proposition 6.5. As exactness implies mixing, this proposition plays a key role in the proof of finite time mixing in Section 9 where it is used in the proof of fiberwise mixing in Proposition 9.1.

6.1. Ergodicity

The ergodicity of expanding on average systems has been known since [DK07, Section 10]. We need an extension of this result. Consider the diagonal skew product

(6.1) \[F_{k}\colon\Sigma\times M^{k}\to\Sigma\times M^{k}\quad\text{given by}\quad(\omega,x_{1},\ldots,x_{k})\mapsto(\sigma(\omega),f_{\omega_{0}}(x_{1}),\ldots,f_{\omega_{0}}(x_{k})).\]

Note that FkF_{k} preserves the measure μvolk\mu\otimes\operatorname{vol}^{k}.

Proposition 6.1.

Suppose that (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M) for MM a closed surface. Then for each kk\in\mathbb{N}, FkF_{k} is ergodic with respect to μvolk\mu\otimes\operatorname{vol}^{k}.

We will not include a full proof of the above proposition as the result for F=F1F=F_{1} is explained quite clearly in [Chu20, §3.2] as well as [Liu16, Lem. 4.41]. For k>1k>1, the result can be deduced along similar lines. No higher dimensional dynamics is needed because the dynamics is a product and hence all dynamical constructs, like stable manifolds, are just products of the constructs for the system F1F_{1}.

The proof of Proposition 6.1 relies implicitly on the following lemma which will be important in Section 6.2 as well. For xMx\in M, we let Bδ(x)B_{\delta}(x) denote the ball of radius δ\delta centered at xx.

Lemma 6.2.

Suppose that (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M). Then there exist 0<δ1<δ20<\delta_{1}<\delta_{2} and λ,ϵ,C0,ϵ0>0\lambda,\epsilon,C_{0},\epsilon_{0}>0 such that for all xMx\in M there exist two positive measure subsets V1,V2ΣV_{1},V_{2}\subseteq\Sigma and a pair of transverse cones 𝒞1,𝒞2\mathcal{C}_{1},\mathcal{C}_{2} defined on Bδ2(x)B_{\delta_{2}}(x) by parallel transport of cones based at xx such that the following holds. Let Λω\Lambda_{\omega} denote the set of (C0,λ,ϵ)(C_{0},\lambda,\epsilon)-tempered points in Bδ1(x)B_{\delta_{1}}(x) under the dynamics defined by ω\omega, and set

\[Q^{\omega}(x)=\bigcup_{y\in\Lambda_{\omega}\cap B_{\delta_{1}}(x)}W^{s}_{\delta_{2}}(\omega,y).\]

Then

  1. (1)

    For $i\in\{1,2\}$, $\omega_{i}\in V_{i}$, and $y\in\Lambda_{\omega_{i}}$, the stable manifold $W^{s}_{\delta_{2}}(\omega_{i},y)$ is uniformly contracting and tangent to $\mathcal{C}_{i}$.

  2. (2)

    For i{1,2}i\in\{1,2\} and ωiVi\omega_{i}\in V_{i}, the laminations by stable manifolds satisfy the usual absolute continuity properties:

    (AC 1) If KMK\subseteq M is a Borel set, and for almost every yΛωiy\in\Lambda_{\omega_{i}} the Riemannian leaf measure of KWδ2s(ωi,y)K\cap W^{s}_{\delta_{2}}(\omega_{i},y) is zero, then vol(QωiK)=0\operatorname{vol}(Q^{\omega_{i}}\cap K)=0.

    (AC 2) If TT is a transversal to 𝒞i\mathcal{C}_{i} and KMK\subseteq M is a Borel set, and for a positive measure subset of zTz\in T, Wδ2s(ωi,z)KW^{s}_{\delta_{2}}(\omega_{i},z)\cap K has positive leaf measure, then vol(K)>0\operatorname{vol}(K)>0.

  3. (3)

    For $i\in\{1,2\}$ and $\omega_{i}\in V_{i}$, $\operatorname{vol}(Q^{\omega_{i}}\cap B_{\delta_{1}}(x))>0.99\operatorname{vol}(B_{\delta_{1}}(x))$.

This lemma is implicit in Chung [Chu20] and Liu [Liu16], and further can be deduced from the propositions we prove below. In particular, our Propositions 10.12 and B.13 contain the needed claims. Lemma 6.2 allows a random version of the Hopf argument where the stable manifolds for different words ωΣ\omega\in\Sigma play the role of the stable and unstable manifolds in the usual Hopf argument. This can be used to prove Proposition 6.1. We will not repeat this argument here as it is adequately explained in the sources mentioned.

6.2. Strong mixing

Here we show that for k1k\geq 1 the skew product Fk:Σ×MkΣ×MkF_{k}\colon\Sigma\times M^{k}\to\Sigma\times M^{k} defined in (6.1) is strong mixing for the measure μvolk\mu\otimes\operatorname{vol}^{k}. We will use this property later. A good reference for many of the properties discussed in this section is [Roh67].

Definition 6.3.

An endomorphism $T$ of a Lebesgue space $(M,\mathcal{B},\mu)$ is exact if $\bigcap_{n=0}^{\infty}T^{-n}\mathcal{B}=\mathcal{N}$, the trivial sub-sigma algebra of $M$.

An invertible map, i.e. an automorphism, $T$ of a Lebesgue space $(M,\mathcal{B},\mu)$, is called a $K$-automorphism if there exists a sub-sigma algebra $\mathcal{K}\subset\mathcal{B}$ such that:

(1) $\mathcal{K}\subset T\mathcal{K}$; (2) $\bigvee_{n=0}^{\infty}T^{n}\mathcal{K}=\mathcal{B}$; (3) $\bigcap_{n=0}^{\infty}T^{-n}\mathcal{K}=\{\emptyset,M\}$.

Both exact systems and KK-automorphisms are strong multiple mixing [Roh64, p. 17, 27], [Roh67, 15.2]. Further, an endomorphism is exact if and only if its natural extension is a KK-automorphism [Roh64, p. 27].

We now describe how one may show that an automorphism $T\colon(M,\mu)\to(M,\mu)$ is a $K$-automorphism. The Pinsker partition of $M$ is the finest measurable partition $\pi(T)$ of $M$ that has zero entropy. This means that any other measurable partition with zero entropy is coarser, mod $0$, than $\pi(T)$. It turns out that $T$ is a $K$-automorphism if the Pinsker partition of $T$ is trivial, i.e. $\pi(T)=\{\emptyset,M\}$, see [Roh67, 13.1,13.10]. In fact, the conditions enumerated in the definition of $K$-automorphism above essentially say that the Pinsker partition is trivial.

A useful fact for studying the Pinsker partition is the following.

Lemma 6.4.

(see [BP07, p. 288], [Roh67, 12.1]) If a measurable partition $\eta$ satisfies $T\eta\geq\eta$ and $\bigvee_{n=0}^{\infty}T^{n}\eta=\epsilon$, the partition into points, then $\bigwedge_{n=0}^{\infty}T^{-n}\eta\geq\pi(T)$.

Here we use the standard notation for partitions where we write 𝒜\mathcal{A}\leq\mathcal{B} if 𝒜\mathcal{A} is coarser than \mathcal{B}. An example of a partition satisfying the hypotheses of Lemma 6.4 is the partition of a shift space Σ\Sigma into local stable sets, Wlocs(ω)={η:ωi=ηi for i0}W^{s}_{loc}(\omega)=\{\eta:\omega_{i}=\eta_{i}\text{ for }i\geq 0\}.

We now show for k1k\geq 1 that the map FkF_{k} defined above is mixing.

Proposition 6.5.

Let (f1,,fm)(f_{1},\ldots,f_{m}) be an expanding on average tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M) for MM a closed surface. Then the associated skew product F:Σ×MΣ×MF\colon\Sigma\times M\to\Sigma\times M is exact, and hence strong mixing of all orders, for the measure μvol\mu\otimes\operatorname{vol}. The same holds for Fk:Σ×MkΣ×MkF_{k}\colon\Sigma\times M^{k}\to\Sigma\times M^{k}.

Proof.

To show exactness and hence strong mixing of $F$, we will show that the natural extension of the skew product $F\colon\Sigma\times M\to\Sigma\times M$ has the $K$-property. As before, we denote by $\hat{\Sigma}$ the two sided shift, so that the natural extension of $F$ is $\hat{F}\colon(\hat{\Sigma}\times M,\hat{\mu}\otimes\operatorname{vol})\to(\hat{\Sigma}\times M,\hat{\mu}\otimes\operatorname{vol})$, where $\hat{\mu}$ is the Bernoulli measure on $\hat{\Sigma}$. Note that the measure on the natural extension has this simple description because each $f_{i}$ preserves volume.

We begin by showing that modulo $0$, any element of the Pinsker partition is of the form $\hat{\Sigma}\times U$ where $U\subseteq M$. The local stable sets of the words $\omega\in\hat{\Sigma}$ form a measurable partition of $\hat{\Sigma}$ indexed by the elements of $\Sigma$. Further, the sets $\{W^{s}_{loc}(\omega)\times\{x\}\}_{x\in M}$ form a measurable partition of $\hat{\Sigma}\times M$. If we let $\eta$ denote this partition, then $\bigwedge_{n=0}^{\infty}\hat{F}^{-n}\eta$ is the partition into sets of the form $\hat{\Sigma}\times\{x\}$, where $x\in M$. By Lemma 6.4, we see that $\pi(\hat{F})\leq\{\hat{\Sigma}\times\{x\}:x\in M\}$. Note that this shows that the atoms of the Pinsker partition of $\hat{F}$ are of the form $\hat{\Sigma}\times A$ where $A$ are the atoms of a partition of $M$. We denote this partition by $\mathcal{P}$ and the atom containing a point $x\in M$ by $\mathcal{P}(x)$.

We now show that the Pinsker partition is even coarser by using the dynamics in the fiber; in fact our goal is to show that $\pi(\hat{F})$ has an atom with positive mass. From Liu and Qian, there is a measurable partition of $\hat{\Sigma}\times M$ subordinate to the partition into full stable leaves [LQ95, Proposition VI.5.2] where each atom is a non-trivial curve in a stable leaf. This shows that for almost every $x\in M$ and almost every $\omega$, Lebesgue almost every $y\in W^{s}(\omega,x)$ is in $\mathcal{P}(x)$. (This uses AC1 for the stable lamination.) Let $G^{\omega_{i}}$ be the subset of $Q^{\omega_{i}}$ of points $y$ such that almost every $z\in W^{s}_{\delta_{2}}(\omega_{i},y)$ is in $\mathcal{P}(y)$. Note that there is a subset $\bar{V}_{i}$ of full measure in $V_{i}$ such that for $\omega_{i}\in\bar{V}_{i}$, $G^{\omega_{i}}$ has full measure in $Q^{\omega_{i}}$. Now for $\omega_{2}\in\bar{V}_{2}$ and $z\in G^{\omega_{2}}$, consider the intersection of a leaf $W^{s}_{\delta_{2}}(\omega_{2},z)$ with $G^{\omega_{1}}$, where $\omega_{1}\in\bar{V}_{1}$. Suppose that for some such $z$ the set $G^{\omega_{1}}\cap W^{s}_{\delta_{2}}(\omega_{2},z)$ has positive leaf measure. Then by definition of $G^{\omega_{1}}$, almost every $y$ in this set has $W^{s}_{\delta_{2}}(\omega_{1},y)$ saturated with points in $\mathcal{P}(z)$, and hence by AC2, $\mathcal{P}(z)$ has positive measure. Thus the Pinsker partition has a positive measure atom. If there were no such point $z$, then for almost every $z\in G^{\omega_{2}}$, the intersection $G^{\omega_{1}}\cap W^{s}_{\delta_{2}}(\omega_{2},z)$ has zero leaf measure. Thus by AC1, $Q^{\omega_{2}}\cap Q^{\omega_{1}}\cap B_{\delta_{1}}(x)$ has measure zero. But as $Q^{\omega_{1}}$ and $Q^{\omega_{2}}$ each take up more than a $0.99$ proportion of the volume of $B_{\delta_{1}}(x)$, this is impossible. Thus we see that there is a positive volume atom of $\mathcal{P}$. Let $\hat{\Sigma}\times A$ be this atom of $\pi(\hat{F})$ of positive measure.
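The contradiction at the end of this step is an inclusion-exclusion count. Writing $B=B_{\delta_{1}}(x)$ (a shorthand used only here) and using nothing beyond item (3) of Lemma 6.2, one can check:

```latex
% B is shorthand for the ball B_{\delta_1}(x); it is not notation from the source.
\begin{align*}
\operatorname{vol}(Q^{\omega_{1}}\cap Q^{\omega_{2}}\cap B)
  &\geq \operatorname{vol}(Q^{\omega_{1}}\cap B)
      +\operatorname{vol}(Q^{\omega_{2}}\cap B)
      -\operatorname{vol}(B) \\
  &> (0.99+0.99-1)\operatorname{vol}(B)
   = 0.98\operatorname{vol}(B) > 0.
\end{align*}
```

So the triple intersection cannot have measure zero, which is the impossibility invoked above.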

As F^\hat{F} is ergodic, it must cyclically permute a finite number of these positive measure sets. Because F^\hat{F} is expanding on average, every power of F^\hat{F} is also expanding on average. Hence, by Proposition 6.1, every power of FF is ergodic. Thus the Pinsker partition has only a single non-trivial element, hence π(F^)\pi(\hat{F}) is trivial. Hence F^\hat{F} is a KK-automorphism and so FF is exact.

For the higher “diagonal” skew products $F_{k}$, the proof proceeds along very similar lines. As before, one has stable and unstable manifolds in each of the factors of $M^{k}$ and hence through any particular point $(x_{1},\ldots,x_{k})\in M^{k}$, one has the stable/unstable manifold that is the product of the stable manifolds $W^{s/u}_{loc}(\omega,x_{i})$. Hence in the extended system the stable and unstable foliations are transverse as before. By using these, one can similarly deduce that the Pinsker partition is finite. Further, from Proposition 6.1 every power of $F_{k}$ is ergodic, which, as before, implies that the Pinsker partition is trivial and thus the $K$-property holds for $\hat{F}_{k}\colon\hat{\Sigma}\times M^{k}\to\hat{\Sigma}\times M^{k}$. ∎

7. Coupling

In this section we present our main technical tool: the coupling lemma. We divide its proof into several steps according to the plan from Section 3. Accordingly, this section contains the outline of the rest of the paper.

7.1. Standard pairs and standard families

The proof of exponential mixing in this paper proceeds by showing that if μ1\mu_{1} and μ2\mu_{2} are two measures with smooth densities and ψ\psi is a Hölder function then μ1(ψfωnx)μ2(ψfωnx)\mu_{1}(\psi\circ f_{\omega}^{n}x)-\mu_{2}(\psi\circ f_{\omega}^{n}x) is exponentially small. Taking μ2\mu_{2} to be vol\operatorname{vol} and μ1\mu_{1} to be the measure with density ϕ\phi we obtain Theorem 1.1. Unfortunately, the set of measures whose densities satisfy a certain bound on their Hölder norm is not invariant by the dynamics, since compositions worsen Hölder regularity. So we need to consider a larger class of measures: the measures that are convex combinations of measures on (unstable) curves. This leads to notions of standard pairs and standard families that we now recall. We refer to [CM06, Chapter 7] for a detailed discussion of these notions.

Definition 7.1.

A standard pair in a Riemannian manifold MM is an arclength parametrized C2C^{2} curve γ:[a,b]M\gamma\colon[a,b]\to M of bounded length along with a log-Hölder density ρ\rho defined along γ\gamma (or equivalently [a,b][a,b]). We denote the pair of the curve and density by γ^\hat{\gamma} for emphasis.

There are two different ways of thinking about standard pairs. The first is that a standard pair is literally a pair of a curve and a density as in Definition 7.1. The second way is that we think of γ^=(γ,ρ)\hat{\gamma}=(\gamma,\rho) as a “thickened” version of the underlying curve γ\gamma where the “thickness” is given at a point xx by ρ(x)\rho(x). More precisely, we may think of γ^\hat{\gamma} as a subset of [a,b]×[0,maxρ][a,b]\times[0,\max{\rho}] comprising the points (c,y)(c,y) where yρ(c)y\leq\rho(c). We will often write xγ^x\in\hat{\gamma} when referring to a point in this set associated to γ^\hat{\gamma}. By thinking of the standard pair in this manner, we can imagine geometrically subdividing the pair into pieces. This type of subdivision is frequently used below.

Each standard pair defines a measure on MM given for continuous ψ:M\psi\colon M\to\mathbb{R} by the formula

(7.1) \[\hat{\rho}_{\gamma}(\psi)=\int_{\gamma}\psi(x)\rho(x)\,dx\]

where $dx$ denotes integration with respect to arclength along $\gamma$.

A standard pair comes with a notion of regularity. The regularity of $\hat{\gamma}$ is determined by the $C^{2}$ norm of $\gamma$ as well as the Hölder regularity of the log-density $\ln\rho$ along $\gamma$. We recall now some notions from §2.4. Recall that we define the $C^{2}$ norm, $\|\gamma\|_{C^{2}}$, of the curve $\gamma$ as the supremum of its second derivative as a graph over its tangent space in exponential charts.

Definition 7.2.

Suppose that γ^\hat{\gamma} is a C2C^{2} standard pair consisting of a curve γ\gamma and a density ρ\rho. We say that γ^\hat{\gamma} is RR-good if

(1) The length of γ\gamma is at least eRe^{-R}.

(2) The C2C^{2} norm of γ\gamma is at most eRe^{R}.

(3) The density $\rho$ satisfies $\|\ln\rho\|_{C^{\alpha}}\leq e^{R}$, where we measure distance with respect to the arclength parameter of $\gamma$. Recall that $\|\cdot\|_{C^{\alpha}}$ here denotes only the Hölder constant of the function.

We say that a standard pair γ^\hat{\gamma} is RR-regular when at least (2) and (3) are satisfied.

Note that a larger RR corresponds to a less regular curve.

Definition 7.3.

For a standard pair γ^=(γ,ρ)\hat{\gamma}=(\gamma,\rho), we say that xγx\in\gamma has an RR-good neighborhood, if there is a subcurve γγ\gamma^{\prime}\subseteq\gamma containing xx such that (γ,ρ|γ)(\gamma^{\prime},\rho|_{\gamma^{\prime}}) is RR-good.

Note that if $x$ is in an $R$-good neighborhood of $\hat{\gamma}$, this does not imply that $x$ is centered in a long neighborhood. The point $x$ might still be quite close to the edge. Later we will also deal with points $x$ that are centered in an $R$-good neighborhood, meaning that the segments on either side of $x$ form $R$-good neighborhoods.

Definition 7.4.

A standard family is a collection of standard pairs {γ^θ}θΛ\{\hat{\gamma}_{\theta}\}_{\theta\in\Lambda} indexed by points from a probability space (Λ,λ).(\Lambda,\lambda).

Thus in the case that λ\lambda is atomic we just have a finite collection of standard pairs (counted with weights).

We say that a standard family is RR-good if each standard pair that comprises it is RR-good. We will only consider standard families where the goodness is bounded below.

Given a standard family {γθ}θΛ\{\gamma_{\theta}\}_{\theta\in\Lambda} we can associate a measure by integrating the measures corresponding to individual standard pairs with respect to the factor measure λ\lambda. For a function ψ:M\psi\colon M\to\mathbb{R}, we set

(7.2) \[\hat{\rho}_{\Lambda}(\psi)=\int_{\Lambda}\hat{\rho}_{\gamma_{\theta}}(\psi)\,d\lambda(\theta)\]

where ρ^γθ\hat{\rho}_{\gamma_{\theta}} is defined by (7.1).

A particularly useful property of standard families is that they can represent volume. It is straightforward to check, using charts, that a standard family representing volume exists.

Proposition 7.5.

Given a closed smooth manifold $M$ endowed with a volume, there exists some $C>0$ and a $C$-good standard family $P_{\operatorname{vol}}$ such that the associated measure represents volume on $M$, i.e. for any continuous function $\phi\colon M\to\mathbb{R}$,

\[\int\phi\,dP_{\operatorname{vol}}=\int\phi\,d\operatorname{vol}.\]

Below we will use a naïve estimate saying that the goodness of a standard pair can deteriorate at most exponentially quickly.

Proposition 7.6.

Suppose that $(f_{1},\ldots,f_{m})$ are $C^{2}$ diffeomorphisms of a closed manifold. Then there exist $C,\eta>0$ such that for any standard pair $\hat{\gamma}$ that is $R$-good and any $\omega\in\Sigma$, $f^{n}_{\omega}(\hat{\gamma})$ is $\max\{C+R+n\eta,C+n\eta\}$-good.

Proof.

The claim that the length of the curve can shrink at most exponentially fast is clear from the uniform bound on the derivative. The bound on the $C^{2}$ norm of the curve follows immediately from Lemma A.9. This leaves the estimate on the density, which follows from Lemma A.7 because the $C^{2}$ norm of $f^{n}_{\omega}$ grows at most exponentially. ∎
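For the length condition, the estimate can be spelled out in one line. The sketch below assumes a constant $\eta_{0}$, introduced here for illustration only, that bounds the logarithm of the maximal contraction rate of each $f_{i}$; it plays the role of $\eta$ for item (1) of Definition 7.2.

```latex
% Assumption (hypothetical notation): \eta_0 bounds the contraction of each f_i,
%   \eta_0 := \max_{1\le i\le m}\ \sup_{x\in M}\ \ln\|(D_x f_i)^{-1}\| .
% Then every f_i shrinks the length of any curve by a factor of at most e^{-\eta_0}.
\begin{align*}
\operatorname{len}\bigl(f^{n}_{\omega}(\gamma)\bigr)
  \;\geq\; e^{-n\eta_{0}}\operatorname{len}(\gamma)
  \;\geq\; e^{-n\eta_{0}}e^{-R}
  \;=\; e^{-(R+n\eta_{0})}.
\end{align*}
```

This shows that item (1) degrades by at most $n\eta_{0}$; the bounds for items (2) and (3) come from Lemmas A.9 and A.7 as stated in the proof and are not reproduced here.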

Note that the representation (7.2) (including the representation of the volume from Proposition 7.5) is highly non-unique. One type of non-uniqueness that we shall often exploit in our proof is the possibility to divide a standard pair into pieces. To do so we partition the underlying curve γ\gamma into multiple disjoint subcurves γ1,,γn\gamma_{1},\ldots,\gamma_{n}. We then obtain a subdivision of (γ,ρ)(\gamma,\rho) from the restrictions (γ1,ρ|γ1),,(γn,ρ|γn)(\gamma_{1},\rho|_{\gamma_{1}}),\ldots,(\gamma_{n},\rho|_{\gamma_{n}}). We give each piece unit mass for the indexing measure λ\lambda. Note that (γ,ρ)(\gamma,\rho) as well as the standard family {(γi,ρ|γi)}1in\{(\gamma_{i},\rho|_{\gamma_{i}})\}_{1\leq i\leq n} both represent the same measure on MM.

A more subtle type of subdivision occurs when we view a standard pair as a subset of $\gamma\times[0,\max\rho]$ and partition this subset in the vertical direction. As before, we obtain a new standard family, but now the underlying curves of the family may not be disjoint. For a simple example, something we do in multiple places in the local coupling argument is take a standard pair $(\gamma,\rho)$, a number $\alpha\in(0,1)$, and subdivide this standard pair into $\{(\gamma,\alpha\rho),(\gamma,(1-\alpha)\rho)\}$ and give each piece mass $1$ for the indexing measure $\lambda$. Alternatively, we could take $\hat{\gamma}_{1}=\hat{\gamma}_{2}=(\gamma,\rho)$ and allow the indexing measure to assign them mass $\alpha$ and $1-\alpha$, which gives the same measure on $M$ independent of $\alpha$. Below, we will often think of this geometrically: we take the region associated to the standard pair in $\gamma\times[0,\max\rho)$ and slice it into regions. Projecting the Lebesgue measure on each region down to $\gamma$ naturally defines a standard pair.
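That both descriptions of the vertical subdivision represent the same measure as $(\gamma,\rho)$ can be checked directly from (7.1) and (7.2). The following is a short verification for a continuous test function $\psi$, using no notation beyond that of the text.

```latex
% First description: two pieces (\gamma,\alpha\rho) and (\gamma,(1-\alpha)\rho),
% each given mass 1 by the indexing measure.
\begin{align*}
\int_{\gamma}\psi(x)\,\alpha\rho(x)\,dx
  +\int_{\gamma}\psi(x)\,(1-\alpha)\rho(x)\,dx
  &= \int_{\gamma}\psi(x)\rho(x)\,dx
   = \hat{\rho}_{\gamma}(\psi).
\end{align*}
% Second description: two copies of (\gamma,\rho) with indexing masses \alpha and 1-\alpha.
\begin{align*}
\alpha\,\hat{\rho}_{\gamma}(\psi)+(1-\alpha)\,\hat{\rho}_{\gamma}(\psi)
  &= \hat{\rho}_{\gamma}(\psi).
\end{align*}
```

In both cases the right-hand side does not depend on $\alpha$, which is the non-uniqueness of the representation (7.2) that the coupling argument exploits.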

Next, if we have a standard family γ^\hat{\gamma} and a subfamily γ^\hat{\gamma}^{\prime} of γ^\hat{\gamma} defined by some subdivision of γ×[0,maxρ)\gamma\times[0,\max\rho) as mentioned above, then we define γ^γ^\hat{\gamma}\setminus\hat{\gamma}^{\prime} to be the standard family defined by the complement of γ^\hat{\gamma}^{\prime} in the subdivision.

7.2. Main coupling proposition

We now state the main technical result of the paper, from which the main mixing results follow.

Proposition 7.7.

Suppose that (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M), where MM is a closed surface. There exists λ>0\lambda>0 such that for all sufficiently small ϵ>0\epsilon>0, there exist C,α>0C,\alpha>0, such that for any RR, a goodness of standard pairs, the following holds.

Let γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} be two standard pairs with associated measures ρ1\rho_{1} and ρ2\rho_{2} of equal mass that are RR-good. Then we have the measures μρi\mu\otimes\rho_{i} on Σ×γ^i\Sigma\times\hat{\gamma}_{i}, where μ\mu is the Bernoulli measure on the one sided shift. There exists a coupling function Υ:Σ×γ^1γ^2\Upsilon\colon\Sigma\times\hat{\gamma}_{1}\to\hat{\gamma}_{2}, where for each ω\omega the map Υ(ω,):γ^1γ^2\Upsilon(\omega,\cdot)\colon\hat{\gamma}_{1}\to\hat{\gamma}_{2} is measure preserving, and a time T^(ω,x)\hat{T}(\omega,x) such that

\[f^{\hat{T}(\omega,x)}_{\omega}(x)\in W^{s}_{\sigma^{\hat{T}(\omega,x)}\omega,C^{-1}}\bigl(f^{\hat{T}(\omega,x)}_{\omega}\Upsilon(\omega,x)\bigr),\]

and this stable manifold is uniformly (C,λ,ϵ)(C,\lambda,\epsilon)-tempered in the sense of Definition 5.1. Further

\[\mathbb{P}_{\omega,x}(\hat{T}(\omega,x)\geq n)\leq e^{\max\{R,0\}}e^{-\alpha n}.\]

The proof of this proposition is a combination of a local coupling lemma (Lemma 7.10) along with a recovery procedure.

When we attempt to couple two curves, we will insist that they are in a configuration that allows us to try and apply the Local Coupling Lemma (Lemma 7.10). What we mean by this is that the curves have controlled regularity and are sufficiently near to each other.

Definition 7.8.

Let γ^\hat{\gamma} be a standard pair and xγx\in\gamma. We say that xx is (C,δ)(C,\delta)-well positioned in γ^\hat{\gamma} if γ^\hat{\gamma} is CC-regular and xx is δ\delta distance away from the endpoints of γ\gamma, with distance measured along γ\gamma.

We say that two standard pairs $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$ are in a $(C,\delta,\upsilon)$-configuration if there exist $x$ which is $(C,\delta)$-well positioned in $\hat{\gamma}_{1}$, and $y$ which is $(C,\delta)$-well positioned in $\hat{\gamma}_{2}$ such that $d(x,y)<\upsilon$.

The proof of Proposition 7.7 proceeds along the following steps. We start with two $C_{0}$-good standard pairs, $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$. Here $C_{0}$ is some uniform regularity appearing in Lemma 7.9 that we may obtain starting from an arbitrarily bad curve by waiting long enough.

  1. (1)

    We prove that for a large proportion of words $\omega\in\Sigma$, the images $f^{n}_{\omega}(\hat{\gamma}_{1})$ and $f^{n}_{\omega}(\hat{\gamma}_{2})$ are mostly quite regular, and moreover, there is a large measure subset of the images that can be paired to form $(C_{1},\delta,\upsilon)$-configurations for some $C_{1}$ that is worse than $C_{0}$. This relies on the mixing properties of our system studied in Section 6, and the needed conclusions are made precise in Proposition 7.11.

  2. (2)

    We then run a “local” coupling argument on each tiny (C1,δ,υ)(C_{1},\delta,\upsilon)-configuration. At each time step, we attempt to couple the remaining well tempered points using “fake” stable manifolds. This local coupling argument, Lemma 7.10, has a number of steps and draws on several intermediate estimates.

    (a) There are C,λ,ϵ>0C,\lambda,\epsilon>0 and a cone field 𝒞θ\mathcal{C}_{\theta} that is uniformly transverse to both γ1\gamma_{1} and γ2\gamma_{2} such that the probability that any point is (C,λ,ϵ)(C,\lambda,\epsilon)-tempered and has EsE^{s} tangent to 𝒞θ\mathcal{C}_{\theta} is positive. Further, the probability that the tempering fails at time nn is exponentially small.

    (b) For a $(C,\lambda,\epsilon)$-tempered point at time $n$, we see that there is a “fake” stable manifold $W^{s}_{n}$ given by taking a curve nearly tangent to $Df^{n}_{\omega}(E^{s}_{n})$ and pushing this curve backwards by $(Df^{n}_{\omega})^{-1}$. (This construction is the subject of §B.4.)

    (c) There exist worse $(C^{\prime},\lambda^{\prime},\epsilon^{\prime})$ such that for every $(C,\lambda,\epsilon)$-tempered point $x$ in $\gamma_{1}$, all points within distance $\|D_{x}f^{n}_{\omega}\|^{-(1+\sigma)}$ of $x$ are $(C^{\prime},\lambda^{\prime},\epsilon^{\prime})$-tempered points at time $n$. (This is the content of Proposition 10.3.) These $(C^{\prime},\lambda^{\prime},\epsilon^{\prime})$-tempered points also have fake stable manifolds. We will try to couple these thickened neighborhoods of the $(C,\lambda,\epsilon)$-tempered points with some neighborhoods in $\gamma_{2}$ determined by the fake stable holonomies. At the time when $D_{x}f^{n}_{\omega}$ fails to be $(C,\lambda,\epsilon)$-tempered with $E^{s}$ tangent to $\mathcal{C}_{\theta}$ we discard the point $x$ and stop trying to couple it.

    (d) For (C,λ,ϵ)(C^{\prime},\lambda^{\prime},\epsilon^{\prime})-tempered points, the holonomies of the fake stable manifolds WnsW^{s}_{n} between γ1\gamma_{1} and γ2\gamma_{2} converge exponentially fast to the true, limiting stable holonomy. Moreover, the image of a point xγ1x\in\gamma_{1} under HnsH^{s}_{n} has fluctuations, as nn changes, of size Dxfωn1.99\|D_{x}f^{n}_{\omega}\|^{-1.99}, i.e. the distance between Hns(x)H^{s}_{n}(x) and Hn+1s(x)H^{s}_{n+1}(x) in γ2\gamma_{2} is at most Dxfωn1.99\|D_{x}f^{n}_{\omega}\|^{-1.99}. (This is proved in Proposition B.12.)

    (e) The points we try to couple with on γ2\gamma_{2} are the image of the points on γ1\gamma_{1} under the fake stable holonomy HnsH^{s}_{n}.

    (f) By carefully choosing subdivisions of the standard pairs $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$ we may discard mass from the standard pairs so that at the end of the procedure a positive proportion of the mass above each $(C,\lambda,\epsilon)$-tempered point remains. The control on the size of the fluctuations of $H^{s}_{n}$ relative to the lengths of the intervals of $(C^{\prime},\lambda^{\prime},\epsilon^{\prime})$-tempered points containing the $(C,\lambda,\epsilon)$-tempered points, namely $\|D_{x}f^{n}_{\omega}\|^{-1.99}\ll\|D_{x}f^{n}_{\omega}\|^{-(1+\sigma)}$, allows us to ensure that we always have enough points on $\gamma_{2}$ to try to couple with.

  3. (3)

    We prove that we may find simultaneous recovery times for a pair of $R$-good standard pairs (Lemma 7.9), so that if we have failed to couple and are left with a short standard subcurve of $\hat{\gamma}_{1}$ we can have this subcurve recover at the same time as a subcurve of $\hat{\gamma}_{2}$.

  4. (4)

    Once we recover we will try to couple again using steps (1)–(3) above. Each time we try to couple, a positive amount of mass couples, and as the tail on the recovery time is exponential we do not spend too much time recovering.
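Two elementary computations sit behind steps (2)(f) and (4). The sketch below is a simplification: it assumes $\sigma<0.99$, it assumes that each coupling attempt succeeds with conditional probability at least $\epsilon_{0}$ given the earlier failures (the constant $\epsilon_{0}$ is the one from part (4) of Lemma 7.10), and it ignores the time spent recovering between attempts.

```latex
% (i) Fluctuations are small relative to the tempered intervals, assuming \sigma < 0.99:
\begin{align*}
\frac{\|D_{x}f^{n}_{\omega}\|^{-1.99}}{\|D_{x}f^{n}_{\omega}\|^{-(1+\sigma)}}
  = \|D_{x}f^{n}_{\omega}\|^{-(0.99-\sigma)} \longrightarrow 0
  \quad\text{as } \|D_{x}f^{n}_{\omega}\|\to\infty .
\end{align*}
% (ii) Geometric decay of the uncoupled mass over k attempts, assuming each attempt
%      succeeds with conditional probability at least \epsilon_0:
\begin{align*}
\mathbb{P}(\text{not coupled after } k \text{ attempts})
  \leq (1-\epsilon_{0})^{k}
  = e^{-ck},
  \qquad c=-\ln(1-\epsilon_{0})>0 .
\end{align*}
```

Turning (ii) into the tail bound of Proposition 7.7 additionally requires the exponential tails on the stopping times in Lemmas 7.9 and 7.10, which control how long each attempt takes.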

7.3. Statements of the lemmas for use during coupling

We now state the main propositions and lemmas that are used in the proof of Proposition 7.7.

Lemma 7.9.

(Coupled Recovery Lemma) Let $M$ be a closed surface and let $(f_{1},\ldots,f_{m})$ be an expanding on average tuple with entries in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$. There exist $C_{0},D_{1},\alpha>0$ such that if $\hat{\gamma}_{1}=(\gamma_{1},\rho_{1})$ and $\hat{\gamma}_{2}=(\gamma_{2},\rho_{2})$ are $R$-good standard pairs of equal mass then there is a pair of stopping times $\hat{T}_{1}$ and $\hat{T}_{2}$ defined on $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$ with the following properties:

(1) There is an exponential tail on the stopping time. Namely,

\[(\mu\otimes\rho_{1})(\{(\omega,x)\mid\hat{T}_{1}(\omega,x)>n\})\leq D_{1}e^{\max\{R,0\}-\alpha n}.\]

(2) If $z\in\hat{\gamma}_{i}$ is a point that stops at time $n$, and $B_{i}(z)$ is the connected component containing $z$ of the set $\{x\in\hat{\gamma}_{i}:\hat{T}_{i}(\omega,x)=n\}$, i.e. the set of points of $\hat{\gamma}_{i}$ stopped at time $n$, then $f^{\hat{T}_{i}(z)}_{\omega}(B_{i}(z))$ is a $C_{0}$-good standard pair.

(3) For each ωΣ\omega\in\Sigma, we always stop on the same amount of mass of γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} at each time nn. Specifically, for each ω\omega and nn, denote Si(ω,n)={xγ^i:T^i(ω,x)=n}S_{i}(\omega,n)=\{x\in\hat{\gamma}_{i}:\hat{T}_{i}(\omega,x)=n\}. For each pair (ω,n)(\omega,n) there is a measure preserving map Φnω:S1(ω,n)S2(ω,n)\Phi_{n}^{\omega}\colon S_{1}(\omega,n)\to S_{2}(\omega,n) carrying C0C_{0}-good connected components of S1(ω,n)S_{1}(\omega,n) to C0C_{0}-good connected components of S2(ω,n)S_{2}(\omega,n).

The following lemma is the most technical part of the coupling argument.

Lemma 7.10.

(Local Coupling Lemma) Suppose that $(f_{1},\ldots,f_{m})$ is an expanding on average tuple. There exists $0<\tau<1$ such that for any $C_{1}>0$ there exist $\delta_{0},L,D_{1},D_{2},\beta,C,\lambda,\epsilon>0$ such that for any $0<\delta^{\prime}<\delta_{0}$ there exist $\delta_{1}$ and $\epsilon_{0},a_{0}>0$ such that for any two standard pairs $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$ that are in a $(C_{1},\delta^{\prime},\upsilon)$-configuration with $\upsilon\leq\tau\delta^{\prime}$, we may couple a uniform proportion of the points on the two curves with an exponential tail on the points that do not couple.

Specifically, for two C1C_{1}-good standard pairs γ^1,γ^2\hat{\gamma}_{1},\hat{\gamma}_{2} of the same mass in a (C1,δ,υ)(C_{1},\delta^{\prime},\upsilon)-configuration with υτδ\upsilon\leq\tau\delta^{\prime}, there is a point xMx\in M, a ball Bδ0(x)MB_{\delta_{0}}(x)\subset M and connected components Γ1\Gamma_{1} and Γ2\Gamma_{2} of γ^1Bδ1(x)\hat{\gamma}_{1}\cap B_{\delta_{1}}(x) and γ^2Bδ1(x)\hat{\gamma}_{2}\cap B_{\delta_{1}}(x) such that Γ1\Gamma_{1} and Γ2\Gamma_{2} each contain a0a_{0} proportion of the mass of γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} respectively.

Further, there exist a pair of stopping times T^1(ω,x)\hat{T}_{1}(\omega,x) and T^2(ω,x)\hat{T}_{2}(\omega,x) defined on γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} such that if BT^i(ω,x)γ^iB^{\hat{T}_{i}}(\omega,x)\subseteq\hat{\gamma}_{i} denotes the block of points stopped at the same time as xx, then

  1. (1)

    For all $\omega,n$ there exists $\Psi^{\omega}_{n}\colon\{x\in\hat{\gamma}_{1}\colon\hat{T}_{1}(\omega,x)=n\}\to\{x\in\hat{\gamma}_{2}:\hat{T}_{2}(\omega,x)=n\}$ such that if $\hat{T}_{i}(\omega,x)=n$, then $B(\omega,x)$ is an $nL$-good standard pair and $\Psi^{\omega}_{n}$ carries $B(\omega,x)$ to an $nL$-good standard pair $B(\omega,\Psi^{\omega}_{n}(x))\subseteq\hat{\gamma}_{2}$ of equal mass that is also stopped at time $n$.

  2. (2)

    For each ω\omega, the set of points in γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} where T^i=\hat{T}_{i}=\infty are of equal measure and moreover these sets are intertwined by a measure preserving stable holonomy along uniformly (C,λ,ϵ)(C,\lambda,\epsilon)-tempered stable manifolds.

  3. (3)

    There exists $D_{1}>0$ such that $(\mu\otimes\hat{\rho}^{1})(\{(\omega,\hat{x}):\hat{T}_{1}(\omega,\hat{x})=n\})\leq D_{1}e^{-\beta n}$. For $\hat{\gamma}_{2}$, we have a similar estimate, $(\mu\otimes\hat{\rho}^{2})(\{(\omega,\hat{x}):\hat{T}_{2}(\omega,\hat{x})=n\})\leq D_{1}e^{-\beta n}$.

  4. (4)

    For all xΓ1x\in\Gamma_{1}, the measure of words ω\omega such that T^i(ω,x)=\hat{T}_{i}(\omega,x)=\infty is at least ϵ0\epsilon_{0}.

In the lemma above, part (2) says that the points where $\hat{T}_{i}=\infty$ are coupled and that such points attract exponentially fast. Part (4) says that the probability that the next coupling attempt is successful is at least $\epsilon_{0}$. Part (3) says that the probability that “a point” stops and fails to couple at time $n$ is exponentially small, while part (1) controls the regularity of the set of such points.

The following proposition says that there is a fixed time N0N_{0} required for the C0C_{0}-good pairs produced by the coupled recovery lemma to get into position for the application of the local coupling lemma. The proof relies on the mixing properties from Section 6.

Proposition 7.11.

(Finite Time Mixing) Suppose (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple as in Proposition 7.7. For any fixed C0>0C_{0}>0, there exist C1,C2,δ,υ>0C_{1},C_{2},\delta,\upsilon>0 such that the following holds.

  1. (1)

    C1,δ,υ>0C_{1},\delta,\upsilon>0 are such that a (C1,δ,υ)(C_{1},\delta,\upsilon)-configuration satisfies the hypotheses of the Local Coupling Lemma 7.10 with C1=C1C_{1}=C_{1}, δ=δ\delta^{\prime}=\delta, and υ=υ\upsilon=\upsilon.

  2. (2)

    There exists N0N_{0}\in\mathbb{N} and b0>0b_{0}>0 such that for any C0C_{0} regular standard pairs γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} of equal mass, for .99%.99\% of the words ω{1,,m}N0\omega\in\{1,\ldots,m\}^{N_{0}}, there is a subdivision Pω1,Pω2P^{1}_{\omega},P^{2}_{\omega} of the standard families fωN0(γ^1)f^{N_{0}}_{\omega}(\hat{\gamma}_{1}) and fωN0(γ^2)f^{N_{0}}_{\omega}(\hat{\gamma}_{2}) and subfamilies Qω1,Qω2Q^{1}_{\omega},Q^{2}_{\omega} of Pω1P^{1}_{\omega} and Pω2P^{2}_{\omega}, and a map Ψ:Qω1Qω2\Psi\colon Q^{1}_{\omega}\to Q^{2}_{\omega} preserving measure such that the following hold.

    1. (a)

      Each pair γ^Qω1\hat{\gamma}\in Q^{1}_{\omega} is associated by Ψ\Psi with a pair Ψ(γ^)\Psi(\hat{\gamma}) such that these pairs have equal mass and satisfy (1) above.

    2. (b)

      The set Q1=ωΣ^{σN0(ω)}×Qω1Q^{1}=\bigcup_{\omega\in\hat{\Sigma}}\{\sigma^{N_{0}}(\omega)\}\times Q^{1}_{\omega} has measure b0ρ1(γ^)b_{0}\rho_{1}(\hat{\gamma}) with respect to μ^ρ1\hat{\mu}\otimes\rho_{1}. The same holds for Q2Q^{2}.

  3. (3)

    The complement of Qω1Q^{1}_{\omega} in fωn(γ^1)f^{n}_{\omega}(\hat{\gamma}_{1}) is a standard family of C2C_{2}-good standard pairs. The same holds for Qω2Q^{2}_{\omega}.

As mentioned before, the proofs of these lemmas appear later in the paper. Lemma 7.9 is proven in Section 8, Proposition 7.11 is proven in Section 9, and Lemma 7.10 is proven in Section 10.

7.4. Proof of the main coupling proposition

We now show how to deduce the main coupling proposition, Proposition 7.7, from the various results stated in this section. We need a preliminary estimate showing that if we fail to couple then the whole failed attempt does not take too long. In the lemma below the recovery time is the sum of three terms:

(1) The time when we stop trying to locally couple as in Lemma 7.10 item (3);

(2) The time it takes for a point to recover so that it belongs to a C0C_{0}-good pair as in the Coupled Recovery Lemma 7.9;

(3) The fixed time N0N_{0} where the point has a chance to enter a (C1,δ,υ)(C_{1},\delta,\upsilon)-configuration according to Proposition 7.11.

The following lemma verifies that each trip through the coupling procedure has an exponential tail on its duration.

Lemma 7.12.

In the setting of Proposition 7.7, for each CC there exist C^\hat{C} and r¯\bar{r} such that if γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} are CC-good standard pairs of equal mass, then

\[(\mu\otimes\rho_{1})((\omega,\hat{x}):(\omega,\hat{x})\text{ fails to couple and the recovery time is greater than }n)\leq\hat{C}e^{-\bar{r}n}.\]
Proof.

Take a small $\kappa>0$ that will be specified below. First we try to locally couple, and then we recover. Let $T$ be the recovery time and let $S$ be the time when we stop our attempt at coupling $(\omega,x)$. If $T\geq n$, then either:

(i) $S\geq\kappa n$, or (ii) $S\leq\kappa n$ and the time it takes the corresponding part of the curve to recover is at least $(1-\kappa)n$.

The probability of the first event is exponentially small by Lemma 7.10(3). In the second case, since $S\leq\kappa n$, the point $(\omega,x)$ belongs to a $\kappa Ln$-good component. Thus by Lemma 7.9 the probability that the recovery takes more than $(1-\kappa)n$ time is less than $D_{1}e^{(\kappa L-\alpha(1-\kappa))n}$, which is exponentially small if $\kappa<\alpha/(L+\alpha)$. ∎
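The two cases can be combined into one explicit bound. The following is only a sketch: it assumes that the tail in Lemma 7.10(3) is summed over all stopping times $k\geq\kappa n$, that $\alpha$ and $D_{1}$ are the rate and constant supplied by Lemma 7.9, and it writes $\mathbb{P}$ for $\mu\otimes\rho_{1}$ restricted to the points that fail to couple. The resulting constants are illustrative and need not be the ones intended by the authors.

```latex
\begin{align*}
\mathbb{P}(T\geq n)
 &\leq \mathbb{P}(S\geq\kappa n)
   +\mathbb{P}\big(S\leq\kappa n\text{ and the recovery takes at least }(1-\kappa)n\big)\\
 &\leq \sum_{k\geq\kappa n}D_{1}e^{-\beta k}+D_{1}e^{(\kappa L-\alpha(1-\kappa))n}\\
 &\leq \frac{D_{1}}{1-e^{-\beta}}\,e^{-\beta\kappa n}+D_{1}e^{-(\alpha-\kappa(L+\alpha))n},
\end{align*}
and the second exponent is negative exactly when
\[
\kappa L-\alpha(1-\kappa)<0
\iff \kappa(L+\alpha)<\alpha
\iff \kappa<\frac{\alpha}{L+\alpha}.
\]
```

Under these assumptions one may take $\bar{r}=\min\{\beta\kappa,\ \alpha-\kappa(L+\alpha)\}$.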

The main coupling proposition is now easy to deduce because each coupling attempt couples a positive proportion of the remaining mass and, from Lemma 7.12, there is an exponential tail bound on how long a coupling attempt takes.

Proof of Proposition 7.7.

Let N(ω,x)+1N(\omega,x)+1 be the number of total attempts at local coupling before (ω,x)(\omega,x) couples. Let T^(ω,x)\hat{T}(\omega,x) be the time when (ω,x)(\omega,x) couples, and let Tk(ω,x)T_{k}(\omega,x) be its kkth recovery time, i.e. the k+1k+1st time we attempt to locally couple. As a positive amount of mass couples each time we apply the local coupling lemma, we see that there exists δ>0\delta>0 such that

\[(\mu\otimes\rho_{1})((\omega,x):N(\omega,x)>k)\leq e^{-k\delta}.\tag{7.3}\]
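To see where a rate of this form can come from, suppose (as a labelled assumption, not a statement from the text) that the pairs have unit mass and that every application of the local coupling lemma couples at least a fixed proportion $p\in(0,1)$ of the mass that is still uncoupled. The symbol $p$ is hypothetical; it would be built from the constants of Lemma 7.10 and Proposition 7.11. Since $N>k$ means that at least $k+1$ attempts failed, a sketch of the geometric decay is:

```latex
\[
(\mu\otimes\rho_{1})\big((\omega,x):N(\omega,x)>k\big)
 \leq(1-p)^{k+1}\leq(1-p)^{k}=e^{-k\delta},
 \qquad\text{where }\delta=-\ln(1-p)>0.
\]
```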

Next we show that, for points that take $k$ attempts at local coupling to couple, these attempts occur linearly fast. This will follow once we have a tail bound on $T_{k}$. By Lemma 7.12, $T_{1}$ has an exponential moment. In particular, $\sup\mathbb{E}\left[e^{tT_{1}}\right]=M(t)$ is finite for $t\leq r$, where $r<\bar{r}$, $\bar{r}$ is the constant from Lemma 7.12, and the supremum is taken over all pairs $\hat{\gamma}_{1},\hat{\gamma}_{2}$ of $C_{1}$-good standard pairs that are in $(C_{1},\delta,\upsilon)$-configurations as required by Lemma 7.10 and produced by Proposition 7.11.

Extend $T_{k}=T_{N}(\omega)$ if $k>N(\omega)$. A straightforward induction shows that $\mathbb{E}\left[e^{tT_{k}}\right]\leq M(t)^{k}$. Thus by the Chernoff bound, $(\mu\otimes\rho_{1})(T_{k}\geq n)\leq M(t)^{k}e^{-tn}$. In particular, taking $t=r$, there is some $\beta>0$ such that $(\mu\otimes\rho_{1})(T_{k}\geq n\mid N=k)\leq e^{\beta k}e^{-rn}$. Fix some small number $\alpha$ such that $0<\beta\alpha<r/2$. Then

\[(\mu\otimes\rho_{1})(T_{N}>n\text{ and }N\leq\alpha n)\leq(\mu\otimes\rho_{1})(T_{\alpha n}>n)\leq D_{1}e^{-rn/2}.\]

By (7.3), with probability at least $1-e^{-\delta\alpha n}$, a point $(\omega,x)$ couples after at most $\alpha n$ trials, and the result follows. ∎
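For the reader's convenience, the chain of estimates used in this proof can be written out as follows. This is a sketch under simplifying assumptions that are ours: $\alpha n$ is treated as an integer, multiplicative constants are suppressed, $\beta>0$ is any number with $M(r)\leq e^{\beta}$, and the coupling time $\hat{T}$ is identified with $T_{N}$.

```latex
\begin{align*}
(\mu\otimes\rho_{1})(T_{k}\geq n)
 &\leq e^{-tn}\,\mathbb{E}\big[e^{tT_{k}}\big]\leq M(t)^{k}e^{-tn}
 &&\text{(Markov's inequality for }e^{tT_{k}}\text{)},\\
(\mu\otimes\rho_{1})(T_{\alpha n}>n)
 &\leq e^{\beta\alpha n}e^{-rn}\leq e^{-rn/2}
 &&\text{(}t=r,\ k=\alpha n,\ \beta\alpha<r/2\text{)},\\
(\mu\otimes\rho_{1})(\hat{T}>n)
 &\leq(\mu\otimes\rho_{1})(N>\alpha n)
   +(\mu\otimes\rho_{1})(T_{N}>n\text{ and }N\leq\alpha n)\\
 &\leq e^{-\delta\alpha n}+e^{-rn/2}
 &&\text{(by (7.3) and the previous line)}.
\end{align*}
```

The second summand uses that $T_{k}$ is nondecreasing in $k$, so that $T_{N}\leq T_{\alpha n}$ whenever $N\leq\alpha n$.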

8. Proof of the Coupled Recovery Lemma

8.1. Recovery times

In this subsection, we use the preceding lemmas to describe a recovery algorithm for the C2C^{2} norm of an irregular curve and estimate the tail of the recovery time.

The next definition describes an iterate $f^{n}_{\omega}$ that has a good enough splitting that $f^{n}_{\omega}(\gamma)$ will have a good neighborhood of a particular point. Note that a “good enough” splitting requires both a condition on the hyperbolicity and a condition on the angle between the curve $\gamma$ and the stable subspace. This definition will be used in the proof of the recovery lemma.

Definition 8.1.

Fix a tuple of non-negative numbers (C,λ,ϵ,A,ϵ,R)(C,\lambda,\epsilon,A,\epsilon^{\prime},R). For a standard pair γ^\hat{\gamma}, a point xγx\in\gamma and a word ωΣ\omega\in\Sigma, we say that nn is a (C,λ,ϵ,A,ϵ,R)(C,\lambda,\epsilon,A,\epsilon^{\prime},R)-backwards good time for x,γ,ωx,\gamma,\omega if n=Amax{R,1}+in=A\max\{R,1\}+i, for some i0i\geq 0 and

  1. (1)

    DfωnDf^{n}_{\omega} has a (C,λ,ϵ)(C,\lambda,\epsilon)-reverse tempered splitting, for which we write Ems,EmuE_{m}^{s},E_{m}^{u} for the stable and unstable subspaces of this splitting in Tfωm(x)MT_{f^{m}_{\omega}(x)}M.

  2. (2)

    (E0s,γ˙(x))eϵi\angle(E^{s}_{0},\dot{\gamma}(x))\geq e^{-\epsilon^{\prime}i}.

The following lemma asserts that this type of backwards good time is sufficient to conclude that an RR-good curve γ\gamma has its neighborhood of xx smoothed by the random dynamics fωnf^{n}_{\omega}.

Note that the second condition in the lemma considers the situation where γ\gamma “recovers” in a neighborhood of xx prior to time nn. It is important in this case to know that from that point on, we can just restrict to the portion of the curve that has already recovered. This is useful because it helps us deal with situations where we wish to “stop” on certain parts of the curve and know that the parts we have stopped on will not be needed later when a different part of the curve recovers. Recall from Definition 7.2 that an RR-regular curve has all the characteristics of RR-good curves except that it is not required to be eRe^{-R} long.

Lemma 8.2.

Suppose MM is a closed surface and that (f1,,fm)(f_{1},\ldots,f_{m}) is a tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M). Then for any λ>0\lambda>0, sufficiently small ϵ,ϵ>0\epsilon,\epsilon^{\prime}>0, and any C>0C>0, there exists A,C0,C1>0A,C_{0},C_{1}>0 such that for any RR-regular standard pair γ^=(γ,ρ)\hat{\gamma}=(\gamma,\rho) and any (C,λ,ϵ,A,ϵ,R)(C,\lambda,\epsilon,A,\epsilon^{\prime},R)-backwards good time nn for ωΣ\omega\in\Sigma and xγx\in\gamma if:

  1. (1)

    γ^\hat{\gamma} is RR-good, or

  2. (2)

    there exists a time 0m<n0\leq m<n and a subinterval IγI\subseteq\gamma such that fωm(I)f^{m}_{\omega}(I) contains a neighborhood of fωm(x)f^{m}_{\omega}(x) that is eC1e.8λ(nm)e^{-C_{1}}e^{-.8\lambda(n-m)}-long;

then fωn(γ^)f^{n}_{\omega}(\hat{\gamma}) contains a C0C_{0}-good neighborhood of fωn(x)f^{n}_{\omega}(x). Moreover, if (2) holds, this neighborhood is contained in fωn(I)f^{n}_{\omega}(I).

The above lemma follows immediately from the result below. The second paragraph of the statement of the lemma essentially says: if there is another point in γ\gamma that also experiences a recovery time, then we can stop on that recovering segment while still leaving enough of the curve γ\gamma so that xx can still recover.

Lemma 8.3.

(Deterministic Recovery Lemma) Given a closed surface MM and a tuple (f1,,fm)(f_{1},\ldots,f_{m}) in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M), for any α,λ>0\alpha,\lambda>0 and all sufficiently small ϵ,ϵ>0\epsilon,\epsilon^{\prime}>0 and any C>0C>0, there exist C0,A>0C_{0},A>0 such that for any RR-good standard pair γ^=(γ,ρ)\hat{\gamma}=(\gamma,\rho), and any word ω\omega such that time nn is a (C,λ,ϵ,A,ϵ,R)(C,\lambda,\epsilon,A,\epsilon^{\prime},R)-backwards good time for xγx\in\gamma, then there exists a neighborhood B(x)γB(x)\subseteq\gamma of size at most e.9λne^{-.9\lambda n} such that fωn(B^(x))f^{n}_{\omega}(\hat{B}(x)) is C0C_{0}-good, i.e. the pushforward of the standard pair γ^\hat{\gamma} restricted to B(x)B(x) is C0C_{0}-good.

Further, there exists $C_{1}$ such that for $\omega,x,\gamma$ as in the first part of the lemma, if $I\subseteq\gamma$ is an interval containing $x$ and for some $1\leq m<n$, $f^{m}_{\omega}(I)$ has length at least $e^{-C_{1}}e^{-.8\lambda(n-m)}$, then $f^{n}_{\omega}(I)$ contains a $C_{0}$-good neighborhood of $f^{n}_{\omega}(x)$.

Proof.

We divide the proof into several steps. We begin by fixing some preliminaries. For the given (C,λ,ϵ)(C,\lambda,\epsilon), we apply Proposition A.13 with eiϵ=θe^{-i\epsilon^{\prime}}=\theta, which gives us the constants ϵ0,max,D2,,D8\epsilon_{0},\ell_{\max},D_{2},\ldots,D_{8} appearing in that proposition.

Step 1. (Length of fωnγf^{n}_{\omega}\gamma) By Proposition A.13(2), if

\[n\geq D_{5}+\frac{\max\{R,0\}-2\ln(e^{-i\epsilon^{\prime}})}{.99\lambda},\tag{8.1}\]

then fωnγf^{n}_{\omega}\gamma contains a neighborhood γn\gamma_{n} of fωn(x)f^{n}_{\omega}(x) of length max\ell_{\max}. For ϵ\epsilon^{\prime} sufficiently small relative to λ\lambda, it follows that (8.1) holds as long as nA1max{R,1}+in\geq A_{1}\max\{R,1\}+i for some A1A_{1} depending only on D5,λ,ϵD_{5},\lambda,\epsilon^{\prime}.
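To make the last assertion of Step 1 concrete, here is one admissible choice of $A_{1}$. It is a sketch under two assumptions that are ours: $D_{5}\geq 0$, and "sufficiently small" is read as $2\epsilon^{\prime}\leq .99\lambda$.

```latex
\begin{align*}
-2\ln(e^{-i\epsilon^{\prime}})&=2\epsilon^{\prime}i,\\
D_{5}+\frac{\max\{R,0\}+2\epsilon^{\prime}i}{.99\lambda}
 &\leq D_{5}+\frac{\max\{R,1\}}{.99\lambda}+i
 \leq\Big(D_{5}+\frac{1}{.99\lambda}\Big)\max\{R,1\}+i .
\end{align*}
```

Hence, under these assumptions, (8.1) holds for every $n\geq A_{1}\max\{R,1\}+i$ with $A_{1}=D_{5}+1/(.99\lambda)$.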

Step 2. (C2C^{2} estimate) By Proposition A.13(3)

\[\|\gamma_{n}\|_{C^{2}}<D_{6}e^{-2.9\lambda n}e^{D_{7}\ln\theta}\max\{\|\gamma\|_{C^{2}},1\}+D_{8}.\tag{8.2}\]

Thus there exist $A_{2},C_{2}$ such that $\|\gamma_{n}\|_{C^{2}}\leq C_{2}$ as long as $n\geq A_{2}\max\{R,1\}+i$.

Step 3. (Smoothing the density) From Proposition A.13(4) applied to D9=C2D_{9}=C_{2} from the previous step, we see that there exists D10,D11D_{10},D_{11} such that the following holds. If γn2<D8\|\gamma_{n}\|_{2}<D_{8}, then the pushforward of ρ\rho along γn\gamma_{n} is given by:

\[\|\ln\rho_{n}|_{\gamma_{n}}\|_{C^{\alpha}}\leq D_{10}e^{-.9\alpha\lambda n}e^{D_{7}\ln\theta}(1+\|\ln\rho\|_{C^{\alpha}}+\|\gamma\|_{C^{2}})+D_{11}.\tag{8.3}\]

In particular, the above estimate holds as long as $n\geq A_{2}\max\{R,1\}+i$. In that case, as $\|\ln\rho\|_{C^{\alpha}}$ and $\|\gamma\|_{C^{2}}$ are both at most $e^{R}$, we similarly see that there exist $C_{3}$ and $A_{3}$ such that if $n\geq A_{3}\max\{R,1\}+i$, then $\|\ln\rho_{n}|_{\gamma_{n}}\|_{C^{\alpha}}\leq C_{3}$. Thus there exists $A$ such that the conclusion of the first paragraph holds.

For the claim in the second paragraph of the lemma, we apply Proposition A.13(2). The choice of $A,C_{0}$ in the first part of the proof implies that for such $n$, $\ell_{\max}$ is realized, and thus by the final part of item (2) the preimage of $\gamma_{n}$ in $f^{i}_{\omega}\gamma$ has length at most $D_{4}e^{-.9\lambda(n-i)}$. Hence if $f^{i}_{\omega}(I)$ has length at least $D_{4}e^{-.8\lambda(n-i)}$, then the image of $f^{i}_{\omega}(I)$ at time $n$ contains a $C_{0}$-good neighborhood of $f^{n}_{\omega}(x)$. ∎

Next we show that the recovery times from the above lemma occur frequently.

Proposition 8.4.

Let $M$ be a closed surface and suppose that $(f_{1},\ldots,f_{m})$ is an expanding on average tuple in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$. There exists $\lambda>0$ such that for any $A>0$ and sufficiently small $\epsilon,\epsilon^{\prime}>0$, there exist $C>0$ and $\alpha_{3}>0$ such that for any $R$-good standard pair $\hat{\gamma}$, if for $x\in\gamma$ we let $\hat{T}(\omega,x)$ be the first $(C,\lambda,\epsilon,A,\epsilon^{\prime},R)$-backwards good time, then

\[(\mu\otimes\rho)((\omega,x):\hat{T}(\omega,x)>A\max\{R,1\}+i)\leq Ce^{-\alpha_{3}i}.\tag{8.4}\]

The same holds for the analogous stopping time defined on an RR-good standard family.

Proof.

It suffices to prove this estimate at a single point xx as we may then integrate the resulting estimate over all of γ^\hat{\gamma}. From Proposition 4.18 there exist C1,α1C_{1},\alpha_{1} and C,λ>0C,\lambda>0 such that for all sufficiently small ϵ>0\epsilon>0 there exists NN\in\mathbb{N} such that if we let S(ω)S(\omega) be the stopping time that stops at the first (C,λ,ϵ)(C,\lambda,\epsilon)-reverse tempered time of DxfωnD_{x}f_{\omega}^{n} greater than any fixed nNn\geq N, then at that time there is a well defined splitting TxM=ESsESuT_{x}M=E^{s}_{S}\oplus E^{u}_{S} into maximally expanded and contracted singular directions, and

\[\mathbb{P}(S(\omega)>n+k)\leq C_{1}e^{-\alpha_{1}k}.\tag{8.5}\]

By Lemma 4.19 there exist C2,α2>0C_{2},\alpha_{2}>0 such that as long as nc0|lnθ|n\geq c_{0}\left|\ln\theta\right|,

\[\mathbb{P}(\angle(E^{s}_{S},\dot{\gamma}(x))<\theta\mid S\leq n+k)<C_{2}\theta^{\alpha_{2}}.\tag{8.6}\]

Hence there exists α3>0\alpha_{3}>0 such that if SS is the first time greater than n=c0ϵin=c_{0}\epsilon^{\prime}i that has a reverse tempered splitting, then

\[\mathbb{P}(\angle(E^{s}_{S},\dot{\gamma}(x))<e^{-\epsilon^{\prime}i}\mid S\leq n+k)<C_{2}e^{-\alpha_{2}\epsilon^{\prime}i}.\tag{8.7}\]

In particular, as long as ϵ\epsilon^{\prime} is sufficiently small relative to c0c_{0}, then c0ϵi<i/2c_{0}\epsilon^{\prime}i<i/2. Let SS be the first (C,λ,ϵ)(C,\lambda,\epsilon)-reverse tempered time greater than Amax{R,1}+i/2A\max\{R,1\}+i/2. Multiplying equations (8.5) and (8.7), we find that there exist C3,α3>0C_{3},\alpha_{3}>0 such that:

\[\mathbb{P}(S\leq A\max\{R,1\}+i\text{ and }\angle(E^{s}_{S},\dot{\gamma})\geq e^{-\epsilon^{\prime}i})\geq 1-C_{3}e^{-\alpha_{3}i}.\]
∎
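The way (8.5) and (8.7) combine can be spelled out as follows. This is a sketch with constants that are only one admissible choice; it assumes that $i$ is even, that $A\max\{R,1\}+i/2\geq N$ so that (8.5) applies, and it abbreviates $G=\{S\leq A\max\{R,1\}+i\}$ and $H=\{\angle(E^{s}_{S},\dot{\gamma})\geq e^{-\epsilon^{\prime}i}\}$ (both names are ours).

```latex
\begin{align*}
\mathbb{P}(G^{c})&\leq C_{1}e^{-\alpha_{1}i/2}
 &&\text{by (8.5) with }n=A\max\{R,1\}+i/2,\ k=i/2,\\
\mathbb{P}(H^{c}\mid G)&<C_{2}e^{-\alpha_{2}\epsilon^{\prime}i}
 &&\text{by (8.7)},\\
\mathbb{P}(G\cap H)&=\mathbb{P}(G)-\mathbb{P}(G\cap H^{c})
 \geq 1-\mathbb{P}(G^{c})-\mathbb{P}(H^{c}\mid G)\\
 &\geq 1-(C_{1}+C_{2})\,e^{-\min\{\alpha_{1}/2,\ \alpha_{2}\epsilon^{\prime}\}\,i}.
\end{align*}
```

So under these assumptions the displayed estimate holds with $C_{3}=C_{1}+C_{2}$ and $\alpha_{3}=\min\{\alpha_{1}/2,\alpha_{2}\epsilon^{\prime}\}$.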

We now state without proof a more technical variant of the preceding lemma. It will be used in the proof of the coupled recovery lemma to allow “recovery times” for the hyperbolicity. We will divide the iterates of the system into blocks of size Δq+Δ\Delta q+\Delta, where Δ,q\Delta,q\in\mathbb{N}. Each block will be divided into two pieces one of length Δq\Delta q and one of length Δ\Delta. We will only be interested in backwards good tempered times that occur in the second part of the block, which has length Δ\Delta. This is to ensure that there are large (temporal) gaps between possible recovery times. The following lemma shows that given this extra restriction on the backwards good times, we still have an exponential tail.

Proposition 8.5.

Let $M$ be a closed surface and suppose that $(f_{1},\ldots,f_{m})$ is an expanding on average tuple in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$. There exists $\lambda>0$ such that for any $A>0$ and sufficiently small $\epsilon,\epsilon^{\prime}>0$, there exist $C>0$ and $\alpha_{4}>0$ such that for all $\Delta,q\in\mathbb{N}$ and any $R$-good standard pair $\hat{\gamma}$, for any $N\geq A\max\{R,1\}$, if for $x\in\gamma$ we let $\hat{T}(\omega,x)$ be the first time greater than or equal to $N$ such that

\[\lceil A\max\{R,1\}\rceil+j(q+1)\Delta+q\Delta<\hat{T}(\omega,x)\leq\lceil A\max\{R,1\}\rceil+(j+1)(q+1)\Delta,\]

for some j>0j>0 and T^\hat{T} is a (C,λ,ϵ,A,ϵ,R)(C,\lambda,\epsilon,A,\epsilon^{\prime},R) backwards good time, then

\[(\mu\otimes\rho)((\omega,x):\hat{T}(\omega,x)>N+i(q+1)\Delta)\leq Ce^{-\alpha_{4}i\Delta}.\tag{8.8}\]

8.2. Coupled Recovery Lemma

In this subsection, we prove the coupled recovery lemma, Lemma 7.9. In the statement we view the standard pair as the uniform distribution on the subset of γ×[0,)\gamma\times[0,\infty) of pairs (x,t)(x,t) where tρ(x)t\leq\rho(x). We do this so that we may define stopping times for γ^\hat{\gamma} that stop on only part of the fiber over each point in γ\gamma. Additionally, in an abuse of notation, we will identify the density ρ\rho with a measure that we also call ρ\rho.

Proof of Lemma 7.9.

After initial preliminaries, the proof divides into two parts. The first part is a coupled stopping procedure, which takes a word ωΣ\omega\in\Sigma and two standard pairs γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2}, and shows which parts of each curve get stopped as we follow the dynamics specified by ω\omega so that we always stop on the same amount of mass of each pair. In the second part we show that with high probability the procedure from the first part actually stops on all but an exponentially small amount of γ^1,γ^2\hat{\gamma}_{1},\hat{\gamma}_{2} in a linear amount of time. In the proof, we consider the case that R>1R>1 as otherwise we can stop immediately and conclude.

We now fix some constants. By Proposition 8.5 there exists λ>0\lambda>0 such that for any A>0A>0 and sufficiently small ϵ,ϵ>0\epsilon,\epsilon^{\prime}>0, there exists C>0C>0 and α>0\alpha>0 such that (C,λ,ϵ,A,ϵ,R)(C,\lambda,\epsilon,A,\epsilon^{\prime},R)-backwards good times at the end of blocks of length (q+1)Δ(q+1)\Delta occur exponentially fast after any time NN greater than Amax{R,1}A\max\{R,1\} for an RR-good standard pair γ^\hat{\gamma}, i.e. (8.8) holds.

We then apply Lemma 8.2, which shows that for this choice of $\lambda,C,\epsilon,\epsilon^{\prime},A$, for any $R$-good standard pair $\hat{\gamma}$ and any $(C,\lambda,\epsilon,A,\epsilon^{\prime},R)$-backwards good time $n$ for $x\in\hat{\gamma}$, the point $f^{n}_{\omega}(x)$ has a $C_{0}$-good neighborhood in $f^{n}_{\omega}(\hat{\gamma})$, i.e. the dynamics smooths a neighborhood of $x$ and makes it $C_{0}$-regular. Lemma 8.2 also gives the constant $C_{1}$ such that as long as $f^{i}_{\omega}(I)$ contains a neighborhood of $f^{i}_{\omega}(x)$ of size at least $e^{-C_{1}}e^{-.8\lambda(n-i)}$, then $f^{n-i}_{\sigma^{i}(\omega)}(f^{i}_{\omega}(I))$ contains a $C_{0}$-good neighborhood of $f^{n}_{\omega}(x)$.

For the rest of the proof we will not repeat (C,λ,ϵ,A,ϵ,R)(C,\lambda,\epsilon,A,\epsilon^{\prime},R)-backwards good but just refer to such times as tempered times with this particular choice of constants being understood.

In the proof that follows, we divide the iterates of the system into blocks of size $(q+1)\Delta$. We will attempt to stop on a neighborhood of a point $x$ when $D_{x}f^{n}_{\omega}$ has a tempered time in the interval $(\lceil AR\rceil+i(q+1)\Delta+q\Delta,\lceil AR\rceil+(i+1)(q+1)\Delta]$. This is the $i$th block; if there is such a tempered time, then we say that this is a tempered block. In the following, there will be points $x$ that experience a tempered block ending at $\lceil AR\rceil+iq\Delta$ but that we do not stop because there was not enough mass stopping on the other curve to couple them. For these points, we then wait for their next tempered time relative to the original curve. We only allow stopping on the last $\Delta$ iterates of a block of length $(q+1)\Delta$ to ensure that the hyperbolicity has enough time to stretch what remains of the recovered neighborhood of $f^{\lceil AR\rceil+i\Delta}_{\omega}(\gamma)$ so that it can recover to be a $C_{0}$-good curve at the tempered time.

In the proof we only try to couple recovered curves at the very last time in each block, whereas a curve may have a tempered time up to Δ\Delta iterates before then. If we have a C0C_{0}-good curve, γ^\hat{\gamma}, and we apply the dynamics from (f1,,fm)(f_{1},\ldots,f_{m}) at most Δ\Delta additional times, then there is some C0C0C_{0}^{\prime}\geq C_{0}, so that the image of the curve will still be C0C_{0}^{\prime} good even after those extra iterates. Consequently, for any α>0\alpha>0, there exists δ(α)>0\delta(\alpha)>0, such that if γ^\hat{\gamma} is a C0C_{0}^{\prime} good curve, and we trim off the end segments of the curve of length eδe^{-\delta}, then we have lost at most eαe^{-\alpha} proportion of the curve, where α\alpha is some number we will choose below. Further, note that as long as δ\delta is sufficiently large, the trimmed off curves will be eδe^{-\delta}-good and that when we trim a C0C_{0}^{\prime}-good curve, what remains will also still be δ\delta-good.

The proof involves four additional parameters some of which were alluded to above, and which we choose to be sufficiently large that the following hold:

(1) There is an exponential tail on the wait for the first tempered block. For any NARN\geq\lceil AR\rceil, if T(ω,x)T(\omega,x) is the next tempered block after NN, then

\[\mathbb{P}_{\omega}(T(\omega,x)\geq N+i(q+1)\Delta)\leq e^{-i\alpha}.\tag{8.9}\]

(2) We also fix a small constant β>0\beta>0. Then by possibly increasing Δ\Delta even further we can arrange that β<α/7\beta<\alpha/7 and in addition have that α\alpha is greater than the cutoffs in Claims 8.6 and 8.7 below.

(3) We then choose qq sufficiently large that eδ>eC1e.8λqΔe^{-\delta}>e^{-C_{1}}e^{-.8\lambda q\Delta}, where δ\delta is the goodness of the recovered curve from above and depends on α\alpha and Δ\Delta.

Note that when picking the constants above, from the statement of Proposition 8.5 we first choose Δ\Delta to make eαe^{-\alpha} arbitrarily small and both (1) and (2) hold. Then we increase qq to ensure that (3) holds as well, which does not affect (1) or (2).
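Condition (3) can be solved for $q$ explicitly. The following rearrangement only restates the inequality from item (3); no new constants are introduced.

```latex
\[
e^{-\delta}>e^{-C_{1}}e^{-.8\lambda q\Delta}
\iff -\delta>-C_{1}-.8\lambda q\Delta
\iff q>\frac{\delta-C_{1}}{.8\lambda\Delta}.
\]
```

Since $\delta$ depends on $\alpha$ and $\Delta$ but not on $q$, any $q$ above this threshold works once $\Delta$ has been fixed.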

Part 1: Coupled Stopping Procedure. Fix a word $\omega\in\Sigma$. We begin with two standard pairs $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$. We let $P_{n}^{i}$ be the subset of $\hat{\gamma}_{i}$ that has not been coupled after $n$ attempts at coupled stopping, i.e. it consists of points that are not permanently stopped at time $\lceil AR\rceil+n(q+1)\Delta$. Note that $P_{n}^{i}$ is naturally viewed as a standard family. We let $I^{i}_{j}$ be the set of points in $P^{i}_{j}$ whose $(j+1)$st block is a tempered block. For every point $x\in P^{i}_{j}$ its next stopping time $T(x,\omega)$ is defined to be the end of the next tempered block for that point. To simplify the notation, we write $N_{0}=\lceil AR\rceil$.

The procedure below maintains the following inductive assumption:

(8.10) For any $\hat{\gamma}\in P^{i}_{j}$ and $x\in\hat{\gamma}$, $\hat{\gamma}$ is sufficiently long that if for some $k>j$ the $k$th block is tempered, then $f^{(q+1)(k-j)\Delta}_{\sigma^{N_{0}+(q+1)j\Delta}(\omega)}(\hat{\gamma})$ is $C_{0}^{\prime}$-good.

For i{1,2}i\in\{1,2\}, let U~ji\widetilde{U}_{j}^{i} be the union of the C0C_{0}^{\prime} good intervals of the points xIjix\in I^{i}_{j} at the end of the (j+1)(j+1)st block; if two intervals within a single standard pair in PjiP^{i}_{j} overlap, we take their union, so some intervals may be longer than eC0e^{-C_{0}^{\prime}}. Note that U~ji\widetilde{U}_{j}^{i} is a C0C_{0}^{\prime}-good standard family. Then for each standard pair IU~jiI\in\widetilde{U}_{j}^{i}, we discard the interval of size eδe^{-\delta} from the end of the interval. This gives us a new standard family UjiU~jiU_{j}^{i}\subseteq\widetilde{U}_{j}^{i}. By choice of δ(α)\delta(\alpha) from above,

\[\rho_{i}(U_{j}^{i})\geq(1-e^{-\alpha})\rho_{i}(\widetilde{U}_{j}^{i}).\]

We now choose which of the subpairs in U~j1\widetilde{U}_{j}^{1} and U~j2\widetilde{U}_{j}^{2} to stop on for our fixed word ω\omega. Suppose without loss of generality that Uj1U^{1}_{j} has less mass than Uj2U^{2}_{j}. We now stop on all points in Uj1U^{1}_{j}. We would like to stop on all the points in Uj2U^{2}_{j}, however Uj2U^{2}_{j} has too much mass compared with Uj1U^{1}_{j}. To compensate, we subdivide the standard family to create pieces with the appropriate height so that we can stop on a set of equal mass to Uj1U^{1}_{j}. First we subdivide γ^2\hat{\gamma}_{2} vertically at height ρ1(Uj1)(ρ2(Uj2))1ρ2\rho_{1}(U^{1}_{j})(\rho_{2}(U^{2}_{j}))^{-1}\rho_{2} so that we keep over each point the same proportion of the mass. Call the two pieces of γ^2\hat{\gamma}_{2} by AA and BB, where AA is the piece with mass ρ1(Uj1)(ρ2(Uj2))1ρ2(γ^2)\rho_{1}(U^{1}_{j})(\rho_{2}(U^{2}_{j}))^{-1}\rho_{2}(\hat{\gamma}_{2}). Then if we take AA^{\prime} to be the restriction of the standard pair AA to the points over Uj2U^{2}_{j}, this subpair satisfies that ρ2(A)=ρ1(Uj1)\rho_{2}(A^{\prime})=\rho_{1}(U^{1}_{j}). We stop on all points in AA^{\prime}. The map Φ\Phi in the statement of the proposition associates AA^{\prime} and Uj1U^{1}_{j}. The complement of these stopped sets AA^{\prime} and Uj1U^{1}_{j} then defines a pair of new standard families Pj+1iP^{i}_{j+1}.
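The mass bookkeeping in the vertical subdivision can be checked directly. In the sketch below, $c$ is shorthand of ours for the height ratio, we assume $\rho_{1}(U^{1}_{j})>0$, and $\rho_{2}(A)$, $\rho_{2}(B)$, $\rho_{2}(A^{\prime})$ denote the masses of the corresponding pieces.

```latex
\begin{align*}
c&:=\rho_{1}(U^{1}_{j})\,(\rho_{2}(U^{2}_{j}))^{-1}\in(0,1]
 &&\text{(since }U^{1}_{j}\text{ has less mass than }U^{2}_{j}\text{)},\\
\rho_{2}(A)&=c\,\rho_{2}(\hat{\gamma}_{2}),\qquad
\rho_{2}(B)=(1-c)\,\rho_{2}(\hat{\gamma}_{2})
 &&\text{(cutting every fiber at height }c\rho_{2}\text{)},\\
\rho_{2}(A^{\prime})&=c\,\rho_{2}(U^{2}_{j})=\rho_{1}(U^{1}_{j})
 &&\text{(restricting }A\text{ to the points over }U^{2}_{j}\text{)}.
\end{align*}
```

So the stopped sets $A^{\prime}$ and $U^{1}_{j}$ carry equal mass, which is what the coupled stopping requires.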

In order for us to be able to proceed with this argument inductively, we must verify that the inductive assumption (8.10) still holds. From the second part of Lemma 8.2, as long as xfωN0+(j+1)(q+1)Δ(γ)x\in f^{N_{0}+(j+1)(q+1)\Delta}_{\omega}(\gamma) has length at least eδe^{-\delta}, and a point xx experiences another tempered time qΔq\Delta iterates later, then by choice of qq,

\[e^{-\delta}>e^{-C_{1}}e^{-.8\lambda q\Delta},\]

so by that lemma if there is a future tempered time n>N0+(j+1)(q+1)Δ+qΔn>N_{0}+(j+1)(q+1)\Delta+q\Delta, then at that time the image of xx will lie in a C0C_{0}-good pair. Note that as we only consider future tempered times that are at least qΔq\Delta past the point where the curve is eδe^{-\delta} long that by our choice of constants and the last part of Lemma 8.2 the assumption (8.10) holds inductively.

This completes the description of the stopping procedure. We now turn to estimating the tail of the stopping time.

Part 2: Rate of Stopping. Let $A_{n}^{1}$ and $A_{n}^{2}$ be the sets of pairs $(\omega,x)$ in $\Sigma\times\hat{\gamma}_{1}$ and $\Sigma\times\hat{\gamma}_{2}$, respectively, that have not permanently stopped at time $n(q+1)\Delta$, i.e. after $n$ attempts at coupled stopping they are still not stopped. Our goal now is to show that $(\mu\otimes\rho_{1})(A_{n}^{1})$ has an exponential tail. We begin with several claims. The idea is that if the amount of mass that has not stopped at time $n$ is large, then a large proportion of points will have a tempered time very quickly. If a large proportion of each curve has a tempered time, then we can stop on these points and obtain the result.

In this part of the proof, we will write all stopping times as if we had reindexed things so that N0=ARN_{0}=\lceil AR\rceil is time 0, AR+(q+1)Δ\lceil AR\rceil+(q+1)\Delta is time 11, etc, to avoid a mess of notation. Keep in mind from our choice of constants earlier that we can pick Δ\Delta as large as we like at the beginning of the proof to ensure that α\alpha is as large as we like below.

Claim 8.6.

For any β>0\beta>0, there exists α02β\alpha_{0}\geq 2\beta such that for all αα0\alpha\geq\alpha_{0}, if we have chosen the block size Δ\Delta as above to ensure an enαe^{-n\alpha} tail on tempered times pointwise (8.9), then if for some nn\in\mathbb{N} and all i<ni<n, (μρ1)(Ai1)eiβeβ(\mu\otimes\rho_{1})(A_{i}^{1})\leq e^{-i\beta}e^{\beta} and enβ(μρ1)(An1)e2βenβe^{-n\beta}\leq(\mu\otimes\rho_{1})(A_{n}^{1})\leq e^{2\beta}e^{-n\beta}, then at the end of the next block, 1e99100α1-e^{-\frac{99}{100}\alpha} proportion of the points (ω,x)(\omega,x) in An1A_{n}^{1} experience a tempered time.

Proof.

Let T(ω,x)T(\omega,x) denote the next tempered time for (ω,x)An1(\omega,x)\in A_{n}^{1} then we wish to study a conditional probability (T(ω,x)>n+1|(ω,x)An1),\displaystyle\mathbb{P}(T(\omega,x)>n+1|(\omega,x)\in A_{n}^{1}), as this gives a bound on the probability that we stop at the next attempt. Then

(8.11) (T(ω,x)>n+1|(ω,x)An1)=(T(ω,x)>n+1 and (ω,x)An1)(An1)\displaystyle\mathbb{P}(T(\omega,x)>n+1|(\omega,x)\in A_{n}^{1})=\frac{\mathbb{P}(T(\omega,x)>n+1\text{ and }(\omega,x)\in A_{n}^{1})}{\mathbb{P}(A_{n}^{1})}

Let BjnAn1B_{j}^{n}\subseteq A_{n}^{1} be the set of trajectories that have not had a tempered time since iterate jj and hence are in An1A_{n}^{1} for this reason. Thus An1=j=0nBjn\displaystyle A_{n}^{1}=\sqcup_{j=0}^{n}B_{j}^{n}. Note that BjnAj1B_{j}^{n}\subseteq A_{j}^{1} as these points certainly weren’t stopped at time jj. Hence

\begin{align*}
\mathbb{P}(T(\omega,x)>n+1\mid(\omega,x)\in A_{n}^{1})&=\frac{\sum_{j=0}^{n}\mathbb{P}(T(\omega,x)>n+1\text{ and }(\omega,x)\in B_{j}^{n})}{\mathbb{P}(A_{n}^{1})}\\
&\leq\frac{\sum_{j=0}^{n}\mathbb{P}(T(\omega,x)>n+1\text{ and }(\omega,x)\in A_{j}^{1})}{\mathbb{P}(A_{n}^{1})}\leq(\mathbb{P}(A_{n}^{1}))^{-1}\sum_{j=0}^{n}e^{-(n-j+1)\alpha}e^{-\beta j+2\beta}\quad\text{by (8.8)}\\
&\leq e^{2\beta}e^{n(\beta-\alpha)}e^{-\alpha}\sum_{j=0}^{n}e^{j(\alpha-\beta)}=e^{2\beta}e^{-\alpha}\sum_{j=0}^{n}e^{(n-j)(\beta-\alpha)}=e^{2\beta}e^{-\alpha}\sum_{j=0}^{n}e^{j(\beta-\alpha)}\\
&\leq e^{2\beta}e^{-\alpha}(1+2e^{(\beta-\alpha)})\leq e^{-\frac{99}{100}\alpha},
\end{align*}

for α\alpha sufficiently large relative to β\beta. This is the needed claim, so we are done. ∎
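As a sanity check on the last two inequalities of the chain, the following minimal Python sketch scans for a value of $\alpha$ at which $e^{2\beta}e^{-\alpha}(1+2e^{\beta-\alpha})\leq e^{-\frac{99}{100}\alpha}$ and then confirms numerically that the geometric sum stays below that threshold for several $n$. The function names and the sample value $\beta=0.05$ are illustrative assumptions, not quantities fixed in the paper.

```python
import math

def conditional_miss_bound(alpha: float, beta: float, n: int) -> float:
    """e^{2*beta} * e^{-alpha} * sum_{j=0}^{n} e^{j*(beta-alpha)}, the penultimate bound."""
    geometric_sum = sum(math.exp(j * (beta - alpha)) for j in range(n + 1))
    return math.exp(2 * beta) * math.exp(-alpha) * geometric_sum

def smallest_alpha(beta: float, step: float = 0.01) -> float:
    """Scan upward from 2*beta for the first alpha on the grid with
    e^{2*beta - alpha} * (1 + 2*e^{beta - alpha}) <= e^{-0.99*alpha}."""
    alpha = 2 * beta
    while (math.exp(2 * beta - alpha) * (1 + 2 * math.exp(beta - alpha))
           > math.exp(-0.99 * alpha)):
        alpha += step
    return alpha

beta = 0.05                      # hypothetical sample value
alpha0 = smallest_alpha(beta)    # some admissible alpha for this beta
for n in (0, 1, 10, 1000):
    print(n, conditional_miss_bound(alpha0, beta, n) <= math.exp(-0.99 * alpha0))
```

The scan illustrates that the threshold on $\alpha$ is of the order of a large multiple of $\beta$, which is consistent with the phrase "sufficiently large relative to $\beta$" above.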

The following claim shows that if most of the remaining pairs $(\omega,x)$ are experiencing a tempered time at time $n$ then we stop on a relatively large amount of mass at that step.

Claim 8.7.

There exists $\alpha_{0}$ such that for all $\alpha\geq\alpha_{0}$, if $B_{n}^{1}$ and $B_{n}^{2}$ are the subsets of $A_{n}^{1}$ and $A_{n}^{2}$ having tempered times at time $n+1$ and if for $i\in\{1,2\}$,

\[(\mu\otimes\rho_{i})(B_{n}^{i})\geq(1-e^{-\alpha})(\mu\otimes\rho_{i})(A_{n}^{i}),\tag{8.12}\]

then

\[(\mu\otimes\rho_{i})(A_{n+1}^{i})\leq e^{-\alpha/3}(\mu\otimes\rho_{i})(A_{n}^{i}).\tag{8.13}\]
Proof.

Let $\pi\colon\Sigma\times\hat{\gamma}_{1}\to\Sigma$ denote the projection. Associated to $A_{n}^{1}$ and $A_{n}^{2}$ we have a measure $\widetilde{\mu}_{n}$ on $\Sigma$, given by

\[\widetilde{\mu}_{n}(X)=(\mu\otimes\rho_{1})(\pi^{-1}(X)\cap A_{n}^{1}).\]

Note that if we had used $A_{n}^{2}$ to define $\widetilde{\mu}_{n}$, we would have obtained the same result.

Let $A_{n}^{i}(\omega)$ denote $\pi^{-1}(\{\omega\})\cap A_{n}^{i}$. We claim that there is a set $X\subseteq\Sigma$ such that $\widetilde{\mu}_{n}(X)\geq(1-e^{-\alpha/2})(\mu\otimes\rho_{1})(A^{1}_{n})$ and for $\omega\in X$, we have that

\[\rho_{1}(A_{n}^{1}(\omega)\cap B^{1}_{n})\geq(1-e^{-\alpha/2})\rho_{1}(A_{n}^{1}(\omega)).\tag{8.14}\]

Otherwise there would exist a set $Y$ with $\widetilde{\mu}_{n}(Y)>e^{-\alpha/2}(\mu\otimes\rho_{1})(A^{1}_{n})$ such that equation (8.14) fails for $\omega\in Y$. Writing $y=\widetilde{\mu}_{n}(Y)/(\mu\otimes\rho_{1})(A_{n}^{1})>e^{-\alpha/2}$ for the normalized measure of $Y$, by Fubini we would find

\[(\mu\otimes\rho_{1})(B_{n}^{1})\leq((1-y)+y(1-e^{-\alpha/2}))(\mu\otimes\rho_{1})(A_{n}^{1})<(1-e^{-\alpha})(\mu\otimes\rho_{1})(A_{n}^{1}),\]

which is impossible from our assumption (8.12).
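The Fubini step is a Markov-type inequality: if the total deficiency is at most an $e^{-\alpha}$ proportion of the mass, then the fibers losing more than an $e^{-\alpha/2}$ proportion carry at most an $e^{-\alpha/2}$ proportion of the mass. The following minimal Python sketch checks this on a toy discrete model; the fiber weights, the recovering fractions, and the value of $\alpha$ are hypothetical and chosen only for illustration.

```python
import math

def bad_fiber_mass(weights, fractions, eps):
    """Total weight of the fibers whose recovering fraction is below 1 - eps."""
    return sum(w for w, p in zip(weights, fractions) if p < 1 - eps)

alpha = 4.0                          # hypothetical value of alpha
eps = math.exp(-alpha / 2)           # fiberwise threshold e^{-alpha/2}
weights = [1.0] * 1000               # toy stand-in for the fiber masses rho_1(A_n^1(omega))
fractions = [1.0] * 1000             # toy recovering fraction on each fiber
for i in range(10):                  # ten fibers on which nothing recovers
    fractions[i] = 0.0

total = sum(weights)
recovering = sum(w * p for w, p in zip(weights, fractions))
# Analogue of (8.12): the recovering mass is at least (1 - e^{-alpha}) of the total.
hypothesis = recovering >= (1 - eps ** 2) * total
bad = bad_fiber_mass(weights, fractions, eps)
print(hypothesis, bad, eps * total)
```

In this toy model the hypothesis holds and the bad fibers carry mass well below $e^{-\alpha/2}$ times the total, as the argument predicts.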

Thus we may find a set $X_{1}\subseteq X$ such that $\widetilde{\mu}_{n}(X_{1})\geq(1-e^{-\alpha/2})(\mu\otimes\rho_{1})(A_{n}^{1})$ and for $\omega\in X_{1}$, (8.14) holds. Similarly we may find a set $X_{2}$ such that the same holds for $A_{n}^{2}$. Then $\widetilde{\mu}_{n}(X_{1}\cap X_{2})\geq(1-2e^{-\alpha/2})(\mu\otimes\rho_{1})(A_{n}^{1})$ and for every point $\omega\in X_{1}\cap X_{2}$, each curve in $A^{i}_{n}(\omega)$ has at least a $1-e^{-\alpha/2}$ proportion of its remaining mass recovering. As described in the first part of the proof, we then trim segments of length $e^{-\delta}$ off these subcurves, which by the choice of $\delta$ leaves us with a $(1-e^{-\alpha})$ proportion of the remaining mass. Thus on each curve there is at least

\[(1-e^{-\alpha/2})(1-e^{-\alpha})\rho_{1}(A_{n}^{1}(\omega))\]

mass to stop on. Hence by the estimate on the measure of such $\omega$, we can stop on

\[(1-2e^{-\alpha/2})(1-e^{-\alpha/2})(1-e^{-\alpha})(\mu\otimes\rho_{1})(A_{n}^{1})\]

of the remaining mass. In particular, for sufficiently large $\alpha$ the unstopped mass remaining at the $(n+1)$th step satisfies

\[(\mu\otimes\rho_{1})(A_{n+1}^{1})\leq e^{-\alpha/3}(\mu\otimes\rho_{1})(A_{n}^{1}),\tag{8.15}\]

as desired. ∎
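To see how large $\alpha$ must be in the last step, one can compare the unstopped proportion $1-(1-2e^{-\alpha/2})(1-e^{-\alpha/2})(1-e^{-\alpha})$ with $e^{-\alpha/3}$ numerically. The following is a minimal Python sketch; the sample values of $\alpha$ are illustrative assumptions.

```python
import math

def unstopped_fraction(alpha: float) -> float:
    """1 - (1 - 2x)(1 - x)(1 - x^2) with x = e^{-alpha/2}."""
    x = math.exp(-alpha / 2)
    stopped = (1 - 2 * x) * (1 - x) * (1 - x * x)
    return 1 - stopped

for alpha in (2.0, 7.0, 10.0, 20.0):
    # Print alpha, the unstopped proportion, and the target e^{-alpha/3}.
    print(alpha, unstopped_fraction(alpha), math.exp(-alpha / 3))
```

For small $\alpha$ the comparison fails, which is why the claim is only asserted for $\alpha\geq\alpha_{0}$; since the unstopped proportion behaves like $3e^{-\alpha/2}$, the inequality holds once $\alpha$ is moderately large.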

We can now conclude the desired rate of stopping. From our choice of constants, we have $\beta>0$ sufficiently small and $\alpha>0$ sufficiently large that $\beta<\alpha/7$ and both Claims 8.6 and 8.7 hold. As mentioned previously, from the choice of $\Delta$ at the beginning, we may take $\alpha$ as large as we like. We will show that for all $n\in\mathbb{N}$,

\[(\mu\otimes\rho_{1})(A_{n}^{1})\leq e^{-n\beta}e^{\beta}.\tag{8.16}\]

We consider two cases depending on how much mass is left at time $n$.

(1) First, suppose that

\[(\mu\otimes\rho_{1})(A_{n}^{1})\leq e^{-n\beta}.\tag{8.17}\]

Then certainly $(\mu\otimes\rho_{1})(A_{n+1}^{1})\leq e^{\beta}e^{-(n+1)\beta}$, as the unstopped mass cannot increase.

(2) If at time $n$,

\[e^{-n\beta}\leq(\mu\otimes\rho_{1})(A_{n}^{1})\leq e^{2\beta}e^{-\beta n},\tag{8.18}\]

and at all previous times $i<n$ we have $(\mu\otimes\rho_{1})(A_{i}^{1})\leq e^{\beta}e^{-i\beta}$, then Claim 8.6 applies to $A_{n}^{1}$ and $A_{n}^{2}$, which gives that a $1-e^{-\frac{99}{100}\alpha}$ proportion of the points in $A^{1}_{n}$ and $A^{2}_{n}$ will recover at time $n+1$. Thus by Claim 8.7 and our choice of $\alpha>7\beta$, we see that

\[(\mu\otimes\rho_{i})(A_{n+1}^{i})\leq e^{-\frac{99}{300}\alpha}(\mu\otimes\rho_{i})(A_{n}^{i})<e^{-2\beta}(\mu\otimes\rho_{i})(A_{n}^{i}),\tag{8.19}\]

and for the next iterate we are back in the first case, $(\mu\otimes\rho_{1})(A_{n+1}^{1})\leq e^{-(n+1)\beta}$.

In order to conclude, we apply the two options above inductively to obtain equation (8.16) for all $n$. In fact, we will show something slightly stronger: there are never two consecutive indices $n,n+1$ such that

\[e^{-n\beta}<(\mu\otimes\rho_{1})(A_{n}^{1})\leq e^{-n\beta}e^{\beta}\]

holds for both $n$ and $n+1$.

Throughout the induction either we have

\[(\mu\otimes\rho_{i})(A_{n}^{i})<e^{-\beta n}\text{ or }e^{-n\beta}\leq(\mu\otimes\rho_{i})(A_{n}^{i}).\tag{8.20}\]

In the former case, we may apply item (1) in the list just mentioned.

Suppose we are in the latter case: at time $n-1$ we have $(\mu\otimes\rho_{i})(A_{n-1}^{i})<e^{-\beta(n-1)}$, at time $n$ we have $e^{-\beta n}\leq(\mu\otimes\rho_{i})(A_{n}^{i})\leq e^{-\beta(n-1)}$, and for all prior iterates equation (8.16) holds. Then we may apply (2) above to find that

\[(\mu\otimes\rho_{i})(A_{n+1}^{i})<e^{-2\beta}(\mu\otimes\rho_{i})(A_{n}^{i})\leq e^{-2\beta}e^{-(n-1)\beta}=e^{-(n+1)\beta}.\tag{8.21}\]

Thus for the iteration $n+1$ we have $(\mu\otimes\rho_{i})(A_{n+1}^{i})<e^{-(n+1)\beta}$. Note that this means that the second case in (8.20) cannot occur twice in a row. Hence we may proceed inductively to verify that (8.16) holds for every $n$. This concludes the proof of the lemma. ∎
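The two-case induction can be sanity-checked in logarithmic units. Writing the remaining mass at time $n$ as $e^{-\ell_{n}\beta}$, the bound (8.16) reads $\ell_{n}\geq n-1$, and the borderline band $e^{-n\beta}<\text{mass}\leq e^{-n\beta}e^{\beta}$ reads $n-1\leq\ell_{n}<n$. The minimal Python sketch below simulates the slowest decay permitted by the two cases (no loss in the first case, exactly a factor $e^{-2\beta}$ in the second); this worst-case model, the function names, and the starting values are assumptions made for illustration.

```python
from fractions import Fraction

def worst_case_levels(start: Fraction, steps: int) -> list:
    """Track l_n = -ln(mass_n)/beta under the slowest decay the two cases allow."""
    levels = [start]
    for n in range(steps):
        level = levels[-1]
        if level > n:
            # First case of (8.20): mass_n < e^{-n*beta}; assume no further mass is stopped.
            levels.append(level)
        else:
            # Second case of (8.20): mass_n >= e^{-n*beta}; (8.19) gains a factor e^{-2*beta}.
            levels.append(level + 2)
    return levels

def in_borderline_band(n: int, level: Fraction) -> bool:
    """True when e^{-n*beta} < mass_n <= e^{-n*beta} * e^{beta}."""
    return n - 1 <= level < n

levels = worst_case_levels(Fraction(9, 10), 200)
print(all(level >= n - 1 for n, level in enumerate(levels)))
```

Exact rational arithmetic is used so that the comparisons at the boundary between the two cases are not affected by rounding.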

9. Precoupling

In this section, we prove the finite time mixing proposition, Proposition 7.11, which prepares curves for the application of the local coupling lemma.

9.1. Fibrewise mixing

In this subsection we study fiberwise mixing properties of the skew product $F\colon\Sigma\times M\to\Sigma\times M$. A skew product being mixing does not imply that it has any mixing properties fiberwise. For example, the system could be isometric on the fibers. For this reason we will leverage the mixing of $F_{k}\colon\Sigma\times M^{k}\to\Sigma\times M^{k}$. We will obtain a sort of coarse fiberwise mixing by using a concentration of measure argument. The basic idea of the argument is that if $A$ is a subset of $M$, and $B\subset\Sigma\times M$ is a set giving equal measure to each fiber, then if $B$ does not mix with $A$ fiberwise, on many fibers $A\cap F^{n}(B)$ is quite concentrated. As a consequence of this concentration we show that $F_{k}$ cannot be mixing, as there are too many points that stay in the set $A^{k}\subset M^{k}$.

Proposition 9.1.

Suppose that the skew product $F_{k}\colon\Sigma\times M^{k}\to\Sigma\times M^{k}$ from (6.1) is mixing for $\mu\otimes\operatorname{vol}^{k}$ for all $k\in\mathbb{N}$. Let $A\subseteq M$ be a positive measure set. Then for all $\epsilon_{1},\epsilon_{2}>0$, if $U\subseteq\hat{\Sigma}\times M$ is a set giving exactly mass $\alpha_{0}>0$ to $(1-\epsilon_{2})$ of the fibers of $\hat{\Sigma}$ and $0$ to the rest, then there exists $N\in\mathbb{N}$ such that for all $n\geq N$, there is a $(1-2\epsilon_{2})$ proportion of words $\omega\in\hat{\Sigma}$ such that

\[\operatorname{vol}(A)\alpha_{0}(1-\epsilon_{1})\leq\operatorname{vol}(f^{n}_{\omega}(U_{\omega})\cap A)\leq\operatorname{vol}(A)\alpha_{0}(1+\epsilon_{1}),\tag{9.1}\]

where we write $U_{\omega}\subseteq M$ for the portion of $U$ in the fibre over $\omega$.

Proof.

We will prove the lower bound; the upper bound then follows by taking the complement of $A$. For the sake of contradiction, suppose that the lower bound in (9.1) is false. Then there exist $\epsilon_{1},\epsilon_{2}>0$ such that for arbitrarily large $n$, there is a set of words $\omega$ of measure $2\epsilon_{2}$ such that

\[\operatorname{vol}(U_{\omega})=\alpha_{0}\text{ and }\operatorname{vol}(f^{n}_{\omega}(U_{\omega})\cap A)<\operatorname{vol}(A)\alpha_{0}(1-\epsilon_{1}).\tag{9.2}\]

For these words $\omega$,

\[\operatorname{vol}(f^{n}_{\omega}(U_{\omega})\cap(M\setminus A))\geq\alpha_{0}(\operatorname{vol}(M\setminus A)+\epsilon_{1}\operatorname{vol}(A)).\tag{9.3}\]

We now consider what this implies on $\hat{\Sigma}\times M^{k}$. Write $U^{k}$ for the union of the sets $\{\omega\}\times U_{\omega}^{k}$. Then for the words $\omega$ satisfying (9.2), we obtain

\[(\operatorname{vol}^{k})(F^{n}_{k,\omega}(U^{k}_{\omega})\cap\{\sigma^{n}(\omega)\}\times(M\setminus A)^{k})\geq\alpha_{0}^{k}(\operatorname{vol}(M\setminus A)+\epsilon_{1}\operatorname{vol}(A))^{k},\tag{9.4}\]

because fiberwise this intersection is equal to the product $(f^{n}_{\omega}(U_{\omega})\cap(M\setminus A))^{k}$. Thus integrating over this set of $\omega$ of measure $2\epsilon_{2}$, we find that

\[(\hat{\mu}\otimes\operatorname{vol}^{k})(F^{n}_{k}(U^{k})\cap\hat{\Sigma}\times(M\setminus A)^{k})\geq 2\epsilon_{2}\alpha_{0}^{k}(\operatorname{vol}(M\setminus A)+\epsilon_{1}\operatorname{vol}(A))^{k}.\tag{9.5}\]

Note that $(\hat{\mu}\otimes\operatorname{vol}^{k})(U^{k})\leq(1-\epsilon_{2})\alpha_{0}^{k}$ by the definition of $U$. Since $(\hat{\mu}\otimes\operatorname{vol}^{k})(\hat{\Sigma}\times(M\setminus A)^{k})=\operatorname{vol}(M\setminus A)^{k}$, mixing of $F_{k}$ implies that for sufficiently large $n$,

\[(\hat{\mu}\otimes\operatorname{vol}^{k})(F^{n}_{k}(U^{k})\cap\hat{\Sigma}\times(M\setminus A)^{k})\leq(1-\epsilon_{2}/2)\operatorname{vol}(M\setminus A)^{k}\alpha_{0}^{k}.\tag{9.6}\]

For large $k$ the bounds (9.5) and (9.6) are incompatible, because the ratio of the right-hand side of (9.5) to that of (9.6) grows exponentially in $k$, so we obtain a contradiction. ∎
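To make the last step concrete, one can compute how large $k$ must be for the lower bound (9.5) to exceed the upper bound (9.6). The following minimal Python sketch does this with exact rational arithmetic after cancelling the common factor $\alpha_{0}^{k}$. It assumes the volume is normalized so that $\operatorname{vol}(M)=1$, and the function name and the sample values of $\operatorname{vol}(A)$, $\epsilon_{1}$, $\epsilon_{2}$ are hypothetical.

```python
from fractions import Fraction

def smallest_incompatible_k(vol_A: Fraction, eps1: Fraction, eps2: Fraction,
                            k_max: int = 10_000):
    """Smallest k with 2*eps2*(v + eps1*a)^k > (1 - eps2/2)*v^k,
    where a = vol(A) and v = vol(M \\ A) = 1 - a (normalized volume assumed)."""
    a, v = vol_A, 1 - vol_A
    for k in range(1, k_max + 1):
        lower = 2 * eps2 * (v + eps1 * a) ** k   # right side of (9.5) divided by alpha_0^k
        upper = (1 - eps2 / 2) * v ** k          # right side of (9.6) divided by alpha_0^k
        if lower > upper:
            return k
    return None  # no such k found below k_max

k = smallest_incompatible_k(Fraction(1, 2), Fraction(1, 10), Fraction(1, 100))
print(k)
```

The required $k$ grows as $\epsilon_{1}$ and $\epsilon_{2}$ shrink, which is why the proposition assumes mixing of $F_{k}$ for every $k$.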

9.2. Proof of the finite time mixing proposition

In this subsection we prove the finite time mixing Proposition 7.11. The idea is straightforward. We can saturate the curve $\hat{\gamma}$ with stable manifolds to embed $\hat{\gamma}$ in a positive measure set that will contract onto the image of $\hat{\gamma}$ forward in time. As the skew product $F\colon\hat{\Sigma}\times M\to\hat{\Sigma}\times M$ is fibrewise mixing (Proposition 9.1), this positive measure thickening of $\hat{\gamma}$ must equidistribute for most words. Simultaneously, we know that most images of $\hat{\gamma}$ will be relatively smooth. This allows us to conclude.

In the proof we will need some intermediate claims.

Definition 9.2.

An $\epsilon$-thickening of a curve $\gamma$ for a word $\omega\in\Sigma$ consists of two pieces of information. The first piece is a subset $\gamma_{0}\subset\gamma$ that will be thickened. The second piece is a set of the form

\[\bigcup_{x\in\gamma_{0}}W^{s}_{\epsilon(x)}(\omega,x),\]

where $W^{s}_{\epsilon(x)}(\omega,x)$ is the local stable leaf of radius $0<\epsilon(x)<\epsilon$ through $x$. We will often denote such sets by $\kappa_{\omega}(\gamma)$.

Note that although the thickening can in principle be defined over all of $\gamma$, we will usually only use it on a special subset $\gamma_{0}$ that has better properties.

The following lemma shows that we may choose thickenings of $\gamma$ so that the pushforward of the volume along the thickening to $\gamma$ by the stable holonomy is proportional to $\rho$ on $\gamma_{0}$.

Lemma 9.3.

(Local Thickening Lemma) Fix $\epsilon_{1}>0$ and $C_{0}>0$, a level of goodness of standard pairs. For any $\epsilon_{2}>0$, there exist $\epsilon_{3},c_{1},C_{2},\varrho>0$ such that for $1-\epsilon_{2}$ of words $\omega\in\Sigma$, and any $C_{0}$-good standard pair $\hat{\gamma}=(\gamma,\rho)$ of unit mass, we can form an $\epsilon_{1}$-thickening of $\gamma$, $\kappa_{\omega}(\gamma)$, in the sense of Definition 9.2, such that:

(1) Let $\pi^{s}$ be the projection to $\gamma$ along the stable leaves. Then $\pi^{s}_{*}(\operatorname{vol}|_{\kappa_{\omega}(\gamma)})=c_{1}\rho|_{\pi^{s}(\kappa_{\omega}(\gamma))}$ and $\rho(\pi^{s}(\kappa_{\omega}(\gamma)))>\varrho$.

(2) Every stable leaf in $\kappa_{\omega}(\gamma)$ is uniformly $(C_{2},\lambda,\epsilon_{3})$-tempered under forward iterations.

(3) The choice of thickening $\kappa_{\omega}(\gamma)$ depends measurably on $\omega$.

Proof.

We know that for every point $x$ and almost every word $\omega$, $x$ is in the Pesin block $\Lambda^{\omega}_{\infty}(C)$ for some sufficiently large $C$, and on a measure one subset, $E^{s}$ is not tangent to $\gamma$. Thus we can saturate a positive measure subset of $\gamma$ with stable manifolds with uniformly controlled geometry by increasing $C$. By taking a shorter subset of the saturating stable curves in such a Pesin block, we can ensure that the volume measure of the saturation projected along the stable leaves to $\gamma$ gives a measure that is proportional to $\rho$ restricted to the image of $\pi^{s}$. ∎

The following lemma says that if we start with a $C$-good curve, then we can ensure that a large proportion of the images of the curve are $C_{0}$-good at any time in the future.

Lemma 9.4.

For any $\epsilon>0$, there exists $C_{0}$ such that for any $C>0$, a level of goodness, there exists $N_{0}\geq 0$ such that for any $C$-good standard pair $\hat{\gamma}$ and all $n\geq N_{0}$, there exists a set $\Sigma_{0}^{n}\subseteq\Sigma$ of measure at least $1-\epsilon$ such that for $\omega\in\Sigma_{0}^{n}$,

\[\rho(x:f^{n}_{\omega}(\hat{\gamma})\text{ has a }C_{0}\text{-good neighborhood of }f^{n}_{\omega}(x))\geq(1-\epsilon)\rho(\hat{\gamma}).\tag{9.7}\]

The same holds for a $C$-good standard family.

Proof.

This is immediate from Proposition 8.4, which says that for large enough $\Delta$, we may ensure that $1-\epsilon$ of the pairs $(\omega,x)$ will have a tempered time between times $n+\Delta$ and $n+2\Delta$ for any $n$. We choose $N_{0}$ large enough that such a tempered time recovers a $C$-good curve to being $D$-good for some uniform $D$. Then we wait until the end of the block, which gives a further, bounded loss of goodness. As in other places in the paper, a Fubini argument gives the fiberwise estimate stated here. Finally, note that this argument is independent of $n\geq N_{0}$. ∎

We are now ready to prove the finite time mixing proposition.

Proof of Proposition 7.11.

The outline of the proof is as follows. We first find a collection of balls in $M$ that a thickened version of $\gamma_{1}$ and $\gamma_{2}$ will mix onto due to the fibered mixing lemma. Then once mixing is accomplished most subcurves of $f^{n}_{\omega}(\gamma_{1})$ and $f^{n}_{\omega}(\gamma_{2})$ will still be long. Consequently, if there are subcurves intersecting a small ball $B_{\upsilon}(x)$ then those subcurves will form a $(C_{1},\delta,\upsilon)$ configuration for some $C_{1}$. To achieve this setup, we will construct subsets $\Sigma_{0},\ldots,\Sigma_{4}$ of $\Sigma$. Each of these sets will consist of words $\omega$ that have some particular finite time mixing properties, so that their intersection has all the properties we need to conclude along the lines just described. We will also have some additional parameters $m_{i}$ that are chosen below.

The input to this proposition requires some constants. First, let $0<\tau<1$ be the constant from the Local Coupling Lemma, Lemma 7.10, which says that the conclusions of that lemma hold for $(C,\delta,\tau\delta)$-configurations for any $C$ as long as $\delta$ is sufficiently small relative to $C$. We then obtain the following claim. Note that this holds for all sufficiently small $\delta$ with a uniform lower bound in the last term.

Claim 9.5.

There exists $m_{0}$ such that for all sufficiently small $\delta>0$, we can find a family of disjoint balls $B_{i}=B_{\delta_{i}}(x_{i})$ in $M$ such that:

(1) Each $B_{i}$ has equal volume between $(1/10)\delta^{2}$ and $10\delta^{2}$;

(2) Each $B_{i}$ contains a ball $B_{i}^{\prime}$ of diameter at most $\tau\delta/2$ so that $d(\partial B_{i}^{\prime},\partial B_{i})>\delta/2$;

(3) Each $B_{i}^{\prime}$ contains a ball $B_{i}^{\prime\prime}$ with the same center and radius between $\tau\delta/10$ and $\tau\delta/9$, and the balls $B_{i}^{\prime\prime}$ all have equal volume;

(4) $\operatorname{vol}(\bigcup_{i}B_{i}^{\prime\prime})\geq 10^{-m_{0}}$.

We now pick $\Sigma_{0}$, the set of words where $\gamma_{1}$ and $\gamma_{2}$ have good thickenings. Both $\gamma_{1}$ and $\gamma_{2}$ are $C_{0}$-good by assumption. Then for any $m_{1}\in\mathbb{N}$, which we will pick later, there exist $c_{1}$, which is distinct from $C_{1}$, and $\varrho$, and a set $\Sigma_{0}\subseteq\Sigma$ with $\mu(\Sigma_{0})>1-10^{-m_{1}}$, such that for $\omega\in\Sigma_{0}$ there exists a thickening $\kappa_{\omega}(\gamma_{i})$, $i\in\{1,2\}$, satisfying the properties of Lemma 9.3. By possibly shrinking the thickenings we may make them have the same mass $\varrho$. For the words in $\Sigma_{0}$, we form a set $U^{1}\subseteq\Sigma\times M$ by taking the union of the sets $\{\omega\}\times\kappa_{\omega}(\gamma_{1})$; similarly we define $U^{2}$. We denote by $U^{1}_{\omega}$ the part of $U^{1}$ above $\omega$ and use similar notation for $U^{2}_{\omega}$.

We now choose $C_{1}$, the regularity of the pairs that will be in $(C_{1},\delta,\tau\delta)$-configurations in the conclusion of the proposition, as well as $\Sigma^{n}_{1}$ and $\Sigma^{n}_{2}$, the sets of words where most images of $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$ are $C_{1}$-good curves. Choose $C_{1}>0$ such that the conclusion of Lemma 9.4 holds for a set $\Sigma_{1}^{n}$ of words of measure $(1-10^{-m_{2}})$, for some $m_{2}$ that we will choose later, so that for $\omega\in\Sigma_{1}^{n}$,

\[\rho_{1}(x:f^{n}_{\omega}(\hat{\gamma}_{1})\text{ has a }C_{1}\text{-good neighborhood of }f^{n}_{\omega}(x))\geq(1-10^{-m_{2}})\rho_{1}(\hat{\gamma}_{1}).\tag{9.8}\]

For all $\epsilon>0$, there exists $D(\epsilon)>0$ such that for any $C_{0}$-good standard pair $\hat{\gamma}=(\gamma,\rho)$, the points within distance $D$ of the boundary have small measure:

\[\rho(x\in\gamma:d(x,\partial\hat{\gamma})<D)<\epsilon\rho(\gamma).\tag{9.9}\]

Recalling Definition 7.8, the previous equation implies that there exists $D_{1}>0$ such that we may strengthen the conclusion in equation (9.8) above:

\[\rho_{1}(x:f^{n}_{\omega}(x)\text{ is }(C_{1},D_{1})\text{-well positioned in }f^{n}_{\omega}(\hat{\gamma}_{1}))\geq(1-10^{-(m_{2}-1)})\rho_{1}(\hat{\gamma}_{1}).\tag{9.10}\]

Call this set of $(C_{1},D_{1})$-well positioned points $G_{n,\omega}^{1}$. Similarly, for $\hat{\gamma}_{2}$ there exists a set $\Sigma_{2}^{n}$ and a set $G_{n,\omega}^{2}$ with this same property.

Take a covering $B_{i}$ as in Claim 9.5 applied with the parameter $\delta$ small enough that the local coupling lemma holds for $(C_{1},\delta,\tau\delta)$-configurations. Let $m_{1}=m_{2}=m_{3}=m_{4}=20$.

Next we choose $\Sigma_{3}^{n}$ and $\Sigma_{4}^{n}$, which are sets of words that mix the thickenings of $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$ onto the balls $B_{i}^{\prime\prime}$. Let $\epsilon_{2}=10^{-m_{3}}$ from above, and let $0<\epsilon_{1}<10^{-m_{3}}$. Then by the fibrewise mixing proposition (Proposition 9.1), there exists $N_{1}$ such that for $n\geq N_{1}$, there is a set $\Sigma_{3}^{n}\subseteq\Sigma$ of $\mu$-measure $1-2\cdot 10^{-m_{3}}$ such that for $\omega\in\Sigma_{3}^{n}$, $U_{\omega}^{1}$ mixes onto each $B_{i}^{\prime\prime}$ in the covering, i.e. for $\omega\in\Sigma_{3}^{n}$,

\[(1-10^{-m_{3}})\varrho\operatorname{vol}(B_{i}^{\prime\prime})\leq\operatorname{vol}(f^{n}_{\omega}(U_{\omega}^{1})\cap\{\omega\}\times B_{i}^{\prime\prime})\leq\operatorname{vol}(B_{i}^{\prime\prime})\varrho(1+10^{-m_{3}}).\tag{9.11}\]

Similarly we have a cutoff $N_{2}$ and sets $\Sigma_{4}^{n}$ such that the same holds for $U^{2}$. We will strengthen this estimate even further. Let $B_{i}^{\prime\prime\prime}$ be a ball with the same center as $B_{i}^{\prime\prime}$ but with slightly larger radius so that the ratio of the volumes is $\operatorname{vol}(B_{i}^{\prime\prime\prime})/\operatorname{vol}(B_{i}^{\prime\prime})=1+10^{-m_{4}}$. Then by possibly enlarging the numbers $N_{1}$ and $N_{2}$, we can arrange that the same estimate holds simultaneously for the sets $B_{i}^{\prime\prime\prime}$ as well.

Now consider what happens for $\omega\in\Sigma_{0}\cap\Sigma_{1}^{n}\cap\Sigma_{3}^{n}$. These are words $\omega$ where $\gamma_{1}$ has a good thickening by stable manifolds, many of the points in the image of $\hat{\gamma}_{1}$ lie in good standard pairs, and there is equidistribution. For any $m_{4}$, as long as $n$ is sufficiently large, the diameter of the image of any leaf $W^{s}_{\epsilon(x)}(\omega,x)$ in the thickening of $\hat{\gamma}_{1}$ is at most $\tau\delta/10^{2m_{4}}$. Thus from the measure preservation of the projection $\pi^{s}$ of $f^{n}_{\omega}(U^{1}_{\omega})$ onto $f^{n}_{\omega}(\hat{\gamma}_{1})$, we see that if some point $x\in f^{n}_{\omega}(U_{\omega}^{1})$ is in $B_{i}^{\prime\prime}$, then as $B_{i}^{\prime\prime\prime}$ contains a neighborhood of $B_{i}^{\prime\prime}$ of radius $\tau\delta/10^{m_{4}}$, all points on $f^{n}_{\omega}(W^{s}_{\epsilon(x)}(\omega,x))$ and, in particular, the points of $\hat{\gamma}$ lie in this set. Hence, writing $\rho^{n,\omega}_{1}$ for the density on $f^{n}_{\omega}(\hat{\gamma}_{1})$,

\[\rho_{1}^{n,\omega}(f^{n}_{\omega}(\hat{\gamma}_{1})\cap B_{i}^{\prime\prime\prime})\geq\operatorname{vol}(B_{i}^{\prime\prime})(1-10^{-m_{3}})\rho_{1}(\hat{\gamma}_{1}).\tag{9.12}\]

We claim that for such $\omega\in\Sigma_{0}\cap\Sigma_{1}^{n}\cap\Sigma_{3}^{n}$ there exists a subfamily of the $B_{i}^{\prime\prime}$ containing at least $90\%$ of the $B_{i}^{\prime\prime}$, and such that for each of these $B_{i}^{\prime\prime}$,

\[\rho_{1}^{n,\omega}(G_{n,\omega}^{1}\cap B_{i}^{\prime\prime\prime})\geq\operatorname{vol}(B_{i}^{\prime\prime\prime})\rho_{1}(\hat{\gamma})/2.\tag{9.13}\]

Suppose that this were not the case. Then for such an $\omega$ there is a set of $10\%$ of the balls $B_{i}^{\prime\prime\prime}$ such that for these balls we have $\rho_{1}^{n,\omega}(G_{n,\omega}^{1}\cap B_{i}^{\prime\prime\prime})<\operatorname{vol}(B_{i}^{\prime\prime\prime})\rho_{1}(\hat{\gamma})/2$. Then, from (9.10) and the fibrewise mixing estimate (9.11),

\begin{align*}
\operatorname{vol}(f^{n}_{\omega}(U^{1}_{\omega})\cap\bigcup_{i}B_{i}^{\prime\prime})&\leq\frac{\rho_{1}^{n,\omega}(G_{n,\omega}^{1}\cap\bigcup_{i}B_{i}^{\prime\prime\prime})\varrho}{\rho_{1}(\hat{\gamma})}+10^{-(m_{2}-1)}\varrho\\
&\leq .1\sum_{i}\operatorname{vol}(B_{i}^{\prime\prime\prime})\varrho\frac{1}{2}+.9\sum_{i}\operatorname{vol}(B_{i}^{\prime\prime\prime})\varrho(1+10^{-m_{3}})+10^{-(m_{2}-1)}\varrho\\
&\leq .96\sum_{i}\operatorname{vol}(B_{i}^{\prime\prime\prime})\varrho\leq .96\left(1+\frac{1}{10^{m_{4}-1}}\right)\varrho\sum_{i}\operatorname{vol}(B_{i}^{\prime\prime}),
\end{align*}

which contradicts fiberwise mixing of $U^{1}$ with the set $\bigcup_{i}B_{i}^{\prime\prime\prime}$.

Now consider $\omega\in\Sigma_{0}\cap\Sigma_{1}^{n}\cap\Sigma_{2}^{n}\cap\Sigma_{3}^{n}\cap\Sigma_{4}^{n}$. For $90\%$ of the balls $B_{i}^{\prime\prime\prime}$, the ball $B_{i}^{\prime\prime\prime}$ has radius at most $\tau\delta/8$ and contains a set of $(C_{1},D_{1})$-well positioned points of $f^{n}_{\omega}(\hat{\gamma}_{1})$ of measure at least $\rho_{1}(\hat{\gamma}_{1})\tau^{2}\delta^{2}/200$. The same holds for $\hat{\gamma}_{2}$ for a possibly different $90\%$ of balls. Thus for $80\%$ of the balls $B_{i}^{\prime\prime\prime}$, each of $f^{n}_{\omega}(\hat{\gamma}_{1})$ and $f^{n}_{\omega}(\hat{\gamma}_{2})$ contains a set of $(C_{1},D_{1})$-well positioned points of measure $\rho_{1}(\hat{\gamma}_{1})\tau^{2}\delta^{2}/200$. As these points are in a ball of radius $\tau\delta/8$, it follows from our choice of $\delta$ that any pair of such images is $(C_{1},\delta,\tau\delta)$-configured. Thus the needed conclusion follows by possibly subdividing the standard pairs we have identified so that they may be coupled in a measure preserving way. We may now conclude because

\[\mu(\Sigma_{0}\cap\Sigma_{1}^{n}\cap\Sigma_{2}^{n}\cap\Sigma_{3}^{n}\cap\Sigma_{4}^{n})\geq 1-10^{-m_{1}}-2\cdot 10^{-m_{2}}-4\cdot 10^{-m_{3}}\geq 1-10^{-19}.\]
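The final inequality is a union bound over the complements of the five sets, with $m_{1}=m_{2}=m_{3}=20$ as chosen above. A minimal Python sketch of the arithmetic, using exact fractions (the variable names are ours):

```python
from fractions import Fraction

m1 = m2 = m3 = 20
# Complements: Sigma_0 contributes 10^{-m1}, Sigma_1^n and Sigma_2^n contribute
# 10^{-m2} each, and Sigma_3^n and Sigma_4^n contribute 2 * 10^{-m3} each.
excluded = Fraction(1, 10**m1) + 2 * Fraction(1, 10**m2) + 4 * Fraction(1, 10**m3)
print(excluded, excluded <= Fraction(1, 10**19))
```

The excluded measure is $7\cdot 10^{-20}$, which is indeed at most $10^{-19}$.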

10. Proof of the Local Coupling Lemma

10.1. Inductive local coupling procedure

To prove the Local Coupling Lemma 7.10, we would like two positive measure sets to be intertwined under the true stable holonomy. However, at any finite time, we do not yet know what the true limiting stable manifold is. To compensate, at finite times we approximate the limiting holonomy by using the fake stable manifolds. In the proof of the local coupling lemma, we will consider the differences between different standard families as discussed in §7.1.

To begin this section, we introduce a notion of a “fake coupling” of two standard pairs $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$. We use fake couplings because in our setting we cannot use the stable manifold as is done in the deterministic setting. In the deterministic setting, if $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$ are near each other, then we can immediately determine which points in $\hat{\gamma}_{1}$ attract to which in $\hat{\gamma}_{2}$ by using the stable holonomy. We work in an opposite manner: at each time $n$ we discard points that cannot couple yet. For example, if $y\in\hat{\gamma}_{2}$ and none of the time $n$ fake stable manifolds come near $y$, then $y$ can’t couple because the true stable manifold is near the fake one. Consequently, we stop trying to couple $y$ at time $n$. After we see the dynamics for all time, the points that remain in $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$ are those that can be coupled with each other using the stable manifold. Hence after the fact, we see that they were coupled. The fake coupling is not a coupling. A time $n$ fake coupling is a pair of subfamilies $P_{n}^{1}\subseteq\hat{\gamma}_{1}$ and $P_{n}^{2}\subseteq\hat{\gamma}_{2}$ that could potentially be coupled by the true stable manifolds. For a time $n$ fake coupling, we insist that the holonomies of the time $n$ fake stable manifolds carry $P_{n}^{1}$ to $P_{n}^{2}$. Another way to describe this is that $P_{n}^{1}$ and $P_{n}^{2}$ seem coupled until time $n$.

The definition of a fake coupling that follows is adapted to the neighborhood $B_{\delta_{0}}(x)$ from Proposition 10.12 and relies on the constants obtained in that proposition. Fake stable manifolds $W^{s}_{n}$ and their properties are discussed in detail in Appendix B.

Definition 10.1.

Suppose that $\hat{\gamma}_{1}$ and $\hat{\gamma}_{2}$ are two standard pairs that we are attempting to couple that are $(C,\delta^{\prime},\upsilon)$-configured where $C,\delta,\upsilon$ are parameters as in Proposition 10.12. Fix some $x$ and neighborhood $B_{\delta_{0}}(x)$, $\mathcal{C}_{\theta}$ as in part 4 of that Proposition. We will use the other constants from that proposition as well without reintroducing them.

For $n\geq N$, we say that $P^{1}_{n}\subseteq\hat{\gamma}_{1}$ and $P^{2}_{n}\subseteq\hat{\gamma}_{2}$ are a $(b_{0},\hat{\eta})$-fake coupled pair at time $n$ for some word $\omega$ on $B_{\delta_{0}}(x)$ if the following statements hold. Write $\rho_{n}^{1}$ and $\rho_{n}^{2}$ for the densities of $P^{1}_{n}$ and $P^{2}_{n}$ on $\gamma_{1}$ and $\gamma_{2}$. Let $\mathcal{I}^{1}_{n}$ and $\mathcal{I}^{2}_{n}$ be the underlying curves of $P^{1}_{n}$ and $P^{2}_{n}$.

(1) $P^{1}_{n}$ and $P^{2}_{n}$ have equal mass and $(H^{s}_{n-1})_{*}(\rho^{1}_{n})=\rho^{2}_{n}$.

(2) $H^{s}_{n-1}$ carries $\mathcal{I}^{1}_{n}$ to $\mathcal{I}^{2}_{n}$.

(3) If $x\in\gamma_{1}$ is $(C,\lambda,\epsilon,\mathcal{C}_{\theta})$-tempered for times $N\leq i\leq n$, then $x\in\mathcal{I}_{n}^{1}$.

(4) At each point $x$ in the curve underlying $P^{1}_{n}$, we have that

\[\rho^{1}_{n}(x)\geq b_{0}\prod_{N\leq i\leq n}(1-e^{-i\hat{\eta}})\rho^{1}(x).\]

We will see below that if for a given word $\omega$ we are able to arrange that the statements above hold for each $n$, then in the limit, for each point $x\in\gamma_{1}$ that is $(C,\lambda,\epsilon)$-tempered and lies in each $P^{1}_{n}$, at least $\epsilon_{0}$ of the mass above $x$ in $\hat{\gamma}_{1}$ couples. Thus, as typically a positive measure set of $x$ have this property, a positive proportion of the mass of $P^{1}_{n}$ couples.
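The reason a positive amount of density survives in the limit is that the product in item (4) of Definition 10.1 is bounded below uniformly in $n$: by the elementary inequality $\prod_{i}(1-a_{i})\geq 1-\sum_{i}a_{i}$ for $a_{i}\in[0,1]$, every partial product is at least $1-e^{-N\hat{\eta}}/(1-e^{-\hat{\eta}})$. The minimal Python sketch below illustrates this; the function names and the sample values of $N$ and $\hat{\eta}$ are hypothetical and are not the constants of Proposition 10.12.

```python
import math

def density_factor(N: int, n: int, eta: float) -> float:
    """Partial product prod_{i=N}^{n} (1 - e^{-i*eta}) from item (4)."""
    product = 1.0
    for i in range(N, n + 1):
        product *= 1 - math.exp(-i * eta)
    return product

def tail_lower_bound(N: int, eta: float) -> float:
    """1 - sum_{i>=N} e^{-i*eta}, a lower bound for every partial product."""
    return 1 - math.exp(-N * eta) / (1 - math.exp(-eta))

N, eta = 50, 0.1   # hypothetical sample values
for n in (50, 100, 1000, 5000):
    print(n, density_factor(N, n, eta), tail_lower_bound(N, eta))
```

The lower bound is positive only when $N$ is large enough relative to $\hat{\eta}$, which matches the requirement that the fake couplings are only considered for times $n\geq N$.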

The structure of the rest of this section is as follows. In §10.2 and §10.3 we show that if a trajectory has a tempered splitting then nearby trajectories also have tempered splittings. In §10.4 we prove Proposition 10.12 which shows how small a scale we need to work at in order to run a coupling procedure. Then in §10.5 we prove the local coupling lemma in two steps. First, we prove Lemma 10.13, which describes a deterministic local coupling procedure that can be applied to a fixed word ω\omega under the choice of constants provided by Proposition 10.12. We then finish the proof of Lemma 7.10 by using that the hypotheses of this deterministic local coupling procedure are satisfied with high probability.

10.2. Nearby points inherit tempered splitting

In this subsection we prove Proposition 10.3, which says that nearby trajectories inherit splittings from each other. This will be used later to show that the set of points on a curve that have a tempered splitting after $n$ iterations is quite fat. The idea that points close to hyperbolic orbits inherit hyperbolicity is useful in many problems in dynamics. For example, the classical Collet–Eckmann condition is used in one dimensional dynamics to show that orbits passing near the critical orbit recover hyperbolicity if the critical orbit is hyperbolic (see [CE80]). Analogous results for two dimensional strongly dissipative maps appear in [BC91, WY01]. In this paper we present a version for general two dimensional maps based on Pesin theory.

We begin with a fact showing how far attracting and repelling directions of a linear map of P1\mathbb{R}\operatorname{P}^{1} move under perturbation.

Lemma 10.2.

Fix some λ>1\lambda>1, then there exists C,ϵ0,δ0,N0>0C,\epsilon_{0},\delta_{0},N_{0}>0, such that if L:22L\colon\mathbb{R}^{2}\to\mathbb{R}^{2} is a linear map of the form

(10.1) [σ100σ2]\begin{bmatrix}\sigma_{1}&0\\ 0&\sigma_{2}\end{bmatrix}

with $\left|\sigma_{1}\right|,\left|\sigma_{2}\right|^{-1}\geq\left|\lambda\right|>1$, $g_{0}\colon\mathbb{R}\operatorname{P}^{1}\to\mathbb{R}\operatorname{P}^{1}$ is the induced map, and $g_{\epsilon}$ is a perturbation with $d_{C^{1}}(g_{0},g_{\epsilon})=\epsilon<\epsilon_{0}$, then:

(1) gϵg_{\epsilon} has a unique repelling fixed point rϵr_{\epsilon} and a unique attracting fixed point aϵa_{\epsilon}, and these satisfy d(rϵ,(0,1))Cϵ,\displaystyle d(r_{\epsilon},(0,1))\leq C\epsilon, d(aϵ,(1,0))Cϵ.\displaystyle d(a_{\epsilon},(1,0))\leq C\epsilon.

(2) On the neighborhood Bδ0((0,1))B_{\delta_{0}}((0,1)), DgϵλCϵ\|Dg_{\epsilon}\|\geq\lambda-C\epsilon and on the neighborhood Bδ0((1,0))B_{\delta_{0}}((1,0)), Dgϵλ1+Cϵ\|Dg_{\epsilon}\|\leq\lambda^{-1}+C\epsilon. These neighborhoods are overflowing and under-flowing, respectively.

(3) If yBδ0((0,1))y\notin B_{\delta_{0}}((0,1)), then gϵN0(y)Bδ0((1,0))g^{N_{0}}_{\epsilon}(y)\in B_{\delta_{0}}((1,0)).
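Conclusion (1) can be checked by hand in the special case where the perturbation $g_{\epsilon}$ is itself induced by a linear map, since the fixed points on $\mathbb{R}\operatorname{P}^{1}$ are then eigen-directions. The sketch below is only an illustration under that assumption (the lemma allows general $C^{1}$-small perturbations); the helper names and the values $\sigma_{1}=2$, $\sigma_{2}=1/2$ are hypothetical.

```python
import math

def fixed_directions(a, b, c, d):
    """Eigen-directions of [[a, b], [c, d]], i.e. fixed points of the induced map on RP^1.

    Returns (attracting, repelling); assumes two distinct positive real eigenvalues.
    """
    t, det = a + d, a * d - b * c
    root = math.sqrt(t * t - 4.0 * det)
    big, small = (t + root) / 2.0, (t - root) / 2.0
    attracting = (big - d, c)    # eigenvector for the larger eigenvalue
    repelling = (b, small - a)   # eigenvector for the smaller eigenvalue
    return attracting, repelling

def proj_dist(v, w):
    """Angle in [0, pi/2] between the lines spanned by v and w."""
    cross = abs(v[0] * w[1] - v[1] * w[0])
    dot = abs(v[0] * w[0] + v[1] * w[1])
    return math.atan2(cross, dot)

def drift_ratios(eps, sigma1=2.0, sigma2=0.5):
    """Drift of the attracting and repelling fixed points, divided by eps."""
    # One fixed perturbation pattern whose entries all have size eps (an arbitrary choice).
    att, rep = fixed_directions(sigma1 + eps, eps, eps, sigma2 - eps)
    return proj_dist(att, (1.0, 0.0)) / eps, proj_dist(rep, (0.0, 1.0)) / eps

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, drift_ratios(eps))
```

In this toy case both ratios stay bounded as $\epsilon\to 0$, which is the linear-in-$\epsilon$ drift asserted by the lemma.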

We omit the proof of the above lemma as these are standard facts about the dynamics in a neighborhood of a hyperbolic fixed point. The proof of the next result is long and relies on a number of intermediate lemmas.

Proposition 10.3.

(Nearby points inherit temperedness) Fix C0,C1,λ,α,ϵ0,D0,σ>0C_{0},C_{1},\lambda,\alpha,\epsilon_{0},D_{0},\sigma>0 and 0<λ<λ0<\lambda^{\prime}<\lambda. Then for sufficiently small ϵ>0\epsilon>0 there exist ν,k,D1,N>0\nu,k,D_{1},N>0 such that kϵ<ϵ0k\epsilon<\epsilon_{0} and if we have a sequence of matrices of length nNn\geq N (Ai)1inSL(2,)(A_{i})_{1\leq i\leq n}\in\operatorname{SL}(2,\mathbb{R}) that are uniformly bounded in norm by D0D_{0} and are (C0,λ,ϵ)(C_{0},\lambda,\epsilon)-tempered, and (Bi)1in(B_{i})_{1\leq i\leq n} is another sequence of matrices such that AiBiC1eα(ni)\|A_{i}-B_{i}\|\leq C_{1}e^{-\alpha(n-i)} then:

  1. (1)

    BiB_{i} has a (D1C0,λ,kϵ)(D_{1}C_{0},\lambda^{\prime},k\epsilon)-subtempered splitting with the stable direction equal to the contracting singular direction of BnB^{n}, and

  2. (2)

    The angle between the stable directions of $(B_{i})_{i=1}^{n}$ and $(A_{i})_{i=1}^{n}$ is at most $e^{-\nu n}$.

  3. (3)

    BnAn(1σ).\displaystyle\|B^{n}\|\geq\|A^{n}\|^{(1-\sigma)}.

Proof.

Before we begin, observe that due to the presence of the factor D1D_{1} in the conclusion, it suffices to show that the needed claim holds for nn sufficiently large as we may always deal with small nn by adjusting D1D_{1}. Let λ^=λ+λ2\hat{\lambda}=\frac{\lambda+\lambda^{\prime}}{2}.

As long as ϵ0<(λλ)/2\epsilon_{0}<(\lambda-\lambda^{\prime})/2, we may view the sequence of matrices AiA_{i} in the finite time Lyapunov charts from Lemma A.1, where we view the sequence as being (C0,λ^,ϵ+λλ2)(C_{0},\hat{\lambda},\epsilon+\frac{\lambda-\lambda^{\prime}}{2})-tempered. In these charts, we have: Ai=[σ1,i00σ2,i],\displaystyle A_{i}=\begin{bmatrix}\sigma_{1,i}&0\\ 0&\sigma_{2,i}\end{bmatrix}, where min{σ1,i,σ2,i1}eλ^\min\{\sigma_{1,i},\sigma_{2,i}^{-1}\}\geq e^{\hat{\lambda}}. From Lemma A.1, the ratio of the reference norm and the Lyapunov norm at step ii is OC1,α,λ,λ(e4ϵi)O_{C_{1},\alpha,\lambda,\lambda^{\prime}}(e^{4\epsilon i}).

As BiB_{i} is a perturbation of size eα(ni)e^{-\alpha(n-i)} by viewing BiB_{i} in the same Lyapunov coordinates as AiA_{i}, we have that

(10.2) Bi=[σ1,i00σ2,i]+OC1,α,λ,λ(eα(ni)e4ϵi),B_{i}=\begin{bmatrix}\sigma_{1,i}&0\\ 0&\sigma_{2,i}\end{bmatrix}+O_{C_{1},\alpha,\lambda,\lambda^{\prime}}(e^{-\alpha(n-i)}e^{4\epsilon i}),

where min{σ1,i,σ2,i1}eλ^\min\{\sigma_{1,i},\sigma_{2,i}^{-1}\}\geq e^{\hat{\lambda}}.

Using this representation, we will now study $B_{i}$ as a perturbation of the matrix product involving the $A_{i}$. For most $i$, the two are quite close and consequently $B_{i}$ will inherit temperedness of its norm. The remaining $i$ will be negligible. To show this, we first identify where the stable direction of $B^{n}$ lies. Then using this we show that the norm of $B^{i}$ is subtempered up to a particular time. Then we do a little bookkeeping to show that if we relax the subtemperedness condition, then the norm will remain subtempered up to time $n$.

First we study how temperedness changes as we continue appending matrices to a sequence.

Lemma 10.4.

Fix some bound eΔ>1e^{\Delta}>1. Suppose that A1,,AnA_{1},\ldots,A_{n} is a sequence of matrices whose splitting into singular directions is (C,λ,ϵ)(C,\lambda,\epsilon)-tempered. Then for any kk and sequence B1,,BmB_{1},\ldots,B_{m} with BieΔ\|B_{i}\|\leq e^{\Delta} and m<Δ1(nkϵC)m<\Delta^{-1}(nk\epsilon-C), the sequence A1,,An,B1,,BmA_{1},\ldots,A_{n},B_{1},\ldots,B_{m} is (C,λkϵ,kϵ)(C,\lambda-k\epsilon,k\epsilon)-tempered.

Proof.

A straightforward generalization of Lemma 4.15 gives that if we have a sequence of matrices A1,,AnA_{1},\ldots,A_{n} with (C,λ,ϵ)(C,\lambda,\epsilon)-tempered norm and we append a sequence B1,,BmB_{1},\ldots,B_{m} that is (Δm,λϵ,ϵ)(-\Delta m,\lambda-\epsilon,\epsilon)-tempered, then the concatenation is a (C~,λkϵ,kϵ)(\tilde{C},\lambda-k\epsilon,k\epsilon) tempered sequence with

(10.3) C~=min{C,CmΔ+nkϵ/2,mΔ+nkϵ}.\tilde{C}=\min\{C,C-m\Delta+nk\epsilon/2,-m\Delta+nk\epsilon\}.

Thus the needed conclusion holds as long as mnkϵCΔ.\displaystyle m\leq\frac{nk\epsilon-C}{\Delta}.

The following lemma gives tight control on where vsv^{s}, the most contracted vector for the sequence (Bi)i=1n(B_{i})_{i=1}^{n} lies. Below we will write gi,ϵg_{i,\epsilon} for the map on P1\mathbb{R}\operatorname{P}^{1} induced by BiB_{i}, viewed in the Lyapunov coordinates above. We write gϵig^{i}_{\epsilon} for the composition gi,ϵg1,ϵg_{i,\epsilon}\circ\cdots\circ g_{1,\epsilon}.

Lemma 10.5.

For all C0,C1,α,λ,λ,D0>0C_{0},C_{1},\alpha,\lambda,\lambda^{\prime},D_{0}>0 as above and all sufficiently small ϵ>0\epsilon>0, there exists ν>0\nu>0 and NsN_{s}\in\mathbb{N} such that if nNsn\geq N_{s} and (Bi)i=1n(B_{i})_{i=1}^{n} is a sequence of matrices as above, a perturbation of (Ai)i=1n(A_{i})_{i=1}^{n}, a sequence of matrices with a (C0,λ,ϵ)(C_{0},\lambda,\epsilon)-subtempered splitting, then the most contracted direction of BnB^{n}, vBsv^{s}_{B}, lies within a neighborhood of size enνe^{-n\nu} of the most contracted direction of AnA^{n}.

Proof.

We will use the perturbed dynamics gi,ϵg_{i,\epsilon} on P1\mathbb{R}\operatorname{P}^{1} from above and prove this result by studying how fast a vector near the vector (0,1)(0,1) escapes and goes to (1,0)(1,0). We will use the estimates of Lemma 10.2 freely and not restate them here. Given δ0>0\delta_{0}>0 in the conclusion of that lemma, we see that as long as the size of the perturbation is at most some ϵδ0\epsilon_{\delta_{0}}, then on the neighborhoods of size δ0\delta_{0} of (0,1)(0,1) the expansion is by a factor of at least e.9λ^e^{.9\hat{\lambda}} and similarly in the δ0\delta_{0}-neighborhood of aϵa_{\epsilon}, the contraction of distance is by a factor of e.9λ^e^{-.9\hat{\lambda}}. As long as ϵ\epsilon is sufficiently small relative to α\alpha and i99100ni\leq\frac{99}{100}n, then gi,ϵg_{i,\epsilon} is a perturbation of size less than ϵδ0\epsilon_{\delta_{0}} and the estimate for the norm of gi,ϵg_{i,\epsilon} on Bδ0((1,0))B_{\delta_{0}}((1,0)) and Bδ0((0,1))B_{\delta_{0}}((0,1)) holds.

Next, we study the norm growth of vv over its entire trajectory. Define Φϵi:P1+\Phi^{i}_{\epsilon}\colon\mathbb{R}\operatorname{P}^{1}\to\mathbb{R}^{+} by

(10.4) $\Phi^{i}_{\epsilon}(v)=\ln\frac{\|B_{i}v\|}{\|v\|}.$

Then, for a unit vector $v$, $\ln\|B^{i}v\|$ is the sum of the $\Phi^{j}_{\epsilon}$, $j\leq i$, along the trajectory of $v$. We divide the trajectory of $v$ into three segments. The first segment is when the image of $v$ does not yet lie in $B_{\delta_{0}}((1,0))$. The middle segment is when it lies in $B_{\delta_{0}}((1,0))$ and $B_{i}$ remains a small enough perturbation of $A_{i}$ that we may use the approximations of Lemma 10.2. Finally, during the last part of the trajectory $i$ is so big that these estimates no longer hold. We let $1<n_{1}<n_{2}<n$ denote, respectively, the index where $g^{i}_{\epsilon}(v)$ first enters $B_{\delta_{0}}((1,0))$ and the index where the approximations of Lemma 10.2 first cease to hold. We now proceed to estimate how large $n_{1}$ and $n_{2}$ are. Then using this information we will calculate $\|B^{n}v\|$.

By estimating in this manner, we will see that any vector that starts at distance more than $e^{-n\nu}$ from $(0,1)$ cannot be a stable vector as its norm grows. Below, we will track the estimates for $B_{i}$; the same apply to $A_{i}$. Consequently, we see that the stable vector for both $A_{i}$ and $B_{i}$ must lie within distance $e^{-n\nu}$ of $(0,1)$ for some sufficiently small $\nu>0$.

We now estimate n1n_{1}, i.e. we study how long it takes a vector vv near (0,1)(0,1) to leave Bδ0((0,1))B_{\delta_{0}}((0,1)). We claim that if ν\nu is sufficiently small then for sufficiently large nn, any vector vv that starts enνe^{-n\nu} away from (0,1)(0,1) will exit Bδ0((0,1))B_{\delta_{0}}((0,1)) after at most (2ν/λ)n(2\nu/\lambda)n iterates. To this end consider

d(gϵi(v),(0,1))d(gϵi(v),gi,ϵ(0,1))d(gi,ϵ((0,1)),(0,1))d(g_{\epsilon}^{i}(v),(0,1))\geq d(g^{i}_{\epsilon}(v),g_{i,\epsilon}(0,1))-d(g_{i,\epsilon}((0,1)),(0,1))
e.9λ^d(gϵi1(v),(0,1))(1C1enαei(α+4ϵ)e.9λ^d(gϵi1(v),(0,1)))\geq e^{.9\hat{\lambda}}d(g^{i-1}_{\epsilon}(v),(0,1))\!\!\left(1\!-\!\frac{C_{1}e^{-n\alpha}e^{i(\alpha+4\epsilon)}}{e^{.9\hat{\lambda}}d(g^{i-1}_{\epsilon}(v),(0,1))}\right)

As long as i(1/3)ni\leq(1/3)n, ϵ<α/100\epsilon<\alpha/100, and nn is sufficiently large,

(10.5) C1enαei(α+4ϵ)eα2n.C_{1}e^{-n\alpha}e^{i(\alpha+4\epsilon)}\leq e^{-\frac{\alpha}{2}n}.

Thus if ν<α/2\nu<\alpha/2, then for sufficiently large nn, if d(gϵi1(v),(0,1))enνd(g^{i-1}_{\epsilon}(v),(0,1))\geq e^{-n\nu}, then

(10.6) (1C1enαei(α+4ϵ)e.9λ^d(gϵi1(v),(0,1)))e.1λ^.\left(1-\frac{C_{1}e^{-n\alpha}e^{i(\alpha+4\epsilon)}}{e^{.9\hat{\lambda}}d(g^{i-1}_{\epsilon}(v),(0,1))}\right)\geq e^{-.1\hat{\lambda}}.

From the above, we see that as long as $n$ is sufficiently large, $i\leq n/3$, and the trajectory of $v$ has not left $B_{\delta_{0}}((0,1))$ after $i$ iterates, then

(10.7) $d(g_{\epsilon}^{i}(v),(0,1))\geq e^{.8\hat{\lambda}}d(g^{i-1}_{\epsilon}(v),(0,1)).$

Proceeding iteratively, we see that after ii iterations, assuming in/3i\leq n/3 and that the trajectory of vv has not left Bδ0((0,1))B_{\delta_{0}}((0,1)),

(10.8) d(gϵi(v),(0,1))e.8λ^id(v,(0,1)).d(g_{\epsilon}^{i}(v),(0,1))\geq e^{.8\hat{\lambda}i}d(v,(0,1)).

In particular, if $g_{\epsilon}^{i}(v)$ has not left $B_{\delta_{0}}((0,1))$ after $i=(2\nu/.8\hat{\lambda})n$ iterates, then we would have that $d(g_{\epsilon}^{i}(v),(0,1))\geq e^{\nu n}$, which is absurd.
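For the reader's convenience, the arithmetic behind this contradiction can be written out as follows. This is our own unpacking of (10.8), under the assumption that the starting vector satisfies $d(v,(0,1))\geq e^{-n\nu}$ and that $\nu$ is small enough that $(2\nu/.8\hat{\lambda})n\leq n/3$, so that (10.8) applies.

```latex
\[
  d\bigl(g_{\epsilon}^{i}(v),(0,1)\bigr)
  \;\geq\; e^{.8\hat{\lambda} i}\, d\bigl(v,(0,1)\bigr)
  \;\geq\; e^{.8\hat{\lambda} i}\, e^{-n\nu}
  \;=\; e^{2\nu n-\nu n}
  \;=\; e^{\nu n}
  \qquad\text{when } i=\frac{2\nu}{.8\hat{\lambda}}\,n .
\]
```

Since $e^{\nu n}>1$ while the point is assumed to remain in $B_{\delta_{0}}((0,1))$, whose radius $\delta_{0}$ is small, this cannot happen.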

Thus as long as ϵ<α/10\epsilon<\alpha/10 it follows for sufficiently large nn that gϵi(v)g^{i}_{\epsilon}(v) exits Bδ0((0,1))B_{\delta_{0}}((0,1)) after at most 2ν.8λ^n\frac{2\nu}{.8\hat{\lambda}}n steps. Moreover, by Lemma 10.2, it enters the neighborhood Bδ0((1,0))B_{\delta_{0}}((1,0)) after an additional N0N_{0} iterates. Thus for sufficiently large nn, n12ν.79λ^nn_{1}\leq\frac{2\nu}{.79\hat{\lambda}}n.

We now estimate n2n_{2}. In the Lyapunov charts, BiB_{i} is a perturbation of AiA_{i} of size eα(ni)e4ϵie^{-\alpha(n-i)}e^{4\epsilon i}. Lemma 10.2 ceases to hold when the size of the perturbation is size Oϵ0(1)O_{\epsilon_{0}}(1). This will occur when eα(ni)e4ϵie^{-\alpha(n-i)}e^{4\epsilon i} is order 11, which happens when iαα+4ϵni\approx\frac{\alpha}{\alpha+4\epsilon}n. If ϵ\epsilon is sufficiently small relative to α\alpha, then α/(α+4ϵ)18ϵ/α\alpha/(\alpha+4\epsilon)\geq 1-8\epsilon/\alpha. Hence by picking some N2N_{2}^{\prime} depending only on ϵ0,δ0\epsilon_{0},\delta_{0} and C1C_{1}, we see that n2n_{2} may be chosen to be the smallest number satisfying n2(18ϵα)nN2n_{2}\geq(1-8\frac{\epsilon}{\alpha})n-N_{2}^{\prime}. Hence for sufficiently large nn we can take the bound n2(19ϵα)nn_{2}\geq(1-9\frac{\epsilon}{\alpha})n.
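The location of the crossover index can be sanity-checked numerically. The sketch below is a toy computation: the threshold standing in for the admissible perturbation size in Lemma 10.2, the constant $C_{1}=1$, and the values of $\alpha$, $\epsilon$, $n$ are all assumptions chosen for illustration.

```python
import math

def perturbation_size(i, n, alpha, eps, C1=1.0):
    """Size C1 * exp(-alpha*(n-i)) * exp(4*eps*i) of B_i - A_i in the Lyapunov charts."""
    return C1 * math.exp(-alpha * (n - i) + 4.0 * eps * i)

def last_small_index(n, alpha, eps, threshold, C1=1.0):
    """Largest i in [1, n] whose perturbation size is at most threshold (0 if none)."""
    last = 0
    for i in range(1, n + 1):
        if perturbation_size(i, n, alpha, eps, C1) <= threshold:
            last = i
    return last

# Illustrative values (assumptions): alpha = 1, eps = 0.01, threshold = 0.05.
n, alpha, eps, threshold = 1000, 1.0, 0.01, 0.05
n2 = last_small_index(n, alpha, eps, threshold)
print(n2, alpha / (alpha + 4.0 * eps) * n, (1.0 - 9.0 * eps / alpha) * n)
```

Because the perturbation size is increasing in $i$, the last admissible index is the crossover; in this toy case it sits just below $\frac{\alpha}{\alpha+4\epsilon}n$ and comfortably above $(1-9\frac{\epsilon}{\alpha})n$.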

Thus between times n1n_{1} and n2n_{2} there are at least (19ϵα2ν.79λ^)n(1-9\frac{\epsilon}{\alpha}-\frac{2\nu}{.79\hat{\lambda}})n iterates. As long as nn is sufficiently large and

(10.9) (19ϵα2ν.79λ^)>12\left(1-9\frac{\epsilon}{\alpha}-\frac{2\nu}{.79\hat{\lambda}}\right)>\frac{1}{2}

which we can certainly arrange if we take ϵ,ν\epsilon,\nu sufficiently small, we see that there are at least n/2n/2 iterates between n1n_{1} and n2n_{2}.

We now estimate $\|B^{n}v\|$. Let us first consider the norm $\|B^{n_{2}}v\|$ by estimating in the Lyapunov metric. Let $v^{i}$ equal $g^{i}_{\epsilon}(v)$. Then, for $i\leq n_{1}$ and $n$ sufficiently large, using (10.2) and (10.5) and the inequality $e^{X}+Y\leq e^{X+Y}$, valid for $X,Y\geq 0$, we obtain $\|B_{i}\|^{\prime}\leq e^{\Delta}+e^{-\frac{\alpha}{2}n}\leq e^{\Delta+e^{-\frac{\alpha}{2}n}}$. Taking logarithms we get $\ln\|B_{i}\|^{\prime}\leq\Delta+e^{-\frac{\alpha}{2}n}$. Thus,

lnBn2v\displaystyle\ln\|B^{n_{2}}v\|^{\prime} i=n1n2Φϵi(vi)+i=0n1Φϵi(vi)(n2n1).8λ^(n2ν.79λ^)(Δ+eα2n).\displaystyle\geq\sum_{i=n_{1}}^{n_{2}}\Phi^{i}_{\epsilon}(v^{i})+\sum_{i=0}^{n_{1}}\Phi_{\epsilon}^{i}(v^{i})\geq(n_{2}-n_{1}).8\hat{\lambda}-\left(n\frac{2\nu}{.79\hat{\lambda}}\right)(\Delta+e^{-\frac{\alpha}{2}n}).

This is the amount of growth in the Lyapunov coordinates. For the original metric, by Lemma A.1(3) this implies from our bounds on n1n_{1} and n2n_{2}, that

(10.10) lnBn2v(n2n1).8λ^n2ν.79λ^(Δ+eα2n)4n2ϵ.\ln\|B^{n_{2}}v\|\geq(n_{2}-n_{1}).8\hat{\lambda}-n\frac{2\nu}{.79\hat{\lambda}}(\Delta+e^{-\frac{\alpha}{2}n})-4n_{2}\epsilon.

Since lnBiΔ\ln\|B_{i}\|\leq\Delta, and because nn2(9ϵ/α)nn-n_{2}\leq(9\epsilon/\alpha)n, and n2n1>n/2n_{2}-n_{1}>n/2, we see that

(10.11) lnBn.4λ^nn2ν.79λ^(Δ+eα2n)n24ϵ9Δϵnα.\ln\|B^{n}\|\geq.4\hat{\lambda}n-n\frac{2\nu}{.79\hat{\lambda}}(\Delta+e^{-\frac{\alpha}{2}n})-n_{2}4\epsilon-\frac{9\Delta\epsilon n}{\alpha}.

So, we may conclude if

(10.12) .4λ^2ν.79λ^9Δϵα>0,.4\hat{\lambda}-\frac{2\nu}{.79\hat{\lambda}}-9\Delta\frac{\epsilon}{\alpha}>0,

which is certainly true as long as ϵ\epsilon and ν\nu are sufficiently small relative to α\alpha, λ\lambda^{\prime}, and Δ\Delta. ∎
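To get a feeling for Lemma 10.5, one can experiment numerically. The following is a minimal sketch under simplifying assumptions that are ours and not the paper's: the $A_{i}$ are constant and diagonal (so the Lyapunov charts are trivial), the perturbations are random with entries bounded by $C_{1}e^{-\alpha(n-i)}$, the $B_{i}$ are not renormalized to have determinant $1$, and all helper names and numerical values are hypothetical.

```python
import math
import random

def matmul(X, Y):
    """Product X @ Y of 2x2 matrices stored as ((a, b), (c, d))."""
    return (
        (X[0][0] * Y[0][0] + X[0][1] * Y[1][0], X[0][0] * Y[0][1] + X[0][1] * Y[1][1]),
        (X[1][0] * Y[0][0] + X[1][1] * Y[1][0], X[1][0] * Y[0][1] + X[1][1] * Y[1][1]),
    )

def stable_angle(M):
    """Signed angle between the most contracted direction of M and (0, 1)."""
    a = M[0][0] ** 2 + M[1][0] ** 2            # entries of M^T M
    d = M[0][1] ** 2 + M[1][1] ** 2
    b = M[0][0] * M[0][1] + M[1][0] * M[1][1]
    # The most expanded direction makes angle 0.5*atan2(2b, a-d) with (1, 0);
    # the most contracted direction is perpendicular to it, so it makes the
    # same angle with (0, 1).
    return 0.5 * math.atan2(2.0 * b, a - d)

def stable_direction_gap(n, lam=0.5, alpha=1.0, C1=0.1, seed=0):
    """Angle between the most contracted directions of A^n and B^n in the toy model."""
    rng = random.Random(seed)
    A = B = ((1.0, 0.0), (0.0, 1.0))
    for i in range(1, n + 1):
        Ai = ((math.exp(lam), 0.0), (0.0, math.exp(-lam)))
        size = C1 * math.exp(-alpha * (n - i))   # bound on each entry of B_i - A_i
        Bi = tuple(
            tuple(Ai[r][c] + size * rng.uniform(-1.0, 1.0) for c in range(2))
            for r in range(2)
        )
        A, B = matmul(Ai, A), matmul(Bi, B)
    return abs(stable_angle(B) - stable_angle(A))

for n in (10, 20, 30):
    print(n, stable_direction_gap(n))
```

In this toy setting the gap shrinks rapidly as $n$ grows, in line with the $e^{-n\nu}$ bound of the lemma; the experiment says nothing about the optimal $\nu$.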

Remark 10.6.

Note that the proof of the previous claim shows something more precise: letting $v_{A}^{s},v_{A}^{u}$ be the most contracted and expanded directions of $A^{n}$, in the Lyapunov charts both $v_{A}^{s}$ and $v_{B}^{s}$ lie within the neighborhood $B_{\delta_{0}}((0,1))$, and $v_{A}^{u}$ and $v_{B}^{u}$ both lie within the neighborhood $B_{\delta_{0}}((1,0))$, where the conclusions of Lemma 10.2 hold.

Now that we have located where vsv^{s}, and hence vuv^{u} lies, we check that the norm of BiB^{i} is subtempered.

Lemma 10.7.

For any ϵ0>0\epsilon_{0}>0, suppose that we have a sequence of matrices as above. Then there exists k(C,λ,ϵ,α,Δ)k(C,\lambda,\epsilon,\alpha,\Delta) such that kϵ<ϵ0k\epsilon<\epsilon_{0} and the norm Bi\|B^{i}\| is (C,λkϵ,4kϵ)(C,\lambda-k\epsilon,4k\epsilon) sub-tempered.

Proof.

From Lemma 10.2, we see that if vu(E0s)v^{u}\in(E^{s}_{0})^{\perp}, then vuv^{u} lies in Bδ0((1,0))B_{\delta_{0}}((1,0)). Given any β0>0\beta_{0}>0 and nn sufficiently large, any vector vv in this neighborhood satisfies that for i<n2i<n_{2},

(10.13) Φϵi(v)(1β0)λ.\Phi^{i}_{\epsilon}(v)\geq(1-\beta_{0})\lambda.

Thus we see that along the trajectory from time $1$ to $n_{2}$, every vector that begins in $B_{\delta_{0}}((1,0))$ is $(C,(1-\beta_{0})\lambda,0)$-subtempered for the sequence of matrices $B_{i}$ viewed in Lyapunov charts. Take $\beta_{0}$ such that $(1-\beta_{0})\lambda>(\lambda+\lambda^{\prime})/2$.

With respect to the reference metric, such a sequence is (C,(1β0)λ,4ϵ)(C,(1-\beta_{0})\lambda,4\epsilon)-tempered due to Lemma A.1(3). This gives temperedness up to time n2n_{2}.

Recall that Lemma 10.4 says that if we extend the sequence by mm matrices where

m<Δ1(nkϵC),m<\Delta^{-1}(nk\epsilon-C),

then the result will be $(C,(1-\beta_{0})\lambda-k\epsilon,4k\epsilon)$-tempered. In our case because $n_{2}\geq(1-\frac{9\epsilon}{\alpha})n$, we would like to append $\frac{9\epsilon}{\alpha}n$ matrices of norm at most $e^{\Delta}$ and have the resulting sequence still be tempered. So, we need that

(10.14) 9ϵαn<Δ1(nkϵC)\frac{9\epsilon}{\alpha}n<\Delta^{-1}(nk\epsilon-C)

For sufficiently large nn, this holds as long as kϵΔ1>9ϵ/αk\epsilon\Delta^{-1}>9\epsilon/\alpha, that is, k>9Δ/αk>9{\Delta}/{\alpha}. Taking ϵ\epsilon sufficiently small we can arrange that kϵ<ϵ0k\epsilon<\epsilon_{0}. In particular choosing β0\beta_{0} sufficiently small, we can have that (1β0)λkϵλ(1-\beta_{0})\lambda-k\epsilon\geq\lambda^{\prime}, so the needed conclusion holds. ∎

The first and second conclusions of Proposition 10.3 for sufficiently large nn are now immediate from the two lemmas once we apply Proposition 4.6, which constructs a splitting for a norm subtempered sequence.

We now turn to the proof of the third conclusion of the proposition. We need additional estimates.

We let n2n_{2} be as above; it is the point past which the estimate in Lemma 10.2 ceases to hold. Note that there exists β1\beta_{1} such that AiBieβ1(n2i)\|A_{i}-B_{i}\|^{\prime}\leq e^{-\beta_{1}(n_{2}-i)} where \|\cdot\|^{\prime}, denotes the Lyapunov metric. Also, recall that from our choice of n2n_{2}, that on a neighborhood of (1,0)(1,0) of size δ0\delta_{0} that BiB_{i} contracts distances by a factor of e.9λ^e^{-.9\hat{\lambda}}.

Claim 10.8.

There exists β2>0\beta_{2}>0 such that if vuv^{u} is the unstable vector for the AiA_{i}, then

d(Aivu,Bivu)Keβ2(n2i),d^{\prime}(A^{i}v^{u},B^{i}v^{u})\leq Ke^{-\beta_{2}(n_{2}-i)},

where $d^{\prime}(u_{1},u_{2})=\left\|\frac{u_{1}}{\|u_{1}\|^{\prime}}-\frac{u_{2}}{\|u_{2}\|^{\prime}}\right\|^{\prime}$ is the metric on $\mathbb{R}\operatorname{P}^{1}$ corresponding to the Lyapunov metric.

Proof.

Recall that in the Lyapunov coordinates, we have $A^{i}((1,0))=(1,0)$. Further, from the previous lemma, $v_{A}$ is within distance $e^{-n\nu}$ of $(1,0)$. Consequently, we begin by supposing that $v$ is a vector with $d^{\prime}(v,(1,0))<e^{-n\nu}$ and then seeing how this vector shadows the trajectory of $(1,0)$. Then as both $v_{A}$ and $v_{B}$ are vectors satisfying this property, the needed conclusion follows by the triangle inequality.

This can be seen inductively because, by that lemma (footnote 1: note that Lemma 10.2 applies to the Lyapunov metric since the eigenvalues of the matrices $A_{i}$ are uniformly bounded in both the original and the Lyapunov coordinates),

$d^{\prime}(B^{i}(v),(1,0))\leq d^{\prime}(B^{i}(v),B_{i}(1,0))+d^{\prime}(B_{i}(1,0),(1,0))$
$\leq e^{-.9\hat{\lambda}}d^{\prime}(B^{i-1}(v),(1,0))+C\|A_{i}-B_{i}\|^{\prime}.$

We may continue inductively as long as BivB^{i}v still lies in the neighborhood Bδ0B_{\delta_{0}}. For such ii before this point, the form of the estimate that we obtain is:

d(Bi(v),(1,0))enνe.9iλ^+Cj=1ie.9λ^(ij)AjBjCeβ2(n2i).d^{\prime}(B^{i}(v),(1,0))\leq e^{-n\nu}e^{-.9i\hat{\lambda}}+C\sum_{j=1}^{i}e^{-.9\hat{\lambda}(i-j)}\|A_{j}-B_{j}\|^{\prime}\leq C^{\prime}e^{-\beta_{2}(n_{2}-i)}.

Note that, as this estimate grows exponentially quickly in $i$, the difference between $n_{2}$ and the index $i$ where it first exceeds $\delta_{0}$ is at most $\ln(C^{\prime})/\beta_{2}$, which is constant. Hence, by possibly adjusting the constant, the needed result follows.

To conclude, we apply the triangle inequality to the corresponding estimates on $d^{\prime}(B^{i}(v),(1,0))$ and $d^{\prime}(A^{i}(v),(1,0))$.

Before proceeding further, we record an additional quantitative estimate about the norms of the maps considered in Lemma 10.2.

Claim 10.9.

For a matrix AA as in Lemma 10.2, for all σ>0\sigma>0, there exists ϵ1>0\epsilon_{1}>0 such that if E:22E\colon\mathbb{R}^{2}\to\mathbb{R}^{2} is a matrix of norm ϵϵ1\epsilon\leq\epsilon_{1}, then if vP1v\in\mathbb{R}\operatorname{P}^{1} with d((1,0),v)ϵ1d((1,0),v)\leq\epsilon_{1}:

  1. (1)

    |ΦA(v)ΦA+E(v)|E\left|\Phi_{A}(v)-\Phi_{A+E}(v)\right|\leq\|E\|.

  2. (2)

    |ΦA(v)ΦA((1,0))|(σ/2)lnA\left|\Phi_{A}(v)-\Phi_{A}((1,0))\right|\leq(\sigma/2)\ln\|A\|.

Proof.

This claim follows easily because we are restricting to a neighborhood in $\mathbb{R}\operatorname{P}^{1}$ where $A$ has large norm. Note that if $v$ is a unit vector and $\epsilon_{1}$ is sufficiently small, then $\|Av\|$ and $\|(A+E)v\|$ are both greater than $1$; hence, as $\ln$ is $1$-Lipschitz on $[1,\infty)$, the first claim follows. The second claim is straightforward because by assumption $A=\operatorname{diag}(\sigma_{1},\sigma_{2})$. ∎

Similar to before we have the map $\Phi_{A}^{\prime}(v)=\ln(\|Av\|^{\prime}/\|v\|^{\prime})$ on $\mathbb{R}\operatorname{P}^{1}$; note that this measures the expansion of vectors with respect to the Lyapunov metric. By possibly decreasing the constants in the statement of Lemma 10.2, we can arrange that the conclusions of Claim 10.9 hold as well for all $i\leq n_{2}$. (Both statements hold with respect to the Lyapunov metric, see footnote 1.) We record two facts that follow from Claim 10.8 along with the estimate $\|A_{i}-B_{i}\|^{\prime}\leq e^{-\beta_{1}(n_{2}-i)}$:

$\left|\Phi_{A_{i}}^{\prime}(A^{i-1}v^{u})-\Phi_{A_{i}}^{\prime}(B^{i-1}v^{u})\right|\leq(\sigma/2)\ln\|A_{i}\|^{\prime}$
$\left|\Phi_{A_{i}}^{\prime}(B^{i-1}v^{u})-\Phi_{B_{i}}^{\prime}(B^{i-1}v^{u})\right|\leq\|A_{i}-B_{i}\|^{\prime}\leq e^{-\beta_{1}(n_{2}-i)}.$

Using these claims, we now estimate $\|B^{n_{2}}v^{u}\|^{\prime}$:

$\left|\ln\|B^{n_{2}}v^{u}\|^{\prime}-\ln\|A^{n_{2}}v^{u}\|^{\prime}\right|=\left|\sum_{i=1}^{n_{2}}\Phi_{A_{i}}^{\prime}(A^{i-1}v^{u})-\Phi_{B_{i}}^{\prime}(B^{i-1}v^{u})\right|$
$\leq\sum_{i=1}^{n_{2}}\left|\Phi_{A_{i}}^{\prime}(A^{i-1}v^{u})-\Phi_{A_{i}}^{\prime}(B^{i-1}v^{u})\right|+\left|\Phi_{A_{i}}^{\prime}(B^{i-1}v^{u})-\Phi_{B_{i}}^{\prime}(B^{i-1}v^{u})\right|$
$\leq\sum_{i=1}^{n_{2}}\left((\sigma/2)\ln\|A_{i}\|^{\prime}+e^{-\beta_{1}(n_{2}-i)}\right).$

Thus we see that $\ln\|B^{n_{2}}\|^{\prime}\geq(1-\sigma/2)\ln\|A^{n_{2}}\|^{\prime}-C_{1}$. Using this we now estimate $\|B^{n}\|$. As the norms of all $B_{i}$ and $A_{i}$ are uniformly bounded by $e^{\Delta}$ by assumption, it follows that $\left|\ln\|A^{n}\|-\ln\|A^{n_{2}}\|\right|\leq(n-n_{2})\Delta$. Thus,

lnBn\displaystyle\ln\|B^{n}\| lnBn2(nn2)Δ\displaystyle\geq\ln\|B^{n_{2}}\|-(n-n_{2})\Delta
(1σ/2)lnAn24n2ϵC1(nn2)Δ\displaystyle\geq(1-\sigma/2)\ln\|A^{n_{2}}\|-4n_{2}\epsilon-C_{1}-(n-n_{2})\Delta
(1σ/2)lnAn(σ/2)(nn2)Δ4n2ϵC1(nn2)Δ.\displaystyle\geq(1-\sigma/2)\ln\|A^{n}\|-(\sigma/2)(n-n_{2})\Delta-4n_{2}\epsilon-C_{1}-(n-n_{2})\Delta.

By subtemperedness $\ln\|A^{n}\|\geq\lambda n-C_{2}$ for some $C_{2}$, and hence if $\epsilon$ is small enough compared to $\lambda$ and $\sigma$, then as $n-n_{2}=O(\epsilon n)$ the estimate of part (3) of Proposition 10.3 holds. ∎
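The third conclusion, $\|B^{n}\|\geq\|A^{n}\|^{1-\sigma}$, can also be probed in a toy model. As before, this is only a sketch under our own simplifying assumptions: constant diagonal $A_{i}$, random perturbations with entries bounded by $C_{1}e^{-\alpha(n-i)}$, no normalization of $\det B_{i}$, and hypothetical helper names and parameter values (including $\sigma=0.1$).

```python
import math
import random

def mul(X, Y):
    """Product X @ Y of 2x2 matrices stored as ((a, b), (c, d))."""
    return (
        (X[0][0] * Y[0][0] + X[0][1] * Y[1][0], X[0][0] * Y[0][1] + X[0][1] * Y[1][1]),
        (X[1][0] * Y[0][0] + X[1][1] * Y[1][0], X[1][0] * Y[0][1] + X[1][1] * Y[1][1]),
    )

def op_norm(M):
    """Operator norm of a 2x2 matrix via the largest eigenvalue of M^T M."""
    p = M[0][0] ** 2 + M[1][0] ** 2
    q = M[0][1] ** 2 + M[1][1] ** 2
    r = M[0][0] * M[0][1] + M[1][0] * M[1][1]
    return math.sqrt((p + q) / 2.0 + math.hypot((p - q) / 2.0, r))

def log_norms(n, lam=0.5, alpha=1.0, C1=0.1, seed=0):
    """Return (ln ||A^n||, ln ||B^n||) for the toy sequences described above."""
    rng = random.Random(seed)
    A = B = ((1.0, 0.0), (0.0, 1.0))
    for i in range(1, n + 1):
        Ai = ((math.exp(lam), 0.0), (0.0, math.exp(-lam)))
        size = C1 * math.exp(-alpha * (n - i))   # bound on each entry of B_i - A_i
        Bi = tuple(
            tuple(Ai[r][c] + size * rng.uniform(-1.0, 1.0) for c in range(2))
            for r in range(2)
        )
        A, B = mul(Ai, A), mul(Bi, B)
    return math.log(op_norm(A)), math.log(op_norm(B))

sigma = 0.1  # illustrative choice
for n in (10, 20, 40):
    log_a, log_b = log_norms(n)
    print(n, log_a, log_b, log_b >= (1.0 - sigma) * log_a)
```

In this model the two logarithms differ by a bounded amount, so the loss of a factor $(1-\sigma)$ in the exponent is more than enough once $n$ is moderately large.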

Proposition 10.3 implies that nearby points have close splittings so that the blocks where a tempered splitting fails to exist are not too small.

10.3. Cushion of nearby points

In this subsection, we prove a refinement of the estimate from the previous subsection. Recall Definition 4.9. We show that points with very close trajectories have cushion that differs by O(1)O(1). This will be used later because it shows that if a short curve has a single point with bad cushion, then all of these points have bad cushion.

Proposition 10.10.

Fix (C0,λ)(C_{0},\lambda), Λ>0\Lambda>0, σ>0,\sigma>0, ϖ>0\varpi>0, then for all sufficiently small ϵ>0\epsilon>0 there exists NN and DD such that the following holds. Suppose that (Ai)1in(A_{i})_{1\leq i\leq n} and (Bi)1in(B_{i})_{1\leq i\leq n} are sequences of matrices in SL(2,)\operatorname{SL}(2,\mathbb{R}) with norm at most eΛe^{\Lambda} that are (C0,λ,ϵ)(C_{0},\lambda,\epsilon)-tempered such that AiBiC1eσ(ni)enϖ\|A_{i}-B_{i}\|\leq C_{1}e^{-\sigma(n-i)}e^{-n\varpi}. Let U(A)U(A) and U(B)U(B) denote the cushion of AA and BB. Then

|U(A)U(B)|D.\left|U(A)-U(B)\right|\leq D.
Proof.

In view of the definition of the cushion, it suffices to prove that there exists $D>0$ such that for two such sequences and $1\leq k\leq n$, $\left|\ln\|A^{k}\|-\ln\|B^{k}\|\right|\leq D$. This will follow from the claim below, which gives an exponential shadowing for the most expanded directions of $A^{k}$ and $B^{k}$.

Claim 10.11.

There exist $N_{1}$ and $D_{1}$ such that, as long as $n\geq N$, there exist $\beta_{2}(\lambda,\epsilon,\sigma)$ and $K(C_{0},\lambda,\Lambda,\sigma,\varpi,\epsilon)$ such that for any $N\leq k\leq n$ the following holds. If $v_{k}$ is the unstable vector for $A^{k}$, then for $i\leq k$,

d(Aivk,Bivk)K(eβ2(k+i)+enϖ/2).d(A^{i}v_{k},B^{i}v_{k})\leq K(e^{-\beta_{2}(k+i)}+e^{-n\varpi/2}).
Proof.

This essentially follows due to an enhancement of the argument surrounding Claim 10.8, which we can improve due to the stronger assumptions of the present claim.

As before, we work in Lyapunov charts, and estimate the distance that a vector near $(1,0)$ can drift away from it. Comparing with (10.2), when we look in the Lyapunov charts adapted to the sequence $A_{1},\ldots,A_{k}$, we now have that

$A_{i}=\begin{bmatrix}\sigma_{1,i}&0\\ 0&\sigma_{2,i}\end{bmatrix}=B_{i}+O_{C,\lambda,\Lambda}(e^{-\sigma(n-i)}e^{-\varpi n}e^{4\epsilon i}).$

Hence Lemma 10.2 holds for all $1\leq i\leq n$, i.e. for the entire sequence, as long as $\epsilon$ is sufficiently small relative to $\varpi$. Note that this implies that there exists some $C^{\prime}$ such that $\|A_{i}-B_{i}\|^{\prime}\leq C^{\prime}e^{-\sigma(n-i)}e^{-\varpi n}e^{4\epsilon i}$.

We now do an induction similar to that in Claim 10.8. Denote

dn,k,i=ekνe.9λ^i+C¯en(2/3)ϖeσ(nk),d_{n,k,i}=e^{-k\nu}e^{-.9\hat{\lambda}i}+\bar{C}e^{-n(2/3)\varpi}e^{-\sigma(n-k)},

where C¯\bar{C} is a large constant that will be chosen below. From Lemma 10.2, we can take δ0\delta_{0} so small that any vector making angle less than δ0\delta_{0} with (1,0)(1,0) is contracted by at least e.9λ^e^{-.9\hat{\lambda}}. Take NN so large that for all NknN\leq k\leq n we have that dn,k,0δ0d_{n,k,0}\leq\delta_{0} and hence also for all iNi\leq N, dn,k,iδ0.d_{n,k,i}\leq\delta_{0}. We now verify by induction on ii that if we start with a vector vv such that d(v,(1,0))ekνd^{\prime}(v,(1,0))\leq e^{-k\nu}, then for all iki\leq k,  d(Bi(v),(1,0))dn,k,i.d^{\prime}(B^{i}(v),(1,0))\leq d_{n,k,i}. Indeed

d(Bi(v),(1,0))\displaystyle d^{\prime}(B^{i}(v),(1,0)) e.9λ^dn,k,i1+AiBi\displaystyle\leq e^{-.9\hat{\lambda}}d_{n,k,i-1}+\|A_{i}-B_{i}\|^{\prime}
ekνe.9iλ^+e.9λ^C¯en(2/3)ϖeσ(nk)+Ceσ(ni)eϖne4ϵi.\displaystyle\leq e^{-k\nu}e^{-.9i\hat{\lambda}}+e^{-.9\hat{\lambda}}\bar{C}e^{-n(2/3)\varpi}e^{-\sigma(n-k)}+C^{\prime}e^{-\sigma(n-i)}e^{-\varpi n}e^{4\epsilon i}.

As long as C¯\bar{C} is sufficiently large and ϵ\epsilon is sufficiently small relative to ϖ\varpi, it then follows that:

d(Bi(v),(1,0))ekνe.9λ^i+C¯en(2/3)ϖeσ(nk)dn,k,i.d^{\prime}(B^{i}(v),(1,0))\leq e^{-k\nu}e^{-.9\hat{\lambda}i}+\bar{C}e^{-n(2/3)\varpi}e^{-\sigma(n-k)}\leq d_{n,k,i}.

Thus for 1ik1\leq i\leq k,

d(Biv,Ai(1,0))C1(ekνe.9λ^i+C¯en(2/3)ϖeσ(nk)).d^{\prime}(B^{i}v,A^{i}(1,0))\leq C_{1}(e^{-k\nu}e^{-.9\hat{\lambda}i}+\bar{C}e^{-n(2/3)\varpi}e^{-\sigma(n-k)}).

Lemma A.8, which compares distance on $S^{1}$ for different metrics, implies that as long as $\epsilon$ is sufficiently small relative to $\lambda$ and $\sigma$, then with respect to the reference metric on $\mathbb{R}\operatorname{P}^{1}$ there exists $C_{2}$ such that

d(Biv,Ai((1,0)))C2(ekνe.45λ^i+C¯enϖ/2eσ(nk)).d(B^{i}v,A^{i}((1,0)))\leq C_{2}(e^{-k\nu}e^{-.45\hat{\lambda}i}+\bar{C}e^{-n\varpi/2}e^{-\sigma(n-k)}).

The above estimate holds for any vector vv at distance ekνe^{-k\nu} from (1,0)(1,0).

In particular, from Lemma 10.5, whose weaker hypotheses $(A_{i})_{1\leq i\leq k}$ and $(B_{i})_{1\leq i\leq k}$ satisfy, we see that $v^{k}_{A}$ and $v^{k}_{B}$ are both within $e^{-k\nu}$ distance of $(1,0)$ in the Lyapunov charts as long as $k\geq N_{2}$ for some $N_{2}$. Thus by specializing to these vectors and applying the triangle inequality, we find that $d(B^{i}v_{A}^{k},A^{i}v_{A}^{k})\leq C_{3}(e^{-k\nu}e^{-.45\hat{\lambda}i}+e^{-n\varpi/2})$, which is the desired claim. ∎

Because the norm of all the matrices we are considering is uniformly bounded by $e^{\Lambda}$, the estimate in Claim 10.11 gives that for $k>N$,

$\left|\ln\|A^{k}\|-\ln\|B^{k}\|\right|\leq\sum_{i=1}^{k}K\left(e^{-\beta_{2}(k+i)}+e^{-n\varpi/2}\right)\leq K^{\prime}$

for some fixed $K^{\prime}$. Note that this gives the conclusion of the proposition about the cushion for all indices greater than $N$. For those less than $N$, since there are only finitely many such indices and the norms of the matrices are bounded, we can accommodate them by increasing the constant in the conclusion of the proposition. ∎
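That the sum above is bounded independently of $k\leq n$ is elementary; the following sketch checks it numerically, taking the constant $K$ of Claim 10.11 to multiply both terms. The closed-form bound and all parameter values are our own illustrative assumptions.

```python
import math

def shadowing_sum(k, n, K, beta2, varpi):
    """Sum over 1 <= i <= k of K * (exp(-beta2*(k+i)) + exp(-n*varpi/2))."""
    return sum(
        K * (math.exp(-beta2 * (k + i)) + math.exp(-n * varpi / 2.0))
        for i in range(1, k + 1)
    )

def uniform_bound(K, beta2, varpi):
    """A bound valid for all 1 <= k <= n: geometric series plus sup of x*exp(-x*varpi/2)."""
    geometric = 1.0 / (math.exp(beta2) - 1.0)   # bounds the sum of exp(-beta2*(k+i)) over i >= 1
    linear = 2.0 / (math.e * varpi)             # maximum of x*exp(-x*varpi/2) over x > 0
    return K * (geometric + linear)

# Illustrative values (assumptions, not constants from the paper).
K, beta2, varpi = 3.0, 0.2, 0.1
worst = max(shadowing_sum(k, n, K, beta2, varpi) for n in (10, 50, 200) for k in range(1, n + 1))
print(worst, uniform_bound(K, beta2, varpi))
```

The second term is where the hypothesis $k\leq n$ is used: it contributes at most $Kn e^{-n\varpi/2}$, which is bounded in $n$.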

10.4. Scale selection proposition

Given two nearby standard pairs, we can attempt to “couple” them using the fake stable manifolds. For this we need more quantitative estimates on how close and smooth standard pairs need to be so that we can couple a significant proportion of them. For example, if they are too far apart then a fake stable leaf may not reach from one to the next. Proposition 10.12 below is mostly a summary of results appearing elsewhere in the paper. Note that the first parts of the proposition are statements about temperedness and splittings on uniformly large balls in $M$. Part (4) shows that for fixed $C_{0}$, if we consider sufficiently small $(C_{0},\delta,\upsilon)$-configurations, then on balls of radius $O(\delta)$ transversality to the contracting direction and temperedness of the splitting imply that the holonomies between the curves in a configuration exist and converge exponentially fast.

Below we say that a curve and a cone field are θ0\theta_{0}-transverse if the smallest angle they make is at least θ0\theta_{0}. Also, see Definition B.11 in the appendix for the definition of (C,λ,ϵ,𝒞)(C,\lambda,\epsilon,\mathcal{C})-tempered, which means (C,λ,ϵ)(C,\lambda,\epsilon)-tempered plus the additional condition that the stable direction lies in the cone 𝒞\mathcal{C}.

Proposition 10.12.

Suppose that (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M) with MM a closed surface. There exists λ>0\lambda>0 such that for any 0<λ<λ0<\lambda^{\prime}<\lambda, 0<σ0<\sigma there exists 0<ϵ0,τ<10<\epsilon_{0},\tau<1 such that for any 0<ϵ<ϵ<ϵ00<\epsilon<\epsilon^{\prime}<\epsilon_{0} there exist δ0,δ1,θ,b0,C,C,C′′,θ0,η>0\delta_{0},\delta_{1},\theta,b_{0},C,C^{\prime},C^{\prime\prime},\theta_{0},\eta>0 and NN\in\mathbb{N} such that: for any xMx\in M, i{1,2,3}i\in\{1,2,3\}, there are three nested cone fields 𝒞θi𝒞2θi𝒞3θi\mathcal{C}^{i}_{\theta}\subset\mathcal{C}^{i}_{2\theta}\subset\mathcal{C}^{i}_{3\theta} of angles θ\theta, 2θ2\theta, 3θ3\theta, respectively defined on Bδ0(x)B_{\delta_{0}}(x) by parallel transport from a cone at xx. Further, the 𝒞3θi\mathcal{C}^{i}_{3\theta} are uniformly transverse on Bδ0(x)B_{\delta_{0}}(x). These conefields satisfy the following properties for words ω\omega, where probabilities below are with respect to the Bernoulli measure μ\mu on Σ\Sigma.

  1. (1)

    (Positive probability of tangency to 𝒞θi\mathcal{C}^{i}_{\theta}) For any point yBδ0(x)y\in B_{\delta_{0}}(x) and any i{1,2,3}i\in\{1,2,3\}, the probability that DyfωnD_{y}f^{n}_{\omega} is (C,λ,ϵ,𝒞θi)(C,\lambda,\epsilon,\mathcal{C}^{i}_{\theta})-tempered for all nNn\geq N is at least b0>0b_{0}>0.

  2. (2)

    (Nearby points are also tempered) For any curve γ\gamma, if xγx\in\gamma is (C,λ,ϵ,𝒞θi)(C,\lambda,\epsilon,\mathcal{C}^{i}_{\theta})-tempered at time nn and yγy\in\gamma is a point with dγ(x,y)Dxfωn(1+σ)d_{\gamma}(x,y)\leq\|D_{x}f^{n}_{\omega}\|^{-(1+\sigma)}, then yy is (C,λ,ϵ,𝒞2θi)(C^{\prime},\lambda^{\prime},\epsilon^{\prime},\mathcal{C}^{i}_{2\theta})-tempered at time nn and

    (10.15) DyfωnDxfωn1σ.\|D_{y}f^{n}_{\omega}\|\geq\|D_{x}f^{n}_{\omega}\|^{1-\sigma}.
  3. (3)

    (Existence of fake stable manifolds) For any (C,λ,ϵ,𝒞2θi)(C^{\prime},\lambda^{\prime},\epsilon^{\prime},\mathcal{C}^{i}_{2\theta})-tempered point yBδ0(x)y\in B_{\delta_{0}}(x) at time nNn\geq N, the fake stable curve Wn,δ1s(ω,y)W^{s}_{n,\delta_{1}}(\omega,y) exists, has length at least δ1\delta_{1}, has C2C^{2} norm at most C′′C^{\prime\prime}, and is tangent to 𝒞3θi\mathcal{C}^{i}_{3\theta} on Bδ0(x)B_{\delta_{0}}(x).

  4. (4)

    (There exists a well configured neighborhood) For any C0C_{0}, there exist δ(0,1)\delta\in(0,1), a0,a1,D1,D2>0a_{0},a_{1},D_{1},D_{2}>0 and N1N_{1}\in\mathbb{N} such that for all 0<δ<δ0<\delta^{\prime}<\delta, and any υδτ\upsilon\leq\delta^{\prime}\tau, the following holds for any (C0,δ,υ)(C_{0},\delta^{\prime},\upsilon)-configuration (γ^1,γ^2)(\hat{\gamma}_{1},\hat{\gamma}_{2}). There exist xMx\in M and i{1,2,3}i\in\{1,2,3\} such that γ1\gamma_{1} and γ2\gamma_{2} are uniformly θ0\theta_{0}-transverse to 𝒞3θi\mathcal{C}^{i}_{3\theta} on Bδ0(x)B_{\delta_{0}}(x). We let B2υ(x)B_{2\upsilon}(x) be a ball that demonstrates that γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} are in a (C0,δ,υ)(C_{0},\delta^{\prime},\upsilon)-configuration, i.e. it contains points of γ1\gamma_{1} and γ2\gamma_{2} that are distance at least υ\upsilon from the boundary of those curves. We maintain this choice of xx and ii in the following lettered items:

    1. (a)

      (Fake stable manifolds tangent to 𝒞3θi\mathcal{C}^{i}_{3\theta} are transverse to pairs) If yB2υ(x)y\in B_{2\upsilon}(x) is as in item (3) above, then Wn,δ1s(ω,y)W^{s}_{n,\delta_{1}}(\omega,y) intersects both γ1\gamma_{1} and γ2\gamma_{2} and the points of intersection are both θ0\theta_{0}-transverse, i.e. both γ1\gamma_{1} and γ2\gamma_{2} make an angle at least θ0\theta_{0} with Wn,δ1s(ω,y)W^{s}_{n,\delta_{1}}(\omega,y).

    2. (b)

      (Lower bound on derivative of the holonomy) For nN1n\geq N_{1}, if Bγ1B2υ(x)B\subseteq\gamma_{1}\cap B_{2\upsilon}(x) is a subset of γ1\gamma_{1} consisting of (C,λ,ϵ,𝒞2θi)(C^{\prime},\lambda^{\prime},\epsilon^{\prime},\mathcal{C}^{i}_{2\theta})-tempered points at time nn, then Hns(B)γ2H^{s}_{n}(B)\subseteq\gamma_{2} has length at least D1len(B)D_{1}\operatorname{len}(B). Further, as long as γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} have equal mass and are at most 4δ4\delta^{\prime}-long, there is a pair of connected components of γ^1B2υ(x)\hat{\gamma}_{1}\cap B_{2\upsilon}(x) and γ^2B2υ(x)\hat{\gamma}_{2}\cap B_{2\upsilon}(x), each containing at least a1a_{1} proportion of the mass of γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} respectively, such that if BB is as above and lies in this set, then

      a0(Hns)ρ1|Bρ2|Hns(B).a_{0}(H^{s}_{n})_{*}\rho^{1}|_{B}\leq\rho^{2}|_{H^{s}_{n}(B)}.
    3. (c)

      (Fluctuations in the holonomies) For any (C0,δ,υ)(C_{0},\delta^{\prime},\upsilon)-configured pair (γ1,γ2)(\gamma_{1},\gamma_{2}), if zB2υ(x)z\in B_{2\upsilon}(x) is a (C,λ,ϵ,𝒞2θi)(C^{\prime},\lambda^{\prime},\epsilon^{\prime},\mathcal{C}_{2\theta}^{i}) tempered point at times n,n1N1n,n-1\geq N_{1} and yy is any point with dγ1(x,y)Dxfωn(1+σ)d_{\gamma_{1}}(x,y)\leq\|D_{x}f^{n}_{\omega}\|^{-(1+\sigma)}, then

      (10.16) dγ2(Hns(y),Hn1s(y))e1.99lnDxfωn.d_{\gamma_{2}}(H_{n}^{s}(y),H^{s}_{n-1}(y))\leq e^{-1.99\ln\|D_{x}f^{n}_{\omega}\|}.

      Further, for nN1n\geq N_{1} the Jacobians converge exponentially fast:

      (10.17) |JacHnsJacHn1s|eηn.\left|\operatorname{Jac}H^{s}_{n}-\operatorname{Jac}H^{s}_{n-1}\right|\leq e^{-\eta n}.
    4. (d)

      (Log-α\alpha-Hölder control of Jacobian) If Bγ1B2υ(x)B\subseteq\gamma_{1}\cap B_{2\upsilon}(x) is an open set comprised of (C,λ,ϵ,𝒞θi)(C^{\prime},\lambda^{\prime},\epsilon^{\prime},\mathcal{C}_{\theta}^{i})-tempered points at time nn, then

      (10.18) |logJacHns(x)logJacHns(y)|D2dγ1(x,y)α.\left|\log\operatorname{Jac}H^{s}_{n}(x)-\log\operatorname{Jac}H^{s}_{n}(y)\right|\leq D_{2}d_{\gamma_{1}}(x,y)^{\alpha}.
Proof.

The main non-trivial input to this proposition is the definition of the cones. After they are chosen correctly, the remaining statements follow in a straightforward manner from facts about the fake stable manifolds proven elsewhere.

For any point xMx\in M, we let νx\nu_{x} denote the distribution of the true stable directions EsE^{s} at the point xx, which is a measure on Px1\mathbb{R}\operatorname{P}^{1}_{x}, the projectivization of TxMT_{x}M. As νx\nu_{x} is non-atomic, we can find three disjoint intervals I1,I2,I3I_{1},I_{2},I_{3} of width θ\theta that are each separated by angle at least 4θ4\theta for some angle θ>0\theta>0 and such that νx(I1),νx(I2),νx(I3)\nu_{x}(I_{1}),\nu_{x}(I_{2}),\nu_{x}(I_{3}) are each positive. We then use these intervals to define nested cones 𝒞θ/2i(x)𝒞θi(x)𝒞2θi(x)𝒞3θi(x)\mathcal{C}^{i}_{\theta/2}(x)\subset\mathcal{C}^{i}_{\theta}(x)\subset\mathcal{C}^{i}_{2\theta}(x)\subset\mathcal{C}^{i}_{3\theta}(x) at xx for i{1,2,3}i\in\{1,2,3\}. Due to the continuity of νx\nu_{x} from Proposition B.4, we see that if we parallel translate I1,I2,I3I_{1},I_{2},I_{3} to form cone fields 𝒞1,𝒞2,𝒞3\mathcal{C}_{1},\mathcal{C}_{2},\mathcal{C}_{3} over a ball B(x)B(x) around xx, then we similarly have that νy(𝒞θ/2i)\nu_{y}(\mathcal{C}^{i}_{\theta/2}) is uniformly positive for all yB(x)y\in B(x). All these properties are uniform, so we can do this for any xMx\in M and obtain a neighborhood of uniform size, with uniform lower bound on νy(𝒞θ/2i)\nu_{y}(\mathcal{C}^{i}_{\theta/2}) over all these neighborhoods.

We now verify item (1). There exist λ,ϵ>0\lambda,\epsilon>0 such that for any yMy\in M and almost every ω\omega, DyfωnD_{y}f^{n}_{\omega} is (C(ω),λ,ϵ)(C(\omega),\lambda,\epsilon)-tempered for some C(ω)C(\omega). Further, by Proposition 4.7 we have a uniform estimate on the tail of C(ω)C(\omega) independent of the point yy. Thus, by choosing C1C_{1} sufficiently large, for any yB(x)y\in B(x) and 1i31\leq i\leq 3, with probability at least b0b_{0}, DyfωnD_{y}f^{n}_{\omega} is (C1,λ,ϵ)(C_{1},\lambda,\epsilon)-subtempered and Ens(ω,y)𝒞θ/2i(y)E^{s}_{n}(\omega,y)\in\mathcal{C}^{i}_{\theta/2}(y) for all nNn\geq N. By Proposition 4.6, there exists N0N_{0}\in\mathbb{N} such that for any (C1,λ,ϵ)(C_{1},\lambda,\epsilon)-subtempered trajectory and all nN0n\geq N_{0}, (Ens(ω,y),Es)<θ/4\angle(E^{s}_{n}(\omega,y),E^{s})<\theta/4 and so Ens𝒞θiE^{s}_{n}\in\mathcal{C}^{i}_{\theta}. This gives us the uniformly positive probability of at least b0>0b_{0}>0.

Item (2) is immediate from Proposition 10.3.

Item (3), which states the existence of the fake stable manifolds for (C,λ,ϵ,𝒞2θi)(C^{\prime},\lambda^{\prime},\epsilon^{\prime},\mathcal{C}^{i}_{2\theta})-tempered points, follows from Proposition B.10 (possibly after decreasing δ\delta).

We now verify item (4), which has many subparts. The statement in the initial part follows by making a judicious choice of xx as well as the particular cone 𝒞3θi\mathcal{C}^{i}_{3\theta} on Bδ0(x)B_{\delta_{0}}(x) that the fake stable manifolds will be tangent to. Because γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} are a (C0,δ,υ)(C_{0},\delta^{\prime},\upsilon)-configuration, there exists a pair of points xγ1x\in\gamma_{1} and yγ2y\in\gamma_{2} with d(x,y)<υd(x,y)<\upsilon. We choose to work on the neighborhood Bδ0(x)B_{\delta_{0}}(x). We then must show that we can pick one of the cones 𝒞3θi\mathcal{C}^{i}_{3\theta} that is uniformly transverse to γ1\gamma_{1} and γ2\gamma_{2} on Bδ0(x)B_{\delta_{0}}(x). Let 𝒦1\mathcal{K}_{1} be a small cone around γ1(x)\gamma_{1}^{\prime}(x) and 𝒦2\mathcal{K}_{2} be a small cone around γ2(y)\gamma_{2}^{\prime}(y). We can extend both cones to the whole of Bδ0(x)B_{\delta_{0}}(x) by parallel transport. Since there are three uniformly transverse cones and, provided they are small enough, each of 𝒦1\mathcal{K}_{1} and 𝒦2\mathcal{K}_{2} can be close to at most one of them, we let i{1,2,3}i\in\{1,2,3\} be an index such that the cone 𝒞3θi\mathcal{C}^{i}_{3\theta} is transverse to both 𝒦1\mathcal{K}_{1} and 𝒦2\mathcal{K}_{2}. We let θ0\theta_{0} be a lower bound on the angle that γ1,γ2\gamma_{1},\gamma_{2} make with 𝒞3θi\mathcal{C}^{i}_{3\theta} and note that, as before, θ0\theta_{0} is uniform as it only relies on knowing C0,δC_{0},\delta^{\prime}. We now proceed to checking the lettered items that follow.

Item (4a) says that the fake stable manifolds of (C,λ,ϵ,𝒞2θi)(C^{\prime},\lambda^{\prime},\epsilon^{\prime},\mathcal{C}^{i}_{2\theta})-tempered points are θ0\theta_{0}-transverse to γ1,γ2\gamma_{1},\gamma_{2} and intersect them. This follows from Proposition B.10 because by choice of our constants, for such a tempered point yy, it follows that Wns(ω,y)W^{s}_{n}(\omega,y) is tangent to 𝒞3θi\mathcal{C}^{i}_{3\theta}, and the uniform transversality follows from our control on the C2C^{2} norm of Wns(ω,y)W^{s}_{n}(\omega,y) and the Hölder continuity of the most contracting subspace EnsE^{s}_{n}. Further, the fact that we only need the curves to be at most υ=τδ\upsilon=\tau\delta^{\prime} apart from each other, with τ\tau depending only on θ0,λ,λ\theta_{0},\lambda^{\prime},\lambda is clear from the uniform C2C^{2} bound on the norm of the fake stable manifolds Wns(ω,y)W^{s}_{n}(\omega,y) from item (3). Item (4a) follows because as long as δ\delta is sufficiently small compared with C0C_{0}, the tangent direction to γi\gamma_{i} is close to constant on a segment of length δ\delta.

The first part of item (4b) saying that there is a lower bound on the derivative of the holonomies follows from Proposition B.13.

The next claim is that restricted to a segment in Bδ0B_{\delta_{0}}, γ1\gamma_{1} and γ2\gamma_{2} have a positive proportion of their mass there. This follows due to the log-Hölder regularity of ρ1\rho^{1} and ρ2\rho^{2} as long as δ\delta is sufficiently small. Due to the boundedness of the Jacobian, the log-Hölderness of the densities and them both having a positive amount of their mass on Bδ0(x)B_{\delta_{0}}(x), it additionally follows that there exists such a uniform constant a0a_{0} as stated in item (4b).

Item (4c) is immediate from the statement of Proposition B.12.

Finally, item (4d), which concerns the fluctuations in the Jacobian of HnsH^{s}_{n}, follows from Proposition B.13. ∎

10.5. Proof of Inductive Local Coupling Lemma.

We are now ready to prove the inductive local coupling lemma.

First we prove a result that does not make any assertions about the quantity of points on the curve γ1\gamma_{1} that have a tempered splitting. It just shows that given an infinite trajectory ωΣ\omega\in\Sigma, we may use this trajectory to define a fake coupling in the sense of Definition 10.1 at all future times.

Lemma 10.13.

(Inductive Coupling Lemma.) Let (f1,,fm)(f_{1},\ldots,f_{m}) be an expanding on average tuple in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M) for MM a closed surface.

For any C0>0C_{0}>0 let λ,λ,ϵ0,τ,ϵ,ϵ\lambda,\lambda^{\prime},\epsilon_{0},\tau,\epsilon,\epsilon^{\prime}, etc. be a valid choice of constants in the first paragraph of Proposition 10.12 and δ,δ,υ,\delta,\delta^{\prime},\upsilon, etc., be a valid choice of constants in part (4) of that proposition. Then there exist b1,η^,Λ>0b_{1},\hat{\eta},\Lambda>0 such that for any (C0,δ,υ)(C_{0},\delta^{\prime},\upsilon)-configuration (γ^1,γ^2)(\hat{\gamma}_{1},\hat{\gamma}_{2}) the conclusions of Proposition 10.12 apply and the following holds. If xMx\in M and Bδ0(x)B_{\delta_{0}}(x) is the neighborhood where the statements from Proposition 10.12(4) hold, then we can construct a (b1,η^)(b_{1},\hat{\eta})-fake coupling out of (γ^1,γ^2)(\hat{\gamma}_{1},\hat{\gamma}_{2}): for each ωΣ\omega\in\Sigma there exists a decreasing sequence of pairs of standard subfamilies Pn1γ^1P^{1}_{n}\subseteq\hat{\gamma}_{1} and Pn2γ^2P^{2}_{n}\subseteq\hat{\gamma}_{2} that are (b1,η^)(b_{1},\hat{\eta})-fake coupled at each time nN1n\geq N_{1}. Further, for nN1n\geq N_{1} and i{1,2}i\in\{1,2\}, the differences PniPn+1iP^{i}_{n}\setminus P^{i}_{n+1} are nΛn\Lambda-good standard families.

These sequences of standard families are decreasing and converge to measures P1P^{1}_{\infty} and P2P^{2}_{\infty}. Further, for such a fake coupling we also have the true stable holonomies HsH^{s}_{\infty} and these satisfy (Hs)P1=P2(H^{s}_{\infty})_{*}P^{1}_{\infty}=P^{2}_{\infty}.

Proof of Lemma 10.13.

We divide the proof into several steps. In Step 0, we introduce the constants that will be used later in the proof; naturally we will also make use of many constants from Proposition 10.12, which is essentially the setup for this lemma. Then in the following steps we give an iterative procedure showing how one may construct a new fake coupled pair out of an old one. By iterating that procedure, we then obtain the result.

Step 0: Introduction of constants. At this step we introduce some of the constants that will be used in the proof. Most of these constants will be chosen when they appear in the proof.

  1. (1)

    First, we let λ,λ,ϵ,D1,\lambda,\lambda^{\prime},\epsilon^{\prime},D_{1}, etc., be the constants from the statement of Proposition 10.12. For the given γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} we let Bδ0(x)B_{\delta_{0}}(x) be a neighborhood so that the conclusions of part 4 of that proposition apply. We will simply write 𝒞θ\mathcal{C}_{\theta} rather than 𝒞θi\mathcal{C}^{i}_{\theta} below for the cones defined on Bδ0(x)B_{\delta_{0}}(x) such that γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} have segments that are both uniformly θ0\theta_{0}-transverse to 𝒞3θi\mathcal{C}_{3\theta}^{i} on Bδ0(x)B_{\delta_{0}}(x). We let Λmax\Lambda_{\max} be sufficiently large so that DxfieΛmax\|D_{x}f_{i}\|\leq e^{\Lambda_{\max}} for all xMx\in M and 1im1\leq i\leq m.

  2. (2)

    Further, in the application of Proposition 10.12 we will insist that δ\delta is so small that for any C0C_{0}-good curve with density ρ\rho on a ball of size δ\delta, the log\log-Hölder condition on ρ\rho implies that 1/2<ρ(y)/ρ(x)<21/2<\rho(y)/\rho(x)<2 on this ball.

  3. (3)

    Below, we have certain estimates that will only hold as long as nn is sufficiently large. We will have some cutoffs N1,N2N_{1},N_{2} that we define in the course of the proof at the ends of steps 2 and 6, respectively. The cutoffs N1N_{1} and N2N_{2} only depend on the fixed constants from (1) and (2) above. We then set N0=max{N,N1,N2}\displaystyle N_{0}=\max\{N,N_{1},N_{2}\} in the conclusion of the theorem where NN is the cutoff for Proposition 10.12 to hold.

Step 1: Definition of n1\mathcal{I}_{n}^{1}. Let Γ1\Gamma_{1} be a connected component of Bδ0/2(x)γ1B_{\delta_{0}/2}(x)\cap\gamma_{1} within distance υ\upsilon of γ2\gamma_{2}. Let GωnG^{n}_{\omega} be the set of (C,λ,ϵ,𝒞2θ)(C^{\prime},\lambda^{\prime},\epsilon^{\prime},\mathcal{C}_{2\theta})-tempered points at time nn lying in Γ1\Gamma_{1} (see Definition B.11). Note that GωnGωn1G^{n}_{\omega}\subseteq G^{n-1}_{\omega}. We set

(10.19) ηn(x)=14(max1mn{Dxfωme(nm)λ/2}),\eta_{n}(x)=\frac{1}{4(\displaystyle\max_{1\leq m\leq n}\{\|D_{x}f^{m}_{\omega}\|e^{(n-m)\lambda^{\prime}/2}\})},

and

(10.20) δn(x)=ηn(1+σ)(x).\delta_{n}(x)=\eta_{n}^{(1+\sigma)}(x).

We now construct n1\mathcal{I}_{n}^{1}. For each xGωnx\in G^{n}_{\omega}, we say that xx is padded if Bδn(x)γ1(x)GωnB_{\delta_{n}(x)}^{\gamma_{1}}(x)\subseteq G^{n}_{\omega}, where Bδnγ1(x)B^{\gamma_{1}}_{\delta_{n}}(x) denotes the ball of radius δn\delta_{n} about xx in γ1\gamma_{1} with respect to the arclength on γ1\gamma_{1}. We let HωnH^{n}_{\omega} denote the set of all such padded points. Let ^n1γ1\hat{\mathcal{I}}_{n}^{1}\subset\gamma_{1} be the set

(10.21) ^n1=xHωnBδn(x)γ1(x).\hat{\mathcal{I}}_{n}^{1}=\bigcup_{x\in H^{n}_{\omega}}B_{\delta_{n}(x)}^{\gamma_{1}}(x).

Note that ^n1\hat{\mathcal{I}}_{n}^{1} is a finite union of intervals. Delete intervals of length K1e4ΛmaxnK_{1}e^{-4\Lambda_{\max}n} from the edges of each component where K1>0K_{1}>0 is a fixed small constant that we choose below. Call this trimmed collection of intervals n1\mathcal{I}^{1}_{n}.

We next check that n1n11\mathcal{I}_{n}^{1}\subseteq\mathcal{I}_{n-1}^{1}. By the definition of δn\delta_{n}, δn(x)eλ/2δn1(x)\delta_{n}(x)\leq e^{-\lambda^{\prime}/2}\delta_{n-1}(x), thus as long as K1K_{1} is sufficiently small,

(10.22) δneλ/4δn1K1e4(n1)Λmax.\delta_{n}\leq e^{-\lambda^{\prime}/4}\delta_{n-1}-K_{1}e^{-4(n-1)\Lambda_{\max}}.

Thus from the definition of n1\mathcal{I}^{1}_{n}, it is immediate that n1n11\mathcal{I}^{1}_{n}\subseteq\mathcal{I}^{1}_{n-1}.
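
The bound on δ_n and the choice of K_1 in (10.22) can be unpacked as follows. This is only a sketch: the auxiliary name M_n for the maximum in (10.19), the standing assumptions σ ≤ 1 and λ' ≤ 2Λ_max, and the explicit threshold for K_1 are ours and do not appear in the text.

```latex
% Sketch only. M_n is a hypothetical name for the maximum in (10.19).
\begin{align*}
M_n(x) &:= \max_{1\le m\le n}\|D_xf^m_\omega\|\,e^{(n-m)\lambda'/2}
  \;\ge\; \max_{1\le m\le n-1}\|D_xf^m_\omega\|\,e^{(n-m)\lambda'/2}
  \;=\; e^{\lambda'/2}M_{n-1}(x),\\
\eta_n(x) &= \frac{1}{4M_n(x)} \;\le\; e^{-\lambda'/2}\,\eta_{n-1}(x),\\
\delta_n(x) &= \eta_n(x)^{1+\sigma} \;\le\; e^{-(1+\sigma)\lambda'/2}\,\delta_{n-1}(x)
  \;\le\; e^{-\lambda'/2}\,\delta_{n-1}(x).
\end{align*}
% Assumed (not in the text): \lambda' \le 2\Lambda_{\max} and \sigma \le 1,
% so that M_{n-1}(x) \le e^{(n-1)\Lambda_{\max}}.
\begin{align*}
\delta_{n-1}(x) &\ge 4^{-(1+\sigma)}e^{-(1+\sigma)(n-1)\Lambda_{\max}}
  \;\ge\; \tfrac{1}{16}\,e^{-2(n-1)\Lambda_{\max}},\\
K_1e^{-4(n-1)\Lambda_{\max}} &\le \bigl(e^{-\lambda'/4}-e^{-\lambda'/2}\bigr)\delta_{n-1}(x)
  \quad\text{whenever } K_1\le \tfrac{1}{16}\bigl(e^{-\lambda'/4}-e^{-\lambda'/2}\bigr),\\
\delta_n(x) &\le e^{-\lambda'/2}\delta_{n-1}(x)
  \;\le\; e^{-\lambda'/4}\delta_{n-1}(x)-K_1e^{-4(n-1)\Lambda_{\max}}.
\end{align*}
```

Under these assumptions any K_1 below the stated threshold works uniformly in n and x, which is one way to read "as long as K_1 is sufficiently small".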

Step 2: Definition of n2\mathcal{I}^{2}_{n}. From the previous step, we know that any point in n1\mathcal{I}^{1}_{n} satisfies the hypotheses of Proposition 10.12. Since γ^1\hat{\gamma}_{1} and γ^2\hat{\gamma}_{2} are uniformly θ0\theta_{0}-transverse to 𝒞3θ\mathcal{C}_{3\theta}, it follows from Proposition 10.12(4a) that the fake stable manifold Wn,δ1s(y)W^{s}_{n,\delta_{1}}(y) of each point yn1y\in\mathcal{I}_{n}^{1} intersects γ2\gamma_{2}. Hence, there is a well defined holonomy Hns:n1γ2H^{s}_{n}\colon\mathcal{I}^{1}_{n}\to\gamma_{2} which satisfies all the conclusions of Proposition 10.12. We define

(10.23) n2=Hns(n1).\mathcal{I}^{2}_{n}=H^{s}_{n}(\mathcal{I}_{n}^{1}).

Next we check that n2n12\mathcal{I}^{2}_{n}\subseteq\mathcal{I}^{2}_{n-1}. For this we will use the control on the fluctuations in the size of the holonomies from Claim 10.14 below. As we vary nn, the fluctuations in Hns(y)H^{s}_{n}(y) are smaller than the width of the neighborhoods δn\delta_{n} in (10.20), and the result will follow.

Suppose that xn1x\in\mathcal{I}^{1}_{n}. We must show that Hns(x)n12H^{s}_{n}(x)\in\mathcal{I}^{2}_{n-1}. Note that while xx might not be in HωnH^{n}_{\omega} it is in GωnG^{n}_{\omega}. So, there exists some point yy such that xBδn(y)γ1(y)x\in B^{\gamma_{1}}_{\delta_{n}(y)}(y) and hence also in Bδn1(y)γ1(y)B^{\gamma_{1}}_{\delta_{n-1}(y)}(y).

To show that Hns(x)n12H^{s}_{n}(x)\in\mathcal{I}_{n-1}^{2}, we estimate how far Hns(x)H^{s}_{n}(x) is from Hn1s(x)H^{s}_{n-1}(x) and then estimate how far Hns(x)H^{s}_{n}(x) is from the boundary of Hn1s(Bδn1(y)γ1(y))H^{s}_{n-1}(B^{\gamma_{1}}_{\delta_{n-1}(y)}(y)). For the former, we use the following claim.

Claim 10.14.

There exists C1>0C_{1}>0 such that if xBδn(y)γ1(y)x\in B^{\gamma_{1}}_{\delta_{n}(y)}(y) for some yn1y\in\mathcal{I}^{1}_{n} then

dγ2(Hns(x),Hn1s(x))C1ηn1.99(1σ)2(y).d_{\gamma_{2}}(H^{s}_{n}(x),H^{s}_{n-1}(x))\leq C_{1}\eta_{n}^{1.99(1-\sigma)^{2}}(y).
Proof.

First we show that there exists CaC_{a} such that

(10.24) DyfωnCaηn(1σ)(y).\|D_{y}f^{n}_{\omega}\|\geq C_{a}\eta_{n}^{-(1-\sigma)}(y).

Let NmnN\leq m\leq n, be the number achieving the maximum in the definition of ηn\eta_{n}, (10.19). From (C,λ,ϵ)(C^{\prime},\lambda^{\prime},\epsilon^{\prime}) temperedness,

(10.25) Dyfωn=Dyfωm+(nm)eCeλ(nm)eϵmDyfωm.\|D_{y}f^{n}_{\omega}\|=\|D_{y}f^{m+(n-m)}_{\omega}\|\geq e^{-C^{\prime}}e^{\lambda^{\prime}(n-m)}e^{-\epsilon^{\prime}m}\|D_{y}f^{m}_{\omega}\|.

By the definition of ηn\eta_{n},

(10.26) ηn(1σ)4(1σ)e(nm)(1σ)λ/2Dyfωm(1σ).\eta_{n}^{-(1-\sigma)}\leq 4^{(1-\sigma)}e^{(n-m)(1-\sigma)\lambda^{\prime}/2}\|D_{y}f^{m}_{\omega}\|^{(1-\sigma)}.

But from (C,λ,ϵ)(C^{\prime},\lambda^{\prime},\epsilon^{\prime})-temperedness, DyfωmeCeλm\|D_{y}f^{m}_{\omega}\|\geq e^{-C^{\prime}}e^{\lambda^{\prime}m}. Hence as long as λσ>2ϵ\lambda^{\prime}\sigma>2\epsilon^{\prime}, it follows that there exists CbC_{b} such that for all nNn\geq N we have CbDyfωm(1σ)Dyfωme2ϵmC_{b}\|D_{y}f^{m}_{\omega}\|^{(1-\sigma)}\leq\|D_{y}f^{m}_{\omega}\|e^{-2\epsilon^{\prime}m}. Hence there is some CcC_{c} such that

Ccηn(1σ)e(nm)λ/2Dyfωme2ϵm.C_{c}\eta_{n}^{-(1-\sigma)}\leq e^{(n-m)\lambda^{\prime}/2}\|D_{y}f^{m}_{\omega}\|e^{-2\epsilon^{\prime}m}.

Comparing the above equation with (10.25) yields equation (10.24).
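
For readers who want the constants made explicit, the chain of inequalities leading to (10.24) can be sketched as follows. The specific values given for C_b, C_c and C_a are one admissible choice under the stated hypothesis λ'σ > 2ε'; they are our bookkeeping and are not claimed in the text.

```latex
% Sketch only: one admissible choice of C_b, C_c, C_a.
\begin{align*}
\|D_yf^m_\omega\|^{\sigma}
  &\ge e^{-C'\sigma}e^{\lambda'\sigma m} \ge e^{-C'\sigma}e^{2\epsilon' m}
  &&\Longrightarrow\quad
  C_b\|D_yf^m_\omega\|^{1-\sigma}\le \|D_yf^m_\omega\|e^{-2\epsilon' m},
  \quad C_b := e^{-C'\sigma},\\
C_c\,\eta_n^{-(1-\sigma)}
  &\le C_b\,e^{(n-m)(1-\sigma)\lambda'/2}\|D_yf^m_\omega\|^{1-\sigma}
  &&\le e^{(n-m)\lambda'/2}\|D_yf^m_\omega\|e^{-2\epsilon' m},
  \quad C_c := 4^{-(1-\sigma)}C_b,\\
\|D_yf^n_\omega\|
  &\ge e^{-C'}e^{\lambda'(n-m)}e^{-\epsilon' m}\|D_yf^m_\omega\|
  &&\ge e^{-C'}e^{(n-m)\lambda'/2}e^{-2\epsilon' m}\|D_yf^m_\omega\|
  \;\ge\; C_a\,\eta_n^{-(1-\sigma)},
  \quad C_a := e^{-C'}C_c.
\end{align*}
```

The second line uses (10.26), and the third uses (10.25) together with the elementary comparisons e^{λ'(n-m)} ≥ e^{(n-m)λ'/2} and e^{-ε'm} ≥ e^{-2ε'm}.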

Next, as explained in Step 1 above, all points in n1\mathcal{I}^{1}_{n} satisfy the conclusions of Proposition 10.12(2). Thus,

DxfωnDyfωn(1σ).\|D_{x}f^{n}_{\omega}\|\geq\|D_{y}f^{n}_{\omega}\|^{(1-\sigma)}.

Combining this with (10.24) gives:

DxfωnCa(1σ)ηn(1σ)2(y).\|D_{x}f^{n}_{\omega}\|\geq C_{a}^{(1-\sigma)}\eta_{n}^{-(1-\sigma)^{2}}(y).

Then applying Proposition 10.12(4c) gives the conclusion. ∎
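
The last step of the proof of Claim 10.14 can be written out as a short computation. This is a sketch in which the explicit value of C_1 is our own bookkeeping; it uses only (10.16) and the lower bound on the norm of D_x f^n_ω displayed just above.

```latex
% Sketch only: how (10.16) yields the claim, with one admissible value of C_1.
\begin{align*}
d_{\gamma_2}\bigl(H^s_n(x),H^s_{n-1}(x)\bigr)
  &\le e^{-1.99\ln\|D_xf^n_\omega\|}
   = \|D_xf^n_\omega\|^{-1.99}\\
  &\le \bigl(C_a^{(1-\sigma)}\eta_n^{-(1-\sigma)^2}(y)\bigr)^{-1.99}
   = C_a^{-1.99(1-\sigma)}\,\eta_n^{1.99(1-\sigma)^2}(y),
\end{align*}
% so one may take C_1 := C_a^{-1.99(1-\sigma)}.
```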

We now continue with the proof that n2n12\mathcal{I}^{2}_{n}\subseteq\mathcal{I}^{2}_{n-1}. First, note that by the triangle inequality,

dγ1(x,Bδn1(y)K1e4Λmax(n1)γ1(y))δn1(y)K1e4(n1)Λmaxδn(y).d_{\gamma_{1}}(x,\partial B_{\delta_{n-1}(y)-K_{1}e^{-4\Lambda_{\max}(n-1)}}^{\gamma_{1}}(y))\geq\delta_{n-1}(y)-K_{1}e^{-4(n-1)\Lambda_{\max}}-\delta_{n}(y).

We then apply Hn1sH^{s}_{n-1}. By Proposition 10.12(4b) it follows that

(10.27) dγ2(Hn1s(x),Hn1s(Bδn1(y)K1e4Λmax(n1)γ1(y)))\displaystyle d_{\gamma_{2}}(H^{s}_{n-1}(x),\partial H^{s}_{n-1}(B_{\delta_{n-1}(y)-K_{1}e^{-4\Lambda_{\max}(n-1)}}^{\gamma_{1}}(y)))
\displaystyle\geq D1(δn1(y)K1e4(n1)Λmaxδn(y)).\displaystyle D_{1}(\delta_{n-1}(y)-K_{1}e^{-4(n-1)\Lambda_{\max}}-\delta_{n}(y)).

But by Claim 10.14, dγ2(Hn1s(x),Hns(x))C1ηn1.99(1σ)2(y).\displaystyle d_{\gamma_{2}}(H^{s}_{n-1}(x),H^{s}_{n}(x))\leq C_{1}\eta_{n}^{1.99(1-\sigma)^{2}}(y). Hence by the triangle inequality

dγ2(Hns(x),Hn1s(Bδn1(y)K1e4Λmax(n1)γ1(y)))d_{\gamma_{2}}(H^{s}_{n}(x),\partial H^{s}_{n-1}(B_{\delta_{n-1}(y)-K_{1}e^{-4\Lambda_{\max}(n-1)}}^{\gamma_{1}}(y)))
D1(δn1(y)K1e4(n1)Λmaxδn(y))C1ηn1.99(1σ)2(y).\geq D_{1}(\delta_{n-1}(y)-K_{1}e^{-4(n-1)\Lambda_{\max}}-\delta_{n}(y))-C_{1}\eta_{n}^{1.99(1-\sigma)^{2}}(y).

By (10.22) δn1K1e4(n1)Λmaxδn(1eλ/4)δn1.\displaystyle\delta_{n-1}-K_{1}e^{-4(n-1)\Lambda_{\max}}-\delta_{n}\geq\left(1-e^{-\lambda^{\prime}/4}\right)\delta_{n-1}. Hence as ηn1.99(1σ)2\eta_{n}^{1.99(1-\sigma)^{2}} is of a higher order than δn\delta_{n}, there exists some N1N_{1} such that for nN1n\geq N_{1},

(10.28) dγ2(Hns(x),n12)21D1(1eλ/4)δn1(y)>0.d_{\gamma_{2}}(H^{s}_{n}(x),\partial\mathcal{I}^{2}_{n-1})\geq 2^{-1}D_{1}\left(1-e^{-\lambda^{\prime}/4}\right)\delta_{n-1}(y)>0.

This shows that Hns(n1)n12H^{s}_{n}(\mathcal{I}^{1}_{n})\subset\mathcal{I}^{2}_{n-1} as desired.
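
The phrase "of a higher order" in the argument above can be made concrete with a short exponent comparison. The sketch below assumes σ is small enough that 1.99(1-σ)^2 > 1+σ, which the text appears to use implicitly, and takes σ = 0.1 purely as an illustrative value.

```latex
% Sketch only. Assumes 1.99(1-\sigma)^2 > 1+\sigma; \sigma = 0.1 is illustrative.
\begin{align*}
\frac{\eta_n^{1.99(1-\sigma)^2}(y)}{\delta_{n-1}(y)}
  &\le \frac{\eta_n^{1.99(1-\sigma)^2}(y)}{\delta_n(y)}
   = \eta_n(y)^{\,1.99(1-\sigma)^2-(1+\sigma)}
   && \text{since } \delta_n\le\delta_{n-1},\ \delta_n=\eta_n^{1+\sigma},\\
\eta_n(y) &\le \frac{1}{4\|D_yf^n_\omega\|} \le \tfrac{1}{4}e^{C'}e^{-\lambda' n}
   && \text{taking } m=n \text{ in (10.19) and using temperedness},\\
1.99(1-\sigma)^2-(1+\sigma)\Big|_{\sigma=0.1} &= 1.6119-1.1 = 0.5119 > 0.
\end{align*}
```

So the ratio decays exponentially in n, and the error term from Claim 10.14 is eventually at most half of the main term D_1(1-e^{-λ'/4})δ_{n-1}(y), which is how a cutoff such as N_1 arises.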

Step 3. Lengths of curves in n1ini\mathcal{I}_{n-1}^{i}\setminus\mathcal{I}_{n}^{i}. This is needed to estimate the regularity of PniPn+1iP_{n}^{i}\setminus P_{n+1}^{i}.

First we consider the size of the trimmed segments when we pass from ^n1\hat{\mathcal{I}}^{1}_{n} to n1\mathcal{I}^{1}_{n}. Any connected component of ^n1\hat{\mathcal{I}}^{1}_{n} has length at least δn(x)\delta_{n}(x) for some xx. Note that this is bounded below by an exponential eΛmaxne^{-\Lambda_{\max}n}. Then as we trim a remaining K1e4ΛmaxnK_{1}e^{-4\Lambda_{\max}n} length off these intervals when we pass from ^n1\hat{\mathcal{I}}^{1}_{n} to n1\mathcal{I}^{1}_{n}, we see that each interval we trim has length at least K1e4ΛmaxnK_{1}e^{-4\Lambda_{\max}n}.

There are two ways that xn11x\in\mathcal{I}^{1}_{n-1} may fail to be in n1\mathcal{I}^{1}_{n}. Write n11(x)\mathcal{I}^{1}_{n-1}(x) for the connected component of n11\mathcal{I}^{1}_{n-1} containing xx. Then either n11(x)\mathcal{I}^{1}_{n-1}(x) contains a point yy that is in n1\mathcal{I}^{1}_{n} or the entire component containing xx is deleted. In the first case the connected component of n11n1\mathcal{I}^{1}_{n-1}\setminus\mathcal{I}^{1}_{n} containing xx has length at least K1e4ΛmaxnK_{1}e^{-4\Lambda_{\max}n} by the previous paragraph. In the second case, the removed segment is at least eΛmaxne^{-\Lambda_{\max}n} long. Thus we have obtained an exponential lower bound on the lengths of curves in n11n1\mathcal{I}^{1}_{n-1}\setminus\mathcal{I}^{1}_{n}.

As Hns(n1)=n2H^{s}_{n}(\mathcal{I}^{1}_{n})=\mathcal{I}^{2}_{n} we can use the size of the gaps in n1\mathcal{I}^{1}_{n} to estimate the size of those in n2\mathcal{I}^{2}_{n}. Note that from estimate (10.27), each segment in n12n2\mathcal{I}^{2}_{n-1}\setminus\mathcal{I}^{2}_{n} has width at least D1K1e4ΛmaxnD_{1}K_{1}e^{-4\Lambda_{\max}n}.

Step 4. Definition of the densities. So far we have defined the underlying curves n1,n2\mathcal{I}^{1}_{n},\mathcal{I}^{2}_{n} that the standard families Pn1,Pn2P^{1}_{n},P^{2}_{n} will be defined on. We now define the densities on n1\mathcal{I}^{1}_{n} and n2\mathcal{I}^{2}_{n}. To begin, we will define ρN1\rho^{1}_{N} and ρN2\rho^{2}_{N} where NN is the first time we attempt to fake-couple. From Proposition 10.12(4b), there exists a0>0a_{0}>0 such that for Bn1B\subseteq\mathcal{I}^{1}_{n},

(10.29) a0(Hns)ρ1|Bρ2|Hns(B).a_{0}(H^{s}_{n})_{*}\rho^{1}|_{B}\leq\rho^{2}|_{H^{s}_{n}(B)}.

We then take as our initial definition:

(10.30) ρN1=a0ρ1|N1 and ρN2=(HNs)ρN1.\rho^{1}_{N}=a_{0}\rho^{1}|_{\mathcal{I}^{1}_{N}}\text{ and }\rho^{2}_{N}=(H^{s}_{N})_{*}\rho^{1}_{N}.

This gives us ρN1\rho^{1}_{N} and ρN2\rho^{2}_{N}.

We now define ρn1\rho^{1}_{n} and ρn2\rho^{2}_{n} for nNn\geq N. We set:

(10.31) ρn1=ρn11(1enη^)|n1\rho^{1}_{n}=\rho^{1}_{n-1}(1-e^{-n\hat{\eta}})|_{\mathcal{I}^{1}_{n}}

where η^\hat{\eta} is chosen in equation (10.41) below. We then define

(10.32) ρn2=(Hns)(ρn1).\rho^{2}_{n}=(H^{s}_{n})_{*}(\rho^{1}_{n}).

As we push forward ρn1\rho^{1}_{n} by the holonomy HnsH^{s}_{n}, which carries n1\mathcal{I}^{1}_{n} to n2\mathcal{I}^{2}_{n}, ρn2\rho^{2}_{n} is a measure on n2\mathcal{I}^{2}_{n}. This defines completely Pn1P^{1}_{n} and Pn2P^{2}_{n}.

The rest of the proof will be checking that the standard families Pn1P^{1}_{n} and Pn2P^{2}_{n} have the required properties to be a fake coupling. Some are evident from the definition above, but it remains to check:

(1) the regularity of ρn1\rho^{1}_{n} and ρn2\rho^{2}_{n},

(2) that ρn2\rho^{2}_{n} is a decreasing sequence of measures, and

(3) the goodness of the standard families Pn1iPniP^{i}_{n-1}\setminus P^{i}_{n} for i{1,2}i\in\{1,2\}.

Step 5: Regularity of ρn1\rho^{1}_{n} and ρn2\rho^{2}_{n}. In this step we study the log-Hölder constants of ρn1\rho^{1}_{n} and ρn2\rho^{2}_{n} for nNn\geq N. Note that ρn1\rho^{1}_{n} is the restriction of ρN1\rho^{1}_{N} to n1\mathcal{I}^{1}_{n} scaled by a constant, so it has the same log-Hölder constant as ρN1\rho^{1}_{N}.

Before proceeding to study the regularity of ρn2\rho^{2}_{n}, we introduce some notation related to the Jacobian of the holonomies. Typically the Jacobian of an invertible, absolutely continuous map ϕ:(X,ν)(Y,μ)\phi\colon(X,\nu)\to(Y,\mu) is the Radon-Nikodym derivative dϕμ/dνd\phi^{*}\mu/d\nu. In our case, as we are pushing forward the density ρn1\rho^{1}_{n} by HnsH^{s}_{n}, the result is the same thing as pulling back ρn1\rho^{1}_{n} by (Hns)1(H^{s}_{n})^{-1}. To simplify notation, we will simply write JnJ_{n} for the Jacobian of (Hns)1(H^{s}_{n})^{-1}, which is a function Jn:n2>0J_{n}\colon\mathcal{I}^{2}_{n}\to\mathbb{R}_{>0}. Returning to ρn2\rho^{2}_{n}, this function satisfies for yn2y\in\mathcal{I}^{2}_{n} that

(10.33) ρn2(y)=Jn(y)ρn1((Hns)1(y)).\rho^{2}_{n}(y)=J_{n}(y)\rho^{1}_{n}((H^{s}_{n})^{-1}(y)).

As the assumptions on the holonomies are symmetric in γ1\gamma_{1} and γ2\gamma_{2}, we know from Proposition 10.12(4b) that HnsH^{s}_{n} is D1D_{1}-bilipschitz. Thus by Proposition 10.12(4d), there exists D2D_{2} such that JnJ_{n} is log-α\alpha-Hölder with constant D2D_{2} for all nNn\geq N. Next, since ρn1\rho^{1}_{n} is log-α\alpha-Hölder with constant C0C_{0}, ρn1(Hns)1\rho^{1}_{n}\circ(H^{s}_{n})^{-1} is log-α\alpha-Hölder with constant D1αC0D_{1}^{\alpha}C_{0}. The product of log-α\alpha-Hölder functions is log-α\alpha-Hölder with constant equal to the sum of the constants. Thus by (10.33), we see that ρn2\rho^{2}_{n} is log-α\alpha-Hölder with constant D1αC0+D2D_{1}^{\alpha}C_{0}+D_{2}. Thus we have obtained uniform log-α\alpha-Hölder control for ρn1\rho^{1}_{n} and ρn2\rho^{2}_{n}.
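
The constant for ρ^2_n can be checked by taking logarithms in (10.33). The following sketch assumes, as the paragraph above asserts, that the inverse holonomy is Lipschitz with constant D_1; the points y and y' are arbitrary points of the same component of the domain of J_n.

```latex
% Sketch only. Assumes (H^s_n)^{-1} is D_1-Lipschitz, as asserted in the text.
\begin{align*}
\bigl|\log\rho^2_n(y)-\log\rho^2_n(y')\bigr|
  &\le \bigl|\log J_n(y)-\log J_n(y')\bigr|
   + \bigl|\log\rho^1_n\bigl((H^s_n)^{-1}y\bigr)-\log\rho^1_n\bigl((H^s_n)^{-1}y'\bigr)\bigr|\\
  &\le D_2\,d_{\gamma_2}(y,y')^{\alpha}
   + C_0\,d_{\gamma_1}\bigl((H^s_n)^{-1}y,(H^s_n)^{-1}y'\bigr)^{\alpha}\\
  &\le \bigl(D_2+D_1^{\alpha}C_0\bigr)\,d_{\gamma_2}(y,y')^{\alpha}.
\end{align*}
```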

We need one more estimate before we continue: an actual Hölder, rather than log-Hölder, bound on ρn1\rho^{1}_{n} and ρn2\rho^{2}_{n}; we need this as at a certain point we will compare the difference of these functions rather than their ratio. We obtain this bound by rescaling the functions by a constant; however we need to be sure the constant is not too big.

From (10.29) and the C0C_{0} log-α\alpha-Hölder bound on the densities, it follows that there exists D1D\geq 1 such that for any xγ^1x\in\hat{\gamma}_{1} and yγ^2y\in\hat{\gamma}_{2},

(10.34) D1ρ1(x)ρ2(y)D.D^{-1}\leq\frac{\rho^{1}(x)}{\rho^{2}(y)}\leq D.

Note that for a log-α\alpha-Hölder function ρ:K(0,)\rho\colon K\to(0,\infty) on a set KK of diameter at most 11 there exists DD depending only on the log-Hölder constant of ρ\rho such that

D1ρ/maxρ1.D^{-1}\leq\rho/\max\rho\leq 1.

If we let MM denote the larger of the maximum of ρn1\rho^{1}_{n} and the maximum of ρn2\rho^{2}_{n}, then we may define for i{1,2}i\in\{1,2\}, ρ~ni=ρni/M\widetilde{\rho}^{i}_{n}=\rho^{i}_{n}/M. Then as the maxima of ρ1\rho^{1} and ρ2\rho^{2} are uniformly comparable, note that there exists D>0D>0 depending only on C0C_{0} such that for i{1,2}i\in\{1,2\},

D1ρ~i1.D^{-1}\leq\widetilde{\rho}^{i}\leq 1.

In particular, as exp\exp is 11-Lipschitz on (,0](-\infty,0], it follows that ρ~n1,ρ~n2\widetilde{\rho}_{n}^{1},\widetilde{\rho}_{n}^{2} are both uniformly α\alpha-Hölder with the same constant as their log-Hölder constant. Below we will work with these rescaled functions, which are bounded above by 11, and just write ρn1\rho^{1}_{n} instead of ρ~n1\widetilde{\rho}^{1}_{n}. Note that we have not gained any extra regularity for free: obtaining the lower bound D1D^{-1}, depending only on the log-Hölder constant, for both functions at the same time used substantial input from our setup.
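
The passage from log-Hölder to Hölder control can be spelled out in one line. In this sketch C stands for the log-α-Hölder constant of the rescaled density, a symbol we introduce only for illustration; dividing by a constant does not change it.

```latex
% Sketch only. C denotes the log-\alpha-H\"older constant of \widetilde{\rho}^i_n (our notation).
\begin{align*}
|e^{s}-e^{t}| &\le |s-t| \qquad\text{for all } s,t\le 0,\\
\bigl|\widetilde{\rho}^{\,i}_n(x)-\widetilde{\rho}^{\,i}_n(y)\bigr|
  &= \bigl|e^{\log\widetilde{\rho}^{\,i}_n(x)}-e^{\log\widetilde{\rho}^{\,i}_n(y)}\bigr|
   \le \bigl|\log\widetilde{\rho}^{\,i}_n(x)-\log\widetilde{\rho}^{\,i}_n(y)\bigr|
   \le C\,d(x,y)^{\alpha}.
\end{align*}
```

The first inequality applies because the rescaled densities are at most 1, so their logarithms are nonpositive.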

Step 6. Sign and regularity of ρn12ρn2\rho_{n-1}^{2}-\rho_{n}^{2}. We now analyze ρn12ρn2\rho^{2}_{n-1}-\rho^{2}_{n}. In particular, we show that ρn2\rho^{2}_{n} is a decreasing sequence of densities. To begin, we will obtain a lower bound on ρn12ρn2\rho^{2}_{n-1}-\rho^{2}_{n}. Then we will use the various lemmas relating Hölder and log-Hölder functions to conclude a bound on the regularity of ρn12ρn2\rho^{2}_{n-1}-\rho^{2}_{n}. By definition:

ρn12ρn2=\displaystyle\rho^{2}_{n-1}-\rho^{2}_{n}= ρn11((Hn1s)1y)Jn1(y)(1enη^)ρn11((Hns)1y)Jn(y)\displaystyle\rho_{n-1}^{1}((H^{s}_{n-1})^{-1}y)J_{n-1}(y)-(1-e^{-n\hat{\eta}})\rho_{n-1}^{1}((H^{s}_{n})^{-1}y)J_{n}(y)
=\displaystyle= [Jn1(y)(ρn11((Hn1s)1(y))ρn11((Hns)1(y)))]+\displaystyle[J_{n-1}(y)(\rho^{1}_{n-1}((H^{s}_{n-1})^{-1}(y))-\rho^{1}_{n-1}((H^{s}_{n})^{-1}(y)))]+
[ρn11((Hns)1(y))(Jn1(y)Jn(y))]+[enη^ρn11((Hns)1y)Jn(y)]\displaystyle[\rho^{1}_{n-1}((H^{s}_{n})^{-1}(y))(J_{n-1}(y)-J_{n}(y))]+[e^{-n\hat{\eta}}\rho_{n-1}^{1}((H^{s}_{n})^{-1}y)J_{n}(y)]
=\displaystyle= A+B+C.\displaystyle A+B+C.

We next estimate A,BA,B, and CC.

Term $A$. To estimate term $A$, we first pull the function back to $\gamma_{1}$ by composing with $H^{s}_{n}$. Let $Q_{n}=(H^{s}_{n-1})^{-1}\circ H^{s}_{n}$. For $y\in\mathcal{I}^{1}_{n}$, there exists $y^{\prime}\in G^{n}_{\omega}$ satisfying the hypotheses of Claim 10.14 such that $d_{\gamma_{2}}(H^{s}_{n}(y),H^{s}_{n-1}(y))<C_{1}\eta_{n}(y^{\prime})^{1.99(1-\sigma)^{2}}$. By Lipschitzness of the holonomies from Proposition 10.12(4b), this implies that

(10.35) $d_{\gamma_{1}}(Q_{n}(y),y)<D_{1}C_{1}\eta_{n}(y^{\prime})^{1.99(1-\sigma)^{2}}.$

Precomposing again with (Hns)1(H^{s}_{n})^{-1} gives that for yn2y\in\mathcal{I}^{2}_{n},

(10.36) dγ1((Hn1s)1(y),(Hns)1(y))D12C1ηn(y)1.99(1σ)2.d_{\gamma_{1}}((H^{s}_{n-1})^{-1}(y),(H^{s}_{n})^{-1}(y))\leq D_{1}^{2}C_{1}\eta_{n}(y^{\prime})^{1.99(1-\sigma)^{2}}.

But this implies, using Lemma A.11 and (10.36) in the second line, that:

(10.37) $\left|A\right|=\left|J_{n-1}(y)(\rho^{1}_{n-1}((H^{s}_{n-1})^{-1}(y))-\rho^{1}_{n-1}((H^{s}_{n})^{-1}(y)))\right|$
(10.38) $\leq\left|J_{n-1}(y)\right|(D_{1}^{2}C_{1}\eta_{n}(y^{\prime})^{1.99(1-\sigma)^{2}})^{\alpha}\rho^{1}_{n-1}((H^{s}_{n})^{-1}y)$
(10.39) $\leq\left|J_{n-1}(y)\right|C_{3}\eta_{n}(y^{\prime})^{1.99(1-\sigma)^{2}\alpha}\rho^{1}_{n-1}((H^{s}_{n})^{-1}y)$
(10.40) $\leq C_{A}e^{-1.99\lambda^{\prime}(1-\sigma)^{2}\alpha n}\rho^{1}_{n-1}((H^{s}_{n})^{-1}y),$

where we have used temperedness to pass to the last line. We now turn to the next term.

Term BB. This term is simpler. We use (10.17) in the third step below:

|B||ρn11((Hns)1(y))(Jn1(y)Jn(y))||ρn11||Jn1(y)Jn(y)|\left|B\right|\leq\left|\rho^{1}_{n-1}((H^{s}_{n})^{-1}(y))(J_{n-1}(y)-J_{n}(y))\right|\leq\left|\rho^{1}_{n-1}\right|\left|J_{n-1}(y)-J_{n}(y)\right|
|ρn11|enηenηρn11((Hns)1(y)).\leq\left|\rho^{1}_{n-1}\right|e^{-n\eta}\leq e^{-n\eta}\rho^{1}_{n-1}((H^{s}_{n})^{-1}(y)).

Term $C$. The final term is straightforward:

C=eη^nρn11((Hns)1y)Jn(y)D2eη^nρn11((Hns)1(y)).C=e^{-\hat{\eta}n}\rho_{n-1}^{1}((H^{s}_{n})^{-1}y)J_{n}(y)\leq D_{2}e^{-\hat{\eta}n}\rho_{n-1}^{1}((H^{s}_{n})^{-1}(y)).

We can now conclude. Combining the estimates on A,B,CA,B,C, we see that

ρn12(y)ρn2(y)[D2eη^neηnCAe1.99λ(1σ)2αn]ρn11((Hns)1y).\rho^{2}_{n-1}(y)-\rho^{2}_{n}(y)\geq[D_{2}e^{-\hat{\eta}n}-e^{-\eta n}-C_{A}e^{-1.99\lambda^{\prime}(1-\sigma)^{2}\alpha n}]\rho^{1}_{n-1}((H^{s}_{n})^{-1}y).

In particular, as long as

(10.41) $0<\hat{\eta}<\min\{\eta/2,\,1.99\lambda^{\prime}(1-\sigma)^{2}\alpha/2\},$

it follows that there exists N2N_{2} such that for nN2n\geq N_{2},

ρn12ρn2e2η^nρn12.\rho^{2}_{n-1}-\rho^{2}_{n}\geq e^{-2\hat{\eta}n}\rho^{2}_{n-1}.
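The way a condition of the form (10.41) lets the term $C$ dominate the error terms can be sketched as follows. The shorthand $\kappa$, the function $r$, and the lower bound $c_J$ on the Jacobians are introduced here for illustration only; in particular, the uniform lower bound on $J_n$ is an assumption of this sketch, and the exponent $\kappa$ is taken to be positive.

```latex
% Sketch, with \kappa := 1.99\lambda'(1-\sigma)^2\alpha > 0 (our shorthand),
% assuming a uniform lower bound J_n \ge c_J > 0 (an assumption of this sketch)
% and 0 < \hat\eta < \min\{\eta/2, \kappa/2\}.
% Write r(y) := \rho^1_{n-1}((H^s_n)^{-1}y).
\begin{align*}
\rho^2_{n-1}(y)-\rho^2_n(y) = A+B+C
 &\ge C-|A|-|B| \\
 &\ge \left[c_J e^{-\hat\eta n}-e^{-\eta n}-C_A e^{-\kappa n}\right] r(y) \\
 &\ge e^{-\hat\eta n}\left[c_J-(1+C_A)e^{-\hat\eta n}\right] r(y)
   && \text{as } \eta,\kappa \ge 2\hat\eta \\
 &\ge \tfrac{c_J}{2}\,e^{-\hat\eta n}\, r(y)
   && \text{once } (1+C_A)e^{-\hat\eta n}\le c_J/2.
\end{align*}
```

Provided $r(y)$ and $\rho^{2}_{n-1}(y)$ are comparable up to a uniform constant, weakening the rate from $e^{-\hat{\eta}n}$ to $e^{-2\hat{\eta}n}$ leaves a spare factor $e^{-\hat{\eta}n}$ that absorbs all constants for large $n$.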

Also because Jn,ρn2,ρn12J_{n},\rho^{2}_{n},\rho^{2}_{n-1} are uniformly bounded, there exists D3D_{3} such that

$D_{3}\geq\rho^{2}_{n-1}-\rho^{2}_{n}\geq e^{-2\hat{\eta}n}D_{3}^{-1}.$

Thus we can apply Claim A.10 to the function $\rho^{2}_{n-1}-\rho^{2}_{n}\geq D_{3}^{-1}e^{-2\hat{\eta}n}$. As $\rho^{2}_{n}$ and $\rho^{2}_{n-1}$ are uniformly $\alpha$-Hölder from Step 5, we obtain that there exists $D_{4}$ such that $\rho^{2}_{n-1}-\rho^{2}_{n}$ is uniformly $D_{4}e^{2\hat{\eta}n}$ log-$\alpha$-Hölder. This concludes the analysis of the Hölder regularity of $\rho^{2}_{n-1}-\rho^{2}_{n}$.

Step 7: Bookkeeping. In this step we verify that for each point $y\in\mathcal{I}_{n}^{1}$ a positive proportion of the mass over $y$ is retained during the fake coupling procedure. This is straightforward to see because at each step we discard an $e^{-n\hat{\eta}}$ proportion of the remaining mass in $\rho^{1}_{n}(y)$. Thus from the definition (10.30) of $\rho^{1}_{N}$ the amount of mass is bounded below by

$\rho^{1}_{n}(y)\geq a_{0}\rho^{1}(y)\prod_{k\geq N}(1-e^{-k\hat{\eta}})>0.$

Thus we keep a positive proportion of the mass above each yωny\in\mathcal{I}^{n}_{\omega} for all nNn\geq N.
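The positivity of the infinite product can be checked directly. The following is a sketch using the elementary inequality $\ln(1-x)\geq-2x$ for $0\leq x\leq 1/2$; the requirement that $N$ be large enough for $e^{-N\hat{\eta}}\leq 1/2$ is an assumption made here for convenience.

```latex
% Sketch: positivity of the retained proportion.
% Uses \ln(1-x) \ge -2x for 0 \le x \le 1/2, and assumes e^{-N\hat\eta} \le 1/2.
\begin{align*}
\prod_{k\ge N}\left(1-e^{-k\hat\eta}\right)
 &= \exp\Big(\sum_{k\ge N}\ln\left(1-e^{-k\hat\eta}\right)\Big) \\
 &\ge \exp\Big(-2\sum_{k\ge N}e^{-k\hat\eta}\Big)
  = \exp\Big(-\frac{2e^{-N\hat\eta}}{1-e^{-\hat\eta}}\Big) > 0.
\end{align*}
```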

Step 8: $n=\infty$ behavior. As the sequences $\rho^{1}_{n}$ and $\rho^{2}_{n}$ are decreasing, they converge to some limiting measures $\rho^{1}_{\infty}$ and $\rho^{2}_{\infty}$. Further, by Proposition B.13, the true stable holonomies $H^{s}_{\infty}$ satisfy $(H^{s}_{\infty})_{*}\rho^{1}_{\infty}=\rho^{2}_{\infty}$ as required.

Step 9: (C,λ,ϵ,𝒞θ)(C,\lambda,\epsilon,\mathcal{C}_{\theta})-tempered points are never dropped. Finally, we must show that we actually keep the (C,λ,ϵ,𝒞θ)(C,\lambda,\epsilon,\mathcal{C}_{\theta}) tempered points throughout the entire procedure, so that part (3) of the requirements for a fake coupling are satisfied. Suppose that (ω,x)(\omega,x) is such a (C,λ,ϵ,𝒞θ)(C,\lambda,\epsilon,\mathcal{C}_{\theta})-tempered trajectory. It suffices to show that for each nn that all points in Bδn(x)(x)B_{\delta_{n}(x)}(x) are (C,λ,ϵ,𝒞2θ)(C^{\prime},\lambda^{\prime},\epsilon^{\prime},\mathcal{C}_{2\theta})-tempered, as from the procedure above this ensures that xn1x\in\mathcal{I}^{1}_{n} for all nn. By Part (2) of Proposition 10.12, this follows as long as δn(x)Dxfωn(1+σ)\delta_{n}(x)\leq\|D_{x}f^{n}_{\omega}\|^{-(1+\sigma)}. This inequality holds because by the definition of ηn\eta_{n}, (10.19), ηn(x)Dxfωn1\eta_{n}(x)\leq\|D_{x}f^{n}_{\omega}\|^{-1}, and δn(x)=ηn(1+σ)\delta_{n}(x)=\eta_{n}^{(1+\sigma)}.

Thus we have verified all of the required claims in the definition of fake coupling as well as the additional required claim about the goodness of the families $P^{i}_{n-1}\setminus P^{i}_{n}$, which concludes the proof. ∎

We now have everything ready to prove the local coupling lemma, Lemma 7.10.

Proof of Lemma 7.10.

Almost everything in the statement of Lemma 7.10 is contained in the statement of Lemma 10.13. We explain them in order.

Item 1 follows because the points we stop trying to couple at time $n$ are precisely the points in $\hat{\gamma}_{i}$ that are in $P^{i}_{n-1}\setminus P^{i}_{n}$. As the standard family $P^{i}_{n-1}\setminus P^{i}_{n}$ is $n\Lambda$-good, the claim follows with $L=\Lambda$.

Item 2 is the statement in the final paragraph of Lemma 10.13.

Item 3 is more complicated. There are two ways that a point xn11x\in\mathcal{I}^{1}_{n-1} fails to appear in n1\mathcal{I}^{1}_{n}. The first is that xx is not in any interval Bδn(y)γ1(y)B_{\delta_{n}(y)}^{\gamma_{1}}(y) for any yHωny\in H^{n}_{\omega}. The second is if xx is in an interval that gets trimmed off of ^n1\hat{\mathcal{I}}^{1}_{n}.

First we consider the former case. This means that some yy such that xBδn1(y)γ1(y)x\in B_{\delta_{n-1}(y)}^{\gamma_{1}}(y) failed to be tempered at time nn. In Σ×γ1\Sigma\times\gamma_{1}, we consider the union of these intervals:

Un={{ω}×Bδn1(y)γ1(y):yn11n1 for the word ω}.U_{n}=\bigcup\{\{\omega\}\times B_{\delta_{n-1}(y)}^{\gamma_{1}}(y):y\in\mathcal{I}_{n-1}^{1}\setminus\mathcal{I}_{n}^{1}\text{ for the word }\omega\}.

Note that each of these sets $B_{\delta_{n-1}(y)}^{\gamma_{1}}(y)$ contains a point $z$ that fails to be tempered at time $n$, so $(\omega,z)$ has cushion within $\Lambda_{\max}$ of $C^{\prime}$, the cutoff for tempering to fail. All the points in $B^{\gamma_{1}}_{\delta_{n-1}(y)}(y)$ satisfy the hypotheses of Proposition 10.10, due to the size of $\delta_{n-1}(y)\leq\|D_{y}f^{n}_{\omega}\|^{-(1+\sigma)}$ and the tempering, so that proposition implies that all points in $B_{\delta_{n-1}(y)}^{\gamma_{1}}(y)$ have cushion at most $C^{\prime}+\Lambda_{\max}+D$. But by Proposition 4.10, the set of points having cushion of this size has exponentially small measure. Thus $(\mu\otimes\rho)(U_{n})\leq D_{1}e^{-n\eta}$ for some $D_{1},\eta>0$, and we have an exponential tail for points experiencing the first type of failure.

In the case that a point fails to be included because it was trimmed off, it was observed in Step 3 of the coupling construction that every curve being trimmed has length at least $e^{-(1+\sigma)\Lambda_{\max}n}$ and the amount we cut off has length $2K_{1}e^{-4\Lambda_{\max}n}$. Thus, as $1/2\leq\rho(x)/\rho(y)\leq 2$ for two points $x,y$ along the curve we are coupling, the amount we trim has mass at most $4e^{-2\Lambda_{\max}n}$ times the mass of the curve. Thus, summing over all curves, we stop on at most $4e^{-2\Lambda_{\max}n}$ mass, which is exponentially small.
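The mass ratio for a single trimmed curve can be worked out as follows. This is a sketch under stated assumptions: the trimmed length and the lower bound on the curve length are governed by the same constant $\Lambda_{\max}$, the parameter $\sigma$ lies in $(0,1)$, and $n$ is large enough to absorb $K_{1}$.

```latex
% Sketch: ratio of trimmed mass to total mass for one trimmed curve \gamma.
% Assumptions of this sketch: trimmed length 2K_1 e^{-4\Lambda_{\max} n},
% |\gamma| \ge e^{-(1+\sigma)\Lambda_{\max} n}, 0<\sigma<1, and \sup_\gamma\rho \le 2\inf_\gamma\rho.
\begin{align*}
\frac{\rho(\text{trimmed part})}{\rho(\gamma)}
 &\le \frac{\sup_\gamma\rho \cdot 2K_1 e^{-4\Lambda_{\max} n}}
           {\inf_\gamma\rho\cdot e^{-(1+\sigma)\Lambda_{\max} n}} \\
 &\le 4K_1 e^{-(3-\sigma)\Lambda_{\max} n} \\
 &\le 4 e^{-2\Lambda_{\max} n}
   && \text{once } K_1 e^{-(1-\sigma)\Lambda_{\max} n}\le 1.
\end{align*}
```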

The last way that mass is lost during the local coupling procedure is when we rescale the density by $(1-e^{-n\hat{\eta}})$ in Step 4, which also causes at most an exponentially small amount of mass to stop at time $n$. This concludes the proof of the tail bound.

Item 4 follows from Proposition 10.12(1). ∎

11. Mixing theorems

11.1. Overview of the section

In this section we prove our main result, Theorem 1.1. The proof will rely on coupling and expansion following the standard argument, see e.g. [CM06].

First, we show that coupling implies equidistribution of standard families by coupling a given family to a family representing volume and using that volume is invariant by the dynamics. See Proposition 11.9 for details.

Next, we use the expansion and exponential equidistribution to obtain exponential mixing using the following reasoning. Consider an RR-good standard family γ^\hat{\gamma} and let fωn(γ^)f^{n}_{\omega}(\hat{\gamma}) be its image after nn iterations. We shall show that for almost all ω\omega that fωn(γ^)f^{n}_{\omega}(\hat{\gamma}) contains a subfamily PnP_{n} with the following properties:

  (1) $P_{n}$ consists of $\epsilon n$-good standard pairs;

  (2) standard pairs in $P_{n}$ contract backwards in time;

  (3) the forward images of pairs from $P_{n}$ equidistribute at an exponential rate;

  (4) the complement of $P_{n}$ has exponentially small measure.

Now given Hölder functions ϕ\phi and ψ\psi we obtain exponential decorrelation between ϕfωN\phi\circ f^{N}_{\omega} with N=cnN=cn and ψ\psi using that ψ\psi is constant on the elements of PnP_{n} (up to exponentially small error), ϕfωN\phi\circ f^{N}_{\omega} is equidistributed on the elements of PnP_{n} (up to exponentially small error), and the complement of PnP_{n} is exponentially small.

The purpose of this section is to execute this argument precisely, using the results of Sections 4, 8, and the appendices.

11.2. Preparatory lemmas

Below we will use Definition A.14 from §A.6 in the appendix. Briefly, this definition concerns a (C,λ,ϵ,θ)(C,\lambda,\epsilon,\theta)-forward tempered point at time nn for a vector vTxMv\in T_{x}M, which is a (C,λ,ϵ)(C,\lambda,\epsilon)-forward tempered time nn such that EnsE^{s}_{n} makes angle at least θ\theta with vv.

Proposition 11.1.

Suppose that MM is a closed surface and (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple of diffeomorphisms in Diffvol2(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M). There exists λ>0\lambda>0 such that for all sufficiently small ϵ>0\epsilon>0 there exist C0C_{0}, NN\in\mathbb{N}, and α>0\alpha>0 such that for all nNn\geq N, and any direction vTx1Mv\in T_{x}^{1}M,

μ(ω:(ω,x) is not (ϵn+C0,λ,ϵ,C0eϵn)-forward tempered at time n relative to v)enα.\mu(\omega:(\omega,x)\text{ is \emph{not} }(\epsilon n+C_{0},\lambda,\epsilon,{C_{0}e^{-\epsilon n}})\text{-forward tempered at time $n$ relative to $v$})\leq e^{-n\alpha}.
Proof.

Proposition 4.8 says that there exists $\lambda>0$ such that for arbitrarily small $\epsilon>0$, there exists $\alpha>0$ such that the measure of the words $\omega$ that are not $(C,\lambda,\epsilon)$-subtempered for all $n\geq 0$ is at most $e^{-\alpha C}$. From Proposition 4.14 there exist $C_{2},c,\theta>0$ such that for all sufficiently small $\epsilon^{\prime}$ and all $n\geq N_{0}=c\left|\ln(\epsilon^{\prime})\right|$, the probability that $E^{s}_{n}\in B_{\epsilon^{\prime}}(v)$ is at most $C_{2}(\epsilon^{\prime})^{\theta}$. Take $\epsilon^{\prime}=e^{-\epsilon n}$. Then the requirement $n\geq c\left|\ln(\epsilon^{\prime})\right|=c\epsilon n$ holds as long as $\epsilon$ is sufficiently small relative to $c$, so the probability that $E^{s}_{n}\in B_{e^{-\epsilon n}}(v)$ is at most $C_{2}e^{-\theta\epsilon n}$. Combining these two estimates, we obtain the result. ∎
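The step of combining the two estimates is a union bound, sketched below. The symbols $\alpha_{1}$, $\alpha^{\prime}$ and $C_{2}^{\prime}$ are names introduced for this sketch only: $\alpha_{1}$ stands for the exponent supplied by Proposition 4.8, and $C_{2}^{\prime}$ for the constant from Proposition 4.14 after absorbing the factor $C_{0}^{\theta}$ coming from the angle threshold $C_{0}e^{-\epsilon n}$.

```latex
% Sketch of the union bound. Notation of this sketch only:
% \alpha_1 = exponent from Proposition 4.8, \alpha' := \tfrac{\epsilon}{2}\min\{\alpha_1,\theta\},
% C_2' := C_2 C_0^{\theta}. The cushion is C = \epsilon n + C_0.
\begin{align*}
\mu(\text{not forward tempered at time } n \text{ relative to } v)
 &\le \mu(\text{not } (\epsilon n + C_0,\lambda,\epsilon)\text{-subtempered})
    + \mu\left(E^s_n\in B_{C_0e^{-\epsilon n}}(v)\right) \\
 &\le e^{-\alpha_1(\epsilon n + C_0)} + C_2' e^{-\theta\epsilon n} \\
 &\le (1+C_2')\,e^{-2\alpha' n}
  \le e^{-\alpha' n} \quad\text{for } n\ge \ln(1+C_2')/\alpha'.
\end{align*}
```

The threshold on $n$ here is in addition to the requirement $n\geq N_{0}$ from Proposition 4.14.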

Below, we will typically assume that the standard family or standard pair we are considering has unit mass. The statements below can be adapted to any amount of mass by multiplying the right hand side of the bound by the mass of the family.

Definition 11.2.

Given a standard pair γ^=(γ,ρ)\hat{\gamma}=(\gamma,\rho), for xγx\in\gamma we say that (ω,x)(\omega,x) is (n,λ,ϵ)(n,\lambda,\epsilon)-backwards good if

  (1) $f^{n}_{\omega}(x)$ is contained in a standard pair $B(\omega,x)\subseteq f^{n}_{\omega}(\hat{\gamma})$ that is $\epsilon n$-good, and

  (2) $Df^{-n}_{\omega}B_{e^{-14\epsilon n}}^{\gamma}(x)$ has diameter at most $e^{-(\lambda/2)n}$.

We define analogously the same notion for a standard family.

Proposition 11.3.

(Annealed goodness) Suppose that $M$ is a closed surface and $(f_{1},\ldots,f_{m})$ is an expanding on average tuple in $\operatorname{Diff}_{\operatorname{vol}}^{2}(M)$. Then there exists $\lambda>0$ such that for all sufficiently small $\epsilon>0$ and any fixed $R>0$ there exist $\alpha,C>0$ such that for any $R$-good, unit mass standard family $\hat{\gamma}$ with associated measure $\rho$:

(11.1) (μρ)({(x,ω):(x,ω) is not (n,λ,ϵ)-backwards good})Ceαn.(\mu\otimes\rho)(\{(x,\omega):(x,\omega)\text{ is not }(n,\lambda,\epsilon)\text{-backwards good}\})\leq Ce^{-\alpha n}.
Proof.

This is immediate from Propositions A.15 and 11.1. ∎

From Proposition 11.3, we can deduce a related quenched statement for almost every ω\omega.

Lemma 11.4.

(Quenched goodness) Under the hypotheses of Proposition 11.3, there exist $\lambda,\alpha,D>0$ such that for all sufficiently small $\epsilon>0$ and any unit mass $R$-good standard family $\hat{\gamma}$, for almost every $\omega$ there exists $C_{\omega}$ such that for every $n$ a proportion at least $1-C_{\omega}e^{-\alpha n}$ of the points in $\hat{\gamma}$ are $(n,\lambda,\epsilon)$-backwards good for $\omega$. Further,

μ(ω:Cω>C)DC1.\mu(\omega:C_{\omega}>C)\leq DC^{-1}.
Proof.

Let AnωA_{n}^{\omega} be the set of points in γ^\hat{\gamma} that are not (n,λ,ϵ)(n,\lambda,\epsilon)-backwards good for ω\omega. Then

μ(ω:n ρ(Anω)>Ce(α/2)n)n0μ(ω:ρ(Anω)>Ce(α/2)n)n0C1C1e(α/2)nC1D,\mu(\omega:\exists n\text{ }\rho(A^{\omega}_{n})>Ce^{-(\alpha/2)n})\leq\sum_{n\geq 0}\mu(\omega:\rho(A^{\omega}_{n})>Ce^{-(\alpha/2)n})\leq\sum_{n\geq 0}C^{-1}C_{1}e^{-(\alpha/2)n}\leq C^{-1}D,

where the second inequality follows from (11.1) and the Markov inequality. The result follows. ∎
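For completeness, the Markov step and the geometric sum can be written out as below. This is a sketch: $C_{1}$ denotes the constant appearing in (11.1), and $D:=C_{1}/(1-e^{-\alpha/2})$ is shorthand introduced here.

```latex
% Sketch: Markov's inequality followed by a geometric sum.
% C_1 is the constant from (11.1); D := C_1/(1-e^{-\alpha/2}) is our shorthand.
% By Fubini, \int\rho(A^\omega_n)\,d\mu(\omega) is the left-hand side of (11.1).
\begin{align*}
\mu\left(\omega:\rho(A^\omega_n)>Ce^{-(\alpha/2)n}\right)
 &\le \frac{\int \rho(A^\omega_n)\,d\mu(\omega)}{Ce^{-(\alpha/2)n}}
  \le \frac{C_1e^{-\alpha n}}{Ce^{-(\alpha/2)n}} = C^{-1}C_1e^{-(\alpha/2)n}, \\
\sum_{n\ge 0} C^{-1}C_1e^{-(\alpha/2)n}
 &= \frac{C^{-1}C_1}{1-e^{-\alpha/2}} = C^{-1}D.
\end{align*}
```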

We also need another proposition, which says that on the $\epsilon n$-good neighborhoods at time $n$ we have rapid coupling; this will then imply that these neighborhoods rapidly equidistribute. The following estimate is immediate from Proposition 7.7.

Proposition 11.5.

Suppose (f1,,fm)(f_{1},\ldots,f_{m}) is as in Proposition 11.3. Then there exists λ>0\lambda>0 such that for any sufficiently small ϵ>0\epsilon>0 there exist C,α>0C,\alpha>0 such that the following holds. For any nn\in\mathbb{N}, suppose P1P^{1} and P2P^{2} are two unit mass standard families of ϵn\epsilon n-good curves. Then there exists a coupling function Υ\Upsilon and stopping times T^1,T^2\hat{T}^{1},\hat{T}^{2} as in Proposition 7.7 such that for i{1,2}i\in\{1,2\}:

(μρi)({(x,ω):T^i(x,ω)>j})Ceϵneαj.(\mu\otimes\rho^{i})(\{(x,\omega):\hat{T}^{i}(x,\omega)>j\})\leq Ce^{\epsilon n}e^{-\alpha j}.
Remark 11.6.

In the applications of Proposition 11.5 below we will assume, unless explicitly stated otherwise, that $P^{2}$ is the family representing volume from Proposition 7.5. We couple with a family representing volume because it implies that the statistics of an arbitrary standard family $P^{1}$ approach those of volume.

In what follows, for a word $\omega$ at time $i$, we have subfamilies $P^{1}_{i,\omega}$ and $P^{2}_{i,\omega}$ of $f^{i}_{\omega}(P^{1})$. We then apply Proposition 11.5 above to find a pair of stopping times $\hat{T}^{1}_{i}$ and $\hat{T}^{2}_{i}$ defined on $f^{i}_{\omega}(P^{1}_{i,\omega})$ and $f^{i}_{\omega}(P^{2}_{i,\omega})$ respectively. Note that the $\hat{T}^{i}$ are not defined on all of $f^{i}_{\omega}(\hat{\gamma})$ because not all points in this pair need be $\epsilon n$-good.

Then from Proposition 11.5 we obtain the following.

Proposition 11.7.

Let $(f_{1},\ldots,f_{m})$, $\hat{\gamma}$, $\rho$, and $\lambda,\epsilon,\alpha>0$ be as in Proposition 11.3. Then there exists $C$ such that if we let $\hat{T}_{i}^{1}$ be the stopping time defined as in Remark 11.6, for all $i,n\geq 0$ we have the bound:

(11.2) (μρ)((x,ω):xPi,ω1 and T^i1(x,ω)>i+n)Ceϵienα.(\mu\otimes\rho)((x,\omega):x\in P^{1}_{i,\omega}\text{ and }\hat{T}_{i}^{1}(x,\omega)>i+n)\leq Ce^{\epsilon i}e^{-n\alpha}.

From this, we easily deduce a statement about each ω\omega.

Proposition 11.8.

Let $(f_{1},\ldots,f_{m})$, $\lambda,\epsilon>0$ and $\hat{\gamma},\rho$ be as in the setting of Proposition 11.3 and Remark 11.6. Then there exist $\alpha,D_{1}>0$ such that

(11.3) μ(\displaystyle\mu( ω:there exists i such that ρ(x:(x,ω) is (i,λ,ϵ)-backwards good)<1Ceiα or\displaystyle\omega:{\text{there exists }i\text{ such that }\rho(x:(x,\omega)\text{ is }(i,\lambda,\epsilon)\text{-backwards good})<1-Ce^{-i\alpha}}\text{ or }
(11.4) there exist (i,n) such that ρ(xPi,ω1:T^i1(x,ω)i+n)C2eϵienα)D1C1.\displaystyle\text{there exist }(i,n)\text{ such that }\rho(x\in P_{i,\omega}^{1}:\hat{T}_{i}^{1}(x,\omega)\geq i+n)\geq C^{2}e^{\epsilon i}e^{-n\alpha})\leq D_{1}C^{-1}.
Proof.

To control the event in (11.4) let Bi,nω={xPi,ω1:T^i1(x,ω)>i+n}B_{i,n}^{\omega}=\{x\in P^{1}_{i,\omega}:\hat{T}_{i}^{1}(x,\omega)>i+n\}. By (11.2) and the Markov inequality, there is C1>0C_{1}>0 such that

(11.5) μ({ω:ρ(Bi,nω)>Ce2ϵie(α/2)n})C1C1eϵie(α/2)n.\mu(\{\omega:\rho(B^{\omega}_{i,n})>Ce^{2\epsilon i}e^{-(\alpha/2)n}\})\leq C_{1}C^{-1}e^{-\epsilon i}e^{-(\alpha/2)n}.

Then using (11.5), we find that

μ(ω:for some i,n ρ({x:T^i1(x,ω)i+n})C2eϵienα/2)\mu(\omega:\text{for some $i,n$ }\rho(\{x:\hat{T}_{i}^{1}(x,\omega)\geq i+n\})\geq C^{2}e^{\epsilon i}e^{-n\alpha/2})
i0n0μ({ω:ρ(Bi,nω)C2eϵie(α/2)n})i0n0C1C1eϵie(α/2)nC1C2\leq\sum_{i\geq 0}\sum_{n\geq 0}\mu(\{\omega:\rho(B^{\omega}_{i,n})\geq C^{2}e^{\epsilon i}e^{-(\alpha/2)n}\})\leq\sum_{i\geq 0}\sum_{n\geq 0}C_{1}C^{-1}e^{-\epsilon i}e^{-(\alpha/2)n}\leq C^{-1}C_{2}

for some C2C_{2} provided that ϵ\epsilon is small enough. Combining this estimate with Proposition 11.3 to control the event in (11.3) allows us to conclude. ∎

11.3. Quenched equidistribution

Using the quenched coupling lemmas above, it is straightforward to deduce quenched equidistribution and correlation decay theorems. The ideas in the proofs below are essentially standard (compare with [CM06, Ch. 7]); however, some modifications are necessary because the quenched random dynamics is not stationary.

We start with quenched equidistribution.

Proposition 11.9.

(Quenched exponential equidistribution on subfamilies) Let $(f_{1},\ldots,f_{m})$ be an expanding on average tuple in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$, where $M$ is a closed surface. There exists $\lambda>0$ such that for all sufficiently small $\epsilon>0$, fixed $\beta\in(0,1)$ and $R$, there exists $D_{1}$ such that for any $R$-good, unit mass standard family $\hat{\gamma}$, there exist $\alpha,\nu>0$ such that for almost every $\omega$ there exists $C_{\omega}\geq 1$ such that, coupling as in Remark 11.6:

  (1) There exists a subfamily $P_{i,\omega}$ of $(f^{i}_{\omega})_{*}\hat{\gamma}$ of $\epsilon i$-good standard pairs having total $\rho$-measure at least $1-C_{\omega}e^{-\alpha i}$.

  (2) The atoms of $(f^{i}_{\omega})^{-1}(P_{i,\omega})$ have diameter at most $e^{-(\lambda/2)i}$.

  (3) The atoms $A_{i,\omega}\in P_{i,\omega}$ exponentially equidistribute, i.e., letting $\overline{A}_{i,\omega}$ be the normalized measure on $A_{i,\omega}$,

    (11.6) $\left|\int\phi\circ f^{n}_{\sigma^{i}\omega}\,d\overline{A}_{i,\omega}-\int\phi\,d\operatorname{vol}\right|\leq C_{\omega}e^{\epsilon i}e^{-\alpha n}\|\phi\|_{C^{\beta}}.$

  (4) We have a tail bound $\mu(\{\omega:C_{\omega}>C\})\leq D_{1}C^{-1}$.

Proof.

From Lemma 11.4, the only thing that remains to be checked is that the individual atoms $A_{i,\omega}$ of $P_{i,\omega}$ are exponentially equidistributing.

Let Pi,ωP_{i,\omega} be the subfamily of fiω(γ^)f^{i}_{\omega}(\hat{\gamma}) of curves that are iϵi\epsilon-good. Let P2P^{2} be a standard family representing volume as in Remark 11.6. Then coupling with P2P^{2}, we have the stopping time T^i\hat{T}_{i} on Pi,ωP_{i,\omega} as discussed in Proposition 11.7 and uniform α,Cω>0\alpha,C_{\omega}>0 such that for all i,ni,n\in\mathbb{N},

(11.7) ρ(xPi,ω:T^i(x,ω)>n+i)Cωeϵienα.\rho(x\in P_{i,\omega}:\hat{T}_{i}(x,\omega)>n+i)\leq C_{\omega}e^{\epsilon i}e^{-n\alpha}.

We would like to know that most of the curves in Pi,ωP_{i,\omega} have all but an exponentially small amount of their points coupling quickly.

We claim that for a.e. $\omega$ there exists a subfamily $G_{i,\omega}$ of $\epsilon i$-good curves in $P_{i,\omega}$ of measure at least $1-C_{\omega}e^{-\alpha i/3}$ such that for each $A\in G_{i,\omega}$ all but $e^{i\epsilon}e^{-\alpha n/3}$ of the mass of $A$ has coupled to volume by time $i+n$, i.e. $\hat{T}_{i}(x,\omega)\leq i+n$. Suppose that $\omega$ satisfies (11.7) and, for the sake of contradiction, suppose that there is a subfamily $B_{i}$ (of bad pairs) of $P_{i,\omega}$ having measure more than $e^{-\alpha i/3}$ so that for some $n$ all pairs in $B_{i}$ have more than $e^{i\epsilon}e^{-n\alpha/3}$ proportion of points not coupled at time $n+i$, i.e. $\hat{T}_{i}>i+n$. This implies that $\rho(x:\hat{T}_{i}(x,\omega)>n+i)\geq C_{\omega}e^{-2\alpha n/3}e^{i\epsilon}$, contradicting (11.7). Thus the claim about $G_{i,\omega}$ holds.

Suppose now that Ai,ωGi,ωPi,ωA_{i,\omega}\in G_{i,\omega}\subseteq P_{i,\omega} is such a good atom where at time n+in+i all but at most eiϵe(α/3)ne^{i\epsilon}e^{-(\alpha/3)n} proportion of the mass of Ai,ωA_{i,\omega} has coupled to volume. Let Ai,ωnAi,ωA_{i,\omega}^{n}\subseteq A_{i,\omega} be the set of points that have coupled by time i+ni+n. Let Υ\Upsilon be the measure preserving coupling function and let Vn=Υ(Ai,ωn)V^{n}=\Upsilon(A_{i,\omega}^{n}) be the corresponding set of points in the standard family representing volume that have T^i(x,ω)i+n\hat{T}_{i}(x,\omega)\leq i+n. Then we may write the integral in question as

|ϕfnσiωdA¯i,ωϕdPvol|\left|\int\phi\circ f^{n}_{\sigma^{i}\omega}\,d\overline{A}_{i,\omega}-\int\phi\,dP_{\operatorname{vol}}\right|
|An/2i,ωϕfnσiωdA¯i,ωVn/2ϕdPvol|+|Ai,ωAi,ωn/2ϕfnσiωdA¯i,ω|+|(Vn/2)cϕdPvol|\leq\left|\int_{A^{n/2}_{i,\omega}}\phi\circ f^{n}_{\sigma^{i}\omega}\,d\overline{A}_{i,\omega}-\int_{V^{n/2}}\phi\,dP_{\operatorname{vol}}\right|+\left|\int_{A_{i,\omega}\setminus A_{i,\omega}^{n/2}}\phi\circ f^{n}_{\sigma^{i}\omega}\,d\overline{A}_{i,\omega}\right|+\left|\int_{(V^{n/2})^{c}}\phi\,dP_{\operatorname{vol}}\right|
|Υ1(Vn/2)ϕfnσiω(Υ(x))ϕ(x)dPvol|+2Cωeiϵenα/6ϕCβ.\leq\left|\int_{\Upsilon^{-1}(V^{n/2})}\phi\circ f^{n}_{\sigma^{i}\omega}(\Upsilon(x))-\phi(x)\,dP_{\operatorname{vol}}\right|+2C_{\omega}e^{i\epsilon}e^{-n\alpha/6}\|\phi\|_{C^{\beta}}.

As the points $\Upsilon(x)$ and $x$ both lie in a common $(C_{0},\lambda,\epsilon)$-tempered local stable leaf of uniformly bounded length at time $i+n/2$, we see that at time $i+n$,

d(fnσiωΥ(x),fnσiω(x))C10eλ/2n.d(f^{n}_{\sigma^{i}\omega}\Upsilon(x),f^{n}_{\sigma^{i}\omega}(x))\leq C^{-1}_{0}e^{-\lambda/2n}.

Now the Hölder regularity of ϕ\phi implies that

(11.8) |ϕfnσiωdA¯i,ωϕdPvol|Cβ0eλβ/2nϕCβ+2Cωeiϵenα/6ϕCβ,\left|\int\phi\circ f^{n}_{\sigma^{i}\omega}\,d\overline{A}_{i,\omega}-\int\phi\,dP_{\operatorname{vol}}\right|\leq C^{-\beta}_{0}e^{-\lambda\beta/2n}\|\phi\|_{C^{\beta}}+2C_{\omega}e^{i\epsilon}e^{-n\alpha/6}\|\phi\|_{C^{\beta}},

which is what we wanted for the pair $A_{i,\omega}$. The required tail bound on $C_{\omega}$ follows from Proposition 11.8 and (11.7) by taking $D_{1}$ sufficiently large, because the first term involving $C_{0}^{-\beta}$ is uniformly bounded independent of $C_{\omega}\geq 1$. ∎

Theorem 11.10.

(Quenched, tempered equidistribution) Suppose that MM is a closed surface, (f1,,fm)(f_{1},\ldots,f_{m}) is an expanding on average tuple in Diffvol2(M)\operatorname{Diff}_{\operatorname{vol}}^{2}(M), and β(0,1)\beta\in(0,1) is a Hölder regularity. For any ϵ>0\epsilon>0 there exists η>0\eta>0 such that for any RR-good standard family γ^\hat{\gamma} with associated measure ρ\rho, this family satisfies quenched, tempered equidistribution. Namely, for a.e. ωΣ\omega\in\Sigma, there exists CωC_{\omega} such that for any ϕCβ(M)\phi\in C^{\beta}(M), for all natural numbers kk and nn,

|ϕfnσk(ω)dρϕdvol|CωekϵeηnϕCβ.\left|\int\phi\circ f^{n}_{\sigma^{k}(\omega)}\,d\rho-\int\phi\,d\operatorname{vol}\right|\leq C_{\omega}e^{k\epsilon}e^{-\eta n}\|\phi\|_{C^{\beta}}.

The above theorem is an immediate consequence of Proposition 11.9, so we do not write a separate proof of it. Next we turn to exponential mixing.

11.4. Exponential mixing

We are now ready to prove exponential mixing. In a subsequent paper we plan to show that several classical statistical limit theorems are valid in our setting.

Proof of Theorem 1.1..

As before, let PvolP_{\operatorname{vol}} be an RR-good standard family representing volume. We then apply Proposition 11.9 with γ^=Pvol\hat{\gamma}=P_{\operatorname{vol}}, and obtain λ,ϵ,α>0\lambda,\epsilon,\alpha>0 such that the conclusions of that proposition hold for these constants. Pick some ωΣ\omega\in\Sigma such that the conclusion of Proposition 11.9 holds for ω\omega, and let CωC_{\omega} be the associated constant. We will now show that fnωf^{n}_{\omega} is exponentially mixing. Let δ(0,1)\delta\in(0,1) be some fixed number small enough that ϵδ(1δ)α<0\epsilon\delta-(1-\delta)\alpha<0.

Below, we will be implicitly rounding to nearest integers so that everything makes sense. In particular, we will denote by PδnP_{\delta n} the standard family Pδn,ωP_{\lfloor\delta n\rfloor,\omega} from Proposition 11.9; as ω\omega is fixed we will omit it below.

We now record some useful properties of PδnP_{\delta n}. First, PδnP_{\delta n} comprises all but CωeδαnC_{\omega}e^{-\delta\alpha n} of the mass of fδnω(Pvol)f^{\delta n}_{\omega}(P_{\operatorname{vol}}). Thus, by volume preservation:

(11.9) ϕψfnωdPvol\displaystyle\int\phi\cdot\psi\circ f^{n}_{\omega}\,dP_{\operatorname{vol}} =ϕ(fδnω)1ψf(1δ)nσδn(ω)d(fδnω)(Pvol)\displaystyle=\int\phi\circ(f^{\delta n}_{\omega})^{-1}\cdot\psi\circ f^{(1-\delta)n}_{\sigma^{\delta n}(\omega)}\,d(f^{\delta n}_{\omega})_{*}(P_{\operatorname{vol}})
(11.10) =APδnϕ(fδnω)1ψf(1δ)nσδn(ω)dA±CωeδαnϕCβψCβ.\displaystyle=\sum_{A\in P_{\delta n}}\int\phi\circ(f^{\delta n}_{\omega})^{-1}\cdot\psi\circ f^{(1-\delta)n}_{\sigma^{\delta n}(\omega)}\,dA\pm C_{\omega}e^{-\delta\alpha n}\|\phi\|_{C^{\beta}}\|\psi\|_{C^{\beta}}.

Now, by Proposition 11.9, the preimage of each curve $A\in P_{\delta n}$ has length at most $e^{-\delta\lambda n/2}$. By Hölder continuity of $\phi$,

(11.11) |maxϕ(fδnω)1|Aminϕ(fδnω)1|A|<eβδλn/2ϕCβ.\left|\max\phi\circ(f^{\delta n}_{\omega})^{-1}|_{A}-\min\phi\circ(f^{\delta n}_{\omega})^{-1}|_{A}\right|<e^{-\beta\delta\lambda n/2}\|\phi\|_{C^{\beta}}.

In particular, applying this observation to each summand in (11.10), we see that

APδnϕ(fδnω)1ψf(1δ)nσδn(ω)dA=\displaystyle\sum_{A\in P_{\delta n}}\int\phi\circ(f^{\delta n}_{\omega})^{-1}\cdot\psi\circ f^{(1-\delta)n}_{\sigma^{\delta n}(\omega)}\,dA= APδnϕ(fδnω)1dA¯ψf(1δ)nσδn(ω)dA\displaystyle\sum_{A\in P_{\delta n}}\int\phi\circ(f^{\delta n}_{\omega})^{-1}\,d\overline{A}\int\psi\circ f^{(1-\delta)n}_{\sigma^{\delta n}(\omega)}\,dA
(11.12) ±enβδλ/2ϕCβψCβ,\displaystyle\,\,\,\,\pm e^{-n\beta\delta\lambda/2}\|\phi\|_{C^{\beta}}\|\psi\|_{C^{\beta}},

where A¯\overline{A} denotes the unit mass version of AA. By the exponential equidistribution estimate from Proposition 11.9,

(11.13) $\int\psi\circ f^{(1-\delta)n}_{\sigma^{\delta n}(\omega)}\,dA=\rho(A)\left(\int\psi\,d\operatorname{vol}\pm C_{\omega}e^{-((1-\delta)\alpha-\delta\epsilon)n}\|\psi\|_{C^{\beta}}\right),$

where ρ(A)\rho(A) is the mass of the pair AA. Note by our choice of δ\delta that the exponent appearing in the above equation is negative.

Combining (11.10), (11.12), and (11.13), we find that

ϕψfnωdPvol\displaystyle\int\phi\cdot\psi\circ f^{n}_{\omega}\,dP_{\operatorname{vol}} =APδn(ϕ(fδnω)1dA)(ψdvol)\displaystyle=\sum_{A\in P_{\delta n}}\left(\int\phi\circ(f^{\delta n}_{\omega})^{-1}\,dA\right)\left(\int\psi\,d\operatorname{vol}\right)
±Cω(eδαn+eβδλn/2+e((1δ)αδϵ)n)ϕCβψCβ\displaystyle\,\,\,\,\,\,\,\,\pm C_{\omega}(e^{-\delta\alpha n}+e^{-\beta\delta\lambda n/2}+e^{-((1-\delta)\alpha-\delta\epsilon)n})\|\phi\|_{C^{\beta}}\|\psi\|_{C^{\beta}}

But as $P_{\delta n}$ comprises all but at most $C_{\omega}e^{-\delta\alpha n}$ of the mass of $f^{\delta n}_{\omega}(P_{\operatorname{vol}})$, it follows that:

ϕψfnωdPvol\displaystyle\int\phi\cdot\psi\circ f^{n}_{\omega}\,dP_{\operatorname{vol}} =(ϕdPvol±CωϕCβeδαn)(ψdvol)\displaystyle=\left(\int\phi\,dP_{\operatorname{vol}}\pm C_{\omega}\|\phi\|_{C^{\beta}}e^{-\delta\alpha n}\right)\left(\int\psi\,d\operatorname{vol}\right)
±Cω(eδαn+eβδλn/2+e((1δ)αδϵ)n)ϕCβψCβ\displaystyle\hskip 20.00003pt\pm C_{\omega}(e^{-\delta\alpha n}+e^{-\beta\delta\lambda n/2}+e^{-((1-\delta)\alpha-\delta\epsilon)n})\|\phi\|_{C^{\beta}}\|\psi\|_{C^{\beta}}
=ϕdvolψdvol±4Cω(eηnϕCβψCβ),\displaystyle=\int\phi\,d\operatorname{vol}\int\psi\,d\operatorname{vol}\pm 4C_{\omega}(e^{-\eta n}\|\phi\|_{C^{\beta}}\|\psi\|_{C^{\beta}}),

where η=min{δα,βδλ/2,(1δ)αδϵ}.\displaystyle\eta=\min\{\delta\alpha,\beta\delta\lambda/2,(1-\delta)\alpha-\delta\epsilon\}. Since the tail bound on CωC_{\omega} is part of Proposition 11.9, the proof is complete. ∎
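The constraint on $\delta$ and the resulting rate can be made explicit. The particular value of $\delta$ chosen below is ours, for illustration only; any $\delta$ satisfying the stated inequality works.

```latex
% Sketch: an explicit admissible \delta and the resulting rate \eta.
% The specific choice of \delta below is an illustration, not taken from the text.
\begin{align*}
\epsilon\delta-(1-\delta)\alpha<0
 \iff \delta(\alpha+\epsilon)<\alpha
 \iff \delta<\frac{\alpha}{\alpha+\epsilon}.
\end{align*}
% For instance, \delta=\frac{\alpha}{2(\alpha+\epsilon)} gives
\begin{align*}
(1-\delta)\alpha-\delta\epsilon=\alpha-\delta(\alpha+\epsilon)=\frac{\alpha}{2},
\qquad
\eta=\min\left\{\frac{\alpha^{2}}{2(\alpha+\epsilon)},\
                \frac{\beta\lambda\alpha}{4(\alpha+\epsilon)},\
                \frac{\alpha}{2}\right\}>0.
\end{align*}
```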

We now give the proof of annealed exponential mixing, i.e. exponential mixing of the skew product.

Proof of Corollary 1.2..

Let Φ¯(ω)=MΦ(ω,x)dvol\bar{\Phi}(\omega)=\int_{M}\Phi(\omega,x)\,d\operatorname{vol}, Ψ¯(ω)=MΨ(ω,x)dvol\bar{\Psi}(\omega)=\int_{M}\Psi(\omega,x)\,d\operatorname{vol}. Note that

$\iint\Phi(\Psi\circ F^{n})\,d\mu\,d\operatorname{vol}=\mathbb{E}_{\omega}\left(\int_{M}\Phi(\omega,x)\Psi(\sigma^{n}\omega,f_{\omega}^{n}x)\,d\operatorname{vol}(x)\right).$

Splitting the right hand side into the regions where Cωeηn/2C_{\omega}\!\leq\!\!e^{\eta n/2} and Cω>eηn/2C_{\omega}\!>\!\!e^{\eta n/2} and using (1.2) in the first region and (1.3) in the second region we obtain

Φ(ΨFn)dμdvol=Φ¯(Ψ¯σn)dμ+O(eηn/2ΦCβΨCβ).\iint\Phi(\Psi\circ F^{n})\,d\mu\,d\operatorname{vol}=\int\bar{\Phi}(\bar{\Psi}\circ\sigma^{n})d\mu+O\left(e^{-\eta n/2}\|\Phi\|_{C^{\beta}}\|\Psi\|_{C^{\beta}}\right).
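The splitting argument can be sketched as follows. We assume here, as a reading of the references (1.2) and (1.3), that (1.2) is the quenched mixing bound with constant $C_{\omega}e^{-\eta n}$ and that (1.3) is a tail bound of the form $\mu(C_{\omega}>C)\leq DC^{-1}$; the names $I_{n}$ and $G_{n}$ are introduced for this sketch only.

```latex
% Sketch of the two-region estimate. Notation of this sketch only:
% I_n(\omega) := \int_M \Phi(\omega,x)\Psi(\sigma^n\omega, f^n_\omega x)\,d\mathrm{vol}(x),
% G_n := \{\omega : C_\omega \le e^{\eta n/2}\}.
% Assumed forms: (1.2) gives |I_n - \bar\Phi\,(\bar\Psi\circ\sigma^n)| \le C_\omega e^{-\eta n}\|\Phi\|\|\Psi\|,
% and (1.3) gives \mu(C_\omega > C) \le D C^{-1}.
\begin{align*}
\mathbb{E}_\omega\left[\mathbf{1}_{G_n}\left|I_n-\bar\Phi\,(\bar\Psi\circ\sigma^n)\right|\right]
 &\le e^{\eta n/2}e^{-\eta n}\|\Phi\|_{C^\beta}\|\Psi\|_{C^\beta}
  = e^{-\eta n/2}\|\Phi\|_{C^\beta}\|\Psi\|_{C^\beta}, \\
\mathbb{E}_\omega\left[\mathbf{1}_{G_n^c}\left|I_n-\bar\Phi\,(\bar\Psi\circ\sigma^n)\right|\right]
 &\le 2\,\mu(G_n^c)\,\|\Phi\|_{C^\beta}\|\Psi\|_{C^\beta}
  \le 2D\,e^{-\eta n/2}\|\Phi\|_{C^\beta}\|\Psi\|_{C^\beta}.
\end{align*}
```

The second line uses only that each of the two terms inside the absolute value is bounded by the product of the sup norms, which the Hölder norms dominate.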

Now the result follows from the exponential mixing for the shift, see [PP90, Chapter 2]. ∎

Appendix A Finite time smoothing estimates

In the following two appendices we present finite time estimates for nonuniformly hyperbolic systems. While such estimates should be familiar to experts in Pesin theory, it is difficult to find precise references in the literature since most works concentrate on infinite orbits. The finite time estimates play an important role in the paper because in the main coupling algorithm we want to use the independence of the dynamics, hence we decide to stop at time $n$ based only on the dynamics on the time interval from zero to $n$.

A.1. Finite time Lyapunov metrics

Typically one defines Lyapunov metrics for an infinite sequence of diffeomorphisms. In our case we have only a finite sequence, so we show that finite sequences also have Lyapunov metrics. The most important point in Lemma A.1 below is item (3), which tells us that at a reverse tempered point the Lyapunov metric will not be distorted.

The appearance of $\lambda'$ in Lemma A.1 reflects that we need to make a small sacrifice in the rate of growth to obtain the uniform estimates. If we consider sequences that are $(C,\lambda,\epsilon)$-tempered and construct Lyapunov metrics that guarantee a growth rate of exactly $e^{\lambda}$ up to a factor of $\epsilon$, then as $\epsilon$ goes to zero the Lyapunov metrics get very distorted with respect to the reference metrics. With the lemma below, the metrics do not get any more distorted as $\epsilon$ goes to zero; however, they guarantee only expansion at some rate $\lambda'\leq\lambda$.

Lemma A.1.

(Lyapunov Metric Estimates) Fix $(C,\lambda)$. Then for any $0<\lambda'\leq\lambda$ and any sequence of linear maps $A_{1},\ldots,A_{n}\in\operatorname{SL}(2,\mathbb{R})$ that has a $(C,\lambda,\epsilon)$-subtempered splitting $E^{s}_{i}\oplus E^{u}_{i}$ with respect to a sequence of uniformly bounded reference metrics $\|\cdot\|_{i}$, there exists a sequence of metrics $\|\cdot\|_{i}'$ such that

  1. (1)

    $\|A_{i}|_{E^{s}}\|_{i}'\leq e^{-\lambda'}$

  2. (2)

    $\|A_{i}|_{E^{u}}\|_{i}'\geq e^{\lambda'}$

  3. (3)

    $\frac{1}{\sqrt{2}}\|\xi\|_{i}\leq\|\xi\|_{i}'\leq 4e^{2C+2\epsilon i}\left(1-e^{2(\lambda'-\lambda)}\right)^{-1/2}\|\xi\|_{i}$, for $\xi\in\mathbb{R}^{2}$.

The same holds for reverse tempered sequences of maps, mutatis mutandis.

The estimates below are similar to [LQ95, Lem. III.1.3]. The reverse version follows by just taking inverses. This result holds because dropping terms from the definition of the Lyapunov metric doesn’t stop them from satisfying the required estimates.

Proof.

We begin by defining the new Lyapunov metric. Then we check the desired properties.

For $\xi\in E^{s}_{i}$, let

\[\|\xi\|_{i}'=\left(\sum_{l=0}^{n-i}\|A_{i}^{l}\xi\|_{i}^{2}e^{2\lambda' l}\right)^{1/2},\]

and for $\xi\in E^{u}_{i}$, let

\[\|\xi\|_{i}'=\left(\sum_{l=0}^{i}e^{2\lambda' l}\|[A_{i-l}^{l}]^{-1}\xi\|_{i-l}^{2}\right)^{1/2}.\]

We then define $\|\cdot\|_{i}'$ on all of $\mathbb{R}^{2}$ by declaring $E^{s}_{i}$ and $E^{u}_{i}$ to be orthogonal.

We now check the required estimate for the stable norm. Let $\xi\in E_{i}^{s}$. Then

\begin{align*}
(\|A_{i}\xi\|'_{i+1})^{2}&=\sum_{l=0}^{n-i-1}\|A_{i+1}^{l}A_{i}\xi\|^{2}e^{2\lambda' l}=\sum_{l=0}^{n-i-1}\|A^{l+1}_{i}\xi\|^{2}e^{2\lambda' l}\\
&=e^{-2\lambda'}\sum_{l=0}^{n-i-1}\|A^{l+1}_{i}\xi\|^{2}e^{2\lambda'(l+1)}\leq e^{-2\lambda'}(\|\xi\|'_{i})^{2}.
\end{align*}

Note that the last inequality follows because the penultimate expression is missing the first term in the sum that defines $\|\xi\|_{i}'$.

We now check the estimate on $E^{u}_{i}$. Suppose $\xi\in E^{u}_{i}$ and $i<n$. Then

\begin{align*}
(\|A_{i}\xi\|'_{i+1})^{2}&=\sum_{l=0}^{i+1}e^{2\lambda' l}\|[A_{i+1-l}^{l}]^{-1}A_{i}\xi\|_{i+1-l}^{2}\\
&=\|A_{i}\xi\|^{2}_{i+1}+e^{2\lambda'}\sum_{l=1}^{i+1}e^{2\lambda'(l-1)}\|[A_{i-(l-1)}^{l-1}]^{-1}\xi\|_{i-(l-1)}^{2}\\
&=\|A_{i}\xi\|^{2}_{i+1}+e^{2\lambda'}\sum_{l=0}^{i}e^{2\lambda' l}\|[A_{i-l}^{l}]^{-1}\xi\|^{2}_{i-l}\geq e^{2\lambda'}(\|\xi\|_{i}')^{2}.
\end{align*}

This verifies the first two estimates in the lemma. Note that neither of the above required any control on the angle between $E^{s}$ and $E^{u}$.
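The two telescoping arguments above can be checked numerically in a scalar toy model. The sketch below is only an illustration under our own assumptions: each map acts on $E^{s}$ (respectively $E^{u}$) by multiplication by a scalar, all reference metrics are the absolute value, and positions are indexed $0,\ldots,n$ with `factors[j]` taking position $j$ to $j+1$. The function names are hypothetical.

```python
import math

def stable_norm(factors, i, lam_p, xi=1.0):
    """Finite-time Lyapunov norm of xi at position i on E^s (scalar toy model)."""
    n = len(factors)
    total, image = 0.0, xi
    for l in range(n - i + 1):
        total += image ** 2 * math.exp(2 * lam_p * l)  # |A_i^l xi|^2 e^{2 lam' l}
        if i + l < n:
            image *= factors[i + l]                     # apply the next map
    return math.sqrt(total)

def unstable_norm(factors, i, lam_p, xi=1.0):
    """Finite-time Lyapunov norm of xi at position i on E^u (scalar toy model)."""
    total, preimage = 0.0, xi
    for l in range(i + 1):
        total += math.exp(2 * lam_p * l) * preimage ** 2  # e^{2 lam' l} |[A_{i-l}^l]^{-1} xi|^2
        if l < i:
            preimage /= factors[i - l - 1]                # undo the previous map
    return math.sqrt(total)
```

As in the proof, the one-step inequalities hold for arbitrary factors; temperedness only enters when the new norms are compared with the reference ones.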

We now compare the two norms on $E^{s}_{i}$ and $E^{u}_{i}$. For $\xi\in E^{s}_{i}$,

\[(\|\xi\|_{i}')^{2}=\sum_{l=0}^{n-i}\|A^{l}_{i}\xi\|_{i}^{2}e^{2\lambda' l}\leq\sum_{l=0}^{n-i}e^{2C}e^{-2\lambda l}e^{2\epsilon i}\|\xi\|_{i}^{2}e^{2\lambda' l}\leq\frac{e^{2C}e^{2\epsilon i}}{1-e^{2(\lambda'-\lambda)}}\|\xi\|_{i}^{2}.\]

Next, for $\xi\in E^{u}_{i}$, we estimate

\begin{align*}
(\|\xi\|_{i}')^{2}&=\sum_{l=0}^{i}e^{2\lambda' l}\|[A^{l}_{i-l}]^{-1}\xi\|^{2}_{i-l}\leq\sum_{l=0}^{i}e^{2\lambda' l}e^{2C}e^{2(i-l)\epsilon}e^{-2\lambda l}\|\xi\|_{i}^{2}\\
&\leq e^{2C}e^{2i\epsilon}\sum_{l=0}^{i}e^{2(\lambda'-\lambda)l}e^{-2\epsilon l}\|\xi\|_{i}^{2}\leq\frac{e^{2C}e^{2i\epsilon}}{1-e^{2(\lambda'-\lambda)}}\|\xi\|_{i}^{2}.
\end{align*}

We now check the final estimate in the lemma. Write $\xi=\xi^{s}+\xi^{u}$ with $\xi^{s}\in E^{s}_{i}$ and $\xi^{u}\in E^{u}_{i}$. For the lower bound, note that by definition $\|\xi^{s}\|_{i}'\geq\|\xi^{s}\|_{i}$ and $\|\xi^{u}\|_{i}'\geq\|\xi^{u}\|_{i}$, thus

\[\|\xi\|_{i}^{2}\leq(\|\xi^{s}\|_{i}+\|\xi^{u}\|_{i})^{2}\leq 2[(\|\xi^{s}\|_{i}')^{2}+(\|\xi^{u}\|_{i}')^{2}]=2(\|\xi\|_{i}')^{2}.\tag{A.1}\]

For the upper bound, we have that

\[\|\xi\|_{i}'\leq\|\xi^{s}\|_{i}'+\|\xi^{u}\|_{i}'\leq\frac{e^{C+\epsilon i}}{\sqrt{1-e^{2(\lambda'-\lambda)}}}(\|\xi^{s}\|_{i}+\|\xi^{u}\|_{i}).\tag{A.2}\]

But we know from subtemperedness that the angle $\theta$ between $E^{s}_{i}$ and $E^{u}_{i}$ is at least $e^{-C}e^{-i\epsilon}$. So by the Law of Sines, for $*\in\{u,s\}$ we have $\|\xi^{*}\|_{i}\leq\|\xi\|_{i}/\sin\theta\leq 2\|\xi\|_{i}/\theta$, because $\theta/2\leq\sin(\theta)$ for $0\leq\theta\leq\pi/2$. Thus (A.2) gives

\[\|\xi\|_{i}'\leq\frac{4e^{2C+2\epsilon i}}{\sqrt{1-e^{2(\lambda'-\lambda)}}}\|\xi\|_{i},\]

which is the final estimate of the lemma. ∎

A.2. Basic calculus facts

We now record some facts from calculus that will be needed when we study estimates for the graph transform. In the following statements, as elsewhere, we use $\|\phi\|_{i}$ to denote the supremum of the norm of the $i$th partial derivatives of $\phi$.

Lemma A.2.

(Norms of functions in twisted charts) Suppose that $\phi\colon\mathbb{R}^{2}\to\mathbb{R}^{2}$ is a $C^{2}$ function. If we apply linear changes of coordinates $L_{1},L_{2}$ to $\phi$, then

\[\|L_{2}\circ\phi\circ L_{1}\|_{1}\leq\|L_{1}\|\|L_{2}\|\|\phi\|_{1}.\]

Further, for the second derivatives of $\phi$:

\[\|L_{2}\circ\phi\circ L_{1}\|_{2}\leq\|L_{2}\|\|\phi\|_{2}\|L_{1}\|^{2}.\]

The next lemma studies how the C2C^{2} norm of a curve changes when we apply a linear map.

Lemma A.3.

Suppose that $\gamma$ is a $C^{2}$ curve in $\mathbb{R}^{2}$ and that $L\colon\mathbb{R}^{2}\to\mathbb{R}^{2}$ is an invertible linear map. Then $\|L\circ\gamma\|_{C^{2}}\leq\frac{\|L\|}{(m(L))^{2}}\|\gamma\|_{C^{2}}$. Here $\|\gamma\|_{C^{2}}$ refers to the $C^{2}$ norm of $\gamma$ as a curve in $\mathbb{R}^{2}$ and $m(L)$ is the conorm of the matrix, $m(L)=\min_{v\neq 0}\|Lv\|/\|v\|$.

Proof.

By definition, the $C^{2}$ norm of a curve is the supremum of the second derivative of its graph over each of its tangent spaces. So, without loss of generality, suppose that $\gamma$ passes through the origin and that at this point $\gamma$ is the curve $t\mapsto(t,\lambda t^{2})$ ($O(t^{3})$ terms do not change the computation below). Then we apply $L=\begin{bmatrix}a&b\\ c&d\end{bmatrix}$ to $(t,\lambda t^{2})^{T}$ to get the curve $t\begin{pmatrix}a\\ c\end{pmatrix}+\lambda t^{2}\begin{pmatrix}b\\ d\end{pmatrix}$.

To study the $C^{2}$ norm of $L\circ\gamma$ at $0$, we must write it as a graph over its tangent space, i.e. in the form $tu+t^{2}\hat{\lambda}u^{\perp}$, where $u$ is a unit vector and $\hat{\lambda}$ is to be determined. Let $v=(a,c)^{T}$, $u=v/\|v\|$ and $w=(b,d)^{T}$. Then we may reparametrize $vt+\lambda wt^{2}$ in the form $ut+\lambda(w/\|v\|^{2})t^{2}$. Decomposing $w=pu+qu^{\perp}$ we obtain the parametrization $us+(\lambda q/\|v\|^{2})s^{2}u^{\perp}+O(s^{3})$, where $s=t+(p\lambda/\|v\|^{2})t^{2}$. Thus $\hat{\lambda}=q\lambda/\|v\|^{2}$. Since $|q|\leq\|w\|$, $\|w\|\leq\|L\|$, and $1/\|v\|\leq 1/m(L)$, the result follows. ∎
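The formula $\hat{\lambda}=q\lambda/\|v\|^{2}$ is easy to test on concrete matrices. The following sketch is our own illustration, not part of the paper: it computes $\hat{\lambda}$ for the model curve $t\mapsto(t,\lambda t^{2})$ and compares it with the bound $\|L\|\,m(L)^{-2}\lambda$, using the closed-form singular values of a $2\times 2$ matrix. The helper names are hypothetical.

```python
import math

def singular_values(a, b, c, d):
    """Largest and smallest singular values of L = [[a, b], [c, d]]."""
    t = a * a + b * b + c * c + d * d          # trace of L^T L
    det = a * d - b * c
    disc = math.sqrt(max(t * t - 4 * det * det, 0.0))
    return math.sqrt((t + disc) / 2), math.sqrt(max((t - disc) / 2, 0.0))

def image_graph_coefficient(a, b, c, d, lam):
    """Coefficient lam_hat of L(t, lam t^2) written as t*u + lam_hat*t^2*u_perp."""
    v = (a, c)                                  # image of the tangent direction
    w = (b, d)                                  # image of the normal direction
    norm_v = math.hypot(v[0], v[1])
    q = (v[0] * w[1] - v[1] * w[0]) / norm_v    # component of w along u_perp
    return lam * q / norm_v ** 2
```

The first return value of `singular_values` plays the role of $\|L\|$ and the second that of $m(L)$.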

We now estimate the C2C^{2} norm of a function in terms of its inverse.

Lemma A.4.

Suppose that $\phi\colon\mathbb{R}\to\mathbb{R}$ (or from one interval to another) is a $C^{2}$ diffeomorphism. If $|D\phi|>\lambda$, then $|D\phi^{-1}|\leq\lambda^{-1}$ and $\|\phi^{-1}\|_{2}\leq\lambda^{-3}\|\phi\|_{2}$.

Proof.

At each point, we express the Taylor polynomial of $\phi^{-1}$ in terms of the Taylor polynomial of $\phi$. Suppose that $\phi$ has Taylor polynomial $\nu x+Ax^{2}$ at some point, with $|\nu|\geq\lambda$. Then the Taylor polynomial of $\phi^{-1}$ at the corresponding point is $\nu^{-1}x+Cx^{2}$, where $C=-\nu^{-3}A$. Since $|C|\leq\lambda^{-3}|A|$, the conclusion follows. ∎
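As a sanity check on the coefficient $C=-\nu^{-3}A$, one can invert an exactly quadratic map and differentiate numerically. This is a minimal sketch under our own assumptions: the map is exactly $x\mapsto\nu x+Ax^{2}$, the inverse is taken on the branch through the origin, and the function names are hypothetical.

```python
import math

def inverse_of_quadratic(nu, A, y):
    """Inverse of x -> nu*x + A*x**2 near 0, on the branch through the origin."""
    if A == 0:
        return y / nu
    root = math.sqrt(nu * nu + 4 * A * y)
    return (-nu + math.copysign(1.0, nu) * root) / (2 * A)

def second_order_coefficient(g, h=1e-3):
    """Central-difference estimate of g''(0)/2, the quadratic Taylor coefficient at 0."""
    return (g(h) - 2 * g(0.0) + g(-h)) / (2 * h * h)
```

For instance, with $\nu=2$ and $A=1/2$ the estimate should be close to $-A/\nu^{3}=-1/16$.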

For future reference, we record a bound on compositions. An overview of estimates like these is contained in [Hör76, App. A].

Lemma A.5.

Suppose we are composing three functions $f,g,h\colon\mathbb{R}^{n}\to\mathbb{R}^{n}$. Then

\[\|f\circ g\|_{2}\leq\|f\|_{2}\|g\|_{1}^{2}+\|f\|_{1}\|g\|_{2}\]

and

\[\|f\circ g\circ h\|_{2}\leq\|f\|_{2}\|g\|_{1}^{2}\|h\|_{1}^{2}+\|f\|_{1}\|g\|_{2}\|h\|_{1}^{2}+\|f\|_{1}\|g\|_{1}\|h\|_{2}.\]
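For the reader's convenience, here is a short derivation of the three-function bound from the two-function one. It is a sketch that assumes the functions are $C^{2}$ and that $\|\cdot\|_{1}$ and $\|\cdot\|_{2}$ bound the operator norms of the first and second derivatives, so that in particular $\|f\circ g\|_{1}\leq\|f\|_{1}\|g\|_{1}$.

```latex
\begin{align*}
D^{2}(f\circ g) &= D^{2}f(g)[Dg,Dg] + Df(g)\,D^{2}g,\\
\|f\circ g\circ h\|_{2} &\leq \|f\circ g\|_{2}\|h\|_{1}^{2}+\|f\circ g\|_{1}\|h\|_{2}\\
&\leq \left(\|f\|_{2}\|g\|_{1}^{2}+\|f\|_{1}\|g\|_{2}\right)\|h\|_{1}^{2}+\|f\|_{1}\|g\|_{1}\|h\|_{2}.
\end{align*}
```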

When we study how fast the dynamics smooths curves, we will represent the curve as a graph and then apply the graph transform to it. The following relates the C2C^{2} norm of an embedded curve with the C2C^{2} norm of the curve represented as a graph. Recall that the C2C^{2} norm of an embedded curve is the same thing as the norm of the curve as a graph over its tangent space at each point in an exponential chart.

Lemma A.6.

Suppose $\gamma$ is a $C^{2}$ curve in $\mathbb{R}^{2}$ that is $\theta$-transverse to the $y$-axis. If we represent $\gamma$ as the graph over the $x$-axis of a function $\hat{\gamma}$, then

\[\|\hat{\gamma}\|_{1}\leq\cot\theta\text{, and }\|\hat{\gamma}\|_{2}\leq(\sin\theta)^{-3}\|\gamma\|_{C^{2}}.\]
Proof.

The first estimate is essentially the definition of tangent, so we will show the second.

Locally we may represent $\gamma$ as a graph:

\[p+(\sin\theta_{p},\cos\theta_{p})t+\phi_{p}(t)(-\cos\theta_{p},\sin\theta_{p})=p+(t\sin\theta_{p}-\phi_{p}(t)\cos\theta_{p},0)+(0,t\cos\theta_{p}+\phi_{p}(t)\sin\theta_{p}),\]

where $\phi_{p}(0)=\phi'_{p}(0)=0$. By definition of $\|\gamma\|_{C^{2}}$, $|\phi_{p}''(0)|\leq\|\gamma\|_{C^{2}}$.

In order to estimate $\hat{\gamma}''(0)$, we must write the graph in the form $p+(t,\psi(t))$ for some $\psi$ and estimate $\psi''(0)$. Accordingly, we make the change of variables $s=t\sin\theta_{p}$, getting

\[p+\left(s-\phi_{p}\left(\frac{s}{\sin\theta_{p}}\right)\cos\theta_{p},0\right)+\left(0,\frac{\cos\theta_{p}}{\sin\theta_{p}}s+\phi_{p}\left(\frac{s}{\sin\theta_{p}}\right)\sin\theta_{p}\right).\tag{A.3}\]

To estimate the second derivative of the graph at $0$, we need a representation of the form $(u+O(u^{3}),\psi(u)+O(u^{3}))$, so we make a further change of variables $u=s-\phi_{p}\left(\frac{s}{\sin\theta_{p}}\right)\cos\theta_{p}$. Then

\[s=u+\frac{\phi_{p}''(0)u^{2}\cos\theta_{p}}{2\sin^{2}\theta_{p}}+o(u^{2}).\]

Plugging this into (A.3) and using that $\cos^{2}\theta_{p}+\sin^{2}\theta_{p}=1$ we obtain the parametrization

\[p+\left(u,\frac{u\cos\theta_{p}}{\sin\theta_{p}}+\frac{\phi_{p}''(0)u^{2}}{2\sin^{3}\theta_{p}}+o(u^{2})\right)\]

and the result follows. ∎
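The factor $(\sin\theta)^{-3}$ can be cross-checked with the standard parametric formula $d^{2}y/dx^{2}=(x'y''-y'x'')/(x')^{3}$. The sketch below is an independent check under our own assumptions: the curve has tangent direction $(\sin\theta,\cos\theta)$ at the base point and normal displacement exactly $\kappa t^{2}/2$, so that $\kappa$ stands in for $\phi_{p}''(0)$. The function name is hypothetical.

```python
import math

def graph_second_derivative(theta, kappa):
    """d^2y/dx^2 at t = 0 for x(t) = t sin(theta) - phi(t) cos(theta),
    y(t) = t cos(theta) + phi(t) sin(theta), with phi(t) = kappa * t**2 / 2."""
    dx, dy = math.sin(theta), math.cos(theta)                         # first derivatives at t = 0
    ddx, ddy = -kappa * math.cos(theta), kappa * math.sin(theta)      # second derivatives at t = 0
    return (dx * ddy - dy * ddx) / dx ** 3
```

The returned value agrees with $\kappa/\sin^{3}\theta$, matching the coefficient $\phi_{p}''(0)/(2\sin^{3}\theta_{p})$ of $u^{2}$ in the last display.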

The next lemma estimates how the density is distorted by diffeomorphisms.

Lemma A.7.

Suppose that $M$ is a closed Riemannian manifold. There exists $C>0$ such that if $f\colon M\to M$ is a $C^{2}$ diffeomorphism, $\gamma$ is a $C^{2}$ curve in $M$ and $\rho$ is a log-$\alpha$-Hölder density along $\gamma$, then the density $f_{*}\rho$ along $f(\gamma)$ satisfies

\[\|\ln(f_{*}\rho)\|_{C^{\alpha}}\leq(1/m(Df))^{1+\alpha}\left(\|\ln\rho\|_{C^{\alpha}}+C\|f\|_{C^{2}}(1+\|\gamma\|_{C^{2}})\right).\tag{A.4}\]

The same estimate holds for local diffeomorphisms, mutatis mutandis.

We leave the proof of the lemma to the reader, since we prove a similar estimate below (see (A.25)).

Next we record an estimate comparing two inner products.

Lemma A.8.

Suppose that we have two inner products on a vector space $V$, with norms $\|\cdot\|_{1}$ and $\|\cdot\|_{2}$, and that

\[A\|\cdot\|_{1}\leq\|\cdot\|_{2}\leq B\|\cdot\|_{1}.\]

Then for $v,w\in V\setminus\{0\}$

\[AB^{-1}\angle_{1}(v,w)\leq\angle_{2}(v,w)\leq A^{-1}B\angle_{1}(v,w),\]

where $\angle_{i}$ denotes the angle with respect to the metric $\|\cdot\|_{i}$.

Proof.

We show the upper bound; the lower bound is a straightforward consequence. Let $S_{i}^{1}$ denote the unit sphere with respect to the inner product $i$, and let $v$ and $w$ be two unit vectors with respect to $\|\cdot\|_{1}$. Let $I$ be a curve in $S^{1}_{1}$ between $v$ and $w$ such that $\operatorname{len}_{1}(I)=\angle_{1}(v,w)$. Then $\operatorname{len}_{2}(I)\leq B\operatorname{len}_{1}(I)$. Let $\pi_{2}\colon V\setminus\{0\}\to S^{1}_{2}$ denote the radial projection onto $S^{1}_{2}$. Then $\angle_{2}(v,w)\leq\operatorname{len}_{2}(\pi_{2}(I))$. Note that the norm of $D\pi_{2}|_{I}$ is bounded above by $1/d_{2}(0,I)$. Since $d_{2}(0,I)\geq A$, we see that $\operatorname{len}_{2}(\pi_{2}(I))\leq A^{-1}B\operatorname{len}_{1}(I)$, so we are done. ∎
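The two-sided angle bound can be probed numerically in the plane. The sketch below is an illustration under our own assumptions: the first inner product is the Euclidean one on $\mathbb{R}^{2}$ and the second is the diagonal form $a x_{1}y_{1}+b x_{2}y_{2}$, so that one may take $A=\sqrt{\min(a,b)}$ and $B=\sqrt{\max(a,b)}$. The function names and the sample grid are hypothetical.

```python
import math

def angle(v, w, a=1.0, b=1.0):
    """Angle between v and w for the inner product a*x1*y1 + b*x2*y2."""
    dot = a * v[0] * w[0] + b * v[1] * w[1]
    nv = math.sqrt(a * v[0] ** 2 + b * v[1] ** 2)
    nw = math.sqrt(a * w[0] ** 2 + b * w[1] ** 2)
    return math.acos(max(-1.0, min(1.0, dot / (nv * nw))))  # clamp against rounding

def angle_distortion_ok(a, b, tol=1e-9):
    """Check A/B * angle_1 <= angle_2 <= B/A * angle_1 on a small grid of vector pairs."""
    A, B = math.sqrt(min(a, b)), math.sqrt(max(a, b))
    for k in range(11):
        s = 0.3 * k
        for d in (0.05, 0.5, 1.5, 3.0):
            v = (math.cos(s), math.sin(s))
            w = (math.cos(s + d), math.sin(s + d))
            ang1, ang2 = angle(v, w), angle(v, w, a, b)
            if not (A / B * ang1 - tol <= ang2 <= B / A * ang1 + tol):
                return False
    return True
```

A grid check of this kind is of course no substitute for the proof; it only guards against a misplaced factor of $A$ or $B$.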

We now record an estimate on how fast a C2C^{2} curve can get worse under the dynamics. Note that one iteration can instantaneously make a line into an O(1)O(1) bad curve, hence the estimate has the form below.

Lemma A.9.

Fix $D>0$. Then there exists $\Lambda>0$ such that if $f_{i}\in\operatorname{Diff}^{2}(M)$, $1\leq i\leq n$, is a sequence of diffeomorphisms of a closed Riemannian manifold $M$ with $\|f_{i}\|_{C^{2}}<D$, $\gamma$ is a $C^{2}$ curve in $M$, and $\gamma_{n}=f^{n}_{1}(\gamma)$, then

\[\|\gamma_{n}\|_{C^{2}}\leq\max\{e^{\Lambda n}\|\gamma\|_{C^{2}},e^{\Lambda n}\}.\]
Proof.

Recall that the $C^{2}$ norm of $\gamma$ is bounded by the maximum over all $t\in\gamma$ of the second derivative of $\gamma$ in an exponential chart at $t$, where $\gamma$ is viewed as a graph over its tangent line. The result then follows because the second derivative of a composition of maps with uniformly bounded $C^{2}$ norm grows at most exponentially fast in the number of maps. ∎

A.3. Properties of Hölder functions

In this subsection, we record some additional claims about Hölder and log-Hölder functions that will be used in the proof of the coupling lemma.

Claim A.10.

Suppose that $\rho\colon M\to\mathbb{R}$ is a $(C,\alpha)$-Hölder function on a metric space $M$ such that $\rho\geq A^{-1}$, for some $A>0$. Then $\rho$ is $AC$-log-$\alpha$-Hölder, i.e. $\ln\rho$ is $(AC,\alpha)$-Hölder.

Proof.

First, observe that $\ln$ is $A$-Lipschitz on $[A^{-1},\infty)$ because its derivative $1/x$ is at most $A$ there. Thus $|\ln(\rho(x))-\ln(\rho(y))|\leq A|\rho(x)-\rho(y)|\leq AC|x-y|^{\alpha}$, as desired. ∎

The next lemma relates two different ways of dealing with log-Hölder functions.

Lemma A.11.

Suppose that $\rho$ is an $(A,\alpha)$-log-Hölder function on a metric space of diameter at most $D$. Then there exists $C_{A,D}$ such that

\[|\rho(x)-\rho(y)|\leq\rho(x)C_{A,D}|x-y|^{\alpha}.\tag{A.5}\]
Proof.

Suppose that $\rho(y)\geq\rho(x)$. Then log-$\alpha$-Hölder gives that

\[\ln(\rho(y)/\rho(x))=|\ln(\rho(y)/\rho(x))|\leq A|x-y|^{\alpha}.\]

Thus taking $e^{x}$, by boundedness of the metric space and the constant $A$, there exists $C_{A,D}$ such that

\[\frac{\rho(y)}{\rho(x)}\leq e^{A|x-y|^{\alpha}}\leq 1+C_{A,D}|x-y|^{\alpha}.\]

Thus

\[\rho(y)-\rho(x)\leq\rho(x)C_{A,D}|x-y|^{\alpha}.\]

The case when $\rho(y)<\rho(x)$ is similar, so we are done. ∎
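One admissible constant can be made explicit. The following is a sketch, assuming $D>0$ and writing $t=|x-y|^{\alpha}\in[0,D^{\alpha}]$; the particular value of $C_{A,D}$ below is our own illustration and is not claimed to be the one used elsewhere in the paper. The second line handles the case $\rho(y)<\rho(x)$, where $\rho(y)/\rho(x)\geq e^{-At}$.

```latex
\begin{align*}
e^{At} &\leq 1+\frac{e^{AD^{\alpha}}-1}{D^{\alpha}}\,t
  \quad\text{for } 0\leq t\leq D^{\alpha}\ \text{(convexity of } t\mapsto e^{At}\text{)},\\
1-e^{-At} &\leq At\leq\frac{e^{AD^{\alpha}}-1}{D^{\alpha}}\,t,\\
\text{so one may take}\quad C_{A,D} &= \frac{e^{AD^{\alpha}}-1}{D^{\alpha}}.
\end{align*}
```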

A.4. Graph transform with estimates on the second derivative

We now study the graph transform and record how $C^{2}$ norms of curves are affected by it. If one constructs the stable manifolds by using the graph transform, then after one has checked that the stable manifold is $C^{1}$, one can check that the manifolds are $C^{r}$ inductively by studying the action of the graph transform on the jet of the stable manifold, which is $C^{1}$. See for instance the construction in [Shu87], which proceeds along these lines.

Proposition A.12.

($C^{2}$ estimates for the graph transform) Suppose $\lambda>1$ and $F\colon\mathbb{R}^{2}\to\mathbb{R}^{2}$ is a $C^{2}$ diffeomorphism of the form

\[F=(\sigma_{1}x+f_{1}(x,y),\sigma_{2}y+f_{2}(x,y)),\tag{A.6}\]

with $\min\{\sigma_{1},\sigma_{2}^{-1}\}\geq\lambda$. Suppose that $\gamma$ is a $C^{2}$ curve given as the graph of a function $\phi\colon I_{1}\to\mathbb{R}$. Assume that $F(0,0)=(0,0)$ and that we have the following estimates:

\begin{align}
\|f_{1}\|_{C^{1}}&=\epsilon_{1},\tag{A.7}\\
\|f_{2}\|_{C^{1}}=\epsilon_{2}&<\lambda^{-1},\tag{A.8}\\
\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1}&>0.\tag{A.9}
\end{align}

Then the following hold.

  1. (1)

    The curve $F\circ\gamma$ is given as the graph of a function $\widetilde{\phi}\colon I_{2}\to\mathbb{R}$ and

    \[\operatorname{len}(I_{2})\geq(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})\operatorname{len}(I_{1}).\tag{A.10}\]
  2. (2)

    We have an estimate on how much $F$ smooths $\phi$,

    \begin{align}
    \|\widetilde{\phi}\|_{C^{0}}&\leq\lambda^{-1}\|\phi\|_{C^{0}}+\epsilon_{2},\tag{A.11}\\
    \|\widetilde{\phi}\|_{1}&\leq(\lambda^{-1}\|\phi\|_{1}+\epsilon_{2}+\epsilon_{2}\|\phi\|_{1})(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})^{-1}.\tag{A.12}
    \end{align}
  3. (3)

    There is $\epsilon_{0}>0$ such that under the additional assumption that $\epsilon_{1},\epsilon_{2},\|\phi\|_{1}<\epsilon_{0}$

    \[\|\widetilde{\phi}\|_{2}\leq\lambda^{-1.99}\|f\|_{2}+\lambda^{-2.99}\|\phi\|_{2}.\tag{A.13}\]
  4. (4)

    The graph transform smooths densities along curves. If $\rho(x,\phi(x))$ is a log-$\alpha$-Hölder density along $\gamma$ with respect to the arclength, write $\widetilde{\rho}(x,\widetilde{\phi}(x))$ for the density of the pushforward of $\rho$ along $F(\gamma)$. For $\epsilon_{0}>0$ as in (3), if $\epsilon_{1},\epsilon_{2},\|\phi\|_{1}<\epsilon_{0}$, then

    \[\|\ln\widetilde{\rho}\|_{C^{\alpha}}\leq\lambda^{-.9\alpha}(\|\ln\rho\|_{C^{\alpha}}+\|f\|_{2}+\|\phi\|_{2}).\tag{A.14}\]

Note that part (1) of the proposition implies that if $I_{1}$ contains a neighborhood of $0$ of size $\delta$, then $I_{2}$ contains a neighborhood of size $(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})\delta$.

Proof.

We write down explicitly a formula for $\widetilde{\phi}$ and then estimate each term that appears in the formula. It is tedious but straightforward. Throughout we will use $\pi_{1}$ and $\pi_{2}$ for the projections onto the two factors in $\mathbb{R}^{2}$.

We estimate the $C^{1}$ norm of the image of $\phi$ as a graph over $\mathbb{R}\times\{0\}$. To this end we first study how much the graph of $\phi$ is stretched horizontally, which will verify (1) above. To do this we consider a natural map $\psi^{-1}\colon I_{1}\to\mathbb{R}$:

\[\psi^{-1}\colon x\mapsto(x,\phi(x))\mapsto\pi_{1}(F(x,\phi(x)))=\lambda x+f_{1}(x,\phi(x)).\tag{A.15}\]

From the definition of $\psi^{-1}$,

\[\|D\psi^{-1}\|\geq\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1},\tag{A.16}\]

thus by (A.9), $D\psi^{-1}$ does not vanish, so $\psi^{-1}$ is monotone. Hence $F(\gamma)$ is the graph of a function $\widetilde{\phi}$, and we may write $\psi^{-1}\colon I_{1}\to I_{2}$. By (A.16), $(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})\operatorname{len}(I_{1})\leq\operatorname{len}(I_{2})$. This completes the proof of item (1).

We now prove item (2). First we give the $C^{0}$ estimate and then the estimate on the first derivative. By the assumption on $f_{2}$, we see that the image of the graph of $\phi$ under $F$ is at distance at most $\lambda^{-1}\|\phi\|_{C^{0}}+\epsilon_{2}$ from the $x$-axis. Thus

\[\|\widetilde{\phi}\|_{C^{0}}\leq\lambda^{-1}\|\phi\|_{C^{0}}+\epsilon_{2}.\tag{A.17}\]

Now we estimate $\|\widetilde{\phi}\|_{1}$. From equation (A.16), we obtain that:

\[\|D\psi\|\leq(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})^{-1}.\tag{A.18}\]

This allows us to estimate the $C^{1}$ norm of $F\circ\gamma$ as a graph over $I_{2}\times\{0\}$. The curve $F(\gamma)$ is given by the graph of

\[x\mapsto\pi_{2}F(\psi(x),\phi(\psi(x)))=\lambda^{-1}\phi(\psi(x))+f_{2}(\psi(x),\phi(\psi(x)))=\widetilde{\phi}(x).\tag{A.19}\]

Thus by the chain rule

\[\|\widetilde{\phi}\|_{1}\leq\lambda^{-1}\|\phi\|_{1}(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})^{-1}+\epsilon_{2}(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})^{-1}+\epsilon_{2}\|\phi\|_{1}(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})^{-1}.\]

Hence,

\[\|\widetilde{\phi}\|_{1}\leq(\lambda^{-1}\|\phi\|_{1}+\epsilon_{2}+\epsilon_{2}\|\phi\|_{1})(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})^{-1},\tag{A.20}\]

which finishes the proof of item (2).
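Items (1) and (2) can be tested on a concrete map. The sketch below is an illustration under our own assumptions and is not taken from the paper: we take $\sigma_{1}=\lambda$, $\sigma_{2}=\lambda^{-1}$, perturbations $f_{j}(x,y)=\epsilon_{j}\sin(x+y)$, and $\phi(x)=c\sin x$ with $c$ the parameter `slope`. Chord slopes of the image curve are values of $\widetilde{\phi}'$ by the mean value theorem, so they must respect the bound (A.12). The function name and default parameters are hypothetical.

```python
import math

def graph_transform_check(lam=2.0, eps1=0.05, eps2=0.05, slope=0.3,
                          half_width=0.5, samples=2001):
    """Push the graph of phi through F and compare with (A.10) and (A.12)."""
    phi = lambda x: slope * math.sin(x)          # sup |phi'| = slope
    f1 = lambda x, y: eps1 * math.sin(x + y)     # value and partials bounded by eps1
    f2 = lambda x, y: eps2 * math.sin(x + y)     # value and partials bounded by eps2
    xs = [-half_width + 2 * half_width * k / (samples - 1) for k in range(samples)]
    image = [(lam * x + f1(x, phi(x)), phi(x) / lam + f2(x, phi(x))) for x in xs]
    stretch = lam - eps1 - eps1 * slope          # lower bound on horizontal stretching
    slope_bound = (slope / lam + eps2 + eps2 * slope) / stretch
    length_ok = image[-1][0] - image[0][0] >= stretch * (xs[-1] - xs[0])
    max_slope = max(abs((y1 - y0) / (x1 - x0))
                    for (x0, y0), (x1, y1) in zip(image, image[1:]))
    return length_ok, max_slope, slope_bound
```

With the default parameters the observed maximal slope sits slightly below the bound, which is consistent with the estimate being close to sharp when the perturbations are small.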

We now turn to the $C^{2}$ estimates and check item (3). To begin we need to obtain a $C^{2}$ estimate on the function $\psi$ used above. By (A.15) and the chain rule,

\[\|\psi^{-1}\|_{2}\leq\|f\|_{2}+2\|f\|_{2}\|\phi\|_{1}+\|f\|_{2}\|\phi\|_{1}^{2}+\|f\|_{1}\|\phi\|_{2}.\]

Thus by Lemma A.4,

\[\|\psi\|_{2}\leq(\lambda-\epsilon_{1}-\epsilon_{1}\|\phi\|_{1})^{-3}(\|f\|_{2}+2\|f\|_{2}\|\phi\|_{1}+\|f\|_{2}\|\phi\|_{1}^{2}+\|f\|_{1}\|\phi\|_{2}).\]

We can now plug everything in to estimate the $C^{2}$ norm of the image of $\phi$. By definition $\widetilde{\phi}$ is equal to $\lambda^{-1}\phi(\psi(x))+f_{2}(\psi(x),\phi(\psi(x)))$. For the first term, we have the estimate

\[\|\lambda^{-1}\phi\circ\psi\|_{2}\leq\lambda^{-1}(\|\phi\|_{2}\|\psi\|_{1}^{2}+\|\phi\|_{1}\|\psi\|_{2}).\]

By the chain rule

\begin{align*}
\|f_{2}(\psi(x),\phi(\psi(x)))\|_{2}&\leq\|f_{2}\|_{2}\|\psi\|_{1}^{2}+\|f_{2}\|_{2}\|\phi\|_{1}\|\psi\|_{1}^{2}+\|f_{2}\|_{1}\|\psi\|_{2}\\
&\quad+(\|f_{2}\|_{1}\|\phi\|_{1}\|\psi\|_{2}+\|f_{2}\|_{1}\|\phi\|_{2}\|\psi\|_{1}^{2}+\|f_{2}\|_{2}\|\phi\|_{1}^{2}\|\psi\|_{1}^{2}+\|f_{2}\|_{2}\|\phi\|_{1}\|\psi\|_{1}^{2}).
\end{align*}

Hence if $\epsilon_{1},\epsilon_{2},\|\phi\|_{1}<\epsilon_{0}$ and $\epsilon_{0}$ is sufficiently small, then

\begin{align}
\|\psi\|_{1}&\leq\lambda^{-.9999},\tag{A.21}\\
\|\psi\|_{2}&\leq\lambda^{-2.999}(\epsilon_{0}\|\phi\|_{2}+\|f\|_{2}).\tag{A.22}
\end{align}

In particular, as long as $\epsilon_{0}>0$ is sufficiently small, under the assumptions just listed, applying the estimates (A.21) and (A.22) on $\|\psi\|_{1}$ and $\|\psi\|_{2}$ gives

\begin{align}
\|\lambda^{-1}\phi\circ\psi\|_{2}&\leq\lambda^{-2.999}(\epsilon_{0}\|f\|_{2}+\|\phi\|_{2}),\tag{A.23}\\
\|f_{2}(\psi(x),\phi(\psi(x)))\|_{2}&\leq\lambda^{-1.999}\|f\|_{2}+\epsilon_{0}\lambda^{-1.8}\|\phi\|_{2}.\tag{A.24}
\end{align}

Combining these estimates, we see that as long as $\epsilon_{0}$ is sufficiently small,

\[\|\widetilde{\phi}\|_{2}\leq\lambda^{-1.99}\|f\|_{2}+\lambda^{-2.99}\|\phi\|_{2}.\]

We next study how the Hölder norm of the log of the density $\rho$ along $\gamma$ changes when we iterate the dynamics and prove item (4). From the change of variables formula, we must estimate the following:

\begin{multline}
\ln[\rho(\psi(x),\phi(\psi(x)))\|DF|_{(1,d\phi/dx)}(\psi(x),\phi(\psi(x)))\|^{-1}]=\\
\ln\rho(\psi(x),\phi(\psi(x)))+\ln\|DF|_{(1,d\phi/dx)}(\psi(x),\phi(\psi(x)))\|^{-1}=I+II.\tag{A.25}
\end{multline}

Term I. The estimate of term $I$ is straightforward:

\[\|\ln\rho(\psi(x),\phi(\psi(x)))\|_{C^{\alpha}}\leq\|\psi\|_{1}^{\alpha}\|\ln\rho\|_{C^{\alpha}}\leq\lambda^{-.9\alpha}\|\ln\rho\|_{C^{\alpha}}\]

by equation (A.18), as we are assuming $\|f_{1}\|_{1},\|f_{2}\|_{1},\|\phi\|_{1}$ are all small.

Term II. The second term is more complicated to estimate. Note that this term does not actually involve $\rho$ as it is just the Jacobian of the map between two curves. So, to control the log-$\alpha$-Hölder norm of this function we can estimate the derivative of the logarithm, which is an upper bound on the log-$\alpha$-Hölder constant for all $\alpha\leq 1$. To begin, we write

\begin{align*}
D\ln\|DF|_{(1,d\phi/dx)}(\psi(x),\phi(\psi(x)))\|^{-1}&=2^{-1}D\ln\|DF|_{(1,d\phi/dx)}\|^{2}-2^{-1}D\ln\|(1,d\phi/dx)\|^{2}\\
&=III+IV,
\end{align*}

where $III$ and $IV$ are evaluated at the point $(\psi(x),\phi(\psi(x)))$.

Term III. We now bound term $III$. Because $D\ln f\circ\psi=\psi'D\ln f$, we see that the required estimate will hold assuming that it holds without precomposing with $\psi$, because $|\psi'|\leq 1$ under these assumptions. Thus we suppress the $\psi$ below. From before we have an expression for $DF$ in terms of $\lambda,f_{1},f_{2}$:

\[DF=\begin{bmatrix}\lambda+\frac{df_{1}}{dx}&\frac{df_{1}}{dy}\\ \frac{df_{2}}{dx}&\lambda^{-1}+\frac{df_{2}}{dy}\end{bmatrix}.\]

Thus we are reduced to evaluating

\[D\ln\left[\left(\lambda+\frac{df_{1}}{dx}+\frac{df_{1}}{dy}\frac{d\phi}{dx}\right)^{2}+\left(\frac{df_{2}}{dx}+\lambda^{-1}\frac{d\phi}{dx}+\frac{df_{2}}{dy}\frac{d\phi}{dx}\right)^{2}\right],\tag{A.26}\]

where the $df_{i}$ terms are evaluated at $(x,\phi(x))$. Then taking derivatives gives:

\[\frac{A+B}{\left(\lambda+\frac{df_{1}}{dx}+\frac{df_{1}}{dy}\frac{d\phi}{dx}\right)^{2}+\left(\frac{df_{2}}{dx}+\lambda^{-1}\frac{d\phi}{dx}+\frac{df_{2}}{dy}\frac{d\phi}{dx}\right)^{2}}=\frac{A}{Q}+\frac{B}{Q},\tag{A.27}\]

where $A$ and $B$ are the derivatives of the two squared terms in equation (A.26) and $Q$ is the denominator of the left hand side of equation (A.27). Note that $Q$ can be made arbitrarily close to $\lambda^{2}$ as long as $df_{1}/dx,df_{1}/dy$ and $d\phi/dx$ are sufficiently small.

Keeping in mind that the $f_{i}$ terms are evaluated at $(x,\phi(x))$, we find that:

\[A=2\left(\lambda+\frac{df_{1}}{dx}+\frac{df_{1}}{dy}\frac{d\phi}{dx}\right)\left(\frac{d^{2}f_{1}}{dx^{2}}+\frac{d^{2}f_{1}}{dxdy}\frac{d\phi}{dx}+\left(\frac{d^{2}f_{1}}{dxdy}+\frac{d^{2}f_{1}}{dy^{2}}\frac{d\phi}{dx}\right)\frac{d\phi}{dx}+\frac{df_{1}}{dy}\frac{d^{2}\phi}{dx^{2}}\right)\]

and

\[B=2\left(\frac{df_{2}}{dx}+\lambda^{-1}\frac{d\phi}{dx}+\frac{df_{2}}{dy}\frac{d\phi}{dx}\right)\left(\frac{d^{2}f_{2}}{dx^{2}}+\frac{d^{2}f_{2}}{dxdy}\frac{d\phi}{dx}+\lambda^{-1}\frac{d^{2}\phi}{dx^{2}}+\left(\frac{d^{2}f_{2}}{dxdy}+\frac{d^{2}f_{2}}{dy^{2}}\frac{d\phi}{dx}\right)\frac{d\phi}{dx}+\frac{df_{2}}{dy}\frac{d^{2}\phi}{dx^{2}}\right).\]

Pick a small number $\bar{\epsilon}$. Then, provided that $\|f_{1}\|_{1},\|f_{2}\|_{1}$ and $\|\phi\|_{1}$ are sufficiently small, it is easy to see from the above expressions for $A$ and $B$ that

\[|III|\leq\frac{\lambda+\bar{\epsilon}}{\lambda^{2}-\bar{\epsilon}}(\|f\|_{2}+\|\phi\|_{2}).\tag{A.28}\]

Term IV. We now bound Term IV. For this term we have

\[2^{-1}D\ln\|(1,d\phi/dx)\|^{2}=2^{-1}D\ln\left(1+\left(\frac{d\phi}{dx}\right)^{2}\right)=\frac{\frac{d\phi}{dx}\frac{d^{2}\phi}{dx^{2}}}{1+\left(\frac{d\phi}{dx}\right)^{2}}.\]

Since we are assuming $\|\phi\|_{1}$ is small, we see that $|IV|\leq\bar{\epsilon}\|\phi\|_{2}$.

Conclusion of estimates on $|D\ln\widetilde{\rho}|$. From the above discussion,

\begin{align*}
\|\ln\widetilde{\rho}(x,\widetilde{\phi}(x))\|_{C^{\alpha}}&\leq|I|+|III|+|IV|\\
&\leq\lambda^{-.9\alpha}\|\ln\rho\|_{C^{\alpha}}+\left[\frac{\lambda+\bar{\epsilon}}{\lambda^{2}-\bar{\epsilon}}+\bar{\epsilon}\right](\|f\|_{2}+\|\phi\|_{2})\\
&\leq\lambda^{-.9\alpha}(\|\ln\rho\|_{C^{\alpha}}+\|f\|_{2}+\|\phi\|_{2}),
\end{align*}

where the last inequality holds since, for $\alpha\leq 1$, the expression in square brackets is less than $\lambda^{-.9\alpha}$ provided that $\bar{\epsilon}$ is sufficiently small. This concludes the proof of the proposition. ∎

A.5. Finite time smoothing estimate

Now that we control the amount of smoothing due to a single iteration of the graph transform, we study a reverse subtempered point for a sequence of diffeomorphisms. An important feature of the estimate below is that it covers curves that are extremely close to the contracting direction. This complicates the estimates compared to the case that one only considers curves lying in a cone near the expanding direction.

Proposition A.13.

Fix constants $C,\lambda,\epsilon,D_{1}>0$ with $\epsilon<\lambda/30$. Suppose that $f_{i}\colon\mathbb{R}^{2}\to\mathbb{R}^{2}$, $1\leq i\leq n$, is a sequence of diffeomorphisms such that $f_{i}(0)=0$, the sequence $D_{0}f_{i}$ has a $(C,\lambda,\epsilon)$-reverse tempered splitting $E^{s}_{i},E^{u}_{i}$ in the sense of Definition 4.2, $\|f_{i}\|_{C^{2}}<D_{1}$, and $\|Df^{-1}_{i}\|\geq D_{1}^{-1}$. Then there exist constants $\epsilon_{0},\ell_{\max},D_{2},D_{3},D_{4},D_{5},D_{6},D_{7},D_{8},C_{0}$ depending only on $(C,\lambda,\epsilon)$ and $D_{1}$ such that the following holds. Let $\gamma$ be a $C^{2}$ curve in $\mathbb{R}^{2}$ passing through $0$, not tangent to $E^{s}_{0}$ at $0$, and containing an $R$-good neighborhood of $0$. Let $f^{n}=f_{n}\circ\cdots\circ f_{1}$. Let $\theta=\angle(\dot{\gamma}(0),E^{s}_{0})$, and let $\gamma_{0}$ be a segment of $\gamma$ containing $0$ of length at least

\[\operatorname{len}(\gamma_{0})=D_{2}\min\{e^{-R}\theta,e^{-.9\lambda n}\}.\tag{A.29}\]

There is an associated auxiliary quantity

\[l_{0}=D_{3}\theta e^{-2\epsilon n}\min\{e^{-R}\theta,e^{-.9\lambda n}\},\tag{A.30}\]

and a subcurve $\gamma_{n}$ of $f^{n}(\gamma_{0})$ containing $0$ such that the following hold:

  1. (1)

    The curve γn\gamma_{n} has length at least

    out=min{l0e.9λn,max}.\ell_{out}=\min\{l_{0}e^{.9\lambda n},\ell_{\max}\}.
  2. (2)

    If the minimum in item (1) is realized by max\ell_{\max}, then the preimage of γn\gamma_{n} in γ\gamma has length at most D4e.9λn,\displaystyle D_{4}e^{-.9\lambda n}, and this occurs as long as

    nD5+max{R,0}2ln(θ).99λ.n\geq D_{5}+\frac{\max\{R,0\}-2\ln(\theta)}{.99\lambda}.

    Further, in this case, the preimage of γn\gamma_{n} in fi(γ)f^{i}(\gamma) has length at most D4e.9λ(ni)D_{4}e^{-.9\lambda(n-i)}. In fact if Ifi(γ)I\subseteq f^{i}(\gamma) is a curve of length at least D4e.85λ(ni)D_{4}e^{-.85\lambda(n-i)} containing a point fi(x)f^{i}(x), then fni(I)f^{n-i}(I) contains a C0C_{0}-good neighborhood of fn(x)f^{n}(x).

  3. (3)

    On γn\gamma_{n}, we have the estimate:

    (A.31) γnC2<D6e2.9λneD7lnθmax{γC2,1}+D8.\|\gamma_{n}\|_{C^{2}}<D_{6}e^{-2.9\lambda n}e^{D_{7}\ln\theta}\max\{\|\gamma\|_{C^{2}},1\}+D_{8}.
  4. (4)

    Finally, for any arbitrarily large D9>0D_{9}>0 and fixed α\alpha, there exist D10,D11D_{10},D_{11} such that the following holds. Suppose that ρ\rho is a density along γ\gamma that is log-α\alpha-Hölder. Then for the same collection of nn, the density of ρn=(fn)(ρ)\rho_{n}=(f^{n})_{*}(\rho) along γn\gamma_{n} with respect to arclength parametrization of γn\gamma_{n} satisfies the following estimate, as long as γn2<D9\|\gamma_{n}\|_{2}<D_{9},

    (A.32) lnρn|γnCαD10e.9αλneD7lnθ(1+lnρCα+γC2)+D11.\|\ln\rho_{n}|_{\gamma_{n}}\|_{C^{\alpha}}\leq D_{10}e^{-.9\alpha\lambda n}e^{D_{7}\ln\theta}(1+\|\ln\rho\|_{C^{\alpha}}+\|\gamma\|_{C^{2}})+D_{11}.

The analogous statement holds for sequences of local diffeomorphisms fif_{i} defined on a sequence of neighborhoods of 0 in 2\mathbb{R}^{2} or of a closed manifold.

Proof.

We begin by fixing some notation and constants that we will use throughout the argument. Let λ=.999λ\lambda^{\prime}=.999\lambda. Then from Lemma A.1 we obtain finite time Lyapunov metrics i\|\cdot\|^{\prime}_{i}, 0in0\leq i\leq n, associated to this splitting that satisfy for all ξ2\xi\in\mathbb{R}^{2}:

(A.33) 12ξiξi4e2C+2ϵ(ni)(1e2(λλ))1/2ξi.\frac{1}{\sqrt{2}}\|\xi\|_{i}\leq\|\xi\|_{i}^{\prime}\leq 4e^{2C+2\epsilon(n-i)}\left(1-e^{2(\lambda^{\prime}-\lambda)}\right)^{-1/2}\|\xi\|_{i}.

Note that because the sequence is reverse tempered n\|\cdot\|_{n}^{\prime} is uniformly comparable to the original metric independent of nn. As is standard, the metrics i\|\cdot\|^{\prime}_{i} give new linear coordinates Li:22L_{i}\colon\mathbb{R}^{2}\to\mathbb{R}^{2} that satisfy that (Li)i=i(L_{i})^{*}\|\cdot\|_{i}=\|\cdot\|_{i}^{\prime}. We let f^i=Li+1fiLi1\hat{f}_{i}=L_{i+1}\circ f_{i}\circ L_{i}^{-1}. Thus from properties of the Lyapunov metric, D0f^iD_{0}\hat{f}_{i} is a uniformly hyperbolic sequence satisfying

(A.34) D0f^i|Euie.999λ,D0f^i|Esie.999λ.D_{0}\hat{f}_{i}|_{E^{u}_{i}}\geq e^{.999\lambda},\quad D_{0}\hat{f}_{i}|_{E^{s}_{i}}\leq e^{-.999\lambda}.

We write:

(A.35) f^i(x,y)=(σ1,ix+f^i,1(x,y),σ2,iy+f^i,2(x,y)),\hat{f}_{i}(x,y)=(\sigma_{1,i}x+\hat{f}_{i,1}(x,y),\sigma_{2,i}y+\hat{f}_{i,2}(x,y)),

where $D_{0}\hat{f}_{i}=\operatorname{diag}(\sigma_{1,i},\sigma_{2,i})$ and $\sigma_{1,i},\sigma_{2,i}^{-1}\geq e^{.999\lambda}$.

We now record estimates on C2C^{2} norms in these charts. By (A.33), there is C1C_{1} such that:

(A.36) max{Li,Li1}C1e2C+2ϵ(ni).\max\{\|L_{i}\|,\|L_{i}^{-1}\|\}\leq C_{1}e^{2C+2\epsilon(n-i)}.

Thus by Lemma A.2, for 1in1\leq i\leq n,

(A.37) f^iC2D1e6Ce6(ni)ϵ.\|\hat{f}_{i}\|_{C^{2}}\leq D_{1}e^{6C}e^{6(n-i)\epsilon}.

For 0in0\leq i\leq n, let

(A.38) ri=C2min{θγ21e.9λi,e.9λ(ni)},r_{i}=C_{2}\min\{\theta\|\gamma\|_{2}^{-1}e^{.9\lambda i},e^{-.9\lambda(n-i)}\},

where 0<C2<10<C_{2}<1 is a small number that we will choose later. We then restrict to studying the segment of γ\gamma inside the cube BiB_{i} centered at 020\in\mathbb{R}^{2} of side length rir_{i} with respect to the i\|\cdot\|_{i}^{\prime} metric. Let γi\gamma_{i} be the connected component of 0 in fi(γ)Bif^{i}(\gamma)\cap B_{i}. We write γ^i\hat{\gamma}_{i} for the function giving γi\gamma_{i} as a graph over the xx-axis and let lil_{i} be the length of the projection of γ^i\hat{\gamma}_{i} to the xx-axis in 2\mathbb{R}^{2} measured with respect to i\|\cdot\|^{\prime}_{i}.

We begin working with the ambient metric. By the mean value theorem, there exists C3C_{3} such that for a C2C^{2} curve γ\gamma in 2\mathbb{R}^{2} in an arclength parametrization,

(γ˙(t),γ˙(s))C3γC2|ts|,\angle(\dot{\gamma}(t),\dot{\gamma}(s))\leq C_{3}\|\gamma\|_{C^{2}}\left|t-s\right|,

because $\dot{\gamma}(t)$ is orthogonal to $\ddot{\gamma}(t)$. In particular, since our curve $\gamma$ satisfies $\angle(\dot{\gamma}(0),E^{s}_{0})\geq\theta$, on the segment of $\gamma$ of length $C_{3}^{-1}\|\gamma\|_{C^{2}}^{-1}\theta/2$ around $\gamma(0)$ we have $\angle(E^{s}_{0},\dot{\gamma}(t))>\theta/2$. Then from Lemma A.8 in the Lyapunov chart we have that, letting $\angle^{\prime}$ denote angle with respect to the Lyapunov metric, there exists $C_{4}$ such that:

(A.39) C41θe2ϵn(Es0,γ˙(t))C4θe2ϵn.C_{4}^{-1}\theta e^{-2\epsilon n}\leq\angle^{\prime}(E^{s}_{0},\dot{\gamma}(t))\leq C_{4}\theta e^{2\epsilon n}.
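As a sanity check on the mean value bound for the angle between tangents used above, the following minimal sketch looks at the simplest curved example, a circle of radius `radius` parametrized by arclength. There the angle between unit tangents equals the curvature times the arclength distance, so the bound holds with constant 1 when the $C^{2}$ norm is replaced by the curvature. The function names are hypothetical and chosen only for this illustration.

```python
import math

def unit_tangent(radius, t):
    """Unit tangent at arclength t of the circle of the given radius centered at 0."""
    phi = t / radius
    return (-math.sin(phi), math.cos(phi))

def tangent_angle(radius, t, s):
    """Angle between the unit tangents at arclength parameters t and s."""
    (a, b), (c, d) = unit_tangent(radius, t), unit_tangent(radius, s)
    dot = max(-1.0, min(1.0, a * c + b * d))  # clamp against rounding error
    return math.acos(dot)

# Curvature is 1 / radius, so the angle should equal |t - s| / radius
# as long as that quantity is at most pi.
angle = tangent_angle(2.0, 0.3, 1.1)
```

Since the $C^{2}$ norm of an arclength parametrization dominates the curvature, this is consistent with the displayed inequality, though it does not determine the constant $C_{3}$ for general curves.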

From the construction of the Lyapunov metric, 12ii\frac{1}{\sqrt{2}}\|\cdot\|_{i}\leq\|\cdot\|^{\prime}_{i}, thus the length of γ\gamma in the Lyapunov chart is at least eR/2e^{-R}/2. We now restrict to a segment of γ^\hat{\gamma}, which we call γ^0\hat{\gamma}_{0}, with length with respect to the Lyapunov metric:

(A.40) len(γ^0)=min{C31eRθ/2,r0}.\operatorname{len}^{\prime}(\hat{\gamma}_{0})=\min\{C_{3}^{-1}e^{-R}\theta/2,r_{0}\}.

From (A.33), as the ratio of $\|\cdot\|_{n}$ to $\|\cdot\|_{n}^{\prime}$ does not depend on $n$, we see that the restriction (A.40) on the length of the initial segment $\hat{\gamma}_{0}$ gives the condition (A.29) appearing in the proposition.

Note that (A.40) implies that: (Es0,γ˙)C41θe2ϵn/2\angle^{\prime}(E^{s}_{0},\dot{\gamma})\geq C_{4}^{-1}\theta e^{-2\epsilon n}/2. So the length of the projection of γ^0\hat{\gamma}_{0} to the Eu0E^{u}_{0} axis, which we call l0l_{0}, has length (with respect to the Lyapunov metric) of at least len(γ0)sin(C41θe2ϵn/2).\displaystyle\operatorname{len}^{\prime}(\gamma_{0})\sin(C_{4}^{-1}\theta e^{-2\epsilon n}/2). Thus

(A.41) l0len(γ0)sin(C41θe2ϵn/2)C5θe2ϵnmin{eRθ,e.9λn}l_{0}\geq\operatorname{len}^{\prime}(\gamma_{0})\sin(C_{4}^{-1}\theta e^{-2\epsilon n}/2)\geq C_{5}\theta e^{-2\epsilon n}\min\{e^{-R}\theta,e^{-.9\lambda n}\}

Also by Lemma A.6

γ^01cot(C41θe2ϵn)2C4θ1e2ϵn.\|\hat{\gamma}_{0}\|_{1}\leq\cot(C_{4}^{-1}\theta e^{-2\epsilon n})\leq 2C_{4}\theta^{-1}e^{2\epsilon n}.

We apply Proposition A.12(3), and get an ϵ0<1/3\epsilon_{0}<1/\sqrt{3}, which is the cutoff for the one step C2C^{2} smoothing estimate (A.13) to hold.

In keeping with the previous proposition, denote

\epsilon_{1,i}=\|\hat{f}_{i,1}|_{B_{i}}\|_{1}\quad\text{and}\quad\epsilon_{2,i}=\|\hat{f}_{i,2}|_{B_{i}}\|_{1}.

Because $\hat{f}_{i}=D_{0}\hat{f}_{i}+(\hat{f}_{i,1},\hat{f}_{i,2})$, we see from the $C^{2}$ bound (A.37) on $\hat{f}_{i}$ that on $B_{i}$,

(A.42) \max\{\epsilon_{1,i},\epsilon_{2,i}\}\leq r_{i}\|\hat{f}_{i}\|_{2}\leq C_{2}D_{1}e^{6C}e^{-.9\lambda(n-i)}e^{6(n-i)\epsilon}=C_{2}D_{1}e^{6C}e^{-(.9\lambda-6\epsilon)(n-i)}.

We now proceed to the main part of the proof.

Step 1. We begin by checking that if we inductively define: f^iγ^i|Bi=γ^i+1\hat{f}_{i}\hat{\gamma}_{i}|_{B_{i}}=\hat{\gamma}_{i+1}, and, as before, lil_{i} is the length of the projection of γ^i\hat{\gamma}_{i} to EuiE^{u}_{i} measured with respect to i\|\cdot\|_{i}^{\prime}, then the sequence γ^i\hat{\gamma}_{i} satisfies the following estimates:

(A.43) li\displaystyle l_{i} min{ri,e.99λil0},\displaystyle\geq\min\{r_{i},e^{.99\lambda i}l_{0}\},
(A.44) γ^i1\displaystyle\|\hat{\gamma}_{i}\|_{1} max{2θ1eiλ,ϵ0}.\displaystyle\leq\max\{2\theta^{-1}e^{-i\lambda},\epsilon_{0}\}.

(1) (lil_{i}): By Proposition A.12(1)

li+1min{(e.999λϵ1,iϵ1,iγ^i1)li,ri}.l_{i+1}\geq\min\{(e^{.999\lambda}-\epsilon_{1,i}-\epsilon_{1,i}\|\hat{\gamma}_{i}\|_{1})l_{i},r_{i}\}.

Hence to verify (A.43), it suffices to show that

(e.999λϵ1,iϵ1,iγ^i1)lie.99λli.(e^{.999\lambda}-\epsilon_{1,i}-\epsilon_{1,i}\|\hat{\gamma}_{i}\|_{1})l_{i}\geq e^{.99\lambda}l_{i}.

which follows by (A.42) and the inductive hypothesis (A.44) if C2C_{2} is chosen sufficiently small.

(2) We now check the estimate on γ^i+11\|\hat{\gamma}_{i+1}\|_{1} assuming it holds for ii.

To begin, from Proposition A.12(2),

(A.45) γ^i+11(e.999λγ^i1+ϵ2,i+ϵ2,iγ^i1)(e.999λϵ1,iϵ1,iγ^i1)1.\|\hat{\gamma}_{i+1}\|_{1}\leq(e^{-.999\lambda}\|\hat{\gamma}_{i}\|_{1}+\epsilon_{2,i}+\epsilon_{2,i}\|\hat{\gamma}_{i}\|_{1})(e^{.999\lambda}-\epsilon_{1,i}-\epsilon_{1,i}\|\hat{\gamma}_{i}\|_{1})^{-1}.

There are two cases depending on whether $\|\hat{\gamma}_{i}\|_{1}\geq\epsilon_{0}$ or not. If $\|\hat{\gamma}_{i}\|_{1}\geq\epsilon_{0}$, then as long as $C_{2}$ is chosen sufficiently small, the second parenthetical term in (A.45) is at most $e^{-.9\lambda}$ by (A.42). Hence

γ^i+11e.9λ(e.999λγ^i1+ϵ2,i+ϵ2,iγ^i1)e.9λγ^i1(e.999λ+ϵ2,iϵ01+ϵ2,i).\|\hat{\gamma}_{i+1}\|_{1}\leq e^{-.9\lambda}(e^{-.999\lambda}\|\hat{\gamma}_{i}\|_{1}+\epsilon_{2,i}+\epsilon_{2,i}\|\hat{\gamma}_{i}\|_{1})\leq e^{-.9\lambda}\|\hat{\gamma}_{i}\|_{1}(e^{-.999\lambda}+\epsilon_{2,i}\epsilon_{0}^{-1}+\epsilon_{2,i}).

Because ϵ0>0\epsilon_{0}>0 is independent of C2C_{2}, if C2C_{2} is sufficiently small then (A.42) gives γ^i+11e3λ/2γ^i1,\|\hat{\gamma}_{i+1}\|_{1}\leq e^{-3\lambda/2}\|\hat{\gamma}_{i}\|_{1}, which concludes the proof since γ^i12θ1eiλ\|\hat{\gamma}_{i}\|_{1}\leq 2\theta^{-1}e^{-i\lambda}.

We now consider the case γ^i1ϵ0\|\hat{\gamma}_{i}\|_{1}\leq\epsilon_{0}. In this case it suffices to show that γ^i+11ϵ0\|\hat{\gamma}_{i+1}\|_{1}\leq\epsilon_{0}. The argument in this case is similar and follows because, as in the previous case, we may ensure that ϵ1,i,ϵ2,i\epsilon_{1,i},\epsilon_{2,i} are small relative to ϵ0\epsilon_{0} through our initial choice of C2C_{2}.

Thus we have shown that both estimates hold inductively proving (A.43) and (A.44).
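The induction in Step 1 can be illustrated numerically. The sketch below iterates the right-hand side of (A.45) with equality, under the hypothetical normalization $D_{1}e^{6C}=1$ and $\|\gamma\|_{2}=1$, so that $\epsilon_{1,i}=\epsilon_{2,i}=r_{i}e^{6(n-i)\epsilon}$ with $r_{i}$ as in (A.38). The parameter values are arbitrary choices for illustration, not values from the text.

```python
import math

def slope_step(s, e1, e2, kappa):
    """One application of the right-hand side of (A.45), taken with equality."""
    return (math.exp(-kappa) * s + e2 + e2 * s) / (math.exp(kappa) - e1 - e1 * s)

def slope_orbit(s0, n, lam, eps, theta, c2):
    """Iterate the slope bound for i = 0..n-1 and return [s_0, ..., s_n].

    Hypothetical normalization: D_1 e^{6C} = 1 and ||gamma||_2 = 1, so that
    eps_{1,i} = eps_{2,i} = r_i * e^{6 (n - i) eps} with r_i as in (A.38).
    """
    kappa = 0.999 * lam
    slopes = [s0]
    for i in range(n):
        r_i = c2 * min(theta * math.exp(0.9 * lam * i),
                       math.exp(-0.9 * lam * (n - i)))
        e = r_i * math.exp(6 * (n - i) * eps)
        slopes.append(slope_step(slopes[-1], e, e, kappa))
    return slopes

# Start from slope 2 / theta, the initial value allowed by (A.44).
slopes = slope_orbit(s0=200.0, n=20, lam=1.0, eps=0.025, theta=0.01, c2=0.01)
```

With these sample parameters and the cutoff $\epsilon_{0}=0.5$, each step contracts the slope by at least $e^{-3\lambda/2}$ while the slope exceeds $\epsilon_{0}$, and the slope stays below $\epsilon_{0}$ afterwards, which matches the two cases of the argument.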

We now conclude item (1). Since the Lyapunov metric $\|\cdot\|_{n}^{\prime}$ is uniformly comparable to the ambient metric $\|\cdot\|_{n}$ due to (A.33), it is enough to prove the lower bound on $\operatorname{len}^{\prime}(\gamma_{n})$. By (A.43), the length of $\gamma_{n}$ is at least $\min\{r_{n},e^{.9\lambda n}l_{0}\}$. Note that

e.9λnl0C5eRθ2e2ϵn.e^{.9\lambda n}l_{0}\geq C_{5}e^{-R}\theta^{2}e^{-2\epsilon n}.

Hence if the minimum of $\min\{r_{n},e^{.9\lambda n}l_{0}\}$ is realized by $r_{n}$, then $r_{n}=C_{2}$ because the first term in the definition of $r_{n}$ (see (A.38)) is bigger than $e^{.9\lambda n}l_{0}$. This shows that $\operatorname{len}^{\prime}(\gamma_{n})\geq\min\{l_{0}e^{.9\lambda n},C_{2}\}$, completing the proof of part (1).

We now check the claim about the length of the preimage of f^nγ^0\hat{f}^{n}\hat{\gamma}_{0} in part (2). This is immediate from our choice of len(γ^0)\operatorname{len}^{\prime}(\hat{\gamma}_{0}) in (A.40). Because the preimage of γn\gamma_{n} is contained in a segment of length at most e.9λne^{-.9\lambda n} with respect to the Lyapunov metric, and because 020\|\cdot\|_{0}\leq\sqrt{2}\|\cdot\|_{0}^{\prime}, this implies that the length of the initial segment we consider with respect to the ambient metric is at most 2e.9λn\sqrt{2}e^{-.9\lambda n}. Similar considerations give the claim about the length of the preimage of γn\gamma_{n} in fi(γ)f^{i}(\gamma) at the end of item (2). Note that the final curve γn\gamma_{n} promised by the lemma is not unique: for instance, it need not be centered at fn(x)f^{n}(x). The final claim in item (2) follows because any such curve is long enough that it fills the entire segment of fi(γ)f^{i}(\gamma) we are considering by our choice of rir_{i}.

To finish the proof of item (2), we must see how large $n$ must be in order to ensure that $r_{n}=\ell_{\max}$. For this to occur $n$ must satisfy $e^{.99\lambda n}l_{0}\geq\ell_{\max}$. That is, $n\geq\frac{\ln(\ell_{\max})-\ln(l_{0})}{.99\lambda}$. Now the definition of $l_{0}$ (see (A.41)) gives

nln(max)ln(C5θe2ϵnmin{eRθ,e.9λn}).99λ.n\geq\frac{\ln(\ell_{\max})-\ln(C_{5}\theta e^{-2\epsilon n}\min\{e^{-R}\theta,e^{-.9\lambda n}\})}{.99\lambda}.

Now the needed conclusion in item (2) follows by considering the two cases depending on which term realizes the minimum and using that ϵ<λ/30\epsilon<\lambda/30.
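The condition on $n$ above is implicit, since $l_{0}$ itself depends on $n$. As an illustration only, the following sketch evaluates the displayed inequality $e^{.99\lambda n}l_{0}\geq\ell_{\max}$ in logarithmic form, using the lower bound for $l_{0}$ from (A.41), and scans for the first $n$ at which it holds. All numerical values, and the names `log_margin` and `first_saturation_time`, are hypothetical.

```python
import math

def log_margin(n, lam, eps, theta, R, c5, l_max):
    """ln(e^{.99 lam n} * l0(n)) - ln(l_max), with l0(n) the lower bound in (A.41)."""
    log_l0 = (math.log(c5) + math.log(theta) - 2 * eps * n
              + min(-R + math.log(theta), -0.9 * lam * n))
    return 0.99 * lam * n + log_l0 - math.log(l_max)

def first_saturation_time(lam, eps, theta, R, c5, l_max, n_cap=10**6):
    """Smallest n >= 0 with e^{.99 lam n} * l0(n) >= l_max, found by a linear scan."""
    for n in range(n_cap):
        if log_margin(n, lam, eps, theta, R, c5, l_max) >= 0:
            return n
    raise ValueError("no saturation time below n_cap")

# Sample parameters with eps < lam / 30.
params = dict(lam=1.0, eps=0.02, theta=0.1, R=2.0, c5=0.5, l_max=0.1)
n_star = first_saturation_time(**params)
```

The margin is increasing in $n$ in both regimes of the minimum because $\epsilon<\lambda/30$ gives $.99\lambda-2\epsilon>0$ and $.09\lambda-2\epsilon>0$, so once the inequality holds it continues to hold for all larger $n$.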

Step 2. We now obtain item (3), the C2C^{2} estimate on γ^i\hat{\gamma}_{i}. Should it happen that there is an index ii such that γ^i1ϵ0\|\hat{\gamma}_{i}\|_{1}\leq\epsilon_{0}, we call this index N0N_{0}. We proceed under the assumption that there is some such N0N_{0}. After concluding in this case, we explain how the same estimate holds otherwise. Observe that if γ^i1ϵ0\|\hat{\gamma}_{i}\|_{1}\leq\epsilon_{0}, then for all jij\geq i, γ^j1ϵ0\|\hat{\gamma}_{j}\|_{1}\leq\epsilon_{0} as well. Keeping in mind the strength of hyperbolicity from (A.34), for all indices iN0i\geq N_{0}, we have from (A.13), that

(A.46) γ^i+12e1.99λ.999f^i2+e2.99λ.999γ^i2.\|\hat{\gamma}_{i+1}\|_{2}\leq e^{-1.99\lambda\cdot.999}\|\hat{f}_{i}\|_{2}+e^{-2.99\lambda\cdot.999}\|\hat{\gamma}_{i}\|_{2}.

By applying the above equation iteratively, we can obtain an estimate on γ^n2\|\hat{\gamma}_{n}\|_{2} in terms of γ^N02\|\hat{\gamma}_{N_{0}}\|_{2}. This gives the required estimate because the homogeneous part of (A.46) has multipliers smaller than 1.

By (A.37), f^i2D1e6Ce6(ni)ϵ\|\hat{f}_{i}\|_{2}\leq D_{1}e^{6C}e^{6(n-i)\epsilon}. Let M=nN0M=n-N_{0}. Applying iteratively (A.46), we get

(A.47) \|\hat{\gamma}_{n}\|_{2}\leq\|\hat{\gamma}_{N_{0}}\|_{2}e^{-2.99\lambda\cdot.999M}+\sum_{i=1}^{M}D_{1}e^{6C}e^{6(n-N_{0}-i)\epsilon}e^{-1.99\lambda\cdot.999}e^{-2.99\lambda\cdot.999(M-i-1)}.

Note that the second term is bounded by a constant C6C_{6} depending only on C,λC,\lambda and ϵ\epsilon.
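The boundedness of the inhomogeneous term can be checked numerically. The sketch below iterates (A.46) with equality and compares the result with a geometric series bound. It assumes a forcing term of the form $D_{1}e^{6C}e^{6(M-i)\epsilon}$ at the $i$th step after $N_{0}$, which is one reading of how (A.37) enters the recursion, and the function names are hypothetical.

```python
import math

def iterate_c2_recursion(a0, M, lam, eps, d1=1.0, c=0.0):
    """Iterate a_{i+1} = p * F_i + q * a_i for i = 0..M-1, as in (A.46) with equality.

    Assumed forcing: F_i = d1 * e^{6c} * e^{6 (M - i) eps}, i.e. the C^2 bound
    on the chart map is weaker the further one is from the final time.
    """
    p = math.exp(-1.99 * 0.999 * lam)
    q = math.exp(-2.99 * 0.999 * lam)
    a = a0
    for i in range(M):
        a = p * d1 * math.exp(6 * c) * math.exp(6 * (M - i) * eps) + q * a
    return a

def geometric_bound(lam, eps, d1=1.0, c=0.0):
    """Bound on the inhomogeneous part, valid when q * e^{6 eps} < 1."""
    p = math.exp(-1.99 * 0.999 * lam)
    q = math.exp(-2.99 * 0.999 * lam)
    ratio = q * math.exp(6 * eps)
    assert ratio < 1
    return p * d1 * math.exp(6 * c) * math.exp(6 * eps) / (1 - ratio)
```

For example, `iterate_c2_recursion(1e6, 1000, 1.0, 1/31)` stays below `geometric_bound(1.0, 1/31)` up to rounding: the contribution of the initial value decays like $e^{-2.99\cdot.999\lambda M}$, and the remaining sum is geometric because $6\epsilon<2.99\cdot.999\lambda$.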

To conclude, we also need a bound for γ^N02\|\hat{\gamma}_{N_{0}}\|_{2}. By Lemma A.9, there exists Λ\Lambda depending only on the C2C^{2} norm of the maps fif_{i}, which is uniformly bounded by D1D_{1}, such that

(A.48) fiγC2eΛimax{γC2,1}.\|f^{i}\gamma\|_{C^{2}}\leq e^{\Lambda i}\max\{\|\gamma\|_{C^{2}},1\}.

Hence $\|\gamma_{N_{0}}\|_{C^{2}}\leq e^{\Lambda N_{0}}\max\{\|\gamma\|_{C^{2}},1\}$. We then need an estimate on $\hat{\gamma}_{N_{0}}$. Note that in the Lyapunov coordinates $\hat{\gamma}_{N_{0}}$, as a graph over $E^{u}_{0}$, has slope at most $\epsilon_{0}<1/\sqrt{3}$. Thus by Lemma A.6,

γ^N02sin(arccot(ϵ0))3eN0Λmax{γC2,1}2eΛN0max{γC2,1},\|\hat{\gamma}_{N_{0}}\|_{2}\leq\sin(\text{arccot}(\epsilon_{0}))^{-3}e^{N_{0}\Lambda}\max\{\|\gamma\|_{C^{2}},1\}\leq 2e^{\Lambda N_{0}}\max\{\|\gamma\|_{C^{2}},1\},

because ϵ0<1/3=tan(π/6)=cot(π/3)\epsilon_{0}<1/\sqrt{3}=\tan(\pi/6)=\cot(\pi/3). Combining this with (A.47),

(A.49) γ^n22eΛN0e2.99λ.999Mmax{γC2,1}+C6.\|\hat{\gamma}_{n}\|_{2}\leq 2e^{\Lambda N_{0}}e^{-2.99\lambda\cdot.999M}\max\{\|\gamma\|_{C^{2}},1\}+C_{6}.
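The factor $2$ appearing in the bound on $\|\hat{\gamma}_{N_{0}}\|_{2}$ comes from $\sin(\operatorname{arccot}(\epsilon_{0}))^{-3}$ with $\epsilon_{0}<1/\sqrt{3}$. A quick numerical check of this elementary inequality, with a hypothetical helper name, is:

```python
import math

def graph_to_curve_factor(eps0):
    """Return sin(arccot(eps0))**(-3) for a slope bound eps0 > 0."""
    angle = math.atan2(1.0, eps0)  # arccot(eps0), in (0, pi/2) for eps0 > 0
    return math.sin(angle) ** -3

# At the endpoint eps0 = 1/sqrt(3) the angle is pi/3 and sin(pi/3) = sqrt(3)/2.
endpoint = graph_to_curve_factor(1.0 / math.sqrt(3.0))
```

Since $\sin(\operatorname{arccot}(\epsilon_{0}))=(1+\epsilon_{0}^{2})^{-1/2}$, the factor is increasing in $\epsilon_{0}$, and at the endpoint it equals $(2/\sqrt{3})^{3}=8/(3\sqrt{3})\approx 1.54<2$.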

But we also have a straightforward estimate for the cutoff N0N_{0}. From equation (A.44), we know that N0(ln(2)ln(θ))/λN_{0}\leq(\ln(2)-\ln(\theta))/\lambda. Hence because N0ln(θ)N_{0}\approx-\ln(\theta), it is straightforward to see that there exist C7,C8C_{7},C_{8} such that

(A.50) γ^n2<C6+C7e2.9λneC8lnθmax{γC2,1}.\|\hat{\gamma}_{n}\|_{2}<C_{6}+C_{7}e^{-2.9\lambda n}e^{C_{8}\ln\theta}\max\{\|\gamma\|_{C^{2}},1\}.

In the case that there is no index ii such that γ^i1ϵ0\|\hat{\gamma}_{i}\|_{1}\leq\epsilon_{0}, we may conclude similarly as equation (A.44) implies that n(ln(2)ln(θ))/λn\leq(\ln(2)-\ln(\theta))/\lambda. Thus we have finished with Step 2 and conclude item (3).

Before going to Step 3, we record an additional more precise estimate on the rate that γ^i2\|\hat{\gamma}_{i}\|_{2} improves. Similar to above, we find:

γ^N0+i2\displaystyle\|\hat{\gamma}_{N_{0}+i}\|_{2} γ^N02e2.99λ.999i+j=1iD1e6Ce6(nN0j)ϵe1.99λ.999e2.99λ.999(ij1),\displaystyle\leq\|\hat{\gamma}_{N_{0}}\|_{2}e^{-2.99\lambda\cdot.999i}+\sum_{j=1}^{i}D_{1}e^{6C}e^{6(n-N_{0}-j)\epsilon}e^{-1.99\lambda\cdot.999}e^{-2.99\lambda\cdot.999(i-j-1)},
γ^N02e2.99λ.999i+e6(nN0)ϵe1.99λ.999e6iϵD1e6Ck=1ie(2.99λ.9996ϵ)(k1),\displaystyle\leq\|\hat{\gamma}_{N_{0}}\|_{2}e^{-2.99\lambda\cdot.999i}+e^{6(n-N_{0})\epsilon}e^{-1.99\lambda\cdot.999}e^{-6i\epsilon}D_{1}e^{6C}\sum_{k=1}^{i}e^{-(2.99\lambda\cdot.999-6\epsilon)(k-1)},
(A.51) 2eΛN0e2.99λ.999imax{γC2,1}+C9e6ϵ(nN0i),\displaystyle\leq 2e^{\Lambda N_{0}}e^{-2.99\lambda\cdot.999i}\max\{\|\gamma\|_{C^{2}},1\}+C_{9}e^{6\epsilon(n-N_{0}-i)},

for some C9>0C_{9}>0.

Step 3. We now show item (4), i.e. we obtain estimates for smoothing a density along γ\gamma. We let ρ^i\hat{\rho}_{i} be the function giving the density ρ\rho on γ^i\hat{\gamma}_{i} in the Lyapunov coordinates.

We now apply the smoothing estimate. As in Step 2, supposing it exists, let N0N_{0} be the first index such that γ^i1ϵ0\|\hat{\gamma}_{i}\|_{1}\leq\epsilon_{0}. If such an index N0N_{0} does not exist, then we may conclude similarly to in Step 2. Then for any iN0i\geq N_{0}, by (A.14),

(A.52) \|\ln\hat{\rho}_{i+1}\|_{C^{\alpha}}\leq e^{-.9\cdot.999\lambda\alpha}(\|\ln\hat{\rho}_{i}\|_{C^{\alpha}}+\|\hat{f}_{i}\|_{2}+\|\hat{\gamma}_{i}\|_{2}).

As before, let $M=n-N_{0}$. By bookkeeping similar to Step 2, we find that

\|\ln\hat{\rho}_{n}\|_{C^{\alpha}}\leq e^{-.9\cdot.999\lambda\alpha M}\|\ln\hat{\rho}_{N_{0}}\|_{C^{\alpha}}+\sum_{i=1}^{M}e^{-.9\cdot.999\lambda\alpha(M-i)}(\|\hat{f}_{N_{0}+i}\|_{C^{2}}+\|\hat{\gamma}_{N_{0}+i}\|_{C^{2}}).

By (A.51) and (A.37), we see that there exists C11C_{11} such that

(A.53) lnρ^nCα\displaystyle\|\ln\hat{\rho}_{n}\|_{C^{\alpha}} e.9.999λαMlnρ^N0Cα+2eΛN0γC2e.9λ.999M+C11.\displaystyle\leq e^{-.9\cdot.999\lambda\alpha M}\|\ln\hat{\rho}_{N_{0}}\|_{C^{\alpha}}+2e^{\Lambda N_{0}}\|\gamma\|_{C^{2}}e^{-.9\lambda\cdot.999M}+C_{11}.

We now estimate lnρ^N0\|\ln\hat{\rho}_{N_{0}}\|. We first obtain an estimate without the use of the Lyapunov charts. By Lemma A.9 because of the uniform C2C^{2} bound D1D_{1}, there exist C12,Λ>0C_{12},\Lambda>0 such that

ln(fN0)ρCαC12(eΛN0+eΛN0lnρCα).\|\ln(f^{N_{0}})_{*}\rho\|_{C^{\alpha}}\leq C_{12}(e^{\Lambda N_{0}}+e^{\Lambda N_{0}}\|\ln\rho\|_{C^{\alpha}}).

Next, we push forward $\gamma_{N_{0}}$ and $\rho_{N_{0}}$ by $L_{N_{0}}$ to obtain a density in the Lyapunov coordinates. Because $\max\{\|L_{N_{0}}\|,\|L_{N_{0}}^{-1}\|\}\leq C_{1}e^{2C}e^{2\epsilon n}$, Lemma A.7 gives that there exists $C_{13}$ such that

\|\ln(L_{N_{0}})_{*}(f^{N_{0}}_{1})_{*}\rho\|_{C^{\alpha}}\leq C_{13}e^{(2+2\alpha)\epsilon n}\left(e^{\Lambda N_{0}}+e^{\Lambda N_{0}}\|\ln\rho\|_{C^{\alpha}}+e^{2\epsilon n}(1+\|\gamma_{N_{0}}\|_{C^{2}})\right).

For the application we are then interested in the regularity of (LfN01)ρ(Lf^{N_{0}}_{1})_{*}\rho as a function parametrized by Eu0E^{u}_{0}. As at time N0N_{0}, γN0\gamma_{N_{0}} is uniformly transverse to Es0E^{s}_{0}, this projection has uniformly bounded norm. From before, we have the C2C^{2} bound on γN0\gamma_{N_{0}} following (A.48), which gives that there exists C14C_{14} such that:

(A.54) lnρ^N0CαC14e7ϵneΛN0(1+lnρCα+γC2).\|\ln\hat{\rho}_{N_{0}}\|_{C^{\alpha}}\leq C_{14}e^{7\epsilon n}e^{\Lambda N_{0}}(1+\|\ln\rho\|_{C^{\alpha}}+\|\gamma\|_{C^{2}}).

Combining this with (A.53), we find

lnρ^ne.9.999λαM(C14e7ϵneΛN0(1+lnρCα+γC2))+2eΛN0γC2e.9λ.999M+C11.\|\ln\hat{\rho}_{n}\|\leq e^{-.9\cdot.999\lambda\alpha M}(C_{14}e^{7\epsilon n}e^{\Lambda N_{0}}(1+\|\ln\rho\|_{C^{\alpha}}+\|\gamma\|_{C^{2}}))+2e^{\Lambda N_{0}}\|\gamma\|_{C^{2}}e^{-.9\lambda\cdot.999M}+C_{11}.

Then as before, because $N_{0}$ is of order $-\ln(\theta)$ and $M=n-N_{0}$,

(A.55) \|\ln\hat{\rho}_{n}\|_{C^{\alpha}}\leq C_{15}e^{-.9\lambda\alpha n}e^{C_{16}\ln(\theta)}(1+\|\ln\rho\|_{C^{\alpha}}+\|\gamma\|_{C^{2}})+C_{11}.

As $\|\gamma_{n}\|_{C^{2}}<D_{9}$ for some fixed $D_{9}$ by assumption, (A.55) gives the corresponding estimate on $\rho_{n}$ with respect to the arclength parametrization of $\gamma_{n}$, and we conclude item (4). ∎

A.6. Loss of regularity

In this subsection, we prove some additional estimates that will be used later in the proof of mixing but not the proof of the coupling lemma. These estimates say that for all but an exponentially small amount of the curve γ\gamma, typically the images of points in fnω(γ)f^{n}_{\omega}(\gamma) are in a neighborhood that is at least nϵn\epsilon-good. First we introduce in Definition A.14, a notion of a forward tempered point relative to a curve. Then, in Proposition A.15, we show that the image of a curve at a forward tempered time will be 18ϵn18\epsilon n good.

We begin by stating the main definition of this section. Note that it is similar to definitions we also considered for backwards good points (Definition 8.1).

Definition A.14.

For a standard pair γ^=(γ,ρ)\hat{\gamma}=(\gamma,\rho) and a word ωΣ\omega\in\Sigma, we say that nn is a (C,λ,ϵ,θ)(C,\lambda,\epsilon,\theta)-forward tempered time for xγx\in\gamma if the sequence of maps (Dxfiω)1in(D_{x}f^{i}_{\omega})_{1\leq i\leq n} is (C,λ,ϵ(C,\lambda,\epsilon)-subtempered and the most contracted direction of DxfnωD_{x}f^{n}_{\omega} exists and is at least θ\theta-transverse to γ\gamma. Similarly, we speak of a trajectory being forward tempered relative to a vector vTxMv\in T_{x}M.

The following proposition gives a quantitative estimate on the length of the image of a curve experiencing a forward tempered time.

Proposition A.15.

Suppose that MM is a closed surface and that (f1,,fm)(f_{1},\ldots,f_{m}) is a tuple in Diff2vol(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M). Then for any λ>0\lambda>0 and C1>0C_{1}>0 there exist D0,D1>0D_{0},D_{1}>0 and NN\in\mathbb{N}, such that for all θ>0\theta>0 and λ/30>ϵ>0\lambda/30>\epsilon>0, if γ^=(γ,ρ)\hat{\gamma}=(\gamma,\rho) is a C1C_{1}-good standard pair, ωΣ\omega\in\Sigma and xγx\in\gamma has a (C,λ,ϵ,θ)(C,\lambda,\epsilon,\theta) forward tempered time at time

(A.56) n\geq N-D_{0}\ln\theta

then

(1) The pushforward fnω(γ)f^{n}_{\omega}(\gamma) contains a neighborhood of fnω(x)f^{n}_{\omega}(x), B(x)B(x), such that denoting by B^(x)\hat{B}(x) the restriction of the standard pair fnω(γ^)f^{n}_{\omega}(\hat{\gamma}) to B(x)B(x), then B^(x)\hat{B}(x) is an (18ϵn+18max{C,0}+D1)(18\epsilon n+18\max\{C,0\}+D_{1})-good standard pair.

(2) The preimage of B^(x)\hat{B}(x), (fnω)1(B^(x))(f^{n}_{\omega})^{-1}(\hat{B}(x)), has length at most e(λ/2)ne^{-(\lambda/2)n}.

Proof.

As before, we will use the deterministic smoothing lemmas. We begin by first picking a choice of Lyapunov metrics to use. Applying Lemma A.1 with λ=.999λ\lambda^{\prime}=.999\lambda we get, since the trajectory is forward tempered, that

(A.57) 12ξiξi4e2C+2ϵi(1e2(λλ))1/2ξi.\frac{1}{\sqrt{2}}\|\xi\|_{i}\leq\|\xi\|_{i}^{\prime}\leq 4e^{2C+2\epsilon i}\left(1-e^{2(\lambda^{\prime}-\lambda)}\right)^{-1/2}\|\xi\|_{i}.

As in the proof of Proposition A.13, using the Lyapunov metric, we obtain new dynamics f^i\hat{f}_{i} in the Lyapunov coordinates, which are given by composing with a sequence of maps LiL_{i}. Crucially, these dynamics satisfy that D0f^i|Esie.999λ\displaystyle D_{0}\hat{f}_{i}|_{E^{s}_{i}}\leq e^{-.999\lambda} and D0f^i|Euie.999λ.\displaystyle D_{0}\hat{f}_{i}|_{E^{u}_{i}}\geq e^{.999\lambda}. Moreover, we can write:

f^i(x,y)=(σ1,ix+f^i,1(x,y),σ2,iy+f^i,2(x,y)).\hat{f}_{i}(x,y)=(\sigma_{1,i}x+\hat{f}_{i,1}(x,y),\sigma_{2,i}y+\hat{f}_{i,2}(x,y)).

Further there exists C2C_{2} such that

(A.58) \max\{\|L_{i}\|,\|L_{i}^{-1}\|\}\leq C_{2}e^{2C+2\epsilon i}.

Proceeding as in (A.37), there exists DD such that:

(A.59) f^iC2De6C+6iϵ.\|\hat{f}_{i}\|_{C^{2}}\leq De^{6C+6i\epsilon}.

From here, we set up the constants in a manner similar to before. Things are slightly simpler because by assumption the standard pair is C1C_{1}-good and hence uniformly long and good. We will take some small C3>0C_{3}>0 that we will choose later. Set

ri=C3e10ϵn6Cmin{θ1,e.9λ(ni)}.r_{i}=C_{3}e^{-10\epsilon n-6C}\min\{\theta^{-1},e^{-.9\lambda(n-i)}\}.

As before, we let BiB_{i} be the square of side length rir_{i} centered at 0 with respect to the i\|\cdot\|^{\prime}_{i} metric. As in the previous argument, we let γ^i\hat{\gamma}_{i} denote the portion of fi1(γ^)f^{i-1}(\hat{\gamma}) lying in BiB_{i} and we let ρ^n\hat{\rho}_{n} denote the density along γ^n\hat{\gamma}_{n}. Let ϵ0>0\epsilon_{0}>0 be the cutoff so that (A.13) holds in Proposition A.12

As above, we denote $\epsilon_{1,i}=\|\hat{f}_{i,1}|_{B_{i}}\|_{1}$ and $\epsilon_{2,i}=\|\hat{f}_{i,2}|_{B_{i}}\|_{1}$. Because $\hat{f}_{i}=D_{0}\hat{f}_{i}+(\hat{f}_{i,1},\hat{f}_{i,2})$, we see from the $C^{2}$ bound (A.59) on $\hat{f}_{i}$ that on $B_{i}$,

(A.60) \max\{\epsilon_{1,i},\epsilon_{2,i}\}\leq r_{i}\|\hat{f}_{i}\|_{2}\leq C_{3}e^{-10\epsilon n-6C}e^{-.9\lambda(n-i)}De^{6C}e^{6i\epsilon}\leq C_{3}De^{-.9\lambda(n-i)}.

In particular, note that by choosing $C_{3}$ sufficiently small in a manner that only depends on $\lambda$ and $D$, we may ensure that $\max\{\epsilon_{1,i},\epsilon_{2,i}\}<\epsilon_{0}$ for all $i$.

We then carry out an inductive argument to determine the regularity of γ^n\hat{\gamma}_{n}. In order, we obtain estimates on the length, the C2C^{2} norm, and then lnfn(ρ)Cα\|\ln f^{n}(\rho)\|_{C^{\alpha}}.

Step 1. (Length of the curve) As in the proof of Proposition A.13, we see that from the choice of constants γ^n\hat{\gamma}_{n} is uniformly transverse to EsnE^{s}_{n} and the projection of its graph to the EunE^{u}_{n} axis fills EunBnE^{u}_{n}\cap B_{n}. Thus there exists C4C_{4}, depending only on C3C_{3} such that γ^n\hat{\gamma}_{n} has length at least C4eϵnC_{4}e^{-\epsilon n} in the Lyapunov charts. By equation (A.58), this implies that, in the ambient metric, fnω(x)f^{n}_{\omega}(x) lies in a neighborhood of length at least

(A.61) C21C4e2C3ϵn.C_{2}^{-1}C_{4}e^{-2C-3\epsilon n}.

Step 2. (C2C^{2} norm of the curve) We now turn to an estimate on the C2C^{2} norm of γ^n\hat{\gamma}_{n}. This is perhaps the most complicated part of the argument along with the estimate on smoothing the density. We apply the estimate (A.13) from Proposition A.12. Let N0N_{0} be the first iterate such that γ^N01<ϵ0\|\hat{\gamma}_{N_{0}}\|_{1}<\epsilon_{0}. From our choice of the size of the neighborhood and the comment on the size of C3C_{3} immediately after (A.60), we have that for all iN0i\geq N_{0}, the estimate (A.13) holds, i.e. the C2C^{2} smoothing estimate is valid. Thus we find that:

(A.62) \|\hat{\gamma}_{i+1}\|_{2}\leq e^{-1.99\lambda}\|\hat{f}_{i}\|_{2}+e^{-2.99\lambda}\|\hat{\gamma}_{i}\|_{2}.

From (A.59), it follows inductively that:

(A.63) γ^n2e2.99λ(nN0)γ^N02+De6Ce6ϵnj=0nN01e1.99λje6ϵj.\|\hat{\gamma}_{n}\|_{2}\leq e^{-2.99\lambda(n-N_{0})}\|\hat{\gamma}_{N_{0}}\|_{2}+De^{6C}e^{6\epsilon n}\sum_{j=0}^{n-N_{0}-1}e^{-1.99\lambda j}e^{6\epsilon j}.

We then need to estimate N0N_{0}. As in the proof of Proposition A.13, we get N0=Oλ(ln(θ))N_{0}\!\!=\!\!O_{\lambda}(-\ln(\theta)). Thus there exists C4,C5>0C_{4},C_{5}>0, such that

γ^n2e2.99nλeC4lnθγ^N02+C5e6Ce6ϵn.\|\hat{\gamma}_{n}\|_{2}\leq e^{-2.99n\lambda}e^{-C_{4}\ln\theta}\|\hat{\gamma}_{N_{0}}\|_{2}+C_{5}e^{6C}e^{6\epsilon n}.

As in the proof of Proposition A.13, after (A.48), we see that there exists Λ>0\Lambda>0 such that fiγC2\|f^{i}\gamma\|_{C^{2}}, with respect to the ambient metric is at most eΛie^{\Lambda i}. Using (A.56) and the fact that the angle between γ\gamma and EsN0E^{s}_{N_{0}} is uniformly large, we see that there exists C6C_{6} such that with respect to the Lyapunov metric,

γ^N02eC6lnθ.\|\hat{\gamma}_{N_{0}}\|_{2}\leq e^{-C_{6}\ln\theta}.

Thus for some C7C_{7},

(A.64) γ^n2e2.99nλeC7lnθ+C5e6Ce6ϵn.\|\hat{\gamma}_{n}\|_{2}\leq e^{-2.99n\lambda}e^{-C_{7}\ln\theta}+C_{5}e^{6C}e^{6\epsilon n}.

We now record an intermediate estimate that will be useful later. By possibly increasing the constants, for each N0inN_{0}\leq i\leq n, we find:

(A.65) γ^i2e2.99iλeC7lnθ+C5e6Ce6ϵi.\|\hat{\gamma}_{i}\|_{2}\leq e^{-2.99i\lambda}e^{-C_{7}\ln\theta}+C_{5}e^{6C}e^{6\epsilon i}.

Equation (A.64) is an estimate in the Lyapunov chart, but we need the estimate with respect to the original metric. The C2C^{2} norm of γ^n\hat{\gamma}_{n} as a curve is uniformly comparable to γ^n2\|\hat{\gamma}_{n}\|_{2} because γ^n\hat{\gamma}_{n} is uniformly transverse to EsnE^{s}_{n}. By Lemma A.3 there exists C7C_{7}, such that, letting γn\gamma_{n} be the segment of γ\gamma lying in BnB_{n}, we get the following bound in the ambient metric

γnC2(e2.99nλeC7lnθ+C5e6Ce6ϵn)C23e6C+6ϵn.\|\gamma_{n}\|_{C^{2}}\leq(e^{-2.99n\lambda}e^{-C_{7}\ln\theta}+C_{5}e^{6C}e^{6\epsilon n})C_{2}^{3}e^{6C+6\epsilon n}.

This is the bound required by the proposition. Indeed for D0D_{0} sufficiently large we have:

(A.66) γnC2C8e12max{C,0}+12ϵn.\|\gamma_{n}\|_{C^{2}}\leq C_{8}e^{12\max\{C,0\}+12\epsilon n}.

Step 3. (Regularity of the density) Finally, we turn to estimating the Hölder norm of the pushed density. At the same iterate N0N_{0} from Step 2, we have that ϵN0,1,ϵN0,2,γ^N01ϵ0\epsilon_{N_{0},1},\epsilon_{N_{0},2},\|\hat{\gamma}_{N_{0}}\|_{1}\leq\epsilon_{0} and that these estimates hold for all future iterates. Consequently, estimate (A.14) applies, hence for N0in1N_{0}\leq i\leq n-1,

lnρ~i+1Cαe.9αλ(lnρ~iCα+f^i2+γ^i2).\|\ln\widetilde{\rho}_{i+1}\|_{C^{\alpha}}\leq e^{-.9\alpha\lambda}(\|\ln\widetilde{\rho}_{i}\|_{C^{\alpha}}+\|\hat{f}_{i}\|_{2}+\|\hat{\gamma}_{i}\|_{2}).

This leads inductively to the estimate that

(A.67) lnρ~nCαe.9αλ(nN0)lnρ~N0Cα+i=N0n1e.9λα(ni)(f^i2+γ^i2).\|\ln\widetilde{\rho}_{n}\|_{C^{\alpha}}\leq e^{-.9\alpha\lambda(n-N_{0})}\|\ln\widetilde{\rho}_{N_{0}}\|_{C^{\alpha}}+\sum_{i=N_{0}}^{n-1}e^{-.9\lambda\alpha(n-i)}(\|\hat{f}_{i}\|_{2}+\|\hat{\gamma}_{i}\|_{2}).

We then need some further estimates in order to simplify this.

We start with an estimate on lnρ~N0Cα\|\ln\widetilde{\rho}_{N_{0}}\|_{C^{\alpha}}. A similar argument to that giving (A.54) yields that lnρ~N0CαeΛN0\|\ln\widetilde{\rho}_{N_{0}}\|_{C^{\alpha}}\leq e^{\Lambda N_{0}}, where Λ>0\Lambda>0 only depends on the C2C^{2} norm of the diffeomorphisms and the initial regularity of γ\gamma. Hence as long as D0D_{0} is large enough, it follows that the first term is uniformly bounded.

For the other terms, we already have estimates for f^i2\|\hat{f}_{i}\|_{2} and γ^i2\|\hat{\gamma}_{i}\|_{2}, (A.59) and (A.65). These yield a bound on the sum in (A.67):

i=N0n1e.9λα(ni)(f^i2+γ^i2)\sum_{i=N_{0}}^{n-1}\!\!e^{-.9\lambda\alpha(n-i)}(\|\hat{f}_{i}\|_{2}+\|\hat{\gamma}_{i}\|_{2})
j=0nN01e.9λαj(De6Ce6ϵ(nj)+e2.99λ(nj)eC7ln(θ)+C5e6Ce6ϵ(nj)).\leq\sum_{j=0}^{n-N_{0}-1}\!\!e^{-.9\lambda\alpha j}\left(De^{6C}e^{6\epsilon(n-j)}\!\!+\!e^{-2.99\lambda(n-j)}e^{-C_{7}\ln(\theta)}\!\!+\!C_{5}e^{6C}e^{6\epsilon(n-j)}\right)\!\!.

The sum of the first and third terms inside the parentheses is straightforward to evaluate. There is a constant $C_{9}$ such that each is bounded by $C_{9}e^{6C}e^{6\epsilon n}$. The terms involving $\ln(\theta)$ are only slightly more complicated: either $j$ or $n-j$ is large, hence the terms involving $\lambda$ dominate the $e^{-C_{7}\ln\theta}$ term as long as $D_{0}$ is large enough. Thus by the above estimates, as long as $D_{0}$ is sufficiently large, there exists $C_{10}$ such that

\[\|\ln\widetilde{\rho}_{n}\|_{C^{\alpha}}\leq C_{10}e^{6\max\{C,0\}}e^{6\epsilon n}.\]

This is the form of the estimate in the Lyapunov charts. We then need to pass back to the original metric. Applying Lemma A.7, letting $C^{\prime}$ denote the constant from that lemma, and using (A.58) and (A.66), we get

\[\begin{aligned}\|\ln\rho_{n}\|_{C^{\alpha}}&\leq e^{(1+\alpha)(2C+2\epsilon n)}\left(C_{10}e^{6\max\{C,0\}+6\epsilon n}+C^{\prime}e^{2C+\epsilon n}(1+C_{8}e^{12\max\{C,0\}+12\epsilon n})\right)\\ &\leq C_{11}e^{18\max\{C,0\}}e^{18\epsilon n}.\end{aligned}\]

This is the needed conclusion, so we are done. ∎

Appendix B Finite time Pesin theory and fake stable manifolds

B.1. Fake stable manifolds

In the proof of the coupling lemma, we will use the holonomies of some “fake” stable manifolds $W^{s}_{n}(\omega,x)$. These manifolds behave for a finite time like a true stable manifold insofar as they contract. We then prove some lemmas about fake stable curves. Some of the results below are variants on standard facts in Pesin theory; however, some of the proofs are a little different because we only use a finite portion of an orbit. For other facts that look standard we needed to supply our own proofs because we could not find a similar enough statement in the literature.

For a given word $\omega$ and $n\in\mathbb{N}$ the fake stable manifolds are curves that have analogous properties to the stable manifolds up until time $n$. So, unlike true stable manifolds, they are not canonically defined.

Before we begin we recall some notation. Throughout this section we will write $\Lambda^{\omega}_{n}(C,\lambda,\epsilon)$ for the set of $(C,\lambda,\epsilon)$-tempered points $x\in M$ at time $n$ for the word $\omega\in\Sigma$. This is essentially the finite time version of a Pesin block. For many of the results there is a lower bound on $n$, which is required to ensure that the orbit is actually experiencing hyperbolicity.

Below we will make a number of arguments concerning these fake stable manifolds. The main properties we need concern the holonomies between two transversals to the $W^{s}_{n}$ lamination. We need to know that the $W^{s}_{n}$ holonomies have a uniformly Hölder continuous Jacobian independent of $n$. In addition we would like to know that, as $n\to\infty$, the holonomies converge exponentially quickly to the true stable holonomy.

Before proceeding to the proof, we remark that there are other approaches to fake stable manifolds that are adapted to different sorts of dynamical problems and may differ from each other substantially. For example, Burns and Wilkinson [BW10], which originated the term fake manifold, use fake center and stable manifolds where a potentially different fake foliation is defined at every point in the manifold. A different approach in Dolgopyat, Kanigowski, Rodriguez-Hertz [DKRH24] uses a fake foliation that is globally defined but does not cover the entire Pesin regular set. Note that, in contrast with our setting, [BW10] and [DKRH24] allow systems with some zero exponents, and so the invariant manifolds need not be unique in their settings. One benefit of the construction described below is that it applies to every point in a Pesin block and further gives a single fake stable lamination defined on the manifold rather than a collection of different overlapping laminations. While this makes the fake stable lamination simple to think about, it requires more work to show that it exists.

B.2. Preliminaries

Here we present some background that will be used in the next subsection to study the regularity of $E_{n}^{s}$.

We start with a useful fact for showing that the limit of a sequence of functions is Hölder continuous. This fact is completely standard. Note that the statement is false if the diameter of $M_{2}$ is unbounded. Also, recall that in our setup, the Hölder constant only applies to estimates on the distance between $g(x)$ and $g(y)$ for points with $d(x,y)\leq 1$.

Lemma B.1.

Suppose that $M_{2}$ is a metric space with bounded diameter. Fix $\eta,\lambda,\delta,\beta>0$. Then there exist $0<\alpha<\beta$ and $D(\eta,\lambda,\delta,\beta,\alpha)$ such that for any metric space $M_{1}$ the following holds. Let $g_{n}\colon M_{1}\to M_{2}$, $1\leq n\leq N$, be a finite or infinite sequence of $\beta$-Hölder continuous functions such that:

(1) For $1\leq n<N$, $d_{C^{0}}(g_{n},g_{n+1})\leq C_{1}e^{-\delta n}$.

(2) The function $g_{n}$ is $C_{3}e^{\eta n}$ $\beta$-Hölder continuous at scale $e^{-C_{2}}e^{-\lambda n}$, i.e., if $d(x,y)\leq e^{-C_{2}}e^{-\lambda n}$ then $d(g_{n}(x),g_{n}(y))\leq C_{3}e^{\eta n}d(x,y)^{\beta}$.

Then the functions $g_{1},\ldots,g_{N}$ in the sequence, as well as the limit of the sequence when it exists, are all uniformly $\alpha$-Hölder with constant at most

\[\max\left\{De^{C_{2}\alpha},2C_{1}(1-e^{-\delta})^{-1}e^{\beta C_{2}}D^{\beta}+C_{3}e^{-C_{2}(\beta-\alpha)}\right\}.\]
Proof.

We will assume throughout the proof that $g_{N}$ is fixed and obtain an estimate for $g_{N}$ that is independent of $N$. As the resulting estimate is independent of $N$, the conclusion holds for infinite sequences as well.

To begin we pick some constants. First, for fixed $\eta>0$ and any $0<\alpha_{1}<\beta$ there exists $\gamma\geq\lambda$ such that

\[\eta-\gamma\beta\leq-\alpha_{1}\gamma\quad\text{and}\quad\eta<\gamma\alpha_{1}. \tag{B.1}\]

Note that $\gamma$ only depends on $\eta,\alpha_{1},\lambda,\beta$, but not on $C_{1},C_{2},C_{3}$.

Next, given $\delta$, let $0<\alpha_{2}<\beta$ be sufficiently small that we have

\[\delta\geq\alpha_{2}\gamma. \tag{B.2}\]

Due to the first assumption, we have a uniform estimate independent of $N$:

\[\left|g_{N}-g_{n}\right|\leq\sum_{i=n}^{N-1}C_{1}e^{-i\delta}\leq\frac{C_{1}e^{-n\delta}}{1-e^{-\delta}}.\]
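The uniform estimate above is a geometric tail bound. As a small numerical sanity check, the Python sketch below compares the partial sums $\sum_{i=n}^{N-1}C_{1}e^{-i\delta}$ with the closed-form bound for several $N$. The helper names and the sample values are hypothetical.

```python
import math

def tail_sum(C1, delta, n, N):
    """Finite sum of C1 * exp(-i * delta) over n <= i < N."""
    return sum(C1 * math.exp(-i * delta) for i in range(n, N))

def tail_bound(C1, delta, n):
    """Bound C1 * exp(-n * delta) / (1 - exp(-delta)), independent of N."""
    return C1 * math.exp(-n * delta) / (1.0 - math.exp(-delta))
```

The bound is the value of the full geometric series, so the finite sums increase toward it as $N$ grows and never exceed it.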

Having picked those constants, now consider a pair of points $x,y\in M_{1}$. We consider three cases depending on how far apart $x$ and $y$ are. We proceed from closest to furthest away.

(1) First suppose that $d(x,y)<\min\{e^{-C_{2}}e^{-\gamma N},1\}$. Then

\[\begin{aligned}d(g_{N}(x),g_{N}(y))&\leq C_{3}e^{\eta N}d(x,y)^{\beta}\leq C_{3}e^{\eta N}d(x,y)^{\alpha_{1}}d(x,y)^{\beta-\alpha_{1}}\\ &\leq C_{3}e^{\eta N}e^{-C_{2}\alpha_{1}}e^{-\gamma\alpha_{1}N}d(x,y)^{\beta-\alpha_{1}}\leq C_{3}e^{-C_{2}\alpha_{1}}d(x,y)^{\beta-\alpha_{1}},\end{aligned}\]

where we have used (B.1).

(2) Next, we consider the case where $e^{-C_{2}}e^{-\gamma}e^{-\gamma n}\leq d(x,y)\leq\min\{1,e^{-C_{2}}e^{-\gamma n}\}$ for some $1\leq n<N$. By the choice of constants $\alpha_{1},\alpha_{2}$ and $\gamma$ in the first part of the proof we find:

\[\begin{aligned}d(g_{N}(x),g_{N}(y))&\leq d(g_{N}(x),g_{n}(x))+d(g_{n}(x),g_{n}(y))+d(g_{n}(y),g_{N}(y))\\ &\leq C_{1}e^{-n\delta}(1-e^{-\delta})^{-1}+C_{3}e^{\eta n}d(x,y)^{\beta}+C_{1}e^{-n\delta}(1-e^{-\delta})^{-1}\\ &\leq 2C_{1}e^{-n\gamma\alpha_{2}}(1-e^{-\delta})^{-1}+C_{3}e^{\eta n}e^{-n\gamma\alpha_{1}}e^{-C_{2}\alpha_{1}}d(x,y)^{\beta-\alpha_{1}}.\end{aligned}\]

Then due to the lower bound on $d(x,y)$ and $\eta<\gamma\alpha_{1}$ from (B.1):

\[\begin{aligned}d(g_{N}(x),g_{N}(y))&\leq 2C_{1}(1-e^{-\delta})^{-1}e^{\alpha_{2}C_{2}}e^{\alpha_{2}\gamma}d(x,y)^{\alpha_{2}}+C_{3}e^{-C_{2}\alpha_{1}}d(x,y)^{\beta-\alpha_{1}}\\ &\leq\left(2C_{1}(1-e^{-\delta})^{-1}e^{\alpha_{2}C_{2}}e^{\alpha_{2}\gamma}+C_{3}e^{-C_{2}\alpha_{1}}\right)d(x,y)^{\min\{\alpha_{2},\beta-\alpha_{1}\}}.\end{aligned}\]

(3) Finally we consider the case where $e^{-C_{2}}e^{-\gamma}<d(x,y)$. Then we use a trivial bound

\[d(g_{N}(x),g_{N}(y))\leq\operatorname{diam}(M_{2})\leq\left(\frac{\operatorname{diam}(M_{2})}{(e^{-C_{2}}e^{-\gamma})^{\beta-\alpha_{1}}}\right)d(x,y)^{\beta-\alpha_{1}}.\]

Now using all three cases above, we may conclude. Note that the $(\beta-\alpha_{1})$-Hölder constant obtained in the second item above is at least as big as the constant obtained in the first item in the list. Thus the function $g_{N}$ is uniformly $(\beta-\alpha_{1})$-Hölder with constant at most

\[\max\left\{(\operatorname{diam}M_{2})e^{(C_{2}+\gamma)(\beta-\alpha_{1})},2C_{1}(1-e^{-\delta})^{-1}e^{\alpha_{2}C_{2}}e^{\alpha_{2}\gamma}+C_{3}e^{-C_{2}\alpha_{1}}\right\}.\]

As the choice of constants $\alpha_{1},\alpha_{2},\gamma$ depends only on $\delta,\eta$, we obtain the needed conclusion. ∎
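To make the choice of constants in the proof concrete, the following Python sketch produces one admissible triple $(\gamma,\alpha_{2},\min\{\alpha_{2},\beta-\alpha_{1}\})$ from $\eta,\lambda,\delta,\beta$ and a chosen $\alpha_{1}$. The function name and the specific formulas for $\gamma$ and $\alpha_{2}$ are illustrative assumptions: any values satisfying (B.1) and (B.2) would serve equally well, and the proof does not prescribe these.

```python
def choose_constants(eta, lam, delta, beta, alpha1):
    """Return (gamma, alpha2, exponent) satisfying (B.1) and (B.2).

    (B.1) is equivalent to gamma * (beta - alpha1) >= eta and
    gamma * alpha1 > eta, so it suffices to take gamma large.
    """
    assert eta > 0 and lam > 0 and delta > 0
    assert 0 < alpha1 < beta
    # gamma >= lam, gamma >= eta / (beta - alpha1), gamma * alpha1 = 2 * eta > eta
    gamma = max(lam, eta / (beta - alpha1), 2.0 * eta / alpha1)
    # (B.2): delta >= alpha2 * gamma, with 0 < alpha2 < beta
    alpha2 = min(delta / gamma, beta / 2.0)
    # Hölder exponent appearing in case (2) of the proof
    exponent = min(alpha2, beta - alpha1)
    return gamma, alpha2, exponent
```

For example, `choose_constants(1.0, 0.5, 0.2, 0.8, 0.4)` gives $\gamma=5$, and the inequalities in (B.1) and (B.2) can then be checked directly. Note that none of $C_{1},C_{2},C_{3}$ enters the choice.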

We will apply Lemma B.1 to obtain regularity of $E^{s}_{n}$ after we obtain small scale Hölder continuity of $E^{s}_{n}$.

Next we present a perturbation result on the singular subspaces of linear transformations called Wedin’s theorem. This theorem gives a bound on the change in the angle between the singular directions. We state a specialized version of this theorem adapted from the presentation in [Ste91, Thm. 4]. First we describe the theorem in some generality, but below we give a precise statement for $\operatorname{SL}(2,\mathbb{R})$ independent of the discussion and definitions mentioned below. If $A$ and $\widetilde{A}=A+E$ are two $n\times n$ matrices then we may list their singular values as $\sigma_{1}\geq\cdots\geq\sigma_{n}$ and $\widetilde{\sigma}_{1}\geq\cdots\geq\widetilde{\sigma}_{n}$. Write $\|E\|_{F}$ for the Frobenius norm of the matrix $E$, i.e. the $L^{2}$ norm of its entries viewed as a vector. Fix some index $k$ such that $\sigma_{k}\geq\widetilde{\sigma}_{k+1}$. If $\left|\sigma_{k}-\widetilde{\sigma}_{k+1}\right|\geq\delta$ and $\widetilde{\sigma}_{k}\geq\delta$, then Wedin’s theorem implies that:

\[\|\sin\Phi\|_{F}\leq\frac{\sqrt{2}\|E\|_{F}}{\delta},\]

where $\|\sin\Phi\|_{F}$ denotes the Frobenius norm of the matrix that defines the canonical angles between the right singular subspaces associated to $\sigma_{1},\ldots,\sigma_{k}$ and $\widetilde{\sigma}_{1},\ldots,\widetilde{\sigma}_{k}$. (The matrix $\sin\Phi$ is defined by taking the inner products between orthonormal bases of the right singular subspaces of $A$ and $\widetilde{A}$.) Note that the statement in [Ste91, Thm. 4] is in terms of certain residuals, but by the comment before the theorem, these are bounded by $\|E\|_{F}$. Below we will use that the Frobenius norm of a $2$ by $2$ matrix satisfies the bound $\|E\|_{F}\leq\sqrt{2}\|E\|$, where $\|E\|$ is the usual operator norm of the matrix [HJ13, 5.6.P23].

Although the statement from the above paragraph is somewhat technical, when both the matrix $A$ and its perturbation $A+E$ are in $\operatorname{SL}(2,\mathbb{R})$, as is the case for us, the statement simplifies considerably. This is because for such a matrix $\sigma_{1}=\sigma_{2}^{-1}$ and the top singular value of a matrix in $\operatorname{SL}(2,\mathbb{R})$ can change by at most $\|E\|$ when we perturb by $E$. If $\|A\|\geq 2$ and $E$ is a perturbation with $\|E\|\leq\|A\|/2$, then

\[\begin{aligned}\|A+E\|&\geq\frac{1}{2}\|A\|,\\ \|A\|-(\|A+E\|)^{-1}&\geq\|A\|-\frac{2}{\|A\|}\geq\frac{1}{2}\|A\|,\end{aligned}\]

where the last inequality uses $\|A\|\geq 2$. So, we may apply Wedin’s theorem with $\delta=\|A\|/2$. In this case, the matrix of canonical angles described above consists of a single number: the angle between the original most expanded singular direction and the new one. Thus we obtain the following proposition.

Proposition B.2.

Suppose that $A$ is a matrix in $\operatorname{SL}(2,\mathbb{R})$ with $\|A\|\geq 2$. Consider a perturbation $A+E\in\operatorname{SL}(2,\mathbb{R})$ with $\|E\|\leq\|A\|/2$. Denote by $v_{A}$ and $v_{A+E}$ the most expanded singular vectors of $A$ and $A+E$. Then

\[\left|\sin\angle(v_{A},v_{A+E})\right|\leq\frac{2\sqrt{2}\|E\|}{\|A\|}.\]
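As an illustration of Proposition B.2, the Python sketch below evaluates both sides of the inequality for one explicit pair in $\operatorname{SL}(2,\mathbb{R})$: $A=\operatorname{diag}(4,1/4)$ and $A+E=A\,R(0.3)$, where $R(\phi)$ is a rotation. This is a single hand-picked example, not a proof. The helper names are hypothetical, and the singular data are computed from the closed-form eigen-decomposition of the $2\times 2$ matrix $X^{T}X$ rather than from a library SVD.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(t):
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def gram(X):
    """Entries (a, b, d) of the symmetric matrix X^T X = [[a, b], [b, d]]."""
    a = X[0][0] ** 2 + X[1][0] ** 2
    b = X[0][0] * X[0][1] + X[1][0] * X[1][1]
    d = X[0][1] ** 2 + X[1][1] ** 2
    return a, b, d

def op_norm(X):
    """Operator norm: square root of the top eigenvalue of X^T X."""
    a, b, d = gram(X)
    return math.sqrt((a + d) / 2 + math.hypot((a - d) / 2, b))

def top_direction(X):
    """Angle (mod pi) of the most expanded right singular direction of X."""
    a, b, d = gram(X)
    return 0.5 * math.atan2(2 * b, a - d)

def wedin_check(A, B):
    """Return (|sin angle|, bound from Proposition B.2, hypothesis ||E|| <= ||A||/2)."""
    E = [[B[i][j] - A[i][j] for j in range(2)] for i in range(2)]
    sine = abs(math.sin(top_direction(A) - top_direction(B)))
    bound = 2 * math.sqrt(2) * op_norm(E) / op_norm(A)
    return sine, bound, op_norm(E) <= op_norm(A) / 2

A = [[4.0, 0.0], [0.0, 0.25]]
B = matmul(A, rotation(0.3))  # still in SL(2, R); top direction rotated by 0.3
sine, bound, hypothesis_ok = wedin_check(A, B)
```

In this example the most expanded direction of $A\,R(\phi)$ is that of $A$ rotated by $-\phi$, so the left-hand side equals $\sin(0.3)$, while $\|E\|=2\sin(0.15)\,\|A\|$.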

B.3. Regularity of the most contracting direction

We now estimate the regularity of $E^{s}_{n}(x)$, the most contracted direction of $D_{x}f^{n}_{\omega}$, on the set of $(C,\lambda,\epsilon)$-tempered points at time $n$ in terms of $C$. The approach to studying Hölder regularity here may be contrasted with the approach in Shub [Shu87, Thm. 5.18(c)]. That approach establishes Hölder regularity for an invariant section of a bundle automorphism under an appropriate bunching condition by comparing the contraction in the fiber with the strength of hyperbolicity in the base. In some sense the approach is similar: it uses the dynamics to study the Hölder regularity at different scales. One can compare equation (***) there with our Lemma B.1.

Proposition B.3.

Suppose that $(f_{1},\ldots,f_{m})$ is a tuple of diffeomorphisms in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$ of a closed surface $M$. Fix $\lambda>0$. Then there exist $\epsilon_{0},\beta>0$ such that for any $0\leq\epsilon\leq\epsilon_{0}$ there exists $D_{1}$ such that the following holds. For $C\geq 0$, let $\Lambda^{n}_{\omega}(C)$ denote the $(C,\lambda,\epsilon)$-tempered points at time $n$ for $\omega\in\Sigma$. If $n\geq N_{0}(C)=\lceil(C+\ln(2))/\lambda\rceil$, then restricted to $\Lambda^{n}_{\omega}(C)$, $E^{s}_{n}$ is $\beta$-Hölder with constant $e^{D_{1}C}$.

Proof.

We may always study the dynamics in an atlas of uniformly smooth volume preserving charts on MM. So, in what follows we will implicitly be working with such charts.

The first claim is an immediate analog of [BP07, Lem. 5.3.4]. There exists $\Lambda>0$ such that for $n\in\mathbb{N}$, if $x,y\in M$ with $d(x,y)\leq e^{-\Lambda n}$, then (as viewed in charts),

\[\|D_{x}f^{n}_{\omega}-D_{y}f^{n}_{\omega}\|\leq e^{\Lambda n}d(x,y). \tag{B.3}\]

Our plan is to apply Lemma B.1, so we need to estimate the regularity of $E^{s}_{n}$. The first thing we need is a lower bound on $n$ for the subspace $E^{s}_{n}$ to necessarily exist. From the definition of $(C,\lambda,\epsilon)$-tempered, we see that as long as

\[n\geq\left\lceil\frac{C+\ln 2}{\lambda}\right\rceil=N_{0}(C), \tag{B.4}\]

then $\|D_{x}f^{n}_{\omega}\|\geq 2$ and hence there is a well defined most contracted subspace.
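The threshold (B.4) can be read off numerically. The sketch below assumes, for illustration only, that temperedness provides the lower bound $\|D_{x}f^{n}_{\omega}\|\geq e^{-C}e^{\lambda n}$; the precise definition is given earlier in the paper and is not restated in this appendix, so this form of the bound is an assumption, as are the helper names.

```python
import math

def n0(C, lam):
    """Threshold N_0(C) = ceil((C + ln 2) / lam) from (B.4)."""
    return math.ceil((C + math.log(2.0)) / lam)

def tempered_lower_bound(C, lam, n):
    # ASSUMPTION: temperedness gives ||D_x f^n_omega|| >= exp(-C) * exp(lam * n).
    return math.exp(lam * n - C)
```

Under that assumption, $n\geq N_{0}(C)$ is exactly the condition $\lambda n-C\geq\ln 2$, which makes the lower bound at least $2$.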

Next we estimate the Hölder regularity of $E^{s}_{n}$ on $\Lambda^{N}_{\omega}$ for $N\geq n\geq N_{0}$. If $x\in\Lambda^{n}_{\omega}$ and $d(x,y)\leq e^{-\Lambda n}/2$, then it follows from (B.3) that

\[\|D_{x}f^{n}_{\omega}-D_{y}f^{n}_{\omega}\|\leq e^{\Lambda n}d(x,y)\leq 1/2.\]

Thus, from Proposition B.2, as $\|D_{x}f^{n}_{\omega}\|\geq 2$, it follows that for $d(x,y)\leq e^{-\Lambda n}/2$,

\[d(E^{s}_{n}(x),E^{s}_{n}(y))<\sqrt{2}e^{\Lambda n}d(x,y), \tag{B.5}\]

which is the small scale Hölder estimate we were seeking.

Next, we study how fast $E^{s}_{n}$ fluctuates as we increase $n$. By assumption the sequence of points is $(C,\lambda,\epsilon)$-tempered. Hence by Proposition 4.6, there exists $D_{8}$ depending only on $\lambda,\epsilon$ such that for $n$ greater than or equal to our same $N_{0}$ it follows that on $\Lambda^{\omega}_{n}(C,\lambda,\epsilon)$

\[\angle(E^{s}_{n}(x),E^{s}_{n+1}(x))\leq e^{4C+D_{8}}e^{-2(\lambda-\epsilon)n}. \tag{B.6}\]

We can now apply Lemma B.1 to the sequence of distributions $E^{s}_{n}$, for $N_{0}\leq n\leq N$, by combining estimates (B.5) and (B.6). Thus there exist $0<\beta<1$ and $C_{3}$ such that the $E^{s}_{n}$ are $\beta$-Hölder with constant

\[\max\{C_{3}e^{\Lambda N_{0}},2e^{4C+D_{8}}(1-e^{-2(\lambda-\epsilon)})^{-1}e^{\Lambda N_{0}}C_{3}+C_{3}e^{-\Lambda N_{0}}\}.\]

But by our choice of $N_{0}\approx C/\lambda$ and absorbing some constants into each other, we find that there is some $C_{4}$ such that the $\beta$-Hölder constant of $E^{s}_{n}$ is at most $C_{4}e^{((\Lambda/\lambda)+4)C}$, which gives the needed conclusion. ∎

The above proposition will give us a Hölder estimate on the regularity of $Df^{n}(E^{s}_{n})$ as well, which will allow us to define the fake stable manifolds. Before proceeding, we use the above results to record another useful fact about the continuity of the distribution of the stable directions.

Proposition B.4.

Suppose that $(f_{1},\ldots,f_{m})$ is an expanding on average tuple of diffeomorphisms on a surface $M$ in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$. Let $\nu^{s}_{x}$ be the distribution of stable subspaces through $x$, which is a probability measure on $\mathbb{P}T_{x}M$, the projectivization of $T_{x}M$. Then if we identify nearby fibres by parallel transport, the map $x\mapsto\nu^{s}_{x}$ is continuous in the weak* topology.

Proof.

Let $\nu^{s}_{x}(C,\lambda,\epsilon,n)$ denote the distribution of $E^{s}_{n}(\omega)$ for words $\omega$ that are $(c,\lambda,\epsilon)$-tempered for some $c$ in $[C,C+1)$. Then by Proposition B.3, the distribution $E^{s}_{n}$ for such words $\omega$ is uniformly Hölder continuous in $n$ for fixed $C$. So, if $\nu^{s}_{x}(C,\lambda,\epsilon)$ denotes the distribution of $E^{s}_{\omega}(x)$ for $(C,\lambda,\epsilon)$-tempered $\omega$, we see that the measures $\nu^{s}_{x}(C,\lambda,\epsilon)$ vary weak* continuously. Almost every word $\omega$ is $(C,\lambda,\epsilon)$-tempered for some $C$. Thus we see that

\[\nu^{s}_{x}=\sum_{C=0}^{\infty}\nu^{s}_{x}(C,\lambda,\epsilon).\]

Note that each partial sum of this series varies weak* continuously and that the mass is uniformly absolutely summable pointwise. Thus the limiting family $\nu^{s}_{x}$ is seen to vary weak* continuously. ∎

B.4. Construction of fake stable manifolds

As mentioned above, we will define the fake stable manifolds by taking curves tangent to a smooth approximation to the distribution $V_{n}$, which is defined to equal $Df^{n}_{\omega}(E^{s}_{n})$ as above. First, we note that Proposition B.3 above will be applicable to studying the regularity of $V_{n}$ due to the following.

Lemma B.5.

Suppose that $A_{1},\ldots,A_{n}$ is a sequence of linear transformations that are $(C,\lambda,\epsilon)$-tempered. Then the sequence $A_{n}^{-1},\ldots,A_{1}^{-1}$ is $(C+\epsilon n,\lambda,\epsilon)$-tempered, and the corresponding splitting is the splitting with the stable and unstable subspaces from the original splitting swapped.

Using Proposition B.3 and Lemma B.5 we can estimate the regularity of $V_{n}$.

Lemma B.6.

Suppose that $(f_{1},\ldots,f_{m})$ is a tuple in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$ where $M$ is a compact surface. Fix $C,\lambda>0$. Then there exist $\beta,\eta>0$ such that for any sufficiently small $\epsilon>0$ there exist $D_{1}$ and $N\in\mathbb{N}$ such that if $\Lambda^{n}_{\omega}$ is the set of points that are $(C,\lambda,\epsilon)$-tempered at some time $n\geq N$, then the distribution $V_{n}$ defined on $f^{n}_{\omega}(\Lambda^{n}_{\omega})$ by $D_{x}f^{n}_{\omega}(E^{s}_{n}(x))$ is $\beta$-Hölder with constant $D_{1}e^{\eta\epsilon n}$.

Proof.

Apply Proposition B.3 with $\lambda$ as above to the diffeomorphisms $(f_{1}^{-1},\ldots,f_{m}^{-1})$. Then there exist $\beta$ and $\epsilon_{0}$ such that restricted to the set of $(C,\lambda,\epsilon)$-tempered points at time $n\geq O_{\lambda}(C)$, $E^{s}_{n}$ is $\beta$-Hölder with constant at most $e^{D_{1}C}$. From Lemma B.5, we see that for the backwards dynamics $(f_{\sigma^{n-i}(\omega)})^{-1}$, the points in $f^{n}_{\omega}(\Lambda^{n}_{\omega})$ are $(C+\epsilon n,\lambda,\epsilon)$-tempered. Note that $V_{n}$ is equal to the distribution of the most expanded direction for $(f^{n}_{\omega})^{-1}$ and that $V_{n}^{\perp}$ is the most contracted direction of $(f^{n}_{\omega})^{-1}$. As the set $f^{n}_{\omega}(\Lambda^{n}_{\omega})$ is $(C+\epsilon n,\lambda,\epsilon)$-tempered for the backwards dynamics, it follows that as long as $\epsilon$ is sufficiently small and $N_{0}$ is sufficiently large, for all $n\geq N_{0}$, $V_{n}^{\perp}$ is $e^{D_{1}(C+\epsilon n)}$ $\beta$-Hölder. Since $V_{n}$ is obtained from $V_{n}^{\perp}$ by a rotation by $\pi/2$, the same estimate holds for $V_{n}$. The statement of the lemma now follows. ∎

Next we take a smooth approximation $\widetilde{V}_{n}$ to the distribution $V_{n}$ that will be defined in an open neighborhood of $f^{n}_{\omega}(\Lambda^{n}_{\omega})$. First we extend the domain of $V_{n}$, and then we smooth the extension. If we do not extend the domain, then we will not be able to integrate the distribution. If we do not do this smoothing, then we will have little control over the norm of the integral curves to $V_{n}$ rather than tempered growth in $n$.

Lemma B.7.

Suppose that $M$ is a smooth closed surface. There exist $D_{1},D_{2}$ such that if $K\subseteq M$ is a subset and $E$ is a distribution defined over $K$ that is $(C,\alpha)$-Hölder then $E$ admits a $(D_{1}C,\alpha)$-Hölder extension to a neighborhood of $K$ of size $\delta=D_{2}\min\{1,C^{-1/\alpha}\}$.

Proof.

We first prove the result with vector fields instead of distributions. Cover $M$ by finitely many charts. In each chart the vector field is represented as a map $\phi_{0}\colon K\to S^{1}\subset\mathbb{R}^{2}$. The McShane extension theorem [McS34, Cor. 1] says that a $(C,\alpha)$-Hölder function from a subset of an arbitrary metric space $X$ to $\mathbb{R}$ admits a $(C,\alpha)$-Hölder extension to all of $X$. Applying this to each coordinate of $\phi_{0}$ gives an extension in each chart. Then we glue the maps from different charts using a partition of unity. This proves the result for vector fields. Note that the resulting vector field is defined on the whole manifold. To obtain the result for distributions, we take a unit vector field on $K$ in the direction of $E$, extend it to a vector field $\tilde{X}$ as above and note that the resulting extension is nonzero inside the $\delta$ neighborhood of $K$, so we can take $\tilde{E}$ to be the direction of $\tilde{X}$. ∎

The content of the following lemma is item (2), the $C^{2}$ estimate on $\widetilde{V}_{n}$. While $V_{n}$ could be seen to be $C^{2}$, we have little ability to control its norm; thus we need to produce a more regular approximation to this distribution.

Lemma B.8.

Let $(f_{1},\ldots,f_{m})$ be a tuple of diffeomorphisms in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$, for $M$ a closed surface. Fix $\lambda>0$. Then there exist $\epsilon_{1}>0$, $\nu_{1},\nu_{2}>0$, $N\in\mathbb{N}$, and $D_{1},D_{2},D_{3}$ such that if, for $\epsilon<\epsilon_{1}$, $\Lambda^{\omega}_{n}$ denotes the set of $(C,\lambda,\epsilon)$-tempered points, then there exists a distribution $\widetilde{V}_{n}$ such that

(1) The domain of $\widetilde{V}_{n}$ contains all points within distance $D_{1}e^{-\epsilon\nu_{1}n}$ of the domain of $V_{n}$.

(2) $\widetilde{V}_{n}$ is $C^{2}$ with $\|\widetilde{V}_{n}\|_{C^{2}}\leq D_{2}e^{\epsilon\nu_{1}n}$.

(3) At each $x$ in the domain of $V_{n}$, $d(\widetilde{V}_{n}(x),V_{n}(x))<D_{3}e^{-\epsilon\nu_{2}n}$.

Proof.

First, from Lemma B.6, we may choose $\epsilon_{1}$ sufficiently small that, for $\epsilon<\epsilon_{1}$, $V_{n}$ is $\beta$-Hölder with constant $D_{1}e^{\eta\epsilon n}$. Let $\hat{V}_{n}$ be an extension of $V_{n}$ obtained from Lemma B.7; then from the Hölder estimate on $V_{n}$, $\hat{V}_{n}$ is defined in a neighborhood of $f^{n}_{\omega}(\Lambda^{n}_{\omega})$ of size at least $D_{1}^{-1/\beta}e^{-\eta\epsilon n/\beta}$.

We now take a smooth approximation to $\hat{V}_{n}$. For this we can represent $\hat{V}_{n}$ in charts as a function $\phi\colon U\to S^{1}\subset\mathbb{R}^{2}$, then mollify $\phi$. From [FKS13, Eq. (11)], we have estimates for the convolution $f_{\epsilon}=f*\psi_{\epsilon}$ of a standard mollifier $\psi_{\epsilon}$ with a compactly supported function $f\colon\mathbb{R}^{2}\to\mathbb{R}$:

\[\|f_{\epsilon}\|_{2}\leq\epsilon^{\alpha-4}\|f\|_{\alpha}\quad\text{ and }\quad\|f-f_{\epsilon}\|_{0}\leq\epsilon^{\alpha}\|f\|_{\alpha}. \tag{B.7}\]

As the domain of $\hat{V}_{n}$ has size at least $D_{1}^{-1/\beta}e^{-\eta\epsilon n/\beta}$, we can mollify with any $\epsilon^{\prime}<D_{1}^{-1/\beta}e^{-\eta\epsilon n/\beta}/100$ and obtain a function that is well defined at all points at least distance $D_{1}^{-1/\beta}e^{-\eta\epsilon n/\beta}/100$ from the boundary of the domain of $\hat{V}_{n}$. Let $\widetilde{V}_{n}$ denote the mollified function restricted to the points in the domain of $\hat{V}_{n}$ of distance at most $D_{1}^{-1/\beta}e^{-\eta\epsilon n/\beta}/100$ from the domain of $V_{n}$. Then taking $\epsilon^{\prime}=e^{-\nu\epsilon n}$ for some large $\nu$, mollifying with $\psi_{\epsilon^{\prime}}$, and applying the estimates in (B.7) gives that there exist constants $D_{2},D_{3},D_{4},D_{5}>0$ such that

\[\|\widetilde{V}_{n}\|_{2}\leq D_{2}e^{D_{3}\epsilon n}\text{ and }d(\widetilde{V}_{n},V_{n})<D_{4}e^{-D_{5}\epsilon n}.\]

This gives the needed conclusion. ∎
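The exponent bookkeeping in this proof can be sketched as follows, under the assumption that the mollification scale is $\epsilon^{\prime}=e^{-\nu\epsilon n}$ and that (B.7) is applied with $\alpha=\beta$ to a function of Hölder norm $D_{1}e^{\eta\epsilon n}$. The function name is hypothetical, multiplicative constants are ignored, and all rates are per unit of $\epsilon n$.

```python
def mollified_rates(beta, eta, nu):
    """Exponential rates obtained from (B.7) at scale exp(-nu * eps * n).

    Returns (c2_rate, c0_rate, scale_fits), where
      C^2 norm      <= const * exp( c2_rate * eps * n),
      C^0 distance  <= const * exp(-c0_rate * eps * n),
    and scale_fits says whether the scale is eventually smaller than the
    size exp(-eta * eps * n / beta) of the extended domain.
    """
    c2_rate = (4.0 - beta) * nu + eta   # from eps'^(beta - 4) * exp(eta * eps * n)
    c0_rate = beta * nu - eta           # from eps'^beta * exp(eta * eps * n)
    scale_fits = nu > eta / beta
    return c2_rate, c0_rate, scale_fits
```

In this sketch the $C^{0}$ error decays exactly when $\nu>\eta/\beta$, which is also the condition for the mollification scale to fit inside the extended domain; this is one way to read the requirement that $\nu$ be large.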

The use of the distributions $\widetilde{V}_{n}$ is that they are integrable and their $C^{2}$ norm is well controlled. This implies that if we take a holonomy along the distribution, then we will have good control of the norm of the Jacobian.

Definition B.9.

Fix $\lambda>0$ and sufficiently small $\epsilon>0$. Then take $\epsilon_{1}<\epsilon/\max\{\nu_{1},\nu_{2}\}$ where $\nu_{1},\nu_{2}$ are as in Lemma B.8. We consider a collection of $(C,\lambda,\epsilon_{1})$-tempered points. Let $\widetilde{W}_{n}$ be the foliation defined by the integral curves to $\widetilde{V}_{n}$. The fake stable leaf through $x\in\Lambda^{n}_{\omega}$ is then defined to be $W^{s}_{n}(\omega,x)=(f^{n}_{\omega})^{-1}(\widetilde{W}_{n}(f^{n}_{\omega}(x)))$.

We will now state basic facts about the fake stable manifolds. In particular, we show that the fake stable manifolds of sufficiently small size enjoy uniform contraction.

Proposition B.10.

Suppose that $(f_{1},\ldots,f_{m})$ is an expanding on average tuple of diffeomorphisms in $\operatorname{Diff}^{2}(M)$, where $M$ is a closed surface. Fix $\lambda>0$. Then there exist $\lambda^{\prime},\epsilon_{0}>0$ such that for any $0\leq\epsilon\leq\epsilon_{0}$ and any $C$, there exist $N_{0},\delta_{0},C_{0},\alpha>0$ such that if $\Lambda_{n}^{\omega}\subset M$ is any collection of $(C,\lambda,\epsilon)$-tempered points at time $n\geq N_{0}$ lying in some ball $B_{\delta_{0}}\subset M$, then:

(1) For $N_{0}\leq i\leq n$ the fake stable manifolds $W^{s}_{i,\delta_{0}}(\omega,x)$ exist and have $C^{2}$ norm at most $C_{0}$.

(2) $d(T_{x}W^{s}_{i},E^{s}_{i}(x))\leq e^{-\lambda i/2}$.

(3) The fake stable direction $E^{s}_{i}$ is $(C_{0},\alpha)$-Hölder continuous on $\Lambda_{n}^{\omega}$.

(4) The fake stable leaves $W^{s}_{i,\delta_{0}}(\omega,x)$ vary Hölder continuously in the $C^{1}$ topology, and the Hölder constants are independent of $N_{0}\leq i\leq n$.

(5) The fake stable leaves $W^{s}_{i,\delta_{0}}(\omega,x)$ are contracting, i.e. for $y,z\in W^{s}_{i,\delta_{0}}(\omega,x)$ and each $0\leq k\leq i$, $d_{W^{s}_{i,\delta_{0}}(x)}(f^{k}_{\omega}(y),f^{k}_{\omega}(z))\leq C_{0}e^{-\lambda^{\prime}k}$.

Proof Sketch.

The claim about the existence and regularity of the fake stable manifolds in (1) essentially follows from the construction of the stable manifolds described in Section 5 or Proposition A.13, depending on taste. An integral curve to the $\widetilde{V}_{n}$ distribution has $C^{2}$ norm of order $e^{O(\epsilon)}$ and is almost tangent to the most expanded direction of $(Df^{n}_{\omega})^{-1}$, allowing us to apply those lemmas. Similarly, the final item in the proposition says that the dynamics on the fake stable manifolds is contracting. This also follows from the graph transform argument. Specifically one can produce this statement by a generalization of Step 1 in the proof of Proposition A.13, which studies the growth in length of curves in the Lyapunov charts.

The statement (2) saying that $T_{x}W^{s}_{i}$ is near to $E^{s}_{i}$ is immediate because $\|Df^{n}_{\omega}\|\geq Ce^{\lambda n}$ by assumption. Since $Df^{n}_{\omega}E^{s}_{i}$ and $\widetilde{V}_{n}$ are exponentially close, they are drawn closer still under $(Df^{n}_{\omega})^{-1}$.

The statements about Hölder-ness are standard facts; they follow from the same argument as in [BP07, Sec. 5.3] applied for only finitely many iterations. Alternatively, Lemma 10.2 contains an explicit computation showing that nearby points inherit a nearby splitting. The proof of that lemma does not rely on any of the claims from this section. We will not use (4), as everything we need for the main result of this paper follows from (1), (2), and (3), so we omit a detailed proof. The claim essentially follows from Hölder continuity of the stable distribution, Hölder continuity of the holonomies, which will be obtained in Proposition B.13, and Lemma B.1. Compare, for example, with [BP07, Sec. 8.1.5], which describes a similar argument. ∎

B.5. Rate of convergence of fake stable manifolds

Proposition B.12, proven in this section, is one of the key estimates in this paper, playing an important role in the local coupling procedure.

The crucial feature of the fake stable leaves is that the fluctuations in $W^{s}_{i}$ as we increase $i$ decay exponentially fast. In fact, we have a quantitative estimate that directly relates the speed of convergence of $W^{s}_{i}(\omega,x)$ with the hyperbolicity of $D_{x}f^{i}_{\omega}$.

In the following proposition, we will use an additional refinement of $(C,\lambda,\epsilon)$-tempered points that also requires that the stable direction points in a particular direction. The definition below is structured so that it is hopefully straightforward to think about. When a point is $(C,\lambda,\epsilon)$-tempered, there is a definite rate at which $E^{s}_{n}$ converges to $E^{s}$. Thus if $E^{s}_{n}$ happens to lie sufficiently far from the boundary of a cone $\mathcal{C}$ at a sufficiently large time $n_{1}$, then $E^{s}_{i}\in\mathcal{C}$ for all $i\geq n_{1}$.

Definition B.11.

Suppose $x\in M$ and $\mathcal{C}\subset T_{x}M$ is a cone. We say that a word $\omega$ is $(C,\lambda,\epsilon,\mathcal{C},n_{1},n_{2})$-tempered if for all $n_{1}\leq i\leq n_{2}$, $E^{s}_{i}$ is defined and lies in $\mathcal{C}$. We may also speak of being $(C,\lambda,\epsilon,\mathcal{C})$-tempered at a time $n$, in which case we mean $n_{1}=n_{2}=n$ in the previous sentence.

We now estimate how much the fake stable leaves fluctuate. The requirements on the cone are, strictly speaking, not necessary in order to state the proposition below: as long as $N$ is chosen sufficiently large, one can use $E^{s}_{N}(x)$ to define the cone $\mathcal{C}$ in the following proposition and obtain the same result.

Proposition B.12.

(Fluctuations in fake-stable leaves) Let $(f_{1},\ldots,f_{m})$ be a tuple in $\operatorname{Diff}^{2}_{\operatorname{vol}}(M)$ for a closed surface $M$. Fix $\lambda,C_{1},\theta_{0}>0$. Then there exists $\epsilon_{0}>0$ such that for all $0\leq\epsilon<\epsilon_{0}$ and $C>0$ there exist $D_{1},N,\delta_{0}>0$ such that for any $\delta\leq\delta_{0}$ the following holds. Given $x\in M$ and a cone $\mathcal{C}\subset T_{x}M$, extend $\mathcal{C}$ by parallel transport to a conefield $\mathcal{C}$ defined over $B_{2\delta}(x)$. Suppose that $\gamma$ is a $C$-good curve with $d(x,\gamma)<\delta$ that is $\theta_{0}$-transverse to $\mathcal{C}$. If $\omega$ is $(C_{1},\lambda,\epsilon,\mathcal{C},n,n+1)$-tempered with $n\geq N$, then

(B.8) $d_{\gamma}(W^{s}_{n}(x)\cap\gamma,\,W^{s}_{n+1}(x)\cap\gamma)\leq e^{-1.99\ln\|D_{x}f^{n}_{\omega}\|},$

where Wsn(x),Wsn+1(x)W^{s}_{n}(x),W^{s}_{n+1}(x) are the fake stable manifolds from Definition B.9.

The reason the proposition follows is evident in the case of a linear map. Consider the action of the map L=diag(σ,σ1)L=\operatorname{diag}(\sigma,\sigma^{-1}) on P1\mathbb{R}\operatorname{P}^{1} where σ>1\sigma>1. Note that the map LL has an attracting fixed point of multiplier σ2\sigma^{-2}, which suggests the asymptotic in the theorem. Consider what happens if we apply LL to two curves tangent at (0,0)(0,0) to the expanded direction of LL: the distance between them will contract by a factor of σ2\sigma^{-2}. The result for a sequence of maps will follow because the temperedness assures a uniform Es,EuE^{s},E^{u} splitting. When we work with this splitting, the full strength of the hyperbolicity will be available allowing us to recover almost e2lnDxfnωe^{-2\ln\|D_{x}f^{n}_{\omega}\|} contraction as in the theorem.
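The linear heuristic can be made explicit. The following is a sketch in the affine chart $t=y/x$ on $\mathbb{R}\operatorname{P}^{1}$; the coordinate $t$ is introduced here only for illustration.

```latex
% Projective action of L = diag(\sigma, \sigma^{-1}) in the chart t = y/x.
\[
  L\cdot[x:y]=[\sigma x:\sigma^{-1}y],\qquad
  t=\frac{y}{x}\;\longmapsto\;\frac{\sigma^{-1}y}{\sigma x}=\sigma^{-2}\,t .
\]
% The fixed point t = 0 (the expanded direction) has multiplier \sigma^{-2}.
% Iterating n times and using \|L^n\| = \sigma^n:
\[
  t\;\longmapsto\;\sigma^{-2n}\,t = e^{-2\ln\|L^{n}\|}\,t .
\]
```

In this model case the exponent is exactly $2$; the exponent $1.99$ in (B.8) reflects the small loss incurred in the nonlinear, tempered setting.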

The formal proof will rely on the study of the graph transform. The argument for this proposition is simpler than the argument in the recovery lemma since the curves we consider in this lemma are (by assumption) well positioned with respect to the stable and unstable splitting.

There are three steps in the proof. We have two curves at fnω(x)f^{n}_{\omega}(x), one corresponding to the time nn fake stable manifolds and one corresponding to the time n+1n+1 fake stable manifolds. In the first step, we iterate the graph transform until these curves look uniformly Lipschitz in the Lyapunov charts. In the second step, we iterate the graph transform to see that these two curves approach each other at the appropriate exponential rate. In the third step, we do some bookkeeping to conclude.

Proof.

Recall that, by definition, the fake stable manifold Wsn(x)W^{s}_{n}(x) is given by taking a curve γn\gamma_{n} tangent to the distribution V~n\widetilde{V}_{n} from Lemma B.8 and letting Wsn(x)W^{s}_{n}(x) equal (fnω)1(γn)(f^{n}_{\omega})^{-1}(\gamma_{n}) restricted to a segment of length δ0\delta_{0} about xx where δ0\delta_{0} is chosen as in Proposition B.10. Note that we need not take the δ0\delta_{0} in this proposition to be the same as the one in Proposition B.10. Indeed, at certain points in the analysis below it may be convenient to decrease δ0\delta_{0} in a way that depends only on the parameters of the proposition.

The proposition compares $(f^{n}_{\omega})^{-1}(\gamma_{n})$ and $(f^{n+1}_{\omega})^{-1}(\gamma_{n+1})$. As in previous sections, we will view both of these curves as graphs of functions from $E^{u}$ to $E^{s}$ in the Lyapunov charts. In this proof we will work with the splitting into stable and unstable subspaces associated to $Df^{n}_{\omega}$ rather than the one associated to $Df^{n+1}_{\omega}$. Recall that $E^{s}_{n}$ denotes the most contracted subspace for $Df^{n}_{\omega}$ and $E^{s}_{n+1}$ denotes the most contracted subspace for $Df^{n+1}_{\omega}$.

In the Lyapunov charts at fjω(x)f^{j}_{\omega}(x), we write (fnjσj(ω))1(γn)(f^{n-j}_{\sigma^{j}(\omega)})^{-1}(\gamma_{n}) as the graph of the function ϕ1j\phi^{1}_{j} and we write (fnj+1σj(ω))1(γn+1)(f^{n-j+1}_{\sigma^{j}(\omega)})^{-1}(\gamma_{n+1}) as the graph of ϕ2j(x)\phi^{2}_{j}(x). Let eΛe^{\Lambda} be an upper bound on Dfi\|Df_{i}\|, 1im1\leq i\leq m, with Λ>100\Lambda>100.

With respect to the Lyapunov metrics, we make choices similar to those in previous arguments, specifically Proposition A.15, and thereby obtain essentially identical intermediate estimates. View the sequence of maps $f_{\omega}^{j}$ as being reverse tempered starting at $f^{n+1}_{\omega}(x)$ and ending at $x$. So, set $\lambda^{\prime}=.9999\lambda$ and take the finite time Lyapunov metrics as in Lemma A.1 for this sequence. In particular, note that from the construction of the Lyapunov metrics, the $e^{O(\epsilon n)}$ bound on the $C^{2}$ norm of the curves $\gamma_{n}$ from Lemma B.8 and the $O\left(e^{O(-\epsilon n)}\right)$ bound on the angle that $\widetilde{V}_{n}$ makes with $V_{n}$ combine to show that there exist $C_{2},\nu>0$ such that $\|\phi^{1}_{n}\|_{1},\|\phi^{2}_{n}\|_{1}\leq C_{2}e^{\nu\epsilon n}$. We now proceed with the proof.

Step 1. (Lipschitzness) In this step, we will identify Nl(1O(ϵ))nN_{l}\approx(1-O(\epsilon))n such that for jNlj\leq N_{l}, ϕ1j\phi^{1}_{j} and ϕ2j\phi^{2}_{j} are C0C^{0} close.

To begin, we estimate how far apart $Df^{n}_{\omega}(E^{s}_{n})$ and $Df^{n}_{\omega}(E^{s}_{n+1})$ are. We claim that there exists $N_{0}$ such that for $n\geq N_{0}$, $\angle(Df^{n}_{\omega}(E^{s}_{n}),Df^{n}_{\omega}(E^{s}_{n+1}))\leq 1/4$. Choose $N_{0}$ sufficiently large that, for $n\geq N_{0}$, both $\|Df^{n}_{\omega}\|$ and $\|Df^{n+1}_{\omega}\|$ are at least $e^{10\Lambda}$ and $\angle(E^{s}_{n},E^{s}_{n+1})<1/100$; both of these follow from the $(C_{1},\lambda,\epsilon)$-temperedness (the latter claim is part of Proposition 4.6). As in previous computations, it follows that if $\angle(Df^{n}_{\omega}E^{s}_{n},Df^{n}_{\omega}E^{s}_{n+1})>1/4$, then $\|Df^{n}_{\omega}(E^{s}_{n+1})\|>2$ because $Df^{n}_{\omega}$ expands $E^{u}_{n}$ and contracts $E^{s}_{n}$. Consequently, $\|Df^{n+1}_{\omega}(E^{s}_{n+1})\|>2e^{-\Lambda}$. But this is not less than $e^{-10\Lambda}$, as the norm along the most contracted direction $E^{s}_{n+1}$ of $Df^{n+1}_{\omega}$ would have to be, so it is impossible that $\angle(Df^{n}_{\omega}(E^{s}_{n}),Df^{n}_{\omega}(E^{s}_{n+1}))>1/4$.

Note that in Proposition A.13, we considered smoothing estimates for a reverse tempered point. In the present setting, we may consider $x$ as a reverse tempered point for the sequence of maps $(f^{j}_{\sigma^{n-j}(\omega)})^{-1}$ beginning at $f^{n}_{\omega}(x)$. Consequently, we may read off the intermediate estimates from the proof of that proposition. In particular, as in equation (A.44), by possibly restricting the domain of $\phi^{1}_{j}$ and $\phi^{2}_{j}$ as in that proposition, it follows that there exists $C_{3}$ such that for $i\in\{1,2\}$,

$\|\phi^{i}_{n-j}\|_{1}\leq C_{3}e^{\nu\epsilon n}e^{-j\lambda}.$

In particular, if we let $N_{l}=\lfloor n-(\nu\epsilon/\lambda)n\rfloor$, then by our choice of $N_{l}$ there exists $C_{4}$ such that for $i\in\{1,2\}$ and $j\leq N_{l}$, $\|\phi^{i}_{j}\|_{1}\leq C_{4}$. Because both curves pass through $0$, the following estimate holds for all $N_{0}\leq j\leq N_{l}$:

(B.9) $\left|\phi^{1}_{j}(x)-\phi^{2}_{j}(x)\right|\leq 2C_{4}\left|x\right|,$

which is the desired estimate for this step in the proof.
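To see why this choice of $N_{l}$ gives a uniform bound, here is the arithmetic as a sketch. It reads $N_{l}$ as $\lfloor n-(\nu\epsilon/\lambda)n\rfloor$ and $\|\cdot\|_{1}$ as a $C^{1}$ norm, both of which are assumptions about the intended meaning; the index $m=n-j$ is introduced only for this illustration.

```latex
% Sketch: m = n - j is a hypothetical index used only here.
\[
  m\le N_l\le n-\frac{\nu\epsilon}{\lambda}\,n
  \;\Longrightarrow\;
  j=n-m\ge\frac{\nu\epsilon}{\lambda}\,n
  \;\Longrightarrow\;
  \|\phi^{i}_{m}\|_{1}\le C_{3}e^{\nu\epsilon n}e^{-j\lambda}\le C_{3}.
\]
% Both graphs pass through 0, so a derivative bound C_4 gives
\[
  |\phi^{1}_{m}(x)-\phi^{2}_{m}(x)|
  \le|\phi^{1}_{m}(x)-\phi^{1}_{m}(0)|+|\phi^{2}_{m}(x)-\phi^{2}_{m}(0)|
  \le 2C_{4}|x| .
\]
```

Under these readings one may take $C_{4}=C_{3}$, up to the domain restrictions mentioned above.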

Step 2. (Contraction) In this step, we study how fast the curves ϕ1j\phi^{1}_{j} and ϕ2j\phi^{2}_{j} attract as we apply the dynamics (fσj(ω))1(f_{\sigma^{j}(\omega)})^{-1}. Our goal is to show that the C0C^{0} distance between these functions is rapidly decreasing, which is the content of (B.14).

First, in the Lyapunov chart we have

(B.10) $\hat{f}_{\sigma^{j}(\omega)}^{-1}=(e^{\sigma_{j,1}}x+\hat{f}_{j,1}(x,y),\,e^{\sigma_{j,2}}y+\hat{f}_{j,2}(x,y)),$

where $\min\{\sigma_{j,1},-\sigma_{j,2}\}\geq.999\lambda$. Then in the Lyapunov charts, the differential is

(B.11) $D\hat{f}^{-1}_{\sigma^{j}(\omega)}=\begin{bmatrix}e^{\sigma_{j,1}}+\partial_{x}\hat{f}_{j,1}&\partial_{y}\hat{f}_{j,1}\\ \partial_{x}\hat{f}_{j,2}&e^{\sigma_{j,2}}+\partial_{y}\hat{f}_{j,2}\end{bmatrix}.$

In addition, write

(B.12) $\Lambda_{j}=\sum_{i=j}^{N_{l}}\left(\sigma_{i,1}-\sigma_{i,2}\right).$

As in Proposition A.13, we have a C2C^{2} estimate in the Lyapunov charts. There exists C5>0C_{5}>0 such that

(B.13) (f^σi(ω))1C2C5e6C1e6iϵ.\|(\hat{f}_{\sigma^{i}(\omega)})^{-1}\|_{C^{2}}\leq C_{5}e^{6C_{1}}e^{6i\epsilon}.

We will now verify inductively a strengthening of (B.9). Namely, we show that, after possibly increasing $N_{0}$ (which is fixed and does not depend on $n$), for all $\left|x\right|<e^{-(\lambda/2)j}$ and $N_{0}\leq j<N_{l}$,

(B.14) |ϕ1j(x)ϕ2j(x)|C4e1.999Λj|x|.\left|\phi^{1}_{j}(x)-\phi^{2}_{j}(x)\right|\leq C_{4}e^{-1.999\Lambda_{j}}\left|x\right|.

To show (B.14), we measure the distance between ϕ1j\phi^{1}_{j} and ϕ2j\phi^{2}_{j} using a piece of the vertical curve V(t)V(t) parallel to EsE^{s} between ϕ1j+1(x)\phi^{1}_{j+1}(x) and ϕ2j+1(x)\phi^{2}_{j+1}(x). We then apply (fσj(ω))1(f_{\sigma^{j}(\omega)})^{-1} to the curve and estimate its length. We then use the Lipschitzness of ϕ1j\phi^{1}_{j} and ϕ2j\phi^{2}_{j} to obtain (B.14). Let V(t)V(t) be a vertical curve (parallel to EsE^{s}) defined on [1,1][-1,1] taking values in the Lyapunov charts such that V(1)ϕj+11V(-1)\in\phi_{j+1}^{1} and V(1)ϕj+12V(1)\in\phi_{j+1}^{2} passing through the point (x,0)(x,0). Then from the inductive hypothesis, we see that len(V)C4e1.999Λj|x|\operatorname{len}(V)\leq C_{4}e^{-1.999\Lambda_{j}}\left|x\right|.

By applying the differential to VV, we see by (B.11), (f^σj(ω))1(V)(\hat{f}_{\sigma^{j}(\omega)})^{-1}(V) is tangent to a vector of the form

(B.15) t((f^σj(ω))1V(t))=[yf^j,1eσj,2+yf^j,2].\partial_{t}((\hat{f}_{\sigma^{j}(\omega)})^{-1}V(t))=\begin{bmatrix}\partial_{y}\hat{f}_{j,1}\\ e^{\sigma_{j,2}}+\partial_{y}\hat{f}_{j,2}\end{bmatrix}.

In particular, for C5C_{5} as before if we are restricted to a ball of radius C51e(λ/2)jC_{5}^{-1}e^{-(\lambda/2)j}, then as the C2C^{2} norm of (f^σj(ω))1(\hat{f}_{\sigma^{j}(\omega)})^{-1} is O(e6jϵ)O(e^{6j\epsilon}), it follows that

(B.16) |yf^j,i|<e(λ/4)j\left|\partial_{y}\hat{f}_{j,i}\right|<e^{-(\lambda/4)j}

for i{1,2}i\in\{1,2\}. Let πu\pi_{u} be the projection onto the EuE^{u} direction in the Lyapunov coordinates and let πs\pi_{s} be the projection onto the EsE^{s} direction in the Lyapunov coordinates. We see that there exists C6C_{6} such that:

(B.17) $\left|\pi_{s}((\hat{f}_{\sigma^{j}(\omega)})^{-1}V(-1))-\pi_{s}((\hat{f}_{\sigma^{j}(\omega)})^{-1}V(1))\right|\leq C_{4}e^{-1.999\Lambda_{j}}e^{(1-\epsilon_{j})\sigma_{j,2}}\left|x\right|,$

where $\left|\epsilon_{j}\right|\leq C_{6}e^{-(\lambda/4)j}$.

We now use (B.17) to estimate the $C^{0}$ distance between $\phi^{1}_{j}$ and $\phi^{2}_{j}$, rather than just the distance between two points along these curves. The endpoints of $(\hat{f}_{\sigma^{j}(\omega)})^{-1}V(t)$ lie in $\phi^{1}_{j}$ and $\phi^{2}_{j}$. Note that when $(\hat{f}_{\sigma^{j}(\omega)})^{-1}V$ is viewed as a graph over the vertical line parallel to $E^{s}$ through $\pi_{u}(\hat{f}_{\sigma^{j}\omega})^{-1}(x,0)$, it lies at distance at most $e^{-(\lambda/4)j}\operatorname{len}(V)$ from that vertical line by (B.15) and (B.16). Thus as $\phi^{1}_{j}$ and $\phi^{2}_{j}$ are both $C_{4}$-Lipschitz for $N_{0}\leq j\leq N_{l}$, we see that

(B.18) $\begin{aligned}\left|\phi^{1}_{j}(\pi_{u}(\hat{f}_{\sigma^{j}\omega})^{-1}(x,0))-\phi^{2}_{j}(\pi_{u}(\hat{f}_{\sigma^{j}\omega})^{-1}(x,0))\right|&<C_{4}e^{-1.999\Lambda_{j}}e^{(1-\epsilon_{j})\sigma_{j,2}}\left|x\right|+C_{4}e^{-(\lambda/4)j}\operatorname{len}(V)\\ &\leq C_{4}\left(e^{(1-\epsilon_{j})\sigma_{j,2}}+C_{4}e^{-(\lambda/4)j}\right)e^{-1.999\Lambda_{j}}\left|x\right|.\end{aligned}$

As long as N0N_{0} is sufficiently large, for jN0j\geq N_{0},

(B.19) $\left|x\right|\leq e^{-(1-\epsilon_{j})\sigma_{j,1}}\left|\pi_{u}(\hat{f}_{\sigma^{j}(\omega)})^{-1}(x,0)\right|.$

Note that if jj is larger than some fixed N0N_{0} and ϵj\epsilon_{j} is sufficiently small relative to λ\lambda, then

(B.20) $\left(e^{(1-\epsilon_{j})\sigma_{j,2}}+C_{4}e^{-(\lambda/4)j}\right)e^{-(1-\epsilon_{j})\sigma_{j,1}}\leq e^{1.999(\sigma_{j,2}-\sigma_{j,1})}.$

Combining (B.18), (B.19), and (B.20), we get |ϕ1j(x)ϕ2j(x)|C4e1.999Λj1,\displaystyle\left|\phi^{1}_{j}(x)-\phi^{2}_{j}(x)\right|\leq C_{4}e^{-{1.999}\Lambda_{j-1}}, as required.

Step 3. (Bookkeeping and Conclusion) So far, we have obtained that for some N0N_{0} and C4C_{4} depending only on the constants in the theorem

|ϕ1N0(x)ϕ2N0(x)|C4e1.999ΛN0\left|\phi^{1}_{N_{0}}(x)-\phi^{2}_{N_{0}}(x)\right|\leq C_{4}e^{-1.999\Lambda_{N_{0}}}

Thus as ϕ10\phi^{1}_{0} and ϕ20\phi^{2}_{0} are related to ϕ1N0\phi^{1}_{N_{0}} and ϕ2N0\phi^{2}_{N_{0}} by applying only the fixed number N0N_{0} more maps, we see that there exists C7C_{7} and δ2>0\delta_{2}>0 such that on a ball of radius δ2\delta_{2} in the Lyapunov charts at xx:

|ϕ10(x)ϕ20(x)|C7e1.999ΛN0.\left|\phi^{1}_{0}(x)-\phi^{2}_{0}(x)\right|\leq C_{7}e^{-1.999\Lambda_{N_{0}}}.

Consider a nearby CC-good curve γ\gamma that is θ0\theta_{0}-transverse to 𝒞\mathcal{C} and hence to EsE^{s}, ϕ10\phi^{1}_{0}, and ϕ20\phi^{2}_{0}. It then follows easily from transversality, that as ϕ10\phi^{1}_{0} is nearly tangent to EsE^{s} by Proposition B.10(2) and ϕ10,ϕ20\phi^{1}_{0},\phi^{2}_{0} are uniformly Lipschitz, there exists C8C_{8} such that

dγ(ϕ10γ,ϕ20γ)C8e1.999ΛN0.d_{\gamma}(\phi^{1}_{0}\cap\gamma,\phi^{2}_{0}\cap\gamma)\leq C_{8}e^{-1.999\Lambda_{N_{0}}}.

The only remaining thing we need is to know that ΛN0\Lambda_{N_{0}} is within a factor of .001Λ.001\Lambda of lnDfnω\ln\|Df^{n}_{\omega}\|. This will follow as long as we take ϵ\epsilon sufficiently small relative to λ,ν1,ν2\lambda,\nu_{1},\nu_{2} and the maximum of the norm of the differentials of f1,,fmf_{1},\ldots,f_{m}. We omit the computation of exactly how small ϵ\epsilon must be. Such sufficiently small ϵ\epsilon exists because when we look in the Lyapunov charts, we obtain the straightforward bound that there exists C9C_{9} such that

lnDfnωC9+4ϵn+σj,1.\ln\|Df^{n}_{\omega}\|\leq C_{9}+4\epsilon n+\sum\sigma_{j,1}.

But $\Lambda_{N_{0}}$ includes only the hyperbolicity for the iterates $N_{0}\leq j\leq N_{l}$. From volume preservation of the $f_{i}$, it similarly follows that $\ln\|Df^{n}_{\omega}\|\leq C_{10}+4\epsilon n-\sum\sigma_{j,2}$ for some $C_{10}$. As $N_{l}=(1-O(\epsilon))n$ and $N_{0}$ is fixed independently of $n$, it follows that for sufficiently small $\epsilon$ and sufficiently large $n$ we have $e^{1.99\ln\|Df^{n}_{\omega}\|}\leq e^{1.999\Lambda_{N_{0}}}$, which is the needed conclusion. ∎

B.6. Jacobian of the fake stable holonomies

Now that we have defined the fake stable manifolds and have an estimate for the rate at which their holonomies converge, we study the Jacobians of these holonomies and their fluctuations. The properties of these Jacobians are crucial in the coupling argument.

Proposition B.13.

Suppose that (f1,,fm)(f_{1},\ldots,f_{m}) is a tuple of diffeomorphisms in Diff2vol(M)\operatorname{Diff}^{2}_{\operatorname{vol}}(M) for a closed surface MM. For λ>0\lambda>0 there exists ϵ0>0\epsilon_{0}>0 such that for all 0ϵϵ00\leq\epsilon\leq\epsilon_{0} and C>0C>0, there exists NN\in\mathbb{N} and δ,η,α>0\delta,\eta,\alpha>0 such that for any nNn\geq N, and any ωΣ\omega\in\Sigma, if Λnω\Lambda_{n}^{\omega} is the set of (C,λ,ϵ)(C,\lambda,\epsilon)-tempered points up to time nn then for any ball BδMB_{\delta}\subseteq M of radius δ\delta, the following holds for xΛωnBδx\in\Lambda^{\omega}_{n}\cap B_{\delta}.

For any two uniform transversals T1T_{1} and T2T_{2} to the WNsW_{N}^{s} laminations of Bδ(x)B_{\delta}(x), T1T_{1} and T2T_{2} will be uniform transversals to the WisW_{i}^{s} lamination for NinN\leq i\leq n. Where defined, consider the holonomies HsiH^{s}_{i} between T1T_{1} and T2T_{2} and moreover the Jacobian JacHsi\operatorname{Jac}H^{s}_{i}, which is defined on a subset of T1T_{1}. Then we have the following for all NinN\leq i\leq n:

(1) The Jacobians of the holonomies between uniform transversals are uniformly α\alpha-Hölder and bounded away from zero. In particular, this implies that these Jacobians are uniformly log-α\alpha-Hölder between uniform transversals. Specifically, for fixed (C1,δ1)(C_{1},\delta_{1}), there exist D1,D2,D3D_{1},D_{2},D_{3} such that if γ1\gamma_{1} and γ2\gamma_{2} are a (C1,δ1)(C_{1},\delta_{1})-configuration in the sense of Definition 7.8 with γ1\gamma_{1} and γ2\gamma_{2} uniformly transverse to the EsN(x)E^{s}_{N}(x) extended by parallel transport in a small neighborhood, and IΛωnI\subseteq\Lambda^{\omega}_{n} is a subset of γ1\gamma_{1} then, for x,yIx,y\in I,

(B.21) |logJacHsn(x)logJacHsn(y)|D1dγ1(x,y)α.\left|\log\operatorname{Jac}H^{s}_{n}(x)-\log\operatorname{Jac}H^{s}_{n}(y)\right|\leq D_{1}d_{\gamma_{1}}(x,y)^{\alpha}.

(2) The Jacobians from item (1) converge exponentially quickly, i.e.

(B.22) |JacHsi1JacHsi|D2eηi,\left|\operatorname{Jac}H^{s}_{i-1}-\operatorname{Jac}H^{s}_{i}\right|\leq D_{2}e^{-\eta i},

and

(B.23) |JacHsiJacHsi11|D3eηi.\left|\frac{\operatorname{Jac}H^{s}_{i}}{\operatorname{Jac}H^{s}_{i-1}}-1\right|\leq D_{3}e^{-\eta i}.

(3) The true stable holonomy restricted to ΛωT1\Lambda^{\omega}_{\infty}\cap T_{1} is absolutely continuous. The Jacobian of the fake stable holonomies converges to the Jacobian of the true stable holonomies restricted to the set ΛωT1\Lambda^{\omega}_{\infty}\cap T_{1}. Namely, for almost every point of this intersection, JacHsnJacHs\operatorname{Jac}H^{s}_{n}\to\operatorname{Jac}H^{s}, this convergence is uniform, and the limit is uniformly Hölder and bounded away from zero.

Proof.

Part 1. (Formula for Jacobian) We begin by exhibiting a formula for the Jacobian of the stable holonomies. This may be compared with [BP07, Sec. 8.6.4], which uses a similar formula though analyzes it differently. Suppose that T1T_{1} and T2T_{2} are the two transversals we are considering as in the statement of the proposition. Then write Πis\Pi_{i}^{s} for the holonomy along fiω(Wsi)=W~isf^{i}_{\omega}(W^{s}_{i})=\widetilde{W}_{i}^{s}, the smooth integral curves to V~i\widetilde{V}_{i} we used when defining the fake stable foliation. Then we have the following formula for the Jacobian of HsiH^{s}_{i}:

(B.24) $\operatorname{Jac}(H^{s}_{i})(y)=\prod_{k=0}^{i-1}\frac{\operatorname{Jac}\left(D(f_{\sigma^{k}\omega})^{-1}|T_{f^{k}_{\omega}H^{s}_{i}(y)}f^{k}_{\omega}(T_{2})\right)}{\operatorname{Jac}\left(D(f_{\sigma^{k}\omega})^{-1}|T_{f^{k}_{\omega}(y)}f^{k}_{\omega}(T_{1})\right)}\,\operatorname{Jac}(\Pi^{s}_{i}(y)).$

For finite time this formula is evident because all of the foliations we are considering are smooth: it is just the change of variables formula.
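For orientation, the change of variables behind (B.24) can be sketched as follows. The restriction notation $f^{i}_{\omega}|_{T}$ and the placement of base points are schematic choices made here, so the indexing need not match (B.24) term by term.

```latex
% Schematic: f^i_\omega carries leaves of W^s_i to leaves of \widetilde W^s_i, so
\[
  H^{s}_{i}
  =\bigl(f^{i}_{\omega}|_{T_{2}}\bigr)^{-1}\circ\Pi^{s}_{i}\circ f^{i}_{\omega}|_{T_{1}} .
\]
% Chain rule for the one-dimensional Jacobians along the curves:
\[
  \operatorname{Jac}(H^{s}_{i})(y)
  =\frac{\operatorname{Jac}\bigl(f^{i}_{\omega}|_{T_{1}}\bigr)(y)}
        {\operatorname{Jac}\bigl(f^{i}_{\omega}|_{T_{2}}\bigr)(H^{s}_{i}(y))}
   \cdot\operatorname{Jac}(\Pi^{s}_{i})\bigl(f^{i}_{\omega}(y)\bigr).
\]
```

Expanding each $\operatorname{Jac}(f^{i}_{\omega}|_{T})$ as a product of $i$ single-step Jacobians along the orbit, and rewriting the single-step Jacobians of the forward maps as reciprocals of those of the inverse maps, yields a product of ratios of the shape appearing in (B.24).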

Part 2. (Exponential convergence) Applying Lemma B.1 we will obtain Hölder continuity for the Jacobians once we know that Jac(Hsi)\operatorname{Jac}(H^{s}_{i}) is converging exponentially fast.

To see that (B.24) converges exponentially quickly, two estimates are needed.

(1) The first is showing that for some $C_{1},\delta_{1}>0$

(B.25) |Jac(Πsn)1|C1enδ1.\left|\operatorname{Jac}(\Pi^{s}_{n})-1\right|\leq C_{1}e^{-n\delta_{1}}.

This is the Jacobian of the foliation holonomy of W~sn\widetilde{W}^{s}_{n}. The foliation holonomy is between two transversals that are distance e(λ/2)ne^{-(\lambda/2)n} apart. By working in Lyapunov charts, it is straightforward to see that the fnω(T1)f^{n}_{\omega}(T_{1}) and fnω(T2)f^{n}_{\omega}(T_{2}) make angle at least CeϵnCe^{-\epsilon n} with W~sn\widetilde{W}^{s}_{n}. As W~sn\widetilde{W}^{s}_{n} itself has C2C^{2} norm at most eO(ϵn)e^{O(\epsilon n)} from Lemma B.8, it is easy to see that there exists some C1,δ1>0C_{1},\delta_{1}>0 such that (B.25) holds.

(2) Next we estimate the rate of convergence of:

(B.26) $\prod_{k=0}^{i-1}\frac{\operatorname{Jac}\left(D(f_{\sigma^{k}\omega})^{-1}|T_{f^{k}_{\omega}H^{s}_{i}(y)}f^{k}_{\omega}(T_{2})\right)}{\operatorname{Jac}\left(D(f_{\sigma^{k}\omega})^{-1}|T_{f^{k}_{\omega}(y)}f^{k}_{\omega}(T_{1})\right)}=\exp\left(\sum_{k=0}^{i-1}P(k,i)\right),$

where P(k,i)P(k,i) is the logarithm of the kkth term of the product.

We claim that there exist C2,δ2,N2C_{2},\delta_{2},N_{2}, such that for iN2i\geq N_{2} and k0k\geq 0,

(B.27) $\left|\frac{\operatorname{Jac}\left(D(f_{\sigma^{k}\omega})^{-1}|T_{f^{k}_{\omega}H^{s}_{i}(y)}f^{k}_{\omega}(T_{2})\right)}{\operatorname{Jac}\left(D(f_{\sigma^{k}\omega})^{-1}|T_{f^{k}_{\omega}(y)}f^{k}_{\omega}(T_{1})\right)}-1\right|\leq C_{2}e^{-\delta_{2}k}.$

We will not give a detailed proof of this estimate because it is standard. The key claim is that if $V_{1}$ and $V_{2}$ are the tangent vectors to $\gamma_{1}$ and $\gamma_{2}$ at $y$ and $H^{s}_{i}(y)$, respectively, then there exist a uniform constant $C_{2}^{\prime}$ and $\varpi>0$ such that when we identify $Df^{k}_{\omega}V_{1}$ and $Df^{k}_{\omega}V_{2}$ by parallel transport along the distance minimizing geodesic between their basepoints, then

(B.28) d(DfkωV1,DfkωV2)C2ekϖ.d(Df^{k}_{\omega}V_{1},Df^{k}_{\omega}V_{2})\leq C_{2}^{\prime}e^{-k\varpi}.

One can deduce this in a very similar way to the argument for [Mn87, Lem. III.3.7], which inductively checks that as one applies more iterates of the dynamics that these two vectors attract exponentially quickly by using that the basepoints of the vectors do as well; this argument is similar to the proof of our Proposition 10.3. Once (B.28) is known, then it is straightforward to conclude (B.27) because the Jacobian of a diffeomorphism f:MMf\colon M\!\!\to\!\!M restricted to a curve γM\gamma\!\subset\!M depends Hölder continuously on the direction of γ˙\dot{\gamma}.

(B.27) shows that the product (B.26) is uniformly bounded. It then suffices to estimate:

k=0i1P(k,i)k=0iP(k,i+1).\sum_{k=0}^{i-1}P(k,i)-\sum_{k=0}^{i}P(k,i+1).

We will pick some 0<θ<10<\theta<1, and split this sum as follows:

k=0θi(P(k,i)P(k,i+1))+[kθiiP(k,i)kθiiP(k,i+1)]=I+II.\sum_{k=0}^{\theta i}\left(P(k,i)-P(k,i+1)\right)+\left[\sum_{k\geq\theta i}^{i}P(k,i)-\sum_{k\geq\theta i}^{i}P(k,i+1)\right]=I+II.

For any such $\theta$, it follows from (B.27) that there exist $C_{3},\delta_{3}>0$ such that $\left|II\right|\leq C_{3}e^{-\delta_{3}i}$. Thus to conclude we need only bound term $I$. From Proposition B.12 and the temperedness, we know that there exist $C_{4},\delta_{4}$ such that

(B.29) dT2(Hsi(y),Hsi+1(y))C4eδ4i.d_{T_{2}}(H^{s}_{i}(y),H^{s}_{i+1}(y))\leq C_{4}e^{-\delta_{4}i}.

It is straightforward to see that there exist $\beta,\beta_{1}>0$ such that the function

$\frac{\operatorname{Jac}\left(D(f_{\sigma^{k}\omega})^{-1}|T_{f^{k}_{\omega}H^{s}_{i}(y)}f^{k}_{\omega}(T_{2})\right)}{\operatorname{Jac}\left(D(f_{\sigma^{k}\omega})^{-1}|T_{f^{k}_{\omega}(y)}f^{k}_{\omega}(T_{1})\right)}$

viewed as a function of $H^{s}_{i}(y)$ is $\beta$-Hölder with Hölder constant at most $e^{\beta_{1}k}$ for all $k\leq i$. Thus by combining (B.29) with the Hölder continuity, we see that $\left|P(k,i)-P(k,i+1)\right|$ is bounded by a uniform constant multiple of $e^{\beta_{1}k}e^{-\beta\delta_{4}i}$. Thus as long as $\theta<\beta\delta_{4}/\beta_{1}$, we see that there exist $C_{5},\delta_{5}$ such that

|I|C5eδ5i.\left|I\right|\leq C_{5}e^{-\delta_{5}i}.

Combining the estimates on $I$ and $II$ implies that there exist $C_{6},\delta_{6}$ so that (B.26) is converging exponentially fast, as desired.
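Both bounds are geometric-series estimates. In the sketch below, $A,a,b>0$ are hypothetical constants with $\left|P(k,i)-P(k,i+1)\right|\leq Ae^{ak}e^{-bi}$ for $k\leq\theta i$, standing in for the Hölder data above, and $i$ is assumed large enough that $C_{2}e^{-\delta_{2}k}\leq 1/2$ for all $k\geq\theta i$.

```latex
% Term II: |\log(1+u)| \le 2|u| for |u| \le 1/2, so by (B.27)
% |P(k,i)| and |P(k,i+1)| are each at most 2 C_2 e^{-\delta_2 k}.
\[
  |II|\le\sum_{k\ge\theta i}\bigl(|P(k,i)|+|P(k,i+1)|\bigr)
  \le 4C_{2}\sum_{k\ge\theta i}e^{-\delta_{2}k}
  \le\frac{4C_{2}}{1-e^{-\delta_{2}}}\,e^{-\delta_{2}\theta i}.
\]
% Term I: A, a, b are hypothetical constants (see the lead-in).
\[
  |I|\le\sum_{k=0}^{\lfloor\theta i\rfloor}A e^{ak}e^{-bi}
  \le\frac{A}{1-e^{-a}}\,e^{-(b-a\theta)i}.
\]
```

The second bound decays exponentially in $i$ when $\theta<b/a$, which is the reason $\theta$ must be chosen small relative to the ratio of the two rates.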

Thus we see that the Jacobian of the holonomies converges exponentially fast pointwise and is uniformly positive. Thus we have concluded (2) of the statement of the proposition.

Part 3. (Uniform Hölderness) We now apply Lemma B.1. We have just shown that the Jacobian of the holonomies is converging exponentially fast, and certainly the Hölder norm of the terms is growing at most exponentially fast as well as it is the composition of diffeomorphisms along with a holonomy, whose Hölder norm is also growing at most exponentially fast. Thus we conclude (1) above.

Part 4. The final claim (3) about the holonomies is fairly standard. The following lemma implies the conclusion:

Lemma B.14.

Let $\gamma_{1}$ and $\gamma_{2}$ be two curves with finite Lebesgue measure and for $n\in\mathbb{N}$ let $\Omega_{n}\subseteq\gamma_{1}$ be a decreasing sequence of subsets, each of which is a union of intervals. Suppose that $K:=\bigcap_{n\in\mathbb{N}}\Omega_{n}$ is compact. Let $\phi_{n}\colon\Omega_{n}\to\gamma_{2}$ be a sequence of absolutely continuous maps with uniformly continuous, equicontinuous Jacobians $J_{n}$. If $(\phi_{n})$ converges uniformly to an injective map $\phi\colon K\to\gamma_{2}$, and $J_{n}|_{K}$ converges uniformly to an integrable function $J\colon K\to\mathbb{R}$, then $\phi$ is absolutely continuous with Jacobian $J$.

We will not include a proof of the above lemma since it is a variant of a lemma in Mañé [Mn87, Thm. 3.3] and the proof of [Mn87] can be modified to obtain a proof of this lemma. ∎

References

  • [ABR22] José F. Alves, Wael Bahsoun, and Marks Ruziboev, Almost sure rates of mixing for partially hyperbolic attractors, J. Differential Equations 311 (2022), 98–157. MR 4354854
  • [ABRV23] José F. Alves, Wael Bahsoun, Marks Ruziboev, and Paulo Varandas, Quenched decay of correlations for nonuniformly hyperbolic random maps with an ergodic driving system, Nonlinearity 36 (2023), no. 6, 3294–3318. MR 4588339
  • [AGT06] Artur Avila, Sébastien Gouëzel, and Masato Tsujii, Smoothness of solenoidal attractors, Discrete Contin. Dyn. Syst. 15 (2006), no. 1, 21–35. MR 2191383
  • [Arn98] Ludwig Arnold, Random dynamical systems, Springer Monographs in Mathematics, Springer-Verlag, Berlin, 1998. MR 1723992
  • [AV10] Artur Avila and Marcelo Viana, Extremal Lyapunov exponents: an invariance principle and applications, Invent. Math. 181 (2010), no. 1, 115–189. MR 2651382
  • [Bal00] Viviane Baladi, Positive transfer operators and decay of correlations, Advanced Series in Nonlinear Dynamics, vol. 16, World Scientific Publishing Co., Inc., River Edge, NJ, 2000. MR 1793194
  • [BC91] Michael Benedicks and Lennart Carleson, The dynamics of the Hénon map, Ann. of Math. (2) 133 (1991), no. 1, 73–169. MR 1087346
  • [BCS22] Jérôme Buzzi, Sylvain Crovisier, and Omri Sarig, Measures of maximal entropy for surface diffeomorphisms, Ann. of Math. (2) 195 (2022), no. 2, 421–508. MR 4387233
  • [BCS23] by same author, On the existence of SRB measures for CC^{\infty} surface diffeomorphisms, Int. Math. Res. Not. IMRN (2023), no. 24, 20812–20826. MR 4681273
  • [BCZG23] Alex Blumenthal, Michele Coti Zelati, and Rishabh S. Gvalani, Exponential mixing for random dynamical systems and an example of Pierrehumbert, Ann. Probab. 51 (2023), no. 4, 1559–1601. MR 4597327
  • [BG20] Michael Björklund and Alexander Gorodnik, Central limit theorems for group actions which are exponentially mixing of all orders, J. Anal. Math. 141 (2020), no. 2, 457–482. MR 4179768
  • [BL85] Philippe Bougerol and Jean Lacroix, Products of random matrices with applications to Schrödinger operators, Progress in Probability and Statistics, vol. 8, Birkhäuser Boston, Inc., Boston, MA, 1985. MR 886674
  • [BO21] Snir Ben Ovadia, Hyperbolic SRB measures and the leaf condition, Comm. Math. Phys. 387 (2021), no. 3, 1353–1404. MR 4324380
  • [Bow75] Rufus Bowen, Equilibrium states and the ergodic theory of Anosov diffeomorphisms, Lecture Notes in Mathematics, vol. 470, Springer-Verlag, Berlin-New York, 1975. MR 442989
  • [BP07] Luis Barreira and Yakov Pesin, Nonuniform hyperbolicity, Encyclopedia of Mathematics and its Applications, vol. 115, Cambridge University Press, Cambridge, 2007, Dynamics of systems with nonzero Lyapunov exponents. MR 2348606
  • [BRH17] Aaron Brown and Federico Rodriguez Hertz, Measure rigidity for random dynamics on surfaces and related skew products, J. Amer. Math. Soc. 30 (2017), no. 4, 1055–1132. MR 3671937
  • [Bur24] David Burguet, SRB measures for CC^{\infty} surface diffeomorphisms, Invent. Math. 235 (2024), no. 3, 1019–1062. MR 4701884
  • [BW10] Keith Burns and Amie Wilkinson, On the ergodicity of partially hyperbolic systems, Ann. of Math. (2) 171 (2010), no. 1, 451–489. MR 2630044
  • [BXY17] Alex Blumenthal, Jinxin Xue, and Lai-Sang Young, Lyapunov exponents for random perturbations of some area-preserving maps including the standard map, Ann. of Math. (2) 185 (2017), no. 1, 285–310. MR 3583355
  • [BXY18] by same author, Lyapunov exponents and correlation decay for random perturbations of some prototypical 2D maps, Comm. Math. Phys. 359 (2018), no. 1, 347–373. MR 3781453
  • [CE80] Pierre Collet and Jean-Pierre Eckmann, Iterated maps on the interval as dynamical systems, Progress in Physics, vol. 1, Birkhäuser, Boston, MA, 1980. MR 613981
  • [Che06] N. Chernov, Advanced statistical properties of dispersing billiards, J. Stat. Phys. 122 (2006), no. 6, 1061–1094. MR 2219528
  • [Chu20] Ping Ngai Chung, Stationary measures and orbit closures of uniformly expanding random dynamical systems on surfaces, 2020, https://arxiv.org/abs/2006.03166.
  • [CL22] Roberto Castorrini and Carlangelo Liverani, Quantitative statistical properties of two-dimensional partially hyperbolic systems, Adv. Math. 409 (2022), Paper No. 108625, 122. MR 4469072
  • [CLP22] Vaughn Climenhaga, Stefano Luzzatto, and Yakov Pesin, SRB measures and Young towers for surface diffeomorphisms, Ann. Henri Poincaré 23 (2022), no. 3, 973–1059. MR 4396671
  • [CM06] Nikolai Chernov and Roberto Markarian, Chaotic billiards, Mathematical Surveys and Monographs, vol. 127, American Mathematical Society, Providence, RI, 2006. MR 2229799
  • [CV13] A. Castro and P. Varandas, Equilibrium states for non-uniformly expanding maps: decay of correlations and strong stability, Ann. Inst. H. Poincaré C Anal. Non Linéaire 30 (2013), no. 2, 225–249. MR 3035975
  • [dCJ02] Augusto Armando de Castro Júnior, Backward inducing and exponential decay of correlations for partially hyperbolic attractors, Israel J. Math. 130 (2002), 29–75. MR 1919371
  • [DeW24] Jonathan DeWitt, Simultaneous linearization of diffeomorphisms of isotropic manifolds, J. Eur. Math. Soc. (JEMS) 26 (2024), no. 8, 2897–2969. MR 4756948
  • [DFL22] Dmitry Dolgopyat, Bassam Fayad, and Sixu Liu, Multiple Borel-Cantelli lemma in dynamics and multilog law for recurrence, J. Mod. Dyn. 18 (2022), 209–289. MR 4447598
  • [DK07] Dmitry Dolgopyat and Raphaël Krikorian, On simultaneous linearization of diffeomorphisms of the sphere, Duke Math. J. 136 (2007), no. 3, 475–505. MR 2309172
  • [DKK04] Dmitry Dolgopyat, Vadim Kaloshin, and Leonid Koralov, Sample path properties of the stochastic flows, Ann. Probab. 32 (2004), no. 1A, 1–27. MR 2040774
  • [DKRH24] D. Dolgopyat, A. Kanigowski, and F. Rodriguez Hertz, Exponential mixing implies Bernoulli, Ann. of Math. (2) 199 (2024), no. 3, 1225–1292. MR 4740539
  • [DL23] Mark F. Demers and Carlangelo Liverani, Projective cones for sequential dispersing billiards, Comm. Math. Phys. 401 (2023), no. 1, 841–923. MR 4604909
  • [Dol00] Dmitry Dolgopyat, On dynamics of mostly contracting diffeomorphisms, Comm. Math. Phys. 213 (2000), no. 1, 181–201. MR 1782146
  • [EL] Alex Eskin and Elon Lindenstrauss, Random walks on locally homogeneous spaces.
  • [ES23] Rosemary Elliott Smith, Uniformly expanding random walks on manifolds, Nonlinearity 36 (2023), no. 11, 5955–5972. MR 4656974
  • [FKS13] David Fisher, Boris Kalinin, and Ralf Spatzier, Global rigidity of higher rank Anosov actions on tori and nilmanifolds, J. Amer. Math. Soc. 26 (2013), no. 1, 167–198, With an appendix by James F. Davis. MR 2983009
  • [Gal10] Stefano Galatolo, Hitting time in regular sets and logarithm law for rapidly mixing dynamical systems, Proc. Amer. Math. Soc. 138 (2010), no. 7, 2477–2487. MR 2607877
  • [GL06] Sébastien Gouëzel and Carlangelo Liverani, Banach spaces adapted to Anosov systems, Ergodic Theory Dynam. Systems 26 (2006), no. 1, 189–217. MR 2201945
  • [GS14] Alexander Gorodnik and Ralf Spatzier, Exponential mixing of nilmanifold automorphisms, J. Anal. Math. 123 (2014), 355–396. MR 3233585
  • [HJ13] Roger A. Horn and Charles R. Johnson, Matrix analysis, second ed., Cambridge University Press, Cambridge, 2013. MR 2978290
  • [Hör76] Lars Hörmander, The boundary problems of physical geodesy, Arch. Rational Mech. Anal. 62 (1976), no. 1, 1–52. MR 602181
  • [Kif86] Yuri Kifer, Ergodic theory of random transformations, Progress in Probability and Statistics, vol. 10, Birkhäuser Boston, Inc., Boston, MA, 1986. MR 884892
  • [KM96] D. Y. Kleinbock and G. A. Margulis, Bounded orbits of nonquasiunipotent flows on homogeneous spaces, Sinaĭ’s Moscow Seminar on Dynamical Systems, Amer. Math. Soc. Transl. Ser. 2, vol. 171, Amer. Math. Soc., Providence, RI, 1996, pp. 141–172. MR 1359098
  • [Liu16] Xiao-Chuan Liu, Lyapunov exponents approximation, symplectic cocycle deformation and a large deviation theorem, ProQuest LLC, Ann Arbor, MI, 2016, Thesis (Ph.D.)–IMPA.
  • [Liv04] Carlangelo Liverani, On contact Anosov flows, Ann. of Math. (2) 159 (2004), no. 3, 1275–1312. MR 2113022
  • [LQ95] Pei-Dong Liu and Min Qian, Smooth ergodic theory of random dynamical systems, Lecture Notes in Mathematics, vol. 1606, Springer-Verlag, Berlin, 1995. MR 1369243
  • [McS34] E. J. McShane, Extension of range of functions, Bull. Amer. Math. Soc. 40 (1934), no. 12, 837–842. MR 1562984
  • [Mn87] Ricardo Mañé, Ergodic theory and differentiable dynamics, Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics and Related Areas (3)], vol. 8, Springer-Verlag, Berlin, 1987, Translated from the Portuguese by Silvio Levy. MR 889254
  • [OP22] Davi Obata and Mauricio Poletti, Positive exponents for random products of conservative surface diffeomorphisms and some skew products, J. Dynam. Differential Equations 34 (2022), no. 3, 2405–2428. MR 4482258
  • [Pal00] Jacob Palis, A global view of dynamics and a conjecture on the denseness of finitude of attractors, Géométrie complexe et systèmes dynamiques (Orsay, 1995), no. 261, SMF, 2000, pp. xiii–xiv, 335–347. MR 1755446
  • [Pot22] Rafael Potrie, A remark on uniform expansion, Rev. Un. Mat. Argentina 64 (2022), no. 1, 11–21. MR 4477288
  • [PP90] William Parry and Mark Pollicott, Zeta functions and the periodic orbit structure of hyperbolic dynamics, Astérisque (1990), no. 187-188, 268. MR 1085356
  • [Roh64] V. A. Rohlin, Exact endomorphisms of a Lebesgue space, 15 papers on topology and logic, American Mathematical Society Translations. Series 2, vol. 39, American Mathematical Society, Providence, RI, 1964, pp. 1–36.
  • [Roh67] V. A. Rohlin, Lectures on the entropy theory of transformations with invariant measure, Uspehi Mat. Nauk 22 (1967), no. 5(137), 3–56. MR 217258
  • [Rue78] David Ruelle, Thermodynamic formalism, Encyclopedia of Mathematics and its Applications, vol. 5, Addison-Wesley Publishing Co., Reading, MA, 1978, The mathematical structures of classical equilibrium statistical mechanics, With a foreword by Giovanni Gallavotti and Gian-Carlo Rota. MR 511655
  • [Shu87] Michael Shub, Global stability of dynamical systems, Springer-Verlag, New York, 1987, With the collaboration of Albert Fathi and Rémi Langevin, Translated from the French by Joseph Christy. MR 869255
  • [Shu06] Michael Shub, All, most, some differentiable dynamical systems, International Congress of Mathematicians. Vol. III, Eur. Math. Soc., Zürich, 2006, pp. 99–120. MR 2275672
  • [Sin72] Ja. G. Sinaĭ, Gibbs measures in ergodic theory, Uspehi Mat. Nauk 27 (1972), no. 4(166), 21–64. MR 399421
  • [Ste91] Gilbert W. Stewart, Perturbation theory for the singular value decomposition, SVD and signal processing, II: algorithms, analysis, and applications (Richard J. Vaccaro, ed.), Elsevier, Amsterdam, 1991.
  • [Ste97] J. Michael Steele, Probability theory and combinatorial optimization, CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 69, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1997. MR 1422018
  • [Tsu01] Masato Tsujii, Fat solenoidal attractors, Nonlinearity 14 (2001), no. 5, 1011–1027. MR 1862809
  • [TZ23] Masato Tsujii and Zhiyuan Zhang, Smooth mixing Anosov flows in dimension three are exponentially mixing, Ann. of Math. (2) 197 (2023), no. 1, 65–158. MR 4513143
  • [Via98] Marcelo Viana, Dynamics: a probabilistic and geometric perspective, Proceedings of the International Congress of Mathematicians, Vol. I (Berlin, 1998), 1998, pp. 557–578. MR 1648047
  • [Via99] Marcelo Viana, Lecture notes on attractors and physical measures, Monografías del Instituto de Matemática y Ciencias Afines [Monographs of the Institute of Mathematics and Related Sciences], vol. 8, Instituto de Matemática y Ciencias Afines, IMCA, Lima, 1999, A paper from the 12th Escuela Latinoamericana de Matemáticas (XII-ELAM) held in Lima, June 28–July 3, 1999. MR 2007887
  • [Via08] Marcelo Viana, Almost all cocycles over any hyperbolic system have nonvanishing Lyapunov exponents, Ann. of Math. (2) 167 (2008), no. 2, 643–680. MR 2415384
  • [Via14] Marcelo Viana, Lectures on Lyapunov exponents, Cambridge Studies in Advanced Mathematics, vol. 145, Cambridge University Press, Cambridge, 2014. MR 3289050
  • [WY01] Qiudong Wang and Lai-Sang Young, Strange attractors with one direction of instability, Comm. Math. Phys. 218 (2001), no. 1, 1–97. MR 1824198
  • [You98] Lai-Sang Young, Statistical properties of dynamical systems with some hyperbolicity, Ann. of Math. (2) 147 (1998), no. 3, 585–650. MR 1637655
  • [You99] Lai-Sang Young, Recurrence times and rates of mixing, Israel J. Math. 110 (1999), 153–188. MR 1750438