
Simultaneous Linearization of Diffeomorphisms of Isotropic Manifolds

Jonathan DeWitt, Department of Mathematics, The University of Chicago, Chicago, IL 60637, USA. [email protected]
Abstract.

Suppose that $M$ is a closed isotropic Riemannian manifold and that $R_{1},\ldots,R_{m}$ generate the isometry group of $M$. Let $f_{1},\ldots,f_{m}$ be smooth perturbations of these isometries. We show that the $f_{i}$ are simultaneously conjugate to isometries if and only if their associated uniform Bernoulli random walk has all Lyapunov exponents zero. This extends a linearization result of Dolgopyat and Krikorian [DK07] from $S^{n}$ to real, complex, and quaternionic projective spaces. In addition, we identify and remedy an oversight in that earlier work.

1. Introduction

(This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1746045.)

A basic problem in dynamics is determining whether two dynamical systems are equivalent. A standard notion of equivalence is conjugacy: if $f$ and $g$ are two diffeomorphisms of a manifold $M$, then $f$ and $g$ are conjugate if there exists a homeomorphism $h$ of $M$ such that $hfh^{-1}=g$. Some classes of dynamical systems are distinguished up to conjugacy by a small amount of dynamical information. One of the most basic examples of this is Denjoy's theorem: a $C^{2}$ orientation preserving circle diffeomorphism with irrational rotation number is conjugate to a rotation [KH97, §12.1]. In the case of Denjoy's theorem, the rotation number is all the information needed to determine the topological equivalence class of the diffeomorphism under conjugacy.

Rigidity theory focuses on identifying dynamics that are distinguished up to conjugacy by particular kinds of dynamical information such as the rotation number. There are finer dynamical invariants than rotation number which require a finer notion of equivalence to study. For instance, one obtains a finer notion of equivalence if one insists that the conjugacy be a $C^{1}$ or even $C^{\infty}$ diffeomorphism. A smoother conjugacy allows one to consider invariants such as Lyapunov exponents, which may not be preserved under conjugacy by homeomorphisms. For a single volume preserving Anosov diffeomorphism, the Lyapunov exponents with respect to volume are invariant under conjugation by $C^{1}$ volume preserving maps. Consequently, one is naturally led to ask, "If two volume preserving Anosov diffeomorphisms have the same Lyapunov exponents, are the two $C^{1}$ conjugate?" In some circumstances the answer is "yes." Such situations, where knowledge about Lyapunov exponents implies that systems are conjugate by a $C^{1}$ diffeomorphism, are instances of a phenomenon called "Lyapunov spectrum rigidity." See [Gog19] for examples and discussion of this type of rigidity. For recent examples, see [But17], [DeW19], [GRH19], [GKS18], and [SY19].

In rigidity problems related to isometries, it is often natural to consider a family of isometries. A collection of isometries may have strong rigidity properties even if the individual elements of the collection do not. For example, Fayad and Khanin [FK09] proved that a collection of commuting diffeomorphisms of the circle whose rotation numbers satisfy a simultaneous Diophantine condition is smoothly simultaneously conjugate to rotations. Their result is a strengthening of an earlier result of Moser [Mos90]. A single diffeomorphism in such a collection might not satisfy the Diophantine condition on its own.

Although the two types of rigidity described above occur in the dissimilar hyperbolic and elliptic settings, a result of Dolgopyat and Krikorian combines the two. They introduce a notion of a Diophantine set of rotations of a sphere and use this notion to prove that certain random dynamical systems with all Lyapunov exponents zero are conjugated to isometric systems [DK07]. Our theorem generalizes this result to the setting of isotropic manifolds. We now develop the language to state both precisely.

Let $(f_{1},\ldots,f_{m})$ be a tuple of diffeomorphisms of a manifold $M$. Let $(\omega_{i})_{i\in\mathbb{N}}$ be a sequence of independent and identically distributed random variables with uniform distribution on $\{1,\ldots,m\}$. Given an initial point $x_{0}\in M$, define $x_{n}=f_{\omega_{n}}x_{n-1}$. This defines a Markov process on $M$. We refer to this process as the random dynamical system associated to the tuple $(f_{1},\ldots,f_{m})$. Let $f_{\omega}^{n}$ be defined to equal $f_{\omega_{n}}\circ\cdots\circ f_{\omega_{1}}$. We say that a probability measure $\mu$ on $M$ is a stationary measure for this process if $m^{-1}\sum_{i=1}^{m}(f_{i})_{*}\mu=\mu$. A stationary measure is ergodic if it is not a non-trivial convex combination of two distinct stationary measures. Fix an ergodic stationary measure $\mu$. For $\mu$-almost every $x$, almost surely for any $v\in T_{x}M\setminus\{0\}$, the following limit exists

(1) $\lim_{n\to\infty}\frac{1}{n}\ln\|D_{x}f^{n}_{\omega}v\|$

and takes its value in a fixed finite list of numbers depending only on $\mu$:

(2) $\lambda_{1}(\mu)\geq\lambda_{2}(\mu)\geq\cdots\geq\lambda_{\dim M}(\mu).$

These are the Lyapunov exponents with respect to $\mu$. In fact, for almost every $\omega$ and $\mu$-a.e. $x$ there exists a flag $V_{1}\subset\cdots\subset V_{j}$ inside $T_{x}M$ such that if $v\in V_{i}\setminus V_{i-1}$ then the limit in (1) is equal to $\lambda_{\dim M-\dim V_{i}}$. The number of times a particular exponent appears in (2) is given by $\dim V_{i}-\dim V_{i-1}$; this number is referred to as the multiplicity of the exponent. For more information, see [Kif86].
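
The limit (1) is easy to simulate. The following sketch is an illustration only: the two matrices and the sample length are arbitrary choices, and it treats the linear-cocycle analogue of (1) rather than an actual diffeomorphism of a manifold. It estimates the top exponent of a random product of matrices by tracking the norm growth of a vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two stand-ins for the derivative cocycle D_x f_i along an orbit.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [1.0, 1.0]])
generators = [A, B]

def top_lyapunov_exponent(generators, n_steps=100_000):
    """Estimate lim (1/n) log ||D_x f_omega^n v|| for a uniform Bernoulli product."""
    v = rng.normal(size=2)
    v /= np.linalg.norm(v)
    total = 0.0
    for _ in range(n_steps):
        M = generators[rng.integers(len(generators))]   # choose f_{omega_n} uniformly
        v = M @ v
        norm = np.linalg.norm(v)
        total += np.log(norm)   # accumulate the logarithmic growth
        v /= norm               # renormalize to avoid overflow
    return total / n_steps

print(top_lyapunov_exponent(generators))   # positive here; exactly zero if the generators are orthogonal
```

Replacing $A$ and $B$ by orthogonal matrices drives the estimate to zero, which is the regime addressed by Theorem 1 below.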

Our result holds for isotropic manifolds. By definition, an isotropic manifold is a Riemannian manifold whose isometry group acts transitively on its unit tangent bundle. The closed isotropic manifolds are $S^{n}$, $\mathbb{R}\operatorname{P}^{n}$, $\mathbb{C}\operatorname{P}^{n}$, $\mathbb{H}\operatorname{P}^{n}$, and the Cayley projective plane. In the following we write $G^{\circ}$ for the identity component of a Lie group $G$.

Theorem 1.

Let $M^{d}$ be a closed isotropic Riemannian manifold other than $S^{1}$. There exists $k_{0}$ such that if $(R_{1},\ldots,R_{m})$ is a tuple of isometries of $M$ such that the subgroup of $\operatorname{Isom}(M)$ generated by this tuple contains $\operatorname{Isom}(M)^{\circ}$, then there exists $\epsilon_{k_{0}}>0$ such that the following holds. Let $(f_{1},\ldots,f_{m})$ be a tuple of $C^{\infty}$ diffeomorphisms satisfying $\max_{i}d_{C^{k_{0}}}(f_{i},R_{i})<\epsilon_{k_{0}}$. If there exists a sequence of ergodic stationary measures $\mu_{n}$ for the random dynamical system generated by $(f_{1},\ldots,f_{m})$ such that $\left|\lambda_{d}(\mu_{n})\right|\to 0$, then there exists $\psi\in\operatorname{Diff}^{\infty}(M)$ such that for each $i$ the map $\psi f_{i}\psi^{-1}$ is an isometry of $M$ and lies in the subgroup of $\operatorname{Isom}(M)$ generated by $(R_{1},\ldots,R_{m})$.

Dolgopyat and Krikorian proved Theorem 1 in the case of $S^{n}$ [DK07, Thm. 1].

Dolgopyat and Krikorian also obtained a Taylor expansion of the Lyapunov exponents of the stationary measure of the perturbed system [DK07, Thm. 2]. Fix $(R_{1},\ldots,R_{m})$ generating $\operatorname{Isom}(S^{n})^{\circ}$. Let $(f_{1},\ldots,f_{m})$ be a $C^{k_{0}}$ small perturbation of $(R_{1},\ldots,R_{m})$ and let $\mu$ be any ergodic stationary measure for $(f_{1},\ldots,f_{m})$. Let $\Lambda_{r}=\lambda_{1}+\cdots+\lambda_{r}$ denote the sum of the top $r$ Lyapunov exponents. In [DK07, Thm. 2], it is shown that the Lyapunov exponents of $\mu$ satisfy

(3) $\lambda_{r}(\mu)=\frac{\Lambda_{d}}{d}+\frac{d-2r+1}{d-1}\left(\lambda_{1}-\frac{\Lambda_{d}}{d}\right)+o(1)|\lambda_{d}(\mu)|,$

where $o(1)$ goes to zero as $\max_{i}d_{C^{k_{0}}}(f_{i},R_{i})\to 0$. Using this formula, Dolgopyat and Krikorian obtain an even stronger dichotomy for systems on even dimensional spheres: either $(f_{1},\ldots,f_{m})$ is simultaneously conjugated to isometries or the Lyapunov exponents of every ergodic stationary measure of the perturbation are uniformly bounded away from zero. Using this result, they show that if $(R_{1},\ldots,R_{m})$ generates $\operatorname{Isom}(S^{2n})^{\circ}$ and $(f_{1},\ldots,f_{m})$ is a $C^{k_{0}}$ small perturbation such that each $f_{i}$ preserves volume, then volume is an ergodic stationary measure for $(f_{1},\ldots,f_{m})$ [DK07, Cor. 2].

It is natural to ask if a similar Taylor expansion can be obtained in the setting of isotropic manifolds. Proposition 26 shows that $\Lambda_{r}$ may be Taylor expanded assuming that $(R_{1},\ldots,R_{m})$ generates $\operatorname{Isom}(M)^{\circ}$ and the induced action of $\operatorname{Isom}(M)^{\circ}$ on $\operatorname{Gr}_{r}(M)$, the Grassmannian bundle of $r$-planes in $TM$, is transitive.

In Theorem 40, we give a Taylor expansion relating $\lambda_{1}$ and $\lambda_{d}$ which holds for isotropic manifolds. However, we cannot Taylor expand every Lyapunov exponent as in equation (3) because if a manifold does not have constant curvature then its isometry group cannot act transitively on the two-planes in its tangent spaces. The argument of Dolgopyat and Krikorian requires that the isometry group act transitively on the space of $k$-planes in $TM$ for $0\leq k\leq d$.

It is natural to ask why the proof of Theorem 1 does not work in the case of $S^{1}$ even though $S^{1}$ is isotropic. As Proposition 13 shows, for a tuple $(R_{1},\ldots,R_{m})$ as in the theorem, uniformly small perturbations of $(R_{1},\ldots,R_{m})$ are uniformly Diophantine in a sense explained below. This uniformity is used crucially in the proof when we change the tuple of isometries that we are working with. The same uniformity of Diophantineness does not hold for tuples of isometries of $S^{1}$: a small perturbation may lose all Diophantine properties. The reason that the proof of Proposition 13 does not work for $S^{1}$ is that the isometry group of $S^{1}$ is not semisimple.

There are not many other results like Theorem 1. In addition to the aforementioned result of Dolgopyat and Krikorian, there are some results of Malicet. In [Mal12], a similar linearization result is obtained that applies to a particular type of map of $\mathbb{T}^{2}$ that fibers over a rotation of $S^{1}$. In a recent work, Malicet obtained a Taylor expansion of the Lyapunov exponent for a perturbation of a Diophantine random dynamical system on the circle [Mal20].

Acknowledgements. The author thanks Aaron Brown and Amie Wilkinson for their critical comments during all parts of this project. The author also thanks Dmitry Dolgopyat for his generosity in answering the author’s questions about [DK07]. The author is also grateful to the anonymous referees for carefully reading the manuscript and providing many useful comments and suggestions.

1.1. Outline

The proof of Theorem 1 follows the general argument of [DK07]. For readability, the argument in this paper is self-contained. While a number of the results below appear in [DK07], we have substantially reformulated many of them and in many places offer a different proof. Doing so is not merely a courtesy to the reader: the results in [DK07] are stated in too narrow a setting for us to use, and simply asserting more general reformulations without proof would unduly tax the reader's trust. In addition, there are some oversights in [DK07], which we explain in subsection 1.2 and remedy in Section 5. We have also stated intermediate results and lemmas in more generality than is needed for the proof of Theorem 1 so that they may be used by others. Below we sketch the general argument of the paper and emphasize some differences with the approach in [DK07].

The proof of Theorem 1 is by an iterative KAM convergence scheme. Fix a closed isotropic manifold $M$. We start with a tuple of diffeomorphisms $(f_{1},\ldots,f_{m})$ near a tuple of isometries $(R_{1},\ldots,R_{m})$. We must find some smooth diffeomorphism $\psi$ such that $\widetilde{f}_{i}\coloneqq\psi f_{i}\psi^{-1}\in\operatorname{Isom}(M)$. To do this we produce a conjugacy $\psi$ that brings each $f_{i}$ closer to being an isometry. To judge the distance from being an isometry, we define a strain tensor that vanishes precisely when a diffeomorphism is an isometry. By solving a particular coboundary equation and using that the Lyapunov exponents are zero, we can construct $\psi$ so that $\widetilde{f}_{i}$ has small strain tensor. In our setting, a diffeomorphism with small strain is near to an isometry, so $(\widetilde{f}_{1},\ldots,\widetilde{f}_{m})$ is near to a tuple of isometries $(R_{1}^{\prime},\ldots,R_{m}^{\prime})$. We then repeat the procedure using this new tuple as our starting point. The results of performing a single step of this procedure comprise Lemma 39. Once Lemma 39 is proved, the rest of the proof of Theorem 1 is bookkeeping that checks that the procedure converges. Most of the paper is in service of the proof of Lemma 39, which gives the result of a single step in the convergence scheme.

Proofs of technical and basic facts are relegated to a significant number of appendices. This has been done to focus the main exposition on the important ideas in the proof of Theorem 1 and not on the technical details. The appendices that might be most beneficial to look at before they are referenced in the text are Appendices A and B, which concern $C^{k}$ calculus and interpolation inequalities. Both contain estimates that are common in KAM arguments. The organization of the main body of the paper reflects the order of the steps in the proof of Lemma 39. There are several important results in the proof of Lemma 39, which we now describe.
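
As an illustration of the kind of estimate found there (recalled here only for orientation; the precise statements used in the proofs are those of Appendix B), interpolation inequalities of Hadamard-Kolmogorov type assert that on a closed manifold $M$, for integers $0\leq k_{1}\leq k\leq k_{2}$ and $t\in[0,1]$ with $k=(1-t)k_{1}+tk_{2}$, there is a constant $C=C(M,k_{1},k_{2})$ such that

$\|f\|_{C^{k}}\leq C\,\|f\|_{C^{k_{1}}}^{1-t}\,\|f\|_{C^{k_{2}}}^{t}$ for all $f\in C^{k_{2}}(M)$.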

The first part of the proof of Lemma 39 requires that a particular coboundary equation can be tamely solved. The solution of this equation is one of the main subjects of Section 2. The equation is solved in Proposition 16. This proposition is essential in the work of Dolgopyat and Krikorian [DK07] and its proof follows from the appendix to [Dol02]; it relies on a Diophantine property of the tuple of isometries $(R_{1},\ldots,R_{m})$. This property is formulated in subsection 2.2. The stability of this property under perturbations is crucial in the proof and an essential feature of our setting. In addition, the argument in Section 2 differs from Dolgopyat's earlier argument because we use the Solovay-Kitaev algorithm (Theorem 2), which is more efficient than the procedure used in the appendix to [Dol02].

Section 3 considers stationary measures for perturbations of $(R_{1},\ldots,R_{m})$. Suppose $M$ is a quotient of its isometry group, its isometry group is semisimple, and $(R_{1},\ldots,R_{m})$ is a Diophantine subset of $\operatorname{Isom}(M)$. Suppose $(f_{1},\ldots,f_{m})$ is a small smooth perturbation of $(R_{1},\ldots,R_{m})$. There is a relation between a stationary measure $\mu$ for the perturbed system and the Haar measure: Proposition 23 relates integration against $\mu$ to integration against the Haar measure. Lyapunov exponents are calculated by integrating the $\log$ Jacobian against a stationary measure of an extended dynamical system on a Grassmannian bundle over $M$. Consequently, this proposition relates stationary measures and their Lyapunov exponents to the volume on a Grassmannian bundle.

The relationship between Lyapunov exponents and stationary measures is explained in Section 4. Proposition 26 provides a Taylor expansion of the sum of the top $r$ Lyapunov exponents of a stationary measure $\mu$. Three terms appear in the Taylor expansion. The first two terms have a direct geometric meaning, which we interpret in terms of the strain tensors introduced in subsection 4.2. The final term in the Taylor expansion depends on a quantity $\mathcal{U}(\psi)$. This quantity does not have a direct geometric interpretation. However, in the proof of Lemma 39, we show that by solving the coboundary equation from Proposition 16 the quantity $\mathcal{U}(\psi)$ can be made to vanish. Once $\mathcal{U}(\psi)$ vanishes, we have an equation directly relating Lyapunov exponents to the strain. This equation then allows us to conclude that a diffeomorphism with small Lyapunov exponents also has small strain. We reformulate some arguments of [DK07] in a Riemannian geometric setting by using the strain tensor. This gives coordinate-free expressions that are easier to interpret.

Section 5 contains the most important connection between the strain tensor and isometries: diffeomorphisms of small strain on isotropic manifolds are near to isometries. The basic geometric fact proved in Section 5 is Theorem 27, which is true on any manifold. Theorem 27 is then used to prove Proposition 28, which is a more technical result adapted for use in the KAM scheme. Proposition 28 then allows us to prove that our conjugated tuple is near to a new tuple of isometries, which allows us to repeat the process.

All of the previous sections combine in Section 6 to prove Lemma 39. We then obtain the main theorem, Theorem 1, and prove an additional theorem that relates the top and bottom Lyapunov exponents of a perturbation, Theorem 40.

1.2. An oversight and its remedy

Section 5 is entirely new and different from anything appearing in [DK07]. Consequently, the reader may wonder why it is needed. Section 5 provides a method of finding a tuple of isometries $(R_{1}^{\prime},\ldots,R_{m}^{\prime})$ near to the tuple $(\widetilde{f}_{1},\ldots,\widetilde{f}_{m})$ of diffeomorphisms. In [DK07], the new maps $R_{i}^{\prime}$ are found in the following manner. As in equation (10), one may find vector fields $Y_{i}$ such that

$\exp_{R_{i}(x)}Y_{i}(x)=f_{i}(x).$

If $Z$ is a vector field on $M$, we define $\psi_{Z}$, as in equation (11), to be the map $x\mapsto\exp_{x}Z(x)$. There is a certain operator, the Casimir Laplacian, which acts on vector fields; it is defined and discussed in more detail in subsection 2.2. Dolgopyat and Krikorian then project the vector fields $Y_{i}$ onto the kernel of the Casimir Laplacian to obtain vector fields $Y_{i}^{\prime}$. They then define $R_{i}^{\prime}$ to equal $\psi_{Y_{i}^{\prime}}\circ R_{i}$. This happens in the line immediately below equation (19) in [DK07].

One difficulty is establishing that the maps $(R_{1}^{\prime},\ldots,R_{m}^{\prime})$ are close to the $(\widetilde{f}_{1},\ldots,\widetilde{f}_{m})$. The argument for their nearness hinges on part (d) of Proposition 3 in [DK07], which essentially says that, up to a third order error, the magnitude of the smallest Lyapunov exponent is a bound on the distance. As written, the argument in [DK07] suggests that part (d) is an easy consequence of part (c) of [DK07, Prop. 3]. However, part (d) does not follow. Here is a simplification of the problem. Suppose that $f\colon\mathbb{R}^{n}\to\mathbb{R}^{n}$ is a diffeomorphism. Pick a point $x\in\mathbb{R}^{n}$ and write $D_{x}f=A+B+C$, where $A$ is a multiple of the identity, $B$ is symmetric with trace zero, and $C$ is skew-symmetric. The results in part (c) imply that $A$ and $B$ are small, but they offer no information about $C$. (For those comparing with the original paper, $A$ and $B$ correspond to the terms $q_{1}$ and $q_{2}$, respectively, which appear in part (c) of [DK07, Prop. 3].) Concluding that the norm of $Df$ is small requires that $C$ be small as well. As $C$ is skew-symmetric, it is natural to think of it as the germ of an isometry. Our modification to the argument is designed to accommodate the term $C$ by recognizing it as the "isometric" part of the differential. Pursuing this perspective leads to the strain tensor and our Proposition 28. Conversation with Dmitry Dolgopyat confirmed that there is a problem in the paper on this point and that part (d) of Proposition 3 does not follow from part (c).

2. A Diophantine Property and Spectral Gap

Fix a compact connected semisimple Lie group $G$ and let $\mathfrak{g}$ denote its Lie algebra. Endow $G$ with the bi-invariant metric arising from the negative of the Killing form on $\mathfrak{g}$. We denote this metric on $G$ by $d$. We endow a subgroup $H$ of $G$ with the pullback of the Riemannian metric from $G$ and denote the distance on $H$ with respect to the pullback metric by $d_{H}$. We use the manifold topology on $G$ unless explicitly stated otherwise. Consequently, whenever we say that a subset of $G$ is dense, we mean this with respect to the manifold topology on $G$. We say that a subset $S$ of $G$ generates $G$ if the smallest closed subgroup of $G$ containing $S$ is $G$. In other words, if $\langle S\rangle$ denotes the smallest subgroup of $G$ containing $S$, then $S$ generates if $\overline{\langle S\rangle}=G$.

Suppose that $S\subset G$ generates $G$. We begin this section by discussing how long a word in the elements of $S$ is needed to approximate an element of $G$. Using this approximation, we then obtain quantitative estimates for the spectral gap of certain operators associated to $S$. Finally, those spectral gap estimates allow us to obtain a "tameness" estimate for a particular operator that arises from $S$. This final estimate, Proposition 16, will be crucial in the KAM scheme that we use to prove Theorem 1.

The content of this section is broadly analogous to Appendix A in [Dol02]. However, our development follows a different approach and in some places we are able to obtain stronger estimates.

2.1. The Solovay-Kitaev algorithm

Suppose that $S$ is a subset of $G$. We say that $S$ is symmetric if $s\in S$ implies $s^{-1}\in S$. For a natural number $n$, let $S^{n}$ denote the $n$-fold product of $S$ with itself. Let $S^{-1}$ be $\{s^{-1}:s\in S\}$. For $n<0$, define $S^{n}$ to equal $(S^{-1})^{-n}$. The following theorem says that any sufficiently dense symmetric subset $S$ of a compact semisimple Lie group is a generating set. More importantly, it also gives an estimate on how long a word in the generating set $S$ is needed to approximate an element of $G$ to within error $\epsilon$. If $w=s_{1}\cdots s_{n}$ is a word in the elements of the set $S$, then we say that $w$ is balanced if for each $s\in S$, $s$ appears the same number of times in $w$ as $s^{-1}$ does.

Theorem 2.

[DN06, Thm. 1] (Solovay-Kitaev algorithm) Suppose that $G$ is a compact semisimple Lie group. There exist $\epsilon_{0}(G)>0$, $\alpha>0$, and $C>0$ such that if $S$ is any symmetric $\epsilon_{0}$-dense subset of $G$ then the following holds. For any $g\in G$ and any $\epsilon>0$, there exists a natural number $l_{\epsilon}$ such that $d(g,S^{l_{\epsilon}})<\epsilon$. Moreover, $l_{\epsilon}\leq C\log^{\alpha}(1/\epsilon)$. Further, there is a balanced word of length $l_{\epsilon}$ within distance $\epsilon$ of $g$.

Later, we use a version of this result that does not require that the set $S$ be symmetric. Using a non-symmetric generating set significantly increases the word length obtained in the conclusion of the theorem. It is unknown if there exists a version of the Solovay-Kitaev algorithm that does not require a symmetric generating set and keeps the $O(\log^{\alpha}(1/\epsilon))$ word length. See [BO18] for a partial result in this direction.
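
As an aside, the role played by the word length $l_{\epsilon}$ is easy to probe experimentally. The sketch below is an illustration only: the two rotations $A$, $B$ are arbitrary choices, and this brute-force density check is not the Solovay-Kitaev recursion itself. It lists all words of a given length $l$ in a symmetric set $S\subset\operatorname{SO}(3)$ and estimates $\sup_{g}d(g,S^{l})$ by sampling random rotations.

```python
import numpy as np

rng = np.random.default_rng(1)

def rotation(axis, angle):
    """Rotation in SO(3) about a unit axis by the given angle (Rodrigues' formula)."""
    axis = np.asarray(axis, dtype=float)
    axis /= np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

A = rotation([1.0, 0.0, 0.0], 1.0)
B = rotation([0.0, 1.0, 0.3], np.sqrt(2.0))
S = [A, A.T, B, B.T]                 # a symmetric set: two generic rotations and their inverses

def words(S, length):
    """All |S|^length products of `length` elements of S."""
    current = [np.eye(3)]
    for _ in range(length):
        current = [W @ s for W in current for s in S]
    return current

def covering_radius(word_list, n_samples=200):
    """Estimate sup_g d(g, S^l) by sampling (roughly Haar-distributed) random rotations g."""
    worst = 0.0
    for _ in range(n_samples):
        Q, R = np.linalg.qr(rng.normal(size=(3, 3)))
        Q = Q * np.sign(np.diag(R))           # fix the column signs of the QR factor
        if np.linalg.det(Q) < 0:
            Q[:, 0] = -Q[:, 0]                # force determinant +1
        worst = max(worst, min(np.linalg.norm(Q - W) for W in word_list))
    return worst

for l in range(1, 6):
    print(l, covering_radius(words(S, l)))    # the covering radius shrinks as l grows
```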

Proposition 3.

Suppose that $G$ is a compact semisimple Lie group endowed with a bi-invariant metric. There exist $\epsilon_{0}(G)>0$, $\alpha>0$, and $C\geq 0$ such that if $S$ is any $\epsilon_{0}$-dense subset of $G$ then the following holds. For any $g\in G$ and any $\epsilon>0$, there exists a natural number $l_{\epsilon}$ such that $d(g,S^{l_{\epsilon}})<\epsilon$. Moreover, $l_{\epsilon}\leq C\epsilon^{-\alpha}$.

Our weakened version of the Solovay-Kitaev algorithm relies on the following lemma, which allows us to approximate the inverse of an element $h$ by some positive power of $h$.

Lemma 4.

Suppose that $G$ is a compact $d$-dimensional Lie group with a fixed bi-invariant metric. Then there exists a constant $C$ such that for all $\epsilon>0$ and any $h\in G$ there exists a natural number $n<C/\epsilon^{d}$ such that $d(h^{-1},h^{n})<\epsilon$.

Proof.

This follows from a straightforward pigeonhole argument. We cover $G$ with sets of diameter $\epsilon$. There exists a constant $C$ so that we can cover $G$ with at most $C\operatorname{vol}(G)/\epsilon^{d}$ such sets, where $d$ is the dimension of $G$. Consider now the first $\lceil C\operatorname{vol}(G)/\epsilon^{d}\rceil$ iterates of $h^{2}$. By the pigeonhole principle, two of these must fall into the same set in the covering, and so there exist natural numbers $n_{i}$ and $n_{j}$ such that $0<n_{i}<n_{j}<\lceil C\operatorname{vol}(G)/\epsilon^{d}\rceil$ and $h^{2n_{i}}$ and $h^{2n_{j}}$ lie in the same set in the covering. Thus $d(h^{2n_{i}},h^{2n_{j}})<\epsilon$. As $h$ is an isometry, it follows that $d(e,h^{2n_{j}-2n_{i}})<\epsilon$ and hence $d(h^{-1},h^{2n_{j}-2n_{i}-1})<\epsilon$ as well. This finishes the proof. ∎
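
For illustration (not part of the argument): in the simplest compact group $\operatorname{SO}(2)$, the power promised by Lemma 4 can be found by brute force, and the power needed grows roughly like $1/\epsilon$, matching the bound $n<C/\epsilon^{d}$ with $d=1$. The rotation angle below is an arbitrary irrational choice.

```python
import numpy as np

def rotation_2d(theta):
    """An element of SO(2): rotation of the plane by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def power_approximating_inverse(h, eps, max_n=10**6):
    """Smallest n with ||h^n - h^{-1}|| < eps, as guaranteed by Lemma 4."""
    h_inv = h.T                       # the inverse of an orthogonal matrix is its transpose
    power = np.eye(2)
    for n in range(1, max_n + 1):
        power = power @ h             # power now equals h^n
        if np.linalg.norm(power - h_inv) < eps:
            return n
    return None

h = rotation_2d(2.0 * np.pi * (np.sqrt(5.0) - 1.0) / 2.0)   # rotation by an irrational angle
for eps in [0.5, 0.1, 0.01]:
    print(eps, power_approximating_inverse(h, eps))
```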

We now prove the proposition.

Proof of Proposition 3.

Let $\hat{S}=S\cup S^{-1}$. Since $\hat{S}$ is a symmetric generating set of $G$, Theorem 2 gives that for any $\epsilon>0$ there exists a number $l_{\epsilon/2}=O(\log^{\alpha}(1/\epsilon))$ such that for any $g\in G$ there exists an element $h$ in $\hat{S}^{l_{\epsilon/2}}$ such that $d(h,g)<\epsilon/2$. Further, by the statement of Theorem 2, we know that $h$ is represented by a balanced word $w$ in $\hat{S}^{l_{\epsilon/2}}$.

To finish the proof, we replace each element of $w$ that is in $S^{-1}$ by a word in $S^{j}$ for some uniform $j>0$. To do this we show that there exists a fixed $j$ so that the elements of $S^{j}$ approximate well the inverses of the elements of $S$. Write $S=\{s_{1},\ldots,s_{m}\}$ and consider the element $(s_{1},\ldots,s_{m})$ in the group $G\times\cdots\times G$, where there are $m$ terms in the product. By applying Lemma 4 to the group $G\times\cdots\times G$ and the element $(s_{1},\ldots,s_{m})$, we obtain that there exist a uniform constant $C^{\prime}$ and $j<C^{\prime}2^{dm}l_{\epsilon/2}^{dm}/\epsilon^{dm}$ such that any $s\in S^{-1}$ may be approximated to distance $\epsilon/(2l_{\epsilon/2})$ by an element of $S^{j}$.

We now replace each element of $S^{-1}$ appearing in $w$ with a word in $S^{j}$ that is at distance $\epsilon/(2l_{\epsilon/2})$ from it. Call this new word $w^{\prime}$. Because $w$ is balanced, we replace exactly half of the terms in $w$. Thus $w^{\prime}$ is a word of length $jl_{\epsilon/2}/2+l_{\epsilon/2}/2$, as we have replaced half the entries of $w$, which has length $l_{\epsilon/2}$, with words of length $j$. Let $h^{\prime}$ be the element of $G$ obtained by multiplying together the terms in $w^{\prime}$.

Note that multiplication of any number of elements of $G$ is $1$-Lipschitz in each argument. Hence, as we have modified the expression for $h$ in exactly $l_{\epsilon/2}/2$ terms and each modification is of size $\epsilon/(2l_{\epsilon/2})$, $h^{\prime}$ is at distance at most $\epsilon/2$ from $h$ and hence at distance at most $\epsilon$ from $g$. Thus $S^{jl_{\epsilon/2}/2+l_{\epsilon/2}/2}$ is $\epsilon$-dense in $G$ and

$jl_{\epsilon/2}/2+l_{\epsilon/2}/2<C^{\prime\prime}l_{\epsilon/2}^{dm+1}/\epsilon^{dm}=O(\log^{(dm+1)\alpha}(1/\epsilon)\epsilon^{-dm}),$

which establishes the proposition since $m$ depends only on $\left|S\right|$. ∎

We record one final result, which asserts that if $S\subseteq G$ generates, then the powers of $S$ individually become dense in $G$.

Proposition 5.

Suppose that $G$ is a compact connected Lie group and that $S\subseteq G$ generates $G$. Then for all $\epsilon>0$ there exists a natural number $n_{\epsilon}$ such that $S^{n_{\epsilon}}$ is $\epsilon$-dense in $G$.

Proof.

Let $\{g_{1},\ldots,g_{m}\}$ be an $\epsilon/2$-dense subset of $G$. Because $S$ generates, for each $g_{i}$ there exist $n_{i}$ and $w_{i}\in S^{n_{i}}$ such that $d(g_{i},w_{i})<\epsilon/2$. By a pigeonhole argument similar to the proof of Lemma 4, there exists a natural number $N$ such that for all $n\geq N$, $S^{n}$ contains an element within distance $\epsilon/2$ of the identity. Thus $S^{N+\max_{i}\{n_{i}\}}$ is $\epsilon$-dense in $G$. ∎

2.2. Diophantine Sets

We now introduce a notion of a Diophantine subset of a compact connected semisimple Lie group $G$. Write $\mathfrak{g}$ for the Lie algebra of $G$. We recall the definition of the standard quadratic Casimir inside of $U(\mathfrak{g})$, the universal enveloping algebra of $\mathfrak{g}$. Write $B$ for the Killing form on $\mathfrak{g}$ and let $X_{i}$ be an orthonormal basis for $\mathfrak{g}$ with respect to $B$. We will also denote the inner product arising from the Killing form by $\langle\cdot,\cdot\rangle$. Then the Casimir, $\Omega$, is the element of $U(\mathfrak{g})$ defined by

$\Omega=\sum_{i}X_{i}^{2}.$

The element $\Omega$ is well-defined and central in $U(\mathfrak{g})$. Elements of $U(\mathfrak{g})$ act on the smooth vectors of representations of $G$. Consequently, as $\Omega$ is central and every vector in an irreducible representation $(\pi,V)$ is smooth, $\pi(\Omega)$ acts by a multiple of the identity. Given an irreducible unitary representation $(\pi,V)$, define $c(\pi)$ by

(4) $c(\pi)\operatorname{Id}=-\pi(\Omega).$

The quantity $c(\pi)$ is positive in non-trivial representations. Further, as $\pi$ ranges over all non-trivial representations, $c(\pi)$ is uniformly bounded away from $0$. For further information see [Wal18, 5.6].
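
For concreteness (this example is not used later, and is computed with the $X_{i}$ orthonormal for $-B$, the positive definite form on a compact algebra): for $G=\operatorname{SU}(2)$ the non-trivial irreducible unitary representations are the spin representations $\pi_{j}$ of dimension $2j+1$, $j=\tfrac{1}{2},1,\tfrac{3}{2},\ldots$, and

$\pi_{j}(\Omega)=-\frac{j(j+1)}{2}\operatorname{Id}$, so that $c(\pi_{j})=\frac{j(j+1)}{2}$.

In particular $c(\pi_{j})$ grows quadratically in the dimension of the representation, and $c(\pi_{1})=1$ for the adjoint representation.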

Definition 6.

Let $G$ be a compact, connected, semisimple Lie group. We say that a subset $S\subset G$ is $(C,\alpha)$-Diophantine if the following holds for each non-trivial, irreducible, finite dimensional unitary representation $(\pi,V)$ of $G$. For all non-zero $v\in V$ there exists $g\in S$ such that

$\|v-\pi(g)v\|\geq Cc(\pi)^{-\alpha}\|v\|,$

where $c(\pi)$ is defined in (4). We say that $S$ is Diophantine if $S$ is $(C,\alpha)$-Diophantine for some $C,\alpha>0$. If $(g_{1},\ldots,g_{m})$ is a tuple of elements of $G$, then we say that this tuple is $(C,\alpha)$-Diophantine if the underlying set is $(C,\alpha)$-Diophantine.

Our formulation of the Diophantine property is slightly different from the definition in [Dol02], as we refer directly to irreducible representations. We choose this formulation because it allows for a unified analysis of the action of $\Omega$ in diverse representations of $G$.

It is useful to compare Definition 6 with the simultaneous Diophantine condition used when studying translations on tori, as considered in [DF19] or [Pet21]. The condition for tori is a generalization of the simultaneous Diophantine condition considered by Moser [Mos90] for circle diffeomorphisms. Denote by $\langle\cdot,\cdot\rangle$ the standard inner product on $\mathbb{R}^{d}$. A tuple of vectors $(\theta_{1},\ldots,\theta_{m})$ in $\mathbb{R}^{d}$ defines a tuple of translations of $\mathbb{T}^{d}$. We say that this tuple is $(C,\alpha)$-Diophantine if for every non-zero $k\in\mathbb{Z}^{d}$,

(5) $\max_{1\leq i\leq m}\min_{l\in\mathbb{Z}}\left|\langle\theta_{i},k\rangle-l\right|\geq\frac{C}{\|k\|^{\alpha}}.$

One can see the relationship between this definition and the one for compact semisimple groups by thinking of $\mathbb{Z}^{d}$ as indexing the unitary representations of $\mathbb{T}^{d}$. Although these definitions apply to different types of groups, one can check that the estimates at their core are equivalent: for a given unitary representation defined by $k\in\mathbb{Z}^{d}$, use the $\theta_{i}$ that achieves the maximum in (5) to act on the representation defined by $k$.
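
For instance, when $m=d=1$, condition (5) is the classical Diophantine condition on a single rotation number. A standard example, recalled only for orientation: the golden mean $\theta=(1+\sqrt{5})/2$ is badly approximable, so there is a constant $C>0$ with

$\min_{l\in\mathbb{Z}}\left|k\theta-l\right|\geq\frac{C}{|k|}$ for all $k\in\mathbb{Z}\setminus\{0\}$,

that is, $\theta$ is $(C,1)$-Diophantine in the sense of (5).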

We now give a useful characterization of Diophantine subsets of compact semisimple groups.

Proposition 7.

[Dol02, Thm. A.3] Suppose that $S$ is a finite subset of a compact connected semisimple Lie group $G$. Then $S$ is Diophantine if and only if $\overline{\langle S\rangle}=G$. Moreover, there exists $\epsilon_{0}(G)$ such that any $\epsilon_{0}$-dense subset of $G$ is Diophantine.

Before proceeding to the proof we will show two preliminary results.

Lemma 8.

Suppose that $G$ is a compact connected semisimple Lie group. Suppose that $(\pi,V)$ is an irreducible unitary representation of $G$. Then for any $v\in V$ of unit length, any $X\in\mathfrak{g}$ of unit length, and $t\geq 0$,

$\|\pi(\exp(tX))v-v\|\leq t\sqrt{c(\pi)}.$
Proof.

A similar argument to the following appears in [Wal18, 5.7.13]. There exists an orthonormal basis $\{X_{1},\ldots,X_{n}\}$ of $\mathfrak{g}$ such that $X_{1}=X$. Observe that

$\pi(\exp(tX))v-v=t\,d\pi(X)v+O(t^{2}).$

The transformation $d\pi(X)$ is skew-symmetric with respect to the inner product. Thus $-d\pi(X)^{2}$ is positive semidefinite. Consequently:

$\langle d\pi(X)v,d\pi(X)v\rangle=-\langle d\pi(X)^{2}v,v\rangle\leq-\langle\pi(\Omega)v,v\rangle=c(\pi)\|v\|^{2}.$

Hence

$\|\pi(\exp(tX))v-v\|\leq t\sqrt{c(\pi)}+O(t^{2}).$

For $0\leq i\leq n$, let $t_{i}=\frac{i}{n}t$. Then
\begin{align*}
\|\pi(\exp(tX))v-v\| &\leq \sum_{i=1}^{n}\|\pi(\exp(t_{i}X))v-\pi(\exp(t_{i-1}X))v\| \\
&\leq \sum_{i=1}^{n}\|\pi(\exp(tX/n))v-v\| \\
&\leq n\left(\frac{t}{n}\sqrt{c(\pi)}+O((t/n)^{2})\right).
\end{align*}

Taking the liminf of the right-hand side as $n\to\infty$ gives the result. ∎

The following lemma will be of use in the proof of Proposition 10.

Lemma 9.

Suppose that $(\pi,V)$ is a non-trivial, irreducible, finite dimensional, unitary representation of a compact, connected, semisimple group $G$. Then for any $v\in V$, there exists $g$ such that $\langle\pi(g)v,v\rangle=0$.

Proof.

If such a $g$ does not exist, then for all $g\in G$, $\pi(g)v$ lies in the same half-space as $v$. But then $\int_{G}\pi(g)v\,dg\neq 0$ and is a $G$-invariant vector, which contradicts the irreducibility of $\pi$. ∎

Proposition 10.

Suppose that $G$ is a compact connected semisimple Lie group. Then there exist $\epsilon_{0},C,\alpha>0$ such that any $\epsilon_{0}$-dense subset of $G$ is $(C,\alpha)$-Diophantine. If $S$ is a subset of $G$ such that $S^{n_{0}}$ is $\epsilon_{0}$-dense in $G$, then $S$ is $(C/n_{0},\alpha)$-Diophantine.

Proof.

Let $\epsilon_{0}$ equal the $\epsilon_{0}(G)$ in Theorem 2, the Solovay-Kitaev algorithm. In the case that $S$ is already $\epsilon_{0}$-dense, let $n_{0}=1$. By Theorem 2, there exist $C$ and $\alpha$ such that for each $\epsilon$ there exists $l_{\epsilon}\leq C\log^{\alpha}(\epsilon^{-1})$ such that $S^{n_{0}l_{\epsilon}}$ is $\epsilon$-dense in $G$. Suppose that $(\pi,V)$ is a non-trivial irreducible unitary representation of $G$ and suppose that $v\in V$ is a unit vector. By Lemma 9 there exists $g\in G$ such that $\langle\pi(g)v,v\rangle=0$. Now fix $\epsilon=1/(100\sqrt{c(\pi)})$. Then there exists an element $w\in S^{n_{0}l_{\epsilon}}$ such that $d(g,w)<\epsilon$. Thus by Lemma 8,

$\|\pi(g)v-\pi(w)v\|\leq\epsilon\sqrt{c(\pi)}<\frac{1}{100}.$

By the triangle inequality, this implies that

$\|\pi(w)v-v\|\geq 1.$

Write $w=g_{1}^{\sigma_{1}}\cdots g_{n_{0}l_{\epsilon}}^{\sigma_{n_{0}l_{\epsilon}}}$ where each $\sigma_{i}\in\{\pm 1\}$ and each $g_{i}\in S$. Let $w_{i}=g_{1}^{\sigma_{1}}\cdots g_{i}^{\sigma_{i}}$ and let $w_{0}=e$. By applying the triangle inequality $n_{0}l_{\epsilon}$ times, we see that

$\sum_{i=0}^{n_{0}l_{\epsilon}-1}\|\pi(w_{i})v-\pi(w_{i+1})v\|\geq\|v-\pi(w)v\|\geq 1.$

Thus there exists some $i$ such that

$\|\pi(w_{i})v-\pi(w_{i+1})v\|\geq\frac{1}{n_{0}l_{\epsilon}}.$

Applying $\pi(w_{i}^{-1})$ and noting, by our choice of $\epsilon$, that $l_{\epsilon}\leq C\log^{\alpha}(c(\pi))$, we obtain that

(6) $\|v-\pi(g_{i}^{\sigma_{i}})v\|\geq\frac{1}{n_{0}C^{\prime}\log^{\alpha}(c(\pi))}.$

Thus we are done, as we have obtained an estimate that is stronger than the required lower bound of $C/c(\pi)^{\alpha}$. ∎

We now prove the equivalence of the Diophantine property appearing in Proposition 10 with that in Definition 6.

Proof of Proposition 7.

To begin, suppose that $S$ is Diophantine. For the sake of contradiction, suppose that $H\coloneqq\overline{\langle S\rangle}\neq G$. Consider the action of $G$ on $L^{2}(G/H)$ by left translation. Note that $H$ acts trivially. However, $L^{2}(G/H)$ contains non-trivial representations of $G$. Thus $S\subset H$ cannot be Diophantine, which is a contradiction.

For the other direction, suppose that $\overline{\langle S\rangle}=G$. Then by Proposition 5 there exists $n$ such that $S^{n}$ is $\epsilon_{0}(G)$-dense, and hence $S$ is Diophantine by Proposition 10. ∎

The stronger bound in equation (6) gives an equivalent characterization of Diophantineness.

Corollary 11.

Let $G$ be a compact, connected, semisimple Lie group. A subset $S$ of $G$ is Diophantine if and only if there exist $C,\alpha>0$ such that the following holds for each non-trivial, irreducible, finite dimensional, unitary representation $(\pi,V)$ of $G$. For all $v\in V$ there exists $g\in S$ such that

$\|v-\pi(g)v\|\geq\frac{\|v\|}{C\log^{\alpha}(c(\pi))}.$

Diophantine subsets of a group are typical in the following sense.

Proposition 12.

Suppose that $G$ is a compact connected semisimple Lie group. Let $U\subset G\times G$ be the set of ordered pairs $(u_{1},u_{2})$ such that $\{u_{1},u_{2}\}$ is a Diophantine subset of $G$. Then $U$ is Zariski open and hence open and dense in the manifold topology on $G\times G$.

Proof.

Let $U^{\prime}\subset G\times G$ be the set of pairs $(u_{1},u_{2})$ such that $\{u_{1},u_{2}\}$ generates a dense subgroup of $G$. Theorem 1.1 in [Fie99] gives that $U^{\prime}$ is Zariski open and non-empty. By Proposition 7, a pair generates a dense subgroup if and only if it is Diophantine, so $U=U^{\prime}$ is Zariski open and non-empty, and the final claim follows. ∎

2.3. Polylogarithmic spectral gap

In this subsection, we study spectral properties of an averaging operator associated to a tuple of elements of $G$. Consider a tuple $(g_{1},\ldots,g_{m})$ of elements of $G$. Let $\mathbb{R}[G]$ denote the group ring of $G$ over $\mathbb{R}$. From this tuple we form $\mathcal{L}\coloneqq(g_{1}+\cdots+g_{m})/m\in\mathbb{R}[G]$. The element $\mathcal{L}$ acts in representations of $G$ in the natural way. If $(\pi,V)$ is a representation of $G$, then we write $\mathcal{L}_{\pi}$ for the action of $\mathcal{L}$ on $V$. The main result of this subsection is the following proposition, which gives some spectral properties of $\mathcal{L}_{\pi}$ under the assumption that $\{g_{1},\ldots,g_{m}\}$ is Diophantine.

Proposition 13.

Suppose that $G$ is a compact connected semisimple Lie group, $(g_{1},\ldots,g_{m})$ is a tuple of elements of $G$, and $\{g_{1},\ldots,g_{m}\}$ generates $G$. Then there exist a neighborhood $N$ of $(g_{1},\ldots,g_{m})$ in $G\times\cdots\times G$ and constants $D_{1},D_{2},\alpha>0$ such that if $(g_{1}^{\prime},\ldots,g_{m}^{\prime})\in N$, then $\{g_{1}^{\prime},\ldots,g_{m}^{\prime}\}$ is Diophantine and its associated averaging operator $\mathcal{L}$ satisfies

$\|\mathcal{L}_{\pi}^{n}\|\leq D_{1}\left(1-\frac{1}{D_{2}\log^{\alpha}(c(\pi))}\right)^{n},$

for each non-trivial irreducible unitary representation $(\pi,V)$.
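
The decay in Proposition 13 can be sanity-checked numerically in the simplest non-trivial representation, the standard action of $\operatorname{SO}(3)$ on $\mathbb{R}^{3}$. This is an illustration only: the rotations below are arbitrary choices, and the check verifies nothing beyond the stated decay for this one representation.

```python
import numpy as np

def rotation(axis, angle):
    """Rotation in SO(3) about a unit axis by the given angle (Rodrigues' formula)."""
    axis = np.asarray(axis, dtype=float)
    axis /= np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

# Rotations by irrational multiples of pi about distinct axes; these two generate a
# dense subgroup of SO(3), and the representation on R^3 is irreducible and non-trivial.
g1 = rotation([1.0, 0.0, 0.0], 1.0)
g2 = rotation([0.0, 1.0, 0.0], np.sqrt(2.0))
L = 0.5 * (g1 + g2)                      # the averaging operator in this representation

print(max(abs(np.linalg.eigvals(L))))    # spectral radius: strictly less than 1
for n in [1, 5, 20, 50]:
    # ||L^n|| -> 0 geometrically; note ||L|| itself equals 1 here (hence the constant D_1).
    print(n, np.linalg.norm(np.linalg.matrix_power(L, n), 2))

# With a common axis the two rotations share a fixed vector, so ||L^n|| = 1 for every n.
h1 = rotation([0.0, 0.0, 1.0], 1.0)
h2 = rotation([0.0, 0.0, 1.0], np.sqrt(2.0))
print(np.linalg.norm(np.linalg.matrix_power(0.5 * (h1 + h2), 50), 2))
```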

The proof of Proposition 13 uses the following lemma, which is a sharpening of the triangle inequality for vectors that are not collinear.

Lemma 14.

Suppose that $v,w$ are two vectors in an inner product space. Suppose that $\|v\|\leq\|w\|$ and let $\hat{v}=v/\|v\|$ and $\hat{w}=w/\|w\|$. If

$\|\hat{v}-\hat{w}\|\geq\epsilon,$

then

$\|v+w\|\leq(1-\epsilon^{2}/10)\|v\|+\|w\|.$
Proof.

We begin by considering the following estimate for unit vectors.

Claim 1.

Suppose that the angle between two unit vectors $\hat{v}$ and $\hat{w}$ is $\theta\in[0,\pi]$. Then

$\|\hat{v}+\hat{w}\|\leq\|\hat{v}\|+(1-\theta^{2}/10)\|\hat{w}\|.$
Proof.

It suffices to consider the two vectors $\hat{v}=(1,0)$ and $\hat{w}=(\cos\theta,\sin\theta)$ in $\mathbb{R}^{2}$. It then suffices to show:

$\|\hat{v}+\hat{w}\|^{2}\leq\left(\|\hat{v}\|+\left(1-\frac{\theta^{2}}{10}\right)\|\hat{w}\|\right)^{2}.$

From the definitions,

$\|\hat{v}+\hat{w}\|^{2}=2+2\cos\theta$

and

$\left(\|\hat{v}\|+\left(1-\frac{\theta^{2}}{10}\right)\|\hat{w}\|\right)^{2}=4-4\frac{\theta^{2}}{10}+\frac{\theta^{4}}{100}\geq 4-4\frac{\theta^{2}}{10}.$

Thus it suffices to show for $\theta\in[0,\pi]$ that

$2+2\cos\theta\leq 4-4\frac{\theta^{2}}{10},$

which follows because for $\theta\in[0,\pi]$ we have the estimate $\cos\theta\leq 1-\theta^{2}/5$. ∎

We may prove the lemma once we have one more observation. If $\hat{v}$ and $\hat{w}$ are two unit vectors, then $\|\hat{v}-\hat{w}\|=\epsilon$ is at most the angle $\theta$ between $\hat{v}$ and $\hat{w}$, because the distance between $\hat{v}$ and $\hat{w}$ along a unit circle containing them is precisely $\theta$. Thus $\epsilon\leq\theta$ for $0\leq\theta\leq\pi$.

We now compute. Note that without loss of generality we may assume that $\|w\|=1$, which we do in the following. By the triangle inequality,

$\|v+w\|\leq\|v\|\|\hat{v}+\hat{w}\|+(1-\|v\|)\|\hat{w}\|.$

By the claim it then follows that

$\|v+w\|\leq\|v\|((1-\theta^{2}/10)\|\hat{v}\|+\|\hat{w}\|)+(1-\|v\|)\|\hat{w}\|.$

Noting from before that $0\leq\epsilon\leq\theta$ for $\theta\in[0,\pi]$, we then conclude:

$\|v+w\|\leq\|v\|((1-\epsilon^{2}/10)\|\hat{v}\|+\|\hat{w}\|)+(1-\|v\|)\|\hat{w}\|=(1-\epsilon^{2}/10)\|v\|+\|w\|.$ ∎
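
Since Lemma 14 is used repeatedly, we note that it is easy to spot-check numerically. The sketch below is an illustration only: it samples random pairs with $\|v\|\leq\|w\|$ and verifies the inequality with $\epsilon=\|\hat{v}-\hat{w}\|$.

```python
import numpy as np

rng = np.random.default_rng(2)

def worst_slack(n_trials=100_000, dim=5):
    """Smallest value of RHS - LHS in ||v + w|| <= (1 - eps^2/10)||v|| + ||w||."""
    worst = np.inf
    for _ in range(n_trials):
        v, w = rng.normal(size=dim), rng.normal(size=dim)
        if np.linalg.norm(v) > np.linalg.norm(w):
            v, w = w, v                          # enforce ||v|| <= ||w||
        eps = np.linalg.norm(v / np.linalg.norm(v) - w / np.linalg.norm(w))
        lhs = np.linalg.norm(v + w)
        rhs = (1.0 - eps**2 / 10.0) * np.linalg.norm(v) + np.linalg.norm(w)
        worst = min(worst, rhs - lhs)
    return worst

print(worst_slack())   # nonnegative, as Lemma 14 predicts
```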

Proof of Proposition 13.

For convenience, let $W=(g_{1},\ldots,g_{m})$ and let $S=\{g_{1},\ldots,g_{m}\}$. Let $\epsilon_{0}(G)$ be as in Proposition 10. By Proposition 5, because $\overline{\langle S\rangle}=G$ there exists some $n_{0}$ such that $S^{n_{0}}$ is $\epsilon_{0}/2$-dense in $G$. Then let $N$ be a neighborhood of $(g_{1},\ldots,g_{m})$ in $G\times\cdots\times G$ such that if $p=(g_{1}^{\prime},\ldots,g_{m}^{\prime})\in N$ then $\{g_{1}^{\prime},\ldots,g_{m}^{\prime}\}^{n_{0}}$ is at least $\epsilon_{0}$-dense in $G$. It now suffices to obtain the given estimate for the set $W=(g_{1},\ldots,g_{m})$ using only the assumption that $S^{n_{0}}$ is $\epsilon_{0}$-dense. Below, $W^{n_{0}}$ is the tuple of the $m^{n_{0}}$ words of length $n_{0}$ with entries in $W$.

By Proposition 10, there exist $(C,\alpha)$ such that any $\epsilon_{0}$-dense set is $(C,\alpha)$-Diophantine. As $S^{n_{0}}$ is $\epsilon_{0}$-dense, so is $S^{n_{0}}S^{-n_{0}}$, and hence $S^{n_{0}}S^{-n_{0}}$ is $(C,\alpha)$-Diophantine.

Consider now a non-trivial irreducible finite dimensional unitary representation $(\pi,V)$ of $G$. Since $S^{n_{0}}S^{-n_{0}}$ is $(C,\alpha)$-Diophantine, Corollary 11 implies that for any unit length $v\in V$ there exist $w_{1},w_{2}\in S^{n_{0}}$ such that

$\|v-\pi(w_{1}^{-1}w_{2})v\|\geq\frac{1}{C\log^{\alpha}(c(\pi))},$

and so

$\|\pi(w_{1})v-\pi(w_{2})v\|\geq\frac{1}{C\log^{\alpha}(c(\pi))}.$

Hence by Lemma 14, since $\pi$ is unitary,

$\|\pi(w_{1})v+\pi(w_{2})v\|\leq\left(1-\frac{1}{10C^{2}\log^{2\alpha}(c(\pi))}\right)\|\pi(w_{1})v\|+\|\pi(w_{2})v\|\leq\left(2-\frac{1}{10C^{2}\log^{2\alpha}(c(\pi))}\right)\|v\|.$

Then by the triangle inequality:

\begin{align*}
\|\mathcal{L}_{\pi}^{n_{0}}v\| &= \left\|\frac{1}{\left|W\right|^{n_{0}}}\sum_{w\in W^{n_{0}}}\pi(w)v\right\| \\
&\leq \frac{1}{\left|W\right|^{n_{0}}}\left(\|\pi(w_{1})v+\pi(w_{2})v\|+\sum_{w\in W^{n_{0}}\setminus\{w_{1},w_{2}\}}\|\pi(w)v\|\right) \\
&\leq \frac{1}{\left|W\right|^{n_{0}}}\left(2-\frac{1}{10C^{2}\log^{2\alpha}(c(\pi))}\right)\|v\|+\frac{\left|W\right|^{n_{0}}-2}{\left|W\right|^{n_{0}}}\|v\| \\
&\leq \left(1-\frac{1}{10C^{2}|W^{n_{0}}|\log^{2\alpha}(c(\pi))}\right)\|v\|.
\end{align*}

Interpolating gives that for all $n\geq 0$,

$\|\mathcal{L}_{\pi}^{n}\|\leq\left(1-\frac{1}{10C^{2}|W^{n_{0}}|\log^{2\alpha}(c(\pi))}\right)^{-1}\left(1-\frac{1}{10C^{2}|W^{n_{0}}|\log^{2\alpha}(c(\pi))}\right)^{n/n_{0}}.$

As $(\pi,V)$ ranges over all non-trivial representations, $c(\pi)$ is uniformly bounded away from $0$; see [Wal18, 5.6.7]. This implies that the first term above is uniformly bounded by some $D>0$ independent of $\pi$. Applying the estimate $(1+x)^{\epsilon}\leq 1+\epsilon x$ to the second term then gives the proposition. ∎

Notice that in Proposition 13 we obtain an entire neighborhood of our initial set $S$ on which we have the same estimates for $\mathcal{L}_{\pi}$. Consequently, because these estimates remain true under small perturbations, we think of them as being stable. We will use the term "stable" in the following precise sense.

Definition 15.

Suppose that $T$ is some property of a tuple $W=(g_{1},\ldots,g_{m})$ with elements in a Lie group $G$. We say that $T$ is stable at $W=(g_{1},\ldots,g_{m})$ if there exists a neighborhood $N$ of $(g_{1},\ldots,g_{m})$ in $G\times\cdots\times G$ such that if $(g_{1}^{\prime},\ldots,g_{m}^{\prime})\in N$ then $T$ holds for $(g_{1}^{\prime},\ldots,g_{m}^{\prime})$. We will also say that $T$ is stable, without reference to a particular tuple, when the relevant tuples at which $T$ is stable are evident.

A crucial aspect of the Diophantine property in compact semisimple Lie groups is that by Proposition 10 there is a stable lower bound on $(C,\alpha)$. This stability will be essential during the KAM scheme.

2.4. Diophantine sets and tameness

Consider a smooth vector bundle $E$ over a closed manifold $M$ and the space $C^{\infty}(M,E)$ of smooth sections of $E$. Consider a linear map $L\colon C^{\infty}(M,E)\to C^{\infty}(M,E)$. We say that $L$ is tame if there exists $\alpha$ such that for all $k$ there exists $C_{k}$ such that for all $s\in C^{\infty}(M,E)$,

$\|Ls\|_{C^{k}}\leq C_{k}\|s\|_{C^{k+\alpha}}.$

See [Ham82, II.2.1] for more about tameness. The main result of this section is to establish such estimates for certain operators related to $\mathcal{L}$.
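
As an elementary example of a tame operator (not needed later): a linear differential operator $P$ of order $r$ with smooth coefficients on a closed manifold satisfies, for every $k$,

$\|Ps\|_{C^{k}}\leq C_{k}\|s\|_{C^{k+r}}$ for all $s\in C^{\infty}(M,E)$,

so $P$ is tame with loss $\alpha=r$. The content of Proposition 16 below is that $(I-\mathcal{L})^{-1}$, although not a differential operator, satisfies an estimate of the same shape.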

Though $\mathcal{L}$ acts in any representation of $G$, we are most interested in the action of $G$ on the sections of certain vector bundles, which we now describe. Suppose that $K$ is a closed subgroup of $G$ and that $E$ is a smooth vector bundle over $G/K$. We say that $E$ is a homogeneous vector bundle over $G/K$ if $G$ acts on $E$ by bundle maps and this action projects to the action of $G$ on $G/K$ by left translation. We now give an explicit description of all homogeneous vector bundles over $G/K$ via the Borel construction; see [Wal18, Ch. 5] for more details about this topic and what follows. Suppose that $(\tau,E_{0})$ is a finite dimensional unitary representation of $K$. Form the trivial bundle $G\times E_{0}$. Then $K$ acts on this bundle by $(g,v)\mapsto(gk,\tau(k)^{-1}v)$. Then $(G\times E_{0})/K$ is a vector bundle over $G/K$ that we denote by $G\times_{\tau}E_{0}$. Note, for instance, that $C^{\infty}(G,\mathbb{R})$ is the space of sections of the homogeneous vector bundle obtained from the trivial representation of $\{e\}<G$. The left action of $G$ on $G\times E_{0}$ descends to $G\times_{\tau}E_{0}$, and hence this is a homogeneous vector bundle.

In order to do analysis in a homogeneous vector bundle, we must introduce some additional structure. Suppose that $E=G\times_{\tau}E_{0}$ is a homogeneous vector bundle. The base $G/K$ comes equipped with the projection of the Haar measure on $G$. As the action of $K$ on $G\times E_{0}$ is isometric on fibers, the fibers of $E$ are naturally endowed with an inner product. We may then consider the space $L^{2}(E)$ of all $L^{2}$ sections of $E$. In addition, we will write $C^{\infty}(E)$ for the space of all smooth sections of $E$. The action of $G$ on $E$ preserves $L^{2}(E)$ and $C^{\infty}(E)$.

We recall briefly how one may do harmonic analysis on sections of such bundles. As before, let $\Omega$ be the Casimir operator, which is an element of $U(\mathfrak{g})$. Then $\Omega$ acts on the $C^{\infty}$ vectors of any representation of $G$. Denote by $\Delta$ the differential operator obtained by the action of $-\Omega$ on $C^{\infty}(E)$. Then $\Delta$ is a hypoelliptic differential operator on $E$. We then use the spectrum of $\Delta$ to define, for any $s\geq 0$, the Sobolev norm $H^{s}$ in the following manner. $L^{2}(E)$ may be decomposed as the Hilbert space direct sum of finite dimensional irreducible unitary representations $V_{\pi}$. Write $\phi=\sum_{\pi}\phi_{\pi}$ for the decomposition of an element $\phi\in L^{2}(E)$. Then the $s$-Sobolev norm is defined by

$\|\phi\|_{H^{s}}^{2}=\sum_{\pi}(1+c(\pi))^{s}\|\phi_{\pi}\|_{L^{2}}^{2}.$

We write $\|f\|_{C^{s}}$ for the usual $C^{s}$ norm of a function or section of a vector bundle. It is not always necessary to work with the decomposition of $L^{2}(E)$ into irreducible subspaces; one may instead use a coarser decomposition as follows. We let $H_{\lambda}$ denote the subspace of $L^{2}(E)$ on which $\Delta$ acts by multiplication by $\lambda>0$. There are countably many such subspaces $H_{\lambda}$ and each is finite dimensional. In the sequel, those functions that are orthogonal to the trivial representations in $L^{2}(E)$ will be of particular importance. We denote by $L^{2}_{0}(E)$ the orthogonal complement of the trivial representations in $L^{2}(E)$, and by $C^{\infty}_{0}(E)$ the subspace $L^{2}_{0}(E)\cap C^{\infty}(E)$.
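
For later reference (in particular, for the Sobolev embedding step in the proof of Proposition 16), recall that these norms compare with the $C^{s}$ norms in the standard way: for integer $s\geq 0$ and any $\tau>\frac{1}{2}\dim(G/K)$ there is a constant $C$, depending on $E$, $s$, and $\tau$, with

$\|\phi\|_{C^{s}}\leq C\|\phi\|_{H^{s+\tau}}$ and $\|\phi\|_{H^{s}}\leq C\|\phi\|_{C^{s}}$.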

We now consider the action of $\mathcal{L}$ on the sections of a homogeneous vector bundle.

Proposition 16.

[DK07, Prop. 1] (Tameness) Suppose that $(g_{1},\ldots,g_{m})$ is a Diophantine tuple with elements in a compact connected semisimple Lie group $G$. Suppose that $E$ is a homogeneous vector bundle on which $G$ acts. Then there exist constants $C_{1},\alpha_{1},\alpha_{2}>0$ such that for any $s\geq 0$ there exists $C_{s}$ such that for any nonzero $\phi\in C^{\infty}_{0}(G/K,E)$ the following holds:

$\|(I-\mathcal{L})^{-1}\phi\|_{H^{s}}\leq C_{1}\|\phi\|_{H^{s+\alpha_{1}}}$

and

$\|(I-\mathcal{L})^{-1}\phi\|_{C^{s}}\leq C_{s}\|\phi\|_{C^{s+\alpha_{2}}}.$

Moreover, these estimates are stable.

Proof.

As before, let $H_{\lambda}$ be the $\lambda$-eigenspace of $\Delta$ acting on sections of $E$. Let $\mathcal{L}_{\lambda}$ denote the action of $\mathcal{L}$ on $H_{\lambda}$. From Proposition 13, we see that there exist $D_{1},D_{2}$ and $\alpha_{3}$ such that for all $\lambda>0$, $\|\mathcal{L}_{\lambda}^{n}\|_{H^{0}}\leq D_{1}(1-1/(D_{2}\log^{\alpha_{3}}(\lambda)))^{n}$. Thus there exists $C_{3}$ such that $\|(I-\mathcal{L}_{\lambda})^{-1}\|_{H^{0}}\leq C_{3}\log^{\alpha_{3}}(\lambda)$. Now observe that in the following sum $\lambda\neq 0$, by our assumption that $\phi$ is orthogonal to the trivial representations contained in $L^{2}(E)$:

\begin{align*}
\|(I-\mathcal{L})^{-1}\phi\|_{H^{s}}^{2} &= \sum_{\lambda>0}(1+\lambda)^{s}\|(I-\mathcal{L}_{\lambda})^{-1}\phi_{\lambda}\|^{2}_{L^{2}} \\
&\leq \sum_{\lambda>0}(1+\lambda)^{s}\|(I-\mathcal{L}_{\lambda})^{-1}\|^{2}\|\phi_{\lambda}\|^{2}_{L^{2}} \\
&\leq \sum_{\lambda>0}C_{3}^{2}\log^{2\alpha_{3}}(\lambda)(1+\lambda)^{s}\|\phi_{\lambda}\|^{2}_{L^{2}} \\
&\leq \sum_{\lambda>0}C_{4}^{2}(1+\lambda)^{s+\alpha_{1}}\|\phi_{\lambda}\|^{2}_{L^{2}} \\
&\leq C_{4}^{2}\|\phi\|_{H^{s+\alpha_{1}}}^{2},
\end{align*}

for any $\alpha_{1}>0$ and sufficiently large $C_{4}$. The second estimate in the proposition then follows from the first by applying the Sobolev embedding theorem. ∎

2.5. Application to isotropic manifolds

We now introduce the class of isotropic manifolds, which are the subject of this paper and whose isometry groups may be studied along the above lines. We say that $M$ is isotropic if $\operatorname{Isom}(M)$ acts transitively on the unit tangent bundle $T^{1}M$ of $M$. This is equivalent to $\operatorname{Isom}(M)^{\circ}$ acting transitively on $T^{1}M$. There are not many isotropic manifolds; in fact, all are globally symmetric spaces. The following is the complete list of all compact isotropic manifolds:

  1. $S^{n}=\operatorname{SO}(n+1)/\operatorname{SO}(n)$, the sphere,

  2. $\mathbb{R}\operatorname{P}^{n}=\operatorname{SO}(n+1)/O(n)$, real projective space,

  3. $\mathbb{C}\operatorname{P}^{n}=\operatorname{SU}(n+1)/U(n)$, complex projective space,

  4. $\mathbb{H}\operatorname{P}^{n}=\operatorname{Sp}(n+1)/(\operatorname{Sp}(n)\times\operatorname{Sp}(1))$, quaternionic projective space,

  5. $F_{4}/\operatorname{Spin}(9)$, the Cayley projective plane.

A proof of this classification may be found in [Wol72, Thm. 8.12.2].

Though $S^{1}$ is an isotropic manifold, we will exclude it in all future statements because its isometry group is not semisimple. The reason that we study isotropic manifolds is that if $M$ is an isotropic manifold other than $S^{1}$, then $\operatorname{Isom}(M)$ is semisimple.

Lemma 17.

Suppose that $M$ is a compact connected isotropic manifold other than $S^{1}$. Then $\operatorname{Isom}(M)$ is semisimple. The same is true for $\operatorname{Isom}(M)^{\circ}$, the connected component of the identity.

For a proof of this lemma, see [Sha01], which computes the isometry group of each of these spaces explicitly. In fact, these isometry groups all have simple Lie algebras.

One minor issue with applying what we have developed so far to isotropic manifolds is that $\operatorname{Isom}(M)$ need not be connected. Even in the case of $S^{2}$, $\operatorname{Isom}(M)$ is disconnected. In fact, Dolgopyat and Krikorian assume that the isometries in their theorem all lie in the identity component of $\operatorname{Isom}(M)$ and hence are rotations. Here, we consider the full isometry group, so Theorem 1 is a generalization even in the case of $S^{n}$. That said, the generalization is minor: the identity component has index $2$ in the full isometry group.

Although connectedness of $\operatorname{Isom}(M)$ has not been the crux of the previous arguments, if $\operatorname{Isom}(M)\neq\operatorname{Isom}(M)^{\circ}$, then there are "extra" representations of $\operatorname{Isom}(M)$ appearing in the definition of Diophantineness that would need to be dealt with slightly differently. For this reason we give the following definition, which is adapted to the case where $\operatorname{Isom}(M)$ is not connected.

Definition 18.

We say that a tuple $(g_{1},\ldots,g_{m})$ with each $g_{i}\in\operatorname{Isom}(M)$ is Diophantine if there exists $n$ such that, writing $S=\{g_{1},\ldots,g_{m}\}$, the set $S^{n}\cap\operatorname{Isom}(M)^{\circ}$ is $(C,\alpha)$-Diophantine for some $C,\alpha>0$. We say that such a tuple is $(C,\alpha,n)$-Diophantine.

It follows from Proposition 7 that if a tuple is Diophantine, then there exists a neighborhood of that tuple such that the constants C,\alpha,n may be taken to be uniform over that neighborhood. Thus Diophantineness in this more general sense is a stable property. The following analogue, for the possibly disconnected group \operatorname{Isom}(M), of the earlier characterization of Diophantine sets is then immediate.

Proposition 19.

Let MM be a closed isotropic manifold of dimension at least 22 and SS be a finite subset of Isom(M)\operatorname{Isom}(M). The set SS is Diophantine if and only if Isom(M)S¯\operatorname{Isom}(M)^{\circ}\subseteq\overline{\langle S\rangle}. Moreover, there exists ϵ0(M),C,α,n>0\epsilon_{0}(M),C,\alpha,n>0 such that any subset of Isom(M)\operatorname{Isom}(M) that is ϵ0\epsilon_{0}-dense in Isom(M)\operatorname{Isom}(M)^{\circ} is stably (C,α,n)(C,\alpha,n)-Diophantine.

We will show a tameness result in this setting. The important point is that \operatorname{Isom}(M)^{\circ} is a connected semisimple Lie group and TM is a homogeneous vector bundle on which \operatorname{Isom}(M)^{\circ} acts. Further, because M is isotropic, L^{2}(M,TM) contains no trivial representations of \operatorname{Isom}(M)^{\circ}. Thus we are almost in a position to apply Proposition 16. There is one small issue: there may be representations of \operatorname{Isom}(M) that are trivial on \operatorname{Isom}(M)^{\circ}, and hence the previous arguments do not apply directly to these representations. However, for the purpose of studying sections of TM, studying representations of \operatorname{Isom}(M)^{\circ} suffices. The following proposition explains how one may get around this issue to recover the appropriate analogue of Proposition 13. It is important to note that there are many choices of a “Laplacian” acting on vector fields over a manifold, and they may not all be the same. In our case, we work with the Casimir Laplacian, which arises from viewing TM as a homogeneous vector bundle. Given a tuple (g_{1},\ldots,g_{m}) of isometries of M, the associated operator \mathcal{L} acting on L^{2}(M,TM) is defined on a vector field V by V\mapsto m^{-1}\sum_{i=1}^{m}(Dg_{i})_{*}V.
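To make the definition of \mathcal{L} concrete, the following sketch (an illustration added here, not part of the argument) realizes the averaged push-forward operator on the sphere S^{2}\subset\mathbb{R}^{3}, where, restricting to the identity component, the isometries are rotations R\in\operatorname{SO}(3) and the push-forward of a tangent vector field V by R is (R_{*}V)(x)=RV(R^{-1}x).

```python
import numpy as np

# Illustrative sketch (not from the paper): the averaged push-forward operator
# L V = (1/m) sum_i (Dg_i)_* V on the sphere S^2 in R^3, where each g_i is a
# rotation R_i in SO(3) and (R_* V)(x) = R V(R^T x).

def rotation(axis, angle):
    """Rotation matrix about a unit axis by the given angle (Rodrigues formula)."""
    axis = np.asarray(axis, dtype=float)
    axis /= np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

def pushforward(R, V):
    """Push forward the tangent vector field V on S^2 by the rotation R."""
    return lambda x: R @ V(R.T @ x)

def averaged_operator(rotations, V):
    """The operator V -> (1/m) sum_i (R_i)_* V."""
    pushed = [pushforward(R, V) for R in rotations]
    return lambda x: sum(W(x) for W in pushed) / len(pushed)

if __name__ == "__main__":
    # A smooth tangent vector field on S^2: project a constant field onto T_x S^2.
    e = np.array([0.0, 0.0, 1.0])
    V = lambda x: e - np.dot(e, x) * x
    Rs = [rotation([1, 0, 0], 1.0), rotation([0, 1, 0], np.sqrt(2))]
    LV = averaged_operator(Rs, V)
    x = np.array([1.0, 0.0, 0.0])
    print(LV(x), np.dot(LV(x), x))  # second number is ~0: LV(x) is tangent at x
```

Since the g_{i} are isometries, each (Dg_{i})_{*} preserves the L^{2} inner product on vector fields, so \mathcal{L} is an average of unitary operators on L^{2}(M,TM); this is the structure exploited below.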

Proposition 20.

Suppose that MM is a closed isotropic manifold with dimM2\dim M\geq 2. Suppose that (g1,,gm)(g_{1},\ldots,g_{m}) is a Diophantine tuple with elements in Isom(M)\operatorname{Isom}(M). There exists a neighborhood 𝒩\mathcal{N} of (g1,,gm)(g_{1},\ldots,g_{m}) in Isom(M)××Isom(M)\operatorname{Isom}(M)\times\cdots\times\operatorname{Isom}(M) and constants D1,D2,α>0D_{1},D_{2},\alpha>0 such that if (g1,,gm)𝒩(g_{1}^{\prime},\ldots,g_{m}^{\prime})\in\mathcal{N}, then {g1,,gm}\{g_{1}^{\prime},\ldots,g_{m}^{\prime}\} is Diophantine. Let HλH_{\lambda} denote the λ\lambda-eigenspace of Δ\Delta acting on sections of TMTM. For any tuple in this neighborhood, the associated operator \mathcal{L} acts on L2(M,TM)L^{2}(M,TM) and preserves the HλH_{\lambda}-eigenspaces. In fact, writing λ\mathcal{L}_{\lambda} for this induced action we have that:

λnD1(11D2logα(λ))n.\|\mathcal{L}^{n}_{\lambda}\|\leq D_{1}\left(1-\frac{1}{D_{2}\log^{\alpha}(\lambda)}\right)^{n}.

The same holds for the eigenspaces HλH_{\lambda} of Δ\Delta acting on other bundles over MM assuming that Isom(M)\operatorname{Isom}(M) acts isometrically on the space of sections of those bundles. In cases where there is a trivial representation, we must also assume λ>0\lambda>0. Examples of such bundles are L2(M,)L^{2}(M,\mathbb{R}) as well as L2(Grr(M),)L^{2}(\operatorname{Gr}_{r}(M),\mathbb{R}) in the case that Isom(M)\operatorname{Isom}(M)^{\circ} acts transitively on the rr-planes in TMTM.

Proof.

The key steps in the proof are substantially similar to those in Proposition 13, once we show that the elements of Isom(M)\operatorname{Isom}(M) all preserve the spaces HλH_{\lambda}. Let Γ\Gamma be a bundle as in the statement of the proposition that Isom(M)\operatorname{Isom}(M) acts on isometrically.

Claim 2.

Suppose that V\subset\Gamma is an irreducible representation of \operatorname{Isom}(M)^{\circ} isomorphic to (\pi,W). Then for any k\in\operatorname{Isom}(M), kV is an irreducible representation of \operatorname{Isom}(M)^{\circ} isomorphic to (\pi\circ\alpha,W) for some automorphism \alpha of \operatorname{Isom}(M)^{\circ}. In particular, c(\pi\circ\alpha)=c(\pi).

Proof.

Let g^{k}=k^{-1}gk as usual. We claim that for any k\in\operatorname{Isom}(M), kV is a representation of \operatorname{Isom}(M)^{\circ}. To see this, note that for v\in V we have g(kv)=kg^{k}v, and g^{k}\in\operatorname{Isom}(M)^{\circ} because \operatorname{Isom}(M)^{\circ} is normal in \operatorname{Isom}(M), so g(kv)\in kV. Moreover, it is straightforward to see that the representation of \operatorname{Isom}(M)^{\circ} on kV is isomorphic to the representation (\pi\circ\alpha,W), where \alpha is the automorphism g\mapsto g^{k}.

We now claim that c(\pi\circ\alpha)=c(\pi). Because \alpha is an automorphism, it preserves the Killing form, and hence the Casimir element may also be written as \sum_{i}(d\alpha^{-1}(X_{i}))^{2}. Now note that if one traces through the computation of the value c(\pi\circ\alpha) for the representation \pi\circ\alpha, the \alpha^{-1} we have introduced cancels with the \alpha. Thus the computation reduces to the computation of c(\pi) with the original expression \sum_{i}X_{i}^{2}. Hence c(\pi\circ\alpha)=c(\pi). ∎

To conclude from this point, one repeats the argument of Proposition 13, except that we start with the set S^{n_{0}} and only make use of the elements in S^{n_{0}}\cap\operatorname{Isom}(M)^{\circ}. No issues arise because, as we have now shown, the elements that do not lie in \operatorname{Isom}(M)^{\circ} still act by isometries preserving each H_{\lambda}. ∎

Having established the previous proposition the following is immediate and may be shown by repeating the argument of Proposition 16.

Proposition 21.

Suppose that MM is a closed isotropic manifold with dimM2\dim M\geq 2. Suppose that (g1,,gm)(g_{1},\ldots,g_{m}) is a Diophantine tuple with elements in Isom(M)\operatorname{Isom}(M). There exist constants C1,α1,α2>0C_{1},\alpha_{1},\alpha_{2}>0 such that for any s0s\geq 0 there exists CsC_{s} such that for any ϕC(M,TM)\phi\in C^{\infty}(M,TM) the following holds:

(I)1ϕHsC1ϕHs+α1,\|(I-\mathcal{L})^{-1}\phi\|_{H^{s}}\leq C_{1}\|\phi\|_{H^{s+\alpha_{1}}},

and

(I)1ϕCsCsϕCs+α2.\|(I-\mathcal{L})^{-1}\phi\|_{C^{s}}\leq C_{s}\|\phi\|_{C^{s+\alpha_{2}}}.

Moreover these estimates are stable. The same holds for the action of \mathcal{L} on any of the sections of any of the bundles that Proposition 20 applies to.

3. Approximation of Stationary Measures

In this section, we introduce the notion of a stationary measure associated to a random dynamical system. We consider stationary measures of certain random dynamical systems associated to a Diophantine subset of a compact semisimple Lie group as well as perturbations of these systems. We begin by introducing these systems and some associated transfer operators. In Proposition 23, we give an asymptotic expansion of the stationary measures of a perturbation.

3.1. Random dynamical systems and their transfer operators

We now give some basic definitions concerning random dynamical systems. For general treatments of random dynamical systems and their basic properties, see [Kif86] or [Arn13]. If (f1,,fm)(f_{1},...,f_{m}) is a tuple of maps of a standard Borel space MM, then these maps generate a uniform Bernoulli random dynamical system on MM. This dynamical system is given by choosing an index 1im1\leq i\leq m uniformly at random and then applying the function fif_{i} to MM. To iterate the system further, one chooses additional independent uniformly distributed indices and repeats. We always use the words random dynamical system to mean uniform Bernoulli random dynamical system in the sense just described.

Associated to this random dynamical system are two operators. The first operator is called the averaged Koopman operator. It acts on functions and is defined by

(7) ϕ1mi=1mϕfi.\mathcal{M}\phi\coloneqq\frac{1}{m}\sum_{i=1}^{m}\phi\circ f_{i}.

The second operator is called the averaged transfer operator. It acts on measures and is defined by

(8) μ1mi=1m(fi)μ.\mathcal{M}^{*}\mu\coloneqq\frac{1}{m}\sum_{i=1}^{m}(f_{i})_{*}\mu.

Depending on the space MM, we may restrict the domains of these operators to a suitable subset of the spaces of functions and measures on MM. We say that a measure is stationary if μ=μ\mathcal{M}^{*}\mu=\mu. We assume that stationary measures have unit mass.
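As an illustration (a toy example of our own, not taken from the paper), the following sketch implements the uniform Bernoulli random dynamical system and the two averaged operators for maps of a finite set, where measures are stored as vectors and the duality \int\mathcal{M}\phi\,d\mu=\int\phi\,d\mathcal{M}^{*}\mu can be checked directly.

```python
import numpy as np

# Toy sketch (ours): the uniform Bernoulli random dynamical system generated by
# maps f_1,...,f_m of the finite set {0,...,N-1}, with the averaged Koopman
# operator (7) acting on functions and the averaged transfer operator (8)
# acting on measures (stored as vectors).

rng = np.random.default_rng(0)
N, m = 6, 3
maps = [rng.integers(0, N, size=N) for _ in range(m)]   # each f_i as an index array

def random_orbit(x, n_steps):
    """Iterate the system: pick an index uniformly at random, apply that map."""
    for _ in range(n_steps):
        x = maps[rng.integers(m)][x]
    return x

def koopman(phi):
    """(M phi)(x) = (1/m) sum_i phi(f_i(x))."""
    return sum(phi[f] for f in maps) / m

def transfer(mu):
    """M* mu = (1/m) sum_i (f_i)_* mu."""
    out = np.zeros(N)
    for f in maps:
        np.add.at(out, f, mu / m)
    return out

print(random_orbit(0, 10))                    # one sample orbit endpoint
phi = rng.normal(size=N)
mu = rng.random(N); mu /= mu.sum()
# Duality between the two operators holds exactly.
print(np.isclose(koopman(phi) @ mu, phi @ transfer(mu)))
# Averaging the iterates of M* approximates a stationary measure mu = M* mu.
avg, nu = np.zeros(N), np.full(N, 1.0 / N)
for n in range(1, 5001):
    nu = transfer(nu)
    avg += (nu - avg) / n
print(np.max(np.abs(avg - transfer(avg))))    # small: avg is nearly stationary
```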

In this paper, we take MM to be a compact homogeneous space G/KG/K. If gGg\in G, then left translation by gg gives an isometry of G/KG/K that we also call gg. As before, a tuple (g1,,gm)(g_{1},...,g_{m}) with each giGg_{i}\in G generates a random dynamical system on G/KG/K. We will also consider perturbations of this random dynamical system. Consider a tuple (f1,,fm)(f_{1},...,f_{m}) where each fiDiff(G/K)f_{i}\in\operatorname{Diff}^{\infty}(G/K). This collection also generates a random dynamical system on G/KG/K. The indices 1,,m1,...,m give a natural way to compare the two systems. We refer to the initial system as homogeneous or linear and to the latter system as non-homogeneous or non-linear.

We will simultaneously work with a homogeneous and a non-homogeneous system, so we now introduce notation to distinguish the operators of each. We write \mathcal{M} for the averaged Koopman operator associated to the system generated by the tuple (g_{1},\ldots,g_{m}) and we write \mathcal{M}_{\epsilon} for the averaged Koopman operator associated to the tuple (f_{1},\ldots,f_{m}). Analogously we use the notation \mathcal{M}^{*} and \mathcal{M}_{\epsilon}^{*}.

Later we will compare the homogeneous system given by a tuple (g1,,gm)(g_{1},...,g_{m}) and a non-homogeneous perturbation (f1,,fm)(f_{1},...,f_{m}). We thus introduce the notation

(9) εkmaxidCk(fi,gi),\varepsilon_{k}\coloneqq\max_{i}d_{C^{k}}(f_{i},g_{i}),

for describing how large a perturbation is. In addition, it will be useful to have a linearization of the difference between fif_{i} and gig_{i}. The standard way to do this is via a chart on the Fréchet manifold Diff(G/K)\operatorname{Diff}^{\infty}(G/K). If dC0(fi,gi)<injG/Kd_{C^{0}}(f_{i},g_{i})<\operatorname{inj}G/K, then we associate fif_{i} with the vector field YiY_{i} defined at gi(x)G/Kg_{i}(x)\in G/K by

(10) Yi(gi(x))expgi(x)1fi(x),Y_{i}(g_{i}(x))\coloneqq\exp_{g_{i}(x)}^{-1}f_{i}(x),

where, among the preimages of f_{i}(x) in T_{g_{i}(x)}G/K under the map \exp_{g_{i}(x)}, we choose the one of minimum length. In addition, if Y is a vector field on M, then we define \psi_{Y}\colon M\to M to be the map that sends

(11) ψY:xexpx(Y(x)).\psi_{Y}:x\mapsto\exp_{x}(Y(x)).
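For concreteness, the following sketch (ours, on the round sphere S^{2}\subset\mathbb{R}^{3}, where \exp_{x}(v)=\cos(\|v\|)x+\sin(\|v\|)v/\|v\|) computes the vector field Y of equation (10) for a perturbation f of an isometry g and checks the identity f=\psi_{Y}\circ g that follows from (10) and (11).

```python
import numpy as np

# Illustrative sketch (ours, for the round sphere S^2 in R^3): the chart
# Y(g(x)) = exp_{g(x)}^{-1}(f(x)) of equation (10) and the map psi_Y of (11).

def exp_map(x, v):
    """Riemannian exponential on S^2: v is tangent to S^2 at the unit vector x."""
    n = np.linalg.norm(v)
    if n < 1e-15:
        return x
    return np.cos(n) * x + np.sin(n) * v / n

def exp_inverse(x, y):
    """Minimal-length v in T_x S^2 with exp_x(v) = y (y not antipodal to x)."""
    c = np.clip(np.dot(x, y), -1.0, 1.0)
    theta = np.arccos(c)            # geodesic distance from x to y
    w = y - c * x                   # component of y tangent to S^2 at x
    n = np.linalg.norm(w)
    if n < 1e-15:
        return np.zeros(3)
    return theta * w / n

def rotation_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

if __name__ == "__main__":
    g = rotation_z(0.7)                                  # an isometry of S^2
    f = lambda x: exp_map(g @ x, 0.05 * np.cross(g @ x, [0.0, 0.0, 1.0]))  # small perturbation of g
    x = np.array([1.0, 0.0, 0.0])
    Y_at_gx = exp_inverse(g @ x, f(x))                   # the vector field Y of (10) at g(x)
    # psi_Y recovers f:  psi_Y(g(x)) = exp_{g(x)}(Y(g(x))) = f(x)
    print(np.allclose(exp_map(g @ x, Y_at_gx), f(x)))
```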

The following theorem asserts the existence of Lyapunov exponents for random dynamical systems.

Theorem 22.

[Kif86, Ch. 3, Thm. 1.1]. Suppose that E is a measurable vector bundle over a Borel space M. Suppose that F_{1},F_{2},\ldots is a sequence of independent and identically distributed bundle maps of E with common distribution \nu, and suppose that \nu has finite support. Suppose that \mu is an ergodic \nu-stationary measure on M for the random dynamics on M induced by those on E.

Then there exists a list of numbers, the Lyapunov exponents,

<λs<λs1<<λ1<,-\infty<\lambda^{s}<\lambda^{s-1}<\cdots<\lambda^{1}<\infty,

such that for μ\mu a.e. xMx\in M and almost every realization of the sequence, there exists a filtration of linear subspaces

0VsV1Ex0\subset V^{s}\subset\cdots\subset V^{1}\subset E_{x}

such that, for that particular realization of the sequence, if \xi\in V^{i}\setminus V^{i+1}, where V^{i}\equiv\{0\} for i>s, then

\lim_{n\to\infty}\frac{1}{n}\log\|F_{n}\circ\cdots\circ F_{1}\xi\|=\lambda^{i}.
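For intuition (a numerical sketch of our own, not part of the theorem), the simplest instance is a trivial base: M a single point and E=\mathbb{R}^{d}, so that the F_{i} are i.i.d. random matrices. The exponents of such a product can be estimated with the standard QR scheme, and a rotation-valued (isometric) example has all exponents zero, in line with the dichotomy studied in this paper.

```python
import numpy as np

# Illustrative sketch (ours): estimating the Lyapunov exponents of an i.i.d.
# random product F_n ... F_1 of d x d matrices by the standard QR scheme.
# This is the special case of Theorem 22 in which the base space is a point.

def lyapunov_exponents(sample_matrix, n_steps=20000, d=2, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.eye(d)
    sums = np.zeros(d)
    for _ in range(n_steps):
        Q, R = np.linalg.qr(sample_matrix(rng) @ Q)
        sums += np.log(np.abs(np.diag(R)))
    return sums / n_steps          # approximations of the exponents, largest first

if __name__ == "__main__":
    # Random rotations only: an isometric system, all exponents are zero.
    def rot(rng):
        t = rng.uniform(0, 2 * np.pi)
        return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    # A volume-preserving non-isometric system: a positive/negative pair.
    def perturbed(rng):
        s = 1.0 + 0.1 * rng.standard_normal()
        return rot(rng) @ np.diag([s, 1.0 / s])
    print(lyapunov_exponents(rot))         # close to [0, 0]
    print(lyapunov_exponents(perturbed))   # close to [c, -c] for some c > 0
```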

3.2. Approximation of stationary measures

Let dmdm denote the push-forward of Haar measure to G/KG/K. Note that Haar measure is stationary for the homogeneous random dynamical system given by (g1,,gm)(g_{1},...,g_{m}). The following proposition compares the integral against a stationary measure μ\mu for a perturbation (f1,,fm)(f_{1},...,f_{m}) and the Haar measure. Up to higher order terms, the difference between integrating against Haar and against μ\mu is given by the integral of a particular function 𝒰(ϕ)\mathcal{U}(\phi). We obtain an explicit expression for 𝒰(ϕ)\mathcal{U}(\phi), which is useful because we can tell when 𝒰(ϕ)\mathcal{U}(\phi) vanishes and thus when μ\mu is near to Haar. Compare the following with [DK07, Prop. 2].

Proposition 23.

Suppose that S=(g_{1},\ldots,g_{m}) is a Diophantine tuple with elements in a compact connected semisimple group G, or with elements in \operatorname{Isom}(M) for an isotropic manifold M with \dim M\geq 2. Let G/K be a quotient of G in the former case, or a space on which \operatorname{Isom}(M)^{\circ} acts transitively in the latter. There exist constants k and C such that if (f_{1},\ldots,f_{m}) is a tuple with elements in \operatorname{Diff}^{\infty}(G/K) with \varepsilon_{0}=\max_{i}d_{C^{0}}(f_{i},g_{i})<\operatorname{inj}G/K, then the following holds for each stationary measure \mu of the uniform Bernoulli random dynamical system generated by the f_{i}. Let Y_{i} be the vector fields defined by Y_{i}(g_{i}(x))=\exp^{-1}_{g_{i}(x)}f_{i}(x) as in equation (10). Then for any \phi\in C^{\infty}(G/K), we have

(12) G/Kϕ𝑑μ=G/Kϕ𝑑m+G/K𝒰(ϕ)𝑑m+O(εk2ϕCk),\int_{G/K}\phi\,d\mu=\int_{G/K}\phi\,dm+\int_{G/K}\mathcal{U}(\phi)\,dm+O(\varepsilon_{k}^{2}\|\phi\|_{C^{k}}),

where dmdm denotes the normalized push-forward of Haar measure to G/KG/K and

(13) 𝒰(ϕ)1mi=1mYi(I)1(ϕϕdm).\mathcal{U}(\phi)\coloneqq\frac{1}{m}\sum_{i=1}^{m}\nabla_{Y_{i}}(I-\mathcal{M})^{-1}(\phi-\smallint\phi\,dm).

Moreover,

(14) |𝒰(ϕ)𝑑m|CϕCki=1mYiCk,\left|\int\mathcal{U}(\phi)\,dm\right|\leq C\|\phi\|_{C^{k}}\left\|\sum_{i=1}^{m}Y_{i}\right\|_{C^{k}},

and the constants, including the constant in the big-OO in equation (12), are stable in SS.

Proof.

The proof is similar to the proof of [Mal12, Prop. 4]. We write the proof for the connected group G; the proof for \operatorname{Isom}(M) is identical, using Proposition 21 in place of Proposition 16.

Note that a smooth real valued function defined on G/KG/K is naturally viewed as a section of the trivial bundle over G/KG/K. If we view the averaged Koopman operator \mathcal{M} associated to (g1,,gm)(g_{1},...,g_{m}) as acting on the sections of the trivial bundle G/K×G/K\times\mathbb{R}, then \mathcal{M} satisfies the hypotheses of Proposition 16. Thus there exists α\alpha and constants CsC_{s} such that for any ϕC0(G/K)\phi\in C^{\infty}_{0}(G/K), the space of integral 0 smooth functions on G/KG/K,

(15) (I)1ϕCsCsϕCs+α.\|(I-\mathcal{M})^{-1}\phi\|_{C^{s}}\leq C_{s}\|\phi\|_{C^{s+\alpha}}.

Observe that for any ii:

|ϕfi(x)ϕgi(x)|ε0ϕC1.\left|\phi\circ f_{i}(x)-\phi\circ g_{i}(x)\right|\leq\varepsilon_{0}\|\phi\|_{C^{1}}.

Since μ\mu is ϵ\mathcal{M}_{\epsilon}^{*} invariant, this implies that

|ϕϕdμ|=|εϕϕdμ|ε0ϕC1.\left|\int\phi-\mathcal{M}\phi\,d\mu\right|=\left|\int\mathcal{M}_{\varepsilon}\phi-\mathcal{M}\phi\,d\mu\right|\leq\varepsilon_{0}\|\phi\|_{C^{1}}.

Substituting (I)1(ϕϕ𝑑m)(I-\mathcal{M})^{-1}(\phi-\int\phi\,dm) for the function ϕ\phi in the previous line and using equation (15) yields a first order approximation:

(16) |ϕ𝑑μϕ𝑑m|ε0C1ϕC1+α.\left|\int\phi\,d\mu-\int\phi\,dm\right|\leq\varepsilon_{0}C_{1}\|\phi\|_{C^{1+\alpha}}.

We now use this first order approximation to obtain a better estimate. Note the Taylor expansion:

ϕfi(x)ϕgi(x)=(Yiϕ)(gi(x))+O(ε02ϕC2).\phi\circ f_{i}(x)-\phi\circ g_{i}(x)=(\nabla_{Y_{i}}\phi)(g_{i}(x))+O(\varepsilon_{0}^{2}\|\phi\|_{C^{2}}).

Integrating against μ\mu yields

ϕϕdμ=εϕϕdμ=1mi=1mYiϕ(gi(x))dμ+O(ε02ϕC2).\int\phi-\mathcal{M}\phi\,d\mu=\int\mathcal{M}_{\varepsilon}\phi-\mathcal{M}\phi\,d\mu=\int\frac{1}{m}\sum_{i=1}^{m}\nabla_{Y_{i}}\phi(g_{i}(x))\,d\mu+O(\varepsilon_{0}^{2}\|\phi\|_{C^{2}}).

We now plug in (I)1(ϕϕ𝑑m)(I-\mathcal{M})^{-1}(\phi-\smallint\phi\,dm) for ϕ\phi in the previous line and use the estimate in equation (15) to obtain:

ϕdμϕdm=1mi=1m(Yi(I)1(ϕϕdm))(gi(x))dμ+O(ε02ϕC2+α).\int\phi\,d\mu-\int\phi\,dm=\int\frac{1}{m}\sum_{i=1}^{m}\left(\nabla_{Y_{i}}(I-\mathcal{M})^{-1}(\phi-\smallint\phi\,dm)\right)(g_{i}(x))\,d\mu+O(\varepsilon_{0}^{2}\|\phi\|_{C^{2+\alpha}}).

Using equation (16) on the first term on the right hand side above yields

(17) ϕ𝑑μϕ𝑑m=\displaystyle\int\phi\,d\mu-\int\phi\,dm= 1mi=1m(Yi(I)1(ϕϕdm))(gi(x))dm\displaystyle\int\frac{1}{m}\sum_{i=1}^{m}\left(\nabla_{Y_{i}}(I-\mathcal{M})^{-1}(\phi-\smallint\phi\,dm)\right)(g_{i}(x))\,dm
+O(ε0i=1mYi(I)1ϕC1+α)+O(ε02ϕC2+α).\displaystyle+O\left(\varepsilon_{0}\left\|\sum_{i=1}^{m}\nabla_{Y_{i}}(I-\mathcal{M})^{-1}\phi\right\|_{C^{1+\alpha}}\right)+O(\varepsilon_{0}^{2}\|\phi\|_{C^{2+\alpha}}).

Note that

i=1mYi(I)1ϕC1+α=O(ε2+α(I)1ϕC2+α).\left\|\sum_{i=1}^{m}\nabla_{Y_{i}}(I-\mathcal{M})^{-1}\phi\right\|_{C^{1+\alpha}}=O(\varepsilon_{2+\alpha}\|(I-\mathcal{M})^{-1}\phi\|_{C^{2+\alpha}}).

The application of equation (15) to \|(I-\mathcal{M})^{-1}\phi\|_{C^{2+\alpha}} then gives that the first big-O term in (17) is O(\varepsilon_{0}\varepsilon_{2+\alpha}\|\phi\|_{C^{2+2\alpha}}). Thus,

ϕdμϕdm=1mi=1m(Yi(I)1(ϕϕdm))(gi(x))dm+O(ε2+α2ϕC2+2α).\int\phi\,d\mu-\int\phi\,dm=\int\frac{1}{m}\sum_{i=1}^{m}\left(\nabla_{Y_{i}}(I-\mathcal{M})^{-1}(\phi-\smallint\phi\,dm)\right)(g_{i}(x))\,dm+O(\varepsilon_{2+\alpha}^{2}\|\phi\|_{C^{2+2\alpha}}).

Now, by translation invariance of the Haar measure we may remove the gig_{i}’s:

ϕdμϕdm=1mi=1mYi(I)1(ϕϕdm)dm+O(ε2+α2ϕC2+2α).\int\phi\,d\mu-\int\phi\,dm=\int\frac{1}{m}\sum_{i=1}^{m}\nabla_{Y_{i}}(I-\mathcal{M})^{-1}(\phi-\smallint\phi\,dm)\,dm+O(\varepsilon_{2+\alpha}^{2}\|\phi\|_{C^{2+2\alpha}}).

This proves everything except equation (14).

We now estimate the integral of

𝒰(ϕ)\displaystyle\mathcal{U}(\phi) =1mi=1mYi(I)1(ϕϕdm),\displaystyle=\frac{1}{m}\sum_{i=1}^{m}\nabla_{Y_{i}}(I-\mathcal{M})^{-1}(\phi-\smallint\phi\,dm),
=1mi=1mYi(I)1(ϕϕdm),\displaystyle=\nabla_{\frac{1}{m}\sum_{i=1}^{m}Y_{i}}(I-\mathcal{M})^{-1}(\phi-\smallint\phi\,dm),

against Haar. By equation (15) there exists C1C_{1} such that

(I)1(ϕϕ𝑑m)C1C1ϕC1+α,\left\|(I-\mathcal{M})^{-1}(\phi-\int\phi\,dm)\right\|_{C^{1}}\leq C_{1}\|\phi\|_{C^{1+\alpha}},

which establishes equation (14) by a similar argument to the estimate of the big-O term occurring in the previous part of this proof. ∎

4. Strain and Lyapunov Exponents

In this section we study the Lyapunov exponents of perturbations of isometric systems. The main result is Proposition 26, which gives a Taylor expansion of the Lyapunov exponents of a perturbation. The terms appearing in the Taylor expansion have a particular geometric meaning. We explain this meaning in terms of two “strain” tensors associated to a diffeomorphism. These tensors measure how far a diffeomorphism is from being an isometry. After introducing these tensors, we prove Proposition 26. The Lyapunov exponents of a random dynamical system may be calculated by integrating against a stationary measure of a certain extension of the original system. By using Proposition 23, we are able to approximate such stationary measures by the Haar measure and thereby obtain a Taylor expansion.

4.1. Norms on Tensors

Throughout this paper we use the pointwise L^{2} norm on tensors, which we now describe; for a more detailed discussion, see [Lee18, Prop. 2.40] and the surrounding text. If V is an inner product space with orthonormal basis e_{1},\ldots,e_{n}, then V^{\otimes k} has a basis of tensors of the form

ei1eike_{i_{1}}\otimes\cdots\otimes e_{i_{k}}

where 1ijn1\leq i_{j}\leq n for each 1jk1\leq j\leq k. We declare the vectors of this basis to be orthonormal for the inner product on VkV^{\otimes k}. This norm is independent of the choice of orthonormal basis. For a continuous tensor field TT on a closed Riemannian manifold MM, we write T\|T\| for maxxMT(x)\max_{x\in M}\|T(x)\|. If TT is a tensor on a Riemannian manifold MM, we then define its L2L^{2} norm in the expected way by integrating the norm of T(x)T(x) as a tensor on TxMT_{x}M over all points xMx\in M, i.e.

TL2=(MT(x)2dvol(x))1/2.\|T\|_{L^{2}}=\left(\int_{M}\|T(x)\|^{2}\,d\operatorname{vol}(x)\right)^{1/2}.
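Concretely (a small sketch of our own), once a tensor is expressed in an orthonormal frame this pointwise norm is just the Euclidean norm of its coefficient array; for a (0,2)-tensor it is the Frobenius norm of the matrix of coefficients, and it is unchanged by orthogonal changes of frame.

```python
import numpy as np

# Small sketch (ours): in an orthonormal frame the pointwise tensor norm used
# here is the Euclidean norm of the coefficient array.

def tensor_norm(T):
    """Norm of a tensor given by its coefficient array in an orthonormal basis."""
    return np.sqrt(np.sum(np.asarray(T, dtype=float) ** 2))

# A (0,2)-tensor on a 3-dimensional inner product space.
T = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
print(tensor_norm(T))                        # sqrt(1 + 4 + 1 + 9) = sqrt(15)

# Independence of the orthonormal basis: conjugating by an orthogonal matrix
# (the change of frame for a (0,2)-tensor) does not change the norm.
O, _ = np.linalg.qr(np.random.default_rng(1).normal(size=(3, 3)))
print(np.isclose(tensor_norm(O.T @ T @ O), tensor_norm(T)))
```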

4.2. Strain

If a diffeomorphism of a Riemannian manifold is an isometry, then it pulls back the metric tensor to itself. Consequently, if we are interested in how near a diffeomorphism is to being an isometry, it is natural to consider the difference between the metric tensor and the pullback of the metric tensor. This leads us to the following definition.

Definition 24.

Suppose that ff is a diffeomorphism of a Riemannian manifold (M,g)(M,g). We define the Lagrangian strain tensor associated to ff to be

Ef12(fgg).E^{f}\coloneqq\frac{1}{2}\left(f^{*}g-g\right).

This definition is consonant with the definition of the Lagrangian strain tensor that appears in continuum mechanics; cf. [LRK09].

The strain tensor will be useful for two reasons. First, it naturally appears in the Taylor expansion in Proposition 26, which will allow us to conclude that a random dynamical system with small Lyapunov exponents has small strain. Second, we prove in Theorem 27 that, for certain manifolds, a diffeomorphism with small strain is near to an isometry. The combination of these two facts will be essential in the proof of our main linearization result, Theorem 1, which shows that perturbations with all Lyapunov exponents zero are conjugate to isometric systems.

We now introduce two refinements of the strain tensor that will appear in the Taylor expansion in Proposition 26. Note that EfE^{f} is a (0,2)(0,2)-tensor. Consequently, we may take its trace with respect to the ambient metric gg.

Definition 25.

Suppose that f is a diffeomorphism of a d-dimensional Riemannian manifold (M,g). We define the conformal strain tensor by

ECfTr(fgg)2dg.E_{C}^{f}\coloneqq\frac{\operatorname{Tr}(f^{*}g-g)}{2d}g.

We define the nonconformal strain tensor by

ENCfEfECf=12(fggTr(fgg)dg).E_{NC}^{f}\coloneqq E^{f}-E_{C}^{f}=\frac{1}{2}\left(f^{*}g-g-\frac{\operatorname{Tr}(f^{*}g-g)}{d}g\right).
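To make Definitions 24 and 25 concrete, the following sketch (our own, in the simplest flat setting) computes E^{f}, E^{f}_{C}, and E^{f}_{NC} at a point for a map of Euclidean \mathbb{R}^{d}, where the pullback metric at a point has matrix Df^{\top}Df in the standard frame.

```python
import numpy as np

# Illustrative sketch (ours): the strain tensors of Definitions 24 and 25 for a
# map f of Euclidean R^d, where (f^* g)_x = Df(x)^T Df(x) in the standard frame.

def strain_tensors(jacobian):
    """Return (E^f, E_C^f, E_NC^f) at a point, given Df at that point."""
    Df = np.asarray(jacobian, dtype=float)
    d = Df.shape[0]
    g = np.eye(d)
    pullback = Df.T @ Df
    E = 0.5 * (pullback - g)                       # Lagrangian strain
    E_C = (np.trace(pullback - g) / (2 * d)) * g   # conformal part
    E_NC = E - E_C                                 # nonconformal (trace-free) part
    return E, E_C, E_NC

if __name__ == "__main__":
    # A linear shear: volume preserving but not an isometry.
    Df = np.array([[1.0, 0.3], [0.0, 1.0]])
    E, E_C, E_NC = strain_tensors(Df)
    print(np.trace(E_NC))                          # ~0: the nonconformal strain is trace free
    # An isometry (a rotation) has zero strain.
    t = 0.4
    R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    print(np.allclose(strain_tensors(R)[0], 0))    # True
```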

4.3. Taylor expansion of Lyapunov exponents

Suppose that M is a manifold and that f is a diffeomorphism of M. Let \operatorname{Gr}_{r}(M) denote the Grassmannian bundle consisting of the r-planes in TM. When working with \operatorname{Gr}_{r}(M) we write a subspace of T_{x}M as E_{x} to emphasize the basepoint. Then f naturally induces a map F\colon\operatorname{Gr}_{r}(M)\to\operatorname{Gr}_{r}(M) by sending a subspace E_{x}\in\operatorname{Gr}_{r}(T_{x}M) to D_{x}fE_{x}\in\operatorname{Gr}_{r}(T_{f(x)}M). If we have a random dynamical system on M, then by this construction we naturally obtain a random dynamical system on \operatorname{Gr}_{r}(M). The following proposition should be compared with [DK07, Prop. 3].

Proposition 26.

Suppose that M is a compact connected Riemannian manifold such that \operatorname{Isom}(M) is semisimple and that \operatorname{Isom}(M)^{\circ} acts transitively on \operatorname{Gr}_{r}(M). Suppose that S=(g_{1},\ldots,g_{m}) is a Diophantine tuple of elements of \operatorname{Isom}(M). Then there exist \epsilon>0 and k>0 such that if (f_{1},\ldots,f_{m}) is a tuple with elements in \operatorname{Diff}^{\infty}(M) such that d_{C^{k}}(f_{i},g_{i})<\epsilon, then the following holds. Suppose that \mu is an ergodic stationary measure for the random dynamical system generated by (f_{1},\ldots,f_{m}). Let \Lambda_{r} be the sum of the top r Lyapunov exponents of \mu. Then

(18) Λr(μ)=\displaystyle\Lambda_{r}(\mu)= r2dmi=1mMECfi2dvol+r(dr)(d+2)(d1)mi=1mMENCfi2dvol\displaystyle-\frac{r}{2dm}\sum_{i=1}^{m}\int_{M}\|E_{C}^{f_{i}}\|^{2}\,d\operatorname{vol}+\frac{r(d-r)}{(d+2)(d-1)m}\sum_{i=1}^{m}\int_{M}\|E_{NC}^{f_{i}}\|^{2}\,d\operatorname{vol}
+Grr(M)𝒰(ψ)dvol+O(εk3).\displaystyle+\int_{\operatorname{Gr}_{r}(M)}\mathcal{U}(\psi)\,d\operatorname{vol}+O(\varepsilon_{k}^{3}).

where ψ=1mi=1mlndet(DfiEx)\psi=\frac{1}{m}\sum_{i=1}^{m}\ln\det(Df_{i}\mid E_{x}), εk=maxi{dCk(fi,gi)}\varepsilon_{k}=\max_{i}\{d_{C^{k}}(f_{i},g_{i})\}, 𝒰\mathcal{U} is defined as in Proposition 23, and det\det is defined in Appendix D.
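Before giving the proof, we note how the integrand may be computed in practice. Assuming that \det(Df_{i}\mid E_{x}) denotes, as is standard, the factor by which Df_{i} scales r-dimensional volume on E_{x} (the precise definition is in Appendix D), it can be obtained from the Gram matrix of the images of an orthonormal basis of E_{x}, as in the following sketch (ours) for a single linear map.

```python
import numpy as np

# Illustrative sketch (ours): ln det(Df | E) for a linear map Df and an r-plane E,
# computed as (1/2) log det of the Gram matrix of the images of an orthonormal
# basis of E.  For r = d this reduces to log |det Df|.

def log_det_on_subspace(Df, basis):
    """basis: d x r matrix whose columns are an orthonormal basis of E."""
    image = Df @ basis
    gram = image.T @ image
    return 0.5 * np.log(np.linalg.det(gram))

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    Df = rng.normal(size=(3, 3))
    # r = 3: recover log|det Df|.
    print(np.isclose(log_det_on_subspace(Df, np.eye(3)),
                     np.log(abs(np.linalg.det(Df)))))
    # r = 1: the quantity is the log of the factor by which a unit vector is stretched.
    e = np.array([[1.0], [0.0], [0.0]])
    print(np.isclose(log_det_on_subspace(Df, e),
                     np.log(np.linalg.norm(Df @ e))))
```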

Proof.

Given the random dynamical system on MM generated by the tuple (f1,,fm)(f_{1},...,f_{m}), there is the induced random dynamical system on Grr(M)\operatorname{Gr}_{r}(M) generated by the tuple (F1,,Fm)(F_{1},...,F_{m}). The Lyapunov exponents of the system on MM may be obtained from the system on Grr(M)\operatorname{Gr}_{r}(M) in the following way. By [Kif86, Ch. III, Thm 1.2], given an ergodic stationary measure μ\mu on MM, there exists a stationary measure μ¯\overline{\mu} on Grr(M)\operatorname{Gr}_{r}(M) such that

Λr(μ)=1mi=1mGrr(M)lndet(DfiEx)dμ¯(Ex).\Lambda_{r}(\mu)=\frac{1}{m}\sum_{i=1}^{m}\int_{\operatorname{Gr}_{r}(M)}\ln\det(Df_{i}\mid E_{x})\,d\overline{\mu}(E_{x}).

Exchanging the sum and the integral, this is equal to

(19) Grr(M)1mi=1mlndet(DfiEx)dμ¯(Ex).\int_{\operatorname{Gr}_{r}(M)}\frac{1}{m}\sum_{i=1}^{m}\ln\det(Df_{i}\mid E_{x})\,d\overline{\mu}(E_{x}).

As Isom(M)\operatorname{Isom}(M) acts transitively on Grr(M)\operatorname{Gr}_{r}(M), Grr(M)\operatorname{Gr}_{r}(M) is a homogeneous space of Isom(M)\operatorname{Isom}(M). Thus as (g1,,gm)(g_{1},...,g_{m}) is Diophantine, we may apply Proposition 23 to approximate the integral in equation (19). Letting 𝒰\mathcal{U} be as in that proposition, there exists kk such that

(20) Λr(μ)=\displaystyle\Lambda_{r}(\mu)= Grr(M)1mi=1mlndet(DfiEx)dvol(Ex)+Grr(M)𝒰(1mi=1mlndet(DfiEx))dvol\displaystyle\int_{\operatorname{Gr}_{r}(M)}\frac{1}{m}\sum_{i=1}^{m}\ln\det(Df_{i}\mid E_{x})\,d\operatorname{vol}(E_{x})+\int_{\operatorname{Gr}_{r}(M)}\mathcal{U}\left(\frac{1}{m}\sum_{i=1}^{m}\ln\det(Df_{i}\mid E_{x})\right)\,d\operatorname{vol}
+O((maxi{dCk(Fi,Gi)})2i=1mlndet(DfiEx)Ck)\displaystyle+O\left((\max_{i}\{d_{C^{k}}(F_{i},G_{i})\})^{2}\left\|\sum_{i=1}^{m}\ln\det(Df_{i}\mid E_{x})\right\|_{C^{k}}\right)

We now estimate the error term. The following two estimates follow by working in a chart on Grr(M)\operatorname{Gr}_{r}(M). If f,gf,g are two maps of MM and F,GF,G are the induced maps on Grr(M)\operatorname{Gr}_{r}(M), then dCk(F,G)=O(dCk+1(f,g))d_{C^{k}}(F,G)=O(d_{C^{k+1}}(f,g)). In addition, by Lemma 58 we have that

(21) i=1mlndet(DfiEx)Ck=O(εk+1).\left\|\sum_{i=1}^{m}\ln\det(Df_{i}\mid E_{x})\right\|_{C^{k}}=O(\varepsilon_{k+1}).

Thus the error term in (20) is small enough to conclude (18).

To finish, we apply the Taylor expansion in Proposition 59, which is in Appendix E, to

Grr(M)lndet(DfiEx)dvol(Ex),\int_{\operatorname{Gr}_{r}(M)}\ln\det(Df_{i}\mid E_{x})\,d\operatorname{vol}(E_{x}),

which gives precisely the first two terms on the right hand side of equation (18), with an error that is O(\varepsilon_{1}^{3}). ∎

5. Diffeomorphisms of Small Strain: Extracting an Isometry in the KAM Scheme

In this section we prove Proposition 28, which shows that a diffeomorphism of small strain on an isotropic manifold is near to an isometry. In the KAM scheme, we will see that diffeomorphisms with small Lyapunov exponents have small strain and hence, by Proposition 28, are near to isometries. Proposition 28 follows from Theorem 27, which shows that certain diffeomorphisms of small strain of a closed Riemannian manifold are C^{0} close to the identity.

Theorem 27.

Suppose that (M,g) is a closed Riemannian manifold. Then there exist 0<r<1 and C>0 such that if f\in\operatorname{Diff}^{2}(M) and

  1. (1)

    there exists xMx\in M such that f(x)=xf(x)=x and DxfId=θ<r,\|D_{x}f-\operatorname{Id}\|=\theta<r,

  2. (2)

    fgg=η<r\|f^{*}g-g\|=\eta<r, and

  3. (3)

    dC2(f,Id)=κ<rd_{C^{2}}(f,\operatorname{Id})=\kappa<r,

then for all γ(0,r)\gamma\in(0,r),

dC0(f,Id)C(θ+κγ+ηγ1).d_{C^{0}}(f,\operatorname{Id})\leq C(\theta+\kappa\gamma+\eta\gamma^{-1}).

Theorem 27 is the main ingredient in the proof of our central technical result.

Proposition 28.

Suppose that (M,g)(M,g) is a closed isotropic Riemannian manifold. Then for all σ>0\sigma>0 and all integers >0\ell>0, there exist kk and C,r>0C,r>0 such that for every fDiffk(M)f\in\operatorname{Diff}^{k}(M), if there exists an isometry IIsom(M)I\in\operatorname{Isom}(M) such that

  1. (1)

    dCk(I,f)<rd_{C^{k}}(I,f)<r, and

  2. (2)

    fggH0<r\|f^{*}g-g\|_{H^{0}}<r,

then there exists an isometry RIsom(M)R\in\operatorname{Isom}(M) such that

(22) dC0(R,I)\displaystyle d_{C^{0}}(R,I) <C(dC2(f,I)+fggH01σ), and\displaystyle<C(d_{C^{2}}(f,I)+\|f^{*}g-g\|_{H^{0}}^{1-\sigma}),\text{ and}
(23) dC(f,R)\displaystyle d_{C^{\ell}}(f,R) <C(fggH01/2σdC2(f,I)1/2σ).\displaystyle<C(\|f^{*}g-g\|_{H^{0}}^{1/2-\sigma}d_{C^{2}}(f,I)^{1/2-\sigma}).

Though the statement of Proposition 28 is technical, its use in the proof of Theorem 1 is fairly transparent: the proposition produces an isometry near to a diffeomorphism with small strain, which is the essence of the iterative step in the KAM scheme. This remedies the gap in [DK07].

5.1. Low strain diffeomorphisms on a general manifold: proof of Theorem 27

The main geometric idea in the proof of Theorem 27 is to study distances by intersecting spheres. In order to show that a diffeomorphism ff is close to the identity, we must show that it does not move points far. As we shall show, a diffeomorphism of small strain distorts distances very little. Consequently, a diffeomorphism of small strain nearly carries spheres to spheres. If we have two points xx and yy that are fixed by ff, then the unit spheres centered at xx and yy are carried near to themselves by ff. Consequently, the intersection of those spheres will be nearly fixed by ff. By considering the intersection of spheres in this way, we may take a small set on which ff nearly fixes points and enlarge that set until it fills the whole manifold.

Before the proof of the theorem we prove several lemmas.

Lemma 29.

Let MM be a closed Riemannian manifold. There exists C>0C>0 such that the following holds. If fDiff1(M)f\in\operatorname{Diff}^{1}(M) and fggη\|f^{*}g-g\|\leq\eta then for all x,yMx,y\in M,

(1Cη)d(x,y)d(f(x),f(y))(1+Cη)d(x,y).(1-C\eta)d(x,y)\leq d(f(x),f(y))\leq(1+C\eta)d(x,y).
Proof.

If γ\gamma is a path between xx and yy parametrized by arc length, then fγf\circ\gamma is a path between f(x)f(x) and f(y)f(y). The length of fγf\circ\gamma is equal to

len(fγ)\displaystyle\operatorname{len}(f\circ\gamma) =0len(γ)g(Dfγ˙,Dfγ˙)𝑑t\displaystyle=\int_{0}^{\operatorname{len}(\gamma)}\sqrt{g(Df\dot{\gamma},Df\dot{\gamma})}\,dt
=0len(γ)fg(γ˙,γ˙)𝑑t\displaystyle=\int_{0}^{\operatorname{len}(\gamma)}\sqrt{f^{*}g(\dot{\gamma},\dot{\gamma})}\,dt
=0len(γ)g(γ˙,γ˙)+[fgg](γ˙,γ˙)𝑑t\displaystyle=\int_{0}^{\operatorname{len}(\gamma)}\sqrt{g(\dot{\gamma},\dot{\gamma})+[f^{*}g-g](\dot{\gamma},\dot{\gamma})}\,dt
=0len(γ)1+[fgg](γ˙,γ˙)𝑑t.\displaystyle=\int_{0}^{\operatorname{len}(\gamma)}\sqrt{1+[f^{*}g-g](\dot{\gamma},\dot{\gamma})}\,dt.

By our assumption on the norm of fggf^{*}g-g, there exists CC such that |[fgg](γ˙,γ˙)|Cη\left|[f^{*}g-g](\dot{\gamma},\dot{\gamma})\right|\leq C\eta. Then using that 1+x1+x\sqrt{1+x}\leq 1+x for x0x\geq 0, we see that

len(fγ)0len(γ)1+|[fgg](γ˙,γ˙)|dtlen(γ)+Cηlen(γ).\operatorname{len}(f\circ\gamma)\leq\int_{0}^{\operatorname{len}(\gamma)}1+\left|[f^{*}g-g](\dot{\gamma},\dot{\gamma})\right|\,dt\leq\operatorname{len}(\gamma)+C\eta\operatorname{len}(\gamma).

The lower bound follows similarly by using that 1+x1+x1+x\leq\sqrt{1+x} for 1x0-1\leq x\leq 0. ∎

Lemma 30.

Let MM be a closed Riemannian manifold. Then there exist r,C>0r,C>0 such that for all fDiff2(M)f\in\operatorname{Diff}^{2}(M), if

  1. (1)

    there exists xMx\in M such that f(x)=xf(x)=x and DxfId=θ<r\|D_{x}f-\operatorname{Id}\|=\theta<r, and

  2. (2)

    dC2(f,Id)=κ<rd_{C^{2}}(f,\operatorname{Id})=\kappa<r,

then for all 0<γ<r0<\gamma<r and yy such that d(x,y)<γd(x,y)<\gamma

d(y,f(y))C(γθ+γ2κ).d(y,f(y))\leq C(\gamma\theta+\gamma^{2}\kappa).
Proof.

Let r=injM/2r=\operatorname{inj}M/2. We work in a fixed exponential chart centered at xx, so that xx is represented by 0 in the chart. Write

f(y)=0+D0fy+R(y)=y+(D0fId)y+R(y).f(y)=0+D_{0}fy+R(y)=y+(D_{0}f-\operatorname{Id})y+R(y).

As the C2C^{2} distance between ff and the identity is at most κ\kappa, by Taylor’s Theorem R(y)R(y) is bounded in size by Cκ|y|2C\kappa\left|y\right|^{2} for a uniform constant CC. Thus

|f(y)y|θ|y|+Cκ|y|2.\left|f(y)-y\right|\leq\theta\left|y\right|+C\kappa\left|y\right|^{2}.

In particular, for all yy such that |y|γ<r\left|y\right|\leq\gamma<r,

|f(y)y|C(γθ+γ2κ).\left|f(y)-y\right|\leq C^{\prime}(\gamma\theta+\gamma^{2}\kappa).

But the Euclidean distance in such a chart is uniformly bi-Lipschitz equivalent to the Riemannian distance on M, so the lemma follows. ∎

The following geometric lemma produces points on two spheres in a Riemannian manifold that are further apart than the centers of the spheres.

Figure 1. The four points x, y, p, q appearing in Lemma 31, together with the spheres S^{d(x,y)}(x) and S^{d(x,y)}(y). Given x, y, p, the lemma produces the point q and gives a lower bound on d(q,p) showing that it is longer than d(x,y).
Lemma 31.

Let MM be a closed Riemannian manifold. There exist C,r>0C,r>0 such that for all β(0,r)\beta\in(0,r), if x,yMx,y\in M satisfy injM3<d(x,y)<injM2\frac{\operatorname{inj}M}{3}<d(x,y)<\frac{\operatorname{inj}M}{2}, and there is a fixed pMp\in M such that d(x,p)=d(y,x)d(x,p)=d(y,x) and d(p,y)<rd(p,y)<r, then there exists qMq\in M depending on pp such that:

  1. (1)

    d(q,y)=d(y,x)d(q,y)=d(y,x),

  2. (2)

    d(q,x)<βd(q,x)<\beta, and

  3. (3)

    d(q,p)d(x,y)+Cd(y,p)βd(q,p)\geq d(x,y)+Cd(y,p)\beta.

In order to prove Lemma 31, we recall the following form of the second variation of length formula. For a proof of this and related discussion, see [CE75, Ch. 1,§6].

Lemma 32.

Let MM be a Riemannian manifold and γ\gamma be a unit speed geodesic. Let γv,w\gamma_{v,w} be a two parameter family of constant speed geodesics parametrized by γv,w:[a,b]×(ϵ,ϵ)×(ϵ,ϵ)M\gamma_{v,w}\colon[a,b]\times(-\epsilon,\epsilon)\times(-\epsilon,\epsilon)\to M such that γ0,0=γ\gamma_{0,0}=\gamma. Suppose that γv,wv=V\frac{\partial\gamma_{v,w}}{\partial v}=V and γv,ww=W\frac{\partial\gamma_{v,w}}{\partial w}=W are both normal to γ˙0,0\dot{\gamma}_{0,0}, which we denote by TT. Then

2len(γv,w)vw=WV,T|ab+V,TW|ab.\frac{\partial^{2}\operatorname{len}(\gamma_{v,w})}{\partial v\partial w}=\langle\nabla_{W}V,T\rangle|_{a}^{b}+\langle V,\nabla_{T}W\rangle|_{a}^{b}.
Proof of Lemma 31.

We will give a geometric construction using the points xx and yy and then explain how this construction may be applied to the particular point pp to produce a point qq.

Let QQ be a unit tangent vector based at yy that is tangent to Sd(x,y)(x)S^{d(x,y)}(x), the sphere of radius d(x,y)d(x,y) centered at xx. Let γt:[a,b]M\gamma_{t}\colon[a,b]\to M be a one-parameter family of geodesics parametrized by arc length so that γ0\gamma_{0} is the unit speed geodesic from xx to yy, tγt(b)|t=0=Q\partial_{t}\gamma_{t}(b)|_{t=0}=Q, γt(b)\gamma_{t}(b) is a path in Sd(x,y)(x)S^{d(x,y)}(x), and γt(a)=x\gamma_{t}(a)=x for all tt. The variation γt\gamma_{t} gives rise to a Jacobi field YY. Note that Y(a)=0Y(a)=0, Y(b)=QY(b)=Q, and YY is a normal Jacobi field.

Next, let X be the Jacobi field along \gamma_{0} defined by X(b)=0 and \nabla_{T}X|_{b}=Y(b), where T denotes \dot{\gamma_{0}}, i.e. the tangent to the curve \gamma_{0}. Such a field exists and has uniformly bounded norms because \gamma_{0} is shorter than the injectivity radius. Let \eta_{t}\colon[a,b]\to M be a one-parameter family of geodesics tangent to the field X such that \eta_{t}(b)=y, \eta_{t} is arc length parametrized, and \eta_{0}=\gamma_{0}. Note that each \eta_{t} has length d(x,y). Let T now denote \dot{\gamma}_{s,t}, which gives the tangent direction to each curve \gamma_{s,t} in the variation defined below.

Define γs,t:[a,b]M\gamma_{s,t}\colon[a,b]\to M to be the arc length parametrized geodesic between ηs(a)\eta_{s}(a) and γt(b)\gamma_{t}(b). The variation γs,t\gamma_{s,t} is a two parameter variation satisfying the hypotheses of Lemma 32. Consequently, we see that

(24) d2len(γs,t)dsdt=XY,T|ab+Y,TX|ab.\frac{d^{2}\operatorname{len}(\gamma_{s,t})}{dsdt}=\langle\nabla_{X}Y,T\rangle|_{a}^{b}+\langle Y,\nabla_{T}X\rangle|_{a}^{b}.

The first term may be rewritten as

(25) XY,T|ab=XY,T|abY,XT|ab.\langle\nabla_{X}Y,T\rangle|_{a}^{b}=\nabla_{X}\langle Y,T\rangle|_{a}^{b}-\langle Y,\nabla_{X}T\rangle|_{a}^{b}.

As Y(a)=0Y(a)=0 and X(b)=0X(b)=0, the second term in (25) is zero. Similarly XY,T|b=0\nabla_{X}\langle Y,T\rangle|_{b}=0. We claim that XY,T|a=0\nabla_{X}\langle Y,T\rangle|_{a}=0 as well. To see this we claim that Y=tγs,t|a=0Y=\partial_{t}\gamma_{s,t}|_{a}=0 for all ss. This is the case because γs,t(a)\gamma_{s,t}(a) is constant in tt as γs,t(a)\gamma_{s,t}(a) depends only on ss. Thus Y,T|a=0\langle Y,T\rangle|_{a}=0. When we differentiate by XX, we are differentiating along the path γs,0(a)\gamma_{s,0}(a). Thus XY,T|a=0\nabla_{X}\langle Y,T\rangle|_{a}=0 as Y,T\langle Y,T\rangle is 0 along this path. Thus XY,T|ab=0\langle\nabla_{X}Y,T\rangle|_{a}^{b}=0. Noting in addition that Y(a)=0Y(a)=0, equation (24) simplifies to

d2len(γs,t)dsdt=Y,TX|b.\frac{d^{2}\operatorname{len}(\gamma_{s,t})}{dsdt}=\langle Y,\nabla_{T}X\rangle|_{b}.

Hence as we defined XX so that TX|b=Y(b)\nabla_{T}X|_{b}=Y(b),

\frac{d^{2}\operatorname{len}(\gamma_{s,t})}{dsdt}=\langle Y(b),Y(b)\rangle=\|Q\|^{2}=1.

Note next that d2ds2len(γs,t)=0\frac{d^{2}}{ds^{2}}\operatorname{len}(\gamma_{s,t})=0 because the geodesics γs,0\gamma_{s,0} all have the same length. Similarly, d2dt2len(γs,t)=0\frac{d^{2}}{dt^{2}}\operatorname{len}(\gamma_{s,t})=0. Thus we have the Taylor expansion

(26) \operatorname{len}(\gamma_{s,t})=d(x,y)+st+O(s^{3},t^{3}).

There exist r0>0r_{0}>0 and C>0C>0 such that for all 0s,t<r00\leq s,t<r_{0},

(27) len(γs,t)d(x,y)+Cst.\operatorname{len}(\gamma_{s,t})\geq d(x,y)+Cst.

Consider now the pairs of points γs,0(a)\gamma_{s,0}(a) and γ0,t(b)\gamma_{0,t}(b). We claim that if pp is of the form p=γ0,t(b)p=\gamma_{0,t}(b) for some small tt then we may take q=γs,0(a)q=\gamma_{s,0}(a), where the choice of ss will be dictated by β\beta.

Note that

d(γs,0(a),x)=sX(a)+O(s2) and d(γ0,t(b),y)=tY(b)+O(t2).d(\gamma_{s,0}(a),x)=s\|X(a)\|+O(s^{2})\text{ and }d(\gamma_{0,t}(b),y)=t\|Y(b)\|+O(t^{2}).

Hence there exists s0s_{0} such that for 0<s,t<s00<s,t<s_{0},

(28) d(γs,0(a),x)<2sX(a) and d(γ0,t(b),y)<2tY(b).d(\gamma_{s,0}(a),x)<2s\|X(a)\|\text{ and }d(\gamma_{0,t}(b),y)<2t\|Y(b)\|.

For any \beta<\min\{2s_{0}\|X(a)\|,2r_{0}\|X(a)\|\}, by (27), taking s=\beta/(2\|X(a)\|) we obtain

d(\gamma_{s,0}(a),\gamma_{0,t}(b))\geq d(x,y)+\frac{t\beta C}{2\|X(a)\|},

which by (28) implies

d(γs,0(a),γ0,t(b))d(x,y)+C4X(a)Y(b)βd(γ0,t(b),y).d(\gamma_{s,0}(a),\gamma_{0,t}(b))\geq d(x,y)+\frac{C}{4\|X(a)\|\|Y(b)\|}\beta d(\gamma_{0,t}(b),y).

By (28) and our choice of ss

d(γs,0(a),x)<β.d(\gamma_{s,0}(a),x)<\beta.

Finally, d(γs,0(a),y)=d(x,y)d(\gamma_{s,0}(a),y)=d(x,y) by the construction of the variation. Thus the conclusion of the lemma holds for the points p=γ0,t(b)p=\gamma_{0,t}(b) and q=γs,0(a)q=\gamma_{s,0}(a).

We claim that this gives the full result. First, note that for all pairs of points x and y and all choices of the vector Q in our construction, \|X(a)\| and \|Y(b)\| are bounded above and below. This is because the distance minimizing geodesic from x to y does not cross the cut locus. Similarly, the constants C, r_{0}, and s_{0} may be uniformly bounded below over all such choices of x and y by compactness. Thus, as all these constants are uniformly bounded independent of x, y, and Q, the above argument shows that for any pair x and y there is a neighborhood N of y in S^{d(x,y)}(x) of uniform size such that for any p\in N there exists q satisfying the conclusion of the lemma. This gives the result, as any p sufficiently close to y with d(x,p)=d(x,y) lies in such a neighborhood N. ∎

The following lemma shows that if a diffeomorphism with small strain nearly fixes a large region, then that diffeomorphism is close to the identity.

Lemma 33.

Let (M,g)(M,g) be a closed Riemannian manifold. Then there exists r0(0,1)r_{0}\in(0,1) such that for any r,β(0,r0)r^{\prime},\beta\in(0,r_{0}), there exists C>0C>0 such that if fDiff1(M)f\in\operatorname{Diff}^{1}(M) and

  1. (1)

    dC0(f,Id)r0d_{C^{0}}(f,\operatorname{Id})\leq r_{0},

  2. (2)

    there exists a point xMx\in M such that all yy with d(x,y)<rd(x,y)<r^{\prime} satisfy d(y,f(y))βr0d(y,f(y))\leq\beta\leq r_{0}, and

  3. (3)

    fgg=ηr0\|f^{*}g-g\|=\eta\leq r_{0},

then

(29) dC0(f,Id)<C(β+η).d_{C^{0}}(f,\operatorname{Id})<C(\beta+\eta).
Proof.

Let r_{1},C_{1} denote the r and C in Lemma 31. Let C_{2} be the constant in Lemma 29. There exists a constant r_{2} such that for any x,y\in M with \operatorname{inj}(M)/3<d(y,x)<\operatorname{inj}(M)/2 and any z such that d(y,z)<r_{2}, we have d(y,\hat{z})<r_{1}, where \hat{z} is the radial projection of z onto S^{d(x,y)}(x). Let r_{0}=\min\{r_{1},r_{2},\operatorname{inj}(M)/24\}.

Suppose that x\in M has the property that d(x,z)<r^{\prime} implies d(z,f(z))\leq\beta. Suppose that y is a point such that \operatorname{inj}(M)/3<d(y,x)<\operatorname{inj}(M)/2. Let \widehat{f(y)} be the radial projection of f(y) onto S^{d(x,y)}(x).

By choice of r0r2r_{0}\leq r_{2}, d(y,f(y))<r2d(y,f(y))<r_{2} and so d(y,f(y)^)r1d(y,\widehat{f(y)})\leq r_{1}. Hence we may apply Lemma 31 with β=r\beta=r^{\prime}, x=xx=x, y=yy=y and p=f(y)^p=\widehat{f(y)} to conclude that there exists a point qMq\in M such that

(30) d(q,y)\displaystyle d(q,y) =d(x,y),\displaystyle=d(x,y),
(31) d(q,x)\displaystyle d(q,x) <r,\displaystyle<r^{\prime},
(32) d(q,f(y)^)\displaystyle d(q,\widehat{f(y)}) d(x,y)+C1d(y,f(y)^)r.\displaystyle\geq d(x,y)+C_{1}d(y,\widehat{f(y)})r^{\prime}.

Using the triangle inequality, we bound the left hand side of (32) to find

(33) d(q,f(q))+d(f(q),f(y))+d(f(y),f(y)^)d(q,f(y)^)d(x,y)+C1d(y,f(y)^)r.d(q,f(q))+d(f(q),f(y))+d(f(y),\widehat{f(y)})\geq d(q,\widehat{f(y)})\geq d(x,y)+C_{1}d(y,\widehat{f(y)})r^{\prime}.

First, as d(q,x)<rd(q,x)<r^{\prime} and points within rr^{\prime} of xx do not move more than β\beta,

d(q,f(q))β.d(q,f(q))\leq\beta.

Second, by Lemma 29, as the distance between qq and yy is bounded above by inj(M)/2\operatorname{inj}(M)/2, there exists C3C_{3} such that

d(f(q),f(y))\leq d(q,y)(1+C_{2}\eta)=d(x,y)(1+C_{2}\eta)\leq d(x,y)+C_{3}\eta.

Similarly, as inj(M)/3<d(x,y)<inj(M)/2\operatorname{inj}(M)/3<d(x,y)<\operatorname{inj}(M)/2, Lemma 29 implies the following two bounds

(34) d(x,f(y))d(x,f(x))+d(f(x),f(y))β+d(x,y)+C3ηd(x,f(y))\leq d(x,f(x))+d(f(x),f(y))\leq\beta+d(x,y)+C_{3}\eta

and similarly

(35) d(x,f(y))d(x,y)βC3η.d(x,f(y))\geq d(x,y)-\beta-C_{3}\eta.

For ww sufficiently close to Sd(x,y)(x)S^{d(x,y)}(x) we claim that the radial projection w^\hat{w} is the point in Sd(x,y)(x)S^{d(x,y)}(x) that minimizes the distance to ww. To see this we use that below the injectivity radius geodesics are the unique distance minimizing path between two points. There are two cases: if d(x,w)>d(x,y)d(x,w)>d(x,y) and there is some other point wSd(x,y)(x)w^{\prime}\in S^{d(x,y)}(x) with d(w,w)d(w^,w)d(w^{\prime},w)\leq d(\hat{w},w), then the path from xx to ww^{\prime} to ww along geodesics must be strictly longer than the geodesic path from xx directly to w^\hat{w}. If d(x,w)<d(x,y)d(x,w)<d(x,y) and w^wSd(x,y)(x)\hat{w}\neq w^{\prime}\in S^{d(x,y)}(x), then one obtains two distance minimizing paths from xx to Sd(x,y)(x)S^{d(x,y)}(x) passing through ww: the first along a single geodesic and the second from xx to ww and then from ww to ww^{\prime}. By the uniqueness of distance minimizing geodesics, the latter path must have length greater than d(x,y)d(x,y) because it is not a geodesic. Thus d(w,w)>d(w,w^)d(w,w^{\prime})>d(w,\hat{w}); a contradiction.

The estimates (34) and (35) imply that |d(f(y),x)d(x,y)|β+C3η\left|d(f(y),x)-d(x,y)\right|\leq\beta+C_{3}\eta. Thus the distance from f(y)f(y) to Sd(x,y)(x)S^{d(x,y)}(x) is at most β+C3η\beta+C_{3}\eta. By the previous paragraph, f(y)^\widehat{f(y)} is the point in Sd(x,y)(x)S^{d(x,y)}(x) that minimizes distance to f(y)f(y). Thus

(36) d(f(y),f(y)^)β+C3η.d(f(y),\widehat{f(y)})\leq\beta+C_{3}\eta.

Thus, we obtain from equation (33)

β+d(x,y)+C3η+β+C3ηd(x,y)+C1d(y,f(y)^)r.\beta+d(x,y)+C_{3}\eta+\beta+C_{3}\eta\geq d(x,y)+C_{1}d(y,\widehat{f(y)})r^{\prime}.

Thus

2β+2C3ηC1rd(y,f(y)^).\frac{2\beta+2C_{3}\eta}{C_{1}r^{\prime}}\geq d(y,\widehat{f(y)}).

Hence

d(y,f(y))d(f(y),f(y)^)+d(y,f(y)^)2β+2C3ηC1r+β+C3η.d(y,f(y))\leq d(f(y),\widehat{f(y)})+d(y,\widehat{f(y)})\leq\frac{2\beta+2C_{3}\eta}{C_{1}r^{\prime}}+\beta+C_{3}\eta.

Thus, by introducing a new constant C_{4}\geq 1, we see that for any y satisfying \operatorname{inj}(M)/3<d(y,x)<\operatorname{inj}(M)/2,

d(y,f(y))C4(β+η).d(y,f(y))\leq C_{4}(\beta+\eta).

Note that the constant C4C_{4} depends only on rr^{\prime} and (M,g)(M,g).

Consider a point yy where (1/3+1/24)inj(M)<d(x,y)<(1/21/24)inj(M)(1/3+1/24)\operatorname{inj}(M)<d(x,y)<(1/2-1/24)\operatorname{inj}(M). Because r<inj(M)/24r^{\prime}<\operatorname{inj}(M)/24 such a point yy has a neighborhood of size rr^{\prime} on which points are moved at most distance C4(β+η)C_{4}(\beta+\eta) by ff. Hence we may repeat the procedure taking yy as the new basepoint. Let xx be the given point in the statement of the lemma. Any point qMq\in M may be connected to xx via a finite sequence of points x=x0,,xn=qx=x_{0},\ldots,x_{n}=q such that each consecutive pair of points in the sequence are at a distance between (1/3+1/24)inj(M)(1/3+1/24)\operatorname{inj}(M) and (1/21/24)inj(M)(1/2-1/24)\operatorname{inj}(M) apart. As MM is compact there is a uniform upper bound on the length of the shortest such sequence. If NN is a uniform upper bound on the length of such a sequence, the above argument shows that for all qMq\in M

d(q,f(q))NC4N(β+η),d(q,f(q))\leq NC_{4}^{N}(\beta+\eta),

which gives the result. ∎

The proof of Theorem 27 consists of two steps. First a disk of uniform radius is produced on which ff nearly fixes points. Then Lemma 33 is applied to this disk to conclude that ff is near to the identity.

Proof of Theorem 27.

Let r_{1},C_{1} denote the r and C in Lemma 30, and let r_{2},C_{2} denote the r and C in Lemma 31. There will be a constant r_{3}>0 introduced later when it is needed. Let r_{4} denote the constant r_{0} appearing in Lemma 33. We let r=\min\{1,r_{1},r_{2},r_{3},r_{4},\operatorname{inj}(M)/24\}. Let C_{3} be the constant in Lemma 29. Let \gamma\in(0,r) be given.

By Lemma 30, for all zz such that d(x,z)<γd(x,z)<\gamma,

(37) d(z,f(z))C1(θγ+γ2κ).d(z,f(z))\leq C_{1}(\theta\gamma+\gamma^{2}\kappa).

Suppose that yy satisfies inj(M)/3<d(x,y)<inj(M)/2\operatorname{inj}(M)/3<d(x,y)<\operatorname{inj}(M)/2. Let f(y)^\widehat{f(y)} be the radial projection of f(y)f(y) onto the sphere Sd(x,y)(x)S^{d(x,y)}(x).

By Lemma 29,

d(x,y)(1C3η)d(f(x),f(y))d(x,y)(1+C3η).d(x,y)(1-C_{3}\eta)\leq d(f(x),f(y))\leq d(x,y)(1+C_{3}\eta).

As f(x)=xf(x)=x, this implies

d(x,y)(1C3η)d(x,f(y))d(x,y)(1+C3η).d(x,y)(1-C_{3}\eta)\leq d(x,f(y))\leq d(x,y)(1+C_{3}\eta).

Hence as d(x,y)d(x,y) is uniformly bounded above and below, there exists C4C_{4} such that

(38) d(f(y),f(y)^)<C4η.d(f(y),\widehat{f(y)})<C_{4}\eta.

There exists r3>0r_{3}>0 such that if η<r3\eta<r_{3}, then C4η<r2C_{4}\eta<r_{2}. Hence by our choice of rr, d(y,f(y)^)<r2d(y,\widehat{f(y)})<r_{2} and we may apply Lemma 31 with β=γ\beta=\gamma, x=xx=x, y=yy=y, p=f(y)^p=\widehat{f(y)} to deduce that there exists qq such that

(39) d(q,y)\displaystyle d(q,y) =d(x,y),\displaystyle=d(x,y),
(40) d(q,x)\displaystyle d(q,x) <γ,\displaystyle<\gamma,
(41) d(q,f(y)^)\displaystyle d(q,\widehat{f(y)}) d(x,y)+C2d(y,f(y)^)γ.\displaystyle\geq d(x,y)+C_{2}d(y,\widehat{f(y)})\gamma.

By Lemma 29, and using that d(x,y)d(x,y) is bounded by inj(M)/2\operatorname{inj}(M)/2, there exists C5C_{5} such that

(42) d(f(q),f(y))d(q,y)(1+C3η)d(x,y)+C5η.d(f(q),f(y))\leq d(q,y)(1+C_{3}\eta)\leq d(x,y)+C_{5}\eta.

By equation (37), as d(q,x)<γd(q,x)<\gamma,

(43) d(q,f(q))<C1(θγ+κγ2).d(q,f(q))<C_{1}(\theta\gamma+\kappa\gamma^{2}).

Using the triangle inequality with (38), (42), (43), to bound the left hand side of equation (41), we obtain that

C1(θγ+κγ2)+d(x,y)+C5η+C4ηd(q,f(q))+d(f(q),f(y))+d(f(y),f(y)^)d(x,y)+C2d(y,f(y)^)γ.C_{1}(\theta\gamma+\kappa\gamma^{2})+d(x,y)+C_{5}\eta+C_{4}\eta\geq d(q,f(q))+d(f(q),f(y))+d(f(y),\widehat{f(y)})\geq d(x,y)+C_{2}d(y,\widehat{f(y)})\gamma.

Moreover (38) gives the lower bound d(y,f(y)^)>d(y,f(y))C4ηd(y,\widehat{f(y)})>d(y,f(y))-C_{4}\eta. We then obtain that

C1(θγ+κγ2)+C5η+C4ηC2d(y,f(y))γC2C4ηγ,C_{1}(\theta\gamma+\kappa\gamma^{2})+C_{5}\eta+C_{4}\eta\geq C_{2}d(y,f(y))\gamma-C_{2}C_{4}\eta\gamma,

and so

C1(θγ+κγ2)+C5η+C4η+C2C4ηγC2γd(y,f(y)).\frac{C_{1}(\theta\gamma+\kappa\gamma^{2})+C_{5}\eta+C_{4}\eta+C_{2}C_{4}\eta\gamma}{C_{2}\gamma}\geq d(y,f(y)).

The constants C1,,C5C_{1},\ldots,C_{5} are uniform over all yy satisfying inj(M)/3<d(x,y)<inj(M)/2\operatorname{inj}(M)/3<d(x,y)<\operatorname{inj}(M)/2. Thus there exists C6>0C_{6}>0 such that for all such yy,

(44) C6(ηγ1+θ+κγ)d(y,f(y)).C_{6}(\eta\gamma^{-1}+\theta+\kappa\gamma)\geq d(y,f(y)).

Suppose that yy is a point at distance 512inj(M)\frac{5}{12}\operatorname{inj}(M) from xx. The above argument shows if zz satisfies d(y,z)<inj(M)/12d(y,z)<\operatorname{inj}(M)/12 then (44) holds with yy replaced by zz, i.e.

C6(ηγ1+θ+κγ)d(z,f(z)).C_{6}(\eta\gamma^{-1}+\theta+\kappa\gamma)\geq d(z,f(z)).

Define α\alpha by

(45) α=C6(ηγ1+θ+κγ).\alpha=C_{6}(\eta\gamma^{-1}+\theta+\kappa\gamma).

Assuming that \alpha<r_{4}, the point y satisfies the second numbered hypothesis of Lemma 33 with \beta=\alpha and any r^{\prime}\leq\operatorname{inj}(M)/12.

There are then two cases depending on whether \alpha>r_{4} or \alpha\leq r_{4}. In the case that \alpha\leq r_{4}, we apply Lemma 33 with the point y as the basepoint, r^{\prime}=r/2, and \beta=\alpha. This gives that there exists a C_{7} depending only on r/2 such that

dC0(f,Id)C7(ηγ1+θ+κγ).d_{C^{0}}(f,\operatorname{Id})\leq C_{7}(\eta\gamma^{-1}+\theta+\kappa\gamma).

If α>r4\alpha>r_{4}, then as κr4\kappa\leq r_{4},

dC0(f,Id)κr4α=C6(ηγ1+θ+κγ).d_{C^{0}}(f,\operatorname{Id})\leq\kappa\leq r_{4}\leq\alpha=C_{6}(\eta\gamma^{-1}+\theta+\kappa\gamma).

Thus letting C8=max{C6,C7}C_{8}=\max\{C_{6},C_{7}\}, we have that

dC0(f,Id)C8(ηγ1+θ+κγ),d_{C^{0}}(f,\operatorname{Id})\leq C_{8}(\eta\gamma^{-1}+\theta+\kappa\gamma),

which gives the result. ∎

5.2. Application to isotropic spaces: proof of Proposition 28

We now prove Proposition 28, which is an application of Theorem 27 to isotropic spaces. The idea of the proof is geometric. We consider the diffeomorphism I^{-1}f. This diffeomorphism is C^{0}-close to the identity, so there is an isometry R_{1} close to the identity such that R_{1}^{-1}I^{-1}f has a fixed point x. The differential of R_{1}^{-1}I^{-1}f at x is very close to preserving both the metric tensor and the curvature tensor at x. We then use the following lemma to obtain an isometry R_{2} that is nearby to R_{1}^{-1}I^{-1}f.

Lemma 34.

[Hel01, Ch. IV Ex. A.6] Let MM be a simply connected Riemannian globally symmetric space or Pn\mathbb{R}\operatorname{P}^{n}. Then if xMx\in M and L:TxMTxML\colon T_{x}M\to T_{x}M is a linear map preserving both the metric tensor at xx and the curvature tensor at xx, then there exists RIsom(M)R\in\operatorname{Isom}(M) such that R(x)=xR(x)=x and DxR=LD_{x}R=L.

We take the isometry R in the conclusion of Proposition 28 to be IR_{1}R_{2}. We then apply Theorem 27 to deduce that R_{2}^{-1}R_{1}^{-1}I^{-1}f is near the identity diffeomorphism. It follows that IR_{1}R_{2} is near to f. Before beginning the proof, we state some additional lemmas.

Lemma 35.

Suppose that V1V_{1} and V2V_{2} are two subspaces of a finite dimensional inner product space WW. Then there exists C>0C>0 such that if xWx\in W, then

d(x,V1V2)<C(d(x,V1)+d(x,V2)).d(x,V_{1}\cap V_{2})<C(d(x,V_{1})+d(x,V_{2})).
Lemma 36.

Suppose that RR is a tensor on n\mathbb{R}^{n}. Let stab(R)\operatorname{stab}(R) be the subgroup of GL(n)\operatorname{GL}(\mathbb{R}^{n}) that stabilizes RR under pullback. Then there exist C,D>0C,D>0 such that if L:nnL\colon\mathbb{R}^{n}\to\mathbb{R}^{n} is an invertible linear map and LId<D\|L-\operatorname{Id}\|<D, then

dGL(n)(L,stab(R))CLRR.d_{\operatorname{GL}(\mathbb{R}^{n})}(L,\operatorname{stab}(R))\leq C\|L^{*}R-R\|.
Proof.

Let \mathfrak{s} be the Lie algebra of \operatorname{stab}(R). Then consider the map \phi from \mathfrak{gl} to the tensor algebra on \mathbb{R}^{n} given by

wexp(w)RR.w\mapsto\exp(w)^{*}R-R.

We may write w=v+v^{\perp}, where v\in\mathfrak{s} and v^{\perp}\in\mathfrak{s}^{\perp}. Because \phi is smooth it has a Taylor expansion of the form

(46) ϕ(tv+tv)=0+tAv+tBv+O(t2).\phi(tv+tv^{\perp})=0+tAv+tBv^{\perp}+O(t^{2}).

Note that A is zero because v\in\mathfrak{s}. We claim that B is injective. For the sake of contradiction, suppose Bv^{\perp}=0 for some nonzero v^{\perp}\in\mathfrak{s}^{\perp}. Then \exp(tv^{\perp})^{*}R-R=O(t^{2}). But then

\exp(v^{\perp})^{*}R-R =\sum_{i=0}^{n-1}\left(\exp((i+1)v^{\perp}/n)^{*}R-\exp(iv^{\perp}/n)^{*}R\right)
=\sum_{i=0}^{n-1}\exp(iv^{\perp}/n)^{*}\left(\exp(v^{\perp}/n)^{*}R-R\right)
=O(1/n).

And hence exp(v)RR=0\exp(v^{\perp})^{*}R-R=0, which contradicts v𝔰v^{\perp}\notin\mathfrak{s}. Thus BB is an injection and hence by Taylor’s theorem for small vv^{\perp} there exists C1C_{1} such that

(47) exp(v)RRC1v.\|\exp(v^{\perp})^{*}R-R\|\geq C_{1}\|v^{\perp}\|.

By using the Taylor expansion (46) and noting that A=0A=0 there, we obtain from equation (47) that there exists C2>0C_{2}>0 such that

(48) exp(w)RRC2v.\|\exp(w)^{*}R-R\|\geq C_{2}\|v^{\perp}\|.

It then follows that there exists a neighborhood N of \operatorname{Id}\in\operatorname{GL}(\mathbb{R}^{n}) such that \operatorname{stab}(R)\cap N is the image of a disc D\subset\mathfrak{s} under \exp. Write \mathfrak{gl}=\mathfrak{s}\oplus\mathfrak{s}^{\perp} as a vector space. Thus, as \exp is bilipschitz in a neighborhood of 0\in\mathfrak{gl}, there exists C_{3} such that if w lies in a sufficiently small neighborhood of 0\in\mathfrak{gl} and we write w=v+v^{\perp}, where v\in\mathfrak{s} and v^{\perp}\in\mathfrak{s}^{\perp}, then

(49) C31vdGL(n)(exp(w),exp(D))C3v.C_{3}^{-1}\|v^{\perp}\|\leq d_{\operatorname{GL}(\mathbb{R}^{n})}(\exp(w),\exp(D))\leq C_{3}\|v^{\perp}\|.

As \operatorname{stab}(R)\cap N=\exp(D), for all w in a possibly smaller neighborhood of 0\in\mathfrak{gl}, the middle term above is comparable to d_{\operatorname{GL}(\mathbb{R}^{n})}(\exp(w),\operatorname{stab}(R)).

Thus combining (49) with (48), we obtain

dGL(n)(exp(w),stab(R))C21C3exp(w)RR.d_{\operatorname{GL}(\mathbb{R}^{n})}(\exp(w),\operatorname{stab}(R))\leq C_{2}^{-1}C_{3}\|\exp(w)^{*}R-R\|.

This gives the result as exp\exp is a surjection onto a neighborhood of IdGL(n)\operatorname{Id}\in\operatorname{GL}(\mathbb{R}^{n}). ∎

The following lemma is immediate from [Hel01, Thm. IV.3.3], which explicitly describes the isometries of globally symmetric spaces.

Lemma 37.

Suppose that M is a closed globally symmetric space. There exists C>0 such that if x,y\in M, then there exists an isometry I\in\operatorname{Isom}(M)^{\circ} such that I(x)=y and d_{C^{0}}(I,\operatorname{Id})\leq Cd(x,y). As \operatorname{Isom}(M)^{\circ} is compact, it follows that for each k there exists a constant C_{k} such that one may choose I with d_{C^{k}}(I,\operatorname{Id})\leq C_{k}d(x,y).

We also use the following lemma, which is the specialization of Lemma 36 to the metric tensor.

Lemma 38.

Suppose that $V$ is a $d$-dimensional inner product space with inner product $g$. There exists a neighborhood $U$ of $\operatorname{Id}\in\operatorname{GL}(V)$ and a constant $C$ such that if $L\in U$, then

dGL(V)(L,SO(V))CLgg,d_{\operatorname{GL}(V)}(L,\operatorname{SO}(V))\leq C\|L^{*}g-g\|,

where GL(V)\operatorname{GL}(V) is endowed with the right-invariant Riemannian metric it inherits from the inner product space VV.

We now prove the proposition.

Proof of Proposition 28.

Pick 0<λ<10<\lambda<1 and a small τ\tau such that

(50) λ2λτ>12σ and σ>τ>0.\frac{\lambda}{2}-\lambda\tau>\frac{1}{2}-\sigma\text{ and }\sigma>\tau>0.

We also assume without loss of generality that 3\ell\geq 3. By Lemma 55 there exist k0k_{0} and ϵ0>0\epsilon_{0}>0 such that if ss is a smooth section of the bundle of symmetric 22-tensors over MM, sCk04\|s\|_{C^{k_{0}}}\leq 4, and sH0ϵ0\|s\|_{H^{0}}\leq\epsilon_{0}, then sCsH01τ\|s\|_{C^{\ell}}\leq\|s\|_{H^{0}}^{1-\tau}. Choose kk such that

(51) k>max{k0,1λ}.k>\max\{k_{0},\frac{\ell}{1-\lambda}\}.

In addition, there are positive numbers ϵ1,,ϵ7\epsilon_{1},\ldots,\epsilon_{7} that will be introduced when needed in the proof below. We define

r=min{ϵ0,ϵ11/(1τ),ϵ2,,ϵ7,1}.r=\min\{\epsilon_{0},\epsilon_{1}^{1/(1-\tau)},\epsilon_{2},\ldots,\epsilon_{7},1\}.

Let ϵ1>0\epsilon_{1}>0 be small enough that for any xMx\in M, if L:TxMTxML\colon T_{x}M\to T_{x}M is invertible and Lggϵ1\|L^{*}g-g\|\leq\epsilon_{1}, then the conclusion of Lemma 38 holds for LL.

Let $\eta=\|f^{*}g-g\|_{H^{0}}$ and $\varepsilon_{2}=d_{C^{2}}(f,I)$. Consider the norm $\|f^{*}g-g\|_{C^{k_{0}}}$. As $d_{C^{k}}(I,f)$ is uniformly bounded and $k-1\geq k_{0}$, the norm $\|f^{*}g-g\|_{C^{k_{0}}}$ is uniformly bounded. In fact, there exists $\epsilon_{2}>0$ such that if $d_{C^{k}}(I,f)<\epsilon_{2}$, then $\|f^{*}g-g\|_{C^{k_{0}}}\leq 4$. As $r<\epsilon_{0}$, the discussion in the first paragraph of the proof implies that

(52) fggC3η1τ.\|f^{*}g-g\|_{C^{3}}\leq\eta^{1-\tau}.

Note that this is less than ϵ1\epsilon_{1} by the choice of rr.

For xMx\in M, we may consider the Lie group GL(TxM)\operatorname{GL}(T_{x}M) as well as its Lie algebra 𝔤𝔩\mathfrak{gl}. There exists ϵ3>0\epsilon_{3}>0 such that restricted to the ball of radius ϵ3\epsilon_{3} about 0𝔤𝔩0\in\mathfrak{gl}, the Lie exponential, which we denote by exp\exp, is bilipschitz with constant 22.

Let xMx\in M be a point that is moved the maximum distance by I1fI^{-1}f. By Lemma 37, there exists a constant Dk>0D_{k}>0 independent of xx and an isometry R1R_{1} such that R1(x)=I1f(x)R_{1}(x)=I^{-1}f(x) and dCk(R1,Id)<Dkd(x,I1f(x))d_{C^{k}}(R_{1},\operatorname{Id})<D_{k}d(x,I^{-1}f(x)). Let h=R11I1fh=R_{1}^{-1}I^{-1}f and note that hh fixes xx. Note that there exists ϵ4>0\epsilon_{4}>0 such that if dCk(f,I)<ϵ4d_{C^{k}}(f,I)<\epsilon_{4}, then by the previous sentence R1R_{1} can be chosen so that dCk(R1,Id)d_{C^{k}}(R_{1},\operatorname{Id}) is small enough that

(53) DxhIdC0ε2.\|D_{x}h-\operatorname{Id}\|\leq C_{0}\varepsilon_{2}.

We claim that $D_{x}h$ is near a linear map of $T_{x}M$ that preserves both the metric tensor and the curvature tensor. Let $\operatorname{SO}(T_{x}M)$ be the group of linear maps preserving the metric tensor on $T_{x}M$ and let $G$ be the group of linear maps preserving the curvature tensor on $T_{x}M$. Both of these are subgroups of $\operatorname{GL}(T_{x}M)$. Since $R_{1}$ and $I$ are isometries, $h^{*}g=f^{*}g$, so by the sentence after equation (52), $D_{x}h$ pulls back the metric on $T_{x}M$ to be within $\epsilon_{1}$ of itself. Thus by Lemma 38, there exists a uniform constant $C_{1}$ such that $D_{x}h$ is within distance $C_{1}\eta^{1-\tau}$ of $\operatorname{SO}(T_{x}M)$. Again by equation (52), we have that $\|h^{*}g-g\|_{C^{3}}\leq\eta^{1-\tau}$. In particular, as the curvature tensor is determined by the metric and its first two derivatives, this implies by Lemma 36 that there exists a constant $C_{2}$ such that $D_{x}h$ is within distance $C_{2}\eta^{1-\tau}$ of $G$.

The previous paragraph shows that there exists $C_{3}$ such that $D_{x}h$ is within distance $C_{3}\eta^{1-\tau}$ of both $\operatorname{SO}(T_{x}M)$ and $G$. Consider now the exponential map of $\operatorname{GL}(T_{x}M)$. As before, let $\mathfrak{gl}$ denote the Lie algebra of $\operatorname{GL}(T_{x}M)$. Let $H=\exp^{-1}(D_{x}h)\in\mathfrak{gl}$; this preimage is defined because $D_{x}h$ is near the identity. Let $\mathfrak{so}$ be the Lie algebra of $\operatorname{SO}(T_{x}M)$ and let $\mathfrak{g}$ be the Lie algebra of $G$. As both $\operatorname{SO}(T_{x}M)$ and $G$ are closed subgroups and $\exp$ is bilipschitz, we conclude that the distance from $H$ to each of $\mathfrak{so}$ and $\mathfrak{g}$ is bounded above by $2C_{3}\eta^{1-\tau}$. Thus by Lemma 35, there exists $C_{4}$ such that $H$ is at most distance $C_{4}\eta^{1-\tau}$ from $\mathfrak{g}\cap\mathfrak{so}$. Let $X\in\mathfrak{g}\cap\mathfrak{so}$ be an element minimizing the distance from $H$ to $\mathfrak{g}\cap\mathfrak{so}$. There exists $\epsilon_{5}>0$ such that if $\eta\leq\epsilon_{5}$ then $C_{4}\eta^{1-\tau}<\epsilon_{3}$. Hence as $r<\epsilon_{5}$, the same bilipschitz estimate on the Lie exponential gives

(54) d(exp(X),Dxh)2C4η1τ.d(\exp(X),D_{x}h)\leq 2C_{4}\eta^{1-\tau}.

Note that exp(X)SO(TxM)G\exp(X)\in SO(T_{x}M)\cap G. By Lemma 34, there exists an isometry R2R_{2} of MM such that R2R_{2} fixes xx and DxR2=exp(X)D_{x}R_{2}=\exp(X). In fact, because of equation (53) and because XX is within distance C4η1τC_{4}\eta^{1-\tau} of HH, we may bound the norm of XX and hence deduce that there exists C5C_{5} such that

(55) dCk(R2,Id)C5(ε2+η1τ).d_{C^{k}}(R_{2},\operatorname{Id})\leq C_{5}(\varepsilon_{2}+\eta^{1-\tau}).

The map RR in the conclusion of the proposition will be IR1R2IR_{1}R_{2}. We must now check that R=IR1R2R=IR_{1}R_{2} satisfies estimates (22) and (23). The former is straightforward: (22) follows from (55) combined with knowing that R1R_{1} was constructed so that d(R1,Id)Dε2d(R_{1},\operatorname{Id})\leq D^{\prime}\varepsilon_{2} for some uniform D>0D^{\prime}>0.

Let h2=R21hh_{2}=R_{2}^{-1}h. The map h2h_{2} has xx as a fixed point. There exists C6>0C_{6}>0 such that the following four estimates hold:

(56) Dxh2Id\displaystyle\|D_{x}h_{2}-\operatorname{Id}\| C6η1τ,\displaystyle\leq C_{6}\eta^{1-\tau},
(57) h2ggC3\displaystyle\|h_{2}^{*}g-g\|_{C^{3}} η1τ,\displaystyle\leq\eta^{1-\tau},
(58) dC2(h2,Id)\displaystyle d_{C^{2}}(h_{2},\operatorname{Id}) C6(ε2+η1τ),\displaystyle\leq C_{6}(\varepsilon_{2}+\eta^{1-\tau}),
(59) dCk(h2,Id)\displaystyle d_{C^{k}}(h_{2},\operatorname{Id}) C6(η1τ+dCk(I,f)).\displaystyle\leq C_{6}(\eta^{1-\tau}+d_{C^{k}}(I,f)).

The first two estimates above are immediate from equations (54) and (52), respectively. The third and fourth follow from an estimate on CkC^{k} compositions, Lemma 50, and equation (55).

Let r0r_{0} be the cutoff rr appearing in Theorem 27. Note that there exists ϵ6>0\epsilon_{6}>0 such that if dCk(f,I)<ϵ6d_{C^{k}}(f,I)<\epsilon_{6} and η<ϵ6\eta<\epsilon_{6}, then the right hand side of each of inequalities (56) through (59) is bounded above by r0r_{0}. Hence as r<ϵ6r<\epsilon_{6} we apply Theorem 27 to h2h_{2} to conclude that there exists C7C_{7} such that for all 0<γ<r00<\gamma<r_{0},

dC0(Id,h2)<C7(η1τ+C6(ε2+η1τ)γ+η1τγ1).d_{C^{0}}(\operatorname{Id},h_{2})<C_{7}(\eta^{1-\tau}+C_{6}(\varepsilon_{2}+\eta^{1-\tau})\gamma+\eta^{1-\tau}\gamma^{-1}).

But h2=R21R11I1fh_{2}=R_{2}^{-1}R_{1}^{-1}I^{-1}f, so

(60) dC0(R,f)<C8(η1τ+C6(ε2+η1τ)γ+η1τγ1).d_{C^{0}}(R,f)<C_{8}(\eta^{1-\tau}+C_{6}(\varepsilon_{2}+\eta^{1-\tau})\gamma+\eta^{1-\tau}\gamma^{-1}).

We now obtain the high regularity estimate, equation (23), via interpolation. By similarly moving the isometries from one slot to the other, (59) gives that

(61) dCk(R,f)<C9(η1τ+dCk(I,f)).d_{C^{k}}(R,f)<C_{9}(\eta^{1-\tau}+d_{C^{k}}(I,f)).

There exists ϵ7>0\epsilon_{7}>0 such that if dCk(I,f)<ϵ7d_{C^{k}}(I,f)<\epsilon_{7} and η<ϵ7\eta<\epsilon_{7}, then the right hand side of equation (61) is at most 11.

We now apply the interpolation inequality in Lemma 52 and interpolate between the C0C^{0} and CkC^{k} distance to estimate dC(R,f)d_{C^{\ell}}(R,f). Write =(1λ)k\ell=(1-\lambda^{\prime})k for some λ\lambda^{\prime} and note that 1>λ>λ1>\lambda^{\prime}>\lambda by (51). We use the estimate in equation (60) to estimate the C0C^{0} norm and use 11 to estimate the CkC^{k} norm, which we may do because r<ϵ7r<\epsilon_{7}. Thus there exists C10C_{10} such that for 0<γ<r00<\gamma<r_{0},

(62) dC(R,f)<C10(η1τγ1+ε2γ)λ.d_{C^{\ell}}(R,f)<C_{10}(\eta^{1-\tau}\gamma^{-1}+\varepsilon_{2}\gamma)^{\lambda^{\prime}}.

Note that there exists $C_{11}>0$ such that $\|f^{*}g-g\|_{H^{0}}\leq C_{11}\varepsilon_{2}$. Consequently, there exists a constant $C_{12}$ such that $C_{12}\sqrt{\eta/\varepsilon_{2}}$ is less than the cutoff $r_{0}$. We take $\gamma$ to equal $C_{12}\sqrt{\eta/\varepsilon_{2}}$ in equation (62), which gives

(63) dC(R,f)<C13(η1/2τε21/2+η1/2ε21/2)λ<C14(ηλ/2λτε2λ/2+ηλ/2ε2λ/2).d_{C^{\ell}}(R,f)<C_{13}(\eta^{1/2-\tau}\varepsilon_{2}^{1/2}+\eta^{1/2}\varepsilon_{2}^{1/2})^{\lambda^{\prime}}<C_{14}(\eta^{\lambda/2-\lambda\tau}\varepsilon_{2}^{\lambda/2}+\eta^{\lambda/2}\varepsilon_{2}^{\lambda/2}).

Hence by our choice of λ\lambda and τ\tau in equation (50) and because η<r<1\eta<r<1,

(64) dC(R,f)<C15η1/2σε21/2σ,d_{C^{\ell}}(R,f)<C_{15}\eta^{1/2-\sigma}\varepsilon_{2}^{1/2-\sigma},

which establishes equation (23) and finishes the proof. ∎
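The exponent bookkeeping behind equations (50) and (62) through (64) can be checked with a few lines of arithmetic. In the sketch below the values of $\lambda$, $\tau$, and $\sigma$ are hypothetical choices satisfying (50), not values produced by the proof.

```python
# With gamma ~ sqrt(eta/eps2), the two terms of (eta^{1-tau}/gamma + eps2*gamma)^{lam}
# contribute the exponents of eta computed below; compare with (63) and (64).
lam, tau, sigma = 0.9, 0.02, 0.07        # hypothetical values satisfying (50)
assert lam / 2 - lam * tau > 0.5 - sigma and sigma > tau > 0

exp_first = lam * (0.5 - tau)    # exponent of eta coming from eta^{1/2-tau} * eps2^{1/2}
exp_second = lam * 0.5           # exponent of eta coming from eta^{1/2} * eps2^{1/2}
target = 0.5 - sigma             # exponent required in (64)
print(exp_first, exp_second, target)
assert min(exp_first, exp_second) > target   # so eta^{...} <= eta^{1/2-sigma} once eta < 1
```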

6. KAM Scheme

In this section we develop the KAM scheme and prove that it converges. A KAM scheme is an iterative approach to constructing a conjugacy between two systems in the $C^{\infty}$ setting. We begin by discussing the smoothing operators that will be used in the scheme. Then we state a lemma, Lemma 39, that summarizes the result of performing a single step of the scheme. We then prove Theorem 1 by showing that iterating this single KAM step produces the required convergence. We conclude the section with a final corollary of the KAM scheme, which gives an asymptotic relationship between the top exponent, the bottom exponent, and the sum of all the exponents.

6.1. One step in the KAM scheme

In the KAM scheme, we begin with a tuple of isometries (R1,,Rm)(R_{1},...,R_{m}) and a nearby tuple of diffeomorphisms (f1,,fm)(f_{1},...,f_{m}). We want to find a diffeomorphism ϕ\phi such that for all ii, ϕ1fiϕ=Ri\phi^{-1}f_{i}\phi=R_{i}. However, such a ϕ\phi may not exist.

We will instead attempt to construct a conjugacy $\phi$ with the following property. Let $\widetilde{f}_{i}$ equal $\phi^{-1}f_{i}\phi$. For the tuples $(\widetilde{f}_{1},...,\widetilde{f}_{m})$ and $(R_{1},...,R_{m})$, we can arrange that the error term $\mathcal{U}$ in Proposition 26 is small. Once we know that the error term is small, the estimate in Proposition 26 shows that small Lyapunov exponents imply that each $\widetilde{f}_{i}$ has small strain. Then, using Proposition 28, small strain implies that there exist isometries $R_{i}^{\prime}$ such that each $\widetilde{f}_{i}$ is near $R_{i}^{\prime}$. We then apply the same process to the tuples $(\widetilde{f}_{1},...,\widetilde{f}_{m})$ and $(R_{1}^{\prime},\ldots,R_{m}^{\prime})$.

The previous paragraph contains the core idea of the KAM scheme. Following this scheme, one encounters a common technical difficulty inherent in KAM arguments: regularity. In our case, this problem is most acute when we construct the conjugacy $\phi$. There is not a single choice of $\phi$, but rather a family depending on a parameter $\lambda$. The parameter $\lambda$ controls how smooth $\phi$ is: larger values of $\lambda$ give less regular conjugacies. We refer to such a $\phi$ as a conjugation of cutoff $\lambda$; the formal construction of the conjugation of cutoff $\lambda$ appears in the proof of Lemma 39, which also gives the estimates following from this construction. The $n$th time we iterate this procedure we will use a particular value $\lambda_{n}$ as our cutoff. The proof of Theorem 1 shows how to pick the sequence $\lambda_{n}$ so that the procedure converges.

We now introduce the smoothing operators. Suppose that MM is a closed Riemannian manifold. As before, let Δ\Delta denote the Casimir Laplacian on MM as in subsection 2.4. As Δ\Delta is self adjoint, it decomposes the space of L2L^{2} vector fields into subspaces depending on the particular eigenvalue associated to that subspace. We call these subspaces HλH_{\lambda}. For a vector field XX, we may write X=λXλX=\sum_{\lambda}X_{\lambda}, where XλHλX_{\lambda}\in H_{\lambda} is the projection of XX onto the λ\lambda eigenspace of Δ\Delta. All of the eigenvalues of Δ\Delta are positive. By removing the components of XX that lie in high eigenvalue subspaces, we are able to smooth XX. Let 𝒯λX=λ<λXλ\mathcal{T}_{\lambda}X=\sum_{\lambda^{\prime}<\lambda}X_{\lambda^{\prime}} equal the projection onto the modes strictly less than λ\lambda in magnitude. Let λX=λλXλ\mathcal{R}_{\lambda}X=\sum_{\lambda^{\prime}\geq\lambda}X_{\lambda^{\prime}} be the projection onto the modes of magnitude greater than or equal to λ\lambda. Then X=𝒯λX+λXX=\mathcal{T}_{\lambda}X+\mathcal{R}_{\lambda}X.

We record two standard estimates which may be obtained by application of the Sobolev embedding theorem. For s0s\geq 0, there exists a constant Cs>0C_{s}>0 such that for any s¯s\overline{s}\geq s and any CC^{\infty} vector field XX on MM,

(65) 𝒯λXCs¯Csλk3+(s¯s)/2XCs,\|\mathcal{T}_{\lambda}X\|_{C^{\overline{s}}}\leq C_{s}\lambda^{k_{3}+(\overline{s}-s)/2}\|X\|_{C^{s}},
(66) λXCsCsλk3(s¯s)/2XCs¯.\|\mathcal{R}_{\lambda}X\|_{C^{s}}\leq C_{s}\lambda^{k_{3}-(\overline{s}-s)/2}\|X\|_{C^{\overline{s}}}.

The smoothing operators and the above estimates on them are useful because without smoothing certain estimates appearing in the KAM scheme become unusable. One may see this by considering what happens in the proof of Lemma 39 if one removes the smoothing operator 𝒯λ\mathcal{T}_{\lambda} from equation (73).
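As an illustration of the estimates (65) and (66), the following toy computation performs the analogous frequency truncation for Fourier modes on the circle, where the mode $\sin(nx)$ has Laplacian eigenvalue $n^{2}$ and the cutoff $\lambda$ keeps the modes with $n^{2}<\lambda$. The test function, the grid-based approximation of the $C^{s}$ norms, and the displayed exponents are ad hoc choices made only for this illustration; on $M$ the operators are defined through the Casimir Laplacian as above.

```python
import numpy as np

# Toy version of the smoothing estimates (65)-(66) using Fourier modes on S^1: the mode
# sin(n x) has Laplacian eigenvalue n^2, so the cutoff lambda keeps the modes with n^2 < lambda.
N = 512
ns = np.arange(1, N + 1)
coeffs = np.random.default_rng(1).standard_normal(N) / ns**2.5   # an arbitrary test field
xs = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)

def c_norm(mask, s):
    """Sup norm of the s-th derivative of sum_n c_n sin(n x) over the modes selected by mask."""
    vals = np.zeros_like(xs)
    for n, c in zip(ns[mask], coeffs[mask]):
        vals += c * n**s * np.sin(n * xs + s * np.pi / 2.0)
    return np.abs(vals).max()

every = np.ones(N, dtype=bool)
s, s_bar = 1, 3
for lam in [25.0, 100.0, 400.0]:
    low = ns**2 < lam
    print(f"lambda={lam:5.0f}",
          f"|T X|_C3={c_norm(low, s_bar):9.3f} <~ lam*|X|_C1={lam * c_norm(every, s):9.3f};",
          f"|R X|_C1={c_norm(~low, s):6.3f} <~ |X|_C3/lam={c_norm(every, s_bar) / lam:7.3f}")
# The left entry of each pair is dominated by the right one, in accordance with (65) and (66)
# (here s=1, s_bar=3, so lambda^{(s_bar-s)/2} = lambda, up to the constants C_s and lambda^{k_3}).
```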

The proof of the following lemma should be compared with [DK07, Sec. 3.4].

Lemma 39.

Suppose that $(M^{d},g)$ is a closed isotropic Riemannian manifold other than $S^{1}$. There exists a natural number $l_{0}$ such that for $\ell>l_{0}$ and any $(C,\alpha,n_{0})$ the following holds. For any sufficiently small $\sigma>0$, there exist a constant $r_{\ell}>0$ and numbers $k_{0},k_{1},k_{2}$ such that for any $s>\ell$ and any $m$ there exist constants $C_{s,\ell},r_{s,\ell}>0$ such that the following holds. Suppose that $(R_{1},...,R_{m})$ is a $(C,\alpha,n_{0})$-Diophantine tuple with entries in $\operatorname{Isom}(M)$ and that $(f_{1},...,f_{m})$ is a collection of $C^{\infty}$ diffeomorphisms of $M$. Suppose that the random dynamical system generated by $(f_{1},...,f_{m})$ has stationary measures whose bottom Lyapunov exponents are arbitrarily small in magnitude. Write $\varepsilon_{k}$ for $\max_{i}d_{C^{k}}(f_{i},R_{i})$. If $\lambda\geq 1$ is a number such that

(67) λk0εl0r\lambda^{k_{0}}\varepsilon_{l_{0}}\leq r_{\ell}

and

(68) λk1s/4εs+εl03/2<rs,,\lambda^{k_{1}-s/4}\varepsilon_{s}+\varepsilon_{l_{0}}^{3/2}<r_{s,\ell},

then there exists a smooth diffeomorphism ϕ\phi and a new tuple (R1,,Rm)(R_{1}^{\prime},...,R_{m}^{\prime}) of isometries of MM such that for all ii setting f~i=ϕfiϕ1\widetilde{f}_{i}=\phi f_{i}\phi^{-1}, we have

(69) dC(f~i,Ri)\displaystyle d_{C^{\ell}}(\widetilde{f}_{i},R_{i}^{\prime}) Cs,(λk1s/10εs1σ+εl09/8),\displaystyle\leq C_{s,\ell}(\lambda^{k_{1}-s/10}\varepsilon_{s}^{1-\sigma}+\varepsilon_{l_{0}}^{9/8}),
(70) dC0(Ri,Ri)\displaystyle d_{C^{0}}(R_{i},R_{i}^{\prime}) Cs,(εl0+(λk1s/4εs+εl03/2)1σ),\displaystyle\leq C_{s,\ell}(\varepsilon_{l_{0}}+(\lambda^{k_{1}-s/4}\varepsilon_{s}+\varepsilon_{l_{0}}^{3/2})^{1-\sigma}),
(71) dCs(f~i,Ri)\displaystyle d_{C^{s}}(\widetilde{f}_{i},R_{i}^{\prime}) Cs,λk2εs, and\displaystyle\leq C_{s,\ell}\lambda^{k_{2}}\varepsilon_{s},\text{ and}
(72) dCs(ϕ,Id)\displaystyle d_{C^{s}}(\phi,\operatorname{Id}) Cs,λk2εs.\displaystyle\leq C_{s,\ell}\lambda^{k_{2}}\varepsilon_{s}.

The diffeomorphism ϕ\phi is called a conjugation of cutoff λ\lambda.

Proof.

As in equation (10), let $Y_{i}$ be the smallest vector field on $M$ satisfying $\exp_{R_{i}(x)}Y_{i}(x)=f_{i}(x)$. Let $\mathcal{L}$ be the operator on vector fields defined by $\mathcal{L}(Z)=m^{-1}\sum_{i=1}^{m}(R_{i})_{*}Z$ as in Proposition 21. Let

(73) V(1)1(1mi𝒯λYi)V\coloneqq-(1-\mathcal{L})^{-1}\left(\frac{1}{m}\sum_{i}\mathcal{T}_{\lambda}Y_{i}\right)

and let $\widetilde{f}_{i}=\psi_{V}f_{i}\psi_{V}^{-1}$. Let $\widetilde{\varepsilon}_{k}=\max_{i}d_{C^{k}}(\widetilde{f}_{i},R_{i})$ and let $\widetilde{Y}_{i}$ be the pointwise smallest vector field such that $\exp_{R_{i}(x)}\widetilde{Y}_{i}(x)=\widetilde{f}_{i}(x)$. By Proposition 43, for a $C^{1}$ small vector field $V$,

(74) $\widetilde{Y}_{i}=Y_{i}+V-(R_{i})_{*}V+Q(Y_{i},V),$

where $Q$ is quadratic in the sense of Definition 42. By Proposition 16, we see that $\|V\|_{C^{k}}\leq C_{k}\varepsilon_{k+\alpha}$ for some fixed $\alpha$. There exists $\beta$ such that for each $k$ there is $D_{k}$ with $\|Q(Y_{i},V)\|_{C^{k}}\leq D_{k}\varepsilon_{k+\beta}^{2}$. By estimating the terms in equation (74), it follows that for each $k>0$, if $\varepsilon_{k+\alpha+\beta}<1$ then there exists a constant $D_{2,k}$ such that

(75) dCk(f~i,Ri)<D2,kεk+α+β.d_{C^{k}}(\widetilde{f}_{i},R_{i})<D_{2,k}\varepsilon_{k+\alpha+\beta}.

Let μ\mu be an ergodic stationary measure on MM for the tuple (f~1,,f~m)(\widetilde{f}_{1},...,\widetilde{f}_{m}) as in the statement of the lemma. We now apply Proposition 26 with r=d1,dr=d-1,d and recall why the hypotheses of that proposition are satisfied. First, by our assumption that MM is isotropic, Isom(M)\operatorname{Isom}(M)^{\circ} acts transitively on MM and Gr1(M)\operatorname{Gr}_{1}(M). We have also assumed the tuple (R1,,Rm)(R_{1},\ldots,R_{m}) is Diophantine. The nearness of (f~1,,f~m)(\widetilde{f}_{1},\ldots,\widetilde{f}_{m}) to (R1,,Rm)(R_{1},\ldots,R_{m}) is guaranteed by equation (75), a sufficiently small choice of rr_{\ell}, and sufficiently large choice of l0l_{0} by equation (67) as λ1\lambda\geq 1. Thus by applying Proposition 26 to the conjugated system, there exists k1k_{1} such that, in the language of that proposition:

$$\Lambda_{r}(\mu)=\frac{-r}{2dm}\sum_{i=1}^{m}\int_{M}\|E_{C}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\frac{r(d-r)}{(d+2)(d-1)m}\sum_{i=1}^{m}\int_{M}\|E_{NC}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\int_{\operatorname{Gr}_{r}(M)}\mathcal{U}(\psi_{r})\,d\operatorname{vol}+O(\|\widetilde{Y}\|_{C^{k_{1}}}^{3}),$$

where ψr(x)=1mi=1mlndet(Dxf~iEx)\psi_{r}(x)=\frac{1}{m}\sum_{i=1}^{m}\ln\det(D_{x}\widetilde{f}_{i}\mid E_{x}) and 𝒰\mathcal{U} is defined in Proposition 23.

Pick a sequence of ergodic stationary measures μn\mu_{n} so that |λd(μn)|0\left|\lambda_{d}(\mu_{n})\right|\to 0. Subtracting the expression for Λd1(μn)\Lambda_{d-1}(\mu_{n}) from the expression for Λd(μn)\Lambda_{d}(\mu_{n}), we obtain that

(76) λd(μn)=Λd(μn)Λd1(μn)=12dmi=1mMECf~i2dvol+(d1)(d+2)(d1)mi=1mMENCf~i2dvolGrd1(M)𝒰(ψd1)dvol+Grd(M)𝒰(ψd)dvol+O(Y~Ck13).\displaystyle\begin{split}\lambda_{d}(\mu_{n})=\Lambda_{d}(\mu_{n})-\Lambda_{d-1}(\mu_{n})=&\frac{-1}{2dm}\sum_{i=1}^{m}\int_{M}\|E_{C}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\frac{-(d-1)}{(d+2)(d-1)m}\sum_{i=1}^{m}\int_{M}\|E_{NC}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}\\ &-\int_{\operatorname{Gr}_{d-1}(M)}\mathcal{U}(\psi_{d-1})\,d\operatorname{vol}+\int_{\operatorname{Gr}_{d}(M)}\mathcal{U}(\psi_{d})\,d\operatorname{vol}+O(\|\widetilde{Y}\|^{3}_{C^{k_{1}}}).\end{split}

Write $\operatorname{Gr}_{r}(R)$ for the map on $\operatorname{Gr}_{r}(M)$ induced by $R$. Write $\mathbf{Y}_{i}$ for the shortest vector field on $\operatorname{Gr}_{r}(M)$ such that $\exp_{\operatorname{Gr}_{r}(R_{i})(x)}\mathbf{Y}_{i}(x)=\operatorname{Gr}_{r}(\widetilde{f}_{i})(x)$. By Lemma 56, for each $k$ there exists $C_{1,k}$ such that

i=1m𝐘iCkC1,k(i=1mY~iCk+1+ε~k+12).\left\|\sum_{i=1}^{m}\mathbf{Y}_{i}\right\|_{C^{k}}\leq C_{1,k}\left(\left\|\sum_{i=1}^{m}\widetilde{Y}_{i}\right\|_{C^{k+1}}+\widetilde{\varepsilon}_{k+1}^{2}\right).

Hence by the above line and the final estimate in Proposition 23 there exists k2k_{2} such that

(77) |Grr(M)𝒰(ψr)dvol|C2ψrCk2(1mi=1mY~iCk2+Y~iCk22).\left|\int_{\operatorname{Gr}_{r}(M)}\mathcal{U}(\psi_{r})\,d\operatorname{vol}\right|\leq C_{2}\|\psi_{r}\|_{C^{k_{2}}}\left(\left\|\frac{1}{m}\sum_{i=1}^{m}\widetilde{Y}_{i}\right\|_{C^{k_{2}}}+\|\widetilde{Y}_{i}\|_{C^{k_{2}}}^{2}\right).

The term ψrCk2\|\psi_{r}\|_{C^{k_{2}}} is bounded by a constant times ε~k2\widetilde{\varepsilon}_{k_{2}}. By using equation (74) we may rewrite the second term appearing in the product in equation (77).

\begin{align*}
\frac{1}{m}\sum_{i=1}^{m}\widetilde{Y}_{i}&=\frac{1}{m}\sum_{i}Y_{i}+V-\frac{1}{m}\sum_{i}(R_{i})_{*}V+\frac{1}{m}\sum_{i}Q(Y_{i},V)\\
&=\frac{1}{m}\sum_{i}Y_{i}-(1-\mathcal{L})(1-\mathcal{L})^{-1}\Big(\frac{1}{m}\sum_{i}\mathcal{T}_{\lambda}Y_{i}\Big)+\frac{1}{m}\sum_{i}Q(Y_{i},V)\\
&=\frac{1}{m}\sum_{i}\mathcal{R}_{\lambda}Y_{i}+\frac{1}{m}\sum_{i}\mathcal{T}_{\lambda}Y_{i}-\frac{1}{m}\sum_{i}\mathcal{T}_{\lambda}Y_{i}+\frac{1}{m}\sum_{i}Q(Y_{i},V)\\
&=\frac{1}{m}\sum_{i}\mathcal{R}_{\lambda}Y_{i}+\frac{1}{m}\sum_{i}Q(Y_{i},V).
\end{align*}

By equation (66), there exists k3k_{3} such that for all s0s\geq 0:

$$\|\mathcal{R}_{\lambda}Y_{i}\|_{C^{k_{2}}}\leq C_{3,s}\lambda^{k_{3}-s/2}\|Y_{i}\|_{C^{s}}.$$

As the QQ term is quadratic, there exist 2\ell_{2}, C4C_{4} such that

$$\|Q(Y_{i},V)\|_{C^{k_{2}}}\leq C_{4}\|Y_{i}\|_{C^{\ell_{2}}}\|V\|_{C^{\ell_{2}}}=C_{4}\|Y_{i}\|_{C^{\ell_{2}}}\Big\|(1-\mathcal{L})^{-1}\Big(\frac{1}{m}\sum_{j}\mathcal{T}_{\lambda}Y_{j}\Big)\Big\|_{C^{\ell_{2}}}\leq C_{5}\varepsilon_{\ell_{3}}^{2}$$

for some 3\ell_{3} by Proposition 21. Thus

1miY~iCk2C6,s(λk3s/2εs+ε32).\left\|\frac{1}{m}\sum_{i}\widetilde{Y}_{i}\right\|_{C^{k_{2}}}\leq C_{6,s}(\lambda^{k_{3}-s/2}\varepsilon_{s}+\varepsilon_{\ell_{3}}^{2}).

Finally, by equation (75) we have that Y~iCk2C7ε3\|\widetilde{Y}_{i}\|_{C^{k_{2}}}\leq C_{7}\varepsilon_{\ell_{3}} as before. Let 4=max{3,k2+α+β}\ell_{4}=\max\{\ell_{3},k_{2}+\alpha+\beta\}. Applying all of these estimates to (77) gives

(78) |Grr(M)𝒰(ψr)dvol|C8,sεk2(λk3s/2εs+ε42).\left|\int_{\operatorname{Gr}_{r}(M)}\mathcal{U}(\psi_{r})\,d\operatorname{vol}\right|\leq C_{8,s}\varepsilon_{k_{2}}(\lambda^{k_{3}-s/2}\varepsilon_{s}+\varepsilon_{\ell_{4}}^{2}).

By taking $\ell_{5}>\max\{k_{1}+\alpha+\beta,k_{2},\ell_{4}\}$, using that $\lambda_{d}(\mu_{n})\to 0$ (in fact, we do not need $\lambda_{d}(\mu_{n})\to 0$ in order to conclude equation (79); it suffices to know that there exists $\mu$ such that $\lambda_{d}(\mu)$ is comparable to the right hand side of (78), an observation which is the essence of the proof of Theorem 40), and combining equations (78) and (76), we obtain for $s\geq 0$ that there exists $C_{9,s}$ such that

(79) C9,s(λk3s/2εsε5+ε53)12dmi=1mMECf~i2dvol+(d1)(d+2)(d1)mi=1mMENCf~i2dvol.C_{9,s}(\lambda^{k_{3}-s/2}\varepsilon_{s}\varepsilon_{\ell_{5}}+\varepsilon_{\ell_{5}}^{3})\geq\frac{1}{2dm}\sum_{i=1}^{m}\int_{M}\|E_{C}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\frac{(d-1)}{(d+2)(d-1)m}\sum_{i=1}^{m}\int_{M}\|E_{NC}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}.

Note that the coefficients on each of the strain terms are positive. If s>5s>\ell_{5}, then by taking square roots, we see that there exist constants C10,sC_{10,s} such that for each ii

(80) C10,s(λk3/2s/4εs+ε53/2)f~iggH0.C_{10,s}(\lambda^{k_{3}/2-s/4}\varepsilon_{s}+\varepsilon_{\ell_{5}}^{3/2})\geq\|\widetilde{f}_{i}^{*}g-g\|_{H^{0}}.

We now give a naive estimate on the higher CsC^{s} norms under the assumption that ε1\varepsilon_{1} is bounded by a constant ϵ1>0\epsilon_{1}>0. To begin, by combining equation (65) and Proposition 16 we see that there exists α>0\alpha>0 such that for each ss there exists D3,sD_{3,s} such that VCsD3,sλαεs\|V\|_{C^{s}}\leq D_{3,s}\lambda^{\alpha}\varepsilon_{s}. Hence by Lemma 48, both dCs(ψV,Id)d_{C^{s}}(\psi_{V},\operatorname{Id}) and dCs(ψV1,Id)d_{C^{s}}(\psi_{V}^{-1},\operatorname{Id}) are bounded by D4,sλαεsD_{4,s}\lambda^{\alpha}\varepsilon_{s}. This establishes equation (72).

Now applying the composition estimate from Lemma 50, we find that assuming λ1\lambda\geq 1:

dCs(fψV1,R)\displaystyle d_{C^{s}}(f\circ\psi_{V}^{-1},R) C11,s(dCs(f,R)+dCs(ψV1,Id))\displaystyle\leq C_{11,s}(d_{C^{s}}(f,R)+d_{C^{s}}(\psi_{V}^{-1},\operatorname{Id}))
C12,s(εs+λαεs)\displaystyle\leq C_{12,s}(\varepsilon_{s}+\lambda^{\alpha}\varepsilon_{s})
C13,s(λαεs).\displaystyle\leq C_{13,s}(\lambda^{\alpha}\varepsilon_{s}).

We then apply the other estimate in Lemma 50, to find:

dCs(ψVfψV1,R)\displaystyle d_{C^{s}}(\psi_{V}\circ f\circ\psi_{V}^{-1},R) C11,s(dCs(ψV,Id)+dCs(fψV1,R))\displaystyle\leq C_{11,s}(d_{C^{s}}(\psi_{V},\operatorname{Id})+d_{C^{s}}(f\circ\psi_{V}^{-1},R))
C14,s(λαεs+λαεs)\displaystyle\leq C_{14,s}(\lambda^{\alpha}\varepsilon_{s}+\lambda^{\alpha}\varepsilon_{s})
C15,sλαεs.\displaystyle\leq C_{15,s}\lambda^{\alpha}\varepsilon_{s}.

Hence under an assumption of the type in equation (67), namely ε1<ϵ1\varepsilon_{1}<\epsilon_{1}, we may conclude

(81) dCs(f~i,R)C15,sλαεs,d_{C^{s}}(\widetilde{f}_{i},R)\leq C_{15,s}\lambda^{\alpha}\varepsilon_{s},

which establishes equation (71).

We now apply Proposition 28 to this system. Let $k_{\sigma}$ and $r_{\sigma}$ be the $k$ and $r$ in Proposition 28 for the given choice of $\sigma$ and our fixed $\ell$. In preparation for applying the proposition, we record some basic estimates:

(1) By combining equation (65) and Proposition 21 as before, we see that there exists $\ell_{6}$ such that
(82) $d_{C^{2}}(\widetilde{f}_{i},R_{i})\leq\varepsilon_{\ell_{6}}$.
(2) From the previous discussion we also have
$\|\widetilde{f}_{i}^{*}g-g\|_{H^{0}}\leq C_{10,s}(\lambda^{k_{3}/2-s/4}\varepsilon_{s}+\varepsilon_{\ell_{5}}^{3/2})$.
(3) We also need the $C^{k_{\sigma}}$ estimate
$d_{C^{k_{\sigma}}}(\widetilde{f}_{i},R_{i})\leq C_{15,k_{\sigma}}\lambda^{\alpha}\varepsilon_{k_{\sigma}}$.

Hence if

(83) C15,kσλαεkσ<rσC_{15,k_{\sigma}}\lambda^{\alpha}\varepsilon_{k_{\sigma}}<r_{\sigma}

and

(84) C10,s(λk3/2s/4εs+ε53/2)rσ,C_{10,s}(\lambda^{k_{3}/2-s/4}\varepsilon_{s}+\varepsilon_{\ell_{5}}^{3/2})\leq r_{\sigma},

then by Proposition 28 and the previous estimates there exist $C_{16,s}$ and isometries $R_{i}^{\prime}$ such that

(85) dC(f~i,Ri)C16,s(λk3/2s/4εs+ε53/2)1/2σε61/2σd_{C^{\ell}}(\widetilde{f}_{i},R_{i}^{\prime})\leq C_{16,s}(\lambda^{k_{3}/2-s/4}\varepsilon_{s}+\varepsilon_{\ell_{5}}^{3/2})^{1/2-\sigma}\varepsilon_{\ell_{6}}^{1/2-\sigma}

and

(86) dC0(Ri,Ri)<C17,s(ε6+(λk3/2s/4εs+ε53/2)1σ).d_{C^{0}}(R_{i}^{\prime},R_{i})<C_{17,s}(\varepsilon_{\ell_{6}}+(\lambda^{k_{3}/2-s/4}\varepsilon_{s}+\varepsilon_{\ell_{5}}^{3/2})^{1-\sigma}).

Let 7=max{5,6}\ell_{7}=\max\{\ell_{5},\ell_{6}\}. If s>7s>\ell_{7}, then equation (85) implies

dC(f~i,Ri)C16,s(λk4s/9εs12σ+ε75/4(5/2)σ),d_{C^{\ell}}(\widetilde{f}_{i},R_{i}^{\prime})\leq C_{16,s}(\lambda^{k_{4}-s/9}\varepsilon_{s}^{1-2\sigma}+\varepsilon_{\ell_{7}}^{5/4-(5/2)\sigma}),

which yields equation (69) under the assumption that σ>0\sigma>0 is sufficiently small. Note that equation (86) establishes equation (70). Thus we are done as we have established these estimates assuming only bounds of the type appearing in equations (67) and (68). ∎

Remark 1.

In the above lemma, we could instead have assumed that there exist stationary measures for which both the top exponent and the sum of all the exponents are arbitrarily small, and concluded the same result. The reason is that if we had considered $\Lambda_{1}-\Lambda_{d}$ in equation (76), the coefficients of the strain terms would still have the same sign, and so we could conclude the same result. By related modifications, one can produce many other formulations of the main result in [DK07] that require other hypotheses on the Lyapunov exponents.

6.2. Convergence of the KAM scheme

In this section we prove the main linearization theorem. It is helpful to note that the approach to this theorem is somewhat different from the classical approach to KAM type results. In a classical argument, one might typically linearize the problem at a target isometric system and then find a solution to the linearized problem. In our case, while we are able to linearize the problem, the resulting linearized problem does not obviously have any solution. Consequently we must give dynamical and geometric arguments that show that a related type of averaged linearized problem can be solved and that solving this averaged problem is indeed helpful. This then allows us to make progress in the KAM scheme by conjugating the system closer to an isometric one. In particular, note that in our case we do not know from the outset which isometric system our random system will ultimately be conjugate to.

Theorem 1.

Let $M^{d}$ be a closed isotropic Riemannian manifold other than $S^{1}$. There exists $k_{0}$ such that if $(R_{1},...,R_{m})$ is a tuple of isometries of $M$ such that the subgroup of $\operatorname{Isom}(M)$ generated by this tuple contains $\operatorname{Isom}(M)^{\circ}$, then there exists $\epsilon_{k_{0}}>0$ such that the following holds. Let $(f_{1},...,f_{m})$ be a tuple of $C^{\infty}$ diffeomorphisms satisfying $\max_{i}d_{C^{k_{0}}}(f_{i},R_{i})<\epsilon_{k_{0}}$. Suppose that there exists a sequence of ergodic stationary measures $\mu_{n}$ for the random dynamical system generated by $(f_{1},...,f_{m})$ such that $\left|\lambda_{d}(\mu_{n})\right|\to 0$. Then there exists $\psi\in\operatorname{Diff}^{\infty}(M)$ such that for each $i$ the map $\psi f_{i}\psi^{-1}$ is an isometry of $M$ and lies in the subgroup of $\operatorname{Isom}(M)$ generated by $(R_{1},\ldots,R_{m})$.

Before giving the proof, we sketch briefly the argument, which is typical of arguments establishing the convergence of a KAM scheme. In a KAM scheme where one wishes to show that some sequence of objects hnh_{n} converges there are often two parts. The first part of the proof is an inductive argument obtaining a sequence of estimates by the repeated application of the KAM step, which in our case is Lemma 39. The second half of the proof checks that the repeated application of the KAM step is valid by showing that we never leave the neighborhood of its validity and then checks that the procedure is converging in CC^{\infty}.

In the first part, one inductively produces a sequence of estimates by iterating a KAM step. The estimates produced usually come in two forms: a single good estimate in a low norm and bad estimates in high norms. The low regularity estimate typically looks like $\|h_{n}\|_{C^{0}}\leq N^{-(1+\tau)^{n}}$ where $\tau>0$, while for every $s$ one has a high regularity estimate like $\|h_{n}\|_{C^{s}}\leq N^{(1+\tau)^{n}}$. A priori, the $h_{n}$ become superexponentially $C^{0}$ small, yet might be diverging in higher $C^{s}$ norms. To remedy this situation one interpolates between the low and high norms by using an inequality derived from Lemma 52. In this case such an inequality for the objects $h_{n}$ might assert something like

$$\|h_{n}\|_{C^{\lambda\cdot 0+(1-\lambda)s}}\leq C_{s}\|h_{n}\|_{C^{0}}^{\lambda}\|h_{n}\|_{C^{s}}^{1-\lambda}.$$

If λ\lambda is sufficiently close to 11 and ss is sufficiently large, a brief calculation then implies that the C(1λ)sC^{(1-\lambda)s} norm is also super exponentially small. By changing ss and λ\lambda one then obtains convergence in CC^{\infty}.
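The brief calculation alluded to above can be made concrete: interpolating the two displayed bounds gives $\|h_{n}\|_{C^{(1-\lambda)s}}\leq C_{s}N^{-\lambda(1+\tau)^{n}}N^{(1-\lambda)(1+\tau)^{n}}=C_{s}N^{-(2\lambda-1)(1+\tau)^{n}}$, which is superexponentially small as soon as $\lambda>1/2$. The short script below, with hypothetical values of $N$, $\tau$, and $\lambda$, prints this interpolated bound.

```python
# Hypothetical parameters; only lam > 1/2 matters for the conclusion.
N, tau, lam = 10.0, 0.1, 0.9

for n in range(1, 7):
    low = N ** (-(1.0 + tau) ** n)              # the C^0 bound on h_n
    high = N ** ((1.0 + tau) ** n)              # the C^s bound on h_n
    interp = low**lam * high**(1.0 - lam)       # interpolated bound on the C^{(1-lam)s} norm
    closed_form = N ** (-(2.0 * lam - 1.0) * (1.0 + tau) ** n)
    print(n, f"{interp:.3e}", f"{closed_form:.3e}")   # the two columns agree and decay rapidly
```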

Proof of Theorem 1.

The proof is by a KAM convergence scheme. To begin we introduce the Diophantine condition we will use. By Proposition 19, (R1,,Rm)(R_{1},...,R_{m}) is (C,α,n)(C^{\prime},\alpha^{\prime},n^{\prime})-Diophantine for some C,α>0C^{\prime},\alpha^{\prime}>0 and is stably so. By stability, there exist (C,α,n)(C,\alpha,n) and a C0C^{0} neighborhood 𝒰\mathcal{U} of (R1,,Rm)(R_{1},...,R_{m}) such that any tuple in 𝒰\mathcal{U} is also (C,α,n)(C,\alpha,n)-Diophantine. Hence if (R1,,Rm)𝒰(R_{1}^{\prime},...,R_{m}^{\prime})\in\mathcal{U}, then the coefficients Ci,sC_{i,s} appearing in Lemma 39 are uniform over all of these tuples. Assuming we do not leave the set 𝒰\mathcal{U}, the constants appearing in Lemma 39 will be uniform. We check this at the end of the proof in the discussion surrounding equation (91).

We now show that there exists a sequence of cutoffs λn\lambda_{n} so that if we repeatedly apply Lemma 39 with the cutoff λn\lambda_{n} on the nnth time we apply the Lemma, then the resulting sequence of conjugates converges and the hypotheses of Lemma 39 remain satisfied. Given such a sequence λn\lambda_{n} the convergence scheme is run as follows. Let (f1,1,,fm,1)=(f1,,fm)(f_{1,1},\ldots,f_{m,1})=(f_{1},\ldots,f_{m}) and let (R1,1,,Rm,1)=(R1,,Rm)(R_{1,1},\ldots,R_{m,1})=(R_{1},\ldots,R_{m}). Given (f1,n1,,fm,n1)(f_{1,n-1},\ldots,f_{m,n-1}) and (R1,n1,,Rm,n1)(R_{1,n-1},\ldots,R_{m,n-1}) we apply Lemma 39 with cutoff λ=λn\lambda=\lambda_{n} to produce a diffeomorphism ϕn\phi_{n} and a tuple of isometries that we denote by (R1,n,,Rm,n)(R_{1,n},\ldots,R_{m,n}). We set fi,n=ϕnfi,n1ϕn1f_{i,n}=\phi_{n}f_{i,n-1}\phi_{n}^{-1} to obtain a new tuple of diffeomorphisms (f1,n,,fm,n)(f_{1,n},\ldots,f_{m,n}). We write ψn\psi_{n} for ϕnϕn1ϕ1\phi_{n}\circ\phi_{n-1}\circ\cdots\circ\phi_{1}, so that fi,n=ψnfiψn1f_{i,n}=\psi_{n}\circ f_{i}\circ\psi_{n}^{-1}. Let εk,n=maxidCk(fi,n,Ri,n)\varepsilon_{k,n}=\max_{i}d_{C^{k}}(f_{i,n},R_{i,n}).

We now show that such a sequence of cutoffs $\lambda_{n}$ exists. Let $\sigma$ be a small positive number and let $l_{0}$ be as in Lemma 39. Let $k_{0},k_{1},k_{2},r_{\ell},C_{s,\ell},r_{s,\ell}$ be as in Lemma 39 as well. To show that such a sequence of cutoffs $\lambda_{n}$ exists we must also provide a fixed choice of $s$ and $\ell$ for the application of Lemma 39. We will first show that the scheme converges in the $C^{l_{0}}$ norm and then bootstrap to get $C^{\infty}$ convergence. Fix some arbitrary $\ell>l_{0}$. The choice of $\ell$ does not matter in the sequel because we will only consider estimates on the $C^{l_{0}}$ norm. We will choose $s$ such that

(87) s>.s>\ell.

Further, if ss is sufficiently large and τ\tau is sufficiently small, then we can pick α\alpha such that

(88) $\frac{2+\tau}{s/10-k_{1}}<\alpha<\min\{1/k_{0},\tau/k_{2}\}$

So, we increase ss if needed and choose such a τ\tau satisfying

(89) 1/8>τ>0.1/8>\tau>0.

Pick s,α,τs,\alpha,\tau so that each of equations (87), (88), (89) is satisfied.

Let $\lambda_{n}=N^{\alpha(1+\tau)^{n}}$ for some $N$ we choose later. We will show that with this choice of cutoff at the $n$th step the KAM scheme converges. In order to show this, we show that the following three estimates hold inductively for a sufficiently large choice of $N$:

(H1) εl0,n\displaystyle\varepsilon_{l_{0},n} N(1+τ)n\displaystyle\leq N^{-(1+\tau)^{n}}
(H2) εs,n\displaystyle\varepsilon_{s,n} N(1+τ)n\displaystyle\leq N^{(1+\tau)^{n}}
(H3) $\displaystyle\max_{i}\{d_{C^{0}}(R_{i,n},R_{i,1})\}\leq\sum_{i=1}^{n-1}N^{-\frac{1}{2}(1+\tau)^{i}}.$

This involves two arguments. The first argument shows that there is a sufficiently large $N$ such that if we have these estimates for $n$, then the hypotheses of Lemma 39 are satisfied. The second argument is the actual induction, which checks that if equations (H1), (H2), and (H3) hold for $n$, then they also hold for $n+1$; i.e. we apply Lemma 39 and then deduce (H1), (H2), and (H3) for $n+1$ from this.

We begin by checking that for all sufficiently large $N>0$ and any $n\in\mathbb{N}$, if (H1), (H2), and (H3) are satisfied, then the hypotheses of Lemma 39 are satisfied as well. To begin, as the sum appearing in (H3) is bounded by a quantity tending to $0$ as $N\to\infty$, for all sufficiently large $N$ we are assured that $(R_{1,n},\ldots,R_{m,n})$ lies in $\mathcal{U}$. The first numbered hypothesis of Lemma 39 is equation (67):

λnk0εl0,nr.\lambda_{n}^{k_{0}}\varepsilon_{l_{0},n}\leq r_{\ell}.

Given the choice of λn\lambda_{n}, if equations (H1) and (H2) hold it suffices to have

Nαk0(1+τ)nN(1+τ)n<r,N^{\alpha k_{0}(1+\tau)^{n}}N^{-(1+\tau)^{n}}<r_{\ell},

which holds for NN sufficiently large and all nn by our choice of α\alpha. The other hypothesis of Lemma 39, equation (68), requires that

λnk1s/4εs,n+εl0,n3/2<rs,.\lambda_{n}^{k_{1}-s/4}\varepsilon_{s,n}+\varepsilon_{l_{0},n}^{3/2}<r_{s,\ell}.

Given equations (H1) and (H2) and our choice of λn\lambda_{n} it suffices to have

Nα(k1s/4)(1+τ)nN(1+τ)n+N32(1+τ)n<rs,.N^{\alpha(k_{1}-s/4)(1+\tau)^{n}}N^{(1+\tau)^{n}}+N^{-\frac{3}{2}(1+\tau)^{n}}<r_{s,\ell}.

Our choice of ss and α\alpha implies that α(k1s/4)<1\alpha(k_{1}-s/4)<-1, hence the above inequality holds for sufficiently large NN. Thus the two hypotheses of Lemma 39 follow from equations (H1) and (H2). Thus we may apply Lemma 39 given (H1), (H2), (H3), and our choice of NN.

We now proceed to the inductive argument. We will show that for all sufficiently large $N$, if the initial perturbation is small enough that (H1) and (H2) hold for $n=1$, then we may continue applying Lemma 39 and these estimates, as well as (H3), continue to hold. Note that (H3) is trivial when $n=1$. We must then check that equations (H1), (H2), and (H3) are satisfied for $n+1$ given that they hold for $n$. By the previous paragraph, we are free to apply the estimates from Lemma 39 as long as $N$ is sufficiently large.

We now check that equation (H1) holds for n+1n+1. By equation (69), we obtain that

εl0,n+1Cs,(λnk1s/10εs,n1σ+εl0,n9/8).\varepsilon_{l_{0},n+1}\leq C_{s,\ell}(\lambda_{n}^{k_{1}-s/10}\varepsilon_{s,n}^{1-\sigma}+\varepsilon_{l_{0},n}^{9/8}).

By applying equations (H1) and (H2) to each term on the right it suffices to show

(90) Cs,(Nα(k1s/10)(1+τ)nN(1σ)(1+τ)n+N9/8(1+τ)n)<N(1+τ)n+1.C_{s,\ell}(N^{\alpha(k_{1}-s/10)(1+\tau)^{n}}N^{(1-\sigma)(1+\tau)^{n}}+N^{-9/8(1+\tau)^{n}})<N^{-(1+\tau)^{n+1}}.

By our choice of ss, α\alpha, and τ\tau, the lower bound in equation (88) implies that

α(k1s/10)+(1σ)<(1+τ).\alpha(k_{1}-s/10)+(1-\sigma)<-(1+\tau).

In addition, by equation (89), 9/8<(1+τ)-9/8<-(1+\tau). Thus for sufficiently large NN the left hand side of equation (90) is bounded above by N(1+τ)n+1N^{-(1+\tau)^{n+1}}.

Next we check equation (H2) holds for n+1n+1. By equation (71),

εs,n+1Cs,λnk2εs,n.\varepsilon_{s,n+1}\leq C_{s,\ell}\lambda_{n}^{k_{2}}\varepsilon_{s,n}.

Hence,

$$\varepsilon_{s,n+1}\leq C_{s,\ell}N^{k_{2}\alpha(1+\tau)^{n}}N^{(1+\tau)^{n}}.$$

By equation (88), 1+k2α<1+τ1+k_{2}\alpha<1+\tau and hence, assuming NN is sufficiently large, the right hand side is bounded by N(1+τ)n+1N^{(1+\tau)^{n+1}}, which shows equation (H2) is satisfied.
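The various exponent inequalities used in the verification of the hypotheses of Lemma 39 and of (H1) and (H2) above can be checked for a concrete set of parameters. The numerical values of $k_{0},k_{1},k_{2},s,\tau,\sigma,\alpha$ in the sketch below are hypothetical and are not the constants produced by Lemma 39; the point is only that the required inequalities can be satisfied simultaneously.

```python
# Hypothetical constants standing in for those of Lemma 39, together with an illustrative
# choice of s, tau, sigma, and alpha; these are not the values produced by the lemma.
k0, k1, k2 = 5, 10, 5
s, tau, sigma, alpha = 2000, 0.1, 0.01, 0.015

assert 0 < tau < 1 / 8                                     # equation (89)
assert alpha < min(1 / k0, tau / k2)                       # upper bound in equation (88)
assert alpha * k0 - 1 < 0                                  # exponent used to verify (67)
assert alpha * (k1 - s / 4) + 1 < 0                        # exponent used to verify (68)
assert alpha * (k1 - s / 10) + (1 - sigma) < -(1 + tau)    # exponent needed for (90), i.e. (H1)
assert 1 + k2 * alpha < 1 + tau                            # exponent needed for (H2)
print("all exponent inequalities hold for this choice of parameters")
```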

We now check that (H3) holds for $n+1$. This follows by an application of equation (70), which gives

(91) dC0(Ri,n,Ri,n+1)Cs,(εl0,n+(λnk1s/4εs,n+εl0,n3/2)1σ)d_{C^{0}}(R_{i,n},R_{i,n+1})\leq C_{s,\ell}(\varepsilon_{l_{0},n}+(\lambda_{n}^{k_{1}-s/4}\varepsilon_{s,n}+\varepsilon_{l_{0},n}^{3/2})^{1-\sigma})

Applying (H1) and (H2) and the definition of $\lambda_{n}$ to estimate the right hand side of equation (91), we find that for $N$ sufficiently large

(92) dC0(Ri,n,Ri,n+1)N12(1+τ)n,d_{C^{0}}(R_{i,n},R_{i,n+1})\leq N^{-\frac{1}{2}(1+\tau)^{n}},

and (H3) holds for n+1n+1.

We have now finished the induction but not the proof. We have shown that there exists a sequence λn\lambda_{n} and a choice s,α,,τ,Ns,\alpha,\ell,\tau,N, so that if the initial conditions of the scheme are satisfied then we may iterate indefinitely and be assured of the estimates in equations (H1), (H2), (H3) at each step. We must now check that the conjugacies ψn\psi_{n} are converging in CC^{\infty} and that the tuples (R1,n,,Rm,n)(R_{1,n},\ldots,R_{m,n}) are converging. The latter is immediate because by (92) this is a Cauchy sequence. In fact, we chose NN large enough that we never leave 𝒰\mathcal{U}, hence the limit is in 𝒰\mathcal{U}. As the group of isometries of MM is C0C^{0} closed and the distance of the tuples (f1,n,,fm,n)(f_{1,n},\ldots,f_{m,n}) from a tuple of isometries is converging to 0, it follows that (f1,n,,fm,n)(f_{1,n},\ldots,f_{m,n}) is converging to a tuple of isometries. To show that the ψn\psi_{n} converge in CC^{\infty}, we obtain for every ss an estimate on dCs(ϕn,Id)d_{C^{s}}(\phi_{n},\operatorname{Id}). By a similar induction to that just performed, the estimate (72) implies

dCs(ϕn,Id)CsN(1+τ)n.d_{C^{s}}(\phi_{n},\operatorname{Id})\leq C_{s}N^{(1+\tau)^{n}}.

Let j>0j>0 be an integer. By Lemma 53, interpolating with λ=11/10\lambda=1-1/10 between the Cl0C^{l_{0}} distance and the Cjl0C^{jl_{0}} distance of ϕn\phi_{n} to the identity gives

dC.9l0+(j/10)l0(ϕn,Id)CjN.9(1+τ)nN.1(1+τ)n=CjN.8(1+τ)n.d_{C^{.9l_{0}+(j/10)l_{0}}}(\phi_{n},\operatorname{Id})\leq C_{j}N^{-.9(1+\tau)^{n}}N^{.1(1+\tau)^{n}}=C_{j}N^{-.8(1+\tau)^{n}}.

Thus by increasing jj, we see that there exists τ>0\tau^{\prime}>0 such that for each CsC^{s} norm

dCs(ϕn,Id)<CsN(1+τ)n.d_{C^{s}}(\phi_{n},\operatorname{Id})<C_{s}^{\prime}N^{-(1+\tau^{\prime})^{n}}.

The previous line is summable in $n$. Hence we can apply Lemma 51 to obtain convergence of the sequence $\psi_{n}=\phi_{n}\circ\cdots\circ\phi_{1}$ in the $C^{s}$ norm for each $s$, and thus $C^{\infty}$ convergence.

Thus we see that we have simultaneously conjugated each $f_{i}$ into $\operatorname{Isom}(M)$. In order to obtain the full theorem, we must check that $\psi f_{i}\psi^{-1}$ lies in the subgroup of $\operatorname{Isom}(M)$ generated by $(R_{1},\ldots,R_{m})$. Note that $\operatorname{Isom}(M)/\operatorname{Isom}(M)^{\circ}$ is a finite group. By our choice of $N$, the tuple $(\psi f_{1}\psi^{-1},\ldots,\psi f_{m}\psi^{-1})$ lies in $\mathcal{U}$, so each $\psi f_{i}\psi^{-1}$ is an isometry that is $C^{0}$ close to $R_{i}$; shrinking $\mathcal{U}$ if necessary, it therefore lies in the same connected component of $\operatorname{Isom}(M)$ as $R_{i}$, that is, $\psi f_{i}\psi^{-1}\in R_{i}\operatorname{Isom}(M)^{\circ}$. As the subgroup generated by $(R_{1},\ldots,R_{m})$ contains $\operatorname{Isom}(M)^{\circ}$ by hypothesis, each $\psi f_{i}\psi^{-1}$ lies in this subgroup, and we are done. ∎

6.3. Taylor expansion of Lyapunov exponents

In order to recover Dolgopyat and Krikorian's Taylor expansion in the setting of isotropic manifolds, we would need to apply Proposition 26 for each $0\leq r\leq\dim M$. However, one of the hypotheses of Proposition 26 is that $\operatorname{Isom}(M)^{\circ}$ acts transitively on $\operatorname{Gr}_{r}(M)$. In Proposition 41, we see that unless $M$ is $S^{n}$ or $\mathbb{R}\operatorname{P}^{n}$, $\operatorname{Isom}(M)$ does not act transitively on $\operatorname{Gr}_{r}(M)$ for $2\leq r\leq\dim M-2$. Despite Proposition 41, we are able to obtain a partial result: the greatest and least Lyapunov exponents are approximately symmetric about the “average” Lyapunov exponent $\frac{1}{d}\Lambda_{d}(\mu)$.

Theorem 40.

Suppose that $M^{d}$ is a closed isotropic manifold other than $S^{1}$ and that $(R_{1},...,R_{m})$ is a subset of $\operatorname{Isom}(M)$ that generates a subgroup of $\operatorname{Isom}(M)$ containing $\operatorname{Isom}(M)^{\circ}$. Suppose that $(f_{1},...,f_{m})$ is a collection of $C^{\infty}$ diffeomorphisms of $M$. Then there exists $k_{0}$ such that if $\mu$ is an ergodic stationary measure of the random dynamical system generated by $(f_{1},...,f_{m})$, then

(93) |λ1(μ)(λd(μ)+2dΛd(μ))|o(1)|λd(μ)|.\left|\lambda_{1}(\mu)-(-\lambda_{d}(\mu)+\frac{2}{d}\Lambda_{d}(\mu))\right|\leq o(1)\left|\lambda_{d}(\mu)\right|.

where the o(1)o(1) term goes to 0 as maxidCk0(fi,Ri)0\max_{i}d_{C^{k_{0}}}(f_{i},R_{i})\to 0. The o(1)o(1) term depends only on (R1,,Rm)(R_{1},...,R_{m}).

Proof.

By Theorem 1, there are two cases: either $(f_{1},\ldots,f_{m})$ is simultaneously conjugate to isometries or it is not. In the isometric case equation (93) is immediate, so we may assume that there is an ergodic stationary measure $\mu$ with $\lambda_{d}(\mu)$ non-zero. The proof that follows is then essentially an observation about what happens when the KAM scheme is run on a system that has a measure with such a non-zero Lyapunov exponent. If we run the KAM scheme without assuming that $(f_{1},...,f_{m})$ has a measure with zero exponents, we can keep running the scheme until the non-trivial exponents prevent us from continuing. At a certain point in the procedure, the non-trivial exponents cause a certain inequality to fail. Using the failed inequality then gives the result.

We now give the details. Fix an ergodic stationary measure μ\mu and consider equation (76) appearing in the KAM step:

(94) λd(μ)=12dmi=1mMECf~i2dvol+(d1)(d+2)(d1)mi=1mMENCf~i2dvolGrd1(M)𝒰(ψd1)dvol+Grd(M)𝒰(ψd)dvol+O(Y~Ck13).\displaystyle\begin{split}\lambda_{d}(\mu)=&\frac{-1}{2dm}\sum_{i=1}^{m}\int_{M}\|E_{C}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\frac{-(d-1)}{(d+2)(d-1)m}\sum_{i=1}^{m}\int_{M}\|E_{NC}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}\\ &-\int_{\operatorname{Gr}_{d-1}(M)}\mathcal{U}(\psi_{d-1})\,d\operatorname{vol}+\int_{\operatorname{Gr}_{d}(M)}\mathcal{U}(\psi_{d})\,d\operatorname{vol}+O(\|\widetilde{Y}\|^{3}_{C^{k_{1}}}).\end{split}

The above equation relates the exponent $\lambda_{d}(\mu)$ to the strain terms and the error terms. In the KAM step, we proceed from this estimate by estimating the $\|\widetilde{Y}\|^{3}_{C^{k_{1}}}$ term as well as the $\mathcal{U}$ terms. Equation (78) and the choice of $\ell_{5}$ imply that these terms satisfy:

(95) |Grd1(M)𝒰(ψd1)dvolGrd(M)𝒰(ψd)dvol+O(Y~Ck13)|C8,sε5(λk3s/2εs+ε52).\left|\int_{\operatorname{Gr}_{d-1}(M)}\mathcal{U}(\psi_{d-1})\,d\operatorname{vol}-\int_{\operatorname{Gr}_{d}(M)}\mathcal{U}(\psi_{d})\,d\operatorname{vol}+O(\|\widetilde{Y}\|^{3}_{C^{k_{1}}})\right|\leq C_{8,s}\varepsilon_{\ell_{5}}(\lambda^{k_{3}-s/2}\varepsilon_{s}+\varepsilon_{\ell_{5}}^{2}).

Hence as long as

(96) |λd(μ)|<(C9,sC8,s)(ε5(λk3s/2εs+ε52))\left|\lambda_{d}(\mu)\right|<(C_{9,s}-C_{8,s})(\varepsilon_{\ell_{5}}(\lambda^{k_{3}-s/2}\varepsilon_{s}+\varepsilon_{\ell_{5}}^{2}))

the proof of Lemma 39 may proceed to equation (79) even if there is not a sequence of measures μn\mu_{n} such that |λd(μn)|0\left|\lambda_{d}(\mu_{n})\right|\to 0. Hence we may continue running the KAM scheme until equation (96) fails to hold.

Suppose that we iterate the KAM scheme until equation (96) fails. We consider the estimates available in the KAM scheme at the step of failure. By applying Proposition 26 with rr equal to 11, dd, and d1d-1, we obtain:

(97) Λ1(μ)=12dmi=1mMECf~i2dvol+(d1)(d+2)(d1)mi=1mMENCf~i2dvol+G1(M)𝒰(ψ1)dvol+O(Y~Ck13)Λd1(μ)=(d1)2dmi=1mMECf~i2dvol+(d1)(d+2)(d1)mi=1mMENCf~i2dvol+Gd1(M)𝒰(ψd1)dvol+O(Y~Ck13)Λd(μ)=d2dmi=1mMECf~i2dvol+Gd(M)𝒰(ψd)dvol+O(Y~Ck13)\displaystyle\begin{split}\Lambda_{1}(\mu)=&\frac{-1}{2dm}\sum_{i=1}^{m}\int_{M}\|E_{C}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\frac{(d-1)}{(d+2)(d-1)m}\sum_{i=1}^{m}\int_{M}\|E_{NC}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\int_{G_{1}(M)}\mathcal{U}(\psi_{1})\,d\operatorname{vol}+\\ &O(\|\widetilde{Y}\|^{3}_{C^{k_{1}}})\\ \Lambda_{d-1}(\mu)=&\frac{-(d-1)}{2dm}\sum_{i=1}^{m}\int_{M}\|E_{C}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\frac{(d-1)}{(d+2)(d-1)m}\sum_{i=1}^{m}\int_{M}\|E_{NC}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\int_{G_{d-1}(M)}\mathcal{U}(\psi_{d-1})\,d\operatorname{vol}+\\ &O(\|\widetilde{Y}\|^{3}_{C^{k_{1}}})\\ \Lambda_{d}(\mu)=&\frac{-d}{2dm}\sum_{i=1}^{m}\int_{M}\|E_{C}^{\widetilde{f}_{i}}\|^{2}\,d\operatorname{vol}+\int_{G_{d}(M)}\mathcal{U}(\psi_{d})\,d\operatorname{vol}+O(\|\widetilde{Y}\|^{3}_{C^{k_{1}}})\end{split}

Write 𝒰i\mathcal{U}_{i} as shorthand for the term Gri(M)𝒰(ψi)dvol\int_{\operatorname{Gr}_{i}(M)}\mathcal{U}(\psi_{i})\,d\operatorname{vol}. Then,

(98) $\displaystyle\lambda_{1}(\mu)-\Big(-\lambda_{d}(\mu)+\frac{2}{d}\Lambda_{d}(\mu)\Big)=\Lambda_{1}(\mu)-\Lambda_{d-1}(\mu)+\frac{d-2}{d}\Lambda_{d}(\mu)$
(99) $\displaystyle=\mathcal{U}_{1}-\mathcal{U}_{d-1}+\frac{d-2}{d}\mathcal{U}_{d}+O(\|\widetilde{Y}\|^{3}_{C^{k_{1}}}).$

Using equations (78), (75), and that 5>k1+α\ell_{5}>k_{1}+\alpha, we bound the right hand side of equation (99) to find

|λ1(μ)(λd(μ)+2dΛd(μ))|4C8,s(λnk3s/2εsε5+ε53).\left|\lambda_{1}(\mu)-(-\lambda_{d}(\mu)+\frac{2}{d}\Lambda_{d}(\mu))\right|\leq 4C_{8,s}(\lambda_{n}^{k_{3}-s/2}\varepsilon_{s}\varepsilon_{\ell_{5}}+\varepsilon_{\ell_{5}}^{3}).

But by the failure of estimate (96), we may bound the right hand side of the previous line to obtain:

(100) $\left|\lambda_{1}(\mu)-\Big(-\lambda_{d}(\mu)+\frac{2}{d}\Lambda_{d}(\mu)\Big)\right|\leq\frac{4C_{8,s}}{C_{9,s}-C_{8,s}}\left|\lambda_{d}(\mu)\right|.$

Note in the above equation that the larger $C_{9,s}$ is, the smaller the right hand side is. We can take $C_{9,s}$ as large as we like and still run the KAM scheme. Running the KAM scheme with a larger constant $C_{9,s}$ only requires that we assume our initial perturbation is closer to the original tuple of isometries in the $C^{k_{0}}$ norm. Hence by assuming that the initial distance is arbitrarily small in the $C^{k_{0}}$ norm, we may take $C_{9,s}$ as large as we like. Thus equation (93) follows from equation (100). ∎

We now check the claim about isotropic manifolds.

Proposition 41.

Suppose that $M$ is a closed isotropic manifold other than $\mathbb{R}\operatorname{P}^{n}$ or $S^{n}$. Then $\operatorname{Isom}(M)$ does not act transitively on $\operatorname{Gr}_{k}(M)$ except if $k$ equals $0$, $1$, $\dim M-1$, or $\dim M$.

Proof.

From subsection 2.5, we have a list of all the closed isotropic manifolds, so we may give an argument for each of the families, Pn\mathbb{C}\operatorname{P}^{n}, Pn\mathbb{H}\operatorname{P}^{n}, and F4/Spin(9)F_{4}/\operatorname{Spin}(9).

The isometry group of $\mathbb{C}\operatorname{P}^{n}$ is $\operatorname{PSU}(n+1)$. If we fix a point $p$ in $\mathbb{C}\operatorname{P}^{n}$, then the linear isotropy group at $p$ is naturally identified with the unitary group $\operatorname{U}(n)$ acting on $T_{p}\mathbb{C}\operatorname{P}^{n}\cong\mathbb{C}^{n}$ by complex linear maps. In particular, the isotropy group preserves the dimension of the largest complex subspace contained in a given $k$-dimensional subspace of $T_{p}\mathbb{C}\operatorname{P}^{n}$, and for $2\leq k\leq\dim M-2$ this dimension is not constant on $k$-planes. Consequently $\operatorname{Isom}(\mathbb{C}\operatorname{P}^{n})$ does not act transitively on $\operatorname{Gr}_{k}(\mathbb{C}\operatorname{P}^{n})$ for such $k$. In the case of $\mathbb{H}\operatorname{P}^{n}$, which is constructed similarly to $\mathbb{C}\operatorname{P}^{n}$, a similar argument works, where we use instead that the linear isotropy group is $\operatorname{Sp}(n)\cdot\operatorname{Sp}(1)$, which preserves the set of quaternionic subspaces.

We now turn to the Cayley plane, for which we give a dimension counting argument. The dimension of $F_{4}$ is $52$ while $\dim F_{4}/\operatorname{Spin}(9)=16$. Recall that if $M$ is a manifold and $\dim M=d$, then $\dim\operatorname{Gr}_{k}(M)=d+k(d-k)$. Hence $\dim\operatorname{Gr}_{k}(F_{4}/\operatorname{Spin}(9))=16+k(16-k)>52$ for $3\leq k\leq 13$, so $F_{4}$ cannot act transitively on $\operatorname{Gr}_{k}(F_{4}/\operatorname{Spin}(9))$ for these $k$. If $\operatorname{Isom}(M)$ acts transitively on $2$-planes then $M$ must have constant sectional curvature. The Cayley plane does not have constant sectional curvature, hence $k=2$ is ruled out, and taking orthogonal complements rules out $k=14$ as well. ∎
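For the reader's convenience, the dimension count in the proof above can be tabulated; the script below lists $\dim\operatorname{Gr}_{k}(F_{4}/\operatorname{Spin}(9))=16+k(16-k)$ and compares it with $\dim F_{4}=52$, leaving only $k\in\{0,1,2,14,15,16\}$ not excluded by dimension; of these, $k=2$ and $k=14$ are handled by the curvature argument.

```python
d, dim_F4 = 16, 52
for k in range(d + 1):
    dim_gr = d + k * (d - k)
    verdict = "transitivity impossible" if dim_gr > dim_F4 else "not excluded by dimension"
    print(f"k={k:2d}  dim Gr_k = {dim_gr:3d}  ({verdict})")
# Only k in {0, 1, 2, 14, 15, 16} survive the dimension count.
```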

Appendix A CkC^{k} Estimates

In this section of the appendix, we collect some basic results concerning the calculus of CkC^{k} functions. Most of the estimates stated here are used to compare constructions coming from Riemannian geometry and constructions coming from a chart.

Most of the estimates we prove below involve the following definition, which is an appropriate form for a second order term in the CkC^{k} setting.

Definition 42.

Suppose that X,Y,ZX,Y,Z are all vector fields and that Z=Z(X,Y)Z=Z(X,Y) is a function of XX and YY. We say that ZZ is quadratic in XX and YY if there exists a fixed \ell such that for each kk there is a constant CkC_{k} depending only on ZZ such that:

(101) ZCkCk(XCk+2+YCk+2).\|Z\|_{C^{k}}\leq C_{k}(\|X\|_{C^{k+\ell}}^{2}+\|Y\|_{C^{k+\ell}}^{2}).

In addition to quadratic, we may also refer to ZZ as being second order in XX and YY. In the case that ZZ depends only on XX the definition is analogous.

One thinks of equation (101) as a quadratic tameness estimate. Our main use of this notion is the following proposition, which allows us to compose diffeomorphisms up to a quadratic error. As before, if $Y$ is a vector field on $M$, we write $\psi_{Y}$ for the map of $M$ that sends $x\mapsto\exp_{x}(Y(x))$. To emphasize that $\psi$ depends on a metric $g$, we may write $\psi_{Y}^{g}$.

The main result from this section is the following, which is used in the KAM scheme to see how the linearized error between fif_{i} and RiR_{i} changes when fif_{i} is conjugated by a diffeomorphism ψ\psi.

Proposition 43.

[DK07, Eq. (8)] Suppose that $(M,g)$ is a closed Riemannian manifold and that $R$ is an isometry of $M$. Suppose that $f$ is a diffeomorphism of $M$ that is $C^{1}$ close to $R$. Let $Y(x)=\exp^{-1}_{R(x)}f(x)$. If $C$ is a $C^{1}$ small vector field on $M$, then the error field $x\mapsto\exp^{-1}_{R(x)}\big(\psi_{C}f\psi_{C}^{-1}(x)\big)$ is equal to

Y+CRC+Q(C,Y),Y+C-R_{*}C+Q(C,Y),

where QQ is quadratic in CC and YY.

The proof of Proposition 43 is straightforward. It particularly relies on the following proposition, which simplifies working with diffeomorphisms of the form ψX\psi_{X}.

Proposition 44.

Let MM be a compact Riemannian manifold. If X,YVect(M)X,Y\in\operatorname{Vect}^{\infty}(M) are sufficiently C1C^{1} small and we define ZZ by

ψYψX=ψX+Y+Z,\psi_{Y}\circ\psi_{X}=\psi_{X+Y+Z},

then there exists a fixed \ell such that for each kk there exists CkC_{k} such that

ZCkCk(XCk+2+YCk+2),\|Z\|_{C^{k}}\leq C_{k}(\|X\|_{C^{k+\ell}}^{2}+\|Y\|_{C^{k+\ell}}^{2}),

i.e. ZZ is quadratic in XX and YY.
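In the special case where $g$ is the Euclidean metric on $\mathbb{R}^{n}$ one has $\psi_{X}(x)=x+X(x)$, so $\psi_{Y}\circ\psi_{X}(x)=x+X(x)+Y(x)+Z(x)$ with $Z(x)=Y(x+X(x))-Y(x)$, and $\|Z\|_{C^{0}}\leq\|Y\|_{C^{1}}\|X\|_{C^{0}}$ is visibly second order. The following numerical sketch checks this scaling for a pair of illustrative vector fields on $\mathbb{R}$; the specific fields are arbitrary choices.

```python
import numpy as np

xs = np.linspace(-3.0, 3.0, 2001)
X = lambda x: np.sin(x)          # illustrative vector fields on R
Y = lambda x: np.cos(2.0 * x)

for eps in [1e-1, 1e-2, 1e-3]:
    # For the scaled fields eps*X and eps*Y, Z(x) = eps*Y(x + eps*X(x)) - eps*Y(x).
    Z = eps * Y(xs + eps * X(xs)) - eps * Y(xs)
    print(f"eps={eps:.0e}  |Z|_C0 / eps^2 = {np.abs(Z).max() / eps**2:.3f}")
# The ratio stays bounded (here by |Y'|_C0 |X|_C0 = 2), so Z is quadratic in (X, Y).
```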

The proof of Proposition 44 uses the following two lemmas concerning maps of n\mathbb{R}^{n}.

Lemma 45.

[Hör76, Thm. A.7] Suppose that $B$ is a compact convex domain in $\mathbb{R}^{n}$ with interior points. Then for $k\geq 0$, there exists $C_{k}$ such that if $f,g$ are $C^{k}$ maps from $B$ to $\mathbb{R}$, then

fgCkCk(fCkgC0+fC0gCk).\|fg\|_{C^{k}}\leq C_{k}(\|f\|_{C^{k}}\|g\|_{C^{0}}+\|f\|_{C^{0}}\|g\|_{C^{k}}).
Lemma 46.

[Hör76, Thm. A.8] For $i\in\{1,2,3\}$, let $B_{i}$ be a fixed compact convex domain in $\mathbb{R}^{n_{i}}$ with interior points. Let $k\geq 1$. There exists $C_{k}>0$ such that if $g\colon B_{1}\to B_{2}$ and $f\colon B_{2}\to B_{3}$ are both $C^{k}$, then $f\circ g$ is $C^{k}$ and

fgCkCk(fCkgC1k+fC1gCk+fgC0).\|f\circ g\|_{C^{k}}\leq C_{k}(\|f\|_{C^{k}}\|g\|_{C^{1}}^{k}+\|f\|_{C^{1}}\|g\|_{C^{k}}+\|f\circ g\|_{C^{0}}).

Using the previous two lemmas, we prove the following.

Proposition 47.

Suppose that gg is a metric on n\mathbb{R}^{n}. For a smooth vector field YY such that YC1<1\|Y\|_{C^{1}}<1, define

Z(x)=ψYg(x)Y(x)x.Z(x)=\psi^{g}_{Y}(x)-Y(x)-x.

Let BB be a compact convex domain in n\mathbb{R}^{n} with interior points. Then Z|BZ|_{B} is quadratic in YY. In fact, for each kk there exists CkC_{k} such that

Z|BCkCkYCk2.\|Z|_{B}\|_{C^{k}}\leq C_{k}\|Y\|_{C^{k}}^{2}.
Proof.

Let BB be as in the statement of the proposition. Define γ(Y(x),t)\gamma(Y(x),t) to be the map that sends xexpxtY(x)xx\mapsto\exp_{x}tY(x)-x, so that γ(Y(x),1)+x=ψYg\gamma(Y(x),1)+x=\psi^{g}_{Y} and γ(Y(x),0)=0\gamma(Y(x),0)=0. We rewrite ZZ.

Z=ψYg(x)xY(x)\displaystyle Z=\psi_{Y}^{g}(x)-x-Y(x) =γ(Y(x),1)Y(x)\displaystyle=\gamma(Y(x),1)-Y(x)
=01γ˙(Y(x),t)Y(x)dt\displaystyle=\int_{0}^{1}\dot{\gamma}(Y(x),t)-Y(x)\,dt
=01γ˙(Y(x),t)γ˙(Y(x),0)dt\displaystyle=\int_{0}^{1}\dot{\gamma}(Y(x),t)-\dot{\gamma}(Y(x),0)\,dt
=0101tγ¨(Y(x),st)𝑑s𝑑t\displaystyle=\int_{0}^{1}\int_{0}^{1}t\ddot{\gamma}(Y(x),st)\,ds\,dt
=01t01γ¨(Y(x),st)𝑑s𝑑t.\displaystyle=\int_{0}^{1}t\int_{0}^{1}\ddot{\gamma}(Y(x),st)\,ds\,dt.

By differentiating under the integral, we see that the nnth derivatives of ZZ are controlled by the maximum of the nnth derivatives of γ¨(Y(x),t)\ddot{\gamma}(Y(x),t) for each fixed tt. Hence it suffices to show for each t[0,1]t\in[0,1] that γ¨(Y(x),t)\ddot{\gamma}(Y(x),t) is second order in YY.

Dropping the explicit dependence on xx, we recall the coordinate expression of the geodesic equation. For a coordinate frame [e1,,en][e_{1},\ldots,e_{n}] and indices 1μ,ν,λn1\leq\mu,\nu,\lambda\leq n, we define the Christoffel symbols Γμνλ\Gamma_{\mu\nu}^{\lambda} by eμeν,eλ.\langle\nabla_{e_{\mu}}e_{\nu},e_{\lambda}\rangle. In addition, we write γ˙ν\dot{\gamma}^{\nu} for γ˙,eν\langle\dot{\gamma},e_{\nu}\rangle and similarly for γ¨\ddot{\gamma}. The coordinate expression for the geodesic equation is then

γ¨λ=Γμνλγ˙μγ˙ν.\ddot{\gamma}^{\lambda}=-\Gamma^{\lambda}_{\mu\nu}\dot{\gamma}^{\mu}\dot{\gamma}^{\nu}.

We estimate the CkC^{k} norm of the right hand side. Write ϕt\phi^{t} for the geodesic flow on TBTB. For fixed r>0r>0, let TB(r)TB(r) be the set of vectors vTBv\in TB such that v<r\|v\|<r. Note that the restriction ϕt|TB(r)Ck\|\phi^{t}|_{TB(r)}\|_{C^{k}} is bounded uniformly for t[0,1]t\in[0,1]. Let π\pi be the projection from a tangent vector in TnT\mathbb{R}^{n} to its basepoint in n\mathbb{R}^{n}. Then

γ(x,t)=πϕtY(x).\gamma(x,t)=\pi\circ\phi^{t}\circ Y(x).

Hence, writing ϕ˙\dot{\phi} for the geodesic spray,

(102) γ˙(x,t)=Dπϕ˙ϕt(Y(x)).\dot{\gamma}(x,t)=D\pi\circ\dot{\phi}\mid_{\phi^{t}(Y(x))}.

Dπϕ˙tTB(r)D\pi\circ\dot{\phi}^{t}\mid_{TB(r)} has its CkC^{k} norm uniformly bounded in tt by some DkD_{k}. By Lemma 46, because YC1<1\|Y\|_{C^{1}}<1, it follows that ϕtYCkCkYCk\|\phi^{t}\circ Y\|_{C^{k}}\leq C_{k}\|Y\|_{C^{k}}.

Hence by applying Lemma 46 to (102), and similarly using that YC1<1\|Y\|_{C^{1}}<1 and Dπϕ˙D\pi\circ\dot{\phi} is uniformly bounded we find

(Dπϕ˙t)YCkCk(DkYC1+D1YCk+YC0).\|(D\pi\circ\dot{\phi}^{t})\circ Y\|_{C^{k}}\leq C_{k}^{\prime}(D_{k}\|Y\|_{C^{1}}+D_{1}\|Y\|_{C^{k}}+\|Y\|_{C^{0}}).

Hence

γ˙(x,t)Ck=Dπϕ˙|ϕt(Y(x))CkCkYCk.\|\dot{\gamma}(x,t)\|_{C^{k}}=\|D\pi\circ\dot{\phi}|_{\phi^{t}(Y(x))}\|_{C^{k}}\leq C_{k}\|Y\|_{C^{k}}.

The geodesic equation shows that at each point the coordinates of γ¨\ddot{\gamma} are a quadratic polynomial in the coordinates of γ˙\dot{\gamma}. Hence by Lemma 45

γ¨(x,t)CkCk′′YCk2\|\ddot{\gamma}(x,t)\|_{C^{k}}\leq C_{k}^{\prime\prime}\|Y\|_{C^{k}}^{2}

for all t[0,1]t\in[0,1]. Thus we obtain a uniform estimate on ZZ. ∎
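Although it plays no role in the argument, the following numerical sketch illustrates Proposition 47 in a concrete case: on the round two-sphere the exponential map has a closed form, and one can watch the defect Z=ψY(x)xY(x)Z=\psi_{Y}(x)-x-Y(x), measured in ambient coordinates, shrink quadratically as YY is scaled down. The sphere, the particular vector field, and the sample points are arbitrary choices made for the illustration.

```python
# Illustrative sketch for Proposition 47 (not the chart-based statement itself):
# on the unit sphere S^2 in R^3, exp_x(v) = cos(|v|) x + sin(|v|) v/|v| for a
# tangent vector v at x.  We check that Z(x) = exp_x(Y(x)) - x - Y(x) decays
# quadratically when the field Y is scaled by eps.
import numpy as np

rng = np.random.default_rng(0)
pts = rng.normal(size=(2000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)      # sample points on S^2

def tangent_field(x, eps):
    # a smooth ambient field projected to the tangent spaces, scaled by eps
    v = eps * np.stack([np.sin(3 * x[:, 1]), np.cos(2 * x[:, 2]), x[:, 0] ** 2], axis=1)
    return v - np.sum(v * x, axis=1, keepdims=True) * x

def exp_sphere(x, v):
    nv = np.linalg.norm(v, axis=1, keepdims=True)
    return np.cos(nv) * x + np.where(nv > 0, np.sin(nv) * v / np.maximum(nv, 1e-300), 0.0)

for eps in [0.2, 0.1, 0.05, 0.025]:
    Y = tangent_field(pts, eps)
    Z = exp_sphere(pts, Y) - pts - Y
    print(f"eps={eps:5.3f}  sup|Y|={np.abs(Y).max():.2e}  sup|Z|={np.abs(Z).max():.2e}")
# sup|Z| should drop by roughly a factor of 4 each time eps is halved.
```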

Proof of Proposition 44.

As before, it suffices to prove the estimate in a chart. So, we are reduced to working in a neighborhood of 0n0\in\mathbb{R}^{n}. Fix some kk, then by Proposition 47 we may write

ψY(x)=x+Y(x)+ZY(x),\psi_{Y}(x)=x+Y(x)+Z_{Y}(x),

where ZY(x)Z_{Y}(x) is quadratic in YY. Similarly define ZX(x)Z_{X}(x) and ZX+Y(x)Z_{X+Y}(x). Then

ψYψX\displaystyle\psi_{Y}\circ\psi_{X} =ψY(x+X(x)+ZX(x))\displaystyle=\psi_{Y}(x+X(x)+Z_{X}(x))
=x+X(x)+ZX(x)+Y(x+X(x)+ZX(x))+ZY(x+X(x)+ZX(x)).\displaystyle=x+X(x)+Z_{X}(x)+Y(x+X(x)+Z_{X}(x))+Z_{Y}(x+X(x)+Z_{X}(x)).

To prove this proposition, we compare the previous line with

ψX+Y=x+X(x)+Y(x)+ZX+Y(x).\psi_{X+Y}=x+X(x)+Y(x)+Z_{X+Y}(x).

The difference is

ψYψXψX+Y=ZX(x)ZX+Y(x)+Y(x+X(x)+ZX(x))Y(x)+ZY(x+X(x)+ZX(x))\psi_{Y}\circ\psi_{X}-\psi_{X+Y}=Z_{X}(x)-Z_{X+Y}(x)+Y(x+X(x)+Z_{X}(x))-Y(x)+Z_{Y}(x+X(x)+Z_{X}(x))

The first and second terms satisfy the appropriate quadratic CkC^{k} estimate already. For the last term, we apply Lemma 46. Hence by assuming that XC1\|X\|_{C^{1}} is sufficiently small, we conclude that the ZYZ_{Y} term is quadratic. We now turn to the YY terms:

Y(x+X(x)+ZX(x))Y(x).Y(x+X(x)+Z_{X}(x))-Y(x).

For this we apply the same trick as before. Write

Y(x+X(x)+ZX(x))Y(x)=01Y(x+t(X(x)+ZX(x)))(X(x)+ZX(x))𝑑t.Y(x+X(x)+Z_{X}(x))-Y(x)=\int_{0}^{1}Y^{\prime}(x+t(X(x)+Z_{X}(x)))\,(X(x)+Z_{X}(x))\,dt.

By differentiating under the integral, it suffices to show that the integrand is quadratic in XX and YY. By Lemma 45, the integrand will be quadratic if there exists \ell such that for each kk there is a constant CkC_{k} such that both of Y(x+t(X(x)+ZX(x)))Ck\|Y^{\prime}(x+t(X(x)+Z_{X}(x)))\|_{C^{k}} and X(x)+ZX(x)Ck\|X(x)+Z_{X}(x)\|_{C^{k}} are bounded by Ck(XCk++YCk+)C_{k}(\|X\|_{C^{k+{\ell}}}+\|Y\|_{C^{k+{\ell}}}). This follows for both terms by the application of Lemma 46, so we are done. ∎
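As an informal illustration of Proposition 44 (not used in the argument), the following sketch treats the simplest possible setting, the Euclidean metric on the line, where ψX(x)=x+X(x)\psi_{X}(x)=x+X(x) and the composition defect can be written down exactly; the periodic test fields are arbitrary choices.

```python
# A minimal sketch of Proposition 44 for the Euclidean metric on R, where
# psi_X(x) = x + X(x).  Then Z defined by psi_Y o psi_X = psi_{X+Y+Z} is
# exactly Z(x) = Y(x + X(x)) - Y(x), and its sup norm scales quadratically
# when X and Y are both scaled by eps.
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 4001)

def X0(t): return np.sin(t) + 0.3 * np.cos(2.0 * t)
def Y0(t): return np.cos(3.0 * t)

for eps in [0.2, 0.1, 0.05, 0.025]:
    Z = eps * Y0(x + eps * X0(x)) - eps * Y0(x)   # the exact composition defect
    print(f"eps={eps:5.3f}  sup|Z|={np.abs(Z).max():.3e}  sup|Z|/eps^2={np.abs(Z).max()/eps**2:.3f}")
# The ratio sup|Z|/eps^2 stabilizes, consistent with a quadratic estimate.
```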

We now show another basic fact: near the identity, a diffeomorphism and its inverse have comparable size.

Lemma 48.

Suppose that MM is a closed Riemannian manifold. Then there exists ϵ>0\epsilon>0 such that for all k0k\geq 0 there exists CkC_{k} such that if fDiffk(M)f\in\operatorname{Diff}^{k}(M) and dC1(f,Id)<ϵd_{C^{1}}(f,\operatorname{Id})<\epsilon, then

dCk(f1,Id)CkdCk(f,Id).d_{C^{k}}(f^{-1},\operatorname{Id})\leq C_{k}d_{C^{k}}(f,\operatorname{Id}).
Proof.

This proof follows the outline of the similar estimate in [Ham82, Lem. 2.3.6]. For convenience, write g=f1g=f^{-1}. In a chart, we write f(x)=x+X(x)f(x)=x+X(x) where the CkC^{k} norm of XX is bounded by dCk(f,Id)d_{C^{k}}(f,\operatorname{Id}). Similarly write g(x)=x+Y(x)g(x)=x+Y(x). We now apply the chain rule to differentiate gfg\circ f. The case k=1k=1 is immediate by differentiating gf=x+X(x)+Y(x+X(x))g\circ f=x+X(x)+Y(x+X(x)), which gives that

DX+DY(Id+DX)=0.DX+DY(\operatorname{Id}+DX)=0.

Hence

DY=DX(Id+DX)1,DY=-DX(\operatorname{Id}+DX)^{-1},

which is uniformly comparable to DX\|DX\| because dC1(f,Id)d_{C^{1}}(f,\operatorname{Id}) is uniformly bounded.

For k>1k>1, we must estimate the higher order derivatives of YY. Note that for k>1k>1 that Dkg=DkYD^{k}g=D^{k}Y and Dkf=DkXD^{k}f=D^{k}X.

Applying the chain rule to fg=Idf\circ g=\operatorname{Id} to calculate the kkth derivative gives:

0=l=1kj1++jl=kCl,j1,,jlDg(x)lf{Dxj1g,,Dxjlg},0=\sum_{l=1}^{k}\sum_{j_{1}+\cdots+j_{l}=k}C_{l,j_{1},...,j_{l}}D_{g(x)}^{l}f\{D^{j_{1}}_{x}g,\ldots,D_{x}^{j_{l}}g\},

and hence

(103) Dxkg=(Dg(x)f)1l=2kj1++jl=kCl,j1,,jlDg(x)lf{Dxj1g(x),,Dxjlg(x)}.D^{k}_{x}g=-(D_{g(x)}f)^{-1}\sum_{l=2}^{k}\sum_{j_{1}+\cdots+j_{l}=k}C_{l,j_{1},\ldots,j_{l}}D^{l}_{g(x)}f\{D^{j_{1}}_{x}g(x),\ldots,D^{j_{l}}_{x}g(x)\}.

As (Df)1(Df)^{-1} has uniformly bounded norm, it suffices to show that each term in the sum has norm bounded by a constant times XCk\|X\|_{C^{k}}.

We use the interpolation estimate in Lemma 52. If j>1j>1, then

Djg=DjY,\|D^{j}g\|=\|D^{j}Y\|,

By interpolation between the C1C^{1} and Ck1C^{k-1} norms, for 1jk11\leq j\leq k-1,

YCjC1,k1YC1kj1k2YCk1j1k2.\|Y\|_{C^{j}}\leq C_{1,k-1}\|Y\|_{C^{1}}^{\frac{k-j-1}{k-2}}\|Y\|_{C^{k-1}}^{\frac{j-1}{k-2}}.

By interpolation between the C1C^{1} and CkC^{k} norms, for 1jk1\leq j\leq k,

XCjC1,kXC1kjk1XCkj1k1.\|X\|_{C^{j}}\leq C_{1,k}\|X\|_{C^{1}}^{\frac{k-j}{k-1}}\|X\|_{C^{k}}^{\frac{j-1}{k-1}}.

We now estimate the terms in the right hand side of equation (103). In the case that some ji=1j_{i}=1, then Djig=Id+DYD^{j_{i}}g=\operatorname{Id}+DY. Hence the right hand side of equation (103), may be rewritten as the sum of terms of the form

Dg(x)lX{A1,,Al},D^{l}_{g(x)}X\{A_{1},...,A_{l}\},

where each AiA_{i} is either equal to Id\operatorname{Id} or DjiYD^{j_{i}}Y and the sum of the jij_{i} is less than or equal to kk. If YCk11\|Y\|_{C^{k-1}}\leq 1, then we are immediately done as the norm of this expression is at most a constant multiple of XCk\|X\|_{C^{k}}. Otherwise, we may suppose that YCk11\|Y\|_{C^{k-1}}\geq 1. The C1C^{1} norms of XX and YY are uniformly bounded. Hence by interpolating between the C1C^{1} and CkC^{k} norm to estimate the DlXD^{l}X term and the C1C^{1} and the Ck1C^{k-1} norm to estimate the AiA_{i} terms, we find that

Dg(x)lX{A1,,Al}CXCkl1k1YCk1krk2,\|D^{l}_{g(x)}X\{A_{1},...,A_{l}\}\|\leq C^{\prime}\|X\|_{C^{k}}^{\frac{l-1}{k-1}}\|Y\|_{C^{k-1}}^{\frac{k-r}{k-2}},

where rlr\geq l. But as YCk1>1\|Y\|_{C^{k-1}}>1, this is bounded above by

CXCkl1k1YCk1klk2.C^{\prime}\|X\|_{C^{k}}^{\frac{l-1}{k-1}}\|Y\|_{C^{k-1}}^{\frac{k-l}{k-2}}.

Thus

DkYC0C′′l=2kXCkl1k1YCk1klk2.\|D^{k}Y\|_{C^{0}}\leq C^{\prime\prime}\sum_{l=2}^{k}\|X\|_{C^{k}}^{\frac{l-1}{k-1}}\|Y\|_{C^{k-1}}^{\frac{k-l}{k-2}}.

We may now proceed by induction on kk. We already established the theorem for k=1k=1. Now, given that YCk1Ck1XCk1\|Y\|_{C^{k-1}}\leq C_{k-1}\|X\|_{C^{k-1}}, it follows that

DkYC0C′′′l=2kXCkl1k1XCk1klk2.\|D^{k}Y\|_{C^{0}}\leq C^{\prime\prime\prime}\sum_{l=2}^{k}\|X\|_{C^{k}}^{\frac{l-1}{k-1}}\|X\|_{C^{k-1}}^{\frac{k-l}{k-2}}.

By interpolation between the C1C^{1} and CkC^{k} norms and the uniform bound on the C1C^{1} norm, we find that XCk1DkXCkk2k1\|X\|_{C^{k-1}}\leq D_{k}\|X\|_{C^{k}}^{\frac{k-2}{k-1}}. This yields

DkYC0Dl=2kXCkl1k1XCkklk1D′′XCk,\|D^{k}Y\|_{C^{0}}\leq D^{\prime}\sum_{l=2}^{k}\|X\|_{C^{k}}^{\frac{l-1}{k-1}}\|X\|_{C^{k}}^{\frac{k-l}{k-1}}\leq D^{\prime\prime}\|X\|_{C^{k}},

which is the desired result. ∎
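The following sketch illustrates Lemma 48 for k=1k=1 with the circle diffeomorphism f(x)=x+asin(x)f(x)=x+a\sin(x), an arbitrary test case that is not taken from the paper; the inverse is computed by Newton iteration and the C1C^{1} distances are approximated on a grid.

```python
# A numerical illustration of Lemma 48 for k = 1: for f(x) = x + a sin(x)
# (a diffeomorphism for |a| < 1) we compute the inverse by Newton iteration
# and compare the C^1 distances of f and f^{-1} to the identity, using
# sup|h - id| + sup|h' - 1| as the C^1 distance.  This is only a sketch.
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 2001)

def d_C1(vals, x):
    # crude C^1 distance to the identity on a grid
    deriv = np.gradient(vals, x)
    return np.max(np.abs(vals - x)) + np.max(np.abs(deriv - 1.0))

for a in [0.3, 0.1, 0.03, 0.01]:
    f = lambda t: t + a * np.sin(t)
    g = x.copy()
    for _ in range(30):                       # Newton iteration for g = f^{-1}
        g = g - (f(g) - x) / (1.0 + a * np.cos(g))
    ratio = d_C1(g, x) / d_C1(f(x), x)
    print(f"a={a:5.2f}  d_C1(f,Id)={d_C1(f(x), x):.3e}  d_C1(f^-1,Id)={d_C1(g, x):.3e}  ratio={ratio:.3f}")
# The ratio stays bounded as a -> 0, as Lemma 48 predicts.
```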

We now obtain the following corollary.

Corollary 49.

Suppose that MM is a closed Riemannian manifold. For smooth C1C^{1} small vector fields XX on MM, we may write

ψX1=ψX+Z,\psi_{X}^{-1}=\psi_{-X+Z},

where ZZ is quadratic in XX.

Proof.

To begin we know by Proposition 44 that

ψXψX=ψZ,\psi_{X}\circ\psi_{-X}=\psi_{Z},

where ZZ is quadratic in XX. Note that ψXψZ1=ψX1\psi_{-X}\circ\psi_{Z}^{-1}=\psi_{X}^{-1}. By Lemma 48, ψZ1=ψZ\psi_{Z}^{-1}=\psi_{Z^{\prime}} where ZZ^{\prime} is quadratic in XX. Hence ψX1=ψXψZ\psi_{X}^{-1}=\psi_{-X}\circ\psi_{Z^{\prime}}. By Proposition 44, this gives that ψX1=ψX+Z+Q\psi_{X}^{-1}=\psi_{-X+Z^{\prime}+Q}, where QQ is quadratic in XX and ZZ^{\prime}. Hence, as ZZ^{\prime} is quadratic in XX, so is Z+QZ^{\prime}+Q, and the corollary follows. ∎

We can now complete the proof of the estimate on the error field of the conjugated system.

Proof of Proposition 43.

To show this, we repeatedly apply Proposition 44 and Corollary 49. Writing ZZ for anything second order in CC and YY, we find:

ψCfψC1\displaystyle\psi_{C}f\psi_{C}^{-1} =ψCψYRψC1\displaystyle=\psi_{C}\psi_{Y}R\psi_{C}^{-1}
=ψC+Y+ZRψC1\displaystyle=\psi_{C+Y+Z}R\psi_{C}^{-1}
=ψC+Y+ZRψC+Z\displaystyle=\psi_{C+Y+Z}R\psi_{-C+Z}
=ψC+Y+Z+R(C+Z)R\displaystyle=\psi_{C+Y+Z+R_{*}(-C+Z)}R
=ψC+YRC+ZR.\displaystyle=\psi_{C+Y-R_{*}C+Z}R.

Since ZZ is second order in CC and YY, this proves the proposition. ∎
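The following sketch checks the conclusion of Proposition 43 on the flat circle, where the exponential map is additive, the isometry RR is a rotation, and f=ψYRf=\psi_{Y}R as in the proof above; the specific fields CC and YY and the rotation angle are arbitrary choices.

```python
# A sketch check of Proposition 43 on the flat circle R/2piZ: R(x) = x + alpha
# is an isometry, psi_C(x) = x + C(x), and f = psi_Y o R.  The proposition
# predicts, with C and Y of size eps,
#   psi_C f psi_C^{-1}(x) - R(x) = Y(x+alpha) + C(x+alpha) - C(x) + O(eps^2),
# the last two terms being (C - R_*C) evaluated at R(x).
import numpy as np

alpha = 0.7
x = np.linspace(0.0, 2.0 * np.pi, 3001)

def C0(t): return np.sin(2.0 * t)
def Y0(t): return np.cos(t) + 0.5 * np.sin(3.0 * t)

def psi_inverse(C, x):
    # invert psi_C(y) = y + C(y) by the contraction y -> x - C(y) (C is C^1 small)
    y = x.copy()
    for _ in range(200):
        y = x - C(y)
    return y

for eps in [0.1, 0.05, 0.025]:
    C = lambda t: eps * C0(t)
    Y = lambda t: eps * Y0(t)
    y = psi_inverse(C, x)                         # psi_C^{-1}(x)
    fy = y + alpha + Y(y + alpha)                 # f = psi_Y o R
    h = fy + C(fy)                                # psi_C o f o psi_C^{-1}
    err = h - (x + alpha)                         # actual error field at R(x)
    pred = Y(x + alpha) + C(x + alpha) - C(x)     # Y + C - R_*C at R(x)
    print(f"eps={eps:5.3f}  sup|err - pred|={np.abs(err - pred).max():.2e}  "
          f"ratio to eps^2: {np.abs(err - pred).max()/eps**2:.3f}")
# The discrepancy scales like eps^2, matching the quadratic remainder Q(C, Y).
```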

We now show two additional lemmas that we use in the KAM scheme.

Lemma 50.

Let MM be a closed Riemannian manifold. Fix k1k\geq 1. There exist Ck,ϵ>0C_{k},\epsilon>0 such that if RIsom(M)R\in\operatorname{Isom}(M) and f,gDiffk(M)f,g\in\operatorname{Diff}^{k}(M) satisfy dC1(f,R)<ϵd_{C^{1}}(f,R)<\epsilon and dC1(g,Id)<ϵd_{C^{1}}(g,\operatorname{Id})<\epsilon, then

dCk(fg,R)Ck(dCk(f,R)+dCk(g,Id)),d_{C^{k}}(f\circ g,R)\leq C_{k}(d_{C^{k}}(f,R)+d_{C^{k}}(g,\operatorname{Id})),

and

dCk(gf,R)Ck(dCk(f,R)+dCk(g,Id)).d_{C^{k}}(g\circ f,R)\leq C_{k}(d_{C^{k}}(f,R)+d_{C^{k}}(g,\operatorname{Id})).
Proof.

We begin with a proof for the first inequality. In coordinates write f(x)=R(x)+Y(x)f(x)=R(x)+Y(x) and g(x)=x+X(x)g(x)=x+X(x). Then we just need to estimate

fg(x)R(x)=R(x+X(x))R(x)+Y(x+X(x)).f\circ g(x)-R(x)=R(x+X(x))-R(x)+Y(x+X(x)).

The last term is controlled by dCk(f,R)+dCk(g,Id)d_{C^{k}}(f,R)+d_{C^{k}}(g,\operatorname{Id}) by Lemma 46. So, it suffices to estimate the first term. The kkth derivative of R(x+X(x))R(x)R(x+X(x))-R(x) is then

l=1kj1++jl=kCl,j1,,jlDx+X(x)lR{Dxj1g,,Dxjlg}DxkR.\sum_{l=1}^{k}\sum_{j_{1}+\cdots+j_{l}=k}C_{l,j_{1},...,j_{l}}D_{x+X(x)}^{l}R\{D^{j_{1}}_{x}g,\ldots,D_{x}^{j_{l}}g\}-D_{x}^{k}R.

For all the terms with l<kl<k, the same interpolation approach as in Lemma 48 gives the appropriate estimate, i.e. they are bounded by

Cl=1k1XCkl1k1XCk1klk2.C\sum_{l=1}^{k-1}\|X\|_{C^{k}}^{\frac{l-1}{k-1}}\|X\|_{C^{k-1}}^{\frac{k-l}{k-2}}.

There are two remaining terms which are unaccounted for: DkRx+X(x)DkRxD^{k}R_{x+X(x)}-D^{k}R_{x}. This is bounded by a constant times XC0\|X\|_{C^{0}}, and the result follows.

We now consider the second inequality. As before we must estimate

gf(x)R(x)=X(x)+Y(R(x)+X(x)).g\circ f(x)-R(x)=X(x)+Y(R(x)+X(x)).

The important term is the second one. A similar argument to before then gives the result, as all derivatives of RR are uniformly bounded independently of RR. ∎

Lemma 51.

Let MM be a closed Riemannian manifold and k0k\geq 0. If gnDiffk(M)g_{n}\in\operatorname{Diff}^{k}(M) is a sequence of diffeomorphisms and ndCk(gn,Id)<\sum_{n}d_{C^{k}}(g_{n},\operatorname{Id})<\infty, then the sequence of compositions of diffeomorphisms hn=gngn1g2g1h_{n}=g_{n}g_{n-1}\cdots g_{2}g_{1} converges in CkC^{k} to a diffeomorphism.

Proof.

As before, we check in charts. Having fixed a chart, write gn(x)=x+Xn(x)g_{n}(x)=x+X_{n}(x). Write hn(x)=x+Yn(x)h_{n}(x)=x+Y_{n}(x). Let an=XnCka_{n}=\|X_{n}\|_{C^{k}} and let bn=YnCkb_{n}=\|Y_{n}\|_{C^{k}}. Note that

(104) hn(x)=x+Yn1(x)+Xn(x+Yn1(x)).h_{n}(x)=x+Y_{n-1}(x)+X_{n}(x+Y_{n-1}(x)).

Suppose for the moment that Yn1Ck1\|Y_{n-1}\|_{C^{k}}\leq 1. Using Lemma 46 and this bound,

(105) Xn(x+Yn1)Ck\displaystyle\|X_{n}(x+Y_{n-1})\|_{C^{k}} Ck(XnCkx+Yn1C1k+XnC1x+Yn1Ck+XnC0)\displaystyle\leq C_{k}(\|X_{n}\|_{C^{k}}\|x+Y_{n-1}\|_{C^{1}}^{k}+\|X_{n}\|_{C^{1}}\|x+Y_{n-1}\|_{C^{k}}+\|X_{n}\|_{C^{0}})
(106) Ck(an+anbn1)\displaystyle\leq C_{k}^{\prime}(a_{n}+a_{n}b_{n-1})

Hence it follows from equation (104) that there exists DkD_{k} such that if bn11b_{n-1}\leq 1 then

bnbn1+Dkan(1+bn1).b_{n}\leq b_{n-1}+D_{k}a_{n}(1+b_{n-1}).

By induction, under the same assumption that YjCk1\|Y_{j}\|_{C^{k}}\leq 1 for j<nj<n, it follows that

bn1+i=1n(1+Dkai).b_{n}\leq-1+\prod_{i=1}^{n}(1+D_{k}a_{i}).

By noting that i=1(1+xi)exp(i=1xi)\prod_{i=1}^{\infty}(1+x_{i})\leq\exp\left(\sum_{i=1}^{\infty}x_{i}\right) for xi0x_{i}\geq 0, we can conclude that a tail of the sequence converges. Indeed, because nan\sum_{n}a_{n} converges, we can inductively check that these inequalities hold once we start the argument from an index NN satisfying exp(i=NDkai)1<1\exp(\sum_{i=N}^{\infty}D_{k}a_{i})-1<1. Hence, as a tail of the infinite composition converges, so does the whole composition. ∎
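The following sketch illustrates the convergence mechanism of Lemma 51 on the circle with an ad hoc summable sequence of fields XnX_{n}; it only tracks the C0C^{0} picture, not the full CkC^{k} statement.

```python
# A small sketch of Lemma 51: g_n(x) = x + X_n(x) with sum_n ||X_n||_{C^0}
# finite, and we watch the compositions h_n = g_n ... g_1 become uniformly
# Cauchy.  The choice of X_n is arbitrary.
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 2001)

def X(n, t):
    return 2.0 ** (-n) * np.sin(t + n)       # summable in n

h = x.copy()
prev = None
for n in range(1, 31):
    h = h + X(n, h)                          # h_n = g_n o h_{n-1}
    if prev is not None and n % 5 == 0:
        print(f"n={n:2d}  sup|h_n - h_(n-5)|={np.abs(h - prev).max():.2e}")
    if n % 5 == 0:
        prev = h.copy()
# Successive blocks differ by roughly the tail of sum 2^{-n}, so the
# compositions converge uniformly.
```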

Appendix B Interpolation Inequalities

There is a basic CkC^{k} interpolation inequality, which may be found in the appendix of [Hör76, Thm A.5]. It states that:

Lemma 52.

Suppose that MM is a closed Riemannian manifold. For 0ab<0\leq a\leq b<\infty and 0<λ<10<\lambda<1 there exists a constant C(a,b,λ)C(a,b,\lambda) such that for any real valued CbC^{b} function ff defined on MM,

fCλa+(1λ)bCfCaλfCb1λ.\|f\|_{C^{\lambda a+(1-\lambda)b}}\leq C\|f\|_{C^{a}}^{\lambda}\|f\|_{C^{b}}^{1-\lambda}.
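As a quick illustration of Lemma 52 in the integer case a=0a=0, b=2b=2, λ=1/2\lambda=1/2, the following sketch compares the C1C^{1} norm of a two-mode test function with the geometric mean of its C0C^{0} and C2C^{2} norms; the test functions and the norm convention (maximum of the sup norms of the derivatives up to the given order) are arbitrary choices for the illustration.

```python
# A numerical illustration of the C^k interpolation inequality
#   ||f||_{C^1} <= C ||f||_{C^0}^{1/2} ||f||_{C^2}^{1/2}
# for functions mixing a slow and a fast mode.
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 20001)
for N in [5, 20, 80]:
    f   = np.sin(x) + 0.2 * np.sin(N * x)
    fp  = np.cos(x) + 0.2 * N * np.cos(N * x)
    fpp = -np.sin(x) - 0.2 * N ** 2 * np.sin(N * x)
    c0 = np.abs(f).max()
    c1 = max(c0, np.abs(fp).max())
    c2 = max(c1, np.abs(fpp).max())
    print(f"N={N:3d}  ||f||_C1={c1:8.2f}  sqrt(||f||_C0 ||f||_C2)={np.sqrt(c0 * c2):8.2f}")
# The geometric mean dominates the C^1 norm up to a modest constant, as the
# interpolation inequality predicts.
```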

The following is an immediate consequence of Lemma 52.

Lemma 53.

Suppose that MM is a closed Riemannian manifold. There exists ϵ>0\epsilon>0 such that for 0ab<0\leq a\leq b<\infty and 0<λ<10<\lambda<1 there exists a constant C(a,b,λ)C(a,b,\lambda) such that for any fDiff(M)f\in\operatorname{Diff}^{\infty}(M) with dC0(f,Id)<ϵd_{C^{0}}(f,\operatorname{Id})<\epsilon,

dCλa+(1λ)b(f,Id)CdCa(f,Id)λdCb(f,Id)1λ.d_{C^{\lambda a+(1-\lambda)b}}(f,\operatorname{Id})\leq Cd_{C^{a}}(f,\operatorname{Id})^{\lambda}d_{C^{b}}(f,\operatorname{Id})^{1-\lambda}.
Lemma 54.

Consider the space C(M,N)C^{\infty}(M,N) where MM and NN are closed Riemannian manifolds. For all j,σ>0j,\sigma>0, there exists a natural number kk and a number ϵ0>0\epsilon_{0}>0 such that if f,gC(M,N)f,g\in C^{\infty}(M,N), fgHj<ϵ0<1\|f-g\|_{H^{j}}<\epsilon_{0}<1, and fgCk1/2\|f-g\|_{C^{k}}\leq 1/2, then fgCjfgHj1σ\|f-g\|_{C^{j}}\leq\|f-g\|_{H^{j}}^{1-\sigma}.

Proof.

The proof is a relatively straightforward application of the Sobolev embedding theorem and interpolation inequalities. First, we recall an interpolation inequality for Sobolev norms, see [BL76, Thm. 6.5.4]. For each 0<θ<10<\theta<1, s0,s1s_{0},s_{1}, there exists a constant CC such that if we let s=(1θ)s0+θs1s=(1-\theta)s_{0}+\theta s_{1}, then we have

fgHsCfgHs01θfgHs1θ.\|f-g\|_{H^{s}}\leq C\|f-g\|_{H^{s_{0}}}^{1-\theta}\|f-g\|_{H^{s_{1}}}^{\theta}.

To begin the proof, note that it suffices to estimate fgCj+1\|f-g\|_{C^{j+1}}. Fix \ell large enough that HH^{\ell} embeds compactly in Cj+1C^{j+1} by a Sobolev embedding. Then pick kk large enough that

fgHCθ,fgHj1θfgHkθ,\|f-g\|_{H^{\ell}}\leq C_{\theta,\ell}\|f-g\|_{H^{j}}^{1-\theta}\|f-g\|_{H^{k}}^{\theta},

where 0<θ<σ0<\theta<\sigma. The term fgHkθ\|f-g\|_{H^{k}}^{\theta} is uniformly bounded by CkfgCkθC_{k}\|f-g\|_{C^{k}}^{\theta}. Hence as HH^{\ell} compactly embeds in Cj+1C^{j+1}, there exists C>0C^{\prime}>0 such that

fgCj+1CfgHj1θ=CfgHjσθfgHj1σ.\|f-g\|_{C^{j+1}}\leq C^{\prime}\|f-g\|_{H^{j}}^{1-\theta}=C^{\prime}\|f-g\|^{\sigma-\theta}_{H^{j}}\|f-g\|_{H^{j}}^{1-\sigma}.

If we choose ϵ0\epsilon_{0} sufficiently small that CfgHjσθ1C^{\prime}\|f-g\|^{\sigma-\theta}_{H^{j}}\leq 1, then the result follows. ∎

A similar argument shows the following:

Lemma 55.

Suppose that EE is a smooth Riemannian vector bundle over a closed Riemannian manifold MM. For all choices j,,σ,D>0j,\ell,\sigma,D>0 there exist k,ϵ0k,\epsilon_{0} such that if ff is a smooth section of EE and fHjϵ0<1\|f\|_{H^{j}}\leq\epsilon_{0}<1 and fCkD\|f\|_{C^{k}}\leq D then fCfHj1σ\|f\|_{C^{\ell}}\leq\|f\|_{H^{j}}^{1-\sigma}.

Appendix C Estimate on Lifted Error Fields

The goal of this section is to prove a technical estimate on the error fields of a lifted system. The proof is a computation in charts.

Lemma 56.

Suppose that MM is a closed Riemannian manifold. Fix numbers m,k0m,k\geq 0 and dd such that 0ddimM0\leq d\leq\dim M. There exists a constant CC such that the following holds. For any tuple (f1,,fm)(f_{1},...,f_{m}) of diffeomorphisms of MM and (r1,,rm)(r_{1},...,r_{m}) a C1C^{1} close tuple of isometries of MM, let YiY_{i} be the shortest vector field such that expri(x)Yi(x)=fi(x)\exp_{r_{i}(x)}Y_{i}(x)=f_{i}(x). Let FiF_{i} be the lift of fif_{i} to Grd(M)\operatorname{Gr}_{d}(M) and RiR_{i} be the lift of rir_{i} to Grd(M)\operatorname{Gr}_{d}(M). Let Y~i\widetilde{Y}_{i} be the shortest vector field on Grd(M)\operatorname{Gr}_{d}(M) such that expRi(x)Y~i(x)=Fi(x)\exp_{R_{i}(x)}\widetilde{Y}_{i}(x)=F_{i}(x). If iYiCk=ϵ\|\sum_{i}Y_{i}\|_{C^{k}}=\epsilon and maxiYiCk=η\max_{i}\|Y_{i}\|_{C^{k}}=\eta, then

i=1mY~iCk1C(ϵ+η2).\left\|\sum_{i=1}^{m}\widetilde{Y}_{i}\right\|_{C^{k-1}}\leq C(\epsilon+\eta^{2}).
Proof.

The proof is straightforward but tedious. We give the proof in the case that each RiR_{i} is the identity. Removing this assumption both complicates the argument in purely technical ways and substantially obscures why the lemma is true. At the end of the argument, we indicate the modifications needed for the general proof.

For readability we redevelop some of the basic notions concerning Grassmannians. First we recall the charts on Grd(V)\operatorname{Gr}_{d}(V), the Grassmannian of dd-planes in a vector space VV. Recall that given a vector space VV and a pair of complementary subspaces PP and QQ of VV that if dimP=d\dim P=d we obtain a chart on Grd(V)\operatorname{Gr}_{d}(V) in the following manner. Let L(P,Q)L(P,Q) denote the space of linear maps from PP to QQ. For AL(P,Q)A\in L(P,Q), we send AA to the subspace {x+AxxP}Grd(V)\{x+Ax\mid x\in P\}\in\operatorname{Gr}_{d}(V). This gives a smooth parametrization of a subset of Grd(V)\operatorname{Gr}_{d}(V). Having fixed a complementary pair of subspaces PP and QQ, let πP\pi_{P} denote the projection to PP along QQ.

Suppose that UU is a chart on MM and let 1,,n\partial_{1},...,\partial_{n} denote the coordinate vector fields. We use the usual coordinate framing of TUTU to give coordinates on the Grassmannian bundle Grd(M)\operatorname{Gr}_{d}(M). The tangent bundle to UU naturally splits into sub-bundles spanned by {1,,d}\{\partial_{1},...,\partial_{d}\} and {d+1,,n}\{\partial_{d+1},\ldots,\partial_{n}\}. Call these sub-bundles PP and QQ, respectively. Let End(P,Q)\operatorname{End}(P,Q) denote the bundle of maps from PP to QQ. We obtain a coordinate chart via associating an element of AEnd(P,Q)A\in\operatorname{End}(P,Q) and a point xUx\in U with the graph of AA in the tangent space over xx.

As we have assumed that each rir_{i} is the identity, in charts we write fi(x)=x+Xi(x)f_{i}(x)=x+X_{i}(x). As the fif_{i} are C1C^{1} small, we work in a single chart. It now suffices to prove the corresponding estimate on the field XiX_{i} because XiX_{i} and YiY_{i} are equal up to an error that is quadratic in the sense of Definition 42. We now calculate the action of ff on Grd(U)\operatorname{Gr}_{d}(U). Suppose that AEnd(P,Q)A\in\operatorname{End}(P,Q). Then we have that {Df(v+Av)vP}\{Df(v+Av)\mid v\in P\} is a subspace of Tf(x)MT_{f(x)}M. We must find the map AA^{\prime} whose graph gives the same subspace. Let IAI_{A} be the n×dn\times d matrix with top block II and bottom block AA. Then the action of DfDf sends AA to AA^{\prime} which is equal to

A=DfIA(πPDfIA)1Id.A^{\prime}=DfI_{A}(\pi_{P}DfI_{A})^{-1}-\operatorname{Id}.

To see that this is true, we must check that APQA^{\prime}P\subseteq Q and that {Dfv+DfAvvP}\{Dfv+DfAv\mid v\in P\} is the same as {v+AvvP}\{v+A^{\prime}v\mid v\in P\}. The second condition is evident from the definition of AA^{\prime}. If vPv\in P, then (πPDfIA)1v=w(\pi_{P}DfI_{A})^{-1}v=w is an element of PP satisfying πPDfIAw=v\pi_{P}DfI_{A}w=v. Thus Av=DfIA(πPDfIA)1vvQA^{\prime}v=DfI_{A}(\pi_{P}DfI_{A})^{-1}v-v\in Q and hence APQA^{\prime}P\subseteq Q. Write FF for the induced map on Grd(U)\operatorname{Gr}_{d}(U). In coordinates FF is the map that sends

(107) (x,A)(x,DfIA(πPDfIA)1Id).(x,A)\mapsto(x,DfI_{A}(\pi_{P}DfI_{A})^{-1}-\operatorname{Id}).
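The chart formula above is easy to test numerically. In the following sketch the subtracted identity in (107) is understood as the n×dn\times d inclusion matrix with top block IdI_{d} and bottom block 00, as in the computation below; the dimensions and the random data are arbitrary choices.

```python
# A sketch verification of the chart formula for the action of Df on the
# Grassmannian: a d-plane represented as the graph of A in L(P, Q) is sent to
# the graph of  A' = Df I_A (pi_P Df I_A)^{-1} - [I_d; 0].
# We check on random data that the graph of A' is Df applied to the graph of A.
import numpy as np

rng = np.random.default_rng(1)
n, d = 6, 2                                      # ambient dimension and plane dimension

def projector(B):
    # orthogonal projector onto the column space of B
    Q, _ = np.linalg.qr(B)
    return Q @ Q.T

Df = np.eye(n) + 0.1 * rng.normal(size=(n, n))   # a linear map close to the identity
A = 0.5 * rng.normal(size=(n - d, d))            # chart coordinate of a d-plane
I_A = np.vstack([np.eye(d), A])                  # graph of A as an n x d matrix
I_0 = np.vstack([np.eye(d), np.zeros((n - d, d))])

top = (Df @ I_A)[:d, :]                          # pi_P Df I_A
A_prime_full = Df @ I_A @ np.linalg.inv(top) - I_0
print("top block of A' (should vanish):", np.abs(A_prime_full[:d, :]).max())
A_prime = A_prime_full[d:, :]
err = np.abs(projector(Df @ I_A) - projector(np.vstack([np.eye(d), A_prime]))).max()
print("projector difference (same plane):", err)
```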

Write IdI_{d} for the dd by dd identity matrix. Let DXi^\widehat{DX_{i}} be the matrix comprised of the first dd rows of the matrix DXiDX_{i}. In the estimates below, we will assume that the size of AA is uniformly bounded. This does not restrict the generality as any subspace may be represented by such a uniformly bounded AA. Then note that

(πPDf[IdA])1\displaystyle(\pi_{P}D_{f}\left[\frac{I_{d}}{A}\right])^{-1} =(Id+DX^[IdA])1\displaystyle=(I_{d}+\widehat{DX}\left[\frac{I_{d}}{A}\right])^{-1}
=IdDX^[IdA]+O(DX2),\displaystyle=I_{d}-\widehat{DX}\left[\frac{I_{d}}{A}\right]+O(DX^{2}),

where the O(DX2)O(DX^{2}) is quadratic in the sense of Definition 42. Write XAX_{A} for the second term above.

We then have that

DfIA(πPDfIA)1Id\displaystyle DfI_{A}(\pi_{P}DfI_{A})^{-1}-\operatorname{Id} =(Id+DX)[IdA](IdXA)[Id0]+O(DX2)\displaystyle=(\operatorname{Id}+DX)\left[\frac{I_{d}}{A}\right](I_{d}-X_{A})-\left[\frac{I_{d}}{0}\right]+O(DX^{2})
=[IdA][IdA]XA+DX[IdA]+DX[IdA]XA[Id0]+O(DX2)\displaystyle=\left[\frac{I_{d}}{A}\right]-\left[\frac{I_{d}}{A}\right]X_{A}+DX\left[\frac{I_{d}}{A}\right]+DX\left[\frac{I_{d}}{A}\right]X_{A}-\left[\frac{I_{d}}{0}\right]+O(DX^{2})
=[0A][IdA]XA+DX[IdA]+O(DX2)\displaystyle=\left[\frac{0}{A}\right]-\left[\frac{I_{d}}{A}\right]X_{A}+DX\left[\frac{I_{d}}{A}\right]+O(DX^{2})
=[0A]+H(A,DX)+O(DX2),\displaystyle=\left[\frac{0}{A}\right]+H(A,DX)+O(DX^{2}),

where H(A,DX)H(A,DX) is the sum of the second and third terms two lines above. Note that HH is linear in DXDX and that H(A,DX)CDX\|H(A,DX)\|\leq C\|DX\| given our uniform boundedness assumption on AA.

Thus we see that in this chart on Grd(U)\operatorname{Gr}_{d}(U) that

(108) F(x,A)(x,A)=(f(x)x,H(A,DX)+O(DX2)).F(x,A)-(x,A)=(f(x)-x,H(A,DX)+O(DX^{2})).

In this chart, ifi(x)xCkϵ\|\sum_{i}f_{i}(x)-x\|_{C^{k}}\leq\epsilon. Hence writing fi(x)=x+Xi(x)f_{i}(x)=x+X_{i}(x) as before, iDXi(x)Ck1ϵ\|\sum_{i}DX_{i}(x)\|_{C^{k-1}}\leq\epsilon. Thus

iFi(x,A)(x,A)Ck1\displaystyle\left\|\sum_{i}F_{i}(x,A)-(x,A)\right\|_{C^{k-1}} =i(fi(x)x,H(A,DXi)+O(DX2))Ck1\displaystyle=\left\|\sum_{i}(f_{i}(x)-x,H(A,DX_{i})+O(DX^{2}))\right\|_{C^{k-1}}
C(iXiCk+maxiXiCk2)\displaystyle\leq C\left(\left\|\sum_{i}X_{i}\right\|_{C^{k}}+\max_{i}\|X_{i}\|_{C^{k}}^{2}\right)

by the linearity of HH. This completes the proof in the special case where ri=Idr_{i}=\operatorname{Id} for each ii.

In the general setting one follows the same sequence of steps. One writes fi(x)=ri(x)+Xi(ri(x))f_{i}(x)=r_{i}(x)+X_{i}(r_{i}(x)). One then does the same computation to determine the action on the Grassmannian bundle. This is complicated by additional terms related to RR. Having finished this computation, one finds a natural analog of H(A,DX)H(A,DX), which now comprises eight terms instead of two, and also depends on rir_{i}. Recognizing the cancellation is then somewhat complicated because of the dependence on rir_{i}. However, this dependence does not cause an issue because the terms that would potentially cause trouble satisfy some useful relations. These relations emerge when one keeps in mind the base points, which is crucial when the isometries are non-trivial. ∎

Appendix D Determinants

Suppose that VV and WW are finite dimensional inner product spaces. Consider a linear map L:VWL\colon V\to W. The determinant of the map LL is defined as follows. If {vi}\{v_{i}\} is an orthonormal basis for VV, one may measure the size of the tensor Lv1LvnLv_{1}\wedge\cdots\wedge Lv_{n} with respect to the norm on tensors induced by the metric on WW. If {v1,,vn}\{v_{1},...,v_{n}\} is a basis for VV, then we define

det(L,g1,g2)Det(Lvi,Lvjg2)Det(vi,vjg1),\det(L,g_{1},g_{2})\coloneqq\sqrt{\frac{\operatorname{Det}\left(\langle Lv_{i},Lv_{j}\rangle_{g_{2}}\right)}{\operatorname{Det}\left(\langle v_{i},v_{j}\rangle_{g_{1}}\right)}},

where Det\operatorname{Det} is the usual determinant of a square matrix. Sometimes we have a map L:VWL\colon V\to W and a subspace EVE\subset V. We then define

(109) det(L,g1,g2E)=det(L|E,g1|E,g2).\det(L,g_{1},g_{2}\mid E)=\det(L|_{E},g_{1}|_{E},g_{2}).

When the spaces VV and WW are understood, we may write det(LE)\det(L\mid E).

There are some properties of det\det that we will record for later use.

Lemma 57.

Fix a basis and suppose that V=WV=W. Working with respect to this basis, the determinant has the following properties:

(110) det(L,g1,g2)\displaystyle\det(L,g_{1},g_{2}) =det(Id,g1,Lg2),\displaystyle=\det(\operatorname{Id},g_{1},L^{*}g_{2}),
(111) det(Id,Id,A)\displaystyle\det(\operatorname{Id},\operatorname{Id},A) =det(A,Id,Id)=|Det(A)|.\displaystyle=\sqrt{\det(A,\operatorname{Id},\operatorname{Id})}=\sqrt{\left|\operatorname{Det}(A)\right|}.
Proof.

For the first equality, let {vi}\{v_{i}\} be a basis of (V,g1)(V,g_{1}), then

det(L,g1,g2)=DetLvi,Lvjg2Detvi,vjg1\det(L,g_{1},g_{2})=\sqrt{\frac{\operatorname{Det}\langle Lv_{i},Lv_{j}\rangle_{g_{2}}}{\operatorname{Det}\langle v_{i},v_{j}\rangle_{g_{1}}}}

But, vi,vjLg2=Lvi,Lvjg2\langle v_{i},v_{j}\rangle_{L^{*}g_{2}}=\langle Lv_{i},Lv_{j}\rangle_{g_{2}}, so, this is equal to

Detvi,vjLg2Detvi,vjg1,\sqrt{\frac{\operatorname{Det}\langle v_{i},v_{j}\rangle_{L^{*}g_{2}}}{\operatorname{Det}\langle v_{i},v_{j}\rangle_{g_{1}}}},

which is the definition of det(Id,g1,Lg2)\det(\operatorname{Id},g_{1},L^{*}g_{2}).

For the second equality, fix an orthonormal basis {ei}\{e_{i}\}, then

det(Id,Id,A)=Detei,ejA=DetAij\det(\operatorname{Id},\operatorname{Id},A)=\sqrt{\operatorname{Det}\langle e_{i},e_{j}\rangle_{A}}=\sqrt{\operatorname{Det}A_{ij}}

whereas,

det(A,Id,Id)=DetAei,AejId=DetATA=|DetA|2=|DetA|.\det(A,\operatorname{Id},\operatorname{Id})=\sqrt{\operatorname{Det}\langle Ae_{i},Ae_{j}\rangle_{\operatorname{Id}}}=\sqrt{\operatorname{Det}A^{T}A}=\sqrt{\left|\operatorname{Det}A\right|^{2}}=\left|\operatorname{Det}A\right|. ∎
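The identities of Lemma 57 are also easy to confirm numerically; in the following sketch the test map, the positive definite matrix playing the role of a metric, and the auxiliary basis are arbitrary choices.

```python
# A quick numerical check of Lemma 57: for a linear map A and a positive
# definite "metric" G on R^n,
#   det(A, Id, Id) = |Det A|   and   det(Id, Id, G) = sqrt(Det G),
# where det(L, g1, g2) is computed from Gram matrices in an arbitrary basis.
import numpy as np

rng = np.random.default_rng(2)
n = 4

def det_map(L, g1, g2, basis):
    V = basis                                   # columns are the basis vectors
    num = np.linalg.det((L @ V).T @ g2 @ (L @ V))
    den = np.linalg.det(V.T @ g1 @ V)
    return np.sqrt(num / den)

basis = rng.normal(size=(n, n))                 # an arbitrary basis of R^n
A = rng.normal(size=(n, n))
S = rng.normal(size=(n, n))
G = S @ S.T + n * np.eye(n)                     # positive definite

print(det_map(A, np.eye(n), np.eye(n), basis), abs(np.linalg.det(A)))
print(det_map(np.eye(n), np.eye(n), G, basis), np.sqrt(np.linalg.det(G)))
```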

We record the following estimate which is used in the proof.

Lemma 58.

Let MM be a closed Riemannian manifold and let 0rdimM0\leq r\leq\dim M. If gg is an isometry of MM and ff a diffeomorphism of MM, then lndet(DfEx)\ln\det(Df\mid E_{x}), which is defined on Grr(M)\operatorname{Gr}_{r}(M), satisfies the following estimate:

lndet(DfEx)Ck=O(dCk+1(f,g)),\left\|\ln\det(Df\mid E_{x})\right\|_{C^{k}}=O(d_{C^{k+1}}(f,g)),

as fgf\to g in Ck+1C^{k+1}. The big-O is uniform over all isometries gg.

Proof.

It suffices to show that this estimate holds in charts. So, fix a pair of charts UU and VV on MM such that f(U)f(U) has compact closure inside of VV. We define a map H:Grd(U)×U×V×n2H\colon\operatorname{Gr}_{d}(U)\times U\times V\times\mathbb{R}^{n^{2}}\to\mathbb{R} by sending the point (E,x,y,A)(E,x,y,A) to lndet(A,gx,gy|E)\ln\det(A,g_{x},g_{y}|E), where gxg_{x} and gyg_{y} denote the pullback metric from MM. Using ff we define a map f~:Grd(U)Grd(U)×U×V×n2\widetilde{f}\colon\operatorname{Gr}_{d}(U)\to\operatorname{Gr}_{d}(U)\times U\times V\times\mathbb{R}^{n^{2}} by

(E,x)(E,x,f(x),Df),(E,x)\mapsto(E,x,f(x),Df),

where we are using the coordinates to express DfDf as a matrix. Then the quantity we wish to estimate the CkC^{k} norm of is Hf~H\circ\widetilde{f}. If we analogously define g~\widetilde{g}, then note that Hg~0H\circ\widetilde{g}\equiv 0 because gg is an isometry. By writing out the derivatives using the chain rule and using that ff is uniformly close to gg, one sees that Hg~Hf~Ck=O(dCk+1(f,g))\|H\circ\widetilde{g}-H\circ\widetilde{f}\|_{C^{k}}=O(d_{C^{k+1}}(f,g)), and the result follows. ∎

Appendix E Taylor Expansions

E.1. Taylor expansion of the log Jacobian

Proposition 59.

For C1C^{1} small vector fields YY on a Riemannian manifold MM, the following approximation holds

Grr(M)lndet(DxψY,Id,gψY(x)Ex)dvol=r2dMEC2dvol+r(dr)(d+2)(d1)MENC2dvol+O(YC13),\int_{\operatorname{Gr}_{r}(M)}\ln\det(D_{x}\psi_{Y},\operatorname{Id},g_{\psi_{Y(x)}}\mid E_{x})\,d\operatorname{vol}=-\frac{r}{2d}\int_{M}\|E_{C}\|^{2}\,d\operatorname{vol}+\frac{r(d-r)}{(d+2)(d-1)}\int_{M}\|E_{NC}\|^{2}\,d\operatorname{vol}+O(\|Y\|_{C^{1}}^{3}),

where ECE_{C} and ENCE_{NC} are the conformal and non-conformal strain tensors associated to ψY\psi_{Y} as defined in subsection 4.2. In addition, det\det is defined in Appendix D and ψY\psi_{Y} is defined in equation (11).

The proof of this proposition is a lengthy computation with several subordinate lemmas.

Proof.

In order to estimate the integral over MM, we will first obtain a pointwise estimate on:

Grr(TxM)lndet(DxψYE)dE.\int_{\operatorname{Gr}_{r}(T_{x}M)}\ln\det(D_{x}\psi_{Y}\mid E)\,dE.

To estimate this we work in an exponential chart on MM centered at xx. In this chart, xx is 0 and ψY(0)=Y(0)\psi_{Y}(0)=Y(0). Then

Grr(TxM)lndet(DxψYE)dE=Grr(TxM)lndet(D0ψY,Id,gY(0)E)dE.\int_{\operatorname{Gr}_{r}(T_{x}M)}\ln\det(D_{x}\psi_{Y}\mid E)\,dE=\int_{\operatorname{Gr}_{r}(T_{x}M)}\ln\det(D_{0}\psi_{Y},\operatorname{Id},g_{Y(0)}\mid E)\,dE.

We now rewrite the above line so that we can apply the Taylor approximation in Proposition 62.

Write the metric as Id+g^\operatorname{Id}+\hat{g}. As we are in an exponential chart, g^Y(0)=O(YC02)\|\hat{g}_{Y(0)}\|=O(\|Y\|^{2}_{C^{0}}). Write DψY=Id+ψ^D\psi_{Y}=\operatorname{Id}+\hat{\psi}. The integral we are calculating only involves ψ^0\hat{\psi}_{0} and g^Y(0)\hat{g}_{Y(0)}, so below we drop the subscripts. Then

Grr(TxM)lndet(DxψYE)dE=Grr(TxM)lndet(Id+ψ^,Id,Id+g^E)dE.\int_{\operatorname{Gr}_{r}(T_{x}M)}\ln\det(D_{x}\psi_{Y}\mid E)\,dE=\int_{\operatorname{Gr}_{r}(T_{x}M)}\ln\det(\operatorname{Id}+\hat{\psi},\operatorname{Id},\operatorname{Id}+\hat{g}\mid E)\,dE.

Now applying the Taylor expansions in Propositions 62 and 63, we obtain the following expansion. For convenience let

(112) K=(ψ^+ψ^T)/2Trψ^dId.K=(\hat{\psi}+\hat{\psi}^{T})/2-\frac{\operatorname{Tr}\hat{\psi}}{d}\operatorname{Id}.

Then

(113) Grr(TxM)lndet(DψY,Id,gY(0)E)dE=\displaystyle\int_{\operatorname{Gr}_{r}(T_{x}M)}\ln\det(D\psi_{Y},\operatorname{Id},g_{Y(0)}\mid E)\,dE=
(114) rdTr(ψ^)+[r2dTr(ψ^2)+r(dr)(d+2)(d1)Tr(K2)]+O(ψ^3)+r2dTr(g^)+O(g^2)\displaystyle\frac{r}{d}\operatorname{Tr}(\hat{\psi})+\left[-\frac{r}{2d}\operatorname{Tr}(\hat{\psi}^{2})+\frac{r(d-r)}{(d+2)(d-1)}\operatorname{Tr}(K^{2})\right]+O(\|\hat{\psi}^{3}\|)+\frac{r}{2d}\operatorname{Tr}(\hat{g})+O(\|\hat{g}\|^{2})

Note that ψ^=O(YC1)\|\hat{\psi}\|=O(\|Y\|_{C^{1}}) and g^=O(YC02)\|\hat{g}\|=O(\|Y\|^{2}_{C^{0}}), hence the fourth and sixth terms in the above expression are each O(YC13)O(\|Y\|^{3}_{C^{1}}).

We now eliminate the two trace terms that are not quadratic in their arguments. For this, we use a Taylor expansion of the determinant. Recall the usual Taylor expansion Det(Id+A)=1+Tr(A)+(Tr(A))2Tr(A2)2+O(A3)\operatorname{Det}(\operatorname{Id}+A)=1+\operatorname{Tr}(A)+\frac{(\operatorname{Tr}(A))^{2}-\operatorname{Tr}(A^{2})}{2}+O(\|A\|^{3}). We combine this with the first order Taylor expansion det(Id,Id,Id+G)=Det(1+G)=1+Tr(G)+O(G2)=1+Tr(G)2+O(G2).\det(\operatorname{Id},\operatorname{Id},\operatorname{Id}+G)=\sqrt{\operatorname{Det}(1+G)}=\sqrt{1+\operatorname{Tr}(G)+O(\|G\|^{2})}=1+\frac{\operatorname{Tr}(G)}{2}+O(\|G\|^{2}). Thus

det(Dψ,Id,gY(0))=1+Trψ^+(Tr(ψ^))2Tr(ψ^2)2+Tr(g^)2+O(YC13)\det(D\psi,\operatorname{Id},g_{Y(0)})=1+\operatorname{Tr}\hat{\psi}+\frac{(\operatorname{Tr}(\hat{\psi}))^{2}-\operatorname{Tr}(\hat{\psi}^{2})}{2}+\frac{\operatorname{Tr}(\hat{g})}{2}+O(\|Y\|^{3}_{C^{1}})

The integral of the Jacobian is 11, so integrating the previous line over MM against volume we obtain

1=1+MTrψ^+(Tr(ψ^))2Tr(ψ^2)2+Tr(g^)2dvol+O(YC13).1=1+\int_{M}\operatorname{Tr}\hat{\psi}+\frac{(\operatorname{Tr}(\hat{\psi}))^{2}-\operatorname{Tr}(\hat{\psi}^{2})}{2}+\frac{\operatorname{Tr}(\hat{g})}{2}\,d\operatorname{vol}+O(\|Y\|^{3}_{C^{1}}).

Thus

MTr(ψ^)+Tr(g^)2Tr(ψ^2)2dvol=M(Tr(ψ^))22dvol+O(YC13).\int_{M}\operatorname{Tr}(\hat{\psi})+\frac{\operatorname{Tr}(\hat{g})}{2}-\frac{\operatorname{Tr}(\hat{\psi}^{2})}{2}\,d\operatorname{vol}=-\int_{M}\frac{(\operatorname{Tr}(\hat{\psi}))^{2}}{2}\,d\operatorname{vol}+O(\|Y\|^{3}_{C^{1}}).

Now, we integrate equation (113) over MM and apply the previous line to eliminate the non-quadratic terms. This gives

(115) Grr(M)lndet(DxψY,Id,gψY(x)Ex)dEx=Mr2d(Tr(ψ^x))2+r(dr)(d+2)(d1)Tr(Kx2)dvol+O(YC13),\displaystyle\int_{\operatorname{Gr}_{r}(M)}\ln\det(D_{x}\psi_{Y},\operatorname{Id},g_{\psi_{Y(x)}}\mid E_{x})\,dE_{x}=\int_{M}-\frac{r}{2d}(\operatorname{Tr}(\hat{\psi}_{x}))^{2}+\frac{r(d-r)}{(d+2)(d-1)}\operatorname{Tr}(K_{x}^{2})\,d\operatorname{vol}+O(\|Y\|^{3}_{C^{1}}),

where we have written ψ^x\hat{\psi}_{x} and KxK_{x} to emphasize the basepoint. The formula above is not yet very usable as both KxK_{x} and ψ^x\hat{\psi}_{x} are defined in terms of exponential charts. We now obtain an intrinsic expression for these terms. Recall that pointwise we use the L2L^{2} norm on tensors. Below we suppress the xx in EC(x)\|E_{C}(x)\| and ψ^x\hat{\psi}_{x}.

Lemma 60.

Let ECE_{C} be the conformal strain tensor associated to ψY\psi_{Y}. Then

M(Tr(ψ^x))2dvol=EC2dvol+O(YC13).\int_{M}(\operatorname{Tr}(\hat{\psi}_{x}))^{2}\,d\operatorname{vol}=\int\|E_{C}\|^{2}\,d\operatorname{vol}+O(\|Y\|^{3}_{C^{1}}).
Proof.

We use an exponential chart and compute a coordinate expression for EC2\|E_{C}\|^{2} in the center of this chart. As before, write DψY=Id+ψ^D\psi_{Y}=\operatorname{Id}+\hat{\psi}, where ψ^=O(YC1)\hat{\psi}=O(\|Y\|_{C^{1}}). Then working in exponential coordinates,

Tr(ψYgg)\displaystyle\operatorname{Tr}(\psi^{*}_{Y}g-g) =Tr((Id+ψ^)T(Id+O(YC02))(Id+ψ^)Id)\displaystyle=\operatorname{Tr}((\operatorname{Id}+\hat{\psi})^{T}(\operatorname{Id}+O(\|Y\|^{2}_{C^{0}}))(\operatorname{Id}+\hat{\psi})-\operatorname{Id})
=Tr(Id+ψ^T+ψ^Id+O(YC12)\displaystyle=\operatorname{Tr}(\operatorname{Id}+\hat{\psi}^{T}+\hat{\psi}-\operatorname{Id}+O(\|Y\|^{2}_{C^{1}})
=2Tr(ψ^)+O(YC12).\displaystyle=2\operatorname{Tr}(\hat{\psi})+O(\|Y\|^{2}_{C^{1}}).

Thus since ψ^=O(YC1)\hat{\psi}=O(\|Y\|_{C^{1}}), by definition of ECE_{C}, we have

EC2\displaystyle\|E_{C}\|^{2} =Tr(ψYgg)2dId2\displaystyle=\left\|\frac{\operatorname{Tr}(\psi_{Y}^{*}g-g)}{2d}\operatorname{Id}\right\|^{2}
=2Tr(ψ^)2dId2\displaystyle=\left\|\frac{2\operatorname{Tr}(\hat{\psi})}{2d}\operatorname{Id}\right\|^{2}
=(Tr(ψ^)d)2Id2\displaystyle=\left(\frac{\operatorname{Tr}(\hat{\psi})}{d}\right)^{2}\|\operatorname{Id}\|^{2}
=(Tr(ψ^))2+O(YC13).\displaystyle=(\operatorname{Tr}(\hat{\psi}))^{2}+O(\|Y\|^{3}_{C^{1}}).

Integrating over MM, we obtain the result. ∎

Lemma 61.

Let ENCE_{NC} be the non-conformal strain tensor associated to ψY\psi_{Y} and let KxK_{x} be as in equation (112), then

MTr(Kx2)dvol=MENC2dvol+O(YC13).\int_{M}\operatorname{Tr}\left(K^{2}_{x}\right)\,d\operatorname{vol}=\int_{M}\|E_{NC}\|^{2}\,d\operatorname{vol}+O(\|Y\|^{3}_{C^{1}}).
Proof.

As before, we first compute a local expression for the integrand and check that this expression is comparable to the local expression for the non-conformal strain tensor. We compute at the center of an exponential chart. As before, write DψY=Id+ψ^D\psi_{Y}=\operatorname{Id}+\hat{\psi} where ψ^=O(YC1)\hat{\psi}=O(\|Y\|_{C^{1}}). In this case

ψYg=(Id+ψ^)T(Id+O(YC02))(Id+ψ^)=Id+ψ^T+ψ^+O(YC12).\psi^{*}_{Y}g=(\operatorname{Id}+\hat{\psi})^{T}(\operatorname{Id}+O(\|Y\|^{2}_{C^{0}}))(\operatorname{Id}+\hat{\psi})=\operatorname{Id}+\hat{\psi}^{T}+\hat{\psi}+O(\|Y\|^{2}_{C^{1}}).

Using the above line and the definition of ENCE_{NC} we then compute:

ENC2\displaystyle\|E_{NC}\|^{2} =12(ψYggTr(ψYgg)dg)2\displaystyle=\left\|\frac{1}{2}\left(\psi^{*}_{Y}g-g-\frac{\operatorname{Tr}(\psi^{*}_{Y}g-g)}{d}g\right)\right\|^{2}
=14(Id+ψ^)T(Id+O(YC02))(Id+ψ^)Id2Trψ^dId+O(YC12)2\displaystyle=\frac{1}{4}\|(\operatorname{Id}+\hat{\psi})^{T}(\operatorname{Id}+O(\|Y\|^{2}_{C^{0}}))(\operatorname{Id}+\hat{\psi})-\operatorname{Id}-2\frac{\operatorname{Tr}\hat{\psi}}{d}\operatorname{Id}+O(\|Y\|^{2}_{C^{1}})\|^{2}
=14ψ^T+ψ^2Trψ^dId+O(YC12)2\displaystyle=\frac{1}{4}\|\hat{\psi}^{T}+\hat{\psi}-2\frac{\operatorname{Tr}\hat{\psi}}{d}\operatorname{Id}+O(\|Y\|^{2}_{C^{1}})\|^{2}
=14Tr((ψ^T+ψ^2Trψ^dId+O(YC12))2)\displaystyle=\frac{1}{4}\operatorname{Tr}\left(\left(\hat{\psi}^{T}+\hat{\psi}-2\frac{\operatorname{Tr}\hat{\psi}}{d}\operatorname{Id}+O(\|Y\|^{2}_{C^{1}})\right)^{2}\right)
=Tr((ψ^T+ψ^2Trψ^dId)2)+O(YC13)\displaystyle=\operatorname{Tr}\left(\left(\frac{\hat{\psi}^{T}+\hat{\psi}}{2}-\frac{\operatorname{Tr}\hat{\psi}}{d}\operatorname{Id}\right)^{2}\right)+O(\|Y\|^{3}_{C^{1}})
=Tr(K2)+O(YC13).\displaystyle=\operatorname{Tr}(K^{2})+O(\|Y\|^{3}_{C^{1}}).

By integrating the above equality over MM, the result follows. ∎

Finally, the proof of Proposition 59 follows by applying Lemma 60 and Lemma 61 to equation (115), which gives

(116) Grr(M)lndet(DxψY,Id,gψY(x)Ex)dEx=r2dMEC2dvol+r(dr)(d+2)(d1)MENC2dvol+O(YC13).\displaystyle\int_{\operatorname{Gr}_{r}(M)}\ln\det(D_{x}\psi_{Y},\operatorname{Id},g_{\psi_{Y(x)}}\mid E_{x})\,dE_{x}=-\frac{r}{2d}\int_{M}\|E_{C}\|^{2}\,d\operatorname{vol}+\frac{r(d-r)}{(d+2)(d-1)}\int_{M}\|E_{NC}\|^{2}\,d\operatorname{vol}+O(\|Y\|_{C^{1}}^{3}).

E.2. Approximation of integrals over Grassmannians

Let 𝔾r,d\mathbb{G}_{r,d} be the Grassmannian of rr-planes in d\mathbb{R}^{d}. In this subsection, we prove the following simple estimate.

Proposition 62.

For 1rd1\leq r\leq d, let Λr:End(d)\Lambda_{r}\colon\operatorname{End}(\mathbb{R}^{d})\rightarrow\mathbb{R} be defined by

Λr(L):=𝔾r,dlndet(Id+L,Id,IdE)dE,\Lambda_{r}(L):=\int_{\mathbb{G}_{r,d}}\ln\det(\operatorname{Id}+L,\operatorname{Id},\operatorname{Id}\mid E)\,dE,

where dEdE denotes the Haar measure on 𝔾r,d\mathbb{G}_{r,d}. Then the second order Taylor approximation for Λr\Lambda_{r} at 0 is

Λr(L)=rdTrL+[r2dTr(L2)+r(dr)(d+2)(d1)Tr(K2)]+O(L3),\Lambda_{r}(L)=\frac{r}{d}\operatorname{Tr}L+\left[-\frac{r}{2d}\operatorname{Tr}(L^{2})+\frac{r(d-r)}{(d+2)(d-1)}\operatorname{Tr}(K^{2})\right]+O(\|L\|^{3}),

where

K=L+LT2TrLdId.K=\frac{L+L^{T}}{2}-\frac{\operatorname{Tr}L}{d}\operatorname{Id}.

Let λr(L)=Λr(L)Λr1(L)\lambda_{r}(L)=\Lambda_{r}(L)-\Lambda_{r-1}(L). Then the above expansion implies

λr(L)=1dTrL+[12dTr(L2)+d2r+1(d+2)(d1)Tr(K2)]+O(L3).\lambda_{r}(L)=\frac{1}{d}\operatorname{Tr}L+\left[-\frac{1}{2d}\operatorname{Tr}(L^{2})+\frac{d-2r+1}{(d+2)(d-1)}\operatorname{Tr}(K^{2})\right]+O(\|L\|^{3}).
Proof.

Before beginning, note from the definition of Λr\Lambda_{r} that if UU is an orthogonal transformation, Λr(UTLU)=Λr(L)\Lambda_{r}(U^{T}LU)=\Lambda_{r}(L). Consequently, if αi\alpha_{i} is the iith term in the Taylor expansion of Λr\Lambda_{r}, then αi\alpha_{i} is invariant under conjugation by isometries.

The map Λr\Lambda_{r} is smooth, so it admits a Taylor expansion:

Λr(L)=α1(L)+α2(L)+O(L3),\Lambda_{r}(L)=\alpha_{1}(L)+\alpha_{2}(L)+O(\|L\|^{3}),

where α1\alpha_{1} is linear in LL and α2\alpha_{2} is quadratic in LL. The rest of the proof is a calculation of α1\alpha_{1} and α2\alpha_{2}. Before we begin this calculation we describe the approach. In each case, we reduce to the case of a symmetric matrix LL. Then restricted to symmetric matrices, we diagonalize. There are few linear or quadratic maps from End(d)\operatorname{End}(\mathbb{R}^{d}) to \mathbb{R} that are invariant under conjugation by an orthogonal matrix. We then write αi\alpha_{i} as a linear combination of such invariant maps and solve for the coefficients of this linear combination.

We begin by calculating α1\alpha_{1}.

Claim 3.

With notation as above,

α1(L)=rdTrL.\alpha_{1}(L)=\frac{r}{d}\operatorname{Tr}L.
Proof.

Let Λ~r(Id+L)=Λr(L)\widetilde{\Lambda}_{r}(\operatorname{Id}+L)=\Lambda_{r}(L). Then from the definition, note that if UU is an isometry then Λ~r(U(Id+L))=Λ~r((Id+L)U)=Λr(L)\widetilde{\Lambda}_{r}(U(\operatorname{Id}+L))=\widetilde{\Lambda}_{r}((\operatorname{Id}+L)U)=\Lambda_{r}(L). Suppose that OtO_{t} is a smooth path in O(d)End(d)O(d)\subset\operatorname{End}(\mathbb{R}^{d}) with O0=IdO_{0}=\operatorname{Id}. Then Λ~r(Ot)=0\widetilde{\Lambda}_{r}(O_{t})=0. Write Ot=Id+tS+O(t2)O_{t}=\operatorname{Id}+tS+O(t^{2}) where SS is skew symmetric. Then we see that

Λ~r(Id+tS+O(t2))=O(t2),\widetilde{\Lambda}_{r}(\operatorname{Id}+tS+O(t^{2}))=O(t^{2}),

So, Λr(tS)=O(t2)\Lambda_{r}(tS)=O(t^{2}). Hence α1\alpha_{1} vanishes on skew symmetric matrices.

Thus it suffices to evaluate α1\alpha_{1} restricted to symmetric matrices. Suppose that AA is a symmetric matrix, then there exists an orthogonal matrix UU so that UTAUU^{T}AU is diagonal. Restricted to the space of diagonal matrices, which we identify with d\mathbb{R}^{d} in the natural way, observe that α1:d\alpha_{1}\colon\mathbb{R}^{d}\to\mathbb{R} is invariant under permutation of the coordinates in d\mathbb{R}^{d} because it is invariant under conjugation by isometries. There is a one dimensional space of maps having this property, and it is spanned by the trace, Tr\operatorname{Tr}. So, α1(A)=α1(UTAU)=a1Tr(A)\alpha_{1}(A)=\alpha_{1}(U^{T}AU)=a_{1}\operatorname{Tr}(A) for some constant a1a_{1}. To compute the constant a1a_{1} it suffices to consider a specific matrix, e.g. A=IdA=\operatorname{Id}.

α1(Id)\displaystyle\alpha_{1}(\operatorname{Id}) =ddϵlndet(Id+ϵIdE)dE\displaystyle=\frac{d}{d\epsilon}\int\ln\det(\operatorname{Id}+\epsilon\operatorname{Id}\mid E)\,dE
=ddϵln(1+ϵ)rdE\displaystyle=\frac{d}{d\epsilon}\int\ln(1+\epsilon)^{r}\,dE
=ddϵrln(1+ϵ)|ϵ=0\displaystyle=\frac{d}{d\epsilon}r\ln(1+\epsilon)\Big|_{\epsilon=0}
=r.\displaystyle=r.

So, a1=r/da_{1}=r/d. Thus for LEnd(d)L\in\operatorname{End}(\mathbb{R}^{d}), α1(L)=rdTr((L+LT)/2)=rdTr(L)\alpha_{1}(L)=\frac{r}{d}\operatorname{Tr}((L+L^{T})/2)=\frac{r}{d}\operatorname{Tr}(L). ∎

We now compute α2\alpha_{2}.

Claim 4.

With notation as in the statement of Proposition 62,

α2(L)=r2dTr(L2)+r(dr)(d+2)(d1)Tr(K2).\alpha_{2}(L)=-\frac{r}{2d}\operatorname{Tr}(L^{2})+\frac{r(d-r)}{(d+2)(d-1)}\operatorname{Tr}(K^{2}).
Proof.

Let Λ~r(Id+L)=Λr(L)\widetilde{\Lambda}_{r}(\operatorname{Id}+L)=\Lambda_{r}(L). From the definition, note that for an isometry UU we have Λ~r((Id+L)U)=Λ~r(Id+L)\widetilde{\Lambda}_{r}((\operatorname{Id}+L)U)=\widetilde{\Lambda}_{r}(\operatorname{Id}+L). Fix LL and let J=(LLT)/2J=(L-L^{T})/2. Observe that

(Id+L)eJ=Id+(LJ)+(J2/2LJ)+O(|L|3).(\operatorname{Id}+L)e^{-J}=\operatorname{Id}+(L-J)+(J^{2}/2-LJ)+O(\left|L\right|^{3}).

Thus we see that

Λr(L)\displaystyle\Lambda_{r}(L) =Λ~r(Id+L)\displaystyle=\widetilde{\Lambda}_{r}(\operatorname{Id}+L)
=Λ~r((L+Id)eJ)\displaystyle=\widetilde{\Lambda}_{r}((L+\operatorname{Id})e^{-J})
=Λ~r(Id+(LJ)+(J2/2LJ)+O(|L|3))\displaystyle=\widetilde{\Lambda}_{r}(\operatorname{Id}+(L-J)+(J^{2}/2-LJ)+O(\left|L\right|^{3}))
=Λr((LJ)+(J2/2LJ))+O(|L|3).\displaystyle=\Lambda_{r}((L-J)+(J^{2}/2-LJ))+O(\left|L\right|^{3}).

Now comparing the two Taylor expansions of Λ~r(Id+L)\widetilde{\Lambda}_{r}(\operatorname{Id}+L), we find:

α2(L)=α2(LJ)+α1(J2/2LJ).\alpha_{2}(L)=\alpha_{2}(L-J)+\alpha_{1}(J^{2}/2-LJ).

Thus as we have already determined α1\alpha_{1}:

α2(L)=α2((L+LT)/2)+rdTr(J2/2LJ).\alpha_{2}(L)=\alpha_{2}((L+L^{T})/2)+\frac{r}{d}\operatorname{Tr}(J^{2}/2-LJ).

So, we are again reduced to the case of a symmetric matrix SS. In fact, by invariance of α2\alpha_{2} under conjugation by isometries, we are reduced to determining α2\alpha_{2} on the space of diagonal matrices. Identify d\mathbb{R}^{d} with diagonal matrices as before. We see that α2\alpha_{2} is a symmetric polynomial of degree 22 in dd variables. The space of such polynomials is spanned by xi2\sum x_{i}^{2} and i,jxixj\sum_{i,j}x_{i}x_{j}. It is convenient to observe that for a diagonal matrix, DD, Tr(D2)\operatorname{Tr}(D^{2}) and Tr(D)2\operatorname{Tr}(D)^{2} span this space as well. Hence

α2(S)=b1Tr(S)2+b2Tr(S2)\alpha_{2}(S)=b_{1}\operatorname{Tr}(S)^{2}+b_{2}\operatorname{Tr}(S^{2})

Now in order to calculate b1b_{1} and b2b_{2} we will explicitly calculate α2(Id)\alpha_{2}(\operatorname{Id}) and α2(P)\alpha_{2}(P), where PP is the orthogonal projection onto a coordinate axis.

In the first case,

2α2(Id)=ddϵ1ddϵ2𝔾r,dlndet((1+ϵ1+ϵ2)IdE)dEϵ1=0,ϵ2=0\displaystyle 2\alpha_{2}(\operatorname{Id})=\frac{d}{d\epsilon_{1}}\frac{d}{d\epsilon_{2}}\int_{\mathbb{G}_{r,d}}\ln\det((1+\epsilon_{1}+\epsilon_{2})\operatorname{Id}\mid E)\,dE\mid_{\epsilon_{1}=0,\epsilon_{2}=0} =d2dϵ2ln(1+ϵ)r0=r.\displaystyle=\frac{d^{2}}{d\epsilon^{2}}\ln(1+\epsilon)^{r}\mid_{0}=-r.

So, α2(Id)=r/2\alpha_{2}(\operatorname{Id})=-r/2.

Next suppose that PP is projection onto a fixed vector ee. Suppose that (e,E)=θ\angle(e,E)=\theta. We now compute lndet(Id+ϵPE)\ln\det(\operatorname{Id}+\epsilon P\mid E). We fix a useful basis of EE. Let vv be a unit vector in EE making angle (e,E)\angle(e,E) with ee. Then let e2,,ere_{2},...,e_{r} be orthonormal vectors in EE that are orthogonal to ee and vv. Then using the orthonormal basis v,e2,,erv,e_{2},...,e_{r} of EE, we see that

det(Id+ϵPE)=(Id+ϵP)v(Id+ϵP)e2(Id+ϵP)erver=(Id+ϵP)v,(Id+ϵP)v,\det(\operatorname{Id}+\epsilon P\mid E)=\frac{\|(\operatorname{Id}+\epsilon P)v\wedge(\operatorname{Id}+\epsilon P)e_{2}\wedge\cdots\wedge(\operatorname{Id}+\epsilon P)e_{r}\|}{\|v\wedge\cdots\wedge e_{r}\|}=\sqrt{\langle(\operatorname{Id}+\epsilon P)v,(\operatorname{Id}+\epsilon P)v\rangle},

by considering the determinant defining the wedge product. But then as Pv=cos(θ)ePv=\cos(\theta)e,

v+ϵcos(θ)e,v+ϵcos(θ)e=v,v+2ϵcosθv,e+ϵ2Pv,Pv=1+2ϵcos2θ+ϵ2cos2θ.\sqrt{\langle v+\epsilon\cos(\theta)e,v+\epsilon\cos(\theta)e\rangle}=\sqrt{\langle v,v\rangle+2\epsilon\cos\theta\langle v,e\rangle+\epsilon^{2}\langle Pv,Pv\rangle}=\sqrt{1+2\epsilon\cos^{2}\theta+\epsilon^{2}\cos^{2}\theta}.

Now, the Taylor approximation for ln1+x\ln\sqrt{1+x} at x=0x=0 is x/2x2/4+O(x3)x/2-x^{2}/4+O(x^{3}), so

lndet(Id+ϵPE)=ϵcos2(E,e)+ϵ2[cos2(E,e)2cos4(E,e)]+O(ϵ3).\ln\det(\operatorname{Id}+\epsilon P\mid E)=\epsilon\cos^{2}\angle(E,e)+\epsilon^{2}\left[\frac{\cos^{2}\angle(E,e)}{2}-\cos^{4}\angle(E,e)\right]+O(\epsilon^{3}).

Hence, as this estimate is uniform over EE, by integrating,

𝔾r,dlndet(Id+ϵPE)dE=ϵ𝔾r,dcos2(E,e)𝑑E+ϵ2𝔾r,d[cos2(E,e)2cos4(E,e)]𝑑E+O(ϵ3).\int_{\mathbb{G}_{r,d}}\ln\det(\operatorname{Id}+\epsilon P\mid E)\,dE=\epsilon\int_{\mathbb{G}_{r,d}}\cos^{2}\angle(E,e)\,dE+\epsilon^{2}\int_{\mathbb{G}_{r,d}}\left[\frac{\cos^{2}\angle(E,e)}{2}-\cos^{4}\angle(E,e)\right]\,dE+O(\epsilon^{3}).

So, we are reduced to calculating the coefficient of ϵ2\epsilon^{2} in the above expression. One may rewrite the above integrals in the following manner, using the definition of the Haar measure and the fact that 𝔾r,d\mathbb{G}_{r,d} is a homogeneous space of SO(d)\operatorname{SO}(d). Write x1,,xdx_{1},...,x_{d} for the restriction of the Euclidean coordinates to the sphere. By fixing the coordinate plane E0=e1,,erE_{0}=\langle e_{1},...,e_{r}\rangle and letting θ=((x1,,xd),E0)\theta=\angle((x_{1},...,x_{d}),E_{0}), we then have that cos(θ)=i=1rxi2\cos(\theta)=\sqrt{\sum_{i=1}^{r}x_{i}^{2}}.

𝔾r,dcos2(E,e)𝑑E\displaystyle\int_{\mathbb{G}_{r,d}}\cos^{2}\angle(E,e)\,dE =SOdcos2(gE0,e)𝑑g\displaystyle=\int_{\operatorname{SO}_{d}}\cos^{2}\angle(gE_{0},e)\,dg
=SOdcos2(E0,ge)𝑑g\displaystyle=\int_{\operatorname{SO}_{d}}\cos^{2}\angle(E_{0},ge)\,dg
=Sd1cos2(E0,x)𝑑x\displaystyle=\int_{S^{d-1}}\cos^{2}\angle(E_{0},x)\,dx
=Sd1i=1rxi2dx,\displaystyle=\int_{S^{d-1}}\sum_{i=1}^{r}x_{i}^{2}\,dx,

Similarly, fixing the plane E0=e1,,erE_{0}=\langle e_{1},...,e_{r}\rangle, we see that as cos4(E0,x)=(i=1rxi2)2\cos^{4}\angle(E_{0},x)=\left(\sum_{i=1}^{r}x_{i}^{2}\right)^{2}

𝔾r,dcos4(E,e)=Sd1(i=1rxi2)2𝑑x.\int_{\mathbb{G}_{r,d}}\cos^{4}\angle(E,e)=\int_{S^{d-1}}\left(\sum_{i=1}^{r}x_{i}^{2}\right)^{2}\,dx.

The evaluation of these integrals is immediate by using the following standard formulas:

Sd1x12𝑑x=1d,Sd1x14𝑑x=3d(d+2),Sd1x12x22𝑑x=1d(d+2).\int_{S^{d-1}}x_{1}^{2}\,dx=\frac{1}{d},\quad\int_{S^{d-1}}x_{1}^{4}\,dx=\frac{3}{d(d+2)},\quad\int_{S^{d-1}}x_{1}^{2}x_{2}^{2}\,dx=\frac{1}{d(d+2)}.
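These moments are classical; the following Monte Carlo sketch confirms them numerically for an arbitrary choice of dimension and sample size.

```python
# A Monte Carlo sanity check of the standard sphere moments used above:
#   int x_1^2 = 1/d,  int x_1^4 = 3/(d(d+2)),  int x_1^2 x_2^2 = 1/(d(d+2)),
# the integrals being over S^{d-1} with the normalized uniform measure.
import numpy as np

rng = np.random.default_rng(3)
d, N = 5, 1_000_000
v = rng.normal(size=(N, d))
v /= np.linalg.norm(v, axis=1, keepdims=True)   # uniform points on S^{d-1}

print("E[x1^2]     ", (v[:, 0] ** 2).mean(), " vs 1/d =", 1.0 / d)
print("E[x1^4]     ", (v[:, 0] ** 4).mean(), " vs 3/(d(d+2)) =", 3.0 / (d * (d + 2)))
print("E[x1^2 x2^2]", (v[:, 0] ** 2 * v[:, 1] ** 2).mean(), " vs 1/(d(d+2)) =", 1.0 / (d * (d + 2)))
```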

Thus we see that

𝔾r,dcos2(E,e)2cos4(E,e)dE=r2dr(r+2)d(d+2).\int_{\mathbb{G}_{r,d}}\frac{\cos^{2}\angle(E,e)}{2}-\cos^{4}\angle(E,e)\,dE=\frac{r}{2d}-\frac{r(r+2)}{d(d+2)}.

Thus

α2(P)=r2dr(r+2)d(d+2).\alpha_{2}(P)=\frac{r}{2d}-\frac{r(r+2)}{d(d+2)}.

Returning to b1,b2b_{1},b_{2}, the coefficients of (Tr(S))2(\operatorname{Tr}(S))^{2} and Tr(S2)\operatorname{Tr}(S^{2}), respectively, combining the cases of Id\operatorname{Id} and PP gives

r2=b1d2+b2d.-\frac{r}{2}=b_{1}d^{2}+b_{2}d.

and

r2dr(r+2)d(d+2)=b1+b2.\frac{r}{2d}-\frac{r(r+2)}{d(d+2)}=b_{1}+b_{2}.

We can now solve for b1b_{1} and b2b_{2} with respect to this basis of the space of conjugation invariant quadratic functionals. However, the computation will be more direct if we instead use a different basis and write α2(S)\alpha_{2}(S) as

b1(Tr(S))2+b2Tr((STrSdId)2),b_{1}(\operatorname{Tr}(S))^{2}+b_{2}\operatorname{Tr}\left(\left(S-\frac{\operatorname{Tr}S}{d}\operatorname{Id}\right)^{2}\right),

so that the second term is trace 0. Our computations from before now show that:

r2=b1d2+0,-\frac{r}{2}=b_{1}d^{2}+0,

and

r2dr(r+2)d(d+2)=b1+d1db2(=b1(Tr(P))2+b2Tr((PTrPdId)2)).\frac{r}{2d}-\frac{r(r+2)}{d(d+2)}=b_{1}+\frac{d-1}{d}b_{2}\left(=b_{1}(\operatorname{Tr}(P))^{2}+b_{2}\operatorname{Tr}\left((P-\frac{\operatorname{Tr}P}{d}\operatorname{Id})^{2}\right)\right).

The first equation implies that

b1=r2d2,b_{1}=-\frac{r}{2d^{2}},

The left hand side of the second equation of the pair is equal to

r(dr)d(d+2)r2d.\frac{r(d-r)}{d(d+2)}-\frac{r}{2d}.

This gives

b2=r(dr)(d1)(d+2)r2d.b_{2}=\frac{r(d-r)}{(d-1)(d+2)}-\frac{r}{2d}.

So, for symmetric LL, we have

(117) α2(S)=r2d2(Tr(S))2+(r(dr)(d1)(d+2)r2d)Tr((STrSdId)2).\alpha_{2}(S)=\frac{-r}{2d^{2}}(\operatorname{Tr}(S))^{2}+\left(\frac{r(d-r)}{(d-1)(d+2)}-\frac{r}{2d}\right)\operatorname{Tr}((S-\frac{\operatorname{Tr}S}{d}\operatorname{Id})^{2}).

Recall that we specialized to the case of a symmetric matrix, and that for a non-symmetric matrix there is another term. For LEnddL\in\operatorname{End}{\mathbb{R}^{d}}, setting J=(LLT)/2J=(L-L^{T})/2, as before,

α2(L)=α2(L+LT2)+rdTr(J22LJ).\alpha_{2}(L)=\alpha_{2}\left(\frac{L+L^{T}}{2}\right)+\frac{r}{d}\operatorname{Tr}\left(\frac{J^{2}}{2}-LJ\right).

To simplify this we compute that:

Tr(J22LJ)\displaystyle\operatorname{Tr}\left(\frac{J^{2}}{2}-LJ\right) =Tr(L2LLTLTL+(LT)28LLLT2)\displaystyle=\operatorname{Tr}\left(\frac{L^{2}-LL^{T}-L^{T}L+(L^{T})^{2}}{8}-L\frac{L-L^{T}}{2}\right)
=Tr(LLTL24).\displaystyle=\operatorname{Tr}\left(\frac{LL^{T}-L^{2}}{4}\right).

Write

S=L+LT2.S=\frac{L+L^{T}}{2}.

Observe that for an arbitrary matrix XX, Tr((X(TrX)/dId)2)=Tr(X2)(Tr(X))2/d\operatorname{Tr}((X-(\operatorname{Tr}X)/d\operatorname{Id})^{2})=\operatorname{Tr}(X^{2})-(\operatorname{Tr}(X))^{2}/d. Thus

r2d2(Tr(S))2r2dTr((S(TrS)/dId)2)+rdTr(LLTL24)\displaystyle-\frac{r}{2d^{2}}\left(\operatorname{Tr}(S)\right)^{2}-\frac{r}{2d}\operatorname{Tr}(\left(S-(\operatorname{Tr}S)/d\operatorname{Id})^{2}\right)+\frac{r}{d}\operatorname{Tr}(\frac{LL^{T}-L^{2}}{4})
=r2d2(Tr(S))2r2d(Tr(S2))r2d2(Tr(S))2+rd(Tr(LLTL24))\displaystyle=-\frac{r}{2d^{2}}\left(\operatorname{Tr}(S)\right)^{2}-\frac{r}{2d}\left(\operatorname{Tr}(S^{2})\right)-\frac{-r}{2d^{2}}(\operatorname{Tr}(S))^{2}+\frac{r}{d}\left(\operatorname{Tr}\left(\frac{LL^{T}-L^{2}}{4}\right)\right)
=r2d(Tr(S2))+rd(Tr(LLTL24))\displaystyle=-\frac{r}{2d}\left(\operatorname{Tr}(S^{2})\right)+\frac{r}{d}\left(\operatorname{Tr}\left(\frac{LL^{T}-L^{2}}{4}\right)\right)
=rd[12Tr(((L+LT)/2)2)+Tr(LLTL24)]\displaystyle=\frac{r}{d}\left[\frac{-1}{2}\operatorname{Tr}(((L+L^{T})/2)^{2})+\operatorname{Tr}(\frac{LL^{T}-L^{2}}{4})\right]
=rd[12(Tr(L2+(LT)2+2LLT4))+Tr(LLTL24)]\displaystyle=\frac{r}{d}\left[\frac{-1}{2}(\operatorname{Tr}(\frac{L^{2}+(L^{T})^{2}+2LL^{T}}{4}))+\operatorname{Tr}(\frac{LL^{T}-L^{2}}{4})\right]
=r2dTr(L2).\displaystyle=-\frac{r}{2d}\operatorname{Tr}(L^{2}).

From before, we have that

α2(L)=r2d2(Tr(S))2+(r(dr)(d1)(d+2)r2d)Tr((STrSdId)2)+rdTr(LLTL24).\alpha_{2}(L)=-\frac{r}{2d^{2}}(\operatorname{Tr}(S))^{2}+\left(\frac{r(d-r)}{(d-1)(d+2)}-\frac{r}{2d}\right)\operatorname{Tr}((S-\frac{\operatorname{Tr}S}{d}\operatorname{Id})^{2})+\frac{r}{d}\operatorname{Tr}(\frac{LL^{T}-L^{2}}{4}).

So substituting the previous calculation we obtain:

α2(L)=r2dTr(L2)+(r(dr)(d1)(d+2))Tr((L+LT2TrLdId)2),\alpha_{2}(L)=-\frac{r}{2d}\operatorname{Tr}(L^{2})+\left(\frac{r(d-r)}{(d-1)(d+2)}\right)\operatorname{Tr}\left(\left(\frac{L+L^{T}}{2}-\frac{\operatorname{Tr}L}{d}\operatorname{Id}\right)^{2}\right),

which is the desired formula. ∎

We have now calculated α1\alpha_{1} and α2\alpha_{2}. This concludes the proof of Proposition 62. ∎
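Although it is not needed for the proof, the following Monte Carlo sketch compares the expansion of Proposition 62 with a direct average of lndet(Id+tLE)\ln\det(\operatorname{Id}+tL\mid E) over Haar-random planes; the matrix LL, the dimensions, the step tt, and the sample size are arbitrary choices.

```python
# A rough Monte Carlo check of the second order expansion in Proposition 62,
# with d = 5, r = 2.  Haar-random 2-planes are obtained by Gram-Schmidt on
# Gaussian vectors, and ln det(Id + tL | E) is computed from the Gram matrix
# of the images of an orthonormal basis of E.
import numpy as np

rng = np.random.default_rng(5)
d, r, t, N = 5, 2, 0.1, 500_000
L = np.array([[ 0.4,  0.3,  0.0,  0.0,  0.0],
              [-0.3, -0.2,  0.1,  0.0,  0.0],
              [ 0.0,  0.0,  0.5,  0.2,  0.0],
              [ 0.0,  0.0, -0.2, -0.4,  0.0],
              [ 0.0,  0.0,  0.0,  0.0,  0.1]])
K = (L + L.T) / 2.0 - (np.trace(L) / d) * np.eye(d)
M = np.eye(d) + t * L

# orthonormal pairs (v1, v2) spanning Haar-random 2-planes
v1 = rng.normal(size=(N, d)); v1 /= np.linalg.norm(v1, axis=1, keepdims=True)
v2 = rng.normal(size=(N, d)); v2 -= np.sum(v2 * v1, axis=1, keepdims=True) * v1
v2 /= np.linalg.norm(v2, axis=1, keepdims=True)

w1, w2 = v1 @ M.T, v2 @ M.T                       # images (Id + tL)v_i
g11 = np.sum(w1 * w1, axis=1); g22 = np.sum(w2 * w2, axis=1)
g12 = np.sum(w1 * w2, axis=1)
mc = np.mean(0.5 * np.log(g11 * g22 - g12 ** 2))  # average of ln det(Id+tL | E)

pred1 = (r / d) * t * np.trace(L)
pred2 = pred1 + t ** 2 * (-(r / (2 * d)) * np.trace(L @ L)
                          + r * (d - r) / ((d + 2) * (d - 1)) * np.trace(K @ K))
print(f"Monte Carlo {mc:.6f}   first order {pred1:.6f}   second order {pred2:.6f}")
# The second order prediction should lie noticeably closer to the Monte Carlo
# average; the remaining gap is of size O(t^3) plus Monte Carlo error.
```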

We also use a first order Taylor expansion with respect to the metric.

Proposition 63.

Let Λr(G)\Lambda_{r}(G) be defined for symmetric matrices GG by

Λr(G)𝔾r,dlndet(Id,Id,Id+GE)dE.\Lambda_{r}(G)\coloneqq\int_{\mathbb{G}_{r,d}}\ln\det(\operatorname{Id},\operatorname{Id},\operatorname{Id}+G\mid E)\,dE.

Then Λr(G)\Lambda_{r}(G) admits the following Taylor development:

Λr(G)=r2dTrG+O(G2).\Lambda_{r}(G)=\frac{r}{2d}\operatorname{Tr}G+O(\|G\|^{2}).
Proof.

The proof of this proposition is substantially similar to that of the previous proposition. Let α1\alpha_{1} denote the first term in the Taylor expansion. Note that if UU is an isometry then Λr(UTGU)=Λr(G)\Lambda_{r}(U^{T}GU)=\Lambda_{r}(G). Thus α1\alpha_{1} is invariant under conjugation by isometries. Thus by conjugating by an orthogonal matrix, we are reduced to the case where GG is a diagonal matrix. As before, we see that α1(D)\alpha_{1}(D) is a multiple of Tr(D)\operatorname{Tr}(D) as Tr\operatorname{Tr} spans the space of linear forms on d\mathbb{R}^{d} that are invariant under permutation of coordinates.

Thus it suffices to calculate the derivative in the case of D=IdD=\operatorname{Id}. So, we see that

α1(Id)=ddϵElndet(Id,Id,Id+ϵIdE)dE.\alpha_{1}(\operatorname{Id})=\frac{d}{d\epsilon}\int_{E}\ln\det(\operatorname{Id},\operatorname{Id},\operatorname{Id}+\epsilon\operatorname{Id}\mid E)\,dE.

The integrand is equal to ln(1+ϵ)r\ln\sqrt{(1+\epsilon)^{r}} on every plane EE, so the integral is as well. Thus the derivative at ϵ=0\epsilon=0 is r/2r/2 and so

α1(Id)=r2=r2dTr(Id).\alpha_{1}(\operatorname{Id})=\frac{r}{2}=\frac{r}{2d}\operatorname{Tr}(\operatorname{Id}).

And so the result follows. ∎

References

  • [Arn13] Ludwig Arnold, Random dynamical systems, Springer, 2013.
  • [BL76] Jöran Bergh and Jörgen Löfström, Interpolation spaces: an introduction, Springer, 1976.
  • [BO18] Adam Bouland and Maris Ozols, Trading inverses for an irrep in the Solovay-Kitaev theorem, 13th Conference on the Theory of Quantum Computation, Communication and Cryptography, 2018.
  • [But17] Clark Butler, Characterizing symmetric spaces by their Lyapunov spectra, arXiv preprint arXiv:1709.08066 (2017).
  • [CE75] Jeff Cheeger and David G. Ebin, Comparison theorems in Riemannian geometry, American Mathematical Society (1975).
  • [DF19] Danijela Damjanović and Bassam Fayad, On local rigidity of partially hyperbolic affine k\mathbb{Z}^{k} actions, Journal für die reine und angewandte Mathematik (2019), no. 751, 1–26.
  • [DN06] Christopher Dawson and Michael Nielsen, The Solovay-Kitaev algorithm, Quantum Information & Computation 6 (2006), no. 1, 81–95.
  • [DeW19] Jonathan DeWitt, Local Lyapunov spectrum rigidity of nilmanifold automorphisms, arXiv preprint arXiv:1911.07717 (2019).
  • [Dol02] Dmitry Dolgopyat, On mixing properties of compact group extensions of hyperbolic systems, Israel Journal of Mathematics 130 (2002), no. 1, 157–205.
  • [DK07] Dmitry Dolgopyat and Raphaël Krikorian, On simultaneous linearization of diffeomorphisms of the sphere, Duke Mathematical Journal 136 (2007), no. 3, 475–505.
  • [Fie99] Michael Field, Generating sets for compact semisimple Lie groups, Proceedings of the American Mathematical Society 127 (1999), no. 11, 3361–3365.
  • [FK09] Bassam Fayad and Kostantin Khanin, Smooth linearization of commuting circle diffeomorphisms, Annals of Mathematics (2009), 961–980.
  • [GKS18] Andrey Gogolev, Boris Kalinin, and Victoria Sadovskaya, Local rigidity of Lyapunov spectrum for toral automorphisms, Israel J. Math (2018).
  • [Gog19] Andrey Gogolev, Rigidity lecture notes, https://people.math.osu.edu/gogolyev.1/index_files/CIRM_notes_all.pdf, 2019.
  • [GRH19] Andrey Gogolev and Federico Rodriguez Hertz, Smooth rigidity for very non-algebraic expanding maps, arXiv preprint arXiv:1911.07751 (2019).
  • [Ham82] Richard Hamilton, The inverse function theorem of Nash and Moser, Bulletin of the American Mathematical Society 7 (1982), no. 1, 65–222.
  • [Hel01] Sigurdur Helgason, Differential geometry, Lie groups, and symmetric spaces, American Mathematical Society, 2001.
  • [Hör76] Lars Hörmander, The boundary problems of physical geodesy, Archive for Rational Mechanics and Analysis 62 (1976), no. 1, 1–52.
  • [KH97] Anatole Katok and Boris Hasselblatt, Introduction to the modern theory of dynamical systems, vol. 54, Cambridge University Press, 1997.
  • [Kif86] Yuri Kifer, Ergodic theory of random transformations, Birkhäuser, 1986.
  • [LRK09] Michael Lai, David Rubin, and Erhard Krempl, Introduction to continuum mechanics, Butterworth-Heinemann, 2009.
  • [Lee18] John Lee, Introduction to Riemannian geometry (2nd ed.), Springer, 2018.
  • [Mal12] Dominique Malicet, On simultaneous linearization of diffeomorphisms of 𝕋2\mathbb{T}^{2}.
  • [Mal20] Dominique Malicet, Lyapunov exponent of random dynamical systems on the circle, arXiv preprint arXiv:2006.15397 (2020).
  • [Mos90] Jürgen Moser, On commuting circle mappings and simultaneous diophantine approximations, Mathematische Zeitschrift 205 (1990), no. 1, 105–121.
  • [Pet21] Boris Petković, Classification of perturbations of Diophantine m\mathbb{Z}^{m} actions on tori of arbitrary dimension, Regular and Chaotic Dynamics 26 (2021), no. 6, 700–716.
  • [Sha01] Krishnan Shankar, Isometry groups of homogeneous spaces with positive sectional curvature, Differential Geometry and its Applications 14 (2001), no. 1, 57–78.
  • [SY19] Radu Saghin and Jiagang Yang, Lyapunov exponents and rigidity of Anosov automorphisms and skew products, Advances in Mathematics 355 (2019), 106764.
  • [Wal18] Nolan R. Wallach, Harmonic analysis on homogeneous spaces, Dover, 2018.
  • [Wol72] Joseph Wolf, Spaces of constant curvature, American Mathematical Society, 1972.