Curvatures of Stiefel manifolds with deformation metrics

Du Nguyen [email protected]
Abstract.

We compute curvatures of a family of tractable metrics on Stiefel manifolds, introduced recently by Hüper, Markina and Silva Leite, which includes the well-known embedded and canonical metrics on Stiefel manifolds as special cases. The metrics could be identified with the Cheeger deformation metrics. We identify parameter values in the family to make a Stiefel manifold an Einstein manifold and show Stiefel manifolds always carry an Einstein metric. We analyze the sectional curvature range and identify the parameter range where the manifold has non-negative sectional curvature. We provide the exact sectional curvature range when the number of columns in a Stiefel matrix is $2$, and a conjectural range for other cases. We derive the formulas from two approaches, one from a global curvature formula derived in our recent work, another using curvature formulas for left-invariant metrics. The second approach leads to curvature formulas for Cheeger deformation metrics on normal homogeneous spaces.

Key words and phrases:
Optimization, Riemannian geometry, Riemannian curvature, Einstein manifold, Stiefel, Jacobi field, Machine Learning
2010 Mathematics Subject Classification:
Primary 65K10, 58C05, 49Q12, 53C25, 57Z20, 57Z25, 68T05

1. Introduction

In a recent paper [10], we derived global formulas to compute the curvature of a manifold $\mathcal{M}$, embedded differentiably in a Euclidean space $\mathcal{E}$, with metric defined by an operator $\mathsf{g}$ from $\mathcal{M}$ to the space of positive-definite operators on $\mathcal{E}$. The formulas have forms similar to the classical formula for the curvature in local coordinates. While we provided a few applications of those formulas in that paper, here we show that they can be used to compute the curvatures of a family of manifolds important in both theory and applications.

The purpose of this paper is to compute and analyze curvatures of a Stiefel manifold with the family of metrics defined in [6]. It turns out this family is the same as the family of metrics arising from the Cheeger deformation, which has been one of the main tools to construct non-negative curvature metrics [2, 5, 16]. Thus, the curvatures can be computed in two ways: one uses our formula with Christoffel functions, which is very similar to the local-coordinate formula; the other uses the relationship with the Cheeger deformation. In the second method, the Stiefel manifold is identified with a quotient manifold of the special orthogonal group with a left-invariant metric. Using a result of Michor [7] and the Euler-Poisson-Arnold framework [1], we compute the $(1,3)$-curvature tensor of the Cheeger deformation of a normal homogeneous space. The second approach provides independent confirmation of our curvature formulas. The first method probably requires lengthier calculation; however, it is conceptually straightforward and can be implemented symbolically.

Recall that for two positive integers $p<n$, the real Stiefel manifold $\mathrm{St}_{p,n}$ consists of real orthogonal matrices $Y$ of size $n\times p$. If $\alpha_{1}$, $\alpha_{0}$ are two positive numbers, the metric in [6] can be reparameterized so that the inner product of two tangent vectors $\xi,\eta$ on $\mathrm{St}_{p,n}$ at $Y\in\mathrm{St}_{p,n}$ is given by $\alpha_{0}\operatorname{Tr}(\xi^{\mathsf{T}}\eta)+(\alpha_{1}-\alpha_{0})\operatorname{Tr}(\xi^{\mathsf{T}}YY^{\mathsf{T}}\eta)$. Set $\alpha=\alpha_{1}/\alpha_{0}$; up to scaling we can take $\alpha_{0}=1$. This family contains both well-known metrics on Stiefel manifolds: the embedded metric ($\alpha=1$, where the metric is induced from the embedding in $\mathbb{R}^{n\times p}$) and the canonical metric ($\alpha=\frac{1}{2}$, in which case $\mathrm{St}_{p,n}$ is normal homogeneous). It will be shown in proposition 7 that if $\operatorname{SO}(n)$ is equipped with a Cheeger deformation metric with deformation parameter $2\alpha$ (reviewed in section 5) from the right-multiplication action of $\operatorname{SO}(p)$ embedded diagonally, then $\operatorname{SO}(n)/\operatorname{SO}(n-p)$ with the quotient metric can be identified with $\mathrm{St}_{p,n}$ with the metric just described.

While a framework to compute curvatures for Cheeger deformation metrics is available, explicit formulas and detailed analysis are not yet known to the best of our knowledge (note [13] is an early paper dealing with the embedded metric). We provide formulas for the Riemannian, Ricci, scalar, and sectional curvatures of the Stiefel manifold equipped with this family of metrics. We show the sectional curvature range always contains a specific interval, which is likely to be the full curvature range for metrics in the family. The ends of the interval are piecewise smooth functions described in table 2. In particular, except for some special cases, for the embedded metric on the Stiefel manifold, we show the curvature range contains the interval $[-\frac{1}{2},1]$; thus it can have negative curvature, in contrast to the canonical metric, which has range $[0,\frac{5}{4}]$.

Specifically, $\mathrm{St}_{2,3}$ has positive curvature for $\alpha<\frac{2}{3}$, non-negative curvature for $\alpha=\frac{2}{3}$ and both negative and positive curvature for $\alpha>\frac{2}{3}$. With $n>3$, the Stiefel manifold $\mathrm{St}_{2,n}$ has non-negative curvature for $\alpha\leq\frac{2}{3}$, and both negative and positive curvature for $\alpha>\frac{2}{3}$, and we identify the exact sectional curvature range in this case. For $p\geq 3$, we show $\mathrm{St}_{p,n}$ has non-negative curvature for $\alpha\leq\frac{1}{2}$ and both negative and positive curvature otherwise. This agrees with [5] and we actually show the curvature range contains negative values in the indicated intervals.

We also show the Stiefel manifold always has an Einstein metric, and when $p>2$, there are two metrics in the family (up to a scaling factor) that make the Stiefel manifold an Einstein manifold. We note this may be the same metric as in [15].

For notations, if $n$ and $m$ are two positive integers, we denote by $\mathbb{R}^{n\times m}$ the space of $n\times m$ matrices over $\mathbb{R}$, the field of real numbers. We denote by $\mathfrak{o}(p)$ the space of antisymmetric matrices in $\mathbb{R}^{p\times p}$. The transpose of a matrix or the adjoint of an operator is denoted by $\mathsf{T}$. Working on a manifold, say $\mathcal{M}$, we denote by $\operatorname{D}_{\xi}F$ the directional (Lie) derivative of a scalar/vector/operator-valued function $F$ on $\mathcal{M}$ in direction $\xi$ (either a tangent vector defined at a point $x\in\mathcal{M}$, or a vector field on $\mathcal{M}$). If $\mathcal{E}$ is a Euclidean space (inner product space with a positive-definite inner product), the space of linear operators on $\mathcal{E}$ is denoted by $\mathfrak{L}(\mathcal{E},\mathcal{E})$. Similarly, we denote by $\mathfrak{L}(\mathcal{E}\otimes\mathcal{E},\mathcal{E})$ the space of bilinear forms on $\mathcal{E}$ with values in $\mathcal{E}$. For two positive integers $n$ and $p$, the Stiefel manifold $\mathrm{St}_{p,n}$ is the space of matrices $Y\in\mathbb{R}^{n\times p}$ satisfying $Y^{\mathsf{T}}Y=\operatorname{I}_{p}$. The Frobenius norm is denoted by $\|\cdot\|_{F}$.

2. Curvature formulas for embedded manifolds with metric operators

Let $\mathcal{M}\subset\mathcal{E}$ be a differentiable embedding, where $\mathcal{E}$ is a Euclidean space with a given inner product $\langle\cdot,\cdot\rangle_{\mathcal{E}}$ and $\mathcal{M}$ is a differentiable submanifold. Let $\mathsf{g}$ be an operator-valued function from $\mathcal{M}$ to $\mathfrak{L}(\mathcal{E},\mathcal{E})$ such that $\mathsf{g}$ is positive-definite. Then $\mathsf{g}$ induces a Riemannian metric on $\mathcal{M}$, where the inner product of two tangent vectors $\xi,\eta$ at a point $x\in\mathcal{M}$ is defined by $\langle\xi,\mathsf{g}_{x}\eta\rangle_{\mathcal{E}}$. Here, each tangent space $T_{x}\mathcal{M}$ is identified with a subspace of $\mathcal{E}$ thanks to the embedding, so $\xi,\eta$ are considered as elements of $\mathcal{E}$, while $\mathsf{g}_{x}$ denotes the evaluation of the operator $\mathsf{g}$ at $x$.

We call $(\mathcal{M},\mathsf{g},\mathcal{E})$ an embedded ambient structure. The embedding allows us to identify vector fields on $\mathcal{M}$ with $\mathcal{E}$-valued functions, thus we can take directional derivatives. A Christoffel function is a function $\Gamma$ from $\mathcal{M}$ with value in $\mathfrak{L}(\mathcal{E}\otimes\mathcal{E},\mathcal{E})$, the space of $\mathcal{E}$-bilinear forms, such that for two vector fields $\mathtt{X},\mathtt{Y}$ on $\mathcal{M}$, the Levi-Civita connection on $\mathcal{M}$ is given by

$$\nabla_{\mathtt{X}}\mathtt{Y}=\operatorname{D}_{\mathtt{X}}\mathtt{Y}+\Gamma(\mathtt{X},\mathtt{Y})$$

In [10] we proved the following two equivalent curvature formulas for three tangent vectors $\xi,\eta,\phi$:

(2.1)
$$\begin{gathered}
\mathrm{R}^{\mathcal{M}}_{\xi,\eta}\phi=-(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)+(\operatorname{D}_{\eta}\Gamma)(\xi,\phi)-\Gamma(\xi,\Gamma(\eta,\phi))+\Gamma(\eta,\Gamma(\xi,\phi))\\
\mathrm{R}^{\mathcal{M}}_{\xi,\eta}\phi=-(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)+(\operatorname{D}_{\eta}\Gamma)(\xi,\phi)-\Gamma(\Gamma(\phi,\eta),\xi)+\Gamma(\Gamma(\phi,\xi),\eta)
\end{gathered}$$

where, for example, $\operatorname{D}_{\xi}\Gamma$ denotes the directional derivative of $\Gamma$, considered as an operator-valued function, in the direction $\xi$. The curvature for three vector fields $\mathtt{X},\mathtt{Y},\mathtt{Z}$ is defined in the convention

$$\mathrm{R}^{\mathcal{M}}_{\mathtt{X}\mathtt{Y}}\mathtt{Z}=\nabla_{[\mathtt{X},\mathtt{Y}]}\mathtt{Z}-\nabla_{\mathtt{X}}\nabla_{\mathtt{Y}}\mathtt{Z}+\nabla_{\mathtt{Y}}\nabla_{\mathtt{X}}\mathtt{Z}$$

3. Curvatures of the Stiefel manifold

In the following, $p<n$ are two positive integers. In [6], the authors introduced a family of metrics on the Stiefel manifold $\mathrm{St}_{p,n}$ of orthogonal matrices in $\mathbb{R}^{n\times p}$ (thus $Y^{\mathsf{T}}Y=\operatorname{I}_{p}$). We introduced a different parameterization in [9]. The metric depends on two positive real numbers $\alpha_{0}$, $\alpha_{1}$ with ratio $\alpha=\frac{\alpha_{1}}{\alpha_{0}}$. In the convention of section 2, we have $\mathcal{M}:=\mathrm{St}_{p,n}\subset\mathcal{E}:=\mathbb{R}^{n\times p}$, where the base inner product on $\mathcal{E}$ is the Frobenius inner product, thus $\langle\omega_{1},\omega_{2}\rangle_{\mathcal{E}}=\operatorname{Tr}(\omega_{1}\omega_{2}^{\mathsf{T}})$ for $\omega_{1},\omega_{2}\in\mathcal{E}$. Consider the metric operator $\mathsf{g}\omega=\mathsf{g}_{Y}\omega:=\alpha_{0}\omega+(\alpha_{1}-\alpha_{0})YY^{\mathsf{T}}\omega$, for $Y\in\mathrm{St}_{p,n}$, with inverse $\mathsf{g}^{-1}\omega=\alpha^{-1}_{0}\omega+(\alpha^{-1}_{1}-\alpha^{-1}_{0})YY^{\mathsf{T}}\omega$. The inner product on $\mathcal{E}$ induced by $\mathsf{g}$ is $\langle\omega_{1},\mathsf{g}_{Y}\omega_{2}\rangle_{\mathcal{E}}=\alpha_{0}\operatorname{Tr}\omega_{1}\omega_{2}^{\mathsf{T}}+(\alpha_{1}-\alpha_{0})\operatorname{Tr}\omega_{1}^{\mathsf{T}}YY^{\mathsf{T}}\omega_{2}$, and this induces a Riemannian metric on $\mathrm{St}_{p,n}$.
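The metric operator and its stated inverse are easy to check numerically. The following is a minimal NumPy sketch, not part of the original derivation; the function names `metric`, `metric_inv`, `inner` and the sample sizes are illustrative assumptions.

```python
import numpy as np

def metric(Y, omega, a0, a1):
    """Apply g_Y: omega -> a0*omega + (a1 - a0) * Y Y^T omega."""
    return a0 * omega + (a1 - a0) * Y @ (Y.T @ omega)

def metric_inv(Y, omega, a0, a1):
    """Apply the stated inverse: omega -> omega/a0 + (1/a1 - 1/a0) * Y Y^T omega."""
    return omega / a0 + (1.0 / a1 - 1.0 / a0) * Y @ (Y.T @ omega)

def inner(Y, w1, w2, a0, a1):
    """Inner product <w1, g_Y w2> with the Frobenius base inner product."""
    return np.sum(w1 * metric(Y, w2, a0, a1))

rng = np.random.default_rng(0)
n, p, a0, a1 = 5, 3, 1.0, 0.5   # illustrative values; a1/a0 = 1/2 is the canonical metric
Y, _ = np.linalg.qr(rng.standard_normal((n, p)))   # n x p with orthonormal columns
w1, w2 = rng.standard_normal((2, n, p))            # two ambient matrices

# g^{-1}(g(w1)) should return w1.
roundtrip_err = np.linalg.norm(metric_inv(Y, metric(Y, w1, a0, a1), a0, a1) - w1)
# The operator form should agree with the trace form of the inner product.
trace_form = a0 * np.trace(w1 @ w2.T) + (a1 - a0) * np.trace(w1.T @ Y @ Y.T @ w2)
form_err = abs(inner(Y, w1, w2, a0, a1) - trace_form)
```

Both `roundtrip_err` and `form_err` should be at the level of floating-point round-off; the inverse formula relies on $Y^{\mathsf{T}}Y=\operatorname{I}_{p}$, so it only holds for $Y$ on the manifold.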

A geodesic equation for this metric was derived in [6], and we provided a different derivation of a Christoffel function $\Gamma$ in [9]. We will give another derivation of $\Gamma$ in proposition 7 to clarify the concepts and keep the material reasonably independent. For an orthogonal matrix $Y\in\mathrm{St}_{p,n}$ and $\omega,\omega_{1},\omega_{2}\in\mathbb{R}^{n\times p}$, a Christoffel function is

(3.1)
$$\Gamma(\omega_{1},\omega_{2})=\frac{1}{2}Y(\omega_{1}^{\mathsf{T}}\omega_{2}+\omega_{2}^{\mathsf{T}}\omega_{1})+(1-\alpha)(\operatorname{I}_{n}-YY^{\mathsf{T}})(\omega_{1}\omega_{2}^{\mathsf{T}}+\omega_{2}\omega_{1}^{\mathsf{T}})Y$$
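As a small illustration of eq. 3.1, the sketch below implements the Christoffel function with NumPy and checks two properties that follow directly from the formula: symmetry in its two arguments, and $Y^{\mathsf{T}}\Gamma(\omega_{1},\omega_{2})=\frac{1}{2}(\omega_{1}^{\mathsf{T}}\omega_{2}+\omega_{2}^{\mathsf{T}}\omega_{1})$, since $Y^{\mathsf{T}}(\operatorname{I}_{n}-YY^{\mathsf{T}})=0$. The function name `christoffel` and the sample values are assumptions made for the example.

```python
import numpy as np

def christoffel(Y, w1, w2, alpha):
    """Christoffel function (3.1), with alpha = alpha_1 / alpha_0."""
    n = Y.shape[0]
    sym_small = w1.T @ w2 + w2.T @ w1      # p x p, symmetric
    sym_big = w1 @ w2.T + w2 @ w1.T        # n x n, symmetric
    return 0.5 * Y @ sym_small + (1 - alpha) * (np.eye(n) - Y @ Y.T) @ sym_big @ Y

rng = np.random.default_rng(1)
n, p, alpha = 6, 3, 0.8                    # illustrative sizes and parameter
Y, _ = np.linalg.qr(rng.standard_normal((n, p)))
w1, w2 = rng.standard_normal((2, n, p))

G12 = christoffel(Y, w1, w2, alpha)
# Gamma is symmetric in its two arguments.
sym_err = np.linalg.norm(G12 - christoffel(Y, w2, w1, alpha))
# The second term of (3.1) is annihilated by Y^T.
proj_err = np.linalg.norm(Y.T @ G12 - 0.5 * (w1.T @ w2 + w2.T @ w1))
```

With $\omega_{1}=\omega_{2}=\xi$ the second identity reads $Y^{\mathsf{T}}\Gamma(\xi,\xi)=\xi^{\mathsf{T}}\xi$, which is the condition obtained by differentiating $Y^{\mathsf{T}}Y=\operatorname{I}_{p}$ twice along a curve satisfying $\ddot{Y}+\Gamma(\dot{Y},\dot{Y})=0$.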

We can extend $Y$ to a full basis $(Y|Y_{\perp})$ of $\mathbb{R}^{n}$ by adding $Y_{\perp}$, an orthogonal complement. Thus, $Y_{\perp}Y_{\perp}^{\mathsf{T}}=\operatorname{I}_{n}-YY^{\mathsf{T}}$, $Y_{\perp}^{\mathsf{T}}Y_{\perp}=\operatorname{I}_{n-p}$, $Y^{\mathsf{T}}Y_{\perp}=0$, $Y_{\perp}^{\mathsf{T}}Y=0$. Any matrix $\omega\in\mathcal{E}=\mathbb{R}^{n\times p}$ can be represented in this basis as $\omega=YA+Y_{\perp}B$ with $A\in\mathbb{R}^{p\times p}$, $B\in\mathbb{R}^{(n-p)\times p}$, and $\omega$ is a tangent vector to $\mathrm{St}_{p,n}$ at $Y$ if and only if $A$ is antisymmetric, $A\in\mathfrak{o}(p)$, or equivalently $Y^{\mathsf{T}}\omega+\omega^{\mathsf{T}}Y=0$.
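The decomposition $\omega=YA+Y_{\perp}B$ can be sketched numerically as follows. This is an illustration only: obtaining $Y$ and $Y_{\perp}$ from one QR factorization, and the helper names `decompose` and `tangent`, are choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 6, 3                                     # illustrative sizes
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y, Yperp = Q[:, :p], Q[:, p:]                   # Y is n x p, Yperp is n x (n - p)

def decompose(omega):
    """Coordinates (A, B) of omega = Y A + Yperp B; A is p x p, B is (n - p) x p."""
    return Y.T @ omega, Yperp.T @ omega

def tangent(A, B):
    """Ambient matrix Y A + Yperp B; it is tangent at Y when A is antisymmetric."""
    return Y @ A + Yperp @ B

M = rng.standard_normal((p, p))
A = M - M.T                                     # antisymmetric block
B = rng.standard_normal((n - p, p))             # arbitrary block
xi = tangent(A, B)

A_back, B_back = decompose(xi)
recon_err = np.linalg.norm(A_back - A) + np.linalg.norm(B_back - B)
tangency_err = np.linalg.norm(Y.T @ xi + xi.T @ Y)            # zero for tangent vectors
complement_err = np.linalg.norm(Yperp @ Yperp.T - (np.eye(n) - Y @ Y.T))
```

Replacing `A` by a matrix with a nonzero symmetric part makes `tangency_err` nonzero, which matches the stated tangency criterion.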

For two tangent vectors $\xi$ and $\eta$ at a point on the manifold, denote by $\langle\cdot,\cdot\rangle_{\mathsf{g}}$ and $\|\cdot\|_{\mathsf{g}}$ the inner product and the norm defined by a metric operator $\mathsf{g}$. We will denote the wedge, the sectional curvature numerator, and the sectional curvature by

(3.2)
$$\begin{gathered}
\|\xi\wedge\eta\|_{\mathsf{g}}^{2}=\|\xi\|_{\mathsf{g}}^{2}\|\eta\|_{\mathsf{g}}^{2}-\langle\xi,\eta\rangle_{\mathsf{g}}^{2}\\
\hat{\mathcal{K}}(\xi,\eta)=\langle\mathrm{R}^{\mathcal{M}}_{\xi,\eta}\xi,\eta\rangle_{\mathsf{g}}\\
\mathcal{K}(\xi,\eta)=\frac{\hat{\mathcal{K}}(\xi,\eta)}{\|\xi\wedge\eta\|_{\mathsf{g}}^{2}}
\end{gathered}$$
Theorem 3.1.

Represent three tangent vectors $\xi,\eta,\phi\in\mathbb{R}^{n\times p}$ at $Y\in\mathrm{St}_{p,n}$ in an orthogonal basis $(Y|Y_{\perp})$ of $\mathbb{R}^{n}$ as $\xi=YA_{1}+Y_{\perp}B_{1}$, $\eta=YA_{2}+Y_{\perp}B_{2}$, $\phi=YA_{3}+Y_{\perp}B_{3}$, where $A_{1},A_{2},A_{3}\in\mathfrak{o}(p)$ and $B_{1},B_{2},B_{3}\in\mathbb{R}^{(n-p)\times p}$. Then the Riemannian curvature tensor is $\mathrm{R}^{\mathcal{M}}_{\xi\eta}\phi=YA_{R}+Y_{\perp}B_{R}$ with $A_{R}\in\mathfrak{o}(p)$, $B_{R}\in\mathbb{R}^{(n-p)\times p}$, where

(3.3)
$$\begin{gathered}
A_{R}=Y^{\mathsf{T}}\mathrm{R}^{\mathcal{M}}_{\xi\eta}\phi=\frac{1-2\alpha}{4}(A_{1}B_{3}^{\mathsf{T}}B_{2}-A_{2}B_{3}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{3}A_{2}+B_{2}^{\mathsf{T}}B_{3}A_{1})+\\
\frac{1-\alpha}{2}(A_{3}B_{1}^{\mathsf{T}}B_{2}-A_{3}B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2}A_{3}+B_{2}^{\mathsf{T}}B_{1}A_{3})+\\
\frac{1}{4}([[A_{1},A_{2}],A_{3}]-A_{1}B_{2}^{\mathsf{T}}B_{3}+A_{2}B_{1}^{\mathsf{T}}B_{3}+B_{3}^{\mathsf{T}}B_{1}A_{2}-B_{3}^{\mathsf{T}}B_{2}A_{1})
\end{gathered}$$

(3.4)
$$\begin{gathered}
B_{R}=Y_{\perp}^{\mathsf{T}}\mathrm{R}^{\mathcal{M}}_{\xi\eta}\phi=\frac{2\alpha^{2}-\alpha}{2}(B_{1}A_{3}A_{2}-B_{2}A_{3}A_{1})+\\
(\alpha^{2}-\alpha)(B_{3}A_{1}A_{2}-B_{3}A_{2}A_{1})+(1-\alpha)(B_{3}B_{1}^{\mathsf{T}}B_{2}-B_{3}B_{2}^{\mathsf{T}}B_{1})+\\
\frac{\alpha-2}{2}(B_{1}B_{2}^{\mathsf{T}}B_{3}-B_{2}B_{1}^{\mathsf{T}}B_{3})+\frac{\alpha}{2}(B_{1}A_{2}A_{3}-B_{1}B_{3}^{\mathsf{T}}B_{2}-B_{2}A_{1}A_{3}+B_{2}B_{3}^{\mathsf{T}}B_{1})
\end{gathered}$$

If $p>1$, the Ricci and scalar curvatures are given by:

(3.5)
$$\mathrm{Ric}(\xi,\eta)=\left(\frac{2-p}{4}+(p-n)\alpha^{2}\right)\operatorname{Tr}(A_{1}A_{2})+[(1-p)\alpha+(n-2)]\operatorname{Tr}(B_{1}^{\mathsf{T}}B_{2})$$

(3.6)
$$\mathrm{Scl}(Y)=((1-p)\alpha+n-2)(n-p)p+\left((n-p)\alpha+\frac{p-2}{4\alpha}\right)\frac{p(p-1)}{2}$$

The sectional curvature numerator $\hat{\mathcal{K}}$ is computed from either of the following:

(3.7)
$$\begin{gathered}
\hat{\mathcal{K}}=\operatorname{Tr}\left(\frac{2-3\alpha}{2}B_{2}^{\mathsf{T}}B_{1}B_{1}^{\mathsf{T}}B_{2}+\frac{3\alpha-4}{2}B_{2}^{\mathsf{T}}B_{1}B_{2}^{\mathsf{T}}B_{1}+B_{2}^{\mathsf{T}}B_{2}B_{1}^{\mathsf{T}}B_{1}-\frac{\alpha}{4}[A_{1},A_{2}]^{2}\right)\\
+\alpha\operatorname{Tr}\left((4\alpha-3)A_{1}A_{2}B_{2}^{\mathsf{T}}B_{1}+(3-2\alpha)A_{1}A_{2}B_{1}^{\mathsf{T}}B_{2}-\alpha A_{2}^{2}B_{1}^{\mathsf{T}}B_{1}-\alpha A_{1}^{2}B_{2}^{\mathsf{T}}B_{2}\right)
\end{gathered}$$

(3.8)
$$\begin{gathered}
\hat{\mathcal{K}}=\frac{\alpha}{4}\|[A_{1},A_{2}]+(3-4\alpha)(B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2})\|_{F}^{2}+\\
\alpha^{2}\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}+\frac{1}{2}\|B_{1}B_{2}^{\mathsf{T}}-B_{2}B_{1}^{\mathsf{T}}\|_{F}^{2}+\frac{(1-2\alpha)^{3}}{2}\|B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2}\|_{F}^{2}
\end{gathered}$$

In particular, if $\alpha\leq\frac{1}{2}$, every term in eq. 3.8 is non-negative, so the sectional curvature is non-negative. If $\xi$ and $\eta$ are orthogonal, the sectional curvature denominator is $(\alpha_{1}\operatorname{Tr}A_{1}A_{1}^{\mathsf{T}}+\alpha_{0}\operatorname{Tr}B_{1}B_{1}^{\mathsf{T}})(\alpha_{1}\operatorname{Tr}A_{2}A_{2}^{\mathsf{T}}+\alpha_{0}\operatorname{Tr}B_{2}B_{2}^{\mathsf{T}})$.

We also use the following expansion of eq. 3.8 when $A_{1}$ or $A_{2}$ is zero.

(3.9)
$$\begin{gathered}
\hat{\mathcal{K}}=\frac{\alpha}{4}\|[A_{1},A_{2}]\|_{F}^{2}+\frac{\alpha(3-4\alpha)}{2}\operatorname{Tr}[A_{1},A_{2}](B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2})^{\mathsf{T}}+\\
\frac{2-3\alpha}{4}\|B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2}\|_{F}^{2}+\alpha^{2}\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}+\frac{1}{2}\|B_{1}B_{2}^{\mathsf{T}}-B_{2}B_{1}^{\mathsf{T}}\|_{F}^{2}
\end{gathered}$$
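The three expressions 3.7, 3.8 and 3.9 for $\hat{\mathcal{K}}$ can be compared numerically. The sketch below is an illustrative NumPy check with $\alpha_{0}=1$ (so $\alpha_{1}=\alpha$); the function names and the sample tangent vectors are assumptions made for the example, not notation from the paper.

```python
import numpy as np

def fro2(M):
    """Squared Frobenius norm."""
    return np.sum(M * M)

def khat_37(A1, B1, A2, B2, al):
    C, t = A1 @ A2 - A2 @ A1, np.trace
    return (t((2 - 3*al)/2 * B2.T @ B1 @ B1.T @ B2 + (3*al - 4)/2 * B2.T @ B1 @ B2.T @ B1
              + B2.T @ B2 @ B1.T @ B1 - al/4 * C @ C)
            + al * t((4*al - 3) * A1 @ A2 @ B2.T @ B1 + (3 - 2*al) * A1 @ A2 @ B1.T @ B2
                     - al * A2 @ A2 @ B1.T @ B1 - al * A1 @ A1 @ B2.T @ B2))

def khat_38(A1, B1, A2, B2, al):
    C, S = A1 @ A2 - A2 @ A1, B2.T @ B1 - B1.T @ B2
    return (al/4 * fro2(C + (3 - 4*al) * S) + al**2 * fro2(B1 @ A2 - B2 @ A1)
            + 0.5 * fro2(B1 @ B2.T - B2 @ B1.T) + (1 - 2*al)**3 / 2 * fro2(S))

def khat_39(A1, B1, A2, B2, al):
    C, S = A1 @ A2 - A2 @ A1, B2.T @ B1 - B1.T @ B2
    return (al/4 * fro2(C) + al*(3 - 4*al)/2 * np.trace(C @ S.T) + (2 - 3*al)/4 * fro2(S)
            + al**2 * fro2(B1 @ A2 - B2 @ A1) + 0.5 * fro2(B1 @ B2.T - B2 @ B1.T))

def sectional(A1, B1, A2, B2, al):
    """K = K-hat / ||xi ^ eta||_g^2 from (3.2), with alpha_0 = 1 and alpha_1 = al."""
    ip = lambda Aa, Ba, Ab, Bb: al * np.trace(Aa.T @ Ab) + np.trace(Ba.T @ Bb)
    wedge2 = ip(A1, B1, A1, B1) * ip(A2, B2, A2, B2) - ip(A1, B1, A2, B2) ** 2
    return khat_38(A1, B1, A2, B2, al) / wedge2

rng = np.random.default_rng(4)
n, p, al = 7, 3, 0.9                       # illustrative sizes and parameter
def skew():
    M = rng.standard_normal((p, p))
    return M - M.T
A1, A2 = skew(), skew()
B1, B2 = rng.standard_normal((2, n - p, p))
k37, k38, k39 = (f(A1, B1, A2, B2, al) for f in (khat_37, khat_38, khat_39))

# Embedded metric (alpha = 1) on St_{2,3}, with A1 = A2 = 0 and orthonormal B1, B2.
Z = np.zeros((2, 2))
K_embedded = sectional(Z, np.array([[1.0, 0.0]]), Z, np.array([[0.0, 1.0]]), 1.0)
```

For the last example the formulas give $\mathcal{K}=-\frac{1}{2}$, which is the lower end of the interval $[-\frac{1}{2},1]$ quoted in the introduction for the embedded metric.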
Proof.

As noted, any $\omega\in\mathbb{R}^{n\times p}$ can be expressed as $\omega=YA+Y_{\perp}B$; however, $A$ may not be antisymmetric. By direct substitution, $(\operatorname{I}_{n}-YY^{\mathsf{T}})(\eta\omega^{\mathsf{T}}+\omega\eta^{\mathsf{T}})Y=Y_{\perp}(B_{2}A^{\mathsf{T}}-BA_{2})$, hence

$$\Gamma(\eta,\omega)=\frac{1}{2}Y(-A_{2}A+A^{\mathsf{T}}A_{2}+B^{\mathsf{T}}B_{2}+B_{2}^{\mathsf{T}}B)+(1-\alpha)Y_{\perp}(B_{2}A^{\mathsf{T}}-BA_{2})$$

In particular, $Y^{\mathsf{T}}\Gamma(\eta,\omega)=\frac{1}{2}(-A_{2}A+A^{\mathsf{T}}A_{2}+B^{\mathsf{T}}B_{2}+B_{2}^{\mathsf{T}}B)$ and $Y_{\perp}^{\mathsf{T}}\Gamma(\eta,\omega)=(1-\alpha)(B_{2}A^{\mathsf{T}}-BA_{2})$. Differentiating eq. 3.1 with respect to $Y$ in the direction $\xi$ gives

$$\begin{gathered}
(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)=\frac{1}{2}\xi(\eta^{\mathsf{T}}\phi+\phi^{\mathsf{T}}\eta)+\\
(1-\alpha)\{(\operatorname{I}_{n}-YY^{\mathsf{T}})(\eta\phi^{\mathsf{T}}+\phi\eta^{\mathsf{T}})\xi-(\xi Y^{\mathsf{T}}+Y\xi^{\mathsf{T}})(\eta\phi^{\mathsf{T}}+\phi\eta^{\mathsf{T}})Y\}
\end{gathered}$$
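The closed form for $(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)$ can be compared against a central finite difference of eq. 3.1 in $Y$. This is a hedged numerical sketch: `gamma`, `d_gamma`, the step size `h` and the sample sizes are all assumptions chosen for the illustration, and $\Gamma$ is extended off the manifold simply by evaluating the same formula.

```python
import numpy as np

def gamma(Y, w1, w2, alpha):
    """Christoffel function (3.1), written so it also accepts Y off the manifold."""
    S = w1 @ w2.T + w2 @ w1.T
    return 0.5 * Y @ (w1.T @ w2 + w2.T @ w1) + (1 - alpha) * (S @ Y - Y @ (Y.T @ S @ Y))

def d_gamma(Y, xi, eta, phi, alpha):
    """Closed form of (D_xi Gamma)(eta, phi) from the displayed equation."""
    S = eta @ phi.T + phi @ eta.T
    return (0.5 * xi @ (eta.T @ phi + phi.T @ eta)
            + (1 - alpha) * (S @ xi - Y @ (Y.T @ S @ xi)
                             - (xi @ Y.T + Y @ xi.T) @ S @ Y))

rng = np.random.default_rng(3)
n, p, alpha, h = 6, 3, 0.7, 1e-5           # illustrative values
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y, Yperp = Q[:, :p], Q[:, p:]

def random_tangent():
    M = rng.standard_normal((p, p))
    return Y @ (M - M.T) + Yperp @ rng.standard_normal((n - p, p))

xi, eta, phi = random_tangent(), random_tangent(), random_tangent()
exact = d_gamma(Y, xi, eta, phi, alpha)
# Central difference in Y only; eta and phi are held fixed as ambient matrices.
numeric = (gamma(Y + h * xi, eta, phi, alpha) - gamma(Y - h * xi, eta, phi, alpha)) / (2 * h)
rel_err = np.linalg.norm(exact - numeric) / np.linalg.norm(exact)
```

Because eq. 3.1 is a cubic polynomial in $Y$, the central difference has an error of order $h^{2}$, so `rel_err` is expected to be far below the size of the terms being compared.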

Expanding $\xi,\eta,\phi$ in the basis $(Y|Y_{\perp})$,

$$\begin{gathered}
Y_{\perp}^{\mathsf{T}}(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)=\frac{1}{2}B_{1}(-A_{2}A_{3}-A_{3}A_{2}+B_{2}^{\mathsf{T}}B_{3}+B_{3}^{\mathsf{T}}B_{2})+\\
(1-\alpha)\{B_{2}(-A_{3}Y^{\mathsf{T}}+B_{3}^{\mathsf{T}}Y_{\perp}^{\mathsf{T}})+B_{3}(-A_{2}Y^{\mathsf{T}}+B_{2}^{\mathsf{T}}Y_{\perp}^{\mathsf{T}})\}(YA_{1}+Y_{\perp}B_{1})-\\
(1-\alpha)B_{1}Y^{\mathsf{T}}(-YA_{2}A_{3}-YA_{3}A_{2})\\
=\frac{B_{1}B_{2}^{\mathsf{T}}B_{3}}{2}+\frac{B_{1}B_{3}^{\mathsf{T}}B_{2}}{2}+\left(\frac{1}{2}-\alpha\right)(B_{1}A_{2}A_{3}+B_{1}A_{3}A_{2})+\\
(1-\alpha)(-B_{2}A_{3}A_{1}+B_{2}B_{3}^{\mathsf{T}}B_{1}-B_{3}A_{2}A_{1}+B_{3}B_{2}^{\mathsf{T}}B_{1})
\end{gathered}$$

Simplifying $Y^{\mathsf{T}}(\xi Y^{\mathsf{T}}+Y\xi^{\mathsf{T}})=A_{1}Y^{\mathsf{T}}-A_{1}Y^{\mathsf{T}}+B_{1}^{\mathsf{T}}Y_{\perp}^{\mathsf{T}}=B_{1}^{\mathsf{T}}Y_{\perp}^{\mathsf{T}}$, we get

$$\begin{gathered}
Y^{\mathsf{T}}(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)=\frac{1}{2}A_{1}(-A_{2}A_{3}+B_{2}^{\mathsf{T}}B_{3}-A_{3}A_{2}+B_{3}^{\mathsf{T}}B_{2})-\\
(1-\alpha)(B_{1}^{\mathsf{T}}Y_{\perp}^{\mathsf{T}})(-Y_{\perp}B_{2}A_{3}Y^{\mathsf{T}}-Y_{\perp}B_{3}A_{2}Y^{\mathsf{T}})Y=\\
(1-\alpha)(B_{1}^{\mathsf{T}}B_{2}A_{3}+B_{1}^{\mathsf{T}}B_{3}A_{2})+\frac{1}{2}(-A_{1}A_{2}A_{3}-A_{1}A_{3}A_{2}+A_{1}B_{2}^{\mathsf{T}}B_{3}+A_{1}B_{3}^{\mathsf{T}}B_{2})
\end{gathered}$$

Next, use the formula for $\Gamma(\xi,\omega)$ with $\omega=\Gamma(\eta,\phi)$:

$$\begin{gathered}
Y^{\mathsf{T}}\Gamma(\xi,\Gamma(\eta,\phi))=\frac{1}{2}\Big(-A_{1}\big(\tfrac{1}{2}(-A_{2}A_{3}-A_{3}A_{2}+B_{3}^{\mathsf{T}}B_{2}+B_{2}^{\mathsf{T}}B_{3})\big)+\\
\big(\tfrac{1}{2}(-A_{2}A_{3}-A_{3}A_{2}+B_{3}^{\mathsf{T}}B_{2}+B_{2}^{\mathsf{T}}B_{3})\big)^{\mathsf{T}}A_{1}+B_{1}^{\mathsf{T}}\big((1-\alpha)(-B_{2}A_{3}-B_{3}A_{2})\big)+\\
\big((1-\alpha)(-B_{2}A_{3}-B_{3}A_{2})\big)^{\mathsf{T}}B_{1}\Big)=\\
\frac{1-\alpha}{2}(A_{2}B_{3}^{\mathsf{T}}B_{1}+A_{3}B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2}A_{3}-B_{1}^{\mathsf{T}}B_{3}A_{2})+\frac{1}{4}(A_{1}A_{2}A_{3}+\\
A_{1}A_{3}A_{2}-A_{1}B_{2}^{\mathsf{T}}B_{3}-A_{1}B_{3}^{\mathsf{T}}B_{2}-A_{2}A_{3}A_{1}-A_{3}A_{2}A_{1}+B_{2}^{\mathsf{T}}B_{3}A_{1}+B_{3}^{\mathsf{T}}B_{2}A_{1})
\end{gathered}$$

$$\begin{gathered}
Y_{\perp}^{\mathsf{T}}\Gamma(\xi,\Gamma(\eta,\phi))=(1-\alpha)\Big\{B_{1}\big(\tfrac{1}{2}(-A_{2}A_{3}-A_{3}A_{2}+B_{3}^{\mathsf{T}}B_{2}+B_{2}^{\mathsf{T}}B_{3})\big)^{\mathsf{T}}-\\
\big((1-\alpha)(-B_{2}A_{3}-B_{3}A_{2})\big)A_{1}\Big\}=\\
(\alpha-1)^{2}(B_{2}A_{3}A_{1}+B_{3}A_{2}A_{1})+\frac{\alpha-1}{2}(B_{1}A_{2}A_{3}+B_{1}A_{3}A_{2}-B_{1}B_{2}^{\mathsf{T}}B_{3}-B_{1}B_{3}^{\mathsf{T}}B_{2})
\end{gathered}$$

Therefore:

$$\begin{gathered}
Y^{\mathsf{T}}\mathrm{R}^{\mathcal{M}}_{\xi\eta}\phi=-\{(1-\alpha)(B_{1}^{\mathsf{T}}B_{2}A_{3}+B_{1}^{\mathsf{T}}B_{3}A_{2})+\\
\frac{1}{2}(-A_{1}A_{2}A_{3}-A_{1}A_{3}A_{2}+A_{1}B_{2}^{\mathsf{T}}B_{3}+A_{1}B_{3}^{\mathsf{T}}B_{2})\}+\\
\{(1-\alpha)(B_{2}^{\mathsf{T}}B_{1}A_{3}+B_{2}^{\mathsf{T}}B_{3}A_{1})+\frac{1}{2}(-A_{2}A_{1}A_{3}-A_{2}A_{3}A_{1}+A_{2}B_{1}^{\mathsf{T}}B_{3}+A_{2}B_{3}^{\mathsf{T}}B_{1})\}-\\
\{\frac{1-\alpha}{2}(A_{2}B_{3}^{\mathsf{T}}B_{1}+A_{3}B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2}A_{3}-B_{1}^{\mathsf{T}}B_{3}A_{2})+\frac{1}{4}(A_{1}A_{2}A_{3}+\\
A_{1}A_{3}A_{2}-A_{1}B_{2}^{\mathsf{T}}B_{3}-A_{1}B_{3}^{\mathsf{T}}B_{2}-A_{2}A_{3}A_{1}-A_{3}A_{2}A_{1}+B_{2}^{\mathsf{T}}B_{3}A_{1}+B_{3}^{\mathsf{T}}B_{2}A_{1})\}+\\
\{\frac{1-\alpha}{2}(A_{1}B_{3}^{\mathsf{T}}B_{2}+A_{3}B_{1}^{\mathsf{T}}B_{2}-B_{2}^{\mathsf{T}}B_{1}A_{3}-B_{2}^{\mathsf{T}}B_{3}A_{1})+\frac{1}{4}(A_{2}A_{1}A_{3}+\\
A_{2}A_{3}A_{1}-A_{2}B_{1}^{\mathsf{T}}B_{3}-A_{2}B_{3}^{\mathsf{T}}B_{1}-A_{1}A_{3}A_{2}-A_{3}A_{1}A_{2}+B_{1}^{\mathsf{T}}B_{3}A_{2}+B_{3}^{\mathsf{T}}B_{1}A_{2})\}\\
=\frac{1-2\alpha}{4}(A_{1}B_{3}^{\mathsf{T}}B_{2}-A_{2}B_{3}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{3}A_{2}+B_{2}^{\mathsf{T}}B_{3}A_{1})+\\
\frac{1-\alpha}{2}(A_{3}B_{1}^{\mathsf{T}}B_{2}-A_{3}B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2}A_{3}+B_{2}^{\mathsf{T}}B_{1}A_{3})+\frac{1}{4}(A_{1}A_{2}A_{3}-A_{1}B_{2}^{\mathsf{T}}B_{3}-\\
A_{2}A_{1}A_{3}+A_{2}B_{1}^{\mathsf{T}}B_{3}-A_{3}A_{1}A_{2}+A_{3}A_{2}A_{1}+B_{3}^{\mathsf{T}}B_{1}A_{2}-B_{3}^{\mathsf{T}}B_{2}A_{1})
\end{gathered}$$

The last expression follows from a term-by-term collection. For example, the coefficient of $A_{1}A_{2}A_{3}$ is $-(-1/2)-1/4=1/4$, and similarly for all terms with coefficient $1/4$. The coefficient of $A_{1}B_{3}^{\mathsf{T}}B_{2}$ is $-1/2+1/4+(1-\alpha)/2=(1-2\alpha)/4$, and similarly for all the terms with that coefficient. For the $Y_{\perp}$ component,

$$\begin{gathered}
Y_{\perp}^{\mathsf{T}}\mathrm{R}^{\mathcal{M}}_{\xi\eta}\phi=-\Big(\frac{B_{1}B_{2}^{\mathsf{T}}B_{3}}{2}+\frac{B_{1}B_{3}^{\mathsf{T}}B_{2}}{2}+\left(\frac{1}{2}-\alpha\right)(B_{1}A_{2}A_{3}+B_{1}A_{3}A_{2})+\\
(1-\alpha)(-B_{2}A_{3}A_{1}+B_{2}B_{3}^{\mathsf{T}}B_{1}-B_{3}A_{2}A_{1}+B_{3}B_{2}^{\mathsf{T}}B_{1})\Big)+\\
\Big(\frac{B_{2}B_{1}^{\mathsf{T}}B_{3}}{2}+\frac{B_{2}B_{3}^{\mathsf{T}}B_{1}}{2}+\left(\frac{1}{2}-\alpha\right)(B_{2}A_{1}A_{3}+B_{2}A_{3}A_{1})+\\
(1-\alpha)(-B_{1}A_{3}A_{2}+B_{1}B_{3}^{\mathsf{T}}B_{2}-B_{3}A_{1}A_{2}+B_{3}B_{1}^{\mathsf{T}}B_{2})\Big)-\\
(\alpha-1)^{2}(B_{2}A_{3}A_{1}+B_{3}A_{2}A_{1})-\frac{\alpha-1}{2}(B_{1}A_{2}A_{3}+B_{1}A_{3}A_{2}-B_{1}B_{2}^{\mathsf{T}}B_{3}-B_{1}B_{3}^{\mathsf{T}}B_{2})\\
+(\alpha-1)^{2}(B_{1}A_{3}A_{2}+B_{3}A_{1}A_{2})+\frac{\alpha-1}{2}(B_{2}A_{1}A_{3}+B_{2}A_{3}A_{1}-B_{2}B_{1}^{\mathsf{T}}B_{3}-B_{2}B_{3}^{\mathsf{T}}B_{1})
\end{gathered}$$

Again, we collect term by term (with the help of a symbolic computation program). The coefficient of $B_{1}B_{2}^{\mathsf{T}}B_{3}$ is $-1/2+(\alpha-1)/2=(\alpha-2)/2$, and similarly for $B_{2}B_{1}^{\mathsf{T}}B_{3}$. The coefficient of $B_{1}B_{3}^{\mathsf{T}}B_{2}$ is $-1/2+(1-\alpha)+(\alpha-1)/2=-\alpha/2$, and similarly for $B_{2}B_{3}^{\mathsf{T}}B_{1}$. The coefficient of $B_{1}A_{2}A_{3}$ is $-(1/2-\alpha)-(\alpha-1)/2=\alpha/2$, and similarly for $B_{2}A_{1}A_{3}$. The coefficient of $B_{1}A_{3}A_{2}$ is $-(\frac{1}{2}-\alpha)-(1-\alpha)-\frac{\alpha-1}{2}+(\alpha-1)^{2}=\alpha^{2}-\frac{\alpha}{2}=\frac{2\alpha^{2}-\alpha}{2}$, and similarly for $B_{2}A_{3}A_{1}$. The coefficient of $B_{3}A_{2}A_{1}$ is $(1-\alpha)-(\alpha-1)^{2}=\alpha-\alpha^{2}$, and $B_{3}A_{1}A_{2}$ follows by permutation. The coefficient of $B_{3}B_{2}^{\mathsf{T}}B_{1}$ is $-(1-\alpha)$, and similarly for $B_{3}B_{1}^{\mathsf{T}}B_{2}$. Finally,

$$\begin{gathered}
Y_{\perp}^{\mathsf{T}}\mathrm{R}^{\mathcal{M}}_{\xi\eta}\phi=\frac{2\alpha^{2}-\alpha}{2}(B_{1}A_{3}A_{2}-B_{2}A_{3}A_{1})+(\alpha^{2}-\alpha)(B_{3}A_{1}A_{2}-B_{3}A_{2}A_{1})+\\
(1-\alpha)(B_{3}B_{1}^{\mathsf{T}}B_{2}-B_{3}B_{2}^{\mathsf{T}}B_{1})+\frac{\alpha-2}{2}(B_{1}B_{2}^{\mathsf{T}}B_{3}-B_{2}B_{1}^{\mathsf{T}}B_{3})+\\
\frac{\alpha}{2}(B_{1}A_{2}A_{3}-B_{1}B_{3}^{\mathsf{T}}B_{2}-B_{2}A_{1}A_{3}+B_{2}B_{3}^{\mathsf{T}}B_{1})
\end{gathered}$$
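The term-by-term collection above can be cross-checked numerically by evaluating the first formula of eq. 2.1 directly and comparing with $YA_{R}+Y_{\perp}B_{R}$ from eqs. 3.3 and 3.4. The following NumPy sketch is an illustration under stated assumptions ($\alpha_{0}=1$, random test data, helper names of our choosing), not a replacement for the symbolic computation.

```python
import numpy as np

def gamma(Y, w1, w2, al):
    """Christoffel function (3.1)."""
    S = w1 @ w2.T + w2 @ w1.T
    return 0.5 * Y @ (w1.T @ w2 + w2.T @ w1) + (1 - al) * (S @ Y - Y @ (Y.T @ S @ Y))

def d_gamma(Y, xi, eta, phi, al):
    """(D_xi Gamma)(eta, phi), the directional derivative of (3.1) in Y."""
    S = eta @ phi.T + phi @ eta.T
    return (0.5 * xi @ (eta.T @ phi + phi.T @ eta)
            + (1 - al) * (S @ xi - Y @ (Y.T @ S @ xi) - (xi @ Y.T + Y @ xi.T) @ S @ Y))

def curvature_21(Y, xi, eta, phi, al):
    """First line of (2.1)."""
    return (-d_gamma(Y, xi, eta, phi, al) + d_gamma(Y, eta, xi, phi, al)
            - gamma(Y, xi, gamma(Y, eta, phi, al), al)
            + gamma(Y, eta, gamma(Y, xi, phi, al), al))

def br(X, Z):
    """Matrix commutator [X, Z]."""
    return X @ Z - Z @ X

def A_R(A1, B1, A2, B2, A3, B3, al):   # eq. (3.3)
    return ((1 - 2*al)/4 * (A1 @ B3.T @ B2 - A2 @ B3.T @ B1 - B1.T @ B3 @ A2 + B2.T @ B3 @ A1)
            + (1 - al)/2 * (A3 @ B1.T @ B2 - A3 @ B2.T @ B1 - B1.T @ B2 @ A3 + B2.T @ B1 @ A3)
            + 0.25 * (br(br(A1, A2), A3) - A1 @ B2.T @ B3 + A2 @ B1.T @ B3
                      + B3.T @ B1 @ A2 - B3.T @ B2 @ A1))

def B_R(A1, B1, A2, B2, A3, B3, al):   # eq. (3.4)
    return ((2*al**2 - al)/2 * (B1 @ A3 @ A2 - B2 @ A3 @ A1)
            + (al**2 - al) * (B3 @ A1 @ A2 - B3 @ A2 @ A1)
            + (1 - al) * (B3 @ B1.T @ B2 - B3 @ B2.T @ B1)
            + (al - 2)/2 * (B1 @ B2.T @ B3 - B2 @ B1.T @ B3)
            + al/2 * (B1 @ A2 @ A3 - B1 @ B3.T @ B2 - B2 @ A1 @ A3 + B2 @ B3.T @ B1))

rng = np.random.default_rng(5)
n, p, al = 6, 3, 0.8                   # illustrative sizes and parameter
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y, Yperp = Q[:, :p], Q[:, p:]
def skew():
    M = rng.standard_normal((p, p))
    return M - M.T
(A1, A2, A3), (B1, B2, B3) = [skew() for _ in range(3)], rng.standard_normal((3, n - p, p))
xi, eta, phi = (Y @ A + Yperp @ B for A, B in ((A1, B1), (A2, B2), (A3, B3)))

AR, BR = A_R(A1, B1, A2, B2, A3, B3, al), B_R(A1, B1, A2, B2, A3, B3, al)
err = np.linalg.norm(curvature_21(Y, xi, eta, phi, al) - (Y @ AR + Yperp @ BR))
skew_err = np.linalg.norm(AR + AR.T)   # A_R should lie in o(p)
```

Agreement of `err` with zero up to round-off for several random draws and values of `al` gives an independent check that no term was dropped in the collection.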

The Ricci curvature is $\operatorname{Tr}((A_{2},B_{2})\mapsto(A_{R},B_{R}))$. Using item 3 of lemma A.1, for the $A_{R}$ component, we compute the trace of

$$A_{2}\mapsto\frac{1-2\alpha}{4}(-A_{2}B_{3}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{3}A_{2})+\frac{1}{4}([[A_{1},A_{2}],A_{3}]+A_{2}B_{1}^{\mathsf{T}}B_{3}+B_{3}^{\mathsf{T}}B_{1}A_{2})$$

which evaluates to $\frac{1-2\alpha}{4}(p-1)\operatorname{Tr}(-B_{3}^{\mathsf{T}}B_{1})+\frac{1}{4}((2-p)\operatorname{Tr}(A_{1}A_{3})+p\operatorname{Tr}(B_{1}^{\mathsf{T}}B_{3})-\operatorname{Tr}(B_{1}^{\mathsf{T}}B_{3}))$, or $\frac{2-p}{4}\operatorname{Tr}(A_{1}A_{3})+(p-1)\frac{\alpha}{2}\operatorname{Tr}(B_{1}^{\mathsf{T}}B_{3})$. Here, we need $p>1$, otherwise $\mathfrak{o}(p)$ is zero and there is no contribution from this component.

For the $B_{R}$ component, using item 1 of lemma A.1, we compute

Tr(B22α2α2(B2A3A1)+(1α)(B3B1𝖳B2B3B2𝖳B1)+α22(B1B2𝖳B3B2B1𝖳B3)+α2(B1B3𝖳B2B2A1A3+B2B3𝖳B1))=2α2α2(np)Tr(A3A1)+(1α)(p1)Tr(B3B1𝖳)+α22(1n+p)Tr(B1𝖳B3)+α(n2p)2Tr(B1B3𝖳)α(np)2Tr(A1A3)\begin{gathered}\operatorname{Tr}(B_{2}\mapsto\frac{2\alpha^{2}-\alpha}{2}(-B_{2}A_{3}A_{1})+(1-\alpha)(B_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1})+\\ \frac{\alpha-2}{2}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3})+\frac{\alpha}{2}(-B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}A_{1}A_{3}+B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}))\\ =\frac{2\alpha^{2}-\alpha}{2}(n-p)\operatorname{Tr}(-A_{3}A_{1})+(1-\alpha)(p-1)\operatorname{Tr}(B_{3}B_{1}^{\operatorname{\mathsf{T}}})+\\ \frac{\alpha-2}{2}(1-n+p)\operatorname{Tr}(B_{1}^{\operatorname{\mathsf{T}}}B_{3})+\frac{\alpha(n-2p)}{2}\operatorname{Tr}(B_{1}B_{3}^{\operatorname{\mathsf{T}}})-\frac{\alpha(n-p)}{2}\operatorname{Tr}(A_{1}A_{3})\end{gathered}

The Ricci curvature is

\[
\begin{gathered}
\Big(\frac{2-p}{4}-\frac{2\alpha^{2}-\alpha}{2}(n-p)-\frac{\alpha(n-p)}{2}\Big)\operatorname{Tr}(A_{1}A_{3})+\\
\Big\{(p-1)\frac{\alpha}{2}+(1-\alpha)(p-1)+\frac{\alpha-2}{2}(1-n+p)+\frac{\alpha(n-2p)}{2}\Big\}\operatorname{Tr}(B_{1}^{\mathsf{T}}B_{3})\\
=\Big(\frac{2-p}{4}+(p-n)\alpha^{2}\Big)\operatorname{Tr}(A_{1}A_{3})+[(1-p)\alpha+(n-2)]\operatorname{Tr}(B_{1}^{\mathsf{T}}B_{3})
\end{gathered}
\]
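
As a sanity check on the simplification in the last line of this display, the sketch below compares the unsimplified and simplified coefficients of the two traces with exact rational arithmetic. It only samples a small range of $(n,p)$ and a few values of $\alpha$ (an assumption made to keep the check short), so it is a spot check rather than a proof; the function names are illustrative.

```python
from fractions import Fraction as F


def ricci_raw(n, p, a):
    """Coefficients of the A-trace and the B-trace before simplification."""
    coef_a = F(2 - p, 4) - (2 * a * a - a) / 2 * (n - p) - a * (n - p) / 2
    coef_b = (
        (p - 1) * a / 2
        + (1 - a) * (p - 1)
        + (a - 2) / 2 * (1 - n + p)
        + a * (n - 2 * p) / 2
    )
    return coef_a, coef_b


def ricci_simplified(n, p, a):
    """Coefficients as stated in the final line of the display."""
    return F(2 - p, 4) + (p - n) * a * a, (1 - p) * a + (n - 2)


alphas = [F(1, 4), F(1, 2), F(2, 3), F(1), F(3)]
ricci_simplification_holds = all(
    ricci_raw(n, p, a) == ricci_simplified(n, p, a)
    for n in range(3, 9)
    for p in range(2, n)
    for a in alphas
)
print(ricci_simplification_holds)
```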

The Ricci map is thus (A2,B2)((p24α+(np)α)A2,((1p)α+(n2))B2)(A_{2},B_{2})\mapsto((\frac{p-2}{4\alpha}+(n-p)\alpha)A_{2},((1-p)\alpha+(n-2))B_{2}), which gives us the scalar curvature formula.

For the sectional curvature, we substitute A1,B1A_{1},B_{1} in place of A3,B3A_{3},B_{3} in the expressions for ARA_{R} and BRB_{R}, then compute Tr(αARA2+BRB2𝖳)\operatorname{Tr}(-\alpha A_{R}A_{2}+B_{R}B_{2}^{\operatorname{\mathsf{T}}})

𝒦^(ξ,η)=Tr(α(12α4(A1B1𝖳B2A2B1𝖳B1B1𝖳B1A2+B2𝖳B1A1)+1α2(A1B1𝖳B2A1B2𝖳B1B1𝖳B2A1+B2𝖳B1A1)+14([[A1,A2],A1]A1B2𝖳B1+A2B1𝖳B1+B1𝖳B1A2B1𝖳B2A1))A2)+Tr((2α2α2(B1A1A2B2A1A1)+(α2α)(B1A1A2B1A2A1)+(1α)(B1B1𝖳B2B1B2𝖳B1)+α22(B1B2𝖳B1B2B1𝖳B1)+α2(B1A2A1B1B1𝖳B2B2A1A1+B2B1𝖳B1))B2𝖳)\begin{gathered}\operatorname{\hat{\mathcal{K}}}(\xi,\eta)=\operatorname{Tr}(-\alpha(\frac{1-2\alpha}{4}(A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{1})+\\ \frac{1-\alpha}{2}(A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{1}+B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{1})+\\ \frac{1}{4}([[A_{1},A_{2}],A_{1}]-A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}+B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{1}))A_{2})\\ +\operatorname{Tr}((\frac{2\alpha^{2}-\alpha}{2}(B_{1}A_{1}A_{2}-B_{2}A_{1}A_{1})+(\alpha^{2}-\alpha)(B_{1}A_{1}A_{2}-B_{1}A_{2}A_{1})+\\ (1-\alpha)(B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1})+\frac{\alpha-2}{2}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1})+\\ \frac{\alpha}{2}(B_{1}A_{2}A_{1}-B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}A_{1}A_{1}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}))B_{2}^{\operatorname{\mathsf{T}}})\end{gathered}

We collect terms. From $-\operatorname{Tr}([[A_{1},A_{2}],A_{1}]A_{2})=\operatorname{Tr}[A_{1},A_{2}][A_{1},A_{2}]^{\mathsf{T}}$, the terms involving only $A_{1},A_{2}$ sum to $\frac{\alpha}{4}\operatorname{Tr}[A_{1},A_{2}][A_{1},A_{2}]^{\mathsf{T}}$. The terms with both $A$'s and $B$'s are:

Tr(α(12α4(A1B1𝖳B2A2+A2B1𝖳B1A2+B1𝖳B1A22B2𝖳B1A1A2)+1α2(A1B1𝖳B2A2+A1B2𝖳B1A2+B1𝖳B2A1A2B2𝖳B1A1A2)+14(A1B2𝖳B1A2A2B1𝖳B1A2B1𝖳B1A22+B1𝖳B2A1A2)))+αTr(2α12(B1A1A2B2𝖳B2A1A1B2𝖳)+(α1)(B1A1A2B2𝖳B1A2A1B2𝖳)+12(B1A2A1B2𝖳B2A1A1B2𝖳))=αTr((1α2+14(α1)+12)A2A1B2𝖳B1+(12α41α2)A2A1B1𝖳B2+(12α41α2+α1+2α12)A1A2B2𝖳B1+(1α2+14)A1A2B1𝖳B2+(212α424)A22B1𝖳B1+(2α1212)A12B2𝖳B2)=αTr(96α4A2A1B2𝖳B1+4α34A2A1B1𝖳B2+12α94A1A2B2𝖳B1+32α4A1A2B1𝖳B2αA22B1𝖳B1αA12B2𝖳B2)=αTr((4α3)A1A2B2𝖳B1+(32α)A1A2B1𝖳B2αA22B1𝖳B1αA12B2𝖳B2)\begin{gathered}\operatorname{Tr}(\alpha(\frac{1-2\alpha}{4}(-A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{2}+A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}+B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}^{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{1}A_{2})+\\ \frac{1-\alpha}{2}(-A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{2}+A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{2}+B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{1}A_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{1}A_{2})+\\ \frac{1}{4}(A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{2}-A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}-B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}^{2}+B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{1}A_{2})))\\ +\alpha\operatorname{Tr}(\frac{2\alpha-1}{2}(B_{1}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}A_{1}A_{1}B_{2}^{\operatorname{\mathsf{T}}})+(\alpha-1)(B_{1}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}-B_{1}A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}})+\\ \frac{1}{2}(B_{1}A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}A_{1}A_{1}B_{2}^{\operatorname{\mathsf{T}}}))=\\ \alpha\operatorname{Tr}((\frac{1-\alpha}{2}+\frac{1}{4}-(\alpha-1)+\frac{1}{2})A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+(-\frac{1-2\alpha}{4}-\frac{1-\alpha}{2})A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+\\ (-\frac{1-2\alpha}{4}-\frac{1-\alpha}{2}+\alpha-1+\frac{2\alpha-1}{2})A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\\ 
(\frac{1-\alpha}{2}+\frac{1}{4})A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+(2\frac{1-2\alpha}{4}-\frac{2}{4})A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}+(-\frac{2\alpha-1}{2}-\frac{1}{2})A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2})=\\ \alpha\operatorname{Tr}(\frac{9-6\alpha}{4}A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\frac{4\alpha-3}{4}A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+\frac{12\alpha-9}{4}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\\ \frac{3-2\alpha}{4}A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2})=\\ \alpha\operatorname{Tr}((4\alpha-3)A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+(3-2\alpha)A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2})\end{gathered}

where we use Tr(A2A1B2𝖳B1)=Tr((A2A1B2𝖳B1)𝖳)=Tr(A1A2B1𝖳B2)\operatorname{Tr}(A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1})=\operatorname{Tr}((A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1})^{\operatorname{\mathsf{T}}})=\operatorname{Tr}(A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}) and similarly Tr(A2A1B1𝖳B2)=Tr(A1A2B2𝖳B1)\operatorname{Tr}(A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2})=\operatorname{Tr}(A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}). Next, we collect the terms with B1B_{1} and B2B_{2} only:

Tr((1α)(B1B1𝖳B2B2𝖳B1B2𝖳B1B2𝖳)+α22(B1B2𝖳B1B2𝖳B2B1𝖳B1B2𝖳)+α2(B1B1𝖳B2B2𝖳+B2B1𝖳B1B2𝖳))=Tr((13α2)B1B1𝖳B2B2𝖳+(α1+α22)B1B2𝖳B1B2𝖳+(α22+α2)B2B1𝖳B1B2𝖳)=Tr(23α2B1B1𝖳B2B2𝖳+3α42B1B2𝖳B1B2𝖳+B2B1𝖳B1B2𝖳)\begin{gathered}\operatorname{Tr}((1-\alpha)(B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})+\frac{\alpha-2}{2}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})+\\ \frac{\alpha}{2}(-B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}))=\\ \operatorname{Tr}((1-\frac{3\alpha}{2})B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+(\alpha-1+\frac{\alpha-2}{2})B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}+(-\frac{\alpha-2}{2}+\frac{\alpha}{2})B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\\ =\operatorname{Tr}(\frac{2-3\alpha}{2}B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+\frac{3\alpha-4}{2}B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\end{gathered}

This proves eq. 3.7. On the other hand, it is clear on the right-hand side of eq. 3.8, the AA’s only term is α4Tr[A1,A2][A1,A2]𝖳\frac{\alpha}{4}\operatorname{Tr}[A_{1},A_{2}][A_{1},A_{2}]^{\operatorname{\mathsf{T}}}, the BB’s only term is:

(α(34α)24+(12α)32)Tr(B2𝖳B1B1𝖳B2)(B2𝖳B1B1𝖳B2)𝖳+12Tr(B1B2𝖳B2B1𝖳)(B1B2𝖳B2B1𝖳)𝖳=23α4Tr(B2𝖳B1B1𝖳B2B2𝖳B1B2𝖳B1B1𝖳B2B1𝖳B2+B1𝖳B2B2𝖳B1)+12Tr(B1B2𝖳B2B1𝖳B1B2𝖳B1B2𝖳B2B1𝖳B2B1𝖳+B2B1𝖳B1B2𝖳)=Tr(223α4B1B1𝖳B2B2𝖳+(223α4212)B1B2𝖳B1B2𝖳+212B2B1𝖳B1B2𝖳)=Tr(23α2B1B1𝖳B2B2𝖳+3α42B1B2𝖳B1B2𝖳+B2B1𝖳B1B2𝖳)\begin{gathered}(\frac{\alpha(3-4\alpha)^{2}}{4}+\frac{(1-2\alpha)^{3}}{2})\operatorname{Tr}(B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})(B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})^{\operatorname{\mathsf{T}}}+\\ \frac{1}{2}\operatorname{Tr}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}})(B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}})^{\operatorname{\mathsf{T}}}=\\ \frac{2-3\alpha}{4}\operatorname{Tr}(B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1})+\\ \frac{1}{2}\operatorname{Tr}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{2}B_{1}^{\operatorname{\mathsf{T}}}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{1}^{\operatorname{\mathsf{T}}}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\\ =\operatorname{Tr}(2\frac{2-3\alpha}{4}B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+(-2\frac{2-3\alpha}{4}-2\frac{1}{2})B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}+2\frac{1}{2}B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\\ 
=\operatorname{Tr}(\frac{2-3\alpha}{2}B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+\frac{3\alpha-4}{2}B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\end{gathered}
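
The first step of this display replaces the weight $\frac{\alpha(3-4\alpha)^{2}}{4}+\frac{(1-2\alpha)^{3}}{2}$ by $\frac{2-3\alpha}{4}$. A short exact check of that step, offered as a sketch with an illustrative function name: both sides are polynomials of degree at most three in $\alpha$, so agreement at five sample points establishes the identity.

```python
from fractions import Fraction as F


def b_only_weight(a):
    """Weight alpha(3-4 alpha)^2/4 + (1-2 alpha)^3/2 in front of the first squared norm."""
    return a * (3 - 4 * a) ** 2 / 4 + (1 - 2 * a) ** 3 / 2


# A cubic identity is determined by its values at four points; five are used here.
samples = [F(0), F(1, 2), F(2, 3), F(1), F(7, 5)]
weight_matches = all(b_only_weight(a) == (2 - 3 * a) / 4 for a in samples)
print(weight_matches)
```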

The terms with both AA and BB in eq. 3.8 are:

αTr(34α2(A2A1A1A2)(B2𝖳B1B1𝖳B2)+α(B1A2B2A1)𝖳(B1A2B2A1))=αTr{(34α2+α)A2A1B2𝖳B134α2A2A1B1𝖳B234α2A1A2B2𝖳B1+(34α2+α)A1A2B1𝖳B2αA22B1𝖳B1αA12B2𝖳B2}=αTr{32α2A2A1B2𝖳B134α2A2A1B1𝖳B234α2A1A2B2𝖳B1+32α2A1A2B1𝖳B2αA22B1𝖳B1αA12B2𝖳B2}=αTr{(4α3)A1A2B2𝖳B1+(32α)A1A2B1𝖳B2αA22B1𝖳B1αA12B2𝖳B2}\begin{gathered}\alpha\operatorname{Tr}(\frac{3-4\alpha}{2}(A_{2}A_{1}-A_{1}A_{2})(B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})+\alpha(B_{1}A_{2}-B_{2}A_{1})^{\operatorname{\mathsf{T}}}(B_{1}A_{2}-B_{2}A_{1}))\\ =\alpha\operatorname{Tr}\{(\frac{3-4\alpha}{2}+\alpha)A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-\frac{3-4\alpha}{2}A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\frac{3-4\alpha}{2}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\\ (\frac{3-4\alpha}{2}+\alpha)A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2}\}\\ =\alpha\operatorname{Tr}\{\frac{3-2\alpha}{2}A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-\frac{3-4\alpha}{2}A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\frac{3-4\alpha}{2}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\\ \frac{3-2\alpha}{2}A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2}\}\\ =\alpha\operatorname{Tr}\{(4\alpha-3)A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+(3-2\alpha)A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2}\}\end{gathered}

Therefore we have shown eq. 3.8 gives us the sectional curvature numerator. For the sign of the sectional curvature, in eq. 3.8 the terms are all positive, except for the last, which is non-negative if α12\alpha\leq\frac{1}{2}. The formula for the curvature denominator is clear. ∎

Recall that an Einstein manifold is a Riemannian manifold whose Ricci curvature tensor is proportional to the metric tensor. We have a quick application.

Corollary 1.

For p>1p>1, the Stiefel manifold with the metric 𝗀=α0ω+(α1α0)YY𝖳ω\mathsf{g}=\alpha_{0}\omega+(\alpha_{1}-\alpha_{0})YY^{\operatorname{\mathsf{T}}}\omega is an Einstein manifold if and only if α=α1/α0\alpha=\alpha_{1}/\alpha_{0} satisfies the equation

(3.10) (n1)α2(n2)α+p24=0(n-1)\alpha^{2}-(n-2)\alpha+\frac{p-2}{4}=0

For p=2p=2, α=n2n1\alpha=\frac{n-2}{n-1} is the only value of α\alpha that makes St2,n\mathrm{St}_{2,n} an Einstein manifold. If p>2p>2, there are two values of α\alpha in the family making the Stiefel manifold an Einstein manifold.

Proof.

From eq. 3.5, the manifold is an Einstein manifold if and only if $(n-p)\alpha^{2}+(p-2)/4=\alpha(n-2+(1-p)\alpha)$, from which eq. 3.10 follows. When $p=2$, it is clear that $\frac{n-2}{n-1}$ is the only solution. When $p>2$, eq. 3.10 has discriminant $(n-2)^{2}-(p-2)(n-1)$, which is at least $1$ when $p\leq n-1$, so there are two distinct real roots; both are positive because their sum $\frac{n-2}{n-1}$ and their product $\frac{p-2}{4(n-1)}$ are positive. ∎
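
To make the corollary concrete, the sketch below solves eq. 3.10 numerically and confirms that the two Ricci eigenvalues from the Ricci map above coincide at each root. The discriminant $(n-2)^{2}-(n-1)(p-2)$ is computed directly from the coefficients of eq. 3.10. The function names are illustrative, and floating-point arithmetic is assumed to be accurate enough for these small cases.

```python
import math


def einstein_alphas(n, p):
    """Positive real roots of (n-1) a^2 - (n-2) a + (p-2)/4 = 0, in ascending order."""
    disc = (n - 2) ** 2 - (n - 1) * (p - 2)
    if disc < 0:
        return []
    r = math.sqrt(disc)
    roots = sorted({((n - 2) - r) / (2 * (n - 1)), ((n - 2) + r) / (2 * (n - 1))})
    return [a for a in roots if a > 0]


def ricci_eigenvalues(n, p, a):
    """Eigenvalues of the Ricci map on the A and B components."""
    return (p - 2) / (4 * a) + (n - p) * a, (1 - p) * a + (n - 2)


for n, p in [(4, 2), (4, 3), (6, 3), (10, 5)]:
    for a in einstein_alphas(n, p):
        lam_a, lam_b = ricci_eigenvalues(n, p, a)
        print(n, p, a, lam_a, lam_b)
```

For $p=2$ the root $\alpha=0$ is discarded because the metric requires $\alpha>0$, which leaves the single value $\frac{n-2}{n-1}$.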

It is noted in [6] that when p=n1p=n-1, Stn1,n\mathrm{St}_{n-1,n} is just SO(n)\operatorname{SO}(n). Thus, we have provided SO(n)\operatorname{SO}(n) with Einstein metrics.

4. Sectional curvature range

We have seen that the sectional curvature numerator $\hat{\mathcal{K}}$ can be expressed as a weighted sum of squares, which allows us to estimate the sectional curvature range. If $p=1$, the Stiefel manifold is a sphere and has constant sectional curvature. Therefore, we assume $p>1$ below.

It is easy to establish upper and lower bounds (not tight) for the sectional curvature from eq. 3.8. Using the triangle inequality, we can bound $\mathcal{K}$ from eq. 3.8 by bounding an expression of the form $K_{1}=a\|[A_{1},A_{2}]\|^{2}_{F}+b\|B_{1}B_{2}^{\mathsf{T}}-B_{2}B_{1}^{\mathsf{T}}\|^{2}_{F}+c\|B_{1}^{\mathsf{T}}B_{2}-B_{2}^{\mathsf{T}}B_{1}\|^{2}_{F}+d\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}$ in terms of the curvature denominator $S:=(\alpha\|A_{1}\|_{F}^{2}+\|B_{1}\|_{F}^{2})(\alpha\|A_{2}\|_{F}^{2}+\|B_{2}\|_{F}^{2})$. We use the inequality $\|[X,Z]\|_{F}^{2}\leq\|X\|_{F}^{2}\|Z\|_{F}^{2}$ for two antisymmetric matrices in $\mathfrak{o}(n)$ if $n>3$ ([3], lemma 2.5 provides explicit matrices for which equality holds; see also [4], proposition 4.2). Applying that inequality with $X=\frac{1}{\sqrt{2}}\begin{bmatrix}\sqrt{2\alpha}A_{1}&-B_{1}^{\mathsf{T}}\\ B_{1}&0\end{bmatrix}$, $Z=\frac{1}{\sqrt{2}}\begin{bmatrix}\sqrt{2\alpha}A_{2}&-B_{2}^{\mathsf{T}}\\ B_{2}&0\end{bmatrix}$, together with the similar inequalities for the cases $B_{1}=B_{2}=0$ and $A_{1}=A_{2}=0$, we can bound each term of $K_{1}$ in terms of $S$, thus obtaining a bound for $\mathcal{K}$.
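
The commutator inequality used here can be probed numerically. The sketch below is a plain-Python spot check on random antisymmetric $5\times 5$ matrices, not a proof; the seed, the matrix size and the helper names are arbitrary choices. It also evaluates one pair for which equality holds, taken from the $\frac{1}{4\alpha}$ row of table 1 with $p=4$.

```python
import random


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def frob2(X):
    """Squared Frobenius norm."""
    return sum(x * x for row in X for x in row)


def commutator(X, Z):
    XZ, ZX = matmul(X, Z), matmul(Z, X)
    return [[XZ[i][j] - ZX[i][j] for j in range(len(X))] for i in range(len(X))]


def random_skew(n, rng):
    """Random antisymmetric n x n matrix with entries in [-1, 1]."""
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            M[i][j] = rng.uniform(-1, 1)
            M[j][i] = -M[i][j]
    return M


def elementary_skew(n, terms):
    """Sum of s * (E_ij - E_ji) over triples (i, j, s), with 1-based indices."""
    M = [[0.0] * n for _ in range(n)]
    for i, j, s in terms:
        M[i - 1][j - 1] += s
        M[j - 1][i - 1] -= s
    return M


rng = random.Random(0)
ratios = []
for _ in range(200):
    X, Z = random_skew(5, rng), random_skew(5, rng)
    ratios.append(frob2(commutator(X, Z)) / (frob2(X) * frob2(Z)))
max_ratio = max(ratios)

# Equality case: A1 = E12 - E21 + E34 - E43, A2 = E13 - E31 - E24 + E42.
X_eq = elementary_skew(4, [(1, 2, 1), (3, 4, 1)])
Z_eq = elementary_skew(4, [(1, 3, 1), (2, 4, -1)])
equality_ratio = frob2(commutator(X_eq, Z_eq)) / (frob2(X_eq) * frob2(Z_eq))
print(max_ratio, equality_ratio)
```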

We will attempt to provide more refined bounds. The analysis of the sectional curvature range for Stiefel manifolds is more complicated than that of symmetric spaces because of the presence of both the $A$ and $B$ components. The manifold is homogeneous; therefore, the sectional curvature range is the same at every point. Let $E_{ij}$ ($1\leq i,j\leq p$) be the elementary matrix in $\mathbb{R}^{p\times p}$ whose $(i,j)$ entry is $1$ and whose other entries are $0$. Let $e_{ij}$ ($1\leq i\leq n-p$, $1\leq j\leq p$) be the elementary matrix in $\mathbb{R}^{(n-p)\times p}$ whose $(i,j)$ entry is $1$ and whose other entries are $0$.

In table 1, we show sectional curvature values of Stp,n\mathrm{St}_{p,n} at several sections (pairs of linearly independent tangent vectors), each defined by a quadruple (A1,B1,A2,B2)(A_{1},B_{1},A_{2},B_{2}). A few of those sections come from the corresponding sections for SO(n)\operatorname{SO}(n), in [3] as cited. We have noted that 𝒦\mathcal{K} is non-negative if α12\alpha\leq\frac{1}{2}, and table 1 shows a section with 𝒦=23α2\mathcal{K}=\frac{2-3\alpha}{2}, thus, if α>23\alpha>\frac{2}{3}, 𝒦\mathcal{K} always has negative values in its range. When p=2p=2, we will show 𝒦\mathcal{K} is non-negative if α23\alpha\leq\frac{2}{3}. When p>2p>2, 𝒦\mathcal{K} could be negative if 12α23\frac{1}{2}\leq\alpha\leq\frac{2}{3}. To see this, let A1=E12E21,B1=γ1/2e11,A2=E23E32A_{1}=E_{12}-E_{21},B_{1}=\gamma^{1/2}e_{11},A_{2}=E_{23}-E_{32}, B2=γ1/2e13B_{2}=\gamma^{1/2}e_{13} for γ,γ>0\gamma\in\mathbb{R},\gamma>0. Thus, [A1,A2]=E13E31[A_{1},A_{2}]=E_{13}-E_{31}, B1A2=B2A1=0B_{1}A_{2}=B_{2}A_{1}=0, B1B2𝖳=0B_{1}B_{2}^{\operatorname{\mathsf{T}}}=0, B1𝖳B2B2𝖳B1=γ(E13E31)B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}=\gamma(E_{13}-E_{31}). By eq. 3.9, the corresponding sectional curvature is

(4.1) 𝔠(γ)=α/2+α(4α3)γ+(23α)γ2/2(2α+γ)2\mathfrak{c}(\gamma)=\frac{\alpha/2+\alpha(4\alpha-3)\gamma+(2-3\alpha)\gamma^{2}/2}{(2\alpha+\gamma)^{2}}

with ddγ𝔠(γ)=α((710α)γ16α+8α2)/(γ+2α)3\frac{d}{d\gamma}\mathfrak{c}(\gamma)=\alpha((7-10\alpha)\gamma-1-6\alpha+8\alpha^{2})/(\gamma+2\alpha)^{3}, 𝔠\mathfrak{c} is minimized at

(4.2) γmin(α)=(1+6α8α2)/(710α)\gamma_{\min}(\alpha)=(1+6\alpha-8\alpha^{2})/(7-10\alpha)

Substituting, the function $\mathfrak{l}(\alpha):=\mathfrak{c}(\gamma_{\min}(\alpha))$ is slightly negative for $\alpha$ in the interval $(\frac{1}{2},\frac{7}{10})$, which contains $\frac{2}{3}$. Note that $\alpha=\frac{7}{10}$ is a removable singularity of $\mathfrak{l}$, and setting $\mathfrak{l}(\frac{7}{10})=\lim_{\gamma\to\infty}\mathfrak{c}(\gamma)=\frac{1}{2}(2-3\times\frac{7}{10})=-\frac{1}{20}$ makes it a smooth function. This function is strictly decreasing and negative in the interval $(\frac{1}{2},\frac{7}{10})$, with $\mathfrak{l}(\frac{1}{2})=0$ and $\mathfrak{l}(\frac{2}{3})=-\frac{1}{51}\approx-0.02$.
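
The values quoted for $\mathfrak{l}$ can be reproduced exactly from eq. 4.1 and eq. 4.2. The sketch below uses rational arithmetic; the names `c`, `gamma_min` and `ell` mirror $\mathfrak{c}$, $\gamma_{\min}$ and $\mathfrak{l}$, and the sample grid inside $[\frac{1}{2},\frac{7}{10})$ is an arbitrary choice, so the monotonicity check is only a spot check.

```python
from fractions import Fraction as F


def c(alpha, gamma):
    """Sectional curvature of eq. 4.1 for the section parametrized by gamma."""
    num = alpha / 2 + alpha * (4 * alpha - 3) * gamma + (2 - 3 * alpha) * gamma ** 2 / 2
    return num / (2 * alpha + gamma) ** 2


def gamma_min(alpha):
    """Critical point of eq. 4.2 (undefined at alpha = 7/10)."""
    return (1 + 6 * alpha - 8 * alpha ** 2) / (7 - 10 * alpha)


def ell(alpha):
    """The function l(alpha) = c(gamma_min(alpha))."""
    return c(alpha, gamma_min(alpha))


grid = [F(1, 2), F(11, 20), F(3, 5), F(13, 20), F(2, 3), F(69, 100)]
values = [ell(a) for a in grid]
for a, v in zip(grid, values):
    print(a, v, float(v))
```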

The curvature range contains the interval between the minimum and the maximum of the values in table 1 whose condition in the last column is satisfied. For $p=2$, proposition 1 determines the exact curvature range. For $p>2$, numerically, the sections in that table seem to determine the range completely. For each $\alpha$, the lower and upper bounds of the curvature range, found numerically by optimizing $\mathcal{K}$ over the space of all sections (the Grassmann manifold of two-dimensional subspaces of $\mathbb{R}^{\dim\mathrm{St}_{p,n}}$), lie within the minimal and maximal values of the applicable sections in the table, as shown in fig. 1, 2, 3, 4. There, we plot the curvatures of the listed sections as functions of $\alpha$ for each scenario, together with the results of the numerical optimization of the curvature range for a set of $30$ values of $\alpha$. The optimization is done for $n=4,p=3$; $n=5,p\in\{3,4\}$; $n=6,p\in\{3,4,5\}$; $n=10,p\in\{3,5,10,9\}$; $n=100,p\in\{10,20\}$. The curve $ll$ in the figures is the graph of the function $\mathfrak{l}$. When the optimized maximum is smaller than the proposed maximum, which happens for some small values of $\alpha$, the likely reason is that the optimizer is stuck at a local maximum.

\begin{tabular}{lll}
$\mathcal{K}$ & $A$ and $B$ & condition \\
\hline
$0$ & $A_{1}=A_{2}=E_{12}-E_{21},\ B_{1}=2e_{13},\ B_{2}=-\alpha e_{13}$ & $n\geq 4,\ p\geq 3$ \\
$0$ & $A_{1}=A_{2}=0,\ B_{1}=e_{11},\ B_{2}=e_{22}$ & $n\geq 4,\ p\leq n-2$ \\
$1$ & $A_{1}=A_{2}=0,\ B_{1}=e_{11},\ B_{2}=e_{21}$ & $n\geq 4,\ p\leq n-2$ \\
$\frac{1}{2\alpha+1}$ & $A_{1}=E_{12}-E_{21},\ A_{2}=E_{1p}-E_{p1},\ B_{1}=-e_{1p},\ B_{2}=e_{12}$ & $p\geq 3$ \\
$\frac{1}{8\alpha}$ & $A_{1}=E_{12}-E_{21},\ A_{2}=E_{23}-E_{32},\ B_{1}=B_{2}=0$ & $p\geq 3$ \\
$\frac{1}{4\alpha}$ & $A_{1}=E_{12}-E_{21}+E_{p-1,p}-E_{p,p-1},\ A_{2}=E_{1,p-1}-E_{p-1,1}-E_{2,p}+E_{p,2},\ B_{1}=B_{2}=0$ & $p\geq 4$ \\
$\frac{\alpha}{2}$ & $A_{1}=E_{12}-E_{21},\ A_{2}=0,\ B_{1}=0,\ B_{2}=e_{11}$ & \\
$\frac{2-3\alpha}{2}$ & $A_{1}=0,\ A_{2}=0,\ B_{1}=e_{11},\ B_{2}=e_{12}$ & \\
$\frac{4-3\alpha}{2}$ & $A_{1}=A_{2}=0,\ B_{1}=e_{11}+e_{22},\ B_{2}=e_{12}-e_{21}$ & $n\geq 4,\ p\leq n-2$ \\
$\mathfrak{l}(\alpha)$ & $A_{1}=E_{12}-E_{21},\ A_{2}=E_{23}-E_{32},\ B_{1}=\gamma_{\min}(\alpha)^{1/2}e_{11},\ B_{2}=\gamma_{\min}(\alpha)^{1/2}e_{13}$ & $p\geq 3,\ \alpha<7/10$ \\
\end{tabular}
Table 1. Sectional curvature at representative sections. $\mathfrak{l}(\alpha)=\mathfrak{c}(\gamma_{\min}(\alpha))$, from eq. 4.2 and eq. 4.1.
Figure 1. Numerical test for curvature range $n=4,p=3$. Max, min sims are curvature ranges from numerical optimization.
Figure 2. Numerical test for curvature range $n>4,p=3$.
Figure 3. Numerical test for curvature range $n-2\geq p\geq 4$. Max, min sims are curvature ranges from numerical optimization.
Figure 4. Numerical test for curvature range $p=n-1\geq 4$.
Proposition 1.

If p=2p=2 and n=3n=3, then the sectional curvature range of Stp,n\mathrm{St}_{p,n} is [α2,23α2][\frac{\alpha}{2},\frac{2-3\alpha}{2}] if α12\alpha\leq\frac{1}{2} and [23α2,α2][\frac{2-3\alpha}{2},\frac{\alpha}{2}] otherwise. In particular, if α<23\alpha<\frac{2}{3}, St2,3\mathrm{St}_{2,3} has strictly positive sectional curvature.

If p=2p=2 and n>3n>3 then the sectional curvature range is [0,43α2][0,\frac{4-3\alpha}{2}] if α23\alpha\leq\frac{2}{3}, [23α2,1][\frac{2-3\alpha}{2},1] if 23<α2\frac{2}{3}<\alpha\leq 2 and [23α2,α2][\frac{2-3\alpha}{2},\frac{\alpha}{2}] if α>2\alpha>2. Hence, when n>3n>3, St2,n\mathrm{St}_{2,n} has non-negative curvature if α23\alpha\leq\frac{2}{3}.
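
For quick reference, the case analysis of Proposition 1 can be transcribed into a small lookup function. This is only a sketch of the statement as given above, with an illustrative function name; it assumes $\alpha>0$ and $n\geq 3$, and it returns the pair (lower end, upper end) as exact rationals.

```python
from fractions import Fraction as F


def curvature_range_p2(n, alpha):
    """Sectional curvature range of St_{2,n} as stated in Proposition 1."""
    if n < 3 or alpha <= 0:
        raise ValueError("expects n >= 3 and alpha > 0")
    alpha = F(alpha)
    low_branch = (2 - 3 * alpha) / 2
    if n == 3:
        # The two extreme values swap roles at alpha = 1/2.
        if alpha <= F(1, 2):
            return alpha / 2, low_branch
        return low_branch, alpha / 2
    if alpha <= F(2, 3):
        return F(0), (4 - 3 * alpha) / 2
    if alpha <= 2:
        return low_branch, F(1)
    return low_branch, alpha / 2


for n, a in [(3, F(1, 4)), (3, F(1)), (5, F(1, 2)), (5, F(1)), (5, F(3))]:
    print(n, a, curvature_range_p2(n, a))
```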

Proof.

When $p=2$, $\mathfrak{o}(2)$ is one-dimensional, so $[A_{1},A_{2}]=0$ and we can set $A_{1}=(2\alpha)^{-1/2}c_{1}J$, $A_{2}=(2\alpha)^{-1/2}c_{2}J$ for $J=\begin{bmatrix}0&1\\ -1&0\end{bmatrix}$, with $c_{1},c_{2}\in\mathbb{R}$. Further, for two orthogonal matrices $U,V$ of compatible dimensions, the sectional curvature is unchanged if we replace $(A_{1},B_{1},A_{2},B_{2})$ with $(VA_{1}V^{\mathsf{T}},UB_{1}V,VA_{2}V^{\mathsf{T}},UB_{2}V)$. Thus, we can assume $B_{1}$ is rectangular diagonal, with diagonal entries denoted by $d_{i}$, $1\leq i\leq\min(n-p,p)$. We denote the entries of $B_{2}$ by $b_{ij}$, $1\leq i\leq n-p$, $1\leq j\leq p$. We note $B_{1}A_{2}-B_{2}A_{1}=(2\alpha)^{-1/2}(c_{2}B_{1}J-c_{1}B_{2}J)$, and since $JJ^{\mathsf{T}}=\operatorname{I}_{2}$, $\alpha^{2}\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}=\frac{\alpha}{2}(c_{2}^{2}\operatorname{Tr}(B_{1}B_{1}^{\mathsf{T}})+c_{1}^{2}\operatorname{Tr}(B_{2}B_{2}^{\mathsf{T}})-2c_{1}c_{2}\operatorname{Tr}(B_{1}B_{2}^{\mathsf{T}}))$. The orthogonality condition $\alpha\operatorname{Tr}A_{1}A_{2}^{\mathsf{T}}+\operatorname{Tr}B_{1}B_{2}^{\mathsf{T}}=0$ implies $c_{1}c_{2}+\operatorname{Tr}B_{1}B_{2}^{\mathsf{T}}=c_{1}c_{2}+\sum_{i=1}^{\min(p,n-p)}d_{i}b_{ii}=0$, or $c_{1}c_{2}=-\operatorname{Tr}B_{1}B_{2}^{\mathsf{T}}$, so $-2c_{1}c_{2}\operatorname{Tr}B_{1}B_{2}^{\mathsf{T}}=c_{1}^{2}c_{2}^{2}+(\operatorname{Tr}B_{1}B_{2}^{\mathsf{T}})^{2}$. This implies

\[
\alpha^{2}\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}=\frac{\alpha}{2}\big(c_{2}^{2}\operatorname{Tr}(B_{1}B_{1}^{\mathsf{T}})+c_{1}^{2}\operatorname{Tr}(B_{2}B_{2}^{\mathsf{T}})+c_{1}^{2}c_{2}^{2}+\operatorname{Tr}(B_{1}B_{2}^{\mathsf{T}})^{2}\big)
\]

For the case n=3n=3, from eq. 3.9, the curvature numerator 𝒦^\operatorname{\hat{\mathcal{K}}} is reduced to

23α2b122d12+α2(c22d12+c12(b112+b122)+c12c22+d12b112)\frac{2-3\alpha}{2}b_{12}^{2}d_{1}^{2}+\frac{\alpha}{2}(c_{2}^{2}d_{1}^{2}+c_{1}^{2}(b_{11}^{2}+b_{12}^{2})+c_{1}^{2}c_{2}^{2}+d_{1}^{2}b_{11}^{2})

and the curvature denominator is $S=(c_{1}^{2}+d_{1}^{2})(c_{2}^{2}+b_{11}^{2}+b_{12}^{2})$. We have $\hat{\mathcal{K}}-\frac{\alpha}{2}S=(1-2\alpha)b_{12}^{2}d_{1}^{2}$ and $\hat{\mathcal{K}}-(1-\frac{3\alpha}{2})S=(2\alpha-1)(c_{2}^{2}d_{1}^{2}+c_{1}^{2}(b_{11}^{2}+b_{12}^{2})+c_{1}^{2}c_{2}^{2}+d_{1}^{2}b_{11}^{2})$. Thus, the signs of the two differences are determined by the sign of $1-2\alpha$, and $\hat{\mathcal{K}}$ lies between the smaller and the larger of $\frac{\alpha}{2}S$ and $(1-\frac{3\alpha}{2})S$. Both bounds are attained by sections in table 1, hence they are tight.
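
The two difference identities for the case $n=3$ are polynomial identities in the variables $c_{1},c_{2},d_{1},b_{11},b_{12}$ and $\alpha$, so they can be spot-checked with exact arithmetic. The sketch below does this over a small grid of integer values; the helper name and the grid are illustrative assumptions, and the check is a sample rather than a symbolic proof.

```python
from fractions import Fraction as F
from itertools import product


def n3_quantities(a, c1, c2, d1, b11, b12):
    """Return (numerator K_hat, denominator S, bracketed sum q) for n = 3, p = 2."""
    q = c2**2 * d1**2 + c1**2 * (b11**2 + b12**2) + c1**2 * c2**2 + d1**2 * b11**2
    k_hat = (2 - 3 * a) / 2 * b12**2 * d1**2 + a / 2 * q
    s = (c1**2 + d1**2) * (c2**2 + b11**2 + b12**2)
    return k_hat, s, q


identities_hold = all(
    k - a / 2 * s == (1 - 2 * a) * b12**2 * d1**2
    and k - (1 - 3 * a / 2) * s == (2 * a - 1) * q
    for a in [F(1, 4), F(1, 2), F(2, 3), F(3, 2)]
    for c1, c2, d1, b11, b12 in product([-1, 0, 2], repeat=5)
    for k, s, q in [n3_quantities(a, c1, c2, d1, b11, b12)]
)
print(identities_hold)
```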

When n>3n>3, the denominator is S=(c12+i=12di2)(c22+ijbij2)S=(c_{1}^{2}+\sum_{i=1}^{2}d_{i}^{2})(c_{2}^{2}+\sum_{ij}b^{2}_{ij}). B1B_{1} consists of a square diagonal block of size 2×22\times 2 and the remaining zero block of size (n4)×2(n-4)\times 2. Expand B1B2𝖳B2B1𝖳F2\|B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}\|_{F}^{2} by dividing B2B_{2} to a square block corresponding to indices not exceeding two, which contributes 2(b21d1b12d2)22(b_{21}d_{1}-b_{12}d_{2})^{2} and the remaining blocks, which contributes 2j=12i>2bij2dj22\sum_{j=1}^{2}\sum_{i>2}b_{ij}^{2}d_{j}^{2}, 𝒦^\operatorname{\hat{\mathcal{K}}} is

23α2(b21d2b12d1)2+α2(c22di2+c12ijbij2+c12c22+(i=12dibii)2)+(b21d1b12d2)2+j=12i>2bij2dj2\begin{gathered}\frac{2-3\alpha}{2}(b_{21}d_{2}-b_{12}d_{1})^{2}+\frac{\alpha}{2}(c_{2}^{2}\sum d_{i}^{2}+c_{1}^{2}\sum_{ij}b_{ij}^{2}+c_{1}^{2}c_{2}^{2}+(\sum_{i=1}^{2}d_{i}b_{ii})^{2})\\ +(b_{21}d_{1}-b_{12}d_{2})^{2}+\sum_{j=1}^{2}\sum_{i>2}b_{ij}^{2}d_{j}^{2}\end{gathered}

The above expression shows when α2/3\alpha\leq 2/3, 𝒦0\mathcal{K}\geq 0. In this case, 123α/21\leq 2-3\alpha/2, α/223α/2\alpha/2\leq 2-3\alpha/2, thus j=12i>2bij2dj2(23α/2)i=12di2i>2bij2\sum_{j=1}^{2}\sum_{i>2}b_{ij}^{2}d_{j}^{2}\leq(2-3\alpha/2)\sum_{i=1}^{2}d_{i}^{2}\sum_{i>2}b^{2}_{ij} and

α2(c22di2+c12ijbij2+c12c22)43α2(c22di2+c12ijbij2+c12c22)\frac{\alpha}{2}(c_{2}^{2}\sum d_{i}^{2}+c_{1}^{2}\sum_{ij}b_{ij}^{2}+c_{1}^{2}c_{2}^{2})\leq\frac{4-3\alpha}{2}(c_{2}^{2}\sum d_{i}^{2}+c_{1}^{2}\sum_{ij}b_{ij}^{2}+c_{1}^{2}c_{2}^{2})

To show 𝒦^(23α/2)S\operatorname{\hat{\mathcal{K}}}\leq(2-3\alpha/2)S, we only need to show

23α2(b21d2b12d1)2+(b21d1b12d2)2+α2(i=12dibii)2(23α2)k=12dk2i2bij2\frac{2-3\alpha}{2}(b_{21}d_{2}-b_{12}d_{1})^{2}+(b_{21}d_{1}-b_{12}d_{2})^{2}+\frac{\alpha}{2}(\sum_{i=1}^{2}d_{i}b_{ii})^{2}\leq(2-\frac{3\alpha}{2})\sum_{k=1}^{2}d_{k}^{2}\sum_{i\leq 2}b_{ij}^{2}

This follows from the Cauchy-Schwarz inequality, applied to each of the three terms on the left-hand side; summing the resulting inequalities gives the claim. Indeed, the first two terms on the left-hand side are dominated by $((2-3\alpha)/2+1)(d_{1}^{2}+d_{2}^{2})(b_{21}^{2}+b_{12}^{2})$, while the last one is dominated by $\frac{\alpha}{2}(d_{1}^{2}+d_{2}^{2})(b_{11}^{2}+b_{22}^{2})\leq(2-\frac{3\alpha}{2})(d_{1}^{2}+d_{2}^{2})(b_{11}^{2}+b_{22}^{2})$.

Next, when $\alpha>2/3$, by Cauchy-Schwarz, $\hat{\mathcal{K}}\geq(1-3\alpha/2)(b_{21}^{2}+b_{12}^{2})(d_{1}^{2}+d_{2}^{2})\geq(1-3\alpha/2)S$, as $1-3\alpha/2<0$. When $2/3<\alpha\leq 2$, we have $\alpha/2\leq 1$, thus $\hat{\mathcal{K}}\leq S$, as the first term of $\hat{\mathcal{K}}$ is non-positive, while we can use Cauchy-Schwarz on $(\sum d_{i}b_{ii})^{2}$ and $(b_{21}d_{1}-b_{12}d_{2})^{2}$ as before. Finally, for $\alpha>2$, $\hat{\mathcal{K}}\leq\frac{\alpha}{2}S$, again because the first term of $\hat{\mathcal{K}}$ is non-positive, while the remaining terms are dominated by the corresponding terms in $\frac{\alpha}{2}S$, using Cauchy-Schwarz if necessary. Again, the bounds are tight by table 1. ∎

We note St2,3\mathrm{St}_{2,3} is SO(3)\operatorname{SO}(3), and could be considered as the sphere S3S^{3} with antipodal points identified (via the quaternion representation, for example). From the formula for the metric, we see this is the projective version of the Berger sphere.

Proposition 2.

For $p\geq 3$, the sectional curvature range of $\mathrm{St}_{p,n}$ contains an interval $I=I(n,p,\alpha)$ as described in table 2. The first row lists the applicable combinations of $(n,p)$; each column labeled $\alpha_{u}$ specifies the range of $\alpha$ for which the interval formula next to it applies. An interval applies for $\alpha$ greater than the previous $\alpha_{u}$ (if it exists) and not exceeding the current $\alpha_{u}$.

\begin{tabular}{ll|ll|ll|ll}
\multicolumn{2}{c|}{$(n=4,p=3)$} & \multicolumn{2}{c|}{$(n,3),\ n\geq 5$} & \multicolumn{2}{c|}{$(n,p),\ n-2\geq p\geq 4$} & \multicolumn{2}{c}{$(n,n-1),\ n\geq 5$} \\
$\alpha_{u}$ & $I$ & $\alpha_{u}$ & $I$ & $\alpha_{u}$ & $I$ & $\alpha_{u}$ & $I$ \\
\hline
$\frac{1}{6}$ & $[0,\frac{1}{8\alpha}]$ & $\frac{4-\sqrt{13}}{6}$ & $[0,\frac{1}{8\alpha}]$ & $\frac{4-\sqrt{10}}{6}$ & $[0,\frac{1}{4\alpha}]$ & $\frac{1}{2}$ & $[0,\frac{1}{4\alpha}]$ \\
$\frac{1}{2}$ & $[0,\frac{1}{1+2\alpha}]$ & $\frac{1}{2}$ & $[0,\frac{4-3\alpha}{2}]$ & $\frac{1}{2}$ & $[0,\frac{4-3\alpha}{2}]$ & $\frac{7}{10}$ & $[\mathfrak{l}(\alpha),\frac{1}{1+2\alpha}]$ \\
$\frac{7}{10}$ & $[\mathfrak{l}(\alpha),\frac{1}{1+2\alpha}]$ & $\frac{2}{3}$ & $[\mathfrak{l}(\alpha),\frac{4-3\alpha}{2}]$ & $\frac{2}{3}$ & $[\mathfrak{l}(\alpha),\frac{4-3\alpha}{2}]$ & $\frac{\sqrt{17}-1}{4}$ & $[\frac{2-3\alpha}{2},\frac{1}{1+2\alpha}]$ \\
$\frac{\sqrt{17}-1}{4}$ & $[\frac{2-3\alpha}{2},\frac{1}{1+2\alpha}]$ & $\frac{7}{10}$ & $[\mathfrak{l}(\alpha),1]$ & $\frac{7}{10}$ & $[\mathfrak{l}(\alpha),1]$ & $\infty$ & $[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$ \\
$\infty$ & $[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$ & $2$ & $[\frac{2-3\alpha}{2},1]$ & $2$ & $[\frac{2-3\alpha}{2},1]$ & & \\
 & & $\infty$ & $[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$ & $\infty$ & $[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$ & & \\
\end{tabular}
Table 2. Interval contained in the sectional curvature range of the Stiefel manifold $\mathrm{St}_{p,n}$ with metric defined by $\alpha$. $\mathfrak{l}(\alpha)=\mathfrak{c}(\gamma_{\min})$ with $\mathfrak{c}$ defined in eq. 4.1, and $\gamma_{\min}$ in eq. 4.2.

To illustrate, with (n,p)=(4,3)(n,p)=(4,3), for α16\alpha\leq\frac{1}{6}, the sectional curvature range contains the interval [0,18α][0,\frac{1}{8\alpha}], for 16<α12\frac{1}{6}<\alpha\leq\frac{1}{2}, it contains the interval [0,11+2α][0,\frac{1}{1+2\alpha}], etc. In the final row, for α>1714\alpha>\frac{\sqrt{17}-1}{4}, it contains the interval [23α2,α2][\frac{2-3\alpha}{2},\frac{\alpha}{2}].

Proof.

It is straightforward to check that for each pair $(n,p)$ in table 2, the values indicated correspond to a quadruple $(A_{1},B_{1},A_{2},B_{2})$ in table 1 that is applicable to the pair. For example, in the case $(n,p)=(4,3)$, the only applicable values from table 1 are $0$ (from the first row), $\frac{1}{2\alpha+1}$, $\frac{1}{8\alpha}$, $\mathfrak{l}(\alpha)$ and $\frac{2-3\alpha}{2}$. To show the sectional curvature range contains $I$, it remains to verify that the lower end of $I$ is not greater than the upper end, which is immediate, as $\mathfrak{l}(\alpha)$ is negative between $\frac{1}{2}$ and $\frac{7}{10}$, and $\frac{2-3\alpha}{2}$ is negative for $\alpha>\frac{7}{10}>\frac{2}{3}$.

The graphs in figures 1, 2, 3, 4 display the relative values of these functions. All the functions involved are simple algebraic functions, except for $\mathfrak{l}$. Hence, once we assess the contribution of $\mathfrak{l}$, it is easy to check that the lower end of $I$ corresponds to the smallest of the applicable values, and the upper end to the largest. The function $\gamma_{\min}$ from eq. 4.2 has a root at $\alpha_{s}=\frac{3+\sqrt{17}}{8}\approx 0.89$ and is negative in the interval $(\frac{7}{10},\alpha_{s})$; hence $\sqrt{\gamma_{\min}}$ and $B_{1},B_{2}$ for this section are not defined there, so $\mathfrak{l}(\alpha)$ cannot be an extremum for $\alpha\in(\frac{7}{10},\alpha_{s})$. In the interval $[\alpha_{s},2]$, $\mathfrak{l}$ has the approximate range $[0.14,0.38]$, which is less than $1$, and in the interval $[\alpha_{s},\frac{\sqrt{17}-1}{4}]$ it is less than $\frac{1}{1+2\alpha}$. For large $\alpha$, $\gamma_{\min}$ is approximated by $0.8\alpha$, thus $\mathfrak{l}(\alpha)$ has an asymptote with slope $\frac{4\times 0.8-3\times 0.8^{2}/2}{2.8^{2}}\approx 0.286$, smaller than the slope of $\frac{\alpha}{2}$. It is also easy to graph $\mathfrak{l}$ in the interim to show that, beyond its contribution to the lower bound in $[1/2,\frac{7}{10}]$, $\mathfrak{l}$ has no other effect on the curvature range.

With that analysis, for the case $(n,p)=(4,3)$, the only applicable values from table 1 are $0$ (from the first row), $\frac{1}{2\alpha+1}$, $\frac{1}{8\alpha}$, $\mathfrak{l}(\alpha)$ and $\frac{2-3\alpha}{2}$. If $\alpha<1/2$, all these functions are non-negative, and thus $0$ is the smallest value among them. When $\frac{1}{2}<\alpha<\frac{7}{10}$, $\mathfrak{l}(\alpha)$ is negative, and in the interval $[\frac{2}{3},\frac{7}{10}]$, $\frac{2-3\alpha}{2}$ is also negative, but $\mathfrak{l}(\alpha)$ is the lesser of the two; as discussed, $\mathfrak{l}(\alpha)$ has no effect for $\alpha>\frac{7}{10}$. Thus, for $\alpha>\frac{7}{10}$ the upper end of $I$ is $\max(\frac{1}{1+2\alpha},\frac{\alpha}{2})$, with the break-even point $\frac{\sqrt{17}-1}{4}$. In general, considering the upper and lower ends of $I$ as functions of $\alpha$, the values in column $\alpha_{u}$ correspond to nonsmooth points of these functions, or to infinity.

For the case $n\geq 5,p=n-1$, the applicable curvature values are $(0,\frac{1}{4\alpha},\frac{1}{8\alpha},\frac{1}{2\alpha+1},\mathfrak{l}(\alpha),\frac{2-3\alpha}{2},\frac{\alpha}{2})$. Again, with $\mathfrak{l}$ having an effect only in $[\frac{2}{3},\frac{7}{10}]$, it is straightforward to verify that the piecewise smooth function $\max(0,\frac{1}{4\alpha},\frac{1}{8\alpha},\frac{1}{2\alpha+1},\mathfrak{l}(\alpha),\frac{2-3\alpha}{2},\frac{\alpha}{2})$ has the form corresponding to the upper end of $I$, and that the lower end corresponds to the minimum of those functions, for $\alpha>\frac{7}{10}$. We address the case $p\leq n-2$ similarly. ∎
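The numerical constants quoted in this proof can be rechecked with a few lines of arithmetic. The sketch below only evaluates the closed-form expressions that appear in the text (the root $\alpha_{s}$, the break-even point $\frac{\sqrt{17}-1}{4}$ and the asymptotic slope); it does not evaluate $\mathfrak{l}$ or $\gamma_{\min}$ themselves, whose definitions are in eq. 4.1 and eq. 4.2. The variable names are illustrative.

```python
import math

# Root of gamma_min quoted in the proof: alpha_s = (3 + sqrt(17)) / 8.
alpha_s = (3 + math.sqrt(17)) / 8

# Break-even point quoted in the proof, where 1 / (1 + 2 alpha) = alpha / 2.
alpha_star = (math.sqrt(17) - 1) / 4
break_even_gap = 1 / (1 + 2 * alpha_star) - alpha_star / 2

# Asymptotic slope of l(alpha), copied verbatim from the expression in the text.
slope_l = (4 * 0.8 - 3 * 0.8**2 / 2) / 2.8**2

print(alpha_s, alpha_star, break_even_gap, slope_l)
```

The break-even point is the positive root of $2\alpha^{2}+\alpha-2=0$, which is why the gap vanishes up to rounding.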

For $\alpha=\frac{1}{2}$, when $p=n-1$, $n\geq 4$, the range contains $[0,\frac{1}{2}]$, and it could be shown to be exactly $[0,\frac{1}{2}]$, as the manifold is isometric to $\operatorname{SO}(n)$ with a bi-invariant metric. If $2\leq p\leq n-2$, $n\geq 5$, the range contains $[0,2-3\alpha/2]=[0,5/4]$, which is proved to be the exact range in [14]. For $\alpha=1$, the interval is $[-1/2,1]$. From the numerical evidence mentioned, this seems to be tight. We note that for $p\geq 3$, the curvature range becomes large both when $\alpha$ is large and when $\alpha$ is small.
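As a quick sanity check of the endpoint values quoted in this remark, the following sketch evaluates the relevant endpoint formulas with exact rational arithmetic. The function names are illustrative and are not taken from the paper; only the formulas $\frac{1}{1+2\alpha}$, $\frac{4-3\alpha}{2}$ and $\frac{2-3\alpha}{2}$ come from the text.

```python
from fractions import Fraction

def upper_p_eq_n_minus_1(alpha):
    """Upper endpoint 1 / (1 + 2 alpha), used above for p = n - 1."""
    return 1 / (1 + 2 * alpha)

def upper_p_le_n_minus_2(alpha):
    """Upper endpoint (4 - 3 alpha) / 2 = 2 - 3 alpha / 2, used above for p <= n - 2."""
    return (4 - 3 * alpha) / 2

def lower_linear(alpha):
    """Lower endpoint (2 - 3 alpha) / 2."""
    return (2 - 3 * alpha) / 2

half, one = Fraction(1, 2), Fraction(1)
at_half_p_eq = upper_p_eq_n_minus_1(half)   # upper end of [0, 1/2]
at_half_p_le = upper_p_le_n_minus_2(half)   # upper end of [0, 5/4]
at_one_lower = lower_linear(one)            # lower end of [-1/2, 1]
print(at_half_p_eq, at_half_p_le, at_one_lower)
```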

5. Deformation metrics on normal homogeneous manifolds

For a Lie group $\mathtt{G}$ and $U\in\mathtt{G}$, we denote by $\mathcal{L}_{U}$ the left-multiplication map and by $d\mathcal{L}_{U}$ its differential. As usual, $\operatorname{ad}_{A}$ denotes the operator $X\mapsto[A,X]$ on the Lie algebra $\mathfrak{g}$ of $\mathtt{G}$ ($A,X\in\mathfrak{g}$). We recall a few results on curvatures of Lie groups.

Proposition 3.

Let $\mathtt{G}$ be a connected Lie group with Lie algebra $\mathfrak{g}$, with a left-invariant metric given by an inner product $\langle\,,\rangle_{\operatorname{P}}$ on $\mathfrak{g}$. For $A\in\mathfrak{g}$, let $\operatorname{ad}^{\dagger}_{A}$ be the adjoint of $\operatorname{ad}_{A}$ under $\langle\,,\rangle_{\operatorname{P}}$, that is, $\operatorname{ad}^{\dagger}_{A}$ is the linear operator on $\mathfrak{g}$ such that $\langle[A,A_{1}],A_{2}\rangle_{\operatorname{P}}=\langle A_{1},\operatorname{ad}^{\dagger}_{A}A_{2}\rangle_{\operatorname{P}}$. Define

$$[A,B]_{\operatorname{P}}=[A,B]-\operatorname{ad}^{\dagger}_{A}B-\operatorname{ad}^{\dagger}_{B}A \tag{5.1}$$

Let $\nabla^{\mathtt{G}}$ be the Levi-Civita connection on $\mathtt{G}$. For two vector fields $\mathtt{X},\mathtt{Y}$ on $\mathtt{G}$, there exist $\mathfrak{g}$-valued functions $A(U),B(U)$, $U\in\mathtt{G}$, such that $\mathtt{X}(U)=d\mathcal{L}_{U}A(U)$, $\mathtt{Y}(U)=d\mathcal{L}_{U}B(U)$. We have

$$(\nabla^{\mathtt{G}}_{\mathtt{X}}\mathtt{Y})(U)=d\mathcal{L}_{U}\left((\operatorname{D}_{\mathtt{X}}B)(U)+\frac{1}{2}[A(U),B(U)]_{\operatorname{P}}\right) \tag{5.2}$$

where $\operatorname{D}_{\mathtt{X}}B$ is the Lie derivative of the $\mathfrak{g}$-valued function $B$ by the vector field $\mathtt{X}$.

For $\omega_{1},\omega_{2},\omega_{3}\in\mathfrak{g}$, the curvature of $\mathtt{G}$ at the identity is given by

$$\operatorname{R}^{\mathtt{G}}_{\omega_{1},\omega_{2}}\omega_{3}=\frac{1}{2}[[\omega_{1},\omega_{2}],\omega_{3}]_{\operatorname{P}}-\frac{1}{4}[\omega_{1},[\omega_{2},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}}+\frac{1}{4}[\omega_{2},[\omega_{1},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}} \tag{5.3}$$

Let $\mathfrak{k}$ be a subalgebra of $\mathfrak{g}$ such that $\operatorname{P}$ is $\operatorname{ad}(\mathfrak{k})$-invariant, that is, $\langle[K,A],B\rangle_{\operatorname{P}}+\langle A,[K,B]\rangle_{\operatorname{P}}=0$ for $K\in\mathfrak{k}$, $A,B\in\mathfrak{g}$, and such that $\mathfrak{k}$ corresponds to a closed subgroup $\mathtt{K}\subset\mathtt{G}$ that acts freely and properly on $\mathtt{G}$ by isometries under right multiplication, so that $\mathtt{G}/\mathtt{K}$ is a homogeneous space. If $\mathfrak{g}=\mathfrak{k}\oplus\mathfrak{m}$ is an orthogonal decomposition under $\langle\,,\rangle_{\operatorname{P}}$, then the horizontal lift of the curvature of $\mathtt{M}=\mathtt{G}/\mathtt{K}$ at $o$, the equivalence class containing the unit of $\mathtt{G}$, evaluated at three horizontal vectors $\omega_{1},\omega_{2},\omega_{3}\in\mathfrak{m}$ is

$$\begin{gathered}\operatorname{R}^{\mathtt{M}}_{\omega_{1},\omega_{2}}\omega_{3}=\Big(\frac{1}{2}[[\omega_{1},\omega_{2}],\omega_{3}]_{\operatorname{P}}-\frac{1}{4}[\omega_{1},[\omega_{2},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}}+\frac{1}{4}[\omega_{2},[\omega_{1},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}}\\ +\frac{1}{2}\operatorname{ad}^{\dagger}_{\omega_{3}}[\omega_{1},\omega_{2}]_{\mathfrak{k}}-\frac{1}{4}\operatorname{ad}^{\dagger}_{\omega_{1}}[\omega_{2},\omega_{3}]_{\mathfrak{k}}+\frac{1}{4}\operatorname{ad}^{\dagger}_{\omega_{2}}[\omega_{1},\omega_{3}]_{\mathfrak{k}}\Big)_{\mathfrak{m}}\end{gathered} \tag{5.4}$$

Here, $\omega_{\mathfrak{v}}$ denotes the orthogonal projection of $\omega$ to $\mathfrak{v}$ for an element $\omega\in\mathfrak{g}$ and a subspace $\mathfrak{v}$ of $\mathfrak{g}$. Also, given two vector fields $\mathtt{X},\mathtt{Y}$ on $\mathtt{M}$, which lift to horizontal vector fields $\bar{\mathtt{X}},\bar{\mathtt{Y}}$ on $\mathtt{G}$, with $\bar{\mathtt{X}}(U)=d\mathcal{L}_{U}A(U)$, $\bar{\mathtt{Y}}(U)=d\mathcal{L}_{U}B(U)$ for two $\mathfrak{g}$-valued functions $A(U),B(U)$ on $\mathtt{G}$, the horizontal lift of $\nabla_{\mathtt{X}}\mathtt{Y}$ is given by

$$\overline{\nabla_{\mathtt{X}}\mathtt{Y}(U)}=d\mathcal{L}_{U}\left((\operatorname{D}_{\bar{\mathtt{X}}}B)(U)+\frac{1}{2}[A(U),B(U)]_{\operatorname{P}}\right)_{\mathfrak{m}} \tag{5.5}$$

Note that in general $[\,,]_{\operatorname{P}}$ is not anticommutative, as the term $\operatorname{ad}^{\dagger}_{A}B+\operatorname{ad}^{\dagger}_{B}A$ is symmetric in $A$ and $B$, and we have $[A,B]_{\operatorname{P}}-[B,A]_{\operatorname{P}}=2[A,B]$.

Proof.

First, we note that for three $\mathfrak{g}$-valued functions $A$, $B$, $C$

$$\begin{gathered}\langle[A,B]_{\operatorname{P}},C\rangle_{\operatorname{P}}+\langle B,[A,C]_{\operatorname{P}}\rangle_{\operatorname{P}}=\langle[A,B],C\rangle_{\operatorname{P}}-\langle B,[A,C]\rangle_{\operatorname{P}}-\langle A,[B,C]\rangle_{\operatorname{P}}\\ +\langle B,[A,C]\rangle_{\operatorname{P}}-\langle[A,B],C\rangle_{\operatorname{P}}-\langle[C,B],A\rangle_{\operatorname{P}}=0\end{gathered}$$

For each smooth function $F:\mathtt{G}\to\mathfrak{g}$, denote by $\mathcal{L}[F]$ the vector field $U\mapsto d\mathcal{L}_{U}F(U)$. Denote by $\langle\,,\rangle_{\mathtt{G}}$ the left-invariant metric induced by $\operatorname{P}$. For three vector fields $\mathtt{X},\mathtt{Y},\mathtt{Z}$ with $\mathtt{X}=\mathcal{L}[A]$, $\mathtt{Y}=\mathcal{L}[B]$ and $\mathtt{Z}=\mathcal{L}[C]$, where $A,B,C$ are three smooth $\mathfrak{g}$-valued functions, we have

$$\begin{gathered}\operatorname{D}_{\mathtt{X}}\langle\mathtt{Y},\mathtt{Z}\rangle_{\mathtt{G}}=\operatorname{D}_{\mathtt{X}}\langle B,C\rangle_{\operatorname{P}}=\langle\operatorname{D}_{\mathtt{X}}B,C\rangle_{\operatorname{P}}+\langle B,\operatorname{D}_{\mathtt{X}}C\rangle_{\operatorname{P}}\\ =\langle\mathcal{L}[\operatorname{D}_{\mathtt{X}}B+\frac{1}{2}[A,B]_{\operatorname{P}}],\mathtt{Z}\rangle_{\mathtt{G}}+\langle\mathtt{Y},\mathcal{L}[\operatorname{D}_{\mathtt{X}}C+\frac{1}{2}[A,C]_{\operatorname{P}}]\rangle_{\mathtt{G}}\end{gathered}$$

as the metric is left-invariant and $\operatorname{P}$ is constant on $\mathfrak{g}$; the last equality applies the identity just proved. We can verify that $\mathcal{L}[\operatorname{D}_{\mathtt{X}}B+\frac{1}{2}[A,B]_{\operatorname{P}}]$ satisfies the derivative rule of a connection, and we have just proved it is metric compatible. Torsion-freeness follows from $[A,B]_{\operatorname{P}}-[B,A]_{\operatorname{P}}=2[A,B]$; thus $\mathcal{L}[\operatorname{D}_{\mathtt{X}}B+\frac{1}{2}[A,B]_{\operatorname{P}}]$ is the Levi-Civita connection.

Equation 5.2 is from [7], equation 3.3.2 (the author uses a right-invariant metric). It is related to the Euler-Poisson-Arnold equation (EPDiff), see equation (55) in Arnold’s classical paper [1]. See also [8].

Equation (5.3) now follows directly from the definition of curvature $\nabla_{[\mathtt{X},\mathtt{Y}]}\mathtt{Z}-\nabla_{\mathtt{X}}\nabla_{\mathtt{Y}}\mathtt{Z}+\nabla_{\mathtt{Y}}\nabla_{\mathtt{X}}\mathtt{Z}$, applied to the invariant vector fields $\mathcal{L}[\omega_{i}]$, $i\in\{1,2,3\}$. Equation 5.4 follows from the O’Neill equation (Theorem 2, [11]), written in $(1,3)$ tensor form. Indeed, the O’Neill tensor of two vector fields $\mathcal{L}[A],\mathcal{L}[B]$ on $\mathtt{G}$, for $\mathfrak{g}$-valued functions $A$ and $B$, evaluated at the coset $o$ is $\frac{1}{2}[A,B]_{\mathfrak{k}}$: the result just proved for covariant derivatives shows the Lie bracket $\{\mathcal{L}[A],\mathcal{L}[B]\}=\mathcal{L}[[A,B]]$, and then we use Lemma 2, [11]. By properties of the adjoint and of the projection, the right-hand side of eq. 5.4 is the unique vector in $\mathfrak{m}$ such that the O’Neill equation (equation 4, Theorem 2, [11]) is satisfied. Equation (5.5) follows from the result for $\mathtt{G}$ and the property of the horizontal lift of a connection in a Riemannian submersion, e.g. lemma 7.45 in [12] (because of left-invariance, we can translate the projection to the identity). ∎

For a subspace $\mathfrak{v}\subset\mathfrak{g}$, we write $\omega_{1\mathfrak{v}}$ for $(\omega_{1})_{\mathfrak{v}}$, the projection of $\omega_{1}$ to $\mathfrak{v}$ ($\omega_{1}\in\mathfrak{g}$). We write $[\omega_{1},\omega_{2}]_{\mathfrak{v}}$, $[\omega_{1},\omega_{2}]_{\operatorname{P}\mathfrak{v}}$ for the corresponding projections of brackets.

On a Lie group with a bi-invariant metric $\langle\,,\rangle$, we now introduce a family of left-invariant metrics called the Cheeger deformation metrics ([2, 16, 5]). The Lie algebra used in the deformation will be called $\mathfrak{a}$ here (it is often called $\mathfrak{k}$, but we use $\mathtt{K}$ for the stabilizer group). We will use the letters $\mathfrak{a},\mathfrak{b}$ corresponding to the components $A$, $B$ of the Stiefel tangent vectors, as will be seen shortly. Let $\mathtt{A}$ be a connected subgroup of $\mathtt{G}$ with Lie algebra $\mathfrak{a}$. With the bi-invariant metric on $\mathtt{G}$, $\mathtt{A}$ acts via right multiplication as a group of isometries on $\mathtt{G}$. Give $\mathtt{G}\times\mathtt{A}$ a bi-invariant metric corresponding to the inner product on $\mathfrak{g}\oplus\mathfrak{a}$ evaluated as $\langle g,g\rangle+r\langle a,a\rangle$ for $(g,a)\in\mathfrak{g}\times\mathfrak{a}$ with $r>0$. We have the submersion $\mathtt{G}\times\mathtt{A}\to\mathtt{G}$ given by $(U,Q)\mapsto UQ^{-1}$ ($U\in\mathtt{G},Q\in\mathtt{A}$). Let $\mathfrak{g}=\mathfrak{a}\oplus\mathfrak{n}$ be an orthogonal decomposition with respect to $\langle\,,\rangle$. The submersion induces a new metric on $\mathtt{G}$, which is shown in [16] to be

$$\langle\omega_{\mathfrak{n}},\omega_{\mathfrak{n}}\rangle+\frac{r}{r+1}\langle\omega_{\mathfrak{a}},\omega_{\mathfrak{a}}\rangle$$

for $\omega\in\mathfrak{g}$. Denote by $\operatorname{P}_{t}$ the Cheeger deformation metric on $\mathfrak{g}$ given by the formula $\langle\omega_{\mathfrak{n}},\omega_{\mathfrak{n}}\rangle+t\langle\omega_{\mathfrak{a}},\omega_{\mathfrak{a}}\rangle$ for $t>0$. At $t=1$, it is the original metric. For $t<1$, the metric corresponds to the submersion above with $r=t/(1-t)$, thus $\mathtt{G}$ has non-negative curvature by O’Neill’s equation. For $t>1$, the metric on $\mathtt{G}\times\mathtt{A}$ is semi-Riemannian but the corresponding metric on $\mathtt{G}$ is Riemannian. If $\mathfrak{n}$ contains a subalgebra $\mathfrak{k}$ corresponding to a closed subgroup $\mathtt{K}$ of $\mathtt{G}$, such that $\mathfrak{k}$ commutes with $\mathfrak{a}$, then $\mathtt{G}/\mathtt{K}$ could be equipped with the quotient metric induced from $\operatorname{P}_{t}$. Hence, we will consider the situation when $\mathfrak{k}$ is a subalgebra of an algebra $\mathfrak{h}$ commuting with $\mathfrak{a}$. Note that $\mathtt{G}/\mathtt{K}$ with the original bi-invariant metric is called a normal homogeneous space in the literature, while $\operatorname{P}_{t}$ is no longer bi-invariant.
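The relation between the submersion parameter $r$ and the deformation parameter $t$ used above is elementary. As a worked step, added here for convenience and not taken from the cited references:

```latex
\[
t=\frac{r}{r+1}
\iff t(r+1)=r
\iff r=\frac{t}{1-t}, \qquad t\neq 1 .
\]
\[
0<t<1 \;\Longrightarrow\; r>0,
\qquad
t>1 \;\Longrightarrow\; r<-1 .
\]
```

In the second case the form $\langle g,g\rangle+r\langle a,a\rangle$ on $\mathfrak{g}\oplus\mathfrak{a}$ is indefinite, which matches the semi-Riemannian statement for $t>1$.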

Proposition 4.

Assume the Lie algebra $\mathfrak{g}$ has a bi-invariant metric $\langle\,,\rangle$. Let $\mathfrak{h}\subset\mathfrak{g}$ be a Lie subalgebra of $\mathfrak{g}$ and $\mathfrak{h}^{\perp}$ be the orthogonal complement of $\mathfrak{h}$ in $\mathfrak{g}$ under $\langle\,,\rangle$, $\mathfrak{g}=\mathfrak{h}\oplus\mathfrak{h}^{\perp}$. Then $\mathfrak{b}:=[\mathfrak{h},\mathfrak{h}^{\perp}]\subset\mathfrak{h}^{\perp}$, or $\mathfrak{h}^{\perp}$ is an $\mathfrak{h}$-module. Let $\mathfrak{h}^{\perp}=\mathfrak{b}\oplus\mathfrak{a}$ be an orthogonal decomposition under $\langle\,,\rangle$. We can characterize $\mathfrak{a}$ as the subspace $\{A\in\mathfrak{h}^{\perp}\,|\,[A,\mathfrak{h}]=0\}$. Then

$$\mathfrak{g}=\mathfrak{h}\oplus\mathfrak{b}\oplus\mathfrak{a} \tag{5.6}$$

We have $[\mathfrak{a},\mathfrak{b}]\subset\mathfrak{b}$, $\mathfrak{a}$ is a Lie subalgebra of $\mathfrak{g}$, $[\mathfrak{a},\mathfrak{h}]=0$ and $\mathfrak{b}$ is both an $\mathfrak{h}$-module and an $\mathfrak{a}$-module. The correspondence $\mathfrak{h}\mapsto\mathfrak{a}$ is involutive on the set of all subalgebras of $\mathfrak{g}$, that is, if we apply the same procedure to $\mathfrak{a}$, we recover $\mathfrak{h}$.

Proof.

Let $X\in\mathfrak{h}^{\perp}$ and $A,H\in\mathfrak{h}$. Then $\langle[A,X],H\rangle=-\langle X,[A,H]\rangle=0$ since $\mathfrak{h}$ is a subalgebra of $\mathfrak{g}$, thus $[A,X]\in\mathfrak{h}^{\perp}$. Assume the $\langle\,,\rangle$-orthogonal decomposition $\mathfrak{h}^{\perp}=\mathfrak{b}\oplus\mathfrak{a}$ with $\mathfrak{b}=[\mathfrak{h},\mathfrak{h}^{\perp}]$. For $A\in\mathfrak{a}$, $\langle[A,\mathfrak{h}],\mathfrak{h}^{\perp}\rangle\subset\langle A,[\mathfrak{h},\mathfrak{h}^{\perp}]\rangle\subset\{0\}$ and $[A,\mathfrak{h}]\subset\mathfrak{h}^{\perp}$ as $\mathfrak{h}^{\perp}$ is an $\mathfrak{h}$-module. Hence, $[A,\mathfrak{h}]=0$ as $\langle\,,\rangle$ is non-degenerate on $\mathfrak{h}^{\perp}$. Conversely, if $A\in\mathfrak{h}^{\perp}$ and $[A,\mathfrak{h}]=0$ then $\langle A,[\mathfrak{h},\mathfrak{h}^{\perp}]\rangle\subset\langle[A,\mathfrak{h}],\mathfrak{h}^{\perp}\rangle\subset\{0\}$, thus $A\in\mathfrak{a}$. We have proved that $\mathfrak{a}$ is characterized as the subspace of elements $A\in\mathfrak{h}^{\perp}$ such that $[A,\mathfrak{h}]=0$.

Next, for $A\in\mathfrak{a}$, $\langle[A,\mathfrak{h}^{\perp}],\mathfrak{h}\rangle\subset\langle A,[\mathfrak{h}^{\perp},\mathfrak{h}]\rangle\subset\{0\}$, thus $[A,\mathfrak{h}^{\perp}]\subset\mathfrak{h}^{\perp}$. Then

$$\langle[A,[\mathfrak{h},\mathfrak{h}^{\perp}]],\mathfrak{a}\rangle\subset\langle[[A,\mathfrak{h}],\mathfrak{h}^{\perp}],\mathfrak{a}\rangle+\langle[\mathfrak{h},[A,\mathfrak{h}^{\perp}]],\mathfrak{a}\rangle\subset\{0\}$$

as in the middle sum, the first term is zero because $[A,\mathfrak{h}]=0$, and the second is $\langle[\mathfrak{h},[A,\mathfrak{h}^{\perp}]],\mathfrak{a}\rangle\subset\langle[A,\mathfrak{h}^{\perp}],[\mathfrak{h},\mathfrak{a}]\rangle\subset\{0\}$ as $[\mathfrak{h},\mathfrak{a}]=\{0\}$. This shows $[\mathfrak{a},\mathfrak{b}]$ is in the orthogonal complement of $\mathfrak{a}$ in $\mathfrak{h}^{\perp}$, or $[\mathfrak{a},\mathfrak{b}]\subset\mathfrak{b}$.

Now, $\langle[\mathfrak{a},\mathfrak{a}],\mathfrak{h}\rangle\subset\langle\mathfrak{a},[\mathfrak{a},\mathfrak{h}]\rangle\subset\{0\}$, thus $[\mathfrak{a},\mathfrak{a}]\subset\mathfrak{h}^{\perp}$. But then $\langle[\mathfrak{a},\mathfrak{a}],\mathfrak{b}\rangle\subset\langle\mathfrak{a},[\mathfrak{a},\mathfrak{b}]\rangle\subset\langle\mathfrak{a},\mathfrak{b}\rangle\subset\{0\}$, hence $[\mathfrak{a},\mathfrak{a}]\subset\mathfrak{a}$. Therefore $\mathfrak{a}$ is a subalgebra of $\mathfrak{g}$, and $\mathfrak{b}$ is an $\mathfrak{a}$-module. Involutiveness follows from the orthogonal decomposition $\mathfrak{g}=\mathfrak{h}\oplus\mathfrak{b}\oplus\mathfrak{a}$ and the characterization of $\mathfrak{a}$ by the relation $[\mathfrak{a},\mathfrak{h}]=0$, which implies $\mathfrak{a}^{\perp}=\mathfrak{b}\oplus\mathfrak{h}$. ∎

Proposition 5.

Assume $\mathfrak{g}$ has a bi-invariant inner product $\langle\,,\rangle$. Let $\operatorname{P}$ be a positive-definite self-adjoint operator under the inner product $\langle\,,\rangle$. Then under the inner product $\langle\,,\rangle_{\operatorname{P}}$ defined by $\langle A_{1},A_{2}\rangle_{\operatorname{P}}:=\langle A_{1},\operatorname{P}A_{2}\rangle$, we have $\operatorname{ad}^{\dagger}_{A}X=-\operatorname{P}^{-1}[A,\operatorname{P}X]$ for $X\in\mathfrak{g}$, or $\operatorname{ad}^{\dagger}_{A}=-\operatorname{P}^{-1}\circ\operatorname{ad}_{A}\circ\operatorname{P}$.

Let $t$ be a positive number and $\mathfrak{a},\mathfrak{b},\mathfrak{h}$ be as in proposition 4. Let $\mathfrak{n}=\mathfrak{b}\oplus\mathfrak{h}$, thus $\mathfrak{g}=\mathfrak{a}\oplus\mathfrak{n}$. Define the operator $\operatorname{P}=\operatorname{P}_{t}$ by $\operatorname{P}\omega=t\omega_{\mathfrak{a}}+\omega_{\mathfrak{n}}$. Then for $\omega_{1},\omega_{2}\in\mathfrak{g}$

$$(\operatorname{ad}^{\dagger}_{\omega_{1}}\omega_{2})_{\mathfrak{a}}=-[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]-\frac{1}{t}[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}} \tag{5.7}$$
$$(\operatorname{ad}^{\dagger}_{\omega_{1}}\omega_{2})_{\mathfrak{n}}=-[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{b}}]+t[\omega_{2\mathfrak{a}},\omega_{1\mathfrak{b}}]-[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{n}} \tag{5.8}$$
$$[\omega_{1},\omega_{2}]_{\operatorname{P}}=[\omega_{1},\omega_{2}]+(1-t)([\omega_{1\mathfrak{a}},\omega_{2\mathfrak{b}}]+[\omega_{2\mathfrak{a}},\omega_{1\mathfrak{b}}]) \tag{5.9}$$

Let $\mathfrak{k}\subset\mathfrak{h}$ be a Lie subalgebra of $\mathfrak{h}$ and $\mathfrak{m}=\mathfrak{a}\oplus\mathfrak{b}\oplus\mathfrak{d}$, where $\mathfrak{h}=\mathfrak{k}\oplus\mathfrak{d}$ is an orthogonal decomposition, thus $\mathfrak{g}=\mathfrak{k}\oplus\mathfrak{m}$. For $\omega_{3}\in\mathfrak{g}$

$$(\operatorname{ad}^{\dagger}_{\omega_{3}}[\omega_{1},\omega_{2}]_{\mathfrak{k}})_{\mathfrak{m}}=-[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}] \tag{5.10}$$
Proof.

Let $A,Y,X\in\mathfrak{g}$. From $\operatorname{ad}(\mathfrak{g})$-invariance of $\langle\,,\rangle$ we have

$$\langle[A,Y],\operatorname{P}X\rangle=\langle Y,-\operatorname{P}\operatorname{P}^{-1}[A,\operatorname{P}X]\rangle$$

which gives us the first statement.

For eq. 5.7 and eq. 5.8, we expand

$$\begin{gathered}\operatorname{ad}^{\dagger}_{\omega_{1}}\omega_{2}=-\operatorname{P}^{-1}[\omega_{1\mathfrak{a}}+\omega_{1\mathfrak{n}},t\omega_{2\mathfrak{a}}+\omega_{2\mathfrak{n}}]\\ =(-1/t)([t\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]+[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}})-([\omega_{1\mathfrak{a}},\omega_{2\mathfrak{n}}]+[\omega_{1\mathfrak{n}},t\omega_{2\mathfrak{a}}]+[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}])_{\mathfrak{n}}\end{gathered}$$

then use the fact that $[\mathfrak{a},\mathfrak{h}]=\{0\}$. Equation 5.9 follows from this and the definition of $[\,,]_{\operatorname{P}}$, using anti-commutativity to cancel $\frac{1}{t}([\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}}+[\omega_{2\mathfrak{n}},\omega_{1\mathfrak{n}}]_{\mathfrak{a}})$.

For eq. 5.10, let $\omega_{4}\in\mathfrak{m}$. We have

$$\langle\operatorname{ad}^{\dagger}_{\omega_{3}}[\omega_{1},\omega_{2}]_{\mathfrak{k}},\omega_{4}\rangle_{\operatorname{P}}=\langle[\omega_{1},\omega_{2}]_{\mathfrak{k}},\operatorname{P}[\omega_{3},\omega_{4}]\rangle=\langle[\omega_{1},\omega_{2}]_{\mathfrak{k}},[\omega_{3},\omega_{4}]_{\mathfrak{k}}\rangle$$

as, when we expand $\operatorname{P}[\omega_{3},\omega_{4}]$, only the component $[\omega_{3},\omega_{4}]_{\mathfrak{k}}$ can fail to be orthogonal to $[\omega_{1},\omega_{2}]_{\mathfrak{k}}$. From here $\langle[\omega_{1},\omega_{2}]_{\mathfrak{k}},[\omega_{3},\omega_{4}]_{\mathfrak{k}}\rangle=\langle[\omega_{1},\omega_{2}]_{\mathfrak{k}},[\omega_{3},\omega_{4}]\rangle=-\langle[\omega_{3},[\omega_{1},\omega_{2}]_{\mathfrak{k}}],\omega_{4}\rangle$. But $[\omega_{3\mathfrak{k}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}]$ is orthogonal to $\omega_{4}\in\mathfrak{m}$, so we are left with $-\langle[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}],\omega_{4}\rangle=-\langle[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}],\omega_{4}\rangle_{\operatorname{P}}$ as $[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}]_{\mathfrak{a}}=0$, because $[\omega_{3\mathfrak{a}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}]=0$ while the remaining term is in $\mathfrak{b}\oplus\mathfrak{h}$. By proposition 4, $[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}]\in\mathfrak{m}$ since $\mathfrak{m}$ is the orthogonal complement of $\mathfrak{k}$; this proves eq. 5.10. ∎
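The identities of proposition 5 can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $\mathfrak{g}=\mathfrak{o}(5)$ with $\mathfrak{a}=\mathfrak{o}(2)$ (top-left block), $\mathfrak{h}=\mathfrak{o}(3)$ (bottom-right block), the form $\frac{1}{2}\operatorname{Tr}(X^{\mathsf{T}}Y)$ and $t=0.7$, and uses hypothetical helper names. It tests the adjoint property of $-\operatorname{P}^{-1}\circ\operatorname{ad}_{A}\circ\operatorname{P}$ and compares definition (5.1) with the closed form (5.9) on random skew-symmetric matrices.

```python
import random

N, P = 5, 2      # illustrative sizes: g = o(5), a = o(2), h = o(3)
T = 0.7          # illustrative deformation parameter t > 0

def add(X, Y, c=1.0):
    """Return X + c * Y for square matrices stored as nested lists."""
    return [[x + c * y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def mul(X, Y):
    """Matrix product of two N x N matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return add(mul(X, Y), mul(Y, X), -1.0)

def inner(X, Y):
    """Bi-invariant form (1/2) Tr(X^T Y)."""
    return 0.5 * sum(x * y for r, s in zip(X, Y) for x, y in zip(r, s))

def part(X, name):
    """Projection onto block 'a' (top-left), 'h' (bottom-right) or 'b' (rest)."""
    def which(i, j):
        if i < P and j < P:
            return "a"
        return "h" if (i >= P and j >= P) else "b"
    return [[x if which(i, j) == name else 0.0 for j, x in enumerate(r)]
            for i, r in enumerate(X)]

def apply_P(X, t):
    """P_t X = t X_a + X_n; calling it with 1/t applies the inverse."""
    return add(X, part(X, "a"), t - 1.0)

def ad_dagger(A, X, t):
    """ad^dagger_A X = -P^{-1} [A, P X]."""
    Y = apply_P(bracket(A, apply_P(X, t)), 1.0 / t)
    return [[-y for y in r] for r in Y]

def inner_P(X, Y, t):
    """Deformed inner product <X, P_t Y>."""
    return inner(X, apply_P(Y, t))

def bracket_P_def(A, B, t):
    """[A, B]_P from definition (5.1)."""
    return add(add(bracket(A, B), ad_dagger(A, B, t), -1.0), ad_dagger(B, A, t), -1.0)

def bracket_P_closed(A, B, t):
    """[A, B]_P from the closed form (5.9)."""
    corr = add(bracket(part(A, "a"), part(B, "b")),
               bracket(part(B, "a"), part(A, "b")))
    return add(bracket(A, B), corr, 1.0 - t)

def rand_skew(rng):
    """Random skew-symmetric N x N matrix."""
    X = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i):
            X[i][j] = rng.uniform(-1.0, 1.0)
            X[j][i] = -X[i][j]
    return X

rng = random.Random(0)
A, X, Y = rand_skew(rng), rand_skew(rng), rand_skew(rng)

# <[A, X], Y>_P should equal <X, ad^dagger_A Y>_P.
adjoint_gap = abs(inner_P(bracket(A, X), Y, T) - inner_P(X, ad_dagger(A, Y, T), T))
# Definition (5.1) should agree with the closed form (5.9).
bracket_gap = max(abs(u - v)
                  for r, s in zip(bracket_P_def(A, X, T), bracket_P_closed(A, X, T))
                  for u, v in zip(r, s))
print(adjoint_gap, bracket_gap)
```

Both gaps should be at the level of floating-point rounding. Plain nested lists are used only to keep the sketch dependency-free.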

Recall that $o$ is the coset containing the identity in the homogeneous manifold $\mathtt{G}/\mathtt{K}$. The expression $\operatorname{R}^{[0]}$ in the following proposition is the curvature of a normal homogeneous manifold, although it is perhaps not commonly stated in this form.

Proposition 6.

For a Lie group $\mathtt{G}$ with Lie algebra $\mathfrak{g}$ and a bi-invariant metric $\langle\,,\rangle_{\mathtt{G}}$, let $\mathfrak{k}\subset\mathfrak{h}$ be subalgebras of $\mathfrak{g}$, with $\mathfrak{g}=\mathfrak{a}\oplus\mathfrak{b}\oplus\mathfrak{h}=\mathfrak{a}\oplus\mathfrak{n}=\mathfrak{m}\oplus\mathfrak{k}$ as in proposition 4. The curvature of the homogeneous manifold $\mathtt{M}=\mathtt{G}/\mathtt{K}$ under the metric $\operatorname{P}_{t}$ at $o$, evaluated at $\omega_{1},\omega_{2},\omega_{3}\in\mathfrak{m}$, is given by

$$\operatorname{R}_{\omega_{1},\omega_{2}}\omega_{3}=\operatorname{R}^{[0]}_{\omega_{1},\omega_{2}}\omega_{3}+(1-t)\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}+(1-t)^{2}\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3} \tag{5.11}$$
$$\operatorname{R}^{[0]}_{\omega_{1},\omega_{2}}\omega_{3}:=\frac{1}{4}\left([[\omega_{1},\omega_{2}],\omega_{3}]_{\mathfrak{m}}+2[[\omega_{1},\omega_{2}]_{\mathfrak{k}},\omega_{3}]-[[\omega_{2},\omega_{3}]_{\mathfrak{k}},\omega_{1}]+[[\omega_{1},\omega_{3}]_{\mathfrak{k}},\omega_{2}]\right) \tag{5.12}$$
$$\begin{gathered}\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}:=\frac{1}{2}([[\omega_{1},\omega_{2}]_{\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},[\omega_{1},\omega_{2}]_{\mathfrak{b}}])\\ -\frac{1}{4}([\omega_{1},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{2\mathfrak{b}}]]+[\omega_{1\mathfrak{a}},[\omega_{2},\omega_{3}]_{\mathfrak{b}}]+[[\omega_{2},\omega_{3}]_{\mathfrak{a}},\omega_{1\mathfrak{b}}])_{\mathfrak{m}}\\ +\frac{1}{4}([\omega_{2},[\omega_{1\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{1\mathfrak{b}}]]+[\omega_{2\mathfrak{a}},[\omega_{1},\omega_{3}]_{\mathfrak{b}}]+[[\omega_{1},\omega_{3}]_{\mathfrak{a}},\omega_{2\mathfrak{b}}])_{\mathfrak{m}}\end{gathered} \tag{5.13}$$
$$4\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3}:=-[\omega_{1\mathfrak{a}},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{2\mathfrak{b}}]]+[\omega_{2\mathfrak{a}},[\omega_{1\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{1\mathfrak{b}}]] \tag{5.14}$$
Proof.

We apply the formulas for $[\,,]_{\operatorname{P}}$, with $\mathfrak{n}=\mathfrak{b}\oplus\mathfrak{h}$ and $\mathfrak{g}=\mathfrak{a}\oplus\mathfrak{n}$:

$$[[\omega_{1},\omega_{2}],\omega_{3}]_{\operatorname{P},\mathfrak{a}}=[[\omega_{1},\omega_{2}],\omega_{3}]_{\mathfrak{a}}$$
$$[[\omega_{1},\omega_{2}],\omega_{3}]_{\operatorname{P},\mathfrak{n}}=[[\omega_{1},\omega_{2}],\omega_{3}]_{\mathfrak{n}}+(1-t)([[\omega_{1},\omega_{2}]_{\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},[\omega_{1},\omega_{2}]_{\mathfrak{b}}])$$
$$\begin{gathered}{}[\omega_{1},[\omega_{2},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}}=[\omega_{1},[\omega_{2},\omega_{3}]]+(1-t)[\omega_{1},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{2\mathfrak{b}}]]\\ +(1-t)([\omega_{1\mathfrak{a}},[\omega_{2},\omega_{3}]_{\mathfrak{b}}]+[[\omega_{2},\omega_{3}]_{\mathfrak{a}},\omega_{1\mathfrak{b}}])+(1-t)^{2}[\omega_{1\mathfrak{a}},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{2\mathfrak{b}}]]\end{gathered}$$

We now apply eq. 5.4. By the Jacobi identity, the $\operatorname{R}^{[0]}$ component of the first line is

$$\left(\frac{1}{2}[[\omega_{1},\omega_{2}],\omega_{3}]-\frac{1}{4}[\omega_{1},[\omega_{2},\omega_{3}]]+\frac{1}{4}[\omega_{2},[\omega_{1},\omega_{3}]]\right)_{\mathfrak{m}}=\frac{1}{4}[[\omega_{1},\omega_{2}],\omega_{3}]_{\mathfrak{m}}$$

while the second line has the O’Neill terms $\operatorname{ad}^{\dagger}_{\omega_{i}}[\omega_{j},\omega_{k}]_{\mathfrak{k}}$ ($i,j,k$ a permutation of $\{1,2,3\}$) evaluated as $-[\omega_{i\mathfrak{m}},[\omega_{j},\omega_{k}]_{\mathfrak{k}}]$. Since we assume $\omega_{i}\in\mathfrak{m}$, this gives us the expression for $\operatorname{R}^{[0]}$. Permuting indices and collecting terms, we get $\operatorname{R}^{[1]}$ and $\operatorname{R}^{[2]}$. Some of the expressions, for example $\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3}$, are already in $\mathfrak{m}$, so we do not need to apply the projection again. ∎
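A numerical cross-check of proposition 6 against eq. 5.4 may help in catching sign or index slips. The sketch below is illustrative only and rests on assumptions not fixed by the proposition itself: it uses the Stiefel-type setting $\mathfrak{g}=\mathfrak{o}(5)$, $\mathfrak{a}=\mathfrak{o}(2)$, $\mathfrak{k}=\mathfrak{h}=\mathfrak{o}(3)$ (so $\mathfrak{d}=0$ and $\mathfrak{m}=\mathfrak{a}\oplus\mathfrak{b}$), the value $t=0.6$, and hypothetical helper names. It evaluates the right-hand side of eq. 5.4 with $\operatorname{ad}^{\dagger}_{A}=-\operatorname{P}^{-1}\circ\operatorname{ad}_{A}\circ\operatorname{P}$ and compares it with eq. 5.11 to eq. 5.14.

```python
import random

N, P = 5, 2   # illustrative: G = SO(5), K = SO(3), so k = h = o(3) and m = a + b
T = 0.6       # illustrative deformation parameter t

def lin(*terms):
    """Linear combination sum(c * M) of N x N matrices given as (c, M) pairs."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(N)] for i in range(N)]

def br(X, Y):
    """Lie bracket XY - YX."""
    return [[sum(X[i][k] * Y[k][j] - Y[i][k] * X[k][j] for k in range(N))
             for j in range(N)] for i in range(N)]

def part(X, names):
    """Projection onto the blocks listed in names (any of 'a', 'b', 'h')."""
    def which(i, j):
        if i < P and j < P:
            return "a"
        return "h" if (i >= P and j >= P) else "b"
    return [[x if which(i, j) in names else 0.0 for j, x in enumerate(r)]
            for i, r in enumerate(X)]

def apply_P(X, t):
    """P_t X = t X_a + X_n."""
    return lin((1.0, X), (t - 1.0, part(X, "a")))

def ad_dag(A, X, t):
    """ad^dagger_A X = -P^{-1} [A, P X]."""
    return lin((-1.0, apply_P(br(A, apply_P(X, t)), 1.0 / t)))

def br_P(A, B, t):
    """[A, B]_P from definition (5.1)."""
    return lin((1.0, br(A, B)), (-1.0, ad_dag(A, B, t)), (-1.0, ad_dag(B, A, t)))

def curv_submersion(w1, w2, w3, t):
    """Right-hand side of eq. 5.4 (here k = h)."""
    k = lambda X, Y: part(br(X, Y), "h")
    total = lin((0.5, br_P(br(w1, w2), w3, t)),
                (-0.25, br_P(w1, br_P(w2, w3, t), t)),
                (0.25, br_P(w2, br_P(w1, w3, t), t)),
                (0.5, ad_dag(w3, k(w1, w2), t)),
                (-0.25, ad_dag(w1, k(w2, w3), t)),
                (0.25, ad_dag(w2, k(w1, w3), t)))
    return part(total, "ab")

def curv_closed(w1, w2, w3, t):
    """R^[0] + (1 - t) R^[1] + (1 - t)^2 R^[2] from eqs. 5.11 to 5.14."""
    a = lambda X: part(X, "a")
    b = lambda X: part(X, "b")
    k = lambda X: part(X, "h")
    m = lambda X: part(X, "ab")
    def W(x, y):   # [x_a, y_b] + [y_a, x_b]
        return lin((1.0, br(a(x), b(y))), (1.0, br(a(y), b(x))))
    def nested(x, y, z):   # [x, W(y, z)] + [x_a, [y, z]_b] + [[y, z]_a, x_b]
        return lin((1.0, br(x, W(y, z))), (1.0, br(a(x), b(br(y, z)))),
                   (1.0, br(a(br(y, z)), b(x))))
    R0 = lin((0.25, m(br(br(w1, w2), w3))), (0.5, br(k(br(w1, w2)), w3)),
             (-0.25, br(k(br(w2, w3)), w1)), (0.25, br(k(br(w1, w3)), w2)))
    R1 = lin((0.5, br(a(br(w1, w2)), b(w3))), (0.5, br(a(w3), b(br(w1, w2)))),
             (-0.25, m(nested(w1, w2, w3))), (0.25, m(nested(w2, w1, w3))))
    R2 = lin((-0.25, br(a(w1), W(w2, w3))), (0.25, br(a(w2), W(w1, w3))))
    return lin((1.0, R0), (1.0 - t, R1), ((1.0 - t) ** 2, R2))

def rand_horizontal(rng):
    """Random skew-symmetric matrix with zero h-block, i.e. an element of m."""
    X = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i):
            X[i][j] = rng.uniform(-1.0, 1.0)
            X[j][i] = -X[i][j]
    return part(X, "ab")

rng = random.Random(3)
w1, w2, w3 = rand_horizontal(rng), rand_horizontal(rng), rand_horizontal(rng)
lhs, rhs = curv_submersion(w1, w2, w3, T), curv_closed(w1, w2, w3, T)
curvature_gap = max(abs(u - v) for r, s in zip(lhs, rhs) for u, v in zip(r, s))
print(curvature_gap)
```

Agreement up to rounding for random inputs is evidence of consistency between the two expressions in this particular setting. It is not a proof, and it does not test the case $\mathfrak{d}\neq 0$.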

We use these formulas to compute the Levi-Civita connection and curvature for Stiefel manifolds. For two integers $n>p$, we describe the Stiefel manifold as $\operatorname{SO}(n)/\operatorname{SO}(n-p)$. Here, $\mathtt{G}=\operatorname{SO}(n)$, $\mathtt{K}=\operatorname{SO}(n-p)$ and $\mathfrak{g}=\mathfrak{o}(n)$. Take the bi-invariant form to be $\frac{1}{2}\operatorname{Tr}(\omega_{1}^{\mathsf{T}}\omega_{2})$. We divide a matrix in $\mathfrak{o}(n)$ into blocks of the form $\begin{bmatrix}A&-B^{\mathsf{T}}\\ B&H\end{bmatrix}$, with $A\in\mathfrak{o}(p)$, $B\in\mathbb{R}^{(n-p)\times p}$ and $H\in\mathfrak{o}(n-p)$, and we represent that matrix by a triple $[\![A,B,H]\!]$ to save space.

Take the subalgebra generated by the $H$ block to be $\mathfrak{k}=\mathfrak{h}=\mathfrak{o}(n-p)$, identified with the bottom right $(n-p)\times(n-p)$ block of $\mathfrak{o}(n)$. Then $\mathfrak{m}$ is the subspace of $\mathfrak{o}(n)$ where the $H$-block is zero, the subalgebra $\mathfrak{a}$ is $\mathfrak{o}(p)$ identified with the $A$-block, and $\mathfrak{b}$ is the subspace generated by the $B$ and $B^{\mathsf{T}}$-blocks, as shown below:

$$\mathfrak{g}:\begin{bmatrix}\mathfrak{a}&\mathfrak{b}\\ \mathfrak{b}&\mathfrak{h}\end{bmatrix}\quad\quad\mathfrak{n}:\begin{bmatrix}0&\mathfrak{b}\\ \mathfrak{b}&\mathfrak{h}\end{bmatrix}\quad\quad\mathfrak{m}:\begin{bmatrix}\mathfrak{a}&\mathfrak{b}\\ \mathfrak{b}&0\end{bmatrix}$$

The Lie and $[\;]_{\operatorname{P}}$ brackets of $[\![A_{1},B_{1},H_{1}]\!],[\![A_{2},B_{2},H_{2}]\!]\in\mathfrak{o}(n)$ are given by

$$\begin{aligned}
[[\![A_{1},B_{1},H_{1}]\!],[\![A_{2},B_{2},H_{2}]\!]]=[\![&[A_{1},A_{2}]+B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2},\\
&B_{1}A_{2}+H_{1}B_{2}-B_{2}A_{1}-H_{2}B_{1},\\
&[H_{1},H_{2}]+B_{2}B_{1}^{\mathsf{T}}-B_{1}B_{2}^{\mathsf{T}}]\!]\\
[[\![A_{1},B_{1},H_{1}]\!],[\![A_{2},B_{2},H_{2}]\!]]_{\operatorname{P}}=[\![&[A_{1},A_{2}]+B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2},\\
&tB_{1}A_{2}+H_{1}B_{2}+(t-2)B_{2}A_{1}-H_{2}B_{1},\\
&[H_{1},H_{2}]+B_{2}B_{1}^{\mathsf{T}}-B_{1}B_{2}^{\mathsf{T}}]\!]
\end{aligned}$$
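As a sanity check on the block formulas for the two brackets, the following minimal NumPy sketch compares the triple form of the Lie bracket with the matrix commutator of the assembled block matrices, and confirms that the $\operatorname{P}$-bracket formula reduces to the Lie bracket at $t=1$. The helper names (`to_matrix`, `lie_bracket`, `p_bracket`, `random_triple`) are illustrative and not part of the paper.

```python
import numpy as np

def to_matrix(A, B, H):
    """Assemble the block matrix [[A, -B^T], [B, H]] from a triple."""
    return np.block([[A, -B.T], [B, H]])

def lie_bracket(X1, X2):
    """Lie bracket of two triples, following the block formula in the text."""
    A1, B1, H1 = X1
    A2, B2, H2 = X2
    return (A1 @ A2 - A2 @ A1 + B2.T @ B1 - B1.T @ B2,
            B1 @ A2 + H1 @ B2 - B2 @ A1 - H2 @ B1,
            H1 @ H2 - H2 @ H1 + B2 @ B1.T - B1 @ B2.T)

def p_bracket(X1, X2, t):
    """P-bracket of two triples: only the B-block differs from the Lie bracket."""
    A1, B1, H1 = X1
    A2, B2, H2 = X2
    A, _, H = lie_bracket(X1, X2)
    B = t * B1 @ A2 + H1 @ B2 + (t - 2) * B2 @ A1 - H2 @ B1
    return A, B, H

def random_triple(rng, n, p):
    """Random element of o(n) in triple form (hypothetical test helper)."""
    A = rng.standard_normal((p, p))
    A = A - A.T
    B = rng.standard_normal((n - p, p))
    H = rng.standard_normal((n - p, n - p))
    H = H - H.T
    return A, B, H

rng = np.random.default_rng(0)
n, p = 5, 2
X1, X2 = random_triple(rng, n, p), random_triple(rng, n, p)
M1, M2 = to_matrix(*X1), to_matrix(*X2)

# The triple formula should reproduce the matrix commutator M1 M2 - M2 M1.
bracket_ok = np.allclose(to_matrix(*lie_bracket(X1, X2)), M1 @ M2 - M2 @ M1)

# At t = 1 the P-bracket formula coincides with the Lie bracket.
p_bracket_at_one_ok = all(
    np.allclose(u, v)
    for u, v in zip(p_bracket(X1, X2, 1.0), lie_bracket(X1, X2))
)
```

Random inputs only give evidence for the identity, not a proof, but a sign error in any block would make the comparison fail.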

For $U=(Y|Y_{\perp})\in\operatorname{SO}(n)$, where $(\cdot|\cdot)$ denotes the division of a matrix in $\mathbb{R}^{n\times n}$ into the first $p$ columns (in $\mathbb{R}^{n\times p}$) and the last $n-p$ columns (in $\mathbb{R}^{n\times(n-p)}$), if $\omega=(\eta|\eta_{\perp})$ is a tangent vector at $U$ to $\operatorname{SO}(n)$ then $\omega=d\mathcal{L}_{U}(U^{\mathsf{T}}\omega)=U[\![Y^{\mathsf{T}}\eta,Y_{\perp}^{\mathsf{T}}\eta,Y_{\perp}^{\mathsf{T}}\eta_{\perp}]\!]$.

We describe the submersion $\operatorname{SO}(n)\to\mathrm{St}_{p,n}$, identifying $\mathrm{St}_{p,n}$ with $\operatorname{SO}(n)/\operatorname{SO}(n-p)$ by the map $U\mapsto Y$, where $U=(Y|Y_{\perp})$ as just described. The map is clearly a differentiable submersion onto $\mathrm{St}_{p,n}$. The fiber over $Y$ consists of matrices of the form $(Y|Y_{\perp}Q)$, $Q\in\operatorname{SO}(n-p)$, hence the vertical space consists of matrices $(0|Y_{\perp}\mathrm{q})$, $\mathrm{q}\in\mathfrak{o}(n-p)$.
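The fiber and vertical-space description can be illustrated numerically. The sketch below, which assumes a random point of $\operatorname{SO}(n)$ built from a QR factorization (the helper `random_so` is hypothetical), checks that $(Y|Y_{\perp}Q)$ stays in $\operatorname{SO}(n)$ and maps to the same $Y$, and that a vertical vector $(0|Y_{\perp}\mathrm{q})$ pulls back to the triple $[\![0,0,\mathrm{q}]\!]$.

```python
import numpy as np

def random_so(rng, n):
    """Random matrix in SO(n): QR factor with the determinant sign fixed."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    if np.linalg.det(Q) < 0:
        Q[:, -1] = -Q[:, -1]
    return Q

rng = np.random.default_rng(1)
n, p = 6, 2
U = random_so(rng, n)
Y, Y_perp = U[:, :p], U[:, p:]

# Move along the fiber: right-multiply the last n - p columns by Q in SO(n - p).
Q = random_so(rng, n - p)
U_fiber = np.hstack([Y, Y_perp @ Q])
in_so_n = (np.allclose(U_fiber.T @ U_fiber, np.eye(n))
           and np.isclose(np.linalg.det(U_fiber), 1.0))
same_image = np.allclose(U_fiber[:, :p], Y)

# A vertical vector (0 | Y_perp q) pulls back to the block matrix with only an H-block.
q = rng.standard_normal((n - p, n - p))
q = q - q.T
vertical = np.hstack([np.zeros((n, p)), Y_perp @ q])
pulled_back = U.T @ vertical
vertical_ok = (np.allclose(pulled_back[:p, :], 0)
               and np.allclose(pulled_back[p:, :p], 0)
               and np.allclose(pulled_back[p:, p:], q))
```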

Equip $\operatorname{SO}(n)$ with the metric $\operatorname{P}_{t}$ in proposition 5. At $U=\operatorname{I}_{n}$, the horizontal space consists of matrices of the form $[\![A,B,0]\!]$, with $A\in\mathfrak{o}(p)$, $B\in\mathbb{R}^{(n-p)\times p}$, and in general, a horizontal vector is of the form $U[\![A,B,0]\!]$. The submersion maps $\omega=(\eta|\eta_{\perp})$ to $\eta\in\mathbb{R}^{n\times p}$ satisfying $Y^{\mathsf{T}}\eta\in\mathfrak{o}(p)$.

Proposition 7.

With the above setting, the horizontal lift of a tangent vector $\eta\in\mathbb{R}^{n\times p}$ to $\mathrm{St}_{p,n}$ at $U=(Y|Y_{\perp})\in\operatorname{SO}(n)$ under $\operatorname{P}_{t}$ is $\bar{\eta}=(\eta|-Y\eta^{\mathsf{T}}Y_{\perp})$ and the induced metric is

$$\langle\eta,\eta\rangle_{t}=\operatorname{Tr}\left(\eta\eta^{\mathsf{T}}+\left(\frac{t}{2}-1\right)YY^{\mathsf{T}}\eta\eta^{\mathsf{T}}\right)\tag{5.15}$$

The Levi-Civita connection for two vector fields $\mathtt{V},\mathtt{Z}$ on $\mathrm{St}_{p,n}$ under this metric is given by

$$\nabla_{\mathtt{V}}\mathtt{Z}=\operatorname{D}_{\mathtt{V}}\mathtt{Z}+\frac{1}{2}Y(\mathtt{V}^{\mathsf{T}}\mathtt{Z}+\mathtt{Z}^{\mathsf{T}}\mathtt{V})+\frac{2-t}{2}(\operatorname{I}_{n}-YY^{\mathsf{T}})(\mathtt{V}\mathtt{Z}^{\mathsf{T}}+\mathtt{Z}\mathtt{V}^{\mathsf{T}})Y\tag{5.16}$$

The curvature $\operatorname{R}_{\xi,\eta}\phi$ at $Y\in\mathrm{St}_{p,n}$ for three tangent vectors $\xi,\eta,\phi$ computed by proposition 3 is identical to that computed by eq. 3.3 and 3.4 if we represent the tangent and curvature vectors in the format of theorem 3.1 and set $\alpha=t/2$.

Proof.

A matrix multiplication shows $U^{\mathsf{T}}\bar{\eta}$ is antisymmetric and can be represented as $[\![Y^{\mathsf{T}}\eta,Y_{\perp}^{\mathsf{T}}\eta,0]\!]\in\mathfrak{o}(n)$, which is horizontal at $\operatorname{I}_{n}$. Thus $\bar{\eta}$ is horizontal and maps to $\eta$, hence it is the horizontal lift.

Using the relation $Y_{\perp}Y_{\perp}^{\mathsf{T}}+YY^{\mathsf{T}}=\operatorname{I}_{n}$, the induced metric is

$$\begin{gathered}
\langle U^{\mathsf{T}}\bar{\eta},U^{\mathsf{T}}\bar{\eta}\rangle_{\operatorname{P}}=\frac{1}{2}\operatorname{Tr}\begin{bmatrix}Y^{\mathsf{T}}\eta&-\eta^{\mathsf{T}}Y_{\perp}\\ Y_{\perp}^{\mathsf{T}}\eta&0\end{bmatrix}\begin{bmatrix}t\eta^{\mathsf{T}}Y&\eta^{\mathsf{T}}Y_{\perp}\\ -Y_{\perp}^{\mathsf{T}}\eta&0\end{bmatrix}\\
=\frac{1}{2}\operatorname{Tr}(tYY^{\mathsf{T}}\eta\eta^{\mathsf{T}}+2Y_{\perp}Y_{\perp}^{\mathsf{T}}\eta\eta^{\mathsf{T}})=\operatorname{Tr}\left(\eta\eta^{\mathsf{T}}+\left(\frac{t}{2}-1\right)YY^{\mathsf{T}}\eta\eta^{\mathsf{T}}\right)
\end{gathered}$$
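The horizontal lift and the induced metric can be checked numerically. The following sketch assumes a random orthogonal $U=(Y|Y_{\perp})$ (the sign of $\det U$ is irrelevant here) and an illustrative value of $t$. It verifies that $U^{\mathsf{T}}\bar{\eta}$ is antisymmetric with zero $H$-block, and that $\frac{1}{2}\operatorname{Tr}(\omega^{\mathsf{T}}\operatorname{P}_{t}\omega)$ agrees with eq. 5.15, with $\operatorname{P}_{t}$ scaling the $A$-block by $t$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, t = 6, 3, 0.7  # illustrative sizes and deformation parameter

# Orthogonal U = (Y | Y_perp) from a QR factorization.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y, Y_perp = U[:, :p], U[:, p:]

# Tangent vector eta = Y A + Y_perp B with A antisymmetric, so Y^T eta is in o(p).
A = rng.standard_normal((p, p))
A = A - A.T
B = rng.standard_normal((n - p, p))
eta = Y @ A + Y_perp @ B

# Horizontal lift from the proposition and its pullback to the Lie algebra.
eta_bar = np.hstack([eta, -Y @ eta.T @ Y_perp])
omega = U.T @ eta_bar

# Apply P_t: scale the A-block by t and leave the other blocks unchanged.
P_omega = omega.copy()
P_omega[:p, :p] *= t

metric_from_lift = 0.5 * np.trace(omega.T @ P_omega)
metric_from_515 = np.trace(eta @ eta.T + (t / 2 - 1) * Y @ Y.T @ eta @ eta.T)
```

At $t=2$ the correction term in eq. 5.15 vanishes and the metric reduces to $\operatorname{Tr}(\eta\eta^{\mathsf{T}})$.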

Let $\mathtt{V},\mathtt{Z}$ be two vector fields on $\mathrm{St}_{p,n}$, which lift to the $\operatorname{SO}(n)$-vector fields $\bar{\mathtt{V}}=(\mathtt{V}|-Y\mathtt{V}^{\mathsf{T}}Y_{\perp})$, $\bar{\mathtt{Z}}=(\mathtt{Z}|-Y\mathtt{Z}^{\mathsf{T}}Y_{\perp})$. Let $F=U^{\mathsf{T}}\bar{\mathtt{Z}}=[\![Y^{\mathsf{T}}\mathtt{Z},Y_{\perp}^{\mathsf{T}}\mathtt{Z},0]\!]$. By eq. 5.5, $\nabla_{\mathtt{V}}\mathtt{Z}$ lifts to $UC_{\mathfrak{m}}$ with $C=\operatorname{D}_{\bar{\mathtt{V}}}F+\frac{1}{2}[[\![Y^{\mathsf{T}}\mathtt{V},Y_{\perp}^{\mathsf{T}}\mathtt{V},0]\!],[\![Y^{\mathsf{T}}\mathtt{Z},Y_{\perp}^{\mathsf{T}}\mathtt{Z},0]\!]]_{\operatorname{P}}$. Expanding the Lie derivative and the $\operatorname{P}$-bracket gives

$$\begin{gathered}
C=[\![\mathtt{V}^{\mathsf{T}}\mathtt{Z}+Y^{\mathsf{T}}\operatorname{D}_{\mathtt{V}}\mathtt{Z},-Y_{\perp}^{\mathsf{T}}\mathtt{V}Y^{\mathsf{T}}\mathtt{Z}+Y_{\perp}^{\mathsf{T}}\operatorname{D}_{\mathtt{V}}\mathtt{Z},0]\!]+\\
\frac{1}{2}[\![[Y^{\mathsf{T}}\mathtt{V},Y^{\mathsf{T}}\mathtt{Z}]+\mathtt{Z}^{\mathsf{T}}Y_{\perp}Y_{\perp}^{\mathsf{T}}\mathtt{V}-\mathtt{V}^{\mathsf{T}}Y_{\perp}Y_{\perp}^{\mathsf{T}}\mathtt{Z},tY_{\perp}^{\mathsf{T}}\mathtt{V}Y^{\mathsf{T}}\mathtt{Z}+(t-2)Y_{\perp}^{\mathsf{T}}\mathtt{Z}Y^{\mathsf{T}}\mathtt{V},C_{H}]\!]
\end{gathered}$$

for some $C_{H}\in\mathfrak{o}(n-p)$. Thus, the submersion maps $UC_{\mathfrak{m}}$ to its left $p$ columns

$$\begin{gathered}
Y(\mathtt{V}^{\mathsf{T}}\mathtt{Z}+Y^{\mathsf{T}}\operatorname{D}_{\mathtt{V}}\mathtt{Z}+\frac{1}{2}([Y^{\mathsf{T}}\mathtt{V},Y^{\mathsf{T}}\mathtt{Z}]+\mathtt{Z}^{\mathsf{T}}Y_{\perp}Y_{\perp}^{\mathsf{T}}\mathtt{V}-\mathtt{V}^{\mathsf{T}}Y_{\perp}Y_{\perp}^{\mathsf{T}}\mathtt{Z}))+\\
Y_{\perp}(-Y_{\perp}^{\mathsf{T}}\mathtt{V}Y^{\mathsf{T}}\mathtt{Z}+Y_{\perp}^{\mathsf{T}}\operatorname{D}_{\mathtt{V}}\mathtt{Z}+\frac{1}{2}(tY_{\perp}^{\mathsf{T}}\mathtt{V}Y^{\mathsf{T}}\mathtt{Z}+(t-2)Y_{\perp}^{\mathsf{T}}\mathtt{Z}Y^{\mathsf{T}}\mathtt{V}))\\
=\operatorname{D}_{\mathtt{V}}\mathtt{Z}+Y\mathtt{V}^{\mathsf{T}}\mathtt{Z}+\frac{1}{2}(YY^{\mathsf{T}}\mathtt{V}Y^{\mathsf{T}}\mathtt{Z}-YY^{\mathsf{T}}\mathtt{Z}Y^{\mathsf{T}}\mathtt{V}+Y\mathtt{Z}^{\mathsf{T}}Y_{\perp}Y_{\perp}^{\mathsf{T}}\mathtt{V}-Y\mathtt{V}^{\mathsf{T}}Y_{\perp}Y_{\perp}^{\mathsf{T}}\mathtt{Z})\\
+\frac{1}{2}Y_{\perp}Y_{\perp}^{\mathsf{T}}(-2\mathtt{V}Y^{\mathsf{T}}\mathtt{Z}+t\mathtt{V}Y^{\mathsf{T}}\mathtt{Z}+(t-2)\mathtt{Z}Y^{\mathsf{T}}\mathtt{V})
\end{gathered}$$

The last line simplifies, using $Y^{\mathsf{T}}\mathtt{Z}=-\mathtt{Z}^{\mathsf{T}}Y$ and $Y^{\mathsf{T}}\mathtt{V}=-\mathtt{V}^{\mathsf{T}}Y$, to

$$\frac{t-2}{2}(\operatorname{I}_{n}-YY^{\mathsf{T}})(\mathtt{V}Y^{\mathsf{T}}\mathtt{Z}+\mathtt{Z}Y^{\mathsf{T}}\mathtt{V})=\frac{2-t}{2}(\operatorname{I}_{n}-YY^{\mathsf{T}})(\mathtt{V}\mathtt{Z}^{\mathsf{T}}+\mathtt{Z}\mathtt{V}^{\mathsf{T}})Y$$

while twice the sum of the remaining terms, except for $\operatorname{D}_{\mathtt{V}}\mathtt{Z}$, is

$$\begin{gathered}
2Y\mathtt{V}^{\mathsf{T}}\mathtt{Z}+YY^{\mathsf{T}}\mathtt{V}Y^{\mathsf{T}}\mathtt{Z}-YY^{\mathsf{T}}\mathtt{Z}Y^{\mathsf{T}}\mathtt{V}+Y\mathtt{Z}^{\mathsf{T}}(\operatorname{I}_{n}-YY^{\mathsf{T}})\mathtt{V}-Y\mathtt{V}^{\mathsf{T}}(\operatorname{I}_{n}-YY^{\mathsf{T}})\mathtt{Z}\\
=Y\mathtt{V}^{\mathsf{T}}\mathtt{Z}+Y\mathtt{Z}^{\mathsf{T}}\mathtt{V}+Y(Y^{\mathsf{T}}\mathtt{V}+\mathtt{V}^{\mathsf{T}}Y)Y^{\mathsf{T}}\mathtt{Z}-Y(Y^{\mathsf{T}}\mathtt{Z}+\mathtt{Z}^{\mathsf{T}}Y)Y^{\mathsf{T}}\mathtt{V}\\
=Y\mathtt{V}^{\mathsf{T}}\mathtt{Z}+Y\mathtt{Z}^{\mathsf{T}}\mathtt{V}
\end{gathered}$$

Thus we have proved eq. 5.16.
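The connection term derived in the proof, $\Gamma(\mathtt{V},\mathtt{Z})=\frac{1}{2}Y(\mathtt{V}^{\mathsf{T}}\mathtt{Z}+\mathtt{Z}^{\mathsf{T}}\mathtt{V})+\frac{2-t}{2}(\operatorname{I}_{n}-YY^{\mathsf{T}})(\mathtt{V}\mathtt{Z}^{\mathsf{T}}+\mathtt{Z}\mathtt{V}^{\mathsf{T}})Y$, can be tested for two properties a Levi-Civita connection on $\mathrm{St}_{p,n}$ must have. It should be symmetric in $\mathtt{V},\mathtt{Z}$, and it should keep $\nabla_{\mathtt{V}}\mathtt{Z}$ tangent. Differentiating $Y^{\mathsf{T}}\mathtt{Z}+\mathtt{Z}^{\mathsf{T}}Y=0$ along $\mathtt{V}$ shows tangency is equivalent to $Y^{\mathsf{T}}\Gamma+\Gamma^{\mathsf{T}}Y=\mathtt{V}^{\mathsf{T}}\mathtt{Z}+\mathtt{Z}^{\mathsf{T}}\mathtt{V}$. The sketch below checks both, together with the simplification of the $Y_{\perp}$ terms used in the proof. Function names and parameter values are illustrative assumptions.

```python
import numpy as np

def christoffel(Y, V, Z, t):
    """Gamma(V, Z) = nabla_V Z - D_V Z, using the symmetric sums from the proof."""
    n = Y.shape[0]
    proj_perp = np.eye(n) - Y @ Y.T
    return (0.5 * Y @ (V.T @ Z + Z.T @ V)
            + 0.5 * (2 - t) * proj_perp @ (V @ Z.T + Z @ V.T) @ Y)

def random_tangent(rng, Y, Y_perp):
    """Random tangent vector Y A + Y_perp B with A antisymmetric (test helper)."""
    p = Y.shape[1]
    A = rng.standard_normal((p, p))
    A = A - A.T
    B = rng.standard_normal((Y_perp.shape[1], p))
    return Y @ A + Y_perp @ B

rng = np.random.default_rng(3)
n, p, t = 7, 3, 1.3
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y, Y_perp = U[:, :p], U[:, p:]
V = random_tangent(rng, Y, Y_perp)
Z = random_tangent(rng, Y, Y_perp)

gamma = christoffel(Y, V, Z, t)

# Torsion-free: the correction term is symmetric in its two arguments.
symmetric = np.allclose(gamma, christoffel(Y, Z, V, t))

# Tangency condition derived from differentiating Y^T Z + Z^T Y = 0.
tangency = np.allclose(Y.T @ gamma + gamma.T @ Y, V.T @ Z + Z.T @ V)

# Simplification of the Y_perp terms: both sides of the displayed identity.
proj_perp = np.eye(n) - Y @ Y.T
lhs = 0.5 * (t - 2) * proj_perp @ (V @ Y.T @ Z + Z @ Y.T @ V)
rhs = 0.5 * (2 - t) * proj_perp @ (V @ Z.T + Z @ V.T) @ Y
simplification_ok = np.allclose(lhs, rhs)
```

These checks do not test metric compatibility, which would require differentiating eq. 5.15 along a curve.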

Let us prove the curvature expressions. To show $f(t)=g(t/2)$, with $f(t)=f_{0}+(1-t)f_{1}+(1-t)^{2}f_{2}$ where $f_{0},f_{1},f_{2}$ are constant matrices and $g$ is a matrix-valued quadratic function of $\alpha$, we need to show $f_{0}=g(1/2)$, $-2f_{1}=g^{\prime}(1/2)$ and $8f_{2}=g^{\prime\prime}(1/2)$. From left invariance we can take $U=\operatorname{I}_{n}$. Thus, we need to compute $\operatorname{R}^{[0]},\operatorname{R}^{[1]},\operatorname{R}^{[2]}$ and compare with the values and derivatives of $g(\alpha)=[\![A_{R}(\alpha),B_{R}(\alpha),0]\!]$, with $A_{R},B_{R}$ defined from eq. 3.3 and 3.4, evaluated at $\alpha=1/2$.
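The coefficient-matching criterion follows from the Taylor expansion of $g(t/2)$ around $t=1$. A small sketch with an arbitrary matrix-valued quadratic $g$ (the coefficients `c0`, `c1`, `c2` are random placeholders, not the curvature coefficients) shows that choosing $f_{0}=g(1/2)$, $f_{1}=-g^{\prime}(1/2)/2$ and $f_{2}=g^{\prime\prime}(1/2)/8$ reproduces $g(t/2)$ for all $t$.

```python
import numpy as np

rng = np.random.default_rng(4)
# Arbitrary matrix coefficients of g(alpha) = c0 + alpha c1 + alpha^2 c2.
c0, c1, c2 = (rng.standard_normal((2, 2)) for _ in range(3))

def g(alpha):
    return c0 + alpha * c1 + alpha**2 * c2

def g_prime(alpha):
    return c1 + 2 * alpha * c2

g_second = 2 * c2  # constant second derivative of a quadratic

# Coefficients prescribed by the three matching conditions.
f0 = g(0.5)
f1 = -0.5 * g_prime(0.5)
f2 = g_second / 8

def f(t):
    return f0 + (1 - t) * f1 + (1 - t) ** 2 * f2

match = all(np.allclose(f(t), g(t / 2)) for t in np.linspace(-1.0, 3.0, 9))
```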

Let $\xi=\omega_{1},\eta=\omega_{2},\phi=\omega_{3}$ with $\omega_{i}=[\![A_{i},B_{i},0]\!]$. We have $[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]=[\![0,-B_{3}A_{2},0]\!]$, $[\omega_{1\mathfrak{a}},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]]=[\![0,B_{3}A_{2}A_{1},0]\!]$ and, permuting the indices,

$$4\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3}=[\![0,-B_{3}A_{2}A_{1}-B_{2}A_{3}A_{1}+B_{3}A_{1}A_{2}+B_{1}A_{3}A_{2},0]\!]$$

On the other hand, eq. 3.3 and 3.4 give $A_{R,\alpha=1/2}^{\prime\prime}=0$ and

$$B_{R,\alpha=1/2}^{\prime\prime}=\frac{4}{2}(B_{1}A_{3}A_{2}-B_{2}A_{3}A_{1})+2(B_{3}A_{1}A_{2}-B_{3}A_{2}A_{1})$$

which confirms $8\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3}=g^{\prime\prime}(1/2)$. Next,

$$\begin{gathered}
[[\omega_{1},\omega_{2}]_{\mathfrak{a}},\omega_{3\mathfrak{b}}]=[\![0,-B_{3}([A_{1},A_{2}]+B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2}),0]\!]\\
[\omega_{3\mathfrak{a}},[\omega_{1},\omega_{2}]_{\mathfrak{b}}]_{\mathfrak{a}}=[\![0,-(B_{1}A_{2}-B_{2}A_{1})A_{3},0]\!]\\
[\omega_{1},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]]_{\mathfrak{m}}=[[\![A_{1},B_{1},0]\!],[\![0,-B_{3}A_{2},0]\!]]_{\mathfrak{m}}=[\![A_{2}B_{3}^{\mathsf{T}}B_{1}+B_{1}^{\mathsf{T}}B_{3}A_{2},B_{3}A_{2}A_{1},0]\!]
\end{gathered}$$
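The nested brackets used in this part of the proof can be confirmed with explicit matrices. The sketch below assembles $\omega_{i}=[\![A_{i},B_{i},0]\!]$ as block matrices, projects onto the $\mathfrak{a}$, $\mathfrak{b}$ and $\mathfrak{m}$ parts with a hypothetical helper `part`, and compares matrix commutators with the right-hand sides of the displayed identities. In the second identity the code evaluates the bracket $[\omega_{3\mathfrak{a}},[\omega_{1},\omega_{2}]_{\mathfrak{b}}]$ itself, which lies in the $\mathfrak{b}$-block.

```python
import numpy as np

def to_matrix(A, B, H):
    """Assemble the block matrix [[A, -B^T], [B, H]] from a triple."""
    return np.block([[A, -B.T], [B, H]])

def comm(X, Y):
    return X @ Y - Y @ X

def part(M, p, which):
    """Keep the a-part, the b-part, or the m-part (a plus b) of a matrix in o(n)."""
    out = np.zeros_like(M)
    if which in ("a", "m"):
        out[:p, :p] = M[:p, :p]
    if which in ("b", "m"):
        out[p:, :p] = M[p:, :p]
        out[:p, p:] = M[:p, p:]
    return out

rng = np.random.default_rng(5)
n, p = 6, 3
Zp, Zq = np.zeros((p, p)), np.zeros((n - p, n - p))

def random_ab():
    A = rng.standard_normal((p, p))
    return A - A.T, rng.standard_normal((n - p, p))

(A1, B1), (A2, B2), (A3, B3) = random_ab(), random_ab(), random_ab()
w1, w2, w3 = (to_matrix(A, B, Zq) for A, B in ((A1, B1), (A2, B2), (A3, B3)))
w1a, w2a, w3a = (part(w, p, "a") for w in (w1, w2, w3))
w3b = part(w3, p, "b")

inner = comm(w2a, w3b)
check_inner = np.allclose(inner, to_matrix(Zp, -B3 @ A2, Zq))
check_double = np.allclose(comm(w1a, inner), to_matrix(Zp, B3 @ A2 @ A1, Zq))

C12 = comm(A1, A2) + B2.T @ B1 - B1.T @ B2
check_first = np.allclose(comm(part(comm(w1, w2), p, "a"), w3b),
                          to_matrix(Zp, -B3 @ C12, Zq))
check_second = np.allclose(comm(w3a, part(comm(w1, w2), p, "b")),
                           to_matrix(Zp, -(B1 @ A2 - B2 @ A1) @ A3, Zq))
check_third = np.allclose(
    part(comm(w1, inner), p, "m"),
    to_matrix(A2 @ B3.T @ B1 + B1.T @ B3 @ A2, B3 @ A2 @ A1, Zq))
```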

By permuting indices, we evaluate the $\mathfrak{a}$ component of $4\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}$ from four expressions similar to $[\omega_{1},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]]_{\mathfrak{a}}$ as

$$\begin{gathered}
-A_{2}B_{3}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{3}A_{2}-A_{3}B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2}A_{3}\\
+A_{1}B_{3}^{\mathsf{T}}B_{2}+B_{2}^{\mathsf{T}}B_{3}A_{1}+A_{1}B_{2}^{\mathsf{T}}B_{3}+B_{3}^{\mathsf{T}}B_{2}A_{1}
\end{gathered}$$

and evaluate the $\mathfrak{b}$ component of $4\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}$ from the remaining items as

$$\begin{gathered}
2(-B_{3}([A_{1},A_{2}]+B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2})-(B_{1}A_{2}-B_{2}A_{1})A_{3})\\
-B_{3}A_{2}A_{1}-B_{2}A_{3}A_{1}+(B_{2}A_{3}-B_{3}A_{2})A_{1}+B_{1}([A_{2},A_{3}]+B_{3}^{\mathsf{T}}B_{2}-B_{2}^{\mathsf{T}}B_{3})\\
+B_{3}A_{1}A_{2}+B_{1}A_{3}A_{2}-(B_{1}A_{3}-B_{3}A_{1})A_{2}-B_{2}([A_{1},A_{3}]+B_{3}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{3})
\end{gathered}$$

Let us collect terms. Terms starting with $B_{3}$ and two $A$ factors are

$$-2B_{3}[A_{1},A_{2}]-B_{3}A_{2}A_{1}-B_{3}A_{2}A_{1}+B_{3}A_{1}A_{2}+B_{3}A_{1}A_{2}=0$$

Terms starting with $B_{2}$ and two $A$ factors:

$$2B_{2}A_{1}A_{3}-B_{2}A_{3}A_{1}+B_{2}A_{3}A_{1}-B_{2}[A_{1},A_{3}]=B_{2}A_{1}A_{3}+B_{2}A_{3}A_{1}$$

Terms starting with $B_{1}$ and two $A$ factors:

$$-2B_{1}A_{2}A_{3}+B_{1}[A_{2},A_{3}]+B_{1}A_{3}A_{2}-B_{1}A_{3}A_{2}=-B_{1}A_{2}A_{3}-B_{1}A_{3}A_{2}$$

Terms with $B$ factors only:

$$-2B_{3}(B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2})+B_{1}(B_{3}^{\mathsf{T}}B_{2}-B_{2}^{\mathsf{T}}B_{3})-B_{2}(B_{3}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{3})$$

On the other hand, we have

$$\begin{gathered}
A_{R,\alpha=1/2}^{\prime}=\frac{-2}{4}(A_{1}B_{3}^{\mathsf{T}}B_{2}-A_{2}B_{3}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{3}A_{2}+B_{2}^{\mathsf{T}}B_{3}A_{1})+\\
\frac{-1}{2}(A_{3}B_{1}^{\mathsf{T}}B_{2}-A_{3}B_{2}^{\mathsf{T}}B_{1}-B_{1}^{\mathsf{T}}B_{2}A_{3}+B_{2}^{\mathsf{T}}B_{1}A_{3})\\
B_{R,\alpha=1/2}^{\prime}=\frac{4(1/2)-1}{2}(B_{1}A_{3}A_{2}-B_{2}A_{3}A_{1})+\\
(2(1/2)-1)(B_{3}A_{1}A_{2}-B_{3}A_{2}A_{1})-(B_{3}B_{1}^{\mathsf{T}}B_{2}-B_{3}B_{2}^{\mathsf{T}}B_{1})+\\
\frac{1}{2}(B_{1}B_{2}^{\mathsf{T}}B_{3}-B_{2}B_{1}^{\mathsf{T}}B_{3})+\frac{1}{2}(B_{1}A_{2}A_{3}-B_{1}B_{3}^{\mathsf{T}}B_{2}-B_{2}A_{1}A_{3}+B_{2}B_{3}^{\mathsf{T}}B_{1})
\end{gathered}$$

and we can confirm by inspection that $-2\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}=g^{\prime}(1/2)$. The constant term $\operatorname{R}^{[0]}$ is verified similarly, and we omit the computation. ∎

Remark 5.1.

We have shown the metric in section 3 is $\operatorname{P}_{t}$ for $t=2\alpha$. The submersion associated with the Cheeger deformation gives a sectional curvature formula for $\mathtt{G}$ with the metric $\operatorname{P}_{t}$ in proposition 2.4 of [5]. Using the O’Neill equation and eq. 5.10, it implies the following sectional curvature formula for $\mathtt{M}=\mathtt{G}/\mathtt{K}$ (the norm $\|\cdot\|$ corresponds to the bi-invariant inner product $\langle\cdot,\cdot\rangle$)

$$\begin{gathered}
\langle\operatorname{R}^{\mathtt{M}}_{\omega_{1},\omega_{2}}\omega_{1},\operatorname{P}_{t}\omega_{2}\rangle=\frac{1}{4}\|[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{n}}+t[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{n}}]+t[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{a}}]\|^{2}+\\
\frac{1}{4}\|[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}}+t^{2}[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]\|^{2}+\frac{1}{4}t(1-t)^{3}\|[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]\|^{2}+\\
\frac{3}{4}(1-t)\|[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}}+t[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]\|^{2}+\frac{3}{4}\|[\omega_{1},\omega_{2}]_{\mathfrak{k}}\|^{2}
\end{gathered}\tag{5.17}$$

It is also a weighted sum of squares, in a different format from eq. 3.8. It has been shown to imply non-negative curvature when $t\leq 1$ and, in the case $\mathfrak{a}$ is abelian, when $t\leq 4/3$.

6. Discussion

In this paper, we have obtained explicit formulas for curvatures of real Stiefel manifolds with deformation metrics and obtained several results related to Einstein metrics and sectional curvature range, including parameter values corresponding to non-negative sectional curvatures. We expect similar results could be obtained for complex and quaternionic Stiefel manifolds. We hope the availability of explicit curvature formulas for a family of metrics on an important class of manifolds will be helpful in both theory and applications. The framework to compute the Levi-Civita connection and curvature for deformations of normal homogeneous spaces could be applied to other families of manifolds, potentially allowing the construction of new Einstein manifolds or manifolds with non-negative curvatures.

Appendix A A few trace formulas

We collect a few results on the trace of common operators that will be useful in the computation of the Ricci curvature for matrix spaces. They are most likely known, but we do not have the exact references.

Lemma A.1.

1. Let $X$ be a matrix in $\mathbb{R}^{m\times n}$. The trace of the operator $X\mapsto AXB$ where $A\in\mathbb{R}^{m\times m}$ and $B\in\mathbb{R}^{n\times n}$ is $\operatorname{Tr}(A)\operatorname{Tr}(B)$. In particular, the trace of $X\mapsto AX$ is $n\operatorname{Tr}(A)$, and the trace of $X\mapsto XB$ is $m\operatorname{Tr}(B)$. The trace of the operator $X\mapsto AX^{\mathsf{T}}B$ where $A$ and $B$ are matrices of size $m\times n$ is $\operatorname{Tr}(AB^{\mathsf{T}})$.

2. The trace of the operator $X\mapsto AXB+B^{\mathsf{T}}XA^{\mathsf{T}}$ from the space $\mathrm{Sym}_{p}$ to itself, where $A$ and $B$ are $p\times p$ matrices, is $\operatorname{Tr}(A)\operatorname{Tr}(B)+\operatorname{Tr}(AB^{\mathsf{T}})$. In particular, the trace of the operator $X\mapsto AX+XA^{\mathsf{T}}$ is $(p+1)\operatorname{Tr}(A)$. The trace of the operator $X\mapsto\operatorname{Tr}(AX)B$, where $B$ is a symmetric matrix and $A$ is a $p\times p$ matrix, is $\operatorname{Tr}(\frac{1}{2}(A+A^{\mathsf{T}})B)$.

3. The trace of the operator $X\mapsto AXB+B^{\mathsf{T}}XA^{\mathsf{T}}$, from the space $\mathfrak{o}(p)$ to itself, where $A$ and $B$ are $p\times p$ matrices, is $\operatorname{Tr}(A)\operatorname{Tr}(B)-\operatorname{Tr}(AB^{\mathsf{T}})$. In particular, if $A$ and $B$ are antisymmetric matrices then the trace of $X\mapsto[[A,X],B]$ is $(2-p)\operatorname{Tr}(AB)$.

Proof.

Let $E_{ij}$ be the matrix of the same size as $X$ with the $ij$-entry equal to $1$ and all other entries equal to $0$. All the statements are proved similarly, by summing the coefficients of the operators on an appropriate basis built from the $E_{ij}$. Let the entries of $A$ be $a_{ij}$ and the entries of $B$ be $b_{ij}$.

For item 1, $(AE_{ij}B)_{ij}=a_{ii}b_{jj}$, so the trace of $X\mapsto AXB$ is $\sum_{ij}a_{ii}b_{jj}=\operatorname{Tr}(A)\operatorname{Tr}(B)$. Since $(AE_{ij}^{\mathsf{T}}B)_{ij}=a_{ij}b_{ij}$, the trace of $X\mapsto AX^{\mathsf{T}}B$ is $\sum_{ij}a_{ij}b_{ij}=\operatorname{Tr}(AB^{\mathsf{T}})$.
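The argument for item 1 translates directly into code: the trace of a linear map on $\mathbb{R}^{m\times n}$ is the sum over $i,j$ of the $(i,j)$ coefficient of the image of $E_{ij}$. A minimal sketch with random test matrices follows. The names `operator_trace`, `C` and `D` are illustrative, with `C` and `D` playing the roles of the $m\times n$ matrices in the last statement of item 1.

```python
import numpy as np

def operator_trace(op, m, n):
    """Trace of a linear map on R^{m x n}: sum of the (i, j) entry of op(E_ij)."""
    total = 0.0
    for i in range(m):
        for j in range(n):
            E = np.zeros((m, n))
            E[i, j] = 1.0
            total += op(E)[i, j]
    return total

rng = np.random.default_rng(6)
m, n = 3, 4
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
C = rng.standard_normal((m, n))
D = rng.standard_normal((m, n))

trace_axb = operator_trace(lambda X: A @ X @ B, m, n)      # Tr(A) Tr(B)
trace_ax = operator_trace(lambda X: A @ X, m, n)           # n Tr(A)
trace_xb = operator_trace(lambda X: X @ B, m, n)           # m Tr(B)
trace_cxtd = operator_trace(lambda X: C @ X.T @ D, m, n)   # Tr(C D^T)
```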

For item 2, a basis of $\mathrm{Sym}_{p}$ consists of the matrices $E_{ii}$ $(i=1,\cdots,p)$ and $E_{ij}+E_{ji}$ for $i<j$. We now compute the trace of $X\mapsto AXB+B^{\mathsf{T}}XA^{\mathsf{T}}$ with respect to this basis. For $E_{ii}$, $(AE_{ii}B+B^{\mathsf{T}}E_{ii}A^{\mathsf{T}})_{ii}=2a_{ii}b_{ii}$. For $E_{ij}+E_{ji}$, the coefficient is

$$(A(E_{ij}+E_{ji})B+B^{\mathsf{T}}(E_{ij}+E_{ji})A^{\mathsf{T}})_{ij}=a_{ii}b_{jj}+a_{ij}b_{ij}+b_{ii}a_{jj}+b_{ji}a_{ji}$$

Hence the trace is

$$\sum_{i}2a_{ii}b_{ii}+\sum_{i<j}(a_{ii}b_{jj}+a_{ij}b_{ij}+b_{ii}a_{jj}+b_{ji}a_{ji})=\sum_{i}a_{ii}\sum_{j}b_{jj}+\sum_{ij}a_{ij}b_{ij}$$

which is $\operatorname{Tr}(A)\operatorname{Tr}(B)+\operatorname{Tr}(AB^{\mathsf{T}})$, as $\sum_{i}a_{ii}b_{ii}+\sum_{i<j}(a_{ii}b_{jj}+b_{ii}a_{jj})$ rearranges to the first sum, and the sum of the remaining terms is $\operatorname{Tr}(AB^{\mathsf{T}})$. With $B=\operatorname{I}_{p}$, the trace of $X\mapsto AX+XA^{\mathsf{T}}$ is $(p+1)\operatorname{Tr}(A)$. For the operator $X\mapsto\operatorname{Tr}(AX)B$, the coefficient corresponding to $E_{ii}$ is $a_{ii}b_{ii}$, and the coefficient corresponding to $E_{ij}+E_{ji}$ is $(a_{ij}+a_{ji})b_{ij}$. The trace is

$$\sum_{i}a_{ii}b_{ii}+\sum_{i<j}(a_{ij}+a_{ji})b_{ij}=\frac{1}{2}\operatorname{Tr}((A+A^{\mathsf{T}})B)$$

For item 3, a basis of $\mathfrak{o}(p)$ consists of the matrices $E_{ij}-E_{ji}$ for $i<j$. The coefficient corresponding to $E_{ij}-E_{ji}$ is

$$(A(E_{ij}-E_{ji})B+B^{\mathsf{T}}(E_{ij}-E_{ji})A^{\mathsf{T}})_{ij}=a_{ii}b_{jj}-a_{ij}b_{ij}+b_{ii}a_{jj}-b_{ji}a_{ji}$$

The trace is $\sum_{i<j}(a_{ii}b_{jj}-a_{ij}b_{ij}+b_{ii}a_{jj}-b_{ji}a_{ji})=\sum_{ij}a_{ii}b_{jj}-\sum_{ij}a_{ij}b_{ij}$, which is $\operatorname{Tr}(A)\operatorname{Tr}(B)-\operatorname{Tr}(AB^{\mathsf{T}})$.

For the trace of $X\mapsto[[A,X],B]=(AX-XA)B-B(AX-XA)=AXB+BXA-BAX-XAB$, we have:

$$\begin{gathered}
\operatorname{Tr}(X\mapsto AXB+BXA)=-\operatorname{Tr}(AB^{\mathsf{T}})=\operatorname{Tr}(AB)\\
\operatorname{Tr}(X\mapsto BAX+XAB)=\operatorname{Tr}(\operatorname{I}_{p})\operatorname{Tr}(BA)-\operatorname{Tr}(BA)=(p-1)\operatorname{Tr}(BA)
\end{gathered}$$

From here we get $\operatorname{Tr}(X\mapsto[[A,X],B])=(2-p)\operatorname{Tr}(AB)$. ∎
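Items 2 and 3 of Lemma A.1 can be checked numerically with a slightly different device from the coefficient count in the proof. For an operator preserving a subspace with a Frobenius-orthogonal basis $\{E\}$, the trace is $\sum_{E}\langle T(E),E\rangle/\langle E,E\rangle$. The sketch below applies this to $\mathrm{Sym}_{p}$ and $\mathfrak{o}(p)$ with random test matrices. The helper names, and the names `S`, `K`, `L` for the symmetric and antisymmetric test matrices, are illustrative.

```python
import numpy as np

def subspace_trace(op, basis):
    """Trace of op on span(basis), for a Frobenius-orthogonal basis preserved by op."""
    return sum(np.sum(op(E) * E) / np.sum(E * E) for E in basis)

def unit(p, i, j):
    E = np.zeros((p, p))
    E[i, j] = 1.0
    return E

p = 4
pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
sym_basis = [unit(p, i, i) for i in range(p)] + [unit(p, i, j) + unit(p, j, i) for i, j in pairs]
skew_basis = [unit(p, i, j) - unit(p, j, i) for i, j in pairs]

rng = np.random.default_rng(7)
A = rng.standard_normal((p, p))
B = rng.standard_normal((p, p))
S = rng.standard_normal((p, p)); S = S + S.T   # symmetric
K = rng.standard_normal((p, p)); K = K - K.T   # antisymmetric
L = rng.standard_normal((p, p)); L = L - L.T   # antisymmetric

def two_sided(X):
    return A @ X @ B + B.T @ X @ A.T

def double_commutator(X):
    KX = K @ X - X @ K
    return KX @ L - L @ KX

trace_sym = subspace_trace(two_sided, sym_basis)
trace_sym_rank_one = subspace_trace(lambda X: np.trace(A @ X) * S, sym_basis)
trace_skew = subspace_trace(two_sided, skew_basis)
trace_double = subspace_trace(double_commutator, skew_basis)
```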

References

  • [1] V. Arnold, Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l’hydrodynamique des fluides parfaits, Annales de l’institut Fourier 16 (1966), no. 1, 319–361 (fre).
  • [2] J. Cheeger, Some examples of manifolds of nonnegative curvature, Journal of Differential Geometry 8 (1973), no. 4, 623 – 628.
  • [3] J. Ge, DDVV-type inequality for skew-symmetric matrices and Simons-type inequality for Riemannian submersions, Advances in Mathematics 251 (2014), 62–86.
  • [4] K. Grove, H. Karcher, and E. Ruh, Group actions and curvature, Inventiones mathematicae 23 (1974), 31–48.
  • [5] K. Grove and W. Ziller, Curvature and symmetry of Milnor spheres, The Annals of Mathematics 152 (2000), 331–367.
  • [6] K. Hüper, I. Markina, and F. Silva Leite, A Lagrangian approach to extremal curves on Stiefel manifolds, Journal of Geometric Mechanics 13 (2021), 55–72.
  • [7] P. W. Michor, Some geometric evolution equations arising as geodesic equations on groups of diffeomorphisms including the Hamiltonian approach, Phase Space Analysis of Partial Differential Equations (Antonio Bove, Ferruccio Colombini, and Daniele Del Santo, eds.), Birkhäuser Boston, Boston, MA, 2007, pp. 133–215.
  • [8] J. Milnor, Curvatures of left invariant metrics on Lie groups, Advances in Mathematics 21 (1976), no. 3, 293–329.
  • [9] D. Nguyen, Operator-valued formulas for Riemannian gradient and Hessian and families of tractable metrics in optimization and machine learning, 2020.
  • [10] D. Nguyen, Riemannian geometry with differentiable ambient space and metric operator, 2021.
  • [11] B. O’Neill, The fundamental equations of a submersion, Michigan Math. J. 13 (1966), no. 4, 459–469.
  • [12] B. O’Neill, Semi-Riemannian geometry with applications to relativity, Pure and Applied Mathematics, vol. 103, Academic Press, Inc, New York, NY, 1983.
  • [13] T. Rapcsak, Sectional curvatures in nonlinear optimization, J. Global Optimization 40 (2008), 375–388.
  • [14] Q. Rentmeesters, Algorithms for data fitting on some common homogeneous spaces, Ph.D. thesis, Université Catholique de Louvain, Louvain, Belgium, 2013.
  • [15] A. A. Sagle, Some homogeneous Einstein manifolds, Nagoya Mathematical Journal 39 (1970), no. 39, 81–106.
  • [16] W. Ziller, Examples of Riemannian manifolds with non-negative sectional curvature, Metric and Comparison Geometry, Surv. Diff. Geom. 11 (K. Grove and J. Cheeger, eds.), International Press, 2007, pp. 63–102.