Curvatures of Stiefel manifolds with deformation metrics

Abstract.

We compute curvatures of a family of tractable metrics on Stiefel manifolds, introduced recently by Hüper, Markina and Silva Leite, which includes the well-known embedded and canonical metrics on Stiefel manifolds as special cases. The metrics could be identified with the Cheeger deformation metrics. We identify parameter values in the family to make a Stiefel manifold an Einstein manifold and show Stiefel manifolds always carry an Einstein metric. We analyze the sectional curvature range and identify the parameter range where the manifold has non-negative sectional curvature. We provide the exact sectional curvature range when the number of columns in a Stiefel matrix is $2$ , and a conjectural range for other cases. We derive the formulas from two approaches, one from a global curvature formula derived in our recent work, another using curvature formulas for left-invariant metrics. The second approach leads to curvature formulas for Cheeger deformation metrics on normal homogeneous spaces.

Key words and phrases:

Optimization, Riemannian geometry, Riemannian curvature, Einstein manifold, Stiefel, Jacobi field, Machine Learning

2010 Mathematics Subject Classification:

Primary 65K10, 58C05, 49Q12, 53C25, 57Z20, 57Z25, 68T05

1. Introduction

In a recent paper [10], we derived global formulas to compute the curvature of a manifold $\mathcal{M}$ , embedded differentiably in a Euclidean space $\mathcal{E}$ , with metric defined by an operator $\mathsf{g}$ from $\mathcal{M}$ to the space of positive-definite operators on $\mathcal{E}$ . The formulas have similar forms to the classical formula for the curvature in local coordinates. While we have provided a few applications of those formulas in that paper, we would like to show the formula could be used to compute the curvatures for a family of manifolds important in both theory and application.

The purpose of this paper is to compute and analyze curvatures of a Stiefel manifold with the family of metrics defined in [6]. It turns out this family coincides with the family of metrics arising from the Cheeger deformation, which has been one of the main tools to construct metrics of non-negative curvature [2, 5, 16]. Thus, the curvatures could be computed in two ways: one uses our formula with Christoffel functions, which is very similar to the local-coordinate formula; the other uses the relationship with the Cheeger deformation. In the second method, the Stiefel manifold is identified with a quotient manifold of the special orthogonal group with a left-invariant metric. Using a result of Michor [7] and the Euler-Poisson-Arnold framework [1], we compute the $(1,3)$ -curvature tensor of the Cheeger deformation of a normal homogeneous space. The second approach provides independent confirmation of our curvature formulas. The first method probably requires lengthier calculations; however, it is conceptually straightforward and could be implemented symbolically.

Recall for two positive integers $p<n$ , the real Stiefel manifold $\mathrm{St}_{p,n}$ consists of real orthogonal matrices $Y$ of size $n\times p$ . If $\alpha_{1}$ , $\alpha_{0}$ are two positive numbers, the metric in [6] could be reparameterized so that the inner product of two tangent vectors $\xi,\eta$ on $\mathrm{St}_{p,n}$ at $Y\in\mathrm{St}_{p,n}$ is given by $\alpha_{0}\operatorname{Tr}(\xi^{\operatorname{\mathsf{T}}}\eta)+(\alpha_{1}-\alpha_{0})\operatorname{Tr}(\xi^{\operatorname{\mathsf{T}}}YY^{\operatorname{\mathsf{T}}}\eta)$ . Set $\alpha=\alpha_{1}/\alpha_{0}$ , and up to scaling we can take $\alpha_{0}=1$ . This family of metrics contains both well-known metrics on Stiefel manifolds, the embedded ( $\alpha=1$ , where the metric is induced from the embedding in $\mathbb{R}^{n\times p}$ ) and canonical metrics $(\alpha=\frac{1}{2})$ ( $\mathrm{St}_{p,n}$ is normal homogeneous in this case). It will be shown in proposition 7 that if $\operatorname{SO}(n)$ is equipped with a Cheeger deformation metric with deformation parameter $2\alpha$ (reviewed in section 5) from the right-multiplication action of $\operatorname{SO}(p)$ embedded diagonally then $\operatorname{SO}(n)/\operatorname{SO}(n-p)$ with the quotient metric could be identified with $\mathrm{St}_{p,n}$ with the metric just described.

While a framework to compute curvatures for Cheeger deformation metrics is available, explicit formulas and detailed analysis are not yet known to the best of our knowledge (note [13] is an early paper dealing with the embedded metric). We provide formulas for Riemannian, Ricci, scalar, and sectional curvature for the Stiefel manifold equipped with this family of metrics. We show the sectional curvature range always contains a specific interval, which is likely to be the full curvature range for metrics in the family. The ends of the interval are piecewise smooth functions described in table 2. In particular, except for some special cases, for the embedded metric on the Stiefel manifold, we show the curvature range contains the interval $[-\frac{1}{2},1]$ , thus it could have negative curvatures, in contrast to the canonical metric, which has range $[0,\frac{5}{4}]$ .

Specifically, $\mathrm{St}_{2,3}$ has positive curvature for $\alpha<\frac{2}{3}$ , non-negative curvature for $\alpha=\frac{2}{3}$ and both negative and positive curvature for $\alpha>\frac{2}{3}$ . With $n>3$ , the Stiefel manifold $\mathrm{St}_{2,n}$ has non-negative curvature for $\alpha\leq\frac{2}{3}$ , and both negative and positive curvature for $\alpha>\frac{2}{3}$ , and we identify the exact sectional curvature range in this case. For $p\geq 3$ , we show $\mathrm{St}_{p,n}$ has non-negative curvature for $\alpha\leq\frac{1}{2}$ and both negative and positive curvature otherwise. This agrees with [5] and we actually show the curvature range contains negative values in the indicated intervals.

We also show the Stiefel manifold always has an Einstein metric, and when $p>2$ , there are two metrics in the family (up to a scaling factor) that make the Stiefel manifold an Einstein manifold. We note this may be the same metric as in [15].

For notation, if $n$ and $m$ are two positive integers, by $\mathbb{R}^{n\times m}$ , we denote the space of $n\times m$ matrices with entries in $\mathbb{R}$ , the field of real numbers. We denote by $\mathfrak{o}(p)$ the space of antisymmetric matrices in $\mathbb{R}^{p\times p}$ . The transpose of a matrix or the adjoint of an operator is denoted by $\operatorname{\mathsf{T}}$ . Working on a manifold, say $\mathcal{M}$ , by $\operatorname{D}_{\xi}F$ , we denote the directional (Lie) derivative of a scalar/vector/operator-valued function $F$ on $\mathcal{M}$ in direction $\xi$ (either a tangent vector defined at a point $x\in\mathcal{M}$ , or a vector field on $\mathcal{M}$ ). If $\mathcal{E}$ is a Euclidean space (inner product space with a positive-definite inner product), the space of linear operators on $\mathcal{E}$ is denoted by $\mathfrak{L}(\mathcal{E},\mathcal{E})$ . Similarly, we denote by $\mathfrak{L}(\mathcal{E}\otimes\mathcal{E},\mathcal{E})$ the space of bilinear forms on $\mathcal{E}$ with values in $\mathcal{E}$ . For two positive integers $n$ and $p$ , the Stiefel manifold $\mathrm{St}_{p,n}$ is the space of matrices $Y\in\mathbb{R}^{n\times p}$ satisfying $Y^{\operatorname{\mathsf{T}}}Y=\operatorname{I}_{p}$ . The Frobenius norm is denoted by $\|\cdot\|_{F}$ .

2. Curvature formulas for embedded manifolds with metric operators

Let $\mathcal{M}\subset\mathcal{E}$ be a differentiable embedding, where $\mathcal{E}$ is a Euclidean space with a given inner product $\langle\rangle_{\mathcal{E}}$ and $\mathcal{M}$ is a differentiable submanifold. Let $\mathsf{g}$ be an operator-valued function from $\mathcal{M}$ to $\mathfrak{L}(\mathcal{E},\mathcal{E})$ such that $\mathsf{g}_{x}$ is positive-definite for every $x\in\mathcal{M}$ . Then $\mathsf{g}$ induces a Riemannian metric on $\mathcal{M}$ , where the inner product of two tangent vectors $\xi,\eta$ at a point $x\in\mathcal{M}$ is defined by $\langle\xi,\mathsf{g}_{x}\eta\rangle_{\mathcal{E}}$ . Here, each tangent space $T_{x}\mathcal{M}$ is identified with a subspace of $\mathcal{E}$ thanks to the embedding, so $\xi,\eta$ are considered as elements of $\mathcal{E}$ , while $\mathsf{g}_{x}$ denotes the evaluation of the operator $\mathsf{g}$ at $x$ .

We call $(\mathcal{M},\mathsf{g},\mathcal{E})$ an embedded ambient structure. The embedding allows us to identify vector fields on $\mathcal{M}$ with $\mathcal{E}$ -valued functions, thus we can take directional derivatives. A Christoffel function is a function $\Gamma$ from $\mathcal{M}$ with value in $\mathfrak{L}(\mathcal{E}\otimes\mathcal{E},\mathcal{E})$ , the space of $\mathcal{E}$ -bilinear forms, such that for two vector fields $\mathtt{X},\mathtt{Y}$ on $\mathcal{M}$ , the Levi-Civita connection on $\mathcal{M}$ is given by

\nabla_{\mathtt{X}}\mathtt{Y}=\operatorname{D}_{\mathtt{X}}\mathtt{Y}+\Gamma(\mathtt{X},\mathtt{Y})

In [10] we proved the following curvature formulas for three tangent vectors $\xi,\eta,\phi$

(2.1)

\begin{gathered}\operatorname{R^{\mathcal{M}}}_{\xi,\eta}\phi=-(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)+(\operatorname{D}_{\eta}\Gamma)(\xi,\phi)-\Gamma(\xi,\Gamma(\eta,\phi))+\Gamma(\eta,\Gamma(\xi,\phi))\\ \operatorname{R^{\mathcal{M}}}_{\xi,\eta}\phi=-(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)+(\operatorname{D}_{\eta}\Gamma)(\xi,\phi)-\Gamma(\Gamma(\phi,\eta),\xi)+\Gamma(\Gamma(\phi,\xi),\eta)\end{gathered}

where $\operatorname{D}_{\xi}\Gamma$ denotes the directional derivative of $\Gamma$ , considered as an operator-valued function, in the direction $\xi$ , for example. The curvature for three vector fields $\mathtt{X},\mathtt{Y},\mathtt{Z}$ is defined in the convention

\operatorname{R^{\mathcal{M}}}_{\mathtt{X}\mathtt{Y}}\mathtt{Z}=\nabla_{[\mathtt{X},\mathtt{Y}]}\mathtt{Z}-\nabla_{\mathtt{X}}\nabla_{\mathtt{Y}}\mathtt{Z}+\nabla_{\mathtt{Y}}\nabla_{\mathtt{X}}\mathtt{Z}

3. Curvatures of the Stiefel manifold

In the following, $p<n$ are two positive integers. In [6], the authors introduced a family of metrics on the Stiefel manifold $\mathrm{St}_{p,n}$ of orthogonal matrices in $\mathbb{R}^{n\times p}$ (thus $Y^{\operatorname{\mathsf{T}}}Y=\operatorname{I}_{p}$ ). We introduced a different parameterization in [9]. The metric depends on two positive real numbers $\alpha_{0}$ , $\alpha_{1}$ with ratio $\alpha=\frac{\alpha_{1}}{\alpha_{0}}$ . In the convention of section 2, we have $\mathcal{M}:=\mathrm{St}_{p,n}\subset\mathcal{E}:=\mathbb{R}^{n\times p}$ , where the base inner product on $\mathcal{E}$ is the Frobenius inner product, thus $\langle\omega_{1},\omega_{2}\rangle_{\mathcal{E}}=\operatorname{Tr}(\omega_{1}\omega_{2}^{\operatorname{\mathsf{T}}})$ for $\omega_{1},\omega_{2}\in\mathcal{E}$ . Consider the metric operator $\mathsf{g}\omega=\mathsf{g}_{Y}\omega:=\alpha_{0}\omega+(\alpha_{1}-\alpha_{0})YY^{\operatorname{\mathsf{T}}}\omega$ , for $Y\in\mathrm{St}_{p,n}$ , with inverse $\mathsf{g}^{-1}\omega=\alpha^{-1}_{0}\omega+(\alpha^{-1}_{1}-\alpha^{-1}_{0})YY^{\operatorname{\mathsf{T}}}\omega$ . The inner product on $\mathcal{E}$ induced by $\mathsf{g}$ is $\langle\omega_{1},\mathsf{g}_{Y}\omega_{2}\rangle_{\mathcal{E}}=\alpha_{0}\operatorname{Tr}\omega_{1}\omega_{2}^{\operatorname{\mathsf{T}}}+(\alpha_{1}-\alpha_{0})\operatorname{Tr}\omega_{1}^{\operatorname{\mathsf{T}}}YY^{\operatorname{\mathsf{T}}}\omega_{2}$ , and its restriction to tangent vectors is a Riemannian metric on $\mathrm{St}_{p,n}$ .
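As a quick numerical illustration of the metric operator and its stated inverse, the following NumPy sketch applies both to a random ambient matrix. The function names `metric_op` and `metric_op_inv`, the sizes, and the parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

def metric_op(Y, omega, alpha0, alpha1):
    """Apply g_Y: omega -> alpha0*omega + (alpha1 - alpha0) * Y Y^T omega."""
    return alpha0 * omega + (alpha1 - alpha0) * (Y @ (Y.T @ omega))

def metric_op_inv(Y, omega, alpha0, alpha1):
    """Apply the inverse operator stated in the text."""
    return omega / alpha0 + (1.0 / alpha1 - 1.0 / alpha0) * (Y @ (Y.T @ omega))

rng = np.random.default_rng(0)
n, p = 5, 3                      # illustrative sizes with p < n
alpha0, alpha1 = 1.0, 0.5        # alpha = 1/2, the canonical metric
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y = Q[:, :p]                     # a point on St_{p,n}: Y^T Y = I_p
w1 = rng.standard_normal((n, p))
w2 = rng.standard_normal((n, p))

# g^{-1} g w1 should reproduce w1
roundtrip = metric_op_inv(Y, metric_op(Y, w1, alpha0, alpha1), alpha0, alpha1)
# <w1, g w2>_E computed through the operator and through the trace formula
inner_op = np.trace(w1.T @ metric_op(Y, w2, alpha0, alpha1))
inner_formula = (alpha0 * np.trace(w1 @ w2.T)
                 + (alpha1 - alpha0) * np.trace(w1.T @ Y @ Y.T @ w2))
# positivity of the induced quadratic form on a nonzero vector
inner_self = np.trace(w1.T @ metric_op(Y, w1, alpha0, alpha1))
```

The operator has eigenvalue $\alpha_{1}$ on the column span of $Y$ and $\alpha_{0}$ on its complement, which is why positivity holds exactly when both parameters are positive.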

A geodesic equation for this metric was derived in [6], and we provided a different derivation of a Christoffel function $\Gamma$ in [9]. We will give another derivation of $\Gamma$ in proposition 7 to clarify the concepts and keep the material reasonably independent. For an orthogonal matrix $Y\in\mathrm{St}_{p,n}$ and $\omega,\omega_{1},\omega_{2}\in\mathbb{R}^{n\times p}$ , a Christoffel function is

(3.1)

\begin{gathered}\Gamma(\omega_{1},\omega_{2})=\frac{1}{2}Y(\omega_{1}^{\operatorname{\mathsf{T}}}\omega_{2}+\omega_{2}^{\operatorname{\mathsf{T}}}\omega_{1})+(1-\alpha)(\operatorname{I}_{n}-YY^{\operatorname{\mathsf{T}}})(\omega_{1}\omega_{2}^{\operatorname{\mathsf{T}}}+\omega_{2}\omega_{1}^{\operatorname{\mathsf{T}}})Y\end{gathered}
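The Christoffel function of eq. 3.1 can be transcribed directly. The sketch below (NumPy, with the hypothetical name `christoffel`) checks two elementary consequences of the formula: symmetry in its two arguments, and the identity $Y^{\operatorname{\mathsf{T}}}\Gamma(\xi,\xi)=\xi^{\operatorname{\mathsf{T}}}\xi$ , which is what a second-order step $Y+t\xi-\frac{t^{2}}{2}\Gamma(\xi,\xi)$ needs in order to stay on the manifold up to $O(t^{3})$ . The tangent vector is built by stacking an antisymmetric block and an arbitrary block, an assumption made only for this example.

```python
import numpy as np

def christoffel(Y, w1, w2, alpha):
    """Gamma(w1, w2) from eq. 3.1, evaluated at the point Y."""
    proj_perp = np.eye(Y.shape[0]) - Y @ Y.T
    return (0.5 * Y @ (w1.T @ w2 + w2.T @ w1)
            + (1.0 - alpha) * proj_perp @ (w1 @ w2.T + w2 @ w1.T) @ Y)

rng = np.random.default_rng(1)
n, p, alpha = 6, 3, 0.7          # illustrative values
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y, Y_perp = Q[:, :p], Q[:, p:]
S = rng.standard_normal((p, p))
# tangent vector: antisymmetric block along Y, arbitrary block along Y_perp
xi = Y @ (S - S.T) + Y_perp @ rng.standard_normal((n - p, p))
xi /= np.linalg.norm(xi)
eta = rng.standard_normal((n, p))

# symmetry of Gamma in its two arguments
sym_gap = np.linalg.norm(christoffel(Y, xi, eta, alpha)
                         - christoffel(Y, eta, xi, alpha))
# Y^T Gamma(xi, xi) = xi^T xi
G = christoffel(Y, xi, xi, alpha)
second_order_gap = np.linalg.norm(Y.T @ G - xi.T @ xi)
# second-order Taylor step of the equation Y'' + Gamma(Y', Y') = 0
t = 1e-3
Yt = Y + t * xi - 0.5 * t**2 * G
constraint_defect = np.linalg.norm(Yt.T @ Yt - np.eye(p))
```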

We can extend $Y$ to a full basis $(Y|Y_{\perp})$ of $\mathbb{R}^{n}$ , by adding $Y_{\perp}$ , an orthogonal complement. Thus, $Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}=\operatorname{I}_{n}-YY^{\operatorname{\mathsf{T}}}$ , $Y_{\perp}^{\operatorname{\mathsf{T}}}Y_{\perp}=\operatorname{I}_{n-p},Y^{\operatorname{\mathsf{T}}}Y_{\perp}=0,Y_{\perp}^{\operatorname{\mathsf{T}}}Y=0$ . Any matrix $\omega\in\mathcal{E}=\mathbb{R}^{n\times p}$ could be represented in this basis as $\omega=YA+Y_{\perp}B$ with $A\in\mathbb{R}^{p\times p}$ , $B\in\mathbb{R}^{(n-p)\times p}$ , and $\omega$ is a tangent vector to $\mathrm{St}_{p,n}$ at $Y$ if and only if $A$ is antisymmetric, $A\in\mathfrak{o}(p)$ , or equivalently $Y^{\operatorname{\mathsf{T}}}\omega+\omega^{\operatorname{\mathsf{T}}}Y=0$ .
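The decomposition $\omega=YA+Y_{\perp}B$ and the tangency criterion can be illustrated numerically. In this NumPy sketch the complement $Y_{\perp}$ is taken from a full QR factorization, and a tangent vector is produced by antisymmetrizing $A$ ; both are convenient choices for the example rather than constructions prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 6, 2                                   # illustrative sizes
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y, Y_perp = Q[:, :p], Q[:, p:]                # (Y | Y_perp) is orthogonal

omega = rng.standard_normal((n, p))           # arbitrary ambient matrix
A = Y.T @ omega                               # p x p block
B = Y_perp.T @ omega                          # (n - p) x p block
reconstructed = Y @ A + Y_perp @ B

# omega itself is generally not tangent, since A need not be antisymmetric
omega_defect = np.linalg.norm(Y.T @ omega + omega.T @ Y)

# replacing A by its antisymmetric part yields a tangent vector at Y
xi = Y @ (0.5 * (A - A.T)) + Y_perp @ B
tangency_defect = np.linalg.norm(Y.T @ xi + xi.T @ Y)
```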

For two tangent vectors $\xi$ and $\eta$ at a point on the manifold, denote by $\langle\rangle_{\mathsf{g}}$ and $\|\|_{\mathsf{g}}$ the inner product and the norm defined by a metric operator $\mathsf{g}$ . We will denote the wedge, the sectional curvature numerator, and the sectional curvature by

(3.2)

\begin{gathered}||\xi\wedge\eta||_{\mathsf{g}}^{2}=||\xi||_{\mathsf{g}}^{2}||\eta||_{\mathsf{g}}^{2}-\langle\xi,\eta\rangle_{\mathsf{g}}^{2}\\ \operatorname{\hat{\mathcal{K}}}(\xi,\eta)=\langle\operatorname{R^{\mathcal{M}}}_{\xi,\eta}\xi,\eta\rangle_{\mathsf{g}}\\ \mathcal{K}(\xi,\eta)=\frac{\operatorname{\hat{\mathcal{K}}}(\xi,\eta)}{||\xi\wedge\eta||_{\mathsf{g}}^{2}}\\ \end{gathered}

Theorem 3.1.

Represent three tangent vectors $\xi,\eta,\phi\in\mathbb{R}^{n\times p}$ at $Y\in\mathrm{St}_{p,n}$ in an orthogonal basis $(Y|Y_{\perp})$ of $\mathbb{R}^{n}$ as $\xi=YA_{1}+Y_{\perp}B_{1},\eta=YA_{2}+Y_{\perp}B_{2},\phi=YA_{3}+Y_{\perp}B_{3}$ , where $A_{1},A_{2},A_{3}\in\mathfrak{o}(p)$ and $B_{1},B_{2},B_{3}\in\mathbb{R}^{(n-p)\times p}$ . Then the Riemannian curvature tensor is $\operatorname{R^{\mathcal{M}}}_{\xi\eta}\phi=YA_{R}+Y_{\perp}B_{R}$ with $A_{R}\in\mathfrak{o}(p),B_{R}\in\mathbb{R}^{(n-p)\times p}$ where

(3.3)

\begin{gathered}A_{R}=Y^{\operatorname{\mathsf{T}}}\operatorname{R^{\mathcal{M}}}_{\xi\eta}\phi=\frac{1-2\alpha}{4}(A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}A_{1})+\\ \frac{1-\alpha}{2}(A_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-A_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{3}+B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{3})+\\ \frac{1}{4}([[A_{1},A_{2}],A_{3}]-A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}+A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3}+B_{3}^{\operatorname{\mathsf{T}}}B_{1}A_{2}-B_{3}^{\operatorname{\mathsf{T}}}B_{2}A_{1})\end{gathered}

(3.4)

\begin{gathered}B_{R}=Y_{\perp}^{\operatorname{\mathsf{T}}}\operatorname{R^{\mathcal{M}}}_{\xi\eta}\phi=\frac{2\alpha^{2}-\alpha}{2}(B_{1}A_{3}A_{2}-B_{2}A_{3}A_{1})+\\ (\alpha^{2}-\alpha)(B_{3}A_{1}A_{2}-B_{3}A_{2}A_{1})+(1-\alpha)(B_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1})+\\ \frac{\alpha-2}{2}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3})+\frac{\alpha}{2}(B_{1}A_{2}A_{3}-B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}A_{1}A_{3}+B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1})\end{gathered}

If $p>1$ , the Ricci and scalar curvatures are given by:

(3.5)

\begin{gathered}\textsc{Ric}(\xi,\eta)=(\frac{2-p}{4}+(p-n)\alpha^{2})\operatorname{Tr}(A_{1}A_{2})+[(1-p)\alpha+(n-2)]\operatorname{Tr}(B_{1}^{\operatorname{\mathsf{T}}}B_{2})\end{gathered}

(3.6)

\begin{gathered}\textsc{Scl}(Y)=((1-p)\alpha+n-2)(n-p)p+((n-p)\alpha+\frac{p-2}{4\alpha})\frac{p(p-1)}{2}\end{gathered}

The sectional curvature numerator $\operatorname{\hat{\mathcal{K}}}$ is computed from one of the following

(3.7)

\begin{gathered}\operatorname{\hat{\mathcal{K}}}=\operatorname{Tr}(\frac{2-3\alpha}{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+\frac{3\alpha-4}{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+B_{2}^{\operatorname{\mathsf{T}}}B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\frac{\alpha}{4}[A_{1},A_{2}]^{2})\\ +\alpha\operatorname{Tr}((4\alpha-3)A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+(3-2\alpha)A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2})\end{gathered}

(3.8)

\begin{gathered}\operatorname{\hat{\mathcal{K}}}=\frac{\alpha}{4}\|[A_{1},A_{2}]+(3-4\alpha)(B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})\|_{F}^{2}+\\ \alpha^{2}\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}+\frac{1}{2}\|B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}\|_{F}^{2}+\frac{(1-2\alpha)^{3}}{2}\|B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}\|_{F}^{2}\end{gathered}

In particular, if $\alpha\leq\frac{1}{2}$ , every term in eq. 3.8 is non-negative, so the sectional curvature is non-negative. If $\xi$ and $\eta$ are orthogonal, the sectional curvature denominator is $(\alpha_{1}\operatorname{Tr}A_{1}A_{1}^{\operatorname{\mathsf{T}}}+\alpha_{0}\operatorname{Tr}B_{1}B_{1}^{\operatorname{\mathsf{T}}})(\alpha_{1}\operatorname{Tr}A_{2}A_{2}^{\operatorname{\mathsf{T}}}+\alpha_{0}\operatorname{Tr}B_{2}B_{2}^{\operatorname{\mathsf{T}}})$ .

We also use the following expansion of eq. 3.8 when $A_{1}$ or $A_{2}$ is zero.

(3.9)

\begin{gathered}\operatorname{\hat{\mathcal{K}}}=\frac{\alpha}{4}\|[A_{1},A_{2}]\|_{F}^{2}+\frac{\alpha(3-4\alpha)}{2}\operatorname{Tr}[A_{1},A_{2}](B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})^{\operatorname{\mathsf{T}}}+\\ \frac{2-3\alpha}{4}\|B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}\|_{F}^{2}+\alpha^{2}\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}+\frac{1}{2}\|B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}\|_{F}^{2}\end{gathered}
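The three expressions for the sectional curvature numerator, eqs. 3.7, 3.8 and 3.9, can be compared numerically on random data. The NumPy sketch below transcribes each formula (function names such as `khat_37` are illustrative) and also evaluates the sectional curvature for $\alpha=\frac{1}{2}$ using the wedge of eq. 3.2, under the normalization $\alpha_{0}=1$ assumed here.

```python
import numpy as np

def fro2(M):
    """Squared Frobenius norm."""
    return float(np.sum(M * M))

def khat_37(A1, A2, B1, B2, al):
    C = A1 @ A2 - A2 @ A1
    pure = np.trace((2 - 3 * al) / 2 * B2.T @ B1 @ B1.T @ B2
                    + (3 * al - 4) / 2 * B2.T @ B1 @ B2.T @ B1
                    + B2.T @ B2 @ B1.T @ B1 - al / 4 * C @ C)
    mixed = al * np.trace((4 * al - 3) * A1 @ A2 @ B2.T @ B1
                          + (3 - 2 * al) * A1 @ A2 @ B1.T @ B2
                          - al * A2 @ A2 @ B1.T @ B1
                          - al * A1 @ A1 @ B2.T @ B2)
    return float(pure + mixed)

def khat_38(A1, A2, B1, B2, al):
    C = A1 @ A2 - A2 @ A1
    D = B2.T @ B1 - B1.T @ B2
    return (al / 4 * fro2(C + (3 - 4 * al) * D)
            + al**2 * fro2(B1 @ A2 - B2 @ A1)
            + 0.5 * fro2(B1 @ B2.T - B2 @ B1.T)
            + (1 - 2 * al)**3 / 2 * fro2(D))

def khat_39(A1, A2, B1, B2, al):
    C = A1 @ A2 - A2 @ A1
    D = B2.T @ B1 - B1.T @ B2
    return (al / 4 * fro2(C)
            + al * (3 - 4 * al) / 2 * float(np.trace(C @ D.T))
            + (2 - 3 * al) / 4 * fro2(D)
            + al**2 * fro2(B1 @ A2 - B2 @ A1)
            + 0.5 * fro2(B1 @ B2.T - B2 @ B1.T))

def wedge2(A1, A2, B1, B2, al):
    """Squared wedge norm of eq. 3.2 in (A, B) coordinates, with alpha_0 = 1."""
    ip = lambda Aa, Ba, Ab, Bb: al * np.trace(Aa.T @ Ab) + np.trace(Ba.T @ Bb)
    return float(ip(A1, B1, A1, B1) * ip(A2, B2, A2, B2) - ip(A1, B1, A2, B2)**2)

rng = np.random.default_rng(2)
p, m = 3, 4                                   # m = n - p, illustrative sizes
skew = lambda: (lambda S: S - S.T)(rng.standard_normal((p, p)))
A1, A2 = skew(), skew()
B1, B2 = rng.standard_normal((m, p)), rng.standard_normal((m, p))

alphas = [0.25, 0.5, 2 / 3, 1.0, 1.5]
values = [(khat_37(A1, A2, B1, B2, al),
           khat_38(A1, A2, B1, B2, al),
           khat_39(A1, A2, B1, B2, al)) for al in alphas]
k_canonical = khat_38(A1, A2, B1, B2, 0.5) / wedge2(A1, A2, B1, B2, 0.5)
```

For $\alpha\leq\frac{1}{2}$ the value returned by `khat_38` is a sum of non-negative terms, which is the observation behind the non-negativity statement in the theorem.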

Proof.

As noted, any $\omega\in\mathbb{R}^{n\times p}$ could be expressed as $\omega=YA+Y_{\perp}B$ , however $A$ may not be antisymmetric. By direct substitution $(\operatorname{I}_{n}-YY^{\operatorname{\mathsf{T}}})(\eta\omega^{\operatorname{\mathsf{T}}}+\omega\eta^{\operatorname{\mathsf{T}}})Y=Y_{\perp}(B_{2}A^{\operatorname{\mathsf{T}}}-BA_{2})$ , hence

\Gamma(\eta,\omega)=\frac{1}{2}Y(-A_{2}A+A^{\operatorname{\mathsf{T}}}A_{2}+B^{\operatorname{\mathsf{T}}}B_{2}+B_{2}^{\operatorname{\mathsf{T}}}B)+(1-\alpha)Y_{\perp}(B_{2}A^{\operatorname{\mathsf{T}}}-BA_{2})

In particular, $Y^{\operatorname{\mathsf{T}}}\Gamma(\eta,\omega)=\frac{1}{2}(-A_{2}A+A^{\operatorname{\mathsf{T}}}A_{2}+B^{\operatorname{\mathsf{T}}}B_{2}+B_{2}^{\operatorname{\mathsf{T}}}B)$ , $Y_{\perp}^{\operatorname{\mathsf{T}}}\Gamma(\eta,\omega)=(1-\alpha)(B_{2}A^{\operatorname{\mathsf{T}}}-BA_{2})$ , and

\begin{gathered}\operatorname{D}_{\xi}\Gamma(\eta,\phi)=\frac{1}{2}\xi(\eta^{\operatorname{\mathsf{T}}}\phi+\phi^{\operatorname{\mathsf{T}}}\eta)+\\ (1-\alpha)\{(\operatorname{I}_{n}-YY^{\operatorname{\mathsf{T}}})(\eta\phi^{\operatorname{\mathsf{T}}}+\phi\eta^{\operatorname{\mathsf{T}}})\xi-(\xi Y^{\operatorname{\mathsf{T}}}+Y\xi^{\operatorname{\mathsf{T}}})(\eta\phi^{\operatorname{\mathsf{T}}}+\phi\eta^{\operatorname{\mathsf{T}}})Y\}\end{gathered}

Expanding $\xi,\eta,\phi$

\begin{gathered}Y_{\perp}^{\operatorname{\mathsf{T}}}(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)=\frac{1}{2}B_{1}(-A_{2}A_{3}-A_{3}A_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}+B_{3}^{\operatorname{\mathsf{T}}}B_{2})+\\ (1-\alpha)\{B_{2}(-A_{3}Y^{\operatorname{\mathsf{T}}}+B_{3}^{\operatorname{\mathsf{T}}}Y_{\perp}^{\operatorname{\mathsf{T}}})+B_{3}(-A_{2}Y^{\operatorname{\mathsf{T}}}+B_{2}^{\operatorname{\mathsf{T}}}Y_{\perp}^{\operatorname{\mathsf{T}}})\}(YA_{1}+Y_{\perp}B_{1})-\\ (1-\alpha)B_{1}Y^{\operatorname{\mathsf{T}}}(-YA_{2}A_{3}-YA_{3}A_{2})\\ =\frac{B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}}{2}+\frac{B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}}{2}+(\frac{1}{2}-\alpha)(B_{1}A_{2}A_{3}+B_{1}A_{3}A_{2})+\\ (1-\alpha)(-B_{2}A_{3}A_{1}+B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}-B_{3}A_{2}A_{1}+B_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1})\end{gathered}

Simplify $Y^{\operatorname{\mathsf{T}}}(\xi Y^{\operatorname{\mathsf{T}}}+Y\xi^{\operatorname{\mathsf{T}}})=A_{1}Y^{\operatorname{\mathsf{T}}}-A_{1}Y^{\operatorname{\mathsf{T}}}+B_{1}^{\operatorname{\mathsf{T}}}Y_{\perp}^{\operatorname{\mathsf{T}}}=B_{1}^{\operatorname{\mathsf{T}}}Y_{\perp}^{\operatorname{\mathsf{T}}}$

\begin{gathered}Y^{\operatorname{\mathsf{T}}}(\operatorname{D}_{\xi}\Gamma)(\eta,\phi)=\frac{1}{2}A_{1}(-A_{2}A_{3}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}-A_{3}A_{2}+B_{3}^{\operatorname{\mathsf{T}}}B_{2})-\\ (1-\alpha)(B_{1}^{\operatorname{\mathsf{T}}}Y_{\perp}^{\operatorname{\mathsf{T}}})(-Y_{\perp}B_{2}A_{3}Y^{\operatorname{\mathsf{T}}}-Y_{\perp}B_{3}A_{2}Y^{\operatorname{\mathsf{T}}})Y=\\ (1-\alpha)(B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{3}+B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2})+\frac{1}{2}(-A_{1}A_{2}A_{3}-A_{1}A_{3}A_{2}+A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}+A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2})\end{gathered}

Next, use the formula for $\Gamma(\xi,\omega)$ with $\omega=\Gamma(\eta,\phi)$

\begin{gathered}Y^{\operatorname{\mathsf{T}}}\Gamma(\xi,\Gamma(\eta,\phi))=\frac{1}{2}(-A_{1}(\frac{1}{2}(-A_{2}A_{3}-A_{3}A_{2}+B_{3}^{\operatorname{\mathsf{T}}}B_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}))+\\ (\frac{1}{2}(-A_{2}A_{3}-A_{3}A_{2}+B_{3}^{\operatorname{\mathsf{T}}}B_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}))^{\operatorname{\mathsf{T}}}A_{1}+B_{1}^{\operatorname{\mathsf{T}}}((1-\alpha)(-B_{2}A_{3}-B_{3}A_{2}))+\\ ((1-\alpha)(-B_{2}A_{3}-B_{3}A_{2}))^{\operatorname{\mathsf{T}}}B_{1})=\\ \frac{1-\alpha}{2}(A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}+A_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{3}-B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2})+\frac{1}{4}(A_{1}A_{2}A_{3}+\\ A_{1}A_{3}A_{2}-A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-A_{2}A_{3}A_{1}-A_{3}A_{2}A_{1}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}A_{1}+B_{3}^{\operatorname{\mathsf{T}}}B_{2}A_{1})\end{gathered}

\begin{gathered}Y_{\perp}^{\operatorname{\mathsf{T}}}\Gamma(\xi,\Gamma(\eta,\phi))=(1-\alpha)\{B_{1}(\frac{1}{2}(-A_{2}A_{3}-A_{3}A_{2}+B_{3}^{\operatorname{\mathsf{T}}}B_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}))^{\operatorname{\mathsf{T}}}-\\ ((1-\alpha)(-B_{2}A_{3}-B_{3}A_{2}))A_{1}\}=\\ (\alpha-1)^{2}(B_{2}A_{3}A_{1}+B_{3}A_{2}A_{1})+\frac{\alpha-1}{2}(B_{1}A_{2}A_{3}+B_{1}A_{3}A_{2}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2})\end{gathered}

Therefore:

\begin{gathered}Y^{\operatorname{\mathsf{T}}}\operatorname{R^{\mathcal{M}}}_{\xi\eta}\phi=-\{(1-\alpha)(B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{3}+B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2})+\\ \frac{1}{2}(-A_{1}A_{2}A_{3}-A_{1}A_{3}A_{2}+A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}+A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2})\}+\\ \{(1-\alpha)(B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{3}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}A_{1})+\frac{1}{2}(-A_{2}A_{1}A_{3}-A_{2}A_{3}A_{1}+A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3}+A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1})\}-\\ \{\frac{1-\alpha}{2}(A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}+A_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{3}-B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2})+\frac{1}{4}(A_{1}A_{2}A_{3}+\\ A_{1}A_{3}A_{2}-A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-A_{2}A_{3}A_{1}-A_{3}A_{2}A_{1}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}A_{1}+B_{3}^{\operatorname{\mathsf{T}}}B_{2}A_{1})\}+\\ \{\frac{1-\alpha}{2}(A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}+A_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{3}-B_{2}^{\operatorname{\mathsf{T}}}B_{3}A_{1})+\frac{1}{4}(A_{2}A_{1}A_{3}+\\ A_{2}A_{3}A_{1}-A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3}-A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}-A_{1}A_{3}A_{2}-A_{3}A_{1}A_{2}+B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2}+B_{3}^{\operatorname{\mathsf{T}}}B_{1}A_{2})\}\\ =\frac{1-2\alpha}{4}(A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}A_{1})+\\ \frac{1-\alpha}{2}(A_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-A_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{3}+B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{3})+\frac{1}{4}(A_{1}A_{2}A_{3}-A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-\\ 
A_{2}A_{1}A_{3}+A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3}-A_{3}A_{1}A_{2}+A_{3}A_{2}A_{1}+B_{3}^{\operatorname{\mathsf{T}}}B_{1}A_{2}-B_{3}^{\operatorname{\mathsf{T}}}B_{2}A_{1})\end{gathered}

The last expression follows from a term-by-term collection. For example, the coefficient of $A_{1}A_{2}A_{3}$ is $-(-1/2)-1/4=1/4$ , and similarly for all terms with coefficient $\pm 1/4$ . The coefficient of $A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}$ is $-1/2+1/4+(1-\alpha)/2=(1-2\alpha)/4$ , and similarly for all the terms with that coefficient.
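The coefficient arithmetic in this collection step is easy to confirm with exact rational arithmetic. The short sketch below uses Python's `fractions` module; the function names are hypothetical, and since the coefficient is a polynomial of degree one in $\alpha$ , agreement at several sample values is enough to confirm the identity.

```python
from fractions import Fraction as F

def coeff_A1A2A3():
    """Contributions to A1 A2 A3: -(-1/2) from -D_xi Gamma, -1/4 from -Gamma(xi, Gamma(eta, phi))."""
    return -(-F(1, 2)) - F(1, 4)

def coeff_A1B3tB2(alpha):
    """Contributions to A1 B3^T B2: -1/2, +1/4 and +(1 - alpha)/2."""
    return -F(1, 2) + F(1, 4) + (1 - alpha) / 2

# exact comparison with (1 - 2*alpha)/4 at several rational values of alpha
samples = [F(0), F(1, 2), F(2, 3), F(1), F(5, 3)]
checks = [coeff_A1B3tB2(a) == (1 - 2 * a) / 4 for a in samples]
```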

\begin{gathered}Y_{\perp}^{\operatorname{\mathsf{T}}}\operatorname{R^{\mathcal{M}}}_{\xi\eta}\phi=-(\frac{B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}}{2}+\frac{B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}}{2}+(\frac{1}{2}-\alpha)(B_{1}A_{2}A_{3}+B_{1}A_{3}A_{2})+\\ (1-\alpha)(-B_{2}A_{3}A_{1}+B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}-B_{3}A_{2}A_{1}+B_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1}))+\\ (\frac{B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3}}{2}+\frac{B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}}{2}+(\frac{1}{2}-\alpha)(B_{2}A_{1}A_{3}+B_{2}A_{3}A_{1})+\\ (1-\alpha)(-B_{1}A_{3}A_{2}+B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-B_{3}A_{1}A_{2}+B_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}))-\\ (\alpha-1)^{2}(B_{2}A_{3}A_{1}+B_{3}A_{2}A_{1})-\frac{\alpha-1}{2}(B_{1}A_{2}A_{3}+B_{1}A_{3}A_{2}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2})\\ +(\alpha-1)^{2}(B_{1}A_{3}A_{2}+B_{3}A_{1}A_{2})+\frac{\alpha-1}{2}(B_{2}A_{1}A_{3}+B_{2}A_{3}A_{1}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3}-B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1})\\ \end{gathered}

Again, we collect term by term (a symbolic computation program was used to assist). The coefficient for $B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}$ is $-1/2+(\alpha-1)/2=(\alpha-2)/2$ , and similarly for $B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3}$ . The coefficient for $B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}$ is $-1/2+(1-\alpha)+(\alpha-1)/2=-\alpha/2$ , and similarly for $B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}$ . The coefficient for $B_{1}A_{2}A_{3}$ is $-(1/2-\alpha)-(\alpha-1)/2=\alpha/2$ , and similarly for $B_{2}A_{1}A_{3}$ . The coefficient for $B_{1}A_{3}A_{2}$ is $-(\frac{1}{2}-\alpha)-(1-\alpha)-\frac{\alpha-1}{2}+(\alpha-1)^{2}=\alpha^{2}-\frac{\alpha}{2}=\frac{2\alpha^{2}-\alpha}{2}$ , and similarly for $B_{2}A_{3}A_{1}$ . The coefficient for $B_{3}A_{2}A_{1}$ is $(1-\alpha)-(\alpha-1)^{2}=\alpha-\alpha^{2}$ , and $B_{3}A_{1}A_{2}$ follows by permutation. The coefficient for $B_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1}$ is $-(1-\alpha)$ , and similarly for $B_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}$ . Finally

\begin{gathered}Y_{\perp}^{\operatorname{\mathsf{T}}}\operatorname{R^{\mathcal{M}}}_{\xi\eta}\phi=\frac{2\alpha^{2}-\alpha}{2}(B_{1}A_{3}A_{2}-B_{2}A_{3}A_{1})+(\alpha^{2}-\alpha)(B_{3}A_{1}A_{2}-B_{3}A_{2}A_{1})+\\ (1-\alpha)(B_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1})+\frac{\alpha-2}{2}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3})+\\ \frac{\alpha}{2}(B_{1}A_{2}A_{3}-B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}A_{1}A_{3}+B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1})\end{gathered}
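The derivation up to this point can be cross-checked numerically: evaluate eq. 2.1 in the ambient space, using eq. 3.1 for $\Gamma$ and the displayed formula for $\operatorname{D}_{\xi}\Gamma$ , and compare the $Y$ and $Y_{\perp}$ components with eqs. 3.3 and 3.4. The following NumPy sketch does this at a random point; all function names and test values are illustrative assumptions.

```python
import numpy as np

def gamma(Y, w1, w2, al):
    """Christoffel function of eq. 3.1."""
    P = np.eye(Y.shape[0]) - Y @ Y.T
    return 0.5 * Y @ (w1.T @ w2 + w2.T @ w1) + (1 - al) * P @ (w1 @ w2.T + w2 @ w1.T) @ Y

def d_gamma(Y, xi, w1, w2, al):
    """Directional derivative (D_xi Gamma)(w1, w2), as displayed in the proof."""
    P = np.eye(Y.shape[0]) - Y @ Y.T
    M = w1 @ w2.T + w2 @ w1.T
    return (0.5 * xi @ (w1.T @ w2 + w2.T @ w1)
            + (1 - al) * (P @ M @ xi - (xi @ Y.T + Y @ xi.T) @ M @ Y))

def curvature_ambient(Y, xi, eta, phi, al):
    """First line of eq. 2.1."""
    return (-d_gamma(Y, xi, eta, phi, al) + d_gamma(Y, eta, xi, phi, al)
            - gamma(Y, xi, gamma(Y, eta, phi, al), al)
            + gamma(Y, eta, gamma(Y, xi, phi, al), al))

def curvature_blocks(A1, A2, A3, B1, B2, B3, al):
    """A_R and B_R from eqs. 3.3 and 3.4."""
    br = lambda X, Z: X @ Z - Z @ X
    AR = ((1 - 2 * al) / 4 * (A1 @ B3.T @ B2 - A2 @ B3.T @ B1 - B1.T @ B3 @ A2 + B2.T @ B3 @ A1)
          + (1 - al) / 2 * (A3 @ B1.T @ B2 - A3 @ B2.T @ B1 - B1.T @ B2 @ A3 + B2.T @ B1 @ A3)
          + 0.25 * (br(br(A1, A2), A3) - A1 @ B2.T @ B3 + A2 @ B1.T @ B3
                    + B3.T @ B1 @ A2 - B3.T @ B2 @ A1))
    BR = ((2 * al**2 - al) / 2 * (B1 @ A3 @ A2 - B2 @ A3 @ A1)
          + (al**2 - al) * (B3 @ A1 @ A2 - B3 @ A2 @ A1)
          + (1 - al) * (B3 @ B1.T @ B2 - B3 @ B2.T @ B1)
          + (al - 2) / 2 * (B1 @ B2.T @ B3 - B2 @ B1.T @ B3)
          + al / 2 * (B1 @ A2 @ A3 - B1 @ B3.T @ B2 - B2 @ A1 @ A3 + B2 @ B3.T @ B1))
    return AR, BR

rng = np.random.default_rng(4)
n, p, al = 7, 3, 0.8                          # illustrative values
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y, Yp = Q[:, :p], Q[:, p:]
skew = lambda: (lambda S: S - S.T)(rng.standard_normal((p, p)))
A1, A2, A3 = skew(), skew(), skew()
B1, B2, B3 = (rng.standard_normal((n - p, p)) for _ in range(3))
xi, eta, phi = Y @ A1 + Yp @ B1, Y @ A2 + Yp @ B2, Y @ A3 + Yp @ B3

R = curvature_ambient(Y, xi, eta, phi, al)
AR, BR = curvature_blocks(A1, A2, A3, B1, B2, B3, al)
```

Comparing `Y.T @ R` with `AR` and `Yp.T @ R` with `BR` gives a direct test of the collected coefficients; the same functions can be reused with $\phi=\xi$ to evaluate the sectional curvature numerator.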

The Ricci curvature is $\operatorname{Tr}((A_{2},B_{2})\mapsto(A_{R},B_{R}))$ . Using item 3 of lemma A.1, for the $A_{R}$ component, we compute the trace of

A_{2}\mapsto\frac{1-2\alpha}{4}(-A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2})+\frac{1}{4}([[A_{1},A_{2}],A_{3}]+A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3}+B_{3}^{\operatorname{\mathsf{T}}}B_{1}A_{2})

which evaluates to $\frac{1-2\alpha}{4}(p-1)\operatorname{Tr}(-B_{3}^{\operatorname{\mathsf{T}}}B_{1})+\frac{1}{4}((2-p)\operatorname{Tr}(A_{1}A_{3})+p\operatorname{Tr}(B_{1}^{\operatorname{\mathsf{T}}}B_{3})-\operatorname{Tr}(B_{1}^{\operatorname{\mathsf{T}}}B_{3}))$ , or $\frac{2-p}{4}\operatorname{Tr}(A_{1}A_{3})+(p-1)\frac{\alpha}{2}\operatorname{Tr}(B_{1}^{\operatorname{\mathsf{T}}}B_{3})$ . Here, we need $p>1$ , otherwise $\mathfrak{o}(p)$ is zero and there is no contribution from this component.

For the $B_{R}$ component, using item 1 of lemma A.1, we compute

\begin{gathered}\operatorname{Tr}(B_{2}\mapsto\frac{2\alpha^{2}-\alpha}{2}(-B_{2}A_{3}A_{1})+(1-\alpha)(B_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1})+\\ \frac{\alpha-2}{2}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3})+\frac{\alpha}{2}(-B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}A_{1}A_{3}+B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}))\\ =\frac{2\alpha^{2}-\alpha}{2}(n-p)\operatorname{Tr}(-A_{3}A_{1})+(1-\alpha)(p-1)\operatorname{Tr}(B_{3}B_{1}^{\operatorname{\mathsf{T}}})+\\ \frac{\alpha-2}{2}(1-n+p)\operatorname{Tr}(B_{1}^{\operatorname{\mathsf{T}}}B_{3})+\frac{\alpha(n-2p)}{2}\operatorname{Tr}(B_{1}B_{3}^{\operatorname{\mathsf{T}}})-\frac{\alpha(n-p)}{2}\operatorname{Tr}(A_{1}A_{3})\end{gathered}

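Summing the two contributions gives the Ricci curvature $\textsc{Ric}(\xi,\phi)$ . In the last line below, $\phi$ is relabeled as $\eta$ (so $A_{3},B_{3}$ become $A_{2},B_{2}$ ) to match eq. 3.5: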

\begin{gathered}(\frac{2-p}{4}-\frac{2\alpha^{2}-\alpha}{2}(n-p)-\frac{\alpha(n-p)}{2})\operatorname{Tr}(A_{1}A_{3})+\\ \{(p-1)\frac{\alpha}{2}+(1-\alpha)(p-1)+\frac{\alpha-2}{2}(1-n+p)+\frac{\alpha(n-2p)}{2}\}\operatorname{Tr}(B_{1}^{\operatorname{\mathsf{T}}}B_{3})\\ =(\frac{2-p}{4}+(p-n)\alpha^{2})\operatorname{Tr}(A_{1}A_{2})+[(1-p)\alpha+(n-2)]\operatorname{Tr}(B_{1}^{\operatorname{\mathsf{T}}}B_{2})\end{gathered}

The Ricci map is thus $(A_{2},B_{2})\mapsto((\frac{p-2}{4\alpha}+(n-p)\alpha)A_{2},((1-p)\alpha+(n-2))B_{2})$ , which gives us the scalar curvature formula.
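The Ricci and scalar curvature formulas can be checked by taking the traces numerically instead of through lemma A.1. The NumPy sketch below traces the map $(A_{2},B_{2})\mapsto(A_{R},B_{R})$ over a coordinate basis, compares the result with eq. 3.5, and sums the Ricci curvature over a $\mathsf{g}$ -orthonormal basis to compare with eq. 3.6. It assumes the normalization $\alpha_{0}=1$ , and all function names are illustrative.

```python
import numpy as np

def curvature_blocks(A1, A2, A3, B1, B2, B3, al):
    """A_R and B_R from eqs. 3.3 and 3.4."""
    br = lambda X, Z: X @ Z - Z @ X
    AR = ((1 - 2 * al) / 4 * (A1 @ B3.T @ B2 - A2 @ B3.T @ B1 - B1.T @ B3 @ A2 + B2.T @ B3 @ A1)
          + (1 - al) / 2 * (A3 @ B1.T @ B2 - A3 @ B2.T @ B1 - B1.T @ B2 @ A3 + B2.T @ B1 @ A3)
          + 0.25 * (br(br(A1, A2), A3) - A1 @ B2.T @ B3 + A2 @ B1.T @ B3
                    + B3.T @ B1 @ A2 - B3.T @ B2 @ A1))
    BR = ((2 * al**2 - al) / 2 * (B1 @ A3 @ A2 - B2 @ A3 @ A1)
          + (al**2 - al) * (B3 @ A1 @ A2 - B3 @ A2 @ A1)
          + (1 - al) * (B3 @ B1.T @ B2 - B3 @ B2.T @ B1)
          + (al - 2) / 2 * (B1 @ B2.T @ B3 - B2 @ B1.T @ B3)
          + al / 2 * (B1 @ A2 @ A3 - B1 @ B3.T @ B2 - B2 @ A1 @ A3 + B2 @ B3.T @ B1))
    return AR, BR

def ricci_by_trace(A1, B1, A3, B3, al):
    """Trace of (A2, B2) -> (A_R, B_R) over the basis E_ij - E_ji (i < j) and E_ab."""
    p, m = A1.shape[0], B1.shape[0]
    total = 0.0
    for i in range(p):
        for j in range(i + 1, p):
            A2 = np.zeros((p, p)); A2[i, j], A2[j, i] = 1.0, -1.0
            AR, _ = curvature_blocks(A1, A2, A3, B1, np.zeros((m, p)), B3, al)
            total += AR[i, j]          # coefficient of the same basis element
    for a in range(m):
        for b in range(p):
            B2 = np.zeros((m, p)); B2[a, b] = 1.0
            _, BR = curvature_blocks(A1, np.zeros((p, p)), A3, B1, B2, B3, al)
            total += BR[a, b]
    return total

def ricci_35(A1, B1, A3, B3, al):
    p = A1.shape[0]; n = p + B1.shape[0]
    return (((2 - p) / 4 + (p - n) * al**2) * np.trace(A1 @ A3)
            + ((1 - p) * al + n - 2) * np.trace(B1.T @ B3))

def scalar_36(n, p, al):
    return (((1 - p) * al + n - 2) * (n - p) * p
            + ((n - p) * al + (p - 2) / (4 * al)) * p * (p - 1) / 2)

def scalar_by_trace(n, p, al):
    """Sum of Ric(e, e) over a g-orthonormal basis (alpha_0 = 1)."""
    m, total = n - p, 0.0
    ZA, ZB = np.zeros((p, p)), np.zeros((m, p))
    for i in range(p):
        for j in range(i + 1, p):
            X = np.zeros((p, p)); X[i, j], X[j, i] = 1.0, -1.0
            X /= np.sqrt(2 * al)       # alpha * Tr(X^T X) = 1 after scaling
            total += ricci_by_trace(X, ZB, X, ZB, al)
    for a in range(m):
        for b in range(p):
            E = np.zeros((m, p)); E[a, b] = 1.0
            total += ricci_by_trace(ZA, E, ZA, E, al)
    return total

rng = np.random.default_rng(5)
n, p, al = 6, 3, 0.6                   # illustrative values
skew = lambda: (lambda S: S - S.T)(rng.standard_normal((p, p)))
A1, A3 = skew(), skew()
B1, B3 = rng.standard_normal((n - p, p)), rng.standard_normal((n - p, p))
ric_numeric = ricci_by_trace(A1, B1, A3, B3, al)
ric_formula = ricci_35(A1, B1, A3, B3, al)
scl_numeric = scalar_by_trace(n, p, al)
scl_formula = scalar_36(n, p, al)
```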

For the sectional curvature, we substitute $A_{1},B_{1}$ in place of $A_{3},B_{3}$ in the expressions for $A_{R}$ and $B_{R}$ , then compute $\operatorname{Tr}(-\alpha A_{R}A_{2}+B_{R}B_{2}^{\operatorname{\mathsf{T}}})$

\begin{gathered}\operatorname{\hat{\mathcal{K}}}(\xi,\eta)=\operatorname{Tr}(-\alpha(\frac{1-2\alpha}{4}(A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{1})+\\ \frac{1-\alpha}{2}(A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{1}+B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{1})+\\ \frac{1}{4}([[A_{1},A_{2}],A_{1}]-A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}+B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{1}))A_{2})\\ +\operatorname{Tr}((\frac{2\alpha^{2}-\alpha}{2}(B_{1}A_{1}A_{2}-B_{2}A_{1}A_{1})+(\alpha^{2}-\alpha)(B_{1}A_{1}A_{2}-B_{1}A_{2}A_{1})+\\ (1-\alpha)(B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1})+\frac{\alpha-2}{2}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1})+\\ \frac{\alpha}{2}(B_{1}A_{2}A_{1}-B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}A_{1}A_{1}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}))B_{2}^{\operatorname{\mathsf{T}}})\end{gathered}

We collect terms. From $-\operatorname{Tr}([[A_{1},A_{2}],A_{1}]A_{2})=\operatorname{Tr}[A_{1},A_{2}][A_{1},A_{2}]^{\operatorname{\mathsf{T}}}$ , the terms involving $A_{1},A_{2}$ only are $\frac{\alpha}{4}\operatorname{Tr}[A_{1},A_{2}][A_{1},A_{2}]^{\operatorname{\mathsf{T}}}$ . Terms with both $A$ ’s and $B$ ’s:

\begin{gathered}\operatorname{Tr}(\alpha(\frac{1-2\alpha}{4}(-A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{2}+A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}+B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}^{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{1}A_{2})+\\ \frac{1-\alpha}{2}(-A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{2}+A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{2}+B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{1}A_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{1}A_{2})+\\ \frac{1}{4}(A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{2}-A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}-B_{1}^{\operatorname{\mathsf{T}}}B_{1}A_{2}^{2}+B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{1}A_{2})))\\ +\alpha\operatorname{Tr}(\frac{2\alpha-1}{2}(B_{1}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}A_{1}A_{1}B_{2}^{\operatorname{\mathsf{T}}})+(\alpha-1)(B_{1}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}-B_{1}A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}})+\\ \frac{1}{2}(B_{1}A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}A_{1}A_{1}B_{2}^{\operatorname{\mathsf{T}}}))=\\ \alpha\operatorname{Tr}((\frac{1-\alpha}{2}+\frac{1}{4}-(\alpha-1)+\frac{1}{2})A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+(-\frac{1-2\alpha}{4}-\frac{1-\alpha}{2})A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+\\ (-\frac{1-2\alpha}{4}-\frac{1-\alpha}{2}+\alpha-1+\frac{2\alpha-1}{2})A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\\ (\frac{1-\alpha}{2}+\frac{1}{4})A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+(2\frac{1-2\alpha}{4}-\frac{2}{4})A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}+(-\frac{2\alpha-1}{2}-\frac{1}{2})A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2})=\\ \alpha\operatorname{Tr}(\frac{9-6\alpha}{4}A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\frac{4\alpha-3}{4}A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+\frac{12\alpha-9}{4}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\\ \frac{3-2\alpha}{4}A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha 
A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2})=\\ \alpha\operatorname{Tr}((4\alpha-3)A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+(3-2\alpha)A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2})\end{gathered}

where we use $\operatorname{Tr}(A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1})=\operatorname{Tr}((A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1})^{\operatorname{\mathsf{T}}})=\operatorname{Tr}(A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2})$ and similarly $\operatorname{Tr}(A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2})=\operatorname{Tr}(A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1})$ . Next, we collect the terms with $B_{1}$ and $B_{2}$ only:

\begin{gathered}\operatorname{Tr}((1-\alpha)(B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})+\frac{\alpha-2}{2}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})+\\ \frac{\alpha}{2}(-B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}))=\\ \operatorname{Tr}((1-\frac{3\alpha}{2})B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+(\alpha-1+\frac{\alpha-2}{2})B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}+(-\frac{\alpha-2}{2}+\frac{\alpha}{2})B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\\ =\operatorname{Tr}(\frac{2-3\alpha}{2}B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+\frac{3\alpha-4}{2}B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\end{gathered}

This proves eq. 3.7. On the other hand, it is clear on the right-hand side of eq. 3.8, the $A$ ’s only term is $\frac{\alpha}{4}\operatorname{Tr}[A_{1},A_{2}][A_{1},A_{2}]^{\operatorname{\mathsf{T}}}$ , the $B$ ’s only term is:

\begin{gathered}(\frac{\alpha(3-4\alpha)^{2}}{4}+\frac{(1-2\alpha)^{3}}{2})\operatorname{Tr}(B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})(B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})^{\operatorname{\mathsf{T}}}+\\ \frac{1}{2}\operatorname{Tr}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}})(B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}})^{\operatorname{\mathsf{T}}}=\\ \frac{2-3\alpha}{4}\operatorname{Tr}(B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1})+\\ \frac{1}{2}\operatorname{Tr}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{2}B_{1}^{\operatorname{\mathsf{T}}}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{1}^{\operatorname{\mathsf{T}}}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\\ =\operatorname{Tr}(2\frac{2-3\alpha}{4}B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+(-2\frac{2-3\alpha}{4}-2\frac{1}{2})B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}+2\frac{1}{2}B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\\ =\operatorname{Tr}(\frac{2-3\alpha}{2}B_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}B_{2}^{\operatorname{\mathsf{T}}}+\frac{3\alpha-4}{2}B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}}+B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}B_{2}^{\operatorname{\mathsf{T}}})\end{gathered}

The terms with both $A$ and $B$ in eq. 3.8 are:

\begin{gathered}\alpha\operatorname{Tr}(\frac{3-4\alpha}{2}(A_{2}A_{1}-A_{1}A_{2})(B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})+\alpha(B_{1}A_{2}-B_{2}A_{1})^{\operatorname{\mathsf{T}}}(B_{1}A_{2}-B_{2}A_{1}))\\ =\alpha\operatorname{Tr}\{(\frac{3-4\alpha}{2}+\alpha)A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-\frac{3-4\alpha}{2}A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\frac{3-4\alpha}{2}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\\ (\frac{3-4\alpha}{2}+\alpha)A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2}\}\\ =\alpha\operatorname{Tr}\{\frac{3-2\alpha}{2}A_{2}A_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-\frac{3-4\alpha}{2}A_{2}A_{1}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\frac{3-4\alpha}{2}A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+\\ \frac{3-2\alpha}{2}A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2}\}\\ =\alpha\operatorname{Tr}\{(4\alpha-3)A_{1}A_{2}B_{2}^{\operatorname{\mathsf{T}}}B_{1}+(3-2\alpha)A_{1}A_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-\alpha A_{2}^{2}B_{1}^{\operatorname{\mathsf{T}}}B_{1}-\alpha A_{1}^{2}B_{2}^{\operatorname{\mathsf{T}}}B_{2}\}\end{gathered}

Therefore we have shown eq. 3.8 gives us the sectional curvature numerator. For the sign of the sectional curvature, in eq. 3.8 the terms are all non-negative, except possibly the last, which is non-negative if $\alpha\leq\frac{1}{2}$ . The formula for the curvature denominator is clear. ∎
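As a numerical sanity check of the term collection above, the following Python sketch compares, on random inputs, the three collected trace groups with the grouped form whose pieces are listed in the proof (the $A$-only term, the two $B$-only norms, and the mixed terms). The function names, matrix sizes and the sampling range for $\alpha$ are illustrative assumptions and are not part of the paper.

```python
import numpy as np

def khat_expanded(al, A1, B1, A2, B2):
    """Curvature numerator as the sum of the three collected trace groups."""
    C = A1 @ A2 - A2 @ A1
    a_only = al / 4 * np.trace(C @ C.T)
    mixed = al * np.trace(
        (4 * al - 3) * (A1 @ A2 @ B2.T @ B1)
        + (3 - 2 * al) * (A1 @ A2 @ B1.T @ B2)
        - al * (A2 @ A2 @ B1.T @ B1)
        - al * (A1 @ A1 @ B2.T @ B2))
    b_only = np.trace(
        (2 - 3 * al) / 2 * (B1 @ B1.T @ B2 @ B2.T)
        + (3 * al - 4) / 2 * (B1 @ B2.T @ B1 @ B2.T)
        + B2 @ B1.T @ B1 @ B2.T)
    return a_only + mixed + b_only

def khat_grouped(al, A1, B1, A2, B2):
    """Same quantity, grouped by Frobenius norms and one cross term."""
    fro2 = lambda M: float(np.sum(M * M))
    C = A1 @ A2 - A2 @ A1
    D = B2.T @ B1 - B1.T @ B2
    E = B1 @ B2.T - B2 @ B1.T
    F = B1 @ A2 - B2 @ A1
    cross = np.trace((A2 @ A1 - A1 @ A2) @ D)
    return (al / 4 * fro2(C) + (2 - 3 * al) / 4 * fro2(D) + 0.5 * fro2(E)
            + al * (3 - 4 * al) / 2 * cross + al ** 2 * fro2(F))

def max_relative_gap(trials=200, seed=0, p=3, m=4):
    """Largest relative discrepancy between the two forms on random data."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(trials):
        al = rng.uniform(0.05, 3.0)          # illustrative range for alpha
        A1, A2 = (M - M.T for M in rng.standard_normal((2, p, p)))
        B1, B2 = rng.standard_normal((2, m, p))
        x = khat_expanded(al, A1, B1, A2, B2)
        y = khat_grouped(al, A1, B1, A2, B2)
        worst = max(worst, abs(x - y) / (1.0 + abs(x)))
    return worst
```

Calling `max_relative_gap()` should return a value at the level of floating-point round-off if the two forms agree.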

Recall that an Einstein manifold is a Riemannian manifold whose Ricci curvature tensor is proportional to the metric tensor. We have a quick application.

Corollary 1.

For $p>1$ , the Stiefel manifold with the metric $\mathsf{g}=\alpha_{0}\omega+(\alpha_{1}-\alpha_{0})YY^{\operatorname{\mathsf{T}}}\omega$ is an Einstein manifold if and only if $\alpha=\alpha_{1}/\alpha_{0}$ satisfies the equation

(3.10)

(n-1)\alpha^{2}-(n-2)\alpha+\frac{p-2}{4}=0

For $p=2$ , $\alpha=\frac{n-2}{n-1}$ is the only value of $\alpha$ that makes $\mathrm{St}_{2,n}$ an Einstein manifold. If $p>2$ , there are two values of $\alpha$ in the family making the Stiefel manifold an Einstein manifold.

Proof.

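From eq. 3.5, the manifold is an Einstein manifold if and only if $(n-p)\alpha^{2}+(p-2)/4=\alpha(n-2+(1-p)\alpha)$ , from which eq. 3.10 follows. When $p=2$ , eq. 3.10 reduces to $\alpha((n-1)\alpha-(n-2))=0$ , so $\frac{n-2}{n-1}$ is the only positive solution. When $p>2$ , eq. 3.10 has discriminant $(n-2)^{2}-(p-2)(n-1)$ , which is positive because $p\leq n-1$ gives $(n-2)^{2}-(p-2)(n-1)\geq(n-2)^{2}-(n-3)(n-1)=1$ . The two roots are positive since their sum $\frac{n-2}{n-1}$ and their product $\frac{p-2}{4(n-1)}$ are positive. ∎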
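The corollary can be illustrated with a short Python sketch that solves eq. 3.10 and checks that the two coefficients of the Ricci map coincide at each root. The helper names `einstein_alphas` and `ricci_coefficients` are hypothetical, introduced here only for illustration.

```python
import math

def einstein_alphas(n, p):
    """Positive roots of (n-1) a^2 - (n-2) a + (p-2)/4 = 0 (eq. 3.10)."""
    a, b, c = n - 1, -(n - 2), (p - 2) / 4
    disc = b * b - 4 * a * c        # equals (n-2)^2 - (p-2)(n-1)
    if disc < 0:
        return []
    r = math.sqrt(disc)
    roots = sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})
    return [x for x in roots if x > 0]

def ricci_coefficients(n, p, al):
    """Coefficients of the Ricci map on the A and on the B component."""
    on_a = (p - 2) / (4 * al) + (n - p) * al
    on_b = (1 - p) * al + (n - 2)
    return on_a, on_b

if __name__ == "__main__":
    for n, p in [(5, 2), (5, 3), (6, 5)]:
        for al in einstein_alphas(n, p):
            print(n, p, al, ricci_coefficients(n, p, al))
```

For $p=2$ the function returns the single value $\frac{n-2}{n-1}$, since the root $\alpha=0$ is discarded.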

It is noted in [6] that when $p=n-1$ , $\mathrm{St}_{n-1,n}$ is just $\operatorname{SO}(n)$ . Thus, we have provided $\operatorname{SO}(n)$ with Einstein metrics.

4. Sectional curvature range

We have seen the sectional curvature numerator $\operatorname{\hat{\mathcal{K}}}$ could be expressed as a weighted sum of squares; this allows us to estimate the sectional curvature range. If $p=1$ then the Stiefel manifold is a sphere and has constant sectional curvature. Therefore we will assume $p>1$ below.

It is easy to establish upper and lower bounds (not tight) for the sectional curvature from eq. 3.8. Using the triangle inequality, we can bound $\mathcal{K}$ from eq. 3.8 by bounding an expression of the form $K_{1}=a\|[A_{1},A_{2}]\|^{2}_{F}+b\|B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}\|^{2}_{F}+c\|B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}\|^{2}_{F}+d\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}$ in terms of the curvature denominator $S:=(\alpha\|A_{1}\|_{F}^{2}+\|B_{1}\|_{F}^{2})(\alpha\|A_{2}\|_{F}^{2}+\|B_{2}\|_{F}^{2})$ . We use the inequality $\|[X,Z]\|_{F}^{2}\leq\|X\|_{F}^{2}\|Z\|_{F}^{2}$ for two antisymmetric matrices in $\mathfrak{o}(n)$ if $n>3$ ([3], lemma 2.5 provides the explicit matrices where we have equality, see also [4], proposition 4.2). Applying that inequality with $X=\frac{1}{\sqrt{2}}\begin{bmatrix}\sqrt{2\alpha}A_{1}&-B_{1}^{\operatorname{\mathsf{T}}}\\ B_{1}&0\end{bmatrix}$ , $Z=\frac{1}{\sqrt{2}}\begin{bmatrix}\sqrt{2\alpha}A_{2}&-B_{2}^{\operatorname{\mathsf{T}}}\\ B_{2}&0\end{bmatrix}$ , and the similar inequalities for $B_{1}=B_{2}=0$ and for $A_{1}=A_{2}=0$ , we can bound each term of $K_{1}$ in terms of $S$ , thus getting a bound for $\mathcal{K}$ .
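The block matrices used above can be checked numerically. The sketch below, with illustrative sizes $p=3$, $n-p=2$ chosen here as an assumption, builds the antisymmetric matrix from a pair $(A,B)$, confirms that its squared Frobenius norm is $\alpha\|A\|_{F}^{2}+\|B\|_{F}^{2}$, and samples the ratio $\|[X,Z]\|_{F}^{2}/(\|X\|_{F}^{2}\|Z\|_{F}^{2})$ on random data. Random sampling only illustrates the inequality; it does not prove it.

```python
import numpy as np

def embed(al, A, B):
    """Antisymmetric (p+m) x (p+m) matrix built from A (p x p) and B (m x p)."""
    m = B.shape[0]
    top = np.hstack([np.sqrt(2 * al) * A, -B.T])
    bottom = np.hstack([B, np.zeros((m, m))])
    return np.vstack([top, bottom]) / np.sqrt(2)

def fro2(M):
    """Squared Frobenius norm."""
    return float(np.sum(M * M))

def sample(trials=500, seed=0, p=3, m=2):
    """Return (worst norm mismatch, largest commutator ratio) over random draws."""
    rng = np.random.default_rng(seed)
    worst_norm, worst_ratio = 0.0, 0.0
    for _ in range(trials):
        al = rng.uniform(0.05, 3.0)
        A1, A2 = (M - M.T for M in rng.standard_normal((2, p, p)))
        B1, B2 = rng.standard_normal((2, m, p))
        X, Z = embed(al, A1, B1), embed(al, A2, B2)
        assert np.allclose(X, -X.T) and np.allclose(Z, -Z.T)
        target = al * fro2(A1) + fro2(B1)
        worst_norm = max(worst_norm, abs(fro2(X) - target) / target)
        comm = X @ Z - Z @ X
        worst_ratio = max(worst_ratio, fro2(comm) / (fro2(X) * fro2(Z)))
    return worst_norm, worst_ratio
```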

We will attempt to provide more refined bounds. The analysis of sectional curvature range for Stiefel manifolds is more complicated than that of symmetric spaces because of the presence of both the $A$ and $B$ components. The manifold is homogeneous, therefore the sectional curvature range is the same at any point. Let $E_{ij}$ $(1\leq i,j\leq p)$ be the elementary matrix in $\mathbb{R}^{p\times p}$ whose $(i,j)$ entry is $1$ and whose other entries are $0$ . Let $e_{ij}$ $(1\leq i\leq n-p,1\leq j\leq p)$ be the elementary matrix in $\mathbb{R}^{(n-p)\times p}$ whose $(i,j)$ entry is $1$ and whose other entries are $0$ .

In table 1, we show sectional curvature values of $\mathrm{St}_{p,n}$ at several sections (pairs of linearly independent tangent vectors), each defined by a quadruple $(A_{1},B_{1},A_{2},B_{2})$ . A few of those sections come from the corresponding sections for $\operatorname{SO}(n)$ in [3], as cited. We have noted that $\mathcal{K}$ is non-negative if $\alpha\leq\frac{1}{2}$ , and table 1 shows a section with $\mathcal{K}=\frac{2-3\alpha}{2}$ , thus, if $\alpha>\frac{2}{3}$ , $\mathcal{K}$ always has negative values in its range. When $p=2$ , we will show $\mathcal{K}$ is non-negative if $\alpha\leq\frac{2}{3}$ . When $p>2$ , $\mathcal{K}$ could be negative if $\frac{1}{2}<\alpha\leq\frac{2}{3}$ . To see this, let $A_{1}=E_{12}-E_{21},B_{1}=\gamma^{1/2}e_{11},A_{2}=E_{23}-E_{32}$ , $B_{2}=\gamma^{1/2}e_{13}$ for $\gamma\in\mathbb{R},\gamma>0$ . Thus, $[A_{1},A_{2}]=E_{13}-E_{31}$ , $B_{1}A_{2}=B_{2}A_{1}=0$ , $B_{1}B_{2}^{\operatorname{\mathsf{T}}}=0$ , $B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{1}=\gamma(E_{13}-E_{31})$ . By eq. 3.9, the corresponding sectional curvature is

(4.1)

\mathfrak{c}(\gamma)=\frac{\alpha/2+\alpha(4\alpha-3)\gamma+(2-3\alpha)\gamma^{2}/2}{(2\alpha+\gamma)^{2}}

Since $\frac{d}{d\gamma}\mathfrak{c}(\gamma)=\alpha((7-10\alpha)\gamma-1-6\alpha+8\alpha^{2})/(\gamma+2\alpha)^{3}$ , $\mathfrak{c}$ is minimized at

(4.2)

\gamma_{\min}(\alpha)=(1+6\alpha-8\alpha^{2})/(7-10\alpha)

Substitute in, the function $\mathfrak{l}(\alpha):=\mathfrak{c}(\gamma_{\min}(\alpha))$ is slightly negative for $\alpha$ in the interval $(\frac{1}{2},\frac{7}{10})$ , which contains $\frac{2}{3}$ . Note that $\alpha=\frac{7}{10}$ is a removable singularity of $\mathfrak{l}$ , and setting $\mathfrak{l}(\frac{7}{10})=\lim_{\gamma\to\infty}\mathfrak{c}(\gamma)=\frac{1}{2}(2-3\times\frac{7}{10})=\frac{-1}{20}$ makes it a smooth function. This function is strictly decreasing and negative in the interval $(\frac{1}{2},\frac{7}{10})$ , with $\mathfrak{l}(\frac{1}{2})=0$ and $\mathfrak{l}(\frac{2}{3})$ around $-0.02$ .
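The values quoted for $\mathfrak{l}$ can be reproduced with exact rational arithmetic. The following Python sketch transcribes eq. 4.1 and eq. 4.2; the function names `c`, `gamma_min` and `ell` are chosen here for illustration.

```python
from fractions import Fraction

def c(al, g):
    """Sectional curvature of eq. 4.1 as a function of alpha and gamma."""
    num = al / 2 + al * (4 * al - 3) * g + (2 - 3 * al) * g * g / 2
    return num / (2 * al + g) ** 2

def gamma_min(al):
    """Critical point of eq. 4.2 (undefined at alpha = 7/10)."""
    return (1 + 6 * al - 8 * al * al) / (7 - 10 * al)

def ell(al):
    """The function l(alpha) = c(gamma_min(alpha))."""
    return c(al, gamma_min(al))

if __name__ == "__main__":
    print(ell(Fraction(1, 2)))                    # value at alpha = 1/2
    print(ell(Fraction(2, 3)))                    # value at alpha = 2/3
    print(float(c(Fraction(7, 10), 10 ** 9)))     # large-gamma value at 7/10
```

With these definitions $\mathfrak{l}(\frac{1}{2})=0$ and $\mathfrak{l}(\frac{2}{3})=-\frac{1}{51}\approx-0.0196$ , consistent with the approximate value $-0.02$ quoted above.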

The curvature range contains the interval between the minimum and the maximum of the values in table 1 for which the condition in the last column of the table is satisfied. For $p=2$ , proposition 1 determines the exact curvature range. For $p>2$ , numerically, the sections in that table seem to determine the range completely. For each $\alpha$ , the lower and upper bounds of the curvature range, found numerically by optimizing $\mathcal{K}$ over the space of all sections (the Grassmann manifold of two-dimensional subspaces of $\mathbb{R}^{\dim\mathrm{St}_{p,n}}$ ), lie between the minimal and maximal values of the applicable sections in the table, as shown in fig. 1, 2, 3, 4. There, we plot the curvatures of the listed sections as functions of $\alpha$ for these scenarios, together with the results of the numerical optimization of the curvature range for a set of $30$ values of $\alpha$ . The optimization is done for $n=4,p=3$ , $n=5,p\in\{3,4\}$ , $n=6,p\in\{3,4,5\}$ , $n=10,p\in\{3,5,10,9\}$ , $n=100,p\in\{10,20\}$ . The curve $ll$ in the figure is for the function $\mathfrak{l}$ . The optimized maximum is sometimes smaller than the proposed maximum, for small $\alpha$ , because the optimizer may be stuck at a local maximum.

$\mathcal{K}$	$A$ and $B$	condition
0	$A_{1}=A_{2}=E_{12}-E_{21},B_{1}=2e_{13},B_{2}=-\alpha e_{13}$	$n\geq 4,p\geq 3$
0	$A_{1}=A_{2}=0,B_{1}=e_{11},B_{2}=e_{22}$	$n\geq 4,p\leq n-2$
1	$A_{1}=A_{2}=0,B_{1}=e_{11},B_{2}=e_{21}$	$n\geq 4,p\leq n-2$
$\frac{1}{2\alpha+1}$	$A_{1}=E_{12}-E_{21},A_{2}=E_{1p}-E_{p1},B_{1}=-e_{1p},B_{2}=e_{12}$	$p\geq 3$
$\frac{1}{8\alpha}$	$A_{1}=E_{12}-E_{21},A_{2}=E_{23}-E_{32},B_{1}=B_{2}=0$	$p\geq 3$
$\frac{1}{4\alpha}$	$A_{1}=E_{12}-E_{21}+E_{p-1,p}-E_{p,p-1},A_{2}=E_{1,p-1}-E_{p-1,1}-E_{2,p}+E_{p,2},B_{1}=B_{2}=0$	$p\geq 4$
$\frac{\alpha}{2}$	$A_{1}=(E_{12}-E_{21}),A_{2}=0,B_{1}=0,B_{2}=e_{11}$
$\frac{2-3\alpha}{2}$	$A_{1}=0,A_{2}=0,B_{1}=e_{11},B_{2}=e_{12}$
$\frac{4-3\alpha}{2}$	$A_{1}=A_{2}=0,B_{1}=e_{11}+e_{22},B_{2}=e_{12}-e_{21}$	$n\geq 4,p\leq n-2$
$\mathfrak{l}(\alpha)$	$A_{1}=E_{12}-E_{21},A_{2}=E_{23}-E_{32},B_{1}=\gamma_{\min}(\alpha)^{1/2}e_{11},B_{2}=\gamma_{\min}(\alpha)^{1/2}e_{13}$	$p\geq 3$ , $\alpha<7/10$

Table 1. Sectional curvature at representative sections. Here $\mathfrak{l}(\alpha)=\mathfrak{c}(\gamma_{\min}(\alpha))$ , from eq. 4.2 and eq. 4.1.
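The entries of table 1 can be checked numerically from the collected trace form of the curvature numerator. The sketch below evaluates each section for the illustrative choice $n=6$, $p=4$, $\alpha=0.3$ (so that every row's condition holds); the helper names and this choice of parameters are assumptions made here. The last row uses $A_{2}=E_{23}-E_{32}$, as in the text preceding eq. 4.1.

```python
import numpy as np

N_DIM, P_DIM, AL = 6, 4, 0.3        # illustrative n, p and alpha
M_DIM = N_DIM - P_DIM

def S(i, j):
    """E_ij - E_ji in R^{p x p} (1-based indices)."""
    M = np.zeros((P_DIM, P_DIM))
    M[i - 1, j - 1], M[j - 1, i - 1] = 1.0, -1.0
    return M

def e(i, j):
    """Elementary matrix e_ij in R^{(n-p) x p} (1-based indices)."""
    M = np.zeros((M_DIM, P_DIM))
    M[i - 1, j - 1] = 1.0
    return M

def sectional(al, A1, B1, A2, B2):
    """Curvature numerator (collected trace form) over the denominator."""
    C = A1 @ A2 - A2 @ A1
    num = al / 4 * np.sum(C * C)
    num += al * np.trace((4 * al - 3) * (A1 @ A2 @ B2.T @ B1)
                         + (3 - 2 * al) * (A1 @ A2 @ B1.T @ B2)
                         - al * (A2 @ A2 @ B1.T @ B1)
                         - al * (A1 @ A1 @ B2.T @ B2))
    num += np.trace((2 - 3 * al) / 2 * (B1 @ B1.T @ B2 @ B2.T)
                    + (3 * al - 4) / 2 * (B1 @ B2.T @ B1 @ B2.T)
                    + B2 @ B1.T @ B1 @ B2.T)
    n1 = al * np.sum(A1 * A1) + np.sum(B1 * B1)
    n2 = al * np.sum(A2 * A2) + np.sum(B2 * B2)
    ip = al * np.sum(A1 * A2) + np.sum(B1 * B2)
    return num / (n1 * n2 - ip ** 2)

def table_rows(al):
    """(expected value, A1, B1, A2, B2) for each row of table 1."""
    p, ZA, ZB = P_DIM, np.zeros((P_DIM, P_DIM)), np.zeros((M_DIM, P_DIM))
    g = (1 + 6 * al - 8 * al ** 2) / (7 - 10 * al)            # eq. 4.2
    ell = (al / 2 + al * (4 * al - 3) * g + (2 - 3 * al) * g ** 2 / 2) \
        / (2 * al + g) ** 2                                   # eq. 4.1
    return [
        (0.0, S(1, 2), 2 * e(1, 3), S(1, 2), -al * e(1, 3)),
        (0.0, ZA, e(1, 1), ZA, e(2, 2)),
        (1.0, ZA, e(1, 1), ZA, e(2, 1)),
        (1 / (2 * al + 1), S(1, 2), -e(1, p), S(1, p), e(1, 2)),
        (1 / (8 * al), S(1, 2), ZB, S(2, 3), ZB),
        (1 / (4 * al), S(1, 2) + S(p - 1, p), ZB, S(1, p - 1) - S(2, p), ZB),
        (al / 2, S(1, 2), ZB, ZA, e(1, 1)),
        ((2 - 3 * al) / 2, ZA, e(1, 1), ZA, e(1, 2)),
        ((4 - 3 * al) / 2, ZA, e(1, 1) + e(2, 2), ZA, e(1, 2) - e(2, 1)),
        (ell, S(1, 2), np.sqrt(g) * e(1, 1), S(2, 3), np.sqrt(g) * e(1, 3)),
    ]

def max_table_error(al=AL):
    """Largest deviation between tabulated and computed curvature values."""
    return max(abs(v - sectional(al, *sec)) for v, *sec in table_rows(al))
```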

Figure 1. Numerical test for curvature range, $n=4,p=3$ . Max, min sims are curvature ranges from numerical optimization.

Proposition 1.

If $p=2$ and $n=3$ , then the sectional curvature range of $\mathrm{St}_{p,n}$ is $[\frac{\alpha}{2},\frac{2-3\alpha}{2}]$ if $\alpha\leq\frac{1}{2}$ and $[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$ otherwise. In particular, if $\alpha<\frac{2}{3}$ , $\mathrm{St}_{2,3}$ has strictly positive sectional curvature.

If $p=2$ and $n>3$ then the sectional curvature range is $[0,\frac{4-3\alpha}{2}]$ if $\alpha\leq\frac{2}{3}$ , $[\frac{2-3\alpha}{2},1]$ if $\frac{2}{3}<\alpha\leq 2$ and $[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$ if $\alpha>2$ . Hence, when $n>3$ , $\mathrm{St}_{2,n}$ has non-negative curvature if $\alpha\leq\frac{2}{3}$ .

Proof.

When $p=2$ , $\mathfrak{o}(2)$ is one-dimensional, so $[A_{1},A_{2}]=0$ and we can set $A_{1}=(2\alpha)^{-1/2}c_{1}J,A_{2}=(2\alpha)^{-1/2}c_{2}J$ for $J=\begin{bmatrix}0&1\\ -1&0\end{bmatrix}$ , with $c_{1},c_{2}\in\mathbb{R}$ . Further, for two orthogonal matrices $U,V$ of compatible dimensions, the sectional curvature is unchanged if we replace $(A_{1},B_{1},A_{2},B_{2})$ with $(VA_{1}V^{\operatorname{\mathsf{T}}},UB_{1}V,VA_{2}V^{\operatorname{\mathsf{T}}},UB_{2}V)$ . Thus, we can assume $B_{1}$ is rectangular diagonal, with diagonal entries denoted by $d_{i}$ , $1\leq i\leq\min(n-p,p)$ . We denote entries of $B_{2}$ by $b_{ij}$ , $1\leq i\leq n-p,1\leq j\leq p$ . We note $B_{1}A_{2}-B_{2}A_{1}=(2\alpha)^{-1/2}(c_{2}B_{1}J-c_{1}B_{2}J)$ , and since $JJ^{\operatorname{\mathsf{T}}}=\operatorname{I}_{2}$ , $\alpha^{2}\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}=\alpha/2(c_{2}^{2}\operatorname{Tr}(B_{1}B_{1}^{\operatorname{\mathsf{T}}})+c_{1}^{2}\operatorname{Tr}(B_{2}B_{2}^{\operatorname{\mathsf{T}}})-2c_{1}c_{2}\operatorname{Tr}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}))$ . Since the sectional curvature depends only on the section, we can also assume the two tangent vectors are orthogonal. The orthogonality condition $\alpha\operatorname{Tr}A_{1}A_{2}^{\operatorname{\mathsf{T}}}+\operatorname{Tr}B_{1}B_{2}^{\operatorname{\mathsf{T}}}=0$ implies $c_{1}c_{2}+\operatorname{Tr}B_{1}B_{2}^{\operatorname{\mathsf{T}}}=c_{1}c_{2}+\sum_{i=1}^{\min(p,n-p)}d_{i}b_{ii}=0$ , or $c_{1}c_{2}=-\operatorname{Tr}B_{1}B_{2}^{\operatorname{\mathsf{T}}}$ , so $-2c_{1}c_{2}\operatorname{Tr}B_{1}B_{2}^{\operatorname{\mathsf{T}}}=c_{1}^{2}c_{2}^{2}+(\operatorname{Tr}B_{1}B_{2}^{\operatorname{\mathsf{T}}})^{2}$ . This implies

\alpha^{2}\|B_{1}A_{2}-B_{2}A_{1}\|_{F}^{2}=\alpha/2(c_{2}^{2}\operatorname{Tr}(B_{1}B_{1}^{\operatorname{\mathsf{T}}})+c_{1}^{2}\operatorname{Tr}(B_{2}B_{2}^{\operatorname{\mathsf{T}}})+c_{1}^{2}c_{2}^{2}+(\operatorname{Tr}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}))^{2})

For the case $n=3$ , from eq. 3.9, the curvature numerator $\operatorname{\hat{\mathcal{K}}}$ is reduced to

\frac{2-3\alpha}{2}b_{12}^{2}d_{1}^{2}+\frac{\alpha}{2}(c_{2}^{2}d_{1}^{2}+c_{1}^{2}(b_{11}^{2}+b_{12}^{2})+c_{1}^{2}c_{2}^{2}+d_{1}^{2}b_{11}^{2})

and the curvature denominator is $S=(c_{1}^{2}+d_{1}^{2})(c_{2}^{2}+b_{11}^{2}+b_{12}^{2})$ . We have $\operatorname{\hat{\mathcal{K}}}-\alpha/2S=(1-2\alpha)b_{12}^{2}d_{1}^{2}$ , $\operatorname{\hat{\mathcal{K}}}-(1-3\alpha/2)S=(2\alpha-1)(c_{2}^{2}d_{1}^{2}+c_{1}^{2}(b_{11}^{2}+b_{12}^{2})+c_{1}^{2}c_{2}^{2}+d_{1}^{2}b_{11}^{2})$ . Thus, the signs of the differences are dependent on $1-2\alpha$ , and $\operatorname{\hat{\mathcal{K}}}$ is between the smaller and the larger of $\alpha/2S$ and $(1-3\alpha/2)S$ . The bound is tight based on table 1.

When $n>3$ , the denominator is $S=(c_{1}^{2}+\sum_{i=1}^{2}d_{i}^{2})(c_{2}^{2}+\sum_{ij}b^{2}_{ij})$ . $B_{1}$ consists of a square diagonal block of size $2\times 2$ and the remaining zero block of size $(n-4)\times 2$ . Expanding $\|B_{1}B_{2}^{\operatorname{\mathsf{T}}}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}\|_{F}^{2}$ by dividing $B_{2}$ into a square block corresponding to indices not exceeding two, which contributes $2(b_{21}d_{1}-b_{12}d_{2})^{2}$ , and the remaining blocks, which contribute $2\sum_{j=1}^{2}\sum_{i>2}b_{ij}^{2}d_{j}^{2}$ , we find that $\operatorname{\hat{\mathcal{K}}}$ is

\begin{gathered}\frac{2-3\alpha}{2}(b_{21}d_{2}-b_{12}d_{1})^{2}+\frac{\alpha}{2}(c_{2}^{2}\sum d_{i}^{2}+c_{1}^{2}\sum_{ij}b_{ij}^{2}+c_{1}^{2}c_{2}^{2}+(\sum_{i=1}^{2}d_{i}b_{ii})^{2})\\ +(b_{21}d_{1}-b_{12}d_{2})^{2}+\sum_{j=1}^{2}\sum_{i>2}b_{ij}^{2}d_{j}^{2}\end{gathered}

The above expression shows when $\alpha\leq 2/3$ , $\mathcal{K}\geq 0$ . In this case, $1\leq 2-3\alpha/2$ , $\alpha/2\leq 2-3\alpha/2$ , thus $\sum_{j=1}^{2}\sum_{i>2}b_{ij}^{2}d_{j}^{2}\leq(2-3\alpha/2)\sum_{i=1}^{2}d_{i}^{2}\sum_{i>2}b^{2}_{ij}$ and

\frac{\alpha}{2}(c_{2}^{2}\sum d_{i}^{2}+c_{1}^{2}\sum_{ij}b_{ij}^{2}+c_{1}^{2}c_{2}^{2})\leq\frac{4-3\alpha}{2}(c_{2}^{2}\sum d_{i}^{2}+c_{1}^{2}\sum_{ij}b_{ij}^{2}+c_{1}^{2}c_{2}^{2})

To show $\operatorname{\hat{\mathcal{K}}}\leq(2-3\alpha/2)S$ , we only need to show

\frac{2-3\alpha}{2}(b_{21}d_{2}-b_{12}d_{1})^{2}+(b_{21}d_{1}-b_{12}d_{2})^{2}+\frac{\alpha}{2}(\sum_{i=1}^{2}d_{i}b_{ii})^{2}\leq(2-\frac{3\alpha}{2})\sum_{k=1}^{2}d_{k}^{2}\sum_{i\leq 2}b_{ij}^{2}

This follows from the Cauchy-Schwarz inequality, applied to the three squared combinations on the left-hand side, after which we sum the resulting inequalities: the first two terms on the left-hand side are dominated by $((2-3\alpha)/2+1)(d_{1}^{2}+d_{2}^{2})(b_{21}^{2}+b_{12}^{2})$ , while the last one is dominated by $\alpha/2(d_{1}^{2}+d_{2}^{2})(b_{11}^{2}+b_{22}^{2})\leq(2-3\alpha/2)(d_{1}^{2}+d_{2}^{2})(b_{11}^{2}+b_{22}^{2})$ .

Next, when $\alpha>2/3$ , by Cauchy-Schwarz, $\operatorname{\hat{\mathcal{K}}}\geq(1-3\alpha/2)(b_{21}^{2}+b_{12}^{2})(d_{1}^{2}+d_{2}^{2})\geq(1-3\alpha/2)S$ , as $1-3\alpha/2<0$ . When $2/3<\alpha\leq 2$ , $\alpha/2\leq 1$ , thus $\operatorname{\hat{\mathcal{K}}}\leq S$ , as the first term of $\operatorname{\hat{\mathcal{K}}}$ is non-positive, while we can use Cauchy-Schwarz on $(\sum d_{i}b_{ii})^{2}$ and $(b_{21}d_{1}-b_{12}d_{2})^{2}$ as before. Finally, for $\alpha>2$ , $\operatorname{\hat{\mathcal{K}}}\leq\alpha/2S$ , again because the first term of $\operatorname{\hat{\mathcal{K}}}$ is non-positive, while the remaining terms are dominated by the corresponding terms in $\alpha/2S$ , using Cauchy-Schwarz if necessary. Again, the bounds are tight using table 1. ∎
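For the case $p=2$, $n=3$, the reduced numerator and denominator in the proof make the range easy to sample. The Python sketch below draws random $(c_{1},d_{1},b_{11},b_{12})$, fixes $c_{2}$ from the orthogonality condition $c_{1}c_{2}+d_{1}b_{11}=0$, and records the smallest and largest curvature found. The function names and sampling scheme are assumptions made for illustration, and sampling only illustrates the bounds.

```python
import numpy as np

def k_st23(al, c1, d1, b11, b12):
    """Sectional curvature on St_{2,3} from the reduced formulas of the proof."""
    c2 = -d1 * b11 / c1                      # orthogonality: c1 c2 + d1 b11 = 0
    q = (c2 ** 2 * d1 ** 2 + c1 ** 2 * (b11 ** 2 + b12 ** 2)
         + c1 ** 2 * c2 ** 2 + d1 ** 2 * b11 ** 2)
    khat = (2 - 3 * al) / 2 * b12 ** 2 * d1 ** 2 + al / 2 * q
    s = (c1 ** 2 + d1 ** 2) * (c2 ** 2 + b11 ** 2 + b12 ** 2)
    return khat / s

def sample_range(al, trials=20000, seed=0):
    """Smallest and largest sampled curvature for a given alpha."""
    rng = np.random.default_rng(seed)
    c1, d1, b11, b12 = rng.standard_normal((4, trials))
    vals = k_st23(al, c1, d1, b11, b12)
    return float(vals.min()), float(vals.max())

if __name__ == "__main__":
    for al in (0.3, 0.5, 0.9):
        lo, hi = sorted((al / 2, (2 - 3 * al) / 2))
        print(al, sample_range(al), (lo, hi))
```

For each $\alpha$ the sampled values should stay between the smaller and the larger of $\frac{\alpha}{2}$ and $\frac{2-3\alpha}{2}$ .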

We note $\mathrm{St}_{2,3}$ is $\operatorname{SO}(3)$ , and could be considered as the sphere $S^{3}$ with antipodal points identified (via the quaternion representation, for example). From the formula for the metric, we see this is the projective version of the Berger sphere.

Proposition 2.

For $p\geq 3$ , the sectional curvature range of $\mathrm{St}_{p,n}$ contains an interval $I=I(n,p,\alpha)$ as described in table 2. The first row describes the applicable combination of $(n,p)$ ; the columns labeled $\alpha_{u}$ specify the range of $\alpha$ where the interval formula next to it is applicable. The interval is applicable for $\alpha$ greater than the previous $\alpha_{u}$ (if it exists) and not exceeding the current $\alpha_{u}$ .

$(n=4,p=3)$		$(n,3),n\geq 5$		$(n,p),n-2\geq p\geq 4$		$(n,n-1),n\geq 5$
$\alpha_{u}$	I	$\alpha_{u}$	I	$\alpha_{u}$	I	$\alpha_{u}$	I
$\frac{1}{6}$	$[0,\frac{1}{8\alpha}]$	$\frac{4-\sqrt{13}}{6}$	$[0,\frac{1}{8\alpha}]$	$\frac{4-\sqrt{10}}{6}$	$[0,\frac{1}{4\alpha}]$	$1/2$	$[0,\frac{1}{4\alpha}]$
$1/2$	$[0,\frac{1}{1+2\alpha}]$	$1/2$	$[0,\frac{4-3\alpha}{2}]$	$1/2$	$[0,\frac{4-3\alpha}{2}]$	$\frac{7}{10}$	$[\mathfrak{l}(\alpha),\frac{1}{1+2\alpha}]$
$\frac{7}{10}$	$[\mathfrak{l}(\alpha),\frac{1}{1+2\alpha}]$	$\frac{2}{3}$	$[\mathfrak{l}(\alpha),\frac{4-3\alpha}{2}]$	$\frac{2}{3}$	$[\mathfrak{l}(\alpha),\frac{4-3\alpha}{2}]$	$\frac{\sqrt{17}-1}{4}$	$[\frac{2-3\alpha}{2},\frac{1}{1+2\alpha}]$
$\frac{\sqrt{17}-1}{4}$	$[\frac{2-3\alpha}{2},\frac{1}{1+2\alpha}]$	$\frac{7}{10}$	$[\mathfrak{l}(\alpha),1]$	$\frac{7}{10}$	$[\mathfrak{l}(\alpha),1]$	$\infty$	$[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$
$\infty$	$[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$	$2$	$[\frac{2-3\alpha}{2},1]$	$2$	$[\frac{2-3\alpha}{2},1]$
		$\infty$	$[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$	$\infty$	$[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$

Table 2. Interval contained in the sectional curvature range of the Stiefel manifold $\mathrm{St}_{p,n}$ with metric defined by $\alpha$ . Here $\mathfrak{l}(\alpha)=\mathfrak{c}(\gamma_{\min})$ with $\mathfrak{c}$ defined in eq. 4.1, and $\gamma_{\min}$ in eq. 4.2.

To illustrate, with $(n,p)=(4,3)$ , for $\alpha\leq\frac{1}{6}$ , the sectional curvature range contains the interval $[0,\frac{1}{8\alpha}]$ , for $\frac{1}{6}<\alpha\leq\frac{1}{2}$ , it contains the interval $[0,\frac{1}{1+2\alpha}]$ , etc. In the final row, for $\alpha>\frac{\sqrt{17}-1}{4}$ , it contains the interval $[\frac{2-3\alpha}{2},\frac{\alpha}{2}]$ .
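The algebraic break points $\alpha_{u}$ in table 2 are the values of $\alpha$ at which two of the candidate bounds from table 1 coincide. The short Python sketch below checks this for each break point; the dictionary layout and its labels are chosen here for illustration.

```python
import math

# Each entry: label -> (break point alpha_u, difference of the two bounds
# that are expected to coincide there).
BREAK_POINTS = {
    "1/(8a) = 1/(1+2a)": (1 / 6, lambda a: 1 / (8 * a) - 1 / (1 + 2 * a)),
    "1/(8a) = (4-3a)/2": ((4 - math.sqrt(13)) / 6,
                          lambda a: 1 / (8 * a) - (4 - 3 * a) / 2),
    "1/(4a) = (4-3a)/2": ((4 - math.sqrt(10)) / 6,
                          lambda a: 1 / (4 * a) - (4 - 3 * a) / 2),
    "1/(4a) = 1/(1+2a)": (1 / 2, lambda a: 1 / (4 * a) - 1 / (1 + 2 * a)),
    "1/(1+2a) = a/2": ((math.sqrt(17) - 1) / 4,
                       lambda a: 1 / (1 + 2 * a) - a / 2),
    "(4-3a)/2 = 1": (2 / 3, lambda a: (4 - 3 * a) / 2 - 1),
    "1 = a/2": (2.0, lambda a: 1 - a / 2),
}

def max_residual():
    """Largest mismatch between the two bounds at their break point."""
    return max(abs(f(a)) for a, f in BREAK_POINTS.values())

if __name__ == "__main__":
    for label, (a, f) in BREAK_POINTS.items():
        print(f"{label:22s} alpha_u = {a:.6f}  residual = {f(a):.2e}")
```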

Proof.

It is straightforward to check that for each pair $(n,p)$ in table 2, the values indicated correspond to a quadruple $(A_{1},B_{1},A_{2},B_{2})$ in table 1, which is applicable for the pair. For example, in the case $(n,p)=(4,3)$ , the only applicable values from table 1 are $0$ (from the first row), $\frac{1}{2\alpha+1}$ , $\frac{1}{8\alpha}$ , $\mathfrak{l}(\alpha)$ and $\frac{2-3\alpha}{2}$ . To show the sectional curvature range contains $I$ , it remains to verify the lower end of $I$ is not greater than the upper end, which is immediate, as $\mathfrak{l}(\alpha)$ is negative between $\frac{1}{2}$ and $\frac{7}{10}$ , and $\frac{2-3\alpha}{2}$ is negative for $\alpha>\frac{7}{10}>\frac{2}{3}$ .

The graphs in figures 1, 2, 3, 4 display the relative values of these functions. As all the functions involved are simple algebraic functions, except for $\mathfrak{l}$ , if we can assess the contribution of $\mathfrak{l}$ , it will be easy to check that the lower end of $I$ corresponds to the smallest value among the applicable values, and the upper to the largest of the applicable values. The function $\gamma_{\min}$ from eq. 4.2 has a root at $\alpha_{s}=\frac{3+\sqrt{17}}{8}$ at around $0.89$ , and is negative in the interval $(\frac{7}{10},\alpha_{s})$ , hence $\sqrt{\gamma_{\min}}$ and $B_{1},B_{2}$ for this section are not defined, so $\mathfrak{l}(\alpha)$ cannot be an extremum for $\alpha\in(\frac{7}{10},\alpha_{s})$ . In the interval $[\alpha_{s},2]$ , $\mathfrak{l}$ has the approximate range of $[0.14,0.38]$ , less than $1$ , and in the interval $[\alpha_{s},\frac{\sqrt{17}-1}{4}]$ it is less than $\frac{1}{1+2\alpha}$ . For large $\alpha$ , $\gamma_{\min}$ is approximated by $0.8\alpha$ , thus $\mathfrak{l}(\alpha)$ has an asymptote with slope $\frac{4\times 0.8-3\times 0.8^{2}/2}{2.8^{2}}\approx 0.286$ , smaller than the slope of $\frac{\alpha}{2}$ . It is also easy to graph $\mathfrak{l}$ in the interim to show beyond the contribution to the lower bound in $[1/2,\frac{7}{10}]$ , $\mathfrak{l}$ has no other effect on the curvature range.

With that analysis, for the case $(n,p)=(4,3)$ , the only applicable values from table 1 are $0$ (from the first row), $\frac{1}{2\alpha+1}$ , $\frac{1}{8\alpha}$ , $\mathfrak{l}(\alpha)$ and $\frac{2-3\alpha}{2}$ . If $\alpha<1/2$ , all these functions are non-negative, and thus $0$ is the smallest value among them. When $\frac{1}{2}<\alpha<\frac{7}{10}$ , $\mathfrak{l}(\alpha)$ is negative, and in the interval $[\frac{2}{3},\frac{7}{10}]$ , $\frac{2-3\alpha}{2}$ is also non-positive, but $\mathfrak{l}(\alpha)$ is the lesser of the two, while we have discussed $\mathfrak{l}(\alpha)$ has no effect for $\alpha>\frac{7}{10}$ . Thus, for $\alpha>\frac{7}{10}$ the upper end of $I$ is $\max(\frac{1}{1+2\alpha},\frac{\alpha}{2})$ , with the break-even point $\frac{\sqrt{17}-1}{4}$ . In general, considering the upper or lower ends of $I$ as functions of $\alpha$ , the values in column $\alpha_{u}$ correspond to nonsmooth points of these functions or infinity.

For the case $n\geq 5,p=n-1$ , $(0,\frac{1}{4\alpha},\frac{1}{8\alpha},\frac{1}{2\alpha+1},\mathfrak{l}(\alpha),\frac{2-3\alpha}{2},\frac{\alpha}{2})$ are the applicable curvature values. Again, with $\mathfrak{l}$ having an effect only in $[\frac{1}{2},\frac{7}{10}]$ , it is straightforward to verify the piecewise smooth function $\max(0,\frac{1}{4\alpha},\frac{1}{8\alpha},\frac{1}{2\alpha+1},\mathfrak{l}(\alpha),\frac{2-3\alpha}{2},\frac{\alpha}{2})$ has the form corresponding to the upper end of $I$ , and the lower end corresponds to the minimum of those functions, for $\alpha>\frac{7}{10}$ . We address the case $p\leq n-2$ similarly. ∎

For $\alpha=\frac{1}{2}$ , when $p=n-1,n\geq 4$ , the range contains $[0,\frac{1}{2}]$ , and it could be shown to be exactly $[0,\frac{1}{2}]$ as the manifold is isometric to $\operatorname{SO}(n)$ with a bi-invariant metric. If $2\leq p\leq n-2,n\geq 5$ , the range contains $[0,2-3\alpha/2]=[0,5/4]$ , which is proved to be the exact range in [14]. For $\alpha=1$ , the interval is $[-1/2,1]$ . From the numerical evidence mentioned, this seems to be tight. We note for $p\geq 3$ , the curvature range becomes large both when $\alpha$ is large and when $\alpha$ is small.

5. Deformation metrics on normal homogeneous manifolds

For a Lie group $\mathtt{G}$ , with $U\in\mathtt{G}$ , we will denote by $\mathcal{L}_{U}$ the left-multiplication map and by $d\mathcal{L}_{U}$ its differential. As usual, $\operatorname{ad}_{A}$ denotes the operator $X\mapsto[A,X]$ on the Lie algebra $\mathfrak{g}$ of $\mathtt{G}$ ( $A,X\in\mathfrak{g}$ ). We recall a few results on curvatures of Lie groups.

Proposition 3.

Let $\mathtt{G}$ be a connected Lie group with Lie algebra $\mathfrak{g}$ with a left-invariant metric given by an inner product $\langle\rangle_{\operatorname{P}}$ on $\mathfrak{g}$ . For $A\in\mathfrak{g}$ , let $\operatorname{ad}_{A}^{\dagger}$ be the adjoint of $\operatorname{ad}_{A}$ under $\langle\rangle_{\operatorname{P}}$ , that means $\operatorname{ad^{\dagger}}_{A}$ is a linear operator on $\mathfrak{g}$ such that $\langle[A,A_{1}],A_{2}\rangle_{\operatorname{P}}=\langle A_{1},\operatorname{ad^{\dagger}}_{A}A_{2}\rangle_{\operatorname{P}}$ . Define

(5.1)

[A,B]_{\operatorname{P}}=[A,B]-\operatorname{ad^{\dagger}}_{A}B-\operatorname{ad^{\dagger}}_{B}A

Let $\nabla^{\mathtt{G}}$ be the Levi-Civita connection on $\mathtt{G}$ . For two vector fields $\mathtt{X},\mathtt{Y}$ on $\mathtt{G}$ , there exist $\mathfrak{g}$ -valued functions $A(U),B(U)$ , $U\in\mathtt{G}$ such that $\mathtt{X}(U)=d\mathcal{L}_{U}A(U),\mathtt{Y}(U)=d\mathcal{L}_{U}B(U)$ . We have

(5.2)

(\nabla^{\mathtt{G}}_{\mathtt{X}}\mathtt{Y})(U)=d\mathcal{L}_{U}((\operatorname{D}_{\mathtt{X}}B)(U)+\frac{1}{2}[A(U),B(U)]_{\operatorname{P}})

where $\operatorname{D}_{\mathtt{X}}B$ is the Lie-derivative of the $\mathfrak{g}$ -valued function $B$ by the vector field $\mathtt{X}$ .

For $\omega_{1},\omega_{2},\omega_{3}\in\mathfrak{g}$ , the curvature of $\mathtt{G}$ at the identity is given by

(5.3)

\operatorname{R}^{\mathtt{G}}_{\omega_{1},\omega_{2}}\omega_{3}=\frac{1}{2}[[\omega_{1},\omega_{2}],\omega_{3}]_{\operatorname{P}}-\frac{1}{4}[\omega_{1},[\omega_{2},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}}+\frac{1}{4}[\omega_{2},[\omega_{1},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}}

Let $\mathfrak{k}$ be a subalgebra of $\mathfrak{g}$ such that $\operatorname{P}$ is $\operatorname{ad}(\mathfrak{k})$ -invariant, that is, $\langle[K,A],B\rangle_{\operatorname{P}}+\langle A,[K,B]\rangle_{\operatorname{P}}=0$ for $K\in\mathfrak{k},A,B\in\mathfrak{g}$ , and $\mathfrak{k}$ corresponds to a closed subgroup $\mathtt{K}\subset\mathtt{G}$ , such that $\mathtt{K}$ acts freely and properly on $\mathtt{G}$ by isometries under right multiplication and $\mathtt{G}/\mathtt{K}$ is a homogeneous space. If $\mathfrak{g}=\mathfrak{k}\oplus\mathfrak{m}$ is an orthogonal decomposition under $\langle\rangle_{\operatorname{P}}$ , then the horizontal lift of the curvature of $\mathtt{M}=\mathtt{G}/\mathtt{K}$ at $o$ , the equivalence class containing the unit of $\mathtt{G}$ , evaluated at three horizontal vectors $\omega_{1},\omega_{2},\omega_{3}\in\mathfrak{m}$ is

(5.4)

\begin{gathered}\operatorname{R}^{\mathtt{M}}_{\omega_{1},\omega_{2}}\omega_{3}=(\frac{1}{2}[[\omega_{1},\omega_{2}],\omega_{3}]_{\operatorname{P}}-\frac{1}{4}[\omega_{1},[\omega_{2},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}}+\frac{1}{4}[\omega_{2},[\omega_{1},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}}\\ +\frac{1}{2}\operatorname{ad^{\dagger}}_{\omega_{3}}[\omega_{1},\omega_{2}]_{\mathfrak{k}}-\frac{1}{4}\operatorname{ad^{\dagger}}_{\omega_{1}}[\omega_{2},\omega_{3}]_{\mathfrak{k}}+\frac{1}{4}\operatorname{ad^{\dagger}}_{\omega_{2}}[\omega_{1},\omega_{3}]_{\mathfrak{k}})_{\mathfrak{m}}\end{gathered}

Here, $\omega_{\mathfrak{v}}$ denotes the orthogonal projection of $\omega$ to $\mathfrak{v}$ for an element $\omega\in\mathfrak{g}$ and a subspace $\mathfrak{v}$ of $\mathfrak{g}$ . Also, given two vector fields $\mathtt{X},\mathtt{Y}$ on $\mathtt{M}$ , which lift to horizontal vector fields $\bar{\mathtt{X}},\bar{\mathtt{Y}}$ on $\mathtt{G}$ , with $\bar{\mathtt{X}}(U)=d\mathcal{L}_{U}A(U),\bar{\mathtt{Y}}(U)=d\mathcal{L}_{U}B(U)$ for two $\mathfrak{g}$ -valued functions $A(U),B(U)$ on $\mathtt{G}$ , the horizontal lift of $\nabla_{\mathtt{X}}\mathtt{Y}$ is given by

(5.5)

\overline{\nabla_{\mathtt{X}}\mathtt{Y}(U)}=d\mathcal{L}_{U}((\operatorname{D}_{\bar{\mathtt{X}}}B)(U)+\frac{1}{2}[A(U),B(U)]_{\operatorname{P}})_{\mathfrak{m}}

Note that in general $[\quad]_{\operatorname{P}}$ is not anticommutative, since the term $\operatorname{ad^{\dagger}}_{A}B+\operatorname{ad^{\dagger}}_{B}A$ is symmetric in $A$ and $B$ ; instead, we have $[A,B]_{\operatorname{P}}-[B,A]_{\operatorname{P}}=2[A,B]$ .

Proof.

First, we note that for three $\mathfrak{g}$ -valued functions $A,B,C$

\begin{gathered}\langle[A,B]_{\operatorname{P}},C\rangle_{\operatorname{P}}+\langle B,[A,C]_{\operatorname{P}}\rangle_{\operatorname{P}}=\langle[A,B],C\rangle_{\operatorname{P}}-\langle B,[A,C]\rangle_{\operatorname{P}}-\langle A,[B,C]\rangle_{\operatorname{P}}\\ +\langle B,[A,C]\rangle_{\operatorname{P}}-\langle[A,B],C\rangle_{\operatorname{P}}-\langle[C,B],A\rangle_{\operatorname{P}}=0\end{gathered}

For each smooth function $F:\mathtt{G}\to\mathfrak{g}$ , denote by $\mathcal{L}[F]$ the vector field $U\mapsto d\mathcal{L}_{U}F(U)$ . Denote by $\langle\rangle_{\mathtt{G}}$ the left-invariant metric induced by $\operatorname{P}$ . For three vector fields $\mathtt{X},\mathtt{Y},\mathtt{Z}$ with $\mathtt{X}=\mathcal{L}[A],\mathtt{Y}=\mathcal{L}[B]$ and $\mathtt{Z}=\mathcal{L}[C]$ for three smooth $\mathfrak{g}$ -valued functions $A,B,C$ , we have

\begin{gathered}\operatorname{D}_{\mathtt{X}}\langle\mathtt{Y},\mathtt{Z}\rangle_{\mathtt{G}}=\operatorname{D}_{\mathtt{X}}\langle B,C\rangle_{\operatorname{P}}=\langle\operatorname{D}_{\mathtt{X}}B,C\rangle_{\operatorname{P}}+\langle B,\operatorname{D}_{\mathtt{X}}C\rangle_{\operatorname{P}}\\ =\langle\mathcal{L}[\operatorname{D}_{\mathtt{X}}B+\frac{1}{2}[A,B]_{\operatorname{P}}],\mathtt{Z}\rangle_{\mathtt{G}}+\langle\mathtt{Y},\mathcal{L}[\operatorname{D}_{\mathtt{X}}C+\frac{1}{2}[A,C]_{\operatorname{P}}]\rangle_{\mathtt{G}}\end{gathered}

where we use that the metric is left-invariant, that $\operatorname{P}$ is constant on $\mathfrak{g}$ , and the identity just proved. We can verify that $\mathcal{L}[\operatorname{D}_{\mathtt{X}}B+\frac{1}{2}[A,B]_{\operatorname{P}}]$ satisfies the derivative rule of a connection, and we have just shown that it is metric compatible. Torsion-freeness follows from $[A,B]_{\operatorname{P}}-[B,A]_{\operatorname{P}}=2[A,B]$ , thus $\mathcal{L}[\operatorname{D}_{\mathtt{X}}B+\frac{1}{2}[A,B]_{\operatorname{P}}]$ is the Levi-Civita connection.

Equation (5.2) is from [7], equation 3.3.2 (the author uses a right-invariant metric). It is related to the Euler-Poisson-Arnold equation (EPDiff), see equation (55) in Arnold’s classical paper [1]. See also [8].

Equation (5.3) now follows directly from the definition of curvature $\nabla_{[\mathtt{X},\mathtt{Y}]}\mathtt{Z}-\nabla_{\mathtt{X}}\nabla_{\mathtt{Y}}\mathtt{Z}+\nabla_{\mathtt{Y}}\nabla_{\mathtt{X}}\mathtt{Z}$ , applied to the invariant vector fields $\mathcal{L}[\omega_{i}],i\in\{1,2,3\}$ . Equation (5.4) follows from the O’Neill equation (Theorem 2, [11]), written in $(1,3)$ tensor form. Indeed, the O’Neill tensor of two vector fields $\mathcal{L}[A],\mathcal{L}[B]$ on $\mathtt{G}$ , for $\mathfrak{g}$ -valued functions $A$ and $B$ , evaluated at the coset $o$ is $\frac{1}{2}[A,B]_{\mathfrak{k}}$ : the result just proved for covariant derivatives shows that the Lie bracket is $\{\mathcal{L}[A],\mathcal{L}[B]\}=\mathcal{L}[[A,B]]$ , and then we use Lemma 2, [11]. By properties of the adjoint and the projection, the right-hand side of eq. 5.4 is the unique vector in $\mathfrak{m}$ such that the O’Neill equation (equation 4, theorem 2, [11]) is satisfied. Equation (5.5) follows from the result for $\mathtt{G}$ and the property of the horizontal lift of a connection in a Riemannian submersion, e.g. lemma 7.45 in [12] (because of left-invariance, we can translate the projection to the identity). ∎
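As an informal numerical sanity check of the two algebraic facts used in the proof (metric compatibility and torsion-freeness), the following NumPy sketch works on $\mathfrak{o}(n)$ with the trace form. It assumes the bracket has the form $[A,B]_{\operatorname{P}}=[A,B]-\operatorname{ad^{\dagger}}_{A}B-\operatorname{ad^{\dagger}}_{B}A$ , as the first identity of the proof indicates. The operator $\operatorname{P}(\omega)=S\omega S$ , with $S$ symmetric positive-definite, is a hypothetical test choice and does not appear in the text.

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)

def skew(rng, n):
    """Random element of o(n)."""
    m = rng.standard_normal((n, n))
    return m - m.T

def br(x, y):
    """Lie bracket of o(n): the matrix commutator."""
    return x @ y - y @ x

# Hypothetical test operator (an assumption of this sketch): P(w) = S w S.
# It is self-adjoint and positive-definite for the form (1/2) Tr(x^T y).
M = rng.standard_normal((n, n))
S = np.eye(n) + 0.1 * M @ M.T
S_inv = np.linalg.inv(S)

def P(w):
    return S @ w @ S

def P_inv(w):
    return S_inv @ w @ S_inv

def ip_P(x, y):
    """<x, y>_P = <x, P y> with <x, y> = (1/2) Tr(x^T y)."""
    return 0.5 * np.trace(x.T @ P(y))

def ad_dag(a, x):
    """Adjoint of ad_a under <,>_P; uses bi-invariance of the trace form."""
    return -P_inv(br(a, P(x)))

def p_bracket(a, b):
    """Assumed form of the bracket: [a, b] - ad_dag_a b - ad_dag_b a."""
    return br(a, b) - ad_dag(a, b) - ad_dag(b, a)

A, B, C = (skew(rng, n) for _ in range(3))

# Metric compatibility: <[A,B]_P, C>_P + <B, [A,C]_P>_P should vanish.
compat = ip_P(p_bracket(A, B), C) + ip_P(B, p_bracket(A, C))
# Torsion-freeness: [A,B]_P - [B,A]_P - 2[A,B] should vanish.
torsion = p_bracket(A, B) - p_bracket(B, A) - 2 * br(A, B)

print(abs(compat), np.linalg.norm(torsion))
```

Both printed quantities should be at the level of floating-point rounding; a check of this kind does not replace the proof.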

For a subspace $\mathfrak{v}\subset\mathfrak{g}$ , we write $\omega_{1\mathfrak{v}}$ for $(\omega_{1})_{\mathfrak{v}}$ , the projection of $\omega_{1}$ to $\mathfrak{v}$ ( $\omega_{1}\in\mathfrak{g}$ ). We write $[\omega_{1},\omega_{2}]_{\mathfrak{v}}$ , $[\omega_{1},\omega_{2}]_{\operatorname{P}\mathfrak{v}}$ for the corresponding projections of brackets.

On a Lie group with a bi-invariant metric $\langle\rangle$ , we now introduce a family of left-invariant metrics called the Cheeger deformation metrics ([2, 16, 5]). The Lie algebra used in the deformation will be called $\mathfrak{a}$ here (it is often called $\mathfrak{k}$ , but we use $\mathtt{K}$ for the stabilizer group; we will use the letters $\mathfrak{a},\mathfrak{b}$ corresponding to the components $A$ , $B$ of the Stiefel tangent vectors, as will be seen shortly). Let $\mathtt{A}$ be a connected subgroup of $\mathtt{G}$ with Lie algebra $\mathfrak{a}$ . With the bi-invariant metric on $\mathtt{G}$ , $\mathtt{A}$ acts via right multiplication as a group of isometries on $\mathtt{G}$ . Give $\mathtt{G}\times\mathtt{A}$ the bi-invariant metric corresponding to the inner product on $\mathfrak{g}\oplus\mathfrak{a}$ evaluated as $\langle g,g\rangle+r\langle a,a\rangle$ for $(g,a)\in\mathfrak{g}\oplus\mathfrak{a}$ with $r>0$ . We have the submersion $\mathtt{G}\times\mathtt{A}\to\mathtt{G}$ given by $(U,Q)\mapsto UQ^{-1}$ ( $U\in\mathtt{G},Q\in\mathtt{A}$ ). Let $\mathfrak{g}=\mathfrak{a}\oplus\mathfrak{n}$ be an orthogonal decomposition with respect to $\langle\rangle$ . The submersion induces a new metric on $\mathtt{G}$ , which is shown in [16] to be

\langle\omega_{\mathfrak{n}},\omega_{\mathfrak{n}}\rangle+\frac{r}{(r+1)}\langle\omega_{\mathfrak{a}},\omega_{\mathfrak{a}}\rangle

for $\omega\in\mathfrak{g}$ . Denote by $\operatorname{P}_{t}$ the Cheeger deformation metric on $\mathfrak{g}$ given by the formula $\langle\omega_{\mathfrak{n}},\omega_{\mathfrak{n}}\rangle+t\langle\omega_{\mathfrak{a}},\omega_{\mathfrak{a}}\rangle$ for $t>0$ . At $t=1$ , it is the original metric. For $t<1$ , the metric corresponds to the submersion above with $r=t/(1-t)$ , thus $\mathtt{G}$ has non-negative curvature by O’Neill’s equation. For $t>1$ , the metric on $\mathtt{G}\times\mathtt{A}$ is semi-Riemannian but the corresponding metric on $\mathtt{G}$ is Riemannian. If $\mathfrak{n}$ contains a subalgebra $\mathfrak{k}$ corresponding to a closed subgroup $\mathtt{K}$ of $\mathtt{G}$ , such that $\mathfrak{k}$ commutes with $\mathfrak{a}$ , then $\mathtt{G}/\mathtt{K}$ can be equipped with the quotient metric induced from $\operatorname{P}_{t}$ . Hence, we will consider the situation when $\mathfrak{k}$ is a subalgebra of an algebra $\mathfrak{h}$ commuting with $\mathfrak{a}$ . Note that $\mathtt{G}/\mathtt{K}$ with the original bi-invariant metric is called a normal homogeneous space in the literature, while $\operatorname{P}_{t}$ is no longer bi-invariant.
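The weight $r/(r+1)$ and the reparametrization $r=t/(1-t)$ can be illustrated by a small computation. The sketch below is only a heuristic under a stated assumption: the differential of $(U,Q)\mapsto UQ^{-1}$ at the identity is taken to be $(g,a)\mapsto g-a$ , and the induced squared length of a unit vector $w\in\mathfrak{a}$ is the minimum of $\langle g,g\rangle+r\langle a,a\rangle$ over its preimages. The function names are illustrative.

```python
def induced_weight(r):
    """Squared length the submersion metric assigns to a unit vector w of a.

    Assumption: preimages of w are pairs (g, a) with g - a = w. The minimizer
    is proportional to w, so write (g, a) = ((1 + s) w, s w); the squared
    length is then (1 + s)^2 + r s^2, minimized at s = -1 / (1 + r).
    """
    s = -1.0 / (1.0 + r)
    return (1.0 + s) ** 2 + r * s ** 2

def r_from_t(t):
    """Inverse of t = r / (r + 1), valid for 0 < t < 1."""
    return t / (1.0 - t)

for r in (0.25, 1.0, 4.0):
    t = induced_weight(r)
    # The three columns are r, the induced weight t, and r recovered from t.
    print(r, t, r_from_t(t))
```

The first and last columns should agree, and the middle column should equal $r/(r+1)$ , which stays below $1$ for every $r>0$ ; this is why the values $t>1$ need a semi-Riemannian metric on $\mathtt{G}\times\mathtt{A}$ .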

Proposition 4.

Assume the Lie algebra $\mathfrak{g}$ has a bi-invariant metric $\langle\rangle$ . Let $\mathfrak{h}\subset\mathfrak{g}$ be a Lie subalgebra of $\mathfrak{g}$ and $\mathfrak{h}^{\perp}$ be the orthogonal complement of $\mathfrak{h}$ in $\mathfrak{g}$ under $\langle\rangle$ , $\mathfrak{g}=\mathfrak{h}\oplus\mathfrak{h}^{\perp}$ . Then $\mathfrak{b}:=[\mathfrak{h},\mathfrak{h}^{\perp}]\subset\mathfrak{h}^{\perp}$ , that is, $\mathfrak{h}^{\perp}$ is an $\mathfrak{h}$ -module. Let $\mathfrak{h}^{\perp}=\mathfrak{b}\oplus\mathfrak{a}$ be an orthogonal decomposition under $\langle\rangle$ . We can characterize $\mathfrak{a}$ as the subspace $\{A\in\mathfrak{h}^{\perp}|[A,\mathfrak{h}]=0\}$ . Then

(5.6)

\mathfrak{g}=\mathfrak{h}\oplus\mathfrak{b}\oplus\mathfrak{a}

We have $[\mathfrak{a},\mathfrak{b}]\subset\mathfrak{b}$ , $\mathfrak{a}$ is a Lie subalgebra of $\mathfrak{g}$ , $[\mathfrak{a},\mathfrak{h}]=0$ and $\mathfrak{b}$ is both an $\mathfrak{h}$ -module and an $\mathfrak{a}$ -module. The correspondence $\mathfrak{h}\mapsto\mathfrak{a}$ is involutive on the set of all subalgebras of $\mathfrak{g}$ ; that is, if we apply the same procedure to $\mathfrak{a}$ , we recover $\mathfrak{h}$ .

Proof.

Let $X\in\mathfrak{h}^{\perp}$ and $A,H\in\mathfrak{h}$ . Then $\langle[A,X],H\rangle=-\langle X,[A,H]\rangle=0$ since $\mathfrak{h}$ is a subalgebra of $\mathfrak{g}$ , thus $[A,X]\in\mathfrak{h}^{\perp}$ . Assume the $\langle\rangle$ -orthogonal decomposition $\mathfrak{h}^{\perp}=\mathfrak{b}\oplus\mathfrak{a}$ with $\mathfrak{b}=[\mathfrak{h},\mathfrak{h}^{\perp}]$ . For $A\in\mathfrak{a}$ , $\langle[A,\mathfrak{h}],\mathfrak{h}^{\perp}\rangle\subset\langle A,[\mathfrak{h},\mathfrak{h}^{\perp}]\rangle\subset\{0\}$ and $[A,\mathfrak{h}]\subset\mathfrak{h}^{\perp}$ as $\mathfrak{h}^{\perp}$ is an $\mathfrak{h}$ -module. Hence, $[A,\mathfrak{h}]=0$ as $\langle\rangle$ is non-degenerate on $\mathfrak{h}^{\perp}$ . Conversely, if $A\in\mathfrak{h}^{\perp}$ and $[A,\mathfrak{h}]=0$ then $\langle A,[\mathfrak{h},\mathfrak{h}^{\perp}]\rangle\subset\langle[A,\mathfrak{h}],\mathfrak{h}^{\perp}\rangle\subset\{0\}$ , thus $A\in\mathfrak{a}$ . We have proved that $\mathfrak{a}$ is characterized as the subspace of $\mathfrak{h}^{\perp}$ such that $[A,\mathfrak{h}]=0$ for $A\in\mathfrak{a}$ .

Next, for $A\in\mathfrak{a}$ , $\langle[A,\mathfrak{h}^{\perp}],\mathfrak{h}\rangle\subset\langle A,[\mathfrak{h}^{\perp},\mathfrak{h}]\rangle\subset\{0\}$ , thus $[A,\mathfrak{h}^{\perp}]\subset\mathfrak{h}^{\perp}$ . Then

\langle[A,[\mathfrak{h},\mathfrak{h}^{\perp}]],\mathfrak{a}\rangle\subset\langle[[A,\mathfrak{h}],\mathfrak{h}^{\perp}],\mathfrak{a}\rangle+\langle[\mathfrak{h},[A,\mathfrak{h}^{\perp}]],\mathfrak{a}\rangle\subset\{0\}

as in the middle sum, the first term is zero because $[A,\mathfrak{h}]=0$ , and the second is $\langle[\mathfrak{h},[A,\mathfrak{h}^{\perp}]],\mathfrak{a}\rangle\subset\langle[A,\mathfrak{h}^{\perp}],[\mathfrak{h},\mathfrak{a}]\rangle\subset\{0\}$ as $[\mathfrak{h},\mathfrak{a}]=\{0\}$ . This shows $[\mathfrak{a},\mathfrak{b}]$ is in the orthogonal complement of $\mathfrak{a}$ in $\mathfrak{h}^{\perp}$ , that is, $[\mathfrak{a},\mathfrak{b}]\subset\mathfrak{b}$ .

Now, $\langle[\mathfrak{a},\mathfrak{a}],\mathfrak{h}\rangle\subset\langle\mathfrak{a},[\mathfrak{a},\mathfrak{h}]\rangle\subset\{0\}$ , thus $[\mathfrak{a},\mathfrak{a}]\subset\mathfrak{h}^{\perp}$ . But then $\langle[\mathfrak{a},\mathfrak{a}],\mathfrak{b}\rangle\subset\langle\mathfrak{a},[\mathfrak{a},\mathfrak{b}]\rangle\subset\langle\mathfrak{a},\mathfrak{b}\rangle\subset\{0\}$ , hence $[\mathfrak{a},\mathfrak{a}]\subset\mathfrak{a}$ , therefore $\mathfrak{a}$ is a subalgebra of $\mathfrak{g}$ , and $\mathfrak{b}$ is an $\mathfrak{a}$ -module. Involutiveness follows from the orthogonal decomposition $\mathfrak{g}=\mathfrak{h}\oplus\mathfrak{b}\oplus\mathfrak{a}$ , and the characterization of $\mathfrak{a}$ by the relation $[\mathfrak{a},\mathfrak{h}]=0$ , which implies $\mathfrak{a}^{\perp}=\mathfrak{b}\oplus\mathfrak{h}$ . ∎
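The bracket relations of the proposition can be observed numerically in the case used for Stiefel manifolds, $\mathfrak{g}=\mathfrak{o}(n)$ with $\mathfrak{h}=\mathfrak{o}(n-p)$ in the bottom-right block. The NumPy sketch below is an illustration for that one example, assuming $n-p\geq 2$ so that $\mathfrak{b}$ is the whole off-diagonal block; the helper names are ours.

```python
import numpy as np

def random_skew(k, rng):
    """Random k x k antisymmetric matrix."""
    m = rng.standard_normal((k, k))
    return m - m.T

def bracket(x, y):
    """Matrix commutator, the Lie bracket of o(n)."""
    return x @ y - y @ x

def split(w, p):
    """Split w in o(n) into its a (top-left), b (off-diagonal), h (bottom-right) parts."""
    wa = np.zeros_like(w)
    wa[:p, :p] = w[:p, :p]
    wh = np.zeros_like(w)
    wh[p:, p:] = w[p:, p:]
    return wa, w - wa - wh, wh

n, p = 6, 2
rng = np.random.default_rng(0)
a_elt, b_elt, h_elt = split(random_skew(n, rng), p)
a_other = split(random_skew(n, rng), p)[0]

ah_norm = np.linalg.norm(bracket(a_elt, h_elt))      # [a, h] = 0
ab = bracket(a_elt, b_elt)                           # expected to lie in b
hb = bracket(h_elt, b_elt)                           # expected to lie in b
aa = bracket(a_elt, a_other)                         # expected to lie in a

# Norm of the component that leaks outside the expected subspace.
ab_leak = np.linalg.norm(ab - split(ab, p)[1])
hb_leak = np.linalg.norm(hb - split(hb, p)[1])
aa_leak = np.linalg.norm(aa - split(aa, p)[0])

print(ah_norm, ab_leak, hb_leak, aa_leak)
```

All four printed numbers should be zero up to rounding, matching $[\mathfrak{a},\mathfrak{h}]=0$ , $[\mathfrak{a},\mathfrak{b}]\subset\mathfrak{b}$ , $[\mathfrak{h},\mathfrak{b}]\subset\mathfrak{b}$ and $[\mathfrak{a},\mathfrak{a}]\subset\mathfrak{a}$ .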

Proposition 5.

Assume $\mathfrak{g}$ has a bi-invariant inner product $\langle\rangle$ . Let $\operatorname{P}$ be a positive-definite self-adjoint operator under the inner product $\langle\rangle$ . Then under the inner product $\langle\rangle_{\operatorname{P}}$ defined by $\langle A_{1},A_{2}\rangle_{\operatorname{P}}:=\langle A_{1},\operatorname{P}A_{2}\rangle$ , we have $\operatorname{ad^{\dagger}}_{A}X=-\operatorname{P}^{-1}[A,\operatorname{P}X]$ for $X\in\mathfrak{g}$ , or $\operatorname{ad^{\dagger}}_{A}=-\operatorname{P}^{-1}\circ\operatorname{ad}_{A}\circ\operatorname{P}$ .

Let $t$ be a positive number and $\mathfrak{a},\mathfrak{b},\mathfrak{h}$ as in proposition 4. Let $\mathfrak{n}=\mathfrak{b}\oplus\mathfrak{h}$ , thus $\mathfrak{g}=\mathfrak{a}\oplus\mathfrak{n}$ . Define the operator $\operatorname{P}=\operatorname{P}_{t}$ by $\operatorname{P}\omega=t\omega_{\mathfrak{a}}+\omega_{\mathfrak{n}}$ . Then for $\omega_{1},\omega_{2}\in\mathfrak{g}$

(5.7)

(\operatorname{ad^{\dagger}}_{\omega_{1}}\omega_{2})_{\mathfrak{a}}=-[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]-1/t[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}}

(5.8)

(\operatorname{ad^{\dagger}}_{\omega_{1}}\omega_{2})_{\mathfrak{n}}=-[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{b}}]+t[\omega_{2\mathfrak{a}},\omega_{1\mathfrak{b}}]-[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{n}}

(5.9)

[\omega_{1},\omega_{2}]_{\operatorname{P}}=[\omega_{1},\omega_{2}]+(1-t)([\omega_{1\mathfrak{a}},\omega_{2\mathfrak{b}}]+[\omega_{2\mathfrak{a}},\omega_{1\mathfrak{b}}])

Let $\mathfrak{k}\subset\mathfrak{h}$ be a Lie subalgebra of $\mathfrak{h}$ and $\mathfrak{m}=\mathfrak{a}\oplus\mathfrak{b}\oplus\mathfrak{d}$ where $\mathfrak{h}=\mathfrak{k}\oplus\mathfrak{d}$ is an orthogonal decomposition, thus $\mathfrak{g}=\mathfrak{k}\oplus\mathfrak{m}$ . For $\omega_{3}\in\mathfrak{g}$

(5.10)

(\operatorname{ad^{\dagger}}_{\omega_{3}}[\omega_{1},\omega_{2}]_{\mathfrak{k}})_{\mathfrak{m}}=-[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}]

Proof.

Let $A,Y,X\in\mathfrak{g}$ . From $\operatorname{ad}(\mathfrak{g})$ invariance of $\langle\rangle$ we have

\langle[A,Y],\operatorname{P}X\rangle=\langle Y,-\operatorname{P}\operatorname{P}^{-1}[A,\operatorname{P}X]\rangle

which gives us the first statement.

For eq. 5.7 and eq. 5.8, we expand

\begin{gathered}\operatorname{ad^{\dagger}}_{\omega_{1}}\omega_{2}=-\operatorname{P}^{-1}[\omega_{1\mathfrak{a}}+\omega_{1\mathfrak{n}},t\omega_{2\mathfrak{a}}+\omega_{2\mathfrak{n}}]\\ =(-1/t)([t\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]+[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}})-([\omega_{1\mathfrak{a}},\omega_{2\mathfrak{n}}]+[\omega_{1\mathfrak{n}},t\omega_{2\mathfrak{a}}]+[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}])_{\mathfrak{n}}\end{gathered}

then use the fact that $[\mathfrak{a},\mathfrak{h}]=\{0\}$ . Equation 5.9 follows from this and the definition of $[\quad]_{\operatorname{P}}$ , using anti-commutativity to cancel $1/t([\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}}+[\omega_{2\mathfrak{n}},\omega_{1\mathfrak{n}}]_{\mathfrak{a}})$ .

For eq. 5.10, let $\omega_{4}\in\mathfrak{m}$ , we have

\langle\operatorname{ad^{\dagger}}_{\omega_{3}}[\omega_{1},\omega_{2}]_{\mathfrak{k}},\omega_{4}\rangle_{\operatorname{P}}=\langle[\omega_{1},\omega_{2}]_{\mathfrak{k}},\operatorname{P}[\omega_{3},\omega_{4}]\rangle=\langle[\omega_{1},\omega_{2}]_{\mathfrak{k}},[\omega_{3},\omega_{4}]_{\mathfrak{k}}\rangle

since, when we expand $\operatorname{P}[\omega_{3},\omega_{4}]$ , only the component $[\omega_{3},\omega_{4}]_{\mathfrak{k}}$ can fail to be orthogonal to $[\omega_{1},\omega_{2}]_{\mathfrak{k}}$ . From here, $\langle[\omega_{1},\omega_{2}]_{\mathfrak{k}},[\omega_{3},\omega_{4}]_{\mathfrak{k}}\rangle=\langle[\omega_{1},\omega_{2}]_{\mathfrak{k}},[\omega_{3},\omega_{4}]\rangle=-\langle[\omega_{3},[\omega_{1},\omega_{2}]_{\mathfrak{k}}],\omega_{4}\rangle$ . Since $[\omega_{3\mathfrak{k}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}]$ is orthogonal to $\omega_{4}\in\mathfrak{m}$ , we are left with $-\langle[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}],\omega_{4}\rangle=-\langle[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}],\omega_{4}\rangle_{\operatorname{P}}$ . The last equality holds because $[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}]_{\mathfrak{a}}=0$ : indeed, $[\omega_{3\mathfrak{a}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}]=0$ , while the remaining term is in $\mathfrak{b}\oplus\mathfrak{h}$ . By proposition 4, $[\omega_{3\mathfrak{m}},[\omega_{1},\omega_{2}]_{\mathfrak{k}}]\in\mathfrak{m}$ since $\mathfrak{m}$ is the orthogonal complement of $\mathfrak{k}$ , which proves eq. 5.10. ∎
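The formulas of proposition 5 lend themselves to a quick numerical test on $\mathfrak{o}(n)$ with $\mathfrak{a}=\mathfrak{o}(p)$ in the top-left block. The following NumPy sketch checks the adjoint property, equation (5.7) and equation (5.9) for random elements; the sizes, the value of $t$ and the helper names are arbitrary choices made for this illustration.

```python
import numpy as np

n, p, t = 5, 2, 0.7
rng = np.random.default_rng(1)

def skew():
    m = rng.standard_normal((n, n))
    return m - m.T

def br(x, y):
    return x @ y - y @ x

def part_a(w):
    """Projection to a: keep only the top-left p x p block."""
    out = np.zeros_like(w)
    out[:p, :p] = w[:p, :p]
    return out

def part_b(w):
    """Projection to b: keep only the off-diagonal blocks."""
    out = w.copy()
    out[:p, :p] = 0.0
    out[p:, p:] = 0.0
    return out

def P(w):
    """P_t: multiply the a-component by t, leave the n-component unchanged."""
    return w + (t - 1.0) * part_a(w)

def P_inv(w):
    return w + (1.0 / t - 1.0) * part_a(w)

def ad_dag(a, x):
    """ad_dag_a x = -P^{-1}[a, P x], the first statement of the proposition."""
    return -P_inv(br(a, P(x)))

def inner_P(x, y):
    return 0.5 * np.trace(x.T @ P(y))

w1, w2, w3 = skew(), skew(), skew()
w1n, w2n = w1 - part_a(w1), w2 - part_a(w2)

# Adjoint property: <[w1, w2], w3>_P = <w2, ad_dag_{w1} w3>_P.
adjoint_gap = inner_P(br(w1, w2), w3) - inner_P(w2, ad_dag(w1, w3))

# Equation (5.7): the a-component of ad_dag_{w1} w2.
gap_57 = part_a(ad_dag(w1, w2)) - (-br(part_a(w1), part_a(w2))
                                   - part_a(br(w1n, w2n)) / t)

# Equation (5.9): closed form of the P-bracket.
lhs_59 = br(w1, w2) - ad_dag(w1, w2) - ad_dag(w2, w1)
rhs_59 = br(w1, w2) + (1.0 - t) * (br(part_a(w1), part_b(w2))
                                   + br(part_a(w2), part_b(w1)))

print(abs(adjoint_gap), np.linalg.norm(gap_57), np.linalg.norm(lhs_59 - rhs_59))
```

The three printed values should vanish up to rounding for any $t>0$ .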

Recall $o$ is the coset containing the identity in the homogeneous manifold $\mathtt{G}/\mathtt{K}$ . The expression $\operatorname{R}^{[0]}$ in the following proposition is the curvature of a normal homogeneous manifold, though perhaps not usually written in this format.

Proposition 6.

Let $\mathtt{G}$ be a Lie group with Lie algebra $\mathfrak{g}$ and a bi-invariant metric $\langle\rangle_{\mathtt{G}}$ , and let $\mathfrak{k}\subset\mathfrak{h}$ be subalgebras of $\mathfrak{g}$ , with $\mathfrak{g}=\mathfrak{a}\oplus\mathfrak{b}\oplus\mathfrak{h}=\mathfrak{a}\oplus\mathfrak{n}=\mathfrak{m}\oplus\mathfrak{k}$ as in propositions 4 and 5. The curvature of the homogeneous manifold $\mathtt{M}=\mathtt{G}/\mathtt{K}$ under the metric $\operatorname{P}_{t}$ at $o$ , evaluated at $\omega_{1},\omega_{2},\omega_{3}\in\mathfrak{m}$ , is given by

(5.11)

\operatorname{R}_{\omega_{1},\omega_{2}}\omega_{3}=\operatorname{R}^{[0]}_{\omega_{1},\omega_{2}}\omega_{3}+(1-t)\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}+(1-t)^{2}\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3}

(5.12)

\begin{gathered}\operatorname{R}^{[0]}_{\omega_{1},\omega_{2}}\omega_{3}:=\frac{1}{4}([[\omega_{1},\omega_{2}],\omega_{3}]_{\mathfrak{m}}+2[[\omega_{1},\omega_{2}]_{\mathfrak{k}},\omega_{3}]-[[\omega_{2},\omega_{3}]_{\mathfrak{k}},\omega_{1}]+[[\omega_{1},\omega_{3}]_{\mathfrak{k}},\omega_{2}])\end{gathered}

(5.13)

\begin{gathered}\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}:=\frac{1}{2}([[\omega_{1},\omega_{2}]_{\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},[\omega_{1},\omega_{2}]_{\mathfrak{b}}])\\ -\frac{1}{4}([\omega_{1},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{2\mathfrak{b}}]]+[\omega_{1\mathfrak{a}},[\omega_{2},\omega_{3}]_{\mathfrak{b}}]+[[\omega_{2},\omega_{3}]_{\mathfrak{a}},\omega_{1\mathfrak{b}}])_{\mathfrak{m}}\\ +\frac{1}{4}([\omega_{2},[\omega_{1\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{1\mathfrak{b}}]]+[\omega_{2\mathfrak{a}},[\omega_{1},\omega_{3}]_{\mathfrak{b}}]+[[\omega_{1},\omega_{3}]_{\mathfrak{a}},\omega_{2\mathfrak{b}}])_{\mathfrak{m}}\end{gathered}

(5.14)

\begin{gathered}4\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3}:=-[\omega_{1\mathfrak{a}},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{2\mathfrak{b}}]]+[\omega_{2\mathfrak{a}},[\omega_{1\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{1\mathfrak{b}}]]\end{gathered}

Proof.

We apply the formulas for $[\quad]_{\operatorname{P}}$ , with $\mathfrak{n}=\mathfrak{b}\oplus\mathfrak{h}$ and $\mathfrak{g}=\mathfrak{a}\oplus\mathfrak{n}$

[[\omega_{1},\omega_{2}],\omega_{3}]_{\operatorname{P},\mathfrak{a}}=[[\omega_{1},\omega_{2}],\omega_{3}]_{\mathfrak{a}}

[[\omega_{1},\omega_{2}],\omega_{3}]_{\operatorname{P},\mathfrak{n}}=[[\omega_{1},\omega_{2}],\omega_{3}]_{\mathfrak{n}}+(1-t)([[\omega_{1},\omega_{2}]_{\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},[\omega_{1},\omega_{2}]_{\mathfrak{b}}])

[\omega_{1},[\omega_{2},\omega_{3}]_{\operatorname{P}}]_{\operatorname{P}}=[\omega_{1},[\omega_{2},\omega_{3}]]+(1-t)[\omega_{1},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{2\mathfrak{b}}]]+(1-t)([\omega_{1\mathfrak{a}},[\omega_{2},\omega_{3}]_{\mathfrak{b}}]+[[\omega_{2},\omega_{3}]_{\mathfrak{a}},\omega_{1\mathfrak{b}}])+(1-t)^{2}[\omega_{1\mathfrak{a}},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]+[\omega_{3\mathfrak{a}},\omega_{2\mathfrak{b}}]]

We now apply eq. 5.4. By the Jacobi identity, the $\operatorname{R}^{[0]}$ component of the first line is

(\frac{1}{2}[[\omega_{1},\omega_{2}],\omega_{3}]-\frac{1}{4}[\omega_{1},[\omega_{2},\omega_{3}]]+\frac{1}{4}[\omega_{2},[\omega_{1},\omega_{3}]])_{\mathfrak{m}}=\frac{1}{4}[[\omega_{1},\omega_{2}],\omega_{3}]_{\mathfrak{m}}

while the second line has the O’Neill terms $\operatorname{ad^{\dagger}}_{\omega_{i}}[\omega_{j},\omega_{k}]_{\mathfrak{k}}$ ( $i,j,k$ a permutation of $\{1,2,3\}$ ), evaluated as $-[\omega_{i\mathfrak{m}},[\omega_{j},\omega_{k}]_{\mathfrak{k}}]$ by eq. 5.10. Since we assume $\omega_{i}\in\mathfrak{m}$ , this gives us the expression for $\operatorname{R}^{[0]}$ . Permuting indices and collecting terms, we get $\operatorname{R}^{[1]}$ and $\operatorname{R}^{[2]}$ . Some of the expressions, for example $\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3}$ , are already in $\mathfrak{m}$ , so we do not need to apply the projection again. ∎
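The expansion (5.11) to (5.14) can be compared numerically with a direct evaluation of the right-hand side of (5.4). The NumPy sketch below does this for $\mathfrak{g}=\mathfrak{o}(n)$ with $\mathfrak{k}=\mathfrak{h}=\mathfrak{o}(n-p)$ , so that $\mathfrak{m}=\mathfrak{a}\oplus\mathfrak{b}$ ; it assumes the bracket $[A,B]_{\operatorname{P}}=[A,B]-\operatorname{ad^{\dagger}}_{A}B-\operatorname{ad^{\dagger}}_{B}A$ , and the helper names are ours.

```python
import numpy as np

n, p, t = 5, 2, 0.6
rng = np.random.default_rng(2)

def br(x, y):
    return x @ y - y @ x

def pa(w):                       # projection to a (top-left block)
    out = np.zeros_like(w)
    out[:p, :p] = w[:p, :p]
    return out

def pk(w):                       # projection to k = h (bottom-right block)
    out = np.zeros_like(w)
    out[p:, p:] = w[p:, p:]
    return out

def pm(w):                       # projection to m = a + b
    return w - pk(w)

def pb(w):                       # projection to b (off-diagonal blocks)
    return w - pa(w) - pk(w)

def P(w):
    return w + (t - 1.0) * pa(w)

def P_inv(w):
    return w + (1.0 / t - 1.0) * pa(w)

def ad_dag(x, y):
    return -P_inv(br(x, P(y)))

def pbr(x, y):                   # assumed form of the P-bracket
    return br(x, y) - ad_dag(x, y) - ad_dag(y, x)

def sym_ab(x, y):                # [x_a, y_b] + [y_a, x_b], as in (5.9)
    return br(pa(x), pb(y)) + br(pa(y), pb(x))

def curv_direct(w1, w2, w3):
    """Right-hand side of (5.4)."""
    total = (0.5 * pbr(br(w1, w2), w3)
             - 0.25 * pbr(w1, pbr(w2, w3))
             + 0.25 * pbr(w2, pbr(w1, w3))
             + 0.5 * ad_dag(w3, pk(br(w1, w2)))
             - 0.25 * ad_dag(w1, pk(br(w2, w3)))
             + 0.25 * ad_dag(w2, pk(br(w1, w3))))
    return pm(total)

def curv_expanded(w1, w2, w3):
    """R^[0] + (1 - t) R^[1] + (1 - t)^2 R^[2] from (5.11)-(5.14)."""
    b12, b23, b13 = br(w1, w2), br(w2, w3), br(w1, w3)
    r0 = 0.25 * (pm(br(b12, w3)) + 2.0 * br(pk(b12), w3)
                 - br(pk(b23), w1) + br(pk(b13), w2))
    r1 = (0.5 * sym_ab(b12, w3)
          - 0.25 * pm(br(w1, sym_ab(w2, w3)) + sym_ab(w1, b23))
          + 0.25 * pm(br(w2, sym_ab(w1, w3)) + sym_ab(w2, b13)))
    r2 = 0.25 * (-br(pa(w1), sym_ab(w2, w3)) + br(pa(w2), sym_ab(w1, w3)))
    return r0 + (1.0 - t) * r1 + (1.0 - t) ** 2 * r2

def random_m():
    """Random horizontal element: antisymmetric with zero bottom-right block."""
    m = rng.standard_normal((n, n))
    return pm(m - m.T)

w1, w2, w3 = random_m(), random_m(), random_m()
gap = np.linalg.norm(curv_direct(w1, w2, w3) - curv_expanded(w1, w2, w3))
print(gap)
```

The printed gap should be at rounding level. Changing `t` or the seed gives further spot checks, though of course no finite number of samples replaces the algebraic argument.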

We use these formulas to compute the Levi-Civita connection and curvature for Stiefel manifolds. For two integers $n>p$ , we describe the Stiefel manifold as $\operatorname{SO}(n)/\operatorname{SO}(n-p)$ . Here, $\mathtt{G}=\operatorname{SO}(n)$ , $\mathtt{K}=\operatorname{SO}(n-p)$ and $\mathfrak{g}=\mathfrak{o}(n)$ . Take the bi-invariant form to be $\frac{1}{2}\operatorname{Tr}(\omega_{1}^{\operatorname{\mathsf{T}}}\omega_{2})$ . We divide a matrix in $\mathfrak{o}(n)$ into blocks of the form $\begin{bmatrix}A&-B^{\operatorname{\mathsf{T}}}\\ B&H\end{bmatrix}$ , with $A\in\mathfrak{o}(p)$ , $B\in\mathbb{R}^{(n-p)\times p}$ and $H\in\mathfrak{o}(n-p)$ , and we represent that matrix by a triple $[\![A,B,H]\!]$ to save space.

Take the subalgebra generated by the $H$ block to be $\mathfrak{k}=\mathfrak{h}=\mathfrak{o}(n-p)$ , identified with the bottom right $(n-p)\times(n-p)$ block of $\mathfrak{o}(n)$ . Then $\mathfrak{m}$ is the subspace of $\mathfrak{o}(n)$ where the $H$ -block is zero, the subalgebra $\mathfrak{a}$ is $\mathfrak{o}(p)$ identified with the $A$ -block, and $\mathfrak{b}$ is the subspace generated by the $B$ and $B^{\operatorname{\mathsf{T}}}$ -blocks, as shown below:

\mathfrak{g}:\begin{bmatrix}\mathfrak{a}&\mathfrak{b}\\ \mathfrak{b}&\mathfrak{h}\end{bmatrix}\quad\quad\mathfrak{n}:\begin{bmatrix}0&\mathfrak{b}\\ \mathfrak{b}&\mathfrak{h}\end{bmatrix}\quad\quad\mathfrak{m}:\begin{bmatrix}\mathfrak{a}&\mathfrak{b}\\ \mathfrak{b}&0\end{bmatrix}

The Lie and $[\quad]_{\operatorname{P}}$ brackets of $\operatorname{[\![}A_{1},B_{1},H_{1}\operatorname{]\!]},\operatorname{[\![}A_{2},B_{2},H_{2}\operatorname{]\!]}\in\mathfrak{o}(n)$ are given by

[\operatorname{[\![}A_{1},B_{1},H_{1}\operatorname{]\!]},\operatorname{[\![}A_{2},B_{2},H_{2}\operatorname{]\!]}]=\operatorname{[\![}[A_{1},A_{2}]+B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2},B_{1}A_{2}+H_{1}B_{2}-B_{2}A_{1}-H_{2}B_{1},[H_{1},H_{2}]+B_{2}B_{1}^{\operatorname{\mathsf{T}}}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}\operatorname{]\!]}

[\operatorname{[\![}A_{1},B_{1},H_{1}\operatorname{]\!]},\operatorname{[\![}A_{2},B_{2},H_{2}\operatorname{]\!]}]_{\operatorname{P}}=\operatorname{[\![}[A_{1},A_{2}]+B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2},tB_{1}A_{2}+H_{1}B_{2}+(t-2)B_{2}A_{1}-H_{2}B_{1},[H_{1},H_{2}]+B_{2}B_{1}^{\operatorname{\mathsf{T}}}-B_{1}B_{2}^{\operatorname{\mathsf{T}}}\operatorname{]\!]}
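The two block formulas above can be compared with the corresponding matrix computations. In the NumPy sketch below, the Lie bracket in triple form is checked against the matrix commutator, and the $\operatorname{P}$ -bracket in triple form against $[\omega_{1},\omega_{2}]-\operatorname{ad^{\dagger}}_{\omega_{1}}\omega_{2}-\operatorname{ad^{\dagger}}_{\omega_{2}}\omega_{1}$ with $\operatorname{ad^{\dagger}}_{A}X=-\operatorname{P}^{-1}[A,\operatorname{P}X]$ from proposition 5. Function names and sizes are illustrative choices.

```python
import numpy as np

def to_matrix(A, B, H):
    """Assemble the triple [[A, B, H]] as the matrix [[A, -B^T], [B, H]]."""
    return np.block([[A, -B.T], [B, H]])

def lie_blocks(A1, B1, H1, A2, B2, H2):
    """Lie bracket of two triples, returned as a triple."""
    A = A1 @ A2 - A2 @ A1 + B2.T @ B1 - B1.T @ B2
    B = B1 @ A2 + H1 @ B2 - B2 @ A1 - H2 @ B1
    H = H1 @ H2 - H2 @ H1 + B2 @ B1.T - B1 @ B2.T
    return A, B, H

def p_blocks(t, A1, B1, H1, A2, B2, H2):
    """P-bracket of two triples: only the B-block differs from the Lie bracket."""
    A, _, H = lie_blocks(A1, B1, H1, A2, B2, H2)
    B = t * B1 @ A2 + H1 @ B2 + (t - 2.0) * B2 @ A1 - H2 @ B1
    return A, B, H

def p_bracket_direct(w1, w2, p, t):
    """[w1, w2] - ad_dag_{w1} w2 - ad_dag_{w2} w1 on full matrices."""
    def scale_a(w, c):
        out = w.copy()
        out[:p, :p] *= c          # P_t and its inverse only rescale the a-block
        return out
    def ad_dag(a, x):
        px = scale_a(x, t)
        return -scale_a(a @ px - px @ a, 1.0 / t)
    return w1 @ w2 - w2 @ w1 - ad_dag(w1, w2) - ad_dag(w2, w1)

n, p, t = 6, 2, 1.3
rng = np.random.default_rng(5)

def random_triple():
    A = rng.standard_normal((p, p))
    H = rng.standard_normal((n - p, n - p))
    return A - A.T, rng.standard_normal((n - p, p)), H - H.T

T1, T2 = random_triple(), random_triple()
w1, w2 = to_matrix(*T1), to_matrix(*T2)

lie_ok = np.allclose(w1 @ w2 - w2 @ w1, to_matrix(*lie_blocks(*T1, *T2)))
p_ok = np.allclose(p_bracket_direct(w1, w2, p, t), to_matrix(*p_blocks(t, *T1, *T2)))
print(lie_ok, p_ok)
```

At $t=1$ the two brackets coincide, which can be seen directly in `p_blocks` since the coefficients $t$ and $t-2$ become $1$ and $-1$ .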

For $U=(Y|Y_{\perp})\in\operatorname{SO}(n)$ , where $(|)$ denotes the division of a matrix in $\mathbb{R}^{n\times n}$ to the first $p$ (in $\mathbb{R}^{n\times p}$ ) and last $n-p$ (in $\mathbb{R}^{n\times(n-p)}$ ) column blocks, if $\omega=(\eta|\eta_{\perp})$ is a tangent vector at $U$ to $\operatorname{SO}(n)$ then $\omega=d\mathcal{L}_{U}(U^{\operatorname{\mathsf{T}}}\omega)=U\operatorname{[\![}Y^{\operatorname{\mathsf{T}}}\eta,Y_{\perp}^{\operatorname{\mathsf{T}}}\eta,Y_{\perp}^{\operatorname{\mathsf{T}}}\eta_{\perp}\operatorname{]\!]}$ .

We describe the submersion $\operatorname{SO}(n)\to\mathrm{St}_{p,n}$ , identifying $\mathrm{St}_{p,n}$ with $\operatorname{SO}(n)/\operatorname{SO}(n-p)$ by the map $U\mapsto Y$ , where $U=(Y|Y_{\perp})$ as just described. The map is clearly a differentiable submersion onto $\mathrm{St}_{p,n}$ . The fiber over $Y$ consists of matrices of the form $(Y|Y_{\perp}Q)$ , $Q\in\operatorname{SO}(n-p)$ , hence the vertical space consists of $(0|Y_{\perp}\mathrm{q})$ , $\mathrm{q}\in\mathfrak{o}(n-p)$ .

Equip $\operatorname{SO}(n)$ with the metric $\operatorname{P}_{t}$ in proposition 5. At $U=\operatorname{I}_{n}$ , the horizontal space consists of matrices of the form $\operatorname{[\![}A,B,0\operatorname{]\!]}$ , with $A\in\mathfrak{o}(p),B\in\mathbb{R}^{(n-p)\times p}$ , and in general, a horizontal vector is of the form $U\operatorname{[\![}A,B,0\operatorname{]\!]}$ . The submersion maps $\omega=(\eta|\eta_{\perp})$ to $\eta\in\mathbb{R}^{n\times p}$ satisfying $Y^{\operatorname{\mathsf{T}}}\eta\in\mathfrak{o}(p)$ .

Proposition 7.

With the above setting, the horizontal lift of a tangent vector $\eta\in\mathbb{R}^{n\times p}$ to $\mathrm{St}_{p,n}$ at $U=(Y|Y_{\perp})\in\operatorname{SO}(n)$ under $\operatorname{P}_{t}$ is $\bar{\eta}=(\eta|-Y\eta^{\operatorname{\mathsf{T}}}Y_{\perp})$ and the induced metric is

(5.15)

\langle\eta,\eta\rangle_{t}=\operatorname{Tr}(\eta\eta^{\operatorname{\mathsf{T}}}+(\frac{t}{2}-1)YY^{\operatorname{\mathsf{T}}}\eta\eta^{\operatorname{\mathsf{T}}})

The Levi-Civita connection for two vector fields $\mathtt{V},\mathtt{Z}$ on $\mathrm{St}_{p,n}$ under this metric is given by

(5.16)

\nabla_{\mathtt{V}}\mathtt{Z}=\operatorname{D}_{\mathtt{V}}\mathtt{Z}+\frac{1}{2}Y(\mathtt{V}^{\operatorname{\mathsf{T}}}\mathtt{Z}+\mathtt{Z}^{\operatorname{\mathsf{T}}}\mathtt{V})+\frac{2-t}{2}(\operatorname{I}_{n}-YY^{\operatorname{\mathsf{T}}})(\mathtt{V}\mathtt{Z}^{\operatorname{\mathsf{T}}}+\mathtt{Z}\mathtt{V}^{\operatorname{\mathsf{T}}})Y

The curvature $\operatorname{R}_{\xi,\eta}\phi$ at $Y\in\mathrm{St}_{p,n}$ for three tangent vectors $\xi,\eta,\phi$ computed by proposition 3 is identical to that computed by eq. 3.3 and (3.4) if we represent the tangent and curvature vectors in the format in theorem 3.1, and set $\alpha=t/2$ .

Proof.

A matrix multiplication shows $U^{\operatorname{\mathsf{T}}}\bar{\eta}$ is antisymmetric and could be represented as $\operatorname{[\![}Y^{\operatorname{\mathsf{T}}}\eta,Y_{\perp}^{\operatorname{\mathsf{T}}}\eta,0\operatorname{]\!]}\in\mathfrak{o}(n)$ , which is horizontal at $\operatorname{I}_{n}$ , thus $\bar{\eta}$ is horizontal and maps to $\eta$ , hence it is the horizontal lift.

Using the relations $Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}+YY^{\operatorname{\mathsf{T}}}=\operatorname{I}_{n}$ the induced metric is

\begin{gathered}\langle U^{\operatorname{\mathsf{T}}}\bar{\eta},U^{\operatorname{\mathsf{T}}}\bar{\eta}\rangle_{\operatorname{P}}=\frac{1}{2}\operatorname{Tr}\begin{bmatrix}Y^{\operatorname{\mathsf{T}}}\eta&-\eta^{\operatorname{\mathsf{T}}}Y_{\perp}\\ Y_{\perp}^{\operatorname{\mathsf{T}}}\eta&0\end{bmatrix}\begin{bmatrix}t\eta^{\operatorname{\mathsf{T}}}Y&\eta^{\operatorname{\mathsf{T}}}Y_{\perp}\\ -Y_{\perp}^{\operatorname{\mathsf{T}}}\eta&0\end{bmatrix}\\ =\frac{1}{2}\operatorname{Tr}(tYY^{\operatorname{\mathsf{T}}}\eta\eta^{\operatorname{\mathsf{T}}}+2Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}\eta\eta^{\operatorname{\mathsf{T}}})=\operatorname{Tr}(\eta\eta^{\operatorname{\mathsf{T}}}+(\frac{t}{2}-1)YY^{\operatorname{\mathsf{T}}}\eta\eta^{\operatorname{\mathsf{T}}})\end{gathered}
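The lift and the metric formula can be spot-checked numerically. The NumPy sketch below builds a random orthogonal $U=(Y|Y_{\perp})$ from a QR factorization (its determinant may be $-1$ , which does not affect these identities), forms a tangent vector $\eta=YA+Y_{\perp}B$ , and compares $\frac{1}{2}\operatorname{Tr}(\omega^{\operatorname{\mathsf{T}}}\operatorname{P}_{t}\omega)$ for $\omega=U^{\operatorname{\mathsf{T}}}\bar{\eta}$ with the right-hand side of (5.15). The sizes and the value of $t$ are arbitrary.

```python
import numpy as np

n, p, t = 6, 3, 1.4
rng = np.random.default_rng(3)

# Random orthogonal matrix; the sign of its determinant is irrelevant here.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
Y, Yp = U[:, :p], U[:, p:]

# Tangent vector at Y: eta = Y A + Yp B with A antisymmetric.
A = rng.standard_normal((p, p))
A = A - A.T
B = rng.standard_normal((n - p, p))
eta = Y @ A + Yp @ B

# Horizontal lift (eta | -Y eta^T Yp) and its left translate to the identity.
lift = np.hstack([eta, -Y @ eta.T @ Yp])
omega = U.T @ lift

# P_t rescales the top-left (a) block by t.
P_omega = omega.copy()
P_omega[:p, :p] *= t

lhs = 0.5 * np.trace(omega.T @ P_omega)
rhs = np.trace(eta @ eta.T + (t / 2.0 - 1.0) * Y @ Y.T @ eta @ eta.T)

antisym_gap = np.linalg.norm(omega + omega.T)     # omega should lie in o(n)
h_block = np.linalg.norm(omega[p:, p:])           # horizontal: zero H-block
print(antisym_gap, h_block, lhs - rhs)
```

All three printed values should be at rounding level. Setting `t = 2.0` reduces `rhs` to $\operatorname{Tr}(\eta\eta^{\operatorname{\mathsf{T}}})$ , and `t = 1.0` gives the coefficient $-\frac{1}{2}$ on the $YY^{\operatorname{\mathsf{T}}}$ term.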

Let $\mathtt{V},\mathtt{Z}$ be two vector fields on $\mathrm{St}_{p,n}$ , which lift to $\operatorname{SO}(n)$ -vector fields $\bar{\mathtt{V}}=(\mathtt{V}|-Y\mathtt{V}^{\operatorname{\mathsf{T}}}Y_{\perp})$ , $\bar{\mathtt{Z}}=(\mathtt{Z}|-Y\mathtt{Z}^{\operatorname{\mathsf{T}}}Y_{\perp})$ . Let $F=U^{\operatorname{\mathsf{T}}}\bar{\mathtt{Z}}=\operatorname{[\![}Y^{\operatorname{\mathsf{T}}}\mathtt{Z},Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{Z},0\operatorname{]\!]}$ , by eq. 5.5, $\nabla_{\mathtt{V}}\mathtt{Z}$ lifts to $UC_{\mathfrak{m}}$ with $C=\operatorname{D}_{\bar{\mathtt{V}}}F+\frac{1}{2}[\operatorname{[\![}Y^{\operatorname{\mathsf{T}}}\mathtt{V},Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{V},0\operatorname{]\!]},\operatorname{[\![}Y^{\operatorname{\mathsf{T}}}\mathtt{Z},Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{Z},0\operatorname{]\!]}]_{\operatorname{P}}$ . Expand the Lie-derivative and the $\operatorname{P}$ -bracket

\begin{gathered}C=\operatorname{[\![}\mathtt{V}^{\operatorname{\mathsf{T}}}\mathtt{Z}+Y^{\operatorname{\mathsf{T}}}\operatorname{D}_{\mathtt{V}}\mathtt{Z},-Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{V}Y^{\operatorname{\mathsf{T}}}\mathtt{Z}+Y_{\perp}^{\operatorname{\mathsf{T}}}\operatorname{D}_{\mathtt{V}}\mathtt{Z},0\operatorname{]\!]}+\\ \frac{1}{2}\operatorname{[\![}[Y^{\operatorname{\mathsf{T}}}\mathtt{V},Y^{\operatorname{\mathsf{T}}}\mathtt{Z}]+\mathtt{Z}^{\operatorname{\mathsf{T}}}Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{V}-\mathtt{V}^{\operatorname{\mathsf{T}}}Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{Z},tY_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{V}Y^{\operatorname{\mathsf{T}}}\mathtt{Z}+(t-2)Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{Z}Y^{\operatorname{\mathsf{T}}}\mathtt{V},C_{H}\operatorname{]\!]}\end{gathered}

for $C_{H}\in\mathfrak{o}(n-p)$ . Thus, the submersion maps $UC_{\mathfrak{m}}$ to its left $p$ columns

\begin{gathered}Y(\mathtt{V}^{\operatorname{\mathsf{T}}}\mathtt{Z}+Y^{\operatorname{\mathsf{T}}}\operatorname{D}_{\mathtt{V}}\mathtt{Z}+\frac{1}{2}([Y^{\operatorname{\mathsf{T}}}\mathtt{V},Y^{\operatorname{\mathsf{T}}}\mathtt{Z}]+\mathtt{Z}^{\operatorname{\mathsf{T}}}Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{V}-\mathtt{V}^{\operatorname{\mathsf{T}}}Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{Z}))+\\ Y_{\perp}(-Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{V}Y^{\operatorname{\mathsf{T}}}\mathtt{Z}+Y_{\perp}^{\operatorname{\mathsf{T}}}\operatorname{D}_{\mathtt{V}}\mathtt{Z}+\frac{1}{2}(tY_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{V}Y^{\operatorname{\mathsf{T}}}\mathtt{Z}+(t-2)Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{Z}Y^{\operatorname{\mathsf{T}}}\mathtt{V}))\\ =\operatorname{D}_{\mathtt{V}}\mathtt{Z}+Y\mathtt{V}^{\operatorname{\mathsf{T}}}\mathtt{Z}+\frac{1}{2}(YY^{\operatorname{\mathsf{T}}}\mathtt{V}Y^{\operatorname{\mathsf{T}}}\mathtt{Z}-YY^{\operatorname{\mathsf{T}}}\mathtt{Z}Y^{\operatorname{\mathsf{T}}}\mathtt{V}+Y\mathtt{Z}^{\operatorname{\mathsf{T}}}Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{V}-Y\mathtt{V}^{\operatorname{\mathsf{T}}}Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}\mathtt{Z})\\ +\frac{1}{2}Y_{\perp}Y_{\perp}^{\operatorname{\mathsf{T}}}(-2\mathtt{V}Y^{\operatorname{\mathsf{T}}}\mathtt{Z}+t\mathtt{V}Y^{\operatorname{\mathsf{T}}}\mathtt{Z}+(t-2)\mathtt{Z}Y^{\operatorname{\mathsf{T}}}\mathtt{V})\end{gathered}

The last line simplifies to

\frac{t-2}{2}(\operatorname{I}_{n}-YY^{\operatorname{\mathsf{T}}})(\mathtt{V}Y^{\operatorname{\mathsf{T}}}\mathtt{Z}+\mathtt{Z}Y^{\operatorname{\mathsf{T}}}\mathtt{V})=\frac{2-t}{2}(\operatorname{I}_{n}-YY^{\operatorname{\mathsf{T}}})(\mathtt{V}\mathtt{Z}^{\operatorname{\mathsf{T}}}+\mathtt{Z}\mathtt{V}^{\operatorname{\mathsf{T}}})Y

while twice the remaining terms, except for $\operatorname{D}_{\mathtt{V}}\mathtt{Z}$ , are

\begin{gathered}2Y\mathtt{V}^{\operatorname{\mathsf{T}}}\mathtt{Z}+YY^{\operatorname{\mathsf{T}}}\mathtt{V}Y^{\operatorname{\mathsf{T}}}\mathtt{Z}-YY^{\operatorname{\mathsf{T}}}\mathtt{Z}Y^{\operatorname{\mathsf{T}}}\mathtt{V}+Y\mathtt{Z}^{\operatorname{\mathsf{T}}}(\operatorname{I}_{n}-YY^{\operatorname{\mathsf{T}}})\mathtt{V}-Y\mathtt{V}^{\operatorname{\mathsf{T}}}(\operatorname{I}_{n}-YY^{\operatorname{\mathsf{T}}})\mathtt{Z}\\ =Y\mathtt{V}^{\operatorname{\mathsf{T}}}\mathtt{Z}+Y\mathtt{Z}^{\operatorname{\mathsf{T}}}\mathtt{V}+Y(Y^{\operatorname{\mathsf{T}}}\mathtt{V}+\mathtt{V}^{\operatorname{\mathsf{T}}}Y)Y^{\operatorname{\mathsf{T}}}\mathtt{Z}-Y(Y^{\operatorname{\mathsf{T}}}\mathtt{Z}+\mathtt{Z}^{\operatorname{\mathsf{T}}}Y)Y^{\operatorname{\mathsf{T}}}\mathtt{V}\\ =Y\mathtt{V}^{\operatorname{\mathsf{T}}}\mathtt{Z}+Y\mathtt{Z}^{\operatorname{\mathsf{T}}}\mathtt{V}\end{gathered}

Thus we have proved eq. 5.16.
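As a numerical sanity check of the two simplifications above (not part of the original argument), the following NumPy sketch evaluates both sides at a random point of the Stiefel manifold. The helper `random_tangent` and the values of `n`, `p`, `t` are assumptions made only for this illustration; tangent vectors are generated so that $Y^{\operatorname{\mathsf{T}}}\mathtt{V}$ is antisymmetric, which is the only property the simplifications use.

```python
import numpy as np

def random_tangent(Y, rng):
    """Hypothetical helper: a random V with Y^T V antisymmetric."""
    n, p = Y.shape
    S = rng.standard_normal((p, p))
    W = rng.standard_normal((n, p))
    return Y @ (S - S.T) + (np.eye(n) - Y @ Y.T) @ W

rng = np.random.default_rng(0)
n, p, t = 7, 3, 0.6                      # illustrative sizes and parameter
Y, _ = np.linalg.qr(rng.standard_normal((n, p)))  # orthonormal columns
V = random_tangent(Y, rng)
Z = random_tangent(Y, rng)
P = np.eye(n) - Y @ Y.T                  # projection I_n - Y Y^T

# The "last line" simplification.
lhs_perp = 0.5 * (t - 2) * P @ (V @ Y.T @ Z + Z @ Y.T @ V)
rhs_perp = 0.5 * (2 - t) * P @ (V @ Z.T + Z @ V.T) @ Y

# Twice the remaining terms, excluding D_V Z.
lhs_rest = (2 * Y @ V.T @ Z + Y @ Y.T @ V @ Y.T @ Z - Y @ Y.T @ Z @ Y.T @ V
            + Y @ Z.T @ P @ V - Y @ V.T @ P @ Z)
rhs_rest = Y @ V.T @ Z + Y @ Z.T @ V

perp_ok = np.allclose(lhs_perp, rhs_perp)
rest_ok = np.allclose(lhs_rest, rhs_rest)
print(perp_ok, rest_ok)
```

Both identities fail for generic (non-tangent) $\mathtt{V},\mathtt{Z}$ , so the check also exercises the tangency condition.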

Let us prove the curvature expressions. To show $f(t)=g(t/2)$ , with $f(t)=f_{0}+(1-t)f_{1}+(1-t)^{2}f_{2}$ where $f_{0},f_{1},f_{2}$ are constant matrices and $g$ is a matrix-valued quadratic function of $\alpha$ , we need to show $f_{0}=g(1/2)$ , $-2f_{1}=g^{\prime}(1/2)$ and $8f_{2}=g^{\prime\prime}(1/2)$ . From left invariance we can take $U=\operatorname{I}_{n}$ . Thus, we need to compute $\operatorname{R}^{[0]},\operatorname{R}^{[1]},\operatorname{R}^{[2]}$ and compare them with the values and derivatives of $g(\alpha)=\operatorname{[\![}A_{R}(\alpha),B_{R}(\alpha),0\operatorname{]\!]}$ , with $A_{R},B_{R}$ defined in eqs. 3.3 and 3.4, evaluated at $\alpha=1/2$ .
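To make the three conditions explicit (a short supporting calculation; the auxiliary name $h(t):=g(t/2)$ is introduced here only for illustration): two quadratic polynomials in $t$ coincide if and only if their values and first two derivatives agree at $t=1$ , and by the chain rule

\begin{gathered}f(1)=f_{0},\quad f^{\prime}(1)=-f_{1},\quad f^{\prime\prime}(1)=2f_{2},\\ h(1)=g(1/2),\quad h^{\prime}(1)=\frac{1}{2}g^{\prime}(1/2),\quad h^{\prime\prime}(1)=\frac{1}{4}g^{\prime\prime}(1/2)\end{gathered}

Equating the two rows entry by entry gives $f_{0}=g(1/2)$ , $-2f_{1}=g^{\prime}(1/2)$ and $8f_{2}=g^{\prime\prime}(1/2)$ .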

Let $\xi=\omega_{1},\eta=\omega_{2},\phi=\omega_{3}$ with $\omega_{i}=\operatorname{[\![}A_{i},B_{i},0\operatorname{]\!]}$ . Then $[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]=\operatorname{[\![}0,-B_{3}A_{2},0\operatorname{]\!]}$ and $[\omega_{1\mathfrak{a}},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]]=\operatorname{[\![}0,B_{3}A_{2}A_{1},0\operatorname{]\!]}$ , and by permuting the indices we obtain

4\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3}=\operatorname{[\![}0,-B_{3}A_{2}A_{1}-B_{2}A_{3}A_{1}+B_{3}A_{1}A_{2}+B_{1}A_{3}A_{2},0\operatorname{]\!]}

On the other hand, eqs. 3.3 and 3.4 give $A_{R,\alpha=1/2}^{\prime\prime}=0$ and

B_{R,\alpha=1/2}^{\prime\prime}=\frac{4}{2}(B_{1}A_{3}A_{2}-B_{2}A_{3}A_{1})+2(B_{3}A_{1}A_{2}-B_{3}A_{2}A_{1})

which confirms $8\operatorname{R}^{[2]}_{\omega_{1},\omega_{2}}\omega_{3}=g^{\prime\prime}(1/2)$ . Next,

[[\omega_{1},\omega_{2}]_{\mathfrak{a}},\omega_{3\mathfrak{b}}]=\operatorname{[\![}0,-B_{3}([A_{1},A_{2}]+B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}),0\operatorname{]\!]}

[\omega_{3\mathfrak{a}},[\omega_{1},\omega_{2}]_{\mathfrak{b}}]_{\mathfrak{a}}=\operatorname{[\![}0,-(B_{1}A_{2}-B_{2}A_{1})A_{3},0\operatorname{]\!]}

[\omega_{1},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]]_{\mathfrak{m}}=[\operatorname{[\![}A_{1},B_{1},0\operatorname{]\!]},\operatorname{[\![}0,-B_{3}A_{2},0\operatorname{]\!]}]_{\mathfrak{m}}=\operatorname{[\![}A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}+B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2},B_{3}A_{2}A_{1},0\operatorname{]\!]}

By permuting indices, we evaluate the $\mathfrak{a}$ component of $4\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}$ from four expressions similar to $[\omega_{1},[\omega_{2\mathfrak{a}},\omega_{3\mathfrak{b}}]]_{\mathfrak{a}}$ as

\begin{gathered}-A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2}-A_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{3}\\ +A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}A_{1}+A_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{3}\end{gathered}

and evaluate the $\mathfrak{b}$ component of $4\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}$ from the remaining items as

\begin{gathered}2(-B_{3}([A_{1},A_{2}]+B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})-(B_{1}A_{2}-B_{2}A_{1})A_{3})\\ -B_{3}A_{2}A_{1}-B_{2}A_{3}A_{1}+(B_{2}A_{3}-B_{3}A_{2})A_{1}+B_{1}([A_{2},A_{3}]+B_{3}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{3})\\ +B_{3}A_{1}A_{2}+B_{1}A_{3}A_{2}-(B_{1}A_{3}-B_{3}A_{1})A_{2}-B_{2}([A_{1},A_{3}]+B_{3}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{3})\end{gathered}

Let us collect terms. Terms starting with $B_{3}$ and two $A$ factors are

-2B_{3}[A_{1},A_{2}]-B_{3}A_{2}A_{1}-B_{3}A_{2}A_{1}+B_{3}A_{1}A_{2}+B_{3}A_{1}A_{2}=0

Terms starting with $B_{2}$ and two $A$ factors:

2B_{2}A_{1}A_{3}-B_{2}A_{3}A_{1}+B_{2}A_{3}A_{1}-B_{2}[A_{1},A_{3}]=B_{2}A_{1}A_{3}+B_{2}A_{3}A_{1}

Terms starting with $B_{1}$ and two $A$ factors:

-2B_{1}A_{2}A_{3}+B_{1}[A_{2},A_{3}]+B_{1}A_{3}A_{2}-B_{1}A_{3}A_{2}=-B_{1}A_{2}A_{3}-B_{1}A_{3}A_{2}

Terms with only $B$ factors:

\begin{gathered}-2B_{3}(B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2})+B_{1}(B_{3}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}^{\operatorname{\mathsf{T}}}B_{3})-B_{2}(B_{3}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{3})\end{gathered}

On the other hand, we have

\begin{gathered}A_{R,\alpha=1/2}^{\prime}=\frac{-2}{4}(A_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-A_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{3}A_{2}+B_{2}^{\operatorname{\mathsf{T}}}B_{3}A_{1})+\\ \frac{-1}{2}(A_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-A_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1}-B_{1}^{\operatorname{\mathsf{T}}}B_{2}A_{3}+B_{2}^{\operatorname{\mathsf{T}}}B_{1}A_{3})\end{gathered}

\begin{gathered}B_{R,\alpha=1/2}^{\prime}=\frac{4(1/2)-1}{2}(B_{1}A_{3}A_{2}-B_{2}A_{3}A_{1})+\\ (2(1/2)-1)(B_{3}A_{1}A_{2}-B_{3}A_{2}A_{1})-(B_{3}B_{1}^{\operatorname{\mathsf{T}}}B_{2}-B_{3}B_{2}^{\operatorname{\mathsf{T}}}B_{1})+\\ \frac{1}{2}(B_{1}B_{2}^{\operatorname{\mathsf{T}}}B_{3}-B_{2}B_{1}^{\operatorname{\mathsf{T}}}B_{3})+\frac{1}{2}(B_{1}A_{2}A_{3}-B_{1}B_{3}^{\operatorname{\mathsf{T}}}B_{2}-B_{2}A_{1}A_{3}+B_{2}B_{3}^{\operatorname{\mathsf{T}}}B_{1})\end{gathered}

and we can confirm by inspection $-2\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}=g^{\prime}(1/2)$ . The constant term $\operatorname{R}^{[0]}$ is verified similarly, which we will not show here. ∎
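The inspection for the $\mathfrak{b}$ component can also be done numerically. The sketch below is an illustration rather than part of the proof: it transcribes the uncollected expression for the $\mathfrak{b}$ component of $4\operatorname{R}^{[1]}_{\omega_{1},\omega_{2}}\omega_{3}$ and the displayed formula for $B_{R,\alpha=1/2}^{\prime}$ , and compares them on random data. The sizes `p`, `k` (standing for $n-p$ ) and the helper names `skew`, `br` are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(1)
p, k = 3, 4                               # k plays the role of n - p

def skew(rng, p):
    """Random antisymmetric p x p matrix (hypothetical helper)."""
    S = rng.standard_normal((p, p))
    return S - S.T

def br(X, Y):
    """Matrix commutator [X, Y]."""
    return X @ Y - Y @ X

A1, A2, A3 = (skew(rng, p) for _ in range(3))
B1, B2, B3 = (rng.standard_normal((k, p)) for _ in range(3))

# b component of 4 R^[1], transcribed before collecting terms.
four_R1_b = (
    2 * (-B3 @ (br(A1, A2) + B2.T @ B1 - B1.T @ B2) - (B1 @ A2 - B2 @ A1) @ A3)
    - B3 @ A2 @ A1 - B2 @ A3 @ A1 + (B2 @ A3 - B3 @ A2) @ A1
    + B1 @ (br(A2, A3) + B3.T @ B2 - B2.T @ B3)
    + B3 @ A1 @ A2 + B1 @ A3 @ A2 - (B1 @ A3 - B3 @ A1) @ A2
    - B2 @ (br(A1, A3) + B3.T @ B1 - B1.T @ B3)
)

# Derivative B_R' at alpha = 1/2, transcribed from the display above.
alpha = 0.5
dB = (
    (4 * alpha - 1) / 2 * (B1 @ A3 @ A2 - B2 @ A3 @ A1)
    + (2 * alpha - 1) * (B3 @ A1 @ A2 - B3 @ A2 @ A1)
    - (B3 @ B1.T @ B2 - B3 @ B2.T @ B1)
    + 0.5 * (B1 @ B2.T @ B3 - B2 @ B1.T @ B3)
    + 0.5 * (B1 @ A2 @ A3 - B1 @ B3.T @ B2 - B2 @ A1 @ A3 + B2 @ B3.T @ B1)
)

# The claim -2 R^[1] = g'(1/2) reads 4 R^[1] = -2 g'(1/2) on the b component.
b_match = np.allclose(four_R1_b, -2 * dB)
print(b_match)
```

The identity is purely algebraic, so it holds for arbitrary square $A_{i}$ ; antisymmetric matrices are used only to stay close to the setting of the proof.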

Remark 5.1.

We have shown the metric in section 3 is $\operatorname{P}_{t}$ for $t=2\alpha$ , consistent with the relation $f(t)=g(t/2)$ used in the proof above. The submersion associated with the Cheeger deformation gives a sectional curvature formula for $\mathtt{G}$ with the metric $\operatorname{P}_{t}$ in proposition 2.4 of [5]. Using the O’Neill equation and eq. 5.10, it implies the following sectional curvature formula for $\mathtt{M}=\mathtt{G}/\mathtt{K}$ (the norm $\|\cdot\|$ corresponds to the bi-invariant inner product $\langle\cdot,\cdot\rangle$ )

(5.17)

\begin{gathered}\langle\operatorname{R}^{\mathtt{M}}_{\omega_{1},\omega_{2}}\omega_{1},\operatorname{P}_{t}\omega_{2}\rangle=\frac{1}{4}\|[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{n}}+t[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{n}}]+t[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{a}}]\|^{2}+\\ \frac{1}{4}\|[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}}+t^{2}[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]\|^{2}+\frac{1}{4}t(1-t)^{3}\|[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]\|^{2}+\\ \frac{3}{4}(1-t)\|[\omega_{1\mathfrak{n}},\omega_{2\mathfrak{n}}]_{\mathfrak{a}}+t[\omega_{1\mathfrak{a}},\omega_{2\mathfrak{a}}]\|^{2}+\frac{3}{4}\|[\omega_{1},\omega_{2}]_{\mathfrak{k}}\|^{2}\end{gathered}

It is also a weighted sum of squares, in a different format from eq. 3.8. It is shown to imply non-negativity of the curvature when $t\leq 1$ and, in the case where $\mathfrak{a}$ is abelian, when $t\leq 4/3$ .

6. Discussion

In this paper, we have obtained explicit formulas for curvatures of real Stiefel manifolds with deformation metrics and obtained several results related to Einstein metrics and sectional curvature range, including parameter values corresponding to non-negative sectional curvatures. We expect similar results could be obtained for complex and quaternionic Stiefel manifolds. We hope the availability of explicit curvature formulas for a family of metrics on an important class of manifolds will be helpful in both theory and applications. The framework to compute the Levi-Civita connection and curvature for deformations of normal homogeneous spaces could be applied to other families of manifolds, potentially allowing the construction of new Einstein manifolds or manifolds with non-negative curvatures.

Appendix A A few trace formulas

We collect a few results on the trace of common operators that will be useful in the computation of the Ricci curvature for matrix spaces. They are most likely known, but we do not have the exact references.

Lemma A.1.

1. Let $X$ be a matrix in $\mathbb{R}^{m\times n}$ . The trace of the operator $X\mapsto AXB$ where $A\in\mathbb{R}^{m\times m}$ and $B\in\mathbb{R}^{n\times n}$ is $\operatorname{Tr}(A)\operatorname{Tr}(B)$ . In particular, the trace of $X\mapsto AX$ is $n\operatorname{Tr}(A)$ , and the trace of $X\mapsto XB$ is $m\operatorname{Tr}(B)$ . The trace of the operator $X\mapsto AX^{\operatorname{\mathsf{T}}}B$ where $A$ and $B$ are matrices of size $m\times n$ is $\operatorname{Tr}(AB^{\operatorname{\mathsf{T}}})$ .

2. The trace of the operator $X\mapsto AXB+B^{\operatorname{\mathsf{T}}}XA^{\operatorname{\mathsf{T}}}$ from the space $\mathrm{Sym}_{p}$ to itself, where $A$ and $B$ are $p\times p$ matrices, is $\operatorname{Tr}(A)\operatorname{Tr}(B)+\operatorname{Tr}(AB^{\operatorname{\mathsf{T}}})$ . In particular, the trace of the operator $X\mapsto AX+XA^{\operatorname{\mathsf{T}}}$ is $(p+1)\operatorname{Tr}(A)$ . The trace of the operator $X\mapsto\operatorname{Tr}(AX)B$ , where $B$ is a symmetric matrix and $A$ is a $p\times p$ matrix, is $\operatorname{Tr}(\frac{1}{2}(A+A^{\operatorname{\mathsf{T}}})B)$ .

3. The trace of the operator $X\mapsto AXB+B^{\operatorname{\mathsf{T}}}XA^{\operatorname{\mathsf{T}}}$ , from the space $\mathfrak{o}(p)$ to itself, where $A$ and $B$ are $p\times p$ matrices, is $\operatorname{Tr}(A)\operatorname{Tr}(B)-\operatorname{Tr}(AB^{\operatorname{\mathsf{T}}})$ . In particular, if $A$ and $B$ are antisymmetric matrices then the trace of $X\mapsto[[AX]B]$ is $(2-p)\operatorname{Tr}(AB)$ .

Proof.

Let $E_{ij}$ be the matrix of the same size as $X$ with the $ij$ -entry equal to $1$ and all other entries equal to $0$ . All the statements are proved similarly, by summing the diagonal coefficients of the operator in an appropriate basis built from the $E_{ij}$ . Let the entries of $A$ be $a_{ij}$ and the entries of $B$ be $b_{ij}$ .

For item 1, $(AE_{ij}B)_{ij}=a_{ii}b_{jj}$ , so the trace of $X\mapsto AXB$ is $\sum_{ij}a_{ii}b_{jj}=\operatorname{Tr}(A)\operatorname{Tr}(B)$ . Since $(AE_{ij}^{\operatorname{\mathsf{T}}}B)_{ij}=a_{ij}b_{ij}$ , $\operatorname{Tr}(X\mapsto AX^{\operatorname{\mathsf{T}}}B)$ is $\sum_{ij}a_{ij}b_{ij}=\operatorname{Tr}(AB^{\operatorname{\mathsf{T}}})$ .
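The basis-summation argument for item 1 translates directly into a numerical check. The following NumPy sketch is an illustration under stated assumptions (the helper name `operator_trace`, the sizes and the random matrices are not from the text): it sums the $ij$ -entry of the image of each $E_{ij}$ and compares the result with the closed forms.

```python
import numpy as np

def operator_trace(op, m, n):
    """Trace of a linear map on R^{m x n}: sum over E_ij of (op(E_ij))_ij."""
    total = 0.0
    for i in range(m):
        for j in range(n):
            E = np.zeros((m, n))
            E[i, j] = 1.0
            total += op(E)[i, j]
    return total

rng = np.random.default_rng(2)
m, n = 3, 4
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
C = rng.standard_normal((m, n))          # plays the role of A in X -> A X^T B
D = rng.standard_normal((m, n))          # plays the role of B in X -> A X^T B

tr_AXB = operator_trace(lambda X: A @ X @ B, m, n)
tr_AX = operator_trace(lambda X: A @ X, m, n)
tr_CXtD = operator_trace(lambda X: C @ X.T @ D, m, n)

item1_ok = (np.isclose(tr_AXB, np.trace(A) * np.trace(B))
            and np.isclose(tr_AX, n * np.trace(A))
            and np.isclose(tr_CXtD, np.trace(C @ D.T)))
print(item1_ok)
```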

For item 2, a basis of $\mathrm{Sym}_{p}$ consists of matrices $E_{ii}$ $(i=1,\cdots,p)$ and $E_{ij}+E_{ji}$ for $i<j$ . We now compute the trace of $X\mapsto AXB+B^{\operatorname{\mathsf{T}}}XA^{\operatorname{\mathsf{T}}}$ with respect to this basis. For $E_{ii}$ , $(AE_{ii}B+B^{\operatorname{\mathsf{T}}}E_{ii}A^{\operatorname{\mathsf{T}}})_{ii}=2a_{ii}b_{ii}$ , for $E_{ij}+E_{ji}$ , the coefficient is

(A(E_{ij}+E_{ji})B+B^{\operatorname{\mathsf{T}}}(E_{ij}+E_{ji})A^{\operatorname{\mathsf{T}}})_{ij}=a_{ii}b_{jj}+a_{ij}b_{ij}+b_{ii}a_{jj}+b_{ji}a_{ji}

Hence the trace is

\sum_{i}2a_{ii}b_{ii}+\sum_{i<j}(a_{ii}b_{jj}+a_{ij}b_{ij}+b_{ii}a_{jj}+b_{ji}a_{ji})=\sum_{i}a_{ii}\sum_{j}b_{jj}+\sum_{ij}a_{ij}b_{ij}

which is $\operatorname{Tr}(A)\operatorname{Tr}(B)+\operatorname{Tr}(AB^{\operatorname{\mathsf{T}}})$ , as $\sum_{i}a_{ii}b_{ii}+\sum_{i<j}(a_{ii}b_{jj}+b_{ii}a_{jj})$ rearranges to the first sum, and the sum of remaining terms is $\operatorname{Tr}(AB^{\operatorname{\mathsf{T}}})$ . With $B=\operatorname{I}_{p}$ we have the trace of $X\mapsto AX+XA^{\operatorname{\mathsf{T}}}$ is $(p+1)\operatorname{Tr}(A)$ . For the operator $X\mapsto\operatorname{Tr}(AX)B$ , the coefficient corresponding to $E_{ii}$ is $a_{ii}b_{ii}$ , corresponding to $E_{ij}+E_{ji}$ is $(a_{ij}+a_{ji})b_{ij}$ . The trace is

\sum_{i}a_{ii}b_{ii}+\sum_{i<j}(a_{ij}+a_{ji})b_{ij}=\frac{1}{2}\operatorname{Tr}((A+A^{\operatorname{\mathsf{T}}})B)
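Item 2 can be checked in the same way on the basis $E_{ii}$ , $E_{ij}+E_{ji}$ ( $i<j$ ) of $\mathrm{Sym}_{p}$ . In the sketch below, the helper name `sym_operator_trace` and the test data are assumptions made for illustration. Since each operator maps symmetric matrices to symmetric matrices, the coefficient of the image on the basis element indexed by $(i,j)$ is simply its $ij$ -entry.

```python
import numpy as np

def sym_operator_trace(op, p):
    """Trace of a linear map on Sym_p, using the basis E_ii, E_ij + E_ji (i < j)."""
    total = 0.0
    for i in range(p):
        for j in range(i, p):
            S = np.zeros((p, p))
            S[i, j] = 1.0
            S[j, i] = 1.0                # for i == j this is just E_ii
            total += op(S)[i, j]
    return total

rng = np.random.default_rng(3)
p = 4
A = rng.standard_normal((p, p))
B = rng.standard_normal((p, p))
Bs = B + B.T                             # a symmetric matrix for X -> Tr(AX) B

tr_two_sided = sym_operator_trace(lambda X: A @ X @ B + B.T @ X @ A.T, p)
tr_lyapunov = sym_operator_trace(lambda X: A @ X + X @ A.T, p)
tr_rank_one = sym_operator_trace(lambda X: np.trace(A @ X) * Bs, p)

item2_ok = (
    np.isclose(tr_two_sided, np.trace(A) * np.trace(B) + np.trace(A @ B.T))
    and np.isclose(tr_lyapunov, (p + 1) * np.trace(A))
    and np.isclose(tr_rank_one, 0.5 * np.trace((A + A.T) @ Bs))
)
print(item2_ok)
```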

For item 3, a basis of $\mathfrak{o}(p)$ consists of the matrices $E_{ij}-E_{ji}$ for $i<j$ . The coefficient corresponding to $E_{ij}-E_{ji}$ is

(A(E_{ij}-E_{ji})B+B^{\operatorname{\mathsf{T}}}(E_{ij}-E_{ji})A^{\operatorname{\mathsf{T}}})_{ij}=a_{ii}b_{jj}-a_{ij}b_{ij}+b_{ii}a_{jj}-b_{ji}a_{ji}

The trace is $\sum_{i<j}(a_{ii}b_{jj}-a_{ij}b_{ij}+b_{ii}a_{jj}-b_{ji}a_{ji})=\sum_{ij}a_{ii}b_{jj}-\sum_{ij}a_{ij}b_{ij}$ , which is $\operatorname{Tr}(A)\operatorname{Tr}(B)-\operatorname{Tr}(AB^{\operatorname{\mathsf{T}}})$ .

For the trace of $X\mapsto[[AX]B]=(AX-XA)B-B(AX-XA)=AXB+BXA-BAX-XAB$ , we have:

\operatorname{Tr}(X\mapsto AXB+BXA)=-\operatorname{Tr}(AB^{\operatorname{\mathsf{T}}})=\operatorname{Tr}(AB)

\operatorname{Tr}(X\mapsto BAX+XAB)=\operatorname{Tr}(\operatorname{I}_{p})\operatorname{Tr}(BA)-\operatorname{Tr}(BA)=(p-1)\operatorname{Tr}(BA)

from here we get $\operatorname{Tr}(X\mapsto[[AX]B])=(2-p)\operatorname{Tr}(AB)$ . ∎
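For item 3, a minimal numerical sketch on the basis $E_{ij}-E_{ji}$ ( $i<j$ ) of $\mathfrak{o}(p)$ is given below; the helper names `skew_operator_trace` and `br` and the random inputs are assumptions of this illustration. It checks the general formula with arbitrary $A,B$ and the double-bracket formula with antisymmetric $A,B$ .

```python
import numpy as np

def skew_operator_trace(op, p):
    """Trace of a linear map on o(p), using the basis E_ij - E_ji (i < j)."""
    total = 0.0
    for i in range(p):
        for j in range(i + 1, p):
            K = np.zeros((p, p))
            K[i, j] = 1.0
            K[j, i] = -1.0
            total += op(K)[i, j]         # coefficient on E_ij - E_ji
    return total

def br(X, Y):
    """Matrix commutator [X, Y]."""
    return X @ Y - Y @ X

rng = np.random.default_rng(4)
p = 5
A = rng.standard_normal((p, p))
B = rng.standard_normal((p, p))
As, Bs = A - A.T, B - B.T                # antisymmetric matrices

tr_general = skew_operator_trace(lambda X: A @ X @ B + B.T @ X @ A.T, p)
tr_bracket = skew_operator_trace(lambda X: br(br(As, X), Bs), p)

item3_ok = (
    np.isclose(tr_general, np.trace(A) * np.trace(B) - np.trace(A @ B.T))
    and np.isclose(tr_bracket, (2 - p) * np.trace(As @ Bs))
)
print(item3_ok)
```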

References

[1] V. Arnold, Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l’hydrodynamique des fluides parfaits, Annales de l’institut Fourier 16 (1966), no. 1, 319–361.
[2] J. Cheeger, Some examples of manifolds of nonnegative curvature, Journal of Differential Geometry 8 (1973), no. 4, 623 – 628.
[3] J. Ge, DDVV-type inequality for skew-symmetric matrices and Simons-type inequality for Riemannian submersions, Advances in Mathematics 251 (2014), 62–86.
[4] K. Grove, H. Karcher, and E. Ruh, Group actions and curvature, Inventiones mathematicae 23 (1974), 31–48.
[5] K. Grove and W. Ziller, Curvature and symmetry of Milnor spheres, The Annals of Mathematics 152 (2000), 331–367.
[6] K. Hüper, I. Markina, and F. Silva Leite, A Lagrangian approach to extremal curves on Stiefel manifolds, Journal of Geometric Mechanics 13 (2021), 55–72.
[7] P. W. Michor, Some geometric evolution equations arising as geodesic equations on groups of diffeomorphisms including the Hamiltonian approach, Phase Space Analysis of Partial Differential Equations (Antonio Bove, Ferruccio Colombini, and Daniele Del Santo, eds.), Birkhäuser Boston, Boston, MA, 2007, pp. 133–215.
[8] J. Milnor, Curvatures of left invariant metrics on Lie groups, Advances in Mathematics 21 (1976), no. 3, 293–329.
[9] D. Nguyen, Operator-valued formulas for Riemannian gradient and Hessian and families of tractable metrics in optimization and machine learning, 2020.
[10] D. Nguyen, Riemannian geometry with differentiable ambient space and metric operator, 2021.
[11] B. O’Neill, The fundamental equations of a submersion, Michigan Math. J. 13 (1966), no. 4, 459–469.
[12] B. O’Neill, Semi-Riemannian geometry with applications to relativity, Pure and Applied Mathematics, vol. 103, Academic Press, Inc, New York, NY, 1983.
[13] T. Rapcsak, Sectional curvatures in nonlinear optimization, J. Global Optimization 40 (2008), 375–388.
[14] Q. Rentmeesters, Algorithms for data fitting on some common homogeneous spaces, Ph.D. thesis, Université Catholique de Louvain, Louvain, Belgium, 2013.
[15] A. A. Sagle, Some homogeneous Einstein manifolds, Nagoya Mathematical Journal 39 (1970), no. 39, 81–106.
[16] W. Ziller, Examples of Riemannian manifolds with non-negative sectional curvature, Metric and Comparison Geometry, Surv. Diff. Geom. 11 (K. Grove and J. Cheeger, eds.), International Press, 2007, pp. 63–102.