Finsler geometries on strictly accretive matrices
Abstract
In this work we study the set of strictly accretive matrices, that is, the set of matrices with positive definite Hermitian part, and show that the set can be interpreted as a smooth manifold. Using the recently proposed symmetric polar decomposition for sectorial matrices, we show that this manifold is diffeomorphic to a direct product of the manifold of (Hermitian) positive definite matrices and the manifold of strictly accretive unitary matrices. Utilizing this decomposition, we introduce a family of Finsler metrics on the manifold and characterize their geodesics and geodesic distance. Finally, we apply the geodesic distance to a matrix approximation problem, and also give some comments on the relation between the introduced geometry and the geometric mean for strictly accretive matrices as defined by S. Drury in [S. Drury, Linear Multilinear Algebra. 2015 63(2):296–301].
Key words: Accretive matrices; matrix manifolds; Finsler geometry; numerical range; geometric mean
1 Introduction
Given a complex number $z$ we can write it in its Cartesian form $z = x + iy$, where $x$ is the real part and $y$ is the imaginary part, or we can write it in its polar form as $z = re^{i\theta}$, where $r$ is the magnitude and $\theta$ is the phase. The standard metric on the complex plane defines the (absolute) distance between $z_1$ and $z_2$ as $|z_1 - z_2|$, which is efficiently computed using the Cartesian form as $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. However, sometimes a logarithmic (relative) distance between the numbers contains information that is more relevant for the problem at hand. One such distance is given by the absolute difference of the logarithms, and in this distance measure the number 10 is as close to 1 as it is to 100. This type of distance has wide application in engineering problems, e.g., as demonstrated in the use of Bode plots and Nichols charts in control theory [41]. Moreover, this type of logarithmic metric has been generalized to (Hermitian) positive definite matrices, with plenty of applications, for example in computing geometric means between such matrices [37], [11, Chp. 6], [29, Chp. XII]. This generalization can be done by identifying the set of positive definite matrices as a smooth manifold and introducing a Riemannian or Finsler metric on it. Here, we follow a similar path and extend this type of logarithmic metrics to so called strictly accretive matrices. More specifically, the outline of the paper is as follows: in Section 2 we review relevant background material and set up the notation used in the paper. Section 3 is devoted to showing that the set of strictly accretive matrices can be interpreted as a smooth manifold, and that this manifold is diffeomorphic to a direct product of the smooth manifold of positive definite matrices and the smooth manifold of strictly accretive unitary matrices. The latter is done using the newly introduced symmetric polar decomposition for sectorial matrices [44]. In Section 4 we introduce a family of Finsler metrics on the manifold, by means of the decomposition from the previous section and so called (Minkowskian) product functions [39]. In particular, this allows us to characterize the corresponding geodesics and compute the geodesic distance. Finally, in Section 5 we give an application of the metric to a matrix approximation problem and also give some comments on the relation between the geodesic midpoint and the geometric mean between strictly accretive matrices as introduced in [16].
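As a small aside (not part of the original text), the following snippet illustrates the logarithmic distance on positive numbers discussed above; the helper name log_distance is ours.

```python
# Minimal illustration of the logarithmic (relative) distance d(a, b) = |log(a) - log(b)|
# on positive numbers; not part of the paper, helper name is ours.
import numpy as np

def log_distance(a, b):
    """Logarithmic distance between two positive numbers."""
    return abs(np.log(a) - np.log(b))

print(abs(10 - 1), abs(10 - 100))                   # absolute distances: 9 and 90
print(log_distance(10, 1), log_distance(10, 100))   # both equal log(10); 10 is equally far from 1 and 100
```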
2 Background and notation
In this section we introduce some background material needed for the rest of the paper. At the same time, this section is also used to set up the notation used throughout. To this end, let denote the set of matrices over the field of complex numbers. For , let denote its complex conjugate transpose, let denote its Hermitian part, and let denote its skew-Hermitian part. (Note that this is equivalent to the Toeplitz decomposition, since if , then and , see [23, p. 7].) Moreover, by we denote the identity matrix, and for by we denote its spectrum, i.e., , and by we denote its singular values, i.e., .
The following sets of matrices will be used throughout: denotes the set of invertible matrices, denotes the set of unitary matrices, denotes the set of Hermitian matrices, denotes the set of positive definite matrices, i.e., s.t. , denotes the set of skew-Hermitian matrices, and denotes the set of strictly accretive matrices, i.e., if and only if . (The naming used here is the same as in [28, p. 281], in contrast to [8].)
Two matrices are said to be congruent if there exists a matrix such that . For matrices we define the inner product , which gives the Frobenius norm . By we denote the spectral norm, i.e., , the largest singular value of . Next, a function is called a symmetric gauge function if for all and all i) if , ii) , iii) , and iv) for all where is any permutation of [36], [34, Sec. 3.I.1]. For any unitarily invariant norm, i.e., a norm such that for all and all , there exists a symmetric gauge function such that [10, Thm. IV.2.1], [34, Thm. 10.A.1]. For this reason we will henceforth denote such norms . Moreover, we will call a symmetric gauge function, and the corresponding norm, smooth if it is smooth outside of the origin, cf. [32, Thm. 8.5].
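As a concrete (standard, not paper-specific) example of this correspondence: the l_p vector norm is a symmetric gauge function, and applying it to the singular values gives the Schatten p-norm, which is unitarily invariant. A minimal numerical check, with our own helper names:

```python
# The l_p vector norm is a symmetric gauge function; applied to the singular values it
# yields the (unitarily invariant) Schatten p-norm. Names below are ours.
import numpy as np

def schatten_norm(A, p=2):
    """Symmetric gauge function (l_p) evaluated at the singular values of A."""
    return np.linalg.norm(np.linalg.svd(A, compute_uv=False), ord=p)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))  # random unitary
V, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))  # random unitary
for p in (1, 2, np.inf):
    print(np.isclose(schatten_norm(A, p), schatten_norm(U @ A @ V, p)))  # True: unitary invariance
```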
For a vector , by we denote the vector obtained by sorting the elements in in a nonincreasing order. More precisely, is obtained by permuting the elements of such that where . For two vectors , we say that is submajorized (weakly submajorized) by if for and [34, p. 12]. Submajorization (weak submajorization) is a preorder on , and we write . On the equivalence classes of vectors sorted in nonincreasing order it is a partial ordering [34, p. 19].
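The weak submajorization order can be checked directly from the definition; the following small utility (ours, for illustration only) compares the partial sums of the nonincreasing rearrangements.

```python
# Check whether x is weakly submajorized by y: partial sums of the sorted (nonincreasing)
# entries of x must be dominated by those of y. Utility is ours, for illustration.
import numpy as np

def weakly_submajorized(x, y, tol=1e-12):
    xs, ys = np.sort(x)[::-1], np.sort(y)[::-1]   # nonincreasing rearrangements
    return bool(np.all(np.cumsum(xs) <= np.cumsum(ys) + tol))

print(weakly_submajorized([1.0, 1.0, 1.0], [3.0, 0.0, 0.0]))   # True
print(weakly_submajorized([3.0, 0.0, 0.0], [1.0, 1.0, 1.0]))   # False
```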
2.1 Sectorial matrices and the phases of a matrix
Given a matrix $A$, we define the numerical range (field of values) as $W(A) := \{x^* A x : x \in \mathbb{C}^n,\ x^* x = 1\}$.
Using the numerical range, we can define the set of so called sectorial matrices as the set of matrices $A$ for which $0 \notin W(A)$.
The name comes from the fact that the numerical range of a sectorial matrix is contained in a sector of opening angle less than $\pi$. The latter can be seen from the Toeplitz-Hausdorff theorem, which states that for any matrix , is a convex set [45, Thm. 4.1], [20, Thm. 1.1-2]. Recently, sectorial matrices have received considerable attention in the literature, see, e.g., [8, 35, 17, 33, 46, 44, 12].
Sectorial matrices have several interesting properties. In particular, if is sectorial it is congruent to a unitary diagonal matrix , i.e., for some [21, 15, 19, 26, 24]. Although the decomposition is not unique, the elements in are unique up to permutation, and any such decomposition is called a sectorial decomposition [46]. Using this decomposition, we define the phases of , denoted , as the phases of the eigenvalues of [19, 44, 12] (in [19], these were called canonical angles); by convention we define them to belong to an interval of length strictly less than . With this definition we have, e.g., that if and only if and , which is true if and only if . Note that the phases of a sectorial matrix are in general different from the angles of the eigenvalues, i.e., in general where . However, equality does hold for normal matrices. The phases have a number of desirable properties that the angles of the eigenvalues do not, see [44].
Another decomposition of sectorial matrices, which will in fact be central to this work, is the so called symmetric polar decomposition [44, Thm. 3.1]: for a sectorial matrix $A$ there is a unique decomposition given by
$$A = P^{1/2} U P^{1/2},$$
where $P$ is (Hermitian) positive definite and $U$ is a sectorial unitary matrix. The phases of $A$ are given by the phases of $U$, which are in fact the phases of the eigenvalues of $U$. Therefore, $A$ is strictly accretive if and only if it has a symmetric polar decomposition in which $U$ is strictly accretive, i.e., belongs to the set of strictly accretive unitary matrices.
2.2 Riemannian and Finsler manifolds
Smooth manifolds are important mathematical objects which show up in such diverse fields as theoretical physics [40], robotics [38], and statistics and information theory [2]. Intuitively, they can be thought of as spaces that locally look like Euclidean space, and on these spaces one can introduce geometric concepts such as curves and metrics. In particular, all smooth manifolds admit a so called Riemannian metric [27, Thm. 1.4.1], [30, Prop. 13.3], and Riemannian geometry is a well-studied subject, see, e.g., one of the monographs [40, 29, 27, 30, 31]. A relaxation of Riemannian geometry leads to so called Finsler geometry [14]; loosely expressed, it can be interpreted as changing the tangent space from a Hilbert space to a Banach space. For an introduction to Finsler geometry, see, e.g., [9, 42, 13].
More specifically, given a smooth manifold , for we denote the tangent space by and the tangent bundle by . A Riemannian metric is induced by an inner product on the tangent space, , that varies smoothly with the base point . Using this inner product, one defines the norm , which in fact defines a smooth function on the slit tangent bundle . In this work we consider Finsler structures on smooth (matrix) manifolds, but we will limit the scope to Finsler structures , , where is a norm on which is not necessarily induced by an inner product, and such that is smooth on the slit tangent bundle . Given a piecewise smooth curve $\gamma$, the arc length on the manifold is defined using this Finsler structure. More precisely, it is defined as
$$L(\gamma) := \int_0^1 F\big(\gamma(t), \dot{\gamma}(t)\big)\, dt,$$
where $\dot{\gamma}(t)$ is the derivative of $\gamma$ with respect to $t$. Using the arc length, the geodesic distance between two points $x_0$ and $x_1$ is defined as
$$d(x_0, x_1) := \inf\big\{ L(\gamma) : \gamma \text{ piecewise smooth},\ \gamma(0) = x_0,\ \gamma(1) = x_1 \big\},$$
and a minimizing curve (if one exists) is called a geodesic. A final notion we need is that of diffeomorphic manifolds. More precisely, two smooth manifolds and are said to be diffeomorphic if there exists a diffeomorphism , i.e., a function which is a smooth bijection with a smooth inverse. In this case we write .
Next, we summarize some results regarding two matrix manifolds, namely and , together with specific Finsler structures. These will be needed later.
2.2.1 A family of Finsler metrics on and their geodesics
Riemannian and Finsler geometry on is a well-studied subject, and we refer the reader to, e.g., [37], [11, Chp. 6] or [29, Chp. XII]. Here, we summarize some of the results we will need for later. To this end, note that the tangent space at is , and given , we can introduce the inner product on the tangent space as
(2.1)
with corresponding norm . The geodesic between in the induced Riemannian metric is given by
(2.2)
and the length of the curve, i.e., the Riemannian distance between and , is given by
Interestingly, if the norm on is changed to any other unitarily invariant matrix norm, the expressions for a geodesic between two matrices remain unchanged and the corresponding distance is given by the analogous expression in the new norm [11, Sec. 6.4]. However, the geodesic (2.2) might no longer be the unique shortest curve [11, p. 223].
An alternative expression for the geodesic (2.2) is given by the following proposition.
Proposition 2.1.
Let be any smooth unitarily invariant norm and consider the Finsler structure given by , . For , a geodesic between them can be written as where , is a simultaneous diagonalization by congruence of and , i.e., and is diagonal with positive elements on the diagonal. Moreover, the geodesic distance is
(2.3)
Proof.
To show the second equality in (2.3), note that
where the second equality comes from the fact that the singular values of a Hermitian matrix, i.e., the matrix , are the absolute values of the eigenvalues, the third equality follows since the symmetric gauge function is invariant under sign changes, the fourth can be seen by using a unitary diagonalization of , and the fifth equality comes from the fact that the spectrum is invariant under similarity. Next, since both , by [23, Thm. 7.6.4] we can simultaneously diagonalize and by congruence, i.e., there exists an such that and , where and where are all strictly larger than . In fact, are the eigenvalues of , which means that , which in turn gives the last equality in (2.3). Finally, this also gives that ∎
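For numerical experimentation, the following sketch implements the standard affine-invariant expressions for the geodesic and the distance on the positive definite matrices, which we take to be the content of (2.2) and (2.3) in the Frobenius case; the function names and test matrices are ours.

```python
# Sketch of the affine-invariant geodesic A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2} and the
# distance ||log(A^{-1/2} B A^{-1/2})||; assumed to correspond to (2.2)-(2.3), names are ours.
import numpy as np
from scipy.linalg import sqrtm, inv, logm, fractional_matrix_power

def pd_geodesic(A, B, t):
    """Point at parameter t on the geodesic from A to B among positive definite matrices."""
    Ah = sqrtm(A)
    Ahi = inv(Ah)
    return Ah @ fractional_matrix_power(Ahi @ B @ Ahi, t) @ Ah

def pd_distance(A, B, norm_ord='fro'):
    """Geodesic distance in a unitarily invariant norm (Frobenius by default)."""
    Ahi = inv(sqrtm(A))
    return np.linalg.norm(logm(Ahi @ B @ Ahi), norm_ord)

A = np.diag([1.0, 4.0])
B = np.array([[2.0, 1.0], [1.0, 3.0]])
print(np.allclose(pd_geodesic(A, B, 1.0), B))     # endpoint is recovered
M = pd_geodesic(A, B, 0.5)                        # geodesic midpoint (the geometric mean of A and B)
print(np.isclose(pd_distance(A, M) + pd_distance(M, B), pd_distance(A, B)))  # midpoint splits the distance
```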
2.2.2 A family of Finsler metrics on and their geodesics
The set of unitary matrices is a Lie group, and results related to Riemannian and Finsler geometry on can be found in, e.g., [3, 5, 4, 7, 6]. Again, we here summarize some of the results that we will need for later. To this end, note that the tangent space at is and given , we can introduce the inner product on the tangent space as
(2.4)
with corresponding induced norm . The induced Riemannian metric has shortest curves between given by
(2.5)
where , and where is such that . Moreover, the geodesic distance is
(2.6)
and the geodesic is unique if . Similarly to the results for the smooth manifold , if the norm on is changed to any other unitarily invariant matrix norm, the expression for a geodesic in (2.5) is unchanged and the expression for the geodesic distance (2.6) is only changed to using the corresponding norm [4, 7]. However, even under this condition, the geodesic (2.5) might not be unique in this case [7, Sec. 3.2].
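Analogously, a numerical sketch for the unitary group, assuming the standard bi-invariant expressions U exp(t log(U*V)) for the geodesic and ||log(U*V)|| for the distance, which we take to correspond to (2.5) and (2.6); names and test matrices are ours.

```python
# Sketch of a shortest curve U exp(t log(U^* V)) on the unitary group and of the distance
# ||log(U^* V)||; assumed to correspond to (2.5)-(2.6), names are ours.
import numpy as np
from scipy.linalg import expm, logm

def unitary_geodesic(U, V, t):
    """Point at parameter t on a shortest curve from U to V among unitary matrices."""
    return U @ expm(t * logm(U.conj().T @ V))

def unitary_distance(U, V, norm_ord='fro'):
    """Geodesic distance between the unitary matrices U and V."""
    return np.linalg.norm(logm(U.conj().T @ V), norm_ord)

theta = np.array([0.3, -0.7, 1.1])
U = np.eye(3, dtype=complex)
V = np.diag(np.exp(1j * theta))
print(np.allclose(unitary_geodesic(U, V, 1.0), V))                # endpoint is recovered
print(np.isclose(unitary_distance(U, V), np.linalg.norm(theta)))  # distance given by the phases
```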
3 The smooth manifold of strictly accretive matrices
In this section we prove that is a smooth manifold diffeomorphic to . The latter is fundamental for the introduction of the Finsler structures in the following section. For improved readability, some of the technical results of this section are deferred to Appendix A. To this end, we start by proving the following.
Theorem 3.1.
is a connected smooth manifold, and at a point the tangent space is .
Next, we prove the following characterization of the manifold.
Theorem 3.2.
, where and are embedded submanifolds.
This theorem follows as a corollary to the following proposition.
Proposition 3.3.
For , let be the symmetric polar decomposition. Then the mapping is a diffeomorphism between the smooth manifolds and .
Proof.
Since and are smooth manifolds (see [11, Chp. 6] and Lemma A.4, respectively), is also a smooth manifold [30, Ex. 1.34]. Next, note that the matrix square root is a diffeomorphism of to itself, with the matrix square as the inverse, cf. [23, Thm. 7.2.6]. Therefore, it suffices to show that the mapping is a diffeomorphism between and . To this end, first observe that the latter is a bijection due to the existence and uniqueness of a symmetric polar decomposition [44, Thm. 3.1]. Moreover, that the inverse is smooth follows since the components in are polynomial in the components in and .
To show that and are smooth in , we note that since is strictly accretive, . This means that we can write
where is a normal matrix (to see this, note that a matrix is normal if and only if its Hermitian and skew-Hermitian parts commute, see, e.g., [45, Thm. 9.1]; cf. [46, Proof of Cor. 2.5]) which by construction depends smoothly on . Now, let be the polar decomposition of . Since depends smoothly on , and since the polar decomposition is smooth in the matrix (Lemma A.5), and are smooth in . Moreover, since is normal, and commute [45, Thm. 9.1], and thus and commute. Therefore, , where all components depend smoothly on . Now, let and let be the polar decomposition of . Similar to before, and are thus both smooth in . Finally, we thus have that and since and are unitary so is . By the uniqueness of the symmetric polar decomposition [44, Thm. 3.1] it follows that and , which are both smooth in . ∎
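The proof above is constructive, and the following sketch (helper names are ours, not from [44]) follows it step by step to compute the symmetric polar decomposition A = P^{1/2} U P^{1/2} of a strictly accretive matrix numerically.

```python
# Numerical sketch of the symmetric polar decomposition, following the construction in the
# proof of Proposition 3.3; helper names are ours.
import numpy as np
from scipy.linalg import sqrtm, inv, polar

def symmetric_polar(A):
    """Return (P, U) with P positive definite, U unitary, and A = P^{1/2} U P^{1/2}."""
    H = (A + A.conj().T) / 2                  # Hermitian part (positive definite here)
    S = (A - A.conj().T) / 2                  # skew-Hermitian part
    Hh = sqrtm(H)
    Hhi = inv(Hh)
    N = np.eye(A.shape[0]) + Hhi @ S @ Hhi    # normal matrix with Hermitian part I
    UN, PN = polar(N, side='left')            # N = PN @ UN, and PN, UN commute
    G = Hh @ sqrtm(PN)                        # then A = G @ UN @ G^*
    W, Ph = polar(G, side='left')             # G = Ph @ W with Ph = (G G^*)^{1/2}
    return Ph @ Ph, W @ UN @ W.conj().T

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = X @ X.conj().T + np.eye(3) + (Y - Y.conj().T) / 2   # Hermitian part is positive definite
P, U = symmetric_polar(A)
Ph = sqrtm(P)
print(np.allclose(Ph @ U @ Ph, A))              # True: A is reconstructed
print(np.allclose(U @ U.conj().T, np.eye(3)))   # True: U is unitary
```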
Proof of Theorem 3.2.
Remark 3.4.
The results in Theorem 3.1 and 3.2 can easily be generalized to other subsets of sectorial matrices, namely any subset of all matrices such that there exists , , and , for which and . To see this, note that for we have that , i.e., that with a diffeomorphism between the components in the symmetric polar decomposition. Examples of such sets of matrices are and the set of strictly dissipative matrices, i.e., matrices such that [28, p. 279], and matrices such that . (Note that the latter has unfortunately also been termed dissipative in the literature, see [28, p. 279].)
4 A family of Finsler metrics on and their geodesics
In this section we introduce a family of Finsler structures on and in particular we will characterize the geodesics and geodesic distances corresponding to these structures. To this end, by Theorem 3.2 we have that . Moreover, and are smooth manifolds that are well-studied in the literature, and since and are embedded submanifolds of , a desired property is that when restricted to either of the two embedded submanifolds the introduced Finsler structure yields the corresponding known Finsler structure. To this end, we first characterize the geodesics and the geodesic distance on .
Proposition 4.1.
Let be any smooth unitarily invariant norm and consider the Finsler structure given by , . A geodesic between and is given by
Moreover, the geodesic distance is given by
(4.1)
Proof.
Let . The proposition follows if we can show that a geodesic on between and , given by (2.5), remains in for , and that in (2.5) and (2.6) takes the form . To show the latter, note that , where since both and are strictly accretive, see [43], [44, Thm. 6.2]. Therefore we can use the principal branch of the logarithm, which gives . Next, to show that for , note that . By [16, Prop. 3.1 and Thm. 3.4], is strictly accretive, and a repeated argument now gives that is strictly accretive for a dense set of . By continuity of the map , the result follows. ∎
Next, let and be two smooth symmetric gauge functions and consider the Finsler manifolds and as defined in Proposition 2.1 and Proposition 4.1, respectively. In the Riemannian case, there is a canonical way to introduce a metric on , namely the product metric [30, Ex. 13.2]. However, in the case of products of Finsler manifolds there is no canonical way to introduce a Finsler structure on a product space, cf. [9, Ex. 11.1.6], [39]. Here, we consider so called (Minkowskian) product manifolds [39] and to this end we next define so called (Minkowskian) product functions.
Definition 4.2 ([39]).
A function is called a product function if it satisfies the following conditions:
i) if and only if ,
ii) for all and all ,
iii) is smooth on ,
iv) on , for ,
v) on .
For any product function , is a Finsler manifold [39], and we therefore define the Finsler manifold as follows. (In [39], the convention is that the Finsler structure is squared compared to the one in [9]; we follow the convention of the latter. Note also that functions fulfilling i)-v) are not necessarily symmetric gauge functions. As an example, consider [39, Rem. 6]; this is not a symmetric gauge function since in general . Conversely, symmetric gauge functions do not in general fulfill i)-v). As an example, consider [34, p. 138]; at any point such that , and hence condition iv) is not fulfilled for this function.)
Definition 4.3.
One particular example of a product function is , which in [39] was called “the Euclidean product”, and in the Riemannian case this leads to the canonical product manifold. Moreover, the geodesics and geodesic distance can be characterized for general product functions . This leads to the following result.
Theorem 4.4.
Let , and let and be the corresponding symmetric polar decompositions. On the Finsler manifold , a geodesic from to is given by
(4.2a)
where
(4.2b)
(4.2c)
Moreover, the geodesic distance from to is given by
(4.3)
where denotes the angles of the eigenvalues.
Proof.
That is a Finsler manifold follows from the discussion leading up to the theorem; see [39]. Moreover, that (4.2) is a geodesic follows (by construction) by using [39, Thm. 3] together with Proposition 2.1 and Proposition 4.1; this also gives the first two equalities in (4.3). To prove the last equality, first observe that is the geometric mean of and and hence positive definite [11, Thm. 4.1.3], [16, Sec. 3]. Therefore, is Hermitian and thus
Similarly, is unitary and thus is skew-Hermitian. Therefore, and hence
Finally, for unitarily invariant norms we have that , and since for any symmetric gauge function , the last equality follows. ∎
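For illustration, the following sketch evaluates the geodesic distance of Theorem 4.4 numerically, using the Frobenius norm on both factors and the "Euclidean" product function Phi(a, b) = sqrt(a^2 + b^2); the decomposition step repeats the construction from the sketch in Section 3, and all helper names are ours. The last check illustrates the invariance under unitary congruence that appears in Proposition 4.5 below.

```python
# Numerical sketch of the geodesic distance (4.3) with the Euclidean product function;
# helper names are ours, and symmetric_polar repeats the construction from Section 3.
import numpy as np
from scipy.linalg import sqrtm, inv, logm, polar

def symmetric_polar(A):
    H, S = (A + A.conj().T) / 2, (A - A.conj().T) / 2
    Hh = sqrtm(H); Hhi = inv(Hh)
    UN, PN = polar(np.eye(A.shape[0]) + Hhi @ S @ Hhi, side='left')
    W, Ph = polar(Hh @ sqrtm(PN), side='left')
    return Ph @ Ph, W @ UN @ W.conj().T        # P and U with A = P^{1/2} U P^{1/2}

def accretive_distance(A1, A2, phi=np.hypot, norm_ord='fro'):
    """Geodesic distance: combine the factor-wise distances with a product function."""
    P1, U1 = symmetric_polar(A1)
    P2, U2 = symmetric_polar(A2)
    P1hi = inv(sqrtm(P1))
    d_pos = np.linalg.norm(logm(P1hi @ P2 @ P1hi), norm_ord)  # positive definite factors
    d_uni = np.linalg.norm(logm(U1.conj().T @ U2), norm_ord)  # unitary factors
    return phi(d_pos, d_uni)

rng = np.random.default_rng(2)
def random_accretive(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Y = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T + np.eye(n) + (Y - Y.conj().T) / 2

A1, A2 = random_accretive(3), random_accretive(3)
T, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))  # random unitary
print(np.isclose(accretive_distance(T @ A1 @ T.conj().T, T @ A2 @ T.conj().T),
                 accretive_distance(A1, A2)))   # invariance under unitary congruence
```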
Next, we derive some properties of the geodesic distance in Theorem 4.4.
Proposition 4.5.
For matrices we have that
1)
2)
3) , and the geodesic midpoint between and is
4) for any we have that
Proof.
To prove the statements, first note that , and that . To prove 1), we observe that
For the positive definite part, we have that
and since the symmetric gauge function is invariant under sign changes the distance corresponding to the positive definite part is equal. Similarly, for the strictly accretive unitary part we have that , where the first equality follows from the fact that the absolute values of the angles of the eigenvalues of a unitary matrix are invariant under taking the conjugate transpose, and the last equality follows since the angles of the eigenvalues are invariant under unitary congruence. Statement 2) follows by a similar argument. To prove statement 3),
where the second equality follows by an argument similar to previous ones, and the third equality follows from property ii) for symmetric gauge functions and property ii) for product functions. The second equality in 3) now follows from 1), and that follows by a direct calculation using (4.2). Finally, to prove 4), simply note that , i.e., the same unitary congruence transformation applied to and individually. A direct calculation, using the unitary invariance of eigenvalues, the matrix logarithm, and the norms, gives the result. ∎
Remark 4.6.
Note that is in general not a complete metric space. In particular, in the Riemannian case, i.e., with symmetric gauge functions , , and product function , is not a complete metric space since it is not geodesically complete [31, Thm. 6.19]. The latter is due to the fact that is not geodesically complete; for geodesics are not defined for all since they will reach the boundary. For an example of a Cauchy sequence that does not converge to an element in , consider the sequence , where for all . In this case, the geodesic distance between and is given by
since . Thus is a Cauchy sequence; however, . Since the Hopf-Rinow theorem [31, Thm. 6.19] also carries over to Finsler geometry [9, Thm. 6.6.1], similar statements are true also in the general case.
Remark 4.7.
Using Remark 3.4, the above results can easily be generalized to the same subsets of sectorial matrices. In fact, a direct calculation shows that all the algebraic expressions in Theorem 4.4 still hold in this case. However, statements 1)-3) of Proposition 4.5 use the fact that if then . This is in general not true for other sets .
5 An application and some related results
By construction, on the Finsler manifold the question “given , which matrix is closest to ” has the answer “, where is the symmetric polar decomposition.” Similarly, the corresponding question “which matrix is closest to ” has the answer “”. In this section we consider an application of the distance on the Finsler manifold to another matrix approximation problem, namely finding the closest matrix of bounded log-rank, the definition of which is given in Section 5.1. Moreover, in Section 5.2 we consider the relation between the midpoint of geodesics on and the geometric mean of strictly accretive matrices as introduced in [16].
5.1 Closest matrix of bounded log-rank
For a positive definite matrix , we define the log-rank as the rank of the matrix logarithm of . This is equivalent to the number of eigenvalues of that are different from . We denote this . Analogously, for a unitary matrix the log-rank can be defined as the rank of the matrix logarithm of , which is equivalent to the number of eigenvalues of with phase different from . We denote this . For strictly accretive matrices we define the log-rank as follows.
Definition 5.1.
For with symmetric polar decomposition , we define the log-rank of as
We now consider the log-rank approximation problem: given find , the latter with log-rank bounded by , that is closest to in the geodesic distance . This can be formulated as the optimization problem
(5.1a)
subject to
(5.1b)
Let be the symmetric polar decomposition. By properties i)-iv) in Definition 4.2 of product functions , for each such function the distance is nondecreasing in each argument separately. Therefore, by the form of the geodesic distance (4.3) and the definition of log-rank on , it follows that (5.1) splits into two separate problems over and , namely
(5.2a)
subject to
(5.2b)
and
(5.3a)
subject to
(5.3b)
In fact, this gives the following theorem.
Theorem 5.2.
In [47, Thm. 3] it was shown that (5.3) always has an optimal solution, and that it is the same for all symmetric gauge functions . More precisely, the optimal solution is obtained from by setting the phases of with smallest absolute value equal to . That is, let be a diagonalization of where and where are the phases of ordered so that . Then
is the optimal solution to (5.3). In the same spirit, the optimal solution to (5.2) can be characterized as follows.
Proposition 5.3.
Let , and let be a diagonalization of where the eigenvalues are ordered so that . Then is a minimizer to (5.2) for all symmetric gauge functions .
Proof.
Clearly, and hence is feasible for (5.2). Next, . Moreover, since and hence is unitarily diagonalizable and has positive eigenvalues, we have that . Now, to show that is the minimizer to (5.2) for all symmetric gauge functions , it is equivalent to show that for all such that ; see, e.g., [18, Thm. 4], [36, Thm. 1], [22, Sec. 3.5], [34, Prop. 4.B.6], [45, Thm. 10.35].
To this end, using [34, Thm. 9.H.1.f] (or [10, Cor. III.4.6], [45, Thm. 10.30]) we have that (to see this, take and in [34, Thm. 9.H.1.f], and use the fact that the eigenvalues of are invariant under the similarity transform ), and by [10, Ex 11.3.5] we therefore have that . Moreover, by a direct calculation it can be verified that holds. Therefore, if we can show that
(5.4)
for all such that ,
we would have that for all such ,
and by transitivity of preorders the result follows. To show (5.4), we formulate the following equivalent optimization problem: let and consider
subject to
where is minimizing with respect to the preordering . The solution to the latter is to take for the elements of with largest absolute value. ∎
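As we read the two results above, both truncations are straightforward to implement: zero out all but the k largest phases (in absolute value) of the unitary factor for (5.3), and map all but the k eigenvalues of the positive definite factor with largest |log| to one for (5.2). A numerical sketch, with our own helper names:

```python
# Sketch of the two truncations (our reading of Prop. 5.3 and [47, Thm. 3]); names are ours.
import numpy as np

def truncate_positive(P, k):
    """Keep the k eigenvalues of P with largest |log|, set the remaining ones to 1."""
    lam, V = np.linalg.eigh(P)                    # P is Hermitian positive definite
    keep = np.argsort(-np.abs(np.log(lam)))[:k]
    lam_trunc = np.ones_like(lam)
    lam_trunc[keep] = lam[keep]
    return V @ np.diag(lam_trunc) @ V.conj().T

def truncate_unitary(U, k):
    """Keep the k phases of U with largest absolute value, set the remaining ones to 0."""
    w, V = np.linalg.eig(U)                       # U is unitary, hence diagonalizable
    theta = np.angle(w)
    keep = np.argsort(-np.abs(theta))[:k]
    theta_trunc = np.zeros_like(theta)
    theta_trunc[keep] = theta[keep]
    return V @ np.diag(np.exp(1j * theta_trunc)) @ np.linalg.inv(V)

P = np.diag([0.5, 1.1, 8.0])
U = np.diag(np.exp(1j * np.array([0.05, -1.2, 0.4])))
print(np.round(np.linalg.eigvalsh(truncate_positive(P, 1)), 3))          # [1. 1. 8.]
print(np.round(np.angle(np.linalg.eigvals(truncate_unitary(U, 1))), 3))  # only the phase -1.2 survives
```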
5.2 On the geometric mean for strictly accretive matrices
The geometric mean of strictly accretive matrices, denoted by , was introduced in [16] as a generalization of the geometric mean for positive definite matrices [11, Chp. 4 and 6], [37]. In particular, in [16] it was shown that for there is a unique solution to the equation . This solution is given explicitly as
which is also the same algebraic expression as for the geometric mean of positive definite matrices.
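A numerical sketch of this geometric mean (helper names are ours), computed with principal square roots and assuming, as in the positive definite case, that the defining equation in [16] is of Riccati type, X A^{-1} X = B:

```python
# Sketch of the geometric mean A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} for strictly
# accretive matrices; the Riccati-type check X A^{-1} X = B is our assumption on [16].
import numpy as np
from scipy.linalg import sqrtm, inv

def geometric_mean(A, B):
    """Geometric mean of two strictly accretive matrices via principal square roots."""
    Ah = sqrtm(A)
    Ahi = inv(Ah)
    return Ah @ sqrtm(Ahi @ B @ Ahi) @ Ah

rng = np.random.default_rng(3)
def random_accretive(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Y = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T + np.eye(n) + (Y - Y.conj().T) / 2

A, B = random_accretive(3), random_accretive(3)
G = geometric_mean(A, B)
print(np.allclose(G @ inv(A) @ G, B))                  # G solves X A^{-1} X = B
print(np.all(np.linalg.eigvalsh(G + G.conj().T) > 0))  # G is again strictly accretive
```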
The geometric mean for positive definite matrices can also be interpreted as the midpoint on the geodesic connecting the matrices [11, Sec. 6.1.7]. With the Finsler geometry , we can therefore get an alternative definition of the geometric mean between strictly accretive matrices as the geodesic midpoint. However, for we in general have that . This can be seen by the following simple example.
Example 5.4.
Let and let . Then we have that and . Thus, in general . In fact, equality holds in this case if and only if and commute.
Instead, the midpoint of the geodesic between two matrices can be expressed using the geometric mean as
(5.5)
which follows directly from (4.2). Using this representation, we can characterize when . In order to do so, we first need the following two auxiliary results.
Lemma 5.5.
Let , and let be its polar decomposition, where and . Then is normal if and only if is the symmetric polar decomposition of .
Proof.
First, using [21, Lem. 9] we conclude that since is sectorial, is also sectorial. Now, is normal if and only if and commute [45, Thm. 9.1], which is true if and only if and commute. Hence is normal if and only if , and by the existence and uniqueness of the symmetric polar decomposition the result follows. ∎
Lemma 5.6.
Let and . For all , the unique strictly accretive solution to
is .
Proof.
That solves the equations is easily verified by simply plugging it in. Moreover, that is unique follows from the uniqueness of the geometric mean for strictly accretive matrices [16, Sec. 3] and the fact that for any we have that . ∎
Proposition 5.7.
For , let and be the corresponding symmetric polar decompositions. We have that if one of the following holds:
i) ,
ii) ,
iii) and are commuting normal matrices.
Proof.
Using (5.5), the first statement follows immediately. To prove the second statement, let and . From (5.5) it therefore follows that . Using Lemma 5.6 with and , we therefore have that
To prove the third statement, by Lemma 5.5 we have that , , and , commute. Moreover, since commuting normal matrices are simultaneously unitarily diagonalizable [23, Thm. 2.5.5], and since a unitary diagonalization is unique up to permutation of the eigenvalues and eigenvectors, it follows that and all commute. Using this together with (5.5), a direct calculation gives the result. ∎
As noted in the above proof, if and are normal and commute they are also simultaneously unitarily diagonalizable [23, Thm. 2.5.5], i.e., and for some . In this case, using Lemma 5.6 we have that , and the geometric mean between and can thus be interpreted as the (independent) geometric mean between the corresponding pairs of eigenvalues. In fact, the latter observation can be generalized to all pairs of matrices that can be simultaneously diagonalized by congruence, although the elements of the diagonal matrices are not necessarily eigenvalues in this case (cf. Proposition 2.1).
Proposition 5.8.
Let and assume that and , where and where and are diagonal matrices. Then .
Proof.
Let and assume that and , where and where and are diagonal matrices. A direct application of Lemma 5.6, with and , gives the result. ∎
As a final remark, note that if and in Proposition 5.8 are unitary, then and are sectorial decompositions of and , respectively. Now, let be the polar decomposition of , with and . Hence we have that and , i.e., is the positive definite part and and are the strictly accretive unitary parts in the symmetric polar decompositions of and , respectively. From Proposition 5.7.ii) we therefore have that in this case.
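To illustrate this subsection numerically, the sketch below (all helper names are ours) computes the geodesic midpoint by combining the factor-wise midpoints, assuming the product form of the geodesic in (4.2), and compares it with the geometric mean of [16]: for generic strictly accretive matrices the two differ, while for positive definite matrices both reduce to the classical geometric mean.

```python
# Sketch comparing the geodesic midpoint (product form assumed from (4.2)) with the
# geometric mean of [16]; helper names are ours.
import numpy as np
from scipy.linalg import sqrtm, inv, logm, expm, polar

def symmetric_polar(A):
    H, S = (A + A.conj().T) / 2, (A - A.conj().T) / 2
    Hh = sqrtm(H); Hhi = inv(Hh)
    UN, PN = polar(np.eye(A.shape[0]) + Hhi @ S @ Hhi, side='left')
    W, Ph = polar(Hh @ sqrtm(PN), side='left')
    return Ph @ Ph, W @ UN @ W.conj().T

def geometric_mean(A, B):
    Ah = sqrtm(A); Ahi = inv(Ah)
    return Ah @ sqrtm(Ahi @ B @ Ahi) @ Ah

def geodesic_midpoint(A1, A2):
    """Geodesic midpoint: geometric mean of the positive parts, unitary midpoint of the unitary parts."""
    P1, U1 = symmetric_polar(A1)
    P2, U2 = symmetric_polar(A2)
    Pm = geometric_mean(P1, P2)
    Um = U1 @ expm(0.5 * logm(U1.conj().T @ U2))
    Pmh = sqrtm(Pm)
    return Pmh @ Um @ Pmh

rng = np.random.default_rng(4)
def random_accretive(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Y = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T + np.eye(n) + (Y - Y.conj().T) / 2

A1, A2 = random_accretive(3), random_accretive(3)
print(np.allclose(geodesic_midpoint(A1, A2), geometric_mean(A1, A2)))   # generally False
P1, P2 = (A1 + A1.conj().T) / 2, (A2 + A2.conj().T) / 2
print(np.allclose(geodesic_midpoint(P1, P2), geometric_mean(P1, P2)))   # True for positive definite matrices
```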
6 Conclusions
In this work we have shown that the set of strictly accretive matrices is a smooth manifold that is diffeomorphic to a direct product of the smooth manifold of positive definite matrices and the smooth manifold of strictly accretive unitary matrices. Using this decomposition, we introduced a family of Finsler metrics and studied their geodesics and geodesic distances. Finally, we considered the matrix approximation problem of finding the closest strictly accretive matrix of bounded log-rank, and also discussed the relation between the geodesic midpoint and the previously introduced geometric mean between accretive matrices.
There are several interesting ways in which these results can be extended. For example, in the case of positive definite matrices the geometric framework offered by the Riemannian manifold construction gives yet another interpretation of the geometric mean. In fact, the geometric mean between two positive definite matrices and is also the (unique) solution to the variational problem [11, Sec. 6.2.8], [37, Prop. 3.5], and this interpretation can be used to extend the geometric mean to a mean between several matrices [37]. In a similar way, a geometric mean between the strictly accretive matrices can be defined as the solution to
however such a generalization would need more investigation. For example, even in the case of the Riemannian metric on the manifold of positive definite matrices, analytically computing the geometric mean between several matrices is nontrivial [37, Prop. 3.4]. Nevertheless, there are efficient numerical algorithms for solving the latter problem, see, e.g., the survey [25] or the monograph [1] and references therein.
The idea of this work was to introduce a metric that separates the “magnitudes” and the “phases” of strictly accretive matrices. However, the similarities between the manifold of positive definite matrices and the manifold of unitary matrices raises a question about another potential geometry on that does not explicitly use the product structure. More precisely, note that since all strictly accretive matrices have a unique, strictly accretive square root, the inner product on the tangent space , given by (2.4), can be defined analogously to the one on , given by (2.1), namely as
for and . Based on the similarities between the inner products, and the corresponding geodesics and geodesic distances, we ask the following question: for and , if we define the inner product on as
what is the form of the geodesics and the geodesic distance?
Acknowledgments
The authors would like to thank Wei Chen, Dan Wang, Xin Mao, Di Zhao, and Chao Chen for valuable discussions.
Appendix A Technical results from Section 3
Lemma A.1.
is an open set in .
Proof.
Lemma A.2.
is connected.
Proof.
By [30, Prop. 1.11], is connected if and only if it is path-connected. To show the latter, it suffices to show that any is path-connected to . To this end, let be the sectorial decomposition of . A piece-wise smooth path connecting and is given by
and hence is connected. ∎
Lemma A.3.
The tangent space at an is given by .
Proof.
This follows since is an open subset of . ∎
Lemma A.4.
is a connected smooth manifold and at a point the tangent space is .
Proof.
Lemma A.5 (Cf. [29, Prop. VII.2.5]).
For , let where and be the polar decomposition of . The mapping is a diffeomorphism between the manifolds and .
Proof.
First, since and are smooth manifolds so is [30, Ex. 1.34]. Next, for each the polar decomposition is unique [23, Thm. 7.3.1], [45, Prob. 3.2.20], and for each pair of matrices we have that ; thus the mapping is bijective. Now, is smooth in and since it is polynomial in the coefficients, i.e., the inverse mapping is smooth. To prove the converse, note that the components in the polar decomposition are given by and , cf. [23, p. 449], [45, p. 288]. Since , , and the matrix square root is a smooth function on , cf. [23, Thm. 7.2.6]. Therefore, since both and are compositions of smooth functions of , they both depend smoothly on the components of . ∎
References
- [1] Pierre-Antoine Absil, Robert Mahony, and Rodolphe Sepulchre. Optimization algorithms on matrix manifolds. Princeton University Press, Princeton, NJ, 2008.
- [2] Shun-ichi Amari and Hiroshi Nagaoka. Methods of information geometry. American Mathematical Society, Providence, RI, 2000.
- [3] Esteban Andruchow. Short geodesics of unitaries in the metric. Canadian Mathematical Bulletin, 48(3):340–354, 2005.
- [4] Esteban Andruchow, Gabriel Larotonda, and Lázaro A. Recht. Finsler geometry and actions of the -Schatten unitary groups. Transactions of the American Mathematical Society, 362(1):319–344, 2010.
- [5] Esteban Andruchow and Lázaro A. Recht. Geometry of unitaries in a finite algebra: variation formulas and convexity. International Journal of Mathematics, 19(10):1223–1246, 2008.
- [6] Jorge Antezana, Eduardo Ghiglioni, and Demetrio Stojanoff. Minimal curves in and with respect to the spectral and the trace norms. Journal of Mathematical Analysis and Applications, 483(2):123632, 2020.
- [7] Jorge Antezana, Gabriel Larotonda, and Alejandro Varela. Optimal paths for symmetric actions in the unitary group. Communications in Mathematical Physics, 328(2):481–497, 2014.
- [8] Charles S. Ballantine and Charles R. Johnson. Accretive matrix products. Linear and Multilinear Algebra, 3(3):169–185, 1975.
- [9] David Bao, Shiing-Shen Chern, and Zhongmin Shen. An introduction to Riemann-Finsler geometry. Springer, New York, NY, 2000.
- [10] Rajendra Bhatia. Matrix analysis. Springer, New York, NY, 1997.
- [11] Rajendra Bhatia. Positive definite matrices. Princeton University Press, Princeton, NJ, 2007.
- [12] Wei Chen, Dan Wang, Sei Zhen Khong, and Li Qiu. Phase analysis of MIMO LTI systems. In 2019 IEEE 58th Conference on Decision and Control (CDC), pages 6062–6067. IEEE, 2019.
- [13] Xinyue Cheng and Zhongmin Shen. Finsler geometry. Springer, Berlin, Heidelberg, 2012.
- [14] Shiing-Shen Chern. Finsler geometry is just Riemannian geometry without the quadratic restriction. Notices of the American Mathematical Society, 43(9):959–963, 1996.
- [15] Charles R. DePrima and Charles R. Johnson. The range of in GL(n, C). Linear Algebra and its Applications, 9:209–222, 1974.
- [16] Stephen Drury. Principal powers of matrices with positive definite real part. Linear and Multilinear Algebra, 63(2):296–301, 2015.
- [17] Stephen Drury and Minghua Lin. Singular value inequalities for matrices with numerical ranges in a sector. Operators and Matrices, 8(4):1143–1148, 2014.
- [18] Ky Fan. Maximum properties and inequalities for the eigenvalues of completely continuous operators. Proceedings of the National Academy of Sciences of the United States of America, 37(11):760, 1951.
- [19] Susana Furtado and Charles R. Johnson. Spectral variation under congruence. Linear and Multilinear Algebra, 49(3):243–259, 2001.
- [20] Karl E. Gustafson and Duggirala K.M. Rao. Numerical range. Springer, New York, NY, 1997.
- [21] Alfred Horn and Robert Steinberg. Eigenvalues of the unitary part of a matrix. Pacific Journal of Mathematics, 9(2):541–550, 1959.
- [22] Roger A. Horn and Charles R. Johnson. Topics in matrix analysis. Cambridge University Press, New York, NY, 1994.
- [23] Roger A. Horn and Charles R. Johnson. Matrix analysis. Cambridge University Press, New York, NY, 2013.
- [24] Roger A. Horn and Vladimir V. Sergeichuk. Canonical forms for complex matrix congruence and ∗congruence. Linear algebra and its applications, 416(2-3):1010–1032, 2006.
- [25] Ben Jeuris, Raf Vandebril, and Bart Vandereycken. A survey and comparison of contemporary algorithms for computing the matrix geometric mean. Electronic Transactions on Numerical Analysis, 39:379–402, 2012.
- [26] Charles R. Johnson and Susana Furtado. A generalization of Sylvester’s law of inertia. Linear Algebra and its Applications, 338(1-3):287–290, 2001.
- [27] Jürgen Jost. Riemannian geometry and geometric analysis. Springer, Berlin, Heidelberg, 2008.
- [28] Tosio Kato. Perturbation theory for linear operators. Springer, Berlin, Heidelberg, 1995.
- [29] Serge Lang. Fundamentals of differential geometry. Springer, New York, NY, 1999.
- [30] John M. Lee. Introduction to Smooth Manifolds. Springer, New York, NY, 2013.
- [31] John M. Lee. Introduction to Riemannian Manifolds. Springer, Cham, 2018.
- [32] Adrian S. Lewis. Group invariance and convex matrix analysis. SIAM Journal on Matrix Analysis and Applications, 17(4):927–949, 1996.
- [33] Chi-Kwong Li and Nung-Sing Sze. Determinantal and eigenvalue inequalities for matrices with numerical ranges in a sector. Journal of Mathematical Analysis and Applications, 410(1):487–491, 2014.
- [34] Albert W Marshall, Ingram Olkin, and Barry C Arnold. Inequalities: theory of majorization and its applications. Springer, New York, NY, 2nd edition, 2011.
- [35] Roy Mathias. Matrices with positive definite Hermitian part: Inequalities and linear systems. SIAM journal on matrix analysis and applications, 13(2):640–654, 1992.
- [36] Leon Mirsky. Symmetric gauge functions and unitarily invariant norms. The quarterly journal of mathematics, 11(1):50–59, 1960.
- [37] Maher Moakher. A differential geometric approach to the geometric mean of symmetric positive-definite matrices. SIAM Journal on Matrix Analysis and Applications, 26(3):735–747, 2005.
- [38] Richard M. Murray, Zexiang Li, and S. Shankar Sastry. A mathematical introduction to robotic manipulation. CRC press, Boca Raton, FL, 1994.
- [39] Tsutomu Okada. Minkowskian product of Finsler spaces and Berwald connection. Journal of Mathematics of Kyoto University, 22(2):323–332, 1982.
- [40] Barrett O’Neill. Semi-Riemannian geometry with applications to relativity. Academic press, San Diego, CA, 1983.
- [41] Karl Johan Åström and Richard M. Murray. Feedback systems. Princeton University Press, Princeton, NJ, 2008.
- [42] Hideo Shimada and Vasile Sorin Sabău. Finsler geometry. In P.L. Antonelli, editor, Finslerian Geometries: A Meeting of Minds, pages 15–24. Springer, Dordrecht, 2000.
- [43] Robert C. Thompson. On the eigenvalues of a product of unitary matrices I. Linear and Multilinear Algebra, 2(1):13–24, 1974.
- [44] Dan Wang, Wei Chen, Sei Zhen Khong, and Li Qiu. On the phases of a complex matrix. Linear Algebra and its Applications, 593:152–179, 2020.
- [45] Fuzhen Zhang. Matrix theory: basic results and techniques. Springer, New York, NY, 2011.
- [46] Fuzhen Zhang. A matrix decomposition and its applications. Linear and Multilinear Algebra, 63(10):2033–2042, 2015.
- [47] Di Zhao, Axel Ringh, Li Qiu, and Sei Zhen Khong. Low phase-rank approximation. Submitted, 2020.