
Linear Principal Minor Polynomials: hyperbolic determinantal inequalities and spectral containment

Grigoriy Blekherman School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia [email protected] Mario Kummer Technische Universität Dresden, Germany [email protected] Raman Sanyal Institut für Mathematik, Goethe-Universität Frankfurt, Germany [email protected] Kevin Shu School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia [email protected]  and  Shengding Sun School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia [email protected]
Abstract.

A linear principal minor polynomial or lpm polynomial is a linear combination of principal minors of a symmetric matrix. By restricting to the diagonal, lpm polynomials are in bijection with multiaffine polynomials. We show that this establishes a one-to-one correspondence between homogeneous multiaffine stable polynomials and PSD-stable lpm polynomials. This yields new construction techniques for hyperbolic polynomials and allows us to generalize the well-known Fischer–Hadamard and Koteljanskii inequalities from determinants to PSD-stable lpm polynomials. We investigate the relationship between the associated hyperbolicity cones and conjecture a relationship between the eigenvalues of a symmetric matrix and the values of certain lpm polynomials evaluated at that matrix. We refer to this relationship as spectral containment.

We would like to thank Santanu Dey for many insightful conversations. Grigoriy Blekherman, Kevin Shu and Shengding Sun were partially supported by NSF grant DMS-1901950. Mario Kummer was partially supported by DFG grant 421473641.

1. Introduction

A homogeneous polynomial p[x]:=[x1,,xn]p\in\mathbb{R}[{x}]:=\mathbb{R}[x_{1},\dots,x_{n}] is called hyperbolic with respect to an{a}\in\mathbb{R}^{n} if p(a)0p({a})\neq 0 and pa(t):=p(vta)[t]p_{{a}}(t):=p({v}-t{a})\in\mathbb{R}[t] has only real roots for all vn{v}\in\mathbb{R}^{n}. The hyperbolicity cone Ha(p)H_{a}(p) of a polynomial pp hyperbolic with respect to an{a}\in\mathbb{R}^{n} is the set of all vn{v}\in\mathbb{R}^{n} such that p(vta)p({v}-t{a}) has only nonnegative roots. Originally conceived in the context of partial differential equations [10], hyperbolic polynomials were discovered to yield deep results in (non-)linear algebra, combinatorics, and optimization; see, for example, [1, 2, 3, 24, 26, 29].

A fundamental family of hyperbolic polynomials is given by the elementary symmetric polynomials

ek(x):=JiJxi,e_{k}({x})\ :=\ \sum_{J}\prod_{i\in J}x_{i}\,,

where JJ ranges over all kk-element subsets of [n]:={1,,n}[n]:=\{1,\dots,n\}. The elementary symmetric polynomials are stable: a multivariate polynomial p[x]p\in\mathbb{R}[{x}] is stable if for all complex numbers z1,,znz_{1},\ldots,z_{n} lying in the open upper half-plane, we have p(z1,,zn)0p(z_{1},\ldots,z_{n})\neq 0. If pp is homogeneous, then it is stable if and only if it is hyperbolic with respect to all a>0n{a}\in\mathbb{R}^{n}_{>0}, and we denote by H(p)=H𝟏(p)H(p)=H_{\mathbf{1}}(p) its hyperbolicity cone with respect to the vector 𝟏=(1,,1)\mathbf{1}=(1,\dots,1).

Let X denote an n\times n matrix of indeterminates, and for any J\subseteq[n], we let X_{J} denote the principal submatrix of X indexed by J. We can then define a polynomial

Ek(X):=Jdet(XJ),E_{k}({X})\ :=\ \sum_{J}\det({X}_{J})\,,

where again JJ ranges over all kk-element subsets of [n][n]. It turns out that these polynomials do not vanish on the Siegel upper half-plane, i.e., the set of all complex symmetric matrices with positive definite imaginary part. Such polynomials are called Dirichlet–Gårding [13] or PSD-stable [16]. For a homogeneous polynomial PP this property is equivalent to being hyperbolic with respect to any positive definite matrix, and we denote by H(P)H(P) its hyperbolicity cone (taken with respect to the identity matrix). When the context is clear, we will simply refer to PSD-stable polynomials P(X)P({X}) as stable polynomials.

The starting point of our paper is the observation that E_{k}(X) is closely related to e_{k}(x). For instance, if X=\operatorname{Diag}(x_{1},\dots,x_{n}) is the diagonal matrix with diagonal entries X_{ii}=x_{i}, then E_{k}(X)=e_{k}(x_{1},\dots,x_{n}). To generalize this observation, let \mathbb{R}^{n\times n}_{\mathrm{sym}} be the vector space of real symmetric n\times n matrices and let \mathbb{R}[X] be the ring of polynomials on it, where we regard X as an n\times n matrix of indeterminates. A polynomial P(X)\in\mathbb{R}[X] is called a linear principal minor polynomial or lpm polynomial if P(X) is of the form

P(X)=JcJdet(XJ),P({X})\ =\ \sum_{J}c_{J}\det({X}_{J})\,,

where JJ ranges over all subsets of [n][n]. The first natural question we pursue is what interesting properties are shared by a homogeneous lpm polynomial P(X)P({X}) and its diagonal restriction p(x)p({x}). We show that P(X)P({X}) is PSD-stable if and only if p(x)p({x}) is stable. We obtain a similar result for the related concept of Lorentzian polynomials. We prove these facts using the theory of stability preservers [5].

Having established these basic facts, we generalize classical determinantal inequalities from linear algebra, such as the Fischer–Hadamard and Koteljanskii inequalities, to the setting of stable lpm polynomials. This generalizes the Hadamard-type inequalities for k-positive matrices obtained in [21]. Another interesting consequence of the above results is that they yield a construction of a new class of hyperbolic polynomials. Using lpm polynomials we construct a hyperbolic cubic in 6 variables which has a Rayleigh difference that is not a sum of squares. The previously smallest known example, with 43 variables, was constructed by Saunderson in [27]. Finally, we study whether the eigenvalue vector \lambda of a matrix X lying in the hyperbolicity cone of a stable lpm polynomial P(X) lies in the hyperbolicity cone of p(x), and show how this is related to a potential generalization of the classical Schur–Horn theorem [28, 14]. We now discuss our results in detail.

2. Our results in detail

Our discussion of lpm polynomials can also be viewed from a different perspective. A polynomial p\in\mathbb{R}[x]:=\mathbb{R}[x_{1},\dots,x_{n}] is multiaffine if it is a linear combination of square-free monomials x^{J}=\prod_{j\in J}x_{j} for J\subseteq[n]. We define a linear map \Phi from the vector space of multiaffine polynomials in x_{1},\dots,x_{n} to the vector space of lpm polynomials, which we call the minor lift map, as follows. The minor lift of

p(x)=J[n]aJiJxi,p({x})=\sum_{J\subseteq[n]}a_{J}\prod_{i\in J}x_{i},

is the polynomial P=Φ(p)P=\Phi(p) given by

P(X)=J[n]aJdet(XJ).P(X)=\sum_{J\subseteq[n]}a_{J}\det({X}_{J}).

We note that deg(Φ(p))=deg(p)\deg(\Phi(p))=\deg(p) and that Φ(p)\Phi(p) is homogeneous if and only if pp is homogeneous. When it is unambiguous, we will use lower case letters such as pp to denote homogeneous, multiaffine p[x1,,xn]p\in\mathbb{R}[x_{1},\dots,x_{n}], and use the corresponding upper case letters for the minor lift, so that PP is equal to Φ(p)\Phi(p).
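For concreteness, the minor lift map is straightforward to compute. The following minimal sketch (ours, assuming Python with sympy; the helper names are hypothetical) encodes a multiaffine polynomial by its coefficients a_J and returns \Phi(p) as a polynomial in the entries of a symmetric matrix of indeterminates, checking on e_{2} that the diagonal restriction recovers p.

```python
# Minimal sketch of the minor lift map (our illustration, not from the paper).
# A multiaffine p = sum_J a_J prod_{i in J} x_i is encoded as a dict mapping
# frozensets J (subsets of range(n)) to the coefficients a_J.
import itertools
import sympy as sp

def symmetric_matrix(n):
    """Symmetric n x n matrix of indeterminates with X[i,j] = X[j,i]."""
    return sp.Matrix(n, n, lambda i, j: sp.Symbol(f"X_{min(i,j)}{max(i,j)}"))

def minor_lift(coeffs, n):
    """Phi(p) = sum_J a_J det(X_J), summed over the support of p."""
    X = symmetric_matrix(n)
    return sum(a * X[list(J), list(J)].det() for J, a in coeffs.items()), X

# Example: p = e_2(x_1, x_2, x_3); its minor lift is E_2(X).
n = 3
e2 = {frozenset(J): 1 for J in itertools.combinations(range(n), 2)}
P, X = minor_lift(e2, n)

# Restricting P to diagonal matrices recovers p, as stated above.
x = sp.symbols("x0:3")
on_diagonal = {X[i, j]: (x[i] if i == j else 0)
               for i in range(n) for j in range(i, n)}
assert sp.expand(P.subs(on_diagonal)) == sp.expand(
    x[0]*x[1] + x[0]*x[2] + x[1]*x[2])
```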

2.1. Properties of the minor lift map and constructions

Our first result is that the minor lift map sends stable polynomials to PSD-stable polynomials. Even stronger, let us call a matrix A k-locally PSD if every principal k\times k submatrix A_{J} of A is positive semidefinite. The collection \mathrm{PSD}_{k} of k-locally PSD matrices is a closed convex cone and \mathrm{PSD}_{n}\subset\mathrm{PSD}_{n-1}\subset\cdots\subset\mathrm{PSD}_{1}.

Theorem 2.1.

Let pp be a homogeneous multiaffine polynomial of degree kk. If pp is stable, then P=Φ(p)P=\Phi(p) is hyperbolic with PSDkH(P)\mathrm{PSD}_{k}\subseteq H(P). In particular, PP is PSD-stable.

For Asymn×nA\in\mathbb{R}^{n\times n}_{\mathrm{sym}}, let π(A)=(A11,A22,,Ann)\pi(A)=(A_{11},A_{22},\dots,A_{nn}) be the projection to the diagonal. A first implication for the associated hyperbolicity cones is as follows.

Corollary 2.2.

Let pp be a homogeneous multiaffine stable polynomial and P=Φ(p)P=\Phi(p). If AH(P)A\in H(P), then p(π(A))P(A)p(\pi(A))\geq P(A) and π(A)H(p)\pi(A)\in H(p).

Using Theorem 2.1, we are able to construct new interesting hyperbolic polynomials. Given a hyperbolic polynomial pp and points a,v{a},{v} in the hyperbolicity cone of pp, the Rayleigh difference Δv,a(p)=DvpDappDvDap\Delta_{{v},{a}}(p)=D_{{v}}p\cdot D_{{a}}p-p\cdot D_{{v}}D_{{a}}p is a polynomial nonnegative on n\mathbb{R}^{n} [19]. If the polynomial Δv,a(p)=DvpDappDvDap\Delta_{{v},{a}}(p)=D_{{v}}p\cdot D_{{a}}p-p\cdot D_{{v}}D_{{a}}p is not a sum of squares, this has interesting implications for determinantal representations as well as a hyperbolic certificate of nonnegativity of Δv,a(p)\Delta_{{v},{a}}(p) which cannot be recovered by sums of squares. Saunderson [27] characterized all pairs (d,n)(d,n) for which there exists such a hyperbolic polynomial p[x1,,xn]p\in\mathbb{R}[x_{1},\ldots,x_{n}] of degree dd, except when d=3d=3, where the smallest known example with a Rayleigh difference that is not a sum of squares depends on 43 variables. We are able to reduce the number of variables to 6. See Section 6 for more details.

Theorem 2.3.

There exists an (explicit) degree-3 hyperbolic polynomial pp in 66 variables and vectors v,aH(p){v},{a}\in H(p) such that the Rayleigh difference Δv,a(p)\Delta_{{v},{a}}(p) is not a sum-of-squares.

2.2. Hyperbolic determinantal inequalities

We generalize some well known theorems from linear algebra to the setting of lpm polynomials. Note that the cone of positive semidefinite matrices is precisely the hyperbolicity cone of det(X)\det({X}), which is the minor lift of en(x)=x1xne_{n}(x)=x_{1}\cdots x_{n}. For our generalizations, we replace the determinant by the minor lift of a homogeneous multiaffine stable polynomial, and the cone of positive semidefinite matrices by the hyperbolicity cone of the minor lift.

Hadamard’s inequality is a classical result comparing the determinant of any positive semidefinite matrix with the product of its diagonal entries.

Theorem (Hadamard’s inequality).

Let A be an n\times n positive semidefinite matrix. Then \det(A)\leq\prod_{i=1}^{n}A_{ii}.

An equivalent statement of this inequality is as follows: if V is any, not necessarily symmetric, real n\times n matrix with columns v_{1},\dots,v_{n}, then |\det(V)|\leq\prod_{i=1}^{n}\|v_{i}\|_{2}. This yields a geometric interpretation, since the absolute value of the determinant is the volume of the n-dimensional parallelepiped with edges v_{1},\dots,v_{n}.

Fischer's inequality generalizes Hadamard's inequality, and relates the determinant of a positive semidefinite matrix to its principal minors. Let \Pi=\{S_{1},\dots,S_{m}\} be a partition of the set [n] into m disjoint subsets. Given such a partition, we write i\sim j if i,j\in S_{k} for some k=1,\dots,m. Let \mathcal{D}_{\Pi} be the vector space of symmetric matrices that are block diagonal with respect to \Pi:

𝒟Π={Asymn×n:Aij=0 if i≁j}.\mathcal{D}_{\Pi}=\{A\in\mathbb{R}^{n\times n}_{\mathrm{sym}}:A_{ij}=0\textnormal{ if }i\not\sim j\}.

Let πΠ\pi_{\Pi} be the orthogonal projection from symn×n\mathbb{R}^{n\times n}_{\mathrm{sym}} onto the subspace 𝒟Π\mathcal{D}_{\Pi}.

Theorem (Fischer’s inequality).

Let AA be a positive semidefinite matrix. Then

det(πΠ(A))det(A).\det(\pi_{\Pi}(A))\ \geq\ \det(A)\,.

Observe that Hadamard's inequality is simply Fischer's inequality with the partition \Pi=\{\{1\},\dots,\{n\}\}. We now give a hyperbolic generalization of the Fischer–Hadamard inequality. For P=\Phi(e_{k}), our hyperbolic Hadamard inequality was obtained in [21].

Theorem 2.4 (Hyperbolic Fischer–Hadamard inequality).

Let PP be a homogeneous PSD-stable lpm-polynomial and Π\Pi a partition. Then

P(πΠ(A))P(A)P(\pi_{\Pi}(A))\ \geq\ P(A)

holds for all AH(P)A\in H(P).
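As a quick numerical sanity check (ours, not part of the argument; assuming numpy), take P=E_{2}=\Phi(e_{2}), which is PSD-stable by Theorem 2.1, sample a random positive semidefinite A (so A\in H(P)), and compare P(A) with P(\pi_{\Pi}(A)); the singleton partition recovers Corollary 2.2.

```python
# Numerical sanity check of the hyperbolic Fischer-Hadamard inequality
# for P = E_2 (our illustration).
import itertools
import numpy as np

def E(k, A):
    """E_k(A): sum of all k x k principal minors of A."""
    return sum(np.linalg.det(A[np.ix_(J, J)])
               for J in itertools.combinations(range(A.shape[0]), k))

def project_blocks(A, partition):
    """pi_Pi(A): zero out the entries A_ij with i and j in different blocks."""
    B = np.zeros_like(A)
    for S in partition:
        S = sorted(S)
        B[np.ix_(S, S)] = A[np.ix_(S, S)]
    return B

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
A = G @ G.T                              # random PSD matrix, hence A in H(E_2)

Pi = [{0, 1}, {2, 3}]                    # a partition of {0, 1, 2, 3}
assert E(2, project_blocks(A, Pi)) >= E(2, A) - 1e-9

# Singleton partition: Corollary 2.2, e_2(diag(A)) >= E_2(A).
assert E(2, project_blocks(A, [{i} for i in range(4)])) >= E(2, A) - 1e-9
```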

The classical Fischer–Hadamard inequality is a consequence of a more general inequality known as Koteljanskii’s inequality, which handles the case of overlapping blocks [17].

Theorem (Koteljanskii's inequality).

Let SS and TT be two subsets of [n][n] and AA be a positive semidefinite n×nn\times n matrix. Then

det(AS)det(AT)det(AST)det(AST).\det(A_{S})\det(A_{T})\ \geq\ \det(A_{S\cup T})\det(A_{S\cap T})\,.

While we were not able to generalize Koteljanskii’s inequality in a way that implies the hyperbolic Fischer–Hadamard inequality, we found a hyperbolic generalization of Koteljanskii’s inequality, which uses a different interpretation of what it means to take a minor of a matrix.

Definition 2.5.

Given a degree kk homogeneous lpm polynomial PP and T[n]T\subseteq[n] with |T|nk|T|\geq n-k, we define the restriction

P|T:=(i[n]TXii)P,P|_{T}\ :=\ \Bigl{(}\prod_{i\in[n]\setminus T}\frac{\partial}{\partial X_{ii}}\Bigr{)}P\,,

where we take partial derivatives with respect to the diagonal variables not in T.

With this definition we can state the hyperbolic Koteljanskii inequality, which is related to the negative lattice condition in [6]:

Theorem 2.6 (Hyperbolic Koteljanskii inequality).

Let PP be a homogeneous PSD-stable lpm-polynomial and S,T[n]S,T\subseteq[n]. Then

P|S(A)P|T(A)P|ST(A)P|ST(A)P|_{S}(A)P|_{T}(A)\ \geq\ P|_{S\cup T}(A)P|_{S\cap T}(A)

holds for all AH(P)A\in H(P).
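Definition 2.5 and Theorem 2.6 can likewise be tested numerically. The sketch below (ours, assuming Python with sympy and numpy) forms P=E_{3} for n=4, computes the restrictions P|_{T} by differentiating with respect to the diagonal variables, and verifies the inequality at a random positive semidefinite matrix, which lies in H(P) by Theorem 2.1.

```python
# Numerical check of the hyperbolic Koteljanskii inequality for P = E_3, n = 4
# (our illustration).
import itertools
import numpy as np
import sympy as sp

n, k = 4, 3
X = sp.Matrix(n, n, lambda i, j: sp.Symbol(f"X_{min(i,j)}{max(i,j)}"))
P = sum(X[list(J), list(J)].det() for J in itertools.combinations(range(n), k))

def restrict(P, T):
    """P|_T: differentiate P by the diagonal variables X_ii with i not in T."""
    for i in set(range(n)) - set(T):
        P = sp.diff(P, X[i, i])
    return P

rng = np.random.default_rng(1)
G = rng.standard_normal((n, n))
A = G @ G.T                                       # PSD, hence A in H(P)
at_A = {X[i, j]: A[i, j] for i in range(n) for j in range(i, n)}
val = lambda Q: float(Q.subs(at_A))

S, T = {0, 1, 2}, {1, 2, 3}                       # |S n T| = 2 >= n - k = 1
lhs = val(restrict(P, S)) * val(restrict(P, T))
rhs = val(restrict(P, S | T)) * val(restrict(P, S & T))
assert lhs >= rhs - 1e-9
```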

2.3. Spectral containment property

If A is an n\times n symmetric matrix, we say that \lambda\in\mathbb{R}^{n} is an eigenvalue vector of A if the entries of \lambda are precisely the eigenvalues of A with appropriate multiplicities. Note that the set of eigenvalue vectors of a symmetric matrix A is invariant under permutations.

We recall the example of the kk-th elementary symmetric polynomial ek(x)e_{k}({x}) and its minor lift Ek(X)E_{k}({X}) from the introduction. It is well-known that Ek(A)=ek(λ)E_{k}(A)=e_{k}(\lambda) where λ\lambda is an eigenvalue vector of AA. In particular, it follows that AH(P)A\in H(P) implies that λH(p)\lambda\in H(p). Notice that since eke_{k} is invariant under permutations of coordinates, the order in which we list the eigenvalues of AA in λ(A)\lambda(A) does not matter. This motivates the following definition.
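This identity is easy to confirm numerically, as in the following short sketch (ours, assuming numpy); both sides are, up to sign, coefficients of the characteristic polynomial of A.

```python
# Check E_k(A) = e_k(lambda) for an eigenvalue vector lambda of A
# (our illustration).
import itertools
import numpy as np

def E(k, A):
    """Sum of all k x k principal minors of A."""
    return sum(np.linalg.det(A[np.ix_(J, J)])
               for J in itertools.combinations(range(A.shape[0]), k))

def e(k, lam):
    """Elementary symmetric polynomial e_k evaluated at lam."""
    return sum(np.prod([lam[i] for i in J])
               for J in itertools.combinations(range(len(lam)), k))

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)); A = (A + A.T) / 2   # random symmetric matrix
lam = np.linalg.eigvalsh(A)
for k in range(1, 6):
    assert np.isclose(E(k, A), e(k, lam))
```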

Definition 2.7.

A homogeneous multiaffine stable polynomial p[x1,,xn]p\in\mathbb{R}[x_{1},\dots,x_{n}] has the spectral containment property if for any AH(P)symn×nA\in H(P)\subset\mathbb{R}^{n\times n}_{\mathrm{sym}}, there is an eigenvalue vector λn\lambda\in\mathbb{R}^{n} of AA such that λH(p)\lambda\in H(p).

Remark 2.8.

We could make the stronger requirement in Definition 2.7 that for all A\in H(P), every eigenvalue vector of A lies in H(p); however, this seems to be too restrictive: we do not have any examples of polynomials besides the elementary symmetric polynomials with this stronger property.

We now give a number of polynomials which have the spectral containment property:

Theorem 2.9.

The following classes of polynomials have the spectral containment property:

  1. (1)

    The elementary symmetric polynomials e1,,ene_{1},\dots,e_{n}.

  2. (2)

    For any nkdn\geq k\geq d, and any |ε||\varepsilon| sufficiently small, ed(x1,,xn)+εed(x1,,xk)e_{d}(x_{1},\dots,x_{n})+\varepsilon e_{d}(x_{1},\dots,x_{k}).

  3. (3)

    Stable linear polynomials.

  4. (4)

    Any degree n-1 stable polynomial that is interlaced by e_{n-2}.

  5. (5)

    e2(x1,x2,x3,x4)ε(x1x2+x1x3)e_{2}(x_{1},x_{2},x_{3},x_{4})-\varepsilon(x_{1}x_{2}+x_{1}x_{3}) for ε\varepsilon sufficiently small.

Moreover, if pp has the spectral containment property, and x0x_{0} is a variable not used in pp, then x0px_{0}p has the spectral containment property.

While this property may seem mysterious, we conjecture that it is in fact ubiquitous:

Conjecture 2.10.

Every homogeneous multiaffine stable polynomial has the spectral containment property.

If Conjecture 2.10 is true, then Theorem 2.1 implies that for every k-locally PSD matrix A and every homogeneous multiaffine stable polynomial p of degree k, some eigenvalue vector of A is contained in H(p). This may seem like a very strong condition on the eigenvalues of A, but as we show below it is equivalent to the fact that every eigenvalue vector of A is contained in H(e_{k}), which we already observed above. Let \mathfrak{S}_{n} denote the symmetric group on n letters and let it act on \mathbb{R}^{n} by permuting coordinates.

Theorem 2.11.

Let ek[x]e_{k}\in\mathbb{R}[{x}] be the elementary symmetric polynomial of degree kk and h[x]h\in\mathbb{R}[{x}] be a nonzero homogeneous multiaffine stable polynomial of degree kk. If vH(ek){v}\in H(e_{k}), then there exists a permutation τ𝔖n\tau\in\mathfrak{S}_{n} such that τ(v)H(h)\tau({v})\in H(h).

In Section 7.4 we will also show that Conjecture 2.10 would be implied in many cases by another conjecture generalizing the classical Schur–Horn theorem.

3. The Minor Lift Map and Stability Preservers

Our goal in this section is to prove Theorem 2.1. We first explain how to construct the minor lift map via partial derivatives of the determinant. Let p[x]p\in\mathbb{R}[{x}] be a multiaffine polynomial. The dual of pp is

(1) p(x):=x1x2xnp(1x1,1x2,,1xn).p^{*}(x):=x_{1}\cdot x_{2}\cdots x_{n}\cdot p\Bigl{(}\frac{1}{x_{1}},\frac{1}{x_{2}},\dots,\frac{1}{x_{n}}\Bigr{)}.

For any polynomial p[x1,,xn]p\in\mathbb{R}[x_{1},\dots,x_{n}], we consider the differential operator p(X11,X22,,Xnn)p^{*}\left(\frac{\partial}{\partial X_{11}},\frac{\partial}{\partial X_{22}},\dots,\frac{\partial}{\partial X_{nn}}\right). For instance, if p=xS=iSxip=x^{S}=\prod_{i\in S}x_{i} is a monomial, then the associated differential operator is iSXii\prod_{i\notin S}\frac{\partial}{\partial X_{ii}}. Applying the differential operator associated to xSx^{S} to det(X)\det(X) yields

(iSXii)det(X)=det(XS).\Bigl{(}\prod_{i\notin S}\frac{\partial}{\partial X_{ii}}\Bigr{)}\det({X})=\det(X_{S}).

By linearity, we then obtain that

P=(p(X11,X22,,Xnn))det(X).P=\left(p^{*}\left(\frac{\partial}{\partial X_{11}},\frac{\partial}{\partial X_{22}},\dots,\frac{\partial}{\partial X_{nn}}\right)\right)\det(X)\,.

This formulation of the minor lift map will allow us to easily apply the theory of stability preservers.
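For a small concrete instance (ours, assuming Python with sympy): for p=e_{2} in three variables we have p^{*}=e_{1}, so the operator is \sum_{i}\partial/\partial X_{ii}, and applying it to \det(X) reproduces E_{2}(X).

```python
# Check P = p*(d/dX_11, ..., d/dX_nn) det(X) for p = e_2, n = 3
# (our illustration of the formula above).
import itertools
import sympy as sp

n = 3
X = sp.Matrix(n, n, lambda i, j: sp.Symbol(f"X_{min(i,j)}{max(i,j)}"))
detX = X.det()

# p = e_2(x_1, x_2, x_3), so p* = e_1 and the operator is sum_i d/dX_ii.
P_from_operator = sum(sp.diff(detX, X[i, i]) for i in range(n))

E2 = sum(X[list(J), list(J)].det() for J in itertools.combinations(range(n), 2))
assert sp.expand(P_from_operator - E2) == 0
```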

Remark 3.1.

The minor lift operation interacts nicely with dualization. If pp is a multiaffine polynomial, then

Φ(p)|X=det(X)Φ(p)|X1.\Phi(p^{*})|_{X}=\det(X)\cdot\Phi(p)|_{X^{-1}}.

Here, |X\cdot|_{X} denotes the evaluation of a polynomial at XX.

This result follows directly from the Jacobi complementary minors identity, found in [20], which states that det(X|Sc)=det(X1|S)det(X)\det(X|_{S^{c}})=\det(X^{-1}|_{S})\det(X). This is a matrix analogue of (1).
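The identity is also easy to test numerically (our sketch, assuming numpy).

```python
# Numerical check of the Jacobi complementary minors identity
# det(X|_{S^c}) = det(X^{-1}|_S) det(X)  (our illustration).
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((5, 5))
X = X @ X.T + np.eye(5)                  # symmetric and invertible
S = [1, 3]
Sc = [i for i in range(5) if i not in S]
lhs = np.linalg.det(X[np.ix_(Sc, Sc)])
rhs = np.linalg.det(np.linalg.inv(X)[np.ix_(S, S)]) * np.linalg.det(X)
assert np.isclose(lhs, rhs)
```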

Before we go on, we need the following facts about hyperbolicity cones that can be found in [29].

Lemma 3.2.

Let p[x]p\in\mathbb{R}[{x}] be a homogeneous polynomial and KnK\subset\mathbb{R}^{n} a cone. The following are equivalent:

  1. (1)

    pp is hyperbolic with respect to all aK{a}\in K, and

  2. (2)

    p(v+ia)0p({v}+\textnormal{i}{a})\neq 0 for all vn{v}\in\mathbb{R}^{n} and aK{a}\in K.

Lemma 3.3.

Let p[x]p\in\mathbb{R}[{x}] be hyperbolic with respect to an{a}\in\mathbb{R}^{n}. Then pp is hyperbolic with respect to every point in the connected component of {vn:p(v)0}\{{v}\in\mathbb{R}^{n}:\,p({v})\neq 0\} that contains a{a}.

Our first step is the following observation:

Lemma 3.4.

Let P[X]P\in\mathbb{R}[{X}] be a homogeneous polynomial. Then PP is PSD-stable if and only if the following two conditions hold:

  1. (1)

    P(A)0P(A)\neq 0 for all positive definite matrices AA;

  2. (2)

    P(Diag(x1,,xn)+M)[x]P(\operatorname*{Diag}(x_{1},\dots,x_{n})+M)\in\mathbb{R}[{x}] is stable for every real symmetric matrix MM.

Proof.

First assume that P is PSD-stable and let A be a positive definite matrix. By definition we have P(\mathrm{i}A)\neq 0. Since P is homogeneous, this implies that P(A)\neq 0. Further, let z_{i}=a_{i}+\mathrm{i}b_{i} be complex numbers in the open upper half-plane. Then P(\operatorname{Diag}(z_{1},\dots,z_{n})+M) is nonzero for any real symmetric matrix M, since the imaginary part \operatorname{Diag}(b_{1},\dots,b_{n}) is a positive definite matrix.

For the other direction we first observe that condition (2) implies that PP is hyperbolic with respect to the identity matrix. Indeed, the univariate polynomial P(tI+M)P(tI+M) is stable and thus real-rooted for every real symmetric matrix MM. Now condition (1) together with Lemmas 3.2 and 3.3 imply the claim. ∎

Proof of Theorem 2.1.

Let p[x]p\in\mathbb{R}[{x}] be multiaffine, homogeneous and stable. Then by [9, Thm. 6.1] all nonzero coefficients of pp have the same sign. Without loss of generality assume that all are positive. Then P=Φ(p)P=\Phi(p) is clearly positive on positive definite matrices since the minors of a positive definite matrix are positive. Thus by Lemma 3.4, it remains to show that

P(Diag(x1,,xn)+M)=(p(x1,x2,,xn))det(Diag(x1,,xn)+M)P(\operatorname*{Diag}(x_{1},\dots,x_{n})+M)=\left(p^{*}\left(\frac{\partial}{\partial x_{1}},\frac{\partial}{\partial x_{2}},\dots,\frac{\partial}{\partial x_{n}}\right)\right)\det(\operatorname*{Diag}(x_{1},\dots,x_{n})+M)

is stable for every real symmetric matrix MM. The polynomial det(Diag(x1,,xn)+M)\det(\operatorname*{Diag}(x_{1},\dots,x_{n})+M) is stable as well as pp^{*} by [9, Prop. 4.2]. Thus the polynomial P(Diag(x1,,xn)+M)P(\operatorname*{Diag}(x_{1},\dots,x_{n})+M) is also stable by [5, Thm. 1.3].

Let APSDksymn×nA\in\mathrm{PSD}_{k}\subseteq\mathbb{R}^{n\times n}_{\mathrm{sym}} be kk-locally PSD. Then for every kk-subset S[n]S\subseteq[n], we have det((A+tI)|S)>0\det((A+t{I})|_{S})>0 for all t>0t>0. Hence, if pp has degree kk with all coefficients positive, then P(AtI)>0P(A-t{I})>0 for all t<0t<0 and hence all roots are non-negative. This implies that AH(P)A\in H(P). ∎
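To see that the containment \mathrm{PSD}_{k}\subseteq H(P) goes beyond the PSD cone, consider the following small example (ours, assuming numpy): for n=3 the matrix A=2I-J, with J the all-ones matrix, is 2-locally PSD but not PSD, and it lies in H(E_{2}).

```python
# A 2-locally PSD matrix that is not PSD but lies in H(E_2)
# (our illustration of Theorem 2.1).
import itertools
import numpy as np

A = 2 * np.eye(3) - np.ones((3, 3))      # eigenvalues -1, 2, 2: not PSD
# every 2 x 2 principal submatrix is [[1, -1], [-1, 1]], which is PSD
for J in itertools.combinations(range(3), 2):
    assert min(np.linalg.eigvalsh(A[np.ix_(J, J)])) >= -1e-9

def E2(A):
    return sum(np.linalg.det(A[np.ix_(J, J)])
               for J in itertools.combinations(range(A.shape[0]), 2))

# A in H(E_2): all roots of t -> E_2(A + tI) are <= 0.  For n = 3,
# E_2(A + tI) = 3 t^2 + 2 e_1(lam) t + e_2(lam) with lam the spectrum of A.
lam = np.linalg.eigvalsh(A)
assert all(r.real <= 1e-8 for r in np.roots([3.0, 2 * lam.sum(), E2(A)]))
```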

Remark 3.5.

Given a multiaffine homogeneous stable polynomial p\in\mathbb{R}[x_{1},\dots,x_{n}], the minor lift map gives a hyperbolic polynomial P in the entries of a symmetric n\times n matrix whose restriction to the diagonal equals p. Such polynomials can also be constructed for stable polynomials that are not necessarily multiaffine. Since we are mainly interested in multiaffine polynomials, we only briefly sketch one possible such construction. For a stable homogeneous polynomial p\in\mathbb{R}[x_{1},\dots,x_{n}] one can find a multiaffine stable polynomial q\in\mathbb{R}[z_{11},\dots,z_{1d_{1}},\dots,z_{nd_{n}}] such that we can recover p from q by substituting each variable z_{ij} by x_{i}; see [9, §2.5]. This polynomial q is called a polarization of p. If we restrict the minor lift of q to suitable block-diagonal matrices, we obtain a hyperbolic polynomial with the desired properties for p.

Remark 3.6.

Using [8, Thm. 3.2] one can show that the analogous statement to Theorem 2.1 for Lorentzian polynomials, a recent generalization of stable polynomials, holds as well.

4. Hyperbolic Hadamard-Fischer Inequality

Our goal in this section is to prove Theorem 2.4. We start by making some general observations about supporting hyperplanes of the hyperbolicity cone:

Lemma 4.1.

Let p[x]p\in\mathbb{R}[x] be hyperbolic with respect to ana\in\mathbb{R}^{n} and Ha(p)H_{a}(p) the corresponding hyperbolicity cone. Assume that p(a)>0p(a)>0 and that pp is reduced in the sense that all its irreducible factors are coprime. Then we have the following:

  1. (1)

    For all vHa(p)v\in H_{a}(p) the linear form Lv=p(v),xL_{v}=\langle\nabla p(v),x\rangle is nonnegative on Ha(p)H_{a}(p).

  2. (2)

    If vHa(p)v\in\partial H_{a}(p), then Lv(v)=0L_{v}(v)=0.

  3. (3)

    If bHa(p)b\not\in H_{a}(p), then there exists vHa(p)v\in\partial H_{a}(p) such that Lv(b)<0L_{v}(b)<0.

Proof.

Part (2) is just Euler's identity since p vanishes on \partial H_{a}(p). If \nabla p(v)=0, then (1) is trivial. Otherwise, the hyperplane \{x:\langle\nabla p(v),x\rangle=0\} is tangent to \partial H_{a}(p) at v. Since H_{a}(p) is convex and p positive on the interior of H_{a}(p), this implies that L_{v}(x)\geq 0 for all x\in H_{a}(p). In order to prove (3), we first note that by our assumption on p, the set of points c\in\partial H_{a}(p) where \nabla p(c)=0 is nowhere dense. Thus if b\not\in H_{a}(p), then there is a point e in the interior of H_{a}(p) such that the line segment [e,b] intersects \partial H_{a}(p) in a smooth point v. Write v=(1-\theta)e+\theta b with \theta\in(0,1); since L_{v} is linear with L_{v}(e)>0 and L_{v}(v)=0, we get L_{v}(b)=-\tfrac{1-\theta}{\theta}L_{v}(e)<0. ∎

We now apply the above observations to lpm polynomials. Recall that for a partition Π={S1,,Sm}\Pi=\{S_{1},\dots,S_{m}\} of [n][n], we denote by 𝒟Π\mathcal{D}_{\Pi} the vector space of block diagonal symmetric matrices with blocks given by Π\Pi and πΠ\pi_{\Pi} is the orthogonal projection of symn×n\mathbb{R}^{n\times n}_{\mathrm{sym}} onto the subspace 𝒟Π\mathcal{D}_{\Pi}. Further recall that we write aba\sim b for a,b[n]a,b\in[n] if a,bSka,b\in S_{k} for some k=1,,mk=1,\dots,m.

Lemma 4.2.

Fix a partition Π={S1,,Sm}\Pi=\{S_{1},\dots,S_{m}\} of [n][n] and let B[n]B\subseteq[n] be any subset. Then for any σ𝔖B\sigma\in\mathfrak{S}_{B}, we have |{bB|b≁σ(b)}|1|\{b\in B|b\not\sim\sigma(b)\}|\neq 1.

Proof.

For b\in B, consider the orbit b,\sigma(b),\sigma^{2}(b),\dots,\sigma^{t-1}(b),\sigma^{t}(b)=b. If b\in S_{k} but the orbit is not fully contained in S_{k}, then there are 0\leq r<s<t such that \sigma^{r}(b),\sigma^{s+1}(b)\in S_{k} but \sigma^{r+1}(b),\sigma^{s}(b)\not\in S_{k}. In particular, \sigma^{r}(b)\not\sim\sigma(\sigma^{r}(b)) and \sigma^{s}(b)\not\sim\sigma(\sigma^{s}(b)), so every orbit contributes either zero or at least two elements b\in B with b\not\sim\sigma(b). Summing over all orbits proves the claim. ∎

Lemma 4.3.

Let PP be an lpm polynomial. If A𝒟ΠA\in\mathcal{D}_{\Pi}, then P(A)𝒟Π\nabla P(A)\in\mathcal{D}_{\Pi}.

Proof.

Since PP is a sum of terms of the form aBdet(XB)a_{B}\det(X_{B}) with B[n]B\subseteq[n], it suffices to prove the claim for P=det(XB)P=\det(X_{B}). In that case, this is equivalent to saying that if A𝒟ΠA\in\mathcal{D}_{\Pi} and i≁ji\not\sim j, then

(Xijdet(XB))(A)=0.\Bigl{(}\frac{\partial}{\partial X_{ij}}\det(X_{B})\Bigr{)}(A)=0.

Now detXB=σ𝔖Bsgn(σ)iBXi,σ(i)\det X_{B}=\sum_{\sigma\in\mathfrak{S}_{B}}\mathrm{sgn}(\sigma)\prod_{i\in B}X_{i,\sigma(i)} and Lemma 4.2 applied to each term yields the claim. ∎

The preceding lemma allows us to show that the hyperbolicity cone of a hyperbolic lpm polynomial is closed under projections onto 𝒟Π\mathcal{D}_{\Pi}.

Lemma 4.4.

Let P be a homogeneous PSD-stable lpm polynomial. If A\in H(P), then \pi_{\Pi}(A)\in H(P).

Proof.

Let PΠP_{\Pi} be the restriction of the polynomial PP to 𝒟Π\mathcal{D}_{\Pi}, i.e. PΠ=PιP_{\Pi}=P\circ\iota where ι:𝒟Πsymn×n\iota:\mathcal{D}_{\Pi}\rightarrow\mathbb{R}^{n\times n}_{\mathrm{sym}} is the inclusion map. We have H(P)𝒟Π=H(PΠ)H(P)\cap\mathcal{D}_{\Pi}=H(P_{\Pi}). For AH(P)A\in H(P) we thus have to show that πΠ(A)H(PΠ)\pi_{\Pi}(A)\in H(P_{\Pi}). By Lemma 4.1 this is equivalent to PΠ(B),πΠ(A)0\langle\nabla P_{\Pi}(B),\pi_{\Pi}(A)\rangle\geq 0 for all BH(PΠ)B\in H(P_{\Pi}). But by the previous lemma we have PΠ(B),πΠ(A)=P(B),A\langle\nabla P_{\Pi}(B),\pi_{\Pi}(A)\rangle=\langle\nabla P(B),A\rangle which is nonnegative by Lemma 4.1 since AH(P)A\in H(P). ∎

We are now able to show the hyperbolic Fischer–Hadamard inequality. Our proof technique is inspired by the proof of [11, Thm. 5].

Proof of Theorem 2.4.

Without loss of generality, we can assume that P(I)>0. If A is on the boundary of H(P), then P(A)=0 and we are done, since \pi_{\Pi}(A)\in H(P) implies P(\pi_{\Pi}(A))\geq 0. Therefore, we may assume that A is in the interior of H(P). In this case, let \epsilon>0 be sufficiently small such that A-\epsilon I\in H(P); then \pi_{\Pi}(A)-\epsilon I=\pi_{\Pi}(A-\epsilon I) is also in H(P). This shows that \pi_{\Pi}(A) is in the interior of H(P) and P(\pi_{\Pi}(A))>0. Thus P is hyperbolic with respect to A, and q(t)=P(tA+\pi_{\Pi}(A))\in\mathbb{R}[t] is real-rooted with negative roots. Let d be the degree of q(t) and -\lambda_{1},\dots,-\lambda_{d} the roots of q(t), with each \lambda_{i}>0. We consider the coefficients of q(t):

  • The coefficient of tdt^{d} is P(A)P(A).

  • The coefficient of tt is dP(πΠ(A))dP(\pi_{\Pi}(A)), since ddtq(t)|t=0=P|πΠ(A),A\frac{d}{dt}q(t)|_{t=0}=\langle\nabla P|_{\pi_{\Pi}(A)},A\rangle, and by Lemma 4.3, P|πΠ(A),A=P|πΠ(A),πΠ(A)=dP(πΠ(A))\langle\nabla P|_{\pi_{\Pi}(A)},A\rangle=\langle\nabla P|_{\pi_{\Pi}(A)},\pi_{\Pi}(A)\rangle=dP(\pi_{\Pi}(A)). This last equality is due to Euler’s identity.

  • The constant coefficient is P(πΠ(A))P(\pi_{\Pi}(A)).

Thus we have ed1(λ)=dP(πΠ(A))P(A)e_{d-1}(\lambda)=\frac{dP(\pi_{\Pi}(A))}{P(A)}, and ed(λ)=λ1λd=P(πΠ(A))P(A)e_{d}(\lambda)=\lambda_{1}\cdots\lambda_{d}=\frac{P(\pi_{\Pi}(A))}{P(A)}. Since all λi\lambda_{i} are positive, from the AM-GM inequality we have

P(πΠ(A))P(A)=ed1(λ)d(λ1λd)d1d=(P(πΠ(A))P(A))d1d.\frac{P(\pi_{\Pi}(A))}{P(A)}=\frac{e_{d-1}(\lambda)}{d}\geq(\lambda_{1}\cdots\lambda_{d})^{\frac{d-1}{d}}=\left(\frac{P(\pi_{\Pi}(A))}{P(A)}\right)^{\frac{d-1}{d}}.

Since the ratio P(\pi_{\Pi}(A))/P(A) is positive, this forces P(\pi_{\Pi}(A))\geq P(A), which proves the claim. ∎

When P(X)=detXP(X)=\det X, then H(P)H(P) is the cone of positive semidefinite matrices and our theorem implies the well-known Fischer’s inequality:

Corollary 4.5 (Fischer’s inequality).

If AA is positive semidefinite, then detπΠ(A)detA\det\pi_{\Pi}(A)\geq\det A.

Remark 4.6.

The usual statement of Fischer’s inequality corresponds to the case of two blocks. This is equivalent to our multi-block version since principal submatrices of a positive semidefinite matrix are also positive semidefinite.

In the case where \Pi=\{\{1\},\dots,\{n\}\}, Theorem 2.4 and Lemma 4.4 imply Corollary 2.2. We also get the following strengthening of Theorem 2.4.

Corollary 4.7.

Let PP be a homogeneous and PSD-stable lpm-polynomial. If AH(P)A\in H(P), then the polynomial P((1t)A+tπΠ(A))P((1-t)A+t\pi_{\Pi}(A)) is monotonically increasing for t[0,1]t\in[0,1].

Proof.

The polynomial q(t)=P(tA+(\pi_{\Pi}(A)-A)) is real-rooted, and

P((1-t)A+t\pi_{\Pi}(A))=q^{*}(t),

where q^{*}(t)=t^{d}q(1/t) with d=\deg P denotes the reciprocal polynomial, so that q^{*}(t) is real-rooted as well. Because both A and \pi_{\Pi}(A) are in H(P), we have q^{*}(t)\geq 0 for t\in[0,1]. Moreover, q^{*} is maximized over [0,1] at t=1: for A_{s}=(1-s)A+s\pi_{\Pi}(A) we have \pi_{\Pi}(A_{s})=\pi_{\Pi}(A), so Theorem 2.4 applied to A_{s}\in H(P) gives q^{*}(s)=P(A_{s})\leq P(\pi_{\Pi}(A))=q^{*}(1). Hence, by interlacing, \frac{d}{dt}q^{*}(t) has at most one root in the interval [0,1]. If there were a root of \frac{d}{dt}q^{*}(t) in the interval [0,1), then at such a root q^{*}(t) would have a maximum on this interval, contradicting that q^{*} attains its maximum over [0,1] at t=1. Hence \frac{d}{dt}q^{*}(t) must in fact be nonnegative on this interval. ∎

5. Hyperbolic Koteljanskii inequality

Koteljanskii's inequality [17] states that for any n\times n positive semidefinite matrix A and S,T\subseteq[n], \det A_{S}\det A_{T}\geq\det A_{S\cap T}\det A_{S\cup T}. This is a generalization of the Fischer–Hadamard inequality. Later this inequality was proven to hold for other classes of (possibly non-symmetric) matrices [15]. In this section we prove Theorem 2.6, a generalization of Koteljanskii's inequality in which the determinant is replaced by a PSD-stable lpm polynomial. First we need the hyperbolic counterpart of the fact that principal submatrices of a positive semidefinite matrix are again positive semidefinite, and hence have nonnegative determinant. For this we use Renegar derivatives [25].

Theorem 5.1.

Let pp be a polynomial, hyperbolic with respect to vv. Let DvpD_{v}p denote the directional derivative of pp in direction vv. Then DvpD_{v}p is also hyperbolic with respect to vv. Furthermore, their hyperbolicity cones satisfy Hv(p)Hv(Dvp)H_{v}(p)\subseteq H_{v}(D_{v}p).

Recall from Definition 2.5 that P|_{T}=\bigl(\prod_{i\in[n]\setminus T}\frac{\partial}{\partial X_{ii}}\bigr)P. Then we have:

Corollary 5.2.

Let PP be a homogeneous PSD-stable lpm polynomial of degree kk and AH(P)A\in H(P). Let T[n]T\subseteq[n] with |T|nk|T|\geq n-k. Then P|TP|_{T} is PSD-stable as well and AH(P|T)A\in H(P|_{T}).

Now we use the result from [6] on negative dependence. For any polynomial p[x]p\in\mathbb{R}[x] and index set S[n]S\subseteq[n] we denote Sp=(iSxi)p\partial^{S}p=(\prod_{i\in S}\frac{\partial}{\partial x_{i}})p.

Theorem 5.3 ([6, Sect. 2.1 and Thm. 4.9]).

Let pp be a multiaffine stable polynomial with nonnegative coefficients. Then pp satisfies the nonnegative lattice condition: for all S,T[n]S,T\subseteq[n]

Sp(0)Tp(0)STp(0)STp(0).\partial^{S}p(0)\partial^{T}p(0)\ \geq\ \partial^{S\cup T}p(0)\partial^{S\cap T}p(0)\,.

This theorem directly implies the generalization of Koteljanskii’s inequality.

Proof of Theorem 2.6.

Without loss of generality assume that P(I)>0P(I)>0. Let PA(x)=P(A+Diag(x))[x1,,xn]P_{A}(x)=P(A+\operatorname*{Diag}(x))\in\mathbb{R}[x_{1},...,x_{n}]. It is clear that PAP_{A} is multiaffine and SPA(0)=P|S(A)\partial^{S}P_{A}(0)=P|_{S}(A) for all S[n]S\subseteq[n]. It follows from Corollary 5.2 that PAP_{A} is stable and has nonnegative coefficients. Thus by Theorem 5.3 it satisfies the nonnegative lattice condition, i.e. for all S,T[n],SPA(0)TPA(0)STPA(0)STPA(0)S,T\subseteq[n],\partial^{S}P_{A}(0)\partial^{T}P_{A}(0)\geq\partial^{S\cup T}P_{A}(0)\partial^{S\cap T}P_{A}(0). This completes the proof. ∎

6. Hyperbolic polynomials and sums of squares

Let p[x]p\in\mathbb{R}[x] be hyperbolic with respect to vnv\in\mathbb{R}^{n} and a,bHv(p)a,b\in H_{v}(p). Then the mixed derivative

Δa,b(p)=DapDbppDaDbp\Delta_{a,b}(p)=\operatorname{D}_{a}p\cdot\operatorname{D}_{b}p-p\cdot\operatorname{D}_{a}\operatorname{D}_{b}p

is globally nonnegative by Theorem 3.1 in [19]. If some power prp^{r} has a definite symmetric determinantal representation, i.e., can be written as

pr=det(x1A1++xnAn)p^{r}=\det(x_{1}A_{1}+\cdots+x_{n}A_{n})

for some real symmetric (or complex hermitian) matrices A_{1},\dots,A_{n} with v_{1}A_{1}+\cdots+v_{n}A_{n} positive definite, then \Delta_{a,b}(p) is in fact a sum of squares [19, Cor. 4.3]. Therefore, any instance where \Delta_{a,b}(p) is not a sum of squares gives an example of a hyperbolic polynomial none of whose powers has a definite symmetric determinantal representation. Another source of interest in such examples comes from the point of view taken in [27], as these give rise to families of polynomials that are not sums of squares but whose nonnegativity can be certified via hyperbolic programming. Saunderson [27] characterized all pairs (d,n) for which there exists such a hyperbolic polynomial p\in\mathbb{R}[x]=\mathbb{R}[x_{1},\dots,x_{n}] of degree d, except when d=3. In this section we will construct an explicit hyperbolic cubic p in 6 variables for which there are two points a,b in the hyperbolicity cone such that \Delta_{a,b}(p) is not a sum of squares.

Remark 6.1.

If there are two points a,b in the closed hyperbolicity cone of p such that \Delta_{a,b}(p) is not a sum of squares, then there are also such points in the interior of the hyperbolicity cone, as the cone of sums of squares is closed.

Remark 6.2.

In [27] Saunderson constructs a hyperbolic cubic in 43 variables whose Bézout matrix is not a matrix sum of squares. This was previously the smallest known such example. The top left entry of the Bézout matrix is the mixed derivative that we are studying. Thus if the latter is not a sum of squares, then the Bézout matrix is not a matrix sum of squares.

Consider the complete graph K4K_{4} on 44 vertices. We define the spanning tree polynomial of K4K_{4} as the element of [xe:eE(K4)]\mathbb{R}[x_{e}:e\in E(K_{4})] given by

tK4(x)=τeτxe,t_{K_{4}}(x)\ =\ \sum_{\tau}\prod_{e\in\tau}x_{e}\,,

where \tau\subseteq E(K_{4}) ranges over all edge sets of spanning trees of K_{4}. The polynomial t_{K_{4}} is multiaffine, homogeneous and stable [9, Thm. 1.1]. Let T be its minor lift. Finally, let p be the polynomial obtained by evaluating T at the matrix of indeterminates

A=\begin{pmatrix}x_{1}&0&0&0&0&0\\ 0&x_{2}&a&b&c&0\\ 0&a&x_{2}&c&b&0\\ 0&b&c&x_{2}&a&0\\ 0&c&b&a&x_{2}&0\\ 0&0&0&0&0&x_{3}\end{pmatrix},

where the rows and columns are indexed by the edges 12, 13, 14, 23, 24, 34 of K_{4}.

Thus pp is hyperbolic with respect to every positive definite matrix that can be obtained by specializing entries of AA to some real numbers. In particular, the polynomial

W=px1px3p2px1x3W=\frac{\partial p}{\partial x_{1}}\cdot\frac{\partial p}{\partial x_{3}}-p\cdot\frac{\partial^{2}p}{\partial x_{1}\partial x_{3}}

is nonnegative. We will show that it is not a sum of squares. We first study the real zero set of WW.
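The construction is small enough to carry out symbolically. The sketch below (ours, assuming Python with sympy) enumerates the 16 spanning trees of K_{4}, builds p as the minor lift of t_{K_{4}} evaluated at the matrix A above, and computes W; since p is affine in each of x_{1} and x_{3}, the polynomial W is independent of these two variables.

```python
# Symbolic construction of p and W from the spanning tree polynomial of K_4
# (our illustration; edge order 12, 13, 14, 23, 24, 34 as in the matrix A).
import itertools
import sympy as sp

x1, x2, x3, a, b, c = sp.symbols("x1 x2 x3 a b c")
A = sp.Matrix([
    [x1, 0, 0, 0, 0, 0],
    [0, x2, a, b, c, 0],
    [0, a, x2, c, b, 0],
    [0, b, c, x2, a, 0],
    [0, c, b, a, x2, 0],
    [0, 0, 0, 0, 0, x3],
])
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

def is_spanning_tree(tau):
    """Three edges on the vertices 1..4 form a spanning tree iff they connect."""
    reach = {1}
    for _ in range(3):                   # three sweeps suffice on 4 vertices
        for (u, v) in tau:
            if u in reach or v in reach:
                reach |= {u, v}
    return reach == {1, 2, 3, 4}

trees = [T for T in itertools.combinations(range(6), 3)
         if is_spanning_tree([edges[i] for i in T])]
assert len(trees) == 16                  # Cayley: 4^{4-2} spanning trees

p = sum(A[list(T), list(T)].det() for T in trees)
W = sp.expand(sp.diff(p, x1) * sp.diff(p, x3) - p * sp.diff(p, x1, x3))
assert not (W.has(x1) or W.has(x3))      # p is affine in x1 and in x3
print(W)  # compare with the closed form for W/4 in the proof of Lemma 6.3
```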

Lemma 6.3.

The polynomial WW is contained in the ideals J1,J2,J3J_{1},J_{2},J_{3} and J4J_{4} where

  1. (1)

    J1J_{1} is generated by all 2×22\times 2 minors of AA,

  2. (2)

    J2J_{2} is generated by all off-diagonal entries of AA,

  3. (3)

    J3J_{3} is generated by a,ca,c and x2x_{2},

  4. (4)

    J4J_{4} is generated by b,cb,c and x2x_{2}.

Proof.

Part (1) follows from the fact that both p and \frac{\partial p}{\partial x_{1}} are in J_{1}. The other claims are apparent since

\frac{1}{4}W=a^{2}b^{2}+a^{2}c^{2}+b^{2}c^{2}+c^{4}-8abcx_{2}+2a^{2}x_{2}^{2}+2b^{2}x_{2}^{2}. ∎
Definition 6.4.

An ideal II in a ring AA is called real radical if g12++gr2Ig_{1}^{2}+\cdots+g_{r}^{2}\in I implies g1,,grIg_{1},\ldots,g_{r}\in I for all g1,,grAg_{1},\ldots,g_{r}\in A.

Lemma 6.5.

The ideal J=k=14JkJ=\bigcap_{k=1}^{4}J_{k} is real radical.

Proof.

It suffices to show that each J_{k} is a radical ideal whose real points are Zariski dense in its zero set. This is clear for J_{2},J_{3} and J_{4}. Using Macaulay2 [12] we checked that J_{1} is radical. Moreover, the primary decomposition of J_{1} shows that the zero set of J_{1} is a union of linear spaces. This implies that the real zeros of J_{1} are Zariski dense as well. ∎

Theorem 6.6.

The polynomial WW is not a sum of squares.

Proof.

Assume for the sake of a contradiction that W=g12++gr2W=g_{1}^{2}+\cdots+g_{r}^{2} for some polynomials gig_{i}. Since WJW\in J by Lemma 6.3, Lemma 6.5 implies that each gig_{i} is in JJ. Thus WW is even in the ideal JJJ\cdot J. Using Macaulay2 [12] one checks that this is not the case. ∎

Remark 6.7.

In the terminology of [27] this shows in particular that p is neither SOS-hyperbolic nor weakly SOS-hyperbolic.

7. The Spectral Containment Property

We would like to relate the hyperbolicity cone of a homogeneous stable polynomial to the hyperbolicity cone of its minor lift. Recall from Definition 2.7 that a homogeneous multiaffine stable polynomial p has the spectral containment property if for any X\in H(P), there is some vector \lambda consisting of the eigenvalues of X with appropriate multiplicities such that \lambda\in H(p). Elementary symmetric polynomials have the spectral containment property, and in this section we show that several other polynomials have it as well. The remainder of the section is devoted to proving some sufficient conditions for the spectral containment property, as well as exhibiting connections between this property and the Schur–Horn theorem.

7.1. Schur–Horn Theorem and stable linear functions

Recall that a linear homogeneous polynomial p(x)=a1x1++anxnp(x)=a_{1}x_{1}+\dots+a_{n}x_{n} is stable if and only if either ai0a_{i}\geq 0 for each i[n]i\in[n], or ai0a_{i}\leq 0 for each i[n]i\in[n]. Moreover, in this case H(p)={xn:p(x)0}H(p)=\{x\in\mathbb{R}^{n}:p(x)\geq 0\}. These are the simplest stable polynomials and yet it is not completely trivial to show that they have the spectral containment property.

Theorem 7.1.

Every stable linear homogeneous polynomial has the spectral containment property.

In order to prove this, we will use Schur’s contribution to the Schur–Horn theorem.

Theorem 7.2 (Schur).

Let p:np:\mathbb{R}^{n}\rightarrow\mathbb{R} be a homogeneous linear function, and let PP be the associated minor lift. Let AA be a symmetric matrix and let λ\lambda be an eigenvalue vector for AA. Let 𝔖n\mathfrak{S}_{n} denote the symmetric group which acts on n\mathbb{R}^{n} by permuting coordinates. Let O(n)\textnormal{O}(n) denote the orthogonal group of n×nn\times n matrices. Then

maxπ𝔖np(π(λ))=maxUO(n)P(UAU).\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))=\max_{U\in\textnormal{O}(n)}P(UAU^{\intercal}).
Proof of Theorem 7.1.

Suppose that AH(P)A\in H(P), which is equivalent to P(A)0P(A)\geq 0. By the Schur–Horn theorem, there is some eigenvalue vector of AA, say λ\lambda, so that p(λ)P(A)0p(\lambda)\geq P(A)\geq 0. Thus, there is an eigenvalue vector of AA contained in H(p)H(p) as desired. ∎
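Schur's theorem is easy to probe numerically. The sketch below (ours, assuming numpy) draws orthogonal matrices via QR factorizations of Gaussian matrices and checks that no diagonal of UXU^{\intercal} beats the best permutation of the spectrum for a linear p.

```python
# Numerical illustration of Schur's theorem for a linear p (our sketch):
# here P(X) = <a, diag(X)> is the minor lift of p(x) = <a, x>.
import itertools
import numpy as np

a = np.array([3.0, 1.0, -2.0])
rng = np.random.default_rng(4)
X = rng.standard_normal((3, 3)); X = (X + X.T) / 2
lam = np.linalg.eigvalsh(X)

best = max(a @ np.array(pi) for pi in itertools.permutations(lam))
for _ in range(1000):
    U, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthogonal U
    assert a @ np.diag(U @ X @ U.T) <= best + 1e-9
```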

We will see in Section 7.4 that if an appropriate generalization of the Schur-Horn theorem holds, then we would be able to show the spectral containment property for a large class of polynomials.

7.2. Operations Preserving the Spectral Containment Property

In this section we prove that the spectral containment property is preserved under some simple operations involving adjoining a new variable.

Lemma 7.3.

Let q[x1,,xn]q\in\mathbb{R}[x_{1},\ldots,x_{n}] be stable, multiaffine and homogeneous. Let p[x0,,xn]p\in\mathbb{R}[x_{0},\ldots,x_{n}] be defined by p(x0,,xn)=q(x1,,xn)p(x_{0},\dots,x_{n})=q(x_{1},\dots,x_{n}). If qq has the spectral containment property, then pp has the spectral containment property.

Proof.

First note that x=(x_{0},\dots,x_{n})\in H(p) if and only if (x_{1},\dots,x_{n})\in H(q). Let X\in H(P); we can divide X into blocks as

X=(X00vvM).X=\begin{pmatrix}X_{00}&v^{\intercal}\\ v&M\end{pmatrix}.

Here, MM is equal to X|[n]X|_{[n]}, and vv is some element of n\mathbb{R}^{n}.

If I_{n} denotes the n\times n identity matrix, we can see from the definition of P that P(X+tI_{n+1})=Q(M+tI_{n}). Therefore, for t\geq 0, Q(M+tI_{n})=P(X+tI_{n+1})\geq 0, which implies M\in H(Q). Let \lambda(M) and \lambda(X) be eigenvalue vectors of M and X respectively, with the property that the entries of \lambda(M) and \lambda(X) appear in increasing order. The Cauchy interlacing inequalities say that

λ0(X)λ1(M)λ1(X)λ2(M)λ2(X)λn(M)λn(X).\lambda_{0}(X)\leq\lambda_{1}(M)\leq\lambda_{1}(X)\leq\lambda_{2}(M)\leq\lambda_{2}(X)\leq\dots\leq\lambda_{{n}}(M)\leq\lambda_{{n}}(X).

Thus for i\in[n] we can write \lambda_{i}(X)=\lambda_{i}(M)+\epsilon_{i} for some \epsilon_{i}\geq 0. Since q has the spectral containment property, there is a permutation \sigma such that (\lambda_{\sigma(i)}(M))_{1\leq i\leq n}\in H(q). Since the hyperbolicity cone of the stable polynomial q is convex and contains the nonnegative orthant, we also have (\lambda_{\sigma(i)}(X))_{1\leq i\leq n}=(\lambda_{\sigma(i)}(M)+\epsilon_{\sigma(i)})_{1\leq i\leq n}\in H(q). This implies that (\lambda_{0}(X),\lambda_{\sigma(1)}(X),\dots,\lambda_{\sigma(n)}(X))\in H(p). ∎

The spectral containment property is also preserved when multiplying by a new variable.

Proposition 7.4.

Let q\in\mathbb{R}[x_{1},\dots,x_{n}] be stable, multiaffine and homogeneous. Let p\in\mathbb{R}[x_{0},\dots,x_{n}] be defined by p(x_{0},\dots,x_{n})=x_{0}q(x_{1},\dots,x_{n}). If q has the spectral containment property, then p has the spectral containment property.

Before we show this, we need another lemma. Let XX be a matrix written in block form as

X=(X00vvM)X=\begin{pmatrix}X_{00}&v^{\intercal}\\ v&M\end{pmatrix}

and X000X_{00}\neq 0. We write X/0:=MX001vvX/0:=M-X_{00}^{-1}vv^{\intercal} for the Schur complement.

Lemma 7.5.

Let q\in\mathbb{R}[x_{1},\dots,x_{n}] be stable, multiaffine and homogeneous, and let p=x_{0}q\in\mathbb{R}[x_{0},\dots,x_{n}]. If X\in H(P) with X_{00}>0, then X/0\in H(Q).

Proof.

Note that a vector x=(x0,x1,,xn)H(p)x=(x_{0},x_{1},\dots,x_{n})\in H(p) if and only if x00x_{0}\geq 0 and (x1,,xn)H(q)(x_{1},\dots,x_{n})\in H(q). Recall the determinant formula for Schur complements: for any n×nn\times n matrix XX,

det(X)=X00det(X/0).\det(X)=X_{00}\det(X/0).

Also, it is not hard to see from the definition that if S{0,1,,n}S\subseteq\{0,1,\dots,n\}, and 0S0\in S, then

X|S/0=X/0|(S0),X|_{S}/0=X/0|_{(S\setminus 0)}\,,

that is, Schur complements interact naturally with taking submatrices. Therefore,

P(X)=S{0,,n}aSdet(X|S)=S{0,,n}aSX00det((X/0)|S0)=X00Q(X/0)P(X)=\sum_{S\subseteq\{0,\dots,n\}}a_{S}\det(X|_{S})=\sum_{S\subseteq\{0,\dots,n\}}a_{S}X_{00}\det((X/0)|_{S\setminus 0})=X_{00}Q(X/0)

Thus, if XH(P)X\in H(P) and X00>0X_{00}>0, then

Q(X/0)=\frac{P(X)}{X_{00}}\geq 0.

We can strengthen this result by noting that if we let JJ be the block diagonal matrix given by

J=\begin{pmatrix}0&0\\ 0&I_{n}\end{pmatrix},

then J\in H(P), since it is in particular positive semidefinite. It is clear from the definition that X/0+tI_{n}=(X+tJ)/0. Thus, we have that for all t\geq 0,

Q(X/0+tI_{n})=Q((X+tJ)/0)=\frac{P(X+tJ)}{X_{00}}\geq 0,

which implies that X/0H(Q)X/0\in H(Q). ∎
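The factorization P(X)=X_{00}\,Q(X/0) derived above can be confirmed symbolically, for instance for q=x_{1}+x_{2} (our sketch, assuming Python with sympy).

```python
# Check P(X) = X_00 * Q(X/0) for p = x0*(x1 + x2) (our illustration).
import sympy as sp

X = sp.Matrix(3, 3, lambda i, j: sp.Symbol(f"X_{min(i,j)}{max(i,j)}"))
# minor lift of p = x0*x1 + x0*x2
P = X[[0, 1], [0, 1]].det() + X[[0, 2], [0, 2]].det()

M = X[1:, 1:]
v = X[1:, 0]
schur = M - (v * v.T) / X[0, 0]                   # the Schur complement X/0
Q_at_schur = schur[0, 0] + schur[1, 1]            # minor lift of x1 + x2 at X/0
assert sp.simplify(P - X[0, 0] * Q_at_schur) == 0
```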

Proof of Proposition 7.4.

First assume that X00>0X_{00}>0. By Lemma 7.5, and the spectral containment property for qq, we have that there is an ordering of the eigenvalues of X/0X/0 so that λ(X/0)H(q)\lambda(X/0)\in H(q).

Now, we can write

X=(000X/0)+(X00vvX001vv),X=\begin{pmatrix}0&0\\ 0&X/0\end{pmatrix}+\begin{pmatrix}X_{00}&v^{\intercal}\\ v&X_{00}^{-1}vv^{\intercal}\end{pmatrix},

where the second term is a rank 1 positive semidefinite matrix.

Let X^{\prime}=\begin{pmatrix}0&0\\ 0&X/0\end{pmatrix}. Note that X^{\prime} is block diagonal, so that if \mu is an eigenvalue vector for X/0, then 0\oplus\mu is an eigenvalue vector for X^{\prime}. In particular, taking \mu=\lambda(X/0)\in H(q) as above and ordering the entries appropriately, 0\oplus\mu\in H(p), by our characterization of H(p) in terms of H(q).

By the Weyl inequalities, there is an ordering of the eigenvalues of X so that \lambda_{i}(X)\geq\lambda_{i}(X^{\prime}) for each i. This implies that

\lambda(X)=(0\oplus\mu)+w

where w is a nonnegative vector. Since H(p) is a convex cone containing the nonnegative orthant, it follows that \lambda(X)\in H(p).

The case X_{00}=0 follows from continuity of eigenvalues. If X\in H(P), then for every \epsilon>0 the matrix X+\epsilon I lies in H(P) and has top left entry (X+\epsilon I)_{00}>0, and hence has an eigenvalue vector in H(p) by the above. Since the eigenvalues of a symmetric matrix vary continuously with the matrix and H(p) is closed, the property of having an eigenvalue vector in H(p) passes to the limit \epsilon\to 0. Therefore X has an eigenvalue vector in H(p). ∎

7.3. Polynomials Interlaced by an Elementary Symmetric Polynomial

The spectral containment property can be proved more easily for polynomials that are interlaced by some elementary symmetric polynomial.

Before stating the main result, we note that the minor lift map preserves interlacing.

Lemma 7.6.

Let p,q[x1,,xn]p,q\in\mathbb{R}[x_{1},\ldots,x_{n}] be stable, multiaffine and homogeneous. Let P,QP,Q be the associated minor lifts. Then pp interlaces qq if and only if PP interlaces QQ.

Proof.

Assume that pp interlaces qq. Then by the multivariate Hermite–Biehler Theorem [7, Thm. 5.3] we have that p+iqp+iq is stable. Let AA be a symmetric n×nn\times n matrix. We have to show that P(tI+A)P(tI+A) interlaces Q(tI+A)Q(tI+A). From [5, Thm. 1.3] we see that the linear operator TAT_{A} that sends a multiaffine polynomial pp to the polynomial P(Diag(x1,,xn)+A)P(\operatorname*{Diag}(x_{1},\dots,x_{n})+A) is a stability preserver. Thus TA(p+iq)T_{A}(p+iq) is stable. Substituting tt for all variables in TA(p+iq)T_{A}(p+iq) shows that P(tI+A)+iQ(tI+A)P(tI+A)+iQ(tI+A) is stable. Now the claim follows from another application of the Hermite–Biehler Theorem. The other direction is clear, since pp and qq are the respective restrictions of PP and QQ to the diagonal matrices. ∎

Lemma 7.7.

Suppose that pp is a stable, multiaffine and homogeneous polynomial of degree dd, and that ed1e_{d-1} interlaces pp. Further suppose that for any XH(P)X\in H(P), there is some eigenvalue vector λ\lambda of XX, such that p(λ)P(X)p(\lambda)\geq P(X). Then pp has the spectral containment property.

Proof.

We first note the fact that if pp is any hyperbolic polynomial, and qq interlaces pp, then xx is in the interior of H(p)H(p) if and only if xx is in H(q)H(q) and p(x)>0p(x)>0. This follows easily from considering the bivariate case.

Let XX be in the interior of H(P)H(P). We first want to show that there is an eigenvalue vector of XX that is contained in H(p)H(p); the case for general XX will then follow from the fact that the eigenvalues of a symmetric matrix are continuous as a function of the entries of the matrix.

Since ed1e_{d-1} interlaces pp, by Lemma 7.6, we have that Ed1E_{d-1} interlaces PP. From this, we conclude that since XH(P)X\in H(P), XX is contained in H(Ed1)H(E_{d-1}), and so any vector of eigenvalues of XX is contained in H(ed1)H(e_{d-1}).

By assumption there is an eigenvalue vector \lambda of X with 0<P(X)\leq p(\lambda). Since \lambda\in H(e_{d-1}) and e_{d-1} interlaces p, this \lambda must be in the interior of H(p), as desired. ∎

In Proposition 7.9 below, we show that the set of stable multiaffine forms of degree d that are interlaced by e_{d-1} contains an open neighborhood of e_{d}. This implies that if we have a stable multiaffine form p which is sufficiently close to e_{d}, then p will have the spectral containment property as long as for any X\in H(P), there is some eigenvalue vector \lambda with p(\lambda)\geq P(X).

We will apply this lemma in a few cases, together with some variational characterizations for eigenvalues to show the spectral containment property for some special kinds of polynomials.

Lemma 7.8.

Let p,qp,q be multiaffine polynomials of degree d+1d+1 and dd, and let ana\in\mathbb{R}^{n}. There exist multiaffine polynomials m1,,ms,n1,,nsm_{1},\ldots,m_{s},n_{1},\ldots,n_{s} of degree dd such that

DapqpDaq=m1n1++msns.\operatorname{D}_{a}p\cdot q-p\cdot\operatorname{D}_{a}q=m_{1}n_{1}+\ldots+m_{s}n_{s}.
Proof.

This is straightforward. ∎

Proposition 7.9.

There is an open neighborhood UU of ed+1e_{d+1} in the vector space of multiaffine forms of degree d+1d+1 such that every stable multiaffine pUp\in U of degree d+1d+1 is interlaced by ede_{d}.

Proof.

Let II be the ideal generated by all multiaffine polynomials of degree dd and let VV be the degree 2d2d part of I2I^{2}. Let ΣV\Sigma\subset V be the set of all polynomials that can be written as a sum of squares of multiaffine polynomials of degree dd. It follows from the proof of [18, Thm. 6.2] that Deed+1eded+1Deed\operatorname{D}_{e}e_{d+1}\cdot e_{d}-e_{d+1}\cdot\operatorname{D}_{e}e_{d} is in the interior of Σ\Sigma (with respect to the euclidean topology on VV). Thus it follows from Lemma 7.8 that there is an open neighborhood UU of ed+1e_{d+1} such that for every stable multiaffine pUp\in U the polynomial DepedpDeed\operatorname{D}_{e}p\cdot e_{d}-p\cdot\operatorname{D}_{e}e_{d} is in Σ\Sigma. Thus ede_{d} interlaces pp by [19, Thm. 2.1]. ∎

7.4. Generalized Schur-Horn Property and the Spectral Containment Property

We say that an n-variate multiaffine homogeneous polynomial p has the Schur-Horn property if for any n\times n symmetric matrix X with eigenvalue vector \lambda,

maxπ𝔖np(π(λ))=maxUO(n)P(UXU).\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))=\max_{U\in O(n)}P(UXU^{\intercal}).

The Schur-Horn property for pp is equivalent to the fact that for any n×nn\times n symmetric matrix XX with eigenvalue vector λ\lambda,

maxπ𝔖np(π(λ))P(X).\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))\geq P(X).

Another equivalent formulation states that pp has the Schur-Horn property if and only if the maximum of P(UXU)P(UXU^{\intercal}) as UU varies over O(n)O(n) is obtained for some UU such that UXUUXU^{\intercal} is diagonal.

The Schur–Horn theorem states that any linear homogeneous polynomial has the Schur-Horn property. We now relate the Schur-Horn property to the spectral containment property.

Theorem 7.10.

Let p be a homogeneous multiaffine form of degree d. If p has the Schur-Horn property, and e_{d-1} interlaces p, then p has the spectral containment property.

Proof.

It is clear that if pp has the Schur-Horn property, then in particular, for any XH(P)X\in H(P), there is some eigenvalue vector λ\lambda so that p(λ)P(X)p(\lambda)\geq P(X). Therefore, pp has the spectral containment property by Lemma 7.7. ∎

Using the Schur-Horn property and our previous lemmas, we can show that a family of stable polynomials has the spectral containment property.

Lemma 7.11.

If pp is a degree dd homogeneous multiaffine polynomial with the Schur-Horn property, then ed(x)+pe_{d}(x)+p also has the Schur-Horn property.

Proof.

It can easily be seen that if X is an n\times n symmetric matrix with an eigenvalue vector \lambda, then

maxπ𝔖n(ed(π(λ))+p(π(λ)))\displaystyle\max_{\pi\in\mathfrak{S}_{n}}(e_{d}(\pi(\lambda))+p(\pi(\lambda))) =ed(λ)+maxπ𝔖np(π(λ))\displaystyle=e_{d}(\lambda)+\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))
=Ed(X)+maxUO(n)P(UXU)\displaystyle=E_{d}(X)+\max_{U\in O(n)}P(UXU^{\intercal})
=maxUO(n)Ed(UXU)+P(UXU).\displaystyle=\max_{U\in O(n)}E_{d}(UXU^{\intercal})+P(UXU^{\intercal}).

Here the last equality uses that E_{d}(UXU^{\intercal})=e_{d}(\lambda)=E_{d}(X) for every U\in O(n). This gives the desired result. ∎

Lemma 7.12.

If pp is a degree dd homogeneous multiaffine polynomial with the Schur-Horn property, then for ϵ>0\epsilon>0 sufficiently small, ed(x)+ϵpe_{d}(x)+\epsilon p has the spectral containment property.

Proof.

By Proposition 7.9, we see that for \epsilon sufficiently small, e_{d}(x)+\epsilon p is interlaced by e_{d-1}. Moreover, by Lemma 7.11, e_{d}(x)+\epsilon p has the Schur-Horn property. Therefore, by Theorem 7.10, e_{d}(x)+\epsilon p has the spectral containment property. ∎

We now give some examples of polynomials with the Schur-Horn property.

7.5. The Schur-Horn Property For Degree n1n-1 Polynomials

Theorem 7.13.

If p\in\mathbb{R}[x_{1},\dots,x_{n}] is a degree n-1 multiaffine homogeneous polynomial, then p has the Schur-Horn property.

Proof.

Write p(x)=\sum_{i=1}^{n}a_{i}\prod_{j\in[n]\setminus\{i\}}x_{j}. In this case,

P(X)=\sum_{i=1}^{n}a_{i}\det(X|_{[n]\setminus\{i\}}).

Recall that the dual of p(x)p(x) was defined in Section 3, as

p(x)=i=1naixi.p^{*}(x)=\sum_{i=1}^{n}a_{i}x_{i}.

Abusing notation, we define PP^{*} to be

P(X)=i=1naiXii.P^{*}(X)=\sum_{i=1}^{n}a_{i}X_{ii}.

Define the adjugate matrix of XX by Adj(X)=det(X)X1\operatorname*{Adj}(X)=\det(X)X^{-1}. By Cramer’s rule, the diagonal entries of the adjugate matrix are given by

Adj(X)ii=det(X|[n]i).\operatorname*{Adj}(X)_{ii}=\det(X|_{[n]\setminus i}).

Hence, using Remark 3.1, we see that P(Adj(X))=P(X)P^{*}(\operatorname*{Adj}(X))=P(X).

The eigenvalues of \operatorname{Adj}(X) are of the form \mu_{j}=\prod_{i\in[n]\setminus\{j\}}\lambda_{i}, where \lambda is an eigenvalue vector of X. We see then that p^{*}(\mu)=p(\lambda). Now we apply the Schur–Horn theorem to the linear form p^{*} and the matrix \operatorname{Adj}(X) to see that

maxπ𝔖np(π(μ))=maxUO(n)P(UAdj(X)U).\max_{\pi\in\mathfrak{S}_{n}}p^{*}(\pi(\mu))=\max_{U\in O(n)}P^{*}(U^{\intercal}\operatorname*{Adj}(X)U).

Applying our identities relating the vector \mu to \lambda, we see that

\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))=\max_{U\in O(n)}P(U^{\intercal}XU). ∎

From this, we immediately obtain a corollary.

Corollary 7.14.

There is an open set UU in the space of degree n1n-1 homogeneous multiaffine polynomials, such that UU contains en1e_{n-1} and every element of UU is stable and has the spectral containment property.
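The two facts about the adjugate used in the proof of Theorem 7.13, namely that the diagonal entries of \operatorname{Adj}(X) are the (n-1)\times(n-1) principal minors of X and that its eigenvalues are the products of all but one eigenvalue of X, are easy to check numerically (our sketch, assuming numpy).

```python
# Numerical check of the adjugate facts from the proof of Theorem 7.13
# (our illustration).
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((4, 4))
X = X @ X.T + np.eye(4)                        # symmetric and invertible
adj = np.linalg.det(X) * np.linalg.inv(X)      # Adj(X) = det(X) X^{-1}

# diagonal entries of Adj(X) are the (n-1) x (n-1) principal minors
for i in range(4):
    J = [j for j in range(4) if j != i]
    assert np.isclose(adj[i, i], np.linalg.det(X[np.ix_(J, J)]))

# eigenvalues of Adj(X) are mu_j = prod_{i != j} lambda_i
lam = np.linalg.eigvalsh(X)
mu = sorted(np.prod([lam[i] for i in range(4) if i != j]) for j in range(4))
assert np.allclose(sorted(np.linalg.eigvalsh(adj)), mu)
```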

7.6. Extensions of Elementary Symmetric Polynomials and the Schur–Horn Property

We note that it is unclear whether the Schur–Horn property is preserved by adding extra variables. We show that this holds for elementary symmetric polynomials.

Proposition 7.15.

Fix natural numbers dknd\leq k\leq n. Let p=±ed(x1,,xk)[x1,,xn]p=\pm e_{d}(x_{1},\dots,x_{k})\in\mathbb{R}[x_{1},\dots,x_{n}]. Then, pp has the Schur-Horn property.

We can reduce this to the classical Schur-Horn theorem. To do this, we require a lemma involving a construction, which is referred to in [22, Chapter 3] as a derivation of a matrix XX.

Lemma 7.16.

For any n×nn\times n symmetric matrix XX, with eigenvalue vector λ\lambda, there exists a (nk)×(nk)\binom{n}{k}\times\binom{n}{k} symmetric matrix Dk,dXD^{k,d}X with the following two properties:

  • The eigenvalues of Dk,dXD^{k,d}X are precisely those real numbers of the form

    ed(λs1,λs2,,λsk),e_{d}(\lambda_{s_{1}},\lambda_{s_{2}},\dots,\lambda_{s_{k}}),

    where we range over all possible values of s1,,sk[n]s_{1},\dots,s_{k}\in[n] so that s1<s2<<sks_{1}<s_{2}<\dots<s_{k}.

  • The diagonal entries of Dk,dXD^{k,d}X are precisely those of the form Ed(X|S)E_{d}(X|_{S}), where SS ranges over the size kk subsets of [n][n].

Proof of Lemma 7.16.

We will define Dk,dXD^{k,d}X in terms of wedge powers. If we regard XX as an endomorphism from n\mathbb{R}^{n} to n\mathbb{R}^{n}, then Dk,dXD^{k,d}X is defined as an endomorphism of kn\wedge^{k}\mathbb{R}^{n} by letting

D^{k,d}X(v_{1}\wedge v_{2}\wedge\dots\wedge v_{k})=\sum_{S\in\binom{[k]}{d}}w_{S,1}\wedge w_{S,2}\wedge\dots\wedge w_{S,k},

where

w_{S,i}=\begin{cases}Xv_{i}\text{ if }i\in S\\ v_{i}\text{ if }i\not\in S.\end{cases}

It is not hard to see that if v_{1},\dots,v_{k} are linearly independent eigenvectors of X with eigenvalues \lambda_{1},\dots,\lambda_{k} respectively, then v_{1}\wedge\dots\wedge v_{k} is an eigenvector of D^{k,d}X with eigenvalue e_{d}(\lambda_{1},\dots,\lambda_{k}); applying this to the k-element subsets of an eigenbasis of X yields the first property in Lemma 7.16.

On the other hand, if we use the natural basis of \wedge^{k}\mathbb{R}^{n} given by \{e_{s_{1}}\wedge e_{s_{2}}\wedge\dots\wedge e_{s_{k}}\}, where e_{i} is a standard basis vector and s_{1}<s_{2}<\dots<s_{k}, then this basis is orthonormal under the natural inner product of \wedge^{k}\mathbb{R}^{n}, and also

(es1es2esk)Dk,dX(es1es2esk)=Ed(X|{s1,,sk}).(e_{s_{1}}\wedge e_{s_{2}}\wedge\dots\wedge e_{s_{k}})^{\intercal}D^{k,d}X(e_{s_{1}}\wedge e_{s_{2}}\wedge\dots\wedge e_{s_{k}})=E_{d}(X|_{\{s_{1},\dots,s_{k}\}}).

This clearly implies the second property of Lemma 7.16. ∎
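The construction in this proof is explicit enough to implement directly. The following Python sketch (numpy; all function names are ours) builds D^{k,d}X in the basis \{e_{s_{1}}\wedge\dots\wedge e_{s_{k}}\} and verifies both properties of Lemma 7.16 on a random symmetric matrix:

import itertools
import numpy as np

def e_sym(vals, d):
    # elementary symmetric polynomial e_d of the entries of vals
    return sum(np.prod(c) for c in itertools.combinations(vals, d))

def Ed(Y, d):
    # sum of all d x d principal minors of Y
    return sum(np.linalg.det(Y[np.ix_(S, S)])
               for S in itertools.combinations(range(Y.shape[0]), d))

def Dkd(X, k, d):
    # D^{k,d}X in the orthonormal basis e_{s_1} ^ ... ^ e_{s_k}, s_1 < ... < s_k
    n = X.shape[0]
    basis = list(itertools.combinations(range(n), k))
    I, M = np.eye(n), np.zeros((len(basis), len(basis)))
    for b, T in enumerate(basis):
        for S in itertools.combinations(range(k), d):
            # w_i = X e_{t_i} for i in S, w_i = e_{t_i} otherwise
            W = np.column_stack([X @ I[:, t] if i in S else I[:, t]
                                 for i, t in enumerate(T)])
            for r, R in enumerate(basis):
                M[r, b] += np.linalg.det(W[list(R), :])  # coefficient of e_R in the wedge
    return M

rng = np.random.default_rng(0)
n, k, d = 5, 3, 2
A = rng.standard_normal((n, n)); X = (A + A.T) / 2
M = Dkd(X, k, d)
lam = np.linalg.eigvalsh(X)
spec = sorted(e_sym(lam[list(S)], d) for S in itertools.combinations(range(n), k))
assert np.allclose(np.sort(np.linalg.eigvalsh(M)), spec)              # first property
diag = [Ed(X[np.ix_(T, T)], d) for T in itertools.combinations(range(n), k)]
assert np.allclose(np.diag(M), diag)                                  # second property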

Proof of 7.15.

The classical Schur-Horn theorem implies that for any symmetric matrix XX,

maxs1<s2<<sked(λs1,λs2,,λsk)maxS([n]k)Ed(X|S)Ed(X|1,,k),\max_{s_{1}<s_{2}<\dots<s_{k}}e_{d}(\lambda_{s_{1}},\lambda_{s_{2}},\dots,\lambda_{s_{k}})\geq\max_{S\in\binom{[n]}{k}}E_{d}(X|_{S})\geq E_{d}(X|_{1,\dots,k}),

and also that

mins1<s2<<sked(λs1,λs2,,λsk)minS([n]k)Ed(X|S)Ed(X|1,,k).\min_{s_{1}<s_{2}<\dots<s_{k}}e_{d}(\lambda_{s_{1}},\lambda_{s_{2}},\dots,\lambda_{s_{k}})\leq\min_{S\in\binom{[n]}{k}}E_{d}(X|_{S})\leq E_{d}(X|_{1,\dots,k}).

The first statement implies the Schur-Horn property for e_{d}(x_{1},\dots,x_{k}) and the second implies it for -e_{d}(x_{1},\dots,x_{k}): applying the inequalities to U^{\intercal}XU in place of X (which does not change \lambda) bounds P(U^{\intercal}XU) by \max_{\pi}p(\pi(\lambda)), and equality is attained when U diagonalizes X with the eigenvalues suitably ordered. ∎

7.7. A Small Example of the Schur-Horn Property

We give one more example of the Schur-Horn property, which is noteworthy because our proof does not appeal to the classical Schur-Horn theorem.

Lemma 7.17.

The polynomial x1(x2+x3)[x1,x2,x3,x4]x_{1}(x_{2}+x_{3})\in\mathbb{R}[x_{1},x_{2},x_{3},x_{4}] has the Schur-Horn property.

Remark 7.18.

The polynomial x1(x2+x3)[x1,x2,x3]x_{1}(x_{2}+x_{3})\in\mathbb{R}[x_{1},x_{2},x_{3}] clearly has the Schur-Horn property, by Theorem 7.13, but it is not clear that this remains the case if we introduce a new variable.

Proof.

Let D=\operatorname{Diag}(\lambda_{1},\lambda_{2},\lambda_{3},\lambda_{4}). Let U be in \operatorname{SO}(4); this is no loss of generality, since flipping the sign of a column of U changes neither UDU^{\intercal} nor the quantities below. Write its columns as

U=(vwzy).U=\begin{pmatrix}v&w&z&y\end{pmatrix}.

It is not hard to see via an explicit computation that

P(UDU)=det(i=14λivi2i=14λiwivii=14λiwivii=14λiwi2)+det(i=14λivi2i=14λizivii=14λizivii=14λizi2).P(UDU^{\intercal})=\det\begin{pmatrix}\sum_{i=1}^{4}\lambda_{i}v_{i}^{2}&\sum_{i=1}^{4}\lambda_{i}w_{i}v_{i}\\ \sum_{i=1}^{4}\lambda_{i}w_{i}v_{i}&\sum_{i=1}^{4}\lambda_{i}w_{i}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}\sum_{i=1}^{4}\lambda_{i}v_{i}^{2}&\sum_{i=1}^{4}\lambda_{i}z_{i}v_{i}\\ \sum_{i=1}^{4}\lambda_{i}z_{i}v_{i}&\sum_{i=1}^{4}\lambda_{i}z_{i}^{2}\\ \end{pmatrix}.\\

We expand this expression by multilinearity of the determinant, noting that the terms with i=j vanish, to obtain the following:

(2) i=14j=14λiλj(det(vi2wjvjwiviwj2)+det(vi2zjvjzivizj2))\displaystyle\sum_{i=1}^{4}\sum_{j=1}^{4}\lambda_{i}\lambda_{j}\left(\det\begin{pmatrix}v_{i}^{2}&w_{j}v_{j}\\ w_{i}v_{i}&w_{j}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}v_{i}^{2}&z_{j}v_{j}\\ z_{i}v_{i}&z_{j}^{2}\\ \end{pmatrix}\right)
(3) =i=14j<iλiλj(det(vi2wjvjwiviwj2)+det(vi2zjvjzivizj2)+det(vj2wiviwjvjwi2)+det(vj2zivizjvjzi2))\displaystyle=\sum_{i=1}^{4}\sum_{j<i}\lambda_{i}\lambda_{j}\left(\det\begin{pmatrix}v_{i}^{2}&w_{j}v_{j}\\ w_{i}v_{i}&w_{j}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}v_{i}^{2}&z_{j}v_{j}\\ z_{i}v_{i}&z_{j}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}v_{j}^{2}&w_{i}v_{i}\\ w_{j}v_{j}&w_{i}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}v_{j}^{2}&z_{i}v_{i}\\ z_{j}v_{j}&z_{i}^{2}\\ \end{pmatrix}\right)
(4) =i=14j<iλiλj(det(viwivjwj)2+det(vizivjzj)2).\displaystyle=\sum_{i=1}^{4}\sum_{j<i}\lambda_{i}\lambda_{j}\left(\det\begin{pmatrix}v_{i}&w_{i}\\ v_{j}&w_{j}\\ \end{pmatrix}^{2}+\det\begin{pmatrix}v_{i}&z_{i}\\ v_{j}&z_{j}\\ \end{pmatrix}^{2}\right).

We can think of this as a polynomial

γ(λ)=i=14j<iγi,jλiλj,\gamma(\lambda)=\sum_{i=1}^{4}\sum_{j<i}\gamma_{i,j}\lambda_{i}\lambda_{j},

where

γi,j=det(viwivjwj)2+det(vizivjzj)2.\gamma_{i,j}=\det\begin{pmatrix}v_{i}&w_{i}\\ v_{j}&w_{j}\\ \end{pmatrix}^{2}+\det\begin{pmatrix}v_{i}&z_{i}\\ v_{j}&z_{j}\\ \end{pmatrix}^{2}.
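The expansion above is easy to confirm numerically. A minimal Python sketch (numpy; names ours) checking P(UDU^{\intercal})=\gamma(\lambda) for a random orthogonal U:

import numpy as np

rng = np.random.default_rng(0)
lam = rng.standard_normal(4)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal matrix
v, w, z, y = U.T                                   # the columns of U
Y = U @ np.diag(lam) @ U.T

# P(X) = det(X_{1,2}) + det(X_{1,3}) is the minor lift of p = x1(x2 + x3)
P = (np.linalg.det(Y[np.ix_([0, 1], [0, 1])])
     + np.linalg.det(Y[np.ix_([0, 2], [0, 2])]))
gamma = lambda i, j: (v[i]*w[j] - v[j]*w[i])**2 + (v[i]*z[j] - v[j]*z[i])**2
assert abs(P - sum(gamma(i, j) * lam[i] * lam[j]
                   for i in range(4) for j in range(i))) < 1e-8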

We make the following claim:

Lemma 7.19.

γ(λ)\gamma(\lambda) is in the convex hull of the polynomials

ai,j,k(λ)=λi(λj+λk)a_{i,j,k}(\lambda)=\lambda_{i}(\lambda_{j}+\lambda_{k})

where i,j,k\in[4] are pairwise distinct (note that a_{i,j,k}=a_{i,k,j}, so such a polynomial is determined by i together with the pair \{j,k\}\subseteq[4]\setminus\{i\}).

To see that Lemma 7.19 implies the theorem, observe that for any pairwise distinct i,j,k, the value a_{i,j,k}(\lambda) equals p evaluated at a suitable permutation of \lambda, so that

a_{i,j,k}(\lambda)\leq\max_{\pi\in S_{4}}p(\lambda_{\pi(1)},\lambda_{\pi(2)},\lambda_{\pi(3)},\lambda_{\pi(4)}).

Hence, in particular, any convex combination of the a_{i,j,k} is bounded above by this same quantity.

Therefore, since every symmetric matrix is diagonalizable, we have that for any symmetric matrix XX, and any orthogonal UU,

P(U^{\intercal}XU)\leq\max_{\pi\in S_{4}}p(\lambda_{\pi(1)},\lambda_{\pi(2)},\lambda_{\pi(3)},\lambda_{\pi(4)}).

The opposite inequality is easy to see by choosing UU to be an orthogonal matrix diagonalizing XX. ∎

It remains to show Lemma 7.19.

Proof of Lemma 7.19.

Since we work in dimension 4, we can find the inequalities defining this convex hull explicitly using computational methods. It turns out that the polynomial \gamma is in the convex hull of the a_{i,j,k} if and only if for each \{i,j\}\in\binom{[4]}{2},

\gamma_{i,j}\geq 0,

if \{i,j,k,l\}=\{1,2,3,4\}, then

\gamma_{i,j}+\gamma_{k,l}\leq 1,

and

\sum_{\{i,j\}\in\binom{[4]}{2}}\gamma_{i,j}=2.

For our particular values of \gamma_{i,j}, each is by definition a sum of two squares and hence nonnegative. Further, the sum of all of the coefficients of \gamma is \gamma(1,1,1,1), so returning to the definition of the polynomial \gamma,

γ(1,1,1,1)=P(I)=2.\gamma(1,1,1,1)=P(I)=2.

It remains to show that

\gamma_{i,j}+\gamma_{k,l}\leq 1

whenever \{i,j,k,l\}=\{1,2,3,4\}.

Notice that

\gamma_{i,j}+\gamma_{k,l}=\det\begin{pmatrix}v_{i}&w_{i}\\ v_{j}&w_{j}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{i}&z_{i}\\ v_{j}&z_{j}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{k}&w_{k}\\ v_{l}&w_{l}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{k}&z_{k}\\ v_{l}&z_{l}\end{pmatrix}^{2}.

We prove that this is at most 1. Recall that

U=(vwzy)=(v1w1z1y1v2w2z2y2v3w3z3y3v4w4z4y4).U=\begin{pmatrix}v&w&z&y\end{pmatrix}=\begin{pmatrix}v_{1}&w_{1}&z_{1}&y_{1}\\ v_{2}&w_{2}&z_{2}&y_{2}\\ v_{3}&w_{3}&z_{3}&y_{3}\\ v_{4}&w_{4}&z_{4}&y_{4}\\ \end{pmatrix}.

The Jacobi complementary minors theorem for matrix inverses implies that if S,T\subseteq[4] with |S|=|T|, then

\det(U|_{S,T})=\pm\det(U)\det(U^{-1}|_{T^{c},S^{c}})=\pm\det(U^{\intercal}|_{T^{c},S^{c}})=\pm\det(U|_{S^{c},T^{c}}).

We now see that

\det\begin{pmatrix}v_{i}&w_{i}\\ v_{j}&w_{j}\end{pmatrix}^{2}=\det\begin{pmatrix}z_{k}&y_{k}\\ z_{l}&y_{l}\end{pmatrix}^{2}.

Similarly, we must have

det(vizivjzj)2=det(wkykwlyl)2.\det\begin{pmatrix}v_{i}&z_{i}\\ v_{j}&z_{j}\end{pmatrix}^{2}=\det\begin{pmatrix}w_{k}&y_{k}\\ w_{l}&y_{l}\end{pmatrix}^{2}.

Let

M=\begin{pmatrix}v_{k}&w_{k}&z_{k}&y_{k}\\ v_{l}&w_{l}&z_{l}&y_{l}\end{pmatrix}

be the 2\times 4 submatrix of U formed by rows k and l.

Notice that since U is orthogonal, these two rows are orthonormal, and so MM^{\intercal}=I. Applying the Cauchy–Binet theorem to \det(MM^{\intercal})=\det(I) we see that

1 =\det(MM^{\intercal})
=\sum_{S\in\binom{[4]}{2}}\det(M_{S})\det(M_{S}^{\intercal})
=\sum_{S\in\binom{[4]}{2}}\det(M_{S})^{2}
\geq\det\begin{pmatrix}z_{k}&y_{k}\\ z_{l}&y_{l}\end{pmatrix}^{2}+\det\begin{pmatrix}w_{k}&y_{k}\\ w_{l}&y_{l}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{k}&w_{k}\\ v_{l}&w_{l}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{k}&z_{k}\\ v_{l}&z_{l}\end{pmatrix}^{2}
=\gamma_{i,j}+\gamma_{k,l}\,,

where M_{S} denotes the 2\times 2 submatrix of M whose columns are indexed by S, and the last equality uses the two Jacobi identities above.

The result now follows. ∎
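The three families of conditions used in the proof of Lemma 7.19 can also be spot-checked numerically. A minimal Python sketch (numpy; names ours):

import numpy as np

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal matrix
v, w, z, y = U.T                                   # the columns of U
gamma = lambda i, j: (v[i]*w[j] - v[j]*w[i])**2 + (v[i]*z[j] - v[j]*z[i])**2
pairs = [(i, j) for i in range(4) for j in range(i)]
assert all(gamma(i, j) >= 0 for i, j in pairs)                 # nonnegativity
assert abs(sum(gamma(i, j) for i, j in pairs) - 2) < 1e-8      # coefficients sum to 2
for (i, j), (k, l) in [((1, 0), (3, 2)), ((2, 0), (3, 1)), ((3, 0), (2, 1))]:
    assert gamma(i, j) + gamma(k, l) <= 1 + 1e-8               # complementary pairs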

8. The permutation property

The goal of this section is to prove Theorem 2.11. It says that given any point v in the hyperbolicity cone of e_{k} and any other homogeneous stable multiaffine polynomial h of the same degree, some permutation of the coordinates of v is in the hyperbolicity cone of h. We call this remarkable property of e_{k} the permutation property. We first need some preparation.

Lemma 8.1.

Assume that the homogeneous stable polynomials g,h[x1,,xn]g,h\in\mathbb{R}[x_{1},\ldots,x_{n}] have nonnegative coefficients and a common interlacer. Then f=g+hf=g+h is stable. If vv is in the hyperbolicity cone of ff, then vv is in the hyperbolicity cone of gg or in the hyperbolicity cone of hh.

Proof.

Let e be the all-ones vector. The univariate polynomials F=f(te-v), G=g(te-v) and H=h(te-v) have a common interlacer. Further, all roots of F are nonnegative. The existence of a common interlacer implies that G and H have at most one negative root each. Assume for the sake of contradiction that both G and H have a negative root. Then G and H have the same (nonzero) sign at the smallest root of F. This contradicts F=G+H. Thus either G or H has only nonnegative roots, which implies the claim. ∎

Lemma 8.2.

Let h[x1,,xn]h\in\mathbb{R}[x_{1},\ldots,x_{n}] be homogeneous, multiaffine and stable. Let τ𝔖n\tau\in\mathfrak{S}_{n} be a transposition. Then hh and τ(h)\tau(h) have a common interlacer.

Proof.

Without loss of generality assume that τ=(12)\tau=(12) and let g=τ(h)g=\tau(h). We can write

h=Ax1x2+Bx1+Cx2+Dh=A\cdot x_{1}\cdot x_{2}+B\cdot x_{1}+C\cdot x_{2}+D

for some multiaffine A,B,C,D[x3,,xn]A,B,C,D\in\mathbb{R}[x_{3},\ldots,x_{n}]. Then the polynomial

(x1+x2)h=A(x1+x2)+B+C=(x1+x2)g\left(\frac{\partial}{\partial x_{1}}+\frac{\partial}{\partial x_{2}}\right)h=A\cdot(x_{1}+x_{2})+B+C=\left(\frac{\partial}{\partial x_{1}}+\frac{\partial}{\partial x_{2}}\right)g

is a common interlacer of hh and gg. ∎
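For a concrete instance of Lemma 8.2, take h=x_{1}(x_{2}+x_{3}) and \tau=(12), so that the common interlacer constructed above is x_{1}+x_{2}+x_{3}. The following Python sketch (numpy; names ours) verifies the interlacing numerically along random lines in direction \mathbf{1}:

import numpy as np

h  = lambda x: x[0] * (x[1] + x[2])        # h = x1(x2 + x3)
g  = lambda x: x[1] * (x[0] + x[2])        # g = (12)h
f0 = lambda x: x[0] + x[1] + x[2]          # candidate common interlacer

def line_roots(f, v, deg):
    # roots of t -> f(v - t*1), recovered by interpolation at deg+1 points
    ts = np.arange(deg + 1, dtype=float)
    vals = [f(v - t * np.ones(3)) for t in ts]
    return np.sort(np.roots(np.polyfit(ts, vals, deg)).real)

rng = np.random.default_rng(0)
for _ in range(1000):
    v = rng.standard_normal(3)
    r = line_roots(f0, v, 1)[0]
    for f in (h, g):
        lo, hi = line_roots(f, v, 2)
        assert lo - 1e-7 <= r <= hi + 1e-7   # the root of f0 separates those of f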

Corollary 8.3.

Let h[x1,,xn]h\in\mathbb{R}[x_{1},\ldots,x_{n}] be homogeneous, multiaffine and stable. Let τ𝔖n\tau\in\mathfrak{S}_{n} be a transposition, g=τ(h)g=\tau(h) and f=λg+μhf=\lambda g+\mu h for some nonnegative λ,μ\lambda,\mu\in\mathbb{R}. Then C(f,e)C(g,e)C(h,e)\textnormal{C}(f,e)\subset\textnormal{C}(g,e)\cup\textnormal{C}(h,e).

Proof.

This is a direct consequence of the two preceding lemmas. ∎

Let \mathbb{Q}[\mathfrak{S}_{n}] be the group algebra of the symmetric group \mathfrak{S}_{n} on n elements, i.e., \mathbb{Q}[\mathfrak{S}_{n}] is the vector space over \mathbb{Q} with basis e_{g} for g\in\mathfrak{S}_{n}, whose ring structure is defined by extending e_{g}\cdot e_{h}:=e_{g\cdot h} linearly. In \mathbb{Q}[\mathfrak{S}_{n}] we have the identity

(5) j=2ni=1j1(1+1jie(ij))=g𝔖neg,\prod_{j=2}^{n}\prod_{i=1}^{j-1}\left(1+\frac{1}{j-i}\cdot e_{(ij)}\right)=\sum_{g\in\mathfrak{S}_{n}}e_{g},

see for example [23, p. 192]; a quick computational check of this identity for small n is given below. From this we obtain our desired theorem.
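A minimal Python sketch verifying (5) for n=3 (the encoding of group algebra elements as dictionaries keyed by permutations is ours):

from fractions import Fraction
from itertools import permutations

n = 3
ident = tuple(range(n))

def mult(a, b):
    # product in the group algebra; permutations stored as tuples of images
    c = {}
    for g, x in a.items():
        for h, y in b.items():
            gh = tuple(g[h[i]] for i in range(n))   # composition: apply h, then g
            c[gh] = c.get(gh, Fraction(0)) + x * y
    return c

prod = {ident: Fraction(1)}
for j in range(1, n):            # j runs over 2, ..., n (0-indexed here)
    for i in range(j):           # i runs over 1, ..., j-1
        t = list(ident); t[i], t[j] = t[j], t[i]    # the transposition (i j)
        prod = mult(prod, {ident: Fraction(1), tuple(t): Fraction(1, j - i)})

assert prod == {tuple(s): Fraction(1) for s in permutations(range(n))}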

Theorem 8.4.

Let e_{d}\in\mathbb{R}[x_{1},\ldots,x_{n}] be the elementary symmetric polynomial of degree d and h\in\mathbb{R}[x_{1},\ldots,x_{n}] any other nonzero homogeneous multiaffine stable polynomial of degree d. If v is in the hyperbolicity cone of e_{d}, then \tau(v) is in the hyperbolicity cone of h for some permutation \tau\in\mathfrak{S}_{n}.

Proof.

We have c\cdot e_{d}=(\sum_{g\in\mathfrak{S}_{n}}e_{g})h for some nonzero scalar c\in\mathbb{R}. Thus by Equation 5 we can write

c\cdot e_{d}=\left(\prod_{i=1}^{r}(1+\lambda_{i}e_{\tau_{i}})\right)h

for some positive \lambda_{i}\in\mathbb{R}, transpositions \tau_{i}\in\mathfrak{S}_{n} and r=\binom{n}{2}. We define h_{k}=\left(\prod_{i=1}^{k}(1+\lambda_{i}e_{\tau_{i}})\right)h for k=0,\ldots,r. Since h_{k}=h_{k-1}+\lambda_{k}\tau_{k}(h_{k-1}), 8.3 implies that if v is in the hyperbolicity cone of h_{k}, then either v or \tau_{k}(v) is in the hyperbolicity cone of h_{k-1}. Since h_{r}=c\cdot e_{d} and h_{0}=h, this argument shows that if v is in the hyperbolicity cone of e_{d}, then (\tau_{i_{1}}\circ\cdots\circ\tau_{i_{s}})(v) is in the hyperbolicity cone of h for some 1\leq i_{1}<\cdots<i_{s}\leq r. ∎
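The theorem can be illustrated numerically. In the following Python sketch (numpy; names ours) we take n=3, d=2 and h=x_{1}(x_{2}+x_{3}), sample random points in the hyperbolicity cone of e_{2}, and check that some permutation of each sampled point lies in the hyperbolicity cone of h:

import numpy as np
from itertools import permutations

e2 = lambda x: x[0]*x[1] + x[0]*x[2] + x[1]*x[2]
h  = lambda x: x[0] * (x[1] + x[2])      # multiaffine, stable, degree 2

def line_roots(f, v):
    # roots of t -> f(v - t*1) for a quadratic f, by interpolation
    ts = np.array([0.0, 1.0, 2.0])
    vals = [f(v - t * np.ones(3)) for t in ts]
    return np.roots(np.polyfit(ts, vals, 2)).real

rng = np.random.default_rng(0)
hits = 0
while hits < 100:
    v = rng.standard_normal(3)
    if line_roots(e2, v).min() < 0:
        continue                          # v is not in H(e2); resample
    hits += 1
    assert any(line_roots(h, np.array(s)).min() >= -1e-8
               for s in permutations(v))  # some permutation lies in H(h)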

9. Open problems

Our work sparks a wide range of open problems, and we mention some of them here. For several of these problems we have presented proofs of special cases, while the general case remains open.

9.1. Hyperbolic Schur-Horn Theorem

In Section 4 we proved hyperbolic generalizations of the Fisher–Hadamard and Koteljanskii inequalities (Theorem 2.4 and Theorem 2.6). Here we present a potential hyperbolic generalization of another classical linear algebra result, the Schur–Horn theorem.

The Schur–Horn theorem already played a major role in the previous section on the spectral containment property, where we introduced a generalized version called the Schur–Horn property. Here we formulate a different generalization of the Schur–Horn theorem in terms of hyperbolic polynomials.

We will formulate our generalization in the language of majorization. Given polynomials p and q of the same degree, both hyperbolic with respect to the direction v, we say that p majorizes q in direction v if for all x\in\mathbb{R}^{n} the roots of p(x-tv) majorize the roots of q(x-tv). Recall that given \alpha,\beta\in\mathbb{R}^{k}, \alpha majorizes \beta if \sum_{i=1}^{k}\alpha_{i}=\sum_{i=1}^{k}\beta_{i} and the following holds: if \alpha^{\prime},\beta^{\prime} are obtained from \alpha,\beta by reordering coordinates so that \alpha^{\prime}_{1}\geq\dots\geq\alpha^{\prime}_{k} and \beta^{\prime}_{1}\geq\dots\geq\beta^{\prime}_{k}, then \sum_{i=1}^{m}\alpha^{\prime}_{i}\geq\sum_{i=1}^{m}\beta^{\prime}_{i} for each 1\leq m<k. Equivalently, \alpha majorizes \beta if and only if \beta\in\operatorname{conv}(\mathfrak{S}_{k}(\alpha)), where the symmetric group \mathfrak{S}_{k} acts on \alpha by permuting its coordinates.

In this language, we can restate the Schur direction of the Schur-Horn theorem as follows:

Lemma 9.1.

(Schur) det(X)\det(X) majorizes det(diag(X))\det(\operatorname*{diag}(X)) in the identity direction.

We conjecture that this holds for all homogeneous PSD-stable lpm-polynomials.

Conjecture 9.2.

Let PP be a homogeneous PSD-stable lpm-polynomial. Then P(X)P(X) majorizes P(diag(X))P(\operatorname*{diag}(X)) in the identity direction.
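The conjecture is easy to test numerically. A minimal Python sketch (numpy; helper names ours) checking it for the lpm polynomial E_{k}, where it follows from 9.4 below, on a random symmetric matrix:

import itertools
import numpy as np

def Ek(Y, k):
    # sum of all k x k principal minors of Y
    return sum(np.linalg.det(Y[np.ix_(S, S)])
               for S in itertools.combinations(range(Y.shape[0]), k))

def roots_shift(Y, k):
    # roots of t -> E_k(Y - tI), recovered by interpolation; sorted decreasingly
    n = Y.shape[0]
    ts = np.arange(k + 1, dtype=float)
    vals = [Ek(Y - t * np.eye(n), k) for t in ts]
    return np.sort(np.roots(np.polyfit(ts, vals, k)).real)[::-1]

def majorizes(a, b, tol=1e-8):
    return (abs(a.sum() - b.sum()) < tol and
            all(a[:m].sum() >= b[:m].sum() - tol for m in range(1, len(a))))

rng = np.random.default_rng(0)
n, k = 5, 3
A = rng.standard_normal((n, n)); X = (A + A.T) / 2
assert majorizes(roots_shift(X, k), roots_shift(np.diag(np.diag(X)), k))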

Recall that for 1\leq k\leq n we defined E_{k}(X)=\sum_{|S|=k}\det X_{S} to be the minor lift of the degree k elementary symmetric polynomial, i.e., the sum of all k\times k principal minors of X. We are able to prove this conjecture for rescalings of E_{k}. Our proof uses the following result from [4], which follows from Theorem 1 of that paper.

Proposition 9.3.

If p majorizes q in direction v, then D_{v}p majorizes D_{v}q in direction v, where D_{v} denotes the directional derivative in the direction v.

Now we are ready to state and prove our result.

Proposition 9.4.

Let DD be any positive diagonal matrix, and P(X)=Ek(D1/2XD1/2)P(X)=E_{k}(D^{-1/2}XD^{-1/2}). Then P(X)P(X) majorizes P(diag(X))P(\operatorname*{diag}(X)) in the identity direction.

Proof.

First notice that \det(X) majorizes \det(\operatorname{diag}(X)) in the D direction, i.e., the roots of \det(X-tD) majorize the roots of \det(\operatorname{diag}(X)-tD) for any X. This follows from applying Schur's original theorem to the symmetric matrix D^{-1/2}XD^{-1/2}, since we have \det(X-tD)=\det(D)\det(D^{-1/2}XD^{-1/2}-tI) and similarly \det(\operatorname{diag}(X)-tD)=\det(D)\det(D^{-1/2}\operatorname{diag}(X)D^{-1/2}-tI). Also notice that D^{-1/2}\operatorname{diag}(X)D^{-1/2}=\operatorname{diag}(D^{-1/2}XD^{-1/2}).

Now we apply 9.3 (n-k) times with p=\det(X), q=\det(\operatorname{diag}(X)) and v=D. This shows that p^{(k)}(X)=\sum_{|S|=k}\det(X_{S})\prod_{i\notin S}D_{ii} majorizes q^{(k)}(X)=\sum_{|S|=k}\det(\operatorname{diag}(X)_{S})\prod_{i\notin S}D_{ii} in the D direction; here p^{(k)} and q^{(k)} equal the iterated directional derivatives up to a positive constant factor, which does not affect majorization. Computing p^{(k)}(X-tD) we have

p^{(k)}(X-tD) =\sum_{|S|=k}\det(X_{S}-tD_{S})\prod_{i\notin S}D_{ii}
=\sum_{|S|=k}\det(D_{S})\det(D_{S}^{-1/2}X_{S}D_{S}^{-1/2}-tI_{S})\prod_{i\notin S}D_{ii}
=\det(D)\sum_{|S|=k}\det(D_{S}^{-1/2}X_{S}D_{S}^{-1/2}-tI_{S})
=\det(D)E_{k}(D^{-1/2}XD^{-1/2}-tI).

Similarly, q^{(k)}(X-tD)=\det(D)E_{k}(D^{-1/2}\operatorname{diag}(X)D^{-1/2}-tI). This shows that the roots of E_{k}(D^{-1/2}XD^{-1/2}-tI) majorize the roots of E_{k}(D^{-1/2}\operatorname{diag}(X)D^{-1/2}-tI). This completes the proof. ∎

We may also formulate a hyperbolic generalization of Horn's theorem, which we conjecture to be true but for which we do not have any partial results.

Conjecture 9.5.

Let P be any degree k lpm-polynomial. Let \lambda,\mu\in\mathbb{R}^{k} be such that \lambda majorizes \mu. Then there exists a symmetric matrix X such that the roots of P(X-tI) are given by \lambda and the roots of P(\operatorname{diag}(X)-tI) are given by \mu.

9.2. Spectral containment property and the Schur-Horn property

We showed that many polynomials have the spectral containment property. Based on these examples and additional computational evidence we conjecture the following:

Conjecture 9.6.

All homogeneous multiaffine stable polynomials have the spectral containment property.

There are several special cases of this conjecture that are of particular interest; we state them separately.

Conjecture 9.7.

All quadratic homogeneous multiaffine stable polynomials have the spectral containment property.

This case is of special interest because quadratic multiaffine polynomials have especially simple minor lifts. Namely, if

p(x)=ijaijxixj,p(x)=\sum_{i\neq j}a_{ij}x_{i}x_{j},

then

P(X)=p(diag(X))ijaijXij2.P(X)=p(\operatorname*{diag}(X))-\sum_{i\neq j}a_{ij}X_{ij}^{2}.

It is therefore plausible that this conjecture could be proved (or disproved) by exploiting this special structure.
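For instance, the displayed formula for the minor lift can be confirmed directly. A minimal Python sketch (numpy; names ours):

import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n = 4
a = rng.random((n, n)); a = (a + a.T) / 2; np.fill_diagonal(a, 0)  # a_ij = a_ji, a_ii = 0
A = rng.standard_normal((n, n)); X = (A + A.T) / 2

# minor lift: the monomial x_i x_j (total coefficient 2 a_ij) becomes det X_{i,j}
lift = sum(2 * a[i, j] * np.linalg.det(X[np.ix_([i, j], [i, j])])
           for i, j in combinations(range(n), 2))
d = np.diag(X)
off = [(i, j) for i in range(n) for j in range(n) if i != j]
closed = (sum(a[i, j] * d[i] * d[j] for i, j in off)          # p(diag(X))
          - sum(a[i, j] * X[i, j]**2 for i, j in off))
assert abs(lift - closed) < 1e-10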

Conjecture 9.8.

Let DD be a positive definite diagonal matrix, and let p(x)=ek(Dx)p(x)=e_{k}(Dx). Then p(x)p(x) has the spectral containment property.

Again, this is of special interest because of its relation to diagonal congruence, as we now explain.

Lemma 9.9.

Let p be a homogeneous, multiaffine stable polynomial, let D be a positive definite diagonal matrix, and let q=p(Dx). Then x\in H(q) if and only if Dx\in H(p), and X\in H(Q) if and only if D^{1/2}XD^{1/2}\in H(P).

Proof.

xH(q)x\in H(q) if and only if q(x+t1)0q(x+t\vec{1})\geq 0 for all t0t\geq 0. This is equivalent to the statement that p(D(x+t1))=p(Dx+tdiag(D))0p(D(x+t\vec{1}))=p(Dx+t\operatorname*{diag}(D))\geq 0 for all t0t\geq 0. Notice though that if DD is positive definite, then diag(D)\operatorname*{diag}(D) is in the interior of the hyperbolicity cone of pp. Therefore, p(Dx+tdiag(D))0p(Dx+t\operatorname*{diag}(D))\geq 0 for all t0t\geq 0 if and only if DxH(p)Dx\in H(p).

Similarly, if p(x)=S[n]aSiSxip(x)=\sum_{S\subseteq[n]}a_{S}\prod_{i\in S}x_{i}, we see that q(x)=S[n](iSDiiaS)iSxiq(x)=\sum_{S\subseteq[n]}(\prod_{i\in S}D_{ii}a_{S})\prod_{i\in S}x_{i}. Therefore,

Q(X)=S[n](iSDiiaS)det(X|S)=S[n]aSdet((D1/2XD1/2)|S)=P(D1/2XD1/2).Q(X)=\sum_{S\subseteq[n]}(\prod_{i\in S}D_{ii}a_{S})\det(X|_{S})=\sum_{S\subseteq[n]}a_{S}\det((D^{1/2}XD^{1/2})|_{S})=P(D^{1/2}XD^{1/2}).

We thus have that

Q(X+tI)=P(D1/2(X+tI)D1/2)=P(D1/2XD1/2+tD).Q(X+tI)=P(D^{1/2}(X+tI)D^{1/2})=P(D^{1/2}XD^{1/2}+tD).

Because DD is positive definite, it is in the interior of H(P)H(P), and therefore, P(D1/2XD1/2+tD)0P(D^{1/2}XD^{1/2}+tD)\geq 0 for all t0t\geq 0 if and only if D1/2XD1/2H(P)D^{1/2}XD^{1/2}\in H(P). This implies the result. ∎

From this we see that 9.8 is equivalent to the statement that for any X\in H(E_{k}) and any positive definite diagonal matrix D, there exists an eigenvalue vector \lambda of D^{1/2}XD^{1/2} such that D^{-1}\lambda\in H(e_{k}). This would give a very quantitative relationship between the eigenvalues of a symmetric matrix X and those of D^{1/2}XD^{1/2}, which are of fundamental interest in a number of situations.

The Schur-Horn property is another interesting property of a multiaffine polynomial. Once again, despite computer search, we have been unable to find an example of a multiaffine homogeneous polynomial that does not have the Schur-Horn property. This leads us to conjecture the following:

Conjecture 9.10.

All homogeneous multiaffine polynomials have the Schur-Horn property.

References

  • [1] Nima Anari, Shayan Oveis Gharan, Amin Saberi, and Mohit Singh. Nash social welfare, matrix permanent, and stable polynomials. In 8th Innovations in Theoretical Computer Science Conference, volume 67 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 36, 12. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2017.
  • [2] Julius Borcea and Petter Brändén. Applications of stable polynomials to mixed determinants: Johnson's conjectures, unimodality, and symmetrized Fischer products. Duke Mathematical Journal, 143(2):205–223, 2008.
  • [3] Julius Borcea and Petter Brändén. The Lee-Yang and Pólya-Schur programs. I. Linear operators preserving stability. Invent. Math., 177(3):541–569, 2009.
  • [4] Julius Borcea and Petter Brändén. Hyperbolicity preservers and majorization. Comptes Rendus Mathematique, 348(15):843–846, 2010.
  • [5] Julius Borcea and Petter Brändén. Multivariate Pólya-Schur classification problems in the Weyl algebra. Proc. Lond. Math. Soc. (3), 101(1):73–104, 2010.
  • [6] Julius Borcea, Petter Brändén, and Thomas M. Liggett. Negative dependence and the geometry of polynomials. Journal of the American Mathematical Society, 22(2):521–567, 2009.
  • [7] Petter Brändén. Polynomials with the half-plane property and matroid theory. Adv. Math., 216(1):302–320, 2007.
  • [8] Petter Brändén and June Huh. Lorentzian polynomials. Annals of Mathematics, 192(3):821–891, 2020.
  • [9] Young-Bin Choe, James G. Oxley, Alan D. Sokal, and David G. Wagner. Homogeneous multivariate polynomials with the half-plane property. volume 32, pages 88–187. 2004.
  • [10] Lars Gårding. Linear hyperbolic partial differential equations with constant coefficients. Acta Math., 85:1–62, 1951.
  • [11] Lars Gårding. An inequality for hyperbolic polynomials. Journal of Mathematics and Mechanics, 8(6):957–965, 1959.
  • [12] Daniel R. Grayson and Michael E. Stillman. Macaulay2, a software system for research in algebraic geometry. Available at https://faculty.math.illinois.edu/Macaulay2/.
  • [13] F. Reese Harvey and H. Blaine Lawson, Jr. Hyperbolic polynomials and the Dirichlet problem. arXiv preprint arXiv:0912.5220, 2009.
  • [14] Alfred Horn. Doubly stochastic matrices and the diagonal of a rotation matrix. Amer. J. Math., 76:620–630, 1954.
  • [15] Charles R. Johnson and Sivaram K. Narayan. The Koteljanskii inequalities for mixed matrices. Linear and Multilinear Algebra, 62(12):1583–1590, 2014.
  • [16] Thorsten Jörgens and Thorsten Theobald. Conic stability of polynomials. Res. Math. Sci., 5(2):Paper No. 26, 12, 2018.
  • [17] D. M. Koteljanskii. A property of sign-symmetric matrices. American Mathematical Society Translations, (27):19–23, 1963.
  • [18] Mario Kummer. Spectral linear matrix inequalities. Adv. Math., 384:Paper No. 107749, 36, 2021.
  • [19] Mario Kummer, Daniel Plaumann, and Cynthia Vinzant. Hyperbolic polynomials, interlacers, and sums of squares. Math. Program., 153(1, Ser. B):223–245, 2015.
  • [20] Pierre Lalonde. A non-commutative version of Jacobi’s equality on the cofactors of a matrix. Discrete Math., 158(1-3):161–172, 1996.
  • [21] Nam Q. Le. Hadamard-type inequalities for kk-positive matrices, December 2021.
  • [22] Marvin Marcus. Finite Dimensional Multilinear Algebra. Marcel Dekker Inc., 1973.
  • [23] Maxim Nazarov. Young’s symmetrizers for projective representations of the symmetric group. Adv. Math., 127(2):190–257, 1997.
  • [24] Robin Pemantle. Hyperbolicity and stable polynomials in combinatorics and probability. In Current developments in mathematics, 2011, pages 57–123. Int. Press, Somerville, MA, 2012.
  • [25] James Renegar. Hyperbolic programs, and their derivative relaxations. Foundations of Computational Mathematics, 6:59–79, 2006.
  • [26] Raman Sanyal and James Saunderson. Spectral polyhedra, January 2020.
  • [27] James Saunderson. Certifying polynomial nonnegativity via hyperbolic optimization. SIAM J. Appl. Algebra Geom., 3(4):661–690, 2019.
  • [28] Issai Schur. Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantentheorie. Sitzungsberichte der Berliner Mathematischen Gesellschaft, 22(9-20):51, 1923.
  • [29] David G. Wagner. Multivariate stable polynomials: theory and applications. Bull. Amer. Math. Soc. (N.S.), 48(1):53–84, 2011.