
Linear Principal Minor Polynomials: hyperbolic determinantal inequalities and spectral containment

Grigoriy Blekherman School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia [email protected] Mario Kummer Technische Universität Dresden, Germany [email protected] Raman Sanyal Institut für Mathematik, Goethe-Universität Frankfurt, Germany [email protected] Kevin Shu School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia [email protected]  and  Shengding Sun School of Mathematics, Georgia Institute of Technology, Atlanta, Georgia [email protected]
Abstract.

A linear principal minor polynomial or lpm polynomial is a linear combination of principal minors of a symmetric matrix. By restricting to the diagonal, lpm polynomials are in bijection with multiaffine polynomials. We show that this establishes a one-to-one correspondence between homogeneous multiaffine stable polynomials and PSD-stable lpm polynomials. This yields new construction techniques for hyperbolic polynomials and allows us to generalize the well-known Fischer–Hadamard and Koteljanskii inequalities from determinants to PSD-stable lpm polynomials. We investigate the relationship between the associated hyperbolicity cones and conjecture a relationship between the eigenvalues of a symmetric matrix and the values of certain lpm polynomials evaluated at that matrix. We refer to this relationship as spectral containment.

We would like to thank Santanu Dey for many insightful conversations. Grigoriy Blekherman, Kevin Shu and Shengding Sun were partially supported by NSF grant DMS-1901950. Mario Kummer was partially supported by DFG grant 421473641.

1. Introduction

A homogeneous polynomial p[x]:=[x1,,xn]p\in\mathbb{R}[{x}]:=\mathbb{R}[x_{1},\dots,x_{n}] is called hyperbolic with respect to an{a}\in\mathbb{R}^{n} if p(a)0p({a})\neq 0 and pa(t):=p(vta)[t]p_{{a}}(t):=p({v}-t{a})\in\mathbb{R}[t] has only real roots for all vn{v}\in\mathbb{R}^{n}. The hyperbolicity cone Ha(p)H_{a}(p) of a polynomial pp hyperbolic with respect to an{a}\in\mathbb{R}^{n} is the set of all vn{v}\in\mathbb{R}^{n} such that p(vta)p({v}-t{a}) has only nonnegative roots. Originally conceived in the context of partial differential equations [10], hyperbolic polynomials were discovered to yield deep results in (non-)linear algebra, combinatorics, and optimization; see, for example, [1, 2, 3, 24, 26, 29].

A fundamental family of hyperbolic polynomials is given by the elementary symmetric polynomials

ek(x):=JiJxi,e_{k}({x})\ :=\ \sum_{J}\prod_{i\in J}x_{i}\,,

where JJ ranges over all kk-element subsets of [n]:={1,,n}[n]:=\{1,\dots,n\}. The elementary symmetric polynomials are stable: a multivariate polynomial p[x]p\in\mathbb{R}[{x}] is stable if for all complex numbers z1,,znz_{1},\ldots,z_{n} lying in the open upper half-plane, we have p(z1,,zn)0p(z_{1},\ldots,z_{n})\neq 0. If pp is homogeneous, then it is stable if and only if it is hyperbolic with respect to all a>0n{a}\in\mathbb{R}^{n}_{>0}, and we denote by H(p)=H𝟏(p)H(p)=H_{\mathbf{1}}(p) its hyperbolicity cone with respect to the vector 𝟏=(1,,1)\mathbf{1}=(1,\dots,1).

Let X denote an n\times n matrix of indeterminates, and for any J\subseteq[n], we let X_{J} denote the principal submatrix of X indexed by J. We can then define a polynomial

Ek(X):=Jdet(XJ),E_{k}({X})\ :=\ \sum_{J}\det({X}_{J})\,,

where again JJ ranges over all kk-element subsets of [n][n]. It turns out that these polynomials do not vanish on the Siegel upper half-plane, i.e., the set of all complex symmetric matrices with positive definite imaginary part. Such polynomials are called Dirichlet–Gårding [13] or PSD-stable [16]. For a homogeneous polynomial PP this property is equivalent to being hyperbolic with respect to any positive definite matrix, and we denote by H(P)H(P) its hyperbolicity cone (taken with respect to the identity matrix). When the context is clear, we will simply refer to PSD-stable polynomials P(X)P({X}) as stable polynomials.

The starting point of our paper is the observation that E_{k}(X) is closely related to e_{k}(x). For instance, if X=\operatorname{Diag}(x_{1},\dots,x_{n}) is the diagonal matrix with diagonal entries X_{ii}=x_{i}, then E_{k}(X)=e_{k}(x_{1},\dots,x_{n}). To generalize this observation, let \mathbb{R}^{n\times n}_{\mathrm{sym}} be the vector space of real symmetric n\times n matrices and let \mathbb{R}[X] be the ring of polynomials on it, where we regard X as an n\times n matrix of indeterminates. A polynomial P(X)\in\mathbb{R}[X] is called a linear principal minor polynomial or lpm polynomial if P(X) is of the form

P(X)=JcJdet(XJ),P({X})\ =\ \sum_{J}c_{J}\det({X}_{J})\,,

where JJ ranges over all subsets of [n][n]. The first natural question we pursue is what interesting properties are shared by a homogeneous lpm polynomial P(X)P({X}) and its diagonal restriction p(x)p({x}). We show that P(X)P({X}) is PSD-stable if and only if p(x)p({x}) is stable. We obtain a similar result for the related concept of Lorentzian polynomials. We prove these facts using the theory of stability preservers [5].

Having established these basic facts, we generalize classical determinantal inequalities from linear algebra, such as the Fischer–Hadamard and Koteljanskii inequalities, to the setting of stable lpm polynomials. This generalizes the Hadamard-type inequalities for k-positive matrices obtained in [21]. Another interesting consequence of the above results is that they yield a construction of a new class of hyperbolic polynomials. Using lpm polynomials we construct a hyperbolic cubic in 6 variables which has a Rayleigh difference that is not a sum of squares. The previously smallest known example, with 43 variables, was constructed by Saunderson in [27]. Finally, we study whether the eigenvalue vector \lambda of a matrix X lying in the hyperbolicity cone of a stable lpm polynomial P(X) lies in the hyperbolicity cone of p(x), and show how this is related to a potential generalization of the classical Schur–Horn theorem [28, 14]. We now discuss our results in detail.

2. Our results in detail

Our discussion of lpm polynomials can also be viewed from a different perspective. A polynomial p\in\mathbb{R}[x]:=\mathbb{R}[x_{1},\dots,x_{n}] is multiaffine if it is a linear combination of square-free monomials x^{J}=\prod_{j\in J}x_{j} for J\subseteq[n]. We define a linear map \Phi from the vector space of multiaffine polynomials in x_{1},\dots,x_{n} to the vector space of lpm polynomials, which we call the minor lift map, as follows. The minor lift of

p(x)=J[n]aJiJxi,p({x})=\sum_{J\subseteq[n]}a_{J}\prod_{i\in J}x_{i},

is the polynomial P=Φ(p)P=\Phi(p) given by

P(X)=J[n]aJdet(XJ).P(X)=\sum_{J\subseteq[n]}a_{J}\det({X}_{J}).

We note that deg(Φ(p))=deg(p)\deg(\Phi(p))=\deg(p) and that Φ(p)\Phi(p) is homogeneous if and only if pp is homogeneous. When it is unambiguous, we will use lower case letters such as pp to denote homogeneous, multiaffine p[x1,,xn]p\in\mathbb{R}[x_{1},\dots,x_{n}], and use the corresponding upper case letters for the minor lift, so that PP is equal to Φ(p)\Phi(p).
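For concreteness, the minor lift map is straightforward to compute. The following minimal sketch (ours, assuming Python with sympy; the helper names are hypothetical) encodes a multiaffine polynomial by its coefficients a_J and returns \Phi(p) as a polynomial in the entries of a symmetric matrix of indeterminates, checking on e_{2} that the diagonal restriction recovers p.

```python
# Minimal sketch of the minor lift map (our illustration, not from the paper).
# A multiaffine p = sum_J a_J prod_{i in J} x_i is encoded as a dict mapping
# frozensets J (subsets of range(n)) to the coefficients a_J.
import itertools
import sympy as sp

def symmetric_matrix(n):
    """Symmetric n x n matrix of indeterminates with X[i,j] = X[j,i]."""
    return sp.Matrix(n, n, lambda i, j: sp.Symbol(f"X_{min(i,j)}{max(i,j)}"))

def minor_lift(coeffs, n):
    """Phi(p) = sum_J a_J det(X_J), summed over the support of p."""
    X = symmetric_matrix(n)
    return sum(a * X[list(J), list(J)].det() for J, a in coeffs.items()), X

# Example: p = e_2(x_1, x_2, x_3); its minor lift is E_2(X).
n = 3
e2 = {frozenset(J): 1 for J in itertools.combinations(range(n), 2)}
P, X = minor_lift(e2, n)

# Restricting P to diagonal matrices recovers p, as stated above.
x = sp.symbols("x0:3")
on_diagonal = {X[i, j]: (x[i] if i == j else 0)
               for i in range(n) for j in range(i, n)}
assert sp.expand(P.subs(on_diagonal)) == sp.expand(
    x[0]*x[1] + x[0]*x[2] + x[1]*x[2])
```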

2.1. Properties of the minor lift map and constructions

Our first result is that the minor lift map sends stable polynomials to PSD-stable polynomials. Even stronger, let us call a matrix A k-locally PSD if every principal k\times k submatrix A_{J} of A is positive semidefinite. The collection \mathrm{PSD}_{k} of k-locally PSD matrices is a closed convex cone and \mathrm{PSD}_{n}\subset\mathrm{PSD}_{n-1}\subset\cdots\subset\mathrm{PSD}_{1}.

Theorem 2.1.

Let pp be a homogeneous multiaffine polynomial of degree kk. If pp is stable, then P=Φ(p)P=\Phi(p) is hyperbolic with PSDkH(P)\mathrm{PSD}_{k}\subseteq H(P). In particular, PP is PSD-stable.

For Asymn×nA\in\mathbb{R}^{n\times n}_{\mathrm{sym}}, let π(A)=(A11,A22,,Ann)\pi(A)=(A_{11},A_{22},\dots,A_{nn}) be the projection to the diagonal. A first implication for the associated hyperbolicity cones is as follows.

Corollary 2.2.

Let pp be a homogeneous multiaffine stable polynomial and P=Φ(p)P=\Phi(p). If AH(P)A\in H(P), then p(π(A))P(A)p(\pi(A))\geq P(A) and π(A)H(p)\pi(A)\in H(p).

Using Theorem 2.1, we are able to construct new interesting hyperbolic polynomials. Given a hyperbolic polynomial pp and points a,v{a},{v} in the hyperbolicity cone of pp, the Rayleigh difference Δv,a(p)=DvpDappDvDap\Delta_{{v},{a}}(p)=D_{{v}}p\cdot D_{{a}}p-p\cdot D_{{v}}D_{{a}}p is a polynomial nonnegative on n\mathbb{R}^{n} [19]. If the polynomial Δv,a(p)=DvpDappDvDap\Delta_{{v},{a}}(p)=D_{{v}}p\cdot D_{{a}}p-p\cdot D_{{v}}D_{{a}}p is not a sum of squares, this has interesting implications for determinantal representations as well as a hyperbolic certificate of nonnegativity of Δv,a(p)\Delta_{{v},{a}}(p) which cannot be recovered by sums of squares. Saunderson [27] characterized all pairs (d,n)(d,n) for which there exists such a hyperbolic polynomial p[x1,,xn]p\in\mathbb{R}[x_{1},\ldots,x_{n}] of degree dd, except when d=3d=3, where the smallest known example with a Rayleigh difference that is not a sum of squares depends on 43 variables. We are able to reduce the number of variables to 6. See Section 6 for more details.

Theorem 2.3.

There exists an (explicit) degree-3 hyperbolic polynomial pp in 66 variables and vectors v,aH(p){v},{a}\in H(p) such that the Rayleigh difference Δv,a(p)\Delta_{{v},{a}}(p) is not a sum-of-squares.

2.2. Hyperbolic determinantal inequalities

We generalize some well known theorems from linear algebra to the setting of lpm polynomials. Note that the cone of positive semidefinite matrices is precisely the hyperbolicity cone of det(X)\det({X}), which is the minor lift of en(x)=x1xne_{n}(x)=x_{1}\cdots x_{n}. For our generalizations, we replace the determinant by the minor lift of a homogeneous multiaffine stable polynomial, and the cone of positive semidefinite matrices by the hyperbolicity cone of the minor lift.

Hadamard’s inequality is a classical result comparing the determinant of any positive semidefinite matrix with the product of its diagonal entries.

Theorem (Hadamard’s inequality).

Let A be an n\times n positive semidefinite matrix. Then \det(A)\leq\prod_{i=1}^{n}A_{ii}.

An equivalent statement of this inequality is as follows: if V is any, not necessarily symmetric, real n\times n matrix with columns v_{1},\dots,v_{n}, then |\det(V)|\leq\prod_{i=1}^{n}\|v_{i}\|_{2}. This yields a geometric interpretation, since the absolute value of the determinant is the volume of the n-dimensional parallelepiped with edges v_{1},\dots,v_{n}.

Fischer's inequality generalizes Hadamard's inequality, and relates the determinant of a positive semidefinite matrix to its principal minors. Let \Pi=\{S_{1},\dots,S_{m}\} be a partition of the set [n] into m disjoint subsets. Given such a partition, we write i\sim j if i,j\in S_{k} for some k=1,\dots,m. Let \mathcal{D}_{\Pi} be the vector space of symmetric matrices that are block diagonal with respect to \Pi:

𝒟Π={Asymn×n:Aij=0 if i≁j}.\mathcal{D}_{\Pi}=\{A\in\mathbb{R}^{n\times n}_{\mathrm{sym}}:A_{ij}=0\textnormal{ if }i\not\sim j\}.

Let πΠ\pi_{\Pi} be the orthogonal projection from symn×n\mathbb{R}^{n\times n}_{\mathrm{sym}} onto the subspace 𝒟Π\mathcal{D}_{\Pi}.

Theorem (Fischer’s inequality).

Let AA be a positive semidefinite matrix. Then

det(πΠ(A))det(A).\det(\pi_{\Pi}(A))\ \geq\ \det(A)\,.

Observe that Hadamard's inequality is simply Fischer's inequality with the partition \Pi=\{\{1\},\dots,\{n\}\}. We now give a hyperbolic generalization of the Fischer–Hadamard inequality. For P=\Phi(e_{k}), our hyperbolic Hadamard inequality was obtained in [21].

Theorem 2.4 (Hyperbolic Fischer–Hadamard inequality).

Let PP be a homogeneous PSD-stable lpm-polynomial and Π\Pi a partition. Then

P(πΠ(A))P(A)P(\pi_{\Pi}(A))\ \geq\ P(A)

holds for all AH(P)A\in H(P).
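As a quick numerical sanity check (ours, not part of the argument; assuming numpy), take P=E_{2}=\Phi(e_{2}), which is PSD-stable by Theorem 2.1, sample a random positive semidefinite A (so A\in H(P)), and compare P(A) with P(\pi_{\Pi}(A)); the singleton partition recovers Corollary 2.2.

```python
# Numerical sanity check of the hyperbolic Fischer-Hadamard inequality
# for P = E_2 (our illustration).
import itertools
import numpy as np

def E(k, A):
    """E_k(A): sum of all k x k principal minors of A."""
    return sum(np.linalg.det(A[np.ix_(J, J)])
               for J in itertools.combinations(range(A.shape[0]), k))

def project_blocks(A, partition):
    """pi_Pi(A): zero out the entries A_ij with i and j in different blocks."""
    B = np.zeros_like(A)
    for S in partition:
        S = sorted(S)
        B[np.ix_(S, S)] = A[np.ix_(S, S)]
    return B

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
A = G @ G.T                              # random PSD matrix, hence A in H(E_2)

Pi = [{0, 1}, {2, 3}]                    # a partition of {0, 1, 2, 3}
assert E(2, project_blocks(A, Pi)) >= E(2, A) - 1e-9

# Singleton partition: Corollary 2.2, e_2(diag(A)) >= E_2(A).
assert E(2, project_blocks(A, [{i} for i in range(4)])) >= E(2, A) - 1e-9
```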

The classical Fischer–Hadamard inequality is a consequence of a more general inequality known as Koteljanskii’s inequality, which handles the case of overlapping blocks [17].

Theorem (Koteljanskii's inequality).

Let SS and TT be two subsets of [n][n] and AA be a positive semidefinite n×nn\times n matrix. Then

det(AS)det(AT)det(AST)det(AST).\det(A_{S})\det(A_{T})\ \geq\ \det(A_{S\cup T})\det(A_{S\cap T})\,.

While we were not able to generalize Koteljanskii’s inequality in a way that implies the hyperbolic Fischer–Hadamard inequality, we found a hyperbolic generalization of Koteljanskii’s inequality, which uses a different interpretation of what it means to take a minor of a matrix.

Definition 2.5.

Given a degree kk homogeneous lpm polynomial PP and T[n]T\subseteq[n] with |T|nk|T|\geq n-k, we define the restriction

P|T:=(i[n]TXii)P,P|_{T}\ :=\ \Bigl{(}\prod_{i\in[n]\setminus T}\frac{\partial}{\partial X_{ii}}\Bigr{)}P\,,

where we take partial derivatives with respect to the diagonal variables not in T.

With this definition we can state the hyperbolic Koteljanskii inequality, which is related to the negative lattice condition in [6]:

Theorem 2.6 (Hyperbolic Koteljanskii inequality).

Let PP be a homogeneous PSD-stable lpm-polynomial and S,T[n]S,T\subseteq[n]. Then

P|S(A)P|T(A)P|ST(A)P|ST(A)P|_{S}(A)P|_{T}(A)\ \geq\ P|_{S\cup T}(A)P|_{S\cap T}(A)

holds for all AH(P)A\in H(P).
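Definition 2.5 and Theorem 2.6 can likewise be tested numerically. The sketch below (ours, assuming Python with sympy and numpy) forms P=E_{3} for n=4, computes the restrictions P|_{T} by differentiating with respect to the diagonal variables, and verifies the inequality at a random positive semidefinite matrix, which lies in H(P) by Theorem 2.1.

```python
# Numerical check of the hyperbolic Koteljanskii inequality for P = E_3, n = 4
# (our illustration).
import itertools
import numpy as np
import sympy as sp

n, k = 4, 3
X = sp.Matrix(n, n, lambda i, j: sp.Symbol(f"X_{min(i,j)}{max(i,j)}"))
P = sum(X[list(J), list(J)].det() for J in itertools.combinations(range(n), k))

def restrict(P, T):
    """P|_T: differentiate P by the diagonal variables X_ii with i not in T."""
    for i in set(range(n)) - set(T):
        P = sp.diff(P, X[i, i])
    return P

rng = np.random.default_rng(1)
G = rng.standard_normal((n, n))
A = G @ G.T                                       # PSD, hence A in H(P)
at_A = {X[i, j]: A[i, j] for i in range(n) for j in range(i, n)}
val = lambda Q: float(Q.subs(at_A))

S, T = {0, 1, 2}, {1, 2, 3}                       # |S n T| = 2 >= n - k = 1
lhs = val(restrict(P, S)) * val(restrict(P, T))
rhs = val(restrict(P, S | T)) * val(restrict(P, S & T))
assert lhs >= rhs - 1e-9
```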

2.3. Spectral containment property

If A is an n\times n symmetric matrix, we say that \lambda\in\mathbb{R}^{n} is an eigenvalue vector of A if the entries of \lambda are precisely the eigenvalues of A with appropriate multiplicities. Note that the set of eigenvalue vectors of a symmetric matrix A is invariant under permutations.

We recall the example of the kk-th elementary symmetric polynomial ek(x)e_{k}({x}) and its minor lift Ek(X)E_{k}({X}) from the introduction. It is well-known that Ek(A)=ek(λ)E_{k}(A)=e_{k}(\lambda) where λ\lambda is an eigenvalue vector of AA. In particular, it follows that AH(P)A\in H(P) implies that λH(p)\lambda\in H(p). Notice that since eke_{k} is invariant under permutations of coordinates, the order in which we list the eigenvalues of AA in λ(A)\lambda(A) does not matter. This motivates the following definition.
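This identity is easy to confirm numerically, as in the following short sketch (ours, assuming numpy); both sides are, up to sign, coefficients of the characteristic polynomial of A.

```python
# Check E_k(A) = e_k(lambda) for an eigenvalue vector lambda of A
# (our illustration).
import itertools
import numpy as np

def E(k, A):
    """Sum of all k x k principal minors of A."""
    return sum(np.linalg.det(A[np.ix_(J, J)])
               for J in itertools.combinations(range(A.shape[0]), k))

def e(k, lam):
    """Elementary symmetric polynomial e_k evaluated at lam."""
    return sum(np.prod([lam[i] for i in J])
               for J in itertools.combinations(range(len(lam)), k))

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)); A = (A + A.T) / 2   # random symmetric matrix
lam = np.linalg.eigvalsh(A)
for k in range(1, 6):
    assert np.isclose(E(k, A), e(k, lam))
```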

Definition 2.7.

A homogeneous multiaffine stable polynomial p[x1,,xn]p\in\mathbb{R}[x_{1},\dots,x_{n}] has the spectral containment property if for any AH(P)symn×nA\in H(P)\subset\mathbb{R}^{n\times n}_{\mathrm{sym}}, there is an eigenvalue vector λn\lambda\in\mathbb{R}^{n} of AA such that λH(p)\lambda\in H(p).

Remark 2.8.

We could make the stronger requirement in Definition 2.7 that for all A\in H(P), every eigenvalue vector of A lies in H(p); however, this seems to be too restrictive: we do not have any examples of polynomials besides the elementary symmetric polynomials with this stronger property.

We now give a number of polynomials which have the spectral containment property:

Theorem 2.9.

The following classes of polynomials have the spectral containment property:

  1. (1)

    The elementary symmetric polynomials e1,,ene_{1},\dots,e_{n}.

  2. (2)

    For any nkdn\geq k\geq d, and any |ε||\varepsilon| sufficiently small, ed(x1,,xn)+εed(x1,,xk)e_{d}(x_{1},\dots,x_{n})+\varepsilon e_{d}(x_{1},\dots,x_{k}).

  3. (3)

    Stable linear polynomials.

  4. (4)

    Any degree n-1 stable polynomial that is interlaced by e_{n-2}.

  5. (5)

    e2(x1,x2,x3,x4)ε(x1x2+x1x3)e_{2}(x_{1},x_{2},x_{3},x_{4})-\varepsilon(x_{1}x_{2}+x_{1}x_{3}) for ε\varepsilon sufficiently small.

Moreover, if pp has the spectral containment property, and x0x_{0} is a variable not used in pp, then x0px_{0}p has the spectral containment property.

While this property may seem mysterious, we conjecture that it is in fact ubiquitous:

Conjecture 2.10.

Every homogeneous multiaffine stable polynomial has the spectral containment property.

If Conjecture 2.10 is true, then Theorem 2.1 implies that for every k-locally PSD matrix A and every homogeneous multiaffine stable polynomial p of degree k, some eigenvalue vector of A is contained in H(p). This may seem like a very strong condition on the eigenvalues of A, but as we show below it is equivalent to the fact that every eigenvalue vector of A is contained in H(e_{k}), which we already observed above. Let \mathfrak{S}_{n} denote the symmetric group on n letters and let it act on \mathbb{R}^{n} by permuting coordinates.

Theorem 2.11.

Let ek[x]e_{k}\in\mathbb{R}[{x}] be the elementary symmetric polynomial of degree kk and h[x]h\in\mathbb{R}[{x}] be a nonzero homogeneous multiaffine stable polynomial of degree kk. If vH(ek){v}\in H(e_{k}), then there exists a permutation τ𝔖n\tau\in\mathfrak{S}_{n} such that τ(v)H(h)\tau({v})\in H(h).

In Section 7.4 we will also show that Conjecture 2.10 would be implied in many cases by another conjecture generalizing the classical Schur–Horn theorem.

3. The Minor Lift Map and Stability Preservers

Our goal in this section is to prove Theorem 2.1. We first explain how to construct the minor lift map via partial derivatives of the determinant. Let p[x]p\in\mathbb{R}[{x}] be a multiaffine polynomial. The dual of pp is

(1) p(x):=x1x2xnp(1x1,1x2,,1xn).p^{*}(x):=x_{1}\cdot x_{2}\cdots x_{n}\cdot p\Bigl{(}\frac{1}{x_{1}},\frac{1}{x_{2}},\dots,\frac{1}{x_{n}}\Bigr{)}.

For any polynomial p[x1,,xn]p\in\mathbb{R}[x_{1},\dots,x_{n}], we consider the differential operator p(X11,X22,,Xnn)p^{*}\left(\frac{\partial}{\partial X_{11}},\frac{\partial}{\partial X_{22}},\dots,\frac{\partial}{\partial X_{nn}}\right). For instance, if p=xS=iSxip=x^{S}=\prod_{i\in S}x_{i} is a monomial, then the associated differential operator is iSXii\prod_{i\notin S}\frac{\partial}{\partial X_{ii}}. Applying the differential operator associated to xSx^{S} to det(X)\det(X) yields

(iSXii)det(X)=det(XS).\Bigl{(}\prod_{i\notin S}\frac{\partial}{\partial X_{ii}}\Bigr{)}\det({X})=\det(X_{S}).

By linearity, we then obtain that

P=(p(X11,X22,,Xnn))det(X).P=\left(p^{*}\left(\frac{\partial}{\partial X_{11}},\frac{\partial}{\partial X_{22}},\dots,\frac{\partial}{\partial X_{nn}}\right)\right)\det(X)\,.

This formulation of the minor lift map will allow us to easily apply the theory of stability preservers.
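For a small concrete instance (ours, assuming Python with sympy): for p=e_{2} in three variables we have p^{*}=e_{1}, so the operator is \sum_{i}\partial/\partial X_{ii}, and applying it to \det(X) reproduces E_{2}(X).

```python
# Check P = p*(d/dX_11, ..., d/dX_nn) det(X) for p = e_2, n = 3
# (our illustration of the formula above).
import itertools
import sympy as sp

n = 3
X = sp.Matrix(n, n, lambda i, j: sp.Symbol(f"X_{min(i,j)}{max(i,j)}"))
detX = X.det()

# p = e_2(x_1, x_2, x_3), so p* = e_1 and the operator is sum_i d/dX_ii.
P_from_operator = sum(sp.diff(detX, X[i, i]) for i in range(n))

E2 = sum(X[list(J), list(J)].det() for J in itertools.combinations(range(n), 2))
assert sp.expand(P_from_operator - E2) == 0
```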

Remark 3.1.

The minor lift operation interacts nicely with dualization. If pp is a multiaffine polynomial, then

Φ(p)|X=det(X)Φ(p)|X1.\Phi(p^{*})|_{X}=\det(X)\cdot\Phi(p)|_{X^{-1}}.

Here, |X\cdot|_{X} denotes the evaluation of a polynomial at XX.

This result follows directly from the Jacobi complementary minors identity, found in [20], which states that det(X|Sc)=det(X1|S)det(X)\det(X|_{S^{c}})=\det(X^{-1}|_{S})\det(X). This is a matrix analogue of (1).
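The identity is also easy to test numerically (our sketch, assuming numpy).

```python
# Numerical check of the Jacobi complementary minors identity
# det(X|_{S^c}) = det(X^{-1}|_S) det(X)  (our illustration).
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((5, 5))
X = X @ X.T + np.eye(5)                  # symmetric and invertible
S = [1, 3]
Sc = [i for i in range(5) if i not in S]
lhs = np.linalg.det(X[np.ix_(Sc, Sc)])
rhs = np.linalg.det(np.linalg.inv(X)[np.ix_(S, S)]) * np.linalg.det(X)
assert np.isclose(lhs, rhs)
```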

Before we go on, we need the following facts about hyperbolicity cones that can be found in [29].

Lemma 3.2.

Let p[x]p\in\mathbb{R}[{x}] be a homogeneous polynomial and KnK\subset\mathbb{R}^{n} a cone. The following are equivalent:

  1. (1)

    pp is hyperbolic with respect to all aK{a}\in K, and

  2. (2)

    p(v+ia)0p({v}+\textnormal{i}{a})\neq 0 for all vn{v}\in\mathbb{R}^{n} and aK{a}\in K.

Lemma 3.3.

Let p[x]p\in\mathbb{R}[{x}] be hyperbolic with respect to an{a}\in\mathbb{R}^{n}. Then pp is hyperbolic with respect to every point in the connected component of {vn:p(v)0}\{{v}\in\mathbb{R}^{n}:\,p({v})\neq 0\} that contains a{a}.

Our first step is the following observation:

Lemma 3.4.

Let P[X]P\in\mathbb{R}[{X}] be a homogeneous polynomial. Then PP is PSD-stable if and only if the following two conditions hold:

  1. (1)

    P(A)0P(A)\neq 0 for all positive definite matrices AA;

  2. (2)

    P(Diag(x1,,xn)+M)[x]P(\operatorname*{Diag}(x_{1},\dots,x_{n})+M)\in\mathbb{R}[{x}] is stable for every real symmetric matrix MM.

Proof.

First assume that P is PSD-stable and let A be a positive definite matrix. By definition we have P(\mathrm{i}A)\neq 0. Since P is homogeneous, this implies that P(A)\neq 0. Further, let z_{i}=a_{i}+\mathrm{i}b_{i} be complex numbers in the open upper half-plane. Then P(\operatorname{Diag}(z_{1},\dots,z_{n})+M) is nonzero for any real symmetric matrix M, since the imaginary part \operatorname{Diag}(b_{1},\dots,b_{n}) is a positive definite matrix.

For the other direction we first observe that condition (2) implies that PP is hyperbolic with respect to the identity matrix. Indeed, the univariate polynomial P(tI+M)P(tI+M) is stable and thus real-rooted for every real symmetric matrix MM. Now condition (1) together with Lemmas 3.2 and 3.3 imply the claim. ∎

Proof of Theorem 2.1.

Let p[x]p\in\mathbb{R}[{x}] be multiaffine, homogeneous and stable. Then by [9, Thm. 6.1] all nonzero coefficients of pp have the same sign. Without loss of generality assume that all are positive. Then P=Φ(p)P=\Phi(p) is clearly positive on positive definite matrices since the minors of a positive definite matrix are positive. Thus by Lemma 3.4, it remains to show that

P(Diag(x1,,xn)+M)=(p(x1,x2,,xn))det(Diag(x1,,xn)+M)P(\operatorname*{Diag}(x_{1},\dots,x_{n})+M)=\left(p^{*}\left(\frac{\partial}{\partial x_{1}},\frac{\partial}{\partial x_{2}},\dots,\frac{\partial}{\partial x_{n}}\right)\right)\det(\operatorname*{Diag}(x_{1},\dots,x_{n})+M)

is stable for every real symmetric matrix MM. The polynomial det(Diag(x1,,xn)+M)\det(\operatorname*{Diag}(x_{1},\dots,x_{n})+M) is stable as well as pp^{*} by [9, Prop. 4.2]. Thus the polynomial P(Diag(x1,,xn)+M)P(\operatorname*{Diag}(x_{1},\dots,x_{n})+M) is also stable by [5, Thm. 1.3].

Let APSDksymn×nA\in\mathrm{PSD}_{k}\subseteq\mathbb{R}^{n\times n}_{\mathrm{sym}} be kk-locally PSD. Then for every kk-subset S[n]S\subseteq[n], we have det((A+tI)|S)>0\det((A+t{I})|_{S})>0 for all t>0t>0. Hence, if pp has degree kk with all coefficients positive, then P(AtI)>0P(A-t{I})>0 for all t<0t<0 and hence all roots are non-negative. This implies that AH(P)A\in H(P). ∎
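To see that the containment \mathrm{PSD}_{k}\subseteq H(P) goes beyond the PSD cone, consider the following small example (ours, assuming numpy): for n=3 the matrix A=2I-J, with J the all-ones matrix, is 2-locally PSD but not PSD, and it lies in H(E_{2}).

```python
# A 2-locally PSD matrix that is not PSD but lies in H(E_2)
# (our illustration of Theorem 2.1).
import itertools
import numpy as np

A = 2 * np.eye(3) - np.ones((3, 3))      # eigenvalues -1, 2, 2: not PSD
# every 2 x 2 principal submatrix is [[1, -1], [-1, 1]], which is PSD
for J in itertools.combinations(range(3), 2):
    assert min(np.linalg.eigvalsh(A[np.ix_(J, J)])) >= -1e-9

def E2(A):
    return sum(np.linalg.det(A[np.ix_(J, J)])
               for J in itertools.combinations(range(A.shape[0]), 2))

# A in H(E_2): all roots of t -> E_2(A + tI) are <= 0.  For n = 3,
# E_2(A + tI) = 3 t^2 + 2 e_1(lam) t + e_2(lam) with lam the spectrum of A.
lam = np.linalg.eigvalsh(A)
assert all(r.real <= 1e-8 for r in np.roots([3.0, 2 * lam.sum(), E2(A)]))
```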

Remark 3.5.

Given a multiaffine homogeneous stable polynomial p\in\mathbb{R}[x_{1},\dots,x_{n}], the minor lift map gives a hyperbolic polynomial P in the entries of a symmetric n\times n matrix whose restriction to the diagonal equals p. Such polynomials can also be constructed for stable polynomials that are not necessarily multiaffine. Since we are mainly interested in multiaffine polynomials, we only briefly sketch one possible such construction. For a stable homogeneous polynomial p\in\mathbb{R}[x_{1},\dots,x_{n}] one can find a multiaffine stable polynomial q\in\mathbb{R}[z_{11},\dots,z_{1d_{1}},\dots,z_{nd_{n}}] such that we can recover p from q by substituting each variable z_{ij} by x_{i}; see [9, §2.5]. This polynomial q is called a polarization of p. If we restrict the minor lift of q to suitable block-diagonal matrices, we obtain a hyperbolic polynomial with the desired properties for p.

Remark 3.6.

Using [8, Thm. 3.2] one can show that the analogous statement to Theorem 2.1 for Lorentzian polynomials, a recent generalization of stable polynomials, holds as well.

4. Hyperbolic Hadamard-Fischer Inequality

Our goal in this section is to prove Theorem 2.4. We start by making some general observations about supporting hyperplanes of the hyperbolicity cone:

Lemma 4.1.

Let p[x]p\in\mathbb{R}[x] be hyperbolic with respect to ana\in\mathbb{R}^{n} and Ha(p)H_{a}(p) the corresponding hyperbolicity cone. Assume that p(a)>0p(a)>0 and that pp is reduced in the sense that all its irreducible factors are coprime. Then we have the following:

  1. (1)

    For all vHa(p)v\in H_{a}(p) the linear form Lv=p(v),xL_{v}=\langle\nabla p(v),x\rangle is nonnegative on Ha(p)H_{a}(p).

  2. (2)

    If vHa(p)v\in\partial H_{a}(p), then Lv(v)=0L_{v}(v)=0.

  3. (3)

    If bHa(p)b\not\in H_{a}(p), then there exists vHa(p)v\in\partial H_{a}(p) such that Lv(b)<0L_{v}(b)<0.

Proof.

Part (2) is just Euler's identity since p vanishes on \partial H_{a}(p). If \nabla p(v)=0, then (1) is trivial. Otherwise, the hyperplane \{x:\langle\nabla p(v),x\rangle=0\} is tangent to \partial H_{a}(p) at v. Since H_{a}(p) is convex and p positive on the interior of H_{a}(p), this implies that L_{v}(x)\geq 0 for all x\in H_{a}(p). In order to prove (3), we first note that by our assumption on p, the set of points c\in\partial H_{a}(p) where \nabla p(c)=0 is nowhere dense. Thus if b\not\in H_{a}(p), then there is a point e in the interior of H_{a}(p) such that the line segment [e,b] intersects \partial H_{a}(p) in a smooth point v. Write v=(1-\theta)e+\theta b with \theta\in(0,1); since L_{v} is linear with L_{v}(e)>0 and L_{v}(v)=0, we get L_{v}(b)=-\tfrac{1-\theta}{\theta}L_{v}(e)<0. ∎

We now apply the above observations to lpm polynomials. Recall that for a partition Π={S1,,Sm}\Pi=\{S_{1},\dots,S_{m}\} of [n][n], we denote by 𝒟Π\mathcal{D}_{\Pi} the vector space of block diagonal symmetric matrices with blocks given by Π\Pi and πΠ\pi_{\Pi} is the orthogonal projection of symn×n\mathbb{R}^{n\times n}_{\mathrm{sym}} onto the subspace 𝒟Π\mathcal{D}_{\Pi}. Further recall that we write aba\sim b for a,b[n]a,b\in[n] if a,bSka,b\in S_{k} for some k=1,,mk=1,\dots,m.

Lemma 4.2.

Fix a partition Π={S1,,Sm}\Pi=\{S_{1},\dots,S_{m}\} of [n][n] and let B[n]B\subseteq[n] be any subset. Then for any σ𝔖B\sigma\in\mathfrak{S}_{B}, we have |{bB|b≁σ(b)}|1|\{b\in B|b\not\sim\sigma(b)\}|\neq 1.

Proof.

For b\in B, consider the orbit b,\sigma(b),\sigma^{2}(b),\dots,\sigma^{t-1}(b),\sigma^{t}(b)=b. If b\in S_{k} but the orbit is not fully contained in S_{k}, then there are 0\leq r<s<t such that \sigma^{r}(b),\sigma^{s+1}(b)\in S_{k} but \sigma^{r+1}(b),\sigma^{s}(b)\not\in S_{k}. In particular, \sigma^{r}(b)\not\sim\sigma(\sigma^{r}(b)) and \sigma^{s}(b)\not\sim\sigma(\sigma^{s}(b)), so every orbit contributes either zero or at least two elements b\in B with b\not\sim\sigma(b). Summing over all orbits proves the claim. ∎

Lemma 4.3.

Let PP be an lpm polynomial. If A𝒟ΠA\in\mathcal{D}_{\Pi}, then P(A)𝒟Π\nabla P(A)\in\mathcal{D}_{\Pi}.

Proof.

Since PP is a sum of terms of the form aBdet(XB)a_{B}\det(X_{B}) with B[n]B\subseteq[n], it suffices to prove the claim for P=det(XB)P=\det(X_{B}). In that case, this is equivalent to saying that if A𝒟ΠA\in\mathcal{D}_{\Pi} and i≁ji\not\sim j, then

(Xijdet(XB))(A)=0.\Bigl{(}\frac{\partial}{\partial X_{ij}}\det(X_{B})\Bigr{)}(A)=0.

Now detXB=σ𝔖Bsgn(σ)iBXi,σ(i)\det X_{B}=\sum_{\sigma\in\mathfrak{S}_{B}}\mathrm{sgn}(\sigma)\prod_{i\in B}X_{i,\sigma(i)} and Lemma 4.2 applied to each term yields the claim. ∎

The preceding lemma allows us to show that the hyperbolicity cone of a hyperbolic lpm polynomial is closed under projections onto 𝒟Π\mathcal{D}_{\Pi}.

Lemma 4.4.

Let P be a homogeneous PSD-stable lpm polynomial. If A\in H(P), then \pi_{\Pi}(A)\in H(P).

Proof.

Let PΠP_{\Pi} be the restriction of the polynomial PP to 𝒟Π\mathcal{D}_{\Pi}, i.e. PΠ=PιP_{\Pi}=P\circ\iota where ι:𝒟Πsymn×n\iota:\mathcal{D}_{\Pi}\rightarrow\mathbb{R}^{n\times n}_{\mathrm{sym}} is the inclusion map. We have H(P)𝒟Π=H(PΠ)H(P)\cap\mathcal{D}_{\Pi}=H(P_{\Pi}). For AH(P)A\in H(P) we thus have to show that πΠ(A)H(PΠ)\pi_{\Pi}(A)\in H(P_{\Pi}). By Lemma 4.1 this is equivalent to PΠ(B),πΠ(A)0\langle\nabla P_{\Pi}(B),\pi_{\Pi}(A)\rangle\geq 0 for all BH(PΠ)B\in H(P_{\Pi}). But by the previous lemma we have PΠ(B),πΠ(A)=P(B),A\langle\nabla P_{\Pi}(B),\pi_{\Pi}(A)\rangle=\langle\nabla P(B),A\rangle which is nonnegative by Lemma 4.1 since AH(P)A\in H(P). ∎

We are now able to show the hyperbolic Fischer–Hadamard inequality. Our proof technique is inspired by the proof of [11, Thm. 5].

Proof of Theorem 2.4.

Without loss of generality, we can assume that P(I)>0. If A is on the boundary of H(P), then P(A)=0 and we are done, since \pi_{\Pi}(A)\in H(P) implies P(\pi_{\Pi}(A))\geq 0. Therefore, we may assume that A is in the interior of H(P). In this case, let \epsilon>0 be sufficiently small such that A-\epsilon I\in H(P); then \pi_{\Pi}(A)-\epsilon I=\pi_{\Pi}(A-\epsilon I) is also in H(P). This shows that \pi_{\Pi}(A) is in the interior of H(P) and P(\pi_{\Pi}(A))>0. Thus P is hyperbolic with respect to A, and q(t)=P(tA+\pi_{\Pi}(A))\in\mathbb{R}[t] is real-rooted with negative roots. Let d be the degree of q(t) and -\lambda_{1},\dots,-\lambda_{d} the roots of q(t), with each \lambda_{i}>0. We consider the coefficients of q(t):

  • The coefficient of tdt^{d} is P(A)P(A).

  • The coefficient of tt is dP(πΠ(A))dP(\pi_{\Pi}(A)), since ddtq(t)|t=0=P|πΠ(A),A\frac{d}{dt}q(t)|_{t=0}=\langle\nabla P|_{\pi_{\Pi}(A)},A\rangle, and by Lemma 4.3, P|πΠ(A),A=P|πΠ(A),πΠ(A)=dP(πΠ(A))\langle\nabla P|_{\pi_{\Pi}(A)},A\rangle=\langle\nabla P|_{\pi_{\Pi}(A)},\pi_{\Pi}(A)\rangle=dP(\pi_{\Pi}(A)). This last equality is due to Euler’s identity.

  • The constant coefficient is P(πΠ(A))P(\pi_{\Pi}(A)).

Thus we have ed1(λ)=dP(πΠ(A))P(A)e_{d-1}(\lambda)=\frac{dP(\pi_{\Pi}(A))}{P(A)}, and ed(λ)=λ1λd=P(πΠ(A))P(A)e_{d}(\lambda)=\lambda_{1}\cdots\lambda_{d}=\frac{P(\pi_{\Pi}(A))}{P(A)}. Since all λi\lambda_{i} are positive, from the AM-GM inequality we have

P(πΠ(A))P(A)=ed1(λ)d(λ1λd)d1d=(P(πΠ(A))P(A))d1d.\frac{P(\pi_{\Pi}(A))}{P(A)}=\frac{e_{d-1}(\lambda)}{d}\geq(\lambda_{1}\cdots\lambda_{d})^{\frac{d-1}{d}}=\left(\frac{P(\pi_{\Pi}(A))}{P(A)}\right)^{\frac{d-1}{d}}.

Since the ratio P(\pi_{\Pi}(A))/P(A) is positive, this forces P(\pi_{\Pi}(A))\geq P(A), which proves the claim. ∎

When P(X)=detXP(X)=\det X, then H(P)H(P) is the cone of positive semidefinite matrices and our theorem implies the well-known Fischer’s inequality:

Corollary 4.5 (Fischer’s inequality).

If AA is positive semidefinite, then detπΠ(A)detA\det\pi_{\Pi}(A)\geq\det A.

Remark 4.6.

The usual statement of Fischer’s inequality corresponds to the case of two blocks. This is equivalent to our multi-block version since principal submatrices of a positive semidefinite matrix are also positive semidefinite.

In the case where \Pi=\{\{1\},\dots,\{n\}\}, Theorem 2.4 and Lemma 4.4 imply Corollary 2.2. We also get the following strengthening of Theorem 2.4.

Corollary 4.7.

Let PP be a homogeneous and PSD-stable lpm-polynomial. If AH(P)A\in H(P), then the polynomial P((1t)A+tπΠ(A))P((1-t)A+t\pi_{\Pi}(A)) is monotonically increasing for t[0,1]t\in[0,1].

Proof.

The polynomial q(t)=P(tA+(\pi_{\Pi}(A)-A)) is real-rooted, and

P((1-t)A+t\pi_{\Pi}(A))=q^{*}(t),

where q^{*}(t)=t^{d}q(1/t) with d=\deg P denotes the reciprocal polynomial, so that q^{*}(t) is real-rooted as well. Because both A and \pi_{\Pi}(A) are in H(P), we have q^{*}(t)\geq 0 for t\in[0,1]. Moreover, q^{*} is maximized over [0,1] at t=1: for A_{s}=(1-s)A+s\pi_{\Pi}(A) we have \pi_{\Pi}(A_{s})=\pi_{\Pi}(A), so Theorem 2.4 applied to A_{s}\in H(P) gives q^{*}(s)=P(A_{s})\leq P(\pi_{\Pi}(A))=q^{*}(1). Hence, by interlacing, \frac{d}{dt}q^{*}(t) has at most one root in the interval [0,1]. If there were a root of \frac{d}{dt}q^{*}(t) in the interval [0,1), then at such a root q^{*}(t) would have a maximum on this interval, contradicting that q^{*} attains its maximum over [0,1] at t=1. Hence \frac{d}{dt}q^{*}(t) must in fact be nonnegative on this interval. ∎

5. Hyperbolic Koteljanskii inequality

Koteljanskii's inequality [17] states that for any n\times n positive semidefinite matrix A and S,T\subseteq[n], \det A_{S}\det A_{T}\geq\det A_{S\cap T}\det A_{S\cup T}. This is a generalization of the Fischer–Hadamard inequality. Later this inequality was proven to hold for other classes of (possibly non-symmetric) matrices [15]. In this section we prove Theorem 2.6, a generalization of Koteljanskii's inequality in which the determinant is replaced by a PSD-stable lpm polynomial. First we need the hyperbolic counterpart of the fact that principal submatrices of a positive semidefinite matrix are again positive semidefinite, and hence have nonnegative determinant. For this we use Renegar derivatives [25].

Theorem 5.1.

Let pp be a polynomial, hyperbolic with respect to vv. Let DvpD_{v}p denote the directional derivative of pp in direction vv. Then DvpD_{v}p is also hyperbolic with respect to vv. Furthermore, their hyperbolicity cones satisfy Hv(p)Hv(Dvp)H_{v}(p)\subseteq H_{v}(D_{v}p).

Recall from Definition 2.5 that P|_{T}=\bigl(\prod_{i\in[n]\setminus T}\frac{\partial}{\partial X_{ii}}\bigr)P. Then we have:

Corollary 5.2.

Let PP be a homogeneous PSD-stable lpm polynomial of degree kk and AH(P)A\in H(P). Let T[n]T\subseteq[n] with |T|nk|T|\geq n-k. Then P|TP|_{T} is PSD-stable as well and AH(P|T)A\in H(P|_{T}).

Now we use the result from [6] on negative dependence. For any polynomial p[x]p\in\mathbb{R}[x] and index set S[n]S\subseteq[n] we denote Sp=(iSxi)p\partial^{S}p=(\prod_{i\in S}\frac{\partial}{\partial x_{i}})p.

Theorem 5.3 ([6, Sect. 2.1 and Thm. 4.9]).

Let pp be a multiaffine stable polynomial with nonnegative coefficients. Then pp satisfies the nonnegative lattice condition: for all S,T[n]S,T\subseteq[n]

Sp(0)Tp(0)STp(0)STp(0).\partial^{S}p(0)\partial^{T}p(0)\ \geq\ \partial^{S\cup T}p(0)\partial^{S\cap T}p(0)\,.

This theorem directly implies the generalization of Koteljanskii’s inequality.

Proof of Theorem 2.6.

Without loss of generality assume that P(I)>0P(I)>0. Let PA(x)=P(A+Diag(x))[x1,,xn]P_{A}(x)=P(A+\operatorname*{Diag}(x))\in\mathbb{R}[x_{1},...,x_{n}]. It is clear that PAP_{A} is multiaffine and SPA(0)=P|S(A)\partial^{S}P_{A}(0)=P|_{S}(A) for all S[n]S\subseteq[n]. It follows from Corollary 5.2 that PAP_{A} is stable and has nonnegative coefficients. Thus by Theorem 5.3 it satisfies the nonnegative lattice condition, i.e. for all S,T[n],SPA(0)TPA(0)STPA(0)STPA(0)S,T\subseteq[n],\partial^{S}P_{A}(0)\partial^{T}P_{A}(0)\geq\partial^{S\cup T}P_{A}(0)\partial^{S\cap T}P_{A}(0). This completes the proof. ∎

6. Hyperbolic polynomials and sums of squares

Let p[x]p\in\mathbb{R}[x] be hyperbolic with respect to vnv\in\mathbb{R}^{n} and a,bHv(p)a,b\in H_{v}(p). Then the mixed derivative

Δa,b(p)=DapDbppDaDbp\Delta_{a,b}(p)=\operatorname{D}_{a}p\cdot\operatorname{D}_{b}p-p\cdot\operatorname{D}_{a}\operatorname{D}_{b}p

is globally nonnegative by Theorem 3.1 in [19]. If some power prp^{r} has a definite symmetric determinantal representation, i.e., can be written as

pr=det(x1A1++xnAn)p^{r}=\det(x_{1}A_{1}+\cdots+x_{n}A_{n})

for some real symmetric (or complex hermitian) matrices A_{1},\dots,A_{n} with v_{1}A_{1}+\cdots+v_{n}A_{n} positive definite, then \Delta_{a,b}(p) is in fact a sum of squares [19, Cor. 4.3]. Therefore, any instance where \Delta_{a,b}(p) is not a sum of squares gives an example of a hyperbolic polynomial none of whose powers has a definite symmetric determinantal representation. Another source of interest in such examples comes from the point of view taken in [27], as these give rise to families of polynomials that are not sums of squares but whose nonnegativity can be certified via hyperbolic programming. Saunderson [27] characterized all pairs (d,n) for which there exists such a hyperbolic polynomial p\in\mathbb{R}[x]=\mathbb{R}[x_{1},\dots,x_{n}] of degree d, except when d=3. In this section we will construct an explicit hyperbolic cubic p in 6 variables for which there are two points a,b in the hyperbolicity cone such that \Delta_{a,b}(p) is not a sum of squares.

Remark 6.1.

If there are two points a,b in the closed hyperbolicity cone of p such that \Delta_{a,b}(p) is not a sum of squares, then there are also such points in the interior of the hyperbolicity cone, as the cone of sums of squares is closed.

Remark 6.2.

In [27] Saunderson constructs a hyperbolic cubic in 43 variables whose Bézout matrix is not a matrix sum of squares. This was previously the smallest known such example. The top left entry of the Bézout matrix is the mixed derivative that we are studying. Thus if the latter is not a sum of squares, then the Bézout matrix is not a matrix sum of squares.

Consider the complete graph K4K_{4} on 44 vertices. We define the spanning tree polynomial of K4K_{4} as the element of [xe:eE(K4)]\mathbb{R}[x_{e}:e\in E(K_{4})] given by

tK4(x)=τeτxe,t_{K_{4}}(x)\ =\ \sum_{\tau}\prod_{e\in\tau}x_{e}\,,

where \tau\subseteq E(K_{4}) ranges over all edge sets of spanning trees of K_{4}. The polynomial t_{K_{4}} is multiaffine, homogeneous and stable [9, Thm. 1.1]. Let T be its minor lift. Finally, let p be the polynomial obtained by evaluating T at the matrix of indeterminates

A=\begin{pmatrix}x_{1}&0&0&0&0&0\\ 0&x_{2}&a&b&c&0\\ 0&a&x_{2}&c&b&0\\ 0&b&c&x_{2}&a&0\\ 0&c&b&a&x_{2}&0\\ 0&0&0&0&0&x_{3}\end{pmatrix},

where the rows and columns are indexed by the edges 12, 13, 14, 23, 24, 34 of K_{4}.

Thus pp is hyperbolic with respect to every positive definite matrix that can be obtained by specializing entries of AA to some real numbers. In particular, the polynomial

W=px1px3p2px1x3W=\frac{\partial p}{\partial x_{1}}\cdot\frac{\partial p}{\partial x_{3}}-p\cdot\frac{\partial^{2}p}{\partial x_{1}\partial x_{3}}

is nonnegative. We will show that it is not a sum of squares. We first study the real zero set of WW.
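The construction is small enough to carry out symbolically. The sketch below (ours, assuming Python with sympy) enumerates the 16 spanning trees of K_{4}, builds p as the minor lift of t_{K_{4}} evaluated at the matrix A above, and computes W; since p is affine in each of x_{1} and x_{3}, the polynomial W is independent of these two variables.

```python
# Symbolic construction of p and W from the spanning tree polynomial of K_4
# (our illustration; edge order 12, 13, 14, 23, 24, 34 as in the matrix A).
import itertools
import sympy as sp

x1, x2, x3, a, b, c = sp.symbols("x1 x2 x3 a b c")
A = sp.Matrix([
    [x1, 0, 0, 0, 0, 0],
    [0, x2, a, b, c, 0],
    [0, a, x2, c, b, 0],
    [0, b, c, x2, a, 0],
    [0, c, b, a, x2, 0],
    [0, 0, 0, 0, 0, x3],
])
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

def is_spanning_tree(tau):
    """Three edges on the vertices 1..4 form a spanning tree iff they connect."""
    reach = {1}
    for _ in range(3):                   # three sweeps suffice on 4 vertices
        for (u, v) in tau:
            if u in reach or v in reach:
                reach |= {u, v}
    return reach == {1, 2, 3, 4}

trees = [T for T in itertools.combinations(range(6), 3)
         if is_spanning_tree([edges[i] for i in T])]
assert len(trees) == 16                  # Cayley: 4^{4-2} spanning trees

p = sum(A[list(T), list(T)].det() for T in trees)
W = sp.expand(sp.diff(p, x1) * sp.diff(p, x3) - p * sp.diff(p, x1, x3))
assert not (W.has(x1) or W.has(x3))      # p is affine in x1 and in x3
print(W)  # compare with the closed form for W/4 in the proof of Lemma 6.3
```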

Lemma 6.3.

The polynomial WW is contained in the ideals J1,J2,J3J_{1},J_{2},J_{3} and J4J_{4} where

  1. (1)

    J1J_{1} is generated by all 2×22\times 2 minors of AA,

  2. (2)

    J2J_{2} is generated by all off-diagonal entries of AA,

  3. (3)

    J3J_{3} is generated by a,ca,c and x2x_{2},

  4. (4)

    J4J_{4} is generated by b,cb,c and x2x_{2}.

Proof.

Part (1) follows from the fact that both p and \frac{\partial p}{\partial x_{1}} are in J_{1}. The other claims are apparent since

\frac{1}{4}W=a^{2}b^{2}+a^{2}c^{2}+b^{2}c^{2}+c^{4}-8abcx_{2}+2a^{2}x_{2}^{2}+2b^{2}x_{2}^{2}. ∎
Definition 6.4.

An ideal II in a ring AA is called real radical if g12++gr2Ig_{1}^{2}+\cdots+g_{r}^{2}\in I implies g1,,grIg_{1},\ldots,g_{r}\in I for all g1,,grAg_{1},\ldots,g_{r}\in A.

Lemma 6.5.

The ideal J=k=14JkJ=\bigcap_{k=1}^{4}J_{k} is real radical.

Proof.

It suffices to show that each J_{k} is a radical ideal whose real points are Zariski dense in its zero set. This is clear for J_{2},J_{3} and J_{4}. Using Macaulay2 [12] we checked that J_{1} is radical. Moreover, the primary decomposition of J_{1} shows that the zero set of J_{1} is a union of linear spaces. This implies that the real zeros of J_{1} are Zariski dense as well. ∎

Theorem 6.6.

The polynomial WW is not a sum of squares.

Proof.

Assume for the sake of a contradiction that W=g12++gr2W=g_{1}^{2}+\cdots+g_{r}^{2} for some polynomials gig_{i}. Since WJW\in J by Lemma 6.3, Lemma 6.5 implies that each gig_{i} is in JJ. Thus WW is even in the ideal JJJ\cdot J. Using Macaulay2 [12] one checks that this is not the case. ∎

Remark 6.7.

In the terminology of [27] this shows in particular that p is neither SOS-hyperbolic nor weakly SOS-hyperbolic.

7. The Spectral Containment Property

We would like to relate the hyperbolicity cone of a homogeneous stable polynomial to the hyperbolicity cone of its minor lift. Recall from Definition 2.7 that a homogeneous multiaffine stable polynomial p has the spectral containment property if for any X\in H(P), there is some vector \lambda consisting of the eigenvalues of X with appropriate multiplicities such that \lambda\in H(p). Elementary symmetric polynomials have the spectral containment property, and in this section we show that several other polynomials have it as well. The remainder of the section is devoted to proving some sufficient conditions for the spectral containment property, as well as exhibiting connections between this property and the Schur–Horn theorem.

7.1. Schur–Horn Theorem and stable linear functions

Recall that a linear homogeneous polynomial p(x)=a1x1++anxnp(x)=a_{1}x_{1}+\dots+a_{n}x_{n} is stable if and only if either ai0a_{i}\geq 0 for each i[n]i\in[n], or ai0a_{i}\leq 0 for each i[n]i\in[n]. Moreover, in this case H(p)={xn:p(x)0}H(p)=\{x\in\mathbb{R}^{n}:p(x)\geq 0\}. These are the simplest stable polynomials and yet it is not completely trivial to show that they have the spectral containment property.

Theorem 7.1.

Every stable linear homogeneous polynomial has the spectral containment property.

In order to prove this, we will use Schur’s contribution to the Schur–Horn theorem.

Theorem 7.2 (Schur).

Let p:np:\mathbb{R}^{n}\rightarrow\mathbb{R} be a homogeneous linear function, and let PP be the associated minor lift. Let AA be a symmetric matrix and let λ\lambda be an eigenvalue vector for AA. Let 𝔖n\mathfrak{S}_{n} denote the symmetric group which acts on n\mathbb{R}^{n} by permuting coordinates. Let O(n)\textnormal{O}(n) denote the orthogonal group of n×nn\times n matrices. Then

maxπ𝔖np(π(λ))=maxUO(n)P(UAU).\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))=\max_{U\in\textnormal{O}(n)}P(UAU^{\intercal}).
Proof of Theorem 7.1.

Suppose that AH(P)A\in H(P), which is equivalent to P(A)0P(A)\geq 0. By the Schur–Horn theorem, there is some eigenvalue vector of AA, say λ\lambda, so that p(λ)P(A)0p(\lambda)\geq P(A)\geq 0. Thus, there is an eigenvalue vector of AA contained in H(p)H(p) as desired. ∎
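Schur's theorem is easy to probe numerically. The sketch below (ours, assuming numpy) draws orthogonal matrices via QR factorizations of Gaussian matrices and checks that no diagonal of UXU^{\intercal} beats the best permutation of the spectrum for a linear p.

```python
# Numerical illustration of Schur's theorem for a linear p (our sketch):
# here P(X) = <a, diag(X)> is the minor lift of p(x) = <a, x>.
import itertools
import numpy as np

a = np.array([3.0, 1.0, -2.0])
rng = np.random.default_rng(4)
X = rng.standard_normal((3, 3)); X = (X + X.T) / 2
lam = np.linalg.eigvalsh(X)

best = max(a @ np.array(pi) for pi in itertools.permutations(lam))
for _ in range(1000):
    U, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthogonal U
    assert a @ np.diag(U @ X @ U.T) <= best + 1e-9
```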

We will see in Section 7.4 that if an appropriate generalization of the Schur-Horn theorem holds, then we would be able to show the spectral containment property for a large class of polynomials.

7.2. Operations Preserving the Spectral Containment Property

In this section we prove that the spectral containment property is preserved under some simple operations involving adjoining a new variable.

Lemma 7.3.

Let q[x1,,xn]q\in\mathbb{R}[x_{1},\ldots,x_{n}] be stable, multiaffine and homogeneous. Let p[x0,,xn]p\in\mathbb{R}[x_{0},\ldots,x_{n}] be defined by p(x0,,xn)=q(x1,,xn)p(x_{0},\dots,x_{n})=q(x_{1},\dots,x_{n}). If qq has the spectral containment property, then pp has the spectral containment property.

Proof.

First note that x=(x_{0},\dots,x_{n})\in H(p) if and only if (x_{1},\dots,x_{n})\in H(q). Let X\in H(P); we can divide X into blocks as

X=(X00vvM).X=\begin{pmatrix}X_{00}&v^{\intercal}\\ v&M\end{pmatrix}.

Here, MM is equal to X|[n]X|_{[n]}, and vv is some element of n\mathbb{R}^{n}.

If I_{n} denotes the n\times n identity matrix, we can see from the definition of P that P(X+tI_{n+1})=Q(M+tI_{n}). Therefore, for t\geq 0, Q(M+tI_{n})=P(X+tI_{n+1})\geq 0, which implies M\in H(Q). Let \lambda(M) and \lambda(X) be eigenvalue vectors of M and X respectively, with the property that the entries of \lambda(M) and \lambda(X) appear in increasing order. The Cauchy interlacing inequalities say that

λ0(X)λ1(M)λ1(X)λ2(M)λ2(X)λn(M)λn(X).\lambda_{0}(X)\leq\lambda_{1}(M)\leq\lambda_{1}(X)\leq\lambda_{2}(M)\leq\lambda_{2}(X)\leq\dots\leq\lambda_{{n}}(M)\leq\lambda_{{n}}(X).

Thus for i\in[n] we can write \lambda_{i}(X)=\lambda_{i}(M)+\epsilon_{i} for some \epsilon_{i}\geq 0. Since q has the spectral containment property, there is a permutation \sigma such that (\lambda_{\sigma(i)}(M))_{1\leq i\leq n}\in H(q). Since the hyperbolicity cone of the stable polynomial q is convex and contains the nonnegative orthant, we also have (\lambda_{\sigma(i)}(X))_{1\leq i\leq n}=(\lambda_{\sigma(i)}(M)+\epsilon_{\sigma(i)})_{1\leq i\leq n}\in H(q). This implies that (\lambda_{0}(X),\lambda_{\sigma(1)}(X),\dots,\lambda_{\sigma(n)}(X))\in H(p). ∎

The spectral containment property is also preserved when multiplying by a new variable.

Proposition 7.4.

Let q\in\mathbb{R}[x_{1},\dots,x_{n}] be stable, multiaffine and homogeneous. Let p\in\mathbb{R}[x_{0},\dots,x_{n}] be defined by p(x_{0},\dots,x_{n})=x_{0}q(x_{1},\dots,x_{n}). If q has the spectral containment property, then p has the spectral containment property.

Before we show this, we need another lemma. Let XX be a matrix written in block form as

X=(X00vvM)X=\begin{pmatrix}X_{00}&v^{\intercal}\\ v&M\end{pmatrix}

and X000X_{00}\neq 0. We write X/0:=MX001vvX/0:=M-X_{00}^{-1}vv^{\intercal} for the Schur complement.

Lemma 7.5.

Let q\in\mathbb{R}[x_{1},\dots,x_{n}] be stable, multiaffine and homogeneous, and let p=x_{0}q\in\mathbb{R}[x_{0},\dots,x_{n}]. If X\in H(P) with X_{00}>0, then X/0\in H(Q).

Proof.

Note that a vector x=(x0,x1,,xn)H(p)x=(x_{0},x_{1},\dots,x_{n})\in H(p) if and only if x00x_{0}\geq 0 and (x1,,xn)H(q)(x_{1},\dots,x_{n})\in H(q). Recall the determinant formula for Schur complements: for any n×nn\times n matrix XX,

det(X)=X00det(X/0).\det(X)=X_{00}\det(X/0).

Also, it is not hard to see from the definition that if S{0,1,,n}S\subseteq\{0,1,\dots,n\}, and 0S0\in S, then

X|S/0=X/0|(S0),X|_{S}/0=X/0|_{(S\setminus 0)}\,,

that is, Schur complements interact naturally with taking submatrices. Therefore,

P(X)=S{0,,n}aSdet(X|S)=S{0,,n}aSX00det((X/0)|S0)=X00Q(X/0)P(X)=\sum_{S\subseteq\{0,\dots,n\}}a_{S}\det(X|_{S})=\sum_{S\subseteq\{0,\dots,n\}}a_{S}X_{00}\det((X/0)|_{S\setminus 0})=X_{00}Q(X/0)

Thus, if XH(P)X\in H(P) and X00>0X_{00}>0, then

Q(X/0)=\frac{P(X)}{X_{00}}\geq 0.

We can strengthen this result by noting that if we let JJ be the block diagonal matrix given by

J=\begin{pmatrix}0&0\\ 0&I_{n}\end{pmatrix},

then J\in H(P), since it is in particular positive semidefinite. It is clear from the definition that X/0+tI_{n}=(X+tJ)/0. Thus, we have that for all t\geq 0,

Q(X/0+tI_{n})=Q((X+tJ)/0)=\frac{P(X+tJ)}{X_{00}}\geq 0,

which implies that X/0H(Q)X/0\in H(Q). ∎
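The factorization P(X)=X_{00}\,Q(X/0) derived above can be confirmed symbolically, for instance for q=x_{1}+x_{2} (our sketch, assuming Python with sympy).

```python
# Check P(X) = X_00 * Q(X/0) for p = x0*(x1 + x2) (our illustration).
import sympy as sp

X = sp.Matrix(3, 3, lambda i, j: sp.Symbol(f"X_{min(i,j)}{max(i,j)}"))
# minor lift of p = x0*x1 + x0*x2
P = X[[0, 1], [0, 1]].det() + X[[0, 2], [0, 2]].det()

M = X[1:, 1:]
v = X[1:, 0]
schur = M - (v * v.T) / X[0, 0]                   # the Schur complement X/0
Q_at_schur = schur[0, 0] + schur[1, 1]            # minor lift of x1 + x2 at X/0
assert sp.simplify(P - X[0, 0] * Q_at_schur) == 0
```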

Proof of Proposition 7.4.

First assume that X00>0X_{00}>0. By Lemma 7.5, and the spectral containment property for qq, we have that there is an ordering of the eigenvalues of X/0X/0 so that λ(X/0)H(q)\lambda(X/0)\in H(q).

Now, we can write

X=(000X/0)+(X00vvX001vv),X=\begin{pmatrix}0&0\\ 0&X/0\end{pmatrix}+\begin{pmatrix}X_{00}&v^{\intercal}\\ v&X_{00}^{-1}vv^{\intercal}\end{pmatrix},

where the second term is a rank 1 positive semidefinite matrix.

Let X^{\prime}=\begin{pmatrix}0&0\\ 0&X/0\end{pmatrix}. Note that X^{\prime} is block diagonal, so that if \mu is an eigenvalue vector for X/0, then 0\oplus\mu is an eigenvalue vector for X^{\prime}. In particular, taking \mu=\lambda(X/0)\in H(q) as above and ordering the entries appropriately, 0\oplus\mu\in H(p), by our characterization of H(p) in terms of H(q).

By the Weyl inequalities, there is an ordering of the eigenvalues of X so that \lambda_{i}(X)\geq\lambda_{i}(X^{\prime}) for each i. This implies that

\lambda(X)=(0\oplus\mu)+w

where w is a nonnegative vector. Since H(p) is a convex cone containing the nonnegative orthant, it follows that \lambda(X)\in H(p).

The case X_{00}=0 follows from continuity of eigenvalues. If X\in H(P), then for every \epsilon>0 the matrix X+\epsilon I lies in H(P) and has top left entry (X+\epsilon I)_{00}>0, and hence has an eigenvalue vector in H(p) by the above. Since the eigenvalues of a symmetric matrix vary continuously with the matrix and H(p) is closed, the property of having an eigenvalue vector in H(p) passes to the limit \epsilon\to 0. Therefore X has an eigenvalue vector in H(p). ∎

7.3. Polynomials Interlaced by an Elementary Symmetric Polynomial

The spectral containment property can be proved more easily for polynomials that are interlaced by some elementary symmetric polynomial.

Before stating the main result, we note that the minor lift map preserves interlacing.

Lemma 7.6.

Let p,q[x1,,xn]p,q\in\mathbb{R}[x_{1},\ldots,x_{n}] be stable, multiaffine and homogeneous. Let P,QP,Q be the associated minor lifts. Then pp interlaces qq if and only if PP interlaces QQ.

Proof.

Assume that pp interlaces qq. Then by the multivariate Hermite–Biehler Theorem [7, Thm. 5.3] we have that p+iqp+iq is stable. Let AA be a symmetric n×nn\times n matrix. We have to show that P(tI+A)P(tI+A) interlaces Q(tI+A)Q(tI+A). From [5, Thm. 1.3] we see that the linear operator TAT_{A} that sends a multiaffine polynomial pp to the polynomial P(Diag(x1,,xn)+A)P(\operatorname*{Diag}(x_{1},\dots,x_{n})+A) is a stability preserver. Thus TA(p+iq)T_{A}(p+iq) is stable. Substituting tt for all variables in TA(p+iq)T_{A}(p+iq) shows that P(tI+A)+iQ(tI+A)P(tI+A)+iQ(tI+A) is stable. Now the claim follows from another application of the Hermite–Biehler Theorem. The other direction is clear, since pp and qq are the respective restrictions of PP and QQ to the diagonal matrices. ∎

Lemma 7.7.

Suppose that pp is a stable, multiaffine and homogeneous polynomial of degree dd, and that ed1e_{d-1} interlaces pp. Further suppose that for any XH(P)X\in H(P), there is some eigenvalue vector λ\lambda of XX, such that p(λ)P(X)p(\lambda)\geq P(X). Then pp has the spectral containment property.

Proof.

We first note the fact that if pp is any hyperbolic polynomial, and qq interlaces pp, then xx is in the interior of H(p)H(p) if and only if xx is in H(q)H(q) and p(x)>0p(x)>0. This follows easily from considering the bivariate case.

Let XX be in the interior of H(P)H(P). We first want to show that there is an eigenvalue vector of XX that is contained in H(p)H(p); the case for general XX will then follow from the fact that the eigenvalues of a symmetric matrix are continuous as a function of the entries of the matrix.

Since ed1e_{d-1} interlaces pp, by Lemma 7.6, we have that Ed1E_{d-1} interlaces PP. From this, we conclude that since XH(P)X\in H(P), XX is contained in H(Ed1)H(E_{d-1}), and so any vector of eigenvalues of XX is contained in H(ed1)H(e_{d-1}).

By assumption there is an eigenvalue vector \lambda of X with 0<P(X)\leq p(\lambda). Since \lambda\in H(e_{d-1}) and e_{d-1} interlaces p, this \lambda must be in the interior of H(p), as desired. ∎

In Proposition 7.9 below, we show that the set of stable multiaffine forms of degree d that are interlaced by e_{d-1} contains an open neighborhood of e_{d}. This implies that if we have a stable multiaffine form p which is sufficiently close to e_{d}, then p will have the spectral containment property as long as for any X\in H(P), there is some eigenvalue vector \lambda with p(\lambda)\geq P(X).

We will apply this lemma in a few cases, together with some variational characterizations for eigenvalues to show the spectral containment property for some special kinds of polynomials.

Lemma 7.8.

Let p,qp,q be multiaffine polynomials of degree d+1d+1 and dd, and let ana\in\mathbb{R}^{n}. There exist multiaffine polynomials m1,,ms,n1,,nsm_{1},\ldots,m_{s},n_{1},\ldots,n_{s} of degree dd such that

DapqpDaq=m1n1++msns.\operatorname{D}_{a}p\cdot q-p\cdot\operatorname{D}_{a}q=m_{1}n_{1}+\ldots+m_{s}n_{s}.
Proof.

This is straightforward. ∎

Proposition 7.9.

There is an open neighborhood UU of ed+1e_{d+1} in the vector space of multiaffine forms of degree d+1d+1 such that every stable multiaffine pUp\in U of degree d+1d+1 is interlaced by ede_{d}.

Proof.

Let II be the ideal generated by all multiaffine polynomials of degree dd and let VV be the degree 2d2d part of I2I^{2}. Let ΣV\Sigma\subset V be the set of all polynomials that can be written as a sum of squares of multiaffine polynomials of degree dd. It follows from the proof of [18, Thm. 6.2] that Deed+1eded+1Deed\operatorname{D}_{e}e_{d+1}\cdot e_{d}-e_{d+1}\cdot\operatorname{D}_{e}e_{d} is in the interior of Σ\Sigma (with respect to the euclidean topology on VV). Thus it follows from Lemma 7.8 that there is an open neighborhood UU of ed+1e_{d+1} such that for every stable multiaffine pUp\in U the polynomial DepedpDeed\operatorname{D}_{e}p\cdot e_{d}-p\cdot\operatorname{D}_{e}e_{d} is in Σ\Sigma. Thus ede_{d} interlaces pp by [19, Thm. 2.1]. ∎

7.4. Generalized Schur-Horn Property and the Spectral Containment Property

We say that an n-variate multiaffine homogeneous polynomial p has the Schur-Horn property if for any n\times n symmetric matrix X with eigenvalue vector \lambda,

maxπ𝔖np(π(λ))=maxUO(n)P(UXU).\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))=\max_{U\in O(n)}P(UXU^{\intercal}).

The Schur-Horn property for pp is equivalent to the fact that for any n×nn\times n symmetric matrix XX with eigenvalue vector λ\lambda,

maxπ𝔖np(π(λ))P(X).\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))\geq P(X).

Another equivalent formulation states that pp has the Schur-Horn property if and only if the maximum of P(UXU)P(UXU^{\intercal}) as UU varies over O(n)O(n) is obtained for some UU such that UXUUXU^{\intercal} is diagonal.

The Schur–Horn theorem states that any linear homogeneous polynomial has the Schur-Horn property. We now relate the Schur-Horn property to the spectral containment property.

Theorem 7.10.

Let p be a homogeneous multiaffine form of degree d. If p has the Schur-Horn property, and e_{d-1} interlaces p, then p has the spectral containment property.

Proof.

It is clear that if pp has the Schur-Horn property, then in particular, for any XH(P)X\in H(P), there is some eigenvalue vector λ\lambda so that p(λ)P(X)p(\lambda)\geq P(X). Therefore, pp has the spectral containment property by Lemma 7.7. ∎

Using the Schur-Horn property and our previous lemmas, we can show that a family of stable polynomials has the spectral containment property.

Lemma 7.11.

If pp is a degree dd homogeneous multiaffine polynomial with the Schur-Horn property, then ed(x)+pe_{d}(x)+p also has the Schur-Horn property.

Proof.

It can easily be seen that if X is an n\times n symmetric matrix with an eigenvalue vector \lambda, then

maxπ𝔖n(ed(π(λ))+p(π(λ)))\displaystyle\max_{\pi\in\mathfrak{S}_{n}}(e_{d}(\pi(\lambda))+p(\pi(\lambda))) =ed(λ)+maxπ𝔖np(π(λ))\displaystyle=e_{d}(\lambda)+\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))
=Ed(X)+maxUO(n)P(UXU)\displaystyle=E_{d}(X)+\max_{U\in O(n)}P(UXU^{\intercal})
=maxUO(n)Ed(UXU)+P(UXU).\displaystyle=\max_{U\in O(n)}E_{d}(UXU^{\intercal})+P(UXU^{\intercal}).

Here the last equality uses that E_{d}(UXU^{\intercal})=e_{d}(\lambda)=E_{d}(X) for every U\in O(n). This gives the desired result. ∎

Lemma 7.12.

If pp is a degree dd homogeneous multiaffine polynomial with the Schur-Horn property, then for ϵ>0\epsilon>0 sufficiently small, ed(x)+ϵpe_{d}(x)+\epsilon p has the spectral containment property.

Proof.

By Proposition 7.9, we see that for \epsilon sufficiently small, e_{d}(x)+\epsilon p is interlaced by e_{d-1}. Moreover, by Lemma 7.11, e_{d}(x)+\epsilon p has the Schur-Horn property. Therefore, by Theorem 7.10, e_{d}(x)+\epsilon p has the spectral containment property. ∎

We now give some examples of polynomials with the Schur-Horn property.

7.5. The Schur-Horn Property For Degree n1n-1 Polynomials

Theorem 7.13.

If p\in\mathbb{R}[x_{1},\dots,x_{n}] is a degree n-1 multiaffine homogeneous polynomial, then p has the Schur-Horn property.

Proof.

Write p(x)=\sum_{i=1}^{n}a_{i}\prod_{j\in[n]\setminus\{i\}}x_{j}. In this case,

P(X)=\sum_{i=1}^{n}a_{i}\det(X|_{[n]\setminus\{i\}}).

Recall that the dual of p(x)p(x) was defined in Section 3, as

p(x)=i=1naixi.p^{*}(x)=\sum_{i=1}^{n}a_{i}x_{i}.

Abusing notation, we define PP^{*} to be

P(X)=i=1naiXii.P^{*}(X)=\sum_{i=1}^{n}a_{i}X_{ii}.

Define the adjugate matrix of XX by Adj(X)=det(X)X1\operatorname*{Adj}(X)=\det(X)X^{-1}. By Cramer’s rule, the diagonal entries of the adjugate matrix are given by

Adj(X)ii=det(X|[n]i).\operatorname*{Adj}(X)_{ii}=\det(X|_{[n]\setminus i}).

Hence, using Remark 3.1, we see that P(Adj(X))=P(X)P^{*}(\operatorname*{Adj}(X))=P(X).

The eigenvalues of \operatorname{Adj}(X) are of the form \mu_{j}=\prod_{i\in[n]\setminus\{j\}}\lambda_{i}, where \lambda is an eigenvalue vector of X. We see then that p^{*}(\mu)=p(\lambda). Now we apply the Schur–Horn theorem to the linear form p^{*} and the matrix \operatorname{Adj}(X) to see that

maxπ𝔖np(π(μ))=maxUO(n)P(UAdj(X)U).\max_{\pi\in\mathfrak{S}_{n}}p^{*}(\pi(\mu))=\max_{U\in O(n)}P^{*}(U^{\intercal}\operatorname*{Adj}(X)U).

Applying our identities relating the vector \mu to \lambda, we see that

\max_{\pi\in\mathfrak{S}_{n}}p(\pi(\lambda))=\max_{U\in O(n)}P(U^{\intercal}XU). ∎

From this, we immediately obtain a corollary.

Corollary 7.14.

There is an open set UU in the space of degree n1n-1 homogeneous multiaffine polynomials, such that UU contains en1e_{n-1} and every element of UU is stable and has the spectral containment property.
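The two facts about the adjugate used in the proof of Theorem 7.13, namely that the diagonal entries of \operatorname{Adj}(X) are the (n-1)\times(n-1) principal minors of X and that its eigenvalues are the products of all but one eigenvalue of X, are easy to check numerically (our sketch, assuming numpy).

```python
# Numerical check of the adjugate facts from the proof of Theorem 7.13
# (our illustration).
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((4, 4))
X = X @ X.T + np.eye(4)                        # symmetric and invertible
adj = np.linalg.det(X) * np.linalg.inv(X)      # Adj(X) = det(X) X^{-1}

# diagonal entries of Adj(X) are the (n-1) x (n-1) principal minors
for i in range(4):
    J = [j for j in range(4) if j != i]
    assert np.isclose(adj[i, i], np.linalg.det(X[np.ix_(J, J)]))

# eigenvalues of Adj(X) are mu_j = prod_{i != j} lambda_i
lam = np.linalg.eigvalsh(X)
mu = sorted(np.prod([lam[i] for i in range(4) if i != j]) for j in range(4))
assert np.allclose(sorted(np.linalg.eigvalsh(adj)), mu)
```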

7.6. Extensions of Elementary Symmetric Polynomials and the Schur–Horn Property

We note that it is unclear whether the Schur–Horn property is preserved by adding extra variables. We show that this holds for elementary symmetric polynomials.

Proposition 7.15.

Fix natural numbers dknd\leq k\leq n. Let p=±ed(x1,,xk)[x1,,xn]p=\pm e_{d}(x_{1},\dots,x_{k})\in\mathbb{R}[x_{1},\dots,x_{n}]. Then, pp has the Schur-Horn property.

We can reduce this to the classical Schur-Horn theorem. To do this, we require a lemma involving a construction, which is referred to in [22, Chapter 3] as a derivation of a matrix XX.

Lemma 7.16.

For any n×nn\times n symmetric matrix XX, with eigenvalue vector λ\lambda, there exists a (nk)×(nk)\binom{n}{k}\times\binom{n}{k} symmetric matrix Dk,dXD^{k,d}X with the following two properties:

  • The eigenvalues of Dk,dXD^{k,d}X are precisely those real numbers of the form

    ed(λs1,λs2,,λsk),e_{d}(\lambda_{s_{1}},\lambda_{s_{2}},\dots,\lambda_{s_{k}}),

    where we range over all possible values of s1,,sk[n]s_{1},\dots,s_{k}\in[n] so that s1<s2<<sks_{1}<s_{2}<\dots<s_{k}.

  • The diagonal entries of Dk,dXD^{k,d}X are precisely those of the form Ed(X|S)E_{d}(X|_{S}), where SS ranges over the size kk subsets of [n][n].

Proof of Lemma 7.16.

We will define Dk,dXD^{k,d}X in terms of wedge powers. If we regard XX as an endomorphism from n\mathbb{R}^{n} to n\mathbb{R}^{n}, then Dk,dXD^{k,d}X is defined as an endomorphism of kn\wedge^{k}\mathbb{R}^{n} by letting

D^{k,d}X(v_{1}\wedge v_{2}\wedge\dots\wedge v_{k})=\sum_{S\in\binom{[k]}{d}}w_{S,1}\wedge w_{S,2}\wedge\dots\wedge w_{S,k},

where

w_{S,i}=\begin{cases}Xv_{i}\text{ if }i\in S\\ v_{i}\text{ if }i\not\in S.\end{cases}

It is not hard to see that if v_{1},\dots,v_{k} are linearly independent eigenvectors of X with eigenvalues \lambda_{1},\dots,\lambda_{k} respectively, then v_{1}\wedge\dots\wedge v_{k} is an eigenvector of D^{k,d}X with eigenvalue e_{d}(\lambda_{1},\dots,\lambda_{k}); applying this to the k-element subsets of an eigenbasis of X yields the first property in Lemma 7.16.

On the other hand, if we use the natural basis of \wedge^{k}\mathbb{R}^{n} given by \{e_{s_{1}}\wedge e_{s_{2}}\wedge\dots\wedge e_{s_{k}}\}, where e_{i} is a standard basis vector and s_{1}<s_{2}<\dots<s_{k}, then this basis is orthonormal under the natural inner product of \wedge^{k}\mathbb{R}^{n}, and also

(es1es2esk)Dk,dX(es1es2esk)=Ed(X|{s1,,sk}).(e_{s_{1}}\wedge e_{s_{2}}\wedge\dots\wedge e_{s_{k}})^{\intercal}D^{k,d}X(e_{s_{1}}\wedge e_{s_{2}}\wedge\dots\wedge e_{s_{k}})=E_{d}(X|_{\{s_{1},\dots,s_{k}\}}).

This clearly implies the second property of Lemma 7.16. ∎
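The construction in this proof is explicit enough to implement directly. The following Python sketch (numpy; all function names are ours) builds D^{k,d}X in the basis \{e_{s_{1}}\wedge\dots\wedge e_{s_{k}}\} and verifies both properties of Lemma 7.16 on a random symmetric matrix:

import itertools
import numpy as np

def e_sym(vals, d):
    # elementary symmetric polynomial e_d of the entries of vals
    return sum(np.prod(c) for c in itertools.combinations(vals, d))

def Ed(Y, d):
    # sum of all d x d principal minors of Y
    return sum(np.linalg.det(Y[np.ix_(S, S)])
               for S in itertools.combinations(range(Y.shape[0]), d))

def Dkd(X, k, d):
    # D^{k,d}X in the orthonormal basis e_{s_1} ^ ... ^ e_{s_k}, s_1 < ... < s_k
    n = X.shape[0]
    basis = list(itertools.combinations(range(n), k))
    I, M = np.eye(n), np.zeros((len(basis), len(basis)))
    for b, T in enumerate(basis):
        for S in itertools.combinations(range(k), d):
            # w_i = X e_{t_i} for i in S, w_i = e_{t_i} otherwise
            W = np.column_stack([X @ I[:, t] if i in S else I[:, t]
                                 for i, t in enumerate(T)])
            for r, R in enumerate(basis):
                M[r, b] += np.linalg.det(W[list(R), :])  # coefficient of e_R in the wedge
    return M

rng = np.random.default_rng(0)
n, k, d = 5, 3, 2
A = rng.standard_normal((n, n)); X = (A + A.T) / 2
M = Dkd(X, k, d)
lam = np.linalg.eigvalsh(X)
spec = sorted(e_sym(lam[list(S)], d) for S in itertools.combinations(range(n), k))
assert np.allclose(np.sort(np.linalg.eigvalsh(M)), spec)              # first property
diag = [Ed(X[np.ix_(T, T)], d) for T in itertools.combinations(range(n), k)]
assert np.allclose(np.diag(M), diag)                                  # second property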

Proof of 7.15.

The classical Schur-Horn theorem implies that for any symmetric matrix XX,

maxs1<s2<<sked(λs1,λs2,,λsk)maxS([n]k)Ed(X|S)Ed(X|1,,k),\max_{s_{1}<s_{2}<\dots<s_{k}}e_{d}(\lambda_{s_{1}},\lambda_{s_{2}},\dots,\lambda_{s_{k}})\geq\max_{S\in\binom{[n]}{k}}E_{d}(X|_{S})\geq E_{d}(X|_{1,\dots,k}),

and also that

mins1<s2<<sked(λs1,λs2,,λsk)minS([n]k)Ed(X|S)Ed(X|1,,k).\min_{s_{1}<s_{2}<\dots<s_{k}}e_{d}(\lambda_{s_{1}},\lambda_{s_{2}},\dots,\lambda_{s_{k}})\leq\min_{S\in\binom{[n]}{k}}E_{d}(X|_{S})\leq E_{d}(X|_{1,\dots,k}).

The first statement implies the Schur-Horn property for e_{d}(x_{1},\dots,x_{k}) and the second implies it for -e_{d}(x_{1},\dots,x_{k}): applying the inequalities to U^{\intercal}XU in place of X (which does not change \lambda) bounds P(U^{\intercal}XU) by \max_{\pi}p(\pi(\lambda)), and equality is attained when U diagonalizes X with the eigenvalues suitably ordered. ∎

7.7. A Small Example of the Schur-Horn Property

We give one more example of the Schur-Horn property, which is noteworthy because our proof does not appeal to the classical Schur-Horn theorem.

Lemma 7.17.

The polynomial x1(x2+x3)[x1,x2,x3,x4]x_{1}(x_{2}+x_{3})\in\mathbb{R}[x_{1},x_{2},x_{3},x_{4}] has the Schur-Horn property.

Remark 7.18.

The polynomial x1(x2+x3)[x1,x2,x3]x_{1}(x_{2}+x_{3})\in\mathbb{R}[x_{1},x_{2},x_{3}] clearly has the Schur-Horn property, by Theorem 7.13, but it is not clear that this remains the case if we introduce a new variable.

Proof.

Let D=\operatorname{Diag}(\lambda_{1},\lambda_{2},\lambda_{3},\lambda_{4}). Let U be in \operatorname{SO}(4); this is no loss of generality, since flipping the sign of a column of U changes neither UDU^{\intercal} nor the quantities below. Write its columns as

U=(vwzy).U=\begin{pmatrix}v&w&z&y\end{pmatrix}.

It is not hard to see via an explicit computation that

P(UDU)=det(i=14λivi2i=14λiwivii=14λiwivii=14λiwi2)+det(i=14λivi2i=14λizivii=14λizivii=14λizi2).P(UDU^{\intercal})=\det\begin{pmatrix}\sum_{i=1}^{4}\lambda_{i}v_{i}^{2}&\sum_{i=1}^{4}\lambda_{i}w_{i}v_{i}\\ \sum_{i=1}^{4}\lambda_{i}w_{i}v_{i}&\sum_{i=1}^{4}\lambda_{i}w_{i}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}\sum_{i=1}^{4}\lambda_{i}v_{i}^{2}&\sum_{i=1}^{4}\lambda_{i}z_{i}v_{i}\\ \sum_{i=1}^{4}\lambda_{i}z_{i}v_{i}&\sum_{i=1}^{4}\lambda_{i}z_{i}^{2}\\ \end{pmatrix}.\\

We expand this expression by multilinearity of the determinant, noting that the terms with i=j vanish, to obtain the following:

(2) i=14j=14λiλj(det(vi2wjvjwiviwj2)+det(vi2zjvjzivizj2))\displaystyle\sum_{i=1}^{4}\sum_{j=1}^{4}\lambda_{i}\lambda_{j}\left(\det\begin{pmatrix}v_{i}^{2}&w_{j}v_{j}\\ w_{i}v_{i}&w_{j}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}v_{i}^{2}&z_{j}v_{j}\\ z_{i}v_{i}&z_{j}^{2}\\ \end{pmatrix}\right)
(3) =i=14j<iλiλj(det(vi2wjvjwiviwj2)+det(vi2zjvjzivizj2)+det(vj2wiviwjvjwi2)+det(vj2zivizjvjzi2))\displaystyle=\sum_{i=1}^{4}\sum_{j<i}\lambda_{i}\lambda_{j}\left(\det\begin{pmatrix}v_{i}^{2}&w_{j}v_{j}\\ w_{i}v_{i}&w_{j}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}v_{i}^{2}&z_{j}v_{j}\\ z_{i}v_{i}&z_{j}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}v_{j}^{2}&w_{i}v_{i}\\ w_{j}v_{j}&w_{i}^{2}\\ \end{pmatrix}+\det\begin{pmatrix}v_{j}^{2}&z_{i}v_{i}\\ z_{j}v_{j}&z_{i}^{2}\\ \end{pmatrix}\right)
(4) =i=14j<iλiλj(det(viwivjwj)2+det(vizivjzj)2).\displaystyle=\sum_{i=1}^{4}\sum_{j<i}\lambda_{i}\lambda_{j}\left(\det\begin{pmatrix}v_{i}&w_{i}\\ v_{j}&w_{j}\\ \end{pmatrix}^{2}+\det\begin{pmatrix}v_{i}&z_{i}\\ v_{j}&z_{j}\\ \end{pmatrix}^{2}\right).

We can think of this as a polynomial

γ(λ)=i=14j<iγi,jλiλj,\gamma(\lambda)=\sum_{i=1}^{4}\sum_{j<i}\gamma_{i,j}\lambda_{i}\lambda_{j},

where

γi,j=det(viwivjwj)2+det(vizivjzj)2.\gamma_{i,j}=\det\begin{pmatrix}v_{i}&w_{i}\\ v_{j}&w_{j}\\ \end{pmatrix}^{2}+\det\begin{pmatrix}v_{i}&z_{i}\\ v_{j}&z_{j}\\ \end{pmatrix}^{2}.
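The expansion above is easy to confirm numerically. A minimal Python sketch (numpy; names ours) checking P(UDU^{\intercal})=\gamma(\lambda) for a random orthogonal U:

import numpy as np

rng = np.random.default_rng(0)
lam = rng.standard_normal(4)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal matrix
v, w, z, y = U.T                                   # the columns of U
Y = U @ np.diag(lam) @ U.T

# P(X) = det(X_{1,2}) + det(X_{1,3}) is the minor lift of p = x1(x2 + x3)
P = (np.linalg.det(Y[np.ix_([0, 1], [0, 1])])
     + np.linalg.det(Y[np.ix_([0, 2], [0, 2])]))
gamma = lambda i, j: (v[i]*w[j] - v[j]*w[i])**2 + (v[i]*z[j] - v[j]*z[i])**2
assert abs(P - sum(gamma(i, j) * lam[i] * lam[j]
                   for i in range(4) for j in range(i))) < 1e-8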

We make the following claim:

Lemma 7.19.

γ(λ)\gamma(\lambda) is in the convex hull of the polynomials

ai,j,k(λ)=λi(λj+λk)a_{i,j,k}(\lambda)=\lambda_{i}(\lambda_{j}+\lambda_{k})

where i,j,k\in[4] are pairwise distinct (note that a_{i,j,k}=a_{i,k,j}, so such a polynomial is determined by i together with the pair \{j,k\}\subseteq[4]\setminus\{i\}).

To see that Lemma 7.19 implies the theorem, observe that for any pairwise distinct i,j,k, the value a_{i,j,k}(\lambda) equals p evaluated at a suitable permutation of \lambda, so that

a_{i,j,k}(\lambda)\leq\max_{\pi\in S_{4}}p(\lambda_{\pi(1)},\lambda_{\pi(2)},\lambda_{\pi(3)},\lambda_{\pi(4)}).

Hence, in particular, any convex combination of the a_{i,j,k} is bounded above by this same quantity.

Therefore, since every symmetric matrix is diagonalizable, we have that for any symmetric matrix XX, and any orthogonal UU,

P(U^{\intercal}XU)\leq\max_{\pi\in S_{4}}p(\lambda_{\pi(1)},\lambda_{\pi(2)},\lambda_{\pi(3)},\lambda_{\pi(4)}).

The opposite inequality is easy to see by choosing UU to be an orthogonal matrix diagonalizing XX. ∎

It remains to show Lemma 7.19.

Proof of Lemma 7.19.

Since we work in dimension 4, we can find the inequalities defining this convex hull explicitly using computational methods. It turns out that the polynomial \gamma is in the convex hull of the a_{i,j,k} if and only if for each \{i,j\}\in\binom{[4]}{2},

\gamma_{i,j}\geq 0,

if \{i,j,k,l\}=\{1,2,3,4\}, then

\gamma_{i,j}+\gamma_{k,l}\leq 1,

and

\sum_{\{i,j\}\in\binom{[4]}{2}}\gamma_{i,j}=2.

For our particular values of \gamma_{i,j}, each is by definition a sum of two squares and hence nonnegative. Further, the sum of all of the coefficients of \gamma is \gamma(1,1,1,1), so returning to the definition of the polynomial \gamma,

γ(1,1,1,1)=P(I)=2.\gamma(1,1,1,1)=P(I)=2.

It remains to show that

\gamma_{i,j}+\gamma_{k,l}\leq 1

whenever \{i,j,k,l\}=\{1,2,3,4\}.

Notice that

\gamma_{i,j}+\gamma_{k,l}=\det\begin{pmatrix}v_{i}&w_{i}\\ v_{j}&w_{j}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{i}&z_{i}\\ v_{j}&z_{j}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{k}&w_{k}\\ v_{l}&w_{l}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{k}&z_{k}\\ v_{l}&z_{l}\end{pmatrix}^{2}.

We prove that this is at most 1. Recall that

U=(vwzy)=(v1w1z1y1v2w2z2y2v3w3z3y3v4w4z4y4).U=\begin{pmatrix}v&w&z&y\end{pmatrix}=\begin{pmatrix}v_{1}&w_{1}&z_{1}&y_{1}\\ v_{2}&w_{2}&z_{2}&y_{2}\\ v_{3}&w_{3}&z_{3}&y_{3}\\ v_{4}&w_{4}&z_{4}&y_{4}\\ \end{pmatrix}.

The Jacobi complementary minors theorem for matrix inverses implies that if S,T\subseteq[4] with |S|=|T|, then

\det(U|_{S,T})=\pm\det(U)\det(U^{-1}|_{T^{c},S^{c}})=\pm\det(U^{\intercal}|_{T^{c},S^{c}})=\pm\det(U|_{S^{c},T^{c}}).

We now see that

\det\begin{pmatrix}v_{i}&w_{i}\\ v_{j}&w_{j}\end{pmatrix}^{2}=\det\begin{pmatrix}z_{k}&y_{k}\\ z_{l}&y_{l}\end{pmatrix}^{2}.

Similarly, we must have

det(vizivjzj)2=det(wkykwlyl)2.\det\begin{pmatrix}v_{i}&z_{i}\\ v_{j}&z_{j}\end{pmatrix}^{2}=\det\begin{pmatrix}w_{k}&y_{k}\\ w_{l}&y_{l}\end{pmatrix}^{2}.

Let

M=\begin{pmatrix}v_{k}&w_{k}&z_{k}&y_{k}\\ v_{l}&w_{l}&z_{l}&y_{l}\end{pmatrix}

be the 2\times 4 submatrix of U formed by rows k and l.

Notice that since U is orthogonal, these two rows are orthonormal, and so MM^{\intercal}=I. Applying the Cauchy–Binet theorem to \det(MM^{\intercal})=\det(I) we see that

1 =\det(MM^{\intercal})
=\sum_{S\in\binom{[4]}{2}}\det(M_{S})\det(M_{S}^{\intercal})
=\sum_{S\in\binom{[4]}{2}}\det(M_{S})^{2}
\geq\det\begin{pmatrix}z_{k}&y_{k}\\ z_{l}&y_{l}\end{pmatrix}^{2}+\det\begin{pmatrix}w_{k}&y_{k}\\ w_{l}&y_{l}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{k}&w_{k}\\ v_{l}&w_{l}\end{pmatrix}^{2}+\det\begin{pmatrix}v_{k}&z_{k}\\ v_{l}&z_{l}\end{pmatrix}^{2}
=\gamma_{i,j}+\gamma_{k,l}\,,

where M_{S} denotes the 2\times 2 submatrix of M whose columns are indexed by S, and the last equality uses the two Jacobi identities above.

The result now follows. ∎
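The three families of conditions used in the proof of Lemma 7.19 can also be spot-checked numerically. A minimal Python sketch (numpy; names ours):

import numpy as np

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal matrix
v, w, z, y = U.T                                   # the columns of U
gamma = lambda i, j: (v[i]*w[j] - v[j]*w[i])**2 + (v[i]*z[j] - v[j]*z[i])**2
pairs = [(i, j) for i in range(4) for j in range(i)]
assert all(gamma(i, j) >= 0 for i, j in pairs)                 # nonnegativity
assert abs(sum(gamma(i, j) for i, j in pairs) - 2) < 1e-8      # coefficients sum to 2
for (i, j), (k, l) in [((1, 0), (3, 2)), ((2, 0), (3, 1)), ((3, 0), (2, 1))]:
    assert gamma(i, j) + gamma(k, l) <= 1 + 1e-8               # complementary pairs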

8. The permutation property

The goal of this section is to prove Theorem 2.11. It says that given any point v in the hyperbolicity cone of e_{k} and any other homogeneous stable multiaffine polynomial h of the same degree, some permutation of the coordinates of v is in the hyperbolicity cone of h. We call this remarkable property of e_{k} the permutation property. We first need some preparation.

Lemma 8.1.

Assume that the homogeneous stable polynomials g,h[x1,,xn]g,h\in\mathbb{R}[x_{1},\ldots,x_{n}] have nonnegative coefficients and a common interlacer. Then f=g+hf=g+h is stable. If vv is in the hyperbolicity cone of ff, then vv is in the hyperbolicity cone of gg or in the hyperbolicity cone of hh.

Proof.

Let e be the all-ones vector. The univariate polynomials F=f(te-v), G=g(te-v) and H=h(te-v) have a common interlacer. Further, all roots of F are nonnegative. The existence of a common interlacer implies that G and H have at most one negative root each. Assume for the sake of contradiction that both G and H have a negative root. Then G and H have the same (nonzero) sign at the smallest root of F. This contradicts F=G+H. Thus either G or H has only nonnegative roots, which implies the claim. ∎

Lemma 8.2.

Let h[x1,,xn]h\in\mathbb{R}[x_{1},\ldots,x_{n}] be homogeneous, multiaffine and stable. Let τ𝔖n\tau\in\mathfrak{S}_{n} be a transposition. Then hh and τ(h)\tau(h) have a common interlacer.

Proof.

Without loss of generality assume that τ=(12)\tau=(12) and let g=τ(h)g=\tau(h). We can write

h=Ax1x2+Bx1+Cx2+Dh=A\cdot x_{1}\cdot x_{2}+B\cdot x_{1}+C\cdot x_{2}+D

for some multiaffine A,B,C,D[x3,,xn]A,B,C,D\in\mathbb{R}[x_{3},\ldots,x_{n}]. Then the polynomial

(x1+x2)h=A(x1+x2)+B+C=(x1+x2)g\left(\frac{\partial}{\partial x_{1}}+\frac{\partial}{\partial x_{2}}\right)h=A\cdot(x_{1}+x_{2})+B+C=\left(\frac{\partial}{\partial x_{1}}+\frac{\partial}{\partial x_{2}}\right)g

is a common interlacer of hh and gg. ∎
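For a concrete instance of Lemma 8.2, take h=x_{1}(x_{2}+x_{3}) and \tau=(12), so that the common interlacer constructed above is x_{1}+x_{2}+x_{3}. The following Python sketch (numpy; names ours) verifies the interlacing numerically along random lines in direction \mathbf{1}:

import numpy as np

h  = lambda x: x[0] * (x[1] + x[2])        # h = x1(x2 + x3)
g  = lambda x: x[1] * (x[0] + x[2])        # g = (12)h
f0 = lambda x: x[0] + x[1] + x[2]          # candidate common interlacer

def line_roots(f, v, deg):
    # roots of t -> f(v - t*1), recovered by interpolation at deg+1 points
    ts = np.arange(deg + 1, dtype=float)
    vals = [f(v - t * np.ones(3)) for t in ts]
    return np.sort(np.roots(np.polyfit(ts, vals, deg)).real)

rng = np.random.default_rng(0)
for _ in range(1000):
    v = rng.standard_normal(3)
    r = line_roots(f0, v, 1)[0]
    for f in (h, g):
        lo, hi = line_roots(f, v, 2)
        assert lo - 1e-7 <= r <= hi + 1e-7   # the root of f0 separates those of f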

Corollary 8.3.

Let h[x1,,xn]h\in\mathbb{R}[x_{1},\ldots,x_{n}] be homogeneous, multiaffine and stable. Let τ𝔖n\tau\in\mathfrak{S}_{n} be a transposition, g=τ(h)g=\tau(h) and f=λg+μhf=\lambda g+\mu h for some nonnegative λ,μ\lambda,\mu\in\mathbb{R}. Then C(f,e)C(g,e)C(h,e)\textnormal{C}(f,e)\subset\textnormal{C}(g,e)\cup\textnormal{C}(h,e).

Proof.

This is a direct consequence of the two preceding lemmas. ∎

Let \mathbb{Q}[\mathfrak{S}_{n}] be the group algebra of the symmetric group \mathfrak{S}_{n} on n elements, i.e., \mathbb{Q}[\mathfrak{S}_{n}] is the vector space over \mathbb{Q} with basis e_{g} for g\in\mathfrak{S}_{n}, whose ring structure is defined by extending e_{g}\cdot e_{h}:=e_{g\cdot h} linearly. In \mathbb{Q}[\mathfrak{S}_{n}] we have the identity

(5) j=2ni=1j1(1+1jie(ij))=g𝔖neg,\prod_{j=2}^{n}\prod_{i=1}^{j-1}\left(1+\frac{1}{j-i}\cdot e_{(ij)}\right)=\sum_{g\in\mathfrak{S}_{n}}e_{g},

see for example [23, p. 192]; a quick computational check of this identity for small n is given below. From this we obtain our desired theorem.
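A minimal Python sketch verifying (5) for n=3 (the encoding of group algebra elements as dictionaries keyed by permutations is ours):

from fractions import Fraction
from itertools import permutations

n = 3
ident = tuple(range(n))

def mult(a, b):
    # product in the group algebra; permutations stored as tuples of images
    c = {}
    for g, x in a.items():
        for h, y in b.items():
            gh = tuple(g[h[i]] for i in range(n))   # composition: apply h, then g
            c[gh] = c.get(gh, Fraction(0)) + x * y
    return c

prod = {ident: Fraction(1)}
for j in range(1, n):            # j runs over 2, ..., n (0-indexed here)
    for i in range(j):           # i runs over 1, ..., j-1
        t = list(ident); t[i], t[j] = t[j], t[i]    # the transposition (i j)
        prod = mult(prod, {ident: Fraction(1), tuple(t): Fraction(1, j - i)})

assert prod == {tuple(s): Fraction(1) for s in permutations(range(n))}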

Theorem 8.4.

Let e_{d}\in\mathbb{R}[x_{1},\ldots,x_{n}] be the elementary symmetric polynomial of degree d and h\in\mathbb{R}[x_{1},\ldots,x_{n}] any other nonzero homogeneous multiaffine stable polynomial of degree d. If v is in the hyperbolicity cone of e_{d}, then \tau(v) is in the hyperbolicity cone of h for some permutation \tau\in\mathfrak{S}_{n}.

Proof.

We have c\cdot e_{d}=(\sum_{g\in\mathfrak{S}_{n}}e_{g})h for some nonzero scalar c\in\mathbb{R}. Thus by Equation 5 we can write

c\cdot e_{d}=\left(\prod_{i=1}^{r}(1+\lambda_{i}e_{\tau_{i}})\right)h

for some positive \lambda_{i}\in\mathbb{R}, transpositions \tau_{i}\in\mathfrak{S}_{n} and r=\binom{n}{2}. We define h_{k}=\left(\prod_{i=1}^{k}(1+\lambda_{i}e_{\tau_{i}})\right)h for k=0,\ldots,r. Since h_{k}=h_{k-1}+\lambda_{k}\tau_{k}(h_{k-1}), 8.3 implies that if v is in the hyperbolicity cone of h_{k}, then either v or \tau_{k}(v) is in the hyperbolicity cone of h_{k-1}. Since h_{r}=c\cdot e_{d} and h_{0}=h, this argument shows that if v is in the hyperbolicity cone of e_{d}, then (\tau_{i_{1}}\circ\cdots\circ\tau_{i_{s}})(v) is in the hyperbolicity cone of h for some 1\leq i_{1}<\cdots<i_{s}\leq r. ∎
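The theorem can be illustrated numerically. In the following Python sketch (numpy; names ours) we take n=3, d=2 and h=x_{1}(x_{2}+x_{3}), sample random points in the hyperbolicity cone of e_{2}, and check that some permutation of each sampled point lies in the hyperbolicity cone of h:

import numpy as np
from itertools import permutations

e2 = lambda x: x[0]*x[1] + x[0]*x[2] + x[1]*x[2]
h  = lambda x: x[0] * (x[1] + x[2])      # multiaffine, stable, degree 2

def line_roots(f, v):
    # roots of t -> f(v - t*1) for a quadratic f, by interpolation
    ts = np.array([0.0, 1.0, 2.0])
    vals = [f(v - t * np.ones(3)) for t in ts]
    return np.roots(np.polyfit(ts, vals, 2)).real

rng = np.random.default_rng(0)
hits = 0
while hits < 100:
    v = rng.standard_normal(3)
    if line_roots(e2, v).min() < 0:
        continue                          # v is not in H(e2); resample
    hits += 1
    assert any(line_roots(h, np.array(s)).min() >= -1e-8
               for s in permutations(v))  # some permutation lies in H(h)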

9. Open problems

Our work sparks a wide range of open problems, and we mention some of them here. For several of these problems we have presented proofs of special cases, while the general case remains open.

9.1. Hyperbolic Schur-Horn Theorem

In Section 4 we proved hyperbolic generalizations of the Fisher–Hadamard and Koteljanskii inequalities (Theorem 2.4 and Theorem 2.6). Here we present a potential hyperbolic generalization of another classical linear algebra result, the Schur–Horn theorem.

The Schur–Horn theorem already played a major role in the previous section on the spectral containment property, where we introduced a generalized version called the Schur–Horn property. Here we formulate a different generalization of the Schur–Horn theorem in terms of hyperbolic polynomials.

We will formulate our generalization in the language of majorization. Given polynomials p and q of the same degree, both hyperbolic with respect to the direction v, we say that p majorizes q in direction v if for all x\in\mathbb{R}^{n} the roots of p(x-tv) majorize the roots of q(x-tv). Recall that given \alpha,\beta\in\mathbb{R}^{k}, \alpha majorizes \beta if \sum_{i=1}^{k}\alpha_{i}=\sum_{i=1}^{k}\beta_{i} and the following holds: if \alpha^{\prime},\beta^{\prime} are obtained from \alpha,\beta by reordering coordinates so that \alpha^{\prime}_{1}\geq\dots\geq\alpha^{\prime}_{k} and \beta^{\prime}_{1}\geq\dots\geq\beta^{\prime}_{k}, then \sum_{i=1}^{m}\alpha^{\prime}_{i}\geq\sum_{i=1}^{m}\beta^{\prime}_{i} for each 1\leq m<k. Equivalently, \alpha majorizes \beta if and only if \beta\in\operatorname{conv}(\mathfrak{S}_{k}(\alpha)), where the symmetric group \mathfrak{S}_{k} acts on \alpha by permuting its coordinates.

In this language, we can restate the Schur direction of the Schur-Horn theorem as follows:

Lemma 9.1.

(Schur) det(X)\det(X) majorizes det(diag(X))\det(\operatorname*{diag}(X)) in the identity direction.

We conjecture that this holds for all homogeneous PSD-stable lpm-polynomials.

Conjecture 9.2.

Let PP be a homogeneous PSD-stable lpm-polynomial. Then P(X)P(X) majorizes P(diag(X))P(\operatorname*{diag}(X)) in the identity direction.
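The conjecture is easy to test numerically. A minimal Python sketch (numpy; helper names ours) checking it for the lpm polynomial E_{k}, where it follows from 9.4 below, on a random symmetric matrix:

import itertools
import numpy as np

def Ek(Y, k):
    # sum of all k x k principal minors of Y
    return sum(np.linalg.det(Y[np.ix_(S, S)])
               for S in itertools.combinations(range(Y.shape[0]), k))

def roots_shift(Y, k):
    # roots of t -> E_k(Y - tI), recovered by interpolation; sorted decreasingly
    n = Y.shape[0]
    ts = np.arange(k + 1, dtype=float)
    vals = [Ek(Y - t * np.eye(n), k) for t in ts]
    return np.sort(np.roots(np.polyfit(ts, vals, k)).real)[::-1]

def majorizes(a, b, tol=1e-8):
    return (abs(a.sum() - b.sum()) < tol and
            all(a[:m].sum() >= b[:m].sum() - tol for m in range(1, len(a))))

rng = np.random.default_rng(0)
n, k = 5, 3
A = rng.standard_normal((n, n)); X = (A + A.T) / 2
assert majorizes(roots_shift(X, k), roots_shift(np.diag(np.diag(X)), k))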

Recall that for 1\leq k\leq n we defined E_{k}(X)=\sum_{|S|=k}\det X_{S} to be the minor lift of the degree k elementary symmetric polynomial, i.e., the sum of all k\times k principal minors of X. We are able to prove this conjecture for rescalings of E_{k}. Our proof uses the following result from [4], which follows from Theorem 1 of that paper.

Proposition 9.3.

If p majorizes q in direction v, then D_{v}p majorizes D_{v}q in direction v, where D_{v} denotes the directional derivative in the direction v.

Now we are ready to state and prove our result.

Proposition 9.4.

Let DD be any positive diagonal matrix, and P(X)=Ek(D1/2XD1/2)P(X)=E_{k}(D^{-1/2}XD^{-1/2}). Then P(X)P(X) majorizes P(diag(X))P(\operatorname*{diag}(X)) in the identity direction.

Proof.

First notice that \det(X) majorizes \det(\operatorname{diag}(X)) in the D direction, i.e., the roots of \det(X-tD) majorize the roots of \det(\operatorname{diag}(X)-tD) for any X. This follows from applying Schur's original theorem to the symmetric matrix D^{-1/2}XD^{-1/2}, since we have \det(X-tD)=\det(D)\det(D^{-1/2}XD^{-1/2}-tI) and similarly \det(\operatorname{diag}(X)-tD)=\det(D)\det(D^{-1/2}\operatorname{diag}(X)D^{-1/2}-tI). Also notice that D^{-1/2}\operatorname{diag}(X)D^{-1/2}=\operatorname{diag}(D^{-1/2}XD^{-1/2}).

Now we apply 9.3 (n-k) times with p=\det(X), q=\det(\operatorname{diag}(X)) and v=D. This shows that p^{(k)}(X)=\sum_{|S|=k}\det(X_{S})\prod_{i\notin S}D_{ii} majorizes q^{(k)}(X)=\sum_{|S|=k}\det(\operatorname{diag}(X)_{S})\prod_{i\notin S}D_{ii} in the D direction; here p^{(k)} and q^{(k)} equal the iterated directional derivatives up to a positive constant factor, which does not affect majorization. Computing p^{(k)}(X-tD) we have

p^{(k)}(X-tD) =\sum_{|S|=k}\det(X_{S}-tD_{S})\prod_{i\notin S}D_{ii}
=\sum_{|S|=k}\det(D_{S})\det(D_{S}^{-1/2}X_{S}D_{S}^{-1/2}-tI_{S})\prod_{i\notin S}D_{ii}
=\det(D)\sum_{|S|=k}\det(D_{S}^{-1/2}X_{S}D_{S}^{-1/2}-tI_{S})
=\det(D)E_{k}(D^{-1/2}XD^{-1/2}-tI).

Similarly, q^{(k)}(X-tD)=\det(D)E_{k}(D^{-1/2}\operatorname{diag}(X)D^{-1/2}-tI). This shows that the roots of E_{k}(D^{-1/2}XD^{-1/2}-tI) majorize the roots of E_{k}(D^{-1/2}\operatorname{diag}(X)D^{-1/2}-tI). This completes the proof. ∎

We may also formulate a hyperbolic generalization of Horn's theorem, which we conjecture to be true but for which we do not have any partial results.

Conjecture 9.5.

Let P be any degree k lpm-polynomial. Let \lambda,\mu\in\mathbb{R}^{k} be such that \lambda majorizes \mu. Then there exists a symmetric matrix X such that the roots of P(X-tI) are given by \lambda and the roots of P(\operatorname{diag}(X)-tI) are given by \mu.

9.2. Spectral containment property and the Schur-Horn property

We showed that many polynomials have the spectral containment property. Based on these examples and additional computational evidence we conjecture the following:

Conjecture 9.6.

All homogeneous multiaffine stable polynomials have the spectral containment property.

There are several special cases of this conjecture that are of particular interest; we state them separately.

Conjecture 9.7.

All quadratic homogeneous multiaffine stable polynomials have the spectral containment property.

This case is of special interest because quadratic multiaffine polynomials have especially simple minor lifts. Namely, if

p(x)=ijaijxixj,p(x)=\sum_{i\neq j}a_{ij}x_{i}x_{j},

then

P(X)=p(diag(X))ijaijXij2.P(X)=p(\operatorname*{diag}(X))-\sum_{i\neq j}a_{ij}X_{ij}^{2}.

It is therefore plausible that this conjecture could be proved (or disproved) by exploiting this special structure.
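For instance, the displayed formula for the minor lift can be confirmed directly. A minimal Python sketch (numpy; names ours):

import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n = 4
a = rng.random((n, n)); a = (a + a.T) / 2; np.fill_diagonal(a, 0)  # a_ij = a_ji, a_ii = 0
A = rng.standard_normal((n, n)); X = (A + A.T) / 2

# minor lift: the monomial x_i x_j (total coefficient 2 a_ij) becomes det X_{i,j}
lift = sum(2 * a[i, j] * np.linalg.det(X[np.ix_([i, j], [i, j])])
           for i, j in combinations(range(n), 2))
d = np.diag(X)
off = [(i, j) for i in range(n) for j in range(n) if i != j]
closed = (sum(a[i, j] * d[i] * d[j] for i, j in off)          # p(diag(X))
          - sum(a[i, j] * X[i, j]**2 for i, j in off))
assert abs(lift - closed) < 1e-10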

Conjecture 9.8.

Let DD be a positive definite diagonal matrix, and let p(x)=ek(Dx)p(x)=e_{k}(Dx). Then p(x)p(x) has the spectral containment property.

Again, this is of special interest because of its relation to diagonal congruence, as we now explain.

Lemma 9.9.

Let p be a homogeneous, multiaffine stable polynomial, let D be a positive definite diagonal matrix, and let q=p(Dx). Then x\in H(q) if and only if Dx\in H(p), and X\in H(Q) if and only if D^{1/2}XD^{1/2}\in H(P).

Proof.

xH(q)x\in H(q) if and only if q(x+t1)0q(x+t\vec{1})\geq 0 for all t0t\geq 0. This is equivalent to the statement that p(D(x+t1))=p(Dx+tdiag(D))0p(D(x+t\vec{1}))=p(Dx+t\operatorname*{diag}(D))\geq 0 for all t0t\geq 0. Notice though that if DD is positive definite, then diag(D)\operatorname*{diag}(D) is in the interior of the hyperbolicity cone of pp. Therefore, p(Dx+tdiag(D))0p(Dx+t\operatorname*{diag}(D))\geq 0 for all t0t\geq 0 if and only if DxH(p)Dx\in H(p).

Similarly, if p(x)=S[n]aSiSxip(x)=\sum_{S\subseteq[n]}a_{S}\prod_{i\in S}x_{i}, we see that q(x)=S[n](iSDiiaS)iSxiq(x)=\sum_{S\subseteq[n]}(\prod_{i\in S}D_{ii}a_{S})\prod_{i\in S}x_{i}. Therefore,

Q(X)=S[n](iSDiiaS)det(X|S)=S[n]aSdet((D1/2XD1/2)|S)=P(D1/2XD1/2).Q(X)=\sum_{S\subseteq[n]}(\prod_{i\in S}D_{ii}a_{S})\det(X|_{S})=\sum_{S\subseteq[n]}a_{S}\det((D^{1/2}XD^{1/2})|_{S})=P(D^{1/2}XD^{1/2}).

We thus have that

Q(X+tI)=P(D1/2(X+tI)D1/2)=P(D1/2XD1/2+tD).Q(X+tI)=P(D^{1/2}(X+tI)D^{1/2})=P(D^{1/2}XD^{1/2}+tD).

Because DD is positive definite, it is in the interior of H(P)H(P), and therefore, P(D1/2XD1/2+tD)0P(D^{1/2}XD^{1/2}+tD)\geq 0 for all t0t\geq 0 if and only if D1/2XD1/2H(P)D^{1/2}XD^{1/2}\in H(P). This implies the result. ∎

From this we see that 9.8 is equivalent to the statement that for any X\in H(E_{k}) and any positive definite diagonal matrix D, there exists an eigenvalue vector \lambda of D^{1/2}XD^{1/2} such that D^{-1}\lambda\in H(e_{k}). This would give a very quantitative relationship between the eigenvalues of a symmetric matrix X and those of D^{1/2}XD^{1/2}, which are of fundamental interest in a number of situations.

The Schur-Horn property is another interesting property of a multiaffine polynomial. Once again, despite computer search, we have been unable to find an example of a multiaffine homogeneous polynomial that does not have the Schur-Horn property. This leads us to conjecture the following:

Conjecture 9.10.

All homogeneous multiaffine polynomials have the Schur-Horn property.

References

  • [1] Nima Anari, Shayan Oveis Gharan, Amin Saberi, and Mohit Singh. Nash social welfare, matrix permanent, and stable polynomials. In 8th Innovations in Theoretical Computer Science Conference, volume 67 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 36, 12. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2017.
  • [2] Julius Borcea and Petter Brändén. Applications of stable polynomials to mixed determinants: Johnson's conjectures, unimodality, and symmetrized Fischer products. Duke Mathematical Journal, 143(2):205–223, 2008.
  • [3] Julius Borcea and Petter Brändén. The Lee-Yang and Pólya-Schur programs. I. Linear operators preserving stability. Invent. Math., 177(3):541–569, 2009.
  • [4] Julius Borcea and Petter Brändén. Hyperbolicity preservers and majorization. Comptes Rendus Mathematique, 348(15):843–846, 2010.
  • [5] Julius Borcea and Petter Brändén. Multivariate Pólya-Schur classification problems in the Weyl algebra. Proc. Lond. Math. Soc. (3), 101(1):73–104, 2010.
  • [6] Julius Borcea, Petter Brändén, and Thomas M. Liggett. Negative dependence and the geometry of polynomials. Journal of the American Mathematical Society, 22(2):521–567, 2009.
  • [7] Petter Brändén. Polynomials with the half-plane property and matroid theory. Adv. Math., 216(1):302–320, 2007.
  • [8] Petter Brändén and June Huh. Lorentzian polynomials. Annals of Mathematics, 192(3):821–891, 2020.
  • [9] Young-Bin Choe, James G. Oxley, Alan D. Sokal, and David G. Wagner. Homogeneous multivariate polynomials with the half-plane property. volume 32, pages 88–187. 2004.
  • [10] Lars Gårding. Linear hyperbolic partial differential equations with constant coefficients. Acta Math., 85:1–62, 1951.
  • [11] Lars Gårding. An inequality for hyperbolic polynomials. Journal of Mathematics and Mechanics, 8(6):957–965, 1959.
  • [12] Daniel R. Grayson and Michael E. Stillman. Macaulay2, a software system for research in algebraic geometry. Available at https://faculty.math.illinois.edu/Macaulay2/.
  • [13] F. Reese Harvey and H. Blaine Lawson, Jr. Hyperbolic polynomials and the Dirichlet problem. arXiv preprint arXiv:0912.5220, 2009.
  • [14] Alfred Horn. Doubly stochastic matrices and the diagonal of a rotation matrix. Amer. J. Math., 76:620–630, 1954.
  • [15] Charles R. Johnson and Sivaram K. Narayan. The Koteljanskii inequalities for mixed matrices. Linear and Multilinear Algebra, 62(12):1583–1590, 2014.
  • [16] Thorsten Jörgens and Thorsten Theobald. Conic stability of polynomials. Res. Math. Sci., 5(2):Paper No. 26, 12, 2018.
  • [17] D. M. Koteljanskii. A property of sign-symmetric matrices. American Mathematical Society Translations, (27):19–23, 1963.
  • [18] Mario Kummer. Spectral linear matrix inequalities. Adv. Math., 384:Paper No. 107749, 36, 2021.
  • [19] Mario Kummer, Daniel Plaumann, and Cynthia Vinzant. Hyperbolic polynomials, interlacers, and sums of squares. Math. Program., 153(1, Ser. B):223–245, 2015.
  • [20] Pierre Lalonde. A non-commutative version of Jacobi’s equality on the cofactors of a matrix. Discrete Math., 158(1-3):161–172, 1996.
  • [21] Nam Q. Le. Hadamard-type inequalities for kk-positive matrices, December 2021.
  • [22] Marvin Marcus. Finite Dimensional Multilinear Algebra. Marcel Dekker Inc., 1973.
  • [23] Maxim Nazarov. Young’s symmetrizers for projective representations of the symmetric group. Adv. Math., 127(2):190–257, 1997.
  • [24] Robin Pemantle. Hyperbolicity and stable polynomials in combinatorics and probability. In Current developments in mathematics, 2011, pages 57–123. Int. Press, Somerville, MA, 2012.
  • [25] James Renegar. Hyperbolic programs, and their derivative relaxations. Foundations of Computational Mathematics, 6:59–79, 2006.
  • [26] Raman Sanyal and James Saunderson. Spectral polyhedra, January 2020.
  • [27] James Saunderson. Certifying polynomial nonnegativity via hyperbolic optimization. SIAM J. Appl. Algebra Geom., 3(4):661–690, 2019.
  • [28] Issai Schur. Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantentheorie. Sitzungsberichte der Berliner Mathematischen Gesellschaft, 22(9-20):51, 1923.
  • [29] David G. Wagner. Multivariate stable polynomials: theory and applications. Bull. Amer. Math. Soc. (N.S.), 48(1):53–84, 2011.