
On the complexity of matrix Putinar’s Positivstellensatz

Lei Huang, Department of Mathematics, University of California San Diego ([email protected])
Abstract

This paper studies the complexity of matrix Putinar’s Positivstellensatz on semialgebraic sets given by polynomial matrix inequalities. When the quadratic module generated by the constraining polynomial matrix is Archimedean, we prove a polynomial bound on the degrees of terms appearing in the representation of matrix Putinar’s Positivstellensatz. Estimates on the exponent and constant are given. As a byproduct, a polynomial bound on the convergence rate of matrix sum-of-squares relaxations is obtained, which resolves an open question raised by Dinh and Pham. When the constraining set is unbounded, we also prove a similar bound for the matrix version of Putinar–Vasilescu’s Positivstellensatz by exploiting homogenization techniques.

keywords:
Polynomial matrix, polynomial optimization, matrix Putinar’s Positivstellensatz, matrix sum-of-squares relaxations
{MSCcodes}

90C23, 65K05, 90C22, 90C26

1 Introduction

Optimization with polynomial matrix inequalities has broad applications [6, 12, 16, 17, 18, 25, 47, 48, 57, 58]. A fundamental question in solving such problems globally is how to efficiently characterize polynomial matrices that are positive semidefinite on the semialgebraic set described by a polynomial matrix inequality. To be more specific, given an \ell-by-\ell symmetric polynomial matrix F(x) in x:=(x_{1},\dots,x_{n}) and a semialgebraic set K, how can we certify that F\succeq 0 on K? Here, the set K is given as

(1.1) K={xnG(x)0},K=\{x\in\mathbb{R}^{n}\mid G(x)\succeq 0\},

where G(x)G(x) is an mm-by-mm symmetric polynomial matrix.
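As a concrete illustration (the matrix below is a hypothetical example, not one from the paper), membership in a set of the form K can be checked directly when G is 2-by-2, since a 2-by-2 symmetric matrix is positive semidefinite exactly when both diagonal entries and the determinant are nonnegative:

```python
def in_K(x1, x2):
    # Hypothetical example G(x) = [[1 - x1^2, x1*x2], [x1*x2, 1 - x2^2]];
    # here det G(x) = 1 - x1^2 - x2^2, so K is the closed unit disk.
    g11, g12, g22 = 1 - x1**2, x1 * x2, 1 - x2**2
    # 2x2 PSD test: nonnegative diagonal entries and nonnegative determinant
    return g11 >= 0 and g22 >= 0 and g11 * g22 - g12**2 >= 0
```

For instance, in_K(0.5, 0.5) holds while in_K(0.9, 0.9) fails, matching the unit-disk description of this particular K.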

When F(x)F(x) is a scalar polynomial (i.e., =1\ell=1) and G(x)G(x) is a diagonal matrix whose diagonal entries are polynomials g1(x)g_{1}(x), \dots, gm(x)g_{m}(x), KK can be equivalently described as

(1.2) K={xng1(x)0,,gm(x)0}.K=\{x\in\mathbb{R}^{n}\mid g_{1}(x)\geq 0,\dots,g_{m}(x)\geq 0\}.

Then, it reduces to the classical Positivstellensätz in real algebraic geometry and polynomial optimization. A polynomial pp is said to be a sum of squares (SOS) if p=p12++pt2p=p_{1}^{2}+\dots+p_{t}^{2} for polynomials p1,,ptp_{1},\dots,p_{t}. Schmüdgen [49] proved that if KK is compact and F>0F>0 on KK, then there exist SOS polynomials σα\sigma_{\alpha} such that

(1.3) F=α{0,1}mσαg1α1gmαm.F=\sum\limits_{\alpha\in\{0,1\}^{m}}\sigma_{\alpha}g_{1}^{\alpha_{1}}\cdots g_{m}^{\alpha_{m}}.

When the quadratic module of scalar polynomials QM[G]\mbox{QM}[G] generated by GG is Archimedean (a condition slightly stronger than compactness, refer to Section 2.2), Putinar [42] showed that if F>0F>0 on KK, then there exist SOS polynomials σ0,σ1,,σm\sigma_{0},\sigma_{1},\dots,\sigma_{m} such that

(1.4) F=\sigma_{0}+\sigma_{1}g_{1}+\cdots+\sigma_{m}g_{m},

which is referred to as Putinar’s Positivstellensatz in the literature.

The complexities of the above scalar Positivstellensätze have been studied extensively. Suppose that K(1,1)nK\subseteq(-1,1)^{n} and the degree of FF is dd. For the degree dd, denote

\mathbb{N}_{d}^{n}:=\{\alpha\in\mathbb{N}^{n}\mid\alpha_{1}+\cdots+\alpha_{n}\leq d\}.

For a polynomial p=\sum\limits_{\alpha\in\mathbb{N}_{d}^{n}}p_{\alpha}\frac{|\alpha|!}{\alpha_{1}!\cdots\alpha_{n}!}x^{\alpha}, denote the norm \|p\|=\max\{|p_{\alpha}|:\alpha\in\mathbb{N}_{d}^{n}\}. It was shown in [51] that if F\geq F_{\min}>0 on K, then for

kcd2(1+(d2ndFFmin)c),k\geq cd^{2}\left(1+\left(d^{2}n^{d}\frac{\|F\|}{F_{\min}}\right)^{c}\right),

there exist SOS polynomials σα\sigma_{\alpha} with deg(σαg1α1gmαm)k\deg(\sigma_{\alpha}g_{1}^{\alpha_{1}}\cdots g_{m}^{\alpha_{m}})\leq k such that (1.3) holds. Here, cc is a positive constant that depends on the constraining polynomials g1,,gmg_{1},\dots,g_{m}. Based on the above bound, Nie and Schweighofer [39] proved that if QM[G]\mbox{QM}[G] is Archimedean, there exists a constant c>0c>0, depending on g1,,gmg_{1},\dots,g_{m}, such that if

(1.5) kcexp((d2ndFFmin)c),k\geq c\exp\left(\left(d^{2}n^{d}\frac{\|F\|}{F_{\min}}\right)^{c}\right),

then (1.4) holds for some SOS polynomials \sigma_{0},\dots,\sigma_{m} with \deg(\sigma_{0})\leq k, \deg(\sigma_{i}g_{i})\leq k (i=1,\dots,m). Recently, Baldi and Mourrain improved the exponential bound (1.5) to a polynomial bound. It was shown in [2] that the same result holds if

(1.6) kγd3.5nL(FmaxFmin)2.5nL.k\geq\gamma d^{3.5nL}\left(\frac{\|F\|_{\max}}{F_{\min}}\right)^{2.5nL}.

In the above, \gamma is a positive constant depending on n and g_{1},\dots,g_{m}, L is the Łojasiewicz exponent associated with g_{1},\dots,g_{m}, and \|F\|_{\max} is the max norm of F on the cube [-1,1]^{n}. The bound (1.6) has been further improved in [3] by removing the dependency of the exponent on the dimension n and providing estimates on the constant \gamma. There exist better degree bounds when the set K has special structure; see [1, 4, 12, 53, 54, 55]. We also refer to [29] for the case where F,g_{1},\dots,g_{m} admit correlative sparsity.
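For concreteness, the norm \|p\| used in the bounds above can be computed directly from ordinary monomial coefficients: if p=\sum_{\alpha}c_{\alpha}x^{\alpha}, then p_{\alpha}=c_{\alpha}\,\alpha_{1}!\cdots\alpha_{n}!/|\alpha|!. A small sketch (the helper name is ours):

```python
from math import factorial, prod

def putinar_norm(coeffs):
    # coeffs maps exponent tuples alpha to ordinary monomial coefficients
    # c_alpha, where p = sum c_alpha x^alpha; the norm is the maximum of
    # |p_alpha| with p_alpha = c_alpha * alpha_1! ... alpha_n! / |alpha|!.
    return max(abs(c) * prod(factorial(a) for a in alpha) / factorial(sum(alpha))
               for alpha, c in coeffs.items())
```

For example, for p = x^2 + 4xy the scaled coefficient of xy is 4\cdot 1!\,1!/2! = 2, so \|p\| = 2.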

Based on Putinar’s Positivstellensatz, Lasserre [31] introduced the Moment-SOS hierarchy for solving polynomial optimization and proved its asymptotic convergence. The above degree bounds directly give estimates on the convergence rates of the Moment-SOS hierarchy. We also refer to [8, 20, 21, 23, 36, 37, 46] for the finite convergence of the Moment-SOS hierarchy under some regularity assumptions.

When F, G are general polynomial matrices, a matrix version of Putinar’s Positivstellensatz was provided in [28, 48]. Let \mathbb{R}[x]:=\mathbb{R}[x_{1},\ldots,x_{n}] be the ring of polynomials in x:=(x_{1},\ldots,x_{n}) with real coefficients. A polynomial matrix S(x) is SOS if S(x)=H(x)^{T}H(x) for some polynomial matrix H(x). The set of all \ell-by-\ell SOS polynomial matrices in x is denoted as \Sigma[x]^{\ell}. The quadratic module of \ell-by-\ell polynomial matrices generated by G(x) is the set

QM[G]:={S+i=1tPiTGPit,SΣ[x],Pi[x]m×}.\mbox{QM}[G]^{\ell}:=\left\{S+\sum_{i=1}^{t}P_{i}^{T}GP_{i}\mid t\in\mathbb{N},~{}S\in\Sigma[x]^{\ell},~{}P_{i}\in\mathbb{R}[x]^{m\times\ell}\right\}.

The set \mbox{QM}[G]^{\ell} is Archimedean if there exists a scalar r>0 such that (r-\|x\|_{2}^{2})\cdot I_{\ell}\in\mathrm{QM}[G]^{\ell}. When the quadratic module \mbox{QM}[G]^{\ell} is Archimedean, it was proved in [28, 48] that if F\succ 0 on K, then F\in\mbox{QM}[G]^{\ell}. For the case that F is a polynomial matrix and G is diagonal (i.e., K is as in (1.2)), the exponential bound (1.5) was generalized to the matrix case in [15]. There are specific estimates on degree bounds when K is a simple set defined by scalar polynomial inequalities; we refer to [12] for the sphere case and to [55] for the boolean hypercube case. When G is a general polynomial matrix, the complexity of matrix Putinar’s Positivstellensatz is open, to the best of the author’s knowledge. We also refer to [7, 27, 50] for other types of SOS representations for polynomial matrices.

Contributions

This paper studies the complexity of matrix Putinar’s Positivstellensatz. Throughout the paper, we assume that

K{xn1x220},K\subseteq\{x\in\mathbb{R}^{n}\mid 1-\|x\|_{2}^{2}\geq 0\},

unless otherwise specified. Denote the scaled simplex

(1.7) Δ¯:={xn1+x10,,1+xn0,nx1xn0}.\bar{\Delta}:=\left\{x\in\mathbb{R}^{n}\mid 1+x_{1}\geq 0,\ldots,1+x_{n}\geq 0,\sqrt{n}-x_{1}-\cdots-x_{n}\geq 0\right\}.

Note that K\subset\bar{\Delta} in our setting. The degree of the polynomial matrix G is denoted by d_{G}, i.e.,

(1.8) dG=max{deg(Gij),1i,jm}.d_{G}=\max\{\deg(G_{ij}),~{}1\leq i,j\leq m\}.

For a positive integer mm, denote

(1.9) θ(m)=i=1m(k=m+1imk(k+1)2).\theta(m)=\sum\limits_{i=1}^{m}\left(\prod_{k=m+1-i}^{m}\frac{k(k+1)}{2}\right).
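The quantity \theta(m) in (1.9) is a purely combinatorial count and can be evaluated directly (a small sketch; the function name is ours):

```python
from math import prod

def theta(m: int) -> int:
    # theta(m) = sum_{i=1}^{m} prod_{k=m+1-i}^{m} k(k+1)/2, as in (1.9)
    return sum(prod(k * (k + 1) // 2 for k in range(m + 1 - i, m + 1))
               for i in range(1, m + 1))
```

For example, \theta(2)=3+3=6 and \theta(3)=6+18+18=42, so the factor \theta(m)^{3} appearing below already grows quickly with the matrix size m.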

Let FB\|F\|_{B} be the Bernstein norm of FF on the scaled simplex Δ¯\bar{\Delta} (see Section 3 for details).

The following is our main result. It generalizes the recent results on the scalar Putinar’s Positivstellensatz in [2, 3] to the matrix case and provides the first degree bound for matrix Putinar’s Positivstellensatz.

Theorem 1.1.

Let K be as in (1.1) with m\geq 2. Suppose F(x) is an \ell-by-\ell symmetric polynomial matrix of degree d, and (1-\|x\|_{2}^{2})\cdot I_{\ell}\in\mbox{QM}[G]^{\ell}. If F(x)\succeq F_{\min}\cdot I_{\ell}\succ 0 on K, then for

(1.10) kC87η36(m1)θ(m)3n2dG6κ7d14η(FBFmin)7η+3,k\geq C\cdot{8}^{7\eta}\cdot 3^{6(m-1)}\theta(m)^{3}n^{2}d_{G}^{6}\kappa^{7}d^{14\eta}\left(\frac{\|F\|_{B}}{F_{\min}}\right)^{7\eta+3},

there exist an \ell-by-\ell SOS matrix S and polynomial matrices P_{i}\in\mathbb{R}[x]^{m\times\ell} (i=1,\dots,t) with

deg(S)k,deg(PiTGPi)k(i=1,,t),\deg(S)\leq k,~{}\deg(P_{i}^{T}GP_{i})\leq k~{}(i=1,\dots,t),

such that

(1.11) F=S+P1TGP1++PtTGPt.F=S+P_{1}^{T}GP_{1}+\cdots+P_{t}^{T}GP_{t}.

In the above, κ,η\kappa,\eta are respectively the Łojasiewicz constant and exponent in (5.3), and C>0C>0 is a constant independent of FF, GG. Furthermore, the Łojasiewicz exponent can be estimated as

η(3m1dG+1)(3mdG)n+θ(m)2.\eta\leq(3^{m-1}d_{G}+1)(3^{m}d_{G})^{n+\theta(m)-2}.
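To get a feel for the size of this estimate on the Łojasiewicz exponent, the bound can be evaluated numerically (a sketch; the function names are ours and the formulas are copied from (1.9) and the estimate above):

```python
from math import prod

def theta(m):
    # the combinatorial quantity (1.9)
    return sum(prod(k * (k + 1) // 2 for k in range(m + 1 - i, m + 1))
               for i in range(1, m + 1))

def eta_bound(n, m, dG):
    # eta <= (3^(m-1) dG + 1) * (3^m dG)^(n + theta(m) - 2)
    return (3**(m - 1) * dG + 1) * (3**m * dG)**(n + theta(m) - 2)
```

Already for n=1, m=2, d_{G}=1 this gives (3+1)\cdot 9^{5}=236196; so the bound, while polynomial in \|F\|_{B}/F_{\min} for fixed data, is very large in n and m.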

The positivity certificate (1.11) can be applied to construct matrix SOS relaxations for solving polynomial matrix optimization [16, 48]. As a byproduct of Theorem 1.1, we can obtain a polynomial bound on the convergence rate of matrix SOS relaxations (see Theorem 5.7), which resolves an open question raised in [10].

When F(x) is a scalar polynomial and G(x) is diagonal with diagonal entries g_{1}(x),\dots,g_{m}(x)\in\mathbb{R}[x], it was shown in [3] that if F\geq F_{\min}>0 on K, then F admits a representation as in (1.4) for

kC2ηn2mdG6d2ηκ7(FBFmin)7η+3,k\geq C\cdot 2^{\eta}\cdot n^{2}md_{G}^{6}d^{2\eta}\kappa^{7}\left(\frac{\|F\|_{B}}{F_{\min}}\right)^{7\eta+3},

where \kappa,\eta are respectively the Łojasiewicz constant and exponent, and C>0 is a constant independent of F, G. Comparing this with the bound in Theorem 1.1, one important additional factor is 3^{6(m-1)}\theta(m)^{3}. It comes from estimates on the number and degrees of the scalar polynomials that can be used to describe the polynomial matrix inequality G(x)\succeq 0; see Section 4 for details. The other additional factors, such as 8^{7\eta} and d^{14\eta}, are artifacts of the proof technique. They primarily arise from the degree estimates of h_{i}g_{i} in Theorem 3.5. Since we want the constant C to be independent of F and G, we retain all these factors in Theorem 1.1.

Remark. The assumption that (1-\|x\|_{2}^{2})\cdot I_{\ell}\in\mbox{QM}[G]^{\ell} is not restrictive. When the quadratic module \mbox{QM}[G]^{\ell} is Archimedean, i.e., (r-\|x\|_{2}^{2})\cdot I_{\ell}\in\mathrm{QM}[G]^{\ell} for some r>0, we have that (1-\|x\|_{2}^{2})\cdot I_{\ell}\in\mathrm{QM}[G(\sqrt{r}x)]^{\ell}. Theorem 1.1 remains true with F, G replaced by F(\sqrt{r}x), G(\sqrt{r}x).

When K is unbounded, the Putinar-type SOS representation may not hold. For the scalar case (i.e., F\in\mathbb{R}[x] and K is as in (1.2)), it was shown in [43, 44] that if the highest degree homogeneous term of F is positive definite on \mathbb{R}^{n} and F>0 on K, then there exists a degree k such that (1+\|x\|_{2}^{2})^{k}F has a representation (1.4), which is referred to as Putinar–Vasilescu’s Positivstellensatz. A polynomial degree bound for this type of representation was proved by Mai and Magron [35]. Recently, a matrix version of Putinar–Vasilescu’s Positivstellensatz was established in [11].

In the following, we state a bound similar to that in Theorem 1.1 for matrix Putinar–Vasilescu’s Positivstellensatz. Let F^{hom} denote the highest degree homogeneous term of F. The homogenization of F is denoted by \widetilde{F}, i.e., \widetilde{F}(\tilde{x}):=x_{0}^{\deg(F)}F(x/x_{0}) for \tilde{x}:=(x_{0},x). The notation \lambda_{\min}(A) denotes the smallest eigenvalue of the matrix A.
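The homogenization \widetilde{F}(\tilde{x})=x_{0}^{\deg(F)}F(x/x_{0}) acts coefficient-wise: every monomial c\,x^{\alpha} becomes c\,x_{0}^{d-|\alpha|}x^{\alpha}. A small sketch (the dictionary representation and function name are ours):

```python
def homogenize(coeffs):
    # F~(x0, x) = x0^d * F(x / x0) with d = deg(F): pad every exponent
    # tuple alpha with the power d - |alpha| of the new variable x0 in front.
    d = max(sum(alpha) for alpha in coeffs)
    return {(d - sum(alpha),) + alpha: c for alpha, c in coeffs.items()}

# F(x) = x^2 - 2x + 3  ->  F~(x0, x) = x^2 - 2*x0*x + 3*x0^2
F = {(2,): 1, (1,): -2, (0,): 3}
```

Setting x_{0}=1 recovers F, while the terms with x_{0}=0 give exactly F^{hom}.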

Theorem 1.2.

Let KK be as in (1.1) with m2m\geq 2. Suppose that F(x)F(x) is an \ell-by-\ell symmetric polynomial matrix of degree dd. If F(x)0F(x)\succ 0 on KK and Fhom0F^{hom}\succ 0 on n\mathbb{R}^{n}, then for

kC87η36(m1)(θ(m)+2)3(n+1)2(dG/2)6κ7d14η(F~BF~min)7η+3,k\geq C\cdot{8}^{7\eta}\cdot 3^{6(m-1)}(\theta(m)+2)^{3}(n+1)^{2}(\lceil d_{G}/2\rceil)^{6}\kappa^{7}d^{14\eta}\left(\frac{\|\widetilde{F}\|_{B}}{\widetilde{F}_{\min}}\right)^{7\eta+3},

there exist an \ell-by-\ell SOS matrix S and polynomial matrices P_{i}\in\mathbb{R}[x]^{m\times\ell} (i=1,\dots,t) with

deg(S)2k+d,deg(PiTGPi)2k+d(i=1,,t),\deg(S)\leq 2k+d,~{}\deg(P_{i}^{T}GP_{i})\leq 2k+d~{}(i=1,\dots,t),

such that

(1.12) (1+x22)kF=S+P1TGP1++PtTGPt.(1+\|x\|_{2}^{2})^{k}F=S+P_{1}^{T}GP_{1}+\cdots+P_{t}^{T}GP_{t}.

In the above, κ\kappa, η\eta are respectively the Łojasiewicz constant and exponent in (5.8), C>0C>0 is a constant independent of FF, GG, and F~min\widetilde{F}_{\min} is the optimal value of the optimization

(1.13) {minλmin(F~(x~))s.t.(x0)2dG/2dGG~(x~)0,x02+xTx=1.\left\{\begin{array}[]{rl}\min&\lambda_{\min}(\widetilde{F}(\tilde{x}))\\ \mathit{s.t.}&(x_{0})^{2\lceil d_{G}/2\rceil-d_{G}}\widetilde{G}(\tilde{x})\succeq 0,\\ &x_{0}^{2}+x^{T}x=1.\\ \end{array}\right.

Furthermore, the Łojasiewicz exponent can be estimated as

η(23m1dG/2+1)(23mdG/2)n+θ(m)1.\eta\leq(2\cdot 3^{m-1}\lceil d_{G}/2\rceil+1)(2\cdot 3^{m}\lceil d_{G}/2\rceil)^{n+\theta(m)-1}.

For unbounded optimization, the positivity of F^{hom} is also crucial for obtaining SOS-type representations because it reflects the behavior of F at infinity (e.g., see [22]). The quantity \widetilde{F}_{\min} is the optimal value of the homogenized optimization (1.13). It is an important quantity for evaluating the positivity of F on the unbounded set K, as it reflects not only the behavior of F on K but also its behavior at infinity. Suppose that F(x)\succeq F_{\min}\cdot I_{\ell}\succ 0 on K, F^{hom}\succeq F_{\min}^{\prime}\cdot I_{\ell}\succ 0 on \mathbb{R}^{n}, and \tilde{x}=(x_{0},x)\in\mathbb{R}^{n+1} is a feasible point of (1.13). One can see that x/x_{0}\in K and \widetilde{F}(\tilde{x})=x_{0}^{d}F(x/x_{0})\succeq x_{0}^{d}F_{\min}\cdot I_{\ell} for all feasible \tilde{x} with x_{0}\neq 0, and \widetilde{F}(\tilde{x})\succeq F^{\prime}_{\min}\cdot I_{\ell} for all feasible \tilde{x} with x_{0}=0. However, it is impossible to bound \widetilde{F}_{\min} from below in terms of F_{\min} and F_{\min}^{\prime}, since \widetilde{F}_{\min} can be arbitrarily close to 0. For instance, consider the scalar polynomial F(x)=(x-\epsilon)^{2}+1 and K=\mathbb{R}. Note that for x_{0}^{2}+x^{2}=1, we have \widetilde{F}(x_{0},x)=(x-\epsilon x_{0})^{2}+x_{0}^{2}=1+\epsilon^{2}x_{0}^{2}-2\epsilon x_{0}x. Let x_{0}=\sin\theta, x=\cos\theta. Then, it follows that

F~(x0,x)=1+ϵ2sin2θ2ϵsinθcosθ=1+ϵ22(1cos(2θ))ϵsin(2θ)=1+ϵ22ϵ2+ϵ44sin(2θ+ϕ),\begin{array}[]{ll}\widetilde{F}(x_{0},x)=1+\epsilon^{2}\cdot\sin^{2}\theta-2\cdot\epsilon\cdot\sin\theta\cos\theta&=1+\frac{\epsilon^{2}}{2}\cdot(1-\cos(2\theta))-\epsilon\cdot\sin(2\theta)\\ &=1+\frac{\epsilon^{2}}{2}-\sqrt{\epsilon^{2}+\frac{\epsilon^{4}}{4}}\sin(2\theta+\phi),\end{array}

for some ϕ[0,2π]\phi\in[0,2\pi]. This implies that F~min=1+ϵ22ϵ2+ϵ44\widetilde{F}_{\min}=1+\frac{\epsilon^{2}}{2}-\sqrt{\epsilon^{2}+\frac{\epsilon^{4}}{4}}. One can verify that 1+ϵ22ϵ2+ϵ4401+\frac{\epsilon^{2}}{2}-\sqrt{\epsilon^{2}+\frac{\epsilon^{4}}{4}}\rightarrow 0, as ϵ\epsilon\rightarrow\infty, while Fmin=Fmin=1F_{\min}=F_{\min}^{\prime}=1 for arbitrary ϵ\epsilon.
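The closed form above is easy to confirm numerically: a brute-force minimization of \widetilde{F} over the unit circle matches 1+\frac{\epsilon^{2}}{2}-\sqrt{\epsilon^{2}+\frac{\epsilon^{4}}{4}} (a sketch; the function names are ours):

```python
import math

def ftilde_min_numeric(eps, steps=100000):
    # brute-force minimum of F~(sin t, cos t) = 1 + eps^2 sin^2 t - 2 eps sin t cos t
    return min(1 + (eps * math.sin(t))**2 - 2 * eps * math.sin(t) * math.cos(t)
               for t in (2 * math.pi * i / steps for i in range(steps)))

def ftilde_min_closed(eps):
    # the closed form derived above
    return 1 + eps**2 / 2 - math.sqrt(eps**2 + eps**4 / 4)
```

For instance, ftilde_min_closed(1000.0) is about 1e-6, illustrating how \widetilde{F}_{\min} collapses to 0 as \epsilon grows even though F_{\min}=F_{\min}^{\prime}=1.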

The following result is a direct corollary of Theorem 1.2. Its proof is deferred to Section 5.2.

Corollary 1.3.

Let KK be as in (1.1) with m2m\geq 2. Suppose that F(x)F(x) is an \ell-by-\ell symmetric polynomial matrix with deg(F)=d\deg(F)=d. If F0F\succeq 0 on KK, then for any ε>0\varepsilon>0, we have

(1+x22)k(F+ε(1+x22)d+12I)QM[G]k,(1+\|x\|_{2}^{2})^{k}(F+\varepsilon\cdot(1+\|x\|_{2}^{2})^{\lceil\frac{d+1}{2}\rceil}I_{\ell})\in\mbox{QM}[G]_{k}^{\ell},

provided that kCε7η3.k\geq C\cdot\varepsilon^{-7\eta-3}. Here, η\eta is the Łojasiewicz exponent in (5.8), and C>0C>0 is a constant that depends on FF, GG.

Outline of the proofs

The proof of Theorem 1.1 involves three major steps:

Step 1. When F(x) is an \ell-by-\ell symmetric polynomial matrix and K is described by finitely many scalar polynomial inequalities (i.e., K is as in (1.2)), we establish a polynomial bound on the degrees of terms appearing in matrix Putinar’s Positivstellensatz (see Theorem 3.5). It improves the exponential bound given by Helton and Nie [15]. The proof closely follows the ideas and strategies of [2, 3], adapting them to the matrix setting. This proof is provided in Section 3 and is completed in three steps:

Step 1a. We perturb FF into a polynomial matrix PP such that PP is positive definite on the scaled simplex Δ¯\bar{\Delta}. To be more specific, we prove that there exist λ>0\lambda>0 and hiΣ[x]h_{i}\in\Sigma[x] (i=1,,m)(i=1,\dots,m) with controlled degrees such that

P(x):=F(x)λi=1mhi(x)gi(x)I14FminI,xΔ¯.P(x):=F(x)-\lambda\sum_{i=1}^{m}h_{i}(x)g_{i}(x)I_{\ell}\succeq\frac{1}{4}F_{\min}\cdot I_{\ell},~{}~{}\forall~{}x\in\bar{\Delta}.

This step uses a result by Baldi-Mourrain-Parusinski [3, Proposition 3.2] about the approximation of a plateau by SOS polynomials, and some estimates on the matrix Bernstein norm (see Lemma 3.3).

Step 1b. Using the dehomogenized version of matrix Polya’s Positivstellensatz in terms of the Bernstein basis (see Lemma 3.1), we show that there exist matrices P_{\alpha}^{(k)}\succ 0 such that

P=αknPα(k)Bαk(x),P=\sum_{\alpha\in\mathbb{N}^{n}_{k}}P_{\alpha}^{(k)}B_{\alpha}^{k}(x),

and provide an estimate on kk.

Step 1c. Note that the factors of B_{\alpha}^{k}(x) satisfy 1+x_{1},\dots,1+x_{n},\sqrt{n}-x_{1}-\cdots-x_{n}\in\mbox{QM}[1-\|x\|_{2}^{2}]_{2}. It follows that P\in\mbox{QM}[1-\|x\|_{2}^{2}]^{\ell}. Consequently, a polynomial bound on the degrees of terms appearing in matrix Putinar’s Positivstellensatz can be derived from the assumption that (1-\|x\|_{2}^{2})\cdot I_{\ell}\in\mbox{QM}[G]^{\ell}.

Step 2. When K is described by a polynomial matrix inequality (i.e., K is as in (1.1)), we show that K can be equivalently described by finitely many scalar polynomial inequalities, with explicit estimates on their number and degrees, by exploiting a proposition due to Schmüdgen [50]. Additionally, these polynomials belong to the quadratic module generated by G(x) in \mathbb{R}[x]. This is discussed in Section 4.

Step 3. Finally, we prove Theorem 1.1 by applying the main result from Step 1 (i.e., Theorem 3.5) to FF and the equivalent scalar polynomial inequalities for KK obtained in Step 2. We refer to Section 5 for details.

The proof of Theorem 1.2 relies on Theorem 1.1 and homogenization techniques.

The remainder of the paper is organized as follows. Section 2 presents some background. Section 3 establishes the degree bound for the case of scalar polynomial inequalities, and Section 4 shows how to describe the polynomial matrix inequality by finitely many scalar polynomial inequalities. Section 5 provides the proofs of Theorems 1.1 and 1.2 and studies the convergence rate of matrix SOS relaxations. Conclusions and discussions are given in Section 6.

2 Preliminaries

2.1 Notations

The symbol \mathbb{N} (resp., \mathbb{R}) denotes the set of nonnegative integers (resp., real numbers). Let \mathbb{R}[x]:=\mathbb{R}[x_{1},\ldots,x_{n}] be the ring of polynomials in x:=(x_{1},\ldots,x_{n}) with real coefficients, and let \mathbb{R}[x]_{d} be the set of polynomials in \mathbb{R}[x] with degrees at most d. For \alpha=(\alpha_{1},\ldots,\alpha_{n})\in\mathbb{N}^{n} and a degree d, denote |\alpha|\coloneqq\alpha_{1}+\cdots+\alpha_{n} and \mathbb{N}_{d}^{n}:=\{\alpha\in\mathbb{N}^{n}\mid|\alpha|\leq d\}. For a polynomial p, the symbol \deg(p) denotes its total degree. For a polynomial matrix P=(P_{ij}(x))\in\mathbb{R}[x]^{\ell_{1}\times\ell_{2}}, its degree is denoted by \deg(P), i.e.,

deg(P):=max{deg(Pij(x)),1i1,1j2}.\deg(P):=\max\{\deg(P_{ij}(x)),~{}1\leq i\leq\ell_{1},1\leq j\leq\ell_{2}\}.

For a real number t, \lceil t\rceil denotes the smallest integer greater than or equal to t. For a matrix A, A^{T} denotes its transpose and \|A\|_{2} denotes its spectral norm. We write X\succeq 0 (resp., X\succ 0) if the symmetric matrix X is positive semidefinite (resp., positive definite). For a smooth function p in x, its gradient is denoted by \nabla p.

2.2 Some basics in real algebraic geometry

In this subsection, we review some basics in real algebraic geometry. We refer to [7, 16, 27, 28, 33, 38, 48, 50] for more details.

A polynomial matrix SS is said to be a sum of squares (SOS) if S=PTPS=P^{T}P for some polynomial matrix P(x)P(x). The set of all \ell-by-\ell SOS polynomial matrices in xx is denoted as Σ[x]\Sigma[x]^{\ell}. For a degree kk, denote the truncation

Σ[x]k{i=1tPiTPit,Pi[x]×,2deg(Pi)k}.\Sigma[x]_{k}^{\ell}\,\coloneqq\,\left\{\sum_{i=1}^{t}P_{i}^{T}P_{i}\mid t\in\mathbb{N},~{}P_{i}\in\mathbb{R}[x]^{\ell\times\ell},~{}2\deg(P_{i})\leq k\right\}.

For an mm-by-mm symmetric polynomial matrix GG, the quadratic module of \ell-by-\ell polynomial matrices generated by G(x)G(x) in [x]×\mathbb{R}[x]^{\ell\times\ell} is

QM[G]={S+i=1tPiTGPit,SΣ[x],Pi[x]m×}.\mbox{QM}[G]^{\ell}=\left\{S+\sum_{i=1}^{t}P_{i}^{T}GP_{i}\mid t\in\mathbb{N},~{}S\in\Sigma[x]^{\ell},~{}P_{i}\in\mathbb{R}[x]^{m\times\ell}\right\}.

The kkth degree truncation of QM[G]\mbox{QM}[G]^{\ell} is

(2.1) QM[G]k={S+i=1tPiTGPit,SΣ[x]k,Pi[x]k/2deg(G)/2m×}.\mbox{QM}[G]_{k}^{\ell}=\left\{S+\sum_{i=1}^{t}P_{i}^{T}GP_{i}\mid t\in\mathbb{N},~{}S\in\Sigma[x]^{\ell}_{k},~{}P_{i}\in\mathbb{R}[x]_{\lceil k/2-\deg(G)/2\rceil}^{m\times\ell}\right\}.

Note that Σ[x]QM[G]\Sigma[x]^{\ell}\subseteq\mbox{QM}[G]^{\ell}. For the case =1\ell=1, we simply denote

\Sigma[x]_{k}:=\Sigma[x]_{k}^{1},\quad\mbox{QM}[G]:=\mbox{QM}[G]^{1},\quad\mbox{QM}[G]_{k}:=\mbox{QM}[G]_{k}^{1}.

The quadratic module QM[G]\mathrm{QM}[G]^{\ell} is said to be Archimedean if there exists r>0r>0 such that (rx22)IQM[G].\left(r-\|x\|_{2}^{2}\right)\cdot I_{\ell}\in\mathrm{QM}[G]^{\ell}. The polynomial matrix GG determines the semialgebraic set

K:={xnG(x)0}.K:=\left\{x\in\mathbb{R}^{n}\mid G(x)\succeq 0\right\}.

If \mbox{QM}[G]^{\ell} is Archimedean, then the set K must be compact. However, when K is compact, \mbox{QM}[G]^{\ell} is not necessarily Archimedean (see [26]). The following is known as the matrix version of Putinar’s Positivstellensatz.

Theorem 2.1 ([28, 48]).

Suppose that GG is an mm-by-mm symmetric polynomial matrix and QM[G]\mbox{QM}[G]^{\ell} is Archimedean. If the \ell-by-\ell symmetric polynomial matrix F0F\succ 0 on KK, then we have FQM[G]F\in\mbox{QM}[G]^{\ell}.

2.3 Some technical propositions

In this subsection, we present several useful propositions that are used in our proofs. The following is the classical Łojasiewicz inequality.

Corollary 2.2 ([5, 34]).

Let A be a closed and bounded semialgebraic set, and let f and g be two continuous semialgebraic functions from A to \mathbb{R} such that f^{-1}(0)\subseteq g^{-1}(0). Then there exist two positive constants \eta, \kappa such that:

|g(x)|ηκ|f(x)|,xA.|g(x)|^{\eta}\leq\kappa|f(x)|,~{}\forall x\in A.

The smallest constant η\eta is called the Łojasiewicz exponent and κ\kappa is the corresponding Łojasiewicz constant.

The Markov inequality below can be used to estimate gradients of polynomials.

Theorem 2.3 ([3, 30]).

Let Δ¯\bar{\Delta} be the scaled simplex as in (1.7) and let p[x]p\in\mathbb{R}[x] be a polynomial with degree d\leq d. Then:

maxxΔ¯p(x)22d(2d1)n+1maxxΔ¯|p(x)|.\max_{x\in\bar{\Delta}}\|\nabla p(x)\|_{2}\leq\frac{2d(2d-1)}{\sqrt{n}+1}\max_{x\in\bar{\Delta}}|p(x)|.

The following theorem is a slight variation of [14, Theorem 3.3], which gives explicit Łojasiewicz exponents for basic closed semialgebraic sets.

Theorem 2.4 ([14]).

Suppose that g1,,gs,h1,,hl[x]dg_{1},\dots,g_{s},h_{1},\dots,h_{l}\in\mathbb{R}[x]_{d} and let

S:={xnh1(x)==hl(x)=0,g1(x)0,,gs(x)0}.S:=\{x\in\mathbb{R}^{n}\mid h_{1}(x)=\cdots=h_{l}(x)=0,g_{1}(x)\geq 0,\ldots,g_{s}(x)\geq 0\}\neq\emptyset.

Then for any compact set KnK\subset\mathbb{R}^{n}, there is a constant c>0c>0 such that for all xKx\in K,

d(x,S)(n+l+s1,d+1)cmin{g1(x),,gs(x),h1(x),h1(x),,hl(x),hl(x),0},d(x,S)^{\mathcal{R}(n+l+s-1,d+1)}\leq-c\cdot\min\{g_{1}(x),\dots,g_{s}(x),h_{1}(x),-h_{1}(x),\dots,h_{l}(x),-h_{l}(x),0\},

where d(x,S)d(x,S) is the Euclidean distance to the set SS and (n,d)=d(3d3)n1\mathcal{R}(n,d)=d\cdot(3d-3)^{n-1}.

3 Degree bounds for scalar polynomial inequalities

In this section, we consider the case where G(x)G(x) is a diagonal matrix with diagonal entries g1,,gm[x]g_{1},\dots,g_{m}\in\mathbb{R}[x]. Then, the set KK as in (1.1) can be equivalently described as

(3.1) K=\{x\in\mathbb{R}^{n}\mid g_{1}(x)\geq 0,\dots,g_{m}(x)\geq 0\}.

If the symmetric polynomial matrix F(x)\succ 0 on K, we prove a polynomial bound on the degrees of terms appearing in matrix Putinar’s Positivstellensatz. This improves the exponential bound shown by Helton and Nie [15]. The proof essentially generalizes the ideas and strategies used in [2, 3] to the matrix case. We refer to Step 1 in the section ‘Outline of the proofs’ for an overview.

Let Δn\Delta^{n} be the nn-dimensional standard simplex, i.e.,

Δn={xnx1++xn=1,x10,,xn0}.\Delta^{n}=\{x\in\mathbb{R}^{n}\mid x_{1}+\dots+x_{n}=1,x_{1}\geq 0,\dots,x_{n}\geq 0\}.

Denote the scaled simplex by

Δ¯={xnnx1xn0,1+x10,,1+xn0}.\bar{\Delta}=\left\{x\in\mathbb{R}^{n}\mid\sqrt{n}-x_{1}-\cdots-x_{n}\geq 0,1+x_{1}\geq 0,\ldots,1+x_{n}\geq 0\right\}.

For a degree tt, let {Bα(t)(x):αtn}\{B_{\alpha}^{(t)}(x):\alpha\in\mathbb{N}^{n}_{t}\} be the Bernstein basis of degree tt on Δ¯\bar{\Delta}, i.e.,

(3.2) B_{\alpha}^{(t)}(x)=\frac{t!}{\alpha_{1}!\cdots\alpha_{n}!\,(t-|\alpha|)!}(n+\sqrt{n})^{-t}\left(\sqrt{n}-x_{1}-\cdots-x_{n}\right)^{t-|\alpha|}\left(1+x_{1}\right)^{\alpha_{1}}\cdots\left(1+x_{n}\right)^{\alpha_{n}}.
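With the standard multinomial normalization, the Bernstein basis of degree t forms a partition of unity: the polynomial identity \sum_{\alpha\in\mathbb{N}_{t}^{n}}B_{\alpha}^{(t)}(x)=1 holds for all x, which is what makes a bound like \max_{x\in\bar{\Delta}}|f(x)|\leq\|f\|_{B} in (3.3) possible. A small numerical sketch (function names ours):

```python
import math
from itertools import product

def bernstein(alpha, t, x):
    # B_alpha^(t)(x) on the scaled simplex, with multinomial coefficient
    # t! / (alpha_1! ... alpha_n! (t - |alpha|)!)
    n, a = len(x), sum(alpha)
    coef = math.factorial(t) // math.factorial(t - a)
    for ai in alpha:
        coef //= math.factorial(ai)
    val = coef * (n + math.sqrt(n))**(-t) * (math.sqrt(n) - sum(x))**(t - a)
    for ai, xi in zip(alpha, x):
        val *= (1 + xi)**ai
    return val

def partition_of_unity(t, x):
    # sum of all degree-t Bernstein basis polynomials at the point x
    n = len(x)
    return sum(bernstein(a, t, x)
               for a in product(range(t + 1), repeat=n) if sum(a) <= t)
```

The sum telescopes by the multinomial theorem, since the factors add up to (\sqrt{n}-\sum x_{i})+\sum(1+x_{i})=n+\sqrt{n}.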

For a degree tdt\geq d, the ×\ell\times\ell symmetric polynomial matrix FF of degree dd can be expressed in the Bernstein basis {Bα(t)(x):αtn}\{B_{\alpha}^{(t)}(x):\alpha\in\mathbb{N}^{n}_{t}\} as

F=αtnFα(t)Bα(t)(x),F=\sum_{\alpha\in\mathbb{N}_{t}^{n}}F_{\alpha}^{(t)}B_{\alpha}^{(t)}(x),

where Fα(t)F_{\alpha}^{(t)} are \ell-by-\ell symmetric matrices. The Bernstein norm of FF with respect to {Bα(t)(x):αtn}\{B_{\alpha}^{(t)}(x):\alpha\in\mathbb{N}^{n}_{t}\} is defined as

FB,t=max{Fα(t)2:αtn}.\|F\|_{B,t}=\max\{\|F_{\alpha}^{(t)}\|_{2}:~{}\alpha\in\mathbb{N}^{n}_{t}\}.

For convenience, we denote FB,d\|F\|_{B,d} simply as FB\|F\|_{B}. Similarly, FF can also be expressed in the monomial basis {xα:αdn}\{x^{\alpha}:\alpha\in\mathbb{N}^{n}_{d}\} as

F=αdnFαd!α1!αn!xα.F=\sum_{\alpha\in\mathbb{N}_{d}^{n}}F_{\alpha}\frac{d!}{\alpha_{1}!\cdots\alpha_{n}!}x^{\alpha}.

The spectral norm of FF is defined as F2=max{Fα2:αdn}.\|F\|_{2}=\max\{\|F_{\alpha}\|_{2}:~{}\alpha\in\mathbb{N}^{n}_{d}\}. For f(x)[x]d1f(x)\in\mathbb{R}[x]_{d_{1}}, g(x)[x]d2g(x)\in\mathbb{R}[x]_{d_{2}} and 12d1\ell_{1}\geq\ell_{2}\geq d_{1}, the following is a basic property of the Bernstein norm (see [13], [3, Lemma 1.5]):

(3.3) maxsΔ¯|f(s)|fB,1fB,2,fgB,d1+d2fB,d1gB,d2.\max\limits_{s\in\bar{\Delta}}|f(s)|\leq\|f\|_{B,\ell_{1}}\leq\|f\|_{B,\ell_{2}},~{}~{}\|fg\|_{B,d_{1}+d_{2}}\leq\|f\|_{B,d_{1}}\|g\|_{B,d_{2}}.

Suppose that p(x) is a homogeneous polynomial with \deg(p)=d. Polya’s Positivstellensatz [40, 41] states that if p(x)\geq p_{\min}>0 on the standard simplex \Delta^{n}, then the polynomial (x_{1}+\cdots+x_{n})^{k}p(x) has only positive coefficients if k>d(d-1)\frac{\|p\|}{2p_{\min}}-d. This result has been generalized to the matrix case by Scherer and Hol [48]. It was shown in [48, Theorem 3] that if F(x) is an \ell-by-\ell homogeneous symmetric polynomial matrix with \deg(F)=d and F(x)\succeq F_{\min}\cdot I_{\ell}\succ 0 on \Delta^{n}, then for k>d(d-1)\frac{\|F\|_{2}}{2F_{\min}}-d, all coefficient matrices of (x_{1}+\cdots+x_{n})^{k}F(x) in the monomial basis are positive definite.
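Polya’s result is easy to observe numerically. For the hypothetical example p=x^{2}-xy+y^{2} (with p_{\min}=1/4 on \Delta^{2} and \|p\|=1), the bound gives k>2\cdot 1\cdot 1/(2\cdot\frac{1}{4})-2=2; a sketch with exact integer arithmetic (representation and names ours):

```python
def polymul(p, q):
    # multiply polynomials stored as {exponent_tuple: coefficient}
    r = {}
    for a, ca in p.items():
        for b, cb in q.items():
            key = tuple(ai + bi for ai, bi in zip(a, b))
            r[key] = r.get(key, 0) + ca * cb
    return r

p = {(2, 0): 1, (1, 1): -1, (0, 2): 1}   # x^2 - x*y + y^2, p_min = 1/4 on the simplex
s = {(1, 0): 1, (0, 1): 1}               # x + y

def multiplied(k):
    # coefficients of (x + y)^k * p(x, y)
    r = p
    for _ in range(k):
        r = polymul(r, s)
    return r
```

Here multiplied(2) still contains a zero coefficient (at x^{2}y^{2}), while multiplied(3) has all coefficients positive, matching the threshold k>2.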

In the following, we present a dehomogenized version of matrix Polya’s Positivstellensatz in terms of the Bernstein basis.

Lemma 3.1.

Suppose F(x) is an \ell-by-\ell symmetric polynomial matrix with \deg(F)=d. If F(x)\succeq F_{\min}\cdot I_{\ell}\succ 0 on \bar{\Delta}, then for

k>d(d1)FB2Fmind,k>d(d-1)\frac{\|F\|_{B}}{2F_{\min}}-d,

all coefficient matrices of F(x)F(x) in the Bernstein basis {Bα(k+d)(x):αk+dn}\{B_{\alpha}^{(k+d)}(x):\alpha\in\mathbb{N}^{n}_{k+d}\} are positive definite.

Proof 3.2.

Suppose that FF can be expressed in the Bernstein basis {Bα(d)(x):αdn}\{B_{\alpha}^{(d)}(x):\alpha\in\mathbb{N}^{n}_{d}\} as

F=αdnFα(d)Bα(d)(x),F=\sum_{\alpha\in\mathbb{N}^{n}_{d}}F_{\alpha}^{(d)}B_{\alpha}^{(d)}(x),

where Fα(d)F_{\alpha}^{(d)} are \ell-by-\ell symmetric matrices. Denote

\widehat{F}(y_{1},\dots,y_{n+1})=\sum_{\alpha\in\mathbb{N}^{n}_{d}}F_{\alpha}^{(d)}\frac{d!}{\alpha_{1}!\cdots\alpha_{n}!\,(d-|\alpha|)!}y^{\alpha}y_{n+1}^{d-|\alpha|}.

For each (y1,,yn+1)Δn+1(y_{1},\dots,y_{n+1})\in\Delta^{n+1}, let

xi=(n+n)yi1(i=1,,n).x_{i}=(n+\sqrt{n})y_{i}-1~{}(i=1,\dots,n).

Then we know that

ni=1nxi=ni=1n((n+n)yi1)=n+n(n+n)i=1nyi=(n+n)(1i=1nyi)0,\begin{array}[]{ll}\sqrt{n}-\sum\limits_{i=1}^{n}x_{i}=\sqrt{n}-\sum\limits_{i=1}^{n}((n+\sqrt{n})y_{i}-1)&=\sqrt{n}+n-(n+\sqrt{n})\sum\limits_{i=1}^{n}y_{i}\\ &=(\sqrt{n}+n)(1-\sum\limits_{i=1}^{n}y_{i})\\ &\geq 0,\\ \end{array}

which implies that x=(x1,,xn)Δ¯x=(x_{1},\dots,x_{n})\in\bar{\Delta}. By computations, we have that for (y1,,yn+1)Δn+1(y_{1},\dots,y_{n+1})\in\Delta^{n+1}, the following holds

F^(y1,,yn+1)=αdnFα(d)Bα(d)(x)=F(x)FminI.\widehat{F}(y_{1},\dots,y_{n+1})=\sum_{\alpha\in\mathbb{N}^{n}_{d}}F_{\alpha}^{(d)}B_{\alpha}^{(d)}(x)=F(x)\succeq F_{\min}\cdot I_{\ell}.

Hence, F^FminI\widehat{F}\succeq F_{\min}\cdot I_{\ell} on Δn+1\Delta^{n+1}. It follows from [48, Theorem 3] that for k>d(d1)F^22Fmindk>d(d-1)\frac{\|\widehat{F}\|_{2}}{2F_{\min}}-d, all coefficient matrices of (y1++yn+1)kF^(y1,,yn+1)(y_{1}+\cdots+y_{n+1})^{k}\widehat{F}(y_{1},\dots,y_{n+1}) in the monomial basis are positive definite, i.e., there exist Pα0P_{\alpha}\succ 0 such that

(y1++yn+1)kF^(y1,,yn+1)=αk+dnPα(k+d)!α1!αn!y1α1ynαn(yn+1)k+d|α|.(y_{1}+\cdots+y_{n+1})^{k}\widehat{F}(y_{1},\dots,y_{n+1})=\sum_{\alpha\in\mathbb{N}_{k+d}^{n}}P_{\alpha}\frac{(k+d)!}{\alpha_{1}!\cdots\alpha_{n}!}y_{1}^{\alpha_{1}}\cdots y_{n}^{\alpha_{n}}(y_{n+1})^{k+d-|\alpha|}.

By substituting

yi=xi+1n+n(i=1,,n),yn+1=11n+n(i=1nxi+n)y_{i}=\frac{x_{i}+1}{n+\sqrt{n}}~{}(i=1,\dots,n),\,y_{n+1}=1-\frac{1}{n+\sqrt{n}}\left(\sum\limits_{i=1}^{n}x_{i}+n\right)

into the above identity, we obtain

F(x1,,xn)=αk+dnPαBα(k+d)(x).F(x_{1},\dots,x_{n})=\sum_{\alpha\in\mathbb{N}_{k+d}^{n}}P_{\alpha}B_{\alpha}^{(k+d)}(x).

Since F^2=FB\|\widehat{F}\|_{2}=\|F\|_{B}, the conclusion follows.
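To illustrate the degree-elevation phenomenon behind Lemma 3.1 in the simplest case (n=1n=1, =1\ell=1, where Δ¯=[1,1]\bar{\Delta}=[-1,1] and the Bernstein basis reduces to the classical one on [0,1][0,1] under the substitution y=(x+1)/2y=(x+1)/2), the following Python sketch computes the Bernstein coefficients of increasing degree for a polynomial that is positive on the interval and locates the first degree at which all coefficients become positive. The routine `bernstein_coeffs` and the test polynomial x2+1/100x^{2}+1/100 are illustrative choices, not from the paper.

```python
from fractions import Fraction
from math import comb

def bernstein_coeffs(a, N):
    # Degree-N Bernstein coefficients on [0,1] of f(y) = sum_i a[i] * y^i,
    # via the classical identity b_j = sum_i a[i] * C(j,i) / C(N,i).
    return [sum(Fraction(a[i]) * comb(j, i) / comb(N, i)
                for i in range(len(a))) for j in range(N + 1)]

# f(x) = x^2 + 1/100 on [-1,1]; substituting x = 2y - 1 gives
# f = 4y^2 - 4y + 101/100 on [0,1], which is positive but has a
# negative Bernstein coefficient at its natural degree d = 2.
a = [Fraction(101, 100), Fraction(-4), Fraction(4)]
print(bernstein_coeffs(a, 2))  # middle coefficient is negative

# Degree elevation: find the smallest N with all coefficients positive.
# (Elevated coefficients are convex combinations of the old ones, so the
# minimum coefficient is nondecreasing in N.)
N = 2
while min(bernstein_coeffs(a, N)) <= 0:
    N += 1
print(N)
```

Note that the ratio between the certified degree and the bound of Lemma 3.1 stays moderate here: with Fmin=1/100F_{\min}=1/100, the bound is of the same order as the degree found by the search.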

Lemma 3.3.

Suppose P(x)P(x) is an \ell-by-\ell symmetric polynomial matrix with deg(P)=dP\deg(P)=d_{P}. Let ξ\xi\in\mathbb{R}^{\ell} with ξ2=1\|\xi\|_{2}=1, and denote

P(ξ)(x):=ξTP(x)ξ.P^{(\xi)}(x):=\xi^{T}P(x)\xi.

Then, we have that P(ξ)B,dPPB\|P^{(\xi)}\|_{B,d_{P}}\leq\|P\|_{B}. Furthermore, there exists ξ0\xi_{0}\in\mathbb{R}^{\ell} with ξ02=1\|\xi_{0}\|_{2}=1 such that P(ξ0)B,dP=PB\|P^{(\xi_{0})}\|_{B,d_{P}}=\|P\|_{B}.

Proof 3.4.

Suppose that PP can be expressed in the Bernstein basis {Bα(dP)(x):αdPn}\{B_{\alpha}^{(d_{P})}(x):\alpha\in\mathbb{N}^{n}_{d_{P}}\} as

P=αdPnPα(dP)Bα(dP)(x),P=\sum_{\alpha\in\mathbb{N}^{n}_{d_{P}}}P_{\alpha}^{(d_{P})}B_{\alpha}^{(d_{P})}(x),

where Pα(dP)P_{\alpha}^{(d_{P})} are \ell-by-\ell symmetric matrices. Note that

P(ξ)B,dP=ξTP(x)ξB,dP=αdPnξTPα(dP)ξBα(dP)(x)B,dP=max{|ξTPα(dP)ξ|:αdPn}max{Pα(dP)2:αdPn}=PB.\begin{array}[]{ll}\|P^{(\xi)}\|_{B,d_{P}}=\|\xi^{T}P(x)\xi\|_{B,d_{P}}&=\|\sum\limits_{\alpha\in\mathbb{N}^{n}_{d_{P}}}\xi^{T}P_{\alpha}^{(d_{P})}\xi B_{\alpha}^{(d_{P})}(x)\|_{B,d_{P}}\\ &=\max\{|\xi^{T}P_{\alpha}^{(d_{P})}\xi|:~{}\alpha\in\mathbb{N}^{n}_{d_{P}}\}\\ &\leq\max\{\|P_{\alpha}^{(d_{P})}\|_{2}:~{}\alpha\in\mathbb{N}^{n}_{d_{P}}\}=\|P\|_{B}.\end{array}

Thus, we have P(ξ)B,dPPB.\|P^{(\xi)}\|_{B,d_{P}}\leq\|P\|_{B}. Moreover, there exists ξ0\xi_{0}\in\mathbb{R}^{\ell} with ξ02=1\|\xi_{0}\|_{2}=1 such that

PB=max{|ξ0TPα(dP)ξ0|:αdPn},\|P\|_{B}=\max\{|\xi_{0}^{T}P_{\alpha}^{(d_{P})}\xi_{0}|:~{}\alpha\in\mathbb{N}^{n}_{d_{P}}\},

for which it holds that

P(ξ0)B,dP=max{|ξ0TPα(dP)ξ0|:αdPn}=PB.\|P^{(\xi_{0})}\|_{B,d_{P}}=\max\{|\xi_{0}^{T}P_{\alpha}^{(d_{P})}\xi_{0}|:~{}\alpha\in\mathbb{N}^{n}_{d_{P}}\}=\|P\|_{B}.

Denote

(3.4) 𝒢(x)=min(g1(x)g1B,,gm(x)gmB,0).\mathcal{G}(x)=-\min\left(\frac{g_{1}(x)}{\left\|g_{1}\right\|_{B}},\ldots,\frac{g_{m}(x)}{\left\|g_{m}\right\|_{B}},0\right).

For xnx\in\mathbb{R}^{n}, let d(x,K)d(x,K) denote the Euclidean distance to the set KK, i.e.,

d(x,K)=min{xy2,yK}.d(x,K)=\min\{\|x-y\|_{2},\,\,y\in K\}.

If 𝒢(x)=0\mathcal{G}(x)=0, then we have xKx\in K and d(x,K)=0d(x,K)=0. The distance function d(x,K)d(x,K) is a continuous semialgebraic function (see [5, Proposition 2.2.8], [14, Proposition 1.8]). By Corollary 2.2, there exist positive constants η\eta, κ\kappa such that the following Łojasiewicz inequality holds:

(3.5) d(x,K)ηκ𝒢(x),xΔ¯.d(x,K)^{\eta}\leq\kappa\mathcal{G}(x),\quad\forall x\in\bar{\Delta}.

Note that

dG=max{deg(g1),,deg(gm)}.d_{G}=\max\{\deg(g_{1}),\dots,\deg(g_{m})\}.

In the following, we prove a polynomial bound on the degrees of terms appearing in matrix Putinar’s Positivstellensätz when KK is given by finitely many scalar polynomial inequalities, which improves the exponential bound established in [15]. Recall that the quadratic module of scalar polynomials generated by G(x)G(x) in [x]\mathbb{R}[x] is denoted by QM[G]\mbox{QM}[G] (see Section 2.2).

Theorem 3.5.

Let G(x)G(x) be a diagonal matrix with diagonal entries g1,,gm[x]g_{1},\dots,g_{m}\in\mathbb{R}[x] and let KK be as in (3.1). Suppose that F(x)F(x) is an \ell-by-\ell symmetric polynomial matrix with deg(F)=d\deg(F)=d, and (1x22)IQM[G](1-\|x\|_{2}^{2})\cdot I_{\ell}\in\mbox{QM}[G]^{\ell}. If F(x)FminI0F(x)\succeq F_{\min}\cdot I_{\ell}\succ 0 on KK, then FQM[G]kF\in\mbox{QM}[G]_{k}^{\ell} for

(3.6) kC87ηm3n2dG6κ7d14η(FBFmin)7η+3.k\geq C\cdot{8}^{7\eta}m^{3}n^{2}d_{G}^{6}\kappa^{7}d^{14\eta}\left(\frac{\|F\|_{B}}{F_{\min}}\right)^{7\eta+3}.

In the above, κ\kappa, η\eta are respectively the Łojasiewicz constant and exponent in (3.5) and C>0C>0 is a constant independent of FF, g1,,gmg_{1},\dots,g_{m}. Furthermore, the Łojasiewicz exponent can be estimated as η(dG+1)(3dG)n+m2\eta\leq(d_{G}+1)(3d_{G})^{n+m-2}.

Remark. At a feasible point xKx\in K, the Mangasarian–Fromovitz constraint qualification (MFCQ) holds if there exists a direction vnv\in\mathbb{R}^{n} such that gj(x)Tv>0\nabla g_{j}(x)^{T}v>0 for all active inequality constraints gj(x)=0g_{j}(x)=0. The linear independence constraint qualification (LICQ) holds at xx if the gradient vectors gj(x)\nabla g_{j}(x) of all active constraints are linearly independent. It was shown in [45] that if the MFCQ holds at every xKx\in K, then the Łojasiewicz exponent, as in (3.5), is equal to 1. Since the MFCQ is weaker than the LICQ, we know that if the LICQ holds at every xKx\in K, the Łojasiewicz exponent η=1\eta=1 (see also [2, 3]; the author would like to thank the referee for pointing out this fact). Then, under the assumptions of Theorem 3.5, we have that FQM[G]kF\in\mbox{QM}[G]_{k}^{\ell} if

(3.7) kCm3n2dG6κ7d14(FBFmin)10.k\geq C\cdot m^{3}n^{2}d_{G}^{6}\kappa^{7}d^{14}\left(\frac{\|F\|_{B}}{F_{\min}}\right)^{10}.
Proof 3.6.

Without loss of generality, we assume that giB=1\left\|g_{i}\right\|_{B}=1 for i=1,,mi=1,\dots,m; otherwise, we can replace gig_{i} by gi/giBg_{i}/\|g_{i}\|_{B}. This scaling does not change κ\kappa, since each gig_{i} is already normalized by giB\|g_{i}\|_{B} in the definition of 𝒢\mathcal{G}, and gi/giBg_{i}/\|g_{i}\|_{B} has unit Bernstein norm. Thus, the bound remains unchanged. Let

δ=18ηκd2η(FminFB)η,v=δFmin20mFB,λ=5FB/δ.\delta=\frac{1}{{8}^{\eta}\kappa d^{2\eta}}\left(\frac{F_{\min}}{\|F\|_{B}}\right)^{\eta},~{}v=\frac{\delta F_{\min}}{20m\|F\|_{B}},~{}\lambda=5\|F\|_{B}/\delta.

For the above δ\delta, vv, it follows from [3, Proposition 3.2] that there exists hi[x]h_{i}\in\mathbb{R}[x] such that for xΔ¯x\in\bar{\Delta}, we have that

  • (i)

    if gi(x)0,|hi(x)|2vg_{i}(x)\geq 0,\left|h_{i}(x)\right|\leq 2v.

  • (ii)

    if gi(x)δ,|hi(x)|12g_{i}(x)\leq-\delta,\left|h_{i}(x)\right|\geq\frac{1}{2}.

  • (iii)

    hiB1\left\|h_{i}\right\|_{B}\leq 1.

  • (iv)

    hiΣ[x]kh_{i}\in\Sigma[x]_{k} with k105ndG2δ2v1k\leq 10^{5}nd_{G}^{2}\delta^{-2}v^{-1}.

Let

P=Fλi=1mhigiI.P=F-\lambda\sum_{i=1}^{m}h_{i}g_{i}I_{\ell}.

In the following, we show that P14FminIP\succeq\frac{1}{4}F_{\min}\cdot I_{\ell} on Δ¯\bar{\Delta}. Equivalently, for a fixed xΔ¯x\in\bar{\Delta}, we prove that for each ξ\xi\in\mathbb{R}^{\ell} with ξ2=1\|\xi\|_{2}=1, the following holds:

(3.8) F(ξ)(x)λi=1mhi(x)gi(x)=ξT(F(x)λi=1mhi(x)gi(x)I)ξ14Fmin.F^{(\xi)}(x)-\lambda\sum_{i=1}^{m}h_{i}(x)g_{i}(x)=\xi^{T}(F(x)-\lambda\sum_{i=1}^{m}h_{i}(x)g_{i}(x)I_{\ell})\xi\geq\frac{1}{4}F_{\min}.

In the above, F(ξ)(x):=ξTF(x)ξ.F^{(\xi)}(x):=\xi^{T}F(x)\xi.

Case I. Suppose that F(ξ)(x)3Fmin4F^{(\xi)}(x)\leq\frac{3F_{\min}}{4}. Then, we know that xKx\notin K. Let xKx^{*}\in K be a vector such that xx2=d(x,K)\|x-x^{*}\|_{2}=d(x,K). Since F(x)FminIF(x^{*})\succeq F_{\min}\cdot I_{\ell}, we have that

Fmin4F(ξ)(x)F(ξ)(x)maxsΔ¯F(ξ)(s)2xx22d(2d1)n+1xx2maxsΔ¯|F(ξ)(s)|,\begin{array}[]{ll}\frac{F_{\min}}{4}\leq F^{(\xi)}(x^{*})-F^{(\xi)}(x)&\leq\max\limits_{s\in\bar{\Delta}}\|\nabla F^{(\xi)}(s)\|_{2}\|x-x^{*}\|_{2}\\ &\leq\frac{2d(2d-1)}{\sqrt{n}+1}\|x-x^{*}\|_{2}\max\limits_{s\in\bar{\Delta}}|F^{(\xi)}(s)|,\\ \end{array}

where the last inequality follows from Theorem 2.3. Since maxsΔ¯|F(ξ)(s)|F(ξ)B,d\max\limits_{s\in\bar{\Delta}}|F^{(\xi)}(s)|\leq\|F^{(\xi)}\|_{B,d} (see (3.3)), it holds that

(3.9) FminmaxsΔ¯|F(ξ)(s)|F(ξ)B,dFB,F_{\min}\leq\max_{s\in\bar{\Delta}}|F^{(\xi)}(s)|\leq\|F^{(\xi)}\|_{B,d}\leq\|F\|_{B},

where the last inequality is due to Lemma 3.3. Then, we know that

Fmin42d(2d1)n+1F(ξ)B,dxx22d2FBxx2.\frac{F_{\min}}{4}\leq\frac{2d(2d-1)}{\sqrt{n}+1}\|F^{(\xi)}\|_{B,d}\|x-x^{*}\|_{2}\leq 2d^{2}\|F\|_{B}\|x-x^{*}\|_{2}.

By the Łojasiewicz inequality (3.5), we have that

𝒢(x)κ1d(x,K)ηκ1xx2η18ηκd2η(FminFB)η=δ.\mathcal{G}(x)\geq\kappa^{-1}d(x,K)^{\eta}\geq\kappa^{-1}\|x-x^{*}\|_{2}^{\eta}\geq\frac{1}{{8}^{\eta}\kappa d^{2\eta}}(\frac{F_{\min}}{\|F\|_{B}})^{\eta}=\delta.

From the definition of the function 𝒢(x)\mathcal{G}(x) as in (3.4), there exists i0{1,,m}i_{0}\in\{1,\dots,m\} such that gi0(x)δg_{i_{0}}(x)\leq-\delta. Since hi0(x)h_{i_{0}}(x) is an SOS (see item (iv)), we have hi0(x)0h_{i_{0}}(x)\geq 0; combined with gi0(x)δg_{i_{0}}(x)\leq-\delta and item (ii), this gives hi0(x)12h_{i_{0}}(x)\geq\frac{1}{2}. Note that

hi0(x)gi0(x)12δ,hi(x)gi(x)2vmaxsΔ¯|gi(s)|2vgiB=2v(i=1,,m).-h_{i_{0}}(x)g_{i_{0}}(x)\geq\frac{1}{2}\delta,~{}-h_{i}(x)g_{i}(x)\geq-2v\cdot\max\limits_{s\in\bar{\Delta}}|g_{i}(s)|\geq-2v\cdot\|g_{i}\|_{B}=-2v~{}(i=1,\dots,m).

In the above, we use the inequality (3.3). Then, we have that

F(ξ)(x)λi=1mhi(x)gi(x)\displaystyle F^{(\xi)}(x)-\lambda\sum_{i=1}^{m}h_{i}(x)g_{i}(x) F(ξ)(x)+λ12δ2λv(m1)\displaystyle\geq F^{(\xi)}(x)+\lambda\cdot\frac{1}{2}\delta-2\lambda v(m-1)
F(ξ)(x)+λδ4+λ(δ42v(m1)).\displaystyle\geq F^{(\xi)}(x)+\lambda\frac{\delta}{4}+\lambda\left(\frac{\delta}{4}-2v(m-1)\right).

One can see from the relation (3.9) that F(ξ)(x)F(ξ)B,dF^{(\xi)}(x)\geq-\|F^{(\xi)}\|_{B,d} and FminFBF_{\min}\leq\|F\|_{B}, which implies that

δ42v(m1)δ42vmδ4δFmin10FBδ4δ10>0,\frac{\delta}{4}-2v(m-1)\geq\frac{\delta}{4}-2vm\geq\frac{\delta}{4}-\frac{\delta F_{\min}}{10\|F\|_{B}}\geq\frac{\delta}{4}-\frac{\delta}{10}>0,
λδ4=54FB54F(ξ)B,d.\lambda\frac{\delta}{4}=\frac{5}{4}\|F\|_{B}\geq\frac{5}{4}\|F^{(\xi)}\|_{B,d}.

Then, it follows that

F(ξ)(x)λi=1mhi(x)gi(x)\displaystyle F^{(\xi)}(x)-\lambda\sum_{i=1}^{m}h_{i}(x)g_{i}(x) F(ξ)(x)+λδ4+λ(δ42v(m1))\displaystyle\geq F^{(\xi)}(x)+\lambda\frac{\delta}{4}+\lambda\left(\frac{\delta}{4}-2v(m-1)\right)
F(ξ)B,d+54F(ξ)B,d\displaystyle\geq-\|F^{(\xi)}\|_{B,d}+\frac{5}{4}\|F^{(\xi)}\|_{B,d}
14F(ξ)B,d14maxsΔ¯|F(ξ)(s)|\displaystyle\geq\frac{1}{4}\|F^{(\xi)}\|_{B,d}\geq\frac{1}{4}\max_{s\in\bar{\Delta}}|F^{(\xi)}(s)|
14Fmin,\displaystyle\geq\frac{1}{4}F_{\min},

where the second inequality follows from the inequality (3.3).

Case II. Suppose that F(ξ)(x)3Fmin4F^{(\xi)}(x)\geq\frac{3F_{\min}}{4}. Note that hi(x)gi(x)2vgiB=2v-h_{i}(x)g_{i}(x)\geq-2v\cdot\|g_{i}\|_{B}=-2v for i=1,,mi=1,\dots,m, and

2mλv=2mδFmin20mFB5FB/δ=12Fmin.2m\lambda v=2m\cdot\frac{\delta F_{\min}}{20m\|F\|_{B}}\cdot 5\|F\|_{B}/\delta=\frac{1}{2}F_{\min}.

Then, we have that

F(ξ)(x)λi=1mhi(x)gi(x)34Fmin2mλv14Fmin.F^{(\xi)}(x)-\lambda\sum_{i=1}^{m}h_{i}(x)g_{i}(x)\geq\frac{3}{4}F_{\min}-2m\lambda v\geq\frac{1}{4}F_{\min}.

This concludes the proof of (3.8).
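As a quick sanity check on the arithmetic above, the identities 2mλv=Fmin/22m\lambda v=F_{\min}/2 (used in Case II) and λδ/4=5FB/4\lambda\delta/4=5\|F\|_{B}/4 (used in Case I) can be verified in exact rational arithmetic; the following Python sketch does so for a few arbitrary sample values, and since the cancellation is symbolic, it holds for any positive data. This is only a verification aid, not part of the proof.

```python
from fractions import Fraction

def check_constants(m, F_min, F_B, delta):
    # v and lam are as chosen in the proof of Theorem 3.5; delta is arbitrary.
    v = delta * F_min / (20 * m * F_B)
    lam = 5 * F_B / delta
    assert 2 * m * lam * v == F_min / 2              # Case II estimate
    assert lam * delta / 4 == Fraction(5, 4) * F_B   # Case I estimate
    return True

# A few arbitrary positive sample values (exact rationals).
for m, F_min, F_B, delta in [(3, Fraction(1, 2), Fraction(7), Fraction(1, 100)),
                             (10, Fraction(2), Fraction(9, 4), Fraction(3, 5))]:
    check_constants(m, F_min, F_B, delta)
print("constant identities hold")
```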

Since ξ\xi\in\mathbb{R}^{\ell} with ξ2=1\|\xi\|_{2}=1 and the fixed point xΔ¯x\in\bar{\Delta} are both arbitrary, we have P(x)14FminIP(x)\succeq\frac{1}{4}F_{\min}\cdot I_{\ell} for all xΔ¯x\in\bar{\Delta}. Since deg(hi)105ndG2δ2v1\deg(h_{i})\leq 10^{5}nd_{G}^{2}\delta^{-2}v^{-1} for i=1,,mi=1,\dots,m, we have that

deg(higi)105ndG3δ2v12106mndG3δ3FBFmin210683ηmndG3κ3d6η(FBFmin)3η+1.\begin{array}[]{ll}\deg(h_{i}g_{i})\leq 10^{5}nd_{G}^{3}\delta^{-2}v^{-1}&\leq 2\cdot 10^{6}mnd_{G}^{3}\delta^{-3}\frac{\|F\|_{B}}{F_{\min}}\\ &\leq 2\cdot 10^{6}\cdot{8}^{3\eta}mnd_{G}^{3}\kappa^{3}d^{6\eta}(\frac{\|F\|_{B}}{F_{\min}})^{3\eta+1}.\\ \end{array}

By Lemma 3.3, there exists ξ0\xi_{0}\in\mathbb{R}^{\ell} with ξ02=1\|\xi_{0}\|_{2}=1 such that PB=ξ0TPξ0B,deg(P)\|P\|_{B}=\|\xi_{0}^{T}P\xi_{0}\|_{B,\deg(P)}. Then, we obtain that

PB=ξ0TPξ0B,deg(P)=F(ξ0)λi=1mhigiB,deg(P)F(ξ0)B,deg(P)+λi=1mhigiB,deg(P)F(ξ0)B,deg(F)+λi=1mhiB,deg(P)deg(gi)giB,deg(gi)FB+λm(1+5δ1m)FB8η+1mκd2η(FBFmin)ηFB,\begin{array}[]{ll}\|P\|_{B}=\|\xi_{0}^{T}P\xi_{0}\|_{B,\deg(P)}&=\|F^{(\xi_{0})}-\lambda\sum_{i=1}^{m}h_{i}g_{i}\|_{B,\deg(P)}\\ &\leq\|F^{(\xi_{0})}\|_{B,\deg(P)}+\lambda\sum_{i=1}^{m}\|h_{i}g_{i}\|_{B,\deg(P)}\\ &\leq\|F^{(\xi_{0})}\|_{B,\deg(F)}+\lambda\sum_{i=1}^{m}\|h_{i}\|_{B,\deg(P)-\deg(g_{i})}\|g_{i}\|_{B,\deg(g_{i})}\\ &\leq\|F\|_{B}+\lambda m\leq(1+5\delta^{-1}m)\|F\|_{B}\\ &\leq{8}^{\eta+1}m\kappa d^{2\eta}(\frac{\|F\|_{B}}{F_{\min}})^{\eta}\|F\|_{B},\\ \end{array}

where the second inequality is due to the inequality (3.3). Note that

deg(P)=deg(Fλi=1mhigiI)=max{deg(F),deg(higi)(i=1,,m)}max{d,210683ηmndG3κ3d6η(FBFmin)3η+1}=210683ηmndG3κ3d6η(FBFmin)3η+1.\begin{array}[]{ll}\deg(P)=\deg(F-\lambda\sum_{i=1}^{m}h_{i}g_{i}I_{\ell})&=\max\{\deg(F),\deg(h_{i}g_{i})~{}(i=1,\dots,m)\}\\ &\leq\max\{d,2\cdot 10^{6}\cdot{8}^{3\eta}mnd_{G}^{3}\kappa^{3}d^{6\eta}(\frac{\|F\|_{B}}{F_{\min}})^{3\eta+1}\}\\ &=2\cdot 10^{6}\cdot{8}^{3\eta}mnd_{G}^{3}\kappa^{3}d^{6\eta}(\frac{\|F\|_{B}}{F_{\min}})^{3\eta+1}.\end{array}

It follows from Lemma 3.1 that all coefficient matrices of P(x)P(x) in the Bernstein basis {Bα(k)(x):αkn}\{B_{\alpha}^{(k)}(x):\alpha\in\mathbb{N}_{k}^{n}\} are positive definite, for

k(16106)287ηm3n2dG6κ7d14η(FBFmin)7η+3>(210683ηmndG3κ3d6η(FBFmin)3η+1)28η+2mκd2η(FBFmin)ηFBFmindeg(P)2PB12Fmin,\begin{array}[]{ll}k&\geq(16\cdot 10^{6})^{2}\cdot{8}^{7\eta}m^{3}n^{2}d_{G}^{6}\kappa^{7}d^{14\eta}(\frac{\|F\|_{B}}{F_{\min}})^{7\eta+3}\\ &>\left(2\cdot 10^{6}\cdot{8}^{3\eta}mnd_{G}^{3}\kappa^{3}d^{6\eta}(\frac{\|F\|_{B}}{F_{\min}})^{3\eta+1}\right)^{2}{8}^{\eta+2}m\kappa d^{2\eta}(\frac{\|F\|_{B}}{F_{\min}})^{\eta}\frac{\|F\|_{B}}{F_{\min}}\\ &\geq\deg(P)^{2}\frac{\|P\|_{B}}{\frac{1}{2}F_{\min}},\\ \end{array}

i.e., there exist matrices Pα(k)0P_{\alpha}^{(k)}\succ 0 such that

Fλi=1mhigiI=αknPα(k)Bαk(x).F-\lambda\sum_{i=1}^{m}h_{i}g_{i}I_{\ell}=\sum_{\alpha\in\mathbb{N}_{k}^{n}}P_{\alpha}^{(k)}B_{\alpha}^{k}(x).

In the following, we show that nx1xnQM[1x22]2\sqrt{n}-x_{1}-\cdots-x_{n}\in\mbox{QM}[1-\|x\|_{2}^{2}]_{2}. Consider the optimization

(3.10) {minnx1xns.t.1x220.\left\{\begin{array}[]{rl}\min&\sqrt{n}-x_{1}-\cdots-x_{n}\\ \mathit{s.t.}&1-\|x\|_{2}^{2}\geq 0.\\ \end{array}\right.

One can verify that the optimal value of (3.10) is 0. Since nx1xn\sqrt{n}-x_{1}-\cdots-x_{n} and (1x2)-(1-\|x\|^{2}) are SOS convex and the feasible set has an interior point, it follows from [32, Theorem 3.3] that the lowest order SOS relaxation of (3.10) is exact and the optimal values of all SOS relaxations are achievable. Consequently, we have that nx1xnQM[1x22]2\sqrt{n}-x_{1}-\cdots-x_{n}\in\mbox{QM}[1-\|x\|_{2}^{2}]_{2}. Similarly, we know that 1+x1,,1+xnQM[1x22]21+x_{1},\cdots,1+x_{n}\in\mbox{QM}[1-\|x\|_{2}^{2}]_{2}. Hence, we have

(nx1xn)k|α|(1+x1)α1(1+xn)αnQM[1x22]2(k|α|)+2α1++2αn=QM[1x22]2k.\left(\sqrt{n}-x_{1}-\cdots-x_{n}\right)^{k-|\alpha|}\left(1+x_{1}\right)^{\alpha_{1}}\cdots\left(1+x_{n}\right)^{\alpha_{n}}\in\mbox{QM}[1-\|x\|_{2}^{2}]_{2(k-|\alpha|)+2\alpha_{1}+\dots+2\alpha_{n}}=\mbox{QM}[1-\|x\|_{2}^{2}]_{2k}.

Then, from the definition of Bαk(x)B_{\alpha}^{k}(x), we have Bαk(x)QM[1x22]2kB_{\alpha}^{k}(x)\in\mbox{QM}[1-\|x\|_{2}^{2}]_{2k}. Since (1x22)IQM[G]k0(1-\|x\|_{2}^{2})\cdot I_{\ell}\in\mbox{QM}[G]_{k_{0}}^{\ell} for some k0>0k_{0}>0, it follows that 1x22QM[G]k01-\|x\|_{2}^{2}\in\mbox{QM}[G]_{k_{0}}, and thus Bαk(x)QM[G]2k+k0B_{\alpha}^{k}(x)\in\mbox{QM}[G]_{2k+k_{0}}. Then, the conclusion holds for sufficiently large CC. Note that

K={xng1(x)g1B0,,gm(x)gmB0}.K=\left\{x\in\mathbb{R}^{n}\mid\frac{g_{1}(x)}{\left\|g_{1}\right\|_{B}}\geq 0,\dots,\frac{g_{m}(x)}{\left\|g_{m}\right\|_{B}}\geq 0\right\}.

It follows from Theorem 2.4 that η(dG+1)(3dG)n+m2\eta\leq(d_{G}+1)(3d_{G})^{n+m-2}.
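The membership nx1xnQM[1x22]2\sqrt{n}-x_{1}-\cdots-x_{n}\in\mbox{QM}[1-\|x\|_{2}^{2}]_{2} obtained above via [32, Theorem 3.3] also admits an explicit certificate, namely nixi=n2i(xi1/n)2+n2(1x22)\sqrt{n}-\sum_{i}x_{i}=\frac{\sqrt{n}}{2}\sum_{i}(x_{i}-1/\sqrt{n})^{2}+\frac{\sqrt{n}}{2}(1-\|x\|_{2}^{2}); this particular certificate is an illustrative choice of ours, not the one produced by the SOS relaxation. The following sympy sketch verifies the identity for small nn.

```python
import sympy as sp

# Explicit membership certificate for sqrt(n) - x_1 - ... - x_n in
# QM[1 - ||x||^2]_2: an SOS term plus (sqrt(n)/2) * (1 - ||x||^2).
for n in range(1, 6):
    x = sp.symbols(f'x1:{n + 1}')
    rn = sp.sqrt(n)
    sos = (rn / 2) * sum((xi - 1 / rn) ** 2 for xi in x)   # the SOS part
    cert = sos + (rn / 2) * (1 - sum(xi ** 2 for xi in x))
    assert sp.expand(cert - (rn - sum(x))) == 0
print("certificate verified for n = 1, ..., 5")
```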

4 A scalar representation for polynomial matrix inequalities

In this section, we provide exact estimates on the quantity and degree bounds for polynomials d1,,dtd_{1},\dots,d_{t} with d1,,dtQM[G]d_{1},\dots,d_{t}\in\mbox{QM}[G], such that the polynomial matrix inequality G(x)0G(x)\succeq 0 can be equivalently described by the scalar polynomial inequalities d10d_{1}\geq 0, \dots, dt0d_{t}\geq 0. This serves as a quantitative version of the following proposition due to Schmüdgen [50].

Proposition 1 (Proposition 9, [50]).

Let G(x)G(x) be an mm-by-mm symmetric polynomial matrix and G(x)0G(x)\neq 0. Then there exist mm-by-mm diagonal polynomial matrices DrD_{r}, mm-by-mm symmetric polynomial matrices X±rX_{\pm r} and polynomials zrz_{r}\in Σ[x],r=1,,t\Sigma[x],r=1,\ldots,t, such that:

  • (i)

X+rXr=XrX+r=zrI,Dr=XrGXrT,zr2G=X+rDrX+rTX_{+r}X_{-r}=X_{-r}X_{+r}=z_{r}I,D_{r}=X_{-r}GX_{-r}^{T},z_{r}^{2}G=X_{+r}D_{r}X_{+r}^{T},

  • (ii)

    For xnx\in\mathbb{R}^{n}, G(x)0G(x)\succeq 0 if and only if Dr(x)0D_{r}(x)\succeq 0 for all r=1,,tr=1,\ldots,t.

Let K={xnG(x)0}.K=\{x\in\mathbb{R}^{n}\mid G(x)\succeq 0\}. A direct corollary of Proposition 1 is that there exist finitely many polynomials in QM[G]\mbox{QM}[G] such that KK can be equivalently described by these polynomials. We also refer to [7, Proposition 5] for a generalization of Proposition 1, where the author considers the case that KK can be described by possibly infinitely many polynomial matrix inequalities. Denote

θ(m)=i=1m(k=m+1imk(k+1)2).\theta(m)=\sum\limits_{i=1}^{m}\left(\prod_{k=m+1-i}^{m}\frac{k(k+1)}{2}\right).

Recall that dG=max{deg(Gij),1i,jm}.d_{G}=\max\{\deg(G_{ij}),~{}1\leq i,j\leq m\}. In the following, we provide a quantitative version of Proposition 1.
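To make the growth of θ(m)\theta(m) concrete, the following Python sketch (illustrative only) evaluates the formula and checks the recursion θ(k+1)=(k+1)(k+2)2(1+θ(k))\theta(k+1)=\frac{(k+1)(k+2)}{2}(1+\theta(k)) that underlies the constraint count in the induction step of the proof of Theorem 4.1; for instance, θ(2)=6\theta(2)=6 matches the m=2m=2 base case.

```python
from math import prod

def theta(m):
    # theta(m) = sum_{i=1}^{m} prod_{k=m+1-i}^{m} k(k+1)/2
    return sum(prod(k * (k + 1) // 2 for k in range(m + 1 - i, m + 1))
               for i in range(1, m + 1))

print([theta(m) for m in range(2, 6)])

# Check the count appearing in the induction step:
# theta(k+1) = (k+1)(k+2)/2 * (1 + theta(k)).
for k in range(2, 10):
    assert theta(k + 1) == (k + 1) * (k + 2) // 2 * (1 + theta(k))
```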

Theorem 4.1.

Suppose G(x)G(x) is an mm-by-mm symmetric polynomial matrix with m2m\geq 2. Then there exist d1,d2,,dθ(m)QM[G]3m1dGd_{1},d_{2},\dots,d_{\theta(m)}\in QM[G]_{3^{m-1}d_{G}} such that

K={xnd1(x)0,,dθ(m)(x)0}.K=\{x\in\mathbb{R}^{n}\mid d_{1}(x)\geq 0,\dots,d_{\theta(m)}(x)\geq 0\}.

Proof 4.2.

We prove the result by induction on the matrix size mm. Suppose that m=2m=2 and G(x)=[Gij(x)]i,j=1,2G(x)=[G_{ij}(x)]_{i,j=1,2}. In the following, we show that the inequality G(x)0G(x)\succeq 0 can be equivalently described as

G110,G220,G11(G11G22G122)0,G_{11}\geq 0,~{}G_{22}\geq 0,~{}G_{11}(G_{11}G_{22}-G_{12}^{2})\geq 0,
G11+2G12+G220,G22(G11G22G122)0,G_{11}+2G_{12}+G_{22}\geq 0,~{}G_{22}(G_{11}G_{22}-G_{12}^{2})\geq 0,
(G11+2G12+G22)((G11+2G12+G22)G22(G12+G22)2)0.(G_{11}+2G_{12}+G_{22})((G_{11}+2G_{12}+G_{22})G_{22}-(G_{12}+G_{22})^{2})\geq 0.

If G(x)0G(x)\succeq 0, we know that G11(x)0G_{11}(x)\geq 0, G22(x)0G_{22}(x)\geq 0, G11(x)G22(x)G12(x)20G_{11}(x)\cdot G_{22}(x)-G_{12}(x)^{2}\geq 0, which imply the above inequalities. Conversely, suppose xnx\in\mathbb{R}^{n} satisfies these inequalities. If G11(x)0G_{11}(x)\neq 0 (G22(x)0G_{22}(x)\neq 0, respectively), it follows from the inequality G11(x)(G11(x)G22(x)G12(x)2)0G_{11}(x)\cdot(G_{11}(x)\cdot G_{22}(x)-G_{12}(x)^{2})\geq 0  (G22(x)(G11(x)G22(x)G12(x)2)0G_{22}(x)\cdot(G_{11}(x)\cdot G_{22}(x)-G_{12}(x)^{2})\geq 0, respectively) that G11(x)G22(x)G12(x)20G_{11}(x)\cdot G_{22}(x)-G_{12}(x)^{2}\geq 0. This implies that G(x)0G(x)\succeq 0. Otherwise, if G11(x)=G22(x)=0G_{11}(x)=G_{22}(x)=0, the last inequality reduces to 2G12(x)30-2G_{12}(x)^{3}\geq 0 and the inequality G11(x)+2G12(x)+G22(x)0G_{11}(x)+2G_{12}(x)+G_{22}(x)\geq 0 reduces to G12(x)0G_{12}(x)\geq 0. Then, we have that G12(x)=0G_{12}(x)=0 and G(x)=0G(x)=0, so again G(x)0G(x)\succeq 0. Note that

[G110G12G11]G[G11G120G11]=[G11300G11(G11G22G122)],\begin{bmatrix}G_{11}&0\\ -G_{12}&G_{11}\\ \end{bmatrix}\cdot G\cdot\begin{bmatrix}G_{11}&-G_{12}\\ 0&G_{11}\\ \end{bmatrix}=\begin{bmatrix}G_{11}^{3}&0\\ 0&G_{11}(G_{11}G_{22}-G_{12}^{2})\\ \end{bmatrix},
[G22G120G22]G[G220G12G22]=[G22(G11G22G122)00G223],\begin{bmatrix}G_{22}&-G_{12}\\ 0&G_{22}\\ \end{bmatrix}\cdot G\cdot\begin{bmatrix}G_{22}&0\\ -G_{12}&G_{22}\\ \end{bmatrix}=\begin{bmatrix}G_{22}(G_{11}G_{22}-G_{12}^{2})&0\\ 0&G_{22}^{3}\\ \end{bmatrix},
ATGA=[(G11+2G12+G22)300(G11+2G12+G22)((G11+2G12+G22)G22(G12+G22)2)],A^{T}\cdot G\cdot A=\begin{bmatrix}(G_{11}+2G_{12}+G_{22})^{3}&0\\ 0&(G_{11}+2G_{12}+G_{22})((G_{11}+2G_{12}+G_{22})G_{22}-(G_{12}+G_{22})^{2})\\ \end{bmatrix},

where

A=[1011][G11+2G12+G22G22G120G11+2G12+G22].A=\begin{bmatrix}1&0\\ 1&1\\ \end{bmatrix}\cdot\begin{bmatrix}G_{11}+2G_{12}+G_{22}&-G_{22}-G_{12}\\ 0&G_{11}+2G_{12}+G_{22}\\ \end{bmatrix}.

We know that all the above scalar inequalities involve polynomials that belong to QM[G]3dG\mbox{QM}[G]_{3d_{G}}. Note that θ(2)=6\theta(2)=6. Hence, the result holds for m=2m=2.
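The three congruence identities displayed above can be checked symbolically; the following sympy sketch (a verification aid only, not part of the proof) confirms each of them.

```python
import sympy as sp

G11, G12, G22 = sp.symbols('G11 G12 G22')
G = sp.Matrix([[G11, G12], [G12, G22]])
det = G11 * G22 - G12 ** 2

# First congruence identity from the m = 2 base case.
X1 = sp.Matrix([[G11, 0], [-G12, G11]])
D1 = sp.Matrix([[G11 ** 3, 0], [0, G11 * det]])
assert sp.expand(X1 * G * X1.T - D1) == sp.zeros(2, 2)

# Second identity.
X2 = sp.Matrix([[G22, -G12], [0, G22]])
D2 = sp.Matrix([[G22 * det, 0], [0, G22 ** 3]])
assert sp.expand(X2 * G * X2.T - D2) == sp.zeros(2, 2)

# Third identity, with s = G11 + 2*G12 + G22.
s = G11 + 2 * G12 + G22
A = sp.Matrix([[1, 0], [1, 1]]) * sp.Matrix([[s, -G22 - G12], [0, s]])
D3 = sp.Matrix([[s ** 3, 0], [0, s * (s * G22 - (G12 + G22) ** 2)]])
assert sp.expand(A.T * G * A - D3) == sp.zeros(2, 2)
print("all three congruence identities verified")
```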

Suppose the result holds for m=km=k and consider the case that m=k+1m=k+1. In the following, we construct block diagonal polynomial matrices with a 1×11\times 1 block and a k×kk\times k block, such that the inequality G(x)0G(x)\succeq 0 can be equivalently described by the main diagonal blocks of these matrices. This reduction in matrix size allows us to apply induction. For i{1,,k+1}i\in\{1,\ldots,k+1\}, let PiP_{i} denote the permutation matrix that swaps the first row and the ii-th row. For 1i<jk+11\leq i<j\leq k+1, denote the row operation matrix Q(ij)Q^{(ij)} as

(Q(ij))uv={1,if (u,v)=(1,1),,(k+1,k+1) or if (u,v)=(i,j),0,otherwise.(Q^{(ij)})_{uv}=\left\{\begin{array}[]{ll}1,&\text{if~{}}(u,v)=(1,1),\dots,(k+1,k+1)\text{~{}or~{}if~{}}(u,v)=(i,j),\\ 0,&\text{otherwise}.\\ \end{array}\right.

Note that Q(ij)G(Q(ij))TQ^{(ij)}G(Q^{(ij)})^{T} is the matrix obtained by first adding the jj-th row of GG to the ii-th row and then adding the jj-th column to the ii-th column. Let

Tii=Pi(i=1,,k+1),Tij=PiQ(ij)(1i<jk+1).T_{ii}=P_{i}~{}(i=1,\dots,k+1),~{}T_{ij}=P_{i}\cdot Q^{(ij)}~{}(1\leq i<j\leq k+1).

Clearly, for i<ji<j, TijGTijTT_{ij}GT_{ij}^{\mathrm{T}} is the matrix obtained by swapping the first row with the ii-th row of Q(ij)G(Q(ij))TQ^{(ij)}G(Q^{(ij)})^{T} and then swapping the first column with the ii-th column. Hence, we know that for 1ijk+11\leq i\leq j\leq k+1, the matrices TijT_{ij} are all invertible and it holds that

TijGTijT=[sij(β(ij))Tβ(ij)H(ij)],T_{ij}GT_{ij}^{\mathrm{T}}=\left[\begin{array}[]{cc}s_{ij}&(\beta^{(ij)})^{\mathrm{T}}\\ \beta^{(ij)}&H^{(ij)}\end{array}\right],

where H(ij)H^{(ij)} is a kk-by-kk symmetric polynomial matrix, β(ij)[x]k\beta^{(ij)}\in\mathbb{R}[x]^{k} and

sii=Gii(i=1,,k+1),sij=Gii+Gjj+2Gij(1i<jk+1).s_{ii}=G_{ii}~{}(i=1,\dots,k+1),~{}s_{ij}=G_{ii}+G_{jj}+2G_{ij}~{}(1\leq i<j\leq k+1).

Denote the matrices

X+(ij)=[sij0β(ij)sijIk],X(ij)=[sij0β(ij)sijIk].X_{+}^{(ij)}=\left[\begin{array}[]{cc}s_{ij}&~{}0\\ \beta^{(ij)}&~{}s_{ij}\cdot I_{k}\end{array}\right],~{}X_{-}^{(ij)}=\left[\begin{array}[]{cc}s_{ij}&~{}0\\ -\beta^{(ij)}&~{}s_{ij}\cdot I_{k}\end{array}\right].

By computations, we have that

(4.1) X(ij)X+(ij)=X+(ij)X(ij)=sij2Ik+1,X_{-}^{(ij)}X_{+}^{(ij)}=X_{+}^{(ij)}X_{-}^{(ij)}=s_{ij}^{2}\cdot I_{k+1},
(4.2) (X(ij)Tij)G(X(ij)Tij)T=G(ij),(X_{-}^{(ij)}T_{ij})\cdot G\cdot(X_{-}^{(ij)}T_{ij})^{\mathrm{T}}=G^{(ij)},
(4.3) sij4TijGTijT=X+(ij)G(ij)(X+(ij))T,s_{ij}^{4}T_{ij}GT_{ij}^{\mathrm{T}}=X_{+}^{(ij)}\cdot G^{(ij)}(X_{+}^{(ij)})^{\mathrm{T}},

where

G(ij)=[sij300sij(sijH(ij)β(ij)(β(ij))T)].G^{(ij)}=\left[\begin{array}[]{cc}s_{ij}^{3}&0\\ 0&s_{ij}\cdot\left(s_{ij}H^{(ij)}-\beta^{(ij)}(\beta^{(ij)})^{\mathrm{T}}\right)\end{array}\right].

Let

B(ij)=sij(sijH(ij)β(ij)(β(ij))T)(1ijk+1).B^{(ij)}=s_{ij}\left(s_{ij}H^{(ij)}-\beta^{(ij)}(\beta^{(ij)})^{\mathrm{T}}\right)~{}(1\leq i\leq j\leq k+1).

Since TijT_{ij} are scalar matrices, we have deg(sij)dG,deg(B(ij))3dG.\deg(s_{ij})\leq d_{G},~{}\deg(B^{(ij)})\leq 3d_{G}.

In the following, we show that

(4.4) K={xnsij(x)0,B(ij)(x)0,1ijk+1}.K=\{x\in\mathbb{R}^{n}\mid s_{ij}(x)\geq 0,B^{(ij)}(x)\succeq 0,~{}1\leq i\leq j\leq k+1\}.

If xKx\in K, i.e., G(x)0G(x)\succeq 0, it follows from (4.2) that G(ij)(x)0G^{(ij)}(x)\succeq 0, which implies that

(4.5) sij(x)0,B(ij)(x)0,1ijk+1.s_{ij}(x)\geq 0,B^{(ij)}(x)\succeq 0,\quad\forall 1\leq i\leq j\leq k+1.

For the converse part, suppose xnx\in\mathbb{R}^{n} satisfies (4.5). If sij=0s_{ij}=0 for all 1ijk+11\leq i\leq j\leq k+1, then we have Gij(x)=0G_{ij}(x)=0 for all 1i,jk+11\leq i,j\leq k+1, leading to G(x)=0G(x)=0. If there exist 1i0j0k+11\leq i_{0}\leq j_{0}\leq k+1 such that si0j0(x)0s_{i_{0}j_{0}}(x)\neq 0, the relation (4.3) gives that

(4.6) G(x)=si0j04(x)Ti0j01X+(i0j0)G(i0j0)(X+(i0j0))T(Ti0j0T)1.G(x)=s_{i_{0}j_{0}}^{-4}(x)T_{i_{0}j_{0}}^{-1}X_{+}^{(i_{0}j_{0})}\cdot G^{(i_{0}j_{0})}(X_{+}^{(i_{0}j_{0})})^{\mathrm{T}}(T_{i_{0}j_{0}}^{\mathrm{T}})^{-1}.

Since G(i0j0)(x)0G^{(i_{0}j_{0})}(x)\succeq 0, we conclude that G(x)0G(x)\succeq 0.

Note that each B(ij)B^{(ij)} is a kk-by-kk symmetric polynomial matrix with deg(B(ij))3dG\deg(B^{(ij)})\leq 3d_{G} for 1ijk+11\leq i\leq j\leq k+1. By induction, there exist d1(ij),,dθ(k)(ij)QM[G]3kdGd_{1}^{(ij)},\dots,d_{\theta(k)}^{(ij)}\in QM[G]_{3^{k}d_{G}} such that

{xnB(ij)(x)0}={xnd1(ij)(x)0,,dθ(k)(ij)(x)0}.\{x\in\mathbb{R}^{n}\mid B^{(ij)}(x)\succeq 0\}=\{x\in\mathbb{R}^{n}\mid d_{1}^{(ij)}(x)\geq 0,\dots,d_{\theta(k)}^{(ij)}(x)\geq 0\}.

Combining this with the relation (4.4), we have that

K={xnsij(x)0,dt(ij)(x)0,1ijk+1,1tθ(k)}.K=\{x\in\mathbb{R}^{n}\mid s_{ij}(x)\geq 0,d_{t}^{(ij)}(x)\geq 0,~{}1\leq i\leq j\leq k+1,1\leq t\leq\theta(k)\}.

The total number of the above constraints is 12(k+1)(k+2)+12(k+1)(k+2)θ(k),\frac{1}{2}(k+1)(k+2)+\frac{1}{2}(k+1)(k+2)\theta(k), which equals θ(k+1)\theta(k+1). Hence, the proof is complete.

Remark. There exist alternative ways to obtain finitely many scalar polynomial inequalities that describe KK. For instance, consider the characteristic polynomial det[λImG(x)]det[\lambda I_{m}-G(x)] of G(x)G(x). It can be expressed as

det[λImG(x)]=λm+i=1m(1)igi(x)λmi.det[\lambda I_{m}-G(x)]=\lambda^{m}+\sum\limits_{i=1}^{m}(-1)^{i}g_{i}(x)\lambda^{m-i}.

Since G(x)G(x) is symmetric, all eigenvalues are real, and det[λImG(x)]det[\lambda I_{m}-G(x)] has only real roots. We know that G(x)0G(x)\succeq 0 if and only if gi(x)0g_{i}(x)\geq 0 for i=1,,mi=1,\dots,m. Note that g1(x)=trace(G(x))g_{1}(x)=\operatorname{trace}(G(x)), which is always in QM[G]\mbox{QM}[G]. Hence, if the polynomials g2(x),,gm(x)QM[G]g_{2}(x),\dots,g_{m}(x)\in\mbox{QM}[G], the quantity θ(m)\theta(m) and the degree 3m1dG3^{m-1}d_{G} as stated in Theorem 4.1 can be improved to mm and mdGmd_{G}, respectively. However, g2(x),,gm(x)g_{2}(x),\dots,g_{m}(x) typically do not belong to QM[G]\mbox{QM}[G].
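To make the remark concrete, the following sympy sketch (with an illustrative 2-by-2 matrix of our own choosing) extracts g1g_{1}, g2g_{2} from the characteristic polynomial and checks that g1=trace(G)g_{1}=\operatorname{trace}(G), g2=det(G)g_{2}=\det(G), and that G(x)0G(x)\succeq 0 exactly where both are nonnegative.

```python
import sympy as sp

x, lam = sp.symbols('x lambda')

# A sample 1-variable, 2-by-2 symmetric polynomial matrix (illustrative).
G = sp.Matrix([[1 - x ** 2, x], [x, 2 - x ** 2]])

# det(lam*I - G) = lam^2 - g1(x)*lam + g2(x) with g1 = trace, g2 = det.
p = sp.expand((lam * sp.eye(2) - G).det())
g1 = -p.coeff(lam, 1)
g2 = p.coeff(lam, 0)
assert sp.expand(g1 - G.trace()) == 0
assert sp.expand(g2 - G.det()) == 0

# G(x0) is PSD exactly when g1(x0) >= 0 and g2(x0) >= 0 (m = 2 case).
for x0 in [sp.Rational(i, 4) for i in range(-8, 9)]:
    psd = G.subs(x, x0).is_positive_semidefinite
    assert bool(psd) == bool((g1.subs(x, x0) >= 0) and (g2.subs(x, x0) >= 0))
print("characteristic-polynomial description verified on sample points")
```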

5 Degree bounds for matrix Positivstellensätze

The main goal of this section is to prove Theorems 1.1 and 1.2. As a byproduct, we obtain a polynomial bound on the convergence rate of matrix SOS relaxations for solving polynomial matrix optimization, which resolves an open question raised by Dinh and Pham [10]. Let

(5.1) K={xnG(x)0},K=\{x\in\mathbb{R}^{n}\mid G(x)\succeq 0\},

where G(x)G(x) is an mm-by-mm symmetric polynomial matrix.

5.1 Proof of Theorem 1.1

For the polynomial matrix G(x)G(x), let d1,,dθ(m)d_{1},\dots,d_{\theta(m)} be the polynomials as in Theorem 4.1. Note that we assume

K{xn1x220}.K\subseteq\{x\in\mathbb{R}^{n}\mid 1-\|x\|_{2}^{2}\geq 0\}.

Denote

(5.2) 𝒢(x)=min(d1(x)d1B,,dθ(m)(x)dθ(m)B,0).\mathcal{G}(x)=-\min\left(\frac{d_{1}(x)}{\left\|d_{1}\right\|_{B}},\ldots,\frac{d_{\theta(m)}(x)}{\left\|d_{\theta(m)}\right\|_{B}},0\right).

If 𝒢(x)=0\mathcal{G}(x)=0, we have that xKx\in K, i.e., d(x,K)=0d(x,K)=0. By Corollary 2.2, there exist positive constants η\eta, κ\kappa such that the following Łojasiewicz inequality holds:

(5.3) d(x,K)ηκ𝒢(x),xΔ¯.d(x,K)^{\eta}\leq\kappa\mathcal{G}(x),~{}\forall x\in\bar{\Delta}.

In the following, we give the proof of Theorem 1.1.

Theorem 5.1.

Let KK be as in (5.1) with m2m\geq 2. Suppose that F(x)F(x) is an \ell-by-\ell symmetric polynomial matrix with deg(F)=d\deg(F)=d, and (1x22)IQM[G](1-\|x\|_{2}^{2})\cdot I_{\ell}\in\mbox{QM}[G]^{\ell}. If F(x)FminI0F(x)\succeq F_{\min}\cdot I_{\ell}\succ 0 on KK, then FQM[G]kF\in\mbox{QM}[G]_{k}^{\ell} for

(5.4) kC87η36(m1)θ(m)3n2dG6κ7d14η(FBFmin)7η+3.k\geq C\cdot{8}^{7\eta}\cdot 3^{6(m-1)}\theta(m)^{3}n^{2}d_{G}^{6}\kappa^{7}d^{14\eta}\left(\frac{\|F\|_{B}}{F_{\min}}\right)^{7\eta+3}.

In the above, κ,η\kappa,\eta are respectively the Łojasiewicz constant and exponent in (5.3) and C>0C>0 is a constant independent of FF, GG. Furthermore, the Łojasiewicz exponent can be estimated as

η(3m1dG+1)(3mdG)n+θ(m)2.\eta\leq(3^{m-1}d_{G}+1)(3^{m}d_{G})^{n+\theta(m)-2}.

Proof 5.2.

By Theorem 4.1, we know that d1,,dθ(m)QM[G]3m1dGd_{1},\dots,d_{\theta(m)}\in\mbox{QM}[G]_{3^{m-1}d_{G}} and

K={xnd1(x)0,,dθ(m)(x)0}.K=\{x\in\mathbb{R}^{n}\mid d_{1}(x)\geq 0,\dots,d_{\theta(m)}(x)\geq 0\}.

It follows from Theorem 3.5 that FQM[D]kF\in QM[D]_{k}^{\ell}, where DD is the diagonal matrix with diagonal entries d1,,dθ(m)d_{1},\ldots,d_{\theta(m)}, 1x221-\|x\|_{2}^{2}, for

kC87η36(m1)θ(m)3n2dG6κ7d14η(FBFmin)7η+3.k\geq C\cdot{8}^{7\eta}\cdot 3^{6(m-1)}\theta(m)^{3}n^{2}d_{G}^{6}\kappa^{7}d^{14\eta}\left(\frac{\|F\|_{B}}{F_{\min}}\right)^{7\eta+3}.

Equivalently, there exist SOS polynomial matrices SiΣ[x]S_{i}\in\Sigma[x]^{\ell} (i=0,,θ(m)+1i=0,\dots,\theta(m)+1) with deg(S0)k\deg(S_{0})\leq k, deg(diSi)k(i=1,,θ(m))\deg(d_{i}S_{i})\leq k~{}(i=1,\dots,\theta(m)), deg(Sθ(m)+1)k2\deg(S_{\theta(m)+1})\leq k-2, such that

F=S0+d1S1++dθ(m)Sθ(m)+(1x22)Sθ(m)+1.F=S_{0}+d_{1}S_{1}+\cdots+d_{\theta(m)}S_{\theta(m)}+(1-\|x\|_{2}^{2})S_{\theta(m)+1}.

Suppose that Si=j=1sivi,jvi,jTS_{i}=\sum\limits_{j=1}^{s_{i}}v_{i,j}v_{i,j}^{T} for vi,j[x]k/2×1v_{i,j}\in\mathbb{R}[x]_{\lceil k/2\rceil}^{\ell\times 1} and di=σi+t=1riqi,tTGqi,td_{i}=\sigma_{i}+\sum\limits_{t=1}^{r_{i}}q_{i,t}^{T}Gq_{i,t} for some σiΣ[x]3m1dG1\sigma_{i}\in\Sigma[x]_{3^{m-1}d_{G}}^{1}, qi,t[x]3m1dG/2dG/2m×1q_{i,t}\in\mathbb{R}[x]_{\lceil 3^{m-1}d_{G}/2-d_{G}/2\rceil}^{m\times 1} . Then, we know that

diSi=σiSi+j=1sit=1rivi,jqi,tTGqi,tvi,jT,d_{i}S_{i}=\sigma_{i}S_{i}+\sum\limits_{j=1}^{s_{i}}\sum\limits_{t=1}^{r_{i}}v_{i,j}q_{i,t}^{T}Gq_{i,t}v_{i,j}^{T},

which implies that diSiQM[G]k+3m1dGd_{i}S_{i}\in\mbox{QM}[G]_{k+3^{m-1}d_{G}}^{\ell}. Since (1x22)IQM[G](1-\|x\|_{2}^{2})\cdot I_{\ell}\in\mbox{QM}[G]^{\ell}, we know that 1x22QM[G]k01-\|x\|_{2}^{2}\in\mbox{QM}[G]_{k_{0}} for some k0k_{0}\in\mathbb{N}. Then, we know that (1x22)Sθ(m)+1QM[G]k+k0(1-\|x\|_{2}^{2})S_{\theta(m)+1}\in\mbox{QM}[G]_{k+k_{0}}^{\ell}. Let CC be sufficiently large. Then, we can drop the small terms 3m1dG3^{m-1}d_{G}, k0k_{0}, and thus FQM[G]kF\in\mbox{QM}[G]_{k}^{\ell} for kk satisfying (5.4). Since d1,,dθ(m)QM[G]3m1dGd_{1},\dots,d_{\theta(m)}\in\mbox{QM}[G]_{3^{m-1}d_{G}} and

K={xnd1(x)d1B0,,dθ(m)(x)dθ(m)B0},K=\left\{x\in\mathbb{R}^{n}\mid\frac{d_{1}(x)}{\left\|d_{1}\right\|_{B}}\geq 0,\dots,\frac{d_{\theta(m)}(x)}{\left\|d_{\theta(m)}\right\|_{B}}\geq 0\right\},

it follows from Theorem 2.4 that η(3m1dG+1)(3mdG)n+θ(m)2\eta\leq(3^{m-1}d_{G}+1)(3^{m}d_{G})^{n+\theta(m)-2}.

Remark. When G(x)G(x) is a diagonal polynomial matrix with diagonal entries g1,,gm[x]g_{1},\dots,g_{m}\in\mathbb{R}[x], i.e.,

K={xng1(x)0,,gm(x)0},K=\{x\in\mathbb{R}^{n}\mid g_{1}(x)\geq 0,\dots,g_{m}(x)\geq 0\},

Theorem 5.1 implies that for kk greater than the quantity as in (5.4), we have FQM[G]kF\in\mbox{QM}[G]_{k}^{\ell}. However, the bound given in (3.6) from Theorem 3.5 is generally smaller than that in (5.4). Additionally, the proof of Theorem 5.1 builds upon Theorem 3.5.

5.2 Proof of Theorem 1.2

For a polynomial matrix F=\sum_{\alpha\in\mathbb{N}^{n}_{d}}F_{\alpha}x^{\alpha} with \deg(F)=d, F^{hom} denotes its highest-degree homogeneous part, i.e., F^{hom}(x)=\sum_{\alpha\in\mathbb{N}^{n},|\alpha|=d}F_{\alpha}x^{\alpha}. The homogenization of F is denoted by \widetilde{F}, i.e., \widetilde{F}(\tilde{x})=\sum_{\alpha\in\mathbb{N}^{n}_{d}}F_{\alpha}x^{\alpha}x_{0}^{d-|\alpha|} for \tilde{x}:=(x_{0},x). Let

(5.5) \widetilde{K}:=\left\{\tilde{x}\in\mathbb{R}^{n+1}\,\middle|\,\begin{array}{l}(x_{0})^{d_{0}}\widetilde{G}(\tilde{x})\succeq 0,\\ x_{0}^{2}+x^{T}x=1\end{array}\right\},

where d_{0}=2\lceil d_{G}/2\rceil-d_{G}. Consider the optimization

(5.6) \left\{\begin{array}{rl}\min&\lambda_{\min}(\widetilde{F}(\tilde{x}))\\ \mathit{s.t.}&(x_{0})^{d_{0}}\widetilde{G}(\tilde{x})\succeq 0,\\ &x_{0}^{2}+x^{T}x=1,\end{array}\right.

where \lambda_{\min}(\widetilde{F}(\tilde{x})) is the smallest eigenvalue of \widetilde{F}(\tilde{x}). Let \widetilde{F}_{\min} denote the optimal value of (5.6).
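As a small illustration of these definitions (not code from the paper), the homogenization \widetilde{F} and the highest-degree part F^{hom} of a toy 2-by-2 polynomial matrix can be computed symbolically; the matrix F below is a hypothetical example in one variable:

```python
import sympy as sp

# Homogenization sketch for a toy 2x2 polynomial matrix in one variable x1.
x0, x1 = sp.symbols('x0 x1')
F = sp.Matrix([[x1**2 + 1, x1], [x1, 2]])       # deg(F) = d = 2
d = 2

def homogenize(entry):
    # Multiply each monomial c*x1^k by x0^(d-k).
    p = sp.Poly(entry, x1)
    return sum(c * x1**k * x0**(d - k) for (k,), c in p.terms())

F_tilde = F.applyfunc(homogenize)               # the homogenization F~
F_hom = F_tilde.subs(x0, 0)                     # highest-degree part F^hom

# F~(1, x) recovers F(x), and F~(x0, x) = x0^d * F(x/x0) for x0 != 0.
assert sp.simplify(F_tilde.subs(x0, 1) - F) == sp.zeros(2, 2)
assert sp.simplify(F_tilde - x0**d * F.subs(x1, x1/x0)) == sp.zeros(2, 2)
```

Here F^{hom} keeps only the entries of total degree d (the off-diagonal x_{1} and the constant 2 drop out), which is exactly why F^{hom}\succ 0 is a separate assumption from F\succ 0 on K.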

Lemma 5.3.

Let K be as in (5.1). Suppose that F(x) is an \ell-by-\ell symmetric polynomial matrix with \deg(F)=d. If F\succ 0 on K and F^{hom}\succ 0 on \mathbb{R}^{n}, then \widetilde{F}_{\min}>0 and \widetilde{F}\succeq\widetilde{F}_{\min}\cdot I_{\ell} on \widetilde{K}.

Proof 5.4.

Suppose \tilde{x}=(x_{0},x)\in\widetilde{K}. If x_{0}=0, then we have \widetilde{F}(\tilde{x})=F^{hom}(x)\succ 0. If x_{0}\neq 0, then

G(x/x_{0})=(x_{0})^{-d_{G}-d_{0}}(x_{0})^{d_{0}}\widetilde{G}(\tilde{x})\succeq 0,

which implies that x/x_{0}\in K. Since F^{hom}\succ 0 on \mathbb{R}^{n}, the degree d of F must be even, and we have

\widetilde{F}(\tilde{x})=x_{0}^{d}F(x/x_{0})\succ 0.

Since \widetilde{K} is compact, the optimal value of (5.6) is positive, i.e., \widetilde{F}_{\min}>0.

For the polynomial matrix (x_{0})^{d_{0}}\widetilde{G}(\tilde{x}), let d_{1},\dots,d_{\theta(m)} be the polynomials as in Theorem 4.1. Denote

(5.7) \mathcal{G}(\tilde{x})=-\min\left(\frac{d_{1}(\tilde{x})}{\left\|d_{1}\right\|_{B}},\ldots,\frac{d_{\theta(m)}(\tilde{x})}{\left\|d_{\theta(m)}\right\|_{B}},\frac{1-\|\tilde{x}\|_{2}^{2}}{\|1-\|\tilde{x}\|_{2}^{2}\|_{B}},\frac{\|\tilde{x}\|_{2}^{2}-1}{\|1-\|\tilde{x}\|_{2}^{2}\|_{B}},0\right).

Note that \mathcal{G}(\tilde{x})=0 implies that \tilde{x}\in\widetilde{K}. By Corollary 2.2, there exist positive constants \eta, \kappa such that for all \tilde{x} satisfying \sqrt{n+1}-x_{0}-x_{1}-\cdots-x_{n}\geq 0,\ 1+x_{0}\geq 0,\ 1+x_{1}\geq 0,\ \ldots,\ 1+x_{n}\geq 0, it holds that

(5.8) d(\tilde{x},\widetilde{K})^{\eta}\leq\kappa\mathcal{G}(\tilde{x}).

In the following, we give the proof of Theorem 1.2, which provides a polynomial bound on the complexity of matrix Putinar–Vasilescu’s Positivstellensätz.

Theorem 5.5.

Let K be as in (5.1) with m\geq 2. Suppose that F(x) is an \ell-by-\ell symmetric polynomial matrix with \deg(F)=d. If F\succ 0 on K and F^{hom}\succ 0 on \mathbb{R}^{n}, then (1+\|x\|_{2}^{2})^{k}F\in\mbox{QM}[G]_{2k+d}^{\ell} for

(5.9) k\geq C\cdot 8^{7\eta}\cdot 3^{6(m-1)}(\theta(m)+2)^{3}(n+1)^{2}(\lceil d_{G}/2\rceil)^{6}\kappa^{7}d^{14\eta}\left(\frac{\|\widetilde{F}\|_{B}}{\widetilde{F}_{\min}}\right)^{7\eta+3}.

In the above, \kappa, \eta are respectively the Łojasiewicz constant and exponent in (5.8), C>0 is a constant independent of F and G, and \widetilde{F}_{\min} is the optimal value of (5.6). Furthermore, the Łojasiewicz exponent can be estimated as

\eta\leq(2\cdot 3^{m-1}\lceil d_{G}/2\rceil+1)(2\cdot 3^{m}\lceil d_{G}/2\rceil)^{n+\theta(m)-1}.

Proof 5.6.

By Lemma 5.3, we have that \widetilde{F}\succeq\widetilde{F}_{\min}\cdot I_{\ell}\succ 0 on \widetilde{K}. Note that \mbox{QM}[1-\|\tilde{x}\|^{2}]^{\ell} is Archimedean, and \widetilde{K} can be equivalently described by

d_{1}(\tilde{x})\geq 0,\dots,d_{\theta(m)}(\tilde{x})\geq 0,\quad 1-\|\tilde{x}\|_{2}^{2}\geq 0,\quad\|\tilde{x}\|_{2}^{2}-1\geq 0.

By following the proof of Theorem 1.1 for \widetilde{F}(\tilde{x}), we know that for k satisfying (5.9), there exist S\in\Sigma[\tilde{x}]^{\ell}, H\in\mathbb{R}[\tilde{x}]^{\ell\times\ell}, P_{i}\in\mathbb{R}[\tilde{x}]^{m\times\ell} (i=1,\dots,t) with

\deg(S)\leq k,\quad\deg(H)\leq k-2,\quad\deg(P_{i}^{T}(x_{0})^{d_{0}}\widetilde{G}P_{i})\leq k\quad(i=1,\dots,t),

such that

\widetilde{F}(\tilde{x})=S(\tilde{x})+(\|\tilde{x}\|_{2}^{2}-1)\cdot H(\tilde{x})+\sum\limits_{i=1}^{t}P_{i}(\tilde{x})^{T}(x_{0})^{d_{0}}\widetilde{G}(\tilde{x})P_{i}(\tilde{x}).

Replacing \tilde{x} by (1,x)/\sqrt{1+\|x\|_{2}^{2}} in the above identity, we get that

(5.10) \frac{F(x)}{\sqrt{1+\|x\|_{2}^{2}}^{\,d}}=\sum\limits_{i=1}^{t}P_{i}\left(\frac{(1,x)}{\sqrt{1+\|x\|_{2}^{2}}}\right)^{T}\frac{G(x)}{\sqrt{1+\|x\|_{2}^{2}}^{\,d_{0}+d_{G}}}P_{i}\left(\frac{(1,x)}{\sqrt{1+\|x\|_{2}^{2}}}\right)+S\left(\frac{(1,x)}{\sqrt{1+\|x\|_{2}^{2}}}\right).

Suppose that S=V^{T}V for some polynomial matrix V. Then, there exist polynomial matrices V^{(1)},V^{(2)} with 2\deg(V^{(1)})\leq\deg(S), 2\deg(V^{(2)})+2\leq\deg(S) such that

V\left(\frac{(1,x)}{\sqrt{1+\|x\|_{2}^{2}}}\right)=\left(\frac{1}{\sqrt{1+\|x\|_{2}^{2}}}\right)^{\deg(V)}\left(V^{(1)}+\sqrt{1+\|x\|_{2}^{2}}\,V^{(2)}\right).

It holds that

S\left(\frac{(1,x)}{\sqrt{1+\|x\|_{2}^{2}}}\right)=\frac{(V^{(1)})^{T}V^{(1)}+(1+\|x\|_{2}^{2})(V^{(2)})^{T}V^{(2)}}{\sqrt{1+\|x\|_{2}^{2}}^{\,\deg(S)}}+\frac{(V^{(2)})^{T}V^{(1)}+(V^{(1)})^{T}V^{(2)}}{\sqrt{1+\|x\|_{2}^{2}}^{\,\deg(S)-1}}.
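The decomposition into V^{(1)} and V^{(2)} simply groups the monomials of V by the parity of \deg(V)-k, where k is the monomial degree: even leftover powers of \sqrt{1+\|x\|_{2}^{2}} stay polynomial, odd ones contribute exactly one extra square-root factor. A scalar sanity check of this grouping, with a hypothetical stand-in polynomial V in one variable, is:

```python
import sympy as sp

x = sp.symbols('x')
s = sp.sqrt(1 + x**2)
V = sp.Poly(3*x**3 + x**2 - 2*x + 5, x)     # scalar stand-in for V; deg(V) = 3
D = V.degree()

# Group monomials by the parity of D - k.
V1 = sum(c * x**k * (1 + x**2)**((D - k)//2)
         for (k,), c in V.terms() if (D - k) % 2 == 0)
V2 = sum(c * x**k * (1 + x**2)**((D - k - 1)//2)
         for (k,), c in V.terms() if (D - k) % 2 == 1)

lhs = V.as_expr().subs(x, x/s)              # V evaluated at x/sqrt(1+x^2)
rhs = s**(-D) * (V1 + s*V2)                 # the claimed split
for xv in (0.3, -1.2, 2.0):
    assert abs(float((lhs - rhs).subs(x, xv))) < 1e-9
```

The same grouping applied componentwise to the matrix V (in the variables \tilde{x}) yields the polynomial matrices V^{(1)},V^{(2)} with the stated degree bounds.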

Similarly, there exist polynomial matrices P_{i}^{(1)},P_{i}^{(2)} with \deg(P_{i}^{(1)})\leq\deg(P_{i}), \deg(P_{i}^{(2)})+1\leq\deg(P_{i}) satisfying

P_{i}\left(\frac{(1,x)}{\sqrt{1+\|x\|_{2}^{2}}}\right)=\left(\frac{1}{\sqrt{1+\|x\|_{2}^{2}}}\right)^{\deg(P_{i})}\left(P_{i}^{(1)}+\sqrt{1+\|x\|_{2}^{2}}\,P_{i}^{(2)}\right).

Then, we have that

\begin{array}{l}\sum\limits_{i=1}^{t}P_{i}\left(\frac{(1,x)}{\sqrt{1+\|x\|_{2}^{2}}}\right)^{T}\frac{G(x)}{\sqrt{1+\|x\|_{2}^{2}}^{\,d_{0}+d_{G}}}P_{i}\left(\frac{(1,x)}{\sqrt{1+\|x\|_{2}^{2}}}\right)\\ \quad=\sum\limits_{i=1}^{t}\left(\frac{(P_{i}^{(1)})^{T}G(x)P_{i}^{(1)}+(1+\|x\|_{2}^{2})(P_{i}^{(2)})^{T}G(x)P_{i}^{(2)}}{\sqrt{1+\|x\|_{2}^{2}}^{\,d_{0}+d_{G}+2\deg(P_{i})}}+\frac{(P_{i}^{(2)})^{T}G(x)P_{i}^{(1)}+(P_{i}^{(1)})^{T}G(x)P_{i}^{(2)}}{\sqrt{1+\|x\|_{2}^{2}}^{\,d_{0}+d_{G}+2\deg(P_{i})-1}}\right).\end{array}

Combining these results with equation (5.10), we know that for kk satisfying (5.9), the following holds:

(1+\|x\|_{2}^{2})^{k}F=Q+\sqrt{1+\|x\|_{2}^{2}}\,H(x),

for some Q\in\mbox{QM}[G]^{\ell} with \deg(Q)\leq 2k+d and a symmetric polynomial matrix H. Since \sqrt{1+\|x\|_{2}^{2}} is not a rational function, we must have H=0, so (1+\|x\|_{2}^{2})^{k}F=Q, which implies (1+\|x\|_{2}^{2})^{k}F\in\mbox{QM}[G]^{\ell}_{2k+d}.

Since d_{1},\dots,d_{\theta(m)}\in\mbox{QM}[G]_{2\cdot 3^{m-1}\lceil d_{G}/2\rceil} and

\widetilde{K}=\left\{\tilde{x}\in\mathbb{R}^{n+1}\mid\frac{d_{1}(\tilde{x})}{\left\|d_{1}\right\|_{B}}\geq 0,\dots,\frac{d_{\theta(m)}(\tilde{x})}{\left\|d_{\theta(m)}\right\|_{B}}\geq 0,\ 1-\|\tilde{x}\|_{2}^{2}=0\right\},

it follows from Theorem 2.4 that \eta\leq(2\cdot 3^{m-1}\lceil d_{G}/2\rceil+1)(2\cdot 3^{m}\lceil d_{G}/2\rceil)^{n+\theta(m)-1}.

Proof of Corollary 1.3. Note that F\succeq 0 on K, and we have

d+2\geq\deg\left(F+\varepsilon\cdot(1+\|x\|_{2}^{2})^{\lceil\frac{d+1}{2}\rceil}I_{\ell}\right)=2\lceil\tfrac{d+1}{2}\rceil>d.

Hence, for \tilde{x}\in\widetilde{K}, the homogenization of F+\varepsilon\cdot(1+\|x\|_{2}^{2})^{\lceil\frac{d+1}{2}\rceil}I_{\ell} satisfies

x_{0}^{2\lceil\frac{d+1}{2}\rceil-d}\widetilde{F}+\varepsilon\cdot(x_{0}^{2}+\|x\|_{2}^{2})^{\lceil\frac{d+1}{2}\rceil}I_{\ell}\succeq\varepsilon\cdot(x_{0}^{2}+\|x\|_{2}^{2})^{\lceil\frac{d+1}{2}\rceil}I_{\ell}\succeq\varepsilon\cdot I_{\ell}.

The conclusion follows from Theorem 5.5.

5.3 Convergence rate of matrix SOS relaxations

Consider the polynomial matrix optimization

(5.11) \left\{\begin{array}{rl}\min&f(x)\\ \mathit{s.t.}&G(x)\succeq 0,\end{array}\right.

where f(x)\in\mathbb{R}[x] and G(x) is an m\times m symmetric polynomial matrix. We denote the optimal value of (5.11) by f_{\min}.

A standard approach for solving (5.11) globally is the sum-of-squares hierarchy introduced in [16, 28, 48]. This hierarchy consists of a sequence of semidefinite programming relaxations. Recall that for a degree k, the kth degree truncation of \mbox{QM}[G] is

\mbox{QM}[G]_{k}:=\left\{\sigma+\sum_{i=1}^{t}v_{i}^{T}Gv_{i}\ \middle|\ \begin{array}{l}\sigma\in\Sigma[x],\ v_{i}\in\mathbb{R}[x]^{m},\ t\in\mathbb{N},\\ \deg(\sigma)\leq k,\ \deg(v_{i}^{T}Gv_{i})\leq k\end{array}\right\}.
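Given certificate data (\sigma,v_{i}) for a claimed membership f\in\mbox{QM}[G]_{k}, the defining identity can be sanity-checked numerically; in particular, f must be nonnegative wherever G\succeq 0. The matrix G, the SOS polynomial \sigma, and the vector v below are hand-picked toy data, not from the paper:

```python
import numpy as np

# Toy evaluation of a QM[G]_k certificate f = sigma + v^T G v.
def G(x):
    return np.array([[1.0 - x**2, x], [x, 1.0]])   # 2x2 symmetric polynomial matrix

def sigma(x):
    return (x - 0.5)**2                             # an SOS polynomial

def v(x):
    return np.array([1.0, x])                       # v in R[x]^2

def f(x):
    return sigma(x) + v(x) @ G(x) @ v(x)

# On K = {x : G(x) >= 0}, every value of f must be nonnegative.
for xv in np.linspace(-1, 1, 11):
    if np.linalg.eigvalsh(G(xv)).min() >= 0:
        assert f(xv) >= 0
```

Finding such certificate data (rather than merely checking one) is the semidefinite program mentioned next: the Gram matrix of \sigma and the coefficients of the v_{i} become the semidefinite variables.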

Checking whether a polynomial belongs to the quadratic module \mbox{QM}[G]_{k} can be done by solving a semidefinite program (see [48]). For an order k, the kth order SOS relaxation of (5.11) is

(5.12) \left\{\begin{array}{cl}\max&\gamma\\ \text{s.t.}&f-\gamma\in\mbox{QM}[G]_{2k}.\end{array}\right.

Let f_{k} denote the optimal value of (5.12). The monotonicity relation f_{k}\leq f_{k+1}\leq\cdots\leq f_{\min} holds. The hierarchy (5.12) is said to have asymptotic convergence if f_{k}\rightarrow f_{\min} as k\rightarrow\infty, and it is said to have finite convergence if f_{k}=f_{\min} for all k big enough. When the quadratic module \mbox{QM}[G] is Archimedean, it was shown in [16, 28, 48] that the hierarchy (5.12) converges asymptotically. Recently, Huang and Nie [24] proved that this hierarchy also has finite convergence if, in addition, the nondegeneracy condition, strict complementarity condition and second order sufficient condition hold at every global minimizer of (5.11). In the following, we provide a polynomial bound on the convergence rate of the hierarchy (5.12).

Theorem 5.7.

Suppose f\in\mathbb{R}[x] with d=\deg(f), m\geq 2, and 1-\|x\|_{2}^{2}\in\mbox{QM}[G]. Then we have that

(5.13) |f_{k}-f_{\min}|\leq 3\cdot\left(C\cdot 8^{7\eta}\cdot 3^{6(m-1)}\theta(m)^{3}n^{2}d_{G}^{6}\kappa^{7}d^{14\eta}\right)^{\frac{1}{7\eta+3}}\|f\|_{B}\left(\frac{1}{k}\right)^{\frac{1}{7\eta+3}},

where \kappa, \eta are respectively the Łojasiewicz constant and exponent in (5.3), and C>0 is a constant independent of f and G. Furthermore, the Łojasiewicz exponent can be estimated as

\eta\leq(3^{m-1}d_{G}+1)(3^{m}d_{G})^{n+\theta(m)-2}.

Proof 5.8.

Since f_{\min}\leq\|f\|_{B} (see [13]), we have that for 0<\varepsilon\leq\|f\|_{B},

\|f-f_{\min}+\varepsilon\|_{B}\leq\|f\|_{B}+|f_{\min}|+\varepsilon\leq 3\|f\|_{B}.

Since f-f_{\min}+\varepsilon\geq\varepsilon on K, it follows from Theorem 1.1 that f-f_{\min}+\varepsilon\in\mbox{QM}[G]_{k} for

k\geq C\cdot 8^{7\eta}\cdot 3^{6(m-1)}\theta(m)^{3}n^{2}d_{G}^{6}\kappa^{7}d^{14\eta}\left(\frac{3\|f\|_{B}}{\varepsilon}\right)^{7\eta+3},

and \eta\leq(3^{m-1}d_{G}+1)(3^{m}d_{G})^{n+\theta(m)-2}. Then, we know that f_{\min}-\varepsilon is feasible for (5.12), i.e., f_{k}\geq f_{\min}-\varepsilon when k satisfies the above inequality. Equivalently, the relation (5.13) holds.
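Explicitly, the rate (5.13) is obtained by inverting the degree bound above. Writing A for the constant factor,

```latex
k \;\ge\; A\left(\frac{3\|f\|_{B}}{\varepsilon}\right)^{7\eta+3}
\quad\Longleftrightarrow\quad
\varepsilon \;\ge\; 3\|f\|_{B}\left(\frac{A}{k}\right)^{\frac{1}{7\eta+3}},
\qquad
A := C\cdot 8^{7\eta}\cdot 3^{6(m-1)}\theta(m)^{3}n^{2}d_{G}^{6}\kappa^{7}d^{14\eta}.
```

Taking \varepsilon equal to this lower bound yields f_{k}\geq f_{\min}-\varepsilon, i.e., |f_{k}-f_{\min}|\leq 3\|f\|_{B}(A/k)^{\frac{1}{7\eta+3}}, which is (5.13).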

6 Conclusions and discussions

In this paper, we establish a polynomial bound on the degrees of terms appearing in matrix Putinar’s Positivstellensätz. For the special case that the constraining polynomial matrix is diagonal, this result improves the exponential bound shown in [15]. As a byproduct, we also obtain a polynomial bound on the convergence rate of matrix SOS relaxations, which resolves an open question posed by Dinh and Pham [10]. When the constraining set is unbounded, a similar bound is provided for matrix Putinar–Vasilescu’s Positivstellensätz.

Theorem 1.1 uses the Łojasiewicz inequality (5.3) for equivalent scalar polynomial inequalities of K. There also exists a Łojasiewicz inequality for polynomial matrices (see [9, 10]), where the corresponding Łojasiewicz exponent is typically smaller than the scalar one in (5.3). An interesting future work is to provide a proof based directly on the Łojasiewicz inequality for G(x), which may yield a better bound. On the other hand, the set \{x\in\mathbb{R}^{n}\mid G(x)\succeq 0\} can be equivalently described by the quantified set \{x\in\mathbb{R}^{n}\mid y^{T}G(x)y\geq 0~\forall\,\|y\|_{2}=1\}. It would also be interesting to study the complexity of Putinar-type Positivstellensätze with universal quantifiers for this reformulation. We refer to [19] for recent work on Positivstellensätze with universal quantifiers.

When K is given by scalar polynomial inequalities, it was shown in [45] that the Łojasiewicz exponent \eta in (3.5) equals 1 if the linear independence constraint qualification holds at every x\in K. Consequently, we can get a representation whose order grows like \left(\frac{\|F\|_{B}}{F_{\min}}\right)^{10}. An analogue of the linear independence constraint qualification in nonlinear semidefinite programming is the nondegeneracy condition [52, 56]. It would also be interesting to explore whether similar results hold for polynomial matrix inequalities under the nondegeneracy condition.

Acknowledgements: The author would like to thank Prof. Konrad Schmüdgen for helpful discussions on Proposition 1, Prof. Tien-Son Pham for fruitful comments on this paper. The author also thanks the associate editor and three anonymous referees for valuable comments and suggestions.

References

  • [1] F. Bach and A. Rudi, Exponential convergence of sum-of-squares hierarchies for trigonometric polynomials, SIAM J. Optim., 11 (2023), pp. 796–817.
  • [2] L. Baldi and B. Mourrain, On the effective Putinar’s Positivstellensätz and moment approximation, Math. Program., 200 (2023), pp. 71–103.
  • [3] L. Baldi, B. Mourrain and A. Parusinski, On Łojasiewicz inequalities and the effective Putinar’s Positivstellensätz, J. Algebra, 662 (2025), pp. 741–767.
  • [4] L. Baldi and L. Slot, Degree bounds for Putinar’s Positivstellensätz on the hypercube, SIAM J. Appl. Algebra Geom., 8 (2023), pp. 1–25.
  • [5] J. Bochnak, M. Coste and M. Roy, Real Algebraic Geometry, Springer, Berlin, 1998.
  • [6] G. Chesi, LMI techniques for optimization over polynomials in control: a survey, IEEE Trans. Autom., 55 (2010), pp. 2500–2510.
  • [7] J. Cimpric, Real algebraic geometry for matrices over commutative rings, J. Algebra 359 (2012), pp. 89–103.
  • [8] E. De Klerk and M. Laurent, On the Lasserre hierarchy of semidefinite programming relaxations of convex polynomial optimization problems, SIAM J. Optim., 21 (2011), pp. 824–832.
  • [9] S. Dinh and T. Pham, Łojasiewicz-type inequalities with explicit exponents for the largest eigenvalue function of real symmetric polynomial matrices, Internat. J. Math., 27 (2016), 1650012.
  • [10] S. Dinh and T. Pham, Łojasiewicz inequalities with explicit exponents for smallest singular value functions, J. Complex., 41 (2017), pp. 58–71.
  • [11] T. Dinh, M. Ho and C. Le, Positivstellensätze for polynomial matrices, Positivity, 25 (2021), pp. 1295–1312.
  • [12] K. Fang and H. Fawzi, The sum-of-squares hierarchy on the sphere and applications in quantum information theory, Math. Program., 190 (2021), pp. 331–360.
  • [13] G. Farin, Curves and Surfaces for CAGD: A Practical Guide, 5th ed. Morgan Kaufmann Series in Computer Graphics and Geometric Modeling, San Francisco, 2021.
  • [14] H. Hà and T. Pham, Genericity in Polynomial Optimization, World Scientific Publishing, Singapore, 2017.
  • [15] J. Helton and J. Nie, Semidefinite representation of convex sets, Math. Program., 122 (2010), pp. 21–64.
  • [16] D. Henrion and J. Lasserre, Convergent relaxations of polynomial matrix inequalities and static output feedback, IEEE Trans. Autom. Control, 51 (2006), 192–202.
  • [17] D. Henrion and J. Lasserre, Inner approximations for polynomial matrix inequalities and robust stability, IEEE Trans. Autom. Control, 57 (2011), pp. 1456–1467.
  • [18] D. Henrion, M. Korda and J. Lasserre, The Moment-SOS Hierarchy: Lectures in Probability, Statistics, Computational Geometry, Control and Nonlinear PDEs, World Scientific, Cambridge, 2020.
  • [19] X. Hu, I. Klep and J. Nie, Positivstellensätze and moment problems with universal quantifiers, arXiv preprint, https://arxiv.org/abs/2401.12359, 2024.
  • [20] L. Huang, Optimality conditions for homogeneous polynomial optimization on the unit sphere, Optim Lett, 17 (2023), pp. 1263–1270.
  • [21] L. Huang, J. Nie and Y. Yuan, Homogenization for polynomial optimization with unbounded sets, Math. Program., 200 (2023), pp. 105–145.
  • [22] L. Huang, J. Nie, and Y. Yuan, Generalized truncated moment problems with unbounded sets, J. Sci. Comput., 95 (2023).
  • [23] L. Huang, J. Nie and Y. Yuan, Finite convergence of Moment-SOS relaxations with non-real radical ideals, SIAM J. Optim., 34 (2024), pp. 3399–3428.
  • [24] L. Huang and J. Nie, On the tightness of matrix Moment-SOS hierarchy, arXiv preprint, https://arxiv.org/abs/2403.17241, 2024.
  • [25] H. Ichihara and J. Lasserre, Optimal control for polynomial systems using matrix sum of squares relaxations, IEEE Trans. Autom. Control, 54 (2009), pp. 1048–1053.
  • [26] T. Jacobi and A. Prestel, Distinguished representations of strictly positive polynomials, J. Reine Angew. Math., 532 (2001), pp. 223–235.
  • [27] I. Klep and M. Schweighofer, Pure states, positive matrix polynomials and sums of hermitian squares, Indiana Univ. Math. J., 59 (2010), pp. 857–874.
  • [28] M. Kojima, Sums of squares relaxations of polynomial semidefinite programs, In: Research Reports on Mathematical and Computing Sciences Series B: Operations Research B-397, Tokyo Institute of Technology, 2003.
  • [29] M. Korda, V. Magron and Z. Rios-Zertuche, Convergence rates for sums-of-squares hierarchies with correlative sparsity, Math. Program., (2024), to appear.
  • [30] A. Kroo and S. Revesz, On Bernstein and Markov-type inequalities for multivariate polynomials on convex bodies, J Approx Theory, 99 (1999), pp. 134–152.
  • [31] J. Lasserre, Global optimization with polynomials and the problem of moments, SIAM J. Optim., 11 (2001), pp. 796–817.
  • [32] J. Lasserre, Convexity in semialgebraic geometry and polynomial optimization, SIAM J. Optim., 19 (2009), pp. 1995–2014.
  • [33] M. Laurent, Sums of squares, moment matrices and optimization over polynomials, In: Emerging Applications of Algebraic Geometry of IMA Volumes in Mathematics and its Applications, 149 (2009), pp. 157–270.
  • [34] S. Łojasiewicz, Sur le problème de la division, Studia Math., 18 (1959), pp. 87–136.
  • [35] N. Mai and V. Magron, On the complexity of Putinar–Vasilescu’s Positivstellensätz, J. Complexity, 72 (2022), 101663.
  • [36] M. Marshall, Representation of non-negative polynomials with finitely many zeros, Annales de la Faculte des Sciences Toulouse, 15 (2006), pp. 599–609.
  • [37] J. Nie, Optimality conditions and finite convergence of Lasserre’s hierarchy, Math. Program., 146 (2014), pp. 97–121.
  • [38] J. Nie, Moment and Polynomial Optimization, SIAM, Philadelphia, 2023.
  • [39] J. Nie and M. Schweighofer, On the complexity of Putinar’s Positivstellensätz, J. Complexity, 23 (2007), pp. 135–150.
  • [40] G. Polya, Uber positive Darstellung von Polynomen, Vierteljahresschrift der Naturforschenden Gesellschaft in Zurich, 73 (1928), pp. 141–145.
  • [41] V. Powers and B. Reznick, A new bound for Polya’s theorem with applications to polynomials positive on polyhedra, J. Pure Appl. Algebra, 164 (2001), pp. 221–229.
  • [42] M. Putinar, Positive polynomials on compact semi-algebraic sets, Indiana Univ. Math. J., 42 (1993), pp. 969–984.
  • [43] M. Putinar and F. Vasilescu, Positive polynomials on semi-algebraic sets, C. R. Acad. Sci. Ser. I Math., 328 (1999), pp. 585–589.
  • [44] M. Putinar and F. Vasilescu, Solving moment problems by dimensional extension, C. R. Acad. Sci. Ser. I Math., 328 (1999), pp. 495–499.
  • [45] S. Robinson, Stability theorems for systems of inequalities. Part II: Differentiable nonlinear systems, SIAM J. Numer. Anal., 13 (1976), pp. 497–513.
  • [46] C. Scheiderer, Distinguished representations of non-negative polynomials, J. Algebra, 289 (2005), pp. 558–573.
  • [47] C. Scherer, LMI relaxations in robust control, Eur. J. Control, 2 (2006), pp. 3–29.
  • [48] C. Scherer and C. Hol, Matrix sum-of-squares relaxations for robust semi-definite programs, Math. Program., 107 (2006), pp. 189–211.
  • [49] K. Schmüdgen, The K-moment problem for compact semi-algebraic sets, Math. Ann., 289 (1991), pp. 203–206.
  • [50] K. Schmüdgen, Noncommutative real algebraic geometry: some basic concepts and first ideas, In: Emerging Applications of Algebraic Geometry of IMA Volumes in Mathematics and its Applications, 149 (2009), pp. 325–350.
  • [51] M. Schweighofer, On the complexity of Schmüdgen’s Positivstellensätz, J. Complexity, 20 (2004), pp. 529–543.
  • [52] A. Shapiro, First and second order analysis of nonlinear semidefinite programs, Math. Program., 77 (1997), pp. 301–320.
  • [53] L. Slot, Sum-of-squares hierarchies for polynomial optimization and the Christoffel–Darboux kernel, SIAM J. Optim., 32 (2022), pp. 2612–2635.
  • [54] L. Slot and M. Laurent, An effective version of Schmüdgen’s Positivstellensätz for the hypercube, Optim Lett, 17 (2023), pp. 515–530.
  • [55] L. Slot and M. Laurent, Sum-of-squares hierarchies for binary polynomial optimization, Math. Program., 197 (2023), pp. 621–660.
  • [56] D. Sun, The strong second order sufficient condition and constraint nondegeneracy in nonlinear semidefinite programming and their implications, Math. Oper. Res., 31 (2006), pp. 761–776.
  • [57] M. Tyburec, J. Zeman, M. Kružík and D. Henrion, Global optimality in minimum compliance topology optimization of frames and shells by moment-sum-of-squares hierarchy, Struct Multidisc Optim, 64 (2021), pp. 1963–1981.
  • [58] Y. Zheng and G. Fantuzzi, Sum-of-squares chordal decomposition of polynomial matrix inequalities, Math. Program., 197 (2023), pp. 71–108.