This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\stackMath

Fourier sum of squares certificates

Jianting Yang KLMM, Academy of Mathematics and Systems Science,& University of Chinese Academy of Sciences [email protected] Ke Ye KLMM, Academy of Mathematics and Systems Science,& University of Chinese Academy of Sciences [email protected]  and  Lihong Zhi KLMM, Academy of Mathematics and Systems Science,& University of Chinese Academy of Sciences [email protected]
(Date: February 10, 2025)
Abstract.

The non-negativity of a function on a finite abelian group can be certified by its Fourier sum of squares (FSOS). In this paper, we propose a method of certifying the non-negativity of an integer-valued function by an FSOS certificate, which is defined to be an FSOS with a small error. We prove the existence of exponentially sparse polynomial and rational FSOS certificates and we provide two methods to validate them. As a consequence of the aforementioned existence theorems, we propose a semidefinite programming (SDP)-based algorithm to efficiently compute a sparse FSOS certificate. For applications, we consider certificate problems for maximum satisfiability (MAX-SAT) and maximum k-colorable subgraph (MkCS) and demonstrate our theoretical results and algorithm by numerical experiments.

Key words and phrases:
Fourier sum of squares, short certificate, semidefinite programming, approximation theory, numerical algorithm, MAX-SAT, MkCS
Ke Ye and Lihong Zhi are supported by the National Key Research Project of China 2018YFA0306702. Lihong Zhi is supported by the National Natural Science Foundation of China 12071467.

1. Introduction

The problem of sum of squares (SOS) dates back to Hilbert [Hil88]. It asks whether a non-negative polynomial can be written as a sum of squares of polynomials (resp. rational functions). The polynomial case is disproved by the well-known Motzkin’s polynomial [Mot67] while the rational function case, usually called the Hilbert’s 17th problem, is proved by Artin in his seminal work [Art27]. Based on various versions of positivstellensätze [Kri64, Sch91, Ste74], SOS becomes an essential ingredient in polynomial optimization [Las01, Par03, Lau09, PT20, Nie14]. In particular, it is used to certify the nonnegativity of a function on the hypercube Γ2n={1,1}n\Gamma_{2}^{n}=\{-1,1\}^{n}. The most concerned problem is the existence of low degree SOS certificate [BGP16, STKI17]. Numerous applications of SOS certificates can be found in combinatorics, including Knapsack problem[Gri01], Constrained Satisfaction Problem (CSP) [KMOW17], Empty Integral Hull (EIH) problem [KLM16], Min Knapsack (MK) problem [Kur19] and Planted Clique (PC) problem[MPW15, BHK+19]. Moreover, due to its diverse applications in various fields such as the interpretability of black-box models in machine learning [BLT21], accuracy certification [NOR10] and automated reasoning [HKM16, Sha94, Par00], constructing a certificate for a computational problem is at the core of computer science as well.

In [FSP16], the notion of Fourier sum of squares (FSOS) is proposed, which extends SOS on Γ2n\Gamma_{2}^{n} to arbitrary finite abelian groups. The main idea behind FSOS is that on a finite abelian group GG, its characters play the role of monomials in polynomial SOS. A graph theoretic framework to study FSOS is developed in [FSP16] and a quasi-linear algorithm for sparse FSOS is proposed in [YYZ22]. Although most combinatorial optimization problems can be modelled on the hypercube, there also exist many problems whose feasible domains are finite abelian groups of other types. For instance, the pigeon-hole principle can be certified by a sparse FSOS on nn+1\mathbb{Z}_{n}^{n+1} [YYZ22]; the maximum 3-colorable subgraph problem for the wheel graph with nn vertices can be certified by a sparse FSOS on Γ3n\Gamma_{3}^{n}, which is much sparser than that on the hypercube, if we re-model the problem accordingly (c.f. Section 6.3.1); we also prove in Section 6.3.2 that the maximum (n1)(n-1)-colorable subgraph problem for the complete graph with nn vertices can be certified by a sparse FSOS on Γnn1{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}\Gamma}_{n}^{n-1}.

In this paper, we consider the following two problems:

Problem 1.1 (computation of FSOS certificate).

Given an integer-valued function f:Gf:G\to\mathbb{Z} and an integer LL, how to certify fLf\geq L via FSOS?

Problem 1.2 (validation of FSOS certificate).

Given an integer-valued function f:Gf:G\to\mathbb{Z}, an integer LL and a pair of finite family of functions ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}), how to check whether ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) indeed certifies fLf\geq L?

1.1. Our contributions

Our solutions to Problems 1.1 and 1.2 are based on the following simple observation (c.f. Lemma 3.1):

fis non-negative onGminyG(f(y)jJ|gj(y)|2iI|hi(y)|2)>1,f\leavevmode\nobreak\ \text{is non-negative on}\leavevmode\nobreak\ G\iff\min_{y\in G}\left(f(y)-\frac{\sum_{j\in J}|g_{j}(y)|^{2}}{\sum_{i\in I}|h_{i}(y)|^{2}}\right)>-1,

where f:Gf:G\to\mathbb{Z} is an integer-valued function. As a consequence, we propose the notion of FSOS certificate in Definition 3.3, which can be regarded as an FSOS of a non-negative function with a small error. It is worth emphasizing that FSOS certificates in this paper are different from those defined in [FSP16]. Indeed, an FSOS certificate for f0f\geq 0 in [FSP16, MPW15, BGP16] requires

f=jJ|gj|2iI|hi|2,f=\frac{\sum_{j\in J}|g_{j}|^{2}}{\sum_{i\in I}|h_{i}|^{2}},

while our FSOS certificate allows a small error.

It should be noticed that the small error in our FSOS certificate does not impair its ability to certify the non-negativity of ff as ff is integer-valued. In fact, quite the contrary, the small error provides us an exponentially sparser certificate. For instance (cf. Theorem 2.10), it is proved that the non-negative function

f(x1,,xn)=(j=1nxjn2)(j=1nxjn21)f(x_{1},\dots,x_{n})=\left(\sum_{j=1}^{n}x_{j}-\left\lfloor\frac{n}{2}\right\rfloor\right)\left(\sum_{j=1}^{n}x_{j}-\left\lfloor\frac{n}{2}\right\rfloor-1\right)

on {0,1}n\{0,1\}^{n} has no polynomial or rational FSOS of degree less than n12\frac{n-1}{2}. However, we can construct an FSOS certificate for f0f\geq 0 of degree 11 and sparsity n+1n+1 if we allow an error of 1/41/4 (cf. Remark 3.19).

We address Problems 1.1 and 1.2, both theoretically and algorithmically. On the theoretical side, we study the existence of sparse (polynomial and rational) FSOS certificates for lower bounds of integer-valued functions on finite abelian groups. We informally summarize our main results below.

  • In Theorem 3.8, we prove that for any β[0,1)\beta\in[0,1) and a non-negative integer-valued function ff on GG such that fmax<(54β)fminf_{\max}<(5-4\beta)f_{\min}, there exists a sparse polynomial FSOS certificate for f[βfmin]f\geq[\beta f_{\min}] with degree d=deg(f)3+log(1αβ)fmax2+log(1β)αlog(1α)d=\deg(f)\left\lfloor\frac{3+\log(1-\alpha\beta)f_{\max}}{2+\log(1-\beta)\alpha-\log(1-\alpha)}\right\rfloor and with sparsity |supp(f)|d|\operatorname{supp}(f)|^{d}, where α=fminfmax\alpha=\frac{f_{\min}}{f_{\max}}.

  • We also prove that Theorem 3.8 is optimal in the sense of approximation theory in Theorems 3.11 and 3.14.

  • In Theorem 3.18, we prove that any non-negative integer-valued function ff on GG admits a sparse rational FSOS certificate for fLf\geq L (an integer), the degree of its denominator and numerator is bounded by d=O(deg(f)log(fmaxL)2)d=\operatorname{O}(\deg(f)\log(f_{\max}-L)^{2}).

For practical use, we propose algorithms for both the computation and the validation of FSOS certificates, which are briefly summarized in the following.

  • We present Algorithm 1 to compute a low degree rational FSOS certificate for fLf\geq L.

  • We propose two efficient methods to validate an FSOS certificate: validation by 1\ell^{1}-norm (15) and validation by sampling (16). Moreover, we justify the two methods by Proposition 4.2 and Theorem 4.6 respectively.

1.2. Organization of the paper

The paper is organized as follows. In Section 2 we present some essential definitions and facts in group theory and FSOS. We prove in Section 3 the existence of sparse polynomial and rational FSOS certificates, respectively. We also provide two simple methods to validate an FSOS certificate in Subsections 4.1 and 4.2. In Section 5, we present an efficient algorithm to compute a low degree FSOS certificate. As applications, we consider the certificate problem for MAX-SAT and maximum k-colorable subgraph (MkCS) problem in Section 6. Some numerical experiments are presented to demonstrate the correctness of our theorems and the efficiency of our algorithm.

2. Preliminaries

In this section, we recall some basic definitions and results in Fourier analysis on groups and Fourier sum of squares (FSOS). With the help of these mathematical tools, we are able to convert the certificate problem for lower bounds to the problem of FSOS.

2.1. Fourier analysis on groups

We briefly summarize fundamentals of group theory and representation theory in this subsection, which are necessary for the development of the paper. For more details, we refer interested readers to [Rud62, FH13].

Let GG be a finite abelian group. A nonzero complex valued function χ\chi on GG is called a character of GG if for any y,yGy,y^{\prime}\in G, it holds that

χ(yy)=χ(y)χ(y).\chi(yy^{\prime})=\chi(y)\chi(y^{\prime}).

We denote by G^\widehat{G} the set of all characters, which is called the dual group of GG. It is straightforward to verify that G^\widehat{G} is also a finite abelian group, with the group operation given by pointwise multiplication. Moreover, GG^G\simeq\widehat{G} as abelian groups.

According to the classification theorem of finite abelian groups, any finite abelian group GG is isomorphic to Γn1××Γnd\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}} for some positive integers 2n1nd2\leq n_{1}\leq\cdots\leq n_{d}. Here Γm\Gamma_{m} denotes the subgroup of the circle group 𝕊1\mathbb{S}^{1} consisting of points ei2lπm,0lm1e^{i\frac{2l\pi}{m}},0\leq l\leq m-1. For convenience, in this paper we do not distinguish isomorphic abelian groups and simply write G=Γn1××ΓndG=\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}}. For each α=(α1,,αd)n1××nd\alpha=(\alpha_{1},\dots,\alpha_{d})\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}, we define the character χα\chi_{\alpha} of GG to be the function χα(y)=yαy1α1ydαd\chi_{\alpha}(y)=y^{\alpha}\coloneqq y_{1}^{\alpha_{1}}\cdots y_{d}^{\alpha_{d}} where y=(y1,,yd)Gy=(y_{1},\dots,y_{d})\in G. Thus the dual group of GG is simply

G^={χα:αn1××nd}.\widehat{G}=\{\chi_{\alpha}:{\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}}\}.

By the representation theory of finite abelian groups, any function f:G=Γn1××Γndf:G=\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}}\to\mathbb{C} can be uniquely written as a linear combination of elements in G^\widehat{G} [FH13, Chapter 1]. More specifically, we have the Fourier expansion of ff:

(1) f=χαG^fαχα,f=\sum_{\chi_{\alpha}\in\widehat{G}}f_{\alpha}\chi_{\alpha},

where for each αn1××nd\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}, fαf_{\alpha} is the Fourier coefficient of ff at χαG^\chi_{\alpha}\in\widehat{G} defined by

(2) fα1|G|yGf(y)χα(y)¯=(j=1dnj)1yGf(y)yα.f_{\alpha}\coloneqq\frac{1}{|G|}\sum_{y\in G}f(y)\overline{\chi_{\alpha}(y)}=\left(\prod_{j=1}^{d}n_{j}\right)^{-1}\sum_{y\in G}f(y)y^{-\alpha}.

We also define f^:G^\widehat{f}:\widehat{G}\to\mathbb{C} by f^(χα)=fα\widehat{f}(\chi_{\alpha})=f_{\alpha}, αn1××nd\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}.

Since both ff and f^\widehat{f} are functions on finite sets, they are naturally identified with their evaluation vectors (f(y))yG(f(y))_{y\in G} and (f^(χα))χαG^(\widehat{f}(\chi_{\alpha}))_{\chi_{\alpha}\in\widehat{G}}, respectively. Thus we have the following norms associated to ff and f^\widehat{f}:

fmaxyG|f(y)|,f^1χαG^|f^(χα)|=αn1××nd|fα|.\lVert f\rVert_{\ell^{\infty}}\coloneqq\max_{y\in G}|f(y)|,\quad\lVert\widehat{f}\rVert_{\ell^{1}}\coloneqq\sum_{\chi_{\alpha}\in\widehat{G}}|\widehat{f}(\chi_{\alpha})|=\sum_{\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}}|f_{\alpha}|.

Moreover, we have the following inequalities:

(3) |fα|1|G|yG|f(y)|f,αn1××nd.\left|f_{\alpha}\right|\leq\frac{1}{|G|}\sum_{y\in G}\left|f(y)\right|\leq\lVert f\rVert_{\ell^{\infty}},\quad\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}.
(4) ff^1.\lVert f\rVert_{\ell^{\infty}}\leq\lVert\widehat{f}\rVert_{\ell^{1}}.

The support of ff is defined to be

supp(f){χα:fα0}.\operatorname{supp}(f)\coloneqq\left\{\chi_{\alpha}:f_{\alpha}\neq 0\right\}.

The sparsity of ff is defined to be the cardinality of supp(f)\operatorname*{supp}(f). We denote the sparsity of ff by sp(f)\operatorname{sp}(f).

Because of (1), we have the notion of degree of a function on G=Γn1××ΓndG=\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}}.

Definition 2.1 (degree).

The degree of ff as in (1) is defined by

deg(f)maxαn1××nd{j=1d|αj|:fα0}.\deg(f)\coloneqq\max_{\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}}\left\{\sum_{j=1}^{d}|\alpha_{j}|:f_{\alpha}\neq 0\right\}.

Here for each βn1××nd\beta\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}, we denote by βj\beta_{j} the jj-th element of β\beta, 1jd1\leq j\leq d.

It is worthy to mention that although every function ff on GG can be uniquely written as a Laurent polynomial (1), deg(f)\deg(f) in Definition 2.1 is not the same as the degree of the Laurent polynomial representing ff. As an example, we consider f(y)=y+y2f(y)=y+y^{-2} on Γ6\Gamma_{6}. According to Definition 2.1, we have deg(f)=2\deg(f)=2. However, as a Laurent polynomial, the degree of ff is 11.

Remark 2.2.

Two particularly interesting examples are ΓN\Gamma_{N} and Γ2nΓ2××Γ2n copies\Gamma_{2}^{n}\coloneqq\overbrace{\Gamma_{2}\times\cdots\times\Gamma_{2}}^{\text{$n$ copies}}. Clearly, a function on ΓN\Gamma_{N} is a univariate Laurent polynomial of degree at most N12\left\lceil\frac{N-1}{2}\right\rceil while a function on Γ2n\Gamma_{2}^{n} is a multilinear polynomial in nn variables, of degree at most nn. For these two examples, Definition 2.1 coincide with the one used in [FSP16]. Moreover, monomials of negative powers do appear in (1) in general, unless GΓ2nG\simeq\Gamma_{2}^{n} for some nn\in\mathbb{N}.

It is obvious from the definition that the degree of ff is at most j=1dnj12\sum_{j=1}^{d}\left\lceil\frac{n_{j}-1}{2}\right\rceil. Moreover, the sparsity of ff is controlled by the degree of ff. Namely, we have

sp(f)k=0deg(f)|{(m1,,md)n1××nd:|m1|++|md|=k}|.\operatorname{sp}(f)\leq\sum_{k=0}^{\deg(f)}\left\lvert\{(m_{1},\dots,m_{d})\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}:|m_{1}|+\cdots+|m_{d}|=k\}\right\rvert.

2.2. Fourier sum of squares on abelian groups

This subsection concerns with the theory of FSOS developed in [FSP16, STKI17, YYZ22]. We first recall the definition of Fourier sum of squares.

Definition 2.3 (polynomial FSOS).

Let ff be a non-negative function on GG and let SS be a subset of G^\widehat{G}. We say that ff admits a polynomial FSOS supported in SS if there exists a finite family {gj:supp(gj)S}jJ\{g_{j}:\operatorname*{supp}(g_{j})\subseteq S\}_{j\in J} of complex valued functions on GG such that

(5) f=jJ|gj|2.f=\sum_{j\in J}|g_{j}|^{2}.
Definition 2.4 (rational FSOS).

We say that ff admits a rational FSOS supported in SS if there exists a pair ({gj:supp(gj)S}jJ,{hi:supp(hi)S}iI)(\{g_{j}:\operatorname*{supp}(g_{j})\in S\}_{j\in J},\{h_{i}:\operatorname*{supp}(h_{i})\in S\}_{i\in I}) of finite families of complex valued functions on GG such that iI|hi|2\sum_{i\in I}|h_{i}|^{2} is non-vanishing and

(6) f=jJ|gj|2iI|hi|2.f=\frac{\sum_{j\in J}|g_{j}|^{2}}{\sum_{i\in I}|h_{i}|^{2}}.

In particular, a polynomial (resp. rational) FSOS of the form {g}\{g\} (resp. ({g},{h})(\{g\},\{h\})) is called a rank one polynomial (resp. rational) FSOS.

Remark 2.5.

If G=Γ2nG=\Gamma_{2}^{n}, we may further require gjg_{j}’s and hih_{i}’s in (5) and (6) to be real valued. We claim that ff admits a real FSOS supported in S2nS\subseteq\mathbb{Z}_{2}^{n} if and only if it admits a complex FSOS supported in SS. Indeed, we can write a complex valued function gg on Γ2n\Gamma_{2}^{n} as g=α2ncαχαg=\sum_{\alpha\in\mathbb{Z}_{2}^{n}}c_{\alpha}\chi_{\alpha}. Since χα\chi_{\alpha} is real-valued in this case, we have |g|2=Re(g)2+Im(g)2|g|^{2}=\operatorname{Re}(g)^{2}+\operatorname{Im}(g)^{2} where

Re(g)=α2dRe(cα)χα,Im(g)=α2dIm(cα)χα.\operatorname{Re}(g)=\sum_{\alpha\in\mathbb{Z}_{2}^{d}}\operatorname{Re}(c_{\alpha})\chi_{\alpha},\quad\operatorname{Im}(g)=\sum_{\alpha\in\mathbb{Z}_{2}^{d}}\operatorname{Im}(c_{\alpha})\chi_{\alpha}.

This implies supp(Re(g))supp(g)\operatorname*{supp}(\operatorname{Re}(g))\subseteq\operatorname*{supp}(g) and supp(Im(g))supp(g)\operatorname*{supp}(\operatorname{Im}(g))\subseteq\operatorname*{supp}(g).111It is worthy to notice that for a real valued function gg on a general finite abelian group, it may happen that supp(Re(g))supp(g)\operatorname*{supp}(\operatorname{Re}(g))\not\subseteq\operatorname*{supp}(g) and supp(Im(g))supp(g)\operatorname*{supp}(\operatorname{Im}(g))\not\subseteq\operatorname*{supp}(g). Therefore, if {gj}jJ\{g_{j}\}_{j\in J} is a complex valued polynomial FSOS of ff on Γ2n\Gamma_{2}^{n}, then {Re(gj),Im(gj)}jJ\{\operatorname{Re}(g_{j}),\operatorname{Im}(g_{j})\}_{j\in J} provides a real valued polynomial FSOS of ff. Moreover, these two FSOS share the same support. The argument for complex valued rational FSOS is similar. As a consequence of the claim, it suffices to focus on real FSOS for functions on Γ2n\Gamma_{2}^{n}.

By the work of Hilbert [Hil88], for each pair (n,d)(n,d) of positive integers such that n,d4n,d\geq 4, there always exists non-negative polynomials which can not be written as sum of squares of finitely many polynomials. Moreover, according to the seminal work of Artin [Art27] which solves Hilbert’s seventeenth problem, any non-negative polynomial can be written as a sum of squares of rational functions. The distinction between the existence of polynomial and rational SOS can be diminished by restricting the problem to a finite set of points. To be more precise, we have the following proposition, which can be regarded as a cornerstone of the theory of FSOS.

Proposition 2.6.

[FSP16, Proposition 2] A non-negative function ff on an abelian group admits both polynomial and rational FSOS.

As a consequence of Proposition 2.6, a function ff on GG is non-negative if and only if there exists a Hermitian positive semidefinite matrix H=(Hα,β)χα,χβG^|G|×|G|H=(H_{\alpha,\beta})_{\chi_{\alpha},\chi_{\beta}\in\widehat{G}}\in\mathbb{C}^{|G|\times|G|} such that the following holds for each χβG^\chi_{\beta}\in\widehat{G}:

(7) χαG^Hα,α+β=fβ.\sum_{\chi_{\alpha}\in\widehat{G}}H_{\alpha,\alpha+\beta}=f_{\beta}.

Here we index rows and columns of HH by elements in G^\widehat{G}. Any |G|×|G||G|\times|G| matrix H0H\succeq 0 satisfying (7) is called a Gram matrix of ff. Gram matrices are of great importance in both the theoretical and computational study of FSOS. Indeed, if HH is a Gram matrix of ff and H=MMH=M^{\ast}M for some M=(Mj,α)1jr,χαG^r×|G|M=(M_{j,\alpha})_{1\leq j\leq r,\chi_{\alpha}\in\widehat{G}}\in\mathbb{C}^{r\times|G|}, then we have

(8) f=j=1r|χαG^Mj,αχα|2.f=\sum_{j=1}^{r}\Big{\lvert}\sum_{\chi_{\alpha}\in\widehat{G}}M_{j,\alpha}\chi_{\alpha}\Big{\rvert}^{2}.

2.3. Sparsity of FSOS

We measure the quality of a polynomial or rational FSOS by its sparsity. To that end, we define the notion of sparsity for a finite family of functions on GG.

Definition 2.7 (sparsity).

Let {fk}kK\{f_{k}\}_{k\in K} be a finite family of functions on GG. The sparsity of {fk}kK\{f_{k}\}_{k\in K} is

sp({fk}kK)|kKsupp(fk)|.\operatorname{sp}(\{f_{k}\}_{k\in K})\coloneqq\left\lvert\bigcup_{k\in K}\operatorname{supp}(f_{k})\right\rvert.

It is clear that the sparsity of a polynomial (resp. rational) FSOS {gj}jJ\{g_{j}\}_{j\in J} (resp. {(gj,hj)}jJ\{(g_{j},h_{j})\}_{j\in J}) of a given function ff on GG can be controlled by maxjJ{deggj}\max_{j\in J}\{\deg g_{j}\} (resp. maxjJ{deggj,deghj}\max_{j\in J}\{\deg g_{j},\deg h_{j}\}). For ease of reference, we record the following theorems which provide an upper bound for the sparsity of a polynomial FSOS of a non-negative function on Γ2n\Gamma_{2}^{n} and ΓN\Gamma_{N}.

Theorem 2.8.

[STKI17, Theorem 3.2] Every degree dd non-negative function on Γ2n\Gamma_{2}^{n} has a polynomial FSOS

f=jJ|gj|2,f=\sum_{j\in J}|g_{j}|^{2},

where maxjJ{deg(gj)}(n+d1)/2\max_{j\in J}\{\deg(g_{j})\}\leq\lceil(n+d-1)/2\rceil.

Theorem 2.9.

[FSP16, Theorem 3] Every degree dd non-negative function on ΓN\Gamma_{N} has a polynomial FSOS

f=jJ|gj|2,f=\sum_{j\in J}|g_{j}|^{2},

where sp({gj}jJ)3dlog(Nd)\operatorname{sp}\left(\{g_{j}\}_{j\in J}\right)\leq 3d\log\left(\frac{N}{d}\right).

Theorem 2.10.

[BGP16, Theorem 1.1 & 1.2] For each non-negative quadratic polynomial on {0,1}nn\{0,1\}^{n}\subseteq\mathbb{R}^{n}, there exists a pair ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) of finite families of real polynomials such that

(9) p=jJgj2iIhi2,p=\frac{\sum_{j\in J}g_{j}^{2}}{\sum_{i\in I}h_{i}^{2}},

where maxjJ{deg(gj)}1+n2\max_{j\in J}\{\deg(g_{j})\}\leq 1+\lfloor\frac{n}{2}\rfloor and maxiI{deg(hi)}n2\max_{i\in I}\{\deg(h_{i})\}\leq\lfloor\frac{n}{2}\rfloor. Moreover, the polynomial

f(y1,,yn)=(j=1nyjn2)(j=1nyjn21)f(y_{1},\dots,y_{n})=\left(\sum_{j=1}^{n}y_{j}-\left\lfloor\frac{n}{2}\right\rfloor\right)\left(\sum_{j=1}^{n}y_{j}-\left\lfloor\frac{n}{2}\right\rfloor-1\right)

is non-negative on {0,1}n\{0,1\}^{n} but (9) does not hold for any ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) if maxjJ{deg(gj)}n2\max_{j\in J}\{\deg(g_{j})\}\leq\lfloor\frac{n}{2}\rfloor.

3. Existence of sparse FSOS certificates

Let GG be an abelian group and let f:Gf:G\to\mathbb{Z} be an integer-valued function on GG. For an integer LL, we may certify the inequality fLf\geq L by sum of squares of fLf-L. Moreover, we observe that the range of ff is a finite subset of \mathbb{Z}. This enables us to certify fLf\geq L by sum of squares of fLf-L with errors, which is the content of the lemma that follows.

Lemma 3.1.

The following are equivalent:

  1. (i)

    fLf\geq L.

  2. (ii)

    fL+δ0f-L+\delta\geq 0 for some function δ:G[0,1)\delta:G\to[0,1).

  3. (iii)

    fL+δ0f-L+\delta\geq 0 for any function δ:G[0,1)\delta:G\to[0,1).

In particular, if fL+δ0f-L+\delta\geq 0 for some function δ:G(,1)\delta:G\to(-\infty,1), then (i)–(iii) holds.

Remark 3.2.

Clearly, Proposition 2.6 ensures the existence of both polynomial and rational FSOS certificates for lower bounds. In theory, it suffices to discuss FSOS of fLf-L. In practice, however, it is usually not possible to find an exact FSOS of fLf-L, since errors are inevitable in numerical computation. Fortunately, Lemma 3.1 implies that if one can find an FSOS of fLf-L with a small error, it still certifies fLf\geq L. For example, if

fLjJ|gj|2<1,\left\lVert f-L-\sum_{j\in J}|g_{j}|^{2}\right\rVert_{\ell^{\infty}}<1,

i.e., {gj}jJ\{g_{j}\}_{j\in J} is a polynomial FSOS certificate for fLf\geq L by taking δ=jJ|gj|2f+L\delta=\sum_{j\in J}|g_{j}|^{2}-f+L then {gj}jJ\{g_{j}\}_{j\in J} certifies fLf\geq L. This indicates that it is not necessary to certify fLf\geq L by an exact FSOS of fLf-L. Thus FSOS certificates provide us a wider class of certifiers for lower bounds. In [SL23], the relative error between the lower bound of a low degree FSOS approximation of ff and fminf_{\min} is analyzed on the hypercube. It implies that if a small relative error is allowed, then one might certify an integer-valued function ffminf\geq f_{\min} by a lower degree FSOS on the hypercube.

Due to Lemma 3.1, we have the following definition.

Definition 3.3 (FSOS certificates for lower bound).

A polynomial (resp. rational) FSOS certificate for fLf\geq L is a polynomial (resp. rational) FSOS {gj}jJ\{g_{j}\}_{j\in J} (resp. ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})) of fL+δf-L+\delta for some function δ:G(,1)\delta:G\to(-\infty,1).

In the sequel, if ff and LL are understood from the context, we call the finite family {gj}jJ\{g_{j}\}_{j\in J} (resp. ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})) in Definition 3.3 a polynomial (resp. rational) FSOS certificate. We further abbreviate polynomial/rational FSOS certificate as FSOS certificate if there is no need to emphasize whether it is polynomial or rational.

Remark 3.4.

It is imperative to clarify the distinction between FSOS certificates for lower bound (Definition 3.3) and FSOS (Definitions 2.3 and 2.4). As a concrete example, we consider the quadratic function ff in Theorem 2.10, for which there does not exist ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) such that (iIhi2)f=jJgj2(\sum_{i\in I}h_{i}^{2})f=\sum_{j\in J}g_{j}^{2} and maxjJ,iI{deggj,deghi}n/2\max_{j\in J,i\in I}\{\deg g_{j},\deg h_{i}\}\leq\lfloor n/2\rfloor. On the other side, we observe that

f(x1,,xn)\displaystyle f(x_{1},\dots,x_{n}) =(i=1nxin2)(i=1nxin21)\displaystyle=\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor\right)\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor-1\right)
=(i=1nxin212+12)(i=1nxin21212)\displaystyle=\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor-\frac{1}{2}+\frac{1}{2}\right)\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor-\frac{1}{2}-\frac{1}{2}\right)
=(i=1nxin212)214\displaystyle=\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor-\frac{1}{2}\right)^{2}-\frac{1}{4}
14\displaystyle\geq-\frac{1}{4}

for each (x1,,xn){0,1}n(x_{1},\dots,x_{n})\in\{0,1\}^{n}. This implies that ff admits a polynomial FSOS certificate for f0f\geq 0 as ff is an integer-valued function.

Now we are ready to address Problem 1.1, which we reproduce below for convenience. See 1.1 Our solution to Problem 1.1 is the polynomial/rational FSOS certificate. The rest of this section is devoted to a discussion of the existence of low degree polynomial and rational FSOS certificates.

3.1. Polynomial FSOS certificates

In this subsection, we focus on polynomial FSOS certificates. From the perspective of computational complexity theory, the criterion to measure the quality of a certificate is its complexity. In our case, the quality of a polynomial FSOS certificate {gj}jJ\{g_{j}\}_{j\in J} is measured by its sparsity defined in Definition 2.7.

According to Definition 3.3, for an integer LfminminxG{f(x)}L\leq f_{\min}\coloneqq\min_{x\in G}\{f(x)\}, we want to find a finite family {gj}jJ\{g_{j}\}_{j\in J} of functions on GG such that

(10) (fL+ε)jJ|gj|2=maxyG|(f(y)L+ε)jJ|gj(y)|2|<1ε\left\lVert(f-L+\varepsilon)-\sum_{j\in J}|g_{j}|^{2}\right\rVert_{\ell^{\infty}}=\max_{y\in G}\left\lvert(f(y)-L+\varepsilon)-\sum_{j\in J}|g_{j}(y)|^{2}\right\rvert<1-\varepsilon

for some ε[0,1)\varepsilon\in[0,1). Here we implicitly take the function δ:G(,1)\delta:G\to(-\infty,1) in Definition 3.3 to be δ=(fL+ϵ)+jJ|gj|2\delta=-(f-L+\epsilon)+\sum_{j\in J}|g_{j}|^{2}. We remark that ε\varepsilon appears in both sides of (10) for computational purposes. One can easily rewrite (10) so that ε\varepsilon only appears on the right side. For numerical examples and algorithms in this paper, we simply choose ε=1/2\varepsilon=1/2.

We notice that for any ε[0,1)\varepsilon\in[0,1) we may take g=fL+εg=\sqrt{f-L+\varepsilon} so that (10) holds trivially for {g}\{g\}. However, fL+ε\sqrt{f-L+\varepsilon} usually fails to be sparse. This can be easily seen from the following illustrative example.

Example 3.5.

We consider G=Γ23G=\Gamma_{2}^{3} and

(11) f(y1,y2,y3)=138+38y1+38y2+38y3+18y1y2+18y1y3+18y2y318y1y2y3.f(y_{1},y_{2},y_{3})=\frac{13}{8}+\frac{3}{8}y_{1}+\frac{3}{8}y_{2}+\frac{3}{8}y_{3}+\frac{1}{8}y_{1}y_{2}+\frac{1}{8}y_{1}y_{3}+\frac{1}{8}y_{2}y_{3}-\frac{1}{8}y_{1}y_{2}y_{3}.

We take ε=1/2,L=fmin=1\varepsilon=1/2,L=f_{\min}=1. Then g=fL+ε=f1/2g=\sqrt{f-L+\varepsilon}=\sqrt{f-1/2} is a dense polynomial:

g(y1,y2,y3)\displaystyle g(y_{1},y_{2},y_{3}) =42+36+1016+22+6+1016y1+6+1016y2+22+6+1016y3\displaystyle=\frac{4\sqrt{2}+3\sqrt{6}+\sqrt{10}}{16}+\frac{-2\sqrt{2}+\sqrt{6}+\sqrt{10}}{16}y_{1}+\frac{\sqrt{6}+\sqrt{10}}{16}y_{2}+\frac{-2\sqrt{2}+\sqrt{6}+\sqrt{10}}{16}y_{3}
+6+1016y1y2+6+1016y1y3+6+1016y2y3+2236+1016y1y2y3.\displaystyle+\frac{-\sqrt{6}+\sqrt{10}}{16}y_{1}y_{2}+\frac{-\sqrt{6}+\sqrt{10}}{16}y_{1}y_{3}+\frac{-\sqrt{6}+\sqrt{10}}{16}y_{2}y_{3}+\frac{2\sqrt{2}-3\sqrt{6}+\sqrt{10}}{16}y_{1}y_{2}y_{3}.

It is clear that sp({g})=8\operatorname{sp}(\{g\})=8. However, by Algorithm 1, we obtain a polynomial FSOS certificate of sparsity 44:

l1(y)\displaystyle l_{1}(y) =0.2615y1+0.2615y2+0.2615y3+0.7170,\displaystyle=0.2615y_{1}+0.2615y_{2}+0.2615y_{3}+0.7170,
l2(y)\displaystyle l_{2}(y) =0.0542y10.0542y2+0.1085y3,\displaystyle=-0.0542y_{1}-0.0542y_{2}+0.1085y_{3},
l3(y)\displaystyle l_{3}(y) =0.0939y1+0.0939y2.\displaystyle=-0.0939y_{1}+0.0939y_{2}.

In fact, according to (4), it is easy to verify that

f1i=13|li|2f^1i=13|li|2^1<0.255,\left\lVert f-1-\sum_{i=1}^{3}|l_{i}|^{2}\right\rVert_{\ell^{\infty}}\leq\left\lVert\widehat{f}-1-\sum_{i=1}^{3}\widehat{|l_{i}|^{2}}\right\rVert_{\ell^{1}}<0.255,

from which one may conclude that {li}i=13\{l_{i}\}_{i=1}^{3} certifies f1f\geq 1.

3.1.1. Existence of low degree polynomial FSOS certificate

This sub-subsection is concerned with the existence of sparse polynomial FSOS certificates. To be more precise, we prove that there exists a lower degree function gg on GG such that {g}\{g\} satisfies (10). As a side remark, the existence of gg is equivalent to the existence of a rank one sparse Gram matrix defined in (7) of fL+δf-L+\delta for some function δ:G(,1)\delta:G\to(-\infty,1).

Our discussion relies on approximation theory of functions. For the reader’s convenience, we record the following classic estimate for Chebyshev interpolation.

Lemma 3.6.

[SB02] Let a<ba<b be two real numbers. For each fCd+1([a,b])f\in C^{d+1}([a,b]), we have

maxt[a,b]|f(t)p(t)|(ba2)d+1maxy[a,b]|f(d+1)(t)|2d(d+1)!,\max_{t\in[a,b]}|f(t)-p(t)|\leq\left(\frac{b-a}{2}\right)^{d+1}\frac{\max_{y\in[a,b]}|f^{(d+1)}(t)|}{2^{d}(d+1)!},

where pp is the degree dd Chebyshev interpolation polynomial for ff on [a,b][a,b].

Next we apply Lemma 3.6 to t\sqrt{t} on [α,1][\alpha,1] for some α>0\alpha>0, from which we obtain the next lemma.

Lemma 3.7.

Let ff be a function on GG and let fminf_{\min} (resp. fmaxf_{\max}) be the minimum (resp. maximum) of ff. If L<5fminfmax4L<\frac{5f_{\min}-f_{\max}}{4}, then for any ε>0\varepsilon>0, there exists a function PP on GG of degree

deg(f)1logε+log(fmaxL)2+log(fminL)log(fmaxfmin)\deg(f)\left\lfloor\frac{1-\log\varepsilon+\log(f_{\max}-L)}{2+\log(f_{\min}-L)-\log(f_{\max}-f_{\min})}\right\rfloor

such that

fL|P|2<ε.\left\lVert f-L-|P|^{2}\right\rVert_{\ell^{\infty}}<\varepsilon.
Proof.

We notice that the image of ff is contained in [fmin,fmax][f_{\min},f_{\max}]\cap\mathbb{Z}. Thus the image of f0(fL)/(fmaxL)f_{0}\coloneqq(f-L)/(f_{\max}-L) is contained in

{fminLfmaxL,fminL+1fmaxL,,1}.\left\{\frac{f_{\min}-L}{f_{\max}-L},\frac{f_{\min}-L+1}{f_{\max}-L},\dots,1\right\}.

For simplicity we denote αfminLfmaxL\alpha\coloneqq\frac{f_{\min}-L}{f_{\max}-L}. By assumption we have 15<α1\frac{1}{5}<\alpha\leq 1.

Let p(t)p(t) be the degree d=1logε+log(fmaxL)2+log(fminL)log(fmaxfmin)d=\left\lfloor\frac{1-\log\varepsilon+\log(f_{\max}-L)}{2+\log(f_{\min}-L)-\log(f_{\max}-f_{\min})}\right\rfloor Chebyshev interpolation polynomial for t\sqrt{t} on [α,1][\alpha,1]. According to Lemma 3.6, we have

maxt[α,1]|p(t)t|\displaystyle\max_{t\in[\alpha,1]}\lvert p(t)-\sqrt{t}\rvert (1α2)d+1(2d1)!!α12(d+1)22d+1(d+1)!\displaystyle\leq\left(\frac{1-\alpha}{2}\right)^{d+1}\frac{(2d-1)!!\alpha^{\frac{1}{2}-(d+1)}}{2^{2d+1}(d+1)!}
<αd+1(1α4α)d+1\displaystyle<\frac{\sqrt{\alpha}}{d+1}\left(\frac{1-\alpha}{4\alpha}\right)^{d+1}
<(1α4α)d+1\displaystyle<\left(\frac{1-\alpha}{4\alpha}\right)^{d+1}
ε2(fmaxL),\displaystyle\leq\frac{\varepsilon}{2(f_{\max}-L)},

where n!!=n(n2)(n4)(n2n12)n!!=n\cdot(n-2)\cdot(n-4)\cdots(n-2\lfloor\frac{n-1}{2}\rfloor). Next we define PfmaxL(pf0)P\coloneqq\sqrt{f_{\max}-L}\left(p\circ f_{0}\right). Thus

maxxG||P(x)|2(f(x)L)|\displaystyle\max_{x\in G}\left||P(x)|^{2}-(f(x)-L)\right|
=maxxG|(fmaxL)p(f0(x))2(fmaxL)(f0(x))2|\displaystyle=\max_{x\in G}\left\lvert(f_{\max}-L)p(f_{0}(x))^{2}-(f_{\max}-L)\left(\sqrt{f_{0}(x)}\right)^{2}\right\rvert
=maxxG{|p(f0(x))f0(x)||(fmaxL)(p(f0(x))+f0(x))|}\displaystyle=\max_{x\in G}\left\{\left\lvert p(f_{0}(x))-\sqrt{f_{0}(x)}\right\rvert\left\lvert(f_{\max}-L)\left(p(f_{0}(x))+\sqrt{f_{0}(x)}\right)\right\rvert\right\}
2(fmaxL)maxxG|p(f0(x))f0(x)|\displaystyle\leq 2(f_{\max}-L)\max_{x\in G}\left\lvert p(f_{0}(x))-\sqrt{f_{0}(x)}\right\rvert
<ε\displaystyle<\varepsilon

and this completes the proof of the lemma. ∎

As a straightforward application of Lemma 3.7, we have the following theorem on the existence of a low degree polynomial FSOS certificate for the lower bound.

Theorem 3.8 (low degree polynomial FSOS certificate).

Let β[0,1)\beta\in[0,1) be a fixed real number and let ff be a non-negative integer-valued function on GG. If

fmax<(54β)fmin,f_{\max}<(5-4\beta)f_{\min},

then there exists a rank-one FSOS certificate {g}\{g\} for fL[βfmin]f\geq L\coloneqq[\beta f_{\min}], such that

deg(g)deg(f)3+log(fmaxL)2+log(fminL)log(fmaxfmin).\deg(g)\leq\deg(f)\left\lfloor\frac{3+\log(f_{\max}-L)}{2+\log(f_{\min}-L)-\log(f_{\max}-f_{\min})}\right\rfloor.

Here [x][x] denotes the nearest integer to xx\in\mathbb{R}.

Proof.

We may take ε=1/4\varepsilon=1/4 and L=[βfmin]L=[\beta f_{\min}] in Lemma 3.7, which implies the existence of a function PP on GG of the desired degree such that

maxaG|f(a)L|P(a)|2|<1/4.\max_{a\in G}|f(a)-L-|P(a)|^{2}|<1/4.

We illustrate Theorem 3.8 by Figure 1. For each pair (α,β)(\alpha,\beta) in the shaded region in Figure 1 and any function ff with fmin=αfmax0f_{\min}=\alpha f_{\max}\geq 0, one can certify the inequality f[βfmin]f\geq[\beta f_{\min}] by a polynomial FSOS of degree at most deg(f)3+log(1αβ)fmax2+log(1β)αlog(1α)\deg(f)\left\lfloor\frac{3+\log(1-\alpha\beta)f_{\max}}{2+\log(1-\beta)\alpha-\log(1-\alpha)}\right\rfloor.

00.20.20.40.40.60.60.80.81100.20.20.40.40.60.60.80.811β=(5α1)/(4α)\beta=(5\alpha-1)/(4\alpha)x=1x=1α=fminfmax\alpha=\frac{f_{\min}}{f_{\max}}β=Lfmin\beta=\frac{L}{f_{\min}}
Figure 1. threshold

In particular, if β=0\beta=0 and α(1/5,1)\alpha\in(1/5,1), the degree bound simplifies to deg(f)3+logfmax2+logαlog(1α)\deg(f)\left\lfloor\frac{3+\log f_{\max}}{2+\log\alpha-\log(1-\alpha)}\right\rfloor.

We may apply this bound to the special case G=Γ2nG=\Gamma_{2}^{n}. Thus we can certify f0f\geq 0 by a polynomial FSOS of degree at most

deg(f)3+deg(f)logn+logf^max2+logαlog(1α),\deg(f)\left\lfloor\frac{3+\deg(f)\log n+\log\widehat{f}_{\max}}{2+\log\alpha-\log(1-\alpha)}\right\rfloor,

since fmaxndeg(f)f^maxf_{\max}\leq n^{\deg(f)}\widehat{f}_{\max}. Here f^maxmaxχαG^|f^(χα)|\widehat{f}_{\max}\coloneqq\max_{\chi_{\alpha}\in\widehat{G}}|\widehat{f}(\chi_{\alpha})|. If we further assume that α\alpha is fixed and f^maxM\widehat{f}_{\max}\leq M for some constant M>0M>0, then the degree bound is O(deg(f)2logn)O(\deg(f)^{2}\log n).

Another special case is G=ΓNG=\Gamma_{N}. In this case, the upper bound becomes

deg(f)3+log(2deg(f)+1)+logf^max2+logαlog(1α),\deg(f)\left\lfloor\frac{3+\log(2\deg(f)+1)+\log\widehat{f}_{\max}}{2+\log\alpha-\log(1-\alpha)}\right\rfloor,

since fmax(2deg(f)+1)f^maxf_{\max}\leq(2\deg(f)+1)\widehat{f}_{\max}. Again, if α\alpha is fixed and f^maxM\widehat{f}_{\max}\leq M, then the degree bound is O(deg(f)log(deg(f)))O(\deg(f)\log(\deg(f))). Since ff in this case is a univariate Laurent polynomial, O(deg(f)log(deg(f)))O(\deg(f)\log(\deg(f))) is also an upper bound for the sparsity of polynomial FSOS certificates.

Remark 3.9.

For any non-negative function on Γ2n\Gamma_{2}^{n}, Theorem 2.8 guarantees the existence of a degree O(n+deg(f))O(n+\deg(f)) polynomial FSOS certificate for the lower bound. As a comparison, if we assume that deg(f)\deg(f) is small, α\alpha is fixed and f^maxM\widehat{f}_{\max}\leq M for some constant M>0M>0, then it is remarkable that Theorem 3.8 ensures the existence (in some cases) of a degree O(logn)O(\log n) polynomial FSOS certificate.

Similarly, according to Theorem 2.9, every non-negative integer-valued function of degree dd on ΓN{\color[rgb]{0,0,1}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,1}\Gamma}_{N} admits a polynomial FSOS certificate of sparsity O(dlog(Nd))O\left(d\log\left(\frac{N}{d}\right)\right). Our bound O(dlogd)O(d\log d) in this case supplies a better upper bound provided that dd is small, α\alpha is fixed and f^maxM\widehat{f}_{\max}\leq M. Since Theorems 2.8 and 2.9 actually concern with the existence of a sparse polynomial FSOS of a non-negative function ff, we must have f=jJ|gj|2f=\sum_{j\in J}|g_{j}|^{2}. The improvement in Theorem 3.8 should be regarded as a benefit of allowing errors in Definition 3.3.

3.1.2. Two impossibility theorems

We notice that the key ingredient in the proof of Lemma 3.7 and thus Theorem 3.8 is the fact that one can approximate t\sqrt{t} on [α,1][\alpha,1] exponentially by a polynomial when α>1/5\alpha>1/5. Thus it is tempting to expect an exponential approximation of t\sqrt{t} on [0,1][0,1]. Unfortunately, this is not possible, which indicates that Theorem 3.8 is already the optimal result one can obtain by our method. We prove this non-existence result in the following. To begin with, we recall the following theorem on the approximation error of the absolute value function.

Theorem 3.10.

[VK87] Let E2nE_{2n} be the error of the best uniform approximation of |t||t| by polynomials of degree at most 2n2n on the interval [1,1][-1,1]. Then limnnE2n\lim_{n\rightarrow\infty}nE_{2n} exists.

Using Theorem 3.10, we can prove that it is not possible to approximate the square root function exponentially and uniformly.

Theorem 3.11 (impossibility theorem for uniform approximation).

There exists no polynomial sequence {pi:deg(pi)i}i=1\{p_{i}:\operatorname{deg}(p_{i})\leq i\}_{i=1}^{\infty} converging uniformly and exponentially to the square root function on the interval [0,1][0,1]. As a consequence, Theorem 3.8 can not be improved by approximating t\sqrt{t} uniformly. Furthermore, there exists a sequence of polynomials {rn:deg(rn)=n}n=1\{r_{n}:\deg(r_{n})=n\}_{n=1}^{\infty} that uniformly approximates the square root function on interval [0,1][0,1] with error EnE^{\prime}_{n}, such that limnnEn\lim_{n\rightarrow\infty}nE^{\prime}_{n} exists.

Proof.

Suppose there exists such a sequence {pi}i=1\{p_{i}\}_{i=1}^{\infty}. Since t2=|t|\sqrt{t^{2}}=|t|, the polynomial sequence {pi(t2)}i=1\{p_{i}(t^{2})\}_{i=1}^{\infty} converges uniformly and exponentially to the absolute value function on the interval [1,1][-1,1], which contradicts to the error estimate in Theorem 3.10.

For the “furthermore” part, let p(t)p(t) be a polynomial of degree 2d2d such that |p(t)|t||<ε\left|p(t)-|t|\right|<\varepsilon holds for any t[1,1]t\in[-1,1]. Then for r(t)p(t)+p(t)2r(t)\coloneqq\frac{p(t)+p(-t)}{2}, we have:

|r(t)|t||12|p(t)|t||+12|p(t)|t||ε.\left|r(t)-|t|\right|\leq\frac{1}{2}\left|p(t)-|t|\right|+\frac{1}{2}\left|p(-t)-|-t|\right|\leq\varepsilon.

Since r(t)r(t) is an even polynomial, r(t)r(\sqrt{t}) is also a polynomial of degree dd such that |r(t)t|<ε\left|r(\sqrt{t})-\sqrt{t}\right|<\varepsilon for t[0,1]t\in[0,1]. ∎

By a closer investigation of the proof of Lemma 3.7, we note that instead of finding a low degree uniform approximation of t\sqrt{t}, it is sufficient to find one which approximates t\sqrt{t} at integer points 0,1,,m0,1,\dots,m. It is seemingly convincing that such a more flexible approximation may improve Lemma 3.7 and Theorem 3.10, but we can prove that the expected improvement is not possible. Our argument is based on the following result.

Theorem 3.12.

Let pp be a univariate polynomial of degree dd and let b1b2,c0b_{1}\leq b_{2},c\geq 0 and δ0\delta\geq 0 be fixed real numbers. Assume a1<<ala_{1}<\dots<a_{l} are points in [0,m][0,m] such that for any x[0,m]x\in[0,m] there exists some aia_{i} such that |xai|δ|x-a_{i}|\leq\delta.

  • b1p(ai)b2b_{1}\leq p(a_{i})\leq b_{2} for each 1il1\leq i\leq l;

  • there exists some t0[0,m]t_{0}\in[0,m] such that |p(t0)|c|p^{\prime}(t_{0})|\geq c.

Then dcm/(2δc+b2b1)d\geq\sqrt{cm/(2\delta c+b_{2}-b_{1})}.

Proof.

The proof for l=m+1l=m+1 and ai=ia_{i}=i can be found in [NS94, Theorem 3.3] and one can easily adapt the proof there to the stated general case. ∎

Lemma 3.13.

Let 12δ>0,m5\frac{1}{2}\geq\delta>0,m\geq 5 be fixed real numbers and let pp be a polynomial of degree dd. Assume a1<<ala_{1}<\cdots<a_{l} are points in [0,m][0,m] such that for any x[0,m]x\in[0,m] there exists some aia_{i} such that |xai|δ|x-a_{i}|\leq\delta. If there exists some 0<ε<1/60<\varepsilon<1/6 such that |p(ai)ai|ε\left|p(a_{i})-\sqrt{a_{i}}\right|\leq\varepsilon for each 1il1\leq i\leq l, then

d(16ε)mm+2ε+2δ(16ε)=O(m14),d\geq\sqrt{\frac{\left(\frac{1}{6}-\varepsilon\right)m}{\sqrt{m}+2\varepsilon+2\delta\left(\frac{1}{6}-\varepsilon\right)}}=O(m^{\frac{1}{4}}),

.

Proof.

It is clear that 0a1δ0\leq a_{1}\leq\delta. There exists 1kl1\leq k\leq l such that |2ak|δ|2-a_{k}|\leq\delta, thus 2+δak2δ2+\delta\geq a_{k}\geq 2-\delta. By assumption, it is also obvious that εp(ai)m+ε-\varepsilon\leq p(a_{i})\leq\sqrt{m}+\varepsilon for each 1il1\leq i\leq l. The mean value theorem ensures the existence of t[a1,ak]t\in[a_{1},a_{k}] such that

|p(t)|\displaystyle|p^{\prime}(t)| =|p(ak)p(a1)|/|aka1|\displaystyle=|p(a_{k})-p(a_{1})|/|a_{k}-a_{1}|
=|(p(ak)ak)(p(a1)a1)+(aka1)|/|aka1|\displaystyle=|(p(a_{k})-\sqrt{a_{k}})-(p(a_{1})-\sqrt{a_{1}})+(\sqrt{a_{k}}-\sqrt{a_{1}})|/|a_{k}-a_{1}|
(2δδ2ε)/(2+δ)\displaystyle\geq(\sqrt{2-\delta}-\sqrt{\delta}-2\varepsilon)/(2+\delta)
2(31)/5ε\displaystyle\geq\sqrt{2}(\sqrt{3}-1)/5-\varepsilon
1/6ε.\displaystyle\geq 1/6-\varepsilon.

Applying Theorem 3.12 to pp with b1=ε,b2=m+εb_{1}=-\varepsilon,b_{2}=\sqrt{m}+\varepsilon and c=1/6εc=1/6-\varepsilon, we obtain the desired lower bound for dd. ∎

Since Lemma 3.13 provides a lower bound for the degree of a polynomial that approximates square root function on finitely many integers, we are now able to prove our second impossibility theorem, as promised.

Theorem 3.14 (impossibility theorem for discrete approximation).

Let 0<ε<1/60<\varepsilon<1/6 be a real number and let m5m\geq 5 be a positive integer. Suppose pm,εp_{m,\varepsilon} is a univariate polynomial such that for any integer valued function ff upper bounded by mm on GG, the function gpm,ε(ffmin)g\coloneqq p_{m,\varepsilon}(f-f_{\min}) satisfies

ffmingϵ\left\lVert\sqrt{f-f_{\min}}-g\right\rVert_{\ell^{\infty}}\leq\epsilon

Then the degree of pm,εp_{m,\varepsilon} is at least O(m14)O(m^{\frac{1}{4}}). In particular, Theorem 3.8 can not be improved by approximating t\sqrt{t} at integer points 0,,m0,\dots,m.

Proof.

For each integer 0im0\leq i\leq m, we consider fi:Gf_{i}:G\to\mathbb{N} defined by

fi(y)={i,ify=e0,otherwise.f_{i}(y)=\begin{cases}i,\quad\text{if}\leavevmode\nobreak\ y=e\\ 0,\quad\text{otherwise}.\end{cases}

Here eGe\in G denotes the identity element of GG. By assumption, we have

maxyG|fi(y)pm,ε(fi(y))|<ε,\max_{y\in G}\left|\sqrt{f_{i}(y)}-p_{m,\varepsilon}(f_{i}(y))\right|<\varepsilon,

from which we may conclude that

|pm,ε(i)i|<ε.\left|p_{m,\varepsilon}(i)-\sqrt{i}\right|<\varepsilon.

By Lemma 3.13, the degree of pm,εp_{m,\varepsilon} is at least O(m14)O(m^{\frac{1}{4}}). ∎

To summarize, Theorems 3.11 and 3.14 imply that the conditions on fminf_{\min} and fmaxf_{\max} in Theorem 3.8 can not be removed by using other approximations of t\sqrt{t}. Thus our result in Theorem 3.8 can be regarded as an optimal result one can obtain by function approximation method. An illustrative example for our impossibility theorems is as follows.

Example 3.15.

We define f:Γ26f:\Gamma_{2}^{6}\to\mathbb{Z} by

f(y1,,y6)=134+14y1+14y2+12y3+12y4+12y5+12y6+14y1y2.f(y_{1},\dots,y_{6})=\frac{13}{4}+\frac{1}{4}y_{1}+\frac{1}{4}y_{2}+\frac{1}{2}y_{3}+\frac{1}{2}y_{4}+\frac{1}{2}y_{5}+\frac{1}{2}y_{6}+\frac{1}{4}y_{1}y_{2}.

It is clear that fmin=1f_{\min}=1 and f6f\leq 6. We apply the method in the proof of Lemma 3.7 to f1+12=f12f-1+\frac{1}{2}=f-\frac{1}{2}. Namely, we approximate t\sqrt{t} by some degree dd polynomial pd(t)p_{d}(t) at points {i+12}i=05\{i+\frac{1}{2}\}_{i=0}^{5}. Then {pd(f12)}\{p_{d}(f-\frac{1}{2})\} is a polynomial FSOS certificate for f1f\geq 1, provided that pd(t)p_{d}(t) approximates t\sqrt{t} sufficiently well.

The best linear approximation of t\sqrt{t} at {i+12}i=05\left\{i+\frac{1}{2}\right\}_{i=0}^{5} is p1(t)=0.653+0.328tp_{1}(t)=0.653+0.328t. However, p1(f12)p_{1}(f-\frac{1}{2}) is not a polynomial FSOS certificate for f1f\geq 1 since

|(p1(f(1,1,1,1,1,1)12))2(f(1,1,1,1,1,1)12)|0.525.\left|\left(p_{1}\left(f(1,1,1,1,1,1)-\frac{1}{2}\right)\right)^{2}-\left(f(1,1,1,1,1,1)-\frac{1}{2}\right)\right|\geq 0.525.

Thus to obtain a polynomial FSOS certificate, we consider p2(t)=0.0359t2+0.532t+0.479p_{2}(t)=-0.0359t^{2}+0.532t+0.479, which is the best quadratic approximation of t\sqrt{t} at {i+12}i=05\left\{i+\frac{1}{2}\right\}_{i=0}^{5}. One may verify directly that

g=p2(f12)\displaystyle g=p_{2}\left(f-\frac{1}{2}\right) =1.63+0.079y1+0.079y2+0.167y3+0.167y4+0.167y5+0.167y6+0.079y1y2\displaystyle=1.63+0.079y_{1}+0.079y_{2}+0.167y_{3}+0.167y_{4}+0.167y_{5}+0.167y_{6}+0.079y_{1}y_{2}
0.009y1y30.009y1y40.009y2y30.009y1y50.009y2y40.009y1y6\displaystyle-0.009y_{1}y_{3}-0.009y_{1}y_{4}-0.009y_{2}y_{3}-0.009y_{1}y_{5}-0.009y_{2}y_{4}-0.009y_{1}y_{6}
0.009y2y50.018y3y40.009y2y60.018y3y50.018y3y6\displaystyle-0.009y_{2}y_{5}-0.018y_{3}y_{4}-0.009y_{2}y_{6}-0.018y_{3}y_{5}-0.018y_{3}y_{6}
0.018y4y50.018y4y60.018y5y60.009y1y2y3\displaystyle-0.018y_{4}y_{5}-0.018y_{4}y_{6}-0.018y_{5}y_{6}-0.009y_{1}y_{2}y_{3}
0.009y1y2y40.009y1y2y50.009y1y2y6.\displaystyle-0.009y_{1}y_{2}y_{4}-0.009y_{1}y_{2}y_{5}-0.009y_{1}y_{2}y_{6}.

It is clear that f1g2f^1g2^1<0.44\left\lVert f-1-g^{2}\right\rVert_{\ell^{\infty}}\leq\left\lVert\widehat{f}-1-\widehat{g^{2}}\right\rVert_{\ell^{1}}<0.44, thus {g}\{g\} is a desired polynomial FSOS certificate. We also see immediately that |supp(g)|=26|\operatorname{supp}(g)|=26. As a comparison, by Algorithm 1, one can find the following sparse polynomials

g1\displaystyle g_{1} =0.25y3+0.25y4+0.25y5+0.25y6+1,\displaystyle=0.25y_{3}+0.25y_{4}+0.25y_{5}+0.25y_{6}+1,
g2\displaystyle g_{2} =0.1443y30.1443y40.1443y5+0.433y6,\displaystyle=-0.1443y_{3}-0.1443y_{4}-0.1443y_{5}+0.433y_{6},
g3\displaystyle g_{3} =0.2041y30.2041y4+0.4083y5,\displaystyle=-0.2041y_{3}-0.2041y_{4}+0.4083y_{5},
g4\displaystyle g_{4} =0.3536y3+0.3536y4,\displaystyle=-0.3536y_{3}+0.3536y_{4},
g5\displaystyle g_{5} =0.5,\displaystyle=0.5,

which satisfy

f^1j=15|gj|2^1<0.76.\left\lVert\widehat{f}-1-\sum_{j=1}^{5}\widehat{|g_{j}|^{2}}\right\rVert_{\ell^{1}}<0.76.

Therefore, {gj}j=15\{g_{j}\}_{j=1}^{5} is an FSOS certificate for f1f\geq 1 of sparsity 55.

3.2. Rational FSOS certificates

Now we turn our attention to rational FSOS certificates. Although by Proposition 2.6, each function on GG admits a polynomial FSOS certificate, it is still imperative to consider rational FSOS certificates, due to the following reason:

  • According to Theorem 3.18 below, every integer-valued function on GG admits a rational FSOS certificate, which is of degree O(deg(f)log2(fmaxfmin))O(\deg(f)\log^{2}(f_{\max}-f_{\min})). In contrast, the existence of low degree polynomial FSOS certificates can only be guaranteed for some particular ff (cf. Theorem 3.8 and Figure 1).

The remaining of this section focuses on proving the existence of sparse rational FSOS certificates. Let RdR_{d} be the set of univariate rational functions with numerator and denominator being real polynomials of degree at most dd. We define below the approximation error of rational functions to t\sqrt{t} on the interval [0,1][0,1]:

EdinfrRd{maxt[0,1]|tr(t)|}.E_{d}\coloneqq\inf_{r\in{R}_{d}}\left\{\max_{t\in[0,1]}|\sqrt{t}-r(t)|\right\}.
Theorem 3.16.

[Vya75, Sta03] There exists some constant C>0C>0 such that for each dd\in\mathbb{N}, it holds that

(12) 13e2πd2EdCe2πd2.\displaystyle\frac{1}{3}e^{-2\pi\sqrt{\frac{d}{2}}}\leq E_{d}\leq Ce^{-2\pi\sqrt{\frac{d}{2}}}.
Lemma 3.17.

There exists a constant c>0c>0 such that for any function ff on GG, real number LfminL\leq f_{\min} and ε(0,1)\varepsilon\in(0,1), one can find functions g,hg,h of degree at most

deg(f)(log(fmaxL)+logclogε)22π2\deg(f)\left\lceil\frac{(\log(f_{\max}-L)+\log c-\log\varepsilon)^{2}}{2\pi^{2}}\right\rceil

such that

(fL)|g|2|h|2ε.\left\lVert(f-L)-\frac{|g|^{2}}{|h|^{2}}\right\rVert_{\ell^{\infty}}\leq\varepsilon.
Proof.

We define f0=(fL)/(fmaxL)f_{0}=(f-L)/(f_{\max}-L). It is clear that f0(x)[0,1]f_{0}(x)\in[0,1] for any xGx\in G. According to Theorem 3.16, we can find a constant c>0c>0 and univariate polynomials p,qp,q of degree at most d(log(fmaxL)+logclogε)22π2d\coloneqq\left\lceil\frac{(\log(f_{\max}-L)+\log c-\log\varepsilon)^{2}}{2\pi^{2}}\right\rceil such that

maxt[0,1]|p(t)/q(t)t|c3eπ2dε3(fmaxL).\max_{t\in[0,1]}\lvert p(t)/q(t)-\sqrt{t}\rvert\leq\frac{c}{3}e^{-\pi\sqrt{2d}}\leq\frac{\varepsilon}{3(f_{\max}-L)}.

In particular, we have maxt[0,1]|p(t)/q(t)+t|3\max_{t\in[0,1]}\lvert p(t)/q(t)+\sqrt{t}\rvert\leq 3. Let

g=fmaxL(pf0),h=qf0.g=\sqrt{f_{\max}-L}(p\circ f_{0}),\quad h=q\circ f_{0}.

This implies

maxxG||g(x)|2|h(x)|2(f(x)L)|\displaystyle\max_{x\in G}\left\lvert\frac{|g(x)|^{2}}{|h(x)|^{2}}-(f(x)-L)\right\rvert
=(fmaxL)maxxG|(p(f0(x))/q(f0(x)))2f0(x))|\displaystyle=(f_{\max}-L)\max_{x\in G}\lvert(p(f_{0}(x))/q(f_{0}(x)))^{2}-f_{0}(x))\rvert
3(fmaxL)maxt[0,1]|p(t)/q(t)t|\displaystyle\leq 3(f_{\max}-L)\max_{t\in[0,1]}\left\lvert p(t)/q(t)-\sqrt{t}\right\rvert
ε.\displaystyle\leq\varepsilon.

As a direct application of Lemma 3.17, we are able to derive the following existence theorem of low degree rational FSOS certificate for integer-valued functions on abelian groups.

Theorem 3.18 (low degree rational FSOS certificate).

There is a constant c>0c>0 such that every non-negative integer-valued function ff on GG admits a rank-one rational FSOS certificate {(g,h)}\{(g,h)\} for fLf\geq L, where LfminL\leq f_{\min} is an integer and

degg=deghdeg(f)(log(fmaxL)+c)22π2.\deg g=\deg h\leq\deg(f)\left\lceil\frac{(\log(f_{\max}-L)+c)^{2}}{2\pi^{2}}\right\rceil.

In particular, if we denote by MdM_{d} the cardinality of the set {χα:deg(χα)d}G^\{\chi_{\alpha}:\deg(\chi_{\alpha})\leq d\}\subseteq\widehat{G}, then

degg=deghdeg(f)(logf^max+logMdeg(f)+c+1)22π2.\deg g=\deg h\leq\deg(f)\left\lceil\frac{(\log\widehat{f}_{\max}+\log M_{\deg(f)}+c+1)^{2}}{2\pi^{2}}\right\rceil.
Proof.

We take ε=1/4\varepsilon=1/4 in Lemma 3.17 and the first inequality follows immediately. We observe that fmaxL2f^maxMdeg(f)f_{\max}-L\leq 2\widehat{f}_{\max}M_{\deg(f)}. This implies the estimate for the degree of gg and hh in terms of Mdeg(f)M_{\deg(f)} and f^max\widehat{f}_{\max}. ∎

If we apply Theorem 3.18 to ΓN\Gamma_{N} and f:ΓNf:\Gamma_{N}\to\mathbb{Z} with f^maxM\widehat{f}_{\max}\leq M for some constant M>0M>0, then ff admits a rational FSOS certificate of degree at most O(deg(f)log2(deg(f)))O(\deg(f)\log^{2}(\deg(f))). We also apply Theorem 3.18 to Γ2n\Gamma_{2}^{n} and f:Γ2nf:\Gamma_{2}^{n}\to\mathbb{N} with f^M\lVert\widehat{f}\rVert_{\ell^{\infty}}\leq M. In this case, Md=j=0d(nd)=O(nd)M_{d}=\sum_{j=0}^{d}\binom{n}{d}=O(n^{d}). Thus the upper bound for rational FSOS certificate is O(deg(f)3log2n)O(\deg(f)^{3}\log^{2}n). In particular, if deg(f)=2\deg(f)=2 then the bound is simply O(log2n)O(\log^{2}n).

Remark 3.19.

By Theorem 2.10, every quadratic function on Γ2n\Gamma_{2}^{n} has a rational SOS of degree at most n2+1=O(n)\left\lfloor\frac{n}{2}\right\rfloor+1=O(n). It is even proved by an example that the upper bound n2+1\left\lfloor\frac{n}{2}\right\rfloor+1 is tight, while our exponentially smaller upper bound O(log2n)O(\log^{2}n) does not lead to any contradiction. Indeed, the bound in Theorem 2.10 is for a pair of finite families of real polynomials ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) such that (iIhi2)f=jJgj2(\sum_{i\in I}h_{i}^{2})f=\sum_{j\in J}g_{j}^{2} on Γ2n\Gamma_{2}^{n}. However, our degree bound is for rational FSOS certificate defined in Definition 3.3, which allows errors. A concrete example is given in Remark 3.4.

We notice that it is possible to turn the proof of Theorem 3.18 into an explicit construction of a rank-one rational FSOS certificate, if a rational approximation of t\sqrt{t} can be explicitly given. Fortunately, Newman provides such an explicit rational approximation, which we record in Remark 3.20 for the convenience of the reader.

Remark 3.20.

In [New64], a rational approximation of |t||t| is given explicitly. For any integer d>0d>0, set ξ=exp(d12),pd(t)=k=0d1(t+ξk)\xi=\exp(-d^{-\frac{1}{2}}),\leavevmode\nobreak\ p_{d}(t)=\prod_{k=0}^{d-1}\left(t+\xi^{k}\right) and

rd(t)=t(pd(t)pd(t))pd(t)+pd(t).r_{d}(t)=\frac{t\left(p_{d}(t)-p_{d}(-t)\right)}{p_{d}(t)+p_{d}(-t)}.

Then

maxt[1,1]|rd(t)|t||3exp(d).\max_{t\in[-1,1]}\left|r_{d}(t)-|t|\right|\leq 3\exp(-\sqrt{d}).

Since both t(pd(t)pd(t))t\left(p_{d}(t)-p_{d}(-t)\right) and pd(t)+pd(t)p_{d}(t)+p_{d}(-t) are even functions, r(t)r(\sqrt{t}) is also a rational function. Thus

maxt[0,1]|rd(t)t|3exp(d),\max_{t\in[0,1]}\left|r_{d}(\sqrt{t})-\sqrt{t}\right|\leq 3\exp(-\sqrt{d}),

and rd(t)r_{d}(\sqrt{t}) gives an exponential approximation of t\sqrt{t}.

We illustrate the construction by the example that follows.

Example 3.21 (continues= ex:running example-1).

The image of f12f-\frac{1}{2} is {12,32,52}\{\frac{1}{2},\frac{3}{2},\frac{5}{2}\}. According to Newman’s construction (cf. Remark 3.20), we can take r(t)10(e22+1)t2t+5e22r(t)\coloneqq\frac{\sqrt{10}\left(\mathrm{e}^{-\frac{\sqrt{2}}{2}}+1\right)t}{2t+5\mathrm{e}^{-\frac{\sqrt{2}}{2}}} so that

|r(12)12|<0.026,|r(32)32|<0.072,|r(52)52|=0.\left|r\left(\frac{1}{2}\right)-\sqrt{\frac{1}{2}}\right|<0.026,\quad\left|r\left(\frac{3}{2}\right)-\sqrt{\frac{3}{2}}\right|<0.072,\quad\left|r\left(\frac{5}{2}\right)-\sqrt{\frac{5}{2}}\right|=0.

Then it is straightforward to check that r(f(y)12)r\left(f(y)-\frac{1}{2}\right) equals to

(13) 5.31+1.77y1+1.77y2+1.77y3+0.59y1y2+0.59y1y3+0.59y2y30.59y1y2y34.72+0.75y1+0.75y2+0.75y3+0.25y1y2+0.25y1y3+0.25y2y30.25y1y2y3\frac{5.31+1.77y_{1}+1.77y_{2}+1.77y_{3}+0.59y_{1}y_{2}+0.59y_{1}y_{3}+0.59y_{2}y_{3}-0.59y_{1}y_{2}y_{3}}{4.72+0.75y_{1}+0.75y_{2}+0.75y_{3}+0.25y_{1}y_{2}+0.25y_{1}y_{3}+0.25y_{2}y_{3}-0.25y_{1}y_{2}y_{3}}

hence f12(r(f12))2<0.18\lVert f-\frac{1}{2}-\left(r\circ\left(f-\frac{1}{2}\right)\right)^{2}\rVert_{\ell^{\infty}}<0.18. If we respectively take gg and hh to be the numerator and the denominator of r(f12)r\circ\left(f-\frac{1}{2}\right) in (13), then ({g},{h})(\{g\},\{h\}) is a rank one rational FSOS certificate for f1f\geq 1.

The estimate in (3) leads to the proposition that follows.

Proposition 3.22.

Let ff be a non-negative function on GG. If pp is a univariate polynomial such that

p(f)fε\left\lVert p(f)-\sqrt{f}\right\rVert_{\ell^{\infty}}\leq\varepsilon

for some ε>0\varepsilon>0, then

maxχαG^|(p(f)2)αfα|2f12ε+ε2.\max_{\chi_{\alpha}\in\widehat{G}}\left\lvert(p(f)^{2})_{\alpha}-f_{\alpha}\right\rvert\leq 2\lVert f\rVert_{\ell^{\infty}}^{\frac{1}{2}}\varepsilon+\varepsilon^{2}.

In other words, Fourier coefficients of p(f)2fp(f)^{2}-f are uniformly bounded by O(f12ε)O(\lVert f\rVert_{\ell^{\infty}}^{\frac{1}{2}}\varepsilon). Moreover, if qq is a univariate polynomial such that q(f)2^f^1ε\left\lVert\widehat{q(f)^{2}}-\widehat{f}\right\rVert_{\ell^{1}}\leq\varepsilon, then

q(f)2fε.\left\lVert q(f)^{2}-f\right\rVert_{\ell^{\infty}}\leq\varepsilon.
Proof.

By assumption, for each yGy\in G we have

|p(f(y))+f(y)||2|f(y)|+p(f(y))f(y)|2f12+ε,\left\lvert p(f(y))+\sqrt{f(y)}\right\rvert\leq\left\lvert 2\left\lvert\sqrt{f(y)}\right\rvert+p(f(y))-\sqrt{f(y)}\right\rvert\leq 2\lVert f\rVert_{\ell^{\infty}}^{\frac{1}{2}}+\varepsilon,

from which we obtain

|p(f(y))2f(y)|=|p(f(y))f(y)||p(f(y))+f(y)|2f12ε+ε2\left\lvert p(f(y))^{2}-f(y)\right\rvert=\left\lvert p(f(y))-\sqrt{f(y)}\right\rvert\left\lvert p(f(y))+\sqrt{f(y)}\right\rvert\leq 2\lVert f\rVert_{\ell^{\infty}}^{\frac{1}{2}}\varepsilon+\varepsilon^{2}

and the desired estimate for Fourier coefficients follows from (3). The moreover part follows directly by (4). ∎

In the sequel, we usually apply Proposition 3.22 to fLf-L for some LfminL\leq f_{\min}. We also remark that the finiteness of GG is crucial to the estimate of Fourier coefficients in Proposition 3.22. In general, Fourier coefficients of an LL^{\infty} function can be arbitrarily large. More importantly, Proposition 3.22 validates Algorithm 1, in which we relax the \ell^{\infty}-norm in (10) by the 1\ell^{1}-norm.

4. Validation of FSOS certificates

We address Problem 1.2 in this section. For the sake of convenience, we reproduce the problem below. See 1.2 Clearly, to solve Problem 1.2, it is sufficient to check the inequality

(14) fL+εjJ|gj|2iI|hi|2<1ε,\left\lVert f-L+\varepsilon-\frac{\sum_{j\in J}|g_{j}|^{2}}{\sum_{i\in I}|h_{i}|^{2}}\right\rVert_{\ell^{\infty}}<1-\varepsilon,

for some ε[0,1)\varepsilon\in[0,1). Along this direction, we propose two methods to validate ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}). One is given by checking the inequalities:

(15) iI|hi|21,iI\savestack\tmpbox\stretchto\scaleto\scalerel[width("|hi|2(fL+ε)")] 0.5ex\stackon[1pt]|hi|2(fL+ε)\tmpboxjJ|gj|2^1<1ε,\sum_{i\in I}|h_{i}|^{2}\geq 1,\quad\left\lVert\sum_{i\in I}\savestack{\tmpbox}{\stretchto{\scaleto{\scalerel*[width("|h_{i}|^{2}\left(f-L+\varepsilon\right)")]{\kern-0.6pt\bigwedge\kern-0.6pt}{\rule[-505.89pt]{4.30554pt}{505.89pt}}}{}}{0.5ex}}\stackon[1pt]{|h_{i}|^{2}\left(f-L+\varepsilon\right)}{\tmpbox}-\sum_{j\in J}\widehat{|g_{j}|^{2}}\ \right\rVert_{\ell^{1}}<1-\varepsilon,

which is justified by Proposition 4.2. In the sequel, we call this method the 1\ell^{1}-norm validation.

The other method is only for functions on Γ2n\Gamma_{2}^{n}. It is called the validation by sampling. It checks iIhi21\sum_{i\in I}h_{i}^{2}\geq 1 and

(16) |jJ|gj(y)|2(iI|hi(y)|2)(f(y)L)|n2(deg(f)+d)\left\lvert\sum_{j\in J}|g_{j}(y)|^{2}-\left(\sum_{i\in I}|h_{i}(y)|^{2}\right)(f(y)-L)\right\rvert\leq n^{-2(\deg(f)+d)}

for yΓ2ny\in\Gamma_{2}^{n} such that i=1nyi2deg(f)n+2d\sum_{i=1}^{n}y_{i}\leq 2\deg(f)-n+2d. Here d=2max{deg(gj),deg(hi)}iI,jJd=2\max\{\deg(g_{j}),\deg(h_{i})\}_{i\in I,j\in J}. Theorem 4.6 ensures the fidelity of this method.

4.1. Validating FSOS certificates by 1\ell^{1}-norm

We recall that the 1\ell^{1}-norm validation is checking the two inequalities in (15). To this end, we need the following proposition, which is in the same spirit as Proposition 3.22.

Lemma 4.1.

Let ff be a function on GG and let ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) be a pair of finite families of functions on GG such that iI|hi|21\sum_{i\in I}|h_{i}|^{2}\geq 1 and

iI|hi|2f^jJ|gj|2^1<ε\left\lVert\sum_{i\in I}\widehat{|h_{i}|^{2}f}-\sum_{j\in J}\widehat{|g_{j}|^{2}}\right\rVert_{\ell^{1}}<\varepsilon

for some ε>0\varepsilon>0. Then we have

fj|gj|2i|hi|2<ε.\left\lVert f-\frac{\sum_{j}|g_{j}|^{2}}{\sum_{i}|h_{i}|^{2}}\right\rVert_{\ell^{\infty}}<\varepsilon.
Proof.

We denote r=(iI|hi|2)fjJ|gj|2r=\left(\sum_{i\in I}|h_{i}|^{2}\right)f-\sum_{j\in J}|g_{j}|^{2}. Since r^1<ε\lVert\widehat{r}\rVert_{\ell^{1}}<\varepsilon, we conclude that r<ε\lVert r\rVert_{\ell^{\infty}}<\varepsilon. The condition that i|hi|21\sum_{i}|h_{i}|^{2}\geq 1 implies

|f(y)j|gj(y)|2i|hi(y)|2|=|r(y)|i|hi(y)|2<ε,yG.\left|f(y)-\frac{\sum_{j}|g_{j}(y)|^{2}}{\sum_{i}|h_{i}(y)|^{2}}\right|=\frac{\left|r(y)\right|}{\sum_{i}|h_{i}(y)|^{2}}<\varepsilon,\quad y\in G.

Proposition 4.2 (validation by 1\ell^{1}-norm).

Let ff be an integer-valued function on GG and let LfminL\leq f_{\min} be an integer. If ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) is a pair of finite families of functions on GG such that (15) holds, then ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) is a rational FSOS certificate for fLf\geq L.

Proof.

If ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) satisfy (15), then by Lemma 4.1, we have

(fL+ε)j|gj|2i|hi|2<1ε.\left\lVert\left(f-L+\varepsilon\right)-\frac{\sum_{j}|g_{j}|^{2}}{\sum_{i}|h_{i}|^{2}}\right\rVert_{\ell^{\infty}}<1-\varepsilon.

This implies that ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) is a rational FSOS certificate for fLf\geq L. ∎

4.2. Validating FSOS certificates by sampling on Γ2n\Gamma_{2}^{n}

Next we discuss the method of validation by sampling (16). The method relies on the extrapolation theory of functions on [0,1]n[0,1]^{n}.

Lemma 4.3 (extrapolation).

Let RR be a polynomial of degree at most dd in nn variables x1,,xnx_{1},\dots,x_{n} and let ε\varepsilon be a positive number. We define

S{ξ=(ξ1,,ξn)2n:j=1nξjd}.S\coloneqq\left\{\xi=(\xi_{1},\dots,\xi_{n})\in\mathbb{Z}_{2}^{n}:\sum_{j=1}^{n}\xi_{j}\leq d\right\}.

If monomials in RR are square free, dkd\leq k and |R(ξ)|<ε\lvert R(\xi)\rvert<\varepsilon for each ξS\xi\in S, then

  1. (i)

    for any x[0,1]nx\in[0,1]^{n} with |x|d+1|\lceil x\rceil|\geq d+1, we have

    |R(x)|εj=0d2j(|x|j)|R(x)|\leq\varepsilon\sum_{j=0}^{d}2^{j}\binom{|\lceil x\rceil|}{j}

    where (x1,,xn)(x1,,xn)\lceil(x_{1},\dots,x_{n})\rceil\coloneqq(\lceil x_{1}\rceil,\dots,\lceil x_{n}\rceil).

  2. (ii)

    if moreover x2nx\in\mathbb{Z}_{2}^{n}, we have

    (17) |R(x)|ε(|x|d)p=0d|x|d|x|p(dp).|R(x)|\leq\varepsilon\binom{|x|}{d}\sum_{p=0}^{d}\frac{|x|-d}{|x|-p}\binom{d}{p}.

    In particular, we have

    (18) |R(x)|ε(|x|d)(2dd|x|2d1).|R(x)|\leq\varepsilon\binom{|x|}{d}\left(2^{d}-\frac{d}{|x|}2^{d-1}\right).
Proof.

Since elements in SS can be used to denote both multi-indices and points in [0,1]n[0,1]^{n}, to avoid confusion we define ΛS\Lambda\coloneqq S and we denote multi-indices (resp. points) in Λ\Lambda (resp. SS) by α,β,\alpha,\beta,\dots (resp. ξ,ζ,\xi,\zeta,\dots). We sort SS (resp. Λ\Lambda) by the graded reverse lexicographic order so that

S={ξ1,,ξN},Λ={α1,,αN}.S=\{\xi_{1},\dots,\xi_{N}\},\quad\Lambda=\{\alpha_{1},\dots,\alpha_{N}\}.

Since RR has no square-free monomials, we may write

R(x)=αΛcαxαR(x)=\sum_{\alpha\in\Lambda}c_{\alpha}x^{\alpha}

for some cαc_{\alpha}\in\mathbb{R}. Here xαx^{\alpha} denotes the monomial x1α1xnαnx_{1}^{\alpha_{1}}\cdots x_{n}^{\alpha_{n}}. Evaluating R(x)R(x) at points in SS, we obtain

(19) R(ξ)=αΛcαξα,ξS.R(\xi)=\sum_{\alpha\in\Lambda}c_{\alpha}\xi^{\alpha},\quad\xi\in S.

Therefore (19) can be written as

(20) [cα1cαN][ξ1α1ξNα1ξ1αNξNαN]=[R(ξ1),,R(ξN)].\begin{bmatrix}c_{\alpha_{1}}&\cdots&c_{\alpha_{N}}\end{bmatrix}\begin{bmatrix}\xi_{1}^{\alpha_{1}}&\cdots&\xi_{N}^{\alpha_{1}}\\ \vdots&\ddots&\vdots\\ \xi_{1}^{\alpha_{N}}&\cdots&\xi_{N}^{\alpha_{N}}\end{bmatrix}=\begin{bmatrix}R(\xi_{1}),\cdots,R(\xi_{N})\end{bmatrix}.

By definition, it is straightforward to verify that

ξjαi=ξj1αi1ξjkαik={1,ifαiαj,0,otherwise.\xi_{j}^{\alpha_{i}}=\xi_{j1}^{\alpha_{i1}}\cdots\xi_{jk}^{\alpha_{ik}}=\begin{cases}1,\leavevmode\nobreak\ \text{if}\leavevmode\nobreak\ \alpha_{i}\leq\alpha_{j},\\ 0,\leavevmode\nobreak\ \text{otherwise}.\end{cases}

Here (a1,,an)(b1,,bn)(a_{1},\dots,a_{n})\leq(b_{1},\dots,b_{n}) if and only if aibi,i=1,,na_{i}\leq b_{i},i=1,\dots,n.

Next we investigate the structure of the matrix A(ξjαi)i,j=1NA\coloneqq(\xi_{j}^{\alpha_{i}})_{i,j=1}^{N}. We partition columns and rows of AA by |ξj||\xi_{j}|’s and |αi||\alpha_{i}|’s respectively. Namely, we have A=(Aqp)p,q=0dA=(A_{q}^{p})_{p,q=0}^{d} where AqpA_{q}^{p} is a (np)×(nq)\binom{n}{p}\times\binom{n}{q} matrix whose elements are ξjαi\xi_{j}^{\alpha_{i}}’s with |αi|=p|\alpha_{i}|=p and |ξj|=q|\xi_{j}|=q. The matrix AqpA^{p}_{q} has the following properties:

  • Aqp=0A^{p}_{q}=0 if p>qp>q;

  • App=I(np)A^{p}_{p}=I_{\binom{n}{p}};

  • Aq0=1(nq)A^{0}_{q}=\text{1}_{\binom{n}{q}}, where 1k\text{1}_{k} denotes the kk-dimensional row vector whose elements are all equal to one;

  • if |ξ|=q>p|\xi|=q>p, then the column of AqpA^{p}_{q} determined by ξ\xi contains exactly (qp)\binom{q}{p} nonzero elements which are all equal to one.

Thus the matrix AA can be written as

(21) [\@arstrut|ξ|=0|ξ|=1|ξ|=2|ξ|=d-1|ξ|=d\\|α|=011n1(n2)1(n-d1)1(nd)\\|α|=10InA12A1-d1A1d\\|α|=200I(n2)A2-d1A2d\\\\|α|=d-1000I(n-d1)A-d1d\\|α|=d00000I(nd)] .\hbox{}\vbox{\kern 0.86108pt\hbox{$\kern 0.0pt\kern 2.5pt\kern-5.0pt\left[\kern 0.0pt\kern-2.5pt\kern-5.55557pt\vbox{\kern-0.86108pt\vbox{\vbox{ \halign{\kern\arraycolsep\hfil\@arstrut$\kbcolstyle#$\hfil\kern\arraycolsep& \kern\arraycolsep\hfil$\@kbrowstyle#$\ifkbalignright\relax\else\hfil\fi\kern\arraycolsep&& \kern\arraycolsep\hfil$\@kbrowstyle#$\ifkbalignright\relax\else\hfil\fi\kern\arraycolsep\cr 5.0pt\hfil\@arstrut$\scriptstyle$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle|\xi|=0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle|\xi|=1$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle|\xi|=2$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\cdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle|\xi|=d-1$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle|\xi|=d\\|\alpha|=0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 1$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\text{1}_{n}$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\text{1}_{\binom{n}{2}}$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\cdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\text{1}_{\binom{n}{d-1}}$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\text{1}_{\binom{n}{d}}\\|\alpha|=1$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle I_{n}$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle A^{1}_{2}$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\cdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle A^{1}_{d-1}$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle A^{1}_{d}\\|\alpha|=2$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle I_{\binom{n}{2}}$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\cdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle A^{2}_{d-1}$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle A^{2}_{d}\\\vdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\vdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\vdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\vdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\ddots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\vdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\vdots\\|\alpha|=d-1$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle\vdots$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle I_{\binom{n}{d-1}}$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle A^{d-1}_{d}\\|\alpha|=d$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle 0$\hfil\kern 5.0pt&5.0pt\hfil$\scriptstyle I_{\binom{n}{d}}$\hfil\kern 5.0pt\crcr}}}}\right]$}}.

In particular, AA is invertible and hence one can determine

(22) [cα1cαN]=[R(ξ1),,R(ξN)]A1.\begin{bmatrix}c_{\alpha_{1}}&\cdots&c_{\alpha_{N}}\end{bmatrix}=\begin{bmatrix}R(\xi_{1}),\cdots,R(\xi_{N})\end{bmatrix}A^{-1}.

We claim that A1=((1)qp+1Aqp)p,q=0dA^{-1}=((-1)^{q-p+1}A^{p}_{q})_{p,q=0}^{d}, i.e.,

(23) A1=[11n1(n2)(1)d11(nd1)(1)d1(nd)0InA21(1)d2Ad11(1)d1Ad100I(n2)(1)d3Ad12(1)d2Ad2000I(nd1)(1)Add10000I(nd)].A^{-1}=\begin{bmatrix}1&-\text{1}_{n}&\text{1}_{\binom{n}{2}}&\cdots&(-1)^{d-1}\text{1}_{\binom{n}{d-1}}&(-1)^{d}\text{1}_{\binom{n}{d}}\\ 0&I_{n}&-A^{1}_{2}&\cdots&(-1)^{d-2}A^{1}_{d-1}&(-1)^{d-1}A^{1}_{d}\\ 0&0&I_{\binom{n}{2}}&\cdots&(-1)^{d-3}A^{2}_{d-1}&(-1)^{d-2}A^{2}_{d}\\ \vdots&\vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&0&\cdots&I_{\binom{n}{d-1}}&(-1)A^{d-1}_{d}\\ 0&0&0&\cdots&0&I_{\binom{n}{d}}\end{bmatrix}.

Indeed, one can verify (23) directly by properties of AqpA^{p}_{q} summarized before (21) and the identity

j=0k(1)j(kj)=0.\sum_{j=0}^{k}(-1)^{j}\binom{k}{j}=0.

In particular, elements in A1A^{-1} are all contained in {1,0,1}\{-1,0,1\}.

Now we are ready to prove (i). According to (23), we have for each i=1,,Ni=1,\dots,N that

(24) |cαi|#{nonzero elements in the αi-th column of A1}ε2|αi|ε.|c_{\alpha_{i}}|\leq\#\{\text{nonzero elements in the $\alpha_{i}$-th column of $A^{-1}$}\}\varepsilon\leq 2^{|\alpha_{i}|}\varepsilon.

This implies that for each x[0,1]nx\in[0,1]^{n} with |x|d+1|\lceil x\rceil|\geq d+1,

|R(x)|\displaystyle|R(x)| αΛ|cα|xα\displaystyle\leq\sum_{\alpha\in\Lambda}|c_{\alpha}|x^{\alpha}
=αΛαx|cα|\displaystyle=\sum_{\begin{subarray}{c}\alpha\in\Lambda\\ \alpha\leq\lceil x\rceil\end{subarray}}|c_{\alpha}|
εαΛαx2|α|\displaystyle\leq\varepsilon\sum_{\begin{subarray}{c}\alpha\in\Lambda\\ \alpha\leq\lceil x\rceil\end{subarray}}2^{|\alpha|}
=εj=0d2j#{αΛ:|α|=j,αx}\displaystyle=\varepsilon\sum_{j=0}^{d}2^{j}\#\{\alpha\in\Lambda:|\alpha|=j,\alpha\leq\lceil x\rceil\}
=εj=0d2j(|x|j).\displaystyle=\varepsilon\sum_{j=0}^{d}2^{j}\binom{|\lceil x\rceil|}{j}.

To prove (ii), we notice that for each x2nx\in\mathbb{Z}_{2}^{n},

R(x)=[R(ξ1),,R(ξN)]A1[xα1xαN].R(x)=\begin{bmatrix}R(\xi_{1}),\cdots,R(\xi_{N})\end{bmatrix}A^{-1}\begin{bmatrix}x^{\alpha_{1}}\\ \vdots\\ x^{\alpha_{N}}\end{bmatrix}.

For each βΛ\beta\in\Lambda, we have

xβ={1,ifβx,0,otherwise.x^{\beta}=\begin{cases}1,\leavevmode\nobreak\ \text{if}\leavevmode\nobreak\ \beta\leq x,\\ 0,\leavevmode\nobreak\ \text{otherwise}.\end{cases}

This implies that the α\alpha-th element of A1[xα1xαN]𝖳A^{-1}\begin{bmatrix}x^{\alpha_{1}}&\cdots&x^{\alpha_{N}}\end{bmatrix}^{\scriptscriptstyle\mathsf{T}} is zero if αx\alpha\not\leq x. Moreover, if αx\alpha\leq x, then the α\alpha-th element of A1[xα1xαN]𝖳A^{-1}\begin{bmatrix}x^{\alpha_{1}}&\cdots&x^{\alpha_{N}}\end{bmatrix}^{\scriptscriptstyle\mathsf{T}} is given by

αβxβΛ(1)|β||α|\displaystyle\sum_{\begin{subarray}{c}\alpha\leq\beta\leq x\\ \beta\in\Lambda\end{subarray}}(-1)^{|\beta|-|\alpha|} =q=|α|d|β|=q(1)q|α|\displaystyle=\sum_{q=|\alpha|}^{d}\sum_{|\beta|=q}(-1)^{q-|\alpha|}
=q=|α|d(1)q|α|(|x||α|q|α|)\displaystyle=\sum_{q=|\alpha|}^{d}(-1)^{q-|\alpha|}\binom{|x|-|\alpha|}{q-|\alpha|}
=j=0d|α|(1)j(|x||α|j)\displaystyle=\sum_{j=0}^{d-|\alpha|}(-1)^{j}\binom{|x|-|\alpha|}{j}
=(1)d|α|(|x||α|1d|α|).\displaystyle=(-1)^{d-|\alpha|}\binom{|x|-|\alpha|-1}{d-|\alpha|}.

Here the last equality follows from the identity j=0t(1)j(mj)=(1)t(m1t)\sum_{j=0}^{t}(-1)^{j}\binom{m}{j}=(-1)^{t}\binom{m-1}{t}. Thus we have

|R(x)|\displaystyle|R(x)| εαΛαx(|x||α|1d|α|)\displaystyle\leq\varepsilon\sum_{\begin{subarray}{c}\alpha\in\Lambda\\ \alpha\leq x\end{subarray}}\binom{|x|-|\alpha|-1}{d-|\alpha|}
=εp=0d(|x|p)(|x|p1dp)\displaystyle=\varepsilon\sum_{p=0}^{d}\binom{|x|}{p}\binom{|x|-p-1}{d-p}
(25) =ε(|x|d)p=0d|x|d|x|p(dp).\displaystyle=\varepsilon\binom{|x|}{d}\sum_{p=0}^{d}\frac{|x|-d}{|x|-p}\binom{d}{p}.

We consider the function f(t)=1/(|x|t)f(t)=1/(|x|-t) on [0,d][0,d]. It is straightforward to verify that ff is a convex function. Therefore, we have

f(t)g(t)1|x|+t|x|(|x|d),t[0,d],f(t)\leq g(t)\coloneqq\frac{1}{|x|}+\frac{t}{|x|(|x|-d)},\quad t\in[0,d],

which follows from the convexity of ff and g(0)=f(0),g(d)=f(d)g(0)=f(0),g(d)=f(d). This implies that

p=0d|x|d|x|p(dp)\displaystyle\sum_{p=0}^{d}\frac{|x|-d}{|x|-p}\binom{d}{p} (|x|d)p=0d(1|x|+p|x|(|x|d))(dp)\displaystyle\leq(|x|-d)\sum_{p=0}^{d}\left(\frac{1}{|x|}+\frac{p}{|x|(|x|-d)}\right)\binom{d}{p}
=|x|d|x|2d+d|x|2d1\displaystyle=\frac{|x|-d}{|x|}2^{d}+\frac{d}{|x|}2^{d-1}
(26) =2dd|x|2d1.\displaystyle=2^{d}-\frac{d}{|x|}2^{d-1}.

Here the first equality follows from identities p=0d(dp)=2d\sum_{p=0}^{d}\binom{d}{p}=2^{d} and p=0dp(dp)=d2d1\sum_{p=0}^{d}p\binom{d}{p}=d2^{d-1}. Combining (4.2) and (4.2), we obtain the desired estimate for |R(x)||R(x)|. ∎

Remark 4.4.

Extrapolation results which are similar to Lemma 4.3 are available in the literature [RS10, She20]. According to [She20, Lemma 2.11], it holds that

|R(x)|ε(|x|d)2d,x2n,|x|d+1,|R(x)|\leq\varepsilon\binom{|x|}{d}2^{d},\quad x\in\mathbb{Z}_{2}^{n},|x|\geq d+1,

for which our estimate in (18) provides an improvement.

Moreover, (17) is optimal in the sense that there exists some degree dd polynomial RR such that the equality holds. Indeed, we may take RR to be the one such that

R(ξ)=(1)d+|ξ|ε,ξS,R(\xi)=(-1)^{d+|\xi|}\varepsilon,\quad\xi\in S,

which can be obtained by interpolation. By (4.2), it is clear that the equality in (17) holds.

Lastly, either by (i) or (ii) in Lemma 4.3 we have

maxx[0,1]n|R(x)|<2dNε,\max_{x\in[0,1]^{n}}\lvert R(x)\rvert<2^{d}N\varepsilon,

where N|S|=j=0d(nj)N\coloneqq|S|=\sum_{j=0}^{d}\binom{n}{j}.

Proposition 4.5.

Let ff be an integer-valued function on Γ2n\Gamma_{2}^{n}. If g,hg,h are non-negative functions on Γ2n\Gamma_{2}^{n} of degree at most dd and LL is an integer such that

  • deg(f)+dn\deg(f)+d\leq n;

  • minyΓ2n|h(y)|1\min_{y\in\Gamma_{2}^{n}}|h(y)|\geq 1;

  • maxyΓ2n,i=1nyi2deg(f)n+2d|g(y)h(y)(f(y)L)|12N2\max_{y\in\Gamma_{2}^{n},\leavevmode\nobreak\ \sum_{i=1}^{n}y_{i}\leq 2\deg(f)-n+2d}|g(y)-h(y)(f(y)-L)|\leq\frac{1}{2N^{2}} where Nj=0deg(f)+d(nj)N\coloneqq\sum_{j=0}^{\deg(f)+d}\binom{n}{j}.

Then we have f(y)Lf(y)\geq L for any yΓ2ny\in\Gamma_{2}^{n}.

Proof.

We observe that the following inequality holds for any yΓ2ny\in\Gamma_{2}^{n}:

|g(y)/h(y)(f(y)L)|=|g(y)h(y)(f(y)L)||h(y)||g(y)h(y)(f(y)L)|.|g(y)/h(y)-(f(y)-L)|=\frac{|g(y)-h(y)(f(y)-L)|}{|h(y)|}\leq|g(y)-h(y)(f(y)-L)|.

Thus it is sufficient to prove maxx2n|R(x)|<1/2\max_{x\in\mathbb{Z}_{2}^{n}}|R(x)|<1/2 where

R(x)g(2x1)h(2x1)(f(2x1)L),x2n.R(x)\coloneqq g(2x-1)-h(2x-1)(f(2x-1)-L),\quad x\in\mathbb{Z}_{2}^{n}.

We notice that R(x)R(x) is a polynomial of degree at most deg(f)+d\deg(f)+d and by assumption we also have

maxx2n,|x|deg(f)+d|R(x)|12N2.\max_{x\in\mathbb{Z}_{2}^{n},|x|\leq\deg(f)+d}|R(x)|\leq\frac{1}{2N^{2}}.

Applying Lemma 4.3 we obtain

maxx2n|R(x)|<2deg(f)+dN2N212.{\max_{x\in\mathbb{Z}_{2}^{n}}|R(x)|<\frac{2^{\deg(f)+d}N}{2N^{2}}\leq\frac{1}{2}}.

Now we are able to prove the correctness of the validation by sampling (16).

Theorem 4.6 (validation by sampling).

Let ff be an integer-valued function on Γ2n\Gamma_{2}^{n}. Suppose that ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) is a pair of finite families of functions on Γ2n\Gamma_{2}^{n} such that

  1. (i)

    iI|hi|21\sum_{i\in I}|h_{i}|^{2}\geq 1;

  2. (ii)

    d2maxjJ,iI{deggj,deghi}d\coloneqq 2\max_{j\in J,i\in I}\{\deg g_{j},\deg h_{i}\} and deg(f)+dn\deg(f)+d\leq n;

  3. (iii)

    |jJ|gj(y)|2(iI|hi(y)|2)(f(y)L)|n2(deg(f)+d)\left\lvert\sum_{j\in J}|g_{j}(y)|^{2}-\left(\sum_{i\in I}|h_{i}(y)|^{2}\right)(f(y)-L)\right\rvert\leq n^{-2(\deg(f)+d)} for xΓ2nx\in\Gamma_{2}^{n} such that i=1nyi2deg(f)n+2d\sum_{i=1}^{n}y_{i}\leq 2\deg(f)-n+2d.

Then ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) is a rational FSOS certificate for fLf\geq L.

According to Theorem 3.18, given any constant M>0M>0, there exists some constant C=C(M)>0C=C(M)>0 such that if f^M\lVert\widehat{f}\rVert_{\ell^{\infty}}\leq M, then ff admits a rational FSOS certificate of degree Cdeg(f)3log2nC\deg(f)^{3}\log^{2}n for fLf\geq L. Thus in Theorem 4.6 we may set d=Cdeg(f)3log2nd=C\deg(f)^{3}\log^{2}n so that (iii) becomes

|jJ|gj(x)|2(iI|hi(x)|2)(f(x)L)|n2(Cdeg(f)3log2n+deg(f))\left\lvert\sum_{j\in J}|g_{j}(x)|^{2}-\left(\sum_{i\in I}|h_{i}(x)|^{2}\right)(f(x)-L)\right\rvert\leq n^{-2(C\deg(f)^{3}\log^{2}n+\deg(f))}

for aΓ2na\in\Gamma_{2}^{n} such that i=1nai2Cdeg(f)3log2nn+2deg(f)\sum_{i=1}^{n}a_{i}\leq 2C\deg(f)^{3}\log^{2}n-n+2\deg(f). If deg(f)\deg(f) is fixed, then it suffices to validate ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) by checking quasi-polynomially many inequalities.

5. Computing FSOS certificates by the SDP relaxation

As shown in Example 3.15, rank one (polynomial and rational) FSOS certificates, although they exist and can be constructed explicitly, suffer from high sparsity. Moreover, such FSOS certificates may not be validated by methods we discussed in Section 4.

Example 5.1 (continues= ex:running example-3).

Let g,hg,h be defined by (13). We have

(f12)h2g2=1.790.777y10.777y20.777y3+0.669y1y2+0.669y1y3+0.669y2y3+2.12y1y2y3.\left(f-\frac{1}{2}\right)h^{2}-g^{2}=-1.79-0.777y_{1}-0.777y_{2}-0.777y_{3}+0.669y_{1}y_{2}+0.669y_{1}y_{3}+0.669y_{2}y_{3}+2.12y_{1}y_{2}y_{3}.

By a direct calculation, we obtain (f12)h2^g2^1=8.16\lVert\widehat{(f-\frac{1}{2})h^{2}}-\widehat{g^{2}}\rVert_{\ell^{1}}=8.16 and our 1\ell^{1}-norm validation (15) fails for ({g},{h})(\{g\},\{h\}). Noticing that for any μ0\mu\neq 0, ({μg},{μh})(\{\mu g\},\{\mu h\}) is a still a rank one rational FSOS certificate, thus one may expect to choose a sufficiently small μ\mu such that (f12)(μh)2^(μg)2^1=8.16μ2<12\lVert\widehat{(f-\frac{1}{2})(\mu h)^{2}}-\widehat{(\mu g)^{2}}\rVert_{\ell^{1}}=8.16\mu^{2}<\frac{1}{2} and the 1\ell^{1}-norm validation can work. However, since the range of h2h^{2} is {12.0,22.2,29.9,55.7}\{12.0,22.2,29.9,55.7\}, to ensure (μh)21(\mu h)^{2}\geq 1, μ2\mu^{2} is at least 112\frac{1}{12}. Hence (f12)(μh)2^(μg)2^10.68\lVert\widehat{(f-\frac{1}{2})(\mu h)^{2}}-\widehat{(\mu g)^{2}}\rVert_{\ell^{1}}\geq 0.68 and 1\ell^{1}-norm validation is still not applicable.

Thus to find sparse FSOS certificates which can be validated efficiently, it is imperative to consider those of higher ranks. This section is concerned with numerical computation of sparse FSOS certificates. After a discussion on how to translate the problem of computing FSOS certificates to an SDP problem, we present Algorithm 1. The main idea of Algorithm 1 is as follows:

  • We translate Problem 1.1 into the SDP problem (5).

  • By relaxing equality constraints in (5) to inequalities, we obtain (5).

  • Our low degree theorems (Theorems 3.8 and 3.18) ensure that it suffices to search for low degree FSOS certificates and this speeds up the algorithm.

Let f:Gf:G\to\mathbb{Z} be an integer-valued function on finite abelian group G=Γn1××ΓndG=\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}}. Using the Gram matrix defined in (7), we can convert the Problem 1.1 to the following semidefinite programming problem, which is a variant of the formulation in [KLYZ12]:

minU|S|×|S|V|T|×|T|1,\displaystyle\min_{\begin{subarray}{c}U\in\mathbb{R}^{|S|\times|S|}\\ V\in\mathbb{R}^{|T|\times|T|}\end{subarray}}1,
(27) subject to βα=λUα,β=γν+ζ=λγsupp(f)(fL)γVν,ζ,λ(SS)(TT+supp(f)),\displaystyle\sum_{\beta-\alpha=\lambda}U_{\alpha,\beta}=\sum_{\begin{subarray}{c}\gamma-\nu+\zeta=\lambda\\ \gamma\in\operatorname{supp}(f)\end{subarray}}(f-L)_{\gamma}V_{\nu,\zeta},\quad\forall\leavevmode\nobreak\ \lambda\in\left(S-S\right)\cup\left(T-T+\operatorname{supp}(f)\right),
U0,V1|T|.\displaystyle U\succeq 0,\leavevmode\nobreak\ \leavevmode\nobreak\ V\succeq\frac{1}{|T|}.

Here SS (resp. TT) is a subset of n1××nd\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}, columns and rows of the matrix U=(Uα,β)α,βS|S|×|S|U=(U_{\alpha,\beta})_{\alpha,\beta\in S}\in\mathbb{C}^{|S|\times|S|} (resp. V=(Vν,ζ)ν,ζT|T|×|T|V=(V_{\nu,\zeta})_{\nu,\zeta\in T}\in\mathbb{C}^{|T|\times|T|}) are indexed by elements in SS (reps. TT) and (fL)γ(f-L)_{\gamma} denotes the Fourier coefficient of fLf-L at χγG^\chi_{\gamma}\in\widehat{G} defined in (2). In the following we prove that the problem of certifying the nonnegativity of fLf-L is equivalent, up to a choice of TT, to solving (5).

Proposition 5.2 (certifying nonnegativity by the SDP relaxation).

Given S,Tn1××ndS,T\subseteq\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}, each solution of (5) provides a rational FSOS for nonnegativity of ff. Conversely, any rational FSOS for nonnegativity of ff supplies a solution to (5) for some choices of SS and TT.

Proof.

Assume that (U,V)(U,V) is a solution to (5). In particular, U,V0U,V\succeq 0. We let

g=α,βSUα,βχβα,h=ν,ζTVν,ζχζν.g=\sum_{\alpha,\beta\in S}U_{\alpha,\beta}\chi_{\beta-\alpha},\quad h=\sum_{\nu,\zeta\in T}V_{\nu,\zeta}\chi_{\zeta-\nu}.

Clearly U,VU,V are Gram matrices of g,hg,h respectively. According to (8) we are able to find {gj}j=1rank(U)\{g_{j}\}_{j=1}^{\operatorname{rank}(U)} and {hi}i=1rank(V)\{h_{i}\}_{i=1}^{\operatorname{rank}(V)} such that g=j=1rank(U)|gj|2g=\sum_{j=1}^{\operatorname{rank}(U)}|g_{j}|^{2}, h=i=1rank(V)|hi|2h=\sum_{i=1}^{\operatorname{rank}(V)}|h_{i}|^{2} and

hf=g,j=1rank(U)supp(gj)S,i=1rank(V)supp(hi)T.hf=g,\quad\bigcup_{j=1}^{\operatorname{rank}(U)}\operatorname{supp}(g_{j})\subseteq S,\quad\bigcup_{i=1}^{\operatorname{rank}(V)}\operatorname{supp}(h_{i})\subseteq T.

Moreover, h1h\geq 1 on GG since V1|T|V\succeq\frac{1}{|T|}. Conversely, given a pair ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) of finite families such that

f=jJ|gj|2iI|hi|2,iI|hi|21,f=\frac{\sum_{j\in J}|g_{j}|^{2}}{\sum_{i\in I}|h_{i}|^{2}},\quad\sum_{i\in I}|h_{i}|^{2}\geq 1,

then Gram matrices UU and VV of jJ|gj|2\sum_{j\in J}|g_{j}|^{2} and iI|hi|2\sum_{i\in I}|h_{i}|^{2} clearly form a solution to (5) for S=jJsupp(gj)S=\bigcup_{j\in J}\operatorname{supp}(g_{j}) and T=n1××ndT=\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}. ∎

Remark 5.3.

By the proof of Proposition 5.2, it is clear that (S,T)(S,T) in (5) bounds the supports of rational FSOS certificates. Moreover, since rational FSOS include polynomial FSOS as a special case, it is clear that some solution (U,V)(U,V) of (5) should supply a polynomial FSOS certificate. Indeed, this occurs if VV is a diagonal matrix whose diagonal elements are at least 1|T|\frac{1}{|T|}.

We notice that there are linear equality constraints in (5), which are not favourable from the perspective of optimization. To resolve this issue, we consider (5) obtained by relaxing the equalities to inequalities in (5).

minU|S|×|S|V|T|×|T|\displaystyle\min_{\begin{subarray}{c}U\in\mathbb{R}^{|S|\times|S|}\\ V\in\mathbb{R}^{|T|\times|T|}\end{subarray}} 1,\displaystyle 1,
(28) subject to λ(SS)(TT+supp(f))|βα=λα,βSUα,βγ+ζν=λγsupp(f),ν,ζT(fL+12)γVν,ζ|<12,\displaystyle\sum_{\lambda\in\left(S-S\right)\bigcup\left(T-T+\operatorname{supp}(f)\right)}\left\lvert\sum_{\begin{subarray}{c}\beta-\alpha=\lambda\\ \alpha,\beta\in S\end{subarray}}U_{\alpha,\beta}-\sum_{\begin{subarray}{c}\gamma+\zeta-\nu=\lambda\\ \gamma\in\operatorname{supp}(f),\nu,\zeta\in T\end{subarray}}\left(f-L+\frac{1}{2}\right)_{\gamma}V_{\nu,\zeta}\right\rvert<\frac{1}{2},
U0,V1|T|.\displaystyle U\succeq 0,\leavevmode\nobreak\ \leavevmode\nobreak\ V\succeq\frac{1}{|T|}.

Clearly, (5) is indeed a relaxation of (5), and its inequality constraints can be rewritten as

(29) iI\savestack\tmpbox\stretchto\scaleto\scalerel[width("|hi|2(fL+12)")] 0.5ex\stackon[1pt]|hi|2(fL+12)\tmpboxjJ|gj|2^1<12.\left\lVert\sum_{i\in I}\savestack{\tmpbox}{\stretchto{\scaleto{\scalerel*[width("|h_{i}|^{2}\left(f-L+\frac{1}{2}\right)")]{\kern-0.6pt\bigwedge\kern-0.6pt}{\rule[-505.89pt]{4.30554pt}{505.89pt}}}{}}{0.5ex}}\stackon[1pt]{|h_{i}|^{2}\left(f-L+\frac{1}{2}\right)}{\tmpbox}-\sum_{j\in J}\widehat{|g_{j}|^{2}}\right\rVert_{\ell^{1}}<\frac{1}{2}.

In particular, (29) can be recognized as an 1\ell^{1}-norm relaxation of

iI\savestack\tmpbox\stretchto\scaleto\scalerel[width("|hi|2(fL+12)")] 0.5ex\stackon[1pt]|hi|2(fL+12)\tmpboxjJ|gj|2^<12,\left\lVert\sum_{i\in I}\savestack{\tmpbox}{\stretchto{\scaleto{\scalerel*[width("|h_{i}|^{2}\left(f-L+\frac{1}{2}\right)")]{\kern-0.6pt\bigwedge\kern-0.6pt}{\rule[-505.89pt]{4.30554pt}{505.89pt}}}{}}{0.5ex}}\stackon[1pt]{|h_{i}|^{2}\left(f-L+\frac{1}{2}\right)}{\tmpbox}-\sum_{j\in J}\widehat{|g_{j}|^{2}}\right\rVert_{\ell^{\infty}}<\frac{1}{2},

which is simply (14) with ε=12\varepsilon=\frac{1}{2}.

Now we are ready to present Algorithm 1 which computes a rational FSOS certificate for fLf\geq L, where ff is an integer-valued function f:Gf:G\to\mathbb{Z} and LL is an integer. In particular, if we set T={1}T=\{1\} and S:={νn1××nd:zνsupp(r)}S:=\{\nu\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}:z^{\nu}\in\operatorname{supp}(r)\} in step 6 of Algorithm 1, we obtain an algorithm to compute a sparse polynomial FSOS certificate for fLf\geq L.

Algorithm 1 Rational FSOS certificate for lower bound
1:ff is an integer-valued function f:Gf:G\to\mathbb{Z}, L,M,dL,M,d and kk are integers such that the image of ff is contained in [L,M][L,M]\cap\mathbb{Z}.
2:rational FSOS certificate ({gα}αS,{hν}νT)(\{g_{\alpha}\}_{\alpha\in S},\{h_{\nu}\}_{\nu\in T}) for fLf\geq L.
3:compute the degree dd polynomial pd(t)p_{d}(t) which approximates t\sqrt{t} at points i+12,i=0,,MLi+\frac{1}{2},i=0,\dots,M-L by a linear programming solver.
4:while problem in step 7 is not feasible do
5:     compute the first-kk-truncation rr of pd(fL+12)p_{d}(f-L+\frac{1}{2}). \triangleright see (31)
6:     set T:={νn1××nd:χνsupp(r)}T:=\{\nu\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}:\chi_{\nu}\in\operatorname{supp}(r)\}, S:=T+{αn1××nd:χαsupp(f)}S:=T+\{\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}:\chi_{\alpha}\in\operatorname{supp}(f)\}.
7:     solve problem (5), if it is not feasible, then increase the value of kk. \triangleright by any SDP solver
8:end while
9:compute Cholesky decompositions U=GGU=G^{*}G and V=HHV=H^{*}H, HI×|S|H\in\mathbb{C}^{I\times|S|}, GJ×|T|G\in\mathbb{C}^{J\times|T|}.
10:return gj:=αSGj,αχα,jJg_{j}:=\sum_{\alpha\in S}G_{j,\alpha}\chi_{\alpha},j\in J, hi:=ζTHi,ζχζ,iIh_{i}:=\sum_{\zeta\in T}H_{i,\zeta}\chi_{\zeta},i\in I.

In Algorithm 1, we again label columns and rows of matrices in |S|×|S|\mathbb{C}^{|S|\times|S|} (resp. |T|×|T|\mathbb{C}^{|T|\times|T|}) by elements in SS (resp. TT) respectively. Among all the steps in Algorithm 1, the rationale behind steps 710 is clear from the discussion above. Thus it remains to interpret step 3-6. Let pd(t)p_{d}(t) be the approximation polynomial obtained in step 3 justified by Proposition 3.22. The univariate polynomial pdp_{d} can be easily determined by optimization techniques. For instance, we can compute pd(t)=i=0daitip_{d}(t)=\sum_{i=0}^{d}a_{i}t^{i} by solving a linear programming problem:

(30) min(a0,,ad)d+1λ,\displaystyle\min_{(a_{0},\dots,a_{d})\in\mathbb{R}^{d+1}}\lambda,
subject to|(i=0daiti)t|λ,t=12,32,,(fmaxL)+12.\displaystyle\text{subject to}\leavevmode\nobreak\ \left|\left(\sum_{i=0}^{d}a_{i}t^{i}\right)-\sqrt{t}\right|\leq\lambda,\quad t=\frac{1}{2},\frac{3}{2},\dots,(f_{\max}-L)+\frac{1}{2}.

The first-kk-truncation in step 5 is defined as follows: we order terms of pd(f)p_{d}(f) by the magnitude of their coefficients:

pd(f)=c1χα1++cDχαD,p_{d}(f)=c_{1}\chi_{\alpha_{1}}+\dots+c_{D}\chi_{\alpha_{D}},

where |c1||cD||c_{1}|\geq\cdots\geq|c_{D}| and {χα1,,χαD}=supp(pd(f))\{\chi_{\alpha_{1}},\dots,\chi_{\alpha_{D}}\}=\operatorname{supp}(p_{d}(f)). Then we take the first kk terms to obtain the first-kk-truncation of pd(f)p_{d}(f):

(31) r=c1χα1++ckχαk.r=c_{1}\chi_{\alpha_{1}}+\dots+c_{k}\chi_{\alpha_{k}}.

Clearly we have

r2fgh,g=jJ|gi|2,h=iJ|hi|2,r^{2}\approx f\approx\frac{g}{h},\quad g=\sum_{j\in J}|g_{i}|^{2},\quad h=\sum_{i\in J}|h_{i}|^{2},

where S,Tn1××ndS,T\subseteq\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}} and ({gj}jJ,{hi}iI)(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I}) are to be determined. The existence of rr suggests the existence of a rational certificate whose support contains supp(r)\operatorname{supp}(r), from which we heuristically choose TT to be the one in step 6. Roughly speaking, it is true that rr is already truncated, thus it may not be a good approximation of f\sqrt{f} anymore. However, the definition of the first-kk-truncation ensures that supp(r)\operatorname{supp}(r) still contains enough information of supp(f)\operatorname{supp}(\sqrt{f}). Now from the relation hfghf\approx g, it is clear why SS should be in the form of the one in step 6.

We provide the following example to illustrate Algorithm 1.

Example 5.4.

Let

f:Γ24,f(y)y1+y2+y3+y4+y1y2+2y1y33y1y4+8.f:\Gamma_{2}^{4}\mapsto\mathbb{Z},\leavevmode\nobreak\ f(y)\coloneqq y_{1}+y_{2}+y_{3}+y_{4}+y_{1}y_{2}+2y_{1}y_{3}-3y_{1}y_{4}+8.

We compute a rational FSOS certificate for fL2f\geq L\coloneqq 2 by Algorithm 1. We have

f(y)L+12=y1+y2+y3+y4+y1y2+2y1y33y1y4+132.f(y)-L+\frac{1}{2}=y_{1}+y_{2}+y_{3}+y_{4}+y_{1}y_{2}+2y_{1}y_{3}-3y_{1}y_{4}+\frac{13}{2}.

Clearly f(y)f^1=18f(y)\leq\|\widehat{f}\|_{\ell^{1}}=18 for all yΓ24y\in\Gamma_{2}^{4}. We take L=2L=2, M=18M=18, d=1d=1 and k=2k=2 in Algorithm 1. Then p1(t)=0.2097t+0.8971p_{1}(t)=0.2097t+0.8971 and the first-22-truncation of p1(fL+12)p_{1}(f-L+\frac{1}{2}) is 0.63y1y4+2.26-0.63y_{1}y_{4}+2.26. This implies that T={(0,0,0,0),(1,0,0,1)}T=\{(0,0,0,0),(1,0,0,1)\} and

S={(0,0,0,0),(0,0,0,1),(0,0,1,0),(0,0,1,1),(0,1,0,0),(0,1,0,1),(1,0,0,0),(1,0,0,1),(1,0,1,0),(1,0,1,1),(1,1,0,0),(1,1,0,1)}S=\left\{\begin{aligned} &(0,0,0,0),(0,0,0,1),(0,0,1,0),(0,0,1,1),(0,1,0,0),(0,1,0,1),\\ &(1,0,0,0),(1,0,0,1),(1,0,1,0),(1,0,1,1),(1,1,0,0),(1,1,0,1)\end{aligned}\right\}

One can find the rational FSOS certificate obtained by Algorithm 1 in Appendix A.

6. Applications of FSOS certificates

In this section, we present numerical experiments for three applications of FSOS certificates. The first application is the certificate problem for the function from [BGP16], which we discussed in Theorem 2.10 and Remark 3.4. The second application is the certificate problem for MAX-SAT. In the literature, a certificate for MAX-SAT can be obtained either by an inference rule called resolution [AH15, PCH20, BLM07, HMS11, PCH21] or by the polynomial SOS technique [HKM16, vvH08]. We illustrate in Section 6.2 that FSOS certificates provide another way to certify the MAX-SAT problem. The third application originates from graph theory. It is the certificate problem for maximum k-colorable subgraph (MkCS) problem, which reduces to the MAX-CUT problem when k=2k=2 [PY88, Boo75, GW95]. In Section 6.3, we consider the MkCS certificate problem for wheel graphs and complete graphs, respectively.

6.1. Numerical experiments on the function from [BGP16]

We recall from Theorem 2.10 that for each positive integer nn, the function

fn(x1,,xn)=(j=1nxjn2)(j=1nxjn21)f_{n}(x_{1},\dots,x_{n})=\left(\sum_{j=1}^{n}x_{j}-\left\lfloor\frac{n}{2}\right\rfloor\right)\left(\sum_{j=1}^{n}x_{j}-\left\lfloor\frac{n}{2}\right\rfloor-1\right)

on {0,1}n\{0,1\}^{n} can not be written as a rational FSOS of degrees lower than n/2\lfloor n/2\rfloor. However, according to Remark 3.4, fnf_{n} admits an explicit FSOS certificate of sparsity n+1n+1. In this subsection, we test Algorithm 1 on fnf_{n} for n=10,20,,100n=10,20,\dots,100. Numerical results are presented in Table 1. We list the sparsity of computed polynomial FSOS certificates in the second column. For comparison, we also list in the third column that the upper bound of the sparsity of a rational FSOS of degrees n/2\lfloor n/2\rfloor.

n sparsity upper bound of the sparsity [BGP16]
10 50 512512
20 200 524299524299
30 450 5.36871085.3687\cdot 10^{8}
40 800 5.497610115.4976\cdot 10^{11}
50 1250 5.629510145.6295\cdot 10^{14}
60 1800 5.764610175.7646\cdot 10^{17}
70 2450 5.90310205.903\cdot 10^{20}
80 3200 6.044610236.0446\cdot 10^{23}
90 4050 6.189710266.1897\cdot 10^{26}
100 5000 6.338310296.3383\cdot 10^{29}
Table 1. polynomial FSOS certificates for fn{f}_{n}

6.2. MAX-SAT certificate problem

In this subsection, we apply Algorithm 1 to solve the MAX-SAT certificate problem. To begin with, we first recall the definition of conjunctive normal form (CNF) formula.

Definition 6.1 (clause and CNF formula).

Let x1,,xs+rx_{1},\dots,x_{s+r} be variables on the Boolean field ({False,True},,)(\{\text{False},\text{True}\},\lor,\land). A clause (in variables x1,,xs+rx_{1},\dots,x_{s+r}) with (s+r)(s+r) literals is:

c=x1xs¬xs+1¬xs+r.c=x_{1}\lor\cdots\lor x_{s}\lor\lnot x_{s+1}\lor\cdots\lor\lnot x_{s+r}.

A kk-CNF formula (in variables x1,,xnx_{1},\dots,x_{n}) with mm clauses is:

ϕ=c1cm,\phi=c_{1}\land\cdots\land c_{m},

where cic_{i} is a clause (in variables x1,,xnx_{1},\dots,x_{n}) with at most kk literals for each 1im1\leq i\leq m.

An assignment of ϕ\phi is an evaluation of ϕ\phi at a point in {False,True}n\{\text{False},\text{True}\}^{n}. We say that a clause cc in ϕ\phi is satisfiable if there is an assignment such that the value of cc is True. Given a kk-CNF formula ϕ\phi, the MAX-SAT problem determines the maximum number of simultaneously satisfiable clauses in ϕ\phi by an assignment. Below we also clarify the MAX-SAT certificate problem.

Definition 6.2 (MAX-SAT certificate problem).

Given a kk-CNF formula ϕ\phi with mm clauses and a positive integer LmL\leq m, the MAX-SAT certificate problem asks for a proof of the existence of at most (mL)(m-L) simultaneously satisfiable clauses in ϕ\phi, or equivalently, a proof of the existence of at least LL simultaneously falsified clauses in ϕ\phi. Such a proof is called a MAX-SAT certificate for (ϕ,L)(\phi,L).

Next we will establish a connection between the MAX-SAT certificate problem and FSOS. To achieve this goal, we consider the following map sending a Boolean variable to a variable in Γ2\Gamma_{2}:

τ:{False,True}Γ2,τ(x)={1,ifx=False,1,ifx=True.\tau:\{\text{False},\text{True}\}\to\Gamma_{2},\quad\tau(x)=\begin{cases}1,\quad\text{if}\leavevmode\nobreak\ x=\text{False},\\ -1,\quad\text{if}\leavevmode\nobreak\ x=\text{True}.\end{cases}

In fact, if we identify the truth values False and True with 0 and 11 respectively, then τ\tau is simply the bijective map between 2\mathbb{Z}_{2} and Γ2\Gamma_{2} defined by x12xx\mapsto 1-2x. It is obvious that under this identification τ\tau is a group isomorphism between 2\mathbb{Z}_{2} and Γ2\Gamma_{2}. In particular, we have τ(¬x)=τ(x)\tau(\lnot x)=-\tau(x). Moreover, if we regard a clause c=x1xs¬xs+1¬xs+rc=x_{1}\lor\cdots\lor x_{s}\lor\lnot x_{s+1}\lor\cdots\lor\lnot x_{s+r} as an integer-valued function on 2s+r\mathbb{Z}_{2}^{s+r}, then we have the commutative diagram shown in Figure 2.

2s+r{\mathbb{Z}_{2}^{s+r}}Γ2s+r{\Gamma_{2}^{s+r}}{\mathbb{Z}}c\scriptstyle{c}τ××τ\scriptstyle{\tau\times\cdots\times\tau}c(τ1××τ1)\scriptstyle{c\circ(\tau^{-1}\times\cdots\times\tau^{-1})}
Figure 2. conversion of a clause to a function

By a direct calculation, we may obtain that

(32) c(τ1××τ1)(y1,,ys+r)=1hs+r(y1,,ys,ys+1,,ys+r),c\circ(\tau^{-1}\times\cdots\times\tau^{-1})(y_{1},\dots,y_{s+r})=1-h_{s+r}(y_{1},\cdots,y_{s},-y_{s+1},\cdots,-y_{s+r}),

where for each positive integer ll, hlh_{l} is the multilinear polynomial in variables y1,,yly_{1},\dots,y_{l} defined by

(33) hl(y1,y2,..,yl)=12li=1l(1+yi).h_{l}(y_{1},y_{2},..,y_{l})=\frac{1}{2^{l}}\prod_{i=1}^{l}\left(1+y_{i}\right).

One can easily verify that for (y1,,yl)Γ2l(y_{1},\dots,y_{l})\in\Gamma_{2}^{l}, we have

(34) hl(y1,y2,..,yl)={1,if y1=y2==yl=1,0,otherwise.h_{l}(y_{1},y_{2},..,y_{l})=\begin{cases}1,\leavevmode\nobreak\ &\text{if }y_{1}=y_{2}=\cdots=y_{l}=1,\\ 0,\leavevmode\nobreak\ &\text{otherwise}.\end{cases}

It is the auxiliary function hlh_{l} defined in (33) which enables us to convert a kk-CNF formula to a function on Γ2n\Gamma_{2}^{n}. Before we proceed, it is worthy to recall that in the literature, one can convert a kk-CNF formula to a polynomial by several different methods. The degree of the resulting polynomial obtained by each method varies. For example, given a kk-CNF formula ϕ\phi with mm clauses, the arithmetization of ϕ\phi in [AB09, Section 8.5.1] and [Gol10, Section 4] has degree 3m3m and mm respectively, while the characteristic function of ϕ\phi in [BLM07, Definition 6] has degree kk, which is independent of mm. For our purpose, it is favorable in terms of computational complexity to convert a kk-CNF formula to a low degree polynomial. Therefore, we have the following variant of the characteristic function defined in [BLM07, Definition 6].

Definition 6.3.

We define the characteristic function fcf_{c} of a clause c=x1xs¬xs+1¬xs+rc=x_{1}\lor\cdots\lor x_{s}\lor\lnot x_{s+1}\lor\cdots\lor\lnot x_{s+r} as

fc(y)=hs+r(y1,,ys,ys+1,,ys+r),yΓ2s+r.f_{c}(y)=h_{s+r}(y_{1},\cdots,y_{s},-y_{s+1},\cdots,-y_{s+r}),\quad y\in\Gamma_{2}^{s+r}.

Given a CNF formula ϕ=i=1mci\phi=\bigwedge_{i=1}^{m}c_{i} in nn variables, we define its characteristic function as

fϕ(y)=i=1mfci(y),yΓ2n.f_{\phi}(y)=\sum_{i=1}^{m}f_{c_{i}}(y),\quad y\in\Gamma_{2}^{n}.
Example 6.4.

The characteristic function of the CNF formula:

ϕ=x1x2x3(¬x1¬x2¬x3)\phi=x_{1}\land x_{2}\land x_{3}\land\left(\lnot{x}_{1}\lor\lnot{x}_{2}\lor\lnot{x}_{3}\right)

is given by

fϕ(y1,y2,y3)\displaystyle f_{\phi}(y_{1},y_{2},y_{3}) =h1(y1)+h1(y2)+h1(y3)+h3(y1,y2,y3)\displaystyle=h_{1}(y_{1})+h_{1}(y_{2})+h_{1}(y_{3})+h_{3}(-y_{1},-y_{2},-y_{3})
=138+38y1+38y2+38y3+18y1y2+18y1y3+18y2y318y1y2y3.\displaystyle=\frac{13}{8}+\frac{3}{8}y_{1}+\frac{3}{8}y_{2}+\frac{3}{8}y_{3}+\frac{1}{8}y_{1}y_{2}+\frac{1}{8}y_{1}y_{3}+\frac{1}{8}y_{2}y_{3}-\frac{1}{8}y_{1}y_{2}y_{3}.

This is the function on Γ23\Gamma_{2}^{3} we discussed in Example 3.5.

We summarize some simple but useful properties of characteristic functions in the next proposition.

Proposition 6.5.

Let ϕ\phi be a kk-CNF formula in nn variables with mm clauses and let fϕf_{\phi} be the characteristic function of ϕ\phi. Then fϕf_{\phi} has the following properties:

  1. (i)

    The degree of fϕf_{\phi} is at most kk.

  2. (ii)

    The cardinality of supp(fϕ)\operatorname{supp}(f_{\phi}) is at most min{2km,i=0k(ni)}\min\left\{2^{k}m,\sum_{i=0}^{k}\binom{n}{i}\right\}.

  3. (iii)

    fϕf_{\phi} can be computed by O(2kkm)O(2^{k}km) operations. In particular, if we fix kk then fϕf_{\phi} can be computed by O(m)O(m) operations.

  4. (iv)

    For each x{True,False}nx\in\{\text{True},\text{False}\}^{n}, fϕ(τ(x))f_{\phi}(\tau(x)) is the number of clauses of ϕ\phi falsified by xx.

  5. (v)

    The image set of fϕf_{\phi} is contained in {0,,m}\{0,\dots,m\}.

  6. (vi)

    Let α(1/5,1]\alpha\in(1/5,1] and 0β<5α14α0\leq\beta<\frac{5\alpha-1}{4\alpha} be fixed real numbers. Assume the minimum of the number of simultaneously falsified clauses in ϕ\phi is Lmin=αmL_{\min}=\alpha m. For each positive integer LβLminL\leq\beta L_{\min}, there is a polynomial FSOS certificate for fϕLf_{\phi}\geq L whose degree is O(klogm)O\left(k\log m\right).

  7. (vii)

    For each LLminL\leq L_{\min}, there is a rational FSOS certificate for fϕLf_{\phi}\geq L whose degree is O(klog2m)O(k\log^{2}m).

Proof.

According to (33), for each positive integer ss, we first have deghs=s\deg h_{s}=s from which (i) follows. It is easy to check that the cardinality of supp(hs)\operatorname{supp}(h_{s}) is 2s2^{s}. Since the number of multilinear monomials in nn variables with degree at most kk is i=0k(ni)\sum_{i=0}^{k}\binom{n}{i}, we obtain (iii). Next one can easily check (ii) by (33). Lastly from (32) and (34), we may conclude that for each clause cc in nn variables, fc(τ(x1),,τ(xn))=1f_{c}(\tau(x_{1}),\dots,\tau(x_{n}))=1 if and only if cc is falsified by (x1,,xn){True,False}n(x_{1},\dots,x_{n})\in\{\text{True},\text{False}\}^{n}. This implies (iv) and (v) by the definition of fϕf_{\phi}. (vi) and (vii) are direct consequences of Theorems 3.8 and 3.18, respectively. ∎

Combining (iv), (vi) and (vii) in Proposition 6.5, we may easily obtain the following theorem on the existence of short MAX-SAT certificate.

Theorem 6.6 (short MAX-SAT certificate).

Let α(1/5,1]\alpha\in(1/5,1], 0β<5α14α0\leq\beta<\frac{5\alpha-1}{4\alpha} be fixed real numbers and let kk be a fixed positive integer.

  1. (i)

    For each kk-CNF formula in nn variables with m=O(n)m=O(n) clauses such that Lmin=αmL_{\min}=\alpha m, there is a MAX-SAT certificate for (ϕ,L)(\phi,L) of the form {hi}iI\{h_{i}\}_{i\in I} whose degree is O(logn)O(\log n).

  2. (ii)

    For each kk-CNF formula in nn variables with m=O(n)m=O(n) clauses, there is a MAX-SAT certificate for (ϕ,L)(\phi,L) of the form ({hi}iI,{gi}jJ)(\{h_{i}\}_{i\in I},\{g_{i}\}_{j\in J}) whose degree is O(log2n)O(\log^{2}n).

Remark 6.7.

Clearly, using characteristic functions, we are able to deal with the certificate problem associated with other problems such as UNSAT and MIN-SAT. In each instance, we certify the inequality fLf\geq L by FSOS certificate for appropriate choice of ff and LL. In Table 2, we list the corresponding choice of ff and LL respectively, where fϕf_{\phi} denotes the characteristic function of a CNF formula ϕ\phi and LminL_{\min} (resp. LmaxL_{\max}) is the minimum (maximum) of the number of simultaneously falsified clauses in ϕ\phi.

MAX-SAT UNSAT MIN-SAT
LL LminL_{\min} 1 Lmax-L_{\max}
ff fϕf_{\phi} fϕf_{\phi} fϕ-f_{\phi}
Table 2. adaptations to other problems

6.2.1. Experiments on MAX-3-SAT Benchmark Problems

Given a CNF formula ϕ\phi, we recall that the MAX-SAT certificate problem for ϕ\phi can be solved by {gi}iI\{g_{i}\}_{i\in I} if

fϕ^LϕiIgi2^1<1,\|\widehat{f_{\phi}}-L_{\phi}-\sum_{i\in I}\widehat{g_{i}^{2}}\|_{\ell^{1}}<1,

where LϕL_{\phi} is the maximum number of simultaneously satisfiable clauses in ϕ\phi. In [vvH08], bases Map{M}_{ap} and Mpt{M}_{pt} are proposed to compute FSOS of MAX-3-SAT problems with nn variables, where

Map\displaystyle M_{\text{ap}} ={1,yi}i=1n{yiyj:xi,xjoccur in same clause}i,j=1n,\displaystyle=\{1,y_{i}\}_{i=1}^{n}\cup\{y_{i}y_{j}:x_{i},x_{j}\leavevmode\nobreak\ \text{occur in same clause}\}_{i,j=1}^{n},
Mpt\displaystyle M_{\text{pt}} =Map{yiyjyk:xi,xj,xk occur in same clause}i,j,k=1n.\displaystyle=M_{\text{ap}}\cup\{y_{i}y_{j}y_{k}:x_{i},x_{j},x_{k}\text{ occur in same clause}\}_{i,j,k=1}^{n}.

We consider the MAX-3-SAT benchmark problems in 2009 MAX-SAT competitions222http://www.maxsat.udl.cat/09/index.php. CNF formulae in these problems are of 7070 variables and 300300 clauses. We compute FSOS certificates for these problems by MapM_{\text{ap}}, MptM_{\text{pt}} and Algorithm 1, respectively. Each of the three methods requires us to solve an SDP and we solve it by the ADMM algorithm in SDPNAL+[STYZ20].

We record our results in Table 3. Instances whose FSOS certificates can be found by the corresponding method are marked by “\surd” in the ”verifiability” column. Instances for which the corresponding method fails to find FSOS certificates are marked by “×\times” in the “verifiability” column. We also record the sparsity of the computed FSOS certificate in the column labelled by “sparsity”. For instances marked by “×\times”, the sparsity is the number of elements in MptM_{\text{pt}}.

The experimental results demonstrate that when there exists an FSOS certificate with basis MptM_{\text{pt}} or MapM_{\text{ap}}, Algorithm 1 can find an FSOS certificate with smaller basis. More importantly, when methods based on MptM_{\text{pt}} and MapM_{\text{ap}} fail, Algorithm 1 is still able to computes an FSOS certificate successfully.

No LϕL_{\phi} MptM_{\text{pt}}, MapM_{\text{ap}} Algorithm 1
verifiability sparsity verifiability sparsity
0 0 \surd 1122 \surd 1111
1 1 \surd 1122 \surd 1122
2 0 \surd 1113 \surd 1102
3 0 \surd 1131 \surd 1120
4 1 ×\times 2486 \surd 1691
5 0 \surd 1108 \surd 1097
6 1 ×\times 2486 \surd 1698
7 0 \surd 1130 \surd 1119
8 0 \surd 1125 \surd 1114
9 0 \surd 1124 \surd 1113
Table 3. polynomial FSOS certificates for random benchmark problems with 300 clauses

6.3. Maximum k-colorable subgraph certificate problem

Given a simple undirected graph 𝒢\mathcal{G} and a positive integer kk, the maximum k-colorable subgraph problem (MkCS) asks one to find a subgraph with maximum number of edges which can be colored with kk colors [PY88]. We remark that the MkCS problem considered in this paper is also called the maximum k-cut problem [FJ95, vS16, Sot14]. Moreover, in some resources [LFT92, JP11, CC10], the MkCS problem also refers to a closely related but different problem.

In this subsection, we exhibit how to certify the MkCS probelm by FSOS certificates.

Definition 6.8 (MkCS certificate problem).

Given a simple undirected graph 𝒢=(𝒱,)\mathcal{G}=(\mathcal{V},\mathcal{E}) and positive integers kk and LL, the MkCS certificate problem asks for a proof of the nonexistence of a kk-colorable subgraph with L+1L+1 edges.

Next we rephrase the MkCS certificate problem as a certificate problem for lower bound. To that end, we have two candidates for the underlying group structure: one is Γ3n\Gamma_{3}^{n} and the other one is Γ23n\Gamma_{2}^{3n}.

Let kk be a fixed positive integer and let 𝒢=(𝒱,)\mathcal{G}=(\mathcal{V},\mathcal{E}) be an undirected graph with vertex set 𝒱={1,2,3,,n}\mathcal{V}=\{1,2,3,...,n\} and edge set 𝒱×𝒱\mathcal{E}\subset\mathcal{V}\times\mathcal{V}. We consider the following two functions:

δ:Γk2\displaystyle\delta:\Gamma_{k}^{2} ,δ(y1,y2)={1,if y1=y2,0,otherwise.\displaystyle\to\mathbb{R},\quad\delta(y_{1},y_{2})=\begin{cases}1,&\mbox{if }y_{1}=y_{2},\\ 0,&\mbox{otherwise}.\end{cases}
(35) f𝒢,k:Γkn\displaystyle f_{\mathcal{G},k}:\Gamma_{k}^{n} ,f𝒢,k(y1,y2,,yn)=(i,j)Eδ(yi,yj).\displaystyle\to\mathbb{R},\leavevmode\nobreak\ f_{\mathcal{G},k}(y_{1},y_{2},...,y_{n})=\sum_{(i,j)\in E}\delta(y_{i},y_{j}).

We notice that each element (y1,,yn)Γkn(y_{1},\dots,y_{n})\in\Gamma_{k}^{n} represents an assignment of kk colors to vertices of 𝒢\mathcal{G}. Thus f𝒢,k(y1,,yn)f_{\mathcal{G},k}(y_{1},\dots,y_{n}) is exactly the number of edges whose vertices are assigned the same color by (y1,,yn)(y_{1},\dots,y_{n}). This observation leads to the lemma that follows.

Lemma 6.9.

Let 𝒢\mathcal{G} be an undirected graph and let kk be a positive integer. The maximum number d𝒢,kd_{\mathcal{G},k} of edges in 𝒢\mathcal{G} that are kk-colorable is equal to ||minyΓknf𝒢,k(y)|\mathcal{E}|-\min_{y\in\Gamma_{k}^{n}}f_{\mathcal{G},k}(y). In particular, d𝒢,kLd_{\mathcal{G},k}\leq L if and only if minyΓknf𝒢,k(y)||L\min_{y\in\Gamma_{k}^{n}}f_{\mathcal{G},k}(y)\geq|\mathcal{E}|-L, which can be certified by an FSOS certificate of f𝒢,k||+Lf_{\mathcal{G},k}-|\mathcal{E}|+L.

According to [BCMM05], we can also encode the 3-colorability of a graph 𝒢\mathcal{G} into the following 3-CNF formula in variables xi,kx_{i,k}, i=1,,ni=1,\dots,n, k=1,2,3k=1,2,3:

(36) ϕ𝒢=i=1n(xi,1xi,2xi,3)i=1,2,3,,nk=1,2(¬xi,k¬xi,k+1)(i,j)k=1,2,3(¬xi,k¬xj,k).\phi_{\mathcal{G}}=\bigwedge_{i=1}^{n}\left(x_{i,1}\lor x_{i,2}\lor x_{i,3}\right)\land\bigwedge_{\begin{subarray}{c}i=1,2,3,...,n\\ k=1,2\end{subarray}}\left(\lnot x_{i,k}\lor\lnot x_{i,k+1}\right)\land\bigwedge_{\begin{subarray}{c}(i,j)\in\mathcal{E}\\ k=1,2,3\end{subarray}}\left(\lnot x_{i,k}\lor\lnot x_{j,k}\right).

Here xi,kx_{i,k} is true if and only if the kk-th color is assigned to the ii-th vertex. Therefore 𝒢\mathcal{G} is 33-colorable if and only if ϕ𝒢\phi_{\mathcal{G}} is satisfiable. Moreover, using the characteristic function fϕ𝒢f_{\phi_{\mathcal{G}}} (c.f. Definition 6.3) of ϕ𝒢\phi_{\mathcal{G}}, we are able to prove that 𝒢\mathcal{G} are not 33-colorable by FSOS certificate. This is the content of the following lemma.

Lemma 6.10.

Let 𝒢=(𝒱,)\mathcal{G}=(\mathcal{V},\mathcal{E}) be a undirected graph with nn vertices, then 𝒢\mathcal{G} is not 33-colorable if and only if fϕ𝒢1f_{\phi_{\mathcal{G}}}\geq 1.

We notice that Lemmas 6.9 and 6.10 provide us two ways to certify the inequality d𝒢,3Ld_{\mathcal{G},3}\leq L: the first one is by an FSOS certificate of f𝒢,3||+Lf_{\mathcal{G},3}-|\mathcal{E}|+L, which is a function on Γ3n\Gamma_{3}^{n}; the second one is by an FSOS certificate of fϕ𝒢||+Lf_{\phi_{\mathcal{G}}}-|\mathcal{E}|+L which is a function on Γ23n\Gamma_{2}^{3n}.

6.3.1. MkCS certificate problem for Wheel graphs

Let 𝒲n\mathcal{W}_{n} be the wheel graph of nn vertices [W+01]. The graph in Figure 3(a) is the wheel graph of 88 vertices.

(a) 𝒲8\mathcal{W}_{8}
(b) 3-colorable subgraph of 𝒲8\mathcal{W}_{8}
Figure 3. Example of a wheel graph and its 3-colorable subgraph

The following lemma is obvious.

Lemma 6.11.

If nn is even, then

  1. (i)

    𝒲n\mathcal{W}_{n} is not 3-colorable.

  2. (ii)

    The subgraph of 𝒲n\mathcal{W}_{n} obtained by deleting an edge in the outer circle is 3-colorable.

As a consequence, we have

minyΓ3nf𝒲n,3(y)=1,\min_{y\in\Gamma_{3}^{n}}f_{\mathcal{W}_{n},3}(y)=1,

and

minyΓ23nfϕ𝒢(y)=1.\min_{y\in\Gamma_{2}^{3n}}f_{\phi_{\mathcal{G}}}(y)=1.

The graph in Figure 3(b) is the 3-colorable subgraph of 𝒲8\mathcal{W}_{8} described in (ii) of Lemma 6.11. Next we compute a short polynomial FSOS certificate of f𝒲n,31f_{\mathcal{W}_{n},3}-1 for n{2m:5m25}n\in\{2m:5\leq m\leq 25\}, which provides a short proof of the inequality minyΓ3nf𝒲n,3(y)1\min_{y\in\Gamma_{3}^{n}}f_{\mathcal{W}_{n},3}(y)\geq 1. To that end, we apply Algorithm 1 to f𝒲n,3f_{\mathcal{W}_{n},3} with L=1L=1, fmax=||f_{\max}=|\mathcal{E}|, d=2d=2 and k=|supp(f)|k=|\operatorname{supp}(f)|. Numerical results are recorded in Table 5, where nn is the number of vertices of the wheel graph, ”time” is the running time of Algorithm 1, and ”sparsity” is the sparsity of the computed FSOS certificate. It is clear from Figure 4 that the computed polynomial FSOS certificates are indeed short since their sparsities are linear in the number of vertices. We remark that the order of Γ3n\Gamma_{3}^{n} is 3n3^{n}, which is as large as 350710233^{50}\approx 7\cdot 10^{23} in our examples.

As a comparison, we also apply TSSOS [WML21b, WML21a] to compute a polynomial SOS certificate for f𝒲n,31f_{\mathcal{W}_{n},3}\geq 1. Since TSSOS is not able to process complex polynomials directly, we transform the problem into the following equivalent real form:

minx,yn\displaystyle\min_{x,y\in\mathbb{R}^{n}} Re(f𝒲n,3(x+1y))\displaystyle\operatorname{Re}\left(f_{\mathcal{W}_{n},3}(x+\sqrt{-1}y)\right)
subject to (xi+1yi)3=1,i=1,,n.\displaystyle(x_{i}+\sqrt{-1}y_{i})^{3}=1,\quad i=1,\dots,n.

We then solve (6.3.1) by TSSOS.333We thank Jie Wang for his help. For instance, TSSOS444The relaxation order is set to be 44. spends 510510 seconds to find an SOS certificate for f𝒲10,31f_{\mathcal{W}_{10},3}\geq 1 of sparsity 1020610206.

n time sparsity
10 0.87 74
12 1.20 90
14 1.29 106
16 1.77 122
18 2.14 138
20 3.60 154
22 4.44 170
24 5.34 186
26 6.21 202
28 7.19 218
30 10.54 234
32 11.50 250
34 13.04 266
36 14.62 282
38 14.76 298
40 16.75 314
42 19.03 330
44 21.74 346
46 25.63 362
48 26.59 378
50 30.45 394
Table 4. polynomial FSOS certificates for f𝒲n,31f_{\mathcal{W}_{n},3}\geq 1
n time sparsity
10 2.25 125
12 3.49 151
14 5.06 177
16 6.71 203
18 9.06 229
20 11.46 255
22 15.38 281
24 18.95 307
26 23.89 333
28 31.67 359
30 149.23 578
32 190.84 617
34 218.54 656
36 289.95 695
38 326.91 734
40 375.87 773
42 454.19 812
44 481.17 851
46 543.78 890
48 609.56 929
50 694.80 968
Table 5. polynomial FSOS certificates for fϕ𝒲n1f_{\phi_{\mathcal{W}_{n}}}\geq 1
Refer to caption
Figure 4. MkCS certificate problem for wheel graphs

Let ϕ𝒲n\phi_{\mathcal{W}_{n}} be the 3-CNF formula defined in (36) and let fϕ𝒲nf_{\phi_{\mathcal{W}_{n}}} be its characteristic function (c.f. Definition 6.3). According to Lemmas 6.11 and 6.10, we have minyΓ23nfϕ𝒲n(y)=1\min_{y\in\Gamma_{2}^{3n}}f_{\phi_{\mathcal{W}_{n}}}(y)=1 if nn is even. We apply Algorithm 1 to compute polynomial FSOS certificates for fϕ𝒲n1,n{2m:5m15}f_{\phi_{\mathcal{W}_{n}}}\geq 1,n\in\{2m:5\leq m\leq 15\}. Results are shown in Table 5. It is obvious from Figure 4 that the sparsity of computed FSOS certificate is (roughly) linear in the number of vertices.

We remark that the sparsity of the computed FSOS certificate for fϕ𝒲n1f_{\phi_{\mathcal{W}_{n}}}\geq 1 is much greater than that for f𝒲n,31f_{\mathcal{W}_{n},3}\geq 1. Moreover, for the same nn, computing an FSOS certificate for fϕ𝒲n1f_{\phi_{\mathcal{W}_{n}}}\geq 1 costs more time than computing an FSOS certificate for f𝒲n,31f_{\mathcal{W}_{n},3}\geq 1. This indicates that although many combinatorial problems can be equivalently reformulated as FSOS problems on Γ2n\Gamma_{2}^{n}, a suitable choice of the underlying group structure may accelerate the computation in practice.

6.3.2. MkCS certificate problem for complete graphs

Let 𝒦n\mathcal{K}_{n} be the complete graph with nn vertexes and let f𝒦n,n1:Γnn1f_{\mathcal{K}_{n},n-1}:\Gamma_{n}^{n-1}\to\mathbb{R} be the integer valued function defined in (6.3) for 𝒦n\mathcal{K}_{n} and n1n-1.

Lemma 6.12.

For any nn, f𝒦n,n11f_{\mathcal{K}_{n},n-1}\geq 1 admits a polynomial FSOS certificate of sparsity O(n2)\operatorname{O}(n^{2}).

Proof.

We observe that

f𝒦n,n1=n2+1n1k=1n21i<jnχk(yi)χnk1(yj),f_{\mathcal{K}_{n},n-1}=\frac{n}{2}+\frac{1}{n-1}\sum_{k=1}^{n-2}\sum_{1\leq i<j\leq n}\chi_{k}(y_{i})\chi_{n-k-1}(y_{j}),

where χk(z)=zk,0kn1\chi_{k}(z)=z^{k},0\leq k\leq n-1 are characters of Γn\Gamma_{n}. It is straightforward to verify that

f𝒦n,n11=n+22(n1)+k=1n212(n1)|i=1nχk(yi)|2.f_{\mathcal{K}_{n},n-1}-1=\frac{-n+2}{2(n-1)}+\sum_{k=1}^{n-2}\frac{1}{2(n-1)}\left|\sum_{i=1}^{n}\chi_{k}(y_{i})\right|^{2}.

This provides us a desired FSOS certificate for f𝒦n,n11f_{\mathcal{K}_{n},n-1}\geq 1. ∎

In fact, f𝒦n,n1f_{\mathcal{K}_{n},n-1} is the same function discussed in [YYZ22, Proposition 5.2], whose nonnegativity is equivalent to the pigeon-hole principle. Although Lemma 6.12 already explicitly supplies a sparse polynomial FSOS certificate for f𝒦n,n11f_{\mathcal{K}_{n},n-1}\geq 1, we apply Algorithm 1 to f𝒦n,n11f_{\mathcal{K}_{n},n-1}\geq 1 where 4n224\leq n\leq 22, only for testing and comparison purposes555The order of Γ2221\Gamma_{22}^{21} is approximately equal to 1.510281.5\cdot 10^{28}..

Numerical results are listed in Table 6. In Figure 5, we plot the time cost and computed sparsity as functions of nn respectively. It turns out that the time cost (resp. computed sparsity) can be interpolated by a cubic (resp. quintic) polynomial.

n time sparsity
4 0.16 26
5 0.32 62
6 0.80 122
7 2.46 212
8 6.13 338
9 15.38 506
10 32.25 722
11 65.56 992
12 122.07 1322
13 235.69 1718
14 405.70 2186
15 659.25 2732
16 1075.5 3362
17 1813.0 4082
18 2922.6 4898
19 4556.8 5816
20 6850.7 6842
21 11050 7982
22 15665 9242
Table 6. polynomial FSOS certificates for f𝒦n,n11f_{\mathcal{K}_{n},n-1}\geq 1
Refer to caption
Figure 5. MkCS certificate problem for complete graphs

Appendix A Rational FSOS certificate in Example 5.4

g1=\displaystyle g_{1}= 0.29y1+0.25y2+0.25y3+0.22y4+0.25y1y2+0.49y1y30.32y1y4+0.022y2y4+0.044y3y4+\displaystyle 0.29y_{1}+0.25y_{2}+0.25y_{3}+0.22y_{4}+0.25y_{1}y_{2}+0.49y_{1}y_{3}-0.32y_{1}y_{4}+0.022y_{2}y_{4}+0.044y_{3}y_{4}+
0.022y1y2y4+0.023y1y3y4+1.2,\displaystyle 0.022y_{1}y_{2}y_{4}+0.023y_{1}y_{3}y_{4}+1.2,
g2=\displaystyle g_{2}= 1.1y40.025y22.7103y30.39y10.025y1y20.072y1y3+0.35y1y4+0.25y2y4+\displaystyle 1.1y_{4}-0.025y_{2}-2.7\cdot 10^{-3}y_{3}-0.39y_{1}-0.025y_{1}y_{2}-0.072y_{1}y_{3}+0.35y_{1}y_{4}+0.25y_{2}y_{4}+
0.24y3y4+0.25y1y2y4+0.5y1y3y4,\displaystyle 0.24y_{3}y_{4}+0.25y_{1}y_{2}y_{4}+0.5y_{1}y_{3}y_{4},
g3=\displaystyle g_{3}= 0.47y10.056y2+1.1y30.056y1y2+0.049y1y3+0.1y1y44.3103y2y4+0.15y3y4\displaystyle 0.47y_{1}-0.056y_{2}+1.1y_{3}-0.056y_{1}y_{2}+0.049y_{1}y_{3}+0.1y_{1}y_{4}-4.3\cdot 10^{-3}y_{2}y_{4}+0.15y_{3}y_{4}-
4.3103y1y2y40.44y1y3y4,\displaystyle 4.3\cdot 10^{-3}y_{1}y_{2}y_{4}-0.44y_{1}y_{3}y_{4},
g4=\displaystyle g_{4}= 0.033y1+3.9103y2+3.9103y1y20.45y1y3+0.46y1y40.056y2y4+\displaystyle 0.033y_{1}+3.9\cdot 10^{-3}y_{2}+3.9\cdot 10^{-3}y_{1}y_{2}-0.45y_{1}y_{3}+0.46y_{1}y_{4}-0.056y_{2}y_{4}+
1.1y3y40.056y1y2y4+0.11y1y3y4,\displaystyle 1.1y_{3}y_{4}-0.056y_{1}y_{2}y_{4}+0.11y_{1}y_{3}y_{4},
g5=\displaystyle g_{5}= 0.24y1+0.97y2+0.058y1y20.12y1y3+0.12y1y4+0.2y2y40.55y1y2y40.019y1y3y4,\displaystyle 0.24y_{1}+0.97y_{2}+0.058y_{1}y_{2}-0.12y_{1}y_{3}+0.12y_{1}y_{4}+0.2y_{2}y_{4}-0.55y_{1}y_{2}y_{4}-0.019y_{1}y_{3}y_{4},
g6=\displaystyle g_{6}= 0.075y10.58y1y2+6.5103y1y3+0.22y1y4+0.95y2y4+0.17y1y2y40.12y1y3y4,\displaystyle 0.075y_{1}-0.58y_{1}y_{2}+6.5\cdot 10^{-3}y_{1}y_{3}+0.22y_{1}y_{4}+0.95y_{2}y_{4}+0.17y_{1}y_{2}y_{4}-0.12y_{1}y_{3}y_{4},
g7=\displaystyle g_{7}= 0.92y1+0.29y1y2+0.15y1y3+0.41y1y4+0.26y1y2y4+0.49y1y3y4,\displaystyle 0.92y_{1}+0.29y_{1}y_{2}+0.15y_{1}y_{3}+0.41y_{1}y_{4}+0.26y_{1}y_{2}y_{4}+0.49y_{1}y_{3}y_{4},
g8=\displaystyle g_{8}= 0.15y1y2+0.48y1y3+0.82y1y4+0.19y1y2y40.078y1y3y4,\displaystyle 0.15y_{1}y_{2}+0.48y_{1}y_{3}+0.82y_{1}y_{4}+0.19y_{1}y_{2}y_{4}-0.078y_{1}y_{3}y_{4},
g9=\displaystyle g_{9}= 0.69y1y30.32y1y20.31y1y2y4+0.33y1y3y4,\displaystyle 0.69y_{1}y_{3}-0.32y_{1}y_{2}-0.31y_{1}y_{2}y_{4}+0.33y_{1}y_{3}y_{4},
g10=\displaystyle g_{10}= 0.61y1y3y40.19y1y2y40.18y1y2,\displaystyle 0.61y_{1}y_{3}y_{4}-0.19y_{1}y_{2}y_{4}-0.18y_{1}y_{2},
g11=\displaystyle g_{11}= 0.61y1y2+0.15y1y2y4,\displaystyle 0.61y_{1}y_{2}+0.15y_{1}y_{2}y_{4},
g12=\displaystyle g_{12}= 0.59y1y2y4,\displaystyle 0.59y_{1}y_{2}y_{4},
h1=\displaystyle h_{1}= 0.098y1y4+1.1,\displaystyle 0.098y_{1}y_{4}+1.1,
h2=\displaystyle h_{2}= 1.1y1y4.\displaystyle 1.1y_{1}y_{4}.

It is straightforward to verify that νT(f32)hν2^αSgα2^1<0.092\left\lVert\sum_{\nu\in T}\widehat{(f-\frac{3}{2})h_{\nu}^{2}}-\sum_{\alpha\in S}\widehat{g_{\alpha}^{2}}\right\rVert_{\ell^{1}}<0.092.

References

  • [AB09] Sanjeev Arora and Boaz Barak. Computational complexity: a modern approach. Cambridge University Press, 2009.
  • [AH15] André Abramé and Djamal Habet. On the resiliency of unit propagation to max-resolution. In Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.
  • [Art27] Emil Artin. Über die Zerlegung definiter Funktionen in Quadrate. Abh. Math. Sem. Univ. Hamburg, 5(1):100–115, 1927.
  • [BCMM05] Paul Beame, Joseph Culberson, David Mitchell, and Cristopher Moore. The resolution complexity of random graph k-colorability. Discrete Applied Mathematics, 153(1-3):25–47, 2005.
  • [BGP16] Grigoriy Blekherman, João Gouveia, and James Pfeiffer. Sums of squares on the hypercube. Mathematische Zeitschrift, 284(1):41–54, 2016.
  • [BHK+19] Boaz Barak, Samuel Hopkins, Jonathan Kelner, Pravesh K. Kothari, Ankur Moitra, and Aaron Potechin. A nearly tight sum-of-squares lower bound for the planted clique problem. SIAM Journal on Computing, 48(2):687–735, 2019.
  • [BLM07] Maria Luisa Bonet, Jordi Levy, and Felip Manya. Resolution for max-sat. Artificial Intelligence, 171(8-9):606–618, 2007.
  • [BLT21] Guy Blanc, Jane Lange, and Li-Yang Tan. Provably efficient, succinct, and precise explanations. Advances in Neural Information Processing Systems, 34:6129–6141, 2021.
  • [Boo75] Ronald V. Book. Richard m. karp. reducibility among combinatorial problems. complexity of computer computations, proceedings of a symposium on the complexity of computer computations, held march 20-22, 1972, at the ibm thomas j. watson center, yorktown heights, new york, edited by raymond e. miller and james w. thatcher, plenum press, new york and london 1972, pp. 85–103. The Journal of Symbolic Logic, 40(4):618–619, 1975.
  • [CC10] Manoel Campélo and Ricardo C. Corréa. A combined parallel lagrangian decomposition and cutting-plane generation for maximum stable set problems. Electronic Notes in Discrete Mathematics, 36:503–510, 2010. ISCO 2010 - International Symposium on Combinatorial Optimization.
  • [FH13] William Fulton and Joe Harris. Representation theory: a first course, volume 129. Springer Science & Business Media, 2013.
  • [FJ95] Alan M. Frieze and Mark Jerrum. Improved approximation algorithms for max k-cut and max bisection. In Proceedings of the 4th International IPCO Conference on Integer Programming and Combinatorial Optimization, page 1–13, Berlin, Heidelberg, 1995. Springer-Verlag.
  • [FSP16] Hamza Fawzi, James Saunderson, and Pablo A Parrilo. Sparse sums of squares on finite abelian groups and improved semidefinite lifts. Mathematical Programming, 160(1-2):149–191, 2016.
  • [Gol10] Oded Goldreich. P, NP, and NP-Completeness: The basics of computational complexity. Cambridge University Press, 2010.
  • [Gri01] D. Grigoriev. Complexity of positivstellensatz proofs for the knapsack. computational complexity, 10(2):139–154, 2001.
  • [GW95] Michel X Goemans and David P Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM (JACM), 42(6):1115–1145, 1995.
  • [Hil88] David Hilbert. Über die darstellung definiter formen als summe von formenquadraten. Mathematische Annalen, 32(3):342–350, 1888.
  • [HKM16] Marijn JH Heule, Oliver Kullmann, and Victor W Marek. Solving and verifying the boolean pythagorean triples problem via cube-and-conquer. In International Conference on Theory and Applications of Satisfiability Testing, pages 228–245. Springer, 2016.
  • [HMS11] Federico Heras and Joao Marques-Silva. Read-once resolution for unsatisfiability-based max-sat algorithms. In Twenty-Second International Joint Conference on Artificial Intelligence, 2011.
  • [JP11] Tim Januschowski and Marc E. Pfetsch. Branch-cut-and-propagate for the maximum k-colorable subgraph problem with symmetry. In Tobias Achterberg and J. Christopher Beck, editors, Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, pages 99–116, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
  • [KLM16] Adam Kurpisz, Samuli Leppänen, and Monaldo Mastrolilli. Tight sum-of-squares lower bounds for binary polynomial optimization problems. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016.
  • [KLYZ12] Erich L Kaltofen, Bin Li, Zhengfeng Yang, and Lihong Zhi. Exact certification in global polynomial optimization via sums-of-squares of rational functions with rational coefficients. Journal of Symbolic Computation, 47(1):1–15, 2012.
  • [KMOW17] Pravesh K. Kothari, Ryuhei Mori, Ryan O’Donnell, and David Witmer. Sum of squares lower bounds for refuting any csp. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, page 132–145, New York, NY, USA, 2017. Association for Computing Machinery.
  • [Kri64] J. L. Krivine. Anneaux préordonnés. Journal d’Analyse Mathématique, 12(1):307–326, 1964.
  • [Kur19] Adam Kurpisz. Sum-of-squares bounds via boolean function analysis. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), volume 132, page 79. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2019.
  • [Las01] Jean B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM Journal on Optimization, 11(3):796–817, 2001.
  • [Lau09] Monique Laurent. Sums of Squares, Moment Matrices and Optimization Over Polynomials, pages 157–270. Springer New York, 2009.
  • [LFT92] K.C. Lee, N. Funabiki, and Y. Takefuji. A parallel improvement algorithm for the bipartite subgraph problem. IEEE Transactions on Neural Networks, 3(1):139–145, 1992.
  • [Mot67] Theodore Samuel Motzkin. The arithmetic-geometric inequality. Inequalities (Proc. Sympos. Wright-Patterson Air Force Base, Ohio, 1965), pages 205–224, 1967.
  • [MPW15] Raghu Meka, Aaron Potechin, and Avi Wigderson. Sum-of-squares lower bounds for planted clique. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15, page 87–96, New York, NY, USA, 2015. Association for Computing Machinery.
  • [New64] Donald J Newman. Rational approximation to |x||x|. Michigan Mathematical Journal, 11(1):11–14, 1964.
  • [Nie14] Jiawang Nie. Optimality conditions and finite convergence of lasserre’s hierarchy. Mathematical Programming, 146(1):97–121, 2014.
  • [NOR10] Arkadi Nemirovski, Shmuel Onn, and Uriel G Rothblum. Accuracy certificates for computational problems with convex structure. Mathematics of Operations Research, 35(1):52–78, 2010.
  • [NS94] Noam Nisan and Mario Szegedy. On the degree of boolean functions as real polynomials. Computational complexity, 4(4):301–313, 1994.
  • [Par00] Pablo A Parrilo. Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization. California Institute of Technology, 2000.
  • [Par03] Pablo A. Parrilo. Semidefinite programming relaxations for semialgebraic problems. Mathematical Programming, 96(2):293–320, 2003.
  • [PCH20] Matthieu Py, Mohamed Sami Cherif, and Djamal Habet. Towards bridging the gap between sat and max-sat refutations. In 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), pages 137–144. IEEE, 2020.
  • [PCH21] Matthieu Py, Mohamed Sami Cherif, and Djamal Habet. A proof builder for max-sat. In International Conference on Theory and Applications of Satisfiability Testing, pages 488–498. Springer, 2021.
  • [PT20] Pablo A. Parrilo and Rekha R. Thomas, editors. Sum of squares: theory and applications, volume 77 of Proceedings of Symposia in Applied Mathematics. American Mathematical Society, Providence, RI, [2020] ©2020. AMS Short Course, Sum of Squares: Theory and Applications, January 14–15, 2019, Baltimore, MD.
  • [PY88] Christos Papadimitriou and Mihalis Yannakakis. Optimization, approximation, and complexity classes. In Proceedings of the twentieth annual ACM symposium on Theory of computing, pages 229–234, 1988.
  • [RS10] Alexander A. Razborov and Alexander A. Sherstov. The sign-rank of AC0\rm AC^{0}. SIAM J. Comput., 39(5):1833–1855, 2010.
  • [Rud62] Walter Rudin. Fourier analysis on groups, volume 121967. Wiley Online Library, 1962.
  • [SB02] J. Stoer and R. Bulirsch. Introduction to numerical analysis, volume 12 of Texts in Applied Mathematics. Springer-Verlag, New York, third edition, 2002. Translated from the German by R. Bartels, W. Gautschi and C. Witzgall.
  • [Sch91] Konrad Schmüdgen. Thek-moment problem for compact semi-algebraic sets. Mathematische Annalen, 289(1):203–206, 1991.
  • [Sha94] N. Shankar. Metamathematics, machines, and Gödel’s proof, volume 38 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge, 1994.
  • [She20] Alexander A Sherstov. Algorithmic polynomials. SIAM Journal on Computing, 49(6):1173–1231, 2020.
  • [SL23] Lucas Slot and Monique Laurent. Sum-of-squares hierarchies for binary polynomial optimization. Mathematical Programming, 197(2):621–660, 2023.
  • [Sot14] R. Sotirov. An efficient semidefinite programming relaxation for the graph partition problem. INFORMS J. on Computing, 26(1):16–30, feb 2014.
  • [Sta03] Herbert R. Stahl. Best uniform rational approximation of xαx^{\alpha} on [0,1][0,1]. Acta Math., 190(2):241–306, 2003.
  • [Ste74] Gilbert Stengle. A nullstellensatz and a positivstellensatz in semialgebraic geometry. Mathematische Annalen, 207(2):87–97, 1974.
  • [STKI17] Shinsaku Sakaue, Akiko Takeda, Sunyoung Kim, and Naoki Ito. Exact semidefinite programming relaxations with truncated moment matrix for binary polynomial optimization problems. SIAM Journal on Optimization, 27(1):565–582, 2017.
  • [STYZ20] Defeng Sun, Kim-Chuan Toh, Yancheng Yuan, and Xin-Yuan Zhao. SDPNAL+: A matlab software for semidefinite programming with bound constraints (version 1.0). Optimization Methods and Software, 35(1):87–115, 2020.
  • [VK87] Richard S Varga and A D Karpenter. On a conjecture of S. Bernstein in approximation theory. Mathematics of the USSR-Sbornik, 57(2):547–560, feb 1987.
  • [vS16] E.R. van Dam and R. Sotirov. New bounds for the max-k-cut and chromatic number of a graph. Linear Algebra and its Applications, 488:216–234, 2016.
  • [vvH08] H. van Maaren, L. van Norden, and M.J.H. Heule. Sums of squares based approximation algorithms for max-sat. Discrete Applied Mathematics, 156(10):1754–1779, 2008.
  • [Vya75] NS Vyacheslavov. On the uniform approximation of |x||x| by rational functions. Doklady Akademii Nauk, 220(3):512–515, 1975.
  • [W+01] Douglas Brent West et al. Introduction to graph theory, volume 2. Prentice hall Upper Saddle River, 2001.
  • [WML21a] Jie Wang, Victor Magron, and Jean-Bernard Lasserre. Chordal-tssos: a moment-sos hierarchy that exploits term sparsity with chordal extension. SIAM Journal on Optimization, 31(1):114–141, 2021.
  • [WML21b] Jie Wang, Victor Magron, and Jean-Bernard Lasserre. Tssos: A moment-sos hierarchy that exploits term sparsity. SIAM Journal on Optimization, 31(1):30–58, 2021.
  • [YYZ22] Jianting Yang, Ke Ye, and Lihong Zhi. Computing sparse fourier sum of squares on finite abelian groups in quasi-linear time. arXiv preprint arXiv:2201.03912, 2022.