
Fourier sum of squares certificates

Jianting Yang, Ke Ye and Lihong Zhi

KLMM, Academy of Mathematics and Systems Science & University of Chinese Academy of Sciences

(Date: February 10, 2025)

Abstract.

The non-negativity of a function on a finite abelian group can be certified by its Fourier sum of squares (FSOS). In this paper, we propose a method of certifying the non-negativity of an integer-valued function by an FSOS certificate, which is defined to be an FSOS with a small error. We prove the existence of exponentially sparse polynomial and rational FSOS certificates and we provide two methods to validate them. As a consequence of the aforementioned existence theorems, we propose a semidefinite programming (SDP)-based algorithm to efficiently compute a sparse FSOS certificate. For applications, we consider certificate problems for maximum satisfiability (MAX-SAT) and maximum k-colorable subgraph (MkCS) and demonstrate our theoretical results and algorithm by numerical experiments.

Key words and phrases:

Fourier sum of squares, short certificate, semidefinite programming, approximation theory, numerical algorithm, MAX-SAT, MkCS

Ke Ye and Lihong Zhi are supported by the National Key Research Project of China 2018YFA0306702. Lihong Zhi is supported by the National Natural Science Foundation of China 12071467.

1. Introduction

The sum of squares (SOS) problem dates back to Hilbert [Hil88]. It asks whether a non-negative polynomial can be written as a sum of squares of polynomials (resp. rational functions). The polynomial case is disproved by the well-known Motzkin polynomial [Mot67], while the rational function case, usually called Hilbert’s 17th problem, was proved by Artin in his seminal work [Art27]. Based on various versions of the Positivstellensatz [Kri64, Sch91, Ste74], SOS has become an essential ingredient in polynomial optimization [Las01, Par03, Lau09, PT20, Nie14]. In particular, it is used to certify the non-negativity of a function on the hypercube $\Gamma_{2}^{n}=\{-1,1\}^{n}$. A central question is the existence of low degree SOS certificates [BGP16, STKI17]. Numerous applications of SOS certificates can be found in combinatorics, including the Knapsack problem [Gri01], the Constraint Satisfaction Problem (CSP) [KMOW17], the Empty Integral Hull (EIH) problem [KLM16], the Min Knapsack (MK) problem [Kur19] and the Planted Clique (PC) problem [MPW15, BHK⁺19]. Moreover, due to its diverse applications in various fields such as the interpretability of black-box models in machine learning [BLT21], accuracy certification [NOR10] and automated reasoning [HKM16, Sha94, Par00], constructing a certificate for a computational problem is at the core of computer science as well.

In [FSP16], the notion of Fourier sum of squares (FSOS) is proposed, which extends SOS on $\Gamma_{2}^{n}$ to arbitrary finite abelian groups. The main idea behind FSOS is that on a finite abelian group $G$, its characters play the role of monomials in polynomial SOS. A graph theoretic framework to study FSOS is developed in [FSP16] and a quasi-linear algorithm for sparse FSOS is proposed in [YYZ22]. Although most combinatorial optimization problems can be modelled on the hypercube, there also exist many problems whose feasible domains are finite abelian groups of other types. For instance, the pigeon-hole principle can be certified by a sparse FSOS on $\mathbb{Z}_{n}^{n+1}$ [YYZ22]; the maximum 3-colorable subgraph problem for the wheel graph with $n$ vertices can be certified by a sparse FSOS on $\Gamma_{3}^{n}$, which is much sparser than that on the hypercube, if we re-model the problem accordingly (cf. Section 6.3.1); we also prove in Section 6.3.2 that the maximum $(n-1)$-colorable subgraph problem for the complete graph with $n$ vertices can be certified by a sparse FSOS on $\Gamma_{n}^{n-1}$.

In this paper, we consider the following two problems:

Problem 1.1 (computation of FSOS certificate).

Given an integer-valued function $f:G\to\mathbb{Z}$ and an integer $L$ , how to certify $f\geq L$ via FSOS?

Problem 1.2 (validation of FSOS certificate).

Given an integer-valued function $f:G\to\mathbb{Z}$, an integer $L$ and a pair of finite families of functions $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$, how to check whether $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ indeed certifies $f\geq L$?

1.1. Our contributions

Our solutions to Problems 1.1 and 1.2 are based on the following simple observation (cf. Lemma 3.1):

f\ \text{is non-negative on}\ G\iff\min_{y\in G}\left(f(y)-\frac{\sum_{j\in J}|g_{j}(y)|^{2}}{\sum_{i\in I}|h_{i}(y)|^{2}}\right)>-1,

where $f:G\to\mathbb{Z}$ is an integer-valued function. As a consequence, we propose the notion of FSOS certificate in Definition 3.3, which can be regarded as an FSOS of a non-negative function with a small error. It is worth emphasizing that FSOS certificates in this paper are different from those defined in [FSP16]. Indeed, an FSOS certificate for $f\geq 0$ in [FSP16, MPW15, BGP16] requires

f=\frac{\sum_{j\in J}|g_{j}|^{2}}{\sum_{i\in I}|h_{i}|^{2}},

while our FSOS certificate allows a small error.

It should be noticed that the small error in our FSOS certificate does not impair its ability to certify the non-negativity of $f$ as $f$ is integer-valued. In fact, quite the contrary, the small error provides us an exponentially sparser certificate. For instance (cf. Theorem 2.10), it is proved that the non-negative function

f(x_{1},\dots,x_{n})=\left(\sum_{j=1}^{n}x_{j}-\left\lfloor\frac{n}{2}\right\rfloor\right)\left(\sum_{j=1}^{n}x_{j}-\left\lfloor\frac{n}{2}\right\rfloor-1\right)

on $\{0,1\}^{n}$ has no polynomial or rational FSOS of degree less than $\frac{n-1}{2}$ . However, we can construct an FSOS certificate for $f\geq 0$ of degree $1$ and sparsity $n+1$ if we allow an error of $1/4$ (cf. Remark 3.19).

We address Problems 1.1 and 1.2, both theoretically and algorithmically. On the theoretical side, we study the existence of sparse (polynomial and rational) FSOS certificates for lower bounds of integer-valued functions on finite abelian groups. We informally summarize our main results below.

• In Theorem 3.8, we prove that for any $\beta\in[0,1)$ and a non-negative integer-valued function $f$ on $G$ such that $f_{\max}<(5-4\beta)f_{\min}$, there exists a sparse polynomial FSOS certificate for $f\geq[\beta f_{\min}]$ with degree $d=\deg(f)\left\lfloor\frac{3+\log((1-\alpha\beta)f_{\max})}{2+\log((1-\beta)\alpha)-\log(1-\alpha)}\right\rfloor$ and with sparsity $|\operatorname{supp}(f)|^{d}$, where $\alpha=\frac{f_{\min}}{f_{\max}}$.

• We also prove that Theorem 3.8 is optimal in the sense of approximation theory in Theorems 3.11 and 3.14.

• In Theorem 3.18, we prove that any non-negative integer-valued function $f$ on $G$ admits a sparse rational FSOS certificate for $f\geq L$, where $L$ is an integer, whose numerator and denominator have degree bounded by $d=\operatorname{O}(\deg(f)\log(f_{\max}-L)^{2})$.

For practical use, we propose algorithms for both the computation and the validation of FSOS certificates, which are briefly summarized in the following.

• We present Algorithm 1 to compute a low degree rational FSOS certificate for $f\geq L$.

• We propose two efficient methods to validate an FSOS certificate: validation by $\ell^{1}$-norm (15) and validation by sampling (16). Moreover, we justify the two methods by Proposition 4.2 and Theorem 4.6 respectively.

1.2. Organization of the paper

The paper is organized as follows. In Section 2 we present some essential definitions and facts in group theory and FSOS. We prove in Section 3 the existence of sparse polynomial and rational FSOS certificates, respectively. We also provide two simple methods to validate an FSOS certificate in Subsections 4.1 and 4.2. In Section 5, we present an efficient algorithm to compute a low degree FSOS certificate. As applications, we consider the certificate problem for MAX-SAT and maximum k-colorable subgraph (MkCS) problem in Section 6. Some numerical experiments are presented to demonstrate the correctness of our theorems and the efficiency of our algorithm.

2. Preliminaries

In this section, we recall some basic definitions and results in Fourier analysis on groups and Fourier sum of squares (FSOS). With the help of these mathematical tools, we are able to convert the certificate problem for lower bounds to the problem of FSOS.

2.1. Fourier analysis on groups

We briefly summarize fundamentals of group theory and representation theory in this subsection, which are necessary for the development of the paper. For more details, we refer interested readers to [Rud62, FH13].

Let $G$ be a finite abelian group. A nonzero complex valued function $\chi$ on $G$ is called a character of $G$ if for any $y,y^{\prime}\in G$ , it holds that

\chi(yy^{\prime})=\chi(y)\chi(y^{\prime}).

We denote by $\widehat{G}$ the set of all characters, which is called the dual group of $G$ . It is straightforward to verify that $\widehat{G}$ is also a finite abelian group, with the group operation given by pointwise multiplication. Moreover, $G\simeq\widehat{G}$ as abelian groups.

According to the classification theorem of finite abelian groups, any finite abelian group $G$ is isomorphic to $\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}}$ for some positive integers $2\leq n_{1}\leq\cdots\leq n_{d}$ . Here $\Gamma_{m}$ denotes the subgroup of the circle group $\mathbb{S}^{1}$ consisting of points $e^{i\frac{2l\pi}{m}},0\leq l\leq m-1$ . For convenience, in this paper we do not distinguish isomorphic abelian groups and simply write $G=\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}}$ . For each $\alpha=(\alpha_{1},\dots,\alpha_{d})\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}$ , we define the character $\chi_{\alpha}$ of $G$ to be the function $\chi_{\alpha}(y)=y^{\alpha}\coloneqq y_{1}^{\alpha_{1}}\cdots y_{d}^{\alpha_{d}}$ where $y=(y_{1},\dots,y_{d})\in G$ . Thus the dual group of $G$ is simply

\widehat{G}=\{\chi_{\alpha}:{\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}}\}.

By the representation theory of finite abelian groups, any function $f:G=\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}}\to\mathbb{C}$ can be uniquely written as a linear combination of elements in $\widehat{G}$ [FH13, Chapter 1]. More specifically, we have the Fourier expansion of $f$ :

(1)

f=\sum_{\chi_{\alpha}\in\widehat{G}}f_{\alpha}\chi_{\alpha},

where for each $\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}$ , $f_{\alpha}$ is the Fourier coefficient of $f$ at $\chi_{\alpha}\in\widehat{G}$ defined by

(2)

f_{\alpha}\coloneqq\frac{1}{|G|}\sum_{y\in G}f(y)\overline{\chi_{\alpha}(y)}=\left(\prod_{j=1}^{d}n_{j}\right)^{-1}\sum_{y\in G}f(y)y^{-\alpha}.

We also define $\widehat{f}:\widehat{G}\to\mathbb{C}$ by $\widehat{f}(\chi_{\alpha})=f_{\alpha}$ , $\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}$ .

Since both $f$ and $\widehat{f}$ are functions on finite sets, they are naturally identified with their evaluation vectors $(f(y))_{y\in G}$ and $(\widehat{f}(\chi_{\alpha}))_{\chi_{\alpha}\in\widehat{G}}$ , respectively. Thus we have the following norms associated to $f$ and $\widehat{f}$ :

\lVert f\rVert_{\ell^{\infty}}\coloneqq\max_{y\in G}|f(y)|,\quad\lVert\widehat{f}\rVert_{\ell^{1}}\coloneqq\sum_{\chi_{\alpha}\in\widehat{G}}|\widehat{f}(\chi_{\alpha})|=\sum_{\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}}|f_{\alpha}|.

Moreover, we have the following inequalities:

(3)

\left|f_{\alpha}\right|\leq\frac{1}{|G|}\sum_{y\in G}\left|f(y)\right|\leq\lVert f\rVert_{\ell^{\infty}},\quad\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}.

(4)

\lVert f\rVert_{\ell^{\infty}}\leq\lVert\widehat{f}\rVert_{\ell^{1}}.
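As an illustration, the Fourier expansion (1), the coefficient formula (2) and the inequality (4) can be checked numerically on a small group. The following Python sketch is only a sanity check under stated assumptions: the group $\Gamma_{2}\times\Gamma_{3}$, the test function and all helper names are hypothetical choices made for this illustration, and each point $y\in G$ is stored through its exponent tuple $l$ with $y_{j}=e^{2\pi il_{j}/n_{j}}$.

```python
import cmath
from itertools import product

def group_elements(orders):
    """All points of Gamma_{n_1} x ... x Gamma_{n_d}, stored as exponent tuples l."""
    return list(product(*[range(n) for n in orders]))

def character(alpha, l, orders):
    """chi_alpha(y) = y^alpha at the point y with exponent tuple l."""
    phase = sum(a * lj / n for a, lj, n in zip(alpha, l, orders))
    return cmath.exp(2j * cmath.pi * phase)

def fourier_coefficients(f, orders):
    """Equation (2): f_alpha = |G|^{-1} * sum_y f(y) * conj(chi_alpha(y))."""
    pts = group_elements(orders)
    return {
        alpha: sum(f[l] * character(alpha, l, orders).conjugate() for l in pts) / len(pts)
        for alpha in pts  # the dual group is indexed by the same tuples
    }

def evaluate(coeffs, l, orders):
    """Equation (1): f(y) = sum_alpha f_alpha * chi_alpha(y)."""
    return sum(c * character(alpha, l, orders) for alpha, c in coeffs.items())

orders = (2, 3)  # hypothetical example group G = Gamma_2 x Gamma_3
f = {l: 3 * l[0] + l[1] ** 2 for l in group_elements(orders)}  # arbitrary integer-valued f
coeffs = fourier_coefficients(f, orders)

max_err = max(abs(evaluate(coeffs, l, orders) - f[l]) for l in f)  # expansion (1) is exact
linf_norm = max(abs(v) for v in f.values())
l1_norm = sum(abs(c) for c in coeffs.values())
print(max_err, linf_norm, l1_norm)  # inequality (4) predicts linf_norm <= l1_norm
```

The brute-force sums cost $O(|G|^{2})$ operations, which is acceptable only for very small groups; a fast Fourier transform would be the natural replacement in practice.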

The support of $f$ is defined to be

\operatorname{supp}(f)\coloneqq\left\{\chi_{\alpha}:f_{\alpha}\neq 0\right\}.

The sparsity of $f$ is defined to be the cardinality of $\operatorname*{supp}(f)$ . We denote the sparsity of $f$ by $\operatorname{sp}(f)$ .

Because of (1), we have the notion of degree of a function on $G=\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}}$ .

Definition 2.1 (degree).

The degree of $f$ as in (1) is defined by

\deg(f)\coloneqq\max_{\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}}\left\{\sum_{j=1}^{d}|\alpha_{j}|:f_{\alpha}\neq 0\right\}.

Here for each $\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}$, we denote by $\alpha_{j}$ the $j$-th element of $\alpha$, $1\leq j\leq d$.

It is worth mentioning that although every function $f$ on $G$ can be uniquely written as a Laurent polynomial (1), $\deg(f)$ in Definition 2.1 is not the same as the degree of the Laurent polynomial representing $f$. As an example, we consider $f(y)=y+y^{-2}$ on $\Gamma_{6}$. According to Definition 2.1, we have $\deg(f)=2$. However, as a Laurent polynomial, the degree of $f$ is $1$.
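The example above can be reproduced with a few lines of Python. The sketch assumes that $|\alpha_{j}|$ in Definition 2.1 is the absolute value of the representative of $\alpha_{j}$ in the window $(-n_{j}/2,n_{j}/2]$; this convention is not spelled out in the text and is chosen here because it agrees with the example and with the bound $\left\lceil\frac{N-1}{2}\right\rceil$ quoted for $\Gamma_{N}$. The function names are hypothetical.

```python
def symmetric_rep(a, n):
    """Representative of a mod n in the window (-n/2, n/2] (assumed convention)."""
    r = a % n
    return r - n if r > n / 2 else r

def degree(support, orders):
    """Definition 2.1: maximum over the support of sum_j |alpha_j|."""
    return max(
        sum(abs(symmetric_rep(a, n)) for a, n in zip(alpha, orders))
        for alpha in support
    )

# f(y) = y + y^{-2} on Gamma_6 has support {1, -2} (the exponent -2 is 4 mod 6).
deg_example = degree([(1,), (-2,)], (6,))
print(deg_example)
```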

Remark 2.2.

Two particularly interesting examples are $\Gamma_{N}$ and $\Gamma_{2}^{n}\coloneqq\overbrace{\Gamma_{2}\times\cdots\times\Gamma_{2}}^{\text{$n$ copies}}$. Clearly, a function on $\Gamma_{N}$ is a univariate Laurent polynomial of degree at most $\left\lceil\frac{N-1}{2}\right\rceil$ while a function on $\Gamma_{2}^{n}$ is a multilinear polynomial in $n$ variables, of degree at most $n$. For these two examples, Definition 2.1 coincides with the one used in [FSP16]. Moreover, monomials of negative powers do appear in (1) in general, unless $G\simeq\Gamma_{2}^{n}$ for some $n\in\mathbb{N}$.

It is obvious from the definition that the degree of $f$ is at most $\sum_{j=1}^{d}\left\lceil\frac{n_{j}-1}{2}\right\rceil$ . Moreover, the sparsity of $f$ is controlled by the degree of $f$ . Namely, we have

\operatorname{sp}(f)\leq\sum_{k=0}^{\deg(f)}\left\lvert\{(m_{1},\dots,m_{d})\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}:|m_{1}|+\cdots+|m_{d}|=k\}\right\rvert.

2.2. Fourier sum of squares on abelian groups

This subsection concerns the theory of FSOS developed in [FSP16, STKI17, YYZ22]. We first recall the definition of Fourier sum of squares.

Definition 2.3 (polynomial FSOS).

Let $f$ be a non-negative function on $G$ and let $S$ be a subset of $\widehat{G}$ . We say that $f$ admits a polynomial FSOS supported in $S$ if there exists a finite family $\{g_{j}:\operatorname*{supp}(g_{j})\subseteq S\}_{j\in J}$ of complex valued functions on $G$ such that

(5)

f=\sum_{j\in J}|g_{j}|^{2}.

Definition 2.4 (rational FSOS).

We say that $f$ admits a rational FSOS supported in $S$ if there exists a pair $(\{g_{j}:\operatorname*{supp}(g_{j})\subseteq S\}_{j\in J},\{h_{i}:\operatorname*{supp}(h_{i})\subseteq S\}_{i\in I})$ of finite families of complex valued functions on $G$ such that $\sum_{i\in I}|h_{i}|^{2}$ is non-vanishing and

(6)

f=\frac{\sum_{j\in J}|g_{j}|^{2}}{\sum_{i\in I}|h_{i}|^{2}}.

In particular, a polynomial (resp. rational) FSOS of the form $\{g\}$ (resp. $(\{g\},\{h\})$ ) is called a rank one polynomial (resp. rational) FSOS.

Remark 2.5.

If $G=\Gamma_{2}^{n}$ , we may further require $g_{j}$ ’s and $h_{i}$ ’s in (5) and (6) to be real valued. We claim that $f$ admits a real FSOS supported in $S\subseteq\mathbb{Z}_{2}^{n}$ if and only if it admits a complex FSOS supported in $S$ . Indeed, we can write a complex valued function $g$ on $\Gamma_{2}^{n}$ as $g=\sum_{\alpha\in\mathbb{Z}_{2}^{n}}c_{\alpha}\chi_{\alpha}$ . Since $\chi_{\alpha}$ is real-valued in this case, we have $|g|^{2}=\operatorname{Re}(g)^{2}+\operatorname{Im}(g)^{2}$ where

\operatorname{Re}(g)=\sum_{\alpha\in\mathbb{Z}_{2}^{n}}\operatorname{Re}(c_{\alpha})\chi_{\alpha},\quad\operatorname{Im}(g)=\sum_{\alpha\in\mathbb{Z}_{2}^{n}}\operatorname{Im}(c_{\alpha})\chi_{\alpha}.

This implies $\operatorname*{supp}(\operatorname{Re}(g))\subseteq\operatorname*{supp}(g)$ and $\operatorname*{supp}(\operatorname{Im}(g))\subseteq\operatorname*{supp}(g)$. (Footnote 1: It is worth noticing that for a complex valued function $g$ on a general finite abelian group, it may happen that $\operatorname*{supp}(\operatorname{Re}(g))\not\subseteq\operatorname*{supp}(g)$ and $\operatorname*{supp}(\operatorname{Im}(g))\not\subseteq\operatorname*{supp}(g)$.) Therefore, if $\{g_{j}\}_{j\in J}$ is a complex valued polynomial FSOS of $f$ on $\Gamma_{2}^{n}$, then $\{\operatorname{Re}(g_{j}),\operatorname{Im}(g_{j})\}_{j\in J}$ provides a real valued polynomial FSOS of $f$. Moreover, these two FSOS share the same support. The argument for complex valued rational FSOS is similar. As a consequence of the claim, it suffices to focus on real FSOS for functions on $\Gamma_{2}^{n}$.

By the work of Hilbert [Hil88], for each pair $(n,d)$ of positive integers such that $n,d\geq 4$, there always exist non-negative polynomials which cannot be written as a sum of squares of finitely many polynomials. Moreover, according to the seminal work of Artin [Art27] which solves Hilbert’s seventeenth problem, any non-negative polynomial can be written as a sum of squares of rational functions. The distinction between the existence of polynomial and rational SOS can be diminished by restricting the problem to a finite set of points. To be more precise, we have the following proposition, which can be regarded as a cornerstone of the theory of FSOS.

Proposition 2.6.

[FSP16, Proposition 2] A non-negative function $f$ on an abelian group admits both polynomial and rational FSOS.

As a consequence of Proposition 2.6, a function $f$ on $G$ is non-negative if and only if there exists a Hermitian positive semidefinite matrix $H=(H_{\alpha,\beta})_{\chi_{\alpha},\chi_{\beta}\in\widehat{G}}\in\mathbb{C}^{|G|\times|G|}$ such that the following holds for each $\chi_{\beta}\in\widehat{G}$ :

(7)

\sum_{\chi_{\alpha}\in\widehat{G}}H_{\alpha,\alpha+\beta}=f_{\beta}.

Here we index rows and columns of $H$ by elements in $\widehat{G}$ . Any $|G|\times|G|$ matrix $H\succeq 0$ satisfying (7) is called a Gram matrix of $f$ . Gram matrices are of great importance in both the theoretical and computational study of FSOS. Indeed, if $H$ is a Gram matrix of $f$ and $H=M^{\ast}M$ for some $M=(M_{j,\alpha})_{1\leq j\leq r,\chi_{\alpha}\in\widehat{G}}\in\mathbb{C}^{r\times|G|}$ , then we have

(8)

f=\sum_{j=1}^{r}\Big{\lvert}\sum_{\chi_{\alpha}\in\widehat{G}}M_{j,\alpha}\chi_{\alpha}\Big{\rvert}^{2}.
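The passage from a factorization $H=M^{\ast}M$ to the FSOS (8) and the linear constraints (7) can be illustrated on $\Gamma_{3}$. In the sketch below the matrix $M$ is an arbitrary illustrative choice (not taken from the paper), $f$ is defined through (8), and we check that the Fourier coefficients of $f$ computed by (2) agree with the sums $\sum_{\alpha}H_{\alpha,\alpha+\beta}$ in (7).

```python
import cmath

N = 3  # G = Gamma_3; characters chi_a(y) = y^a with a in Z_3
points = [cmath.exp(2j * cmath.pi * l / N) for l in range(N)]

# Rows of M hold the Fourier coefficients of g_1 and g_2 (illustrative values).
M = [[1 + 0j, 0.5j, 0j],
     [0j, 2 + 0j, -1 + 0j]]

# Gram matrix H = M^* M, i.e. H[a][b] = sum_j conj(M[j][a]) * M[j][b].
H = [[sum(row[a].conjugate() * row[b] for row in M) for b in range(N)]
     for a in range(N)]

def g(j, y):
    """g_j(y) = sum_a M[j][a] * y^a."""
    return sum(M[j][a] * y ** a for a in range(N))

# Equation (8): f = sum_j |g_j|^2, evaluated pointwise on G.
f_vals = [sum(abs(g(j, y)) ** 2 for j in range(len(M))) for y in points]

# Fourier coefficients of f from (2), and the left-hand side of (7).
f_hat = [sum(fv * y ** (-b) for fv, y in zip(f_vals, points)) / N for b in range(N)]
lhs = [sum(H[a][(a + b) % N] for a in range(N)) for b in range(N)]

gap = max(abs(u - v) for u, v in zip(lhs, f_hat))
print(gap)  # should vanish up to rounding
```

Index arithmetic in $H_{\alpha,\alpha+\beta}$ is taken modulo $N$, reflecting that the rows and columns of $H$ are indexed by $\widehat{G}\simeq\mathbb{Z}_{3}$.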

2.3. Sparsity of FSOS

We measure the quality of a polynomial or rational FSOS by its sparsity. To that end, we define the notion of sparsity for a finite family of functions on $G$ .

Definition 2.7 (sparsity).

Let $\{f_{k}\}_{k\in K}$ be a finite family of functions on $G$ . The sparsity of $\{f_{k}\}_{k\in K}$ is

\operatorname{sp}(\{f_{k}\}_{k\in K})\coloneqq\left\lvert\bigcup_{k\in K}\operatorname{supp}(f_{k})\right\rvert.

It is clear that the sparsity of a polynomial (resp. rational) FSOS $\{g_{j}\}_{j\in J}$ (resp. $\{(g_{j},h_{j})\}_{j\in J}$ ) of a given function $f$ on $G$ can be controlled by $\max_{j\in J}\{\deg g_{j}\}$ (resp. $\max_{j\in J}\{\deg g_{j},\deg h_{j}\}$ ). For ease of reference, we record the following theorems which provide an upper bound for the sparsity of a polynomial FSOS of a non-negative function on $\Gamma_{2}^{n}$ and $\Gamma_{N}$ .

Theorem 2.8.

[STKI17, Theorem 3.2] Every degree $d$ non-negative function on $\Gamma_{2}^{n}$ has a polynomial FSOS

f=\sum_{j\in J}|g_{j}|^{2},

where $\max_{j\in J}\{\deg(g_{j})\}\leq\lceil(n+d-1)/2\rceil$ .

Theorem 2.9.

[FSP16, Theorem 3] Every degree $d$ non-negative function on $\Gamma_{N}$ has a polynomial FSOS

f=\sum_{j\in J}|g_{j}|^{2},

where $\operatorname{sp}\left(\{g_{j}\}_{j\in J}\right)\leq 3d\log\left(\frac{N}{d}\right)$ .

Theorem 2.10.

[BGP16, Theorem 1.1 & 1.2] For each non-negative quadratic polynomial $p$ on $\{0,1\}^{n}\subseteq\mathbb{R}^{n}$, there exists a pair $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ of finite families of real polynomials such that

(9)

p=\frac{\sum_{j\in J}g_{j}^{2}}{\sum_{i\in I}h_{i}^{2}},

where $\max_{j\in J}\{\deg(g_{j})\}\leq 1+\lfloor\frac{n}{2}\rfloor$ and $\max_{i\in I}\{\deg(h_{i})\}\leq\lfloor\frac{n}{2}\rfloor$ . Moreover, the polynomial

f(y_{1},\dots,y_{n})=\left(\sum_{j=1}^{n}y_{j}-\left\lfloor\frac{n}{2}\right\rfloor\right)\left(\sum_{j=1}^{n}y_{j}-\left\lfloor\frac{n}{2}\right\rfloor-1\right)

is non-negative on $\{0,1\}^{n}$ but (9) does not hold for any $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ if $\max_{j\in J}\{\deg(g_{j})\}\leq\lfloor\frac{n}{2}\rfloor$ .

3. Existence of sparse FSOS certificates

Let $G$ be an abelian group and let $f:G\to\mathbb{Z}$ be an integer-valued function on $G$ . For an integer $L$ , we may certify the inequality $f\geq L$ by sum of squares of $f-L$ . Moreover, we observe that the range of $f$ is a finite subset of $\mathbb{Z}$ . This enables us to certify $f\geq L$ by sum of squares of $f-L$ with errors, which is the content of the lemma that follows.

Lemma 3.1.

The following are equivalent:

(i)

$f\geq L$ .
(ii)

$f-L+\delta\geq 0$ for some function $\delta:G\to[0,1)$ .
(iii)

$f-L+\delta\geq 0$ for any function $\delta:G\to[0,1)$ .

In particular, if $f-L+\delta\geq 0$ for some function $\delta:G\to(-\infty,1)$, then (i)–(iii) hold.

Remark 3.2.

Clearly, Proposition 2.6 ensures the existence of both polynomial and rational FSOS certificates for lower bounds. In theory, it suffices to discuss FSOS of $f-L$ . In practice, however, it is usually not possible to find an exact FSOS of $f-L$ , since errors are inevitable in numerical computation. Fortunately, Lemma 3.1 implies that if one can find an FSOS of $f-L$ with a small error, it still certifies $f\geq L$ . For example, if

\left\lVert f-L-\sum_{j\in J}|g_{j}|^{2}\right\rVert_{\ell^{\infty}}<1,

then $\{g_{j}\}_{j\in J}$ certifies $f\geq L$. Indeed, taking $\delta=\sum_{j\in J}|g_{j}|^{2}-f+L$, we have $f-L+\delta=\sum_{j\in J}|g_{j}|^{2}\geq 0$ and $\delta<1$, so Lemma 3.1 applies; such a family $\{g_{j}\}_{j\in J}$ is what we call a polynomial FSOS certificate for $f\geq L$. This indicates that it is not necessary to certify $f\geq L$ by an exact FSOS of $f-L$. Thus FSOS certificates provide us a wider class of certifiers for lower bounds. In [SL23], the relative error between the lower bound of a low degree FSOS approximation of $f$ and $f_{\min}$ is analyzed on the hypercube. It implies that if a small relative error is allowed, then one might certify an integer-valued function $f\geq f_{\min}$ by a lower degree FSOS on the hypercube.

Due to Lemma 3.1, we have the following definition.

Definition 3.3 (FSOS certificates for lower bound).

A polynomial (resp. rational) FSOS certificate for $f\geq L$ is a polynomial (resp. rational) FSOS $\{g_{j}\}_{j\in J}$ (resp. $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ ) of $f-L+\delta$ for some function $\delta:G\to(-\infty,1)$ .

In the sequel, if $f$ and $L$ are understood from the context, we call the finite family $\{g_{j}\}_{j\in J}$ (resp. $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ ) in Definition 3.3 a polynomial (resp. rational) FSOS certificate. We further abbreviate polynomial/rational FSOS certificate as FSOS certificate if there is no need to emphasize whether it is polynomial or rational.

Remark 3.4.

It is imperative to clarify the distinction between FSOS certificates for lower bound (Definition 3.3) and FSOS (Definitions 2.3 and 2.4). As a concrete example, we consider the quadratic function $f$ in Theorem 2.10, for which there does not exist $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ such that $(\sum_{i\in I}h_{i}^{2})f=\sum_{j\in J}g_{j}^{2}$ and $\max_{j\in J,i\in I}\{\deg g_{j},\deg h_{i}\}\leq\lfloor n/2\rfloor$. On the other hand, we observe that

		$\displaystyle f(x_{1},\dots,x_{n})$	$\displaystyle=\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor\right)\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor-1\right)$
			$\displaystyle=\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor-\frac{1}{2}+\frac{1}{2}\right)\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor-\frac{1}{2}-\frac{1}{2}\right)$
			$\displaystyle=\left(\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor-\frac{1}{2}\right)^{2}-\frac{1}{4}$
			$\displaystyle\geq-\frac{1}{4}$

for each $(x_{1},\dots,x_{n})\in\{0,1\}^{n}$ . This implies that $f$ admits a polynomial FSOS certificate for $f\geq 0$ as $f$ is an integer-valued function.
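The identity $f=g^{2}-\frac{1}{4}$ with $g=\sum_{i=1}^{n}x_{i}-\left\lfloor\frac{n}{2}\right\rfloor-\frac{1}{2}$ can be confirmed by exhaustive enumeration for small $n$. The following Python sketch (exact rational arithmetic, hypothetical function name `check`) verifies it for $1\leq n\leq 8$ and confirms that the integer-valued $f$ is indeed non-negative on $\{0,1\}^{n}$.

```python
from fractions import Fraction
from itertools import product

def check(n):
    """Verify f = g^2 - 1/4 and f >= 0 at every point of {0,1}^n."""
    half = n // 2
    for x in product((0, 1), repeat=n):
        s = sum(x)
        f = (s - half) * (s - half - 1)
        g = s - half - Fraction(1, 2)       # degree-1 function supported on 1, x_1, ..., x_n
        assert f == g * g - Fraction(1, 4)  # exact identity, so f >= -1/4
        assert f >= 0                       # f is an integer > -1, hence non-negative
    return True

all_ok = all(check(n) for n in range(1, 9))
print(all_ok)
```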

Now we are ready to address Problem 1.1. Our solution to Problem 1.1 is the polynomial/rational FSOS certificate. The rest of this section is devoted to a discussion of the existence of low degree polynomial and rational FSOS certificates.

3.1. Polynomial FSOS certificates

In this subsection, we focus on polynomial FSOS certificates. From the perspective of computational complexity theory, the criterion to measure the quality of a certificate is its complexity. In our case, the quality of a polynomial FSOS certificate $\{g_{j}\}_{j\in J}$ is measured by its sparsity defined in Definition 2.7.

According to Definition 3.3, for an integer $L\leq f_{\min}\coloneqq\min_{x\in G}\{f(x)\}$ , we want to find a finite family $\{g_{j}\}_{j\in J}$ of functions on $G$ such that

(10)

\left\lVert(f-L+\varepsilon)-\sum_{j\in J}|g_{j}|^{2}\right\rVert_{\ell^{\infty}}=\max_{y\in G}\left\lvert(f(y)-L+\varepsilon)-\sum_{j\in J}|g_{j}(y)|^{2}\right\rvert<1-\varepsilon

for some $\varepsilon\in[0,1)$. Here we implicitly take the function $\delta:G\to(-\infty,1)$ in Definition 3.3 to be $\delta=-(f-L+\varepsilon)+\sum_{j\in J}|g_{j}|^{2}$. We remark that $\varepsilon$ appears on both sides of (10) for computational purposes. One can easily rewrite (10) so that $\varepsilon$ only appears on the right side. For numerical examples and algorithms in this paper, we simply choose $\varepsilon=1/2$.

We notice that for any $\varepsilon\in[0,1)$ we may take $g=\sqrt{f-L+\varepsilon}$ so that (10) holds trivially for $\{g\}$ . However, $\sqrt{f-L+\varepsilon}$ usually fails to be sparse. This can be easily seen from the following illustrative example.

Example 3.5.

We consider $G=\Gamma_{2}^{3}$ and

(11)

f(y_{1},y_{2},y_{3})=\frac{13}{8}+\frac{3}{8}y_{1}+\frac{3}{8}y_{2}+\frac{3}{8}y_{3}+\frac{1}{8}y_{1}y_{2}+\frac{1}{8}y_{1}y_{3}+\frac{1}{8}y_{2}y_{3}-\frac{1}{8}y_{1}y_{2}y_{3}.

We take $\varepsilon=1/2,L=f_{\min}=1$ . Then $g=\sqrt{f-L+\varepsilon}=\sqrt{f-1/2}$ is a dense polynomial:

	$\displaystyle g(y_{1},y_{2},y_{3})$	$\displaystyle=\frac{4\sqrt{2}+3\sqrt{6}+\sqrt{10}}{16}+\frac{-2\sqrt{2}+\sqrt{6}+\sqrt{10}}{16}y_{1}+\frac{-2\sqrt{2}+\sqrt{6}+\sqrt{10}}{16}y_{2}+\frac{-2\sqrt{2}+\sqrt{6}+\sqrt{10}}{16}y_{3}$
		$\displaystyle+\frac{-\sqrt{6}+\sqrt{10}}{16}y_{1}y_{2}+\frac{-\sqrt{6}+\sqrt{10}}{16}y_{1}y_{3}+\frac{-\sqrt{6}+\sqrt{10}}{16}y_{2}y_{3}+\frac{2\sqrt{2}-3\sqrt{6}+\sqrt{10}}{16}y_{1}y_{2}y_{3}.$

It is clear that $\operatorname{sp}(\{g\})=8$ . However, by Algorithm 1, we obtain a polynomial FSOS certificate of sparsity $4$ :

	$\displaystyle l_{1}(y)$	$\displaystyle=0.2615y_{1}+0.2615y_{2}+0.2615y_{3}+0.7170,$
	$\displaystyle l_{2}(y)$	$\displaystyle=-0.0542y_{1}-0.0542y_{2}+0.1085y_{3},$
	$\displaystyle l_{3}(y)$	$\displaystyle=-0.0939y_{1}+0.0939y_{2}.$

In fact, according to (4), it is easy to verify that

\left\lVert f-1-\sum_{i=1}^{3}|l_{i}|^{2}\right\rVert_{\ell^{\infty}}\leq\left\lVert\widehat{f}-1-\sum_{i=1}^{3}\widehat{|l_{i}|^{2}}\right\rVert_{\ell^{1}}<0.255,

from which one may conclude that $\{l_{i}\}_{i=1}^{3}$ certifies $f\geq 1$ .
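The $\ell^{1}$ estimate in this example can be reproduced directly from the printed coefficients. The Python sketch below expands $\sum_{i=1}^{3}l_{i}^{2}$ using $y_{i}^{2}=1$, subtracts it from $f-1$, and computes both the $\ell^{1}$-norm of the Fourier coefficients of the residual and its $\ell^{\infty}$-norm on $\Gamma_{2}^{3}$. It uses the four-digit coefficients displayed above, so the numbers are accurate only up to that rounding; the dictionary encoding of monomials and the helper names are choices made for this illustration.

```python
from itertools import product

# Fourier coefficients of f - 1 from (11); keys are the index sets of the monomials.
target = {(): 5 / 8, (1,): 3 / 8, (2,): 3 / 8, (3,): 3 / 8,
          (1, 2): 1 / 8, (1, 3): 1 / 8, (2, 3): 1 / 8, (1, 2, 3): -1 / 8}

# l_1, l_2, l_3 with the rounded coefficients printed in the example.
ls = [{(): 0.7170, (1,): 0.2615, (2,): 0.2615, (3,): 0.2615},
      {(1,): -0.0542, (2,): -0.0542, (3,): 0.1085},
      {(1,): -0.0939, (2,): 0.0939}]

def multiply(p, q):
    """Product of two multilinear polynomials on {-1,1}^3, reduced via y_i^2 = 1."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            key = tuple(sorted(set(a) ^ set(b)))  # symmetric difference of index sets
            out[key] = out.get(key, 0.0) + ca * cb
    return out

def evaluate(p, y):
    """Evaluate a polynomial stored as {index set: coefficient} at y in {-1,1}^3."""
    total = 0.0
    for key, c in p.items():
        term = c
        for i in key:
            term *= y[i - 1]
        total += term
    return total

# Residual f - 1 - sum_i l_i^2 in the character basis.
residual = dict(target)
for l in ls:
    for key, c in multiply(l, l).items():
        residual[key] = residual.get(key, 0.0) - c

l1_norm = sum(abs(c) for c in residual.values())
linf_norm = max(abs(evaluate(residual, y)) for y in product((-1, 1), repeat=3))
print(l1_norm, linf_norm)
```

Since the residual is bounded by the $\ell^{1}$-norm through (4) and that norm is below $1$, the family $\{l_{i}\}_{i=1}^{3}$ satisfies the requirement of Definition 3.3 without any pointwise evaluation on $G$.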

3.1.1. Existence of low degree polynomial FSOS certificate

This sub-subsection is concerned with the existence of sparse polynomial FSOS certificates. To be more precise, we prove that there exists a low degree function $g$ on $G$ such that $\{g\}$ satisfies (10). As a side remark, the existence of $g$ is equivalent to the existence of a rank one sparse Gram matrix defined in (7) of $f-L+\delta$ for some function $\delta:G\to(-\infty,1)$.

Our discussion relies on approximation theory of functions. For the reader’s convenience, we record the following classic estimate for Chebyshev interpolation.

Lemma 3.6.

[SB02] Let $a<b$ be two real numbers. For each $f\in C^{d+1}([a,b])$ , we have

\max_{t\in[a,b]}|f(t)-p(t)|\leq\left(\frac{b-a}{2}\right)^{d+1}\frac{\max_{t\in[a,b]}|f^{(d+1)}(t)|}{2^{d}(d+1)!},

where $p$ is the degree $d$ Chebyshev interpolation polynomial for $f$ on $[a,b]$ .

Next we apply Lemma 3.6 to $\sqrt{t}$ on $[\alpha,1]$ for some $\alpha>0$ , from which we obtain the next lemma.

Lemma 3.7.

Let $f$ be a function on $G$ and let $f_{\min}$ (resp. $f_{\max}$ ) be the minimum (resp. maximum) of $f$ . If $L<\frac{5f_{\min}-f_{\max}}{4}$ , then for any $\varepsilon>0$ , there exists a function $P$ on $G$ of degree

\deg(f)\left\lfloor\frac{1-\log\varepsilon+\log(f_{\max}-L)}{2+\log(f_{\min}-L)-\log(f_{\max}-f_{\min})}\right\rfloor

such that

\left\lVert f-L-|P|^{2}\right\rVert_{\ell^{\infty}}<\varepsilon.

Proof.

We notice that the image of $f$ is contained in $[f_{\min},f_{\max}]\cap\mathbb{Z}$ . Thus the image of $f_{0}\coloneqq(f-L)/(f_{\max}-L)$ is contained in

\left\{\frac{f_{\min}-L}{f_{\max}-L},\frac{f_{\min}-L+1}{f_{\max}-L},\dots,1\right\}.

For simplicity we denote $\alpha\coloneqq\frac{f_{\min}-L}{f_{\max}-L}$ . By assumption we have $\frac{1}{5}<\alpha\leq 1$ .

Let $p(t)$ be the degree $d=\left\lfloor\frac{1-\log\varepsilon+\log(f_{\max}-L)}{2+\log(f_{\min}-L)-\log(f_{\max}-f_{\min})}\right\rfloor$ Chebyshev interpolation polynomial for $\sqrt{t}$ on $[\alpha,1]$ . According to Lemma 3.6, we have

	$\displaystyle\max_{t\in[\alpha,1]}\lvert p(t)-\sqrt{t}\rvert$	$\displaystyle\leq\left(\frac{1-\alpha}{2}\right)^{d+1}\frac{(2d-1)!!\alpha^{\frac{1}{2}-(d+1)}}{2^{2d+1}(d+1)!}$
		$\displaystyle<\frac{\sqrt{\alpha}}{d+1}\left(\frac{1-\alpha}{4\alpha}\right)^{d+1}$
		$\displaystyle<\left(\frac{1-\alpha}{4\alpha}\right)^{d+1}$
		$\displaystyle\leq\frac{\varepsilon}{2(f_{\max}-L)},$

where $n!!=n\cdot(n-2)\cdot(n-4)\cdots(n-2\lfloor\frac{n-1}{2}\rfloor)$ . Next we define $P\coloneqq\sqrt{f_{\max}-L}\left(p\circ f_{0}\right)$ . Thus

	$\displaystyle\max_{x\in G}\left\lvert\lvert P(x)\rvert^{2}-(f(x)-L)\right\rvert$
	$\displaystyle=\max_{x\in G}\left\lvert(f_{\max}-L)p(f_{0}(x))^{2}-(f_{\max}-L)\left(\sqrt{f_{0}(x)}\right)^{2}\right\rvert$
	$\displaystyle=\max_{x\in G}\left\{\left\lvert p(f_{0}(x))-\sqrt{f_{0}(x)}\right\rvert\left\lvert(f_{\max}-L)\left(p(f_{0}(x))+\sqrt{f_{0}(x)}\right)\right\rvert\right\}$
	$\displaystyle\leq 2(f_{\max}-L)\max_{x\in G}\left\lvert p(f_{0}(x))-\sqrt{f_{0}(x)}\right\rvert$
	$\displaystyle<\varepsilon$

and this completes the proof of the lemma. ∎
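The approximation step in the proof can be observed numerically. The Python sketch below interpolates $\sqrt{t}$ on $[\alpha,1]$ at the $d+1$ Chebyshev nodes (the roots of the Chebyshev polynomial $T_{d+1}$ mapped to $[\alpha,1]$, which is our assumption about what "Chebyshev interpolation" means in Lemma 3.6), and compares the observed error on a fine grid with the first and third bounds in the chain of inequalities above. The values $\alpha=0.25$ and $d=6$, the grid size and the helper names are illustrative choices.

```python
import math

def chebyshev_nodes(a, b, d):
    """The d+1 Chebyshev nodes on [a, b] (roots of T_{d+1}, mapped affinely)."""
    return [0.5 * (a + b) + 0.5 * (b - a) * math.cos((2 * k + 1) * math.pi / (2 * d + 2))
            for k in range(d + 1)]

def interpolate(nodes, values):
    """Return the Lagrange interpolant through (nodes, values) as a callable."""
    def p(t):
        total = 0.0
        for i, (ti, vi) in enumerate(zip(nodes, values)):
            w = 1.0
            for j, tj in enumerate(nodes):
                if j != i:
                    w *= (t - tj) / (ti - tj)
            total += vi * w
        return total
    return p

def double_factorial(n):
    """n!! for odd n >= -1, with the convention (-1)!! = 1."""
    return math.prod(range(n, 0, -2))

alpha, d = 0.25, 6
nodes = chebyshev_nodes(alpha, 1.0, d)
p = interpolate(nodes, [math.sqrt(t) for t in nodes])

grid = [alpha + (1.0 - alpha) * k / 2000 for k in range(2001)]
max_err = max(abs(p(t) - math.sqrt(t)) for t in grid)

# First bound in the proof (Lemma 3.6 applied to sqrt) and the cruder third bound.
sharp = (((1 - alpha) / 2) ** (d + 1) * double_factorial(2 * d - 1)
         * alpha ** (0.5 - (d + 1)) / (2 ** (2 * d + 1) * math.factorial(d + 1)))
crude = ((1 - alpha) / (4 * alpha)) ** (d + 1)
print(max_err, sharp, crude)
```

The grid maximum only approximates the true maximum over $[\alpha,1]$ from below, so this is a plausibility check of the estimate rather than a proof.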

As a straightforward application of Lemma 3.7, we have the following theorem on the existence of a low degree polynomial FSOS certificate for the lower bound.

Theorem 3.8 (low degree polynomial FSOS certificate).

Let $\beta\in[0,1)$ be a fixed real number and let $f$ be a non-negative integer-valued function on $G$ . If

f_{\max}<(5-4\beta)f_{\min},

then there exists a rank-one FSOS certificate $\{g\}$ for $f\geq L\coloneqq[\beta f_{\min}]$ , such that

\deg(g)\leq\deg(f)\left\lfloor\frac{3+\log(f_{\max}-L)}{2+\log(f_{\min}-L)-\log(f_{\max}-f_{\min})}\right\rfloor.

Here $[x]$ denotes the nearest integer to $x\in\mathbb{R}$ .

Proof.

We may take $\varepsilon=1/4$ and $L=[\beta f_{\min}]$ in Lemma 3.7, which implies the existence of a function $P$ on $G$ of the desired degree such that

\max_{a\in G}|f(a)-L-|P(a)|^{2}|<1/4.

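Since $1/4<1$, the family $\{P\}$ is a rank-one polynomial FSOS certificate for $f\geq L$ in the sense of Definition 3.3. ∎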

We illustrate Theorem 3.8 by Figure 1. For each pair $(\alpha,\beta)$ in the shaded region in Figure 1 and any function $f$ with $f_{\min}=\alpha f_{\max}\geq 0$, one can certify the inequality $f\geq[\beta f_{\min}]$ by a polynomial FSOS of degree at most $\deg(f)\left\lfloor\frac{3+\log((1-\alpha\beta)f_{\max})}{2+\log((1-\beta)\alpha)-\log(1-\alpha)}\right\rfloor$.

Figure 1. threshold

In particular, if $\beta=0$ and $\alpha\in(1/5,1)$ , the degree bound simplifies to $\deg(f)\left\lfloor\frac{3+\log f_{\max}}{2+\log\alpha-\log(1-\alpha)}\right\rfloor$ .
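The degree bound of Theorem 3.8 is easy to evaluate numerically. The following Python sketch does so under two assumptions that are not stated in this section: $\log$ is taken to base $2$ (this appears consistent with the constants $3=\log_{2}8$ and $2=\log_{2}4$ produced by $\varepsilon=1/4$ and the ratio $\frac{1-\alpha}{4\alpha}$ in the proof of Lemma 3.7), and ties in the nearest integer $[x]$ are rounded up. The function names and arguments are illustrative only.

```python
import math


def nearest_integer(x: float) -> int:
    # [x] in Theorem 3.8; rounding ties upward is an assumption of this sketch.
    return math.floor(x + 0.5)


def poly_fsos_degree_bound(f_min: float, f_max: float, deg_f: int, beta: float = 0.0) -> int:
    """Evaluate deg(f) * floor((3 + log(f_max - L)) / (2 + log(f_min - L) - log(f_max - f_min))).

    Assumption: log is the base-2 logarithm. L = [beta * f_min].
    """
    L = nearest_integer(beta * f_min)
    if not (f_max > f_min > L):
        raise ValueError("need L < f_min < f_max for the logarithms to be defined")
    numerator = 3 + math.log2(f_max - L)
    denominator = 2 + math.log2(f_min - L) - math.log2(f_max - f_min)
    if denominator <= 0:
        # This happens when f_max - f_min >= 4 (f_min - L), i.e. outside the regime of Theorem 3.8.
        raise ValueError("hypothesis of Theorem 3.8 is not satisfied")
    return deg_f * math.floor(numerator / denominator)


if __name__ == "__main__":
    # f_min = 3, f_max = 4, beta = 0, deg(f) = 2
    print(poly_fsos_degree_bound(3, 4, 2))
```

With $\beta=0$ the guard on the denominator reduces to $f_{\max}<5f_{\min}$ , that is, $\alpha>1/5$ .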

We may apply this bound to the special case $G=\Gamma_{2}^{n}$ . Thus we can certify $f\geq 0$ by a polynomial FSOS of degree at most

\deg(f)\left\lfloor\frac{3+\deg(f)\log n+\log\widehat{f}_{\max}}{2+\log\alpha-\log(1-\alpha)}\right\rfloor,

since $f_{\max}\leq n^{\deg(f)}\widehat{f}_{\max}$ . Here $\widehat{f}_{\max}\coloneqq\max_{\chi_{\alpha}\in\widehat{G}}|\widehat{f}(\chi_{\alpha})|$ . If we further assume that $\alpha$ is fixed and $\widehat{f}_{\max}\leq M$ for some constant $M>0$ , then the degree bound is $O(\deg(f)^{2}\log n)$ .

Another special case is $G=\Gamma_{N}$ . In this case, the upper bound becomes

\deg(f)\left\lfloor\frac{3+\log(2\deg(f)+1)+\log\widehat{f}_{\max}}{2+\log\alpha-\log(1-\alpha)}\right\rfloor,

since $f_{\max}\leq(2\deg(f)+1)\widehat{f}_{\max}$ . Again, if $\alpha$ is fixed and $\widehat{f}_{\max}\leq M$ , then the degree bound is $O(\deg(f)\log(\deg(f)))$ . Since $f$ in this case is a univariate Laurent polynomial, $O(\deg(f)\log(\deg(f)))$ is also an upper bound for the sparsity of polynomial FSOS certificates.

Remark 3.9.

For any non-negative function $f$ on $\Gamma_{2}^{n}$ , Theorem 2.8 guarantees the existence of a degree $O(n+\deg(f))$ polynomial FSOS certificate for the lower bound. By comparison, if we assume that $\deg(f)$ is small, $\alpha$ is fixed and $\widehat{f}_{\max}\leq M$ for some constant $M>0$ , then Theorem 3.8 ensures the existence (in some cases) of a degree $O(\log n)$ polynomial FSOS certificate.

Similarly, according to Theorem 2.9, every non-negative integer-valued function of degree $d$ on $\Gamma_{N}$ admits a polynomial FSOS certificate of sparsity $O\left(d\log\left(\frac{N}{d}\right)\right)$ . Our bound $O(d\log d)$ in this case supplies a better upper bound provided that $d$ is small, $\alpha$ is fixed and $\widehat{f}_{\max}\leq M$ . Since Theorems 2.8 and 2.9 concern the existence of a sparse polynomial FSOS of a non-negative function $f$ , we must have $f=\sum_{j\in J}|g_{j}|^{2}$ . The improvement in Theorem 3.8 should be regarded as a benefit of allowing errors in Definition 3.3.

3.1.2. Two impossibility theorems

We notice that the key ingredient in the proof of Lemma 3.7 and thus Theorem 3.8 is the fact that one can approximate $\sqrt{t}$ on $[\alpha,1]$ exponentially by a polynomial when $\alpha>1/5$ . Thus it is tempting to expect an exponential approximation of $\sqrt{t}$ on $[0,1]$ . Unfortunately, this is not possible, which indicates that Theorem 3.8 is already the optimal result one can obtain by our method. We prove this non-existence result in the following. To begin with, we recall the following theorem on the approximation error of the absolute value function.

Theorem 3.10.

[VK87] Let $E_{2n}$ be the error of the best uniform approximation of $|t|$ by polynomials of degree at most $2n$ on the interval $[-1,1]$ . Then $\lim_{n\rightarrow\infty}nE_{2n}$ exists.

Using Theorem 3.10, we can prove that it is not possible to approximate the square root function exponentially and uniformly.

Theorem 3.11 (impossibility theorem for uniform approximation).

There exists no polynomial sequence $\{p_{i}:\operatorname{deg}(p_{i})\leq i\}_{i=1}^{\infty}$ converging uniformly and exponentially to the square root function on the interval $[0,1]$ . As a consequence, Theorem 3.8 can not be improved by approximating $\sqrt{t}$ uniformly. Furthermore, there exists a sequence of polynomials $\{r_{n}:\deg(r_{n})=n\}_{n=1}^{\infty}$ that uniformly approximates the square root function on interval $[0,1]$ with error $E^{\prime}_{n}$ , such that $\lim_{n\rightarrow\infty}nE^{\prime}_{n}$ exists.

Proof.

Suppose there exists such a sequence $\{p_{i}\}_{i=1}^{\infty}$ . Since $\sqrt{t^{2}}=|t|$ , the polynomial sequence $\{p_{i}(t^{2})\}_{i=1}^{\infty}$ converges uniformly and exponentially to the absolute value function on the interval $[-1,1]$ , which contradicts the error estimate in Theorem 3.10.

For the “furthermore” part, let $p(t)$ be a polynomial of degree $2d$ such that $\left|p(t)-|t|\right|<\varepsilon$ holds for any $t\in[-1,1]$ . Then for $r(t)\coloneqq\frac{p(t)+p(-t)}{2}$ , we have:

\left|r(t)-|t|\right|\leq\frac{1}{2}\left|p(t)-|t|\right|+\frac{1}{2}\left|p(-t)-|-t|\right|\leq\varepsilon.

Since $r(t)$ is an even polynomial, $r(\sqrt{t})$ is also a polynomial of degree $d$ such that $\left|r(\sqrt{t})-\sqrt{t}\right|<\varepsilon$ for $t\in[0,1]$ . ∎

By a closer investigation of the proof of Lemma 3.7, we note that instead of finding a low degree uniform approximation of $\sqrt{t}$ , it is sufficient to find one which approximates $\sqrt{t}$ at integer points $0,1,\dots,m$ . It seems plausible that such a more flexible approximation may improve Lemma 3.7 and Theorem 3.8, but we can prove that the expected improvement is not possible. Our argument is based on the following result.

Theorem 3.12.

Let $p$ be a univariate polynomial of degree $d$ and let $b_{1}\leq b_{2}$ , $c\geq 0$ and $\delta\geq 0$ be fixed real numbers. Assume $a_{1}<\dots<a_{l}$ are points in $[0,m]$ such that for any $x\in[0,m]$ there exists some $a_{i}$ such that $|x-a_{i}|\leq\delta$ . Suppose moreover that the following two conditions hold:

•

$b_{1}\leq p(a_{i})\leq b_{2}$ for each $1\leq i\leq l$ ;
•

there exists some $t_{0}\in[0,m]$ such that $|p^{\prime}(t_{0})|\geq c$ .

Then $d\geq\sqrt{cm/(2\delta c+b_{2}-b_{1})}$ .

Proof.

The proof for $l=m+1$ and $a_{i}=i$ can be found in [NS94, Theorem 3.3] and one can easily adapt the proof there to the stated general case. ∎

Lemma 3.13.

Let $\frac{1}{2}\geq\delta>0,m\geq 5$ be fixed real numbers and let $p$ be a polynomial of degree $d$ . Assume $a_{1}<\cdots<a_{l}$ are points in $[0,m]$ such that for any $x\in[0,m]$ there exists some $a_{i}$ such that $|x-a_{i}|\leq\delta$ . If there exists some $0<\varepsilon<1/6$ such that $\left|p(a_{i})-\sqrt{a_{i}}\right|\leq\varepsilon$ for each $1\leq i\leq l$ , then

d\geq\sqrt{\frac{\left(\frac{1}{6}-\varepsilon\right)m}{\sqrt{m}+2\varepsilon+2\delta\left(\frac{1}{6}-\varepsilon\right)}}=\Omega(m^{\frac{1}{4}}).

Proof.

It is clear that $0\leq a_{1}\leq\delta$ . There exists $1\leq k\leq l$ such that $|2-a_{k}|\leq\delta$ , thus $2+\delta\geq a_{k}\geq 2-\delta$ . By assumption, it is also obvious that $-\varepsilon\leq p(a_{i})\leq\sqrt{m}+\varepsilon$ for each $1\leq i\leq l$ . The mean value theorem ensures the existence of $t\in[a_{1},a_{k}]$ such that

	$\displaystyle|p^{\prime}(t)|$	$\displaystyle=|p(a_{k})-p(a_{1})|/|a_{k}-a_{1}|$
		$\displaystyle=|(p(a_{k})-\sqrt{a_{k}})-(p(a_{1})-\sqrt{a_{1}})+(\sqrt{a_{k}}-\sqrt{a_{1}})|/|a_{k}-a_{1}|$
		$\displaystyle\geq(\sqrt{2-\delta}-\sqrt{\delta}-2\varepsilon)/(2+\delta)$
		$\displaystyle\geq\sqrt{2}(\sqrt{3}-1)/5-\varepsilon$
		$\displaystyle\geq 1/6-\varepsilon.$

Applying Theorem 3.12 to $p$ with $b_{1}=-\varepsilon,b_{2}=\sqrt{m}+\varepsilon$ and $c=1/6-\varepsilon$ , we obtain the desired lower bound for $d$ . ∎
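To get a feeling for the size of the lower bound in Lemma 3.13, the following Python sketch evaluates the right-hand side for given $m$ , $\varepsilon$ and $\delta$ . The function name is illustrative; the formula is the one displayed in the lemma, obtained from Theorem 3.12 with $b_{1}=-\varepsilon$ , $b_{2}=\sqrt{m}+\varepsilon$ and $c=1/6-\varepsilon$ .

```python
import math


def sqrt_degree_lower_bound(m: float, eps: float, delta: float = 0.5) -> float:
    """Right-hand side of the degree bound in Lemma 3.13."""
    if not (m >= 5 and 0 < eps < 1 / 6 and 0 < delta <= 0.5):
        raise ValueError("parameters outside the range assumed in Lemma 3.13")
    c = 1 / 6 - eps  # lower bound on |p'(t)| at some point t
    spread = math.sqrt(m) + 2 * eps  # b_2 - b_1
    return math.sqrt(c * m / (spread + 2 * delta * c))


if __name__ == "__main__":
    for m in (10**2, 10**4, 10**6):
        print(m, sqrt_degree_lower_bound(m, eps=0.1))
```

Multiplying $m$ by $16$ roughly doubles the returned value for large $m$ , which reflects the growth rate $m^{1/4}$ .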

Since Lemma 3.13 provides a lower bound for the degree of a polynomial that approximates the square root function at finitely many points, we are now able to prove our second impossibility theorem, as promised.

Theorem 3.14 (impossibility theorem for discrete approximation).

Let $0<\varepsilon<1/6$ be a real number and let $m\geq 5$ be a positive integer. Suppose $p_{m,\varepsilon}$ is a univariate polynomial such that for any integer-valued function $f$ upper bounded by $m$ on $G$ , the function $g\coloneqq p_{m,\varepsilon}(f-f_{\min})$ satisfies

\left\lVert\sqrt{f-f_{\min}}-g\right\rVert_{\ell^{\infty}}\leq\varepsilon.

Then the degree of $p_{m,\varepsilon}$ is $\Omega(m^{\frac{1}{4}})$ . In particular, Theorem 3.8 can not be improved by approximating $\sqrt{t}$ at integer points $0,\dots,m$ .

Proof.

For each integer $0\leq i\leq m$ , we consider $f_{i}:G\to\mathbb{N}$ defined by

f_{i}(y)=\begin{cases}i,\quad\text{if}\leavevmode\nobreak\ y=e\\ 0,\quad\text{otherwise}.\end{cases}

Here $e\in G$ denotes the identity element of $G$ . By assumption, we have

\max_{y\in G}\left|\sqrt{f_{i}(y)}-p_{m,\varepsilon}(f_{i}(y))\right|\leq\varepsilon,

from which we may conclude that

\left|p_{m,\varepsilon}(i)-\sqrt{i}\right|\leq\varepsilon.

By Lemma 3.13 (applied with $a_{i}=i$ for $0\leq i\leq m$ and $\delta=\frac{1}{2}$ ), the degree of $p_{m,\varepsilon}$ is $\Omega(m^{\frac{1}{4}})$ . ∎

To summarize, Theorems 3.11 and 3.14 imply that the conditions on $f_{\min}$ and $f_{\max}$ in Theorem 3.8 can not be removed by using other approximations of $\sqrt{t}$ . Thus Theorem 3.8 can be regarded as the optimal result obtainable by the function approximation method. An illustrative example for our impossibility theorems is as follows.

Example 3.15.

We define $f:\Gamma_{2}^{6}\to\mathbb{Z}$ by

f(y_{1},\dots,y_{6})=\frac{13}{4}+\frac{1}{4}y_{1}+\frac{1}{4}y_{2}+\frac{1}{2}y_{3}+\frac{1}{2}y_{4}+\frac{1}{2}y_{5}+\frac{1}{2}y_{6}+\frac{1}{4}y_{1}y_{2}.

It is clear that $f_{\min}=1$ and $f\leq 6$ . We apply the method in the proof of Lemma 3.7 to $f-1+\frac{1}{2}=f-\frac{1}{2}$ . Namely, we approximate $\sqrt{t}$ by some degree $d$ polynomial $p_{d}(t)$ at points $\{i+\frac{1}{2}\}_{i=0}^{5}$ . Then $\{p_{d}(f-\frac{1}{2})\}$ is a polynomial FSOS certificate for $f\geq 1$ , provided that $p_{d}(t)$ approximates $\sqrt{t}$ sufficiently well.

The best linear approximation of $\sqrt{t}$ at $\left\{i+\frac{1}{2}\right\}_{i=0}^{5}$ is $p_{1}(t)=0.653+0.328t$ . However, $p_{1}(f-\frac{1}{2})$ is not a polynomial FSOS certificate for $f\geq 1$ since

\left|\left(p_{1}\left(f(1,1,1,1,1,1)-\frac{1}{2}\right)\right)^{2}-\left(f(1,1,1,1,1,1)-\frac{1}{2}\right)\right|\geq 0.525.

Thus to obtain a polynomial FSOS certificate, we consider $p_{2}(t)=-0.0359t^{2}+0.532t+0.479$ , which is the best quadratic approximation of $\sqrt{t}$ at $\left\{i+\frac{1}{2}\right\}_{i=0}^{5}$ . One may verify directly that

	$\displaystyle g=p_{2}\left(f-\frac{1}{2}\right)$	$\displaystyle=1.63+0.079y_{1}+0.079y_{2}+0.167y_{3}+0.167y_{4}+0.167y_{5}+0.167y_{6}+0.079y_{1}y_{2}$
		$\displaystyle-0.009y_{1}y_{3}-0.009y_{1}y_{4}-0.009y_{2}y_{3}-0.009y_{1}y_{5}-0.009y_{2}y_{4}-0.009y_{1}y_{6}$
		$\displaystyle-0.009y_{2}y_{5}-0.018y_{3}y_{4}-0.009y_{2}y_{6}-0.018y_{3}y_{5}-0.018y_{3}y_{6}$
		$\displaystyle-0.018y_{4}y_{5}-0.018y_{4}y_{6}-0.018y_{5}y_{6}-0.009y_{1}y_{2}y_{3}$
		$\displaystyle-0.009y_{1}y_{2}y_{4}-0.009y_{1}y_{2}y_{5}-0.009y_{1}y_{2}y_{6}.$

It is clear that $\left\lVert f-1-g^{2}\right\rVert_{\ell^{\infty}}\leq\left\lVert\widehat{f}-1-\widehat{g^{2}}\right\rVert_{\ell^{1}}<0.44$ , thus $\{g\}$ is a desired polynomial FSOS certificate. We also see immediately that $|\operatorname{supp}(g)|=26$ . By comparison, Algorithm 1 finds the following sparse polynomials

	$\displaystyle g_{1}$	$\displaystyle=0.25y_{3}+0.25y_{4}+0.25y_{5}+0.25y_{6}+1,$
	$\displaystyle g_{2}$	$\displaystyle=-0.1443y_{3}-0.1443y_{4}-0.1443y_{5}+0.433y_{6},$
	$\displaystyle g_{3}$	$\displaystyle=-0.2041y_{3}-0.2041y_{4}+0.4083y_{5},$
	$\displaystyle g_{4}$	$\displaystyle=-0.3536y_{3}+0.3536y_{4},$
	$\displaystyle g_{5}$	$\displaystyle=0.5,$

which satisfy

\left\lVert\widehat{f}-1-\sum_{j=1}^{5}\widehat{|g_{j}|^{2}}\right\rVert_{\ell^{1}}<0.76.

Therefore, $\{g_{j}\}_{j=1}^{5}$ is an FSOS certificate for $f\geq 1$ of sparsity $5$ .
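The elementary claims of Example 3.15 can be checked by brute force over the $64$ points of $\Gamma_{2}^{6}$ . The Python sketch below is only a sanity check, not the authors' code: it confirms that $f$ takes exactly the values $1,\dots,6$ , and it measures how well $p_{1}(f-\frac{1}{2})^{2}$ and $p_{2}(f-\frac{1}{2})^{2}$ reproduce $f-\frac{1}{2}$ , using the rounded coefficients of $p_{1}$ and $p_{2}$ quoted in the example. The helper `max_square_error` is a name introduced here for illustration.

```python
import itertools


def f(y):
    y1, y2, y3, y4, y5, y6 = y
    return 13 / 4 + y1 / 4 + y2 / 4 + (y3 + y4 + y5 + y6) / 2 + y1 * y2 / 4


def p1(t):
    # best linear approximation quoted in Example 3.15 (rounded coefficients)
    return 0.653 + 0.328 * t


def p2(t):
    # best quadratic approximation quoted in Example 3.15 (rounded coefficients)
    return -0.0359 * t * t + 0.532 * t + 0.479


cube = list(itertools.product((-1, 1), repeat=6))
values = sorted({f(y) for y in cube})


def max_square_error(p):
    """max over the hypercube of |p(f - 1/2)^2 - (f - 1/2)|."""
    return max(abs(p(f(y) - 0.5) ** 2 - (f(y) - 0.5)) for y in cube)


err1 = max_square_error(p1)
err2 = max_square_error(p2)

if __name__ == "__main__":
    print("values of f:", values)
    print("linear error:", err1)
    print("quadratic error:", err2)
```

The linear error exceeds $\frac{1}{2}$ (it is attained at the all-ones point), whereas the quadratic error stays well below $\frac{1}{2}$ , in line with the discussion above.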

3.2. Rational FSOS certificates

Now we turn our attention to rational FSOS certificates. Although by Proposition 2.6, each function on $G$ admits a polynomial FSOS certificate, it is still imperative to consider rational FSOS certificates, due to the following reason:

•

According to Theorem 3.18 below, every integer-valued function on $G$ admits a rational FSOS certificate, which is of degree $O(\deg(f)\log^{2}(f_{\max}-f_{\min}))$ . In contrast, the existence of low degree polynomial FSOS certificates can only be guaranteed for some particular $f$ (cf. Theorem 3.8 and Figure 1).

The remainder of this section focuses on proving the existence of sparse rational FSOS certificates. Let $R_{d}$ be the set of univariate rational functions with numerator and denominator being real polynomials of degree at most $d$ . We define below the approximation error of rational functions to $\sqrt{t}$ on the interval $[0,1]$ :

E_{d}\coloneqq\inf_{r\in{R}_{d}}\left\{\max_{t\in[0,1]}|\sqrt{t}-r(t)|\right\}.

Theorem 3.16.

[Vya75, Sta03] There exists some constant $C>0$ such that for each $d\in\mathbb{N}$ , it holds that

(12)

\displaystyle\frac{1}{3}e^{-2\pi\sqrt{\frac{d}{2}}}\leq E_{d}\leq Ce^{-2\pi\sqrt{\frac{d}{2}}}.

Lemma 3.17.

There exists a constant $c>0$ such that for any function $f$ on $G$ , real number $L\leq f_{\min}$ and $\varepsilon\in(0,1)$ , one can find functions $g,h$ of degree at most

\deg(f)\left\lceil\frac{(\log(f_{\max}-L)+\log c-\log\varepsilon)^{2}}{2\pi^{2}}\right\rceil

such that

\left\lVert(f-L)-\frac{|g|^{2}}{|h|^{2}}\right\rVert_{\ell^{\infty}}\leq\varepsilon.

Proof.

We define $f_{0}=(f-L)/(f_{\max}-L)$ . It is clear that $f_{0}(x)\in[0,1]$ for any $x\in G$ . According to Theorem 3.16, we can find a constant $c>0$ and univariate polynomials $p,q$ of degree at most $d\coloneqq\left\lceil\frac{(\log(f_{\max}-L)+\log c-\log\varepsilon)^{2}}{2\pi^{2}}\right\rceil$ such that

\max_{t\in[0,1]}\lvert p(t)/q(t)-\sqrt{t}\rvert\leq\frac{c}{3}e^{-\pi\sqrt{2d}}\leq\frac{\varepsilon}{3(f_{\max}-L)}.

In particular, we have $\max_{t\in[0,1]}\lvert p(t)/q(t)+\sqrt{t}\rvert\leq 3$ . Let

g=\sqrt{f_{\max}-L}(p\circ f_{0}),\quad h=q\circ f_{0}.

This implies

	$\displaystyle\max_{x\in G}\left\lvert\frac{|g(x)|^{2}}{|h(x)|^{2}}-(f(x)-L)\right\rvert$
	$\displaystyle=(f_{\max}-L)\max_{x\in G}\left\lvert(p(f_{0}(x))/q(f_{0}(x)))^{2}-f_{0}(x)\right\rvert$
	$\displaystyle\leq 3(f_{\max}-L)\max_{t\in[0,1]}\left\lvert p(t)/q(t)-\sqrt{t}\right\rvert$
	$\displaystyle\leq\varepsilon.$

∎

As a direct application of Lemma 3.17, we derive the following existence theorem for low degree rational FSOS certificates of integer-valued functions on abelian groups.

Theorem 3.18 (low degree rational FSOS certificate).

There is a constant $c>0$ such that every non-negative integer-valued function $f$ on $G$ admits a rank-one rational FSOS certificate $\{(g,h)\}$ for $f\geq L$ , where $L\leq f_{\min}$ is an integer and

\deg g=\deg h\leq\deg(f)\left\lceil\frac{(\log(f_{\max}-L)+c)^{2}}{2\pi^{2}}\right\rceil.

In particular, if we denote by $M_{d}$ the cardinality of the set $\{\chi_{\alpha}:\deg(\chi_{\alpha})\leq d\}\subseteq\widehat{G}$ , then

\deg g=\deg h\leq\deg(f)\left\lceil\frac{(\log\widehat{f}_{\max}+\log M_{\deg(f)}+c+1)^{2}}{2\pi^{2}}\right\rceil.

Proof.

We take $\varepsilon=1/4$ in Lemma 3.17 and the first inequality follows immediately. We observe that $f_{\max}-L\leq 2\widehat{f}_{\max}M_{\deg(f)}$ . This implies the estimate for the degree of $g$ and $h$ in terms of $M_{\deg(f)}$ and $\widehat{f}_{\max}$ . ∎

If we apply Theorem 3.18 to $\Gamma_{N}$ and $f:\Gamma_{N}\to\mathbb{Z}$ with $\widehat{f}_{\max}\leq M$ for some constant $M>0$ , then $f$ admits a rational FSOS certificate of degree at most $O(\deg(f)\log^{2}(\deg(f)))$ . We also apply Theorem 3.18 to $\Gamma_{2}^{n}$ and $f:\Gamma_{2}^{n}\to\mathbb{N}$ with $\lVert\widehat{f}\rVert_{\ell^{\infty}}\leq M$ . In this case, $M_{d}=\sum_{j=0}^{d}\binom{n}{j}=O(n^{d})$ . Thus the upper bound for the degree of a rational FSOS certificate is $O(\deg(f)^{3}\log^{2}n)$ . In particular, if $\deg(f)=2$ then the bound is simply $O(\log^{2}n)$ .

Remark 3.19.

By Theorem 2.10, every quadratic function on $\Gamma_{2}^{n}$ has a rational SOS of degree at most $\left\lfloor\frac{n}{2}\right\rfloor+1=O(n)$ . It is even proved by an example that the upper bound $\left\lfloor\frac{n}{2}\right\rfloor+1$ is tight, while our exponentially smaller upper bound $O(\log^{2}n)$ does not lead to any contradiction. Indeed, the bound in Theorem 2.10 is for a pair of finite families of real polynomials $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ such that $(\sum_{i\in I}h_{i}^{2})f=\sum_{j\in J}g_{j}^{2}$ on $\Gamma_{2}^{n}$ . However, our degree bound is for rational FSOS certificate defined in Definition 3.3, which allows errors. A concrete example is given in Remark 3.4.

We notice that it is possible to turn the proof of Theorem 3.18 into an explicit construction of a rank-one rational FSOS certificate, if a rational approximation of $\sqrt{t}$ can be explicitly given. Fortunately, Newman provides such an explicit rational approximation, which we record in Remark 3.20 for the convenience of the reader.

Remark 3.20.

In [New64], a rational approximation of $|t|$ is given explicitly. For any integer $d>0$ , set $\xi=\exp(-d^{-\frac{1}{2}}),\leavevmode\nobreak\ p_{d}(t)=\prod_{k=0}^{d-1}\left(t+\xi^{k}\right)$ and

r_{d}(t)=\frac{t\left(p_{d}(t)-p_{d}(-t)\right)}{p_{d}(t)+p_{d}(-t)}.

Then

\max_{t\in[-1,1]}\left|r_{d}(t)-|t|\right|\leq 3\exp(-\sqrt{d}).

Since both $t\left(p_{d}(t)-p_{d}(-t)\right)$ and $p_{d}(t)+p_{d}(-t)$ are even functions, $r_{d}(\sqrt{t})$ is also a rational function. Thus

\max_{t\in[0,1]}\left|r_{d}(\sqrt{t})-\sqrt{t}\right|\leq 3\exp(-\sqrt{d}),

and $r_{d}(\sqrt{t})$ gives an exponential approximation of $\sqrt{t}$ .
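Newman's construction is explicit enough to be evaluated directly. The Python sketch below computes $r_{d}(t)$ from the product formula of Remark 3.20 and compares the error of $r_{d}(\sqrt{t})$ on a uniform grid of $[0,1]$ with the bound $3\exp(-\sqrt{d})$ . The grid size and the function names are choices made here for illustration, and a grid maximum only underestimates the true uniform error.

```python
import math


def newman_abs(d: int, t: float) -> float:
    """Newman's rational approximation r_d(t) of |t| on [-1, 1]."""
    xi = math.exp(-1.0 / math.sqrt(d))
    p_plus = 1.0   # p_d(t)  = prod_k (t + xi^k)
    p_minus = 1.0  # p_d(-t) = prod_k (-t + xi^k)
    for k in range(d):
        p_plus *= t + xi**k
        p_minus *= -t + xi**k
    return t * (p_plus - p_minus) / (p_plus + p_minus)


def newman_sqrt(d: int, t: float) -> float:
    """r_d(sqrt(t)), a rational function of t approximating sqrt(t) on [0, 1]."""
    return newman_abs(d, math.sqrt(t))


def max_sqrt_error(d: int, samples: int = 2000) -> float:
    grid = (i / samples for i in range(samples + 1))
    return max(abs(newman_sqrt(d, t) - math.sqrt(t)) for t in grid)


if __name__ == "__main__":
    for d in (9, 16, 25):
        print(d, max_sqrt_error(d), 3 * math.exp(-math.sqrt(d)))
```

For $d=2$ and after rescaling the interval $[0,1]$ to $[0,\frac{5}{2}]$ , the same formula gives $\sqrt{10}\,(1+\xi)t/(2t+5\xi)$ with $\xi=\mathrm{e}^{-\sqrt{2}/2}$ .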

We illustrate the construction by the example that follows.

Example 3.21 (continuation of the running example).

The image of $f-\frac{1}{2}$ is $\{\frac{1}{2},\frac{3}{2},\frac{5}{2}\}$ . According to Newman’s construction (cf. Remark 3.20), we can take $r(t)\coloneqq\frac{\sqrt{10}\left(\mathrm{e}^{-\frac{\sqrt{2}}{2}}+1\right)t}{2t+5\mathrm{e}^{-\frac{\sqrt{2}}{2}}}$ so that

\left|r\left(\frac{1}{2}\right)-\sqrt{\frac{1}{2}}\right|<0.026,\quad\left|r\left(\frac{3}{2}\right)-\sqrt{\frac{3}{2}}\right|<0.072,\quad\left|r\left(\frac{5}{2}\right)-\sqrt{\frac{5}{2}}\right|=0.

Then it is straightforward to check that $r\left(f(y)-\frac{1}{2}\right)$ equals

(13)

\frac{5.31+1.77y_{1}+1.77y_{2}+1.77y_{3}+0.59y_{1}y_{2}+0.59y_{1}y_{3}+0.59y_{2}y_{3}-0.59y_{1}y_{2}y_{3}}{4.72+0.75y_{1}+0.75y_{2}+0.75y_{3}+0.25y_{1}y_{2}+0.25y_{1}y_{3}+0.25y_{2}y_{3}-0.25y_{1}y_{2}y_{3}}

hence $\lVert f-\frac{1}{2}-\left(r\circ\left(f-\frac{1}{2}\right)\right)^{2}\rVert_{\ell^{\infty}}<0.18$ . If we respectively take $g$ and $h$ to be the numerator and the denominator of $r\circ\left(f-\frac{1}{2}\right)$ in (13), then $(\{g\},\{h\})$ is a rank-one rational FSOS certificate for $f\geq 1$ .

The estimate in (3) leads to the proposition that follows.

Proposition 3.22.

Let $f$ be a non-negative function on $G$ . If $p$ is a univariate polynomial such that

\left\lVert p(f)-\sqrt{f}\right\rVert_{\ell^{\infty}}\leq\varepsilon

for some $\varepsilon>0$ , then

\max_{\chi_{\alpha}\in\widehat{G}}\left\lvert(p(f)^{2})_{\alpha}-f_{\alpha}\right\rvert\leq 2\lVert f\rVert_{\ell^{\infty}}^{\frac{1}{2}}\varepsilon+\varepsilon^{2}.

In other words, Fourier coefficients of $p(f)^{2}-f$ are uniformly bounded by $O(\lVert f\rVert_{\ell^{\infty}}^{\frac{1}{2}}\varepsilon)$ . Moreover, if $q$ is a univariate polynomial such that $\left\lVert\widehat{q(f)^{2}}-\widehat{f}\right\rVert_{\ell^{1}}\leq\varepsilon$ , then

\left\lVert q(f)^{2}-f\right\rVert_{\ell^{\infty}}\leq\varepsilon.

Proof.

By assumption, for each $y\in G$ we have

\left\lvert p(f(y))+\sqrt{f(y)}\right\rvert\leq\left\lvert 2\left\lvert\sqrt{f(y)}\right\rvert+p(f(y))-\sqrt{f(y)}\right\rvert\leq 2\lVert f\rVert_{\ell^{\infty}}^{\frac{1}{2}}+\varepsilon,

from which we obtain

\left\lvert p(f(y))^{2}-f(y)\right\rvert=\left\lvert p(f(y))-\sqrt{f(y)}\right\rvert\left\lvert p(f(y))+\sqrt{f(y)}\right\rvert\leq 2\lVert f\rVert_{\ell^{\infty}}^{\frac{1}{2}}\varepsilon+\varepsilon^{2}

and the desired estimate for Fourier coefficients follows from (3). The moreover part follows directly by (4). ∎

In the sequel, we usually apply Proposition 3.22 to $f-L$ for some $L\leq f_{\min}$ . We also remark that the finiteness of $G$ is crucial to the estimate of Fourier coefficients in Proposition 3.22. In general, Fourier coefficients of an $L^{\infty}$ function can be arbitrarily large. More importantly, Proposition 3.22 validates Algorithm 1, in which we relax the $\ell^{\infty}$ -norm in (10) by the $\ell^{1}$ -norm.

4. Validation of FSOS certificates

We address Problem 1.2 in this section; its statement is given in Section 1. Clearly, to solve Problem 1.2, it is sufficient to check the inequality

(14)

\left\lVert f-L+\varepsilon-\frac{\sum_{j\in J}|g_{j}|^{2}}{\sum_{i\in I}|h_{i}|^{2}}\right\rVert_{\ell^{\infty}}<1-\varepsilon,

for some $\varepsilon\in[0,1)$ . Along this direction, we propose two methods to validate $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ . One is given by checking the inequalities:

(15)

\sum_{i\in I}|h_{i}|^{2}\geq 1,\quad\left\lVert\sum_{i\in I}\widehat{|h_{i}|^{2}\left(f-L+\varepsilon\right)}-\sum_{j\in J}\widehat{|g_{j}|^{2}}\right\rVert_{\ell^{1}}<1-\varepsilon,

which is justified by Proposition 4.2. In the sequel, we call this method the $\ell^{1}$ -norm validation.

The other method is only for functions on $\Gamma_{2}^{n}$ . It is called the validation by sampling. It checks $\sum_{i\in I}h_{i}^{2}\geq 1$ and

(16)

\left\lvert\sum_{j\in J}|g_{j}(y)|^{2}-\left(\sum_{i\in I}|h_{i}(y)|^{2}\right)(f(y)-L)\right\rvert\leq n^{-2(\deg(f)+d)}

for $y\in\Gamma_{2}^{n}$ such that $\sum_{i=1}^{n}y_{i}\leq 2\deg(f)-n+2d$ . Here $d=2\max\{\deg(g_{j}),\deg(h_{i})\}_{i\in I,j\in J}$ . Theorem 4.6 ensures the fidelity of this method.

4.1. Validating FSOS certificates by $\ell^{1}$ -norm

We recall that the $\ell^{1}$ -norm validation consists of checking the two inequalities in (15). To this end, we need the following lemma, which is in the same spirit as Proposition 3.22.

Lemma 4.1.

Let $f$ be a function on $G$ and let $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ be a pair of finite families of functions on $G$ such that $\sum_{i\in I}|h_{i}|^{2}\geq 1$ and

\left\lVert\sum_{i\in I}\widehat{|h_{i}|^{2}f}-\sum_{j\in J}\widehat{|g_{j}|^{2}}\right\rVert_{\ell^{1}}<\varepsilon

for some $\varepsilon>0$ . Then we have

\left\lVert f-\frac{\sum_{j}|g_{j}|^{2}}{\sum_{i}|h_{i}|^{2}}\right\rVert_{\ell^{\infty}}<\varepsilon.

Proof.

We denote $r=\left(\sum_{i\in I}|h_{i}|^{2}\right)f-\sum_{j\in J}|g_{j}|^{2}$ . Since $\lVert\widehat{r}\rVert_{\ell^{1}}<\varepsilon$ , we conclude that $\lVert r\rVert_{\ell^{\infty}}<\varepsilon$ . The condition that $\sum_{i}|h_{i}|^{2}\geq 1$ implies

\left|f(y)-\frac{\sum_{j}|g_{j}(y)|^{2}}{\sum_{i}|h_{i}(y)|^{2}}\right|=\frac{\left|r(y)\right|}{\sum_{i}|h_{i}(y)|^{2}}<\varepsilon,\quad y\in G.

∎

Proposition 4.2 (validation by $\ell^{1}$ -norm).

Let $f$ be an integer-valued function on $G$ and let $L\leq f_{\min}$ be an integer. If $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ is a pair of finite families of functions on $G$ such that (15) holds, then $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ is a rational FSOS certificate for $f\geq L$ .

Proof.

If $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ satisfy (15), then by Lemma 4.1, we have

\left\lVert\left(f-L+\varepsilon\right)-\frac{\sum_{j}|g_{j}|^{2}}{\sum_{i}|h_{i}|^{2}}\right\rVert_{\ell^{\infty}}<1-\varepsilon.

This implies that $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ is a rational FSOS certificate for $f\geq L$ . ∎
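On $\Gamma_{2}^{n}$ with small $n$ , the two inequalities in (15) can be checked by brute force. The Python sketch below does this under an assumption about normalization that is not restated in this section: the Fourier coefficient of $r$ at the character $\chi_{S}(y)=\prod_{i\in S}y_{i}$ is taken to be $2^{-n}\sum_{y}r(y)\chi_{S}(y)$ , so that $\lVert r\rVert_{\ell^{\infty}}\leq\lVert\widehat{r}\rVert_{\ell^{1}}$ . Functions are passed as Python callables on tuples in $\{-1,1\}^{n}$ , and all names are illustrative. The cost is exponential in $n$ , so this is only meant for toy instances.

```python
import itertools
import math


def fourier_l1_norm(values: dict, n: int) -> float:
    """l1-norm of the Fourier coefficients of a function on {-1,1}^n.

    Assumed normalization: coefficient at S is 2^{-n} * sum_y r(y) * prod_{i in S} y_i.
    """
    total = 0.0
    for mask in itertools.product((0, 1), repeat=n):
        coeff = sum(
            r * math.prod(yi for yi, bit in zip(y, mask) if bit)
            for y, r in values.items()
        ) / 2**n
        total += abs(coeff)
    return total


def l1_validate(f, gs, hs, L, eps, n) -> bool:
    """Check both inequalities in (15) for the pair (gs, hs)."""
    cube = list(itertools.product((-1, 1), repeat=n))
    denom = {y: sum(abs(h(y)) ** 2 for h in hs) for y in cube}
    if min(denom.values()) < 1:
        return False  # first inequality in (15) fails
    residual = {
        y: denom[y] * (f(y) - L + eps) - sum(abs(g(y)) ** 2 for g in gs)
        for y in cube
    }
    return fourier_l1_norm(residual, n) < 1 - eps


if __name__ == "__main__":
    f = lambda y: 2 + 2 * y[0]          # equals (1 + y_1)^2 on {-1,1}^2
    g = lambda y: 1 + y[0]
    one = lambda y: 1.0
    print(l1_validate(f, [g], [one], L=0, eps=0.25, n=2))
```

In the toy instance the residual is the constant $\varepsilon=0.25$ , whose only nonzero Fourier coefficient is $0.25<1-\varepsilon$ .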

4.2. Validating FSOS certificates by sampling on $\Gamma_{2}^{n}$

Next we discuss the method of validation by sampling (16). The method relies on the extrapolation theory of functions on $[0,1]^{n}$ .

Lemma 4.3 (extrapolation).

Let $R$ be a polynomial of degree at most $d$ in $n$ variables $x_{1},\dots,x_{n}$ and let $\varepsilon$ be a positive number. We define

S\coloneqq\left\{\xi=(\xi_{1},\dots,\xi_{n})\in\mathbb{Z}_{2}^{n}:\sum_{j=1}^{n}\xi_{j}\leq d\right\}.

If monomials in $R$ are square-free, $d\leq n$ and $\lvert R(\xi)\rvert<\varepsilon$ for each $\xi\in S$ , then

(i)

for any $x\in[0,1]^{n}$ with $|\lceil x\rceil|\geq d+1$ , we have

$|R(x)|\leq\varepsilon\sum_{j=0}^{d}2^{j}\binom{|\lceil x\rceil|}{j}$

where $\lceil(x_{1},\dots,x_{n})\rceil\coloneqq(\lceil x_{1}\rceil,\dots,\lceil x_{n}\rceil)$ .
(ii)

if moreover $x\in\mathbb{Z}_{2}^{n}$ , we have

(17) $|R(x)|\leq\varepsilon\binom{|x|}{d}\sum_{p=0}^{d}\frac{|x|-d}{|x|-p}\binom{d}{p}.$

In particular, we have

(18) $|R(x)|\leq\varepsilon\binom{|x|}{d}\left(2^{d}-\frac{d}{|x|}2^{d-1}\right).$

Proof.

Since elements in $S$ can be used to denote both multi-indices and points in $[0,1]^{n}$ , to avoid confusion we define $\Lambda\coloneqq S$ and we denote multi-indices (resp. points) in $\Lambda$ (resp. $S$ ) by $\alpha,\beta,\dots$ (resp. $\xi,\zeta,\dots$ ). We sort $S$ (resp. $\Lambda$ ) by the graded reverse lexicographic order so that

S=\{\xi_{1},\dots,\xi_{N}\},\quad\Lambda=\{\alpha_{1},\dots,\alpha_{N}\}.

Since all monomials in $R$ are square-free and $\deg(R)\leq d$ , we may write

R(x)=\sum_{\alpha\in\Lambda}c_{\alpha}x^{\alpha}

for some $c_{\alpha}\in\mathbb{R}$ . Here $x^{\alpha}$ denotes the monomial $x_{1}^{\alpha_{1}}\cdots x_{n}^{\alpha_{n}}$ . Evaluating $R(x)$ at points in $S$ , we obtain

(19)

R(\xi)=\sum_{\alpha\in\Lambda}c_{\alpha}\xi^{\alpha},\quad\xi\in S.

Therefore (19) can be written as

(20)

\begin{bmatrix}c_{\alpha_{1}}&\cdots&c_{\alpha_{N}}\end{bmatrix}\begin{bmatrix}\xi_{1}^{\alpha_{1}}&\cdots&\xi_{N}^{\alpha_{1}}\\ \vdots&\ddots&\vdots\\ \xi_{1}^{\alpha_{N}}&\cdots&\xi_{N}^{\alpha_{N}}\end{bmatrix}=\begin{bmatrix}R(\xi_{1}),\cdots,R(\xi_{N})\end{bmatrix}.

By definition, it is straightforward to verify that

\xi_{j}^{\alpha_{i}}=\xi_{j1}^{\alpha_{i1}}\cdots\xi_{jn}^{\alpha_{in}}=\begin{cases}1,\ \text{if}\ \alpha_{i}\leq\xi_{j},\\ 0,\ \text{otherwise}.\end{cases}

Here $(a_{1},\dots,a_{n})\leq(b_{1},\dots,b_{n})$ if and only if $a_{i}\leq b_{i},i=1,\dots,n$ .

Next we investigate the structure of the matrix $A\coloneqq(\xi_{j}^{\alpha_{i}})_{i,j=1}^{N}$ . We partition columns and rows of $A$ by $|\xi_{j}|$ ’s and $|\alpha_{i}|$ ’s respectively. Namely, we have $A=(A_{q}^{p})_{p,q=0}^{d}$ where $A_{q}^{p}$ is a $\binom{n}{p}\times\binom{n}{q}$ matrix whose elements are $\xi_{j}^{\alpha_{i}}$ ’s with $|\alpha_{i}|=p$ and $|\xi_{j}|=q$ . The matrix $A^{p}_{q}$ has the following properties:

•

$A^{p}_{q}=0$ if $p>q$ ;
•

$A^{p}_{p}=I_{\binom{n}{p}}$ ;
•

$A^{0}_{q}=\text{1}_{\binom{n}{q}}$ , where $\text{1}_{k}$ denotes the $k$ -dimensional row vector whose elements are all equal to one;
•

if $|\xi|=q>p$ , then the column of $A^{p}_{q}$ determined by $\xi$ contains exactly $\binom{q}{p}$ nonzero elements which are all equal to one.

Thus the matrix $A$ can be written as

(21)

\begin{array}{c|cccccc}&|\xi|=0&|\xi|=1&|\xi|=2&\cdots&|\xi|=d-1&|\xi|=d\\ \hline|\alpha|=0&1&\text{1}_{n}&\text{1}_{\binom{n}{2}}&\cdots&\text{1}_{\binom{n}{d-1}}&\text{1}_{\binom{n}{d}}\\ |\alpha|=1&0&I_{n}&A^{1}_{2}&\cdots&A^{1}_{d-1}&A^{1}_{d}\\ |\alpha|=2&0&0&I_{\binom{n}{2}}&\cdots&A^{2}_{d-1}&A^{2}_{d}\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots&\vdots\\ |\alpha|=d-1&0&0&0&\cdots&I_{\binom{n}{d-1}}&A^{d-1}_{d}\\ |\alpha|=d&0&0&0&\cdots&0&I_{\binom{n}{d}}\end{array}.

In particular, $A$ is invertible and hence one can determine

(22)

\begin{bmatrix}c_{\alpha_{1}}&\cdots&c_{\alpha_{N}}\end{bmatrix}=\begin{bmatrix}R(\xi_{1}),\cdots,R(\xi_{N})\end{bmatrix}A^{-1}.

We claim that $A^{-1}=((-1)^{q-p}A^{p}_{q})_{p,q=0}^{d}$ , i.e.,

(23)

A^{-1}=\begin{bmatrix}1&-\text{1}_{n}&\text{1}_{\binom{n}{2}}&\cdots&(-1)^{d-1}\text{1}_{\binom{n}{d-1}}&(-1)^{d}\text{1}_{\binom{n}{d}}\\ 0&I_{n}&-A^{1}_{2}&\cdots&(-1)^{d-2}A^{1}_{d-1}&(-1)^{d-1}A^{1}_{d}\\ 0&0&I_{\binom{n}{2}}&\cdots&(-1)^{d-3}A^{2}_{d-1}&(-1)^{d-2}A^{2}_{d}\\ \vdots&\vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&0&\cdots&I_{\binom{n}{d-1}}&(-1)A^{d-1}_{d}\\ 0&0&0&\cdots&0&I_{\binom{n}{d}}\end{bmatrix}.

Indeed, one can verify (23) directly by properties of $A^{p}_{q}$ summarized before (21) and the identity

\sum_{j=0}^{k}(-1)^{j}\binom{k}{j}=0.

In particular, elements in $A^{-1}$ are all contained in $\{-1,0,1\}$ .
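The block structure in (23) says that the entry of $A^{-1}$ indexed by $(\alpha,\xi)$ is $(-1)^{|\xi|-|\alpha|}$ when $\alpha\leq\xi$ and $0$ otherwise. For small $n$ and $d$ this can be confirmed by direct multiplication. The Python sketch below is such a check; it orders points by weight only (any order within a fixed weight works for this purpose), and the helper names are introduced here for illustration.

```python
import itertools


def low_weight_points(n: int, d: int):
    """Points of {0,1}^n with at most d ones, sorted by weight."""
    points = [xi for xi in itertools.product((0, 1), repeat=n) if sum(xi) <= d]
    return sorted(points, key=sum)


def leq(a, b) -> bool:
    """Componentwise order on {0,1}^n."""
    return all(ai <= bi for ai, bi in zip(a, b))


def evaluation_matrix(points):
    # Entry (alpha, xi) is xi^alpha, which is 1 if alpha <= xi and 0 otherwise.
    return [[1 if leq(alpha, xi) else 0 for xi in points] for alpha in points]


def signed_matrix(points):
    # Candidate inverse: (-1)^(|xi| - |alpha|) if alpha <= xi, and 0 otherwise.
    return [
        [(-1) ** (sum(xi) - sum(alpha)) if leq(alpha, xi) else 0 for xi in points]
        for alpha in points
    ]


def matmul(X, Y):
    size = len(X)
    return [
        [sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def is_identity(M) -> bool:
    size = len(M)
    return all(M[i][j] == (1 if i == j else 0) for i in range(size) for j in range(size))


if __name__ == "__main__":
    pts = low_weight_points(4, 2)
    print(len(pts), is_identity(matmul(evaluation_matrix(pts), signed_matrix(pts))))
```

The check succeeds because, for $\alpha\leq\zeta$ , the $(\alpha,\zeta)$ entry of the product is $\sum_{j=0}^{k}(-1)^{k-j}\binom{k}{j}$ with $k=|\zeta|-|\alpha|$ , which vanishes unless $k=0$ .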

Now we are ready to prove (i). According to (23), we have for each $i=1,\dots,N$ that

(24)

|c_{\alpha_{i}}|\leq\#\{\text{nonzero elements in the $\alpha_{i}$-th column of $A^{-1}$}\}\varepsilon\leq 2^{|\alpha_{i}|}\varepsilon.

This implies that for each $x\in[0,1]^{n}$ with $|\lceil x\rceil|\geq d+1$ ,

	$\displaystyle|R(x)|$	$\displaystyle\leq\sum_{\alpha\in\Lambda}|c_{\alpha}|x^{\alpha}$
		$\displaystyle\leq\sum_{\begin{subarray}{c}\alpha\in\Lambda\\ \alpha\leq\lceil x\rceil\end{subarray}}|c_{\alpha}|$
		$\displaystyle\leq\varepsilon\sum_{\begin{subarray}{c}\alpha\in\Lambda\\ \alpha\leq\lceil x\rceil\end{subarray}}2^{|\alpha|}$
		$\displaystyle=\varepsilon\sum_{j=0}^{d}2^{j}\#\{\alpha\in\Lambda:|\alpha|=j,\alpha\leq\lceil x\rceil\}$
		$\displaystyle=\varepsilon\sum_{j=0}^{d}2^{j}\binom{|\lceil x\rceil|}{j}.$

To prove (ii), we notice that for each $x\in\mathbb{Z}_{2}^{n}$ ,

R(x)=\begin{bmatrix}R(\xi_{1}),\cdots,R(\xi_{N})\end{bmatrix}A^{-1}\begin{bmatrix}x^{\alpha_{1}}\\ \vdots\\ x^{\alpha_{N}}\end{bmatrix}.

For each $\beta\in\Lambda$ , we have

x^{\beta}=\begin{cases}1,\leavevmode\nobreak\ \text{if}\leavevmode\nobreak\ \beta\leq x,\\ 0,\leavevmode\nobreak\ \text{otherwise}.\end{cases}

This implies that the $\alpha$ -th element of $A^{-1}\begin{bmatrix}x^{\alpha_{1}}&\cdots&x^{\alpha_{N}}\end{bmatrix}^{\scriptscriptstyle\mathsf{T}}$ is zero if $\alpha\not\leq x$ . Moreover, if $\alpha\leq x$ , then the $\alpha$ -th element of $A^{-1}\begin{bmatrix}x^{\alpha_{1}}&\cdots&x^{\alpha_{N}}\end{bmatrix}^{\scriptscriptstyle\mathsf{T}}$ is given by

\begin{aligned}
\sum_{\begin{subarray}{c}\alpha\leq\beta\leq x\\ \beta\in\Lambda\end{subarray}}(-1)^{|\beta|-|\alpha|}&=\sum_{q=|\alpha|}^{d}\sum_{|\beta|=q}(-1)^{q-|\alpha|}\\
&=\sum_{q=|\alpha|}^{d}(-1)^{q-|\alpha|}\binom{|x|-|\alpha|}{q-|\alpha|}\\
&=\sum_{j=0}^{d-|\alpha|}(-1)^{j}\binom{|x|-|\alpha|}{j}\\
&=(-1)^{d-|\alpha|}\binom{|x|-|\alpha|-1}{d-|\alpha|}.
\end{aligned}

Here the last equality follows from the identity $\sum_{j=0}^{t}(-1)^{j}\binom{m}{j}=(-1)^{t}\binom{m-1}{t}$ . Thus we have

(25)

\begin{aligned}
|R(x)|&\leq\varepsilon\sum_{\begin{subarray}{c}\alpha\in\Lambda\\ \alpha\leq x\end{subarray}}\binom{|x|-|\alpha|-1}{d-|\alpha|}\\
&=\varepsilon\sum_{p=0}^{d}\binom{|x|}{p}\binom{|x|-p-1}{d-p}\\
&=\varepsilon\binom{|x|}{d}\sum_{p=0}^{d}\frac{|x|-d}{|x|-p}\binom{d}{p}.
\end{aligned}

We consider the function $f(t)=1/(|x|-t)$ on $[0,d]$ . It is straightforward to verify that $f$ is a convex function. Therefore, we have

f(t)\leq g(t)\coloneqq\frac{1}{|x|}+\frac{t}{|x|(|x|-d)},\quad t\in[0,d],

which follows from the convexity of $f$ and $g(0)=f(0),g(d)=f(d)$ . This implies that

(26)

\begin{aligned}
\sum_{p=0}^{d}\frac{|x|-d}{|x|-p}\binom{d}{p}&\leq(|x|-d)\sum_{p=0}^{d}\left(\frac{1}{|x|}+\frac{p}{|x|(|x|-d)}\right)\binom{d}{p}\\
&=\frac{|x|-d}{|x|}2^{d}+\frac{d}{|x|}2^{d-1}\\
&=2^{d}-\frac{d}{|x|}2^{d-1}.
\end{aligned}

Here the first equality follows from the identities $\sum_{p=0}^{d}\binom{d}{p}=2^{d}$ and $\sum_{p=0}^{d}p\binom{d}{p}=d2^{d-1}$ . Combining (25) and (26), we obtain the desired estimate for $|R(x)|$ . ∎
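The chain of estimates in (25) and (26) can be sanity-checked in exact arithmetic. The Python sketch below is only an illustration: the function names are ours, $m$ stands for $|x|$, and the tested range of $(m,d)$ is an arbitrary choice.

```python
from fractions import Fraction
from math import comb

def extrapolation_sum(m, d):
    """Exact value of sum_p C(m,p) C(m-p-1,d-p), the middle line of (25)."""
    return sum(comb(m, p) * comb(m - p - 1, d - p) for p in range(d + 1))

def convexity_bound(m, d):
    """C(m,d) (2^d - d 2^(d-1) / m), the bound obtained from (25) and (26)."""
    return comb(m, d) * (Fraction(2 ** d) - Fraction(d * 2 ** d, 2 * m))

def crude_bound(m, d):
    """C(m,d) 2^d, the bound without the convexity correction."""
    return comb(m, d) * 2 ** d

# For every m >= d + 1 in the tested range the three quantities are ordered.
checks = [
    extrapolation_sum(m, d) <= convexity_bound(m, d) <= crude_bound(m, d)
    for d in range(0, 7)
    for m in range(d + 1, 30)
]
```

For instance, with $m=4$ and $d=2$ the exact sum is $17$, the bound from (26) gives $18$ and the cruder bound gives $24$; for $d=1$ the bound from (26) is attained because the chord $g$ agrees with $f$ at both $p=0$ and $p=d$.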

Remark 4.4.

Extrapolation results which are similar to Lemma 4.3 are available in the literature [RS10, She20]. According to [She20, Lemma 2.11], it holds that

|R(x)|\leq\varepsilon\binom{|x|}{d}2^{d},\quad x\in\mathbb{Z}_{2}^{n},|x|\geq d+1,

for which our estimate in (18) provides an improvement.

Moreover, (17) is optimal in the sense that there exists some degree $d$ polynomial $R$ such that the equality holds. Indeed, we may take $R$ to be the one such that

R(\xi)=(-1)^{d+|\xi|}\varepsilon,\quad\xi\in S,

which can be obtained by interpolation. By (25), it is clear that the equality in (17) holds.

Lastly, either by (i) or (ii) in Lemma 4.3 we have

\max_{x\in[0,1]^{n}}\lvert R(x)\rvert<2^{d}N\varepsilon,

where $N\coloneqq|S|=\sum_{j=0}^{d}\binom{n}{j}$ .

Proposition 4.5.

Let $f$ be an integer-valued function on $\Gamma_{2}^{n}$ . If $g,h$ are non-negative functions on $\Gamma_{2}^{n}$ of degree at most $d$ and $L$ is an integer such that

•

$\deg(f)+d\leq n$ ;
•

$\min_{y\in\Gamma_{2}^{n}}|h(y)|\geq 1$ ;
•

$\max_{y\in\Gamma_{2}^{n},\leavevmode\nobreak\ \sum_{i=1}^{n}y_{i}\leq 2\deg(f)-n+2d}|g(y)-h(y)(f(y)-L)|\leq\frac{1}{2N^{2}}$ where $N\coloneqq\sum_{j=0}^{\deg(f)+d}\binom{n}{j}$ .

Then we have $f(y)\geq L$ for any $y\in\Gamma_{2}^{n}$ .

Proof.

We observe that the following inequality holds for any $y\in\Gamma_{2}^{n}$ :

|g(y)/h(y)-(f(y)-L)|=\frac{|g(y)-h(y)(f(y)-L)|}{|h(y)|}\leq|g(y)-h(y)(f(y)-L)|.

Thus it is sufficient to prove $\max_{x\in\mathbb{Z}_{2}^{n}}|R(x)|<1/2$ where

R(x)\coloneqq g(2x-1)-h(2x-1)(f(2x-1)-L),\quad x\in\mathbb{Z}_{2}^{n}.

We notice that $R(x)$ is a polynomial of degree at most $\deg(f)+d$ and by assumption we also have

\max_{x\in\mathbb{Z}_{2}^{n},|x|\leq\deg(f)+d}|R(x)|\leq\frac{1}{2N^{2}}.

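Applying Lemma 4.3 we obtain

\max_{x\in\mathbb{Z}_{2}^{n}}|R(x)|<\frac{2^{\deg(f)+d}N}{2N^{2}}\leq\frac{1}{2},

where the last inequality holds because $\deg(f)+d\leq n$ implies $2^{\deg(f)+d}=\sum_{j=0}^{\deg(f)+d}\binom{\deg(f)+d}{j}\leq\sum_{j=0}^{\deg(f)+d}\binom{n}{j}=N$ .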

∎

Now we are able to prove the correctness of the validation by sampling (16).

Theorem 4.6 (validation by sampling).

Let $f$ be an integer-valued function on $\Gamma_{2}^{n}$ . Suppose that $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ is a pair of finite families of functions on $\Gamma_{2}^{n}$ such that

(i)

$\sum_{i\in I}|h_{i}|^{2}\geq 1$ ;
(ii)

$d\coloneqq 2\max_{j\in J,i\in I}\{\deg g_{j},\deg h_{i}\}$ and $\deg(f)+d\leq n$ ;
(iii)

$\left\lvert\sum_{j\in J}|g_{j}(y)|^{2}-\left(\sum_{i\in I}|h_{i}(y)|^{2}\right)(f(y)-L)\right\rvert\leq n^{-2(\deg(f)+d)}$ for $y\in\Gamma_{2}^{n}$ such that $\sum_{i=1}^{n}y_{i}\leq 2\deg(f)-n+2d$ .

Then $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ is a rational FSOS certificate for $f\geq L$ .

According to Theorem 3.18, given any constant $M>0$ , there exists some constant $C=C(M)>0$ such that if $\lVert\widehat{f}\rVert_{\ell^{\infty}}\leq M$ , then $f$ admits a rational FSOS certificate of degree $C\deg(f)^{3}\log^{2}n$ for $f\geq L$ . Thus in Theorem 4.6 we may set $d=C\deg(f)^{3}\log^{2}n$ so that (iii) becomes

\left\lvert\sum_{j\in J}|g_{j}(x)|^{2}-\left(\sum_{i\in I}|h_{i}(x)|^{2}\right)(f(x)-L)\right\rvert\leq n^{-2(C\deg(f)^{3}\log^{2}n+\deg(f))}

for $x\in\Gamma_{2}^{n}$ such that $\sum_{i=1}^{n}x_{i}\leq 2C\deg(f)^{3}\log^{2}n-n+2\deg(f)$ . If $\deg(f)$ is fixed, then it suffices to validate $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ by checking quasi-polynomially many inequalities.
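The sampling set in Theorem 4.6 consists of the points of $\Gamma_{2}^{n}$ with at most $\deg(f)+d$ entries equal to $+1$ . As a minimal sketch, the following Python code enumerates these points and checks a residual on them; the names `sample_points`, `validate_by_sampling`, `residual` and `tolerance` are hypothetical, and the residual is supplied by the caller as a black box.

```python
from itertools import combinations
from math import comb

def sample_points(n, max_plus_ones):
    """Yield y in {-1,1}^n having at most `max_plus_ones` entries equal to +1."""
    for weight in range(max_plus_ones + 1):
        for support in combinations(range(n), weight):
            y = [-1] * n
            for i in support:
                y[i] = 1
            yield tuple(y)

def validate_by_sampling(residual, n, degree_bound, tolerance):
    """Check |residual(y)| <= tolerance on every sample point."""
    return all(abs(residual(y)) <= tolerance
               for y in sample_points(n, degree_bound))

# Small illustration: n = 10 and deg(f) + d = 3.
n, degree_bound = 10, 3
points = list(sample_points(n, degree_bound))
# The number of samples is N = sum_{j <= deg(f)+d} C(n, j), not 2^n.
N = sum(comb(n, j) for j in range(degree_bound + 1))
```

Each enumerated point satisfies $\sum_{i}y_{i}\leq 2(\deg(f)+d)-n$ , which is the condition appearing in (iii).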

5. Computing FSOS certificates by the SDP relaxation

As shown in Example 3.15, rank one (polynomial and rational) FSOS certificates, although they exist and can be constructed explicitly, suffer from high sparsity. Moreover, such FSOS certificates may not be validated by methods we discussed in Section 4.

Example 5.1 (running example, continued).

Let $g,h$ be defined by (13). We have

\left(f-\frac{1}{2}\right)h^{2}-g^{2}=-1.79-0.777y_{1}-0.777y_{2}-0.777y_{3}+0.669y_{1}y_{2}+0.669y_{1}y_{3}+0.669y_{2}y_{3}+2.12y_{1}y_{2}y_{3}.

By a direct calculation, we obtain $\lVert\widehat{(f-\frac{1}{2})h^{2}}-\widehat{g^{2}}\rVert_{\ell^{1}}=8.16$ , so our $\ell^{1}$ -norm validation (15) fails for $(\{g\},\{h\})$ . Since $(\{\mu g\},\{\mu h\})$ is still a rank one rational FSOS certificate for any $\mu\neq 0$ , one may hope to choose $\mu$ small enough that $\lVert\widehat{(f-\frac{1}{2})(\mu h)^{2}}-\widehat{(\mu g)^{2}}\rVert_{\ell^{1}}=8.16\mu^{2}<\frac{1}{2}$ , so that the $\ell^{1}$ -norm validation applies. However, since the range of $h^{2}$ is $\{12.0,22.2,29.9,55.7\}$ , ensuring $(\mu h)^{2}\geq 1$ requires $\mu^{2}\geq\frac{1}{12}$ . Hence $\lVert\widehat{(f-\frac{1}{2})(\mu h)^{2}}-\widehat{(\mu g)^{2}}\rVert_{\ell^{1}}\geq 8.16/12=0.68>\frac{1}{2}$ and the $\ell^{1}$ -norm validation is still not applicable.

Thus to find sparse FSOS certificates which can be validated efficiently, it is imperative to consider those of higher ranks. This section is concerned with numerical computation of sparse FSOS certificates. After a discussion on how to translate the problem of computing FSOS certificates to an SDP problem, we present Algorithm 1. The main idea of Algorithm 1 is as follows:

•

We translate Problem 1.1 into the SDP problem (27).
•

By relaxing the equality constraints in (27) to inequalities, we obtain (28).
•

Our low degree theorems (Theorems 3.8 and 3.18) ensure that it suffices to search for low degree FSOS certificates and this speeds up the algorithm.

Let $f:G\to\mathbb{Z}$ be an integer-valued function on a finite abelian group $G=\Gamma_{n_{1}}\times\cdots\times\Gamma_{n_{d}}$ . Using the Gram matrix defined in (7), we can convert Problem 1.1 to the following semidefinite programming problem, which is a variant of the formulation in [KLYZ12]:

(27)

\begin{aligned}
\min_{\begin{subarray}{c}U\in\mathbb{R}^{|S|\times|S|}\\ V\in\mathbb{R}^{|T|\times|T|}\end{subarray}}\quad&1,\\
\text{subject to}\quad&\sum_{\beta-\alpha=\lambda}U_{\alpha,\beta}=\sum_{\begin{subarray}{c}\gamma-\nu+\zeta=\lambda\\ \gamma\in\operatorname{supp}(f)\end{subarray}}(f-L)_{\gamma}V_{\nu,\zeta},\quad\forall\ \lambda\in\left(S-S\right)\cup\left(T-T+\operatorname{supp}(f)\right),\\
&U\succeq 0,\quad V\succeq\frac{1}{|T|}.
\end{aligned}

Here $S$ (resp. $T$ ) is a subset of $\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}$ , columns and rows of the matrix $U=(U_{\alpha,\beta})_{\alpha,\beta\in S}\in\mathbb{C}^{|S|\times|S|}$ (resp. $V=(V_{\nu,\zeta})_{\nu,\zeta\in T}\in\mathbb{C}^{|T|\times|T|}$ ) are indexed by elements in $S$ (resp. $T$ ) and $(f-L)_{\gamma}$ denotes the Fourier coefficient of $f-L$ at $\chi_{\gamma}\in\widehat{G}$ defined in (2). In the following we prove that the problem of certifying the nonnegativity of $f-L$ is equivalent, up to a choice of $T$ , to solving (27).

Proposition 5.2 (certifying nonnegativity by the SDP relaxation).

Given $S,T\subseteq\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}$ , each solution of (27) provides a rational FSOS for nonnegativity of $f$ . Conversely, any rational FSOS for nonnegativity of $f$ supplies a solution to (27) for some choices of $S$ and $T$ .

Proof.

Assume that $(U,V)$ is a solution to (27). In particular, $U,V\succeq 0$ . We let

g=\sum_{\alpha,\beta\in S}U_{\alpha,\beta}\chi_{\beta-\alpha},\quad h=\sum_{\nu,\zeta\in T}V_{\nu,\zeta}\chi_{\zeta-\nu}.

Clearly $U,V$ are Gram matrices of $g,h$ respectively. According to (8) we are able to find $\{g_{j}\}_{j=1}^{\operatorname{rank}(U)}$ and $\{h_{i}\}_{i=1}^{\operatorname{rank}(V)}$ such that $g=\sum_{j=1}^{\operatorname{rank}(U)}|g_{j}|^{2}$ , $h=\sum_{i=1}^{\operatorname{rank}(V)}|h_{i}|^{2}$ and

hf=g,\quad\bigcup_{j=1}^{\operatorname{rank}(U)}\operatorname{supp}(g_{j})\subseteq S,\quad\bigcup_{i=1}^{\operatorname{rank}(V)}\operatorname{supp}(h_{i})\subseteq T.

Moreover, $h\geq 1$ on $G$ since $V\succeq\frac{1}{|T|}$ . Conversely, given a pair $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ of finite families such that

f=\frac{\sum_{j\in J}|g_{j}|^{2}}{\sum_{i\in I}|h_{i}|^{2}},\quad\sum_{i\in I}|h_{i}|^{2}\geq 1,

then the Gram matrices $U$ and $V$ of $\sum_{j\in J}|g_{j}|^{2}$ and $\sum_{i\in I}|h_{i}|^{2}$ clearly form a solution to (27) for $S=\bigcup_{j\in J}\operatorname{supp}(g_{j})$ and $T=\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}$ . ∎
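To make the constraints concrete, the following Python sketch computes both sides of the linear constraints for the special case $G=\Gamma_{2}^{n}$ , where indices are $0/1$ tuples and the difference $\beta-\alpha$ is a componentwise XOR. The restriction to $\Gamma_{2}^{n}$ , the dictionary representation of Fourier coefficients and all function names are assumptions made for illustration; no SDP is solved here.

```python
def xor(a, b):
    """Difference b - a in Z_2^n, with elements stored as 0/1 tuples."""
    return tuple(x ^ y for x, y in zip(a, b))

def gram_to_fourier(index_set, gram):
    """Coefficient at lam is the sum of gram[i][j] over beta - alpha = lam."""
    coeffs = {}
    for i, alpha in enumerate(index_set):
        for j, beta in enumerate(index_set):
            lam = xor(alpha, beta)
            coeffs[lam] = coeffs.get(lam, 0) + gram[i][j]
    return coeffs

def shifted_product(f_coeffs, T, V):
    """Fourier coefficients of the product of f_coeffs with the function of Gram matrix V."""
    h = gram_to_fourier(T, V)
    out = {}
    for gamma, c in f_coeffs.items():
        for mu, v in h.items():
            lam = xor(gamma, mu)
            out[lam] = out.get(lam, 0) + c * v
    return out

def l1_residual(S, U, T, V, f_coeffs):
    """Sum over lam of |lhs(lam) - rhs(lam)| for the linear constraints."""
    left = gram_to_fourier(S, U)
    right = shifted_product(f_coeffs, T, V)
    return sum(abs(left.get(k, 0) - right.get(k, 0)) for k in set(left) | set(right))

# Toy instance on Gamma_2: f - L = 1 + y equals |(1 + y)/sqrt(2)|^2, and h = 1.
S = [(0,), (1,)]
U = [[0.5, 0.5], [0.5, 0.5]]
T = [(0,)]
V = [[1.0]]
residual = l1_residual(S, U, T, V, {(0,): 1.0, (1,): 1.0})
```

In this toy instance the residual vanishes, so $(U,V)$ satisfies the equality constraints exactly; positive semidefiniteness of $U$ and $V$ has to be checked separately.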

Remark 5.3.

By the proof of Proposition 5.2, it is clear that $(S,T)$ in (27) bounds the supports of rational FSOS certificates. Moreover, since rational FSOS include polynomial FSOS as a special case, it is clear that some solution $(U,V)$ of (27) should supply a polynomial FSOS certificate. Indeed, this occurs if $V$ is a diagonal matrix whose diagonal elements are at least $\frac{1}{|T|}$ .

We notice that there are linear equality constraints in (27), which are not favourable from the perspective of optimization. To resolve this issue, we consider the problem (28) obtained by relaxing the equalities in (27) to inequalities.

(28)

\begin{aligned}
\min_{\begin{subarray}{c}U\in\mathbb{R}^{|S|\times|S|}\\ V\in\mathbb{R}^{|T|\times|T|}\end{subarray}}\quad&1,\\
\text{subject to}\quad&\sum_{\lambda\in\left(S-S\right)\cup\left(T-T+\operatorname{supp}(f)\right)}\left\lvert\sum_{\begin{subarray}{c}\beta-\alpha=\lambda\\ \alpha,\beta\in S\end{subarray}}U_{\alpha,\beta}-\sum_{\begin{subarray}{c}\gamma+\zeta-\nu=\lambda\\ \gamma\in\operatorname{supp}(f),\nu,\zeta\in T\end{subarray}}\left(f-L+\frac{1}{2}\right)_{\gamma}V_{\nu,\zeta}\right\rvert<\frac{1}{2},\\
&U\succeq 0,\quad V\succeq\frac{1}{|T|}.
\end{aligned}

Clearly, (28) is indeed a relaxation of (27), and its inequality constraints can be rewritten as

(29)

\left\lVert\sum_{i\in I}\widehat{|h_{i}|^{2}\left(f-L+\frac{1}{2}\right)}-\sum_{j\in J}\widehat{|g_{j}|^{2}}\right\rVert_{\ell^{1}}<\frac{1}{2}.

In particular, (29) can be recognized as an $\ell^{1}$ -norm relaxation of

\left\lVert\sum_{i\in I}\widehat{|h_{i}|^{2}\left(f-L+\frac{1}{2}\right)}-\sum_{j\in J}\widehat{|g_{j}|^{2}}\right\rVert_{\ell^{\infty}}<\frac{1}{2},

which is simply (14) with $\varepsilon=\frac{1}{2}$ .

Now we are ready to present Algorithm 1 which computes a rational FSOS certificate for $f\geq L$ , where $f$ is an integer-valued function $f:G\to\mathbb{Z}$ and $L$ is an integer. In particular, if we set $T=\{1\}$ and $S:=\{\nu\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}:z^{\nu}\in\operatorname{supp}(r)\}$ in step 6 of Algorithm 1, we obtain an algorithm to compute a sparse polynomial FSOS certificate for $f\geq L$ .

Algorithm 1 Rational FSOS certificate for lower bound

1: Input: an integer-valued function $f:G\to\mathbb{Z}$ ; integers $L,M,d$ and $k$ such that the image of $f$ is contained in $[L,M]\cap\mathbb{Z}$

2: Output: rational FSOS certificate $(\{g_{\alpha}\}_{\alpha\in S},\{h_{\nu}\}_{\nu\in T})$ for $f\geq L$

3: compute the degree $d$ polynomial $p_{d}(t)$ which approximates $\sqrt{t}$ at points $i+\frac{1}{2},i=0,\dots,M-L$ by a linear programming solver.

4: while problem in step 7 is not feasible do

5: compute the first- $k$ -truncation $r$ of $p_{d}(f-L+\frac{1}{2})$ $\triangleright$ see (31)

6: set $T:=\{\nu\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}:\chi_{\nu}\in\operatorname{supp}(r)\}$ and $S:=T+\{\alpha\in\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}:\chi_{\alpha}\in\operatorname{supp}(f)\}$

7: solve problem (28); if it is not feasible, then increase the value of $k$ $\triangleright$ by any SDP solver

8: end while

9: compute Cholesky decompositions $U=G^{*}G$ and $V=H^{*}H$ with $G\in\mathbb{C}^{J\times|S|}$ and $H\in\mathbb{C}^{I\times|T|}$

10: return $g_{j}:=\sum_{\alpha\in S}G_{j,\alpha}\chi_{\alpha},j\in J$ and $h_{i}:=\sum_{\zeta\in T}H_{i,\zeta}\chi_{\zeta},i\in I$

In Algorithm 1, we again label columns and rows of matrices in $\mathbb{C}^{|S|\times|S|}$ (resp. $\mathbb{C}^{|T|\times|T|}$ ) by elements in $S$ (resp. $T$ ). Among all the steps in Algorithm 1, the rationale behind steps 7–10 is clear from the discussion above. Thus it remains to interpret steps 3–6. Let $p_{d}(t)$ be the approximation polynomial obtained in step 3, whose use is justified by Proposition 3.22. The univariate polynomial $p_{d}$ can be easily determined by optimization techniques. For instance, we can compute $p_{d}(t)=\sum_{i=0}^{d}a_{i}t^{i}$ by solving a linear programming problem:

(30)

\begin{aligned}
\min_{(a_{0},\dots,a_{d})\in\mathbb{R}^{d+1}}\quad&\lambda,\\
\text{subject to}\quad&\left|\left(\sum_{i=0}^{d}a_{i}t^{i}\right)-\sqrt{t}\right|\leq\lambda,\quad t=\frac{1}{2},\frac{3}{2},\dots,(f_{\max}-L)+\frac{1}{2}.
\end{aligned}
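For a fixed coefficient vector $(a_{0},\dots,a_{d})$ , the smallest feasible $\lambda$ in (30) is the maximum absolute error at the half-integer nodes. The Python sketch below only evaluates this objective for a given candidate and does not solve the linear program; the function names are ours, and the candidate used for illustration is the degree one polynomial $p_{1}(t)=0.2097t+0.8971$ reported in Example 5.4 with $L=2$ and $M=18$ .

```python
from math import sqrt

def sample_nodes(L, M):
    """Half-integer nodes i + 1/2 for i = 0, ..., M - L."""
    return [i + 0.5 for i in range(M - L + 1)]

def max_abs_error(coeffs, L, M):
    """Smallest lambda for which (coeffs, lambda) is feasible in the LP (30)."""
    def p(t):
        value = 0.0
        for a in reversed(coeffs):  # Horner's rule, coeffs = (a_0, ..., a_d)
            value = value * t + a
        return value
    return max(abs(p(t) - sqrt(t)) for t in sample_nodes(L, M))

# Candidate taken from Example 5.4: p_1(t) = 0.2097 t + 0.8971.
lam = max_abs_error((0.8971, 0.2097), L=2, M=18)
```

Each absolute value constraint in (30) is equivalent to the pair of linear inequalities $-\lambda\leq\sum_{i}a_{i}t^{i}-\sqrt{t}\leq\lambda$ , which is what makes the problem a linear program in $(a_{0},\dots,a_{d},\lambda)$ .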

The first- $k$ -truncation in step 5 is defined as follows: we order terms of $p_{d}(f)$ by the magnitude of their coefficients:

p_{d}(f)=c_{1}\chi_{\alpha_{1}}+\dots+c_{D}\chi_{\alpha_{D}},

where $|c_{1}|\geq\cdots\geq|c_{D}|$ and $\{\chi_{\alpha_{1}},\dots,\chi_{\alpha_{D}}\}=\operatorname{supp}(p_{d}(f))$ . Then we take the first $k$ terms to obtain the first- $k$ -truncation of $p_{d}(f)$ :

(31)

r=c_{1}\chi_{\alpha_{1}}+\dots+c_{k}\chi_{\alpha_{k}}.

Clearly we have

r^{2}\approx f\approx\frac{g}{h},\quad g=\sum_{j\in J}|g_{j}|^{2},\quad h=\sum_{i\in I}|h_{i}|^{2},

where $S,T\subseteq\mathbb{Z}_{n_{1}}\times\cdots\times\mathbb{Z}_{n_{d}}$ and $(\{g_{j}\}_{j\in J},\{h_{i}\}_{i\in I})$ are to be determined. The existence of $r$ suggests the existence of a rational certificate whose support contains $\operatorname{supp}(r)$ , from which we heuristically choose $T$ to be the one in step 6. Admittedly, $r$ is truncated, so it may no longer be a good approximation of $\sqrt{f}$ . However, the definition of the first- $k$ -truncation ensures that $\operatorname{supp}(r)$ still retains the dominant part of $\operatorname{supp}(\sqrt{f})$ . Now from the relation $hf\approx g$ , it is clear why $S$ should be in the form of the one in step 6.

We provide the following example to illustrate Algorithm 1.

Example 5.4.

Let

f:\Gamma_{2}^{4}\to\mathbb{Z},\quad f(y)\coloneqq y_{1}+y_{2}+y_{3}+y_{4}+y_{1}y_{2}+2y_{1}y_{3}-3y_{1}y_{4}+8.

We compute a rational FSOS certificate for $f\geq L\coloneqq 2$ by Algorithm 1. We have

f(y)-L+\frac{1}{2}=y_{1}+y_{2}+y_{3}+y_{4}+y_{1}y_{2}+2y_{1}y_{3}-3y_{1}y_{4}+\frac{13}{2}.

Clearly $f(y)\leq\|\widehat{f}\|_{\ell^{1}}=18$ for all $y\in\Gamma_{2}^{4}$ . We take $L=2$ , $M=18$ , $d=1$ and $k=2$ in Algorithm 1. Then $p_{1}(t)=0.2097t+0.8971$ and the first- $2$ -truncation of $p_{1}(f-L+\frac{1}{2})$ is $-0.63y_{1}y_{4}+2.26$ . This implies that $T=\{(0,0,0,0),(1,0,0,1)\}$ and

S=\left\{\begin{aligned} &(0,0,0,0),(0,0,0,1),(0,0,1,0),(0,0,1,1),(0,1,0,0),(0,1,0,1),\\ &(1,0,0,0),(1,0,0,1),(1,0,1,0),(1,0,1,1),(1,1,0,0),(1,1,0,1)\end{aligned}\right\}

One can find the rational FSOS certificate obtained by Algorithm 1 in Appendix A.
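Steps 5 and 6 of Algorithm 1 can be reproduced for this example with a few lines of Python. The sketch below assumes the group $\Gamma_{2}^{4}$ , stores a function as a dictionary from $0/1$ exponent tuples to Fourier coefficients, and uses the rounded coefficients of $p_{1}$ quoted above; the helper names are hypothetical.

```python
def multiply(p, q):
    """Product of two functions on Gamma_2^n given by Fourier coefficients."""
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            key = tuple(x ^ y for x, y in zip(a, b))  # chi_a * chi_b = chi_{a xor b}
            out[key] = out.get(key, 0.0) + ca * cb
    return out

def compose(poly_coeffs, func, n):
    """Fourier coefficients of p(func) for p(t) = sum_i poly_coeffs[i] t^i."""
    zero = (0,) * n
    result = {}
    for a in reversed(poly_coeffs):  # Horner's rule
        result = multiply(result, func)
        result[zero] = result.get(zero, 0.0) + a
    return result

def first_k_truncation(func, k):
    """Keep the k terms of largest absolute coefficient, as in (31)."""
    ranked = sorted(func.items(), key=lambda item: abs(item[1]), reverse=True)
    return dict(ranked[:k])

def sumset(A, B):
    """The set A + B in Z_2^n."""
    return {tuple(x ^ y for x, y in zip(a, b)) for a in A for b in B}

# Fourier coefficients of f - L + 1/2 from Example 5.4.
shifted = {
    (0, 0, 0, 0): 6.5,
    (1, 0, 0, 0): 1.0, (0, 1, 0, 0): 1.0, (0, 0, 1, 0): 1.0, (0, 0, 0, 1): 1.0,
    (1, 1, 0, 0): 1.0, (1, 0, 1, 0): 2.0, (1, 0, 0, 1): -3.0,
}
r = first_k_truncation(compose((0.8971, 0.2097), shifted, 4), k=2)
T = set(r)                       # step 6: support of the truncation
S = sumset(T, set(shifted))      # step 6: T + supp(f)
```

Here $\operatorname{supp}(f)$ coincides with the support of $f-L+\frac{1}{2}$ because both constant terms are nonzero, so the same dictionary keys can be used for the sumset.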

6. Applications of FSOS certificates

In this section, we present numerical experiments for three applications of FSOS certificates. The first application is the certificate problem for the function from [BGP16], which we discussed in Theorem 2.10 and Remark 3.4. The second application is the certificate problem for MAX-SAT. In the literature, a certificate for MAX-SAT can be obtained either by an inference rule called resolution [AH15, PCH20, BLM07, HMS11, PCH21] or by the polynomial SOS technique [HKM16, vvH08]. We illustrate in Section 6.2 that FSOS certificates provide another way to certify the MAX-SAT problem. The third application originates from graph theory. It is the certificate problem for maximum k-colorable subgraph (MkCS) problem, which reduces to the MAX-CUT problem when $k=2$ [PY88, Boo75, GW95]. In Section 6.3, we consider the MkCS certificate problem for wheel graphs and complete graphs, respectively.

6.1. Numerical experiments on the function from [BGP16]

We recall from Theorem 2.10 that for each positive integer $n$ , the function

f_{n}(x_{1},\dots,x_{n})=\left(\sum_{j=1}^{n}x_{j}-\left\lfloor\frac{n}{2}\right\rfloor\right)\left(\sum_{j=1}^{n}x_{j}-\left\lfloor\frac{n}{2}\right\rfloor-1\right)

on $\{0,1\}^{n}$ cannot be written as a rational FSOS of degree lower than $\lfloor n/2\rfloor$ . However, according to Remark 3.4, $f_{n}$ admits an explicit FSOS certificate of sparsity $n+1$ . In this subsection, we test Algorithm 1 on $f_{n}$ for $n=10,20,\dots,100$ . Numerical results are presented in Table 1. We list the sparsity of the computed polynomial FSOS certificates in the second column. For comparison, we also list in the third column the upper bound of the sparsity of a rational FSOS of degree $\lfloor n/2\rfloor$ .

n	sparsity	upper bound of the sparsity [BGP16]
10	50	$512$
20	200	$524299$
30	450	$5.3687\cdot 10^{8}$
40	800	$5.4976\cdot 10^{11}$
50	1250	$5.6295\cdot 10^{14}$
60	1800	$5.7646\cdot 10^{17}$
70	2450	$5.903\cdot 10^{20}$
80	3200	$6.0446\cdot 10^{23}$
90	4050	$6.1897\cdot 10^{26}$
100	5000	$6.3383\cdot 10^{29}$

Table 1. polynomial FSOS certificates for $f_{n}$

6.2. MAX-SAT certificate problem

In this subsection, we apply Algorithm 1 to solve the MAX-SAT certificate problem. To begin with, we recall the definition of a conjunctive normal form (CNF) formula.

Definition 6.1 (clause and CNF formula).

Let $x_{1},\dots,x_{s+r}$ be variables on the Boolean field $(\{\text{False},\text{True}\},\lor,\land)$ . A clause (in variables $x_{1},\dots,x_{s+r}$ ) with $(s+r)$ literals is:

c=x_{1}\lor\cdots\lor x_{s}\lor\lnot x_{s+1}\lor\cdots\lor\lnot x_{s+r}.

A $k$ -CNF formula (in variables $x_{1},\dots,x_{n}$ ) with $m$ clauses is:

\phi=c_{1}\land\cdots\land c_{m},

where $c_{i}$ is a clause (in variables $x_{1},\dots,x_{n}$ ) with at most $k$ literals for each $1\leq i\leq m$ .

An assignment of $\phi$ is an evaluation of $\phi$ at a point in $\{\text{False},\text{True}\}^{n}$ . We say that a clause $c$ in $\phi$ is satisfiable if there is an assignment such that the value of $c$ is True. Given a $k$ -CNF formula $\phi$ , the MAX-SAT problem determines the maximum number of simultaneously satisfiable clauses in $\phi$ by an assignment. Below we also clarify the MAX-SAT certificate problem.

Definition 6.2 (MAX-SAT certificate problem).

Given a $k$ -CNF formula $\phi$ with $m$ clauses and a positive integer $L\leq m$ , the MAX-SAT certificate problem asks for a proof that at most $(m-L)$ clauses in $\phi$ are simultaneously satisfiable, or equivalently, that every assignment falsifies at least $L$ clauses in $\phi$ . Such a proof is called a MAX-SAT certificate for $(\phi,L)$ .

Next we will establish a connection between the MAX-SAT certificate problem and FSOS. To achieve this goal, we consider the following map sending a Boolean variable to a variable in $\Gamma_{2}$ :

\tau:\{\text{False},\text{True}\}\to\Gamma_{2},\quad\tau(x)=\begin{cases}1,&\text{if }x=\text{False},\\ -1,&\text{if }x=\text{True}.\end{cases}

In fact, if we identify the truth values False and True with $0$ and $1$ respectively, then $\tau$ is simply the bijective map between $\mathbb{Z}_{2}$ and $\Gamma_{2}$ defined by $x\mapsto 1-2x$ . It is obvious that under this identification $\tau$ is a group isomorphism between $\mathbb{Z}_{2}$ and $\Gamma_{2}$ . In particular, we have $\tau(\lnot x)=-\tau(x)$ . Moreover, if we regard a clause $c=x_{1}\lor\cdots\lor x_{s}\lor\lnot x_{s+1}\lor\cdots\lor\lnot x_{s+r}$ as an integer-valued function on $\mathbb{Z}_{2}^{s+r}$ , then we have the commutative diagram shown in Figure 2.

Figure 2. conversion of a clause to a function

By a direct calculation, we may obtain that

(32)

c\circ(\tau^{-1}\times\cdots\times\tau^{-1})(y_{1},\dots,y_{s+r})=1-h_{s+r}(y_{1},\cdots,y_{s},-y_{s+1},\cdots,-y_{s+r}),

where for each positive integer $l$ , $h_{l}$ is the multilinear polynomial in variables $y_{1},\dots,y_{l}$ defined by

(33)

h_{l}(y_{1},y_{2},\dots,y_{l})=\frac{1}{2^{l}}\prod_{i=1}^{l}\left(1+y_{i}\right).

One can easily verify that for $(y_{1},\dots,y_{l})\in\Gamma_{2}^{l}$ , we have

(34)

h_{l}(y_{1},y_{2},\dots,y_{l})=\begin{cases}1,&\text{if }y_{1}=y_{2}=\cdots=y_{l}=1,\\ 0,&\text{otherwise}.\end{cases}

It is the auxiliary function $h_{l}$ defined in (33) which enables us to convert a $k$ -CNF formula to a function on $\Gamma_{2}^{n}$ . Before we proceed, it is worth recalling that in the literature, one can convert a $k$ -CNF formula to a polynomial by several different methods, and the degree of the resulting polynomial varies with the method. For example, given a $k$ -CNF formula $\phi$ with $m$ clauses, the arithmetizations of $\phi$ in [AB09, Section 8.5.1] and [Gol10, Section 4] have degree $3m$ and $m$ respectively, while the characteristic function of $\phi$ in [BLM07, Definition 6] has degree $k$ , which is independent of $m$ . For our purpose, it is favorable in terms of computational complexity to convert a $k$ -CNF formula to a low degree polynomial. Therefore, we use the following variant of the characteristic function defined in [BLM07, Definition 6].

Definition 6.3.

We define the characteristic function $f_{c}$ of a clause $c=x_{1}\lor\cdots\lor x_{s}\lor\lnot x_{s+1}\lor\cdots\lor\lnot x_{s+r}$ as

f_{c}(y)=h_{s+r}(y_{1},\cdots,y_{s},-y_{s+1},\cdots,-y_{s+r}),\quad y\in\Gamma_{2}^{s+r}.

Given a CNF formula $\phi=\bigwedge_{i=1}^{m}c_{i}$ in $n$ variables, we define its characteristic function as

f_{\phi}(y)=\sum_{i=1}^{m}f_{c_{i}}(y),\quad y\in\Gamma_{2}^{n}.

Example 6.4.

The characteristic function of the CNF formula:

\phi=x_{1}\land x_{2}\land x_{3}\land\left(\lnot{x}_{1}\lor\lnot{x}_{2}\lor\lnot{x}_{3}\right)

is given by

\begin{aligned}
f_{\phi}(y_{1},y_{2},y_{3})&=h_{1}(y_{1})+h_{1}(y_{2})+h_{1}(y_{3})+h_{3}(-y_{1},-y_{2},-y_{3})\\
&=\frac{13}{8}+\frac{3}{8}y_{1}+\frac{3}{8}y_{2}+\frac{3}{8}y_{3}+\frac{1}{8}y_{1}y_{2}+\frac{1}{8}y_{1}y_{3}+\frac{1}{8}y_{2}y_{3}-\frac{1}{8}y_{1}y_{2}y_{3}.
\end{aligned}

This is the function on $\Gamma_{2}^{3}$ we discussed in Example 3.5.
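The expansion in this example can be reproduced mechanically. The following Python sketch builds the Fourier expansion of a characteristic function from a clause list in a signed-integer encoding (a positive integer $i$ stands for $x_{i}$ and a negative integer $-i$ for $\lnot x_{i}$ ); this encoding, the assumption that each clause uses distinct variables, and the function names are ours.

```python
from fractions import Fraction
from itertools import combinations, product

def clause_characteristic(clause):
    """Expand prod_i (1 + s_i y_i) / 2, with s_i = -1 for negated literals."""
    coeffs = {}
    for size in range(len(clause) + 1):
        for subset in combinations(clause, size):
            sign = -1 if sum(lit < 0 for lit in subset) % 2 else 1
            monomial = frozenset(abs(lit) for lit in subset)
            coeffs[monomial] = Fraction(sign, 2 ** len(clause))
    return coeffs

def formula_characteristic(clauses):
    """Sum of the clause characteristic functions, with zero terms dropped."""
    total = {}
    for clause in clauses:
        for monomial, c in clause_characteristic(clause).items():
            total[monomial] = total.get(monomial, 0) + c
    return {m: c for m, c in total.items() if c != 0}

def evaluate(coeffs, y):
    """Evaluate at y, a dict mapping variable index to +1 or -1."""
    value = Fraction(0)
    for monomial, c in coeffs.items():
        term = c
        for i in monomial:
            term *= y[i]
        value += term
    return value

def falsified(clauses, assignment):
    """Number of clauses in which every literal is false."""
    return sum(all(assignment[abs(lit)] != (lit > 0) for lit in clause)
               for clause in clauses)

# The formula of Example 6.4.
phi = [[1], [2], [3], [-1, -2, -3]]
f_phi = formula_characteristic(phi)

# Compare f_phi(tau(x)) with the number of falsified clauses on all assignments.
values = []
for bits in product([False, True], repeat=3):
    assignment = dict(zip((1, 2, 3), bits))
    y = {i: (-1 if v else 1) for i, v in assignment.items()}  # tau: False -> 1, True -> -1
    values.append((evaluate(f_phi, y), falsified(phi, assignment)))
```

On all eight assignments the two quantities agree, and the minimum value is $1$ , which reflects that $\phi$ is unsatisfiable.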

We summarize some simple but useful properties of characteristic functions in the next proposition.

Proposition 6.5.

Let $\phi$ be a $k$ -CNF formula in $n$ variables with $m$ clauses and let $f_{\phi}$ be the characteristic function of $\phi$ . Then $f_{\phi}$ has the following properties:

(i)

The degree of $f_{\phi}$ is at most $k$ .
(ii)

The cardinality of $\operatorname{supp}(f_{\phi})$ is at most $\min\left\{2^{k}m,\sum_{i=0}^{k}\binom{n}{i}\right\}$ .
(iii)

$f_{\phi}$ can be computed by $O(2^{k}km)$ operations. In particular, if we fix $k$ then $f_{\phi}$ can be computed by $O(m)$ operations.
(iv)

For each $x\in\{\text{True},\text{False}\}^{n}$ , $f_{\phi}(\tau(x))$ is the number of clauses of $\phi$ falsified by $x$ .
(v)

The image set of $f_{\phi}$ is contained in $\{0,\dots,m\}$ .
(vi)

Let $\alpha\in(1/5,1]$ and $0\leq\beta<\frac{5\alpha-1}{4\alpha}$ be fixed real numbers. Assume the minimum of the number of simultaneously falsified clauses in $\phi$ is $L_{\min}=\alpha m$ . For each positive integer $L\leq\beta L_{\min}$ , there is a polynomial FSOS certificate for $f_{\phi}\geq L$ whose degree is $O\left(k\log m\right)$ .
(vii)

For each $L\leq L_{\min}$ , there is a rational FSOS certificate for $f_{\phi}\geq L$ whose degree is $O(k\log^{2}m)$ .

Proof.

According to (33), for each positive integer $s$ , we first have $\deg h_{s}=s$ from which (i) follows. It is easy to check that the cardinality of $\operatorname{supp}(h_{s})$ is $2^{s}$ . Since the number of multilinear monomials in $n$ variables with degree at most $k$ is $\sum_{i=0}^{k}\binom{n}{i}$ , we obtain (ii). Next one can easily check (iii) by (33). Lastly from (32) and (34), we may conclude that for each clause $c$ in $n$ variables, $f_{c}(\tau(x_{1}),\dots,\tau(x_{n}))=1$ if and only if $c$ is falsified by $(x_{1},\dots,x_{n})\in\{\text{True},\text{False}\}^{n}$ . This implies (iv) and (v) by the definition of $f_{\phi}$ . (vi) and (vii) are direct consequences of Theorems 3.8 and 3.18, respectively. ∎

Combining (iv), (vi) and (vii) in Proposition 6.5, we may easily obtain the following theorem on the existence of short MAX-SAT certificate.

Theorem 6.6 (short MAX-SAT certificate).

Let $\alpha\in(1/5,1]$ , $0\leq\beta<\frac{5\alpha-1}{4\alpha}$ be fixed real numbers and let $k$ be a fixed positive integer.

(i)

For each $k$ -CNF formula $\phi$ in $n$ variables with $m=O(n)$ clauses such that $L_{\min}=\alpha m$ and each positive integer $L\leq\beta L_{\min}$ , there is a MAX-SAT certificate for $(\phi,L)$ of the form $\{h_{i}\}_{i\in I}$ whose degree is $O(\log n)$ .
(ii)

For each $k$ -CNF formula $\phi$ in $n$ variables with $m=O(n)$ clauses and each positive integer $L\leq L_{\min}$ , there is a MAX-SAT certificate for $(\phi,L)$ of the form $(\{h_{i}\}_{i\in I},\{g_{j}\}_{j\in J})$ whose degree is $O(\log^{2}n)$ .

Remark 6.7.

Clearly, using characteristic functions, we are able to deal with the certificate problems associated with other problems such as UNSAT and MIN-SAT. In each instance, we certify the inequality $f\geq L$ by an FSOS certificate for an appropriate choice of $f$ and $L$ . In Table 2, we list the corresponding choices of $f$ and $L$ , where $f_{\phi}$ denotes the characteristic function of a CNF formula $\phi$ and $L_{\min}$ (resp. $L_{\max}$ ) is the minimum (resp. maximum) of the number of simultaneously falsified clauses in $\phi$ .

	MAX-SAT	UNSAT	MIN-SAT
$L$	$L_{\min}$	1	$-L_{\max}$
$f$	$f_{\phi}$	$f_{\phi}$	$-f_{\phi}$

Table 2. adaptations to other problems

6.2.1. Experiments on MAX-3-SAT Benchmark Problems

Given a CNF formula $\phi$ , we recall that the MAX-SAT certificate problem for $\phi$ can be solved by $\{g_{i}\}_{i\in I}$ if

\|\widehat{f_{\phi}}-L_{\phi}-\sum_{i\in I}\widehat{g_{i}^{2}}\|_{\ell^{1}}<1,

where $L_{\phi}$ is the minimum number of simultaneously falsified clauses in $\phi$ . In [vvH08], bases ${M}_{ap}$ and ${M}_{pt}$ are proposed to compute FSOS of MAX-3-SAT problems with $n$ variables, where

\begin{aligned}
M_{\text{ap}}&=\{1,y_{i}\}_{i=1}^{n}\cup\{y_{i}y_{j}:x_{i},x_{j}\text{ occur in same clause}\}_{i,j=1}^{n},\\
M_{\text{pt}}&=M_{\text{ap}}\cup\{y_{i}y_{j}y_{k}:x_{i},x_{j},x_{k}\text{ occur in same clause}\}_{i,j,k=1}^{n}.
\end{aligned}

We consider the MAX-3-SAT benchmark problems in the 2009 MAX-SAT competition (http://www.maxsat.udl.cat/09/index.php). CNF formulae in these problems have $70$ variables and $300$ clauses. We compute FSOS certificates for these problems by $M_{\text{ap}}$ , $M_{\text{pt}}$ and Algorithm 1, respectively. Each of the three methods requires us to solve an SDP and we solve it by the ADMM algorithm in SDPNAL+[STYZ20].

We record our results in Table 3. Instances whose FSOS certificates can be found by the corresponding method are marked by “ $\surd$ ” in the “verifiability” column. Instances for which the corresponding method fails to find FSOS certificates are marked by “ $\times$ ” in the “verifiability” column. We also record the sparsity of the computed FSOS certificate in the column labelled by “sparsity”. For instances marked by “ $\times$ ”, the sparsity is the number of elements in $M_{\text{pt}}$ .

The experimental results demonstrate that when there exists an FSOS certificate with basis $M_{\text{pt}}$ or $M_{\text{ap}}$ , Algorithm 1 can find an FSOS certificate with a basis that is no larger. More importantly, when methods based on $M_{\text{pt}}$ and $M_{\text{ap}}$ fail, Algorithm 1 is still able to compute an FSOS certificate successfully.

No	$L_{\phi}$	$M_{\text{pt}}$ , $M_{\text{ap}}$		Algorithm 1
No	$L_{\phi}$	verifiability	sparsity	verifiability	sparsity
0	0	$\surd$	1122	$\surd$	1111
1	1	$\surd$	1122	$\surd$	1122
2	0	$\surd$	1113	$\surd$	1102
3	0	$\surd$	1131	$\surd$	1120
4	1	$\times$	2486	$\surd$	1691
5	0	$\surd$	1108	$\surd$	1097
6	1	$\times$	2486	$\surd$	1698
7	0	$\surd$	1130	$\surd$	1119
8	0	$\surd$	1125	$\surd$	1114
9	0	$\surd$	1124	$\surd$	1113

Table 3. polynomial FSOS certificates for random benchmark problems with 300 clauses

6.3. Maximum k-colorable subgraph certificate problem

Given a simple undirected graph $\mathcal{G}$ and a positive integer $k$ , the maximum k-colorable subgraph problem (MkCS) asks one to find a subgraph with the maximum number of edges which can be colored with $k$ colors [PY88]. We remark that the MkCS problem considered in this paper is also called the maximum k-cut problem [FJ95, vS16, Sot14]. Moreover, in some references [LFT92, JP11, CC10], the name MkCS refers to a closely related but different problem.

In this subsection, we show how to certify the MkCS problem by FSOS certificates.

Definition 6.8 (MkCS certificate problem).

Given a simple undirected graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$ and positive integers $k$ and $L$ , the MkCS certificate problem asks for a proof of the nonexistence of a $k$ -colorable subgraph with $L+1$ edges.

Next we rephrase the MkCS certificate problem as a certificate problem for lower bound. To that end, we have two candidates for the underlying group structure: one is $\Gamma_{3}^{n}$ and the other one is $\Gamma_{2}^{3n}$ .

Let $k$ be a fixed positive integer and let $\mathcal{G}=(\mathcal{V},\mathcal{E})$ be an undirected graph with vertex set $\mathcal{V}=\{1,2,3,...,n\}$ and edge set $\mathcal{E}\subset\mathcal{V}\times\mathcal{V}$ . We consider the following two functions:

\delta:\Gamma_{k}^{2}\to\mathbb{R},\quad\delta(y_{1},y_{2})=\begin{cases}1,&\text{if }y_{1}=y_{2},\\ 0,&\text{otherwise},\end{cases}

(35)

f_{\mathcal{G},k}:\Gamma_{k}^{n}\to\mathbb{R},\quad f_{\mathcal{G},k}(y_{1},y_{2},\dots,y_{n})=\sum_{(i,j)\in\mathcal{E}}\delta(y_{i},y_{j}).

We notice that each element $(y_{1},\dots,y_{n})\in\Gamma_{k}^{n}$ represents an assignment of $k$ colors to vertices of $\mathcal{G}$ . Thus $f_{\mathcal{G},k}(y_{1},\dots,y_{n})$ is exactly the number of edges whose vertices are assigned the same color by $(y_{1},\dots,y_{n})$ . This observation leads to the lemma that follows.

Lemma 6.9.

Let $\mathcal{G}$ be an undirected graph and let $k$ be a positive integer. The maximum number $d_{\mathcal{G},k}$ of edges in $\mathcal{G}$ that are $k$ -colorable is equal to $|\mathcal{E}|-\min_{y\in\Gamma_{k}^{n}}f_{\mathcal{G},k}(y)$ . In particular, $d_{\mathcal{G},k}\leq L$ if and only if $\min_{y\in\Gamma_{k}^{n}}f_{\mathcal{G},k}(y)\geq|\mathcal{E}|-L$ , which can be certified by an FSOS certificate of $f_{\mathcal{G},k}-|\mathcal{E}|+L$ .
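Lemma 6.9 can be checked by brute force on a small instance. The following is a minimal sketch, not the method used in this paper: colors are encoded as integers $0,\dots,k-1$ rather than as $k$-th roots of unity, vertices are indexed from $0$, and the test graph $\mathcal{K}_{4}$ as well as all function names are hypothetical choices made for illustration.

```python
from itertools import combinations, product


def f_graph(edges, y):
    """f_{G,k}(y): number of edges whose two endpoints receive the same color."""
    return sum(1 for i, j in edges if y[i] == y[j])


def min_f(n, edges, k):
    """Minimum of f_{G,k} over all k^n color assignments (brute force)."""
    return min(f_graph(edges, y) for y in product(range(k), repeat=n))


def is_k_colorable(n, edges, k):
    """True if some assignment colors every edge in `edges` properly."""
    return any(f_graph(edges, y) == 0 for y in product(range(k), repeat=n))


def max_k_colorable_edges(n, edges, k):
    """d_{G,k}: largest size of an edge subset that is k-colorable."""
    for size in range(len(edges), -1, -1):
        if any(is_k_colorable(n, sub, k) for sub in combinations(edges, size)):
            return size


# Hypothetical test graph: K_4 on vertices 0..3 with k = 3 colors.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
d = max_k_colorable_edges(4, k4_edges, 3)
m = min_f(4, k4_edges, 3)
print(d, m, len(k4_edges) - m)
```

The subgraph search and the minimization of $f_{\mathcal{G},k}$ are computed independently, so the agreement of `d` with `len(k4_edges) - m` is a genuine check of the lemma on this instance. Both loops are exponential in $n$ and are only meant for tiny graphs.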

According to [BCMM05], we can also encode the 3-colorability of a graph $\mathcal{G}$ into the following 3-CNF formula in variables $x_{i,k}$ , $i=1,\dots,n$ , $k=1,2,3$ :

(36)

\phi_{\mathcal{G}}=\bigwedge_{i=1}^{n}\left(x_{i,1}\lor x_{i,2}\lor x_{i,3}\right)\land\bigwedge_{\begin{subarray}{c}i=1,2,3,...,n\\ k=1,2\end{subarray}}\left(\lnot x_{i,k}\lor\lnot x_{i,k+1}\right)\land\bigwedge_{\begin{subarray}{c}(i,j)\in\mathcal{E}\\ k=1,2,3\end{subarray}}\left(\lnot x_{i,k}\lor\lnot x_{j,k}\right).

Here $x_{i,k}$ is true if and only if the $k$ -th color is assigned to the $i$ -th vertex. Therefore $\mathcal{G}$ is $3$ -colorable if and only if $\phi_{\mathcal{G}}$ is satisfiable. Moreover, using the characteristic function $f_{\phi_{\mathcal{G}}}$ (cf. Definition 6.3) of $\phi_{\mathcal{G}}$ , we are able to prove that $\mathcal{G}$ is not $3$ -colorable by an FSOS certificate. This is the content of the following lemma.

Lemma 6.10.

Let $\mathcal{G}=(\mathcal{V},\mathcal{E})$ be an undirected graph with $n$ vertices. Then $\mathcal{G}$ is not $3$ -colorable if and only if $f_{\phi_{\mathcal{G}}}\geq 1$ .

We notice that Lemmas 6.9 and 6.10 provide us with two ways to certify the inequality $d_{\mathcal{G},3}\leq L$ : the first one is by an FSOS certificate of $f_{\mathcal{G},3}-|\mathcal{E}|+L$ , which is a function on $\Gamma_{3}^{n}$ ; the second one is by an FSOS certificate of $f_{\phi_{\mathcal{G}}}-|\mathcal{E}|+L$ , which is a function on $\Gamma_{2}^{3n}$ .
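The encoding (36) and Lemma 6.10 can be illustrated on small graphs. The sketch below rests on two assumptions that are not taken from this subsection: the characteristic function $f_{\phi}$ of Definition 6.3 is assumed to count the clauses falsified by an assignment, and truth values are stored as Python Booleans rather than as elements of $\Gamma_{2}$. All names are hypothetical, and vertices and colors are indexed from $0$.

```python
from itertools import product


def coloring_cnf(n, edges):
    """Clauses of phi_G as in (36); a literal is (vertex, color, is_positive)."""
    clauses = []
    for i in range(n):
        # Vertex i receives at least one of the three colors.
        clauses.append([(i, c, True) for c in range(3)])
        # Clauses (not x_{i,c}) or (not x_{i,c+1}) for consecutive colors, as in (36).
        for c in range(2):
            clauses.append([(i, c, False), (i, c + 1, False)])
    for i, j in edges:
        # Adjacent vertices do not share color c.
        for c in range(3):
            clauses.append([(i, c, False), (j, c, False)])
    return clauses


def falsified(clauses, x):
    """Assumed form of f_phi: number of clauses falsified by the assignment x."""
    return sum(
        1 for clause in clauses
        if not any(x[(i, c)] == positive for i, c, positive in clause)
    )


def min_falsified(n, edges):
    """Minimum number of falsified clauses over all 2^(3n) assignments."""
    clauses = coloring_cnf(n, edges)
    keys = [(i, c) for i in range(n) for c in range(3)]
    return min(
        falsified(clauses, dict(zip(keys, bits)))
        for bits in product([False, True], repeat=len(keys))
    )


triangle = [(0, 1), (0, 2), (1, 2)]                        # 3-colorable
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]      # not 3-colorable
print(min_falsified(3, triangle), min_falsified(4, k4))
```

Under the stated assumption on $f_{\phi}$, a minimum of $0$ means that $\phi_{\mathcal{G}}$ is satisfiable, and a minimum of at least $1$ is the inequality of Lemma 6.10.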

6.3.1. MkCS certificate problem for Wheel graphs

Let $\mathcal{W}_{n}$ be the wheel graph of $n$ vertices [W⁺01]. The graph in Figure 3(a) is the wheel graph of $8$ vertices.

Figure 3. Example of a wheel graph and its 3-colorable subgraph: (a) $\mathcal{W}_{8}$ ; (b) 3-colorable subgraph of $\mathcal{W}_{8}$ .

The following lemma is obvious.

Lemma 6.11.

If $n$ is even, then

(i) $\mathcal{W}_{n}$ is not 3-colorable.
(ii) The subgraph of $\mathcal{W}_{n}$ obtained by deleting an edge in the outer circle is 3-colorable.

As a consequence, we have

\min_{y\in\Gamma_{3}^{n}}f_{\mathcal{W}_{n},3}(y)=1,

and

\min_{y\in\Gamma_{2}^{3n}}f_{\phi_{\mathcal{W}_{n}}}(y)=1.
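For small $n$ the first of these identities can be confirmed by exhaustive search. The following sketch assumes the usual labeling of the wheel graph (vertex $0$ is the hub and vertices $1,\dots,n-1$ form the outer cycle) and encodes the three colors as $0,1,2$; the function names are hypothetical.

```python
from itertools import product


def wheel_edges(n):
    """Edges of W_n: hub 0 joined to rim vertices 1..n-1, which form a cycle."""
    rim = list(range(1, n))
    spokes = [(0, v) for v in rim]
    cycle = [(rim[t], rim[(t + 1) % len(rim)]) for t in range(len(rim))]
    return spokes + cycle


def min_monochromatic(n, edges, k=3):
    """Minimum of f_{G,k}: fewest monochromatic edges over all k^n colorings."""
    return min(
        sum(1 for i, j in edges if y[i] == y[j])
        for y in product(range(k), repeat=n)
    )


for n in (4, 5, 6, 7, 8):
    print(n, min_monochromatic(n, wheel_edges(n)))
```

For even $n$ the minimum is $1$, in agreement with Lemma 6.11, while for odd $n$ the outer cycle has even length and the minimum is $0$. The search visits all $3^{n}$ colorings, which is exactly the cost that a short FSOS certificate avoids for large $n$.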

The graph in Figure 3(b) is the 3-colorable subgraph of $\mathcal{W}_{8}$ described in (ii) of Lemma 6.11. Next we compute a short polynomial FSOS certificate of $f_{\mathcal{W}_{n},3}-1$ for $n\in\{2m:5\leq m\leq 25\}$ , which provides a short proof of the inequality $\min_{y\in\Gamma_{3}^{n}}f_{\mathcal{W}_{n},3}(y)\geq 1$ . To that end, we apply Algorithm 1 to $f_{\mathcal{W}_{n},3}$ with $L=1$ , $f_{\max}=|\mathcal{E}|$ , $d=2$ and $k=|\operatorname{supp}(f)|$ . Numerical results are recorded in Table 4, where $n$ is the number of vertices of the wheel graph, "time" is the running time of Algorithm 1, and "sparsity" is the sparsity of the computed FSOS certificate. Figure 4 shows that the computed polynomial FSOS certificates are indeed short, since their sparsities are linear in the number of vertices. We remark that the order of $\Gamma_{3}^{n}$ is $3^{n}$ , which is as large as $3^{50}\approx 7\cdot 10^{23}$ in our examples.

As a comparison, we also apply TSSOS [WML21b, WML21a] to compute a polynomial SOS certificate for $f_{\mathcal{W}_{n},3}\geq 1$ . Since TSSOS is not able to process complex polynomials directly, we transform the problem into the following equivalent real form:

		$\displaystyle\min_{x,y\in\mathbb{R}^{n}}$	$\displaystyle\operatorname{Re}\left(f_{\mathcal{W}_{n},3}(x+\sqrt{-1}y)\right)$
		subject to	$\displaystyle(x_{i}+\sqrt{-1}y_{i})^{3}=1,\quad i=1,\dots,n.$

We then solve the real problem above by TSSOS (we thank Jie Wang for his help). For instance, TSSOS, with the relaxation order set to $4$ , spends $510$ seconds to find an SOS certificate for $f_{\mathcal{W}_{10},3}\geq 1$ of sparsity $10206$ .

n	time	sparsity
10	0.87	74
12	1.20	90
14	1.29	106
16	1.77	122
18	2.14	138
20	3.60	154
22	4.44	170
24	5.34	186
26	6.21	202
28	7.19	218
30	10.54	234
32	11.50	250
34	13.04	266
36	14.62	282
38	14.76	298
40	16.75	314
42	19.03	330
44	21.74	346
46	25.63	362
48	26.59	378
50	30.45	394

Table 4. Polynomial FSOS certificates for $f_{\mathcal{W}_{n},3}\geq 1$

n	time	sparsity
10	2.25	125
12	3.49	151
14	5.06	177
16	6.71	203
18	9.06	229
20	11.46	255
22	15.38	281
24	18.95	307
26	23.89	333
28	31.67	359
30	149.23	578
32	190.84	617
34	218.54	656
36	289.95	695
38	326.91	734
40	375.87	773
42	454.19	812
44	481.17	851
46	543.78	890
48	609.56	929
50	694.80	968

Table 5. Polynomial FSOS certificates for $f_{\phi_{\mathcal{W}_{n}}}\geq 1$

Figure 4. MkCS certificate problem for wheel graphs

Let $\phi_{\mathcal{W}_{n}}$ be the 3-CNF formula defined in (36) for $\mathcal{W}_{n}$ and let $f_{\phi_{\mathcal{W}_{n}}}$ be its characteristic function (cf. Definition 6.3). According to Lemmas 6.10 and 6.11, we have $\min_{y\in\Gamma_{2}^{3n}}f_{\phi_{\mathcal{W}_{n}}}(y)=1$ if $n$ is even. We apply Algorithm 1 to compute polynomial FSOS certificates for $f_{\phi_{\mathcal{W}_{n}}}\geq 1$ , $n\in\{2m:5\leq m\leq 25\}$ , which is the range of $n$ listed in Table 5. Results are shown in Table 5. Figure 4 shows that the sparsity of the computed FSOS certificates is (roughly) linear in the number of vertices.

We remark that the sparsity of the computed FSOS certificate for $f_{\phi_{\mathcal{W}_{n}}}\geq 1$ is much greater than that for $f_{\mathcal{W}_{n},3}\geq 1$ . Moreover, for the same $n$ , computing an FSOS certificate for $f_{\phi_{\mathcal{W}_{n}}}\geq 1$ costs more time than computing an FSOS certificate for $f_{\mathcal{W}_{n},3}\geq 1$ . This indicates that although many combinatorial problems can be equivalently reformulated as FSOS problems on $\Gamma_{2}^{n}$ , a suitable choice of the underlying group structure may accelerate the computation in practice.
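The comparison can be read off directly from the sparsity columns of Tables 4 and 5. The short script below only re-examines the numbers reported there; the closed forms it tests are observations about these tables, not statements proved in this paper, and the variable names are hypothetical.

```python
# Sparsity columns copied from Table 4 (Gamma_3^n) and Table 5 (Gamma_2^{3n}).
ns = list(range(10, 51, 2))
sparsity_table4 = [74, 90, 106, 122, 138, 154, 170, 186, 202, 218, 234,
                   250, 266, 282, 298, 314, 330, 346, 362, 378, 394]
sparsity_table5 = [125, 151, 177, 203, 229, 255, 281, 307, 333, 359, 578,
                   617, 656, 695, 734, 773, 812, 851, 890, 929, 968]

# Table 4: the reported sparsity is exactly linear in n.
table4_is_linear = all(s == 8 * n - 6 for n, s in zip(ns, sparsity_table4))

# Table 5: linear on each of the two ranges n <= 28 and n >= 30.
table5_low = all(s == 13 * n - 5 for n, s in zip(ns, sparsity_table5) if n <= 28)
table5_high = all(2 * s == 39 * n - 14 for n, s in zip(ns, sparsity_table5) if n >= 30)

# Ratio of the two sparsities for each n.
ratios = [s5 / s4 for s4, s5 in zip(sparsity_table4, sparsity_table5)]

print(table4_is_linear, table5_low, table5_high)
print(min(ratios), max(ratios))
```

The ratio list quantifies how much larger the certificates on $\Gamma_{2}^{3n}$ are than those on $\Gamma_{3}^{n}$ for the same wheel graph.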

6.3.2. MkCS certificate problem for complete graphs

Let $\mathcal{K}_{n}$ be the complete graph with $n$ vertices and let $f_{\mathcal{K}_{n},n-1}:\Gamma_{n-1}^{n}\to\mathbb{R}$ be the integer-valued function defined in (35) for $\mathcal{K}_{n}$ and $k=n-1$ .

Lemma 6.12.

For any $n$ , $f_{\mathcal{K}_{n},n-1}\geq 1$ admits a polynomial FSOS certificate of sparsity $\operatorname{O}(n^{2})$ .

Proof.

We observe that

f_{\mathcal{K}_{n},n-1}=\frac{n}{2}+\frac{1}{n-1}\sum_{k=1}^{n-2}\sum_{1\leq i<j\leq n}\chi_{k}(y_{i})\chi_{n-k-1}(y_{j}),

where $\chi_{k}(z)=z^{k}$ , $0\leq k\leq n-2$ , are the characters of $\Gamma_{n-1}$ . It is straightforward to verify that

f_{\mathcal{K}_{n},n-1}-1=\frac{-n+2}{2(n-1)}+\sum_{k=1}^{n-2}\frac{1}{2(n-1)}\left|\sum_{i=1}^{n}\chi_{k}(y_{i})\right|^{2}.

This provides the desired FSOS certificate for $f_{\mathcal{K}_{n},n-1}\geq 1$ . ∎
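The identity used in the proof can be verified numerically for small $n$. The sketch below assumes, in line with (35), that $y$ has $n$ coordinates, each an $(n-1)$-th root of unity; a coordinate $y_{i}=e^{2\pi\sqrt{-1}\,t_{i}/(n-1)}$ is stored through its exponent $t_{i}$. The function names are hypothetical.

```python
import cmath
from itertools import product


def lhs(t):
    """f_{K_n,n-1}(y) - 1: number of pairs i < j with y_i = y_j, minus one."""
    n = len(t)
    pairs = sum(1 for i in range(n) for j in range(i + 1, n) if t[i] == t[j])
    return pairs - 1


def rhs(t):
    """Constant term plus the weighted squares |sum_i chi_k(y_i)|^2, k = 1..n-2."""
    n = len(t)
    m = n - 1  # each coordinate is an m-th root of unity
    const = (2 - n) / (2 * (n - 1))
    squares = sum(
        abs(sum(cmath.exp(2j * cmath.pi * k * ti / m) for ti in t)) ** 2
        for k in range(1, n - 1)
    )
    return const + squares / (2 * (n - 1))


def max_gap(n):
    """Largest |lhs - rhs| over all (n-1)^n points y."""
    return max(abs(lhs(t) - rhs(t)) for t in product(range(n - 1), repeat=n))


for n in (3, 4, 5):
    print(n, max_gap(n))
```

The gaps are at the level of floating-point rounding. The right-hand side uses $n-2$ squares of $n$ characters each, which is where the $\operatorname{O}(n^{2})$ sparsity bound of Lemma 6.12 comes from.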

In fact, $f_{\mathcal{K}_{n},n-1}$ is the same function discussed in [YYZ22, Proposition 5.2], whose nonnegativity is equivalent to the pigeon-hole principle. Although Lemma 6.12 already explicitly supplies a sparse polynomial FSOS certificate for $f_{\mathcal{K}_{n},n-1}\geq 1$ , we apply Algorithm 1 to $f_{\mathcal{K}_{n},n-1}\geq 1$ where $4\leq n\leq 22$ , only for testing and comparison purposes (for $n=22$ , the order of the underlying group $\Gamma_{21}^{22}$ is approximately $1.2\cdot 10^{29}$ ).

Numerical results are listed in Table 6. In Figure 5, we plot the time cost and the computed sparsity as functions of $n$ . It turns out that the time cost (resp. computed sparsity) can be fitted by a quintic (resp. cubic) polynomial. In fact, every sparsity listed in Table 6 equals $n(n-1)(n-2)+2$ .

n	time	sparsity
4	0.16	26
5	0.32	62
6	0.80	122
7	2.46	212
8	6.13	338
9	15.38	506
10	32.25	722
11	65.56	992
12	122.07	1322
13	235.69	1718
14	405.70	2186
15	659.25	2732
16	1075.5	3362
17	1813.0	4082
18	2922.6	4898
19	4556.8	5816
20	6850.7	6842
21	11050	7982
22	15665	9242

Table 6. Polynomial FSOS certificates for $f_{\mathcal{K}_{n},n-1}\geq 1$
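The growth of the sparsity column of Table 6 can be examined with finite differences: a sequence sampled at consecutive integers is a cubic polynomial exactly when its third differences are constant. The script below only re-examines the reported numbers; the closed form it tests is an observation about the table, and the names are hypothetical.

```python
# Sparsity column copied from Table 6, for n = 4, ..., 22.
ns = list(range(4, 23))
sparsity = [26, 62, 122, 212, 338, 506, 722, 992, 1322, 1718, 2186,
            2732, 3362, 4082, 4898, 5816, 6842, 7982, 9242]


def differences(values):
    """Forward differences of a sequence."""
    return [b - a for a, b in zip(values, values[1:])]


third = differences(differences(differences(sparsity)))
matches_cubic = all(s == n * (n - 1) * (n - 2) + 2 for n, s in zip(ns, sparsity))

print(set(third), matches_cubic)
```

A constant third difference of $6$ corresponds to a cubic with leading coefficient $1$. By comparison, the explicit certificate in the proof of Lemma 6.12 uses $n-2$ squares with $n$ terms each.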

Appendix A Rational FSOS certificate in Example 5.4

	$\displaystyle g_{1}=$	$\displaystyle 0.29y_{1}+0.25y_{2}+0.25y_{3}+0.22y_{4}+0.25y_{1}y_{2}+0.49y_{1}y_{3}-0.32y_{1}y_{4}+0.022y_{2}y_{4}+0.044y_{3}y_{4}+$
		$\displaystyle 0.022y_{1}y_{2}y_{4}+0.023y_{1}y_{3}y_{4}+1.2,$
	$\displaystyle g_{2}=$	$\displaystyle 1.1y_{4}-0.025y_{2}-2.7\cdot 10^{-3}y_{3}-0.39y_{1}-0.025y_{1}y_{2}-0.072y_{1}y_{3}+0.35y_{1}y_{4}+0.25y_{2}y_{4}+$
		$\displaystyle 0.24y_{3}y_{4}+0.25y_{1}y_{2}y_{4}+0.5y_{1}y_{3}y_{4},$
	$\displaystyle g_{3}=$	$\displaystyle 0.47y_{1}-0.056y_{2}+1.1y_{3}-0.056y_{1}y_{2}+0.049y_{1}y_{3}+0.1y_{1}y_{4}-4.3\cdot 10^{-3}y_{2}y_{4}+0.15y_{3}y_{4}-$
		$\displaystyle 4.3\cdot 10^{-3}y_{1}y_{2}y_{4}-0.44y_{1}y_{3}y_{4},$
	$\displaystyle g_{4}=$	$\displaystyle 0.033y_{1}+3.9\cdot 10^{-3}y_{2}+3.9\cdot 10^{-3}y_{1}y_{2}-0.45y_{1}y_{3}+0.46y_{1}y_{4}-0.056y_{2}y_{4}+$
		$\displaystyle 1.1y_{3}y_{4}-0.056y_{1}y_{2}y_{4}+0.11y_{1}y_{3}y_{4},$
	$\displaystyle g_{5}=$	$\displaystyle 0.24y_{1}+0.97y_{2}+0.058y_{1}y_{2}-0.12y_{1}y_{3}+0.12y_{1}y_{4}+0.2y_{2}y_{4}-0.55y_{1}y_{2}y_{4}-0.019y_{1}y_{3}y_{4},$
	$\displaystyle g_{6}=$	$\displaystyle 0.075y_{1}-0.58y_{1}y_{2}+6.5\cdot 10^{-3}y_{1}y_{3}+0.22y_{1}y_{4}+0.95y_{2}y_{4}+0.17y_{1}y_{2}y_{4}-0.12y_{1}y_{3}y_{4},$
	$\displaystyle g_{7}=$	$\displaystyle 0.92y_{1}+0.29y_{1}y_{2}+0.15y_{1}y_{3}+0.41y_{1}y_{4}+0.26y_{1}y_{2}y_{4}+0.49y_{1}y_{3}y_{4},$
	$\displaystyle g_{8}=$	$\displaystyle 0.15y_{1}y_{2}+0.48y_{1}y_{3}+0.82y_{1}y_{4}+0.19y_{1}y_{2}y_{4}-0.078y_{1}y_{3}y_{4},$
	$\displaystyle g_{9}=$	$\displaystyle 0.69y_{1}y_{3}-0.32y_{1}y_{2}-0.31y_{1}y_{2}y_{4}+0.33y_{1}y_{3}y_{4},$
	$\displaystyle g_{10}=$	$\displaystyle 0.61y_{1}y_{3}y_{4}-0.19y_{1}y_{2}y_{4}-0.18y_{1}y_{2},$
	$\displaystyle g_{11}=$	$\displaystyle 0.61y_{1}y_{2}+0.15y_{1}y_{2}y_{4},$
	$\displaystyle g_{12}=$	$\displaystyle 0.59y_{1}y_{2}y_{4},$
	$\displaystyle h_{1}=$	$\displaystyle 0.098y_{1}y_{4}+1.1,$
	$\displaystyle h_{2}=$	$\displaystyle 1.1y_{1}y_{4}.$

It is straightforward to verify that $\left\lVert\sum_{\nu\in T}\widehat{(f-\frac{3}{2})h_{\nu}^{2}}-\sum_{\alpha\in S}\widehat{g_{\alpha}^{2}}\right\rVert_{\ell^{1}}<0.092$ .

References

[AB09] Sanjeev Arora and Boaz Barak. Computational complexity: a modern approach. Cambridge University Press, 2009.
[AH15] André Abramé and Djamal Habet. On the resiliency of unit propagation to max-resolution. In Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.
[Art27] Emil Artin. Über die Zerlegung definiter Funktionen in Quadrate. Abh. Math. Sem. Univ. Hamburg, 5(1):100–115, 1927.
[BCMM05] Paul Beame, Joseph Culberson, David Mitchell, and Cristopher Moore. The resolution complexity of random graph k-colorability. Discrete Applied Mathematics, 153(1-3):25–47, 2005.
[BGP16] Grigoriy Blekherman, João Gouveia, and James Pfeiffer. Sums of squares on the hypercube. Mathematische Zeitschrift, 284(1):41–54, 2016.
[BHK⁺19] Boaz Barak, Samuel Hopkins, Jonathan Kelner, Pravesh K. Kothari, Ankur Moitra, and Aaron Potechin. A nearly tight sum-of-squares lower bound for the planted clique problem. SIAM Journal on Computing, 48(2):687–735, 2019.
[BLM07] Maria Luisa Bonet, Jordi Levy, and Felip Manya. Resolution for max-sat. Artificial Intelligence, 171(8-9):606–618, 2007.
[BLT21] Guy Blanc, Jane Lange, and Li-Yang Tan. Provably efficient, succinct, and precise explanations. Advances in Neural Information Processing Systems, 34:6129–6141, 2021.
[Boo75] Ronald V. Book. Richard M. Karp. Reducibility among combinatorial problems. Complexity of Computer Computations, Proceedings of a symposium on the Complexity of Computer Computations, held March 20–22, 1972, at the IBM Thomas J. Watson Center, Yorktown Heights, New York, edited by Raymond E. Miller and James W. Thatcher, Plenum Press, New York and London 1972, pp. 85–103. The Journal of Symbolic Logic, 40(4):618–619, 1975.
[CC10] Manoel Campélo and Ricardo C. Corréa. A combined parallel lagrangian decomposition and cutting-plane generation for maximum stable set problems. Electronic Notes in Discrete Mathematics, 36:503–510, 2010. ISCO 2010 - International Symposium on Combinatorial Optimization.
[FH13] William Fulton and Joe Harris. Representation theory: a first course, volume 129. Springer Science & Business Media, 2013.
[FJ95] Alan M. Frieze and Mark Jerrum. Improved approximation algorithms for max k-cut and max bisection. In Proceedings of the 4th International IPCO Conference on Integer Programming and Combinatorial Optimization, page 1–13, Berlin, Heidelberg, 1995. Springer-Verlag.
[FSP16] Hamza Fawzi, James Saunderson, and Pablo A Parrilo. Sparse sums of squares on finite abelian groups and improved semidefinite lifts. Mathematical Programming, 160(1-2):149–191, 2016.
[Gol10] Oded Goldreich. P, NP, and NP-Completeness: The basics of computational complexity. Cambridge University Press, 2010.
[Gri01] D. Grigoriev. Complexity of Positivstellensatz proofs for the knapsack. Computational Complexity, 10(2):139–154, 2001.
[GW95] Michel X Goemans and David P Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM (JACM), 42(6):1115–1145, 1995.
[Hil88] David Hilbert. Über die darstellung definiter formen als summe von formenquadraten. Mathematische Annalen, 32(3):342–350, 1888.
[HKM16] Marijn JH Heule, Oliver Kullmann, and Victor W Marek. Solving and verifying the boolean pythagorean triples problem via cube-and-conquer. In International Conference on Theory and Applications of Satisfiability Testing, pages 228–245. Springer, 2016.
[HMS11] Federico Heras and Joao Marques-Silva. Read-once resolution for unsatisfiability-based max-sat algorithms. In Twenty-Second International Joint Conference on Artificial Intelligence, 2011.
[JP11] Tim Januschowski and Marc E. Pfetsch. Branch-cut-and-propagate for the maximum k-colorable subgraph problem with symmetry. In Tobias Achterberg and J. Christopher Beck, editors, Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, pages 99–116, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
[KLM16] Adam Kurpisz, Samuli Leppänen, and Monaldo Mastrolilli. Tight sum-of-squares lower bounds for binary polynomial optimization problems. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016.
[KLYZ12] Erich L Kaltofen, Bin Li, Zhengfeng Yang, and Lihong Zhi. Exact certification in global polynomial optimization via sums-of-squares of rational functions with rational coefficients. Journal of Symbolic Computation, 47(1):1–15, 2012.
[KMOW17] Pravesh K. Kothari, Ryuhei Mori, Ryan O’Donnell, and David Witmer. Sum of squares lower bounds for refuting any csp. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, page 132–145, New York, NY, USA, 2017. Association for Computing Machinery.
[Kri64] J. L. Krivine. Anneaux préordonnés. Journal d’Analyse Mathématique, 12(1):307–326, 1964.
[Kur19] Adam Kurpisz. Sum-of-squares bounds via boolean function analysis. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), volume 132, page 79. Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2019.
[Las01] Jean B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM Journal on Optimization, 11(3):796–817, 2001.
[Lau09] Monique Laurent. Sums of Squares, Moment Matrices and Optimization Over Polynomials, pages 157–270. Springer New York, 2009.
[LFT92] K.C. Lee, N. Funabiki, and Y. Takefuji. A parallel improvement algorithm for the bipartite subgraph problem. IEEE Transactions on Neural Networks, 3(1):139–145, 1992.
[Mot67] Theodore Samuel Motzkin. The arithmetic-geometric inequality. Inequalities (Proc. Sympos. Wright-Patterson Air Force Base, Ohio, 1965), pages 205–224, 1967.
[MPW15] Raghu Meka, Aaron Potechin, and Avi Wigderson. Sum-of-squares lower bounds for planted clique. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15, page 87–96, New York, NY, USA, 2015. Association for Computing Machinery.
[New64] Donald J Newman. Rational approximation to $|x|$ . Michigan Mathematical Journal, 11(1):11–14, 1964.
[Nie14] Jiawang Nie. Optimality conditions and finite convergence of lasserre’s hierarchy. Mathematical Programming, 146(1):97–121, 2014.
[NOR10] Arkadi Nemirovski, Shmuel Onn, and Uriel G Rothblum. Accuracy certificates for computational problems with convex structure. Mathematics of Operations Research, 35(1):52–78, 2010.
[NS94] Noam Nisan and Mario Szegedy. On the degree of boolean functions as real polynomials. Computational complexity, 4(4):301–313, 1994.
[Par00] Pablo A Parrilo. Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization. California Institute of Technology, 2000.
[Par03] Pablo A. Parrilo. Semidefinite programming relaxations for semialgebraic problems. Mathematical Programming, 96(2):293–320, 2003.
[PCH20] Matthieu Py, Mohamed Sami Cherif, and Djamal Habet. Towards bridging the gap between sat and max-sat refutations. In 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), pages 137–144. IEEE, 2020.
[PCH21] Matthieu Py, Mohamed Sami Cherif, and Djamal Habet. A proof builder for max-sat. In International Conference on Theory and Applications of Satisfiability Testing, pages 488–498. Springer, 2021.
[PT20] Pablo A. Parrilo and Rekha R. Thomas, editors. Sum of squares: theory and applications, volume 77 of Proceedings of Symposia in Applied Mathematics. American Mathematical Society, Providence, RI, 2020. AMS Short Course, Sum of Squares: Theory and Applications, January 14–15, 2019, Baltimore, MD.
[PY88] Christos Papadimitriou and Mihalis Yannakakis. Optimization, approximation, and complexity classes. In Proceedings of the twentieth annual ACM symposium on Theory of computing, pages 229–234, 1988.
[RS10] Alexander A. Razborov and Alexander A. Sherstov. The sign-rank of $\rm AC^{0}$ . SIAM J. Comput., 39(5):1833–1855, 2010.
[Rud62] Walter Rudin. Fourier analysis on groups, volume 12 of Interscience Tracts in Pure and Applied Mathematics. Wiley, 1962.
[SB02] J. Stoer and R. Bulirsch. Introduction to numerical analysis, volume 12 of Texts in Applied Mathematics. Springer-Verlag, New York, third edition, 2002. Translated from the German by R. Bartels, W. Gautschi and C. Witzgall.
[Sch91] Konrad Schmüdgen. The $K$-moment problem for compact semi-algebraic sets. Mathematische Annalen, 289(1):203–206, 1991.
[Sha94] N. Shankar. Metamathematics, machines, and Gödel’s proof, volume 38 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge, 1994.
[She20] Alexander A Sherstov. Algorithmic polynomials. SIAM Journal on Computing, 49(6):1173–1231, 2020.
[SL23] Lucas Slot and Monique Laurent. Sum-of-squares hierarchies for binary polynomial optimization. Mathematical Programming, 197(2):621–660, 2023.
[Sot14] R. Sotirov. An efficient semidefinite programming relaxation for the graph partition problem. INFORMS J. on Computing, 26(1):16–30, feb 2014.
[Sta03] Herbert R. Stahl. Best uniform rational approximation of $x^{\alpha}$ on $[0,1]$ . Acta Math., 190(2):241–306, 2003.
[Ste74] Gilbert Stengle. A nullstellensatz and a positivstellensatz in semialgebraic geometry. Mathematische Annalen, 207(2):87–97, 1974.
[STKI17] Shinsaku Sakaue, Akiko Takeda, Sunyoung Kim, and Naoki Ito. Exact semidefinite programming relaxations with truncated moment matrix for binary polynomial optimization problems. SIAM Journal on Optimization, 27(1):565–582, 2017.
[STYZ20] Defeng Sun, Kim-Chuan Toh, Yancheng Yuan, and Xin-Yuan Zhao. SDPNAL+: A matlab software for semidefinite programming with bound constraints (version 1.0). Optimization Methods and Software, 35(1):87–115, 2020.
[VK87] Richard S Varga and A D Karpenter. On a conjecture of S. Bernstein in approximation theory. Mathematics of the USSR-Sbornik, 57(2):547–560, feb 1987.
[vS16] E.R. van Dam and R. Sotirov. New bounds for the max-k-cut and chromatic number of a graph. Linear Algebra and its Applications, 488:216–234, 2016.
[vvH08] H. van Maaren, L. van Norden, and M.J.H. Heule. Sums of squares based approximation algorithms for max-sat. Discrete Applied Mathematics, 156(10):1754–1779, 2008.
[Vya75] NS Vyacheslavov. On the uniform approximation of $|x|$ by rational functions. Doklady Akademii Nauk, 220(3):512–515, 1975.
[W⁺01] Douglas Brent West et al. Introduction to graph theory, volume 2. Prentice hall Upper Saddle River, 2001.
[WML21a] Jie Wang, Victor Magron, and Jean-Bernard Lasserre. Chordal-tssos: a moment-sos hierarchy that exploits term sparsity with chordal extension. SIAM Journal on Optimization, 31(1):114–141, 2021.
[WML21b] Jie Wang, Victor Magron, and Jean-Bernard Lasserre. Tssos: A moment-sos hierarchy that exploits term sparsity. SIAM Journal on Optimization, 31(1):30–58, 2021.
[YYZ22] Jianting Yang, Ke Ye, and Lihong Zhi. Computing sparse fourier sum of squares on finite abelian groups in quasi-linear time. arXiv preprint arXiv:2201.03912, 2022.

