A Polynomial Degree Bound on Equations for Non-rigid Matrices and Small Linear Circuits

Mrinal Kumar Department of Computer Science & Engineering, IIT Bombay. Email: [email protected]    Ben Lee Volk Center for the Mathematics of Information, California Institute of Technology, USA. Email: [email protected]
Abstract

We show that there is an equation of degree at most $\poly(n)$ for the (Zariski closure of the) set of non-rigid matrices: that is, we show that for every large enough field $\mathbb{F}$, there is a non-zero $n^2$-variate polynomial $P\in\mathbb{F}[x_{1,1},\ldots,x_{n,n}]$ of degree at most $\poly(n)$ such that every matrix $M$ which can be written as a sum of a matrix of rank at most $n/100$ and a matrix of sparsity at most $n^2/100$ satisfies $P(M)=0$. This confirms a conjecture of Gesmundo, Hauenstein, Ikenmeyer and Landsberg [GHIL16] and improves the best known upper bound for this problem from $\exp(n^2)$ [KLPS14, GHIL16] down to $\poly(n)$.

We also show a similar polynomial degree bound for the (Zariski closure of the) set of all matrices $M$ such that the linear transformation represented by $M$ can be computed by an algebraic circuit with at most $n^2/200$ edges (without any restriction on the depth). As far as we are aware, no such bound was known prior to this work when the depth of the circuits is unbounded.

Our methods are elementary and short and rely on a polynomial map of Shpilka and Volkovich [SV15] to construct low degree “universal” maps for non-rigid matrices and small linear circuits. Combining this construction with a simple dimension counting argument to show that any such polynomial map has a low degree annihilating polynomial completes the proof.

As a corollary, we show that any derandomization of the polynomial identity testing problem will imply new circuit lower bounds. A similar (but incomparable) theorem was proved by Kabanets and Impagliazzo [KI04].

1 Introduction

1.1 Equations for varieties in algebraic complexity theory

Let $V\subseteq\mathbb{F}^n$ be a (not necessarily irreducible) affine variety and let $\mathbf{I}(V)$ denote its ideal (for completeness, we provide the formal, standard definitions of these notions in Section 1.4). A non-zero polynomial $P\in\mathbf{I}(V)$ is called an equation for $V$. An equation for $V$ may serve as a "proof" that a point $\mathbf{x}\in\mathbb{F}^n$ is not in $V$, by showing that $P(\mathbf{x})\neq 0$.

A fundamental observation of the Geometric Complexity Theory program is that many important circuit lower bound problems in algebraic complexity theory fit naturally into the setting of showing that a point $\mathbf{x}$ lies outside a variety $V$ [MS01, BIL+19]. In this formulation, one considers $V$ to be the closure of a class of polynomials of low complexity, and $\mathbf{x}$ is the coefficient vector of the candidate hard polynomial.

Let $\Delta(V):=\min_{0\neq P\in\mathbf{I}(V)}\{\deg(P)\}$. The quantity $\Delta(V)$ can be thought of as a measure of complexity for the geometry of the variety $V$, though it is a very coarse complexity measure. A recent line of work regarding algebraic natural proofs [FSV18, GKSS17] suggests studying the arithmetic circuit complexity of equations for varieties $V$ that correspond to polynomials with small circuit complexity. Having $\Delta(V)$ grow like a polynomial in $n$ is a necessary (but not a sufficient) condition for a variety $V$ to have an algebraic natural proof for non-containment.

1.2 Rigid matrices

A matrix $M$ is $(r,s)$-rigid if $M$ cannot be written as a sum $R+S$ where $\mathsf{rank}(R)\leq r$ and $S$ contains at most $s$ non-zero entries. Valiant [Val77] proved that if $A$ is $(\varepsilon n,n^{1+\delta})$-rigid for some constants $\varepsilon,\delta>0$ then $A$ cannot be computed by arithmetic circuits of size $O(n)$ and depth $O(\log n)$, and posed the problem of explicitly constructing rigid matrices with these parameters, which is still open. It is easy to prove that most matrices have much stronger rigidity parameters: over algebraically closed fields, a generic matrix is $(r,(n-r)^2)$-rigid for any target rank $r$.

Let $\mathbb{F}$ be an algebraically closed field. Let $A_{r,s}\subseteq\mathbb{F}^{n\times n}$ denote the set of matrices which are not $(r,s)$-rigid. Let $V_{r,s}=\overline{A_{r,s}}$ denote the Zariski closure of $A_{r,s}$. A geometric study of $V_{r,s}$ was initiated by Kumar, Lokam, Patankar and Sarma [KLPS14]. Among other results, they prove that for every $s<(n-r)^2$, $\Delta(V_{r,s})\leq n^{4n^2}$. A slightly improved (but still exponential) upper bound was obtained by Gesmundo, Hauenstein, Ikenmeyer and Landsberg [GHIL16], who also conjectured that for some $\varepsilon,\delta>0$, $\Delta(V_{\varepsilon n,n^{1+\delta}})$ grows like a polynomial function in $n$. The following theorem, which we prove in this note, confirms this conjecture.

Theorem 1.1.

Let $\varepsilon<1/25$, and let $\mathbb{F}$ be a field of size at least $n^2$. For every large enough $n$, there exists a non-zero polynomial $Q\in\mathbb{F}[x_{1,1},\ldots,x_{n,n}]$, of degree at most $n^3$, which is a non-trivial equation for matrices which are not $(\varepsilon n,\varepsilon n^2)$-rigid. That is, for every such matrix $M$, $Q(M)=0$.

In fact, the conjecture of [GHIL16] was slightly weaker: they conjectured that $\Delta(U)$ is polynomial in $n$ for every irreducible component $U$ of $V_{\varepsilon n,n^{1+\delta}}$. As shown by [KLPS14], the irreducible components are in one-to-one correspondence with subsets of $[n]\times[n]$ of size $n^{1+\delta}$, corresponding to the possible supports of the sparse matrix $S$.

As we observe in Remark 2.3, it is somewhat simpler to show that each of these irreducible components has an equation with a polynomial degree bound. However, since the number of such irreducible components is exponentially large, it is not clear if there is a single equation for the whole variety which is of polynomially bounded degree. We do manage to reverse the order of quantifiers and prove such an upper bound in Theorem 1.1. This suggests that the set of non-rigid matrices is much less complex than what one may suspect given the results of [KLPS14, GHIL16].

1.3 Circuits for linear transformations

The original motivation for defining rigidity was in the context of proving lower bounds for algebraic circuits [Val77]. If $A\in\mathbb{F}^{n\times n}$ is an $(\varepsilon n,n^{1+\delta})$-rigid matrix, for any $\varepsilon,\delta>0$, then the linear transformation represented by $A$ cannot be computed by an algebraic circuit of depth $O(\log n)$ and size $O(n)$.

Every algebraic circuit computing a linear transformation is, without loss of generality, a linear circuit. A linear circuit is a directed acyclic graph that has $n$ input nodes labeled $X_1,\ldots,X_n$ and $n$ output nodes. Each edge is labeled by a scalar $\alpha\in\mathbb{F}$. Each node computes a linear function in $X_1,\ldots,X_n$, defined inductively: an internal node $u$ with children $v_1,\ldots,v_k$, connected to it by edges labeled $\alpha_1,\ldots,\alpha_k$, computes the linear function $\sum_i\alpha_i\ell_{v_i}$, where $\ell_{v_i}$ is the linear function computed by $v_i$, $1\leq i\leq k$. The size of the circuit is the number of edges in the circuit.
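
To make the inductive definition concrete, the following minimal sketch (our own illustration; the names and the dictionary representation are not from the paper) evaluates a linear circuit given as a topologically ordered list of nodes and recovers the matrix of the linear transformation it computes.

```python
# A minimal sketch (ours) of how a linear circuit computes a linear
# transformation: nodes are processed in topological order, and each internal
# node computes a scalar combination of its children's linear forms.
import numpy as np

def evaluate_linear_circuit(n, nodes, outputs):
    """nodes: dict mapping node -> list of (child, scalar) edges, assumed to be
    given in topological order; inputs are the strings 'X1', ..., 'Xn'.
    Returns the matrix whose j-th row is the linear form at the j-th output."""
    forms = {f"X{i+1}": np.eye(n)[i] for i in range(n)}   # input i computes e_i
    for node, edges in nodes.items():
        forms[node] = sum(alpha * forms[child] for child, alpha in edges)
    return np.stack([forms[o] for o in outputs])

# Example: two outputs computing X1 + 2*X2 and 3*X1 over n = 2 inputs.
M = evaluate_linear_circuit(
    2, {"g1": [("X1", 1), ("X2", 2)], "g2": [("X1", 3)]}, ["g1", "g2"])
print(M)  # [[1. 2.], [3. 0.]]
```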

It is possible to use techniques similar to those used in the proof of Theorem 1.1 in order to prove a polynomial upper bound on the degree of an equation for a variety containing all matrices $A\in\mathbb{F}^{n\times n}$ whose corresponding linear transformation can be computed by an algebraic circuit of size at most $n^2/200$ (even without restriction on the depth). Note that this is nearly optimal, as any such linear transformation can be computed by a circuit of size $n^2$. More formally, we show the following.

Theorem 1.2.

Let $\mathbb{F}$ be a field of size at least $n^2$. For every large enough $n$, there exists a non-zero polynomial $Q\in\mathbb{F}[x_{1,1},\ldots,x_{n,n}]$, of degree at most $n^3$, which is a non-trivial equation for matrices whose linear transformation can be computed by an algebraic circuit of size at most $n^2/200$.

Our proofs are based on a dimension counting argument, and are therefore non-constructive and do not give explicit equations for the relevant varieties. It thus remains a very interesting open problem to provide explicit low-degree equations for any of the varieties considered in this paper. Here "explicit" means a polynomial which has arithmetic circuits of size $\poly(n)$ (although one may consider other, informal notions of explicitness which could nevertheless be helpful). The question of whether such equations exist has a win-win flavor: if they do, this can aid in explicit constructions of rigid matrices; on the other hand, if all equations are hard, we have identified a family of polynomials which requires super-polynomial arithmetic circuits. Assuming the existence of a polynomial time algorithm for polynomial identity testing, we are able to make this connection formal.

Let $\mathsf{PIT}$ denote the set of strings which describe arithmetic circuits (say, over $\mathbb{C}$) which compute the zero polynomial. It is well known that $\mathsf{PIT}\in\coRP$. Kabanets and Impagliazzo [KI04] proved that certain circuit lower bounds follow from the assumption that $\mathsf{PIT}\in\P$. As a corollary to Theorem 1.2, we are able to prove a theorem of a similar kind.

Corollary 1.3.

Suppose $\mathsf{PIT}\in\P$. Then at least one of the following is true:

  1. There exists a family of $n$-variate polynomials of degree $\poly(n)$ over $\mathbb{C}$, which can be computed (as its list of coefficients, given the input $1^n$) in $\PSPACE$, and which does not have polynomial size constant free arithmetic circuits.

  2. There exists a family of matrices, constructible in polynomial time with an $\NP$ oracle (given the input $1^n$), which requires linear circuits of size $\Omega(n^2)$.

A constant free arithmetic circuit is an arithmetic circuit which is only allowed to use the constants $\{0,\pm 1\}$.

A different way to interpret Corollary 1.3 is as saying that at least one of the following three lower bound results holds: either $\mathsf{PIT}\not\in\P$, or (at least) one of the two circuit lower bounds stated in the corollary. We emphasize that the result holds under any (even so-called white box) derandomization of $\mathsf{PIT}$.

Our statement is similar to, but incomparable with, the result of Kabanets and Impagliazzo [KI04], who proved that if $\mathsf{PIT}\in\P$ then either the permanent does not have polynomial size constant free arithmetic circuits, or $\NEXP\not\subseteq\P/\poly$.

Since matrices which are not $(\varepsilon n,\varepsilon n^2)$-rigid have linear circuits of size $3\varepsilon n^2$, the last item of Corollary 1.3 in particular implies a conditional construction of $(\Omega(n),\Omega(n^2))$-rigid matrices (it is also possible to directly use Theorem 1.1 instead of Theorem 1.2 to deduce this result). Unconditional constructions of rigid matrices in polynomial time with an $\NP$ oracle were recently given in [AC19, BHPT20]. However, the rigidity parameters in these papers are not enough to imply circuit lower bounds (furthermore, even optimal rigidity parameters are not enough to imply $\Omega(n^2)$ lower bounds for general linear circuits).

Since it is widely believed that $\mathsf{PIT}\in\P$, the answer to which of the last two items of Corollary 1.3 holds boils down to the question of whether there exists an equation for non-rigid matrices of degree $\poly(n)$ and circuit size $\poly(n)$. If determining whether a matrix is rigid is $\coNP$-hard (as is known for some restricted ranges of parameters [MS10]), it is tempting to also believe that the equations should not be easily computable, as they provide a "proof" of rigidity which can be verified in randomized polynomial time. However, it could still be the case that those equations that have polynomial size circuits only prove the rigidity of "easy" instances.

1.4 Some basic notions in algebraic geometry

For completeness, in this section we define some basic notions in algebraic geometry. A reader who is familiar with this topic may skip to the next section.

Let $\mathbb{F}$ be an algebraically closed field. A set $V\subseteq\mathbb{F}^n$ is called an affine variety if there exist polynomials $f_1,\ldots,f_t\in\mathbb{F}[x_1,\ldots,x_n]$ such that $V=\{\mathbf{x}:f_1(\mathbf{x})=f_2(\mathbf{x})=\cdots=f_t(\mathbf{x})=0\}$. For convenience, in this paper we often refer to affine varieties simply as varieties.

For each variety $V$ there is a corresponding ideal $\mathbf{I}(V)\subseteq\mathbb{F}[x_1,\ldots,x_n]$ which is defined as

$$\mathbf{I}(V):=\{f\in\mathbb{F}[x_1,\ldots,x_n]:f(\mathbf{x})=0\text{ for all }\mathbf{x}\in V\}.$$

Conversely, for an ideal $I\subseteq\mathbb{F}[x_1,\ldots,x_n]$ we may define the variety

$$\mathbf{V}(I)=\{\mathbf{x}:f(\mathbf{x})=0\text{ for all }f\in I\}.$$

Given a set $A\subseteq\mathbb{F}^n$ we may similarly define the ideal $\mathbf{I}(A)$. The (Zariski) closure of a set $A$, denoted $\overline{A}$, is the set $\mathbf{V}(\mathbf{I}(A))$. In words, the closure of $A$ is the set of common zeros of all the polynomials that vanish on $A$. It is also the smallest variety with respect to inclusion which contains $A$. By construction, $\overline{A}$ is a variety, and a polynomial which vanishes everywhere on $A$ also vanishes on $\overline{A}$.

Over $\mathbb{C}$, it is instructive to think of the Zariski closure of $A$ as the usual Euclidean closure. In fact, for the various sets $A$ we consider in this paper (which correspond to sets of "low complexity" objects, e.g., non-rigid matrices or matrices which can be computed with a small circuit), it can be shown that these two notions of closure coincide (see, e.g., Section 4.2 of [BI17]).

A variety $V$ is called irreducible if it cannot be written as a union $V=V_1\cup V_2$ of varieties $V_1,V_2$ that are properly contained in $V$. Every variety can be uniquely written as a union $V=V_1\cup V_2\cup\cdots\cup V_m$ of irreducible varieties. The varieties $V_1,\ldots,V_m$ are then called the irreducible components of $V$.

2 Degree Upper Bound for Non-Rigid Matrices

In this section, we prove Theorem 1.1. A key component of the proof is the use of the following construction, due to Shpilka and Volkovich, which provides an explicit low-degree polynomial map on a small number of variables, which contains all sparse matrices in its image. For completeness, we provide the construction and prove its basic property.

Lemma 2.1 ([SV15]).

Let $\mathbb{F}$ be a field such that $|\mathbb{F}|>n$. Then for all $k\in\mathbb{N}$, there exists an explicit polynomial map $\mathrm{SV}_{n,k}(\mathbf{x},\mathbf{y}):\mathbb{F}^{2k}\to\mathbb{F}^n$ of degree at most $n$ such that for any subset $T=\{i_1,\ldots,i_k\}\subseteq[n]$ of size $k$, there exists a setting $\mathbf{y}=\boldsymbol{\alpha}$ such that $\mathrm{SV}_{n,k}(\mathbf{x},\boldsymbol{\alpha})$ is identically zero on every coordinate $j\notin T$, and equals $x_j$ in coordinate $i_j$ for all $j\in[k]$.

Proof.

Arbitrarily pick distinct $\alpha_1,\ldots,\alpha_n\in\mathbb{F}$, and let $u_1,\ldots,u_n$ be their corresponding Lagrange interpolation polynomials, i.e., polynomials of degree at most $n-1$ such that $u_i(\alpha_j)=1$ if $j=i$ and $0$ otherwise (more explicitly, $u_i(z)=\frac{\prod_{j\neq i}(z-\alpha_j)}{\prod_{j\neq i}(\alpha_i-\alpha_j)}$).

Let $P_i(x_1,\ldots,x_k,y_1,\ldots,y_k)=\sum_{j=1}^{k}u_i(y_j)\cdot x_j$, and finally let

$$\mathrm{SV}_{n,k}(\mathbf{x},\mathbf{y})=(P_1(\mathbf{x},\mathbf{y}),\ldots,P_n(\mathbf{x},\mathbf{y})).$$

It readily follows that given $T=\{i_1,\ldots,i_k\}$ as in the statement of the lemma, we can set $y_j=\alpha_{i_j}$ for $j\in[k]$ to derive the desired conclusion. The upper bound on the degree follows by inspection. ∎
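
The following small numerical sketch (ours, not code from the paper) builds $\mathrm{SV}_{n,k}$ from the Lagrange polynomials above and checks the routing property of Lemma 2.1 for one choice of support.

```python
# Sanity check of Lemma 2.1: a suitable setting of y routes x_1,...,x_k to a
# chosen support T, and zeroes out every other coordinate.
import numpy as np

def sv_map(n, k, alphas, x, y):
    """Evaluate SV_{n,k}(x, y); alphas are n distinct field elements used to
    define the Lagrange polynomials u_1,...,u_n."""
    def u(i, z):  # u_i(z) = prod_{j != i} (z - a_j) / (a_i - a_j)
        return np.prod([(z - alphas[j]) / (alphas[i] - alphas[j])
                        for j in range(n) if j != i])
    return np.array([sum(u(i, y[j]) * x[j] for j in range(k)) for i in range(n)])

n, k = 7, 3
alphas = list(range(n))                  # any n distinct elements work
T = [1, 4, 6]                            # target support (0-indexed)
x = np.array([10.0, 20.0, 30.0])
y = [alphas[i] for i in T]               # the setting y_j = alpha_{i_j}
print(np.round(sv_map(n, k, alphas, x, y), 6))
# -> coordinates 1, 4, 6 equal 10, 20, 30; all other coordinates are 0
```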

As a step toward the proof of Theorem 1.1, we show there is a polynomial map on much fewer than $n^2$ variables, with degree polynomially bounded in $n$, such that its image contains every non-rigid matrix. In the next step, we show that the image of every such polynomial map has an equation of degree $\poly(n)$.

Lemma 2.2.

There exists an explicit polynomial map $P:\mathbb{F}^{4\varepsilon n^2}\to\mathbb{F}^{n\times n}$, of degree at most $n^2$, such that every matrix $M$ which is not $(\varepsilon n,\varepsilon n^2)$-rigid lies in its image.

Proof.

Let $k=\varepsilon n^2$ and let $\mathbf{u},\mathbf{v},\mathbf{x},\mathbf{y}$ denote disjoint tuples of $k$ variables each.

Let $U$ be a symbolic $n\times\varepsilon n$ matrix whose entries are labeled by the variables $\mathbf{u}$, and similarly let $V$ be a symbolic $\varepsilon n\times n$ matrix labeled by $\mathbf{v}$. Let $\mathrm{UV}(\mathbf{u},\mathbf{v}):\mathbb{F}^{2k}\to\mathbb{F}^{n\times n}$ be the degree $2$ polynomial map defined by the matrix multiplication $UV$.

Finally, let $P:\mathbb{F}^{4k}\to\mathbb{F}^{n\times n}$ be defined as

$$P(\mathbf{u},\mathbf{v},\mathbf{x},\mathbf{y})=\mathrm{UV}(\mathbf{u},\mathbf{v})+\mathrm{SV}_{n^2,k}(\mathbf{x},\mathbf{y}),$$

where $\mathrm{SV}_{n^2,k}$ is as defined in Lemma 2.1.

Suppose now that $M$ is a non-rigid matrix, i.e., $M=R+S$ for $R$ of rank at most $\varepsilon n$ and $S$ which is $\varepsilon n^2$-sparse. Decompose $R=U_0V_0$ for an $n\times\varepsilon n$ matrix $U_0$ and an $\varepsilon n\times n$ matrix $V_0$. Let $T$ denote the support of $S$. For convenience we may assume $|T|=k$ (otherwise, pad with zeros arbitrarily). Let $\boldsymbol{\alpha}\in\mathbb{F}^k$ denote the setting for $\mathbf{y}$ in $\mathrm{SV}_{n^2,k}$ which maps $x_1,\ldots,x_k$ to $T$, and let $\mathbf{s}=(s_1,\ldots,s_k)$ denote the non-zero entries of $S$. Then

$$P(U_0,V_0,\mathbf{s},\boldsymbol{\alpha})=U_0V_0+S=R+S=M.\qquad\qed$$
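
Continuing the previous sketch (and reusing its `sv_map` function), the following illustration, with names of our own choosing, shows how the map of Lemma 2.2 covers a matrix of the form $M=U_0V_0+S$.

```python
# An illustrative sketch (ours) of the universal map of Lemma 2.2: a non-rigid
# matrix M = R + S is hit by choosing the UV part to factor R and the SV part
# to reproduce the sparse matrix S.
import numpy as np

def universal_map(n, k, alphas, U, V, x, y):
    """P(u, v, x, y) = U @ V + SV_{n^2,k}(x, y), reshaped to an n x n matrix.
    U is n x r and V is r x n; sv_map is the function from the previous sketch."""
    sparse_part = sv_map(n * n, k, alphas, x, y).reshape(n, n)
    return U @ V + sparse_part

n, r, k = 6, 2, 4
rng = np.random.default_rng(0)
U0, V0 = rng.integers(-2, 3, (n, r)), rng.integers(-2, 3, (r, n))
support = [3, 10, 20, 31]                       # positions of S, flattened
s = np.array([5.0, -1.0, 2.0, 7.0])             # the non-zero entries of S
alphas = list(range(n * n))
y = [alphas[i] for i in support]                # route x_j to coordinate i_j
M = universal_map(n, k, alphas, U0, V0, s, y)
S = np.zeros(n * n); S[support] = s
print(np.allclose(M, U0 @ V0 + S.reshape(n, n)))  # True: M lies in the image
```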

To complete the proof of Theorem 1.1, we now argue that the image of any polynomial map with parameters as in Lemma 2.2 has an equation of degree at most $n^3$.

Proof of Theorem 1.1.

Let $V_1$ denote the subspace of polynomials over $\mathbb{F}$ in $n^2$ variables of degree at most $n^3$. Let $V_2$ denote the subspace of polynomials over $\mathbb{F}$ in $4\varepsilon n^2$ variables of degree at most $n^5$. Let $P$ be as in Lemma 2.2, and consider the linear transformation $T:V_1\to V_2$ given by $Q\mapsto Q\circ P$, where $Q\circ P$ denotes the composition of the polynomial $Q$ with the map $P$, i.e., $(Q\circ P)(\mathbf{x})=Q(P(\mathbf{x}))$ (indeed, observe that since $\deg(Q)\leq n^3$ and $\deg(P)\leq n^2$, it follows that $\deg(Q\circ P)\leq n^5$).

We have that $\dim(V_1)=\binom{n^3+n^2}{n^2}\geq n^{n^2}$, whereas $\dim(V_2)=\binom{4\varepsilon n^2+n^5}{4\varepsilon n^2}\leq(2n^5)^{4\varepsilon n^2}<\dim(V_1)$ by the choice of $\varepsilon$, so that there exists a non-zero polynomial in the kernel of $T$, that is, $0\neq Q_0\in V_1$ such that $Q_0\circ P\equiv 0$.

It remains to be shown that for any non-rigid matrix $M$, $Q_0(M)=0$. Indeed, let $M$ be a non-rigid matrix. By Lemma 2.2, there exists $\boldsymbol{\beta}\in\mathbb{F}^{4\varepsilon n^2}$ such that $P(\boldsymbol{\beta})=M$. Thus, $Q_0(M)=Q_0(P(\boldsymbol{\beta}))=(Q_0\circ P)(\boldsymbol{\beta})=0$, as $Q_0\circ P\equiv 0$. ∎
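
As a quick sanity check on the counting argument (ours, not part of the paper), one can compare the two dimensions numerically; the parameter choices below are assumptions matching the proof.

```python
# The space V_1 of candidate equations is strictly larger than the space V_2 of
# composed polynomials, so the composition map Q -> Q o P has a non-zero kernel.
from math import comb

def dims(n, eps):
    dim_v1 = comb(n**3 + n**2, n**2)          # degree <= n^3 in n^2 variables
    k = int(4 * eps * n**2)
    dim_v2 = comb(k + n**5, k)                # degree <= n^5 in 4*eps*n^2 variables
    return dim_v1, dim_v2

for n in (10, 15, 20):
    d1, d2 = dims(n, 1 / 26)                  # any eps < 1/25 works
    print(n, d1 > d2)                         # True once n is large enough
```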

Remark 2.3.

If the support of the sparse matrix is fixed a priori to some set $S\subseteq[n]\times[n]$ of cardinality at most $\varepsilon n^2$, then it is easier to come up with a universal map $\tilde{P}:\mathbb{F}^{3\varepsilon n^2}\to\mathbb{F}^{n\times n}$ such that every matrix $M$ whose rank can be reduced to at most $\varepsilon n$ by changing entries in the set $S$ is contained in the image of $\tilde{P}$. Just consider $\tilde{P}(\mathbf{u},\mathbf{v},\mathbf{w})=\mathrm{UV}(\mathbf{u},\mathbf{v})+W$, where $W$ is a matrix such that for all $(i,j)\in[n]\times[n]$, $W(i,j)=w_{i,j}$ if $(i,j)\in S$ and $W(i,j)$ is zero otherwise. Here, each $w_{i,j}$ is a distinct formal variable. Combined with the dimension comparison argument we used in the proof of Theorem 1.1, it can be seen that there is a non-zero low degree polynomial $\tilde{Q}$ such that $\tilde{Q}\circ\tilde{P}\equiv 0$. This argument provides a (different) equation of polynomial degree for each irreducible component of the variety of non-rigid matrices.

Remark 2.4.

It is possible to use the equation given in Theorem 1.1, together with the methods of [KLPS14], to construct "semi-explicit" $(\varepsilon n,\varepsilon n^2)$-rigid matrices. These are matrices whose entries are algebraic numbers (over $\mathbb{Q}$) with a short description, which are non-explicit from the computational complexity point of view. However, such constructions are also known using different methods (see Section 2.4 of [Lok09]).

3 Degree Upper Bound for Matrices with a Small Circuit

In this section, we prove Theorem 1.2. Our strategy, as before, is to observe that all matrices with a small circuit lie in the image of a polynomial map $P$ on a small number of variables and of small degree. Circuits of size $s$ can have many different topologies, and thus we first construct a "universal" linear circuit, of size $s'\leq s^4$, that contains as subcircuits all linear circuits of size $s$. Importantly, $s'$ will affect the degree of $P$ but not its number of variables. We note that this construction of universal circuits is slightly different from similar constructions in earlier work, e.g., in [Raz10]; the key difference being that a naive use of the ideas in [Raz10] to obtain the map $P$ seems to incur an asymptotic increase in the number of variables of $P$, which is unacceptable in our current setting.

3.1 A construction of a universal map for small linear circuits

We now define a map $U(\mathbf{x},\mathbf{y})$ which is "universal" for size $s$ linear circuits, i.e., it contains in its image all $n\times n$ matrices $A$ whose corresponding linear transformation can be computed by a linear circuit of size at most $s$.

Let $s\geq n$. We first define a universal graph $G$ for size $s$. $G$ has a set $V_0$ of $n$ input nodes labeled $X_1,\ldots,X_n$ and a set $V_{s+1}$ of $n$ designated output nodes. In addition, $G$ is composed of $s$ disjoint sets of vertices $V_1,\ldots,V_s$, each containing $s$ vertices.

Each vertex $v\in V_i$, for $0\leq i\leq s+1$, has as its children all vertices $u\in V_j$ for all $0\leq j<i$. It is clear that every directed acyclic graph with $s$ edges (and hence at most $s$ vertices, and depth at most $s$) can be (perhaps non-uniquely) embedded in $G$ as a subgraph.

We now describe the edge labeling. Let $s'\leq s^4$ be the number of edges in $G$ and let $e_i$ denote the $i$-th edge, $1\leq i\leq s'$. The edge $e_i$ is labeled by the $i$-th coordinate of the map $\mathrm{SV}_{s',s}(\mathbf{x},\mathbf{y})$ given in Lemma 2.1.

Thus, the graph $G$ with this labeling computes a linear transformation (over the field $\mathbb{F}(\mathbf{x},\mathbf{y})$) in the variables $X_1,\ldots,X_n$. More explicitly, the $(i,j)$-th entry of the matrix $U(\mathbf{x},\mathbf{y})$ representing this linear transformation is given by the sum, over all paths from $X_i$ to the $j$-th output node, of the product of the edge labels on that path. This entry is a polynomial in $\mathbf{x},\mathbf{y}$, so that we can think of $U$ as a polynomial map from $\mathbb{F}^{2s}$ to $\mathbb{F}^{n^2}$.

Lemma 3.1.

The map $U(\mathbf{x},\mathbf{y})$ defined above contains in its image all $n\times n$ matrices $A$ whose corresponding linear transformation can be computed by a linear circuit of size at most $s$. The degree of $U$ is at most $s'\cdot(s+1)$.

Proof.

Let $A$ be a matrix whose linear transformation is computed by a size $s$ circuit $C$. The graph of $C$ can be embedded as a subgraph of the graph $G$ constructed above (if the embedding is not unique, pick one arbitrarily). Let $e_{i_1},\ldots,e_{i_s}$ be the edges of this subgraph, and let $\boldsymbol{\beta}=(\beta_1,\ldots,\beta_s)$ be their corresponding labels in $C$. By the properties of the map $\mathrm{SV}_{s',s}(\mathbf{x},\mathbf{y})$ given in Lemma 2.1, it is possible to set the tuple of variables $\mathbf{y}$ to field elements $\alpha_1,\ldots,\alpha_s$ such that the $j$-th coordinate of $\mathrm{SV}_{s',s}(\boldsymbol{\beta},\boldsymbol{\alpha})$ equals $\beta_k$ if $j=i_k$ for some $1\leq k\leq s$, and $0$ otherwise. Observe that under this labeling of the edges, the circuit $G$ computes the same transformation as the circuit $C$. Hence $U(\boldsymbol{\beta},\boldsymbol{\alpha})=A$.

To upper bound the degree of $U$, note that each edge label in $G$ is a polynomial of degree at most $s'$, and each path is of length at most $s+1$. ∎
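
The layered structure of $G$ is easy to write down explicitly; the following sketch (our own) simply counts its edges for small parameters and checks the bound $s'\leq s^4$ quoted above.

```python
# Universal graph G for size-s linear circuits: layers V_0 (inputs),
# V_1,...,V_s (s vertices each), V_{s+1} (outputs), with every vertex connected
# to all vertices in all earlier layers. We only count edges here.
def universal_graph_edges(n, s):
    layer_sizes = [n] + [s] * s + [n]          # |V_0|, |V_1|, ..., |V_s|, |V_{s+1}|
    edges = 0
    for i in range(1, len(layer_sizes)):       # each vertex in V_i sees all V_j, j < i
        edges += layer_sizes[i] * sum(layer_sizes[:i])
    return edges

n, s = 4, 10
s_prime = universal_graph_edges(n, s)
print(s_prime, s_prime <= s**4)                # the bound s' <= s^4 used in the text
```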

3.2 Low degree equations for small linear circuits

Analogous to the proof of Theorem 1.1, we now observe via a dimension counting argument that the image of the polynomial map $U(\mathbf{x},\mathbf{y})$ has an equation of degree at most $n^3$. This completes the proof of Theorem 1.2.

Proof of Theorem 1.2.

As before, let $V_1$ denote the subspace of polynomials over $\mathbb{F}$ in $n^2$ variables of degree at most $n^3$. Let $V_2$ denote the subspace of polynomials over $\mathbb{F}$ in $n^2/100$ variables of degree at most $n^{30}$. Let $U$ be the map given by Lemma 3.1 for $s=n^2/200$, so that $s'\leq n^8$ and the degree of $U$ is at most $s'(s+1)\leq n^{10}$. Now, consider the linear transformation $T:V_1\to V_2$ given by $Q\mapsto Q\circ U$.

Once again, we compute that $\dim(V_1)=\binom{n^3+n^2}{n^2}\geq n^{n^2}$, whereas $\dim(V_2)=\binom{n^2/100+n^{30}}{n^2/100}\leq(2n^{30})^{n^2/100}<\dim(V_1)$, so that there exists a non-zero polynomial in the kernel of $T$, that is, $0\neq Q_0\in V_1$ such that $Q_0\circ U\equiv 0$.

By Lemma 3.1, if $A$ has a circuit of size $n^2/200$, it is in the image of $U$, so that $Q_0(A)=0$. ∎

4 Degree Upper Bound for Three Dimensional Tensors

Another algebraic object which is closely related to proving circuit lower bounds is the set of three dimensional tensors of high rank. A three dimensional tensor of rank at least $r$ implies a lower bound of $r$ on any arithmetic circuit computing the bilinear function associated with the tensor. Our arguments also provide a polynomial upper bound on the degree of equations for the set of tensors of (border) rank at most $n^2/300$.

Lemma 4.1.

Let $\mathbb{F}$ be any field. There is a polynomial map $P:\mathbb{F}^{n^3/100}\to\mathbb{F}^{n^3}$ of degree at most $3$ such that every three dimensional tensor $\tau:[n]^3\to\mathbb{F}$ of rank at most $n^2/300$ lies in its image.

Proof.

This follows immediately from the definition.

Indeed, let $r=n^2/300$. Let $\mathbf{u}_1,\ldots,\mathbf{u}_r,\mathbf{v}_1,\ldots,\mathbf{v}_r,\mathbf{w}_1,\ldots,\mathbf{w}_r$ be disjoint $n$-tuples of variables. Let $U$ be a tensor of rank at most $r$ over the ring $\mathbb{F}[\mathbf{u}_1,\ldots,\mathbf{u}_r,\mathbf{v}_1,\ldots,\mathbf{v}_r,\mathbf{w}_1,\ldots,\mathbf{w}_r]$ defined as follows.

$$U(\mathbf{u},\mathbf{v},\mathbf{w})=\sum_{i=1}^{r}\mathbf{u}_i\otimes\mathbf{v}_i\otimes\mathbf{w}_i\,.$$

From the definition of $U$, it can be readily observed that for every tensor $\tau:[n]^3\to\mathbb{F}$ of rank at most $r$, there is a setting $\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{\gamma}$ of the variables in $\mathbf{u},\mathbf{v},\mathbf{w}$ respectively such that $U(\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{\gamma})=\tau$. Moreover, each of the coordinates of $U$ is a polynomial of degree equal to three in the variables in $\mathbf{u},\mathbf{v},\mathbf{w}$. Let $P$ be the degree three polynomial map which maps the variables $\mathbf{u}_1,\ldots,\mathbf{u}_r,\mathbf{v}_1,\ldots,\mathbf{v}_r$ and $\mathbf{w}_1,\ldots,\mathbf{w}_r$ to the coordinates of $U$. ∎
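
As an illustration (ours, not from the paper), the degree-$3$ map can be written down directly: plugging a rank decomposition of a target tensor into $P$ recovers that tensor, so every tensor of rank at most $r$ lies in the image.

```python
# A minimal sketch of the map P(u, v, w) = sum_i u_i ⊗ v_i ⊗ w_i from Lemma 4.1.
import numpy as np

def tensor_map(U, V, W):
    """U, V, W have shape (r, n); returns sum_i u_i ⊗ v_i ⊗ w_i, an n x n x n tensor."""
    return np.einsum('ia,ib,ic->abc', U, V, W)

n, r = 5, 3
rng = np.random.default_rng(1)
U, V, W = (rng.standard_normal((r, n)) for _ in range(3))
tau = tensor_map(U, V, W)              # a tensor of rank at most r, by construction
print(tau.shape, np.allclose(tau, sum(np.einsum('a,b,c->abc', U[i], V[i], W[i])
                                      for i in range(r))))  # (5, 5, 5) True
```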

We now argue that the polynomial map $P$ given by Lemma 4.1 has an equation of not too large degree.

Theorem 4.2.

Let $\mathbb{F}$ be any field. There exists a non-zero polynomial $Q\in\mathbb{F}[x_{1,1,1},\ldots,x_{n,n,n}]$, of degree at most $n^4$, which is a non-trivial equation for three dimensional tensors $\tau:[n]\times[n]\times[n]\to\mathbb{F}$ of rank at most $n^2/300$.

Proof.

As before, let $V_1$ denote the subspace of polynomials over $\mathbb{F}$ in $n^3$ variables of degree at most $n^4$ and let $V_2$ denote the subspace of polynomials over $\mathbb{F}$ in $n^3/100$ variables of degree at most $3n^4$. Let $P$ be the map given by Lemma 4.1. Now, consider the linear transformation $T:V_1\to V_2$ given by $Q\mapsto Q\circ P$.

Observe that $\dim(V_1)=\binom{n^4+n^3}{n^3}\geq n^{n^3}$, whereas $\dim(V_2)=\binom{n^3/100+3n^4}{n^3/100}\leq(2n^4)^{n^3/100}<\dim(V_1)$, so that there exists a non-zero polynomial in the kernel of $T$, that is, $0\neq Q_0\in V_1$ such that $Q_0\circ P\equiv 0$.

By Lemma 4.1, if $\tau$ is a tensor of rank at most $n^2/300$, then it is in the image of $P$, and thus $Q_0(\tau)=0$. ∎

The arguments here also generalize to tensors in higher dimensions. In particular, the following analog of Lemma 4.1 is true.

Lemma 4.3.

Let $\mathbb{F}$ be any field. Then, for all $n,d\in\mathbb{N}$, there is a polynomial map $P:\mathbb{F}^{n^d/100}\to\mathbb{F}^{n^d}$ of degree at most $d$ such that every $d$ dimensional tensor $\tau:[n]^d\to\mathbb{F}$ of rank at most $n^{d-1}/100d$ lies in its image.

Combining this lemma with a dimension comparison argument analogous to that in the proof of Theorem 4.2 gives the following theorem. We skip the details of the proof.

Theorem 4.4.

For every field $\mathbb{F}$ and for all $n,d\in\mathbb{N}$, there exists a non-zero polynomial $Q$ on $n^d$ variables and of degree at most $n^{2d}$, which is a non-trivial equation for $d$ dimensional tensors $\tau:[n]^d\to\mathbb{F}$ of rank at most $n^{d-1}/100d$.

We remark that similar methods can be used to prove the existence of an equation of degree $\poly(n)$ for three dimensional tensors of slice rank (see, e.g., [BIL+19]) at most, say, $n/1000$. The existence of such equations was proved (using different techniques) in [BIL+19].

5 Applications to Circuit Lower Bounds

In this section we prove Corollary 1.3. The strategy of the proof is simple: the proof of Theorem 1.2 implies a $\PSPACE$ algorithm which produces a sequence of polynomials which are equations for the set of matrices with small linear circuits. If those equations require large circuits, we are done, and if not, then there exists an equation with small circuits which (assuming $\mathsf{PIT}\in\P$) can be found using an $\NP$ oracle. Using, once again, the assumption that $\mathsf{PIT}\in\P$, we can also deterministically find a matrix on which the equation evaluates to a non-zero value, which implies that the matrix requires large linear circuits.

There are some technical difficulties involved with this plan, which we now describe. The first problem is that even arithmetic circuits of small size can have a large description as bit strings, due to the field constants appearing in the circuits. To prevent this issue, we only consider constant free arithmetic circuits, which are only allowed inputs labeled by $\{0,\pm 1\}$ (but can still compute other constants in the circuit using arithmetic operations).

The second problem is that, in order to be able to find a non-zero point of the equation in the last step of the algorithm (using the mere assumption that $\mathsf{PIT}\in\P$), we need not only the size of the circuit but also its degree to be bounded by $\poly(n)$. Of course, by Theorem 1.2 there exists such a circuit, but we need to be able to prevent a malicious prover from providing us with a $\poly(n)$ size circuit of exponential degree, and it is not known how to compute the degree of a circuit in deterministic polynomial time, even assuming $\mathsf{PIT}\in\P$. To solve this issue, we use an idea of Malod and Portier [MP08], who showed that any polynomial with a circuit of size $\poly(n)$ and degree $d$ also has a multiplicatively disjoint (MD) circuit of size $\poly(n,d)$. An MD circuit is a circuit in which every multiplication gate multiplies two disjoint subcircuits. This is a syntactic notion which is easy to verify efficiently and deterministically, and an MD circuit of size $s$ is guaranteed to compute a polynomial of degree at most $s$.

A final technical issue is that the notion of MD circuits does not fit perfectly within the framework of constant free circuits. Therefore we use the notion of "almost MD" circuits, which allow for the case in which the inputs to a multiplication gate are not disjoint, as long as at least one of them is the root of a subcircuit in which only constants appear.

Definition 5.1.

We say a gate $v$ in a circuit is constant producing (CP) if in the subcircuit rooted at $v$, all input nodes are field constants.

An almost-MD circuit is a circuit where every multiplication gate either multiplies two disjoint subcircuits, or at least one of its children is constant producing.

Lemma 5.2.

Suppose $f$ is an $n$-variate polynomial of degree $\poly(n)$ which has a constant free arithmetic circuit of size $\poly(n)$. Then $f$ has a constant free almost-MD circuit of size $\poly(n)$.

Proof.

Let $C_0$ be a constant free arithmetic circuit for $f$. We first homogenize the circuit $C_0$ to obtain a circuit $C_1$ (a homogeneous circuit is a circuit in which every gate computes a homogeneous polynomial [SY10]). Since $C_1$ is homogeneous, all the gates which compute non-zero field constants are CP gates. We then eliminate all gates which compute constants by allowing the edges entering sum gates to be labeled by field scalars, and interpreting a sum gate as computing a linear combination whose coefficients are given by the edge labels. We call this circuit $C_2$. This step does not maintain constant-freeness. However, every label appearing on the edges of $C_2$ was computed in $C_1$, so it can be computed by a constant-free arithmetic circuit of polynomial size.

We now apply the transformation detailed in [MP08] to $C_2$ to obtain an MD circuit $C_3$, which has labels on the edges. This step does not produce new constants. Finally, we convert $C_3$ to an almost-MD constant free circuit $C_4$, by re-computing every label appearing on an edge using a fresh subcircuit for each label, and rewiring the circuit (which turns the MD circuit into an almost-MD circuit). These subcircuits are guaranteed to have polynomial size constant free circuits, since these constants were all computed in $C_0$, which keeps the total size $\poly(n)$. ∎

For circuits which compute low-degree polynomials, the mere existence of an algorithm for the decision version of PIT allows one to construct an algorithm for the search version.

Lemma 5.3.

Suppose $\mathsf{PIT}\in\P$. Then there is a deterministic algorithm that, given a non-zero almost-MD arithmetic circuit $C$ of size $s$ computing an $n$-variate polynomial, finds in time $\poly(n,s)$ an element $\mathbf{a}\in\mathbb{C}^n$ such that $C(\mathbf{a})\neq 0$.

Proof.

We abuse notation by denoting by $C$ also the polynomial computed by the circuit $C$. Note that since $C$ is almost-MD, the degree of $C$ is at most $s$. Thus, there exists $a_1\in\{0,1,\ldots,s\}$ such that $C(a_1,x_2,\ldots,x_n)$ is a non-zero polynomial in $x_2,\ldots,x_n$. By iterating over those $s+1$ values from $0$ to $s$ and using the assumption that $\mathsf{PIT}\in\P$, we can find such a value for $a_1$ in time $\poly(n,s)$. We then continue in the same manner with the rest of the variables. ∎
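
The variable-by-variable search can be sketched as follows (our own schematic; `is_nonzero` and `restrict` are hypothetical interfaces standing in for a PIT oracle and for substitution into a circuit).

```python
# Search-to-decision reduction of Lemma 5.3: fix the variables one at a time,
# trying the values 0,...,s for each; a non-zero polynomial of degree at most s
# always has a good value in that range.
def find_nonzero_point(circuit, n, s, is_nonzero, restrict):
    """is_nonzero(C) decides whether C computes a non-zero polynomial (the PIT
    oracle); restrict(C, i, a) returns C with variable x_i fixed to the value a."""
    assignment = []
    for i in range(n):
        for a in range(s + 1):                   # degree <= s, so some value works
            candidate = restrict(circuit, i, a)
            if is_nonzero(candidate):            # still non-zero after fixing x_i = a
                circuit = candidate
                assignment.append(a)
                break
        else:
            raise ValueError("circuit was identically zero")
    return assignment                            # a point where the circuit is non-zero
```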

As we noted above, the assumption that $C$ is almost-MD was used in Lemma 5.3 to bound the degree of the circuit. It is also useful because it is easy to decide in deterministic polynomial time whether a circuit is almost-MD. We now complete the proof of Corollary 1.3.

Proof of Corollary 1.3.

For every $n$, the proof of Theorem 1.2 provides an equation $Q_n$ for the set of $n\times n$ matrices with small linear circuits. This polynomial can be found by solving a linear system of equations in a linear space whose dimension is $\exp(\poly(n))$. Using standard small-space algorithms for linear algebra [BvzGH82, ABO99], this implies that there exists a fixed $\PSPACE$ algorithm which, on input $1^n$, outputs the list of coefficients of the polynomial $Q_n$.

Consider now the family $\{Q_n\}_{n\in\mathbb{N}}$. If for every constant $k\in\mathbb{N}$ there exist infinitely many $n\in\mathbb{N}$ such that $Q_n$ requires circuits of size at least $n^k$, it follows (by definition) that the $\PSPACE$ algorithm above outputs a family of polynomials which requires constant free arithmetic circuits of super-polynomial size.

We are thus left to consider the case that there exists a constant $k\in\mathbb{N}$ such that for all large enough $n\in\mathbb{N}$, $Q_n$ can be computed by circuits of size $n^k$. By Lemma 5.2, we may assume without loss of generality that these circuits are almost-MD circuits. Further suppose that $\mathsf{PIT}\in\P$. We will show how to construct, in polynomial time with an $\NP$ oracle, a matrix which requires large linear circuits.

Consider the language $L$ of pairs $(1^n,x)$ such that there exists a string $y$ of length at most $n^k$ such that $xy$ describes a non-zero almost-MD circuit $C$ with $C\circ U\equiv 0$, where $U$ is the polynomial map given in the proof of Theorem 1.2.

Assuming $\mathsf{PIT}\in\P$, the language $L$ is in $\NP$, and by assumption, for every large enough $n$ such a circuit exists. Thus, we can use the $\NP$ oracle to construct such a circuit $C$ bit by bit. Finally, using Lemma 5.3, we can output a matrix $M$ such that $C(M)\neq 0$.

By the properties of the circuit $C$ and the map $U$, $M$ does not have linear circuits of size less than $n^2/200$. ∎
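
The bit-by-bit use of the $\NP$ oracle mentioned above is the standard search-to-decision trick; a schematic sketch (ours, with a hypothetical oracle `in_L`) is given below.

```python
# Given an NP oracle for the language L, extend a prefix x one bit at a time
# while (1^n, x) remains in L, until x is a complete description of a circuit C
# with the desired properties.
def construct_circuit_description(n, max_len, in_L):
    """in_L(n, x) is the assumed NP-oracle call deciding whether the prefix x
    extends to a valid description, i.e., whether (1^n, x) is in L."""
    x = ""
    while len(x) < max_len:
        if in_L(n, x + "0"):
            x += "0"
        elif in_L(n, x + "1"):
            x += "1"
        else:
            break                      # x is already a complete description
    return x
```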

Many variations of Corollary 1.3 can be proved as well, with virtually the same proof. By slightly modifying the language $L$ used in the proof, it is possible to prove the same result even under the assumption $\mathsf{PIT}\in\NP$ (recall that $\mathsf{PIT}\in\coRP$). A similar statement also holds over finite fields of size $\poly(n)$, in which case the proof is simpler since there are no issues related to the bit complexity of the field constants. Finally, an analog of Corollary 1.3 also holds for tensor rank, by using Theorem 4.2 instead of Theorem 1.2: that is, assuming $\mathsf{PIT}\in\P$, either there exists a construction of a hard polynomial in $\PSPACE$, or an efficient construction with an $\NP$ oracle of a 3-dimensional tensor of rank $\Omega(n^2)$. We remark that for tensors of large rank there are no analogs of [AC19, BHPT20], i.e., there do not exist even constructions with an $\NP$ oracle of tensors with slightly super-linear rank.

References