
Northeastern University, [email protected]
ACM classification: Theory of computation → Problems, reductions and completeness
Funding: Supported by NSF CCF award 1813930.

Acknowledgements.
The author thanks Emanuele Viola for helpful discussions.

Space Hardness of Solving Structured Linear Systems

Xuangui Huang
Abstract

We show that if the probabilistic logarithmic-space solver or the deterministic nearly logarithmic-space solver for undirected Laplacian matrices can be extended to solve slightly larger subclasses of linear systems, then they can be used to solve all linear systems with similar space complexity. Previously, Kyng and Zhang [7] proved similar results in the time complexity setting using reductions between approximate solvers. We prove that their reductions can be implemented using constant-depth, polynomial-size threshold circuits.

keywords:
linear system solver, logarithmic space, threshold circuit

1 Introduction

One of the oldest mathematical problems is to solve a linear system, that is, to find a solution $\mathbf{x}$ satisfying $\mathbf{A}\mathbf{x}=\mathbf{b}$, given an $n\times n$ matrix $\mathbf{A}$ and an $n$-dimensional vector $\mathbf{b}$ as input. In the Real RAM model this can be done in $O(n^{\omega})$ time, where $\omega\leq 2.3728$ [4] is the matrix multiplication constant. Much faster algorithms exist for approximately solving linear systems when $\mathbf{A}$ is the Laplacian of an undirected graph: recent breakthroughs showed that this can be done in nearly linear time [13, 2]. Kyng and Zhang [7] further showed that if such solvers can be extended to nearly-linear-time solvers for some classes slightly larger than undirected Laplacians, then all linear systems can be solved in nearly linear time.

In this paper we are interested in the space complexity of this problem. For general linear systems, Ta-Shma gave a quantum algorithm using logarithmic space [14]. For undirected Laplacians, Doron et al. gave a probabilistic logarithmic-space algorithm [3], and hence a deterministic $O(\log^{3/2}n)$-space algorithm by a well-known space-efficient derandomization result [12]. This was later improved to $\widetilde{O}(\log n)$ space by Murtagh et al. [10].

1.1 Our results

We prove a space-hardness version of Kyng and Zhang's results [7], showing space hardness of approximate linear system solvers for some classes slightly larger than undirected Laplacians, namely multi-commodity Laplacians, 2-D Truss Stiffness Matrices, and Total Variation Matrices.

Theorem 1.1.

Suppose that for multi-commodity Laplacians, 2-D Truss Stiffness Matrices, or Total Variation Matrices, the linear system $\mathbf{A}\mathbf{x}=\mathbf{b}$ can be approximately solved in (nearly) logarithmic space with logarithmic dependence on the condition number $\kappa$ and the accuracy $\varepsilon^{-1}$ (even if the solver only works in expectation or with high probability). Then any linear system with polynomially bounded integer entries and condition number can be solved in (nearly) logarithmic space with high accuracy (in expectation or with high probability, respectively).

This shows that if the probabilistic logspace solver for undirected Laplacians in [3], or the deterministic $\widetilde{O}(\log n)$-space one in [10], can be extended to solve any of these three slightly larger subclasses of linear systems, we would have the surprising result that all linear systems can be approximately solved in probabilistic logspace or in deterministic $\widetilde{O}(\log n)$ space, respectively. Pessimistically, the theorem means that it is very hard to obtain space-efficient algorithms for these subclasses, as that is as difficult as solving all linear systems space-efficiently. On the bright side, we actually prove that any progress on solving these subclasses using less space immediately translates into similar progress for solving all linear systems.

Kyng and Zhang [7] proved their results via reductions from approximate solvers of general linear systems to those of the three subclasses. In this paper we prove Theorem 1.1 by showing that their reductions can be carried out in a space-efficient manner. Indeed we prove the much stronger result that their reductions can be implemented by $\mathsf{TC}^0$ circuits, which are constant-depth, polynomial-size, unbounded fan-in circuits with $\mathsf{MAJORITY}$ and $\neg$ gates. This shows that these reductions are in fact highly parallelizable.

We denote by $\mathcal{G}$ the class of all matrices with integer-valued entries. In the context of solving linear systems, an all-zero row or column can be handled trivially, so we can assume without loss of generality that matrices in $\mathcal{G}$ have at least one non-zero entry in every row and column. For 2-commodity matrices $\mathcal{MC}_2$, we have two sets of variables $X$ and $Y$ of the same size, and the equations are scalings of $x_i-x_j=0$, $y_i-y_j=0$, and $x_i-y_i-(x_j-y_j)=0$, where $x_i,x_j\in X$ and $y_i,y_j\in Y$. This generalizes undirected Laplacians, whose incidence matrices only have equations of the form $x_i-x_j=0$ for $x_i,x_j\in X$.

Our main technical result proves that the reduction from $\mathcal{G}$ to $\mathcal{MC}_2$ in [7] is $\mathsf{TC}^0$-computable.

Theorem 1.2.

There is a $\mathsf{TC}^0$-reduction from approximately solving $\mathcal{G}$ to approximately solving $\mathcal{MC}_2$.

In [7] it is shown that the matrix produced by this reduction is also a 2-D Truss Stiffness Matrix as well as a Total Variation Matrix, so Theorem 1.2 also applies to these classes. Also note that this reduction is Karp-style, i.e. it requires only one linear system solve and uses the solution in a black-box way; this is why Theorem 1.1 still applies if the solver only works in expectation or with high probability.

We also show $\mathsf{TC}^0$-computability of the reductions in [7] to some more restrictive subclasses of $\mathcal{MC}_2$, including $\mathcal{MC}^{>}_2$, the exact class we have to solve when using Interior Point Methods for 2-commodity problems, as explained in Kyng and Zhang's full paper [8]. They also showed that the promise problem of deciding whether a vector is in the image of a matrix or $\varepsilon$-far from the image can be reduced directly to approximately solving linear systems. Combined with the above results, this shows that the promise problem can be reduced in $\mathsf{TC}^0$ to approximately solving the above-mentioned subclasses.

2 Simplified reductions in $\mathsf{TC}^0$: the easy parts

Throughout this paper we use the sign-magnitude representation to encode a $w$-bit signed integer $z\in[-2^{w}+1,2^{w}-1]$ as a sign bit, $0$ for positive and $1$ for negative, followed by $w$ bits for $|z|$ in binary. Note that in this representation $0$ has two encodings, and we accept both.
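As a concrete illustration, this encoding can be sketched in Python as follows (the helper names are ours, not from the paper):

```python
def encode_sm(z: int, w: int) -> str:
    """Encode a signed integer with |z| <= 2**w - 1 as 1 sign bit + w magnitude bits."""
    assert abs(z) <= 2 ** w - 1
    sign = '0' if z >= 0 else '1'            # 0 encodes positive, 1 encodes negative
    return sign + format(abs(z), f'0{w}b')   # magnitude of z in binary, w bits

def decode_sm(bits: str) -> int:
    """Inverse of encode_sm; both encodings of 0 decode to 0."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == '1' else magnitude
```

For instance, `encode_sm(-3, 4)` gives `'10011'`, and both `'00000'` and `'10000'` decode to $0$.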

For simplicity we first prove the special case of the reduction from an exact solver for $\mathcal{G}$ to an exact solver for $\mathcal{MC}_2$.

Theorem 2.1.

Given an $m\times n$ $w$-bit matrix $\mathbf{A}\in\mathcal{G}$ and a vector $\mathbf{b}$ as input, we can compute in $\mathsf{TC}^0$ an $O(nmw\log n)\times O(nmw\log n)$ matrix $\mathbf{A}'\in\mathcal{MC}_2$ with $O(w\log n)$-bit entries and a vector $\mathbf{b}'$ such that:

  • $\mathbf{A}\mathbf{x}=\mathbf{b}$ has a solution if and only if $\mathbf{A}'\mathbf{x}'=\mathbf{b}'$ has a solution;

  • if $\mathbf{A}'\mathbf{x}'=\mathbf{b}'$ has a solution $\mathbf{x}'$, then we can compute in $\mathsf{TC}^0$ from $\mathbf{x}'$ a vector $\mathbf{x}$ such that $\mathbf{A}\mathbf{x}=\mathbf{b}$.

As in [7], this reduction is split into several steps using the following classes of matrices.

Definition 2.2 ([7]).
  • Let $\mathcal{G}_z\subset\mathcal{G}$ denote the class of matrices with integer-valued entries such that every row has zero row sum;

  • Let $\mathcal{G}_{z,2}\subset\mathcal{G}_z$ denote the class of matrices with integer-valued entries such that every row has zero row sum, and for each row the sum of the positive coefficients is a power of $2$.

Lemma 2.3.

There are $\mathsf{TC}^0$-reductions for exact solvers of the following classes:

  (i) from $\mathcal{G}$ to $\mathcal{G}_z$;

  (ii) from $\mathcal{G}_z$ to $\mathcal{G}_{z,2}$;

  (iii) from $\mathcal{G}_{z,2}$ to $\mathcal{MC}_2$.

Lemma 2.3 (iii) is the main reduction in this paper (as in [7]); it will be proved in the next section. In the remainder of this section we prove Lemma 2.3 (i) and (ii).

From $\mathcal{G}$ to $\mathcal{G}_z$

Proof 2.4 (Proof sketch).

Given a matrix $\mathbf{A}\in\mathcal{G}$, define a matrix $\mathbf{A}'$ with one more column by $\mathbf{A}'=\begin{pmatrix}\mathbf{A}&-\mathbf{A}\mathbf{1}\end{pmatrix}$. Obviously $\mathbf{A}'\in\mathcal{G}_z$, and $\mathbf{A}\mathbf{x}=\mathbf{b}$ has a solution if and only if $\mathbf{A}'\mathbf{x}'=\mathbf{b}$ has a solution. We can recover $\mathbf{x}\in\mathbb{R}^n$ from $\mathbf{x}'\in\mathbb{R}^{n+1}$ by taking the first $n$ entries of $\mathbf{x}'$ and subtracting $\mathbf{x}'_{n+1}$ from each of them. The following facts about addition imply that $\mathbf{A}'$ can be computed in $\mathsf{TC}^0$, and that $\mathbf{x}$ can be recovered from $\mathbf{x}'$ in $\mathsf{AC}^0$ (for simplicity we ignore precision issues here).

Fact 1.
  • Addition of $2$ $w$-bit numbers has an $\mathsf{AC}^0$ circuit of size $\mathrm{poly}(w)$ (cf. [1]);¹

  • Iterated addition of $n$ $w$-bit numbers has a $\mathsf{TC}^0$ circuit of size $\mathrm{poly}(n,w)$ [6].

¹ $\mathsf{AC}^0$ circuits are constant-depth, polynomial-size, unbounded fan-in circuits with $\wedge$, $\vee$, and $\neg$ gates.
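The linear-algebra content of this step can be sketched in plain Python over integer row lists (the function names are ours):

```python
def to_zero_row_sum(A):
    """Reduction G -> G_z: append the column -A·1, so every row sums to zero."""
    return [row + [-sum(row)] for row in A]

def recover(x_prime):
    """Recover a solution x of Ax = b from a solution x' of A'x' = b:
    take the first n entries and subtract the last entry from each."""
    return [xi - x_prime[-1] for xi in x_prime[:-1]]
```

The recovery works because $\mathbf{A}'\mathbf{x}'=\mathbf{A}(\mathbf{x}'_{1..n}-\mathbf{x}'_{n+1}\mathbf{1})$, so any solution of the padded system yields a solution of the original one.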

From $\mathcal{G}_z$ to $\mathcal{G}_{z,2}$

Proof 2.5 (Proof sketch).

Given an $m\times n$ $w$-bit signed-integer matrix $\mathbf{A}'\in\mathcal{G}_z$, we just need to add two more columns to bring the sum of positive (and negative) entries in each row to the closest power of $2$. This can be done in $\mathsf{TC}^0$ as follows. For each row $1\leq i\leq m$, we compute the sum $s_i$ of its positive entries by checking the sign bits and then doing iterated addition in $\mathsf{TC}^0$ by Fact 1. We then take $s=\max_i s_i$, which is computable in $\mathsf{AC}^0$ given the $s_i$'s. Since $s$ has at most $O(w\log n)$ bits, a brute-force search in $\mathsf{AC}^0$ finds the minimum $k$ such that $2^k\geq s$. By Fact 1 we can then compute $a_i=2^k-s_i$ in $\mathsf{AC}^0$ given $s_i$. For each row $i$ we append $a_i$ and $-a_i$ as the last two entries to get $\mathbf{A}''$. Additionally we add a new row to $\mathbf{A}''$ to zero out the two new variables: its last two entries are $1$ and $-1$ and all other entries are $0$; accordingly we also append a $0$ entry to $\mathbf{b}'$ to get $\mathbf{b}''$.

Obviously we have $\mathbf{A}''\in\mathcal{G}_{z,2}$, and $\mathbf{A}'\mathbf{x}'=\mathbf{b}'$ has a solution iff $\mathbf{A}''\mathbf{x}''=\mathbf{b}''$ has a solution. We can easily recover $\mathbf{x}'$ from $\mathbf{x}''$ by taking the first $n$ entries.
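A sketch of this padding step in Python, operating on integer row lists (names ours; the right-hand side bookkeeping is omitted):

```python
def to_power_of_two_rows(A):
    """Reduction G_z -> G_{z,2}: append entries a_i = 2^k - s_i and -a_i to each row,
    where s_i is the positive sum of row i and 2^k is the least power of 2 >= max s_i,
    then add one row forcing the two new variables to zero out."""
    s = [sum(e for e in row if e > 0) for row in A]   # positive sums s_i
    k = 0
    while (1 << k) < max(s):                          # minimum k with 2^k >= max_i s_i
        k += 1
    out = [row + [(1 << k) - si, si - (1 << k)] for row, si in zip(A, s)]
    out.append([0] * len(A[0]) + [1, -1])             # new row zeroing out the new variables
    return out
```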

For the next section we need the following definition.

Definition 2.6.

We say a matrix $\mathbf{A}\in\mathcal{G}_z$ is $w$-bounded if for each row the sum of the positive coefficients is at most $2^w$. Note that in such a matrix every entry is a $w$-bit signed integer.

Note that the reduction from $\mathcal{G}$ to $\mathcal{G}_z$ turns an $m\times n$ $w$-bit matrix $\mathbf{A}$ into an $m\times(n+1)$ $O(w\log n)$-bounded matrix $\mathbf{A}'$, and the reduction from $\mathcal{G}_z$ to $\mathcal{G}_{z,2}$ then turns $\mathbf{A}'$ into an $(m+1)\times(n+3)$ $O(w\log n)$-bounded matrix $\mathbf{A}''$.

3 Simplified main reduction in $\mathsf{TC}^0$

The main reduction in [7] uses a pair-and-replace scheme to transform a linear system in $\mathcal{G}_{z,2}$ into a $2$-commodity equation system. The main idea is to use $\mathcal{MC}_2\mathtt{Gadget}$s consisting of $\mathcal{MC}_2$ equations to replace pairs of variables in the original linear system round by round, according to the bit representations of their coefficients. A simplified version of the gadget, implicit in [7], is as follows.

Definition 3.1 (Simplified $\mathcal{MC}_2\mathtt{Gadget}$).

Define $\mathcal{MC}_2\mathtt{Gadget}(t,t',j_1,j_2)$ to be the following set of 2-commodity linear equations representing "$2x_t=x_{j_1}+x_{j_2}$":

$$\begin{aligned}
x_t-x_{t'+1}&=0\\
x_{t'+2}-x_{j_2}&=0\\
y_{t'+1}-y_{t'+3}&=0\\
y_{t'+4}-y_{t'+2}&=0\\
x_{t'+3}-x_{j_1}&=0\\
x_t-x_{t'+4}&=0\\
x_{t'+4}-y_{t'+4}-(x_{t'+3}-y_{t'+3})&=0\\
x_{t'+1}-y_{t'+1}-(x_{t'+2}-y_{t'+2})&=0.
\end{aligned}$$

For convenience we use the extra parameter $t'$ to keep track of the new variables that are used only inside the gadgets.

Correctness of this gadget can be verified easily by summing up all of its equations: every internal variable cancels and the sum telescopes to $2x_t-x_{j_1}-x_{j_2}=0$.
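This check can be mechanized by summing the coefficient vectors of the eight equations; a sketch (the `Counter`-based representation and names are ours):

```python
from collections import Counter

def mc2_gadget(t, tp, j1, j2):
    """The eight equations of Definition 3.1, each as a mapping variable -> coefficient."""
    x = lambda i: ('x', i)
    y = lambda i: ('y', i)
    return [
        Counter({x(t): 1, x(tp + 1): -1}),
        Counter({x(tp + 2): 1, x(j2): -1}),
        Counter({y(tp + 1): 1, y(tp + 3): -1}),
        Counter({y(tp + 4): 1, y(tp + 2): -1}),
        Counter({x(tp + 3): 1, x(j1): -1}),
        Counter({x(t): 1, x(tp + 4): -1}),
        Counter({x(tp + 4): 1, y(tp + 4): -1, x(tp + 3): -1, y(tp + 3): 1}),
        Counter({x(tp + 1): 1, y(tp + 1): -1, x(tp + 2): -1, y(tp + 2): 1}),
    ]

def gadget_sum(t, tp, j1, j2):
    """Sum all eight equations; Counter.update adds coefficients without dropping negatives."""
    total = Counter()
    for eq in mc2_gadget(t, tp, j1, j2):
        total.update(eq)
    return {v: c for v, c in total.items() if c != 0}
```

The surviving coefficients are exactly those of $2x_t-x_{j_1}-x_{j_2}$.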

We now present, as Algorithm 1, a simplified reduction from exactly solving $\mathcal{G}_{z,2}$ to exactly solving $\mathcal{MC}_2$. We use $\mathbf{A}_i$ to denote the $i$-th row of $\mathbf{A}$.

Algorithm 1: Simplified $\textsc{Reduce}\mathcal{G}_{z,2}\textsc{To}\mathcal{MC}_2$

Input: a $w$-bounded $m\times n$ matrix $\mathbf{A}\in\mathcal{G}_{z,2}$ and $\mathbf{c}\in\mathbb{R}^m$.
Output: a $w$-bit $m'\times n'$ matrix $\mathbf{B}\in\mathcal{MC}_2$ and $\mathbf{d}\in\mathbb{R}^{m'}$.

 1: $m'\leftarrow m$, $n'\leftarrow n$
 2: $n_g\leftarrow 0$  // number of new variables used only in the $\mathcal{MC}_2\mathtt{Gadget}$s
 3: for $i\leftarrow 1$ to $m$ do
 4:   for $s\in\{+,-\}$ do
 5:     for $k\leftarrow 1$ to $w$ do
 6:       while strictly more than one entry in $\mathbf{A}_i$ has sign $s$ and the $k$-th bit equal to $1$ do
 7:         let $j_1,j_2$ be the first and second indices of such entries in $\mathbf{A}_i$
 8:         in $\mathbf{A}_i$, replace $2^k(x_{j_1}+x_{j_2})$ by $2^{k+1}x_{n'+1}$ (thus adding one column to the right of $\mathbf{A}$)
 9:         add the coefficients of $\mathcal{MC}_2\mathtt{Gadget}(n'+n_g+1,\,n'+n_g+1,\,j_1,\,j_2)$ to $\mathbf{C}$
10:         $n'\leftarrow n'+1$; $n_g\leftarrow n_g+4$; $m'\leftarrow m'+8$
11:       end while
12:     end for
13:   end for
14: end for
15: $n'\leftarrow 2\times(n'+n_g)$
16: stack $\mathbf{C}$ below $\mathbf{A}$ and fill in $0$'s to get $\mathbf{B}$
17: append $m'-m$ $0$'s to $\mathbf{c}$ to get $\mathbf{d}$
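The elimination loop above can be sketched in Python, tracking a single row as a sparse map from variable index to coefficient and recording each gadget only as a triple $(t,j_1,j_2)$ meaning $2x_t=x_{j_1}+x_{j_2}$ (the gadget equations themselves and the $n_g$, $m'$ bookkeeping are omitted; names ours):

```python
def pair_and_replace(coeffs, n, w):
    """Eliminate paired bits in one row of a G_{z,2} system, as in Algorithm 1.
    coeffs: dict mapping variable index -> signed integer coefficient."""
    coeffs = dict(coeffs)
    gadgets = []                                 # (t, j1, j2) for 2*x_t = x_j1 + x_j2
    next_var = n + 1
    for sign in (+1, -1):
        for k in range(w):                       # bit position k (0-indexed here)
            while True:
                hits = [j for j in sorted(coeffs)
                        if coeffs[j] * sign > 0 and (abs(coeffs[j]) >> k) & 1]
                if len(hits) < 2:
                    break
                j1, j2 = hits[0], hits[1]        # first and second such entries
                for j in (j1, j2):               # remove the 2^k contribution of x_j
                    coeffs[j] -= sign << k
                    if coeffs[j] == 0:
                        del coeffs[j]
                coeffs[next_var] = sign << (k + 1)   # fresh variable, coefficient 2^(k+1)
                gadgets.append((next_var, j1, j2))
                next_var += 1
    return coeffs, gadgets
```

On the row $3x_1+5x_2+x_3+7x_4-16x_5=0$ this reproduces exactly the gadget list of Example 3.3 below.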

Note that in $\mathcal{MC}_2$ we have two sets of input variables $X$ and $Y$ of the same size; this is why we multiply $n'$ by $2$ at the end of the reduction. Here, however, we only use $Y$ variables inside the $\mathcal{MC}_2\mathtt{Gadget}$s, and all other $Y$ variables are unused. For convenience we arrange the variables in the following way.

Remark 3.2 (Arrangement of variables).

In $\mathbf{B}$ we put all variables in $X$ before those in $Y$. More importantly, we put the $X$ variables that are used only inside the gadgets after all the variables $x_{n'+1}$ introduced in the replacement step of Algorithm 1. Equivalently, this can be viewed as the following process. We first run Algorithm 1 virtually, stopping before the final doubling of $n'$, to obtain $n'$, the number of $X$ variables ignoring those used only in the gadgets. Then we run it again on the original input, starting with $n_g$ set to this value, and in the gadget-creation step we use $\mathcal{MC}_2\mathtt{Gadget}(n'+1,n_g,j_1,j_2)$ instead.

We give an example showing how the reduction works under this arrangement.

Example 3.3.

We show how the reduction runs on $3x_1+5x_2+x_3+7x_4-16x_5=0$, writing each coefficient magnitude in $5$-bit binary:

$$\begin{aligned}
&00011\,x_1+00101\,x_2+00001\,x_3+00111\,x_4-10000\,x_5=0\\
\xrightarrow{x_1+x_2-2x_6=0}\ &00010\,x_1+00100\,x_2+00001\,x_3+00111\,x_4+00010\,x_6-10000\,x_5=0\\
\xrightarrow{x_3+x_4-2x_7=0}\ &00010\,x_1+00100\,x_2+00110\,x_4+00010\,x_6+00010\,x_7-10000\,x_5=0\\
\xrightarrow{x_1+x_4-2x_8=0}\ &00100\,x_2+00100\,x_4+00010\,x_6+00010\,x_7+00100\,x_8-10000\,x_5=0\\
\xrightarrow{x_6+x_7-2x_9=0}\ &00100\,x_2+00100\,x_4+00100\,x_8+00100\,x_9-10000\,x_5=0\\
\xrightarrow{x_2+x_4-2x_{10}=0}\ &01000\,x_{10}+00100\,x_8+00100\,x_9-10000\,x_5=0\\
\xrightarrow{x_8+x_9-2x_{11}=0}\ &01000\,x_{10}+01000\,x_{11}-10000\,x_5=0\\
\xrightarrow{x_{10}+x_{11}-2x_{12}=0}\ &10000\,x_{12}-10000\,x_5=0.
\end{aligned}$$

For simplicity we only eliminate the positive-coefficient variables in this example. In the first round we use the new variables (and the corresponding gadgets) $x_6$ and $x_7$ to eliminate the first bit, obtaining the equation $2x_1+4x_2+6x_4+2x_6+2x_7-16x_5=0$. These two generated variables are then eliminated in the second round by $x_9$, in addition to $x_8$ for the second bit. In the third round we use $x_{10}$ and $x_{11}$. Finally, in the fourth round we use $x_{12}$ and obtain the reduced equation $16x_{12}-16x_5=0$, together with $\mathcal{MC}_2\mathtt{Gadget}$s representing $2x_6=x_1+x_2$, $2x_7=x_3+x_4$, $2x_8=x_1+x_4$, $2x_9=x_6+x_7$, $2x_{10}=x_2+x_4$, $2x_{11}=x_8+x_9$, and $2x_{12}=x_{10}+x_{11}$.

Correctness of this simplified reduction follows easily from correctness of the gadget, as the transformation preserves the original solution. Moreover, given a solution $\mathbf{x}^*$ with $\mathbf{B}\mathbf{x}^*=\mathbf{d}$, where $\mathbf{B}$ and $\mathbf{d}$ are obtained by running Algorithm 1 on input $\mathbf{A}$ and $\mathbf{c}$, we can easily obtain a solution to the original system $\mathbf{A}\mathbf{x}=\mathbf{c}$ by simply taking the first $n$ entries of $\mathbf{x}^*$. It is also easy to obtain $\mathbf{d}$ from $\mathbf{c}$, provided we can compute $m'$ in $\mathsf{TC}^0$.

In the remainder of this section we prove that Algorithm 1 can be implemented in $\mathsf{TC}^0$. For $1\leq i\leq m$ and $1\leq k\leq w$, we define

$\mathtt{Len}^+_i(\mathbf{A})$ = the log of the sum of positive coefficients in $\mathbf{A}_i$,
$\mathtt{CountBit}^+_{i,k}(\mathbf{A})$ = the number of positive-coefficient variables in the original $\mathbf{A}_i$ whose $k$-th bit is $1$,
$\mathtt{NumGadget}^+_{i,k}(\mathbf{A})$ = the number of $\mathcal{MC}_2\mathtt{Gadget}$s used for $\mathbf{A}_i$ to eliminate the $k$-th bit of the positive-coefficient variables,

and similarly $\mathtt{Len}^-_i(\mathbf{A})$, $\mathtt{CountBit}^-_{i,k}(\mathbf{A})$, and $\mathtt{NumGadget}^-_{i,k}(\mathbf{A})$ for the negative-coefficient variables.

Example 3.4.

For the above example we have $\mathtt{Len}^+_i(\mathbf{A})=5$, and the following values for each $k$.

Table 1: Values of $\mathtt{CountBit}^+_{i,k}(\mathbf{A})$ and $\mathtt{NumGadget}^+_{i,k}(\mathbf{A})$ for Example 3.3

$k$:                                       1   2   3   4   5
$\mathtt{CountBit}^+_{i,k}(\mathbf{A})$:   4   2   2   0   0
$\mathtt{NumGadget}^+_{i,k}(\mathbf{A})$:  2   2   2   1   0

We have the following simple but crucial properties for these values.

Claim 2.

  (i) $\mathtt{Len}^s_i(\mathbf{A})\leq w$ for all $s\in\{+,-\}$, $1\leq i\leq m$;

  (ii) $\mathtt{CountBit}^s_{i,k}(\mathbf{A})\leq n$ for all $s\in\{+,-\}$, $1\leq i\leq m$, $1\leq k\leq w$;

  (iii) for all $s\in\{+,-\}$, $1\leq i\leq m$,

  $$\mathtt{NumGadget}^s_{i,k}(\mathbf{A})=\begin{cases}2^{-(k+1)}\sum_{k'=1}^{k}2^{k'}\,\mathtt{CountBit}^s_{i,k'}(\mathbf{A})&\text{for }1\leq k\leq\mathtt{Len}^s_i(\mathbf{A})-1,\\[2pt]0&\text{for }\mathtt{Len}^s_i(\mathbf{A})\leq k\leq w,\end{cases}$$

  thus $\mathtt{NumGadget}^s_{i,k}(\mathbf{A})\leq O(n)$;

  (iv) $m'=m+8\sum_{i=1}^m\sum_{s\in\{+,-\}}\sum_{k=1}^w\mathtt{NumGadget}^s_{i,k}(\mathbf{A})\leq O(nmw)$;

  (v) $n'=2n+10\sum_{i=1}^m\sum_{s\in\{+,-\}}\sum_{k=1}^w\mathtt{NumGadget}^s_{i,k}(\mathbf{A})\leq O(nmw)$.

Proof 3.5.

(i) and (ii) are trivial by definition, and (iv) and (v) follow directly from Algorithm 1 and (iii). For (iii), note that in the round for bit $k$ we eliminate all variables generated in the previous round for bit $k-1$ by construction (ignoring variables used only inside the gadgets). Therefore $\mathtt{NumGadget}^s_{i,1}(\mathbf{A})=\mathtt{CountBit}^s_{i,1}(\mathbf{A})/2$ and $\mathtt{NumGadget}^s_{i,k}(\mathbf{A})=\big(\mathtt{CountBit}^s_{i,k}(\mathbf{A})+\mathtt{NumGadget}^s_{i,k-1}(\mathbf{A})\big)/2$ for $2\leq k\leq\mathtt{Len}^s_i(\mathbf{A})-1$. Here we rely on the property that the sum of positive (and negative) coefficients in each row is a power of $2$, which ensures that the values $\mathtt{NumGadget}^s_{i,k}(\mathbf{A})$ calculated this way are always integers. The formula then follows by induction.
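The closed form of Claim 2 (iii) and the recurrence in its proof can be cross-checked numerically on the values of Table 1; a sketch (1-indexed bit positions; names ours):

```python
def num_gadgets(count_bit, length):
    """Closed form of Claim 2 (iii): NumGadget_k = 2^{-(k+1)} * sum_{k'<=k} 2^{k'} CountBit_{k'}
    for k <= length - 1, and 0 for k >= length."""
    w = len(count_bit)
    out = []
    for k in range(1, w + 1):
        if k <= length - 1:
            total = sum((1 << kp) * count_bit[kp - 1] for kp in range(1, k + 1))
            assert total % (1 << (k + 1)) == 0   # integrality from the power-of-2 row sums
            out.append(total >> (k + 1))
        else:
            out.append(0)
    return out
```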

Note that in (iii), multiplying by $2^{-(k+1)}$ and $2^{k'}$ are just right and left shifts, which are easily implemented in the circuit model. Combining Claim 2 and Fact 1, all of these values can be computed in $\mathsf{TC}^0$ for all $i,k,s$, i.e. the depths of the $\mathsf{TC}^0$ circuits are absolute constants independent of $i$, $k$, and $s$.

Lemma 3.6 (Informal).
  • $\mathtt{Len}^s_i$, $\mathtt{CountBit}^s_{i,k}$, and $\mathtt{NumGadget}^s_{i,k}$ have $\mathsf{TC}^0$ circuits of size $\mathrm{poly}(n,w)$ for all $i,k,s$;

  • $n'$ and $m'$ have $\mathsf{TC}^0$ circuits of size $\mathrm{poly}(n,m,w)$.

Now we can prove that the reduction from $\mathcal{G}_{z,2}$ to $\mathcal{MC}_2$ can be done in $\mathsf{TC}^0$. We represent the input matrix $\mathbf{A}$ explicitly, by giving all its entries in sign-magnitude form, and the output matrix $\mathbf{B}$ implicitly, by outputting the entry of $\mathbf{B}$ in row $i$, column $j$, with $i$ and $j$ given as additional inputs.

Theorem 3.7.

There is a $\mathsf{TC}^0$ circuit family $\{C_{n,m,w}\}_{n,m,w\in\mathbb{N}}$ of size $\mathrm{poly}(n,m,w)$ with $(O(nmw)+O(\log nmw))$-bit input and $O(w)$-bit output such that:

  • the first $O(nmw)$ input bits represent a $w$-bounded $m\times n$ matrix $\mathbf{A}\in\mathcal{G}_{z,2}$ in sign-magnitude form, the next $O(\log nmw)$ bits represent an index $i$, and the last $O(\log nmw)$ bits represent another index $j$;

  • for $i\leq m'$ and $j\leq n'$, where $m',n'\leq O(nmw)$ are as calculated by Algorithm 1 on $\mathbf{A}$, the output represents (in sign-magnitude form) the entry of $\mathbf{B}$ in row $i$, column $j$, as calculated by Algorithm 1 on $\mathbf{A}$; otherwise it represents $0$.

Proof 3.8.

Consider the $i$-th row of $\mathbf{B}$; we need to determine which equation it corresponds to. Because the equations of the $\mathcal{MC}_2\mathtt{Gadget}$s are placed after the modified equations of $\mathbf{A}$, there are two cases:

  • $i\leq m$: in this case the equation is $\mathbf{A}_i$ after the transformation, which has the form $C(x_{j_+}-x_{j_-})=\mathbf{c}_i$ with $C=\underbrace{10\cdots0}_{\mathtt{Len}^+_i(\mathbf{A})\text{ bits}}$ in binary, since the transformation does not change the sum of positive (or negative) coefficients in $\mathbf{A}$. What remains is to calculate the indices $j_+$ and $j_-$ in $\mathsf{TC}^0$.

    Let $\mathtt{SumNumG}^s_i(\mathbf{A})=\sum_{k=1}^w\mathtt{NumGadget}^s_{i,k}(\mathbf{A})$ for $s\in\{+,-\}$ and $1\leq i\leq m$. For each $s\in\{+,-\}$ there are two cases:

    • No new variable was added, i.e. $\mathtt{SumNumG}^s_i(\mathbf{A})=0$: then there is only one entry in the original $\mathbf{A}_i$ with sign $s$, and $j_s$ is the index of this entry. We search for this entry to find its index, which can be implemented in $\mathsf{AC}^0$ because there are only $n$ possibilities.

    • Some new variables were added: then $j_s$ is the index of the last added new variable, ignoring those used only inside the $\mathcal{MC}_2\mathtt{Gadget}$s. It can be calculated as

    $$j_s=\begin{cases}n+\mathtt{SumNumG}^+_i(\mathbf{A})+\sum_{i'=1}^{i-1}\sum_{s'\in\{+,-\}}\mathtt{SumNumG}^{s'}_{i'}(\mathbf{A})&\text{if }s=+,\\[2pt]n+\sum_{i'=1}^{i}\sum_{s'\in\{+,-\}}\mathtt{SumNumG}^{s'}_{i'}(\mathbf{A})&\text{if }s=-.\end{cases}$$

    $\mathtt{SumNumG}^s_i\in\mathsf{TC}^0$ by Fact 1 and Lemma 3.6, so both $j_+$ and $j_-$ can be calculated in $\mathsf{TC}^0$ by Fact 1. Therefore we can compute the entries of row $i$ in $\mathsf{TC}^0$ for $i\leq m$.

  • $i>m$: in this case the equation belongs to a gadget $\mathcal{MC}_2\mathtt{Gadget}(t,t',j_1,j_2)$ for some $t,t',j_1,j_2$. We need to calculate $t,t',j_1,j_2$ in $\mathsf{TC}^0$; it is then easy to recover the equation from the offset within the gadget.

    We first calculate in $\mathsf{AC}^0$ the gadget's index $ind=\lfloor\frac{i-m-1}{8}\rfloor+1$, as $\lfloor\frac{\cdot}{8}\rfloor$ just ignores the three least significant bits. By Algorithm 1 and Remark 3.2 we know $t=n+ind$, so $t$ is $\mathsf{AC}^0$-computable, and we can also calculate $t'=n+\sum_{i'=1}^m\sum_{s'\in\{+,-\}}\mathtt{SumNumG}^{s'}_{i'}(\mathbf{A})+4(ind-1)$ in $\mathsf{TC}^0$ by Fact 1.

    Next we calculate the row index $i'$, bit index $k'$, sign $s'\in\{+,-\}$, and number $\ell'$ such that this gadget is the $\ell'$-th gadget created to eliminate the $k'$-th bit of $\mathbf{A}_{i'}$ for variables with sign $s'$. More specifically, we want the minimum $i'$, $k'$, $s'$ (where we treat $+<-$) such that

    $$\mathtt{PrefixSum}^{s'}_{i',k'}(\mathbf{A})<ind\leq\mathtt{PrefixSum}^{s'}_{i',k'}(\mathbf{A})+\mathtt{NumGadget}^{s'}_{i',k'}(\mathbf{A}),$$

    where

    $$\begin{aligned}\mathtt{PrefixSum}^+_{i',k'}(\mathbf{A})&=\sum_{i''=1}^{i'-1}\sum_{s''\in\{+,-\}}\mathtt{SumNumG}^{s''}_{i''}(\mathbf{A})+\sum_{k''=1}^{k'-1}\sum_{s''\in\{+,-\}}\mathtt{NumGadget}^{s''}_{i',k''}(\mathbf{A}),\\\mathtt{PrefixSum}^-_{i',k'}(\mathbf{A})&=\mathtt{PrefixSum}^+_{i',k'}(\mathbf{A})+\mathtt{NumGadget}^+_{i',k'}(\mathbf{A}).\end{aligned}$$

    There are $O(mw)$ possible choices of $i'$, $k'$, $s'$, and each condition involves a prefix sum of the $\mathtt{NumGadget}^s_{i,k}$'s, so this can be done in $\mathsf{TC}^0$ by comparing $ind$ in parallel against all these prefix sums. Having found $i'$, $k'$, and $s'$, we get $\ell'=ind-\mathtt{PrefixSum}^{s'}_{i',k'}(\mathbf{A})$ by Fact 1.

    What remains is to calculate j1j_{1} and j2j_{2} from ii^{\prime}, kk^{\prime}, ss^{\prime}, and \ell^{\prime}. Note that when eliminating the kk^{\prime}-th bit of 𝐀i\mathbf{A}_{i^{\prime}} for variables with sign ss^{\prime}, we first eliminate in pairs those variables in the original 𝐀i\mathbf{A}_{i^{\prime}} that have 11 in the kk^{\prime}-th bit before the reduction (we call them original pairs), then eliminate in pairs those generated in the previous round for ii^{\prime}, k1k^{\prime}-1, and ss^{\prime}. The number of original pairs is given by p=𝙲𝚘𝚞𝚗𝚝𝙱𝚒𝚝i,ks(𝐀)/2p=\lfloor\mathtt{CountBit}^{s^{\prime}}_{i^{\prime},k^{\prime}}(\mathbf{A})/2\rfloor, computable in 𝖳𝖢0\mathsf{TC}^{0}. There are two cases:

    • p\ell^{\prime}\leq p: j1j_{1} and j2j_{2} are the indices of the variables in the \ell^{\prime}-th original pair, i.e., the indices of the (21)(2\ell^{\prime}-1)-th and 22\ell^{\prime}-th variables in 𝐀i\mathbf{A}_{i^{\prime}} that have 11 in the kk^{\prime}-th bit. As above, this can be done by a parallel comparison to the prefix sums of the kk^{\prime}-th bits of the variables in 𝐀i\mathbf{A}_{i^{\prime}}.

    • >p\ell^{\prime}>p: then j1j_{1} and j2j_{2} are the (2(p)1)(2(\ell^{\prime}-p)-1)-th and 2(p)2(\ell^{\prime}-p)-th new variables generated in the previous round, therefore j1=𝙿𝚛𝚎𝚏𝚒𝚡𝚂𝚞𝚖i,k1s(𝐀)+2(p)1j_{1}=\mathtt{PrefixSum}^{s^{\prime}}_{i^{\prime},k^{\prime}-1}(\mathbf{A})+2(\ell^{\prime}-p)-1 and j2=j1+1j_{2}=j_{1}+1.

    However, the second case only works when 𝙲𝚘𝚞𝚗𝚝𝙱𝚒𝚝i,ks(𝐀)\mathtt{CountBit}^{s^{\prime}}_{i^{\prime},k^{\prime}}(\mathbf{A}) is even. When it is odd, =p+1\ell^{\prime}=p+1 corresponds to the pair consisting of the last original variable in this round and the first variable generated in the previous round, so j1j_{1} can be calculated as in the first case and j2=𝙿𝚛𝚎𝚏𝚒𝚡𝚂𝚞𝚖i,k1s(𝐀)+1j_{2}=\mathtt{PrefixSum}^{s^{\prime}}_{i^{\prime},k^{\prime}-1}(\mathbf{A})+1. Then for >p+1\ell^{\prime}>p+1, we have j1=𝙿𝚛𝚎𝚏𝚒𝚡𝚂𝚞𝚖i,k1s(𝐀)+2(p)2j_{1}=\mathtt{PrefixSum}^{s^{\prime}}_{i^{\prime},k^{\prime}-1}(\mathbf{A})+2(\ell^{\prime}-p)-2 and j2=j1+1j_{2}=j_{1}+1.
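Putting the two cases and the odd-count correction together, the index computation can be sketched as follows; `orig_pair` (returning the index of the tt-th variable of 𝐀i\mathbf{A}_{i^{\prime}} with a 11 in the kk^{\prime}-th bit) and `prev_base` (standing for 𝙿𝚛𝚎𝚏𝚒𝚡𝚂𝚞𝚖i,k1s(𝐀)\mathtt{PrefixSum}^{s^{\prime}}_{i^{\prime},k^{\prime}-1}(\mathbf{A})) are hypothetical helpers introduced for this sketch:

```python
def pair_indices(ell, count_bit, orig_pair, prev_base):
    """Return (j1, j2) for the ell-th eliminated pair.

    count_bit plays the role of CountBit^{s'}_{i',k'}(A);
    p = count_bit // 2 is the number of original pairs."""
    p = count_bit // 2
    if ell <= p:                        # both variables are original
        return orig_pair(2 * ell - 1), orig_pair(2 * ell)
    if count_bit % 2 == 0:              # even count: both were generated last round
        j1 = prev_base + 2 * (ell - p) - 1
        return j1, j1 + 1
    if ell == p + 1:                    # odd count: last original meets first generated
        return orig_pair(2 * p + 1), prev_base + 1
    j1 = prev_base + 2 * (ell - p) - 2  # remaining pairs of generated variables
    return j1, j1 + 1
```

Each branch uses only comparisons and fixed arithmetic on precomputed counts, so it stays within 𝖳𝖢0\mathsf{TC}^{0}.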

In conclusion, given ii as input, we can first check whether imi\leq m^{\prime} in 𝖳𝖢0\mathsf{TC}^{0} by Lemma 3.6 and, if so, compute the coefficients of 𝐁i\mathbf{B}_{i} in 𝖳𝖢0\mathsf{TC}^{0}; then, given jj as input, we check whether jnj\leq n^{\prime} in 𝖳𝖢0\mathsf{TC}^{0} and, if so, compute the entry of 𝐁\mathbf{B} in row ii, column jj in 𝖳𝖢0\mathsf{TC}^{0}.

4 Generalization to approximate solvers and more restrictive classes

Now we generalize the above reductions for exact solvers to reductions for approximate solvers, thus proving Theorem 1.2. First we need the following result on the power of 𝖳𝖢0\mathsf{TC}^{0}.

Fact 3.

Division and iterated multiplication are in 𝖣𝖫𝖮𝖦𝖳𝖨𝖬𝖤\mathsf{DLOGTIME}-uniform 𝖳𝖢0\mathsf{TC}^{0} [5]. Moreover, we can approximate in 𝖳𝖢0\mathsf{TC}^{0} functions represented by sufficiently nice power series, such as log\log, exp\exp, and x1/kx^{1/k} [11, 9, 5].
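As a concrete illustration of such a "sufficiently nice" power series (a standard fact, not specific to [11, 9, 5]): for |x|1/2|x|\leq 1/2 one can approximate the logarithm by truncating its Taylor series, and each term, as well as the iterated sum, is computable in 𝖳𝖢0\mathsf{TC}^{0} by Fact 3:

```latex
\log(1+x) \;=\; \sum_{t=1}^{T} \frac{(-1)^{t+1} x^{t}}{t} \;+\; E_T,
\qquad
|E_T| \;\le\; \sum_{t>T} \frac{|x|^{t}}{t} \;\le\; 2^{-T}
\quad (|x|\le 1/2),
```

so taking TT polynomial in the input length already yields exponentially small error.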

Proof 4.1 (Proof sketch of Theorem 1.2).

Building on our proofs for the simplified reductions, we prove that the original reductions from 𝒢\mathcal{G} to 𝒞2\mathcal{MC}_{2} in [7] can be computed in 𝖳𝖢0\mathsf{TC}^{0}. In the context of approximate solvers, the error ε\mathrm{\varepsilon} that the solvers are required to achieve is also part of the instance. By Fact 3, all the errors in the reduced instances, as defined in the reductions in Kyng and Zhang’s full paper [8], can be computed in 𝖳𝖢0\mathsf{TC}^{0} given the size, condition number, and bit complexity of the original matrix as parameters.

  • From 𝒢\mathcal{G} to 𝒢z\mathcal{G}_{z}: this reduction remains the same, but now, knowing the accuracy, we can recover 𝐱\mathbf{x}^{\prime} from 𝐱\mathbf{x} using fixed-point arithmetic in 𝖳𝖢0\mathsf{TC}^{0}.

  • From 𝒢z\mathcal{G}_{z} to 𝒢z,2\mathcal{G}_{z,2}: the only difference (besides the calculation of accuracy) is that in the original reduction in [8, Section 7.2], in the last row of the reduced matrix 𝐀\mathbf{A}^{\prime} the last two entries are set to be ww and w-w for some value ww, instead of 11 and 1-1 as in our simplified version. However ww is computable in 𝖳𝖢0\mathsf{TC}^{0} by Fact 3 so the reduction is still computable in 𝖳𝖢0\mathsf{TC}^{0}.

  • From 𝒢z,2\mathcal{G}_{z,2} to 𝒞2\mathcal{MC}_{2}: for approximate solvers we have to use the original 𝒞2𝙶𝚊𝚍𝚐𝚎𝚝\mathcal{MC}_{2}\mathtt{Gadget} in [8, Section 4], consisting of ten 22-commodity equations instead of eight, with additional variables. So we need to modify the corresponding numbers in our calculation; in particular the gadget’s index becomes ind=im110+1ind=\lfloor\frac{i-m-1}{10}\rfloor+1, which is still 𝖳𝖢0\mathsf{TC}^{0}-computable. Besides, the equations in the 𝒞2𝙶𝚊𝚍𝚐𝚎𝚝\mathcal{MC}_{2}\mathtt{Gadget} for eliminating the kk-th bit of variables with sign ss in 𝐀i\mathbf{A}_{i} are multiplied by a factor s2kwi-s\cdot 2^{k}\cdot w_{i}, where wi=10s{+,}𝚂𝚞𝚖𝙽𝚞𝚖𝙶is(𝐀)w_{i}=\sqrt{10\sum_{s^{\prime}\in\{+,-\}}\mathtt{SumNumG}^{s^{\prime}}_{i}(\mathbf{A})}, as specified in Algorithm 1 of [8]. These factors can be calculated to the desired accuracy in 𝖳𝖢0\mathsf{TC}^{0} by Fact 3. Therefore the reduction is still computable in 𝖳𝖢0\mathsf{TC}^{0}.

Additionally, in the second and third reductions, when recovering 𝐱\mathbf{x}^{\prime} from 𝐱\mathbf{x} we need to check whether the original matrix 𝐀\mathbf{A} and vector 𝐛\mathbf{b} satisfy 𝐀𝐛=𝟎\mathbf{A}^{\top}\mathbf{b}=\mathbf{0}, and simply return 𝟎\mathbf{0} if so. This can be done in 𝖳𝖢0\mathsf{TC}^{0} by Fact 1. In conclusion, we can reduce the problem of approximately solving equations in 𝒢\mathcal{G} to that of approximately solving equations in 𝒞2\mathcal{MC}_{2} in 𝖳𝖢0\mathsf{TC}^{0}.
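The degenerate-case check can be sketched as follows (a plain-Python stand-in; in the reduction the test is the 𝖳𝖢0\mathsf{TC}^{0} iterated addition of Fact 1). If 𝐀𝐛=𝟎\mathbf{A}^{\top}\mathbf{b}=\mathbf{0} then the normal equations 𝐀𝐀𝐱=𝐀𝐛\mathbf{A}^{\top}\mathbf{A}\mathbf{x}=\mathbf{A}^{\top}\mathbf{b} are solved by 𝐱=𝟎\mathbf{x}=\mathbf{0}, so the zero vector is already a least-squares solution:

```python
def solve_or_zero(A, b, solve):
    """Return the zero vector if A^T b = 0 (then x = 0 minimizes ||Ax - b||);
    otherwise defer to the given solver."""
    n = len(A[0])
    # Compute A^T b column by column.
    Atb = [sum(A[r][c] * b[r] for r in range(len(A))) for c in range(n)]
    if all(v == 0 for v in Atb):
        return [0] * n
    return solve(A, b)
```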

In [7] they also considered some more restrictive subclasses of 𝒞2\mathcal{MC}_{2}. Intuitively, the set of strict 22-commodity matrices 𝒞2>0𝒞2\mathcal{MC}^{>0}_{2}\subset\mathcal{MC}_{2} is the class of 22-commodity equation systems 𝐀\mathbf{A} such that for every pair i,ji,j, the equation xixj=0𝐀x_{i}-x_{j}=0\in\mathbf{A} iff the equation yiyj=0𝐀y_{i}-y_{j}=0\in\mathbf{A} iff the equation xiyi(xjyj)=0𝐀x_{i}-y_{i}-(x_{j}-y_{j})=0\in\mathbf{A}. The set of strict 22-commodity matrices with integer entries is denoted by 𝒞2,>0\mathcal{MC}^{>0}_{2,\mathbb{Z}}. They gave reductions from approximately solving 𝒞2\mathcal{MC}_{2} to approximately solving 𝒞2>0\mathcal{MC}^{>0}_{2}, and from 𝒞2>0\mathcal{MC}^{>0}_{2} to 𝒞2,>0\mathcal{MC}^{>0}_{2,\mathbb{Z}}. We show that these reductions can also be computed in 𝖳𝖢0\mathsf{TC}^{0}.

From 𝒞2\mathcal{MC}_{2} to 𝒞2>0\mathcal{MC}^{>0}_{2}

Proof 4.2 (Proof sketch).

The reduction, as defined in [8, Section 5.1], checks, for each pair i,ji,j involved in some equation in 𝐀\mathbf{A}, whether any of the three types of equations is missing, and adds it if so. Each added equation is multiplied by a factor that is computable in 𝖳𝖢0\mathsf{TC}^{0}. Obviously the resulting system of equations is in 𝒞2>0\mathcal{MC}^{>0}_{2}. It is easy to see that the number of added equations for each pair i,ji,j can be computed in 𝖠𝖢0\mathsf{AC}^{0}, so all the prefix sums of these numbers can be calculated in 𝖳𝖢0\mathsf{TC}^{0} simultaneously, and we can determine the equations in the reduced instance in 𝖳𝖢0\mathsf{TC}^{0}.
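The completion step can be illustrated as follows, with equations encoded as tuples `(type, i, j)` for the three types xixj=0x_{i}-x_{j}=0, yiyj=0y_{i}-y_{j}=0, and xiyi(xjyj)=0x_{i}-y_{i}-(x_{j}-y_{j})=0; this encoding is chosen here for illustration only, and the scaling factors applied to the added equations are omitted:

```python
def make_strict(equations):
    """Add the missing equation types for every pair (i, j) that occurs,
    so that all three types are present iff any one is (strict 2-commodity)."""
    pairs = {(i, j) for (_, i, j) in equations}
    completed = set(equations)
    for (i, j) in pairs:
        for t in ('x', 'y', 'xy'):
            # In the real reduction each added equation carries a
            # TC^0-computable multiplicative factor, omitted here.
            completed.add((t, i, j))
    return completed
```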

From 𝒞2>0\mathcal{MC}^{>0}_{2} to 𝒞2,>0\mathcal{MC}^{>0}_{2,\mathbb{Z}}

Proof 4.3 (Proof sketch).

In [8, Section 6] this is done by scaling up all the numbers in the matrix and the vector by a factor of 2k2^{k}, where kk is computable in 𝖳𝖢0\mathsf{TC}^{0} by Fact 3, and then taking the ceiling of each entry of the matrix to make the entries integers, which can also be done in 𝖳𝖢0\mathsf{TC}^{0}.
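The scaling-and-rounding step can be sketched exactly with rationals; here `k` stands for the 𝖳𝖢0\mathsf{TC}^{0}-computable scaling exponent from the reduction:

```python
from fractions import Fraction
import math

def integerize(matrix, k):
    """Scale every entry by 2**k, then round up to the nearest integer."""
    return [[math.ceil(Fraction(a) * 2 ** k) for a in row] for row in matrix]
```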

References

  • [1] Peter Clote and Evangelos Kranakis. Boolean Functions and Computation Models. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2002.
  • [2] Michael B. Cohen, Rasmus Kyng, Gary L. Miller, Jakub W. Pachocki, Richard Peng, Anup B. Rao, and Shen Chen Xu. Solving SDD linear systems in nearly m log^{1/2} n time. In Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31 - June 03, 2014, pages 343–352, 2014.
  • [3] Dean Doron, François Le Gall, and Amnon Ta-Shma. Probabilistic logarithmic-space algorithms for Laplacian solvers. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2017, August 16-18, 2017, Berkeley, CA, USA, pages 41:1–41:20, 2017.
  • [4] François Le Gall. Powers of tensors and fast matrix multiplication. In International Symposium on Symbolic and Algebraic Computation, ISSAC ’14, Kobe, Japan, July 23-25, 2014, pages 296–303, 2014.
  • [5] William Hesse, Eric Allender, and David A. Mix Barrington. Uniform constant-depth threshold circuits for division and iterated multiplication. J. Comput. Syst. Sci., 65(4):695–716, 2002.
  • [6] Neil Immerman and Susan Landau. The complexity of iterated multiplication. In Proceedings: Fourth Annual Structure in Complexity Theory Conference, University of Oregon, Eugene, Oregon, USA, June 19-22, 1989, pages 104–111, 1989.
  • [7] Rasmus Kyng and Peng Zhang. Hardness results for structured linear systems. In 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 684–695, 2017.
  • [8] Rasmus Kyng and Peng Zhang. Hardness results for structured linear systems. CoRR, abs/1705.02944, 2017. arXiv:1705.02944.
  • [9] Alexis Maciel and Denis Thérien. Efficient threshold circuits for power series. Inf. Comput., 152(1):62–73, 1999.
  • [10] Jack Murtagh, Omer Reingold, Aaron Sidford, and Salil P. Vadhan. Derandomization beyond connectivity: Undirected laplacian systems in nearly logarithmic space. In 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 801–812, 2017.
  • [11] John H. Reif and Stephen R. Tate. On threshold circuits and polynomial computation. SIAM J. Comput., 21(5):896–908, 1992.
  • [12] Michael E. Saks and Shiyu Zhou. RSPACE(S) ⊆ DSPACE(S^{3/2}). In 36th Annual Symposium on Foundations of Computer Science, Milwaukee, Wisconsin, USA, 23-25 October 1995, pages 344–353, 1995.
  • [13] Daniel A. Spielman and Shang-Hua Teng. Nearly linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. SIAM J. Matrix Analysis Applications, 35(3):835–885, 2014.
  • [14] Amnon Ta-Shma. Inverting well conditioned matrices in quantum logspace. In Symposium on Theory of Computing Conference, STOC’13, Palo Alto, CA, USA, June 1-4, 2013, pages 881–890, 2013.