
Northeastern University, [email protected]
ACM classification: Theory of computation → Problems, reductions and completeness
Funding: Supported by NSF CCF award 1813930.

Acknowledgements.
The author thanks Emanuele Viola for helpful discussions.

Space Hardness of Solving Structured Linear Systems

Xuangui Huang
Abstract

We show that if the probabilistic logarithmic-space solver or the deterministic nearly logarithmic-space solver for undirected Laplacian matrices can be extended to solve slightly larger subclasses of linear systems, then they can be used to solve all linear systems with similar space complexity. Previously, Kyng and Zhang [7] proved similar results in the time complexity setting using reductions between approximate solvers. We prove that their reductions can be implemented using constant-depth, polynomial-size threshold circuits.

keywords:
linear system solver, logarithmic space, threshold circuit

1 Introduction

One of the oldest mathematical problems is to solve a linear system, that is, to find a solution $\mathbf{x}$ satisfying $\mathbf{A}\mathbf{x}=\mathbf{b}$, given an $n\times n$ matrix $\mathbf{A}$ and an $n$-dimensional vector $\mathbf{b}$ as input. In the Real RAM model this can be done in $O(n^{\omega})$ time, where $\omega\leq 2.3728$ [4] is the matrix multiplication constant. Much faster algorithms exist for approximately solving linear systems when $\mathbf{A}$ is the Laplacian of an undirected graph: recent breakthroughs showed that this can be done in nearly linear time [13, 2]. Kyng and Zhang [7] further showed that if such solvers can be extended to nearly-linear-time solvers for some classes slightly larger than undirected Laplacians, then all linear systems can be solved in nearly linear time.

In this paper we are interested in the space complexity of this problem. For general linear systems, Ta-Shma gave a quantum algorithm using logarithmic space [14]. For undirected Laplacians, Doron et al. gave a probabilistic logarithmic-space algorithm [3], and hence a deterministic $O(\log^{3/2}n)$-space algorithm by a well-known space-efficient derandomization result [12]. This was later improved to $\widetilde{O}(\log n)$ space by Murtagh et al. [10].

1.1 Our results

We prove a space-hardness version of Kyng and Zhang's results [7], showing space hardness of approximate linear system solvers for some classes slightly larger than undirected Laplacians, namely multi-commodity Laplacians, 2-D Truss Stiffness Matrices, and Total Variation Matrices.

Theorem 1.1.

Suppose that for multi-commodity Laplacians, 2-D Truss Stiffness Matrices, or Total Variation Matrices, the linear system $\mathbf{A}\mathbf{x}=\mathbf{b}$ can be approximately solved in (nearly) logarithmic space with logarithmic dependence on the condition number $\kappa$ and the accuracy $\varepsilon^{-1}$ (even if the solver only works in expectation or with high probability). Then any linear system with polynomially bounded integer entries and condition number can be solved in (nearly) logarithmic space with high accuracy (in expectation or with high probability, respectively).

This shows that if the probabilistic logspace solver for undirected Laplacians in [3], or the deterministic $\widetilde{O}(\log n)$-space one in [10], can be extended to solve any of these three slightly larger subclasses of linear systems, we would have the surprising result that all linear systems can be approximately solved in probabilistic logspace or in deterministic $\widetilde{O}(\log n)$ space, respectively. Pessimistically, the theorem means that it is very hard to obtain space-efficient algorithms for these subclasses, as that is as difficult as solving all linear systems space-efficiently. On the bright side, we actually prove that any progress on solving these subclasses using less space immediately translates into similar progress for solving all linear systems.

Kyng and Zhang [7] proved their results via reductions from approximate solvers of general linear systems to those of the three subclasses. In this paper we prove Theorem 1.1 by showing that their reductions can be carried out in a space-efficient manner. Indeed we prove the much stronger result that their reductions can be implemented by $\mathsf{TC}^0$ circuits, which are constant-depth, polynomial-size, unbounded fan-in circuits with $\mathsf{MAJORITY}$ and $\neg$ gates. This shows that these reductions are in fact highly parallelizable.

We denote by $\mathcal{G}$ the class of all matrices with integer-valued entries. In the context of solving linear systems, an all-zero row or column can be handled trivially, so we can assume without loss of generality that matrices in $\mathcal{G}$ have at least one non-zero entry in every row and column. For 2-commodity matrices $\mathcal{MC}_2$, we have two sets of variables $X$ and $Y$ of the same size, and the equations are scalings of $x_i-x_j=0$, $y_i-y_j=0$, and $x_i-y_i-(x_j-y_j)=0$, where $x_i,x_j\in X$ and $y_i,y_j\in Y$. This generalizes undirected Laplacians, whose incidence matrices only have equations of the form $x_i-x_j=0$ for $x_i,x_j\in X$.

Our main technical result proves that the reduction from $\mathcal{G}$ to $\mathcal{MC}_2$ in [7] is $\mathsf{TC}^0$-computable.

Theorem 1.2.

There is a $\mathsf{TC}^0$-reduction from approximately solving $\mathcal{G}$ to approximately solving $\mathcal{MC}_2$.

In [7] it is shown that the matrix produced by this reduction is also a 2-D Truss Stiffness Matrix as well as a Total Variation Matrix, so Theorem 1.2 also applies to these classes. Also note that this reduction is Karp-style, i.e. it requires only one linear system solve and uses the solution in a black-box way; this is why Theorem 1.1 still applies if the solver only works in expectation or with high probability.

We also show $\mathsf{TC}^0$-computability of the reductions in [7] to some more restrictive subclasses of $\mathcal{MC}_2$, including $\mathcal{MC}^{>}_2$, the exact class we have to solve when using Interior Point Methods for 2-commodity problems, as explained in Kyng and Zhang's full paper [8]. They also showed that the promise problem of deciding whether a vector is in the image of a matrix or $\varepsilon$-far from the image can be reduced directly to approximately solving linear systems. Combined with the above results, this shows that the promise problem can be reduced in $\mathsf{TC}^0$ to approximately solving the above-mentioned subclasses.

2 Simplified reductions in $\mathsf{TC}^0$: the easy parts

Throughout this paper we use the sign-magnitude representation to encode a $w$-bit signed integer $z\in[-2^{w}+1,2^{w}-1]$ as a sign bit, $0$ for positive and $1$ for negative, followed by $w$ bits for $|z|$ in binary. Note that in this representation $0$ has two encodings, and we accept both.
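As a concrete illustration, this encoding can be sketched in Python as follows (the helper names are ours, not from the paper):

```python
def encode_sm(z: int, w: int) -> str:
    """Encode a signed integer with |z| <= 2**w - 1 as 1 sign bit + w magnitude bits."""
    assert abs(z) <= 2 ** w - 1
    sign = '0' if z >= 0 else '1'            # 0 encodes positive, 1 encodes negative
    return sign + format(abs(z), f'0{w}b')   # magnitude of z in binary, w bits

def decode_sm(bits: str) -> int:
    """Inverse of encode_sm; both encodings of 0 decode to 0."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == '1' else magnitude
```

For instance, `encode_sm(-3, 4)` gives `'10011'`, and both `'00000'` and `'10000'` decode to $0$.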

For simplicity we first prove the special case of the reduction from an exact solver for $\mathcal{G}$ to an exact solver for $\mathcal{MC}_2$.

Theorem 2.1.

Given an $m\times n$ $w$-bit matrix $\mathbf{A}\in\mathcal{G}$ and a vector $\mathbf{b}$ as input, we can compute in $\mathsf{TC}^0$ an $O(nmw\log n)\times O(nmw\log n)$ matrix $\mathbf{A}'\in\mathcal{MC}_2$ with $O(w\log n)$-bit entries and a vector $\mathbf{b}'$ such that:

  • $\mathbf{A}\mathbf{x}=\mathbf{b}$ has a solution if and only if $\mathbf{A}'\mathbf{x}'=\mathbf{b}'$ has a solution;

  • if $\mathbf{A}'\mathbf{x}'=\mathbf{b}'$ has a solution $\mathbf{x}'$, then we can compute in $\mathsf{TC}^0$ from $\mathbf{x}'$ a vector $\mathbf{x}$ such that $\mathbf{A}\mathbf{x}=\mathbf{b}$.

As in [7], this reduction is split into several steps using the following classes of matrices.

Definition 2.2 ([7]).
  • Let $\mathcal{G}_z\subset\mathcal{G}$ denote the class of matrices with integer-valued entries such that every row has zero row sum;

  • Let $\mathcal{G}_{z,2}\subset\mathcal{G}_z$ denote the class of matrices with integer-valued entries such that every row has zero row sum, and for each row the sum of the positive coefficients is a power of $2$.

Lemma 2.3.

There are $\mathsf{TC}^0$-reductions for exact solvers of the following classes:

  (i) from $\mathcal{G}$ to $\mathcal{G}_z$;

  (ii) from $\mathcal{G}_z$ to $\mathcal{G}_{z,2}$;

  (iii) from $\mathcal{G}_{z,2}$ to $\mathcal{MC}_2$.

Lemma 2.3 (iii) is the main reduction in this paper (as in [7]); it will be proved in the next section. In the remainder of this section we prove Lemma 2.3 (i) and (ii).

From $\mathcal{G}$ to $\mathcal{G}_z$

Proof 2.4 (Proof sketch).

Given a matrix $\mathbf{A}\in\mathcal{G}$, define a matrix $\mathbf{A}'$ with one more column by $\mathbf{A}'=\begin{pmatrix}\mathbf{A}&-\mathbf{A}\mathbf{1}\end{pmatrix}$. Obviously $\mathbf{A}'\in\mathcal{G}_z$, and $\mathbf{A}\mathbf{x}=\mathbf{b}$ has a solution if and only if $\mathbf{A}'\mathbf{x}'=\mathbf{b}$ has a solution. We can recover $\mathbf{x}\in\mathbb{R}^n$ from $\mathbf{x}'\in\mathbb{R}^{n+1}$ by taking the first $n$ entries of $\mathbf{x}'$ and subtracting $\mathbf{x}'_{n+1}$ from each of them. The following facts about addition imply that $\mathbf{A}'$ can be computed in $\mathsf{TC}^0$, and that $\mathbf{x}$ can be recovered from $\mathbf{x}'$ in $\mathsf{AC}^0$ (for simplicity we ignore precision issues here).

Fact 1.
  • Addition of $2$ $w$-bit numbers has an $\mathsf{AC}^0$ circuit of size $\mathrm{poly}(w)$ (cf. [1]);¹

  • Iterated addition of $n$ $w$-bit numbers has a $\mathsf{TC}^0$ circuit of size $\mathrm{poly}(n,w)$ [6].

¹ $\mathsf{AC}^0$ circuits are constant-depth, polynomial-size, unbounded fan-in circuits with $\wedge$, $\vee$, and $\neg$ gates.
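The linear-algebra content of this step can be sketched in plain Python over integer row lists (the function names are ours):

```python
def to_zero_row_sum(A):
    """Reduction G -> G_z: append the column -A·1, so every row sums to zero."""
    return [row + [-sum(row)] for row in A]

def recover(x_prime):
    """Recover a solution x of Ax = b from a solution x' of A'x' = b:
    take the first n entries and subtract the last entry from each."""
    return [xi - x_prime[-1] for xi in x_prime[:-1]]
```

The recovery works because $\mathbf{A}'\mathbf{x}'=\mathbf{A}(\mathbf{x}'_{1..n}-\mathbf{x}'_{n+1}\mathbf{1})$, so any solution of the padded system yields a solution of the original one.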

From $\mathcal{G}_z$ to $\mathcal{G}_{z,2}$

Proof 2.5 (Proof sketch).

Given an $m\times n$ $w$-bit signed-integer matrix $\mathbf{A}'\in\mathcal{G}_z$, we just need to add two more columns to bring the sum of positive (and negative) entries in each row to the closest power of $2$. This can be done in $\mathsf{TC}^0$ as follows. For each row $1\leq i\leq m$, we compute the sum $s_i$ of its positive entries by checking the sign bits and then doing iterated addition in $\mathsf{TC}^0$ by Fact 1. We then take $s=\max_i s_i$, which is computable in $\mathsf{AC}^0$ given the $s_i$'s. Since $s$ has at most $O(w\log n)$ bits, a brute-force search in $\mathsf{AC}^0$ finds the minimum $k$ such that $2^k\geq s$. By Fact 1 we can then compute $a_i=2^k-s_i$ in $\mathsf{AC}^0$ given $s_i$. For each row $i$ we append $a_i$ and $-a_i$ as the last two entries to get $\mathbf{A}''$. Additionally we add a new row to $\mathbf{A}''$ to zero out the two new variables: its last two entries are $1$ and $-1$ and all other entries are $0$; accordingly we also append a $0$ entry to $\mathbf{b}'$ to get $\mathbf{b}''$.

Obviously we have $\mathbf{A}''\in\mathcal{G}_{z,2}$, and $\mathbf{A}'\mathbf{x}'=\mathbf{b}'$ has a solution iff $\mathbf{A}''\mathbf{x}''=\mathbf{b}''$ has a solution. We can easily recover $\mathbf{x}'$ from $\mathbf{x}''$ by taking the first $n$ entries.
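A sketch of this padding step in Python, operating on integer row lists (names ours; the right-hand side bookkeeping is omitted):

```python
def to_power_of_two_rows(A):
    """Reduction G_z -> G_{z,2}: append entries a_i = 2^k - s_i and -a_i to each row,
    where s_i is the positive sum of row i and 2^k is the least power of 2 >= max s_i,
    then add one row forcing the two new variables to zero out."""
    s = [sum(e for e in row if e > 0) for row in A]   # positive sums s_i
    k = 0
    while (1 << k) < max(s):                          # minimum k with 2^k >= max_i s_i
        k += 1
    out = [row + [(1 << k) - si, si - (1 << k)] for row, si in zip(A, s)]
    out.append([0] * len(A[0]) + [1, -1])             # new row zeroing out the new variables
    return out
```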

For the next section we need the following definition.

Definition 2.6.

We say a matrix $\mathbf{A}\in\mathcal{G}_z$ is $w$-bounded if for each row the sum of the positive coefficients is at most $2^w$. Note that in such a matrix every entry is a $w$-bit signed integer.

Note that the reduction from $\mathcal{G}$ to $\mathcal{G}_z$ turns an $m\times n$ $w$-bit matrix $\mathbf{A}$ into an $m\times(n+1)$ $O(w\log n)$-bounded matrix $\mathbf{A}'$, and the reduction from $\mathcal{G}_z$ to $\mathcal{G}_{z,2}$ then turns $\mathbf{A}'$ into an $(m+1)\times(n+3)$ $O(w\log n)$-bounded matrix $\mathbf{A}''$.

3 Simplified main reduction in $\mathsf{TC}^0$

The main reduction in [7] uses a pair-and-replace scheme to transform a linear system in $\mathcal{G}_{z,2}$ into a $2$-commodity equation system. The main idea is to use $\mathcal{MC}_2\mathtt{Gadget}$s consisting of $\mathcal{MC}_2$ equations to replace pairs of variables in the original linear system round by round, according to the bit representations of their coefficients. A simplified version of the gadget, implicit in [7], is as follows.

Definition 3.1 (Simplified $\mathcal{MC}_2\mathtt{Gadget}$).

Define $\mathcal{MC}_2\mathtt{Gadget}(t,t',j_1,j_2)$ to be the following set of 2-commodity linear equations representing "$2x_t=x_{j_1}+x_{j_2}$":

$$\begin{aligned}
x_t-x_{t'+1}&=0\\
x_{t'+2}-x_{j_2}&=0\\
y_{t'+1}-y_{t'+3}&=0\\
y_{t'+4}-y_{t'+2}&=0\\
x_{t'+3}-x_{j_1}&=0\\
x_t-x_{t'+4}&=0\\
x_{t'+4}-y_{t'+4}-(x_{t'+3}-y_{t'+3})&=0\\
x_{t'+1}-y_{t'+1}-(x_{t'+2}-y_{t'+2})&=0.
\end{aligned}$$

For convenience we use the extra parameter $t'$ to keep track of the new variables that are used only inside the gadgets.

Correctness of this gadget can be verified easily by summing up all of its equations: every internal variable cancels and the sum telescopes to $2x_t-x_{j_1}-x_{j_2}=0$.
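This check can be mechanized by summing the coefficient vectors of the eight equations; a sketch (the `Counter`-based representation and names are ours):

```python
from collections import Counter

def mc2_gadget(t, tp, j1, j2):
    """The eight equations of Definition 3.1, each as a mapping variable -> coefficient."""
    x = lambda i: ('x', i)
    y = lambda i: ('y', i)
    return [
        Counter({x(t): 1, x(tp + 1): -1}),
        Counter({x(tp + 2): 1, x(j2): -1}),
        Counter({y(tp + 1): 1, y(tp + 3): -1}),
        Counter({y(tp + 4): 1, y(tp + 2): -1}),
        Counter({x(tp + 3): 1, x(j1): -1}),
        Counter({x(t): 1, x(tp + 4): -1}),
        Counter({x(tp + 4): 1, y(tp + 4): -1, x(tp + 3): -1, y(tp + 3): 1}),
        Counter({x(tp + 1): 1, y(tp + 1): -1, x(tp + 2): -1, y(tp + 2): 1}),
    ]

def gadget_sum(t, tp, j1, j2):
    """Sum all eight equations; Counter.update adds coefficients without dropping negatives."""
    total = Counter()
    for eq in mc2_gadget(t, tp, j1, j2):
        total.update(eq)
    return {v: c for v, c in total.items() if c != 0}
```

The surviving coefficients are exactly those of $2x_t-x_{j_1}-x_{j_2}$.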

We now present, as Algorithm 1, a simplified reduction from exactly solving $\mathcal{G}_{z,2}$ to exactly solving $\mathcal{MC}_2$. We use $\mathbf{A}_i$ to denote the $i$-th row of $\mathbf{A}$.

Algorithm 1: Simplified $\textsc{Reduce}\mathcal{G}_{z,2}\textsc{To}\mathcal{MC}_2$

Input: a $w$-bounded $m\times n$ matrix $\mathbf{A}\in\mathcal{G}_{z,2}$ and $\mathbf{c}\in\mathbb{R}^m$.
Output: a $w$-bit $m'\times n'$ matrix $\mathbf{B}\in\mathcal{MC}_2$ and $\mathbf{d}\in\mathbb{R}^{m'}$.

 1: $m'\leftarrow m$, $n'\leftarrow n$
 2: $n_g\leftarrow 0$  // number of new variables used only in the $\mathcal{MC}_2\mathtt{Gadget}$s
 3: for $i\leftarrow 1$ to $m$ do
 4:   for $s\in\{+,-\}$ do
 5:     for $k\leftarrow 1$ to $w$ do
 6:       while strictly more than one entry in $\mathbf{A}_i$ has sign $s$ and the $k$-th bit equal to $1$ do
 7:         let $j_1,j_2$ be the first and second indices of such entries in $\mathbf{A}_i$
 8:         in $\mathbf{A}_i$, replace $2^k(x_{j_1}+x_{j_2})$ by $2^{k+1}x_{n'+1}$ (thus adding one column to the right of $\mathbf{A}$)
 9:         add the coefficients of $\mathcal{MC}_2\mathtt{Gadget}(n'+n_g+1,\,n'+n_g+1,\,j_1,\,j_2)$ to $\mathbf{C}$
10:         $n'\leftarrow n'+1$; $n_g\leftarrow n_g+4$; $m'\leftarrow m'+8$
11:       end while
12:     end for
13:   end for
14: end for
15: $n'\leftarrow 2\times(n'+n_g)$
16: stack $\mathbf{C}$ below $\mathbf{A}$ and fill in $0$'s to get $\mathbf{B}$
17: append $m'-m$ $0$'s to $\mathbf{c}$ to get $\mathbf{d}$
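The elimination loop above can be sketched in Python, tracking a single row as a sparse map from variable index to coefficient and recording each gadget only as a triple $(t,j_1,j_2)$ meaning $2x_t=x_{j_1}+x_{j_2}$ (the gadget equations themselves and the $n_g$, $m'$ bookkeeping are omitted; names ours):

```python
def pair_and_replace(coeffs, n, w):
    """Eliminate paired bits in one row of a G_{z,2} system, as in Algorithm 1.
    coeffs: dict mapping variable index -> signed integer coefficient."""
    coeffs = dict(coeffs)
    gadgets = []                                 # (t, j1, j2) for 2*x_t = x_j1 + x_j2
    next_var = n + 1
    for sign in (+1, -1):
        for k in range(w):                       # bit position k (0-indexed here)
            while True:
                hits = [j for j in sorted(coeffs)
                        if coeffs[j] * sign > 0 and (abs(coeffs[j]) >> k) & 1]
                if len(hits) < 2:
                    break
                j1, j2 = hits[0], hits[1]        # first and second such entries
                for j in (j1, j2):               # remove the 2^k contribution of x_j
                    coeffs[j] -= sign << k
                    if coeffs[j] == 0:
                        del coeffs[j]
                coeffs[next_var] = sign << (k + 1)   # fresh variable, coefficient 2^(k+1)
                gadgets.append((next_var, j1, j2))
                next_var += 1
    return coeffs, gadgets
```

On the row $3x_1+5x_2+x_3+7x_4-16x_5=0$ this reproduces exactly the gadget list of Example 3.3 below.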

Note that in $\mathcal{MC}_2$ we have two sets of input variables $X$ and $Y$ of the same size; this is why we multiply $n'$ by $2$ at the end of the reduction. Here, however, we only use $Y$ variables inside the $\mathcal{MC}_2\mathtt{Gadget}$s, and all other $Y$ variables are unused. For convenience we arrange the variables in the following way.

Remark 3.2 (Arrangement of variables).

In $\mathbf{B}$ we put all variables in $X$ before those in $Y$. More importantly, we put the $X$ variables that are used only inside the gadgets after all the variables $x_{n'+1}$ introduced in the replacement step of Algorithm 1. Equivalently, this can be viewed as the following process. We first run Algorithm 1 virtually, stopping before the final doubling of $n'$, to obtain $n'$, the number of $X$ variables ignoring those used only in the gadgets. Then we run it again on the original input, starting with $n_g$ set to this value, and in the gadget-creation step we use $\mathcal{MC}_2\mathtt{Gadget}(n'+1,n_g,j_1,j_2)$ instead.

We give an example showing how the reduction works under this arrangement.

Example 3.3.

We show how the reduction runs on $3x_1+5x_2+x_3+7x_4-16x_5=0$, writing each coefficient magnitude in $5$-bit binary:

$$\begin{aligned}
&00011\,x_1+00101\,x_2+00001\,x_3+00111\,x_4-10000\,x_5=0\\
\xrightarrow{x_1+x_2-2x_6=0}\ &00010\,x_1+00100\,x_2+00001\,x_3+00111\,x_4+00010\,x_6-10000\,x_5=0\\
\xrightarrow{x_3+x_4-2x_7=0}\ &00010\,x_1+00100\,x_2+00110\,x_4+00010\,x_6+00010\,x_7-10000\,x_5=0\\
\xrightarrow{x_1+x_4-2x_8=0}\ &00100\,x_2+00100\,x_4+00010\,x_6+00010\,x_7+00100\,x_8-10000\,x_5=0\\
\xrightarrow{x_6+x_7-2x_9=0}\ &00100\,x_2+00100\,x_4+00100\,x_8+00100\,x_9-10000\,x_5=0\\
\xrightarrow{x_2+x_4-2x_{10}=0}\ &01000\,x_{10}+00100\,x_8+00100\,x_9-10000\,x_5=0\\
\xrightarrow{x_8+x_9-2x_{11}=0}\ &01000\,x_{10}+01000\,x_{11}-10000\,x_5=0\\
\xrightarrow{x_{10}+x_{11}-2x_{12}=0}\ &10000\,x_{12}-10000\,x_5=0.
\end{aligned}$$

For simplicity we only eliminate the positive-coefficient variables in this example. In the first round we use the new variables (and the corresponding gadgets) $x_6$ and $x_7$ to eliminate the first bit, obtaining the equation $2x_1+4x_2+6x_4+2x_6+2x_7-16x_5=0$. These two generated variables are then eliminated in the second round by $x_9$, in addition to $x_8$ for the second bit. In the third round we use $x_{10}$ and $x_{11}$. Finally, in the fourth round we use $x_{12}$ and obtain the reduced equation $16x_{12}-16x_5=0$, together with $\mathcal{MC}_2\mathtt{Gadget}$s representing $2x_6=x_1+x_2$, $2x_7=x_3+x_4$, $2x_8=x_1+x_4$, $2x_9=x_6+x_7$, $2x_{10}=x_2+x_4$, $2x_{11}=x_8+x_9$, and $2x_{12}=x_{10}+x_{11}$.

Correctness of this simplified reduction follows easily from correctness of the gadget, as the transformation preserves the original solution. Moreover, given a solution $\mathbf{x}^*$ with $\mathbf{B}\mathbf{x}^*=\mathbf{d}$, where $\mathbf{B}$ and $\mathbf{d}$ are obtained by running Algorithm 1 on input $\mathbf{A}$ and $\mathbf{c}$, we can easily obtain a solution to the original system $\mathbf{A}\mathbf{x}=\mathbf{c}$ by simply taking the first $n$ entries of $\mathbf{x}^*$. It is also easy to obtain $\mathbf{d}$ from $\mathbf{c}$, provided we can compute $m'$ in $\mathsf{TC}^0$.

In the remainder of this section we prove that Algorithm 1 can be implemented in $\mathsf{TC}^0$. For $1\leq i\leq m$ and $1\leq k\leq w$, we define

$\mathtt{Len}^+_i(\mathbf{A})$ = the log of the sum of positive coefficients in $\mathbf{A}_i$,
$\mathtt{CountBit}^+_{i,k}(\mathbf{A})$ = the number of positive-coefficient variables in the original $\mathbf{A}_i$ whose $k$-th bit is $1$,
$\mathtt{NumGadget}^+_{i,k}(\mathbf{A})$ = the number of $\mathcal{MC}_2\mathtt{Gadget}$s used for $\mathbf{A}_i$ to eliminate the $k$-th bit of the positive-coefficient variables,

and similarly $\mathtt{Len}^-_i(\mathbf{A})$, $\mathtt{CountBit}^-_{i,k}(\mathbf{A})$, and $\mathtt{NumGadget}^-_{i,k}(\mathbf{A})$ for the negative-coefficient variables.

Example 3.4.

For the above example we have $\mathtt{Len}^+_i(\mathbf{A})=5$, and the following values for each $k$.

Table 1: Values of $\mathtt{CountBit}^+_{i,k}(\mathbf{A})$ and $\mathtt{NumGadget}^+_{i,k}(\mathbf{A})$ for Example 3.3

$k$:                                       1   2   3   4   5
$\mathtt{CountBit}^+_{i,k}(\mathbf{A})$:   4   2   2   0   0
$\mathtt{NumGadget}^+_{i,k}(\mathbf{A})$:  2   2   2   1   0

We have the following simple but crucial properties for these values.

Claim 2.

  (i) $\mathtt{Len}^s_i(\mathbf{A})\leq w$ for all $s\in\{+,-\}$, $1\leq i\leq m$;

  (ii) $\mathtt{CountBit}^s_{i,k}(\mathbf{A})\leq n$ for all $s\in\{+,-\}$, $1\leq i\leq m$, $1\leq k\leq w$;

  (iii) for all $s\in\{+,-\}$, $1\leq i\leq m$,

  $$\mathtt{NumGadget}^s_{i,k}(\mathbf{A})=\begin{cases}2^{-(k+1)}\sum_{k'=1}^{k}2^{k'}\,\mathtt{CountBit}^s_{i,k'}(\mathbf{A})&\text{for }1\leq k\leq\mathtt{Len}^s_i(\mathbf{A})-1,\\[2pt]0&\text{for }\mathtt{Len}^s_i(\mathbf{A})\leq k\leq w,\end{cases}$$

  thus $\mathtt{NumGadget}^s_{i,k}(\mathbf{A})\leq O(n)$;

  (iv) $m'=m+8\sum_{i=1}^m\sum_{s\in\{+,-\}}\sum_{k=1}^w\mathtt{NumGadget}^s_{i,k}(\mathbf{A})\leq O(nmw)$;

  (v) $n'=2n+10\sum_{i=1}^m\sum_{s\in\{+,-\}}\sum_{k=1}^w\mathtt{NumGadget}^s_{i,k}(\mathbf{A})\leq O(nmw)$.

Proof 3.5.

(i) and (ii) are trivial by definition, and (iv) and (v) follow directly from Algorithm 1 and (iii). For (iii), note that in the round for bit $k$ we eliminate all variables generated in the previous round for bit $k-1$ by construction (ignoring variables used only inside the gadgets). Therefore $\mathtt{NumGadget}^s_{i,1}(\mathbf{A})=\mathtt{CountBit}^s_{i,1}(\mathbf{A})/2$ and $\mathtt{NumGadget}^s_{i,k}(\mathbf{A})=\big(\mathtt{CountBit}^s_{i,k}(\mathbf{A})+\mathtt{NumGadget}^s_{i,k-1}(\mathbf{A})\big)/2$ for $2\leq k\leq\mathtt{Len}^s_i(\mathbf{A})-1$. Here we rely on the property that the sum of positive (and negative) coefficients in each row is a power of $2$, which ensures that the values $\mathtt{NumGadget}^s_{i,k}(\mathbf{A})$ calculated this way are always integers. The formula then follows by induction.
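The closed form of Claim 2 (iii) and the recurrence in its proof can be cross-checked numerically on the values of Table 1; a sketch (1-indexed bit positions; names ours):

```python
def num_gadgets(count_bit, length):
    """Closed form of Claim 2 (iii): NumGadget_k = 2^{-(k+1)} * sum_{k'<=k} 2^{k'} CountBit_{k'}
    for k <= length - 1, and 0 for k >= length."""
    w = len(count_bit)
    out = []
    for k in range(1, w + 1):
        if k <= length - 1:
            total = sum((1 << kp) * count_bit[kp - 1] for kp in range(1, k + 1))
            assert total % (1 << (k + 1)) == 0   # integrality from the power-of-2 row sums
            out.append(total >> (k + 1))
        else:
            out.append(0)
    return out
```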

Note that in (iii), multiplying by $2^{-(k+1)}$ and $2^{k'}$ are just right and left shifts, which are easily implemented in the circuit model. Combining Claim 2 and Fact 1, all of these values can be computed in $\mathsf{TC}^0$ for all $i,k,s$, i.e. the depths of the $\mathsf{TC}^0$ circuits are absolute constants independent of $i$, $k$, and $s$.

Lemma 3.6 (Informal).
  • $\mathtt{Len}^s_i$, $\mathtt{CountBit}^s_{i,k}$, and $\mathtt{NumGadget}^s_{i,k}$ have $\mathsf{TC}^0$ circuits of size $\mathrm{poly}(n,w)$ for all $i,k,s$;

  • $n'$ and $m'$ have $\mathsf{TC}^0$ circuits of size $\mathrm{poly}(n,m,w)$.

Now we can prove that the reduction from $\mathcal{G}_{z,2}$ to $\mathcal{MC}_2$ can be done in $\mathsf{TC}^0$. We represent the input matrix $\mathbf{A}$ explicitly, by giving all its entries in sign-magnitude form, and the output matrix $\mathbf{B}$ implicitly, by outputting the entry of $\mathbf{B}$ in row $i$, column $j$, with $i$ and $j$ given as additional inputs.

Theorem 3.7.

There is a $\mathsf{TC}^0$ circuit family $\{C_{n,m,w}\}_{n,m,w\in\mathbb{N}}$ of size $\mathrm{poly}(n,m,w)$ with $(O(nmw)+O(\log nmw))$-bit input and $O(w)$-bit output such that:

  • the first $O(nmw)$ input bits represent a $w$-bounded $m\times n$ matrix $\mathbf{A}\in\mathcal{G}_{z,2}$ in sign-magnitude form, the next $O(\log nmw)$ bits represent an index $i$, and the last $O(\log nmw)$ bits represent another index $j$;

  • for $i\leq m'$ and $j\leq n'$, where $m',n'\leq O(nmw)$ are as calculated by Algorithm 1 on $\mathbf{A}$, the output represents (in sign-magnitude form) the entry of $\mathbf{B}$ in row $i$, column $j$, as calculated by Algorithm 1 on $\mathbf{A}$; otherwise it represents $0$.

Proof 3.8.

Consider the $i$-th row of $\mathbf{B}$; we need to determine which equation it corresponds to. Because the equations of the $\mathcal{MC}_2\mathtt{Gadget}$s are placed after the modified equations of $\mathbf{A}$, there are two cases:

  • $i\leq m$: in this case the equation is $\mathbf{A}_i$ after the transformation, which has the form $C(x_{j_+}-x_{j_-})=\mathbf{c}_i$ with $C=\underbrace{10\cdots0}_{\mathtt{Len}^+_i(\mathbf{A})\text{ bits}}$ in binary, since the transformation does not change the sum of positive (or negative) coefficients in $\mathbf{A}$. What remains is to calculate the indices $j_+$ and $j_-$ in $\mathsf{TC}^0$.

    Let $\mathtt{SumNumG}^s_i(\mathbf{A})=\sum_{k=1}^w\mathtt{NumGadget}^s_{i,k}(\mathbf{A})$ for $s\in\{+,-\}$ and $1\leq i\leq m$. For each $s\in\{+,-\}$ there are two cases:

    • No new variable was added, i.e. $\mathtt{SumNumG}^s_i(\mathbf{A})=0$: then there is only one entry in the original $\mathbf{A}_i$ with sign $s$, and $j_s$ is the index of this entry. We search for this entry to find its index, which can be implemented in $\mathsf{AC}^0$ because there are only $n$ possibilities.

    • Some new variables were added: then $j_s$ is the index of the last added new variable, ignoring those used only inside the $\mathcal{MC}_2\mathtt{Gadget}$s. It can be calculated as

    $$j_s=\begin{cases}n+\mathtt{SumNumG}^+_i(\mathbf{A})+\sum_{i'=1}^{i-1}\sum_{s'\in\{+,-\}}\mathtt{SumNumG}^{s'}_{i'}(\mathbf{A})&\text{if }s=+,\\[2pt]n+\sum_{i'=1}^{i}\sum_{s'\in\{+,-\}}\mathtt{SumNumG}^{s'}_{i'}(\mathbf{A})&\text{if }s=-.\end{cases}$$

    $\mathtt{SumNumG}^s_i\in\mathsf{TC}^0$ by Fact 1 and Lemma 3.6, so both $j_+$ and $j_-$ can be calculated in $\mathsf{TC}^0$ by Fact 1. Therefore we can compute the entries of row $i$ in $\mathsf{TC}^0$ for $i\leq m$.

  • $i>m$: in this case the equation belongs to a gadget $\mathcal{MC}_2\mathtt{Gadget}(t,t',j_1,j_2)$ for some $t,t',j_1,j_2$. We need to calculate $t,t',j_1,j_2$ in $\mathsf{TC}^0$; it is then easy to recover the equation from the offset within the gadget.

    We first calculate in $\mathsf{AC}^0$ the gadget's index $ind=\lfloor\frac{i-m-1}{8}\rfloor+1$, as $\lfloor\frac{\cdot}{8}\rfloor$ just ignores the three least significant bits. By Algorithm 1 and Remark 3.2 we know $t=n+ind$, so $t$ is $\mathsf{AC}^0$-computable, and we can also calculate $t'=n+\sum_{i'=1}^m\sum_{s'\in\{+,-\}}\mathtt{SumNumG}^{s'}_{i'}(\mathbf{A})+4(ind-1)$ in $\mathsf{TC}^0$ by Fact 1.

    Next we calculate the row index $i'$, bit index $k'$, sign $s'\in\{+,-\}$, and number $\ell'$ such that this gadget is the $\ell'$-th gadget created to eliminate the $k'$-th bit of $\mathbf{A}_{i'}$ for variables with sign $s'$. More specifically, we want the minimum $i'$, $k'$, $s'$ (where we treat $+<-$) such that

    $$\mathtt{PrefixSum}^{s'}_{i',k'}(\mathbf{A})<ind\leq\mathtt{PrefixSum}^{s'}_{i',k'}(\mathbf{A})+\mathtt{NumGadget}^{s'}_{i',k'}(\mathbf{A}),$$

    where

    $$\begin{aligned}\mathtt{PrefixSum}^+_{i',k'}(\mathbf{A})&=\sum_{i''=1}^{i'-1}\sum_{s''\in\{+,-\}}\mathtt{SumNumG}^{s''}_{i''}(\mathbf{A})+\sum_{k''=1}^{k'-1}\sum_{s''\in\{+,-\}}\mathtt{NumGadget}^{s''}_{i',k''}(\mathbf{A}),\\\mathtt{PrefixSum}^-_{i',k'}(\mathbf{A})&=\mathtt{PrefixSum}^+_{i',k'}(\mathbf{A})+\mathtt{NumGadget}^+_{i',k'}(\mathbf{A}).\end{aligned}$$

    There are $O(mw)$ possible choices of $i'$, $k'$, $s'$, and each condition involves a prefix sum of the $\mathtt{NumGadget}^s_{i,k}$'s, so this can be done in $\mathsf{TC}^0$ by comparing $ind$ in parallel against all these prefix sums. Having found $i'$, $k'$, and $s'$, we get $\ell'=ind-\mathtt{PrefixSum}^{s'}_{i',k'}(\mathbf{A})$ by Fact 1.

    What remains is to calculate j1j_{1} and j2j_{2} from ii^{\prime}, kk^{\prime}, ss^{\prime}, and \ell^{\prime}. Note that when eliminating the kk^{\prime}-th bit of 𝐀i\mathbf{A}_{i^{\prime}} for variables with sign ss^{\prime}, we first eliminate in pairs those variables in the original 𝐀i\mathbf{A}_{i^{\prime}} that have 11 in the kk^{\prime}-th bit before the reduction (we call them original pairs), then eliminate in pairs those generated in the previous round for ii^{\prime}, k1k^{\prime}-1, and ss^{\prime}. The number of original pairs is given by p=𝙲𝚘𝚞𝚗𝚝𝙱𝚒𝚝i,ks(𝐀)/2p=\lfloor\mathtt{CountBit}^{s^{\prime}}_{i^{\prime},k^{\prime}}(\mathbf{A})/2\rfloor, computable in 𝖳𝖢0\mathsf{TC}^{0}. There are two cases:

    • p\ell^{\prime}\leq p: j1j_{1} and j2j_{2} are the indices of the variables in the \ell^{\prime}-th original pair, i.e., the indices of the (21)(2\ell^{\prime}-1)-th and 22\ell^{\prime}-th variables in 𝐀i\mathbf{A}_{i^{\prime}} that have 11 in the kk^{\prime}-th bit. As above, this can be done by a parallel comparison to the prefix sums of the kk^{\prime}-th bits of the variables in 𝐀i\mathbf{A}_{i^{\prime}}.

    • >p\ell^{\prime}>p: then j1j_{1} and j2j_{2} are the (2(p)1)(2(\ell^{\prime}-p)-1)-th and 2(p)2(\ell^{\prime}-p)-th new variables generated in the previous round, therefore j1=𝙿𝚛𝚎𝚏𝚒𝚡𝚂𝚞𝚖i,k1s(𝐀)+2(p)1j_{1}=\mathtt{PrefixSum}^{s^{\prime}}_{i^{\prime},k^{\prime}-1}(\mathbf{A})+2(\ell^{\prime}-p)-1 and j2=j1+1j_{2}=j_{1}+1.

    However, the second case only works when 𝙲𝚘𝚞𝚗𝚝𝙱𝚒𝚝i,ks(𝐀)\mathtt{CountBit}^{s^{\prime}}_{i^{\prime},k^{\prime}}(\mathbf{A}) is even. When it is odd, =p+1\ell^{\prime}=p+1 corresponds to the pair consisting of the last original variable in this round and the first variable generated in the previous round, so j1j_{1} can be calculated as in the first case and j2=𝙿𝚛𝚎𝚏𝚒𝚡𝚂𝚞𝚖i,k1s(𝐀)+1j_{2}=\mathtt{PrefixSum}^{s^{\prime}}_{i^{\prime},k^{\prime}-1}(\mathbf{A})+1. Then for >p+1\ell^{\prime}>p+1, we have j1=𝙿𝚛𝚎𝚏𝚒𝚡𝚂𝚞𝚖i,k1s(𝐀)+2(p)2j_{1}=\mathtt{PrefixSum}^{s^{\prime}}_{i^{\prime},k^{\prime}-1}(\mathbf{A})+2(\ell^{\prime}-p)-2 and j2=j1+1j_{2}=j_{1}+1.
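Putting the two cases and the odd-count correction together, the index computation can be sketched as follows; `orig_pair` (returning the index of the tt-th variable of 𝐀i\mathbf{A}_{i^{\prime}} with a 11 in the kk^{\prime}-th bit) and `prev_base` (standing for 𝙿𝚛𝚎𝚏𝚒𝚡𝚂𝚞𝚖i,k1s(𝐀)\mathtt{PrefixSum}^{s^{\prime}}_{i^{\prime},k^{\prime}-1}(\mathbf{A})) are hypothetical helpers introduced for this sketch:

```python
def pair_indices(ell, count_bit, orig_pair, prev_base):
    """Return (j1, j2) for the ell-th eliminated pair.

    count_bit plays the role of CountBit^{s'}_{i',k'}(A);
    p = count_bit // 2 is the number of original pairs."""
    p = count_bit // 2
    if ell <= p:                        # both variables are original
        return orig_pair(2 * ell - 1), orig_pair(2 * ell)
    if count_bit % 2 == 0:              # even count: both were generated last round
        j1 = prev_base + 2 * (ell - p) - 1
        return j1, j1 + 1
    if ell == p + 1:                    # odd count: last original meets first generated
        return orig_pair(2 * p + 1), prev_base + 1
    j1 = prev_base + 2 * (ell - p) - 2  # remaining pairs of generated variables
    return j1, j1 + 1
```

Each branch uses only comparisons and fixed arithmetic on precomputed counts, so it stays within 𝖳𝖢0\mathsf{TC}^{0}.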

In conclusion, given ii as input, we can first check whether imi\leq m^{\prime} in 𝖳𝖢0\mathsf{TC}^{0} by Lemma 3.6 and, if so, compute the coefficients of 𝐁i\mathbf{B}_{i} in 𝖳𝖢0\mathsf{TC}^{0}; then, given jj as input, we check whether jnj\leq n^{\prime} in 𝖳𝖢0\mathsf{TC}^{0} and, if so, compute the entry of 𝐁\mathbf{B} in row ii, column jj in 𝖳𝖢0\mathsf{TC}^{0}.

4 Generalization to approximate solvers and more restrictive classes

Now we generalize the above reductions for exact solvers to reductions for approximate solvers, thus proving Theorem 1.2. First we need the following result on the power of 𝖳𝖢0\mathsf{TC}^{0}.

Fact 3.

Division and iterated multiplication are in 𝖣𝖫𝖮𝖦𝖳𝖨𝖬𝖤\mathsf{DLOGTIME}-uniform 𝖳𝖢0\mathsf{TC}^{0} [5]. Moreover, we can approximate in 𝖳𝖢0\mathsf{TC}^{0} functions represented by sufficiently nice power series, such as log\log, exp\exp, and x1/kx^{1/k} [11, 9, 5].
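As a concrete illustration of such a "sufficiently nice" power series (a standard fact, not specific to [11, 9, 5]): for |x|1/2|x|\leq 1/2 one can approximate the logarithm by truncating its Taylor series, and each term, as well as the iterated sum, is computable in 𝖳𝖢0\mathsf{TC}^{0} by Fact 3:

```latex
\log(1+x) \;=\; \sum_{t=1}^{T} \frac{(-1)^{t+1} x^{t}}{t} \;+\; E_T,
\qquad
|E_T| \;\le\; \sum_{t>T} \frac{|x|^{t}}{t} \;\le\; 2^{-T}
\quad (|x|\le 1/2),
```

so taking TT polynomial in the input length already yields exponentially small error.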

Proof 4.1 (Proof sketch of Theorem 1.2).

Building on our proofs for the simplified reductions, we prove that the original reductions from 𝒢\mathcal{G} to 𝒞2\mathcal{MC}_{2} in [7] can be computed in 𝖳𝖢0\mathsf{TC}^{0}. In the context of approximate solvers, the error ε\mathrm{\varepsilon} that the solvers are required to achieve is also part of the instance. By Fact 3, all the errors in the reduced instances, as defined in the reductions in Kyng and Zhang’s full paper [8], can be computed in 𝖳𝖢0\mathsf{TC}^{0} given the size, condition number, and bit complexity of the original matrix as parameters.

  • From 𝒢\mathcal{G} to 𝒢z\mathcal{G}_{z}: this reduction remains the same, but now, knowing the accuracy, we can recover 𝐱\mathbf{x}^{\prime} from 𝐱\mathbf{x} using fixed-point arithmetic in 𝖳𝖢0\mathsf{TC}^{0}.

  • From 𝒢z\mathcal{G}_{z} to 𝒢z,2\mathcal{G}_{z,2}: the only difference (besides the calculation of accuracy) is that in the original reduction in [8, Section 7.2], in the last row of the reduced matrix 𝐀\mathbf{A}^{\prime} the last two entries are set to be ww and w-w for some value ww, instead of 11 and 1-1 as in our simplified version. However ww is computable in 𝖳𝖢0\mathsf{TC}^{0} by Fact 3 so the reduction is still computable in 𝖳𝖢0\mathsf{TC}^{0}.

  • From 𝒢z,2\mathcal{G}_{z,2} to 𝒞2\mathcal{MC}_{2}: for approximate solvers we have to use the original 𝒞2𝙶𝚊𝚍𝚐𝚎𝚝\mathcal{MC}_{2}\mathtt{Gadget} in [8, Section 4], consisting of ten 22-commodity equations instead of eight, with additional variables. So we need to modify the corresponding numbers in our calculation; in particular the gadget’s index becomes ind=im110+1ind=\lfloor\frac{i-m-1}{10}\rfloor+1, which is still 𝖳𝖢0\mathsf{TC}^{0}-computable. Besides, the equations in the 𝒞2𝙶𝚊𝚍𝚐𝚎𝚝\mathcal{MC}_{2}\mathtt{Gadget} for eliminating the kk-th bit of variables with sign ss in 𝐀i\mathbf{A}_{i} are multiplied by a factor s2kwi-s\cdot 2^{k}\cdot w_{i}, where wi=10s{+,}𝚂𝚞𝚖𝙽𝚞𝚖𝙶is(𝐀)w_{i}=\sqrt{10\sum_{s^{\prime}\in\{+,-\}}\mathtt{SumNumG}^{s^{\prime}}_{i}(\mathbf{A})}, as specified in Algorithm 1 of [8]. These factors can be calculated to the desired accuracy in 𝖳𝖢0\mathsf{TC}^{0} by Fact 3. Therefore the reduction is still computable in 𝖳𝖢0\mathsf{TC}^{0}.

Additionally, in the second and third reductions, when recovering 𝐱\mathbf{x}^{\prime} from 𝐱\mathbf{x} we need to check whether the original matrix 𝐀\mathbf{A} and vector 𝐛\mathbf{b} satisfy 𝐀𝐛=𝟎\mathbf{A}^{\top}\mathbf{b}=\mathbf{0}, and simply return 𝟎\mathbf{0} if so. This can be done in 𝖳𝖢0\mathsf{TC}^{0} by Fact 1. In conclusion, we can reduce the problem of approximately solving equations in 𝒢\mathcal{G} to that of approximately solving equations in 𝒞2\mathcal{MC}_{2} in 𝖳𝖢0\mathsf{TC}^{0}.
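The degenerate-case check can be sketched as follows (a plain-Python stand-in; in the reduction the test is the 𝖳𝖢0\mathsf{TC}^{0} iterated addition of Fact 1). If 𝐀𝐛=𝟎\mathbf{A}^{\top}\mathbf{b}=\mathbf{0} then the normal equations 𝐀𝐀𝐱=𝐀𝐛\mathbf{A}^{\top}\mathbf{A}\mathbf{x}=\mathbf{A}^{\top}\mathbf{b} are solved by 𝐱=𝟎\mathbf{x}=\mathbf{0}, so the zero vector is already a least-squares solution:

```python
def solve_or_zero(A, b, solve):
    """Return the zero vector if A^T b = 0 (then x = 0 minimizes ||Ax - b||);
    otherwise defer to the given solver."""
    n = len(A[0])
    # Compute A^T b column by column.
    Atb = [sum(A[r][c] * b[r] for r in range(len(A))) for c in range(n)]
    if all(v == 0 for v in Atb):
        return [0] * n
    return solve(A, b)
```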

In [7] they also considered some more restrictive subclasses of 𝒞2\mathcal{MC}_{2}. Intuitively, the set of strict 22-commodity matrices 𝒞2>0𝒞2\mathcal{MC}^{>0}_{2}\subset\mathcal{MC}_{2} is the class of 22-commodity equation systems 𝐀\mathbf{A} such that for every pair i,ji,j, the equation xixj=0𝐀x_{i}-x_{j}=0\in\mathbf{A} iff the equation yiyj=0𝐀y_{i}-y_{j}=0\in\mathbf{A} iff the equation xiyi(xjyj)=0𝐀x_{i}-y_{i}-(x_{j}-y_{j})=0\in\mathbf{A}. The set of strict 22-commodity matrices with integer entries is denoted by 𝒞2,>0\mathcal{MC}^{>0}_{2,\mathbb{Z}}. They gave reductions from approximately solving 𝒞2\mathcal{MC}_{2} to approximately solving 𝒞2>0\mathcal{MC}^{>0}_{2}, and from 𝒞2>0\mathcal{MC}^{>0}_{2} to 𝒞2,>0\mathcal{MC}^{>0}_{2,\mathbb{Z}}. We show that these reductions can also be computed in 𝖳𝖢0\mathsf{TC}^{0}.

From 𝒞2\mathcal{MC}_{2} to 𝒞2>0\mathcal{MC}^{>0}_{2}

Proof 4.2 (Proof sketch).

The reduction, as defined in [8, Section 5.1], checks, for each pair i,ji,j involved in some equation in 𝐀\mathbf{A}, whether any of the three types of equations is missing, and adds it if so. Each added equation is multiplied by a factor that is computable in 𝖳𝖢0\mathsf{TC}^{0}. Obviously the resulting system of equations is in 𝒞2>0\mathcal{MC}^{>0}_{2}. It is easy to see that the number of added equations for each pair i,ji,j can be computed in 𝖠𝖢0\mathsf{AC}^{0}, so all the prefix sums of these numbers can be calculated in 𝖳𝖢0\mathsf{TC}^{0} simultaneously, and we can determine the equations in the reduced instance in 𝖳𝖢0\mathsf{TC}^{0}.
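The completion step can be illustrated as follows, with equations encoded as tuples `(type, i, j)` for the three types xixj=0x_{i}-x_{j}=0, yiyj=0y_{i}-y_{j}=0, and xiyi(xjyj)=0x_{i}-y_{i}-(x_{j}-y_{j})=0; this encoding is chosen here for illustration only, and the scaling factors applied to the added equations are omitted:

```python
def make_strict(equations):
    """Add the missing equation types for every pair (i, j) that occurs,
    so that all three types are present iff any one is (strict 2-commodity)."""
    pairs = {(i, j) for (_, i, j) in equations}
    completed = set(equations)
    for (i, j) in pairs:
        for t in ('x', 'y', 'xy'):
            # In the real reduction each added equation carries a
            # TC^0-computable multiplicative factor, omitted here.
            completed.add((t, i, j))
    return completed
```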

From 𝒞2>0\mathcal{MC}^{>0}_{2} to 𝒞2,>0\mathcal{MC}^{>0}_{2,\mathbb{Z}}

Proof 4.3 (Proof sketch).

In [8, Section 6] this is done by scaling up all the numbers in the matrix and the vector by a factor of 2k2^{k}, where kk is computable in 𝖳𝖢0\mathsf{TC}^{0} by Fact 3, and then taking the ceiling of each entry of the matrix to make the entries integers, which can also be done in 𝖳𝖢0\mathsf{TC}^{0}.
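The scaling-and-rounding step can be sketched exactly with rationals; here `k` stands for the 𝖳𝖢0\mathsf{TC}^{0}-computable scaling exponent from the reduction:

```python
from fractions import Fraction
import math

def integerize(matrix, k):
    """Scale every entry by 2**k, then round up to the nearest integer."""
    return [[math.ceil(Fraction(a) * 2 ** k) for a in row] for row in matrix]
```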

References

  • [1] Peter Clote and Evangelos Kranakis. Boolean Functions and Computation Models. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2002.
  • [2] Michael B. Cohen, Rasmus Kyng, Gary L. Miller, Jakub W. Pachocki, Richard Peng, Anup B. Rao, and Shen Chen Xu. Solving SDD linear systems in nearly m log^{1/2} n time. In Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31 - June 03, 2014, pages 343–352, 2014.
  • [3] Dean Doron, François Le Gall, and Amnon Ta-Shma. Probabilistic logarithmic-space algorithms for Laplacian solvers. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2017, August 16-18, 2017, Berkeley, CA, USA, pages 41:1–41:20, 2017.
  • [4] François Le Gall. Powers of tensors and fast matrix multiplication. In International Symposium on Symbolic and Algebraic Computation, ISSAC ’14, Kobe, Japan, July 23-25, 2014, pages 296–303, 2014.
  • [5] William Hesse, Eric Allender, and David A. Mix Barrington. Uniform constant-depth threshold circuits for division and iterated multiplication. J. Comput. Syst. Sci., 65(4):695–716, 2002.
  • [6] Neil Immerman and Susan Landau. The complexity of iterated multiplication. In Proceedings: Fourth Annual Structure in Complexity Theory Conference, University of Oregon, Eugene, Oregon, USA, June 19-22, 1989, pages 104–111, 1989.
  • [7] Rasmus Kyng and Peng Zhang. Hardness results for structured linear systems. In 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 684–695, 2017.
  • [8] Rasmus Kyng and Peng Zhang. Hardness results for structured linear systems. CoRR, abs/1705.02944, 2017. arXiv:1705.02944.
  • [9] Alexis Maciel and Denis Thérien. Efficient threshold circuits for power series. Inf. Comput., 152(1):62–73, 1999.
  • [10] Jack Murtagh, Omer Reingold, Aaron Sidford, and Salil P. Vadhan. Derandomization beyond connectivity: Undirected laplacian systems in nearly logarithmic space. In 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 801–812, 2017.
  • [11] John H. Reif and Stephen R. Tate. On threshold circuits and polynomial computation. SIAM J. Comput., 21(5):896–908, 1992.
  • [12] Michael E. Saks and Shiyu Zhou. RSPACE(S) ⊆ DSPACE(S^{3/2}). In 36th Annual Symposium on Foundations of Computer Science, Milwaukee, Wisconsin, USA, 23-25 October 1995, pages 344–353, 1995.
  • [13] Daniel A. Spielman and Shang-Hua Teng. Nearly linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. SIAM J. Matrix Analysis Applications, 35(3):835–885, 2014.
  • [14] Amnon Ta-Shma. Inverting well conditioned matrices in quantum logspace. In Symposium on Theory of Computing Conference, STOC’13, Palo Alto, CA, USA, June 1-4, 2013, pages 881–890, 2013.