The Vector Balancing Constant for Zonotopes

Laurel Heck University of Washington, Seattle. Email: [email protected]. Supported by an NSF Graduate Research Fellowship. Victor Reis University of Washington, Seattle. Email: [email protected] Thomas Rothvoss University of Washington, Seattle. Email: [email protected]. Supported by NSF CAREER grant 1651861 and a David & Lucile Packard Foundation Fellowship.

Abstract

The vector balancing constant $\textrm{vb}(K,Q)$ of two symmetric convex bodies $K,Q$ is the minimum $r\geq 0$ so that any number of vectors from $K$ can be balanced into an $r$ -scaling of $Q$ . A question raised by Schechtman is whether for any zonotope $K\subseteq\mathbb{R}^{d}$ one has $\textrm{vb}(K,K)\lesssim\sqrt{d}$ . Intuitively, this asks whether a natural geometric generalization of Spencer’s Theorem (for which $K=B^{d}_{\infty}$ ) holds. We prove that for any zonotope $K\subseteq\mathbb{R}^{d}$ one has $\textrm{vb}(K,K)\lesssim\sqrt{d}\log\log\log d$ . Our main technical contribution is a tight lower bound on the Gaussian measure of any section of a normalized zonotope, generalizing Vaaler’s Theorem for cubes. We also prove that for two different normalized zonotopes $K$ and $Q$ one has $\textrm{vb}(K,Q)\lesssim\sqrt{d\log d}$ . All the bounds are constructive and the corresponding colorings can be computed in polynomial time.

1 Introduction

Discrepancy theory is a subfield of combinatorics where one is given a set system $(X,\pazocal{F})$ with a ground set $X$ and a family of sets $\pazocal{F}\subseteq 2^{X}$ , and the goal is to find the coloring that minimizes the maximum imbalance, i.e.

\textrm{disc}(\pazocal{F})=\min_{x\in\{-1,1\}^{X}}\max_{S\in\pazocal{F}}\Big{|}\sum_{j\in S}x_{j}\Big{|}.

A slightly more general linear-algebraic view is that one is given a matrix $A\in[-1,1]^{d\times n}$ and its discrepancy is defined as $\min_{x\in\{-1,1\}^{n}}\|Ax\|_{\infty}$ . The best known result in this area is certainly Spencer’s Theorem [Spe85] which states that for any $n\leq d$ one has $\textrm{disc}(A)\leq O(\sqrt{n\log(\frac{2d}{n})})$ . The challenging aspect of that Theorem is that — say for $n=d$ — a uniform random coloring $x\sim\{-1,1\}^{n}$ will only give a $\Theta(\sqrt{n\log n})$ bound. Instead, Spencer [Spe85] applied the partial coloring method which had been first used by Beck [Bec81].

The original proofs of the partial coloring method are based on the pigeonhole principle and are non-constructive. The first polynomial time algorithm to actually find the coloring guaranteed by Spencer [Spe85] is due to Bansal [Ban10], followed by a sequence of algorithms [LM12, Rot14, LRR16, ES18] that either work in more general settings or are simpler.

Discrepancy theory is an extensively studied topic with many applications in mathematics and computer science. To give two concrete examples, Nikolov, Talwar and Zhang [NTZ13] showed a connection between differential privacy and hereditary discrepancy, and the best known approximation algorithm for Bin Packing uses a discrepancy-based rounding [HR17]. Other applications can be found in data structure lower bounds, communication complexity and pseudorandomness; we refer to the book of Chazelle [Cha00] for a more detailed account. The seminal result of Batson, Spielman and Srivastava [BSS09] on the existence of linear-size spectral sparsifiers for graphs can also be interpreted as a discrepancy-theoretic result, see [RR20] for details.

For the purpose of this paper, it will be convenient to introduce more general notation. For two symmetric convex bodies $K,Q\subseteq\mathbb{R}^{d}$ we define the vector balancing constant $\textrm{vb}(K,Q)$ as the smallest number $r\geq 0$ so that for any vectors $u_{1},\ldots,u_{n}\in K$ one can find signs $x\in\{-1,1\}^{n}$ so that the signed sum $x_{1}u_{1}+\cdots+x_{n}u_{n}$ is in $rQ$ . We also denote $\textrm{vb}_{n}(K,Q)$ as the same quantity where we fix the number of vectors to be $n$ . For example, Spencer’s Theorem [Spe85] can then be rephrased as $\textrm{vb}(B_{\infty}^{d},B_{\infty}^{d})=\Theta(\sqrt{d})$ and as $\textrm{vb}_{n}(B_{\infty}^{d},B_{\infty}^{d})=\Theta(\sqrt{n\log(\frac{2d}{n})})$ for $n\leq d$ . Here we denote $B_{p}^{d}$ as the $d$ -dimensional unit ball of the norm $\|\cdot\|_{p}$ . Moreover for a Euclidean ball one can easily prove that $\textrm{vb}(B_{2}^{d},B_{2}^{d})=\Theta(\sqrt{d})$ and for the $\ell_{1}$ -ball we have $\textrm{vb}(B_{1}^{d},B_{1}^{d})=\Theta(d)$ .

While Spencer’s Theorem itself is tight, at least three candidate generalizations have been suggested in the literature — all three are unsolved so far.

The Beck-Fiala Conjecture.

Suppose we have a set system $(X,\pazocal{F})$ in which every element is in at most $t$ sets. Beck and Fiala [BF81] proved using a linear-algebraic argument that in this case the discrepancy is bounded by $2t$ and they state the conjecture that the correct dependence should be $O(\sqrt{t})$ . The same proof of [BF81] also shows that $\textrm{vb}(B_{1}^{d},B_{\infty}^{d})\leq 2$ . However, the Beck-Fiala Conjecture is wide open and the best known bounds are $O(\sqrt{t\log n})$ [Ban98, BDGL18] and $2t-\log^{*}(t)$ [Buk16]. In fact, Komlós Conjecture of $\textrm{vb}(B_{2}^{d},B_{\infty}^{d})\leq O(1)$ is even more general; here the best known bound is $\textrm{vb}(B_{2}^{d},B_{\infty}^{d})\leq O(\sqrt{\log(d)})$ [Ban98].

The Matrix Spencer Conjecture.

A conjecture popularized by Zouzias [Zou12] and Meka [Mek14] claims that for any symmetric matrices $A_{1},\ldots,A_{n}\in\mathbb{R}^{n\times n}$ with all eigenvalues in $[-1,1]$ , there are signs $x\in\{-1,1\}^{n}$ so that the maximum singular value of $\sum_{i=1}^{n}x_{i}A_{i}$ is at most $O(\sqrt{n})$ . Using standard matrix concentration bounds, one can prove that a random coloring attains a value of at most $O(\sqrt{n\log n})$ . Moreover, one can prove the conjectured upper bound of $O(\sqrt{n})$ under the additional assumption that the matrices are block-diagonal with constant size blocks [DJR22], or have rank $O(\sqrt{n})$ [HRS22]. Based on recent progress on matrix concentration, it is possible to obtain the same under the weaker condition that they have rank at most $\frac{n}{\log^{3}(n)}$ [BJM22].

The vector balancing constant of zonotopes.

A zonotope is defined as the linear image of a cube. If $A\in\mathbb{R}^{m\times d}$ is a matrix with $m\geq d$ , we can write a $d$ -dimensional zonotope in the form $K=\{\sum_{i=1}^{m}y_{i}A_{i}\mid y\in[-1,1]^{m}\}=A^{\top}B_{\infty}^{m}\subseteq\mathbb{R}^{d}$ . Note that $m$ is the number of segments of the zonotope. The cube $B_{\infty}^{d}$ is trivially a zonotope, and it is known that for every $p\geq 2$ , the ball $B_{p}^{n}$ is the limit of a sequence of zonotopes, called a zonoid [BLM89]. Schechtman [Sch07] raised the question whether it is true that for any zonotope $K\subseteq\mathbb{R}^{d}$ one has $\textrm{vb}(K,K)\lesssim\sqrt{d}$ where we write $A\lesssim B$ if $A\leq C\cdot B$ for a universal constant $C>0$ . The best known bound of $\textrm{vb}(K,K)\lesssim\sqrt{d\log\log d}$ is a direct consequence of Spencer’s theorem and the fact that zonotopes can be sparsified up to a constant factor with only $O(d\log d)$ segments [Tal90]. An affirmative answer to Schechtman’s question would follow from an $O(d)$ bound, or equivalently whether an $\ell_{1}$ -analogue of [BSS09] is true. We defer to Section 6 for details.

1.1 Our contributions

Our main result is an almost-proof of Schechtman’s conjecture (falling short only by a $\log\log\log d$ term).

Theorem 1.

For any zonotope $K\subseteq\mathbb{R}^{d}$ one has $\textrm{vb}(K,K)\lesssim\sqrt{d}\log\log\log d$ . Moreover, for any $v_{1},\ldots,v_{n}\in K$ one can find in randomized polynomial time a coloring $x\in\{-1,1\}^{n}$ with $\|\sum_{i=1}^{n}x_{i}v_{i}\|_{K}\lesssim\sqrt{d}\log\log\log d$ .

The claim is invariant under linear transformations to $K$ and so it will be useful to place $K$ in a normalized position. For this sake, we make the following definition:

Definition 2.

A matrix $A\in\mathbb{R}^{m\times d}$ is called approximately regular if the following holds: {enumerate*}

The columns $A^{1},\ldots,A^{d}$ are orthonormal.

The rows satisfy $\|A_{i}\|_{2}\leq 2\sqrt{\frac{d}{m}}$ for all $i=1,\ldots,m$ .

Then we call a zonotope $K\subseteq\mathbb{R}^{d}$ normalized if there exists a matrix $A\in\mathbb{R}^{m\times d}$ that is approximately regular so that $K=\sqrt{\frac{d}{m}}A^{\top}B_{\infty}^{m}$ . We choose the scaling so that any cube $B_{\infty}^{d}$ is indeed normalized and zonotopes with any number of segments are comparable to $B_{\infty}^{d}$ in terms of volume and radius.

Our main technical contribution is a tight lower bound for the Gaussian measure of sections of any normalized zonotope.

Theorem 3.

For any normalized zonotope $K\subseteq\mathbb{R}^{d}$ , any subspace $H\subseteq\mathbb{R}^{d}$ with $n:=\dim(H)$ and any $t\geq 1$ , one has $\gamma_{H}(t\cdot C\cdot K\cap H)\geq\exp(-e^{-t^{2}/2}\cdot n)$ where $C>0$ is a universal constant.

In order to prove Theorem 3, we show that a normalized zonotope can be decomposed into $\Theta(\frac{m}{d})$ many smaller zonotopes with $\Theta(d)$ many segments each. This decomposition requires an iterative application of the Kadison-Singer theorem by Marcus, Spielman and Srivastava [MSS15]. Then we prove the statement of Theorem 3 for such simpler zonotopes and derive the lower bound on $\gamma_{H}(t\cdot C\cdot K\cap H)$ by using log-concavity of the Gaussian measure.

We can also use Theorem 3 to show how to balance vectors between different normalized zonotopes:

Theorem 4.

For any normalized zonotopes $K,Q\subseteq\mathbb{R}^{d}$ one has $\textrm{vb}(K,Q)\lesssim\sqrt{d\log d}$ . Moreover, for any $v_{1},\ldots,v_{n}\in K$ one can find in randomized polynomial time a coloring $x\in\{-1,1\}^{n}$ such that ${\|\sum_{i=1}^{n}x_{i}v_{i}\|_{Q}\lesssim\sqrt{d\cdot\log\min\{d,n\}}}$ .

2 Preliminaries

We review a few facts that we rely on later.

Probability.

By $\gamma_{n}$ we denote the (standard) Gaussian density $\frac{1}{(2\pi)^{n/2}}e^{-\|x\|_{2}^{2}/2}$ . For the corresponding distribution we will write $N(0,I_{n})$ . For a subspace $F\subseteq\mathbb{R}^{n}$ we write $I_{F}\in\mathbb{R}^{n\times n}$ as the identity on the subspace; in particular $I_{F}=\sum_{i=1}^{\dim(F)}u_{i}u_{i}^{T}$ where $u_{1},\ldots,u_{\dim(F)}$ is any orthonormal basis of $F$ . A strip is a symmetric convex body of the form $P=\{x\in\mathbb{R}^{n}:|\left<a,x\right>|\leq 1\}$ with $a\in\mathbb{R}^{n}$ .

Theorem 5 (Šidák-Khatri).

For any two symmetric convex bodies $P,Q\subseteq\mathbb{R}^{n}$ where at least one is a strip, one has $\gamma_{n}(P\cap Q)\geq\gamma_{n}(P)\cdot\gamma_{n}(Q)$ .

More recently, Royen [Roy14] proved that this is indeed true for any pair of symmetric convex bodies, but the weaker result suffices for us.

Lemma 6.

For any symmetric convex body $K$ and any subspace $H\subseteq\mathbb{R}^{n}$ one has $\gamma_{H}(K\cap H)\geq\gamma_{n}(K)$ .

We will use the following convenient estimate on the Gaussian measure of a strip:

Lemma 7.

For any $a\in\mathbb{R}^{n}$ with $\|a\|_{2}\leq 1$ and $t\geq 1$ one has

\Pr_{y\sim N(0,I_{n})}[|\left<a,y\right>|\leq t]\geq\exp(-e^{-t^{2}/2}\cdot\|a\|_{2}^{2}).

The following comparison inequality (see e.g. Ledoux and Talagrand [LT11]) will also be useful:

Lemma 8.

Let $K$ be a symmetric convex body and let $0\preceq A\preceq B$ . Then

\Pr_{y\sim N(0,A)}[y\in K]\geq\Pr_{y\sim N(0,B)}[y\in K].

We prove these lemmas in Appendix B. The following lemma allows us to dismiss constant scaling factors, see [Tko15]:

Lemma 9.

Let $K\subset\mathbb{R}^{n}$ be a measurable set and $B$ be an Euclidean ball centered at the origin such that $\gamma_{n}(K)=\gamma_{n}(B)$ . Then $\gamma_{n}(tK)\geq\gamma_{n}(tB)$ for all $t\in[0,1]$ . In particular, if $\gamma_{n}(C_{1}\cdot K)\geq e^{-C_{1}n}$ for some constant $C_{1}\geq 1$ then also $\gamma_{n}(K)\geq e^{-C_{2}n}$ for some $C_{2}:=C_{2}(C_{1})>0$ .

Discrepancy theory.

First we give a full statement of Spencer’s theorem that we mentioned earlier:

Theorem 10 (Spencer’s Theorem [Spe85, LM12]).

For any $A\in[-1,1]^{m\times n}$ with $m\geq n$ there are polynomial time computable signs $x\in\{-1,1\}^{n}$ so that $\|Ax\|_{\infty}\lesssim\sqrt{n\log(\frac{2m}{n})}$ . More generally, for any shift $x_{0}\in[-1,1]^{n}$ , there is a polynomial time computable $x\in\mathbb{R}^{n}$ so that $x+x_{0}\in\{-1,1\}^{n}$ and $\|A(x+x_{0})\|_{\infty}\lesssim\sqrt{n\log(\frac{2m}{n})}$ .

To be exact, the first algorithm giving a bound of $O(\sqrt{n}\log(\frac{2m}{n}))$ is due to Bansal [Ban10] and the tight algorithmic bound is due to Lovett and Meka [LM12].

We say that a vector $x\in\mathbb{R}^{n}$ is a good partial coloring if $x\in[-1,1]^{n}$ with $|\{j\in[n]:x_{j}\in\{-1,1\}\}|\geq n/2$ . We will need a connection between good partial colorings and Gaussian measure lower bounds.

Theorem 11 ([RR22], special case of Theorem 6).

For any $\alpha>0$ , there is a constant $c:=c(\alpha)>0$ and a randomized polynomial time algorithm that for a symmetric convex body $K\subseteq\mathbb{R}^{n}$ , a $2n/3$ -dimensional subspace $F\subseteq\mathbb{R}^{n}$ with $\gamma_{F}(K\cap F)\geq e^{-\alpha n}$ and a shift $y\in(-1,1)^{n}$ , finds $x\in c\cdot K\cap F$ so that $x+y$ is a good partial coloring.

We will also need a theorem of Banaszczyk [Ban98] (whose algorithmic version is due to [BDGL18]).

Theorem 12 (Banaszczyk’s Theorem).

Let $K\subseteq\mathbb{R}^{d}$ be a convex set with $\gamma_{d}(K)\geq\frac{1}{2}$ and let $v_{1},\ldots,v_{n}\in B_{2}^{d}$ . Then there is a randomized polynomial time algorithm to compute signs $x\in\{-1,1\}^{n}$ so that $\sum_{j=1}^{n}x_{j}v_{j}\in CK$ where $C>0$ is a universal constant.

For many decades, the Kadison-Singer problem was an open question in operator theory. It was finally resolved in 2015:

Theorem 13 (Marcus, Spielman, Srivastava [MSS15]).

Let $v_{1},\ldots,v_{m}\in\mathbb{R}^{n}$ so that $\sum_{i=1}^{m}v_{i}v_{i}^{\top}=I_{d}$ and let $\varepsilon>0$ so that $\|v_{i}\|_{2}^{2}\leq\varepsilon$ for all $i\in[m]$ . Then there is a partition $[m]=S_{1}\dot{\cup}S_{2}$ so that for both $j\in\{1,2\}$ one has

\Big{\|}\sum_{i\in S_{j}}v_{i}v_{i}^{\top}-\frac{1}{2}I_{d}\Big{\|}_{\mathrm{op}}\leq 3\sqrt{\varepsilon}

In the definition of $\textrm{vb}(K,Q)$ , there is no upper bound on the number of vectors to be balanced. But it is well-known that up to a constant factor, the worst-case is attained for $d$ many vectors. Let

\textrm{vb}_{n}(K,Q):=\inf\Big{\{}r\geq 0\mid\forall u_{1},\ldots,u_{n}\in K:\exists x\in\{-1,1\}^{n}:\sum_{j=1}^{n}x_{j}v_{j}\in rQ\Big{\}}

be the vector balancing variant with $n$ vectors, so that $\textrm{vb}(K,Q):=\sup_{n\in\mathbb{N}}\textrm{vb}_{n}(K,Q).$

Theorem 14 ([LSV86]).

For any symmetric convex $K,Q\subseteq\mathbb{R}^{d}$ , $\mathrm{vb}(K,Q)\leq 2\cdot\mathrm{vb}_{d}(K,Q)$ .

The reduction underlying the inequality is algorithmic as well.

Zonotopes.

A substantial amount of work in the literature has been done on the question of how one can sparsify an arbitrary zonotope with another zonotope that has fewer segments, while losing only a constant factor approximation. The first bound of $O(d^{2})$ [Sch87] was improved to $O(d\log^{3}d)$ [BLM89]. We highlight the current best known bound:

Theorem 15 (Talagrand [Tal90]).

For any zonotope $K\subseteq\mathbb{R}^{d}$ and $0<\varepsilon\leq\frac{1}{2}$ , there is a zonotope $Q$ with at most $O(\frac{d}{\varepsilon^{2}}\log d)$ segments so that $Q\subseteq K\subseteq(1+\varepsilon)Q$ .

We refer to the approach of Cohen and Peng [CP15] for an elementary exposition of the $O(d\log d)$ bound.

Finally, we justify why it suffices to consider normalized zonotopes:

Lemma 16.

For any full-dimensional zonotope $K=A^{\top}B^{m}_{\infty}\subseteq\mathbb{R}^{d}$ , there is a normalized zonotope $\tilde{K}$ and an invertible linear map $T$ so that $\frac{4}{5}\tilde{K}\subseteq T(K)\subseteq\tilde{K}$ . In particular, $\frac{4}{5}\textrm{vb}(\tilde{K},\tilde{K})\leq\textrm{vb}(K,K)\leq\frac{5}{4}\textrm{vb}(\tilde{K},\tilde{K})$ .

We show the argument in Appendix A.

Lemma 17.

Any normalized zonotope $K\subseteq\mathbb{R}^{d}$ satisfies $K\subseteq\sqrt{d}B_{2}^{d}$ .

Proof.

We write $K=\sqrt{\frac{d}{m}}A^{\top}B_{\infty}^{m}$ where $A\in\mathbb{R}^{m\times d}$ . Note that $A^{\top}A=I_{d}$ by orthonormality of the columns of $A$ and so $\|A\|_{\textrm{op}}=\|A^{\top}A\|_{\textrm{op}}^{1/2}=1$ . By definition, for any $x\in K$ there is a $y\in B_{\infty}^{m}$ with $x=\sqrt{\frac{d}{m}}A^{\top}y$ , so that

\|x\|_{2}=\sqrt{\frac{d}{m}}\|A^{\top}y\|_{2}\leq\sqrt{\frac{d}{m}}\|A^{\top}\|_{\textrm{op}}\cdot\|y\|_{2}\leq\sqrt{d}.\qed

3 Sections of normalized zonotopes

In this section we prove Theorem 3, showing that all sections of zonotopes are large. To be be more precise, we prove the following more general measure lower bound:

Theorem 18.

In the most basic form where $K=B_{\infty}^{d}$ is a cube and $t=1$ , the statement is similar to a result of Vaaler [Vaa79] who proved that $\textrm{Vol}_{H}(K\cap H)\geq 2^{n}$ for any $n$ -dimensional subspace $H\subseteq\mathbb{R}^{d}$ ; though the geometry of a zonotope is more complex and the proof strategy is rather different.

3.1 A first direct lower bound

We begin with a simple estimate on the Gaussian measure of the section of a zonotope where we drop the scalar of $\sqrt{\frac{d}{m}}$ . Hence this bound will be tight if the number of segments is close to $d$ but rather loose otherwise. We denote $\Pi_{H}$ as the orthogonal projection into a subspace $H$ .

Lemma 19.

Let $K:=A^{\top}B_{\infty}^{m}\subseteq\mathbb{R}^{d}$ be a zonotope where $A\in\mathbb{R}^{m\times d}$ is a matrix with orthonormal columns. Then for any subspace $H\subseteq\mathbb{R}^{d}$ with $n:=\dim(H)$ and any $t\geq 1$ one has $\gamma_{H}(t\cdot K\cap H)\geq\exp(-e^{-t^{2}/2}\cdot n)$ .

Proof.

Let $U\in\mathbb{R}^{d\times n}$ be a matrix with orthonormal columns $U^{1},\dots,U^{n}$ spanning $H$ . Then if we draw $y\sim N(0,I_{n})$ , $Uy$ is indeed a standard Gaussian in the subspace $H$ . By assumption, $\sum_{i=1}^{m}A_{i}A_{i}^{\top}=I_{d}$ , and this can be used to write any outcome of the random process as

Uy=\sum_{j=1}^{n}y_{j}I_{d}U^{j}=\sum_{i=1}^{m}A_{i}\sum_{j=1}^{n}y_{j}\left<A_{i},U^{j}\right>=\sum_{i=1}^{m}A_{i}\left<y,U^{\top}A_{i}\right>.

(1)

Here one should think of $U^{\top}A_{i}\in\mathbb{R}^{n}$ as the coordinates of $\Pi_{H}(A_{i})$ in terms of the basis $U$ of $H$ . From the expression in (1) we can draw the following conclusion:
Claim I. For any $y\in\mathbb{R}^{n}$ and $s>0$ one has $(|\left<y,U^{\top}A_{i}\right>|\leq s\;\forall i\in[m])\Rightarrow Uy\in sK$ .
Then Claim I gives a simple sufficient (but in general not necessary) condition for $Uy$ to lie in the zonotope $K$ . Next, we can see that

\sum_{i=1}^{m}\|U^{\top}A_{i}\|_{2}^{2}=\sum_{i=1}^{m}\textrm{Tr}\big{[}UU^{\top}A_{i}A_{i}^{\top}\big{]}=\textrm{Tr}[UU^{\top}]=n

Then we can use Claim I and the inequality of Šidák-Khatri to lower bound the Gaussian measure by

$\displaystyle\gamma_{H}(t\cdot K\cap H)$	$\displaystyle=$	$\displaystyle\Pr_{y\sim N(0,I_{n})}[Uy\in t\cdot K]$
	$\displaystyle\geq$	$\displaystyle\Pr_{y\sim N(0,I_{n})}\big{[}\|\left<U^{\top}A_{i},y\right>\|\leq t\;\;\forall i\in[m]\big{]}$
	$\displaystyle\stackrel{{\scriptstyle\textrm{Lem~{}}\ref{lem:SidakKhatri}}}{{\geq}}$	$\displaystyle\prod_{i=1}^{m}\Pr_{y\sim N(0,I_{n})}\big{[}\|\left<U^{\top}A_{i},y\right>\|\leq t\big{]}$
	$\displaystyle\stackrel{{\scriptstyle\textrm{Lem~{}\ref{lem:GaussianMeasureOfStrip}}}}{{\geq}}$	$\displaystyle\prod_{i=1}^{m}\exp\big{(}-e^{-t^{2}/2}\\|U^{\top}A_{i}\\|_{2}^{2}\big{)}$
	$\displaystyle=$	$\displaystyle\exp\Big{(}-e^{-t^{2}/2}\sum_{i=1}^{m}\\|U^{\top}A_{i}\\|_{2}^{2}\Big{)}=\exp\big{(}-e^{t^{2}/2}n\big{)}$

Here we have used that $\|U^{\top}A_{i}\|_{2}\leq\|A_{i}\|_{2}\leq 1$ which follows by the orthonormality of the columns of $A$ . ∎

It is somewhat unfortunate that Claim I shown above requires that $\sum_{i=1}^{m}A_{i}A_{i}^{\top}$ is exactly the identity and an approximation is not enough. But we can fix this by a rescaling argument:

Lemma 20.

Let $K=A^{\top}B_{\infty}^{m}\subseteq\mathbb{R}^{d}$ be a zonotope where $A\in\mathbb{R}^{m\times d}$ is a matrix so that $\sum_{i=1}^{m}A_{i}A_{i}^{\top}\succeq\alpha I_{d}$ for some $\alpha>0$ . Then for any $n$ -dimensional subspace $H\subseteq\mathbb{R}^{d}$ and any $t\geq 1$ one has $\gamma_{H}(\frac{t}{\sqrt{\alpha}}\cdot K\cap H)\geq\exp\big{(}-e^{-t^{2}/2}\cdot n\big{)}$ .

Proof.

Scaling $K$ by $\frac{1}{\sqrt{\alpha}}$ is equivalent to scaling $\sum_{i=1}^{m}A_{i}A_{i}^{\top}$ by $\frac{1}{\alpha}$ , hence we may assume that indeed $\alpha=1$ . Abbreviate $M:=\sum_{i=1}^{m}A_{i}A_{i}^{\top}\succeq I_{d}$ which is a symmetric positive definite matrix. Consider the matrix $\tilde{A}\in\mathbb{R}^{m\times d}$ with rescaled rows $\tilde{A}_{i}:=M^{-1/2}A_{i}$ , so that $\sum_{i=1}^{m}\tilde{A}_{i}\tilde{A}_{i}^{\top}=I_{d}$ . Let $\tilde{K}:=\tilde{A}^{\top}B_{\infty}^{m}=M^{-1/2}(K)$ and $\tilde{H}:=M^{-1/2}(H)$ be the rescaled zonotope and subspace. Let $U^{1},\ldots,U^{n}$ be an orthonormal basis of $H$ . Then with $\tilde{U}=M^{-1/2}U$ , $\tilde{U}^{1},\ldots,\tilde{U}^{n}$ will be the basis of $\tilde{H}$ , but it will not be orthogonal in general. However, for $y\sim N(0,I_{n})$ one has $\textrm{Cov}(\tilde{U}y)=\tilde{U}\tilde{U}^{\top}=M^{-1/2}UU^{\top}M^{-1/2}\preceq I_{\tilde{H}}$ . Then

\Pr_{y\sim N(0,I_{d})}[Uy\in tK]=\Pr_{y\sim N(0,I_{d})}[\tilde{U}y\in t\tilde{K}]\stackrel{{\scriptstyle\textrm{Lem~{}\ref{lem:ComparisonGaussians}}}}{{\geq}}\Pr_{y\sim N(0,I_{\tilde{H}})}[y\in t\tilde{K}]\stackrel{{\scriptstyle\textrm{Lem~{}\ref{lem:MeasureOfSliceDirectProof}}}}{{\geq}}\exp\big{(}-e^{-t^{2}/2}n\big{)}.\qed

3.2 Decomposition of normalized zonotopes

The next step in our proof strategy is to decompose the rows of an approximately regular matrix $A\in\mathbb{R}^{m\times d}$ into $\Theta(\frac{m}{d})$ many blocks $J\subseteq[m]$ so that $\sum_{i\in J}A_{i}A_{i}^{\top}\succeq\Omega(\frac{d}{m})\cdot I_{d}$ . For this purpose, we formulate a slight variant of Theorem 13.

Lemma 21.

Let $v_{1},\ldots,v_{m}\in\mathbb{R}^{d}$ be vectors with $\sum_{i=1}^{m}v_{i}v_{i}^{\top}\succeq L\cdot I_{d}$ for some $L>0$ and let $\varepsilon:=\max_{i=1,\ldots,m}\|v_{i}\|_{2}^{2}$ . Then there is a partition $[m]=S_{1}\dot{\cup}S_{2}$ so that

\sum_{i\in S_{j}}v_{i}v_{i}^{\top}\succeq\Big{(}\frac{L}{2}-3\sqrt{L\varepsilon}\Big{)}I_{d}\quad\forall j\in\{1,2\}

Proof.

Abbreviate $M:=\sum_{i=1}^{m}v_{i}v_{i}^{\top}$ which is a PSD matrix with $M\succeq L\cdot I_{d}$ . Define $v_{i}^{\prime}:=M^{-1/2}v_{i}$ . Then $\sum_{i=1}^{m}v_{i}^{\prime}(v_{i}^{\prime})^{\top}=M^{-1/2}\big{(}\sum_{i=1}^{m}v_{i}v_{i}^{\top}\Big{)}M^{-1/2}=I_{d}$ . We set $\varepsilon^{\prime}:=\frac{\varepsilon}{L}$ and verify that for all $i$ one has $\|v_{i}^{\prime}\|_{2}^{2}=v_{i}^{\top}M^{-1}v_{i}\leq v_{i}^{\top}(\tfrac{1}{L}I_{d})v_{i}=\frac{\|v_{i}\|_{2}^{2}}{L}\leq\varepsilon^{\prime}.$ Then we apply Theorem 13 to the vectors $\{v_{i}^{\prime}\}_{i\in[m]}$ and obtain a partition $[m]=S_{1}\dot{\cup}S_{2}$ so that for $j\in\{1,2\}$ one has

\color[rgb]{0,0,0}{M^{-1/2}\Big{(}\sum_{i\in S_{j}}v_{i}v_{i}^{\top}\Big{)}M^{-1/2}=\sum_{i\in S_{j}}v_{i}^{\prime}(v_{i}^{\prime})^{\top}\stackrel{{\scriptstyle\textrm{Thm~{}\ref{thm:MSS}}}}{{\succeq}}\Big{(}\frac{1}{2}-3\sqrt{\varepsilon/L}\Big{)}I_{d}},

and using the fact that $A\succeq B\implies M^{1/2}AM^{1/2}\succeq M^{1/2}BM^{1/2}$ , we conclude

\sum_{i\in S_{j}}v_{i}v_{i}^{\top}\succeq\Big{(}\frac{1}{2}-3\sqrt{\varepsilon/L}\Big{)}M^{1/2}I_{d}M^{1/2}\succeq\Big{(}\frac{L}{2}-3\sqrt{L\varepsilon}\Big{)}I_{d}.\qed

Now to the main lemma of this section where we decompose an approximately regular matrix by iteratively applying Lemma 21.

Lemma 22.

There is a universal constant $C>0$ so that the following holds. Let $A\in\mathbb{R}^{m\times d}$ be an approximately regular matrix. Then there are disjoint subsets $J_{1}\dot{\cup}\cdots\dot{\cup}J_{k}\subseteq[m]$ with $k\geq\frac{m}{Cd}$ and $|J_{\ell}|\leq Cd$ and $\sum_{i\in J_{\ell}}A_{i}A_{i}^{\top}\succeq\frac{1}{Ck}I_{d}$ for all $\ell\in[k]$ .

Proof.

If $\frac{m}{d}\leq C$ we may set $k=1$ and $J_{1}=[m]$ , so assume $m\geq Cd$ . Set $\varepsilon:=4\frac{d}{m}$ so that $\|A_{i}\|_{2}^{2}\leq\varepsilon$ for all $i\in[m]$ . Let $t\in\mathbb{N}$ be a parameter that we choose later. For $s\in\{0,\ldots,t\}$ we will obtain partitions $\pazocal{P}_{s}$ of the row indices starting with $\pazocal{P}_{0}:=\{[m]\}$ so that $\pazocal{P}_{s+1}$ is a refinement of $\pazocal{P}_{s}$ and moreover $|\pazocal{P}_{s}|=2^{s}$ . More precisely, in each iteration $s\in\{0,\ldots,t-1\}$ and for each $S\in\pazocal{P}_{s}$ , we apply Lemma 21 to the vectors $\{A_{i}\}_{i\in S}$ ; if $S=S_{1}\dot{\cup}S_{2}$ is the obtained partition, then we add $\{S_{1},S_{2}\}$ to $\pazocal{P}_{s+1}$ . We first analyze the corresponding eigenvalue lower bound. Define $L_{s}:=2^{-s}-15\sqrt{2^{-s}\varepsilon}$ .
Claim. If $2^{t}\leq\frac{m}{Cd}$ for a large enough constant $C>0$ , then for all $s\in\{0,\ldots,t\}$ one has $\sum_{i\in S}A_{i}A_{i}^{\top}\succeq L_{s}I_{d}$ for all $S\in\pazocal{P}_{s}$ .
Proof of Claim. Clearly $L_{s}\leq 2^{-s}$ all $s\geq 0$ . We will prove the claim by induction on $s$ . For $s=0$ one has $\pazocal{P}_{0}=\{[m]\}$ and the claim is true as $L_{0}\leq 1$ . Now consider an iteration $s\in\{0,\ldots,t-1\}$ and suppose $S\in\pazocal{P}_{s}$ is split into $S=S_{1}\dot{\cup}S_{2}$ . Then $\sum_{i\in S_{j}}A_{i}A_{i}^{\top}\succeq(\frac{L_{s}}{2}-3\sqrt{L_{s}\varepsilon})I_{d}$ for both $j\in\{1,2\}$ . This is at least $L_{s+1}$ as:

\frac{L_{s}}{2}-3\sqrt{L_{s}\varepsilon}\stackrel{{\scriptstyle L_{s}\leq 2^{-s}}}{{\geq}}\frac{L_{s}}{2}-3\sqrt{2^{-s}\varepsilon}\geq 2^{-(s+1)}-\frac{15}{2}\sqrt{2^{-s}\varepsilon}-3\sqrt{2^{-s}\varepsilon}\geq 2^{-(s+1)}-15\sqrt{2^{-(s+1)}\varepsilon}.

Here we use $15/2+3\leq 15\sqrt{2^{-1}}$ . This shows the claim. ∎

For a large enough constant $C$ , we pick $t\in\mathbb{N}$ so that $\frac{m}{2Cd}\leq 2^{t}\leq\frac{m}{Cd}$ . Then $L_{t}\geq\frac{Cd}{m}-15\sqrt{\frac{2Cd}{m}\cdot 4\frac{d}{m}}=\frac{d}{m}\cdot(C-15\sqrt{8C})\geq\frac{C}{2}\cdot\frac{d}{m}$ for $C$ large enough. Moreover we know that $\mathop{\mathbb{E}}_{S\sim\pazocal{P}_{t}}[|S|]=\frac{m}{2^{t}}\leq 2Cd$ . Then by Markov’s inequality at least half the sets $S\in\pazocal{P}_{t}$ have at most $4Cd$ indices. Those sets will satisfy the statement. ∎

3.3 Proof of Theorem 3

Next we prove our main technical result, Theorem 3. Recall that a measure $\mu$ on $\mathbb{R}^{d}$ is called log-concave if for all compact subsets $S,T\subseteq\mathbb{R}^{d}$ and $0\leq\lambda\leq 1$ one has

\mu(\lambda S+(1-\lambda)T)\geq\mu(S)^{\lambda}\cdot\mu(T)^{1-\lambda}

By induction one can verify that for any compact subsets $S_{1},\dots,S_{k}\subseteq\mathbb{R}^{d}$ and $\lambda_{1},\ldots,\lambda_{k}\geq 0$ with $\sum_{i=1}^{k}\lambda_{i}=1$ we have $\mu(\lambda_{1}S_{1}+\cdots+\lambda_{k}S_{k})\geq\prod_{\ell=1}^{k}\mu(S_{\ell})^{\lambda_{\ell}}$ . Also recall that the Gaussian measure $\gamma_{d}$ is indeed log-concave, see e.g. [AAGM15]. For a matrix $A\in\mathbb{R}^{m\times d}$ and indices $J\subseteq[m]$ we denote $A_{J}\in\mathbb{R}^{|J|\times d}$ as the submatrix of $A$ with rows in $J$ .

Proof of Theorem 3.

Let $K\subseteq\mathbb{R}^{d}$ be a normalized zonotope and let $H\subseteq\mathbb{R}^{d}$ be a subspace with dimension $n$ . Then we can write $K=\sqrt{\frac{d}{m}}A^{\top}B_{\infty}^{m}$ where $A\in\mathbb{R}^{m\times d}$ is approximately regular. We use Lemma 22 to obtain disjoint subsets $J_{1}\dot{\cup}\cdots\dot{\cup}J_{k}\subseteq[m]$ with $k\geq\frac{m}{Cd}$ so that $\sum_{i\in J_{\ell}}A_{i}A_{i}^{\top}\succeq\frac{d}{Cm}I_{d}$ where $C>0$ is a constant. Consider the zonotope $K_{\ell}:=\sqrt{\frac{d}{m}}A_{J_{\ell}}^{\top}B_{\infty}^{|J_{\ell}|}$ generated by the rows with indices in $J_{\ell}$ . Then we have $K_{1}+\ldots+K_{k}\subseteq K$ and $(K_{1}\cap H)+\ldots+(K_{k}\cap H)\subseteq K\cap H$ . Note that for each $\ell\in[k]$ we have $kK_{\ell}\supseteq\sqrt{\frac{k}{C}}A_{J_{\ell}}^{\top}B_{\infty}^{|J_{\ell}|}$ , so that $\sum_{i\in J_{\ell}}(\sqrt{\tfrac{k}{C}}A_{i})(\sqrt{\tfrac{k}{C}}A_{i})^{\top}\succeq\frac{k}{C}\cdot\frac{d}{Cm}I_{d}\succeq\frac{1}{C^{3}}I_{d}$ . Then applying Lemma 20 with $\alpha:=\frac{1}{C^{3}}$ we have

\gamma_{H}\big{(}tC^{3/2}kK_{\ell}\cap H\big{)}\geq\exp\big{(}-e^{t^{2}/2}\cdot n\big{)}

for all $t\geq 1$ . Finally, using log-concavity of the Gaussian measure we obtain

	$\displaystyle\gamma_{H}\big{(}tC^{3/2}K\cap H\big{)}$	$\displaystyle\geq$	$\displaystyle\gamma_{H}\big{(}(tC^{3/2}K_{1}\cap H)+\ldots+(tC^{3/2}K_{k}\cap H)\big{)}$
		$\displaystyle\geq$	$\displaystyle\prod_{\ell=1}^{k}\gamma_{H}\big{(}tC^{3/2}\cdot kK_{\ell}\cap H\big{)}^{1/k}\geq\exp\big{(}-e^{-t^{2}/2}\cdot n\big{)}.\qed$

4 The vector balancing constant $\mathrm{vb}(K,K)$

Next, we show how to translate measure lower bounds for sections into an improved bounds on the vector balancing constant.

4.1 Tight partial colorings for zonotopes

First we prove a generalization of the constant discrepancy partial coloring for the Komlós setting:

Lemma 23.

Let $v_{1},\ldots,v_{n}\in B_{2}^{d}$ and let $K\subseteq\mathbb{R}^{d}$ be a symmetric convex body with $\gamma_{H}(K\cap H)\geq e^{-\alpha n}$ for some $\alpha>0$ where $H=\textrm{span}\{v_{1},\ldots,v_{n}\}$ . Then there is a randomized polynomial time algorithm that given a shift $y\in(-1,1)^{n}$ finds a good partial coloring $x+y\in[-1,1]^{n}$ with $\sum_{j=1}^{n}x_{j}v_{j}\in cK$ where $c:=c(\alpha)$ is a constant.

Proof.

Let $Z\sim\sum_{j=1}^{n}z_{j}v_{j}$ where $z_{i}\sim N(0,1)$ are i.i.d. Gaussians so that $\mathbb{E}[ZZ^{\top}]=\sum_{j=1}^{n}v_{j}v_{j}^{\top}$ has trace $\textrm{Tr}\big{[}\mathbb{E}[ZZ^{\top}]\big{]}=\sum_{j=1}^{n}\|v_{j}\|_{2}^{2}\leq n$ . Let $u_{1},\dots,u_{r}$ be an orthonormal basis of $H$ with $r\leq n$ , and write $\sum_{j=1}^{n}v_{j}v_{j}^{\top}=\sum_{j=1}^{r}\sigma_{j}u_{j}u_{j}^{\top}$ . Since $\sum_{j=1}^{n}v_{j}v_{j}^{\top}\succeq 0$ , we have $\sigma_{j}\geq 0$ for all $j$ . Then after reindexing we may assume that $0\leq\sigma_{1}\leq\sigma_{2}\leq\dots\leq\sigma_{r}$ . Since $\sum_{j=1}^{r}\sigma_{j}=\sum_{j=1}^{n}\|v_{j}\|_{2}^{2}\leq n$ we know by Markov’s Inequality that $\sigma_{2n/3}\leq 3/2$ , denoting $\sigma_{j}=0$ for $j>r$ . Thus restricting to the subspaces $F:=\textrm{span}\{u_{1},\dots,u_{2n/3}\}$ and $V:=\{g\in\mathbb{R}^{n}\mid\sum_{j=1}^{n}g_{j}v_{j}\in F\}$ with $\dim(V)\geq\frac{2}{3}n$ , we may lower bound

$\displaystyle\Pr_{g\sim N(0,I_{V})}\Big{[}\sum_{j=1}^{n}g_{j}v_{j}\in 3/2\cdot K\Big{]}$	$\displaystyle=$	$\displaystyle\Pr_{g\sim N(0,I_{2n/3})}\Big{[}\sum_{j=1}^{2n/3}g_{j}\cdot\sigma_{j}u_{j}u_{j}^{\top}\in 3/2\cdot K\Big{]}$
	$\displaystyle\stackrel{{\scriptstyle(*)}}{{\geq}}$	$\displaystyle\Pr_{g\sim N(0,I_{2n/3})}\Big{[}\sum_{j=1}^{2n/3}g_{j}\cdot 3/2\cdot u_{j}u_{j}^{\top}\in 3/2\cdot K\Big{]}$
	$\displaystyle=$	$\displaystyle\gamma_{F}(K\cap F)$
	$\displaystyle\stackrel{{\scriptstyle\textrm{Lem~{}\ref{lem:ProbOfSubspaceVsBody}}}}{{\geq}}$	$\displaystyle\gamma_{H}(K\cap H)$
	$\displaystyle\geq$	$\displaystyle e^{-\alpha n},$

where $(*)$ follows by Lemma 8. Then by Theorem 11, the symmetric convex body $Q:=\{x\in\mathbb{R}^{n}:\sum_{j=1}^{n}x_{j}v_{j}\in K\}$ contains a good partial coloring in $Q\cap F$ . ∎

Then Lemma 23 implies the existence of a partial coloring with optimal bounds as long as $n$ is of the order of $d$ :

Corollary 24.

Let $K\subseteq\mathbb{R}^{d}$ be a normalized zonotope and let $v_{1},\ldots,v_{n}\in K$ . Then there is a randomized polynomial time algorithm to find a good partial coloring $x\in[-1,1]^{n}$ so that $\|\sum_{j=1}^{n}x_{j}v_{j}\|_{K}\lesssim\sqrt{d}$ .

Proof.

By Theorem 3, denoting $H:=\textrm{span}\{v_{1},\dots,v_{n}\}$ , we have $\gamma_{H}(C\cdot K\cap H)\geq e^{-n}$ . By Lemma 9, there exists some constant $\alpha>0$ such that $\gamma_{H}(K\cap H)\geq e^{-\alpha n}$ . By Lemma 17, $v_{i}\in\sqrt{d}B^{d}_{2}$ , so that the statement follows directly from Lemma 23. ∎

4.2 Proof of the main Theorem

Now we have all the ingredients to prove our main result, Theorem 1.

Proof of Theorem 1.

By Theorem 15, we may assume that $K$ is generated by only $m\lesssim d\log d$ segments, and by Lemma 16, we may assume that $K$ is a normalized zonotope $K:=\sqrt{\frac{d}{m}}A^{\top}B^{m}_{\infty}$ for some approximately regular $A\in\mathbb{R}^{m\times d}$ . By Theorem 14, since $\textrm{vb}(K,K)\leq 2\cdot\textrm{vb}_{d}(K,K)$ , we may assume that $n=d$ , though for clarity we only use this in the final bound. As before we set $Q:=\{x\in\mathbb{R}^{n}:\sum_{j=1}^{n}x_{j}v_{j}\in K\}$ . We iteratively apply Lemma 23 for $t$ rounds to obtain a partial coloring $x^{\prime}\in Q\cap[-1,1]^{n}$ , so that the set $I:=\{i:|x^{\prime}_{i}|<1\}$ of partially colored indices satisfies $|I|\leq n/2^{t}$ , and by the triangle inequality over the $t$ rounds $\|\sum_{j=1}^{n}x^{\prime}_{j}v_{j}\|_{K}\lesssim\sqrt{d}\cdot t$ .

For each $j\in I$ , we may write $v_{j}=\sqrt{\frac{d}{m}}A^{\top}u_{i}$ for some $u_{i}\in B_{\infty}^{m}$ . By Theorem 10, we can find $\tilde{x}\in\mathbb{R}^{n}$ so that $x:=\tilde{x}+x^{\prime}\in\{-1,1\}^{n}$ and $\sum_{i\in I}\tilde{x}_{i}u_{i}\in\sqrt{|I|\log(\frac{2m}{|I|})}\cdot c\cdot B^{m}_{\infty}$ where we set $\tilde{x}_{i}=0$ for $i\notin I$ . Therefore, setting $t:=\log\log(\frac{2m}{n})$ ,

	$\displaystyle\Big{\\|}\sum_{j=1}^{n}x_{j}v_{j}\Big{\\|}_{K}$	$\displaystyle\leq\Big{\\|}\sum_{j=1}^{n}x^{\prime}_{j}v_{j}\Big{\\|}_{K}+\Big{\\|}\sum_{j\in I}\tilde{x}_{j}v_{j}\Big{\\|}_{K}$
		$\displaystyle\lesssim\sqrt{d}\cdot t+\sqrt{\frac{n}{2^{t}}\cdot\log\Big{(}\frac{2m}{n/2^{t}}\Big{)}}$
		$\displaystyle=\sqrt{d}\log\log\Big{(}\frac{2m}{n}\Big{)}+\underbrace{\sqrt{\frac{n}{\log(\frac{2m}{n})}\cdot\log\Big{(}\frac{2m}{n}\cdot\log\Big{(}\frac{2m}{n}\Big{)}\Big{)}}}_{\lesssim\sqrt{n}\leq\sqrt{d}}$
		$\displaystyle\lesssim\sqrt{d}\log\log\Big{(}\frac{2m}{n}\Big{)}$
		$\displaystyle\lesssim\sqrt{d}\log\log\Big{(}\frac{d\log d}{n}\Big{)}.$

We conclude that $\textrm{vb}(K,K)\lesssim\textrm{vb}_{d}(K,K)\lesssim\sqrt{d}\log\log\log d$ . ∎

5 The vector balancing constant $\mathrm{vb}(K,Q)$

In this section we prove Theorem 4, stating that $\textrm{vb}(K,Q)\lesssim\sqrt{d\log d}$ where $K$ and $Q$ are normalized zonotopes. First note that Cor 24 indeed generalizes and for any $v_{1},\ldots,v_{n}\in K$ there is a good partial coloring $x\in[-1,1]^{n}$ with $\|\sum_{j=1}^{n}x_{j}v_{j}\|_{Q}\lesssim\sqrt{d}$ . On the other hand, in the proof of Theorem 1 we have also relied on Spencer’s Theorem which implies that $\textrm{vb}_{n}(K,K)\lesssim\sqrt{n\log(\frac{2m}{n})}$ . In particular this gives a bound that improves as $n$ decreases. However in our setting with different zonotopes $K$ and $Q$ such a bound does not hold!

To see this, let $H\in\{-1,1\}^{d\times d}$ be a Hadamard matrix, meaning that all rows and columns are orthogonal. Then one can verify that $K:=\frac{1}{\sqrt{d}}H^{\top}B_{\infty}^{d}$ is a normalized zonotope; in fact, $K$ is a rotated cube. Fix any $n\leq d$ and consider the points $v_{1},\ldots,v_{n}\in K$ with $v_{i}=\frac{1}{\sqrt{d}}H^{\top}H^{i}=\sqrt{d}\cdot e_{i}$ . We choose $Q:=B_{\infty}^{d}$ as the second normalized zonotope. Any good partial coloring $x\in[-1,1]^{n}$ must have a coordinate $i$ with $|x_{i}|\geq\frac{1}{2}$ and so $\|\sum_{j=1}^{n}x_{j}v_{j}\|_{Q}\geq\sqrt{d}|x_{i}|\geq\frac{\sqrt{d}}{2}$ .

Hence instead of applying Cor 24 iteratively and obtaining a bound of $\textrm{vb}(K,Q)\lesssim\sqrt{d}\log d$ , we use Banaszczyk’s Theorem together with Theorem 3:

Proof of Theorem 4.

Let $K,Q\subseteq\mathbb{R}^{d}$ be normalized zonotopes, and let $v_{1},\ldots,v_{n}\in K$ be the vectors to be balanced. Define $H:=\mathrm{span}\{v_{1},\ldots,v_{n}\}$ and let $r:=\dim(H)\leq\min\{d,n\}$ . By applying Theorem 3 to the zonotope $Q$ , subspace $H$ , and $t:=\sqrt{2\log 2r}$ , we find that

\gamma_{H}\big{(}\sqrt{2\log 2r}C^{\prime}Q\cap H\big{)}\geq e^{-\frac{1}{2}}>\frac{1}{2}.

By Lemma 17 we know that $v_{i}\in\sqrt{d}B_{2}^{d}$ for each $i\in[n]$ , hence by Theorem 12, signs $x\in\{-1,1\}^{n}$ can be computed in polynomial time such that

\sum_{j=1}^{n}x_{j}v_{j}\in\sqrt{d}C^{\prime\prime}\left(\sqrt{2\log 2r}C^{\prime}Q\cap H\right)\subseteq C\sqrt{d\log\min\{d,n\}}Q,

as desired. In particular, $\mathrm{vb}(K,Q)\lesssim\sqrt{d\log d}$ . ∎

6 Open problems

The main open question about zonotopes is whether a $d$ -dimensional zonotope can be approximated up to a constant factor using only a linear number of segments:

Conjecture 1 ([Sch07]).

For any zonotope $K\subseteq\mathbb{R}^{d}$ and $0<\varepsilon\leq\frac{1}{2}$ , does there exist a zonotope $Q$ with $O(\frac{d}{\varepsilon^{2}})$ segments so that $Q\subseteq K\subseteq(1+\varepsilon)Q$ ?

Equivalently, since the polar body of a zonotope $A^{\top}B^{m}_{\infty}\subseteq\mathbb{R}^{d}$ is the preimage $A^{-1}(B^{m}_{1}):=\{x\in\mathbb{R}^{d}:\|Ax\|_{1}\leq 1\}$ , we can restate the question as follows:

Conjecture 2.

Does there exist a universal constant $C>0$ such that given any matrix $A\in\mathbb{R}^{m\times d}$ with $m\geq d$ and $0<\varepsilon\leq\frac{1}{2}$ , one can always find another matrix $\tilde{A}\in\mathbb{R}^{Cd/\varepsilon^{2}\times d}$ with $\|\tilde{A}x\|_{1}\leq\|Ax\|_{1}\leq(1+\varepsilon)\|\tilde{A}x\|_{1}$ for all $x\in\mathbb{R}^{d}$ ?

We remark that if one replaces the $\ell_{1}$ norm by the $\ell_{2}$ norm, an analogue of Conjecture 2 holds as a direct corollary of a linear-size spectral sparsifier [BSS09]. In that setting, each row of $\tilde{A}$ is a scalar multiple of a row of $A$ , and there is hope that another rescaling of the rows of $A$ may suffice for the $\ell_{1}$ norm. Just as a spectral sparsifier can be found via spectral partial colorings [RR20], we also state the stronger conjecture of the existence of good partial colorings in the $\ell_{1}$ setting:

Conjecture 3.

Given any matrix $A\in\mathbb{R}^{m\times d}$ , does the set

K:=\Big{\{}x\in\mathbb{R}^{m}:\Big{|}\sum_{i=1}^{m}x_{i}|\langle A_{i},z\rangle|\Big{|}\leq\sqrt{\tfrac{d}{m}}\|Az\|_{1}\ \forall z\in\mathbb{R}^{d}\Big{\}}

have large Gaussian measure $\gamma_{m}(K)\geq e^{-Cm}$ where $C>0$ is a universal constant?

Finally, we restate Schechtman’s question, which would also follow from the above conjectures:

Conjecture 4 ([Sch07]).

Is it true that for any zonotope $K\subseteq\mathbb{R}^{d}$ , $\textrm{vb}(K,K)\lesssim\sqrt{d}$ ?

References

[AAGM15] S. Artstein-Avidan, A. Giannopoulos, and V. Milman. Asymptotic Geometric Analysis. Part I. 2015.
[Ban98] W. Banaszczyk. Balancing vectors and Gaussian measures of $n$ -dimensional convex bodies. Random Struct. Algorithms, 12(4):351–360, 1998.
[Ban10] N. Bansal. Constructive algorithms for discrepancy minimization. In FOCS, pages 3–10. IEEE Computer Society, 2010.
[BDGL18] N. Bansal, D. Dadush, S. Garg, and S. Lovett. The Gram-Schmidt walk: a cure for the Banaszczyk blues. In STOC, pages 587–597. ACM, 2018.
[Bec81] J. Beck. Roth’s estimate of the discrepancy of integer sequences is nearly sharp. Combinatorica, 1(4):319–325, 1981.
[BF81] J. Beck and T. Fiala. “Integer-making” theorems. Discrete Appl. Math., 3(1):1–8, 1981.
[BJM22] N. Bansal, H. Jiang, and R. Meka. Resolving Matrix Spencer conjecture up to poly-logarithmic rank, 2022.
[BLM89] J. Bourgain, J. Lindenstrauss, and V. Milman. Approximation of zonoids by zonotopes. Acta Mathematica, 162:73 – 141, 1989.
[BSS09] J. Batson, D. Spielman, and N. Srivastava. Twice-Ramanujan sparsifiers. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 255–262, 2009.
[Buk16] B. Bukh. An improvement of the Beck-Fiala theorem. Combinatorics, Probability and Computing, 25(3):380–398, 2016.
[Cha00] B. Chazelle. The Discrepancy Method. Cambridge University Press, 2000.
[CP15] M. B. Cohen and R. Peng. $\ell_{p}$ row sampling by Lewis weights. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15, pages 183–192, New York, NY, USA, 2015. Association for Computing Machinery.
[DJR22] D. Dadush, H. Jiang, and V. Reis. A new framework for matrix discrepancy: partial coloring bounds via mirror descent. In STOC, pages 649–658. ACM, 2022.
[ES18] R. Eldan and M. Singh. Efficient algorithms for discrepancy minimization in convex sets. Random Struct. Algorithms, 53(2):289–307, 2018.
[HR17] R. Hoberg and T. Rothvoss. A logarithmic additive integrality gap for bin packing. In SODA, pages 2616–2625. SIAM, 2017.
[HRS22] S. B. Hopkins, P. Raghavendra, and A. Shetty. Matrix discrepancy from quantum communication. STOC 2022, pages 637–648, New York, NY, USA, 2022. Association for Computing Machinery.
[LM12] S. Lovett and R. Meka. Constructive discrepancy minimization by walking on the edges. In FOCS, pages 61–67. IEEE Computer Society, 2012.
[LRR16] A. Levy, H. Ramadas, and T. Rothvoss. Deterministic discrepancy minimization via the multiplicative weight update method. CoRR, abs/1611.08752, 2016.
[LSV86] L. Lovász, J. Spencer, and K. Vesztergombi. Discrepancy of set-systems and matrices. Eur. J. Comb., 7(2):151–160, 1986.
[LT11] M. Ledoux and M. Talagrand. Probability in Banach spaces. Classics in Mathematics. Springer-Verlag, Berlin, 2011. Isoperimetry and processes, Reprint of the 1991 edition.
[Mek14] R. Meka. Discrepancy and beating the union bound (blog post), 2014.
[MSS15] A. W. Marcus, D. A. Spielman, and N. Srivastava. Interlacing families ii: Mixed characteristic polynomials and the Kadison-Singer problem. Annals of Mathematics, 182(1):327–350, 2015.
[NTZ13] A. Nikolov, K. Talwar, and L. Zhang. The geometry of differential privacy: the sparse and approximate cases. In STOC, pages 351–360. ACM, 2013.
[Rot14] T. Rothvoß. Constructive discrepancy minimization for convex sets. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014, pages 140–145, 2014.
[Roy14] T. Royen. A simple proof of the gaussian correlation conjecture extended to multivariate gamma distributions. arXiv: Probability, 2014.
[RR20] V. Reis and T. Rothvoss. Linear size sparsifier and the geometry of the operator norm ball. In SODA, pages 2337–2348. SIAM, 2020.
[RR22] V. Reis and T. Rothvoss. Vector balancing in Lebesgue spaces. Random Structures and Algorithms, 08 2022.
[Sch87] G. Schechtman. More on embedding subspaces of $l_{p}$ in $l^{n}_{r}$ . Compositio Mathematica, 61(2):159–169, 1987.
[Sch07] G. Schechtman. Fourier analytic methods in convex geometry (workshop at the American Institute of Mathematics; http://aimpl.org/fourierconvex/1/), 2007.
[Spe85] J. Spencer. Six standard deviations suffice. Trans. Amer. Math. Soc., 289(2):679–706, 1985.
[SW99] S. J. Szarek and E. Werner. A nonsymmetric correlation inequality for gaussian measure. Journal of Multivariate Analysis, 68(2):193–211, 1999.
[Tal90] M. Talagrand. Embedding subspaces of ${L}_{1}$ into $\ell_{1}^{{N}}$ . Proceedings of the American Mathematical Society, 108(2):363–369, 1990.
[Tko15] T. Tkocz. High-dimensional Phenomena: Dilations, Tensor Products and Geometry of $L_{1}$ . University of Warwick, 2015.
[Vaa79] J. D. Vaaler. A geometric inequality with applications to linear forms. Pacific Journal of Mathematics, 83(2):543 – 553, 1979.
[Zou12] A. Zouzias. A matrix hyperbolic cosine algorithm and applications. In Artur Czumaj, Kurt Mehlhorn, Andrew Pitts, and Roger Wattenhofer, editors, Automata, Languages, and Programming, pages 846–858, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg.

Appendix A Normalizing zonotopes

In this section, we show that for any full-dimensional zonotope $K\subseteq\mathbb{R}^{d}$ there is a linear transformation $T:\mathbb{R}^{d}\to\mathbb{R}^{d}$ and a normalized zonotope $\tilde{K}$ so that $\frac{4}{5}\tilde{K}\subseteq T(K)\subseteq\tilde{K}$ . For this result we will need the existence of Lewis weights [CP15]:

Theorem 25.

Given a matrix $A\in\mathbb{R}^{m\times d}$ , there exists a unique vector $\overline{w}\in\mathbb{R}^{m}_{>0}$ so that for all $i\in[m]$ one has

\overline{w}_{i}^{-2}A_{i}^{\top}(A^{\top}\overline{W}^{-1}A)^{-1}A_{i}=1,

where $\overline{W}:=\operatorname{diag}(\overline{w})$ . Moreover, $\textrm{Tr}[\overline{W}]\leq d$ , with equality for full rank $A$ .

Now to the proof of Lemma 16.

Proof of Lemma 16.

Consider a full-dimensional zonotope $K=A^{\top}B_{\infty}^{m}$ with $A\in\mathbb{R}^{m\times d}$ . Let $\overline{W}$ be the diagonal matrix corresponding to the Lewis weights of $A$ and let $W:=D\overline{W}$ where $D>0$ is large enough so that $w_{i}:=W_{i,i}\geq 1$ for all $i$ . Define a matrix $B:=A(A^{\top}W^{-1}A)^{-1/2}\in\mathbb{R}^{m\times d}$ and define a second matrix $\tilde{A}$ where each row $B_{i}$ is replaced by $\lceil w_{i}\rceil$ many rows so that the first $\lfloor w_{i}\rfloor$ rows are all copies of $w_{i}^{-1}B_{i}$ , and (if $\{w_{i}\}\neq 0$ ) the last row is $\{w_{i}\}^{1/2}w_{i}^{-1}B_{i}$ , for a total of $m^{\prime}:=\sum_{i=1}^{m}\lceil w_{i}\rceil$ many rows. We will show that the conditions of Lemma 16 hold with

T:\mathbb{R}^{d}\rightarrow\mathbb{R}^{d},\ \ T(K)=\sqrt{\tfrac{d}{m^{\prime}}}(A^{T}W^{-1}A)^{-1/2}K=\sqrt{\tfrac{d}{m^{\prime}}}\cdot B^{T}B_{\infty}^{m}

and $\widetilde{K}:=\sqrt{\tfrac{d}{m^{\prime}}}\widetilde{A}^{T}B_{\infty}^{m^{\prime}}$ .

First we show that $\tilde{K}$ is normalized, or equivalently that $\tilde{A}$ is approximately regular. Note that

(\tilde{A}^{\top}\tilde{A})_{j,k}=\sum_{i=1}^{m^{\prime}}\tilde{A}_{i,j}\tilde{A}_{i,k}=\sum_{i=1}^{m}w_{i}^{-2}(\lfloor w_{i}\rfloor+\{w_{i}\})\cdot B_{i,j}B_{i,k}=\sum_{i=1}^{m}w_{i}^{-1}\cdot B_{i,j}B_{i,k},

so that by definition of $B$ ,

\tilde{A}^{\top}\tilde{A}=B^{\top}W^{-1}B=(A^{\top}W^{-1}A)^{-1/2}A^{\top}W^{-1}A(A^{\top}W^{-1}A)^{-1/2}=I_{d}.

Moreover, by the definition of Lewis weights, for each row $i^{\prime}\in[m^{\prime}]$ corresponding to a copy of $B_{i}$ one has

\|\tilde{A}_{i^{\prime}}\|_{2}^{2}\leq w_{i}^{-2}B_{i}^{\top}B_{i}=w_{i}^{-2}A_{i}^{\top}(A^{\top}W^{-1}A)^{-1}A_{i}=\frac{1}{D}\leq\frac{2d}{m^{\prime}},

where the last inequality follows since

m^{\prime}=\sum_{i=1}^{m}\lceil w_{i}\rceil\leq 2\cdot\sum_{i=1}^{m}w_{i}=2D\sum_{i=1}^{m}\overline{w}_{i}\leq 2D\cdot d.

Thus $\widetilde{A}$ is approximately regular, and $\tilde{K}$ is normalized.

To see that $\frac{4}{5}\tilde{K}\subseteq T(K)\subseteq\tilde{K}$ , take an arbitrary

y=\tfrac{4}{5}\sqrt{\tfrac{d}{m^{\prime}}}\sum_{i=1}^{m}\Big{(}\sum_{p=1}^{\lfloor w_{i}\rfloor}x_{i,p}w^{-1}_{i}B_{i}+x_{i,\lceil w_{i}\rceil}\{w_{i}\}^{1/2}w^{-1}_{i}B_{i}\Big{)}\in\tfrac{4}{5}\sqrt{\tfrac{d}{m^{\prime}}}\tilde{A}^{\top}B^{m^{\prime}}_{\infty}=\tfrac{4}{5}\tilde{K},

and rewrite it as

\tfrac{4}{5}\sqrt{\tfrac{d}{m^{\prime}}}\sum_{i=1}^{m}\Big{(}\underbrace{w_{i}^{-1}\Big{(}\sum_{i=1}^{\lfloor w_{i}\rfloor}x_{i,p}+x_{i,\lceil w_{i}\rceil}\{w_{i}\}\Big{)}}_{\in[-1,1]}+\underbrace{x_{i,\lceil w_{i}\rceil}\tfrac{\{w_{i}\}^{1/2}-\{w_{i}\}}{w_{i}}}_{\in[-\frac{1}{4},\frac{1}{4}]}\Big{)}B_{i}\in\sqrt{\tfrac{d}{m^{\prime}}}B^{\top}B^{m}_{\infty}=T(K).

Now taking an arbitrary $y:=\sqrt{\tfrac{d}{m^{\prime}}}\sum_{i=1}^{m}x_{i}B_{i}\in B^{\top}B^{m}_{\infty}=T(K)$ , we may write

y=\sqrt{\tfrac{d}{m^{\prime}}}\sum_{i=1}^{m}\Big{(}\sum_{p=1}^{\lfloor w_{i}\rfloor}x_{i}w^{-1}_{i}B_{i}+x_{i}\{w_{i}\}w^{-1}_{i}B_{i}\Big{)}\in\sqrt{\tfrac{d}{m^{\prime}}}\tilde{A}^{\top}B^{m^{\prime}}_{\infty}=\tilde{K},

completing the proof of the lemma. Finally, note that this result immediately implies that

\tfrac{4}{5}\textrm{vb}(\tilde{K},\tilde{K})\leq\textrm{vb}(K,K)\leq\tfrac{5}{4}\textrm{vb}(\tilde{K},\tilde{K}).\qed

Appendix B Gaussian measure

Proof of Lemma 7.

We make use of the following tail inequality due to Szarek and Werner [SW99] which holds for $t>-1$ :

\Pr_{g\sim N(0,1)}[g>t]<\frac{1}{\sqrt{2\pi}}\frac{4e^{-t^{2}/2}}{3t+(t^{2}+8)^{1/2}}.

In particular, for $t\geq 1$ the right side is upper bounded by $\frac{1}{\sqrt{2\pi}}\frac{4e^{-t^{2}/2}}{6}$ . Thus

\Pr_{g\sim N(0,1)}[|g|\leq t]\geq 1-\frac{4}{3\sqrt{2\pi}}e^{-t^{2}/2}.

Since the function $z\mapsto e^{-2z/3}$ is convex, we have $1-\frac{4}{3\sqrt{2\pi}}z\geq e^{-2z/3}$ for all $z\in[0,e^{-1/2}]$ as it holds for the endpoints of the interval. Therefore for $t\geq 1$ ,

\Pr_{g\sim N(0,1)}[|g|\leq t]\geq\exp(-\tfrac{2}{3}e^{-t^{2}/2}).

We conclude that for any $a\in\mathbb{R}^{n}$ with $\|a\|_{2}\leq 1$ and $t\geq 1$ one has

\Pr_{y\sim N(0,I_{n})}[|\left<a,y\right>|\leq t]=\Pr_{g\sim N(0,1)}\Big{[}|g|\leq\frac{t}{\|a\|_{2}}\Big{]}\geq\exp(-\tfrac{2}{3}e^{-t^{2}/(2\|a\|_{2}^{2})})\geq\exp(-e^{-t^{2}/2}\cdot\|a\|_{2}^{2}).

Indeed, the last inequality follows because

\frac{2}{3}\exp\Big{(}\frac{t^{2}}{2}-\frac{t^{2}}{2\|a\|_{2}^{2}}\Big{)}\leq\frac{2}{3}\exp\Big{(}\frac{1}{2}-\frac{1}{2\|a\|_{2}^{2}}\Big{)}\leq\frac{2}{3}\cdot e^{1/2}\cdot\frac{2}{e}\cdot\|a\|_{2}^{2}\leq\|a\|_{2}^{2},

where the second to last inequality follows from $e^{z}\geq ez$ for $z:=1/(2\|a\|_{2}^{2})$ . ∎

Proof of Lemma 8.

Draw another random variable $z\sim N(0,B-A)$ and note that by log-concavity we have

\Pr_{y\sim N(0,A)}[y\in K]\geq\Pr_{y\sim N(0,A)}\Big{[}\Pr_{z\sim N(0,B-A)}[y+z\in K]\Big{]}=\Pr_{y\sim N(0,B)}[y\in K].\qed

The Vector Balancing Constant for Zonotopes

Abstract

1 Introduction

The Beck-Fiala Conjecture.

The Matrix Spencer Conjecture.

The vector balancing constant of zonotopes.

1.1 Our contributions

Theorem 1.

Definition 2.

Theorem 3.

Theorem 4.

2 Preliminaries

Probability.

Theorem 5 (Šidák-Khatri).

Lemma 6.

Lemma 7.

Lemma 8.

Lemma 9.

Discrepancy theory.

Theorem 10 (Spencer’s Theorem [Spe85, LM12]).

Theorem 11 ([RR22], special case of Theorem 6).

Theorem 12 (Banaszczyk’s Theorem).

Theorem 13 (Marcus, Spielman, Srivastava [MSS15]).

Theorem 14 ([LSV86]).

Zonotopes.

Theorem 15 (Talagrand [Tal90]).

Lemma 16.

Lemma 17.

Proof.

3 Sections of normalized zonotopes

Theorem 18.

3.1 A first direct lower bound

Lemma 19.

Proof.

Lemma 20.

Proof.

3.2 Decomposition of normalized zonotopes

Lemma 21.

Proof.

Lemma 22.

Proof.

3.3 Proof of Theorem 3

Proof of Theorem 3.

4 The vector balancing constant vb​(K,K)\mathrm{vb}(K,K)

4.1 Tight partial colorings for zonotopes

Lemma 23.

Proof.

Corollary 24.

Proof.

4.2 Proof of the main Theorem

Proof of Theorem 1.

5 The vector balancing constant vb​(K,Q)\mathrm{vb}(K,Q)

Proof of Theorem 4.

6 Open problems

Conjecture 1 ([Sch07]).

Conjecture 2.

Conjecture 3.

Conjecture 4 ([Sch07]).

References

Appendix A Normalizing zonotopes

Theorem 25.

Proof of Lemma 16.

Appendix B Gaussian measure

Proof of Lemma 7.

Proof of Lemma 8.

4 The vector balancing constant $\mathrm{vb}(K,K)$

5 The vector balancing constant $\mathrm{vb}(K,Q)$