The Vector Balancing Constant for Zonotopes
Abstract
The vector balancing constant of two symmetric convex bodies is the minimum so that any number of vectors from can be balanced into an -scaling of . A question raised by Schechtman is whether for any zonotope one has . Intuitively, this asks whether a natural geometric generalization of Spencer’s Theorem (for which ) holds. We prove that for any zonotope one has . Our main technical contribution is a tight lower bound on the Gaussian measure of any section of a normalized zonotope, generalizing Vaaler’s Theorem for cubes. We also prove that for two different normalized zonotopes and one has . All the bounds are constructive and the corresponding colorings can be computed in polynomial time.
1 Introduction
Discrepancy theory is a subfield of combinatorics where one is given a set system with a ground set and a family of sets , and the goal is to find the coloring that minimizes the maximum imbalance, i.e.
A slightly more general linear-algebraic view is that one is given a matrix and its discrepancy is defined as . The best known result in this area is certainly Spencer’s Theorem [Spe85] which states that for any one has . The challenging aspect of that Theorem is that — say for — a uniform random coloring will only give a bound. Instead, Spencer [Spe85] applied the partial coloring method which had been first used by Beck [Bec81].
The original proofs of the partial coloring method are based on the pigeonhole principle and are non-constructive. The first polynomial time algorithm to actually find the coloring guaranteed by Spencer [Spe85] is due to Bansal [Ban10], followed by a sequence of algorithms [LM12, Rot14, LRR16, ES18] that either work in more general settings or are simpler.
Discrepancy theory is an extensively studied topic with many applications in mathematics and computer science. To give two concrete examples, Nikolov, Talwar and Zhang [NTZ13] showed a connection between differential privacy and hereditary discrepancy, and the best known approximation algorithm for Bin Packing uses a discrepancy-based rounding [HR17]. Other applications can be found in data structure lower bounds, communication complexity and pseudorandomness; we refer to the book of Chazelle [Cha00] for a more detailed account. The seminal result of Batson, Spielman and Srivastava [BSS09] on the existence of linear-size spectral sparsifiers for graphs can also be interpreted as a discrepancy-theoretic result, see [RR20] for details.
For the purpose of this paper, it will be convenient to introduce more general notation. For two symmetric convex bodies we define the vector balancing constant as the smallest number so that for any vectors one can find signs so that the signed sum is in . We also denote as the same quantity where we fix the number of vectors to be . For example, Spencer’s Theorem [Spe85] can then be rephrased as and as for . Here we denote as the -dimensional unit ball of the norm . Moreover for a Euclidean ball one can easily prove that and for the -ball we have .
While Spencer’s Theorem itself is tight, at least three candidate generalizations have been suggested in the literature — all three are unsolved so far.
The Beck-Fiala Conjecture.
Suppose we have a set system in which every element is in at most sets. Beck and Fiala [BF81] proved using a linear-algebraic argument that in this case the discrepancy is bounded by and they state the conjecture that the correct dependence should be . The same proof of [BF81] also shows that . However, the Beck-Fiala Conjecture is wide open and the best known bounds are [Ban98, BDGL18] and [Buk16]. In fact, Komlós Conjecture of is even more general; here the best known bound is [Ban98].
The Matrix Spencer Conjecture.
A conjecture popularized by Zouzias [Zou12] and Meka [Mek14] claims that for any symmetric matrices with all eigenvalues in , there are signs so that the maximum singular value of is at most . Using standard matrix concentration bounds, one can prove that a random coloring attains a value of at most . Moreover, one can prove the conjectured upper bound of under the additional assumption that the matrices are block-diagonal with constant size blocks [DJR22], or have rank [HRS22]. Based on recent progress on matrix concentration, it is possible to obtain the same under the weaker condition that they have rank at most [BJM22].
The vector balancing constant of zonotopes.
A zonotope is defined as the linear image of a cube. If is a matrix with , we can write a -dimensional zonotope in the form . Note that is the number of segments of the zonotope. The cube is trivially a zonotope, and it is known that for every , the ball is the limit of a sequence of zonotopes, called a zonoid [BLM89]. Schechtman [Sch07] raised the question whether it is true that for any zonotope one has where we write if for a universal constant . The best known bound of is a direct consequence of Spencer’s theorem and the fact that zonotopes can be sparsified up to a constant factor with only segments [Tal90]. An affirmative answer to Schechtman’s question would follow from an bound, or equivalently whether an -analogue of [BSS09] is true. We defer to Section 6 for details.
1.1 Our contributions
Our main result is an almost-proof of Schechtman’s conjecture (falling short only by a term).
Theorem 1.
For any zonotope one has . Moreover, for any one can find in randomized polynomial time a coloring with .
The claim is invariant under linear transformations to and so it will be useful to place in a normalized position. For this sake, we make the following definition:
Definition 2.
A matrix is called approximately regular if the following holds: {enumerate*}
The columns are orthonormal.
The rows satisfy for all .
Then we call a zonotope normalized if there exists a matrix that is approximately regular so that . We choose the scaling so that any cube is indeed normalized and zonotopes with any number of segments are comparable to in terms of volume and radius.
Our main technical contribution is a tight lower bound for the Gaussian measure of sections of any normalized zonotope.
Theorem 3.
For any normalized zonotope , any subspace with and any , one has where is a universal constant.
In order to prove Theorem 3, we show that a normalized zonotope can be decomposed into many smaller zonotopes with many segments each. This decomposition requires an iterative application of the Kadison-Singer theorem by Marcus, Spielman and Srivastava [MSS15]. Then we prove the statement of Theorem 3 for such simpler zonotopes and derive the lower bound on by using log-concavity of the Gaussian measure.
We can also use Theorem 3 to show how to balance vectors between different normalized zonotopes:
Theorem 4.
For any normalized zonotopes one has . Moreover, for any one can find in randomized polynomial time a coloring such that .
2 Preliminaries
We review a few facts that we rely on later.
Probability.
By we denote the (standard) Gaussian density . For the corresponding distribution we will write . For a subspace we write as the identity on the subspace; in particular where is any orthonormal basis of . A strip is a symmetric convex body of the form with .
Theorem 5 (Šidák-Khatri).
For any two symmetric convex bodies where at least one is a strip, one has .
More recently, Royen [Roy14] proved that this is indeed true for any pair of symmetric convex bodies, but the weaker result suffices for us.
Lemma 6.
For any symmetric convex body and any subspace one has .
We will use the following convenient estimate on the Gaussian measure of a strip:
Lemma 7.
For any with and one has
The following comparison inequality (see e.g. Ledoux and Talagrand [LT11]) will also be useful:
Lemma 8.
Let be a symmetric convex body and let . Then
We prove these lemmas in Appendix B. The following lemma allows us to dismiss constant scaling factors, see [Tko15]:
Lemma 9.
Let be a measurable set and be an Euclidean ball centered at the origin such that . Then for all . In particular, if for some constant then also for some .
Discrepancy theory.
First we give a full statement of Spencer’s theorem that we mentioned earlier:
Theorem 10 (Spencer’s Theorem [Spe85, LM12]).
For any with there are polynomial time computable signs so that . More generally, for any shift , there is a polynomial time computable so that and .
To be exact, the first algorithm giving a bound of is due to Bansal [Ban10] and the tight algorithmic bound is due to Lovett and Meka [LM12].
We say that a vector is a good partial coloring if with . We will need a connection between good partial colorings and Gaussian measure lower bounds.
Theorem 11 ([RR22], special case of Theorem 6).
For any , there is a constant and a randomized polynomial time algorithm that for a symmetric convex body , a -dimensional subspace with and a shift , finds so that is a good partial coloring.
Theorem 12 (Banaszczyk’s Theorem).
Let be a convex set with and let . Then there is a randomized polynomial time algorithm to compute signs so that where is a universal constant.
For many decades, the Kadison-Singer problem was an open question in operator theory. It was finally resolved in 2015:
Theorem 13 (Marcus, Spielman, Srivastava [MSS15]).
Let so that and let so that for all . Then there is a partition so that for both one has
In the definition of , there is no upper bound on the number of vectors to be balanced. But it is well-known that up to a constant factor, the worst-case is attained for many vectors. Let
be the vector balancing variant with vectors, so that
Theorem 14 ([LSV86]).
For any symmetric convex , .
The reduction underlying the inequality is algorithmic as well.
Zonotopes.
A substantial amount of work in the literature has been done on the question of how one can sparsify an arbitrary zonotope with another zonotope that has fewer segments, while losing only a constant factor approximation. The first bound of [Sch87] was improved to [BLM89]. We highlight the current best known bound:
Theorem 15 (Talagrand [Tal90]).
For any zonotope and , there is a zonotope with at most segments so that .
We refer to the approach of Cohen and Peng [CP15] for an elementary exposition of the bound.
Finally, we justify why it suffices to consider normalized zonotopes:
Lemma 16.
For any full-dimensional zonotope , there is a normalized zonotope and an invertible linear map so that . In particular, .
We show the argument in Appendix A.
Lemma 17.
Any normalized zonotope satisfies .
Proof.
We write where . Note that by orthonormality of the columns of and so . By definition, for any there is a with , so that
3 Sections of normalized zonotopes
In this section we prove Theorem 3, showing that all sections of zonotopes are large. To be be more precise, we prove the following more general measure lower bound:
Theorem 18.
For any normalized zonotope , any subspace with and any , one has where is a universal constant.
In the most basic form where is a cube and , the statement is similar to a result of Vaaler [Vaa79] who proved that for any -dimensional subspace ; though the geometry of a zonotope is more complex and the proof strategy is rather different.
3.1 A first direct lower bound
We begin with a simple estimate on the Gaussian measure of the section of a zonotope where we drop the scalar of . Hence this bound will be tight if the number of segments is close to but rather loose otherwise. We denote as the orthogonal projection into a subspace .
Lemma 19.
Let be a zonotope where is a matrix with orthonormal columns. Then for any subspace with and any one has .
Proof.
Let be a matrix with orthonormal columns spanning . Then if we draw , is indeed a standard Gaussian in the subspace . By assumption, , and this can be used to write any outcome of the random process as
(1) |
Here one should think of as the coordinates of in terms of the basis of .
From the expression in (1) we can draw the following conclusion:
Claim I. For any and one has .
Then Claim I gives a simple sufficient (but in general not necessary) condition for to lie in the zonotope .
Next, we can see that
Then we can use Claim I and the inequality of Šidák-Khatri to lower bound the Gaussian measure by
Here we have used that which follows by the orthonormality of the columns of . ∎
It is somewhat unfortunate that Claim I shown above requires that is exactly the identity and an approximation is not enough. But we can fix this by a rescaling argument:
Lemma 20.
Let be a zonotope where is a matrix so that for some . Then for any -dimensional subspace and any one has .
Proof.
Scaling by is equivalent to scaling by , hence we may assume that indeed . Abbreviate which is a symmetric positive definite matrix. Consider the matrix with rescaled rows , so that . Let and be the rescaled zonotope and subspace. Let be an orthonormal basis of . Then with , will be the basis of , but it will not be orthogonal in general. However, for one has . Then
3.2 Decomposition of normalized zonotopes
The next step in our proof strategy is to decompose the rows of an approximately regular matrix into many blocks so that . For this purpose, we formulate a slight variant of Theorem 13.
Lemma 21.
Let be vectors with for some and let . Then there is a partition so that
Proof.
Abbreviate which is a PSD matrix with . Define . Then . We set and verify that for all one has Then we apply Theorem 13 to the vectors and obtain a partition so that for one has
and using the fact that , we conclude
Now to the main lemma of this section where we decompose an approximately regular matrix by iteratively applying Lemma 21.
Lemma 22.
There is a universal constant so that the following holds. Let be an approximately regular matrix. Then there are disjoint subsets with and and for all .
Proof.
If we may set and , so assume . Set so that for all .
Let be a parameter that we choose later. For we will obtain
partitions of the row indices starting with so that is a
refinement of and moreover .
More precisely, in each iteration and for each ,
we apply Lemma 21 to the vectors ; if is the obtained partition,
then we add to .
We first analyze the corresponding eigenvalue lower bound.
Define .
Claim. If for a large enough constant , then for all one has for all .
Proof of Claim. Clearly all .
We will prove the claim by induction on . For one has and the claim is true as .
Now consider an iteration and suppose is split into . Then for both . This is at least as:
Here we use . This shows the claim. ∎
For a large enough constant , we pick so that . Then for large enough. Moreover we know that . Then by Markov’s inequality at least half the sets have at most indices. Those sets will satisfy the statement. ∎
3.3 Proof of Theorem 3
Next we prove our main technical result, Theorem 3. Recall that a measure on is called log-concave if for all compact subsets and one has
By induction one can verify that for any compact subsets and with we have . Also recall that the Gaussian measure is indeed log-concave, see e.g. [AAGM15]. For a matrix and indices we denote as the submatrix of with rows in .
Proof of Theorem 3.
Let be a normalized zonotope and let be a subspace with dimension . Then we can write where is approximately regular. We use Lemma 22 to obtain disjoint subsets with so that where is a constant. Consider the zonotope generated by the rows with indices in . Then we have and . Note that for each we have , so that . Then applying Lemma 20 with we have
for all . Finally, using log-concavity of the Gaussian measure we obtain
4 The vector balancing constant
Next, we show how to translate measure lower bounds for sections into an improved bounds on the vector balancing constant.
4.1 Tight partial colorings for zonotopes
First we prove a generalization of the constant discrepancy partial coloring for the Komlós setting:
Lemma 23.
Let and let be a symmetric convex body with for some where . Then there is a randomized polynomial time algorithm that given a shift finds a good partial coloring with where is a constant.
Proof.
Let where are i.i.d. Gaussians so that has trace . Let be an orthonormal basis of with , and write . Since , we have for all . Then after reindexing we may assume that . Since we know by Markov’s Inequality that , denoting for . Thus restricting to the subspaces and with , we may lower bound
where follows by Lemma 8. Then by Theorem 11, the symmetric convex body contains a good partial coloring in . ∎
Then Lemma 23 implies the existence of a partial coloring with optimal bounds as long as is of the order of :
Corollary 24.
Let be a normalized zonotope and let . Then there is a randomized polynomial time algorithm to find a good partial coloring so that .
4.2 Proof of the main Theorem
Now we have all the ingredients to prove our main result, Theorem 1.
Proof of Theorem 1.
By Theorem 15, we may assume that is generated by only segments, and by Lemma 16, we may assume that is a normalized zonotope for some approximately regular . By Theorem 14, since , we may assume that , though for clarity we only use this in the final bound. As before we set . We iteratively apply Lemma 23 for rounds to obtain a partial coloring , so that the set of partially colored indices satisfies , and by the triangle inequality over the rounds .
For each , we may write for some . By Theorem 10, we can find so that and where we set for . Therefore, setting ,
We conclude that . ∎
5 The vector balancing constant
In this section we prove Theorem 4, stating that where and are normalized zonotopes. First note that Cor 24 indeed generalizes and for any there is a good partial coloring with . On the other hand, in the proof of Theorem 1 we have also relied on Spencer’s Theorem which implies that . In particular this gives a bound that improves as decreases. However in our setting with different zonotopes and such a bound does not hold!
To see this, let be a Hadamard matrix, meaning that all rows and columns are orthogonal. Then one can verify that is a normalized zonotope; in fact, is a rotated cube. Fix any and consider the points with . We choose as the second normalized zonotope. Any good partial coloring must have a coordinate with and so .
Hence instead of applying Cor 24 iteratively and obtaining a bound of , we use Banaszczyk’s Theorem together with Theorem 3:
Proof of Theorem 4.
6 Open problems
The main open question about zonotopes is whether a -dimensional zonotope can be approximated up to a constant factor using only a linear number of segments:
Conjecture 1 ([Sch07]).
For any zonotope and , does there exist a zonotope with segments so that ?
Equivalently, since the polar body of a zonotope is the preimage , we can restate the question as follows:
Conjecture 2.
Does there exist a universal constant such that given any matrix with and , one can always find another matrix with for all ?
We remark that if one replaces the norm by the norm, an analogue of Conjecture 2 holds as a direct corollary of a linear-size spectral sparsifier [BSS09]. In that setting, each row of is a scalar multiple of a row of , and there is hope that another rescaling of the rows of may suffice for the norm. Just as a spectral sparsifier can be found via spectral partial colorings [RR20], we also state the stronger conjecture of the existence of good partial colorings in the setting:
Conjecture 3.
Given any matrix , does the set
have large Gaussian measure where is a universal constant?
Finally, we restate Schechtman’s question, which would also follow from the above conjectures:
Conjecture 4 ([Sch07]).
Is it true that for any zonotope , ?
References
- [AAGM15] S. Artstein-Avidan, A. Giannopoulos, and V. Milman. Asymptotic Geometric Analysis. Part I. 2015.
- [Ban98] W. Banaszczyk. Balancing vectors and Gaussian measures of -dimensional convex bodies. Random Struct. Algorithms, 12(4):351–360, 1998.
- [Ban10] N. Bansal. Constructive algorithms for discrepancy minimization. In FOCS, pages 3–10. IEEE Computer Society, 2010.
- [BDGL18] N. Bansal, D. Dadush, S. Garg, and S. Lovett. The Gram-Schmidt walk: a cure for the Banaszczyk blues. In STOC, pages 587–597. ACM, 2018.
- [Bec81] J. Beck. Roth’s estimate of the discrepancy of integer sequences is nearly sharp. Combinatorica, 1(4):319–325, 1981.
- [BF81] J. Beck and T. Fiala. “Integer-making” theorems. Discrete Appl. Math., 3(1):1–8, 1981.
- [BJM22] N. Bansal, H. Jiang, and R. Meka. Resolving Matrix Spencer conjecture up to poly-logarithmic rank, 2022.
- [BLM89] J. Bourgain, J. Lindenstrauss, and V. Milman. Approximation of zonoids by zonotopes. Acta Mathematica, 162:73 – 141, 1989.
- [BSS09] J. Batson, D. Spielman, and N. Srivastava. Twice-Ramanujan sparsifiers. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 255–262, 2009.
- [Buk16] B. Bukh. An improvement of the Beck-Fiala theorem. Combinatorics, Probability and Computing, 25(3):380–398, 2016.
- [Cha00] B. Chazelle. The Discrepancy Method. Cambridge University Press, 2000.
- [CP15] M. B. Cohen and R. Peng. row sampling by Lewis weights. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15, pages 183–192, New York, NY, USA, 2015. Association for Computing Machinery.
- [DJR22] D. Dadush, H. Jiang, and V. Reis. A new framework for matrix discrepancy: partial coloring bounds via mirror descent. In STOC, pages 649–658. ACM, 2022.
- [ES18] R. Eldan and M. Singh. Efficient algorithms for discrepancy minimization in convex sets. Random Struct. Algorithms, 53(2):289–307, 2018.
- [HR17] R. Hoberg and T. Rothvoss. A logarithmic additive integrality gap for bin packing. In SODA, pages 2616–2625. SIAM, 2017.
- [HRS22] S. B. Hopkins, P. Raghavendra, and A. Shetty. Matrix discrepancy from quantum communication. STOC 2022, pages 637–648, New York, NY, USA, 2022. Association for Computing Machinery.
- [LM12] S. Lovett and R. Meka. Constructive discrepancy minimization by walking on the edges. In FOCS, pages 61–67. IEEE Computer Society, 2012.
- [LRR16] A. Levy, H. Ramadas, and T. Rothvoss. Deterministic discrepancy minimization via the multiplicative weight update method. CoRR, abs/1611.08752, 2016.
- [LSV86] L. Lovász, J. Spencer, and K. Vesztergombi. Discrepancy of set-systems and matrices. Eur. J. Comb., 7(2):151–160, 1986.
- [LT11] M. Ledoux and M. Talagrand. Probability in Banach spaces. Classics in Mathematics. Springer-Verlag, Berlin, 2011. Isoperimetry and processes, Reprint of the 1991 edition.
- [Mek14] R. Meka. Discrepancy and beating the union bound (blog post), 2014.
- [MSS15] A. W. Marcus, D. A. Spielman, and N. Srivastava. Interlacing families ii: Mixed characteristic polynomials and the Kadison-Singer problem. Annals of Mathematics, 182(1):327–350, 2015.
- [NTZ13] A. Nikolov, K. Talwar, and L. Zhang. The geometry of differential privacy: the sparse and approximate cases. In STOC, pages 351–360. ACM, 2013.
- [Rot14] T. Rothvoß. Constructive discrepancy minimization for convex sets. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014, pages 140–145, 2014.
- [Roy14] T. Royen. A simple proof of the gaussian correlation conjecture extended to multivariate gamma distributions. arXiv: Probability, 2014.
- [RR20] V. Reis and T. Rothvoss. Linear size sparsifier and the geometry of the operator norm ball. In SODA, pages 2337–2348. SIAM, 2020.
- [RR22] V. Reis and T. Rothvoss. Vector balancing in Lebesgue spaces. Random Structures and Algorithms, 08 2022.
- [Sch87] G. Schechtman. More on embedding subspaces of in . Compositio Mathematica, 61(2):159–169, 1987.
- [Sch07] G. Schechtman. Fourier analytic methods in convex geometry (workshop at the American Institute of Mathematics; http://aimpl.org/fourierconvex/1/), 2007.
- [Spe85] J. Spencer. Six standard deviations suffice. Trans. Amer. Math. Soc., 289(2):679–706, 1985.
- [SW99] S. J. Szarek and E. Werner. A nonsymmetric correlation inequality for gaussian measure. Journal of Multivariate Analysis, 68(2):193–211, 1999.
- [Tal90] M. Talagrand. Embedding subspaces of into . Proceedings of the American Mathematical Society, 108(2):363–369, 1990.
- [Tko15] T. Tkocz. High-dimensional Phenomena: Dilations, Tensor Products and Geometry of . University of Warwick, 2015.
- [Vaa79] J. D. Vaaler. A geometric inequality with applications to linear forms. Pacific Journal of Mathematics, 83(2):543 – 553, 1979.
- [Zou12] A. Zouzias. A matrix hyperbolic cosine algorithm and applications. In Artur Czumaj, Kurt Mehlhorn, Andrew Pitts, and Roger Wattenhofer, editors, Automata, Languages, and Programming, pages 846–858, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg.
Appendix A Normalizing zonotopes
In this section, we show that for any full-dimensional zonotope there is a linear transformation and a normalized zonotope so that . For this result we will need the existence of Lewis weights [CP15]:
Theorem 25.
Given a matrix , there exists a unique vector so that for all one has
where . Moreover, , with equality for full rank .
Now to the proof of Lemma 16.
Proof of Lemma 16.
Consider a full-dimensional zonotope with . Let be the diagonal matrix corresponding to the Lewis weights of and let where is large enough so that for all . Define a matrix and define a second matrix where each row is replaced by many rows so that the first rows are all copies of , and (if ) the last row is , for a total of many rows. We will show that the conditions of Lemma 16 hold with
and .
First we show that is normalized, or equivalently that is approximately regular. Note that
so that by definition of ,
Moreover, by the definition of Lewis weights, for each row corresponding to a copy of one has
where the last inequality follows since
Thus is approximately regular, and is normalized.
To see that , take an arbitrary
and rewrite it as
Now taking an arbitrary , we may write
completing the proof of the lemma. Finally, note that this result immediately implies that
Appendix B Gaussian measure
Proof of Lemma 7.
We make use of the following tail inequality due to Szarek and Werner [SW99] which holds for :
In particular, for the right side is upper bounded by . Thus
Since the function is convex, we have for all as it holds for the endpoints of the interval. Therefore for ,
We conclude that for any with and one has
Indeed, the last inequality follows because
where the second to last inequality follows from for . ∎