
The Phase Transition of Discrepancy in Random Hypergraphs

Calum MacRury Department of Computer Science, University of Toronto, Toronto, ON, Canada [email protected] Tomáš Masařík Department of Applied Mathematics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic & Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland & Department of Mathematics, Simon Fraser University, Burnaby, BC, Canada [email protected] Leilani Pai Department of Mathematics, University of Nebraska-Lincoln, Lincoln NE, USA [email protected]  and  Xavier Pérez-Giménez Department of Mathematics, University of Nebraska-Lincoln, Lincoln NE, USA [email protected]
Abstract.

Motivated by the Beck-Fiala conjecture, we study the discrepancy problem in two related models of random hypergraphs on $n$ vertices and $m$ edges. In the first (edge-independent) model, a random hypergraph $H_1$ is constructed by fixing a parameter $p$ and allowing each of the $n$ vertices to join each of the $m$ edges independently with probability $p$. In the parameter range in which $pn\rightarrow\infty$ and $pm\rightarrow\infty$, we show that with high probability (w.h.p.) $H_1$ has discrepancy at least $\Omega(2^{-n/m}\sqrt{pn})$ when $m=O(n)$, and at least $\Omega(\sqrt{pn\log\gamma})$ when $m\gg n$, where $\gamma=\min\{m/n,pn\}$. In the second (edge-dependent) model, $d$ is fixed and each vertex of $H_2$ independently joins exactly $d$ edges uniformly at random. We obtain analogous results for this model by generalizing the techniques used for the edge-independent model with $p=d/m$. Namely, for $d\rightarrow\infty$ and $dn/m\rightarrow\infty$, we prove that w.h.p. $H_2$ has discrepancy at least $\Omega(2^{-n/m}\sqrt{dn/m})$ when $m=O(n)$, and at least $\Omega(\sqrt{(dn/m)\log\gamma})$ when $m\gg n$, where $\gamma=\min\{m/n,dn/m\}$. Furthermore, we obtain nearly matching asymptotic upper bounds on the discrepancy in both models (when $p=d/m$) in the dense regime of $m\gg n$. Specifically, we apply the partial colouring lemma of Lovett and Meka to show that w.h.p. $H_1$ and $H_2$ each have discrepancy $O(\sqrt{dn/m}\,\log(m/n))$, provided $d\rightarrow\infty$, $dn/m\rightarrow\infty$ and $m\gg n$. This result is algorithmic, and together with the work of Bansal and Meka it characterizes how the discrepancy of each random hypergraph model transitions from $\Theta(\sqrt{d})$ to $o(\sqrt{d})$ as $m$ varies from $m=\Theta(n)$ to $m\gg n$.

1. Introduction

A hypergraph\footnote{The definition of a hypergraph is equivalent to that of a set system, though we exclusively use the former terminology in this work.} $H=(V,E)$ consists of a set $V=\{v_1,\ldots,v_n\}$ of $n$ vertices together with a multiset $E=\{e_1,\ldots,e_m\}$ of $m$ edges, where each edge $e_i$ is a subset of $V$. We denote the size of an edge $e$ by $|e|$. Note that $H$ is allowed to have duplicate edges (i.e., we may have $e_i=e_{i'}$ for some $i\neq i'$). We can bijectively represent $H$ by an $m\times n$ $\{0,1\}$-matrix $\bm{A}=\bm{A}(H)=(A_{i,j})_{1\leqslant i\leqslant m,\,1\leqslant j\leqslant n}$, where $A_{i,j}=1$ if $v_j\in e_i$ and $A_{i,j}=0$ if $v_j\notin e_i$. We call $\bm{A}$ the incidence matrix of $H$. In particular, each pair of duplicate edges in $H$ corresponds to a pair of identical rows in $\bm{A}$. Moreover, we define the degree of a vertex $v_j$ of $H$ to be the number of edges containing that vertex, i.e. the number of $1$'s in the $j$th column of $\bm{A}$. A classical problem in discrepancy theory is to find a 2-colouring of the vertices of $H$ so that every edge is as "balanced" as possible. To make this precise, we define a colouring of $H$ to be a function $\psi:V\rightarrow\{-1,1\}$. This can be extended to a map $\psi:E\to\mathbb{Z}$ by defining $\psi(e):=\sum_{v\in e}\psi(v)$ for each $e\in E$. We call $|\psi(e)|$ the discrepancy of edge $e$, and note that it measures how unbalanced the colouring $\psi$ is on that edge. Further, the discrepancy of a colouring $\psi$, denoted $\operatorname{disc}(\psi)$, is the discrepancy of the least balanced edge of $H$. That is,

\[\operatorname{disc}(\psi):=\max_{e\in E}|\psi(e)|.\]

Finally, we define the discrepancy of the hypergraph $H$ as

\[\operatorname{disc}(H):=\min_{\psi}\operatorname{disc}(\psi),\]

where the minimization is over all colourings of $V$.
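To make the definition concrete, $\operatorname{disc}(H)$ can be computed by brute force over all $2^n$ colourings. The following minimal Python sketch does exactly this; the representation of edges as sets of vertex indices and the function name are our own illustrative choices, and the search is exponential in $n$, so it is only usable on tiny examples.

```python
from itertools import product

def discrepancy(n, edges):
    # Brute-force disc(H): minimise, over all 2^n colourings
    # psi: V -> {-1, +1}, the worst edge imbalance max_e |psi(e)|.
    best = float("inf")
    for psi in product((-1, 1), repeat=n):
        worst = max(abs(sum(psi[v] for v in e)) for e in edges)
        best = min(best, worst)
    return best

# A 4-cycle as a 2-uniform hypergraph: alternating colours balance every edge.
print(discrepancy(4, [{0, 1}, {1, 2}, {2, 3}, {3, 0}]))  # -> 0
# A single edge of odd size can never be balanced exactly.
print(discrepancy(3, [{0, 1, 2}]))  # -> 1
```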

This and other related notions of combinatorial discrepancy have been studied from various angles and in different contexts. (For a more detailed introduction to the subject, we refer to the books [18], [7] and [8].) One of the central problems in discrepancy theory in the above setting is to bound the discrepancy of a hypergraph $H$ in terms of its maximum degree $d$. In [5], Beck and Fiala proved that the discrepancy of $H$ is no larger than $2d-1$. Moreover, they conjectured that the correct upper bound is of the order $O(d^{1/2})$. There has been much work on improving the original bound of [5]. Most recently, Bukh [6] proved that $\operatorname{disc}(H)\leqslant 2d-\lg^{*}(d)$, where $\lg^{*}$ is the binary iterated logarithm. This of course yields no asymptotic improvement in terms of $d$, but to this date it is the strongest upper bound known which depends solely on $d$. If the upper bound is allowed to depend on multiple parameters of the hypergraph, then there are results yielding improvements for hypergraphs in the right range of parameters. For instance, Banaszczyk [3] showed that if $n:=|V|$, then $\operatorname{disc}(H)=O(\sqrt{d\log n})$, a bound which was later made algorithmic by Bansal and Meka [4]. Recently, Potukuchi [20] proved that, for $d$-regular $H$, $\operatorname{disc}(H)=O(\sqrt{d}+\lambda)$, where $\lambda:=\max_{v\perp\bm{1},\,\lVert v\rVert_2=1}\lVert\bm{A}v\rVert_2$ and $\bm{A}$ is the incidence matrix of $H$.

In order to find upper bounds which depend solely on the maximum degree, restricted classes of hypergraphs have instead been studied. For example, if $H=(V,E)$ is assumed to be both $d$-regular and $d$-uniform (that is, the incidence matrix $\bm{A}$ has exactly $d$ $1$'s in each column and row), then a folklore application of the Lovász local lemma can be used to show that there exists a colouring which achieves discrepancy $O(\sqrt{d\log d})$ (see [10, 19] for details).

Another approach is to restrict one's attention to hypergraphs which are generated randomly. In this work, we focus on two specific random hypergraph models. Both of these models are defined as distributions over the set of hypergraphs with $n\geqslant 1$ vertices and $m\geqslant 1$ edges.

1.1. The Edge-Independent Model

In [15], Hoberg and Rothvoss introduced a random hypergraph model, denoted $\mathbb{H}(n,m,p)$, in which a probability parameter $0\leqslant p\leqslant 1$ is given (in addition to $n$ and $m$). Their model, which we refer to as the edge-independent model, is a distribution on hypergraphs which we describe through the following randomized procedure:

  • Fix the vertex set $V=\{v_1,\ldots,v_n\}$.

  • For each $1\leqslant i\leqslant m$, construct edge $e_i$ by placing each $v\in V$ in $e_i$ independently with probability $p$.

We denote $E=\{e_1,\ldots,e_m\}$ and define $\mathbb{H}(n,m,p)$ to be the distribution of the random hypergraph $H=(V,E)$. In other words, the entries of the incidence matrix $\bm{A}$ of $H$ are independent Bernoulli random variables with parameter $p$. We write $H\sim\mathbb{H}(n,m,p)$ to indicate that $H$ is drawn from $\mathbb{H}(n,m,p)$.
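The two-step procedure above amounts to filling an $m\times n$ $\{0,1\}$-matrix with independent coin flips. A minimal Python sketch (the function name and seeding convention are our own illustrative choices):

```python
import random

def sample_edge_independent(n, m, p, seed=0):
    # Incidence matrix of H ~ H(n, m, p): an m x n matrix whose entries
    # are independent Bernoulli(p) (rows are edges, columns are vertices).
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for _ in range(n)]
            for _ in range(m)]

A = sample_edge_independent(n=6, m=4, p=0.5)
print(len(A), len(A[0]))  # -> 4 6
```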

If $m=m(n)$ and $p=p(n)$ are functions which depend on $n$, then we say that $\mathbb{H}(n,m,p)$ satisfies a property $Q=Q(n)$ w.h.p., provided that $\mathbb{P}[H(n)\text{ satisfies }Q(n)]\rightarrow 1$ as $n\rightarrow\infty$, where $H=H(n)$ is drawn from $\mathbb{H}(n,m,p)$. Often, we abuse terminology slightly and say that the random hypergraph $H$ satisfies $Q$ w.h.p.

Hoberg and Rothvoss showed that, if $n\geqslant C_1 m^2\log m$ and $C_1\log n/m\leqslant p\leqslant 1/2$ for some sufficiently large constant $C_1>0$, then $\operatorname{disc}(H)\leqslant 1$ w.h.p. for $H\sim\mathbb{H}(n,m,p)$. A natural question left open by their work is whether or not $H$ continues to have constant discrepancy when $n$ transitions from $C_1 m^2\log m$ to $\Theta(m\log m)$. Potukuchi [19] provided a positive answer to this question in the special case $p=1/2$ by showing that if $n\geqslant C_2 m\log m$ for $C_2=(2\log 2)^{-1}$, then w.h.p. $\operatorname{disc}(H)\leqslant 1$. Very recently, Altschuler and Niles-Weed [2] used Stein's method [9] in conjunction with the second moment method to substantially generalize this result to hold when $p=p(n)$ depends on $n$. This includes the challenging case when $p(n)\rightarrow 0$.

1.2. The Edge-Dependent Model

A related model was introduced by Ezra and Lovett in [10]. As before, we fix $n\geqslant 1$ and $m\geqslant 1$, yet we now also consider a parameter $d\geqslant 1$ which satisfies $d\leqslant m$. The edge-dependent model, denoted $\mathcal{H}(n,m,d)$, is again a distribution on hypergraphs, though we describe it through a different randomized procedure:

  • Fix the vertex set $V=\{v_1,\ldots,v_n\}$.

  • For each vertex $v\in V$, independently and u.a.r. (uniformly at random) draw $I_v\subseteq[m]$ with $|I_v|=d$.

  • For each $1\leqslant i\leqslant m$, construct edge $e_i$ by defining $e_i:=\{u\in V:i\in I_u\}$.

By setting $E:=\{e_1,\ldots,e_m\}$, we define $\mathcal{H}(n,m,d)$ to be the distribution of the random hypergraph $H=(V,E)$. In other words, the incidence matrix $\bm{A}$ of $H\sim\mathcal{H}(n,m,d)$ is a random $m\times n$ matrix in which each column has $d$ ones and $m-d$ zeros. Note that the columns of $\bm{A}$ are independent, but the rows are not. We define what it means for $\mathcal{H}(n,m,d)$ to satisfy a property w.h.p. in the same way as in the edge-independent model.
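Operationally, the three steps above fill the incidence matrix column by column, each column receiving a uniform $d$-subset of ones. A minimal sketch (function name and seeding are our own illustrative choices); the column sums are always exactly $d$, while the row sums (edge sizes) fluctuate:

```python
import random

def sample_edge_dependent(n, m, d, seed=0):
    # Incidence matrix of H ~ H(n, m, d): each vertex (column) independently
    # picks a uniform d-subset I_v of the m edge indices; edge e_i then
    # consists of the vertices v with i in I_v.
    rng = random.Random(seed)
    A = [[0] * n for _ in range(m)]
    for v in range(n):
        for i in rng.sample(range(m), d):
            A[i][v] = 1
    return A

A = sample_edge_dependent(n=5, m=7, d=3)
# Every column has exactly d ones:
print([sum(A[i][v] for i in range(7)) for v in range(5)])  # -> [3, 3, 3, 3, 3]
```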

Ezra and Lovett showed that, if $m\geqslant n\geqslant d\to\infty$, then w.h.p. $\operatorname{disc}(H)=O(\sqrt{d\log d})$. Bansal and Meka [4] later showed that the factor of $\sqrt{\log d}$ is redundant, thereby matching the bound claimed in the Beck-Fiala conjecture. Specifically, they showed that, for the entire range of $n$ and $m$, $\operatorname{disc}(H)=O(\sqrt{d})$ w.h.p., provided $d=\Omega((\log\log m)^2)$. This result can be easily modified to also apply to the edge-independent model, provided the analogous condition $pm=\Omega((\log\log m)^2)$ holds.

In [12], Franks and Saks considered the more general problem of vector balancing. Their main result concerns a collection of random matrix models in which the columns are generated independently. In particular, their results apply to the sparse regime ($m\ll n$) of both random hypergraph models we have discussed. Specifically, they show that if $n=\Omega(m^3\log^3 m)$, then w.h.p. $\operatorname{disc}(H)\leqslant 2$, provided $H$ is drawn from $\mathbb{H}(n,m,p)$ or $\mathcal{H}(n,m,d)$ with $p=d/m$. Finally, in very recent work, Turner et al. [22] considered the problem of vector balancing when the entries of the random matrix $\bm{A}$ are independent and identically distributed (i.i.d.) standard Gaussian random variables. Amongst other results, they showed that the discrepancy of $\bm{A}$ is $\Theta(2^{-n/m}\sqrt{n})$ w.h.p., provided $m\ll n$.

1.3. An Overview of Our Results

All results proven in this paper are asymptotic with respect to the parameter $n$, the number of vertices of the model. Thus, we hereafter assume that $m=m(n)$, $p=p(n)$ and $d=d(n)$ are functions which depend on $n$, with $p=d/m$.

While previous results have successfully matched (or improved upon) the conjectured Beck-Fiala bound of $\sqrt{d}$ in the random hypergraph setting, they either apply to a restricted parameter range [10, 19, 15, 12], or do not provide asymptotically tight results for the full parameter range [10, 4]. In particular, when $n/\log n\ll m\ll n$ or $m\gg n$, the correct order of the discrepancy is unknown in either model. In this paper, we obtain (almost) matching upper and lower bounds in the dense regime of $m\gg n$. Moreover, our upper bounds are algorithmic. In the sparse regime of $m\ll n$, we provide the first lower bounds which apply to the full parameter range under the mild assumption that both the average edge size $dn/m=pn$ and degree $d=pm$ tend to infinity with $n$. Proving the existence of a colouring whose discrepancy matches our lower bound in the regime $n/\log n\ll m\ll n$ remains open. This problem is particularly challenging from an algorithmic perspective, as the partial colouring lemma [17], the main tool used in the literature, does not appear to be useful in this range of parameters.

We now formally state our main results:

Theorem 1.1.

Suppose that $H$ is generated from $\mathbb{H}(n,m,p)$ with $pn\rightarrow\infty$, $pm\rightarrow\infty$ and $p$ bounded away from $1$. If $m=O(n)$, then w.h.p.

\[\operatorname{disc}(H)=\Omega\left(\max\{2^{-n/m}\sqrt{pn},1\}\right).\]

Moreover, if $m\gg n$, then w.h.p.

\[\operatorname{disc}(H)=\Omega\left(\sqrt{pn\log\gamma}\right),\]

where $\gamma:=\min\{m/n,pn\}$.

Remark 1.

This bound complements the upper bound of $1$ in [2] for $m\leqslant Cn/\log n$, where $C=2\log 2$, but it also implies that if $\varepsilon>0$ is a constant, then $H$ has non-constant discrepancy for $m\geqslant(C+\varepsilon)n/\log(np)$. Thus, $H$ exhibits a sharp phase transition at $Cn/\log(np)$ for constant $p$.

We obtain analogous results regarding the edge-dependent model $\mathcal{H}(n,m,d)$:

Theorem 1.2.

Suppose that $H$ is generated from $\mathcal{H}(n,m,d)$ with $dn/m\rightarrow\infty$, $d\rightarrow\infty$, and $d/m$ bounded away from $1$. If $m=O(n)$, then w.h.p.

\[\operatorname{disc}(H)=\Omega\left(\max\left\{2^{-n/m}\sqrt{\frac{dn}{m}},1\right\}\right).\]

Moreover, if $m\gg n$, then w.h.p.

\[\operatorname{disc}(H)=\Omega\left(\sqrt{\frac{dn}{m}\log\gamma}\right),\]

where $\gamma:=\min\{m/n,dn/m\}$.

Remark 2.

The techniques used in [2] do not seem to apply to the edge-dependent model. Thus, if $m=O(n^{1/3}/\log n)$, then [12] implies $\operatorname{disc}(H)\leqslant 2$; however, when $n^{1/3}/\log n\ll m\ll n$, $O(\sqrt{d})$ remains the best known upper bound on $\operatorname{disc}(H)$ [4].

Finally, we prove an upper bound in the dense regime of $m\gg n$ which holds for both models:

Theorem 1.3.

Assume that $H$ is drawn from $\mathcal{H}(n,m,d)$ or $\mathbb{H}(n,m,p)$ with $p=d/m$, and pick any $\beta=\beta(n)\geqslant 1$ satisfying\footnote{Let us remark at this place that by $\log$ we always mean the natural logarithm. In the proof of this theorem it is convenient to use $\lg$ to denote the logarithm of base $2$.}

\[\beta\,\frac{dn}{m}\geqslant\log(m/n)\left(\log\left(\frac{dn}{m}\right)+2\right)^{5}.\tag{1}\]

If $m\gg n$ and $dn/m\rightarrow\infty$, then w.h.p.

\[\operatorname{disc}(H)=O\left(\sqrt{\frac{dn}{m}\,\log\left(\frac{m}{n}\right)\,\beta}\right).\]

Moreover, whenever this holds, we can find a colouring $\psi$ of $H$ with such a discrepancy in expected polynomial time.

Remark 3.

Observe that the smaller we are able to take $\beta$, the better the upper bound we get. In particular, if $\beta:=\log(m/n)$, then (1) and $\beta(n)\geqslant 1$ are satisfied (for large enough $n$). Therefore, at worst the upper bound is $O\left(\sqrt{dn/m}\,\log(m/n)\right)$, which is significantly smaller than the upper bound of $O(\sqrt{d})$ of [4], as $m\gg n$.

Theorem 1.3 provides asymptotically matching upper bounds for the lower bounds of Theorems 1.1 and 1.2 in a broad range of the dense regime $m\gg n$. For instance, this happens when $d=pm\geqslant(m/n)^{1+\varepsilon}$, where $\varepsilon>0$ is a constant, since in that case $\beta:=1$ clearly satisfies (1), and $\log(dn/m)=\Omega(\log(m/n))$, so $\log\gamma=\Theta(\log(m/n))$.

Corollary 1.4.

Assume that $H$ is drawn from $\mathcal{H}(n,m,d)$ or $\mathbb{H}(n,m,p)$, where $p=d/m$. If $m\gg n$ and $d=pm\geqslant(m/n)^{1+\varepsilon}$ for constant $\varepsilon>0$, then w.h.p.

\[\operatorname{disc}(H)=\Theta\left(\sqrt{\frac{dn}{m}\log\left(\frac{m}{n}\right)}\right).\]
Remark 4.

This shows that in the dense regime, the main parameter of interest describing the discrepancy of each random hypergraph model is the average edge size $dn/m$, as opposed to $d$, the average/maximum degree (depending on the model).

2. Preliminaries

In this section, we first provide some central-limit-type results for sums of independent random variables, which will be used in Section 3 to obtain lower bounds on the discrepancy. In the sequel, we write $N(0,1)$ to denote a generic random variable distributed as a standard Gaussian, with cumulative distribution function

\[\Phi(x):=\mathbb{P}[N(0,1)\leqslant x]=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}\exp(-t^{2}/2)\,dt\qquad\text{for each }x\in\mathbb{R}.\]

The Berry-Esseen theorem (see Section XVI.5 in [11]), which we state below for convenience, yields a quantitative form of the central limit theorem for sums of independent random variables with finite third moments.

Theorem 2.1 (Berry-Esseen).

The following holds for some universal constant $c_{uni}>0$. Let $Y_1,\ldots,Y_n$ be independent random variables with $\mathbb{E}[Y_i]=0$, $\mathbb{E}[Y_i^2]=\sigma_i^2$ and $\mathbb{E}[|Y_i|^3]=\rho_i<\infty$ for all $i\in[n]$. Consider the sum $Y=\sum_{i=1}^n Y_i$, with standard deviation $\sigma=\sqrt{\sum_{i=1}^n\sigma_i^2}$, and let $\rho=\sum_{i=1}^n\rho_i$. Assume $\sigma>0$. Then,

\[\sup_{x\in\mathbb{R}}\left|\mathbb{P}[Y/\sigma\leqslant x]-\Phi(x)\right|\leqslant\frac{(c_{uni}/2)\,\rho}{\sigma^{3}}.\]
Remark 5.

There has been a series of works improving upon the constant $c_{uni}$, the latest of which, by Shevtsova [21], shows that $c_{uni}/2\leqslant 0.560$ in our setting. However, since we are only concerned with the asymptotic growth of the discrepancy, the precise value of $c_{uni}$ is not important.

Theorem 2.1 and the triangle inequality immediately yield the following corollary:

Corollary 2.2.

For any interval $I\subseteq(-\infty,\infty)$,

\[\left|\mathbb{P}[Y/\sigma\in I]-\mathbb{P}[N(0,1)\in I]\right|\leqslant\frac{c_{uni}\,\rho}{\sigma^{3}}.\]

We will apply this result to linear combinations of independent Bernoulli random variables with coefficients in $\{-1,1\}$. More precisely, let $X_1,\ldots,X_n$ be independent random variables with $X_i\sim\textup{Ber}(p_i)$ for some $\bm{p}=(p_1,\ldots,p_n)\in[0,1]^n$ (where $\textup{Ber}(p_i)$ denotes a Bernoulli of parameter $p_i$). Given a vector $\bm{a}=(a_1,\ldots,a_n)\in\{-1,1\}^n$, consider the sum $S_{\bm{a},\bm{p}}:=\sum_{i=1}^n a_i X_i$, whose standard deviation we denote by $\sigma$. Under these assumptions we obtain the following bound.

Lemma 2.3.

For any bounded interval $[L,R]\subseteq(-\infty,\infty)$,

\[\mathbb{P}[S_{\bm{a},\bm{p}}\in[L,R]]\leqslant\frac{c_{uni}}{\sigma}+\left(1-\exp\left(-\frac{(R-L)^{2}}{2\pi\sigma^{2}}\right)\right)^{1/2}.\]

(When $\sigma=0$, the right-hand side of the bound above is simply interpreted as $+\infty$.)

Proof.

Let $\mu=\mathbb{E}[S_{\bm{a},\bm{p}}]=\sum_{i=1}^n a_i p_i$. We centre $S_{\bm{a},\bm{p}}$ by defining the random variables $Y_i=a_i(X_i-\mathbb{E}[X_i])$ and setting

\[Y:=\sum_{i=1}^{n}Y_{i}=S_{\bm{a},\bm{p}}-\mu.\]

Observe that $Y$ has the same standard deviation $\sigma$ as $S_{\bm{a},\bm{p}}$, which we assume is non-zero (otherwise the lemma holds trivially). Moreover, $S_{\bm{a},\bm{p}}\in[L,R]$ if and only if $Y/\sigma\in[\widetilde{L},\widetilde{R}]$, where $\widetilde{L}:=(L-\mu)/\sigma$ and $\widetilde{R}:=(R-\mu)/\sigma$. Further,

\[\mathbb{E}[|Y_{i}|^{3}]=(1-p_{i})p_{i}^{3}+p_{i}(1-p_{i})^{3}=p_{i}(1-p_{i})\left(p_{i}^{2}+(1-p_{i})^{2}\right)\leqslant p_{i}(1-p_{i}),\]

and hence

\[\rho=\sum_{i=1}^{n}\mathbb{E}[|Y_{i}|^{3}]\leqslant\sum_{i=1}^{n}p_{i}(1-p_{i})=\sigma^{2}.\]

Then, Corollary 2.2 immediately yields

\[\mathbb{P}[S_{\bm{a},\bm{p}}\in[L,R]]=\mathbb{P}[Y/\sigma\in[\widetilde{L},\widetilde{R}]]\leqslant\frac{c_{uni}}{\sigma}+\mathbb{P}[N(0,1)\in[\widetilde{L},\widetilde{R}]].\]

To finalize the proof, it only remains to bound $\mathbb{P}[N(0,1)\in[\widetilde{L},\widetilde{R}]]$. In order to do so, we will use the inequality

\[\int_{-t}^{t}\exp(-x^{2}/2)\,dx\leqslant\sqrt{2\pi\left(1-\exp(-2t^{2}/\pi)\right)}\qquad\text{for all }t\in\mathbb{R},\tag{2}\]

which can be found in [23]. Furthermore, note that $\exp(-x^2/2)$ is an even function of $x$, decreasing in $x^2$. Combining this fact with (2), we get

\begin{align*}
\mathbb{P}[N(0,1)\in[\widetilde{L},\widetilde{R}]]&=\frac{1}{\sqrt{2\pi}}\int_{\widetilde{L}}^{\widetilde{R}}\exp(-x^{2}/2)\,dx\\
&\leqslant\frac{1}{\sqrt{2\pi}}\int_{-(\widetilde{R}-\widetilde{L})/2}^{(\widetilde{R}-\widetilde{L})/2}\exp(-x^{2}/2)\,dx\\
&=\frac{1}{\sqrt{2\pi}}\int_{-(R-L)/2\sigma}^{(R-L)/2\sigma}\exp(-x^{2}/2)\,dx\\
&\leqslant\sqrt{1-\exp\left(-\frac{(R-L)^{2}}{2\pi\sigma^{2}}\right)},
\end{align*}

which concludes the proof of the lemma. ∎
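As a numeric sanity check of Lemma 2.3, we can compare a Monte Carlo estimate of $\mathbb{P}[S_{\bm{a},\bm{p}}\in[L,R]]$ against the right-hand side of the lemma. The sketch below is our own illustration: it takes $c_{uni}=1.12$ (twice the value $0.560$ quoted in Remark 5) and fixes one arbitrary choice of $\bm{a}$ and $\bm{p}$.

```python
import math
import random

def lemma_bound(L, R, sigma, c_uni=1.12):
    # Right-hand side of Lemma 2.3; by Remark 5 (Shevtsova's estimate)
    # we may take c_uni <= 2 * 0.560 = 1.12.
    return c_uni / sigma + math.sqrt(
        1 - math.exp(-(R - L) ** 2 / (2 * math.pi * sigma ** 2)))

def empirical(L, R, a, p, trials=4000, rng=random.Random(1)):
    # Monte Carlo estimate of P[S_{a,p} in [L, R]], where
    # S_{a,p} = sum_i a_i X_i with X_i ~ Ber(p_i) independent.
    hits = 0
    for _ in range(trials):
        s = sum(ai for ai, pi in zip(a, p) if rng.random() < pi)
        hits += L <= s <= R
    return hits / trials

n = 200
a = [1 if i % 2 == 0 else -1 for i in range(n)]
p = [0.3] * n
sigma = math.sqrt(sum(pi * (1 - pi) for pi in p))  # sigma^2 = sum p_i(1-p_i)
est = empirical(-3, 3, a, p)
print(f"empirical {est:.3f} <= bound {lemma_bound(-3, 3, sigma):.3f}")
```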

Now suppose there exist $0<p<1$, $0<\zeta<1$ and $0\leqslant\varepsilon<1$ such that

\[\sum_{i=1}^{n}p_{i}\geqslant(1-\varepsilon)pn\qquad\text{and}\qquad p_{i}\leqslant\zeta\quad\text{for each }i\in[n].\tag{3}\]

Then we can restate the upper bound of Lemma 2.3 in the following convenient way, which we use as our key tool in proving Theorems 1.1 and 1.2.

Lemma 2.4.

Suppose $p_1,\ldots,p_n$ satisfy (3) for some $0\leqslant\varepsilon<1$, $0<p<1$ and $0<\zeta<1$. Then, for any bounded interval $[L,R]\subseteq(-\infty,\infty)$,

\begin{align}
\mathbb{P}[S_{\bm{a},\bm{p}}\in[L,R]]&\leqslant\frac{c_{uni}}{\sqrt{(1-\zeta)(1-\varepsilon)np}}+\left(1-\exp\left(-\frac{(R-L)^{2}}{2\pi(1-\zeta)(1-\varepsilon)np}\right)\right)^{1/2}\tag{4}\\
&\leqslant\frac{c_{uni}+|R-L|/\sqrt{2\pi}}{\sqrt{(1-\zeta)(1-\varepsilon)np}}.\tag{5}
\end{align}
Proof.

First note that the variance $\sigma^2$ of $S_{\bm{a},\bm{p}}$ satisfies

\[\sigma^{2}=\sum_{i=1}^{n}p_{i}(1-p_{i})\geqslant(1-\zeta)\sum_{i=1}^{n}p_{i}\geqslant(1-\zeta)(1-\varepsilon)pn.\]

This bound, used with Lemma 2.3, immediately gives (4). Then (5) follows by applying the inequality $1-\exp(-x)\leqslant x$, which holds for every $x\in\mathbb{R}$. ∎

In Theorem 1.3, we prove an upper bound on the discrepancy in the dense regime ($m\gg n$). In this parameter range, we make use of the algorithmic partial colouring lemma, a seminal result of Lovett and Meka [17] later made deterministic by Levy, Ramadas, and Rothvoss [16]. We defer the statement of this result to Lemma 4.1 of Section 4, as it will not be needed until then.

3. Lower Bounding Discrepancy

3.1. The Edge-Independent Model

We now return to the setting of hypergraph discrepancy in the context of the edge-independent model $\mathbb{H}(n,m,p)$. Throughout this section, $m=m(n)$, $p=p(n)$, and asymptotic statements are with respect to $n\to\infty$. We first observe that w.h.p. there are some edges containing an odd number of vertices, and thus the discrepancy cannot be zero.

Proposition 3.1.

Suppose $H\sim\mathbb{H}(n,m,p)$ with $m\to\infty$, $pn\to\infty$ and $p$ bounded away from $1$ as $n\to\infty$. Then w.h.p. $\operatorname{disc}(H)\geqslant 1$.

Proof.

By hypothesis, $p\leqslant 1-\varepsilon$ for some sufficiently small constant $\varepsilon>0$. Let $e_1,\ldots,e_m$ be the edges of $H$, and observe that the number of vertices contained in each edge is distributed as $\textup{Bin}(n,p)$. Then the probability that $e_i$ has an even number of vertices is

\[\sum_{j\text{ even}}\binom{n}{j}p^{j}(1-p)^{n-j}=\frac{1}{2}\left(1+(1-2p)^{n}\right)=1/2+o(1),\tag{6}\]

where we used the fact that $|1-2p|^{n}\leqslant\max\{(1-2\varepsilon)^{n},e^{-2pn}\}=o(1)$ as $n\to\infty$. Hence, the probability that all the edges of $H$ contain an even number of vertices is $(1/2+o(1))^{m}=o(1)$. Therefore, w.h.p. $H$ has an edge with an odd number of vertices, and thus has discrepancy at least $1$. ∎
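The parity identity in (6) is easy to verify numerically. The following sketch (function names are our own) compares the binomial sum over even values against the closed form $\tfrac{1}{2}(1+(1-2p)^n)$:

```python
import math

def even_prob_exact(n, p):
    # P[Bin(n, p) is even], summed term by term.
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(0, n + 1, 2))

def even_prob_closed(n, p):
    # Closed form (1 + (1 - 2p)^n) / 2 from (6).
    return (1 + (1 - 2 * p) ** n) / 2

for n, p in [(10, 0.3), (25, 0.5), (40, 0.9)]:
    assert abs(even_prob_exact(n, p) - even_prob_closed(n, p)) < 1e-12
print("parity identity (6) verified")
```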

Proposition 3.1 trivially implies Theorem 1.1 in the regime in which $2^{-n/m}\sqrt{pn}=O(1)$. We now use Lemma 2.4 to prove the remaining cases of Theorem 1.1 via a simple first moment argument:

Proof of Theorem 1.1.

Suppose that $H=(V,E)$ is generated from $\mathbb{H}(n,m,p)$ with $pn\to\infty$, $pm\to\infty$ and $p$ bounded away from $1$, as $n\to\infty$. We define

\[\hat{f}=\hat{f}(n)=\begin{cases}2^{-n/m}\sqrt{p(1-p)n}&\text{if }m=O(n),\\ \sqrt{p(1-p)n\log\gamma}&\text{if }m\gg n,\end{cases}\]

where $\gamma=\min\{pn,m/n\}$, and choose a sufficiently small constant $\kappa>0$. To prove the theorem, it suffices to show that w.h.p. $\operatorname{disc}(H)\geqslant\max\{\kappa\hat{f},1\}$. Note that this is trivially true when $\hat{f}\leqslant 1/\kappa$, in view of Proposition 3.1. So we will assume that $\hat{f}>1/\kappa$, and show that w.h.p. $\operatorname{disc}(H)\geqslant\kappa\hat{f}$. Let $\Psi$ be the set of all colourings $\psi:V\to\{-1,1\}$, and let $Z$ denote the number of colourings $\psi\in\Psi$ with discrepancy $\operatorname{disc}(\psi)\leqslant\kappa\hat{f}$. Since the random edges $e_1,\ldots,e_m$ of $H$ are i.i.d.,

\[\mathbb{E}[Z]=\sum_{\psi\in\Psi}\mathbb{P}[\operatorname{disc}(\psi)\leqslant\kappa\hat{f}]=\sum_{\psi\in\Psi}\mathbb{P}[|\psi(e_{1})|\leqslant\kappa\hat{f}]^{m}.\tag{7}\]

Note that $\psi(e_1)=\sum_{i=1}^n\psi(v_i)\bm{1}_{v_i\in e_1}$, where $\bm{1}_{v_i\in e_1}$ denotes the indicator random variable of the event that edge $e_1$ contains vertex $v_i$, so $\psi(e_1)$ is distributed as $S_{\bm{a},\bm{p}}$ in Section 2 (with $a_i=\psi(v_i)$ and $p_i=\mathbb{P}[v_i\in e_1]$). Hence, by applying (5) in Lemma 2.4 (with $\varepsilon=0$, $\zeta=p$ and $[L,R]=[-\kappa\hat{f},\kappa\hat{f}]$) to each one of the $2^n$ terms of the last sum in (7), it follows that

\[\mathbb{E}[Z]\leqslant 2^{n}\left(\frac{c_{uni}+2\kappa\hat{f}/\sqrt{2\pi}}{\sqrt{p(1-p)n}}\right)^{m}\leqslant 2^{n}\left(\frac{\kappa\hat{f}\,(c_{uni}+\sqrt{2/\pi})}{\sqrt{p(1-p)n}}\right)^{m},\]

where we also used that $\kappa\hat{f}>1$. Let us first consider the case $m=O(n)$. Then, from the definition of $\hat{f}$ and assuming $\kappa<1/(c_{uni}+\sqrt{2/\pi})$,

\[\mathbb{E}[Z]\leqslant\left(\kappa(c_{uni}+\sqrt{2/\pi})\right)^{m}=o(1).\]

Now suppose that $m\gg n$. In this case, we bound the factor $\mathbb{P}[|\psi(e_1)|\leqslant\kappa\hat{f}]$ on the right-hand side of (7) using the tighter inequality (4) in Lemma 2.4 instead of (5). Then, assuming that $C:=2\kappa^2/\pi<1/2$, we get

\begin{align*}
\mathbb{E}[Z]&\leqslant 2^{n}\left(\frac{c_{uni}}{\sqrt{p(1-p)n}}+\left(1-\exp\left(-\frac{(2\kappa\hat{f})^{2}}{2\pi p(1-p)n}\right)\right)^{1/2}\right)^{m}\\
&=2^{n}\left(\frac{c_{uni}}{\sqrt{p(1-p)n}}+\left(1-\gamma^{-2\kappa^{2}/\pi}\right)^{1/2}\right)^{m}\\
&=\left(1+O\left(n/m\right)\right)^{m}\left(O(1/\sqrt{pn})+\left(1-\gamma^{-C}\right)^{1/2}\right)^{m}\\
&=\left(1-\tfrac{1}{2}\gamma^{-C}(1+o(1))\right)^{m}\\
&=\exp\left(-\tfrac{m}{2}\gamma^{-C}(1+o(1))\right)=o(1),
\end{align*}

where we used the facts that $1/\sqrt{pn}=o(\gamma^{-C})$, $n/m=o(\gamma^{-C})$ and $m/\gamma^{C}\to\infty$. In either case, $\mathbb{E}[Z]=o(1)$, and therefore w.h.p. $\operatorname{disc}(H)\geqslant\kappa\hat{f}$. ∎

3.2. The Edge-Dependent Model

In this section, we derive a lower bound on the discrepancy of a hypergraph generated from the edge-dependent model and prove Theorem 1.2. In view of our previous results for the edge-independent model, one natural approach is to compare the two models $\mathcal{H}(n,m,d)$ and $\mathbb{H}(n,m,p)$ via a coupling procedure. For instance, let $m=n$ and $d\gg\log n$ for simplicity, and suppose that we can generate $(H_1,H_2)$ with $H_1\sim\mathcal{H}(n,n,d)$ and $H_2\sim\mathbb{H}(n,n,p)$, with edge sets $E(H_1)=\{e^1_1,\ldots,e^1_n\}$ and $E(H_2)=\{e^2_1,\ldots,e^2_n\}$, in such a way that w.h.p. for every $i=1,\ldots,n$ we have $|e^1_i\,\Delta\, e^2_i|\leqslant\eta$, for some suitable $\eta=\eta(n)$. In particular, this implies that w.h.p. $|\operatorname{disc}(H_1)-\operatorname{disc}(H_2)|\leqslant\eta$, and thus $\operatorname{disc}(H_1)=\Omega(\sqrt{d}-\eta)$ by Theorem 1.1. Unfortunately, since the standard deviation of the size of an edge is $\Theta(\sqrt{d})$ in either model, most naïve attempts to build such a coupling require $\eta\gg\sqrt{d}$, which is too large for our purposes. (In fact, it is not hard to build such a coupling with any $\eta\gg\sqrt{d\log n}$.) As a result, while it is conceivable that a more delicate coupling argument works, we abandon this approach. Instead, we handle the dependencies among the edges of $H_1$ by applying a careful conditioning argument, while generalizing how we apply Lemma 2.4.

As in the edge-independent model, we first prove a constant lower bound on the discrepancy of $H\sim\mathcal{H}(n,m,d)$.

Proposition 3.2.

Suppose $H\sim\mathcal{H}(n,m,d)$ with $m\rightarrow\infty$, $dn/m\to\infty$ and $d/m$ bounded away from $1$ as $n\to\infty$. Then w.h.p. $\operatorname{disc}(H)\geqslant 1$.

Proof.

By hypothesis, $p=d/m\leqslant 1-\varepsilon$ for some sufficiently small constant $\varepsilon>0$. Our goal is to show that w.h.p. there is some edge in $H$ containing an odd number of vertices, so that $\operatorname{disc}(H)\geqslant 1$. As in the proof of Proposition 3.1, the probability that an edge $e_i$ has an odd number of vertices is $1/2+o(1)$ (see (6)). Therefore, the number $W$ of edges with an odd number of vertices has expectation $(1+o(1))m/2$. Similarly, the probability that two distinct edges (say, $e_1$ and $e_2$) both have an odd number of vertices is $1/4+o(1)$. To see this, first define the bivariate generating function

f(x,y)=(d(d1)m(m1)xy+d(md)m(m1)(x+y)+(md)(md1)m(m1))n.f(x,y)=\left(\frac{d(d-1)}{m(m-1)}xy+\frac{d(m-d)}{m(m-1)}(x+y)+\frac{(m-d)(m-d-1)}{m(m-1)}\right)^{n}.

For 0i,jn0\leqslant i,j\leqslant n, the coefficient

[xiyj]f(x,y)[x^{i}y^{j}]f(x,y)

gives the probability that edge e1e_{1} has ii ones and edge e2e_{2} has jj ones. Hence, the probability that both edges have an odd number of vertices is

f(1,1)f(1,1)f(1,1)+f(1,1)4,\frac{f(1,1)-f(-1,1)-f(1,-1)+f(-1,-1)}{4},

which, after tedious but straightforward calculations using the facts that d/m\leqslant 1-\varepsilon and dn/m\to\infty, is equal to

12(12dm)n+(14d(md)m(m1))n4=1/4+o(1),\frac{1-2\left(1-\frac{2d}{m}\right)^{n}+\left(1-4\frac{d(m-d)}{m(m-1)}\right)^{n}}{4}=1/4+o(1),

and hence 𝔼W2m2/4\mathbb{E}W^{2}\sim m^{2}/4. This implies that 𝐕𝐚𝐫W=o((𝔼W)2)\mathbf{Var}W=o((\mathbb{E}W)^{2}) and, by a standard application of Chebyshev’s inequality, we conclude that w.h.p. Wm/2W\sim m/2\to\infty. This proves the proposition. ∎
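The closed form above can be sanity-checked numerically. The sketch below (our own illustration, not part of the proof; all function names are ours) evaluates the generating function f at (\pm 1,\pm 1) via the simplified expression, and compares it against a Monte Carlo estimate over the edge-dependent model, in which each vertex independently joins d of the m edges uniformly at random.

```python
import random

def both_odd_closed(n, m, d):
    # (f(1,1) - f(-1,1) - f(1,-1) + f(-1,-1)) / 4, using
    # f(1,1) = 1, f(-1,1) = f(1,-1) = (1 - 2d/m)^n,
    # f(-1,-1) = (1 - 4d(m-d)/(m(m-1)))^n.
    return (1 - 2 * (1 - 2 * d / m) ** n
            + (1 - 4 * d * (m - d) / (m * (m - 1))) ** n) / 4

def both_odd_mc(n, m, d, trials=4000, seed=0):
    # Edge-dependent model: each vertex joins d of the m edges uniformly.
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        size1 = size2 = 0
        for _ in range(n):
            chosen = set(rng.sample(range(m), d))
            size1 += 0 in chosen  # vertex lands in edge e_1
            size2 += 1 in chosen  # vertex lands in edge e_2
        hits += (size1 % 2 == 1) and (size2 % 2 == 1)
    return hits / trials
```

For instance, with the small illustrative parameters n=20, m=10, d=3, both quantities are close to 1/4, in line with the 1/4+o(1) estimate.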

Remark 6.

It is conceivable that in the proof of the proposition above one could obtain an upper bound on [W=0]\mathbb{P}[W=0] that is exponentially small in mm, in the same spirit as in the proof of Proposition 3.1. However, this would require some additional work due to the fact that the edges of H(n,m,d)H\sim\mathcal{H}(n,m,d) are not formed independently.

Proposition 3.2 trivially implies Theorem 1.2 in the regime in which 2n/mdn/m=O(1)2^{-n/m}\sqrt{dn/m}=O(1). To prove the remaining cases, we will generalize the ideas we used in the proof of Theorem 1.1. However, the dependencies among the edges make the argument much more delicate.

Proof of Theorem 1.2.

Suppose that H=(V,E) is generated from \mathcal{H}(n,m,d) with m=m(n) and d=d(n) satisfying dn/m\to\infty and d\to\infty (as n\to\infty), and with p=d/m\leqslant c for some constant 0<c<1. For short, we use \mathcal{H} to denote the sample space of the distribution, i.e. the set of all possible outcomes of H. Fix a constant 0<\varepsilon<\min\{1,1/c-1\}, and define

f^=f^(n):={2n/mpn(1ε)(1c(1+ε))=Ω(2n/mdn/m)if m=O(n),pnlogγ(1ε)(1c(1+ε))=Ω(pnlogγ)if mn,\hat{f}=\hat{f}(n):=\begin{cases}2^{-n/m}\sqrt{pn(1-\varepsilon)(1-c(1+\varepsilon))}=\Omega(2^{-n/m}\sqrt{dn/m})&\text{if }m=O(n),\\ \sqrt{pn\log\gamma(1-\varepsilon)(1-c(1+\varepsilon))}=\Omega(\sqrt{pn\log\gamma})&\text{if }m\gg n,\end{cases}

where γ=min{pn,m/n}\gamma=\min\{pn,m/n\}. Let κ>0\kappa>0 be a sufficiently small constant. To prove Theorem 1.2, it suffices to show that w.h.p. disc(H)max{κf^,1}\operatorname{disc}(H)\geqslant\max\{\kappa\hat{f},1\}. Note that this is trivially true when f^1/κ\hat{f}\leqslant 1/\kappa, in view of Proposition 3.2. So we will assume that f^>1/κ\hat{f}>1/\kappa, and show that w.h.p. disc(H)κf^\operatorname{disc}(H)\geqslant\kappa\hat{f}. Note that this assumption ensures that n=O(mlogd)n=O(m\log d), i.e. there are not too many more vertices than edges.

Let ZZ be the number of colourings ψ:V{1,1}\psi:V\to\{-1,1\} with discrepancy disc(ψ)κf^\operatorname{disc}(\psi)\leqslant\kappa\hat{f}. We would like to prove an analogue of (7) in order to bound 𝔼Z\mathbb{E}Z, but unfortunately the random edges e1,,eme_{1},\ldots,e_{m} of HH are no longer i.i.d. In order to overcome this obstacle, we introduce some random variables that will play an essential role in the analysis of ZZ. For each j=1,,mj=1,\ldots,m and k=1,,nk=1,\ldots,n, let Aj,k:=𝟏[vkej]A_{j,k}:=\bm{1}_{[v_{k}\in e_{j}]} be the {0,1}\{0,1\} value of the (j,k)(j,k) entry of the incidence matrix 𝑨\bm{A} of HH, and let 𝑨j=(Aj,1,,Aj,n)\bm{A}_{j}=(A_{j,1},\ldots,A_{j,n}) denote the jj-th row of 𝑨\bm{A}. Moreover, for each k=1,,nk=1,\ldots,n, we define

B0,k:=dandBi,k:=dj=1iAj,kfor i=1,,m.B_{0,k}:=d\qquad\text{and}\qquad B_{i,k}:=d-\sum_{j=1}^{i}A_{j,k}\quad\text{for }i=1,\ldots,m. (8)

In other words, Bi,kB_{i,k} counts the number of ones that appear in the kk-th column of 𝑨\bm{A} below the ii-th row (recall that each column of AA has exactly dd ones). Note that each Bi,kB_{i,k} can be expressed as a function of the first ii rows of 𝑨\bm{A}, and moreover the distribution of Ai+1,kA_{i+1,k} conditional on the outcome of A1,k,,Ai,kA_{1,k},\ldots,A_{i,k} can be described solely in terms of Bi,kB_{i,k}. More formally, for each i=1,,mi=1,\ldots,m, let i\mathcal{F}_{i} be the sigma algebra generated by 𝑨1,,𝑨i\bm{A}_{1},\ldots,\bm{A}_{i}, and let 0:={,}\mathcal{F}_{0}:=\{\emptyset,\mathcal{H}\} denote the trivial sigma algebra. Then, for each i=0,,mi=0,\ldots,m and k=1,,nk=1,\ldots,n, Bi,kB_{i,k} is measurable with respect to i\mathcal{F}_{i}, and (for i<mi<m)

Pi,k:=[Ai+1,k=1i]=Bi,kmi.P_{i,k}:=\mathbb{P}[A_{i+1,k}=1\mid\mathcal{F}_{i}]=\frac{B_{i,k}}{m-i}.

Intuitively speaking, we would like the above conditional probabilities to remain close to p (on average) as we reveal new rows of \bm{A}, at least for a large number of rows. In view of this, for each i=0,\ldots,m-1, we consider the event Q_{i} that for every 0\leqslant j\leqslant i

k=1nPj,k(1ε)pnandPj,k(1+ε)cfor k=1,,n.\sum_{k=1}^{n}P_{j,k}\geqslant(1-\varepsilon)pn\qquad\quad\text{and}\qquad\quad P_{j,k}\leqslant(1+\varepsilon)c\quad\text{for $k=1,\ldots,n$}.

(Here (1+ε)c<1(1+\varepsilon)c<1 from our choice of ε\varepsilon.) Observe that =Q0Qm1\mathcal{H}=Q_{0}\supseteq\cdots\supseteq Q_{m-1} is a decreasing sequence of events, and each QiQ_{i} is i\mathcal{F}_{i}-measurable by construction. Let α:=max{n/(n+m),1/2}\alpha:=\max\{n/(n+m),1/2\}. We need the following technical result, which we prove in Section 3.3.
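To make the quantities B_{i,k} and P_{i,k} concrete, the following sketch (illustrative only; the names are ours) simulates a single column of \bm{A} — d ones placed in uniformly random rows — and returns the trajectory of conditional probabilities P_{i,k}=B_{i,k}/(m-i), which starts at p=d/m and which the events Q_{i} require to stay controlled.

```python
import random

def conditional_prob_trajectory(m, d, seed=0):
    """P_{i,k} = B_{i,k} / (m - i) for i = 0, ..., m-1, for one column k of A."""
    rng = random.Random(seed)
    one_rows = set(rng.sample(range(m), d))  # rows of column k holding a 1
    b = d                                    # B_{0,k} = d ones below row 0
    trajectory = []
    for i in range(m):
        trajectory.append(b / (m - i))       # P_{i,k}, determined by rows 1..i
        b -= i in one_rows                   # reveal row i+1 (0-indexed row i)
    return trajectory

traj = conditional_prob_trajectory(m=1000, d=300, seed=0)
```

Here traj[0] equals p=d/m exactly, and for i up to \alpha m the values typically stay within a small relative error of p, which is what Proposition 3.3 below makes precise.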

Proposition 3.3.

Under the assumptions in the proof of Theorem 1.2, event QαmQ_{\lfloor\alpha m\rfloor} holds w.h.p.

Now let Ψ\Psi be the set of all colourings ψ:V{1,1}\psi:V\to\{-1,1\}, and pick an arbitrary ψΨ\psi\in\Psi. For each i=1,,mi=1,\ldots,m, let RiψR^{\psi}_{i} denote the event that |ψ(ei)|κf^|\psi(e_{i})|\leqslant\kappa\hat{f}, and let Riψ:=j=1iRjψR^{\psi}_{\leqslant i}:=\bigcap_{j=1}^{i}R^{\psi}_{j}. (By convention, R0ψ=R^{\psi}_{\leqslant 0}=\mathcal{H}.) Clearly, RiψR^{\psi}_{i} and RiψR^{\psi}_{\leqslant i} are i\mathcal{F}_{i}-measurable. Note that, conditional upon any outcome of 𝑨1,,𝑨i1\bm{A}_{1},\ldots,\bm{A}_{i-1} satisfying Qi1Q_{i-1}, the random variable ψ(ei)\psi(e_{i}) is distributed as S𝒂,𝒑S_{\bm{a},\bm{p}} in Section 2 (with ak=ψ(vk)a_{k}=\psi(v_{k}) and pk=[vkei]p_{k}=\mathbb{P}[v_{k}\in e_{i}]) and it satisfies the conditions of Lemma 2.4 (with ζ=(1+ε)c<1\zeta=(1+\varepsilon)c<1 and [L,R]=[κf^,κf^][L,R]=[-\kappa\hat{f},\kappa\hat{f}]). Hence, we can use that lemma to bound the conditional probability of RiψR^{\psi}_{i}. We first consider the sparse regime of m=O(n)m=O(n). By Lemma 2.4, assuming κ<1/(3(cuni+2/π))\kappa<1/\left(3(c_{uni}+\sqrt{2/\pi})\right) and since κf^>1\kappa\hat{f}>1,

[Riψi1]𝟏[Qi1]cuni+2κf^/2π(1ζ)(1ε)pnκf^(cuni+2/π)(1ζ)(1ε)pn<2n/m/3.\mathbb{P}[R^{\psi}_{i}\mid\mathcal{F}_{i-1}]\bm{1}_{[Q_{i-1}]}\leqslant\frac{c_{uni}+2\kappa\hat{f}/\sqrt{2\pi}}{\sqrt{(1-\zeta)(1-\varepsilon)pn}}\leqslant\frac{\kappa\hat{f}(c_{uni}+\sqrt{2/\pi})}{\sqrt{(1-\zeta)(1-\varepsilon)pn}}<2^{-n/m}/3. (9)

In particular, since Ri1ψQi1R^{\psi}_{\leqslant i-1}\cap Q_{i-1} is i1\mathcal{F}_{i-1}-measurable and is contained in Qi1Q_{i-1},

[R1ψ]2n/m/3and[RiψRi1ψQi1]2n/m/3for i=2,,m.\mathbb{P}[R^{\psi}_{1}]\leqslant 2^{-n/m}/3\qquad\text{and}\qquad\mathbb{P}[R^{\psi}_{i}\mid R^{\psi}_{\leqslant i-1}\cap Q_{i-1}]\leqslant 2^{-n/m}/3\quad\text{for }i=2,\ldots,m.

Thus, for each i=2,,mi=2,\ldots,m,

[RiψQi1]=[RiψRi1ψQi1][Ri1ψQi1](2n/m/3)[Ri1ψQi2],\mathbb{P}[R^{\psi}_{\leqslant i}\cap Q_{i-1}]=\mathbb{P}[R^{\psi}_{i}\mid R^{\psi}_{\leqslant i-1}\cap Q_{i-1}]\cdot\mathbb{P}[R^{\psi}_{\leqslant i-1}\cap Q_{i-1}]\leqslant\left(2^{-n/m}/3\right)\mathbb{P}[R^{\psi}_{\leqslant i-1}\cap Q_{i-2}],

and inductively

[RiψQi1](2n/m/3)i.\mathbb{P}[R^{\psi}_{\leqslant i}\cap Q_{i-1}]\leqslant\left(2^{-n/m}/3\right)^{i}.

Let t:=αmt:=\lceil\alpha m\rceil. Next, we will bound disc(H)\operatorname{disc}(H) from below based on the discrepancies of the first tt rows of 𝑨\bm{A} when Qt1Q_{t-1} holds. First note that, since tnm/(n+m)t\geqslant nm/(n+m),

[RtψQt1](2n/m/3)t(2n/m/3)nm/(n+m)=2n(2/3)nm/(n+m)=o(2n),\mathbb{P}[R^{\psi}_{\leqslant t}\cap Q_{t-1}]\leqslant\left(2^{-n/m}/3\right)^{t}\leqslant\left(2^{-n/m}/3\right)^{nm/(n+m)}=2^{-n}(2/3)^{nm/(n+m)}=o(2^{-n}), (10)

and then, by applying Markov’s inequality to the random variable Z𝟏Qt1Z\bm{1}_{Q_{t-1}},

[disc(H)κf^ and Qt1]𝔼[Z𝟏Qt1]=ψΨ[RmψQt1]ψΨ[RtψQt1]=o(1).\mathbb{P}[\text{$\operatorname{disc}(H)\leqslant\kappa\hat{f}$ and $Q_{t-1}$}]\leqslant\mathbb{E}[Z\bm{1}_{Q_{t-1}}]=\sum_{\psi\in\Psi}\mathbb{P}[R^{\psi}_{\leqslant m}\cap Q_{t-1}]\leqslant\sum_{\psi\in\Psi}\mathbb{P}[R^{\psi}_{\leqslant t}\cap Q_{t-1}]=o(1). (11)

Before proceeding with the proof, we consider the dense regime of mnm\gg n. In that case, we obtain an analogue of (9) by using the tighter inequality (4) in Lemma 2.4 instead of (5). With ζ=(1+ε)c<1\zeta=(1+\varepsilon)c<1 and assuming that C:=2κ2/π<1/2C:=2\kappa^{2}/\pi<1/2, we get

[Riψi1]𝟏[Qi1]\displaystyle\mathbb{P}[R^{\psi}_{i}\mid\mathcal{F}_{i-1}]\bm{1}_{[Q_{i-1}]} cuni(1ζ)(1ε)pn+(1exp((2κf^)22π(1ζ)(1ε)pn))1/2\displaystyle\leqslant\frac{c_{uni}}{\sqrt{(1-\zeta)(1-\varepsilon)pn}}+\left(1-\exp\left(-\frac{(2\kappa\hat{f})^{2}}{2\pi(1-\zeta)(1-\varepsilon)pn}\right)\right)^{1/2}
=O(1/pn)+(1γ2κ2/π)1/2=1Θ(γC),\displaystyle=O(1/\sqrt{pn})+\left(1-\gamma^{-2\kappa^{2}/\pi}\right)^{1/2}=1-\Theta\left(\gamma^{-C}\right),

where we used the fact that 1/pn=o(γC)1/\sqrt{pn}=o(\gamma^{-C}). Reasoning as before, we obtain the following analogue of (10):

[RtψQt1](1Θ(γC))t(1Θ(γC))m/2=enΘ(γCm/n)=o(2n),\mathbb{P}[R^{\psi}_{\leqslant t}\cap Q_{t-1}]\leqslant\left(1-\Theta\left(\gamma^{-C}\right)\right)^{t}\leqslant\left(1-\Theta\left(\gamma^{-C}\right)\right)^{m/2}=e^{-n\Theta\left(\gamma^{-C}m/n\right)}=o(2^{-n}),

where we used the facts that tm/2t\geqslant m/2 and n/m=o(γC)n/m=o(\gamma^{-C}). As a result, our bound in (11) is also valid when mnm\gg n as well. Then, in either regime (m=O(n)m=O(n) or mnm\gg n),

[disc(H)κf^][disc(H)κf^ and Qt1]+[¬Qt1]=o(1),\mathbb{P}[\operatorname{disc}(H)\leqslant\kappa\hat{f}]\leqslant\mathbb{P}[\text{$\operatorname{disc}(H)\leqslant\kappa\hat{f}$ and $Q_{t-1}$}]+\mathbb{P}[\neg Q_{t-1}]=o(1), (12)

by (11), Proposition 3.3 and the fact Qαm1QαmQ_{\lceil\alpha m\rceil-1}\supseteq Q_{\lfloor\alpha m\rfloor}. This shows that w.h.p. disc(H)κf^\operatorname{disc}(H)\geqslant\kappa\hat{f}, and concludes the proof of Theorem 1.2. ∎

3.3. Proof of Proposition 3.3

In this section, we prove Proposition 3.3. For any mm\in\mathbb{N}, let [m]:={1,2,,m}[m]:=\{1,2,\ldots,m\} and [0]:=[0]:=\emptyset. Suppose that J[m]J\subseteq[m] is a fixed subset of size 0jm0\leqslant j\leqslant m. If S[m]S\subseteq[m] is a random subset of size dd, then the distribution of the random variable |SJ||S\cap J| is said to be hypergeometric with parameters m,dm,d and jj. We denote this distribution by Hyper(m,d,j)\text{Hyper}(m,d,j) in what follows. Now, Hyper(m,d,j)\text{Hyper}(m,d,j) is at least as concentrated about its expectation as the binomial distribution, Bin(d,j/m)\textup{Bin}(d,j/m) (see Chapter 21 in [13] for details). As such, standard Chernoff bounds ensure the following:

Theorem 3.4.

Suppose that XHyper(m,d,j)X\sim\text{Hyper}(m,d,j), and μ:=𝔼[X]=dj/m\mu:=\mathbb{E}[X]=dj/m. In this case, for every 0<λ<10<\lambda<1,

(|Xμ|λμ)2exp(λ2μ3).\mathbb{P}(|X-\mu|\geqslant\lambda\mu)\leqslant 2\exp\left(\frac{-\lambda^{2}\mu}{3}\right).
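As a numerical illustration (our own sketch, not part of the argument), one can realize Hyper(m,d,j) as |S\cap J| for a uniformly random d-subset S of [m] and compare its empirical tail against the Chernoff-type bound of Theorem 3.4:

```python
import math
import random

def hyper_tail_mc(m, d, j, lam, trials=2000, seed=1):
    """Empirical P(|X - mu| >= lam*mu) for X ~ Hyper(m, d, j), mu = d*j/m."""
    rng = random.Random(seed)
    mu = d * j / m
    bad = 0
    for _ in range(trials):
        x = sum(v < j for v in rng.sample(range(m), d))  # |S ∩ J|, J = {0,...,j-1}
        bad += abs(x - mu) >= lam * mu
    return bad / trials

def chernoff_bound(m, d, j, lam):
    """The bound 2*exp(-lam^2 * mu / 3) from Theorem 3.4."""
    mu = d * j / m
    return 2 * math.exp(-lam * lam * mu / 3)
```

For example, with m=200, d=50, j=100 (so \mu=25) and \lambda=1/2, the bound evaluates to roughly 0.25, while the empirical tail probability is far smaller — reflecting that the hypergeometric distribution is at least as concentrated as Bin(d,j/m).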

Let \bm{A} be the incidence matrix of H\sim\mathcal{H}(n,m,d), which has exactly d ones in each column, at random positions. Let p=d/m. Recall that, for each i=0,\ldots,m and k=1,\ldots,n, the random variable B_{i,k} counts the number of ones in the k-th column and below the i-th row of \bm{A} (cf. (8)). Also recall P_{i,k}:=B_{i,k}/(m-i) (for i<m). Clearly, B_{i,k}\sim\text{Hyper}(m,d,m-i), so we may apply Theorem 3.4 to control the value of B_{i,k}, and thus of P_{i,k}. Let \alpha:=\max\{n/(n+m),1/2\} and t:=\lceil\alpha m\rceil. For a fixed column k, our goal is to show that w.h.p. B_{i,k} remains “close” to \mathbb{E}[B_{i,k}]=d(m-i)/m for all i=1,\ldots,t. By combining the error term in Theorem 3.4 with a naïve union bound, we can bound the probability of failure by something of the order of m\exp(-\Theta(d)), which does not tend to 0 unless d=\Omega(\log m). To overcome this, we need a more subtle argument in which we take the union bound over a smaller set of indices i and take into account that B_{i,k} does not change too much between two consecutive values of i. This is made more precise in the following claim:

Proposition 3.5.

Assume 0<α,λ,ξ<10<\alpha,\lambda,\xi<1 with ξ1/m\xi\geqslant 1/m and α+ξ<1\alpha+\xi<1. Fix 1kn1\leqslant k\leqslant n. Then, with probability at least

18ξ1exp(dλ2(1αξ)23),1-8\xi^{-1}\exp\left(\frac{-d\lambda^{2}(1-\alpha-\xi)^{2}}{3}\right),

it holds that, for all i=0,,αmi=0,\ldots,\lfloor\alpha m\rfloor,

(1λ)(1+ξ1αξ)1pPi,k(1+λ)(1+ξ1αξ)p.(1-\lambda)\left(1+\frac{\xi}{1-\alpha-\xi}\right)^{-1}p\leqslant P_{i,k}\leqslant(1+\lambda)\left(1+\frac{\xi}{1-\alpha-\xi}\right)p. (13)
Proof.

As the columns of 𝑨\bm{A} are identically distributed, we may assume that k=1k=1 in what follows. We thus drop the index kk from the notation of Ai,k,Bi,k,Pi,kA_{i,k},B_{i,k},P_{i,k} for simplicity. Recall BiHyper(m,d,mi)B_{i}\sim\text{Hyper}(m,d,m-i) with 𝔼Bi=p(mi)\mathbb{E}B_{i}=p(m-i) for each i=0,,mi=0,\ldots,m.

Let r:=(m1)/ξmr:=\lceil(m-1)/\lfloor\xi m\rfloor\rceil, which satisfies 1rm11\leqslant r\leqslant m-1 by assumption. Our first goal is to partition the set [m1][m-1] into rr intervals, each of size at most ξm\xi m. For each q=0,,r1q=0,\ldots,r-1 let Iq:=[qξm]I_{q}:=[q\lfloor\xi m\rfloor], and let Ir:=[m1]I_{r}:=[m-1]. Then, setting I~q:=IqIq1\widetilde{I}_{q}:=I_{q}\setminus I_{q-1} for q=1,,rq=1,\ldots,r, gives us the desired partition I~1,,I~r\widetilde{I}_{1},\ldots,\widetilde{I}_{r} of [m1][m-1]. Now, let r0:=αm/ξmr_{0}:=\lceil\lfloor\alpha m\rfloor/\lfloor\xi m\rfloor\rceil. Since r0ξmαmr_{0}\lfloor\xi m\rfloor\geqslant\lfloor\alpha m\rfloor, the set [αm][\lfloor\alpha m\rfloor] is contained in q=1r0I~q\bigcup_{q=1}^{r_{0}}\widetilde{I}_{q}. Clearly, r0rr_{0}\leqslant r since αmm1\lfloor\alpha m\rfloor\leqslant m-1. If r0=rr_{0}=r, then (m1)αm<ξm(m-1)-\lfloor\alpha m\rfloor<\lfloor\xi m\rfloor, which implies αm+ξmm\lfloor\alpha m\rfloor+\lfloor\xi m\rfloor\geqslant m by integrality. This contradicts the fact that αm+ξm(α+ξ)m<m\lfloor\alpha m\rfloor+\lfloor\xi m\rfloor\leqslant(\alpha+\xi)m<m. As a result, r0r1r_{0}\leqslant r-1.

For each 0\leqslant q\leqslant r-1, define Y_{q}:=B_{q\lfloor\xi m\rfloor} and let Y_{r}:=B_{m-1}. In other words, each Y_{q} counts the number of ones in the column under consideration below all the rows indexed by I_{q}. We will prove that the variables Y_{0},\ldots,Y_{r_{0}} are concentrated around their mean, and from that derive a concentration result for B_{0},\ldots,B_{\lfloor\alpha m\rfloor}. Observe that Y_{r} must be defined slightly differently, due to divisibility issues. Fortunately, our argument only requires the analysis of Y_{0},\ldots,Y_{r_{0}}, where r_{0}\leqslant r-1, so this causes no trouble. For q=0,\ldots,r-1,

𝔼Yq=p(m|Iq|)=p(mqξm),\mathbb{E}Y_{q}=p(m-|I_{q}|)=p(m-q\lfloor\xi m\rfloor),

and therefore, for every q=1,,r0q=1,\ldots,r_{0},

𝔼Yq1𝔼Yq=1+ξmmqξm1+ξmmαmξm1+ξ1αξ,\frac{\mathbb{E}Y_{q-1}}{\mathbb{E}Y_{q}}=1+\frac{\lfloor\xi m\rfloor}{m-q\lfloor\xi m\rfloor}\leqslant 1+\frac{\lfloor\xi m\rfloor}{m-\lfloor\alpha m\rfloor-\lfloor\xi m\rfloor}\leqslant 1+\frac{\xi}{1-\alpha-\xi}, (14)

where we also used the fact that r0ξmαm+ξmr_{0}\lfloor\xi m\rfloor\leqslant\lfloor\alpha m\rfloor+\lfloor\xi m\rfloor. Now, let EE be the event that

|Yq𝔼Yq|λ𝔼Yqfor all q=0,,r0.|Y_{q}-\mathbb{E}Y_{q}|\leqslant\lambda\mathbb{E}Y_{q}\qquad\text{for all }q=0,\ldots,r_{0}.

A direct application of Theorem 3.4 yields

(¬E)q=0r02exp(λ2p2(mqξm)23d)2(r0+1)exp(λ2p2(mr0ξm)23d).\mathbb{P}(\neg E)\leqslant\sum_{q=0}^{r_{0}}2\exp\left(\frac{-\lambda^{2}p^{2}(m-q\lfloor\xi m\rfloor)^{2}}{3d}\right)\leqslant 2(r_{0}+1)\exp\left(\frac{-\lambda^{2}p^{2}(m-r_{0}\lfloor\xi m\rfloor)^{2}}{3d}\right).

Using the fact that r0ξm(α+ξ)mr_{0}\lfloor\xi m\rfloor\leqslant(\alpha+\xi)m and the rough bound r0+1αmξm/2+24ξr_{0}+1\leqslant\frac{\alpha m}{\xi m/2}+2\leqslant\frac{4}{\xi}, we conclude that

(¬E)(8/ξ)exp(dλ2(1αξ)23).\mathbb{P}(\neg E)\leqslant(8/\xi)\exp\left(\frac{-d\lambda^{2}(1-\alpha-\xi)^{2}}{3}\right). (15)

Finally, we turn our attention to B1,,BαmB_{1},\ldots,B_{\lfloor\alpha m\rfloor}. For each i[αm]i\in[\lfloor\alpha m\rfloor], we pick q[r0]q\in[r_{0}] such that iI~qi\in\widetilde{I}_{q}. Then, by monotonicity,

YqBiYq1and𝔼Yq𝔼Bi𝔼Yq1.Y_{q}\leqslant B_{i}\leqslant Y_{q-1}\qquad\text{and}\qquad\mathbb{E}Y_{q}\leqslant\mathbb{E}B_{i}\leqslant\mathbb{E}Y_{q-1}.

Combining this with (14) yields

𝔼Yq1(1+ξ1αξ)𝔼Biand𝔼Yq(1+ξ1αξ)1𝔼Bi.\mathbb{E}Y_{q-1}\leqslant\left(1+\frac{\xi}{1-\alpha-\xi}\right)\mathbb{E}B_{i}\qquad\text{and}\qquad\mathbb{E}Y_{q}\geqslant\left(1+\frac{\xi}{1-\alpha-\xi}\right)^{-1}\mathbb{E}B_{i}.

As a result, event EE implies that for every 1iαm1\leqslant i\leqslant\lfloor\alpha m\rfloor,

(1λ)𝔼YqYqBiYq1(1+λ)𝔼Yq1,(1-\lambda)\mathbb{E}Y_{q}\leqslant Y_{q}\leqslant B_{i}\leqslant Y_{q-1}\leqslant(1+\lambda)\mathbb{E}Y_{q-1},

and hence

(1λ)(1+ξ1αξ)1𝔼BiBi(1+λ)(1+ξ1αξ)𝔼Bi.(1-\lambda)\left(1+\frac{\xi}{1-\alpha-\xi}\right)^{-1}\mathbb{E}B_{i}\leqslant B_{i}\leqslant(1+\lambda)\left(1+\frac{\xi}{1-\alpha-\xi}\right)\mathbb{E}B_{i}. (16)

(Note that the equation above is also valid for i=0i=0, since B0=d=𝔼B0B_{0}=d=\mathbb{E}B_{0}.) Dividing (16) by mim-i, we conclude that event EE implies that, for every 0iαm0\leqslant i\leqslant\lfloor\alpha m\rfloor,

(1λ)(1+ξ1αξ)1pPi(1+λ)(1+ξ1αξ)p.(1-\lambda)\left(1+\frac{\xi}{1-\alpha-\xi}\right)^{-1}p\leqslant P_{i}\leqslant(1+\lambda)\left(1+\frac{\xi}{1-\alpha-\xi}\right)p.

Our bound on (¬E)\mathbb{P}(\neg E) in (15) completes the proof of the proposition. ∎

Corollary 3.6.

Suppose mdm\geqslant d\to\infty and n=O(mlogd)n=O(m\log d) as nn\to\infty. Set p=d/mp=d/m and α=max{n/(n+m),1/2}\alpha=\max\{n/(n+m),1/2\}. Given any fixed constant 0<ε<10<\varepsilon<1 and any 1kn1\leqslant k\leqslant n, the following holds with probability at least 1exp(Ω(d/log3d))1-\exp\left(-\Omega(d/\log^{3}d)\right). For every i=0,,αmi=0,\ldots,\lfloor\alpha m\rfloor,

|Pi,kp|εp.|P_{i,k}-p|\leqslant\varepsilon p. (17)
Proof.

Since the probability bound in the statement is asymptotic as nn\to\infty, we will implicitly assume throughout the proof that nn is sufficiently large for all the inequalities therein to be valid. First, define λ:=ξ:=1/log3/2d\lambda:=\xi:=1/\log^{3/2}d. Clearly, ξ1/d1/m\xi\geqslant 1/d\geqslant 1/m. Observe that, since n=O(mlogd)n=O(m\log d), we have

1α=min{m/(n+m),1/2}=Ω(1/logd).1-\alpha=\min\{m/(n+m),1/2\}=\Omega(1/\log d). (18)

In particular α+ξ<1\alpha+\xi<1, and thus all the assumptions in Proposition 3.5 are satisfied. Moreover,

ξ1αξ=O(1/log3/2d1/logd)=O(log1/2d).\frac{\xi}{1-\alpha-\xi}=O\left(\frac{1/\log^{3/2}d}{1/\log d}\right)=O(\log^{-1/2}d).

As a result, we can relax the inequalities in (13) to

Pi,k=(1+O(log3/2d))(1+O(log1/2d))p=(1+o(1))p,P_{i,k}=\left(1+O(\log^{-3/2}d)\right)\left(1+O(\log^{-1/2}d)\right)p=(1+o(1))p,

which implies that |Pi,kp|εp|P_{i,k}-p|\leqslant\varepsilon p (eventually for nn sufficiently large). In view of Proposition 3.5, this fails for some i=0,,αmi=0,\ldots,\lfloor\alpha m\rfloor with probability at most

8ξ1exp(dλ2(1αξ)23)8(log3/2d)exp(d(1/2log3/2d)23log3d)8\xi^{-1}\exp\left(\frac{-d\lambda^{2}(1-\alpha-\xi)^{2}}{3}\right)\leqslant 8(\log^{3/2}d)\exp\left(\frac{-d(1/2-\log^{-3/2}d)^{2}}{3\log^{3}d}\right)
=exp(Ω(d/log3d)).=\exp\left(-\Omega(d/\log^{3}d)\right).

This finishes the proof of the corollary. ∎

Now we are ready to prove Proposition 3.3, which we restate below in a more explicit form for convenience.

Proposition 3.7.

Let 0<ε,c<10<\varepsilon,c<1 be fixed constants with (1+ε)c<1(1+\varepsilon)c<1. Assume that dd\to\infty, dn/mdn/m\to\infty and n=O(mlogd)n=O(m\log d) as nn\to\infty, and suppose that p=d/mcp=d/m\leqslant c. Let α=max{n/(n+m),1/2}\alpha=\max\{n/(n+m),1/2\}. Then w.h.p., for every i=0,,αmi=0,\ldots,\lfloor\alpha m\rfloor,

k=1nPi,k(1ε)pn,\sum_{k=1}^{n}P_{i,k}\geqslant(1-\varepsilon)pn, (19)

and

Pi,k(1+ε)cfor k=1,,n.P_{i,k}\leqslant(1+\varepsilon)c\quad\text{for }k=1,\ldots,n. (20)
Proof.

For k=1,,nk=1,\ldots,n, we say that column kk of 𝑨\bm{A} is controllable if, for every i=0,,αmi=0,\ldots,\lfloor\alpha m\rfloor, it holds that

|Pi,kp|ε0p,|P_{i,k}-p|\leqslant\varepsilon_{0}p,

where ε0:=ε/2\varepsilon_{0}:=\varepsilon/2. Let U[n]U\subseteq[n] denote the set of indices of the uncontrollable columns. Then, by Corollary 3.6 (with ε\varepsilon replaced by ε0\varepsilon_{0}),

𝔼|U|nexp(Ω(d/log3d))=o(n).\mathbb{E}|U|\leqslant n\exp\left(-\Omega(d/\log^{3}d)\right)=o(n).

Hence, we can apply Markov’s inequality to ensure that |U|/n\leqslant\varepsilon_{0} w.h.p. On the other hand, by applying the lower bound P_{i,k}\geqslant(1-\varepsilon_{0})p for each controllable column k\in[n] (and the trivial bound P_{i,k}\geqslant 0 for the uncontrollable ones),

k=1nPi,k(n|U|)(1ε0)p(1ε0)2pn(12ε0)pn,\displaystyle\sum_{k=1}^{n}P_{i,k}\geqslant(n-|U|)(1-\varepsilon_{0})p\geqslant(1-\varepsilon_{0})^{2}pn\geqslant(1-2\varepsilon_{0})pn,

w.h.p., thereby proving (19) (as 2ε0=ε2\varepsilon_{0}=\varepsilon).

In order to verify that (20) holds, we first consider the regime in which dlog2nd\leqslant\log^{2}n. Observe then that deterministically

Pi,kdmid(1α)m,P_{i,k}\leqslant\frac{d}{m-i}\leqslant\frac{d}{(1-\alpha)m},

for each i=1,,αmi=1,\ldots,\lfloor\alpha m\rfloor and k=1,,nk=1,\ldots,n. In particular, since 1α=Ω(log1d)1-\alpha=\Omega(\log^{-1}d) (in view of (18)) and n=O(mlogd)n=O(m\log d), it holds that

Pi,k=O(d(log2d)/n)=o(1).P_{i,k}=O(d(\log^{2}d)/n)=o(1).

Thus, (20) holds in this regime, as c(1+ε)>0c(1+\varepsilon)>0 is a fixed constant. On the other hand, if dlog2nd\geqslant\log^{2}n, we can apply Corollary 3.6 again, which ensures that with probability at least

1nexp(Ω(d/log3d))=1o(1)1-n\exp\left(-\Omega(d/\log^{3}d)\right)=1-o(1)

we have

Pi,k(1+ε)p(1+ε)cP_{i,k}\leqslant(1+\varepsilon)p\leqslant(1+\varepsilon)c

for all i=0,,αmi=0,\ldots,\lfloor\alpha m\rfloor and k=1,,nk=1,\ldots,n. The proof is therefore complete. ∎

4. Upper Bounding Discrepancy—Proof of Theorem 1.3

The main tool we make use of is the algorithmic partial colouring lemma [17], as done in [4, 19, 20]. For convenience, we restate this lemma in the relevant hypergraph terminology, where we define a fractional colouring to be a relaxation of a (hypergraph) colouring, whose values are allowed to lie in the interval [1,1][-1,1]:

Lemma 4.1 (Partial Colouring Lemma [17]).

Suppose that H=(V,E)H=(V,E) is a hypergraph with mm edges and nn vertices which are coloured by some fractional colouring ρ:V[1,1]\rho:V\rightarrow[-1,1]. Moreover, assume that δ>0\delta>0 and (λe)eE(\lambda_{e})_{e\in E} are non-negative values such that

eEexp(λe2/16)n16.\sum_{e\in E}\exp(-\lambda_{e}^{2}/16)\leqslant\frac{n}{16}. (21)

Under these assumptions, there exists some fractional colouring ψ:V[1,1]\psi:V\rightarrow[-1,1] for which

  1. |\psi(e)-\rho(e)|\leqslant\lambda_{e}|e|^{1/2} for all e\in E, and

  2. |\psi(v)|\geqslant 1-\delta for at least n/2 vertices of V.

Moreover, ψ\psi can be computed by a randomized algorithm in expected time

O((n+m)3δ2log(mn/δ)).O((n+m)^{3}\delta^{-2}\log(mn/\delta)).
Remark 7.

As previously mentioned, Levy, Ramadas, and Rothvoss [16] showed that the algorithmic guarantee of Lemma 4.1 can be achieved using a deterministic algorithm (albeit with a worse run-time).

In what follows, it will be convenient to refer to ρ\rho as the target (fractional) colouring and δ\delta as the rounding parameter. We make use of Lemma 4.1 in the same way as in [4, 17, 19, 20]. In fact, we analyze the same two-phase algorithm considered by both Bansal and Meka [4] and Potukuchi [19, 20], though we must tune the relevant asymptotics carefully in order to achieve the bound claimed in Theorem 1.3. In particular, we shorten phase one and change the target discrepancy in each application of Lemma 4.1. These modifications allow us to derive more precise asymptotics in the studied range of parameters.

Let us suppose that H=(V,E) is a hypergraph drawn from \mathcal{H}(n,m,d) or \mathbb{H}(n,m,p), where p=d/m. From now on, we assume that n is a power of 2 for convenience. This is w.l.o.g., as we can always add extra vertices to V which do not lie in any of the edges of H. Given the lower bounds of Theorems 1.1 and 1.2, ideally we would like to compute an output colouring \phi:V\rightarrow\{-1,1\} with matching discrepancy. However, without finer proof techniques, this does not seem fully attainable for the full parameter range of m\gg n. Let us fix \mu:=dn/m. We recall the definition of \beta=\beta(n) as given in the statement of Theorem 1.3: for all n sufficiently large, \beta(n)\geqslant 1 and

β(n)μlog(m/n)(logμ+2)5.\beta(n)\mu\geqslant\log(m/n)\left(\log\mu+2\right)^{5}. (22)

Let \hat{f}:=\sqrt{\mu\log(m/n)\beta} be the target upper bound on the discrepancy that we are aiming to prove. Our argument is based on analyzing the Iterated-Colouring-Algorithm, which we now describe. Fix t_{1}:=\lg\mu, and let H_{0}:=H and \delta:=1/n. For each 0\leqslant i\leqslant t_{1}, we refer to the procedure below as round i, and we define \hat{f}_{i}:=\hat{f}(i+2)^{-2} as the desired discrepancy bound to be attained in round i.

  1. Remove all edges of H_{i}=(V_{i},E_{i}) of size less than or equal to \hat{f}.

  2. Update \lambda_{e} to be \hat{f}_{i}/|e|^{1/2} for each e\in E_{i}. Update the target colouring, which we denote by \rho_{i}, to be the previously computed colouring \psi_{i-1}:V_{i-1}\rightarrow[-1,1] restricted to V_{i} (here \psi_{-1} is the identically zero function by convention).

  3. If \sum_{e\in E_{i}}\exp(-\lambda_{e}^{2}/16)\leqslant|V_{i}|/16, then apply Lemma 4.1 to H_{i} with the above values, yielding the fractional colouring \psi_{i}:V_{i}\rightarrow[-1,1]. Otherwise, abort the round and declare an error.

  4. Assuming an error did not occur, compute S_{i}\subseteq V_{i} such that |\psi_{i}(v)|\geqslant 1-1/n for all v\in S_{i}, where |S_{i}|=|V_{i}|/2. Afterwards, construct H_{i+1}=(V_{i+1},E_{i+1}) by restricting the edges of E_{i} to V_{i+1}, where V_{i+1}:=V_{i}\setminus S_{i}.
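The round structure above can be summarized in the following sketch. Implementing the Lovett–Meka step (Lemma 4.1) is beyond a few lines of code, so it is replaced here by a random stand-in that simply freezes half of the active vertices at \pm 1; only the bookkeeping — the per-round targets \hat{f}_{i}=\hat{f}/(i+2)^{2} and the halving of the active set — reflects the actual algorithm, and all names are ours.

```python
import math
import random

def phase_one_skeleton(n, mu, f_hat, seed=0):
    """Bookkeeping of phase one; the partial-colouring step is a random stand-in."""
    rng = random.Random(seed)
    t1 = int(math.log2(mu))                   # t1 = lg(mu); rounds 0, ..., t1
    active = list(range(n))                   # V_0 = V; n assumed a power of two
    phi = [0.0] * n
    targets = []
    for i in range(t1 + 1):
        targets.append(f_hat / (i + 2) ** 2)  # round-i discrepancy target f̂_i
        rng.shuffle(active)
        half = len(active) // 2
        for v in active[:half]:               # stand-in for S_i: freeze to ±1
            phi[v] = rng.choice([-1.0, 1.0])
        active = active[half:]                # V_{i+1} = V_i \ S_i
    return targets, active, phi
```

With n=1024 and \mu=16, the stand-in runs five rounds (i=0,\ldots,4) and halves the active set each time, leaving 1024/2^{5}=32 active vertices; the genuine algorithm instead uses Lemma 4.1 to choose S_{i} and to control |\psi_{i}(e)-\rho_{i}(e)|.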

We refer to the rounds i=0,,t1i=0,\ldots,t_{1} as phase one of the algorithm’s execution. Note that we refer to the vertices of SiS_{i} as inactive after round ii, as the value ϕ\phi assigns to them will not change at any point onwards. We refer to the remaining vertices as being active. Observe then the following proposition:

Proposition 4.2.

For each 0it10\leqslant i\leqslant t_{1} and eEie\in E_{i}, it holds that

|ψi(e)ρi(e)|f^(i+2)2.|\psi_{i}(e)-\rho_{i}(e)|\leqslant\frac{\hat{f}}{(i+2)^{2}}.

Assuming that none of the rounds yielded an error, there are exactly 2t1n=nμ1=m/d2^{-t_{1}}n=n\mu^{-1}=m/d active vertices at the end of phase one, as t1=lgμt_{1}=\lg\mu. Heuristically, this means that we expect Ht1+1H_{t_{1}+1} to have edges of roughly constant size. As such, we can easily complete the colouring ϕ\phi by executing the phase two procedure for rounds t1+1it2t_{1}+1\leqslant i\leqslant t_{2}, where t2:=lg(10n/f^)+1t_{2}:=\lg(10n/\hat{f})+1. The phase two procedure is identical to that of phase one, with the exception that in step 2, we update λe\lambda_{e} to be 0 (rather than f^i/|e|1/2\hat{f}_{i}/|e|^{1/2}) for each eEie\in E_{i}. Analogously, we observe the following.

Proposition 4.3.

For each t1+1it2t_{1}+1\leqslant i\leqslant t_{2} and eEie\in E_{i}, it holds that

|ψi(e)ρi(e)|=0.|\psi_{i}(e)-\rho_{i}(e)|=0.

Assuming that none of the rounds yielded an error, there are exactly 2^{-t_{2}}n=n\hat{f}/(20n)=\hat{f}/20 active vertices at the end of phase two. In order to complete the construction of \phi, we conclude with a post-processing phase. That is, we arbitrarily assign -1 or 1 to any of the vertices which remain active at the end of phase two. Finally, we round each remaining fractional value assigned by \phi to the nearest value in \{-1,1\}.

Let us assume that the above procedure succeeds in its execution on HH; that is, it does not abort during any iteration in either phase one or two. In this case, we conclude the proof by showing the next lemma.

Lemma 4.4.

If neither phase one nor phase two fails then Theorem 1.3 holds.

Proof.

For each 0\leqslant i\leqslant t_{2}, let us formally extend \psi_{i} to all of V. That is, define \psi_{i}(v):=0 for each v\in V\setminus V_{i}, and keep \psi_{i} unchanged on V_{i}. Moreover, do the same for the target colouring \rho_{i}. Observe then that once phase two ends, \phi can be expressed as a sum of differences involving the partial colourings (\psi_{i})_{i=0}^{t_{2}} and (\rho_{i})_{i=0}^{t_{2}}. Specifically,

ϕ(v)=i=0t2(ψi(v)ρi(v)).\phi(v)=\sum_{i=0}^{t_{2}}(\psi_{i}(v)-\rho_{i}(v)).

Let t_{e} be the first round in which the size of edge e becomes smaller than \hat{f}, setting t_{e}:=t_{2} if this never happens. Applying Propositions 4.2 and 4.3, we get that

|ϕ(e)|\displaystyle|\phi(e)| f^+i=0te|ψi(e)ρi(e)|\displaystyle\leqslant\hat{f}+\sum_{i=0}^{t_{e}}|\psi_{i}(e)-\rho_{i}(e)|
f^+i=0t1|ψi(e)ρi(e)|+i=t1+1t2|ψi(e)ρi(e)|\displaystyle\leqslant\hat{f}+\sum_{i=0}^{t_{1}}|\psi_{i}(e)-\rho_{i}(e)|+\sum_{i=t_{1}+1}^{t_{2}}|\psi_{i}(e)-\rho_{i}(e)|
f^+i=0f^(i+2)2=O(f^).\displaystyle\leqslant\hat{f}+\sum_{i=0}^{\infty}\frac{\hat{f}}{(i+2)^{2}}=O(\hat{f}).

The post-processing phase increases the discrepancy that \phi attains on any edge of E by at most \hat{f}/20\leqslant\hat{f}, since, as we already observed, at most \hat{f}/20 vertices remain active at the end of phase two. Finally, rounding the inactive vertices increases the discrepancy by at most n\delta=1. ∎

Bounding the Failure Probability

First, we recall the following lemma proven in [4] by Bansal and Meka:

Lemma 4.5 (Lemma 6 in [4]).

Suppose that H is generated from \mathcal{H}(n,m,d) and \bm{M} is a fixed r\times\ell sub-matrix of the m\times n incidence matrix \bm{A} of H. If s\geqslant 10d\ell/m, and B(r,\ell,s) corresponds to the event in which each row of \bm{M} has at least s 1’s, then

[B(r,,s)]exp(rslog((sm)/(d))2).\mathbb{P}[B(r,\ell,s)]\leqslant\exp\left(-\frac{rs\log((sm)/(d\ell))}{2}\right).

While this lemma is stated for the case when H is generated from \mathcal{H}(n,m,d), the upper bound on \mathbb{P}[B(r,\ell,s)] is proven by instead bounding the probability of the analogous event when H is generated from \mathbb{H}(n,m,p) for p=d/m. As such, the lemma extends to the edge-independent model. We now restate it in a form which will be more convenient for our purposes.

Lemma 4.6.

Suppose that HH is generated from (n,m,d)\mathcal{H}(n,m,d) or (n,m,p)\mathbb{H}(n,m,p) for p=d/mp=d/m, whose incidence matrix we denote by 𝐀\bm{A}. If s10d/ms\geqslant 10d\ell/m, then define Q(r,,s)Q(r,\ell,s) as the event in which there exists an r×r\times\ell sub-matrix of 𝐀\bm{A} in which each row has at least ss 1’s. In this case,

[Q(r,,s)](mr)(n)exp(rslog((sm)/(d))2).\mathbb{P}[Q(r,\ell,s)]\leqslant\binom{m}{r}\binom{n}{\ell}\exp\left(-\frac{rs\log((sm)/(d\ell))}{2}\right).
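To get a feel for this bound, the following sketch (not from the paper; the parameter values are hypothetical and chosen so that the hypothesis s10d/ms\geqslant 10d\ell/m holds) evaluates the logarithm of the right-hand side, computing the binomial factors via log-gamma:

```python
from math import lgamma, log

def log_binom(a, b):
    # log of the binomial coefficient C(a, b), via log-gamma
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

# Hypothetical parameters (ours, for illustration), chosen so that
# the lemma's hypothesis s >= 10 * d * l / m holds.
m, n, d = 10**6, 10**4, 10**3
r, l, s = 100, 100, 50
assert s >= 10 * d * l / m

# Logarithm of the right-hand side of the bound on P[Q(r, l, s)].
log_p = log_binom(m, r) + log_binom(n, l) - r * s * log(s * m / (d * l)) / 2
print(log_p)  # deeply negative, so the union bound is far below 1
```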

Using this lemma, we show that w.h.p. Iterated-Colouring-Algorithm does not abort during phase one (Proposition 4.7) or phase two (Proposition 4.8), thus concluding the proof of Theorem 1.3.

Proposition 4.7.

If Iterated-Colouring-Algorithm inputs a hypergraph drawn from (n,m,d)\mathcal{H}(n,m,d) or (n,m,p)\mathbb{H}(n,m,p) where p=d/mp=d/m, then w.h.p. it does not abort during phase one, provided we assume that μ=dn/m\mu=dn/m\rightarrow\infty and mnm\gg n.

Proof.

Given 0it10\leqslant i\leqslant t_{1}, we say that round ii is good if there are at most ni/17n_{i}/17 rows of HiH_{i} whose size is greater than si:=βμ/(16(i+2)5)s_{i}:=\beta\mu/(16(i+2)^{5}), where β\beta satisfies (22). Otherwise, we say that the round is bad. Recall that t1=lgμt_{1}=\lg{\mu}.

Now, if round ii is good, then we claim that Iterated-Colouring-Algorithm does not abort in round ii. To see this, it suffices to show that for nn sufficiently large

eEiexp(λe2/16)ni/16,\sum_{e\in E_{i}}\exp\left(-\lambda^{2}_{e}/16\right)\leqslant n_{i}/16,

where ni:=|Vi|=n/2in_{i}:=|V_{i}|=n/2^{i}, f^i:=f^(i+2)2\hat{f}_{i}:=\hat{f}(i+2)^{-2} and λe:=f^i/|e|1/2\lambda_{e}:=\hat{f}_{i}/|e|^{1/2} for eEie\in E_{i}. Observe now that since the round is good, we get that

eEiexp(λe2/16)\displaystyle\sum_{e\in E_{i}}\exp\left(-\lambda^{2}_{e}/16\right) =eEi:|e|siexp(λe2/16)+eEi:|e|>siexp(λe2/16)\displaystyle=\sum_{\begin{subarray}{c}e\in E_{i}:\\ |e|\leqslant s_{i}\end{subarray}}\exp\left(-\lambda^{2}_{e}/16\right)+\sum_{\begin{subarray}{c}e\in E_{i}:\\ |e|>s_{i}\end{subarray}}\exp\left(-\lambda^{2}_{e}/16\right)
mexp(f^i216si)+ni/17.\displaystyle\leqslant m\exp\left(-\frac{\hat{f}_{i}^{2}}{16s_{i}}\right)+n_{i}/17.

On the other hand, since f^:=μlog(m/n)β\hat{f}:=\sqrt{\mu\log(m/n)\beta},

mexp(f^i2/16si)\displaystyle m\exp\left(-\hat{f}_{i}^{2}/16s_{i}\right) =mexp(log(m/n)(i+2))\displaystyle=m\exp(-\log(m/n)(i+2))
=m(nm)i+2\displaystyle=m\left(\frac{n}{m}\right)^{i+2}
=n(nm)i+1=o(ni)\displaystyle=n\left(\frac{n}{m}\right)^{i+1}=o(n_{i})

where the last line follows since (n/m)i+12i(n/m)^{i+1}\ll 2^{-i}, as nmn\ll m. Thus,

eEiexp(λe2/16)(1+o(1))ni17ni16.\sum_{e\in E_{i}}\exp\left(-\lambda_{e}^{2}/16\right)\leqslant(1+o(1))\frac{n_{i}}{17}\leqslant\frac{n_{i}}{16}.
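The estimate n(n/m)i+1=o(ni)n(n/m)^{i+1}=o(n_{i}) can be seen concretely with hypothetical values of nmn\ll m (an illustration only, not part of the argument):

```python
# Illustration with hypothetical parameters n << m: the quantity
# n * (n/m)^(i+1) is far smaller than n_i = n / 2^i, reflecting
# the fact that (n/m)^(i+1) << 2^(-i) when n << m.
n, m = 10**4, 10**7
for i in range(6):
    n_i = n / 2 ** i
    bound = n * (n / m) ** (i + 1)
    assert bound < n_i / 100  # negligible compared with n_i
    print(i, bound, n_i)
```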

We now must show that w.h.p. all of the rounds are good. Observe that if some round 1it11\leqslant i\leqslant t_{1} is bad, then there exists an (ni/17)×ni(n_{i}/17)\times n_{i} sub-matrix of 𝑨\bm{A}, say 𝑴\bm{M}, in which each row of 𝑴\bm{M} has more than sis_{i} 1’s. In fact, since nt1nin_{t_{1}}\leqslant n_{i} and st1sis_{t_{1}}\leqslant s_{i}, we can take a sub-matrix of 𝑴\bm{M} (and thus of 𝑨\bm{A}) of size (nt1/17)×nt1(n_{t_{1}}/17)\times n_{t_{1}} in which each row has more than st1s_{t_{1}} 1’s. Thus, we observe the following claim:

  1. (1)

    If a bad round occurs, then there exists a (nt1/17)×nt1(n_{t_{1}}/17)\times n_{t_{1}} sub-matrix of 𝑨\bm{A} in which each row has more than st1s_{t_{1}} 1’s.

Let us define Q(nt1/17,nt1,st1)Q(n_{t_{1}}/17,n_{t_{1}},s_{t_{1}}) as this latter event; namely, that there exists an (nt1/17)×nt1(n_{t_{1}}/17)\times n_{t_{1}} sub-matrix of 𝑨\bm{A} in which each row has more than st1s_{t_{1}} 1’s. In order to complete the proof, it suffices to show that w.h.p. Q(nt1/17,nt1,st1)Q(n_{t_{1}}/17,n_{t_{1}},s_{t_{1}}) does not occur. Recall that nt1=n/μ=m/dn_{t_{1}}=n/\mu=m/d, as the number of active vertices drops by exactly half in each round. It follows that

st1=βμ16(t1+2)5logmn(logμ+2)516(lgμ+2)55=10dn2μm10dnt1m.s_{t_{1}}=\frac{\beta\mu}{16(t_{1}+2)^{5}}\geqslant\frac{\log{\frac{m}{n}}(\log{\mu}+2)^{5}}{16(\lg{\mu}+2)^{5}}\gg 5=10\frac{dn}{2\mu m}\geqslant 10\frac{dn_{t_{1}}}{m}.

Thus, we can apply Lemma 4.6 to ensure that

[Q(nt1/17,nt1,st1)]\displaystyle\mathbb{P}[Q(n_{t_{1}}/17,n_{t_{1}},s_{t_{1}})] (mnt1/17)(nnt1)exp(nt1st134log(st1mdnt1))\displaystyle\leqslant\binom{m}{n_{t_{1}}/17}\binom{n}{n_{t_{1}}}\exp\left(-\frac{n_{t_{1}}s_{t_{1}}}{34}\log\left(\frac{s_{t_{1}}m}{dn_{t_{1}}}\right)\right)
(mnt1)2exp(nt1st134logst1)\displaystyle\leqslant\binom{m}{n_{t_{1}}}^{2}\exp\left(-\frac{n_{t_{1}}s_{t_{1}}}{34}\log{s_{t_{1}}}\right)
(ment1)2nt1exp(nt1st1logst134)\displaystyle\leqslant\left(\frac{me}{n_{t_{1}}}\right)^{2n_{t_{1}}}\exp\left(-\frac{n_{t_{1}}s_{t_{1}}\log{s_{t_{1}}}}{34}\right)

where the inequalities follow since mnm\gg n and (mnt1)(me/nt1)nt1\binom{m}{n_{t_{1}}}\leqslant(me/n_{t_{1}})^{n_{t_{1}}}. Now,

(ment1)2nt1\displaystyle\left(\frac{me}{n_{t_{1}}}\right)^{2n_{t_{1}}} =exp(2nt1(log(m/n)+log(ne/nt1)))\displaystyle=\exp\left(2n_{t_{1}}(\log(m/n)+\log(ne/n_{t_{1}}))\right)
=exp(2nt1(log(m/n)+log(eμ))),\displaystyle=\exp\left(2n_{t_{1}}(\log(m/n)+\log(e\mu))\right),

so

[Q(nt1/17,nt1,st1)]exp(2nt1(st1logst168log(m/n)log(eμ))).\mathbb{P}[Q(n_{t_{1}}/17,n_{t_{1}},s_{t_{1}})]\leqslant\exp\left(-2n_{t_{1}}\left(\frac{s_{t_{1}}\log{s_{t_{1}}}}{68}-\log(m/n)-\log(e\mu)\right)\right).

Thus, by our assumption (22) on β\beta, we get that

st1=βμ16(lgμ+2)5logmn(logμ+2)516(lgμ+2)5116log(m/n),s_{t_{1}}=\frac{\beta\mu}{16\left(\lg{\mu}+2\right)^{5}}\geqslant\frac{\log{\frac{m}{n}}(\log{\mu}+2)^{5}}{16(\lg{\mu}+2)^{5}}\geqslant\frac{1}{16}\log(m/n),

and

st1=βμ16(lgμ+2)5μ16(lgμ+2)5,s_{t_{1}}=\frac{\beta\mu}{16\left(\lg{\mu}+2\right)^{5}}\geqslant\frac{\mu}{16\left(\lg{\mu}+2\right)^{5}},

as β1\beta\geqslant 1. The proposition follows as

log(m/n)log(log(m/n)/16)1668log(m/n),\frac{\log{(m/n)}\log(\log{(m/n)}/16)}{16\cdot 68}\gg\log(m/n),

and

μ6816(lgμ+2)5log(μe).\frac{\mu}{68\cdot 16\left(\lg{\mu}+2\right)^{5}}\gg\log{(\mu e)}.\qed
Proposition 4.8.

If Iterated-Colouring-Algorithm inputs a hypergraph drawn from (n,m,d)\mathcal{H}(n,m,d) or (n,m,p)\mathbb{H}(n,m,p) where p=d/mp=d/m, then w.h.p. it does not abort in phase two, provided μ=dn/m\mu=dn/m\rightarrow\infty as nn\rightarrow\infty and mnm\gg n.

Proof.

Suppose that t1+1it2t_{1}+1\leqslant i\leqslant t_{2} for t2:=lg(10n/f^)+1t_{2}:=\lg(10n/\hat{f})+1. Recall that nt2=n/2t2=f^/20n_{t_{2}}=n/2^{t_{2}}=\hat{f}/20 and that during phase two, λe=0\lambda_{e}=0 for each eEie\in E_{i}. Thus, we get that

eEiexp(λe2/16)=|Ei|.\sum_{e\in E_{i}}\exp\left(-\lambda_{e}^{2}/16\right)=|E_{i}|.

On the other hand, in order for an edge of HH to remain in EiE_{i}, it must have more than f^\hat{f} vertices which lie in ViV_{i}. As a result, if Iterated-Colouring-Algorithm aborts in round ii, then HiH_{i} must have at least ni/16n_{i}/16 edges of size greater than f^\hat{f}. In particular, since nt2nin_{t_{2}}\leqslant n_{i}, this implies that the incidence matrix 𝑨\bm{A} of HH has an (nt2/16)×nt2(n_{t_{2}}/16)\times n_{t_{2}} sub-matrix in which each row has more than f^\hat{f} 1’s. Thus, if Q(nt2/16,nt2,f^)Q(n_{t_{2}}/16,n_{t_{2}},\hat{f}) denotes this latter event, then we get the following claim:

  1. (1)

    If Iterated-Colouring-Algorithm aborts in some round t1+1it2t_{1}+1\leqslant i\leqslant t_{2}, then Q(nt2/16,nt2,f^)Q(n_{t_{2}}/16,n_{t_{2}},\hat{f}) must occur.

As a result, in order to show that w.h.p. Iterated-Colouring-Algorithm does not abort in any round, it suffices to prove that Q(nt2/16,nt2,f^)Q(n_{t_{2}}/16,n_{t_{2}},\hat{f}) does not occur w.h.p. Now, it follows that

f^10df^20m=10dnt2m,\hat{f}\geqslant 10\frac{d\hat{f}}{20m}=10\frac{dn_{t_{2}}}{m},

as dmd\leqslant m. Thus, we can apply Lemma 4.6 to ensure that

[Q(nt2/16,nt2,f^)]\displaystyle\mathbb{P}[Q(n_{t_{2}}/16,n_{t_{2}},\hat{f})] (mnt2/16)(nnt2)exp(nt2f^32log(f^mdnt2))\displaystyle\leqslant\binom{m}{n_{t_{2}}/16}\binom{n}{n_{t_{2}}}\exp\left(-\frac{n_{t_{2}}\hat{f}}{32}\log\left(\frac{\hat{f}m}{dn_{t_{2}}}\right)\right)
(mnt2/16)(nnt2)exp(nt2f^32log(20md)),\displaystyle\leqslant\binom{m}{n_{t_{2}}/16}\binom{n}{n_{t_{2}}}\exp\left(-\frac{n_{t_{2}}\hat{f}}{32}\log\left(\frac{20m}{d}\right)\right),

and so [Q(nt2/16,nt2,f^)]\mathbb{P}[Q(n_{t_{2}}/16,n_{t_{2}},\hat{f})] is upper bounded by

exp(2nt2(f^64log(m/n)log(μe))),\exp\left(-2n_{t_{2}}\left(\frac{\hat{f}}{64}-\log(m/n)-\log(\mu e)\right)\right), (23)

after applying the same simplifications as in Proposition 4.7. The proposition then follows by assumption (22) on β\beta, as

f^/64=βμlog(m/n)/64log(m/n)log5(μ+2)64log(m/n),\hat{f}/64=\sqrt{\beta\mu\log{(m/n)}}/64\geqslant\frac{\log{(m/n)}\log^{5}{(\mu+2)}}{64}\gg\log{(m/n)},

and

log(m/n)log5(μ+2)64log(μe).\frac{\log{(m/n)}\log^{5}{(\mu+2)}}{64}\gg\log{(\mu e)}.\qed

5. Conclusion and Open Problems

We have lower bounded the discrepancy of the random hypergraph models (n,m,p)\mathbb{H}(n,m,p) and (n,m,d)\mathcal{H}(n,m,d) for the full parameter range in which dd\rightarrow\infty and dn/mdn/m\rightarrow\infty where p=d/mp=d/m. In the dense regime of mnm\gg n, we have provided asymptotically matching upper bounds, under the assumption that d=pm(m/n)1+εd=pm\geqslant(m/n)^{1+\varepsilon} for some constant ε>0\varepsilon>0. These upper bounds are algorithmic, and so the main question left open by our work is whether analogous upper bounds can be proven in the sparse regime of n/lognmnn/\log n\ll m\ll n. Our lower bounds suggest that the discrepancy is Θ(2n/mpn)\Theta\left(2^{-n/m}\sqrt{pn}\right), and while we believe that a second moment argument could be used to prove the existence of such a colouring—particularly, in the edge-independent model (n,m,p)\mathbb{H}(n,m,p)—the partial colouring lemma does not seem to be of much use here. This leaves open whether such a colouring can be computed efficiently in this parameter range. If this is not possible, then ideally one could find a reduction to a problem which is believed to be hard on average. One candidate may be the random-lattice problem of Ajtai [1] and Goldreich et al. [14], in which a random mm by nn matrix 𝑴\bm{M} with i.i.d. entries from q\mathbb{Z}_{q} is generated, and one wishes to compute a vector 𝒙{0,1}n\bm{x}\in\{0,1\}^{n} such that 𝑴𝒙=0\bm{M}\bm{x}=0.
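As an aside, the models themselves are easy to experiment with. The following toy simulation (ours, not from the paper; all function names and parameter values are illustrative) samples the edge-independent model (n,m,p)\mathbb{H}(n,m,p) as a 0/1 incidence matrix and records the best discrepancy achieved over a few uniformly random colourings:

```python
import random

# Toy experiment (illustrative only): sample the edge-independent model
# H(n, m, p) as an m x n 0/1 incidence matrix, and report the smallest
# discrepancy achieved over several uniformly random +-1 colourings.
def sample_incidence(n, m, p, rng):
    return [[1 if rng.random() < p else 0 for _ in range(n)] for _ in range(m)]

def disc(A, chi):
    # max over edges (rows) of |sum of the colours of the edge's vertices|
    return max(abs(sum(a * c for a, c in zip(row, chi))) for row in A)

rng = random.Random(0)
n, m, p = 60, 30, 0.5
A = sample_incidence(n, m, p, rng)
best = min(disc(A, [rng.choice((-1, 1)) for _ in range(n)]) for _ in range(200))
print(best)
```

Uniformly random colourings only witness discrepancy of order pnlogm\sqrt{pn\log m}; the algorithmic upper bounds in this paper, via partial colouring, are precisely about doing better than this in the dense regime.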

Acknowledgements

This work was initiated at the 2019 Graduate Research Workshop in Combinatorics, which was supported in part by NSF grant #1923238, NSA grant #H98230-18-1-0017, a generous award from the Combinatorics Foundation, and Simons Foundation Collaboration Grants #426971 (to M. Ferrara), #316262 (to S. Hartke) and #315347 (to J. Martin). We thank Aleksandar Nikolov for suggesting the problem, and Puck Rombach and Paul Horn for discussions and encouragement in the early stages of the project. Moreover, T. Masařík received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme Grant Agreement 714704. He completed a part of this work while he was a postdoc at Simon Fraser University in Canada, where he was supported through NSERC grants R611450 and R611368. X. Pérez-Giménez was supported in part by Simons Foundation Grant #587019.

References

  • [1] Miklós Ajtai “Generating Hard Instances of Lattice Problems (Extended Abstract)” In Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’96 Philadelphia, Pennsylvania, USA: Association for Computing Machinery, 1996, pp. 99–108 DOI: 10.1145/237814.237838
  • [2] Dylan J. Altschuler and Jonathan Niles-Weed “The Discrepancy of Random Rectangular Matrices” In CoRR abs/2101.04036, 2021 arXiv:2101.04036
  • [3] Wojciech Banaszczyk “Balancing vectors and Gaussian measures of n-dimensional convex bodies” In Random Structures & Algorithms 12.4, 1998, pp. 351–360 DOI: 10.1002/(SICI)1098-2418(199807)12:4<351::AID-RSA3>3.0.CO;2-S
  • [4] Nikhil Bansal and Raghu Meka “On the discrepancy of random low degree set systems” In Random Structures & Algorithms 57.3 Wiley, 2020, pp. 695–705 DOI: 10.1002/rsa.20935
  • [5] József Beck and Tibor Fiala ““Integer-making” theorems” In Discrete Applied Mathematics 3.1 Elsevier BV, 1981, pp. 1–8 DOI: 10.1016/0166-218x(81)90022-6
  • [6] Boris Bukh “An Improvement of the Beck–Fiala Theorem” In Combinatorics, Probability and Computing 25.3 Cambridge University Press, 2016, pp. 380–398 DOI: 10.1017/S0963548315000140
  • [7] Bernard Chazelle “The Discrepancy Method: Randomness and Complexity” Cambridge University Press, 2000 DOI: 10.1017/cbo9780511626371
  • [8] “A Panorama of Discrepancy Theory” Springer International Publishing, 2014 DOI: 10.1007/978-3-319-04696-9
  • [9] Persi Diaconis and Susan Holmes “Stein’s Method: Expository Lectures and Applications”, Institute of Mathematical Statistics Institute of Mathematical Statistics, 2004 URL: https://books.google.ca/books?id=3n-iQIU9LNEC
  • [10] Esther Ezra and Shachar Lovett “On the Beck-Fiala conjecture for random set systems” In Random Structures & Algorithms 54.4 Wiley, 2018, pp. 665–675 DOI: 10.1002/rsa.20810
  • [11] William Feller “An introduction to probability theory and its applications. Vol. II.”, Second edition John Wiley & Sons Inc., 1971
  • [12] Cole Franks and Michael Saks “On the discrepancy of random matrices with many columns” In Random Structures & Algorithms 57.1, 2020, pp. 64–96 DOI: 10.1002/rsa.20909
  • [13] Alan Frieze and Michał Karoński “Introduction to Random Graphs” Cambridge University Press, 2015 DOI: 10.1017/CBO9781316339831
  • [14] Oded Goldreich, Shafi Goldwasser and Shai Halevi “Collision-Free Hashing from Lattice Problems” In Studies in Complexity and Cryptography. Miscellanea on the Interplay between Randomness and Computation: In Collaboration with Lidor Avigad, Mihir Bellare, Zvika Brakerski, Shafi Goldwasser, Shai Halevi, Tali Kaufman, Leonid Levin, Noam Nisan, Dana Ron, Madhu Sudan, Luca Trevisan, Salil Vadhan, Avi Wigderson, David Zuckerman Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 30–39 DOI: 10.1007/978-3-642-22670-0_5
  • [15] Rebecca Hoberg and Thomas Rothvoss “A fourier-analytic approach for the discrepancy of random set systems” In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, 2019, pp. 2547–2556 SIAM DOI: 10.1137/1.9781611975482.156
  • [16] Avi Levy, Harishchandra Ramadas and T. Rothvoss “Deterministic Discrepancy Minimization via the Multiplicative Weight Update Method” In Integer Programming and Combinatorial Optimization, 2017, pp. 380–391 DOI: 10.1007/978-3-319-59250-3_31
  • [17] Shachar Lovett and Raghu Meka “Constructive Discrepancy Minimization by Walking on the Edges” In SIAM Journal on Computing 44.5 Society for Industrial & Applied Mathematics (SIAM), 2015, pp. 1573–1582 DOI: 10.1137/130929400
  • [18] Jiří Matoušek “Geometric Discrepancy: An Illustrated Guide”, Algorithms and Combinatorics 18 Springer-Verlag Berlin Heidelberg, 1999 DOI: 10.1007/978-3-642-03942-3
  • [19] Aditya Potukuchi “Discrepancy in random hypergraph models”, 2018 arXiv:1811.01491 [math.CO]
  • [20] Aditya Potukuchi “A Spectral Bound on Hypergraph Discrepancy” In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020) 168, Leibniz International Proceedings in Informatics (LIPIcs) Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2020, pp. 93:1–93:14 DOI: 10.4230/LIPIcs.ICALP.2020.93
  • [21] Irina Shevtsova “An Improvement of Convergence Rate Estimates in the Lyapunov Theorem” In Doklady Mathematics 82, 2010, pp. 862–864 DOI: 10.1134/S1064562410060062
  • [22] Paxton Turner, Raghu Meka and Philippe Rigollet “Balancing Gaussian vectors in high dimension” In Conference on Learning Theory, COLT 2020, 9-12 July 2020, Virtual Event [Graz, Austria] 125, Proceedings of Machine Learning Research PMLR, 2020, pp. 3455–3486 URL: http://proceedings.mlr.press/v125/turner20a.html
  • [23] J. D. Williams “An Approximation to the Probability Integral” In Annals of Mathematical Statistics 17.3 The Institute of Mathematical Statistics, 1946, pp. 363–365 DOI: 10.1214/aoms/1177730951