
Moser-Tardos Algorithm: Beyond Shearer’s Bound

Kun He, Qian Li, and Xiaoming Sun. Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China. E-mail: [email protected], [email protected], and [email protected].
Abstract.

In a seminal paper (Moser and Tardos, JACM’10), Moser and Tardos developed a simple and powerful algorithm to find solutions to combinatorial problems in the variable Lovász Local Lemma (LLL) setting. Kolipaka and Szegedy (STOC’11) proved that the Moser-Tardos algorithm is efficient up to the tight condition of the abstract Lovász Local Lemma, known as Shearer’s bound. A fundamental problem around the LLL is whether the efficient region of the Moser-Tardos algorithm can be further extended.

In this paper, we give a positive answer to this problem. We show that the efficient region of the Moser-Tardos algorithm goes beyond the Shearer’s bound of the underlying dependency graph whenever the graph is not chordal. If the dependency graph is chordal, Shearer’s bound is known to exactly characterize the efficient region (Kolipaka and Szegedy, STOC’11; He, Li, Liu, Wang and Xia, FOCS’17).

Moreover, we demonstrate that the efficient region can exceed Shearer’s bound by a constant, by explicitly calculating the gaps on several infinite lattices.

The core of our proof is a new criterion on the efficiency of the Moser-Tardos algorithm which takes the intersections between dependent events into consideration. Our criterion is strictly better than Shearer’s bound whenever intersections exist between dependent events. Meanwhile, if any two dependent events are mutually exclusive, our criterion coincides with Shearer’s bound, which is known to be tight in this situation for the Moser-Tardos algorithm (Kolipaka and Szegedy, STOC’11; Guo, Jerrum and Liu, JACM’19).

1. Introduction

Suppose $\mathcal{A}=\{A_{1},\cdots,A_{m}\}$ is a set of bad events. If the events are mutually independent, then we can avoid all of these events simultaneously whenever no event has probability 1. The Lovász Local Lemma (LLL) [14], one of the most important probabilistic methods, allows for limited dependency among the events, but still concludes that all the events can be avoided simultaneously if each individual event has a bounded probability. In the most general setting (a.k.a. abstract LLL), the dependency among $\mathcal{A}$ is characterized by an undirected graph $G_{D}=([m],E_{D})$, called a dependency graph of $\mathcal{A}$, which satisfies that for any vertex $i$, $A_{i}$ is independent of $\{A_{j}:j\notin\mathcal{N}_{G_{D}}(i)\cup\{i\}\}$. Here $\mathcal{N}_{G}(i)$ stands for the set of neighbors of vertex $i$ in a given graph $G$.

We use $\mathcal{A}\sim(G_{D},\bm{p})$ to denote that (i) $G_{D}$ is a dependency graph of $\mathcal{A}$ and (ii) the probability vector of $\mathcal{A}$ is $\bm{p}$. Given a graph $G_{D}$, define the abstract interior $\mathcal{I}_{a}(G_{D})$ to be the set consisting of all vectors $\bm{p}$ such that $\mathbb{P}\left(\cap_{A\in\mathcal{A}}\overline{A}\right)>0$ for any $\mathcal{A}\sim(G_{D},\bm{p})$. In this context, the most frequently used abstract LLL can be stated as follows:

Theorem 1.1 ([58]).

Given any graph $G_{D}=([m],E_{D})$ and any probability vector $\bm{p}\in(0,1]^{m}$, if there exist real numbers $x_{1},\ldots,x_{m}\in(0,1)$ such that $p_{i}\leq x_{i}\prod_{j\in\mathcal{N}_{G_{D}}(i)}(1-x_{j})$ for any $i\in[m]$, then $\bm{p}\in\mathcal{I}_{a}(G_{D})$.

Shearer [56] obtained the strongest possible condition for the abstract LLL. Let $\text{Ind}(G_{D})$ be the set of all independent sets of an undirected graph $G_{D}=([m],E_{D})$, and let $\bm{p}=(p_{1},\cdots,p_{m})\in(0,1]^{m}$. For each $I\in\text{Ind}(G_{D})$, define the quantity

$$q_{I}(G_{D},\bm{p})=\sum_{J\in\text{Ind}(G_{D}),\,I\subseteq J}(-1)^{|J|-|I|}\prod_{i\in J}p_{i}.$$

We say $\bm{p}$ is in Shearer’s bound of $G_{D}$ if $q_{I}(G_{D},\bm{p})>0$ for every $I\in\text{Ind}(G_{D})$; otherwise we say $\bm{p}$ is beyond Shearer’s bound of $G_{D}$. Shearer’s result can be stated as follows.
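To make these quantities concrete, here is a minimal Python sketch (ours, for illustration only; the graph, the helper names, and the numeric comments are assumptions, with the symmetric boundary of the 4-cycle, $p=1-\sqrt{2}/2\approx 0.2929$, computed from the formula above) that tests membership in Shearer’s bound by brute-force enumeration of independent sets:

from itertools import combinations
from math import prod

def independent_sets(n, edges):
    adj = {frozenset(e) for e in edges}
    for r in range(n + 1):
        for s in combinations(range(n), r):
            if all(frozenset(pair) not in adj for pair in combinations(s, 2)):
                yield frozenset(s)

def in_shearer_bound(n, edges, p):
    ind = list(independent_sets(n, edges))
    def q(I):  # q_I(G_D, p) as defined above
        return sum((-1) ** (len(J) - len(I)) * prod(p[i] for i in J)
                   for J in ind if I <= J)
    return all(q(I) > 0 for I in ind)

C4 = [(0, 1), (1, 2), (2, 3), (3, 0)]        # the 4-cycle
print(in_shearer_bound(4, C4, [0.29] * 4))   # True:  below 1 - sqrt(2)/2
print(in_shearer_bound(4, C4, [0.30] * 4))   # False: beyond the boundary

The check is exponential in $m$ and is meant only to make $q_{I}$ tangible on small graphs.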

Theorem 1.2 ([56]).

For any graph $G_{D}=([m],E_{D})$ and any probability vector $\bm{p}\in(0,1]^{m}$, $\bm{p}\in\mathcal{I}_{a}(G_{D})$ if and only if $\bm{p}$ is in Shearer’s bound of $G_{D}$.

Variable Lovász Local Lemma. The variable Lovász Local Lemma (VLLL) is another quite general and common setting of the LLL, which applies to variable-generated event systems. In this setting, there is a set of underlying mutually independent random variables $\{X_{1},\cdots,X_{n}\}$, and each event $A_{i}$ is fully determined by some subset $\mathrm{vbl}(A_{i})$ of the variables. The dependency between events and variables is naturally characterized by a bipartite graph $G_{B}=([m],[n],E_{B})$, known as the event-variable graph, such that edge $(i,j)\in[m]\times[n]$ exists if and only if $X_{j}\in\mathrm{vbl}(A_{i})$.

The variable setting is important mainly because most applications of the LLL have natural underlying independent variables, such as the satisfiability of CNF formulas [30, 31, 50, 16], hypergraph coloring [49, 29], and Ramsey numbers [57, 58, 32]. In particular, the groundbreaking result by Moser and Tardos [54] on constructive LLL applies in the variable setting.

There is a natural choice for the dependency graph of variable-generated systems, called the canonical dependency graph: two events are adjacent if they share some common variables. Formally, given a bipartite graph $G_{B}=(U,V,E_{B})$, its base graph is defined as the graph $G_{D}(G_{B})=(U,E_{D})$ such that for any two vertices $u_{i},u_{j}\in U$, $(u_{i},u_{j})\in E_{D}$ if and only if $u_{i}$ and $u_{j}$ share common neighbors in $G_{B}$. If $G_{B}$ is the event-variable graph of a variable-generated system $\mathcal{A}$, then $G_{D}(G_{B})$ is the canonical dependency graph of $\mathcal{A}$.

Given a graph $G_{D}$, define the variable interior $\mathcal{I}_{v}(G_{D})$ to be the set consisting of all vectors $\bm{p}$ such that $\mathbb{P}\left(\cap_{A\in\mathcal{A}}\overline{A}\right)>0$ for any variable-generated event system $\mathcal{A}\sim(G_{D},\bm{p})$. Obviously, $\mathcal{I}_{v}(G_{D})\supseteq\mathcal{I}_{a}(G_{D})$ for any $G_{D}$. In contrast with the abstract LLL, Shearer’s bound (of the canonical dependency graph) turns out not to be tight for variable-generated systems [36]: the containment is proper if and only if $G_{D}$ is not chordal (a graph is chordal if it has no induced cycle of length at least four).

Constructive (variable) Lovász Local Lemma and the Moser-Tardos algorithm. The abstract LLL and the variable LLL mentioned above are not constructive, in that they do not indicate how to efficiently find an object avoiding all the bad events. In a seminal paper [54], Moser and Tardos developed an amazingly simple and efficient algorithm for variable-generated systems, depicted in Algorithm 1 (throughout the paper, the Moser-Tardos algorithm is allowed to follow arbitrary selection rules), and showed that this algorithm terminates quickly under the condition in Theorem 1.1. Following the Moser-Tardos algorithm (or MT algorithm for short), a large amount of effort has been devoted to constructive LLL, including remarkable works which extend the MT techniques beyond the variable setting [38, 5, 3, 4, 42, 41]. The MT algorithm has been applied to many important problems, including $k$-SAT [31], hypergraph coloring [32], Hamiltonian cycle [32], and their counting and sampling [27, 50, 16, 18, 44, 40].

Assign random values to $X_{1},\cdots,X_{n}$;
while $\exists i\in[m]$ such that $A_{i}$ holds do
      Arbitrarily select one such $i$ and resample all variables $X_{j}$ in $\mathrm{vbl}(A_{i})$;
Return the current assignment;
Algorithm 1. Moser-Tardos Algorithm
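The following minimal Python sketch of Algorithm 1 may be helpful; the concrete toy instance (events saying that two cyclically adjacent Boolean variables are both False) and all function names are our illustrative assumptions, not from the paper:

import random

def moser_tardos(n, events, sample=lambda j: random.random() < 0.5):
    # events: list of (vbl, holds), where vbl is a list of variable indices
    # and holds(x) is True iff the bad event occurs under assignment x.
    x = {j: sample(j) for j in range(n)}              # assign random values
    while True:
        bad = [i for i, (vbl, holds) in enumerate(events) if holds(x)]
        if not bad:
            return x                                  # no bad event holds
        i = bad[0]                                    # arbitrary selection rule
        for j in events[i][0]:                        # resample vbl(A_i)
            x[j] = sample(j)

n = 5
events = [([j, (j + 1) % n],
           (lambda a, b: lambda x: not x[a] and not x[b])(j, (j + 1) % n))
          for j in range(n)]
print(moser_tardos(n, events))

Any rule for choosing among the currently occurring bad events is allowed; the sketch simply picks the first one.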

Mainly because such a simple algorithm is so powerful and general-purpose, how powerful the MT algorithm is has become one of the most intriguing and fundamental problems on constructive LLL. Given a graph $G_{D}$, define the Moser-Tardos interior $\mathcal{I}_{MT}(G_{D})$ to be the set consisting of all vectors $\bm{p}$ such that the MT algorithm is efficient for any variable-generated event system $\mathcal{A}\sim(G_{D},\bm{p})$. Clearly, $\mathcal{I}_{MT}(G_{D})\subseteq\mathcal{I}_{v}(G_{D})$ for any $G_{D}$. A major line of follow-up works explores $\mathcal{I}_{MT}(G_{D})$ [46, 55, 45, 10]. The best known criterion is due to Kolipaka and Szegedy [45], who extended the MT interior to Shearer’s bound; that is, they showed that $\mathcal{I}_{MT}(G_{D})\supseteq\mathcal{I}_{a}(G_{D})$. As mentioned above, if $G_{D}$ is not chordal, $\mathcal{I}_{a}(G_{D})$ is properly contained in $\mathcal{I}_{v}(G_{D})$, so it is possible to push $\mathcal{I}_{MT}(G_{D})$ further, beyond Shearer’s bound.

In this paper, we concentrate on the following open problem:

Problem 1: does $\mathcal{I}_{MT}(G_{D})$ properly contain $\mathcal{I}_{a}(G_{D})$ for some $G_{D}$? If so, for what kind of graph $G_{D}$?

Beyond potential applications, our main motivations are the following fundamental problems around the LLL itself:

  • The limitation of constructive LLL in the variable setting. Among the most fascinating problems around the LLL is a mysterious conjecture saying that there is an algorithm which is efficient for all variable-generated systems $\mathcal{A}$ with $\mathcal{A}\sim(G_{D},\bm{p})$ for some $G_{D}$ and $\bm{p}\in\mathcal{I}_{v}(G_{D})$ [59]. It would be a small miracle if the conjecture were true, since then one could always construct a solution efficiently in the variable setting whenever solutions are guaranteed to exist by the LLL condition. Towards this conjecture, a good start is to show that $\mathcal{I}_{MT}(G_{D})\supsetneqq\mathcal{I}_{a}(G_{D})$ for some $G_{D}$, as $\mathcal{I}_{v}(G_{D})\supsetneqq\mathcal{I}_{a}(G_{D})$ for any $G_{D}$ which is not chordal.

  • The limitation of the MT algorithm. The MT algorithm is one of the most intriguing topics in modern algorithm research, not only because it is strikingly simple yet powerful, but also because it is closely related to the famous Walksat algorithm for random $k$-SAT. A mysterious problem about the MT algorithm is where its true limitation lies [59, 10]. It is conjectured that $\mathcal{I}_{MT}(G_{D})=\mathcal{I}_{v}(G_{D})$ for any $G_{D}$ [59]. To prove this conjecture, the first step is to give a positive answer to Problem 1. Moreover, due to the connection between Shearer’s bound and the repulsive lattice gas model, it is conjectured that an essential connection exists between statistical mechanics and the MT algorithm [59]. Whether $\mathcal{I}_{MT}(G_{D})=\mathcal{I}_{a}(G_{D})$ for each $G_{D}$ is critical to this conjecture.

Remark 1.3.

To explore the power of the MT algorithm in specific applications, one may employ special structures of the applications, such as the way the variables interact, to obtain sharper bounds than those in terms of the canonical dependency graph only. Nevertheless, characterizing the power of the MT algorithm in terms of the canonical dependency graph is a very fundamental problem and also the focus of a major line of research [54, 55, 9, 45]. Moreover, a major difficulty in strengthening the guarantees of the MT algorithm is that the analysis should be valid for all possible variable-generated event systems; it is not quite surprising to obtain better bounds if the event system has further restrictions. To substantially improve the guarantees of the MT algorithm and provide deep insight into its dynamics, we focus on the general variable LLL setting rather than employing the special structures of particular applications.

We should emphasize that Problem 1 is still quite open! As mentioned before, it has been proved that Shearer’s bound is not tight for variable-generated systems [36]. However, this only says that there is some probability vector $\bm{p}$ beyond Shearer’s bound such that every variable-generated event system $\mathcal{A}\sim(G_{D},\bm{p})$ must have a satisfying assignment. It is unclear whether the MT algorithm can construct such an assignment efficiently.

It has also been proved that the MT algorithm can still be efficient beyond Shearer’s bound in some specific applications [32]. Despite its novel contribution, this result does not answer Problem 1: it focuses on event systems with special structures, so it only implies that there is a probability vector $\bm{p}$ beyond Shearer’s bound such that the MT algorithm is efficient for some restricted variable-generated event systems $\mathcal{A}\sim(G_{D},\bm{p})$. However, to show $\mathcal{I}_{MT}(G_{D})\supsetneqq\mathcal{I}_{a}(G_{D})$, one must prove that the MT algorithm is efficient for all possible event systems, and this is one major difficulty in resolving Problem 1.

1.1. Results and contributions

We provide a complete answer to Problem 1 (Theorem 1.5): if $G_{D}$ is not chordal, then $\mathcal{I}_{MT}(G_{D})\supsetneqq\mathcal{I}_{a}(G_{D})$, i.e., the efficient region of the MT algorithm goes beyond Shearer’s bound. Otherwise, $\mathcal{I}_{MT}(G_{D})=\mathcal{I}_{a}(G_{D})$, because $\mathcal{I}_{a}(G_{D})\subseteq\mathcal{I}_{MT}(G_{D})\subseteq\mathcal{I}_{v}(G_{D})$ and $\mathcal{I}_{v}(G_{D})=\mathcal{I}_{a}(G_{D})$ for chordal graphs $G_{D}$ [36].

The core of the proof of Theorem 1.5 is a new convergence criterion for the MT algorithm (Theorem 1.6), which may be of independent interest. This new criterion takes the intersection between dependent events into consideration, and is strictly better than Shearer’s bound when there exists a pair of dependent events which are not mutually exclusive.

1.1.1. Moser-Tardos algorithm: beyond Shearer’s bound

Given a dependency graph $G_{D}=([m],E_{D})$ and a probability vector $\bm{p}=(p_{1},p_{2},\cdots,p_{m})\in(0,1)^{m}$, we say that $\bm{p}$ is on the Shearer’s boundary of $G_{D}$ if $(1-\varepsilon)\bm{p}$ is in Shearer’s bound and $(1+\varepsilon)\bm{p}$ is not, for any $\varepsilon>0$. A chordless cycle in a graph $G_{D}$ is an induced cycle of length at least 4. A chordal graph is a graph without chordless cycles.

Given two vectors $\bm{p}$ and $\bm{q}$, we write $\bm{p}\leq\bm{q}$ if the inequality holds entry-wise. Additionally, if the inequality is strict on at least one entry, we write $\bm{p}<\bm{q}$.

Definition 1.4 (Maximum $L_{1}$-gap to the Shearer’s bound).

Given a dependency graph $G_{D}$ and a probability vector $\bm{p}$ beyond the Shearer’s bound of $G_{D}$, define the maximum $L_{1}$-gap from $\bm{p}$ to the Shearer’s bound of $G_{D}$ as

$$d(\bm{p},G_{D})\triangleq\sup\big\{\|\bm{q}\|_{1}:\bm{q}\leq\bm{p}\text{ and }\bm{p}-\bm{q}\notin\mathcal{I}_{a}(G_{D})\big\}.$$

For convenience, we let $d(\bm{p},G_{D})=-1$ if $\bm{p}$ is in the Shearer’s bound of $G_{D}$.

Intuitively, $d(\bm{p},G_{D})$ measures how far $\bm{p}$ is from the Shearer’s bound of $G_{D}$. One can verify that $d(\bm{p},G_{D})<0$ if $\bm{p}$ is in the Shearer’s bound, $d(\bm{p},G_{D})=0$ if $\bm{p}$ is on the Shearer’s boundary, and $d(\bm{p},G_{D})>0$ if $\bm{p}$ is beyond Shearer’s bound but not on the Shearer’s boundary. Now, we are ready to state our main result.

Theorem 1.5.

For any chordal graph $G_{D}$, $\mathcal{I}_{MT}(G_{D})=\mathcal{I}_{a}(G_{D})$; i.e., $\bm{p}\in\mathcal{I}_{MT}(G_{D})$ iff $d(\bm{p},G_{D})<0$.

For any graph $G_{D}$ which is not chordal, $\bm{p}\in\mathcal{I}_{MT}(G_{D})$ if

$$d(\bm{p},G_{D})<\frac{1}{545}\cdot\sum_{i\leq\ell}|C_{i}|\Big(\min_{j\in C_{i}}p_{j}\Big)^{4}\cdot\left(\max\left\{\frac{2\sum_{j\in C_{i}}\sqrt{p_{j}}}{|C_{i}|}-1,0\right\}\right)^{2}$$

for some disjoint chordless cycles $C_{1},C_{2},\cdots,C_{\ell}$ in $G_{D}$. In particular, there is a probability vector $\bm{p}$ with $d(\bm{p},G_{D})\geq 2^{-20}K^{-3}$ satisfying the above condition, where $K$ is the length of the shortest chordless cycle. This implies that $\mathcal{I}_{MT}(G_{D})$ contains a probability vector $\bm{p}$ with $d(\bm{p},G_{D})\geq 2^{-20}K^{-3}$.

The intuition behind Theorem 1.5 is as follows. The theorem characterizes the efficient region of the MT algorithm via $d(\bm{p},G_{D})$: it shows that if $d(\bm{p},G_{D})$ is upper bounded by a non-negative quantity determined by the chordless cycles in $G_{D}$, then the MT algorithm is efficient. Since $\mathcal{I}_{a}(G_{D})$ is the set of $\bm{p}$ with $d(\bm{p},G_{D})<0$, our criterion is at least as good as Shearer’s bound. Moreover, for each $G_{D}$ which is not chordal, our criterion is strictly better: there exists some $\bm{p}$ with $d(\bm{p},G_{D})\geq 2^{-20}K^{-3}$ satisfying our criterion. Intuitively, Theorem 1.5 says that chordless cycles in $G_{D}$ enhance the power of the MT algorithm.

We emphasize that Theorem 1.5 provides a complete answer to Problem 1: $\mathcal{I}_{MT}(G_{D})$ properly contains $\mathcal{I}_{a}(G_{D})$ if and only if $G_{D}$ is not chordal.

1.1.2. A new constructive LLL for non-extremal instances

Given a set $\mathcal{A}$ of events with dependency graph $G_{D}$, $\mathcal{A}$ is called extremal if all pairs of dependent events are mutually exclusive, and non-extremal otherwise. Kolipaka and Szegedy [45] showed that the MT algorithm is efficient up to Shearer’s bound; in particular, Shearer’s bound is the tight convergence criterion for extremal instances [45, 27]. Here, we provide a new convergence criterion (Theorem 1.6) which strictly improves Kolipaka and Szegedy’s result: it is strictly better than Shearer’s bound when the instance is non-extremal, and coincides with Shearer’s bound when the instance is extremal. This criterion, named the intersection LLL, is the core of our proof of Theorem 1.5.

Let $G_{D}=([m],E_{D})$ be a canonical dependency graph and $\bm{p}=(p_{1},\cdots,p_{m})\in(0,1)^{m}$ be a probability vector. Let $\mathcal{M}=\{(i_{1},i_{1}^{\prime}),(i_{2},i_{2}^{\prime}),\cdots\}\subseteq E_{D}$ be a matching of $G_{D}$, and let $\bm{\delta}=(\delta_{i_{1},i_{1}^{\prime}},\delta_{i_{2},i_{2}^{\prime}},\cdots)\in(0,1)^{|\mathcal{M}|}$ be another probability vector. We say that an event set $\mathcal{A}$ is of the setting $(G_{D},\bm{p},\mathcal{M},\bm{\delta})$, and write $\mathcal{A}\sim(G_{D},\bm{p},\mathcal{M},\bm{\delta})$, if $\mathcal{A}\sim(G_{D},\bm{p})$ and $\mathbb{P}(A_{i}\cap A_{i^{\prime}})\geq\delta_{i,i^{\prime}}$ for each pair $(i,i^{\prime})\in\mathcal{M}$. Given $(G_{D},\bm{p},\mathcal{M},\bm{\delta})$, define $\bm{p}^{-}\in(0,1)^{m}$ as follows:

$$\forall i\in[m]:\quad p_{i}^{-}=\begin{cases}p_{i}-\frac{1}{17}\cdot\delta_{i,i^{\prime}}^{2},&\text{if }(i,i^{\prime})\in\mathcal{M}\text{ for some }i^{\prime};\\ p_{i},&\text{otherwise}.\end{cases}$$
Theorem 1.6 (Intersection LLL, informal).

For any $\mathcal{A}\sim(G_{D},\bm{p},\mathcal{M},\bm{\delta})$, the MT algorithm terminates quickly if $\bm{p}^{-}$ is in the Shearer’s bound of $G_{D}$.

The intuition behind Theorem 1.6 is as follows. For any matching $\mathcal{M}$ in $G_{D}$, if the intersection of the events on each edge $(i,i^{\prime})\in\mathcal{M}$ has probability at least $\delta_{i,i^{\prime}}$, then one can subtract $\frac{1}{17}\cdot\delta^{2}_{i,i^{\prime}}$ from the probabilities of both endpoints $i$ and $i^{\prime}$, and the MT algorithm is guaranteed to be efficient whenever the reduced probability vector is in the Shearer’s bound.
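A tiny Python helper (ours, mirroring the definition of $\bm{p}^{-}$ above; all names are illustrative) makes the reduction explicit:

def reduced_vector(p, matching, delta):
    # p: list of event probabilities; matching: list of pairs (i, i');
    # delta: dict mapping each matched pair to the lower bound delta_{i,i'}.
    p_minus = list(p)
    for (i, i2) in matching:
        d = delta[(i, i2)]
        p_minus[i]  -= d * d / 17.0   # p_i^-    = p_i    - delta^2 / 17
        p_minus[i2] -= d * d / 17.0   # p_{i'}^- = p_{i'} - delta^2 / 17
    return p_minus

print(reduced_vector([0.3, 0.3], [(0, 1)], {(0, 1): 0.05}))

One would then feed the reduced vector to a Shearer-bound test (such as the brute-force check sketched in the introduction) to apply Theorem 1.6.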

Remark 1.7.

In many applications of the LLL [49, 31, 30, 50, 28], dependent bad events naturally intersect with each other. For instance, in a CNF formula, if every common variable of two clauses occurs with the same sign in both, then the bad events corresponding to these two clauses are dependent and intersect. Thus our intersection LLL may be capable of improving bounds for these applications. However, currently the improvement is weak, because only the intersections between matched events are considered in Theorem 1.6.

Nevertheless, the primary motivation of this work is to explore the power of the MT algorithm in the general variable LLL setting. This basic problem is very important in itself, besides its potential applications.

1.1.3. Application to lattices

To illustrate the application of Theorem 1.5, we explicitly estimate the efficient region of the MT algorithm on some lattices. For simplicity, we focus on symmetric probabilities, i.e., $\bm{p}=(p,p,\cdots,p)$. Our lower bounds on the gaps between the efficient region of the MT algorithm and the Shearer’s bound are summarized in Table 1. For example, when the canonical dependency graph is the square lattice, the vector $(0.1193,0.1193,\cdots)$ is on the Shearer’s boundary, and the MT algorithm is provably efficient whenever the probability of each event is at most $0.1193+1.858\times 10^{-22}$.

Table 1. Summary of lower bounds on the gaps

Lattice      | Shearer’s bound  | Lower bound on the gap
Square       | 0.1193 [20, 60]  | $1.858\times 10^{-22}$
Hexagonal    | 0.1547 [60]      | $2.597\times 10^{-25}$
Simple Cubic | 0.0744 [19]      | $7.445\times 10^{-23}$

1.2. Technique overview

As mentioned before, Shearer’s bound is the tight criterion for the MT algorithm on extremal instances. Thus, in order to show that the MT algorithm goes beyond Shearer’s bound, we need to take advantage of the intersections between dependent events. Specifically, Theorem 1.5 follows immediately from two results about non-extremal instances. One is the intersection LLL criterion (Theorem 1.6), which goes beyond Shearer’s bound whenever there are intersections between dependent events. The other is a lower bound on the amount of intersection between dependent events for general instances (Theorem 4.1).

1.2.1. Proof overview of Theorem 1.6

Let us first recall Kolipaka and Szegedy’s argument [45], which shows that the MT algorithm is efficient up to Shearer’s bound. We assume that $\{A_{i}\}_{i=1}^{m}$ is a fixed set of events with dependency graph $G_{D}=([m],E_{D})$ and probabilities $\bm{p}=(p_{1},\cdots,p_{m})$. The notion of a witness DAG (abbreviated wdag) is central to their argument. (In [45], the role of witness DAGs was played by “stable set sequences”, but the concepts are essentially the same: there is a natural bijection between stable set sequences and wdags.) A wdag is a DAG in which each node $v$ has a label $L(v)$ from $[m]$ and in which two nodes $v$ and $v^{\prime}$ are connected by an arc if and only if $L(v)=L(v^{\prime})$ or $(L(v),L(v^{\prime}))\in E_{D}$. With a resampling sequence $\bm{s}=s_{1},s_{2},\cdots,s_{T}$ (i.e., the MT algorithm picks the events $A_{s_{1}},A_{s_{2}},\cdots,A_{s_{T}}$ for resampling in this order), we associate a wdag $D_{\bm{s}}$ on node set $\{v_{1},\cdots,v_{T}\}$ as follows: (a) $L(v_{k})=s_{k}$, and (b) there is an arc from $v_{k}$ to $v_{\ell}$ with $k<\ell$ if and only if either $s_{k}=s_{\ell}$ or $(s_{k},s_{\ell})\in E_{D}$ (see the example in Figure 1). We say that a wdag $D$ occurs in the resampling sequence $\bm{s}$ if there is a subset $U$ of nodes in $D_{\bm{s}}$ such that $D$ is the subgraph of $D_{\bm{s}}$ induced by the nodes that have a directed path to $U$ (Figure 1 (d) is an example, where $U=\{v_{4}\}$). A useful observation is that $\mathbb{E}[T]=\sum_{D\in\mathcal{D}}\mathbb{P}_{\bm{s}}[D\text{ occurs in }\bm{s}]$, where $\mathcal{D}$ denotes the set of all single-sink wdags (a.k.a. proper wdags) of $G_{D}$.

We define the weight of a wdag $D$ to be $\Pi_{v\in D}\,p_{L(v)}$. The crucial lemma in Kolipaka and Szegedy’s argument (the idea is from the Moser-Tardos analysis) is that the probability of occurrence of a certain wdag $D$ is upper bounded by its weight. The idea is that we can assume (only for the analysis) that the MT algorithm has a preprocessing step where it prepares an infinite number of independent samples for each variable. These independent samples create a table $\bm{X}$, called the resampling table (see Figure 2 in Section 3.1 for an example). When the MT algorithm decides to resample variable $X_{j}$, it picks a new sample of $X_{j}$ from the resampling table. If a certain wdag $D$ occurs, then for each of its events we can determine a particular set of samples in the resampling table that must satisfy the event; in this case we say that $D$ is consistent with the resampling table $\bm{X}$ and write $D\sim\bm{X}$. Hence, $\mathbb{P}_{\bm{s}}[D\text{ occurs in }\bm{s}]\leq\mathbb{P}_{\bm{X}}[D\sim\bm{X}]=\Pi_{v\in D}\,p_{L(v)}$.

Finally, they beautifully evaluated the summation of the weights of proper wdags, i.e., $\sum_{D\in\mathcal{D}}\Pi_{v\in D}\,p_{L(v)}$, which turns out to converge if and only if $\bm{p}$ is in the Shearer’s bound of $G_{D}$.

Figure 1. (a) a dependency graph $G_{D}$; (b) a resampling sequence; (c) the wdag $D_{\bm{s}}$; (d) a wdag occurring in $\bm{s}$.

Viewing Theorem 1.6 as an improvement of Kolipaka and Szegedy’s result, we begin by providing a tighter upper bound on $\sum_{D\in\mathcal{D}}\mathbb{P}_{\bm{s}}[D\text{ occurs in }\bm{s}]$ when the instance is non-extremal (Theorem 3.7). First, note that for each wdag $D$, there exist selection rules which make $\mathbb{P}_{\bm{s}}[D\text{ occurs in }\bm{s}]=\Pi_{v\in D}\,p_{L(v)}$, so it is impossible to give a better upper bound on $\mathbb{P}_{\bm{s}}[D\text{ occurs in }\bm{s}]$ which holds for all selection rules. Our idea is to group proper wdags and consider the sum of $\mathbb{P}_{\bm{s}}[D\text{ occurs in }\bm{s}]$ over a group. For example, suppose that $A_{1}$ and $A_{2}$ are dependent and $\mathbb{P}[A_{1}\cap A_{2}]\geq\delta_{1,2}$. Let $D_{1}$ denote the proper wdag which consists of only one arc $A_{1}\rightarrow A_{2}$, and let $D_{2}$ denote the proper wdag consisting of only $A_{2}\rightarrow A_{1}$. $D_{1}$ and $D_{2}$ cannot both occur, but they may both be consistent with a given resampling table. So the total weight of $D_{1}$ and $D_{2}$ is an overestimate of the probability that $D_{1}$ or $D_{2}$ occurs. Formally,

\begin{align*}
\mathbb{P}_{\bm{s}}[D_{1}\text{ occurs in }\bm{s}]+\mathbb{P}_{\bm{s}}[D_{2}\text{ occurs in }\bm{s}] &= \mathbb{P}_{\bm{s}}[(D_{1}\text{ occurs in }\bm{s})\vee(D_{2}\text{ occurs in }\bm{s})]\\
&\leq \mathbb{P}_{\bm{X}}[(D_{1}\sim\bm{X})\vee(D_{2}\sim\bm{X})]\\
&= \mathbb{P}_{\bm{X}}[D_{1}\sim\bm{X}]+\mathbb{P}_{\bm{X}}[(D_{2}\sim\bm{X})\wedge(D_{1}\not\sim\bm{X})]\\
&\leq p_{1}p_{2}+p_{1}p_{2}-\delta_{1,2}^{2},
\end{align*}

where the last inequality follows from the Cauchy–Schwarz inequality (see Proposition 3.3). Importantly, the upper bound holds for all selection rules.

It is both crucial and difficult that our improvement over the weight of wdags be “exponential”: since the quantity $\sum_{D\in\mathcal{D}}\Pi_{v\in D}\,p_{L(v)}^{-}$ converges if and only if $\bm{p}^{-}$ is in the Shearer’s bound, constant-factor or even sub-exponential improvements over $\sum_{D\in\mathcal{D}}\Pi_{v\in D}\,p_{L(v)}$ do not suffice to show the desired convergence criterion. Our exponential improvement relies on a delicate grouping and a tricky random partition of the union of the events $D\sim\bm{X}$ across wdags.

We first state how we group proper wdags: define $\mathcal{D}(i,r)$ to be the set of proper wdags whose unique sink node is labelled with $i$ and in which there are exactly $r$ nodes labelled with $i$. Noticing that at most one wdag in $\mathcal{D}(i,r)$ can occur, we have that

$$\sum_{D\in\mathcal{D}(i,r)}\mathbb{P}_{\bm{s}}[D\text{ occurs in }\bm{s}] = \mathbb{P}_{\bm{X}}\left[\bigvee_{D\in\mathcal{D}(i,r)}(D\text{ occurs})\right]\leq\mathbb{P}_{\bm{X}}\left[\bigvee_{D\in\mathcal{D}(i,r)}(D\sim\bm{X})\right].$$

Now, we partition the event $\bigvee_{D\in\mathcal{D}(i,r)}(D\sim\bm{X})$ across the wdags in $\mathcal{D}(i,r)$. The notions of reversible arcs (see Definition 2.4) and of an auxiliary table (see Section 3.1) are two central concepts here. Specifically, an arc $u\rightarrow v$ in a wdag $D$ is said to be reversible if the directed graph obtained from $D$ by reversing the direction of $u\rightarrow v$ is also a wdag. The auxiliary table is a table $\bm{Y}$ of independent fair coins corresponding to the directions of reversible arcs. We say a wdag $D$ is consistent with $(\bm{X},\bm{Y})$, denoted by $D\sim(\bm{X},\bm{Y})$, if (i) $D\sim\bm{X}$, and (ii) for each reversible arc whose direction is not consistent with $\bm{Y}$, the wdag obtained by reversing the arc is not consistent with $\bm{X}$. The crucial lemma (Lemma 3.1) shows that for any fixed assignment $\bm{y}$ of the auxiliary table $\bm{Y}$, $\bigvee_{D\in\mathcal{D}(i,r)}(D\sim\bm{X})=\bigvee_{D\in\mathcal{D}(i,r)}(D\sim(\bm{X},\bm{y}))$. The point is that the events $(D\sim(\bm{X},\bm{y}))$ have much less overlap with each other, so that they can be viewed as an “approximate” partition of the space. By applying a union bound, we get

\begin{align*}
\mathbb{P}_{\bm{X}}\left[\bigvee_{D\in\mathcal{D}(i,r)}(D\sim\bm{X})\right] &= \mathbb{E}_{\bm{Y}}\,\mathbb{P}_{\bm{X}}\left[\bigvee_{D\in\mathcal{D}(i,r)}(D\sim\bm{X})\right]=\mathbb{E}_{\bm{Y}}\,\mathbb{P}_{\bm{X}}\left[\bigvee_{D\in\mathcal{D}(i,r)}(D\sim(\bm{X},\bm{Y}))\right]\\
&\leq \mathbb{E}_{\bm{Y}}\sum_{D\in\mathcal{D}(i,r)}\mathbb{P}_{\bm{X}}\left[D\sim(\bm{X},\bm{Y})\right] = \sum_{D\in\mathcal{D}(i,r)}\mathbb{E}_{\bm{Y}}\,\mathbb{P}_{\bm{X}}\left[D\sim(\bm{X},\bm{Y})\right].
\end{align*}

We are then able to provide an upper bound on $\mathbb{E}_{\bm{Y}}\,\mathbb{P}_{\bm{X}}\left[D\sim(\bm{X},\bm{Y})\right]$ which is “exponentially” smaller than $\Pi_{v\in D}\,p_{L(v)}$ (Lemma 3.4), which completes the proof of Theorem 3.7.

The next step is to show that the tighter upper bound converges when $\bm{p}^{-}$ is in the Shearer’s bound. For each vertex $i$ in the matching $\mathcal{M}$, we “split” vertex $i$ into two new adjacent vertices $i^{\uparrow}$ and $i^{\downarrow}$. Let $G^{\mathcal{M}}$ be the resulting dependency graph (see the example in Figure 3). Define $p^{\mathcal{M}}_{i^{\uparrow}}=p^{\prime}_{i}$ and $p^{\mathcal{M}}_{i^{\downarrow}}=p^{-}_{i}-p^{\prime}_{i}$ (see the definition of $p_{i}^{\prime}$ in Section 2.3). One can see that $(G_{D},\bm{p}^{-})$ and $(G^{\mathcal{M}},\bm{p}^{\mathcal{M}})$ are essentially the same: suppose $\mathcal{A}\sim(G_{D},\bm{p}^{-})$; then for each $i\in\mathcal{M}$, we view $A_{i}$ as the union of two mutually exclusive events $A_{i^{\uparrow}}$ and $A_{i^{\downarrow}}$ whose probabilities are $p_{i}^{\prime}$ and $p_{i}^{-}-p_{i}^{\prime}$ respectively. Such a representation of $\mathcal{A}$ is of the setting $(G^{\mathcal{M}},\bm{p}^{\mathcal{M}})$. Thus, the sum of the weights of proper wdags in the setting $(G_{D},\bm{p}^{-})$ is equal to that in the setting $(G^{\mathcal{M}},\bm{p}^{\mathcal{M}})$ (Proposition 3.9). So it suffices to show that our tighter upper bound is at most the sum of the weights of proper wdags in the setting $(G^{\mathcal{M}},\bm{p}^{\mathcal{M}})$ (Theorem 3.13). Our idea is to construct a mapping which maps each $D\in\mathcal{D}(G_{D})$ to a subset of $\mathcal{D}(G^{\mathcal{M}})$ and satisfies the following:

  • (a) distinct proper wdags of $G_{D}$ are mapped to disjoint subsets of $\mathcal{D}(G^{\mathcal{M}})$; and

  • (b) for each $D\in\mathcal{D}(G_{D})$, the bound in Lemma 3.4 is upper bounded by the sum of the weights of proper wdags over the subset that $D$ is mapped to.

We present such a mapping in Definition 3.11. Conditions (a) and (b) are verified in Theorem 3.12 and Theorem 3.13 respectively.

The idea of constructing a mapping between the wdags of two dependency graphs may be of independent interest, and may be applied elsewhere when one wishes to establish properties of Shearer’s bound.

1.2.2. Proof overview of Theorem 4.1

The proof of Theorem 4.1 mainly consists of two parts. First, we show that there is an elementary event set which approximately achieves the minimum amount of intersection between dependent events (Lemma 4.2). Here, we call an event $A_{i}\in\mathcal{A}$ elementary if for each variable $X_{j}\in\mathrm{vbl}(A_{i})$ there is a subset $S_{j}^{i}$ of the domain of $X_{j}$ such that $A_{i}$ happens if and only if $X_{j}\in S_{j}^{i}$ for all $X_{j}\in\mathrm{vbl}(A_{i})$. We call a set $\mathcal{A}$ of events elementary if every $A_{i}\in\mathcal{A}$ is elementary. Then, for elementary event sets, by applying the AM-GM inequality, we obtain a lower bound on the total amount of overlap on common variables, which further implies a lower bound on the amount of intersection between dependent events (Lemma 4.5).

1.3. Related works

Beck proposed the first constructive LLL, which provides efficient algorithms for finding a perfect object avoiding all “bad” events [7]. His methods were refined and improved by a long line of research [6, 53, 13, 39]. In a groundbreaking work, Moser and Tardos proposed a new algorithm, i.e., Algorithm 1, and proved that it finds such a perfect object under the condition in Theorem 1.1 in the variable setting [54]. Pegden [55] proved that the MT algorithm converges efficiently even under the condition of the cluster expansion local lemma [9]. Kolipaka and Szegedy [45] pushed the efficient region to Shearer’s bound. The phenomenon that the MT algorithm can still be efficient beyond Shearer’s bound was known to exist for sporadic and toy examples [32]. However, that result employs the special structures of the examples and only applies to some restricted variable-generated event systems $\mathcal{A}\sim(G_{D},\bm{p})$. By contrast, the results in this work apply to all variable-generated event systems.

Besides the line of research exploring the efficient region of the MT algorithm, there is a large amount of effort devoted to derandomizing or parallelizing the MT algorithm [54, 11, 34, 8, 23, 12, 35, 33] and to extending the Moser-Tardos techniques beyond the variable setting [38, 2, 41, 5, 3, 52, 42, 4].

There is a line of works studying the gap between non-constructive VLLL and Shearer’s bound [45, 36, 24, 37]. Kolipaka and Szegedy [45] obtained the first example of such a gap, where the canonical dependency graph is a cycle of length 4. The paper [36] showed that Shearer’s bound is not tight for VLLL; more precisely, Shearer’s bound is tight for non-constructive VLLL if and only if the canonical dependency graph is chordal. The first paper to study the gaps quantitatively and systematically is [37], which provides lower bounds on the gap when the canonical dependency graph contains many chordless cycles.

Erdős and Spencer [15] introduced the lopsided LLL, which extends the results in [14] to lopsidependency graphs. The lopsided LLL has many interesting applications in combinatorics and theoretical computer science, such as $k$-SAT [31], random permutations [47], Hamiltonian cycles [1], and matchings on the complete graph [48]. Shearer’s bound is also the tight condition for the lopsided LLL [56].

The LLL has a strong connection to sampling. Guo, Jerrum and Liu [27] proved that the MT algorithm indeed uniformly samples a perfect object if the instance is extremal. For extremal instances, they developed an algorithm called “partial rejection sampling” which resamples in a parallel fashion, since the occurring bad events form an independent set in the dependency graph. In fact, a series of sampling algorithms for specific problems can be seen as this parallel resampling algorithm running in the extremal case [27, 26, 22, 25]. In a celebrated work, Moitra [51] introduced a novel approach that utilizes the LLL to sample $k$-CNF solutions. This approach was then extended by several works [29, 21, 16, 17, 43, 44].

1.4. Organization of the paper.

In Section 2, we recall and introduce some definitions and notations. In Section 3, we prove Theorem 1.6. Section 4 is devoted to the proof of Theorem 4.1, which gives a lower bound on the amount of intersection between dependent events. In Section 5, we prove Theorem 1.5. In Section 6, we provide an explicit lower bound on the gaps between the efficient region of the MT algorithm and Shearer’s bound on periodic Euclidean graphs.

2. Preliminaries

Let $\mathbb{N}=\{0,1,2,\cdots\}$ denote the set of non-negative integers, and let $\mathbb{N}^{+}=\{1,2,\cdots\}$ denote the set of positive integers. For $m\in\mathbb{N}^{+}$, we define $[m]=\{1,\cdots,m\}$. Throughout this section, we fix a canonical dependency graph $G_{D}=([m],E_{D})$.

2.1. Witness DAG

If, for a given run, the MT algorithm picks the events $A_{s_{1}},A_{s_{2}},\ldots,A_{s_{T}}$ for resampling in this order, we say that $\bm{s}=s_{1},s_{2},\ldots,s_{T}$ is a resample sequence. If the algorithm never finishes, the resample sequence is infinite, and in this case we set $T=\infty$.

Definition 2.1 (Witness DAG).

We define a witness DAG (abbreviated wdag) of $G_{D}$ to be a DAG $D$ in which each node $v$ has a label $L(v)$ from $[m]$, and which satisfies the additional condition that for all distinct nodes $v,v^{\prime}\in D$ there is an arc between $v$ and $v^{\prime}$ (in either direction) if and only if $L(v)=L(v^{\prime})$ or $\big(L(v),L(v^{\prime})\big)\in E_{D}$.

We say $D$ is a proper wdag (abbreviated pwdag) if $D$ has only one sink node. Let $\mathcal{D}(G_{D})$ denote the set of pwdags of $G_{D}$.

Given a resampling sequence $\bm{s}=s_{1},s_{2},\ldots,s_{T}$, we associate a wdag $D_{\bm{s}}$ on the node set $\{v_{1},\ldots,v_{T}\}$ such that (i) $L(v_{k})=s_{k}$ and (ii) $v_{k}\rightarrow v_{\ell}$ with $k<\ell$ is an arc of $D_{\bm{s}}$ if and only if either $s_{k}=s_{\ell}$ or $(s_{k},s_{\ell})\in E_{D}$. See Figure 1 for an example of $D_{\bm{s}}$.

Given a wdag $D$ and a set $U$ of nodes of $D$, we define $D(U)$ to be the subgraph induced by all nodes which have a directed path to some $u\in U$. Note that $D(U)$ is also a wdag. We say that $H$ is a prefix of $D$, denoted by $H\unlhd D$, if $H=D(U)$ for some node set $U$.
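Computationally, $D(U)$ is just a reverse-reachability computation; the following Python sketch (ours, with illustrative names) returns the node set of the prefix:

def prefix(nodes, arcs, U):
    # nodes: set of node ids; arcs: set of (u, v) pairs; U: subset of nodes.
    # Returns the node set of D(U) (U itself included, via trivial paths);
    # the arcs of D(U) are those of D inside the returned set.
    rev = {v: set() for v in nodes}
    for (u, v) in arcs:
        rev[v].add(u)                    # reverse adjacency
    reach, stack = set(U), list(U)
    while stack:                         # DFS backwards from U
        v = stack.pop()
        for u in rev[v]:
            if u not in reach:
                reach.add(u)
                stack.append(u)
    return reach

arcs = {(1, 2), (2, 4), (3, 4)}
print(prefix({1, 2, 3, 4}, arcs, {2}))   # {1, 2}: nodes with a path to node 2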

Definition 2.2.

We say a wdag $D$ occurs in a resampling sequence $\bm{s}$ if $D\unlhd D_{\bm{s}}$. Let $\chi_{D}$ be the indicator variable of the event that $D$ occurs in $\bm{s}$.

Similar to Lemma 12 in [45], we have $T=\sum_{D\in\mathcal{D}(G_{D})}\chi_{D}$. For $i\in[m]$ and $r\in\mathbb{N}^{+}$, define $\mathcal{D}(i,r)$ to be the set of pwdags whose unique sink node is labelled with $i$ and in which there are exactly $r$ nodes labelled with $i$. Let $\chi_{\mathcal{D}(i,r)}$ be the indicator variable of the event that some $D\in\mathcal{D}(i,r)$ occurs in $\bm{s}$. It is easy to see that at most one pwdag in $\mathcal{D}(i,r)$ can occur in $\bm{s}$. Thus $\chi_{\mathcal{D}(i,r)}=\sum_{D\in\mathcal{D}(i,r)}\chi_{D}$, which further implies:

Fact 2.3.

$T=\sum_{i\in[m]}\sum_{r\in\mathbb{N}^{+}}\chi_{\mathcal{D}(i,r)}$.

2.2. Reversible arc

In the rest of this section, we fix a matching $\mathcal{M}\subseteq E_{D}$ of $G_{D}$. Given $i\in[m]$, with a slight abuse of notation, we sometimes say $i\in\mathcal{M}$ if there is some $i^{\prime}\in[m]$ such that $(i,i^{\prime})\in\mathcal{M}$.

Definition 2.4 (Reversibility).

We say that an arc $u\rightarrow v$ is reversible in a wdag $D$ if the directed graph obtained from $D$ by reversing the direction of the arc is still a DAG.

Furthermore, we say that $u\rightarrow v$ is $\mathcal{M}$-reversible in $D$ if $u\rightarrow v$ is reversible in $D$ and $(L(u),L(v))\in\mathcal{M}$.

By definition, we have the following two observations.

Fact 2.5.

$u\rightarrow v$ is reversible in $D$ if and only if it is the unique directed path from $u$ to $v$ in $D$.

Fact 2.6.

If $u\rightarrow v$ is reversible in a wdag $D$ of $G_{D}$, then the directed graph obtained from $D$ by reversing the direction of $u\rightarrow v$ is also a wdag of $G_{D}$.
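Fact 2.5 also gives a direct way to test reversibility: $v$ must become unreachable from $u$ once the arc itself is deleted. A minimal Python sketch (ours, with illustrative names):

def is_reversible(arcs, u, v):
    # Tests whether the arc u -> v is reversible, i.e. whether u -> v is the
    # unique directed path from u to v (Fact 2.5).
    assert (u, v) in arcs
    out = {}
    for (a, b) in set(arcs) - {(u, v)}:
        out.setdefault(a, set()).add(b)
    seen, stack = {u}, [u]
    while stack:                         # DFS from u avoiding the arc u -> v
        a = stack.pop()
        for b in out.get(a, ()):
            if b not in seen:
                seen.add(b)
                stack.append(b)
    return v not in seen                 # no other u -> v path: reversible

arcs = {(1, 2), (1, 3), (3, 2)}
print(is_reversible(arcs, 1, 2))         # False: 1 -> 3 -> 2 is another path
print(is_reversible(arcs, 1, 3))         # True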

Given a pwdag $D=(V,E,L)$, define

$$\mathcal{V}(D)\triangleq\{v\in V:\exists u\in V\text{ such that }u\rightarrow v\text{ or }v\rightarrow u\text{ is }\mathcal{M}\text{-reversible in }D\}$$

to be the set of nodes participating in $\mathcal{M}$-reversible arcs, and let $\overline{\mathcal{V}}(D)\triangleq V\setminus\mathcal{V}(D)$. For $i\in[m]$, define $\mathcal{V}(D,i)\triangleq\mathcal{V}(D)\cap\{v:L(v)=i\}$.

2.3. Other notations

Let $\bm{p}=(p_{1},\cdots,p_{m})\in(0,1]^{m}$ and $\bm{\delta}\in(0,1)^{\mathcal{M}}$ be two probability vectors. Recall that $\bm{p}^{-}=(p_{1}^{-},\cdots,p_{m}^{-})$ is defined as

(1) $$\forall i\in[m]:\quad p_{i}^{-}=\begin{cases}p_{i}-\frac{\delta_{i,i^{\prime}}^{2}}{17}&\text{if }(i,i^{\prime})\in\mathcal{M}\text{ for some }i^{\prime},\\ p_{i}&\text{otherwise}.\end{cases}$$

For each $i\in[m]$ such that $(i,i^{\prime})\in\mathcal{M}$ for some $i^{\prime}\in[m]$, define

$$c_{i}\triangleq\frac{\delta_{i,i^{\prime}}^{2}}{8p_{i}p_{i^{\prime}}}\quad\quad\text{ and }\quad\quad p_{i}^{\prime}\triangleq p_{i}(1-c_{i})=p_{i}-\frac{\delta_{i,i^{\prime}}^{2}}{8p_{i^{\prime}}}.$$
Fact 2.7.

$p^{-}_{i}+p^{-}_{i^{\prime}}(p^{-}_{i}-p^{\prime}_{i})\geq p_{i}$ for each $(i,i^{\prime})\in\mathcal{M}$.
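Fact 2.7 is stated without proof here; the following verification sketch is ours, not from the paper, and assumes $\delta_{i,i^{\prime}}\leq\min\{p_{i},p_{i^{\prime}}\}<1$ (which holds for any realizable setting, since $\delta_{i,i^{\prime}}\leq\mathbb{P}(A_{i}\cap A_{i^{\prime}})$). Writing $\delta$ for $\delta_{i,i^{\prime}}$ and $p$ for $p_{i^{\prime}}$, and recalling $p^{-}_{i}=p_{i}-\delta^{2}/17$ and $p^{-}_{i}-p^{\prime}_{i}=\delta^{2}/(8p)-\delta^{2}/17=\delta^{2}(17-8p)/(136p)$, the claim reduces to $p^{-}_{i^{\prime}}(p^{-}_{i}-p^{\prime}_{i})\geq\delta^{2}/17$, and indeed
\begin{align*}
p^{-}_{i^{\prime}}\big(p^{-}_{i}-p^{\prime}_{i}\big)-\frac{\delta^{2}}{17} &= \Big(p-\frac{\delta^{2}}{17}\Big)\cdot\frac{\delta^{2}(17-8p)}{136\,p}-\frac{\delta^{2}}{17} = \frac{\delta^{2}}{2312\,p}\Big(17p(9-8p)-\delta^{2}(17-8p)\Big)\\
&\geq \frac{\delta^{2}}{2312\,p}\cdot p\,(136-128p)\;\geq\;0,
\end{align*}
where the first inequality uses $\delta^{2}\leq\delta\leq p$ together with $17-8p>0$.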

3. Proof of Theorem 1.6

The proof of Theorem 1.6 consists of two parts. First, we provide a tighter upper bound on the complexity of the MT algorithm (Section 3.1). Then, we show that this tighter upper bound converges if $\bm{p}^{-}$ is in the Shearer’s bound of $G_{D}$ (Section 3.2).

3.1. A tighter upper bound on the complexity of MT algorithm

In this subsection, we prove Theorem 3.7, which follows immediately from Lemma 3.1 and Lemma 3.4. We first recall and introduce some concepts and notations.

Resampling Table. One key analytical technique of Moser and Tardos [54] is to precompute the randomness in a resampling table $\bm{X}$. Specifically, we can assume (only for the analysis) that the MT algorithm has a preprocessing step where it draws an infinite number of independent samples $X_{j}^{1},X_{j}^{2},\cdots$ for each variable $X_{j}$. These independent samples create a table $\bm{X}=(X_{j}^{k})_{j\in[n],k\in\mathbb{N}^{+}}$, called the resampling table (see Figure 2). The MT algorithm takes the first column as the initial assignment of $X_{1},\cdots,X_{n}$. Then, when $X_{j}$ is to be resampled, the MT algorithm moves right in the row corresponding to $X_{j}$ and picks the next sample.
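The following Python sketch (ours; the lazy table and all names are illustrative assumptions) implements the MT algorithm driven by a resampling table, with a row pointer that moves right whenever a variable is resampled:

import random

class ResamplingTable:
    def __init__(self, sample, seed=0):
        self.rng = random.Random(seed)
        self.sample = sample      # draws one value of a variable
        self.rows = {}            # j -> samples X_j^1, X_j^2, ... (built lazily)
        self.ptr = {}             # j -> index of the current sample

    def current(self, j):
        row = self.rows.setdefault(j, [self.sample(self.rng)])
        return row[self.ptr.setdefault(j, 0)]

    def resample(self, j):
        self.ptr[j] += 1          # move right in row j
        if self.ptr[j] == len(self.rows[j]):
            self.rows[j].append(self.sample(self.rng))

def mt_with_table(n, events, table):
    x = {j: table.current(j) for j in range(n)}   # first column of the table
    while True:
        bad = [i for i, (vbl, holds) in enumerate(events) if holds(x)]
        if not bad:
            return x
        for j in events[bad[0]][0]:               # resample vbl(A_i)
            table.resample(j)
            x[j] = table.current(j)

Run on any instance in the (vbl, holds) format used earlier, e.g. mt_with_table(n, events, ResamplingTable(lambda r: r.random() < 0.5)); the returned assignment is distributed exactly as in Algorithm 1, since the table merely precomputes the randomness.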

Consistency with the resampling table. For a wdag $D$, a node $v$, and a variable $X_{j}\in\mathrm{vbl}(A_{L(v)})$, we define

$$\mathcal{L}(D,v,j)\triangleq|\{u:\text{there is a directed path from }u\text{ to }v\text{ in }D\text{ and }X_{j}\in\mathrm{vbl}(A_{L(u)})\}|+1.$$

Moreover, let $\bm{X}_{D,v}\triangleq\{X_{j}^{\mathcal{L}(D,v,j)}:X_{j}\in\mathrm{vbl}(A_{L(v)})\}$. We say that $D$ is consistent with $\bm{X}$, denoted by $D\sim\bm{X}$, if for each node $v$ in $D$, the event $A_{L(v)}$ holds on $\bm{X}_{D,v}$. Intuitively, if $D$ occurs, then $\bm{X}_{D,v}$ is the assignment of $\mathrm{vbl}(A_{L(v)})$ just before the time that the MT algorithm picks the event corresponding to $v$ to resample; hence $A_{L(v)}$ must hold on $\bm{X}_{D,v}$. We sometimes write $\mathcal{L}(v,j)$ and $\bm{X}_{v}$ instead of $\mathcal{L}(D,v,j)$ and $\bm{X}_{D,v}$, respectively, if $D$ is clear from the context. Besides, we use $\mathcal{D}(i,r)\sim\bm{X}$ to denote that there is some $D\in\mathcal{D}(i,r)$ such that $D\sim\bm{X}$.

Figure 2. The left is a resampling table with four variables $X_{1},\cdots,X_{4}$. The right is an auxiliary table, where $\mathcal{M}=\{(1,2),(3,4),(5,6),(7,8)\}$.

Auxiliary Table. We introduce another central concept in the proof of Theorem 3.7, called the auxiliary table, which is a table of independent fair coins. Specifically, for each pair $(i,i^{\prime})\in\mathcal{M}$, we draw an infinite number of independent fair coins $Y_{i,i^{\prime}}^{1},Y_{i,i^{\prime}}^{2},\cdots$, where $\mathbb{P}(Y_{i,i^{\prime}}^{k}=i)=\mathbb{P}(Y_{i,i^{\prime}}^{k}=i^{\prime})=1/2$. These independent coins form the auxiliary table $\bm{Y}=(Y_{i,i^{\prime}}^{k})_{(i,i^{\prime})\in\mathcal{M},k\in\mathbb{N}^{+}}$ (see Figure 2). The auxiliary table is used to encode the directions of $\mathcal{M}$-reversible arcs, according to which we partition the event $\bigvee_{D\in\mathcal{D}(i,r)}(D\sim\bm{X})$.

Consistency with the resampling table and the auxiliary table. We need some notations about reversible arcs. Suppose $D$ has a unique sink node $w$ and $u\rightarrow v$ is reversible in $D$. Let $D^{\prime}$ be the DAG obtained from $D$ by reversing the direction of $u\rightarrow v$. We define $\varphi(D,u,v)\triangleq D^{\prime}(\{w\})$; in other words, $\varphi(D,u,v)$ is the prefix of $D^{\prime}$ with unique sink node $w$. Given $(i,i^{\prime})\in\mathcal{M}$ and a pwdag $D$, let $\mathrm{List}(D,i,i^{\prime})$ denote the sequence listing all nodes in $D$ with label $i$ or $i^{\prime}$ in a topological order of $D$ (it is easy to see that $\mathrm{List}(D,i,i^{\prime})$ is well defined, i.e., all topological orderings of $D$ induce the same $\mathrm{List}(D,i,i^{\prime})$). Given a node $v$ in $D$ with $(L(v),i)\in\mathcal{M}$ (because $\mathcal{M}$ is a matching, there is at most one such $i$), we define

$$\lambda(v,D)\triangleq|\{u:(u\rightarrow v\text{ is in }D)\land(L(u)\in\{i,L(v)\})\}|+1$$

to be the order of $v$ in $\mathrm{List}(D,L(v),i)$. For simplicity of notation, we write $\lambda(v)$ instead of $\lambda(v,D)$ if $D$ is clear from the context.

Given a wdag $D$, we say an $\mathcal{M}$-reversible arc $u\rightarrow v$ is inconsistent with the auxiliary table $\bm{Y}$ if $Y_{L(u),L(v)}^{\lambda(u)}=L(v)$. We say $D$ is consistent with $(\bm{X},\bm{Y})$, denoted by $D\sim(\bm{X},\bm{Y})$, if (i) $D\sim\bm{X}$ and (ii) for any $\mathcal{M}$-reversible arc $u\rightarrow v$ inconsistent with $\bm{Y}$, $\varphi(D,u,v)\not\sim\bm{X}$. We say $\mathcal{D}(i,r)\sim(\bm{X},\bm{Y})$ if there is some $D\in\mathcal{D}(i,r)$ such that $D\sim(\bm{X},\bm{Y})$.

The intuition behind the notion of “consistency” is as follows. Suppose $u\rightarrow v$ is an $\mathcal{M}$-reversible arc in $D$, and both $D$ and $\varphi(D,u,v)$ are consistent with the resampling table; still, $D$ and $\varphi(D,u,v)$ cannot both occur. The auxiliary table decides to which of $D$ and $\varphi(D,u,v)$ we assign the event $(D\sim\bm{X})\wedge(\varphi(D,u,v)\sim\bm{X})$.

Lemma 3.1.

For each $i\in[m]$ and $r\in\mathbb{N}^{+}$, $\mathbb{P}_{\bm{X}}[\mathcal{D}(i,r)\sim\bm{X}]=\mathbb{P}_{\bm{X},\bm{Y}}[\mathcal{D}(i,r)\sim(\bm{X},\bm{Y})]$.

Proof.

Fix an arbitrary assignment $\bm{x}$ of $\bm{X}$ and an arbitrary assignment $\bm{y}$ of $\bm{Y}$. Suppose $\mathcal{D}(i,r)\sim\bm{x}$, i.e., there exists $D_{0}\in\mathcal{D}(i,r)$ such that $D_{0}\sim\bm{x}$. We will show that there must exist some $D\in\mathcal{D}(i,r)$ such that $D\sim(\bm{x},\bm{y})$. This will imply the conclusion immediately.

We apply the following procedure to find such a pwdag $D\in\mathcal{D}(i,r)$.

Initially, $k=0$;
while $\exists$ an $\mathcal{M}$-reversible arc $u_{k}\rightarrow v_{k}$ in $D_{k}$ inconsistent with $\bm{y}$ such that $\varphi(D_{k},u_{k},v_{k})\sim\bm{x}$ do
      let $D_{k+1}:=\varphi(D_{k},u_{k},v_{k})$ and $k:=k+1$;
Return $D_{k}$;

By induction on $k$, it is easy to check that $D_{k}\sim\bm{x}$ and $D_{k}\in\mathcal{D}(i,r)$ for each $k$. Furthermore, if the procedure terminates, then in the final wdag $D$, for every $\mathcal{M}$-reversible arc $u\rightarrow v$ inconsistent with $\bm{y}$, we have $\varphi(D,u,v)\not\sim\bm{x}$. So $D\sim(\bm{x},\bm{y})$. In the following, we show that the procedure always terminates, which finishes the proof.

Note that each $D_{k}$ has no more nodes than $D_{0}$, and that there are only finitely many wdags in $\mathcal{D}(i,r)$ with no more nodes than $D_{0}$; so it suffices to prove that each wdag appears at most once in the procedure.

By contradiction, assume $D_{j}=D_{k}$ for some $j<k$. Recall that $u_{j}\rightarrow v_{j}$ is reversible in $D_{j}$ and inconsistent with $\bm{y}$. So $y_{L(u_{j}),L(v_{j})}^{\lambda(v_{j},D_{j})-1}=y_{L(u_{j}),L(v_{j})}^{\lambda(u_{j},D_{j})}=L(v_{j})$.

Let $D_{\ell}$ be the last wdag in $D_{j+1},\cdots,D_{k}$ such that $\lambda(v_{j},D_{\ell})<\lambda(v_{j},D_{j})$. Observing that $\lambda(v_{j},D_{j+1})=\lambda(v_{j},D_{j})-1$, such a $D_{\ell}$ must exist. By $\lambda(v_{j},D_{k})=\lambda(v_{j},D_{j})$, we have $\lambda(v_{j},D_{\ell})=\lambda(v_{j},D_{j})-1$ and $\lambda(v_{j},D_{\ell+1})=\lambda(v_{j},D_{j})$. Therefore, $\lambda(v_{j},D_{\ell+1})=\lambda(v_{j},D_{\ell})+1$. Combining this with the fact that $u_{\ell}\rightarrow v_{\ell}$ is the inconsistent arc in $D_{\ell}$ which is reversed in $D_{\ell+1}$, we have $u_{\ell}=v_{j}$, $\{L(u_{j}),L(v_{j})\}=\{L(u_{\ell}),L(v_{\ell})\}\in\mathcal{M}$ and $y_{L(u_{\ell}),L(v_{\ell})}^{\lambda(u_{\ell},D_{\ell})}=L(v_{\ell})$. Thus $L(v_{\ell})=L(u_{j})$ and $y_{L(u_{\ell}),L(v_{\ell})}^{\lambda(u_{\ell},D_{\ell})}=L(u_{j})$. Note that $\lambda(v_{\ell},D_{\ell})=1+\lambda(u_{\ell},D_{\ell})=1+\lambda(v_{j},D_{\ell})$. Combining this with $\lambda(u_{j},D_{j})=\lambda(v_{j},D_{j})-1$, we have $\lambda(u_{\ell},D_{\ell})=\lambda(u_{j},D_{j})$, and hence $y_{L(u_{\ell}),L(v_{\ell})}^{\lambda(u_{j},D_{j})}=L(u_{j})$. This contradicts $y_{L(u_{j}),L(v_{j})}^{\lambda(u_{j},D_{j})}=L(v_{j})$.

The following two propositions will be used in the proof of Lemma 3.4. The first proposition is an easy observation, and the second is a direct application of the Cauchy-Schwarz inequality. For the sake of completeness, we present their proofs in the appendix.

Proposition 3.2.

Given any wdag $D$, there exists a set $\mathcal{P}$ of disjoint $\mathcal{M}$-reversible arcs (we say two arcs $u\rightarrow v$ and $u^{\prime}\rightarrow v^{\prime}$ are disjoint if their node sets are disjoint, i.e., $\{u,v\}\cap\{u^{\prime},v^{\prime}\}=\emptyset$) such that for each $i\in\mathcal{M}$,

$$|\{v:\exists u\text{ such that }u\rightarrow v\text{ or }v\rightarrow u\text{ is in }\mathcal{P}\}\cap\{v:L(v)=i\}|\geq\frac{1}{2}\cdot|\mathcal{V}(D,i)|.$$
Proposition 3.3.

Suppose $X,Y$ and $Z$ are three independent random variables, $A$ is an event determined by $\{X,Y\}$, and $A^{\prime}$ is an event determined by $\{Y,Z\}$. Let $X_{1},Y_{1},Y_{2},Z_{1}$ be four independent samples of $X,Y,Y,Z$, respectively. Then the following holds with probability at most $\mathbb{P}(A)\mathbb{P}(A^{\prime})-\mathbb{P}(A\cap A^{\prime})^{2}$:

  • $A$ is true on $(X_{1},Y_{1})$, $A^{\prime}$ is true on $(Y_{2},Z_{1})$, and

  • either $A$ is false on $(X_{1},Y_{2})$ or $A^{\prime}$ is false on $(Y_{1},Z_{1})$.
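A quick Monte-Carlo sanity check of Proposition 3.3 (ours; the toy events are illustrative assumptions): take $X,Y,Z$ to be uniform bits, $A=\{X=Y\}$ and $A^{\prime}=\{Y=Z\}$, so $\mathbb{P}(A)=\mathbb{P}(A^{\prime})=1/2$, $\mathbb{P}(A\cap A^{\prime})=1/4$, and the bound is $1/4-1/16=3/16$:

import random

def trial(rng):
    x1, y1, y2, z1 = (rng.randrange(2) for _ in range(4))
    holds_A  = lambda x, y: x == y        # A  determined by {X, Y}
    holds_A2 = lambda y, z: y == z        # A' determined by {Y, Z}
    return (holds_A(x1, y1) and holds_A2(y2, z1)    # A on (X1,Y1), A' on (Y2,Z1)
            and (not holds_A(x1, y2) or not holds_A2(y1, z1)))

rng = random.Random(0)
N = 200_000
freq = sum(trial(rng) for _ in range(N)) / N
print(f"empirical {freq:.4f} <= bound {3 / 16:.4f}")  # about 0.125 <= 0.1875

Here the exact probability of the described event is $1/8$, comfortably below the bound $3/16$.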

Now, we are ready to show Lemma 3.4.

Lemma 3.4.

For each pwdag $D$,

$$\mathbb{P}[D\sim(\bm{X},\bm{Y})]\leq\left(\prod_{v\in\overline{\mathcal{V}}(D)}p_{L(v)}\right)\left(\prod_{v\in\mathcal{V}(D)}p^{\prime}_{L(v)}\right).$$
Proof.

Let $\mathcal{P}$ be the set of disjoint $\mathcal{M}$-reversible arcs given by Proposition 3.2. Let $V(\mathcal{P})$ denote the set of nodes which appear in $\mathcal{P}$, and let $\overline{V(\mathcal{P})}$ consist of the other nodes. Proposition 3.2 says that for each $i\in\mathcal{M}$,

$$|V(\mathcal{P})\cap\{v:L(v)=i\}|\geq\frac{1}{2}\cdot|\mathcal{V}(D,i)|.$$

For each $v\in\overline{V(\mathcal{P})}$, let $B_{v}$ denote the event that $A_{L(v)}$ holds on $\bm{X}_{v}$. It is easy to see that $\mathbb{P}[B_{v}]=p_{L(v)}$. Besides,

Claim 3.5.

If $D\sim(\bm{X},\bm{Y})$, then $B_{v}$ holds for each $v\in\overline{V(\mathcal{P})}$.

Proof.

Note that $\bm{X}_{v}$ is the assignment of $\mathrm{vbl}(A_{L(v)})$ just before the time that the MT algorithm picks the event corresponding to $v$ to resample. The MT algorithm decides to pick $A_{L(v)}$ only if $A_{L(v)}$ holds. Hence $A_{L(v)}$ must hold on $\bm{X}_{v}$. ∎

Let $u\rightarrow v$ be an arc in $\mathcal{P}$, where $L(u)=i$ and $L(v)=i^{\prime}$. Then, by the definition of $\mathcal{P}$, $u\rightarrow v$ is reversible in $D$. Let $D^{\prime}$ be the wdag obtained by reversing the direction of $u\rightarrow v$ in $D$. Recalling the definition of $\bm{X}_{D^{\prime},v}$, one can verify that

$$\bm{X}_{D^{\prime},u}=\left\{X_{j}^{\mathcal{L}(v,j)}:X_{j}\in\mathrm{vbl}(A_{i})\cap\mathrm{vbl}(A_{i^{\prime}})\right\}\cup\left\{X_{j}^{\mathcal{L}(u,j)}:X_{j}\in\mathrm{vbl}(A_{i})\setminus\mathrm{vbl}(A_{i^{\prime}})\right\}$$

and

$$\bm{X}_{D^{\prime},v}=\left\{X_{j}^{\mathcal{L}(u,j)}:X_{j}\in\mathrm{vbl}(A_{i})\cap\mathrm{vbl}(A_{i^{\prime}})\right\}\cup\left\{X_{j}^{\mathcal{L}(v,j)}:X_{j}\in\mathrm{vbl}(A_{i^{\prime}})\setminus\mathrm{vbl}(A_{i})\right\}.$$

For simplicity, let $\lambda:=\lambda(u,D)$. We define $B_{u,v}$ to be the event that the following hold:

  • (a) $A_{i}$ holds on $\bm{X}_{u}$, and $A_{i^{\prime}}$ holds on $\bm{X}_{v}$;

  • (b) if $Y_{i,i^{\prime}}^{\lambda}=i^{\prime}$, then either $A_{i}$ is false on $\bm{X}_{D^{\prime},u}$ or $A_{i^{\prime}}$ is false on $\bm{X}_{D^{\prime},v}$.

Conditioned on $Y_{i,i^{\prime}}^{\lambda}=i$, $B_{u,v}$ happens with probability $p_{i}p_{i^{\prime}}$. Conditioned on $Y_{i,i^{\prime}}^{\lambda}=i^{\prime}$, by Proposition 3.3, $B_{u,v}$ happens with probability at most $p_{i}p_{i^{\prime}}-\delta^{2}_{i,i^{\prime}}$. Thus,

\begin{align*}
\mathbb{P}[B_{u,v}] &\leq \mathbb{P}[Y_{i,i^{\prime}}^{\lambda}=i]\,p_{i}p_{i^{\prime}}+\mathbb{P}[Y_{i,i^{\prime}}^{\lambda}=i^{\prime}]\left(p_{i}p_{i^{\prime}}-\delta^{2}_{i,i^{\prime}}\right)=\frac{1}{2}\cdot p_{i}p_{i^{\prime}}+\frac{1}{2}\cdot\left(p_{i}p_{i^{\prime}}-\delta^{2}_{i,i^{\prime}}\right)\\
&\leq p_{i}p_{i^{\prime}}(1-2c_{i})(1-2c_{i^{\prime}}).
\end{align*}
Claim 3.6.

If D(𝐗,𝐘)D\sim(\bm{X},\bm{Y}), then Bu,vB_{u,v} holds for each uvu\rightarrow v in 𝒫\mathcal{P}.

Proof.

Suppose D(𝑿,𝒀)D\sim(\bm{X},\bm{Y}). Similar to the argument in Claim 3.5, we can see that Item (a) holds. In the following, we show Item (b) holds.

By contradiction, assume Yi,iλ=iY_{i,i^{\prime}}^{\lambda}=i^{\prime}, AiA_{i} holds on 𝑿D,u\bm{X}_{D^{\prime},u}, and AiA_{i^{\prime}} holds on 𝑿D,v\bm{X}_{D^{\prime},v}. Then, we have uvu\rightarrow v in DD is inconsistent with 𝒀\bm{Y} and D𝑿D^{\prime}\sim\bm{X}. Thus, φ(D,u,v)𝑿\varphi(D,u,v)\sim\bm{X} since φ(D,u,v)\varphi(D,u,v) is a prefix of DD^{\prime}. By definition, we have D≁(𝑿,𝒀)D\not\sim(\bm{X},\bm{Y}), a contradiction. ∎

Since the events {Bv:vV(𝒫)¯}\{B_{v}:v\in\overline{V(\mathcal{P})}\} and {Bu,v:uv is in 𝒫}\{B_{u,v}:u\rightarrow v\text{ is in }\mathcal{P}\} depend on distinct entries of 𝑿\bm{X} and 𝒀\bm{Y}, they are mutually independent. Therefore,

[D(𝑿,𝒀)]\displaystyle\mathbb{P}\left[D\sim(\bm{X},\bm{Y})\right] [(wV(𝒫¯)Bw)(uv is in 𝒫Bu,v)]=(wV(𝒫)¯(Bw))(uv is in 𝒫(Bu,v))\displaystyle\leq\mathbb{P}\left[\left(\bigcap_{w\in\overline{V(\mathcal{P}})}B_{w}\right)\bigcap\left(\bigcap_{u\rightarrow v\text{ is in }\mathcal{P}}B_{u,v}\right)\right]=\left(\prod_{w\in\overline{V(\mathcal{P})}}\mathbb{P}(B_{w})\right)\left(\prod_{u\rightarrow v\text{ is in }\mathcal{P}}\mathbb{P}(B_{u,v})\right)
\displaystyle\leq\left(\prod_{w\in\overline{V(\mathcal{P})}}p_{L(w)}\right)\left(\prod_{u\rightarrow v\text{ is in }\mathcal{P}}p_{L(u)}p_{L(v)}\left(1-2c_{L(u)}\right)\left(1-2c_{L(v)}\right)\right)
\displaystyle=\left(\prod_{v\text{ in }D}p_{L(v)}\right)\cdot\left(\prod_{i\in[m]}\left(1-2c_{i}\right)^{|V(\mathcal{P})\cap\{v:L(v)=i\}|}\right)
(v in DpL(v))(i[m](12ci)|𝒱(i)|/2)(v in DpL(v))(i[m](1ci)|𝒱(i)|)\displaystyle\leq\left(\prod_{v\text{ in }D}p_{L(v)}\right)\cdot\left(\prod_{i\in[m]}\left(1-2c_{i}\right)^{|\mathcal{V}(i)|/2}\right)\leq\left(\prod_{v\text{ in }D}p_{L(v)}\right)\cdot\left(\prod_{i\in[m]}\left(1-c_{i}\right)^{|\mathcal{V}(i)|}\right)
=(v𝒱¯(D)pL(v))(v𝒱(D)pL(v)).\displaystyle=\left(\prod_{v\in\overline{\mathcal{V}}(D)}p_{L(v)}\right)\left(\prod_{v\in\mathcal{V}(D)}p^{\prime}_{L(v)}\right).

Now we are ready to prove the main theorem of this subsection.

Theorem 3.7.

𝔼[T]D𝒟(GD)(v𝒱¯(D)pL(v))(v𝒱(D)pL(v))\mathbb{E}[T]\leq\sum_{D\in\mathcal{D}(G_{D})}\big{(}\prod_{v\in\overline{\mathcal{V}}(D)}p_{L(v)}\big{)}\big{(}\prod_{v\in\mathcal{V}(D)}p^{\prime}_{L(v)}\big{)}.

Proof.

First, according to Lemmas 3.1 and  3.4,

[χ𝒟(i,r)]\displaystyle\mathbb{P}[\chi_{\mathcal{D}(i,r)}] [𝒟(i,r)𝑿]=[𝒟(i,r)(𝑿,𝒀)]D𝒟(i,r)[D(𝑿,𝒀)]\displaystyle\leq\mathbb{P}[\mathcal{D}(i,r)\sim\bm{X}]=\mathbb{P}[\mathcal{D}(i,r)\sim(\bm{X},\bm{Y})]\leq\sum_{D\in\mathcal{D}(i,r)}\mathbb{P}[D\sim(\bm{X},\bm{Y})]
D𝒟(i,r)(v𝒱¯(D)pL(v))(v𝒱(D)pL(v)).\displaystyle\leq\sum_{D\in\mathcal{D}(i,r)}\left(\prod_{v\in\overline{\mathcal{V}}(D)}p_{L(v)}\right)\left(\prod_{v\in\mathcal{V}(D)}p^{\prime}_{L(v)}\right).

Then, by Fact 2.3 and the above inequality, we have

𝔼[T]\displaystyle\mathbb{E}[T] =i[m]r+[χ𝒟(i,r)]i[m]r+D𝒟(i,r)(v𝒱¯(D)pL(v))(v𝒱(D)pL(v))\displaystyle=\sum_{i\in[m]}\sum_{r\in\mathbb{N}^{+}}\mathbb{P}[\chi_{\mathcal{D}(i,r)}]\leq\sum_{i\in[m]}\sum_{r\in\mathbb{N}^{+}}\sum_{D\in\mathcal{D}(i,r)}\left(\prod_{v\in\overline{\mathcal{V}}(D)}p_{L(v)}\right)\left(\prod_{v\in\mathcal{V}(D)}p^{\prime}_{L(v)}\right)
D𝒟(GD)(v𝒱¯(D)pL(v))(v𝒱(D)pL(v)).\displaystyle\leq\sum_{D\in\mathcal{D}(G_{D})}\left(\prod_{v\in\overline{\mathcal{V}}(D)}p_{L(v)}\right)\left(\prod_{v\in\mathcal{V}(D)}p^{\prime}_{L(v)}\right).

3.2. Mapping between wdags

In this section, we will prove Theorem 3.13, which provides an upper bound on 𝔼[T]\mathbb{E}[T] in terms of 𝒑\bm{p}^{-}.

Definition 3.8 (Homomorphic dependency graph).

Given a dependency graph GD=([m],ED)G_{D}=([m],E_{D}) and a matching \mathcal{M} of GDG_{D}, we define a graph G=(V,E)G^{\mathcal{M}}=(V^{\mathcal{M}},E^{\mathcal{M}}) homomorphic to GDG_{D} with respect to \mathcal{M} as follows.

  • V=[m]{i0,i1:(i0,i1)}{i0,i0,i1,i1:(i0,i1)}V^{\mathcal{M}}=[m]\setminus\{i_{0},i_{1}:(i_{0},i_{1})\in\mathcal{M}\}\cup\{i_{0}^{\uparrow},i_{0}^{\downarrow},i_{1}^{\uparrow},i_{1}^{\downarrow}:(i_{0},i_{1})\in\mathcal{M}\};

  • (i0,i1)ED\forall(i_{0},i_{1})\in E_{D}, every pair of vertices in {i0,i1,i0,i0,i1,i1}V\{i_{0},i_{1},i_{0}^{\uparrow},i_{0}^{\downarrow},i_{1}^{\uparrow},i_{1}^{\downarrow}\}\cap V^{\mathcal{M}} is connected in GG^{\mathcal{M}}.

Besides, we associate a probability vector 𝐩\bm{p}^{\mathcal{M}} with GG^{\mathcal{M}} as follows:

vV:pv={piif v=i for some i[m],pipiif v=i for some i[m],piotherwise, v=i for some i[m].\displaystyle\forall v\in V^{\mathcal{M}}:\quad p^{\mathcal{M}}_{v}=\begin{cases}p_{i}^{\prime}&\text{if }v=i^{\uparrow}\text{ for some }i\in[m],\\ p_{i}^{-}-p_{i}^{\prime}&\text{if }v=i^{\downarrow}\text{ for some }i\in[m],\\ p^{-}_{i}&\text{otherwise, }v=i\text{ for some }i\in[m].\end{cases}
Figure 3. (a) A dependency graph GDG_{D}; (b) the graph GG^{\mathcal{M}} when ={(2,3)}\mathcal{M}=\{(2,3)\}.

In fact, (GD,𝒑)(G_{D},\bm{p}^{-}) and (G,𝒑)(G^{\mathcal{M}},\bm{p}^{\mathcal{M}}) are essentially the same: suppose 𝒜(GD,𝒑)\mathcal{A}\sim(G_{D},\bm{p}^{-}); then for each vertex ii matched by \mathcal{M}, we view AiA_{i} as the union of two mutually exclusive events AiAiA_{i^{\uparrow}}\cup A_{i^{\downarrow}} whose probabilities are pip_{i}^{\prime} and p_{i}^{-}-p_{i}^{\prime} respectively. Such a representation of 𝒜\mathcal{A} falls into the setting (G,𝒑)(G^{\mathcal{M}},\bm{p}^{\mathcal{M}}).
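To make Definition 3.8 concrete, the following is a minimal sketch that builds (G^{\mathcal{M}},\bm{p}^{\mathcal{M}}); the vertex encoding (i, 'up') / (i, 'down') for i^{\uparrow} / i^{\downarrow} is our own, not the paper's notation.

```python
from itertools import combinations

def homomorphic_graph(m, E_D, M, p_minus, p_prime):
    """Sketch of Definition 3.8: build (G^M, p^M) from G_D = ([m], E_D),
    a matching M of G_D, and the vectors p^- and p'.  Each matched vertex
    i is split into (i, 'up') and (i, 'down'); an unmatched i stays (i, None)."""
    matched = {i for e in M for i in e}

    def copies(i):
        return [(i, s) for s in (('up', 'down') if i in matched else (None,))]

    V = [c for i in range(m) for c in copies(i)]
    E = set()
    for i0, i1 in E_D:
        # every pair of vertices among the copies of i0 and i1 is connected
        E.update(combinations(copies(i0) + copies(i1), 2))

    p = {}
    for i, s in V:
        p[(i, s)] = (p_prime[i] if s == 'up'
                     else p_minus[i] - p_prime[i] if s == 'down'
                     else p_minus[i])
    return V, E, p
```

The two copies of a matched vertex carry probabilities p_{i}^{\prime} and p_{i}^{-}-p_{i}^{\prime}, summing to p_{i}^{-}, which is exactly the splitting of AiA_{i} described above.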

We have the following proposition, whose proof can be found in the appendix.

Proposition 3.9.

D𝒟(G)v in DpL(v)=D𝒟(GD)v in DpL(v).\sum_{D^{\prime}\in\mathcal{D}(G^{\mathcal{M}})}\prod_{v^{\prime}\text{ in }D^{\prime}}p^{\mathcal{M}}_{L^{\prime}(v^{\prime})}=\sum_{D\in\mathcal{D}(G_{D})}\prod_{v\text{ in }D}p^{-}_{L(v)}.

Given a pwdag D=(V,E,L)D=(V,E,L), recall that 𝒱(D)\mathcal{V}(D) is the set of nodes of \mathcal{M}-reversible arcs in DD. Define (D){v:L(v)}\mathscr{M}(D)\triangleq\{v:L(v)\in\mathcal{M}\} to be the set of nodes vv in DD where L(v)L(v) is contained in an edge in \mathcal{M}. Obviously, 𝒱(D)(D)\mathcal{V}(D)\subseteq\mathscr{M}(D). For simplicity of notation, we omit DD from these notations when DD is clear from the context.

Given a pwdag D=(V,E,L)D=(V,E,L), we use 𝒮={𝒮1,𝒮2,𝒮3,𝒮4}\mathscr{S}=\{\mathscr{S}_{1},\mathscr{S}_{2},\mathscr{S}_{3},\mathscr{S}_{4}\} to represent a partition of (D)\mathscr{M}(D) where 𝒱𝒮1\mathcal{V}\subseteq\mathscr{S}_{1} (some of these four sets are possibly empty). Let ψ(D)\psi(D) denote the set consisting of all such partitions. The formal definition is as follows.

Definition 3.10 (Partition).

Given a pwdag D=(V,E,L)D=(V,E,L) of GDG_{D}, define

ψ(D){{𝒮1,𝒮2,𝒮3,𝒮4}:𝒱𝒮1 and =𝒮1𝒮2𝒮3𝒮4}.\psi(D)\triangleq\{\{\mathscr{S}_{1},\mathscr{S}_{2},\mathscr{S}_{3},\mathscr{S}_{4}\}:\mathcal{V}\subseteq\mathscr{S}_{1}\text{ and }\mathscr{M}=\mathscr{S}_{1}\sqcup\mathscr{S}_{2}\sqcup\mathscr{S}_{3}\sqcup\mathscr{S}_{4}\}.
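Unpacking Definition 3.10: a partition in ψ(D)\psi(D) is determined by assigning every node of \mathscr{M}\setminus\mathcal{V} to one of the four classes while \mathcal{V} is forced into \mathscr{S}_{1}, so |\psi(D)|=4^{|\mathscr{M}\setminus\mathcal{V}|}. A minimal enumeration sketch of ours (nodes modeled as hashable labels):

```python
from itertools import product

def psi(M_nodes, V_nodes):
    """Enumerate the partitions of Definition 3.10 (a sketch).  M_nodes is
    the node set M(D) and V_nodes ⊆ M_nodes is V(D); nodes of V_nodes are
    forced into S_1 and every other node chooses its class freely."""
    free = sorted(M_nodes - V_nodes)
    for choice in product(range(4), repeat=len(free)):
        S = [set(V_nodes), set(), set(), set()]
        for node, cls in zip(free, choice):
            S[cls].add(node)
        yield tuple(S)   # (S_1, S_2, S_3, S_4)

# e.g. with M(D) = {a, b, c} and V(D) = {a}: 4**2 = 16 partitions
assert sum(1 for _ in psi({'a', 'b', 'c'}, {'a'})) == 16
```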

Given a wdag DD, there may be two or more topological orderings of DD. We fix an arbitrary one and denote it by πD\pi_{D}. In the following, we define an injection hh from {(D,𝒮):D𝒟(GD),𝒮ψ(D)}\{(D,\mathscr{S}):D\in\mathcal{D}(G_{D}),\mathscr{S}\in\psi(D)\} to 𝒟(G)\mathcal{D}(G^{\mathcal{M}}).

Definition 3.11.

Given a pwdag DD and 𝒮ψ(D)\mathscr{S}\in\psi(D), define h(D,𝒮)h(D,\mathscr{S}) to be a directed graph D=(V,E,L)D^{\prime}=(V^{\prime},E^{\prime},L^{\prime}) constructed as follows.

Constructing VV^{\prime}. V=V1V2V^{\prime}=V_{1}^{\prime}\sqcup V_{2}^{\prime} where |V1|=|V||V_{1}^{\prime}|=|V| and |V2|=|𝒮3𝒮4||V_{2}^{\prime}|=|\mathscr{S}_{3}\cup\mathscr{S}_{4}|. For convenience of presentation, we fix two bijections f:VV1f:V\rightarrow V_{1}^{\prime} and f:𝒮3𝒮4V2f^{\ast}:\mathscr{S}_{3}\cup\mathscr{S}_{4}\rightarrow V_{2}^{\prime} to name nodes in VV^{\prime}. In order to distinguish between nodes in DD and those in DD^{\prime}, we will always use u,v,wu,v,w to represent the nodes of DD and u,v,wu^{\prime},v^{\prime},w^{\prime} to represent the nodes of DD^{\prime}. Given vVv^{\prime}\in V^{\prime}, we use g(v)g(v^{\prime}) to denote the unique node vVv\in V such that f(v)=vf(v)=v^{\prime} (if vV1v^{\prime}\in V_{1}^{\prime}) or f(v)=vf^{\ast}(v)=v^{\prime} (if vV2v^{\prime}\in V_{2}^{\prime}).

Description of LL^{\prime}. For each node vV1v^{\prime}\in V_{1}^{\prime}, where v=f(v)v^{\prime}=f(v),

(2) L(v)={(L(v)),if v𝒮1,(L(v)),if v𝒮2𝒮3𝒮4,L(v),otherwise, v.\displaystyle L^{\prime}(v^{\prime})=\begin{cases}(L(v))^{\uparrow},&\text{if }v\in\mathscr{S}_{1},\\ (L(v))^{\downarrow},&\text{if }v\in\mathscr{S}_{2}\cup\mathscr{S}_{3}\cup\mathscr{S}_{4},\\ L(v),&\text{otherwise, }v\not\in\mathscr{M}.\end{cases}

For each node vV2v^{\prime}\in V_{2}^{\prime}, assuming v𝒮3𝒮4v\in\mathscr{S}_{3}\cup\mathscr{S}_{4} is the node such that v=f(v)v^{\prime}=f^{\ast}(v) and i[m]i\in[m] is the node such that ((L(v),i)((L(v),i)\in\mathcal{M},

(3) L(v)={i,if v𝒮3,i,otherwise, v𝒮4.\displaystyle\quad L^{\prime}(v^{\prime})=\begin{cases}i^{\uparrow},&\text{if }v\in\mathscr{S}_{3},\\ i^{\downarrow},&\text{otherwise, }v\in\mathscr{S}_{4}.\end{cases}

Constructing EE^{\prime}. E=E1E2E^{\prime}=E_{1}^{\prime}\sqcup E_{2}^{\prime} where E1={f(v)f(v):v𝒮3𝒮4}E^{\prime}_{1}=\{f^{\ast}(v)\rightarrow f(v):v\in\mathscr{S}_{3}\cup\mathscr{S}_{4}\} and

E2={uv:((L(u)=L(v))((L(u),L(v))E))(g(u)g(v) in πD)}.E^{\prime}_{2}=\{u^{\prime}\rightarrow v^{\prime}:\big{(}(L^{\prime}(u^{\prime})=L^{\prime}(v^{\prime})\big{)}\lor\big{(}(L^{\prime}(u^{\prime}),L^{\prime}(v^{\prime}))\in E^{\mathcal{M}})\big{)}\land(g(u^{\prime})\prec g(v^{\prime})\text{ in }\pi_{D})\}.
Theorem 3.12.

h(,)h(\cdot,\cdot) is an injection from {(D,𝒮):D𝒟(GD),𝒮ψ(D)}\{(D,\mathscr{S}):D\in\mathcal{D}(G_{D}),\mathscr{S}\in\psi(D)\} to 𝒟(G)\mathcal{D}(G^{\mathcal{M}}).

The proof of Theorem 3.12 is in the appendix. Now we can prove the main theorem of this subsection.

Theorem 3.13.

D𝒟(GD)(v𝒱¯(D)pL(v))(v𝒱(D)pL(v))D𝒟(GD)v in DpL(v).\sum_{D\in\mathcal{D}(G_{D})}\left(\prod_{v\in\overline{\mathcal{V}}(D)}p_{L(v)}\right)\left(\prod_{v\in\mathcal{V}(D)}p^{\prime}_{L(v)}\right)\leq\sum_{D\in\mathcal{D}(G_{D})}\prod_{v\text{ in }D}p^{-}_{L(v)}.

Proof.

For each i[m]i\in[m] where (i,j)(i,j)\in\mathcal{M}, let

qi1pi,qi2pipi,qi3(pipi)pj,and qi4(pipi)(pjpj).\begin{array}[]{llll}q^{1}_{i}\triangleq p^{\prime}_{i},\quad q^{2}_{i}\triangleq p^{-}_{i}-p^{\prime}_{i},\quad q^{3}_{i}\triangleq(p^{-}_{i}-p^{\prime}_{i})p^{\prime}_{j},\quad\text{and }\quad q^{4}_{i}\triangleq(p^{-}_{i}-p^{\prime}_{i})(p^{-}_{j}-p^{\prime}_{j}).\end{array}

According to Fact 2.7, qi1+qi2+qi3+qi4=pi+pj(pipi)piq^{1}_{i}+q^{2}_{i}+q^{3}_{i}+q^{4}_{i}=p^{-}_{i}+p^{-}_{j}(p^{-}_{i}-p^{\prime}_{i})\geq p_{i}. (Indeed, q^{1}_{i}+q^{2}_{i}=p^{-}_{i} and q^{3}_{i}+q^{4}_{i}=(p^{-}_{i}-p^{\prime}_{i})\big{(}p^{\prime}_{j}+(p^{-}_{j}-p^{\prime}_{j})\big{)}=(p^{-}_{i}-p^{\prime}_{i})p^{-}_{j}; the inequality is Fact 2.7.)

Given D=(V,E,L)𝒟(GD)D=(V,E,L)\in\mathcal{D}(G_{D}) and 𝒮ψ(D)\mathscr{S}\in\psi(D), let D=h(D,𝒮)D^{\prime}=h(D,\mathscr{S}). For each vv in DD where (L(v),j)(L(v),j)\in\mathcal{M} for some j[m]j\in[m], according to the definition of 𝒑\bm{p}^{\mathcal{M}}, (2), and (3), we have that

  • if v𝒮1v\in\mathscr{S}_{1}, then pL(f(v))=pL(v)=qL(v)1p^{\mathcal{M}}_{L^{\prime}(f(v))}=p^{\prime}_{L(v)}=q^{1}_{L(v)};

  • if v𝒮2v\in\mathscr{S}_{2}, then pL(f(v))=pL(v)pL(v)=qL(v)2p^{\mathcal{M}}_{L^{\prime}(f(v))}=p^{-}_{L(v)}-p^{\prime}_{L(v)}=q^{2}_{L(v)};

  • if v𝒮3v\in\mathscr{S}_{3}, then pL(f(v))pL(f(v))=(pL(v)pL(v))pj=qL(v)3p^{\mathcal{M}}_{L^{\prime}(f(v))}\cdot p^{\mathcal{M}}_{L^{\prime}(f^{\ast}(v))}=(p^{-}_{L(v)}-p^{\prime}_{L(v)})p^{\prime}_{j}=q^{3}_{L(v)};

  • if v𝒮4v\in\mathscr{S}_{4}, then pL(f(v))pL(f(v))=(pL(v)pL(v))(pjpj)=qL(v)4p^{\mathcal{M}}_{L^{\prime}(f(v))}\cdot p^{\mathcal{M}}_{L^{\prime}(f^{\ast}(v))}=(p^{-}_{L(v)}-p^{\prime}_{L(v)})(p^{-}_{j}-p^{\prime}_{j})=q^{4}_{L(v)}.

Moreover, for each vV(D)=𝒱(D)¯(D)v\in V\setminus\mathscr{M}(D)=\overline{\mathcal{V}(D)}\setminus\mathscr{M}(D), we have pL(f(v))=pL(v)p^{\mathcal{M}}_{L^{\prime}(f(v))}=p_{L(v)}. Thus, for each 𝒮ψ(D)\mathscr{S}\in\psi(D),

v in h(D,𝒮)pL(v)\displaystyle\prod_{v^{\prime}\text{ in }h(D,\mathscr{S})}p^{\mathcal{M}}_{L^{\prime}(v^{\prime})} =v𝒱¯pL(v)v𝒮1qL(v)1v𝒮2qL(v)2v𝒮3qL(v)3v𝒮4qL(v)4\displaystyle=\prod_{v\in\overline{\mathcal{V}}\setminus\mathscr{M}}p_{L(v)}\prod_{v\in\mathscr{S}_{1}}q^{1}_{L(v)}\prod_{v\in\mathscr{S}_{2}}q^{2}_{L(v)}\prod_{v\in\mathscr{S}_{3}}q^{3}_{L(v)}\prod_{v\in\mathscr{S}_{4}}q^{4}_{L(v)}
=v𝒱¯pL(v)v𝒱pL(v)v𝒮1𝒱qL(v)1v𝒮2qL(v)2v𝒮3qL(v)3v𝒮4qL(v)4.\displaystyle=\prod_{v\in\overline{\mathcal{V}}\setminus\mathscr{M}}p_{L(v)}\prod_{v\in\mathcal{V}}p^{\prime}_{L(v)}\prod_{v\in\mathscr{S}_{1}\setminus\mathcal{V}}q^{1}_{L(v)}\prod_{v\in\mathscr{S}_{2}}q^{2}_{L(v)}\prod_{v\in\mathscr{S}_{3}}q^{3}_{L(v)}\prod_{v\in\mathscr{S}_{4}}q^{4}_{L(v)}.

So

𝒮ψ(D)v in h(D,𝒮)pL(v)\displaystyle\sum_{\mathscr{S}\in\psi(D)}\prod_{v^{\prime}\text{ in }h(D,\mathscr{S})}p^{\mathcal{M}}_{L^{\prime}(v^{\prime})} =𝒮ψ(D)v𝒱¯pL(v)v𝒱pL(v)v𝒮1𝒱qL(v)1v𝒮2qL(v)2v𝒮3qL(v)3v𝒮4qL(v)4\displaystyle=\sum_{\mathscr{S}\in\psi(D)}\prod_{v\in\overline{\mathcal{V}}\setminus\mathscr{M}}p_{L(v)}\prod_{v\in\mathcal{V}}p^{\prime}_{L(v)}\prod_{v\in\mathscr{S}_{1}\setminus\mathcal{V}}q^{1}_{L(v)}\prod_{v\in\mathscr{S}_{2}}q^{2}_{L(v)}\prod_{v\in\mathscr{S}_{3}}q^{3}_{L(v)}\prod_{v\in\mathscr{S}_{4}}q^{4}_{L(v)}
=v𝒱¯pL(v)v𝒱pL(v)𝒮ψ(D)v𝒮1𝒱qL(v)1v𝒮2qL(v)2v𝒮3qL(v)3v𝒮4qL(v)4\displaystyle=\prod_{v\in\overline{\mathcal{V}}\setminus\mathscr{M}}p_{L(v)}\prod_{v\in\mathcal{V}}p^{\prime}_{L(v)}\sum_{\mathscr{S}\in\psi(D)}\prod_{v\in\mathscr{S}_{1}\setminus\mathcal{V}}q^{1}_{L(v)}\prod_{v\in\mathscr{S}_{2}}q^{2}_{L(v)}\prod_{v\in\mathscr{S}_{3}}q^{3}_{L(v)}\prod_{v\in\mathscr{S}_{4}}q^{4}_{L(v)}
=v𝒱¯pL(v)v𝒱pL(v)v𝒱(qL(v)1+qL(v)2+qL(v)3+qL(v)4)\displaystyle=\prod_{v\in\overline{\mathcal{V}}\setminus\mathscr{M}}p_{L(v)}\prod_{v\in\mathcal{V}}p^{\prime}_{L(v)}\prod_{v\in\mathscr{M}\setminus\mathcal{V}}\left(q^{1}_{L(v)}+q^{2}_{L(v)}+q^{3}_{L(v)}+q^{4}_{L(v)}\right)
v𝒱¯pL(v)v𝒱pL(v)v𝒱pL(v)\displaystyle\geq\prod_{v\in\overline{\mathcal{V}}\setminus\mathscr{M}}p_{L(v)}\prod_{v\in\mathcal{V}}p^{\prime}_{L(v)}\prod_{v\in\mathscr{M}\setminus\mathcal{V}}p_{L(v)}
=v𝒱¯pL(v)v𝒱pL(v),\displaystyle=\prod_{v\in\overline{\mathcal{V}}}p_{L(v)}\prod_{v\in\mathcal{V}}p^{\prime}_{L(v)},

where the third equality is according to the definition of ψ(D)\psi(D). Finally,

\displaystyle\sum_{D\in\mathcal{D}(G_{D})}\left(\prod_{v\in\overline{\mathcal{V}}(D)}p_{L(v)}\right)\left(\prod_{v\in\mathcal{V}(D)}p^{\prime}_{L(v)}\right)\leq\sum_{D\in\mathcal{D}(G_{D})}\sum_{\mathscr{S}\in\psi(D)}\prod_{v^{\prime}\text{ in }h(D,\mathscr{S})}p^{\mathcal{M}}_{L^{\prime}(v^{\prime})}\leq\sum_{D^{\prime}\in\mathcal{D}(G^{\mathcal{M}})}\prod_{v^{\prime}\text{ in }D^{\prime}}p^{\mathcal{M}}_{L^{\prime}(v^{\prime})}
=D𝒟(GD)v in DpL(v),\displaystyle=\sum_{D\in\mathcal{D}(G_{D})}\prod_{v\text{ in }D}p^{-}_{L(v)},

where the second inequality is due to Theorem 3.12 and the equality is by Proposition 3.9. ∎

3.3. Putting all things together

The following lemma is implicitly proved in [45].

Lemma 3.14 ([45]).

For any undirected graph GD=([m],ED)G_{D}=([m],E_{D}) and probability vector 𝐩a(GD)/(1+ε)\bm{p}\in\mathcal{I}_{a}(G_{D})/(1+\varepsilon), i[m]q{i}(GD,𝐩)q(GD,𝐩)m/ε\sum_{i\in[m]}\frac{q_{\{i\}}(G_{D},\bm{p})}{q_{\emptyset}(G_{D},\bm{p})}\leq m/\varepsilon.

Theorem 1.6 (restated).

For any 𝒜(GD,𝐩,,𝛅)\mathcal{A}\sim(G_{D},\bm{p},\mathcal{M},\bm{\delta}), if (1+ε)𝐩a(GD)(1+\varepsilon)\cdot\bm{p}^{-}\in\mathcal{I}_{a}(G_{D}), then the expected number of resampling steps performed by the MT algorithm is at most m/εm/\varepsilon, where mm is the number of events in 𝒜\mathcal{A}.

Proof.

Fix any such 𝒜\mathcal{A}. We have that

𝔼[T]D𝒟(GD)v in DpL(v)i[m]q{i}(GD,𝒑)q(GD,𝒑)mε,\mathbb{E}[T]\leq\sum_{D\in\mathcal{D}(G_{D})}\prod_{v\text{ in }D}p^{-}_{L(v)}\leq\frac{\sum_{i\in[m]}q_{\{i\}}(G_{D},\bm{p}^{-})}{q_{\emptyset}(G_{D},\bm{p}^{-})}\leq\frac{m}{\varepsilon},

where the first inequality is by Theorems 3.7 and 3.13, the second inequality is due to Theorem 4 in [45], and the last inequality is according to Lemma 3.14. ∎

4. Lower bound on the amount of intersection

In order to explore how far beyond Shearer’s bound MT algorithm is still efficient in general, we provide a lower bound on the amount of intersection between dependent events for general instances (Theorem 4.1).

We first introduce some notation. Given a bipartite graph G_{B}=([m],[n],E_{B}), we call each vertex i\in[m] a left vertex and each vertex j\in[n] a right vertex. We call G_{B} linear if any two left vertices in [m] share at most one common neighbor in [n]. (The name is natural: G_{B} can be represented by a hypergraph in which each right vertex j becomes a node v_{j} and each left vertex i becomes a hyperedge e_{i}, with v_{j}\in e_{i} if and only if (i,j)\in E_{B}; a hypergraph is called linear if any two hyperedges share at most one node.) Let \Delta_{D}(G_{B}) denote the maximum degree of G_{D}(G_{B}), and \Delta_{B}(G_{B}) denote the maximum degree of the left vertices in G_{B}. If G_{B} is clear from the context, we may omit G_{B} from these notations. In addition, for a bipartite graph G=(L\subset[m],R,E) and a probability vector \bm{p}\in(0,1)^{m}, we define (note that \digamma(G,\bm{p}) may be negative)

ϝ(G,𝒑)(miniLpi)2(|iL𝒩G(i)|+iL|𝒩G(i)|pi1/|𝒩G(i)|)|L|ΔD(G)ΔB(G)2.\digamma(G,\bm{p})\triangleq\frac{\big{(}\min_{i\in L}p_{i}\big{)}^{2}\cdot\left(-|\cup_{i\in L}\mathcal{N}_{G}(i)|+\sum_{i\in L}|\mathcal{N}_{G}(i)|\cdot p_{i}^{1/|\mathcal{N}_{G}(i)|}\right)}{\sqrt{|L|}\cdot\Delta_{D}(G)\cdot\Delta_{B}(G)^{2}}.

and ϝ+(G,𝒑)max{ϝ(G,𝒑),0}\digamma^{+}(G,\bm{p})\triangleq\max\{\digamma(G,\bm{p}),0\}.

We use 𝒜(GB,𝒑)\mathcal{A}\sim(G_{B},\bm{p}) to denote that (i) GBG_{B} is an event-variable graph of 𝒜\mathcal{A} and (ii) the probability vector of 𝒜\mathcal{A} is 𝒑\bm{p}. Let ={(i1,i1),(i2,i2),}\mathcal{M}=\{(i_{1},i_{1}^{\prime}),(i_{2},i_{2}^{\prime}),\cdots\} be a matching of GD(GB)G_{D}(G_{B}), and 𝜹=(δi1,i1,δi2,i2,)(0,1)||\bm{\delta}=(\delta_{i_{1},i_{1}^{\prime}},\delta_{i_{2},i_{2}^{\prime}},\cdots)\in(0,1)^{|\mathcal{M}|} be another probability vector. We say that an event set 𝒜\mathcal{A} is of the setting (GB,𝒑,,𝜹)(G_{B},\bm{p},\mathcal{M},\bm{\delta}), and write 𝒜(GB,𝒑,,𝜹)\mathcal{A}\sim(G_{B},\bm{p},\mathcal{M},\bm{\delta}), if 𝒜(GB,𝒑)\mathcal{A}\sim(G_{B},\bm{p}) and (AiAi)δi,i\mathbb{P}(A_{i}\cap A_{i^{\prime}})\geq\delta_{i,i^{\prime}} for each pair (i,i)(i,i^{\prime})\in\mathcal{M}.

We call an event AA elementary if AA can be written as (Xi1Si1)(Xi2Si2)(XikSik)(X_{i_{1}}\in S_{i_{1}})\land(X_{i_{2}}\in S_{i_{2}})\land\cdots\land(X_{i_{k}}\in S_{i_{k}}) where Si1,,SikS_{i_{1}},\cdots,S_{i_{k}} are subsets of the domains of the variables. We call an event set 𝒜\mathcal{A} elementary if all events in 𝒜\mathcal{A} are elementary.

Theorem 4.1.

Let GB=([m],[n],EB)G_{B}=([m],[n],E_{B}) be a bipartite graph, 𝐩(0,1]m\bm{p}\in(0,1]^{m} be a probability vector, and L1,L2,,LtL_{1},L_{2},\cdots,L_{t} be a collection of disjoint subsets of [m][m]. For each k[t]k\in[t], let GkG_{k} denote the induced subgraph on Lk(iLk𝒩GB(i))L_{k}\cup\left(\cup_{i\in L_{k}}\mathcal{N}_{G_{B}}(i)\right) and EkE_{k} denote the edge set of GD(Gk)G_{D}(G_{k}). If all GkG_{k}’s are linear, then the following holds.

If 𝒜(GB,𝐩)\mathcal{A}\sim(G_{B},\bm{p}), then there is a matching \mathcal{M} of GD(GB)G_{D}(G_{B}) satisfying that (i,i)Ek(AiAi)2(ϝ+(Gk,𝐩))2\sum_{(i,i^{\prime})\in\mathcal{M}\cap E_{k}}\mathbb{P}(A_{i}\cap A_{i^{\prime}})^{2}\geq\left(\digamma^{+}(G_{k},\bm{p})\right)^{2} for any kk.

The proof of Theorem 4.1 mainly consists of two parts. First, we show that there is an elementary event set which approximately achieves the minimum amount of intersection between dependent events (Lemma 4.2). Then, for elementary event sets, by applying the AM–GM inequality, we obtain a lower bound on the total amount of overlap on common variables, which further implies a lower bound on the amount of intersection between dependent events (Lemma 4.5).

Lemma 4.2.

Let GB=([m],[n],EB)G_{B}=([m],[n],E_{B}) be a linear bipartite graph, EDE_{D} be the edge set of GD(GB)G_{D}(G_{B}), and 𝐩(0,1]m\bm{p}\in(0,1]^{m} be a probability vector. Let γ\gamma denote the minimum (i0,i1)ED[Ai0Ai1]\sum_{(i_{0},i_{1})\in E_{D}}\mathbb{P}[A_{i_{0}}\cap A_{i_{1}}] among all event sets 𝒜=(A1,,Am)(GB,𝐩)\mathcal{A}=(A_{1},\cdots,A_{m})\sim(G_{B},\bm{p}). Then there is an elementary event set 𝒜\mathcal{A}^{\prime} such that (i0,i1)ED[Ai0Ai1](ΔB(GB))2γ\sum_{(i_{0},i_{1})\in E_{D}}\mathbb{P}[A_{i_{0}}^{\prime}\cap A_{i_{1}}^{\prime}]\leq(\Delta_{B}(G_{B}))^{2}\cdot\gamma.

Proof.

For simplicity, we let ΔΔB(GB)\Delta\triangleq\Delta_{B}(G_{B}). Without loss of generality, we assume that each random variable XiX_{i} is uniformly distributed over [0,1][0,1]. Let 𝒜(GB,𝒑)\mathcal{A}\sim(G_{B},\bm{p}) be an event set where \sum_{(i_{0},i_{1})\in E_{D}}\mathbb{P}[A_{i_{0}}\cap A_{i_{1}}]=\gamma. We will replace AiA_{i} with an elementary AiA_{i}^{\prime} one by one for each i=1,2,,mi=1,2,\cdots,m, so that the resulting event set 𝒜\mathcal{A}^{\prime} satisfies (i0,i1)ED[Ai0Ai1]Δ2(i0,i1)ED[Ai0Ai1]=Δ2γ\sum_{(i_{0},i_{1})\in E_{D}}\mathbb{P}[A_{i_{0}}^{\prime}\cap A_{i_{1}}^{\prime}]\leq\Delta^{2}\cdot\sum_{(i_{0},i_{1})\in E_{D}}\mathbb{P}[A_{i_{0}}\cap A_{i_{1}}]=\Delta^{2}\cdot\gamma.

More precisely, fix i[m]i\in[m] and suppose A1,,Ai1A_{1},\cdots,A_{i-1} have been replaced with elementary events A1,,Ai1A_{1}^{\prime},\cdots,A_{i-1}^{\prime} respectively. For simplicity of notations, for any pair i0<i1i_{0}<i_{1}, we abbreviate [Ai0Ai1]\mathbb{P}[A_{i_{0}}\cap A_{i_{1}}], [Ai0Ai1]\mathbb{P}[A_{i_{0}}^{\prime}\cap A_{i_{1}}] and [Ai0Ai1]\mathbb{P}[A_{i_{0}}^{\prime}\cap A_{i_{1}}^{\prime}] to pi0,i1p_{i_{0},i_{1}}, pi0,i1p_{i_{0},i_{1}}^{\prime} and pi0,i1′′p_{i_{0},i_{1}}^{\prime\prime} respectively. Without loss of generality, we assume AiA_{i} depends on variables X1,X2,,XkX_{1},X_{2},\cdots,X_{k}. For every j[k]j\in[k], we define

Pj(xj):=i0<i,i0𝒩GB(j)1Δ[Ai0Xj=xj]+i0>i,i0𝒩GB(j)[Ai0Xj=xj].P_{j}(x_{j}):=\sum_{i_{0}<i,i_{0}\in\mathcal{N}_{G_{B}}(j)}\frac{1}{\Delta}\cdot\mathbb{P}[A_{i_{0}}^{\prime}\mid X_{j}=x_{j}]+\sum_{i_{0}>i,i_{0}\in\mathcal{N}_{G_{B}}(j)}\mathbb{P}[A_{i_{0}}\mid X_{j}=x_{j}].

for xj[0,1]x_{j}\in[0,1]. Without loss of generality, we assume Pj()P_{j}(\cdot) is non-decreasing. Let μ:[0,1]k{0,1}\mu:[0,1]^{k}\rightarrow\{0,1\} be the indicator of AiA_{i}, then

x1,,xkμ(x1,,xk)dx1dxk=[Ai],\displaystyle\int_{x_{1},\cdots,x_{k}}\mu(x_{1},\cdots,x_{k})\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}=\mathbb{P}[A_{i}],

For each j[k]j\in[k], let

μj(xj):=[AiXj=xj]=x1,,xj1,xj+1,,xkμ(x1,,xk)dx1dxj1dxj+1dxk.\mu_{j}(x_{j}):=\mathbb{P}[A_{i}\mid X_{j}=x_{j}]=\int_{x_{1},\cdots,x_{j-1},x_{j+1},\cdots,x_{k}}\mu(x_{1},\cdots,x_{k})\mathrm{d}x_{1}\cdots\mathrm{d}x_{j-1}\mathrm{d}x_{j+1}\cdots\mathrm{d}x_{k}.

Noticing that GBG_{B} is linear (i.e., any two events share at most one common variable), we have

(4) xjPj(xj)μj(xj)dxj=i0<i,i0𝒩GB(j)pi0,iΔ+i0>i,i0𝒩GB(j)pi0,i.\displaystyle\int_{x_{j}}P_{j}(x_{j})\mu_{j}(x_{j})\mathrm{d}x_{j}=\sum_{i_{0}<i,i_{0}\in\mathcal{N}_{G_{B}}(j)}\frac{p^{\prime}_{i_{0},i}}{\Delta}+\sum_{i_{0}>i,i_{0}\in\mathcal{N}_{G_{B}}(j)}p_{i_{0},i}.

Let AiA_{i}^{\prime} be the elementary event that happens if and only if (x1,,xk)[0,q1]××[0,qk](x_{1},\cdots,x_{k})\in[0,q_{1}]\times\cdots\times[0,q_{k}]. Here q1,,qkq_{1},\cdots,q_{k} are positive real numbers satisfying:

  • (i)

    Πj=1kqj=[Ai]\Pi_{j=1}^{k}q_{j}=\mathbb{P}[A_{i}]. That is, [Ai]=[Ai]\mathbb{P}[A_{i}^{\prime}]=\mathbb{P}[A_{i}];

  • (ii)

    x1q1μ1(x1)dx1=x2q2μ2(x2)dx2=xkqkμk(xk)dxk\int_{x_{1}\geq q_{1}}\mu_{1}(x_{1})\mathrm{d}x_{1}=\int_{x_{2}\geq q_{2}}\mu_{2}(x_{2})\mathrm{d}x_{2}\cdots=\int_{x_{k}\geq q_{k}}\mu_{k}(x_{k})\mathrm{d}x_{k}.

Claim 4.3.

Such {q1,,qk}\{q_{1},\cdots,q_{k}\} exists. Thus so does AiA_{i}^{\prime}.

Proof.

We prove a generalized statement in which Πj=1kqj\Pi_{j=1}^{k}q_{j} can be required to be an arbitrary number in [0,1][0,1]. Our proof is by induction on kk. The base case when k=1k=1 is trivial. Now we assume that for any preset q(0,1]q^{\prime}\in(0,1], there exist {q1,,qk1}\{q_{1},\cdots,q_{k-1}\} satisfying that

  • (i)

    Πj=1k1qj=q\Pi_{j=1}^{k-1}q_{j}=q^{\prime} and

  • (ii)

    x1q1μ1(x1)dx1==xk1qk1μk1(xk1)dxk1\int_{x_{1}\geq q_{1}}\mu_{1}(x_{1})\mathrm{d}x_{1}=\cdots=\int_{x_{k-1}\geq q_{k-1}}\mu_{k-1}(x_{k-1})\mathrm{d}x_{k-1}.

Let f(q)f(q^{\prime}) denote the minimum x1q1μ1(x1)dx1\int_{x_{1}\geq q_{1}}\mu_{1}(x_{1})\mathrm{d}x_{1} among all such {q1,,qk1}\{q_{1},\cdots,q_{k-1}\}’s. It is easy to see that f(1)=0f(1)=0 and ff is continuous and non-increasing.

Fix an arbitrary q[0,1]q\in[0,1]. We define g(q^{\prime\prime}):=\int_{x_{k}\geq q/q^{\prime\prime}}\mu_{k}(x_{k})\mathrm{d}x_{k} for q′′[q,1]q^{\prime\prime}\in[q,1]. Obviously, g(q)=0g(q)=0 and gg is continuous and non-decreasing. So there must exist a q[q,1]q^{\ast}\in[q,1] such that g(q)=f(q)g(q^{\ast})=f(q^{\ast}). Then let {q1,,qk1}\{q_{1}^{\ast},\cdots,q_{k-1}^{\ast}\} be a set of positive real numbers where

  • (i)

    Πj=1k1qj=q\Pi_{j=1}^{k-1}q_{j}^{\ast}=q^{\ast} and

  • (ii)

    f(q)=x1q1μ1(x1)dx1==xk1qk1μk1(xk1)dxk1f(q^{\ast})=\int_{x_{1}\geq q_{1}^{\ast}}\mu_{1}(x_{1})\mathrm{d}x_{1}=\cdots=\int_{x_{k-1}\geq q_{k-1}^{\ast}}\mu_{k-1}(x_{k-1})\mathrm{d}x_{k-1}.

Let qk=q/qq_{k}^{\ast}=q/q^{\ast}. It is obvious that Πj=1kqj=q\Pi_{j=1}^{k}q_{j}^{\ast}=q and f(q)=g(q)=xkqkμk(xk)dxkf(q^{\ast})=g(q^{\ast})=\int_{x_{k}\geq q_{k}^{\ast}}\mu_{k}(x_{k})\mathrm{d}x_{k}. This completes the induction step. ∎

Claim 4.4.

For every j[k]j\in[k], we have

i0<i,i0𝒩GB(j)pi0,i′′Δ+i0>i,i0𝒩GB(j)pi0,ii0<i,i0𝒩GB(j)pi0,i+Δi0>i,i0𝒩GB(j)pi0,i.\sum_{i_{0}<i,i_{0}\in\mathcal{N}_{G_{B}}(j)}\frac{p_{i_{0},i}^{\prime\prime}}{\Delta}+\sum_{i_{0}>i,i_{0}\in\mathcal{N}_{G_{B}}(j)}p_{i_{0},i}^{\prime}\leq\sum_{i_{0}<i,i_{0}\in\mathcal{N}_{G_{B}}(j)}p^{\prime}_{i_{0},i}+\Delta\cdot\sum_{i_{0}>i,i_{0}\in\mathcal{N}_{G_{B}}(j)}p_{i_{0},i}.
Proof.

Let μi,μii\mu_{i^{\prime}},\mu_{i\cap i^{\prime}}, and μii\mu_{i^{\prime}\setminus i} denote the indicator functions of the events AiA_{i}^{\prime}, AiAiA_{i}^{\prime}\cap A_{i}, and AiAiA_{i}^{\prime}\setminus A_{i} respectively. Since [Ai]=[Ai]\mathbb{P}[A_{i}^{\prime}]=\mathbb{P}[A_{i}],

x1q1μ1(x1)dx1++xkqkμk(xk)dxk[AiAi]=[AiAi]=x1,,xkμii(x1,,xk)dx1dxk.\int_{x_{1}\geq q_{1}}\mu_{1}(x_{1})\mathrm{d}x_{1}+\cdots+\int_{x_{k}\geq q_{k}}\mu_{k}(x_{k})\mathrm{d}x_{k}\geq\mathbb{P}[A_{i}\setminus A_{i}^{\prime}]=\mathbb{P}[A_{i}^{\prime}\setminus A_{i}]=\int_{x_{1},\cdots,x_{k}}\mu_{i^{\prime}\setminus i}(x_{1},\cdots,x_{k})\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}.

Fix j[k]j\in[k]. Since the kk tail integrals above are all equal by condition (ii), we have

xjqjμj(xj)dxj1kx1,x2,,xkμii(x1,,xk)dx1dxk.\int_{x_{j}\geq q_{j}}\mu_{j}(x_{j})\mathrm{d}x_{j}\geq\frac{1}{k}\cdot\int_{x_{1},x_{2},\cdots,x_{k}}\mu_{i^{\prime}\setminus i}(x_{1},\cdots,x_{k})\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}.

Since Pj(xj)P_{j}(x_{j}) is non-decreasing and kΔk\leq\Delta, we have

xjqjPj(xj)μj(xj)dxj1Δx1,x2,,xkPj(xj)μii(x1,,xk)dx1dxk.\int_{x_{j}\geq q_{j}}P_{j}(x_{j})\mu_{j}(x_{j})\mathrm{d}x_{j}\geq\frac{1}{\Delta}\cdot\int_{x_{1},x_{2},\cdots,x_{k}}P_{j}(x_{j})\mu_{i^{\prime}\setminus i}(x_{1},\cdots,x_{k})\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}.

According to Equation 4,

i0<i,i0𝒩GB(j)pi0,i+Δi0>i,i0𝒩GB(j)pi0,i=ΔxjPj(xj)μj(xj)dxj\displaystyle\sum_{i_{0}<i,i_{0}\in\mathcal{N}_{G_{B}}(j)}p^{\prime}_{i_{0},i}+\Delta\cdot\sum_{i_{0}>i,i_{0}\in\mathcal{N}_{G_{B}}(j)}p_{i_{0},i}=\Delta\cdot\int_{x_{j}}P_{j}(x_{j})\mu_{j}(x_{j})\mathrm{d}x_{j}
=\displaystyle= ΔxjqjPj(xj)μj(xj)dxj+Δxj<qjPj(xj)μj(xj)dxj\displaystyle\Delta\cdot\int_{x_{j}\geq q_{j}}P_{j}(x_{j})\mu_{j}(x_{j})\mathrm{d}x_{j}+\Delta\cdot\int_{x_{j}<q_{j}}P_{j}(x_{j})\mu_{j}(x_{j})\mathrm{d}x_{j}
\displaystyle\geq ΔxjqjPj(xj)μj(xj)dxj+Δx1,,xkPj(xj)μii(x1,,xk)dx1dxk\displaystyle\Delta\cdot\int_{x_{j}\geq q_{j}}P_{j}(x_{j})\mu_{j}(x_{j})\mathrm{d}x_{j}+\Delta\cdot\int_{x_{1},\cdots,x_{k}}P_{j}(x_{j})\mu_{i\cap i^{\prime}}(x_{1},\cdots,x_{k})\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}
\displaystyle\geq x1,,xkPj(xj)μii(x1,,xk)dx1dxk+x1,,xkPj(xj)μii(x1,,xk)dx1dxk\displaystyle\int_{x_{1},\cdots,x_{k}}P_{j}(x_{j})\mu_{i^{\prime}\setminus i}(x_{1},\cdots,x_{k})\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}+\int_{x_{1},\cdots,x_{k}}P_{j}(x_{j})\mu_{i\cap i^{\prime}}(x_{1},\cdots,x_{k})\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}
=\displaystyle= x1,,xkPj(xj)μi(x1,,xk)dx1dxk\displaystyle\int_{x_{1},\cdots,x_{k}}P_{j}(x_{j})\mu_{i^{\prime}}(x_{1},\cdots,x_{k})\mathrm{d}x_{1}\cdots\mathrm{d}x_{k}
=\displaystyle= i0<i,i0𝒩GB(j)[Ai0Ai]Δ+i0>i,i0𝒩GB(j)[Ai0Ai].\displaystyle\sum_{i_{0}<i,i_{0}\in\mathcal{N}_{G_{B}}(j)}\frac{\mathbb{P}[A_{i_{0}}^{\prime}\cap A_{i}^{\prime}]}{\Delta}+\sum_{i_{0}>i,i_{0}\in\mathcal{N}_{G_{B}}(j)}\mathbb{P}[A_{i_{0}}\cap A_{i}^{\prime}].

This completes the proof. ∎

From Claim 4.4, we have

(5) i0<i,i0𝒩GD(i)pi0,i′′Δ+i0>i,i0𝒩GD(i)pi0,ii0<i,i0𝒩GD(i)pi0,i+Δi0>i,i0𝒩GD(i)pi0,i,\displaystyle\sum_{i_{0}<i,i_{0}\in\mathcal{N}_{G_{D}}(i)}\frac{p_{i_{0},i}^{\prime\prime}}{\Delta}+\sum_{i_{0}>i,i_{0}\in\mathcal{N}_{G_{D}}(i)}p_{i_{0},i}^{\prime}\leq\sum_{i_{0}<i,i_{0}\in\mathcal{N}_{G_{D}}(i)}p^{\prime}_{i_{0},i}+\Delta\cdot\sum_{i_{0}>i,i_{0}\in\mathcal{N}_{G_{D}}(i)}p_{i_{0},i},

By summation over all i[m]i\in[m], we finish the proof:

(i0,i)EDpi0,i′′ΔΔ(i0,i)EDpi0,i.\sum_{(i_{0},i)\in E_{D}}\frac{p_{i_{0},i}^{\prime\prime}}{\Delta}\leq\Delta\cdot\sum_{(i_{0},i)\in E_{D}}p_{i_{0},i}.

Lemma 4.5.

Let GB=([m],[n],EB)G_{B}=([m],[n],E_{B}) be a linear bipartite graph and 𝐩\bm{p} be a probability vector. Then for any elementary 𝒜=(A1,,Am)(GB,𝐩)\mathcal{A}=(A_{1},\cdots,A_{m})\sim(G_{B},\bm{p}),

(i0,i1)ED(Ai0Ai1)mΔD(GB)ΔB(GB)2ϝ(GB,𝒑),\sum_{(i_{0},i_{1})\in E_{D}}\mathbb{P}\left(A_{i_{0}}\cap A_{i_{1}}\right)\geq\sqrt{m}\cdot\Delta_{D}(G_{B})\cdot\Delta_{B}(G_{B})^{2}\cdot\digamma(G_{B},\bm{p}),

where EDE_{D} is the edge set of GD(GB)G_{D}(G_{B}).

Proof.

For simplicity of notation, we let 𝒩\mathcal{N} stand for 𝒩GB\mathcal{N}_{G_{B}}. Without loss of generality, we assume that each variable XiX_{i} is uniformly distributed over [0,1][0,1]. As 𝒜\mathcal{A} is elementary, each AiA_{i} can be written as j𝒩(i)[XjSij]\bigwedge_{j\in\mathcal{N}(i)}[X_{j}\in S_{i}^{j}] where Sij[0,1]S_{i}^{j}\subset[0,1]. Let μ\mu be the Lebesgue measure.

On the one hand, according to the AM–GM inequality,

(6) i[m]j𝒩(i)μ(Sij)i[m]|𝒩(i)|(Πj𝒩(i)μ(Sij))1/|𝒩(i)|=i[m]|𝒩(i)|pi1/|𝒩(i)|.\displaystyle\sum_{i\in[m]}\sum_{j\in\mathcal{N}(i)}\mu(S^{j}_{i})\geq\sum_{i\in[m]}|\mathcal{N}(i)|\cdot\big{(}\Pi_{j\in\mathcal{N}(i)}\mu(S^{j}_{i})\big{)}^{1/|\mathcal{N}(i)|}=\sum_{i\in[m]}|\mathcal{N}(i)|\cdot p_{i}^{1/|\mathcal{N}(i)|}.

On the other hand,

(7) \displaystyle\sum_{i\in[m]}\sum_{j\in\mathcal{N}(i)}\mu(S^{j}_{i})=\sum_{j\in[n]}\sum_{i\in\mathcal{N}(j)}\mu(S^{j}_{i})\leq n+\sum_{j\in[n]}\sum_{i_{0}\neq i_{1}\in\mathcal{N}(j)}\mu\left(S^{j}_{i_{0}}\cap S^{j}_{i_{1}}\right)

By Inequalities (6) and (7), and noticing that GBG_{B} is linear, we have that

(8) (i0,i1)EDj𝒩(i0)𝒩(i1)μ(Si0jSi1j)=j[n]i0i1𝒩(j)μ(Si0jSi1j)(i[m]|𝒩(i)|pi1/|𝒩(i)|)n.\displaystyle\sum_{(i_{0},i_{1})\in E_{D}}\sum_{j\in\mathcal{N}(i_{0})\cap\mathcal{N}(i_{1})}\mu\left(S^{j}_{i_{0}}\cap S^{j}_{i_{1}}\right)=\sum_{j\in[n]}\sum_{i_{0}\neq i_{1}\in\mathcal{N}(j)}\mu\left(S^{j}_{i_{0}}\cap S^{j}_{i_{1}}\right)\geq\left(\sum_{i\in[m]}|\mathcal{N}(i)|\cdot p_{i}^{1/|\mathcal{N}(i)|}\right)-n.

Moreover, given any (i0,i1)ED(i_{0},i_{1})\in E_{D}, where \{j\}=\mathcal{N}(i_{0})\cap\mathcal{N}(i_{1}), we have that

(9) (Ai0Ai1)\displaystyle\mathbb{P}(A_{i_{0}}\cap A_{i_{1}}) μ(Si0jSi1j)(k𝒩(i0){j}μ(Si0k))(k𝒩(i1){j}μ(Si1k))\displaystyle\geq\mu\left(S^{j}_{i_{0}}\cap S^{j}_{i_{1}}\right)\cdot\left(\prod_{k\in\mathcal{N}(i_{0})\setminus\{j\}}\mu(S^{k}_{i_{0}})\right)\cdot\left(\prod_{k^{\prime}\in\mathcal{N}(i_{1})\setminus\{j\}}\mu(S^{k^{\prime}}_{i_{1}})\right)
μ(Si0jSi1j)pi0pi1.\displaystyle\geq\mu\left(S^{j}_{i_{0}}\cap S^{j}_{i_{1}}\right)\cdot p_{i_{0}}\cdot p_{i_{1}}.

Finally, combining (8) with (9), we conclude that

(i0,i1)ED(Ai0Ai1)\displaystyle\sum_{(i_{0},i_{1})\in E_{D}}\mathbb{P}(A_{i_{0}}\cap A_{i_{1}}) (i0,i1)EDj𝒩(i0)𝒩(i1)μ(Si0jSi1j)pi0pi1\displaystyle\geq\sum_{(i_{0},i_{1})\in E_{D}}\sum_{j\in\mathcal{N}(i_{0})\cap\mathcal{N}(i_{1})}\mu\left(S^{j}_{i_{0}}\cap S^{j}_{i_{1}}\right)\cdot p_{i_{0}}\cdot p_{i_{1}}
(mini[m]pi)2(i|𝒩(i)|pi1/|𝒩(i)|n)\displaystyle\geq\left(\min_{i\in[m]}p_{i}\right)^{2}\left(\sum_{i}|\mathcal{N}(i)|\cdot p_{i}^{1/|\mathcal{N}(i)|}-n\right)
=mΔD(GB)ΔB(GB)2ϝ(GB,𝒑).\displaystyle=\sqrt{m}\cdot\Delta_{D}(G_{B})\cdot\Delta_{B}(G_{B})^{2}\cdot\digamma(G_{B},\bm{p}).

The following lemma is a special case of Theorem 4.1 where t=1t=1 and L1=[m]L_{1}=[m]. In fact, Theorem 4.1 is proved by applying Lemma 4.6 to each GkG_{k} separately.

Lemma 4.6.

Let GB=([m],[n],EB)G_{B}=([m],[n],E_{B}) be a linear bipartite graph and 𝐩\bm{p} be a probability vector. If 𝒜(GB,𝐩)\mathcal{A}\sim(G_{B},\bm{p}), then 𝒜(GB,𝐩,,𝛅)\mathcal{A}\sim(G_{B},\bm{p},\mathcal{M},\bm{\delta}) for some matching \mathcal{M} of GD(GB)G_{D}(G_{B}) and some 𝛅(0,1)||\bm{\delta}\in(0,1)^{|\mathcal{M}|} satisfying that (i,i)δi,i2(ϝ+(GB,𝐩))2\sum_{(i,i^{\prime})\in\mathcal{M}}\delta^{2}_{i,i^{\prime}}\geq\big{(}\digamma^{+}(G_{B},\bm{p})\big{)}^{2}.

Proof.

Given an instance 𝒜(GB,𝒑)\mathcal{A}\sim(G_{B},\bm{p}), we construct such a matching \mathcal{M} greedily as follows.

We maintain two sets EE and \mathcal{M}, which are initialized as EDE_{D} and \emptyset respectively. We do the following iteratively until EE becomes empty: select an edge (i0,i1)(i_{0},i_{1}) with maximum (Ai0Ai1)\mathbb{P}(A_{i_{0}}\cap A_{i_{1}}) from EE, add (i0,i1)(i_{0},i_{1}) to \mathcal{M}, and delete all edges connecting i0i_{0} or i1i_{1} from EE (including (i0,i1)(i_{0},i_{1}) itself). A sketch of this construction is given below.
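For concreteness, the following sketch of ours implements the greedy construction; the callback overlap(i0, i1), returning the intersection probability, is an assumption of the sketch (these probabilities need not be efficiently computable in general).

```python
def greedy_matching(E_D, overlap):
    """Greedy construction from the proof of Lemma 4.6 (a sketch).
    E_D: iterable of edges (i0, i1) of the dependency graph;
    overlap(i0, i1): the intersection probability P(A_i0 ∩ A_i1)."""
    E = set(E_D)
    M = []
    while E:
        # pick the surviving edge with maximum intersection probability
        i0, i1 = max(E, key=lambda e: overlap(*e))
        M.append((i0, i1))
        # delete every edge touching i0 or i1, including (i0, i1) itself
        E = {e for e in E if i0 not in e and i1 not in e}
    return M
```

Each iteration keeps the heaviest surviving edge while deleting at most 2ΔD2\Delta_{D} edges, which is exactly what Inequality (10) below records.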

Let ΔD\Delta_{D} and ΔB\Delta_{B} denote ΔD(GB)\Delta_{D}(G_{B}) and ΔB(GB)\Delta_{B}(G_{B}) respectively. In each iteration, at most 2ΔD2\Delta_{D} edges are deleted from EE and for each deleted edge (i,i)(i,i^{\prime}), (AiAi)2(Ai0Ai1)2\mathbb{P}(A_{i}\cap A_{i^{\prime}})^{2}\leq\mathbb{P}(A_{i_{0}}\cap A_{i_{1}})^{2}. Based on this observation, it is easy to see that

(10) (i0,i1)(Ai0Ai1)212ΔD(i,i)ED(AiAi)2.\displaystyle\sum_{(i_{0},i_{1})\in\mathcal{M}}\mathbb{P}(A_{i_{0}}\cap A_{i_{1}})^{2}\geq\frac{1}{2\Delta_{D}}\sum_{(i,i^{\prime})\in E_{D}}\mathbb{P}(A_{i}\cap A_{i^{\prime}})^{2}.

Moreover, according to Lemmas 4.2 and 4.5, together with the Cauchy–Schwarz inequality for the first step, we have

(11) (i,i)ED(AiAi)21|ED|((i,i)ED(AiAi))2mΔD2(ϝ+(GB,𝒑))2|ED|,\displaystyle\sum_{(i,i^{\prime})\in E_{D}}\mathbb{P}(A_{i}\cap A_{i^{\prime}})^{2}\geq\frac{1}{|E_{D}|}\cdot\left(\sum_{(i,i^{\prime})\in E_{D}}\mathbb{P}(A_{i}\cap A_{i^{\prime}})\right)^{2}\geq\frac{m\cdot\Delta_{D}^{2}\cdot\left(\digamma^{+}(G_{B},\bm{p})\right)^{2}}{|E_{D}|},

By combining Inequalities (10) and (11), setting δi,i=(AiAi)\delta_{i,i^{\prime}}=\mathbb{P}(A_{i}\cap A_{i^{\prime}}), and noting 2|ED|mΔD2|E_{D}|\leq m\Delta_{D}, we finish the proof. ∎

Proof of Theorem 4.1.

For each k[t]k\in[t], by applying Lemma 4.6 to GkG_{k}, we have that 𝒜(GB,𝒑,k,𝜹k)\mathcal{A}\sim(G_{B},\bm{p},\mathcal{M}_{k},\bm{\delta}_{k}) for some matching kEk\mathcal{M}_{k}\subseteq E_{k} and some 𝜹k\bm{\delta}_{k} where (i,i)kδi,i2(ϝ+(Gk,𝒑))2\sum_{(i,i^{\prime})\in\mathcal{M}_{k}}\delta^{2}_{i,i^{\prime}}\geq\big{(}\digamma^{+}(G_{k},\bm{p})\big{)}^{2}. Note that the EkE_{k}'s are pairwise disjoint, so 12t\mathcal{M}_{1}\cup\mathcal{M}_{2}\cup\cdots\cup\mathcal{M}_{t} is still a matching. By letting =12t\mathcal{M}=\mathcal{M}_{1}\cup\mathcal{M}_{2}\cup\cdots\cup\mathcal{M}_{t} and 𝜹=(𝜹1,,𝜹t)\bm{\delta}=(\bm{\delta}_{1},\cdots,\bm{\delta}_{t}), we conclude the theorem. ∎

Remark 4.7.

Given a bipartite graph GG, its simplified graph is obtained from GG by deleting every right node that has only one neighbor and merging right nodes with identical neighbor sets. Notice that if GG is linear, so is its simplified graph.

Theorem 4.1 can be slightly generalized: it is sufficient that the simplified graph of GkG_{k} instead of GkG_{k} itself is linear.

5. The Moser-Tardos algorithm is beyond Shearer’s bound

In this section, we prove Theorem 1.5. Given a dependency graph GDG_{D}, a vector 𝒑\bm{p} and a chordless cycle CC in GDG_{D}, define

r(GD,𝒑,C)|C|(minjCpj)4(2jCpj|C|1)2.r(G_{D},\bm{p},C)\triangleq|C|\cdot\big{(}\min_{j\in C}p_{j}\big{)}^{4}\cdot\left(\frac{2\sum_{j\in C}\sqrt{p_{j}}}{|C|}-1\right)^{2}.

and

r+(GD,𝒑,C)|C|(minjCpj)4(max{2jCpj|C|1,0})2.r^{+}(G_{D},\bm{p},C)\triangleq|C|\cdot\big{(}\min_{j\in C}p_{j}\big{)}^{4}\cdot\left(\max\left\{\frac{2\sum_{j\in C}\sqrt{p_{j}}}{|C|}-1,0\right\}\right)^{2}.
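As a numeric illustration (ours, not from the paper), r+r^{+} is a one-line computation; the vector below is the one used later in the proof of Lemma 5.2, with \ell=5.

```python
import math

def r_plus(p_C):
    """r^+(G_D, p, C) for a chordless cycle C, following the displayed
    formula; p_C lists the probabilities p_j for j in C (a sketch)."""
    ell = len(p_C)
    gap = max(2 * sum(math.sqrt(p) for p in p_C) / ell - 1, 0)
    return ell * min(p_C)**4 * gap**2

ell = 5
q_C = [0.25] * (ell - 1) + [0.25 + 0.25 / (ell - 1)]
print(r_plus(q_C))                     # ~1.1e-5
assert r_plus(q_C) > 2**-10 / ell**3   # consistent with (13) below
```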

Theorem 1.5 then follows immediately from Lemmas 5.1 and 5.2.

Lemma 5.1.

Given GDG_{D}, 𝐩\bm{p} and ε>0\varepsilon>0, let C1,C2,,CC_{1},C_{2},\cdots,C_{\ell} be any disjoint chordless cycles in GDG_{D}. If

d((1+ε)𝒑,GD)<1544ir+(GD,𝒑,Ci),d((1+\varepsilon)\bm{p},G_{D})<\frac{1}{544}\sum_{i\leq\ell}r^{+}(G_{D},\bm{p},C_{i}),

then for any variable-generated event system 𝒜(GD,𝐩)\mathcal{A}\sim(G_{D},\bm{p}), the expected number of resampling steps performed by the MT algorithm is at most m/εm/\varepsilon.

Proof.

Fix such an instance 𝒜\mathcal{A}. Define δi,i:=(AiAi)\delta_{i,i^{\prime}}:=\mathbb{P}(A_{i}\cap A_{i^{\prime}}). Let GBG_{B} denote the event-variable graph of 𝒜\mathcal{A}. Let GkG_{k} denote the induced subgraph of GBG_{B} on Ck(iCk𝒩GB(i))C_{k}\cup\left(\cup_{i\in C_{k}}\mathcal{N}_{G_{B}}(i)\right). According to Remark 4.7, it is lossless to assume GkG_{k} is a cycle of length 2|Ck|2|C_{k}|. Thus we have

(12) \digamma^{+}(G_{k},\bm{p})\geq\frac{\big{(}\min_{i\in C_{k}}p_{i}\big{)}^{2}\cdot\left(-|C_{k}|+\sum_{i\in C_{k}}2\sqrt{p_{i}}\right)}{8\sqrt{|C_{k}|}}.

According to Theorem 4.1, there is a matching \mathcal{M} of GDG_{D} such that (i,i)δi,i2k(ϝ+(Gk,𝒑))2\sum_{(i,i^{\prime})\in\mathcal{M}}\delta_{i,i^{\prime}}^{2}\geq\sum_{k\leq\ell}\left(\digamma^{+}(G_{k},\bm{p})\right)^{2}. Define 𝒑\bm{p}^{-} as in (1). We have (1+ε)𝒑(1+ε)𝒑(1+\varepsilon)\bm{p}^{-}\leq(1+\varepsilon)\bm{p} and

(1+ε)𝒑(1+ε)𝒑1𝒑𝒑1217(i,i)δi,i2217k(ϝ+(Gk,𝒑))2.\displaystyle||(1+\varepsilon)\bm{p}-(1+\varepsilon)\bm{p}^{-}||_{1}\geq||\bm{p}-\bm{p}^{-}||_{1}\geq\frac{2}{17}\sum_{(i,i^{\prime})\in\mathcal{M}}\delta^{2}_{i,i^{\prime}}\geq\frac{2}{17}\sum_{k\leq\ell}\left(\digamma^{+}(G_{k},\bm{p})\right)^{2}.

Combining with (12), we have

(1+ε)𝒑(1+ε)𝒑11544ir+(GD,𝒑,Ci)>d((1+ε)𝒑,GD),\displaystyle||(1+\varepsilon)\bm{p}-(1+\varepsilon)\bm{p}^{-}||_{1}\geq\frac{1}{544}\sum_{i\leq\ell}r^{+}(G_{D},\bm{p},C_{i})>d((1+\varepsilon)\bm{p},G_{D}),

where the last inequality is by the condition of the lemma. Thus by Definition 1.4, (1+ε)𝒑(1+\varepsilon)\bm{p}^{-} is in the Shearer's bound of GDG_{D}. Combining with Theorem 1.6, the expected number of resampling steps performed by the Moser-Tardos algorithm is at most m/εm/\varepsilon. ∎

Lemma 5.2.

Given GDG_{D} and any chordless cycle CC in GDG_{D}, where \ell=|C|, there is some probability vector 𝐩\bm{p} beyond the Shearer's bound of GDG_{D} and with

d(𝒑,GD)1545r(GD,𝒑,C)>2203d(\bm{p},G_{D})\geq\frac{1}{545}\cdot r(G_{D},\bm{p},C)>2^{-20}\ell^{-3}

such that for any variable-generated event system 𝒜(GD,𝐩)\mathcal{A}\sim(G_{D},\bm{p}), the expected number of resampling steps performed by the MT algorithm is at most 229m2|C|32^{29}\cdot m^{2}\cdot|C|^{3}.

The following two lemmas will be used in the proof of Lemma 5.2.

Lemma 5.3.

[56] q(GD,𝐩)=1(A𝒜A)q_{\emptyset}(G_{D},\bm{p})=1-\mathbb{P}(\bigcup_{A\in\mathcal{A}}A) holds for any extremal instance 𝒜(GD,𝐩)\mathcal{A}\sim(G_{D},\bm{p}).

Lemma 5.4.

[56] Suppose 𝐩\bm{p} is in the Shearer’s bound of GD=([m],ED)G_{D}=([m],E_{D}). Then for i[m]i\in[m],

q(GD,𝒑)pi=(j𝒩GD(i){i}Aj¯)\displaystyle\frac{\partial{q_{\emptyset}(G_{D},\bm{p})}}{\partial{p_{i}}}=-\mathbb{P}\left(\bigcap_{j\notin\mathcal{N}_{G_{D}}(i)\cup\{i\}}\overline{A_{j}}\right)

holds for any 𝒜(GD,𝐩)\mathcal{A}\sim(G_{D},\bm{p}) satisfying that AiAi′′=A_{i^{\prime}}\cap A_{i^{\prime\prime}}=\emptyset for any (i,i′′)ED(i^{\prime},i^{\prime\prime})\in E_{D} where i,i′′ii^{\prime},i^{\prime\prime}\neq i.

Proof of Lemma 5.2.

Let =|C|\ell=|C| and 𝝀=(14,,14,14)\bm{\lambda}=\left(\frac{1}{4},\cdots,\frac{1}{4},\frac{1}{4}\right). Let 𝒜(C,𝝀)\mathcal{A}\sim(C,\bm{\lambda}) be an extremal instance defined as follows: 𝒜=(A1,,A)\mathcal{A}=(A_{1},\cdots,A_{\ell}) is a variable-generated event system fully determined by a set of underlying mutually independent random variables {X1,,X}\{X_{1},\cdots,X_{\ell}\}. Moreover, Ai=[Xi<1/2][Xi+11/2]A_{i}=[X_{i}<1/2]\land[X_{i+1}\geq 1/2] for each i[1]i\in[\ell-1], and A=[X<1/2][X11/2]A_{\ell}=[X_{\ell}<1/2]\land[X_{1}\geq 1/2]. According to Lemma 5.3,

q_{\emptyset}(C,\bm{\lambda})=1-\mathbb{P}\left(\bigcup_{i\in[\ell]}A_{i}\right)=\frac{1}{2^{\ell-1}}.
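This value can be checked by brute force (a sketch of ours): with fair bits, an assignment avoids every AiA_{i} exactly when the cyclic pattern "Xi<1/2X_{i}<1/2 and Xi+11/2X_{i+1}\geq 1/2" never occurs, which leaves only the all-low and all-high assignments.

```python
from itertools import product

def no_bad_event(bits):
    """True iff none of the cyclic events A_i = [X_i < 1/2] ∧ [X_{i+1} >= 1/2]
    occurs; bit 0 encodes X_i < 1/2 and bit 1 encodes X_i >= 1/2."""
    ell = len(bits)
    return all(not (bits[i] == 0 and bits[(i + 1) % ell] == 1)
               for i in range(ell))

for ell in range(3, 12):
    count = sum(no_bad_event(b) for b in product((0, 1), repeat=ell))
    assert count / 2**ell == 2**-(ell - 1)   # only all-0 and all-1 survive
```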

Besides, according to Lemma 5.4, for any 𝝀=(14,,14,14+ε)\bm{\lambda}^{\prime}=(\frac{1}{4},\cdots,\frac{1}{4},\frac{1}{4}+\varepsilon) in the Shearer’s bound of CC,

q(C,𝝀)λ=(i[2,2]Ai¯)=223.\displaystyle\frac{\partial{q_{\emptyset}(C,\bm{\lambda}^{\prime})}}{\partial{\lambda^{\prime}_{\ell}}}=-\mathbb{P}\left(\bigcap_{i\in[2,\ell-2]}\overline{A_{i}}\right)=-\frac{\ell-2}{2^{\ell-3}}.

Thus, for any 𝝀𝝀𝝀′′:=(14,,14,14+14(1))\bm{\lambda}\leq\bm{\lambda^{\prime}}\leq\bm{\lambda}^{\prime\prime}:=\left(\frac{1}{4},\cdots,\frac{1}{4},\frac{1}{4}+\frac{1}{4(\ell-1)}\right), we have that

q(C,𝝀′′)=q(C,𝝀)+14λ′′q(C,𝝀)λ𝑑λ>12121121=11121.\displaystyle q_{\emptyset}(C,\bm{\lambda}^{\prime\prime})=q_{\emptyset}(C,\bm{\lambda})+\int_{\frac{1}{4}}^{\lambda^{\prime\prime}_{\ell}}\frac{\partial{q_{\emptyset}(C,\bm{\lambda}^{\prime})}}{\partial{\lambda_{\ell}^{\prime}}}d\lambda_{\ell}^{\prime}>\frac{1}{2^{\ell-1}}-\frac{\ell-2}{\ell-1}\cdot\frac{1}{2^{\ell-1}}=\frac{1}{\ell-1}\cdot\frac{1}{2^{\ell-1}}.

Hence 𝝀′′\bm{\lambda}^{\prime\prime} is in the Shearer’s bound of CC. Thus, there exists q>0q>0 such that 𝒒\bm{q} defined as follows is on the Shearer’s boundary of GDG_{D}:

i[m]:qi={14if i[1],14+14(1)if i=,qotherwise.\displaystyle\forall i\in[m]:\quad q_{i}=\begin{cases}\frac{1}{4}&\text{if }i\in[\ell-1],\\ \frac{1}{4}+\frac{1}{4(\ell-1)}&\text{if }i=\ell,\\ q&\text{otherwise}.\end{cases}

One can verify that

(13) r+(GD,𝒒,C)=r(GD,𝒒,C)>144(122)2>12103.\displaystyle r^{+}(G_{D},\bm{q},C)=r(G_{D},\bm{q},C)>\ell\cdot\frac{1}{4^{4}}\cdot\left(\frac{1}{2\ell^{2}}\right)^{2}>\frac{1}{2^{10}\cdot\ell^{3}}.

Define

f(δ)=545d((1+δ)𝒒,GD)r+(GD,(1+δ)𝒒,C).f(\delta)=545\cdot d((1+\delta)\bm{q},G_{D})-r^{+}(G_{D},(1+\delta)\bm{q},C).

One can verify that f(0)<0f(0)<0 because d(𝒒,GD)=0d(\bm{q},G_{D})=0 and r+(GD,𝒒,C)>0r^{+}(G_{D},\bm{q},C)>0. Moreover, let δ\delta^{\prime} be large enough such that (1+δ)𝒒v(GD)(1+\delta^{\prime})\bm{q}\not\in\mathcal{I}_{v}(G_{D}). One can verify that such δ\delta^{\prime} must exist. We have f(δ)0f(\delta^{\prime})\geq 0. This is because otherwise f(δ)<0f(\delta^{\prime})<0 and then

d((1+δ)𝒒,GD)<1545r+(GD,(1+δ)𝒒,C).d((1+\delta^{\prime})\bm{q},G_{D})<\frac{1}{545}\cdot r^{+}(G_{D},(1+\delta^{\prime})\bm{q},C).

By following the proof of Lemma 5.1, we have that the MT algorithm terminates at (1+δ)𝒒(1+\delta^{\prime})\bm{q}, which contradicts (1+δ)𝒒v(GD)(1+\delta^{\prime})\bm{q}\not\in\mathcal{I}_{v}(G_{D}).

Moreover, f(δ)f(\delta) is a continuous function of δ\delta, because d((1+δ)𝒒,GD)d((1+\delta)\bm{q},G_{D}) and r+(GD,(1+δ)𝒒,C)r^{+}(G_{D},(1+\delta)\bm{q},C) are both continuous functions of δ\delta. Combining this with f(0)<0f(0)<0 and f(\delta^{\prime})\geq 0, there must be a 0δδ0\leq\delta\leq\delta^{\prime} such that f(δ)=0f(\delta)=0. Let 𝒑=(1+δ)𝒒\bm{p}=(1+\delta)\bm{q}. By f(δ)=0f(\delta)=0, we have

(14) d(𝒑,GD)=1545r+(GD,𝒑,C).d(\bm{p},G_{D})=\frac{1}{545}\cdot r^{+}(G_{D},\bm{p},C).

Combining with r+(GD,𝒑,C)=r(GD,𝒑,C)>r(GD,𝒒,C)r^{+}(G_{D},\bm{p},C)=r(G_{D},\bm{p},C)>r(G_{D},\bm{q},C) and (13), we have d(𝒑,GD)>2203d(\bm{p},G_{D})>2^{-20}\ell^{-3}.

Fix a variable-generated event system 𝒜(GD,𝒑)\mathcal{A}\sim(G_{D},\bm{p}). Define δi,i:=(AiAi)\delta_{i,i^{\prime}}:=\mathbb{P}(A_{i}\cap A_{i^{\prime}}). Let GBG_{B} denote the event-variable graph of 𝒜\mathcal{A}. Let GG denote the induced subgraph of GBG_{B} on C(iC𝒩GB(i))C\cup\left(\cup_{i\in C}\mathcal{N}_{G_{B}}(i)\right). According to Remark 4.7, it is lossless to assume that GG is a cycle of length 2|C|2|C|. Thus we have

(15) \digamma^{+}(G,\bm{p})\geq\frac{\big{(}\min_{i\in C}p_{i}\big{)}^{2}\cdot\left(-|C|+\sum_{i\in C}2\sqrt{p_{i}}\right)}{8\sqrt{|C|}}.

According to Theorem 4.1, there is a matching \mathcal{M} of GDG_{D} such that (i,i)δi,i2(ϝ+(G,𝒑))2\sum_{(i,i^{\prime})\in\mathcal{M}}\delta_{i,i^{\prime}}^{2}\geq\left(\digamma^{+}(G,\bm{p})\right)^{2}. Define 𝒑\bm{p}^{-} as in (1). We have

\displaystyle||\bm{p}-\bm{p}^{-}||_{1}\geq\frac{2}{17}\sum_{(i,i^{\prime})\in\mathcal{M}}\delta^{2}_{i,i^{\prime}}\geq\frac{2}{17}\left(\digamma^{+}(G,\bm{p})\right)^{2}.

Combining with (15), we have

𝒑𝒑11544r+(GD,𝒑,C).\displaystyle||\bm{p}-\bm{p}^{-}||_{1}\geq\frac{1}{544}\cdot r^{+}(G_{D},\bm{p},C).

Let

ε12293m.\varepsilon\triangleq\frac{1}{2^{29}\cdot\ell^{3}\cdot m}.

By (13) we have

mε15455442103(15441545)r+(GD,𝒒,C)(15441545)r+(GD,𝒑,C).\displaystyle m\varepsilon\leq\frac{1}{545\cdot 544\cdot 2^{10}\cdot\ell^{3}}\leq\left(\frac{1}{544}-\frac{1}{545}\right)r^{+}(G_{D},\bm{q},C)\leq\left(\frac{1}{544}-\frac{1}{545}\right)r^{+}(G_{D},\bm{p},C).

Thus we have

𝒑(1+ε)𝒑1>𝒑𝒑1mεr+(GD,𝒑,C)544mεr+(GD,𝒑,C)545d(𝒑,GD),\displaystyle||\bm{p}-(1+\varepsilon)\bm{p}^{-}||_{1}>||\bm{p}-\bm{p}^{-}||_{1}-m\varepsilon\geq\frac{r^{+}(G_{D},\bm{p},C)}{544}-m\varepsilon\geq\frac{r^{+}(G_{D},\bm{p},C)}{545}\geq d(\bm{p},G_{D}),

where the last inequality is by (14). Thus by Definition 1.4, (1+ε)𝒑(1+\varepsilon)\bm{p}^{-} is in the Shearer's bound of GDG_{D}. Combining with Theorem 1.6, the expected number of resampling steps performed by the MT algorithm is at most m/εm/\varepsilon. ∎

6. Application to periodic Euclidean graphs

In this section, we explicitly calculate the gaps between our new criterion and Shearer's bound on periodic Euclidean graphs, including several lattices that have been studied extensively in physics. It turns out that the efficient region of the MT algorithm can extend significantly beyond Shearer's bound.

A periodic Euclidean graph GDG_{D} is a graph that is embedded into a Euclidean space naturally and has a translational unit GUG_{U} in the sense that GDG_{D} can be viewed as the union of periodic translations of GUG_{U}. For example, a cycle of length 4 is a translational unit of the square lattice.

Given a dependency graph GDG_{D}, it naturally defines a bipartite graph GB(GD)G_{B}(G_{D}) as follows. Regard each edge of GDG_{D} as a variable and each vertex as an event. An event AA depends on a variable XX if and only if the vertex corresponding to AA is an endpoint of the edge corresponding to XX.
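This canonical construction is a direct transcription (a sketch of ours):

```python
def bipartite_from_dependency(V_D, E_D):
    """The event-variable graph G_B(G_D): one variable per edge of G_D,
    one event per vertex; event v depends on variable e iff v is an
    endpoint of e."""
    vbl = {v: [e for e in E_D if v in e] for v in V_D}
    return list(E_D), vbl   # (variables, event -> its variables)

# e.g. a 4-cycle, the translational unit of the square lattice:
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (2, 3), (3, 0)]
_, vbl = bipartite_from_dependency(V, E)
assert vbl[0] == [(0, 1), (3, 0)]   # event 0 shares one variable with each neighbour
```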

For simplicity, we only focus on symmetric probabilities, where 𝒑=(p,p,,p)\bm{p}=(p,p,\cdots,p). Given a dependency graph GDG_{D} and a vector 𝒑\bm{p}, remember that 𝒑\bm{p} is on Shearer's boundary of GDG_{D} if (1ε)𝒑(1-\varepsilon)\bm{p} is in Shearer's bound and (1+ε)𝒑(1+\varepsilon)\bm{p} is not, for any ε>0\varepsilon>0.

Given a dependency graph GD=([m],ED)G_{D}=([m],E_{D}) and two vertices i,i[m]i,i^{\prime}\in[m], we use dist(i,i)\mathrm{dist}(i,i^{\prime}) to denote the distance between ii and ii^{\prime} in GDG_{D}. The following lemma will be used.

Lemma 6.1.

Suppose 𝐩a=(pa,pa,,pa)\bm{p}_{a}=(p_{a},p_{a},\cdots,p_{a}) is on Shearer's boundary of GD=([m],ED)G_{D}=([m],E_{D}). For any probability vector 𝐩\bm{p} other than 𝐩a\bm{p}_{a}, it is in the Shearer's bound if there exist K,d+K,d\in\mathbb{N}^{+}, 𝒮2[m]\mathcal{S}\subseteq 2^{[m]} where \cup_{S\in\mathcal{S}}S=[m], and f:𝒮2[m]f:\mathcal{S}\rightarrow 2^{[m]} such that the following conditions hold:

  • (a)

    for each i[m]i\in[m], there are at most KK subsets S𝒮S\in\mathcal{S} such that f(S)if(S)\ni i;

  • (b)

    if f(S)=Tf(S)=T, then dist(i,i)d\mathrm{dist}(i,i^{\prime})\leq d for each iSi\in S and iTi^{\prime}\in T;

  • (c)

    if f(S)=Tf(S)=T, then

    (1papa)d1KpaiSmax{pipa,0}iTmax{papi,0}.\left(\frac{1-p_{a}}{p_{a}}\right)^{d-1}\cdot\frac{K}{p_{a}}\cdot\sum_{i\in S}\max\{p_{i}-p_{a},0\}\leq\sum_{i\in T}\max\{p_{a}-p_{i},0\}.

While Lemma 6.1 looks involved, the basic idea is simple: by contradiction, suppose there is such a vector 𝒑\bm{p}^{\prime} beyond Shearer's bound; then we apply Lemma D.1 repeatedly to transfer probability from one event to another while keeping the probability vector beyond Shearer's bound; finally, the vector 𝒑\bm{p}^{\prime} will be changed to a vector strictly below 𝒑\bm{p}, contradicting the assumption that 𝒑\bm{p} is on the Shearer's boundary. The involved part is the transfer scheme that changes 𝒑\bm{p}^{\prime} into a probability vector strictly below 𝒑\bm{p}. We leave the proof to the appendix.
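Though we defer the proof, the three conditions themselves are mechanical to verify; the checker below is a sketch of ours, with the distance function and the family 𝒮\mathcal{S} supplied by the caller.

```python
def check_lemma_6_1(p, p_a, S_list, f, dist, K, d):
    """Sketch: verify conditions (a)-(c) of Lemma 6.1 for a candidate
    vector p (dict vertex -> probability).  S_list is the family S (sets
    covering the vertex set), f maps each set S to the set T = f(S), and
    dist gives graph distances in G_D."""
    vertices = set().union(*S_list)
    # (a) every vertex lies in f(S) for at most K sets S
    if any(sum(1 for S in S_list if i in f(S)) > K for i in vertices):
        return False
    for S in S_list:
        T = f(S)
        # (b) S and T are within distance d of each other
        if any(dist(i, ip) > d for i in S for ip in T):
            return False
        # (c) the probability-transfer inequality
        lhs = ((1 - p_a) / p_a) ** (d - 1) * (K / p_a) * \
              sum(max(p[i] - p_a, 0) for i in S)
        if lhs > sum(max(p_a - p[ip], 0) for ip in T):
            return False
    return True
```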

The main result of this section is as follows.

Theorem 6.2.

Let GD=(VD,ED)G_{D}=(V_{D},E_{D}) be a periodic Euclidean graph with maximum degree Δ\Delta, and 𝐩a=(pa,,pa)\bm{p}_{a}=(p_{a},\cdots,p_{a}) be the probability vector on Shearer’s boundary of GDG_{D}. Suppose GU=(VU,EU)G_{U}=(V_{U},E_{U}) is a translational unit of GDG_{D} with diameter DD. Let

\displaystyle q\triangleq\frac{p_{a}^{D+2}\big{(}\digamma^{+}(G_{B}(G_{U}),\bm{p}_{a})\big{)}^{2}}{17\cdot(\Delta+1)\cdot|V_{U}|^{2}\cdot(1-p_{a})^{D+1}}.

Then for any 𝒜(GB(GD),𝐩)\mathcal{A}\sim(G_{B}(G_{D}),\bm{p}) where (1+ε)𝐩(pa+q,,pa+q)(1+\varepsilon)\bm{p}\leq(p_{a}+q,\cdots,p_{a}+q), the expected number of resampling steps performed by the MT algorithm is at most |VD|/ε|V_{D}|/\varepsilon.

Proof.

Fix any 𝒜(GB(GD),𝒑)\mathcal{A}\sim(G_{B}(G_{D}),\bm{p}) where (1+ε)𝒑(pa+q,,pa+q)(1+\varepsilon)\bm{p}\leq(p_{a}+q,\cdots,p_{a}+q). Let δv0,v1\delta_{v_{0},v_{1}} denote (Av0Av1)\mathbb{P}(A_{v_{0}}\cap A_{v_{1}}) for (v0,v1)ED(v_{0},v_{1})\in E_{D}. We construct a matching ED\mathcal{M}\subset E_{D} greedily as follows: we maintain two sets EE and \mathcal{M}, which are initialized as EDE_{D} and \emptyset respectively. We do the following iteratively until EE becomes empty: select an edge (v0,v1)(v_{0},v_{1}) with maximum δv0,v1\delta_{v_{0},v_{1}} from EE, add (v0,v1)(v_{0},v_{1}) to \mathcal{M}, and delete all edges connecting v0v_{0} or v1v_{1} from EE (including (v0,v1)(v_{0},v_{1}) itself). Let 𝜹=(δv0,v1:(v0,v1))\bm{\delta}=\big{(}\delta_{v_{0},v_{1}}:(v_{0},v_{1})\in\mathcal{M}\big{)}. Then 𝒜(GB(GD),𝒑,,𝜹)\mathcal{A}\sim(G_{B}(G_{D}),\bm{p},\mathcal{M},\bm{\delta}).

Define 𝒑\bm{p}^{-} as (1). In the remaining part of the proof, we will show that (1+ε)𝒑(1+\varepsilon)\bm{p}^{-} is in the Shearer’s bound. This implies the conclusion immediately by Theorem 1.6.

In fact, showing that (1+\varepsilon)\bm{p}^{-} is in the Shearer's bound is a direct application of Lemma 6.1. To provide more detail, we need some notation. We use v,v^{\prime},v_{1},v_{2},\cdots to represent vertices in G_{D}, and u,u^{\prime},u_{1},u_{2},\cdots to represent vertices in G_{U}. Let G_{U}^{1},G_{U}^{2},\cdots be the periodic translations of G_{U} in G_{D}. We use a surjection h:\mathbb{N}^{+}\times V_{U}\rightarrow V_{D} (possibly not an injection, as the translations may overlap) to represent how these periodic translations constitute G_{D}: h(k,u)=v if the copy of u\in V_{U} in the k-th translation (i.e., G_{U}^{k}) is v\in V_{D}. In particular, the vertex set of G_{U}^{k}, denoted by V_{U}^{k}, is \{h(k,u):u\in V_{U}\}, and the edge set of G_{U}^{k}, denoted by E_{U}^{k}, is \{(h(k,u),h(k,u^{\prime})):(u,u^{\prime})\in E_{U}\}. Besides, let \mathcal{N}^{+}(v):=\mathcal{N}_{G_{D}}(v)\cup\{v\} for v\in V_{D}, and \mathcal{N}^{+}(V):=\cup_{v\in V}\mathcal{N}^{+}(v) for V\subset V_{D}. Let T_{k}:=\{(v_{0},v_{1})\in\mathcal{M}:v_{0},v_{1}\in\mathcal{N}^{+}(V_{U}^{k})\} stand for the pairs in \mathcal{M} adjacent to G_{U}^{k}. With some abuse of notation, we sometimes write v\in T_{k} to denote that (v,v^{\prime})\in T_{k} for some v^{\prime}\in V_{D}.

The following claim says that 𝒑\bm{p}^{-} is much smaller than 𝒑\bm{p} even when projected onto a single translation. Its proof uses a similar idea to Lemma 4.6 and can be found in the appendix.

Claim 6.3.

(v0,v1)Tkδv0,v12(ϝ+(GB(GU),𝒑))2\sum_{(v_{0},v_{1})\in T_{k}}\delta_{v_{0},v_{1}}^{2}\geq\big{(}\digamma^{+}(G_{B}(G_{U}),\bm{p})\big{)}^{2} holds for any kk.

To apply Lemma 6.1, let K:=(Δ+1)|VU|K:=(\Delta+1)|V_{U}|, d:=D+2d:=D+2, 𝒮:={VU1,VU2,}\mathcal{S}:=\{V_{U}^{1},V_{U}^{2},\cdots\}, and f(VUk):=Tkf(V_{U}^{k}):=T_{k}. Based on Claim 6.3, one can check that all the three conditions in Lemma 6.1 hold (see the appendix for details). Thus, according to Lemma 6.1, (1+ε)𝒑(1+\varepsilon)\bm{p}^{-} is in Shearer’s bound. ∎

We apply Theorem 6.2 to three lattices: the square lattice, the hexagonal lattice, and the simple cubic lattice. For the square lattice, we take the 5×55\times 5 square with 25 vertices as the translational unit. For the hexagonal lattice, we take a graph consisting of 1919 hexagons as the translational unit, in which there are 3, 4, 5, 4, 3 hexagons in the five columns, respectively. For the simple cubic lattice, we take the 3×3×33\times 3\times 3 cube with 27 vertices as the translational unit. The explicit gaps are summarized in Table 1. Finally, the lower bounds for these three lattices in Table 1 hold for all bipartite graphs with the given canonical dependency graph, because all such bipartite graphs are essentially the same under the reduction rules defined in [36].

References

  • AFR [95] Michael Albert, Alan Frieze, and Bruce Reed. Multicoloured Hamilton cycles. The Electronic Journal of Combinatorics, 2(1):R10, 1995.
  • AI [16] Dimitris Achlioptas and Fotis Iliopoulos. Random walks that find perfect objects and the Lovász local lemma. Journal of ACM, 63(3):22:1–22:29, 2016.
  • AIK [19] Dimitris Achlioptas, Fotis Iliopoulos, and Vladimir Kolmogorov. A local lemma for focused stochastic algorithms. SIAM Journal on Computing, 48(5):1583–1602, 2019.
  • AIS [19] Dimitris Achlioptas, Fotis Iliopoulos, and Alistair Sinclair. Beyond the Lovász local lemma: Point to set correlations and their algorithmic applications. In Proceedings of Symposium on Foundations of Computer Science (FOCS), pages 725–744, 2019.
  • AIV [17] Dimitris Achlioptas, Fotis Iliopoulos, and Nikos Vlassis. Stochastic control via entropy compression. In Proceedings of International Colloquium on Automata, Languages, and Programming (ICALP), volume 80, pages 83:1–83:13, 2017.
  • Alo [91] Noga Alon. A parallel algorithmic version of the local lemma. Random Structures and Algorithms, 2(4):367–378, 1991.
  • Bec [91] József Beck. An algorithmic approach to the Lovász local lemma. Random Structures and Algorithms, 2(4):343–365, 1991.
  • BFH+ [16] Sebastian Brandt, Orr Fischer, Juho Hirvonen, Barbara Keller, Tuomo Lempiäinen, Joel Rybicki, Jukka Suomela, and Jara Uitto. A lower bound for the distributed Lovász local lemma. In Proceedings of Symposium on Theory of Computing (STOC), pages 479–488, 2016.
  • BFPS [11] Rodrigo Bissacot, Roberto Fernández, Aldo Procacci, and Benedetto Scoppola. An improvement of the Lovász local lemma via cluster expansion. Combinatorics, Probability and Computing, 20(05):709–719, 2011.
  • CCS+ [17] Jan Dean Catarata, Scott Corbett, Harry Stern, Mario Szegedy, Tomas Vyskocil, and Zheng Zhang. The Moser-Tardos resample algorithm: Where is the limit? (An experimental inquiry). In 2017 Proceedings of the Nineteenth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 159–171, 2017.
  • CGH [13] Karthekeyan Chandrasekaran, Navin Goyal, and Bernhard Haeupler. Deterministic algorithms for the Lovász local lemma. SIAM Journal on Computing, 42(6):2132–2155, 2013.
  • CPS [17] Kai-Min Chung, Seth Pettie, and Hsin-Hao Su. Distributed algorithms for the lovász local lemma and graph coloring. Distributed Computing, 30(4):261–280, 2017.
  • CS [00] Artur Czumaj and Christian Scheideler. Coloring nonuniform hypergraphs: A new algorithmic approach to the general Lovász local lemma. Random Structures and Algorithms, 17(3-4):213–237, 2000.
  • EL [75] Paul Erdős and László Lovász. Problems and results on 3-chromatic hypergraphs and some related questions. Infinite and finite sets, 10(2):609–627, 1975.
  • ES [91] Paul Erdős and Joel Spencer. Lopsided Lovász local lemma and Latin transversals. Discrete Applied Mathematics, 30(2-3):151–154, 1991.
  • FGYZ [20] Weiming Feng, Heng Guo, Yitong Yin, and Chihao Zhang. Fast sampling and counting kk-SAT solutions in the local lemma regime. In Proceedings of Symposium on Theory of Computing (STOC), pages 854–867, 2020.
  • FHY [20] Weiming Feng, Kun He, and Yitong Yin. Sampling constraint satisfaction solutions in the local lemma regime. arXiv preprint arXiv:2011.03915, 2020.
  • FHY [21] Weiming Feng, Kun He, and Yitong Yin. Sampling constraint satisfaction solutions in the local lemma regime. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 1565–1578, 2021.
  • Gau [67] David S Gaunt. Hard-sphere lattice gases. II. Plane-triangular and three-dimensional lattices. The Journal of Chemical Physics, 46(8):3237–3259, 1967.
  • GF [65] David S Gaunt and Michael E Fisher. Hard-sphere lattice gases. I. Plane-square lattice. The Journal of Chemical Physics, 43(8):2840–2863, 1965.
  • GGGY [20] Andreas Galanis, Leslie Ann Goldberg, Heng Guo, and Kuan Yang. Counting solutions to random CNF formulas. In ICALP, volume 168 of LIPIcs, pages 53:1–53:14, 2020.
  • GH [20] Heng Guo and Kun He. Tight bounds for popping algorithms. Random Structures & Algorithms, 2020.
  • Gha [16] Mohsen Ghaffari. An improved distributed algorithm for maximal independent set. In Robert Krauthgamer, editor, Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 270–277, 2016.
  • Gil [19] András Gilyén. Quantum singular value transformation & its algorithmic applications. PhD thesis, University of Amsterdam, 2019.
  • GJ [18] Heng Guo and Mark Jerrum. Approximately counting bases of bicircular matroids. arXiv preprint arXiv:1808.09548, 2018.
  • GJ [19] Heng Guo and Mark Jerrum. A polynomial-time approximation algorithm for all-terminal network reliability. SIAM Journal on Computing, 48(3):964–978, 2019.
  • GJL [19] Heng Guo, Mark Jerrum, and Jingcheng Liu. Uniform sampling through the Lovász local lemma. Journal of the ACM (JACM), 66(3):18, 2019.
  • GKPT [17] Ioannis Giotis, Lefteris Kirousis, Kostas I Psaromiligkos, and Dimitrios M Thilikos. Acyclic edge coloring through the Lovász local lemma. Theoretical Computer Science, 665:40–50, 2017.
  • GLLZ [19] Heng Guo, Chao Liao, Pinyan Lu, and Chihao Zhang. Counting hypergraph colorings in the local lemma regime. SIAM Journal on Computing, 48(4):1397–1424, 2019.
  • GMSW [09] Heidi Gebauer, Robin A Moser, Dominik Scheder, and Emo Welzl. The Lovász local lemma and satisfiability. In Efficient Algorithms, pages 30–54. Springer, 2009.
  • GST [16] Heidi Gebauer, Tibor Szabó, and Gábor Tardos. The local lemma is asymptotically tight for SAT. Journal of the ACM, 63(5):43, 2016.
  • Har [16] David G Harris. Lopsidependency in the Moser-Tardos framework: beyond the lopsided Lovász local lemma. ACM Transactions on Algorithms (TALG), 13(1):17, 2016.
  • Har [18] David G. Harris. Deterministic parallel algorithms for fooling polylogarithmic juntas and the Lovász local lemma. ACM Transactions on Algorithms, 14(4):47:1–47:24, 2018.
  • Har [19] David G. Harris. Deterministic algorithms for the Lovász local lemma: simpler, more general, and more parallel. CoRR, abs/1909.08065, 2019.
  • HH [17] Bernhard Haeupler and David G. Harris. Parallel algorithms and concentration bounds for the Lovász local lemma via witness DAGs. ACM Transactions on Algorithms, 13(4):53:1–53:25, 2017.
  • HLL+ [17] Kun He, Liang Li, Xingwu Liu, Yuyi Wang, and Mingji Xia. Variable-version Lovász local lemma: Beyond Shearer’s bound. In Proceedings of Symposium on Foundations of Computer Science (FOCS), pages 451–462, 2017.
  • HLSZ [19] Kun He, Qian Li, Xiaoming Sun, and Jiapeng Zhang. Quantum Lovász local lemma: Shearer’s bound is tight. In Proceedings of Symposium on Theory of Computing (STOC), pages 461–472, 2019.
  • HS [14] David G. Harris and Aravind Srinivasan. A constructive algorithm for the Lovász local lemma on permutations. In Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 907–925, 2014.
  • HSS [11] Bernhard Haeupler, Barna Saha, and Aravind Srinivasan. New constructive aspects of the Lovász local lemma. Journal of the ACM, 58(6):1–28, 2011.
  • HSW [21] Kun He, Xiaoming Sun, and Kewen Wu. Perfect sampling for (atomic) Lovász local lemma. arXiv preprint arXiv:2107.03932, 2021.
  • HV [20] Nicholas J. A. Harvey and Jan Vondrák. An algorithmic proof of the Lovász local lemma via resampling oracles. SIAM Journal on Computing, 49(2):394–428, 2020.
  • IS [20] Fotis Iliopoulos and Alistair Sinclair. Efficiently list-edge coloring multigraphs asymptotically optimally. In Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2319–2336, 2020.
  • JPV [20] Vishesh Jain, Huy Tuan Pham, and Thuy Duong Vuong. Towards the sampling Lovász local lemma. CoRR, abs/2011.12196, 2020.
  • JPV [21] Vishesh Jain, Huy Tuan Pham, and Thuy Duong Vuong. On the sampling Lovász local lemma for atomic constraint satisfaction problems. CoRR, abs/2102.08342, 2021.
  • KS [11] Kashyap Babu Rao Kolipaka and Mario Szegedy. Moser and Tardos meet Lovász. In Proceedings of Symposium on Theory of Computing (STOC), pages 235–244, 2011.
  • KSX [12] Kashyap Babu Rao Kolipaka, Mario Szegedy, and Yixin Xu. A sharper local lemma with improved applications. In Proceedings of APPROX/RANDOM, volume 7408 of Lecture Notes in Computer Science, pages 603–614, 2012.
  • LS [07] Linyuan Lu and László Székely. Using the Lovász local lemma in the space of random injections. The Electronic Journal of Combinatorics, 14(1):R63, 2007.
  • LS [09] Linyuan Lu and László A. Székely. A new asymptotic enumeration technique: the Lovász local lemma. arXiv preprint arXiv:0905.3983, 2009.
  • McD [97] Colin McDiarmid. Hypergraph colouring and the Lovász local lemma. Discrete Mathematics, 167:481–486, 1997.
  • Moi [19] Ankur Moitra. Approximate counting, the Lovász local lemma, and inference in graphical models. Journal of the ACM, 66(2):10:1–10:25, 2019.
  • Mol [19] Michael Molloy. The list chromatic number of graphs with small clique number. Journal of Combinatorial Theory, Series B, 134:264–284, 2019.
  • MR [98] Michael Molloy and Bruce Reed. Further algorithmic aspects of the local lemma. In Proceedings of Symposium on Theory of Computing (STOC), pages 524–529, 1998.
  • MT [10] Robin A Moser and Gábor Tardos. A constructive proof of the general Lovász local lemma. Journal of the ACM, 57(2):11, 2010.
  • Peg [14] Wesley Pegden. An extension of the Moser–Tardos algorithmic local lemma. SIAM Journal on Discrete Mathematics, 28(2):911–917, 2014.
  • She [85] James B. Shearer. On a problem of Spencer. Combinatorica, 5(3):241–245, 1985.
  • Spe [75] Joel Spencer. Ramsey’s theorem—a new lower bound. Journal of Combinatorial Theory, Series A, 18(1):108–115, 1975.
  • Spe [77] Joel Spencer. Asymptotic lower bounds for Ramsey functions. Discrete Mathematics, 20:69–76, 1977.
  • Sze [13] Mario Szegedy. The Lovász local lemma – a survey. In International Computer Science Symposium in Russia, pages 1–11. Springer, 2013.
  • Tod [99] Synge Todo. Transfer-matrix study of negative-fugacity singularity of hard-core lattice gas. International Journal of Modern Physics C, 10(04):517–529, 1999.

Appendix A Missing Proofs in Section 3

Proof of Proposition 3.2.

The following simple greedy procedure will find such a 𝒫\mathcal{P}.

Initially, 𝒫=\mathcal{P}=\emptyset;
for each (i,i)(i,i^{\prime})\in\mathcal{M} do
      k:=1k:=1;
      while k|List(D,i,i)|1k\leq|\mathrm{List}(D,i,i^{\prime})|-1 do
            if the kk-th node and (k+1)(k+1)-th node in List(D,i,i)\mathrm{List}(D,i,i^{\prime}) form a reversible arc then
                  add this arc to 𝒫\mathcal{P}, and k:=k+2k:=k+2;
            else
                  k:=k+1k:=k+1;
Return 𝒫\mathcal{P};

Obviously, for each (i,i)(i,i^{\prime})\in\mathcal{M}, the set 𝒫\mathcal{P} returned by the procedure contains at least half of all reversible arcs uvu\rightarrow v with {L(u),L(v)}={i,i}\{L(u),L(v)\}=\{i,i^{\prime}\}, and hence covers at least half of the nodes in 𝒱(D,i)\mathcal{V}(D,i). ∎
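For concreteness, the scan above can be rendered as the following Python sketch (our own illustration; reversible is an assumed predicate standing in for the reversible-arc test between consecutive nodes):

def greedy_reversible_arcs(nodes, reversible):
    # Scan the topologically ordered node list List(D, i, i') and greedily
    # pick vertex-disjoint consecutive pairs that form reversible arcs.
    picked = []
    k = 0
    while k < len(nodes) - 1:
        if reversible(nodes[k], nodes[k + 1]):
            picked.append((nodes[k], nodes[k + 1]))
            k += 2  # both endpoints are now used, so skip past them
        else:
            k += 1
    return picked

Every consecutive reversible arc that the scan skips shares an endpoint with an arc just picked, which is exactly the at-least-half counting used in the proof.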

Appendix B Proof of Proposition 3.9

Given a pwdag D=(V,E,L)D=(V,E,L) of GDG_{D} and a Boolean string 𝑹{0,1}(D)\bm{R}\in\{0,1\}^{\mathscr{M}(D)}, define h(D,𝑹)h(D,\bm{R}) to be a directed graph D:=(V,E,L)D^{\prime}:=(V^{\prime},E^{\prime},L^{\prime}) where V=VV^{\prime}=V, E=EE^{\prime}=E, and

vV:L(v)={(L(v)),if v and Rv=0;(L(v)),if v and Rv=1;L(v),otherwise, v.\displaystyle\forall v\in V:\quad L^{\prime}(v)=\begin{cases}(L(v))^{\uparrow},&\text{if }v\in\mathscr{M}\text{ and }R_{v}=0;\\ (L(v))^{\downarrow},&\text{if }v\in\mathscr{M}\text{ and }R_{v}=1;\\ L(v),&\text{otherwise, }v\not\in\mathscr{M}.\end{cases}

It is easy to verify that h(D,𝑹)h(D,\bm{R}) is a pwdag of GG^{\mathcal{M}}. Moreover, given any D𝒟(G)D^{\prime}\in\mathcal{D}(G^{\mathcal{M}}), there is one and only one D𝒟(GD)D\in\mathcal{D}(G_{D}) and 𝑹{0,1}(D)\bm{R}\in\{0,1\}^{\mathscr{M}(D)} such that h(D,𝑹)=Dh(D,\bm{R})=D^{\prime}. In other words, hh is a bijection between {(D,𝑹):D𝒟(GD),𝑹{0,1}(D)}\{(D,\bm{R}):D\in\mathcal{D}(G_{D}),\bm{R}\in\{0,1\}^{\mathscr{M}(D)}\} and 𝒟(G)\mathcal{D}(G^{\mathcal{M}}). So

D𝒟(G)v in DpL(v)\displaystyle\sum_{D^{\prime}\in\mathcal{D}(G^{\mathcal{M}})}\prod_{v^{\prime}\text{ in }D^{\prime}}p^{\mathcal{M}}_{L^{\prime}(v^{\prime})} =D𝒟(GD)𝑹{0,1}(D)v in h(D,𝑹)pL(v)\displaystyle=\sum_{D\in\mathcal{D}(G_{D})}\sum_{\bm{R}\in\{0,1\}^{\mathscr{M}(D)}}\prod_{v^{\prime}\text{ in }h(D,\bm{R})}p^{\mathcal{M}}_{L^{\prime}(v^{\prime})}
=D𝒟(GD)𝑹{0,1}(D)v in DpL(v)\displaystyle=\sum_{D\in\mathcal{D}(G_{D})}\sum_{\bm{R}\in\{0,1\}^{\mathscr{M}(D)}}\prod_{v\text{ in }D}p^{\mathcal{M}}_{L^{\prime}(v)}
=D𝒟(GD)v(D)pL(v)(𝑹{0,1}(D)v(D)pL(v))\displaystyle=\sum_{D\in\mathcal{D}(G_{D})}\prod_{v\not\in\mathscr{M}(D)}p^{\mathcal{M}}_{L^{\prime}(v)}\left(\sum_{\bm{R}\in\{0,1\}^{\mathscr{M}(D)}}\prod_{v\in\mathscr{M}(D)}p^{\mathcal{M}}_{L^{\prime}(v)}\right)
=D𝒟(GD)v(D)pL(v)v(D)(pL(v)+pL(v))\displaystyle=\sum_{D\in\mathcal{D}(G_{D})}\prod_{v\not\in\mathscr{M}(D)}p^{\mathcal{M}}_{L(v)}\prod_{v\in\mathscr{M}(D)}\left(p^{\mathcal{M}}_{L(v)^{\uparrow}}+p^{\mathcal{M}}_{L(v)^{\downarrow}}\right)
=D𝒟(GD)v(D)pL(v)v(D)(pL(v)+pL(v)pL(v))\displaystyle=\sum_{D\in\mathcal{D}(G_{D})}\prod_{v\not\in\mathscr{M}(D)}p^{-}_{L(v)}\prod_{v\in\mathscr{M}(D)}\left(p^{\prime}_{L(v)}+p^{-}_{L(v)}-p^{\prime}_{L(v)}\right)
=D𝒟(GD)v in DpL(v),\displaystyle=\sum_{D\in\mathcal{D}(G_{D})}\prod_{v\text{ in }D}p^{-}_{L(v)},

where the second equality holds because V=VV=V^{\prime}, the fourth equality is by the definition of LL^{\prime}, and the fifth equality is by the definition of 𝒑\bm{p}^{\mathcal{M}}.
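The chain of equalities above boils down to the distributive law: summing over all 𝑹{0,1}(D)\bm{R}\in\{0,1\}^{\mathscr{M}(D)} factors the inner sum into a product of (pL(v)+pL(v))(p^{\mathcal{M}}_{L(v)^{\uparrow}}+p^{\mathcal{M}}_{L(v)^{\downarrow}}) over v(D)v\in\mathscr{M}(D). A small numeric sanity check in Python (the probabilities are made up, purely for illustration):

from itertools import product
from math import prod, isclose

# toy values standing in for the up- and down-copies of three labels
p_up = {"a": 0.10, "b": 0.25, "c": 0.05}
p_down = {"a": 0.02, "b": 0.05, "c": 0.15}
M = sorted(p_up)

# left-hand side: sum over all Boolean strings R of the product of chosen copies
lhs = sum(
    prod((p_up if r == 0 else p_down)[v] for (v, r) in zip(M, R))
    for R in product((0, 1), repeat=len(M))
)
# right-hand side: product of (p_up + p_down), playing the role of p^-
rhs = prod(p_up[v] + p_down[v] for v in M)
assert isclose(lhs, rhs)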

Appendix C Proof of Theorem 3.12

We first verify that the image of hh is a subset of 𝒟(G)\mathcal{D}(G^{\mathcal{M}}).

Lemma C.1.

For any D𝒟(GD)D\in\mathcal{D}(G_{D}) and 𝒮ψ(D)\mathscr{S}\in\psi(D), h(D,𝒮)𝒟(G)h(D,\mathscr{S})\in\mathcal{D}(G^{\mathcal{M}}).

Proof.

First, we prove that h(D,𝒮)=(V,E,L)h(D,\mathscr{S})=(V^{\prime},E^{\prime},L^{\prime}) is a DAG. Define a total order π\pi^{\prime} over the set VV^{\prime} as follows: for any two distinct nodes u,vVu^{\prime},v^{\prime}\in V^{\prime},

  • if g(u)g(v)g(u^{\prime})\neq g(v^{\prime}), then uvu^{\prime}\prec v^{\prime} in π\pi^{\prime} if and only if g(u)g(v)g(u^{\prime})\prec g(v^{\prime}) in πD\pi_{D};

  • if g(u)=g(v)g(u^{\prime})=g(v^{\prime}), then uvu^{\prime}\prec v^{\prime} in π\pi^{\prime} if and only if u=f(g(u))u^{\prime}=f^{\ast}(g(u^{\prime})) (and then v=f(g(u))v^{\prime}=f(g(u^{\prime}))).

One can verify that π\pi^{\prime} is a topological order of h(D,𝒮)h(D,\mathscr{S}), which means that h(D,𝒮)h(D,\mathscr{S}) is a DAG.

Secondly, we prove that h(D,𝒮)h(D,\mathscr{S}) is a wdag of GG^{\mathcal{M}}. As h(D,𝒮)h(D,\mathscr{S}) has been shown to be a DAG, we only need to verify the following: for any two distinct nodes u,vu^{\prime},v^{\prime} in DD^{\prime}, there is an arc between uu^{\prime} and vv^{\prime} (in either direction) if and only if either L(u)=L(v)L^{\prime}(u^{\prime})=L^{\prime}(v^{\prime}) or (L(v),L(u))E(L^{\prime}(v^{\prime}),L^{\prime}(u^{\prime}))\in E^{\mathcal{M}}.

\Longrightarrow: By symmetry, suppose (uv)E(u^{\prime}\rightarrow v^{\prime})\in E^{\prime}. If (uv)E1(u^{\prime}\rightarrow v^{\prime})\in E^{\prime}_{1}, then u=f(w)u^{\prime}=f^{\ast}(w) and v=f(w)v^{\prime}=f(w) for some vertex w𝒮3𝒮4w\in\mathscr{S}_{3}\cup\mathscr{S}_{4}. Thus, by (2) and (3) we have L(u){i,i}L^{\prime}(u^{\prime})\in\{i^{\uparrow},i^{\downarrow}\} and L(v)=L(w)L^{\prime}(v^{\prime})=L(w)^{\downarrow} where (L(w),i)(L(w),i)\in\mathcal{M}. By (L(w),i)(L(w),i)\in\mathcal{M}, any two vertices in {(L(w)),(L(w)),i,i}\{(L(w))^{\uparrow},(L(w))^{\downarrow},i^{\uparrow},i^{\downarrow}\} are connected in GG^{\mathcal{M}}. In particular, (L(v),L(u))E(L^{\prime}(v^{\prime}),L^{\prime}(u^{\prime}))\in E^{\mathcal{M}}. If (uv)E2(u^{\prime}\rightarrow v^{\prime})\in E^{\prime}_{2}, we have L(u)=L(v)L^{\prime}(u^{\prime})=L^{\prime}(v^{\prime}) or (L(u),L(v))E(L^{\prime}(u^{\prime}),L^{\prime}(v^{\prime}))\in E^{\mathcal{M}} immediately.

\Longleftarrow: Suppose u,vVu^{\prime},v^{\prime}\in V^{\prime} are two distinct nodes where L(u)=L(v)L^{\prime}(u^{\prime})=L^{\prime}(v^{\prime}) or (L(u),L(v))E(L^{\prime}(u^{\prime}),L^{\prime}(v^{\prime}))\in E^{\mathcal{M}}. If g(u)g(v)g(u^{\prime})\not=g(v^{\prime}), then either g(u)g(v)g(u^{\prime})\prec g(v^{\prime}) or g(v)g(u)g(v^{\prime})\prec g(u^{\prime}) in πD\pi_{D}, which implies that either (uv)E2(u^{\prime}\rightarrow v^{\prime})\in E^{\prime}_{2} or (vu)E2(v^{\prime}\rightarrow u^{\prime})\in E^{\prime}_{2}. Otherwise, g(u)=g(v)g(u^{\prime})=g(v^{\prime}). Let v:=g(u)=g(v)v:=g(u^{\prime})=g(v^{\prime}). By (2) and (3), we have v𝒮3𝒮4v\in\mathscr{S}_{3}\cup\mathscr{S}_{4} and {u,v}={f(v),f(v)}\{u^{\prime},v^{\prime}\}=\{f(v),f^{\ast}(v)\}. Therefore either uvu^{\prime}\rightarrow v^{\prime} or vuv^{\prime}\rightarrow u^{\prime} is in E1E^{\prime}_{1}.

Finally, one can check that f(v)f(v), where vv is the unique sink of DD, is the unique sink of DD^{\prime}. This completes the proof. ∎

In the rest of this section, we show that hh is injective. Given D𝒟(GD)D\in\mathcal{D}(G_{D}) and (i,j)(i,j)\in\mathcal{M}, recall that List(D,i,j)\mathrm{List}(D,i,j) is the sequence listing all nodes in DD labelled with ii or jj in the topological order. Similarly,

Definition C.2.

Given D=(V,E,L)𝒟(G)D^{\prime}=(V^{\prime},E^{\prime},L^{\prime})\in\mathcal{D}(G^{\mathcal{M}}) and (i,j)(i,j)\in\mathcal{M}, we use List(D,i,j)\mathrm{List^{\prime}}(D^{\prime},i,j) to denote the unique sequence listing all nodes in DD^{\prime} with label in {i,i,j,j}\{i^{\uparrow},i^{\downarrow},j^{\uparrow},j^{\downarrow}\} in the topological order.

Claims C.3 and C.5 establish two properties of List(D,i,j)\mathrm{List}^{\prime}(D^{\prime},i,j), which will be used to show the injectivity of hh.

Claim C.3.

Suppose D=h(D,𝒮)D^{\prime}=h(D,\mathscr{S}) for some D𝒟(GD)D\in\mathcal{D}(G_{D}) and 𝒮ψ(D)\mathscr{S}\in\psi(D). Let (i,j)(i,j)\in\mathcal{M}. Then for any node vv^{\prime} in DD^{\prime},

  • (a)

    vList(D,i,j)v^{\prime}\in\mathrm{List}^{\prime}(D^{\prime},i,j) if and only if g(v)List(D,i,j)g(v^{\prime})\in\mathrm{List}(D,i,j);

  • (b)

    for any other node uu^{\prime} in DD^{\prime}, if g(u)g(u^{\prime}) precedes g(v)g(v^{\prime}) in List(D,i,j)\mathrm{List}(D,i,j), then uu^{\prime} precedes vv^{\prime} in List(D,i,j)\mathrm{List}^{\prime}(D^{\prime},i,j);

  • (c)

    if v𝒮3𝒮4v\in\mathscr{S}_{3}\cup\mathscr{S}_{4}, then f(v)f(v) is next to f(v)f^{\ast}(v) in List(D,i,j)\mathrm{List}^{\prime}(D^{\prime},i,j).

Proof.

Part (a) is immediate by Definition 3.11.

Now, we show Part (b). Suppose g(u)g(u^{\prime}) precedes g(v)g(v^{\prime}) in List(D,i,j)\mathrm{List}(D,i,j). Then g(u)g(v)g(u^{\prime})\prec g(v^{\prime}) in πD\pi_{D}. Thus one can check that all the four arcs f(g(u))f(g(v))f(g(u^{\prime}))\rightarrow f(g(v^{\prime})), f(g(u))f(g(v))f^{\ast}(g(u^{\prime}))\rightarrow f(g(v^{\prime})), f(g(u))f(g(v))f(g(u^{\prime}))\rightarrow f^{\ast}(g(v^{\prime})), and f(g(u))f(g(v))f^{\ast}(g(u^{\prime}))\rightarrow f^{\ast}(g(v^{\prime})) are contained in E2E_{2}^{\prime}. In particular, (uv)E(u^{\prime}\rightarrow v^{\prime})\in E^{\prime} as u{f(g(u)),f(g(u))}u^{\prime}\in\{f(g(u^{\prime})),f^{\ast}(g(u^{\prime}))\} and v{f(g(v)),f(g(v))}v^{\prime}\in\{f(g(v^{\prime})),f^{\ast}(g(v^{\prime}))\}. This implies that uu^{\prime} precedes vv^{\prime} in List(D,i,j)\mathrm{List}^{\prime}(D^{\prime},i,j).

Finally, we prove Part (c). According to Part (b), f(v)f(v) and f(v)f^{\ast}(v) are adjacent in List(D,i,j)\mathrm{List}^{\prime}(D^{\prime},i,j). Besides, as there is an arc f(v)f(v)f^{\ast}(v)\rightarrow f(v) in E1E_{1}^{\prime}, we conclude that f(v)f(v) is next to f(v)f^{\ast}(v) in List(D,i,j)\mathrm{List}^{\prime}(D^{\prime},i,j). ∎

Definition C.4.

For a reversible arc uvu^{\prime}\rightarrow v^{\prime} in DD^{\prime}, we call it (,)(*,\downarrow)-reversible in DD^{\prime} if L(u){i,i}L^{\prime}(u^{\prime})\in\{i^{\uparrow},i^{\downarrow}\} and L(v)=jL^{\prime}(v^{\prime})=j^{\downarrow} for some (i,j)ED(i,j)\in E_{D}.

Claim C.5.

Suppose D=h(D,𝒮)D^{\prime}=h(D,\mathscr{S}) for some D𝒟(GD)D\in\mathcal{D}(G_{D}) and 𝒮ψ(D)\mathscr{S}\in\psi(D). Let (i,j)(i,j)\in\mathcal{M}. Let u,vu^{\prime},v^{\prime} be two nodes in List(D,i,j)\mathrm{List^{\prime}}(D^{\prime},i,j) where vv^{\prime} is next to uu^{\prime}. Then uV2u^{\prime}\in V_{2}^{\prime} if and only if uvu^{\prime}\rightarrow v^{\prime} is (,)(*,\downarrow)-reversible in DD^{\prime} and vV1v^{\prime}\in V_{1}^{\prime}.

Proof.

\Longrightarrow: Let u:=g(u)u:=g(u^{\prime}). Assume uV2u^{\prime}\in V_{2}^{\prime}, i.e., u=f(u)u^{\prime}=f^{\ast}(u). By Definition 3.11, u𝒮3𝒮4u\in\mathscr{S}_{3}\cup\mathscr{S}_{4}. According to Part (c) of Claim C.3, as vv^{\prime} is next to uu^{\prime}, we have v=f(u)v^{\prime}=f(u) and then vV1v^{\prime}\in V_{1}^{\prime}.

Now we show that uvu^{\prime}\rightarrow v^{\prime} is (,)(*,\downarrow)-reversible. First, by Definition 3.11, either L(u){i,i}L^{\prime}(u^{\prime})\in\{i^{\uparrow},i^{\downarrow}\} and L(v)=jL^{\prime}(v^{\prime})=j^{\downarrow}, or L(u){j,j}L^{\prime}(u^{\prime})\in\{j^{\uparrow},j^{\downarrow}\} and L(v)=iL^{\prime}(v^{\prime})=i^{\downarrow}. What remains is to show that uvu^{\prime}\rightarrow v^{\prime} is reversible; by Fact 2.5, this is equivalent to showing that f(u)f(u)f^{\ast}(u)\rightarrow f(u) is the unique path from uu^{\prime} to vv^{\prime} in DD^{\prime}. By contradiction, assume that there is a path f(u)w1wkf(u)f^{\ast}(u)\rightarrow w^{\prime}_{1}\rightarrow\cdots\rightarrow w^{\prime}_{k}\rightarrow f(u) in DD^{\prime} where w1f(u)w^{\prime}_{1}\neq f(u) and wkf(u)w^{\prime}_{k}\neq f^{\ast}(u). As w1f(u)w^{\prime}_{1}\neq f(u), the arc (f(u)w1)(f^{\ast}(u)\rightarrow w^{\prime}_{1}) is not in E1E^{\prime}_{1} and hence must be in E2E^{\prime}_{2}, which further implies that ug(w1)u\prec g(w^{\prime}_{1}) in πD\pi_{D}. Similarly, we have g(wk)ug(w^{\prime}_{k})\prec u in πD\pi_{D}. So g(wk)ug(w1)g(w^{\prime}_{k})\prec u\prec g(w^{\prime}_{1}). Meanwhile, for each <k\ell<k, if (ww+1)E1(w^{\prime}_{\ell}\rightarrow w^{\prime}_{\ell+1})\in E^{\prime}_{1}, then g(w)=g(w+1)g(w^{\prime}_{\ell})=g(w^{\prime}_{\ell+1}); if (ww+1)E2(w^{\prime}_{\ell}\rightarrow w^{\prime}_{\ell+1})\in E^{\prime}_{2}, then g(w)g(w+1)g(w^{\prime}_{\ell})\prec g(w^{\prime}_{\ell+1}) in πD\pi_{D}. So it always holds that g(w)g(w+1)g(w^{\prime}_{\ell})\preccurlyeq g(w^{\prime}_{\ell+1}) in πD\pi_{D} for each <k\ell<k. In particular, g(w1)g(wk)g(w_{1}^{\prime})\preccurlyeq g(w^{\prime}_{k}), a contradiction.

\Longleftarrow: Let u:=g(u)u:=g(u^{\prime}) and v:=g(v)v:=g(v^{\prime}). Assume uV2u^{\prime}\notin V_{2}^{\prime} and vV1v^{\prime}\in V_{1}^{\prime}, i.e., u=f(u)u^{\prime}=f(u) and v=f(v)v^{\prime}=f(v). Furthermore, assume L(v)=jL^{\prime}(v^{\prime})=j^{\downarrow}; then v𝒮1v\notin\mathscr{S}_{1} and L(v)=jL(v)=j. We will show that (f(u)f(v))(f(u)\rightarrow f(v)) is not reversible.

Note that (f(u)f(v))(f(u)\rightarrow f(v)) must be in E2E^{\prime}_{2}, and hence uvu\prec v in πD\pi_{D}. By L(u){i,i}L^{\prime}(u^{\prime})\in\{i^{\uparrow},i^{\downarrow}\}, u=f(u)u^{\prime}=f(u), and (2), we have L(u)=iL(u)=i. Thus, (L(u),L(v))=(i,j)ED(L(u),L(v))=(i,j)\in\mathcal{M}\subseteq E_{D}. As DD is a wdag and uvu\prec v in πD\pi_{D}, the arc (uv)(u\rightarrow v) exists in DD. Since v𝒮1v\notin\mathscr{S}_{1}, v𝒱v\notin\mathcal{V}, which means that uvu\rightarrow v is not reversible in DD. According to Fact 2.5, there is a path u=w1w2wkwk+1=vu=w_{1}\rightarrow w_{2}\rightarrow\cdots\rightarrow w_{k}\rightarrow w_{k+1}=v from uu to vv in DD other than the arc uvu\rightarrow v, where ww+1w_{\ell}\prec w_{\ell+1} in πD\pi_{D} and (L(w)=L(w+1))((L(w),L(w+1))ED)\left(L(w_{\ell})=L(w_{\ell+1})\right)\lor\left((L(w_{\ell}),L(w_{\ell+1}))\in E_{D}\right) for each [k]\ell\in[k].

According to the definition of GG^{\mathcal{M}} and (2), one can check that (f(w)f(w+1))E2(f(w_{\ell})\rightarrow f(w_{\ell+1}))\in E^{\prime}_{2} for each [k]\ell\in[k]. Therefore u=f(w1)f(w2)f(wk)f(wk+1)=vu^{\prime}=f(w_{1})\rightarrow f(w_{2})\rightarrow\cdots\rightarrow f(w_{k})\rightarrow f(w_{k+1})=v^{\prime} is a path from uu^{\prime} to vv^{\prime} in DD^{\prime} other than the arc uvu^{\prime}\rightarrow v^{\prime}, which implies that uvu^{\prime}\rightarrow v^{\prime} is not reversible in DD^{\prime} by Fact 2.5. ∎

Having Claims C.3 and C.5, we are ready to show that hh is injective.

Lemma C.6.

hh is injective.

Proof.

Fix D=(V,E,L)𝒟(GD)D=(V,E,L)\in\mathcal{D}(G_{D}) and 𝒮ψ(D)\mathscr{S}\in\psi(D). Let D=(V,E,L)D^{\prime}=(V^{\prime},E^{\prime},L^{\prime}) denote h(D,𝒮)h(D,\mathscr{S}). We show that (D,𝒮)(D,\mathscr{S}) can be recovered from DD^{\prime}, which implies the injectivity of hh.

First, we recover the partition (V1,V2)(V_{1}^{\prime},V_{2}^{\prime}). That is, given a node uVu^{\prime}\in V^{\prime}, we distinguish whether uV1u^{\prime}\in V_{1}^{\prime} or uV2u^{\prime}\in V_{2}^{\prime}. If L(u)[m]L^{\prime}(u^{\prime})\in[m]\setminus\mathcal{M}, then uV1u^{\prime}\in V_{1}^{\prime} according to (2). Otherwise, we have L(u){i,i}L^{\prime}(u^{\prime})\in\{i^{\uparrow},i^{\downarrow}\} for some (i,j)(i,j)\in\mathcal{M}, hence uu^{\prime} is in List(D,i,j)\mathrm{List}^{\prime}(D^{\prime},i,j). Assume the nodes in List(D,i,j)\mathrm{List}^{\prime}(D^{\prime},i,j) are v1v2v3vkv_{1}^{\prime}v_{2}^{\prime}v_{3}^{\prime}\cdots v_{k}^{\prime}. According to Claim C.5, the following procedure distinguishes whether vV1v_{\ell}^{\prime}\in V_{1}^{\prime} or vV2v_{\ell}^{\prime}\in V_{2}^{\prime} for every vList(D,i,j)v_{\ell}^{\prime}\in\mathrm{List}^{\prime}(D^{\prime},i,j), including uu^{\prime}.

Initially, mark that vkV1v_{k}^{\prime}\in V_{1}^{\prime}, and let :=k1\ell:=k-1;
while 1\ell\geq 1 do
      if the arc (vv+1)(v_{\ell}^{\prime}\rightarrow v_{\ell+1}^{\prime}) is (,)(*,\downarrow)-reversible and v+1V1v_{\ell+1}^{\prime}\in V_{1}^{\prime} then
            mark that vV2v_{\ell}^{\prime}\in V_{2}^{\prime};
      else
            mark that vV1v_{\ell}^{\prime}\in V_{1}^{\prime};
      :=1\ell:=\ell-1;
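In code, this backward scan takes the following form (a minimal Python sketch of ours; star_down_reversible is an assumed predicate implementing Definition C.4):

def recover_partition(nodes, star_down_reversible):
    # Backward scan over List'(D', i, j): in_V2[l] is True iff the l-th node
    # (0-indexed) gets marked as belonging to V2'.
    in_V2 = [False] * len(nodes)  # the last node is always marked as V1'
    for l in range(len(nodes) - 2, -1, -1):
        in_V2[l] = (star_down_reversible(nodes[l], nodes[l + 1])
                    and not in_V2[l + 1])
    return in_V2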

Secondly, we can easily recover D=(V,E,L)D=(V,E,L) from DD^{\prime} and (V1,V2)(V_{1}^{\prime},V_{2}^{\prime}). Ignoring labels, it is easy to see that DD is exactly the induced subgraph of DD^{\prime} on V1V_{1}^{\prime}. Along the way, we also recover the function f:VV1f:V\rightarrow V_{1}^{\prime}. For labels, we simply replace each label ii^{\uparrow} or ii^{\downarrow} with ii.

Finally, we recover 𝒮\mathscr{S} from DD^{\prime}, DD and (V1,V2)(V_{1}^{\prime},V_{2}^{\prime}). That is, we distinguish which one of {𝒮1,𝒮2,𝒮3,𝒮4}\{\mathscr{S}_{1},\mathscr{S}_{2},\mathscr{S}_{3},\mathscr{S}_{4}\} contains a given node v(D)v\in\mathscr{M}(D). Assume L(v)=iL(v)=i and (i,j)(i,j)\in\mathcal{M}. Let uu^{\prime} be the node immediately preceding f(v)f(v) in List(D,i,j)\mathrm{List^{\prime}}(D^{\prime},i,j). According to Part (c) of Claim C.3, uV2u^{\prime}\in V_{2}^{\prime} if and only if v𝒮3𝒮4v\in\mathscr{S}_{3}\cup\mathscr{S}_{4}. When v𝒮3𝒮4v\in\mathscr{S}_{3}\cup\mathscr{S}_{4}, v𝒮3v\in\mathscr{S}_{3} if L(u)=jL^{\prime}(u^{\prime})=j^{\uparrow}, and v𝒮4v\in\mathscr{S}_{4} if L(u)=jL^{\prime}(u^{\prime})=j^{\downarrow}. When v𝒮3𝒮4v\notin\mathscr{S}_{3}\cup\mathscr{S}_{4}, v𝒮1v\in\mathscr{S}_{1} if L(f(v))=iL^{\prime}(f(v))=i^{\uparrow}, and v𝒮2v\in\mathscr{S}_{2} if L(f(v))=iL^{\prime}(f(v))=i^{\downarrow}. ∎

Appendix D Proof of Lemma 6.1

Let 𝒆𝒊\bm{e_{i}} denote the vector whose coordinates are all 0 except the ii-th, which equals 1. The following lemma will be used in the proof.

Lemma D.1.

[37] Let GD=([m],ED)G_{D}=([m],E_{D}) be a dependency graph and 𝒑\bm{p} be a probability vector beyond Shearer’s bound. Suppose i,i1,i2,,ik1,ii,i_{1},i_{2},\cdots,i_{k-1},i^{\prime} form a shortest path from ii to ii^{\prime} in GDG_{D}. Then for any qpiq\leq p_{i^{\prime}}, 𝒑q𝒆𝒊+([k1]1pipi)1pipiq𝒆𝒊\bm{p}-q\bm{e_{i^{\prime}}}+\big{(}\prod_{\ell\in[k-1]}\frac{1-p_{i_{\ell}}}{p_{i_{\ell}}}\big{)}\cdot\frac{1-p_{i}}{p_{i^{\prime}}}\cdot q\bm{e_{i}} is also beyond Shearer’s bound.
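For intuition, the mass shift in Lemma D.1 can be written out as a short Python sketch (our own illustration, not part of the proof; vertices are assumed to be indices into a list of probabilities):

def shift_mass(p, path, q):
    # p    : list of event probabilities, indexed by vertex
    # path : a shortest path [i, i_1, ..., i_{k-1}, i_prime] in G_D
    # q    : amount removed at i_prime, with q <= p[i_prime]
    i, inner, i_prime = path[0], path[1:-1], path[-1]
    assert 0 < q <= p[i_prime]
    factor = 1.0
    for j in inner:                     # product of (1 - p_{i_l}) / p_{i_l}
        factor *= (1 - p[j]) / p[j]
    factor *= (1 - p[i]) / p[i_prime]   # remaining factor (1 - p_i) / p_{i'}
    new_p = list(p)
    new_p[i_prime] -= q                 # remove q at i'
    new_p[i] += factor * q              # add the rescaled amount at i
    return new_p

Lemma D.1 asserts that if the input vector is beyond Shearer’s bound, then so is the returned one.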

Without loss of generality, we assume that pipap_{i}-p_{a} is rational for each i[m]i\in[m]. Suppose, for contradiction, that 𝒑\bm{p} is such a vector beyond Shearer’s bound. Let S+:={i[m]:pi>pa}S_{+}:=\{i\in[m]:p_{i}>p_{a}\} and S:={i[m]:pi<pa}S_{-}:=\{i\in[m]:p_{i}<p_{a}\}. Let Δp\Delta_{p} be a positive real number such that the following hold:

  • For each iS+i\in S_{+}, pipa=γiΔpp_{i}-p_{a}=\gamma_{i}\cdot\Delta_{p} for some γi+\gamma_{i}\in\mathbb{N}^{+}. Intuitively, we cut pipap_{i}-p_{a} into γi\gamma_{i} pieces each of size Δp\Delta_{p}. Besides, we call such pieces positive pieces.

  • For each iSi\in S_{-},

    papi=τiK(1papa)d1Δppap_{a}-p_{i}=\tau_{i}\cdot K\cdot\left(\frac{1-p_{a}}{p_{a}}\right)^{d-1}\cdot\frac{\Delta_{p}}{p_{a}}

    for some τi+\tau_{i}\in\mathbb{N}^{+}. Intuitively, we cut papip_{a}-p_{i} into τiK\tau_{i}\cdot K pieces each of size (1papa)d1Δppa\left(\frac{1-p_{a}}{p_{a}}\right)^{d-1}\cdot\frac{\Delta_{p}}{p_{a}}. We call such pieces negative pieces.

We use :={(i,r):iS+,r[γi]}\mathcal{R}:=\{(i,r):i\in S_{+},r\in[\gamma_{i}]\} and 𝒯:={(i,t,k):iS,t[τi],k[K]}\mathcal{T}:=\{(i^{\prime},t,k):i^{\prime}\in S_{-},t\in[\tau_{i^{\prime}}],k\in[K]\} to denote the sets of positive pieces and negative pieces, respectively.

For convenience, let γi=0\gamma_{i}=0 if iS+i\not\in S_{+}, and τi=0\tau_{i}=0 if iSi\not\in S_{-}. Then Condition (c) can be restated as follows: for f(S)=Tf(S)=T, the number of positive pieces in SS is at most the number of negative pieces in TT, i.e.,

(16) iSγiiTτi.\displaystyle\sum_{i\in S}\gamma_{i}\leq\sum_{i^{\prime}\in T}\tau_{i^{\prime}}.

The basic idea of the proof of Lemma 6.1 is simple: for each S𝒮S\in\mathcal{S}, we move the positive pieces in SS to f(S)f(S) such that (i) all the positive pieces in SS are absorbed by the negative pieces in f(S)f(S) and (ii) the resulting probability vector is still beyond Shearer’s bound. Finally, all positive pieces are absorbed, and we obtain a vector strictly smaller than 𝒑\bm{p}. By Lemma D.1, this vector is beyond Shearer’s bound, which yields a contradiction.

For i[m]i^{\prime}\in[m], recall Condition (a), which says that there are at most KK subsets S𝒮S\in\mathcal{S} such that if(S)i^{\prime}\in f(S); we use Si1,Si2,S^{1}_{i^{\prime}},S^{2}_{i^{\prime}},\cdots to denote these subsets. Let g:𝒯g:\mathcal{R}\rightarrow\mathcal{T} be an injection mapping each (i,r)(i,r)\in\mathcal{R} to some (i,t,k)𝒯(i^{\prime},t,k)\in\mathcal{T} satisfying that (i) iSiki\in S^{k}_{i^{\prime}} and (ii)

i0Sik,i0<iγi0+r=i1f(Sik),i1<iτi1+t.\sum_{i_{0}\in S^{k}_{i^{\prime}},i_{0}<i}\gamma_{i_{0}}+r=\sum_{i_{1}\in f(S^{k}_{i^{\prime}}),i_{1}<i^{\prime}}\tau_{i_{1}}+t.

By (16), one can verify that such a mapping gg exists. In addition, according to Condition (b), if g(i,r)=(i,t,k)g(i,r)=(i^{\prime},t,k), then dist(i,i)d\mathrm{dist}(i,i^{\prime})\leq d.
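Condition (ii) pins gg down by rank: the positive pieces of SikS^{k}_{i^{\prime}}, listed by increasing vertex index, are matched to the equally ranked negative pieces of f(Sik)f(S^{k}_{i^{\prime}}). Restricted to a single pair (S,T=f(S))(S,T=f(S)), the pairing can be sketched as follows (a simplified Python illustration of ours that ignores the KK-fold bookkeeping over the index kk):

def pair_pieces(S, T, gamma, tau):
    # Pair the positive pieces of S with the negative pieces of T by rank;
    # inequality (16) guarantees that there are enough negative pieces.
    positives = [(i, r) for i in sorted(S) for r in range(1, gamma.get(i, 0) + 1)]
    negatives = [(j, t) for j in sorted(T) for t in range(1, tau.get(j, 0) + 1)]
    assert len(positives) <= len(negatives)
    return dict(zip(positives, negatives))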

In the following, we will apply Lemma D.1 repeatedly.

Let g0g_{0} be gg, S0S_{0} be SS_{-} and 0\mathcal{R}_{0} be \mathcal{R}. Given an injection gκ:𝒯g_{\kappa}:\mathcal{R}\rightarrow\mathcal{T}, SκS_{\kappa} and κ\mathcal{R}_{\kappa} where dis(i,j)d\text{dis}(i,j)\leq d if gκ(i,r)=(j,t,k)g_{\kappa}(i,r)=(j,t,k), we construct another injection gκ+1:𝒯g_{\kappa+1}:\mathcal{R}\rightarrow\mathcal{T}, Sκ+1S_{\kappa+1} and κ+1\mathcal{R}_{\kappa+1} as follows. There are two possible cases for gκg_{\kappa}, SκS_{\kappa} and κ\mathcal{R}_{\kappa}.

  • (1)

    there exists i,r,j,t,ki,r,j,t,k such that (i,r)κ(i,r)\in\mathcal{R}_{\kappa}, gκ(i,r)=(j,t,k)g_{\kappa}(i,r)=(j,t,k) and there is a shortest path between ii and jj such that no vertex in SκS_{\kappa} is on the path;

  • (2)

    For each gκ(i,r)=(j,t,k)g_{\kappa}(i,r)=(j,t,k) where (i,r)κ(i,r)\in\mathcal{R}_{\kappa} and each shortest path between ii and jj, there is a vertex in SκS_{\kappa} on the path.

For case (1), we let gκ+1=gκg_{\kappa+1}=g_{\kappa}, κ+1=κ{(i,r)}\mathcal{R}_{\kappa+1}=\mathcal{R}_{\kappa}\setminus\{(i,r)\}, and

Sκ+1={jS:there exists i,r,t,k where (i,r)κ+1 such that gκ+1(i,r)=(j,t,k)}.S_{\kappa+1}=\{j\in S_{-}:\text{there exists }i,r,t,k\text{ where }(i,r)\in\mathcal{R}_{\kappa+1}\text{ such that }g_{\kappa+1}(i,r)=(j,t,k)\}.

For case (2), there must be (i1,r1,j1,t1,k1),,(in,rn,jn,tn,kn)(i_{1},r_{1},j_{1},t_{1},k_{1}),\cdots,(i_{n},r_{n},j_{n},t_{n},k_{n}) for some n+n\in\mathbb{N}^{+} such that

  • -

    (i,r)κ(i_{\ell},r_{\ell})\in\mathcal{R}_{\kappa}, jSκj_{\ell}\in S_{\kappa}, gκ(i,r)=(j,t,k)g_{\kappa}(i_{\ell},r_{\ell})=(j_{\ell},t_{\ell},k_{\ell}) for each [n]\ell\in[n],

  • -

    j+1j_{\ell+1} is on a shortest path between ii_{\ell} and jj_{\ell} for each [n1]\ell\in[n-1],

  • -

    j1j_{1} is on a shortest path between ini_{n} and jnj_{n}.

We define the injection F(gκ)F(g_{\kappa}) as follows.

{F(gκ)(in,rn)=(j1,t1,k1),F(gκ)(i,r)=(j+1,t+1,k+1) for each [n1],F(gκ)(i,r)=gκ(i,r) for other (i,r).\displaystyle\begin{cases}F(g_{\kappa})(i_{n},r_{n})&=(j_{1},t_{1},k_{1}),\\ F(g_{\kappa})(i_{\ell},r_{\ell})&=(j_{\ell+1},t_{\ell+1},k_{\ell+1})\text{ for each }\ell\in[n-1],\\ F(g_{\kappa})(i,r)&=g_{\kappa}(i,r)\text{ for other }(i,r).\end{cases}

One can verify that dis(i,j)d\text{dis}(i,j)\leq d if F(gκ)(i,r)=(j,t,k)F(g_{\kappa})(i,r)=(j,t,k) and

N(i,r,j,t,k):gκ(i,r)=(j,t,k)dis(i,j)1+(i,r,j,t,k):F(gκ)(i,r)=(j,t,k)dis(i,j).\displaystyle N\triangleq\sum_{\begin{subarray}{c}(i,r,j,t,k):\\ g_{\kappa}(i,r)=(j,t,k)\end{subarray}}\text{dis}(i,j)\geq 1+\sum_{\begin{subarray}{c}(i,r,j,t,k):\\ F(g_{\kappa})(i,r)=(j,t,k)\end{subarray}}\text{dis}(i,j).

Since NN is bounded, there must be a constant N\ell\leq N and i,r,j,t,ki,r,j,t,k such that (i,r)κ(i,r)\in\mathcal{R}_{\kappa}, F(gκ)(i,r)=(j,t,k)F^{\ell}(g_{\kappa})(i,r)=(j,t,k) and there is a shortest path between ii and jj such that no vertex in SκS_{\kappa} is on the path. Let gκ+1=F(gκ)g_{\kappa+1}=F^{\ell}(g_{\kappa}), κ+1=κ{(i,r)}\mathcal{R}_{\kappa+1}=\mathcal{R}_{\kappa}\setminus\{(i,r)\} and

Sκ+1={jS:there exists i,r,t,k where (i,r)κ+1 such that gκ+1(i,r)=(j,t,k)}.S_{\kappa+1}=\{j\in S_{-}:\text{there exists }i,r,t,k\text{ where }(i,r)\in\mathcal{R}_{\kappa+1}\text{ such that }g_{\kappa+1}(i,r)=(j,t,k)\}.

One can verify that in both cases, gκ+1g_{\kappa+1} is an injection from \mathcal{R} to 𝒯\mathcal{T} and dis(i,j)d\text{dis}(i,j)\leq d if gκ+1(i,r)=(j,t,k)g_{\kappa+1}(i,r)=(j,t,k).

Let gg^{\prime} be g||g_{|\mathcal{R}|}. For each [||]\ell\in[|\mathcal{R}|], let (i,r)(i_{\ell},r_{\ell}) be the unique element in 1\mathcal{R}_{\ell-1}\setminus\mathcal{R}_{\ell}. Let (j,t,k)(j_{\ell},t_{\ell},k_{\ell}) denote g(i,r)g^{\prime}(i_{\ell},r_{\ell}). Thus, we have

  • -

    gg^{\prime} is an injection from \mathcal{R} to 𝒯\mathcal{T},

  • -

    dis(i,j)d\text{dis}(i_{\ell},j_{\ell})\leq d for each [||]\ell\in[|\mathcal{R}|],

  • -

    there is a shortest path between ii_{\ell} and jj_{\ell} such that none of j+1,j+2,,j||j_{\ell+1},j_{\ell+2},\cdots,j_{|\mathcal{R}|} (all of which belong to SS_{\ell}) lies on the path.

For each jSj\in S_{-}, define

ηj=|{(i,r):g(i,r)=(j,t,k) for some t[τj],k[K]}|.\displaystyle\eta_{j}=|\{(i,r):g^{\prime}(i,r)=(j,t,k)\text{ for some }t\in[\tau_{j}],k\in[K]\}|.

Because gg^{\prime} is an injection, we have ηjτjK\eta_{j}\leq\tau_{j}\cdot K. Let

𝒑′′𝒑+jS(Kτjηj)(1pp)d1Δpp𝒆j.\bm{p^{\prime\prime}}\triangleq\bm{p}^{\prime}+\sum_{j\in S_{-}}(K\cdot\tau_{j}-\eta_{j})\cdot\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{\Delta_{p}}{p}\cdot\bm{e}_{j}.

Since 𝒑\bm{p}^{\prime} is beyond Shearer’s bound and ηjKτj\eta_{j}\leq K\cdot\tau_{j} for each jSj\in S_{-}, the vector 𝒑′′\bm{p}^{\prime\prime} is also beyond Shearer’s bound. For each [0,||]\ell\in[0,|\mathcal{R}|], let

𝒑𝒑′′Δp(κ1(𝒆iκ(1pp)d11p𝒆jκ)+𝒆i(1pp)d11p+Δp𝒆j).\displaystyle\bm{p}_{\ell}\triangleq\bm{p}^{\prime\prime}-\Delta_{p}\cdot\left(\sum_{\kappa\leq\ell-1}\left(\bm{e}_{i_{\kappa}}-\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{1}{p}\cdot\bm{e}_{j_{\kappa}}\right)+\bm{e}_{i_{\ell}}-\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{1}{p+\Delta_{p}}\cdot\bm{e}_{j_{\ell}}\right).

Then we have the following claim.

Claim D.2.

For [0,||]\ell\in[0,|\mathcal{R}|], 𝐩\bm{p}_{\ell} is beyond Shearer’s bound.

Proof.

We prove this claim by induction. Obviously, 𝒑0\bm{p}_{0} is beyond Shearer’s bound. In the following, we prove that if 𝒑1\bm{p}_{\ell-1} is beyond Shearer’s bound, then 𝒑\bm{p}_{\ell} is also beyond Shearer’s bound.

Let

𝒒𝒑′′Δpκ1(𝒆iκ(1pp)d11p𝒆jκ).\bm{q}\triangleq\bm{p}^{\prime\prime}-\Delta_{p}\cdot\sum_{\kappa\leq\ell-1}\left(\bm{e}_{i_{\kappa}}-\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{1}{p}\cdot\bm{e}_{j_{\kappa}}\right).

Obviously, 𝒒𝒑1\bm{q}\geq\bm{p}_{\ell-1}. Since 𝒑1\bm{p}_{\ell-1} is beyond Shearer’s bound, 𝒒\bm{q} is also beyond Shearer’s bound. Note that there is a shortest path i,k1,k2,,kn,ji_{\ell},k_{1},k_{2},\cdots,k_{n},j_{\ell} between ii_{\ell} and jj_{\ell} such that j+1,j+2,,j||j_{\ell+1},j_{\ell+2},\cdots,j_{|\mathcal{R}|} are not on the path. Because 𝒒\bm{q} is beyond Shearer’s bound, by Lemma D.1, we have

𝒒𝒒Δp(𝒆i(t[n]1qktqkt)1qi𝒆j)\bm{q}^{\prime}\triangleq\bm{q}-\Delta_{p}\cdot\left(\bm{e}_{i_{\ell}}-\left(\prod_{t\in[n]}\frac{1-q_{k_{t}}}{q_{k_{t}}}\right)\cdot\frac{1}{q_{i_{\ell}}}\cdot\bm{e}_{j_{\ell}}\right)

is also beyond Shearer’s bound. Meanwhile, since (i,r)(i_{\ell},r_{\ell})\in\mathcal{R}, we have

qi=piΔpκ1𝟙(iκ=i)pi(γi1)Δppi+Δp.\displaystyle q_{i_{\ell}}=p^{\prime}_{i_{\ell}}-\Delta_{p}\sum_{\kappa\leq\ell-1}\mathbbm{1}(i_{\kappa}=i_{\ell})\geq p^{\prime}_{i_{\ell}}-(\gamma_{i_{\ell}}-1)\Delta_{p}\geq p_{i_{\ell}}+\Delta_{p}.

For each t[n]t\in[n], if ktSk_{t}\not\in S_{-}, we have qktpq_{k_{t}}\geq p. Otherwise, ktSk_{t}\in S_{-}, and ktjκk_{t}\neq j_{\kappa} for each κ\kappa\geq\ell. Thus, we have κ1𝟙(jκ=kt)=ηkt\sum_{\kappa\leq\ell-1}\mathbbm{1}(j_{\kappa}=k_{t})=\eta_{k_{t}}. Therefore,

qkt\displaystyle q_{k_{t}} =pkt+(Kτktηkt)(1pp)d1Δpp+κ1𝟙(jκ=kt)(1pp)d1Δpp\displaystyle=p^{\prime}_{k_{t}}+(K\cdot\tau_{k_{t}}-\eta_{k_{t}})\cdot\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{\Delta_{p}}{p}+\sum_{\kappa\leq\ell-1}\mathbbm{1}(j_{\kappa}=k_{t})\cdot\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{\Delta_{p}}{p}
=pkt+(Kτktηkt)(1pp)d1Δpp+ηkt(1pp)d1Δpp=p.\displaystyle=p^{\prime}_{k_{t}}+(K\cdot\tau_{k_{t}}-\eta_{k_{t}})\cdot\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{\Delta_{p}}{p}+\eta_{k_{t}}\cdot\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{\Delta_{p}}{p}=p.

By dis(i,j)d\text{dis}(i_{\ell},j_{\ell})\leq d (so nd1n\leq d-1), qip+Δpq_{i_{\ell}}\geq p+\Delta_{p}, qktpq_{k_{t}}\geq p for each t[n]t\in[n], and 1pp1\frac{1-p}{p}\geq 1, we have

(t[n]1qktqkt)1qi(1pp)d11p+Δp.\left(\prod_{t\in[n]}\frac{1-q_{k_{t}}}{q_{k_{t}}}\right)\cdot\frac{1}{q_{i_{\ell}}}\leq\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{1}{p+\Delta_{p}}.

Thus, since 𝒒\bm{q}^{\prime} is beyond Shearer’s bound and

𝒑=𝒒Δp(𝒆i(1pp)d11p+Δp𝒆j)\bm{p}_{\ell}=\bm{q}-\Delta_{p}\cdot\left(\bm{e}_{i_{\ell}}-\left(\frac{1-p}{p}\right)^{d-1}\cdot\frac{1}{p+\Delta_{p}}\cdot\bm{e}_{j_{\ell}}\right)

satisfies 𝒑𝒒\bm{p}_{\ell}\geq\bm{q}^{\prime}, the vector 𝒑\bm{p}_{\ell} is also beyond Shearer’s bound. ∎

Thus, 𝒑||\bm{p}_{|\mathcal{R}|} is beyond Shearer’s bound. It is easy to verify that 𝒑||<𝒑\bm{p}_{|\mathcal{R}|}<\bm{p}, which contradicts the assumption that 𝒑\bm{p} lies on Shearer’s boundary. ∎

Appendix E Missing Part in the Proof of Theorem 6.2

Proof of Claim 6.3.

Observe that for each (v,v)EUk(v,v^{\prime})\in E_{U}^{k}, if (v,v)(v,v^{\prime})\notin\mathcal{M}, then one of its neighboring edges (v0,v1)(v_{0},v_{1}) is in TkT_{k} and satisfies δv,vδv0,v1\delta_{v,v^{\prime}}\leq\delta_{v_{0},v_{1}}. Here, two edges are called neighboring if they share a common vertex. Besides, note that each edge has at most 2Δ2\Delta neighboring edges. So

(17) (v0,v1)Tkδv0,v1212Δ(v,v)EUkδv,v2.\displaystyle\sum_{(v_{0},v_{1})\in T_{k}}\delta_{v_{0},v_{1}}^{2}\geq\frac{1}{2\Delta}\sum_{(v,v^{\prime})\in E_{U}^{k}}\delta_{v,v^{\prime}}^{2}.

Moreover, according to Lemmas 4.2 and 4.5, together with the Cauchy–Schwarz inequality for the first step, we have

(18) (v,v)EUkδv,v21|EUk|((v,v)EUkδv,v)2|VUk|Δ2|EUk|(ϝ+(GB(GD),𝒑))2.\displaystyle\sum_{(v,v^{\prime})\in E_{U}^{k}}\delta_{v,v^{\prime}}^{2}\geq\frac{1}{|E_{U}^{k}|}\cdot\left(\sum_{(v,v^{\prime})\in E_{U}^{k}}\delta_{v,v^{\prime}}\right)^{2}\geq\frac{|V_{U}^{k}|\cdot\Delta^{2}}{|E_{U}^{k}|}\cdot\left(\digamma^{+}(G_{B}(G_{D}),\bm{p})\right)^{2}.

Combining Inequalities (17) and (18) with the fact that 2|EUk||VUk|Δ2|E_{U}^{k}|\leq|V_{U}^{k}|\Delta finishes the proof. ∎

Let K:=(Δ+1)|VU|K:=(\Delta+1)|V_{U}|, d:=D+2d:=D+2, 𝒮:={VU1,VU2,}\mathcal{S}:=\{V_{U}^{1},V_{U}^{2},\cdots\}, and f(VUk):=Tkf(V_{U}^{k}):=T_{k}. In the following, we check that all three conditions in Lemma 6.1 hold.

Condition (a). That is, we want to show |{k:Tkv}|(Δ+1)|VU||\{k:T_{k}\ni v\}|\leq(\Delta+1)|V_{U}| for each vVDv\in V_{D}. Observe that if vTkv\in T_{k}, then v𝒩+(VUk)v\in\mathcal{N}^{+}(V_{U}^{k}). So

|{k:Tkv}||{k:𝒩+(VUk)v}||{k:𝒩+(v)VUk}|v𝒩+(v)|{k:VUkv}|(Δ+1)|VU|.|\{k:T_{k}\ni v\}|\leq|\{k:\mathcal{N}^{+}(V_{U}^{k})\ni v\}|\leq|\{k:\mathcal{N}^{+}(v)\cap V_{U}^{k}\neq\emptyset\}|\leq\sum_{v^{\prime}\in\mathcal{N}^{+}(v)}|\{k:V_{U}^{k}\ni v^{\prime}\}|\leq(\Delta+1)\cdot|V_{U}|.

The last inequality uses the fact that h(k,u)h(k,u)h(k^{\prime},u)\neq h(k,u) if kkk\neq k^{\prime}, so each vertex vv^{\prime} belongs to VUkV_{U}^{k} for at most |VU||V_{U}| values of kk.

Condition (b). That is, we want to show dist(v,v)D+2\mathrm{dist}(v,v^{\prime})\leq D+2 for any vVUkv\in V_{U}^{k} and vTkv^{\prime}\in T_{k}. This is obvious, because if vTkv^{\prime}\in T_{k}, then v𝒩+(VUk)v^{\prime}\in\mathcal{N}^{+}(V_{U}^{k}).

Condition (c). We verify that

(19) (1papa)D+1KpaiSmax{pipa,0}iTmax{papi,0}.\displaystyle\left(\frac{1-p_{a}}{p_{a}}\right)^{D+1}\cdot\frac{K}{p_{a}}\cdot\sum_{i\in S}\max\{p_{i}-p_{a},0\}\leq\sum_{i\in T}\max\{p_{a}-p_{i},0\}.

On the one hand, noting that max{(1+ε)pvpa,0}max{(1+ε)pvpa,0}q\max\{(1+\varepsilon)p^{-}_{v}-p_{a},0\}\leq\max\{(1+\varepsilon)p_{v}-p_{a},0\}\leq q, we have

(20) L.H.S of (19)(1papa)D+1(Δ+1)|VU|2paq.\displaystyle\text{L.H.S of }(\ref{eq:conditionc})\leq\left(\frac{1-p_{a}}{p_{a}}\right)^{D+1}\cdot\frac{(\Delta+1)|V_{U}|^{2}}{p_{a}}\cdot q.

On the other hand, observe that

max{pa(1+ε)pv,0}\displaystyle\max\{p_{a}-(1+\varepsilon)p^{-}_{v},0\} pa(1+ε)pv=(pa+q(1+ε)pv)q(1+ε)(pvpv)q\displaystyle\geq p_{a}-(1+\varepsilon)p^{-}_{v}=(p_{a}+q-(1+\varepsilon)p^{-}_{v})-q\geq(1+\varepsilon)(p_{v}-p_{v}^{-})-q
(pvpv)q,\displaystyle\geq(p_{v}-p_{v}^{-})-q,

where the second inequality is due to the assumption that (1+ε)𝒑(pa+q,,pa+q)(1+\varepsilon)\bm{p}\leq(p_{a}+q,\cdots,p_{a}+q), and the last one holds since ε0\varepsilon\geq 0. Then

R.H.S of (19)\displaystyle\text{R.H.S of }(\ref{eq:conditionc})\geq (vVUk(pvpv))|𝒩+(VUk)|q217((v0,v1)Tkδv0,v12)Δ|VU|q\displaystyle\left(\sum_{v\in V_{U}^{k}}(p_{v}-p_{v}^{-})\right)-|\mathcal{N}^{+}(V_{U}^{k})|q\geq\frac{2}{17}\left(\sum_{(v_{0},v_{1})\in T_{k}}\delta_{v_{0},v_{1}}^{2}\right)-\Delta|V_{U}|q
(21) \displaystyle\geq 217(ϝ+(GB(GD),𝒑))2Δ|VU|q.\displaystyle\frac{2}{17}\left(\digamma^{+}(G_{B}(G_{D}),\bm{p})\right)^{2}-\Delta|V_{U}|q.

Putting Inequalities (20) and (21) together and noting that 1papa1\frac{1-p_{a}}{p_{a}}\geq 1, one can verify that Condition (c) holds whenever qq is sufficiently small, which completes the proof. ∎