Erdős–Selfridge Theorem for Nonmonotone CNFs
Abstract
In an influential paper, Erdős and Selfridge introduced the Maker-Breaker game played on a hypergraph, or equivalently, on a monotone CNF. The players take turns assigning values to variables of their choosing, and Breaker’s goal is to satisfy the CNF, while Maker’s goal is to falsify it. The Erdős–Selfridge Theorem says that the least number of clauses in any monotone CNF with $k$ literals per clause where Maker has a winning strategy is $\Theta(2^k)$.
We study the analogous question when the CNF is not necessarily monotone. We prove bounds of $\Theta(\sqrt{2}^{\,k})$ when Maker plays last, and $\Omega(1.5^k)$ and $O(r^k)$ when Breaker plays last, where $r \approx 1.618$ is the golden ratio.
1 Introduction
In 1973, Erdős and Selfridge published a paper [ES73] with several fundamental contributions, including:
-
Being widely regarded as the genesis of the method of conditional expectations. The subsequent impact of this method on theoretical computer science needs no explanation.
-
Introducing the so-called Maker-Breaker game, variants of which have since been studied in numerous papers in the combinatorics literature.
We revisit that seminal work and steer it in a new direction. The main theorem from [ES73] can be phrased in terms of CNFs (conjunctive normal form boolean formulas) that are monotone (they contain only positive literals). We investigate what happens for general CNFs, which may contain negative literals. We feel that the influence of Erdős–Selfridge and the pervasiveness of CNFs in theoretical computer science justify this question as inherently worthy of attention. Our pursuit of the answer uncovers new techniques and invites the development of further techniques to achieve a full resolution in the future.
In the Maker-Breaker game played on a monotone CNF, the eponymous players take turns assigning boolean values to variables of their choosing. Breaker wins if the CNF gets satisfied, and Maker wins otherwise; there are no draws. Since the CNF is monotone, Breaker might as well assign $1$ to every variable she picks, and Maker might as well assign $0$ to every variable he picks. In the generalization to nonmonotone CNFs, each player can pick which remaining variable and which bit to assign it during their turn. To distinguish this general game, we rename Breaker as T (for “true”) and Maker as F (for “false”). The computational complexity of deciding which player has a winning strategy has been studied in [Sch76, Sch78, Bys04, Kut04, Kut05, AO12, RW20a, RW20b, RW21].
A CNF is $k$-uniform when every clause has exactly $k$ literals (corresponding to $k$ distinct variables). The Erdős–Selfridge Theorem answers an extremal question: How few clauses can there be in a $k$-uniform monotone CNF that Maker can win? It depends a little on which player gets the opening move: $2^k$ if Breaker plays first, and $2^{k-1}$ if Maker plays first. The identity of the player with the final move doesn’t affect the answer for monotone CNFs. In contrast, “who gets the last laugh” matters a lot for general CNFs:
Theorem 1 (informal).
If F plays last, then the least number of clauses in any $k$-uniform CNF where F has a winning strategy is $\Theta(\sqrt{2}^{\,k})$.
Theorem 2 (informal).
If T plays last, then the least number of clauses in any $k$-uniform CNF where F has a winning strategy is $\Omega(1.5^k)$ and $O(\varphi^k)$, where $\varphi = (1+\sqrt{5})/2 \approx 1.618$ is the golden ratio.
The most involved proof is the lower bound in Theorem 2. We conjecture the correct bound is $\Theta(\varphi^k)$.
2 Results
In the unordered CNF game, there is a CNF $D$ and a set $X$ of variables containing all variables that appear in $D$ and possibly more. The players T and F alternate turns; each turn consists of picking an unassigned variable from $X$ and picking a value $0$ or $1$ to assign it. (This game is called “unordered” to contrast it with the related TQBF game, in which the variables must be played in a prescribed order.) The game ends when all variables are assigned; T wins if $D$ is satisfied (every clause has a true literal), and F wins if $D$ is unsatisfied (some clause has all false literals). There are four possible patterns according to “who goes first” and “who goes last.” If the same player has the first and last moves, then $|X|$ is odd, and if different players have the first and last moves, then $|X|$ is even.
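To make the game definition concrete, here is a minimal brute-force sketch (not from the paper) of deciding which player wins a given instance. The encoding of a literal as a (variable, bit) pair and all function names are assumptions of this illustration, and the search is exponential, so it is only usable for tiny instances.

```python
from functools import lru_cache
from itertools import product

# Brute-force solver for the unordered CNF game (exponential; tiny instances only).
# A literal is a pair (var, bit) meaning "var = bit"; a clause is a frozenset of
# literals; the CNF is a frozenset of clauses; `unassigned` is a frozenset of variables.

@lru_cache(maxsize=None)
def t_wins(cnf, unassigned, t_to_move):
    """Return True iff T (the satisfier) wins with optimal play from this position."""
    if frozenset() in cnf:
        return False                 # some clause has all of its literals falsified: F wins
    if not cnf or not unassigned:
        return True                  # no clause can be falsified any more: T wins
    outcomes = []
    for v, b in product(unassigned, (0, 1)):
        residual = frozenset(
            frozenset(lit for lit in c if lit != (v, 1 - b))   # delete the falsified literal
            for c in cnf if (v, b) not in c)                   # drop satisfied clauses
        outcomes.append(t_wins(residual, unassigned - {v}, not t_to_move))
    return any(outcomes) if t_to_move else all(outcomes)

# Example: D = (x1 or not x2) and (x2 or x3) with X = {x1, x2, x3}, T moving first.
D = frozenset({frozenset({("x1", 1), ("x2", 0)}), frozenset({("x2", 1), ("x3", 1)})})
print(t_wins(D, frozenset({"x1", "x2", "x3"}), True))   # True: T can satisfy D
```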
Definition 1.
For $k \ge 1$ and $\circ, \bullet \in \{\mathrm{T}, \mathrm{F}\}$, we let $M_{\circ\bullet}(k)$ be the minimum number of clauses in $D$, over all unordered CNF game instances $(D, X)$ where $D$ is $k$-uniform and F has a winning strategy when player $\circ$ has the first move and player $\bullet$ has the last move.
Theorem 1 (formal).
$M_{TF}(k) = \sqrt{2}^{\,k}$ for even $k$, and $M_{TF}(k) = \sqrt{2}^{\,k+1}$ for odd $k$.
Let $F_n$ denote the $n$th Fibonacci number. It is well-known that $F_n = \Theta(\varphi^n)$ where $\varphi = (1+\sqrt{5})/2 \approx 1.618$ is the golden ratio.
Theorem 2 (formal).
$1.5^k \le M_{TT}(k) \le F_{k+2}$ for all $k$.
Observation 1.
$M_{T\bullet}(k-1) = M_{F\bullet}(k)$ for all $k \ge 2$ and $\bullet \in \{\mathrm{T}, \mathrm{F}\}$.
Proof.
$\ge$: Suppose F wins $(D, X)$ when T moves first, where $D$ is $(k-1)$-uniform. Then F wins $(D', X \cup \{z\})$ when F moves first, where $z$ is a fresh variable (not already in $X$) and $D'$ is the same as $D$ but with the literal $z$ added to each clause. F’s winning strategy is to play $z := 0$ first and then use the winning strategy for $(D, X)$. Note that $D'$ is $k$-uniform and has the same number of clauses as $D$.
$\le$: Suppose F wins $(D, X)$ when F moves first, where $D$ is $k$-uniform. Say the opening move in F’s winning strategy is $\ell := 0$ (i.e., it makes the literal $\ell$ false), where $\ell$ is some literal. Obtain $D'$ from $D$ by removing each clause containing $\bar\ell$, removing $\ell$ from each clause containing $\ell$, and removing an arbitrary literal from each clause containing neither $\ell$ nor $\bar\ell$. Then F wins $(D', X')$ when T moves first, where $X'$ is $X$ minus the variable of $\ell$; moreover $D'$ is $(k-1)$-uniform and has at most as many clauses as $D$.
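The two directions of Observation 1 are simple syntactic transformations on CNFs. The sketch below shows both: padding every clause with a fresh variable, and peeling off F's opening move. It uses the same assumed (variable, bit) literal encoding as the earlier sketch, and the helper names are illustrative, not from the paper.

```python
# Sketch of the two CNF transformations behind Observation 1 (names are illustrative).
# A clause is a frozenset of (var, bit) literals; a CNF is a list of clauses.

def add_fresh_variable(cnf, z):
    """k-uniform -> (k+1)-uniform: add the positive literal z to every clause.
    F's new strategy: open with z := 0, then follow the old winning strategy."""
    return [clause | {(z, 1)} for clause in cnf]

def peel_opening_move(cnf, ell):
    """k-uniform -> (k-1)-uniform, given that F's opening move makes literal `ell` false.
    Clauses containing the complement of `ell` are satisfied by that move and are dropped;
    `ell` is deleted where it occurs; otherwise an arbitrary literal is dropped."""
    var, bit = ell
    comp = (var, 1 - bit)
    out = []
    for clause in cnf:
        if comp in clause:
            continue                                   # already satisfied by F's opening move
        if ell in clause:
            out.append(clause - {ell})                 # loses the falsified literal
        else:
            out.append(clause - {next(iter(clause))})  # drop an arbitrary literal
    return out
```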
Corollary 1.
-
$M_{FF}(k) = \sqrt{2}^{\,k-1}$ for odd $k$, and $M_{FF}(k) = \sqrt{2}^{\,k}$ for even $k$.
-
$1.5^{k-1} \le M_{FT}(k) \le F_{k+1}$ for all $k$.
(Observation 1 requires $k \ge 2$, but the bounds in Corollary 1 also hold for $k = 1$ since $M_{FF}(1) = M_{FT}(1) = 1$.)
3 Upper bounds
In this section, we prove the upper bounds of Theorem 1 and Theorem 2 by giving examples of game instances with few clauses where F wins. In [ES73], Erdős and Selfridge proved the upper bound for the Maker-Breaker game by showing a $k$-uniform monotone CNF with $2^k$ clauses where Maker (F) wins. The basic idea is that F can win on the following formula, which is not a CNF:
$$(x_1 \wedge y_1) \vee (x_2 \wedge y_2) \vee \cdots \vee (x_k \wedge y_k).$$
Whenever T plays a variable, F responds by assigning $0$ to the paired variable. By the distributive law, this expands to a $k$-uniform monotone CNF with $2^k$ clauses. We study nonmonotone CNFs, which may have both positive and negative literals.
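As a small illustration (with assumed variable names, not taken verbatim from [ES73]), expanding the paired formula by the distributive law amounts to choosing one variable from each pair; the following sketch generates that monotone CNF and confirms the clause count.

```python
from itertools import product

def erdos_selfridge_cnf(k):
    """Monotone k-uniform CNF obtained by distributing (x1 & y1) | ... | (xk & yk):
    one clause per way of choosing x_i or y_i from each pair, 2^k clauses in total."""
    pairs = [(f"x{i}", f"y{i}") for i in range(1, k + 1)]
    return [frozenset(choice) for choice in product(*pairs)]

cnf = erdos_selfridge_cnf(3)
print(len(cnf))                             # 8 = 2^3 clauses, each of width 3
print(sorted(sorted(c) for c in cnf)[0])    # e.g. ['x1', 'x2', 'x3']
```

F's pairing strategy guarantees that each pair ends with at least one variable assigned $0$; the clause that selects such a variable from every pair is then falsified.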
3.1 F plays last
Lemma 1.
$M_{TF}(k) \le \sqrt{2}^{\,k}$ for even $k$.
Proof.
F can win on the following formula, which is not a CNF, with variables $x_1, y_1, \ldots, x_{k/2}, y_{k/2}$:
$$(x_1 \oplus y_1) \vee (x_2 \oplus y_2) \vee \cdots \vee (x_{k/2} \oplus y_{k/2}).$$
Whenever T plays a variable, F responds by playing the paired variable to make them equal. To convert this formula to an equivalent CNF, first replace each $x_i \oplus y_i$ with $(x_i \vee y_i) \wedge (\bar x_i \vee \bar y_i)$. Then by the distributive law, this expands to a $k$-uniform CNF $D$ where one clause is
$$x_1 \vee y_1 \vee x_2 \vee y_2 \vee \cdots \vee x_{k/2} \vee y_{k/2}$$
and for each $i$, each clause contains either $x_i \vee y_i$ or $\bar x_i \vee \bar y_i$. Therefore $D$ has $2^{k/2} = \sqrt{2}^{\,k}$ clauses: one clause for each subset of $\{1, \ldots, k/2\}$ (the set of indices $i$ where the clause uses $\bar x_i \vee \bar y_i$). F wins the game on $D$, with $X$ consisting of exactly these $k$ variables.
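Under the reconstruction above, the CNF of Lemma 1 is obtained by choosing, for each pair, either its positive half $x_i \vee y_i$ or its negative half $\bar x_i \vee \bar y_i$. A quick sketch, using the same assumed encoding as before:

```python
from itertools import product

def lemma1_cnf(k):
    """k-uniform CNF (k even) with sqrt(2)^k clauses: each clause takes, for every pair i,
    either the positive half {x_i, y_i} or the negative half {not x_i, not y_i}."""
    assert k % 2 == 0
    halves = []
    for i in range(1, k // 2 + 1):
        pos = ((f"x{i}", 1), (f"y{i}", 1))
        neg = ((f"x{i}", 0), (f"y{i}", 0))
        halves.append((pos, neg))
    return [frozenset(lit for half in choice for lit in half)
            for choice in product(*halves)]

D = lemma1_cnf(6)
print(len(D), all(len(c) == 6 for c in D))   # 8 clauses (= sqrt(2)^6), all of width 6
```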
Lemma 2.
$M_{TF}(k) \le \sqrt{2}^{\,k+1}$ for odd $k$.
Proof.
Suppose $D'$ is the $(k-1)$-uniform CNF with $\sqrt{2}^{\,k-1}$ clauses from Lemma 1 (since $k-1$ is even). We take two copies of $D'$, and put a new variable $z$ in each clause of one copy, and a new variable $z'$ in each clause of the other copy. Call this $D$. Formally:
$$D = \{C \vee z : C \in D'\} \cup \{C \vee z' : C \in D'\}.$$
We argue F wins in $(D, X)$, where $X$ consists of the variables of $D'$ together with $z$ and $z'$. If T plays $z$ or $z'$, F responds by assigning $0$ to the other one. For other variables, F follows his winning strategy for $D'$ from Lemma 1. Since $D'$ is a $(k-1)$-uniform CNF with $\sqrt{2}^{\,k-1}$ clauses, $D$ is a $k$-uniform CNF with $2\sqrt{2}^{\,k-1} = \sqrt{2}^{\,k+1}$ clauses.
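For odd $k$, the construction reconstructed above pads two copies of the even-width CNF with two fresh variables. Reusing lemma1_cnf from the previous sketch (the names z and zp are placeholders):

```python
def lemma2_cnf(k):
    """k-uniform CNF (k odd) with 2 * sqrt(2)^(k-1) clauses: two copies of the Lemma 1
    CNF on the same variables, one padded with fresh variable z, the other with zp."""
    assert k % 2 == 1
    base = lemma1_cnf(k - 1)
    return ([c | {("z", 1)} for c in base] +
            [c | {("zp", 1)} for c in base])

D = lemma2_cnf(7)
print(len(D), all(len(c) == 7 for c in D))   # 16 clauses (= sqrt(2)^8), all of width 7
```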
3.2 T plays last
Before proving Lemma 3, we give some intuition. We already know that F wins on
Now replace each with , which is equivalent. This does not change the function expressed by the formula, so F still wins this T F game. To turn it into a T T game, we can introduce a dummy variable . Since the game is equivalent to a monotone game, neither player has any incentive to play , so F still wins this T T game.
If we convert it to a CNF, then by the distributive law it will again have clauses. But this CNF is not uniform—each clause has at least literals and at most literals. We can do a similar construction that balances the CNF to make it uniform. This intuitively suggests that .
Lemma 3.
$M_{TT}(k) \le F_{k+2}$.
Proof.
For every we recursively define a -uniform CNF on variables , where if , and (these , are different than in Section 3.1):
-
:
-
:
-
:
Now we argue F wins in . F’s strategy is to assign to at least one variable from each pair . Whenever T plays from a pair, F responds by assigning to the other variable. After T plays , F picks a fresh pair where is odd and assigns one of them , then “chases” T until T plays the other from . Here the “chase” means whenever T plays from a fresh pair, F responds by assigning to the other variable in that pair. After T returns to , then F picks another fresh pair to start another chase, and so on in phases. We prove by induction on that this strategy ensures is unsatisfied:
-
: is obviously unsatisfied.
-
: is obviously unsatisfied.
-
: By induction, both and are unsatisfied. Now is unsatisfied since: By F’s strategy, at least one of is assigned . If then one of the clauses of that came from is unsatisfied. If and then one of the clauses of that came from is unsatisfied.
Letting $c(k)$ represent the number of clauses in $D_k$, we argue by induction on $k$ that $c(k) = F_{k+2}$:
-
$k = 1$: $c(1) = 2 = F_3$.
-
$k = 2$: $c(2) = 3 = F_4$.
-
$k \ge 3$: By induction, $c(k-1) = F_{k+1}$ and $c(k-2) = F_k$. So $c(k) = c(k-1) + c(k-2) = F_{k+1} + F_k = F_{k+2}$.
Therefore $M_{TT}(k) \le c(k) = F_{k+2}$.
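Under the clause counts reconstructed above ($c(1) = 2$, $c(2) = 3$, $c(k) = c(k-1) + c(k-2)$), a quick numeric check confirms that the count tracks $\varphi^k$:

```python
PHI = (1 + 5 ** 0.5) / 2

def clause_count(k):
    """c(k) from the recurrence c(k) = c(k-1) + c(k-2) with c(1) = 2, c(2) = 3."""
    a, b = 2, 3
    if k == 1:
        return a
    for _ in range(k - 2):
        a, b = b, a + b
    return b

for k in (5, 10, 20):
    print(k, clause_count(k), round(clause_count(k) / PHI ** k, 3))   # ratio ~ 1.17
```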
4 Lower bounds
4.1 Notation
In the proofs, we will define a potential value $\phi(C)$ for each clause $C$. The value of $\phi(C)$ depends on the context. If $D$ is a CNF (any set of clauses), then the potential of $D$ is $\Phi(D) = \sum_{C \in D} \phi(C)$. The potential of a literal $\ell$ with respect to $D$ is defined as $\Phi_\ell(D) = \sum_{C \in D \,:\, \ell \in C} \phi(C)$. When we have a particular $D$ in mind, we can abbreviate $\Phi_\ell(D)$ as $\Phi_\ell$.
Suppose $D$ is a CNF and $\ell, r$ are two literals on distinct variables. We partition the clauses of $D$ into nine groups according to whether the clause contains $\ell$, contains $\bar\ell$, or contains neither, and whether it contains $r$, contains $\bar r$, or contains neither, and we write the potential of each group with the corresponding subscript. For example, $\Phi_{\ell r}$ is the sum of the potentials of clauses in $D$ that contain both $\ell$ and $r$. We can abbreviate these quantities in contexts where we have a particular $D$ in mind. Note that $\Phi_\ell$ is the sum of the three group potentials whose clauses contain $\ell$, and similarly for $\Phi_{\bar\ell}$, $\Phi_r$, and $\Phi_{\bar r}$.
When we assign $\ell := 1$ (i.e., assign $x := 1$ if $\ell$ is the literal $x$, or assign $x := 0$ if $\ell$ is the literal $\bar x$), $D$ becomes the residual CNF, where all clauses containing $\ell$ get removed, and the literal $\bar\ell$ gets removed from the remaining clauses.
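The potential bookkeeping is easy to express in code. The sketch below uses the same assumed encoding as earlier; the per-clause weight phi is a parameter (e.g. the classical Erdős–Selfridge weight 2**-width, or the variants used in the proofs below), and it computes $\Phi(D)$, $\Phi_\ell$, and the residual CNF.

```python
def potential(cnf, phi):
    """Phi(D): sum of the per-clause weights phi(C) over all clauses of D."""
    return sum(phi(clause) for clause in cnf)

def literal_potential(cnf, lit, phi):
    """Phi_lit(D): total weight of the clauses of D that contain the literal lit."""
    return sum(phi(clause) for clause in cnf if lit in clause)

def residual(cnf, lit):
    """Assign lit true: drop clauses containing lit, delete its complement elsewhere."""
    var, bit = lit
    comp = (var, 1 - bit)
    return [clause - {comp} for clause in cnf if lit not in clause]
```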
4.2 F plays last
Lemma 4.
$M_{TF}(k) \ge \sqrt{2}^{\,k}$ for even $k$.
Proof.
Consider any game instance $(D, X)$ where $D$ is a $k$-uniform CNF with fewer than $\sqrt{2}^{\,k}$ clauses and $|X|$ is even. We show T has a winning strategy. In this proof, we use $\phi(C) = \sqrt{2}^{\,-w(C)}$, where $w(C)$ denotes the width (number of literals) of $C$. A round consists of a T move followed by an F move.
Claim 1.
In every round, there exists a move for T such that for every response by F, we have $\Phi(D') \le \Phi(D)$, where $D$ denotes the residual CNF before the round and $D'$ the residual CNF after the round.
At the beginning we have $\phi(C) = \sqrt{2}^{\,-k}$ for each clause $C$, so $\Phi(D) < \sqrt{2}^{\,k} \cdot \sqrt{2}^{\,-k} = 1$. By Claim 1, T has a strategy guaranteeing that $\Phi < 1$ also holds for the residual CNF after all variables have been played. If this final residual CNF contained a clause, the clause would be empty and have potential $\sqrt{2}^{\,0} = 1$, which would imply $\Phi \ge 1$. Thus the final residual CNF must have no clauses, which means $D$ got satisfied and T won. This concludes the proof of Lemma 4, except for the proof of Claim 1.
Proof of Claim 1.
Let $D$ be the residual CNF at the beginning of a round. T picks a literal $\ell$ maximizing $\Phi_\ell$ and plays $\ell := 1$. (It is perhaps counterintuitive that T’s strategy ignores the effect of clauses that contain $\bar\ell$, which increase in potential after playing $\ell := 1$. A more intuitive strategy would be to pick a literal maximizing the overall decrease in potential from playing it; this strategy also works but is trickier to analyze.) Suppose F responds with a move that makes some literal $r$ true, and let $D'$ be the residual CNF after F’s move. Letting the notation be with respect to $D$, the potential of each clause of $D$ changes as follows:
-
Clauses containing $\ell$ or $r$ (or both) are satisfied and removed, so their potential gets multiplied by $0$.
-
Clauses containing both $\bar\ell$ and $\bar r$ (and neither $\ell$ nor $r$) each shrink by two literals, so their potential gets multiplied by $2$.
-
Clauses containing exactly one of $\bar\ell, \bar r$ (and neither $\ell$ nor $r$) each shrink by one literal, so their potential gets multiplied by $\sqrt{2}$.
By the choice of $\ell$, we have $\Phi_\ell \ge \Phi_{\bar\ell}$ and $\Phi_\ell \ge \Phi_{\bar r}$ with respect to $D$; in other words, the potential removed by T’s move is at least the potential of the clauses containing $\bar\ell$ and at least the potential of the clauses containing $\bar r$. Thus $\Phi(D') \le \Phi(D)$, because the total increase in potential from the shrinking clauses is at most $(\sqrt{2}-1)\,\Phi_{\bar\ell} + (\sqrt{2}-1)\,\Phi_{\bar r} + (3-2\sqrt{2})\,\Phi_{\bar\ell\bar r} \le \big(2(\sqrt{2}-1) + (3-2\sqrt{2})\big)\,\Phi_\ell = \Phi_\ell$, which is at most the potential of the clauses removed by T’s move.
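One round of T's strategy from Claim 1 can be sketched as follows, reusing the helpers from the sketch in Section 4.1; the weight $\sqrt{2}^{\,-w}$ is the one reconstructed above for Lemma 4, and the function names are illustrative.

```python
from math import sqrt

def phi_sqrt2(clause):
    """Per-clause weight sqrt(2)^(-width), as reconstructed for the proof of Lemma 4."""
    return sqrt(2) ** (-len(clause))

def t_greedy_move(cnf, unassigned):
    """T's move in Claim 1: make true the literal ell with maximum Phi_ell."""
    literals = [(v, b) for v in unassigned for b in (0, 1)]
    return max(literals, key=lambda lit: literal_potential(cnf, lit, phi_sqrt2))

# A round: T plays t_greedy_move(...), F replies with any literal f_lit; Claim 1 asserts
# potential(residual(residual(cnf, t_lit), f_lit), phi_sqrt2) <= potential(cnf, phi_sqrt2).
```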
Note: It did not matter whether $k$ is even or odd! Lemma 4 is true for any $k$. Lemma 5 actually uses oddness of $k$. The main idea is to exploit the slack that appeared at the end of the proof of Claim 1.
Lemma 5.
$M_{TF}(k) \ge \sqrt{2}^{\,k+1}$ for odd $k$.
Proof.
Consider any game instance $(D, X)$ where $D$ is a $k$-uniform CNF with fewer than $\sqrt{2}^{\,k+1}$ clauses and $|X|$ is even. We show T has a winning strategy. In this proof, we use a potential that depends on the parity of the width: $\phi(C) = \sqrt{2}^{\,-w(C)}$ if $w(C)$ is even, and $\phi(C) = \sqrt{2}^{\,-(w(C)+1)}$ if $w(C)$ is odd.
Claim 2.
In every round, there exists a move for T such that for every response by F, we have $\Phi(D') \le \Phi(D)$, where $D$ denotes the residual CNF before the round and $D'$ the residual CNF after the round.
At the beginning we have $\phi(C) = \sqrt{2}^{\,-(k+1)}$ for each clause (since $w(C) = k$, which is odd), so $\Phi(D) < \sqrt{2}^{\,k+1} \cdot \sqrt{2}^{\,-(k+1)} = 1$. By Claim 2, T has a strategy guaranteeing that $\Phi < 1$ also holds for the residual CNF after all variables have been played. If this final residual CNF contained a clause, the clause would be empty and have potential $\sqrt{2}^{\,0} = 1$ (since $0$ is even), which would imply $\Phi \ge 1$. Thus the final residual CNF must have no clauses, which means $D$ got satisfied and T won. This concludes the proof of Lemma 5, except for the proof of Claim 2.
Proof of Claim 2.
Let $D$ be the residual CNF at the beginning of a round. T picks a literal $\ell$ maximizing $\Phi_\ell$ and plays $\ell := 1$. Suppose F responds with a move that makes some literal $r$ true, and let $D'$ be the residual CNF after F’s move. Letting the notation be with respect to $D$, the potential of each clause of $D$ changes as follows:
-
Clauses containing $\ell$ or $r$ (or both) are satisfied and removed, so their potential gets multiplied by $0$.
-
Clauses containing both $\bar\ell$ and $\bar r$ (and neither $\ell$ nor $r$) each shrink by two literals. Here odd-width clauses remain odd and even-width clauses remain even, so their potential gets multiplied by $2$.
-
Clauses containing exactly one of $\bar\ell, \bar r$ (and neither $\ell$ nor $r$) each shrink by one literal. There are two cases for a clause $C$ in these groups:
-
$w(C)$ is even, so $\phi(C) = \sqrt{2}^{\,-w(C)}$. After being shrunk by 1, the new clause has odd width $w(C)-1$ and hence potential $\sqrt{2}^{\,-w(C)}$ as well. So the potential of an even-width clause gets multiplied by $1$.
-
$w(C)$ is odd, so $\phi(C) = \sqrt{2}^{\,-(w(C)+1)}$. After being shrunk by 1, the new clause has even width $w(C)-1$ and hence potential $\sqrt{2}^{\,-(w(C)-1)}$. So the potential of an odd-width clause gets multiplied by $2$.
So their potential gets multiplied by at most $2$.
-
By the choice of , we have and with respect to , in other words, and . Thus because
4.3 T plays last
Lemma 6.
$M_{TT}(k) \ge 1.5^k$.
Proof.
Consider any game instance $(D, X)$ where $D$ is a $k$-uniform CNF with fewer than $1.5^k$ clauses and $|X|$ is odd. We show T has a winning strategy. In this proof, we use $\phi(C) = 1.5^{-w(C)}$.
For intuition, how can T take advantage of having the last move? She will look out for certain pairs of literals to “set aside” and wait for F to assign one of them, and then respond by assigning the other one the opposite value. We call such a pair “zugzwang,” which means a situation where F’s obligation to make a move is a disadvantage for F. Upon finding such a pair, T anticipates that certain clauses will get satisfied later, but other clauses containing those literals might shrink when the zugzwang pair eventually gets played. Thus T can update the CNF to pretend those events have already transpired. The normal gameplay of TF rounds (T plays, then F plays) will sometimes get interrupted by FT rounds of playing previously-designated zugzwang pairs. We define the zugzwang condition so that T’s modifications won’t increase the potential of the CNF (which is no longer simply a residual version of ). When there are no remaining zugzwang pairs to set aside, we can exploit this fact—together with T’s choice of “best” literal for her normal move—to analyze the potential change in a TF round. This allows the proof to handle a smaller potential function and hence more initial clauses, compared to when F had the last move.
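Based only on the informal description above, T's bookkeeping for zugzwang pairs might look like the following sketch (all names are hypothetical); the point is simply that F's move inside a set-aside pair triggers an immediate reply on the partner variable with the opposite value, and any other move falls through to T's normal potential-based play.

```python
def t_zugzwang_reply(f_var, f_value, zugzwang):
    """If F just played a variable from a set-aside pair, reply on the partner variable
    with the opposite value (keeping the pair unequal); otherwise return None to signal
    that T should make a normal potential-based move instead.

    `zugzwang` maps each variable of a set-aside pair to its partner variable."""
    if f_var in zugzwang:
        partner = zugzwang.pop(f_var)
        zugzwang.pop(partner, None)        # the pair is now fully resolved
        return partner, 1 - f_value
    return None
```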
We describe T’s winning strategy in $(D, X)$ as Algorithm 1. In the first line, the algorithm declares and initializes $G$ and $Z$, which are accessed globally. Here $G$ is a CNF (initially the same as $D$), and $Z$ is a set (conjunction) of constraints of the form $\ell \ne r$ where $\ell$ and $r$ are literals on distinct variables. We consider $\ell \ne r$, $r \ne \ell$, $\bar\ell \ne \bar r$, and $\bar r \ne \bar\ell$ to be the same constraint as each other. The algorithm maintains the following three invariants:
-
(1)
and are disjoint subsets of , and is the set of unplayed variables, and contains all variables that appear in , and is exactly the set of variables that appear in , and is even.
-
(2)
For every assignment to the unplayed variables, if $G$ and $Z$ are satisfied, then $D$ is also satisfied by the same assignment together with the assignment played by T and F so far to the other variables of $X$.
-
(3)
$\Phi(G) < 1$.
Now we argue how these invariants are maintained at the end of the outer loop in Algorithm 1. Invariant (1) is straightforward to see.
Claim 3.
Invariant (2) is maintained.
Proof.
Invariant (2) trivially holds at the beginning.
Each iteration of the first inner loop maintains (2): Say $G$ and $Z$ are the formulas at the beginning of the iteration, and $G'$ and $Z'$ denote the formulas after the iteration, where $(\ell, r)$ is the zugzwang pair found in this iteration. Assume (2) holds for $G$ and $Z$. To see that (2) holds for $G'$ and $Z'$, consider any assignment to the unplayed variables. We will argue that if $G'$ and $Z'$ are satisfied, then $G$ and $Z$ are satisfied, which implies (by assumption) that $D$ is satisfied. So suppose $G'$ and $Z'$ are satisfied. Then $G$ is satisfied because each clause containing both $\ell$ and $r$ or containing both $\bar\ell$ and $\bar r$ is satisfied due to the new constraint $\ell \ne r$ being satisfied in $Z'$, and each other clause is satisfied since it contains the corresponding clause in $G'$ which is satisfied. Also, $Z$ is satisfied since each of its constraints is also in $Z'$ which is satisfied.
It is immediate that T’s and F’s “normal” moves in the outer loop maintain (2), because of the way we update $G$ and $Z$.
Each iteration of the second inner loop maintains (2): If an assignment satisfies $G$ and $Z$ (after the iteration), then it also satisfies $G$ and $Z$ (at the beginning of the iteration), since T’s move satisfies the constraint that gets removed from $Z$, and therefore the assignment satisfies $D$.
Claim 4.
Invariant (3) is maintained.
Proof.
Invariant (3) holds at the beginning by the assumption that $D$ has fewer than $1.5^k$ clauses (and each clause has potential $1.5^{-k}$).
The first inner loop maintains (3) by the following proposition, which we prove later.
Proposition 1.
If FindZugzwang() returns a pair $(\ell, r)$, then $\Phi(G') \le \Phi(G)$, where $G$ and $G'$ are the CNFs before and after the execution of TfoundZugzwang().
The second inner loop does not affect (3). In each outer iteration except the last, T’s and F’s moves from TplayNormal() and FplayNormal() maintain (3) by the following proposition, which we prove later.
Proposition 2.
If FindZugzwang() returns NULL, then $\Phi(G') \le \Phi(G)$, where $G$ is the CNF before TplayNormal() and $G'$ is the CNF after FplayNormal().
This concludes the proof of Claim 4.
Now we argue why T wins in the last outer iteration. Right before TplayNormal(), the number of unplayed variables must be odd by invariant (1), because an even number of variables have been played so far (since T has the first move) and $|X|$ is odd (since T also has the last move) and the number of variables set aside in zugzwang pairs is even. Thus, T always has an available move in TplayNormal(), since at this point at least one variable remains unplayed. When T is about to play the last variable (possibly followed by some moves in the second inner loop), all remaining clauses in $G$ have width at most $1$. There cannot be an empty clause in $G$, because then $\Phi(G)$ would be at least $1$, contradicting invariant (3). There cannot be more than one clause in $G$, because then $\Phi(G)$ would be at least $2 \cdot 1.5^{-1} > 1$. Thus $G$ is either empty (already satisfied) or consists of a single clause of width $1$, which T satisfies in one move.
At termination, the set of unplayed variables and the set of pending zugzwang pairs are empty, and $G$ and $Z$ are empty and thus satisfied. By invariant (2), this means $D$ is satisfied by the gameplay, so T wins.
This concludes the proof of Lemma 6 except Proposition 1 and Proposition 2.
Proof of Proposition 1.
Since FindZugzwang() returns the pair $(\ell, r)$, the zugzwang condition, call it ($\star$), holds for $\ell$ and $r$ with respect to $G$.
The potential of each clause of $G$ changes as follows:
-
Clauses containing both $\ell$ and $r$, or both $\bar\ell$ and $\bar r$, are removed, so their potential gets multiplied by $0$. (Intuitively, T considers these clauses satisfied in advance, since she will satisfy them later by enforcing $\ell \ne r$.)
-
Clauses containing two of the literals $\ell, \bar\ell, r, \bar r$ in one of the remaining combinations (namely $\ell$ and $\bar r$, or $\bar\ell$ and $r$) each shrink by two literals, since both of those literals get removed, so their potential gets multiplied by $2.25$. (Some of these four literals will eventually get assigned true, but since T cannot predict which ones, she pessimistically assumes they are all false.)
-
Clauses containing exactly one of the literals $\ell, \bar\ell, r, \bar r$ each shrink by one literal, since that literal gets removed, so their potential gets multiplied by $1.5$.
Since ($\star$) holds, it follows that $\Phi(G') \le \Phi(G)$.
Proof of Proposition 2.
In TplayNormal(), T picks the literal $\ell$ maximizing $\Phi_\ell$ and plays $\ell := 1$. (Some other strategies would also work here, but this one is the simplest to analyze.) In FplayNormal(), F makes some literal $r$ true. With respect to $G$, the potential of each clause changes as follows:
-
Clauses containing $\ell$ or $r$ (or both) are satisfied and removed, so their potential gets multiplied by $0$.
-
Clauses containing both $\bar\ell$ and $\bar r$ (and neither $\ell$ nor $r$) each shrink by two literals, so their potential gets multiplied by $2.25$.
-
Clauses containing exactly one of $\bar\ell, \bar r$ (and neither $\ell$ nor $r$) each shrink by one literal, so their potential gets multiplied by $1.5$.
By the choice of $\ell$ (i.e., maximizing $\Phi_\ell$), we have $\Phi_\ell \ge \Phi_{\bar\ell}$ and $\Phi_\ell \ge \Phi_{\bar r}$ with respect to $G$. ($\ast$)
Since FindZugzwang() returns NULL, the zugzwang condition ($\star$) does not hold in $G$ for any pair of literals. In particular, its negation holds for the pair $\ell, r$. ($\ast\ast$)
Thus $\Phi(G') \le \Phi(G)$, because a suitable linear combination of ($\ast$) and ($\ast\ast$) bounds the total increase in potential from the shrinking clauses by the potential of the clauses removed by T’s move.
Acknowledgments
This work was supported by NSF grants CCF-1657377 and CCF-1942742.
References
- [AO12] Lauri Ahlroth and Pekka Orponen. Unordered constraint satisfaction games. In Proceedings of the 37th International Symposium on Mathematical Foundations of Computer Science (MFCS), pages 64–75. Springer, 2012.
- [Bys04] Jesper Byskov. Maker-Maker and Maker-Breaker games are PSPACE-complete. Technical Report RS-04-14, BRICS, Department of Computer Science, Aarhus University, 2004.
- [ES73] Paul Erdős and John Selfridge. On a combinatorial game. Journal of Combinatorial Theory, Series A, 14(3), 1973.
- [Kut04] Martin Kutz. The Angel Problem, Positional Games, and Digraph Roots. PhD thesis, Freie Universität Berlin, 2004. Chapter 2: Weak Positional Games.
- [Kut05] Martin Kutz. Weak positional games on hypergraphs of rank three. In Proceedings of the 3rd European Conference on Combinatorics, Graph Theory, and Applications (EuroComb), pages 31–36. Discrete Mathematics & Theoretical Computer Science, 2005.
- [RW20a] Md Lutfar Rahman and Thomas Watson. Complexity of unordered CNF games. ACM Transactions on Computation Theory, 12(3):18:1–18:18, 2020.
- [RW20b] Md Lutfar Rahman and Thomas Watson. Tractable unordered 3-CNF games. In Proceedings of the 14th Latin American Theoretical Informatics Symposium (LATIN), pages 360–372. Springer, 2020.
- [RW21] Md Lutfar Rahman and Thomas Watson. 6-uniform Maker-Breaker game is PSPACE-complete. In Proceedings of the 38th International Symposium on Theoretical Aspects of Computer Science (STACS), pages 57:1–57:15. Schloss Dagstuhl, 2021.
- [Sch76] Thomas Schaefer. Complexity of decision problems based on finite two-person perfect-information games. In Proceedings of the 8th Symposium on Theory of Computing (STOC), pages 41–49. ACM, 1976.
- [Sch78] Thomas Schaefer. On the complexity of some two-person perfect-information games. Journal of Computer and System Sciences, 16(2):185–225, 1978.