This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

The Element Extraction Problem and the Cost of
Determinism and Limited Adaptivity in Linear Queries

Amit Chakrabarti111[email protected], Department of Computer Science, Dartmouth College    Manuel Stoeckl222[email protected], Department of Computer Science, Dartmouth College
Abstract

Two widely-used computational paradigms for sublinear algorithms are using linear measurements to perform computations on a high dimensional input and using structured queries to access a massive input. Typically, algorithms in the former paradigm are non-adaptive whereas those in the latter are highly adaptive. This work studies the fundamental search problem of element-extraction in a query model that combines both: linear measurements with bounded adaptivity.

In the element-extraction problem, one is given a nonzero vector 𝐳=(z1,,zn){0,1}n\mathbf{z}=(z_{1},\ldots,z_{n})\in\{0,1\}^{n} and must report an index ii where zi=1z_{i}=1. The input can be accessed using arbitrary linear functions of it with coefficients in some ring. This problem admits an efficient nonadaptive randomized solution (through the well known technique of 0\ell_{0}-sampling) and an efficient fully adaptive deterministic solution (through binary search). We prove that when confined to only kk rounds of adaptivity, a deterministic element-extraction algorithm must spend Ω(k(n1/k1))\Omega(k(n^{1/k}-1)) queries, when working in the ring of integers modulo some fixed qq. This matches the corresponding upper bound. For queries using integer arithmetic, we prove a 22-round Ω~(n)\widetilde{\Omega}(\sqrt{n}) lower bound, also tight up to polylogarithmic factors. Our proofs reduce to classic problems in combinatorics, and take advantage of established results on the zero-sum problem as well as recent improvements to the sunflower lemma.


Keywords:

query complexity;  linear measurements;  sketching;  round elimination


Acknowledgements:

We thank Deeparnab Chakrabarty, Prantar Ghosh, and anonymous reviewers for several helpful discussions, comments, and pointers.

This work was supported in part by NSF under award 2006589.

1 Introduction

Determinism versus randomization in algorithm design is a fundamental concern in computer science and is the topic of a great many works in complexity theory. In “space-constrained” models such as communication complexity and data streaming, basic results show that derandomization can entail an exponential or worse blow-up in cost. For instance, in the two-party communication setting, the very basic nn-bit equality problem admits a bounded-error randomized protocol with only O(1)O(1) communication (O(logn)O(\log n) if restricted to private coins), whereas its deterministic communication complexity is as large as it gets, namely n+1n+1. In the data streaming setting, the similarly basic distinct-elements problem admits a one-pass bounded-error randomized algorithm that uses O(logn)O(\log n) space to provide a (1+ε)(1+\varepsilon)-approximation [KNW10], whereas a deterministic algorithm would require Ω(n)\Omega(n) space, even if multiple passes and large approximation factors are allowed [CK16]. In this work, we explore such a price-of-determinism phenomenon in the query complexity world, for a similarly basic search problem.

The focus of our study is a search problem that we call element-extraction (henceforth, elemx), where the input is a set Z[n]:={1,,n}Z\subseteq[n]:=\{1,\ldots,n\}, promised to be nonempty, and the goal is to extract any element from ZZ. Formally, this is a total search problem given by the relation elemxn2[n]×[n]\textsc{elemx}_{n}\subseteq 2^{[n]}\times[n], where

elemxn={(Z,i):Z[n],i[n],and |Z|>0iZ}.\displaystyle\textsc{elemx}_{n}=\left\{(Z,i):Z\subseteq[n],\,i\in[n],\,\text{and }|Z|>0\Rightarrow i\in Z\right\}\,. (1)

As is often the case, the natural correspondence between sets in 2[n]2^{[n]} and vectors in {0,1}n\{0,1\}^{n} will be useful. Indeed, we shall freely switch between these two viewpoints, using the notational convention that uppercase letters denote sets and their corresponding lowercase boldface variants denote characteristic vectors. Thus, we can also formalize elemx as

elemxn={(𝐳,i):𝐳=(z1,,zn){0,1}n,i[n],and 𝟏𝖳𝐳>0zi=1}.\displaystyle\textsc{elemx}_{n}=\left\{(\mathbf{z},i):\mathbf{z}=(z_{1},\ldots,z_{n})\in\{0,1\}^{n},\,i\in[n],\,\text{and }\mathbf{1}^{\mathsf{T}}\mathbf{z}>0\Rightarrow z_{i}=1\right\}\,. (2)

The goal of an algorithm solving elemx is to produce an output ii such that (Z,i)elemx(Z,i)\in\textsc{elemx}: with certainty in the deterministic setting, and with probability 2/3\geq 2/3 (say) in the randomized setting. In other words, the algorithm must produce a witness of the nonemptiness of ZZ. To do so, the algorithm may access ZZ (equivalently, 𝐳\mathbf{z}) using linear queries, as we shall now explain.

In a Boolean decision tree model, an algorithm may only access the input vector by querying its individual bits. In such a setting, there is not much to say about elemx: even randomized algorithms are easily seen to require Ω(n)\Omega(n) queries. But things get interesting if we allow more powerful queries: specifically, linear ones. Let us define a linear query protocol over domain DD (a DD-LQP, for short) to be a query protocol wherein each query is an evaluation of a linear form i=1naizi\sum_{i=1}^{n}a_{i}z_{i}, where each aiDa_{i}\in D. The domain DD should be thought of a “reasonable” subset of a ring containing {0,1}\{0,1\}—e.g., a finite field, or integers with bounded absolute value—and the linear functions will be evaluated in the underlying ring. The cost of an LQP is the number of linear form evaluations used.333Note that this is somewhat lower than the number of bits needed to encode the output of the queries. In this work we particularly care about the amount of adaptivity in an LQP, which quantifies the extent to which each query depends on the outcomes of previous queries.

To set the stage, we recall the problem of 0\ell_{0}-sampling [FIS08, CF14], from the world of sketching and streaming algorithms. The goal of 0\ell_{0}-sampling is to sample a pair (i,xi)(i,x_{i}) from a nonzero input vector 𝐱n\mathbf{x}\in\mathbb{R}^{n} (say), so that xi0x_{i}\neq 0 and ii is distributed nearly uniformly on the support of 𝐱\mathbf{x}. This is a fundamental primitive, used as a low-level subroutine in a wide range of applications in streaming and other “big data” algorithms. There are several solutions to this problem [CF14], most of which provide a linear sketching scheme, wherein one computes 𝐲=S𝐱\mathbf{y}=S\mathbf{x} for a certain random d×nd\times n matrix SS and then runs a recovery algorithm on the low-dimensional vector 𝐲\mathbf{y} to produce the desired sample. Notice that if the input is a vector 𝐳{0,1}n\mathbf{z}\in\{0,1\}^{n}, such a scheme provides a randomized LQP for elemxn\textsc{elemx}_{n} (allowing a small probability of error). In particular, using the optimal 0\ell_{0}-sampling sketch of Jowhari, Sağlam, and Tardos [JST11], we obtain a \mathbb{Z}-LQP that makes O(logn)O(\log n) queries, using coefficients in {0,1,,n}\{0,1,\ldots,n\}, and has the pleasing property of being non-adaptive. We can also obtain a q\mathbb{Z}_{q}-LQP that makes O(log2n/logq)O(\log^{2}n/\log q) queries;444Throughout this paper, “log\log” denotes the base-22 logarithm. details in Section 6.

Turning to the deterministic setting—our main focus in this paper—it is easy to show that a non-adaptive \mathbb{Z}-LQP for elemxn\textsc{elemx}_{n} must make Ω(n/logn)\Omega(n/\log n) queries, for basic information-theoretic reasons. For completeness, we give the proof in Proposition 1.5. However, this heavy determinism penalty disappears upon moving to general deterministic LQPs, where we can use adaptivity. Indeed, a simple binary search strategy leads to a \mathbb{Z}-LQP that makes O(logn)O(\log n) queries, using coefficients in {0,1}\{0,1\}. We can refine this observation to trade off the query complexity for amount of adaptivity. This brings us to our central concept.

Define a kk-round LQP to be one where the queries are made in batches that we call rounds: the collection of linear forms defining the queries in round ii depend only on the results of queries made in rounds 1,,i11,\ldots,i-1 (a formal definition appears in Section 2). Then, a natural generalization of the binary search strategy provides a kk-round \mathbb{Z}-LQP for elemx, using coefficients in {0,1}\{0,1\}, making at most k(n1/k1)k(\lceil n^{1/k}\rceil-1) queries in total. When we are additionally promised that 𝟏𝖳𝐳0\mathbf{1}^{\mathsf{T}}\mathbf{z}\neq 0, where addition is performed in the ring q\mathbb{Z}_{q}, then this algorithm also works as a q\mathbb{Z}_{q}-LQP; details in Section 6. Notice that kk-round LQPs naturally interpolate between linear sketches at one extreme (when k=1k=1) and linear decision trees at the other (when k=nk=n).

The most important message of this paper is that the above rounds-versus-queries tradeoff is asymptotically tight for deterministic linear query protocols for elemx, in several natural settings. We state our results informally for now, with formal statements given after the necessary definitions and preliminaries.

1.1 Our Results and Techniques

We shall study DD-LQPs for the domains D=qD=\mathbb{Z}_{q}, the ring of integers modulo qq (with qnq\ll n) as well as D=D=\mathbb{Z}, but with coefficients of small magnitude (at most poly(n)\operatorname{poly}(n), say). Such restrictions on the coefficients are necessary, because allowing arbitrary integer coefficients makes it possible to recover the entire input 𝐳\mathbf{z} with the single query i=1n2i1zi\sum_{i=1}^{n}2^{i-1}z_{i}.

When D=qD=\mathbb{Z}_{q}, for small qq, solving elemx without the promise that 𝟏𝖳𝐳0\mathbf{1}^{\mathsf{T}}\mathbf{z}\neq 0 is hard, regardless of the number of rounds. Intuitively, there is no cheap way to deterministically verify that a subset I[n]I\subseteq[n] indeed contains an index iIi\in I where zi0z_{i}\neq 0. Defining the “cost” of an LQP to be the number of queries it makes in the worst case (formally defined in Section 2), we obtain the following not-too-hard results.

Proposition 1.1.

Every deterministic 2\mathbb{Z}_{2}-LQP for elemxn\textsc{elemx}_{n} has cost n1\geq n-1, which is optimal.

Proposition 1.2.

For q3q\geq 3, every deterministic q\mathbb{Z}_{q}-LQP for elemxn\textsc{elemx}_{n} has cost n/(2qlnq)\geq{n}/(2q\ln q).

As noted earlier, adding the promise that 𝟏𝖳𝐳0\mathbf{1}^{\mathsf{T}}\mathbf{z}\neq 0 permits a more efficient kk-round deterministic algorithm. For each integer q2q\geq 2, define elemxn(q)\textsc{elemx}^{(q)}_{n} to be the version of elemxn\textsc{elemx}_{n} where we are given the stronger promise that 𝟏𝖳𝐳0\mathbf{1}^{\mathsf{T}}\mathbf{z}\neq 0 under arithmetic in q\mathbb{Z}_{q}. Equivalently, using set notation, we are promised that |Z|0(modq)|Z|\not\equiv 0\pmod{q}. We prove the following results, using similar round-elimination arguments.

Theorem 1.3.

Every deterministic kk-round 2\mathbb{Z}_{2}-LQP for elemxn(2)\textsc{elemx}^{(2)}_{n} has cost k(n1/k1)\geq k(n^{1/k}-1).

Theorem 1.4.

Every deterministic kk-round q\mathbb{Z}_{q}-LQP for elemxn(q)\textsc{elemx}^{(q)}_{n} has cost Ω(1q1+1/kln2qk(n1/k1))\Omega\left(\frac{1}{q^{1+1/k}\ln^{2}q}k(n^{1/k}-1)\right).

Although Theorem 1.4 subsumes Theorem 1.3 in the asymptotic sense, we find it useful to present the former result in full, first, to lay the groundwork for our subsequent lower bound proofs. As we shall see, the fact that 2\mathbb{Z}_{2} is a field leads to an especially clean execution of the round elimination strategy. Note also that a weaker form of Theorem 1.3 follows from existing work on formula size-depth tradeoffs (see Section 7); however, the resulting proof, once fully unrolled, is considerably more complex than our direct argument.

At a high level, a lower bound proof based on round elimination works as follows. We consider a hypothetical kk-round protocol for nkn_{k}-dimensional instances of some problem PP that does not incur much cost in its first round. Based on this low cost, we extract a (k1)(k-1)-round protocol for nk1n_{k-1}-dimensional instances of PP by “lifting” these smaller instances to special nkn_{k}-dimensional instances on which the kk-round protocol essentially “wastes” its first round. If we can carry out this argument while ensuring that the shrinkage from nkn_{k} to nk1n_{k-1} is not too drastic, then a too-cheap kk-round protocol will eventually give us a 0-round protocol for a nontrivial instance dimension, leading to a contradiction.

In the proofs of the above two theorems, this strategy is executed by identifying a large collection of pairwise disjoint sets that are treated identically in the protocol’s first round. Viewing these sets as blocks of indices within [n][n], we consider block-structured instances of elemxn\textsc{elemx}_{n} and proceed to lift general instances of elemxn\textsc{elemx}_{n^{\prime}} into these block-structured ones. In Theorem 1.3, these blocks arise from elementary linear algebraic considerations. In Theorem 1.4, the fact that inputs are in {0,1}n\{0,1\}^{n} instead of qn\mathbb{Z}_{q}^{n} necessitates a brief excursion into additive combinatorics.

Finally, we consider LQPs over \mathbb{Z}, the ring of all integers, but with bounds on the magnitude of coefficients (which, as we noted earlier, is necessary in order to have nontrivial results). To be precise, we consider domains of the form [b,c]:={a:bac}\mathbb{Z}_{[b,c]}:=\{a\in\mathbb{Z}:\,b\leq a\leq c\}. While we are unable to prove a full tradeoff lower bound in this case, we do obtain a near-optimal result for k=2k=2 rounds.

Proposition 1.5.

Every deterministic 11-round [B,B]\mathbb{Z}_{[-B,B]}-LQP for elemxn\textsc{elemx}_{n} costs Ω(n/log(nB))\Omega({n}/\log(nB)).

Theorem 1.6.

Every deterministic 22-round [B,B]\mathbb{Z}_{[-B,B]}-LQP for elemxn\textsc{elemx}_{n} costs Ω(n/log3/2(nB))\Omega(\sqrt{n}/\log^{3/2}(nB)).

The former result is straightforward, based on the simple observation that such an LQP can extract the entire input 𝐳\mathbf{z} followed by basic information theoretic considerations. Incidentally, the problem of extracting all of 𝐳\mathbf{z} using [0,1]\mathbb{Z}_{[0,1]}-LQPs has a long history as the coin weighing problem, for which a 1-round O(n/logn)O(n/\log n) algorithm exists; see Section 1.2.

The significant result here is the latter. It again uses a round elimination strategy and, as before, the bird’s-eye view is that we identify disjoint blocks of indices to engineer a suitable lifting. This time, the blocks arise out of extremal combinatorics considerations, specifically the sunflower lemma, in its recently strengthened form [Rao20]. Furthermore, upon carrying out this round elimination, we are left with a 11-round LQP that solves elemx only under a cardinality constraint on the input set. To finish the proof, we must demonstrate hardness even for this special case. This is not as straightforward as Proposition 1.5: our argument to handle this hinges on the Frankl–Wilson theorem [FW81] on set systems with forbidden intersection sizes.

Attempts to extend the above proof outline to handle more than two rounds runs into technical issues of integer divisibility. We suspect that this is an artifact of our proof machinery and not inherent to the problem. We conjecture that every deterministic kk-round [B,B]\mathbb{Z}_{[-B,B]}-LQP requires cost Ω~(n1/k)\widetilde{\Omega}(n^{1/k}), suppressing polylogarithmic factors. Indeed, we believe that much more is true, and that a communication complexity analogue of such a tradeoff also holds. We shall take this up after a discussion of related work.

1.2 Related Work and Connections

Our work touches upon several themes with long histories of study in computer science: determinism versus randomization, adaptivity versus non-adaptivity, sublinear algorithms, and input access through structured queries. With these connections in mind, we recall a small number of works that are either close in spirit to ours or shed light on some aspect of this work.

The most basic query model is the Boolean decision tree. In this setting, deterministic and randomized complexities are polynomially related for total Boolean functions [BdW02, ABB+17], whereas arbitrarily large gaps are possible for search problems [LNNW95]. Parity decision trees—equivalent to our 2\mathbb{Z}_{2}-LQPs—have been studied in several works (e.g., [ZS10, HHL18] and the references therein), usually for Boolean functions and focusing on connections with communication complexity of XOR-composed functions. Beyond the Boolean—or indeed the discrete—setting lie linear decision trees, where the input is a real vector and one can query the sign of a linear form [DL78, KLM19]. All such “decision tree” models are fully adaptive and the vast majority of works using them do not focus on amount of adaptivity as a resource.

At the other extreme is the (nonadaptive) linear sketching model, where a high-dimensional input is accessed through one batch of linear queries (equivalently, through a low-dimensional sketch of it produced by a sketching matrix). This paradigm is ubiquitous in data streaming algorithms and compressed sensing [Mut05, Don06, Woo14, CY20] and has connections to dimension reduction and metric embeddings. Some recent work carries the message that linear sketching might be a complete paradigm for a large class of data streaming algorithms [LNW14] and certain communication protocols [KMSY18, HLY19]. Most work on linear sketching considers randomized sketches, since determinism often precludes sublinear cost.

Turning to determinism, the well-studied coin weighing problem, put in our terms, asks for a [0,1]\mathbb{Z}_{[0,1]}-LQP that retrieves the entire input 𝐳{0,1}n\mathbf{z}\in\{0,1\}^{n}. It has long been known that (2±o(1))n/logn(2\pm o(1))n/\log n nonadaptive queries are necessary and sufficient. Special cases and variants of this problem have been studied over the years; see [ER63] for some early history and [Bsh09, MK13] for recent history. While some of these works consider adaptive LQPs, there is no strong rounds-vs-queries tradeoff for this problem, which is harder than elemx.

The body of work on round complexity under linear queries is much smaller. There is one recent work very close to ours: Assadi, Chakrabarty, and Khanna [ACK20] studied a problem very similar to elemx that they called single-element-recovery, where the input is a vector 𝐱0n\mathbf{x}\in\mathbb{R}_{\geq 0}^{n}, and by applying \mathbb{R}-linear queries one wishes to recovery an arbitrary element from the support of 𝐱\mathbf{x}. While their query model is much stronger than our \mathbb{Z}-linear or q\mathbb{Z}_{q}-linear queries, it is balanced by the 0\mathbb{R}_{\geq 0}-valued inputs that prevent tricks to recover the entire input in one query. Their main theorem implies that the deterministic kk-round search algorithm making roughly k(n1/k1)k(n^{1/k}-1) queries in total—very similar to Algorithm 1—has cost exactly matching the lower bound. Linear queries and adaptivity are also featured together in some work on sparse recovery problems. One such problem is to find an approximately closest ss-sparse vector 𝐱\mathbf{x}^{\star} to an input 𝐱\mathbf{x}, using \mathbb{R}-linear queries to the input and rr rounds of adaptivity. For this, [PW13] have proven near optimal lower bounds of Ω(r(logn)1/r)\Omega(r(\log n)^{1/r}) when s=1s=1 and [KP19] have extended them to small ss, proving Ω(1rs(logn)1/r)\Omega(\frac{1}{r}s(\log n)^{1/r}) queries are needed when logs<(logn)1/r\log s<(\log n)^{1/r}.

A number of works consider rounds of adaptivity in query models beyond linear queries. Recent examples include works on maximizing submodular functions through adaptive oracle queries [BS18]; on adaptivity hierarchy theorems in property testing [CG18]; on identifying biased coins through pairwise comparisons or in multi-armed bandit settings [AAAK17]; and on finding approximately maximum bipartite matchings through demand queries and OR-queries [Nis21]. Other works have studied adaptivity in the massive parallel communication/computation (MPC) model [BKS17] and in various graph query models [AB19, BHPSR+20].

A rich body of work on cost/adaptivity tradeoffs is found in communication complexity, where adaptivity manifests as rounds of interaction. An early work [NW93] gave exponential separations between kk and k+1k+1 rounds for all kk and introduced a round elimination paradigm that remains ubiquitous to this day. This work also explains how an earlier result [KW90] connecting circuit and communication complexities can be used to relate bounded-round communication complexity for a specific problem to the size of bounded-depth, unbounded fan-in formulas. More work has been spurred by applications of bounded-round communication lower bounds in data structures, where they provide lower bounds in the cell-probe model [MNSW98, Sen03, CR04, PT07, LPY16]; and in streaming algorithms, where they translate naturally to tradeoffs between the number of passes made over the input stream and the working memory required [GM09, ER14, GO16, CCM16, CW16]. In much of this body of work, round elimination is performed using information theoretic arguments that naturally provide lower bounds against randomized algorithms.

In contrast, it is rare to see deterministic tradeoffs where corresponding randomized ones do not hold because randomization makes the problem “too easy.” This is exactly the situation with elemx, as shown by this work in the context of the randomized upper bounds (Section 6) via 0\ell_{0}-sampling [JST11]. In light of the preceding discussion, our instantiations of round elimination must use techniques beyond Shannon-style information theory. They indeed do. Our techniques therefore have the potential for further use in separating determinism from randomization in this fine-grained (round aware) sense.

Our query complexity results on elemx suggest a tantalizing communication complexity analogue. Let urn\textsc{ur}^{\subset}_{n} denote555The notation, due to Nelson and Yu [NY19], is to be read as “universal relation with a subset constraint.” the communication complexity problem where Alice and Bob receive sets X,Y[n]X,Y\subseteq[n] respectively with the promise that YXY\subset X, and their goal is to produce an element in XYX\smallsetminus Y. Clearly, a kk-round query protocol for elemx making qq queries, with each answer lying in a set of size MM, provides a kk-round communication protocol for ur\textsc{ur}^{\subset} using at most qlogMq\log M bits. Therefore, our results here would be subsumed, in an asymptotic sense, if one could resolve the following conjecture positively.

Conjecture 1.7.

Every deterministic kk-round communication protocol for ur\textsc{ur}^{\subset} costs Ω~(n1/k)\widetilde{\Omega}(n^{1/k}) bits, suppressing polylogarithmic factors.

We find the above conjecture compelling because it would demonstrate a new phenomenon in communication complexity, where a problem is easy for one-round randomized and for interactive deterministic protocols, but exhibits a nontrivial tradeoff for bounded-round deterministic ones.

In passing, we note that the ur\textsc{ur}^{\subset} problem was introduced in [KNP+17] where its randomized communication complexity was studied. The randomized lower bound was subsequently used by Nelson and Yu [NY19] to prove the optimality of Ahn, Guha, and McGregor’s graph sketching algorithm for graph connectivity [AGM12]. An outstanding open question about the latter problem (viewed as a communication problem where nn players, each holding a vertex neighborhood, talk to a coordinator who determines whether the graph is connected) is whether it admits a deterministic algorithm with sublinear communication. A better understanding of ur\textsc{ur}^{\subset} in the deterministic setting could be key to addressing this question.

There are also two problems similar to ur\textsc{ur}^{\subset} for which lower bounds have already been proven. The universal relation problem ur gives Alice and Bob unequal sets X,Y[n]X,Y\subseteq[n] and asks them to produce an element i(XY)(YX)i\in(X\smallsetminus Y)\cup(Y\smallsetminus X). This has deterministic communication complexity n+1\geq n+1 [TZ97]. The Karchmer-Wigderson game for parityn\textsc{parity}_{n} is the problem ur with the additional constraints that |X||X| be even and |Y||Y| be odd; existing circuit complexity results [Has86, Ros15] imply, as briefly explained in Section 7, that kk-round deterministic communication protocols for this require Ω(k(n1/k1))\Omega(k(n^{1/k}-1)) bits of communication.

2 Preliminaries

Throughout the paper, we shall freely switch between the equivalent viewpoints of sets in 2[n]2^{[n]} and vectors in {0,1}n\{0,1\}^{n}, using the notational convention that when an uppercase letter (e.g., S,ZS,Z) denotes a set, the corresponding lowercase boldface letter (e.g., 𝐬,𝐳\mathbf{s},\mathbf{z}) denotes the characteristic vector of that set and vice versa.

2.1 Various Definitions

The search problem elemxn\textsc{elemx}_{n} was already formally defined in eq. 1. We shall also work with special cases of this problem, where the cardinality of the input set is further restricted in some way. These are formalized as follows: we define

elemxn(q)\displaystyle\textsc{elemx}^{(q)}_{n} ={(Z,i):Z[n],i[n],and |Z|0\@displayfalse(modq)iZ};\displaystyle=\left\{(Z,i):Z\subseteq[n],\,i\in[n],\,\text{and }|Z|\not\equiv 0{\@displayfalse\pmod{q}}\Rightarrow i\in Z\right\}\,; (3)
elemxn(q,h)\displaystyle\textsc{elemx}^{(q,h)}_{n} ={(Z,i):Z[n],i[n],and |Z|h\@displayfalse(modq)iZ};\displaystyle=\left\{(Z,i):Z\subseteq[n],\,i\in[n],\,\text{and }|Z|\equiv h{\@displayfalse\pmod{q}}\Rightarrow i\in Z\right\}\,; (4)
elemxn1/4\displaystyle\textsc{elemx}^{1/4}_{n} ={(Z,i):Z[n],i[n],and |Z|=n/4iZ}.\displaystyle=\left\{(Z,i):Z\subseteq[n],\,i\in[n],\,\text{and }|Z|=n/4\Rightarrow i\in Z\right\}\,. (5)
Definition 2.1 (Protocol).

Let f{0,1}n×𝒪f\subseteq\{0,1\}^{n}\times\mathcal{O} be a search problem with input space {0,1}n\{0,1\}^{n} and output space 𝒪\mathcal{O}. A deterministic kk-round DD-linear query protocol (DD-LQP), Π\Pi, on this input space is a rooted tree of depth kk where each internal node vv is labeled with a matrix AvDdv×nA_{v}\in D^{d_{v}\times n}; each leaf node with an output oλ𝒪o_{\lambda}\in\mathcal{O}; and the edges from a node vv to its children are labeled with the elements of v:={Av𝐳:𝐳{0,1}n}\mathcal{M}_{v}:=\{A_{v}\mathbf{z}:\,\mathbf{z}\in\{0,1\}^{n}\} bijectively. The quantity dvd_{v} of node vv is the cost of the node, sometimes also denoted cost(v)\operatorname{cost}(v). Given an input 𝐳{0,1}n\mathbf{z}\in\{0,1\}^{n}, the measurement at internal node vv is Av𝐳A_{v}\mathbf{z}. The transcript of Π\Pi on 𝐳\mathbf{z} —denoted Π(𝐳)\Pi(\mathbf{z})—is the unique root-to-leaf path obtained by walking along the edges determined by these measurements; the jjth measurement is the label of the jjth edge on this path; and the output is the label oo_{\ell} of the leaf :=(Π(𝐳))\ell:=\ell(\Pi(\mathbf{z})) reached by this path. We say that Π\Pi solves ff if (𝐳,o)f(\mathbf{z},o_{\ell})\in f for every input 𝐳\mathbf{z}.

Since this paper is largely focused on deterministic complexity, henceforth we shall assume that all LQPs are deterministic unless stated otherwise.

Definition 2.2 (Cost).

The query cost of a protocol Π\Pi is:

cost(Π)\displaystyle\operatorname{cost}(\Pi) :=max𝐳{0,1}ncost(Π;𝐳),wherecost(Π;𝐳):=v internal node on Π(𝐳)dv,\displaystyle:=\max_{\mathbf{z}\in\{0,1\}^{n}}\operatorname{cost}(\Pi;\mathbf{z})\,,\qquad\text{where}\quad\operatorname{cost}(\Pi;\mathbf{z}):=\sum_{v\text{ internal node on }\Pi(\mathbf{z})}d_{v}\,,

which is, informally, the number of linear queries performed when Π\Pi executes on 𝐳\mathbf{z}. While we do not focus on bit complexity in this paper, it is worth noting that to make an information-theoretically fair comparison between different domains, one should consider the number of bits returned in response to all the queries. This number may be larger than cost(Π)\operatorname{cost}(\Pi), though only by an O(logn)O(\log n) factor for D=[B,B]D=\mathbb{Z}_{[-B,B]} with B=poly(n)B=\operatorname{poly}(n), and not at all for D=2D=\mathbb{Z}_{2}.

Definition 2.3 (Complexity).

The DD-linear query complexity and kk-round DD-linear query complexity of a search problem ff are defined, respectively, to be

LQD(f)\displaystyle\operatorname{LQ}_{D}(f) =min{cost(Π):Π is a D-LQP that solves f};\displaystyle=\min\{\operatorname{cost}(\Pi):\Pi\text{ is a $D$-LQP that solves }f\}\,;
LQDk(f)\displaystyle\operatorname{LQ}_{D}^{k}(f) =min{cost(Π):Π is a k-round D-LQP that solves f}.\displaystyle=\min\{\operatorname{cost}(\Pi):\Pi\text{ is a $k$-round $D$-LQP that solves }f\}\,.

2.2 Useful Results from Combinatorics

In the course of this paper, we will use several important theorems from combinatorics. For results on q\mathbb{Z}_{q}-LQPs (proved in Section 4), we use the following result of van Emde Boas and Kruyswijk [vEBK69] on zero sumsets, slightly reworded to use modern notation.

Theorem 2.4 ([vEBK69]).

Let GG be a finite abelian group with exponent666The exponent of a group is the least common multiple of the orders of its elements. exp(G)\exp(G) and order |G||G|. Let s(G)s(G) be the minimal positive integer tt for which any sequence of tt elements from GG has a nonempty subsequence which sums to zero. Then s(G)exp(G)(1+ln|G|exp(G))s(G)\leq\exp(G)(1+\ln\frac{|G|}{\exp(G)}). ∎

A stronger result that s(G)=1+r(q1)s(G)=1+r(q-1) applies when G=qrG=\mathbb{Z}_{q}^{r} and qq is a prime power [Ols69]; it is conjectured that the prime-power constraint is unnecessary [GG06, conjecture 3.5].

When working over \mathbb{Z} (in Section 5), we use the well-known notion of a sunflower and the following recent result of Rao [Rao20], which refines the noted result of Alweiss, Lovett, Wu, and Zhang [ALWZ20] that improved the classic sunflower lemma of Erdős and Rado [ER60]. The note [BCW20] further improves Rao’s bound by replacing the log(pt)\log(pt) factor with logt\log t, but this will not affect our proof. Tao [Tao20] gives an alternative presentation of Rao’s result which may be simpler to follow.

In a different part of our argument, we will need a well known theorem of Frankl and Wilson [FW81].

Theorem 2.5 (Rao).

There is a universal constant c1>1c_{1}>1 such that every family of more than (c1plog(pt))t(c_{1}p\log(pt))^{t} sets, each of cardinality tt, must contain a pp-sunflower, defined as a family of pp distinct sets whose pairwise intersections are identical. ∎

Theorem 2.6 (Frankl–Wilson).

Let m(n,k,l¯)m(n,k,\overline{l}) be the largest size of a collection \mathcal{F} of subsets of ([n]k)\binom{[n]}{k} for which no two elements F,FF,F^{\prime}\in\mathcal{F} have intersection size ll. Then, if klk-l is a prime power:

m(n,k,l¯)\displaystyle m(n,k,\overline{l}) (nkl1),\displaystyle\leq\binom{n}{k-l-1}\,, if k2l+1;\displaystyle\text{ if }k\geq 2l+1\,;
m(n,k,l¯)\displaystyle m(n,k,\overline{l}) (nl)(2kl1k)/(2kl1l),\displaystyle\leq\binom{n}{l}\binom{2k-l-1}{k}\Big{/}\binom{2k-l-1}{l}\,, if k2l+1. ∎\displaystyle\text{ if }k\leq 2l+1\,.\hbox to0.0pt{~{}~{}\qquad\qquad\qquad\qquad\qed\hss}

2.3 Our Round Elimination Framework

We now describe a framework for our round elimination arguments. For this section, we shall work over a general ring (with unity), RR, and “LQP” will mean a DD-LQP where DRD\subseteq R. Fix this ring RR.

Definition 2.7 (Homomorphism and shadowing).

A protocol homomorphism is a map φ\varphi from a protocol Υ\Upsilon to a protocol Π\Pi such that (i) for any two nodes u,vu,v in φ\varphi, the node φ(u)\varphi(u) is a child of φ(v)\varphi(v) iff uu is a child of vv, and (ii) φ\varphi maps leaves of Υ\Upsilon to leaves of Π\Pi. We say that φ\varphi is cost-preserving for each internal node vv of Υ\Upsilon, cost(v)=cost(φ(v))\operatorname{cost}(v)=\operatorname{cost}(\varphi(v)). We say that Υ\Upsilon shadows Π\Pi through φ\varphi if φ\varphi is injective, cost-preserving, and maps the root of Υ\Upsilon to a child of the root of Π\Pi. Notice that when this is the case, Υ\Upsilon is one round shorter than Π\Pi.

Suppose we have an LQP Π\Pi that operates on inputs in {0,1}n\{0,1\}^{n} and produces outputs in [n][n]. Further, suppose S1,,Sm[n]S_{1},\ldots,S_{m}\subseteq[n] is a collection of pairwise disjoint nonempty sets. We then define a certain LQP Π(S1,,Sm)\Pi^{(S_{1},\ldots,S_{m})} operating on inputs in {0,1}m\{0,1\}^{m} and producing outputs in [m][m]. To aid intuition, we describe the construction procedurally in Algorithm 1.

Algorithm 1  Outline of protocol Π(S1,,Sm)\Pi^{(S_{1},\ldots,S_{m})}
1:Lift our input W[m]W\subseteq[m] to Z:=iWSi[n]Z:=\bigcup_{i\in W}S_{i}\subseteq[n] (this step is only conceptual).
2:Mimic Π\Pi by simulating the queries it would have made to its input ZZ. Emulate each such query by making the corresponding query to our own input WW. This is indeed possible using linear queries to WW.
3:Suppose Π\Pi wants to output hh. If hSih\in S_{i}, then output that index ii (which must be unique); otherwise, output an arbitrary index.

To define Π:=Π(S1,,Sm)\Pi^{\prime}:=\Pi^{(S_{1},\ldots,S_{m})} formally, we first define the lifting matrix

L=[𝐬1𝐬2𝐬m]Rn×m,\displaystyle L=[\mathbf{s}_{1}~{}\mathbf{s}_{2}~{}\cdots~{}\mathbf{s}_{m}]\in R^{n\times m}\,, (6)

whose entries lie in {0,1}\{0,1\} and which maps the input space of Π\Pi^{\prime} to the input space of Π\Pi according to 1, thanks to the pairwise disjointness of the sets SiS_{i}. At a given node vv of Π\Pi, labeled with Avqdv×nA_{v}\in\mathbb{Z}_{q}^{d_{v}\times n}, the simulation in 2 would retrieve the measurement Av𝐳=AvL𝐰A_{v}\mathbf{z}=A_{v}L\mathbf{w}. The protocol Π\Pi^{\prime} can get the same result by making the query AvLqdv×mA_{v}L\in\mathbb{Z}_{q}^{d_{v}\times m}.

Thus, the protocol tree for Π\Pi^{\prime} is formed as follows. Prepare a copy of Π\Pi and let φ:ΠΠ\varphi\colon\Pi^{\prime}\to\Pi be the natural bijection between their nodes. Label each internal node vv of Π\Pi^{\prime} with Av:=Aφ(v)LA_{v}:=A_{\varphi(v)}L. Copy over all edge labels from Π\Pi to Π\Pi^{\prime}. For each leaf \ell of Π\Pi^{\prime}, if oφ()Sio_{\varphi(\ell)}\in S_{i}, then assign label o:=io_{\ell}:=i. If no such ii exists, assign o:=1o_{\ell}:=1 (say). This labeling is well defined because of the pairwise disjointness of the sets SiS_{i}.

In the sequel, to perform round elimination, we shall use the construction of Π\Pi^{\prime} in a special way that we record in the lemma below. We also record a definition that will be relevant when invoking the lemma.

Lemma 2.8.

Suppose that Π\Pi correctly solves elemxn\textsc{elemx}_{n} on inputs in 𝒵{0,1}n\mathcal{Z}\subseteq\{0,1\}^{n}. Let S1,,Sm[n]S_{1},\ldots,S_{m}\subseteq[n] be pairwise disjoint and let LL be defined by eq. 6. Let ρ\rho be the root node of Π\Pi and, for 𝐫Rdρ\mathbf{r}\in R^{d_{\rho}}, let 𝒲𝐫:={𝐰{0,1}m:L𝐰𝒵\mathcal{W}_{\mathbf{r}}:=\{\mathbf{w}\in\{0,1\}^{m}:\,L\mathbf{w}\in\mathcal{Z} and AρL𝐰=𝐫}A_{\rho}L\mathbf{w}=\mathbf{r}\}. Then, there is a protocol Υ\Upsilon that shadows Π\Pi and correctly solves elemxm\textsc{elemx}_{m} on each input in 𝒲𝐫\mathcal{W}_{\mathbf{r}}.

Proof.

Using the above setup and terminology, construct Π:=Π(S1,,Sm)\Pi^{\prime}:=\Pi^{(S_{1},\ldots,S_{m})} as in Algorithm 1. The given conditions imply that on all inputs in 𝒲𝐫\mathcal{W}_{\mathbf{r}}, the first measurement of Π\Pi^{\prime} is always 𝐫\mathbf{r} and thus leads an execution of Π\Pi^{\prime} to a particular child, uu, of its root node. Thus, we can shrink Π\Pi^{\prime} to the subprotocol Υ\Upsilon rooted at uu. Notice that the bijection φ\varphi is a cost-preserving protocol homomorphism and so Υ\Upsilon shadows Π\Pi through φ|Υ\varphi|_{\Upsilon}.

By construction, Υ\Upsilon on input 𝐰𝒲𝐫\mathbf{w}\in\mathcal{W}_{\mathbf{r}} simulates Π\Pi on 𝐳:=L𝐰=iW𝐬i\mathbf{z}:=L\mathbf{w}=\sum_{i\in W}\mathbf{s}_{i}, an input on which Π\Pi correctly solves elemxn\textsc{elemx}_{n}. Therefore, if Π\Pi outputs hh, then hZ=iWSih\in Z=\bigcup_{i\in W}S_{i}. By the disjointness guarantee, there exists a unique iWi\in W for which hSih\in S_{i}. As Υ\Upsilon reports precisely this ii, it correctly solves elemxm\textsc{elemx}_{m} on 𝐰\mathbf{w}. ∎

Definition 2.9 (Uniform family).

Fix a matrix ARd×nA\in R^{d\times n}. An AA-uniform family of size mm is a collection of mm pairwise disjoint sets S1,,Sm[n]S_{1},\ldots,S_{m}\subseteq[n] such that A𝐬1==A𝐬m=𝐫A\mathbf{s}_{1}=\cdots=A\mathbf{s}_{m}=\mathbf{r}, for some vector 𝐫Rd\mathbf{r}\in R^{d}.

3 Linear Queries Modulo 2

We begin our study of the element-extraction problem by considering 2\mathbb{Z}_{2}-linear queries. As noted in Section 1.1, we shall later generalize the results to q\mathbb{Z}_{q}, but we feel it is worth seeing our framework in action in the especially clean setting of 2\mathbb{Z}_{2}. We begin by showing that the additional promise of odd cardinality on the input set ZZ is crucial, or else there is no interesting rounds-vs-queries tradeoff to be had.

Proposition 3.1 (Restatement of Proposition 1.1).

LQ2(elemxn)=n1\operatorname{LQ}_{\mathbb{Z}_{2}}(\textsc{elemx}_{n})=n-1.

Proof.

The upper bound is achieved by the trivial 11-round LQP (i.e., a sketch) that queries all but one of the individual bits of the input.

Now assume to the contrary that there is a 2\mathbb{Z}_{2}-LQP Π\Pi with cost(Π)=dn2\operatorname{cost}(\Pi)=d\leq n-2 that solves elemxn\textsc{elemx}_{n}. Let A2d×nA\in\mathbb{Z}_{2}^{d\times n} be the matrix whose rows represent all queries along the path Π(𝟎)\Pi(\mathbf{0}). Then dimkerAnd2\dim\,\ker A\geq n-d\geq 2, whence there exist distinct nonzero vectors 𝐲,𝐳2n\mathbf{y},\mathbf{z}\in\mathbb{Z}_{2}^{n} such that A𝐲=A𝐳=𝟎A\mathbf{y}=A\mathbf{z}=\mathbf{0}. Setting 𝐱=𝐲+𝐳\mathbf{x}=\mathbf{y}+\mathbf{z}, we also have A𝐱=𝟎A\mathbf{x}=\mathbf{0}. Thus, the three nonzero inputs 𝐱,𝐲,𝐳\mathbf{x},\mathbf{y},\mathbf{z} lead to the same leaf, namely (Π(𝟎))\ell(\Pi(\mathbf{0})), and produce the same output ii, say. By the correctness of Π\Pi, we have xi=yi=zi=1x_{i}=y_{i}=z_{i}=1, which contradicts 𝐱=𝐲+𝐳\mathbf{x}=\mathbf{y}+\mathbf{z}. ∎

Accordingly, for the rest of this section, we focus on the problem elemxn(2)\textsc{elemx}^{(2)}_{n}, as defined in eq. 3. We shall prove Theorem 1.3 using a round elimination technique. As discussed in Section 2, this round elimination will be enabled by identifying a certain AA-uniform family. The next lemma, establishing a useful fact about matrices over 2\mathbb{Z}_{2}, will provide us this family.

Lemma 3.2.

Every matrix A2d×nA\in\mathbb{Z}_{2}^{d\times n}, admits an AA-uniform family S1,,SmS_{1},\ldots,S_{m} of size mn/(d+1)m\geq{\lceil{n/(d+1)}\rceil} such that each cardinality |Si||S_{i}| is odd.

Proof.

Let 𝐛1,,𝐛n\mathbf{b}_{1},\ldots,\mathbf{b}_{n} be the (nonzero) column vectors of the matrix

B:=[A𝟏𝖳]2(d+1)×nB:=\left[\begin{array}[]{c}A\\ \mathbf{1}^{\mathsf{T}}\end{array}\right]\in\mathbb{Z}_{2}^{(d+1)\times n}

formed by appending the all-ones row to AA. For each Q[n]Q\subseteq[n], let BQB_{Q} be the collection of column vectors {𝐛i:iQ}\{\mathbf{b}_{i}:\,i\in Q\} and let BQ{\langle{B_{Q}}\rangle} be the linear subspace of 2d+1\mathbb{Z}_{2}^{d+1} spanned by the vectors in BQB_{Q}.

Partition [n][n] into nonempty disjoint sets T1,,TmT_{1},\ldots,T_{m} iteratively, as follows. For each ii, let TiT_{i} be a maximal subset of [n]j=1i1Tj[n]\smallsetminus\bigcup_{j=1}^{i-1}T_{j} such that the vectors in BTiB_{T_{i}} are linearly independent. Since these vectors live in 2d+1\mathbb{Z}_{2}^{d+1}, it follows that |Ti|d+1|T_{i}|\leq d+1. We stop when j=1mTm=[n]\bigcup_{j=1}^{m}T_{m}=[n], implying mn/(d+1)m\geq{\lceil{n/(d+1)}\rceil}.

We claim that, for each i{2,,m}i\in\{2,\ldots,m\}, we have BTi1BTi{\langle{B_{T_{i-1}}}\rangle}\supseteq{\langle{B_{T_{i}}}\rangle}. Indeed, if there exists an element 𝐱BTiBTi1\mathbf{x}\in{\langle{B_{T_{i}}}\rangle}\smallsetminus{\langle{B_{T_{i-1}}}\rangle}, then there is a set QTiQ\subseteq T_{i} for which 𝐱=hQ𝐛j\mathbf{x}=\sum_{h\in Q}\mathbf{b}_{j}. Since BTi1{\langle{B_{T_{i-1}}}\rangle} is closed under linear combinations and does not contain 𝐱\mathbf{x}, there exists hQh\in Q with 𝐛hBTi1\mathbf{b}_{h}\notin{\langle{B_{T_{i-1}}}\rangle}. By construction, hj=1i2Tjh\notin\bigcup_{j=1}^{i-2}T_{j}, so hh was not included in Ti1T_{i-1} despite being available. This contradicts the maximality of Ti1T_{i-1}.

Let kk be an index in TmT_{m}. Then 𝐛kBTmBTm1BT1\mathbf{b}_{k}\in{\langle{B_{T_{m}}}\rangle}\subseteq{\langle{B_{T_{m-1}}}\rangle}\subseteq\cdots\subseteq{\langle{B_{T_{1}}}\rangle}, so there must exist subsets S1,,SmS_{1},\ldots,S_{m} of T1,,TmT_{1},\ldots,T_{m} for which B𝐬i=𝐛kB\mathbf{s}_{i}=\mathbf{b}_{k}. The sets {Si}i=1m\{S_{i}\}_{i=1}^{m} are pairwise disjoint because the sets {Ti}i=1m\{T_{i}\}_{i=1}^{m} are. Let 𝐫\mathbf{r} be the first dd coordinates of 𝐛k\mathbf{b}_{k}; then for all i[m]i\in[m], A𝐬i=𝐫A\mathbf{s}_{i}=\mathbf{r}. Therefore, {Si}i=1m\{S_{i}\}_{i=1}^{m} is AA-uniform. Finally, since the last coordinate of 𝐛k\mathbf{b}_{k} is 11 and the last row of BB is 𝟏𝖳\mathbf{1}^{\mathsf{T}}, for each i[m]i\in[m], 𝟏𝖳𝐬i=1\mathbf{1}^{\mathsf{T}}\mathbf{s}_{i}=1, so |Si||S_{i}| is odd. ∎

Lemma 3.3 (Round elimination lemma).

Let Π\Pi be a deterministic kk-round 2\mathbb{Z}_{2}-LQP for elemxn(2)\textsc{elemx}^{(2)}_{n}, where k1k\geq 1. Then there exists a deterministic (k1)(k-1)-round 2\mathbb{Z}_{2}-LQP Υ\Upsilon for elemxm(2)\textsc{elemx}^{(2)}_{m}, such that

  1. (3.3.1)

    Υ\Upsilon shadows Π\Pi through a (cost-preserving, injective) protocol homomorphism φΥ:ΥΠ\varphi_{\Upsilon}\colon\Upsilon\to\Pi;

  2. (3.3.2)

    mn/(d+1)m\geq{\lceil{n/(d+1)}\rceil}, where dd is the cost of the root of Π\Pi.

Proof.

Let A2d×nA\in\mathbb{Z}_{2}^{d\times n} be the label of the root of Π\Pi. Let S1,,SmS_{1},\ldots,S_{m} be an AA-uniform family of size mn/(d+1)m\geq{\lceil{n/(d+1)}\rceil} with each |Si||S_{i}| odd, as guaranteed by Lemma 3.2. Let the lifting matrix LL be as given by eq. 6 and let 𝐫=A𝐬1\mathbf{r}=A\mathbf{s}_{1}. We know that Π\Pi correctly solves elemxn(2)\textsc{elemx}^{(2)}_{n} on inputs in 𝒵:={Z[n]:|Z|\mathcal{Z}:=\{Z\subseteq[n]:\,|Z| odd}\}. Invoking Lemma 2.8, we obtain a (k1)(k-1)-round 2\mathbb{Z}_{2}-LQP Υ\Upsilon that shadows Π\Pi as required.

It remains to show that Υ\Upsilon solves elemxm(2)\textsc{elemx}^{(2)}_{m}. The guarantee of Lemma 2.8 is that Υ\Upsilon correctly solves elemxm\textsc{elemx}_{m} on the input set 𝒲𝐫\mathcal{W}_{\mathbf{r}} defined there. Thus, it suffices to show that if an input W[m]W\subseteq[m] satisfies the promise of elemxm(2)\textsc{elemx}^{(2)}_{m}—i.e., |W||W| is odd—then W𝒲𝐫W\in\mathcal{W}_{\mathbf{r}}. We reason as follows:

|W| odd\displaystyle|W|\text{ odd} |L𝐰|=|iW𝐬i|1\@displayfalse(mod2)\displaystyle\implies|L\mathbf{w}|=\left|\sum_{i\in W}\mathbf{s}_{i}\right|\equiv 1{\@displayfalse\pmod{2}}  each |Si| is odd\displaystyle\lhd\text{ each $|S_{i}|$ is odd}
L𝐰𝒵;\displaystyle\implies L\mathbf{w}\in\mathcal{Z}\,;  definition of 𝒵\displaystyle\lhd\text{ definition of $\mathcal{Z}$}
and
|W| odd\displaystyle|W|\text{ odd} AL𝐰=AiW𝐬i=|W|A𝐬1=|W|𝐫=𝐫.\displaystyle\implies AL\mathbf{w}=A\sum_{i\in W}\mathbf{s}_{i}=|W|\cdot A\mathbf{s}_{1}=|W|\cdot\mathbf{r}=\mathbf{r}\,.  definition of A-uniformity\displaystyle\lhd\text{ definition of $A$-uniformity}

This completes the proof, by definition of 𝒲𝐫\mathcal{W}_{\mathbf{r}}. ∎

The next step of the proof is to repeatedly invoke the above round elimination lemma and carefully control parameters. To perform a sharp analysis, we introduce the following concept.

Definition 3.4.

A division sequence for nn is a finite sequence of positive integers d1djd_{1}\ldots d_{j} for which

n1d1+11d1+11dj+1=1.\displaystyle\left\lceil\cdots\left\lceil\left\lceil n\cdot\frac{1}{d_{1}+1}\right\rceil\frac{1}{d_{1}+1}\right\rceil\cdots\frac{1}{d_{j}+1}\right\rceil=1\,. (7)
Lemma 3.5.

Let d1,,djd_{1},\ldots,d_{j} be a division sequence for nn minimizing h=1jdh\sum_{h=1}^{j}d_{h}. Then

jn1/jjh=1jdhjn1/jj.\displaystyle jn^{1/j}-j\leq\sum_{h=1}^{j}d_{h}\leq j\lceil n^{1/j}\rceil-j\,.
Proof.

For the upper bound, let d1==dj=n1/j1d_{1}=\ldots=d_{j}=\lceil n^{1/j}\rceil-1. For the lower bound, remove the ceiling operations in eq. 7 to get

nh=1j(dh+1)1,\displaystyle\frac{n}{\prod_{h=1}^{j}(d_{h}+1)}\leq 1\,, which impliesn1/j(h=1j(dh+1))1/j.\displaystyle\qquad\text{which implies}\quad n^{1/j}\leq\smash{\left(\prod_{h=1}^{j}(d_{h}+1)\right)^{1/j}}\,.

By the AM-GM inequality,

h=1jdh=j(1jh=1j(dh+1)1)j((h=1j(dh+1))1/j1)j(n1/j1).\sum_{h=1}^{j}d_{h}=j\left(\frac{1}{j}\sum_{h=1}^{j}(d_{h}+1)-1\right)\geq\smash{j\left(\left(\prod_{h=1}^{j}(d_{h}+1)\right)^{1/j}-1\right)}\geq j(n^{1/j}-1)\,.\qed

This brings us to the main result of this section: a rounds-vs-queries tradeoff.

Theorem 3.6 (Restatement of Theorem 1.3).

LQ2k(elemxn(2))k(n1/k1)\operatorname{LQ}_{\mathbb{Z}_{2}}^{k}(\textsc{elemx}^{(2)}_{n})\geq k(n^{1/k}-1).

Proof.

Suppose that Π\Pi is a deterministic kk-round 2\mathbb{Z}_{2}-LQP for elemxn(2)\textsc{elemx}^{(2)}_{n}. Repeatedly applying Lemma 3.3, we obtain a sequence of protocols Π=Π1,Π2,,Πj+1\Pi=\Pi_{1},\Pi_{2},\ldots,\Pi_{j+1}, which solve elemx(2)\textsc{elemx}^{(2)} on progressively smaller input sizes, until Πj+1\Pi_{j+1} is a degenerate depth-0 protocol (in which no queries occur).

Let did_{i} be the cost of the root ρi\rho_{i} of Πi\Pi_{i}, for 1ij1\leq i\leq j. As Property (3.3.1) gives protocol homomorphisms φΠi+1:Πi+1Πi\varphi_{\Pi_{i+1}}:\Pi_{i+1}\to\Pi_{i}, we find the the roots of each Πi\Pi_{i} correspond to nodes ui=(φΠ2φΠi)(ρi)u_{i}=(\varphi_{\Pi_{2}}\circ\cdots\circ\varphi_{\Pi_{i}})(\rho_{i}) in Π\Pi. In fact, the vertices u1,u1,,uj+1u_{1},u_{1},\ldots,u_{j+1} form a path from the root ρ=u1\rho=u_{1} of Π\Pi to the leaf uj+1u_{j+1}. The inputs of Πj+1\Pi_{j+1} lift to inputs of Π\Pi which reach uj+1u_{j+1}. Lower bounding the query cost of Π\Pi using this branch gives

cost(Π)i=1jcost(ui)=i=1jdi.\displaystyle\operatorname{cost}(\Pi)\geq\sum_{i=1}^{j}\operatorname{cost}(u_{i})=\sum_{i=1}^{j}d_{i}\,. (8)

Using property (3.3.2) repeatedly, Πj+1\Pi_{j+1} must solve elemxm(2)\textsc{elemx}^{(2)}_{m}, for some integer

mn1d1+11d2+11dj+1.\displaystyle m\geq\left\lceil\cdots\left\lceil\left\lceil n\cdot\frac{1}{d_{1}+1}\right\rceil\frac{1}{d_{2}+1}\right\rceil\cdots\frac{1}{d_{j}+1}\right\rceil\,.

However, as Πj+1\Pi_{j+1} solves elemxm(2)\textsc{elemx}^{(2)}_{m} without performing any queries, there must be a fixed index which is a valid output for all inputs Z2[m]Z\in 2^{[m]} of odd size. This is only possible when m=1m=1; for any larger mm, the inputs Z={1}Z=\{1\} and Z={2}Z^{\prime}=\{2\} must produce different outputs.

Therefore, the integers d1,,djd_{1},\ldots,d_{j} form a division sequence for nn. Applying Lemma 3.5 to eq. 8,

cost(Π)i=1jdijn1/jjk(n1/k1),\displaystyle\operatorname{cost}(\Pi)\geq\sum_{i=1}^{j}d_{i}\geq jn^{1/j}-j\geq k(n^{1/k}-1)\,,

where the last inequality follows from the fact that ddz[z(n1/z1)]0\frac{d}{dz}\left[z(n^{1/z}-1)\right]\leq 0 for all z0z\geq 0. ∎

4 Linear Queries Modulo q

First, we use Theorem 2.4 to show that elemxn\textsc{elemx}_{n} is hard for q\mathbb{Z}_{q}-LQPs.

Proposition 4.1 (Restatement of Proposition 1.2).

For every q3q\geq 3, we have LQq(elemxn)n/(2qlnq)\operatorname{LQ}_{\mathbb{Z}_{q}}(\textsc{elemx}_{n})\geq n/(2q\ln q).

Proof.

This is proven with the same strategy as for Proposition 1.1. Assume for sake of contradiction that cost(Π)n2qlnq\operatorname{cost}(\Pi)\leq\frac{n}{2q\ln q}. Let ν\nu be the leaf (Π(𝟎))\ell(\Pi(\mathbf{0})). Let Aqd×nA\in\mathbb{Z}_{q}^{d\times n} be the matrix containing all queries along the path from the root of Π\Pi to ν\nu.

By Theorem 2.4, since the group qd\mathbb{Z}_{q}^{d} has order qdq^{d}, and exponent qq, any sequence of Dq(1+ln(qdq))D\leq q\left(1+\ln(\frac{q^{d}}{q})\right) elements in qd\mathbb{Z}_{q}^{d} has a nontrivial subsequence summing to 𝟎\mathbf{0}. As q3q\geq 3, dqlnqDdq\ln q\geq D. Thus, since n2dqlnqn\geq 2dq\ln q, picking disjoint subsets II and JJ of sizes dqlnqdq\ln q each, and applying the theorem implies there exist disjoint nonempty subsets Z1Z_{1} and Z2Z_{2} of [n][n] for which the corresponding columns of AA sum to 𝟎\mathbf{0}. In other words, Π\Pi reaches the same leaf given 𝐳1\mathbf{z}_{1} and 𝐳2\mathbf{z}_{2}, but the leaf cannot be assigned an output consistent with both. ∎

A similar strategy proves a lemma analogous to Lemma 3.2:

Lemma 4.2.

Every matrix Aqd×nA\in\mathbb{Z}_{q}^{d\times n}, admits an AA-uniform family S1,,SmS_{1},\ldots,S_{m} where

  1. (4.2.1)

    mn(d+1)qlnq1\displaystyle m\geq\frac{n}{(d+1)q\ln q}-1, and

  2. (4.2.2)

    each cardinality |Si|1(modq)|S_{i}|\equiv-1\pmod{q}.

Proof.

To be able to enforce constraints on the values |Si||S_{i}|, we define B:=[𝟏A𝖳]𝖳q(d+1)×nB:=\left[\mathbf{1}\mid A^{\mathsf{T}}\right]^{\mathsf{T}}\in\mathbb{Z}_{q}^{(d+1)\times n}, and let 𝐛1,,𝐛n\mathbf{b}_{1},\ldots,\mathbf{b}_{n} be its column vectors. We partition the columns of the matrix BB into disjoint subsets D1,,DkD_{1},\ldots,D_{k} of [n][n] by the following iterative procedure. In the procedure, let PP be the set of indices of [n][n] not yet chosen. Each set DiD_{i} starts out as \varnothing; then beginning with i=1i=1, each set DiD_{i} is expanded by picking an index jj from PP for which no subset H(Di{j})H\subseteq(D_{i}\cup\{j\}) has the property that hH𝐛h=𝟎\sum_{h\in H}\mathbf{b}_{h}=\mathbf{0}; adding jj to DiD_{i} and removing jj from PP; until no more such indices can be found. When DiD_{i} is done, start filling Di+1D_{i+1}, etc.

When q=2q=2, each DiD_{i} corresponds to a basis of a subspace of 2d+1\mathbb{Z}_{2}^{d+1}, so |Di|d+1<(d+1)2ln2|D_{i}|\leq d+1<(d+1)2\ln 2. For q3q\geq 3, we apply Theorem 2.4, using the fact that the group qd+1\mathbb{Z}_{q}^{d+1} has order qd+1q^{d+1} and exponent qq. The maximum possible size of each set DiD_{i} is then q(1+ln(qd+1q))1\leq q\left(1+\ln(\frac{q^{d+1}}{q})\right)-1. The upper bound (d+1)qlnq(d+1)q\ln q also holds here. Consequently, the number kk of sets formed is n(d+1)qlnq\geq\frac{n}{(d+1)q\ln q}. Pick some tDkt\in D_{k}; for any i<ki<k, since tt was not picked when DiD_{i} was constructed, it must be the case that there is a subset SiDiS_{i}\subseteq D_{i} for which hSi𝐛h+𝐛t=𝟎\sum_{h\in S_{i}}\mathbf{b}_{h}+\mathbf{b}_{t}=\mathbf{0}. This implies B𝐬i=hSi𝐛h=𝐛tB\mathbf{s}_{i}=\sum_{h\in S_{i}}\mathbf{b}_{h}=-\mathbf{b}_{t}. Since the first row of BB is 𝟏\mathbf{1}, we have |Si|hSi11(modq)|S_{i}|\equiv\sum_{h\in S_{i}}1\equiv-1\pmod{q}, so all the sets SiS_{i} have size 1(modq)-1\pmod{q}. Let 𝐫\mathbf{r} be the last dd entries of 𝐛t-\mathbf{b}_{t}; then for all ii, B𝐬i=𝐫B\mathbf{s}_{i}=\mathbf{r}. There are m=k1n(d+1)qlnq1m=k-1\geq\frac{n}{(d+1)q\ln q}-1 sets in total. ∎

Compared to elemx(2)\textsc{elemx}^{(2)}, there is a slight weakening of the main round elimination lemma, which is a direct consequence of the weakened Lemma 4.2. Instead of directly lower bounding the cost of elemxn(q)\textsc{elemx}^{(q)}_{n}, we prove separate lower bounds for each elemxn(q,h)\textsc{elemx}^{(q,h)}_{n}, for all h{1,,q1}h\in\{1,\ldots,q-1\}, and take their maximum. The search problem elemxn(q,h)\textsc{elemx}^{(q,h)}_{n} is elemxn\textsc{elemx}_{n} with the additional promise that the input set ZZ has size h(modq)\equiv h\pmod{q}.

Lemma 4.3 (Round elimination lemma).

Let Π\Pi be a kk-round q\mathbb{Z}_{q}-LQP for elemxn(q,h)\textsc{elemx}^{(q,h)}_{n}, where k1k\geq 1 and h{1,,q1}h\in\{1,\ldots,q-1\}. Then there exists a (k1)(k-1)-round q\mathbb{Z}_{q}-LQP Υ\Upsilon for elemxm(q,h)\textsc{elemx}_{m}^{(q,-h)}, such that

  1. (4.3.1)

    Υ\Upsilon shadows Π\Pi through a protocol homomorphism φΥ:ΥΠ\varphi_{\Upsilon}\colon\Upsilon\to\Pi;

  2. (4.3.2)

    mn(d+1)qlnq1\displaystyle m\geq\frac{n}{(d+1)q\ln q}-1, where dd is the cost of the root of Π\Pi.

Proof.

Let Aqd×nA\in\mathbb{Z}_{q}^{d\times n} be the label of the root of Π\Pi. Lemma 4.2 guarantees that there exists an AA-uniform family of size mm, where mm satisfies property (4.3.2), and A𝐬1==A𝐬m=𝐱A\mathbf{s}_{1}=\ldots=A\mathbf{s}_{m}=\mathbf{x}, and |S1||Sm|1(modq)|S_{1}|\equiv\ldots\equiv|S_{m}|\equiv-1\pmod{q}. Let LL be the lifting matrix from eq. 6, and 𝐫=h𝐱\mathbf{r}=-h\mathbf{x}. Applying Lemma 2.8 to Π\Pi, LL and 𝐫\mathbf{r}, we obtain a (k1)(k-1)-round q\mathbb{Z}_{q}-LQP Υ\Upsilon that shadows Π\Pi, and solves elemxm\textsc{elemx}_{m} on all inputs W[m]W\subseteq[m] for which AL𝐰=𝐫AL\mathbf{w}=\mathbf{r} and |L𝐰|h(modq)|L\mathbf{w}|\equiv h\pmod{q}. If WW fulfills the promise of elemxn(q,h)\textsc{elemx}^{(q,-h)}_{n}, that |W|h(modq)|W|\equiv-h\pmod{q}, then:

|L𝐰|\displaystyle|L\mathbf{w}| =|iWSi|=iW|Si|=|W|(1)=h(modq),\displaystyle=\left|\bigcup_{i\in W}S_{i}\right|=\sum_{i\in W}|S_{i}|=|W|\cdot(-1)=h\pmod{q}\,,
AL𝐰\displaystyle AL\mathbf{w} =iWA𝐬i=|W|𝐱=h𝐱=𝐫,\displaystyle=\sum_{i\in W}A\mathbf{s}_{i}=|W|\mathbf{x}=-h\mathbf{x}=\mathbf{r}\,,

which proves that Υ\Upsilon is correct on WW. ∎

This brings us to the main result of this section, which essentially generalizes the modulo-22 result from the previous section.

Theorem 4.4 (Restatement of Theorem 1.4).

For each q2q\geq 2, we have

LQqk(elemxn(q))13.67q1+1/kln2qk(n1/k1).\operatorname{LQ}_{\mathbb{Z}_{q}}^{k}(\textsc{elemx}^{(q)}_{n})\geq\frac{1}{3.67q^{1+1/k}\ln^{2}q}k(n^{1/k}-1)\,.
Proof.

Suppose that Π\Pi is a deterministic q\mathbb{Z}_{q}-LQP for elemxn(q,h)\textsc{elemx}^{(q,h)}_{n}. Repeatedly applying Lemma 4.3, we construct a sequence of protocols Π=Π1,Π2,,Πj+1\Pi=\Pi_{1},\Pi_{2},\ldots,\Pi_{j+1}, which respectively solve elemx(q,h)\textsc{elemx}^{(q,h)}, elemxn(q,h)\textsc{elemx}_{n}^{(q,-h)}, elemx(q,h),\textsc{elemx}^{(q,h)},\ldots\, on progressively smaller input sizes, until Πj+1\Pi_{j+1} is a degenerate depth-0 protocol (in which no queries occur), for elemxn(q,(1)jh)\textsc{elemx}_{n}^{(q,(-1)^{j}h)}. As in Section 3, the roots ρi\rho_{i} of the protocols Πi\Pi_{i}, 1ij1\leq i\leq j, which have cost did_{i}, correspond to a branch of Π\Pi formed by corresponding nodes uiu_{i} and ending at a leaf corresponding to the root of Πj+1\Pi_{j+1}. Then

cost(Π)i=1jcost(ui)=i=1jdi.\displaystyle\operatorname{cost}(\Pi)\geq\sum_{i=1}^{j}\operatorname{cost}(u_{i})=\sum_{i=1}^{j}d_{i}\,. (9)

Let δi:=(di+1)qlnq\delta_{i}:=(d_{i}+1)q\ln q. By Lemma 4.3, Πi\Pi_{i} solves elemxmi(q,(1)i1h)\textsc{elemx}^{(q,(-1)^{i-1}h)}_{m_{i}}, where m1=nm_{1}=n, and:

mi+1\displaystyle m_{i+1} mi(di+1)qlnq1=miδiδi.\displaystyle\geq\frac{m_{i}}{(d_{i}+1)q\ln q}-1=\frac{m_{i}-\delta_{i}}{\delta_{i}}\,. (10)

As Πj+1\Pi_{j+1} solves elemxmi(q,(1)jh)\textsc{elemx}^{(q,(-1)^{j}h)}_{m_{i}} without any queries, the problem must be trivial, necessitating mj+1qm_{j+1}\leq q. Combining eq. 10 for ii between 11 and jj and rearranging:

qni=1j=1iδ=1jδn\displaystyle q\geq\frac{n-\sum_{i=1}^{j}{\prod_{\ell=1}^{i}{\delta_{\ell}}}}{\prod_{\ell=1}^{j}{\delta_{\ell}}}\quad\implies\quad n q=1jδ+i=1j=1iδ(q+j)=1jδ.\displaystyle\leq q\prod_{\ell=1}^{j}{\delta_{\ell}}+\sum_{i=1}^{j}{\prod_{\ell=1}^{i}{\delta_{\ell}}}\leq(q+j)\prod_{\ell=1}^{j}{\delta_{\ell}}\,.

Further rearrangement lets us use AM-GM and an inequality derived from (q+j)qj(q+1)jq(q+j)q^{j}\leq(q+1)^{j}q:

(1ji=1jδi)(i=1jδi)1/j(nq+j)1/jqq+1(nq)1/j.\displaystyle\left(\frac{1}{j}\sum_{i=1}^{j}{\delta_{i}}\right)\geq\left(\prod_{i=1}^{j}{\delta_{i}}\right)^{1/j}\geq\left(\frac{n}{q+j}\right)^{1/j}\geq\frac{q}{q+1}\left(\frac{n}{q}\right)^{1/j}\,. (11)

We can now lower bound the query cost of Π\Pi:

cost(Π)\displaystyle\operatorname{cost}(\Pi) i=1j(di+1)j=1qlnqi=1jδij\displaystyle\geq\sum_{i=1}^{j}(d_{i}+1)-j=\frac{1}{q\ln q}\sum_{i=1}^{j}\delta_{i}-j  by eq. 9\displaystyle\lhd\text{ by \lx@cref{creftype~refnum}{eq:zq-lincost-sum}}
j(1(q+1)lnq(nq)1/j1)\displaystyle\geq j\left(\frac{1}{(q+1)\ln q}\left(\frac{n}{q}\right)^{1/j}-1\right)  by eq. 11\displaystyle\lhd\text{ by \lx@cref{creftype~refnum}{eq:zq-lincost-amgm}}
k(1(q+1)lnq(nq)1/k1).\displaystyle\geq k\left(\frac{1}{(q+1)\ln q}\left(\frac{n}{q}\right)^{1/k}-1\right)\,.  since dds[s(r1/s1)]0\displaystyle\lhd\text{ since $\frac{d}{ds}[s(r^{1/s}-1)]\leq 0$} (12)

This lower bound becomes negative for sufficiently large kk. To obtain a bound that remains positive for all kk, we combine it with an unconditional lower bound. First, we note that eq. 12 also applies to protocols solving elemxn(q)\textsc{elemx}^{(q)}_{n}, since elemxn(q,h)\textsc{elemx}^{(q,h)}_{n} was an easier case. For elemxn(q)\textsc{elemx}^{(q)}_{n}, the set of possible transcripts of any protocol Ψ\Psi forms a qq-ary prefix code of maximum length dd. If qd<nq^{d}<n, then by the pigeonhole principle Ψ\Psi must treat identically some pair of {1},{2},,{n}\{1\},\{2\},\ldots,\{n\}, which is a contradiction; thus cost(Π)lnn/lnq\operatorname{cost}(\Pi)\geq\ln n/\ln q. Combining this lower bound with eq. 12 and applying Lemma 8.1, we obtain

cost(Π)max{lnnlnq,k(1q1/k(q+1)lnqn1/k1)}k(n1/k1)q1/k(q+1)(lnq+1)lnqk(n1/k1)3.67q1+1/kln2q.\operatorname{cost}(\Pi)\geq\max\left\{\frac{\ln n}{\ln q},\,k\left(\frac{1}{q^{1/k}(q+1)\ln q}n^{1/k}-1\right)\right\}\geq\frac{k(n^{1/k}-1)}{q^{1/k}(q+1)(\ln q+1)\ln q}\geq\frac{k(n^{1/k}-1)}{3.67q^{1+1/k}\ln^{2}q}\,.\qed

5 Linear Queries Over the Integers

For \mathbb{Z}-LQPs, our main result is a 2-round lower bound for elemxn\textsc{elemx}_{n}. We require a careful accounting of the query cost of a protocol, to adjust for the fact that the (bit) size of the query results depends on the maximum entry value in a given query matrix. This motivates the following definition and observation.

Definition 5.1.

A \mathbb{Z}-LQP is said to be MM-bounded if each linear measurement can take at most MM distinct values. In particular, if the inputs to a [B,B]\mathbb{Z}_{[-B,B]}-LQP Π\Pi lie in {0,1}n\{0,1\}^{n}, then Π\Pi is (Bn+1)(Bn+1)-bounded.

Recall the problem elemxn1/4\textsc{elemx}^{1/4}_{n} defined in eq. 5. For nn divisible by 44, this is simply elemxn\textsc{elemx}_{n} under the additional promise that |Z|=n/4|Z|=n/4. We first prove a 11-round lower bound for this problem, under a slight additional assumption on nn.

Lemma 5.2.

Let n=4rn=4r where rr is a prime power. If Π\Pi is an MM-bounded one-round protocol for elemxn1/4\textsc{elemx}^{1/4}_{n},

cost(Π)0.14nlogM.\displaystyle\operatorname{cost}(\Pi)\geq 0.14\frac{n}{\log M}\,.
Proof.

Let d=cost(Π)d=\operatorname{cost}(\Pi) and let Ad×nA\in\mathbb{Z}^{d\times n} be the query performed by Π\Pi. We first consider what Π\Pi does on inputs of cardinality n/2n/2, even though such inputs lie outside the promise region of elemxn1/4\textsc{elemx}^{1/4}_{n}. Soon, we shall see how this helps.

Since Π\Pi is MM-bounded, the mapping 𝐳A𝐳\mathbf{z}\mapsto A\mathbf{z} from domain ([n]n/2)\binom{[n]}{n/2} to d\mathbb{Z}^{d} has no more than MdM^{d} possible output values. By the pigeonhole principle, there exists a vector 𝐰d\mathbf{w}\in\mathbb{Z}^{d} for which

𝐰:={𝐳([n]n/2):A𝐳=𝐰} has |𝐰|(nn/2)Md.\displaystyle\mathcal{F}_{\mathbf{w}}:=\left\{\mathbf{z}\in\binom{[n]}{n/2}:\,A\mathbf{z}=\mathbf{w}\right\}\qquad\text{ has }\qquad|\mathcal{F}_{\mathbf{w}}|\geq\binom{n}{n/2}M^{-d}\,.

If there exist two distinct vectors 𝐱,𝐲𝐰\mathbf{x},\mathbf{y}\in\mathcal{F}_{\mathbf{w}} such that |𝐱𝐲|=n/4|\mathbf{x}\cap\mathbf{y}|=n/4, then we can construct two disjoint vectors which Π\Pi can not distinguish, and thus cannot give a correct answer to elemxm1/4\textsc{elemx}^{1/4}_{m} in both cases. Specifically, |𝐱𝐲|=|𝐲𝐱|=n/4|\mathbf{x}\smallsetminus\mathbf{y}|=|\mathbf{y}\smallsetminus\mathbf{x}|=n/4 and

A(𝐱𝐲)=A𝐱A(𝐱𝐲)=𝐰A(𝐱𝐲)=A𝐲A(𝐱𝐲)=A(𝐲𝐱).\displaystyle A(\mathbf{x}\smallsetminus\mathbf{y})=A\mathbf{x}-A(\mathbf{x}\cap\mathbf{y})=\mathbf{w}-A(\mathbf{x}\cap\mathbf{y})=A\mathbf{y}-A(\mathbf{x}\cap\mathbf{y})=A(\mathbf{y}\smallsetminus\mathbf{x})\,.

By Theorem 2.6, if there does not exist such a pair 𝐱,𝐲\mathbf{x},\mathbf{y}, then we have an upper bound on |||\mathcal{F}|, and can derive

(nn/2)Md\displaystyle\binom{n}{n/2}M^{-d} ||(nn/41).\displaystyle\leq|\mathcal{F}|\leq\binom{n}{n/4-1}\,.

Therefore, (nn/2)Md(nn/4)\binom{n}{n/2}M^{-d}\leq\binom{n}{n/4} and we obtain

dlog(nn/2)log(nn/4)logMlog(42)log(41)4nlogM0.14nlogM.d\geq\frac{\log{\binom{n}{n/2}}-\log{\binom{n}{n/4}}}{\log M}\geq\frac{\log{\binom{4}{2}}-\log\binom{4}{1}}{4}\frac{n}{\log M}\geq 0.14\frac{n}{\log M}\,.\qed

It should be noted that without the promise that |Z|=n/4|Z|=n/4, a one-round lower bound would follow very easily. By a standard “decoding” argument, a one-round protocol for elemx can be used to recover the entire unknown input 𝐳\mathbf{z}. For completeness, we give the easy proof below. The reason we needed the much more complicated argument in Lemma 5.2 above is that the promise in elemxn1/4\textsc{elemx}^{1/4}_{n} prevents us from performing such a decoding.

Proposition 5.3 (Essentially a restatement of Proposition 1.5).

If Π\Pi is an MM-bounded one-round protocol for elemxn\textsc{elemx}_{n}, then cost(Π)n/logM1\operatorname{cost}(\Pi)\geq n/\log M-1.

Proof.

Modify Π\Pi to add the query 𝟏𝖳\mathbf{1}^{\mathsf{T}}, which reports |Z||Z|; this increases cost(Π)\operatorname{cost}(\Pi) by one. Let A(d+1)×nA\in\mathbb{Z}^{(d+1)\times n} be the modified query matrix. Since Π\Pi is correct, A𝐳A\mathbf{z} determines an index i1Zi_{1}\in Z. Let 𝐞i1\mathbf{e}_{i_{1}} be the indicator vector for i1i_{1}; since we know 𝐞i1\mathbf{e}_{i_{1}}, we can compute A(𝐳𝐞i1)A(\mathbf{z}-\mathbf{e}_{i_{1}}) without making another query; this is enough to find an index i2Z{i1}i_{2}\in Z\smallsetminus\{i_{1}\}. Repeating this |Z||Z| times, we can reconstruct ZZ from A𝐳A\mathbf{z} alone. (This works for all ZZ\neq\varnothing; since we query 𝟏𝖳\mathbf{1}^{\mathsf{T}}, we can also detect when |Z|=0|Z|=0.) By the pigeonhole principle, the number of possible values of A𝐳A\mathbf{z} must be at least the number of valid inputs, so Md+12nM^{d+1}\geq 2^{n}, which implies dn/logM1d\geq n/\log M-1. ∎

For our round elimination argument, we require the following claim, similar to Lemma 3.2 and Lemma 4.2. Even though the claim looks similar, the round elimination argument will be subtly different from its 2\mathbb{Z}_{2} and q\mathbb{Z}_{q} predecessors.

Claim 5.4.

Every matrix Ad×nA\in\mathbb{Z}^{d\times n} admits an AA-uniform family S1,,SmS_{1},\ldots,S_{m} of size mn/(c0dlognlogM)1m\geq n/(c_{0}d\log n\log M)-1, for some absolute constant c0c_{0}.

Proof.

Put t=dlogMt={\lceil{d\log M}\rceil}. Since Π\Pi is MM-bounded, the mapping 𝐱A𝐱\mathbf{x}\mapsto A\mathbf{x} sends the vectors in {𝐱{0,1}n:|𝐱|=t}\{\mathbf{x}\in\{0,1\}^{n}:\,|\mathbf{x}|=t\} to vectors in d\mathbb{Z}^{d} where each entry comes from a set of cardinality MM. By the pigeonhole principle, there exists a vector 𝐫~d\mathbf{\tilde{r}}\in\mathbb{Z}^{d} such that

:={𝐱{0,1}n:|𝐱|=t and A𝐱=𝐫~} has cardinality ||(nt)Md.\displaystyle\mathcal{F}:=\{\mathbf{x}\in\{0,1\}^{n}:\,|\mathbf{x}|=t\text{ and }A\mathbf{x}=\mathbf{\tilde{r}}\}\text{~{}~{}has cardinality~{}~{}}|\mathcal{F}|\geq\dbinom{n}{t}M^{-d}\,. (13)

We claim that \mathcal{F} contains an mm-sunflower for some integer mm. Indeed, take mm to be the largest integer satisfying

mtlogn<n2c1, which ensures that mnc0dlognlogM1.\displaystyle mt\log n<\frac{n}{2c_{1}}\,,\qquad\text{ which ensures that }\qquad m\geq\frac{n}{c_{0}d\log n\log M}-1\,. (14)

This satisfies the claimed bound upon taking c0=2c1c_{0}=2c_{1} (say). Continuing from eq. 13,

||\displaystyle|\mathcal{F}| (nt)tMd\displaystyle\geq\left(\frac{n}{t}\right)^{t}M^{-d}  standard estimate\displaystyle\lhd\text{ standard estimate}
(n2t)t\displaystyle\geq\left(\frac{n}{2t}\right)^{t}  definition of t\displaystyle\lhd\text{ definition of $t$}
(c1mlogn)t\displaystyle\geq(c_{1}m\log n)^{t}  by eq. 14\displaystyle\lhd\text{ by \lx@cref{creftype~refnum}{eq:m-constraint}}
(c1mlog(mt))t,\displaystyle\geq(c_{1}m\log(mt))^{t}\,,  by eq. 14, again\displaystyle\lhd\text{ by \lx@cref{creftype~refnum}{eq:m-constraint}, again}

whence the required sunflower exists, by Theorem 2.5.

Let S~1,,S~m\widetilde{S}_{1},\ldots,\widetilde{S}_{m} be sets constituting such an mm-sunflower and let V=i=1mS~iV=\bigcap_{i=1}^{m}\widetilde{S}_{i} be the common pairwise intersection. Define Si=S~iVS_{i}=\widetilde{S}_{i}\smallsetminus V, for each i[m]i\in[m]. We then have A𝐬i=A(𝐬~i𝐯)=𝐫~A𝐯A\mathbf{s}_{i}=A(\mathbf{\tilde{s}}_{i}-\mathbf{v})=\mathbf{\tilde{r}}-A\mathbf{v} for each ii, whence S1,,SmS_{1},\ldots,S_{m} is an AA-uniform family. ∎

Lemma 5.5 (Round elimination lemma).

Let Π\Pi be a kk-round MM-bounded \mathbb{Z}-LQP for elemxn\textsc{elemx}_{n}, where k1k\geq 1 and nn is an integer. Then there exists a (k1)(k-1)-round MM-bounded \mathbb{Z}-LQP Υ\Upsilon for elemxm1/4\textsc{elemx}^{1/4}_{m}, such that

  1. (5.5.1)

    Υ\Upsilon shadows Π\Pi through a homomorphism φΥ:ΥΠ\varphi_{\Upsilon}\colon\Upsilon\to\Pi;

  2. (5.5.2)

    m/4m/4 is a prime number and

    mn2c0dlognlogM2,\displaystyle m\geq\frac{n}{2c_{0}d\log n\log M}-2\,,

    where dd is the cost of the root of Π\Pi, and c0c_{0} the constant from 5.4.

Proof.

Let Ad×nA\in\mathbb{Z}^{d\times n} be the label of the root of Π\Pi. By 5.4, there is an AA-uniform family S1,,SmS_{1},\ldots,S_{m^{\prime}} of size mm^{\prime}. By Bertrand’s postulate, there exists a prime number pp between between m/8m^{\prime}/8 and m/4m^{\prime}/4; let m=4pm=4p. Let 𝐱=A𝐬1\mathbf{x}=A\mathbf{s}_{1}, 𝐫=(m/4)𝐱\mathbf{r}=(m/4)\mathbf{x} and let LL be the lifting matrix defined from {Si}i=1m\{S_{i}\}_{i=1}^{m} according to eq. 6. Using Lemma 2.8 on Π\Pi, LL, and 𝐫\mathbf{r}, we obtain a (k1)(k-1)-round \mathbb{Z}-LQP Υ\Upsilon that shadows Π\Pi, and solves elemxm\textsc{elemx}_{m} on all inputs W[m]W\subseteq[m] for which L𝐰𝟎L\mathbf{w}\neq\mathbf{0} and AL𝐰=𝐫AL\mathbf{w}=\mathbf{r}. While the queries performed by Υ\Upsilon may have larger coefficients than those of Π\Pi, the construction of Υ\Upsilon described in Section 2.3 only restricts the possible results of each individual linear measurement performed, so Υ\Upsilon is still MM-bounded. Finally, if |W|=m/4|W|=m/4, then since LL has full rank, L𝐰𝟎L\mathbf{w}\neq\mathbf{0}; and furthermore

AL𝐰=iWA𝐬i=|W|𝐱=m4𝐱=𝐫.\displaystyle AL\mathbf{w}=\sum_{i\in W}A\mathbf{s}_{i}=|W|\mathbf{x}=\frac{m}{4}\mathbf{x}=\mathbf{r}\,.

This implies that Υ\Upsilon gives the correct output for W[m]W\subseteq[m] fulfilling the promise of elemxm1/4\textsc{elemx}^{1/4}_{m}. ∎

The preceding round elimination lemma has a key limitation: it requires a protocol for elemxn\textsc{elemx}_{n} to create one for elemxm1/4\textsc{elemx}^{1/4}_{m}. Because of this, it is not possible to apply the lemma to its own output, and thereby obtain a kk-round lower bound. Say we were to try, and AA were the matrix at the root of the protocol Π\Pi for elemxn1/4\textsc{elemx}^{1/4}_{n}. Then if AA contained an all-ones row, 5.4 might produce an AA-uniform family with all set sizes |Si||S_{i}| equal to some constant bb which is not a factor of n/4n/4. Then lifting inputs WW of size m/4m/4 to inputs ZZ of size n/4n/4 would fail, because n/4=|Z|=b|W|n/4=|Z|=b|W| would imply that bb divides n/4n/4, a contradiction.

With that said, we now use our round elimination lemma in a one-shot fashion to obtain our main result for integer LQPs.

Theorem 5.6 (Restatement of Theorem 1.6).

LQ[B,B]2(elemxn)=Ω(n/(log3/2(nB)))\operatorname{LQ}_{\mathbb{Z}_{[-B,B]}}^{2}(\textsc{elemx}_{n})=\Omega(\sqrt{n}/(\log^{3/2}(nB))).

Proof.

Suppose that Π\Pi is a deterministic 2-round [B,B]\mathbb{Z}_{[-B,B]}-LQP for elemxn\textsc{elemx}_{n}, whose root has cost d1d_{1}. By Lemma 5.5, there is a one round O(nB)O(nB)-bounded protocol for elemxm1/4\textsc{elemx}^{1/4}_{m} with cost d2d_{2}. Combining the following three equations:

cost(Π)\displaystyle\operatorname{cost}(\Pi) d1+d2\displaystyle\geq d_{1}+d_{2}
d2\displaystyle d_{2} 0.14mlogM\displaystyle\geq\frac{0.14m}{\log M}  from Lemma 5.2\displaystyle\lhd\text{ from \lx@cref{creftypecap~refnum}{lem:int-quarter-oneround}}
m\displaystyle m n2c0d1lognlogM2\displaystyle\geq\frac{n}{2c_{0}d_{1}\log n\log M}-2  from Lemma 5.5\displaystyle\lhd\text{ from \lx@cref{creftypecap~refnum}{lem:zz-round-elim}}

gives

cost(Π)0.19nc0lognlog2M2=Ω(nlog3/2(nB)).\operatorname{cost}(\Pi)\geq 0.19\sqrt{\frac{n}{c_{0}\log n\log^{2}M}}-2=\Omega\left(\frac{\sqrt{n}}{\log^{3/2}(nB)}\right)\,.\qed

6 Upper Bounds

For the sake of completeness, we provide details of the LQPs attaining various upper bounds referenced throughout the paper. For the most part, these upper bounds are simple observations or extensions of well-known existing results.

6.1 Deterministic k-round LQP for elemx

The following family of protocols works both when D=[0,1]D=\mathbb{Z}_{[0,1]} on the problem elemxn\textsc{elemx}_{n}, and when D=qD=\mathbb{Z}_{q} on the problem elemxn(q)\textsc{elemx}^{(q)}_{n}. The algorithm appears to be well known, and versions of it are described in Lemma 4.1 of [ACK20] and Section 2.2 of [KUW88].

Let d1,,dkd_{1},\ldots,d_{k} be a division sequence (see Lemma 3.5) for nn, which minimizes i=1kdi\sum_{i=1}^{k}d_{i}. Algorithm 1 makes no more than drd_{r} queries in each round rr.

Algorithm 1  Outline of deterministic query protocol on 𝐳\mathbf{z}
1:[u,v][1,n][u,v]\leftarrow[1,n]
2:for r=1,,kr=1,\ldots,k do
3:     Split the interval [u,v][u,v] into dr+1d_{r}+1 intervals J1,,Jdr+1J_{1},\ldots,J_{d_{r}+1}, each of size vu+1dr+1\leq\lceil\frac{v-u+1}{d_{r}+1}\rceil
4:     Query with matrix ADdr×nA\in D^{d_{r}\times n}, where Ai,jA_{i,j} is 11 if jJij\in J_{i} and 0 otherwise.
5:     If A𝐳A\mathbf{z} is not all zero, let i[dr]i\in[d_{r}] be the index of any nonzero entry; otherwise, let i=dr+1i=d_{r}+1.
6:     Update [u,v]Ji[u,v]\leftarrow J_{i}.
7:Report uu as the index where uZu\in Z.

Since d1,,dkd_{1},\ldots,d_{k} is a division sequence for nn, the final interval [u,v][u,v] must have u=vu=v. The total cost of the protocol is i=1kdi\sum_{i=1}^{k}d_{i}, which by Lemma 3.5 lies in the interval [k(n1/k1),k(n1/k1)][k(n^{1/k}-1),k(\lceil n^{1/k}\rceil-1)]. Note that when n=2kn=2^{k}, the algorithm cost is exactly kk.

Write 𝟏S\mathbf{1}_{S} to denote the indicator vector in DnD^{n} for a given set S[n]S\subseteq[n]. To prove that the algorithm is correct, it suffices to verify that 𝟏[u,v]𝖳𝐳0\mathbf{1}^{\mathsf{T}}_{[u,v]}\mathbf{z}\neq 0 in each round. Since 𝟏𝖳𝐳0\mathbf{1}^{\mathsf{T}}\mathbf{z}\neq 0, this is true at the start. For any given round, the matrix AA queries 𝟏J1,,𝟏Jdr\mathbf{1}_{J_{1}},\ldots,\mathbf{1}_{J_{d_{r}}}. Since 𝟏[u,v]=i=1dr+1𝟏Ji\mathbf{1}_{[u,v]}=\sum_{i=1}^{d_{r}+1}\mathbf{1}_{J_{i}}, and 𝟏[u,v]𝖳𝐳0\mathbf{1}^{\mathsf{T}}_{[u,v]}\mathbf{z}\neq 0, there must be some first index ii for which 𝟏Ji𝖳𝐳0\mathbf{1}^{\mathsf{T}}_{J_{i}}\mathbf{z}\neq 0. If i<dr+1i<d_{r}+1, the index is shown in the query response; if i=dr+1i=d_{r}+1, then no other intervals JhJ_{h} have 𝟏Jh𝖳𝐳0\mathbf{1}^{\mathsf{T}}_{J_{h}}\mathbf{z}\neq 0, so A𝐳A\mathbf{z} is all zeros. In either case, the algorithm correctly identifies the interval JiJ_{i} for which 𝟏Ji𝖳𝐳0\mathbf{1}^{\mathsf{T}}_{J_{i}}\mathbf{z}\neq 0.

6.2 Randomized 1-round LQP for elemx

The 0\ell_{0}-sampling algorithm from [JST11] relies on a standard result on the exact recovery of sparse vectors in n\mathbb{R}^{n}, which (paraphrasing) states that O(s)O(s) \mathbb{R}-linear queries suffice to exactly recover any ss-sparse vector 𝐯\mathbf{v} in n\mathbb{R}^{n}, or if 𝐯\mathbf{v} is not sparse, say that the output is dense with high probability. The 0\ell_{0}-sampling algorithm then chooses subsets {Ti}i=1logn\{T_{i}\}_{i=1}^{\lceil\log n\rceil} where each TiT_{i} is uniformly randomly drawn from the set of all subsets of [n][n] of size 2i2^{i}. To obtain a constant final error probability, for each set TiT_{i}, the 0\ell_{0}-sampler runs the sparse recovery method on the coordinates given by TiT_{i} with s=O(1)s=O(1). The sampler then returns a random index from the first sparse recovery instance to successfully recover a nonzero vector. With high probability, at least one of the sets TiT_{i} will contain fewer than O(1)O(1) entries of ZZ, and the algorithm succeeds.

Recovering ss-sparse vectors in {0,1}n\{0,1\}^{n} is easier than recovering general ss-sparse vectors in n\mathbb{R}^{n} or n\mathbb{Z}^{n}, so directly adapting [JST11]’s 0\ell_{0}-sampling algorithm to elemx means only O(logn)O(\log n) queries are needed for [B,B]\mathbb{Z}_{[-B,B]} with B=O(poly(n))B=O(\operatorname{poly}(n)), and O(log2n/logq)O(\log^{2}n/\log q) for q\mathbb{Z}_{q}. This follows from the costs of ss-sparse recovery and detection with DD-linear queries and {0,1}n\{0,1\}^{n}, addressed in the following lemma. We spell out this result and its proof for the sake of completeness: though it may be folklore, it appears not to have been published in quite this form.

Lemma 6.1 (Discrete ss-sparse recovery).

There exists a query matrix H[B,B]r×nH\in\mathbb{Z}_{[-B,B]}^{r\times n} for r=O(slogn/logB)r=O(s\log n/\log B) for which the query HvHv returns a unique value for all V[n]V\subseteq[n] with |V|s|V|\leq s. The same holds true for q\mathbb{Z}_{q} with r=O(slogn/logq)r=O(s\log n/\log q).

Proof.

Call a matrix in ADr×tA\in D^{r\times t} full-[1,1][-1,1]-rank if there does not exist a nonzero vector 𝐯{1,0,1}t\mathbf{v}\in\{-1,0,1\}^{t} for which A𝐯=𝟎A\mathbf{v}=\mathbf{0}. If we choose a matrix BDr×tB\in D^{r\times t} uniformly at random, then it is full-[1,1][-1,1]-rank with probability 13t/|D|r\geq 1-3^{t}/|D|^{r}. One way to prove this is to consider columns the 𝐛1𝐛t\mathbf{b}_{1}\ldots\mathbf{b}_{t} of BB one by one, and note that if each 𝐛i\mathbf{b}_{i} is not contained in the set Fi:={j=1i1ai𝐛i:a{1,0,1}i1}F_{i}:=\{\sum_{j=1}^{i-1}a_{i}\mathbf{b}_{i}:a\in\{-1,0,1\}^{i-1}\}, then BB has full-[1,1][-1,1]-rank. Since BB is chosen uniformly at random, each column is independent of the the earlier ones, so

Pr[D doesn’t have full [1,1]-rank]\displaystyle\Pr[\text{D doesn't have full $[-1,1]$-rank}] i=1tPr[𝐛iFi]i=1t3i1|D|r3t|D|r.\displaystyle\leq\sum_{i=1}^{t}\Pr[\mathbf{b}_{i}\notin F_{i}]\leq\sum_{i=1}^{t}\frac{3^{i-1}}{|D|^{r}}\leq\frac{3^{t}}{|D|^{r}}\,.

Let rr be chosen later; if we pick H^Dr×n\hat{H}\in D^{r\times n} uniformly at random, then the expected number of sets T[n]T\subseteq[n] with |T|=2s|T|=2s for which H^T\hat{H}_{T} (the submatrix of H^\hat{H} with columns in TT) has full [1,1][-1,1]-rank is (n2s)32s/|D|n\leq\binom{n}{2s}3^{2s}/|D|^{n}. Letting r=2slog(3n)/log(|D|)r=\lceil 2s\log(3n)/\log(|D|)\rceil makes this less than 11. Consequently, there must exist a specific matrix HH for which every such submatrix HTH_{T} has full [1,1][-1,1]-rank. Then for any two distinct vectors 𝐮,𝐯{0,1}n\mathbf{u},\mathbf{v}\in\{0,1\}^{n} with |U|,|V|s|U|,|V|\leq s, we cannot have H𝐮=H𝐯H\mathbf{u}=H\mathbf{v}, because that would imply there exists TUVT\supseteq U\cup V with |T|=2s|T|=2s for which HT(bubv)=0H_{T}(bu-bv)=0, contradicting the full [1,1][-1,1]-rank assumption. ∎

Detecting whether a {0,1}n\{0,1\}^{n} vector is not ss-sparse is also easier than in n\mathbb{R}^{n}. For [B,B]\mathbb{Z}_{[-B,B]}-LQPs, querying with the vector 𝟏d\mathbf{1}\in\mathbb{Z}^{d} suffices. For q\mathbb{Z}_{q}, because Lemma 6.1 ensures that if a vector 𝐳\mathbf{z} is s-sparse, it can be recovered exactly, it is enough to query O(1)O(1) random vectors in qn\mathbb{Z}_{q}^{n}. Let 𝐫\mathbf{r} be such a random vector, and let 𝐰\mathbf{w} be the ss-sparse vector in {0,1}\{0,1\} recovered using HH; if 𝐳\mathbf{z} was s-sparse, then 𝐳=𝐰\mathbf{z}=\mathbf{w} and 𝐫𝖳𝐳=𝐫𝖳𝐰\mathbf{r}^{\mathsf{T}}\mathbf{z}=\mathbf{r}^{\mathsf{T}}\mathbf{w}; otherwise, 𝐫𝖳𝐳\mathbf{r}^{\mathsf{T}}\mathbf{z} does not equal 𝐫𝖳𝐰\mathbf{r}^{\mathsf{T}}\mathbf{w} with probability 11/q1-1/q.

7 Connections Between 2\mathbb{Z}_{2}-LQPs and Circuit Complexity

A weaker version of Theorem 1.3 can be proven by combining existing results. As shown in the following lemma, a given kk-round 2\mathbb{Z}_{2}-LQP Π\Pi for elemx(2)\textsc{elemx}^{(2)} can be converted to a communication protocol Υ\Upsilon for the Karchmer-Wigderson game on parityn\textsc{parity}_{n}, with the communication cost CC of Υ\Upsilon being 2cost(Π)\leq 2\operatorname{cost}(\Pi). By a slight adaptation of the proof of Theorem 5 in [NW93], we can convert Υ\Upsilon into an unbounded fan-in boolean formula with depth k+1k+1 and no more than 2C12^{C}-1 AND/OR gates that computes parityn\textsc{parity}_{n}. Relatively tight lower bounds on the size of such a formula date back to [Has86], but we use a result of [Ros15], which says that a depth-(k+1)(k+1) unbounded fan-in formula computing parityn\textsc{parity}_{n} must have at least 2Ω(k(n1/k1)2^{\Omega(k(n^{1/k}-1)} AND/OR gates. Thus cost(Π)12CΩ(k(n1/k1))\operatorname{cost}(\Pi)\geq\frac{1}{2}C\geq\Omega(k(n^{1/k}-1)).

Lemma 7.1.

Consider the Karchmer-Wigderson game for parityn\textsc{parity}_{n}, in which Alice has a set X{0,1}nX\in\{0,1\}^{n} with |X||X| even, and Bob has a set Y{0,1}nY\in\{0,1\}^{n} with |Y||Y| odd, and they seek to identify an index i[n]i\in[n] for which 𝐱i𝐲i\mathbf{x}_{i}\neq\mathbf{y}_{i}. Let Π\Pi be a kk-round 2\mathbb{Z}_{2}-LQP for elemxn(2)\textsc{elemx}^{(2)}_{n}; then there exists Υ\Upsilon a kk-round communication protocol for this game, with cost 2cost(Π)\leq 2\operatorname{cost}(\Pi).

Proof.

Let ρ\rho be the root of Π\Pi, with label Aρ2dρ×nA_{\rho}\in\mathbb{Z}_{2}^{d_{\rho}\times n}. In the first round of Υ\Upsilon, Alice sends Aρ𝐱A_{\rho}\mathbf{x} to Bob. Then Bob computes Aρ𝐲A_{\rho}\mathbf{y}, and uses Alice’s message to determine 𝐫1=Aρ(𝐱+𝐲)\mathbf{r}_{1}=A_{\rho}(\mathbf{x}+\mathbf{y}). The value 𝐫1\mathbf{r}_{1} determines a child node ν\nu of ρ\rho. If this is a leaf, Bob outputs its label oνo_{\nu}. Otherwise, in the second round, Bob sends both Aρ𝐲A_{\rho}\mathbf{y} and Aν𝐲A_{\nu}\mathbf{y} to Alice. Given Aρ𝐲A_{\rho}\mathbf{y}, Alice can determine ν\nu, and compute Aν𝐱A_{\nu}\mathbf{x}. With this, Alice can compute 𝐫2=Aν(𝐱+𝐲)\mathbf{r}_{2}=A_{\nu}(\mathbf{x}+\mathbf{y}), and identify the child node μ\mu of ν\nu. If this is a leaf, Alice outputs oμo_{\mu}; otherwise, in the third round, Alice sends Aν𝐱A_{\nu}\mathbf{x} and Aμ𝐱A_{\mu}\mathbf{x} to Bob; the players continue in this fashion until a leaf is reached and the protocol ends; since Π\Pi has depth kk, this takes at most kk rounds.

This protocol is correct, because it finds the leaf of Π\Pi associated to the input 𝐱+𝐲\mathbf{x}+\mathbf{y}. Since we are promised 𝐱\mathbf{x} has even parity, and 𝐲\mathbf{y} odd, 𝐱+𝐲\mathbf{x}+\mathbf{y} has odd parity and thus fulfills the condition under which a protocol for elemxn(2)\textsc{elemx}^{(2)}_{n} must be correct. The output value is an index ii where 𝐱i+𝐲i=1\mathbf{x}_{i}+\mathbf{y}_{i}=1, hence where 𝐱i𝐲i\mathbf{x}_{i}\neq\mathbf{y}_{i}, as required for the communication game.

Since Aρ𝐱2dρA_{\rho}\mathbf{x}\in\mathbb{Z}_{2}^{d_{\rho}}, the round first message uses exactly dρd_{\rho} bits. The second, dρ+dνd_{\rho}+d_{\nu}, the third, dν+dμd_{\nu}+d_{\mu}, and so on. The communication needed on inputs (𝐱,𝐲\mathbf{x},\mathbf{y}) is thus at most twice cost(Π,𝐱+𝐲)\operatorname{cost}(\Pi,\mathbf{x}+\mathbf{y}), so the worst-case communication cost of Υ\Upsilon is at most 2cost(Π)2\operatorname{cost}(\Pi). ∎

8 Appendix

The following estimate was used in Section 4 during calculations in the proof of our q\mathbb{Z}_{q}-LQP lower bound.

Lemma 8.1.

Let C,DC,D be constants with 2CD2C\leq D and D1D\geq 1. Then

max(lnnC,k(1Dn1/k1))1D(1+C)k(n1/k1).\displaystyle\max\left(\frac{\ln n}{C},k\left(\frac{1}{D}n^{1/k}-1\right)\right)\geq\frac{1}{D(1+C)}k\left(n^{1/k}-1\right)\,. (15)
Proof.

Let γn(k)=k(n1/k1)\gamma_{n}(k)=k(n^{1/k}-1). We have k(1Dn1/k1)1Dγn(k)kk\left(\frac{1}{D}n^{1/k}-1\right)\geq\frac{1}{D}\gamma_{n}(k)-k. Since γn(k)\gamma_{n}(k) is decreasing, let kk_{\star} be the unique solution to 1Dγn(k)=1Clnn\frac{1}{D}\gamma_{n}(k_{\star})=\frac{1}{C}\ln n. Since γn(lnn)=(e1)lnn2lnnDClnn\gamma_{n}(\ln n)=(e-1)\ln n\leq 2\ln n\leq\frac{D}{C}\ln n, it follows klnnk_{\star}\leq\ln n. Let kk_{\dagger} be the unique solution to 1Dγn(k)k=1Clnn\frac{1}{D}\gamma_{n}(k_{\dagger})-k_{\dagger}=\frac{1}{C}\ln n. Since kkk_{\dagger}\leq k_{\star}, klnnk_{\dagger}\leq\ln n as well. Evaluating the right hand side of eq. 15 at kk_{\dagger} gives:

1D(1+C)γn(k)\displaystyle\frac{1}{D(1+C)}\gamma_{n}(k_{\dagger}) =1D(1+C)(DlnnC+Dk)\displaystyle=\frac{1}{D(1+C)}\left(\frac{D\ln n}{C}+Dk_{\dagger}\right)
1D(1+C)(DlnnC+Dlnn)\displaystyle\leq\frac{1}{D(1+C)}\left(\frac{D\ln n}{C}+D\ln n\right)
=lnnC=1Dγn(k)k.\displaystyle=\frac{\ln n}{C}=\frac{1}{D}\gamma_{n}(k_{\dagger})-k_{\dagger}\,.

Because the derivative of 1D(1+C)γn(k)\frac{1}{D(1+C)}\gamma_{n}(k) is less that of lnnC\frac{\ln n}{C} when kkk\geq k_{\dagger}, and greater than that of 1Dγn(k)k\frac{1}{D}\gamma_{n}(k)-k for kkk\leq k_{\dagger}, we can extend this inequality to all k(0,)k\in(0,\infty), proving eq. 15. ∎

References

  • [AAAK17] Arpit Agarwal, Shivani Agarwal, Sepehr Assadi, and Sanjeev Khanna. Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons. In Satyen Kale and Ohad Shamir, editors, Proceedings of the 30th Conference on Learning Theory, COLT 2017, Amsterdam, The Netherlands, 7-10 July 2017, volume 65 of Proceedings of Machine Learning Research, pages 39–75. PMLR, 2017.
  • [AB19] Hasan Abasi and Nader Bshouty. On learning graphs with edge-detecting queries. In Algorithmic Learning Theory, pages 3–30. PMLR, 2019.
  • [ABB+17] Andris Ambainis, Kaspars Balodis, Aleksandrs Belovs, Troy Lee, Miklos Santha, and Juris Smotrovs. Separations in query complexity based on pointer functions. J. ACM, 64(5):32:1–32:24, 2017.
  • [ACK20] Sepehr Assadi, Deeparnab Chakrabarty, and Sanjeev Khanna. Graph connectivity and single element recovery via linear and or queries. arXiv preprint arXiv:2007.06098, 2020.
  • [AGM12] Kook Jin Ahn, Sudipto Guha, and Andrew McGregor. Analyzing graph structure via linear measurements. In Proc. 23rd Annual ACM-SIAM Symposium on Discrete Algorithms, pages 459–467, 2012.
  • [ALWZ20] Ryan Alweiss, Shachar Lovett, Kewen Wu, and Jiapeng Zhang. Improved bounds for the sunflower lemma. In Proc. 52nd Annual ACM Symposium on the Theory of Computing, pages 624–630, 2020.
  • [BCW20] Tolson Bell, Suchakree Chueluecha, and Lutz Warnke. Note on sunflowers. CoRR, abs/2009.09327, 2020.
  • [BdW02] Harry Buhrman and Ronald de Wolf. Complexity measures and decision tree complexity: a survey. Theor. Comput. Sci., 288(1):21–43, 2002.
  • [BHPSR+20] Paul Beame, Sariel Har-Peled, Natarajan Sivaramakrishnan Ramamoorthy, Cyrus Rashtchian, and Makrand Sinha. Edge estimation with independent set oracles. ACM Trans. Alg., 16(4):1–27, 2020.
  • [BKS17] Paul Beame, Paraschos Koutris, and Dan Suciu. Communication steps for parallel query processing. J. ACM, 64(6):40:1–40:58, 2017.
  • [BS18] Eric Balkanski and Yaron Singer. The adaptive complexity of maximizing a submodular function. In Proc. 50th Annual ACM Symposium on the Theory of Computing, pages 1138–1151, 2018.
  • [Bsh09] Nader H. Bshouty. Optimal algorithms for the coin weighing problem with a spring scale. In COLT 2009 - The 22nd Conference on Learning Theory, Montreal, Quebec, Canada, June 18-21, 2009, 2009.
  • [CCM16] Amit Chakrabarti, Graham Cormode, and Andrew McGregor. Robust lower bounds for communication and stream computation. Theor. Comput., 12(1):1–35, 2016. Preliminary version in Proc. 40th Annual ACM Symposium on the Theory of Computing, pages 641–649, 2008.
  • [CF14] Graham Cormode and Donatella Firmani. A unifying framework for \ell0-sampling algorithms. Distributed Parallel Databases, 32(3):315–335, 2014.
  • [CG18] Clément L. Cannone and Tom Gur. An adaptivity hierarchy theorem for property testing. computational complexity, 27(4):671–716, 2018.
  • [CK16] Amit Chakrabarti and Sagar Kale. Strong fooling sets for multi-player communication with applications to deterministic estimation of stream statistics. In Proc. 57th Annual IEEE Symposium on Foundations of Computer Science, pages 41–50, 2016.
  • [CR04] Amit Chakrabarti and Oded Regev. An optimal randomised cell probe lower bound for approximate nearest neighbour searching. In Proc. 45th Annual IEEE Symposium on Foundations of Computer Science, pages 473–482, 2004.
  • [CW16] Amit Chakrabarti and Anthony Wirth. Incidence geometries and the pass complexity of semi-streaming set cover. In Proc. 27th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1365–1373, 2016.
  • [CY20] Graham Cormode and Ke Yi. Small Summaries for Big Data. Cambridge University Press, Cambridge, 2020.
  • [DL78] David P. Dobkin and Richard J. Lipton. A lower bound of the ½n2 on linear search programs for the knapsack problem. J. Comput. Syst. Sci., 16(3):413–417, 1978.
  • [Don06] David L. Donoho. Compressed sensing. IEEE Trans. Inf. Theory, 52(4):1289–1306, 2006.
  • [ER60] Paul Erdős and Richard Rado. Intersection theorems for systems of sets. Journal London Math. Soc., 35(1):85–90, 1960.
  • [ER63] Paul Erdős and Alfréd Rényi. On two problems of information theory. Publ. Math. Inst. Hung. Acad. Sci., Ser. A, 8:241–254, 1963.
  • [ER14] Yuval Emek and Adi Rosén. Semi-streaming set cover. In Proc. 41st International Colloquium on Automata, Languages and Programming, pages 453–464, 2014.
  • [FIS08] Gereon Frahling, Piotr Indyk, and Christian Sohler. Sampling in dynamic data streams and applications. Int. J. Comput. Geom. Appl., 18(1/2):3–28, 2008.
  • [FW81] Péter Frankl and Richard M. Wilson. Intersection theorems with geometric consequences. Combinatorica, 1(3):357–368, 1981.
  • [GG06] Weidong Gao and Alfred Geroldinger. Zero-sum problems in finite abelian groups: A survey. Expositiones Mathematicae, 24(4):337–369, 2006.
  • [GM09] Sudipto Guha and Andrew McGregor. Stream order and order statistics: Quantile estimation in random-order streams. SIAM J. Comput., 38(5):2044–2059, 2009.
  • [GO16] Venkatesan Guruswami and Krzysztof Onak. Superlinear lower bounds for multipass graph processing. Algorithmica, 76(3):654–683, November 2016.
  • [Has86] John Hastad. Almost optimal lower bounds for small depth circuits. In Proc. 18th Annual ACM Symposium on the Theory of Computing, pages 6–20, 1986.
  • [HHL18] Hamed Hatami, Kaave Hosseini, and Shachar Lovett. Structure of protocols for XOR functions. SIAM J. Comput., 47(1):208–217, 2018.
  • [HLY19] Kaave Hosseini, Shachar Lovett, and Grigory Yaroslavtsev. Optimality of linear sketching under modular updates. In Amir Shpilka, editor, Proc. 34th Annual IEEE Conference on Computational Complexity, volume 137 of LIPIcs, pages 13:1–13:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.
  • [JST11] Hossein Jowhari, Mert Saglam, and Gábor Tardos. Tight bounds for lpl_{p} samplers, finding duplicates in streams, and related problems. In Proc. 30th ACM Symposium on Principles of Database Systems, pages 49–58, 2011.
  • [KLM19] Daniel M. Kane, Shachar Lovett, and Shay Moran. Near-optimal linear decision trees for k-sum and related problems. J. ACM, 66(3):16:1–16:18, 2019.
  • [KMSY18] Sampath Kannan, Elchanan Mossel, Swagato Sanyal, and Grigory Yaroslavtsev. Linear sketching over f_2. In Rocco A. Servedio, editor, Proc. 33rd Annual IEEE Conference on Computational Complexity, volume 102 of LIPIcs, pages 8:1–8:37. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018.
  • [KNP+17] Michael Kapralov, Jelani Nelson, Jakub Pachocki, Zhengyu Wang, David P. Woodruff, and Mobin Yahyazadeh. Optimal lower bounds for universal relation, and for samplers and finding duplicates in streams. In Proc. 58th Annual IEEE Symposium on Foundations of Computer Science, pages 475–486, 2017.
  • [KNW10] Daniel M. Kane, Jelani Nelson, and David P. Woodruff. An optimal algorithm for the distinct elements problem. In Proc. 29th ACM Symposium on Principles of Database Systems, pages 41–52, 2010.
  • [KP19] Akshay Kamath and Eric Price. Adaptive sparse recovery with limited adaptivity. In Proc. 30th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2729–2744, 2019.
  • [KUW88] Richard M. Karp, Eli Upfal, and Avi Wigderson. The complexity of parallel search. J. Comput. Syst. Sci., 36(2):225–253, 1988.
  • [KW90] Mauricio Karchmer and Avi Wigderson. Monotone circuits for connectivity require super-logarithmic depth. SIAM J. Disc. Math., 3(2):255–265, 1990. Preliminary version in Proc. 20th Annual ACM Symposium on the Theory of Computing, pages 539–550, 1988.
  • [LNNW95] László Lovász, Moni Naor, Ilan Newman, and Avi Wigderson. Search problems in the decision tree model. SIAM J. Disc. Math., 8(1):119–132, 1995.
  • [LNW14] Yi Li, Huy L. Nguyen, and David P. Woodruff. Turnstile streaming algorithms might as well be linear sketches. In Proc. 46th Annual ACM Symposium on the Theory of Computing, pages 174–183, 2014.
  • [LPY16] Mingmou Liu, Xiaoyin Pan, and Yitong Yin. Randomized approximate nearest neighbor search with limited adaptivity. In Proc. 28th ACM Symposium on Parallelism in Algorithms and Architectures, page 23–33, 2016.
  • [MK13] Gianluca De Marco and Dariusz R. Kowalski. Searching for a subset of counterfeit coins: Randomization vs determinism and adaptiveness vs non-adaptiveness. Rand. Struct. Alg., 42(1):97–109, 2013.
  • [MNSW98] Peter Bro Miltersen, Noam Nisan, Shmuel Safra, and Avi Wigderson. On data structures and asymmetric communication complexity. J. Comput. Syst. Sci., 57(1):37–49, 1998. Preliminary version in Proc. 27th Annual ACM Symposium on the Theory of Computing, pages 103–111, 1995.
  • [Mut05] S. Muthukrishnan. Data streams: Algorithms and applications. Found. Trends Theor. Comput. Sci., 1(2):117–236, 2005.
  • [Nis21] Noam Nisan. The demand query model for bipartite matching. In Proc. 32nd Annual ACM-SIAM Symposium on Discrete Algorithms, pages 592––599, 2021.
  • [NW93] Noam Nisan and Avi Wigderson. Rounds in communication complexity revisited. SICOMP, 22(1):211–219, 1993. Preliminary version in Proc. 23rd Annu. ACM Symp. Theory Comput., pages 419–429, 1991.
  • [NY19] Jelani Nelson and Huacheng Yu. Optimal lower bounds for distributed and streaming spanning forest computation. In Proc. 30th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1844–1860, 2019.
  • [Ols69] John Olson. A combinatorial problem on finite abelian groups, i. Journal of Number Theory, 1:8–10, 1969.
  • [PT07] Mihai Pǎtraşcu and Mikkel Thorup. Randomization does not help searching predecessors. In Proc. 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 555–564, 2007.
  • [PW13] Eric Price and David P. Woodruff. Lower bounds for adaptive sparse recovery. In Proc. 24th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 652–663, 2013.
  • [Rao20] Anup Rao. Coding for sunflowers. Discrete Analysis, (2), 2020.
  • [Ros15] Benjamin Rossman. The average sensitivity of bounded-depth formulas. In Proc. 56th Annual IEEE Symposium on Foundations of Computer Science, pages 424–430, 2015.
  • [Sen03] Pranab Sen. Lower bounds for predecessor searching in the cell probe model. In Proc. 18th Annual IEEE Conference on Computational Complexity, pages 73–83, 2003.
  • [Tao20] Terry Tao. The sunflower lemma via Shannon entropy, 2020. Blog post; available online at https://terrytao.wordpress.com/2020/07/20/the-sunflower-lemma-via-shannon-entropy/.
  • [TZ97] Gábor Tardos and Uri Zwick. The communication complexity of the universal relation. In Proc. 12th Annual IEEE Conference on Computational Complexity, pages 247–259, 1997.
  • [vEBK69] P. van Emde Boas and Kruyswijk. A combinatorial problem on finite abelian groups III. Technical Report ZW 8, Mathematisch Centrum, Amsterdam, The Netherlands, 1969.
  • [Woo14] David P. Woodruff. Sketching as a tool for numerical linear algebra. Found. Trends Theor. Comput. Sci., 10(1-2):1–157, 2014.
  • [ZS10] Zhiqiang Zhang and Yaoyun Shi. On the parity complexity measures of boolean functions. Theor. Comput. Sci., 411(26-28):2612–2618, 2010.