Robust Lower Bounds for Graph Problems in the Blackboard Model of Communication
Abstract.
We give lower bounds on the communication complexity of graph problems in the multi-party blackboard model. In this model, the edges of an n-vertex input graph are partitioned among k parties, who communicate solely by writing messages on a shared blackboard that is visible to every party. We show that any non-trivial graph problem on n-vertex graphs has blackboard communication complexity Ω(n) bits, even if the edges of the input graph are randomly assigned to the parties. We say that a graph problem is non-trivial if the output cannot be computed in a model where every party holds at most one edge and no communication is allowed. Our lower bound thus holds for essentially all key graph problems relevant to distributed computing, including Maximal Independent Set (MIS), Maximal Matching, (Δ+1)-coloring, and Dominating Set. In many cases, e.g., MIS, Maximal Matching, and (Δ+1)-coloring, our lower bounds are optimal, up to poly-logarithmic factors.
1. Introduction
The multi-party blackboard model of communication is a well-established model of distributed computation (Drucker et al., 2012; Phillips et al., 2012; Braverman and Oshman, 2015; Vempala et al., 2020). In this model, the input is split among k parties, who collectively need to solve a joint problem by communicating with each other. However, as opposed to point-to-point message passing models where pairs of parties have private communication links, the parties communicate solely via a shared blackboard that is visible to all parties. The objective is to design communication protocols with minimal communication complexity, i.e., protocols that instruct the parties to write as few bits onto the shared blackboard as possible. The model is asynchronous, and the order in which the parties write onto the blackboard is solely determined by the communication protocol.
In this paper, we initiate the study of fundamental graph problems in the blackboard model. Given an input graph G = (V, E) with |V| = n, we consider an edge-partition model, where the edges of G are partitioned among the k parties either adversarially or uniformly at random. Our main result is a lower bound that applies to a large class of graph problems and that is tight (up to poly-logarithmic factors) in many cases:
Theorem 1 (simplified).
Every non-trivial graph problem has (randomized) blackboard communication complexity of Ω(n) bits, even if the edges of the input graph are partitioned randomly among the parties.
Informally, a problem is non-trivial if it does not admit a deterministic protocol where every party holds at most one edge of the input graph and no communication occurs at all (see Section 3 for a precise definition). Intuitively, the knowledge of at most one edge is not enough for the parties to solve any interesting graph problem, and, as proved in this paper, most graph problems, including Minimum Vertex Cover, Minimum Dominating Set, Maximal/Maximum Matching, Maximal Independent Set, and (Δ+1)-coloring, are therefore non-trivial.
In many cases, our lower bound is tight up to poly-logarithmic factors, even if the edge partitioning is adversarial. For Maximal Matching, obtaining a blackboard communication protocol with communication cost O(n log n) is trivial. For others, such as Maximal Independent Set and (Δ+1)-coloring, protocols with communication cost Õ(n) can be derived from known data streaming algorithms. We refer the reader to Table 1 for an overview of upper bounds that match our lower bounds up to poly-logarithmic factors. These upper bounds will be discussed in Section 4.
Problem | CC Upper Bound | Reference
---|---|---
Maximal Matching | O(n log n) | trivial
(1+ε)-approx. Maximum Matching (general graphs) | Õ_ε(n) | McGregor (McGregor, 2005)
(1+ε)-approx. Maximum Matching (bipartite graphs) | Õ_ε(n) | Assadi et al. (Sepehr Assadi, 2021)
Maximal Independent Set | Õ(n) | Ghaffari et al. (Ghaffari et al., 2018)
(Δ+1)-coloring | Õ(n) | Assadi et al. (Assadi et al., 2019)

(Õ hides poly-logarithmic factors in n; the subscript ε indicates an additional dependency on ε.)
1.1. Related Work
Multi-party Communication Models
Yao introduced the two-party communication complexity framework in 1979 (Yao, 1979), which was extended to multiple parties by Chandra et al. a few years later (Chandra et al., 1983). The various multi-party communication complexity models known today can be categorized by 1) how the input data is partitioned among the parties; and 2) how the parties communicate with each other. Regarding 1), in the number-on-the-forehead model, every party knows the inputs of all other parties except their own, which was also the model considered by Chandra et al. This stands in contrast to the number-in-hand model, where every party knows only their own input, which is the model studied in this paper. Regarding 2), in the message passing model, every pair of parties has access to a private communication link that is not visible to other parties. This model is similar to the coordinator model, where every party solely communicates with an additional party referred to as the coordinator. The two models are identical up to a multiplicative factor that is logarithmic in the number of parties (see, for example, (Phillips et al., 2012)). The blackboard model can be regarded as a variant of the coordinator model, where every message received by the coordinator is immediately visible to every other party. Writing on the blackboard can also be regarded as a message broadcast.
Multi-party Communication Complexity of Graph Problems
Various graph problems have been studied in the multi-party number-in-hand communication settings. Typically, the edges of the input graph are distributed among the participating parties, either with or without duplication, with the without duplication setting usually being easier to handle. In the message passing model, tight bounds are known for testing bipartiteness, cycle-freeness, connectivity, and degree computations (Woodruff and Zhang, 2017), approximate maximum matching and approximate maximum flow (Huang et al., 2020), and spanner computations (Fernandez et al., 2020). In contrast, graph problems have not been considered in the blackboard model to the best of our knowledge. However, the Maximum Matching problem has been considered in the simultaneous message model, which can be seen as a variant of the blackboard model, where all parties simultaneously send a single message to a coordinator who then outputs the result (Dobzinski et al., 2014; Alon et al., 2015; Konrad, 2015; Assadi et al., 2016b).
Further Results in the Blackboard Model
While graph problems have not yet been studied in the blackboard model, the model has nevertheless attracted significant attention in recent years. For example, tight bounds are known for Set-Disjointness (Braverman and Oshman, 2015), bit-wise XOR and logical AND computations (Phillips et al., 2012), optimization problems such as solving linear systems (Vempala et al., 2020), and distributed task allocation problems (Drucker et al., 2012). We point out that the blackboard model also abstracts aspects of real-world graph processing systems with shared memory such as (Low et al., 2012) and, as observed in (Braverman and Oshman, 2015), models single-hop wireless ad hoc networks where nodes communicate via broadcast.
1.2. Techniques
Consider a graph problem P and the uniform distribution μ over all n_0-vertex graphs, for some constant n_0. We call a graph chosen from μ a gadget. Our hard input graph distribution is the product distribution μ^t with t = n/n_0, i.e., the distribution over graphs consisting of t disjoint and independent gadgets chosen from μ.

Let π be a communication protocol for problem P. We measure the amount of information about the edges of the input graph that is necessarily revealed by the transcript of π, i.e., by the bits written on the blackboard. This quantity is usually referred to as the external information cost of π (see (Braverman, 2017) for an excellent exposition) and constitutes a lower bound on the communication cost of π.

At the core of our lower bound argument lies a direct sum argument. Towards a contradiction, assume that the external information cost of π is at most c·n, for a sufficiently small constant c > 0. We then prove that π can be used to obtain a protocol π′ with external information cost at most c·n/t = c·n_0 that solves P on a single n_0-vertex gadget, i.e., on inputs chosen from μ. We further argue that, since this quantity is a very small constant, the transcript of π′ cannot depend much on the actual input of the problem. We then give a compression lemma, showing that there is a protocol that avoids communication altogether, while only marginally increasing the probability of error. Finally, to show that such a “silent” protocol cannot exist, we employ Yao’s lemma, which allows us to focus on deterministic protocols with distributional error, and then give simple problem-specific combinatorial arguments.
1.3. Outline
In Section 2, we define the model and provide the necessary context on information theory. Next, we present our main result, our lower bound for non-trivial graph problems in the blackboard model, in Section 3. In Section 4, we summarize how known algorithms can be adapted to yield optimal (up to poly-logarithmic factors) communication protocols in the blackboard model. Last, we conclude in Section 5.
2. Preliminaries and Computing Models
2.1. The Blackboard Model
In the (shared) blackboard model, denoted by BB(k), we have k parties that communicate by writing messages on a shared blackboard that is visible to all parties. The way the parties interact, and, in particular, the order in which the parties write onto the blackboard, is solely specified by the given communication protocol (and the current state of the blackboard); the model is therefore asynchronous. All parties have access to both private and public random coins.

In this paper, we study graph problems in the blackboard model. Given an input graph G = (V, E) with |V| = n, we consider an edge-partition model without duplications, where the edges of G are partitioned among the k parties according to a partition function ρ that maps the set of all pairs of vertices of G to the parties [k] (since we consider undirected graphs, we assume that ρ(u, v) = ρ(v, u), for every u, v ∈ V). This function does not reveal the set of edges of G itself; however, it reveals that if an edge {u, v} is contained in the input graph, then it is located at party ρ(u, v). For our lower bounds, we assume that ρ is chosen uniformly at random from the set of all symmetric mappings from V × V to [k], and that ρ is known by all parties. For our upper bounds, we assume that ρ is chosen adversarially and is not given to the parties. Instead, each party i then only knows the edges that are assigned to it at the beginning of the algorithm, i.e., the set of edges E_i = {e ∈ E : ρ(e) = i}.

For a given problem P, we say that a communication protocol has error ε if, for any input graph, the probability that the joint output of the parties is a correct solution for P is at least 1 − ε. This probability is taken over all private and public coins, and the random selection of the partition function ρ. (We adapt this definition accordingly when considering adversarially chosen partition functions for the upper bounds in Section 4.) We consider the global output of a protocol as the union of the local outputs of the parties. For example, for MIS, every party is required to output an independent set so that the union of the parties’ outputs constitutes an MIS in the input graph. We say that a protocol is silent if none of the parties write on the blackboard.

The transcript of a protocol π, denoted by Π, is the entirety of the bits written onto the blackboard. The communication cost of a protocol π is the maximum length of Π in an execution of π, which we denote by |Π|. Then, the randomized ε-error blackboard communication complexity of a problem P is the minimum cost of a protocol that solves P with error at most ε.
2.2. Tools From Information Theory
Let X be a random variable distributed according to a distribution D. We denote the (Shannon) entropy of X by H_D(X), where the index D may be dropped if the distribution is clear from the context. The mutual information of two random variables X and Y distributed according to D is denoted by I_D(X ; Y) = H_D(X) − H_D(X | Y) (again, D may be dropped), where H_D(X | Y) is the entropy of X conditioned on Y. We frequently use the shorthand x for the event X = x, for a random variable X, and we write E_x[·] to denote the expected value operator where the expectation is taken over the range of X. The conditional mutual information of X and Y conditioned on a random variable Z is defined as I(X ; Y | Z) = E_z[I(X ; Y | Z = z)].
For a more detailed introduction to information theory, we refer the reader to (Cover and Thomas, 2006).
We will use the following properties of mutual information:
(1) Chain rule for mutual information: I(X, Y ; Z) = I(X ; Z) + I(Y ; Z | X).

(2) Independent events: For random variables X, Y and an event A that is independent of the joint distribution of (X, Y), it holds that I(X ; Y | A) = I(X ; Y).

(3) Conditioning on an independent variable (see, e.g., Claim 2.3 in (Assadi et al., 2016a) for a proof): Let X, Y, Z, and W be jointly distributed random variables such that X and W are independent conditioned on Z. Then, I(X ; Y | Z) ≤ I(X ; Y | Z, W).
Our lower bound proof relies on Pinsker’s Inequality, which relates the total variation distance to the Kullback-Leibler divergence of two distributions:
Definition 1 (Total Variation Distance, Kullback–Leibler Divergence).

Let Ω be a finite domain and consider two probability distributions μ and ν over Ω. The total variation distance between μ and ν is defined as

δ(μ, ν) = max_{A ⊆ Ω} |μ(A) − ν(A)| = (1/2) · Σ_{x ∈ Ω} |μ(x) − ν(x)|,

and the Kullback–Leibler divergence from μ to ν (measured in bits) is defined as

D_KL(μ ‖ ν) = Σ_{x ∈ Ω} μ(x) · log₂(μ(x) / ν(x)).
Theorem 2 (Pinsker’s Inequality (Pinsker, 1964)).

For any two probability distributions μ and ν over a common finite domain, it holds that

δ(μ, ν) ≤ √((ln 2 / 2) · D_KL(μ ‖ ν)).
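As a quick numerical sanity check of this bit-version of the inequality, the following sketch (our own illustration) compares the total variation distance of two Bernoulli distributions against the Pinsker bound:

```python
import math

def tv(p, q):
    """Total variation distance between two pmfs on the same finite domain."""
    return 0.5 * sum(abs(p[x] - q[x]) for x in p)

def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q), measured in bits."""
    return sum(p[x] * math.log2(p[x] / q[x]) for x in p if p[x] > 0)

# Pinsker (bit version): tv(p, q) <= sqrt((ln 2 / 2) * kl_bits(p, q)).
p = {0: 0.2, 1: 0.8}   # Bernoulli(0.8)
q = {0: 0.5, 1: 0.5}   # Bernoulli(0.5)
delta = tv(p, q)
bound = math.sqrt(math.log(2) / 2 * kl_bits(p, q))
print(delta, bound)
```

Here δ = 0.3 while the bound evaluates to roughly 0.31, so the inequality is nearly tight for this pair.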
3. Lower Bound
3.1. Input Distributions
In our input distributions, we assume that the vertex set V of our input graph is fixed and known to all parties. Furthermore, w.l.o.g., we assume that V = [n]. (For an integer q, we write [q] to denote the set {1, 2, …, q}.) We will now define a distribution D_n of partition functions that map the set of potential edges to the k parties, where the subscript denotes the cardinality of the set of vertices, and two distributions μ_{n_0} and μ_n over the edge set of our input graph. Since we need to specify both the partition function and the input graph as the input to a protocol in the BB(k) model, the relevant distributions are therefore the product distributions D_{n_0} × μ_{n_0} and D_n × μ_n.
Partition Function
We denote by D_n the uniform distribution over all symmetric functions mapping [n] × [n] to [k], and we denote by ρ ∼ D_n the partition function associated with our input. In this section, we assume that ρ is known to all parties.
Input Graph
We consider two input distributions, which we call the single-gadget and the multi-gadget distributions. For a given gadget size n_0, let μ_{n_0} be the probability distribution over n_0-vertex graphs where we include each potential edge independently with probability 1/2 (i.e., the uniform distribution over all n_0-vertex graphs). We call the resulting graph a gadget and say that μ_{n_0} is a single-gadget distribution. We define the multi-gadget distribution μ_n to be the probability distribution over n-vertex graphs where we sample t = n/n_0 gadgets independently from μ_{n_0}. More precisely, we assume that gadget i is induced by the vertices {(i−1)·n_0 + 1, …, i·n_0}, and the edges interconnecting the vertices of gadget i are distributed according to μ_{n_0}. For simplicity, we assume that n/n_0 is an integer. To differentiate between the two distributions, we always denote the number of vertices of the input graph when chosen from the single-gadget distribution by n_0, and when chosen from the multi-gadget distribution by n.
3.2. Non-trivial Graph Problems
The lower bounds proved in this paper hold for non-trivial problems, which are defined by means of good partitions:
Definition 1 (Good Partition).

Consider a single-gadget partition function ρ: [n_0] × [n_0] → [k]. We say that ρ is good if every party receives at most one potential edge under ρ.

Observe that this is only possible if the number of parties is at least the number of potential edges, i.e., if k ≥ binom(n_0, 2). In our setting, n_0 is a small constant and k is sufficiently large. Almost all partitions are therefore good.
Definition 2 (Non-trivial Problem).

We say that a graph problem is non-trivial if there exists a natural number n_0 such that, for every good partition ρ in the support of D_{n_0}, there is no deterministic silent BB(k) model protocol that solves the problem on every instance (G, ρ), where G is an n_0-vertex graph.
For the parties to contribute to the global output of the protocol while only seeing at most one edge and without communication is impossible for most problems. The class of non-trivial problems is therefore large. In Section 3.8, we give formal proofs demonstrating that many key problems considered in distributed computing are non-trivial.
3.3. Information Cost
We now define the (external) information cost of a protocol in the BB(k) model:

Definition 3 ((External) Information Cost).

Let π be an ε-error protocol in the BB(k) model for a problem P, and denote by E_G the edge set of the input graph. We define

ICost_{D_n × μ_n}(π) = I(E_G ; Π | R, ρ),

where R denotes the public random coins used by π and ρ is the partition function.

Informally, ICost_{D_n × μ_n}(π) is the amount of information revealed about the input edge set by the transcript, given that the public randomness and the partition function are known. We define ICost_{D_{n_0} × μ_{n_0}}(π) in a similar way.
It is well-known that the external information cost of a protocol is a lower bound on its communication cost. For completeness, we give a proof in the following lemma:
Lemma 4.

CC(P) ≥ min_π ICost_{D_n × μ_n}(π), where π ranges over all ε-error randomized protocols for problem P and CC(P) denotes the randomized ε-error blackboard communication complexity of P.

Proof.

For any ε-error protocol π, we have ICost_{D_n × μ_n}(π) = I(E_G ; Π | R, ρ) ≤ H(Π) ≤ |Π|, where the second last step follows from Theorem 6.1 in (Rao and Yehudayoff, 2020). ∎
3.4. Direct Sum Argument
In this section, we show that the information cost of a protocol for the multi-gadget case is lower-bounded by the information cost of a protocol for the single-gadget case, multiplied by the number of gadgets:
Theorem 5.

Let π be an ε-error randomized protocol over inputs chosen from D_n × μ_n. Then, there exists an ε-error protocol π′ over inputs chosen from D_{n_0} × μ_{n_0} such that:

ICost_{D_{n_0} × μ_{n_0}}(π′) ≤ ICost_{D_n × μ_n}(π) / t, where t = n/n_0. (1)
Proof.
Given π, we construct protocol π′ as follows. Let (G′, ρ′) denote the input to π′. The parties construct an input (G, ρ) for π from (G′, ρ′), as follows:

(1) The parties use public randomness to sample a uniform random index j ∈ [t].

(2) The parties use public randomness to sample a partition ρ from D_n, conditioned on the partition of gadget j being equivalent to ρ′.

(3) The parties set the single input gadget G′ to be gadget j in the multi-gadget input G.

(4) Given ρ, the parties know which potential edges of the gadgets different to gadget j are hosted locally. They then use private randomness to sample the existence of these individual edges (each potential edge is present independently with probability 1/2).

(5) The parties run protocol π on input (G, ρ) and return the part of the output that concerns gadget j.
Observe that the random variables (j, ρ, R) are together identical to R′, where R′ denotes the public coins used by protocol π′. Observe further that the constructed input (G, ρ) is distributed according to D_n × μ_n.
Denote by E_i the edges of gadget i of the multi-gadget input. By the chain rule for mutual information, I(E_G ; Π | R, ρ) = Σ_{i=1}^{t} I(E_i ; Π | R, ρ, E_1, …, E_{i−1}), and since the edge sets of different gadgets are independent conditioned on (R, ρ), property (3) of mutual information yields I(E_i ; Π | R, ρ) ≤ I(E_i ; Π | R, ρ, E_1, …, E_{i−1}) for every i. Averaging over the uniformly chosen index j, we obtain ICost_{D_{n_0} × μ_{n_0}}(π′) = (1/t) · Σ_{i=1}^{t} I(E_i ; Π | R, ρ) ≤ ICost_{D_n × μ_n}(π) / t, which proves Equation (1). ∎
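The construction in steps (1)-(5) can be sketched in code. The toy snippet below (illustrative names; it only tracks edge sets and omits the partition resampling of step (2)) plants the single-gadget input at a random index and fills the remaining gadgets with private randomness:

```python
import random

# A toy sketch of steps (1)-(5) above, tracking edge sets only (the partition
# resampling of step (2) is omitted). All names are illustrative.
def embed(single_gadget_edges, t, n0, rng):
    j = rng.randrange(t)                  # step (1): public random gadget index
    gadgets = []
    for g in range(t):
        if g == j:                        # step (3): plant the real input
            gadgets.append(set(single_gadget_edges))
        else:                             # step (4): sample the other gadgets,
            gadgets.append({(u, v)        # each potential edge w.p. 1/2
                            for u in range(n0) for v in range(u + 1, n0)
                            if rng.random() < 0.5})
    return j, gadgets                     # step (5) would run pi on the result

j, gadgets = embed({(0, 1)}, t=5, n0=3, rng=random.Random(0))
print(j, gadgets[j])
```

The key point mirrored here is that the planted gadget is indistinguishable from the others, so the constructed input is distributed like a genuine multi-gadget input.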
3.5. Zero Communication Protocol
Next, we show that if a protocol solves single-gadget instances with small information cost, then there exists a silent protocol (i.e., a protocol that avoids all communication) with only slightly increased error.
Theorem 6.

For a problem P, let π be an ε-error randomized protocol in the BB(k) model with

ICost_{D_{n_0} × μ_{n_0}}(π) ≤ c. (4)

Then, there exists a silent protocol in the BB(k) model with error at most ε + 2c^{1/4} for problem P.
Proof.
Consider the protocol π as stated in the premise of the theorem. As a first step, we leverage the assumption that the information cost of π is small to show that the messages written by the parties are close to being independent of the edges of the sampled gadget.
Lemma 7.

Suppose we sample the edges E_G of a gadget G ∼ μ_{n_0} and a partition function ρ ∼ D_{n_0}. Let R be the public randomness used by π. Let P₁ denote the probability distribution of the transcript Π conditioned on E_G, R, and ρ, and define P₂ similarly, but without conditioning on E_G. Then, with probability at least 1 − c^{1/4} over the choice of E_G, R, and ρ, it holds that

δ(P₁, P₂) ≤ c^{1/4}. (5)
Proof.
We will derive an upper bound on the expected total variation distance, where the expectation ranges over the input gadget E_G, the public random string R, and the partition function ρ. By applying Pinsker’s inequality (see Theorem 2) and taking expectations on both sides, we get

E[δ(P₁, P₂)] ≤ E[√((ln 2 / 2) · D_KL(P₁ ‖ P₂))].

From Jensen’s inequality for concave functions and the fact that the square root is concave, we get

E[δ(P₁, P₂)] ≤ √((ln 2 / 2) · E[D_KL(P₁ ‖ P₂)]).

By using the definition of the Kullback–Leibler divergence (see Definition 1), expanded over all possible transcripts of π, we obtain that

E[D_KL(P₁ ‖ P₂)] = I(E_G ; Π | R, ρ) ≤ c, (6)

where we used the definition of conditional mutual information in (6), and the upper bound (4) in the last step. Hence, E[δ(P₁, P₂)] ≤ √((ln 2 / 2) · c) ≤ √c. To translate the upper bound on the expected value to an upper bound on the probability that the total variation distance is large, we apply Markov’s inequality. That is, if we sample E_G, R, and ρ from the respective distributions stated in the premise of the lemma, then it holds that

Pr[δ(P₁, P₂) > c^{1/4}] ≤ E[δ(P₁, P₂)] / c^{1/4} ≤ √c / c^{1/4} = c^{1/4},

and the lemma follows. ∎
Equipped with Lemma 7, we are now ready to prove Theorem 6. The parties use the even bits of their public randomness to sample R, which is the public randomness of protocol π, yielding a new protocol that does not use public randomness but otherwise behaves the same way as π given R. Then, the parties use the odd bits of their public randomness to sample a transcript Π′ from the distribution P₂ (which does not depend on the input edges and is therefore known to all parties). After these two steps, all parties know R and Π′. Finally, each party computes its output by applying the state transition function of π to Π′, R, its input assignment, and its private random bits. Thus, we have obtained a silent protocol π_silent.

To prove the theorem, we need to obtain an upper bound on the error probability of π_silent. Clearly, π_silent fails in all instances where π fails. In fact, the only difference between computing the output of π_silent compared to π is that the transcript used in π_silent is sampled from distribution P₂, whereas the one used in π is sampled from P₁. Thus, the only additional error of protocol π_silent is determined by the difference in the probability mass assigned to each transcript between the distributions P₁ and P₂, i.e., by δ(P₁, P₂). Let A be the event that δ(P₁, P₂) ≤ c^{1/4}; by Lemma 7, event A occurs with probability at least 1 − c^{1/4}. To upper-bound the error of π_silent, we condition on whether event A holds: conditioned on A, the additional error is at most δ(P₁, P₂) ≤ c^{1/4}, and the complement of A occurs with probability at most c^{1/4}. Overall, the error of π_silent is thus at most ε + 2c^{1/4}, which completes the proof of Theorem 6. ∎
3.6. From Randomized to Deterministic Protocols
In this section, we show that the existence of a randomized silent protocol for single-gadget instances with a sufficiently small error implies the existence of a deterministic silent protocol for single-gadget instances that solves the problem on every input graph under some good partition.

First, we argue that the probability that the partition function is good is at least 1/2:

Lemma 8.

Let n_0 ≥ 2 and k ≥ 2 · binom(n_0, 2)². Consider a single-gadget input on n_0 vertices. Then, the probability that the partition function ρ ∼ D_{n_0} is good is at least 1/2.
Proof.
There are m = binom(n_0, 2) vertex pairs. The probability that they are all assigned to different parties is at least:

∏_{i=0}^{m−1} (1 − i/k) ≥ (1 − m/k)^m ≥ 4^{−m²/k} ≥ 4^{−1/2} = 1/2,

where in the penultimate inequality we used the fact that 1 − x ≥ 4^{−x} for any x ∈ [0, 1/2], and the last inequality follows from k ≥ 2m². ∎
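The birthday-style product bound in this proof is easy to evaluate numerically. The following sketch (our own illustration) checks it for a gadget on 4 vertices, i.e., m = 6 potential edges, with k = 2m² parties:

```python
import math

# Probability that all m = C(n0, 2) potential edges of a gadget land at
# distinct parties, when each pair is assigned independently and uniformly
# among k parties: prod_{i=0}^{m-1} (1 - i/k).
n0 = 4
m = math.comb(n0, 2)          # 6 potential edges
k = 2 * m * m                 # 72 parties

prob_good = 1.0
for i in range(m):
    prob_good *= 1 - i / k

print(prob_good)              # ~0.807, comfortably above 1/2
```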
We now state and prove the main result of the section.
Lemma 9.

Let n_0 ≥ 2, k ≥ 2 · binom(n_0, 2)², and ε < 2^{−binom(n_0, 2) − 1}. Let π be a randomized silent protocol with error at most ε (where the error is over the random partition and the random coin flips). Then, there exists a good partition ρ and a deterministic silent protocol that succeeds on every input (G, ρ), where G is an n_0-vertex graph.
Proof.
First, the assumption of the lemma together with Yao’s minimax principle (Yao, 1977) imply that there exists a deterministic silent protocol π_det with distributional error at most ε over the input distribution D_{n_0} × μ_{n_0}.

Now, for a fixed partition function ρ, observe that if π_det is not correct on all 2^{binom(n_0, 2)} possible n_0-vertex input graphs, then its error conditioned on ρ is at least 2^{−binom(n_0, 2)}, since every such graph has probability 2^{−binom(n_0, 2)} under μ_{n_0}. For the sake of a contradiction, suppose that there is no good partition ρ such that π_det succeeds on all inputs (G, ρ). Recall further that at least half of all partition functions are good (Lemma 8). Then, the distributional error of π_det is at least (1/2) · 2^{−binom(n_0, 2)} > ε, contradicting the assumption that the error of π_det is at most ε. ∎
3.7. Main Lower Bound Theorem
Theorem 1. Let P be a non-trivial graph problem. Then, there exist constants ε₀ > 0 and k₀ such that, for every ε ≤ ε₀ and k ≥ k₀, the randomized ε-error blackboard communication complexity of P is Ω(n), even if the edges of the input graph are partitioned randomly among the parties.
Proof.
Let n_0 be the minimum integer such that, for every good partition ρ in the support of D_{n_0}, there is no deterministic silent BB(k) model protocol that solves P on every instance (G, ρ), where G is an n_0-vertex graph (such an n_0 exists since P is non-trivial). Let k₀ = 2 · binom(n_0, 2)², and let ε₀ be a sufficiently small constant, chosen so that Lemma 9 remains applicable after the error increase incurred in Theorem 6.
3.8. Lower Bounds for Specific Problems
In this section, we establish the non-triviality of several fundamental graph problems. We point out that lower bounds are known for all of these problems in other distributed computing models, such as the LOCAL model (see (Balliu et al., 2019; Kuhn et al., 2016)). However, these lower bounds do not directly translate to our setting since, in the BB(k) model, a party who receives an edge as input does not necessarily need to produce an output for this particular edge or its vertices.
Lemma 10.
Maximal Matching is a non-trivial problem.
Proof.
For the sake of a contradiction, suppose that for every natural number n_0 there exist a good partition ρ from the support of D_{n_0} and a deterministic silent protocol that solves Maximal Matching on every instance (G, ρ), where G is an n_0-vertex graph. Let n_0 = 3. First, observe that if the input graph is the empty graph, then no party can output an edge. This implies that if a party does not receive an edge, then its output needs to be empty. Suppose next that the input graph consists of a single edge. Then, the party that hosts this edge needs to output this edge, since otherwise the computed matching would not be maximal. This implies further that any party that holds an edge needs to output this edge. However, if two parties hold edges that share an endpoint, then the output would not be a matching, a contradiction. ∎
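For n_0 = 3, the argument above can also be confirmed by exhaustive search. The sketch below (an illustrative encoding of silent protocols, not from the paper) enumerates all deterministic strategies in which each of three parties holds one potential edge of K_3 and maps the presence of its edge to a set of output edges:

```python
from itertools import combinations, product

# Exhaustive confirmation of the argument above for n_0 = 3: under a good
# partition, each of three parties holds one potential edge of K_3, and a
# deterministic silent protocol is a map from "is my edge present?" to a set
# of output edges.
PAIRS = list(combinations(range(3), 2))       # the 3 potential edges of K_3

def is_maximal_matching(output, graph):
    if not output <= graph:                   # may only output present edges
        return False
    verts = [v for e in output for v in e]
    if len(verts) != len(set(verts)):         # output must be a matching
        return False
    # maximality: no input edge may have both endpoints unmatched
    return all(u in verts or v in verts for (u, v) in graph)

subsets = [frozenset(s) for r in range(4) for s in combinations(PAIRS, r)]
graphs = subsets                              # all 8 graphs on 3 vertices
strategies = list(product(subsets, repeat=2)) # (output if absent, if present)

def correct_profile_exists():
    for profile in product(strategies, repeat=3):
        if all(is_maximal_matching(
                frozenset().union(*(profile[i][PAIRS[i] in g] for i in range(3))),
                g) for g in graphs):
            return True
    return False

found = correct_profile_exists()
print(found)  # False: no silent deterministic protocol succeeds on all inputs
```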
Lemma 11.

Maximal Independent Set, Minimum Dominating Set, (Δ+1)-coloring, and Minimum Vertex Cover are non-trivial problems.
Proof.
Maximal Independent Set and Minimum Dominating Set. For the sake of a contradiction, suppose that for every natural number n_0 there exist a good partition ρ from the support of D_{n_0} and a deterministic silent protocol that solves Maximal Independent Set (resp. Minimum Dominating Set) on every instance (G, ρ), where G is an n_0-vertex graph. Let n_0 = 4. First, observe that if the input graph is the empty graph, all vertices must be output. This implies that there is a function out: V → [k] such that, for every v ∈ V, if party out(v) hosts no edges, then its output contains vertex v. Let a, b, c, d denote the four vertices of the input graph. There are six different pairs of distinct vertices from {a, b, c, d}, and, as ρ is good, these pairs are assigned to six different parties; hence, at least one of these pairs, say {u, v}, is assigned by the partition to a party p ∉ {out(a), out(b), out(c), out(d)}. But then, on the input graph with the unique edge {u, v}, the parties out(u) and out(v) host no edges, and the output therefore contains both vertices u and v, which is not possible, as the output should be an independent set (resp. a minimum dominating set, which for this graph contains only one endpoint of the edge together with the two isolated vertices).

(Δ+1)-coloring. The proof works similarly to the previous one. Indeed, for the empty input graph, we have Δ = 0, and the output must be a proper coloring with a single color; that is, every vertex is assigned the same color, say color 1. This implies that there is a function out: V → [k] such that, for every v ∈ V, if party out(v) hosts no edges, then its output assigns color 1 to vertex v. Then, following the same arguments as in the previous proof, we conclude that on an input graph with four vertices and a unique edge {u, v}, the output coloring assigns color 1 to both vertices u and v, which is not a proper coloring, a contradiction.

Minimum Vertex Cover. Since for any graph a vertex set is a minimum vertex cover if and only if its complement is a maximum independent set, the existence of a deterministic silent protocol for Minimum Vertex Cover would imply the existence of one for Maximum Independent Set. Furthermore, as any maximum independent set is also a maximal independent set, the existence of such a protocol for Maximum Independent Set would imply the existence of one for Maximal Independent Set, a contradiction. ∎
In the proof of non-triviality of Maximal Matching, we used gadgets of size 3, and in the proofs of non-triviality of all other problems, we used gadgets of size 4. Together with Theorem 1, this implies the following corollary.
Corollary 12.

(1) For any sufficiently large k and sufficiently small constant ε > 0, the randomized ε-error blackboard communication complexity of Maximal Matching is Ω(n).

(2) For any sufficiently large k and sufficiently small constant ε > 0, the randomized ε-error blackboard communication complexity of each of the problems Maximal Independent Set, Minimum Dominating Set, (Δ+1)-coloring, and Minimum Vertex Cover is Ω(n).
4. Upper Bounds
In this section, we assume that the partition function ρ is arbitrary (potentially adversarially chosen), and each party i only knows the edges allocated to it, i.e., the set of edges E_i = {e ∈ E : ρ(e) = i}. Thus, in contrast to our lower bounds, the error probability of a protocol is now only computed over the private and public coin flips.
4.1. Maximal Matching and Maximum Matching Approximation
Computing a maximal matching in the BB(k) model with communication cost O(n log n) is straightforward. The parties can simulate the Greedy matching algorithm, as follows: The blackboard serves as a variable that contains the current matching M. Initially, the blackboard is empty and represents the empty set, i.e., M = ∅. The parties proceed in order. At party i’s turn, party i attempts to add each of its edges e, in any order, to the blackboard if possible, i.e., if M ∪ {e} is still a matching. Since any matching in an n-vertex graph is of size at most n/2, and each edge requires O(log n) bits to be written onto the blackboard (e.g., by indicating the two endpoints of the edge), the communication cost is O(n log n).
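The simulation described above can be sketched as follows (illustrative names; the blackboard is modeled as a shared list):

```python
# A minimal sketch of the Greedy simulation described above: the blackboard
# is a shared list of matched edges, and parties take turns appending any of
# their edges that still fit.
def greedy_matching_blackboard(parties_edges):
    blackboard = []                       # current matching M, visible to all
    matched = set()                       # endpoints covered by M
    for edges in parties_edges:           # parties proceed in a fixed order
        for (u, v) in edges:
            if u not in matched and v not in matched:
                blackboard.append((u, v)) # party writes the edge: O(log n) bits
                matched.update((u, v))
    return blackboard

# Example: a path 0-1-2-3 split between two parties.
m = greedy_matching_blackboard([[(0, 1)], [(1, 2), (2, 3)]])
print(m)  # [(0, 1), (2, 3)]
```

Correctness is immediate: an edge is skipped only when one of its endpoints is already matched, so the final matching is maximal.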
Various algorithms are known for computing a (1+ε)-approximation to Maximum Matching by repeatedly computing maximal matchings in subgraphs of the input graph (e.g., (McGregor, 2005; Ahn and Guha, 2011; Eggert et al., 2012; Konrad, 2018; binti Khalil and Konrad, 2020; Sepehr Assadi, 2021)). In order to implement these algorithms in the BB(k) model, we need to make sure that the parties know which of their edges are contained in each of the respective subgraphs. If this can be achieved with communication cost O(n log n) per subgraph, then a BB(k) model implementation with cost O(C · n log n) can be obtained, where C is the number of matchings computed.

The algorithm by McGregor (McGregor, 2005) follows this scheme and relies on the computation of (1/ε)^{O(1/ε)} maximal matchings in order to establish a (1+ε)-approximate matching in general graphs. Specifying the respective subgraphs with small communication cost is straightforward for this algorithm (details omitted), and we thus obtain a (1+ε)-approximation BB(k) model algorithm with communication cost n log n · (1/ε)^{O(1/ε)}.

The dependency on ε can be improved in the case of bipartite graphs. Assadi et al. (Sepehr Assadi, 2021) very recently gave an algorithm for Maximum Matching in bipartite graphs that solely relies on the computation of poly(1/ε) maximal matchings in subgraphs of the input graph. Specifying the respective subgraphs, however, is slightly more involved, but can be done with communication cost O(n log n) per subgraph. Taking this into account, a (1+ε)-approximation BB(k) model protocol with communication cost O(n log n · poly(1/ε)) can be obtained.
4.2. Maximal Independent Set
We now discuss how the random order Greedy MIS algorithm can be implemented in the BB(k) model with communication cost Õ(n). A similar implementation in the massively parallel computation and the congested clique models was previously discussed by Ghaffari et al. (Ghaffari et al., 2018). The key idea of this implementation, i.e., a residual sparsity property of the random order Greedy algorithm, goes back to the multi-pass correlation clustering streaming algorithm by Ahn et al. (Ahn et al., 2015) and has since been used on multiple occasions (Konrad, 2018; Gamlath et al., 2019).

The random order Greedy MIS algorithm proceeds as follows: First, the algorithm picks a uniform random permutation σ of the vertices, assigning each vertex a rank. Denote by v_i the vertex with rank i. The algorithm processes the vertices in the order specified by σ. When processing vertex v_i, the algorithm attempts to insert v_i into an initially empty independent set I if possible, i.e., if I ∪ {v_i} is an independent set.
To simulate this algorithm in the (k) model, the parties first agree on a uniform random permutation using public randomness. The simulation then proceeds in phases, where in phase Greedy is simulated on vertices , for
Denote by the independent set computed after phase (i.e., after having processed all vertices with ranks smaller than ), and let . In each phase , the parties write the subgraph induced by vertices onto the blackboard, where denotes the neighborhood of . This information, together with , then allows all parties to locally continue the simulation of Greedy on vertices , resulting in independent set . As proved by Ghaffari et al., the subgraph induced by vertices has edges with high probability. Since the parties know from the local simulations of the previous phase, they know which of their edges are contained in subgraph and write all of these edges onto the blackboard.
After $q$ phases, it can be seen that the graph induced by the vertices with rank larger than $p_q$ has $O(n \log n)$ edges with high probability. The parties then write this subgraph onto the blackboard, which allows all parties to complete their local simulations of Greedy.
The communication cost is dominated by the vertex-induced subgraphs written onto the blackboard. Each of these subgraphs contains $O(n \log n)$ edges, and, since there are $O(\log \log n)$ phases, the total communication cost is $\tilde{O}(n)$ bits.
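The phased simulation can be sketched as follows. This is an illustrative Python sketch under our own assumptions, not the paper's verbatim protocol: the `prefixes` argument stands in for the rank thresholds $p_1 < \cdots < p_q$, and we additionally let parties announce vertices they can see are dominated (an extra $O(n \log n)$ bits in total), which is one simple way to keep all local simulations consistent.

```python
import random

def blackboard_greedy_mis(n, k, edges, prefixes, seed=0):
    """Sketch of simulating random-order Greedy MIS among k parties that
    communicate only via a shared blackboard."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)                      # public randomness: permutation pi
    rank = {v: i for i, v in enumerate(order)}
    held = [set() for _ in range(k)]        # random edge partition among k parties
    for e in edges:
        held[rng.randrange(k)].add(tuple(sorted(e)))

    board, dominated, mis = set(), set(), set()
    lo = 0
    for hi in list(prefixes) + [n]:
        # Each party writes its edges whose endpoints are both still
        # undecided and ranked below the current threshold.
        for hs in held:
            for (u, v) in hs:
                if (u not in dominated and v not in dominated
                        and u not in mis and v not in mis
                        and rank[u] < hi and rank[v] < hi):
                    board.add((u, v))
        adj = {}
        for (u, v) in board:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        # Every party extends the same local Greedy simulation.
        for v in order[lo:hi]:
            if v not in dominated and adj.get(v, set()).isdisjoint(mis):
                mis.add(v)
        # Bookkeeping: parties announce vertices dominated via edges they hold.
        for hs in held:
            for (u, v) in hs:
                if u in mis and v not in mis:
                    dominated.add(v)
                if v in mis and u not in mis:
                    dominated.add(u)
        lo = hi
    return mis
```

Since the local simulations are deterministic given the permutation and the blackboard contents, all parties arrive at the same maximal independent set.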
4.3. $(\Delta+1)$-coloring
We now discuss how the one-pass $\tilde{O}(n)$-space streaming algorithm by Assadi et al. (Assadi et al., 2019) for $(\Delta+1)$-coloring can be implemented in the BB(k) model.
Assadi et al.’s streaming algorithm assumes that $\Delta$ is known prior to the processing of the stream. For each vertex $v \in V$, the algorithm first samples $\Theta(\log n)$ uniform random colors $L(v)$ from the color palette $\{1, \ldots, \Delta+1\}$. Then, while processing the stream, the algorithm maintains the set of edges $uv$ such that $L(u) \cap L(v) \neq \emptyset$, i.e., the palettes of $u$ and $v$ intersect. Denote this set of edges by $E_C$, and call $G_C = (V, E_C)$ the conflict graph. Assadi et al. prove that $|E_C| = O(n \log^2 n)$ w.h.p., and that w.h.p. there exists a coloring $\gamma$ of the conflict graph such that every vertex $v$ is assigned a color from its palette, i.e., $\gamma(v) \in L(v)$. It is easy to see that such a coloring is also a valid coloring of the input graph $G$. Since $|E_C| = O(n \log^2 n)$, and $O(\log n)$ bits are accounted for storing an edge, the space complexity of this algorithm is $O(n \log^3 n)$ bits.
We implement this algorithm in the BB(k) model as follows. The parties maintain a guess $\Delta'$ of $\Delta$ that is initially set to $n/2$ and updated according to a binary search in the range $\{1, \ldots, n-1\}$. In each step of the binary search, our algorithm simulates Assadi et al.’s algorithm using the guess $\Delta'$ instead of $\Delta$. To this end, first, the parties use public randomness to determine the palettes $L(v)$ for every vertex $v \in V$. Knowing all the vertices’ palettes, every party is able to identify which of their edges are contained in $E_C$. Denote the conflict edges of party $j$ by $E_C^j$. Then, every party $j$ writes $E_C^j$ to the blackboard. Next, the parties each locally compute the same coloring of the conflict graph respecting the color palettes of the vertices, if such a coloring exists. If such a coloring exists, the parties conclude that the guess $\Delta'$ was large enough and continue the binary search with smaller guesses, and if such a coloring does not exist, the parties conclude that $\Delta' < \Delta$ and continue with larger guesses. The parties output the coloring computed for the smallest value of $\Delta'$ for which a coloring was found. This coloring is clearly a valid $(\Delta+1)$-coloring.
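The binary search over the guess $\Delta'$ can be sketched as follows, where `coloring_exists` is a hypothetical callback abstracting one simulation of Assadi et al.’s algorithm with a given guess.

```python
def binary_search_delta(n, coloring_exists):
    """Binary-search the smallest guess in {1, ..., n-1} for which the
    simulated palette-sparsification run produces a valid coloring.

    coloring_exists(guess) -> bool is a stand-in for one full simulation
    of the streaming algorithm with palette {1, ..., guess + 1}."""
    lo, hi, best = 1, n - 1, None
    while lo <= hi:
        guess = (lo + hi) // 2
        if coloring_exists(guess):
            best = guess          # a coloring was found; try fewer colours
            hi = guess - 1
        else:
            lo = guess + 1        # guess too small; more colours needed
    return best
```

After $O(\log n)$ probes the search returns the smallest successful guess, whose associated coloring is the protocol’s output.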
Each iteration of the binary search requires writing the conflict graph onto the blackboard, which requires $O(n \log^3 n)$ bits. Since the binary search requires $O(\log n)$ iterations, this protocol has communication cost $\tilde{O}(n)$ bits.
5. Conclusion
In this paper, we gave a new lower bound technique that allowed us to prove lower bounds on the communication complexity of most graph problems in the BB(k) model, even if the edges are assigned uniformly at random to the parties. The strength of our approach is its wide applicability and its robustness under the random assignment of edges. We also showed that our lower bounds are tight, up to poly-logarithmic factors, for Maximal Independent Set, Maximal Matching, and $(\Delta+1)$-coloring.
We conclude by pointing out a connection between the two-party communication setting and the BB(k) model. It is not hard to see that a BB(k) model protocol can be implemented in the two-party communication setting with the same (or smaller) communication cost. Hence, lower bounds in the two-party communication model translate to the BB(k) model. While some strong lower bounds in the two-party communication setting are known, such as the lower bound by Halldórsson et al. (Halldórsson et al., 2012), who showed that computing an $\alpha$-approximation to Maximum Independent Set requires communication cost $\Omega(n^2/\alpha^2)$, these lower bounds do not hold under the random edge assignment assumption. This raises the following question: How can we prove strictly stronger lower bounds for graph problems in the BB(k) model if edges are assigned uniformly at random to the parties?
References
- Ahn et al. (2015) Kook Jin Ahn, Graham Cormode, Sudipto Guha, Andrew McGregor, and Anthony Wirth. 2015. Correlation Clustering in Data Streams. In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015 (JMLR Workshop and Conference Proceedings, Vol. 37), Francis R. Bach and David M. Blei (Eds.). JMLR.org, 2237–2246.
- Ahn and Guha (2011) Kook Jin Ahn and Sudipto Guha. 2011. Linear Programming in the Semi-streaming Model with Application to the Maximum Matching Problem. In Automata, Languages and Programming - 38th International Colloquium, ICALP 2011, Zurich, Switzerland, July 4-8, 2011, Proceedings, Part II (Lecture Notes in Computer Science, Vol. 6756), Luca Aceto, Monika Henzinger, and Jirí Sgall (Eds.). Springer, 526–538.
- Alon et al. (2015) Noga Alon, Noam Nisan, Ran Raz, and Omri Weinstein. 2015. Welfare Maximization with Limited Interaction. In Proceedings of the 2015 IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS) (FOCS ’15). IEEE Computer Society, USA, 1499–1512. https://doi.org/10.1109/FOCS.2015.95
- Assadi et al. (2019) Sepehr Assadi, Yu Chen, and Sanjeev Khanna. 2019. Sublinear Algorithms for (Δ + 1) Vertex Coloring. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, Timothy M. Chan (Ed.). SIAM, 767–786. https://doi.org/10.1137/1.9781611975482.48
- Assadi et al. (2016a) Sepehr Assadi, Sanjeev Khanna, and Yang Li. 2016a. Tight bounds for single-pass streaming complexity of the set cover problem. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, June 18-21, 2016, Daniel Wichs and Yishay Mansour (Eds.). ACM, 698–711.
- Assadi et al. (2016b) Sepehr Assadi, Sanjeev Khanna, Yang Li, and Grigory Yaroslavtsev. 2016b. Maximum Matchings in Dynamic Graph Streams and the Simultaneous Communication Model. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, Robert Krauthgamer (Ed.). SIAM, 1345–1364. https://doi.org/10.1137/1.9781611974331.ch93
- Balliu et al. (2019) Alkida Balliu, Sebastian Brandt, Juho Hirvonen, Dennis Olivetti, Mikaël Rabie, and Jukka Suomela. 2019. Lower Bounds for Maximal Matchings and Maximal Independent Sets. In 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2019, Baltimore, Maryland, USA, November 9-12, 2019. 481–497. https://doi.org/10.1109/FOCS.2019.00037
- binti Khalil and Konrad (2020) Lidiya Khalidah binti Khalil and Christian Konrad. 2020. Constructing Large Matchings via Query Access to a Maximal Matching Oracle. In 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2020, December 14-18, 2020, BITS Pilani, K K Birla Goa Campus, Goa, India (Virtual Conference) (LIPIcs, Vol. 182), Nitin Saxena and Sunil Simon (Eds.). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 26:1–26:15. https://doi.org/10.4230/LIPIcs.FSTTCS.2020.26
- Braverman (2017) Mark Braverman. 2017. Interactive Information Complexity. SIAM Rev. 59, 4 (2017), 803–846. https://doi.org/10.1137/17M1139254
- Braverman and Oshman (2015) Mark Braverman and Rotem Oshman. 2015. On information complexity in the broadcast model. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing. 355–364.
- Chandra et al. (1983) Ashok K. Chandra, Merrick L. Furst, and Richard J. Lipton. 1983. Multi-Party Protocols. In Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing (STOC ’83). Association for Computing Machinery, New York, NY, USA, 94–99. https://doi.org/10.1145/800061.808737
- Cover and Thomas (2006) Thomas M. Cover and Joy A. Thomas. 2006. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing). Wiley-Interscience, USA.
- Dobzinski et al. (2014) Shahar Dobzinski, Noam Nisan, and Sigal Oren. 2014. Economic Efficiency Requires Interaction. In Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing (New York, New York) (STOC ’14). Association for Computing Machinery, New York, NY, USA, 233–242. https://doi.org/10.1145/2591796.2591815
- Drucker et al. (2012) Andrew Drucker, Fabian Kuhn, and Rotem Oshman. 2012. The Communication Complexity of Distributed Task Allocation (PODC ’12). Association for Computing Machinery, New York, NY, USA, 67–76. https://doi.org/10.1145/2332432.2332443
- Eggert et al. (2012) Sebastian Eggert, Lasse Kliemann, Peter Munstermann, and Anand Srivastav. 2012. Bipartite Matching in the Semi-streaming Model. Algorithmica 63, 1-2 (2012), 490–508. https://doi.org/10.1007/s00453-011-9556-8
- Fernandez et al. (2020) Manuel Fernandez, David P. Woodruff, and Taisuke Yasuda. 2020. Graph Spanners in the Message-Passing Model. In 11th Innovations in Theoretical Computer Science Conference, ITCS 2020, January 12-14, 2020, Seattle, Washington, USA (LIPIcs, Vol. 151), Thomas Vidick (Ed.). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 77:1–77:18. https://doi.org/10.4230/LIPIcs.ITCS.2020.77
- Gamlath et al. (2019) Buddhima Gamlath, Sagar Kale, Slobodan Mitrovic, and Ola Svensson. 2019. Weighted Matchings via Unweighted Augmentations. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, PODC 2019, Toronto, ON, Canada, July 29 - August 2, 2019, Peter Robinson and Faith Ellen (Eds.). ACM, 491–500.
- Ghaffari et al. (2018) Mohsen Ghaffari, Themis Gouleakis, Christian Konrad, Slobodan Mitrovic, and Ronitt Rubinfeld. 2018. Improved Massively Parallel Computation Algorithms for MIS, Matching, and Vertex Cover. In Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, PODC 2018, Egham, United Kingdom, July 23-27, 2018, Calvin Newport and Idit Keidar (Eds.). ACM, 129–138.
- Halldórsson et al. (2012) Magnús M. Halldórsson, Xiaoming Sun, Mario Szegedy, and Chengu Wang. 2012. Streaming and Communication Complexity of Clique Approximation. In Automata, Languages, and Programming - 39th International Colloquium, ICALP 2012, Warwick, UK, July 9-13, 2012, Proceedings, Part I (Lecture Notes in Computer Science, Vol. 7391), Artur Czumaj, Kurt Mehlhorn, Andrew M. Pitts, and Roger Wattenhofer (Eds.). Springer, 449–460.
- Huang et al. (2020) Zengfeng Huang, Bozidar Radunovic, Milan Vojnovic, and Qin Zhang. 2020. Communication complexity of approximate maximum matching in the message-passing model. Distributed Comput. 33, 6 (2020), 515–531. https://doi.org/10.1007/s00446-020-00371-6
- Konrad (2015) Christian Konrad. 2015. Maximum Matching in Turnstile Streams. In Algorithms - ESA 2015 - 23rd Annual European Symposium, Patras, Greece, September 14-16, 2015, Proceedings (Lecture Notes in Computer Science, Vol. 9294), Nikhil Bansal and Irene Finocchi (Eds.). Springer, 840–852. https://doi.org/10.1007/978-3-662-48350-3_70
- Konrad (2018) Christian Konrad. 2018. A Simple Augmentation Method for Matchings with Applications to Streaming Algorithms. In 43rd International Symposium on Mathematical Foundations of Computer Science, MFCS 2018, August 27-31, 2018, Liverpool, UK (LIPIcs, Vol. 117), Igor Potapov, Paul G. Spirakis, and James Worrell (Eds.). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 74:1–74:16.
- Kuhn et al. (2016) Fabian Kuhn, Thomas Moscibroda, and Roger Wattenhofer. 2016. Local Computation: Lower and Upper Bounds. J. ACM 63, 2 (2016), 17:1–17:44. https://doi.org/10.1145/2742012
- Low et al. (2012) Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, and Joseph M Hellerstein. 2012. Distributed graphlab: A framework for machine learning in the cloud. arXiv preprint arXiv:1204.6078 (2012).
- McGregor (2005) Andrew McGregor. 2005. Finding Graph Matchings in Data Streams. In Approximation, Randomization and Combinatorial Optimization, Algorithms and Techniques, 8th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2005 and 9th International Workshop on Randomization and Computation, RANDOM 2005, Berkeley, CA, USA, August 22-24, 2005, Proceedings (Lecture Notes in Computer Science, Vol. 3624), Chandra Chekuri, Klaus Jansen, José D. P. Rolim, and Luca Trevisan (Eds.). Springer, 170–181.
- Phillips et al. (2012) Jeff M Phillips, Elad Verbin, and Qin Zhang. 2012. Lower bounds for number-in-hand multiparty communication complexity, made easy. In Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms. SIAM, 486–501.
- Pinsker (1964) Mark S Pinsker. 1964. Information and information stability of random variables and processes. Holden-Day.
- Rao and Yehudayoff (2020) Anup Rao and Amir Yehudayoff. 2020. Communication Complexity and Applications. Cambridge University Press.
- Assadi et al. (2021) Sepehr Assadi, Cliff Liu, and Robert E. Tarjan. 2021. An Auction Algorithm for Bipartite Matching in Streaming and Massively Parallel Computation Models. In 4th Symposium on Simplicity in Algorithms, SOSA@SODA 2021.
- Vempala et al. (2020) Santosh S. Vempala, Ruosong Wang, and David P. Woodruff. 2020. The Communication Complexity of Optimization. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms (Salt Lake City, Utah) (SODA ’20). Society for Industrial and Applied Mathematics, USA, 1733–1752.
- Woodruff and Zhang (2017) David P. Woodruff and Qin Zhang. 2017. When Distributed Computation is Communication Expensive. Distrib. Comput. 30, 5 (Oct. 2017), 309–323. https://doi.org/10.1007/s00446-014-0218-3
- Yao (1977) Andrew Chi-Chin Yao. 1977. Probabilistic computations: Toward a unified measure of complexity. In 18th Annual Symposium on Foundations of Computer Science (sfcs 1977). IEEE Computer Society, 222–227.
- Yao (1979) Andrew Chi-Chih Yao. 1979. Some Complexity Questions Related to Distributive Computing (Preliminary Report). In Proceedings of the Eleventh Annual ACM Symposium on Theory of Computing (Atlanta, Georgia, USA) (STOC ’79). Association for Computing Machinery, New York, NY, USA, 209–213. https://doi.org/10.1145/800135.804414