On Sparsity Awareness in Distributed Computations
We extract a core principle that underlies seemingly different fundamental distributed settings, which is that sparsity awareness may induce faster algorithms for core problems in these settings. To leverage this, we establish a new framework by developing an intermediate auxiliary model which is weak enough to be successfully simulated in the classic model given low mixing time, as well as in the recently introduced model. We prove that despite imposing harsh restrictions, this artificial model allows balancing massive data transfers with a maximal utilization of bandwidth. We then exemplify the power we gain from our methods, by deriving fast shortest-paths algorithms which greatly improve upon the state-of-the-art.
Specifically, we obtain the following end results for graphs of nodes:
• A approximation for weighted, undirected APSP in rounds in the model, where is the minimum degree of the graph and is its mixing time. For graphs with , this takes rounds, despite the known lower bound for general graphs [Nanongkai, STOC’14].
• An -round exact SSSP algorithm in the model, for graphs with edges and a mixing time of . This improves upon the previous algorithm of [Chechik and Mukhtar, PODC’20] for significant ranges of values of and .
• A simulation in the model which improves upon the previous state-of-the-art simulation of [Ghaffari, Kuhn, and Su, PODC’17] by a factor that is proportional to the average degree in the graph.
• An -round algorithm for a approximation for SSSP in the model. The only previous round algorithm for distance approximations in this model is for a much larger approximation factor of in rounds [Augustine, Hinnenthal, Kuhn, Scheideler, Schneider, SODA’20].
1 Introduction
The overarching theme of this paper is laying down an algorithmic infrastructure and employing it for designing fast algorithms in seemingly unrelated distributed settings, namely, the classic model and the recently-introduced model.
The model [56] abstracts a synchronous network of nodes, in which in each round of computation, each node can send a message of bits on each of its links. A recent line of work addresses computing MST, distances, and data summarization in with low mixing time [38, 40, 60], and finding small subgraphs in benefits from efficient computation inside components of low mixing time [20, 21, 22, 15, 13, 30, 48]. Low mixing time is, in particular, a property of expander graphs, which have been shown to be useful for designing data centers [43, 27].
The model, recently introduced by [8], abstracts networks supporting high-bandwidth communication over local edges, as well as very low-bandwidth communication over global edges. Aligned with most previous work on this model, we assume here unbounded bandwidth for messages to neighbors, and -bit messages to a constant number of nodes anywhere in the network. This model in particular abstracts recent developments in hybrid data centers [26, 42, 47, 62]. Most research in the model has been devoted to distance computation problems [8, 49, 32, 18].
While these settings differ greatly, we pinpoint an approach that underlies computation in both of them, namely, sparsity and density awareness. The key type of task we tackle is one requiring the transfer of massive amounts of data. Our general approach is to design load balancing mechanisms that leverage the full bandwidth of a given communication model. Examples of tasks that enjoy our framework are matrix multiplication and distance computations.
For the purpose of our framework, as an auxiliary tool (not as a computational model in its own right), we define the model with capacity (abbreviated ), which is an all-to-all setting whose main characteristics are a limit on the bandwidth per node and the anonymity of nodes. To cope with the harsh nature of the model, which is needed in order to allow it to be efficiently simulated, we develop a distributed data structure and accompanying algorithms, dedicated towards load balancing and full utilization of the available bandwidth.
We then show sparsity aware simulations of the model in the and settings. Specifically, the simulations focus on utilizing all the available bandwidth in the underlying models, even when given highly skewed inputs. Combined, these obtain our end results – fast algorithms for distance computations in low-mixing time and in the model.
A first flavor of our end results: One of our main contributions is proving that the size of the graph is not fine-grained enough to capture the complexity of the all-pairs shortest-paths problem (APSP) in the model. While there is an lower bound for general graphs, even when allowing large approximation factors [54], and a matching randomized algorithm giving an exact solution [12], we show that one can go significantly below this complexity, depending on the minimal degree in the graph and its mixing time.
Theorem 1.1 (-Approximation for APSP in ).
For any constant , and weighted, undirected graph with minimal degree and mixing time , there is an algorithm in the model computing a approximation to APSP on within rounds, w.h.p. (Note that there is a typo in the complexity stated for this theorem in the SPAA’21 proceedings.)
For any constant , consider a graph with . Using Theorem 1.1, it is possible to approximate weighted, undirected APSP on in rounds, w.h.p., in the model, which is a major improvement over the linear general case. This is aligned with a conclusion obtained from the single-source shortest paths (SSSP) algorithm of [40], which reflects that and the diameter are insufficient for capturing the complexity of SSSP. Our result suggests that for APSP the dependence parameters could be and , and this opens the complexity landscape of APSP in the model to further exciting research.
1.1 Our Contributions
1.1.1 Fast Algorithms for the Model
The pioneering works of [8, 49] lay down technical foundations that show that utilizing both the local and global edges in the model allows solutions which are much faster than algorithms which only use the local or global edges. One of their prime contributions is showing that the complexity of exact and approximate APSP is rounds.
The -round regime is also of importance in this model, as [49, 18] show a variety of algorithms with this complexity. For diameter, a lower bound of rounds for exact unweighted diameter and for a approximation for weighted diameter is shown (the first lower bound in the model for a problem with a small output), matched with a round algorithm for a approximation for unweighted diameter, and a approximation for weighted diameter. For weighted distances, -round algorithms are shown for approximations from polynomially many sources, in addition to exact distances from sources.
Our main contribution in the model is breaking below rounds for a approximation for weighted single source shortest paths (SSSP), which also implies a approximation for weighted diameter. As we elaborate upon in the following subsections, this requires establishing an entire foundation of techniques which are a core contribution of the paper. We show the following algorithm which completes in rounds, w.h.p. (as is common, high probability means at least , for a constant ).
Theorem 1.2 (-Approximation for SSSP in ).
Given a weighted, undirected graph , with and , a value , and a source , there is a model algorithm computing a -approximation of SSSP from , in rounds, w.h.p.
Our results provide an interesting open question: what is the best round complexity for a approximation of the diameter? We show that for a approximation, rounds suffice, while requires rounds, as stated above.
1.1.2 Fast Algorithms for the Model
As a warm-up, we illustrate the power of our model and its simulation in the model, by showing how to simulate the model (a synchronous model where every two nodes can exchange messages of bits in every round) in and hence in . Simulating algorithms from the model may both significantly simplify algorithm design and improve results in other distributed models, as seen in previous works [38, 20, 21, 22, 15, 13, 30, 49].
We get the following result, from which one can already show new algorithms which beat the state-of-the-art in the model by simulating existing algorithms.
Theorem 1.3 ( Simulation in ).
Consider a graph . Let be an algorithm which runs in the model over in rounds. If each node has bits of input and output, then there is an algorithm which simulates in rounds of the model over , w.h.p.
Here, is the mixing time of the graph, which is roughly the number of rounds required for a lazy random walk to reach the stationary distribution (a precise definition is not needed for understanding our results). A previous non-trivial simulation of the model in the model is due to [38]. It emulates one round in rounds of the model, where is the maximum degree in the graph. Moreover, for certain graphs, this is improved to . Thus, our simulation improves upon both previous algorithms for all graphs with edges by a factor that is roughly proportional to the average degree, namely, it is faster by a factor of for some constant . Intuitively, in [38], if every node in desires to message every other node then this requires many rounds. We circumvent this by sending the input of low degree nodes to high degree ones, which then simulate the model and send back the output to the low degree nodes.
Finally, such a simulation implies a relation between lower bounds in the and models. Specifically, simulating one round in rounds of the model shows that a lower bound of rounds in the model implies a lower bound of rounds in the model. Due to [29, Theorem 4], we know that lower bounds for some problems in the model imply lower bounds in bounded-depth circuit complexity, and are therefore considered hard to obtain. Plugging our results from Theorem 1.3 in the lower bound for the model shows that if one constructs a family of graphs with edges and mixing time, for which solving some problem in the model requires rounds, then has a lower bound of rounds in the model for some constant . This means that any value of that is below implies a lower bound that is considered hard in the model.
SSSP.
The current state-of-the-art exact SSSP algorithm in the model, due to [14], runs in rounds. Using our result from Theorem 1.3, a simulation of this algorithm in the model runs in rounds, w.h.p. We further improve this result by presenting a solution that is faster for any graph which is not extremely dense, namely, for which . In a nutshell, our speed-up is due to the fact that the algorithm does not use all of the bandwidth available to it in every round, and so it is inefficient to simulate directly in . Thus, we instead simulate our model, which better reflects the complexity of such algorithms, giving us the following.
Theorem 1.4 (Exact SSSP in ).
Given a weighted, undirected graph and a source node , there is an algorithm in the model that ensures that every node knows the value , within rounds, w.h.p.
Consider graphs with a mixing time . The diameter of such graphs is also . If such a graph has at least edges, then our SSSP algorithm runs asymptotically faster than the state-of-the-art -round algorithm of [23].
APSP.
We now turn our attention to the APSP problem. As observed by Nanongkai [54, Observation 1.4], to solve APSP in the model, a node is required to learn bits of information. In the worst case, for a node with a constant degree, this takes rounds. For this reason, slightly modified requirements for the output have been considered.
We consider a shortest-path query problem, in which we separate the computation of shortest paths into two phases, one in which the input graph is pre-processed, and another in which a query set of pairs of nodes is given, , and every node is required to learn the distance to every node such that . The round complexity of this problem is thus bi-criteria, measured both in terms of pre-processing time and in terms of query time. We analyze the round complexity of the query in terms of the query load, where given a node , denotes the number of queries which is a part of, and the total, normalized query load is .
Theorem 1.5 (-Approximation for Shortest-Path Query in ).
For any constant and a weighted, undirected graph with edges, there is a algorithm which after rounds of pre-processing, solves any instance of the -approximate shortest path query problem with a known load , in rounds, w.h.p.
By denoting as the minimal degree in the graph, one gets and , which implies Theorem 1.1 stated above.
Finally, we consider a version of APSP, which we call Scattered APSP, where the distance between every pair of nodes is known to some node, not necessarily the endpoints themselves. That is, we require that every node knows, for every node , the identity of a node , which stores the distance .
Theorem 1.6 (-Approximation for Scattered APSP in ).
There exists an algorithm in the model, that given a weighted, undirected input graph with and , and some constant , solves the -approximate Scattered APSP problem on , within rounds (see the note in Theorem 1.1 regarding the complexity stated in the SPAA’21 proceedings), w.h.p.
Roadmap.
After a survey of additional related work, Section 2 provides required definitions. Additional preliminaries appear in Appendix A. Section 3 is dedicated to the definition of carrier configurations in the model, and to a flavor of a sample proof, namely, sparse matrix multiplication. The bulk of our infrastructure for is deferred to Appendix B. Finally, Section 4 and Section 5 provide proofs of our end results in the and models, respectively. Both sections have some content deferred to Appendix C and Appendix D, respectively. We conclude with a discussion in Section 6.
1.2 Additional Related Work
.
There is growing interest in the recently introduced network model [8]. It is further studied by [49, 32, 18]. [8, 49, 18] consider the same values for the local bandwidth and global bandwidth as we do, and address mainly distance computations. The complexity of exact weighted APSP is rounds, by combining the upper bound of [49], and the lower bound of [8]. The complexity of approximate weighted -SSP is rounds, by combining the upper bound of [18], and the lower bound of [49]. [8, 49] show how to simulate one round of the (a synchronous distributed model where every node can broadcast a (same) -bit message to all nodes per round) and models on the skeleton graph with nodes in rounds of the model and obtain various distance related results using them. For example [49] show and weighted -SSP approximation in and rounds w.h.p., respectively. [18] improve on some of those results by simulating ad-hoc models which exploit additional communication abilities of the model. For instance, they show approximate weighted -SSP approximation in rounds w.h.p. [32] solve distance problems for sparse families of graphs using local congestion bounded by . Another special case of the model is the model, which contains only global edges [6, 7].
.
Distance problems are extensively studied in the model due to being fundamental tasks. One of the important problems in the model is -approximate SSSP [58, 11, 44]. The state-of-the-art randomized algorithm [11] solves it in rounds w.h.p., even in the more restricted Broadcast model. This is close to the lower bound by [58]. The state-of-the-art deterministic algorithm [44] completes in rounds. The complexity of exact SSSP is still open, with algorithms given in [31, 39, 33, 23]. The current best known algorithm, due to [23], runs in rounds, w.h.p.
There is a lower bound of rounds to compute APSP [54], which was then tweaked to give a lower bound for the weighted case [17]. Over the years there has been much progress in understanding the complexity of this problem [57, 52, 51, 4, 31, 12, 2, 3]. For weighted exact APSP, the best known randomized algorithm [12] is optimal up to polylogarithmic terms. The best deterministic algorithm to date [3] completes in rounds. For unweighted exact APSP, -round algorithms are known [57, 52, 46].
To go below the lower bound, [50, 51] propose to build name-dependent routing tables and APSP. This means that the algorithm is allowed to choose small labels and output results that are consistent with those labels. Choosing the labels carefully thus overcomes the need to send too many bits of information to low-degree nodes. There is also a line of work which breaks below the lower bound for certain graph families [36, 37, 53, 55]. [57, 34, 1, 41, 5] study exact and approximate diameter, eccentricities, girth and other problems.
Recently, [38, 40, 60] noticed that it is possible to develop faster algorithms when the underlying graph has low mixing time. This was shown for MST [38], maximum flow, SSSP and transshipment [40], and frequency moments [60]. Algorithms for subgraph-freeness and related variants also enjoy fast computations on graphs with low mixing time [20, 21, 22, 48, 15, 13, 30].
and .
2 Preliminaries
The following are some required definitions, while Appendix A contains additional definitions and basic claims. We begin with a variant of the model, introduced in [8].
Definition 1 ( Model).
In the model, a synchronous network of nodes with identifiers in is given by a graph . In each round, every node can send and receive messages of bits to/from each of its neighbors (over local edges ) and an additional messages in total that are bits long to/from any other nodes in the network (over global edges). If in some round more than messages are sent via global edges to/from a node, only messages selected adversarially are delivered.
We introduce the model with capacity (abbreviated ), which we show is powerful for extracting core distributed principles. The model has two defining characteristics: anonymity and restricted message capacity.
Definition 2 (The Model).
The model with capacity is a distributed synchronous communication model, over a graph with nodes, where each node can send and receive messages in every round, each of size bits, to and from any node in the graph. Nodes have identifiers in ; however, every node has a unique -bit communication token, initially known only to itself. Node can send each message either to some node whose communication token is already known to , or to a node selected uniformly and independently at random from the entire graph.
bears some similarity to the model [6] with an empty initial knowledge graph. Unlike the model, has additional capacity to send messages from each node and the ability to send messages to random nodes. Our motivation for defining is to provide a setting which is strong enough to solve challenging problems, yet at the same time weak enough to be simulated efficiently in the settings of interest. We note the importance of having both identifiers and communication tokens. An identifier is chosen from a given, hardcoded set and thus can be used for assigning tasks to specific nodes. Communication tokens both assist in dealing with anonymity, and enhance the ability of the model to be easily simulated in other distributed settings – a simulating algorithm can encode routing information of the underlying model in the tokens.
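To make the two communication modes of Definition 2 concrete, the following is a minimal Python sketch of a per-round outbox for a single node in an anonymous capacitated setting; the class name, the capacity check, and the random-target placeholder are illustrative assumptions rather than part of the formal model.

```python
import random

class CapacitatedOutbox:
    """Per-round outbox sketch: at most `capacity` outgoing messages, each addressed
    either by a known communication token or to a uniformly random node."""

    RANDOM_TARGET = object()  # placeholder: target drawn uniformly at random by the environment

    def __init__(self, capacity):
        self.capacity = capacity
        self.messages = []  # list of (target_token_or_RANDOM_TARGET, payload)

    def send_to_token(self, token, payload):
        # Allowed only if the token is already known to the sender.
        self._append(token, payload)

    def send_to_random(self, payload):
        self._append(self.RANDOM_TARGET, payload)

    def _append(self, target, payload):
        if len(self.messages) >= self.capacity:
            raise RuntimeError("per-round send capacity exceeded")
        self.messages.append((target, payload))

def resolve_round(outboxes, tokens):
    """Environment step (assumption): replace random targets by uniform picks over all tokens."""
    delivered = []
    for sender, box in outboxes.items():
        for target, payload in box.messages:
            if target is CapacitatedOutbox.RANDOM_TARGET:
                target = random.choice(tokens)
            delivered.append((sender, target, payload))
    return delivered
```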
Many of our results hold for weighted graphs , where for a which is polynomial in . Whenever we send an edge as part of a message, we assume is sent as well. We assume that all graphs that we deal with are connected.
Given a graph and a pair of nodes , we denote by the hop distance between and , by a subset of the closest nodes to with ties broken arbitrarily, and by the weight of the lightest path between and of at most -hops, where if there is no path of at most -hops then . In the special case of , we denote by the weight of a lightest path between and . We also denote by the degree of in , and, in the directed case, denote the in-degree and out-degree of in , respectively. When it is clear from the context we sometimes drop the subscript .
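For reference, hop-bounded distances can be computed sequentially by limiting Bellman-Ford to a fixed number of relaxation rounds. The following is a small centralized sketch (not a distributed algorithm) with illustrative names; it only serves to pin down the notion of an at-most-h-hop lightest path used above.

```python
import math

def hop_bounded_distances(n, edges, source, h):
    """edges: iterable of (u, v, w) for an undirected weighted graph on nodes 0..n-1.
    Returns dist[v] = weight of the lightest path from source to v using at most h hops,
    or math.inf if no such path exists."""
    dist = [math.inf] * n
    dist[source] = 0
    for _ in range(h):
        new_dist = dist[:]  # relax every edge once per allowed hop
        for u, v, w in edges:
            if dist[u] + w < new_dist[v]:
                new_dist[v] = dist[u] + w
            if dist[v] + w < new_dist[u]:
                new_dist[u] = dist[v] + w
        dist = new_dist
    return dist
```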
Definition 3 (-Source Shortest Paths (k-SSP)).
Given a graph , in the -source shortest path problem, we are given a set of sources. Every is required to learn the distance for each source . The cases where , are called the single source shortest paths problem (SSSP), and all pairs shortest paths problem (APSP), respectively.
Definition 4 (Scattered APSP).
Given a graph , in the Scattered APSP problem, for every pair of nodes , there exist nodes (potentially ), such that and know , knows the identifier of , and knows the identifier of .
In the approximate versions of these problems, each is required to learn an -approximate distance which satisfies , and in case , is called an -approximate distance.
Definition 5 (Diameter).
Given a graph , the diameter is the maximum distance in the graph. An -approximation of the diameter satisfies .
3 The Model
The value of defining the model lies in our ability to efficiently simulate it in the and models. Applications of this strength are exemplified by improved algorithms for distance computation problems in these models.
To this end, we design fast algorithms in the model for the useful tools of sparse matrix multiplication and hopset construction (Section 3.2). Such algorithms already exist in all-to-all communication models, such as the [14, 28]. However, fundamental load balancing and synchronization steps that are simple to implement when assuming a bandwidth of messages per round as in the model, pose formidable challenges when nodes cannot receive messages each. For instance, when multiplying two matrices, while the number of finite elements in row of both input matrices might be small, row of the output matrix might be very dense, so that node would not be able to learn all of this information. This implies that it is not always the case that every node can even know all the edges incident to it in some overlay graph (e.g., a hopset). The crux in the way we overcome these challenges is in introducing the carrier configuration distributed data structure that performs automatic load balancing (Section 3.1). Missing proofs are deferred to Appendix B.
3.1 Carrier Configurations
A carrier configuration is a distributed data structure for holding graphs and matrices, whose main objective is to provide a unified framework for load balancing in situations where substantial amounts of data need to be transferred. The key is that when using the carrier configurations data structure, an algorithm does not need to address many load balancing issues, as those are dealt with under-the-hood by the data structure itself. Therefore, this data structure allows us to focus on the core concepts of each algorithm and abstract away challenges that arise due to skewed inputs.
The data structure crucially enables our algorithms to enjoy sparsity awareness, by yielding complexities that depend on the average degree in an input graph rather than its maximal degree. This allows one to eventually deal with data which is highly skewed and which would otherwise cause a slow-down by having some nodes send and receive significantly more messages than others.
We stress that the manner in which we implement carrier configurations is inherently distributed in nature. In many cases, when two nodes store data, , respectively, in a carrier configuration, the data is dispersed among many other nodes, and when operations are performed using both and , the nodes which store the data perform direct communication between themselves, without necessarily involving or .
In more detail, the carrier configuration data structure is based on three key parts. (1) Carrier Nodes: every node gets a set of carrier nodes, where and is the average degree in the graph, which help to carry information and store its input edges in a distributed fashion. A key insight is that it is possible to create such an assignment of carrier nodes and also maintain that each node itself is not a carrier for too many other nodes, thus avoiding congestion. (2) Ordered Data Storage: the data of is stored in a sorted fashion across in order to enable its efficient usage. In particular, it takes bits in order to describe what ranges of data are stored in a specific carrier node. (3) Communication Structure: the nodes are connected using a communication tree (Definition 11), which enables fast broadcasting and aggregation of information between and .
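As a rough illustration of part (1), the following Python sketch assigns each node a number of carriers proportional to its degree divided by the average degree, in a round-robin fashion so that no node carries for more than a constant number of owners. The rounding and the round-robin order are illustrative assumptions; the actual data structure assigns carriers distributively.

```python
import math

def assign_carriers(degrees):
    """degrees: dict node -> degree. Returns dict node -> list of carrier nodes.
    Each node v gets ceil(deg(v) / avg_degree) carriers; since the total number of
    carrier slots is at most 2n, the round-robin assignment makes every node serve
    as a carrier for O(1) owners."""
    nodes = sorted(degrees)
    avg = max(1, sum(degrees.values()) / len(nodes))
    carriers = {v: [] for v in nodes}
    cursor = 0
    for v in nodes:
        for _ in range(math.ceil(degrees[v] / avg)):
            carriers[v].append(nodes[cursor % len(nodes)])
            cursor += 1
    return carriers
```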
A formal definition of a carrier configuration data structure (Definition 12), as well as an extended -specific definition (Definition 13) are given in Section B.2.
Carrier Configuration Toolbox. Typically, data is converted from a classical storage setting (every node knows the edges incident to it) into being stored in a carrier configuration, and then operations are applied to change the state of the configuration. Thus, we show in the model how to convert a classical graph representation to a carrier configuration. Then, we provide basic tools, e.g., given two matrices held in carrier configurations, produce a third configuration holding the matrix resulting from computing the point-wise minimum of the two input matrices. The descriptions and implementations of these are deferred to Section B.2.
3.2 Sparsity Aware Distance Computations
In order to give a taste of the type of algorithms which we construct in the model, we present an outline of our sparse matrix multiplication algorithm.
We build a foundation in the model which enables us to eventually implement sparse matrix multiplication and hopset construction algorithms, as inspired by the model algorithms in [14] (we note that while some results of [14] were improved in [28], the improvements focus on reducing the number of synchronous rounds of the algorithm, yet do not decrease the message complexity in a way that assists us), in order to solve distance related problems. Ultimately, compared to [14], our main contribution is significantly reducing the message complexity overhead of the load balancing mechanisms. In the implementation, various load balancing steps require messages, which is trivial in the model, yet is highly problematic in the model (and in the and models which ultimately simulate it). Interestingly, for the relevant sparsity used in distance computations, the sparse matrix multiplication itself requires messages for the actual transfer of matrix elements, and in the algorithm, the message complexity is dominated by the overhead. We reduce this overhead significantly such that the message complexity of the algorithm is dominated by the messages which actually transfer matrix elements.
We show how to perform sparse matrix multiplication in the model. To simplify our proofs, we assume that the two input matrices and the output matrix have the same average number of finite elements per row. We note that it is possible to use the same ideas we show here in order to prove the general case where the matrices have different densities.
Theorem 3.1 (Sparse Matrix Multiplication).
Given two input matrices , both with an average number of finite elements per row of at most , and stored in carrier configurations, and , it is possible to output the matrix , in carrier configuration , where is , with only the smallest elements of each row computed, breaking ties arbitrarily. This takes rounds, in the model, w.h.p.
Proof of Theorem 3.1.
Throughout the algorithm, we assume that every piece of data sent directly to a node is also known by all its carrier nodes. This is possible to do with only a multiplicative factor to the round complexity, due to Lemma B.14 (Carriers Broadcast and Aggregate). Further, as the input matrices are stored in carrier configuration, due to Item 4, every entry is stored in alongside the communication tokens of both and (likewise for entries of stored in ). Thus, whenever a value is sent in the algorithm from one of the input matrices, we assume that it is sent alongside these communication tokens.
Denote by the maximal number of finite elements in a row of or . In the proof below, we show a round complexity of , which is bounded by the claimed round complexity of . (Notice that , since: (1) it always holds that , and so , and (2) if , then , yet if , then .)
Throughout the proof, we construct various sets of nodes (for instance, ) where every node in the set knows the communication tokens and identifiers of all the other nodes in the set (), and thus we assume that the nodes in each set locally compute some arbitrary node (some fixed ) which is the leader of the set.
Step: Partitioning the Input – Sets
Denote by the number of finite elements in row of , and column of , respectively.
Denote , a hardcoded partition of into equally sized sets. Using Lemma B.1 (Routing) and Lemma B.7 (Grouping), every broadcasts to , within rounds. The nodes in locally partition into , for some , where for each , . Since , there exists a way to create this partitioning such that .
From here on, we refer to the sets as . We denote by the rows of corresponding to the nodes , and by the columns of corresponding to the nodes . As , there are at most sets – the sets . For each set , an arbitrary is designated as its leader. The identifiers and communication tokens of all the leaders are broadcast using Lemma B.5 (Broadcasting), within rounds.
We partition the information held by the sets . We compute , where the total number of finite entries in (the notation denotes all rows of with indices in , from column to column ) is at most . Recall that for each , it holds that , and also , implying the existence of such a set .
The nodes in compute the values in using binary search. To see this, given any , the number of finite entries in each of can be computed by the nodes in in rounds, using Corollary B.9 (Group Broadcasting and Aggregating). Thus, each can be found using binary search, with binary searches run in parallel in rounds. In total, rounds are required to compute .
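The same balancing computation can be mirrored sequentially: given the number of finite entries per row, the cut points are exactly the boundaries found by binary searching over prefix sums so that each chunk holds at most a given number of finite entries. The following sketch is a centralized analogue with illustrative names; in the model itself, the per-range counts are obtained by the aggregation described above.

```python
import bisect

def balanced_cuts(row_counts, bound):
    """row_counts[i] = number of finite entries in row i.
    Returns cut indices 0 = c_0 < c_1 < ... <= len(row_counts) such that every chunk
    rows[c_j:c_{j+1}] holds at most `bound` finite entries (a single row heavier than
    `bound` forms its own chunk)."""
    prefix = [0]
    for c in row_counts:
        prefix.append(prefix[-1] + c)
    cuts = [0]
    while cuts[-1] < len(row_counts):
        start = cuts[-1]
        # Largest end with prefix[end] - prefix[start] <= bound, found by binary search.
        end = bisect.bisect_right(prefix, prefix[start] + bound) - 1
        end = max(end, start + 1)  # always make progress, even for a single heavy row
        cuts.append(end)
    return cuts
```

For instance, balanced_cuts([3, 1, 4, 1, 5], 5) cuts the five rows into the chunks {0,1}, {2,3}, {4}, each with at most 5 finite entries.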
Step: Creating Intermediate Representations – Sets , Matrices
Let , for , be a hard-coded partition of into equally sized sets. The goal of this step is for nodes to compute an intermediate representation of the product . Therefore, we desire that each sends all its data to , for each , in some load-balanced manner. We show how, for each , sends to , for each , and in a symmetric way (with matrix instead of ) this can be done for .
In rounds, allow communication within each by invoking Lemma B.7 (Grouping). Let some , be denoted leader, and broadcast the leaders of all the sets using Lemma B.5 (Broadcasting) within rounds. Node sends to both all of . Each node receives messages, thus this takes rounds using Lemma B.1 (Routing). Now, each node broadcasts to all the tokens it receives, using Corollary B.9 (Group Broadcasting and Aggregating), within rounds.
Leader sends the contents of to all the leader nodes , for any , in rounds using Lemma B.1 (Routing). Each broadcasts to the contents of and , within rounds using Corollary B.9 (Group Broadcasting and Aggregating).
We send information from to . The -th node in learns the finite elements in . By definition of , for any , the number of finite elements in is at most , bounding the number of messages each node desires to receive. Every finite element held by a node needs to be sent to nodes in the graph, in total. Further, since the finite elements of are stored in its carrier nodes, and each carrier node of stores elements, each node sends at most a total of messages. Thus, this routing can be accomplished in rounds, using Lemma B.1 (Routing). (To perform the routing, it is required that the carrier nodes of know the communication tokens of the nodes in which receive the messages. All the nodes in received , and so knows the required tokens. At the start of the proof, we state that we assume that every message a node receives is also broadcast from it to its carriers, and so the carriers of also know the required communication tokens.)
We shuffle data within . Recall that . The first nodes of received data from , according to . Denote . We desire that for each interval, , for , node knows all the elements and . To do so, notice that (likewise with ) is fully contained in the data which some node in , where , already knows. Thus, each node of denotes by how many other nodes in are reliant on the data which it itself received from (this can be done since all the nodes in know all of ). Then, we invoke Lemma B.10 (Group Multicasting) in rounds in order to route the required information. Finally, each node from knows and , denoted , respectively, and can compute the product .
We are at a state where for each , every of knows some matrix such that .
Step: Sparsification – Matrices
We sparsify the matrices. Recall that we desire to output which is , with only the smallest entries on each row.
Fix , , and denote by the matrix of size created by concatenating .
As shown in [14], we are allowed to keep only the smallest entries on each row of , without damaging the final output . That is, throwing out elements at this stage is guaranteed to throw out elements which are only in and not in .
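This sparsification claim can be sanity-checked by brute force on small instances: truncating each intermediate block of a min-plus product to the k smallest entries per row, taking entry-wise minima, and truncating again yields exactly the k smallest entries per row of the untruncated product. The following Python sketch performs that check; it illustrates the claim of [14] and is not part of our distributed implementation.

```python
import math, random

def minplus(A, B):
    n = len(A)
    return [[min(A[i][l] + B[l][j] for l in range(n)) for j in range(n)] for i in range(n)]

def keep_k_smallest(M, k):
    """Keep only the k smallest entries of each row (ties broken by column); others become inf."""
    out = []
    for row in M:
        kept = sorted(range(len(row)), key=lambda j: (row[j], j))[:k]
        out.append([row[j] if j in kept else math.inf for j in range(len(row))])
    return out

def check(n=6, k=2, trials=100):
    for _ in range(trials):
        A = [[random.choice([math.inf, random.randint(1, 9)]) for _ in range(n)] for _ in range(n)]
        B = [[random.choice([math.inf, random.randint(1, 9)]) for _ in range(n)] for _ in range(n)]
        full = keep_k_smallest(minplus(A, B), k)
        # Split the inner dimension into two blocks and truncate each partial product per row.
        B1 = [[B[l][j] if l < n // 2 else math.inf for j in range(n)] for l in range(n)]
        B2 = [[B[l][j] if l >= n // 2 else math.inf for j in range(n)] for l in range(n)]
        P1 = keep_k_smallest(minplus(A, B1), k)
        P2 = keep_k_smallest(minplus(A, B2), k)
        merged = keep_k_smallest([[min(P1[i][j], P2[i][j]) for j in range(n)] for i in range(n)], k)
        assert merged == full
    return True
```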
Fix , and denote by the nodes numbered in each . The nodes in perform a binary search for each row of , to determine the cutoff for the smallest element on that row. The leaders broadcast to each other , using Lemma B.1 (Routing), taking rounds. Each leader broadcasts the tokens it received to , using Corollary B.9 (Group Broadcasting and Aggregating), within rounds. Now, the nodes all know .
The nodes proceed in iterations to perform concurrent binary searches on the threshold values for each row in . Let be an arbitrarily chosen leader node for . In every iteration, node broadcasts values, each a tentative threshold value for a row in , using Corollary B.9 (Group Broadcasting and Aggregating), and in response, aggregates from the nodes of , using Corollary B.9 (Group Broadcasting and Aggregating), the total number of entries in each row of below, equal to, and above the queried threshold. Each such iteration takes rounds, and after iterations of this procedure, all the nodes in know a threshold for every row they possess, informing them which values in can be thrown out.
Define the matrices by removing from the entries which are thrown away due to the thresholds.
Step: Balancing the Intermediate Representation
The nodes computed the matrices , yet some matrices may be too dense to transport out of the nodes which locally hold them. Even though we sparsified the matrices above, the sparsification steps were performed on several matrices at once, and thus we can still have single matrices which remain very dense. Let be such a very dense matrix. We overcome this challenge by having more nodes compute from scratch, allowing each node to only take responsibility for retaining a part of .
For each , we pool the nodes and redistribute them such that areas in the matrix which are too dense get more nodes.
Each node from computes the number of finite values in , denoted as . Node computes using on within rounds due to Corollary B.9 (Group Broadcasting and Aggregating). Finally, broadcasts to the other leader nodes the values , and then broadcasts to all the values that it received – taking rounds due to Lemma B.1 (Routing) and Corollary B.9 (Group Broadcasting and Aggregating).
Due to the fact that each row of has at most elements in total, and currently is distributed across matrices which need to be summed, all the nodes hold at most elements. That is, the sum is at most . Each node from observes and locally breaks it up into matrices which sum up to and each have at most finite elements. This creates at most such matrices. That is, . Thus, the total number of intermediate matrices, spread over the nodes , is at most . Each from is allocated other nodes from , called auxiliary nodes, in order to send them the matrices . Notice that since all nodes in know all the values , all the nodes can locally know which node in is allocated to help which other node in . However, node cannot send all of this information to all its auxiliary nodes; instead, it sends to each of its auxiliary nodes the data it received from and with which it computed the matrix , and the thresholds which it used to turn into . This can be accomplished via Lemma B.10 (Group Multicasting) within rounds, since each node wishes to multicast at most data (as this is the bound on the total data each node received from and ), to at most other nodes, and each node desires to receive multicast messages only from one other node.
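Since all nodes know all the counts, the allocation of auxiliary nodes can be computed locally and deterministically. The following minimal sketch, with illustrative names, breaks each holder's count into pieces of bounded size and assigns one distinct auxiliary index per piece, in an order every node can recompute on its own.

```python
import math

def split_workload(entry_counts, bound):
    """entry_counts: dict holder -> number of finite entries it currently holds.
    Returns (tasks, assignment): tasks is a list of (holder, piece_index, piece_size)
    with piece_size <= bound; assignment maps each task to a distinct auxiliary index
    0, 1, 2, ... in a globally consistent order, so no communication is needed to agree on it."""
    tasks = []
    for holder in sorted(entry_counts):
        count = entry_counts[holder]
        pieces = max(1, math.ceil(count / bound))
        for p in range(pieces):
            size = max(min(bound, count - p * bound), 0)
            tasks.append((holder, p, size))
    assignment = {task: aux for aux, task in enumerate(tasks)}
    return tasks, assignment
```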
Step: Summation
We send data from to to perform the final summation step. Node learns all of the information held in the nodes in which pertains to .
All the nodes in know , and vice versa, and can thus communicate. Each node in wishes to receive data from , and likewise, every node in wishes to send data. Thus, this communication is executed using Lemma B.1 (Routing) in rounds. Upon receiving the data, each node computes the at most smallest entries of . Now, is stored in a partial carrier configuration (with every node having ), and thus we invoke Lemma B.18 (Partial Configuration Completion), within rounds, to ensure that is stored in a carrier configuration . Notice that , and so we are within the stated round complexity.
∎
4 Breaking Below in
We show a approximation for weighted SSSP in the model within rounds, further implying that it is possible to compute a approximation for the weighted diameter in this number of rounds. We achieve this by combining a simulation of our model and of the model. Roughly speaking, this incorporates density awareness in addition to the sparsity awareness discussed thus far.
A key approach in previous distance computations in the model [8, 49, 18] is to construct an overlay skeleton graph, and show that solving distance problems on such skeleton graphs can be extended to solutions on the entire graph. In a nutshell, given a graph and some constant , a skeleton graph is generated by letting every node in independently join with probability . Two skeleton nodes in have an edge in if there exists a path between them in of at most hops. In particular, the nodes of are well spaced in the graph and satisfy a variety of useful properties. The central distance related property is that every pair of far enough nodes in has skeleton nodes at predictable intervals on some shortest path between them.
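Stated centrally, the skeleton construction is simple; the following Python sketch samples skeleton nodes with a given probability and connects two sampled nodes whenever an at-most-h-hop path exists between them, weighting the edge by that hop-bounded distance. The parameter names and the injected hop-distance oracle are illustrative; this is intuition only, not the distributed construction.

```python
import math, random

def build_skeleton(nodes, hop_dist, p, seed=0):
    """nodes: list of node ids; hop_dist(u, v): weight of the lightest path with at most
    h hops between u and v in the input graph (math.inf if no such path exists);
    p: sampling probability. Returns (skeleton_nodes, skeleton_edges), where
    skeleton_edges maps a pair (u, v) to the hop-bounded distance defining its weight."""
    rng = random.Random(seed)
    skeleton_nodes = [v for v in nodes if rng.random() < p]
    skeleton_edges = {}
    for i, u in enumerate(skeleton_nodes):
        for v in skeleton_nodes[i + 1:]:
            d = hop_dist(u, v)
            if d < math.inf:  # an edge exists iff some path of at most h hops exists
                skeleton_edges[(u, v)] = d
    return skeleton_nodes, skeleton_edges
```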
Given such a skeleton graph, , with , previous work showed that it is possible in the model to let the nodes in take control over the other nodes in the graph and use all the global bandwidth available to perform messaging between nodes in . In essence, after pre-processing rounds, the global bandwidth available to the entire graph in each round of the model is utilized such that every node in can send and receive messages per round from any other node in , in an amortized manner (in rounds, each node in can send and receive messages to other nodes in ). For a given , we denote by the set of helper nodes of which contribute their global communication capacity to ; it is guaranteed that all nodes in are at most hops away from in .
A formal definition of skeleton graphs encapsulating all of the above, is in Definition 20 in Section C.1.2. These skeleton graphs are built upon nodes randomly selected from the input graph , and thus the number of edges in correlates to the density of neighborhoods in – the graph is either sparse or dense, depending on . We split into cases according to the sparsity of , which can be computed using known techniques from the and models.
Sparse : A hurdle that stands in the way of going below rounds is that one must choose , in order for the -round pre-processing step to not exceed the goal complexity. However, in order to use the previously shown routing techniques, the identifiers of the nodes in must be globally known, a task which can be shown to take rounds. This leads to an anonymity issue – letting nodes in communicate with one another although no node in the graph knows the identifiers of all the nodes in .
We overcome this anonymity problem by showing a routing algorithm which allows messaging over without assuming that the identifiers of the nodes in are globally known. This allows us to simulate the model over . By simulating algorithms from the model on , we directly get a -approximation for SSSP in rounds, with the exact round complexity dependent on the sparsity of . However, as approaches having edges, the round complexity of the simulated model algorithms approaches . As such, using the techniques so far, we solve all cases except for very dense skeleton graphs .
Dense : To tackle a dense , we present a density aware simulation of the model over . The model is simulated over in [8] within rounds. Our observation is that as is more dense, broadcasting messages in the input graph can be made more efficient. In essence, as is more dense, neighborhoods in the original input graph are closely packed, and so when a node receives some message, it can efficiently share it with many nodes in its neighborhood. With this in hand, we can simulate the algorithm from [11, Theorem 8] for approximate SSSP very quickly on dense skeleton graphs.
Tying up the pieces: Each simulation result by itself is insufficient, as in the extreme cases each solution takes rounds. Yet, by combining them and using each when it is better, based on the sparsity of , we achieve the resulting round algorithm, for all graphs.
The outline of the rest of this section: We first perform routing over skeleton graphs where the receivers of messages do not know which nodes desire to send them messages (Section 4.1). In Section 4.2, we simulate the model in the model. Next, we state that the model can be simulated in the model, and defer the proof to Appendix C. Finally, in Section 4.3, we combine our various algorithms to yield the SSSP approximation result, from which the weighted diameter approximation result also follows.
4.1 Oblivious Token Routing
The following claim shows how to route unicast messages inside a skeleton graph, and is based on Lemma C.11 (Oblivious Token Routing), shown in Section C.2.
Claim 4.1 (Skeleton Unicast).
Given a graph , a skeleton graph , and a set of messages between the nodes of , s.t. each is the sender and receiver of at most messages, and each message is initially known to its sender, it is possible to route all given messages within rounds of the model.
Claim 4.1 (Skeleton Unicast) is an extremely important requirement for showing our round algorithms. Previously, [49] required that every node knows how many messages every other node intends to send to it. In turn, this would require that for each , all the other nodes in know the identifier of . But the latter can be shown to take rounds, since (as elaborated above). Therefore, the necessity of our strengthened claim follows.
4.2 and Simulations in
Theorem 4.2 ( Simulation in ).
Consider a graph , and a skeleton graph of , for some constant . Let be an algorithm which runs in the model with capacity over in rounds. Then there exists an algorithm which simulates within rounds of the model over , w.h.p. Further, it is ensured that at the start of the simulation, every node in knows the communication tokens of all its neighbors in .
Proof of Theorem 4.2.
We show that we can instantiate the model over and then show how to simulate each round.
In order to instantiate the model over the nodes , the model definition asserts that every node has an identifier in as well as a communication token whose knowledge enables other nodes to communicate with . In order to ensure the condition related to identifiers is satisfied, we invoke Claim C.7 (Unique IDs) over and in rounds. For the second condition, each node uses its original identifier in the model as its communication token in the model, which enables us to use Claim 4.1 (Skeleton Unicast) in order to later simulate the rounds of the model. Finally, notice that it holds that every node in knows the communication tokens of all its neighbors in , as the communication tokens are the model identifiers.
In each round of the model, each node sends/receives at most messages to/from any other node, such that for each message it either knows the communication token of the recipient or the recipient is chosen independently, uniformly at random. We split each round of the model into two phases. First, we route the messages which use communication tokens, and second, we route those messages destined to random nodes. Since the communication tokens in the model are the identifiers from the model, the first phase is implemented straightforwardly using Claim 4.1 (Skeleton Unicast), taking rounds of the model over , as required.
For the second phase, denote by the messages which node desires to send to random targets. Node selects, uniformly and independently, random nodes from , where each node is the target for a different message in – from here on, we assume that a message in has the identifier of a random node in attached to it. For each helper , node assigns messages from , denoted by , to . Within rounds, node uses the local edges of the model in order to inform about . Next, each node uses the global edges of the model in order to send the messages to their targets in , within rounds, due to Claim C.4 (Uniform Sending). Finally, given that a node received a message in this way, node selects uniformly and independently at random a node such that and forwards that message . Notice that it can be the case that a certain does not help any node, and thus there does not exist a node such that – in such a case, will report back to whichever node sent it the message saying that it cannot forward it. However, the definition of a skeleton graph promises that the total number of helper nodes of nodes in is at least (Property 7d of Definition 20 (Extended Skeleton Graph)). As every node assists at most other nodes, this implies that at least a poly-logarithmic fraction of the nodes assist other nodes, and so if we repeat the above process and resend messages that bounced back, then within iterations of the above algorithm, w.h.p., all messages are forwarded.
Notice that the above methodology does not produce a uniform distribution of receivers over the nodes , since there might be some node such that all of only help , and a node such that all the nodes also help other nodes in (thus is less likely than to receive a random message). This happens even though each node only helps at most other nodes and the number of nodes in is exactly the same for each . Thus, we augment the probabilities with which a node accepts a message. Each node observes its helpers and computes the probability, denoted , that given that a uniformly chosen node received a message, forwards the message to . Now, the nodes utilize an Aggregate and Broadcast tool of [7, Theorem 2.2] (see Claim C.1 (Aggregate and Broadcast)) in order to compute . Every node now accepts each message it received with probability , independently. In the case that a node rejects a message, it notifies the original sender of the message that the message was rejected – this is done by sending messages in the reverse direction to the way they were sent previously. In the case that a node hears that a message it sent was rejected, it will attempt to resend it by repeating the entire algorithm above. As every node helps at most nodes , it holds that , and so within iterations of the above algorithm, w.h.p., every message will be successfully delivered. ∎
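The acceptance-probability correction at the end of the proof is a standard rejection-sampling step, which can be checked in isolation: if a skeleton node would receive a forwarded message with probability q_v, then accepting with probability q_min/q_v makes every accepted delivery equally likely across skeleton nodes. The following simulation sketch, with illustrative names, restricts attention to the helper nodes themselves for simplicity (in the model, a random node might help no one and the message is simply resent).

```python
import random
from collections import Counter

def simulate_acceptance(helpers_of, trials=100000, seed=1):
    """helpers_of: dict skeleton node -> list of helper nodes (helpers may be shared).
    A message lands on a uniformly random helper, which forwards it to a uniformly
    random skeleton node it helps; that node accepts with probability q_min / q_v,
    which equalizes the acceptance distribution over skeleton nodes."""
    rng = random.Random(seed)
    all_helpers = sorted({h for hs in helpers_of.values() for h in hs})
    helps = {h: [v for v, hs in helpers_of.items() if h in hs] for h in all_helpers}
    # q_v: probability that a message forwarded by a uniformly random helper reaches v.
    q = {v: sum(1 / len(helps[h]) for h in hs) / len(all_helpers)
         for v, hs in helpers_of.items()}
    q_min = min(q.values())
    accepted = Counter()
    for _ in range(trials):
        h = rng.choice(all_helpers)
        v = rng.choice(helps[h])
        if rng.random() < q_min / q[v]:
            accepted[v] += 1
    return accepted  # counts should be nearly equal across skeleton nodes
```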
The following is proven in Appendix C.
Theorem 4.3 ( Simulation in ).
Given a graph , a skeleton graph , for some constant , with average degree , and an algorithm in the model which runs on in rounds, it is possible to simulate by executing rounds of the model on . This assumes that prior to running , each node has at most bits of input used in , including, potentially, the incident edges of in , and that the output of each node in is at most bits.
4.3 A -Approximation for SSSP
We show an round algorithm for a -approximation of weighted SSSP in the model. We begin by showing how to use the algorithm from Theorem B.25 (-Approximation for SSSP in (Wrapper)) in the model for sparse skeleton graphs, with low maximal degree and even lower average degree. We then show an algorithm in the model for graphs with low average degree, yet high maximal degree, and finally we show how to use the algorithm from [11, Theorem 8] in the model for dense graphs, with high average degree. Combining these claims, in the proof of Theorem 1.2 (-Approximation for SSSP in ), gives the desired result.
Lemma 4.4 (SSSP with Low Average and Maximal Degrees).
Given a weighted input graph , and a skeleton graph , s.t. the average and maximal degrees in are and , respectively, a value , and a source node , ensures that every node in knows a -approximation to its distance from over the edges , within rounds in the model, w.h.p.
Proof of Lemma 4.4.
Use Theorem 4.2 ( Simulation in ) to simulate in the model over the SSSP algorithm from Theorem B.25 (-Approximation for SSSP in (Wrapper)) over . Set capacity for a round complexity of . ∎
Theorem B.25 (-Approximation for SSSP in (Wrapper)) depends on the maximal degree, and so applying it to a skeleton graph with high maximal degree is inefficient. Thus, for sparse skeleton graphs with a high maximal degree, we show a algorithm ensuring that one node in the skeleton graph can learn all of the skeleton graph and inform the nodes of their desired outputs. The proof of the following appears in Appendix C.
Lemma 4.5 (SSSP with Low Average and High Maximal Degrees).
Given a weighted input graph , and a skeleton graph , for some constant , such that the average and maximal degrees in are at most and at least , respectively, and a source node , ensures that every node in knows its distance from over the edges , within rounds in the model, w.h.p.
We use our efficient simulation for dense skeleton graphs.
Lemma 4.6 (SSSP with High Average Degree).
Given a weighted, undirected input graph , a skeleton graph , for some constant , such that the average degree in is at least , a value , and a source node , ensures that every node in knows a -approximation to its distance from over the edges , within rounds in the model, w.h.p.
Proof of Lemma 4.6.
Theorem 4.3 ( Simulation in ) simulates the SSSP approximation algorithm of [11, Theorem 8] on . As the complexity of [11, Theorem 8] is rounds of the model, the simulation takes rounds of the model. ∎
Finally, we combine Lemmas 4.4, 4.5 and 4.6.
See Theorem 1.2.
Proof of Theorem 1.2.
Denote . Construct a skeleton graph in rounds using Corollary C.6 (Construct Skeleton). Notice that Corollary C.6 can ensure that the source is also in . Using Claim C.1 (Aggregate and Broadcast), compute and the maximal degree in , the value , in rounds.
First, ensure every node in knows a -approximation to its distance from over the edge set only. Thus, selectively deploy Lemmas 4.6, 4.4 and 4.5, as follows. If , invoke Lemma 4.6 in rounds. Otherwise, , and so split the algorithm into two cases, according to . If , invoke Lemma 4.4, in rounds, and otherwise invoke Lemma 4.5 in rounds.
Finally, due to Property 4 of the definition of a skeleton graph, we invoke Claim C.10 (Extend Distances), in rounds, to ensure every knows a -approximation to its distance from in . ∎
Theorem 1.2 (-Approximation for SSSP in ) with and Claim C.9 (Diameter from SSSP) give the following.
Theorem 4.7 (-Approximation for Weighted Diameter in ).
It is possible to compute a -approximation for weighted diameter in rounds in the model, w.h.p.
5 Fast Distance Computations in the Model
In Section 5.1 we show how to simulate the model in . We then employ our simulation to simulate the model in (Section 5.2), and to derive our distance computations in (Section 5.3).
5.1 Simulation in
Key Principle. In the model, each node can send or receive messages in each round, implying a total bandwidth of messages per round. We aim to build efficient distance tools which utilize this bandwidth completely. Consider the model, which also has a bandwidth of messages per round. Potentially, by comparing the bandwidths, it could be possible to simulate one round of the model in a single round of the model. However, in the model, nodes with degree cannot send messages in a single round, regardless of how an algorithm tries to do this.
To overcome this problem, we notice that the bandwidth of the nodes with degree at least is at least . Thus, the key principle we show is that the high degree nodes, denoted by , can learn all of the input of the low degree nodes, denoted by . The higher the degree of a node, the more nodes it simulates. Formally, we create an assignment where for every , the node simulates , and we ensure that is globally known.
We then simulate the model using only the nodes in , and finally we send back the resulting output to the nodes in . To allow all-to-all communication between the nodes in , we use the routing algorithms developed in [38, 20] and stated in Appendix D and pay an overhead of rounds for the simulation.
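A minimal Python sketch of this assignment principle follows, under illustrative choices: taking the low/high-degree threshold to be half the average degree and giving each high-degree node a quota proportional to its degree is one concrete choice for which the total quota provably covers all low-degree nodes of a connected graph; the exact threshold and quota used by our algorithm may differ.

```python
import math

def simulation_assignment(degrees):
    """degrees: dict node -> degree (sum of degrees = 2m, graph assumed connected).
    A node is 'low degree' if its degree is below half the average degree; every
    low-degree node is assigned to a high-degree simulator, and a simulator of
    degree d handles O(d / avg_degree) low-degree nodes."""
    avg = sum(degrees.values()) / len(degrees)
    threshold = avg / 2
    low = sorted(v for v, d in degrees.items() if d < threshold)
    high = sorted(v for v, d in degrees.items() if d >= threshold)
    # Quota of a high-degree node is proportional to its degree; total quota exceeds n.
    quota = {v: math.ceil(degrees[v] / threshold) for v in high}
    assignment, queue = {}, list(low)
    for v in high:
        for _ in range(quota[v]):
            if not queue:
                break
            assignment[queue.pop()] = v
    return assignment  # low-degree node -> simulating high-degree node
```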
Some problems, such as Scattered APSP, require the output to be stored on some node, but do not specify on which. We adapt for this case by allowing nodes to produce an auxiliary output. After the simulation is over, every node , given the communication token of any node can compute the identifier of node where the auxiliary output of is stored.
Carrier Configuration and Communication Tokens. In addition to simulating the model in the model, we let every node know the communication token of its neighbors as well as construct a carrier configuration directly in the model. This benefits some graph problems greatly.
Note that in the simulation in the model, the carrier configuration is constructed in the model itself. However, in the case of the model, we cannot delegate this task efficiently to the model since building a carrier configuration requires every node to be able to send its incident edges to arbitrary nodes in the graph. Doing so takes rounds in the model, yet only rounds in .
However, building a carrier configuration in the model is also not directly possible, as a low degree carrier might learn edges of other nodes, which could take rounds. Therefore, instead of each node sending its edges directly to its carriers, it sends them to the nodes which simulate its carriers.
Overall this might sound like a back-and-forth process, as we simulate low degree nodes by high degree nodes and then split the edges of the high degree nodes among nodes which may be simulated nodes. However, using the model grants us the modular approach we aim for.
Supergraphs. Theorem B.19 (Hopset Construction) constructs hopsets and Theorem B.28 computes Scattered APSP in the model, and both require the input graph to have edges (this assumption can be removed at the expense of more complicated proofs, yet would not imply any speed-up for our end results). We aim to apply those results on general graphs and so augment with added nodes and added edges while preserving distances between the nodes of and ensuring that all added edges are globally known. We call the resulting graph an -supergraph, build a carrier configuration holding , and apply the algorithms on .
Definition 6.
Given a weighted graph with and , and a number , a weighted graph is an -supergraph of , if is obtained from by adding new nodes and adding an infinite weight edge between every added node and every original node.
Clearly, a supergraph preserves the original distances.
We now simulate the model in the model.
Theorem 5.1 ( Simulation in ).
Consider a graph and some constant . Let be some number s.t. . Let be the average degree of and let be an algorithm which runs in the model over in rounds. For each denote by and the number of bits in the input and output of the node , respectively. Let and be the input and output capacities: the minimum number of rounds required for any node to send or receive its input or output, respectively.
There exists an algorithm which simulates within rounds of the model over . The above works even if requires a carrier configuration of (an -supergraph of ) or communication tokens of neighbors as an input. Furthermore, each node might produce some unbounded auxiliary output, in which case the output is known to some (not necessarily the same) node such that each node can compute the identifier of given the communication token of .
Proof of Theorem 5.1.
Identifiers: First, compute in rounds. By the definition of the model, each node has an identifier , denoted the original identifier. Use \IfAppendixLABEL:\next ( (Identifiers).)Corollary D.2 to compute a set of new identifiers . We abuse notation and denote by the degree of node with identifier . Each node locally computes for each new identifier its value . A node with degree less than is a low degree node ; otherwise, it is a high degree node . According to the properties of , the new identifiers of nodes in are smaller than those of nodes in . We abuse notation and treat and as sets of new identifiers. For each denote and for each denote .
Simulation Assignment: The numbers and the sets satisfy the conditions of \IfAppendixLABEL:\next ( (Assignment).)Claim D.4, since . Hence, there is a partition of into sets , satisfying . The node simulates the nodes in , and for simplicity it also simulates itself. Denote by the new identifier of the node simulating the node with new identifier .
Input: For every , we now deliver the information node requires to simulate nodes in . Each sends to its neighbors its new identifier . As knows its new identifier , it also knows the new identifier of . Using \IfAppendixLABEL:\next ( ( Routing).)Claim D.1, sends to the node its new and original identifier together with the new and original identifiers of its neighbors and input. A node , for each neighbor of node , can now locally compute the new identifier of . This invokes the algorithm from \IfAppendixLABEL:\next ( ( Routing).)Claim D.1 at most times. In each invocation, each sends at most messages and each node receives at most messages, as required.
Instantiation: We now instantiate the model. As a communication token of node in the model, we use the concatenation of , and . Clearly, identifiers are unique in and communication tokens are unique and of size bits. While pre-processing, we already guaranteed that the node which simulates knows the communication tokens of all neighbors of . Now the new identifier assignment and simulation assignment satisfy the demands of \IfAppendixLABEL:\next ( (Build Carrier Configurations in ).)Claim D.3 and we use it to build the carrier configuration.
Round Simulation: During one round of the model, each node can send and receive at most messages. Each message is sent either to a random node or a node with a known communication token. We split into two phases. First, sending the messages to nodes with known communication tokens, and second sending the messages to random nodes.
In the first phase, we use the fact that the new identifier of the destination is a part of the communication token. Each node has to send and receive messages on behalf of all . Thus, node has to send and receive at most messages. It is therefore enough to invoke the algorithm from \IfAppendixLABEL:\next ( ( Routing).)Claim D.1 for times to deliver all the messages w.h.p.
In the second phase, for each message, we choose a new identifier independently and uniformly. Since for node we know the new identifier of , we also know where to route the message to. As nodes sample uniformly at most messages, each new identifier is sampled at most times, w.h.p. Thus, the number of messages that a high degree node has to send or receive is at most . Hence, w.h.p., invocations of the algorithm from \IfAppendixLABEL:\next ( ( Routing).)Claim D.1 suffice.
Main Output: Finally, we send outputs back to the simulated nodes. This works in a similar manner, with a node splitting the output of each node into batches of size at most . Notice that since receives the identifiers of all neighbors of , it knows . Then, for rounds, each node uses \IfAppendixLABEL:\next ( ( Routing).)Claim D.1 to send one batch to each node it simulates.
Auxiliary Output: The auxiliary output that some node produces is stored in the node which simulates it. Since is a part of the communication token of , each node which knows the communication token of also knows .
Round Complexity: By \IfAppendixLABEL:\next ( (Identifiers).)Corollary D.2, computing new identifiers takes rounds. By \IfAppendixLABEL:\next ( ( Routing).)Claim D.1, sending the input requires rounds, the simulation of the rounds of the model requires rounds, and sending the output back takes rounds, w.h.p. By \IfAppendixLABEL:\next ( (Build Carrier Configurations in ).)Claim D.3, building the carrier configuration takes rounds w.h.p. Thus, the execution terminates in rounds, w.h.p. ∎
5.2 Faster Simulation
The model is, in a sense, a generalization of the model, directly implying the following.
Claim 5.2 ( Simulation in ).
There is an algorithm which executes one round of the model in the model in rounds, w.h.p.
Proof of Claim 5.2.
Initially, for rounds, each node sends its communication token and identifier to (not necessarily distinct) randomly sampled nodes. By Chernoff and union bounds, all nodes receive the identifiers and communication tokens of all other nodes w.h.p. For an additional rounds, the nodes use the learned communication tokens to deliver the messages they have. ∎
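The probabilistic step above is, at its core, a coupon-collector argument; the following small Python simulation (an illustration only, with arbitrary parameters of our choosing) shows that roughly n ln n random sends suffice for a single token to reach every node, matching the Chernoff and union bound reasoning.

    import math
    import random

    def sends_until_all_covered(n, rng):
        """Count random sends until every one of the n nodes has received the token."""
        covered, sends = set(), 0
        while len(covered) < n:
            covered.add(rng.randrange(n))
            sends += 1
        return sends

    rng = random.Random(0)
    n = 1000
    trials = [sends_until_all_covered(n, rng) for _ in range(20)]
    print("average sends:", sum(trials) / len(trials))
    print("n * ln(n)    :", round(n * math.log(n)))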
Combining \IfAppendixLABEL:\next ( ( Simulation in ).) and LABEL:\next ( ( Simulation in ).)Claims 5.2 and 5.1 implies a density aware simulation of the model in the model – the more edges the input graph has, the faster the simulation.
Theorem 1.3 ( Simulation in ).
Consider a graph , and an algorithm which runs in the model over in rounds. For each denote by and its number of input and output bits, respectively. Let and be the input and output capacities: the minimum number of rounds required for any node to send or receive its input or output, respectively. Then can be simulated within rounds of the model over , w.h.p.
5.3 Improved Distance Computations
We show an exact SSSP algorithm via a simulation of the algorithm from \IfAppendixLABEL:\next ( ( Simulation in ).)Theorem 5.1. See 1.4
Proof of Theorem 1.4.
The claim follows from simulating \IfAppendixLABEL:\next ( (Exact SSSP in ).)Theorem B.23 of over using \IfAppendixLABEL:\next ( ( Simulation in ).)Theorem 5.1, in rounds, w.h.p. ∎
Now, we define and approximate our first APSP relaxation.
Definition 7 (Shortest Path Query Problem).
Given an input graph , a query set is a set of source-destination pairs called queries. For each node , the source and destination loads, and , respectively, are the number of times appears as a source and destination in divided by its degree. The maximum over all source and destination loads is the query set load.
A shortest path query problem is a query set of size and load , s.t. every knows the identifier of . The goal is to answer all queries, that is, computes or approximates .
Given an input graph , an algorithm is a shortest path query algorithm if, after some pre-processing of rounds, given any query set of size and load , the algorithm solves the shortest path query problem within an additional number of rounds.
We follow the approach of [14] in order to design a -approximate shortest path query algorithm, using our methods from the model. For this, we use the following important tool, whose proof is deferred to Appendix D.
Lemma 5.3 (-nearest in ).
Given a graph , it is possible in the model, within rounds, w.h.p., to compute the distance from every node to every node which is one of the closest nodes to (with ties broken arbitrarily).
See 1.5
Proof of Theorem 1.5.
To approximate the distance , we compute , where is the closest node to in some globally known hitting set of all . Thus, during pre-processing, we ensure that each node knows for each .
Pre-processing: First, we execute the algorithm from Lemma 5.3 with to get the distance to each node in . Now, each node enters independently with probability . W.h.p., the set is of size and is a hitting set for each . Let . We compute -approximate MSSP from using invocations of the approximate SSSP algorithm [40, Corollary 5]. Each computes , which is the sampled node closest to , which exists in the set of its nearest neighbors w.h.p.
Computing -nearest takes rounds, w.h.p. The complexity of solving -SSP is rounds w.h.p. The overall complexity of the pre-processing is thus rounds, w.h.p.
Query: Whenever node needs to approximate its distance to , the node requests from the distance to , for which knows a approximation, , due to [40, Corollary 5]. The node approximates its distance to as . The approximation factor follows from \IfAppendixLABEL:\next ( (APSP using -nearest and MSSP).)Claim A.3. To execute the routing, the nodes invoke Claim D.1. The number of rounds for solving a query with load is w.h.p. ∎
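As a purely illustrative, centralized Python sketch (not the distributed algorithm; exact distances stand in for the approximate values, and the hitting set is fixed by hand), the query step adds the exact distance to the nearest hitting-set node to the distance from that node to the destination.

    INF = float("inf")

    def floyd_warshall(n, edges):
        """All-pairs distances of a small undirected weighted graph."""
        d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
        for u, v, w in edges:
            d[u][v] = min(d[u][v], w)
            d[v][u] = min(d[v][u], w)
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    d[i][j] = min(d[i][j], d[i][k] + d[k][j])
        return d

    n = 6
    edges = [(0, 1, 2), (1, 2, 3), (2, 3, 1), (3, 4, 4), (4, 5, 2), (0, 5, 10)]
    dist = floyd_warshall(n, edges)

    hitting_set = [2, 5]                                    # a hand-picked sample set
    nearest = {v: min(hitting_set, key=lambda s: dist[v][s]) for v in range(n)}

    def query(v, u):
        # the real algorithm only has approximate distances from the hitting set;
        # exact values are used here for simplicity
        return dist[v][nearest[v]] + dist[nearest[v]][u]

    print(query(0, 4), "vs. exact", dist[0][4])   # the estimate is never below the exact distance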
By denoting as the minimal degree in the graph, one gets load and , which implies our main result Theorem 1.1.
See 1.1
Finally, we use our simulation together with \IfAppendixLABEL:\next ( (-Approximation for Scattered APSP in ).)Theorem B.28 to obtain our Scattered APSP algorithm in the model.
See 1.6
Proof of Theorem 1.6.
Simulate \IfAppendixLABEL:\next ( (-Approximation for Scattered APSP in ).)Theorem B.28 using \IfAppendixLABEL:\next ( ( Simulation in ).)Theorem 5.1. Split the output of each node into two parts. The first part, , is bits encoding the communication tokens of the nodes which store the distances from . Thus, the output capacity is . The communication tokens decoded from allow to know where its distances are stored. The second part of the output of is the auxiliary output, which is the distances it knows. Thus, the algorithm completes in rounds, w.h.p. ∎
6 Discussion
We believe that additional problems in various fundamental distributed settings could be solvable using our infrastructure for sparsity aware computation. This is a broad open direction for further research.
With respect to the specific results shown here, a major goal would be to construct a sparser hopset in the model. Further, one could attempt to show sparse matrix multiplication algorithms which relax the assumption that the input matrices and output matrix are bounded by the same number of finite elements, as this could directly improve the complexity of our -SSP algorithm. Either of these improvements is likely to significantly reduce the round complexity of many of our end results, in both the and models.
Acknowledgements
This project was partially supported by the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement no. 755839. The authors would like to thank Michal Dory and Yuval Efron for various helpful conversations. We also thank Fabian Kuhn for sharing a preprint of [49] with us.
References
- [1] Amir Abboud, Keren Censor-Hillel, and Seri Khoury. Near-linear lower bounds for distributed distance computations, even in sparse networks. In Cyril Gavoille and David Ilcinkas, editors, Distributed Computing - 30th International Symposium, DISC 2016, September 27-29, 2016. Proceedings, volume 9888 of Lecture Notes in Computer Science, pages 29–42, Paris, France, 2016. Springer. doi:10.1007/978-3-662-53426-7_3.
- [2] Udit Agarwal and Vijaya Ramachandran. Distributed weighted all pairs shortest paths through pipelining. In 2019 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2019, May 20-24, 2019, pages 23–32, Rio de Janeiro, Brazil, 2019. IEEE. doi:10.1109/IPDPS.2019.00014.
- [3] Udit Agarwal and Vijaya Ramachandran. Faster deterministic all pairs shortest paths in congest model. In Christian Scheideler and Michael Spear, editors, SPAA ’20: 32nd ACM Symposium on Parallelism in Algorithms and Architectures, July 15-17, 2020, pages 11–21, Virtual Event, USA, 2020. ACM. doi:10.1145/3350755.3400256.
- [4] Udit Agarwal, Vijaya Ramachandran, Valerie King, and Matteo Pontecorvi. A deterministic distributed algorithm for exact weighted all-pairs shortest paths in Õ(n^{3/2}) rounds. In Calvin Newport and Idit Keidar, editors, Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, PODC 2018, July 23-27, 2018, pages 199–205, Egham, United Kingdom, 2018. ACM. doi:10.1145/3212734.3212773.
- [5] Bertie Ancona, Keren Censor-Hillel, Mina Dalirrooyfard, Yuval Efron, and Virginia Vassilevska Williams. Distributed distance approximation. In Quentin Bramas, Rotem Oshman, and Paolo Romano, editors, 24th International Conference on Principles of Distributed Systems, OPODIS 2020, December 14-16, 2020, volume 184 of LIPIcs, pages 30:1–30:17, Strasbourg, France (Virtual Conference), 2020. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.OPODIS.2020.30.
- [6] John Augustine, Keerti Choudhary, Avi Cohen, David Peleg, Sumathi Sivasubramaniam, and Suman Sourav. Distributed graph realizations. In 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 18-22, 2020, pages 158–167, New Orleans, LA, USA, 2020. IEEE. doi:10.1109/IPDPS47924.2020.00026.
- [7] John Augustine, Mohsen Ghaffari, Robert Gmyr, Kristian Hinnenthal, Christian Scheideler, Fabian Kuhn, and Jason Li. Distributed computation in node-capacitated networks. In Christian Scheideler and Petra Berenbrink, editors, The 31st ACM on Symposium on Parallelism in Algorithms and Architectures, SPAA 2019, June 22-24, 2019, pages 69–79, Phoenix, AZ, USA, 2019. ACM. doi:10.1145/3323165.3323195.
- [8] John Augustine, Kristian Hinnenthal, Fabian Kuhn, Christian Scheideler, and Philipp Schneider. Shortest paths in a hybrid network model. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, January 5-8, 2020, pages 1280–1299, Salt Lake City, UT, USA, 2020. SIAM. doi:10.1137/1.9781611975994.78.
- [9] Florent Becker, Antonio Fernández Anta, Ivan Rapaport, and Eric Rémila. Brief announcement: A hierarchy of congested clique models, from broadcast to unicast. In Chryssis Georgiou and Paul G. Spirakis, editors, Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, PODC 2015, July 21 - 23, 2015, pages 167–169, Donostia-San Sebastián, Spain, 2015. ACM. doi:10.1145/2767386.2767447.
- [10] Florent Becker, Antonio Fernández Anta, Ivan Rapaport, and Eric Rémila. The effect of range and bandwidth on the round complexity in the congested clique model. In Thang N. Dinh and My T. Thai, editors, Computing and Combinatorics - 22nd International Conference, COCOON 2016, August 2-4, 2016, Proceedings, volume 9797 of Lecture Notes in Computer Science, pages 182–193, Ho Chi Minh City, Vietnam, 2016. Springer. doi:10.1007/978-3-319-42634-1_15.
- [11] Ruben Becker, Andreas Karrenbauer, Sebastian Krinninger, and Christoph Lenzen. Near-optimal approximate shortest paths and transshipment in distributed and streaming models. In Andréa W. Richa, editor, 31st International Symposium on Distributed Computing, DISC 2017, October 16-20, 2017, volume 91 of LIPIcs, pages 7:1–7:16, Vienna, Austria, 2017. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2017.7.
- [12] Aaron Bernstein and Danupon Nanongkai. Distributed exact weighted all-pairs shortest paths in near-linear time. In Moses Charikar and Edith Cohen, editors, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, June 23-26, 2019, pages 334–342, Phoenix, AZ, USA, 2019. ACM. doi:10.1145/3313276.3316326.
- [13] Keren Censor-Hillel, Yi-Jun Chang, François Le Gall, and Dean Leitersdorf. Tight distributed listing of cliques. In Dániel Marx, editor, Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, January 10 - 13, 2021, pages 2878–2891, Virtual Conference, 2021. SIAM. doi:10.1137/1.9781611976465.171.
- [14] Keren Censor-Hillel, Michal Dory, Janne H. Korhonen, and Dean Leitersdorf. Fast approximate shortest paths in the congested clique. In Peter Robinson and Faith Ellen, editors, Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, PODC 2019, Canada, July 29 - August 2, 2019, pages 74–83, Toronto, ON, 2019. ACM. doi:10.1145/3293611.3331633.
- [15] Keren Censor-Hillel, François Le Gall, and Dean Leitersdorf. On distributed listing of cliques. In Yuval Emek and Christian Cachin, editors, PODC ’20: ACM Symposium on Principles of Distributed Computing, August 3-7, 2020, pages 474–482, Virtual Event, Italy, 2020. ACM. doi:10.1145/3382734.3405742.
- [16] Keren Censor-Hillel, Petteri Kaski, Janne H. Korhonen, Christoph Lenzen, Ami Paz, and Jukka Suomela. Algebraic methods in the congested clique. Distributed Comput., 32(6):461–478, 2019. doi:10.1007/s00446-016-0270-2.
- [17] Keren Censor-Hillel, Seri Khoury, and Ami Paz. Quadratic and near-quadratic lower bounds for the CONGEST model. In Andréa W. Richa, editor, 31st International Symposium on Distributed Computing, DISC 2017, October 16-20, 2017, volume 91 of LIPIcs, pages 10:1–10:16, Vienna, Austria, 2017. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2017.10.
- [18] Keren Censor-Hillel, Dean Leitersdorf, and Volodymyr Polosukhin. Distance computations in the hybrid network model via oracle simulations. In Markus Bläser and Benjamin Monmege, editors, 38th International Symposium on Theoretical Aspects of Computer Science, STACS 2021, March 16-19, 2021, volume 187 of LIPIcs, pages 21:1–21:19, Saarbrücken, Germany (Virtual Conference), 2021. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.STACS.2021.21.
- [19] Keren Censor-Hillel, Dean Leitersdorf, and Elia Turner. Sparse matrix multiplication and triangle listing in the congested clique model. Theor. Comput. Sci., 809:45–60, 2020. doi:10.1016/j.tcs.2019.11.006.
- [20] Yi-Jun Chang, Seth Pettie, and Hengjie Zhang. Distributed triangle detection via expander decomposition. In Timothy M. Chan, editor, Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, January 6-9, 2019, pages 821–840, San Diego, California, USA, 2019. SIAM. doi:10.1137/1.9781611975482.51.
- [21] Yi-Jun Chang and Thatchaphol Saranurak. Improved distributed expander decomposition and nearly optimal triangle enumeration. In Peter Robinson and Faith Ellen, editors, Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, PODC 2019, Canada, July 29 - August 2, 2019, pages 66–73, Toronto, ON, 2019. ACM. doi:10.1145/3293611.3331618.
- [22] Yi-Jun Chang and Thatchaphol Saranurak. Deterministic distributed expander decomposition and routing with applications in distributed derandomization. In 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, November 16-19, 2020, pages 377–388, Durham, NC, USA, 2020. IEEE. doi:10.1109/FOCS46700.2020.00043.
- [23] Shiri Chechik and Doron Mukhtar. Single-source shortest paths in the CONGEST model with improved bound. In Yuval Emek and Christian Cachin, editors, PODC ’20: ACM Symposium on Principles of Distributed Computing, August 3-7, 2020, pages 464–473, Virtual Event, Italy, 2020. ACM. doi:10.1145/3382734.3405729.
- [24] Edith Cohen. Polylog-time and near-linear work approximation scheme for undirected shortest paths. J. ACM, 47(1):132–166, 2000. doi:10.1145/331605.331610.
- [25] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, 3rd Edition. MIT Press, 2009. URL: http://mitpress.mit.edu/books/introduction-algorithms.
- [26] Yong Cui, Hongyi Wang, and Xiuzhen Cheng. Channel allocation in wireless data center networks. In INFOCOM 2011. 30th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, 10-15 April 2011, pages 1395–1403, Shanghai, China, 2011. IEEE. doi:10.1109/INFCOM.2011.5934925.
- [27] Michael Dinitz, Michael Schapira, and Gal Shahaf. Approximate moore graphs are good expanders. J. Comb. Theory, Ser. B, 141:240–263, 2020. doi:10.1016/j.jctb.2019.08.003.
- [28] Michal Dory and Merav Parter. Exponentially faster shortest paths in the congested clique. In Yuval Emek and Christian Cachin, editors, PODC ’20: ACM Symposium on Principles of Distributed Computing, August 3-7, 2020, pages 59–68, Virtual Event, Italy, 2020. ACM. doi:10.1145/3382734.3405711.
- [29] Andrew Drucker, Fabian Kuhn, and Rotem Oshman. On the power of the congested clique model. In Magnús M. Halldórsson and Shlomi Dolev, editors, ACM Symposium on Principles of Distributed Computing, PODC ’14, July 15-18, 2014, pages 367–376, Paris, France, 2014. ACM. doi:10.1145/2611462.2611493.
- [30] Talya Eden, Nimrod Fiat, Orr Fischer, Fabian Kuhn, and Rotem Oshman. Sublinear-time distributed algorithms for detecting small cliques and even cycles. In Jukka Suomela, editor, 33rd International Symposium on Distributed Computing, DISC 2019, October 14-18, 2019, volume 146 of LIPIcs, pages 15:1–15:16, Budapest, Hungary, 2019. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2019.15.
- [31] Michael Elkin. Distributed exact shortest paths in sublinear time. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Canada, June 19-23, 2017, pages 757–770, Montreal, QC, 2017. ACM. doi:10.1145/3055399.3055452.
- [32] Michael Feldmann, Kristian Hinnenthal, and Christian Scheideler. Fast hybrid network algorithms for shortest paths in sparse graphs. In Quentin Bramas, Rotem Oshman, and Paolo Romano, editors, 24th International Conference on Principles of Distributed Systems, OPODIS 2020, December 14-16, 2020, volume 184 of LIPIcs, pages 31:1–31:16, Strasbourg, France (Virtual Conference), 2020. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.OPODIS.2020.31.
- [33] Sebastian Forster and Danupon Nanongkai. A faster distributed single-source shortest paths algorithm. In Mikkel Thorup, editor, 59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, October 7-9, 2018, pages 686–697, Paris, France, 2018. IEEE Computer Society. doi:10.1109/FOCS.2018.00071.
- [34] Silvio Frischknecht, Stephan Holzer, and Roger Wattenhofer. Networks cannot compute their diameter in sublinear time. In Yuval Rabani, editor, Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, January 17-19, 2012, pages 1150–1162, Kyoto, Japan, 2012. SIAM. doi:10.1137/1.9781611973099.91.
- [35] François Le Gall. Further algebraic algorithms in the congested clique model and applications to graph-theoretic problems. In Cyril Gavoille and David Ilcinkas, editors, Distributed Computing - 30th International Symposium, DISC 2016, September 27-29, 2016. Proceedings, volume 9888 of Lecture Notes in Computer Science, pages 57–70, Paris, France, 2016. Springer. doi:10.1007/978-3-662-53426-7_5.
- [36] Mohsen Ghaffari and Bernhard Haeupler. Distributed algorithms for planar networks I: planar embedding. In George Giakkoupis, editor, Proceedings of the 2016 ACM Symposium on Principles of Distributed Computing, PODC 2016, July 25-28, 2016, pages 29–38, Chicago, IL, USA, 2016. ACM. doi:10.1145/2933057.2933109.
- [37] Mohsen Ghaffari and Bernhard Haeupler. Distributed algorithms for planar networks II: low-congestion shortcuts, mst, and min-cut. In Robert Krauthgamer, editor, Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, January 10-12, 2016, pages 202–219, Arlington, VA, USA, 2016. SIAM. doi:10.1137/1.9781611974331.ch16.
- [38] Mohsen Ghaffari, Fabian Kuhn, and Hsin-Hao Su. Distributed MST and routing in almost mixing time. In Elad Michael Schiller and Alexander A. Schwarzmann, editors, Proceedings of the ACM Symposium on Principles of Distributed Computing, PODC 2017, July 25-27, 2017, pages 131–140, Washington, DC, USA, 2017. ACM. doi:10.1145/3087801.3087827.
- [39] Mohsen Ghaffari and Jason Li. Improved distributed algorithms for exact shortest paths. In Ilias Diakonikolas, David Kempe, and Monika Henzinger, editors, Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, June 25-29, 2018, pages 431–444, Los Angeles, CA, USA, 2018. ACM. doi:10.1145/3188745.3188948.
- [40] Mohsen Ghaffari and Jason Li. New distributed algorithms in almost mixing time via transformations from parallel algorithms. In Ulrich Schmid and Josef Widder, editors, 32nd International Symposium on Distributed Computing, DISC 2018, October 15-19, 2018, volume 121 of LIPIcs, pages 31:1–31:16, New Orleans, LA, USA, 2018. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2018.31.
- [41] Ofer Grossman, Seri Khoury, and Ami Paz. Improved hardness of approximation of diameter in the CONGEST model. In Hagit Attiya, editor, 34th International Symposium on Distributed Computing, DISC 2020, October 12-16, 2020, volume 179 of LIPIcs, pages 19:1–19:16, Virtual Conference, 2020. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2020.19.
- [42] Kai Han, Zhiming Hu, Jun Luo, and Liu Xiang. RUSH: routing and scheduling for hybrid data center networks. In 2015 IEEE Conference on Computer Communications, INFOCOM 2015, April 26 - May 1, 2015, pages 415–423, Kowloon, Hong Kong, 2015. IEEE. doi:10.1109/INFOCOM.2015.7218407.
- [43] Vipul Harsh, Sangeetha Abdu Jyothi, Inderdeep Singh, and Philip Brighten Godfrey. Expander datacenters: From theory to practice. CoRR, abs/1811.00212, 2018. URL: http://arxiv.org/abs/1811.00212, arXiv:1811.00212.
- [44] Monika Henzinger, Sebastian Krinninger, and Danupon Nanongkai. A deterministic almost-tight distributed algorithm for approximating single-source shortest paths. In Daniel Wichs and Yishay Mansour, editors, Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, June 18-21, 2016, pages 489–498, Cambridge, MA, USA, 2016. ACM. doi:10.1145/2897518.2897638.
- [45] Stephan Holzer and Nathan Pinsker. Approximation of distances and shortest paths in the broadcast congest clique. In Emmanuelle Anceaume, Christian Cachin, and Maria Gradinariu Potop-Butucaru, editors, 19th International Conference on Principles of Distributed Systems, OPODIS 2015, December 14-17, 2015, volume 46 of LIPIcs, pages 6:1–6:16, Rennes, France, 2015. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.OPODIS.2015.6.
- [46] Stephan Holzer and Roger Wattenhofer. Optimal distributed all pairs shortest paths and applications. In Darek Kowalski and Alessandro Panconesi, editors, ACM Symposium on Principles of Distributed Computing, PODC ’12, July 16-18, 2012, pages 355–364, Funchal, Madeira, Portugal, 2012. ACM. doi:10.1145/2332432.2332504.
- [47] He Huang, Xiangke Liao, Shanshan Li, Shaoliang Peng, Xiaodong Liu, and Bin Lin. The architecture and traffic management of wireless collaborated hybrid data center network. In Dah Ming Chiu, Jia Wang, Paul Barford, and Srinivasan Seshan, editors, ACM SIGCOMM 2013 Conference, SIGCOMM’13, August 12-16, 2013, pages 511–512, Hong Kong, China, 2013. ACM. doi:10.1145/2486001.2491724.
- [48] Taisuke Izumi, François Le Gall, and Frédéric Magniez. Quantum distributed algorithm for triangle finding in the CONGEST model. In Christophe Paul and Markus Bläser, editors, 37th International Symposium on Theoretical Aspects of Computer Science, STACS 2020, March 10-13, 2020, volume 154 of LIPIcs, pages 23:1–23:13, Montpellier, France, 2020. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.STACS.2020.23.
- [49] Fabian Kuhn and Philipp Schneider. Computing shortest paths and diameter in the hybrid network model. In Yuval Emek and Christian Cachin, editors, PODC ’20: ACM Symposium on Principles of Distributed Computing, August 3-7, 2020, pages 109–118, Virtual Event, Italy, 2020. ACM. doi:10.1145/3382734.3405719.
- [50] Christoph Lenzen and Boaz Patt-Shamir. Fast routing table construction using small messages: extended abstract. In Dan Boneh, Tim Roughgarden, and Joan Feigenbaum, editors, Symposium on Theory of Computing Conference, STOC’13, June 1-4, 2013, pages 381–390, Palo Alto, CA, USA, 2013. ACM. doi:10.1145/2488608.2488656.
- [51] Christoph Lenzen and Boaz Patt-Shamir. Fast partial distance estimation and applications. In Chryssis Georgiou and Paul G. Spirakis, editors, Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, PODC 2015, July 21 - 23, 2015, pages 153–162, Donostia-San Sebastián, Spain, 2015. ACM. doi:10.1145/2767386.2767398.
- [52] Christoph Lenzen and David Peleg. Efficient distributed source detection with limited bandwidth. In Panagiota Fatourou and Gadi Taubenfeld, editors, ACM Symposium on Principles of Distributed Computing, PODC ’13, Canada, July 22-24, 2013, pages 375–382, Montreal, QC, 2013. ACM. doi:10.1145/2484239.2484262.
- [53] Jason Li and Merav Parter. Planar diameter via metric compression. In Moses Charikar and Edith Cohen, editors, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, June 23-26, 2019, pages 152–163, Phoenix, AZ, USA, 2019. ACM. doi:10.1145/3313276.3316358.
- [54] Danupon Nanongkai. Distributed approximation algorithms for weighted shortest paths. In David B. Shmoys, editor, Symposium on Theory of Computing, STOC 2014, May 31 - June 03, 2014, pages 565–573, New York, NY, USA, 2014. ACM. doi:10.1145/2591796.2591850.
- [55] Merav Parter. Distributed planar reachability in nearly optimal time. In Hagit Attiya, editor, 34th International Symposium on Distributed Computing, DISC 2020, October 12-16, 2020, volume 179 of LIPIcs, pages 38:1–38:17, Virtual Conference, 2020. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2020.38.
- [56] David Peleg. Distributed Computing: A Locality-Sensitive Approach. Society for Industrial and Applied Mathematics, USA, 2000.
- [57] David Peleg, Liam Roditty, and Elad Tal. Distributed algorithms for network diameter and girth. In Artur Czumaj, Kurt Mehlhorn, Andrew M. Pitts, and Roger Wattenhofer, editors, Automata, Languages, and Programming - 39th International Colloquium, ICALP 2012, July 9-13, 2012, Proceedings, Part II, volume 7392 of Lecture Notes in Computer Science, pages 660–672, Warwick, UK, 2012. Springer. doi:10.1007/978-3-642-31585-5_58.
- [58] Atish Das Sarma, Stephan Holzer, Liah Kor, Amos Korman, Danupon Nanongkai, Gopal Pandurangan, David Peleg, and Roger Wattenhofer. Distributed verification and hardness of distributed approximation. SIAM J. Comput., 41(5):1235–1265, 2012. doi:10.1137/11085178X.
- [59] Jeanette P. Schmidt, Alan Siegel, and Aravind Srinivasan. Chernoff-hoeffding bounds for applications with limited independence. SIAM J. Discret. Math., 8(2):223–250, 1995. doi:10.1137/S089548019223872X.
- [60] Hsin-Hao Su and Hoa T. Vu. Distributed dense subgraph detection and low outdegree orientation. In Hagit Attiya, editor, 34th International Symposium on Distributed Computing, DISC 2020, October 12-16, 2020, volume 179 of LIPIcs, pages 15:1–15:18, Virtual Conference, 2020. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.DISC.2020.15.
- [61] Salil P. Vadhan. Pseudorandomness. Found. Trends Theor. Comput. Sci., 7(1-3):1–336, 2012. doi:10.1561/0400000010.
- [62] Guohui Wang, David G. Andersen, Michael Kaminsky, Konstantina Papagiannaki, T. S. Eugene Ng, Michael Kozuch, and Michael P. Ryan. c-through: part-time optics in data centers. In Shivkumar Kalyanaraman, Venkata N. Padmanabhan, K. K. Ramakrishnan, Rajeev Shorey, and Geoffrey M. Voelker, editors, Proceedings of the ACM SIGCOMM 2010 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, August 30 -September 3, 2010, pages 327–338, New Delhi, India, 2010. ACM. doi:10.1145/1851182.1851222.
Appendix A Preliminaries – Extended
A.1 Definitions
Additional Models. Throughout the paper, we also refer to the and models. Both are synchronous models, where every node can communicate in each round with every other node by sending messages of bits. In the model, the messages between every pair of nodes can be unique, while in the model each node sends the same message to all the other nodes.
Similarly to [7] we define a distributive aggregate function and an Aggregation-Problem.
Definition 8 (Aggregate Function).
An aggregate function maps a multiset of input values to some value . An aggregate function is called distributive if there is an aggregate function such that for any multiset and any partition of , it holds that .
Definition 9 (Aggregate-and-Broadcast Problem).
In the Aggregate-and-Broadcast Problem we are given a distributive aggregate function and each node stores exactly one input value . The goal is to let every node learn .
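For concreteness, the following small Python check (an illustration only) verifies that min and sum are distributive aggregate functions in the sense of Definition 8, with the outer function equal to the aggregate function itself.

    def check_distributive(f, values, partition):
        """Check f(S) == f(f(S_1), ..., f(S_k)) for one partition of the multiset S."""
        return f(values) == f([f(part) for part in partition])

    values = [5, 3, 8, 1, 9, 2]
    partition = [[5, 3], [8, 1, 9], [2]]
    print(check_distributive(min, values, partition))   # True
    print(check_distributive(sum, values, partition))   # True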
A.2 Mathematical Tools
Similarly to [49], we will use families of -wise independent functions.
Definition 10 (Family of -wise Independent Random Functions).
For finite sets , let be a family of hash functions. Then is called -wise independent if for a random function and for any distinct keys , we have that are independent and uniformly distributed random variables in .
In particular, we are interested in a hash function on which nodes can agree within a small amount of communication.
Claim A.1 (Seed).
Unlike [49], we do not necessarily apply the hash function sampled from the family of -wise independent random functions to distinct sets of arguments, but rather to a multiset of arguments where each argument appears at most times. Therefore, we show in \IfAppendixLABEL:\next ( (Conflicts).)Claim A.2 a property similar to [49, Lemma D.2], which bounds the number of collisions.
Claim A.2 (Conflicts).
There exists a value such that for a sufficiently large , given a function (with ) sampled from a family of -wise independent hash functions, and a multi-set of keys , in which each key appears in at most times, each value appears in the multiset of values at most times w.h.p.
Proof of Claim A.2.
Split greedily into sets of distinct keys . Consider some . Let be a family of -wise independent hash functions, for some to be determined. By the definition of , the random variables are -wise independent and uniformly distributed in . Thus, the probability of sampling some particular is . By a Chernoff Bound for variables with bounded independence [59, Theorem 2] and a union bound over all and , there is a large enough , such that each value appears in at most times w.h.p. Thus, each value appears in at most times w.h.p. ∎
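For illustration, the following Python sketch samples a hash function from the standard family of degree-(k-1) polynomials over a prime field, which is k-wise independent in the sense of Definition 10 and is described by a short seed of k field elements; it is not necessarily the concrete family used in [49].

    import random

    def sample_kwise_hash(k, prime, rng):
        """Sample h(x) = c_0 + c_1*x + ... + c_{k-1}*x^{k-1} mod prime; the c_i are the seed."""
        coeffs = [rng.randrange(prime) for _ in range(k)]
        def h(x):
            acc = 0
            for c in reversed(coeffs):      # Horner's rule
                acc = (acc * x + c) % prime
            return acc
        return h

    rng = random.Random(1)
    h = sample_kwise_hash(k=4, prime=10_007, rng=rng)
    print([h(x) for x in range(5)])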
A.3 Distance Tools
Claim A.3 (APSP using -nearest and MSSP).
(see e.g. [14]) Let be a weighted graph, let be a constant, let be a value, and let be a set of nodes marked independently with probability at least .
With probability at least , it holds that . Denote by one of the closest nodes to in . Denote by the -approximate distance from to other nodes for some . With probability at least , for any pair of nodes it holds that is a -approximate weighted distance between and .
Appendix B The Model – Extended
Section B.1 contains proofs of various technical tools for routing information in the model – we note that if taken as black-boxes, its contents can be skipped without harming the understanding of the main contributions of this section. Then, we show how to build carrier configurations and how to work with them, in Section B.2. We use sparse matrix multiplication Theorem 3.1 to construct hopsets in Section B.3, which eventually allows us to obtain our fast algorithms for SSSP and MSSP in Section B.4 and Section B.5, respectively.
B.1 General Tools
We show basic tools which are useful in the model, for overcoming the anonymity challenges, as well as for solving problems related to communication with limited bandwidth.
We introduce the following notation. Given a set of nodes , denote by the set of pairs of communication tokens and identifiers of the nodes in .
B.1.1 Basic Message Routing
Lemma B.1 (Routing).
Given a set of messages and a globally known value , if each node desires to send at most messages and knows the communication tokens of their recipients, and each node is the recipient of at most messages, then it is possible to deliver the messages in rounds of the model, w.h.p.
Proof of Lemma B.1.
Denote by the messages that node desires to send. We proceed in rounds, where in each round each node samples messages from that are not yet sent, independently with probability , and sends them to their destinations. The probability that some message is not sampled during this procedure is . Thus, by applying a union bound over all messages, each message is sent w.h.p.
For any given round, the probability for a specific message to be sent or received by some node is at most (it is zero for rounds after the one in which it has been sent). Thus, due to the independence between messages, by a Chernoff Bound and a union bound over senders, receivers and rounds, on each round, each node sends or receives at most messages w.h.p.
∎
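The schedule in the proof can be mimicked by the following rough Python simulation (an illustration only; the parameters below are ours): every pending message is sent independently with probability 1/k per round, so in this toy setting all messages are delivered within roughly k log n rounds while the per-round load at any node stays small.

    import math
    import random

    def simulate_routing(n, k, rng):
        """Each node holds about k*log(n) messages addressed to random recipients."""
        per_node = int(k * math.log(n))
        pending = {v: [rng.randrange(n) for _ in range(per_node)] for v in range(n)}
        rounds, max_receive_load = 0, 0
        while any(pending.values()):
            rounds += 1
            received = [0] * n
            for v in pending:
                keep = []
                for dst in pending[v]:
                    if rng.random() < 1.0 / k:
                        received[dst] += 1      # the message is sent in this round
                    else:
                        keep.append(dst)
                pending[v] = keep
            max_receive_load = max(max_receive_load, max(received))
        return rounds, max_receive_load

    print(simulate_routing(n=200, k=8, rng=random.Random(2)))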
B.1.2 Anonymous Communication Primitives
Definition 11 (Communication Tree).
Given a graph and a node , a communication tree rooted at in is a -depth directed tree which is rooted at and spans , such that each node has at most 2 edges directed away from it. A communication tree over , satisfies the conditions above, yet, only spans and not .
A communication tree rooted at allows to efficiently broadcast messages from to the entire graph as well as compute aggregation functions.
We show it is possible to build many communication trees in parallel.
Lemma B.2 (Constructing Communication Trees).
Given a set of nodes , it is possible to construct for each a communication tree rooted at , , such that each node in the graph knows the edges incident to it in each tree. This takes rounds of the model, w.h.p.
Proof of Lemma B.2.
Consider the task of constructing for a single node . Node randomly samples two nodes, , and tells them it is their parent in . Nodes each sample two nodes and repeat this process. At each step, a node might sample a node which is already in . In such a case, rejects the demand of to add it as a child. Thus, when building the next level of the tree, we repeat the choosing step times, ensuring, w.h.p., that each node has two nodes as its children. Notice that this ensures, w.h.p., that at every level in , except for the last, each node has exactly two children, and thus the depth of is w.h.p. Thus, in rounds, a communication tree from which spans the entire graph is constructed.
In each round, every node sends and receives messages, w.h.p., thus we can perform this for nodes in parallel, taking rounds overall to build such trees for all . ∎
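The construction can be summarized by the following simplified, centralized Python sketch (an illustration only; the bounded number of re-sampling attempts per level is omitted, and a rejected request is simply retried). The tree gains a full new level per step, so its depth grows logarithmically in this sketch.

    import random

    def build_communication_tree(nodes, root, rng):
        """Each node in the tree recruits two random children; nodes already in the tree reject."""
        children = {v: [] for v in nodes}
        in_tree = {root}
        frontier = [root]
        while len(in_tree) < len(nodes):
            next_frontier = []
            for v in frontier:
                while len(children[v]) < 2 and len(in_tree) < len(nodes):
                    u = rng.choice(nodes)
                    if u not in in_tree:        # otherwise u rejects the request
                        in_tree.add(u)
                        children[v].append(u)
                        next_frontier.append(u)
            frontier = next_frontier
        return children

    tree = build_communication_tree(list(range(16)), root=0, rng=random.Random(3))
    print(tree)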
Lemma B.3 (Message Doubling on Communication Trees).
Let be a set of nodes, and a communication tree rooted at , which spans . It is possible for to broadcast a set of messages to the entire set within rounds of the model, w.h.p., while utilizing only the communication bandwidth of the nodes in . Likewise, it is possible to compute aggregation functions on values of the nodes in , in rounds.
Proof of Lemma B.3.
On the first round, sends to its children in , and , some set , where . On the second round, and forward to their children, while sends them some other such set . This continues for rounds. Notice that every node sends and receives at most messages per round.
To solve aggregation functions, reversing the flow of messages in the above algorithm suffices. ∎
Lemma B.4 (Synchronization).
In the model, given a communication tree rooted at some node and assuming that every node has a value , it is possible in rounds to ensure that knows the sum of all the values of the nodes which come before it in the in-order traversal of . Further, it is possible to solve instances of this problem in parallel in rounds.
Proof of Lemma B.4.
We treat only a single value, noting that the case of values follows since, as in the single value case, each node sends and receives only a constant number of messages per round. For a node , denote by , its left and right children in , respectively, and by its parent.
Start from the leaves of and sum the values upwards to the root. To clarify, a leaf sends to . Denote by the sum of all of the values of the nodes in the subtree of rooted at . Once receives and from its children, it sends up to the sum .
Then, the root of sets . Further, sends to the value zero, and sends to the sum . Then, every node , upon receiving a value from , sets , forwards the value to , and sends to . This algorithm takes rounds and achieves the desired result. ∎
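The two passes of the proof correspond to the following centralized Python sketch (an illustration only): a bottom-up pass computes subtree sums, and a top-down pass hands every node the sum of the values of the nodes preceding it in the in-order traversal.

    def subtree_sums(tree, values, v, out):
        """Bottom-up pass: out[v] = sum of values in the subtree rooted at v."""
        left, right = tree[v]
        s = values[v]
        for c in (left, right):
            if c is not None:
                s += subtree_sums(tree, values, c, out)
        out[v] = s
        return s

    def in_order_prefix(tree, values, v, before, result):
        """Top-down pass: result[v] = sum of values of nodes before v in the in-order traversal."""
        left, right = tree[v]
        sub = {}
        subtree_sums(tree, values, v, sub)      # recomputed locally for clarity
        if left is not None:
            in_order_prefix(tree, values, left, before, result)
        result[v] = before + (sub[left] if left is not None else 0)
        if right is not None:
            in_order_prefix(tree, values, right, result[v] + values[v], result)

    # tree: node -> (left child, right child); values: node -> its value
    tree = {1: (2, 3), 2: (4, 5), 3: (None, None), 4: (None, None), 5: (None, None)}
    values = {1: 10, 2: 1, 3: 100, 4: 2, 5: 3}
    result = {}
    in_order_prefix(tree, values, 1, 0, result)
    print(result)   # in-order is 4, 2, 5, 1, 3 with prefix sums 0, 2, 3, 6, 16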
Lemma B.5 (Broadcasting).
Let be a set of messages distributed across the nodes arbitrarily. It is possible to broadcast this set of messages to all nodes in rounds of the model, w.h.p.
Proof of Lemma B.5.
Construct a communication tree from the node with ID 1, and then send down the communication token of node 1, which implies that from now on every node can communicate with node 1. Using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1, node 1 receives all of in rounds.
Once node 1 knows all of , we send down along in rounds, using Lemma B.3. ∎
Reversing the flow of messages in the proof of Lemma B.5 proves Corollary B.6.
Corollary B.6 (Aggregation).
It is possible to solve aggregation problems in rounds in the model . That is, if every node has a vector of values , denote , and there are aggregation functions, , it is possible to ensure within rounds that all the nodes know the values .
B.1.3 Communication Tools Within Groups of Nodes
We show the following communication tools related to allowing subsets of nodes in the graph to communicate with one another.
LABEL:\next ( (Grouping).)Lemma B.7 allows grouping together disjoint sets of nodes such that each node in a given set knows all the communication tokens of the other nodes in the set. \IfAppendixLABEL:\next ( (Group Broadcasting and Aggregating).)Corollary B.9 allows a single node in the set to quickly broadcast messages to or perform aggregation operations on the set.
Lemma B.7 (Grouping).
Let be disjoint sets where , for some , and where every node knows if and to which set it belongs. It is possible in rounds of the model to ensure that, for each , every node knows w.h.p.
Proof of Lemma B.7.
Due to the definition of the sets, we know that . Thus, the nodes (with identifiers ) broadcast using \IfAppendixLABEL:\next ( (Broadcasting).)Lemma B.5 in rounds. Using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1, for each , the nodes in send to node , in rounds.
Fix . Node performs message duplication to tell all nodes in the sets , as follows. Node chooses some , and tells it, within rounds, . Now, and each proceed to tell another node in all this information, doubling the number of nodes which know to 4. This continues for iterations, each taking rounds. ∎
Claim B.8 (Group Communication Tree Construction).
Given a set of nodes , where every knows , it is possible to build a communication tree over such that the in-order traversal of the tree imposes any desired ordering of the nodes . This is done using local computation only and requires no communication.
Proof of Claim B.8.
Since each node in knows , this simply entails having every node locally decide which other nodes in are its parent, left, and right children in the output tree, . ∎
The following statement follows immediately from \IfAppendixLABEL:\next ( (Group Communication Tree Construction).)Claim B.8 and \IfAppendixLABEL:\next ( (Message Doubling on Communication Trees).)Lemma B.3.
Corollary B.9 (Group Broadcasting and Aggregating).
Given a set of nodes , where every knows , it is possible to allow a single node to broadcast a set of messages to the entire set within rounds of the model, w.h.p., while utilizing only the communication bandwidth of the nodes in . Likewise, it is possible to compute aggregation functions on values of the nodes in , in rounds.
LABEL:\next ( (Group Multicasting).)Lemma B.10 extends \IfAppendixLABEL:\next ( (Group Broadcasting and Aggregating).)Corollary B.9 in order to allow nodes to efficiently send multicast messages within their given set.
Lemma B.10 (Group Multicasting).
Given a set of nodes , where every knows , and a value such that every node desires to multicast at most messages to nodes in , where each node is the destination of messages originating in at most one node, it is possible to perform the communication task in rounds of the model, where is an upper bound for all .
Proof of Lemma B.10.
Assume, w.l.o.g., that the nodes in have identifiers . Then, we use \IfAppendixLABEL:\next ( (Group Communication Tree Construction).)Claim B.8 to construct a communication tree over , with the demand that the in-order traversal of the tree will output the nodes of in ascending order from to . Then, execute, in rounds, the algorithm given by \IfAppendixLABEL:\next ( (Synchronization).)Lemma B.4 over in order to ensure that every node knows the sum .
Now, node tells node the values and . Then, node tells these values to , while node tells them to node , and we continue in this fashion for rounds until all nodes in know the values and . We now call these nodes the gateways of node . Notice that every node has distinct gateways – that is, no node is a gateway of more than one node.
Node now sends all of its messages to its gateways. To do so, it tells its first gateway all of its messages in rounds. Then, it tells its second gateway, while the first gateway tells the third gateway. This continues for iterations, each taking rounds, until all the gateways of node know all the messages of . Next, in rounds, node sequentially tells its gateway the identifier of the node which is set to receive the multicast messages from . Finally, every gateway forwards the messages to the target which tells it to send the messages to, in rounds. ∎
B.2 Carrier Configurations

[Figure omitted.] In this example, , , . The two arrays denote which edges and have, with a checkmark indicating the existence of an edge. The node holds information about and the first four nodes. That is, it knows that there are edges from the first two nodes to and that there are no edges from the following two nodes to . Notice that in this case and both hold the edge and thus will know its weight, direction, the communication tokens of and , and the communication tokens of each other ( and ). Further, have communication trees (not shown), which allow them to perform broadcast and aggregate operations on all of , respectively.
Definition 12 (Carrier Configuration).
Given a set of nodes , a data structure is a Carrier Configuration holding a graph with average degree , if for every node the following hold:
Carrier Node Allocations
1. are the outgoing and incoming carrier nodes of , where , .
2. is in at most sets and sets , for a constant , and knows which sets it is in.
Data Storage
3. An edge is always stored alongside its weight and direction.
4. For each , there exists an interval , such that knows all of the edges directed away from and towards nodes with IDs in the interval , and there are at most such edges. It further holds that the intervals partition . Similar constraints hold for .
5. Node knows, for each , the two delimiters of the interval .
Communication Structure
6. For each , the nodes in are connected by the communication tree , implying that each node knows its parent and children in the tree. The same holds for nodes in .
The definition of the data structure is compatible with both directed and undirected graphs, where for undirected graphs each edge appears in both directions. We also use carrier configurations for holding matrices, where it can be thought that every finite entry at indices in a matrix represents an edge from node to . Each node stores the finite entries of row as edges outgoing from , and the finite entries of column as edges incoming to .
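As an illustration only (the field names below are ours and are not part of the formal definition), the state kept by a single carrier and by the node it serves can be pictured as follows.

    from dataclasses import dataclass, field
    from typing import Dict, List, Optional, Tuple

    @dataclass
    class CarrierState:
        """What one carrier stores on behalf of the node whose edges it carries."""
        owner: int                                             # the carried node
        interval: Tuple[int, int]                              # ID interval of the carried edges
        edges: Dict[int, float] = field(default_factory=dict)  # neighbor ID -> edge weight
        parent: Optional[int] = None                           # links in the communication tree
        children: List[int] = field(default_factory=list)

    @dataclass
    class NodeState:
        """What the carried node itself stores about its carriers."""
        node: int
        out_carriers: List[int] = field(default_factory=list)
        in_carriers: List[int] = field(default_factory=list)
        out_delimiters: Dict[int, Tuple[int, int]] = field(default_factory=dict)
        in_delimiters: Dict[int, Tuple[int, int]] = field(default_factory=dict)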
In order to use carrier configurations in the model, we must slightly extend the definition in order to address the usage of communication tokens. Thus, we present the following definition for Carrier Configurations, to which we often refer simply as ‘carrier configurations’.
Definition 13 ( Carrier Configuration).
Given a set of nodes , a data structure is a carrier configuration holding a graph with average degree , if it is a Carrier Configuration and, additionally, for every node the following hold:
1. Node knows and .
2. Node knows the communication tokens and identifiers of each such that or .
3. Node knows the communication tokens of its parent and children in each communication tree that it belongs to as part of the data structure.
4. An edge is always stored alongside the communication tokens of and .
5. Every knows, for every edge which it holds, the communication token of the node which also holds . Similarly, knows the communication token of .
Throughout this entire section, the term ‘carrier configuration’ refers to \IfAppendixLABEL:\next ( ( Carrier Configuration).)Definition 13, unless otherwise specified.
B.2.1 Initialization
We show how to construct a carrier configuration, given that the edges of the graph are initially known to the nodes incident to them. As the stages taken during the construction can be partially reused in other algorithms which we show, we break up the construction into two statements – \IfAppendixLABEL:\next ( (Initialize Carriers).)Lemma B.11 creates an empty carrier configuration by allocating the carrier node sets and creating communication trees spanning them, and \IfAppendixLABEL:\next ( (Populate Carriers).)Lemma B.12 transfers the data from nodes to their carrier nodes.
Lemma B.11 (Initialize Carriers).
Given a graph , with , and the maximal degree in , where each node initially only knows (but not even the edges incident to it), it is possible to assign for each node sets , which satisfy Items 1, 2, 6, 1, 2 and 3, in rounds, w.h.p.
Note: We do not assume that and are originally known to all the nodes.
Proof of Lemma B.11.
We perform two operations in this proof. First, we allocate the carrier node sets. Then, we construct communication trees across them. We show the case of outgoing carrier nodes, , and note that the case of is symmetric.
Carrier Allocations: We start by computing the values and , using \IfAppendixLABEL:\next ( (Aggregation).)Corollary B.6, in rounds. Each node selects by sending its communication token and identifier to random nodes, and each node which reaches replies to with its communication token and identifier. The expected number of times a node is picked as carrier node is at most , and thus by applying a Chernoff Bound, there exists a constant such that each node is picked by (not necessarily distinct) nodes in order to be in their carrier node set w.h.p. This concludes the creation of the sets themselves, and satisfies Items 1, 2, 1 and 2.
The round complexity of this step is , as each node initially sends messages, and then replies to the at most nodes which chose it for their carrier configuration sets.
Communication Trees: Node locally builds a balanced binary tree which spans , and sends to each the communication tokens of its parent and children in , taking rounds w.h.p. Notice that is a communication tree (Definition 11) as it is of depth , and thus we satisfy Items 6 and 3. ∎
Now, we assume that we are given a carrier configuration which is still incomplete and only satisfies the conditions from the previous statement, and we complete it to a proper carrier configuration.
Lemma B.12 (Populate Carriers).
Let be a graph where each node knows all the edges incident to it and the communication tokens of all of its neighbors. Assume that we have a carrier configuration which is currently in-construction and satisfies all of the properties of a carrier configuration, except for Items 3, 4, 5, 4 and 5. Then, it is possible, within rounds, where is the maximal degree in the graph, to reach a state where satisfies all of the properties of a carrier configuration, w.h.p.
Proof of Lemma B.12.
We show the procedure for , and note that the case of is symmetric.
Node partitions the identifier space into intervals, , such that for every such interval , the number of edges directed from to nodes with identifiers in is at most . Denote by the edges from to nodes with identifiers in . For each node , node assigns a unique interval , and sends to the delimiters of the interval as well as all the edges in . Every edge is sent along with its weight, direction, and the communication tokens and identifiers of both of its endpoints.
The above procedure satisfies Items 3, 4, 5 and 4. We proceed to analyze the round complexity of this step. Clearly, every node desires to send at most messages. To bound the number of messages each node receives, recall that by Item 2, each node is a carrier node in at most carrier node sets, and the number of messages it receives on behalf of each of them is at most . Thus, each node desires to receive messages. Therefore, by \IfAppendixLABEL:\next ( (Routing).)Lemma B.1, this stage can be executed in rounds w.h.p.
Finally, we need to satisfy Item 5. We assume that every node followed the above procedure to construct both sets , and which satisfy all properties except for Property 5, and now we show how, at once, this property can be satisfied for both , and . For every node and every edge for some , let be the node holding ; then node asks node which node holds . Node , which knows this information due to the above, replies to with the answer. As each node carries edges, and each node receives requests, one for each of its edges, by \IfAppendixLABEL:\next ( (Routing).)Lemma B.1 it takes an additional rounds. ∎
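The interval partition used in the proof can be illustrated by the following centralized Python sketch (an illustration only; names and parameters are ours): a node sorts its out-edges by neighbor ID and hands each carrier one contiguous ID interval containing at most ceil(degree / number of carriers) edges.

    import math

    def partition_edges(edges, carriers, max_id):
        """edges: dict neighbor ID -> weight; carriers: list of carrier IDs; max_id: largest node ID."""
        per_carrier = math.ceil(max(len(edges), 1) / len(carriers))
        targets = sorted(edges)
        chunks = [targets[i:i + per_carrier] for i in range(0, len(targets), per_carrier)]
        chunks += [[] for _ in range(len(carriers) - len(chunks))]      # some carriers may stay empty
        assignment, lo = {}, 0
        for i, (carrier, chunk) in enumerate(zip(carriers, chunks)):
            hi = max_id if i == len(carriers) - 1 else (chunk[-1] if chunk else lo)
            assignment[carrier] = ((lo, hi), {t: edges[t] for t in chunk})
            lo = hi + 1
        return assignment

    edges = {3: 1.0, 7: 2.5, 12: 0.5, 20: 4.0, 41: 1.2}
    print(partition_edges(edges, carriers=["c1", "c2", "c3"], max_id=50))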
Applying \IfAppendixLABEL:\next ( (Initialize Carriers).)Lemma B.11, followed by \IfAppendixLABEL:\next ( (Populate Carriers).)Lemma B.12, directly gives the following.
Lemma B.13 (Initialize Carrier Configuration).
Given a graph , where each node knows all the edges incident to it and the communication tokens of all of its neighbors in , it is possible, within rounds, where is the maximal degree in the graph, to reach a state where is held in a carrier configuration , w.h.p.
B.2.2 Basic Tools
We show a basic communication tool within carrier configurations.
Lemma B.14 (Carriers Broadcast and Aggregate).
Let be a graph held in a carrier configuration . In parallel for all nodes, every can broadcast messages to all the nodes in and , as well as solve aggregation tasks over and . This requires rounds.
Proof of Lemma B.14.
Due to Items 6 and 3, there is a communication tree spanning each and , and every carrier node knows the communication tokens of its parent and children in the tree. Further, since each node is a member of at most sets of carrier nodes, it is possible to apply \IfAppendixLABEL:\next ( (Message Doubling on Communication Trees).)Lemma B.3 simultaneously across all the communication trees in the carrier configuration, in rounds, proving the claim.
∎
We show the following helpful statement which enables nodes to query a carrier configuration and return to the classical state in which edges are known by the nodes incident to them.
Lemma B.15 (Learn Carried Information).
Given a graph with average degree held in a carrier configuration , it is possible for each node to learn all edges adjacent to it in in rounds w.h.p., where is the maximal degree in . It is possible to invoke this procedure for only outgoing or incoming edges separately, requiring , rounds, respectively, where is the maximal out-degree, and is the maximal in-degree.
Proof of Lemma B.15.
First, each node computes by summing up the number of edges nodes in and hold. Then, the nodes compute the maximum of their degrees, the value . Every node in and sends to the edges incident to which it holds. Node desires to receive at most messages, and each node desires to send at most messages, as every node is the carrier of at most nodes. Thus, due to \IfAppendixLABEL:\next ( (Routing).)Lemma B.1, this requires rounds.
∎
It is possible to extend Lemma B.15, and show that if for a node , both and all the carrier nodes of know some predicate over edges, then it is possible to send to only edges incident to it which satisfy the predicate. We formalize this, as follows.
Lemma B.16 (Learn Carried Information with Predicate).
Assume that we are given a graph with average degree held in a carrier configuration . If each node has a predicate over the edges incident to , which both and the nodes know, then it is possible for each node to learn all edges incident to it in which satisfy the predicate. The round complexity for this procedure is rounds w.h.p., where is the maximal number of edges incident to any node which satisfy .
B.2.3 Merging Carrier Configurations
We present a useful tool, which shows how to compute the point-wise minimum of two matrices. With respect to graphs, this can be seen as adding edges to a graph, and if an edge exists twice, then setting its weight to the minimum of the two. This tool can be used in order to merge two carrier configurations.
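As a point of reference for what the merge computes (ignoring the distributed storage and routing entirely), the following minimal sketch takes two sparse matrices stored as Python dictionaries and returns their point-wise minimum; the representation is illustrative and is not part of the carrier-configuration machinery.

```python
# Illustrative, centralized sketch of the point-wise minimum computed by the
# merging tool; the actual procedure keeps each row distributed over carrier
# nodes and never gathers it in one place.
def pointwise_min(A, B):
    """A, B: sparse matrices as dicts mapping (row, col) -> finite weight."""
    C = dict(A)
    for entry, weight in B.items():
        # An entry present in both matrices keeps the smaller weight.
        C[entry] = min(C.get(entry, float("inf")), weight)
    return C

# Example: merging two weighted edge sets over the same node identifiers.
A = {(1, 2): 5, (1, 3): 7}
B = {(1, 2): 4, (2, 3): 1}
assert pointwise_min(A, B) == {(1, 2): 4, (1, 3): 7, (2, 3): 1}
```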
Lemma B.17 (Merging).
Let be a set of nodes which hold two matrices in carrier configurations , , respectively. Denote by the matrix generated by taking the point-wise minimum of the two given matrices. It is possible within rounds to output in a carrier configuration , where the values denote the average number of finite elements per row of , respectively, and the values denote the maximal number of carriers each node has in , , respectively, w.h.p.
Proof of Lemma B.17.
We show how to set up , and note that the case of is symmetric. Thus, we sometimes drop the superscripts and denote . A critical note is that in the following proof, when we denote , if a node appears in both carrier sets, we count it twice in the union. That is, denotes a multiset. Further, let be some edge which is held in or by some carrier node . At the onset of the algorithm, attaches its identifier and communication token to – that is, whenever is sent in a message, it is sent along with these values as metadata.
Proof Overview
Goal: Consider a node . Essentially, node has a sparse array (row in matrix , denoted as ) held, in a distributed fashion, over the nodes in , and a sparse array (row in matrix , denoted as ) held over the nodes in . Node wishes to merge these two arrays into one sparse array (the currently not-yet computed ), and hold it in some (currently not allocated) carrier set . In the case that an entry appears in both and , it should keep the minimum of the values.
Merging: Initially, performs some merging mechanism in order to compute the sparse array . At the end of this step, the array is distributed across the nodes and , as we have yet to allocate .
Constructing : Finally, we allocate the set , and move the data of from its temporary storage in the nodes and to be distributed across . Further, several steps are taken to ensure that is a valid carrier configuration.
Step: Merging
Observe some node . In this step, our goal is to compute , and store it in a convenient distributed representation across the nodes in .
Initially, we desire for all the nodes in to be able to communicate with one another. Node knows the communication tokens and identifiers of the nodes in (Item 1), and broadcasts all of them to all the nodes in rounds using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14.
Due to Item 4, is distributed across such that an interval corresponds to each , where holds all the finite elements in (entries from index to index of ). The same holds for . For a set of carrier nodes , denote , and for a set of intervals , denote by the set of delimiters of . Due to Property 5, node knows and . We now perform the following steps.
-
1.
Node computes , where , that is, the partition of into intervals using all of the delimiters in . Notice that , and every is contained in exactly one interval in and in one interval in . Further, , since and .
-
2.
Node broadcasts to in rounds using Lemma B.14.
Notice that all the nodes in know the identifiers of one another (guaranteed above), and also all of . Thus, it is possible for the nodes in to perform local computation which allocates to each two intervals, , and every node in knows that is assigned .
Now, we wish for to learn the finite entries in , and compute . To do so, we need to route the finite entries which requires from their current storage in the nodes to . We bound the amount of information receives. For any interval and , there are at most and finite elements in and , respectively. Further, every interval is contained in exactly one interval in and one interval in , and so the number of finite elements in and is at most and respectively. Therefore, node desires to learn at most finite elements. We now bound the amount of information node sends to other nodes in in order to let them learn their desired intervals. Node originally holds at most finite elements, and each element is desired by exactly one node. Therefore, node sends at most finite elements. Thus, we conclude that every node in sends and receives at most messages to and from other nodes in , showing that this step can be completed in rounds, using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1.
Finally, we wish for all the nodes in to know, for each , how many finite entries are in . Using Lemma B.1, every node sends to the number of finite entries in and in , within rounds. Then, broadcasts all of this information to using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14 in rounds.
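The following is a minimal, centralized sketch of the merging step just described, under the assumption that each sparse row is given as a dictionary and the interval delimiters are plain sorted lists; in the actual algorithm each refined interval is handled by a separate carrier node rather than by one loop.

```python
# Illustrative, centralized sketch of the row-merging step: the two sparse rows
# are cut by the union of their interval delimiters, and each refined interval
# is then merged independently (in the algorithm, by a distinct carrier node).
def refine_and_merge(row_a, row_b, delims_a, delims_b, universe_size):
    """row_a, row_b: dicts index -> weight; delims_*: sorted interval delimiters."""
    delims = sorted(set(delims_a) | set(delims_b) | {0, universe_size})
    merged_per_interval = []
    for lo, hi in zip(delims, delims[1:]):        # refined intervals [lo, hi)
        part = {}
        for row in (row_a, row_b):
            for idx, w in row.items():
                if lo <= idx < hi:
                    part[idx] = min(part.get(idx, float("inf")), w)
        merged_per_interval.append(((lo, hi), part))
    return merged_per_interval
```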
Step: Constructing
We perform several operations in this step. First, we invoke \IfAppendixLABEL:\next ( (Initialize Carriers).)Lemma B.11 w.r.t. , in order to create the carrier sets , which satisfy all of the properties of a carrier configuration, except for Items 3, 4, 5, 4 and 5. These remaining properties relate to populating the sets with data. Therefore, we then populate with the data pertaining to .
Sub-step – Invoking Lemma B.11: In order to invoke Lemma B.11 w.r.t. , every node needs to know the number of finite entries in row of and column of . Notice that can compute the number of finite entries in by aggregating over the nodes . In order to compute the number of finite entries in column of , recall that, as stated at the beginning of the proof, our analysis follows only the rows of the matrices; inherently, the algorithm up to this point is also run on the columns of the matrices. Therefore, in a symmetric way, can know the number of finite entries in column of . Next, we analyze the round complexity of invoking Lemma B.11 w.r.t. . Denote by the average number of finite elements in a row of , and, by aggregation, all of the nodes of the graph compute . Notice that , and , as in each row the number of finite elements could only have increased due to the minimization operation. Further, the maximal number of finite elements in a row of is at most the maximal number of finite elements in a row of plus the maximal number of finite elements in a row of . Thus, the round complexity is rounds.
Sub-step – Ensuring Items 3, 4, 5 and 4: We begin by computing the intervals . Notice that , since , and and can themselves hold the vectors with each node in each set carrying at most finite elements, respectively. Now, we partition into intervals , such that the number of finite elements in every is at most . As the nodes in all know , as well as, for each , how many finite entries are in , every node in knows, for every finite element in which it holds, the number of finite elements preceding it in . Thus, for every interval , there exist some two nodes such that can compute the left endpoint of and can compute the right. We show this for left endpoints as the proof for right endpoints is symmetric. The left endpoint of is the index of the -th finite entry in , and thus the node in which holds this finite element knows the left endpoint of . In rounds, the nodes in tell the contents of , using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1. In the same round complexity, broadcasts to , , and , using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14.
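A small sketch of the interval-cutting rule used here, assuming a centralized view in which the sorted indices of the finite entries are available; in the distributed procedure the same cut points are recovered from the prefix counts held by the carrier nodes.

```python
# Illustrative sketch of cutting the index space into intervals so that each
# interval contains at most `cap` finite entries; in the algorithm the cut
# points are derived from prefix counts rather than a global sorted list.
def balanced_intervals(finite_indices, cap):
    """finite_indices: sorted indices of finite entries; cap: max entries per interval."""
    delimiters = []
    for i in range(0, len(finite_indices), cap):
        # The left endpoint of each interval is the index of the (i+1)-th finite entry.
        delimiters.append(finite_indices[i])
    return delimiters

# Example: 7 finite entries, at most 3 per interval -> cuts at the 1st, 4th and 7th entries.
assert balanced_intervals([2, 5, 9, 10, 14, 20, 31], 3) == [2, 10, 31]
```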
Now, we move the finite entries of from the nodes in to the nodes in . Node broadcasts the communication tokens and identifiers of all the nodes in to all the nodes in , in rounds. The nodes in communicate all of the finite entries of to , each node knowing where to send the information which it holds as all the nodes in know . Each node sends or receives at most messages, therefore routing these messages takes rounds using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1.
Sub-step – Ensuring Item 5: Recall that, as stated at the beginning of the proof, we show how to construct , while the case of is a symmetric algorithm. At this point, we require that all of the above be executed w.r.t. both and . This is due to the fact that in order to satisfy Item 5 for , we query the nodes of for some information which they compute above. Thus, we now show how Item 5 is satisfied for . In a symmetric way, it can be shown for .
Let there be some edge , for some , which is now held in . Denote by , the node which originally held at the onset of the algorithm, and recall that at the onset of the algorithm (stated at the beginning of the proof), we attach to the communication token and identifier of , and so knows . W.l.o.g., assume that . Due to the fact that is a carrier configuration, node knows the communication token and identifier of node which also holds . Again, assume that at the onset of the algorithm, node attached to the communication token and identifier of . Thus, node knows the communication token and identifier of .
As such, asks which node in holds . Node is able to answer this query, as all the nodes in know which intervals are held by which nodes in . The answer to this query is exactly the information which node needs in order to satisfy Item 5. We analyze the round complexity of this routing. Each node in sends queries only w.r.t. edges it holds as part of , and each node in answers queries only w.r.t. edges it holds in . Thus, this step can be executed in rounds, using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1. ∎
B.2.4 Partial Carrier Configuration
We now prove a fundamental tool which can be roughly viewed as computing the transpose of a matrix. Notice that each entry of data is stored twice in a carrier configuration . For instance, an edge is stored in both , and . We show that if only the outgoing carrier sets are stored, one can complete the data structure to contain also the incoming carrier sets .
This is a very useful tool, as sometimes nodes can only compute the edges directed away from them, and not the edges directed towards them. For instance, in \IfAppendixLABEL:\next ( (Sparse Matrix Multiplication).)Theorem 3.1, we reach a state where there are few edges directed away from every node, but potentially edges directed towards some nodes. If one were to simply invoke \IfAppendixLABEL:\next ( (Initialize Carrier Configuration).)Lemma B.13, this would require every node to learn all of the edges directed both away and towards it, which would incur a high round complexity. Instead, \IfAppendixLABEL:\next ( (Partial Configuration Completion).)Lemma B.18 shows that given that every node has a partial carrier set holding edges directed away from it, the matching carrier set for edges directed towards can be allocated and directly populated with these edges without the edges ever being known to itself.
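For intuition, the following centralized sketch shows what completing a partial configuration amounts to: computing the incoming adjacency (the transpose) from the outgoing adjacency alone. The dictionary representation is ours; the point of the lemma is that this can be done without any node ever collecting the edges directed towards it.

```python
# Centralized sketch of the transpose computed by completing a partial carrier
# configuration: build the incoming adjacency from the outgoing adjacency.
from collections import defaultdict

def complete_incoming(outgoing):
    """outgoing: dict u -> {v: weight} of edges directed away from u."""
    incoming = defaultdict(dict)
    for u, nbrs in outgoing.items():
        for v, w in nbrs.items():
            incoming[v][u] = w
    return dict(incoming)

out = {1: {2: 3, 3: 1}, 2: {3: 4}}
assert complete_incoming(out) == {2: {1: 3}, 3: {1: 1, 2: 4}}
```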
Definition 14 (Partial Carrier Configuration).
Given a set of nodes , a data structure is a partial carrier configuration holding a graph if all the conditions of \IfAppendixLABEL:\next ( ( Carrier Configuration).)Definition 13 hold, yet, only for the outgoing edges. That is, each node only has .
Notice that Item 5 is not demanded, as it requires the existence of both and .
Lemma B.18 (Partial Configuration Completion).
Given a graph which is held in a partial carrier configuration , there exists an algorithm which runs in rounds, where , and outputs a carrier configuration holding , w.h.p.
Proof of Lemma B.18.
We assign for every , and thus we are required to show two items in this proof: how to allocate the sets , and how to populate them with data.
Allocating :
In order to allocate , node needs to know . Denote , , and , . The sets and contain light and heavy nodes, respectively, yet, notice that at the current stage in the algorithm, no node knows whether it itself is in or in , as it does not know . Our goal is to satisfy an even stronger condition – to make every node know for every whether is in or .
For a set of nodes , denote by . Let be an arbitrary, hardcoded, globally known partition of , where all the parts are of roughly equal size. Using \IfAppendixLABEL:\next ( (Aggregation).)Corollary B.6, all nodes compute the values in rounds. Denote the set of parts in which have low in-degree by , and the high in-degree ones by . Given , if belongs to some set in , that is, , then certainly . As is hardcoded and globally known, and all the nodes know , all such nodes know that they are in , and further all the nodes in the graph know this as well.
As , it holds that , implying . Since the sets in are of equal sizes, we have guaranteed that at least half of the nodes in are now identified as belonging to . These nodes are set aside, and we iterate over this procedure. In each iteration, at least half of the remaining nodes are marked as belonging to , up until the final iteration where only nodes in remain. Thus, every node in the graph knows which nodes belong to and which to .
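The counting argument above can be illustrated by the following sketch, which assumes a centralized view of the per-part in-degree sums; all names and parameters are placeholders, and the distributed algorithm obtains the part sums by aggregation rather than by direct summation.

```python
import math

# Illustrative sketch of the halving argument: the unclassified nodes are
# repeatedly re-partitioned into equal parts; a part whose total in-degree is
# below the threshold contains only light nodes and is set aside, and at least
# half of the remaining nodes are classified in each iteration.
def classify_light(in_degree, nodes, threshold, parts_count):
    light, remaining = set(), list(nodes)
    while remaining:
        size = math.ceil(len(remaining) / parts_count)
        parts = [remaining[i:i + size] for i in range(0, len(remaining), size)]
        survivors = []
        for part in parts:
            if sum(in_degree[v] for v in part) < threshold:
                light.update(part)            # the whole part is certainly light
            else:
                survivors.extend(part)        # may still contain light nodes
        if len(survivors) == len(remaining):  # no part was eliminated
            break
        remaining = survivors
    return light, remaining                   # remaining nodes are treated as heavy
```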
Fix . It holds that . Each node which holds an edge directed towards now sends that edge to . This is done using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1. We must show that knows the communication token of . This follows from Item 4, which states that every edge is stored alongside the communication tokens of both of its endpoints. The execution of Lemma B.1 completes in rounds, as every node receives at most messages, and sends at most messages. Thus, computes , as it knows all of the edges which are directed towards itself.
Observe – we show that every also computes . As the minimum in-degree of a node in is , and , we get . Further, recall that every node knows which nodes are in . Therefore, using \IfAppendixLABEL:\next ( (Aggregation).)Corollary B.6, within rounds, every node in the graph knows the in-degree of every node in .
Finally, we allocate . We are given as input the partial carrier configuration , allowing each node to compute in rounds, using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14. Thus, we invoke \IfAppendixLABEL:\next ( (Initialize Carriers).)Lemma B.11, in rounds, to create the sets , which satisfy all of the properties of a carrier configuration, except for Items 3, 4, 5, 4 and 5. These remaining properties relate to populating the carrier node sets with input data. We throw away and set .
Populating :
Fix . As knows all the edges directed towards it, it sends these edges to its carrier in rounds, using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14 and trivially completes Items 3, 4, 5 and 4, by sending at most messages to each node in , requiring rounds. The only challenging task is ensuring Item 5, which requires that for every edge , the node knows the communication token and identifier of which holds . However, since every edge in this algorithm is sent directly from the carrier node which holds it (node sends ), that carrier node can attach its own communication token and identifier to the edge itself when sending it, thus providing the information which the nodes in need in order to satisfy Item 5.
Partition into sets, , and use \IfAppendixLABEL:\next ( (Grouping).)Lemma B.7 to ensure that for each , every node in knows , within rounds. For each set , denote some arbitrary, hardcoded as the leader of , and in rounds, broadcast the communication tokens and identifiers of all the leaders, using \IfAppendixLABEL:\next ( (Broadcasting).)Lemma B.5.
Fix . Observe that . For each set , the -th carrier in , sends its communication token and identifier to , using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1, in rounds, as every carrier sends at most one message and every leader receives at most messages. Then, leader broadcasts to all the communication token and identifiers of carrier nodes which it received, taking rounds, using \IfAppendixLABEL:\next ( (Group Broadcasting and Aggregating).)Corollary B.9.
Finally, every node in which holds an edge directed towards , tells the -th carrier in about this edge using Lemma B.1, in rounds, as each carrier node receives at most messages, and each node sends at most messages since is a partial carrier configuration. Now, the nodes know all of the edges directed towards . Since implies that every hold , within rounds, the nodes in can rearrange the information stored in them, as well as communicate with , in order to satisfy Items 3, 4, 5 and 4. To satisfy Item 5, an identical claim to the case of can be used. ∎
B.3 Efficient Hopset Construction
We efficiently compute, in the model, a hopset [24], which allows approximating distance-related problems quickly. We follow the general outline of [14], and solve -nearest and -source detection, defined below, in order to construct the hopset. We solve these problems mainly using \IfAppendixLABEL:\next ( (Sparse Matrix Multiplication).)Theorem 3.1.
However, in contrast to the implementation, in the model we are met with additional challenges, as many operations which are trivial in the model become highly complex. For instance, upon computing the edges of a hopset , one must add the edges to the graph – an operation which is straightforward in the model, yet requires \IfAppendixLABEL:\next ( (Merging).)Lemma B.17 in the model. Further, in the model, when we consider undirected graphs, once a node adds an edge to the graph then the edge is added as well, or updated to the minimum cost, if it exists already. To accomplish this in the model, one should invoke the algorithm in \IfAppendixLABEL:\next ( (Merging).)Lemma B.17 on the matrix and the transpose of the matrix. However, transposing a matrix is not trivial and we accomplish it due to the definition of the carrier configuration, which implies that whenever nodes hold a matrix , they also implicitly hold . This goes to show why various new tools are required in the model for this problem.
Definition 15 (-Hopset).
For a given weighted graph , a -hopset is a set of edges such that paths of length at most hops in approximate distances in by a multiplicative factor of at most . That is, for each , , where is the weight of the shortest path with at most hops between in .
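In standard hopset notation (the symbols below are ours and are not taken from the statement above), the guarantee can be written as

\[
\mathrm{dist}_G(u,v) \;\le\; \mathrm{dist}^{(\beta)}_{G \cup H}(u,v) \;\le\; (1+\epsilon)\cdot \mathrm{dist}_G(u,v) \qquad \text{for all } u,v \in V,
\]

where $\mathrm{dist}^{(\beta)}_{G \cup H}(u,v)$ denotes the weight of the lightest path between $u$ and $v$ in $G \cup H$ that uses at most $\beta$ hops.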
We demonstrate how to construct the -hopset over the input graph, where the number of edges in is . This is done using \IfAppendixLABEL:\next ( (Hopset Construction).)Theorem B.19.
Theorem B.19 (Hopset Construction).
There exists an algorithm in the model, such that given a weighted undirected input graph with and , held in a carrier configuration , and given some , computes a -hopset , with , and outputs in a carrier configuration . The round complexity of this algorithm is , w.h.p.
Before proving \IfAppendixLABEL:\next ( (Hopset Construction).)Theorem B.19, we prove several theorems related to the following two problems.
Definition 16 (-nearest).
Given a graph and a value , in the -nearest problem, each node must learn of its closest neighbors in , breaking ties arbitrarily.
Definition 17 (-source detection).
Given a graph , a set , a value , and a value , in the -source detection problem, each node is required to learn its closest neighbors in , while considering paths of up to hops only.
We solve the -nearest and -source detection problems for the case where , as the round complexity of our solutions does not improve for due to pre-processing costs.
Lemma B.20 (-nearest Algorithm).
Given a graph , where , held in a carrier configuration , and some value , it is possible in the model, within rounds, w.h.p., to output a directed graph held in a carrier configuration , where contains an edge from every node to every node which is one of the closest nodes to (with ties broken arbitrarily), with weight . Notice that it can be the case that .
Proof of Lemma B.20.
As shown in [14], the following process solves the problem. Take the adjacency matrix of , and in each row keep the smallest entries (breaking ties arbitrarily), to create some matrix (given , there potentially are many options for , since ties can be broken arbitrarily). Then, the matrix is iteratively squared, for at most iterations, while after each product only the smallest entries in each row are preserved. We create some matrix from . Fix . Node computes , in rounds, using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14. If , then uses \IfAppendixLABEL:\next ( (Learn Carried Information).)Lemma B.15 in rounds to learn all of the edges outgoing from it. Otherwise , and denote by the edges directed away from with weight at most and towards nodes with identifiers at most . Node computes two values, and , such that is the maximal value where there exists an such that . Given any value , node can compute within rounds using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14. Thus, in rounds, can compute using binary search. Likewise, can be computed in rounds for any , and thus using binary search computes . Then, node broadcasts and to , and using \IfAppendixLABEL:\next ( (Learn Carried Information with Predicate).)Lemma B.16, within rounds, learns all of the edges in . We need to hold in a carrier configuration in order to use it for matrix multiplication. Each node with knows the entries of row in – they are . Each node with , locally adds arbitrary edges directed away from it with infinite weight, to have exactly edges directed away from it. We denote the new matrix created by this process by , and notice that has the same properties w.r.t. distances as , since edges of infinite weight do not affect shortest paths. As each node holds exactly edges directed away from it, the nodes themselves are a partial carrier configuration, , holding . That is, for each node , we set . We invoke \IfAppendixLABEL:\next ( (Partial Configuration Completion).)Lemma B.18, within rounds, since , in order to get a carrier configuration which holds .
Finally, we iteratively square by applying \IfAppendixLABEL:\next ( (Sparse Matrix Multiplication).)Theorem 3.1. That is, we compute , where takes a matrix and leaves only the smallest entries in each row. Then, we compute , and so forth. Repeating this procedure for iterations results in an output matrix which holds in row edges only to some closest nodes to . We perform matrix multiplication, with each taking rounds, since and we always multiply two matrices with at most elements per row, due to \IfAppendixLABEL:\next ( (Sparse Matrix Multiplication).)Theorem 3.1. ∎
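A minimal sketch of one squaring-and-filtering iteration follows, assuming a centralized dictionary representation of the sparse matrix and a zero self-entry in each row so that shorter-hop paths survive the product; in the algorithm this operation is realized distributively via Theorem 3.1.

```python
import heapq

# Illustrative sketch of the iterated squaring used for k-nearest: a min-plus
# matrix product followed by keeping only the k smallest entries in each row.
# Assumes each row contains a zero entry for its own index, so paths with
# fewer hops are preserved across squarings.
def minplus_square_topk(M, k):
    """M: dict u -> dict v -> weight (sparse rows); returns the filtered M * M."""
    out = {}
    for u, row_u in M.items():
        best = {}
        for mid, w1 in row_u.items():
            for v, w2 in M.get(mid, {}).items():
                d = w1 + w2
                if d < best.get(v, float("inf")):
                    best[v] = d
        # Keep only the k closest targets of u (ties broken arbitrarily).
        out[u] = dict(heapq.nsmallest(k, best.items(), key=lambda kv: kv[1]))
    return out
```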
Lemma B.21 (-source detection Algorithm).
Given a graph , where and , held in a carrier configuration , and given , where and , it is possible in the model, within rounds, w.h.p., to output a directed graph held in a carrier configuration , where contains an edge from every node to every node which is at most hops away from , with weight . Notice that it can be the case that .
It is assumed that the IDs of the nodes in are known to all of .
Proof of Lemma B.21.
In [14], it is shown that the following process solves the problem. Denote by the adjacency matrix of . Denote by the sparsified adjacency matrix with edges only entering nodes in . The matrix is the solution to the problem.
We construct a carrier configuration which holds . Fix . Denote by the edges from directed towards nodes in . Node uses \IfAppendixLABEL:\next ( (Learn Carried Information with Predicate).)Lemma B.16 to learn , in rounds. We construct a partial carrier configuration which contains for each the edges , by setting , since the average degree in is exactly (assuming that if a node does not have an edge to node in , it inserts a dummy edge with infinite weight). Using \IfAppendixLABEL:\next ( (Partial Configuration Completion).)Lemma B.18, we turn into a carrier configuration holding , in rounds, since . Finally, we perform multiplications. We compute the product by invoking \IfAppendixLABEL:\next ( (Sparse Matrix Multiplication).)Theorem 3.1 with the carrier configurations , to get a carrier configuration , which holds , in rounds, since the average number of finite elements per row in is at most , in it is at most , and . Notice that while this invocation of \IfAppendixLABEL:\next ( (Sparse Matrix Multiplication).)Theorem 3.1 only computes the smallest entries in each row of , there are only at most entries in which are finite – all the columns not corresponding to nodes in do not contain finite values. Thus, it turns out that the invocation of \IfAppendixLABEL:\next ( (Sparse Matrix Multiplication).)Theorem 3.1 actually computes exactly. Thus, we now multiply by and repeat times until achieving the final result, taking rounds.
We turn our attention to proving \IfAppendixLABEL:\next ( (Hopset Construction).)Theorem B.19.
Proof of Theorem B.19.
In the model, some preparation is necessary before constructing the desired hopset.
Initialization: We initialize the hopset and denote by the carrier configuration holding it. We initialize with arbitrary, hardcoded edges all with infinite weights. While adding arbitrary edges of infinite weight does not affect distances, it ensures that throughout the entire algorithm will contain edges. Further, as no more than are added in the algorithm which follows, will always contain edges, ensuring that the average degree in is always , and thus the maximal number of carrier nodes that each node has is at most .
Whenever a set of edges is added to , it is assumed that if an edge is added from node to node , then also an edge is added in the opposite direction. As such, assume that whenever we add sets of edges to , we then reset to be , by invoking \IfAppendixLABEL:\next ( (Merging).)Lemma B.17 on . Notice that due to \IfAppendixLABEL:\next ( ( Carrier Configuration).)Definition 13, if we set and , we get that is a carrier configuration holding . As the maximal number of carrier nodes each node has in is , these invocations of \IfAppendixLABEL:\next ( (Merging).)Lemma B.17 take rounds.
Construction: We now begin computing the edges of . Initially, we use \IfAppendixLABEL:\next ( (-nearest Algorithm).)Lemma B.20 on , with in order to get a carrier configuration with an edge from each node to its nearest neighbors. This takes rounds. We add the edges from to using \IfAppendixLABEL:\next ( (Merging).)Lemma B.17 in rounds.
Next, we sample nodes , where , by letting every node join independently with probability , ensuring, w.h.p., that each node holds in its distance to at least one node of . We use \IfAppendixLABEL:\next ( (Broadcasting).)Lemma B.5, in rounds, in order to let every node know all of .
We solve the -source detection problem with over the graph , and add the resulting edges to . We need to do this iteratively for iterations. In each iteration, we invoke \IfAppendixLABEL:\next ( (-source detection Algorithm).)Lemma B.21 with in rounds. We have now constructed the hopset , and therefore can create by executing \IfAppendixLABEL:\next ( (Merging).)Lemma B.17 on and , taking rounds, since we assume that , completing the proof. ∎
B.4 SSSP
We begin by showing how to perform Bellman-Ford iterations [25] in the model using carrier configurations. Given a source node in a graph , in a Bellman-Ford iteration , every node in broadcasts to its neighbors , its distance to with at most hops, and then calculates by taking the minimal distance to which it receives from its neighbors in this iteration.
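As a reference point, a minimal centralized sketch of these iterations is given below; the adjacency representation is illustrative, and in the carrier-configuration implementation the per-edge relaxations are performed by the carrier nodes rather than by the endpoints themselves.

```python
# Minimal sketch of h Bellman-Ford iterations from a source s.
def bellman_ford(adj, s, h):
    """adj: dict u -> dict v -> weight; returns d[v] = weight of the best path with <= h hops."""
    nodes = set(adj)
    for nbrs in adj.values():
        nodes |= set(nbrs)
    dist = {v: float("inf") for v in nodes}
    dist[s] = 0
    for _ in range(h):
        new_dist = dict(dist)
        for u, nbrs in adj.items():
            for v, w in nbrs.items():
                # Relax the edge (u, v): a neighbor may offer a shorter path.
                new_dist[v] = min(new_dist[v], dist[u] + w)
        dist = new_dist
    return dist
```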
Lemma B.22 (Bellman-Ford Iterations in ).
Given a (directed or undirected) weighted graph with average degree held in a carrier configuration , and a source node , it is possible in the model, within rounds, w.h.p., to perform iterations of the Bellman-Ford algorithm on with as the source.
Proof of Lemma B.22.
Fix . Node computes , within rounds, using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14. Then, to simulate the -th iteration, node broadcasts to the value , in rounds. Each node , for every edge which stores, sends to the node which stores , the value . Since the average degree in is , and due to \IfAppendixLABEL:\next ( ( Carrier Configuration).)Definition 13, it holds that each sends and receives at most messages in this step, thus taking rounds, using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1. Finally, sets to be the minimum over all the values which nodes received in this iteration, within rounds. We repeat the above process times. ∎
We show how to compute exact SSSP in the model.
Theorem B.23 (Exact SSSP in ).
Given a weighted undirected graph with and , held in a carrier configuration , and a source node , it is possible in the model, within rounds, w.h.p., to ensure that every node knows the value .
Proof of Theorem B.23.
It was proven in [54] that it is possible to solve exact SSSP on a weighted, undirected graph with a source by using the following steps. First, one solves the -nearest problem, for some , and creates the graph by starting with and adding weighted edges from each node to of its closest neighbors, with the weights equal to the weighted distance between the nodes in . Then, one performs Bellman-Ford iterations on with as the source, and it is guaranteed that for each , it holds that .
Thus, in order to solve exact SSSP, we choose the value which balances the number of rounds required for our -nearest and Bellman-Ford implementations (\IfAppendixLABEL:\next ( (-nearest Algorithm).)Lemma B.20 requires , and this holds since we assume that the graph is connected, implying ).
Due to \IfAppendixLABEL:\next ( (-nearest Algorithm).)Lemma B.20, we can solve the -nearest problem in rounds. This gives us a carrier configuration , which for every node holds edges directed away from it to of its nearest nodes. We need to add these edges from to the carrier configuration which holds . Thus, we invoke \IfAppendixLABEL:\next ( (Merging).)Lemma B.17, in order to get a carrier configuration which includes the edges from and from . Notice that \IfAppendixLABEL:\next ( (Merging).)Lemma B.17 always takes at most rounds, and so this fits within our desired complexity.
Finally, we perform Bellman-Ford iterations on . Notice that the average degree in is . Therefore, due to \IfAppendixLABEL:\next ( (Bellman-Ford Iterations in ).)Lemma B.22, this completes within rounds. ∎
We now proceed to showing how to compute an approximation of SSSP in the model.
Theorem B.24 (-Approximation for SSSP in ).
There exists an algorithm in the model, such that given a weighted, undirected input graph , with and , held in some carrier configuration , some , and a source , ensures that each node knows a -approximation to its distance from . The round complexity of this algorithm is , w.h.p.
Proof of Theorem B.24.
We construct a -hopset by using \IfAppendixLABEL:\next ( (Hopset Construction).)Theorem B.19 on , in rounds, and obtain held in a carrier configuration . Due to the definition of , for every , it holds that . Notice that since , , and are undirected, for each , the sets , hold the same edges, and so it is irrelevant which one of them we use. Thus, we denote from here on.
We now perform Bellman-Ford iterations on , in order to ensure that every node knows and as such a -approximation to . To do so, we invoke \IfAppendixLABEL:\next ( (Bellman-Ford Iterations in ).)Lemma B.22 on , which is held in , and the source , requiring rounds, since . ∎
Finally, as our goal is to simulate our SSSP approximation algorithm in other distributed models directly, we provide the following wrapper statement which receives as input a graph where each node knows its neighbors, instead of a graph held in a carrier configuration.
Theorem B.25 (-Approximation for SSSP in (Wrapper)).
There exists an algorithm in the model, such that given a weighted, undirected input graph , with and , where each node knows all the edges incident to it, and the communication tokens of all of its neighbors in , some , and a source , ensures that each node knows a -approximation to its distance from . The round complexity of this algorithm is , where is the maximal degree in the graph, w.h.p.
Proof of Theorem B.25.
We wish to invoke \IfAppendixLABEL:\next ( (-Approximation for SSSP in ).)Theorem B.24, yet the main hurdle in our way is that it requires a graph with at least edges. Therefore, we build such that , and for each . We call a node with degree less than a low-degree node. Each low-degree node meets nodes which are not its neighbors in , and adds edges with infinite weight to those nodes. To meet new nodes, each low-degree node sends its identifier and communication token to random nodes, and each node which received the communication token of responds with its identifier and communication token. By \IfAppendixLABEL:\next ( (Sampling Unique Elements).)Lemma B.26, a low-degree node meets at least unique nodes which are not its original neighbors in , w.h.p. As such, connects itself with edges to these nodes which it meets. Notice that each node was sampled at most times w.h.p. Thus, the maximum degree, , in is . The number of edges added is , w.h.p., implying that the number of edges in is .
We initialize a carrier configuration from , using \IfAppendixLABEL:\next ( (Initialize Carrier Configuration).)Lemma B.13 in rounds. Then, we invoke \IfAppendixLABEL:\next ( (-Approximation for SSSP in ).)Theorem B.24 in rounds w.h.p. ∎
Lemma B.26 (Sampling Unique Elements).
Let be some constant. Let be a set of elements. Let be a set of at most bad elements. Denote by the set of good elements. Let be a number of required good elements. Let be a sequence of length at least elements, where each element is sampled independently uniformly and randomly from the set . There are more than unique good elements in the sequence with probability at least .
Proof of Lemma B.26.
We upper bound the number of sequences which contain or fewer distinct good elements. There are subsets of good elements which may appear. For each of these subsets there are sequences. Notice that some sequences are counted multiple times; however, since we only need an upper bound, this suffices. Thus, the number of bad sequences is upper bounded by:
for , which satisfies . Solving this condition for results in for large enough n.
Since the total number of sequences is , and all sequences are obtained with the same probability, the probability of obtaining a bad sequence is upper bounded by . ∎
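As an illustration of the lemma (not part of the proof), the following small simulation samples uniformly with replacement and counts the distinct good elements obtained; all parameters here are made up.

```python
import random

# Illustrative simulation: sampling uniformly with replacement from n elements,
# of which at most bad_count are "bad", a moderately long sequence already
# contains many distinct good elements in every trial.
def unique_good(n, bad_count, sequence_length, trials=1000):
    good = set(range(bad_count, n))            # elements >= bad_count are "good"
    counts = []
    for _ in range(trials):
        sample = [random.randrange(n) for _ in range(sequence_length)]
        counts.append(len(good.intersection(sample)))
    return min(counts)

# With n = 1000, 100 bad elements and 400 samples, every trial typically
# yields well over 200 distinct good elements.
print(unique_good(1000, 100, 400))
```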
B.5 -SSP and APSP
We further show results pertaining to approximating distances from more than one source.
Theorem B.27 (-Approximation for -SSP in ).
There exists an algorithm in the model, such that given a weighted, undirected input graph with and , held in a carrier configuration , some , and a set of sources , , outputs:
-
1.
A directed graph held in a carrier configuration , where contains an edge from every node to every node , where the weight of the edge maintains . Notice that it can be the case that .
-
2.
Every node , knows a -approximation for for every .
The round complexity of this algorithm is , w.h.p.
Proof of Theorem B.27.
This proof is split into three parts.
First, we construct a -hopset by using \IfAppendixLABEL:\next ( (Hopset Construction).)Theorem B.19 on , in rounds, and obtain held in a carrier configuration . Due to the definition of , for every , it holds that .
Next, we make the IDs of the nodes in globally known within rounds, using \IfAppendixLABEL:\next ( (Broadcasting).)Lemma B.5. This enables us to invoke \IfAppendixLABEL:\next ( (-source detection Algorithm).)Lemma B.21 on with , , and , creating the carrier configuration which is described in the statement of this theorem. This requires
rounds.
Finally, to satisfy the second guarantee, node learns all the edges held in , using \IfAppendixLABEL:\next ( (Learn Carried Information).)Lemma B.15 within rounds.
∎
Theorem B.28 (-Approximation for Scattered APSP in ).
There exists an algorithm in the model, such that given a weighted, undirected input graph with and , held in a carrier configuration , and some , solves the -Approximate Scattered APSP problem (\IfAppendixLABEL:\next ( (Scattered APSP).)Definition 4) on .
That is, the algorithm ensures that for every , there exist nodes , (potentially ), which each know a approximation to , and node knows the identifier and communication token of node , while node knows the identifier and communication token of .
Further, for a given node , the following hold:
-
1.
The set contains at most unique nodes.
-
2.
Node can compute a string of bits, , such that using , for any , it is possible to determine such that .
-
3.
Denote s.t. . It holds that .
The round complexity of this algorithm is , w.h.p.
The outline of the proof breaks into two parts – initialization and reshuffling.
Initialization: First, every node computes of its nearest neighbors, . Next, a -approximation for distances from to a random set of nodes is computed, denoted by .
It holds that for each , w.h.p., , and so we denote by a closest node to in .
Reshuffling: Due to \IfAppendixLABEL:\next ( (APSP using -nearest and MSSP).)Claim A.3, for every two nodes , it holds that is a approximation to . As is undirected, both and can be used, and so we work with . Thus, we desire a state where for every two nodes , there exists a node which knows and , and whose identifier is known to , and a node which knows and and whose identifier is known to , concluding the proof.
Proof of Theorem B.28.
Initialization: Invoke \IfAppendixLABEL:\next ( (-nearest Algorithm).)Lemma B.20 on , with , within rounds, to get a directed graph held in a carrier configuration , where contains an edge from every node to every node with weight , where is a set of closest nodes to . Using \IfAppendixLABEL:\next ( (Learn Carried Information).)Lemma B.15, in rounds, every node itself knows all the distances to the nodes in .
Then, a random set of nodes is selected, by letting each node join with probability , and are broadcast, using \IfAppendixLABEL:\next ( (Broadcasting).)Lemma B.5 within rounds. As seen in \IfAppendixLABEL:\next ( (APSP using -nearest and MSSP).)Claim A.3, it holds that for each , w.h.p., , and so we denote by a closest node to in . Node knows , since it knows the distances to all the nodes in . Finally, invoke \IfAppendixLABEL:\next ( (-Approximation for -SSP in ).)Theorem B.27 on ,
using as the source set, to compute a approximation for distances from to all of , denoted by , and requiring rounds. As per the specifications of Theorem B.27, is stored in a carrier configuration as edges in a directed graph , where for each , there is an edge with weight .
Reshuffling: For every node , denote by . Notice that does not know the set .
First, we compute all the values, , at once and make them known to all the nodes in , using \IfAppendixLABEL:\next ( (Aggregation).)Corollary B.6 within rounds.
Base Case (The Set ): Denote by nodes with . Since the values are globally known, every node knows which nodes are in . Fix a node . Every node sends to the following values: (1) the identifier , (2) the value , and (3) the communication token of . Node knows all of these values, and also knows the communication token of (as are broadcast initially), and so node can send these three messages to node . This requires rounds using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1. Node broadcasts the information it receives to , in rounds, using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14. Observe some node . Notice that due to Item 4, and due to the fact that there is an edge in from every node in to , then there exists some interval , such that node knows the values . Node sends to each the values and , using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1. This is possible as knows (since broadcast the information it received, including , to ), and takes rounds (as the average degree in is , and each node has edges directed towards it in ), as sends messages and every node in receives messages.
Fix some where ; we claim that the output of the theorem is satisfied for . Observe that given any , there exists some node which both knows and , and further knows which node is, as it is the node such that . Finally, notice that we also satisfy that and s.t. contain unique nodes, and that it is possible to condense into bits the information describing which node in is , for any , as this depends only on the intervals which each node in holds. Thus, all the conditions of the statement we are proving are satisfied for the case of .
Iterative Case (Sets ): We proceed in iterations. Fix the iteration counter . Denote by , the set of nodes with . Notice that , and therefore, . Further, the values are globally known, implying that the contents of are globally known, and so all the nodes locally compute an assignment of unique nodes to each .
Fix . The nodes duplicate the information which they hold, so that for each , the nodes will contain the information held in . We use the following observation: Since the graph is connected, for any , the nodes hold exactly values, . Combining this with the average degree in being , gives that . Node selects some node , and sends to the values , in rounds. Then, broadcasts to , in rounds, using \IfAppendixLABEL:\next ( (Carriers Broadcast and Aggregate).)Lemma B.14. Due to the observation above, it also holds that , and so each carrier node in selects a unique carrier node in , and within rounds, and learns all of the information which it holds. Thus, the nodes know all of the information which the nodes know. Then, nodes and selects nodes , and repeat the process where sends information to and sends information to . This process is repeated for iterations, until the information has been spread to all of and their corresponding carrier nodes.
Fix . The set is split into roughly equal-sized parts, . The main challenge is that no node in the graph knows all of , and therefore partitioning into is not trivial. We overcome this final challenge as follows. Every node in observes (defined above as the interval such that knows for all ) and sends each a message asking if it is in . Notice that this is possible since knows the communication tokens of all of , due to Item 4. Since , then sends at most messages. Further, as , each node in the graph receives at most messages, and so all of these messages may be routed in rounds using \IfAppendixLABEL:\next ( (Routing).)Lemma B.1. Thus, the identities of all of are dispersed across the carrier nodes . Similarly to the step above, this information is doubled, in rounds, in order to make sure that for each , the carrier nodes hold the identifiers of as well. Finally, for each (w.l.o.g. nodes in are numbered from 1 to – possible since is globally known), node performs a binary search, by querying its carrier nodes , in order to find the interval of nodes, such that the number of nodes in with identifiers at most is at most , and the number of nodes in with identifiers in the interval is . These nodes form the set . Node broadcasts and to , and then uses \IfAppendixLABEL:\next ( (Learn Carried Information with Predicate).)Lemma B.16 in order to learn the identifiers and communication tokens of , in rounds. Finally, within rounds, node messages the nodes to notify them that they are in , which overcomes the challenge. This concludes the proof of the statement.
Fix , and . The nodes in and repeat the same process as done for the case of by communicating with . That is, each sends to the values: (1) the identifier , (2) the value , and (3) the communication token of . Then, broadcasts this to . As shown for the case of , this requires rounds, and completes the proof.
∎
Appendix C The Model – Missing Proofs
C.1 Preliminaries – Extended Subsection
C.1.1 Communication Primitives
We observe several basic routing claims which are known in the model.
In [7, Theorem 2.2], the following is shown for the weaker model (in this model there are only global edges), and trivially holds in the model.
Claim C.1 (Aggregate and Broadcast).
There is an Aggregate-and-Broadcast Algorithm that solves any Aggregate-and-Broadcast Problem in rounds in the model.
In [49, 8], solutions are presented for the \IfAppendixLABEL:\next ( (Token Dissemination).)Claim C.2 (see [8, Theorem 2.1]) and \IfAppendixLABEL:\next ( (Token Routing).)Claim C.3 (see [49, Theorem 2.2]) problems. Token dissemination is useful for broadcasting, while token routing can be used in a manner closer to unicast.
Definition 18 (Token Dissemination Problem).
The problem of making distinct tokens globally known, where each token is initially known to one node, and each node initially knows at most tokens is called the -Token Dissemination (TD) problem.
Claim C.2 (Token Dissemination).
There is an algorithm that solves -TD in the model in rounds, w.h.p.
The following is discussed in [49] for token routing; we later redefine the problem and remove the strong assumption that each receiver knows the number of messages each sender sends it.
Definition 19 (Token Routing Problem).
The token routing problem is defined as follows. Let be a set of sender nodes and be a set of receiver nodes. Each sender needs to send at most tokens and each receiver needs to receive at most tokens, of size bits each. Each token has a dedicated receiver node , and each receiver knows the senders it must receive a token from and how many tokens it needs to receive from each sender. The token routing problem is solved when all nodes in know all tokens they are the receivers of.
Claim C.3 (Token Routing).
Let be sets of nodes sampled from with probabilities and , for constant , respectively. Let and be the number of tokens to be sent or received by any node in and , respectively. Let be the total workload. The token routing problem can be solved in rounds in the model w.h.p.
The following claim enables sending a polynomial number of messages uniformly at random while obeying the constraints of the model.
Claim C.4 (Uniform Sending).
[8, Lemma 3.1] Presume some model algorithm takes at most rounds for some polynomial . Presume that each round, every node sends at most messages via global edges to targets in sampled independently and uniformly at random. Then there is a such that for sufficiently large , in every round, every node in receives at most messages per round w.h.p.
C.1.2 Skeleton Graph
We use the notion of the skeleton graph presented in [8, 49] and augment it with additional conditions. In particular, its nodes are well spaced in the graph and satisfy the properties of marked nodes stated above.
Definition 20 (Extended Skeleton Graph).
Given a graph and a value , a graph is called a skeleton graph in , if all of the following hold.
-
1.
if and only if there is a path of at most edges between in .
-
2.
Every node knows all its incident edges in .
-
3.
is connected.
-
4.
For any two nodes , .
-
5.
For any two nodes with , there is at least one shortest path from to in , such that any sub-path of with at least nodes contains a node .
-
6.
.
-
7.
For each there is a helper set which satisfies:
-
(a)
.
-
(b)
.
-
(c)
For each node , there are at most nodes such that for each , .
-
(d)
.
-
(a)
In this definition, we merge the properties used by [8, 49], slightly adjust Property 7a and prove Property 7d.
Claim C.5 (Skeleton From Random Nodes).
Given a graph , a value , and a set of nodes marked independently with probability ,
there is an algorithm which constructs a skeleton graph in rounds w.h.p. If also given a single node , it is possible to construct , without damaging the properties of .
Proof of Claim C.5.
Similarly to [8, Algorithm 7] and [49, Algorithm 6], the algorithm for constructing the skeleton graph is to learn the -hop neighborhood and to run [49, Algorithm 1] to compute the helper sets.
We group and slightly extend the claims given in [8, 49]. Properties 3 and 4 hold w.h.p. since is connected; see [8, Lemma 4.3] or [49, Lemma C.2]. Property 5 follows from [8, Lemma 4.2] or [49, Lemma C.1]. Property 6 follows from Chernoff Bounds. The helper sets described in Property 7 are computed using [49, Algorithm 1], and in [49, Lemma 2.2], their Properties 7b and 7c are proven. It is also shown there that, w.h.p., for every it holds that , and thus in an additional rounds of local communication, we select exactly helpers and obtain Property 7a. The remaining Property 7d of the helper sets states that almost all of the nodes in the graph help other nodes. This holds since there are skeleton nodes, each has helpers and each helper helps skeleton nodes, so, by the pigeonhole principle, the overall number of helpers is at least .
For the sake of formality in the following proofs, as some are stated for a set of marked nodes and some for the skeleton graph, we also show the following \IfAppendixLABEL:\next ( (Construct Skeleton).)Corollary C.6.
Corollary C.6 (Construct Skeleton).
Given a graph , and a value , there is an algorithm which constructs a skeleton graph in rounds w.h.p. Further, if also given a single node , it is possible to ensure that without damaging the properties of .
Proof of Corollary C.6.
First, mark each node independently with probability , getting a set of skeleton nodes ; then, using the algorithm from \IfAppendixLABEL:\next ( (Skeleton From Random Nodes).)Claim C.5, it is possible to construct the skeleton graph within rounds w.h.p.
∎
We show several primitives related to communication within skeleton graphs.
We show the following claim which, given a skeleton graph , assigns the nodes unique IDs from the set . This is useful, among other uses, for symmetry breaking and synchronization among the skeleton nodes.
Claim C.7 (Unique IDs).
Given a graph , and a skeleton graph , it is possible to assign the nodes unique IDs from the set within rounds in the model, w.h.p.
Proof of Claim C.7.
We construct a binary tree of depth over the nodes , and then assign each node an ID equal to its index in the pre-order traversal of the tree.
The nodes compute the node with the minimal initial ID (the ID which it has due to the definition of the model). Notice that it is possible to identify and ensure that all nodes in know the identifier of within rounds due to \IfAppendixLABEL:\next ( (Aggregate and Broadcast).)Claim C.1. Further, using \IfAppendixLABEL:\next ( (Aggregate and Broadcast).)Claim C.1, the nodes compute .
Next, node chooses two nodes at random from , nodes , and sends them each a message. Nodes reply each with a random node , respectively, where . Node repeats this process as long as it does not receive two distinct nodes . Node then sends messages to both and lets them know that they are its children in the tree. The nodes added to the tree continue this process, each of them randomly choosing two nodes as its children until it receives two distinct nodes which are not already in the tree, or until some rounds elapsed. Clearly, w.h.p., this process constructs a binary tree of depth within rounds.
Finally, we assign an ordering to the nodes. Each node tells its parent the size of its subtree. That is, the leaves tell their parents that they are leaves, and whenever a node reaches a state where it has heard from all its children, it tells its parent how many nodes are in its subtree. Then, the root of the tree, , begins with the ID palette , takes the first ID for itself, and passes down two contiguous intervals for possible IDs, broken according to the sizes of the subtrees of its children, to its two children nodes – with the left child receiving the interval with smaller IDs. Inductively, each node takes the first ID from the palette it receives from its parent, breaks the palette into two contiguous parts, according to the sizes of the subtrees of its children, and sends the part with smaller IDs to its left child, and the higher part to its right child. Since the depth of the tree is , this completes in rounds, w.h.p. ∎
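A minimal sketch of the ID-assignment step just described, assuming the tree and the subtree sizes are already known; the recursion mirrors how each node splits the ID interval it receives between its two children.

```python
# Sketch of pre-order ID assignment: each node takes the first ID of the
# interval it receives and splits the rest between its children according to
# their subtree sizes.
def assign_preorder_ids(children, subtree_size, root, lo=0):
    """children: dict node -> (left, right), with None for a missing child."""
    ids = {root: lo}
    left, right = children.get(root, (None, None))
    next_lo = lo + 1
    if left is not None:
        ids.update(assign_preorder_ids(children, subtree_size, left, next_lo))
        next_lo += subtree_size[left]
    if right is not None:
        ids.update(assign_preorder_ids(children, subtree_size, right, next_lo))
    return ids
```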
We use the following statement from [18] to prove \IfAppendixLABEL:\next ( (SSSP with Low Average and High Maximal Degrees).)Lemma 4.5.
Lemma C.8 (Reassign Skeletons).
[18, Lemma 29] Given graph , a skeleton graph , a value which is known to all the nodes, and nodes such that each has at least nodes in its neighborhood, there is an algorithm that assigns nodes to , where , such that each node in is assigned to at most nodes in . With respect to the set , it is only required that every node in must know whether or not it itself is in – that is, the entire contents of do not have to be globally known. The algorithm runs in rounds in the model, w.h.p.
The skeleton-based techniques allow us to approximate weighted SSSP fast in the model. After we do it, the following well-known simple reduction allows us to compute approximate weighted diameter.
Claim C.9 (Diameter from SSSP).
(see e.g. [18, Claim 34]) Given a graph , a value and an algorithm which computes an approximation of weighted SSSP in rounds of the model, there is an algorithm which computes a -approximation of the weighted diameter in rounds of the model.
We use the following basic claim regarding usage of skeleton graphs for purposes of distance computations in the model. It is proven in [8].
Claim C.10 (Extend Distances).
[8, Theorem 2.7]Let , let be a skeleton graph, and let be the set of source nodes. If for each source node , each skeleton node knows the -approximate distance such that , then each node can compute for all source nodes , a value such that in rounds.
C.2 Oblivious Token Routing
In [49], they introduce and solve the token routing problem over a skeleton graph, where each receiver knows the number of tokens each sender has for . This is insufficient for our purposes since we work in the complexity realm with skeleton nodes, where we cannot make the identifiers of the skeleton nodes globally known, let alone the number of messages between pairs of nodes. Therefore, we define the following routing problem, in which the receivers know neither the identifiers of the senders nor the number of messages each sender intends to send them.
Definition 21 (Oblivious Token Routing Problem).
The oblivious-token routing problem is defined as follows. Let be a set of sender nodes and be a set of receiver nodes. Each sender needs to send and each receiver needs to receive at most tokens, of size bits each. Each token has a dedicated receiver node , and each sender and receiver know the bound on number of tokens the receiver is going to receive. The oblivious-token routing problem is solved when all nodes in know all tokens they are the receivers of.
Notice that the assumption of knowing a bound on the number of messages each receiver gets is easy to eliminate by having the receiver double its estimate and repeat the algorithm until success, for iterations. To verify whether some particular invocation succeeded, we can make a node broadcast a failure message if it sent or received more than half of its global capacity at some point.
Lemma C.11 (Oblivious Token Routing).
Given a graph , and a skeleton graph , let be an upper bound on the number of tokens to be sent or received by any node in and let be the total workload. The oblivious-token routing problem can be solved in rounds, w.h.p., in the model.
Proof of Lemma C.11.
The problem overcome in [49, Theorem 2.2] (the non-oblivious case) is that even though there are enough helpers near each skeleton node to send and receive all the messages, it is not straightforward to connect senders' helpers with receivers' helpers. Therefore, [49] suggests relaying messages via intermediate receivers. This way, a message is sent by a sender to one of its helpers, by the helper to an intermediate receiver, from there to a helper of the receiver, and from there it is sent to the receiver. To compute intermediate receivers for the message number from to , they apply a pseudo-random hash function .
However, the receiver needs to be able to compute as well, so it needs to know the number of messages it is to receive from each sender , and we cannot assume this for our purposes.
To overcome this limitation, we assign to helper number of the receiver the intermediate receiver whose identifier is computed as , where is a pseudo-random hash function. We deliver the messages in phases. To keep the load balanced between phases, for each message we independently at random sample , which is the phase on which it will be sent. In order to keep the load balanced between the receivers' helpers and intermediate receivers on some phase , for each message we also independently at random sample a receiver's helper index . The intermediate receiver is decided by the hash function , i.e., we route the message with the final receiver via . Unlike [49], we apply on arguments that are not necessarily distinct, which could increase the number of conflicts. However, we show that every time all nodes apply , each key is used at most times w.h.p., so due to \IfAppendixLABEL:\next ( (Conflicts).)Claim A.2 the congestion on each intermediate receiver is w.h.p.
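A small sketch of this idea, with SHA-256 standing in for the pseudo-random hash family and all names illustrative: both endpoints can evaluate the same function of the receiver, the helper index, and the phase, so the receiver side never needs to know how many messages each sender has.

```python
import hashlib

# Illustrative sketch: the intermediate receiver of a message depends only on
# (receiver, helper index, phase) and a globally known seed, so it can be
# recomputed by anyone without per-sender knowledge. The hash family here is a
# placeholder for the k-wise independent family used in the algorithm.
def intermediate_receiver(receiver_id, helper_index, phase, num_nodes, seed):
    """Deterministic, globally computable choice of an intermediate receiver."""
    key = f"{seed}:{receiver_id}:{helper_index}:{phase}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % num_nodes

# Both a sender's helper and the matching receiver's helper evaluate the same
# function, so they agree on where the message is relayed in each phase.
```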
The pseudo-code is provided by Algorithm 1.
Notice that each node can play five different roles: it could be a sender , a receiver , a sender’s helper , a receiver’s helper and an intermediate receiver . Moreover, it can be a sender’s or a receiver’s helper for up to nodes. We show that it can be an intermediate receiver for receiver’s helpers w.h.p.
First, all nodes sample a globally known pseudo-random hash function from the family of -wise independent random functions , which is used to compute the intermediate receivers for each message (Algorithms 1 and 1). For this, by \IfAppendixLABEL:\next ( (Seed).)Claim A.1, bits of a globally known seed are enough, and the node with the minimal identifier samples and broadcasts them using \IfAppendixLABEL:\next ( (Aggregate and Broadcast).)Claim C.1. Afterwards, each sender distributes the tokens among its helpers in a balanced manner: each sender’s helper is assigned at most messages to send (Algorithm 1). Each receiver enumerates its helpers by identifiers (Algorithm 1). Each sender’s helper , for each message it has to send, samples a random phase and a random receiver’s helper index (Algorithm 1).
We then proceed for phases. In phase , each sender’s helper sends each message for which it sampled to the node . Afterwards, in Algorithm 1, each receiver’s helper , for each receiver it helps, sends to , where is the index of in computed in Algorithm 1. Each intermediate receiver sends all messages it received with destination to the from which it received .
Algorithm 1 takes rounds by \IfAppendixLABEL:\next ( (Aggregate and Broadcast).)Claim C.1, and Algorithms 1, 1 and 1 are implemented using local edges in rounds. There are iterations of the loop in Line 1, and we argue that each of them requires rounds of communication via global edges w.h.p. Overall, the complexity is rounds.
For each of the messages designated to some receiver , the phase number is sampled independently with probability ; therefore, by a Chernoff Bound, there are messages with as the final destination which are sent in the -th phase w.h.p. In the -th phase, a receiver’s helper index for each of these messages is sampled with probability ; therefore, by a Chernoff Bound, it is sampled times w.h.p. By a union bound over all phases, receivers, and receivers’ helper indices, in each phase, for each receiver, each receiver’s helper index is selected times w.h.p. Thus, by \IfAppendixLABEL:\next ( (Conflicts).)Claim A.2, each is selected as an intermediate receiver times and receives messages in rounds w.h.p. This implies that no message is lost during Algorithm 1 and that Algorithms 1 and 1 take rounds.
Since each node helps at most senders, by Chernoff Bounds each helper sends messages w.h.p. in any given phase in Algorithm 1. Since each node is a helper to at most receivers, Algorithm 1 also takes rounds w.h.p. Similarly, by \IfAppendixLABEL:\next ( (Conflicts).)Claim A.2, since there are distinct pairs of receivers and receivers’ helper indices, w.h.p. each intermediate receiver is assigned to at most receivers’ helpers. Thus, Algorithm 1 also takes rounds w.h.p. ∎
See 4.1
Proof of Claim 4.1.
The claim follows by an invocation of \IfAppendixLABEL:\next ( (Oblivious Token Routing).)Lemma C.11 with parameters resulting in rounds, as required. ∎
C.3 Simulation
We use the following claims from [18] to improve the simulation of the model in the .
Lemma C.12 ( Simulation in ).
[18, Lemma 16] Given a graph , and a skeleton graph , it is possible to simulate one round of the model over within rounds in in the model. That is, within rounds in in the model, any two adjacent nodes in can communicate any amount of data between each other.
Lemma C.13 (Sampled neighbors [18, Lemma 3.1]).
Given is a graph . For a value , there is a value such that the following holds w.h.p.: Let be a subset of nodes sampled uniformly at random from . Then each node with has a neighbor in .
We show how to simulate the model using the model and the model together, and it then follows by \IfAppendixLABEL:\next ( ( Simulation in ).)Theorem 4.2 and \IfAppendixLABEL:\next ( ( Simulation in ).)Lemma C.12 that this can be converted into a simulation in the model. The intuition behind the simulation follows from observing \IfAppendixLABEL:\next ( (Sampled neighbors [18, Lemma 3.1]).)Lemma C.13 – if every node desires to broadcast a single message to the entire graph, then with relatively little bandwidth it is possible to ensure that all nodes above a certain minimal degree will get these messages from all the nodes in the graph. We begin with the simulation of the model in the combined and models.
Lemma C.14 ( Simulation in and ).
Given a graph with average degree , given an algorithm in the model, which runs on in rounds, and given some value , there exists an algorithm which uses rounds of the model and rounds of the model on and simulates on . It is assumed that prior to running , each node has at most bits of input used in , including, potentially, the incident edges of in . Further, it is assumed that the output of each node in is at most bits.
Proof of Lemma C.14.
The outline of the simulation is as follows. We split the graph into high degree nodes, , and low degree nodes, , at a certain cut-off. The key idea is that if every node takes a single message and sends it randomly to nodes in , then every will have at least one neighbor, w.h.p., which hears the message from , for every , due to \IfAppendixLABEL:\next ( (Sampled neighbors [18, Lemma 3.1]).)Lemma C.13. Thus, we choose some subset and assign to each node some node which partially simulates . By partially simulating, we mean that, initially, node tells node all of its input to , and then for each round, node tells what message wants to send in that round, and then sends this message (that it wishes to broadcast) to random nodes. Finally, we are guaranteed that every node in hears all the messages broadcast in the graph, which allows for to internally simulate the local computation which should perform in before the next round. In a sense, when simulates , after each round node knows what message it wants to send in that round of , yet not necessarily other information that it would have learned from other nodes in the graph during that round of . Thus, node might not know its output in . To overcome this, notice that knows the output of in , and due to our assumption in the statement of this theorem, each node outputs at most bits, and so we can simulate another rounds where each will just broadcast its output (ensuring that it itself receives it from ).
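The following toy sketch illustrates one simulated broadcast round under this outline. The data structures and parameters (messages, neighbors, cutoff, spread) are illustrative assumptions rather than the simulation's actual interface; in the real simulation, a low-degree node first hands its message to its simulator over a local edge, and the simulator performs the random global sends on its behalf.

```python
# A toy, self-contained sketch of one simulated broadcast round of the outline
# above: each broadcast message lands on a few uniformly random nodes, and every
# high-degree node then learns, in a single local round, everything that landed
# on itself or on one of its neighbors. All parameters are illustrative only.

import random

def simulate_round(messages, neighbors, degrees, cutoff, spread, n):
    """messages: node -> the single message it broadcasts in this round."""
    landed = {v: [] for v in range(n)}
    for v, msg in messages.items():
        # The message is sent to `spread` uniformly random nodes in the graph.
        for t in random.sample(range(n), spread):
            landed[t].append(msg)
    heard = {}
    for v in range(n):
        if degrees[v] >= cutoff:
            # One local round: a high-degree node collects everything that landed
            # on itself or on a neighbor; w.h.p. this is every broadcast message.
            heard[v] = set(landed[v]) | {m for u in neighbors[v] for m in landed[u]}
    return heard
```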
Initialization
We begin by showing how to initialize the nodes of high degree which simulate those of low degree. The cut-off for being a high or low degree node is . That is, we desire to simulate every node with using a node with degree at least . Observe that since is the average degree, there are edges in the graph. Since the maximal degree is at most , there must be at least nodes with degree at least . Thus, we denote by the nodes in with the highest degree, and are guaranteed that for each , . Notice that it is possible within rounds to count the number of nodes in with degree above a threshold, using \IfAppendixLABEL:\next ( (Aggregation).)Corollary B.6, and thus within rounds it is possible to do a binary search for the degree of the node with highest degree, allowing each node to know whether or not it is in .
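As a concrete illustration of the binary search above, the following sketch finds the largest degree threshold that still admits at least a target number of nodes, given only an assumed count_above aggregate; the primitive's name and interface are illustrative placeholders.

```python
# A minimal sketch of the binary search described above: find the largest
# threshold t such that at least `target` nodes have degree >= t, using only a
# "count nodes with degree above a threshold" aggregate. `count_above` is a
# hypothetical stand-in for the aggregation primitive; the search needs only a
# logarithmic number of such aggregations.

def threshold_by_binary_search(count_above, n, target):
    lo, hi = 0, n - 1                      # degrees lie in [0, n - 1]
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if count_above(mid) >= target:     # enough high-degree nodes remain
            lo = mid
        else:
            hi = mid - 1
    return lo  # every node with degree >= lo knows it belongs to the simulating set
```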
Let . The node now knows that it is in , and thus randomly sends messages, using rounds of the model, containing its ID and communication token in the model. Clearly, w.h.p., every node has received a message from at least one node . Thus, if node needs simulating, that is, , it chooses arbitrarily one of the nodes from which it heard and tells that it should simulate . Denote by the set of nodes which chose to simulate them. Each node , upon receiving , chooses an arbitrary order for and sends back to each its index in that order.
Every node now attempts to learn all the input to of the nodes which it simulates. Notice that now, for every node in , it holds that is at most . Notice that each node has degree , since otherwise it would have opted not to be simulated by any node, implying, by the constraints of this theorem, that has at most bits of input to , and therefore all the nodes desire to send to at most messages. Since synchronized all the nodes in by sending each of them its index in some order of , it is possible to send all this data to in rounds of the model. To do so, assume that every node wishes to send exactly messages to (we can assume this since wants to send at most messages to , and so it can just add extra empty messages at the end). Therefore, since node knows its index in , for some ordering which decided on, it is possible to order all the messages from all of to in such a way that each node knows the indices of its messages, and such that in every round, receives at most messages and no node sends more than messages.
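The following sketch shows one way such an index-based schedule could look, assuming every sender pads its data to exactly k messages and the receiver can absorb cap messages per round; both parameters are illustrative stand-ins for the quantities in the proof.

```python
# A minimal sketch of the index-based schedule described above: once each sender
# knows its index and pads its data to exactly `k` messages, every message gets
# a globally consistent slot, so senders and the receiver agree on the schedule
# without further coordination. `cap` is an assumed per-round receive budget.

def message_slot(sender_index, j, k, cap):
    """Round in which the j-th message of the sender with the given index is sent."""
    global_position = sender_index * k + j   # a consistent total order on all messages
    return global_position // cap            # at most `cap` messages arrive per round
```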
Round Simulation
We now show how to simulate every one of the rounds of the given . That is, we show the two final steps of the simulation: how tells each node in what value to randomly send to nodes across the graph, and how each node in gets all the messages which were sent by all the nodes. The first part is simple – we already saw that node can send a single, unique message to each within rounds of the model. The second part follows from \IfAppendixLABEL:\next ( (Sampled neighbors [18, Lemma 3.1]).)Lemma C.13: for every node (which has ) to receive a single message from some node , it is enough for this node to send the message times to nodes sampled uniformly at random, and for each node to collect the messages received by its neighbors in . Sending messages, each to a uniformly random node, requires rounds of the model, and aggregating the messages from the neighbors requires a single round of the model.
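The following self-contained check illustrates the sampled-neighbors intuition numerically; the constants below are arbitrary illustrations, not the quantities from Lemma C.13.

```python
# A small numerical check of the "sampled neighbors" intuition used above: if a
# message is re-sent to enough uniformly random nodes, a node of sufficiently
# high degree has, w.h.p., a neighbor among the targets and hears the message in
# a single local round. The numbers below are illustrative only.

import random

def hears_message(neighbors, n, repeats):
    """Did some uniformly sampled target land on a neighbor of the observing node?"""
    targets = {random.randrange(n) for _ in range(repeats)}
    return bool(targets & neighbors)

if __name__ == "__main__":
    n, degree, repeats, trials = 10_000, 400, 200, 1_000
    neighbors = set(random.sample(range(n), degree))  # one fixed high-degree node
    hits = sum(hears_message(neighbors, n, repeats) for _ in range(trials))
    print(f"heard the broadcast in {hits}/{trials} trials")
    # Misses occur with probability roughly (1 - degree/n)^repeats, which is
    # tiny once repeats * degree is noticeably larger than n.
```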
Output
It is critical that every node knows its output at the end of the simulation of . This is ensured since we assume in the statement of this theorem that the output of every node in is at most bits. Thus, instead of simulating directly, we simulate an algorithm which is just like , yet is followed by rounds where each node broadcasts its output. Since takes rounds, this simply doubles the round complexity achieved above. Due to the fact that our simulation maintains that each node knows all the messages which it broadcasts during the simulated algorithm, every node will necessarily know its output.
∎
Finally, we show how to use \IfAppendixLABEL:\next ( ( Simulation in and ).)Lemma C.14 for simulating a algorithm in the model.
See 4.3
Proof of Theorem 4.3.
We execute the simulation of \IfAppendixLABEL:\next ( ( Simulation in and ).)Lemma C.14 on and with , in order to obtain an algorithm which simulates on within rounds of the model and rounds of the model on . Due to \IfAppendixLABEL:\next ( ( Simulation in ).)Theorem 4.2, since , it is possible to simulate the rounds of on in rounds of the model on . Likewise, using \IfAppendixLABEL:\next ( ( Simulation in ).)Lemma C.12, it is possible to simulate the rounds of on in rounds of the model on . ∎
C.4 A -Approximation for SSSP
We show the missing proof of how we compute SSSP with low average degree and high maximal degree. See 4.5
Proof of Lemma 4.5.
Let be some node with degree . Observe that such a node exists and can be found and agreed upon by all the nodes of using \IfAppendixLABEL:\next ( (Aggregate and Broadcast).)Claim C.1 within rounds. Denote by , where , an arbitrary subset of the neighbors of . Using \IfAppendixLABEL:\next ( (Token Dissemination).)Claim C.2, it is possible to ensure within rounds that every node in the graph knows which nodes are in . We strive to have the nodes of send all the contents of to the nodes .
We now show how the nodes can learn all of . We show that there is a way to do this in which every node in desires to send and receive at most messages, and therefore due to \IfAppendixLABEL:\next ( (Skeleton Unicast).)Claim 4.1, the routing will complete in rounds.
We start by showing that the nodes even have the bandwidth to receive within at most rounds. Due to \IfAppendixLABEL:\next ( (Skeleton Unicast).)Claim 4.1, each node in can receive messages in rounds, implying that in total can receive messages. Since the average degree in is , then , and if and only if , which holds since we assume .
We now show that each node has the bandwidth to send all its incident edges within at most rounds. Notice that if , then it can clearly do so. Thus, for all other nodes in which have higher degrees, we assign some of the other nodes of in their neighborhood to assist them. Let be the set of all nodes in such that has . We strive to invoke \IfAppendixLABEL:\next ( (Reassign Skeletons).)Lemma C.8 on in order to assign each some nodes, denoted , where are in the -hop neighborhood of in , and where each node in is assigned to at most nodes in . Thus, we must show that for each node , it has in its neighborhood in at least nodes of . Recall that , implying that has at least nodes of in its neighborhood in . We thus strive to show that . Notice that since the average degree in is at most and the minimal degree of a node in is . Thus, , and since if and only if , which is given in the conditions of this statement, we conclude. Therefore, it is possible to invoke \IfAppendixLABEL:\next ( (Reassign Skeletons).)Lemma C.8 and thus we can assume that each node is assigned the nodes defined previously. Now, node distributes its incident edges in to the nodes uniformly, using the local edges of the model, within rounds. Since has at most incident edges in , this means that each node receives at most messages from . Every node takes responsibility for the edges it received from and later forwards them to the nodes . Notice that since each node is assigned to at most nodes in , each takes responsibility for at most messages in total.
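The following is a minimal sketch of the balanced hand-off of a high-degree node's incident skeleton edges to its assigned helpers; all names are illustrative.

```python
# A minimal sketch of the splitting step above: a high-degree skeleton node
# hands its incident skeleton edges to its assigned helpers in a round-robin
# fashion, so no helper is responsible for forwarding more than
# ceil(len(edges) / len(helpers)) of them. Names are illustrative only.

def split_edges_among_helpers(edges, helpers):
    """Return a mapping helper -> list of edges that this helper forwards."""
    assignment = {h: [] for h in helpers}
    for i, e in enumerate(edges):
        assignment[helpers[i % len(helpers)]].append(e)  # round-robin balancing
    return assignment
```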
Finally, notice that we reach a state where every node in wishes to send at most messages to the nodes in ; this is because nodes with at most neighbors in (nodes in ) have at most that many messages, and each node distributed that many messages per node in . Further, as stated above, the total number of messages to send is . Notice that for each message it does not matter which node in receives it, and so for each message we select the target in independently and uniformly at random. The expected number of messages each node in receives is , where the last equality holds since , as seen previously. Since the targets of the messages are independent, by an application of a Chernoff Bound and a union bound over all , the number of messages each node receives is w.h.p. Thus, by Claim 4.1, it is possible to route the messages within rounds w.h.p. This, combined with the fact that the contents of were previously made globally known using \IfAppendixLABEL:\next ( (Token Dissemination).)Claim C.2, allows every node to locally compute to which node in it should deliver each of its messages, in a way such that every node in receives the same number of messages across all the messages being sent. At this point, since every node in desires to send and receive at most messages, it is possible to invoke \IfAppendixLABEL:\next ( (Skeleton Unicast).)Claim 4.1 in order to route all these messages within rounds.
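To illustrate the load-balancing argument, the following toy experiment picks an independent uniform target in B for every message and reports the maximum load; the sizes below are arbitrary illustrations.

```python
# A toy illustration of the dispatch step above: every message independently
# picks a uniformly random target in B, so the per-target load concentrates
# around (total messages) / |B|, as in the Chernoff and union-bound argument.
# The sizes below are illustrative assumptions.

import random
from collections import Counter

def assign_targets(num_messages, targets):
    """Pick an independent uniform target in B for each message."""
    return [random.choice(targets) for _ in range(num_messages)]

if __name__ == "__main__":
    B = list(range(64))                      # stand-in for the receiver set B
    loads = Counter(assign_targets(64 * 100, B))
    print("max load:", max(loads.values()), "expected load per target:", 100)
```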
Finally, since the nodes in now know all of , node can learn all of by learning all the information stored in within rounds. Node can then compute the exact distance from the source to any node . Thus, node desires to tell every node the value of . This is possible since is connected, and thus every node sent at least one message to throughout the above algorithm; hence, it is possible to reverse the direction of the messages sent above in order to ensure that, for each , node can send a unique message to , in the same round complexity as the above algorithm. ∎
Appendix D The Model – Missing Proofs
We begin with some preliminaries for this section.
Claim D.1 ( Routing).
[38, Theorem 1.2][20, Theorem 2] Consider a graph with an identifier assignment such that any node given can compute , and a set of point-to-point routing requests, each given by the identifiers of the corresponding source-destination pair. If each node of is the source and the destination of at most messages, there is a randomized distributed algorithm that delivers all messages in time in the model, w.h.p.
Corollary D.2 (Identifiers).
In the model in time we can compute an ID assignment and other information such that implies , and any vertex , given , can locally compute for any .
Proof of Corollary D.2.
[20, Lemma 4.1] shows how to compute the aforementioned set of identifiers in rounds in the model, where is the diameter of the graph. Since the diameter is at most the mixing time , we also have that the identifiers are computable in rounds w.h.p. ∎
The following claim, whose proof is rather technical, shows how to build a carrier configuration of the -supergraph of the input graph in the model. Let be an assignment of identifiers, s.t. any node can compute using . For an added node of the supergraph, we assume that its old identifier and new identifier are equal and greater than the identifier of any original node . Denote by a globally known simulation assignment, which satisfies , for each new identifier .
Claim D.3 (Build Carrier Configurations in ).
Given is a graph , , and an assignment of new identifiers . Let be s.t. . Assume that for each , node knows the original identifier of and . There is an algorithm that builds a carrier configuration , which holds an -supergraph of . The communication token in for node is a concatenation of , and . The algorithm runs in rounds w.h.p. and ensures that information for the carrier node (carried edges, and communication tokens) is stored in the node , which simulates .
Proof of Claim D.3.
We show how to build the outgoing carrier configuration which holds the -supergraph . The incoming carrier configuration is built similarly and simultaneously.
Representation: The added nodes are represented by the first nodes with lowest identifiers. So the -th added node is simulated by .
Carrier Allocation for Added Nodes: We preallocate outgoing carriers for the added carried nodes . For this we compute using BFS, and set and . Then, we split the outgoing edges of each added node into batches of size at most . Notice that there are at most added batches, which we assign to the original nodes to carry, such that each outgoing carrier node carries a constant number of batches. This assignment is done in terms of . Now each node knows locally for each added node the identifiers of its outgoing carrier nodes . In particular, each outgoing carrier knows which added edges it carries.
Carrier Allocation for Original Nodes: Now, we allocate the part of the outgoing carrier configuration which stores original nodes and edges. Each node samples identifiers () independently and uniformly at random. These are to become its outgoing carriers (Item 1). By Chernoff bounds, there is a constant such that each node is an outgoing carrier for at most nodes w.h.p. (Item 2).
Acquainting: For each assigned (for new nodes) or sampled (for original nodes) outgoing carrier identifier , which belongs to some outgoing carrier , carried node or its representative knows the identifier () of ’s simulating node. Node (or its representative) sends the identifiers , and , and the identifier of the carrier node directly to the simulating node with the new identifier . This requires each node to send at most messages and to receive messages w.h.p. The simulating node responds with the identifiers , . Again, each node sends and receives at most messages w.h.p. Now, each carried node , for each , knows the communication token of , which is the concatenation of , and (Item 1). Also, for each carrier node , ’s simulating node with identifier knows the communication tokens of each carried node whose edges carries (Item 2).
Each carried node sorts its outgoing carrier nodes by their identifiers. It partitions the interval into contiguous sub-intervals, each containing at most identifiers of opposite endpoints of outgoing edges. We assign the -th sub-interval to the -th carrier. For each carrier , we send to its simulating node the boundaries of its interval. Each node sends and receives messages w.h.p. (Items 4 and 5)
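The following sketch shows one way the interval partition could be computed locally by a carried node; the batch size and identifier space are illustrative parameters standing in for the carrier-configuration quantities.

```python
# A minimal sketch of the interval-partition step above: a carried node sorts
# the opposite endpoints of its outgoing edges and cuts the identifier space
# into contiguous sub-intervals, each containing at most `batch` of those
# endpoints; the i-th sub-interval is then handled by the i-th carrier.
# `batch` and `id_space` are illustrative parameters.

def partition_into_intervals(endpoints, batch, id_space):
    """Return a list of (low, high) identifier intervals, one per carrier."""
    endpoints = sorted(endpoints)
    boundaries, low = [], 0
    for i in range(0, len(endpoints), batch):
        chunk = endpoints[i:i + batch]
        # The last interval is stretched to cover the rest of the identifier space.
        high = chunk[-1] if i + batch < len(endpoints) else id_space - 1
        boundaries.append((low, high))
        low = high + 1
    return boundaries
```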
Then, carried node , for each true outgoing edge , sends to its other endpoint the communication token of the outgoing carrier which is assigned to carry the edge . Now, Item 5 is satisfied for original outgoing edges but not for added outgoing edges. This requires sending messages over edges of .
Consider an added outgoing edge , where is an added node and is the original one. Let be the carrier of the outgoing part of the edge and be the carrier of the incoming part of the edge . New identifier is globally known by construction given . Thus, new identifier of the representative of is globally known, as well as the identifier of its simulator . Let be the simulator of the . Node sends to tuple . This requires for each simulating node to send or to receive messages. This makes Item 5 satisfied for added outgoing edges as well.
Communication Tree: Each carried node (or its representative) locally builds a Communication Tree on its outgoing carrier nodes and sends to each node which simulates a carrier node, its parent and children in the tree (Items 6 and 3). Here each node sends messages and receives messages.
Carrier Population: Each node , for each of its outgoing carriers , sends, to the node which simulates , the batch of original edges assigned for to carry, along with the communication tokens of the carriers of the opposite direction of these edges (Items 3 and 4). For this, each node sends messages and receives messages. For the added edges, we send the identifiers of the first and last edges they store. To do so, each node sends a message and receives messages.
Round Complexity: The carrier allocation phase is done locally, thus requires no communication.
In the acquainting phase, we use the routing algorithm from \IfAppendixLABEL:\next ( ( Routing).)Claim D.1 for the problems where each node sends and receives messages and messages. Thus, it requires rounds w.h.p.
Communication tree building requires only invocation of \IfAppendixLABEL:\next ( ( Routing).)Claim D.1, and thus terminates in rounds w.h.p.
For the carrier population phase, each node sends and receives messages, and thus it runs in rounds w.h.p.
The overall complexity is rounds w.h.p. ∎
In the proof of \IfAppendixLABEL:\next ( ( Simulation in ).)Theorem 5.1 we use the following technical claim.
Claim D.4 (Assignment).
Let be integers such that , for each and , where , . There is a partition of into sets , such that for each .
Proof of Claim D.4.
We construct the sets greedily, adding new elements to a set as long as its size is less than . We notice that the total capacity of the sets is enough to hold all elements.
∎
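A minimal sketch of this greedy construction, with illustrative names, is the following.

```python
# A minimal sketch of the greedy construction in the proof of Claim D.4: items
# are poured into the sets one after another, and a set is closed as soon as it
# reaches its capacity. As long as the capacities sum to at least the number of
# items, every item finds a slot. Names are illustrative only.

def greedy_partition(num_items, capacities):
    """Partition item indices 0..num_items-1 into len(capacities) sets."""
    assert sum(capacities) >= num_items, "total capacity must cover all items"
    sets, current = [[] for _ in capacities], 0
    for item in range(num_items):
        while len(sets[current]) >= capacities[current]:
            current += 1                  # this set is full; move to the next one
        sets[current].append(item)
    return sets
```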
See 5.3
Proof of Lemma 5.3.
First, we compute and using a BFS algorithm in rounds. We simulate the algorithm from \IfAppendixLABEL:\next ( (-nearest Algorithm).)Lemma B.20 to solve the -nearest problem (notice that Lemma B.20 works for ) in a carrier configuration, and then we simulate the algorithm from \IfAppendixLABEL:\next ( (Learn Carried Information).)Lemma B.15 to learn the edges stored in the output carrier configuration by the nodes in the model. The simulation in the model is done by the algorithm from \IfAppendixLABEL:\next ( ( Simulation in ).)Theorem 5.1. Notice that the out-degree in the resulting graph is , and we truncate the output of each node to bits before the end of the simulation. In the model, solving -nearest requires rounds, and thus in the model the simulation round complexity is w.h.p.
∎