
Vertex Ordering Problems in Directed Graph Streams

Amit Chakrabarti    Prantar Ghosh    Andrew McGregor    Sofya Vorotnikova

We consider directed graph algorithms in a streaming setting, focusing on problems concerning orderings of the vertices. This includes such fundamental problems as topological sorting and acyclicity testing. We also study the related problems of finding a minimum feedback arc set (edges whose removal yields an acyclic graph), and finding a sink vertex. We are interested in both adversarially-ordered and randomly-ordered streams. For arbitrary input graphs with edges ordered adversarially, we show that most of these problems have high space complexity, precluding sublinear-space solutions. Some lower bounds also apply when the stream is randomly ordered: e.g., in our most technical result we show that testing acyclicity in the p-pass random-order model requires roughly n^{1+1/p} space. For other problems, random ordering can make a dramatic difference: e.g., it is possible to find a sink in an acyclic tournament in the one-pass random-order model using polylog(n) space, whereas under adversarial ordering roughly n^{1/p} space is necessary and sufficient given \Theta(p) passes. We also design sublinear algorithms for the feedback arc set problem in tournament graphs; for random graphs; and for randomly ordered streams. In some cases, we give lower bounds establishing that our algorithms are essentially space-optimal. Together, our results complement the much more mature body of work on algorithms for undirected graph streams.

1 Introduction

While there has been a large body of work on undirected graphs in the data stream model [20], the complexity of processing directed graphs (digraphs) in this model is relatively unexplored. The handful of exceptions include multipass algorithms emulating random walks in directed graphs [22, 15], establishing prohibitive space lower bounds on finding sinks [13] and answering reachability queries [11], and ruling out semi-streaming constant-pass algorithms for directed reachability [12]. This is rather unfortunate given that many of the massive graphs often mentioned in the context of motivating work on graph streaming are directed, e.g., hyperlinks, citations, and Twitter “follows” all correspond to directed edges.

In this paper we consider the complexity of a variety of fundamental problems related to vertex ordering in directed graphs. For example, one basic problem that motivated much of this work is as follows: given a stream consisting of edges of an acyclic graph in an arbitrary order, how much memory is required to return a topological ordering of the graph? (This problem was explicitly raised in an open problems session at the Shonan Workshop "Processing Big Data Streams" (June 5-8, 2017) and generated considerable discussion.) In the offline setting, this can be computed in O(m+n) time using Kahn's algorithm [16] or via depth-first trees [23], but nothing was known in the data stream setting.

We also consider the related minimum feedback arc set problem, i.e., estimating the minimum number of edges (arcs) that need to be removed such that the resulting graph is acyclic. This problem is NP-hard and the best known approximation factor is O(\log n \log\log n) for arbitrary graphs [10], although a PTAS is known in the case of tournaments [18]. Again, nothing was known in the data stream model. In contrast, the analogous problem for undirected graphs is well understood in the data stream model. The number of edges that must be removed to make an undirected graph acyclic is m-n+c, where c is the number of connected components. The number of connected components can be computed in O(n\log n) space by constructing a spanning forest [11, 2].

Previous Work.  Some versions of the problems we study in this work have been considered previously in the query complexity model. For example, Huang et al. [14] consider the "generalized sorting problem" where G is an acyclic graph with a unique topological order. The algorithm is presented with an undirected version of this graph and may query any edge to reveal its direction. The goal is to learn the topological ordering with the minimum number of queries. Huang et al. [14] and Angelov et al. [5] also studied the average case complexity of various problems where the input graph is chosen from some known distribution. Ailon [3] studied the equivalent problem for feedback arc set in tournaments. Note that all these query algorithms are adaptive and do not immediately give rise to small-space data stream algorithms.

Perhaps the relative lack of progress on streaming algorithms for directed graph problems stems from their being considered "implicitly hard" in the literature, a point made in the recent work of Khan and Mehta [19]. Indeed, that work and the also-recent work of Elkin [9] provide the first nontrivial streaming algorithms for computing a depth-first search tree and a shortest-paths tree (respectively) in semi-streaming space, using O(n/\operatorname{polylog} n) passes. Notably, fairly non-trivial work was needed to barely beat the trivial bound of O(n) passes.

Some of our work here applies and extends the work of Guruswami and Onak [12], who gave the first super-linear (in n) space lower bounds in the streaming model for decision problems on graphs. In particular, they showed that solving reachability in n-vertex digraphs using p passes requires n^{1+\Omega(1/p)}/p^{O(1)} space. Via simple reductions, they then showed similar lower bounds for deciding whether a given (undirected) graph has a short s-t path or a perfect matching.

1.1 Results

Problem | Passes | Input Order | Space Bound | Notes
acyc | 1 | | \Theta(n^{2}) |
acyc | p | | n^{1+\Omega(1/p)}/p^{O(1)} |
mult. approx. fas-size | 1 | | \Theta(n^{2}) |
mult. approx. fas-size | p | | n^{1+\Omega(1/p)}/p^{O(1)} |
topo-sort | 1 | | \Theta(n^{2}) |
topo-sort | p | | n^{1+\Omega(1/p)}/(p+1)^{O(1)} |
mult. approx. fas | 1 | | \Theta(n^{2}) |
mult. approx. fas | p | | n^{1+\Omega(1/p)}/(p+1)^{O(1)} |
stconn-dag | p | random | n^{1+\Omega(1/p)}/p^{O(1)} | error probability 1/p^{\Omega(p)}
acyc | p | random | n^{1+\Omega(1/p)}/p^{O(1)} | error probability 1/p^{\Omega(p)}
mult. approx. fas-size | p | random | n^{1+\Omega(1/p)}/p^{O(1)} | error probability 1/p^{\Omega(p)}
topo-sort | p | random | n^{1+\Omega(1/p)}/(p+1)^{O(1)} | error probability 1/p^{\Omega(p)}
mult. approx. fas | p | random | n^{1+\Omega(1/p)}/(p+1)^{O(1)} | error probability 1/p^{\Omega(p)}
(1+\varepsilon)-approx. fas-t | 1 | | \widetilde{O}(\varepsilon^{-2}n) | exp. time post-processing
3-approx. fas-t | p | | \widetilde{O}(n^{1+1/p}) |
acyc-t | 1 | | \widetilde{O}(n) |
acyc-t | p | | \Omega(n/p) |
sink-find-t | 2p-1 | | \widetilde{O}(n^{1/p}) |
sink-find-t | p | | \Omega(n^{1/p}/p^{2}) |
sink-find-t | 1 | random | \widetilde{O}(1) |
topo-sort | 1 | random | \widetilde{O}(n^{3/2}) | random DAG + planted path
topo-sort | O(\log n) | | \widetilde{O}(n^{4/3}) | random DAG + planted path
(1+\varepsilon)-approx. rank-aggr | 1 | | \widetilde{O}(\varepsilon^{-2}n) | exp. time post-processing
Table 1: Summary of our algorithmic and space lower bound results. These problems are defined in Section 1.2. The input stream is adversarially ordered unless marked as “random” above. Besides the above results, we also give an oracle (query complexity) lower bound in Section 3.4.

Arbitrary Graphs.  To set the stage, in Section 2 we present a number of negative results for the case when the input digraph can be arbitrary. In particular, we show that there is no one-pass sublinear-space algorithm for such fundamental digraph problems as testing whether an input digraph is acyclic, topologically sorting it if it is, or finding its feedback arc set if it is not. These results set the stage for our later focus on specific families of graphs, where we can do much more, algorithmically.

For our lower bounds, we consider both arbitrary and random stream orderings. In Section 2.1, we concentrate on the arbitrary ordering and show that checking whether the graph is acyclic, finding a topological ordering of a directed acyclic graph (DAG), or computing any multiplicative approximation of feedback arc set requires \Omega(n^{2}) space in one pass. The lower bound extends to n^{1+\Omega(1/p)}/p^{O(1)} when the number of passes is p \geq 1. In Section 2.2, we show that essentially the same bound holds even when the stream is randomly ordered. This strengthening is one of our more technically involved results and it is based on generalizing a fundamental result by Guruswami and Onak [12] on s-t connectivity in the multi-pass data stream model.

As a by-product of our generalization, we also obtain the first random-order super-linear (in n) lower bounds for the undirected graph problems of deciding (i) whether there exists a short s-t path and (ii) whether there exists a perfect matching.

Tournaments.  A tournament is a digraph that has exactly one directed edge between each pair of distinct vertices. If we assume that the input graph is a tournament, it is trivial to find a topological ordering, given that one exists, by considering the in-degrees of the vertices. Furthermore, it is known that ordering the vertices by in-degree yields a 5-approximation to feedback arc set [8].

In Section 3, we present an algorithm which computes a (1+\varepsilon)-approximation to feedback arc set in one pass using \widetilde{O}(\varepsilon^{-2}n) space (throughout the paper, \widetilde{O}(f(n)) = O(f(n)\operatorname{polylog} n)). However, in the post-processing step, it estimates the number of back edges for every permutation of vertices in the graph, thus resulting in exponential post-processing time. Despite its "brute force" feel, our algorithm is essentially optimal, both in its space usage (unconditionally) and its post-processing time (in a sense we shall make precise later). We address these issues in Section 3.4. On the other hand, in Section 3.2, we show that with O(\log n) additional passes it is possible to compute a 3-approximation to feedback arc set while using only polynomial time and \widetilde{O}(n) space.

Lastly, in Section 4, we consider the problem of finding a sink in a tournament which is guaranteed to be acyclic. Obviously, this problem can be solved in a single pass using O(n) space by maintaining an "is-sink" flag for each vertex. Our results show that for arbitrary order streams this is tight. We prove that finding a sink in p passes requires \Omega(n^{1/p}/p^{2}) space. We also provide an O(n^{1/p}\log(3p))-space sink-finding algorithm that uses O(p) passes, for any 1 \leq p \leq \log n. In contrast, we show that if the stream is randomly ordered, then using \operatorname{polylog} n space and a single pass is sufficient. This is a significant separation between the arbitrary-order and random-order data stream models.

Random Graphs.  In Section 5, we consider a natural family of random acyclic graphs (see Definition 1.2 below) and present two algorithms for finding a topological ordering of vertices. We show that, for this family, \widetilde{O}(n^{4/3}) space is sufficient to find the best ordering given O(\log n) passes. Alternatively, \widetilde{O}(n^{3/2}) space is sufficient given only a single pass, on the assumption that the edges in the stream are randomly ordered.

Rank Aggregation.  In Section 6, we consider the problem of rank aggregation (formally defined in the next section), which is closely related to the feedback arc set problem. We present a one-pass, \widetilde{O}(\varepsilon^{-2}n)-space algorithm that returns a (1+\varepsilon)-approximation to the rank aggregation problem. The algorithm is very similar to our (1+\varepsilon)-approximation of feedback arc set in tournaments and has the same drawback of using exponential post-processing time.


A summary of these results is given in Table 1.

1.2 Models and Preliminaries

Vertex Ordering Problems in Digraphs.  An ordering of an n-vertex digraph G=(V,E) is a list consisting of its vertices. We shall view each ordering \sigma as a function \sigma\colon V\to[n], with \sigma(v) being the position of v in the list. To each ordering \sigma, there corresponds a set of back edges B_G(\sigma) = \{(v,u) \in E : \sigma(u) < \sigma(v)\}. We say that \sigma is a topological ordering if B_G(\sigma) = \varnothing; such a \sigma exists iff G is acyclic. We define \beta_G = \min\{|B_G(\sigma)| : \sigma is an ordering of G\}, i.e., the size of a minimum feedback arc set for G.
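To make the notation concrete, here is a small illustrative sketch (ours, not from the paper) that computes B_G(\sigma) for an ordering given as a map from vertices to positions; \sigma is a topological ordering exactly when the returned set is empty, and \beta_G is the minimum of its size over all orderings.

# A small illustrative sketch (ours, not from the paper): compute B_G(sigma)
# for an ordering given as a dict vertex -> position.

def back_edges(edges, sigma):
    """Edges (u, v) whose head v is placed before their tail u."""
    return {(u, v) for (u, v) in edges if sigma[v] < sigma[u]}

# Example: every ordering of a directed 3-cycle has at least one back edge.
edges = [(1, 2), (2, 3), (3, 1)]
sigma = {1: 1, 2: 2, 3: 3}
print(back_edges(edges, sigma))  # {(3, 1)}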

We now define the many interrelated digraph problems studied in this work. In each of these problems, the input is a digraph G, presented as a stream of its edges. The ordering of the edges is adversarial unless specified otherwise.

acyc:

Decide whether or not G is acyclic.

topo-sort:

Under the promise that G is acyclic, output a topological ordering of its vertices.

stconn-dag:

Under the promise that G is acyclic, decide whether it has an s-to-t path, these being two prespecified vertices.

sink-find:

Under the promise that G is acyclic, output a sink of G.

fas-size (\alpha-approximation):

Output an integer \hat{\beta} \in [\beta_G, \alpha\beta_G].

fas (\alpha-approximation):

Output an ordering \sigma such that |B_G(\sigma)| \leq \alpha\beta_G.

fas-t:

Solve fas under the promise that G is a tournament. In a similar vein, we define the promise problems acyc-t, topo-sort-t, sink-find-t, and fas-size-t.

For randomized solutions to these problems we shall require that the error probability be at most 1/3.

We remark that the most common definition of the minimum feedback arc set problem in the literature on optimization is to identify a small set of edges whose removal makes the graph acyclic, so fas-size is closer in spirit to this problem than fas. As we shall see, our algorithms will apply to both variants of the problem. On the other hand, lower bounds sometimes require different proofs for the two variants. Since \beta_G = 0 iff G is acyclic, we have the following basic observation.

Observation 1.1.

Producing a multiplicative approximation for any of fas, fas-t, fas-size, and fas-size-t entails solving (respectively) topo-sort, topo-sort-t, acyc, and acyc-t.

For an ordering \pi of a vertex set V, define E^{\pi} = \{(u,v) \in V^{2} : \pi(u) < \pi(v)\}. Define \operatorname{Tou}(\pi) = (V, E^{\pi}) to be the unique acyclic tournament on V consistent with \pi.

As mentioned above, we will also consider vertex ordering problems on random graphs from a natural distribution. This distribution, which we shall call a “planted path distribution,” was considered by Huang et al. [14] for average case analysis in their work on generalized sorting.

Definition 1.2 (Planted Path Distribution).

Let \mathsf{PlantDAG}_{n,q} be the distribution on digraphs on [n] defined as follows. Pick a permutation \pi of [n] uniformly at random. Retain each edge (u,v) in \operatorname{Tou}(\pi) with probability 1 if \pi(v) = \pi(u)+1, and with probability q, independently, otherwise.
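A minimal sampler for this distribution, as we read the definition (an illustrative sketch, not code from the paper):

import random

def plant_dag(n, q, rng=random):
    pi = list(range(1, n + 1))
    rng.shuffle(pi)                          # pi[k] is the vertex in position k+1
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            u, v = pi[i], pi[j]              # u precedes v in the planted order
            if j == i + 1 or rng.random() < q:
                edges.append((u, v))
    return pi, edges

order, edges = plant_dag(8, 0.3)
print(order, len(edges))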

Rank Aggregation.  The feedback arc set problem in tournaments is closely related to the problem of rank aggregation (rank-aggr). Given k total orderings \sigma_1, \ldots, \sigma_k of n objects, we want to find an ordering that best describes the "preferences" expressed in the input. Formally, we want to find an ordering that minimizes \operatorname{cost}(\pi) := \sum_{i=1}^{k} d(\pi, \sigma_i), where the distance d(\pi,\sigma) between two orderings is the number of pairs of objects ranked differently by them. That is,

d(\pi,\sigma) := \sum_{a,b\in[n]} \mathbbm{1}\{\pi(a) < \pi(b),\ \sigma(b) < \sigma(a)\},

where the notation \mathbbm{1}\{\phi\} denotes a 0/1-valued indicator for the condition \phi.
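For instance, the following small sketch (ours, for illustration) evaluates d(\pi,\sigma) and the aggregation cost directly from this definition; orderings are given as dicts mapping object to position.

from itertools import combinations

def d(pi, sigma):
    """Number of pairs ordered differently by the two orderings."""
    return sum(1 for a, b in combinations(pi, 2)
               if (pi[a] < pi[b]) != (sigma[a] < sigma[b]))

def cost(pi, sigmas):
    return sum(d(pi, s) for s in sigmas)

pi      = {'a': 1, 'b': 2, 'c': 3}
sigma_1 = {'a': 1, 'b': 2, 'c': 3}
sigma_2 = {'c': 1, 'b': 2, 'a': 3}
print(cost(pi, [sigma_1, sigma_2]))  # 0 + 3 = 3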

In the streaming model, the input to rank-aggr can be given either as a concatenation of k orderings, leading to a stream of length kn, or as a sequence of triples (a,b,i) conveying that \sigma_i(a) < \sigma_i(b), leading to a stream of length k\binom{n}{2}. Since we want the length of the stream to be polynomial in n, we assume k = n^{O(1)}.

Lower Bounds through Communication Complexity.  Space lower bounds for data streaming algorithms are most often proven via reductions from standard problems in communication complexity. We recall two such problems, each involving two players, Alice and Bob. In the index_N problem, Alice holds a vector \mathbf{x}\in\{0,1\}^{N} and Bob holds an index k\in[N]: the goal is for Alice to send Bob a message allowing him to output x_k. In the disj_N problem, Alice holds \mathbf{x}\in\{0,1\}^{N} and Bob holds \mathbf{y}\in\{0,1\}^{N}: the goal is for them to communicate interactively, following which they must decide whether \mathbf{x} and \mathbf{y} are disjoint, when considered as subsets of [N], i.e., they must output \neg\bigvee_{i=1}^{N} x_i \wedge y_i. In the special case disj_{N,s}, it is promised that the cardinalities |\mathbf{x}| = |\mathbf{y}| = s. In each case, the communication protocol may be randomized, erring with probability at most \delta. We shall use the following well-known lower bounds.

Fact 1.3 (See, e.g., [1, 21]).

For error probability \delta = 1/3, the one-way randomized complexity R^{\to}_{1/3}(index_N) = \Omega(N) and the general randomized complexity R_{1/3}(disj_{N,N/3}) = \Omega(N).

Other Notation and Terminology.  We call an edge critical if it lies on a directed Hamiltonian path of length n-1 in a directed acyclic graph. We say an event holds with high probability (w.h.p.) if the probability is at least 1 - 1/\operatorname{poly}(n). Given a graph with a unique total ordering, we say a vertex u has rank r if it occurs in the r-th position in this total ordering.

2 General Digraphs and the Hardness of some Basic Problems

In this section, our focus is bad news. In particular, we show that there is no one-pass sublinear-space algorithm for the rather basic problem of testing whether an input digraph is acyclic, nor for topologically sorting it if it is. These results set the stage for our later focus on tournament graphs, where we can do much more, algorithmically.

2.1 Arbitrary Order Lower Bounds

To begin, note that the complexity of topo-sort-t is easily understood: maintaining in-degrees of all vertices and then sorting by in-degree provides a one-pass O(n\log n)-space solution. However, the problem becomes maximally hard without the promise of a tournament.

Theorem 2.1.

Solving topo-sort in one pass requires \Omega(n^{2}) space.

Proof.

We reduce from index_N, where N = p^{2} for a positive integer p. Using a canonical bijection from [p]^{2} to [N], we rewrite Alice's input vector as a matrix \mathbf{x} = (x_{ij})_{i,j\in[p]} and Bob's input index as (y,z) \in [p]^{2}. Our reduction creates a graph G=(V,E) on n = 4p vertices: the vertex set V = L^{0} \uplus R^{0} \uplus L^{1} \uplus R^{1}, where each |L^{b}| = |R^{b}| = p. These vertices are labeled, with \ell^{0}_{i} being the i-th vertex in L^{0} (and similarly for r^{0}_{i}, \ell^{1}_{i}, r^{1}_{i}).

Based on their inputs, Alice and Bob create streams of edges by listing the following sets:

E_{\mathbf{x}} = \{(\ell^{b}_{i}, r^{b}_{j}) : b\in\{0,1\},\ i,j\in[p],\ x_{ij}=b\}, \quad E_{yz} = \{(r^{0}_{z}, \ell^{1}_{y}),\ (r^{1}_{z}, \ell^{0}_{y})\}.

The combined stream defines the graph G, where E = E_{\mathbf{x}} \cup E_{yz}.

We claim that G is acyclic. In the digraph (V, E_{\mathbf{x}}), every vertex is either a source or a sink. So the only vertices that could lie on a cycle in G are \ell^{0}_{y}, r^{0}_{z}, \ell^{1}_{y}, and r^{1}_{z}. Either (\ell^{0}_{y}, r^{0}_{z}) \notin E or (\ell^{1}_{y}, r^{1}_{z}) \notin E, so there is in fact no cycle even among these four vertices.

Let \sigma be a topological ordering of G. If x_{yz} = 0, then we must, in particular, have \sigma(\ell^{0}_{y}) < \sigma(\ell^{1}_{y}); else we must have \sigma(\ell^{1}_{y}) < \sigma(\ell^{0}_{y}). Thus, by simulating a one-pass algorithm \mathcal{A} on Alice's stream followed by Bob's stream, consulting the ordering \sigma produced by \mathcal{A} and outputting 0 iff \sigma(\ell^{0}_{y}) < \sigma(\ell^{1}_{y}), the players can solve index_N. It follows that the space used by \mathcal{A} must be at least R^{\to}_{1/3}(index_N) = \Omega(N) = \Omega(p^{2}) = \Omega(n^{2}). ∎
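For concreteness, a small sketch (our own illustration of this reduction, with hypothetical vertex labels) of the two streams; Alice's stream depends only on \mathbf{x} and Bob's only on (y,z):

# Vertices are ('l', b, i) for the i-th vertex of L^b and ('r', b, j) for the
# j-th vertex of R^b. This is an illustration of the construction above, not
# code from the paper.

def alice_stream(x, p):
    # one edge per pair (i, j): it lives in layer b = x[i][j]
    return [(('l', x[i][j], i), ('r', x[i][j], j))
            for i in range(p) for j in range(p)]

def bob_stream(y, z):
    return [(('r', 0, z), ('l', 1, y)), (('r', 1, z), ('l', 0, y))]

# Bob outputs 0 iff ('l', 0, y) precedes ('l', 1, y) in the topological
# ordering returned by the streaming algorithm on the combined stream.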

For our next two results, we use reductions from stconn-dag. It is a simple exercise to show that a one-pass streaming algorithm for stconn-dag requires \Omega(n^{2}) space. Guruswami and Onak [12] showed that a p-pass algorithm requires n^{1+\Omega(1/p)}/p^{O(1)} space. (Although their paper states the lower bound for s-t connectivity in general digraphs, their proof in fact shows the stronger result that the bound holds even when restricted to DAGs.)

Proposition 2.2.

Solving acyc requires \Omega(n^{2}) space in one pass and n^{1+\Omega(1/p)}/p^{O(1)} space in p passes.

Proof.

Given a DAG G and specific vertices s, t, let G' be obtained by adding the edge (t,s) to G. Then G' is acyclic iff G has no s-to-t path. By the discussion above, the lower bounds on acyc follow. ∎

Corollary 2.3.

Solving topo-sort in p passes requires n^{1+\Omega(1/p)}/(p+1)^{O(1)} space.

Proof.

Given a p-pass S-space algorithm A for topo-sort, we can obtain a (p+1)-pass (S + O(n\log n))-space algorithm for acyc as follows. Run algorithm A, store the ordering it outputs, and in another pass, check whether the ordering induces any back edge. If it does, we output NO; otherwise, we output YES. In case of any runtime error, we return NO. For correctness, observe that if the input graph G is acyclic, then A returns a valid topological ordering w.h.p. Hence, the final pass does not detect any back edge, and we correctly output YES. In case G is not acyclic, the promise that the input graph for topo-sort is a DAG is violated, and hence, A either raises an error or returns some ordering, which must induce a back edge since G has no topological ordering. Thus, we correctly return NO in this case. Finally, Proposition 2.2 implies that S + O(n\log n) \geq n^{1+\Omega(1/p)}/(p+1)^{O(1)}, i.e., S \geq n^{1+\Omega(1/p)}/(p+1)^{O(1)}. ∎

Corollary 2.4.

A multiplicative approximation algorithm for either fas-size or fas requires \Omega(n^{2}) space in one pass. In p passes, such approximations require n^{1+\Omega(1/p)}/p^{O(1)} space and n^{1+\Omega(1/p)}/(p+1)^{O(1)} space, respectively.

Proof.

This is immediate from Observation 1.1, Theorem 2.1, Proposition 2.2, and Corollary 2.3. ∎

2.2 Random Order Lower Bounds

We consider the stconn-dag, acyc, and fas problems in a uniformly randomly ordered digraph stream. Recall that for adversarially ordered streams, these problems require about n^{1+\Omega(1/p)} space in p passes. The hardness ultimately stems from a similar lower bound for the shortpath-dag problem. In this latter problem, the input is an n-vertex DAG with two designated vertices v_s and v_t, such that either (a) there exists a path of length at most 2p+2 from v_s to v_t or (b) v_t is unreachable from v_s. The goal is to determine which of these is the case.

Our goal in this section is to show that the same lower bound continues to hold under random ordering, provided we insist on a sufficiently small error probability, about 1/p^{\Omega(p)}. We prove this for shortpath-dag. As this is a special case of stconn-dag, a lower bound for shortpath-dag carries over to stconn-dag. Further, by the reductions in Proposition 2.2 and Corollaries 2.3 and 2.4, the lower bounds also carry over to acyc, topo-sort, and fas. We also show a barrier result arguing that this restriction to low error is necessary: for the shortpath-dag problem, if an error probability of at least 2/p! is allowed, then \widetilde{O}(n) space is achievable in p passes.

Our proof uses the machinery of the Guruswami–Onak lower bound for shortpath-dag under an adversarial stream ordering [12]. As in their work, we derive our space lower bound from a communication lower bound for set chasing intersection (henceforth, sci). However, unlike them, we need to prove a “robust” lower bound for sci, in the sense of Chakrabarti, Cormode, and McGregor [7], as explained below. To define sci, we first set up a special family of multilayer pointer jumping problems, described next.

Picture a layered digraph G^{*} with 2k+1 layers of vertices, each layer having m vertices, laid out in a rectangular grid with each column being one layer. From left to right, the layers are numbered -k, -k+1, \ldots, k. Layer 0 is called the mid-layer. The only possible edges of G^{*} are from layer \ell to layer \ell-1, or from layer -\ell to layer -\ell+1, for \ell\in[k] (i.e., edges travel from the left and right ends of the rectangular grid towards the mid-layer). We designate the first vertex in layer -k as v_s and the first vertex in layer k as v_t.

Each vertex not in the mid-layer has exactly t outgoing edges, numbered 1st through t-th, possibly with repetition (i.e., G^{*} is a multigraph). Think of these edges as pointers. An input to one of our communication problems (to be defined soon) specifies the destinations of these pointers. Thus, an input consists of 2mkt tokens, where each token is an integer in [m] specifying which of the m possibilities a certain pointer takes. The pointers emanating from layer \ell of vertices constitute the \ell-th layer of pointers. Our communication games will involve 2k players named P_{-k}, \ldots, P_{-1}, P_{1}, \ldots, P_{k}. We say that P_{\ell} is the natural owner of the portion of the input specifying the \ell-th layer of pointers.

In the sci_{m,k,t} problem, the goal is to determine whether or not there exists a mid-layer vertex reachable from v_s as well as v_t. Consider the communication game where each pointer is known to its natural owner and the players must communicate in k-1 rounds, where in each round they broadcast messages in the fixed order P_{-1}, \ldots, P_{-k}, P_{1}, \ldots, P_{k}. Guruswami and Onak showed that this problem requires total communication \Omega(m^{1+1/(2k)}/(k^{16}\log^{3/2}m)) in the parameter regime t^{2k} \ll m. This almost immediately implies a similar lower bound for shortpath-dag (simply reverse the directions of the pointers in positive-numbered layers), which then translates into a data streaming lower bound along standard lines.

The key twist in our version of the sci problem is that each pointer is allocated to one of the 2k players uniformly at random: thus, most pointers are not allocated to their natural owners. The players have to determine the output of sci, communicating in exactly the same pattern as before, up to a small error probability taken over the protocol's internal randomness as well as the random allocation. This setup potentially makes the problem easier because there is a good chance that the players will be able to "jump two pointers" within a single round. Our main technical result is to show that a lower bound of the form m^{1+\Omega(1/k)} holds despite this. In the terminology of Chakrabarti et al. [7], who lower-bounded a number of communication problems under such random-allocation setups, this is a robust communication lower bound.

Theorem 2.5.

Suppose that t^{2k} = o(m/\operatorname{polylog}(m)) and that protocol \Pi solves sci_{m,k,t} with error \varepsilon < (2k)^{-2k-2} when the input is randomly allocated amongst the 2k players, as described above. Then, \Pi communicates \Omega(m^{1+1/(2k)}/(k^{16}\log^{3/2}m)) bits.

To prove this result, we consider a problem we call mpj-meet_{m,k,t}, defined next (Guruswami and Onak called this problem or∘lpce). Consider an input G^{*} to sci_{m,k,t} and fix an i\in[t]. If we retain only the i-th pointer emanating from each vertex, then for each layer \ell, the \ell-th layer of pointers defines a function f_{\ell,i}\colon [m]\to[m]. Let x_i (respectively, y_i) denote the index of the unique mid-layer vertex reached from v_s (respectively, v_t) by following the retained pointers. Formally,

x_i = f_{-1,i}(f_{-2,i}(\cdots f_{-k,i}(1)\cdots)), \quad y_i = f_{1,i}(f_{2,i}(\cdots f_{k,i}(1)\cdots)).
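In code, each of these values is obtained by chasing one retained pointer per layer; the following fragment (an illustrative sketch with our own representation of the pointer layers as lists) makes this explicit.

# f is a list of length m with f[v] the destination of v's retained pointer.
def chase(layers, start=0):
    v = start
    for f in layers:
        v = f[v]
    return v

# x_i = chase([f_{-k,i}, ..., f_{-1,i}]) and y_i = chase([f_{k,i}, ..., f_{1,i}]);
# the instance meets at i exactly when the two chases end at the same vertex.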

Define a function to be r-thin if every element in its range has at most r distinct pre-images. The instance G^{*} is said to meet at i if x_i = y_i and is said to be r-thin at i if each function f_{\ell,i} is r-thin. The desired output of mpj-meet is

mpj-meet(G^{*}) = \bigvee_{i=1}^{t} \left( \mathbbm{1}\{G^{*} \text{ meets at } i\} \vee \mathbbm{1}\{G^{*} \text{ is not } (C\log m)\text{-thin at } i\} \right),

for an appropriate universal constant C. The corresponding communication game allocates each pointer to its natural owner and asks them to determine the output using the same communication pattern as for sci. Here is the key result about this problem.

Lemma 2.6 (Lemma 7 of Guruswami–Onak [12]).

The (k-1)-round constant-error communication complexity R^{k-1}(mpj-meet_{m,k,t}) = \Omega(tm/(k^{16}\log m)) - O(kt^{2}). ∎

Using this, we prove our main technical result.

Proof of Theorem 2.5.

Based on the \varepsilon-error protocol \Pi for sci_{m,k,t}, we design a protocol \mathcal{Q} for mpj-meet_{m,k,t} as follows. Let G^{*} be an instance of mpj-meet allocated to players as described above. The players first check whether, for some i, G^{*} fails to be r-thin at i, for r := C\log m: this check can be performed in the first round of communication with each player communicating a single bit. If the check passes, the protocol ends with output 1. From now on, we assume that G^{*} is indeed r-thin at each i\in[t].

Using public randomness, the players randomly renumber the vertices in each layer of G^{*}, creating an instance G' of sci. (This step is exactly as in Guruswami–Onak [12]. Formally, each function f_{\ell,i} is replaced by a corresponding function of the form \pi_{\ell,i} \circ f_{\ell,i} \circ \pi_{\ell+1,i}^{-1} (for \ell > 0), for random permutations \pi_{\ell,i}\colon[m]\to[m]. To keep things concise, we omit the full details here.) The players then choose \rho, a random allocation of pointers as in the sci problem. They would like to simulate \Pi on G', as allocated by \rho, but of course they cannot do so without additional communication. Instead, using further public randomness, for each pointer that \rho allocates to someone other than its natural owner, the players reset that pointer to a uniformly random (and independent) value in [m]. We refer to such a pointer as damaged. Since there are 2k players, each pointer is damaged with probability 1 - 1/(2k). Let G'' denote the resulting random instance of sci. The players then simulate \Pi on G'' as allocated by \rho.

It remains to analyze the correctness properties of \mathcal{Q}. Suppose that G^{*} is a 1-instance of mpj-meet. Then there exists i\in[t] such that G^{*} meets at i. By considering the unique maximal paths out of v_s and v_t following only the i-th pointers at each vertex, we see that G^{*} is also a 1-instance of sci. Since the vertex renumbering preserves connectivity, G' is also a 1-instance of sci. With probability (2k)^{-2k}, none of the 2k pointers on these renumbered paths is damaged; when this event occurs, G'' is also a 1-instance of sci. Therefore, \mathcal{Q} outputs 1 with probability at least (2k)^{-2k}(1 - \operatorname{err}(\Pi)) \geq (2k)^{-2k}(1-\varepsilon).

Next, suppose that G^{*} is a 0-instance of mpj-meet. It could be that G^{*} is a 1-instance of sci. However, as Guruswami and Onak show (see the final paragraph of the proof of Lemma 11 in [12]), the random vertex renumbering ensures that \Pr[sci(G') = 1] = o(1). For the rest of the argument, assume that sci(G') = 0. In order to have sci(G'') = 1, there must exist a mid-layer vertex x such that

f_{1,i_1}(f_{2,i_2}(\cdots f_{k,i_k}(1)\cdots)) = x = f_{-1,j_1}(f_{-2,j_2}(\cdots f_{-k,j_k}(1)\cdots))   (1)

for some choice of pointer numbers i_1, \ldots, i_k, j_1, \ldots, j_k \in [t]. We consider three cases.

  • Case 1: None of the pointers in the above list is damaged.  In this case, eq. (1) cannot hold, because sci(G') = 0.

  • Case 2: The layer-1 pointer in the above list is damaged.  Condition on a particular realization of pointers in negative-numbered layers and let x denote the mid-layer vertex reached from v_s by following pointers numbered j_k, \ldots, j_1, as in eq. (1). The probability that the damaged pointer at layer 1 points to x is 1/m. Since this holds for each conditioning, the probability that eq. (1) holds is also 1/m.

  • Case 3: The layer-\ell pointer is damaged, but pointers in layers 1, \ldots, \ell-1 are not, where \ell \geq 2.  Again, condition on a particular realization of pointers in negative-numbered layers and let x be as above. Since the functions f in eq. (1) are all r-thin, the number of vertices in layer \ell-1 that can reach x using only undamaged pointers is at most r^{\ell-1} \leq r^{k-1}. The probability that the damaged pointer at layer \ell points to one of these vertices is at most r^{k-1}/m.

Combining the cases, the probability that eq. (1) holds for a particular choice of pointer numbers i_1, \ldots, i_k, j_1, \ldots, j_k \in [t] is at most r^{k-1}/m. Taking a union bound over the t^{2k} choices, the overall probability \Pr[sci(G'') = 1] < t^{2k} r^{k-1}/m = o(1), for the parameter regime t^{2k} = o(m/\operatorname{polylog}(m)) and r = O(\log m). Therefore, \mathcal{Q} outputs 1 with probability at most \operatorname{err}(\Pi) + o(1) \leq \varepsilon + o(1).

Thus far, we have a protocol \mathcal{Q} that outputs 1 with probability \alpha when mpj-meet(G^{*}) = 0 and with probability \beta when mpj-meet(G^{*}) = 1, where \alpha \leq \varepsilon + o(1) and \beta \geq (2k)^{-2k}(1-\varepsilon). Recall that \varepsilon < (2k)^{-2k-2}, so \beta is bounded away from \alpha. Let \mathcal{Q}' be a protocol where we first toss an unbiased coin: if it lands heads, we output 0 with probability \delta := (\alpha+\beta)/2 and 1 with probability 1-\delta; if it lands tails, we simulate \mathcal{Q}. Then \mathcal{Q}' is a protocol for mpj-meet with error probability at most 1/2 - (\beta-\alpha)/4. By Lemma 2.6, this protocol must communicate \Omega(m^{1+1/(2k)}/(k^{16}\log^{3/2}m)) bits and so must \Pi. ∎

By a standard reduction from random-allocation communication protocols to random-order streaming algorithms, we obtain the following lower bound: the main result of this section.

Theorem 2.7.

For each constant p, a p-pass algorithm that solves shortpath-dag on n-vertex digraphs whose edges are presented in a uniformly random order, erring with probability at most 1/p^{\Omega(p)}, must use \Omega(n^{1+1/(2p+2)}/\log^{3/2}n) bits of space.

Consequently, similar lower bounds hold for the problems stconn-dag, acyc, topo-sort, fas, and fas-size. ∎

This paper is focused on directed graph problems. However, it is worth noting that a by-product of our generalization of the Guruswami–Onak bound to randomly ordered streams is that we also obtain the first random-order super-linear (in n) lower bounds for two important undirected graph problems.

Corollary 2.8.

For each constant p, n^{1+\Omega(1/p)} space is required to solve either of the following problems in p passes, erring with probability at most 1/p^{\Omega(p)}, over a randomly ordered edge stream of an n-vertex undirected graph G:

  • decide whether G contains a perfect matching;

  • decide whether the distance between prespecified vertices v_s and v_t is at most 2p+2.

A Barrier Result.  Notably, Theorem 2.7 applies only to algorithms with a rather small error probability. This is inherent: allowing just a slightly larger error probability renders the problem solvable in semi-streaming space. This is shown in the result below, which should be read as a barrier result rather than a compelling algorithm.

Proposition 2.9.

Given a randomly ordered edge stream of a digraph G, the shortpath-dag problem on G can be solved using \widetilde{O}(n) space and p passes, with error probability at most 2/p!.

Proof.

Recall that we are trying to decide whether or not G has a path of length at most 2p+2 from v_s to v_t. The high-level idea is that thanks to the random ordering, a "Bellman–Ford" style algorithm that grows a forward path out of v_s and a backward path out of v_t is very likely to make more than one step of progress during some pass.

To be precise, we maintain arrays d_s and d_t, each indexed by V. Initialize all entries to \infty, except that d_s[v_s] = d_t[v_t] = 0. During each pass, we use the following logic.

for each edge (x,y) in the stream:
    if d_s[x] + d_t[y] \leq 2p+1: output true and halt
    d_s[y] \leftarrow \min(d_s[y], 1 + d_s[x])
    d_t[x] \leftarrow \min(d_t[x], 1 + d_t[y])

If we complete p passes without any output, then we output false.
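A runnable rendering of this procedure (a sketch; we assume the stream is given as a zero-argument callable that can be replayed once per pass, with vertices numbered 0, ..., n-1):

def shortpath_dag_random_order(stream_pass, n, s, t, p):
    INF = float('inf')
    ds = {v: INF for v in range(n)}; ds[s] = 0
    dt = {v: INF for v in range(n)}; dt[t] = 0
    for _ in range(p):
        for (x, y) in stream_pass():          # one pass over the edge stream
            if ds[x] + dt[y] <= 2 * p + 1:    # a path of length <= 2p+2 exists
                return True
            ds[y] = min(ds[y], 1 + ds[x])
            dt[x] = min(dt[x], 1 + dt[y])
    return False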

If G has no short enough path from v_s to v_t, this algorithm will always output false. So let us consider the other case, when there is a v_s–v_t path \pi of length at most 2p+2. Let vertex z be the midpoint of \pi, breaking ties arbitrarily if needed. The subpaths [v_s, z]_{\pi} and [z, v_t]_{\pi} have lengths q and r, respectively, with q \leq p+1 and r \leq p+1. Notice that if our algorithm is allowed to run for q (resp. r) passes, then d_s[z] (resp. d_t[z]) will settle to its correct value. If both of them settle, then the algorithm correctly outputs true. So, the only nontrivial case is when q, r \in \{p, p+1\}.

Let E_s be the event that the random ordering of the edges in the stream places the edges of [v_s, z]_{\pi} in the exact reverse order of \pi. Let E_t be the event that the random ordering places the edges of [z, v_t]_{\pi} in the exact same order as \pi. If E_s does not occur, then for some two consecutive edges (w,x), (x,y) on [v_s, z]_{\pi}, the stream puts (w,x) before (x,y). Therefore, once d_s[w] settles to its correct value, the following pass will settle not just d_s[x], but also d_s[y]; therefore, after q-1 \leq p passes, d_s[z] is settled. Similarly, if E_t does not occur, then after r-1 \leq p passes, d_t[z] is settled. As noted above, if both of them settle, the algorithm correctly outputs true.

Thus, the error probability is at most \Pr[E_s \vee E_t] \leq \Pr[E_s] + \Pr[E_t] = 1/q! + 1/r! \leq 2/p!, as required. ∎

3 Feedback Arc Set in Tournaments

3.1 Accurate, One Pass, but Slow Algorithm for FAS-T

We shall now design an algorithm for fas-t (that also solves fas-size-t) based on linear sketches for \ell_1-norm estimation. Recall that the \ell_1-norm of a vector \mathbf{x}\in\mathbb{R}^{N} is \|\mathbf{x}\|_1 = \sum_{i\in[N]} |x_i|. A d-dimensional \ell_1-sketch with accuracy parameter \varepsilon and error parameter \delta is a distribution \mathcal{S} over d \times N matrices, together with an estimation procedure \operatorname{Est}\colon\mathbb{R}^{d}\to\mathbb{R} such that

\Pr_{S\leftarrow\mathcal{S}}\big[(1-\varepsilon)\|\mathbf{x}\|_1 \leq \operatorname{Est}(S\mathbf{x}) \leq (1+\varepsilon)\|\mathbf{x}\|_1\big] \geq 1-\delta.

Such a sketch is "stream friendly" if there is an efficient procedure to generate a given column of S and, further, \operatorname{Est} is efficient. Obviously, a stream-friendly sketch leads to a space- and time-efficient algorithm for estimating \|\mathbf{x}\|_1 given a stream of entrywise updates to \mathbf{x}. We shall use the following specialization of a result of Kane et al. [17].

Fact 3.1 (Kane et al. [17]).

There is a stream-friendly d-dimensional \ell_1-sketch with accuracy \varepsilon and error \delta that can handle N^{O(1)} many \pm 1-updates to \mathbf{x}\in\mathbb{R}^{N}, with each update taking O(\varepsilon^{-2}\log\varepsilon^{-1}\log\delta^{-1}\log N) time, with d = O(\varepsilon^{-2}\log\delta^{-1}), and with entries of the sketched vector fitting in O(\log N) bits.

Theorem 3.2.

There is a one-pass algorithm for fas-t that uses O(\varepsilon^{-2} n \log^{2} n) space and returns a (1+\varepsilon)-approximation with probability at least 2/3, but requires exponential post-processing time.

Proof.

Identify the vertex set of the input graph G=(V,E) with [n] and put N = \binom{n}{2}. We index vectors \mathbf{z} in \mathbb{R}^{N} as z_{uv}, where 1 \leq u < v \leq n. Define a vector \mathbf{x}\in\{0,1\}^{N} based on G and vectors \mathbf{y}^{\pi}\in\{0,1\}^{N} for each permutation \pi\colon[n]\to[n] using indicator variables as follows.

x_{uv} = \mathbbm{1}\{(u,v)\in E\}, \quad y^{\pi}_{uv} = \mathbbm{1}\{\pi(u) < \pi(v)\}.

A key observation is that the uv-entry of \mathbf{x} - \mathbf{y}^{\pi} is nonzero iff the edge between u and v is a back edge of G according to the ordering \pi. Thus, |B_G(\pi)| = \|\mathbf{x} - \mathbf{y}^{\pi}\|_1.

Our algorithm processes the graph stream by maintaining an \ell_1-sketch S\mathbf{x} with accuracy \varepsilon/3 and error \delta = 1/(3\cdot n!). By Fact 3.1, this takes O(\varepsilon^{-2} n \log^{2} n) space and O(\varepsilon^{-2}\log\varepsilon^{-1}\,n\log^{2}n) time per edge.

In post-processing, the algorithm considers all n! permutations \pi and, for each of them, computes S(\mathbf{x} - \mathbf{y}^{\pi}) = S\mathbf{x} - S\mathbf{y}^{\pi}. It thereby recovers an estimate of \|\mathbf{x} - \mathbf{y}^{\pi}\|_1 and finally outputs the ordering \pi that minimizes this estimate. By a union bound, the probability that every estimate is (1\pm\varepsilon/3)-accurate is at least 1 - n!\cdot\delta = 2/3. When this happens, the output ordering provides a (1+\varepsilon)-approximation to fas-t by our key observation above. ∎
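The following toy sketch illustrates the structure of this algorithm. It is not the paper's implementation: as a stand-in for the stream-friendly sketch of Kane et al. we use the classic Cauchy sketch, whose row-wise median of |S\mathbf{z}| estimates \|\mathbf{z}\|_1, and we ignore the bit-level space accounting.

import itertools
import numpy as np

def pair_index(u, v, n):                     # index of the pair u < v into [N]
    return u * n - u * (u + 1) // 2 + (v - u - 1)

def fas_t_one_pass(edges, n, d=200, seed=0):
    rng = np.random.default_rng(seed)
    N = n * (n - 1) // 2
    S = rng.standard_cauchy((d, N))          # shared (public-randomness) sketch matrix
    Sx = np.zeros(d)
    for (a, b) in edges:                     # streaming pass: x_{ab} = 1 iff (a,b) in E, a < b
        if a < b:
            Sx += S[:, pair_index(a, b, n)]
    best = None
    for perm in itertools.permutations(range(n)):   # exponential post-processing
        pos = {v: i for i, v in enumerate(perm)}
        Sy = np.zeros(d)
        for u in range(n):
            for v in range(u + 1, n):
                if pos[u] < pos[v]:
                    Sy += S[:, pair_index(u, v, n)]
        est = np.median(np.abs(Sx - Sy))     # estimate of ||x - y^perm||_1
        if best is None or est < best[0]:
            best = (est, perm)
    return best[1]

# Tiny check on the acyclic tournament 0 -> 1 -> 2 (plus 0 -> 2):
print(fas_t_one_pass([(0, 1), (1, 2), (0, 2)], 3))  # (0, 1, 2)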

Despite its “brute force” feel, the above algorithm is essentially optimal, both in its space usage (unconditionally) and its post-processing time (in a sense we shall make precise later). We address these issues in Section 3.4.

3.2 Multiple Passes: FAS-T in Polynomial Time

For a more time-efficient streaming algorithm, we design one based on the KwikSort algorithm of Ailon et al. [4]. This (non-streaming) algorithm operates as follows on a tournament G=(V,E).

  • Choose a random ordering of the vertices: v_1, v_2, \ldots, v_n.

  • Vertex v_1 partitions V into two sub-problems \{u : (u,v_1)\in E\} and \{w : (v_1,w)\in E\}. At this point we know the exact place of v_1 in the ordering.

  • Vertex v_2 further partitions one of these sub-problems. Proceeding in this manner, after v_1, v_2, \ldots, v_i are considered, there are i+1 sub-problems.

  • Continue until all n vertices are ordered.

When v_i is being used to divide a sub-problem, we refer to it as a pivot.

Emulating KwikSort in the Data Stream Model.  We will emulate KwikSort in p passes over the data stream. In each pass, we will consider the action of multiple pivots. Partition v_1, \ldots, v_n into p groups V_1, \ldots, V_p, where V_1 = \{v_1, \ldots, v_{cn^{1/p}\log n}\}, V_2 consists of the next cn^{2/p}\log n vertices in the sequence, and V_j contains the cn^{j/p}\log n vertices coming after V_{j-1}. Here c is a sufficiently large constant. At the end of pass j+1, we want to emulate the effect of the pivots in V_{j+1} on the sub-problems resulting from considering the pivots in V_1 through V_j. In order to do that, in pass j+1, for each vertex v\in V_{j+1}, we store all edges between v and vertices in the same sub-problem as v, where the sub-problems are defined at the end of pass j.
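The following simplified, in-memory sketch (our own; the edge set stands in for the stream and the constants are illustrative) shows this pass structure: in each pass we store only the edges joining that pass's pivots to their own sub-problems, then apply the pivots in order during post-processing.

import math
import random

def streaming_kwiksort(edge_set, n, p, c=3):
    """edge_set: set of directed pairs (u, v) of a tournament on {0, ..., n-1}."""
    order = list(range(n)); random.shuffle(order)        # random pivot sequence
    sizes = [int(c * n ** (j / p) * math.log(n + 2)) + 1 for j in range(1, p + 1)]
    groups, used = [], 0
    for s in sizes:
        groups.append(order[used:used + s]); used += s
    groups[-1].extend(order[used:])                      # every vertex eventually pivots
    blocks = [list(range(n))]                            # sub-problems, kept in output order
    for pivots in groups:                                # one streaming pass per group
        block_of = {v: i for i, blk in enumerate(blocks) for v in blk}
        # the pass: keep only edges joining a pivot to its own sub-problem
        stored = {v: {(a, b) for (a, b) in edge_set
                      if v in (a, b) and block_of[a] == block_of[b] == block_of[v]}
                  for v in pivots}
        for v in pivots:                                 # post-processing for this pass
            i = next(k for k, blk in enumerate(blocks) if v in blk)
            left  = [u for u in blocks[i] if (u, v) in stored[v]]
            right = [u for u in blocks[i] if (v, u) in stored[v]]
            blocks[i:i + 1] = [blk for blk in (left, [v], right) if blk]
    return [v for blk in blocks for v in blk]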

The following combinatorial lemma plays a key role in analyzing this algorithm’s space usage.

Lemma 3.3 (Mediocrity Lemma).

In an n-vertex tournament, if we pick a vertex v uniformly at random, then \Pr[\varepsilon n < d_{in}(v) < (1-\varepsilon)n] \geq 1-4\varepsilon. Similarly, \Pr[\varepsilon n < d_{out}(v) < (1-\varepsilon)n] \geq 1-4\varepsilon. In particular, with probability at least 1/3, v has in/out-degree between n/6 and 5n/6. (The Mediocrity Lemma is tight: consider sets of vertices A, B, C where |A| = |C| = 2\varepsilon n and |B| = (1-4\varepsilon)n. Edges on B do not form any directed cycles. The subgraphs induced by A and C are balanced, i.e., the in-degree equals the out-degree of every vertex (where degrees here are considered within the subgraph). All other edges are directed from A to B, from B to C, or from A to C. Then the vertices with in/out-degrees between \varepsilon n and (1-\varepsilon)n are exactly the vertices in B, and a random vertex belongs to this set with probability 1-4\varepsilon.)

Proof.

Let H be the set of vertices of in-degree at least (1-\varepsilon)n, and let h = |H|. On the one hand, \sum_{v\in H} d_{in}(v) \geq (1-\varepsilon)nh. On the other hand, the edges that contribute to the in-degrees of vertices in H have both endpoints in H, or one endpoint in H and one in V\setminus H. The number of such edges is

\sum_{v\in H} d_{in}(v) \leq \binom{h}{2} + h(n-h) = \frac{1}{2}(2nh - h^{2} - h).

Therefore, (2nh - h^{2} - h)/2 \geq (1-\varepsilon)nh. This implies h < 2\varepsilon n.

Thus, the number of vertices with in-degree at least (1-\varepsilon)n (and out-degree at most \varepsilon n) is h < 2\varepsilon n. By symmetry, the number of vertices with out-degree at least (1-\varepsilon)n (and in-degree at most \varepsilon n) is also less than 2\varepsilon n. Thus, the probability that a random vertex has in/out-degree between \varepsilon n and (1-\varepsilon)n is (n-2h)/n > (n - 2\cdot 2\varepsilon n)/n = 1-4\varepsilon. ∎

Space Analysis.  Let M_j be the maximum size of a sub-problem after pass j. The number of edges collected in pass j+1 is then at most M_j |V_{j+1}|. By Lemma 3.4 (below), this is at most cn^{1+1/p}\log n. Once the post-processing of pass j+1 is done, the edges collected in that pass can be discarded.

Lemma 3.4.

With high probability, M_j \leq n^{1-j/p} for all j.

Proof.

Let M_j^v denote the size of the sub-problem that contains v after the j-th pass. We shall prove that, for each v, \Pr[M_j^v > n^{1-j/p}] \leq 1/n^{10}. The lemma will then follow by a union bound.

Take a particular vertex v. If, before the j-th pass, we already have M_{j-1}^v \leq n^{1-j/p}, there is nothing to prove. So assume that M_{j-1}^v > n^{1-j/p}. Call a pivot "good" if it reduces the size of the sub-problem containing v by a factor of at least 5/6. A random pivot falls in the same sub-problem as v with probability at least n^{1-j/p}/n; when this happens, by the Mediocrity Lemma, the probability that the pivot is good is at least 1/3. Overall, the probability that a pivot is good is at least n^{-j/p}/3.

In the j-th pass, we use cn^{j/p}\log n pivots. If at least \log_{6/5} n of them are good, we definitely have M_j^v \leq n^{1-j/p}. Thus, by a Chernoff bound, for a sufficiently large c, we have

\Pr\left[M^{v}_{j} > n^{1-j/p}\right] \leq \Pr\left[\textup{Bin}\left(cn^{j/p}\log n,\ n^{-j/p}/3\right) < \log_{6/5} n\right] \leq 1/n^{10}. ∎

Theorem 3.5.

There exists a polynomial-time p-pass data stream algorithm using \widetilde{O}(n^{1+1/p}) space that returns a 3-approximation (in expectation) for fas-t.

Proof.

The pass/space tradeoff follows from Lemma 3.4 and the discussion above it; the approximation factor follows directly from the analysis of Ailon et al. [4]. ∎

3.3 A Space Lower Bound

Both our one-pass algorithm and the O(\log n)-pass instantiation of our multi-pass algorithm use at least \Omega(n) space. For fas-size-t, where the desired output is just a number, it is reasonable to ask whether o(n)-space solutions exist. We now prove that they do not.

Proposition 3.6.

Solving acyc-t is possible in one pass and O(n\log n) space. Meanwhile, any p-pass solution requires \Omega(n/p) space.

Proof.

For the upper bound, we maintain the in-degrees of all vertices in the input graph G. Since G is a tournament, the set of in-degrees is exactly \{0, 1, \ldots, n-1\} iff the input graph is acyclic.
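A minimal sketch of this one-pass test (illustration only):

def acyc_t_one_pass(edge_stream, n):
    indeg = [0] * n
    for (_, v) in edge_stream:               # single pass; O(n log n) bits of state
        indeg[v] += 1
    return sorted(indeg) == list(range(n))

print(acyc_t_one_pass([(0, 1), (1, 2), (0, 2)], 3))  # True: acyclic tournament
print(acyc_t_one_pass([(0, 1), (1, 2), (2, 0)], 3))  # False: a 3-cycle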

For the lower bound, we reduce from disj_{N,N/3}. Alice and Bob construct a tournament T on n = 7N/3 vertices, where the vertices are labeled \{v_1, \ldots, v_{2N}, w_1, \ldots, w_{N/3}\}. Alice, based on her input \mathbf{x}, adds edges (v_{2i}, v_{2i-1}) for each i\in\mathbf{x}. For each remaining pair (i,j)\in[2N]\times[2N] with i < j, she adds the edge (v_i, v_j). Let a_1 < \cdots < a_{N/3} be the sorted order of the elements in Bob's set \mathbf{y}. For each k = a_{\ell}\in\mathbf{y}, Bob defines the alias v_{2N+k} = w_{\ell} and then adds the edges

E_k = \{(v_i, v_{2N+k}) : 1 \leq i \leq 2k-1\} \cup \{(v_{2N+k}, v_j) : 2k \leq j \leq 2N\}.

Finally, he adds the edges \{(w_i, w_j) : 1 \leq i < j \leq N/3\}. This completes the construction of T.

We claim that the tournament T is acyclic iff \mathbf{x}\cap\mathbf{y} = \varnothing. The "only if" part is direct from the construction, since if \mathbf{x} and \mathbf{y} intersect at some index k\in[N], we have the directed cycle (v_{2k}, v_{2k-1}, v_{2N+k}, v_{2k}). For the "if" part, let \sigma be the ordering (v_1, \ldots, v_{2N}) and let T' = \operatorname{Tou}(\sigma), as defined in Section 1.2. We show how to modify \sigma into a topological ordering of T, proving that T is acyclic. Observe that, by construction, the tournament T \setminus \{w_1, \ldots, w_{N/3}\} can be obtained from T' by flipping only the edges (v_{2i-1}, v_{2i}) for each i\in\mathbf{x}. Each time we perform such an edge flip, we modify the topological ordering of T' by swapping the two endpoints of the flipped edge. The resulting ordering is still topological, as these two vertices were consecutive in the ordering before the flip. Thus, after performing these swaps, we get a topological ordering of T \setminus \{w_1, \ldots, w_{N/3}\}. Now, consider some k\in\mathbf{y}. Since \mathbf{x}\cap\mathbf{y} = \varnothing, we have k\notin\mathbf{x}, and so v_{2k} succeeds v_{2k-1} in this ordering, just as in \sigma, since we never touched these two vertices while performing the swaps. Thus, for each such k, we can now insert v_{2N+k} between v_{2k-1} and v_{2k} in the ordering and obtain a topological ordering of T. This proves the claim.

Thus, given a p-pass solution to acyc-t using s bits of space, we obtain a protocol for disj_{N,N/3} that communicates at most (2p-1)s bits. By Fact 1.3, (2p-1)s = \Omega(N) = \Omega(n), i.e., s = \Omega(n/p). ∎

Theorem 3.7.

A p-pass multiplicative approximation for fas-size-t requires \Omega(n/p) space.

Proof.

This is immediate from Observation 1.1 and Proposition 3.6. ∎

3.4 An Oracle Lower Bound

Let us now consider the nature of the post-processing performed by our one-pass fas-t algorithm in Section 3.1. During its streaming pass, that algorithm builds an oracle based on G that, when queried on an ordering \sigma, returns a fairly accurate estimate of |B_G(\sigma)|. It proceeds to query this oracle n! times to find a good ordering. This raises the question: is there a more efficient way to exploit the oracle that the algorithm has built? A similar question was asked in Bateni et al. [6] in the context of using sketches for the maximum coverage problem.

If the oracle were exact, i.e., on input \sigma it returned |B_G(\sigma)| exactly, then two queries to the oracle would determine which of (i,j) and (j,i) was an edge in G. It follows that O(n\log n) queries to such an exact oracle suffice to solve fas-t and fas-size-t. However, what we actually have is an \varepsilon-oracle, defined as one that, on query \sigma, returns \hat{\beta}\in\mathbb{R} such that (1-\varepsilon)|B_G(\sigma)| \leq \hat{\beta} \leq (1+\varepsilon)|B_G(\sigma)|. We shall show that an \varepsilon-oracle cannot be exploited efficiently: a randomized algorithm will, with high probability, need exponentially many queries to such an oracle to solve either fas-t or fas-size-t.

To prove this formally, we consider two distributions on n-vertex tournaments, defined next.

Definition 3.8.

Let \mathcal{D}_{\mathrm{yes}} and \mathcal{D}_{\mathrm{no}} be distributions on tournaments on [n] produced as follows. To produce a sample from \mathcal{D}_{\mathrm{yes}}, pick a permutation \pi of [n] uniformly at random and output \operatorname{Tou}(\pi). To produce a sample from \mathcal{D}_{\mathrm{no}}, for each i,j with 1 \leq i < j \leq n, independently at random, include edge (i,j) with probability 1/2; otherwise include edge (j,i).
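For illustration, samplers for the two distributions (our own sketch): \mathcal{D}_{\mathrm{yes}} is a uniformly random acyclic tournament \operatorname{Tou}(\pi), while \mathcal{D}_{\mathrm{no}} orients every pair with a fair coin.

import random

def sample_yes(n, rng=random):
    pi = list(range(n)); rng.shuffle(pi)
    pos = {v: i for i, v in enumerate(pi)}
    return {(u, v) if pos[u] < pos[v] else (v, u)
            for u in range(n) for v in range(u + 1, n)}

def sample_no(n, rng=random):
    return {(u, v) if rng.random() < 0.5 else (v, u)
            for u in range(n) for v in range(u + 1, n)}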

Let \sigma be an ordering of [n]. By linearity of expectation, if T is sampled from either \mathcal{D}_{\mathrm{yes}} or \mathcal{D}_{\mathrm{no}},

\mathbb{E}|B_T(\sigma)| = m := \frac{1}{2}\binom{n}{2}.

In fact, we can say much more.

Lemma 3.9.

There is a constant cc such that, for all ε>0\varepsilon>0, sufficiently large nn, a fixed ordering σ\sigma on [n][n], and random TT drawn from either 𝒟yes\mathcal{D}_{\mathrm{yes}} or 𝒟no\mathcal{D}_{\mathrm{no}},

Pr[(1ε)m<|BT(σ)|<(1+ε)m]12cε2n.\Pr\left[(1-\varepsilon)m<|B_{T}(\sigma)|<(1+\varepsilon)m\right]\geqslant 1-2^{-c\varepsilon^{2}n}\,.
Proof.

When T𝒟noT\leftarrow\mathcal{D}_{\mathrm{no}}, the random variable |BT(σ)||B_{T}(\sigma)| has binomial distribution Bin(2m,12)\textup{Bin}(2m,\frac{1}{2}), so the claimed bound is immediate.

Let T𝒟yesT\leftarrow\mathcal{D}_{\mathrm{yes}}. Partition the edges of the tournament into perfect matchings M1,,Mn1M_{1},\ldots,M_{n-1}. For each i[n1]i\in[n-1], let XiX_{i} be the number of back edges of TT involving MiM_{i}, i.e.,

Xi=|{(u,v)Mi:either (u,v)BT(σ) or (v,u)BT(σ)}|.X_{i}=|\{(u,v)\in M_{i}:\,\text{either }(u,v)\in B_{T}(\sigma)\text{ or }(v,u)\in B_{T}(\sigma)\}|\,.

Notice that XiBin(n/2,12)X_{i}\sim\textup{Bin}(n/2,\frac{1}{2}), whence

\Pr\left[(1-\varepsilon)n/4<X_{i}<(1+\varepsilon)n/4\right]\geqslant 1-2^{-b\varepsilon^{2}n}\,,

for a certain constant b. By a union bound, the probability that all of the X_{i}s lie between these bounds is at least 1-(n-1)2^{-b\varepsilon^{2}n}\geqslant 1-2^{-c\varepsilon^{2}n}, for a suitable constant c. When this latter event happens, we also have (1-\varepsilon)m<|B_{T}(\sigma)|=\sum_{i=1}^{n-1}X_{i}<(1+\varepsilon)m, since m=(n-1)\cdot n/4. ∎
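The proof above partitions the \binom{n}{2} vertex pairs into n-1 perfect matchings. For even n, the classical round-robin (circle-method) construction provides such a partition; a small sketch, included only for concreteness:

```python
# Sketch: the round-robin ("circle method") partition of the pairs of an
# even-sized vertex set {0,...,n-1} into n-1 perfect matchings.

def round_robin_matchings(n):
    assert n % 2 == 0
    others = list(range(1, n))          # vertex 0 stays put; the rest rotate
    matchings = []
    for _ in range(n - 1):
        row = [0] + others
        matchings.append({frozenset((row[i], row[n - 1 - i])) for i in range(n // 2)})
        others = others[-1:] + others[:-1]   # rotate by one position
    return matchings

# Every pair appears in exactly one matching:
ms = round_robin_matchings(8)
assert sum(len(m) for m in ms) == 8 * 7 // 2
assert len(set().union(*ms)) == 8 * 7 // 2
```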

We define a (q,\varepsilon)-query algorithm for a problem P to be one that accesses an input digraph G solely through queries to an \varepsilon-oracle and, after at most q such queries, outputs its answer to P(G). We require this answer to be correct with probability at least \frac{2}{3}.

Now consider the particular oracle 𝒪T,ε\mathcal{O}_{T,\varepsilon}, describing an nn-vertex tournament TT, that behaves as follows when queried on an ordering σ\sigma.

  • If (1ε/2)m<|BT(σ)|<(1+ε/2)m(1-\varepsilon/2)m<|B_{T}(\sigma)|<(1+\varepsilon/2)m, then return mm.

  • Otherwise, return |BT(σ)||B_{T}(\sigma)|.

Clearly, 𝒪T,ε\mathcal{O}_{T,\varepsilon} is an ε\varepsilon-oracle. The intuition in the next two proofs is that this oracle makes life difficult by seldom providing useful information.
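A minimal sketch of this oracle (our code), simulated offline from an explicit edge set; a streaming algorithm would, of course, only hold a sketch of G.

```python
# Sketch of the oracle O_{T,eps} described above.

def make_hard_oracle(edges, n, eps):
    m = n * (n - 1) / 4                       # m = (1/2) * binom(n, 2)
    def oracle(sigma):
        pos = {v: i for i, v in enumerate(sigma)}
        b = sum(1 for (u, v) in edges if pos[v] < pos[u])   # |B_T(sigma)|
        if (1 - eps / 2) * m < b < (1 + eps / 2) * m:
            return m                          # the typical, uninformative answer
        return b
    return oracle
```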

Proposition 3.10.

Every (q,ε)(q,\varepsilon)-query algorithm for topo-sort-t makes exp(Ω(ε2n))\exp(\Omega(\varepsilon^{2}n)) queries.

Proof.

WLOG, consider a (q,ε)(q,\varepsilon)-query algorithm, 𝒜\mathcal{A}, that makes exactly qq queries, the last of which is its output. Using Yao’s minimax principle, fix 𝒜\mathcal{A}’s random coins, obtaining a deterministic (q,ε)(q,\varepsilon)-query algorithm 𝒜\mathcal{A}^{\prime} that succeeds with probability 23\geqslant\frac{2}{3} on a random tournament T𝒟yesT\leftarrow\mathcal{D}_{\mathrm{yes}}. Let σ1,,σq\sigma_{1},\ldots,\sigma_{q} be the sequence of queries that 𝒜\mathcal{A}^{\prime} makes when the answer it receives from the oracle to each of σ1,,σq1\sigma_{1},\ldots,\sigma_{q-1} is mm.

Suppose that the oracle supplied to 𝒜\mathcal{A}^{\prime} is 𝒪T,ε\mathcal{O}_{T,\varepsilon}. Let \mathcal{E} be the event that 𝒜\mathcal{A}^{\prime}’s query sequence is σ1,,σq\sigma_{1},\ldots,\sigma_{q} and it receives the response mm to each of these queries. For a particular σi\sigma_{i},

Pr[𝒪T,ε(σi)=m]=Pr[(1ε/2)m<|BT(σi)|<(1+ε/2)m]12bε2n\Pr[\mathcal{O}_{T,\varepsilon}(\sigma_{i})=m]=\Pr[(1-\varepsilon/2)m<|B_{T}(\sigma_{i})|<(1+\varepsilon/2)m]\geqslant 1-2^{-b\varepsilon^{2}n}

for a suitable constant bb, by Lemma 3.9. Thus, by a union bound, Pr[]1q2bε2n\Pr[\mathcal{E}]\geqslant 1-q2^{-b\varepsilon^{2}n}.

When \mathcal{E} occurs, 𝒜\mathcal{A}^{\prime} must output σq\sigma_{q}, but \mathcal{E} itself implies that |BT(σq)|0|B_{T}(\sigma_{q})|\neq 0, so 𝒜\mathcal{A}^{\prime} errs. Thus, the success probability of 𝒜\mathcal{A}^{\prime} is at most 1Pr[]q2bε2n1-\Pr[\mathcal{E}]\leqslant q2^{-b\varepsilon^{2}n}. Since this probability must be at least 23\frac{2}{3}, we need q232bε2n=exp(Ω(ε2n))q\geqslant\frac{2}{3}\cdot 2^{b\varepsilon^{2}n}=\exp(\Omega(\varepsilon^{2}n)). ∎

Proposition 3.11.

Every (q,ε)(q,\varepsilon)-query algorithm for acyc-t makes exp(Ω(ε2n))\exp(\Omega(\varepsilon^{2}n)) queries.

Proof.

We proceed similarly to Proposition 3.10, except that we require the deterministic (q,ε)(q,\varepsilon)-query algorithm 𝒜\mathcal{A}^{\prime} to succeed with probability at least 23\frac{2}{3} on a random T12(𝒟yes+𝒟no)T\leftarrow\frac{1}{2}(\mathcal{D}_{\mathrm{yes}}+\mathcal{D}_{\mathrm{no}}). We view TT as being chosen in two stages: first, we pick ZR{yes,no}Z\in_{R}\{\mathrm{yes},\mathrm{no}\} uniformly at random, then we pick T𝒟ZT\leftarrow\mathcal{D}_{Z}.

Define σ1,,σq\sigma_{1},\ldots,\sigma_{q} and \mathcal{E} as before. So Pr[]1q2bε2n\Pr[\mathcal{E}]\geqslant 1-q2^{-b\varepsilon^{2}n}. When \mathcal{E} occurs, 𝒜\mathcal{A}^{\prime} must output some fixed answer, either “yes” or “no.” We consider these cases separately.

Suppose that 𝒜\mathcal{A}^{\prime} outputs “no,” declaring that TT is not acyclic. Then 𝒜\mathcal{A}^{\prime} errs whenever Z=Z= yes and \mathcal{E} occurs. The probability of this is at least 12q2bε2n\frac{1}{2}-q2^{-b\varepsilon^{2}n}, but it must be at most 13\frac{1}{3}, requiring q=exp(Ω(ε2n))q=\exp(\Omega(\varepsilon^{2}n)).

Suppose that 𝒜\mathcal{A}^{\prime} outputs “yes” instead. Then it errs when Z=Z= no, TT is cyclic, and \mathcal{E} occurs. Since

Pr[T acyclicZ=no]=n!/2(n2)=exp(Ω(n2)),\Pr[T\text{ acyclic}\mid Z=\text{no}]=n!/2^{\binom{n}{2}}=\exp(-\Omega(n^{2}))\,,

we have \frac{1}{3}\geqslant\Pr[\mathcal{A}^{\prime}\text{ errs}]\geqslant\frac{1}{2}-\exp(-\Omega(n^{2}))-q2^{-b\varepsilon^{2}n}, again requiring q=\exp(\Omega(\varepsilon^{2}n)). ∎

Theorem 3.12.

A (q,ε)(q,\varepsilon)-query algorithm that gives a multiplicative approximation for either fas-t or fas-size-t must make q=exp(Ω(ε2n))q=\exp(\Omega(\varepsilon^{2}n)) queries.

Proof.

This is immediate from 1.1, 3.10 and 3.11. ∎

4 Sink Finding in Tournaments

A classical offline algorithm for topo-sort is to repeatedly find a sink vv in the input graph (which must exist in a DAG), prepend vv to a growing list, and recurse on GvG\setminus v. Thus, sink-find itself is a fundamental digraph problem. Obviously, sink-find can be solved in a single pass using O(n)O(n) space by maintaining an “is-sink” flag for each vertex. Our results below show that for arbitrary order streams this is tight, even for tournament graphs.

In fact, we say much more. In pp passes, on the one hand, the space bound can be improved to roughly O(n2/p)O(n^{2/p}). On the other hand, any pp-pass algorithm requires about Ω(n1/p)\Omega(n^{1/p}) space. While these bounds don’t quite match, they reveal the correct asymptotics for the number of passes required to achieve polylogarithmic space usage: namely, Θ(logn/loglogn)\Theta(\log n/\log\log n).

In contrast, we show that if the stream is randomly ordered, then using polylog(n)\operatorname{polylog}(n) space and a single pass is sufficient. This is a significant separation between the adversarial and random order data streams.

4.1 Arbitrary Order Sink Finding

Theorem 4.1 (Multi-pass algorithm).

For all pp with 1plogn1\leqslant p\leqslant\log n, there is a (2p1)(2p-1)-pass algorithm for sink-find-t that uses O(n1/plog(3p))O(n^{1/p}\log(3p)) space and has failure probability at most 1/31/3.

Proof.

Let the input digraph be G=(V,E)G=(V,E). For a set SVS\subseteq V, let maxS\max S denote the vertex in SS that has maximum in-degree. This can also be seen as the maximum vertex within SS according to the total ordering defined by the edge directions.

Our algorithm proceeds as follows.

  • Initialization: Set s=n1/pln(3p)s=\lceil{n^{1/p}\ln(3p)}\rceil. Let S1S_{1} be a set of ss vertices chosen randomly from VV.

  • For i=1i=1 to p1p-1:

    • During pass 2i12i-1: Find vi=maxSiv_{i}=\max S_{i} by computing the in-degree of each vertex in SiS_{i}.

    • During pass 2i2i: Let Si+1S_{i+1} be a set of ss vertices chosen randomly from {u:(vi,u)E}\{u:\,(v_{i},u)\in E\}.

  • During pass 2p12p-1: Find vp=maxSpv_{p}=\max S_{p} by computing the in-degree of each vertex in SpS_{p}.

For the sake of analysis, consider the quantity \ell_{i}=|\{u:(v_{i},u)\in E\}|, with the convention \ell_{0}=n-1. Note that, for each i\in[p],

\Pr\left[\ell_{i}>\ell_{i-1}/n^{1/p}\right]\leqslant(1-1/n^{1/p})^{s}\leqslant\frac{1}{3p}\,.

Thus, by the union bound, p=0\ell_{p}=0 with probability at least 1p/(3p)=2/31-p/(3p)=2/3. Note that p=0\ell_{p}=0 implies that vpv_{p} is a sink. ∎
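The following is a compact offline simulation of this (2p-1)-pass procedure (our code); each scan over the stored edge list stands in for one streaming pass, and incidental choices such as sampling without replacement are ours.

```python
# Sketch: offline simulation of the (2p-1)-pass sink-finding procedure above.
# Input: the edge list of an acyclic tournament on vertices 0..n-1.
import math, random

def find_sink(edges, n, p):
    s = math.ceil(n ** (1 / p) * math.log(3 * p))
    S = random.sample(range(n), min(s, n))
    v = None
    for i in range(p):
        # Odd-numbered pass: v = max(S). Since the tournament is acyclic,
        # the maximum of S is the vertex with the largest in-degree within S.
        indeg = {u: 0 for u in S}
        for (a, b) in edges:                      # one pass over the stream
            if a in indeg and b in indeg:
                indeg[b] += 1
        v = max(S, key=lambda u: indeg[u])
        if i == p - 1:
            break
        # Even-numbered pass: sample up to s out-neighbours of v.
        out = [b for (a, b) in edges if a == v]   # one pass over the stream
        if not out:
            break                                 # v already has no out-edges
        S = random.sample(out, min(s, len(out)))
    return v
```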

We turn to establishing a multi-pass lower bound. Our starting point for this is the tree pointer jumping problem tpjk,t\textsc{tpj}_{k,t}, which is a communication game involving kk players. To set up the problem, consider a complete ordered kk-level tt-ary tree TT; we consider its root zz to be at level 0, the children of zz to be at level 11, and so on. We denote the ii-th child of yV(T)y\in V(T) by yi{y}_{i}, the jj-th child of yi{y}_{i} by yi,j{y}_{i,j}, and so on. Thus, each leaf of TT is of the form zi1,,ik1{z}_{i_{1},\ldots,i_{k-1}} for some integers i1,,ik1[t]i_{1},\ldots,i_{k-1}\in[t].

An instance of tpjk,t\textsc{tpj}_{k,t} is given by a function ϕ:V(T)[t]\phi\colon V(T)\to[t] such that ϕ(y){0,1}\phi(y)\in\{0,1\} for each leaf yy. The desired one-bit output is

tpjk,t(ϕ)\displaystyle\textsc{tpj}_{k,t}(\phi) :=g(k)(z)=g(g(g(z))),where\displaystyle:=g^{(k)}(z)=g(g(\cdots g(z)\cdots))\,,\text{where}
g(y)\displaystyle g(y) :={ϕ(y),if y is a leaf,yϕ(y),otherwise.\displaystyle:=\begin{cases}\phi(y)\,,&\text{if $y$ is a leaf,}\\ {y}_{\phi(y)}\,,&\text{otherwise.}\end{cases} (2)

For each j{0,,k1}j\in\{0,\ldots,k-1\}, Player jj receives the input values ϕ(y)\phi(y) for each vertex yy at level jj. The players then communicate using at most k1k-1 rounds, where a single round consists of one message from each player, speaking in the order Player k1k-1, …, Player 0. All messages are broadcast publicly (equivalently, written on a shared blackboard) and may depend on public random coins. The cost of a round is the total number of bits communicated in that round and the cost of a protocol is the maximum, over all rounds, of the cost of a round. The randomized complexity Rk1(tpjk,t)\operatorname{R}^{k-1}(\textsc{tpj}_{k,t}) is the minimum cost of a (k1)(k-1)-round 13\frac{1}{3}-error protocol for tpjk,t\textsc{tpj}_{k,t}.
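The function g and the output g^{(k)}(z) are easy to express in code; below is a small sketch using our own representation, in which a tree vertex is identified with the tuple of child indices on its root-to-vertex path (the root is the empty tuple).

```python
# Sketch: evaluating tpj_{k,t}. phi maps each tree vertex (a tuple of child
# indices) to a value in [t], or to a bit at the leaves.

def tpj(phi, k):
    y = ()                      # start at the root z, at level 0
    for _ in range(k - 1):      # k-1 pointer jumps reach a leaf...
        y = y + (phi[y],)
    return phi[y]               # ...and the k-th application reads the leaf bit

# Example with k = 3, t = 2.
phi = {(): 1, (1,): 2, (2,): 1,
       (1, 1): 1, (1, 2): 0, (2, 1): 1, (2, 2): 0}
assert tpj(phi, 3) == 0         # g(z)=z_1, g(z_1)=z_{1,2}, g(z_{1,2})=0
```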

Combining the lower bound approach of Chakrabarti et al. [7] with the improved round elimination analysis of Yehudayoff [24], we obtain the following lower bound on the randomized communication complexity of the problem.

Theorem 4.2.

Rk1(tpjk,t)=Ω(t/k)\operatorname{R}^{k-1}(\textsc{tpj}_{k,t})=\Omega(t/k). ∎

Based on this, we prove the following lower bound.

Theorem 4.3 (Multi-pass lower bound).

Any streaming algorithm that solves sink-find-t in pp passes must use Ω(n1/p/p2)\Omega(n^{1/p}/p^{2}) space.

Proof.

We reduce from tpjk,t\textsc{tpj}_{k,t}, where k=p+1k=p+1. We continue using the notations defined above. At a high level, we encode an instance of tpj in the directions of edges in a tournament digraph GG, where V(G)V(G) can be viewed as two copies of the set of leaves of TT. Formally,

V(G)={i1,,ik1,a:each ij[t] and a{0,1}}.V(G)=\{\langle{i_{1},\ldots,i_{k-1},a}\rangle:\,\text{each }i_{j}\in[t]\text{ and }a\in\{0,1\}\}\,.

We assign each pair of distinct vertices u,vV(G)u,v\in V(G) to a level in {0,,k1}\{0,\ldots,k-1\} as follows. Suppose that u=i1,,iku=\langle{i_{1},\ldots,i_{k}}\rangle and v=i1,,ikv=\langle{i^{\prime}_{1},\ldots,i^{\prime}_{k}}\rangle. We assign {u,v}\{u,v\} to level j1j-1, where jj is the smallest index such that ijiji_{j}\neq i^{\prime}_{j}. Given an instance of tpjk,t\textsc{tpj}_{k,t}, the players jointly create an instance of sink-find-t as follows. For each jj from k1k-1 to 0, in that order, Player jj assigns directions for all pairs of vertices at level jj, obtaining a set EjE_{j} of directed edges, and then appends EjE_{j} to a stream. The combined stream Ek1E1E0E_{k-1}\circ\cdots\circ E_{1}\circ E_{0} defines the tournament GG. It remains to define each set EjE_{j} precisely.

The set Ek1E_{k-1} encodes the bits ϕ(y)\phi(y) at the leaves yy of TT as follows.

Ek1={(i1,,ik1,1a,i1,,ik1,a)V(G)2:ϕ(zi1,,ik1)=a},\displaystyle E_{k-1}=\{(\langle{i_{1},\ldots,i_{k-1},1-a}\rangle,\langle{i_{1},\ldots,i_{k-1},a}\rangle)\in V(G)^{2}:\,\phi({z}_{i_{1},\ldots,i_{k-1}})=a\}\,, (3)

Notice that if we ignore edge directions, Ek1E_{k-1} is a perfect matching on V(G)V(G).

Now consider an arbitrary level j\in\{0,\ldots,k-2\}. Corresponding to each vertex z_{i_{1},\ldots,i_{j}} at level j of T (for j=0 this is the root z and the subscript is empty), we define the permutation \pi_{i_{1},\ldots,i_{j}}\colon[t]\to[t] thus:

(\pi_{i_{1},\ldots,i_{j}}(1),\ldots,\pi_{i_{1},\ldots,i_{j}}(t)) =(1,\ldots,\ell-1,\ell+1,\ldots,t,\ell)\,,
where \ell =\phi(z_{i_{1},\ldots,i_{j}})\,. (4)

Using this, we define E_{j} so as to encode the pointers at level j as follows.

E_{j}=\{(\langle i_{1},\ldots,i_{j},i_{j+1},\ldots,i_{k}\rangle,\langle i_{1},\ldots,i_{j},i^{\prime}_{j+1},\ldots,i^{\prime}_{k}\rangle)\in V(G)^{2}:\,\pi_{i_{1},\ldots,i_{j}}^{-1}(i_{j+1})<\pi_{i_{1},\ldots,i_{j}}^{-1}(i^{\prime}_{j+1})\}\,. (5)

It should be clear that the digraph (V(G),E_{0}\cup E_{1}\cup\cdots\cup E_{k-1}) is a tournament. We argue that it is acyclic. Suppose, to the contrary, that G has a cycle \sigma. Let j\in\{0,\ldots,k-2\} be the smallest-numbered level of an edge on \sigma. Every edge on \sigma lies at level j or higher, so its two endpoints agree on their first j coordinates; following the cycle, there exist h_{1},\ldots,h_{j} such that every vertex on \sigma is of the form \langle h_{1},\ldots,h_{j},i_{j+1},\ldots,i_{k}\rangle. Let v^{(1)},\ldots,v^{(r)} be the vertices on \sigma whose outgoing edges belong to level j, listed in cyclic order. For each q\in[r], let v^{(q)}=\langle h_{1},\ldots,h_{j},i^{(q)}_{j+1},\ldots,i^{(q)}_{k}\rangle. Let \hat{\pi}=\pi_{h_{1},\ldots,h_{j}}. By eq. 5, the level-j edge leaving v^{(q)} strictly increases \hat{\pi}^{-1} of the (j+1)-th coordinate, while every other edge on \sigma is at a higher level and leaves that coordinate unchanged. Traversing the cycle therefore gives

\hat{\pi}^{-1}\big(i^{(1)}_{j+1}\big)<\hat{\pi}^{-1}\big(i^{(2)}_{j+1}\big)<\cdots<\hat{\pi}^{-1}\big(i^{(r)}_{j+1}\big)<\hat{\pi}^{-1}\big(i^{(1)}_{j+1}\big)\,,

a contradiction.

It follows that G has a unique sink. Let v=\langle h_{1},\ldots,h_{k-1},a\rangle\in V(G) be this sink. In particular, for each level j\in\{0,\ldots,k-2\}, all edges in E_{j} involving v must be directed towards v. According to eq. 5, we must have \pi_{h_{1},\ldots,h_{j}}^{-1}(h_{j+1})=t, i.e., \pi_{h_{1},\ldots,h_{j}}(t)=h_{j+1}. By eq. 4, this gives \phi(z_{h_{1},\ldots,h_{j}})=h_{j+1}. Next, by eq. 2, this gives g(z_{h_{1},\ldots,h_{j}})=z_{h_{1},\ldots,h_{j+1}}. Instantiating this observation for j=0,\ldots,k-2, we have

zh1=g(z),zh1,h2=g(zh1),,zh1,,hk1=g(zh1,,hk2),{z}_{h_{1}}=g(z),~{}~{}{z}_{h_{1},h_{2}}=g({z}_{h_{1}}),~{}~{}\ldots,~{}~{}{z}_{h_{1},\ldots,h_{k-1}}=g({z}_{h_{1},\ldots,h_{k-2}})\,,

i.e., zh1,,hk1=g(k1)(z){z}_{h_{1},\ldots,h_{k-1}}=g^{(k-1)}(z).

At this point h1,,hk1h_{1},\ldots,h_{k-1} have been determined, leaving only two possibilities for vv. We now use the fact that the sole edge in Ek1E_{k-1} involving vv must be directed towards vv. According to eq. 3, ϕ(zh1,,hk1)=a\phi({z}_{h_{1},\ldots,h_{k-1}})=a. Invoking eq. 2 again, a=ϕ(g(k1)(z))=g(k)(z)=tpjk,t(ϕ)a=\phi(g^{(k-1)}(z))=g^{(k)}(z)=\textsc{tpj}_{k,t}(\phi).

Thus, the players can read off the desired output tpjk,t(ϕ)\textsc{tpj}_{k,t}(\phi) from the identity of the unique sink of the constructed digraph GG. Notice that n:=|V(G)|=2tk1n:=|V(G)|=2t^{k-1}. It follows that a (k1)(k-1)-pass streaming algorithm for sink-find-t that uses SS bits of space solves tpjk,t\textsc{tpj}_{k,t} in k1k-1 rounds at a communication cost of kSkS. By Theorem 4.2, we have S=Ω(t/k2)=Ω(n1/(k1)/k2)S=\Omega(t/k^{2})=\Omega(n^{1/(k-1)}/k^{2}). ∎

4.2 Random Order Sink Finding

In this section we show that it is possible to find the sink of an acyclic tournament in one pass over a randomly ordered stream while using only \operatorname{polylog}(n) space. The algorithm we consider is as follows:

  • Initialization: Let SS be a random set of s=200logns=200\log n nodes.

  • For i=1i=1 to k:=log2(m200000nlogn)k:=\log_{2}\left(\frac{m}{200000n\log n}\right):

    • Ingest the next c_{i}:=100\cdot 2^{i}(n-1)\log n elements of the stream: for each v\in S, collect the set S_{v} of outgoing edges of v appearing among these elements, discarding S_{v} if its size exceeds 220\log n.

    • Pick any v\in S such that |S_{v}|=(200\pm 20)\log n and let S be the set of endpoints (other than v) of the edges in S_{v}.

  • Ingest the next m/1000 elements: let P be the set of vertices w such that some edge (u,w) with u\in S appears among these elements.

  • Ingest the remaining 499m/500 elements: output any vertex in P with no outgoing edges among these elements.
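The following is a direct, offline transcription of this procedure (our code, assuming n is large enough that the various thresholds and segment lengths are meaningful).

```python
# Sketch: the one-pass random-order sink-finding procedure, simulated offline.
import math, random

def sink_find_random_order(stream, n):
    """stream: a uniformly shuffled list of the m = n(n-1)/2 directed edges."""
    m, logn = len(stream), math.log2(n)
    S = random.sample(range(n), min(int(200 * logn), n))
    k = int(math.log2(m / (200000 * n * logn)))
    pos = 0
    for i in range(1, k + 1):
        c_i = int(100 * (2 ** i) * (n - 1) * logn)
        segment, pos = stream[pos:pos + c_i], pos + c_i
        S_v = {v: [] for v in S}
        for (a, b) in segment:                 # collect outgoing edges per v in S
            if a in S_v and len(S_v[a]) <= 220 * logn:
                S_v[a].append(b)
        good = [v for v in S if 180 * logn <= len(S_v[v]) <= 220 * logn]
        if not good:
            return None                        # low-probability failure event
        S = S_v[good[0]]                       # heads of the chosen vertex's edges
    segment, pos = stream[pos:pos + m // 1000], pos + m // 1000
    S_set = set(S)
    P = {b for (a, b) in segment if a in S_set}
    has_out = {a for (a, b) in stream[pos:] if a in P}
    candidates = P - has_out
    return candidates.pop() if candidates else None
```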

Theorem 4.4.

There is a single pass algorithm for sink-find-t that uses O(polylogn)O(\operatorname{polylog}n) space and has failure probability at most 1/31/3 under the assumption that the data stream is randomly ordered.

Proof.

We refer to the c_{i} elements used in iteration i as the ith segment of the stream. For a node u, let X_{u,i} be the number of outgoing edges from u appearing in the ith segment. The following claim follows from the Chernoff bound:

Claim 4.5.

With high probability, for all u with |rk(u)-n/2^{i}|\geqslant 0.2\cdot n/2^{i}, we have

|X_{u,i}-200\log n|>0.1\cdot 200\log n\ .

Also with high probability, for all u with |rk(u)-n/2^{i}|\leqslant 0.05\cdot n/2^{i}, we have

|X_{u,i}-200\log n|<0.1\cdot 200\log n\ .

It follows from the claim that if, after processing the ith segment of the stream, there exists a v such that |S_{v}|=(200\pm 20)\log n, then with high probability rk(v)=(1\pm 0.2)\cdot n/2^{i}. We next need to argue that such a v exists.

Claim 4.6.

With high probability, for every node uu with rk(u)=(1±0.2)n/2i1rk(u)=(1\pm 0.2)\cdot n/2^{i-1}, there exists an edge uvuv in the iith segment such that |rk(v)n/2i|0.05n/2i|rk(v)-n/2^{i}|\leqslant 0.05\cdot n/2^{i}.

Proof.

There are at least 0.01\cdot n/2^{i} such edges. The probability that none of them appears in the ith segment is at most (1-c_{i}/m)^{0.01\cdot n/2^{i}}\leqslant 1/\operatorname{poly}(n). ∎

The above two claims allow us to argue by induction that, after the ith segment, we have an element u with rk(u)=(1\pm 0.2)\cdot n/2^{i}. At the end of the kth segment, we have identified at least (200-20)\log n vertices, each of rank at most (1+0.2)\cdot n/2^{k}=O(\log n). With probability at least 1-1/\operatorname{poly}(n), one of these vertices has its edge to the sink appear among the next m/1000 elements, and hence the sink is in P with high probability. There may be other vertices in P, but the following claim shows that we will identify any false positives while processing the final 499m/500 elements of the stream.

Claim 4.7.

With probability at least 1-1/499, every node other than the sink has at least one outgoing edge among the last 499m/500 elements of the stream.

Proof of Claim.

The probability that no outgoing edge from an element of rank r>0 appears in this suffix of the stream is at most (1-499/500)^{r}. Hence, by the union bound, the probability that there exists an element of rank r>0 with no outgoing edge in the suffix is at most \sum_{r\geqslant 1}(1-499/500)^{r}=1/499. ∎

This concludes the proof of Theorem 4.4. ∎

5 Topological Ordering in Random Graphs

We present results for computing a topological ordering of G𝖯𝗅𝖺𝗇𝗍𝖣𝖠𝖦n,qG\sim\mathsf{PlantDAG}_{n,q} (see Definition 1.2). We first present an O(logn)O(\log n)-pass algorithm using O~(n4/3)\widetilde{O}(n^{4/3}) space. We then present a one-pass algorithm that uses O~(n3/2)\widetilde{O}(n^{3/2}) space and requires the assumption that the stream is in random order.

5.1 Arbitrary Order Algorithm

In this section, we present two different algorithms. The first is appropriate when qq is large whereas the second is appropriate when qq is small. Combining these algorithms and considering the worst case value of qq yields the algorithm using O~(n4/3)\widetilde{O}(n^{4/3}) space.

Algorithm for large q.  The basic approach is to emulate QuickSort. We claim that we can find the relationship between any vertex u among the n vertices and a predetermined vertex v using three passes and O(n+q^{-3}\log n) space. Assuming this claim, we can sort in O(\log(q^{2}n)) passes and \widetilde{O}(n/q) space: we recursively partition the vertices and suppose that at the end of a phase we have sub-problems of sizes n_{1},n_{2},n_{3},\ldots. Any sub-problem with at least 1/q^{2} vertices is then sub-divided by picking \Theta(\log n) random pivots (with replacement) within the sub-problem, using the aforementioned three-pass algorithm. There are at most q^{2}n such sub-problems. Hence, the total space required to partition all the sub-problems in this way is at most

O(logni=1q2n(ni+q3logn))=O(nq1log2n).{O}\left(\log n\sum_{i=1}^{q^{2}n}(n_{i}+q^{-3}\log n)\right)=O(nq^{-1}\log^{2}n)\ .

Note that, with high probability, the size of every sub-problem decreases by a factor of at least 2 in each step, and hence after \log(q^{2}n) iterations all sub-problems have at most 1/q^{2} vertices. Furthermore, within each such sub-problem every vertex has degree O(q^{-1}\log n). Hence, the entire remaining instance can be stored using O(n/q\cdot\log n) space.

It remains to prove our three-pass claim. For this, we define the following families of sets:

Li={u: u-to-v path of lengthi},Ri={u: v-to-u path of lengthi}.L_{i}=\{u:\exists\mbox{ $u$-to-$v$ path of length}\leqslant i\}\,,\quad R_{i}=\{u:\exists\mbox{ $v$-to-$u$ path of length}\leqslant i\}\,.

Using two passes and O(n\log n) space, we can identify L_{2} and R_{2}. Let U be the set of vertices not contained in L_{2}\cup R_{2}. The following lemma (which can be proved via Chernoff bounds) establishes that, with high probability, L_{2}\cup R_{2} includes most of the vertices of the graph.

Lemma 5.1.

With high probability, |U|=O(q2logn)|U|=O(q^{-2}\log n).

In a third pass, we store every edge between vertices in U and also compute L_{3} and R_{3}. Computing L_{3} and R_{3} requires only O(n\log n) space. There is an edge between each pair of vertices in U with probability q and hence the expected number of edges between vertices in U is at most q|U|^{2}=O(q^{-3}\log^{2}n); by an application of the Chernoff bound, this bound also holds w.h.p. Note that L_{3}, R_{3}, and the edges within U suffice to determine whether u\in L_{\infty} or u\in R_{\infty} for all u. To see this, first suppose u\in L_{\infty} and that (u,w) is the critical edge on the directed path from u to v. Either w\in L_{2}, and therefore we deduce u\in L_{3}; or u\in L_{2}; or u\notin L_{2} and w\notin L_{2}, in which case both u and w lie in U (neither can be in R_{2}, as that would create a cycle) and we therefore store the edge (u,w).

This establishes the following lemma.

Lemma 5.2.

There is a O(logn)O(\log n)-pass, O~(n/q)\widetilde{O}(n/q)-space algorithm for topo-sort on a random input graph G𝖯𝗅𝖺𝗇𝗍𝖣𝖠𝖦n,qG\sim\mathsf{PlantDAG}_{n,q}. ∎

Algorithm for small q.  We use only two passes. In the first pass, we compute the in-degree of every vertex. In the second, we store all edges between vertices where the in-degrees differ by at most 3cnqlnn3\sqrt{cnq\cdot\ln n} where c>0c>0 is a sufficiently large constant.
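A sketch of this two-pass procedure (our code), run offline over an edge list, with Kahn's algorithm used in post-processing to topologically sort the stored edges; the constant c is ours.

```python
# Sketch: the two-pass small-q algorithm over vertices 0..n-1.
import math
from collections import defaultdict, deque

def topo_sort_small_q(stream, n, q, c=4.0):
    # Pass 1: compute every in-degree.
    indeg = [0] * n
    for (u, v) in stream:
        indeg[v] += 1
    # Pass 2: store only edges whose endpoints have similar in-degrees; w.h.p.
    # these include every critical edge, so the stored subgraph pins down the
    # planted order.
    thresh = 3 * math.sqrt(c * n * q * math.log(n))
    out = defaultdict(list)
    deg_kept = [0] * n
    for (u, v) in stream:
        if abs(indeg[u] - indeg[v]) <= thresh:
            out[u].append(v)
            deg_kept[v] += 1
    # Post-processing: Kahn's algorithm on the stored subgraph.
    queue = deque(u for u in range(n) if deg_kept[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in out[u]:
            deg_kept[v] -= 1
            if deg_kept[v] == 0:
                queue.append(v)
    return order
```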

Lemma 5.3.

There is a two-pass, O~(n3/2q)\widetilde{O}(n^{3/2}\sqrt{q})-space algorithm for topo-sort on a random input graph G𝖯𝗅𝖺𝗇𝗍𝖣𝖠𝖦n,qG\sim\mathsf{PlantDAG}_{n,q}.

Proof.

We show that, with high probability, the above algorithm collects all critical edges and furthermore only collects O~(n3/2q)\widetilde{O}(n^{3/2}\sqrt{q}) edges in total. Let uu be the element of rank rur_{u}. Note that din(u)\mathrm{d_{in}}(u) has distribution 1+Bin(ru2,q)1+\textup{Bin}(r_{u}-2,q). Let Xu=din(u)1X_{u}=\mathrm{d_{in}}(u)-1. By an application of the Chernoff Bound,

Pr[|Xu(ru2)q|c(ru2)qlnn]1/poly(n).\Pr\left[|X_{u}-(r_{u}-2)q|\geqslant\sqrt{c(r_{u}-2)q\ln n}\right]\leqslant 1/\operatorname{poly}(n)\,.

Hence, w.h.p., ru=2+Xu/q±cn/qlnnr_{u}=2+X_{u}/q\pm\sqrt{cn/q\cdot\ln n} for all vertices uu. Therefore, if (u,v)(u,v) is critical, then

|XuXv||Xu(ru2)q|+|(ru2)q(rv2)q|+|Xv(rv2)q|3cnqlnn.|X_{u}-X_{v}|\leqslant|X_{u}-(r_{u}-2)q|+|(r_{u}-2)q-(r_{v}-2)q|+|X_{v}-(r_{v}-2)q|\leqslant 3\sqrt{cnq\cdot\ln n}\,.

This ensures that the algorithm collects all critical edges. For the space bound, we first observe that for an arbitrary pair of vertices uu and vv, if |XuXv|3cnqlnn|X_{u}-X_{v}|\leqslant 3\sqrt{cnq\cdot\ln n} then

|rurv||XuXv|/q+2cn/qlnn8cn/qlnn.|r_{u}-r_{v}|\leqslant|X_{u}-X_{v}|/q+2\sqrt{cn/q\cdot\ln n}\leqslant 8\sqrt{cn/q\cdot\ln n}\ .

Hence, we only store an edge between vertex uu and vertices whose rank differs by at most 8cn/qlnn8\sqrt{cn/q\cdot\ln n}. Since edges between such vertices are present with probability qq, the expected number of edges stored incident to uu is 8cnqlnn8\sqrt{cnq\cdot\ln n} and is O(nqlnn)O(\sqrt{nq\cdot\ln n}) by an application of the Chernoff bounds. Across all vertices this means the number of edges stored is O(n3/2qlnn)O(n^{3/2}\sqrt{q\cdot\ln n}) as claimed. ∎

Combining Lemma 5.2 and Lemma 5.3 yields the main theorem of this section.

Theorem 5.4.

There is an O(\log n)-pass algorithm for topo-sort on a random input G\sim\mathsf{PlantDAG}_{n,q} that uses \widetilde{O}(\min(n/q,n^{3/2}\sqrt{q})) space. For the worst-case over q, this is \widetilde{O}(n^{4/3}). ∎

5.2 Random Order Algorithm

The transitive reduction of a DAG G=(V,E)G=(V,E) is the minimal subgraph Gred=(V,E)G^{\mathrm{red}}=(V,E^{\prime}) such that, for all u,vVu,v\in V, if GG has a uu-to-vv path, then so does GredG^{\mathrm{red}}. So if GG has a Hamiltonian path, GredG^{\mathrm{red}} is this path.

The one-pass algorithm, assuming a random ordering of the edges, is simply to maintain G^{\mathrm{red}} as G is streamed in, as follows. Let S be initially empty. For each edge (u,v) in the stream, we add (u,v) to S and then remove every edge (u^{\prime},v^{\prime}) for which the other stored edges contain a u^{\prime}-to-v^{\prime} path (i.e., a path of length at least two).
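A sketch of this maintenance rule (our code); reachability is recomputed naively by depth-first search over the stored edges, whereas a space-conscious implementation would be more careful.

```python
# Sketch: maintaining the transitive reduction of the edges seen so far.
from collections import defaultdict

def reachable(adj, src, dst, skip=None):
    """Is there a stored path src -> dst, optionally ignoring one edge?"""
    stack, seen = [src], {src}
    while stack:
        u = stack.pop()
        if u == dst:
            return True
        for v in adj[u]:
            if (u, v) != skip and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def stream_transitive_reduction(stream):
    S, adj = set(), defaultdict(set)
    for (u, v) in stream:
        S.add((u, v)); adj[u].add(v)
        # Drop every stored edge implied by a path of length >= 2.
        for (a, b) in list(S):
            if reachable(adj, a, b, skip=(a, b)):
                S.discard((a, b)); adj[a].discard(b)
    return S
```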

Theorem 5.5.

There is a one-pass O~(maxq^qmin{n/q^,n2q^})\widetilde{O}(\max_{\hat{q}\leqslant q}\min\{n/\hat{q},n^{2}\hat{q}\})-space algorithm for topo-sort on inputs G𝖯𝗅𝖺𝗇𝗍𝖣𝖠𝖦n,qG\sim\mathsf{PlantDAG}_{n,q} presented in random order. In the worst case this space bound is O~(n3/2)\widetilde{O}(n^{3/2}).

Proof.

Consider the length-TT prefix of the stream where the edges of GG are presented in random order. It will be convenient to write T=n2q^T=n^{2}\hat{q}. We will argue that the number of edges in the transitive reduction of this prefix is O(min{n/q^,n2q^})O(\min\{n/\hat{q},\,n^{2}\hat{q}\}) with high probability; note the bound n2q^n^{2}\hat{q} follows trivially because the transitive reduction has at most TT edges. The result then follows by taking the maximum over all prefixes.

We say an edge (u,v)(u,v) of GG is short if the difference between the ranks is rvruτ:=cq^2lognr_{v}-r_{u}\leqslant\tau:=c\hat{q}^{-2}\log n where cc is some sufficiently large constant. An edge that is not short is defined to be long. Let SS be the number of short edges in GG and let MM be the total number of edges in GG. Note that 𝔼[S](n1)+qτn{\mathbb{E}}[S]\leqslant(n-1)+q\tau n and 𝔼[M]=(n1)+q(n12){\mathbb{E}}[M]=(n-1)+q\binom{n-1}{2}. By the Chernoff bound, S2qτnS\leqslant 2q\tau n and n2q/4Mn2qn^{2}q/4\leqslant M\leqslant n^{2}q with high probability. Furthermore, the number of short edges in the prefix is expected to be TS/MT\cdot S/M and, with high probability, is at most

2TS/M4Tqτnn2q/4=16cn/q^logn.2T\cdot S/M\leqslant\frac{4Tq\tau n}{n^{2}q/4}=16cn/\hat{q}\cdot\log n\,.

Now consider how many long edges are in the transitive reduction of the prefix. For any long edge (u,v)(u,v), let XwX_{w} denote the event that (u,w),(w,v)(u,w),(w,v) are both in the prefix. Note that the variables {Xw}w:ru+1rwrv1\{X_{w}\}_{w:r_{u}+1\leqslant r_{w}\leqslant r_{v}-1} are negatively correlated and that

Pr[Xw=1](qT/M)2/2q^2/2.\Pr\left[X_{w}=1\right]\geqslant(qT/M)^{2}/2\geqslant\hat{q}^{2}/2\,.

Hence, if X=w:ru+1rwrv1XwX=\sum_{w:r_{u}+1\leqslant r_{w}\leqslant r_{v}-1}X_{w} then

𝔼[X]cq^2lognq^2/2=c/2logn{\mathbb{E}}[X]\geqslant c\hat{q}^{-2}\log n\cdot\hat{q}^{2}/2=c/2\cdot\log n

and so, by the Chernoff bound, X>0 with high probability. When this happens, even if (u,v) is in the prefix, it will not be in the transitive reduction of the prefix. Hence, by the union bound, with high probability no long edge appears in the transitive reduction of the prefix. ∎

6 Rank Aggregation

Recall the rank-aggr problem and the distance dd between permutations, defined in Section 1.2. To recap, the distance between two orderings is the number of pairs of objects which are ranked differently by them, i.e.,

d(π,σ):=a,b[n]𝟙{π(a)<π(b),σ(b)<σ(a)}.d(\pi,\sigma):=\sum_{a,b\in[n]}\mathbbm{1}\{\pi(a)<\pi(b),\,\sigma(b)<\sigma(a)\}\,.

Note that rank-aggr is equivalent to finding the median of a set of k points under this distance function, which can be shown to be a metric. It follows that an ordering picked uniformly at random from the k input orderings provides, in expectation, a 2-approximation for rank-aggr.

A different approach is to reduce rank-aggr to the weighted feedback arc set problem on a tournament. This idea leads to a (1+ε)(1+\varepsilon)-approximation via 1\ell_{1}-norm estimation in a way similar to the algorithm in Section 3.1. Define a vector 𝐱\mathbf{x} of length (n2)\binom{n}{2} indexed by pairs of vertices {a,b}\{a,b\} where

xa,b=i=1k𝟙{σi(a)<σi(b)},x_{a,b}=\sum_{i=1}^{k}\mathbbm{1}\{\sigma_{i}(a)<\sigma_{i}(b)\}\,,

i.e., the number of input orderings that have a<ba<b. Then for any ordering π\pi define a vector 𝐲π\mathbf{y}^{\pi}, where for each pair of vertices {a,b}\{a,b\},

ya,bπ=k𝟙{π(a)<π(b)}.y^{\pi}_{a,b}=k\cdot\mathbbm{1}\{\pi(a)<\pi(b)\}\,.

It is easy to see that 𝐱𝐲π1=cost(π)\|\mathbf{x}-\mathbf{y}^{\pi}\|_{1}=\operatorname{cost}(\pi).
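A small brute-force sketch (our code, over ground set \{0,\ldots,n-1\}) that builds \mathbf{x} and \mathbf{y}^{\pi} explicitly and checks this identity directly:

```python
# Sketch: the vectors x and y^pi from the reduction above, for small n.
from itertools import combinations, permutations

def kendall(pi, sigma):
    """d(pi, sigma): number of pairs ranked in opposite order by the two lists."""
    return sum(1 for a, b in combinations(range(len(pi)), 2)
               if (pi.index(a) < pi.index(b)) != (sigma.index(a) < sigma.index(b)))

def x_vector(sigmas, n):
    return {(a, b): sum(1 for s in sigmas if s.index(a) < s.index(b))
            for a, b in combinations(range(n), 2)}

def y_vector(pi, k, n):
    return {(a, b): k * (pi.index(a) < pi.index(b))
            for a, b in combinations(range(n), 2)}

# Check ||x - y^pi||_1 = cost(pi) for every ordering pi of a tiny instance.
sigmas = [(0, 1, 2, 3), (1, 0, 3, 2), (2, 1, 0, 3)]
x = x_vector(sigmas, 4)
for pi in permutations(range(4)):
    y = y_vector(pi, len(sigmas), 4)
    assert sum(abs(x[p] - y[p]) for p in x) == sum(kendall(pi, s) for s in sigmas)
```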

As in Section 3.1, our algorithm maintains an 1\ell_{1}-sketch S𝐱S\mathbf{x} with accuracy ε/3\varepsilon/3 and error δ=1/(3n!)\delta=1/(3\cdot n!). By 3.1, this requires at most O(ε2nlog2n)O(\varepsilon^{-2}n\log^{2}n) space. In post-processing, the algorithm considers all n!n! permutations π\pi and, for each of them, computes S(𝐱𝐲π)=S𝐱S𝐲πS(\mathbf{x}-\mathbf{y}^{\pi})=S\mathbf{x}-S\mathbf{y}^{\pi}. It thereby recovers an estimate for 𝐱𝐲π1\|\mathbf{x}-\mathbf{y}^{\pi}\|_{1} and finally outputs the ordering π\pi that minimizes this estimate.

The analysis of this algorithm is essentially the same as in Theorem 3.2. Overall, we obtain the following result.

Theorem 6.1.

There is a one-pass algorithm for rank aggregation that uses O(ε2nlog2n)O(\varepsilon^{-2}n\log^{2}n) space, returns a (1+ε)(1+\varepsilon)-approximation with probability at least 23\frac{2}{3}, but requires exponential post-processing time. ∎

Acknowledgements

We thank Riko Jacob for a helpful discussion about the sink finding problem. We thank the anonymous SODA 2020 reviewers for several helpful comments that improved the presentation of the paper.

References

  • [1] F. Ablayev. Lower bounds for one-way probabilistic communication complexity and their application to space complexity. Theoretical Computer Science, 175(2):139–159, 1996.
  • [2] K. J. Ahn, S. Guha, and A. McGregor. Analyzing graph structure via linear measurements. In Proc. 23rd Annual ACM-SIAM Symposium on Discrete Algorithms, pages 459–467, 2012.
  • [3] N. Ailon. Active learning ranking from pairwise preferences with almost optimal query complexity. In J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 24, pages 810–818. Curran Associates, Inc., 2011.
  • [4] N. Ailon, M. Charikar, and A. Newman. Aggregating inconsistent information: Ranking and clustering. J. ACM, 55(5):23:1–23:27, 2008.
  • [5] S. Angelov, K. Kunal, and A. McGregor. Sorting and selection with random costs. In LATIN 2008: Theoretical Informatics, 8th Latin American Symposium, Búzios, Brazil, April 7-11, 2008, Proceedings, pages 48–59, 2008.
  • [6] M. Bateni, H. Esfandiari, and V. S. Mirrokni. Almost optimal streaming algorithms for coverage problems. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2017, Washington DC, USA, July 24-26, 2017, pages 13–23, 2017.
  • [7] A. Chakrabarti, G. Cormode, and A. McGregor. Robust lower bounds for communication and stream computation. Theor. Comput., 12(1):1–35, 2016. Preliminary version in Proc. 40th Annual ACM Symposium on the Theory of Computing, pages 641–649, 2008.
  • [8] D. Coppersmith, L. Fleischer, and A. Rudra. Ordering by weighted number of wins gives a good ranking for weighted tournaments. ACM Trans. Algorithms, 6(3):55:1–55:13, 2010.
  • [9] M. Elkin. Distributed exact shortest paths in sublinear time. In Proc. 49th Annual ACM Symposium on the Theory of Computing, pages 757–770, 2017.
  • [10] G. Even, J. (Seffi) Naor, B. Schieber, and M. Sudan. Approximating minimum feedback sets and multicuts in directed graphs. Algorithmica, 20(2):151–174, Feb 1998.
  • [11] J. Feigenbaum, S. Kannan, A. McGregor, S. Suri, and J. Zhang. On graph problems in a semi-streaming model. Theoretical Computer Science, 348(2-3):207–216, 2005.
  • [12] V. Guruswami and K. Onak. Superlinear lower bounds for multipass graph processing. Algorithmica, 76(3):654–683, Nov. 2016.
  • [13] M. R. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. External memory algorithms, pages 107–118, 1999.
  • [14] Z. Huang, S. Kannan, and S. Khanna. Algorithms for the generalized sorting problem. In IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS 2011, Palm Springs, CA, USA, October 22-25, 2011, pages 738–747, 2011.
  • [15] C. Jin. Simulating random walks on graphs in the streaming model. In 10th Innovations in Theoretical Computer Science Conference, ITCS 2019, January 10-12, 2019, San Diego, California, USA, pages 46:1–46:15, 2019.
  • [16] A. B. Kahn. Topological sorting of large networks. Commun. ACM, 5(11):558–562, Nov. 1962.
  • [17] D. M. Kane, J. Nelson, and D. P. Woodruff. On the exact space complexity of sketching and streaming small norms. In Proc. 21st Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1161–1178, 2010.
  • [18] C. Kenyon-Mathieu and W. Schudy. How to rank with few errors: A PTAS for weighted feedback arc set on tournaments. Electronic Colloquium on Computational Complexity (ECCC), 13(144), 2006.
  • [19] S. Khan and S. K. Mehta. Depth first search in the semi-streaming model. In Proc. 36th International Colloquium on Automata, Languages and Programming, pages 42:1–42:16, 2019.
  • [20] A. McGregor. Graph stream algorithms: a survey. SIGMOD Record, 43(1):9–20, 2014.
  • [21] A. Razborov. On the distributional complexity of disjointness. Theor. Comput. Sci., 106(2):385–390, 1992. Preliminary version in Proc. 17th International Colloquium on Automata, Languages and Programming, pages 249–253, 1990.
  • [22] A. D. Sarma, S. Gollapudi, and R. Panigrahy. Estimating pagerank on graph streams. J. ACM, 58(3):13, 2011.
  • [23] R. E. Tarjan. Edge-disjoint spanning trees and depth-first search. Acta Informatica, 6(2):171–185, Jun 1976.
  • [24] A. Yehudayoff. Pointer chasing via triangular discrimination. Technical Report TR16-151, ECCC, 2016.