Congestion-Approximators from the Bottom Up
Abstract
We develop a novel algorithm to construct a congestion-approximator with polylogarithmic quality on a capacitated, undirected graph in nearly-linear time. Our approach is the first bottom-up hierarchical construction, in contrast to previous top-down approaches including that of Räcke, Shah, and Taubig (SODA 2014), the only other construction achieving polylogarithmic quality that is implementable in nearly-linear time (Peng, SODA 2016). Similar to Räcke, Shah, and Taubig, our construction at each hierarchical level requires calls to an approximate max-flow/min-cut subroutine. However, the main advantage of our bottom-up approach is that these max-flow calls can be implemented directly without recursion. More precisely, the previously computed levels of the hierarchy can be converted into a pseudo-congestion-approximator, which then translates to a max-flow algorithm that is sufficient for the particular max-flow calls used in the construction of the next hierarchical level. As a result, we obtain the first non-recursive algorithms for congestion-approximator and approximate max-flow that run in nearly-linear time, a conceptual improvement over the aforementioned algorithms that recursively alternate between the two problems.
1 Introduction
The famous max-flow min-cut theorem implies that any set of excesses and deficits in a graph can be connected by disjoint paths if and only if every cut in the graph has more capacity in the edges that cross the cut than the "demand" that is required to cross it. (For the sake of this introduction, a flow problem on a graph is a set of excesses and deficits on vertices, and a solution is a set of paths connecting excesses to deficits where no edge is used in more than one path.) One direction is easy to see (if a cut does not have enough capacity, clearly one cannot route demand paths across it), and the other can be established by linear programming duality (or explicitly using an algorithm, as was done by Ford and Fulkerson in 1956 [12]).
Framed slightly differently, a cut where the ratio of the demand across it to the capacity of the cut is α implies that any flow satisfying the demands requires at least α units of flow to go through some edge. Again, the max-flow min-cut theorem implies that there is always such a cut if the optimal flow has congestion α. Here, the congestion of a flow is the maximum, over edges, of the flow routed on the edge relative to its capacity.
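To make the flow-cut duality concrete, here is a toy illustration, entirely our own and not the machinery of this paper: a textbook Edmonds-Karp max-flow computation on a small directed example, where the vertices reachable from the source in the final residual graph certify a minimum cut whose capacity equals the flow value.

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly augment along shortest residual s-t paths.
    `cap` maps directed edges (u, v) -> capacity. Returns (flow value, min cut),
    where the min cut is the list of original edges leaving the source side."""
    residual = defaultdict(int)
    for (u, v), c in cap.items():
        residual[(u, v)] += c
    nodes = {u for e in cap for u in e}
    flow = 0
    while True:
        # BFS for an s-t path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path: vertices reachable from s form a min cut.
            cut = [(u, v) for (u, v) in cap if u in parent and v not in parent]
            return flow, cut
        # Find the bottleneck capacity along the path, then augment.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= delta
            residual[(v, u)] += delta
        flow += delta
```

On the example below, the max-flow value and the capacity of the returned cut coincide, as the theorem promises.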
More generally, the set of feasible demands in a graph is captured by the (exponentially sized) set of all cuts in the graph. Remarkably, it was shown that polynomially many cuts could approximately (within logarithmic factors) characterize the congestion of any set of demands [31, 32]. Indeed, the total size of the cuts was nearly-linear in [31, 33], and they could be computed in nearly-linear time, as announced in 2014 by Peng [29]. (Nearly-linear means O(m log^c m) where c is a constant; almost-linear means O(m^{1+c(m)}) for some c(m) = o(1).)
More precisely, an α-congestion-approximator for a graph is a set of cuts such that, for every set of demands, the maximum ratio (over cuts in the α-congestion-approximator) of the demand crossing the cut to the capacity of the cut determines the minimum congestion of the optimal flow (that routes the demands) to within an α factor.
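As a sanity check on the definition, consider the simplest possible case, a path graph, where the prefix cuts form a congestion-approximator of quality exactly 1: the net flow crossing each edge is forced to equal the prefix sum of the demands, so the best cut bound is attained. The following sketch is our own illustration, not from the paper.

```python
def path_congestion(demands, capacities):
    """On a path v0 - v1 - ... - vn, any demand vector is routed by a unique
    flow: the net flow crossing edge i equals the prefix sum b(v0..vi).
    Hence the prefix cuts {v0..vi} form a quality-1 congestion-approximator,
    and the optimal congestion is the maximum ratio over these cuts."""
    assert abs(sum(demands)) < 1e-9 and len(capacities) == len(demands) - 1
    congestion = 0.0
    prefix = 0.0
    for i, cap in enumerate(capacities):
        prefix += demands[i]            # demand that must cross edge i
        congestion = max(congestion, abs(prefix) / cap)
    return congestion
```

For general graphs the flow is not forced, which is exactly why a nontrivial family of cuts (with quality α > 1) is needed.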
In this paper, we provide an α-congestion-approximator (along with a routing scheme) that can be constructed in nearly-linear time, with α polylogarithmic in n. The number of logarithmic factors is not optimized, but the structure is significantly simpler than the construction by Peng [29].
The top-down frame of the decomposition of [33] requires that one compute approximate maximum flows at the very top level, which entails a super-linear runtime. A breakthrough result of Sherman [36] showed how to use congestion-approximators to find approximate maximum flows. (Simultaneously, Kelner et al. [17] used a dual version of congestion-approximators, called oblivious routers, to give efficient algorithms for approximate maximum flow.) Peng [30] used the result of Sherman [36] inside the top-down partitioning method based on [33] for constructing congestion-approximators. There is a chicken-and-egg problem here, as the two methods need each other, and a costly recursion based on ultra-sparsifiers, which we discuss a bit more below, was used to combine them.
Our result proceeds in a bottom-up iterative fashion; indeed, the bottom level is a "weak expander decomposition", which can be computed using trivial congestion-approximators consisting of singleton vertices. (We note that expander decompositions themselves have found wide application, including in the breakthrough near-linear time algorithm of [8] for maximum flow. Also, the high-level idea of our congestion-approximator is similar to hierarchical expander decomposition [13], and our techniques may be useful in getting from almost-linear to nearly-linear.) Then we proceed level by level, using the lower-level clusterings as "pseudo"-congestion-approximators for the next. The frame is simple, but "around the edges", both literally (the edges of the clusters) and metaphorically, there are technical issues which require some attention.
Still, given the power that congestion-approximators provide with respect to understanding the structure of graphs, finding more efficient constructions is important, and an effective bottom-up or clustering approach seems a natural path to follow. We take a first and substantive step on this path in the decade since the announcement of Peng's also-polylogarithmic nearly-linear time algorithm [29].
Recent progress on maximum flow can be compared to developments in nearly-linear time Laplacian solvers [38], which were initially very complex with many logarithmic factors. But over the years, new tools were developed that both improved the running times and allowed for better understanding of graphs, as well as having broader application. See, for example, [21, 22, 10, 16]. Progress has been much slower for maximum flow in comparison. Perhaps one reason is that solutions to Laplacian linear systems are optimizing flows with respect to the Euclidean norm, which is a fair bit simpler than the ℓ∞ norm central to maximum flow, or norms based on more general convex bodies. We believe this result to be a step in the process of understanding maximum flows better.
We proceed with a discussion of previous work which consists of a remarkable series of developments.
1.1 Previous work
The max-flow min-cut theorem is nice and remarkable in that one can find a single cut that establishes the optimality of the flow or "routing", and the optimal routing establishes a tight lower bound on the size of any cut. (A wrinkle is that there could be many, even exponentially many, minimum cuts or feasible optimal flows. But any optimal routing or any optimal cut establishes the optimality of the "dual" object.)
We note that this theorem was extended to multi-commodity flow in an approximate sense in [24], [20] and [26], where flows and corresponding cuts are shown to be related by an O(log n) factor. The first paper gives a method for finding the sparsity or conductance of a graph to within an O(log n) factor by using multicommodity flow to embed a complete graph, and the others extend the techniques to give approximations between cuts and solutions for arbitrary multicommodity flow instances. In a parallel thread, the mixing times of random walks, or eigenvalues (which involve flow problems with a quadratic objective), are related to cuts via classical and tremendously impactful results of Cheeger [7]. Combining eigenvalues and linear programming methods through semidefinite programming yields improved approximate relationships between embeddings and cuts of roughly O(√log n) [5, 4]. Fast versions of these methods involve the cut-matching game developed in [2, 18, 28, 35] and used directly in this work.
As we noted previously, one can view the (exponentially sized) set of feasible sets of demands in a graph as being a very general measure of a graph’s capabilities. And as also mentioned above, Räcke in [31] showed that a polynomial sized set of cuts and corresponding pre-computed routings approximately model the congestion required for any of (exponentially many) sets of possible demands.
A bit more formally, [31] provides an oblivious routing scheme, where the scheme can obliviously route any set of demands with no more than α times as much congestion as the optimal routing. Here, oblivious roughly means that for any demand pair the routing is done without considering any other demand pair. [31] also provides a decomposition where, for any set of demands, the maximum congestion (i.e., ratio of the demand crossing a cut to its capacity) on any cut in the decomposition is within a factor of α of the congestion needed to route those demands.
The value of α in [31] was polylogarithmic, but the proof was non-constructive; however, constructive schemes with improved polylogarithmic α were quickly developed in [6, 15].
An alternative scheme, also by Räcke [32], gave a remarkably simple oblivious routing scheme that consisted of a collection of trees. The oblivious routing was simply to route any demand by splitting the flow among the trees. Of course, each tree implicitly corresponds to a laminar family of cuts. Still, the total size of both the routings and the implicit total size of the cuts is quadratic or worse. Moreover, the time complexity of its construction was also at least quadratic. The approximation factor for this scheme was O(log n). We will refer to this as Räcke's tree scheme.
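To see why a tree yields an oblivious scheme, note that each demand pair has a unique tree path, so the routing of one pair never depends on another; the quality then depends on how well tree cuts reflect graph cuts. Below is a small illustrative sketch of routing on a single rooted tree, with our own hypothetical names; the actual scheme of [32] splits each demand across a distribution over many trees.

```python
def tree_route_congestion(parent, cap, demand_pairs):
    """Route each (u, v, amount) demand along the unique u-v tree path and
    return the worst load/capacity ratio over tree edges. `parent[v]` encodes
    a rooted tree; `cap[v]` is the capacity of the edge (v, parent[v])."""
    def path_to_root(v):
        path = []
        while parent[v] is not None:
            path.append(v)
            v = parent[v]
        return path
    load = {v: 0.0 for v in cap}
    for u, v, amount in demand_pairs:
        pu, pv = path_to_root(u), path_to_root(v)
        # Drop the common suffix: edges above the lowest common ancestor.
        while pu and pv and pu[-1] == pv[-1]:
            pu.pop()
            pv.pop()
        for w in pu + pv:
            load[w] += amount
    return max(load[v] / cap[v] for v in cap)
```

The congestion of the tree routing can be much worse than the graph optimum, which is exactly the gap the approximation factor measures.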
Madry, in [27], gave an algorithm, based on Räcke's tree scheme, that produced a congestion-approximator with almost-linear size and running time, at the cost of an approximation factor of n^{o(1)}. A central idea is the use of ultrasparsifiers, which were introduced by Spielman and Teng [38] in their breakthrough results on linear time solvers for Laplacian linear systems. An ultrasparsifier is formed by taking a certain kind of spanning tree (called low-stretch), making clusters from small connected components of the tree, and sampling a very small set of non-tree edges between the clusters. Such a graph approximates the cuts to within a small factor, in the sense that cuts in the original graph and the new one have approximately the same size. This allows a (complicated) recursion where one uses several ultrasparsifiers to approximate the cuts in the graph and recursively produces oblivious routing schemes for each of the ultrasparsifiers. Again, this approach is more efficient in time than Räcke's tree scheme at the cost of a worse approximation factor. The scheme enabled the approximate solution of a host of problems in almost-linear time. This contribution was remarkable.
In sum, Madry [27] produced an n^{o(1)}-congestion-approximator in almost-linear running time. (The o(1) exponent is subconstant but decays slowly, so that an n^{o(1)} factor is larger than any polylogarithmic factor.) The obstruction to getting to polylogarithmic overheads in Madry's scheme is the need to recurse on several ultrasparsifiers to keep the approximation from blowing up during the recursion.
In another exciting development, Sherman [36] used Madry's [27] structure to produce an almost-linear time algorithm for (1+ε)-approximate undirected maximum flow. This is remarkable in that he reduced Madry's approximation factor of n^{o(1)} to 1+ε. The approximation factor in Madry's construction translates into a running time overhead, which results in the almost-linear running time we mentioned. We point out that there is a close relationship to the work of Spielman and Teng [38], who used ultrasparsifiers to spectrally approximate a Laplacian system to provide algorithms whose running times depend on the approximation quality while obtaining arbitrarily precise solutions. In that case, the dependence of the running time on precision was logarithmic, whereas Sherman's algorithm suffers a polynomial dependence on the error ε. To be sure, in solving linear systems the idea of using an "approximate" graph to improve condition numbers was known as preconditioning, but for maximum flow or optimization this was a striking development.
Sherman's method can be seen as using the multiplicative weights framework (see [3]) to route flow across cuts so that every cut has near-zero residual demand across it, in particular at most a small fraction of its capacity. The congestion-approximator then ensures that the remaining demand can be routed with low congestion, and one can recurse with only slightly more extra congestion.
We note that Sherman's result (with Madry's construction) bypasses longstanding barriers to faster maximum flow algorithms. Since the work of Dinitz [11] in 1973, algorithms had to pay either for path length or for the number of paths. Dinitz himself traded these off to get an O(m^{3/2})-time flow algorithm [9, 23]. Using linear time solvers to compute electrical flows, one could eke out a few more paths simultaneously, but one was still pretty much stuck at around m^{4/3}. The congestion-approximator and the multiplicative weights optimization method bypass these obstructions.
In 2014, Räcke, Shah, and Taubig [33] made progress on Räcke's decomposition-based approach. In particular, they showed how to produce a decomposition with polylogarithmic approximation parameter using maximum flow computations of nearly-linear total size. They use the same frame as Räcke [31], but use single-commodity flows in the context of the cut-matching game [19, 28], which, as previously mentioned, can replace multicommodity flows in finding sparse cuts and routings.
Note that a congestion-approximator can be used to compute flows efficiently, and flows can in turn be used to compute congestion-approximators. This suggests recursion, but the construction in [33] (and indeed previous ones) was very much top-down. That is, the first level of the decomposition (which itself is a tour de force) falls victim to the fact that typical paths in the flows that find it can be long and somewhat abundant. Thus, one critically needs something like a congestion-approximator right away to even compute the top level of the decomposition.
Still, Peng [30] was able to get to a nearly-linear running time recursive method by using ultrasparsifiers and Sherman's approximate maximum flow algorithm. He constructs an ultrasparsifier with polylogarithmic approximation factor, recursively computes a congestion-approximator for the resulting graph, and then argues that the combination of Sherman's algorithm with both the ultrasparsifier and the congestion-approximator can be used to compute an approximate maximum flow in nearly-linear time. Again, the key here is that Sherman [36] allows him to use polylogarithmic overhead to combat the polylogarithmic approximation in both the ultrasparsifier and in the congestion-approximator of the ultrasparsifier. Still, the recursion is a bear, resulting in an admittedly unoptimized runtime with a large number of logarithmic factors.
As noted previously, our method is bottom-up from the start and avoids the costly recursion required above. We have not computed the exponent of the logarithmic factors, due, for example, to using the fair cuts method of [25] as a black box. Still, the clustering approach is more natural, and we expect it to be a useful frame.
We proceed with a technical overview of our result and methods.
2 Technical Overview
Our main conceptual and technical contribution is a novel congestion approximator that is constructed bottom-up in a hierarchical fashion. We start with an informal construction in the theorem below, which is later formalized in Theorem 4.1. See Figure 1 for an illustration.

Theorem 2.1 (Informal version of Theorem 4.1).
Consider a capacitated graph G = (V, E). Suppose there exist partitions P_0, P_1, …, P_L of V such that

1. P_0 is the partition of singleton clusters, and P_L is the partition with a single cluster.

2. For each i ∈ {1, …, L}, for each cluster C ∈ P_i, the inter-cluster edges of P_{i−1} internal to C, together with the boundary edges of C, mix in the graph G. (Informally, a set of edges mixes if there is a low-congestion multi-commodity flow between the set of edges whose demand pairs form an expander. In other words, there is an expander flow (in the [5] sense) between the set of edges. Our formal definition is in the preliminaries and only considers single-commodity flows.) Moreover, the mixings over all clusters have low congestion simultaneously.

3. For each i ∈ {1, …, L}, there is a flow in G with low congestion such that each inter-cluster edge of P_i sends its capacity in flow and each inter-cluster edge of P_{i−1} receives at most half its capacity in flow.

For each i ∈ {0, 1, …, L}, let partition R_i be the common refinement of partitions P_i, P_{i+1}, …, P_L, i.e.,
R_i = { C_i ∩ C_{i+1} ∩ ⋯ ∩ C_L : C_j ∈ P_j for each j } ∖ {∅}.
Then, their union R_0 ∪ R_1 ∪ ⋯ ∪ R_L is a congestion-approximator with polylogarithmic quality.
To understand the construction, first consider the case i = 1. By property (1), P_0 is the partition of singleton clusters. Since the inter-cluster edges of P_0 are precisely all the edges, property (2) says that for each cluster C ∈ P_1, the edges internal to C, together with the boundary edges of C, "mix" in the graph G. In other words, the set of edges with at least one endpoint in C mixes in G. In the extreme case P_1 = {V}, the entire edge set mixes in G, so the graph is an expander. (We do not define expanders in this paper since we do not need their precise definition. The connection to expanders is only stated as motivation for readers familiar with the concept.) In general, we can informally say that each cluster is a sort of weak expander, and the partition P_1 is a weak-expander decomposition of the graph G. (A key difference from an actual expander decomposition is that the routing certifying the mixing of each cluster is not required to be fully inside its induced subgraph, although the full graph still needs to have the capacity to support the mixing of all the clusters simultaneously.)
Now consider property (3) for i = 1, which establishes a flow starting from the inter-cluster edges of P_1 such that each edge of the graph receives at most half its capacity in flow. We claim that this statement is very natural and follows almost immediately from property (2) with one mild assumption: for each cluster in P_1, the total capacity of boundary edges is much smaller than the total capacity of internal edges. (From the perspective of expander decompositions, this assumption is very natural. Expander decompositions require that the total capacity of inter-cluster edges is small relative to the total capacity of all edges. We are simply extending this property to hold for each cluster, not just globally over all clusters.) Indeed, from the mixing of the internal and boundary edges of a cluster C, we can spread a flow from the boundary edges of C such that each internal edge receives flow proportional to its capacity. As long as the boundary edges have small total capacity, the total flow source is also small, so each internal edge receives a small amount of flow relative to its capacity. Finally, since the clusters mix simultaneously, we can compose the corresponding flows for each cluster and still ensure small congestion.
Now consider a general level i. Recall from property (2) that for each cluster C ∈ P_i, the inter-cluster edges of P_{i−1} inside C, together with the boundary edges of C, "mix" in the graph G (see Figure 1, middle). We can interpret this statement (again) as a weak-expander decomposition where expansion is measured with respect to a subset of edges, namely the inter-cluster edges of partition P_{i−1} together with the boundary edges of a cluster in P_i. Property (3) says that we can spread flow from the inter-cluster edges of P_i to the inter-cluster edges of P_{i−1} such that each edge in E(P_{i−1}) receives a small amount of flow (see Figure 1, right). Once again, we can show that it follows from property (2) with the following mild assumption: for each cluster in P_i, the total capacity of boundary edges is much smaller than the total capacity of the inter-cluster edges of P_{i−1} that are internal to this cluster.
Overall, the partitions P_1, …, P_L can be viewed as hierarchical expander decompositions, where each partition P_i is a weak-expander decomposition relative to the inter-cluster edges of partition P_{i−1}. It is instructive to compare our partitioning to the expander hierarchy construction of [14], where clusters of the previous partition are contracted before building the next partition. While [14] can also extract a congestion-approximator from their expander hierarchy construction, their quality is n^{o(1)} because they lose a multiplicative factor from contracting vertices on each level of the hierarchy. To avoid this multiplicative blow-up per level, we do not contract vertices, so our partitioning is not truly hierarchical: a cluster in partition P_i may be "cut" into many components by the next partition (see Figure 1, left). While a hierarchical construction is not required, it is a useful property to have when analyzing the quality of the final congestion-approximator. To establish a hierarchy, our key insight is to consider the common refinement of partitions P_i, P_{i+1}, …, P_L, which we name R_i. The partitions R_i over all i are now hierarchical by construction, and we show that the union of all refinements is a congestion-approximator with good quality.
We emphasize that our conceptual idea of not contracting clusters and looking at common refinements is novel and may have future applications to bottom-up constructions of hierarchical objects, especially for obtaining polylogarithmic approximations, where one cannot afford to lose a multiplicative factor at each level. Given that [14] has popularized bottom-up hierarchical approaches, and that their methods so far can only obtain n^{o(1)} factors due to multiplicative errors across levels, we believe that our ideas are promising for future development in this area.
2.1 Bottom-Up Construction
As mentioned previously, the partition P_1 is a weak-expander decomposition of the graph, so it can be computed in nearly-linear time using off-the-shelf expander decomposition algorithms that avoid any black-box call to max-flow [34]. For partitions P_2 onwards, expansion is measured with respect to a subset of edges, so simple expander decomposition algorithms no longer suffice. Instead, we have to resort to expander decomposition algorithms that make black-box calls to (approximate) max-flow. Naïvely, these max-flows can be computed recursively, resulting in a congestion-approximator algorithm that makes recursive calls to max-flow, similar to [33]. Our key insight is that these max-flow instances are actually well-structured enough that recursion is unnecessary. In particular, to construct partition P_{i+1}, the first i partitions P_1, …, P_i (which the algorithm has already computed) can be converted into a pseudo-congestion-approximator, which then translates to a max-flow algorithm sufficient for these well-structured instances.
3 Preliminaries
We are given an undirected, capacitated graph G = (V, E). The graph has n vertices and m edges, and each edge has capacity in the range [1, poly(n)]. For a set S ⊆ V, let ∂S denote the set of edges with exactly one endpoint in S, and let δ(S) denote the total capacity of the edges in ∂S. For a vertex v, let deg(v) denote the capacitated degree of vertex v, which is also equal to δ({v}). When the graph G is clear from context, we may drop the subscript G. We sometimes write ∂_G S, δ_G(S), and deg_G(v) to emphasize that the values are with respect to graph G.
For a given edge set F ⊆ E, let ∂_F S, δ_F(S), and deg_F(v) denote the corresponding values on the subgraph of G with edge set F. For a different graph H, let ∂_H S, δ_H(S), and deg_H(v) denote the corresponding values on graph H. We never remove the subscripts F and H, to avoid confusion with the original graph G.
For a partition P of the vertex set V, we call each part of the partition a cluster. Let E(P) denote the set of edges whose endpoints belong to different clusters; we can also write E(P) = ⋃_{C ∈ P} ∂C. We also define δ(P) as the total capacity of the edges in E(P).
Let U be an arbitrary set of vertices. For a vector b ∈ ℝ^U, let b(v) denote its value on entry v. For a vertex subset S ⊆ U, define b(S) = Σ_{v ∈ S} b(v), and define b|_S as the vector b restricted to S, i.e., (b|_S)(v) = b(v) for all v ∈ S and (b|_S)(v) = 0 for all v ∉ S. For two vectors b, b′ ∈ ℝ^U, by b ≤ b′ we mean entry-wise inequality, i.e., b(v) ≤ b′(v) for all v ∈ U. For a vector b, let |b| be the vector of entry-wise absolute values, i.e., |b|(v) = |b(v)|.
A demand vector b ∈ ℝ^V is a vector whose entries sum to 0, i.e., b(V) = 0. A flow f routes demand b if each vertex v receives a net flow of b(v) in the flow f. A flow f has congestion κ if the amount of flow sent along each (undirected) edge is at most κ times the capacity of that edge. Given a flow f, a path decomposition of flow f is a collection of directed, capacitated paths such that for any two vertices u, v connected by an edge (u, v), the amount of flow that f sends from u to v equals the total capacity of the (directed) paths that contain edge (u, v) in the direction from u to v.
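The path decomposition just defined can be computed greedily when the flow is acyclic: repeatedly walk along positive-flow edges from a vertex with net outflow until stuck, then peel off the bottleneck amount. A minimal sketch under our own assumptions (the flow is given as positive net amounts on directed edges, with no directed cycles):

```python
def path_decomposition(flow):
    """Greedily decompose an acyclic flow into directed, capacitated paths.
    `flow` maps directed edges (u, v) -> positive amount.
    Returns a list of (vertex list, amount) pairs."""
    flow = {e: a for e, a in flow.items() if a > 0}   # private working copy
    def net(v):
        out_ = sum(a for (u, w), a in flow.items() if u == v)
        in_ = sum(a for (u, w), a in flow.items() if w == v)
        return out_ - in_
    paths = []
    while flow:
        # Start at any vertex with positive net outflow (a net source).
        src = next(u for (u, w) in flow if net(u) > 0)
        path, v = [src], src
        while True:
            nxt = next(((u, w) for (u, w) in flow if u == v), None)
            if nxt is None:          # no outgoing flow: reached a net sink
                break
            path.append(nxt[1])
            v = nxt[1]
        # Peel off the bottleneck amount along the path.
        amount = min(flow[(path[i], path[i + 1])] for i in range(len(path) - 1))
        for i in range(len(path) - 1):
            e = (path[i], path[i + 1])
            flow[e] -= amount
            if flow[e] == 0:
                del flow[e]
        paths.append((path, amount))
    return paths
```

Each peel removes at least one edge from the support, so the decomposition has at most as many paths as the flow has edges.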
A vertex weighting is a vector w ∈ ℝ^V with w ≥ 0, i.e., all entries in w are nonnegative. The vertex weighting w mixes in graph G with congestion κ if for any demand b satisfying |b| ≤ w, there is a flow routing b with congestion κ. (There is a close connection between the concept of mixing and expander graphs, though we do not need the definition of expanders in this paper.) A collection of vertex weightings w_1, …, w_k mixes simultaneously with congestion κ if for any demands b_1, …, b_k with |b_j| ≤ w_j for each j, there is a flow routing demand b_1 + ⋯ + b_k with congestion κ.
Throughout the paper, we view vectors in ℝ^V and functions from V to ℝ as interchangeable. In particular, we sometimes treat the degree function deg as a vector deg ∈ ℝ^V; note that deg is a vertex weighting.
Congestion-approximators and approximate flow.
Given a graph G, a congestion-approximator of quality α is a collection 𝒞 of subsets of V such that for any demand b satisfying |b(S)| ≤ δ(S) for all S ∈ 𝒞, there is a flow routing demand b with congestion α. Through Sherman's framework [36, 37], a congestion-approximator of quality α translates to a (1+ε)-approximate maximum flow algorithm whose running time is nearly-linear up to factors polynomial in α and 1/ε. (Here and throughout, the Õ(·) notation hides polylogarithmic factors in n.) In our paper, it is most convenient to work with a stronger variant of approximate min-cut/max-flow called fair cut/flow [25], which is formally defined in Section 5.2.
4 Bottom-Up Congestion-Approximator
In this section, we formally state our congestion-approximator construction and prove its quality guarantee. See Figure 1 for an illustration. The most important distinction is that we define and analyze mixing on vertex weightings, not edges. This simplifies the notation since we can avoid working with the subdivision graph as in [33]. For example, in property (2) below, the mixing of the vertex weighting (deg_{E(P_{i−1})} + deg_{∂C})|_C is conceptually equivalent to the mixing of the edges of E(P_{i−1}) internal to C together with the boundary edges ∂C.
Theorem 4.1.
Consider a capacitated graph G = (V, E), and let L and κ be parameters. Suppose there exist partitions P_0, P_1, …, P_L of V such that

1. P_0 is the partition of singleton clusters, and P_L is the partition with a single cluster.

2. For each i ∈ {1, …, L}, the collection of vertex weightings { (deg_{E(P_{i−1})} + deg_{∂C})|_C : C ∈ P_i } mixes simultaneously in G with congestion κ.

3. For each i ∈ {1, …, L}, there is a flow in G with congestion κ such that each vertex v sends deg_{E(P_i)}(v) flow and receives at most (1/2)·deg_{E(P_{i−1})}(v) flow.

For each i ∈ {0, 1, …, L}, let partition R_i be the common refinement of partitions P_i, P_{i+1}, …, P_L, i.e.,
R_i = { C_i ∩ C_{i+1} ∩ ⋯ ∩ C_L : C_j ∈ P_j for each j } ∖ {∅}.
Then, their union R_0 ∪ R_1 ∪ ⋯ ∪ R_L is a congestion-approximator whose quality is bounded by a polynomial in κ and L.
In Section 5, we develop an efficient algorithm to construct partitions P_0, …, P_L with both L and κ polylogarithmic in n. This algorithm builds the partitions iteratively in the order P_1, P_2, …, P_L. For technical reasons explained in Section 5, we require the following analogue of Theorem 4.1 where P_L is not necessarily the partition with a single cluster. Note that assumptions (2) and (3) remain unchanged below. The key difference is that the union of the partitions R_i is no longer a congestion-approximator, but a pseudo-congestion-approximator whose precise guarantee is stated below.
Lemma 4.2.
Consider a capacitated graph G = (V, E), and let L and κ be parameters. Suppose there exist partitions P_0, P_1, …, P_L of V such that

1. P_0 is the partition of singleton clusters.

2. For each i ∈ {1, …, L}, the collection of vertex weightings { (deg_{E(P_{i−1})} + deg_{∂C})|_C : C ∈ P_i } mixes simultaneously in G with congestion κ.

3. For each i ∈ {1, …, L}, there is a flow in G with congestion κ such that each vertex v sends deg_{E(P_i)}(v) flow and receives at most (1/2)·deg_{E(P_{i−1})}(v) flow.

For each i ∈ {0, 1, …, L}, let partition R_i be the common refinement of partitions P_i, P_{i+1}, …, P_L, i.e.,
R_i = { C_i ∩ C_{i+1} ∩ ⋯ ∩ C_L : C_j ∈ P_j for each j } ∖ {∅}.
Consider their union R = R_0 ∪ R_1 ∪ ⋯ ∪ R_L. Then, for any demand b satisfying |b(S)| ≤ δ(S) for all S ∈ R, there exists a demand b′ satisfying |b′| ≤ deg_{E(P_L)} and a flow routing demand b − b′ with congestion bounded by a polynomial in κ and L.
Instead of proving Theorem 4.1 directly, we prove Lemma 4.2 which is needed for the algorithm. Before we do so, we first establish that Lemma 4.2 indeed implies Theorem 4.1.
Proof (Lemma 4.2 ⇒ Theorem 4.1).
Consider partitions P_0, …, P_L that satisfy the assumptions of Theorem 4.1. For a given demand b satisfying |b(S)| ≤ δ(S) for all S ∈ R_0 ∪ ⋯ ∪ R_L, we want to establish a flow routing demand b with the congestion bound promised by Theorem 4.1. Theorem 4.1 then follows from the definition of congestion-approximator.
Apply Lemma 4.2 to the partitions P_0, …, P_L and the demand b. We obtain a demand b′ satisfying |b′| ≤ deg_{E(P_L)} and a flow f routing demand b − b′ with the promised congestion. By assumption (1) of Theorem 4.1, the partition P_L consists of a single cluster, which implies that E(P_L) = ∅ and hence deg_{E(P_L)} = 0. Since |b′| ≤ deg_{E(P_L)}, we must have b′ = 0. It follows that the flow f routes demand b with the promised congestion, finishing the proof. ∎
For the rest of this section, we prove Lemma 4.2. We begin with two helper claims that establish structure on the partitions R_0, …, R_L.
Claim 4.3.
For all i, j with i ≤ j, the partition R_i of V is a refinement of the partition R_j, in the sense that each set in R_j is the disjoint union of some sets in R_i. In particular, E(R_j) ⊆ E(R_i).
Proof.
Consider a set S = C_j ∩ C_{j+1} ∩ ⋯ ∩ C_L in R_j, for clusters C_k ∈ P_k. Since P_i, P_{i+1}, …, P_{j−1} are all partitions of V, the set S is the disjoint union of all nonempty sets of the form C_i ∩ ⋯ ∩ C_{j−1} ∩ S with C_k ∈ P_k, and each such set belongs to R_i. Therefore, R_i is a refinement of R_j, and since refinements can only increase the boundary set, the second statement follows. ∎
Claim 4.4.
For all i ∈ {0, …, L−1}, we have E(R_i) ⊆ E(P_i) ∪ E(R_{i+1}).
Proof.
Consider an edge (u, v) ∈ E(R_i) with (u, v) ∉ E(R_{i+1}); it suffices to show that (u, v) ∈ E(P_i). Since (u, v) ∉ E(R_{i+1}), there is a set S ∈ R_{i+1} containing both vertices u and v. As in the proof of Claim 4.3, write S = C_{i+1} ∩ ⋯ ∩ C_L for clusters C_k ∈ P_k. The set S is the disjoint union of all nonempty sets of the form C ∩ S for clusters C ∈ P_i, and each such set belongs to R_i. Since u, v ∈ S, both u and v belong to sets of this form. Since (u, v) ∈ E(R_i), the sets containing u and v must be different. They can only differ in the cluster C ∈ P_i, so u and v belong to different clusters of P_i, and we obtain (u, v) ∈ E(P_i) as promised. ∎
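The common refinement and the two claims above are easy to check computationally on small examples. In the sketch below (our own helper names, not from the paper), two elements share a cluster of the refinement iff they share a cluster in every partition, and the inter-cluster edges of the refinement of two partitions are contained in the union of the partitions' inter-cluster edges, in the spirit of Claims 4.3 and 4.4.

```python
def common_refinement(partitions):
    """Common refinement of a list of partitions of the same ground set,
    each given as a list of sets: two elements share a refinement cluster
    iff they share a cluster in every partition."""
    def label(v):
        return tuple(next(i for i, C in enumerate(P) if v in C) for P in partitions)
    ground = set().union(*partitions[0])
    clusters = {}
    for v in ground:
        clusters.setdefault(label(v), set()).add(v)
    return list(clusters.values())

def inter_cluster_edges(edges, partition):
    """Edges whose endpoints lie in different clusters of the partition."""
    cluster_of = {v: i for i, C in enumerate(partition) for v in C}
    return {e for e in edges if cluster_of[e[0]] != cluster_of[e[1]]}
```

For instance, refining {{1,2},{3,4}} with {{1,3},{2,4}} cuts the ground set down to singletons, and every inter-cluster edge of the refinement crosses one of the two original partitions.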
Let b be a flow demand satisfying |b(S)| ≤ δ(S) for all S ∈ R. We need to construct a demand b′ satisfying |b′| ≤ deg_{E(P_L)} and a flow routing demand b − b′ with congestion bounded by a polynomial in κ and L.
The construction of the flow has L iterations. On iteration i ∈ {1, …, L}, we construct a flow f_i and a demand b_i such that

1. The flow f_i routes demand b_{i−1} − b_i, where we initialize b_0 = b on iteration i = 1.

2. The flow f_i has congestion O(κ).

3. For all S ∈ R_i, we have b_i(S) = b_{i−1}(S).

4. The demand b_i satisfies |b_i| ≤ deg_{E(P_i)}.

Note that for the last level i = L, property (4) by definition gives |b_L| ≤ deg_{E(P_L)}. The lemma below shows that properties (1), (2), and (4) alone are sufficient to prove Lemma 4.2 with demand b′ = b_L and flow f_1 + ⋯ + f_L.
Lemma 4.5.
Proof.
In order to establish the conditions above, we will use the following technical lemma:
Lemma 4.6.
Consider an iteration i and a vector x such that

(a) .

(b) for all .

Then, we can construct a flow f such that

(i) Flow f routes demand x − w for a vector w with .

(ii) The flow f has congestion .

(iii) For all , we have .
Before we prove this lemma, we first establish that it implies properties (1) to (4) above for appropriate flows f_i and demands b_i.
Proof.
We induct on i, where the base case i = 0 is just property (4) (and b_0 = b). For this base case, since the singleton sets are in R_0, they are also in R, so |b(v)| ≤ δ({v}) = deg(v) for all v ∈ V, which implies |b_0| ≤ deg, as desired.
For the inductive step, we apply Lemma 4.6 on iteration and the vector with
We first verify the conditions on required by Lemma 4.6.
(a)

(b)
To establish condition b, fix a set . We first prove that . This is trivial for , so assume that . For a given , the set is a disjoint union of some sets by Claim 4.3. Apply property (3) for iteration to obtain for all . Summing over all gives , so . Over all iterations , we obtain .
By definition of vector , we have , which equals from above. Finally, since the initial flow demand satisfies , we have , establishing condition b.
With the conditions fulfilled, Lemma 4.6 outputs a flow f, which we set as f_i, immediately satisfying property (2). We set b_i so that flow f_i routes demand b_{i−1} − b_i and property (1) holds. Property (3) follows from property (iii) of Lemma 4.6.
For the rest of this section, we establish Lemma 4.6, the most technical part of the proof. Before doing so, we first prove three technical helper claims.
Claim 4.8.
For any i ∈ {1, …, L} and any vector x with 0 ≤ x ≤ deg_{E(P_i)}, there is a vector w with 0 ≤ w ≤ (1/2)·deg_{E(P_{i−1})} and a flow routing demand x − w with congestion κ.
Proof.
By assumption (3) of Lemma 4.2, there is a flow in G with congestion κ such that each vertex v sends deg_{E(P_i)}(v) flow and receives at most (1/2)·deg_{E(P_{i−1})}(v) flow. In other words, there is a vector w′ with 0 ≤ w′ ≤ (1/2)·deg_{E(P_{i−1})} and a flow routing demand deg_{E(P_i)} − w′ with congestion κ. Take a path decomposition of the flow where each vertex v is the start of deg_{E(P_i)}(v) total capacity of (potentially empty) paths and the end of w′(v) total capacity of (potentially empty) paths. Since x ≤ deg_{E(P_i)}, we can remove or decrease the capacity of paths until each vertex v is the start of x(v) total capacity of paths. The resulting flow routes demand x − w for some vector w ≤ w′ ≤ (1/2)·deg_{E(P_{i−1})}, with congestion κ. ∎
Claim 4.9.
For any , consider any vector with . There exists a vector with and a flow routing demand with congestion .
Proof.
We prove the statement by induction from down to . For the base case , since , we have . By Claim 4.8 on vector , there is a vector with and a flow routing demand with congestion .
For the inductive step, consider a vector with . Define as
which satisfies . By induction, there exists a vector with and a flow routing demand with congestion .
Next, consider the vector , which satisfies , where the second inequality holds by Claim 4.3. If , then and , so . Otherwise, if , then
where the first inequality holds by Claim 4.4. Combining the two cases, we obtain . By Claim 4.8 on vector , there exists a vector with and a flow routing demand with congestion . Scaling this flow by factor , we obtain a flow routing demand with congestion . The final flow is the sum of flows and , which routes demand and has congestion . We set , which concludes the inductive step and hence the proof. ∎
Claim 4.10.
For any , consider any vector with . There exists a vector such that
1. .
2. For all clusters , we have .
3. There is a flow routing demand with congestion .
Proof.
Apply Claim 4.9 on vector , obtaining a vector which we relabel as , and a flow routing demand with congestion . Take a path decomposition of flow where each vertex is the start of total capacity of (potentially empty) paths and the end of total capacity of (potentially empty) paths. For each path starting at a vertex in some cluster , perform the following operation. If the path contains an edge in with , then replace the path with its prefix ending at ; otherwise, do nothing to the path. Note that the modified path ends in the same cluster as its starting point. These modified paths form a new flow , which also has congestion .
We now bound the difference in the demands routed by and . To do so, we consider the difference in endpoints in the old and new path decompositions. Each vertex was initially the endpoint of total capacity of paths. We now claim that for each cluster , each vertex becomes the new endpoint of at most total capacity of paths. This is because each new endpoint is a result of an edge in some path, and the total capacity of such paths is at most since the flow has congestion . It follows that each vertex is the (new or old) endpoint of at most total capacity of paths in the new flow .
Define vector such that each vertex is the endpoint of total capacity of paths in the new flow . In other words, the flow routes demand . We have shown that , and combined with the guarantee from Claim 4.9, we conclude that . Finally, recall that each path of starts and ends in the same cluster of , so for all clusters . ∎
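The truncation operation used in this proof, replacing each path by its prefix ending just before the first "marked" edge it traverses, can be sketched as follows (an illustrative sketch; the marked edge set stands in for the cut edges whose exact definition is elided above):

```python
def truncate_paths(paths, marked):
    """Replace each path (a list of vertices) by its prefix ending just
    before the first marked edge it traverses; paths avoiding all marked
    edges are kept whole, as in the proof of Claim 4.10."""
    out = []
    for path in paths:
        prefix = [path[0]]
        for u, v in zip(path, path[1:]):
            if (u, v) in marked:
                break
            prefix.append(v)
        out.append(prefix)
    return out
```

Since every prefix uses a subset of its original path's edges, the truncated flow has congestion no larger than the original.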
Proof.
We first construct demand as follows. For each set , define for all , which satisfies by condition (b). Since is a partition of , this fully defines demand , which satisfies the bound required by property (i). Also, since , we have , so
satisfying property (iii). In particular, and is a valid demand.
Since , we apply Claim 4.10 with , obtaining a vector which we relabel as . We have and for all clusters . Let be the flow routing demand with congestion .
Consider a cluster . We have . Moreover, for all vertices , we have
We conclude that the scaled-down and restricted vector is a demand satisfying . By assumption (2) of Lemma 4.2, the collection of vertex weightings mixes simultaneously in with congestion , so there is a flow in routing demand with congestion . Scaling this flow by factor , we obtain a flow routing demand with congestion at most .
The final flow is , which routes demand and has congestion , concluding the proof of Lemma 4.6. ∎
5 Partitioning Algorithm
The partitioning algorithm starts with the partition of singleton clusters. The algorithm then iteratively constructs partition given the current partitions . The theorem below establishes this iterative algorithm, where we substitute for .
Theorem 5.1.
Consider a capacitated graph , and let be a parameter. Suppose there exist partitions that satisfy the three properties in Lemma 4.2, i.e.,
1. is the partition of singleton clusters.
2. For each , the collection of vertex weightings mixes simultaneously in with congestion .
3. For each , there is a flow in with congestion such that each vertex sends flow and receives at most flow.
Then, there is an algorithm that constructs partition such that properties (2) and (3) hold for as well. The algorithm runs in time.
We remark that the parameters and , which result in an overall quality of , are far from optimized. We prioritize a clean and modular exposition over the optimization of logarithmic factors, leaving the latter open for future work.
Before we describe the algorithm, we first show that iterations suffice to obtain a congestion-approximator.
Claim 5.2.
After iterations, we have , and is a congestion-approximator with quality .
Proof.
We first claim that for all . By property (3), there is a flow that sends a total of flow among the vertices, and receives a total of at most flow among the vertices. Since the total flow sent equals the total flow received, we have , or equivalently, .
The guarantee ensures that for , we must have . Since all edge capacities are assumed to be at least , we conclude that and , fulfilling property (1) of Theorem 4.1. By Theorem 4.1, we obtain a congestion-approximator with quality . ∎
The rest of this section is dedicated to proving Theorem 5.1. Throughout the section, we fix the input graph as well as the current partitions . We also define and according to Lemma 4.2.
At a high level, our algorithm proceeds similarly to expander decomposition algorithms like [34], where expanders are defined with respect to the vertex weighting of partition . We iteratively decompose the graph using the cut-matching game of [18]: for each cluster of the decomposition, we either compute a low-conductance cut or certify that the current cluster is mixing (or, in the terms of [34], a nearly expander). The matching step of the cut-matching game requires a call to approximate min-cut/max-flow, but recall that the partitions only form a pseudo-congestion-approximator. Luckily, we can modify the graph in such a way that the pseudo-congestion-approximator can be adapted into an actual congestion-approximator for the new graph. We then black-box a cut/flow algorithm on this new graph, and modify the resulting cut and flow to fit the old graph in a way that suffices for the cut-matching game.
In more detail, we break down this section as follows:
1. In Section 5.1, we show that can be used to construct a congestion-approximator for slightly modified graphs.
2. In Section 5.2, we cite the (approximate) fair cut/flow algorithm of [25], which computes a cut/flow pair with desirable properties given a congestion-approximator.
3. In Section 5.3, we introduce the cut-matching game as well as a trimming procedure similar to [34]. Both the cut-matching game and the trimming step use the fair cut/flow algorithm (Section 5.2) on the modified graphs for which we have a congestion-approximator (Section 5.1).
4. Finally, in Section 5.4, we establish Theorem 5.1. We present the recursive clustering algorithm that computes the next partition given the current partitions . It uses the cut-matching game and trimming procedures of Section 5.3.
5.1 Congestion-approximator
We first show how to build a congestion-approximator on certain graph instances that show up in our algorithm. We define these graph instances below.
Definition 5.3 ().
For given vertex set , parameter , and vertex weightings , define the graph as follows:
- Start with the graph .
- Add new vertices , , and .
- For each vertex , add an edge with capacity .
- For each vertex , add an edge with capacity .
- For each vertex , add an edge with capacity .
To understand these instances, recall a fact about that follows from Lemma 4.2.
Fact 5.4.
For any demand satisfying for all , there exists a demand satisfying and a flow in routing demand with congestion , where we define .
Suppose we start with the entire graph , and then add a vertex connected to each vertex with an edge of capacity . In the setting of Fact 5.4, suppose we wish to route the demand . We start with the flow routing demand as promised by Fact 5.4. To route the remaining demand , we simply use the new edges incident to : for each vertex , send flow along the edge in the proper direction, which is a flow with congestion . Overall, we obtain a flow routing demand with congestion , and with some more work, we can show that is a congestion-approximator of the new graph.
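The residual routing described above, sending each vertex's leftover demand directly along its star edge to the added vertex, is mechanical enough to sketch. In the illustrative Python below, the apex vertex, the per-vertex residual values, and the star-edge capacities are all hypothetical placeholders for the (elided) quantities in the text:

```python
def route_via_apex(residual, capacity):
    """Route each vertex's residual demand along its star edge to a
    hypothetical apex vertex: positive residual is pushed to the apex,
    negative residual is pulled from it.  Returns the signed flow on
    each star edge and the worst congestion incurred on a star edge."""
    star_flow = {v: d for v, d in residual.items()}  # signed flow v -> apex
    congestion = max((abs(d) / capacity[v] for v, d in residual.items()),
                     default=0.0)
    return star_flow, congestion
```

Each unit of residual uses only its own star edge, so the congestion of this extra routing is controlled entirely by the ratio of residual to star-edge capacity.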
For our algorithm, we actually work with graphs as described in Definition 5.3. In particular, the base graphs are induced subgraphs, and there are additional vertices and . One issue is that the newly added edges may also contribute to the values of for .13 (This issue is also present in the example with the entire graph, but can be resolved by investigating the structure of .) Nevertheless, we show in the lemma below that as long as are “well-behaved”, we can modify into a congestion-approximator for .
Lemma 5.5.
Consider partitions for a graph that satisfy the three properties in Lemma 4.2, and define the partitions and according to Lemma 4.2. Fix vertex set , parameter , and vertex weightings on , and denote the graph by . Consider a parameter such that the following assumption holds:
1. for all .
Let be the collection . Then, is a congestion-approximator of with quality .
For the rest of Section 5.1, we prove Lemma 5.5. Consider a demand satisfying for all as well as for . We want to establish a flow on routing demand with congestion .
We first handle the demand at , , and . For each edge where , route flow from to (or flow from to , whichever is nonnegative). Since , we route at most flow along each edge . Analogously, for each edge where , route flow from to , and for each edge where , route flow from to . By the same argument, we route at most and flow along each edge and , respectively. In other words, the routing so far has congestion .
After this initial routing, vertices , , and no longer have any demand, and each vertex receives at most additional demand in absolute value. In other words, if is the new demand that must be routed, we have and .
In order to invoke Fact 5.4, our next goal is to show the following.
Claim 5.6.
for all .
Proof.
We first bound as
Next, we bound as follows. By assumption, we have . By construction of , we have
Putting everything together, we obtain
(1)
It remains to bound , which we split into . By construction of , we have
and
We now bound the individual terms and above. For , we claim the bound : any edge in with an endpoint in has its other endpoint outside , so the edge must be in , and the claimed bound holds. For , we claim the bound . The first inequality is trivial, and for the second inequality, observe that by construction of , each set is a subset of some cluster in the partition . It follows that any edge in with an endpoint in has its other endpoint outside , so the edge must be in , and we conclude that .
Let be the vector without entries and with new entries for all . Since is a demand with , we have that is also a demand, i.e., the coordinates sum to . By Claim 5.6, we have
so we can apply Fact 5.4 on demand to obtain a demand satisfying and a flow on routing demand with congestion . Scaling this flow by factor , we obtain a flow on routing demand with congestion .
Next, imagine contracting into a single vertex labeled , so that each edge has capacity . Consider the corresponding flow on this contracted graph, which sends at most flow on each edge . Now consider the exact same flow on , whose edges have capacities instead of . These capacities are at least times the capacities of the contracted graph, so the corresponding flow has congestion at most a factor larger. We have established a flow on routing demand with congestion .
Finally, since demand satisfies , we can directly route the demand along the edges of capacity , which is a routing with congestion .
Adding up all three routings, we have routed the initial demand with congestion , concluding the proof of Lemma 5.5. ∎
5.2 Fair Cut/Flow Algorithm
Given a congestion-approximator, the most convenient min-cut/max-flow algorithm is the fair cut/flow algorithm of [25]. We state the definition of a fair cut/flow and then cite the main result of [25].
Definition 5.7 (Fair cut/flow).
Let be an undirected graph with edge capacities . Let be two vertices in . For any parameter , we say that an cut and a feasible flow is an -fair -cut/flow pair if for each edge with and , the flow sends at least flow along the edge in the direction from to .
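In the notation of [25], the fairness condition requires every edge crossing the cut from the s-side to carry at least a 1/α fraction of its capacity in the crossing direction. The checker below tests just this inequality; it is an illustrative sketch (flow feasibility and validity of the cut are assumed to be verified separately, and the dictionary representation is ours, not the paper's):

```python
def is_fair_pair(S, flow, capacity, alpha):
    """Check the fairness inequality of Definition 5.7: every edge
    (u, v) with u inside the cut side S and v outside must carry at
    least capacity[(u, v)] / alpha flow in the direction u -> v.
    `flow[(u, v)]` is the signed flow from u to v (default 0)."""
    return all(flow.get((u, v), 0.0) >= capacity[(u, v)] / alpha
               for (u, v) in capacity
               if u in S and v not in S)
```

Fact 5.8 is immediate from this condition: a fair flow saturates a 1/α fraction of the cut, so the cut's capacity is within a factor α of the flow value.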
Fact 5.8.
For any -fair -cut/flow pair , the cut is an -approximate minimum -cut.
Theorem 5.9 (Fair cut/flow algorithm [25]).
Consider a graph , two vertices , and error parameter . Given a congestion-approximator with quality , there is an algorithm that outputs a -fair -cut/flow pair in time where .
We remark that the fair cut/flow algorithm above is not the fastest available algorithm. However, it is conceptually the easiest for our purposes, and we believe that future work may improve the running time of fair cut/flow algorithms to approach those of standard approximate cut/flow algorithms. Hence, we decide to black-box a fair cut/flow algorithm rather than starting with a standard cut/flow algorithm and massaging it to work in our setting.
To apply Theorem 5.9 to the graph with the congestion-approximator of Lemma 5.5, we need to bound for the congestion-approximator . Recall that is the union of partitions of , so is the union of partitions of . So , where the comes from the singletons in . It follows that Theorem 5.9 runs in time where is the number of edges in incident to vertices in .
We conclude this section with the main subroutine that we use to construct partition . Note that assumption 1 below remains unchanged.
Theorem 5.10 (Flow/cut subroutine).
Consider partitions for a graph that satisfy the three properties in Lemma 4.2, and define the partitions and according to Lemma 4.2. Fix vertex set , parameter , and vertex weightings on , and denote the graph by . Consider a parameter such that the following assumption holds:
1. for all .
Let be the collection . Then, given two vertices and error parameter , there is an algorithm that outputs a -fair -cut/flow pair in time, where is the number of edges in incident to vertices in .
5.3 Cut-Matching Game and Trimming
We follow the cut-matching game treatment in [34]: either find a “balanced” cut of small capacity, or ensure that a “large” part of the graph mixes with low congestion. The following lemma is similar to Theorem 2.2 of [34] with one key difference: there is no built-in flow subroutine, so the algorithm makes black-box calls to the fair cut/flow algorithm of Theorem 5.10.
In past work [33, 1], the analysis of the cut-matching game for capacitated graphs has only been sketched, referencing the fact that a capacitated graph can be modelled by an uncapacitated graph with parallel edges (at a cost). For completeness, we provide a full proof of this capacitated case in Appendix A.
Theorem 5.11 (Cut-Matching).
Consider a graph with integral edge capacities in the range . Let be a vertex subset, let be parameters, and define as . Suppose that the following assumption holds:
1. There is a flow on with congestion such that each vertex is the source of flow and each vertex is the sink of at most flow.
There exist a parameter and a randomized Monte Carlo algorithm that outputs a (potentially empty) set such that
1. ,
2. , and
3. either , or the vertex weighting mixes in with congestion with high probability.
The algorithm makes at most calls to Theorem 5.10 with parameters
Outside these calls, the algorithm takes an additional time, where is the number of edges in incident to vertices in .
Property (3) asserts that either an approximately balanced cut is found, or the vertex weighting mixes with low congestion (with high probability). In our algorithm for Theorem 5.1, we actually want the weighting to mix in the second case. To guarantee this stronger property, we augment the set into through one additional call to the fair cut/flow algorithm of Theorem 5.10. The algorithm is similar to the flow-based expander trimming procedure in [34]. For completeness, we defer the algorithm and proof to Appendix B. Note that the setting, including assumption 1, is the same as in Theorem 5.11.
Theorem 5.12 (Trimming).
Consider a graph with integral edge capacities in the range . Let be a vertex subset, let be parameters, and define as . Suppose that the following assumption holds:
1. There is a flow on with congestion such that each vertex is the source of flow and each vertex is the sink of at most flow.
There is a deterministic algorithm that takes as input a subset and a parameter , and outputs a (potentially empty) set such that
1. ,
2. ,
3. if the vertex weighting mixes in with congestion , then the vertex weighting mixes in with congestion , and
4. there exists a vector with and a flow on routing demand with congestion .
The algorithm makes one call to Theorem 5.10 with parameters
Outside of this call, the algorithm takes an additional time, where is the number of edges in incident to vertices in .
5.4 Clustering Algorithm
With the necessary primitives established, we now describe how to construct partition given the partitions . The algorithm is recursive, taking as input a vertex subset that is initially . Throughout, we maintain the invariant that each input subset satisfies assumption 1, which is the same for Theorems 5.11 and 5.12.
On input , the algorithm calls Theorem 5.11 with parameters and for a large enough constant . The algorithm obtains an output set and then calls Theorem 5.12 on inputs and with the same parameters , obtaining a set . There are now two cases:
1. If , then recursively call the algorithm on inputs and if they are nonempty.
2. Otherwise, make a single recursive call on input if it is nonempty, and add the set to the final partition .
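The two-case recursion above can be sketched as the following skeleton, with the calls to Theorems 5.11 and 5.12 and the balance test abstracted into callbacks. All names, and the balance criterion used in the test below, are illustrative placeholders for the paper's (elided) parameters:

```python
def cluster(U, cut_matching, trimming, is_balanced, partition):
    """Recursive skeleton of the Section 5.4 algorithm.

    cut_matching and trimming stand in for Theorems 5.11 and 5.12;
    is_balanced stands in for the (capacity-based) case distinction.
    Clusters of the next partition are appended to `partition`."""
    R = cut_matching(U)            # cut-matching game on U
    Rp = trimming(U, R)            # trim the returned set
    if is_balanced(U, Rp):
        # Case 1: recurse on both sides of the (trimmed) cut.
        for side in (Rp, U - Rp):
            if side:
                cluster(side, cut_matching, trimming, is_balanced, partition)
    else:
        # Case 2: recurse on the small side; the rest becomes a cluster.
        if Rp:
            cluster(Rp, cut_matching, trimming, is_balanced, partition)
        partition.append(U - Rp)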
Claim 5.13.
Property (2) of Theorem 5.1 holds for , i.e., the collection of vertex weightings mixes simultaneously in with congestion .
Proof.
By property (3) of Theorem 5.11 and property (3) of Theorem 5.12, for each set added to the final partition , the vertex weighting mixes in with congestion . Since , we have
so in particular, the vertex weighting also mixes in with congestion . The recursive instances that add a set to are disjoint, so the vertex weightings mix simultaneously in with the same congestion. We bound the congestion by , concluding the proof. ∎
It remains to establish property (3) of Theorem 5.1 for as well as assumption 1 of Theorems 5.11 and 5.12. To do so, we first prove a few guarantees of the algorithm.
Claim 5.14.
For any recursive call , we have where .
Proof.
We first claim that . For any edge in , consider its endpoint in . Either it is in , in which case the edge belongs to , or it is in , in which case the edge belongs to . It follows that , and we can bound as follows:
By properties (1) and (2) of Theorem 5.11, we have and . By properties (1) and (2) of Theorem 5.12, we have and . The only two options for the recursive instance are and , and in both cases, we have
Combining the two bounds so far, we obtain
To bound , we consider two cases based on whether or . If , then we must be in case (1) of the algorithm, which means . In this case, we bound . Together with the bound , we obtain
as promised. Otherwise, suppose that . We have
as promised. With both cases established, this concludes the proof. ∎
For a given recursive call , define its recursion depth inductively as follows: the initial call has depth , and given a recursive call of depth , all of its recursive calls have depth . By Claim 5.14, the value of decreases multiplicatively by factor on each recursive call, so the maximum recursion depth is .
For a given recursion depth , let denote the union of edges over all instances of depth . By construction of the algorithm, the (disjoint) union of over all recursion depths is exactly . To avoid clutter, we also define .
Claim 5.15.
For any recursion depth , there is a flow on with congestion such that each vertex sends flow and receives at most flow.
Proof.
We prove the statement by induction on . The base case is satisfied with the empty flow: since is the partition with a single part, each vertex indeed sends flow. Now assume by induction that there is a flow on with congestion such that each vertex sends flow and receives at most flow.
For each instance of depth , the algorithm calls Theorem 5.12 which defines . By property (4) of Theorem 5.12, there exists a vector with and a flow on routing demand with congestion . We now construct a flow in with congestion such that each vertex sends flow and receives at most flow. First, for each edge in , send flow to full capacity in the direction from to . In this initial flow, each vertex sends exactly flow, and each vertex receives exactly flow. Next, we send the flow scaled by , so that each vertex sends exactly flow and each vertex receives at most flow. Note that since . Summing the two flows, we obtain a flow in such that each vertex sends flow and receives at most flow. The congestion of the flow is , since edges in have congestion in the initial flow, and edges in have congestion in the flow scaled by .
To complete the induction, our final flow is the union of the constructed flow over all recursive instances of depth . Since the flow for instance is in , and since the instances are disjoint, the flows are also disjoint over all . It follows that their union is a flow on with congestion such that each vertex sends flow and each vertex receives flow. ∎
Finally, the two claims below establish property (3) of Theorem 5.1 and assumption 1, respectively.
Claim 5.16.
Property (3) of Theorem 5.1 holds for , i.e., there is a flow in with congestion such that each vertex sends flow and receives at most flow.
Proof.
Let be the maximum recursion depth. Summing the flows from Claim 5.15 over all recursion depths , and using that , we obtain a flow with congestion such that each vertex sends flow and receives at most flow. Recall that we set for large enough constant . We choose large enough that , so that each vertex receives at most flow. We can cancel out at most flow received at each vertex from the flow sent. After cancellation, we obtain a flow with congestion such that each vertex sends at least flow and receives at most flow. Scaling the flow by factor , taking a path decomposition, and removing enough paths until each vertex is the start of exactly paths, we obtain the desired flow with congestion . ∎
Claim 5.17.
For each recursive instance , the assumption 1 of Theorems 5.11 and 5.12 hold, i.e., there is a flow on with congestion such that each vertex is the source of flow and each vertex is the sink of at most flow.
Proof.
By Claim 5.16, there is a flow in with congestion such that each vertex sends flow and receives at most flow. Observe that for any recursive instance , we have since the recursive algorithm starting at instance adds a partition of into . In particular, each vertex sends at least flow. Take a path decomposition of the flow and remove enough paths until each vertex is the start of exactly paths. The resulting flow satisfies assumption 1, concluding the proof. ∎
It remains to bound the running time of the algorithm for Theorem 5.1. For each instance , Theorems 5.11 and 5.12 run in time plus calls to Theorem 5.10, which takes time per call, for a total time of . The instances on a given recursion depth are disjoint, so the sum of over all such instances is . The maximum recursion depth is , so the sum of over all instances of the algorithm is . It follows that the algorithm of Theorem 5.1 runs in time.
6 Approximate Maximum Flow
From Theorem 5.1 and Claim 5.2, we obtain an algorithm that constructs a congestion-approximator of quality in time. Recall that Sherman’s framework [36, 37] translates a congestion-approximator of quality to a -approximate maximum flow algorithm with running time . Thus, for any parameter , we obtain a -approximate maximum flow algorithm with running time .
References
- [1] Arpit Agarwal, Sanjeev Khanna, Huan Li, Prathamesh Patil, Chen Wang, Nathan White, and Peilin Zhong. Parallel approximate maximum flows in near-linear work and polylogarithmic depth. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3997–4061. SIAM, 2024.
- [2] Sanjeev Arora, Elad Hazan, and Satyen Kale. approximation to sparsest cut in time. SIAM Journal on Computing, 39(5):1748–1771, 2010.
- [3] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8(1):121–164, 2012.
- [4] Sanjeev Arora, James R Lee, and Assaf Naor. Euclidean distortion and the sparsest cut. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 553–562, 2005.
- [5] Sanjeev Arora, Satish Rao, and Umesh Vazirani. Expander flows, geometric embeddings and graph partitioning. Journal of the ACM (JACM), 56(2):1–37, 2009.
- [6] Marcin Bienkowski, Miroslaw Korzeniowski, and Harald Räcke. A practical algorithm for constructing oblivious routing schemes. In Proceedings of the Fifteenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’03, page 24–33, New York, NY, USA, 2003. Association for Computing Machinery.
- [7] Jeff Cheeger. A lower bound for the smallest eigenvalue of the Laplacian. In Problems in Analysis, pages 195–199. Princeton University Press, 1970.
- [8] Li Chen, Rasmus Kyng, Yang P Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva. Maximum flow and minimum-cost flow in almost-linear time. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 612–623. IEEE, 2022.
- [9] Paul Christiano, Jonathan A Kelner, Aleksander Madry, Daniel A Spielman, and Shang-Hua Teng. Electrical flows, laplacian systems, and faster approximation of maximum flow in undirected graphs. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 273–282, 2011.
- [10] Michael B Cohen, Rasmus Kyng, Gary L Miller, Jakub W Pachocki, Richard Peng, Anup B Rao, and Shen Chen Xu. Solving SDD linear systems in nearly m log^{1/2} n time. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 343–352, 2014.
- [11] Yefim Dinitz. Dinitz’ algorithm: the original version and Even’s version. In Theoretical Computer Science: Essays in Memory of Shimon Even, pages 218–240. Springer, 2006.
- [12] Lester Randolph Ford and Delbert R Fulkerson. Maximal flow through a network. Canadian Journal of Mathematics, 8:399–404, 1956.
- [13] Gramoz Goranci, Harald Räcke, Thatchaphol Saranurak, and Zihan Tan. The expander hierarchy and its applications to dynamic graph algorithms. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2212–2228. SIAM, 2021.
- [15] Chris Harrelson, Kirsten Hildrum, and Satish Rao. A polynomial-time tree decomposition to minimize congestion. In Proceedings of the Fifteenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’03, page 34–43, New York, NY, USA, 2003. Association for Computing Machinery.
- [16] Arun Jambulapati and Aaron Sidford. Ultrasparse ultrasparsifiers and faster laplacian system solvers. ACM Transactions on Algorithms, 2021.
- [17] Jonathan A Kelner, Yin Tat Lee, Lorenzo Orecchia, and Aaron Sidford. An almost-linear-time algorithm for approximate max flow in undirected graphs, and its multicommodity generalizations. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 217–226. SIAM, 2014.
- [18] Rohit Khandekar, Satish Rao, and Umesh Vazirani. Graph partitioning using single commodity flows. Journal of the ACM (JACM), 56(4):1–15, 2009.
- [20] Philip Klein, Satish Rao, Ajit Agrawal, and R Ravi. An approximate max-flow min-cut relation for undirected multicommodity flow, with applications. Combinatorica, 15(2):187–202, 1995.
- [21] Ioannis Koutis, Gary L Miller, and Richard Peng. A nearly-m log n time solver for SDD linear systems. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, pages 590–598. IEEE, 2011.
- [22] Rasmus Kyng and Sushant Sachdeva. Approximate Gaussian elimination for Laplacians: fast, sparse, and simple. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 573–582. IEEE, 2016.
- [23] Yin Tat Lee, Satish Rao, and Nikhil Srivastava. A new approach to computing maximum flows using electrical flows. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 755–764, 2013.
- [24] Tom Leighton and Satish Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM (JACM), 46(6):787–832, 1999.
- [25] Jason Li, Danupon Nanongkai, Debmalya Panigrahi, and Thatchaphol Saranurak. Near-linear time approximations for cut problems via fair cuts. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 240–275. SIAM, 2023.
- [26] Nathan Linial, Eran London, and Yuri Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica, 15:215–245, 1995.
- [27] Aleksander Madry. Fast approximation algorithms for cut-based problems in undirected graphs. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 245–254, 2010.
- [28] Lorenzo Orecchia, Leonard J Schulman, Umesh V Vazirani, and Nisheeth K Vishnoi. On partitioning graphs via single commodity flows. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 461–470, 2008.
- [29] Richard Peng. A note on cut-approximators and approximating undirected max flows. CoRR, abs/1411.7631, 2014.
- [30] Richard Peng. Approximate undirected maximum flows in O(m polylog(n)) time. In Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms, pages 1862–1867. SIAM, 2016.
- [31] Harald Räcke. Minimizing congestion in general networks. In Proceedings of the 43rd Symposium on Foundations of Computer Science, FOCS ’02, page 43–52, USA, 2002. IEEE Computer Society.
- [32] Harald Räcke. Optimal hierarchical decompositions for congestion minimization in networks. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, STOC ’08, page 255–264, New York, NY, USA, 2008. Association for Computing Machinery.
- [33] Harald Räcke, Chintan Shah, and Hanjo Täubig. Computing cut-based hierarchical decompositions in almost linear time. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 227–238. SIAM, 2014.
- [34] Thatchaphol Saranurak and Di Wang. Expander decomposition and pruning: Faster, stronger, and simpler. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2616–2635. SIAM, 2019.
- [35] Jonah Sherman. Breaking the multicommodity flow barrier for -approximations to sparsest cut. In 2009 50th Annual IEEE Symposium on Foundations of Computer Science, pages 363–372, 2009.
- [36] Jonah Sherman. Nearly maximum flows in nearly linear time. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 263–269. IEEE, 2013.
- [37] Jonah Sherman. Area-convexity, regularization, and undirected multicommodity flow. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 452–460, 2017.
- [38] Daniel A Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, pages 81–90, 2004.
Appendix A Cut-Matching Game
In this section, we prove Theorem 5.11, restated below. See 5.11
Our setup resembles Appendix B of [34] with a few minor changes. In particular, since our flow routine is more restrictive, we have to adapt the algorithm to handle our flow outputs.
We begin with notation from [34]. For simplicity, we avoid working with the subdivision graph in [33, 34]. Define a -commodity flow as a multi-commodity flow where each vertex is the source of quantity of its distinct flow commodity. Only for analysis, we consider a flow-matrix which encodes information about a -commodity flow. We say that is routable with congestion if there exists a -commodity flow such that, simultaneously for all , we have that can send quantity of its own commodity to , and the amount of flow through each edge is at most .
The algorithm initializes flow-matrix as the diagonal matrix with value on entry . Trivially, is routable with zero congestion. The algorithm initializes and , and then proceeds for at most rounds. For each round , the algorithm implicitly updates to such that it is routable with congestion at most . The operation for implicitly updating will be described explicitly later on, but we ensure that row sums do not change from to , i.e., there is always total quantity of each commodity spread among the vertices. For each round , the algorithm (explicitly) finds a partition of into , and then computes
-
(i)
A (possibly empty) set satisfying and , and
-
(ii)
A (possibly empty) flow from to such that each vertex is the source of at least flow, and each vertex is the sink of at most flow. The flow has congestion .
The algorithm then updates and . Note that on each round , the sets and partition . If holds, then the algorithm immediately terminates and outputs . Otherwise, we have at the end, and the algorithm outputs .
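The round structure above can be summarized in a short schematic. The helper names (`bisect_by_projection`, `cut_and_flow`) are hypothetical stand-ins for the partition step and the max-flow/min-cut call of Theorem 5.10, with trivial bodies so that the control flow is runnable; this is a sketch of the loop shape, not the paper's actual interface.

```python
def bisect_by_projection(A):
    # Trivial stand-in for the random-projection bisection of the active set.
    vs = sorted(A)
    half = len(vs) // 2
    return set(vs[:half]), set(vs[half:])

def cut_and_flow(A_left, A_right):
    # Trivial stand-in for the max-flow/min-cut subroutine: it would return
    # a pruned set S (step (i)) and a flow across the bisection (step (ii)).
    return set(), None

def run_rounds(vertices, capacity, T, threshold):
    """Skeleton of the round loop: maintain the active set A and removed
    set R, terminating early once R accumulates too much capacity."""
    A, R = set(vertices), set()
    for _ in range(T):
        A_left, A_right = bisect_by_projection(A)
        S, _flow = cut_and_flow(A_left, A_right)
        A, R = A - S, R | S
        if sum(capacity[v] for v in R) > threshold:
            return None, R      # early termination: output the cut R
    return A, R                 # after T rounds, A is certified to mix
```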
Lemma A.1.
For any round , we have and .
Proof.
We start by proving the first statement. Each time we remove a set , we are guaranteed that . We charge the part to the vertices in so that each vertex is charged exactly . Since the algorithm updates and , each newly charged vertex leaves and joins . In total, we charge to the vertices in so that each vertex is charged once at exactly . It follows that and
concluding the first statement of the lemma.
For the second statement, consider the round with , if it exists, at which point the algorithm terminates. (If there is no such , then we are done.) We have and , and the final set satisfies . It follows that the new set satisfies
concluding the second statement and the proof. ∎
Purely for the analysis, we define a potential function for each round as follows. For each vertex , let be row of matrix ; we call a flow-vector of . We define the potential function
where
(2) |
is the weighted average of the flow-vectors in , which is also the minimizer of when treated as a function of . The latter fact can be verified separately for each coordinate ; namely,
follows from setting the derivative to zero.
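As a concrete illustration, one plausible weighted form of this potential (the particular normalization below is our assumption for illustration) can be evaluated directly, together with the minimizing weighted average:

```python
import numpy as np

def potential(F, c):
    """Phi = sum_v c(v) * ||f_v / c(v) - mu||^2, where row v of F is the
    flow-vector f_v and mu is the capacity-weighted average of the
    normalized flow-vectors; this mu minimizes the sum, as the
    coordinate-wise derivative argument in the text shows."""
    mu = F.sum(axis=0) / c.sum()        # weighted average of the f_v / c(v)
    diffs = F / c[:, None] - mu
    return float((c * (diffs ** 2).sum(axis=1)).sum()), mu
```

A quick numerical check confirms that perturbing the center away from the weighted average can only increase the sum.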
A.1 Small Potential Implies Mixing
We first show that if for a sufficiently large polynomial, then the vertex weighting mixes in .
Lemma A.2.
For any , if and for large enough constant , then the vertex weighting mixes in with congestion .
For the rest of Section A.1, we prove Lemma A.2. Suppose that and for large enough constant . We first prove two claims about the flow-matrix . Let denote the sum .
Claim A.3.
.
Proof.
Since the -commodity flow routing has congestion , and since is a cut separating from , we have . Since by Lemma A.1, we have
where the last inequality holds by the assumption . Since the sum of row in is always , we have
Subtracting the two inequalities above, we conclude that
Claim A.4.
For all , we have
In particular, .
Proof.
Since by assumption, we have
so
In particular,
By the definition of ,
concluding the first statement of the claim. By Claim A.3, the expression above is at least , so
for large enough, concluding the second statement. ∎
With Claims A.3 and A.4 established, we now prove Lemma A.2. Recall that is routable with congestion . Decompose this -commodity flow into single-commodity flows that send quantity of commodity from to .
Consider any demand satisfying . In particular, . We want to construct a single-commodity flow routing demand with congestion . For each , we first route the flow
Summing over all , we observe that each vertex sends a total of demand. Also, by Claim A.4, each flow has (absolute) value
In particular, these flows can be routed simultaneously with congestion times the -commodity flow, which is congestion .
After routing this flow, each vertex receives demand
Since , we can approximate for all . Together with Claim A.4, we can approximate the numerator and denominator of the fraction above to
which equals since . When is large enough, the approximated value of is within an additive of the true value. In particular, each vertex receives a demand whose absolute value is at most . Since the minimum edge capacity is , the remaining demand can trivially be routed with congestion . The final congestion is , concluding the proof of Lemma A.2.
A.2 Computing the Set and Flow
In this subsection, we describe a round of the algorithm in detail, following Lemma B.3 of [34] with some minor changes. Recall that on each iteration , the algorithm first computes a partition of into , and then computes a set and flow satisfying certain properties. The algorithm then updates and . The choice of partition will depend on an implicitly represented flow-matrix that is useful for the analysis.
A.2.1 Constructing the partition
We start with the construction of and . We first list some variables key to the algorithm and analysis.
-
1.
Let be a random unit vector orthogonal to the all-ones vector.
-
2.
For each , let be the projection of normalized flow-vector onto the vector . We later show in Claim A.12 that the values can be computed in total time without explicitly maintaining the flow matrix .
-
3.
Let be the projection of the weighted average onto the vector . It is only used in the analysis.
-
4.
Let and be constructed by Lemma A.5 below.
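The random direction in item 1 can be sampled by the standard construction of projecting a Gaussian vector off the all-ones direction and normalizing (the function name is ours):

```python
import numpy as np

def random_unit_orthogonal(n, rng=None):
    """Uniform random unit vector orthogonal to the all-ones vector:
    sample a standard Gaussian, subtract its mean (projecting out the
    all-ones direction), then normalize."""
    if rng is None:
        rng = np.random.default_rng()
    r = rng.standard_normal(n)
    r = r - r.mean()
    return r / np.linalg.norm(r)
```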
Lemma A.5.
Given the values for all , we can find in time a partition of into two sets and a separation value such that
-
(a)
separates the projections of , i.e., either or ,
-
(b)
and , and
-
(c)
.
Proof.
Sort the vertices by their value of in ascending order, and let the sorted list be . Let be the smallest integer such that . Define the vertex sets and . By our construction of , we have for both . Since , we have
so pick such that
Define so that
It follows that we can take and , which satisfies all three properties of the lemma. ∎
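The sweep in the proof can be sketched as follows; the half-weight balance threshold used here is an illustrative choice, not necessarily the exact constant of the lemma.

```python
def balanced_split(proj, weight):
    """Sort vertices by projection value and take the smallest prefix
    carrying at least half the total weight as the left side; the boundary
    projection serves as the separation value, so every left vertex
    projects at most eta and every right vertex at least eta."""
    order = sorted(proj, key=proj.get)
    total = sum(weight.values())
    acc = 0.0
    for k, v in enumerate(order):
        acc += weight[v]
        if 2 * acc >= total:
            break
    left, right = set(order[: k + 1]), set(order[k + 1 :])
    eta = proj[order[k]]
    return left, right, eta
```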
We conclude this section with a lemma about the behavior of projections onto a random unit vector. Since the lemma is standard in high-dimensional geometry, we omit the proof and refer to Lemma 3.4 of [18].
Lemma A.6 (Lemma 3.4 of [18]).
For all vertices , we have
and for any pair and constant , we have
with probability at least . In particular, there exists a constant such that for any pair ,
A.2.2 Max-flow/min-cut call
Given and , we call Theorem 5.10 with parameters
(3) |
and we denote the graph by . We will later show in Claim A.10 that Assumption 1 of Theorem 5.10 is satisfied with our parameter .
From Theorem 5.10, we obtain an -approximate fair cut/flow pair . The algorithm sets , which satisfies the condition below as required by step (i).
Claim A.7.
and .
Proof.
We begin with the first statement, bounding as follows. The edges in can be split into three groups:
-
1.
The edges in , which have total capacity ,
-
2.
The edges incident to , which have total capacity , and
-
3.
The edges incident to or , which we ignore.
It follows that
By Fact 5.8, the cut value is at most times the minimum -cut. Since the -minimum cut is at most , we have
It follows that
concluding the first statement. For the second statement, observe that and , so
concluding the second statement and the proof. ∎
The algorithm defines the flow as follows. First, scale the flow by factor and restrict the flow to edges in . In other words, the flow on edges incident to is removed. Then, decompose the flow into paths (using, for example, a dynamic tree [ST83]). For each vertex , remove enough paths starting at until it is the start of exactly total capacity of paths, and for each vertex , remove all paths starting at ; let the remaining flow be . The claim below shows that this last step is always possible, and that satisfies the condition below as required by step (ii).
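The scale-decompose-trim step can be mimicked with an explicit (quadratic-time) path decomposition; the dynamic-tree version cited above achieves the same effect implicitly in near-linear time. The per-source `quota` below is a hypothetical stand-in for the exact amounts in the text.

```python
from collections import defaultdict

def decompose(flow):
    """Greedily peel source-to-sink paths off an acyclic flow given as
    {(u, v): value}; each iteration saturates at least one edge, so the
    number of paths is at most the number of edges."""
    f = {e: x for e, x in flow.items() if x > 1e-12}
    paths = []
    while f:
        heads = {b for (_, b) in f}
        start = next(u for (u, _) in f if u not in heads)  # a source
        path = [start]
        while True:
            nxt = [(a, b) for (a, b) in f if a == path[-1]]
            if not nxt:
                break                    # reached a sink
            path.append(nxt[0][1])
        val = min(f[(path[i], path[i + 1])] for i in range(len(path) - 1))
        for i in range(len(path) - 1):
            e = (path[i], path[i + 1])
            f[e] -= val
            if f[e] <= 1e-12:
                del f[e]
        paths.append((path, val))
    return paths

def trim(paths, quota):
    """Keep paths per start vertex up to quota[start] total value,
    splitting the last kept path fractionally and dropping the rest."""
    kept, used = [], defaultdict(float)
    for path, val in paths:
        room = quota.get(path[0], 0.0) - used[path[0]]
        take = min(val, max(room, 0.0))
        if take > 0:
            kept.append((path, take))
            used[path[0]] += take
    return kept
```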
Claim A.8.
is a flow from to such that each vertex is the source of exactly flow, and each vertex is the sink of at most flow. The flow has congestion .
Proof.
It suffices to show that the scaled flow , once restricted to edges in , is a flow such that
-
1.
Each vertex is the source of at least flow, and
-
2.
The only sinks are at vertices , and each such vertex is the sink of at most flow.
The path-removing step in the algorithm is then possible, and ensures that each vertex is the source of exactly flow in , each vertex is the sink of at most flow, and there are no (nonzero) sources or sinks elsewhere.
We begin with the unscaled flow . Since is a -fair cut/flow pair, each vertex receives at least total flow from vertices in . Since , the same holds for all . By construction of , we have for all . It follows that each vertex receives at least total flow from vertices in .
We now investigate the effect of restricting the flow to edges in , starting with removing all edges incident to . Continuing the argument above, removing these edges causes each vertex to be the source of at least flow.
If , then we now remove the edges incident to . By construction of , each vertex has an edge to of capacity . Since the flow is feasible, there is at most flow along the edge . Removing this edge changes the net flow out of by at most . Since each vertex is the source of at least flow before this step, it is the source of at least flow after this step. In particular, it cannot become a sink. Also, if , then it is the source of at least flow after restriction.
Finally, we remove the edges incident to . Since the flow is feasible, each edge carries at most flow. By construction of , we have for all , so each vertex receives a net flow of at most from vertices other than (and then sends that flow to ). We may assume that the flow does not send any flow away from (i.e., along any edge in the direction from to ), since otherwise we can remove such flow using a path decomposition. Under this assumption, removing flow on edges incident to does not create any sources, only sinks. In particular, each vertex is now the sink of at most flow.
Finally, scaling this restricted flow by , we conclude that the scaled flow has congestion , each vertex is the source of at least flow, and each vertex is the sink of at most flow. ∎
Claim A.9.
Given flow , the flow can be constructed in time.
Proof.
The flow is on the graph with at most edges, so scaling and restricting the flow takes time. Using a dynamic tree, we can decompose the new flow into at most (implicit) paths in time. The path removal step also takes time through dynamic trees. The overall running time is . ∎
Claim A.10.
Assumption 1 of Theorem 5.10 holds for our choice of parameters . That is, we have for all .
Proof.
Recall that we set , and . For the entire proof, fix a set . We first bound as
For , we have since any edge in with an endpoint in has its other endpoint outside . For , we claim the bound . The first inequality is trivial, and for the second inequality, observe that by construction of , each set is a subset of some cluster in the partition . It follows that any edge in with an endpoint in has its other endpoint outside , so the edge must be in , and we conclude that . In total, we have established the bound .
Next, we bound as
and together with the bound on , we obtain
(4) |
We now focus on bounding . Recall from Assumption 1 of Theorem 5.11 that there is a flow on with congestion such that each vertex is the source of flow and each vertex is the sink of at most flow. In particular, among the vertices , there is a total of source and at most sink. Since the flow has congestion , the net flow out of is at most . Putting everything together, we obtain
where the second inequality follows from our earlier claim . Continuing from (4), we conclude that
finishing the proof. ∎
A.2.3 Constructing the matching and updating the flow-matrix
Using the flow , the algorithm then constructs a (fractional) matching graph on vertex set . Take a path decomposition of flow into paths from to . For each pair of vertices and , add an edge to whose capacity is the sum of capacities of all – paths in the decomposition of . The following claim is immediate using an (implicit) path decomposition.
Claim A.11.
Given flow , the matching graph can be constructed in time.
From the matching graph , we implicitly update the flow-matrix to as follows. For each , we set
Note that we are viewing as a graph on vertex set , where vertices outside are isolated.
Recall from Section A.2.1 that the algorithm needs to compute the projection onto the vector for each vertex . Now that has been explicitly defined, we show that the projections can be computed in total time .
Claim A.12.
Given vector , the projection for each vertex can be computed in total time .
Proof.
By linearity of inner product, we have
for all vertices and iterations . Since the flow is on , we can assume that the path decomposition has at most paths, so the matching graph has at most edges. In other words, there are at most nonzero values of . It follows that given the values for all , we can compute for all in time. Since is a diagonal matrix, the initial values can be computed in time. Over iterations, the total time is . ∎
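Concretely, the linearity argument lets the projections be updated each round in time proportional to the number of matching edges, never touching the full flow-matrix. The specific mixing coefficients below (half the normalized difference per unit of matching weight) are an illustrative stand-in for the paper's exact update rule.

```python
def update_projections(u, c, matching):
    """Given u[v] = <f_v, r> and capacities c, apply one matching step by
    linearity of the inner product. Each fractional matching edge
    (a, b, w) averages normalized mass across the edge, which preserves
    the total sum of u (mirroring conservation of each commodity)."""
    new_u = dict(u)
    for a, b, w in matching:
        delta = (w / 2.0) * (u[a] / c[a] - u[b] / c[b])
        new_u[a] -= delta
        new_u[b] += delta
    return new_u
```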
A.3 Analyzing the Potential Decrease
The main goal of Section A.3 is to prove the following lemma, establishing an expected decrease in on each iteration.
Lemma A.13.
Over the random choice of the unit vector , we have .
To prove Lemma A.13, we begin by listing properties of and .
Claim A.14.
for all , and for all .
Proof.
For each vertex , its degree in equals the total capacity of paths starting at in the decomposition of , which is exactly by Claim A.8. A symmetric argument establishes for all . ∎
Claim A.15.
For each vertex , the flow-vector sums to .
Proof.
We prove by induction on that the flow-vector sums to . The statement is true for since is defined as the diagonal matrix with value on entry . Assume by induction that the flow-vector sums to for each . For each , we take the definition of vector and sum over its coordinates to obtain
completing the induction and the proof. ∎
Claim A.16.
If is routable with congestion , then is routable with congestion .
Proof.
Take the -commodity flow routing with congestion and reverse it, forming a -commodity flow routing the transpose with the same congestion. In this reversed flow, each vertex has quantity of commodity . In other words, the quantities of each commodity at vertex are captured by its flow vector .
Next, we “mix” commodities along the edges of the matching graph : for each edge in of capacity , send a proportional fraction of each commodity at along the corresponding path (in the path decomposition of ) in the direction from to , and send a proportional fraction of each commodity at in the direction from to . By Claim A.15, each vertex has total quantity of commodities, so we send total quantity of commodities in each direction, or in total, along this path of capacity (in the path decomposition of ). Since flow has congestion by Claim A.8, the total congestion of this mixing step is . After this mixing step, the quantities of each commodity at vertex are exactly . In other words, we have established a -commodity flow routing with congestion . Reversing the flow, we obtain a routing of with the same congestion. ∎
The following lemma adapts Lemma 3.4 of [33] to the capacitated case.
Lemma A.17.
.
Proof.
For each vertex , define the normalized flow vector , and define the normalized flow matrix with vector for each row . Let be the diagonal matrix with value on entry . We can then write . Define the vectors and matrix analogously. For each vertex , by definition of flow vector , we have
Let be the Laplacian matrix for the matching graph on vertex set (where vertices outside are isolated). That is, we define for all and for all distinct .
We first prove the following about the matrix .
Subclaim 1.
.
Proof.
Consider the normalized Laplacian where is the diagonal matrix with value on each entry . It is well-known that the normalized Laplacian has eigenvalues in the range , so . Multiplying by on both sides gives . Claim A.14 implies that , so we obtain . Finally, multiplying by on both sides gives the desired . ∎
By definition, the matrix has value on entry and value on entry with . It follows that we can write matrix as
For two matrices , define the Hadamard product . To bound , we use the fact that minimizes the expression in (2) to get
For , we use to get
Taking the difference and expanding,
The third term equals , which equals since is the zero matrix. Expanding the first and second terms, we obtain
(5) |
To bound the first term in (5), recall from Subclaim 1 that , which means that . Since and has eigenvalues in the range , we have , so
so the first term is at least .
Continuing from (5), we conclude that
It remains to understand the above expression . For each vertex , let denote the unit vector in direction . The Laplacian can be written as
so that
We conclude that
Finally, we prove the main lemma of Section A.2.3, restated below. See A.13
Proof.
By Lemma A.17, we have
(6) |
By Lemma A.6, we have
(7) |
for all vertices , and for some constant ,
(8) |
for all pairs . By Claim A.14, we have for all , and together with statements a to c of Lemma A.5, we have
(9) |
Taking the expectation and putting everything together,
which concludes the proof of Lemma A.13. ∎
A.4 Putting Everything Together
Finally, we prove correctness of the algorithm by establishing properties (1) to (3) in Theorem 5.11. Properties (1) and (2) follow directly from Lemma A.1. For property (3), if the algorithm terminates early, then by the termination condition and property (3) is satisfied. Otherwise, we have . By Lemma A.13, the potential for any iteration is at most times the previous potential in expectation. Over iterations, the expected potential is at most , where the constant can be made arbitrarily large by setting the constant in appropriately. By Markov’s inequality, we have with probability at least , in which case Lemma A.2 implies that vertex weighting mixes in with congestion , fulfilling property (3).
It remains to bound the running time. For each iteration, the projections for each vertex can be computed in total time by Claim A.12. Computing the values takes additional time by Lemma A.5. The algorithm makes one call to Theorem 5.10 with the parameters in (3), and then computes flow and the matching graph in time by Claims A.9 and A.11. Overall, the algorithm runs in time per iteration for many iterations, which is time in total. The algorithm also makes many calls to Theorem 5.10.
With both properties (1) to (3) and the running time established, this concludes the proof of Theorem 5.11.
Appendix B Expander Trimming
In this section, we prove Theorem 5.12, restated below. See 5.12
Our setup closely resembles Section 3.2 of [34], except that we replace the exact min-cut/max-flow oracle in their “slow-trimming” procedure with an approximate one. We also avoid defining expanders and work exclusively with flow mixing, which simplifies the analysis.
Define the vertex weighting as for all , and for all . We call Theorem 5.10 with parameters
and we denote the graph by . Note that except for and , these parameters match the parameters (3) in the proof of Theorem 5.11, where and are replaced by and , respectively. Since Assumption 1 is the same as the one in Theorem 5.11, the statement and proof of Claim A.10 (which does not depend on or ) translate directly to this setting. For brevity, we omit the details and simply restate Claim A.10 below.
Claim B.1.
Assumption 1 of Theorem 5.10 holds for our choice of parameters . That is, we have for all .
From Theorem 5.10, we obtain an -approximate fair cut/flow pair . The algorithm sets , which trivially takes time outside the call. It remains to show that properties (1) to (4) are satisfied.
To show property (1), we start with and proceed by bounding . By Fact 5.8, the cut value is at most times the minimum -cut. Since the -minimum cut is at most , we have
By definition of , we have . Putting everything together,
fulfilling property (1). By construction of , we also have , so
and dividing by establishes property (2).
Claim B.2.
Property (4) holds, i.e., there exists a vector with and a flow on routing demand with congestion .
Proof.
Since is a -fair cut/flow pair, each vertex receives at least total flow from vertices in . Since , the same holds for all . By construction of , we have the following for all :
It follows that each vertex receives at least total flow from vertices in .
We now investigate the effect of restricting the flow to edges in , starting with removing all edges incident to . Continuing the argument above, removing these edges causes each vertex to be the source of at least flow.
If , then we now remove the edges incident to . By construction of , each vertex has an edge to of capacity . Since the flow has congestion , there is at most flow along the edge . Removing this edge changes the net flow out of by at most . Since each vertex is the source of at least flow before this step, it is the source of at least flow after this step.
At this point, only the vertices and remain, and each vertex is the source of at least flow. Scale the flow by factor , take a path decomposition of the flow, and remove paths until each vertex is the source of exactly flow. We may assume that the flow does not send any flow away from (i.e., along any edge in the direction from to ), since otherwise we can cancel such flow using the path decomposition.
Finally, we remove the edges incident to . Since the original flow is feasible, the scaled flow has congestion , so each edge carries at most flow. By construction of , we have for all , so each vertex receives a net flow of at most from vertices other than (and then sends that flow to ). Let be the net flow received, so that the vector satisfies . Removing the edge increases the net flow into by exactly . Since each vertex is the source of exactly flow before removing the edge , the vertex has net flow after removal. In other words, the new flow routes demand and has congestion . By construction, it is also restricted to , concluding the proof. ∎
Claim B.3.
Property (3) holds, i.e., if the vertex weighting mixes in with congestion , then the vertex weighting mixes in with congestion .
Proof.
Consider any demand satisfying . In particular, . To fulfill property (3), we want to construct a single-commodity flow routing demand with congestion .
We first split into demands and , where vector is defined as
Since , we have and .
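One natural concrete choice for such a split, offered purely as an illustration (the vertex-wise bounds here are our assumption), is to clip the demand entry-wise and keep the remainder:

```python
def split_demand(d, bound):
    """Split a demand vector d into d1 + d2, where d1 clips each entry to
    the interval [-bound[v], bound[v]] and d2 carries the leftover."""
    d1 = {v: max(-bound[v], min(bound[v], x)) for v, x in d.items()}
    d2 = {v: x - d1[v] for v, x in d.items()}
    return d1, d2
```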
Take the flow from property (3), and compute a path decomposition of the flow. For each path in the decomposition from vertex to vertex of capacity , route flow along the path from to (or flow in the reversed direction, whichever is nonnegative). Since , we send at most flow along each such path of capacity , so our new flow has congestion at most that of , which is . Each vertex is the end of at most total capacity of paths in the decomposition, and for each such path of capacity , the vertex receives a net flow of at most in absolute value. It follows that each vertex receives at most net flow in absolute value. Setting as this net flow, we obtain and that flow routes demand .
Having routed demand through flow , it remains to route demand . We bound
By assumption, the vertex weighting mixes in with congestion , so there is a flow routing demand with congestion . Scaling this flow by factor , we obtain a flow routing demand with congestion . Summing the two flows, the final flow routes demand with congestion , concluding the proof. ∎