Congestion-Approximators from the Bottom Up

Jason Li (Carnegie Mellon University), Satish Rao (UC Berkeley), Di Wang (Google Research)

Abstract

We develop a novel algorithm to construct a congestion-approximator with polylogarithmic quality on a capacitated, undirected graph in nearly-linear time. Our approach is the first bottom-up hierarchical construction, in contrast to previous top-down approaches including that of Räcke, Shah, and Taubig (SODA 2014), the only other construction achieving polylogarithmic quality that is implementable in nearly-linear time (Peng, SODA 2016). Similar to Räcke, Shah, and Taubig, our construction at each hierarchical level requires calls to an approximate max-flow/min-cut subroutine. However, the main advantage to our bottom-up approach is that these max-flow calls can be implemented directly without recursion. More precisely, the previously computed levels of the hierarchy can be converted into a pseudo-congestion-approximator, which then translates to a max-flow algorithm that is sufficient for the particular max-flow calls used in the construction of the next hierarchical level. As a result, we obtain the first non-recursive algorithms for congestion-approximator and approximate max-flow that run in nearly-linear time, a conceptual improvement to the aforementioned algorithms that recursively alternate between the two problems.

1 Introduction

The famous max-flow min-cut theorem¹ implies that any set of excesses and deficits in a graph can be connected by disjoint paths if and only if every cut in the graph has at least as much capacity in the edges that cross the cut as the “demand” that is required to cross it. One direction is easy to see (if a cut does not have enough capacity, clearly one cannot route demand paths across it), and the other can be established by linear programming duality (or explicitly using an algorithm, as was done by Ford and Fulkerson in 1956 [12]).

¹ For the sake of this introduction, a flow problem on a graph is a set of excesses and deficits on vertices, and a solution is a set of paths connecting excesses and deficits where no edge is used in more than one path.

Framed slightly differently, a cut whose ratio of demand across it to capacity is $c$ implies that any flow satisfying the demands must send at least $c$ times its capacity in flow through some edge. Again, the max-flow min-cut theorem implies that there is always such a cut if the optimal flow has congestion $c$ . Here, the congestion of a flow is the maximum amount of flow routed on any edge in the graph, relative to that edge's capacity.
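The lower bound given by a single cut can be checked on a toy instance. The following is a minimal sketch, not taken from the paper: the function name `cut_ratio`, the 4-cycle graph, and the demand values are all hypothetical choices made for illustration.

```python
from fractions import Fraction


def cut_ratio(edges, demand, side):
    """Return |b(S)| / c(S), the demand-to-capacity ratio of the cut (S, V - S).

    edges  -- list of (u, v, capacity) triples of an undirected graph
    demand -- dict mapping a vertex to its excess (+) or deficit (-)
    side   -- the vertex set S on one side of the cut
    """
    side = set(side)
    # Total capacity of edges with exactly one endpoint in S.
    capacity = sum(c for (u, v, c) in edges if (u in side) != (v in side))
    # Net demand that has to cross the cut.
    net = abs(sum(demand.get(v, 0) for v in side))
    return Fraction(net, capacity)


# Hypothetical instance: a 4-cycle a-b-c-d with unit capacities,
# 3 units of excess at a and 3 units of deficit at c.
edges = [("a", "b", 1), ("b", "c", 1), ("c", "d", 1), ("d", "a", 1)]
demand = {"a": 3, "c": -3}

ratio = cut_ratio(edges, demand, {"a"})
print(ratio)  # 3/2
```

In this instance the cut around `a` has capacity 2 and must carry 3 units, so every flow has congestion at least 3/2; splitting the flow evenly over the two sides of the cycle achieves exactly that value.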

More generally, the set of feasible demands in a graph is captured by the (exponentially sized) set of all cuts in the graph. Remarkably, it was shown that polynomially many cuts could approximately (within logarithmic factors) characterize the congestion of any set of demands [31, 32]. Indeed, the total size of the cuts was nearly-linear² in [31, 33] and could be computed in nearly-linear time as announced in 2014 by Peng [29].

² Nearly-linear means $O(m\log^{c}n)$ where $c$ is a constant. Almost-linear means $O(m^{1+\epsilon})$ for some $\epsilon=o(1)$ .

More precisely, an $\alpha$ -congestion-approximator for a graph is a set of cuts where for every set of demands, the minimum ratio (over cuts in the $\alpha$ -congestion-approximator) of the demands crossing the cut and the capacity of the cut determines the minimum congestion of the optimal flow (that routes the demands) to within an $\alpha$ factor.

In this paper, we provide an $\alpha$ -congestion-approximator (along with a routing scheme) that can be constructed in nearly-linear time with an $\alpha$ of $O(\log^{10}n)$ . The number of logarithmic factors is not optimized, but the structure is significantly simpler than the construction by Peng [29].

The top-down frame of the decomposition of [33] requires that one compute approximate maximum flows at the very top level, which entailed a non-linear runtime. A breakthrough result of Sherman [36] showed how to use congestion-approximators to find approximate maximum flows.³ Peng [30] used the result of Sherman [36] inside the top-down partitioning method based on [33] for constructing congestion-approximators. There is a chicken-and-egg problem here as the two methods need each other, and a costly recursion based on ultra-sparsifiers was used to combine them, which we discuss a bit more below.

³ Simultaneously, Kelner et al. [17] used a dual version of congestion-approximators called oblivious routers to give efficient algorithms for approximate maximum flow.

Our result proceeds in a bottom-up iterative fashion; indeed, the bottom level is a “weak expander decomposition” which can be computed using trivial congestion-approximators consisting of singleton vertices.⁴ Then we proceed level by level, using the lower level clusterings as “pseudo”-congestion-approximators for the next. The frame is simple, but “around the edges”, both literally (the edges of the clusters) and metaphorically, there are technical issues which require some attention.

⁴ We note that expander decompositions themselves have found wide application, including in the breakthrough almost-linear time algorithm of [8] for maximum flow. Also, the high-level idea of our congestion-approximator is similar to hierarchical expander decomposition [13], and our techniques may be useful in getting from almost-linear to nearly-linear.

Still, given the power that congestion-approximators provide with respect to understanding the structure of graphs, finding more efficient constructions is important, and an effective bottom-up or clustering approach seems a natural path to follow. We take the first substantive step on this path in the decade since the announcement of Peng’s nearly-linear time algorithm [29], which also achieves polylogarithmic quality.

Recent progress on maximum flow can be compared to developments in nearly-linear time Laplacian solvers [38], which were initially very complex with many logarithmic factors. But over the years, new tools were developed that both improved the running times and allowed for better understanding of graphs as well as having broader application. See, for example, [21, 22, 10, 16]. Progress has been much slower for maximum flow in comparison. Perhaps one reason is that solutions to Laplacian linear systems are $\ell_{2}$ optimizing flows that involve the Euclidean norm which is a fair bit simpler than the $\ell_{\infty}$ norm central to maximum flow or norms based on more general convex bodies. We believe this result to be a step in the process of understanding maximum flows better.

We proceed with a discussion of previous work which consists of a remarkable series of developments.

1.1 Previous work

The maximum flow, minimum cut theorem is nice and remarkable in that one can find a single cut that establishes the optimality of the flow or “routing”, and the optimal routing establishes a tight lower bound on the size of any cut.⁵

⁵ A wrinkle is that there could be many (even exponentially many) minimum cuts or feasible optimal flows. But any optimal routing or any optimal cut establishes the optimality of the “dual” object.

We note that this theorem was extended to multi-commodity flow in an approximate sense in [24], [20] and [26] where flows and corresponding cuts are shown to be related by an $O(\log n)$ factor. The first paper gives a method for finding the sparsity or conductance of a graph to within an $O(\log n)$ factor by using multicommodity flow to embed a complete graph, and the others extend the techniques to give approximations between cuts and solutions for arbitrary multicommodity flow instances. In a parallel thread, the mixing times of random walks or eigenvalues (which involve flow problems with an $\ell_{2}$ objective) are related to cuts via classical and tremendously impactful results of Cheeger [7]. Combining eigenvalues and linear programming methods through semidefinite programming yields improved approximate relationships between embeddings and cuts of roughly $O(\sqrt{\log n})$ [5, 4]. Fast versions of these methods involve the cut-matching game developed in [2, 18, 28, 35] and used directly in this work.

As we noted previously, one can view the (exponentially sized) set of feasible sets of demands in a graph as being a very general measure of a graph’s capabilities. And as also mentioned above, Räcke in [31] showed that a polynomial sized set of cuts and corresponding pre-computed routings approximately model the congestion required for any of (exponentially many) sets of possible demands.

A bit more formally, [31] provides an oblivious routing scheme, where the scheme can obliviously route any set of demands with no more than $\alpha$ times as much congestion as the optimal routing. Here, oblivious roughly means that for any demand pair the routing is done without considering any other demand pair. [31] also provides a decomposition where for any set of demands the maximum congestion (i.e., ratio of demand crossing to edge capacity) on any cut in the decomposition is within a factor of $\alpha$ of the congestion needed to route those demands.

The value for $\alpha$ in [31] is $O(\log^{3}n)$ , but the result was non-constructive; however, constructive and improved schemes were quickly developed with $\alpha$ of $O(\log^{2}n\log\log n)$ in [6, 15].

An alternative scheme, also by Räcke [32], gave a remarkably simple oblivious routing scheme that consisted of $O(m)$ trees. The oblivious routing was simply to route any demand by splitting the flow among the trees. Of course, each tree implicitly corresponds to a laminar family of cuts. Still, the total size of both the routings and the implicit total size of the cuts is quadratic or worse. Moreover, the time complexity for its construction was also at least quadratic. The approximation factor $\alpha$ for this scheme was $O(\log n)$ . We will refer to this as Räcke’s tree scheme.

Madry [27] gave an algorithm, based on Räcke’s tree scheme, that produces a congestion-approximator of almost-linear size in almost-linear running time, at the cost of an approximation factor of $O(n^{\epsilon})$ . A central idea is the use of ultrasparsifiers, which were introduced by Spielman and Teng [38] in their breakthrough results on nearly-linear time solvers for Laplacian linear systems. An ultrasparsifier is formed by taking a certain kind of spanning tree (called low-stretch), making clusters from small connected components of the tree, and sampling a very small set of non-tree edges between the clusters. Such a graph approximates the cuts to within a small factor in the sense that cuts in the original graph and the new one have approximately the same size. This allows a (complicated) recursion where one uses several ultrasparsifiers to approximate the cuts in the graph and recursively produces oblivious routing schemes for each of the ultrasparsifiers. Again, this approach is more efficient in time than Räcke’s tree scheme at the cost of a worse approximation factor. The scheme enabled the approximate solution to a host of problems in almost-linear time. This contribution was remarkable.

In sum, Madry [27] produced an $\alpha=O(n^{\epsilon})$ -congestion-approximator in the almost-linear running time of $O(m^{1+\epsilon})$ .⁶ The obstruction to getting to polylogarithmic overheads in Madry’s scheme is the need to recurse on several ultrasparsifiers to keep the approximation from blowing up during the recursion.

⁶ The $\epsilon$ is subconstant, roughly $1/\sqrt{\log n}$ , which means an $m^{\epsilon}$ factor is larger than any polylogarithmic factor.

In another exciting development, Sherman [36] used Madry’s [27] structure to produce an almost-linear time algorithm for $1+\epsilon$ -approximate undirected maximum flow. This is remarkable in that he reduced Madry’s approximation factor of $O(n^{\epsilon})$ to $(1+\epsilon)$ . The approximation factor in Madry’s construction translates into a running time overhead, which results in the almost-linear running time we mentioned. We point out that there is a close relationship to the work of Spielman and Teng [38], who used ultrasparsifiers to spectrally approximate a Laplacian system to provide algorithms whose running times depend on the approximation quality while obtaining arbitrarily precise solutions. In that case, the dependence of running time on precision was logarithmic, whereas Sherman’s algorithm suffers an $O(1/\epsilon^{2})$ dependence on the error $\epsilon$ . To be sure, in solving linear systems the idea of using an “approximate” graph to improve condition numbers was known as preconditioning, but for maximum flow or $\ell_{1}$ optimization this was a striking development.

Sherman’s method can be seen as using the multiplicative weights framework (see [3]) to route flow across cuts so that every cut has near-zero residual demand across it, in particular, a $\delta=\epsilon/\alpha$ fraction of its capacity. The congestion-approximator then ensures that the remaining flow can be routed with $\alpha\cdot\delta=\epsilon$ congestion, and one can recurse with slightly more than $\epsilon$ extra congestion.

We note that Sherman’s result (with Madry’s construction) bypasses longstanding barriers to faster maximum flow algorithms. Since the work of Dinitz [11] in 1973, algorithms had to pay either for path length or for number of paths. Dinitz himself traded this off to get an $O(m^{3/2})$ flow algorithm [9, 23]. Using linear time solvers to do electrical flow, one could eke out a few more paths simultaneously, but one still was pretty stuck at $O(m^{4/3})$ . The congestion-approximator and the multiplicative weights optimization method bypass these obstructions.

In 2014, Räcke, Shah, and Taubig [33] made progress on Räcke’s decomposition based approach. In particular, they showed how to produce a decomposition with approximation parameter of $O(\log^{4}n)$ using maximum flow computations of total size $O(m\log^{5}n)$ . They use the same frame as Räcke [31], but use single commodity flows in the context of the cut-matching game [19, 28] which, as previously mentioned, can replace multicommodity flows in finding sparse cuts and routings.

Note that a congestion-approximator can be used to compute flows efficiently, and flows can in turn be used to compute congestion-approximators. This suggests recursion, but the construction in [33] (and indeed previous ones) was very much top-down. That is, the first level of the decomposition (which itself is a tour de force) falls victim to the fact that typical paths in the flows that find it can be long and somewhat abundant. Thus, one critically needs something like a congestion-approximator right away to even compute the top level of the decomposition.

Still, Peng [30] was able to get to a nearly-linear running time recursive method by using ultrasparsifiers and Sherman’s approximate maximum flow algorithm. He constructs an ultrasparsifier of size $O(m/\log^{c}m)$ with polylogarithmic approximation factor, recursively computes a cut sparsifier for the resulting graph, and then argues that the combination of Sherman’s algorithm with both the ultrasparsifier and the congestion-approximator can be used to compute an approximate maximum flow in $O(m\log^{c}m)$ time. Again, the key here is that Sherman [36] allows him to use polylogarithmic overhead to combat the polylogarithmic approximation in both the ultrasparsifier and in the congestion-approximator of the ultrasparsifier. Still, the recursion is a bear, resulting in an admittedly unoptimized runtime of $O(m\log^{41}n)$ .

As noted previously, our method is bottom-up from the start and avoids the costly recursion required above. We have not computed the exponent of the logarithmic factors due to, for example, using the fair cuts method of [25] as a black box. Still, the clustering approach is more natural, and we expect it to be a useful frame.

We proceed with a technical overview of our result and methods.

2 Technical Overview

Our main conceptual and technical contribution is a novel congestion approximator that is constructed bottom-up in a hierarchical fashion. We start with an informal construction in the theorem below, which is later formalized in Theorem 4.1. See Figure 1 for an illustration.

Figure 1: On the left, partitions $\mathcal{P}_{i}$ (solid) and $\mathcal{P}_{i+1}$ (dotted) are shown. In the middle, the marked edges for each cluster mix simultaneously in $G$ (property (2)). On the right, a flow from the inter-cluster edges of $\mathcal{P}_{i+1}$ to the inter-cluster edges of $\mathcal{P}_{i}$ is displayed (property (3)); assuming edges are unit-weight, each flow path carries a half-unit of flow.

Theorem 2.1 (Informal Theorem 4.1).

Consider a capacitated graph $G=(V,E)$ . Suppose there exist partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ of $V$ such that

1. $\mathcal{P}_{1}$ is the partition $\{\{v\}:v\in V\}$ of singleton clusters, and $\mathcal{P}_{L}$ is the partition $\{V\}$ with a single cluster.

2. For each $i\in[L-1]$ , for each cluster $C\in\mathcal{P}_{i+1}$ , the inter-cluster edges of $\mathcal{P}_{i}$ internal to $C$ , together with the boundary edges of $C$ , mix⁷ in the graph $G$ . Moreover, the mixings over all clusters $C\in\mathcal{P}_{i+1}$ have congestion $\alpha$ simultaneously.

3. For each $i\in[L-1]$ , there is a flow in $G$ with congestion $\beta$ such that each inter-cluster edge in $\mathcal{P}_{i+1}$ sends its capacity in flow and each inter-cluster edge in $\mathcal{P}_{i}$ receives at most half its capacity in flow.

⁷ Informally, a set of edges mixes if there is a low-congestion multi-commodity flow between the set of edges whose demand pairs form an expander. In other words, there is an expander flow (in the [5] sense) between the set of edges. Our formal definition is in the preliminaries and only considers single-commodity flows.

For each $i\in[L]$ , let partition $\mathcal{R}_{\geq i}$ be the common refinement of partitions $\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{L}$ , i.e.,

$$\mathcal{R}_{\geq i}=\{C_{i}\cap C_{i+1}\cap\cdots\cap C_{L}:C_{i}\in\mathcal{P}_{i},\,C_{i+1}\in\mathcal{P}_{i+1},\ldots,\,C_{L}\in\mathcal{P}_{L},\,C_{i}\cap\cdots\cap C_{L}\neq\emptyset\}.$$

Then, their union $\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i}$ is a congestion-approximator with quality $5L^{2}\alpha\beta$ .

To understand the construction, first consider the case $i=1$ . By property (1), $\mathcal{P}_{1}$ is the partition of singleton clusters. Since the inter-cluster edges of $\mathcal{P}_{1}$ are precisely all the edges, property (2) says that for each cluster $C\in\mathcal{P}_{2}$ , the edges internal to $C$ , together with the boundary edges of $C$ , “mix” in the graph $G$ . In other words, the set of edges with at least one endpoint in $C$ mixes in $G$ . In the extreme case $C=V$ , the entire edge set mixes in $G$ , so the graph $G$ is an expander.⁸ In general, we can informally say that each cluster $C$ is a sort of weak-expander, and the partition $\mathcal{P}_{2}$ is a weak-expander decomposition of graph $G$ .⁹

⁸ We do not define expanders in this paper since we do not need their precise definition. The connection to expanders is only stated as motivation for readers familiar with the concept.

⁹ A key difference (from an actual expander decomposition) is that the (routing to certify the) mixing of each cluster is not required to be fully inside its induced subgraph, although the full graph $G$ still needs to have the capacity to support the mixing of all the clusters simultaneously.

Now consider property (3), which establishes a flow starting from the inter-cluster edges of $\mathcal{P}_{2}$ such that each edge in $G$ receives at most half its capacity in flow. We claim that this statement is very natural and follows almost immediately from property (2) with one mild assumption: for each cluster in $\mathcal{P}_{2}$ , the total capacity of boundary edges is much smaller than the total capacity of internal edges.¹⁰ Indeed, from the mixing of the internal and boundary edges of $C$ , we can spread a flow from the boundary edges of $C$ such that each internal edge receives flow proportional to its capacity. As long as the boundary edges have small total capacity, the total flow source is also small, so each internal edge receives a small amount of flow relative to its capacity. Finally, since the clusters mix simultaneously, we can compose the corresponding flows for each cluster and still ensure small congestion.

¹⁰ From the perspective of expander decompositions, this assumption is very natural. Expander decompositions require that the total capacity of inter-cluster edges is small relative to the total capacity of all edges. We are simply extending this property to hold for each cluster, not just globally over all clusters.

Now consider a general level $i\geq 2$ . Recall from property (2) that for each cluster $C\in\mathcal{P}_{i+1}$ , the inter-cluster edges of $\mathcal{P}_{i}$ inside $C$ , together with the boundary edges of $C$ , “mix” in the graph $G$ (see Figure 1, middle). We can interpret this statement (again) as a weak-expander decomposition where expansion is measured with respect to a subset of edges, namely the inter-cluster edges of partition $\mathcal{P}_{i}$ together with the boundary edges of a cluster in $\mathcal{P}_{i+1}$ . Property (3) says that we can spread flow from the inter-cluster edges of $\mathcal{P}_{i+1}$ to the inter-cluster edges of $\mathcal{P}_{i}$ such that each inter-cluster edge of $\mathcal{P}_{i}$ receives a small amount of flow (see Figure 1, right). Once again, we can show that it follows from property (2) with the following mild assumption: for each cluster in $\mathcal{P}_{i+1}$ , the total capacity of boundary edges is much smaller than the total capacity of inter-cluster edges of $\mathcal{P}_{i}$ that are internal to this cluster.

Overall, the partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ can be viewed as hierarchical expander decompositions where each partition $\mathcal{P}_{i+1}$ is a weak-expander decomposition relative to the inter-cluster edges of partition $\mathcal{P}_{i}$ . It is instructive to compare our partitioning to the expander hierarchy construction of [14], where clusters of the previous partition are contracted before building the next partition. While [14] can also extract a congestion-approximator from their expander hierarchy construction, their quality is $n^{o(1)}$ because they lose a multiplicative factor from contracting vertices on each level of the hierarchy. To avoid this multiplicative blow-up per level, we do not contract vertices, so our partitioning is not truly hierarchical: a cluster in partition $\mathcal{P}_{i}$ may be “cut” into many components by the next partition $\mathcal{P}_{i+1}$ (see Figure 1, left). While a hierarchical construction is not required, it is a useful property to have when analyzing the quality of the final congestion-approximator. To establish a hierarchy, our key insight is to consider the common refinement of partitions $\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{L}$ , which we name $\mathcal{R}_{\geq i}$ . The partitions $\mathcal{R}_{\geq i}$ over all $i$ are now hierarchical by construction, and we show that the union of all refinements $\mathcal{R}_{\geq i}$ is a congestion-approximator with good quality.
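To make the common refinement concrete, here is a small sketch on a hypothetical six-vertex instance. The helper names `common_refinement` and `refines` and the partitions themselves are assumptions for illustration, not part of the paper's algorithm. The partitions are not nested (the cluster $\{1,2,3\}$ of $\mathcal{P}_{2}$ is cut by $\mathcal{P}_{3}$), yet the refinements $\mathcal{R}_{\geq i}$ form a hierarchy.

```python
def common_refinement(partitions):
    """Common refinement of partitions of the same vertex set.

    Two vertices end up in the same cluster exactly when they share a
    cluster in every input partition.
    """
    # Record, for each vertex, the index of its cluster in each partition.
    signature = {}
    for partition in partitions:
        for index, cluster in enumerate(partition):
            for v in cluster:
                signature.setdefault(v, []).append(index)
    # Group vertices with identical signatures.
    groups = {}
    for v, sig in signature.items():
        groups.setdefault(tuple(sig), set()).add(v)
    return {frozenset(g) for g in groups.values()}


def refines(fine, coarse):
    """True if every cluster of `fine` lies inside some cluster of `coarse`."""
    return all(any(c <= d for d in coarse) for c in fine)


V = range(1, 7)
P = [
    [{v} for v in V],        # P_1: singleton clusters
    [{1, 2, 3}, {4, 5, 6}],  # P_2
    [{1, 2}, {3, 4, 5, 6}],  # P_3 cuts the cluster {1, 2, 3} of P_2
    [set(V)],                # P_4 = {V}
]

# R[i] plays the role of R_{>= i+1} (Python lists are 0-indexed).
R = [common_refinement(P[i:]) for i in range(len(P))]

assert not refines(P[1], P[2])  # P_2 does not refine P_3
assert all(refines(R[i], R[i + 1]) for i in range(len(R) - 1))
```

Here $\mathcal{R}_{\geq 2}$ consists of the clusters $\{1,2\}$, $\{3\}$, and $\{4,5,6\}$, which refines $\mathcal{R}_{\geq 3}=\{\{1,2\},\{3,4,5,6\}\}$ even though $\mathcal{P}_{2}$ does not refine $\mathcal{P}_{3}$.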

We emphasize that our conceptual idea of not contracting clusters and looking at common refinements is novel and may have future applications to bottom-up constructions of hierarchical objects, especially for obtaining polylogarithmic approximations, meaning that one cannot lose a multiplicative factor at each level. Given that [14] has popularized bottom-up hierarchical approaches, and that their methods so far can only obtain $n^{o(1)}$ -factors due to multiplicative errors across levels, we believe that our ideas are promising for future development in this area.

2.1 Bottom-Up Construction

As mentioned previously, the partition $\mathcal{P}_{2}$ is a weak-expander decomposition of the graph, so it can be computed in nearly-linear time using an off-the-shelf expander decomposition algorithm that avoids any black-box call to max-flow [34]. For partitions $\mathcal{P}_{3}$ onwards, expansion is measured with respect to a subset of edges, so simple expander decomposition algorithms no longer suffice. Instead, we have to resort to expander decomposition algorithms that make black-box calls to (approximate) max-flow. Naïvely, these max-flows can be computed recursively, resulting in a congestion-approximator algorithm that makes recursive calls to max-flow, similar to [33]. Our key insight is that these max-flow instances are actually well-structured enough that recursion is unnecessary. In particular, to construct partition $\mathcal{P}_{i+1}$ , the first $i$ partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{i}$ (which the algorithm has already computed) can be converted to a pseudo-congestion-approximator, which then translates to a max-flow algorithm sufficient for these well-structured instances.

3 Preliminaries

We are given an undirected, capacitated graph $G=(V,E)$ . The graph has $n$ vertices and $m$ edges, and each edge $e\in E$ has capacity $c(e)$ in the range $[1,W]$ . For a set $C\subseteq V$ , let $\partial C$ denote the set of edges with exactly one endpoint in $C$ , and let $\delta C=\sum_{e\in\partial C}c(e)$ denote the total capacity of edges in $\partial C$ . For a vertex $v\in V$ , let $\deg(v)$ denote the capacitated degree of vertex $v$ , which is also equal to $\delta\{v\}$ . We sometimes write $\partial_{G}C$ , $\delta_{G}C$ , and $\deg_{G}(v)$ to emphasize that the values are with respect to graph $G$ ; when the graph $G$ is clear from context, we may drop the subscript $G$ .

For a given edge set $F\subseteq E$ , let $\partial_{F}C$ , $\delta_{F}C$ , and $\deg_{F}(v)$ denote the corresponding values on the subgraph of $G$ with edge set $F$ . For a different graph $H$ , let $\partial_{H}C$ , $\delta_{H}C$ , and $\deg_{H}(v)$ denote the corresponding values on graph $H$ . To avoid confusion with the original graph $G$ , we never drop the subscripts $F$ and $H$ .

For a partition $\mathcal{P}$ of the vertex set $V$ , we call each part of the partition a cluster. Let $\partial\mathcal{P}$ denote the set of edges whose endpoints belong to different clusters; we can also write $\partial\mathcal{P}=\bigcup_{C\in\mathcal{P}}\partial C$ . We also define $\delta\mathcal{P}$ as the total capacity of edges in $\partial\mathcal{P}$ .
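The notation above can be sketched in a few lines. This is an illustrative sketch only: the edge-list representation, the function names, and the example graph are assumptions, and the graph is assumed to have no self-loops.

```python
def boundary(edges, C):
    """The edge set of C's boundary: edges with exactly one endpoint in C."""
    C = set(C)
    return [(u, v, c) for (u, v, c) in edges if (u in C) != (v in C)]


def delta(edges, C):
    """Total capacity of the boundary edges of C."""
    return sum(c for (_, _, c) in boundary(edges, C))


def deg(edges, v):
    """Capacitated degree of v in the graph with the given edge list."""
    return sum(c for (u, w, c) in edges if v in (u, w))


def partition_boundary(edges, partition):
    """Inter-cluster edges: endpoints lie in different clusters of the partition."""
    cluster_of = {v: i for i, cluster in enumerate(partition) for v in cluster}
    return [(u, v, c) for (u, v, c) in edges if cluster_of[u] != cluster_of[v]]


# Hypothetical capacitated 4-cycle, edges given as (u, v, capacity).
edges = [(1, 2, 2), (2, 3, 1), (3, 4, 3), (4, 1, 1)]
partition = [{1, 2}, {3, 4}]

F = partition_boundary(edges, partition)    # the inter-cluster edges
assert delta(edges, {1, 2}) == 2            # capacity of the boundary of {1, 2}
assert deg(edges, 1) == delta(edges, {1})   # deg(v) equals delta of {v}
assert deg(F, 1) == 1                       # degree restricted to edge set F
```

Passing the edge list `F` instead of `edges` gives the subscripted quantities such as $\deg_{F}(v)$, which is how the vertex weightings $\deg_{\partial\mathcal{P}_{i}}$ used later can be evaluated.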

Let $U$ be an arbitrary set of vertices. For a vector $\mathbf{x}\in\mathbb{R}^{U}$ , let $\mathbf{x}(v)$ denote its value on entry $v\in U$ . For a vertex subset $S\subseteq U$ , define $\mathbf{x}(S)=\sum_{v\in S}\mathbf{x}(v)$ , and define $\mathbf{x}|_{S}\in\mathbb{R}^{U}$ as the vector $\mathbf{x}$ restricted to $S$ , i.e., $\mathbf{x}|_{S}(v)=\mathbf{x}(v)$ for all $v\in S$ and $\mathbf{x}|_{S}(v)=0$ for all $v\notin S$ . For two vectors $\mathbf{s},\mathbf{t}\in\mathbb{R}^{U}$ , by $\mathbf{s}\leq\mathbf{t}$ we mean entry-wise inequality, i.e., $\mathbf{s}(v)\leq\mathbf{t}(v)$ for all $v\in U$ . For a vector $\mathbf{s}\in\mathbb{R}^{U}$ , let $|\mathbf{s}|\in\mathbb{R}^{U}$ be the vector with entry-wise absolute values, i.e., $|\mathbf{s}|(v)=|\mathbf{s}(v)|$ .

A demand vector is a vector $\mathbf{b}\in\mathbb{R}^{U}$ whose entries sum to $0$ , i.e., $\mathbf{b}(U)=0$ . A flow $f$ routes demand $\mathbf{b}$ if each vertex $v\in U$ receives a net flow of $\mathbf{b}(v)$ in the flow $f$ . A flow $f$ has congestion $\alpha$ if the amount of flow sent along each (undirected) edge is at most $\alpha$ times the capacity of that edge. Given a flow $f$ , a path decomposition of flow $f$ is a collection of directed, capacitated paths such that for any two vertices $u,v\in U$ connected by an edge $e$ , the amount of flow that $f$ sends from $u$ to $v$ equals the total capacity of (directed) paths that contain edge $e$ in the direction from $u$ to $v$ .
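As a sanity check of these definitions, the sketch below tests whether a given flow routes a demand and computes its congestion. The dictionary-based flow representation (one signed entry per undirected edge), the function names, and the example are assumptions made for illustration; following the definition above, $\mathbf{b}(v)$ is read as the net flow received at $v$.

```python
from fractions import Fraction


def net_received(flow, vertices):
    """Net flow received at each vertex; flow[(u, v)] is the amount sent u -> v."""
    net = {v: Fraction(0) for v in vertices}
    for (u, v), amount in flow.items():
        net[u] -= amount
        net[v] += amount
    return net


def routes(flow, demand, vertices):
    """True if every vertex v receives a net flow of exactly demand[v]."""
    net = net_received(flow, vertices)
    return all(net[v] == demand.get(v, 0) for v in vertices)


def congestion(flow, capacity):
    """Largest ratio of flow to capacity over all edges carrying flow."""
    return max(abs(amount) / capacity[frozenset(e)] for e, amount in flow.items())


# Hypothetical unit-capacity 4-cycle a-b-c-d; vertex a sends 3 units to c.
vertices = ["a", "b", "c", "d"]
capacity = {frozenset(e): 1 for e in [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c")]}
demand = {"a": -3, "c": 3}

half = Fraction(3, 2)
flow = {("a", "b"): half, ("b", "c"): half, ("a", "d"): half, ("d", "c"): half}

assert routes(flow, demand, vertices)
print(congestion(flow, capacity))  # 3/2
```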

A vertex weighting is a vector $\mathbf{d}\in\mathbb{R}^{U}_{\geq 0}$ , i.e., all entries in $\mathbf{d}$ are nonnegative. The vertex weighting $\mathbf{d}\in\mathbb{R}^{U}_{\geq 0}$ mixes in graph $G$ with congestion $\alpha$ if for any demand $\mathbf{b}\in\mathbb{R}^{U}$ satisfying $|\mathbf{b}|\leq\mathbf{d}$ , there is a flow routing $\mathbf{b}$ with congestion $\alpha$ .¹¹ A collection $\{\mathbf{d}_{1},\mathbf{d}_{2},\ldots,\mathbf{d}_{\ell}\}$ of vertex weightings mixes simultaneously with congestion $\alpha$ if for any demands $\mathbf{b}_{1},\mathbf{b}_{2},\ldots,\mathbf{b}_{\ell}$ with $|\mathbf{b}_{i}|\leq\mathbf{d}_{i}$ for each $i\in[\ell]$ , there is a flow routing demand $\mathbf{b}_{1}+\mathbf{b}_{2}+\cdots+\mathbf{b}_{\ell}$ with congestion $\alpha$ .

¹¹ There is a close connection between the concept of mixing and expander graphs, though we do not need the definition of expanders in this paper.

Throughout the paper, we view vectors in $\mathbb{R}^{U}$ and functions from $U$ to $\mathbb{R}$ as interchangeable. For example, we sometimes treat the degree function $\deg:V\to\mathbb{R}_{\geq 0}$ as a vector $\deg\in\mathbb{R}^{V}_{\geq 0}$ ; in particular, $\deg$ is a vertex weighting.

Congestion-approximators and approximate flow.

Given a graph $H=(U,F)$ , a congestion-approximator $\mathcal{C}$ of quality $\alpha$ is a collection of subsets of $U$ such that for any demand $\mathbf{b}$ satisfying $|\mathbf{b}(C)|\leq\delta_{H}C$ for all $C\in\mathcal{C}$ , there is a flow routing demand $\mathbf{b}$ with congestion $\alpha$ . Through Sherman’s framework [36, 37], a congestion-approximator of quality $\alpha$ translates to a $(1+\epsilon)$ -approximate maximum flow algorithm with running time $\tilde{O}(\epsilon^{-1}\alpha m)$ .¹² In our paper, it is most convenient to work with a stronger variant of approximate min-cut/max-flow called fair cut/flow [25], which is formally defined in Section 5.2.

¹² The $\tilde{O}(\cdot)$ notation hides polylogarithmic factors in $n$ .
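The formal definition recovers the two-sided estimate described informally in the introduction. The short derivation below is a sketch: the symbols $\eta(\mathbf{b})$ and $\mathrm{opt}(\mathbf{b})$ are notation introduced here for illustration, and it assumes $\eta(\mathbf{b})>0$ and that the maximum ranges only over sets $C\in\mathcal{C}$ with $\delta_{H}C>0$.

```latex
% Notation introduced for this sketch (not from the paper):
%   \eta(b) : largest demand-to-capacity ratio over the cuts in \mathcal{C}
%   opt(b)  : minimum congestion of any flow in H routing b
\[
  \eta(\mathbf{b}) \;=\; \max_{C\in\mathcal{C}} \frac{|\mathbf{b}(C)|}{\delta_{H}C},
  \qquad
  \eta(\mathbf{b}) \;\le\; \mathrm{opt}(\mathbf{b}) \;\le\; \alpha\cdot\eta(\mathbf{b}).
\]
% Lower bound: any flow routing b sends at least |b(C)| units across
% \partial_H C, whose total capacity is \delta_H C, so some edge of
% \partial_H C has congestion at least |b(C)| / \delta_H C.
%
% Upper bound: the scaled demand b / \eta(b) satisfies
% |b(C)| / \eta(b) <= \delta_H C for every C in \mathcal{C}, so by the
% definition it is routable with congestion \alpha; scaling that flow
% by \eta(b) routes b with congestion \alpha \cdot \eta(b).
```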

4 Bottom-Up Congestion-Approximator

In this section, we formally state our congestion-approximator construction and prove its quality guarantee. See Figure 1 for an illustration. The most important distinction is that we define and analyze mixing on vertex weightings, not edges. This simplifies the notation since we can avoid working with the subdivision graph in [33]. For example, in property (2) below, the mixing of the vertex weighting $\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}$ is conceptually equivalent to the mixing of the edges of $\partial\mathcal{P}_{i}$ internal to $C$ together with the boundary edges $\partial C$ .

Theorem 4.1.

Consider a capacitated graph $G=(V,E)$ , and let $\alpha\geq 1$ and $\beta\geq 1$ be parameters. Suppose there exist partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ of $V$ such that

1. $\mathcal{P}_{1}$ is the partition $\{\{v\}:v\in V\}$ of singleton clusters, and $\mathcal{P}_{L}$ is the partition $\{V\}$ with a single cluster.

2. For each $i\in[L-1]$ , the collection of vertex weightings $\{\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{i+1}\}$ mixes simultaneously in $G$ with congestion $\alpha$ .

3. For each $i\in[L-1]$ , there is a flow in $G$ with congestion $\beta$ such that each vertex $v\in V$ sends $\deg_{\partial\mathcal{P}_{i+1}}(v)$ flow and receives at most $\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(v)$ flow.

For each $i\in[L]$ , let partition $\mathcal{R}_{\geq i}$ be the common refinement of partitions $\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{L}$ , i.e.,

$$\mathcal{R}_{\geq i}=\{C_{i}\cap C_{i+1}\cap\cdots\cap C_{L}:C_{i}\in\mathcal{P}_{i},\,C_{i+1}\in\mathcal{P}_{i+1},\ldots,\,C_{L}\in\mathcal{P}_{L},\,C_{i}\cap\cdots\cap C_{L}\neq\emptyset\}.$$

Then, their union $\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i}$ is a congestion-approximator with quality $5L^{2}\alpha\beta$ .
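The family $\mathcal{C}$ in the theorem is fully determined by the partitions, so it can be computed directly. The following is a minimal Python sketch, assuming each partition is given as a list of vertex sets; the function names `common_refinement` and `congestion_approximator_sets` are ours and purely illustrative. It groups vertices by the tuple of clusters they occupy in $\mathcal{P}_{i},\ldots,\mathcal{P}_{L}$ , which yields exactly the nonempty intersections in the definition of $\mathcal{R}_{\geq i}$ .

```python
def common_refinement(partitions, vertices):
    """Nonempty intersections C_i ∩ ... ∩ C_L, one cluster taken from each partition."""
    # For each partition, map every vertex to the index of its cluster.
    maps = []
    for partition in partitions:
        index = {}
        for k, cluster in enumerate(partition):
            for v in cluster:
                index[v] = k
        maps.append(index)
    # Two vertices share a refined set iff they agree in every partition.
    groups = {}
    for v in vertices:
        key = tuple(index[v] for index in maps)
        groups.setdefault(key, set()).add(v)
    return {frozenset(group) for group in groups.values()}


def congestion_approximator_sets(partitions, vertices):
    """Union of R_{>=i} over all levels i (0-indexed here, 1-indexed in the text)."""
    family = set()
    for i in range(len(partitions)):
        family |= common_refinement(partitions[i:], vertices)
    return family
```

When the partitions happen to be nested (each $\mathcal{P}_{i}$ refines $\mathcal{P}_{i+1}$ ), the sketch simply returns all clusters of all levels; the intersections only matter when the partitions are not nested.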

In Section 5, we develop an efficient algorithm to construct partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ with $L=O(\log n)$ . This algorithm builds the partitions iteratively in the order $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ . For technical reasons explained in Section 5, we require the following analogue of Theorem 4.1 where $\mathcal{P}_{L}$ is not necessarily the partition $\{V\}$ . Note that assumptions (2) and (3) remain unchanged below. The key difference is that $\mathcal{C}$ is no longer a congestion-approximator, but a pseudo-congestion-approximator whose precise guarantee is stated below.

Lemma 4.2.

Consider a capacitated graph $G=(V,E)$ , and let $\alpha\geq 1$ and $\beta\geq 1$ be parameters. Suppose there exist partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ such that

1.

$\mathcal{P}_{1}$ is the partition $\{\{v\}:v\in V\}$ of singleton clusters.
2.

For each $i\in[L-1]$ , the collection of vertex weightings $\{\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{i+1}\}$ mixes simultaneously in $G$ with congestion $\alpha$ .
3.

For each $i\in[L-1]$ , there is a flow in $G$ with congestion $\beta$ such that each vertex $v\in V$ sends $\deg_{\partial\mathcal{P}_{i+1}}(v)$ flow and receives at most $\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(v)$ flow.

For each $i\in[L]$ , let partition $\mathcal{R}_{\geq i}$ be the common refinement of partitions $\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{L}$ , i.e.,

\mathcal{R}_{\geq i}=\{C_{i}\cap C_{i+1}\cap\cdots\cap C_{L}:C_{i}\in\mathcal{P}_{i},\,C_{i+1}\in\mathcal{P}_{i+1},\ldots,\,C_{L}\in\mathcal{P}_{L},\,C_{i}\cap\cdots\cap C_{L}\neq\emptyset\}.

Consider their union $\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i}$ . Then, for any demand $\mathbf{b}\in\mathbb{R}^{V}$ satisfying $|\mathbf{b}(C)|\leq\delta C$ for all $C\in\mathcal{C}$ , there exists a demand $\mathbf{b}^{\prime}\in\mathbb{R}^{V}$ satisfying $|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}}$ and a flow routing demand $\mathbf{b}-\mathbf{b}^{\prime}$ with congestion $5L^{2}\alpha\beta$ .

Instead of proving Theorem 4.1 directly, we prove Lemma 4.2 which is needed for the algorithm. Before we do so, we first establish that Lemma 4.2 indeed implies Theorem 4.1.

Proof (Lemma 4.2 $\implies$ Theorem 4.1).

Consider partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ that satisfy the assumptions of Theorem 4.1. For a given demand $\mathbf{b}\in\mathbb{R}^{V}$ satisfying $|\mathbf{b}(C)|\leq\delta C$ for all $C\in\mathcal{C}$ , we want to establish a flow routing demand $\mathbf{b}$ with congestion $5L^{2}\alpha\beta$ . Theorem 4.1 then follows from the definition of congestion-approximator.

Apply Lemma 4.2 to the partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ and the demand $\mathbf{b}$ . We obtain a demand $\mathbf{b}^{\prime}\in\mathbb{R}^{V}$ satisfying $|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}}$ and a flow $f$ routing demand $\mathbf{b}-\mathbf{b}^{\prime}$ with congestion $5L^{2}\alpha\beta$ . By assumption (1) of Theorem 4.1, we have $\mathcal{P}_{L}=\{V\}$ , which implies that $\partial\mathcal{P}_{L}=\emptyset$ . Since $|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}}=\mathbf{0}$ , we must have $\mathbf{b}^{\prime}=\mathbf{0}$ . It follows that the flow $f$ routes demand $\mathbf{b}$ with congestion $5L^{2}\alpha\beta$ , finishing the proof. ∎

For the rest of this section, we prove Lemma 4.2. We begin with two helper claims that establish structure on the sets $\mathcal{R}_{\geq i}$ .

Claim 4.3.

For all $i,j\in[L]$ with $i<j$ , the partition $\mathcal{R}_{\geq i}$ of $V$ is a refinement of the partition $\mathcal{R}_{\geq j}$ , in the sense that each set in $\mathcal{R}_{\geq j}$ is the disjoint union of some sets in $\mathcal{R}_{\geq i}$ . In particular, $\partial\mathcal{R}_{\geq i}\supseteq\partial\mathcal{R}_{\geq j}$ .

Proof.

Consider a set $C=C_{j}\cap C_{j+1}\cap\cdots\cap C_{L}\in\mathcal{R}_{\geq j}$ for some $C_{j}\in\mathcal{P}_{j},C_{j+1}\in\mathcal{P}_{j+1},\ldots,C_{L}\in\mathcal{P}_{L}$ . Since $\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{j-1}$ are all partitions of $V$ , the set $C$ is the disjoint union of all nonempty sets of the form $C_{i}\cap C_{i+1}\cap\cdots\cap C_{j-1}\cap C\in\mathcal{R}_{\geq i}$ for $C_{i}\in\mathcal{P}_{i},C_{i+1}\in\mathcal{P}_{i+1},\ldots,C_{j-1}\in\mathcal{P}_{j-1}$ . Therefore, $\mathcal{R}_{\geq i}$ is a refinement of $\mathcal{R}_{\geq j}$ , and since refinements can only increase the boundary set, the second statement $\partial\mathcal{R}_{\geq i}\supseteq\partial\mathcal{R}_{\geq j}$ follows. ∎

Claim 4.4.

For all $i\in[L-1]$ , we have $\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}\subseteq\partial\mathcal{P}_{i}$ .

Proof.

Consider an edge $(u,v)\in\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}$ . Since $(u,v)\notin\partial\mathcal{R}_{\geq i+1}$ , there is a set $C\in\mathcal{R}_{\geq i+1}$ containing both vertices $u$ and $v$ . As in the proof of Claim 4.3, write $C=C_{i+1}\cap C_{i+2}\cap\cdots\cap C_{L}\in\mathcal{R}_{\geq i+1}$ for some $C_{i+1}\in\mathcal{P}_{i+1},C_{i+2}\in\mathcal{P}_{i+2},\ldots,C_{L}\in\mathcal{P}_{L}$ . The set $C$ is the disjoint union of all nonempty sets of the form $C_{i}\cap C\in\mathcal{R}_{\geq i}$ for some $C_{i}\in\mathcal{P}_{i}$ . Since $u,v\in C$ , both $u$ and $v$ belong to sets of this form. Since $(u,v)\in\partial\mathcal{R}_{\geq i}$ , the sets containing $u$ and $v$ must be different. They can only differ in the set $C_{i}\in\mathcal{P}_{i}$ , so $u$ and $v$ belong to different sets in $\mathcal{P}_{i}$ , and we obtain $(u,v)\in\partial\mathcal{P}_{i}$ as promised. ∎
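Claims 4.3 and 4.4 are easy to sanity-check on small instances. The sketch below is illustrative only: it assumes an unweighted edge list and partitions given as lists of vertex sets, and the helper names (`cluster_of`, `boundary`, `refine`, `check_claims`) are ours. It computes $\partial\mathcal{R}_{\geq i}$ for every level and tests both the nesting $\partial\mathcal{R}_{\geq i}\supseteq\partial\mathcal{R}_{\geq i+1}$ and the containment $\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}\subseteq\partial\mathcal{P}_{i}$ .

```python
def cluster_of(partition):
    """Map each vertex to the index of the cluster containing it."""
    return {v: k for k, cluster in enumerate(partition) for v in cluster}


def boundary(partition, edges):
    """Edges whose endpoints lie in different clusters of the partition."""
    where = cluster_of(partition)
    return {(u, v) for (u, v) in edges if where[u] != where[v]}


def refine(partitions, vertices):
    """Common refinement of a list of partitions of the same vertex set."""
    maps = [cluster_of(p) for p in partitions]
    groups = {}
    for v in vertices:
        groups.setdefault(tuple(m[v] for m in maps), set()).add(v)
    return [frozenset(g) for g in groups.values()]


def check_claims(partitions, vertices, edges):
    """Return (Claim 4.3 holds, Claim 4.4 holds) for the given instance."""
    L = len(partitions)
    dR = [boundary(refine(partitions[i:], vertices), edges) for i in range(L)]
    dP = [boundary(p, edges) for p in partitions]
    nested = all(dR[i] >= dR[i + 1] for i in range(L - 1))
    contained = all(dR[i] - dR[i + 1] <= dP[i] for i in range(L - 1))
    return nested, contained
```

For example, on the path $0-1-2-3$ with partitions $\{\{0\},\{1\},\{2\},\{3\}\}$ , $\{\{0,1\},\{2,3\}\}$ , $\{\{0\},\{1,2,3\}\}$ , the refined boundaries grow from one edge at the top level to all three edges at the bottom level, and each newly cut edge is a boundary edge of the corresponding $\mathcal{P}_{i}$ .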

Let $\mathbf{b}\in\mathbb{R}^{V}$ be a flow demand satisfying $|\mathbf{b}(C)|\leq\delta C$ for all $C\in\mathcal{C}$ . We need to construct a demand $\mathbf{b}^{\prime}\in\mathbb{R}^{V}$ satisfying $|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}}$ and a flow routing demand $\mathbf{b}-\mathbf{b}^{\prime}$ with congestion $5L^{2}\alpha\beta$ .

The construction of the flow has $L-1$ iterations. On iteration $i\in[L-1]$ , we construct a flow $f_{i}$ and a demand $\mathbf{b}_{i}$ such that

1.

The flow $f_{i}$ routes demand $\mathbf{b}_{i-1}-\mathbf{b}_{i}$ , where we initialize $\mathbf{b}_{0}=\mathbf{b}$ on iteration $i=1$ .
2.

The flow $f_{i}$ has congestion $5L\alpha\beta$ .
3.

For all $C\in\mathcal{R}_{\geq i+1}$ , we have $(\mathbf{b}_{i-1}-\mathbf{b}_{i})(C)=0$ .
4.

The demand $\mathbf{b}_{i}$ satisfies $|\mathbf{b}_{i}|\leq(i+1)\deg_{\partial\mathcal{R}_{\geq i+1}}$ .

Note that for the last iteration $i=L-1$ , we have $\partial\mathcal{R}_{\geq L}=\partial\mathcal{P}_{L}$ by definition. The lemma below shows that properties (1), (2), and (4) alone are sufficient to prove Lemma 4.2 with demand $\mathbf{b}^{\prime}=\mathbf{b}_{L-1}$ and flow $f_{1}+f_{2}+\cdots+f_{L-1}$ .

Lemma 4.5.

Suppose that properties (1), (2), and (4) hold for each $i\in[L-1]$ . Then, the demand $\mathbf{b}_{L-1}$ satisfies $|\mathbf{b}_{L-1}|\leq L\deg_{\partial\mathcal{P}_{L}}$ , and the flow $f_{1}+f_{2}+\cdots+f_{L-1}$ routes demand $\mathbf{b}-\mathbf{b}_{L-1}$ with congestion $5L^{2}\alpha\beta$ .

Proof.

The demand $\mathbf{b}_{L-1}$ satisfies $|\mathbf{b}_{L-1}|\leq L\deg_{\partial\mathcal{R}_{\geq L}}=L\deg_{\partial\mathcal{P}_{L}}$ by property (4) on iteration $i=L-1$ . By property (1) over all $i\in[L-1]$ , the flow $f_{1}+f_{2}+\cdots+f_{L-1}$ routes demand $(\mathbf{b}_{0}-\mathbf{b}_{1})+(\mathbf{b}_{1}-\mathbf{b}_{2})+\cdots+(\mathbf{b}_{L-2}-\mathbf{b}_{L-1})=\mathbf{b}-\mathbf{b}_{L-1}$ . The congestion is at most the sum of the congestions of each $f_{i}$ , which is $(L-1)\cdot 5L\alpha\beta\leq 5L^{2}\alpha\beta$ by property (2). ∎

In order to establish the conditions above, we will use the following technical lemma:

Lemma 4.6.

Consider an iteration $i\in[L-1]$ and a vector $\mathbf{s}\in\mathbb{R}^{V}$ such that

(a)

$|\mathbf{s}|\leq i\deg_{\partial\mathcal{P}_{i}}$ .
(b)

$|\mathbf{s}(C)|\leq\delta C$ for all $C\in\mathcal{R}_{\geq i+1}$ .

Then, we can construct a flow $f$ such that

(i)

Flow $f$ routes demand $\mathbf{s}-\mathbf{t}$ for a vector $\mathbf{t}\in\mathbb{R}^{V}$ with $|\mathbf{t}|\leq\deg_{\partial\mathcal{R}_{\geq i+1}}$ .
(ii)

The flow $f$ has congestion $5L\alpha\beta$ .
(iii)

For all $C\in\mathcal{R}_{\geq i+1}$ , we have $(\mathbf{s}-\mathbf{t})(C)=0$ .

Before we prove this lemma, we first establish that it implies properties (1) to (4) above for appropriate $f_{i}$ and $\mathbf{b}_{i}$ .

Lemma 4.7.

Assuming Lemma 4.6, we can construct $f_{i}$ and $\mathbf{b}_{i}$ satisfying properties (1) to (4) for each $i\in[L-1]$ .

Proof.

We induct on $i\in[L-1]$ , where the base case is just property (4) for $i=0$ (and $\mathbf{b}_{0}=\mathbf{b}$ ). For this base case, since the singleton sets $\{v\}$ are in $\mathcal{P}_{1}$ , they are also in $\mathcal{C}$ , so $|\mathbf{b}(\{v\})|\leq\deg(v)$ for all $v\in V$ , which implies $|\mathbf{b}_{0}(v)|=|\mathbf{b}(v)|\leq\deg(v)=\deg_{\partial\mathcal{R}_{\geq 1}}(v)$ , as desired.

For the inductive step, we apply Lemma 4.6 on iteration $i\geq 1$ and the vector $\mathbf{s}\in\mathbb{R}^{V}$ with

\mathbf{s}(v)=\begin{cases}\displaystyle\left(1-\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\right)\mathbf{b}_{i-1}(v)&\text{if }\deg_{\partial\mathcal{R}_{\geq i}}(v)>0,\\ 0&\text{otherwise}.\end{cases}

We first verify the conditions on $\mathbf{s}$ required by Lemma 4.6.

(a)

To establish condition a, fix a vertex $v\in V$ ; our goal is to show that $|\mathbf{s}(v)|\leq i\deg_{\partial\mathcal{P}_{i}}(v)$ . If $\deg_{\partial\mathcal{R}_{\geq i}}(v)=0$ , then $\mathbf{s}(v)=0$ and the claim is trivial. Otherwise, suppose that $\deg_{\partial\mathcal{R}_{\geq i}}(v)>0$ . We can apply property (4) for iteration $i-1$ to obtain

\begin{aligned}
|\mathbf{s}(v)|&=\left(1-\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\right)|\mathbf{b}_{i-1}(v)|\\
&\leq\left(1-\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\right)\cdot i\deg_{\partial\mathcal{R}_{\geq i}}(v)\\
&=(\deg_{\partial\mathcal{R}_{\geq i}}(v)-\deg_{\partial\mathcal{R}_{\geq i+1}}(v))\cdot i,
\end{aligned}

which equals $i\deg_{\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}}(v)$ since $\partial\mathcal{R}_{\geq i}\supseteq\partial\mathcal{R}_{\geq i+1}$ by Claim 4.3. Since $\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}\subseteq\partial\mathcal{P}_{i}$ by Claim 4.4, we have $\deg_{\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}}(v)\leq\deg_{\partial\mathcal{P}_{i}}(v)$ , establishing condition a.

(b)

To establish condition b, fix a set $C\in\mathcal{R}_{\geq i+1}$ . We first prove that $\mathbf{b}_{0}(C)=\mathbf{b}_{i-1}(C)$ . This is trivial for $i=1$ , so assume that $i>1$ . For a given $j\in[i-1]$ , the set $C$ is a disjoint union of some sets $C_{1},\ldots,C_{\ell}\in\mathcal{R}_{\geq j+1}$ by Claim 4.3. Apply property (3) for iteration $j$ to obtain $(\mathbf{b}_{j-1}-\mathbf{b}_{j})(C_{k})=0$ for all $k\in[\ell]$ . Summing over all $k\in[\ell]$ gives $(\mathbf{b}_{j-1}-\mathbf{b}_{j})(C)=\sum_{k\in[\ell]}(\mathbf{b}_{j-1}-\mathbf{b}_{j})(C_{k})=0$ , so $\mathbf{b}_{j-1}(C)=\mathbf{b}_{j}(C)$ . Over all iterations $j\in[i-1]$ , we obtain $\mathbf{b}_{0}(C)=\mathbf{b}_{1}(C)=\cdots=\mathbf{b}_{i-1}(C)$ .

By definition of vector $\mathbf{s}$ , we have $|\mathbf{s}(C)|\leq|\mathbf{b}_{i-1}(C)|$ , which equals $|\mathbf{b}_{0}(C)|$ from above. Finally, since the initial flow demand $\mathbf{b}\in\mathbb{R}^{V}$ satisfies $|\mathbf{b}(C)|\leq\delta C$ , we have $|\mathbf{s}(C)|\leq|\mathbf{b}_{0}(C)|=|\mathbf{b}(C)|\leq\delta C$ , establishing condition b.

With the conditions fulfilled, Lemma 4.6 outputs a flow $f$ which we set as $f_{i}$ , immediately satisfying property (2). We set $\mathbf{b}_{i}=\mathbf{b}_{i-1}-\mathbf{s}+\mathbf{t}$ so that flow $f_{i}$ routes demand $\mathbf{s}-\mathbf{t}=\mathbf{b}_{i-1}-\mathbf{b}_{i}$ and property (1) holds. Property (3) follows from property iii of Lemma 4.6.

It remains to establish property (4). By property (4) for iteration $i-1$ , we have $|\mathbf{b}_{i-1}(v)|\leq i\deg_{\partial\mathcal{R}_{\geq i}}(v)$ for each vertex $v\in V$ . If $\deg_{\partial\mathcal{R}_{\geq i}}(v)=0$ , then $\mathbf{b}_{i-1}(v)=0$ and

|\mathbf{b}_{i}(v)|=|\mathbf{b}_{i-1}(v)-\mathbf{s}(v)+\mathbf{t}(v)|=|0-0+\mathbf{t}(v)|=|\mathbf{t}(v)|.

Otherwise, if $\deg_{\partial\mathcal{R}_{\geq i}}(v)>0$ , then

\begin{aligned}
|\mathbf{b}_{i}(v)|&=|\mathbf{b}_{i-1}(v)-\mathbf{s}(v)+\mathbf{t}(v)|\\
&=\left|\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\mathbf{b}_{i-1}(v)+\mathbf{t}(v)\right|\\
&\leq\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}|\mathbf{b}_{i-1}(v)|+|\mathbf{t}(v)|\\
&\leq\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\cdot i\deg_{\partial\mathcal{R}_{\geq i}}(v)+|\mathbf{t}(v)|\\
&=i\cdot\deg_{\partial\mathcal{R}_{\geq i+1}}(v)+|\mathbf{t}(v)|.
\end{aligned}

In both cases, we obtain $|\mathbf{b}_{i}(v)|\leq i\cdot\deg_{\partial\mathcal{R}_{\geq i+1}}(v)+|\mathbf{t}(v)|$ . We also have $|\mathbf{t}(v)|\leq\deg_{\partial\mathcal{R}_{\geq i+1}}(v)$ by property i of Lemma 4.6, so property (4) holds on iteration $i$ , completing the induction. ∎
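The inductive step of Lemma 4.7 is a per-vertex computation, which the following Python sketch makes concrete. It is only an illustration: the function names are ours, degrees are assumed to be integers so that `Fraction` gives exact arithmetic, and the vector $\mathbf{t}$ is taken as an input since in the proof it is supplied by Lemma 4.6.

```python
from fractions import Fraction


def split_demand(b_prev, deg_cur, deg_next):
    """s(v) = (1 - deg_next(v)/deg_cur(v)) * b_prev(v), and 0 where deg_cur(v) = 0.

    deg_cur plays the role of deg_{∂R_{>=i}} and deg_next that of deg_{∂R_{>=i+1}}.
    """
    s = {}
    for v, value in b_prev.items():
        if deg_cur[v] > 0:
            s[v] = (1 - Fraction(deg_next[v], deg_cur[v])) * value
        else:
            s[v] = Fraction(0)
    return s


def next_demand(b_prev, s, t):
    """b_i = b_{i-1} - s + t, so a flow routing s - t routes b_{i-1} - b_i."""
    return {v: b_prev[v] - s[v] + t[v] for v in b_prev}
```

With $i=1$ , $\mathbf{b}_{0}=(3,-2,0)$ , $\deg_{\partial\mathcal{R}_{\geq 1}}=(3,2,0)$ , $\deg_{\partial\mathcal{R}_{\geq 2}}=(1,2,0)$ and $\mathbf{t}=(1,-2,0)$ , the sketch gives $\mathbf{s}=(2,0,0)$ and $\mathbf{b}_{1}=(2,-4,0)$ , which meets the bound $|\mathbf{b}_{1}|\leq 2\deg_{\partial\mathcal{R}_{\geq 2}}$ of property (4) with equality on the first two vertices.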

For the rest of this section, we establish Lemma 4.6, the most technical part of the proof. Before we do so, we first establish three technical helper claims.

Claim 4.8.

For any vector $\mathbf{s}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{s}\leq\deg_{\partial\mathcal{P}_{i+1}}$ , there is a vector $\mathbf{t}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{t}\leq\mathrm{deg}_{\partial\mathcal{P}_{i}}/2$ and a flow routing demand $\mathbf{s}-\mathbf{t}$ with congestion $\beta$ .

Proof.

By assumption (3) of Lemma 4.2, there is a flow in $G$ with congestion $\beta$ such that each vertex $v\in V$ sends $\deg_{\partial\mathcal{P}_{i+1}}(v)$ flow and receives at most $\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(v)$ flow. In other words, there is a vector $\mathbf{t}_{0}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{t}_{0}\leq\mathrm{deg}_{\partial\mathcal{P}_{i}}/2$ and a flow routing demand $\textup{deg}_{\partial\mathcal{P}_{i+1}}-\mathbf{t}_{0}$ with congestion $\beta$ . Take a path decomposition of the flow where each vertex $v$ is the start of $\deg_{\partial\mathcal{P}_{i+1}}(v)$ total capacity of (potentially empty) paths and the end of $\mathbf{t}_{0}(v)$ total capacity of (potentially empty) paths. Since $\mathbf{s}\leq\deg_{\partial\mathcal{P}_{i+1}}$ , we can remove or decrease the capacity of paths until each vertex $v$ is the start of $\mathbf{s}(v)$ total capacity of paths. Removing or shrinking paths can only decrease the total capacity of paths ending at each vertex, so each vertex $v$ is now the end of $\mathbf{t}(v)\leq\mathbf{t}_{0}(v)\leq\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(v)$ total capacity of paths. The resulting flow routes demand $\mathbf{s}-\mathbf{t}$ with congestion at most $\beta$ . ∎

Claim 4.9.

For any $i\in[L-1]$ , consider any vector $\mathbf{x}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{x}\leq\deg_{\partial\mathcal{R}_{\geq i+1}}$ . There exists a vector $\mathbf{y}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{y}\leq\deg_{\partial\mathcal{P}_{i}}$ and a flow routing demand $\mathbf{x}-\mathbf{y}$ with congestion $(2L-1-2i)\beta$ .

Proof.

We prove the statement by induction from $i=L-1$ down to $i=1$ . For the base case $i=L-1$ , since $\mathcal{R}_{\geq L}=\mathcal{P}_{L}$ , we have $\mathbf{x}\leq\deg_{\partial\mathcal{P}_{L}}$ . By Claim 4.8 on vector $\mathbf{x}$ , there is a vector $\mathbf{y}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{y}\leq\deg_{\partial\mathcal{P}_{L-1}}/2\leq\deg_{\partial\mathcal{P}_{L-1}}$ and a flow routing demand $\mathbf{x}-\mathbf{y}$ with congestion $\beta=(2L-1-2(L-1))\beta$ .

For the inductive step, consider a vector $\mathbf{x}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{x}\leq\deg_{\partial\mathcal{R}_{\geq i+1}}$ . Define $\mathbf{x}^{\prime}\in\mathbb{R}^{V}_{\geq 0}$ as

\mathbf{x}^{\prime}(v)=\begin{cases}\displaystyle\frac{\deg_{\partial\mathcal{R}_{\geq i+2}}(v)}{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}\cdot\mathbf{x}(v)&\text{if }\deg_{\partial\mathcal{R}_{\geq i+1}}(v)>0,\\ 0&\text{otherwise},\end{cases}

which satisfies $\mathbf{x}^{\prime}\leq\deg_{\partial\mathcal{R}_{\geq i+2}}$ . By induction, there exists a vector $\mathbf{y}^{\prime}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{y}^{\prime}\leq\deg_{\partial\mathcal{P}_{i+1}}$ and a flow $f_{1}$ routing demand $\mathbf{x}^{\prime}-\mathbf{y}^{\prime}$ with congestion $(2L-1-2(i+1))\beta$ .

Next, consider the vector $\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}$ , which satisfies $\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}\geq\mathbf{x}-\mathbf{x}^{\prime}\geq\mathbf{0}$ , where the second inequality holds by Claim 4.3. If $\deg_{\partial\mathcal{R}_{\geq i+1}}(v)=0$ , then $\mathbf{x}(v)=0$ and $\mathbf{x}^{\prime}(v)=0$ , so $(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime})(v)=\mathbf{y}^{\prime}(v)\leq\deg_{\partial\mathcal{P}_{i+1}}(v)$ . Otherwise, if $\deg_{\partial\mathcal{R}_{\geq i+1}}(v)>0$ , then

\begin{aligned}
(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime})(v)&=\left(\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)-\deg_{\partial\mathcal{R}_{\geq i+2}}(v)}{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}\right)\mathbf{x}(v)+\mathbf{y}^{\prime}(v)\\
&\leq\frac{\deg_{\partial\mathcal{P}_{i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}\mathbf{x}(v)+\mathbf{y}^{\prime}(v)\\
&\leq\deg_{\partial\mathcal{P}_{i+1}}(v)+\deg_{\partial\mathcal{P}_{i+1}}(v),
\end{aligned}

where the first inequality holds by Claim 4.4. Combining the two cases, we obtain $\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}\leq 2\deg_{\partial\mathcal{P}_{i+1}}$ . By Claim 4.8 on vector $(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime})/2$ , there exists a vector $\mathbf{t}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{t}\leq\mathrm{deg}_{\partial\mathcal{P}_{i}}/2$ and a flow routing demand $(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime})/2-\mathbf{t}$ with congestion $\beta$ . Scaling this flow by factor $2$ , we obtain a flow $f_{2}$ routing demand $\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}-2\mathbf{t}$ with congestion $2\beta$ . The final flow is the sum of flows $f_{1}$ and $f_{2}$ , which routes demand $(\mathbf{x}^{\prime}-\mathbf{y}^{\prime})+(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}-2\mathbf{t})=\mathbf{x}-2\mathbf{t}$ and has congestion $(2L-1-2(i+1))\beta+2\beta=(2L-1-2i)\beta$ . We set $\mathbf{y}=2\mathbf{t}\leq\deg_{\partial\mathcal{P}_{i}}$ , which concludes the inductive step and hence the proof. ∎
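The congestion bound of Claim 4.9 can be read off by unrolling the induction. As a short worked derivation, write $c_{i}$ for the congestion bound obtained at level $i$ (this symbol is our shorthand and is not used elsewhere in the paper). The base case and the inductive step above give:

```latex
c_{L-1}=\beta,
\qquad
c_{i}=c_{i+1}+2\beta \quad\text{for } 1\leq i\leq L-2,
\qquad\text{hence}\qquad
c_{i}=\beta+2(L-1-i)\beta=(2L-1-2i)\beta .
```

In particular $c_{i}\leq c_{1}=(2L-3)\beta\leq 2L\beta$ for every level, which is the form in which the bound is used in Claim 4.10.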

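Claim 4.10.

For any $i\in[L-1]$ , consider any vector $\mathbf{x}\in\mathbb{R}^{V}_{\geq 0}$ with $\mathbf{x}\leq\deg_{\partial\mathcal{R}_{\geq i+1}}$ . There exists a vector $\mathbf{y}\in\mathbb{R}^{V}_{\geq 0}$ such that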

1.

$\mathbf{y}\leq\textup{deg}_{\partial\mathcal{P}_{i}}+2L\beta\deg_{\partial\mathcal{P}_{i+1}}$ .
2.

For all clusters $C\in\mathcal{P}_{i+1}$ , we have $(\mathbf{x}-\mathbf{y})(C)=0$ .
3.

There is a flow routing demand $\mathbf{x}-\mathbf{y}$ with congestion $2L\beta$ .

Proof.

Apply Claim 4.9 on vector $\mathbf{x}$ , obtaining a vector $\mathbf{y}\in\mathbb{R}^{V}_{\geq 0}$ which we relabel as $\mathbf{y}^{\prime}$ , and a flow $f$ routing demand $\mathbf{x}-\mathbf{y}^{\prime}$ with congestion $(2L-1-2i)\beta\leq 2L\beta$ . Take a path decomposition of flow $f$ where each vertex $w$ is the start of $\mathbf{x}(w)$ total capacity of (potentially empty) paths and the end of $\mathbf{y}^{\prime}(w)$ total capacity of (potentially empty) paths. For each path starting at a vertex $w$ in some cluster $C\in\mathcal{P}_{i+1}$ , perform the following operation. If the path contains an edge $(u,v)$ in $\partial C$ with $u\in C$ , then replace the path with its prefix ending at $u$ ; otherwise, do nothing to the path. Note that the modified path ends in the same cluster $C\in\mathcal{P}_{i+1}$ as its starting point. These modified paths form a new flow $f^{\prime}$ , which also has congestion at most $2L\beta$ since truncating paths never increases the flow on any edge.

We now bound the difference in the demands routed by $f$ and $f^{\prime}$ . To do so, we consider the difference in endpoints in the old and new path decompositions. Each vertex $u\in V$ was initially the endpoint of $\mathbf{y}^{\prime}(u)$ total capacity of paths. We now claim that for each cluster $C\in\mathcal{P}_{i+1}$ , each vertex $u\in C$ becomes the new endpoint of at most $2L\beta\deg_{\partial C}(u)=2L\beta\deg_{\partial\mathcal{P}_{i+1}}(u)$ total capacity of paths. This is because each new endpoint is a result of an edge $(u,v)\in\partial C$ in some path, and the total capacity of such paths is at most $2L\beta\deg_{\partial C}(u)$ since the flow $f$ has congestion $2L\beta$ . It follows that each vertex $u\in V$ is the (new or old) endpoint of at most $\mathbf{y}^{\prime}(u)+2L\beta\deg_{\partial\mathcal{P}_{i+1}}(u)$ total capacity of paths in the new flow $f^{\prime}$ .

Define vector $\mathbf{y}\in\mathbb{R}^{V}_{\geq 0}$ such that each vertex $u\in V$ is the endpoint of $\mathbf{y}(u)$ total capacity of paths in the new flow $f^{\prime}$ . In other words, the flow $f^{\prime}$ routes demand $\mathbf{x}-\mathbf{y}$ . We have shown that $\mathbf{y}\leq\mathbf{y}^{\prime}+2L\beta\deg_{\partial\mathcal{P}_{i+1}}$ , and combined with the guarantee $\mathbf{y}^{\prime}\leq\deg_{\partial\mathcal{P}_{i}}$ from Claim 4.9, we conclude that $\mathbf{y}\leq\deg_{\partial\mathcal{P}_{i}}+2L\beta\deg_{\partial\mathcal{P}_{i+1}}$ . Finally, recall that each path of $f^{\prime}$ starts and ends in the same cluster of $\mathcal{P}_{i+1}$ , so $(\mathbf{x}-\mathbf{y})(C)=0$ for all clusters $C\in\mathcal{P}_{i+1}$ . ∎
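The path surgery in the proof of Claim 4.10 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the paper's implementation: a path is a list of vertices, `cluster_of` is a hypothetical dictionary mapping each vertex to its cluster in $\mathcal{P}_{i+1}$ , and we cut at the first edge leaving the starting cluster, which is one valid choice of the edge $(u,v)\in\partial C$ with $u\in C$ .

```python
def truncate_path(path, cluster_of):
    """Cut a path at the first edge that leaves the cluster of its start vertex.

    Returns the prefix ending at the last vertex before that edge, or the
    whole path if it never leaves the starting cluster.
    """
    home = cluster_of[path[0]]
    for k, v in enumerate(path):
        if cluster_of[v] != home:
            return path[:k]  # prefix ends at u, the tail of the crossing edge (u, v)
    return path
```

A path that leaves its cluster on its very first edge is truncated to a single vertex, which corresponds to a (potentially empty) path in the sense of the proof. The new endpoint is always incident to a boundary edge of its cluster, which is what drives the $2L\beta\deg_{\partial\mathcal{P}_{i+1}}$ term in the bound on $\mathbf{y}$ .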

We now prove Lemma 4.6.

Proof.

We first construct demand $\mathbf{t}\in\mathbb{R}^{V}$ as follows. For each set $C\in\mathcal{R}_{\geq i+1}$ , define $\mathbf{t}(v)=\mathbf{s}(C)\cdot\deg_{\partial\mathcal{R}_{\geq i+1}}(v)/\delta C$ for all $v\in C$ , which satisfies $|\mathbf{t}(v)|\leq\deg_{\partial\mathcal{R}_{\geq i+1}}(v)$ by condition b. Since $\mathcal{R}_{\geq i+1}$ is a partition of $V$ , this fully defines demand $\mathbf{t}$ , which satisfies the bound required by property i. Also, since $C\in\mathcal{R}_{\geq i+1}$ , we have $\sum_{v\in C}\deg_{\partial\mathcal{R}_{\geq i+1}}(v)=\delta C$ , so

\mathbf{t}(C)=\sum_{v\in C}\mathbf{t}(v)=\sum_{v\in C}\mathbf{s}(C)\cdot\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\delta C}=\mathbf{s}(C),

satisfying property iii. In particular, $\mathbf{t}(V)=\mathbf{s}(V)$ and $\mathbf{s}-\mathbf{t}$ is a valid demand.

Since $|\mathbf{t}|\leq\deg_{\partial\mathcal{R}_{\geq i+1}}$ , we apply Claim 4.10 with $\mathbf{x}=\mathbf{t}$ , obtaining a vector $\mathbf{y}\in\mathbb{R}^{V}_{\geq 0}$ which we relabel as $\mathbf{t}^{\prime}$ . We have $|\mathbf{t}^{\prime}|=\mathbf{t}^{\prime}\leq\deg_{\partial\mathcal{P}_{i}}+2L\beta\deg_{\partial\mathcal{P}_{i+1}}$ and $(\mathbf{t}-\mathbf{t}^{\prime})(C)=0$ for all clusters $C\in\mathcal{P}_{i+1}$ . Let $f_{1}$ be the flow routing demand $\mathbf{t}-\mathbf{t}^{\prime}$ with congestion $2L\beta$ .

Consider a cluster $C\in\mathcal{P}_{i+1}$ . We have $(\mathbf{s}-\mathbf{t}^{\prime})(C)=(\mathbf{s}-\mathbf{t})(C)+(\mathbf{t}-\mathbf{t}^{\prime})(C)=0$ . Moreover, for all vertices $v\in C$ , we have

\begin{aligned}
|\mathbf{s}(v)-\mathbf{t}^{\prime}(v)|\leq|\mathbf{s}(v)|+|\mathbf{t}^{\prime}(v)|&\leq i\deg_{\partial\mathcal{P}_{i}}(v)+\deg_{\partial\mathcal{P}_{i}}(v)+2L\beta\deg_{\partial\mathcal{P}_{i+1}}(v)\\
&=(i+1)\deg_{\partial\mathcal{P}_{i}}(v)+2L\beta\deg_{\partial C}(v)\\
&\leq 3L\beta\deg_{\partial\mathcal{P}_{i}\cup\partial C}(v).
\end{aligned}

We conclude that the scaled-down and restricted vector $\frac{1}{3L\beta}(\mathbf{s}-\mathbf{t}^{\prime})|_{C}$ is a demand satisfying $\big{|}\frac{1}{3L\beta}(\mathbf{s}-\mathbf{t}^{\prime})|_{C}\big{|}\leq\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}$ . By assumption (2) of Lemma 4.2, the collection of vertex weightings $\{\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{i+1}\}$ mixes simultaneously in $G$ with congestion $\alpha$ , so there is a flow in $G$ routing demand $\sum_{C\in\mathcal{P}_{i+1}}\frac{1}{3L\beta}(\mathbf{s}-\mathbf{t}^{\prime})|_{C}=\frac{1}{3L\beta}(\mathbf{s}-\mathbf{t}^{\prime})$ with congestion $\alpha$ . Scaling this flow by factor $3L\beta$ , we obtain a flow $f_{2}$ routing demand $\mathbf{s}-\mathbf{t}^{\prime}$ with congestion at most $3L\alpha\beta$ .

The final flow $f$ is $f_{2}-f_{1}$ , which routes demand $(\mathbf{s}-\mathbf{t}^{\prime})-(\mathbf{t}-\mathbf{t}^{\prime})=\mathbf{s}-\mathbf{t}$ and has congestion at most $2L\beta+3L\alpha\beta\leq 5L\alpha\beta$ , where the inequality uses $\alpha\geq 1$ . This concludes the proof of Lemma 4.6. ∎
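The vector $\mathbf{t}$ constructed at the start of this proof spreads the net demand $\mathbf{s}(C)$ of each set $C\in\mathcal{R}_{\geq i+1}$ over the vertices of $C$ in proportion to their boundary degree. A minimal Python sketch of that step follows; the function name and the dictionary-based inputs are our own choices, integer inputs are assumed for exact arithmetic, and the handling of sets with $\delta C=0$ is an assumption the proof does not spell out (condition (b) then forces $\mathbf{s}(C)=0$ , so setting $\mathbf{t}=0$ on $C$ is consistent).

```python
from fractions import Fraction


def spread_over_boundary(s, clusters, bdeg):
    """t(v) = s(C) * bdeg(v) / delta(C) for each v in C.

    bdeg[v] stands for deg_{∂R_{>=i+1}}(v); delta(C) is the sum of bdeg over C.
    """
    t = {}
    for cluster in clusters:
        s_C = sum(s[v] for v in cluster)
        delta_C = sum(bdeg[v] for v in cluster)
        for v in cluster:
            if delta_C > 0:
                t[v] = Fraction(s_C * bdeg[v], delta_C)
            else:
                # Assumed convention: no boundary, hence s(C) = 0 by condition (b).
                t[v] = Fraction(0)
    return t
```

By construction $\mathbf{t}(C)=\mathbf{s}(C)$ on every set, and $|\mathbf{t}(v)|\leq\deg_{\partial\mathcal{R}_{\geq i+1}}(v)$ holds whenever $|\mathbf{s}(C)|\leq\delta C$ .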

5 Partitioning Algorithm

The partitioning algorithm starts with the partition $\mathcal{P}_{1}=\{\{v\}:v\in V\}$ of singleton clusters. The algorithm then iteratively constructs partition $\mathcal{P}_{i+1}$ given the current partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{i}$ . The theorem below establishes this iterative algorithm, where we substitute $L$ for $i$ .

Theorem 5.1.

Consider a capacitated graph $G=(V,E)$ , and let $\alpha\geq 1$ be a parameter. Suppose there exist partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ that satisfy the three properties in Lemma 4.2, i.e.,

1.

$\mathcal{P}_{1}$ is the partition $\{\{v\}:v\in V\}$ of singleton clusters.
2.

For each $i\in[L-1]$ , the collection of vertex weightings $\{\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{i+1}\}$ mixes simultaneously in $G$ with congestion $\alpha=O(\log^{5}(nW))$ .
3.

For each $i\in[L-1]$ , there is a flow in $G$ with congestion $\beta=O(\log^{3}(nW))$ such that each vertex $v\in V$ sends $\deg_{\partial\mathcal{P}_{i+1}}(v)$ flow and receives at most $\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(v)$ flow.

Then, there is an algorithm that constructs partition $\mathcal{P}_{L+1}$ such that properties (2) and (3) hold for $i=L$ as well. The algorithm runs in $\tilde{O}(m)$ time.

We remark that the parameters $\alpha=O(\log^{5}(nW))$ and $\beta=O(\log^{3}(nW))$ , which result in an overall quality of $O(\log^{10}(nW))$ , are far from optimized. We aim for a clean and modular exposition over the optimization of logarithmic factors, leaving the latter open for future work.

Before we describe the algorithm, we first show that $O(\log(nW))$ iterations suffice to obtain a congestion-approximator.

Claim 5.2.

After $L=O(\log(nW))$ iterations, we have $\mathcal{P}_{L}=\{V\}$ , and $\mathcal{C}$ is a congestion-approximator with quality $O(\log^{10}(nW))$ .

Proof.

We first claim that $\delta\mathcal{P}_{i+1}\leq\delta\mathcal{P}_{i}/2$ for all $i\in[L-1]$ . By property (3), there is a flow that sends a total of $\deg_{\partial\mathcal{P}_{i+1}}(V)$ flow among the vertices, and receives a total of at most $\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(V)$ flow among the vertices. Since the total flow sent equals the total flow received, we have $\deg_{\partial\mathcal{P}_{i+1}}(V)\leq\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(V)$ , or equivalently, $\delta\mathcal{P}_{i+1}\leq\frac{1}{2}\delta\mathcal{P}_{i}$ .

The guarantee $\delta\mathcal{P}_{i+1}\leq\frac{1}{2}\delta\mathcal{P}_{i}$ ensures that for $L=O(\log(nW))$ , we must have $\delta\mathcal{P}_{L}<1$ . Since all edge capacities are assumed to be at least $1$ , we conclude that $\delta\mathcal{P}_{L}=0$ and $\mathcal{P}_{L}=\{V\}$ , fulfilling property (1) of Theorem 4.1. By Theorem 4.1, we obtain a congestion-approximator with quality $5L^{2}\alpha\beta=O(\log^{10}(nW))$ . ∎
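To see concretely why logarithmically many levels suffice, the following small Python sketch counts levels under the worst case allowed by the halving guarantee, namely that the boundary capacity exactly halves at each level. The function name is ours and the input is assumed to be the integer total capacity $\delta\mathcal{P}_{1}$ ; it returns the smallest $L$ with $\delta\mathcal{P}_{1}/2^{L-1}<1$ .

```python
def levels_until_trivial(total_boundary):
    """Smallest L such that total_boundary / 2**(L - 1) < 1.

    total_boundary stands for delta(P_1), the total capacity cut by the
    singleton partition; each level is assumed to halve it (the worst case).
    """
    L = 1
    while total_boundary >= 2 ** (L - 1):
        L += 1
    return L
```

For a total capacity of $8$ the sketch returns $L=5$ , since $8/2^{4}<1\leq 8/2^{3}$ . In general the answer is $\lfloor\log_{2}\delta\mathcal{P}_{1}\rfloor+2$ , which is $O(\log(nW))$ whenever $\delta\mathcal{P}_{1}$ is polynomial in $nW$ .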

The rest of this section is dedicated to proving Theorem 5.1. Throughout the section, we fix the input graph $G=(V,E)$ as well as the current partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ . We also define $\mathcal{R}_{\geq 1},\ldots,\mathcal{R}_{\geq L}$ and $\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i}$ according to Lemma 4.2.

At a high level, our algorithm proceeds similarly to expander decomposition algorithms like [34], where expanders are defined with respect to the vertex weighting $\deg_{\partial\mathcal{P}_{L}}$ of partition $\mathcal{P}_{L}$ . We iteratively decompose the graph using the cut-matching game of [18]: for each cluster of the decomposition, we either compute a low-conductance cut or certify that the current cluster is mixing (or in [34] terms, a nearly expander). The matching step of the cut-matching game requires a call to approximate min-cut/max-flow, but recall that the partitions $\mathcal{P}_{1},\ldots,\mathcal{P}_{L}$ only form a pseudo-congestion-approximator. Luckily, we can modify the graph in a way that the pseudo-congestion-approximator can be adapted to an actual congestion-approximator for the new graph. We then black-box a cut/flow algorithm on this new graph, and then modify the cut and flow to fit the old graph in a way that suffices for the cut-matching game.

In more detail, we break down this section as follows:

1.

In Section 5.1, we show that $\mathcal{C}$ can be used to construct a congestion-approximator for slightly modified graphs.
2.

In Section 5.2, we cite the (approximate) fair cut/flow algorithm of [25], which computes a cut/flow pair with desirable properties given a congestion-approximator.
3.

In Section 5.3, we introduce the cut-matching game as well as a trimming procedure similar to [34]. Both the cut-matching game and the trimming step use the fair cut/flow algorithm (Section 5.2) on the modified graphs for which we have a congestion-approximator (Section 5.1).
4.

Finally, in Section 5.4, we establish Theorem 5.1. We present the recursive clustering algorithm that computes the next partition $\mathcal{P}_{L+1}$ given the current partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ . It uses the cut-matching game and trimming procedures of Section 5.3.

5.1 Congestion-approximator

We first show how to build a congestion-approximator on certain graph instances that show up in our algorithm. We define these graph instances below.

Definition 5.3 ( $G[A,\gamma,\mathbf{s},\mathbf{t}]$ ).

For given vertex set $A\subseteq V$ , parameter $\gamma\in(0,1]$ , and vertex weightings $\mathbf{s},\mathbf{t}\in\mathbb{R}^{A}_{\geq 0}$ , define the graph $G[A,\gamma,\mathbf{s},\mathbf{t}]$ as follows:

•

Start with the graph $G[A]$ .
•

Add new vertices $x$ , $s$ , and $t$ .
•

For each vertex $v\in A$ , add an edge $(x,v)$ with capacity $\gamma\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(v)$ .
•

For each vertex $v\in A$ , add an edge $(s,v)$ with capacity $\mathbf{s}(v)$ .
•

For each vertex $v\in A$ , add an edge $(t,v)$ with capacity $\mathbf{t}(v)$ .
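The construction in Definition 5.3 can be sketched as follows. This is an illustrative Python rendering under our own representation choices: the graph $G$ is a dictionary from vertex pairs to capacities, the partition $\mathcal{P}_{L}$ is a list of vertex sets, and the new vertices are labelled by the strings `"x"`, `"s"`, `"t"` (assumed not to clash with existing vertex names). An edge lying in both $\partial_{G}\mathcal{P}_{L}$ and $\partial_{G}A$ is counted once, since the degree is taken with respect to the union.

```python
def build_instance(edges, A, gamma, s, t, partition_L):
    """Capacity dictionary of G[A, gamma, s, t] from Definition 5.3."""
    A = set(A)
    where = {v: k for k, cluster in enumerate(partition_L) for v in cluster}
    H = {}
    wdeg = {v: 0 for v in A}  # deg_{∂P_L ∪ ∂A}(v) for each v in A
    for (u, v), cap in edges.items():
        if u in A and v in A:
            H[(u, v)] = cap  # edge of the induced subgraph G[A]
        crosses = where[u] != where[v] or ((u in A) != (v in A))
        if crosses:
            for w in (u, v):
                if w in A:
                    wdeg[w] += cap
    for v in A:
        H[("x", v)] = gamma * wdeg[v]
        H[("s", v)] = s[v]
        H[("t", v)] = t[v]
    return H
```

On the path $0-1-2$ with capacities $1$ and $2$ , $\mathcal{P}_{L}=\{\{0,1\},\{2\}\}$ , $A=\{0,1\}$ and $\gamma=1/2$ , only vertex $1$ touches a boundary edge, so the edge $(x,1)$ receives capacity $1$ and the edge $(x,0)$ receives capacity $0$ .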

To understand these instances, recall a fact about $\mathcal{C}$ that follows from Lemma 4.2.

Fact 5.4.

For any demand $\mathbf{b}\in\mathbb{R}^{V}$ satisfying $|\mathbf{b}(C)|\leq\delta C$ for all $C\in\mathcal{C}$ , there exists a demand $\mathbf{b}^{\prime}\in\mathbb{R}^{V}$ satisfying $|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}}$ and a flow in $G$ routing demand $\mathbf{b}-\mathbf{b}^{\prime}$ with congestion $\kappa$ , where we define $\kappa=5L^{2}\alpha\beta$ .

Suppose we start with the entire graph $G$ , and then add a vertex $x$ connected to each vertex $v\in V$ with an edge of capacity $\deg_{\partial\mathcal{P}_{L}}(v)$ . In the setting of Fact 5.4, suppose we wish to route the demand $\mathbf{b}$ . We start with the flow routing demand $\mathbf{b}-\mathbf{b}^{\prime}$ as promised by Fact 5.4. To route the remaining demand $\mathbf{b}^{\prime}$ , we simply use the new edges incident to $x$ : for each vertex $v\in V$ , send $|\mathbf{b}^{\prime}(v)|\leq L\deg_{\partial\mathcal{P}_{L}}(v)$ flow along the edge $(x,v)$ in the proper direction, which is a flow with congestion $L$ . Overall, we obtain a flow routing demand $\mathbf{b}$ with congestion $\kappa+L$ , and with some more work, we can show that $\mathcal{C}$ is a congestion-approximator of the new graph.

For our algorithm, we actually work with graphs as described in Definition 5.3. In particular, the base graphs are induced subgraphs, and there are additional vertices $s$ and $t$ . One issue is that the newly added edges may also contribute to the values of $\delta C$ for $C\in\mathcal{C}$ . (This issue is also present in the example with the entire graph, but can be resolved by investigating the structure of $\mathcal{C}$ .) Nevertheless, we show in the lemma below that as long as $A,\mathbf{s},\mathbf{t}$ are “well-behaved”, we can modify $\mathcal{C}$ into a congestion-approximator for $G[A,\gamma,\mathbf{s},\mathbf{t}]$ .

Lemma 5.5.

Consider partitions $\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}$ for a graph $G=(V,E)$ that satisfy the three properties in Lemma 4.2, and define the partitions $\mathcal{R}_{\geq 1},\ldots,\mathcal{R}_{\geq L}$ and $\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i}$ according to Lemma 4.2. Fix vertex set $A\subseteq V$ , parameter $\gamma\in(0,1]$ , and vertex weightings $\mathbf{s},\mathbf{t}\in\mathbb{R}^{A}_{\geq 0}$ on $A$ , and denote the graph $G[A,\gamma,\mathbf{s},\mathbf{t}]$ by $H=(V_{H},E_{H})$ . Consider a parameter $\beta\geq 1$ such that the following assumption holds:

$(\star)$

$\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A)+\gamma\cdot\delta_{G}(C\cap A)\leq\beta\cdot\delta_{G}C$ for all $C\in\mathcal{C}$ .

Let $\mathcal{C}|_{A}$ be the collection $\{C\cap A:C\in\mathcal{C}\}$ . Then, $\mathcal{C}|_{A}\cup\{\{x\},\{s\},\{t\}\}$ is a congestion-approximator of $H$ with quality $O(\beta\gamma^{-1}(\kappa+L))$ .

For the rest of Section 5.1, we prove Lemma 5.5. Consider a demand $\mathbf{b}\in\mathbb{R}^{A\cup\{x,s,t\}}$ satisfying $|\mathbf{b}(C\cap A)|\leq\delta_{H}(C\cap A)$ for all $C\in\mathcal{C}$ as well as $|\mathbf{b}(v)|\leq\delta_{H}\{v\}$ for $v\in\{x,s,t\}$ . We want to establish a flow on $H$ routing demand $\mathbf{b}$ with congestion $O(\beta\gamma^{-1}(\kappa+L))$ .

We first handle the demand at $x$ , $s$ , and $t$ . For each edge $(x,v)$ where $v\in A$ , route $\mathbf{b}(x)/\deg_{H}(x)\cdot c_{H}(x,v)$ flow from $x$ to $v$ (or $-\mathbf{b}(x)/\deg_{H}(x)\cdot c_{H}(x,v)$ flow from $v$ to $x$ , whichever is nonnegative). Since $|\mathbf{b}(x)|=|\mathbf{b}(\{x\})|\leq\delta_{H}\{x\}=\deg_{H}(x)$ , we route at most $c_{H}(x,v)$ flow along each edge $(x,v)$ . Analogously, for each edge $(s,v)$ where $v\in A$ , route $\mathbf{b}(s)/\deg_{H}(s)\cdot c_{H}(s,v)$ flow from $s$ to $v$ , and for each edge $(t,v)$ where $v\in A$ , route $\mathbf{b}(t)/\deg_{H}(t)\cdot c_{H}(t,v)$ flow from $t$ to $v$ . By the same argument, we route at most $c_{H}(s,v)$ and $c_{H}(t,v)$ flow along each edge $(s,v)$ and $(t,v)$ , respectively. In other words, the routing so far has congestion $1$ .

After this initial routing, vertices $x$ , $s$ , and $t$ no longer have any demand, and each vertex $v\in A$ receives at most $c_{H}(x,v)+c_{H}(s,v)+c_{H}(t,v)=c_{H}(\{x,s,t\},v)$ additional demand in absolute value. In other words, if $\widetilde{\mathbf{b}}$ is the new demand that must be routed, we have $\widetilde{\mathbf{b}}(x)=\widetilde{\mathbf{b}}(s)=\widetilde{\mathbf{b}}(t)=0$ and $|\mathbf{b}(v)-\widetilde{\mathbf{b}}(v)|\leq c_{H}(\{x,s,t\},v)$ for each $v\in A$ .
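The proportional routing used for $x$ , $s$ , and $t$ can be illustrated as follows. This is a small sketch under the assumption that the edges at a terminal $u$ are given as a hypothetical dictionary `caps` mapping each neighbor $v$ to $c_{H}(u,v)$ ; it is not taken from the paper.

```python
def route_from_terminal(b_u, caps):
    """Spread the demand b_u of a terminal u over its edges, proportionally to capacity.

    caps : dict {v: c_H(u, v)} for the neighbors v of u (hypothetical input format)
    Returns a dict {v: signed flow on (u, v)}; a positive value means flow from u to v.
    """
    deg = sum(caps.values())  # deg_H(u)
    if deg == 0:
        # Then |b_u| <= deg_H(u) forces b_u = 0 and nothing is routed.
        return {v: 0.0 for v in caps}
    return {v: b_u / deg * c for v, c in caps.items()}
```

Whenever $|\mathbf{b}(u)|\leq\deg_{H}(u)$ , every returned value is at most the corresponding capacity in absolute value, which is the congestion $1$ guarantee used above.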

In order to invoke Fact 5.4, our next goal is to show the following.

Claim 5.6.

$|\widetilde{\mathbf{b}}(C\cap A)|\leq(1+2\beta+2\gamma)\delta_{G}C$ for all $C\in\mathcal{C}$ .

Proof.

We first bound $|\mathbf{b}(C\cap A)-\widetilde{\mathbf{b}}(C\cap A)|$ as

\displaystyle|\mathbf{b}(C\cap A)-\widetilde{\mathbf{b}}(C\cap A)|\leq\sum_{v\in C\cap A}|\mathbf{b}(v)-\widetilde{\mathbf{b}}(v)|\leq\sum_{v\in C\cap A}c_{H}(\{x,s,t\},v)=c_{H}(\{x,s,t\},C\cap A).

Next, we bound $|\mathbf{b}(C\cap A)|$ as follows. By assumption, we have $|\mathbf{b}(C\cap A)|\leq\delta_{H}(C\cap A)$ . By construction of $H$ , we have

\displaystyle\delta_{H}(C\cap A)=c_{G}(C\cap A,A\setminus C)+c_{H}(\{x,s,t\},C\cap A)\leq\delta_{G}C+c_{H}(\{x,s,t\},C\cap A).

Putting everything together, we obtain

\displaystyle|\widetilde{\mathbf{b}}(C\cap A)|\leq|\mathbf{b}(C\cap A)-\widetilde{\mathbf{b}}(C\cap A)|+|\mathbf{b}(C\cap A)|\leq\delta_{G}C+2c_{H}(\{x,s,t\},C\cap A).

(1)

It remains to bound $c_{H}(\{x,s,t\},C\cap A)$ , which we split into $c_{H}(\{s,t\},C\cap A)+c_{H}(\{x\},C\cap A)$ . By construction of $H=G[A,\gamma,\mathbf{s},\mathbf{t}]$ , we have

c_{H}(\{s,t\},C\cap A)=\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A)

and

c_{H}(\{x\},C\cap A)=\gamma\cdot\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(C\cap A)\leq\gamma\cdot\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A)+\gamma\cdot\deg_{\partial_{G}A}(C\cap A).

We now bound the individual terms $\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A)$ and $\deg_{\partial_{G}A}(C\cap A)$ above. For $\deg_{\partial_{G}A}(C\cap A)$ , we claim the bound $\deg_{\partial_{G}A}(C\cap A)\leq\delta_{G}(C\cap A)$ : any edge in $\partial_{G}A$ with an endpoint in $C\cap A$ has its other endpoint outside $C\cap A$ , so the edge must be in $\partial_{G}(C\cap A)$ , and the claimed bound holds. For $\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A)$ , we claim the bound $\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A)\leq\deg_{\partial_{G}\mathcal{P}_{L}}(C)\leq\delta_{G}C$ . The first inequality is trivial, and for the second inequality, observe that by construction of $\mathcal{C}$ , each set $C\in\mathcal{C}$ is a subset of some cluster in the partition $\mathcal{P}_{L}$ . It follows that any edge in $\partial_{G}\mathcal{P}_{L}$ with an endpoint in $C$ has its other endpoint outside $C$ , so the edge must be in $\partial_{G}C$ , and we conclude that $\deg_{\partial_{G}\mathcal{P}_{L}}(C)\leq\delta_{G}C$ .

Continuing from (1), we conclude that

	$\displaystyle|\widetilde{\mathbf{b}}(C\cap A)|$	$\displaystyle\leq\delta_{G}C+2c_{H}(\{x,s,t\},C\cap A)$
		$\displaystyle=\delta_{G}C+2c_{H}(\{s,t\},C\cap A)+2c_{H}(\{x\},C\cap A)$
		$\displaystyle\leq\delta_{G}C+2(\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A))+2(\gamma\cdot\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A)+\gamma\cdot\deg_{\partial_{G}A}(C\cap A))$
		$\displaystyle\leq\delta_{G}C+2(\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A))+2(\gamma\cdot\delta_{G}C+\gamma\cdot\delta_{G}(C\cap A))$
		$\displaystyle\stackrel{(\star)}{\leq}\delta_{G}C+2\beta\cdot\delta_{G}C+2\gamma\cdot\delta_{G}C,$

finishing the proof. ∎

Let $\widetilde{\mathbf{b}}^{\prime}\in\mathbb{R}^{V}$ be the vector $\widetilde{\mathbf{b}}\in\mathbb{R}^{A\cup\{x,s,t\}}$ without entries $\widetilde{\mathbf{b}}(x),\widetilde{\mathbf{b}}(s),\widetilde{\mathbf{b}}(t)$ and with new entries $\widetilde{\mathbf{b}}^{\prime}(v)=0$ for all $v\in V\setminus A$ . Since $\widetilde{\mathbf{b}}$ is a demand with $\widetilde{\mathbf{b}}(x)=\widetilde{\mathbf{b}}(s)=\widetilde{\mathbf{b}}(t)=0$ , we have that $\widetilde{\mathbf{b}}^{\prime}$ is also a demand, i.e., the coordinates sum to $0$ . By Claim 5.6, we have

|\widetilde{\mathbf{b}}^{\prime}(C)|=|\widetilde{\mathbf{b}}(C\cap A)|\leq(1+2\beta+2\gamma)\delta_{G}C,

so we can apply Fact 5.4 on demand $\widetilde{\mathbf{b}}^{\prime}/(1+2\beta+2\gamma)$ to obtain a demand $\mathbf{b}^{\prime}\in\mathbb{R}^{V}$ satisfying $|\mathbf{b}^{\prime}|\leq L\deg_{\partial_{G}\mathcal{P}_{L}}$ and a flow on $G$ routing demand $\widetilde{\mathbf{b}}^{\prime}/(1+2\beta+2\gamma)-\mathbf{b}^{\prime}$ with congestion $\kappa$ . Scaling this flow by factor $(1+2\beta+2\gamma)$ , we obtain a flow $f^{\prime}$ on $G$ routing demand $\widetilde{\mathbf{b}}^{\prime}-(1+2\beta+2\gamma)\mathbf{b}^{\prime}$ with congestion $(1+2\beta+2\gamma)\kappa$ .

Next, imagine contracting $V\setminus A$ into a single vertex labeled $x$ , so that each edge $(v,x)$ has capacity $\deg_{\partial_{G}A}(v)$ . Consider the corresponding flow $f^{\prime}$ on this contracted graph, which sends at most $(1+2\beta+2\gamma)\kappa\deg_{\partial_{G}A}(v)$ flow on each edge $(v,x)$ . Now consider the exact same flow on $H$ , whose edges $(v,x)$ have capacities $\gamma\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(v)$ instead of $\deg_{\partial_{G}A}(v)$ . These capacities are at least $\gamma$ times the capacities of the contracted graph, so the corresponding flow has congestion at most a factor $1/\gamma$ larger. We have established a flow on $H$ routing demand $\widetilde{\mathbf{b}}^{\prime}-(1+2\beta+2\gamma)\mathbf{b}^{\prime}$ with congestion $\gamma^{-1}(1+2\beta+2\gamma)\kappa$ .

Finally, since demand $(1+2\beta+2\gamma)\mathbf{b}^{\prime}$ satisfies $|(1+2\beta+2\gamma)\mathbf{b}^{\prime}|\leq(1+2\beta+2\gamma)L\deg_{\partial_{G}\mathcal{P}_{L}}$ , we can directly route the demand along the edges $(v,x)$ of capacity $\gamma\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(v)$ , which is a routing with congestion $\gamma^{-1}(1+2\beta+2\gamma)L$ .

Adding up all three routings, we have routed the initial demand $\mathbf{b}$ with congestion $1+\gamma^{-1}(1+2\beta+2\gamma)\kappa+\gamma^{-1}(1+2\beta+2\gamma)L=O(\beta\gamma^{-1}(\kappa+L))$ , concluding the proof.
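For completeness, the final simplification can be spelled out as below. The only facts used are $\beta\geq 1$ and $\gamma\leq 1$ , together with the additional assumption (made here for the illustration) that $\kappa+L\geq 1$ , which holds whenever there is at least one level.

```latex
\begin{align*}
1+2\beta+2\gamma &\leq \beta+2\beta+2\beta = 5\beta
  && (\beta\geq 1,\ \gamma\leq 1), \\
1+\gamma^{-1}(1+2\beta+2\gamma)(\kappa+L)
  &\leq 1+5\beta\gamma^{-1}(\kappa+L) \\
  &\leq 6\beta\gamma^{-1}(\kappa+L)
  && (\beta\gamma^{-1}(\kappa+L)\geq 1).
\end{align*}
```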

5.2 Fair Cut/Flow Algorithm

Given a congestion-approximator, the most convenient min-cut/max-flow algorithm is the fair cut/flow algorithm of [25]. We state the definition of a fair cut/flow and then cite the main result of [25].

Definition 5.7 (Fair cut/flow).

Let $G=(V,E)$ be an undirected graph with edge capacities $c\in\mathbb{R}_{>0}^{E}$ . Let $s,t$ be two vertices in $V$ . For any parameter $\alpha\geq 1$ , we say that an $(s,t)$ -cut $S\subseteq V$ and a feasible flow $f$ form an $\alpha$ -fair $(s,t)$ -cut/flow pair if for each edge $(u,v)\in\partial S$ with $u\in S$ and $v\in V\setminus S$ , the flow $f$ sends at least $\frac{1}{\alpha}\cdot c(u,v)$ flow along the edge in the direction from $u$ to $v$ .

Fact 5.8.

For any $\alpha$ -fair $(s,t)$ -cut/flow pair $(S,f)$ , the cut $\partial S$ is an $\alpha$ -approximate minimum $(s,t)$ -cut.
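The fairness condition of Definition 5.7 is easy to check directly. The sketch below is an illustration under assumed input conventions (a signed flow value per undirected edge, stored in hypothetical dictionaries `flow` and `cap`); it checks capacity feasibility and the condition on cut edges, but not flow conservation.

```python
def is_fair_pair(S, flow, cap, alpha):
    """Check the alpha-fairness condition on the edges crossing the cut S.

    S     : set of vertices on the source side of the cut
    flow  : dict {(u, v): f}, where f > 0 means flow from u to v (assumed format)
    cap   : dict {(u, v): c} with the same keys as flow (undirected capacities)
    """
    for (u, v), c in cap.items():
        f = flow[(u, v)]
        if abs(f) > c:
            return False  # capacity violated, so the flow is not feasible
        if (u in S) != (v in S):
            # Amount sent across this edge in the direction from S to V \ S.
            out = f if u in S else -f
            if out < c / alpha:
                return False
    return True
```

If the check passes for an $(s,t)$ -flow $f$ , then the net flow across $\partial S$ is at least $\delta S/\alpha$ , while the flow value is at most the minimum $(s,t)$ -cut; this is the reasoning behind Fact 5.8. With floating-point inputs, a small tolerance in the comparisons may be advisable.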

Theorem 5.9 (Fair cut/flow algorithm [25]).

Consider a graph $G=(V,E)$ , two vertices $s,t\in V$ , and error parameter $\epsilon\in(0,1]$ . Given a congestion-approximator $\mathcal{C}$ with quality $\kappa$ , there is an algorithm that outputs a $(1+\epsilon)$ -fair $(s,t)$ -cut/flow pair in $\tilde{O}((\kappa/\epsilon)^{O(1)}(K+m))$ time where $K=\sum_{C\in\mathcal{C}}|C|$ .

We remark that the fair cut/flow algorithm above is not the fastest available algorithm. However, it is conceptually the easiest for our purposes, and we believe that future work may improve the running time of fair cut/flow algorithms to approach those of standard approximate cut/flow algorithms. Hence, we decide to black-box a fair cut/flow algorithm rather than starting with a standard cut/flow algorithm and massaging it to work in our setting.

To apply Theorem 5.9 to the graph $H=G[A,\gamma,\mathbf{s},\mathbf{t}]$ with the congestion-approximator of Lemma 5.5, we need to bound $K$ for the congestion-approximator $\mathcal{C}|_{A}\cup\{\{x\},\{s\},\{t\}\}$ . Recall that $\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i}$ is the union of $L$ partitions of $V$ , so $\mathcal{C}|_{A}$ is the union of $L$ partitions of $A$ . So $K=L|A|+3$ , where the $+3$ comes from the singletons in $\{\{x\},\{s\},\{t\}\}$ . It follows that Theorem 5.9 runs in time $\tilde{O}((\kappa/\epsilon)^{O(1)}(K+|E(H)|))=\tilde{O}((\kappa/\epsilon)^{O(1)}(L|A|+m^{\prime}))$ where $m^{\prime}$ is the number of edges in $G$ incident to vertices in $A$ .

We conclude this section with the main subroutine that we use to construct partition $\mathcal{P}_{L+1}$ . Note that the assumption $(\star)$ below is the same as in Lemma 5.5.

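Theorem 5.10 (Flow/cut subroutine).

Consider the setting of Lemma 5.5, including the graph $H=G[A,\gamma,\mathbf{s},\mathbf{t}]$ and the parameter $\beta\geq 1$ , and suppose that the following assumption holds: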

$(\star)$

$\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A)+\gamma\cdot\delta_{G}(C\cap A)\leq\beta\cdot\delta_{G}C$ for all $C\in\mathcal{C}$ .

Let $\mathcal{C}|_{A}$ be the collection $\{C\cap A:C\in\mathcal{C}\}$ . Then, given two vertices $s,t\in V$ and error parameter $\epsilon\in(0,1]$ , there is an algorithm that outputs a $(1+\epsilon)$ -fair $(s,t)$ -cut/flow pair in $\tilde{O}((L\alpha\beta/\epsilon)^{O(1)}(|A|+m^{\prime}))$ time, where $m^{\prime}$ is the number of edges in $G$ incident to vertices in $A$ .

5.3 Cut-Matching Game and Trimming

We follow the cut-matching game treatment in [34]: either find a “balanced” cut of small capacity, or ensure that a “large” part of the graph mixes with low congestion. The following lemma is similar to Theorem 2.2 of [34] with one key difference: there is no built-in flow subroutine, so the algorithm makes black-box calls to the fair cut/flow algorithm of Theorem 5.10.

In past work [33, 1], the analysis of the cut-matching game for capacitated graphs has only been sketched, referencing the fact that a capacitated graph can be modelled by an uncapacitated graph with parallel edges (at a cost). For completeness, we provide a full proof of this capacitated case in Appendix A.

Theorem 5.11 (Cut-Matching).

Consider a graph $G=(V,E)$ with integral edge capacities in the range $[1,W]$ . Let $A\subseteq V$ be a vertex subset, let $\phi,\kappa>0$ be parameters, and define $\mathbf{d}\in\mathbb{R}^{A}_{\geq 0}$ as $\mathbf{d}=\textup{deg}_{\partial\mathcal{P}_{L}\cup\partial A}|_{A}$ . Suppose that the following assumption holds:

$(\diamond)$

There is a flow on $G$ with congestion $\kappa$ such that each vertex $v\in A$ is the source of $c_{G}(\{v\},V\setminus A)$ flow and each vertex $v\in V$ is the sink of at most $\deg_{\partial\mathcal{P}_{L}}(v)$ flow.

There exists parameter $T=O(\log^{2}(nW))$ and a randomized, Monte Carlo algorithm that outputs a (potentially empty) set $R\subseteq V$ such that

1.

$\delta_{G[A]}R\leq\phi\mathbf{d}(R)+\frac{\phi}{6T}\mathbf{d}(A)$ ,
2.

$\mathbf{d}(R)\leq\mathbf{d}(A)/2$ , and
3.

Either $\mathbf{d}(R)\geq\mathbf{d}(A)/(6T)$ , or the vertex weighting $\mathbf{d}|_{A\setminus R}$ mixes in $G[A]$ with congestion $5T/\phi$ with high probability.

The algorithm makes at most $T$ calls to Theorem 5.10 with parameters

A\leftarrow A,\,\epsilon\leftarrow\frac{1}{18T^{2}},\,\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow\max\{1,(24\phi+\epsilon\gamma)(\kappa+2)\}.

Outside these calls, the algorithm takes an additional $O((|A|+m^{\prime})\log^{4}(nW))$ time, where $m^{\prime}$ is the number of edges in $G$ incident to vertices in $A$ .
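For concreteness, the parameter settings passed to Theorem 5.10 can be evaluated exactly. The helper below is a hypothetical illustration, not part of the paper: it takes $T$ as an explicit integer input (the paper only specifies $T=O(\log^{2}(nW))$ ) and uses rational arithmetic to avoid rounding.

```python
from fractions import Fraction

def cut_matching_params(T, phi, kappa):
    """Evaluate epsilon, gamma, beta as set in Theorem 5.11 (illustrative helper).

    T     : positive integer (number of rounds; its constant is an input here)
    phi   : Fraction, the parameter phi > 0
    kappa : int or Fraction, the congestion parameter of assumption (diamond)
    """
    eps = Fraction(1, 18 * T * T)   # epsilon <- 1 / (18 T^2)
    gamma = eps * phi / 2           # gamma   <- epsilon * phi / 2
    beta = max(Fraction(1), (24 * phi + eps * gamma) * (kappa + 2))
    return eps, gamma, beta
```

For example, with $T=1$ , $\phi=1/24$ , and $\kappa=10$ , the helper returns $\epsilon=1/18$ , $\gamma=1/864$ , and $\beta=15553/1296$ .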

Property (3) asserts that either an approximately balanced cut is found, or the vertex weighting $\mathbf{d}|_{A\setminus R}$ mixes with low congestion (with high probability). In our algorithm for Theorem 5.1, we actually want the weighting $\mathbf{d}|_{A\setminus R}+\deg_{\partial R}$ to mix in the second case. To guarantee this stronger property, we augment the set $R$ into $R\cup B$ through one additional call to the fair cut/flow algorithm of Theorem 5.10. The algorithm is similar to the flow-based expander trimming procedure in [34]. For completeness, we defer the algorithm and proof to Appendix B. Note that the setting, including assumption $(\diamond)$ , is the same as in Theorem 5.11.

Theorem 5.12 (Trimming).

Consider a graph $G=(V,E)$ with integral edge capacities in the range $[1,W]$ . Let $A\subseteq V$ be a vertex subset, let $\phi,\kappa>0$ be parameters, and define $\mathbf{d}\in\mathbb{R}^{A}_{\geq 0}$ as $\mathbf{d}=\textup{deg}_{\partial\mathcal{P}_{L}\cup\partial A}|_{A}$ . Suppose that the following assumption holds:

$(\diamond)$

There is a flow on $G$ with congestion $\kappa$ such that each vertex $v\in A$ is the source of $c_{G}(\{v\},V\setminus A)$ flow and each vertex $v\in V$ is the sink of at most $\deg_{\partial\mathcal{P}_{L}}(v)$ flow.

There is a deterministic algorithm that inputs a subset $R\subseteq A$ and a parameter $\epsilon>0$ , and outputs a (potentially empty) set $B\subseteq A$ such that

1.

$\delta_{G[A]}B\leq 2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A)$ ,
2.

$\mathbf{d}(B\setminus R)\leq\frac{1}{6\phi}\,\delta_{G[A]}R+\frac{\epsilon}{6}\mathbf{d}(A)$ ,
3.

If the vertex weighting $\mathbf{d}|_{A\setminus R}$ mixes in $G[A]$ with congestion $c$ , then the vertex weighting $(\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)})|_{A\setminus(R\cup B)}$ mixes in $G[A]$ with congestion $2+(1+24\phi)c$ , and
4.

There exists a vector $\mathbf{t}\in\mathbb{R}^{A}_{\geq 0}$ with $\mathbf{t}\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)}$ and a flow $g$ on $G[A\setminus(R\cup B)]$ routing demand $\textup{deg}_{\partial_{G[A]}(R\cup B)}|_{A\setminus(R\cup B)}-\mathbf{t}$ with congestion $2$ .

The algorithm makes one call to Theorem 5.10 with parameters

A\leftarrow A,\,\epsilon\leftarrow\epsilon,\,\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow\max\{1,(12\phi+\epsilon\gamma)(\kappa+2)\}.

Outside of this call, the algorithm takes an additional $O(|A|+m^{\prime})$ time, where $m^{\prime}$ is the number of edges in $G$ incident to vertices in $A$ .

5.4 Clustering Algorithm

With the necessary primitives established, we now describe how to construct partition $\mathcal{P}_{L+1}$ given the partitions $\mathcal{P}_{1},\ldots,\mathcal{P}_{L}$ . The algorithm is recursive, taking as input a vertex subset $A\subseteq V$ that is initially $V$ . Throughout, we maintain the invariant that each input subset satisfies assumption $(\diamond)$ , which is the same for Theorems 5.11 and 5.12.

On input $A\subseteq V$ , the algorithm calls Theorem 5.11 with parameters $\phi\leftarrow\frac{1}{C\log^{3}(nW)}$ and $\kappa\leftarrow C\log^{3}(nW)$ for a large enough constant $C>0$ . The algorithm obtains an output set $R\subseteq A$ and then calls Theorem 5.12 on inputs $R\leftarrow R$ and $\epsilon\leftarrow 1/(4T)$ with the same parameters $\phi,\kappa$ , obtaining a set $B\subseteq A$ . There are now two cases:

1.

If $\mathbf{d}(R)\geq\mathbf{d}(A)/(6T)$ , then recursively call the algorithm on inputs $R\cup B$ and $A\setminus(R\cup B)$ if they are nonempty.
2.

Otherwise, make a single recursive call on input $R\cup B$ if it is nonempty, and add the set $A\setminus(R\cup B)$ to the final partition $\mathcal{P}_{L+1}$ .
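The control flow of the recursion can be sketched as follows. This is only a skeleton under stated assumptions: `cut_matching`, `trimming`, and `weight` are hypothetical callables standing in for Theorem 5.11, Theorem 5.12, and the evaluation of $\mathbf{d}(X)$ for the current instance $A$ , and the recursion is unrolled with an explicit stack. The early exit when no progress is made is a safeguard added for this sketch and is not part of the description above.

```python
def build_next_partition(V, T, weight, cut_matching, trimming):
    """Skeleton of the clustering recursion; returns the parts of P_{L+1}.

    weight(A, X)       : d(X), with d defined relative to the instance A (hypothetical)
    cut_matching(A)    : stand-in for Theorem 5.11, returns R, a subset of A
    trimming(A, R, e)  : stand-in for Theorem 5.12 with epsilon = e, returns B
    """
    parts = []
    stack = [frozenset(V)]
    while stack:
        A = stack.pop()
        R = frozenset(cut_matching(A))
        B = frozenset(trimming(A, R, 1 / (4 * T)))
        RB = (R | B) & A
        rest = A - RB
        if not RB or not rest:
            # Sketch-only safeguard: no split happened, so A becomes a part.
            parts.append(A)
            continue
        if weight(A, R) >= weight(A, A) / (6 * T):
            # Case 1: recurse on both R u B and A \ (R u B).
            stack.extend([RB, rest])
        else:
            # Case 2: A \ (R u B) joins P_{L+1}; recurse only on R u B.
            parts.append(rest)
            stack.append(RB)
    return parts
```

Since every instance is either split into two disjoint nonempty pieces or emitted as a part, the returned sets always form a partition of the input.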

Claim 5.13.

Property (2) of Theorem 5.1 holds for $i=L$ , i.e., the collection of vertex weightings $\{\mathrm{deg}_{\partial\mathcal{P}_{L}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{L+1}\}$ mixes simultaneously in $G$ with congestion $O(\log^{5}(nW))$ .

Proof.

By property (3) of Theorem 5.11 and property (3) of Theorem 5.12, for each set $A\setminus(R\cup B)$ added to the final partition $\mathcal{P}_{L+1}$ , the vertex weighting $(\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)})|_{A\setminus(R\cup B)}$ mixes in $G[A]$ with congestion $2+(1+24\phi)\cdot 5T/\phi$ . Since $\partial_{G}(A\setminus(R\cup B))\subseteq\partial_{G}A\cup\partial_{G[A]}(A\setminus(R\cup B))$ , we have

	$\displaystyle\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)}$	$\displaystyle=\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}+\deg_{\partial_{G[A]}(A\setminus(R\cup B))}$
		$\displaystyle\geq\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A\cup\partial_{G[A]}(A\setminus(R\cup B))}$
		$\displaystyle\geq\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}(A\setminus(R\cup B))},$

so in particular, the vertex weighting $\textup{deg}_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}(A\setminus(R\cup B))}|_{A\setminus(R\cup B)}\leq(\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)})|_{A\setminus(R\cup B)}$ also mixes in $G[A]$ with congestion $2+(1+24\phi)\cdot 5T/\phi$ . The recursive instances $A$ that add a set $A\setminus(R\cup B)$ to $\mathcal{P}_{L+1}$ are disjoint, so the vertex weightings $\textup{deg}_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}(A\setminus(R\cup B))}|_{A\setminus(R\cup B)}$ mix simultaneously in $G$ with the same congestion. We bound the congestion by $2+(1+24\phi)\cdot 5T/\phi=O(T/\phi)=O(\log^{5}(nW))$ , concluding the proof. ∎

It remains to establish property (3) of Theorem 5.1 for $i=L$ as well as assumption $(\diamond)$ of Theorems 5.11 and 5.12. To do so, we first prove a few guarantees of the algorithm.

Claim 5.14.

For any recursive call $A^{\prime}\subseteq A$ , we have $\mathbf{d}^{\prime}(A^{\prime})\leq(1-\frac{1}{24T})\mathbf{d}(A)$ where $\mathbf{d}^{\prime}=\textup{deg}_{\partial\mathcal{P}_{L}\cup\partial A^{\prime}}|_{A^{\prime}}$ .

Proof.

We first claim that $\partial_{G}A^{\prime}\subseteq\partial_{G}A\cup\partial_{G[A]}A^{\prime}$ . For any edge in $\partial_{G}A^{\prime}$ , consider its endpoint in $V\setminus A^{\prime}$ . Either it is in $A\setminus A^{\prime}$ , in which case the edge belongs to $\partial_{G[A]}A^{\prime}$ , or it is in $V\setminus A$ , in which case the edge belongs to $\partial_{G}A$ . It follows that $\partial_{G}A^{\prime}\subseteq\partial_{G}A\cup\partial_{G[A]}A^{\prime}$ , and we can bound $\mathbf{d}^{\prime}(A^{\prime})$ as follows:

	$\displaystyle\mathbf{d}^{\prime}(A^{\prime})$	$\displaystyle=\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A^{\prime}}(A^{\prime})$
		$\displaystyle\leq\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A\cup\partial_{G[A]}A^{\prime}}(A^{\prime})$
		$\displaystyle\leq\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(A^{\prime})+\deg_{\partial_{G[A]}A^{\prime}}(A^{\prime})$
		$\displaystyle=\mathbf{d}(A^{\prime})+\delta_{G[A]}A^{\prime}.$

By properties (1) and (2) of Theorem 5.11, we have $\delta_{G[A]}R\leq\phi\mathbf{d}(R)+\frac{\phi}{6T}\mathbf{d}(A)$ and $\mathbf{d}(R)\leq\mathbf{d}(A)/2$ . By properties (1) and (2) of Theorem 5.12, we have $\delta_{G[A]}B\leq 2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A)$ and $\mathbf{d}(B\setminus R)\leq\frac{1}{6\phi}\,\delta_{G[A]}R+\frac{\epsilon}{6}\mathbf{d}(A)$ . The only two options for the recursive instance $A^{\prime}$ are $A^{\prime}=R\cup B$ and $A^{\prime}=A\setminus(R\cup B)$ , and in both cases, we have

	$\displaystyle\delta_{G[A]}A^{\prime}=\delta_{G[A]}(R\cup B)$	$\displaystyle\leq\delta_{G[A]}R+\delta_{G[A]}B$
		$\displaystyle\leq\delta_{G[A]}R+2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A)$
		$\displaystyle=3\delta_{G[A]}R+\frac{\phi}{2T}\mathbf{d}(A)$
		$\displaystyle\leq 3\left(\phi\mathbf{d}(R)+\frac{\phi}{6T}\mathbf{d}(A)\right)+\frac{\phi}{2T}\mathbf{d}(A)$
		$\displaystyle=3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A).$

Combining the two bounds so far, we obtain

\mathbf{d}^{\prime}(A^{\prime})\leq\mathbf{d}(A^{\prime})+3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A).

To bound $\mathbf{d}(A^{\prime})$ , we case on whether $A^{\prime}=R\cup B$ or $A^{\prime}=A\setminus(R\cup B)$ . If $A^{\prime}=A\setminus(R\cup B)$ , then we must be in case (1) of the algorithm, which means $\mathbf{d}(R)\geq\mathbf{d}(A)/(6T)$ . In this case, we bound $\mathbf{d}(A^{\prime})=\mathbf{d}(A\setminus(R\cup B))\leq\mathbf{d}(A\setminus R)=\mathbf{d}(A)-\mathbf{d}(R)$ . Together with the bound $\phi\leq 1/24$ , we obtain

	$\displaystyle\mathbf{d}^{\prime}(A^{\prime})$	$\displaystyle\leq\mathbf{d}(A^{\prime})+3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A)$
		$\displaystyle\leq\mathbf{d}(A)-\mathbf{d}(R)+3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A)$
		$\displaystyle\leq\mathbf{d}(A)-\frac{1}{2}\mathbf{d}(R)+\frac{1}{24T}\mathbf{d}(A)$
		$\displaystyle\leq\mathbf{d}(A)-\frac{1}{2}\cdot\frac{\mathbf{d}(A)}{6T}+\frac{1}{24T}\mathbf{d}(A)$
		$\displaystyle=\left(1-\frac{1}{24T}\right)\mathbf{d}(A),$

as promised. Otherwise, suppose that $A^{\prime}=R\cup B$ . We have

	$\displaystyle\mathbf{d}(R\cup B)$	$\displaystyle=\mathbf{d}(R)+\mathbf{d}(B\setminus R)$
		$\displaystyle\leq\mathbf{d}(R)+\frac{1}{6\phi}\delta_{G[A]}R+\frac{\epsilon}{6}\mathbf{d}(A)$
		$\displaystyle\leq\mathbf{d}(R)+\frac{1}{6\phi}\left(\phi\mathbf{d}(R)+\frac{\phi}{6T}\mathbf{d}(A)\right)+\frac{\epsilon}{6}\mathbf{d}(A)$
		$\displaystyle\leq\frac{7}{6}\mathbf{d}(R)+\frac{1}{36T}\mathbf{d}(A)+\frac{\epsilon}{6}\mathbf{d}(A)$
		$\displaystyle\leq\frac{7}{6}\cdot\frac{1}{2}\mathbf{d}(A)+\frac{1}{36}\mathbf{d}(A)+\frac{1}{6}\mathbf{d}(A)$
		$\displaystyle=\frac{7}{9}\mathbf{d}(A)$
		$\displaystyle\leq\left(1-\frac{1}{24T}\right)\mathbf{d}(A),$

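In particular, $\mathbf{d}(A^{\prime})\leq\frac{7}{9}\mathbf{d}(A)$ . Using $\phi\leq 1/24$ , $\mathbf{d}(R)\leq\mathbf{d}(A)/2$ , and $T\geq 1$ , the remaining terms satisfy $3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A)\leq(\frac{1}{16}+\frac{1}{24})\mathbf{d}(A)$ , so $\mathbf{d}^{\prime}(A^{\prime})\leq\frac{127}{144}\mathbf{d}(A)\leq(1-\frac{1}{24T})\mathbf{d}(A)$ , as promised. With both cases established, this concludes the proof. ∎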
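The arithmetic in the two cases can be double-checked with exact rational numbers. The sketch below is an illustration and not part of the proof: it evaluates the coefficient of $\mathbf{d}(A)$ in the bound on $\mathbf{d}^{\prime}(A^{\prime})$ at the worst-case value of $\mathbf{d}(R)$ in each case, with $\epsilon=1/(4T)$ and including the additive terms $3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A)$ in both cases.

```python
from fractions import Fraction as F

def case_bounds(T, phi):
    """Worst-case coefficients of d(A) bounding d'(A') in Claim 5.14 (sanity check).

    T   : positive integer
    phi : Fraction with phi <= 1/24
    """
    eps = F(1, 4 * T)
    # Case A' = A \ (R u B): the bound decreases in d(R), so take d(R) = d(A)/(6T).
    r = F(1, 6 * T)
    case1 = 1 - r + 3 * phi * r + phi / T
    # Case A' = R u B: the bound increases in d(R), so take d(R) = d(A)/2.
    r = F(1, 2)
    case2 = F(7, 6) * r + F(1, 36 * T) + eps / 6 + 3 * phi * r + phi / T
    return case1, case2
```

For $T=1$ and $\phi=1/24$ this gives $43/48$ and $109/144$ , both below $1-\frac{1}{24T}=23/24$ .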

For a given recursive call $A$ , define its recursion depth inductively as follows: the initial call $A\leftarrow V$ has depth $0$ , and given a recursive call $A$ of depth $d$ , all of its recursive calls have depth $d+1$ . By Claim 5.14, the value of $\mathbf{d}(A)$ shrinks to at most a $(1-\frac{1}{24T})$ fraction of its previous value on each recursive call, so the maximum recursion depth is $O(T\log(nW))$ .
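The depth bound can be made explicit as follows. This is a sketch under two assumptions that are not spelled out above: the initial value satisfies $\mathbf{d}(V)\leq 2mW$ (twice the total capacity), and by integrality of the capacities a nonzero value of $\mathbf{d}(A)$ is at least $1$ , so the bound is only needed until it drops below $1$ . Here $A_{k}$ denotes an instance at depth $k$ and $\mathbf{d}_{k}$ its weighting.

```latex
\begin{align*}
\mathbf{d}_{k}(A_{k})
  &\leq \left(1-\tfrac{1}{24T}\right)^{k}\mathbf{d}_{0}(V)
   \leq e^{-k/(24T)}\cdot 2mW, \\
e^{-k/(24T)}\cdot 2mW < 1
  &\iff k > 24T\ln(2mW).
\end{align*}
```

Since $m\leq n^{2}$ , we have $\ln(2mW)=O(\log(nW))$ , which gives depth $O(T\log(nW))$ .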

For a given recursion depth $d$ , let $E_{d}\subseteq E$ denote the union of edges $\partial_{G[A]}(R\cup B)$ over all instances $A$ of depth $d$ . By construction of the algorithm, the (disjoint) union of $E_{d}$ over all recursion depths $d$ is exactly $\partial\mathcal{P}_{L+1}$ . To avoid clutter, we also define $E_{<d}=E_{0}\cup\cdots\cup E_{d-1}$ .

Claim 5.15.

For any recursion depth $d\geq 0$ , there is a flow on $G$ with congestion $4$ such that each vertex $v\in V$ sends $\deg_{E_{d}}(v)$ flow and receives at most $48\phi\deg_{\partial\mathcal{P}_{L}\cup E_{<d}}(v)$ flow.

Proof.

We prove the statement by induction on $d\geq 0$ . The base case $d=0$ is satisfied with the empty flow: since $\mathcal{A}_{0}$ is the partition $\{V\}$ with a single part, each vertex $v\in A$ indeed sends $\deg_{\partial A_{0}}(v)=0$ flow. Now assume by induction that there is a flow on $G$ with congestion $4$ such that each vertex $v\in A$ sends $\deg_{E_{d}}(v)$ flow and receives at most $48\phi\deg_{\partial\mathcal{P}_{L}}(v)$ flow.

For each instance $A$ of depth $d$ , the algorithm calls Theorem 5.12 which defines $\mathbf{d}=\textup{deg}_{\partial\mathcal{P}_{L}\cup\partial A}|_{A}$ . By property (4) of Theorem 5.12, there exists a vector $\mathbf{t}\in\mathbb{R}^{A}_{\geq 0}$ with $\mathbf{t}\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)}$ and a flow $g$ on $G[A\setminus(R\cup B)]$ routing demand $\textup{deg}_{\partial_{G[A]}(R\cup B)}|_{A\setminus(R\cup B)}-\mathbf{t}$ with congestion $2$ . We now construct a flow in $G[A]$ with congestion $4$ such that each vertex $v\in A$ sends $\deg_{E_{d}}(v)=\deg_{\partial_{G[A]}(R\cup B)}(v)$ flow and receives at most $48\phi\deg_{\partial\mathcal{P}_{L}\cup E_{<d}}(v)$ flow. First, for each edge in $\partial_{G[A]}(R\cup B)$ , send flow to full capacity in the direction from $R\cup B$ to $A\setminus(R\cup B)$ . In this initial flow, each vertex $v\in R\cup B$ sends exactly $\deg_{\partial_{G[A]}(R\cup B)}(v)$ flow, and each vertex $v\in A\setminus(R\cup B)$ receives exactly $\deg_{\partial_{G[A]}(R\cup B)}(v)$ flow. Next, we send the flow $g$ scaled by $2$ , so that each vertex $v\in A\setminus(R\cup B)$ sends exactly $2\deg_{\partial_{G[A]}(R\cup B)}(v)$ flow and each vertex $v\in A\setminus(R\cup B)$ receives at most $48\phi\mathbf{d}(v)$ flow. Note that $\mathbf{d}(v)=\deg_{\partial\mathcal{P}_{L}\cup\partial A}(v)\leq\deg_{\partial\mathcal{P}_{L}\cup E_{<d}}(v)$ since $\partial A\subseteq E_{<d}$ . Summing the two flows, we obtain a flow in $G[A]$ such that each vertex $v\in A$ sends $\deg_{E_{d}}(v)=\deg_{\partial_{G[A]}(R\cup B)}(v)$ flow and receives at most $48\phi\deg_{\partial\mathcal{P}_{L}\cup E_{<d}}(v)$ flow. The congestion of the flow is at most $4$ , since edges in $\partial_{G[A]}(R\cup B)$ have congestion $1$ in the initial flow, edges in $G[A\setminus(R\cup B)]$ have congestion $2\cdot 2=4$ in the flow $g$ scaled by $2$ , and these two edge sets are disjoint.

To complete the induction, our final flow is the union of the constructed flow over all recursive instances $A$ of depth $d$ . Since the flow for instance $A$ is in $G[A]$ , and since the instances $A$ are disjoint, the flows are also disjoint over all $A$ . It follows that their union is a flow on $G$ with congestion $4$ such that each vertex $v\in V$ sends $\deg_{E_{d}}(v)$ flow and receives at most $48\phi\deg_{\partial\mathcal{P}_{L}\cup E_{<d}}(v)$ flow. ∎

Finally, the two claims below establish property (3) of Theorem 5.1 and assumption $(\diamond)$ , respectively.

Claim 5.16.

Property (3) of Theorem 5.1 holds for $i=L$ , i.e., there is a flow in $G$ with congestion $O(T\log(nW))$ such that each vertex $v\in V$ sends $\deg_{\partial\mathcal{P}_{L+1}}(v)$ flow and receives at most $\frac{1}{2}\deg_{\partial\mathcal{P}_{L}}(v)$ flow.

Proof.

Let $D=O(T\log(nW))$ be the maximum recursion depth. Summing the flows from Claim 5.15 over all recursion depths $d$ , and using that $E_{0}\cup\cdots\cup E_{D}=\partial\mathcal{P}_{L+1}$ , we obtain a flow with congestion $O(T\log(nW))$ such that each vertex $v\in V$ sends $\deg_{\partial\mathcal{P}_{L+1}}(v)$ flow and receives at most $48D\phi\deg_{\partial\mathcal{P}_{L}\cup\partial\mathcal{P}_{L+1}}(v)$ flow. Recall that we set $\phi\leftarrow\frac{1}{C\log^{3}(nW)}$ for large enough constant $C>0$ . We choose $C$ large enough that $48D\phi\leq 1/3$ , so that each vertex $v\in V$ receives at most $\frac{1}{3}\deg_{\partial\mathcal{P}_{L}\cup\partial\mathcal{P}_{L+1}}(v)\leq\frac{1}{3}\deg_{\partial\mathcal{P}_{L}}(v)+\frac{1}{3}\deg_{\partial\mathcal{P}_{L+1}}(v)$ flow. We can cancel out at most $\frac{1}{3}\deg_{\partial\mathcal{P}_{L+1}}(v)$ flow received at each vertex $v\in V$ from the $\deg_{\partial\mathcal{P}_{L+1}}(v)$ flow sent. After cancellation, we obtain a flow with congestion $O(T\log(nW))$ such that each vertex $v\in V$ sends at least $\frac{2}{3}\deg_{\partial\mathcal{P}_{L+1}}(v)$ flow and receives at most $\frac{1}{3}\deg_{\partial\mathcal{P}_{L}}(v)$ flow. Scaling the flow by factor $3/2$ , taking a path decomposition, and removing enough paths until each vertex is the start of exactly $\deg_{\partial\mathcal{P}_{L+1}}(v)$ paths, we obtain the desired flow with congestion $O(T\log(nW))$ . ∎

Claim 5.17.

For each recursive instance $A$ , assumption $(\diamond)$ of Theorems 5.11 and 5.12 holds, i.e., there is a flow on $G$ with congestion $O(\log^{3}(nW))$ such that each vertex $v\in A$ is the source of $c_{G}(\{v\},V\setminus A)$ flow and each vertex $v\in V$ is the sink of at most $\deg_{\partial\mathcal{P}_{L}}(v)$ flow.

Proof.

By Claim 5.16, there is a flow in $G$ with congestion $O(T\log(nW))=O(\log^{3}(nW))$ such that each vertex $v\in V$ sends $\deg_{\partial\mathcal{P}_{L+1}}(v)$ flow and receives at most $\frac{1}{2}\deg_{\partial\mathcal{P}_{L}}(v)$ flow. Observe that for any recursive instance $A$ , we have $\partial A\subseteq\partial\mathcal{P}_{L+1}$ since the recursive algorithm starting at instance $A$ adds a partition of $A$ into $\mathcal{P}_{L+1}$ . In particular, each vertex $v\in A$ sends at least $\deg_{\partial A}(v)=c_{G}(\{v\},V\setminus A)$ flow. Take a path decomposition of the flow and remove enough paths until each vertex $v\in A$ is the start of exactly $c_{G}(\{v\},V\setminus A)$ paths. The resulting flow satisfies assumption $(\diamond)$ , concluding the proof. ∎

It remains to bound the running time of the algorithm for Theorem 5.1. For each instance $A$ , Theorems 5.11 and 5.12 run in $\tilde{O}(|A|+m^{\prime})$ time plus $O(\log^{2}(nW))$ calls to Theorem 5.10, which takes $\tilde{O}(|A|+m^{\prime})$ time per call, for a total time of $\tilde{O}(|A|+m^{\prime})$ . The instances $A$ on a given recursion depth are disjoint, so the sum of $|A|+m^{\prime}$ over all such instances $A$ is $O(m)$ . The maximum recursion depth is $O(T\log(nW))=O(\log^{3}(nW))$ , so the sum of $|A|+m^{\prime}$ over all instances of the algorithm is $O(m\log^{3}(nW))$ . It follows that the algorithm of Theorem 5.1 runs in $\tilde{O}(m)$ time.

6 Approximate Maximum Flow

From Theorem 5.1 and Claim 5.2, we obtain an algorithm that constructs a congestion-approximator of quality $O(\log^{10}(nW))$ in $\tilde{O}(m)$ time. Recall that Sherman’s framework [36, 37] translates a congestion-approximator of quality $\alpha$ to a $(1+\epsilon)$ -approximate maximum flow algorithm with running time $\tilde{O}(\epsilon^{-1}\alpha m)$ . Thus, for any parameter $\epsilon>0$ , we obtain a $(1+\epsilon)$ -approximate maximum flow algorithm with running time $\tilde{O}(\epsilon^{-1}m)$ .

References

[1] Arpit Agarwal, Sanjeev Khanna, Huan Li, Prathamesh Patil, Chen Wang, Nathan White, and Peilin Zhong. Parallel approximate maximum flows in near-linear work and polylogarithmic depth. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3997–4061. SIAM, 2024.
[2] Sanjeev Arora, Elad Hazan, and Satyen Kale. $O(\sqrt{\log n})$ approximation to sparsest cut in $\tilde{O}(n^{2})$ time. SIAM Journal on Computing, 39(5):1748–1771, 2010.
[3] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of computing, 8(1):121–164, 2012.
[4] Sanjeev Arora, James R Lee, and Assaf Naor. Euclidean distortion and the sparsest cut. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 553–562, 2005.
[5] Sanjeev Arora, Satish Rao, and Umesh Vazirani. Expander flows, geometric embeddings and graph partitioning. Journal of the ACM (JACM), 56(2):1–37, 2009.
[6] Marcin Bienkowski, Miroslaw Korzeniowski, and Harald Räcke. A practical algorithm for constructing oblivious routing schemes. In Proceedings of the Fifteenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’03, page 24–33, New York, NY, USA, 2003. Association for Computing Machinery.
[7] Jeff Cheeger. A lower bound for the smallest eigenvalue of the laplacian. Problems in analysis, 625(195-199):110, 1970.
[8] Li Chen, Rasmus Kyng, Yang P Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva. Maximum flow and minimum-cost flow in almost-linear time. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 612–623. IEEE, 2022.
[9] Paul Christiano, Jonathan A Kelner, Aleksander Madry, Daniel A Spielman, and Shang-Hua Teng. Electrical flows, laplacian systems, and faster approximation of maximum flow in undirected graphs. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 273–282, 2011.
[10] Michael B Cohen, Rasmus Kyng, Gary L Miller, Jakub W Pachocki, Richard Peng, Anup B Rao, and Shen Chen Xu. Solving SDD linear systems in nearly $m\log^{1/2}n$ time. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 343–352, 2014.
[11] Yefim Dinitz. Dinitz’ algorithm: The original version and Even’s version. In Theoretical Computer Science: Essays in Memory of Shimon Even, pages 218–240. Springer, 2006.
[12] Lester Randolph Ford and Delbert R Fulkerson. Maximal flow through a network. Canadian journal of Mathematics, 8:399–404, 1956.
[13] Gramoz Goranci, Harald Räcke, Thatchaphol Saranurak, and Zihan Tan. The expander hierarchy and its applications to dynamic graph algorithms. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2212–2228. SIAM, 2021.
[14] Gramoz Goranci, Harald Räcke, Thatchaphol Saranurak, and Zihan Tan. The expander hierarchy and its applications to dynamic graph algorithms. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2212–2228. SIAM, 2021.
[15] Chris Harrelson, Kirsten Hildrum, and Satish Rao. A polynomial-time tree decomposition to minimize congestion. In Proceedings of the Fifteenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’03, page 34–43, New York, NY, USA, 2003. Association for Computing Machinery.
[16] Arun Jambulapati and Aaron Sidford. Ultrasparse ultrasparsifiers and faster laplacian system solvers. ACM Transactions on Algorithms, 2021.
[17] Jonathan A Kelner, Yin Tat Lee, Lorenzo Orecchia, and Aaron Sidford. An almost-linear-time algorithm for approximate max flow in undirected graphs, and its multicommodity generalizations. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 217–226. SIAM, 2014.
[18] Rohit Khandekar, Satish Rao, and Umesh Vazirani. Graph partitioning using single commodity flows. Journal of the ACM (JACM), 56(4):1–15, 2009.
[19] Rohit Khandekar, Satish Rao, and Umesh Vazirani. Graph partitioning using single commodity flows. Journal of the ACM (JACM), 56(4):1–15, 2009.
[20] Philip Klein, Satish Rao, Ajit Agrawal, and R Ravi. An approximate max-flow min-cut relation for undirected multicommodity flow, with applications. Combinatorica, 15(2):187–202, 1995.
[21] Ioannis Koutis, Gary L Miller, and Richard Peng. A nearly-m log n time solver for sdd linear systems. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, pages 590–598. IEEE, 2011.
[22] Rasmus Kyng and Sushant Sachdeva. Approximate gaussian elimination for laplacians-fast, sparse, and simple. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 573–582. IEEE, 2016.
[23] Yin Tat Lee, Satish Rao, and Nikhil Srivastava. A new approach to computing maximum flows using electrical flows. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 755–764, 2013.
[24] Tom Leighton and Satish Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. J. ACM, 46(6):787–832, nov 1999.
[25] Jason Li, Danupon Nanongkai, Debmalya Panigrahi, and Thatchaphol Saranurak. Near-linear time approximations for cut problems via fair cuts. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 240–275. SIAM, 2023.
[26] Nathan Linial, Eran London, and Yuri Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica, 15:215–245, 1995.
[27] Aleksander Madry. Fast approximation algorithms for cut-based problems in undirected graphs. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 245–254, 2010.
[28] Lorenzo Orecchia, Leonard J Schulman, Umesh V Vazirani, and Nisheeth K Vishnoi. On partitioning graphs via single commodity flows. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 461–470, 2008.
[29] Richard Peng. A note on cut-approximators and approximating undirected max flows. CoRR, abs/1411.7631, 2014.
[30] Richard Peng. Approximate undirected maximum flows in $O(m\,\mathrm{polylog}(n))$ time. In Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms, pages 1862–1867. SIAM, 2016.
[31] Harald Räcke. Minimizing congestion in general networks. In Proceedings of the 43rd Symposium on Foundations of Computer Science, FOCS ’02, page 43–52, USA, 2002. IEEE Computer Society.
[32] Harald Räcke. Optimal hierarchical decompositions for congestion minimization in networks. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, STOC ’08, page 255–264, New York, NY, USA, 2008. Association for Computing Machinery.
[33] Harald Räcke, Chintan Shah, and Hanjo Täubig. Computing cut-based hierarchical decompositions in almost linear time. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 227–238. SIAM, 2014.
[34] Thatchaphol Saranurak and Di Wang. Expander decomposition and pruning: Faster, stronger, and simpler. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2616–2635. SIAM, 2019.
[35] Jonah Sherman. Breaking the multicommodity flow barrier for $O(\sqrt{\log n})$ -approximations to sparsest cut. In 2009 50th Annual IEEE Symposium on Foundations of Computer Science, pages 363–372, 2009.
[36] Jonah Sherman. Nearly maximum flows in nearly linear time. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 263–269. IEEE, 2013.
[37] Jonah Sherman. Area-convexity, $\ell_{\infty}$ regularization, and undirected multicommodity flow. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 452–460, 2017.
[38] Daniel A Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, pages 81–90, 2004.

Appendix A Cut-Matching Game

In this section, we prove Theorem 5.11, whose statement appears in Section 5.

Our setup resembles Appendix B of [34] with a few minor changes. In particular, since our flow routine is more restrictive, we have to adapt the algorithm to handle our flow outputs.

We begin with notation from [34]. For simplicity, we avoid working with the subdivision graph in [33, 34]. Define an $A$ -commodity flow as a multi-commodity flow where each vertex $v\in A$ is the source of quantity $\mathbf{d}(v)$ of its distinct flow commodity. Only for analysis, we consider an $A\times A$ flow-matrix $\mathbf{F}\in\mathbb{R}^{A\times A}_{\geq 0}$ which encodes information about an $A$ -commodity flow. We say that $\mathbf{F}$ is routable with congestion $c$ if there exists an $A$ -commodity flow $f$ such that, simultaneously for all $u,v\in A$ , we have that $u$ can send quantity $\mathbf{F}(u,v)$ of its own commodity to $v$ , and the amount of flow through each edge is at most $c$ .

The algorithm initializes flow-matrix $\mathbf{F}_{0}\in\mathbb{R}^{A\times A}_{\geq 0}$ as the diagonal matrix with value $\mathbf{d}(v)$ on entry $\mathbf{F}_{0}(v,v)$ . Trivially, $\mathbf{F}_{0}$ is routable with zero congestion. The algorithm initializes $A_{0}=A$ and $R_{0}=\emptyset$ , and then proceeds for at most $T=O(\log^{2}(nW))$ rounds. For each round $t\in[T]$ , the algorithm implicitly updates $\mathbf{F}_{t-1}$ to $\mathbf{F}_{t}$ such that it is routable with congestion at most $t/\phi$ . The operation for implicitly updating $\mathbf{F}_{t-1}$ will be described explicitly later on, but we ensure that row sums do not change from $\mathbf{F}_{t-1}$ to $\mathbf{F}_{t}$ , i.e., there is always $\mathbf{d}(v)$ total quantity of each commodity $v\in A$ spread among the vertices. For each round $t\in[T]$ , the algorithm (explicitly) finds a partition of $A_{t-1}$ into $A_{t}^{\ell},A_{t}^{r}$ , and then computes

(i)

A (possibly empty) set $S_{t}\subseteq A$ satisfying $\delta_{G[A]}S_{t}\leq\phi\mathbf{d}(S_{t}\cap A_{t-1})+\frac{\phi}{6T^{2}}\mathbf{d}(A)$ and $\mathbf{d}(S_{t})\leq\mathbf{d}(A)/3$ , and
(ii)

A (possibly empty) flow $f_{t}$ from $A_{t}^{\ell}\setminus S_{t}$ to $A_{t}^{r}\setminus S_{t}$ such that each vertex $v\in A_{t}^{\ell}\setminus S_{t}$ is the source of at least $\mathbf{d}(v)/12$ flow, and each vertex $v\in A_{t}^{r}\setminus S_{t}$ is the sink of at most $\mathbf{d}(v)$ flow. The flow $f_{t}$ has congestion $1/\phi$ .

The algorithm then updates $A_{t}\leftarrow A_{t-1}\setminus S_{t}$ and $R_{t}\leftarrow R_{t-1}\cup S_{t}$ . Note that on each round $t$ , the sets $A_{t}$ and $R_{t}$ partition $A$ . If $\mathbf{d}(R_{t})\geq\mathbf{d}(A)/(6T)$ holds, then the algorithm immediately terminates and outputs $R=R_{t}$ . Otherwise, we have $\mathbf{d}(R_{T})<\mathbf{d}(A)/(6T)$ at the end, and the algorithm outputs $R=R_{T}$ .
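The bookkeeping of $A_{t}$ and $R_{t}$ described above can be sketched in a few lines of Python. This is only an illustrative sketch: the callback `round_step` is a hypothetical stand-in for steps (i) and (ii) and is assumed to return the set $S_{t}$ ; the flow $f_{t}$ and the flow-matrix are not modeled.

```python
from typing import Callable, Dict, Hashable, Set


def cut_matching_outer_loop(
    A: Set[Hashable],
    d: Dict[Hashable, float],
    T: int,
    round_step: Callable[[int, Set[Hashable]], Set[Hashable]],
) -> Set[Hashable]:
    """Track A_t and R_t only; `round_step(t, A_prev)` is a hypothetical
    callback standing in for steps (i) and (ii) and returns S_t."""
    total = sum(d[v] for v in A)              # d(A)
    A_t, R_t = set(A), set()                  # A_0 = A, R_0 = empty set
    for t in range(1, T + 1):
        S_t = set(round_step(t, A_t)) & set(A)   # S_t is a subset of A
        A_t = A_t - S_t                       # A_t <- A_{t-1} \ S_t
        R_t = R_t | S_t                       # R_t <- R_{t-1} union S_t
        # A_t and R_t always partition A.
        assert A_t | R_t == set(A) and not (A_t & R_t)
        if sum(d[v] for v in R_t) >= total / (6 * T):
            return R_t                        # early termination with R = R_t
    return R_t                                # here d(R_T) < d(A) / (6T)
```

In this sketch the early-termination test is evaluated once per round, directly after $R_{t}$ is updated, which matches the order of operations in the text.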

Lemma A.1.

For any round $t$ , we have $\delta_{G[A]}R_{t}\leq\phi\mathbf{d}(R_{t})+\frac{\phi}{6T}\mathbf{d}(A)$ and $\mathbf{d}(R_{t})\leq\mathbf{d}(A)/2$ .

Proof.

We start by proving the first statement. Each time we remove a set $S_{t^{\prime}}$ in a round $t^{\prime}\leq t$ , we are guaranteed that $\delta_{G[A]}S_{t^{\prime}}\leq\phi\mathbf{d}(S_{t^{\prime}}\cap A_{t^{\prime}-1})+\frac{\phi}{6T^{2}}\mathbf{d}(A)$ . We charge the $\phi\mathbf{d}(S_{t^{\prime}}\cap A_{t^{\prime}-1})$ part to the vertices in $S_{t^{\prime}}\cap A_{t^{\prime}-1}$ so that each vertex $v\in S_{t^{\prime}}\cap A_{t^{\prime}-1}$ is charged exactly $\phi\mathbf{d}(v)$ . Since the algorithm updates $A_{t^{\prime}}\leftarrow A_{t^{\prime}-1}\setminus S_{t^{\prime}}$ and $R_{t^{\prime}}\leftarrow R_{t^{\prime}-1}\cup S_{t^{\prime}}$ , each newly charged vertex leaves $A_{t^{\prime}-1}$ and joins $R_{t^{\prime}}\subseteq R_{t}$ , so it is never charged again. In total, we charge $\sum_{t^{\prime}=1}^{t}\phi\mathbf{d}(S_{t^{\prime}}\cap A_{t^{\prime}-1})$ to the vertices in $R_{t}$ so that each vertex $v\in R_{t}$ is charged once at exactly $\phi\mathbf{d}(v)$ . It follows that $\sum_{t^{\prime}=1}^{t}\phi\mathbf{d}(S_{t^{\prime}}\cap A_{t^{\prime}-1})\leq\phi\mathbf{d}(R_{t})$ and, using $t\leq T$ ,

\delta_{G[A]}R_{t}\leq\sum_{t^{\prime}=1}^{t}\delta_{G[A]}S_{t^{\prime}}\leq\sum_{t^{\prime}=1}^{t}\left(\phi\mathbf{d}(S_{t^{\prime}}\cap A_{t^{\prime}-1})+\frac{\phi}{6T^{2}}\mathbf{d}(A)\right)\leq\phi\mathbf{d}(R_{t})+T\cdot\frac{\phi}{6T^{2}}\mathbf{d}(A),

concluding the first statement of the lemma.

For the second statement, consider the round $t$ with $\mathbf{d}(R_{t})\geq\mathbf{d}(A)/(6T)$ , if it exists, at which point the algorithm terminates. (If there is no such $t$ , then we are done.) We have $\mathbf{d}(R_{t-1})<\mathbf{d}(A)/(6T)$ and $\mathbf{d}(R_{t-1}\cup S_{t})=\mathbf{d}(R_{t})\geq\mathbf{d}(A)/(6T)$ , and the final set $S_{t}$ satisfies $\mathbf{d}(S_{t})\leq\mathbf{d}(A_{t-1})/3\leq\mathbf{d}(A)/3$ . It follows that the new set $R_{t}$ satisfies

\mathbf{d}(R_{t})=\mathbf{d}(R_{t-1}\cup S_{t})\leq\mathbf{d}(R_{t-1})+\mathbf{d}(S_{t})\leq\frac{\mathbf{d}(A)}{6T}+\frac{\mathbf{d}(A)}{3}\leq\frac{\mathbf{d}(A)}{2},

concluding the second statement and the proof. ∎

Purely for the analysis, we define a potential function $\psi(t)$ for each round $t\in[T]$ as follows. For each vertex $u\in A$ , let $\mathbf{F}_{t}(u)\in\mathbb{R}^{A}_{\geq 0}$ be row $u$ of matrix $\mathbf{F}_{t}$ ; we call $\mathbf{F}_{t}(u)$ a flow-vector of $u$ . We define the potential function

\psi(t)=\sum_{u\in A_{t}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}

where

\displaystyle\boldsymbol{\mu}_{t}=\frac{\sum_{u\in A_{t}}\mathbf{F}_{t}(u)}{\mathbf{d}(A_{t})}=\arg\min_{\boldsymbol{\mu}\in\mathbb{R}^{A}}\sum_{u\in A_{t}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}\right\rVert_{2}^{2}

(2)

is the weighted average of the flow-vectors in $A_{t}$ , which is also the minimizer of the sum defining $\psi(t)$ when that sum is treated as a function of $\boldsymbol{\mu}$ . The latter fact can be verified separately for each coordinate $v\in A$ ; namely,

\boldsymbol{\mu}_{t}(v)=\frac{\sum_{u\in A_{t}}\mathbf{F}_{t}(u,v)}{\mathbf{d}(A_{t})}=\arg\min_{x\in\mathbb{R}}\sum_{u\in A_{t}}\mathbf{d}(u)\bigg(\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)}-x\bigg)^{2}

follows from setting the derivative to zero.
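As a concrete illustration of the definitions above, the following Python sketch evaluates $\boldsymbol{\mu}_{t}$ and $\psi(t)$ for an explicitly stored flow-matrix. The explicit list-of-lists representation and the function names are assumptions made for illustration only; the algorithm itself keeps $\mathbf{F}_{t}$ implicit.

```python
def weighted_average(F, d, A_t):
    """mu_t = (sum of rows F_t(u) over u in A_t) / d(A_t)."""
    n = len(d)
    d_At = sum(d[u] for u in A_t)
    return [sum(F[u][v] for u in A_t) / d_At for v in range(n)]


def objective(F, d, A_t, mu):
    """sum over u in A_t of d(u) * || F_t(u)/d(u) - mu ||_2^2."""
    n = len(d)
    return sum(
        d[u] * sum((F[u][v] / d[u] - mu[v]) ** 2 for v in range(n))
        for u in A_t
    )


def potential(F, d, A_t):
    """psi(t): the objective evaluated at its minimizer mu_t."""
    return objective(F, d, A_t, weighted_average(F, d, A_t))
```

Evaluating `objective` at any vector other than `weighted_average(F, d, A_t)` gives a value at least as large as `potential(F, d, A_t)`, and a matrix with rows $\mathbf{F}(u)=\mathbf{d}(u)\cdot\boldsymbol{\mu}$ for a common vector $\boldsymbol{\mu}$ has potential zero.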

A.1 Small Potential Implies Mixing

We first show that if $\psi(t)\leq 1/\textup{poly}(nW)$ for a sufficiently large polynomial, then the vertex weighting $\mathbf{d}|_{A_{t}}$ mixes in $G$ .

Lemma A.2.

For any $t\in[T]$ , if $\mathbf{d}(R_{t})\leq\mathbf{d}(A)/(6T)$ and $\psi(t)\leq 1/(nW)^{C}$ for large enough constant $C>0$ , then the vertex weighting $\mathbf{d}|_{A_{t}}$ mixes in $G$ with congestion $5T/\phi$ .

For the rest of Section A.1, we prove Lemma A.2. Suppose that $\mathbf{d}(R_{t})\leq\mathbf{d}(A)/(6T)$ and $\psi(t)\leq 1/(nW)^{C}$ for large enough constant $C>0$ . We first prove two claims about the flow-matrix $\mathbf{F}_{t}$ . Let $\mathbf{F}_{t}(A_{t},A_{t})$ denote the sum $\sum_{u,v\in A_{t}}\mathbf{F}_{t}(u,v)$ .

Claim A.3.

$\mathbf{F}_{t}(A_{t},A_{t})\geq\mathbf{d}(A)/3$ .

Proof.

Since the $A$ -commodity flow routing $\mathbf{F}_{t}$ has congestion $t/\phi$ , and since $\partial_{G[A]}R_{t}$ is a cut separating $A_{t}\subseteq A\setminus R_{t}$ from $R_{t}$ , we have $\sum_{u\in A_{t},v\in R_{t}}\mathbf{F}_{t}(u,v)\leq t/\phi\cdot\delta_{G[A]}R_{t}$ . Since $\delta_{G[A]}R_{t}\leq\phi\mathbf{d}(R_{t})+\frac{\phi}{6T}\mathbf{d}(A)$ by Lemma A.1, we have

\sum_{u\in A_{t},v\in R_{t}}\mathbf{F}_{t}(u,v)\leq\frac{t}{\phi}\cdot\delta_{G[A]}R_{t}\leq\frac{t}{\phi}\cdot\left(\phi\mathbf{d}(R_{t})+\frac{\phi}{6T}\mathbf{d}(A)\right)\leq T\cdot\mathbf{d}(R_{t})+\frac{1}{6}\mathbf{d}(A)\leq\frac{1}{3}\mathbf{d}(A),

where the last inequality holds by the assumption $\mathbf{d}(R_{t})\leq\mathbf{d}(A)/(6T)$ . Since the sum of row $u\in A$ in $\mathbf{F}_{t}$ is always $\mathbf{d}(u)$ , we have

\sum_{u\in A_{t},v\in A}\mathbf{F}_{t}(u,v)=\mathbf{d}(A_{t})=\mathbf{d}(A)-\mathbf{d}(R_{t})\geq\mathbf{d}(A)-\frac{\mathbf{d}(A)}{6T}\geq\frac{2}{3}\mathbf{d}(A).

Subtracting the two inequalities above, we conclude that

\mathbf{F}_{t}(A_{t},A_{t})=\sum_{u\in A_{t},v\in A}\mathbf{F}_{t}(u,v)-\sum_{u\in A_{t},v\in R_{t}}\mathbf{F}_{t}(u,v)\geq\frac{2}{3}\mathbf{d}(A)-\frac{1}{3}\mathbf{d}(A)=\frac{1}{3}\mathbf{d}(A).\qed

Claim A.4.

For all $u\in A_{t}$ , we have

\bigg{|}\sum_{v\in A_{t}}\mathbf{F}_{t}(u,v)-\mathbf{d}(u)\cdot\frac{\mathbf{F}_{t}(A_{t},A_{t})}{\mathbf{d}(A_{t})}\bigg{|}\leq\frac{1}{(nW)^{C/2-3}}.

In particular, $\sum_{v\in A_{t}}\mathbf{F}_{t}(u,v)\geq\mathbf{d}(u)/4$ .

Proof.

Since $\psi(t)\leq 1/(nW)^{C}$ by assumption, we have

\bigg{|}\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}(v)\bigg{|}^{2}\leq\mathbf{d}(u)\,\bigg{|}\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}(v)\bigg{|}^{2}\leq\sum_{u\in A_{t}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}=\psi(t)\leq\frac{1}{(nW)^{C}},

\big{|}\mathbf{F}_{t}(u,v)-\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)\big{|}=\mathbf{d}(u)\,\bigg{|}\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}(v)\bigg{|}\leq nW\cdot\frac{1}{(nW)^{C/2}}=\frac{1}{(nW)^{C/2-1}}.

In particular,

\begin{aligned}\bigg|\sum_{v\in A_{t}}\big(\mathbf{F}_{t}(u,v)-\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)\big)\bigg|&\leq\sum_{v\in A_{t}}\big|\mathbf{F}_{t}(u,v)-\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)\big|\\&\leq\sum_{v\in A_{t}}\frac{\mathbf{d}(u)}{(nW)^{C/2-1}}=|A_{t}|\cdot\frac{\mathbf{d}(u)}{(nW)^{C/2-1}}\leq\frac{1}{(nW)^{C/2-3}}.\end{aligned}

By the definition of $\boldsymbol{\mu}_{t}$ ,

\sum_{v\in A_{t}}\boldsymbol{\mu}_{t}(v)=\sum_{u,v\in A_{t}}\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(A_{t})}=\frac{\mathbf{F}_{t}(A_{t},A_{t})}{\mathbf{d}(A_{t})},

concluding the first statement of the claim. By Claim A.3, the expression above is at least $\frac{\mathbf{d}(A)/3}{\mathbf{d}(A_{t})}\geq 1/3$ , so

\sum_{v\in A_{t}}\mathbf{F}_{t}(u,v)\geq\sum_{v\in A_{t}}\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)-\frac{1}{(nW)^{C/2-3}}\geq\frac{1}{3}\mathbf{d}(u)-\frac{1}{(nW)^{C/2-3}}\geq\frac{1}{4}\mathbf{d}(u)

for $C>0$ large enough, concluding the second statement. ∎

With Claims A.3 and A.4 established, we now prove Lemma A.2. Recall that $\mathbf{F}_{t}$ is routable with congestion $t/\phi\leq T/\phi$ . Decompose this $A$ -commodity flow into single-commodity flows $f_{u,v}:u,v\in A$ that send quantity $\mathbf{F}_{t}(u,v)$ of commodity $u$ from $u$ to $v$ .

Consider any demand $\mathbf{b}\in\mathbb{R}^{A}$ satisfying $|\mathbf{b}|\leq\mathbf{d}|_{A_{t}}$ . In particular, $\mathbf{b}$ is supported on $A_{t}$ , so $\sum_{u\in A_{t}}\mathbf{b}(u)=0$ because the entries of a demand sum to zero. We want to construct a single-commodity flow routing demand $\mathbf{b}$ with congestion $5T/\phi$ . For each $u,v\in A_{t}$ , we first route the flow

f^{\prime}_{u,v}=\frac{f_{u,v}}{\sum_{v^{\prime}\in A_{t}}\mathbf{F}_{t}(u,v^{\prime})}\cdot\mathbf{b}(u).

Summing over all $v\in A_{t}$ , we observe that each vertex $u\in A_{t}$ sends a total of $\mathbf{b}(u)$ demand. Also, by Claim A.4, each flow $f^{\prime}_{u,v}$ has (absolute) value

\frac{\mathbf{F}_{t}(u,v)}{\sum_{v^{\prime}\in A_{t}}\mathbf{F}_{t}(u,v^{\prime})}\cdot|\mathbf{b}(u)|\leq\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)/4}\cdot\mathbf{d}(u)\leq 4\mathbf{F}_{t}(u,v).

In particular, these flows can be routed simultaneously with congestion $4$ times the $A$ -commodity flow, which is congestion $4T/\phi$ .

After routing this flow, each vertex $v\in A_{t}$ receives demand

\sum_{u\in A_{t}}\frac{\mathbf{F}_{t}(u,v)}{\sum_{v^{\prime}\in A_{t}}\mathbf{F}_{t}(u,v^{\prime})}\cdot\mathbf{b}(u).

Since $\psi(t)\leq 1/(nW)^{C}$ , we can approximate $\mathbf{F}_{t}(u,v)\approx\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)$ for all $u\in A_{t}$ . Together with Claim A.4, we can approximate the numerator and denominator of the fraction above to obtain

\sum_{u\in A_{t}}\frac{\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)}{\mathbf{d}(u)\cdot\mathbf{F}_{t}(A_{t},A_{t})/\mathbf{d}(A_{t})}\cdot\mathbf{b}(u)=\frac{\boldsymbol{\mu}_{t}(v)}{\mathbf{F}_{t}(A_{t},A_{t})/\mathbf{d}(A_{t})}\sum_{u\in A_{t}}\mathbf{b}(u),

which equals $0$ since $\sum_{u\in A_{t}}\mathbf{b}(u)=0$ . When $C>0$ is large enough, the approximated value of $0$ is within an additive $1/n^{2}$ of the true value. In particular, each vertex $v\in A_{t}$ receives a demand whose absolute value is at most $1/n^{2}$ . Since the minimum edge capacity is $1$ , the remaining demand can trivially be routed with congestion $1$ . The final congestion is $4T/\phi+1\leq 5T/\phi$ , concluding the proof of Lemma A.2.
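The rescaling used in the proof of Lemma A.2 can be checked numerically on small inputs. The sketch below works only with the values of the flows $f^{\prime}_{u,v}$ (not with the flows themselves), and assumes an explicitly stored matrix with positive row sums over $A_{t}$ ; the function names are hypothetical.

```python
def rescaled_values(F, b, A_t):
    """Value of f'_{u,v} = F_t(u,v) / (sum_{v' in A_t} F_t(u,v')) * b(u)."""
    values = {}
    for u in A_t:
        row_sum = sum(F[u][v] for v in A_t)   # assumed positive (cf. Claim A.4)
        for v in A_t:
            values[(u, v)] = F[u][v] / row_sum * b[u]
    return values


def received(values, A_t):
    """Demand arriving at each v in A_t after routing every f'_{u,v}."""
    return {v: sum(values[(u, v)] for u in A_t) for v in A_t}
```

For a perfectly mixed matrix, i.e. $\mathbf{F}_{t}(u,v)=\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)$ exactly, every vertex $u$ sends $\mathbf{b}(u)$ and every vertex receives zero demand up to floating-point error, which is the idealized version of the approximation argument above.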

A.2 Computing the Set and Flow

In this subsection, we describe a round of the algorithm in detail, following Lemma B.3 of [34] with some minor changes. Recall that on each iteration $t\in[T]$ , the algorithm first computes a partition of $A_{t-1}$ into $A_{t}^{\ell},A_{t}^{r}$ , and then computes a set $S_{t}$ and flow $f_{t}$ satisfying certain properties. The algorithm then updates $A_{t}\leftarrow A_{t-1}\setminus S_{t}$ and $R_{t}\leftarrow R_{t-1}\cup S_{t}$ . The choice of partition $A_{t}^{\ell},A_{t}^{r}$ will depend on an implicitly represented flow-matrix $\mathbf{F}_{t-1}$ that is useful for the analysis.

A.2.1 Constructing the partition

We start with the construction of $A_{t}^{\ell}$ and $A_{t}^{r}$ . We first list some variables key to the algorithm and analysis.

1.

Let $\mathbf{r}\in\mathbb{R}^{A}$ be a random unit vector orthogonal to the all-ones vector.
2.

For each $v\in A$ , let $\mathbf{p}(v)=\langle\mathbf{F}_{t-1}(v)/\mathbf{d}(v),\,\mathbf{r}\rangle$ be the projection of normalized flow-vector $\mathbf{F}_{t-1}(v)/\mathbf{d}(v)$ onto the vector $\mathbf{r}$ . We later show in Claim A.12 that the values $\mathbf{p}(v)$ can be computed in total time $O(mT)$ without explicitly maintaining the flow matrix $\mathbf{F}_{t-1}$ .
3.

Let $\bar{\mu}_{t-1}=\langle\boldsymbol{\mu}_{t-1},\mathbf{r}\rangle$ be the projection of the weighted average $\boldsymbol{\mu}_{t-1}=\sum_{u\in A_{t-1}}\mathbf{F}_{t-1}(u)/\mathbf{d}(A_{t-1})$ onto the vector $\mathbf{r}$ . It is only used in the analysis.
4.

Let $A_{t}^{\ell}$ and $A_{t}^{r}$ be constructed by Lemma A.5 below.

Lemma A.5.

Given the values $\mathbf{p}(v)$ for all $v\in A_{t-1}$ , we can find in time $O(|A_{t-1}|\log|A_{t-1}|)$ a partition of $A_{t-1}$ into two sets $A_{t}^{\ell},A_{t}^{r}$ and a separation value $\eta\in\mathbb{R}$ such that

(a)

$\eta$ separates the projections of $A_{t}^{\ell},A_{t}^{r}$ , i.e., either $\displaystyle\max_{u\in A_{t}^{\ell}}\mathbf{p}(u)\leq\eta\leq\min_{v\in A_{t}^{r}}\mathbf{p}(v)$ or $\displaystyle\min_{u\in A_{t}^{\ell}}\mathbf{p}(u)\geq\eta\geq\max_{v\in A_{t}^{r}}\mathbf{p}(v)$ ,
(b)

$\mathbf{d}(A_{t}^{\ell})\leq\mathbf{d}(A_{t-1})/2$ and $\mathbf{d}(A_{t}^{r})\geq\mathbf{d}(A_{t-1})/2$ , and
(c)

$\sum_{v\in A_{t}^{\ell}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}\geq\frac{1}{2}\sum_{v\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\bar{\mu}_{t-1})^{2}$ .

Proof.

Sort the vertices $v\in A_{t-1}$ by their value of $\mathbf{p}(v)$ in ascending order, and let the sorted list be $v_{1},v_{2},\ldots,v_{|A_{t-1}|}$ . Let $i\geq 1$ be the smallest integer such that $\mathbf{d}(\{v_{1},v_{2},\ldots,v_{i}\})\geq\mathbf{d}(A_{t-1})/2$ , and define $\eta=\mathbf{p}(v_{i})$ . Define the vertex sets $S_{1}=\{v_{1},v_{2},\ldots,v_{i}\}$ and $S_{2}=\{v_{i},v_{i+1},\ldots,v_{|A_{t-1}|}\}$ . By our construction of $i$ , we have $\mathbf{d}(S_{j}\setminus\{v_{i}\})\leq\mathbf{d}(A_{t-1})/2$ for both $j\in\{1,2\}$ . Since $S_{1}\cup S_{2}=A_{t-1}$ , we have

\sum_{v\in S_{1}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}+\sum_{v\in S_{2}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}\geq\sum_{v\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2},

so pick $j\in\{1,2\}$ such that

\sum_{v\in S_{j}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}\geq\frac{1}{2}\sum_{v\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}.

Since $\eta=\mathbf{p}(v_{i})$ , the term of $v_{i}$ vanishes, so

\sum_{v\in S_{j}\setminus\{v_{i}\}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}=\sum_{v\in S_{j}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}\geq\frac{1}{2}\sum_{v\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}\geq\frac{1}{2}\sum_{v\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\bar{\mu}_{t-1})^{2},

where the last inequality holds because $\bar{\mu}_{t-1}$ is the $\mathbf{d}$ -weighted average of the values $\mathbf{p}(v)$ over $v\in A_{t-1}$ and therefore minimizes $\sum_{v\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-x)^{2}$ over $x\in\mathbb{R}$ . It follows that we can take $A_{t}^{\ell}=S_{j}\setminus\{v_{i}\}$ and $A_{t}^{r}=A_{t-1}\setminus A_{t}^{\ell}$ , which satisfies all three properties of the lemma. ∎
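The construction in the proof of Lemma A.5 translates almost directly into code. The following Python sketch is illustrative only: it assumes the projections and weights are given as dictionaries `p` and `d` keyed by vertex, and the function name is hypothetical. Sorting dominates the running time, matching the $O(|A_{t-1}|\log|A_{t-1}|)$ bound.

```python
def split_by_projection(p, d, A_prev):
    """Return (A_left, A_right, eta) for a nonempty vertex list A_prev."""
    order = sorted(A_prev, key=lambda v: p[v])     # ascending p(v)
    half = sum(d[v] for v in order) / 2            # d(A_{t-1}) / 2
    prefix, i = 0.0, 0
    for i, v in enumerate(order):                  # smallest prefix of weight >= half
        prefix += d[v]
        if prefix >= half:
            break
    pivot = order[i]                               # the vertex v_i
    eta = p[pivot]                                 # eta = p(v_i)
    S1, S2 = order[: i + 1], order[i:]             # both sets contain v_i

    def energy(S):
        return sum(d[v] * (p[v] - eta) ** 2 for v in S)

    side = S1 if energy(S1) >= energy(S2) else S2  # side with at least half the energy
    left = [v for v in side if v != pivot]         # A_t^l = S_j \ {v_i}
    left_set = set(left)
    right = [v for v in order if v not in left_set]   # A_t^r = A_{t-1} \ A_t^l
    return left, right, eta
```

Removing the pivot $v_{i}$ from the chosen side costs nothing in property (c) because its term $(\mathbf{p}(v_{i})-\eta)^{2}$ is zero, and it is what keeps $\mathbf{d}(A_{t}^{\ell})$ at most $\mathbf{d}(A_{t-1})/2$ .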

We conclude this section with a lemma about the behavior of projections onto a random unit vector. Since the lemma is standard in high-dimensional geometry, we omit the proof and refer to Lemma 3.4 of [18].

Lemma A.6 (Lemma 3.4 of [18]).

For all vertices $v\in A$ , we have

\mathbb{E}[(\mathbf{p}(v)-\bar{\mu}_{t-1})^{2}]=\frac{1}{n}\left\lVert\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2},

and for any pair $u,v\in A$ and constant $c>0$ , we have

(\mathbf{p}(u)-\mathbf{p}(v))^{2}\leq\frac{c\log n}{n}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}

with probability at least $1-n^{-c/4}$ . In particular, there exists a constant $C>0$ such that for any pair $u,v\in A$ ,

\mathbb{E}[(\mathbf{p}(u)-\mathbf{p}(v))^{2}]\leq\frac{C\log n}{n}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}.
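For intuition, the projection step of this section can be sketched as follows. The sketch samples $\mathbf{r}$ by drawing Gaussian coordinates, removing the all-ones component, and normalizing, which is one standard way to obtain a random unit vector orthogonal to the all-ones vector. It also stores the flow-matrix explicitly, which is an assumption made for illustration; the algorithm computes the projections without materializing the matrix.

```python
import math
import random


def random_unit_vector_orthogonal_to_ones(n, rng):
    """Random unit vector in R^n (n >= 2) orthogonal to the all-ones vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(g) / n
    g = [x - mean for x in g]                  # remove the all-ones component
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def projections(F, d, r):
    """p(v) = < F(v)/d(v), r > for every row v of an explicit matrix F."""
    n = len(r)
    return [sum(F[v][w] * r[w] for w in range(n)) / d[v] for v in range(len(d))]
```

On the initial diagonal matrix $\mathbf{F}_{0}$ , each normalized flow-vector is a standard basis vector, so `projections` simply returns the coordinates of $\mathbf{r}$ .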

A.2.2 Max-flow/min-cut call

Given $A_{t}^{\ell}$ and $A_{t}^{r}$ , we call Theorem 5.10 with parameters

\displaystyle A\leftarrow A,\,\epsilon\leftarrow\frac{1}{18T^{2}},\,\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow\max\{1,(12\phi+\epsilon\gamma)(\kappa+2)\},\,\,\mathbf{s}\leftarrow\phi\mathbf{d}|_{A_{t}^{\ell}}+\epsilon\phi\mathbf{d},\,\mathbf{t}\leftarrow 12\phi\mathbf{d}|_{A_{t}^{r}},

(3)

and we denote the graph $G[A,\gamma,\mathbf{s},\mathbf{t}]$ by $H=(V_{H},E_{H})$ . We will later show in Claim A.10 that Assumption 1 of Theorem 5.10 is satisfied with our parameter $\beta$ .

From Theorem 5.10, we obtain a $(1+\epsilon)$ -approximate fair cut/flow pair $(S,f)$ . The algorithm sets $S_{t}\leftarrow S\setminus\{s,x\}$ , which satisfies the condition below as required by step (i).
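The parameter choices in (3) can be written out as a small helper, which makes it easy to check that $3\epsilon\phi=\frac{\phi}{6T^{2}}$ , the additive slack that step (i) allows. This is a sketch under assumptions: the function name and the dictionary representation of $\mathbf{d}$ , $\mathbf{s}$ , $\mathbf{t}$ are hypothetical, and $\kappa$ is passed in as a plain number since its value comes from Theorem 5.10. Exact rationals are used to avoid rounding.

```python
from fractions import Fraction


def flow_call_parameters(T, phi, kappa, d, A_left, A_right):
    """Parameters of (3); `d` maps each vertex of A to its weight d(v)."""
    eps = Fraction(1, 18 * T * T)                    # epsilon <- 1 / (18 T^2)
    gamma = eps * phi / 2                            # gamma <- epsilon * phi / 2
    beta = max(Fraction(1), (12 * phi + eps * gamma) * (kappa + 2))
    # s <- phi * d restricted to A_t^l, plus epsilon * phi * d on all of A
    s = {v: (phi * d[v] if v in A_left else 0) + eps * phi * d[v] for v in d}
    # t <- 12 * phi * d restricted to A_t^r
    t = {v: (12 * phi * d[v] if v in A_right else 0) for v in d}
    return eps, gamma, beta, s, t
```

With these values, the bound $3\epsilon\phi\mathbf{d}(A)$ appearing in Claim A.7 equals $\frac{\phi}{6T^{2}}\mathbf{d}(A)$ exactly.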

Claim A.7.

$\delta_{G[A]}S_{t}\leq\phi\mathbf{d}(S_{t}\cap A_{t-1})+3\epsilon\phi\mathbf{d}(A)$ and $\mathbf{d}(S_{t})\leq\mathbf{d}(A)/3$ .

Proof.

We begin with the first statement, starting with a lower bound on $\delta_{H}S$ . The edges in $\partial_{H}S$ can be split into three groups:

1.

The edges in $G[A]$ , which have total capacity $\delta_{G[A]}(S\setminus\{s,x\})=\delta_{G[A]}S_{t}$ ,
2.

The edges incident to $s$ , which have total capacity $c_{H}(\{s\},V_{H}\setminus S)=\mathbf{s}(A\setminus S_{t})\geq\phi\mathbf{d}(A_{t}^{\ell}\setminus S_{t})$ , and
3.

The edges incident to $t$ or $x$ , which we ignore.

It follows that

\displaystyle\delta_{H}S\geq\delta_{G[A]}S_{t}+\phi\mathbf{d}(A_{t}^{\ell}\setminus S_{t}).

By Fact 5.8, the cut value $\delta_{H}S$ is at most $(1+\epsilon)$ times the minimum $(s,t)$ -cut. Since the $(s,t)$ -minimum cut is at most $\delta_{H}\{s\}=\mathbf{s}(A)$ , we have

\delta_{H}S\leq(1+\epsilon)\mathbf{s}(A)=(1+\epsilon)(\phi\mathbf{d}(A_{t}^{\ell})+\epsilon\phi\mathbf{d}(A))\leq(1+\epsilon)\phi\mathbf{d}(A_{t}^{\ell})+2\epsilon\phi\mathbf{d}(A).

It follows that

\begin{aligned}\delta_{G[A]}S_{t}&\leq\delta_{H}S-\phi\mathbf{d}(A_{t}^{\ell}\setminus S_{t})\\&\leq(1+\epsilon)\phi\mathbf{d}(A_{t}^{\ell})+2\epsilon\phi\mathbf{d}(A)-\phi\mathbf{d}(A_{t}^{\ell}\setminus S_{t})\\&=\phi\mathbf{d}(S_{t}\cap A_{t}^{\ell})+\epsilon\phi\mathbf{d}(A_{t}^{\ell})+2\epsilon\phi\mathbf{d}(A)\\&\leq\phi\mathbf{d}(S_{t}\cap A_{t-1})+3\epsilon\phi\mathbf{d}(A),\end{aligned}

concluding the first statement. For the second statement, observe that $\delta_{H}S\leq(1+\epsilon)\mathbf{s}(A)\leq(1+\epsilon)\cdot 2\phi\mathbf{d}(A)$ and $\delta_{H}S\geq c_{H}(S,\{t\})=12\phi\mathbf{d}(S_{t})$ , so

\mathbf{d}(S_{t})\leq\frac{\delta_{H}S}{12\phi}\leq\frac{(1+\epsilon)\cdot 2\phi\mathbf{d}(A)}{12\phi}\leq\frac{\mathbf{d}(A)}{3},

concluding the second statement and the proof. ∎

The algorithm defines the flow $f_{t}$ as follows. First, scale the flow $f$ by factor $1/(12\phi)$ and restrict the flow to edges in $G[A\setminus S_{t}]$ . In other words, the flow on edges incident to $S_{t}\cup\{s,t,x\}$ is removed. Then, decompose the resulting flow into paths (using, for example, a dynamic tree [ST83]). For each vertex $v\in A_{t}^{\ell}\setminus S_{t}$ , remove enough paths starting at $v$ until it is the start of exactly $\mathbf{d}(v)/12$ total capacity of paths, and for each vertex $v\notin A_{t}^{\ell}\setminus S_{t}$ , remove all paths starting at $v$ ; let the remaining flow be $f_{t}$ . The claim below shows that this last step is always possible, and that $f_{t}$ satisfies the condition required by step (ii).

Claim A.8.

$f_{t}$ is a flow from $A_{t}^{\ell}\setminus S_{t}$ to $A_{t}^{r}\setminus S_{t}$ such that each vertex $v\in A_{t}^{\ell}\setminus S_{t}$ is the source of exactly $\mathbf{d}(v)/12$ flow, and each vertex $v\in A_{t}^{r}\setminus S_{t}$ is the sink of at most $\mathbf{d}(v)$ flow. The flow $f_{t}$ has congestion $1/(12\phi)$ .

Proof.

It suffices to show that the scaled flow $f/(12\phi)$ , once restricted to edges in $G[A\setminus S_{t}]$ , is a flow such that

1.

Each vertex $v\in A_{t}^{\ell}\setminus S_{t}$ is the source of at least $\mathbf{d}(v)/12$ flow, and
2.

The only sinks are at vertices $v\in A_{t}^{r}\setminus S_{t}$ , and each such vertex $v$ is the sink of at most $\mathbf{d}(v)$ flow.

The path-removing step in the algorithm is then possible, and ensures that each vertex $v\in A_{t}^{\ell}\setminus S_{t}$ is the source of exactly $\mathbf{d}(v)/12$ flow in $f_{t}$ , each vertex $v\in A_{t}^{r}\setminus S_{t}$ is the sink of at most $\mathbf{d}(v)$ flow, and there are no (nonzero) sources or sinks elsewhere.

We begin with the unscaled flow $f$ . Since $(S,f)$ is a $(1+\epsilon)$ -fair cut/flow pair, each vertex $v\in V_{H}\setminus(S\cup\{t\})$ receives at least $\frac{1}{1+\epsilon}\,c_{H}(\{v\},S)\geq\frac{1}{2}c_{H}(\{v\},S)$ total flow from vertices in $S$ . Since $A\setminus S_{t}\subseteq V_{H}\setminus(S\cup\{t\})$ , the same holds for all $v\in A\setminus S_{t}$ . By construction of $H$ , we have $c_{H}(\{v\},S)\geq c_{H}(v,s)=\phi\mathbf{d}|_{A_{t}^{\ell}}(v)+\epsilon\phi\mathbf{d}(v)$ for all $v\in A\setminus S_{t}$ . It follows that each vertex $v\in A\setminus S_{t}$ receives at least $\frac{\phi}{2}\mathbf{d}|_{A_{t}^{\ell}}(v)+\frac{\epsilon\phi}{2}\mathbf{d}(v)$ total flow from vertices in $S$ .

We now investigate the effect of restricting the flow $f$ to edges in $G[A\setminus S_{t}]$ , starting with removing all edges incident to $S$ . Continuing the argument above, removing these edges causes each vertex $v\in A\setminus S_{t}$ to be the source of at least $\frac{\phi}{2}\mathbf{d}|_{A_{t}^{\ell}}(v)+\frac{\epsilon\phi}{2}\mathbf{d}(v)$ flow.

If $x\notin S$ , then we now remove the edges incident to $x$ . By construction of $H$ , each vertex $v\in A$ has an edge to $x$ of capacity $\gamma\mathbf{d}(v)=\frac{\epsilon\phi}{2}\mathbf{d}(v)$ . Since the flow $f$ is feasible, there is at most $\frac{\epsilon\phi}{2}\mathbf{d}(v)$ flow along the edge $(v,x)$ . Removing this edge changes the net flow out of $v$ by at most $\frac{\epsilon\phi}{2}\mathbf{d}(v)$ . Since each vertex $v\in A\setminus S_{t}$ is the source of at least $\frac{\phi}{2}\mathbf{d}|_{A_{t}^{\ell}}(v)+\frac{\epsilon\phi}{2}\mathbf{d}(v)$ flow before this step, it is the source of at least $\frac{\phi}{2}\mathbf{d}|_{A_{t}^{\ell}}(v)\geq 0$ flow after this step. In particular, it cannot become a sink. Also, if $v\in A_{t}^{\ell}\setminus S_{t}$ , then it is the source of at least $\frac{\phi}{2}\mathbf{d}(v)$ flow after restriction.

Finally, we remove the edges incident to $t$ . Since the flow $f$ is feasible, each edge $(v,t)$ carries at most $c_{H}(v,t)$ flow. By construction of $H$ , we have $c_{H}(v,t)=12\phi\mathbf{d}(v)$ for all $v\in A_{t}^{r}$ , so each vertex $v\in A_{t}^{r}$ receives a net flow of at most $12\phi\mathbf{d}(v)$ from vertices other than $t$ (and then sends that flow to $t$ ). We may assume that the flow does not send any flow away from $t$ (i.e., along any edge $(v,t)$ in the direction from $t$ to $v$ ), since otherwise we can remove such flow using a path decomposition. Under this assumption, removing flow on edges incident to $t$ does not create any sources, only sinks. In particular, each vertex $v\in A_{t}^{r}$ is now the sink of at most $12\phi\mathbf{d}(v)$ flow.

Finally, scaling this restricted flow by $1/(12\phi)$ , we conclude that the scaled flow $f$ has congestion $1/(12\phi)$ , each vertex $v\in A_{t}^{\ell}\setminus S_{t}$ is the source of at least $\mathbf{d}(v)/12$ flow, and each vertex $v\in A_{t}^{r}\setminus S_{t}$ is the sink of at most $\mathbf{d}(v)$ flow. ∎

Claim A.9.

Given flow $f$ , the flow $f_{t}$ can be constructed in $O(m\log m)$ time.

Proof.

The flow $f_{t}$ is on the graph $H$ with at most $m+n$ edges, so scaling and restricting the flow takes $O(m)$ time. Using a dynamic tree, we can decompose the new flow into at most $m$ (implicit) paths in $O(m\log m)$ time. The path removal step also takes $O(m\log m)$ time through dynamic trees. The overall running time is $O(m\log m)$ . ∎
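For intuition, the path decomposition step can be sketched without dynamic trees. The Python sketch below is a naive routine for an acyclic flow that spends $O(m)$ time per path, so it does not achieve the $O(m\log m)$ bound of the claim. The name `decompose_paths` and the dictionary representation of the flow are illustrative assumptions, not part of the construction above.

```python
from collections import defaultdict


def decompose_paths(flow):
    """Naively decompose an acyclic flow into (path, value) pairs.

    `flow` maps directed edges (u, v) to positive flow values. This is an
    illustrative O(m)-per-path routine, not the dynamic-tree method.
    """
    out = defaultdict(dict)        # out[u][v] = remaining flow on edge (u, v)
    inflow = defaultdict(float)    # remaining flow entering each vertex
    for (u, v), val in flow.items():
        if val > 0:
            out[u][v] = val
            inflow[v] += val

    paths = []
    while True:
        # A start vertex has remaining outgoing flow but no incoming flow;
        # one exists whenever an acyclic flow still has a nonzero edge.
        start = next((u for u in out if out[u] and inflow[u] <= 1e-12), None)
        if start is None:
            break
        path = [start]
        while out[path[-1]]:
            path.append(next(iter(out[path[-1]])))
        # Peel off the bottleneck value along the walk.
        value = min(out[a][b] for a, b in zip(path, path[1:]))
        for a, b in zip(path, path[1:]):
            out[a][b] -= value
            inflow[b] -= value
            if out[a][b] <= 1e-12:
                del out[a][b]
        paths.append((path, value))
    return paths
```

Each round removes at least one edge, so at most $m$ paths are produced, matching the path count used in the proof.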

Claim A.10.

Assumption 1 of Theorem 5.10 holds for our choice of parameters $\gamma,\beta$ . That is, we have $\mathbf{t}(C\cap A)+\epsilon\gamma\cdot\delta_{G}(C\cap A)\leq\beta\cdot\delta_{G}C$ for all $C\in\mathcal{C}$ .

Proof.

Recall that we set $\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow(12\phi+\epsilon\gamma)(\kappa+2)$ , and $\mathbf{t}\leftarrow 12\phi\mathbf{d}|_{A_{t}^{r}}$ . For the entire proof, fix a set $C\in\mathcal{C}$ . We first bound $\mathbf{t}(C\cap A)$ as

\mathbf{t}(C\cap A)=12\phi\mathbf{d}|_{A_{t}^{r}}(C\cap A)\leq 12\phi\mathbf{d}(C\cap A)=12\phi\deg_{\partial\mathcal{P}_{L}\cup\partial A}(C\cap A)\leq 12\phi(\deg_{\partial\mathcal{P}_{L}}(C\cap A)+\deg_{\partial A}(C\cap A)).

For $\deg_{\partial A}(C\cap A)$ , we have $\deg_{\partial A}(C\cap A)=c_{G}(C\cap A,V\setminus A)$ since any edge in $\partial A$ with an endpoint in $C\cap A$ has its other endpoint outside $A$ . For $\deg_{\partial\mathcal{P}_{L}}(C\cap A)$ , we claim the bound $\deg_{\partial\mathcal{P}_{L}}(C\cap A)\leq\deg_{\partial\mathcal{P}_{L}}(C)\leq\delta_{G}C$ . The first inequality is trivial, and for the second inequality, observe that by construction of $\mathcal{C}$ , each set $C\in\mathcal{C}$ is a subset of some cluster in the partition $\mathcal{P}_{L}$ . It follows that any edge in $\partial_{G}\mathcal{P}_{L}$ with an endpoint in $C$ has its other endpoint outside $C$ , so the edge must be in $\partial_{G}C$ , and we conclude that $\deg_{\partial\mathcal{P}_{L}}(C)\leq\delta_{G}C$ . In total, we have established the bound $\mathbf{t}(C\cap A)\leq 12\phi(c_{G}(C\cap A,V\setminus A)+\delta_{G}C)$ .

Next, we bound $\delta_{G}(C\cap A)$ as

\delta_{G}(C\cap A)=c_{G}(C\cap A,V\setminus A)+c_{G}(C\cap A,A\setminus C)\leq c_{G}(C\cap A,V\setminus A)+\delta_{G}C,

and together with the bound on $\mathbf{t}(C\cap A)$ , we obtain

	$\displaystyle\mathbf{t}(C\cap A)+\epsilon\gamma\cdot\delta_{G}(C\cap A)$	$\displaystyle\leq 12\phi(c_{G}(C\cap A,V\setminus A)+\delta_{G}C)+\epsilon\gamma(c_{G}(C\cap A,V\setminus A)+\delta_{G}C)$
		$\displaystyle=(12\phi+\epsilon\gamma)(c_{G}(C\cap A,V\setminus A)+\delta_{G}C).$		(4)

We now focus on bounding $c_{G}(C\cap A,V\setminus A)$ . Recall from assumption 1 of Theorem 5.11 that there is a flow on $G$ with congestion $\kappa$ such that each vertex $v\in A$ is the source of $c_{G}(\{v\},V\setminus A)$ flow and each vertex $v\in V$ is the sink of at most $\deg_{\partial\mathcal{P}_{L}}(v)$ flow. In particular, among the vertices $v\in C$ , there is a total of $c_{G}(C\cap A,V\setminus A)$ source and at most $\deg_{\partial\mathcal{P}_{L}}(C)$ sink. Since the flow has congestion $\kappa$ , the net flow out of $C$ is at most $\kappa\delta_{G}C$ . Putting everything together, we obtain

c_{G}(C\cap A,V\setminus A)\leq\deg_{\partial\mathcal{P}_{L}}(C)+\kappa\delta_{G}C\leq\delta_{G}C+\kappa\delta_{G}C,

where the second inequality follows from our earlier claim $\deg_{\partial\mathcal{P}_{L}}(C)\leq\delta_{G}C$ . Continuing from (4), we conclude that

	$\displaystyle\mathbf{t}(C\cap A)+\epsilon\gamma\cdot\delta_{G}(C\cap A)$	$\displaystyle\leq(12\phi+\epsilon\gamma)(c_{G}(C\cap A,V\setminus A)+\delta_{G}C)$
		$\displaystyle\leq(12\phi+\epsilon\gamma)(\delta_{G}C+\kappa\delta_{G}C+\delta_{G}C)$
		$\displaystyle=(12\phi+\epsilon\gamma)(\kappa+2)\delta_{G}C$
		$\displaystyle\leq\beta\delta_{G}C,$

finishing the proof. ∎

A.2.3 Constructing the matching and updating the flow-matrix

Using the flow $f_{t}$ , the algorithm then constructs a (fractional) matching graph $M_{t}$ on vertex set $A_{t-1}$ . Take a path decomposition of flow $f_{t}$ into paths from $A_{t}^{\ell}\setminus S_{t}$ to $A_{t}^{r}\setminus S_{t}$ . For each pair of vertices $u\in A_{t}^{\ell}\setminus S_{t}$ and $v\in A_{t}^{r}\setminus S_{t}$ , add an edge $(u,v)$ to $M_{t}$ whose capacity is the sum of capacities of all $u$ – $v$ paths in the decomposition of $f_{t}$ . The following claim is immediate using an (implicit) path decomposition.

Claim A.11.

Given flow $f_{t}$ , the matching graph $M_{t}$ can be constructed in $O(m\log m)$ time.
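As a small illustration of the aggregation step, the following Python sketch sums path values by endpoint pair. It takes an explicit list of paths rather than the implicit decomposition used for the running-time bound, and the name `build_matching` and the input format are assumptions made for this example.

```python
from collections import defaultdict


def build_matching(paths, left, right):
    """Aggregate decomposed flow paths into fractional matching capacities.

    `paths` is a list of (vertex_list, value) pairs. `left` and `right` are
    the allowed start and end sets, playing the roles of A_t^l minus S_t and
    A_t^r minus S_t. Returns a dict mapping (u, v) to the capacity of the
    matching edge (u, v).
    """
    capacity = defaultdict(float)
    for vertices, value in paths:
        u, v = vertices[0], vertices[-1]
        # Only paths from the left side to the right side contribute.
        if u in left and v in right:
            capacity[(u, v)] += value
    return dict(capacity)
```

The degree of a left vertex in the resulting graph is the total value of paths starting there, which is the quantity bounded in Claim A.14.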

From the matching graph $M_{t}$ , we implicitly update the flow-matrix $\mathbf{F}_{t-1}$ to $\mathbf{F}_{t}$ as follows. For each $u\in A$ , we set

\mathbf{F}_{t}(u)=\mathbf{F}_{t-1}(u)+\sum_{v\in A}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}-\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}\bigg{)}.

Note that we are viewing $M_{t}$ as a graph on vertex set $A$ , where vertices outside $(A_{t}^{\ell}\setminus S_{t})\cup(A_{t}^{r}\setminus S_{t})$ are isolated.
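To make the update rule concrete, here is a minimal Python sketch that applies it to an explicitly stored matrix. The algorithm itself only updates $\mathbf{F}_{t}$ implicitly, so the dense representation, the name `update_flow_matrix`, and the convention that each matching edge is listed once are all assumptions of this example.

```python
def update_flow_matrix(F_prev, d, matching):
    """Apply one matching step to an explicitly stored flow matrix.

    F_prev: dict u -> dict w -> float, where row u is the flow vector F_{t-1}(u).
    d: dict u -> positive weight d(u).
    matching: dict (u, v) -> capacity, each undirected edge listed once.
    """
    vertices = list(F_prev)
    F_new = {u: dict(F_prev[u]) for u in vertices}
    for (u, v), c in matching.items():
        for w in vertices:
            # Difference of normalized flow vectors, evaluated on F_{t-1}.
            diff = F_prev[v][w] / d[v] - F_prev[u][w] / d[u]
            # The capacity is symmetric, so u gains what v loses.
            F_new[u][w] += 0.5 * c * diff
            F_new[v][w] -= 0.5 * c * diff
    return F_new
```

For example, with $\mathbf{d}(a)=2$ , $\mathbf{d}(b)=1$ , diagonal $\mathbf{F}_{0}$ , and a single matching edge of capacity $1$ , the rows become $(1.5,0.5)$ and $(0.5,0.5)$ , and each row still sums to $\mathbf{d}(u)$ .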

Recall from Section A.2.1 that the algorithm needs to compute the projection $\mathbf{p}(v)=\langle\mathbf{F}_{t-1}(v)/\mathbf{d}(v),\,\mathbf{r}\rangle$ onto the vector $\mathbf{r}$ for each vertex $v\in A$ . Now that $\mathbf{F}_{t-1}(v)$ has been explicitly defined, we show that the projections can be computed in total time $O(mT)$ .

Claim A.12.

Given vector $\mathbf{r}\in\mathbb{R}^{A}$ , the projection $\mathbf{p}(v)=\langle\mathbf{F}_{t-1}(v)/\mathbf{d}(v),\,\mathbf{r}\rangle$ for each vertex $v\in A$ can be computed in total time $O(mT)$ .

Proof.

By linearity of inner product, we have

\langle\mathbf{F}_{j}(u),\mathbf{r}\rangle=\langle\mathbf{F}_{j-1}(u),\mathbf{r}\rangle+\sum_{v\in A}\frac{c_{M_{j}}(u,v)}{2}\bigg{(}\frac{\langle\mathbf{F}_{j-1}(v),\mathbf{r}\rangle}{\mathbf{d}(v)}-\frac{\langle\mathbf{F}_{j-1}(u),\mathbf{r}\rangle}{\mathbf{d}(u)}\bigg{)}

for all vertices $u\in A$ and iterations $j\in[t-1]$ . Since the flow $f_{j}$ is on $G$ , we can assume that the path decomposition has at most $m$ paths, so the matching graph $M_{j}$ has at most $m$ edges. In other words, there are at most $m$ nonzero values of $c_{M_{j}}(u,v)$ . It follows that given the values $\langle\mathbf{F}_{j-1}(u),\mathbf{r}\rangle$ for all $u\in A$ , we can compute $\langle\mathbf{F}_{j}(u),\mathbf{r}\rangle$ for all $u\in A$ in $O(m)$ time. Since $\mathbf{F}_{0}$ is a diagonal matrix, the initial values $\langle\mathbf{F}_{0}(u),\mathbf{r}\rangle$ can be computed in $O(n)$ time. Over $t-1\leq T$ iterations, the total time is $O(mT)$ . ∎
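The recurrence in this proof can be sketched as follows: only the scalar $\langle\mathbf{F}_{j}(u),\mathbf{r}\rangle$ is stored per vertex, and each matching is replayed in time proportional to its number of edges. The name `project_flow_vectors` and the input format are assumptions of this sketch, and $\mathbf{F}_{0}$ is taken to be the diagonal matrix with entries $\mathbf{d}(v)$ .

```python
def project_flow_vectors(d, matchings, r):
    """Return p(v) = <F_j(v)/d(v), r> after replaying all given matchings.

    d: dict v -> positive weight d(v); F_0 is diagonal with F_0(v, v) = d(v).
    matchings: list of dicts (u, v) -> capacity, one per iteration, each
        undirected edge listed once.
    r: dict v -> coordinate of the projection vector.
    """
    # q[u] tracks the inner product <F_j(u), r>; initially d(u) * r(u).
    q = {u: d[u] * r[u] for u in d}
    for matching in matchings:
        q_next = dict(q)
        for (u, v), c in matching.items():
            # Same update as for the flow vectors, applied to scalars.
            diff = q[v] / d[v] - q[u] / d[u]
            q_next[u] += 0.5 * c * diff
            q_next[v] -= 0.5 * c * diff
        q = q_next
    return {u: q[u] / d[u] for u in d}
```

The flow matrix itself is never materialized, which is what keeps the cost at $O(m)$ per replayed matching.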

A.3 Analyzing the Potential Decrease

The main goal of Section A.3 is to prove the following lemma, establishing a $(1-\Omega(1/\log n))$ expected decrease in $\psi(t)$ on each iteration.

Lemma A.13.

Over the random choice of the unit vector $\mathbf{r}$ , we have $\mathbb{E}[\psi(t-1)-\psi(t)]\geq\Omega(\psi(t-1)/\log n)$ .

To prove Lemma A.13, we begin by listing properties of $M_{t}$ and $\mathbf{F}_{t}$ .

Claim A.14.

$\deg_{M_{t}}(u)=\mathbf{d}(u)/48$ for all $u\in A_{t}^{\ell}\setminus S_{t}$ , and $\deg_{M_{t}}(v)\leq\mathbf{d}(v)$ for all $v\in A_{t}^{r}\setminus S_{t}$ .

Proof.

For each vertex $u\in A_{t}^{\ell}\setminus S_{t}$ , its degree in $M_{t}$ equals the total capacity of paths starting at $u$ in the decomposition of $f_{t}$ , which is exactly $\mathbf{d}(u)/48$ by Claim A.8. A symmetric argument establishes $\deg_{M_{t}}(v)\leq\mathbf{d}(v)$ for all $v\in A_{t}^{r}\setminus S_{t}$ . ∎

Claim A.15.

For each vertex $u\in A$ , the flow-vector $\mathbf{F}_{t}(u)$ sums to $\mathbf{d}(u)$ .

Proof.

We prove by induction on $t$ that the flow-vector $\mathbf{F}_{t}(u)$ sums to $\mathbf{d}(u)$ . The statement is true for $t=0$ since $\mathbf{F}_{0}$ is defined as the diagonal matrix with value $\mathbf{d}(v)$ on entry $(v,v)$ . Assume by induction that the flow-vector $\mathbf{F}_{t-1}(u)$ sums to $\mathbf{d}(u)$ for each $u\in A$ . For each $u\in A$ , we take the definition of vector $\mathbf{F}_{t}(u)$ and sum over its coordinates $w\in A$ to obtain

	$\displaystyle\sum_{w\in A}\mathbf{F}_{t}(u,w)$	$\displaystyle=\sum_{w\in A}\left(\mathbf{F}_{t-1}(u,w)+\sum_{v\in A}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\mathbf{F}_{t-1}(v,w)}{\mathbf{d}(v)}-\frac{\mathbf{F}_{t-1}(u,w)}{\mathbf{d}(u)}\bigg{)}\right)$
		$\displaystyle=\sum_{w\in A}\mathbf{F}_{t-1}(u,w)+\sum_{v\in A}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\sum_{w\in A}\mathbf{F}_{t-1}(v,w)}{\mathbf{d}(v)}-\frac{\sum_{w\in A}\mathbf{F}_{t-1}(u,w)}{\mathbf{d}(u)}\bigg{)}$
		$\displaystyle=\mathbf{d}(u)+\sum_{v\in A}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\mathbf{d}(v)}{\mathbf{d}(v)}-\frac{\mathbf{d}(u)}{\mathbf{d}(u)}\bigg{)}$
		$\displaystyle=\mathbf{d}(u),$

completing the induction and the proof. ∎

Claim A.16.

If $\mathbf{F}_{t-1}$ is routable with congestion $t/\phi$ , then $\mathbf{F}_{t}$ is routable with congestion $(t+1)/\phi$ .

Proof.

Take the $A$ -commodity flow routing $\mathbf{F}_{t-1}$ with congestion $t/\phi$ and reverse it, forming an $A$ -commodity flow routing the transpose $\mathbf{F}_{t-1}^{\top}$ with the same congestion. In this reversed flow, each vertex $u\in A$ has $\mathbf{F}_{t-1}^{\top}(v,u)=\mathbf{F}_{t-1}(u,v)$ quantity of commodity $v$ . In other words, the quantities of the commodities at vertex $u\in A$ are captured by its flow vector $\mathbf{F}_{t-1}(u)$ .

Next, we “mix” commodities along the edges of the matching graph $M_{t}$ : for each edge $(u,v)$ in $M_{t}$ of capacity $c$ , send a proportional $\frac{c}{2\mathbf{d}(u)}$ fraction of each commodity at $u$ along the corresponding path (in the path decomposition of $f_{t}$ ) in the direction from $u$ to $v$ , and send a proportional $\frac{c}{2\mathbf{d}(v)}$ fraction of each commodity at $v$ in the direction from $v$ to $u$ . By Claim A.15, each vertex $u\in A_{t-1}$ has $\mathbf{d}(u)$ total quantity of commodities, so we send $c/2$ total quantity of commodities in each direction, or $c$ in total, along this path of capacity $c$ . Since flow $f_{t}$ has congestion $1/\phi$ by Claim A.8, the total congestion of this mixing step is $1/\phi$ . After this mixing step, the quantities of the commodities at vertex $u\in A$ are exactly $\mathbf{F}_{t}(u)$ . In other words, we have established an $A$ -commodity flow routing $\mathbf{F}_{t}^{\top}$ with congestion $t/\phi+1/\phi=(t+1)/\phi$ . Reversing the flow, we obtain a routing of $\mathbf{F}_{t}$ with the same congestion. ∎

The following lemma adapts Lemma 3.4 of [33] to the capacitated case.

Lemma A.17.

$\psi(t-1)-\psi(t)\geq\displaystyle\frac{1}{2}\sum_{u,v\in A}c_{M_{t}}(u,v)\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}$ .

Proof.

For each vertex $v\in A$ , define the normalized flow vector $\widetilde{\mathbf{F}}_{t-1}(v)=\mathbf{d}(v)^{-1/2}\mathbf{F}_{t-1}(v)$ , and define the normalized flow matrix $\widetilde{\mathbf{F}}_{t-1}\in\mathbb{R}^{A\times A}_{\geq 0}$ with vector $\widetilde{\mathbf{F}}_{t-1}(v)$ for each row $v\in A$ . Let $D$ be the diagonal matrix with value $\mathbf{d}(u)$ on entry $(u,u)$ . We can then write $\widetilde{\mathbf{F}}_{t-1}=D^{-1/2}\mathbf{F}_{t-1}$ . Define the vectors $\widetilde{\mathbf{F}}_{t}(v)=\mathbf{d}(v)^{-1/2}\mathbf{F}_{t}(v)$ and matrix $\widetilde{\mathbf{F}}_{t}=D^{-1/2}\mathbf{F}_{t}$ analogously. For each vertex $u\in A$ , by definition of flow vector $\mathbf{F}_{t}(u)$ , we have

	$\displaystyle\widetilde{\mathbf{F}}_{t}(u)=\mathbf{d}(u)^{-1/2}\mathbf{F}_{t}(u)$	$\displaystyle=\mathbf{d}(u)^{-1/2}\bigg{(}\mathbf{F}_{t-1}(u)+\sum_{v\in A_{t-1}}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}-\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}\bigg{)}\bigg{)}$
		$\displaystyle=\mathbf{d}(u)^{-1/2}\bigg{(}\mathbf{d}(u)^{1/2}\widetilde{\mathbf{F}}_{t-1}(u)+\sum_{v\in A_{t-1}}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\widetilde{\mathbf{F}}_{t-1}(v)}{\mathbf{d}(v)^{1/2}}-\frac{\widetilde{\mathbf{F}}_{t-1}(u)}{\mathbf{d}(u)^{1/2}}\bigg{)}\bigg{)}$
		$\displaystyle=\widetilde{\mathbf{F}}_{t-1}(u)+\sum_{v\in A_{t-1}}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\widetilde{\mathbf{F}}_{t-1}(v)}{\mathbf{d}(u)^{1/2}\mathbf{d}(v)^{1/2}}-\frac{\widetilde{\mathbf{F}}_{t-1}(u)}{\mathbf{d}(u)}\bigg{)}$
		$\displaystyle=\widetilde{\mathbf{F}}_{t-1}(u)-\frac{1}{2}\bigg{(}\frac{\deg_{M_{t}}(u)}{\mathbf{d}(u)}\widetilde{\mathbf{F}}_{t-1}(u)-\sum_{v\in A_{t-1}}\frac{c_{M_{t}}(u,v)}{\mathbf{d}(u)^{1/2}\mathbf{d}(v)^{1/2}}\widetilde{\mathbf{F}}_{t-1}(v)\bigg{)}.$

Let $L\in\mathbb{R}^{A\times A}$ be the Laplacian matrix for the matching graph $M_{t}$ on vertex set $A$ (where vertices outside $A_{t}^{\ell}\cup A_{t}^{r}$ are isolated). That is, we define $L(u,u)=\deg_{M_{t}}(u)$ for all $u\in A$ and $L(u,v)=-c_{M_{t}}(u,v)$ for all distinct $u,v\in A$ .

We first prove the following about the matrix $D^{-1/2}LD^{-1/2}$ .

Subclaim 1.

$0\preceq D^{-1/2}LD^{-1/2}\preceq 2I$ .

Proof.

Consider the normalized Laplacian $D_{L}^{-1/2}LD^{-1/2}_{L}$ where $D_{L}$ is the diagonal matrix with value $\deg_{M_{t}}(u)$ on each entry $(u,u)$ . It is well-known that the normalized Laplacian has eigenvalues in the range $[0,2]$ , so $0\preceq D_{L}^{-1/2}LD_{L}^{-1/2}\preceq 2I$ . Multiplying by $D_{L}^{1/2}$ on both sides gives $0\preceq L\preceq 2D_{L}$ . Claim A.14 implies that $D_{L}\preceq D$ , so we obtain $0\preceq L\preceq 2D_{L}\preceq 2D$ . Finally, multiplying by $D^{-1/2}$ on both sides gives the desired $0\preceq D^{-1/2}LD^{-1/2}\preceq 2I$ . ∎
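Subclaim 1 is equivalent to the quadratic-form statement $0\leq x^{\top}Lx\leq 2x^{\top}Dx$ for all $x\in\mathbb{R}^{A}$ , where $x^{\top}Lx=\sum c_{M_{t}}(u,v)(x_{u}-x_{v})^{2}$ with the sum taken over matching edges. The Python sketch below spot-checks this on random vectors. It is a numerical sanity check under the assumption $\deg_{M_{t}}(u)\leq\mathbf{d}(u)$ , not a proof, and the function names are hypothetical.

```python
import random


def quadratic_forms(d, matching, x):
    """Return (x^T L x, x^T D x) for the matching Laplacian L and D = diag(d).

    matching: dict (u, v) -> capacity, each undirected edge listed once.
    """
    lap = sum(c * (x[u] - x[v]) ** 2 for (u, v), c in matching.items())
    diag = sum(d[u] * x[u] ** 2 for u in d)
    return lap, diag


def check_subclaim(d, matching, trials=1000, seed=0):
    """Test 0 <= x^T L x <= 2 x^T D x on random vectors x."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = {u: rng.uniform(-1.0, 1.0) for u in d}
        lap, diag = quadratic_forms(d, matching, x)
        if lap < -1e-12 or lap > 2.0 * diag + 1e-12:
            return False
    return True
```

When some vertex has matching degree larger than $\mathbf{d}(u)$ , the upper bound can fail, which is why the degree bounds of Claim A.14 are needed.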

By definition, the matrix $D^{-1/2}LD^{-1/2}$ has value $\frac{\deg_{M_{t}}(u)}{\mathbf{d}(u)}$ on entry $(u,u)$ and value $-\frac{c_{M_{t}}(u,v)}{\mathbf{d}(u)^{1/2}\mathbf{d}(v)^{1/2}}$ on entry $(u,v)$ with $u\neq v$ . It follows that we can write matrix $\widetilde{\mathbf{F}}_{t}$ as

\widetilde{\mathbf{F}}_{t}=\bigg{(}I-\frac{1}{2}D^{-1/2}LD^{-1/2}\bigg{)}\widetilde{\mathbf{F}}_{t-1}=\frac{1}{2}(I+N)\widetilde{\mathbf{F}}_{t-1}\quad\text{where}\quad N=I-D^{-1/2}LD^{-1/2}.

For two matrices $X,Y\in\mathbb{R}^{A\times A}$ , define the Frobenius inner product $X\bullet Y=\sum_{u,v\in A}X(u,v)\cdot Y(u,v)=\textup{Tr}(X^{\top}Y)$ . Since $\boldsymbol{\mu}_{t-1}$ is the minimizer of the expression in (2) at iteration $t-1$ , we can write $\psi(t-1)$ as

	$\displaystyle\psi(t-1)$	$\displaystyle=\sum_{u\in A_{t-1}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2}$
		$\displaystyle=\sum_{u\in A_{t-1}}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)^{1/2}}-\mathbf{d}(u)^{1/2}\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2}$
		$\displaystyle=(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1})\bullet(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}).$

For $\psi(t)$ , we use the fact that $\boldsymbol{\mu}_{t}$ minimizes the expression in (2) at iteration $t$ , followed by $A_{t}\subseteq A_{t-1}$ , to get

	$\displaystyle\psi(t)$	$\displaystyle=\sum_{u\in A_{t}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}$
		$\displaystyle\leq\sum_{u\in A_{t}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2}$
		$\displaystyle\leq\sum_{u\in A_{t-1}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2}$
		$\displaystyle=\sum_{u\in A_{t-1}}\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)^{1/2}}-\mathbf{d}(u)^{1/2}\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2}$
		$\displaystyle=(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1})\bullet(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}).$

Taking the difference and expanding,

	$\displaystyle\psi(t-1)-\psi(t)$
	$\displaystyle\geq(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1})\bullet(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1})-(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1})\bullet(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1})$
	$\displaystyle=\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-2\widetilde{\mathbf{F}}_{t-1}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}+D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}-\widetilde{\mathbf{F}}_{t}\bullet\widetilde{\mathbf{F}}_{t}+2\widetilde{\mathbf{F}}_{t}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}$
	$\displaystyle=\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\widetilde{\mathbf{F}}_{t}\bullet\widetilde{\mathbf{F}}_{t}+2(\widetilde{\mathbf{F}}_{t}-\widetilde{\mathbf{F}}_{t-1})\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}$
	$\displaystyle=\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\widetilde{\mathbf{F}}_{t}\bullet\widetilde{\mathbf{F}}_{t}-2\bigg{(}\frac{1}{2}D^{-1/2}LD^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bigg{)}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}.$

The third term equals $-\textup{Tr}\big{(}\widetilde{\mathbf{F}}_{t-1}^{\top}D^{-1/2}LD^{-1/2}D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t-1}\big{)}$ , which equals $0$ since $LD^{-1/2}D^{1/2}\mathbbm{1}=L\mathbbm{1}$ is the zero vector. Expanding the first and second terms, we obtain

$\displaystyle\psi(t-1)-\psi(t)$	$\displaystyle\geq\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\widetilde{\mathbf{F}}_{t}\bullet\widetilde{\mathbf{F}}_{t}$
	$\displaystyle=\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\frac{1}{2}(I+N)\widetilde{\mathbf{F}}_{t-1}\bullet\frac{1}{2}(I+N)\widetilde{\mathbf{F}}_{t-1}$
	$\displaystyle=\frac{3}{4}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\frac{1}{2}N\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\frac{1}{4}N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1}$
	$\displaystyle=\frac{1}{4}\big{(}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1}\big{)}+\frac{1}{2}(I-N)\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}$
	$\displaystyle=\frac{1}{4}\big{(}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1}\big{)}+\frac{1}{2}D^{-1/2}LD^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}.$	(5)

To bound the first term in (5), recall from Subclaim 1 that $0\preceq D^{-1/2}LD^{-1/2}\preceq 2I$ , which means that $-I\preceq N\preceq I$ . Since $\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top}\succeq 0$ and $N$ has eigenvalues in the range $[-1,1]$ , we have $\textup{Tr}(N\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top}N)\leq\textup{Tr}(\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top})$ , so

\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}=\textup{Tr}(\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top})\geq\textup{Tr}(N\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top}N)=\textup{Tr}(\widetilde{\mathbf{F}}_{t-1}^{\top}NN\widetilde{\mathbf{F}}_{t-1})=N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1},

so the first term $\frac{1}{4}(\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1})$ is at least $0$ .

Continuing from (5), we conclude that

\psi(t-1)-\psi(t)\geq\frac{1}{2}D^{-1/2}LD^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}.

It remains to understand the above expression $D^{-1/2}LD^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}$ . For each vertex $v\in A$ , let $\mathbbm{1}_{v}\in\mathbb{R}^{A}$ denote the unit vector in direction $v$ . The Laplacian can be written as

L=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot(\mathbbm{1}_{u}-\mathbbm{1}_{v})(\mathbbm{1}_{u}-\mathbbm{1}_{v})^{\top},

so that

	$\displaystyle D^{-1/2}LD^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}$	$\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot D^{-1/2}(\mathbbm{1}_{u}-\mathbbm{1}_{v})(\mathbbm{1}_{u}-\mathbbm{1}_{v})^{\top}D^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}$
		$\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\textup{Tr}\big{(}\widetilde{\mathbf{F}}_{t-1}^{\top}D^{-1/2}(\mathbbm{1}_{u}-\mathbbm{1}_{v})(\mathbbm{1}_{u}-\mathbbm{1}_{v})^{\top}D^{-1/2}\widetilde{\mathbf{F}}_{t-1}\big{)}$
		$\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\textup{Tr}\big{(}(\mathbbm{1}_{u}-\mathbbm{1}_{v})^{\top}D^{-1/2}\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top}D^{-1/2}(\mathbbm{1}_{u}-\mathbbm{1}_{v})\big{)}$
		$\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\left\lVert\widetilde{\mathbf{F}}_{t-1}^{\top}D^{-1/2}(\mathbbm{1}_{u}-\mathbbm{1}_{v})\right\rVert_{2}^{2}$
		$\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\left\lVert\mathbf{F}_{t-1}^{\top}D^{-1}(\mathbbm{1}_{u}-\mathbbm{1}_{v})\right\rVert_{2}^{2}$
		$\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}.$

We conclude that

\psi(t-1)-\psi(t)\geq\frac{1}{2}\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}. ∎
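The potential drop can be checked numerically on small instances. The Python sketch below makes several assumptions that are not spelled out in this section: the potential is taken to be $\sum_{u}\mathbf{d}(u)\lVert\mathbf{F}(u)/\mathbf{d}(u)-\boldsymbol{\mu}\rVert_{2}^{2}$ with $\boldsymbol{\mu}$ the $\mathbf{d}$ -weighted average of the normalized flow vectors, no vertices are removed between iterations, and the sum in the bound is read as ranging over each matching edge once. All function names are hypothetical.

```python
def potential(F, d):
    """psi = sum_u d(u) * ||F(u)/d(u) - mu||^2, mu the d-weighted average."""
    verts = list(F)
    total = sum(d[u] for u in verts)
    mu = {w: sum(F[u][w] for u in verts) / total for w in verts}
    return sum(
        d[u] * sum((F[u][w] / d[u] - mu[w]) ** 2 for w in verts) for u in verts
    )


def matching_step(F, d, matching):
    """One flow-matrix update along the matching edges (each edge listed once)."""
    verts = list(F)
    G = {u: dict(F[u]) for u in verts}
    for (u, v), c in matching.items():
        for w in verts:
            diff = F[v][w] / d[v] - F[u][w] / d[u]
            G[u][w] += 0.5 * c * diff
            G[v][w] -= 0.5 * c * diff
    return G


def edge_bound(F, d, matching):
    """(1/2) * sum over matching edges of c * ||F(u)/d(u) - F(v)/d(v)||^2."""
    verts = list(F)
    return 0.5 * sum(
        c * sum((F[u][w] / d[u] - F[v][w] / d[v]) ** 2 for w in verts)
        for (u, v), c in matching.items()
    )
```

On two vertices of unit weight joined by a unit matching edge, starting from the identity matrix, the potential drops from $1$ to $0$ and the bound evaluates to $1$ , so the inequality is tight in that instance.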

Finally, we prove Lemma A.13, the main lemma of Section A.3.

Proof.

By Lemma A.17, we have

\displaystyle\psi(t-1)-\psi(t)\geq\displaystyle\frac{1}{2}\sum_{u,v\in A}c_{M_{t}}(u,v)\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}.

(6)

By Lemma A.6, we have

\displaystyle\mathbb{E}[(\mathbf{p}(v)-\bar{\mu}_{t-1})^{2}]=\frac{1}{n}\left\lVert\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2}

(7)

for all vertices $v\in A$ , and for some constant $C>0$ ,

\displaystyle\mathbb{E}[(\mathbf{p}(u)-\mathbf{p}(v))^{2}]\leq\frac{C\log n}{n}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}

(8)

for all pairs $u,v\in A$ . By Claim A.14, we have $\deg_{M_{t}}(u)=\mathbf{d}(u)/12$ for all $u\in A_{t}^{\ell}\setminus S_{t}$ , and together with statements a to c of Lemma A.5, we have

$\displaystyle\sum_{u\in A_{t}^{\ell},v\in A_{t}^{r}}c_{M_{t}}(u,v)\cdot(\mathbf{p}(u)-\mathbf{p}(v))^{2}$	$\displaystyle\stackrel{\textup{(a)}}{\geq}\sum_{u\in A_{t}^{\ell},v\in A_{t}^{r}}c_{M_{t}}(u,v)\cdot(\mathbf{p}(u)-\eta)^{2}$
	$\displaystyle=\sum_{u\in A_{t}^{\ell}}\deg_{M_{t}}(u)\cdot(\mathbf{p}(u)-\eta)^{2}$
	$\displaystyle=\frac{1}{12}\sum_{u\in A_{t}^{\ell}}\mathbf{d}(u)\cdot(\mathbf{p}(u)-\eta)^{2}$
	$\displaystyle\stackrel{\textup{(c)}}{\geq}\frac{1}{24}\sum_{u\in A_{t-1}}\mathbf{d}(u)\cdot(\mathbf{p}(u)-\bar{\mu}_{t-1})^{2}.$	(9)

Taking the expectation and putting everything together,

	$\displaystyle\mathbb{E}[\psi(t-1)-\psi(t)]$	$\displaystyle\stackrel{(6)}{\geq}\mathbb{E}\bigg{[}\displaystyle\frac{1}{2}\sum_{u,v\in A}c_{M_{t}}(u,v)\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}\bigg{]}$
		$\displaystyle=\mathbb{E}\bigg{[}\displaystyle\sum_{u\in A_{t}^{\ell},v\in A_{t}^{r}}c_{M_{t}}(u,v)\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}\bigg{]}$
		$\displaystyle\stackrel{(8)}{\geq}\mathbb{E}\bigg{[}\frac{n}{C\log n}\sum_{u\in A_{t}^{\ell},v\in A_{t}^{r}}c_{M_{t}}(u,v)\cdot(\mathbf{p}(u)-\mathbf{p}(v))^{2}\bigg{]}$
		$\displaystyle\stackrel{(9)}{\geq}\mathbb{E}\bigg{[}\frac{n}{24C\log n}\sum_{u\in A_{t-1}}\mathbf{d}(u)\cdot(\mathbf{p}(u)-\bar{\mu}_{t-1})^{2}\bigg{]}$
		$\displaystyle=\frac{n}{24C\log n}\sum_{u\in A_{t-1}}\mathbf{d}(u)\cdot\mathbb{E}[(\mathbf{p}(u)-\bar{\mu}_{t-1})^{2}]$
		$\displaystyle\stackrel{(7)}{=}\frac{n}{24C\log n}\sum_{u\in A_{t-1}}\mathbf{d}(u)\cdot\frac{1}{n}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2}$
		$\displaystyle=\frac{1}{24C\log n}\cdot\psi(t-1),$

which concludes the proof of Lemma A.13. ∎

A.4 Putting Everything Together

Finally, we prove correctness of the algorithm by establishing properties (1) to (3) in Theorem 5.11. Properties (1) and (2) follow directly from Lemma A.1. For property (3), if the algorithm terminates early, then $\mathbf{d}(R)\geq\mathbf{d}(A)/(6T)$ by the termination condition and property (3) is satisfied. Otherwise, we have $\mathbf{d}(R_{T})<\mathbf{d}(A)/(6T)$ . By Lemma A.13, the potential $\psi(t)$ for any iteration $t$ is at most $1-\Omega(1/\log n)$ times the previous potential $\psi(t-1)$ in expectation. Over $T=O(\log^{2}(nW))$ iterations, the expected potential $\psi(T)$ is at most $1/(nW)^{2C}$ , where the constant $C>0$ can be made arbitrarily large by setting the constant in $T=O(\log^{2}(nW))$ appropriately. By Markov’s inequality, we have $\psi(T)\leq 1/(nW)^{C}$ with probability at least $1-1/n^{C}$ , in which case Lemma A.2 implies that vertex weighting $\mathrm{deg}_{F}|_{A_{t}}$ mixes in $G$ with congestion $5T/\phi=O(\phi^{-1}\log^{2}(nW))$ , fulfilling property (3).
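The choice of $T$ can be made explicit with a short calculation. The sketch below writes $c>0$ for the constant hidden in the $\Omega(\cdot)$ of Lemma A.13 and assumes $\psi(0)\leq(nW)^{O(1)}$ and $W\geq 1$ ; these are assumptions introduced here for illustration rather than statements taken from the text.

```latex
\begin{align*}
\mathbb{E}[\psi(T)]
  &\leq \Big(1-\frac{c}{\log n}\Big)^{T}\psi(0)
   \leq \exp\Big(-\frac{cT}{\log n}\Big)\,\psi(0)
   \leq (nW)^{-2C}
  && \text{once } T\geq\frac{\log n}{c}\big(2C\ln(nW)+\ln\psi(0)\big),\\
\Pr\big[\psi(T)\geq(nW)^{-C}\big]
  &\leq (nW)^{C}\,\mathbb{E}[\psi(T)]
   \leq (nW)^{-C}\leq n^{-C}
  && \text{by Markov's inequality.}
\end{align*}
```

Under the stated assumption on $\psi(0)$ , the threshold on $T$ is $O(\log n\cdot\log(nW))=O(\log^{2}(nW))$ , matching the iteration count used above.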

It remains to bound the running time. For each iteration, the projections $\mathbf{p}(v)=\langle\mathbf{F}_{t-1}(v)/\mathbf{d}(v),\,\mathbf{r}\rangle$ for each vertex $v\in A$ can be computed in total time $O(mT)$ by Claim A.12. Computing the values $A_{t}^{\ell},A_{t}^{r}$ takes additional time $O(|A_{t-1}|\log|A_{t-1}|)=O(m\log m)$ by Lemma A.5. The algorithm makes one call to Theorem 5.10 with the parameters in (3), and then computes flow $f_{t}$ and the matching graph $M_{t}$ in $O(m\log m)$ time by Claims A.9 and A.11. Overall, the algorithm runs in $O(m\log^{2}(nW))$ time per iteration for $T=O(\log^{2}(nW))$ many iterations, which is $O(m\log^{4}(nW))$ time in total. The algorithm also makes $T=O(\log^{2}(nW))$ many calls to Theorem 5.10.

With both properties (1) to (3) and the running time established, this concludes the proof of Theorem 5.11.

Appendix B Expander Trimming

In this section, we prove Theorem 5.12.

Our setup closely resembles Section 3.2 of [34], except that we replace the exact min-cut/max-flow oracle in their “slow-trimming” procedure with an approximate one. We also avoid defining expanders and work exclusively with flow mixing, which simplifies the analysis.

Define the vertex weighting $\mathbf{s}_{0}\in\mathbb{R}^{A}_{\geq 0}$ as $\mathbf{s}_{0}(v)=c_{G}(\{v\},R)$ for all $v\in A\setminus R$ , and $\mathbf{s}_{0}(v)=0$ for all $v\in R$ . We call Theorem 5.10 with parameters

\displaystyle A\leftarrow A,\,\epsilon\leftarrow\epsilon,\,\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow\max\{1,(12\phi+\epsilon\gamma)(\kappa+2)\},\,\,\mathbf{s}\leftarrow\mathbf{s}_{0}+\epsilon\phi\mathbf{d},\,\mathbf{t}\leftarrow 12\phi\mathbf{d}|_{A\setminus R},

and we denote the graph $G[A,\gamma,\mathbf{s},\mathbf{t}]$ by $H=(V_{H},E_{H})$ . Note that except for $\epsilon$ and $\mathbf{s}$ , these parameters match the parameters (3) in the proof of Theorem 5.11, where $A_{t}^{\ell}$ and $A_{t}^{r}$ are replaced by $R$ and $A\setminus R$ , respectively. Since assumption 1 is the same as the one in Theorem 5.11, the statement and proof of Claim A.10 (which do not depend on $\epsilon$ or $\mathbf{s}$ ) translate directly to this setting. For brevity, we omit the details, and simply restate Claim A.10 below.
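As a concrete reading of the parameter choices above, the following Python sketch computes $\gamma$ , $\beta$ , $\mathbf{s}$ , and $\mathbf{t}$ from an edge-capacity dictionary. The function name `trimming_parameters` and the input format are assumptions of this example, and the sketch does not build the graph $H$ itself.

```python
def trimming_parameters(cap, A, R, d, eps, phi, kappa):
    """Compute gamma, beta, s, and t for the call to Theorem 5.10.

    cap: dict (u, v) -> capacity of the undirected edge {u, v}, each edge once.
    A, R: vertex sets with R a subset of A; d: dict v -> weight d(v) on A.
    """
    gamma = eps * phi / 2.0
    beta = max(1.0, (12.0 * phi + eps * gamma) * (kappa + 2.0))
    # s_0(v) = c_G({v}, R) for v in A minus R, and 0 for v in R.
    s0 = {v: 0.0 for v in A}
    for (u, v), c in cap.items():
        if u in A and v in A:
            if u in R and v not in R:
                s0[v] += c
            elif v in R and u not in R:
                s0[u] += c
    # s = s_0 + eps * phi * d, and t = 12 * phi * d restricted to A minus R.
    s = {v: s0[v] + eps * phi * d[v] for v in A}
    t = {v: (12.0 * phi * d[v] if v not in R else 0.0) for v in A}
    return gamma, beta, s, t
```

Summing `s0` over $A$ gives $c_{G}(A\setminus R,R)=\delta_{G[A]}R$ , the quantity that appears in the bound for property (1).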

Claim B.1.

Assumption 1 of Theorem 5.10 holds for our choice of parameters $\epsilon,\gamma,\beta$ . That is, we have $\mathbf{t}(C\cap A)+\epsilon\gamma\cdot\delta_{G}(C\cap A)\leq\beta\cdot\delta_{G}C$ for all $C\in\mathcal{C}$ .

From Theorem 5.10, we obtain a $(1+\epsilon)$ -approximate fair cut/flow pair $(S,f)$ . The algorithm sets $B\leftarrow S\setminus\{s,x\}$ , which trivially takes $O(|A|)$ time outside the call. It remains to show that properties (1) to (4) are satisfied.

To show property (1), we start with $\delta_{G}B\leq\delta_{H}S$ and proceed by bounding $\delta_{H}S$ . By Fact 5.8, the cut value $\delta_{H}S$ is at most $(1+\epsilon)$ times the minimum $(s,t)$ -cut. Since the $(s,t)$ -minimum cut is at most $\delta_{H}\{s\}=\mathbf{s}(A)$ , we have

\delta_{H}S\leq(1+\epsilon)\mathbf{s}(A)=(1+\epsilon)(\mathbf{s}_{0}(A)+\epsilon\phi\mathbf{d}(A))\leq 2\mathbf{s}_{0}(A)+2\epsilon\phi\mathbf{d}(A).

By definition of $\mathbf{s}_{0}$ , we have $\mathbf{s}_{0}(A)=\sum_{v\in A\setminus R}c_{G}(\{v\},R)=c_{G}(A\setminus R,R)=\delta_{G[A]}R$ . Putting everything together,

\delta_{G}B\leq\delta_{H}S\leq 2\mathbf{s}_{0}(A)+2\epsilon\phi\mathbf{d}(A)=2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A),

fulfilling property (1). By construction of $H$ , we also have $c_{H}(S,\{t\})=12\phi\mathbf{d}|_{A\setminus R}(B)$ , so

12\phi\mathbf{d}(B\setminus R)=12\phi\mathbf{d}|_{A\setminus R}(B)=c_{H}(S,\{t\})\leq\delta_{H}S\leq 2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A),

and dividing by $12\phi$ establishes property (2).

The proofs of properties (3) and (4) are longer, so we package them into the two claims below.

Claim B.2.

Property (4) holds, i.e., there exists a vector $\mathbf{t}\in\mathbb{R}^{A}_{\geq 0}$ with $\mathbf{t}\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)}$ and a flow $g$ on $G[A\setminus(R\cup B)]$ routing demand $\textup{deg}_{\partial_{G[A]}(R\cup B)}|_{A\setminus(R\cup B)}-\mathbf{t}$ with congestion $2$ .

Proof.

Since $(S,f)$ is a $(1+\epsilon)$ -fair cut/flow pair, each vertex $v\in V_{H}\setminus(S\cup\{t\})$ receives at least $\frac{1}{1+\epsilon}\,c_{H}(\{v\},S)\geq\frac{1}{2}c_{H}(\{v\},S)$ total flow from vertices in $S$ . Since $A\setminus(R\cup B)\subseteq V_{H}\setminus(S\cup\{t\})$ , the same holds for all $v\in A\setminus(R\cup B)$ . By construction of $H$ , we have the following for all $v\in A\setminus(R\cup B)$ :

	$\displaystyle c_{H}(\{v\},S)$	$\displaystyle\geq c_{H}(v,s)+c_{H}(\{v\},B)$
		$\displaystyle=\mathbf{s}_{0}(v)+\epsilon\phi\mathbf{d}(v)+c_{G}(\{v\},B)$
		$\displaystyle=c_{G}(\{v\},R)+\epsilon\phi\mathbf{d}(v)+c_{G}(\{v\},B)$
		$\displaystyle\geq c_{G}(\{v\},R\cup B)+\epsilon\phi\mathbf{d}(v)$
		$\displaystyle=\deg_{\partial_{G[A]}(R\cup B)}(v)+\epsilon\phi\mathbf{d}(v).$

It follows that each vertex $v\in A\setminus(R\cup B)$ receives at least $\frac{1}{2}(\deg_{\partial_{G[A]}(R\cup B)}(v)+\epsilon\phi\mathbf{d}(v))$ total flow from vertices in $R\cup B$ .

We now investigate the effect of restricting the flow $f$ to edges in $G[A\setminus(R\cup B)]$ , starting with removing all edges incident to $R\cup B$ . Continuing the argument above, removing these edges causes each vertex $v\in A\setminus(R\cup B)$ to be the source of at least $\frac{1}{2}(\deg_{\partial_{G[A]}(R\cup B)}(v)+\epsilon\phi\mathbf{d}(v))$ flow.

If $x\notin S$ , then we now remove the edges incident to $x$ . By construction of $H$ , each vertex $v\in A$ has an edge to $x$ of capacity $\gamma\mathbf{d}(v)=\frac{\epsilon\phi}{2}\mathbf{d}(v)$ . Since the flow $f$ is feasible, there is at most $\frac{\epsilon\phi}{2}\mathbf{d}(v)$ flow along the edge $(v,x)$ . Removing this edge changes the net flow out of $v$ by at most $\frac{\epsilon\phi}{2}\mathbf{d}(v)$ . Since each vertex $v\in A\setminus(R\cup B)$ is the source of at least $\frac{1}{2}(\deg_{\partial_{G[A]}(R\cup B)}(v)+\epsilon\phi\mathbf{d}(v))$ flow before this step, it is the source of at least $\frac{1}{2}\deg_{\partial_{G[A]}(R\cup B)}(v)$ flow after this step.

At this point, only the vertices $A\setminus(R\cup B)$ and $t$ remain, and each vertex $v\neq t$ is the source of at least $\frac{1}{2}\deg_{\partial_{G[A]}(R\cup B)}(v)$ flow. Scale the flow by factor $2$ , take a path decomposition of the flow, and remove paths until each vertex $v\neq t$ is the source of exactly $\deg_{\partial_{G[A]}(R\cup B)}(v)$ flow. We may assume that the flow does not send any flow away from $t$ (i.e., along any edge $(v,t)$ in the direction from $t$ to $v$ ), since otherwise we can cancel such flow using the path decomposition.

Finally, we remove the edges incident to $t$ . Since the original flow $f$ is feasible, the scaled flow has congestion $2$ , so each edge $(v,t)$ carries at most $2c_{H}(v,t)$ flow. By construction of $H$ , we have $c_{H}(v,t)=12\phi\mathbf{d}(v)$ for all $v\in A\setminus R$ , so each vertex $v\in A\setminus(R\cup B)$ receives a net flow of at most $24\phi\mathbf{d}(v)$ from vertices other than $t$ (and then sends that flow to $t$ ). Let $\mathbf{t}(v)\leq 24\phi\mathbf{d}(v)$ be the net flow received, so that the vector $\mathbf{t}\in\mathbb{R}^{A\setminus(R\cup B)}_{\geq 0}$ satisfies $\mathbf{t}\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)}$ . Removing the edge $(v,t)$ increases the net flow into $v$ by exactly $\mathbf{t}(v)$ . Since each vertex $v\in A\setminus(R\cup B)$ is the source of exactly $\deg_{\partial_{G[A]}(R\cup B)}(v)$ flow before removing the edge $(v,t)$ , the vertex $v$ has net flow $\deg_{\partial_{G[A]}(R\cup B)}(v)-\mathbf{t}(v)$ after removal. In other words, the new flow routes demand $\deg_{\partial_{G[A]}(R\cup B)}(v)-\mathbf{t}(v)$ and has congestion $2$ . By construction, it is also restricted to $G[A\setminus(R\cup B)]$ , concluding the proof. ∎

Claim B.3.

Property (3) holds, i.e., if the vertex weighting $\mathbf{d}|_{A\setminus R}$ mixes in $G[A]$ with congestion $c$ , then the vertex weighting $(\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)})|_{A\setminus(R\cup B)}$ mixes in $G[A]$ with congestion $2+(1+24\phi)c$ .

Proof.

Consider any demand $\mathbf{b}\in\mathbb{R}^{A}$ satisfying $|\mathbf{b}|\leq(\mathbf{d}+\deg_{\partial_{G}(R\cup B)})|_{A\setminus(R\cup B)}$ . In particular, $\sum_{v\in R\cup B}\mathbf{b}(v)=0$ . To fulfill property (3), we want to construct a single-commodity flow routing demand $\mathbf{b}$ with congestion $2+(1+24\phi)c$ .

We first split $\mathbf{b}$ into demands $\mathbf{x}$ and $\mathbf{b}-\mathbf{x}$ , where vector $\mathbf{x}\in\mathbb{R}^{A}_{\geq 0}$ is defined as

\mathbf{x}(v)=\begin{cases}\displaystyle\frac{\deg_{\partial_{G}(R\cup B)}(v)}{\mathbf{d}(v)+\deg_{\partial_{G}(R\cup B)}(v)}\cdot\mathbf{b}(v)&\text{if }v\in A\setminus(R\cup B)\text{ and }\mathbf{d}(v)+\deg_{\partial_{G}(R\cup B)}(v)>0,\\ 0&\text{otherwise}.\end{cases}

Since $|\mathbf{b}|\leq(\mathbf{d}+\deg_{\partial_{G}(R\cup B)})|_{A\setminus(R\cup B)}$ , we have $|\mathbf{x}|\leq\text{deg}_{\partial_{G}(R\cup B)}|_{A\setminus(R\cup B)}$ and $|\mathbf{b}-\mathbf{x}|\leq\mathbf{d}|_{A\setminus(R\cup B)}$ .
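The split of $\mathbf{b}$ into $\mathbf{x}$ and $\mathbf{b}-\mathbf{x}$ can be checked numerically. The following is a minimal sketch, not part of the paper's algorithm: the function name `split_demand` and the dictionary representation of the vectors are assumptions made for illustration, and only vertices of $A\setminus(R\cup B)$ are stored (all other entries are zero).

```python
from fractions import Fraction


def split_demand(b, d, deg):
    """Split demand b into (x, b - x) following the case definition of x.

    b, d, deg are dicts keyed by the vertices of A minus (R union B);
    deg[v] stands for the boundary degree of v, d[v] for its weight.
    Hypothetical helper for illustration only.
    """
    x, rest = {}, {}
    for v in b:
        total = d[v] + deg[v]
        # x(v) is the share of b(v) proportional to the boundary degree.
        x[v] = Fraction(deg[v]) * b[v] / total if total > 0 else Fraction(0)
        rest[v] = b[v] - x[v]
    return x, rest


# Small example with |b| <= d + deg entrywise and b summing to zero.
d = {"u": 3, "v": 1, "w": 0}
deg = {"u": 1, "v": 2, "w": 0}
b = {"u": 2, "v": -2, "w": 0}
x, rest = split_demand(b, d, deg)
for v in b:
    assert abs(x[v]) <= deg[v]   # |x| <= deg
    assert abs(rest[v]) <= d[v]  # |b - x| <= d
```

Exact rationals are used so that the two bounds are verified without floating-point error.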

Take the flow $g$ from property (2), and compute a path decomposition of the flow. For each path in the decomposition from vertex $u\in A\setminus(R\cup B)$ to vertex $v\in A\setminus(R\cup B)$ of capacity $c$ , route $\frac{\mathbf{x}(u)}{\deg_{\partial_{G}(R\cup B)}(u)}\cdot c$ flow along the path from $u$ to $v$ (or $-\frac{\mathbf{x}(u)}{\deg_{\partial_{G}(R\cup B)}(u)}\cdot c$ flow in the reversed direction, whichever is nonnegative). Since $|\mathbf{x}(u)|\leq\deg_{\partial_{G}(R\cup B)}(u)$ , we send at most $c$ flow along each such path of capacity $c$ , so our new flow $g_{1}$ has congestion at most that of $g$ , which is $2$ . Each vertex $v\in A\setminus(R\cup B)$ is the end of at most $\mathbf{t}(v)\leq 24\phi\mathbf{d}(v)$ total capacity of paths in the decomposition, and for each such path of capacity $c$ , the vertex $v$ receives a net flow of at most $c$ in absolute value. It follows that each vertex $v\in A\setminus(R\cup B)$ receives at most $24\phi\mathbf{d}(v)$ net flow in absolute value. Setting $\mathbf{y}(v)$ as this net flow, we obtain $|\mathbf{y}|\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)}$ and that flow $g_{1}$ routes demand $\mathbf{x}-\mathbf{y}$ .
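The rescaling along the path decomposition can be illustrated on a toy instance. This is a hedged sketch under stated assumptions: the decomposition is given as a list of hypothetical `(u, v, cap)` triples, the paths leaving `u` have total capacity `deg[u]`, the paths ending at `v` have total capacity `t[v]`, and the helper name `reroute` is invented for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def reroute(paths, x, deg):
    """Rescale each path of the decomposition by x(u) / deg(u).

    paths: list of (u, v, cap) triples, one per path from u to v.
    Returns (flows, y): flows[i] is the signed amount sent along paths[i],
    and y[v] is the net flow that v receives as a path endpoint.
    """
    flows = []
    y = defaultdict(Fraction)
    for u, v, cap in paths:
        amount = Fraction(x[u]) * cap / deg[u]  # negative means reversed
        flows.append(amount)
        y[v] += amount
    return flows, dict(y)


# Toy decomposition: paths out of "a" total 3 = deg["a"], out of "b" total 1.
deg = {"a": 3, "b": 1}
t = {"c": 3, "d": 1}  # total path capacity ending at each endpoint
paths = [("a", "c", 2), ("a", "d", 1), ("b", "c", 1)]
x = {"a": Fraction(3, 2), "b": -1}  # satisfies |x(u)| <= deg(u)

flows, y = reroute(paths, x, deg)
# No path carries more than its capacity, so congestion does not increase.
assert all(abs(f) <= cap for f, (_, _, cap) in zip(flows, paths))
# Each endpoint absorbs at most t(v) in absolute value.
assert all(abs(y[v]) <= t[v] for v in y)
```

In this sketch the flow leaving each start vertex `u` sums to `x[u]`, which matches the statement that the rescaled flow routes demand $\mathbf{x}-\mathbf{y}$ .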

Having routed demand $\mathbf{x}-\mathbf{y}$ through flow $g_{1}$ , it remains to route demand $\mathbf{b}-\mathbf{x}+\mathbf{y}$ . We bound

|\mathbf{b}-\mathbf{x}+\mathbf{y}|\leq|\mathbf{b}-\mathbf{x}|+|\mathbf{y}|\leq\mathbf{d}|_{A\setminus(R\cup B)}+24\phi\mathbf{d}|_{A\setminus(R\cup B)}\leq(1+24\phi)\mathbf{d}|_{A\setminus(R\cup B)}.

By assumption, the vertex weighting $\mathbf{d}|_{A\setminus R}$ mixes in $G[A]$ with congestion $c$ . Since $|\mathbf{b}-\mathbf{x}+\mathbf{y}|/(1+24\phi)\leq\mathbf{d}|_{A\setminus(R\cup B)}\leq\mathbf{d}|_{A\setminus R}$ , there is a flow routing demand $(\mathbf{b}-\mathbf{x}+\mathbf{y})/(1+24\phi)$ with congestion $c$ . Scaling this flow by a factor of $1+24\phi$ , we obtain a flow $g_{2}$ routing demand $\mathbf{b}-\mathbf{x}+\mathbf{y}$ with congestion $(1+24\phi)c$ . Summing the two flows, the final flow $g_{1}+g_{2}$ routes demand $(\mathbf{x}-\mathbf{y})+(\mathbf{b}-\mathbf{x}+\mathbf{y})=\mathbf{b}$ with congestion $2+(1+24\phi)c$ , concluding the proof. ∎

\begin{aligned}\|\mathbf{b}_{i}(v)\|&=\|\mathbf{b}_{i-1}(v)-\mathbf{s}(v)+\mathbf{t}(v)\|\\ &=\left\|\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\mathbf{b}_{i-1}(v)+\mathbf{t}(v)\right\|\\ &\leq\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\|\mathbf{b}_{i-1}(v)\|+\|\mathbf{t}(v)\|\\ &\leq\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\cdot i\deg_{\partial\mathcal{R}_{\geq i}}(v)+\|\mathbf{t}(v)\|\\ &=i\cdot\deg_{\partial\mathcal{R}_{\geq i+1}}(v)+\|\mathbf{t}(v)\|.\end{aligned}
