This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Congestion-Approximators from the Bottom Up

Jason Li Carnegie Mellon University. email: [email protected]    Satish Rao UC Berkeley. email: [email protected]    Di Wang Google Research. email: [email protected]
Abstract

We develop a novel algorithm to construct a congestion-approximator with polylogarithmic quality on a capacitated, undirected graph in nearly-linear time. Our approach is the first bottom-up hierarchical construction, in contrast to previous top-down approaches including that of Räcke, Shah, and Taubig (SODA 2014), the only other construction achieving polylogarithmic quality that is implementable in nearly-linear time (Peng, SODA 2016). Similar to Räcke, Shah, and Taubig, our construction at each hierarchical level requires calls to an approximate max-flow/min-cut subroutine. However, the main advantage to our bottom-up approach is that these max-flow calls can be implemented directly without recursion. More precisely, the previously computed levels of the hierarchy can be converted into a pseudo-congestion-approximator, which then translates to a max-flow algorithm that is sufficient for the particular max-flow calls used in the construction of the next hierarchical level. As a result, we obtain the first non-recursive algorithms for congestion-approximator and approximate max-flow that run in nearly-linear time, a conceptual improvement to the aforementioned algorithms that recursively alternate between the two problems.

1 Introduction

The famous max-flow min-cut theorem111For the sake of this introduction, a flow problem on a graph is a set of excesses and deficits on vertices and a solution is a set of paths connecting excess and deficits where no edge is used in more than one path. implies that any set of excesses or deficits in a graph can be connected by disjoint paths if and only if every cut in the graph has more capacity in the edges that cross the cut than the “demand” that is required to cross it. One direction is easy to see (if a cut does not have enough capacity, clearly one cannot route demand paths across it), and the other can be established by linear programming duality (or explicitly using an algorithm as was done by Ford and Fulkerson in 1956 [12].)

Framed slightly differently, a cut where the ratio of the demand across it to the capacity of the cut is cc implies that any flow satisfying the demands requires at least cc units of flow go through some edge. Again, the max-flow min-cut theorem implies that there is always such a cut if the optimal flow has congestion cc. Here, the congestion of a flow is the maximum flow that is routed on any edge in the graph.

More generally, the set of feasible demands in a graph is captured by the (exponentially sized) set of all cuts in the graph. Remarkably, it was shown that polynomially many cuts could approximately (within logarithmic factors) characterize the congestion of any set of demands [31, 32]. Indeed, the total size of cuts was nearly-linear222Nearly-linear means O(mlogcn)O(m\log^{c}n) where cc is a constant. Almost-linear means O(m1+ϵ)O(m^{1+\epsilon}) for some ϵ=o(1)\epsilon=o(1). in [31, 33] and could be computed in nearly-linear time as announced in 2014 by Peng [29].

More precisely, an α\alpha-congestion-approximator for a graph is a set of cuts where for every set of demands, the minimum ratio (over cuts in the α\alpha-congestion-approximator) of the demands crossing the cut and the capacity of the cut determines the minimum congestion of the optimal flow (that routes the demands) to within an α\alpha factor.

In this paper, we provide an α\alpha-congestion-approximator (along with an routing scheme) that can be constructed in nearly-linear time with an α\alpha of O(log10n)O(\log^{10}n). The number of logarithmic factors is not optimized but the structure is significantly simpler than the construction by Peng [29].

The top-down frame of the decomposition of [33] requires that one compute approximate maximum flows at the very top level which entailed a non-linear runtime. A breakthrough result of Sherman [36] showed how to use congestion-approximators to find approximate maximum flows.333Simultaneously, Kelner et. al. [17] used a dual version of congestion approximators called oblivous routers to give efficient algorithms for approximate maximum flow. Peng [30] used the result of Sherman [36] inside the top-down partitioning method based on [33] for constructing congestion-approximators. There is a chicken and egg problem here as the two methods need each other, and a costly recursion was used to combine them based on ultra-sparsifiers which we discuss a bit more below.

Our result proceeds in a bottom-up iterative fashion; indeed, the bottom level is a “weak expander decomposition” which can be computed using trivial congestion-approximators consisting of singleton vertices.444We note expander decompositions themselves have found wide application, including in the breakthrough near linear time algorithm of [8] for maximum flow. Also, the high-level idea of our congestion-approximator is similar to hierachical expander decomposition [13], and our techniques may be useful in getting from almost linear to near linear. Then we proceed level by level, using the lower level clusterings as “pseudo”-congestion-approximators for the next. The frame is simple, but “around the edges” both literally (the edges of the clusters) and metaphorically there are technical issues which require some attention.

Still, given the power that congestion-approximators provide with respect to understanding the structure of graphs, finding more efficient constructions is important and an effective bottom-up or clustering approach seems a natural path to follow. We make the first and a substantive step on this path in the decade since the announcement of Peng’s also polylogarithmic nearly linear time algorithm [29].

Recent progress on maximum flow can be compared to developments in nearly-linear time Laplacian solvers [38], which were initially very complex with many logarithmic factors. But over the years, new tools were developed that both improved the running times and allowed for better understanding of graphs as well as having broader application. See, for example, [21, 22, 10, 16]. Progress has been much slower for maximum flow in comparison. Perhaps one reason is that solutions to Laplacian linear systems are 2\ell_{2} optimizing flows that involve the Euclidean norm which is a fair bit simpler than the \ell_{\infty} norm central to maximum flow or norms based on more general convex bodies. We believe this result to be a step in the process of understanding maximum flows better.

We proceed with a discussion of previous work which consists of a remarkable series of developments.

1.1 Previous work

The maximum flow, minimum cut theorem is nice and remarkable in that one can find a single cut that establishes the optimality of the flow or “routing”, and the optimal routing establishes a tight lower bound on the size of any cut.555A wrinkle is that there could be many (even an exponential) number of minimum cuts or feasible optimal flows. But any optimal routing or any optimal cut establishes the optimality of the “dual” object.

We note that this theorem was extended to multi-commodity flow in an approximate sense in [24], [20] and [26] where flows and corresponding cuts are shown to be related by an O(logn)O(\log n) factor. The first paper gives a method for finding the sparsity or conductance of a graph to within an O(logn)O(\log n) factor by using multicommodity flow to embed a complete graph, and the others extend the techniques to give approximations between cuts and solutions for arbitrary multicommodity flow instances. In a parallel thread, the mixing times of random walks or eigenvalues (which involve flow problems with an 2\ell_{2} objective) are related to cuts via classical and tremendously impactful results of Cheeger [7]. Combining eigenvalues and linear programming methods through semidefinite programming yields improved approximate relationships between embeddings and cuts of roughly O(logn)O(\sqrt{\log n}) [5, 4]. Fast versions of these methods involve the cut-matching game developed in [2, 18, 28, 35] and used directly in this work.

As we noted previously, one can view the (exponentially sized) set of feasible sets of demands in a graph as being a very general measure of a graph’s capabilities. And as also mentioned above, Räcke in [31] showed that a polynomial sized set of cuts and corresponding pre-computed routings approximately model the congestion required for any of (exponentially many) sets of possible demands.

A bit more formally, [31] provides an oblivious routing scheme, where the scheme can obliviously route any set of demands with no more than α\alpha times as much congestion as the optimal routing. Here, oblivious roughly means that for any demand pair the routing is done without considering any other demand pair. [31] also provides a decomposition where for any set of demands the maximum congestion (i.e., ratio of demand crossing to edge capacity) on any cut in the decomposition is within a factor of α\alpha of the congestion needed to route those demands.

The value for α\alpha in [31] is O(log3n)O(\log^{3}n) but was non-constructive: however, constructive and improved schemes were quickly developed with α\alpha of O(log2nloglogn)O(\log^{2}n\log\log n) in [6, 15].

An alternative scheme also by Räcke [32], gave a remarkably simple oblivious routing scheme, that consisted of O(m)O(m) trees. The oblivious routing was simply to route any demand by splitting the flow among the trees. Of course, each tree implicitly corresponds to a laminar family of cuts. Still the total size of both the routings and the implicit total size of the cuts is quadratic or worse. Moreover, the time complexity for its construction was also at least quadratic. The approximation factor α\alpha for this scheme was O(logn)O(\log n). We will refer to this as Räcke’s tree scheme.

Madry, in [27], gave a nearly-linear time algorithm that produced a congestion-approximator which has almost-linear size and running time at the cost of having approximation factor of O(nϵ)O(n^{\epsilon}) that is based on Räcke tree scheme. A central idea is the use of ultrasparsifiers which were introduced by Spielman and Teng [38] in their breakthrough results on linear time solvers for Laplacian linear systems. An ultrasparsifier is formed by taking a certain kind of spanning tree (called low-stretch), making clusters from small connected components of the tree, and sampling a very small set of non-tree edges between the clusters. Such a graph approximates the cuts to within a small factor in the sense that cuts in the original graph and the new one have approximately the same size. This allows a (complicated) recursion where one use several ultrasparsifiers to approximate the cuts in the graph and recursively produce oblivious routing schemes for each of the ultrasparsifiers. Again, this approach is more efficient in time than Räcke’s tree scheme at the cost of a worse approximation factor. The scheme enabled the approximate solution to a host of problems in almost linear time. This contribution was remarkable.

In sum, Madry [27] produced a α=O(nϵ)\alpha=O(n^{\epsilon})-congestion approximator in the almost linear running time of O(m1+ϵ)O(m^{1+\epsilon}).666The ϵ\epsilon is subconstant, roughly 1/logn1/\sqrt{\log n} which means an mϵm^{\epsilon} factor is larger than any polylogarithmic factor. The obstruction to getting to polylogarithmic overheads in Madry’s scheme is the need to recurse on several ultrasparsifiers to keep the approximation from blowing up during the recursion.

In another exciting development, Sherman [36] used Madry’s [27] structure to produce an almost-linear time algorithm for 1+ϵ1+\epsilon-approximate undirected maximum flow. This is remarkable in that he reduced Madry’s approximation factor of O(nϵ)O(n^{\epsilon}) to (1+ϵ)(1+\epsilon). The approximation factor in Madry’s construction translates into a running time overhead, which results in the almost-linear running time we mentioned. We point out that there is close relationship to the work of Spielman and Teng [38], who used ultrasparsifiers to spectrally approximate a Laplacian system to provide algorithms whose running times depend on the approximation quality while obtaining arbitrarily precise solutions. In that case, the dependence of running time on precision was logarithmic whereas Sherman’s algorithm suffers a O(1/ϵ2)O(1/\epsilon^{2}) dependence on the error ϵ\epsilon. To be sure, in solving linear systems the idea of using an “approximate” graph in improving condition numbers was known as preconditioning, but for maximum flow or 1\ell_{1} optimization this was a striking development.

Sherman’s method (could be seen) to use the multiplicative weights framework (see [3]) to route flow across cuts so that every cut has near zero residual demand across it, in particular, δ=ϵ/α\delta=\epsilon/\alpha fraction of its capacity. The congestion approximator then ensures that the remaining flow can be routed with αδ=ϵ\alpha\cdot\delta=\epsilon congestion and one can recurse with slightly more than ϵ\epsilon extra congestion.

We note that Sherman’s result (with Madry’s construction) bypasses longstanding barriers to faster maximum flow algorithms. Since the work of Dinitz [11] in 1973, algorithms had to pay either for path length or for number of paths. Dinitz himself traded this off to get a O(m3/2)O(m^{3/2}) flow algorithm [9, 23]. Using linear time solvers to do electrical flow, one could eke out a few more paths simultaneously, but one still was pretty stuck at O(m4/3)O(m^{4/3}). The congestion-approximator and the multiplicative weights optimization method bypasses these obstructions.

In 2014, Räcke, Shah, and Taubig [33] made progress on Räcke’s decomposition based approach. In particular, they showed how to produce a decomposition with approximation parameter of O(log4n)O(\log^{4}n) using maximum flow computations of total size O(mlog5n)O(m\log^{5}n). They use the same frame as Räcke [31], but use single commodity flows in the context of the cut-matching game [19, 28] which, as previously mentioned, can replace multicommodity flows in finding sparse cuts and routings.

Note, that a congestion-approximator can be used to compute flows efficiently where flows can now be used to compute congestion-approximators. This suggests recursion, but the construction in [33] (and indeed previous ones) were very much top down. That is, the first level of the decomposition (which itself is a tour de force) falls victim to the fact that typical paths in the flows that find them can be long and somewhat abundant. Thus, one critically needs something like a congestion-approximator right away to even compute the top level of the decomposition.

Still, Peng [30] was able to get to a nearly-linear running time recursive method by using ultrasparsifiers and Sherman’s approximate maximum flow algorithm. He constructs an ultrasparsifier of size O(m/logcm)O(m/\log^{c}m) with polylogarithmic approximation factor, recursively computes a cut sparsifier for the resulting graph, and then argues that the combination of Sherman’s algorithm with both the ultrasparsifier and the congestion-approximator can be used to compute an approximate maximum flow in O(mlogcm)O(m\log^{c}m) time. Again, the key here is that Sherman [36] allows him to use polylogarithmic overhead to combat the polylogarithmic approximation in both the ultrasparsifer and in the congestion-approximator of the ultrasparsifier. Still, the recursion is a bear, resulting in an admittedly unoptimized runtime of O(mlog41n)O(m\log^{41}n).

As noted previously, our method is bottom up from the start and avoids the costly recursion required above. We have not computed the exponent of the logarithmic factors due to, for example, using the fair cuts method of [25] as a black box. Still, the clustering approach is more natural and we expect is a useful frame.

We proceed with a technical overview of our result and methods.

2 Technical Overview

Our main conceptual and technical contribution is a novel congestion approximator that is constructed bottom-up in a hierarchical fashion. We start with an informal construction in the theorem below, which is later formalized in Theorem 4.1. See Figure 1 for an illustration.

Refer to caption
Figure 1: On the left, partitions 𝒫i\mathcal{P}_{i} (solid) and 𝒫i+1\mathcal{P}_{i+1} (dotted) are shown. On the middle, the marked edges for each cluster mix simultaneously in GG (property (2)). On the right, a flow from the inter-cluster edges of 𝒫i+1\mathcal{P}_{i+1} to the inter-cluster edges of 𝒫i\mathcal{P}_{i} is displayed (property (3)); assuming edges are unit-weight, each flow path carries a half-unit of flow.
Theorem 2.1 (Informal Theorem 4.1).

Consider a capacitated graph G=(V,E)G=(V,E). Suppose there exist partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} of VV such that

  1. 1.

    𝒫1\mathcal{P}_{1} is the partition {{v}:vV}\{\{v\}:v\in V\} of singleton clusters, and 𝒫L\mathcal{P}_{L} is the partition {V}\{V\} with a single cluster.

  2. 2.

    For each i[L1]i\in[L-1], for each cluster C𝒫i+1C\in\mathcal{P}_{i+1}, the inter-cluster edges of 𝒫i\mathcal{P}_{i} internal to CC, together with the boundary edges of CC, mix777Informally, a set of edges mixes if there is a low-congestion multi-commodity flow between the set of edges whose demand pairs form an expander. In other words, there is an expander flow (in the [5] sense) between the set of edges. Our formal definition is in the preliminaries and only considers single-commodity flows. in the graph GG. Moreover, the mixings over all clusters C𝒫i+1C\in\mathcal{P}_{i+1} have congestion α\alpha simultaneously.

  3. 3.

    For each i[L1]i\in[L-1], there is a flow in GG with congestion β\beta such that each inter-cluster edge in 𝒫i+1\mathcal{P}_{i+1} sends its capacity in flow and each inter-cluster edge in 𝒫i\mathcal{P}_{i} receives at most half its capacity in flow.

For each i[L]i\in[L], let partition i\mathcal{R}_{\geq i} be the common refinement of partitions 𝒫i,𝒫i+1,,𝒫L\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{L}, i.e.,

i={CiCi+1CL:Ci𝒫i,Ci+1𝒫i+1,,CL𝒫L,CiCL}.\mathcal{R}_{\geq i}=\{C_{i}\cap C_{i+1}\cap\cdots\cap C_{L}:C_{i}\in\mathcal{P}_{i},\,C_{i+1}\in\mathcal{P}_{i+1},\ldots,\,C_{L}\in\mathcal{P}_{L},\,C_{i}\cap\cdots\cap C_{L}\neq\emptyset\}.

Then, their union 𝒞=i[L]i\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i} is a congestion-approximator with quality 5L2αβ5L^{2}\alpha\beta.

To understand the construction, first consider the case i=1i=1. By property (1), 𝒫1\mathcal{P}_{1} is the partition of singleton clusters. Since the inter-cluster edges of 𝒫1\mathcal{P}_{1} are precisely all the edges, property (2) says that for each cluster C𝒫i+1C\in\mathcal{P}_{i+1}, the edges internal to CC, together with the boundary edges of CC, “mix” in the graph GG. In other words, the set of edges with at least one endpoint in CC mixes in GG. In the extreme case C=VC=V, the entire edge set mixes in GG, so the graph GG is an expander.888We do not define expanders in this paper since we do not need their precise definition. The connection to expanders is only stated as motivation for readers familiar with the concept. In general, we can informally say that each cluster CC is a sort of weak-expander, and the partition 𝒫2\mathcal{P}_{2} is a weak-expander decomposition of graph GG.999A key difference (from an actual expander decomposition) is that the (routing to certify the) mixing of each cluster is not required to be fully inside its induced subgraph, although the full graph GG still needs to have the capacity to support the mixing of all the clusters simultaneously.

Now consider property (3), which establishes a flow starting from the inter-cluster edges of 𝒫2\mathcal{P}_{2} such that each edge in GG receives at most half its capacity in flow. We claim that this statement is very natural and follows almost immediately from property (2) with one mild assumption: for each cluster in 𝒫i+1\mathcal{P}_{i+1}, the total capacity of boundary edges is much smaller than the total capacity of internal edges.101010From the perspective of expander decompositions, this assumption is very natural. Expander decompositions require that the total capacity of inter-cluster edges is small relative to the total capacity of all edges. We are simply extending this property to hold for each cluster, not just globally over all clusters. Indeed, from the mixing of the internal and boundary edges of CC, we can spread a flow from the boundary edges of CC such that each internal edge receives flow proportional to its capacity. As long as the boundary edges have small total capacity, the total flow source is also small, so each internal edge receives a small amount of flow relative to its capacity. Finally, since the clusters mix simultaneously, we can compose the corresponding flows for each cluster and still ensure small congestion.

Now consider a general level i2i\geq 2. Recall from property (2) that for each cluster C𝒫i+1C\in\mathcal{P}_{i+1}, the inter-cluster edges of 𝒫i\mathcal{P}_{i} inside CC, together with the boundary edges of CC, “mix” in the graph GG (see Figure 1, middle). We can interpret this statement (again) as a weak-expander decomposition where expansion is measured with respect to a subset of edges, namely the inter-cluster edges of partition 𝒫i\mathcal{P}_{i} together with the boundary edges of a cluster in 𝒫i+1\mathcal{P}_{i+1}. Property (3) says that we can spread flow from the inter-cluster edges of 𝒫i+1\mathcal{P}_{i+1} to the inter-cluster edges of 𝒫i\mathcal{P}_{i} such that each edge in 𝒫i\mathcal{P}_{i} receives a small amount of flow (see Figure 1, right). Once again, we can show that it follows from property (2) with the following mild assumption: for each cluster in 𝒫i+1\mathcal{P}_{i+1}, the total capacity of boundary edges is much smaller than the total capacity of inter-cluster edges of 𝒫i\mathcal{P}_{i} that are internal to this cluster.

Overall, the partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} can be viewed as hierarchical expander decompositions where each partition 𝒫i+1\mathcal{P}_{i+1} is a weak-expander decomposition relative to the inter-cluster edges of partition 𝒫\mathcal{P}. It is instructive to compare our partitioning to the expander hierarchy construction of [14], where clusters of the previous partition are contracted before building the next partition. While [14] can also extract a congestion approximator from their expander hierarchy construction, their quality is no(1)n^{o(1)} because they lose a multiplicative factor from contracting vertices on each level of the hierarchy. To avoid this multiplicative blow-up per level, we do not contract vertices, so our partitioning is not truly hierarchical: a cluster in partition 𝒫i\mathcal{P}_{i} may be “cut” into many components by the next partition 𝒫i+1\mathcal{P}_{i+1} (see Figure 1, left). While a hierarchical construction is not required, it is a useful property to have when analyzing the quality of the final congestion-approximator. To establish a hierarchy, our key insight is to consider the common refinement of partitions 𝒫i,𝒫i+1,,𝒫L\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{L}, which we name i\mathcal{R}_{\geq i}. The partitions i\mathcal{R}_{\geq i} over all ii are now hierarchical by construction, and we show that the union of all refinements i\mathcal{R}_{\geq i} is a congestion-approximator with good quality.

We emphasize that our conceptual idea of not contracting clusters and looking at common refinements is novel and may have future applications to bottom-up constructions of hierarchical objects, especially for obtaining polylogarithmic approximations, meaning that one cannot lose a multiplicative factor at each level. Given that [14] has popularized bottom-up hierarchical approaches, and that their methods so far can only obtain no(1)n^{o(1)}-factors due to multiplicative errors across levels, we believe that our ideas are promising for future development in this area.

2.1 Bottom-Up Construction

As mentioned previously, the partition 𝒫2\mathcal{P}_{2} is a weak-expander decomposition of the graph, so it can be computed in nearly-linear time using off-the-shelf expander decomposition algorithm that avoid any black-box call to max-flow [34]. For partitions 𝒫3\mathcal{P}_{3} onwards, expansion is measured with respect to a subset of edges, so simple expander decomposition algorithms no longer suffice. Instead, we have to resort to expander decomposition algorithms that make black-box calls to (approximate) max-flow. Naïvely, these max-flows can be computed recursively, resulting in a congestion approximator algorithm that makes recursive calls to max-flow, similar to [33]. Our key insight is that these max-flow instances are actually well-structured enough that recursion is unnecessary. In particular, to construct partition 𝒫i+1\mathcal{P}_{i+1}, the first ii partitions 𝒫1,𝒫2,,𝒫i\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{i}—which the algorithm has already computed—can be converted to a pseudo-congestion approximator, which then translates to a max-flow algorithm sufficient for these well-structured instances.

3 Preliminaries

We are given an undirected, capacitated graph G=(V,E)G=(V,E). The graph has nn vertices and mm edges, and each edge eEe\in E has capacity c(e)c(e) in the range [1,W][1,W]. For a set CVC\subseteq V, let C\partial C denote the set of edges with exactly one endpoint in CC, and let δC=eCc(e)\delta C=\sum_{e\in\partial C}c(e) denote the total capacity of edges in C\partial C. For a vertex vVv\in V, let deg(v)\deg(v) denote the capacitated degree of vertex vv, which is also equal to δ{v}\delta\{v\}. When the graph GG is clear from context, we may drop the subscript GG. We sometimes write GC\partial_{G}C, δGC\delta_{G}C, and degG(v)\deg_{G}(v) to emphasize that the values are with respect to graph GG.

For a given edge set FEF\subseteq E, let FC\partial_{F}C, δFC\delta_{F}C, and degF(v)\deg_{F}(v) denote the corresponding values on the subgraph of GG with edge set FF. For a different graph HH, let HC\partial_{H}C, δHC\delta_{H}C, and degH(v)\deg_{H}(v) denote the corresponding values on graph HH. We never remove the subscripts FF and HH to avoid confusion with the original graph GG.

For a partition 𝒫\mathcal{P} of the vertex set VV, we call each part of the partition a cluster. Let 𝒫\partial\mathcal{P} denote the set of edges whose endpoints belong to different clusters; we can also write 𝒫=C𝒫C\partial\mathcal{P}=\bigcup_{C\in\mathcal{P}}\partial C. We also define δ𝒫\delta\mathcal{P} as the total capacity of edges in 𝒫\partial\mathcal{P}.

Let UU be an arbitrary set of vertices. For a vector 𝐱U\mathbf{x}\in\mathbb{R}^{U}, let 𝐱(v)\mathbf{x}(v) denote its value on entry vUv\in U. For a vertex subset SUS\subseteq U, define 𝐱(S)=vS𝐱(v)\mathbf{x}(S)=\sum_{v\in S}\mathbf{x}(v), and define 𝐱|SU\mathbf{x}|_{S}\in\mathbb{R}^{U} as the vector 𝐱\mathbf{x} restricted to SS, i.e., 𝐱|S(v)=𝐱(v)\mathbf{x}|_{S}(v)=\mathbf{x}(v) for all vSv\in S and 𝐱|S(v)=0\mathbf{x}|_{S}(v)=0 for all vSv\notin S. For two vectors 𝐬,𝐭U\mathbf{s},\mathbf{t}\in\mathbb{R}^{U}, by 𝐬𝐭\mathbf{s}\leq\mathbf{t} we mean entry-wise inequality, i.e., 𝐬(v)𝐭(v)\mathbf{s}(v)\leq\mathbf{t}(v) for all vUv\in U. For a vector 𝐬U\mathbf{s}\in\mathbb{R}^{U}, let |𝐬|U|\mathbf{s}|\in\mathbb{R}^{U} be the vector with entry-wise absolute values, i.e., |𝐬|(v)=|𝐬(v)||\mathbf{s}|(v)=|\mathbf{s}(v)|.

A demand vector is a vector 𝐛U\mathbf{b}\in\mathbb{R}^{U} whose entries sum to 0, i.e., 𝐛(U)=0\mathbf{b}(U)=0. A flow ff routes demand 𝐛\mathbf{b} if each vertex vUv\in U receives a net flow of 𝐛(v)\mathbf{b}(v) in the flow ff. A flow ff has congestion α\alpha if the amount of flow sent along each (undirected) edge is at most α\alpha times the capacity of that edge. Given a flow ff, a path decomposition of flow ff is a collection of directed, capacitated paths such that for any two vertices u,vUu,v\in U connected by an edge ee, the amount of flow that ff sends from uu to vv equals the total capacity of (directed) paths that contain edge ee in the direction from uu to vv.

A vertex weighting is a vector 𝐝0U\mathbf{d}\in\mathbb{R}^{U}_{\geq 0}, i.e., all entries in 𝐝\mathbf{d} are nonnegative. The vertex weighting 𝐝0U\mathbf{d}\in\mathbb{R}^{U}_{\geq 0} mixes in graph GG with congestion α\alpha if for any demand 𝐛U\mathbf{b}\in\mathbb{R}^{U} satisfying |𝐛|𝐝|\mathbf{b}|\leq\mathbf{d}, there is a flow routing 𝐛\mathbf{b} with congestion α\alpha.111111There is a close connection between the concept of mixing and expander graphs, though we do not need the definition of expanders in this paper. A collection {𝐝1,𝐝2,,𝐝}\{\mathbf{d}_{1},\mathbf{d}_{2},\ldots,\mathbf{d}_{\ell}\} of vertex weightings mixes simultaneously with congestion α\alpha if for any demands 𝐛1,𝐛2,,𝐛\mathbf{b}_{1},\mathbf{b}_{2},\ldots,\mathbf{b}_{\ell} with |𝐛i|𝐝i|\mathbf{b}_{i}|\leq\mathbf{d}_{i} for each i[]i\in[\ell], there is a flow routing demand 𝐛1+𝐛2++𝐛\mathbf{b}_{1}+\mathbf{b}_{2}+\cdots+\mathbf{b}_{\ell} with congestion α\alpha.

Throughout the paper, we view vectors in U\mathbb{R}^{U} and functions from UU to \mathbb{R} as interchangeable. In particular, we sometimes treat the degree function deg:V0\deg:V\to\mathbb{R}_{\geq 0} as a vector deg0V\deg\in\mathbb{R}^{V}_{\geq 0}. In particular, deg\deg is a vertex weighting.

Congestion-approximators and approximate flow.

Given a graph H=(U,F)H=(U,F), a congestion-approximator 𝒞\mathcal{C} of quality α\alpha is a collection of subsets of UU such that for any demand 𝐛\mathbf{b} satisfying |𝐛(C)|δHC|\mathbf{b}(C)|\leq\delta_{H}C for all C𝒞C\in\mathcal{C}, there is a flow routing demand 𝐛\mathbf{b} with congestion α\alpha. Through Sherman’s framework [36, 37], a congestion-approximator of quality α\alpha translates to a (1+ϵ)(1+\epsilon)-approximate maximum flow algorithm with running time O~(ϵ1αm)\tilde{O}(\epsilon^{-1}\alpha m).121212The O~()\tilde{O}(\cdot) notation hides polylogarithmic factors in nn. In our paper, it is most convenient to work with a stronger variant of approximate min-cut/max-flow called fair cut/flow [25], which is formally defined in Section 5.2.

4 Bottom-Up Congestion-Approximator

In this section, we formally state our congestion-approximator construction and prove its quality guarantee. See Figure 1 for an illustration. The most important distinction is that we define and analyze mixing on vertex weightings, not edges. This simplifies the notation since we can avoid working with the subdivision graph in [33]. For example, in property (2) below, the mixing of the vertex weighting deg𝒫iC|C\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C} is conceptually equivalent to the mixing of the edges of 𝒫i\partial\mathcal{P}_{i} internal to CC together with the boundary edges C\partial C.

Theorem 4.1.

Consider a capacitated graph G=(V,E)G=(V,E), and let α1\alpha\geq 1 and β1\beta\geq 1 be parameters. Suppose there exist partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} of VV such that

  1. 1.

    𝒫1\mathcal{P}_{1} is the partition {{v}:vV}\{\{v\}:v\in V\} of singleton clusters, and 𝒫L\mathcal{P}_{L} is the partition {V}\{V\} with a single cluster.

  2. 2.

    For each i[L1]i\in[L-1], the collection of vertex weightings {deg𝒫iC|C0V:C𝒫i+1}\{\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{i+1}\} mixes simultaneously in GG with congestion α\alpha.

  3. 3.

    For each i[L1]i\in[L-1], there is a flow in GG with congestion β\beta such that each vertex vVv\in V sends deg𝒫i+1(v)\deg_{\partial\mathcal{P}_{i+1}}(v) flow and receives at most 12deg𝒫i(v)\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(v) flow.

For each i[L]i\in[L], let partition i\mathcal{R}_{\geq i} be the common refinement of partitions 𝒫i,𝒫i+1,,𝒫L\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{L}, i.e.,

i={CiCi+1CL:Ci𝒫i,Ci+1𝒫i+1,,CL𝒫L,CiCL}.\mathcal{R}_{\geq i}=\{C_{i}\cap C_{i+1}\cap\cdots\cap C_{L}:C_{i}\in\mathcal{P}_{i},\,C_{i+1}\in\mathcal{P}_{i+1},\ldots,\,C_{L}\in\mathcal{P}_{L},\,C_{i}\cap\cdots\cap C_{L}\neq\emptyset\}.

Then, their union 𝒞=i[L]i\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i} is a congestion-approximator with quality 5L2αβ5L^{2}\alpha\beta.

In Section 5, we develop an efficient algorithm to construct partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} with L=O(logn)L=O(\log n). This algorithm builds the partitions iteratively in the order 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}. For technical reasons explained in Section 5, we require the following analogue of Theorem 4.1 where 𝒫L\mathcal{P}_{L} is not necessarily the partition {V}\{V\}. Note that assumptions (2) and (3) remain unchanged below. The key difference is that 𝒞\mathcal{C} is no longer a congestion-approximator, but a pseudo-congestion-approximator whose precise guarantee is stated below.

Lemma 4.2.

Consider a capacitated graph G=(V,E)G=(V,E), and let α1\alpha\geq 1 and β1\beta\geq 1 be parameters. Suppose there exist partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} such that

  1. 1.

    𝒫1\mathcal{P}_{1} is the partition {{v}:vV}\{\{v\}:v\in V\} of singleton clusters.

  2. 2.

    For each i[L1]i\in[L-1], the collection of vertex weightings {deg𝒫iC|C0V:C𝒫i+1}\{\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{i+1}\} mixes simultaneously in GG with congestion α\alpha.

  3. 3.

    For each i[L1]i\in[L-1], there is a flow in GG with congestion β\beta such that each vertex vVv\in V sends deg𝒫i+1(v)\deg_{\partial\mathcal{P}_{i+1}}(v) flow and receives at most 12deg𝒫i(v)\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(v) flow.

For each i[L]i\in[L], let partition i\mathcal{R}_{\geq i} be the common refinement of partitions 𝒫i,𝒫i+1,,𝒫L\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{L}, i.e.,

i={CiCi+1CL:Ci𝒫i,Ci+1𝒫i+1,,CL𝒫L,CiCL}.\mathcal{R}_{\geq i}=\{C_{i}\cap C_{i+1}\cap\cdots\cap C_{L}:C_{i}\in\mathcal{P}_{i},\,C_{i+1}\in\mathcal{P}_{i+1},\ldots,\,C_{L}\in\mathcal{P}_{L},\,C_{i}\cap\cdots\cap C_{L}\neq\emptyset\}.

Consider their union 𝒞=i[L]i\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i}. Then, for any demand 𝐛V\mathbf{b}\in\mathbb{R}^{V} satisfying |𝐛(C)|δC|\mathbf{b}(C)|\leq\delta C for all C𝒞C\in\mathcal{C}, there exists a demand 𝐛V\mathbf{b}^{\prime}\in\mathbb{R}^{V} satisfying |𝐛|Ldeg𝒫L|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}} and a flow routing demand 𝐛𝐛\mathbf{b}-\mathbf{b}^{\prime} with congestion 5L2αβ5L^{2}\alpha\beta.

Instead of proving Theorem 4.1 directly, we prove Lemma 4.2 which is needed for the algorithm. Before we do so, we first establish that Lemma 4.2 indeed implies Theorem 4.1.

Proof (Lemma 4.2\impliesTheorem 4.1).

Consider partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} that satisfy the assumptions of Theorem 4.1. For a given demand 𝐛V\mathbf{b}\in\mathbb{R}^{V} satisfying |𝐛(C)|δC|\mathbf{b}(C)|\leq\delta C for all C𝒞C\in\mathcal{C}, we want to establish a flow routing demand 𝐛\mathbf{b} with congestion 5L2αβ5L^{2}\alpha\beta. Theorem 4.1 then follows from the definition of congestion-approximator.

Apply Lemma 4.2 to the partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} and the demand 𝐛\mathbf{b}. We obtain a demand 𝐛V\mathbf{b}^{\prime}\in\mathbb{R}^{V} satisfying |𝐛|Ldeg𝒫L|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}} and a flow ff routing demand 𝐛𝐛\mathbf{b}-\mathbf{b}^{\prime} with congestion 5L2αβ5L^{2}\alpha\beta. By assumption (1) of Theorem 4.1, we have 𝒫L={V}\mathcal{P}_{L}=\{V\}, which implies that 𝒫L=\partial\mathcal{P}_{L}=\emptyset. Since |𝐛|Ldeg𝒫L=𝟎|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}}=\mathbf{0}, we must have 𝐛=𝟎\mathbf{b}^{\prime}=\mathbf{0}. It follows that the flow ff routes demand 𝐛\mathbf{b} with congestion 5L2αβ5L^{2}\alpha\beta, finishing the proof. ∎

For the rest of this section, we prove Lemma 4.2. We begin with two helper claims that establish structure on the sets i\mathcal{R}_{\geq i}.

Claim 4.3.

For all i,j[L]i,j\in[L] with i<ji<j, the partition i\mathcal{R}_{\geq i} of VV is a refinement of the partition j\mathcal{R}_{\geq j}, in the sense that each set in j\mathcal{R}_{\geq j} is the disjoint union of some sets in i\mathcal{R}_{\geq i}. In particular, ij\partial\mathcal{R}_{\geq i}\supseteq\partial\mathcal{R}_{\geq j}.

Proof.

Consider a set C=CjCj+1CLjC=C_{j}\cap C_{j+1}\cap\cdots\cap C_{L}\in\mathcal{R}_{\geq j} for some Cj𝒫j,Cj+1𝒫j+1,,CL𝒫LC_{j}\in\mathcal{P}_{j},C_{j+1}\in\mathcal{P}_{j+1},\ldots,C_{L}\in\mathcal{P}_{L}. Since 𝒫i,𝒫i+1,,𝒫j1\mathcal{P}_{i},\mathcal{P}_{i+1},\ldots,\mathcal{P}_{j-1} are all partitions of VV, the set CC is the disjoint union of all nonempty sets of the form CiCi+1Cj1CiC_{i}\cap C_{i+1}\cap\cdots\cap C_{j-1}\cap C\in\mathcal{R}_{\geq i} for Ci𝒫i,Ci+1𝒫i+1,,Cj1𝒫j1C_{i}\in\mathcal{P}_{i},C_{i+1}\in\mathcal{P}_{i+1},\ldots,C_{j-1}\in\mathcal{P}_{j-1}. Therefore, i\mathcal{R}_{\geq i} is a refinement of j\mathcal{R}_{\geq j}, and since refinements can only increase the boundary set, the second statement ij\partial\mathcal{R}_{\geq i}\supseteq\partial\mathcal{R}_{\geq j} follows. ∎

Claim 4.4.

For all i[L1]i\in[L-1], we have ii+1𝒫i\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}\subseteq\partial\mathcal{P}_{i}.

Proof.

Consider an edge (u,v)ii+1(u,v)\in\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}. Since (u,v)i+1(u,v)\notin\partial\mathcal{R}_{\geq i+1}, there is a set Ci+1C\in\mathcal{R}_{\geq i+1} containing both vertices uu and vv. As in the proof of Claim 4.3, write C=Ci+1Ci+2CLi+1C=C_{i+1}\cap C_{i+2}\cap\cdots\cap C_{L}\in\mathcal{R}_{\geq i+1} for some Ci+1𝒫i+1,Ci+2𝒫i+2,,CL𝒫LC_{i+1}\in\mathcal{P}_{i+1},C_{i+2}\in\mathcal{P}_{i+2},\ldots,C_{L}\in\mathcal{P}_{L}. The set CC is the disjoint union of all nonempty sets of the form CiCiC_{i}\cap C\in\mathcal{R}_{\geq i} for some Ci𝒫iC_{i}\in\mathcal{P}_{i}. Since u,vCu,v\in C, both uu and vv belong to sets of this form. Since (u,v)i(u,v)\in\partial\mathcal{R}_{\geq i}, the sets containing uu and vv must be different. They can only differ in the set Ci𝒫iC_{i}\in\mathcal{P}_{i}, so uu and vv belong to different sets in 𝒫i\mathcal{P}_{i}, and we obtain (u,v)𝒫i(u,v)\in\partial\mathcal{P}_{i} as promised. ∎

Let 𝐛V\mathbf{b}\in\mathbb{R}^{V} be a flow demand satisfying |𝐛(C)|δC|\mathbf{b}(C)|\leq\delta C for all C𝒞C\in\mathcal{C}. We need to construct a demand 𝐛V\mathbf{b}^{\prime}\in\mathbb{R}^{V} satisfying |𝐛|Ldeg𝒫L|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}} and a flow routing demand 𝐛𝐛\mathbf{b}-\mathbf{b}^{\prime} with congestion 5L2αβ5L^{2}\alpha\beta.

The construction of the flow has L1L-1 iterations. On iteration i[L1]i\in[L-1], we construct a flow fif_{i} and a demand 𝐛i\mathbf{b}_{i} such that

  1. 1.

    The flow fif_{i} routes demand 𝐛i1𝐛i\mathbf{b}_{i-1}-\mathbf{b}_{i}, where we initialize 𝐛0=𝐛\mathbf{b}_{0}=\mathbf{b} on iteration i=1i=1.

  2. 2.

    The flow fif_{i} has congestion 5Lαβ5L\alpha\beta.

  3. 3.

    For all Ci+1C\in\mathcal{R}_{\geq i+1}, we have (𝐛i1𝐛i)(C)=0(\mathbf{b}_{i-1}-\mathbf{b}_{i})(C)=0.

  4. 4.

    The demand 𝐛i\mathbf{b}_{i} satisfies |𝐛i|(i+1)degi+1|\mathbf{b}_{i}|\leq(i+1)\deg_{\partial\mathcal{R}_{\geq i+1}}.

Note for the last level L1L-1, by definition we have L=𝒫L\partial\mathcal{R}_{\geq L}=\partial\mathcal{P}_{L}. The lemma below shows that properties (1), (2), and (4) alone are sufficient to prove Lemma 4.2 with demand 𝐛=𝐛L1\mathbf{b}^{\prime}=\mathbf{b}_{L-1} and flow f1+f2++fL1f_{1}+f_{2}+\cdots+f_{L-1}.

Lemma 4.5.

Suppose that properties (1), (2), and (4) hold for each i[L1]i\in[L-1]. Then, the demand 𝐛L1\mathbf{b}_{L-1} satisfies |𝐛L1|Ldeg𝒫L|\mathbf{b}_{L-1}|\leq L\deg_{\partial\mathcal{P}_{L}}, and the flow f1+f2++fL1f_{1}+f_{2}+\cdots+f_{L-1} routes demand 𝐛𝐛L1\mathbf{b}-\mathbf{b}_{L-1} with congestion 5L2αβ5L^{2}\alpha\beta.

Proof.

The demand 𝐛L1\mathbf{b}_{L-1} satisfies |𝐛L1|LdegL=Ldeg𝒫L|\mathbf{b}_{L-1}|\leq L\deg_{\partial\mathcal{R}_{\geq L}}=L\deg_{\partial\mathcal{P}_{L}} by property (4) on iteration i=L1i=L-1. By property (1) over all i[L1]i\in[L-1], the flow f1+f2++fL1f_{1}+f_{2}+\cdots+f_{L-1} routes demand (𝐛0𝐛1)+(𝐛1𝐛2)++(𝐛L2𝐛L1)=𝐛𝐛L1(\mathbf{b}_{0}-\mathbf{b}_{1})+(\mathbf{b}_{1}-\mathbf{b}_{2})+\cdots+(\mathbf{b}_{L-2}-\mathbf{b}_{L-1})=\mathbf{b}-\mathbf{b}_{L-1}. The congestion is at most the sum of the congestions of each fif_{i}, which is (L1)5Lαβ5L2αβ(L-1)\cdot 5L\alpha\beta\leq 5L^{2}\alpha\beta by property (2). ∎

In order to establish the conditions above, we will use the following technical lemma:

Lemma 4.6.

Consider an iteration i[L1]i\in[L-1] and a vector 𝐬V\mathbf{s}\in\mathbb{R}^{V} such that

  1. (a)

    |𝐬|ideg𝒫i|\mathbf{s}|\leq i\deg_{\partial\mathcal{P}_{i}}.

  2. (b)

    |𝐬(C)|δC|\mathbf{s}(C)|\leq\delta C for all Ci+1C\in\mathcal{R}_{\geq i+1}.

Then, we can construct a flow ff such that

  1. (i)

    Flow ff routes demand 𝐬𝐭\mathbf{s}-\mathbf{t} for a vector 𝐭V\mathbf{t}\in\mathbb{R}^{V} with |𝐭|degi+1|\mathbf{t}|\leq\deg_{\partial\mathcal{R}_{\geq i+1}}.

  2. (ii)

    The flow ff has congestion 5Lαβ5L\alpha\beta.

  3. (iii)

    For all Ci+1C\in\mathcal{R}_{\geq i+1}, we have (𝐬𝐭)(C)=0(\mathbf{s}-\mathbf{t})(C)=0.

Before we prove this lemma, we first establish that it implies properties (1) to (4) above for appropriate fif_{i} and 𝐛i\mathbf{b}_{i}.

Lemma 4.7.

Assuming Lemma 4.6, we can construct fif_{i} and 𝐛i\mathbf{b}_{i} satisfying properties (1) to (4) for each i[L1]i\in[L-1].

Proof.

We induct on i[L1]i\in[L-1], where the base case is just property (4) for i=0i=0 (and 𝐛0=𝐛\mathbf{b}_{0}=\mathbf{b}). For this base case, since the singleton sets {v}\{v\} are in 𝒫1\mathcal{P}_{1}, they are also in 𝒞\mathcal{C}, so |𝐛({v})|deg(v)|\mathbf{b}(\{v\})|\leq\deg(v) for all vVv\in V, which implies |𝐛0(v)|=|𝐛(v)|deg(v)=deg1(v)|\mathbf{b}_{0}(v)|=|\mathbf{b}(v)|\leq\deg(v)=\deg_{\partial\mathcal{R}_{\geq 1}}(v), as desired.

For the inductive step, we apply Lemma 4.6 on iteration i1i\geq 1 and the vector 𝐬V\mathbf{s}\in\mathbb{R}^{V} with

𝐬(v)={(1degi+1(v)degi(v))𝐛i1(v)if degi(v)>0,0otherwise..\mathbf{s}(v)=\begin{cases}\displaystyle\left(1-\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\right)\mathbf{b}_{i-1}(v)&\text{if }\deg_{\partial\mathcal{R}_{\geq i}}(v)>0,\\ 0&\text{otherwise}.\end{cases}.

We first verify the conditions on 𝐬\mathbf{s} required by Lemma 4.6.

  1. (a)

    To establish condition a, fix a vertex vVv\in V; our goal is to show that |𝐬(v)|ideg𝒫i(v)|\mathbf{s}(v)|\leq i\deg_{\partial\mathcal{P}_{i}}(v). If degi(v)=0\deg_{\partial\mathcal{R}_{\geq i}}(v)=0, then 𝐬(v)=0\mathbf{s}(v)=0 and the claim is trivial. Otherwise, suppose that degi(v)>0\deg_{\partial\mathcal{R}_{\geq i}}(v)>0. We can apply property (4) for iteration i1i-1 to obtain

    |𝐬(v)|\displaystyle|\mathbf{s}(v)| =(1degi+1(v)degi(v))|𝐛i1(v)|\displaystyle=\left(1-\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\right)|\mathbf{b}_{i-1}(v)|
    (1degi+1(v)degi(v))idegi(v)\displaystyle\leq\left(1-\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\right)\cdot i\deg_{\partial\mathcal{R}_{\geq i}}(v)
    =(degi(v)degi+1(v))i.\displaystyle=(\deg_{\partial\mathcal{R}_{\geq i}}(v)-\deg_{\partial\mathcal{R}_{\geq i+1}}(v))\cdot i.

    which equals idegii+1(v)i\deg_{\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}}(v) since ii+1\partial\mathcal{R}_{\geq i}\supseteq\partial\mathcal{R}_{\geq i+1} by Claim 4.3. Since ii+1𝒫i\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}\subseteq\partial\mathcal{P}_{i} by Claim 4.4, we have degii+1(v)deg𝒫i(v)\deg_{\partial\mathcal{R}_{\geq i}\setminus\partial\mathcal{R}_{\geq i+1}}(v)\leq\deg_{\partial\mathcal{P}_{i}}(v), establishing condition a.

  2. (b)

    To establish condition b, fix a set Ci+1C\in\mathcal{R}_{\geq i+1}. We first prove that 𝐛0(C)=𝐛i1(C)\mathbf{b}_{0}(C)=\mathbf{b}_{i-1}(C). This is trivial for i=1i=1, so assume that i>1i>1. For a given j[i1]j\in[i-1], the set CC is a disjoint union of some sets C1,,Cj+1C_{1},\ldots,C_{\ell}\in\mathcal{R}_{\geq j+1} by Claim 4.3. Apply property (3) for iteration jj to obtain (𝐛j1𝐛j)(Ck)=0(\mathbf{b}_{j-1}-\mathbf{b}_{j})(C_{k})=0 for all k[]k\in[\ell]. Summing over all k[]k\in[\ell] gives (𝐛j1𝐛j)(C)=k[](𝐛j1𝐛j)(Ck)=0(\mathbf{b}_{j-1}-\mathbf{b}_{j})(C)=\sum_{k\in[\ell]}(\mathbf{b}_{j-1}-\mathbf{b}_{j})(C_{k})=0, so 𝐛j1(C)=𝐛j(C)\mathbf{b}_{j-1}(C)=\mathbf{b}_{j}(C). Over all iterations j[i1]j\in[i-1], we obtain 𝐛0(C)=𝐛1(C)==𝐛i1(C)\mathbf{b}_{0}(C)=\mathbf{b}_{1}(C)=\cdots=\mathbf{b}_{i-1}(C).

    By definition of vector 𝐬\mathbf{s}, we have |𝐬(C)||𝐛i1(C)||\mathbf{s}(C)|\leq|\mathbf{b}_{i-1}(C)|, which equals |𝐛0(C)||\mathbf{b}_{0}(C)| from above. Finally, since the initial flow demand 𝐛V\mathbf{b}\in\mathbb{R}^{V} satisfies |𝐛(C)|δC|\mathbf{b}(C)|\leq\delta C, we have |𝐬(C)||𝐛0(C)|=|𝐛(C)|δC|\mathbf{s}(C)|\leq|\mathbf{b}_{0}(C)|=|\mathbf{b}(C)|\leq\delta C, establishing condition b.

With the conditions fulfilled, Lemma 4.6 outputs a flow ff which we set as fif_{i}, immediately satisfying property (2). We set 𝐛i=𝐛i1𝐬+𝐭\mathbf{b}_{i}=\mathbf{b}_{i-1}-\mathbf{s}+\mathbf{t} so that flow fif_{i} routes demand 𝐬𝐭=𝐛i1𝐛i\mathbf{s}-\mathbf{t}=\mathbf{b}_{i-1}-\mathbf{b}_{i} and property (1) holds. Property (3) follows from property iii of Lemma 4.6.

It remains to establish property (4). By property (4) for iteration i1i-1, we have |𝐛i1(v)|idegi(v)|\mathbf{b}_{i-1}(v)|\leq i\deg_{\partial\mathcal{R}_{\geq i}}(v) for each vertex vVv\in V. If degi(v)=0\deg_{\partial\mathcal{R}_{\geq i}}(v)=0, then 𝐛i1(v)=0\mathbf{b}_{i-1}(v)=0 and

|𝐛i(v)|=|𝐛i1(v)𝐬(v)+𝐭(v)|=|00+𝐭(v)|=|𝐭(v)|.|\mathbf{b}_{i}(v)|=|\mathbf{b}_{i-1}(v)-\mathbf{s}(v)+\mathbf{t}(v)|=|0-0+\mathbf{t}(v)|=|\mathbf{t}(v)|.

Otherwise, if degi(v)>0\deg_{\partial\mathcal{R}_{\geq i}}(v)>0, then

|𝐛i(v)|\displaystyle|\mathbf{b}_{i}(v)| =|𝐛i1(v)𝐬(v)+𝐭(v)|\displaystyle=|\mathbf{b}_{i-1}(v)-\mathbf{s}(v)+\mathbf{t}(v)|
=|degi+1(v)degi(v)𝐛i1(v)+𝐭(v)|\displaystyle=\left|\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\mathbf{b}_{i-1}(v)+\mathbf{t}(v)\right|
degi+1(v)degi(v)|𝐛i1(v)|+|𝐭(v)|\displaystyle\leq\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}|\mathbf{b}_{i-1}(v)|+|\mathbf{t}(v)|
degi+1(v)degi(v)idegi(v)+|𝐭(v)|\displaystyle\leq\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i}}(v)}\cdot i\deg_{\partial\mathcal{R}_{\geq i}}(v)+|\mathbf{t}(v)|
=idegi+1(v)+|𝐭(v)|.\displaystyle=i\cdot\deg_{\partial\mathcal{R}_{\geq i+1}}(v)+|\mathbf{t}(v)|.

In both cases, we obtain |𝐛i(v)|idegi+1(v)+|𝐭(v)||\mathbf{b}_{i}(v)|\leq i\cdot\deg_{\partial\mathcal{R}_{\geq i+1}}(v)+|\mathbf{t}(v)|. We also have |𝐭(v)|degi+1(v)|\mathbf{t}(v)|\leq\deg_{\partial\mathcal{R}_{\geq i+1}}(v) by property i of Lemma 4.6, so property (4) holds on iteration ii, completing the induction. ∎

For the rest of this section, we establish Lemma 4.6, the most technical part of the proof. Before we do so, we first establish three technical helper claims.

Claim 4.8.

For any vector 𝐬0V\mathbf{s}\in\mathbb{R}^{V}_{\geq 0} with 𝐬deg𝒫i+1\mathbf{s}\leq\deg_{\partial\mathcal{P}_{i+1}}, there is a vector 𝐭0V\mathbf{t}\in\mathbb{R}^{V}_{\geq 0} with 𝐭deg𝒫i/2\mathbf{t}\leq\mathrm{deg}_{\partial\mathcal{P}_{i}}/2 and a flow routing demand 𝐬𝐭\mathbf{s}-\mathbf{t} with congestion β\beta.

Proof.

By assumption (3) of Lemma 4.2, there is a flow in GG with congestion β\beta such that each vertex vVv\in V sends deg𝒫i+1(v)\deg_{\partial\mathcal{P}_{i+1}}(v) flow and receives at most 12deg𝒫i(v)\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(v) flow. In other words, there is a vector 𝐭0V\mathbf{t}\in\mathbb{R}^{V}_{\geq 0} with 𝐭deg𝒫i/2\mathbf{t}\leq\mathrm{deg}_{\partial\mathcal{P}_{i}}/2 and a flow routing demand deg𝒫i+1𝐭\textup{deg}_{\partial\mathcal{P}_{i+1}}-\mathbf{t} with congestion β\beta. Take a path decomposition of the flow where each vertex is the start of deg𝒫i+1(v)\deg_{\partial\mathcal{P}_{i+1}}(v) total capacity of (potentially empty) paths and the end of 𝐭(v)\mathbf{t}(v) total capacity of (potentially empty) paths. Since 𝐬deg𝒫i+1\mathbf{s}\leq\deg_{\partial\mathcal{P}_{i+1}}, we can remove or decrease the capacity of paths until each vertex is the start of 𝐬(v)\mathbf{s}(v) total paths. The resulting flow routes demand 𝐬𝐭\mathbf{s}-\mathbf{t} with congestion β\beta. ∎

Claim 4.9.

For any i[L1]i\in[L-1], consider any vector 𝐱0V\mathbf{x}\in\mathbb{R}^{V}_{\geq 0} with 𝐱degi+1\mathbf{x}\leq\deg_{\partial\mathcal{R}_{\geq i+1}}. There exists a vector 𝐲0V\mathbf{y}\in\mathbb{R}^{V}_{\geq 0} with 𝐲deg𝒫i\mathbf{y}\leq\deg_{\partial\mathcal{P}_{i}} and a flow routing demand 𝐱𝐲\mathbf{x}-\mathbf{y} with congestion (2L12i)β(2L-1-2i)\beta.

Proof.

We prove the statement by induction from i=L1i=L-1 down to i=1i=1. For the base case i=L1i=L-1, since L=𝒫L\mathcal{R}_{\geq L}=\mathcal{P}_{L}, we have 𝐱deg𝒫L\mathbf{x}\leq\deg_{\partial\mathcal{P}_{L}}. By Claim 4.8 on vector 𝐱\mathbf{x}, there is a vector 𝐲0V\mathbf{y}\in\mathbb{R}^{V}_{\geq 0} with 𝐲deg𝒫L1/2deg𝒫L1\mathbf{y}\leq\deg_{\partial\mathcal{P}_{L-1}}/2\leq\deg_{\partial\mathcal{P}_{L-1}} and a flow routing demand 𝐱𝐲\mathbf{x}-\mathbf{y} with congestion β=(2L12(L1))β\beta=(2L-1-2(L-1))\beta.

For the inductive step, consider a vector 𝐱0V\mathbf{x}\in\mathbb{R}^{V}_{\geq 0} with 𝐱degi+1\mathbf{x}\leq\deg_{\partial\mathcal{R}_{\geq i+1}}. Define 𝐱0V\mathbf{x}^{\prime}\in\mathbb{R}^{V}_{\geq 0} as

𝐱(v)={degi+2(v)degi+1(v)𝐱(v)if degi+1(v)>0,0otherwise,\mathbf{x}^{\prime}(v)=\begin{cases}\displaystyle\frac{\deg_{\partial\mathcal{R}_{\geq i+2}}(v)}{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}\cdot\mathbf{x}(v)&\text{if }\deg_{\partial\mathcal{R}_{\geq i+1}}(v)>0,\\ 0&\text{otherwise},\end{cases}

which satisfies 𝐱degi+2\mathbf{x}^{\prime}\leq\deg_{\partial\mathcal{R}_{\geq i+2}}. By induction, there exists a vector 𝐲0V\mathbf{y}^{\prime}\in\mathbb{R}^{V}_{\geq 0} with 𝐲deg𝒫i+1\mathbf{y}^{\prime}\leq\deg_{\partial\mathcal{P}_{i+1}} and a flow f1f_{1} routing demand 𝐱𝐲\mathbf{x}^{\prime}-\mathbf{y}^{\prime} with congestion (2L12(i+1))β(2L-1-2(i+1))\beta.

Next, consider the vector 𝐱𝐱+𝐲\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}, which satisfies 𝐱𝐱+𝐲𝐱𝐱𝟎\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}\geq\mathbf{x}-\mathbf{x}^{\prime}\geq\mathbf{0}, where the second inequality holds by Claim 4.3. If degi+1(v)=0\deg_{\partial\mathcal{R}_{\geq i+1}}(v)=0, then 𝐱(v)=0\mathbf{x}(v)=0 and 𝐱(v)=0\mathbf{x}^{\prime}(v)=0, so (𝐱𝐱+𝐲)(v)=𝐲(v)deg𝒫i+1(v)(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime})(v)=\mathbf{y}^{\prime}(v)\leq\deg_{\partial\mathcal{P}_{i+1}}(v). Otherwise, if degi+1(v)>0\deg_{\partial\mathcal{R}_{\geq i+1}}(v)>0, then

(𝐱𝐱+𝐲)(v)\displaystyle(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime})(v) =(degi+1(v)degi+2(v)degi+1(v))𝐱(v)+𝐲(v)\displaystyle=\bigg{(}\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)-\deg_{\partial\mathcal{R}_{\geq i+2}}(v)}{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}\bigg{)}\mathbf{x}(v)+\mathbf{y}^{\prime}(v)
deg𝒫i+1(v)degi+1(v)𝐱(v)+𝐲(v)\displaystyle\leq\frac{\deg_{\partial\mathcal{P}_{i+1}}(v)}{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}\mathbf{x}(v)+\mathbf{y}^{\prime}(v)
deg𝒫i+1(v)+deg𝒫i+1(v),\displaystyle\leq\deg_{\partial\mathcal{P}_{i+1}}(v)+\deg_{\partial\mathcal{P}_{i+1}}(v),

where the first inequality holds by Claim 4.4. Combining the two cases, we obtain 𝐱𝐱+𝐲2deg𝒫i+1\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}\leq 2\deg_{\partial\mathcal{P}_{i+1}}. By Claim 4.8 on vector (𝐱𝐱+𝐲)/2(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime})/2, there exists a vector 𝐭0V\mathbf{t}\in\mathbb{R}^{V}_{\geq 0} with 𝐭deg𝒫i/2\mathbf{t}\leq\mathrm{deg}_{\partial\mathcal{P}_{i}}/2 and a flow routing demand (𝐱𝐱+𝐲)/2𝐭(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime})/2-\mathbf{t} with congestion β\beta. Scaling this flow by factor 22, we obtain a flow f2f_{2} routing demand 𝐱𝐱+𝐲2𝐭\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}-2\mathbf{t} with congestion 2β2\beta. The final flow is the sum of flows f1f_{1} and f2f_{2}, which routes demand (𝐱𝐲)+(𝐱𝐱+𝐲2𝐭)=𝐱2𝐭(\mathbf{x}^{\prime}-\mathbf{y}^{\prime})+(\mathbf{x}-\mathbf{x}^{\prime}+\mathbf{y}^{\prime}-2\mathbf{t})=\mathbf{x}-2\mathbf{t} and has congestion (2L12(i+1))β+2β=(2L12i)β(2L-1-2(i+1))\beta+2\beta=(2L-1-2i)\beta. We set 𝐲=2𝐭deg𝒫i\mathbf{y}=2\mathbf{t}\leq\deg_{\partial\mathcal{P}_{i}}, which concludes the inductive step and hence the proof. ∎

Claim 4.10.

For any i[L1]i\in[L-1], consider any vector 𝐱0V\mathbf{x}\in\mathbb{R}^{V}_{\geq 0} with 𝐱degi+1\mathbf{x}\leq\deg_{\partial\mathcal{R}_{\geq i+1}}. There exists a vector 𝐲0V\mathbf{y}\in\mathbb{R}^{V}_{\geq 0} such that

  1. 1.

    𝐲deg𝒫i+2Lβdeg𝒫i+1\mathbf{y}\leq\textup{deg}_{\partial\mathcal{P}_{i}}+2L\beta\deg_{\partial\mathcal{P}_{i+1}}.

  2. 2.

    For all clusters C𝒫i+1C\in\mathcal{P}_{i+1}, we have (𝐱𝐲)(C)=0(\mathbf{x}-\mathbf{y})(C)=0.

  3. 3.

    There is a flow routing demand 𝐱𝐲\mathbf{x}-\mathbf{y} with congestion 2Lβ2L\beta.

Proof.

Apply Claim 4.9 on vector 𝐱\mathbf{x}, obtaining a vector 𝐲0V\mathbf{y}\in\mathbb{R}^{V}_{\geq 0} which we relabel as 𝐲\mathbf{y}^{\prime}, and a flow ff routing demand 𝐱𝐲\mathbf{x}-\mathbf{y}^{\prime} with congestion (2L12i)β2Lβ(2L-1-2i)\beta\leq 2L\beta. Take a path decomposition of flow ff where each vertex is the start of 𝐱(v)\mathbf{x}(v) total capacity of (potentially empty) paths and the end of 𝐲(v)\mathbf{y}^{\prime}(v) total capacity of (potentially empty) paths. For each path starting at a vertex vv in some cluster C𝒫i+1C\in\mathcal{P}_{i+1}, perform the following operation. If the path contains an edge (u,v)(u,v) in C\partial C with uCu\in C, then replace the path with its prefix ending at uu; otherwise, do nothing to the path. Note that the modified path ends in the same cluster C𝒫i+1C\in\mathcal{P}_{i+1} as its starting point. These modified paths form a new flow ff^{\prime}, which also has congestion 2Lβ2L\beta.

We now bound the difference in the demands routed by ff and ff^{\prime}. To do so, we consider the difference in endpoints in the old and new path decompositions. Each vertex uVu\in V was initially the endpoint of 𝐲(u)\mathbf{y}^{\prime}(u) total capacity of paths. We now claim that for each cluster C𝒫i+1C\in\mathcal{P}_{i+1}, each vertex uCu\in C becomes the new endpoint of at most 2LβdegC(u)=2Lβdeg𝒫i+1(u)2L\beta\deg_{\partial C}(u)=2L\beta\deg_{\partial\mathcal{P}_{i+1}}(u) total capacity of paths. This is because each new endpoint is a result of an edge (u,v)C(u,v)\in\partial C in some path, and the total capacity of such paths is at most 2LβdegC(u)2L\beta\deg_{\partial C}(u) since the flow ff has congestion 2Lβ2L\beta. It follows that each vertex uVu\in V is the (new or old) endpoint of at most 𝐲(u)+2Lβdeg𝒫i+1(u)\mathbf{y}^{\prime}(u)+2L\beta\deg_{\partial\mathcal{P}_{i+1}}(u) total capacity of paths in the new flow ff^{\prime}.

Define vector 𝐲0V\mathbf{y}\in\mathbb{R}^{V}_{\geq 0} such that each vertex uVu\in V is the endpoint of 𝐲(u)\mathbf{y}(u) total capacity of paths in the new flow ff^{\prime}. In other words, the flow ff^{\prime} routes demand 𝐱𝐲\mathbf{x}-\mathbf{y}. We have shown that 𝐲𝐲+2Lβdeg𝒫i+1\mathbf{y}\leq\mathbf{y}^{\prime}+2L\beta\deg_{\partial\mathcal{P}_{i+1}}, and combined with the guarantee 𝐲deg𝒫i\mathbf{y}^{\prime}\leq\deg_{\partial\mathcal{P}_{i}} from Claim 4.9, we conclude that 𝐲deg𝒫i+2Lβdeg𝒫i+1\mathbf{y}\leq\deg_{\partial\mathcal{P}_{i}}+2L\beta\deg_{\partial\mathcal{P}_{i+1}}. Finally, recall that each path of ff^{\prime} starts and ends in the same cluster of 𝒫i+1\mathcal{P}_{i+1}, so (𝐱𝐲)(C)=0(\mathbf{x}-\mathbf{y})(C)=0 for all clusters C𝒫i+1C\in\mathcal{P}_{i+1}. ∎

We now prove Lemma 4.6, restated below. See 4.6

Proof.

We first construct demand 𝐭V\mathbf{t}\in\mathbb{R}^{V} as follows. For each set Ci+1C\in\mathcal{R}_{\geq i+1}, define𝐭(v)=𝐬(C)degi+1(v)/δC\mathbf{t}(v)=\mathbf{s}(C)\cdot\deg_{\partial\mathcal{R}_{\geq i+1}}(v)/\delta C for all vCv\in C, which satisfies |𝐭(v)|degi+1(v)|\mathbf{t}(v)|\leq\deg_{\partial\mathcal{R}_{\geq i+1}}(v) by condition b. Since i+1\mathcal{R}_{\geq i+1} is a partition of VV, this fully defines demand 𝐭\mathbf{t}, which satisfies the bound required by property i. Also, since Ci+1C\in\mathcal{R}_{\geq i+1}, we have vCdegi+1(v)=δC\sum_{v\in C}\deg_{\partial\mathcal{R}_{\geq i+1}}(v)=\delta C, so

𝐭(C)=vC𝐭(v)=vC𝐬(C)degi+1(v)δC=𝐬(C),\mathbf{t}(C)=\sum_{v\in C}\mathbf{t}(v)=\sum_{v\in C}\mathbf{s}(C)\cdot\frac{\deg_{\partial\mathcal{R}_{\geq i+1}}(v)}{\delta C}=\mathbf{s}(C),

satisfying property iii. In particular, 𝐭(V)=𝐬(V)\mathbf{t}(V)=\mathbf{s}(V) and 𝐬𝐭\mathbf{s}-\mathbf{t} is a valid demand.

Since |𝐭|degi+1|\mathbf{t}|\leq\deg_{\partial\mathcal{R}_{\geq i+1}}, we apply Claim 4.10 with 𝐱=𝐭\mathbf{x}=\mathbf{t}, obtaining a vector 𝐲0V\mathbf{y}\in\mathbb{R}^{V}_{\geq 0} which we relabel as 𝐭\mathbf{t}^{\prime}. We have |𝐭|=𝐭deg𝒫i+2Lβdeg𝒫i+1|\mathbf{t}^{\prime}|=\mathbf{t}^{\prime}\leq\deg_{\partial\mathcal{P}_{i}}+2L\beta\deg_{\partial\mathcal{P}_{i+1}} and (𝐭𝐭)(C)=0(\mathbf{t}-\mathbf{t}^{\prime})(C)=0 for all clusters C𝒫i+1C\in\mathcal{P}_{i+1}. Let f1f_{1} be the flow routing demand 𝐭𝐭\mathbf{t}-\mathbf{t}^{\prime} with congestion 2Lβ2L\beta.

Consider a cluster C𝒫i+1C\in\mathcal{P}_{i+1}. We have (𝐬𝐭)(C)=(𝐬𝐭)(C)+(𝐭𝐭)(C)=0(\mathbf{s}-\mathbf{t}^{\prime})(C)=(\mathbf{s}-\mathbf{t})(C)+(\mathbf{t}-\mathbf{t}^{\prime})(C)=0. Moreover, for all vertices vCv\in C, we have

|𝐬(v)𝐭(v)||𝐬(v)|+|𝐭(v)|\displaystyle|\mathbf{s}(v)-\mathbf{t}^{\prime}(v)|\leq|\mathbf{s}(v)|+|\mathbf{t}^{\prime}(v)| ideg𝒫i(v)+deg𝒫i(v)+2Lβdeg𝒫i+1(v)\displaystyle\leq i\deg_{\partial\mathcal{P}_{i}}(v)+\deg_{\partial\mathcal{P}_{i}}(v)+2L\beta\deg_{\partial\mathcal{P}_{i+1}}(v)
=(i+1)deg𝒫i(v)+2LβdegC(v)\displaystyle=(i+1)\deg_{\partial\mathcal{P}_{i}}(v)+2L\beta\deg_{\partial C}(v)
3Lβdeg𝒫iC(v).\displaystyle\leq 3L\beta\deg_{\partial\mathcal{P}_{i}\cup\partial C}(v).

We conclude that the scaled-down and restricted vector 13Lβ(𝐬𝐭)|C\frac{1}{3L\beta}(\mathbf{s}-\mathbf{t}^{\prime})|_{C} is a demand satisfying |13Lβ(𝐬𝐭)|C|deg𝒫iC|C\big{|}\frac{1}{3L\beta}(\mathbf{s}-\mathbf{t}^{\prime})|_{C}\big{|}\leq\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}. By assumption (2) of Lemma 4.2, the collection of vertex weightings {deg𝒫iC|C0V:C𝒫i+1}\{\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{i+1}\} mixes simultaneously in GG with congestion α\alpha, so there is a flow in GG routing demand C𝒫i+113Lβ(𝐬𝐭)|C=13Lβ(𝐬𝐭)\sum_{C\in\mathcal{P}_{i+1}}\frac{1}{3L\beta}(\mathbf{s}-\mathbf{t}^{\prime})|_{C}=\frac{1}{3L\beta}(\mathbf{s}-\mathbf{t}^{\prime}) with congestion α\alpha. Scaling this flow by factor 3Lβ3L\beta, we obtain a flow f2f_{2} routing demand 𝐬𝐭\mathbf{s}-\mathbf{t}^{\prime} with congestion at most 3Lαβ3L\alpha\beta.

The final flow ff is f2f1f_{2}-f_{1}, which routes demand (𝐬𝐭)(𝐭𝐭)=𝐬𝐭(\mathbf{s}-\mathbf{t}^{\prime})-(\mathbf{t}-\mathbf{t}^{\prime})=\mathbf{s}-\mathbf{t} and has congestion 2Lα+3Lαβ5Lαβ2L\alpha+3L\alpha\beta\leq 5L\alpha\beta, concluding the proof of Lemma 4.6. ∎

5 Partitioning Algorithm

The partitioning algorithm starts with the partition 𝒫1={{v}:vV}\mathcal{P}_{1}=\{\{v\}:v\in V\} of singleton clusters. The algorithm then iteratively constructs partition 𝒫i+1\mathcal{P}_{i+1} given the current partitions 𝒫1,𝒫2,,𝒫i\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{i}. The lemma below establishes this iterative algorithm, where we substitute LL for ii.

Theorem 5.1.

Consider a capacitated graph G=(V,E)G=(V,E), and let α1\alpha\geq 1 be a parameter. Suppose there exist partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} that satisfy the three properties in Lemma 4.2, i.e.,

  1. 1.

    𝒫1\mathcal{P}_{1} is the partition {{v}:vV}\{\{v\}:v\in V\} of singleton clusters.

  2. 2.

    For each i[L1]i\in[L-1], the collection of vertex weightings {deg𝒫iC|C0V:C𝒫i+1}\{\mathrm{deg}_{\partial\mathcal{P}_{i}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{i+1}\} mixes simultaneously in GG with congestion α=O(log5(nW))\alpha=O(\log^{5}(nW)).

  3. 3.

    For each i[L1]i\in[L-1], there is a flow in GG with congestion β=O(log3(nW))\beta=O(\log^{3}(nW)) such that each vertex vVv\in V sends deg𝒫i+1(v)\deg_{\partial\mathcal{P}_{i+1}}(v) flow and receives at most 12deg𝒫i(v)\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(v) flow.

Then, there is an algorithm that constructs partition 𝒫L+1\mathcal{P}_{L+1} such that properties (2) and (3) hold for i=Li=L as well. The algorithm runs in O~(m)\tilde{O}(m) time.

We remark that the parameters α=O(log5(nW))\alpha=O(\log^{5}(nW)) and β=O(log3(nW))\beta=O(\log^{3}(nW)), which result in an overall quality of O(log10(nW))O(\log^{10}(nW)), are far from optimized. We aim for a clean and modular exposition over the optimization of logarithmic factors, leaving the latter open for future work.

Before we describe the algorithm, we first show that O(log(nW))O(\log(nW)) iterations suffice to obtain a congestion-approximator.

Claim 5.2.

After L=O(log(nW))L=O(\log(nW)) iterations, we have 𝒫L={V}\mathcal{P}_{L}=\{V\}, and 𝒞\mathcal{C} is a congestion-approximator with quality O(log10(nW))O(\log^{10}(nW)).

Proof.

We first claim that δ𝒫i+1δ𝒫i/2\delta\mathcal{P}_{i+1}\leq\delta\mathcal{P}_{i}/2 for all i[L]i\in[L]. By property (3), there is a flow that sends a total of deg𝒫i+1(V)\deg_{\partial\mathcal{P}_{i+1}}(V) flow among the vertices, and receives a total of at most 12deg𝒫i(V)\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(V) flow among the vertices. Since the total flow sent equals the total flow received, we have deg𝒫i+1(V)12deg𝒫i(V)\deg_{\partial\mathcal{P}_{i+1}}(V)\leq\frac{1}{2}\deg_{\partial\mathcal{P}_{i}}(V), or equivalently, δ𝒫i+112δ𝒫i\delta\mathcal{P}_{i+1}\leq\frac{1}{2}\delta\mathcal{P}_{i}.

The guarantee δ𝒫i+112δ𝒫i\delta\mathcal{P}_{i+1}\leq\frac{1}{2}\delta\mathcal{P}_{i} ensures that for L=O(log(nW))L=O(\log(nW)), we must have δ𝒫L<1\delta\mathcal{P}_{L}<1. Since all edge capacities are assumed to be at least 11, we conclude that δ𝒫L=0\delta\mathcal{P}_{L}=0 and 𝒫L={V}\mathcal{P}_{L}=\{V\}, fulfilling property (1) of Theorem 4.1. By Theorem 4.1, we obtain a congestion-approximator with quality 5L2αβ=O(log10(nW))5L^{2}\alpha\beta=O(\log^{10}(nW)). ∎

The rest of this section is dedicated to proving Theorem 5.1. Throughout the section, we fix the input graph G=(V,E)G=(V,E) as well as the current partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}. We also define 1,,L\mathcal{R}_{\geq 1},\ldots,\mathcal{R}_{\geq L} and 𝒞=i[L]i\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i} according to Lemma 4.2.

At a high level, our algorithm proceeds similarly to expander decomposition algorithms like [34], where expanders are defined with respect to the vertex weighting deg𝒫L\deg_{\partial\mathcal{P}_{L}} of partition 𝒫L\mathcal{P}_{L}. We iteratively decompose the graph using the cut-matching game of [18]: for each cluster of the decomposition, we either compute a low-conductance cut or certify that the current cluster is mixing (or in [34] terms, a nearly expander). The matching step of the cut-matching game requires a call to approximate min-cut/max-flow, but recall that the partitions 𝒫1,,𝒫L\mathcal{P}_{1},\ldots,\mathcal{P}_{L} only form a pseudo-congestion-approximator. Luckily, we can modify the graph in a way that the pseudo-congestion-approximator can be adapted to an actual congestion-approximator for the new graph. We then black-box a cut/flow algorithm on this new graph, and then modify the cut and flow to fit the old graph in a way that suffices for the cut-matching game.

In more detail, we break down this section as follows:

  1. 1.

    In Section 5.1, we show that 𝒞\mathcal{C} can be used to construct a congestion-approximator for slightly modified graphs.

  2. 2.

    In Section 5.2, we cite the (approximate) fair cut/flow algorithm of [25], which computes a cut/flow pair with desirable properties given a congestion-approximator.

  3. 3.

    In Section 5.3, we introduce the cut-matching game as well as a trimming procedure similar to [34]. Both the cut-matching game and the trimming step use the fair cut/flow algorithm (Section 5.2) on the modified graphs for which we have a congestion-approximator (Section 5.1).

  4. 4.

    Finally, in Section 5.4, we establish Theorem 5.1. We present the recursive clustering algorithm that computes the next partition 𝒫L+1\mathcal{P}_{L+1} given the current partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L}. It uses the cut-matching game and trimming procedures of Section 5.3.

5.1 Congestion-approximator

We first show how to build a congestion-approximator on certain graph instances that show up in our algorithm. We define these graph instances below.

Definition 5.3 (G[A,γ,𝐬,𝐭]G[A,\gamma,\mathbf{s},\mathbf{t}]).

For given vertex set AVA\subseteq V, parameter γ(0,1]\gamma\in(0,1], and vertex weightings 𝐬,𝐭0A\mathbf{s},\mathbf{t}\in\mathbb{R}^{A}_{\geq 0}, define the graph G[A,γ,𝐬,𝐭]G[A,\gamma,\mathbf{s},\mathbf{t}] as follows:

  • Start with the graph G[A]G[A].

  • Add new vertices xx, ss, and tt.

  • For each vertex vAv\in A, add an edge (x,v)(x,v) with capacity γdegG𝒫LGA(v)\gamma\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(v).

  • For each vertex vAv\in A, add an edge (s,v)(s,v) with capacity 𝐬(v)\mathbf{s}(v).

  • For each vertex vAv\in A, add an edge (t,v)(t,v) with capacity 𝐭(v)\mathbf{t}(v).

To understand these instances, recall a fact about 𝒞\mathcal{C} that follows from Lemma 4.2.

Fact 5.4.

For any demand 𝐛V\mathbf{b}\in\mathbb{R}^{V} satisfying |𝐛(C)|δC|\mathbf{b}(C)|\leq\delta C for all C𝒞C\in\mathcal{C}, there exists a demand 𝐛V\mathbf{b}^{\prime}\in\mathbb{R}^{V} satisfying |𝐛|Ldeg𝒫L|\mathbf{b}^{\prime}|\leq L\deg_{\partial\mathcal{P}_{L}} and a flow in GG routing demand 𝐛𝐛\mathbf{b}-\mathbf{b}^{\prime} with congestion κ\kappa, where we define κ=5L2αβ\kappa=5L^{2}\alpha\beta.

Suppose we start with the entire graph GG, and then add a vertex xx connected to each vertex vVv\in V with an edge of capacity deg𝒫L(v)\deg_{\partial\mathcal{P}_{L}}(v). In the setting of Fact 5.4, suppose we wish to route the demand 𝐛\mathbf{b}. We start with the flow routing demand 𝐛𝐛\mathbf{b}-\mathbf{b}^{\prime} as promised by Fact 5.4. To route the remaining demand 𝐛\mathbf{b}^{\prime}, we simply use the new edges incident to xx: for each vertex vVv\in V, send |𝐛(v)|Ldeg𝒫L|\mathbf{b}^{\prime}(v)|\leq L\deg_{\partial\mathcal{P}_{L}} flow along the edge (x,v)(x,v) in the proper direction, which is a flow with congestion LL. Overall, we obtain a flow routing demand 𝐛\mathbf{b} with congestion κ+L\kappa+L, and with some more work, we can show that 𝒞\mathcal{C} is a congestion approximator of the new graph.

For our algorithm, we actually work with graphs as described in Definition 5.3. In particular, the base graphs are induced subgraphs, and there are additional vertices ss and tt. One issue is that the newly added edges may also contribute to the values of δC\delta C for C𝒞C\in\mathcal{C}.131313This issue is also present in the example with the entire graph, but can be resolved by investigating the structure of 𝒞\mathcal{C}. Nevertheless, we show in the lemma below that as long as A,𝐬,𝐭A,\mathbf{s},\mathbf{t} are “well-behaved”, we can modify 𝒞\mathcal{C} into a congestion approximator for G[A,γ,𝐬,𝐭]G[A,\gamma,\mathbf{s},\mathbf{t}].

Lemma 5.5.

Consider partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} for a graph G=(V,E)G=(V,E) that satisfy the three properties in Lemma 4.2, and define the partitions 1,,L\mathcal{R}_{\geq 1},\ldots,\mathcal{R}_{\geq L} and 𝒞=i[L]i\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i} according to Lemma 4.2. Fix vertex set AVA\subseteq V, parameter γ(0,1]\gamma\in(0,1], and vertex weightings 𝐬,𝐭0A\mathbf{s},\mathbf{t}\in\mathbb{R}^{A}_{\geq 0} on AA, and denote the graph G[A,γ,𝐬,𝐭]G[A,\gamma,\mathbf{s},\mathbf{t}] by H=(VH,EH)H=(V_{H},E_{H}). Consider a parameter β1\beta\geq 1 such that the following assumption holds:

  1. ()(\star)

    𝐬(CA)+𝐭(CA)+γδG(CA)βδGC\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A)+\gamma\cdot\delta_{G}(C\cap A)\leq\beta\cdot\delta_{G}C for all C𝒞C\in\mathcal{C}.

Let 𝒞|A\mathcal{C}|_{A} be the collection {CA:C𝒞}\{C\cap A:C\in\mathcal{C}\}. Then, 𝒞|A{{x},{s},{t}}\mathcal{C}|_{A}\cup\{\{x\},\{s\},\{t\}\} is a congestion-approximator of HH with quality O(βγ1(κ+L))O(\beta\gamma^{-1}(\kappa+L)).

For the rest of Section 5.1, we prove Lemma 5.5. Consider a demand 𝐛A{x,s,t}\mathbf{b}\in\mathbb{R}^{A\cup\{x,s,t\}} satisfying |𝐛(CA)|δH(CA)|\mathbf{b}(C\cap A)|\leq\delta_{H}(C\cap A) for all C𝒞C\in\mathcal{C} as well as |𝐛(v)|δH{v}|\mathbf{b}(v)|\leq\delta_{H}\{v\} for v{x,s,t}v\in\{x,s,t\}. We want to establish a flow on HH routing demand 𝐛\mathbf{b} with congestion O(βγ1(κ+L))O(\beta\gamma^{-1}(\kappa+L)).

We first handle the demand at xx, ss, and tt. For each edge (x,v)(x,v) where vAv\in A, route 𝐛(x)/degH(x)cH(x,v)\mathbf{b}(x)/\deg_{H}(x)\cdot c_{H}(x,v) flow from xx to vv (or 𝐛(x)/degH(x)cH(x,v)-\mathbf{b}(x)/\deg_{H}(x)\cdot c_{H}(x,v) flow from vv to xx, whichever is nonnegative). Since |𝐛(x)|=|𝐛({x})|δH{x}=degH(x)|\mathbf{b}(x)|=|\mathbf{b}(\{x\})|\leq\delta_{H}\{x\}=\deg_{H}(x), we route at most cH(x,v)c_{H}(x,v) flow along each edge (x,v)(x,v). Analogously, for each edge (s,v)(s,v) where vAv\in A, route 𝐛(s)/degH(s)cH(s,v)\mathbf{b}(s)/\deg_{H}(s)\cdot c_{H}(s,v) flow from ss to vv, and for each edge (t,v)(t,v) where vAv\in A, route 𝐛(t)/degH(t)cH(t,v)\mathbf{b}(t)/\deg_{H}(t)\cdot c_{H}(t,v) flow from tt to vv. By the same argument, we route at most cH(s,v)c_{H}(s,v) and cH(t,v)c_{H}(t,v) flow along each edge (s,v)(s,v) and (t,v)(t,v), respectively. In other words, the routing so far has congestion 11.

After this initial routing, vertices xx, ss, and tt no longer have any demand, and each vertex vSv\in S receives at most cH(x,v)+cH(s,v)+cH(t,v)=c({x,s,t},v)c_{H}(x,v)+c_{H}(s,v)+c_{H}(t,v)=c(\{x,s,t\},v) additional demand in absolute value. In other words, if 𝐛~\widetilde{\mathbf{b}} is the new demand that must be routed, we have 𝐛~(x)=𝐛~(s)=𝐛~(t)=0\widetilde{\mathbf{b}}(x)=\widetilde{\mathbf{b}}(s)=\widetilde{\mathbf{b}}(t)=0 and |𝐛(v)𝐛~(v)|cH({x,s,t},v)|\mathbf{b}(v)-\tilde{\mathbf{b}}(v)|\leq c_{H}(\{x,s,t\},v).

In order to invoke Fact 5.4, our next goal is to show the following.

Claim 5.6.

|𝐛~(CA)|(1+2β+2γ)δGC|\widetilde{\mathbf{b}}(C\cap A)|\leq(1+2\beta+2\gamma)\delta_{G}C for all C𝒞C\in\mathcal{C}.

Proof.

We first bound |𝐛(CA)𝐛~(CA)||\mathbf{b}(C\cap A)-\widetilde{\mathbf{b}}(C\cap A)| as

|𝐛(CA)𝐛~(CA)|vCA|𝐛(v)𝐛~(v)|vCAcH({x,s,t},v)=cH({x,s,t},CA).\displaystyle|\mathbf{b}(C\cap A)-\widetilde{\mathbf{b}}(C\cap A)|\leq\sum_{v\in C\cap A}|\mathbf{b}(v)-\widetilde{\mathbf{b}}(v)|\leq\sum_{v\in C\cap A}c_{H}(\{x,s,t\},v)=c_{H}(\{x,s,t\},C\cap A).

Next, we bound |𝐛(CA)||\mathbf{b}(C\cap A)| as follows. By assumption, we have |𝐛(CA)|δH(CA)|\mathbf{b}(C\cap A)|\leq\delta_{H}(C\cap A). By construction of HH, we have

δH(CA)=cG(CA,AC)+cH({x,s,t},CA)δGC+cH({x,s,t},CA).\displaystyle\delta_{H}(C\cap A)=c_{G}(C\cap A,A\setminus C)+c_{H}(\{x,s,t\},C\cap A)\leq\delta_{G}C+c_{H}(\{x,s,t\},C\cap A).

Putting everything together, we obtain

|𝐛~(CA)||𝐛(CA)𝐛~(CA)|+|𝐛(CA)|δGC+2cH({x,s,t},CA).\displaystyle|\widetilde{\mathbf{b}}(C\cap A)|\leq|\mathbf{b}(C\cap A)-\widetilde{\mathbf{b}}(C\cap A)|+|\mathbf{b}(C\cap A)|\leq\delta_{G}C+2c_{H}(\{x,s,t\},C\cap A). (1)

It remains to bound cH({x,s,t},CA)c_{H}(\{x,s,t\},C\cap A), which we split into cH({s,t},CA)+cH({x},CA)c_{H}(\{s,t\},C\cap A)+c_{H}(\{x\},C\cap A). By construction of H=G[A,γ,𝐬,𝐭]H=G[A,\gamma,\mathbf{s},\mathbf{t}], we have

cH({s,t},CA)=𝐬(CA)+𝐭(CA)c_{H}(\{s,t\},C\cap A)=\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A)

and

cH({x},CA)=γdegG𝒫LGA(CA)γdegG𝒫L(CA)+γdegGA(CA).c_{H}(\{x\},C\cap A)=\gamma\cdot\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(C\cap A)\leq\gamma\cdot\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A)+\gamma\cdot\deg_{\partial_{G}A}(C\cap A).

We now bound the individual terms degG𝒫L(CA)\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A) and degGA(CA)\deg_{\partial_{G}A}(C\cap A) above. For degGA(CA)\deg_{\partial_{G}A}(C\cap A), we claim the bound degGA(CA)δG(CA)\deg_{\partial_{G}A}(C\cap A)\leq\delta_{G}(C\cap A): any edge in GA\partial_{G}A with an endpoint in CAC\cap A has its other endpoint outside CAC\cap A, so the edge must be in G(CA)\partial_{G}(C\cap A), and the claimed bound holds. For degG𝒫L(CA)\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A), we claim the bound degG𝒫L(CA)degG𝒫L(C)δGC\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A)\leq\deg_{\partial_{G}\mathcal{P}_{L}}(C)\leq\delta_{G}C. The first inequality is trivial, and for the second inequality, observe that by construction of 𝒞\mathcal{C}, each set C𝒞C\in\mathcal{C} is a subset of some cluster in the partition 𝒫L\mathcal{P}_{L}. It follows that any edge in G𝒫L\partial_{G}\mathcal{P}_{L} with an endpoint in CC has its other endpoint outside CC, so the edge must be in GC\partial_{G}C, and we conclude that degG𝒫L(C)δGC\deg_{\partial_{G}\mathcal{P}_{L}}(C)\leq\delta_{G}C.

Continuing from (1), we conclude that

|𝐛~(CA)|\displaystyle|\widetilde{\mathbf{b}}(C\cap A)| δGC+2cH({x,s,t},CA)\displaystyle\leq\delta_{G}C+2c_{H}(\{x,s,t\},C\cap A)
=δGC+2cH({s,t},CA)+2cH({x},CA)\displaystyle=\delta_{G}C+2c_{H}(\{s,t\},C\cap A)+2c_{H}(\{x\},C\cap A)
δGC+2(𝐬(CA)+𝐭(CA))+2(γdegG𝒫L(CA)+γdegGA(CA))\displaystyle\leq\delta_{G}C+2(\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A))+2(\gamma\cdot\deg_{\partial_{G}\mathcal{P}_{L}}(C\cap A)+\gamma\cdot\deg_{\partial_{G}A}(C\cap A))
δGC+2(𝐬(CA)+𝐭(CA))+2(γδGC+γδG(CA))\displaystyle\leq\delta_{G}C+2(\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A))+2(\gamma\cdot\delta_{G}C+\gamma\cdot\delta_{G}(C\cap A))
1δGC+2βδGC+2γδGC,\displaystyle\stackrel{{\scriptstyle\mathclap{\ref{item:property-Ast1}}}}{{\leq}}\delta_{G}C+2\beta\cdot\delta_{G}C+2\gamma\cdot\delta_{G}C,

finishing the proof. ∎

Let 𝐛~V\widetilde{\mathbf{b}}^{\prime}\in\mathbb{R}^{V} be the vector 𝐛~A{x,s,t}\widetilde{\mathbf{b}}\in\mathbb{R}^{A\cup\{x,s,t\}} without entries 𝐛~(x),𝐛~(s),𝐛~(t)\widetilde{\mathbf{b}}(x),\widetilde{\mathbf{b}}(s),\widetilde{\mathbf{b}}(t) and with new entries 𝐛~(v)=0\widetilde{\mathbf{b}}^{\prime}(v)=0 for all vVAv\in V\setminus A. Since 𝐛~\widetilde{\mathbf{b}} is a demand with 𝐛~(x)=𝐛~(s)=𝐛~(t)=0\widetilde{\mathbf{b}}(x)=\widetilde{\mathbf{b}}(s)=\widetilde{\mathbf{b}}(t)=0, we have that 𝐛~\widetilde{\mathbf{b}}^{\prime} is also a demand, i.e., the coordinates sum to 0. By Claim 5.6, we have

|𝐛~(C)|=|𝐛~(CA)|(1+2β+2γ)δGC,|\widetilde{\mathbf{b}}^{\prime}(C)|=|\widetilde{\mathbf{b}}(C\cap A)|\leq(1+2\beta+2\gamma)\delta_{G}C,

so we can apply Fact 5.4 on demand 𝐛~/(1+2β+2γ)\widetilde{\mathbf{b}}^{\prime}/(1+2\beta+2\gamma) to obtain a demand 𝐛V\mathbf{b}^{\prime}\in\mathbb{R}^{V} satisfying |𝐛|LdegG𝒫L|\mathbf{b}^{\prime}|\leq L\deg_{\partial_{G}\mathcal{P}_{L}} and a flow on GG routing demand 𝐛~/(1+2β+2γ)𝐛\widetilde{\mathbf{b}}^{\prime}/(1+2\beta+2\gamma)-\mathbf{b}^{\prime} with congestion κ\kappa. Scaling this flow by factor (1+2β+2γ)(1+2\beta+2\gamma), we obtain a flow ff^{\prime} on GG routing demand 𝐛~(1+2β+2γ)𝐛\widetilde{\mathbf{b}}^{\prime}-(1+2\beta+2\gamma)\mathbf{b}^{\prime} with congestion (1+2β+2γ)κ(1+2\beta+2\gamma)\kappa.

Next, imagine contracting VAV\setminus A into a single vertex labeled xx, so that each edge (v,x)(v,x) has capacity degGA(v)\deg_{\partial_{G}A}(v). Consider the corresponding flow ff^{\prime} on this contracted graph, which sends at most (1+2β+2γ)κdegGA(v)(1+2\beta+2\gamma)\kappa\deg_{\partial_{G}A}(v) flow on each edge (v,x)(v,x). Now consider the exact same flow on HH, whose edges (v,x)(v,x) have capacities γdegG𝒫LGA(v)\gamma\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(v) instead of degGA(v)\deg_{\partial_{G}A}(v). These capacities are at least γ\gamma times the capacities of the contracted graph, so the corresponding flow has congestion at most a factor 1/γ1/\gamma larger. We have established a flow on HH routing demand 𝐛~(1+2β+2γ)𝐛\widetilde{\mathbf{b}}^{\prime}-(1+2\beta+2\gamma)\mathbf{b}^{\prime} with congestion γ1(1+2β+2γ)κ\gamma^{-1}(1+2\beta+2\gamma)\kappa.

Finally, since demand (1+2β+2γ)𝐛(1+2\beta+2\gamma)\mathbf{b}^{\prime} satisfies |(1+2β+2γ)𝐛|(1+2β+2γ)LdegG𝒫L|(1+2\beta+2\gamma)\mathbf{b}^{\prime}|\leq(1+2\beta+2\gamma)L\deg_{\partial_{G}\mathcal{P}_{L}}, we can directly route the demand along the edges (v,x)(v,x) of capacity γdegG𝒫LGA(v)\gamma\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(v), which is a routing with congestion γ1(1+2β+2γ)L\gamma^{-1}(1+2\beta+2\gamma)L.

Adding up all three routings, we have routed the initial demand 𝐛\mathbf{b} with congestion 1+γ1(1+2β+2γ)κ+γ1(1+2β+2γ)L=O(βγ1(κ+L))1+\gamma^{-1}(1+2\beta+2\gamma)\kappa+\gamma^{-1}(1+2\beta+2\gamma)L=O(\beta\gamma^{-1}(\kappa+L)), concluding the proof.

5.2 Fair Cut/Flow Algorithm

Given a congestion-approximator, the most convenient min-cut/max-flow algorithm is the fair cut/flow algorithm of [25]. We state the definition of a fair cut/flow and then cite the main result of [25].

Definition 5.7 (Fair cut/flow).

Let G=(V,E)G=(V,E) be an undirected graph with edge capacities c>0Ec\in\mathbb{R}_{>0}^{E}. Let s,ts,t be two vertices in VV. For any parameter α1\alpha\geq 1, we say that an (s,t)(s,t) cut SVS\subseteq V and a feasible flow ff is an α\alpha-fair (s,t)(s,t)-cut/flow pair if for each edge (u,v)S(u,v)\in\partial S with uSu\in S and vTv\in T, the flow ff sends at least 1αc(u,v)\frac{1}{\alpha}\cdot c(u,v) flow along the edge in the direction from uu to vv.

Fact 5.8.

For any α\alpha-fair (s,t)(s,t)-cut/flow pair (S,f)(S,f), the cut S\partial S is an α\alpha-approximate minimum (s,t)(s,t)-cut.

Theorem 5.9 (Fair cut/flow algorithm [25]).

Consider a graph G=(V,E)G=(V,E), two vertices s,tVs,t\in V, and error parameter ϵ(0,1]\epsilon\in(0,1]. Given a congestion-approximator 𝒞\mathcal{C} with quality κ\kappa, there is an algorithm that outputs a (1+ϵ)(1+\epsilon)-fair (s,t)(s,t)-cut/flow pair in O~((κ/ϵ)O(1)(K+m))\tilde{O}((\kappa/\epsilon)^{O(1)}(K+m)) time where K=C𝒞|C|K=\sum_{C\in\mathcal{C}}|C|.

We remark that the fair cut/flow algorithm above is not the fastest available algorithm. However, it is conceptually the easiest for our purposes, and we believe that future work may improve the running time of fair cut/flow algorithms to approach those of standard approximate cut/flow algorithms. Hence, we decide to black-box a fair cut/flow algorithm rather than starting with a standard cut/flow algorithm and massaging it to work in our setting.

To apply Theorem 5.9 to the graph H=G[A,γ,𝐬,𝐭]H=G[A,\gamma,\mathbf{s},\mathbf{t}] with the congestion-approximator of Lemma 5.5, we need to bound KK for the congestion-approximator 𝒞|A{{x},{s},{t}}\mathcal{C}|_{A}\cup\{\{x\},\{s\},\{t\}\}. Recall that 𝒞=i[L]i\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i} is the union of LL partitions of VV, so 𝒞|A\mathcal{C}|_{A} is the union of LL partitions of AA. So K=L|A|+3K=L|A|+3, where the +3+3 comes from the singletons in {{x},{s},{t}}\{\{x\},\{s\},\{t\}\}. It follows that Theorem 5.9 runs in time O~((κ/ϵ)O(1)(K+|E(H)|))=O~((κ/ϵ)O(1)(L|A|+m))\tilde{O}((\kappa/\epsilon)^{O(1)}(K+|E(H)|))=\tilde{O}((\kappa/\epsilon)^{O(1)}(L|A|+m^{\prime})) where mm^{\prime} is the number of edges in GG incident to vertices in AA.

We conclude this section with the main subroutine that we use to construct partition 𝒫L+1\mathcal{P}_{L+1}. Note that the assumption 1 below remains unchanged.

Theorem 5.10 (Flow/cut subroutine).

Consider partitions 𝒫1,𝒫2,,𝒫L\mathcal{P}_{1},\mathcal{P}_{2},\ldots,\mathcal{P}_{L} for a graph G=(V,E)G=(V,E) that satisfy the three properties in Lemma 4.2, and define the partitions 1,,L\mathcal{R}_{\geq 1},\ldots,\mathcal{R}_{\geq L} and 𝒞=i[L]i\mathcal{C}=\bigcup_{i\in[L]}\mathcal{R}_{\geq i} according to Lemma 4.2. Fix vertex set AVA\subseteq V, parameter γ(0,1]\gamma\in(0,1], and vertex weightings 𝐬,𝐭0A\mathbf{s},\mathbf{t}\in\mathbb{R}^{A}_{\geq 0} on AA, and denote the graph G[A,γ,𝐬,𝐭]G[A,\gamma,\mathbf{s},\mathbf{t}] by H=(VH,EH)H=(V_{H},E_{H}). Consider a parameter β1\beta\geq 1 such that the following assumption holds:

  1. ()(\star)

    𝐬(CA)+𝐭(CA)+γδG(CA)βδGC\mathbf{s}(C\cap A)+\mathbf{t}(C\cap A)+\gamma\cdot\delta_{G}(C\cap A)\leq\beta\cdot\delta_{G}C for all C𝒞C\in\mathcal{C}.

Let 𝒞|A\mathcal{C}|_{A} be the collection {CA:C𝒞}\{C\cap A:C\in\mathcal{C}\}. Then, given two vertices s,tVs,t\in V and error parameter ϵ(0,1]\epsilon\in(0,1], there is an algorithm that outputs a (1+ϵ)(1+\epsilon)-fair (s,t)(s,t)-cut/flow pair in O~((Lαβ/ϵ)O(1)(|A|+m))\tilde{O}((L\alpha\beta/\epsilon)^{O(1)}(|A|+m^{\prime})) time, where mm^{\prime} is the number of edges in GG incident to vertices in AA.

5.3 Cut-Matching Game and Trimming

We follow the cut-matching game treatment in [34]: either find a “balanced” cut of small capacity, or ensure that a “large” part of the graph mixes with low congestion. The following lemma is similar to Theorem 2.2 of [34] with one key difference: there is no built-in flow subroutine, so the algorithm makes black-box calls to the fair cut/flow algorithm of Theorem 5.10.

In past work [33, 1], the analysis of the cut-matching game for capacitated graphs has only been sketched, referencing the fact that a capacitated graph can be modelled by an uncapacitated graph with parallel edges (at a cost). For completeness, we provide a full proof of this capacitated case in Appendix A.

Theorem 5.11 (Cut-Matching).

Consider a graph G=(V,E)G=(V,E) with integral edge capacities in the range [1,W][1,W]. Let AVA\subseteq V be a vertex subset, let ϕ,η>0\phi,\eta>0 be parameters, and define 𝐝0A\mathbf{d}\in\mathbb{R}^{A}_{\geq 0} as 𝐝=deg𝒫LA|A\mathbf{d}=\textup{deg}_{\partial\mathcal{P}_{L}\cup\partial A}|_{A}. Suppose that the following assumption holds:

  1. ()(\diamond)

    There is a flow on GG with congestion κ\kappa such that each vertex vAv\in A is the source of cG({v},VA)c_{G}(\{v\},V\setminus A) flow and each vertex vVv\in V is the sink of at most deg𝒫L(v)\deg_{\partial\mathcal{P}_{L}}(v) flow.

There exists parameter T=O(log2(nW))T=O(\log^{2}(nW)) and a randomized, Monte Carlo algorithm that outputs a (potentially empty) set RVR\subseteq V such that

  1. 1.

    δG[A]Rϕ𝐝(R)+ϕ6T𝐝(A)\delta_{G[A]}R\leq\phi\mathbf{d}(R)+\frac{\phi}{6T}\mathbf{d}(A),

  2. 2.

    𝐝(R)𝐝(A)/2\mathbf{d}(R)\leq\mathbf{d}(A)/2, and

  3. 3.

    Either 𝐝(R)𝐝(A)/(6T)\mathbf{d}(R)\geq\mathbf{d}(A)/(6T), or the vertex weighting 𝐝|AR\mathbf{d}|_{A\setminus R} mixes in G[A]G[A] with congestion 5T/ϕ5T/\phi with high probability.

The algorithm makes at most TT calls to Theorem 5.10 with parameters

AA,ϵ118T2,γϵϕ2,βmax{1,(24ϕ+ϵγ)(κ+2)}.A\leftarrow A,\,\epsilon\leftarrow\frac{1}{18T^{2}},\,\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow\max\{1,(24\phi+\epsilon\gamma)(\kappa+2)\}.

Outside these calls, the algorithm takes an additional O((|A|+m)log4(nW))O((|A|+m^{\prime})\log^{4}(nW)) time, where mm^{\prime} is the number of edges in GG incident to vertices in AA.

Property (3) asserts that either an approximately balanced cut is found, or the vertex weighting 𝐝|AR\mathbf{d}|_{A\setminus R} mixes with low congestion (with high probability). In our algorithm for Theorem 5.1, we actually want the weighting 𝐝AR+degR\mathbf{d}_{A\setminus R}+\deg_{\partial R} to mix in the second case. To guarantee this stronger property, we augment the set RR into RBR\cup B through one additional call to the fair cut/flow algorithm of Theorem 5.10. The algorithm is similar to the flow-based expander trimming procedure in [34]. For completeness, we defer the algorithm and proof to Appendix B. Note that the setting, including assumption 1, is the same as Theorem 5.11.

Theorem 5.12 (Trimming).

Consider a graph G=(V,E)G=(V,E) with integral edge capacities in the range [1,W][1,W]. Let AVA\subseteq V be a vertex subset, let ϕ,κ>0\phi,\kappa>0 be parameters, and define 𝐝0A\mathbf{d}\in\mathbb{R}^{A}_{\geq 0} as 𝐝=deg𝒫LA|A\mathbf{d}=\textup{deg}_{\partial\mathcal{P}_{L}\cup\partial A}|_{A}. Suppose that the following assumption holds:

  1. ()(\diamond)

    There is a flow on GG with congestion κ\kappa such that each vertex vAv\in A is the source of cG({v},VA)c_{G}(\{v\},V\setminus A) flow and each vertex vVv\in V is the sink of at most deg𝒫L(v)\deg_{\partial\mathcal{P}_{L}}(v) flow.

There is a deterministic algorithm that inputs a subset RAR\subseteq A and a parameter ϵ>0\epsilon>0, and outputs a (potentially empty) set BAB\subseteq A such that

  1. 1.

    δG[A]B2δG[A]R+2ϵϕ𝐝(A)\delta_{G[A]}B\leq 2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A),

  2. 2.

    𝐝(BR)16ϕδG[A]R+ϵ6𝐝(A)\mathbf{d}(B\setminus R)\leq\frac{1}{6\phi}\,\delta_{G[A]}R+\frac{\epsilon}{6}\mathbf{d}(A),

  3. 3.

    If the vertex weighting 𝐝|AR\mathbf{d}|_{A\setminus R} mixes in G[A]G[A] with congestion cc, then the vertex weighting(𝐝+degG[A](RB))|A(RB)(\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)})|_{A\setminus(R\cup B)} mixes in G[A]G[A] with congestion 2+(1+24ϕ)c2+(1+24\phi)c, and

  4. 4.

    There exists a vector 𝐭0A\mathbf{t}\in\mathbb{R}^{A}_{\geq 0} with 𝐭24ϕ𝐝|A(RB)\mathbf{t}\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)} and a flow gg on G[A(RB)]G[A\setminus(R\cup B)] routing demand degG[A](RB)|A(RB)𝐭\textup{deg}_{\partial_{G[A]}(R\cup B)}|_{A\setminus(R\cup B)}-\mathbf{t} with congestion 22.

The algorithm makes one call to Theorem 5.10 with parameters

AA,ϵϵ,γϵϕ2,βmax{1,(12ϕ+ϵγ)(κ+2)}.A\leftarrow A,\,\epsilon\leftarrow\epsilon,\,\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow\max\{1,(12\phi+\epsilon\gamma)(\kappa+2)\}.

Outside of this call, the algorithm takes an additional O(|A|+m)O(|A|+m^{\prime}) time, where mm^{\prime} is the number of edges in GG incident to vertices in AA.

5.4 Clustering Algorithm

With the necessary primitives established, we now describe how to construct partition 𝒫L+1\mathcal{P}_{L+1} given the partitions 𝒫1,,𝒫L\mathcal{P}_{1},\ldots,\mathcal{P}_{L}. The algorithm is recursive, taking as input a vertex subset AVA\subseteq V that is initially VV. Throughout, we maintain the invariant that each input subset satisfies assumption 1, which is the same for Theorems 5.11 and 5.12.

On input AVA\subseteq V, the algorithm calls Theorem 5.11 with parameters ϕ1Clog3(nW)\phi\leftarrow\frac{1}{C\log^{3}(nW)} and κClog3(nW)\kappa\leftarrow C\log^{3}(nW) for a large enough constant C>0C>0. The algorithm obtains an output set RAR\subseteq A and then calls Theorem 5.12 on inputs RRR\leftarrow R and ϵ1/(4T)\epsilon\leftarrow 1/(4T) with the same parameters ϕ,κ\phi,\kappa, obtaining a set BAB\subseteq A. There are now two cases:

  1. 1.

    If 𝐝(R)𝐝(A)/(6T)\mathbf{d}(R)\geq\mathbf{d}(A)/(6T), then recursively call the algorithm on inputs RBR\cup B and A(RB)A\setminus(R\cup B) if they are nonempty.

  2. 2.

    Otherwise, make a single recursive call on input RBR\cup B if it is nonempty, and add the set A(RB)A\setminus(R\cup B) to the final partition 𝒫L+1\mathcal{P}_{L+1}.

Claim 5.13.

Property (2) of Theorem 5.1 holds for i=Li=L, i.e., the collection of vertex weightings{deg𝒫LC|C0V:C𝒫L+1}\{\mathrm{deg}_{\partial\mathcal{P}_{L}\cup\partial C}|_{C}\in\mathbb{R}^{V}_{\geq 0}:C\in\mathcal{P}_{L+1}\} mixes simultaneously in GG with congestion O(log5(nW))O(\log^{5}(nW)).

Proof.

By property (3) of Theorem 5.11 and property (3) of Theorem 5.12, for each set A(RB)A\setminus(R\cup B) added to the final partition 𝒫L+1\mathcal{P}_{L+1}, the vertex weighting (𝐝+degG[A](RB))|A(RB)(\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)})|_{A\setminus(R\cup B)} mixes in G[A]G[A] with congestion 2+(1+24ϕ)5T/ϕ2+(1+24\phi)\cdot 5T/\phi. Since G(A(RB))GAG[A](A(RB))\partial_{G}(A\setminus(R\cup B))\subseteq\partial_{G}A\cup\partial_{G[A]}(A\setminus(R\cup B)), we have

𝐝+degG[A](RB)\displaystyle\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)} =degG𝒫LGA+degG[A](A(RB))\displaystyle=\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}+\deg_{\partial_{G[A]}(A\setminus(R\cup B))}
degG𝒫LGAG[A](A(RB))\displaystyle\geq\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A\cup\partial_{G[A]}(A\setminus(R\cup B))}
degG𝒫LG(A(RB)),\displaystyle\geq\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}(A\setminus(R\cup B))},

so in particular, the vertex weighting degG𝒫LG(A(RB))|A(RB)(𝐝+degG[A](RB))|A(RB)\textup{deg}_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}(A\setminus(R\cup B))}|_{A\setminus(R\cup B)}\leq(\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)})|_{A\setminus(R\cup B)} also mixes in G[A]G[A] with congestion 2+(1+24ϕ)5T/ϕ2+(1+24\phi)\cdot 5T/\phi. The recursive instances AA that add a set A(RB)A\setminus(R\cup B) to 𝒫L+1\mathcal{P}_{L+1} are disjoint, so the vertex weightings degG𝒫LG(RB)|A(RB)\textup{deg}_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}(R\cup B)}|_{A\setminus(R\cup B)} mix simultaneously in GG with the same congestion. We bound the congestion by 2+(1+24ϕ)5T/ϕ=O(T/ϕ)=O(log5(nW))2+(1+24\phi)\cdot 5T/\phi=O(T/\phi)=O(\log^{5}(nW)), concluding the proof. ∎

It remains to establish condition (3) of Theorem 5.1 for i=Li=L as well as the assumption 1 of Theorems 5.11 and 5.12. To do so, we first prove a few guarantees of the algorithm.

Claim 5.14.

For any recursive call AAA^{\prime}\subseteq A, we have 𝐝(A)(1124T)𝐝(A)\mathbf{d}^{\prime}(A^{\prime})\leq(1-\frac{1}{24T})\mathbf{d}(A) where 𝐝=deg𝒫LA|A\mathbf{d}^{\prime}=\textup{deg}_{\partial\mathcal{P}_{L}\cup\partial A^{\prime}}|_{A^{\prime}}.

Proof.

We first claim that GAGAG[A]A\partial_{G}A^{\prime}\subseteq\partial_{G}A\cup\partial_{G[A]}A^{\prime}. For any edge in GA\partial_{G}A^{\prime}, consider its endpoint in VAV\setminus A^{\prime}. Either it is in AAA\setminus A^{\prime}, in which case the edge belongs to G[A]A\partial_{G[A]}A^{\prime}, or it is in VAV\setminus A, in which case the edge belongs to GA\partial_{G}A. It follows that GAGAG[A]A\partial_{G}A^{\prime}\subseteq\partial_{G}A\cup\partial_{G[A]}A^{\prime}, and we can bound 𝐝(A)\mathbf{d}^{\prime}(A^{\prime}) as follows:

𝐝(A)\displaystyle\mathbf{d}^{\prime}(A^{\prime}) =degG𝒫LGA(A)\displaystyle=\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A^{\prime}}(A^{\prime})
degG𝒫LGAG[A]A(A)\displaystyle\leq\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A\cup\partial_{G[A]}A^{\prime}}(A^{\prime})
degG𝒫LGA(A)+degG[A]A(A)\displaystyle\leq\deg_{\partial_{G}\mathcal{P}_{L}\cup\partial_{G}A}(A^{\prime})+\deg_{\partial_{G[A]}A^{\prime}}(A^{\prime})
=𝐝(A)+δG[A]A.\displaystyle=\mathbf{d}(A^{\prime})+\delta_{G[A]}A^{\prime}.

By properties (1) and (2) of Theorem 5.11, we have δG[A]Rϕ𝐝(R)+ϕ6T𝐝(A)\delta_{G[A]}R\leq\phi\mathbf{d}(R)+\frac{\phi}{6T}\mathbf{d}(A) and 𝐝(R)𝐝(A)/2\mathbf{d}(R)\leq\mathbf{d}(A)/2. By properties (1) and (2) of Theorem 5.12, we have δG[A]B2δG[A]R+2ϵϕ𝐝(A)\delta_{G[A]}B\leq 2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A) and 𝐝(BR)16ϕδG[A]R+ϵ6𝐝(A)\mathbf{d}(B\setminus R)\leq\frac{1}{6\phi}\,\delta_{G[A]}R+\frac{\epsilon}{6}\mathbf{d}(A). The only two options for the recursive instance AA^{\prime} are A=RBA^{\prime}=R\cup B and A=A(RB)A^{\prime}=A\setminus(R\cup B), and in both cases, we have

δG[A]A=δG[A](RB)\displaystyle\delta_{G[A]}A^{\prime}=\delta_{G[A]}(R\cup B) δG[A]R+δG[A]B\displaystyle\leq\delta_{G[A]}R+\delta_{G[A]}B
δG[A]R+2δG[A]R+2ϵϕ𝐝(A)\displaystyle\leq\delta_{G[A]}R+2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A)
=3δG[A]R+ϕ2T𝐝(A)\displaystyle=3\delta_{G[A]}R+\frac{\phi}{2T}\mathbf{d}(A)
3(ϕ𝐝(R)+ϕ6T𝐝(A))+ϕ2T𝐝(A)\displaystyle\leq 3\left(\phi\mathbf{d}(R)+\frac{\phi}{6T}\mathbf{d}(A)\right)+\frac{\phi}{2T}\mathbf{d}(A)
=3ϕ𝐝(R)+ϕT𝐝(A).\displaystyle=3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A).

Combining the two bounds so far, we obtain

𝐝(A)𝐝(A)+3ϕ𝐝(R)+ϕT𝐝(A).\mathbf{d}^{\prime}(A^{\prime})\leq\mathbf{d}(A^{\prime})+3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A).

To bound 𝐝(A)\mathbf{d}(A^{\prime}), we case on whether A=RBA^{\prime}=R\cup B or A=A(RB)A^{\prime}=A\setminus(R\cup B). If A=A(RB)A^{\prime}=A\setminus(R\cup B), then we must be in case (1) of the algorithm, which means 𝐝(R)𝐝(A)/(6T)\mathbf{d}(R)\geq\mathbf{d}(A)/(6T). In this case, we bound 𝐝(A)=𝐝(A(RB))𝐝(AR)=𝐝(A)𝐝(R)\mathbf{d}(A^{\prime})=\mathbf{d}(A\setminus(R\cup B))\leq\mathbf{d}(A\setminus R)=\mathbf{d}(A)-\mathbf{d}(R). Together with the bound ϕ1/24\phi\leq 1/24, we obtain

𝐝(A)\displaystyle\mathbf{d}^{\prime}(A^{\prime}) 𝐝(A)+3ϕ𝐝(R)+ϕT𝐝(A)\displaystyle\leq\mathbf{d}(A^{\prime})+3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A)
𝐝(A)𝐝(R)+3ϕ𝐝(R)+ϕT𝐝(A)\displaystyle\leq\mathbf{d}(A)-\mathbf{d}(R)+3\phi\mathbf{d}(R)+\frac{\phi}{T}\mathbf{d}(A)
𝐝(A)12𝐝(R)+124T𝐝(A)\displaystyle\leq\mathbf{d}(A)-\frac{1}{2}\mathbf{d}(R)+\frac{1}{24T}\mathbf{d}(A)
𝐝(A)12𝐝(A)6T+124T𝐝(A)\displaystyle\leq\mathbf{d}(A)-\frac{1}{2}\cdot\frac{\mathbf{d}(A)}{6T}+\frac{1}{24T}\mathbf{d}(A)
=(1124T)𝐝(A),\displaystyle=\left(1-\frac{1}{24T}\right)\mathbf{d}(A),

as promised. Otherwise, suppose that A=RBA^{\prime}=R\cup B. We have

𝐝(RB)\displaystyle\mathbf{d}(R\cup B) =𝐝(R)+𝐝(BR)\displaystyle=\mathbf{d}(R)+\mathbf{d}(B\setminus R)
𝐝(R)+16ϕδG[A]R+ϵ6𝐝(A)\displaystyle\leq\mathbf{d}(R)+\frac{1}{6\phi}\delta_{G[A]}R+\frac{\epsilon}{6}\mathbf{d}(A)
𝐝(R)+16ϕ(ϕ𝐝(R)+ϕ6T𝐝(A))+ϵ6𝐝(A)\displaystyle\leq\mathbf{d}(R)+\frac{1}{6\phi}\left(\phi\mathbf{d}(R)+\frac{\phi}{6T}\mathbf{d}(A)\right)+\frac{\epsilon}{6}\mathbf{d}(A)
76𝐝(R)+136T𝐝(A)+ϵ6𝐝(A)\displaystyle\leq\frac{7}{6}\mathbf{d}(R)+\frac{1}{36T}\mathbf{d}(A)+\frac{\epsilon}{6}\mathbf{d}(A)
7612𝐝(A)+136𝐝(A)+16𝐝(A)\displaystyle\leq\frac{7}{6}\cdot\frac{1}{2}\mathbf{d}(A)+\frac{1}{36}\mathbf{d}(A)+\frac{1}{6}\mathbf{d}(A)
=79𝐝(A)\displaystyle=\frac{7}{9}\mathbf{d}(A)
(1124T)𝐝(A),\displaystyle\leq\left(1-\frac{1}{24T}\right)\mathbf{d}(A),

as promised. With both cases established, this concludes the proof. ∎

For a given recursive call AA, define its recursion depth inductively as follows: the initial call AVA\leftarrow V has depth 0, and given a recursive call AA of depth dd, all of its recursive calls have depth d+1d+1. By Claim 5.14, the value of 𝐝(A)\mathbf{d}(A) decreases multiplicatively by factor 1/(24T)1/(24T) on each recursive call, so the maximum recursion depth is O(Tlog(nW))O(T\log(nW)).

For a given recursion depth dd, let EdEE_{d}\subseteq E denote the union of edges G[A](RB)\partial_{G[A]}(R\cup B) over all instances AA of depth dd. By construction of the algorithm, the (disjoint) union of EdE_{d} over all recursion depths dd is exactly 𝒫L+1\partial\mathcal{P}_{L+1}. To avoid clutter, we also define E<d=E1Ed1E_{<d}=E_{1}\cup\cdots\cup E_{d-1}.

Claim 5.15.

For any recursion depth d0d\geq 0, there is a flow on GG with congestion 44 such that each vertex vVv\in V sends degEd(v)\deg_{E_{d}}(v) flow and receives at most 48ϕdeg𝒫LE<d(v)48\phi\deg_{\partial\mathcal{P}_{L}\cup E_{<d}}(v) flow.

Proof.

We prove the statement by induction on d0d\geq 0. The base case d=0d=0 is satisfied with the empty flow: since 𝒜0\mathcal{A}_{0} is the partition {V}\{V\} with a single part, each vertex vAv\in A indeed sends degA0(v)=0\deg_{\partial A_{0}}(v)=0 flow. Now assume by induction that there is a flow on GG with congestion 44 such that each vertex vAv\in A sends degEd(v)\deg_{E_{d}}(v) flow and receives at most 48ϕdeg𝒫L(v)48\phi\deg_{\partial\mathcal{P}_{L}}(v) flow.

For each instance AA of depth dd, the algorithm calls Theorem 5.12 which defines 𝐝=deg𝒫LA|A\mathbf{d}=\textup{deg}_{\partial\mathcal{P}_{L}\cup\partial A}|_{A}. By property (4) of Theorem 5.12, there exists a vector 𝐭0A\mathbf{t}\in\mathbb{R}^{A}_{\geq 0} with 𝐭24ϕ𝐝|A(RB)\mathbf{t}\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)} and a flow gg on G[A(RB)]G[A\setminus(R\cup B)] routing demand degG[A](RB)|A(RB)𝐭\textup{deg}_{\partial_{G[A]}(R\cup B)}|_{A\setminus(R\cup B)}-\mathbf{t} with congestion 22. We now construct a flow in G[A]G[A] with congestion 44 such that each vertex vAv\in A sends degEd(v)=degG[A](RB)(v)\deg_{E_{d}}(v)=\deg_{\partial_{G[A]}(R\cup B)}(v) flow and receives at most 48ϕdeg𝒫LE<d(v)48\phi\deg_{\partial\mathcal{P}_{L}\cup\partial E_{<d}}(v) flow. First, for each edge in G[A](RB)\partial_{G[A]}(R\cup B), send flow to full capacity in the direction from RBR\cup B to A(RB)A\setminus(R\cup B). In this initial flow, each vertex vRBv\in R\cup B sends exactly degG[A](RB)(v)\deg_{\partial_{G[A]}(R\cup B)}(v) flow, and each vertex vA(RB)v\in A\setminus(R\cup B) receives exactly degG[A](RB)(v)\deg_{\partial_{G[A]}(R\cup B)}(v) flow. Next, we send the flow gg scaled by 22, so that each vertex vA(RB)v\in A\setminus(R\cup B) sends exactly 2degG[A](RB)(v)2\deg_{\partial_{G[A]}(R\cup B)}(v) flow and each vertex vA(RB)v\in A\setminus(R\cup B) receives at most 48ϕ𝐝(v)48\phi\mathbf{d}(v) flow. Note that 𝐝(v)=deg𝒫LA(v)deg𝒫LE<d(v)\mathbf{d}(v)=\deg_{\partial\mathcal{P}_{L}\cup\partial A}(v)\leq\deg_{\partial\mathcal{P}_{L}\cup\partial E_{<d}}(v) since AE<d\partial A\subseteq\partial E_{<d}. Summing the two flows, we obtain a flow in G[A]G[A] such that each vertex vAv\in A sends degEd(v)=degG[A](RB)(v)\deg_{E_{d}}(v)=\deg_{\partial_{G[A]}(R\cup B)}(v) flow and receives at most 48ϕdeg𝒫LE<d(v)48\phi\deg_{\partial\mathcal{P}_{L}\cup\partial E_{<d}}(v) flow. The congestion of the flow is 44, since edges in G[A](RB)\partial_{G[A]}(R\cup B) have congestion 11 in the initial flow, and edges in G[A(RB)]G[A\setminus(R\cup B)] have congestion 22 in the flow gg scaled by 22.

To complete the induction, our final flow is the union of the constructed flow over all recursive instances AA of depth dd. Since the flow for instance AA is in G[A]G[A], and since the instances AA are disjoint, the flows are also disjoint over all AA. It follows that their union is a flow on GG with congestion 44 such that each vertex vAv\in A sends degEd(v)\deg_{E_{d}}(v) flow and each vertex vVv\in V receives 48ϕdeg𝒫LE<d(v)48\phi\deg_{\partial\mathcal{P}_{L}\cup E_{<d}}(v) flow. ∎

Finally, the two claims below establish property (3) of Theorem 5.1 and assumption 1, respectively.

Claim 5.16.

Property (3) of Theorem 5.1 holds for i=Li=L, i.e., there is a flow in GG with congestion O(Tlog(nW))O(T\log(nW)) such that each vertex vVv\in V sends deg𝒫L+1(v)\deg_{\partial\mathcal{P}_{L+1}}(v) flow and receives at most 12deg𝒫L(v)\frac{1}{2}\deg_{\partial\mathcal{P}_{L}}(v) flow.

Proof.

Let D=O(Tlog(nW))D=O(T\log(nW)) be the maximum recursion depth. Summing the flows from Claim 5.15 over all recursion depths dd, and using that E1Ed=𝒫L+1E_{1}\cup\cdots\cup E_{d}=\partial\mathcal{P}_{L+1}, we obtain a flow with congestion O(Tlog(nW))O(T\log(nW)) such that each vertex vVv\in V sends deg𝒫L+1(v)\deg_{\partial\mathcal{P}_{L+1}}(v) flow and receives at most 48Dϕdeg𝒫L𝒫L+1(v)48D\phi\deg_{\partial\mathcal{P}_{L}\cup\partial\mathcal{P}_{L+1}}(v) flow. Recall that we set ϕ1Clog3(nW)\phi\leftarrow\frac{1}{C\log^{3}(nW)} for large enough constant C>0C>0. We choose CC large enough that 48Dϕ1/448D\phi\leq 1/4, so that each vertex vVv\in V receives at most 13deg𝒫L𝒫L+1(v)13deg𝒫L(v)+13deg𝒫L+1(v)\frac{1}{3}\deg_{\partial\mathcal{P}_{L}\cup\partial\mathcal{P}_{L+1}}(v)\leq\frac{1}{3}\deg_{\partial\mathcal{P}_{L}}(v)+\frac{1}{3}\deg_{\partial\mathcal{P}_{L+1}}(v) flow. We can cancel out at most 13deg𝒫L+1(v)\frac{1}{3}\deg_{\partial\mathcal{P}_{L+1}}(v) flow received at each vertex vVv\in V from the deg𝒫L+1(v)\deg_{\partial\mathcal{P}_{L+1}}(v) flow sent. After cancellation, we obtain a flow with congestion O(Tlog(nW))O(T\log(nW)) such that each vertex vVv\in V sends at least 23deg𝒫L+1(v)\frac{2}{3}\deg_{\partial\mathcal{P}_{L+1}}(v) flow and receives at most 13deg𝒫L(v)\frac{1}{3}\deg_{\partial\mathcal{P}_{L}}(v) flow. Scaling the flow by factor 3/23/2, taking a path decomposition, and removing enough paths until each vertex is the start of exactly deg𝒫L+1(v)\deg_{\partial\mathcal{P}_{L+1}}(v) paths, we obtain the desired flow with congestion O(Tlog(nW))O(T\log(nW)). ∎

Claim 5.17.

For each recursive instance AA, the assumption 1 of Theorems 5.11 and 5.12 hold, i.e., there is a flow on GG with congestion O(log3(nW))O(\log^{3}(nW)) such that each vertex vAv\in A is the source of cG({v},VA)c_{G}(\{v\},V\setminus A) flow and each vertex vVv\in V is the sink of at most deg𝒫L(v)\deg_{\partial\mathcal{P}_{L}}(v) flow.

Proof.

By Claim 5.16, there is a flow in GG with congestion O(log3(nW))O(\log^{3}(nW)) such that each vertex vVv\in V sends deg𝒫L+1(v)\deg_{\partial\mathcal{P}_{L+1}}(v) flow and receives at most 12deg𝒫L(v)\frac{1}{2}\deg_{\partial\mathcal{P}_{L}}(v) flow. Observe that for any recursive instance AA, we have A𝒫L+1\partial A\subseteq\partial\mathcal{P}_{L+1} since the recursive algorithm starting at instance AA adds a partition of AA into 𝒫L+1\mathcal{P}_{L+1}. In particular, each vertex vAv\in A sends at least degA(v)=cG({v},VA)\deg_{\partial A}(v)=c_{G}(\{v\},V\setminus A) flow. Take a path decomposition of the flow and remove enough paths until each vertex is the start of exactly cG({v},VA)c_{G}(\{v\},V\setminus A) paths. The resulting flow satisfies assumption 1, concluding the proof. ∎

It remains to bound the running time of the algorithm for Theorem 5.1. For each instance AA, Theorems 5.11 and 5.12 run in O~(|A|+m)\tilde{O}(|A|+m^{\prime}) time plus O(log2(nW))O(\log^{2}(nW)) calls to Theorem 5.10, which takes O~(|A|+m)\tilde{O}(|A|+m^{\prime}) time per call, for a total time of O~(|A|+m)\tilde{O}(|A|+m^{\prime}). The instances AA on a given recursion depth are disjoint, so the sum of |A|+m|A|+m^{\prime} over all such instances AA is O(m)O(m). The maximum recursion depth is O(Tlog(nW))=O(log3(nW))O(T\log(nW))=O(\log^{3}(nW)), so the sum of |A|+m|A|+m^{\prime} over all instances of the algorithm is O(mlog3(nW))O(m\log^{3}(nW)). It follows that the algorithm of Theorem 5.1 runs in O~(m)\tilde{O}(m) time.

6 Approximate Maximum Flow

From Theorem 5.1 and Claim 5.2, we obtain an algorithm that constructs a congestion-approximator of quality O(log10(nW))O(\log^{10}(nW)) in O~(m)\tilde{O}(m) time. Recall that Sherman’s framework [36, 37] translates a congestion-approximator of quality α\alpha to a (1+ϵ)(1+\epsilon)-approximate maximum flow algorithm with running time O~(ϵ1αm)\tilde{O}(\epsilon^{-1}\alpha m). Thus, for any parameter ϵ>0\epsilon>0, we obtain a (1+ϵ)(1+\epsilon)-approximate maximum flow algorithm with running time O~(ϵ1m)\tilde{O}(\epsilon^{-1}m).

References

  • [1] Arpit Agarwal, Sanjeev Khanna, Huan Li, Prathamesh Patil, Chen Wang, Nathan White, and Peilin Zhong. Parallel approximate maximum flows in near-linear work and polylogarithmic depth. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3997–4061. SIAM, 2024.
  • [2] Sanjeev Arora, Elad Hazan, and Satyen Kale. o(logn)o(\log n) approximation to sparsest cut in o(n2)o(n^{2}) time. SIAM Journal on Computing, 39(5):1748–1771, 2010.
  • [3] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of computing, 8(1):121–164, 2012.
  • [4] Sanjeev Arora, James R Lee, and Assaf Naor. Euclidean distortion and the sparsest cut. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 553–562, 2005.
  • [5] Sanjeev Arora, Satish Rao, and Umesh Vazirani. Expander flows, geometric embeddings and graph partitioning. Journal of the ACM (JACM), 56(2):1–37, 2009.
  • [6] Marcin Bienkowski, Miroslaw Korzeniowski, and Harald Räcke. A practical algorithm for constructing oblivious routing schemes. In Proceedings of the Fifteenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’03, page 24–33, New York, NY, USA, 2003. Association for Computing Machinery.
  • [7] Jeff Cheeger. A lower bound for the smallest eigenvalue of the laplacian. Problems in analysis, 625(195-199):110, 1970.
  • [8] Li Chen, Rasmus Kyng, Yang P Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva. Maximum flow and minimum-cost flow in almost-linear time. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 612–623. IEEE, 2022.
  • [9] Paul Christiano, Jonathan A Kelner, Aleksander Madry, Daniel A Spielman, and Shang-Hua Teng. Electrical flows, laplacian systems, and faster approximation of maximum flow in undirected graphs. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 273–282, 2011.
  • [10] Michael B Cohen, Rasmus Kyng, Gary L Miller, Jakub W Pachocki, Richard Peng, Anup B Rao, and Shen Chen Xu. Solving sdd linear systems in nearly m log1/2 n time. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 343–352, 2014.
  • [11] Yefim Dinitz. Dinitz’algorithm: The original version and even’s version. In Theoretical Computer Science: Essays in Memory of Shimon Even, pages 218–240. Springer, 2006.
  • [12] Lester Randolph Ford and Delbert R Fulkerson. Maximal flow through a network. Canadian journal of Mathematics, 8:399–404, 1956.
  • [13] Gramoz Goranci, Harald Räcke, Thatchaphol Saranurak, and Zihan Tan. The expander hierarchy and its applications to dynamic graph algorithms. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2212–2228. SIAM, 2021.
  • [14] Gramoz Goranci, Harald Räcke, Thatchaphol Saranurak, and Zihan Tan. The expander hierarchy and its applications to dynamic graph algorithms. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2212–2228. SIAM, 2021.
  • [15] Chris Harrelson, Kirsten Hildrum, and Satish Rao. A polynomial-time tree decomposition to minimize congestion. In Proceedings of the Fifteenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA ’03, page 34–43, New York, NY, USA, 2003. Association for Computing Machinery.
  • [16] Arun Jambulapati and Aaron Sidford. Ultrasparse ultrasparsifiers and faster laplacian system solvers. ACM Transactions on Algorithms, 2021.
  • [17] Jonathan A Kelner, Yin Tat Lee, Lorenzo Orecchia, and Aaron Sidford. An almost-linear-time algorithm for approximate max flow in undirected graphs, and its multicommodity generalizations. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 217–226. SIAM, 2014.
  • [18] Rohit Khandekar, Satish Rao, and Umesh Vazirani. Graph partitioning using single commodity flows. Journal of the ACM (JACM), 56(4):1–15, 2009.
  • [19] Rohit Khandekar, Satish Rao, and Umesh Vazirani. Graph partitioning using single commodity flows. Journal of the ACM (JACM), 56(4):1–15, 2009.
  • [20] Philip Klein, Satish Rao, Ajit Agrawal, and R Ravi. An approximate max-flow min-cut relation for undirected multicommodity flow, with applications. Combinatorica, 15(2):187–202, 1995.
  • [21] Ioannis Koutis, Gary L Miller, and Richard Peng. A nearly-m log n time solver for sdd linear systems. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, pages 590–598. IEEE, 2011.
  • [22] Rasmus Kyng and Sushant Sachdeva. Approximate gaussian elimination for laplacians-fast, sparse, and simple. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 573–582. IEEE, 2016.
  • [23] Yin Tat Lee, Satish Rao, and Nikhil Srivastava. A new approach to computing maximum flows using electrical flows. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 755–764, 2013.
  • [24] Tom Leighton and Satish Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. J. ACM, 46(6):787–832, nov 1999.
  • [25] Jason Li, Danupon Nanongkai, Debmalya Panigrahi, and Thatchaphol Saranurak. Near-linear time approximations for cut problems via fair cuts. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 240–275. SIAM, 2023.
  • [26] Nathan Linial, Eran London, and Yuri Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica, 15:215–245, 1995.
  • [27] Aleksander Madry. Fast approximation algorithms for cut-based problems in undirected graphs. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 245–254, 2010.
  • [28] Lorenzo Orecchia, Leonard J Schulman, Umesh V Vazirani, and Nisheeth K Vishnoi. On partitioning graphs via single commodity flows. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 461–470, 2008.
  • [29] Richard Peng. A note on cut-approximators and approximating undirected max flows. CoRR, abs/1411.7631, 2014.
  • [30] Richard Peng. Approximate undirected maximum flows in o (m polylog (n)) time. In Proceedings of the twenty-seventh annual ACM-SIAM symposium on Discrete algorithms, pages 1862–1867. SIAM, 2016.
  • [31] Harald Räcke. Minimizing congestion in general networks. In Proceedings of the 43rd Symposium on Foundations of Computer Science, FOCS ’02, page 43–52, USA, 2002. IEEE Computer Society.
  • [32] Harald Räcke. Optimal hierarchical decompositions for congestion minimization in networks. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, STOC ’08, page 255–264, New York, NY, USA, 2008. Association for Computing Machinery.
  • [33] Harald Räcke, Chintan Shah, and Hanjo Täubig. Computing cut-based hierarchical decompositions in almost linear time. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 227–238. SIAM, 2014.
  • [34] Thatchaphol Saranurak and Di Wang. Expander decomposition and pruning: Faster, stronger, and simpler. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2616–2635. SIAM, 2019.
  • [35] Jonah Sherman. Breaking the multicommodity flow barrier for o(logn)o(\sqrt{\log n})-approximations to sparsest cut. In 2009 50th Annual IEEE Symposium on Foundations of Computer Science, pages 363–372, 2009.
  • [36] Jonah Sherman. Nearly maximum flows in nearly linear time. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 263–269. IEEE, 2013.
  • [37] Jonah Sherman. Area-convexity, \ell_{\infty} regularization, and undirected multicommodity flow. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 452–460, 2017.
  • [38] Daniel A Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, pages 81–90, 2004.

Appendix A Cut-Matching Game

In this section, we prove Theorem 5.11, restated below. See 5.11

Our setup resembles Appendix B of [34] with a few minor changes. In particular, since our flow routine is more restrictive, we have to adapt the algorithm to handle our flow outputs.

We begin with notation from [34]. For simplicity, we avoid working with the subdivision graph in [33, 34]. Define a AA-commodity flow as a multi-commodity flow where each vertex vAv\in A is the source of quantity 𝐝(v)\mathbf{d}(v) of its distinct flow commodity. Only for analysis, we consider a A×AA\times A flow-matrix 𝐅0A×A\mathbf{F}\in\mathbb{R}^{A\times A}_{\geq 0} which encodes information about a AA-commodity flow. We say that 𝐅\mathbf{F} is routable with congestion cc if there exists a AA-commodity flow ff such that, simultaneously for all u,vAu,v\in A, we have that uu can send quantity 𝐅(u,v)\mathbf{F}(u,v) of its own commodity to vv, and the amount of flow through each edge is at most cc.

The algorithm initializes flow-matrix 𝐅00A×A\mathbf{F}_{0}\in\mathbb{R}^{A\times A}_{\geq 0} as the diagonal matrix with value 𝐝(v)\mathbf{d}(v) on entry 𝐅(v,v)\mathbf{F}(v,v). Trivially, 𝐅\mathbf{F} is routable with zero congestion. The algorithm initializes A0=AA_{0}=A and R0=R_{0}=\emptyset, and then proceeds for at most T=O(log2(nW))T=O(\log^{2}(nW)) rounds. For each round t[T]t\in[T], the algorithm implicitly updates 𝐅t1\mathbf{F}_{t-1} to 𝐅t\mathbf{F}_{t} such that it is routable with congestion at most t/ϕt/\phi. The operation for implicitly updating 𝐅t1\mathbf{F}_{t-1} will be described explicitly later on, but we ensure that row sums do not change from 𝐅t1\mathbf{F}_{t-1} to 𝐅t\mathbf{F}_{t}, i.e., there is always 𝐝(v)\mathbf{d}(v) total quantity of each commodity vAv\in A spread among the vertices. For each round t[T]t\in[T], the algorithm (explicitly) finds a partition of At1A_{t-1} into At,AtrA_{t}^{\ell},A_{t}^{r}, and then computes

  1. (i)

    A (possibly empty) set StAS_{t}\subseteq A satisfying δG[A]Stϕ𝐝(SAt1)+ϕ6T2𝐝(A)\delta_{G[A]}S_{t}\leq\phi\mathbf{d}(S\cap A_{t-1})+\frac{\phi}{6T^{2}}\mathbf{d}(A) and 𝐝(St)𝐝(A)/3\mathbf{d}(S_{t})\leq\mathbf{d}(A)/3, and

  2. (ii)

    A (possibly empty) flow ftf_{t} from AtStA_{t}^{\ell}\setminus S_{t} to AtrStA_{t}^{r}\setminus S_{t} such that each vertex vAtStv\in A_{t}^{\ell}\setminus S_{t} is the source of at least 𝐝(v)/12\mathbf{d}(v)/12 flow, and each vertex vAtrStv\in A_{t}^{r}\setminus S_{t} is the sink of at most 𝐝(v)\mathbf{d}(v) flow. The flow ff has congestion 1/ϕ1/\phi.

The algorithm then updates AtAt1StA_{t}\leftarrow A_{t-1}\setminus S_{t} and RtRt1StR_{t}\leftarrow R_{t-1}\cup S_{t}. Note that on each round tt, the sets AtA_{t} and RtR_{t} partition AA. If 𝐝(Rt)𝐝(A)/(6T)\mathbf{d}(R_{t})\geq\mathbf{d}(A)/(6T) holds, then the algorithm immediately terminates and outputs R=RtR=R_{t}. Otherwise, we have 𝐝(RT)<𝐝(A)/(6T)\mathbf{d}(R_{T})<\mathbf{d}(A)/(6T) at the end, and the algorithm outputs R=RTR=R_{T}.

Lemma A.1.

For any round tt, we have δG[A]Rtϕ𝐝(Rt)+ϕ6T𝐝(A)\delta_{G[A]}R_{t}\leq\phi\mathbf{d}(R_{t})+\frac{\phi}{6T}\mathbf{d}(A) and 𝐝(Rt)𝐝(A)/2\mathbf{d}(R_{t})\leq\mathbf{d}(A)/2.

Proof.

We start by proving the first statement. Each time we remove a set StS_{t}, we are guaranteed that δG[A]Stϕ𝐝(StAt1)+ϕ6T2𝐝(A)\delta_{G[A]}S_{t}\leq\phi\mathbf{d}(S_{t}\cap A_{t-1})+\frac{\phi}{6T^{2}}\mathbf{d}(A). We charge the ϕ𝐝(StAt1)\phi\mathbf{d}(S_{t}\cap A_{t-1}) part to the vertices in StAt1S_{t}\cap A_{t-1} so that each vertex vStAt1v\in S_{t}\cap A_{t-1} is charged exactly ϕ𝐝(v)\phi\mathbf{d}(v). Since the algorithm updates AtAt1StA_{t}\leftarrow A_{t-1}\setminus S_{t} and RtRt1StR_{t}\leftarrow R_{t-1}\cup S_{t}, each newly charged vertex leaves AtA_{t} and joins RtR_{t}. In total, we charge t=1Tϕ𝐝(StAt1)\sum_{t=1}^{T}\phi\mathbf{d}(S_{t}\cap A_{t-1}) to the vertices in RtR_{t} so that each vertex vRtv\in R_{t} is charged once at exactly ϕ𝐝(v)\phi\mathbf{d}(v). It follows that t=1Tϕ𝐝(StAt1)ϕ𝐝(Rt)\sum_{t=1}^{T}\phi\mathbf{d}(S_{t}\cap A_{t-1})\leq\phi\mathbf{d}(R_{t}) and

δG[A]Rtt=1TδG[A]Stt=1T(ϕ𝐝(StAt1)+ϕ6T2𝐝(A))ϕ𝐝(Rt)+Tϕ6T2𝐝(A),\delta_{G[A]}R_{t}\leq\sum_{t=1}^{T}\delta_{G[A]}S_{t}\leq\sum_{t=1}^{T}\left(\phi\mathbf{d}(S_{t}\cap A_{t-1})+\frac{\phi}{6T^{2}}\mathbf{d}(A)\right)\leq\phi\mathbf{d}(R_{t})+T\cdot\frac{\phi}{6T^{2}}\mathbf{d}(A),

concluding the first statement of the lemma.

For the second statement, consider the round tt with 𝐝(Rt)𝐝(A)/(6T)\mathbf{d}(R_{t})\geq\mathbf{d}(A)/(6T), if it exists, at which point the algorithm terminates. (If there is no such tt, then we are done.) We have 𝐝(Rt1)<𝐝(A)/(6T)\mathbf{d}(R_{t-1})<\mathbf{d}(A)/(6T) and 𝐝(Rt1St)=𝐝(Rt)𝐝(A)/(6T)\mathbf{d}(R_{t-1}\cup S_{t})=\mathbf{d}(R_{t})\geq\mathbf{d}(A)/(6T), and the final set StS_{t} satisfies 𝐝(St)𝐝(At1)/3𝐝(A)/3\mathbf{d}(S_{t})\leq\mathbf{d}(A_{t-1})/3\leq\mathbf{d}(A)/3. It follows that the new set RtR_{t} satisfies

𝐝(Rt)=𝐝(Rt1St1)𝐝(Rt1)+𝐝(St)𝐝(A)6T+𝐝(A)3𝐝(A)2,\displaystyle\mathbf{d}(R_{t})=\mathbf{d}(R_{t-1}\cup S_{t-1})\leq\mathbf{d}(R_{t-1})+\mathbf{d}(S_{t})\leq\frac{\mathbf{d}(A)}{6T}+\frac{\mathbf{d}(A)}{3}\leq\frac{\mathbf{d}(A)}{2},

concluding the second statement and the proof. ∎

Purely for the analysis, we define a potential function ψ(t)\psi(t) for each round t[T]t\in[T] as follows. For each vertex vAv\in A, let 𝐅t(u)0A\mathbf{F}_{t}(u)\in\mathbb{R}^{A}_{\geq 0} be row uu of matrix 𝐅t\mathbf{F}_{t}; we call 𝐅t(u)\mathbf{F}_{t}(u) a flow-vector of uu. We define the potential function

ψ(t)=uAt𝐝(u)𝐅t(u)𝐝(u)𝝁t22\psi(t)=\sum_{u\in A_{t}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}

where

𝝁t=uAt𝐅t(u)𝐝(At)=argmin𝝁AuAt𝐝(u)𝐅t(u)𝐝(u)𝝁22\displaystyle\boldsymbol{\mu}_{t}=\frac{\sum_{u\in A_{t}}\mathbf{F}_{t}(u)}{\mathbf{d}(A_{t})}=\arg\min_{\boldsymbol{\mu}\in\mathbb{R}^{A}}\sum_{u\in A_{t}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}\right\rVert_{2}^{2} (2)

is the weighted average of the flow-vectors in AtA_{t}, which is also the minimizer of ψ(t)\psi(t) when treated as a function of ψ(t)\psi(t). The latter fact can be verified separately for each coordinate vAv\in A; namely,

𝝁t(v)=uAt𝐅t(u,v)𝐝(At)=argmin𝝁AuAt𝐝(u)(𝐅t(u,v)𝐝(u)𝝁(v))2\boldsymbol{\mu}_{t}(v)=\frac{\sum_{u\in A_{t}}\mathbf{F}_{t}(u,v)}{\mathbf{d}(A_{t})}=\arg\displaystyle\min_{\boldsymbol{\mu}\in\mathbb{R}^{A}}\sum_{u\in A_{t}}\mathbf{d}(u)\bigg{(}\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)}-\boldsymbol{\mu}(v)\bigg{)}^{2}

follows from setting the derivative to zero.

A.1 Small Potential Implies Mixing

We first show that if Φ(t)1/poly(nW)\Phi(t)\leq 1/\textup{poly}(nW) for a sufficiently large polynomial, then the vertex weighting 𝐝|At\mathbf{d}|_{A_{t}} mixes in GG.

Lemma A.2.

For any tTt\in T, if 𝐝(Rt)𝐝(A)/(6T)\mathbf{d}(R_{t})\leq\mathbf{d}(A)/(6T) and ψ(t)1/(nW)C\psi(t)\leq 1/(nW)^{C} for large enough constant C>0C>0, then the vertex weighting 𝐝|At\mathbf{d}|_{A_{t}} mixes in GG with congestion 5T/ϕ5T/\phi.

For the rest of Section A.1, we prove Lemma A.2. Suppose that 𝐝(Rt)𝐝(A)/(6T)\mathbf{d}(R_{t})\leq\mathbf{d}(A)/(6T) and ψ(t)1/(nW)C\psi(t)\leq 1/(nW)^{C} for large enough constant C>0C>0. We first prove two claims about the flow-matrix 𝐅t\mathbf{F}_{t}. Let 𝐅t(At,At)\mathbf{F}_{t}(A_{t},A_{t}) denote the sum u,vAt𝐅t(u,v)\sum_{u,v\in A_{t}}\mathbf{F}_{t}(u,v).

Claim A.3.

𝐅t(At,At)𝐝(A)/3\mathbf{F}_{t}(A_{t},A_{t})\geq\mathbf{d}(A)/3.

Proof.

Since the AA-commodity flow routing 𝐅t\mathbf{F}_{t} has congestion t/ϕt/\phi, and since G[A]Rt\partial_{G[A]}R_{t} is a cut separating AtARtA_{t}\subseteq A\setminus R_{t} from RtR_{t}, we have uAt,vRt𝐅t(u,v)t/ϕδG[A]Rt\sum_{u\in A_{t},v\in R_{t}}\mathbf{F}_{t}(u,v)\leq t/\phi\cdot\delta_{G[A]}R_{t}. Since δG[A]Rtϕ𝐝(Rt)+ϕ6𝐝(A)\delta_{G[A]}R_{t}\leq\phi\mathbf{d}(R_{t})+\frac{\phi}{6}\mathbf{d}(A) by Lemma A.1, we have

uAt,vRt𝐅t(u,v)tϕδG[A]Rttϕ(ϕ𝐝(Rt)+ϕ6T𝐝(A))T𝐝(Rt)+16𝐝(A)13𝐝(A),\sum_{u\in A_{t},v\in R_{t}}\mathbf{F}_{t}(u,v)\leq\frac{t}{\phi}\cdot\delta_{G[A]}R_{t}\leq\frac{t}{\phi}\cdot\left(\phi\mathbf{d}(R_{t})+\frac{\phi}{6T}\mathbf{d}(A)\right)\leq T\cdot\mathbf{d}(R_{t})+\frac{1}{6}\mathbf{d}(A)\leq\frac{1}{3}\mathbf{d}(A),

where the last inequality holds by the assumption 𝐝(Rt)𝐝(A)/(6T)\mathbf{d}(R_{t})\leq\mathbf{d}(A)/(6T). Since the sum of row vAv\in A in 𝐅\mathbf{F} is always 𝐝(v)\mathbf{d}(v), we have

uAt,vA𝐅t(u,v)=𝐝(At)=𝐝(A)𝐝(Rt)𝐝(A)𝐝(A)6T23𝐝(A).\sum_{u\in A_{t},v\in A}\mathbf{F}_{t}(u,v)=\mathbf{d}(A_{t})=\mathbf{d}(A)-\mathbf{d}(R_{t})\geq\mathbf{d}(A)-\frac{\mathbf{d}(A)}{6T}\geq\frac{2}{3}\mathbf{d}(A).

Subtracting the two inequalities above, we conclude that

𝐅t(At,At)=uAt,vA𝐅t(u,v)uAt,vRt𝐅t(u,v)23𝐝(A)13𝐝(A)=13𝐝(A).\mathbf{F}_{t}(A_{t},A_{t})=\sum_{u\in A_{t},v\in A}\mathbf{F}_{t}(u,v)-\sum_{u\in A_{t},v\in R_{t}}\mathbf{F}_{t}(u,v)\geq\frac{2}{3}\mathbf{d}(A)-\frac{1}{3}\mathbf{d}(A)=\frac{1}{3}\mathbf{d}(A).\qed
Claim A.4.

For all uAu\in A, we have

|vAt𝐅t(u,v)𝐝(u)𝐅t(At,At)𝐝(At)|1(nW)C/23.\bigg{|}\sum_{v\in A_{t}}\mathbf{F}_{t}(u,v)-\mathbf{d}(u)\cdot\frac{\mathbf{F}_{t}(A_{t},A_{t})}{\mathbf{d}(A_{t})}\bigg{|}\leq\frac{1}{(nW)^{C/2-3}}.

In particular, vAt𝐅t(u,v)𝐝(u)/4\sum_{v\in A_{t}}\mathbf{F}_{t}(u,v)\geq\mathbf{d}(u)/4.

Proof.

Since ψ(t)1/(nW)C\psi(t)\leq 1/(nW)^{C} by assumption, we have

|𝐅t(u,v)𝐝(u)𝝁t(v)|2𝐝(u)|𝐅t(u,v)𝐝(u)𝝁t(v)|2uAt𝐝(u)𝐅t(u)𝐝(u)𝝁t22=ψ(t)1(nW)C,\bigg{|}\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}(v)\bigg{|}^{2}\leq\mathbf{d}(u)\,\bigg{|}\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}(v)\bigg{|}^{2}\leq\sum_{u\in A_{t}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}=\psi(t)\leq\frac{1}{(nW)^{C}},

so

|𝐅t(u,v)𝐝(u)𝝁t(v)|=𝐝(u)|𝐅t(u,v)𝐝(u)𝝁t(v)|nW1(nW)C/2=1(nW)C/21.\big{|}\mathbf{F}_{t}(u,v)-\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)\big{|}=\mathbf{d}(u)\,\bigg{|}\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}(v)\bigg{|}\leq nW\cdot\frac{1}{(nW)^{C/2}}=\frac{1}{(nW)^{C/2-1}}.

In particular,

|vAt(𝐅t(u,v)𝐝(u)𝝁t(v))|\displaystyle\bigg{|}\sum_{v\in A_{t}}\big{(}\mathbf{F}_{t}(u,v)-\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)\big{)}\bigg{|} vAt|𝐅t(u,v)𝐝(u)𝝁t(v)|\displaystyle\leq\sum_{v\in A_{t}}\big{|}\mathbf{F}_{t}(u,v)-\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)\big{|}
vAt𝐝(u)(nW)C/21=|At|𝐝(u)(nW)C/211(nW)C/23.\displaystyle\leq\sum_{v\in A_{t}}\frac{\mathbf{d}(u)}{(nW)^{C/2-1}}=|A_{t}|\cdot\frac{\mathbf{d}(u)}{(nW)^{C/2-1}}\leq\frac{1}{(nW)^{C/2-3}}.

By the definition of 𝝁t\boldsymbol{\mu}_{t},

vAt𝝁t(v)=u,vAt𝐅t(u,v)𝐝(At)=𝐅t(At,At)𝐝(At),\sum_{v\in A_{t}}\boldsymbol{\mu}_{t}(v)=\sum_{u,v\in A_{t}}\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(A_{t})}=\frac{\mathbf{F}_{t}(A_{t},A_{t})}{\mathbf{d}(A_{t})},

concluding the first statement of the claim. By Claim A.3, the expression above is at least 𝐝(A)/3𝐝(At)1/3\frac{\mathbf{d}(A)/3}{\mathbf{d}(A_{t})}\geq 1/3, so

vAt𝐅t(u,v)vAt𝐝(u)𝝁t(v)1(nW)C/2313𝐝(u)1(nW)C/2314𝐝(u)\sum_{v\in A_{t}}\mathbf{F}_{t}(u,v)\geq\sum_{v\in A_{t}}\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)-\frac{1}{(nW)^{C/2-3}}\geq\frac{1}{3}\mathbf{d}(u)-\frac{1}{(nW)^{C/2-3}}\geq\frac{1}{4}\mathbf{d}(u)

for C>0C>0 large enough, concluding the second statement. ∎

With Claims A.3 and A.4 established, we now prove Lemma A.2. Recall that 𝐅t\mathbf{F}_{t} is routable with congestion T/ϕT/\phi. Decompose this AA-commodity flow into single-commodity flows fu,v:u,vAf_{u,v}:u,v\in A that send quantity 𝐅t(u,v)\mathbf{F}_{t}(u,v) of commodity uu from uu to vv.

Consider any demand 𝐛A\mathbf{b}\in\mathbb{R}^{A} satisfying |𝐛|𝐝|At|\mathbf{b}|\leq\mathbf{d}|_{A_{t}}. In particular, uAt𝐛(u)=0\sum_{u\in A_{t}}\mathbf{b}(u)=0. We want to construct a single-commodity flow routing demand 𝐛\mathbf{b} with congestion 5T/ϕ5T/\phi. For each u,vAtu,v\in A_{t}, we first route the flow

fu,v=fu,vvAt𝐅t(u,v)𝐛(u).f^{\prime}_{u,v}=\frac{f_{u,v}}{\sum_{v^{\prime}\in A_{t}}\mathbf{F}_{t}(u,v^{\prime})}\cdot\mathbf{b}(u).

Summing over all vAtv\in A_{t}, we observe that each vertex uAtu\in A_{t} sends a total of 𝐛(u)\mathbf{b}(u) demand. Also, by Claim A.4, each flow fu,vf^{\prime}_{u,v} has (absolute) value

𝐅t(u,v)vAt𝐅t(u,v)|𝐛(u)|𝐅t(u,v)𝐝(u)/4𝐝(u)4𝐅t(u,v).\frac{\mathbf{F}_{t}(u,v)}{\sum_{v^{\prime}\in A_{t}}\mathbf{F}_{t}(u,v^{\prime})}\cdot|\mathbf{b}(u)|\leq\frac{\mathbf{F}_{t}(u,v)}{\mathbf{d}(u)/4}\cdot\mathbf{d}(u)\leq 4\mathbf{F}_{t}(u,v).

In particular, these flows can be routed simultaneously with congestion 44 times the AA-commodity flow, which is congestion 4T/ϕ4T/\phi.

After routing this flow, each vertex vAtv\in A_{t} receives demand

uAt𝐅t(u,v)vAt𝐅t(u,v)𝐛(u),\sum_{u\in A_{t}}\frac{\mathbf{F}_{t}(u,v)}{\sum_{v^{\prime}\in A_{t}}\mathbf{F}_{t}(u,v^{\prime})}\cdot\mathbf{b}(u),

Since ψ(t)1/(nW)C\psi(t)\leq 1/(nW)^{C}, we can approximate 𝐅t(u,v)𝐝(u)𝝁t(v)\mathbf{F}_{t}(u,v)\approx\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v) for all uAu\in A. Together with Claim A.4, we can approximate the numerator and denominator of the fraction above to

uAt𝐝(u)𝝁t(v)𝐝(u)𝐅t(At,At)/𝐝(At)𝐛(u)=𝝁t(v)𝐅t(At,At)/𝐝(At)uAt𝐛(u),\sum_{u\in A_{t}}\frac{\mathbf{d}(u)\cdot\boldsymbol{\mu}_{t}(v)}{\mathbf{d}(u)\cdot\mathbf{F}_{t}(A_{t},A_{t})/\mathbf{d}(A_{t})}\cdot\mathbf{b}(u)=\frac{\boldsymbol{\mu}_{t}(v)}{\mathbf{F}_{t}(A_{t},A_{t})/\mathbf{d}(A_{t})}\sum_{u\in A_{t}}\mathbf{b}(u),

which equals 0 since uAt𝐛(u)=0\sum_{u\in A_{t}}\mathbf{b}(u)=0. When C>0C>0 is large enough, the approximated value of 0 is within an additive 1/n21/n^{2} of the true value. In particular, each vertex vAtv\in A_{t} receives a demand whose absolute value is at most 1/n21/n^{2}. Since the minimum edge capacity is 11, the remaining demand can trivially be routed with congestion 11. The final congestion is 4T/ϕ+15T/ϕ4T/\phi+1\leq 5T/\phi, concluding the proof of Lemma A.2.

A.2 Computing the Set and Flow

In this subsection, we describe a round of the algorithm in detail, following Lemma B.3 of [34] with some minor changes. Recall that on each iteration t[T]t\in[T], the algorithm first computes a partition of At1A_{t-1} into At,AtrA_{t}^{\ell},A_{t}^{r}, and then computes a set StS_{t} and flow ftf_{t} satisfying certain properties. The algorithm then updates AtAt1StA_{t}\leftarrow A_{t-1}\setminus S_{t} and RtRt1StR_{t}\leftarrow R_{t-1}\cup S_{t}. The choice of partition At,AtrA_{t}^{\ell},A_{t}^{r} will depend on an implicitly represented flow-matrix 𝐅t\mathbf{F}_{t} that is useful for the analysis.

A.2.1 Constructing the partition

We start with the construction of AtA_{t}^{\ell} and AtrA_{t}^{r}. We first list some variables key to the algorithm and analysis.

  1. 1.

    Let 𝐫A\mathbf{r}\in\mathbb{R}^{A} be a random unit vector orthogonal to the all-ones vector.

  2. 2.

    For each vAv\in A, let 𝐩(v)=𝐅t1(v)/𝐝(v),𝐫\mathbf{p}(v)=\langle\mathbf{F}_{t-1}(v)/\mathbf{d}(v),\,\mathbf{r}\rangle be the projection of normalized flow-vector 𝐅t1(v)/𝐝(v)\mathbf{F}_{t-1}(v)/\mathbf{d}(v) onto the vector rr. We later show in Claim A.12 that the values 𝐩(v)\mathbf{p}(v) can be computed in total time O(mT)O(mT) without explicitly maintaining the flow matrix FF.

  3. 3.

    Let μ¯t1=𝝁t1,𝐫\bar{\mu}_{t-1}=\langle\boldsymbol{\mu}_{t-1},\mathbf{r}\rangle be the projection of the weighted average 𝝁t1=uAt1𝐅t1(u)/𝐝(At1)\boldsymbol{\mu}_{t-1}=\sum_{u\in A_{t-1}}\mathbf{F}_{t-1}(u)/\mathbf{d}(A_{t-1}) onto the vector 𝐫\mathbf{r}. It is only used in the analysis.

  4. 4.

    Let AtA_{t}^{\ell} and AtrA_{t}^{r} be constructed by Lemma A.5 below.

Lemma A.5.

Given the values 𝐩(v)\mathbf{p}(v) for all vAt1v\in A_{t-1}, we can find in time O(|At1|log|At1|)O(|A_{t-1}|\log|A_{t-1}|) a partition of At1A_{t-1} into two sets At,AtrA_{t}^{\ell},A_{t}^{r} and a separation value η\eta\in\mathbb{R} such that

  1. (a)

    η\eta separates the projections of At,AtrA_{t}^{\ell},A_{t}^{r}, i.e., either maxuAt𝐩(u)ηminvAtr𝐩(v)\displaystyle\max_{u\in A_{t}^{\ell}}\mathbf{p}(u)\leq\eta\leq\min_{v\in A_{t}^{r}}\mathbf{p}(v) or minuAt𝐩(u)ηmaxvAtr𝐩(v)\displaystyle\min_{u\in A_{t}^{\ell}}\mathbf{p}(u)\geq\eta\geq\max_{v\in A_{t}^{r}}\mathbf{p}(v),

  2. (b)

    𝐝(At)𝐝(At1)/2\mathbf{d}(A_{t}^{\ell})\leq\mathbf{d}(A_{t-1})/2 and 𝐝(Atr)𝐝(At1)/2\mathbf{d}(A_{t}^{r})\geq\mathbf{d}(A_{t-1})/2, and

  3. (c)

    vAt𝐝(v)(𝐩(v)η)212vAt1𝐝(v)(𝐩(v)μ¯t1)2\sum_{v\in A_{t}^{\ell}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}\geq\frac{1}{2}\sum_{v\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\bar{\mu}_{t-1})^{2}.

Proof.

Sort the vertices vAt1v\in A_{t-1} by their value of 𝐩(v)\mathbf{p}(v) in ascending order, and let the sorted list be v1,v2,,v|At1|v_{1},v_{2},\ldots,v_{|A_{t-1}|}. Let i1i\geq 1 be the smallest integer such that 𝐝({v1,v2,,vi})𝐝(At1)/2\mathbf{d}(\{v_{1},v_{2},\ldots,v_{i}\})\geq\mathbf{d}(A_{t-1})/2. Define the vertex sets S1={v1,v2,,vi}S_{1}=\{v_{1},v_{2},\ldots,v_{i}\} and S2={vi,vi+1,,v|At1|}S_{2}=\{v_{i},v_{i+1},\ldots,v_{|A_{t-1}|}\}. By our construction of ii, we have 𝐝(Sj{vi})<𝐝(At1)/2\mathbf{d}(S_{j}\setminus\{v_{i}\})<\mathbf{d}(A_{t-1})/2 for both j{1,2}j\in\{1,2\}. Since S1S2=At1S_{1}\cup S_{2}=A_{t-1}, we have

vS1𝐝(v)(𝐩(v)η)2+vS2𝐝(v)(𝐩(u)η)2uAt1𝐝(v)(𝐩(u)η)2,\sum_{v\in S_{1}}\mathbf{d}(v)\cdot(\mathbf{p}(v)-\eta)^{2}+\sum_{v\in S_{2}}\mathbf{d}(v)\cdot(\mathbf{p}(u)-\eta)^{2}\geq\sum_{u\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(u)-\eta)^{2},

so pick j{1,2}j\in\{1,2\} such that

vSj𝐝(v)(𝐩(u)η)212uAt1𝐝(v)(𝐩(u)η)2.\sum_{v\in S_{j}}\mathbf{d}(v)\cdot(\mathbf{p}(u)-\eta)^{2}\geq\frac{1}{2}\sum_{u\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(u)-\eta)^{2}.

Define η=pvi\eta=p_{v_{i}} so that

vSj{i}𝐝(v)(𝐩(u)η)2=vSj𝐝(v)(𝐩(u)η)212uAt1𝐝(v)(𝐩(u)η)2.\sum_{v\in S_{j}\setminus\{i\}}\mathbf{d}(v)\cdot(\mathbf{p}(u)-\eta)^{2}=\sum_{v\in S_{j}}\mathbf{d}(v)\cdot(\mathbf{p}(u)-\eta)^{2}\geq\frac{1}{2}\sum_{u\in A_{t-1}}\mathbf{d}(v)\cdot(\mathbf{p}(u)-\eta)^{2}.

It follows that we can take At=Sj{i}A_{t}^{\ell}=S_{j}\setminus\{i\} and Atr=At1AtA_{t}^{r}=A_{t-1}\setminus A_{t}^{\ell}, which satisfies all three properties of the lemma. ∎

We conclude this section with a lemma about the behavior of projections onto a random unit vector. Since the lemma is standard in high-dimensional geometry, we omit the proof and refer to Lemma 3.4 of [18].

Lemma A.6 (Lemma 3.4 of [18]).

For all vertices vAv\in A, we have

𝔼[(𝐩(v)μ¯t1)2]=1n𝐅t1(v)𝐝(v)𝝁t122,\mathbb{E}[(\mathbf{p}(v)-\bar{\mu}_{t-1})^{2}]=\frac{1}{n}\left\lVert\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2},

and for any pair u,vAu,v\in A and constant c>0c>0, we have

(𝐩(u)𝐩(v))2clognn𝐅t1(u)𝐝(u)𝐅t1(v)𝐝(v)22(\mathbf{p}(u)-\mathbf{p}(v))^{2}\leq\frac{c\log n}{n}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}

with probability at least 1nc/41-n^{-c/4}. In particular, there exists a constant C>0C>0 such that for any pair u,vAu,v\in A,

𝔼[(𝐩(u)𝐩(v))2]Clognn𝐅t1(u)𝐝(u)𝐅t1(v)𝐝(v)22.\mathbb{E}[(\mathbf{p}(u)-\mathbf{p}(v))^{2}]\leq\frac{C\log n}{n}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}.

A.2.2 Max-flow/min-cut call

Given AtA_{t}^{\ell} and AtrA_{t}^{r}, we call Theorem 5.10 with parameters

AA,ϵ118T2,γϵϕ2,βmax{1,(12ϕ+ϵγ)(κ+2)},𝐬ϕ𝐝|At+ϵϕ𝐝,𝐭12ϕ𝐝|Atr,\displaystyle A\leftarrow A,\,\epsilon\leftarrow\frac{1}{18T^{2}},\,\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow\max\{1,(12\phi+\epsilon\gamma)(\kappa+2)\},\,\,\mathbf{s}\leftarrow\phi\mathbf{d}|_{A_{t}^{\ell}}+\epsilon\phi\mathbf{d},\,\mathbf{t}\leftarrow 12\phi\mathbf{d}|_{A_{t}^{r}}, (3)

and we denote the graph G[A,γ,𝐬,𝐭]G[A,\gamma,\mathbf{s},\mathbf{t}] by H=(VH,EH)H=(V_{H},E_{H}). We will later show in Claim A.10 that Assumption 1 of Theorem 5.10 is satisfied with our parameter β\beta.

From Theorem 5.10, we obtain an (1+ϵ)(1+\epsilon)-approximate fair cut/flow pair (S,f)(S,f). The algorithm sets StS{s,x}S_{t}\leftarrow S\setminus\{s,x\}, which satisfies the condition below as required by step i.

Claim A.7.

δG[A]Stϕ𝐝(SAt1)+3ϵϕ𝐝(A)\delta_{G[A]}S_{t}\leq\phi\mathbf{d}(S\cap A_{t-1})+3\epsilon\phi\mathbf{d}(A) and 𝐝(St)𝐝(A)/3\mathbf{d}(S_{t})\leq\mathbf{d}(A)/3,

Proof.

We begin with the first statement. We begin by bounding δHS\delta_{H}S as follows. The edges in HS\partial_{H}S can be split into three groups:

  1. 1.

    The edges in G[A]G[A], which have total capacity δG[A](S{s,x})=δG[A]St\delta_{G[A]}(S\setminus\{s,x\})=\delta_{G[A]}S_{t},

  2. 2.

    The edges incident to ss, which have total capacity cH({s},VHS)=𝐬(ASt)ϕ𝐝(AtSt)c_{H}(\{s\},V_{H}\setminus S)=\mathbf{s}(A\setminus S_{t})\geq\phi\mathbf{d}(A_{t}^{\ell}\setminus S_{t}), and

  3. 3.

    The edges incident to tt or xx, which we ignore.

It follows that

δHSδG[A]St+ϕ𝐝(AtSt).\displaystyle\delta_{H}S\geq\delta_{G[A]}S_{t}+\phi\mathbf{d}(A_{t}^{\ell}\setminus S_{t}).

By Fact 5.8, the cut value δHS\delta_{H}S is at most (1+ϵ)(1+\epsilon) times the minimum (s,t)(s,t)-cut. Since the (s,t)(s,t)-minimum cut is at most δH{s}=𝐬(A)\delta_{H}\{s\}=\mathbf{s}(A), we have

δHS(1+ϵ)𝐬(A)=(1+ϵ)(ϕ𝐝(At)+ϵϕ𝐝(A))(1+ϵ)ϕ𝐝(At)+2ϵϕ𝐝(A).\delta_{H}S\leq(1+\epsilon)\mathbf{s}(A)=(1+\epsilon)(\phi\mathbf{d}(A_{t}^{\ell})+\epsilon\phi\mathbf{d}(A))\leq(1+\epsilon)\phi\mathbf{d}(A_{t}^{\ell})+2\epsilon\phi\mathbf{d}(A).

It follows that

δG[A]St\displaystyle\delta_{G[A]}S_{t} δHSϕ𝐝(AtSt)\displaystyle\leq\delta_{H}S-\phi\mathbf{d}(A_{t}^{\ell}\setminus S_{t})
(1+ϵ)ϕ𝐝(At)+2ϵϕ𝐝(A)ϕ𝐝(AtSt)\displaystyle\leq(1+\epsilon)\phi\mathbf{d}(A_{t}^{\ell})+2\epsilon\phi\mathbf{d}(A)-\phi\mathbf{d}(A_{t}^{\ell}\setminus S_{t})
=ϕ𝐝(StAt)+ϵϕ𝐝(At)+2ϵϕ𝐝(A)\displaystyle=\phi\mathbf{d}(S_{t}\cap A_{t}^{\ell})+\epsilon\phi\mathbf{d}(A_{t}^{\ell})+2\epsilon\phi\mathbf{d}(A)
ϕ𝐝(StAt1)+3ϵϕ𝐝(A),\displaystyle\leq\phi\mathbf{d}(S_{t}\cap A_{t-1})+3\epsilon\phi\mathbf{d}(A),

concluding the first statement. For the second statement, observe that δHS(1+ϵ)𝐬(A)(1+ϵ)2ϕ𝐝(A)\delta_{H}S\leq(1+\epsilon)\mathbf{s}(A)\leq(1+\epsilon)\cdot 2\phi\mathbf{d}(A) and δHScH(S,{t})=12ϕ𝐝(St)\delta_{H}S\geq c_{H}(S,\{t\})=12\phi\mathbf{d}(S_{t}), so

𝐝(St)δHS12ϕ(1+ϵ)2ϕ𝐝(A)12ϕ𝐝(A)3,\mathbf{d}(S_{t})\leq\frac{\delta_{H}S}{12\phi}\leq\frac{(1+\epsilon)\cdot 2\phi\mathbf{d}(A)}{12\phi}\leq\frac{\mathbf{d}(A)}{3},

concluding the second statement and the proof. ∎

The algorithm defines the flow ftf_{t} as follows. First, scale the flow ff by factor 1/(12ϕ)1/(12\phi) and restrict the flow to edges in G[ASt]G[A\setminus S_{t}]. In other words, the flow on edges incident to St{s,t,x}S_{t}\cup\{s,t,x\} are removed. Then, decompose the flow ftf_{t} into paths (using, for example, a dynamic tree [ST83]). For each vertex vAtStv\in A_{t}^{\ell}\setminus S_{t}, remove enough paths starting at vv until it is the start of exactly 𝐝(v)/12\mathbf{d}(v)/12 total capacity of paths, and for each vertex vAtStv\notin A_{t}^{\ell}\setminus S_{t}, remove all paths starting at vv; let the remaining flow be ftf_{t}. The claim below shows that this last step is always possible, and that ftf_{t} satisfies the condition below as required by step ii.

Claim A.8.

ftf_{t} is a flow from AtStA_{t}^{\ell}\setminus S_{t} to AtrStA_{t}^{r}\setminus S_{t} such that each vertex vAtStv\in A_{t}^{\ell}\setminus S_{t} is the source of exactly 𝐝(v)/12\mathbf{d}(v)/12 flow, and each vertex vAtrStv\in A_{t}^{r}\setminus S_{t} is the sink of at most 𝐝(v)\mathbf{d}(v) flow. The flow ff has congestion 1/(12ϕ)1/(12\phi).

Proof.

It suffices to show that the scaled flow f/(12ϕ)f/(12\phi), once restricted to edges in GG, is a flow such that

  1. 1.

    Each vertex vAtStv\in A_{t}^{\ell}\setminus S_{t} is the source of at least 𝐝(v)/12\mathbf{d}(v)/12 flow, and

  2. 2.

    The only sinks are at vertices vAtrStv\in A_{t}^{r}\setminus S_{t}, and each such vertex vv is the sink of at most 𝐝(v)\mathbf{d}(v) flow.

The path-removing step in the algorithm is then possible, and ensures that each vertex vAtStv\in A_{t}^{\ell}\setminus S_{t} is the source of exactly 𝐝(v)/12\mathbf{d}(v)/12 flow in ftf_{t}, each vertex vAtrStv\in A_{t}^{r}\setminus S_{t} is the sink of at most 𝐝(v)\mathbf{d}(v) flow, and there are no (nonzero) sources or sinks elsewhere.

We begin with the unscaled flow ff. Since (S,f)(S,f) is a (1+ϵ)(1+\epsilon)-fair cut/flow pair, each vertex vVH(S{t})v\in V_{H}\setminus(S\cup\{t\}) receives at least 11+ϵcH({v},S)12cH({v},S)\frac{1}{1+\epsilon}\,c_{H}(\{v\},S)\geq\frac{1}{2}c_{H}(\{v\},S) total flow from vertices in SS. Since AStVH(S{t})A\setminus S_{t}\subseteq V_{H}\setminus(S\cup\{t\}), the same holds for all vAStv\in A\setminus S_{t}. By construction of HH, we have cH({v},S)cH(v,s)=ϕ𝐝|At(v)+ϵϕ𝐝(v)c_{H}(\{v\},S)\geq c_{H}(v,s)=\phi\mathbf{d}|_{A_{t}^{\ell}}(v)+\epsilon\phi\mathbf{d}(v) for all vAStv\in A\setminus S_{t}. It follows that each vertex vAStv\in A\setminus S_{t} receives at least ϕ2𝐝|At(v)+ϵϕ2𝐝(v)\frac{\phi}{2}\mathbf{d}|_{A_{t}^{\ell}}(v)+\frac{\epsilon\phi}{2}\mathbf{d}(v) total flow from vertices in SS.

We now investigate the effect of restricting the flow ff to edges in G[ASt]G[A\setminus S_{t}], starting with removing all edges incident to SS. Continuing the argument above, removing these edges causes each vertex vAStv\in A\setminus S_{t} to be the source of at least ϕ2𝐝|At(v)+ϵϕ2𝐝(v)\frac{\phi}{2}\mathbf{d}|_{A_{t}^{\ell}}(v)+\frac{\epsilon\phi}{2}\mathbf{d}(v) flow.

If xSx\notin S, then we now remove the edges incident to xx. By construction of HH, each vertex vAv\in A has an edge to xx of capacity γ𝐝(v)=ϵϕ2𝐝(v)\gamma\mathbf{d}(v)=\frac{\epsilon\phi}{2}\mathbf{d}(v). Since the flow ff is feasible, there is at most ϵϕ2𝐝(v)\frac{\epsilon\phi}{2}\mathbf{d}(v) flow along the edge (v,x)(v,x). Removing this edge changes the net flow out of vv by at most ϵϕ2𝐝(v)\frac{\epsilon\phi}{2}\mathbf{d}(v). Since each vertex vAStv\in A\setminus S_{t} is the source of at least ϕ2𝐝|At(v)+ϵϕ2𝐝(v)\frac{\phi}{2}\mathbf{d}|_{A_{t}^{\ell}}(v)+\frac{\epsilon\phi}{2}\mathbf{d}(v) flow before this step, it is the source of at least ϕ2𝐝|At(v)0\frac{\phi}{2}\mathbf{d}|_{A_{t}^{\ell}}(v)\geq 0 flow after this step. In particular, it cannot become a sink. Also, if vAtStv\in A_{t}^{\ell}\setminus S_{t}, then it is the source of at least ϕ2𝐝(v)\frac{\phi}{2}\mathbf{d}(v) flow after restriction.

Finally, we remove the edges incident to tt. Since the flow ff is feasible, each edge (v,t)(v,t) carries at most cH(v,t)c_{H}(v,t) flow. By construction of HH, we have cH(v,t)=12ϕ𝐝(v)c_{H}(v,t)=12\phi\mathbf{d}(v) for all vAtrv\in A_{t}^{r}, so each vertex vAtrv\in A_{t}^{r} receives a net flow of at most 12ϕ𝐝(v)12\phi\mathbf{d}(v) from vertices other than tt (and then sends that flow to tt). We may assume that the flow does not send any flow away from tt (i.e., along any edge (v,t)(v,t) in the direction from tt to vv), since otherwise we can remove such flow using a path decomposition. Under this assumption, removing flow on edges incident to tt does not create any sources, only sinks. In particular, each vertex vAtrv\in A_{t}^{r} is now the sink of at most 12ϕ𝐝(v)12\phi\mathbf{d}(v) flow.

Finally, scaling this restricted flow by 1/(12ϕ)1/(12\phi), we conclude that the scaled flow ff has congestion 1/(12ϕ)1/(12\phi), each vertex vAtStv\in A_{t}^{\ell}\cap S_{t} is the source of at least 𝐝(v)/12\mathbf{d}(v)/12 flow, and each vertex vAtrStv\in A_{t}^{r}\setminus S_{t} is the sink of at most 𝐝(v)\mathbf{d}(v) flow. ∎

Claim A.9.

Given flow ff, the flow ftf_{t} can be constructed in O(mlogm)O(m\log m) time.

Proof.

The flow ftf_{t} is on the graph HH with at most m+nm+n edges, so scaling and restricting the flow takes O(m)O(m) time. Using a dynamic tree, we can decompose the new flow into at most mm (implicit) paths in O(mlogm)O(m\log m) time. The path removal step also takes O(mlogm)O(m\log m) time through dynamic trees. The overall running time is O(mlogm)O(m\log m). ∎

Claim A.10.

Assumption 1 of Theorem 5.10 holds for our choice of parameters γ,β\gamma,\beta. That is, we have 𝐭(CA)+ϵγδG(CA)βδGC\mathbf{t}(C\cap A)+\epsilon\gamma\cdot\delta_{G}(C\cap A)\leq\beta\cdot\delta_{G}C for all C𝒞C\in\mathcal{C}.

Proof.

Recall that we set γϵϕ2,β(12ϕ+ϵγ)(κ+2)\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow(12\phi+\epsilon\gamma)(\kappa+2), and 𝐭12ϕ𝐝|Atr\mathbf{t}\leftarrow 12\phi\mathbf{d}|_{A_{t}^{r}}. For the entire proof, fix a set C𝒞C\in\mathcal{C}. We first bound 𝐭(CA)\mathbf{t}(C\cap A) as

𝐭(CA)=12ϕ𝐝|Atr(CA)12ϕ𝐝(CA)=12ϕdeg𝒫LA(CA)12ϕ(deg𝒫L(CA)+degA(CA)).\mathbf{t}(C\cap A)=12\phi\mathbf{d}|_{A_{t}^{r}}(C\cap A)\leq 12\phi\mathbf{d}(C\cap A)=12\phi\deg_{\partial\mathcal{P}_{L}\cap\partial A}(C\cap A)\leq 12\phi(\deg_{\partial\mathcal{P}_{L}}(C\cap A)+\deg_{\partial A}(C\cap A)).

For degA(CA)\deg_{\partial A}(C\cap A), we have degA(CA)=cG(CA,VA)\deg_{\partial A}(C\cap A)=c_{G}(C\cap A,V\setminus A) since any edge in A\partial A with an endpoint in CAC\cap A has its other endpoint outside AA. For deg𝒫L(CA)\deg_{\partial\mathcal{P}_{L}}(C\cap A), we claim the bound deg𝒫L(CA)deg𝒫L(C)δGC\deg_{\partial\mathcal{P}_{L}}(C\cap A)\leq\deg_{\partial\mathcal{P}_{L}}(C)\leq\delta_{G}C. The first inequality is trivial, and for the second inequality, observe that by construction of 𝒞\mathcal{C}, each set C𝒞C\in\mathcal{C} is a subset of some cluster in the partition 𝒫L\mathcal{P}_{L}. It follows that any edge in G𝒫L\partial_{G}\mathcal{P}_{L} with an endpoint in CC has its other endpoint outside CC, so the edge must be in GC\partial_{G}C, and we conclude that deg𝒫L(C)δGC\deg_{\partial\mathcal{P}_{L}}(C)\leq\delta_{G}C. In total, we have established the bound 𝐭(CA)12ϕ(cG(CA,VA)+δGC)\mathbf{t}(C\cap A)\leq 12\phi(c_{G}(C\cap A,V\setminus A)+\delta_{G}C).

Next, we bound δG(CA)\delta_{G}(C\cap A) as

δG(CA)=cG(CA,VA)+cG(CA,AC)cG(CA,VA)+δGC,\delta_{G}(C\cap A)=c_{G}(C\cap A,V\setminus A)+c_{G}(C\cap A,A\setminus C)\leq c_{G}(C\cap A,V\setminus A)+\delta_{G}C,

and together with the bound on 𝐭(CA)\mathbf{t}(C\cap A), we obtain

𝐭(CA)+ϵγδG(CA)\displaystyle\mathbf{t}(C\cap A)+\epsilon\gamma\cdot\delta_{G}(C\cap A) 12ϕ(cG(CA,VA)+δGC)+ϵγ(cG(CA,VA)+δGC)\displaystyle\leq 12\phi(c_{G}(C\cap A,V\setminus A)+\delta_{G}C)+\epsilon\gamma(c_{G}(C\cap A,V\setminus A)+\delta_{G}C)
=(12ϕ+ϵγ)(cG(CA,VA)+δGC).\displaystyle=(12\phi+\epsilon\gamma)(c_{G}(C\cap A,V\setminus A)+\delta_{G}C). (4)

We now focus on bounding cG(CA,VA)c_{G}(C\cap A,V\setminus A). Recall from assumption 1 of Theorem 5.11 that there is a flow on GG with congestion κ\kappa such that each vertex vAv\in A is the source of cG({v},VA)c_{G}(\{v\},V\setminus A) flow and each vertex vVv\in V is the sink of at most deg𝒫L(v)\deg_{\partial\mathcal{P}_{L}}(v) flow. In particular, among the vertices vCv\in C, there is a total of cG(CA,VA)c_{G}(C\cap A,V\setminus A) source and at most deg𝒫L(C)\deg_{\partial\mathcal{P}_{L}}(C) sink. Since the flow has congestion κ\kappa, the net flow out of CC is at most κδGC\kappa\delta_{G}C. Putting everything together, we obtain

cG(CA,VA)deg𝒫L(C)+κδGCδGC+κδGC,c_{G}(C\cap A,V\setminus A)\leq\deg_{\partial\mathcal{P}_{L}}(C)+\kappa\delta_{G}C\leq\delta_{G}C+\kappa\delta_{G}C,

where the second inequality follows from our earlier claim deg𝒫L(C)δGC\deg_{\partial\mathcal{P}_{L}}(C)\leq\delta_{G}C. Continuing from (4), we conclude that

𝐭(CA)+ϵγδG(CA)\displaystyle\mathbf{t}(C\cap A)+\epsilon\gamma\cdot\delta_{G}(C\cap A) (12ϕ+ϵγ)(cG(CA,VA)+δGC)\displaystyle\leq(12\phi+\epsilon\gamma)(c_{G}(C\cap A,V\setminus A)+\delta_{G}C)
=(12ϕ+ϵγ)(δGC+κδGC+δGC)\displaystyle=(12\phi+\epsilon\gamma)(\delta_{G}C+\kappa\delta_{G}C+\delta_{G}C)
=(12ϕ+ϵγ)(κ+2)δGC\displaystyle=(12\phi+\epsilon\gamma)(\kappa+2)\delta_{G}C
βδGC,\displaystyle\leq\beta\delta_{G}C,

finishing the proof. ∎

A.2.3 Constructing the matching and updating the flow-matrix

Using the flow ftf_{t}, the algorithm then constructs a (fractional) matching graph MtM_{t} on vertex set At1A_{t-1}. Take a path decomposition of flow ftf_{t} into paths from AtStA_{t}^{\ell}\setminus S_{t} to AtrStA_{t}^{r}\setminus S_{t}. For each pair of vertices uAtStu\in A_{t}^{\ell}\setminus S_{t} and vAtrStv\in A_{t}^{r}\setminus S_{t}, add an edge (u,v)(u,v) to MtM_{t} whose capacity is the sum of capacities of all uuvv paths in the decomposition of ftf_{t}. The following claim is immediate using an (implicit) path decomposition.

Claim A.11.

Given flow ftf_{t}, the matching graph MtM_{t} can be constructed in O(mlogm)O(m\log m) time.

From the matching graph MtM_{t}, we implicitly update the flow-matrix 𝐅t1\mathbf{F}_{t-1} to 𝐅t\mathbf{F}_{t} as follows. For each vAv\in A, we set

𝐅t(u)=𝐅t1(u)+vAcMt(u,v)2(𝐅t1(v)𝐝(v)𝐅t1(u)𝐝(u)).\mathbf{F}_{t}(u)=\mathbf{F}_{t-1}(u)+\sum_{v\in A}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}-\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}\bigg{)}.

Note that we are viewing MtM_{t} as a graph on vertex set AA, where vertices outside (AtSt)(AtrSt)(A_{t}^{\ell}\setminus S_{t})\cup(A_{t}^{r}\setminus S_{t}) are isolated.

Recall from Section A.2.1 that the algorithm needs to compute the projection 𝐩(v)=𝐅t1(v)/𝐝(v),𝐫\mathbf{p}(v)=\langle\mathbf{F}_{t-1}(v)/\mathbf{d}(v),\,\mathbf{r}\rangle onto the vector rr for each vertex vAv\in A. Now that 𝐅t1(v)\mathbf{F}_{t-1}(v) has been explicitly defined, we show that the projections can be computed in total time O(mT)O(mT).

Claim A.12.

Given vector 𝐫A\mathbf{r}\in\mathbb{R}^{A}, the projection 𝐩(v)=𝐅t1(v)/𝐝(v),𝐫\mathbf{p}(v)=\langle\mathbf{F}_{t-1}(v)/\mathbf{d}(v),\,\mathbf{r}\rangle for each vertex vAv\in A can be computed in total time O(mT)O(mT).

Proof.

By linearity of inner product, we have

𝐅j(u),𝐫=𝐅j1(u),𝐫+vAcMj(u,v)2(𝐅j1(v),𝐫𝐝(v)𝐅j1(u),𝐫𝐝(u))\langle\mathbf{F}_{j}(u),\mathbf{r}\rangle=\langle\mathbf{F}_{j-1}(u),\mathbf{r}\rangle+\sum_{v\in A}\frac{c_{M_{j}}(u,v)}{2}\bigg{(}\frac{\langle\mathbf{F}_{j-1}(v),\mathbf{r}\rangle}{\mathbf{d}(v)}-\frac{\langle\mathbf{F}_{j-1}(u),\mathbf{r}\rangle}{\mathbf{d}(u)}\bigg{)}

for all vertices vAv\in A and iterations j[t1]j\in[t-1]. Since the flow fjf_{j} is on GG, we can assume that the path decomposition has at most mm paths, so the matching graph MjM_{j} has at most mm edges. In other words, there are at most mm nonzero values of cMj(u,v)c_{M_{j}}(u,v). It follows that given the values 𝐅j1(u),𝐫\langle\mathbf{F}_{j-1}(u),\mathbf{r}\rangle for all uAu\in A, we can compute 𝐅j(u),𝐫\langle\mathbf{F}_{j}(u),\mathbf{r}\rangle for all uAu\in A in O(m)O(m) time. Since 𝐅0\mathbf{F}_{0} is a diagonal matrix, the initial values 𝐅0(u),𝐫\langle\mathbf{F}_{0}(u),\mathbf{r}\rangle can be computed in O(n)O(n) time. Over t1t-1 iterations, the total time is O(mT)O(mT). ∎

A.3 Analyzing the Potential Decrease

The main goal of Section A.3 is to prove the following lemma, establishing a (1Ω(1/logn))(1-\Omega(1/\log n)) expected decrease in ψ(t)\psi(t) on each iteration.

Lemma A.13.

Over the random choice of the unit vector rr, we have 𝔼[ψ(t1)ψ(t)]Ω(ψ(t1)/logn)\mathbb{E}[\psi(t-1)-\psi(t)]\geq\Omega(\psi(t-1)/\log n).

To prove Lemma A.13, we begin by listing properties of MtM_{t} and 𝐅t\mathbf{F}_{t}.

Claim A.14.

degMt(u)=𝐝(u)/48\deg_{M_{t}}(u)=\mathbf{d}(u)/48 for all uAtStu\in A_{t}^{\ell}\setminus S_{t}, and degMt(v)𝐝(v)\deg_{M_{t}}(v)\leq\mathbf{d}(v) for all uAtrStu\in A_{t}^{r}\setminus S_{t}.

Proof.

For each vertex uAtStu\in A_{t}^{\ell}\setminus S_{t}, its degree in MtM_{t} equals the total capacity of paths starting at uu in the decomposition of ftf_{t}, which is exactly 𝐝(u)/48\mathbf{d}(u)/48 by Claim A.8. A symmetric argument establishes degMt(v)𝐝(v)\deg_{M_{t}}(v)\leq\mathbf{d}(v) for all uAtrStu\in A_{t}^{r}\setminus S_{t}. ∎

Claim A.15.

For each vertex uAu\in A, the flow-vector 𝐅t(u)\mathbf{F}_{t}(u) sums to 𝐝(u)\mathbf{d}(u).

Proof.

We prove by induction on tt that the flow-vector 𝐅t(u)\mathbf{F}_{t}(u) sums to 𝐝(u)\mathbf{d}(u). The statement is true for t=0t=0 since 𝐅0\mathbf{F}_{0} is defined as the diagonal matrix with value 𝐝(v)\mathbf{d}(v) on entry F(v,v)F(v,v). Assume by induction that the flow-vector 𝐅t1(u)\mathbf{F}_{t-1}(u) sums to 𝐝(u)\mathbf{d}(u) for each uAu\in A. For each uAu\in A, we take the definition of vector 𝐅t(u)\mathbf{F}_{t}(u) and sum over its coordinates wAw\in A to obtain

wA𝐅t(u,w)\displaystyle\sum_{w\in A}\mathbf{F}_{t}(u,w) =wA(𝐅t1(u,w)+vAcMt(u,v)2(𝐅t1(v,w)𝐝(v)𝐅t1(u,w)𝐝(u)))\displaystyle=\sum_{w\in A}\left(\mathbf{F}_{t-1}(u,w)+\sum_{v\in A}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\mathbf{F}_{t-1}(v,w)}{\mathbf{d}(v)}-\frac{\mathbf{F}_{t-1}(u,w)}{\mathbf{d}(u)}\bigg{)}\right)
=wA𝐅t1(u,w)+vAcMt(u,v)2(wA𝐅t1(v,w)𝐝(v)wA𝐅t1(u,w)𝐝(u))\displaystyle=\sum_{w\in A}\mathbf{F}_{t-1}(u,w)+\sum_{v\in A}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\sum_{w\in A}\mathbf{F}_{t-1}(v,w)}{\mathbf{d}(v)}-\frac{\sum_{w\in A}\mathbf{F}_{t-1}(u,w)}{\mathbf{d}(u)}\bigg{)}
=𝐝(u)+vAcMt(u,v)2(𝐝(v)𝐝(v)𝐝(u)𝐝(u))\displaystyle=\mathbf{d}(u)+\sum_{v\in A}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\mathbf{d}(v)}{\mathbf{d}(v)}-\frac{\mathbf{d}(u)}{\mathbf{d}(u)}\bigg{)}
=𝐝(u),\displaystyle=\mathbf{d}(u),

completing the induction and the proof. ∎

Claim A.16.

If 𝐅t1\mathbf{F}_{t-1} is routable with congestion t/ϕt/\phi, then 𝐅t\mathbf{F}_{t} is routable with congestion (t+1)/ϕ(t+1)/\phi.

Proof.

Take the AA-commodity flow routing 𝐅t1\mathbf{F}_{t-1} with congestion t/ϕt/\phi and reverse it, forming a AA-commodity flow routing the transpose 𝐅t1\mathbf{F}_{t-1}^{\top} with the same congestion. In this reversed flow, each vertex uAu\in A has 𝐅t1(v,u)=𝐅t1(u,v)\mathbf{F}_{t-1}^{\top}(v,u)=\mathbf{F}_{t-1}(u,v) quantity of commodity vv. In other words, the quantities of each commodity at vertex uAu\in A is captured by its flow vector 𝐅t1(u)\mathbf{F}_{t-1}(u).

Next, we “mix” commodities along the edges of the matching graph MtM_{t}: for each edge (u,v)(u,v) in MtM_{t} of capacity cc, send a proportional c2𝐝(u)\frac{c}{2\mathbf{d}(u)} fraction of each commodity at uu along the corresponding path (in the path decomposition of f/ϕf/\phi) in the direction from uu to vv, and send a proportional c2𝐝(v)\frac{c}{2\mathbf{d}(v)} fraction of each commodity at vv in the direction from vv to uu. By Claim A.15, each vertex uAt1u\in A_{t-1} has 𝐝(u)\mathbf{d}(u) total quantity of commodities, so we send c/2c/2 total quantity of commodities in each direction, or cc in total, along this path of capacity cc (in the path decomposition of ftf_{t}). Since flow ftf_{t} has congestion 1/ϕ1/\phi by Claim A.8, the total congestion of this mixing step is 1/ϕ1/\phi. After this mixing step, the quantities of each commodity at vertex uAu\in A is exactly 𝐅t(u)\mathbf{F}_{t}(u). In other words, we have established a AA-commodity flow routing 𝐅t\mathbf{F}_{t}^{\top} with congestion (t+1)/ϕ(t+1)/\phi. Reversing the flow, we obtain a routing of 𝐅t\mathbf{F}_{t} with the same congestion. ∎

The following lemma adapts Lemma 3.4 of [33] to the capacitated case.

Lemma A.17.

ψ(t)ψ(t1)12u,vAcMt(u,v)𝐅t(u)𝐝(u)𝐅t(v)𝐝(v)22\psi(t)-\psi(t-1)\geq\displaystyle\frac{1}{2}\sum_{u,v\in A}c_{M_{t}}(u,v)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}.

Proof.

For each vertex vAv\in A, define the normalized flow vector 𝐅~t1(v)=𝐝(v)1/2𝐅t1(v)\widetilde{\mathbf{F}}_{t-1}(v)=\mathbf{d}(v)^{-1/2}\mathbf{F}_{t-1}(v), and define the normalized flow matrix 𝐅~t10A×A\widetilde{\mathbf{F}}_{t-1}\in\mathbb{R}^{A\times A}_{\geq 0} with vector 𝐅~t1(v)\widetilde{\mathbf{F}}_{t-1}(v) for each row vAv\in A. Let DD be the diagonal matrix with value 𝐝(u)\mathbf{d}(u) on entry (u,u)(u,u). We can then write 𝐅~t1=D1/2𝐅t1\widetilde{\mathbf{F}}_{t-1}=D^{-1/2}\mathbf{F}_{t-1}. Define the vectors 𝐅~t(v)=𝐝(v)1/2𝐅t(v)\widetilde{\mathbf{F}}_{t}(v)=\mathbf{d}(v)^{-1/2}\mathbf{F}_{t}(v) and matrix 𝐅~t=D1/2𝐅t\widetilde{\mathbf{F}}_{t}=D^{-1/2}\mathbf{F}_{t} analogously. For each vertex uAu\in A, by definition of flow vector 𝐅t(u)\mathbf{F}_{t}(u), we have

𝐅~t(u)=𝐝(u)1/2𝐅t(u)\displaystyle\widetilde{\mathbf{F}}_{t}(u)=\mathbf{d}(u)^{-1/2}\mathbf{F}_{t}(u) =𝐝(u)1/2(𝐅t1(u)+vAtcMt(u,v)2(𝐅t1(v)𝐝(v)𝐅t1(u)𝐝(u)))\displaystyle=\mathbf{d}(u)^{-1/2}\bigg{(}\mathbf{F}_{t-1}(u)+\sum_{v\in A_{t}}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}-\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}\bigg{)}\bigg{)}
=𝐝(u)1/2(𝐝(u)1/2𝐅~t1(u)+vAt1cMt(u,v)2(𝐅~t1(v)𝐝(v)1/2𝐅~t1(u)𝐝(u)1/2))\displaystyle=\mathbf{d}(u)^{-1/2}\bigg{(}\mathbf{d}(u)^{1/2}\widetilde{\mathbf{F}}_{t-1}(u)+\sum_{v\in A_{t-1}}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\widetilde{\mathbf{F}}_{t-1}(v)}{\mathbf{d}(v)^{1/2}}-\frac{\widetilde{\mathbf{F}}_{t-1}(u)}{\mathbf{d}(u)^{1/2}}\bigg{)}\bigg{)}
=F~(u)+vAt1cMt(u,v)2(𝐅~t1(v)𝐝(u)1/2𝐝(v)1/2𝐅~t1(u)𝐝(u))\displaystyle=\widetilde{F}(u)+\sum_{v\in A_{t-1}}\frac{c_{M_{t}}(u,v)}{2}\bigg{(}\frac{\widetilde{\mathbf{F}}_{t-1}(v)}{\mathbf{d}(u)^{1/2}\mathbf{d}(v)^{1/2}}-\frac{\widetilde{\mathbf{F}}_{t-1}(u)}{\mathbf{d}(u)}\bigg{)}
=F~(u)12(degMt(u)𝐝(u)F~(u)vAt1cMt(u,v)𝐝(u)1/2𝐝(v)1/2F~(v)).\displaystyle=\widetilde{F}(u)-\frac{1}{2}\bigg{(}\frac{\deg_{M_{t}}(u)}{\mathbf{d}(u)}\widetilde{F}(u)-\sum_{v\in A_{t-1}}\frac{c_{M_{t}}(u,v)}{\mathbf{d}(u)^{1/2}\mathbf{d}(v)^{1/2}}\widetilde{F}(v)\bigg{)}.

Let L0A×AL\in\mathbb{R}^{A\times A}_{\geq 0} be the Laplacian matrix for the matching graph MtM_{t} on vertex set AA (where vertices outside AtAtrA_{t}^{\ell}\cup A_{t}^{r} are isolated). That is, we define L(u,u)=degMt(u)L(u,u)=\deg_{M_{t}}(u) for all uAu\in A and L(u,v)=cMt(u,v)L(u,v)=-c_{M_{t}}(u,v) for all distinct u,vAu,v\in A.

We first prove the following about the matrix D1/2LD1/2D^{-1/2}LD^{-1/2}.

Subclaim 1.

0D1/2LD1/22I0\preceq D^{-1/2}LD^{-1/2}\preceq 2I.

Proof.

Consider the normalized Laplacian DL1/2LDL1/2D_{L}^{-1/2}LD^{-1/2}_{L} where DLD_{L} is the diagonal matrix with value degMt(u)\deg_{M_{t}}(u) on each entry (u,u)(u,u). It is well-known that the normalized Laplacian has eigenvalues in the range [0,2][0,2], so 0DL1/2LDL1/22I0\preceq D_{L}^{-1/2}LD_{L}^{-1/2}\preceq 2I. Multiplying by DL1/2D_{L}^{1/2} on both sides gives 0L2DL0\preceq L\preceq 2D_{L}. Claim A.14 implies that DLDD_{L}\preceq D, so we obtain 0L2DL2D0\preceq L\preceq 2D_{L}\preceq 2D. Finally, multiplying by D1/2D^{-1/2} on both sides gives the desired 0D1/2LD1/22I0\preceq D^{-1/2}LD^{-1/2}\preceq 2I. ∎

By definition, the matrix D1/2LD1/2D^{-1/2}LD^{-1/2} has value degMt(u)𝐝(u)\frac{\deg_{M_{t}}(u)}{\mathbf{d}(u)} on entry (u,u)(u,u) and value cMt(u,v)𝐝(u)1/2𝐝(v)1/2-\frac{c_{M_{t}}(u,v)}{\mathbf{d}(u)^{1/2}\mathbf{d}(v)^{1/2}} on entry (u,v)(u,v) with uvu\neq v. It follows that we can write matrix 𝐅~t\widetilde{\mathbf{F}}_{t} as

𝐅~t=(I12D1/2LD1/2)𝐅~t1=12(I+N)𝐅~t1whereN=ID1/2LD1/2.\widetilde{\mathbf{F}}_{t}=\bigg{(}I-\frac{1}{2}D^{-1/2}LD^{-1/2}\bigg{)}\widetilde{\mathbf{F}}_{t-1}=\frac{1}{2}(I+N)\widetilde{\mathbf{F}}_{t-1}\quad\text{where}\quad N=I-D^{-1/2}LD^{-1/2}.

For two matrices A,BA×AA,B\in\mathbb{R}^{A\times A}, define the Hadamard product AB=u,vAA(u,v)B(u,v)=Tr(AB)A\bullet B=\sum_{u,v\in A}A(u,v)\cdot B(u,v)=\textup{Tr}(A^{\top}B). To bound ψ(t1)\psi(t-1), we use the fact that 𝝁t1\boldsymbol{\mu}_{t-1} minimizes the expression in (2) to get

ψ(t1)\displaystyle\psi(t-1) =uAt1𝐝(u)𝐅t1(u)𝐝(u)𝝁t122\displaystyle=\sum_{u\in A_{t-1}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2}
uAt1𝐝(u)𝐅t1(u)𝐝(u)𝝁t22\displaystyle\leq\sum_{u\in A_{t-1}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}
=uAt1𝐅t1(u)𝐝(u)1/2𝐝(u)1/2𝝁t22\displaystyle=\sum_{u\in A_{t-1}}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)^{1/2}}-\mathbf{d}(u)^{1/2}\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}
=(𝐅~t1D1/2𝟙𝝁t)(𝐅~t1D1/2𝟙𝝁t).\displaystyle=(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})\bullet(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}).

For ψ(t)\psi(t), we use At1AtA_{t-1}\subseteq A_{t} to get

ψ(t)\displaystyle\psi(t) =uAt𝐝(u)𝐅t(u)𝐝(u)𝝁t22\displaystyle=\sum_{u\in A_{t}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}
uAt1𝐝(u)𝐅t(u)𝐝(u)𝝁t22\displaystyle\geq\sum_{u\in A_{t-1}}\mathbf{d}(u)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}
=uAt1𝐅t(u)𝐝(u)1/2𝐝(u)1/2𝝁t22\displaystyle=\sum_{u\in A_{t-1}}\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)^{1/2}}-\mathbf{d}(u)^{1/2}\boldsymbol{\mu}_{t}\right\rVert_{2}^{2}
=(𝐅~tD1/2𝟙𝝁t)(𝐅~tD1/2𝟙𝝁t).\displaystyle=(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})\bullet(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}).

Taking the difference and expanding,

ψ(t1)ψ(t)\displaystyle\psi(t-1)-\psi(t)
(𝐅~t1D1/2𝟙𝝁t)(𝐅~t1D1/2𝟙𝝁t)(𝐅~tD1/2𝟙𝝁t)(𝐅~tD1/2𝟙𝝁t)\displaystyle\geq(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})\bullet(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})-(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})\bullet(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})
=(𝐅~t1D1/2𝟙𝝁t)(𝐅~t1D1/2𝟙𝝁t)(𝐅~tD1/2𝟙𝝁t)(𝐅~tD1/2𝟙𝝁t)\displaystyle=(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})\bullet(\widetilde{\mathbf{F}}_{t-1}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})-(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})\bullet(\widetilde{\mathbf{F}}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t})
=𝐅~t1𝐅~t12𝐅~t1D1/2𝟙𝝁t+D1/2𝟙𝝁tD1/2𝟙𝝁t𝐅~t𝐅~t+2𝐅~tD1/2𝟙𝝁tD1/2𝟙𝝁tD1/2𝟙𝝁t\displaystyle=\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-2\widetilde{\mathbf{F}}_{t-1}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}+D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}-\widetilde{\mathbf{F}}_{t}\bullet\widetilde{\mathbf{F}}_{t}+2\widetilde{\mathbf{F}}_{t}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}-D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}
=𝐅~t1𝐅~t1𝐅~t𝐅~t+2(𝐅~t𝐅~t1)D1/2𝟙𝝁t\displaystyle=\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\widetilde{\mathbf{F}}_{t}\bullet\widetilde{\mathbf{F}}_{t}+2(\widetilde{\mathbf{F}}_{t}-\widetilde{\mathbf{F}}_{t-1})\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}
=𝐅~t1𝐅~t1𝐅~t𝐅~t2(12D1/2LD1/2)D1/2𝟙𝝁t.\displaystyle=\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\widetilde{\mathbf{F}}_{t}\bullet\widetilde{\mathbf{F}}_{t}-2\bigg{(}\frac{1}{2}D^{-1/2}LD^{-1/2}\bigg{)}\bullet D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}.

The third term equals Tr(D1/2LD1/2D1/2𝟙𝝁t)\textup{Tr}\big{(}D^{-1/2}LD^{-1/2}D^{1/2}\mathbbm{1}\boldsymbol{\mu}_{t}\big{)}, which equals 0 since LD1/2D1/2𝟙=L𝟙LD^{-1/2}D^{1/2}\mathbbm{1}=L\mathbbm{1} is the zero matrix. Expanding the first and second terms, we obtain

ψ(t1)ψ(t)\displaystyle\psi(t-1)-\psi(t) 𝐅~t1𝐅~t1𝐅~t𝐅~t\displaystyle\geq\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\widetilde{\mathbf{F}}_{t}\bullet\widetilde{\mathbf{F}}_{t}
=𝐅~t1𝐅~t112(I+N)𝐅~t112(I+N)𝐅~t1\displaystyle=\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\frac{1}{2}(I+N)\widetilde{\mathbf{F}}_{t-1}\bullet\frac{1}{2}(I+N)\widetilde{\mathbf{F}}_{t-1}
=34𝐅~t1𝐅~t112N𝐅~t1𝐅~t114N𝐅~t1N𝐅~t1\displaystyle=\frac{3}{4}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\frac{1}{2}N\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-\frac{1}{4}N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1}
=14(𝐅~t1𝐅~t1N𝐅~t1N𝐅~t1)+12(IN)𝐅~t1𝐅~t1\displaystyle=\frac{1}{4}\big{(}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1}\big{)}+\frac{1}{2}(I-N)\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}
=14(𝐅~t1𝐅~t1N𝐅~t1N𝐅~t1)+12D1/2LD1/2𝐅~t1𝐅~t1.\displaystyle=\frac{1}{4}\big{(}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1}\big{)}+\frac{1}{2}D^{-1/2}LD^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}. (5)

To bound the first term in (5), recall from Subclaim 1 that 0D1/2LD1/22I0\preceq D^{-1/2}LD^{-1/2}\preceq 2I, which means that INI-I\preceq N\preceq I. Since 𝐅~t1𝐅~t10\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top}\succeq 0 and NN has eigenvalues in the range [1,1][-1,1], we have Tr(N𝐅~t1𝐅~t1N)Tr(𝐅~t1𝐅~t1)\textup{Tr}(N\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top}N)\leq\textup{Tr}(\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top}), so

𝐅~t1𝐅~t1=Tr(𝐅~t1𝐅~t1)Tr(N𝐅~t1𝐅~t1N)=Tr(𝐅~t1NN𝐅~t1)=N𝐅~t1N𝐅~t1,\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}=\textup{Tr}(\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top})\geq\textup{Tr}(N\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top}N)=\textup{Tr}(\widetilde{\mathbf{F}}_{t-1}^{\top}NN\widetilde{\mathbf{F}}_{t-1})=N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1},

so the first term 14(𝐅~t1𝐅~t1N𝐅~t1N𝐅~t1)\frac{1}{4}(\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}-N\widetilde{\mathbf{F}}_{t-1}\bullet N\widetilde{\mathbf{F}}_{t-1}) is at least 0.

Continuing from (5), we conclude that

ψ(t1)ψ(t)12D1/2LD1/2𝐅~t1𝐅~t1.\psi(t-1)-\psi(t)\geq\frac{1}{2}D^{-1/2}LD^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}.

It remains to understand the above expression D1/2LD1/2𝐅~t1𝐅~t1D^{-1/2}LD^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}. For each vertex vAv\in A, let 𝟙vA\mathbbm{1}_{v}\in\mathbb{R}^{A} denote the unit vector in direction vv. The Laplacian can be written as

L=u,vAcMt(u,v)(𝟙u𝟙v)(𝟙u𝟙v),L=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot(\mathbbm{1}_{u}-\mathbbm{1}_{v})(\mathbbm{1}_{u}-\mathbbm{1}_{v})^{\top},

so that

D1/2LD1/2𝐅~t1𝐅~t1\displaystyle D^{-1/2}LD^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1} =u,vAcMt(u,v)D1/2(𝟙u𝟙v)(𝟙u𝟙v)D1/2𝐅~t1𝐅~t1\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot D^{-1/2}(\mathbbm{1}_{u}-\mathbbm{1}_{v})(\mathbbm{1}_{u}-\mathbbm{1}_{v})^{\top}D^{-1/2}\widetilde{\mathbf{F}}_{t-1}\bullet\widetilde{\mathbf{F}}_{t-1}
=u,vAcMt(u,v)Tr(𝐅~t1D1/2(𝟙u𝟙v)(𝟙u𝟙v)D1/2𝐅~t1)\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\textup{Tr}\big{(}\widetilde{\mathbf{F}}_{t-1}^{\top}D^{-1/2}(\mathbbm{1}_{u}-\mathbbm{1}_{v})(\mathbbm{1}_{u}-\mathbbm{1}_{v})^{\top}D^{-1/2}\widetilde{\mathbf{F}}_{t-1}\big{)}
=u,vAcMt(u,v)Tr((𝟙u𝟙v)D1/2𝐅~t1𝐅~t1D1/2(𝟙u𝟙v))\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\textup{Tr}\big{(}(\mathbbm{1}_{u}-\mathbbm{1}_{v})^{\top}D^{-1/2}\widetilde{\mathbf{F}}_{t-1}\widetilde{\mathbf{F}}_{t-1}^{\top}D^{-1/2}(\mathbbm{1}_{u}-\mathbbm{1}_{v})\big{)}
=u,vAcMt(u,v)𝐅~t1D1/2(𝟙u𝟙v)22\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\left\lVert\widetilde{\mathbf{F}}_{t-1}^{\top}D^{-1/2}(\mathbbm{1}_{u}-\mathbbm{1}_{v})\right\rVert_{2}^{2}
=u,vAcMt(u,v)𝐅t1D1(𝟙u𝟙v)22\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\left\lVert\mathbf{F}_{t-1}^{\top}D^{-1}(\mathbbm{1}_{u}-\mathbbm{1}_{v})\right\rVert_{2}^{2}
=u,vAcMt(u,v)𝐅t1(u)𝐝(u)𝐅t1(v)𝐝(v)22.\displaystyle=\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}.

We conclude that

ψ(t1)ψ(t)12u,vAcMt(u,v)𝐅t1(u)𝐝(u)𝐅t1(v)𝐝(v)22.\psi(t-1)-\psi(t)\geq\frac{1}{2}\sum_{u,v\in A}c_{M_{t}}(u,v)\cdot\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}.\qed

Finally, we prove the main lemma of Section A.2.3, restated below. See A.13

Proof.

By Lemma A.17, we have

ψ(t1)ψ(t)12u,vAcMt(u,v)𝐅t(u)𝐝(u)𝐅t(v)𝐝(v)22.\displaystyle\psi(t-1)-\psi(t)\geq\displaystyle\frac{1}{2}\sum_{u,v\in A}c_{M_{t}}(u,v)\left\lVert\frac{\mathbf{F}_{t}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}. (6)

By Lemma A.6, we have

𝔼[(𝐩(v)μ¯t1)2]=1n𝐅t1(v)𝐝(v)𝝁t122\displaystyle\mathbb{E}[(\mathbf{p}(v)-\bar{\mu}_{t-1})^{2}]=\frac{1}{n}\left\lVert\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2} (7)

for all vertices vAv\in A, and for some constant C>0C>0,

𝔼[(𝐩(u)𝐩(v))2]Clognn𝐅t1(u)𝐝(u)𝐅t1(v)𝐝(v)22\displaystyle\mathbb{E}[(\mathbf{p}(u)-\mathbf{p}(v))^{2}]\leq\frac{C\log n}{n}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2} (8)

for all pairs u,vAu,v\in A. By Claim A.14, we have degMt(u)=𝐝(u)/12\deg_{M_{t}}(u)=\mathbf{d}(u)/12 for all uAtStu\in A_{t}^{\ell}\setminus S_{t}, and together with statements a to c of Lemma A.5, we have

uAt,vAtrcMt(u,v)(𝐩(u)𝐩(v))2\displaystyle\sum_{u\in A_{t}^{\ell},v\in A_{t}^{r}}c_{M_{t}}(u,v)\cdot(\mathbf{p}(u)-\mathbf{p}(v))^{2} auAt,vAtrcMt(u,v)(𝐩(u)η)2\displaystyle\stackrel{{\scriptstyle\mathclap{\textup{\ref{item:CMG-separator-1}}}}}{{\geq}}\sum_{u\in A_{t}^{\ell},v\in A_{t}^{r}}c_{M_{t}}(u,v)\cdot(\mathbf{p}(u)-\eta)^{2}
=uAtdegMt(u)(𝐩(u)η)2\displaystyle=\sum_{u\in A_{t}^{\ell}}\deg_{M_{t}}(u)\cdot(\mathbf{p}(u)-\eta)^{2}
=112uAt𝐝(u)(𝐩(u)η)2\displaystyle=\frac{1}{12}\sum_{u\in A_{t}^{\ell}}\mathbf{d}(u)\cdot(\mathbf{p}(u)-\eta)^{2}
c124uAt1𝐝(u)(𝐩(u)μ¯t1)2.\displaystyle\stackrel{{\scriptstyle\mathclap{\textup{\ref{item:CMG-separator-3}}}}}{{\geq}}\frac{1}{24}\sum_{u\in A_{t-1}}\mathbf{d}(u)\cdot(\mathbf{p}(u)-\bar{\mu}_{t-1})^{2}. (9)

Taking the expectation and putting everything together,

𝔼[ψ(t1)ψ(t)]\displaystyle\mathbb{E}[\psi(t-1)-\psi(t)] (6)𝔼[12u,vAcMt(u,v)𝐅t1(u)𝐝(u)𝐅t1(v)𝐝(v)22]\displaystyle\stackrel{{\scriptstyle\mathclap{(\ref{eq:pot-red-1})}}}{{\geq}}\mathbb{E}\bigg{[}\displaystyle\frac{1}{2}\sum_{u,v\in A}c_{M_{t}}(u,v)\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}\bigg{]}
=𝔼[uAt,vAtrcMt(u,v)𝐅t1(u)𝐝(u)𝐅t1(v)𝐝(v)22]\displaystyle=\mathbb{E}\bigg{[}\displaystyle\sum_{u\in A_{t}^{\ell},v\in A_{t}^{r}}c_{M_{t}}(u,v)\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\frac{\mathbf{F}_{t-1}(v)}{\mathbf{d}(v)}\right\rVert_{2}^{2}\bigg{]}
(8)𝔼[nClognuAt,vAtrcMt(u,v)(𝐩(u)𝐩(v))2]\displaystyle\stackrel{{\scriptstyle\mathclap{(\ref{eq:pot-red-3})}}}{{\geq}}\mathbb{E}\bigg{[}\frac{n}{C\log n}\sum_{u\in A_{t}^{\ell},v\in A_{t}^{r}}c_{M_{t}}(u,v)\cdot(\mathbf{p}(u)-\mathbf{p}(v))^{2}\bigg{]}
(9)𝔼[n24ClognuAt1𝐝(u)(𝐩(u)μ¯t1)2]\displaystyle\stackrel{{\scriptstyle\mathclap{(\ref{eq:pot-red-4})}}}{{\geq}}\mathbb{E}\bigg{[}\frac{n}{24C\log n}\sum_{u\in A_{t-1}}\mathbf{d}(u)\cdot(\mathbf{p}(u)-\bar{\mu}_{t-1})^{2}\bigg{]}
=n24ClognuAt1𝐝(u)𝔼[(𝐩(u)μ¯t1)2]\displaystyle=\frac{n}{24C\log n}\sum_{u\in A_{t-1}}\mathbf{d}(u)\cdot\mathbb{E}[(\mathbf{p}(u)-\bar{\mu}_{t-1})^{2}]
=(7)n24ClognuAt1𝐝(u)1n𝐅t1(u)𝐝(u)𝝁t122\displaystyle\stackrel{{\scriptstyle\mathclap{(\ref{eq:pot-red-2})}}}{{=}}\frac{n}{24C\log n}\sum_{u\in A_{t-1}}\mathbf{d}(u)\cdot\frac{1}{n}\left\lVert\frac{\mathbf{F}_{t-1}(u)}{\mathbf{d}(u)}-\boldsymbol{\mu}_{t-1}\right\rVert_{2}^{2}
=124Clognψ(t1),\displaystyle=\frac{1}{24C\log n}\cdot\psi(t-1),

which concludes the proof of Lemma A.13. ∎

A.4 Putting Everything Together

Finally, we prove correctness of the algorithm by establishing properties (1) to (3) in Theorem 5.11. Properties (1) and (2) follow directly from Lemma A.1. For property (3), if the algorithm terminates early, then 𝐝(R)𝐝(A)/(6T)\mathbf{d}(R)\geq\mathbf{d}(A)/(6T) by the termination condition and property (3) is satisfied. Otherwise, we have 𝐝(RT)<𝐝(A)/(6T)\mathbf{d}(R_{T})<\mathbf{d}(A)/(6T). By Lemma A.13, the potential ψ(t)\psi(t) for any iteration tt is at most 1Ω(1/logn)1-\Omega(1/\log n) times the previous potential ψ(t1)\psi(t-1) in expectation. Over T=O(log2(nW))T=O(\log^{2}(nW)) iterations, the expected potential ψ(T)\psi(T) is at most 1/(nW)2C1/(nW)^{2C}, where the constant C>0C>0 can be made arbitrarily large by setting the constant in T=O(log2(nW))T=O(\log^{2}(nW)) appropriately. By Markov’s inequality, we have ψ(T)1/(nW)C\psi(T)\leq 1/(nW)^{C} with probability at least 11/nC1-1/n^{C}, in which case Lemma A.2 implies that vertex weighting degF|At\mathrm{deg}_{F}|_{A_{t}} mixes in GG with congestion 5T/ϕ=O(ϕ1log2(nW))5T/\phi=O(\phi^{-1}\log^{2}(nW)), fulfilling property (3).

It remains to bound the running time. For each iteration, the projections 𝐩(v)=𝐅t1(v)/𝐝(v),𝐫\mathbf{p}(v)=\langle\mathbf{F}_{t-1}(v)/\mathbf{d}(v),\,\mathbf{r}\rangle for each vertex vAv\in A can be computed in total time O(mT)O(mT) by Claim A.12. Computing the values At,AtrA_{t}^{\ell},A_{t}^{r} takes additional time O(|At1|log|At1|)=O(mlogm)O(|A_{t-1}|\log|A_{t-1}|)=O(m\log m) by Lemma A.5. The algorithm makes one call to Theorem 5.10 with the parameters in (3), and then computes flow ftf_{t} and the matching graph MtM_{t} in O(mlogm)O(m\log m) time by Claims A.9 and A.11. Overall, the algorithm runs in O(mlog2(nW))O(m\log^{2}(nW)) time per iteration for T=O(log2(nW))T=O(\log^{2}(nW)) many iterations, which is O(mlog4(nW))O(m\log^{4}(nW)) time in total. The algorithm also makes T=O(log2(nW))T=O(\log^{2}(nW)) many calls to Theorem 5.10.

With both properties (1) to (3) and the running time established, this concludes the proof of Theorem 5.11.

Appendix B Expander Trimming

In this section, we prove Theorem 5.12, restated below. See 5.12

Our setup closely resembles Section 3.2 of [34], except that we replace the exact min-cut/max-flow oracle in their “slow-trimming” procedure with an approximate one. We also avoid defining expanders and work exclusively with flow mixing, which simplifies the analysis.

Define the vertex weighting 𝐬00A\mathbf{s}_{0}\in\mathbb{R}^{A}_{\geq 0} as 𝐬0(v)=cG({v},R)\mathbf{s}_{0}(v)=c_{G}(\{v\},R) for all vARv\in A\setminus R, and 𝐬0(v)=0\mathbf{s}_{0}(v)=0 for all vRv\in R. We call Theorem 5.10 with parameters

AA,ϵϵ,γϵϕ2,βmax{1,(12ϕ+ϵγ)(κ+2)},𝐬𝐬0+ϵϕ𝐝,𝐭12ϕ𝐝|AR,\displaystyle A\leftarrow A,\,\epsilon\leftarrow\epsilon,\,\gamma\leftarrow\frac{\epsilon\phi}{2},\,\beta\leftarrow\max\{1,(12\phi+\epsilon\gamma)(\kappa+2)\},\,\,\mathbf{s}\leftarrow\mathbf{s}_{0}+\epsilon\phi\mathbf{d},\,\mathbf{t}\leftarrow 12\phi\mathbf{d}|_{A\setminus R},

and we denote the graph G[A,γ,𝐬,𝐭]G[A,\gamma,\mathbf{s},\mathbf{t}] by H=(VH,EH)H=(V_{H},E_{H}). Note that except for ϵ\epsilon and 𝐬\mathbf{s}, these parameters match the parameters (3) in the proof of Theorem 5.11, where AtA_{t}^{\ell} and AtrA_{t}^{r} are replaced by RR and ARA\setminus R, respectively. Since assumption 1 is the same as the one in Theorem 5.11, the statement and proof of Claim A.10 (which does not depend on ϵ\epsilon or 𝐬\mathbf{s}) translates directly to this setting. For brevity, we omit the details, and simply restate Claim A.10 below.

Claim B.1.

Assumption 1 of Theorem 5.10 holds for our choice of parameters ϵ,γ,β\epsilon,\gamma,\beta. That is, we have 𝐭(CA)+ϵγδG(CA)βδGC\mathbf{t}(C\cap A)+\epsilon\gamma\cdot\delta_{G}(C\cap A)\leq\beta\cdot\delta_{G}C for all C𝒞C\in\mathcal{C}.

From Theorem 5.10, we obtain an (1+ϵ)(1+\epsilon)-approximate fair cut/flow pair (S,f)(S,f). The algorithm sets BS{s,x}B\leftarrow S\setminus\{s,x\}, which trivially takes O(|A|)O(|A|) time outside the call. It remains to show that properties (1) to (4) are satisfied.

To show property (1), we start with δGBδHS\delta_{G}B\leq\delta_{H}S and proceed by bounding δHS\delta_{H}S. By Fact 5.8, the cut value δHS\delta_{H}S is at most (1+ϵ)(1+\epsilon) times the minimum (s,t)(s,t)-cut. Since the (s,t)(s,t)-minimum cut is at most δH{s}=𝐬(A)\delta_{H}\{s\}=\mathbf{s}(A), we have

δHS(1+ϵ)𝐬(A)=(1+ϵ)(𝐬0(A)+ϵϕ𝐝(A))2𝐬0(A)+2ϵϕ𝐝(A).\delta_{H}S\leq(1+\epsilon)\mathbf{s}(A)=(1+\epsilon)(\mathbf{s}_{0}(A)+\epsilon\phi\mathbf{d}(A))\leq 2\mathbf{s}_{0}(A)+2\epsilon\phi\mathbf{d}(A).

By definition of 𝐬0\mathbf{s}_{0}, we have 𝐬0(A)=vARcG({v},R)=cG(AR,R)=δG[A]R\mathbf{s}_{0}(A)=\sum_{v\in A\setminus R}c_{G}(\{v\},R)=c_{G}(A\setminus R,R)=\delta_{G[A]}R. Putting everything together,

δGBδHS2𝐬0(A)+2ϵϕ𝐝(A)=2δG[A]R+2ϵϕ𝐝(A),\delta_{G}B\leq\delta_{H}S\leq 2\mathbf{s}_{0}(A)+2\epsilon\phi\mathbf{d}(A)=2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A),

fulfilling property (1). By construction of HH, we also have cH(S,{t})=12ϕ𝐝|AR(B)c_{H}(S,\{t\})=12\phi\mathbf{d}|_{A\setminus R}(B), so

12ϕ𝐝(BR)=12ϕ𝐝|AR(B)=cH(S,{t})δHS2δG[A]R+2ϵϕ𝐝(A),12\phi\mathbf{d}(B\setminus R)=12\phi\mathbf{d}|_{A\setminus R}(B)=c_{H}(S,\{t\})\leq\delta_{H}S\leq 2\delta_{G[A]}R+2\epsilon\phi\mathbf{d}(A),

and dividing by 12ϕ12\phi establishes property (2).

The proofs of properties (3) and (4) are longer, so we package them into the two claims below.

Claim B.2.

Property (4) holds, i.e., there exists a vector 𝐭0A\mathbf{t}\in\mathbb{R}^{A}_{\geq 0} with 𝐭24ϕ𝐝|A(RB)\mathbf{t}\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)} and a flow gg on G[A(RB)]G[A\setminus(R\cup B)] routing demand degG[A](RB)|A(RB)𝐭\textup{deg}_{\partial_{G[A]}(R\cup B)}|_{A\setminus(R\cup B)}-\mathbf{t} with congestion 22.

Proof.

Since (S,f)(S,f) is a (1+ϵ)(1+\epsilon)-fair cut/flow pair, each vertex vVH(S{t})v\in V_{H}\setminus(S\cup\{t\}) receives at least 11+ϵcH({v},S)12cH({v},S)\frac{1}{1+\epsilon}\,c_{H}(\{v\},S)\geq\frac{1}{2}c_{H}(\{v\},S) total flow from vertices in SS. Since A(RB)VH(S{t})A\setminus(R\cup B)\subseteq V_{H}\setminus(S\cup\{t\}), the same holds for all vA(RB)v\in A\setminus(R\cup B). By construction of HH, we have the following for all vA(RB)v\in A\setminus(R\cup B):

cH({v},S)\displaystyle c_{H}(\{v\},S) cH(v,s)+cH({v},B)\displaystyle\geq c_{H}(v,s)+c_{H}(\{v\},B)
=𝐬0(v)+ϵϕ𝐝(v)+cG({v},B)\displaystyle=\mathbf{s}_{0}(v)+\epsilon\phi\mathbf{d}(v)+c_{G}(\{v\},B)
=cG({v},R)+ϵϕ𝐝(v)+cG({v},B)\displaystyle=c_{G}(\{v\},R)+\epsilon\phi\mathbf{d}(v)+c_{G}(\{v\},B)
cG({v},RB)+ϵϕ𝐝(v)\displaystyle\geq c_{G}(\{v\},R\cup B)+\epsilon\phi\mathbf{d}(v)
=degG[A](RB)(v)+ϵϕ𝐝(v).\displaystyle=\deg_{\partial_{G[A]}(R\cup B)}(v)+\epsilon\phi\mathbf{d}(v).

It follows that each vertex vA(RB)v\in A\setminus(R\cup B) receives at least 12(degG[A](RB)(v)+ϵϕ𝐝(v))\frac{1}{2}(\deg_{\partial_{G[A]}(R\cup B)}(v)+\epsilon\phi\mathbf{d}(v)) total flow from vertices in RBR\cup B.

We now investigate the effect of restricting the flow ff to edges in G[A(RB)]G[A\setminus(R\cup B)], starting with removing all edges incident to RBR\cup B. Continuing the argument above, removing these edges causes each vertex vA(RB)v\in A\setminus(R\cup B) to be the source of at least degG[A](RB)(v)+ϵϕ𝐝(v)\deg_{\partial_{G[A]}(R\cup B)}(v)+\epsilon\phi\mathbf{d}(v) flow.

If xSx\notin S, then we now remove the edges incident to xx. By construction of HH, each vertex vAv\in A has an edge to xx of capacity γ𝐝(v)=ϵϕ2κ𝐝(v)\gamma\mathbf{d}(v)=\frac{\epsilon\phi}{2\kappa}\mathbf{d}(v). Since the flow has congestion κ\kappa, there is at most ϵϕ2𝐝(v)\frac{\epsilon\phi}{2}\mathbf{d}(v) flow along the edge (v,x)(v,x). Removing this edge changes the net flow out of vv by at most ϵϕ2𝐝(v)\frac{\epsilon\phi}{2}\mathbf{d}(v). Since each vertex vA(RB)v\in A\setminus(R\cup B) is the source of at least 12(degG[A](RB)(v)+ϵϕ𝐝(v))\frac{1}{2}(\deg_{\partial_{G[A]}(R\cup B)}(v)+\epsilon\phi\mathbf{d}(v)) flow before this step, it is the source of at least 12degG[A](RB)(v)\frac{1}{2}\deg_{\partial_{G[A]}(R\cup B)}(v) flow after this step.

At this point, only the vertices A(RB)A\setminus(R\cup B) and tt remain, and each vertex vtv\neq t is the source of at least 12degG[A](RB)(v)\frac{1}{2}\deg_{\partial_{G[A]}(R\cup B)}(v) flow. Scale the flow by factor 22, take a path decomposition of the flow, and remove paths until each vertex vtv\neq t is the source of exactly degG[A](RB)(v)\deg_{\partial_{G[A]}(R\cup B)}(v) flow. We may assume that the flow does not send any flow away from tt (i.e., along any edge (v,t)(v,t) in the direction from tt to vv), since otherwise we can cancel such flow using the path decomposition.

Finally, we remove the edges incident to tt. Since the original flow ff is feasible, the scaled flow has congestion 22, so each edge (v,t)(v,t) carries at most 2cH(v,t)2c_{H}(v,t) flow. By construction of HH, we have cH(v,t)=12ϕ𝐝(v)c_{H}(v,t)=12\phi\mathbf{d}(v) for all vARv\in A\setminus R, so each vertex vA(RB)v\in A\setminus(R\cup B) receives a net flow of at most 24ϕ𝐝(v)24\phi\mathbf{d}(v) from vertices other than tt (and then sends that flow to tt). Let 𝐭(v)24ϕ𝐝(v)\mathbf{t}(v)\leq 24\phi\mathbf{d}(v) be the net flow received, so that the vector 𝐭0A(RB)\mathbf{t}\in\mathbb{R}^{A\setminus(R\cup B)}_{\geq 0} satisfies 𝐭24ϕ𝐝|A(RB)\mathbf{t}\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)}. Removing the edge (v,t)(v,t) increases the net flow into vv by exactly 𝐭(v)\mathbf{t}(v). Since each vertex vA(RB)v\in A\setminus(R\cup B) is the source of exactly degG[A](RB)(v)\deg_{\partial_{G[A]}(R\cup B)}(v) flow before removing the edge (v,t)(v,t), the vertex vv has net flow degG[A](RB)(v)𝐭(v)\deg_{\partial_{G[A]}(R\cup B)}(v)-\mathbf{t}(v) after removal. In other words, the new flow routes demand degG[A](RB)(v)𝐭(v)\deg_{\partial_{G[A]}(R\cup B)}(v)-\mathbf{t}(v) and has congestion 22. By construction, it is also restricted to G[A(RB)]G[A\setminus(R\cup B)], concluding the proof. ∎

Claim B.3.

Property (3) holds, i.e., if the vertex weighting 𝐝|AR\mathbf{d}|_{A\setminus R} mixes in G[A]G[A] with congestion cc, then the vertex weighting (𝐝+degG[A](RB))|A(RB)(\mathbf{d}+\deg_{\partial_{G[A]}(R\cup B)})|_{A\setminus(R\cup B)} mixes in G[A]G[A] with congestion 2+(1+24ϕ)c2+(1+24\phi)c.

Proof.

Consider any demand 𝐛A\mathbf{b}\in\mathbb{R}^{A} satisfying |𝐛|(𝐝+degG(RB))|A(RB)|\mathbf{b}|\leq(\mathbf{d}+\deg_{\partial_{G}(R\cup B)})|_{A\setminus(R\cup B)}. In particular, vRB𝐛(v)=0\sum_{v\in R\cup B}\mathbf{b}(v)=0. To fulfill property (3), we want to construct a single-commodity flow routing demand 𝐛\mathbf{b} with congestion 2+(1+24ϕ)c2+(1+24\phi)c.

We first split 𝐛\mathbf{b} into demands 𝐱\mathbf{x} and 𝐛𝐱\mathbf{b}-\mathbf{x}, where vector 𝐱0A\mathbf{x}\in\mathbb{R}^{A}_{\geq 0} is defined as

𝐱(v)={degG(RB)(v)𝐝(v)+degG(RB)(v)𝐛(v)if vA(RB) and 𝐝(v)+degG(RB)(v)>0,0otherwise.\mathbf{x}(v)=\begin{cases}\displaystyle\frac{\deg_{\partial_{G}(R\cup B)}(v)}{\mathbf{d}(v)+\deg_{\partial_{G}(R\cup B)}(v)}\cdot\mathbf{b}(v)&\text{if }v\in A\setminus(R\cup B)\text{ and }\mathbf{d}(v)+\deg_{\partial_{G}(R\cup B)}(v)>0,\\ 0&\text{otherwise}.\end{cases}

Since |𝐛|(𝐝+degG(RB))|A(RB)|\mathbf{b}|\leq(\mathbf{d}+\deg_{\partial_{G}(R\cup B)})|_{A\setminus(R\cup B)}, we have |𝐱|degG(RB)|A(RB)|\mathbf{x}|\leq\text{deg}_{\partial_{G}(R\cup B)}|_{A\setminus(R\cup B)} and |𝐛𝐱|𝐝|A(RB)|\mathbf{b}-\mathbf{x}|\leq\mathbf{d}|_{A\setminus(R\cup B)}.

Take the flow gg from property (3), and compute a path decomposition of the flow. For each path in the decomposition from vertex uA(RB)u\in A\setminus(R\cup B) to vertex vA(RB)v\in A\setminus(R\cup B) of capacity cc, route 𝐱(u)degG(RB)(u)c\frac{\mathbf{x}(u)}{\deg_{\partial_{G}(R\cup B)}(u)}\cdot c flow along the path from uu to vv (or 𝐱(u)degG(RB)(u)c-\frac{\mathbf{x}(u)}{\deg_{\partial_{G}(R\cup B)}(u)}\cdot c flow in the reversed direction, whichever is nonnegative). Since |𝐱(u)|degG(RB)(u)|\mathbf{x}(u)|\leq\text{deg}_{\partial_{G}(R\cup B)}(u), we send at most cc flow along each such path of capacity cc, so our new flow g1g_{1} has congestion at most that of gg, which is 22. Each vertex vA(RB)v\in A\setminus(R\cup B) is the end of at most 𝐭(v)24ϕ𝐝(v)\mathbf{t}(v)\leq 24\phi\mathbf{d}(v) total capacity of paths in the decomposition, and for each such path of capacity cc, the vertex vv receives a net flow of at most cc in absolute value. It follows that each vertex vA(RB)v\in A\setminus(R\cup B) receives at most 24ϕ𝐝(v)24\phi\mathbf{d}(v) net flow in absolute value. Setting 𝐲(v)\mathbf{y}(v) as this net flow, we obtain |𝐲|24ϕ𝐝|A(RB)|\mathbf{y}|\leq 24\phi\mathbf{d}|_{A\setminus(R\cup B)} and that flow g1g_{1} route demand 𝐱𝐲\mathbf{x}-\mathbf{y}.

Having routed demand 𝐱𝐲\mathbf{x}-\mathbf{y} through flow g1g_{1}, it remains to route demand 𝐛𝐱+𝐲\mathbf{b}-\mathbf{x}+\mathbf{y}. We bound

|𝐛𝐱+𝐲||𝐛𝐱|+|𝐲|𝐝|A(RB)+24ϕ𝐝|A(RB)(1+24ϕ)𝐝|A(RB).|\mathbf{b}-\mathbf{x}+\mathbf{y}|\leq|\mathbf{b}-\mathbf{x}|+|\mathbf{y}|\leq\mathbf{d}|_{A\setminus(R\cup B)}+24\phi\mathbf{d}|_{A\setminus(R\cup B)}\leq(1+24\phi)\mathbf{d}|_{A\setminus(R\cup B)}.

By assumption, the vertex weighting (𝐝+deg(RB))|AR(\mathbf{d}+\deg_{\partial(R\cup B)})|_{A\setminus R} mixes in GG with congestion cc, so there is a flow routing demand (𝐛𝐱+𝐲)/(1+24ϕ)(\mathbf{b}-\mathbf{x}+\mathbf{y})/(1+24\phi) with congestion cc. Scaling this flow by factor 1+24ϕ1+24\phi, we obtain a flow g2g_{2} routing demand 𝐛𝐱+𝐲\mathbf{b}-\mathbf{x}+\mathbf{y} with congestion (1+24ϕ)c(1+24\phi)c. Summing the two flows, the final flow g1+g2g_{1}+g_{2} routes demand 𝐛\mathbf{b} with congestion 2+(1+24ϕ)c2+(1+24\phi)c, concluding the proof. ∎