

Partial Implementation of Max Flow and Min Cost Flow in Almost-Linear Time

A thesis presented by
Nithin Kavi

to
Computer Science
in partial fulfillment of the joint concentration
requirements for the degree of Bachelor of Arts
in Computer Science and Mathematics

Harvard College
Cambridge, Massachusetts
April 1, 2024
Abstract

In 2022, Chen et al. proposed an algorithm in [2] that solves the min cost flow problem in $m^{1+o(1)}\log U\log C$ time, where $m$ is the number of edges in the graph, $U$ is an upper bound on capacities and $C$ is an upper bound on costs. However, as far as the authors of [2] know, no one has implemented their algorithm to date. In this paper, we discuss implementations of several key portions of the algorithm given in [2], including the justifications for specific implementation choices. For the portions of the algorithm that we do not implement, we provide stubs. We then go through the entire algorithm and calculate the $m^{o(1)}$ term more precisely. Finally, we conclude with potential directions for future work in this area.

Acknowledgments

I would like to thank Dr. Adam Hesterberg for offering guidance throughout this project as my senior thesis advisor during the 2023-2024 academic year. He was extremely helpful in discussing ideas and offering feedback on this paper, and I could not have done this project without him. I would also like to thank Professors Yang Liu, Sushant Sachdeva and Richard Peng for answering clarifying questions about the algorithm that they present in [2] and offering advice on how to go about implementing things in this project. Their insight was extremely helpful. I am also thankful to Professors Daniel Sleator and Robert Tarjan for their insight on Dynamic Trees. I also sincerely appreciate Professor Madhu Sudan for agreeing to read and evaluate my thesis.

In addition, I want to thank Mincheol, Kevin, Ryan, Skyler, Sunera, Wittmann, and Yiting for being my suitemates and being part of an unforgettable experience at Harvard. I’ve really enjoyed these past 4 years with them. I am also thankful to my parents and older sister for all of their support throughout my Harvard education.

Lastly, I thank the Harvard Computer Science Department and Harvard College for giving me the opportunity to write a senior thesis and undertake this project.

1 Introduction

The maximum-flow problem is formulated as follows:

Suppose we have a connected graph $G$ with $m$ edges and $n$ vertices, where one vertex is designated as a source $s$ and another is designated as a sink $t$. Each edge $e$ in $G$ has a given capacity $c_e$. We want to maximize the amount of flow from $s$ to $t$ such that for all vertices $v \neq s, t$, the flow out of $v$ is equal to the flow into $v$, and for any edge $e$, the flow $f_e$ along edge $e$ satisfies $f_e \leq c_e$.

To solve this problem, Ford and Fulkerson created the Ford-Fulkerson algorithm, which runs in $O(mf^{*})$ time, where $f^{*}$ is the max flow. There is also the Edmonds-Karp algorithm, which runs in $O(nm^{2})$ time and does not depend on the size of the max flow itself.
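As a concrete illustration of these classical algorithms (and not part of the implementation described later in this paper), the Edmonds-Karp approach can be sketched in Python as follows; the adjacency-matrix representation and the function name are our own choices, made for brevity:

```python
from collections import deque

def edmonds_karp(cap, s, t):
    """Max flow via shortest augmenting paths (Edmonds-Karp sketch).

    cap: n x n matrix of residual capacities (modified in place);
    s, t: indices of the source and sink."""
    n = len(cap)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow  # no augmenting path remains
        # Find the bottleneck capacity along the path.
        bottleneck = float('inf')
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        # Augment: decrease forward capacities, increase reverse ones.
        v = t
        while v != s:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck
```

The classical $O(nm^2)$ bound comes from the fact that only $O(nm)$ augmentations can occur when each augmenting path is shortest.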

There is also the related minimum-cost flow problem, where each edge is also assigned a cost along with its capacity. Then the goal is to push the maximum total flow from the source to the sink while minimizing the total cost along the way.

Lastly, we can combine these two ideas as follows: letting the max flow as defined above be $f^{*}$, the min cost flow problem asks for the minimum possible total cost of the flow over the edges such that all requirements of the max flow problem are met and the max flow $f^{*}$ is attained.

In 2022, Chen, Kyng, Liu, Peng, Gutenberg and Sachdeva wrote [2], in which they found an algorithm to compute both maximum flows and minimum-cost flows in $m^{1+o(1)}\log U\log C$ time, which is the asymptotically fastest known algorithm to date. In this paper, we work on implementing the algorithm given in [2]. As far as the authors of [2] are aware, this algorithm has never been implemented before, and this paper constitutes the first attempt at implementing it. This algorithm relies on a number of subparts, including dynamic trees, low-stretch spanning trees, and undirected minimum-ratio cycles. Apart from very well-known things (like binary trees), there do not appear to be existing implementations of any of these subparts that meet the asymptotic time complexities given in [2].

We implement some of these subparts in the following sections in Python. Although Python is slower than many other languages by a large constant factor, the algorithm in [2] was never designed to be fast in practice, so this is not an issue. At the beginning of each section, we briefly describe what code that section corresponds to and explain how that code fits into the broader algorithm. In addition, for the sections where we do not implement all of the code described in the corresponding subpart, we include a description of what we did implement and what was challenging about the parts we did not implement. All of the code referenced in this paper can be found at the following GitHub repository: https://github.com/thinkinavi24/almost_linear_max_flow. The generative AI tool ChatGPT was used to write a significant amount of this code and is credited accordingly in comments of the Jupyter code blocks. In multiple Jupyter notebooks, some code blocks were written by ChatGPT while others were written by hand; any code block not labeled as being written by ChatGPT was written by hand. All code written by generative AI was thoroughly checked, tested and revised as needed, as was all other code in the repository.

2 Preliminaries

We begin with some definitions that we will refer back to throughout this paper. Definitions, notations and remarks with references in this section are copied identically from those references and are reproduced here for convenience. In addition, in later sections throughout this paper, we will quote results including theorems, definitions and lemmas from other articles and we will mention the source of the result immediately before reproducing it. Further, when discussing the implementation of a list of methods/attributes in a data structure that are described in another paper, we will often quote that paper’s description of the methods/attributes. When we do so, we will mention the source of the result immediately before reproducing a list that is copied from the referenced paper.

2.1 Definitions

Definition 2.1.

Given a graph $G$, for all pairs of vertices $(u,v) \in G$ we define $d_G(u,v)$ to be the shortest path distance from $u$ to $v$ in $G$.

Definition 2.2.

([1]) The radius of a connected graph $G$ around a point $x_0 \in G$ is $\max_{v\in G} d(x_0,v)$, where $d(x_0,v)$ represents the shortest distance from $x_0$ to $v$ for each vertex $v \in G$.

Notation 2.3.

([2]) We let a graph $G$ correspond to its vertices $V$ and its edges $E$, and we let $n = |V|$ and $m = |E|$. Sometimes, we may explicitly write $E(G)$ to denote the edges of $G$ or $V(G)$ to denote the vertices of $G$. Further, we let $e^G \in E(G)$ denote an edge of $G$ and $v^G \in V(G)$ denote a vertex of $G$. We assume that each edge $e \in E$ has an implicit direction, used to define its edge-vertex incidence matrix $\mathbf{B}$.

Definition 2.4.

([2]) We let $\deg_G(v)$ denote the degree of the vertex $v$ in $G$, i.e., the number of incident edges. We let $\Delta_{\max}(G)$ and $\Delta_{\min}(G)$ denote the maximum and minimum degree of the graph $G$.

Notation 2.5.

For a graph $G$ with vertices $V$, for any subset $S \subset V$, we let $\overline{S}$ denote $V \backslash S$.

Definition 2.6.

([2]) We define the volume of a set $S \subseteq V$ as $\operatorname{vol}_G(S) = \sum_{v\in S}\deg_G(v)$.

Definition 2.7.

([2]) Given graphs $G$ and $H$ with $V(G) \subseteq V(H)$, we say that $\Pi_{G\to H}$ is a graph-embedding from $G$ into $H$ if it maps each edge $e^G = (u,v) \in E(G)$ to a $u$-$v$ path $\Pi_{G\to H}(e^G)$ in $H$.

Definition 2.8.

([2]) The congestion of an edge $e^H$ is defined as $\operatorname{econg}(\Pi_{G\to H}, e^H) = |\{e^G \in E(G) \mid e^H \in \Pi_{G\to H}(e^G)\}|$, and the congestion of the embedding is $\operatorname{econg}(\Pi_{G\to H}) = \max_{e^H \in E(H)}\operatorname{econg}(\Pi_{G\to H}, e^H)$.

Definition 2.9.

([2]) The congestion of a vertex $v^H \in V(H)$ is defined by $\operatorname{vcong}(\Pi_{G\to H}, v^H) = |\{e^G \in E(G) \mid v^H \in \Pi_{G\to H}(e^G)\}|$. Further, the vertex-congestion of the embedding is $\operatorname{vcong}(\Pi_{G\to H}) = \max_{v^H \in V(H)}\operatorname{vcong}(\Pi_{G\to H}, v^H)$.

Definition 2.10.

([2]) The length of an embedding is defined as $\operatorname{length}(\Pi_{G\to H}) = \max_{e^G \in E(G)}|\Pi_{G\to H}(e^G)|$.
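Definitions 2.8 and 2.10 can be made concrete with a short Python sketch; representing an embedding as a dictionary mapping each edge of $G$ to the list of edges of $H$ on its embedded path is our own choice for illustration:

```python
from collections import Counter

def embedding_stats(embedding):
    """Edge congestion and length of a graph embedding.

    embedding: {edge of G: list of edges of H on its embedded path}.
    Returns (econg, length): the maximum number of embedded paths sharing
    one edge of H (Definition 2.8) and the longest embedded path length
    (Definition 2.10)."""
    use_count = Counter()
    for path in embedding.values():
        for e_H in path:
            use_count[e_H] += 1
    econg = max(use_count.values(), default=0)
    length = max((len(p) for p in embedding.values()), default=0)
    return econg, length
```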

Definition 2.11.

([2]) $G$ is a dynamic graph if it undergoes batches $U^{(1)}, U^{(2)}, \ldots$ of updates consisting of edge insertions/deletions and/or vertex splits that are applied to $G$. We say that the graph $G$, after applying the first $t$ update batches $U^{(1)}, U^{(2)}, \ldots, U^{(t)}$, is at stage $t$, and denote the graph at this stage by $G^{(t)}$.

Remark 2.12.

([2]) For each update batch $U^{(t)}$, we encode edge insertions by a tuple of the tail and head of the new edge, and deletions by a pointer to the edge that is about to be deleted. We further also encode vertex splits by a sequence of edge insertions and deletions as follows: if a vertex $v$ is about to be split and the vertex that is split off is denoted $v^{\text{NEW}}$, we can delete all edges that are incident to $v$ but should be incident to $v^{\text{NEW}}$ from $v$, and then re-insert each such edge via an insertion (we allow insertions to new vertices that do not yet exist in the graph). For technical reasons, we assume that in an update batch $U^{(t)}$, the updates to implement the vertex splits are last, and that we always encode a vertex split of $v$ into $v$ and $v^{\text{NEW}}$ such that $\deg_{G^{(t+1)}}(v^{\text{NEW}}) \leq \deg_{G^{(t+1)}}(v)$. We let the vertex set of the graph $G^{(t)}$ consist of the union of all endpoints of edges in the graph (in particular, if a vertex is split, the new vertex $v^{\text{NEW}}$ is added due to having edge insertions incident to this new vertex $v^{\text{NEW}}$ in $U^{(t)}$).

Definition 2.13.

We say a function $g(n) = \tilde{O}(f(n))$ if there exist constants $c, k$ such that $g(n) \leq c f(n)\log^k(n)$ for sufficiently large $n$. Note that this is equivalent to standard big-$O$ notation except that we also allow arbitrary powers of $\log$.

Definition 2.14.

([2]) We define ENC($u$) of an update $u \in U^{(t)}$ to be the size of the encoding of the update, and note that for edge insertions/deletions we have ENC($u$) $= \tilde{O}(1)$, and for a vertex split of $v$ into $v$ and $v^{\text{NEW}}$ as described above we have ENC($u$) $= \tilde{O}(\deg_{G^{(t+1)}}(v^{\text{NEW}}))$. For a batch of updates $U$, we let ENC($U$) $= \sum_{u\in U}$ ENC($u$).

Remark 2.15.

([2]) The number of updates $|U|$ in an update batch $U$ can be completely different from the actual encoding size ENC($U$) of the update batch $U$.

Definition 2.16.

([5]) For any subsets $S, T \subset V$, we denote by $E(S,T) = \{\{u,v\} \in E \mid u \in S, v \in T\}$ the set of edges between $S$ and $T$.

Definition 2.17.

([5]) The cut-size of a cut $S$ is $\delta(S) = |E(S, \overline{S})|$.

Definition 2.18.

([5]) The conductance of a cut $S$ in $G$ is $\Phi_G(S) = \frac{\delta(S)}{\min(\operatorname{vol}_G(S), \operatorname{vol}_G(V\backslash S))}$. Unless otherwise noted, when speaking of the conductance of a cut $S$, we assume $S$ to be the side of smaller volume. The conductance of a graph $G$ is $\Phi_G = \min_{S\subset V}\Phi_G(S)$. If $G$ is a singleton, we define $\Phi_G = 1$.
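Definitions 2.6, 2.17 and 2.18 translate directly into code. The following sketch, our own illustration assuming a simple undirected graph stored as adjacency sets, computes the conductance of a cut:

```python
def vol(adj, X):
    """Volume of a vertex set (Definition 2.6): sum of degrees."""
    return sum(len(adj[v]) for v in X)

def conductance(adj, S):
    """Conductance of a cut S (Definition 2.18).

    adj: {vertex: set of neighbors} for a simple undirected graph;
    S: a set of vertices."""
    S = set(S)
    S_bar = set(adj) - S
    # |E(S, S-bar)|, the cut-size of Definition 2.17.
    cut_size = sum(1 for u in S for v in adj[u] if v in S_bar)
    denom = min(vol(adj, S), vol(adj, S_bar))
    return cut_size / denom if denom else 1.0
```

For example, cutting a 4-cycle into two opposite pairs of adjacent vertices gives cut-size 2 and volume 4 on each side, hence conductance 1/2.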

Definition 2.19.

([5]) We say a graph $G$ is a $\phi$ expander if $\Phi_G \geq \phi$, and we call a partition $V_1, \ldots, V_k$ of $V$ a $\phi$ expander decomposition if $\min_i \Phi_{G[V_i]} \geq \phi$.

Definition 2.20.

([5]) Given $G = (V,E)$ and a set of vertices $A \subset V$, we say $A$ is a nearly $\phi$ expander in $G$ if $\forall S \subseteq A$ with $\operatorname{vol}(S) \leq \operatorname{vol}(A)/2$: $|E(S, V\backslash S)| \geq \phi\operatorname{vol}(S)$.

Definition 2.21.

([5]) Given a graph $G = (V,E)$, its subdivision graph $G_E = (V', E')$ is the graph where we put a split vertex $x_e$ on each edge $e \in E$ (including self loops). Formally, $V' = V \cup X_E$ where $X_E = \{x_e \mid e \in E\}$ and $E' = \{\{u, x_e\}, \{v, x_e\} \mid e = \{u,v\} \in E\}$. We will call vertices in $X_E$ the split vertices, and the other vertices in $V'$ the regular vertices.
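The subdivision graph of Definition 2.21 can be constructed directly. In this sketch (our own illustration), edges are frozensets and a split vertex $x_e$ is represented by the tuple `('x', e)`, a naming choice made here rather than taken from [5]:

```python
def subdivision_graph(vertices, edges):
    """Build the subdivision graph G_E of Definition 2.21.

    vertices: a set; edges: a set of frozensets {u, v}.
    Returns (V', E') where each original edge e is replaced by the two
    edges {u, x_e} and {v, x_e} through its split vertex x_e."""
    split = {e: ('x', e) for e in edges}            # one split vertex per edge
    new_vertices = set(vertices) | set(split.values())
    new_edges = set()
    for e in edges:
        for endpoint in e:
            new_edges.add(frozenset({endpoint, split[e]}))
    return new_vertices, new_edges
```

On a triangle, this produces 6 vertices (3 regular, 3 split) and 6 edges.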

Convention 2.22.

Whenever we use log\log in this paper, we always mean the natural logarithm. Of course, this is irrelevant when considering only asymptotic runtimes.

2.2 Layout of Algorithm

The authors in [2] present their main algorithm on page 69, at the beginning of Section 9, in Algorithm 7. In order to determine exactly what needs to be implemented (as opposed to the portions of [2] that are not directly part of the final algorithm), we work backwards from Algorithm 7.

We begin by listing the things we need to implement for Algorithm 7. Then for each of these, we list the things we need to implement for them, and so on, until we reach algorithms that can be implemented without referring to any other major algorithms. Note that some items are repeated within the list; this is because certain data structures/algorithms are used at multiple points within the overall main algorithm. The items that we have implemented are labeled as fully implemented or partially implemented.

A quick glance at Algorithm 7 tells us that we need to implement the following:

  1. The data structure $\mathcal{D}^{HSFC}$ (Theorem 6.2, [2])

     1.1. Branching Tree Chain (Theorem 7.1, [2])

          1.1.1. Dynamic Low Stretch Decomposition (LSD) (Lemma 6.5, [2]) (partial implementation in Section 4)

                 1.1.1.1. Low-stretch spanning trees (Theorem 3.2, [2]) (full implementation in Section 4)

                          1.1.1.1.1. create_petal algorithm ([1])

                          1.1.1.1.2. petal_decomposition algorithm ([1])

                          1.1.1.1.3. hierarchical_petal_decomposition algorithm ([1])

                 1.1.1.2. Heavy Light Decomposition (Lemma B.8, [2]) (full implementation in Section 4)

                 1.1.1.3. Rooted tree (Lemma B.9, [2])

                 1.1.1.4. Computing Stretch Overestimates

                          1.1.1.4.1. Computing permutation $\pi$ (Lemma B.6, [2])

                                     1.1.1.4.1.1. Compute congestion (Lemma B.5, [2])

                                                  1.1.1.4.1.1.1. Dynamic Trees (Lemma 3.3, [2]) (external implementation cited in Section 3)

                 1.1.1.5. Tree Decomposition (Lemma B.7 of [2], quotes [9] and [8])

          1.1.2. Sparsified core graph (Definition 6.9, [2]), implemented in Algorithm 4 (Lemma 7.8, [2])

                 1.1.2.1. Dynamic Spanner (Theorem 5.1, [2])

                          1.1.2.1.1. SPARSIFY algorithm (Theorem 5.5, [2])

                                     1.1.2.1.1.1. DECOMPOSE algorithm (Theorem 5.11, [2])

                                                  1.1.2.1.1.1.1. Expander Decomposition (Theorem 1.2 of [5])

                                                                 1.1.2.1.1.1.1.1. Cut-Matching (Theorem 2.2 of [5])

                                                                                  1.1.2.1.1.1.1.1.1. Implicitly update flow $f$ (Appendix B.1 of [5])

                                                                 1.1.2.1.1.1.1.2. Trimming (Theorem 2.1 of [5])

                                     1.1.2.1.1.2. Data structure $\mathcal{DS}_{ExpPath}$ (Theorem 5.12 of [2], quotes [3])

                                                  1.1.2.1.1.2.1. Expander Pruning (Theorem 1.3 of [5])

                                                  1.1.2.1.1.2.2. Theorem 3.8 of [3]

                                                  1.1.2.1.1.2.3. ES-Tree Data Structure ([4])

                          1.1.2.1.2. Update algorithm

                 1.1.2.2. Dynamic Core Graphs (Lemma 7.5, [2])

                          1.1.2.2.1. Dynamic Low Stretch Decomposition (LSD) (Lemma 6.5, [2])

                          1.1.2.2.2. Multiplicative Weights Update (MWU) (Lemma 6.6, [2])

     1.2. Rebuilding Game (Lemma 8.3, [2]) (full implementation in Section 5)

  2. Dynamic Trees (Lemma 3.3, [2]) (external implementation in Section 3)

In order to implement Algorithm 7, we can view this list itself as a flowchart/DAG: the leaves of the tree need to be implemented first, and we work our way up to the root. In order to implement the algorithm at each “node” of the tree, all of its children must first be implemented.

In Section 7, we analyze these different components and the asymptotic guarantees given in [2] to precisely calculate the asymptotic time complexity of the main algorithm in [2]. Lastly, we conclude the paper in Section 8 with a summary of the work and possible directions for future study.

3 Dynamic Trees

Dynamic Trees are described in Lemma 3.3 of [2] which is reproduced below for convenience:

Lemma 3.1.

([2]) There is a deterministic data structure $\mathcal{D}^{(T)}$ that maintains a dynamic tree $T \subseteq G = (V,E)$ under insertion/deletion of edges with gradients $\mathbf{g}$ and lengths $\mathbf{\ell}$ and supports the following operations:

  1. Insert/delete edges $e$ to $T$, under the condition that $T$ is always a tree, or update the gradient $\mathbf{g}_e$ or length $\mathbf{\ell}_e$. The amortized time is $\tilde{O}(1)$ per change.

  2. For a path vector $\Delta = \mathbf{p}(T[u,v])$ for some $u, v \in V$, return $\langle\mathbf{g}, \mathbf{\Delta}\rangle$ or $\langle\mathbf{\ell}, |\mathbf{\Delta}|\rangle$ in time $\tilde{O}(1)$.

  3. Maintain a flow $\mathbf{f} \in \mathbb{R}^E$ under operations $\mathbf{f} \leftarrow \mathbf{f} + \eta\Delta$ for $\eta \in \mathbb{R}$ and path vector $\Delta = \mathbf{p}(T[u,v])$, or query the value $\mathbf{f}_e$, in amortized time $\tilde{O}(1)$.

  4. Maintain a positive flow $\mathbf{f} \in \mathbb{R}^E$ under operations $\mathbf{f} \leftarrow \mathbf{f} + \eta|\Delta|$ for $\eta \in \mathbb{R}_{\geq 0}$ and path vector $\mathbf{\Delta} = \mathbf{p}(T[u,v])$, or query the value $\mathbf{f}_e$, in amortized time $\tilde{O}(1)$.

  5. DETECT(). For a fixed parameter $\varepsilon$, and under positive flow updates (item 4), where $\Delta^{(t)}$ is the update vector at time $t$, returns $S^{(t)} = \left\{e \in E : \mathbf{\ell}_e \sum_{t' \in [\text{last}_e^{(t)}+1, t]} |\Delta_e^{(t')}| \geq \varepsilon\right\}$, where $\text{last}_e^{(t)}$ is the last time before $t$ that $e$ was returned by DETECT(). Runs in time $\tilde{O}(|S^{(t)}|)$.

The data structure in Lemma 3.3 of [2] above is a slight extension of the Dynamic Trees described in [6]. In general, the purpose of dynamic trees is to make operations like cutting a tree or linking two trees possible in $\tilde{O}(1)$ time; these operations can then be efficiently combined to implement operations 1-5 above, also in $\tilde{O}(1)$ time.

This is an important data structure for the algorithm presented in [2]. It is challenging to find an implementation of this data structure online. The authors of [6] pointed us to the following implementation in C++: https://codeforces.com/contest/487/submission/8902202, written by the first author of [6]. They also pointed us to their paper [7], which provides a simpler description of Dynamic Trees.

Since an implementation of Dynamic Trees already exists, albeit in a different language, we do not implement one here and instead defer to that implementation. While that implementation does not include a DETECT() method, the authors of [2] give an algorithm for DETECT() which should be implementable given the Dynamic Tree class linked above. Even though we are not creating our own dynamic trees implementation, we will refer to Lemma 3.1 throughout this paper, so it is important to understand the definition given there.
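Since we defer to the external C++ implementation, we can still record the interface of Lemma 3.1 as a Python stub, in the same spirit as the stubs we provide elsewhere in this paper. The method names below are our own choices, not notation from [2] or [6]; each operation should run in amortized $\tilde{O}(1)$ time in a real implementation:

```python
class DynamicTreeStub:
    """Interface stub for the dynamic tree data structure of Lemma 3.1."""

    def insert_edge(self, e, gradient, length):
        raise NotImplementedError  # operation 1: add e, keeping T a tree

    def delete_edge(self, e):
        raise NotImplementedError  # operation 1: remove e from T

    def update_edge(self, e, gradient=None, length=None):
        raise NotImplementedError  # operation 1: change g_e or l_e

    def path_inner_products(self, u, v):
        raise NotImplementedError  # operation 2: <g, Delta>, <l, |Delta|>

    def add_flow_along_path(self, u, v, eta):
        raise NotImplementedError  # operations 3/4: f <- f + eta * Delta

    def query_flow(self, e):
        raise NotImplementedError  # operations 3/4: return f_e

    def detect(self):
        raise NotImplementedError  # operation 5: edges exceeding epsilon
```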

4 Low Stretch Decomposition

As described in Section 2.2, in this section we implement Low Stretch Decomposition as specified in Lemma 6.5 of [2], which states the following:

Lemma 4.1.

(Dynamic Low Stretch Decomposition). There is a deterministic algorithm with total runtime $\tilde{O}(m)$ that, on a graph $G = (V,E)$ with lengths $\mathbf{\ell} \in \mathbb{R}^E_{>0}$, weights $\mathbf{v} \in \mathbb{R}^E_{>0}$, and parameter $k$, initializes a tree $T$ spanning $V$, a rooted spanning forest $F \subseteq T$, an edge-disjoint partition $\mathcal{W}$ of $F$ into $O(m/k)$ subtrees, and stretch overestimates $\widetilde{\text{str}}_e$. The algorithm maintains $F$ decrementally against $\tau$ batches of updates to $G$, say $U^{(1)}, U^{(2)}, \ldots, U^{(\tau)}$, such that $\widetilde{\text{str}}_e \stackrel{\text{def}}{=} 1$ for any new edge $e$ added by either edge insertions or vertex splits, and:

  1. $F$ has initially $O(m/k)$ connected components and $O(q\log^2 n)$ more after $t$ update batches of total encoding size $q \stackrel{\text{def}}{=} \sum_{t=1}^{\tau}\text{ENC}(U^{(t)})$ satisfying $q \leq \tilde{O}(m)$.

  2. $\text{str}_e^{F,\ell} \leq \widetilde{\text{str}}_e \leq O(k\gamma_{LSST}\log^4 n)$ for all $e \in E$ at all times, including inserted edges $e$.

  3. $\sum_{e\in E^{(0)}} v_e\widetilde{\text{str}}_e \leq O(\|\mathbf{v}\|_1\gamma_{LSST}\log^2 n)$.

  4. Initially, $\mathcal{W}$ contains $O(m/k)$ subtrees. For any piece $W \in \mathcal{W}$, $W \subseteq V$, we have $|\partial W| \leq 1$ and $\operatorname{vol}_G(W\backslash R) \leq O(k\log^2 n)$ at all times, where $R \supseteq \partial\mathcal{W}$ is the set of roots in $F$. Here, $\partial W$ denotes the set of boundary vertices that are in multiple partition pieces.

The factor γLSST\gamma_{LSST} will be specified in Notation 4.3.

We begin by implementing low-stretch spanning trees in Section 4.1 before moving onto the Heavy Light Decomposition in Section 4.2.

4.1 Low-stretch spanning trees

We begin by writing code to implement the LSST algorithm described in [1]. The code can be found in the file titled petal.ipynb.

The main result of [1] can be summarized as follows:

Theorem 4.2.

Suppose $G$ is a connected, undirected graph with vertices $V$ and edges $E$, where $|V| = n$ and $|E| = m$. Then we can construct a spanning tree $T$ such that $\sum_{(u,v)\in E}\frac{d_T(u,v)}{w(u,v)} = O(m\log n\log\log n)$. Further, this tree can be found in $O(m\log n\log\log n)$ time.

The tree $T$ found in Theorem 4.2 is our low-stretch spanning tree (LSST).
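The quantity bounded in Theorem 4.2 is easy to compute directly for a given spanning tree, which is also useful for validation. The following sketch (our own illustration, with edges given as (u, v, w) triples) recomputes tree distances by a simple search, which is fine for small test graphs:

```python
def total_stretch(edges, tree_edges):
    """Total stretch of a spanning tree (the sum in Theorem 4.2):
    for each graph edge (u, v, w), add d_T(u, v) / w."""
    adj = {}
    for u, v, w in tree_edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    def tree_dist(src, dst):
        # A depth-first search suffices because tree paths are unique.
        stack, seen = [(src, 0.0)], {src}
        while stack:
            node, d = stack.pop()
            if node == dst:
                return d
            for nxt, w in adj.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, d + w))
        raise ValueError("tree does not span both endpoints")

    return sum(tree_dist(u, v) / w for u, v, w in edges)
```

For a unit-weight triangle with the tree consisting of two of its edges, the tree edges have stretch 1 each and the off-tree edge has stretch 2, for a total of 4.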

Notation 4.3.

Following the notation of [2], we will refer to the stretch factor $O(m\log n\log\log n)$ of the LSST as $\gamma_{LSST}$ for the remainder of the paper.

The algorithm can be understood through a few functions: create_petal, petal_decomposition and hierarchical_petal_decomposition. We describe the implementation of each of these functions below. In order to get our LSST, we call hierarchical_petal_decomposition on our original graph with specific parameters that are specified in Section 4.1.3.

4.1.1 create_petal

Abraham and Neiman give two algorithms in [1] for create-petal, which both take in a graph, a starting point $x_0$, a target $t$ and a radius $r$ (where $r$ is expressed as a range in the first algorithm), and which both return a cluster (which is the petal) as well as a center of that petal. The algorithm given on page 235 is not fast enough for the overall desired bound in this paper: it requires that the ratio between the largest and smallest distances be $O(n^k)$ for some $k$, which does not necessarily hold in general, and its create-petal routine has a for loop which could potentially run for many iterations, with no clear bound on the number of iterations. For this reason, we implement the faster algorithm on page 245 in Section 6 of [1], which the authors themselves state is necessary for the best asymptotic efficiency.

In the code, we begin with an ordinary implementation of Dijkstra’s Algorithm using Fibonacci Heaps, along with a corresponding shortest_path function. Everything in this portion is standard. After that, we write a function to generate the directed graph $\tilde{G}$ as specified on page 245 of [1]. As Abraham and Neiman point out, all edges in the directed graph have nonnegative length.

With that out of the way, we can write the create_petal function. In essence, the algorithm iteratively carves petals off of the graph, then carves petals off of the remaining part of the graph, and so on until all petals have been carved, at which point any remaining vertices are made part of the stigma of the flower. This function takes the following inputs: a graph $Y$, a target $t$, a starting point $x_0$ and a radius $r$. Note that this is slightly different from the slower function that Abraham and Neiman propose. In particular, the create_petal function does not take in the original graph as input except at the beginning; rather, as described in the following subsection, each time a petal is carved, create_petal is iteratively called on the remaining subgraph. We also do not have any hi and lo parameters, as the faster algorithm described on page 245 of [1] does not include any.

To create a petal, we first run Dijkstra on the given graph from the given starting point $x_0$. We then generate the corresponding directed graph based on the shortest distances from $x_0$ to all other points in the graph and calculate the shortest path to the target $t$. Next, we need to find the special vertex $r'$, defined on page 245 of [1] as the furthest point from $t$ on the shortest path from $t$ to $x_0$ in the original graph $G$ that is within distance $r$ of $t$. To do this, we could perform binary search on the sorted distances along the shortest path from $t$ to $x_0$ in order to find the largest distance less than or equal to $r$. However, in our implementation we simply do a linear search, as this does not impact the overall asymptotic time complexity, which we empirically check later.
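The linear search just described can be sketched as follows; the argument names are our own, and we assume the path is ordered from $t$ outward, so that distances along it are nondecreasing:

```python
def find_r_prime(path_from_t, dist_from_t, r):
    """Linear search for p_{r'}: the furthest vertex from t on the
    shortest t -> x0 path whose distance from t is at most r.

    path_from_t: vertices in order starting at t;
    dist_from_t: {vertex: shortest distance from t}."""
    p = path_from_t[0]  # t itself, at distance 0
    for v in path_from_t:
        if dist_from_t[v] <= r:
            p = v       # still within distance r of t
        else:
            break       # distances only grow from here on
    return p
```

A binary search over the same sorted distances would also work, but as noted above it does not change the asymptotic complexity.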

Next, the algorithm stipulates that the petal is the ball of radius $r/2$ around $t$ in the directed graph, where each edge on the path from $t$ to $p_{r'}$ is set to half of its original weight in $G$. At first, this may seem equivalent to the following: include every vertex within the ball of radius $r/2$, and then also include every vertex on the path from $t$ to $p_{r'}$, including $p_{r'}$ itself. This would make intuitive sense because $p_{r'}$ is defined as the furthest point along the path from $t$ to $x_0$ within distance $r$ of $t$. (If all of the closest points to $t$ are further than $r$ away, then $p_{r'} = t$.) An implementation which does this would not be quite right, however, for the following reason: while edges in the directed graph all have nonnegative length by the triangle inequality, they can have length 0, and this fact influences the implementation of create_petal. In the directed graph, $p_{r'}$ may not be the furthest point within distance $r$: edges of length 0 mean that some points “further” along the path may be at equal distance.

Therefore, it is essential to first set the weight of each edge on the path from $t$ to $p_{r'}$ to half its original weight in $G$. After that, we run Dijkstra on the directed graph $\tilde{G}$ and select all vertices within distance $r/2$ of $t$.
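This step can be sketched as follows; this is our own simplified illustration, with the directed graph stored as nested dictionaries and a minimal Dijkstra included for self-containedness:

```python
import heapq

def dijkstra(digraph, src):
    """Minimal Dijkstra on a digraph given as {u: {v: weight}}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale heap entry
        for v, w in digraph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def petal_ball(digraph, path_to_pr, t, r):
    """Sketch of the final step of create_petal: halve each edge on the
    t -> p_{r'} path, then return the ball of radius r/2 around t."""
    for u, v in zip(path_to_pr, path_to_pr[1:]):
        digraph[u][v] /= 2  # halve edges on the path, as in [1]
    dist = dijkstra(digraph, t)
    return {v for v, d in dist.items() if d <= r / 2}
```

Halving the path edges first is exactly what guarantees that the whole $t \to p_{r'}$ path lands inside the ball, even in the presence of length-0 edges.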

The function then returns the petal and the point $p_{r'}$, which is the center of the petal and plays a key role in the petal decomposition. In addition, we also choose to return the shortest path from $x_0$ to $t$, since this will come up in the petal_decomposition function described in the next subsection.

4.1.2 petal_decomposition

In our implementation of this function, we begin with some sanity checks that the center $x$ and target $t$ are both in the graph. While this may seem unnecessary, we will see in the next subsection that there are many recursive calls which involve modifying the graph. These assert and raise statements helped catch many errors earlier on in the implementation and ensured everything worked properly once those errors were corrected.

We begin by computing the radius of the graph, defined as the maximum over all vertices $v$ of the length of the shortest path from $x_0$ to $v$, found using Dijkstra. We then initialize the following lists with their defined purposes:

  • Ys: $Y_j$ is defined as the original graph minus the vertices and edges contained in petals $X_1, \ldots, X_j$. In particular, $Y_0$ is initialized to the entire graph before any petals have been removed.

  • Xs: $X_j$ is the petal found on iteration $j$ and is carved from $Y_{j-1}$, except for $X_0$, which is the remaining part of the graph after all petals have been carved (what the authors in [1] call the “stigma” of the flower).

  • xs: $x_j$ is the center of petal $X_j$, i.e., the point $p_{r'}$ when computing the petal $X_j$, except for $x_0$, which is the original input.

  • ts: $t_j$ is the target when creating petal $X_j$.

  • ys: $y_j$ is the neighbor of $x_j$ on the path from $x_0$ to $t_j$ which is closer to $x_0$.

The rest of the implementation follows the pseudocode. The first petal that is carved may be a special petal if the distance from $x_0$ to the target $t$ is greater than $\frac{5}{8}$ of the radius from $x_0$. Abraham and Neiman describe in [1] why they choose the constant $\frac{5}{8}$. The purpose of this petal is to preserve the shortest path from $x_0$ to $t$.

To carve all of the other petals, we iteratively check for points whose distance from $x_0$ is at least $\frac{3r}{4}$. If no such points exist, we have found all petals. Otherwise, we arbitrarily choose one of those points to be the target $t_j$ for petal $X_j$. In our implementation, we simply choose the first key in the dictionary whose keys are vertices and whose values are distances from the center $x_0$; this choice is arbitrary because the keys are only ordered as they were in the original input graph, which is itself arbitrary.
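The target selection just described amounts to the following sketch, where dist_from_x0 is assumed to be the dictionary of distances computed by Dijkstra:

```python
def pick_target(dist_from_x0, r):
    """Choose the next petal target: any vertex at distance >= 3r/4
    from x0.  We take the first such key in the dictionary, an arbitrary
    choice since the key order merely reflects the input graph."""
    for v, d in dist_from_x0.items():
        if d >= 3 * r / 4:
            return v
    return None  # no such points remain: all petals have been carved
```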

We then call create_petal on the remaining part of the original graph (denoted by the element most recently appended to the Ys list) along with our center $x_0$ and target $t_j$. For the radius, we use $r/8$. This choice comes from the pseudocode for petal_decomposition in [1], where the call to create_petal sets lo $= 0$ and hi $= r/8$. However, as we recall from the previous subsection, the faster implementation of create_petal does not have lo or hi parameters, but rather a single radius parameter. For this reason, it was not obvious whether to use $r/8$ or something like $r/16$. We decided to use $r/8$ because this parameter is a radius, and all points within this radius will have distance less than or equal to $r/8$ from the center; having a maximum possible value of $r/8$ (the hi parameter) versus a single value of $r/8$ therefore seemed more analogous. However, this is not a fully rigorous argument, and the testing described in Section 4.1.4 is what provides empirical validation for this choice.

Once we create the petal, we subtract it off from the remaining part of the graph. Using the path from x0x_{0} to tjt_{j}, we compute yjy_{j}, which will be used in the following subsection. We then halve the lengths of all of the edges on the shortest path to tjt_{j} (but beginning at xjx_{j} rather than x0x_{0}) returned by the create_petal function as given in the pseudocode.

Finally, we let X0X_{0} be the last YjY_{j}, namely the remaining part of the graph after all petals have been carved.

4.1.3 hierarchical-petal-decomposition

This is the third and final function necessary to generate a low-stretch spanning tree. For inputs, it takes in the graph, a starting point x0x_{0} and a target t.t. This function is recursive with a base case that the tree on a graph with one vertex is just the single vertex itself.

On each recursive call, we compute the petal decomposition of the graph. As explained in the previous subsection, this returns a set of petals XjX_{j}, their centers xjx_{j}, a set of neighbors yjy_{j} and a set of targets tjt_{j}. We then run hierarchical-petal-decomposition recursively on each petal XjX_{j} where the starting point is xjx_{j} and the target is tj.t_{j}. Each of these recursive calls generates a tree TjT_{j} for each petal Xj.X_{j}. These trees are all disjoint, so we create a new graph which combines all of these TjT_{j} through the edges (xj,yj).(x_{j},y_{j}). This gives us a final low-stretch spanning tree T.T.

Lastly, to compute the tree itself, we call hierarchical-petal-decomposition on the graph with arbitrary starting point x0x_{0} but critically also with target x0.x_{0}. The justification for this decision is explained in [1].

4.1.4 LSST Implementation Validation

The function hierarchical_petal_decomposition was the last function described in [1]. In order to test the LSST implementation, we generate many Erdős-Rényi random graphs. For each random graph, we pick a probability pp and let there be an edge between every pair of distinct vertices with probability pp (so p=1p=1 would correspond to a complete graph).
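A minimal generator for such test graphs might look as follows (function name ours); for LSST testing one would additionally resample or discard disconnected samples:

```python
import random

def erdos_renyi(n, p, seed=None):
    """Sample an Erdos-Renyi G(n, p) graph: include each edge between
    distinct vertices independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p]
```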

Then, for each graph, we compute an LSST using hierarchical_petal_decomposition, measuring its stretch and the duration to compute it. We repeat this for 10 trials and take the average for both the stretch and the duration. There are two primary things to verify: that the stretch is O(mlognloglogn)O(m\log n\log\log n) and that the time complexity is O(mlognloglogn).O(m\log n\log\log n). We also separately checked that the output was always a spanning tree (it reaches every vertex of the original graph and has n1n-1 edges). However, when timing the code, we did not include this extra check since it is not a necessary part of the algorithm once we are confident that the algorithm is running correctly.

To check the asymptotic nature of the stretch and duration to compute the LSST, for different values of m,nm,n we graph the stretch and duration vs mlognloglogn.m\log n\log\log n. We expect that there are some finite constants c1,c2c_{1},c_{2}\in\mathbb{R} such that the stretch with mm edges and nn vertices is bounded above by c1mlognloglognc_{1}m\log n\log\log n and likewise the duration is bounded above by c2mlognloglogn.c_{2}m\log n\log\log n. We check this graphically below.

Refer to caption
Figure 1: For different random graphs, we graph the average stretch vs mlognloglogn.m\log n\log\log n. Empirically, we can see that the stretch is bounded above by 0.045mlognloglogn0.045m\log n\log\log n and that the pattern of growth appears to suggest that this growth rate will continue. The purple line is the best fit line which minimizes squared L2 error.
Refer to caption
Figure 2: For different random graphs, we graph the average duration vs mlognloglogn.m\log n\log\log n. Empirically, we can see that the duration is bounded above by 5.7106mlognloglogn5.7\cdot 10^{-6}m\log n\log\log n and that the pattern of growth appears to suggest that this growth rate will continue. The purple line is the best fit line which minimizes squared L2 error.

In both cases, we can see that for sufficiently large values of mm and n,n, both the duration and stretch are bounded above by a linear constant multiplied by mlognloglogn.m\log n\log\log n. Of course, this does not definitively verify that the implementation is correct (for example, there could hypothetically be a logloglogn\log\log\log n factor that would not show up graphically in the experiments run above) but given that the authors in [1] prove their algorithm and stretch are both O(mlognloglogn)O(m\log n\log\log n) and the implementation followed their algorithm, it is reasonable to conclude that the graphs above provide empirical validation that our implementation works correctly.

4.2 Heavy Light Decomposition

In this subsection we discuss the code that implements the Heavy Light Decomposition of a tree. First, we reproduce a necessary definition from [2], which we have modified slightly to refer to general trees:

Definition 4.4.

([2]) Given a rooted tree TT and vertex u,u, define uTu^{\uparrow T} as the set of its ancestors in TT plus itself. We extend the notation to any subset of vertices by defining RT=uRuT.R^{\uparrow T}=\bigcup_{u\in R}u^{\uparrow T}.
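With the tree stored as a parent array (root's parent set to -1, a representation we assume for illustration), these sets can be computed directly:

```python
def ancestors_plus_self(parent, u):
    """Compute u^{up T}: u together with all of its ancestors in T."""
    result = set()
    while u != -1:
        result.add(u)
        u = parent[u]
    return result

def lifted_set(parent, R):
    """Compute R^{up T} as the union of u^{up T} over all u in R."""
    out = set()
    for u in R:
        out |= ancestors_plus_self(parent, u)
    return out
```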

Now we define the Heavy Light Decomposition as described in Lemma B.8 of [2] (which quotes [6]) which we reproduce below for convenience:

Lemma 4.5.

There is a linear-time algorithm that given a rooted tree TT with nn vertices outputs a collection of vertex disjoint tree paths {P1,,Pt}\{P_{1},\ldots,P_{t}\} (called heavy chains), such that the following hold for every vertex uu:

  1. 1.

    There is exactly one heavy chain PiP_{i} containing u.u.

  2. 2.

    If PiP_{i} is the heavy chain containing uu, at most one child of uu is in Pi.P_{i}.

  3. 3.

    There are at most O(logn)O(\log n) heavy chains that intersect with uT.u^{\uparrow T}.

In addition, edges that are not covered by any heavy chain are called light edges.

4.2.1 Implementation Details

The heavy_light_decomposition function takes in one input: the tree. Within this function, we define two inner functions: dfs_size and decompose_chain, which we describe below.

The dfs_size function takes in a vertex vv and its parent as inputs. The function then uses Depth First Search to recursively visit every vertex and compute the size of the subtree (number of vertices) rooted at that vertex. If the vertex is a leaf node, we say its subtree has size 1 (just itself). Otherwise, we initialize the size of the subtree to 1 and compute size(v)=1+wc(v)size(w)\operatorname{size}(v)=1+\sum_{w\in c(v)}\operatorname{size}(w) where c(v)c(v) is the set of all children of vv.

Next, we have the decompose_chain function. This function takes in a vertex vv, the head of the chain of vv and the parent of vv as inputs. In this function, we first identify the heavy child for that vertex, which corresponds to condition 2 in Lemma 4.5 and is the unique child of uu (if any) within PiP_{i}, the heavy chain containing uu. We find this heavy child by iterating over the children of uu and picking the child of uu which has the maximum subtree size.

The decompose_chain function then recursively calls itself as long as we keep finding new “heavy children.” For each recursive call, we take in the previous heavy_child as our new vertex v,v, the same chain_head as before and the vertex vv as the parent (since the heavy_child is a child of vertex vv).

Once the recursive calls are complete, we iterate over the children of vv, and for every child that isn’t the heavy child, we initialize a new chain and recursively call decompose_chain where our new vertex vv is the child we’re iterating over, the chain_head is the index of the current chain and the parent is our old v.v. By setting the code up in this way, we ensure that we create a chain for every vertex vv and its descendants and that we satisfy the requirements of Lemma 4.5.

With these two functions implemented, we can finish our heavy_light_decomposition implementation. We initialize variables for the number of vertices in the tree, the subtree sizes, and the chains. We then call dfs_size on the root vertex 0 and its “parent” (which is not a real vertex and which we denote 1-1). This allows us to figure out the sizes of all of the subtrees in the tree. Then, we call decompose_chain on our root vertex 0, the chain_head which we initialize to 0 and the parent 1-1. Finally, the heavy_light_decomposition function returns the list of chains.
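A condensed sketch of the whole routine (signatures simplified relative to the description above; the tree is given as a list of children lists with root 0):

```python
import sys

def heavy_light_decomposition(tree):
    """tree[i] is the list of children of vertex i; vertex 0 is the root.
    Returns a list of heavy chains (each a list of vertices)."""
    n = len(tree)
    size = [1] * n
    chains = []

    def dfs_size(v):
        # Subtree size = 1 + sizes of all children's subtrees.
        for w in tree[v]:
            dfs_size(w)
            size[v] += size[w]

    def decompose_chain(v, chain_index):
        chains[chain_index].append(v)
        if tree[v]:
            # The heavy child is the child with the largest subtree.
            heavy = max(tree[v], key=lambda w: size[w])
            decompose_chain(heavy, chain_index)   # extend the current chain
            for w in tree[v]:
                if w != heavy:
                    chains.append([])             # every light child starts a new chain
                    decompose_chain(w, len(chains) - 1)

    sys.setrecursionlimit(max(10000, 2 * n))      # deep random trees recurse far
    dfs_size(0)
    chains.append([])
    decompose_chain(0, 0)
    return chains
```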

4.2.2 Heavy-Light Decomposition Implementation Validation

In order to check our implementation, we check the three conditions of Lemma 4.5 empirically. We produce a function that generates random trees. These trees are implemented as a list of lists, with nodes numbered 0 to nn where 0 is the root. Further, the list at index ii consists of the children of node ii, and the list is empty if and only if node ii is a leaf node. Based on this format, such random trees can be generated as follows: for each vertex j=1,2,,nj=1,2,\ldots,n, randomly choose one of the vertices i=0,1,,j1i=0,1,\ldots,j-1 to be its parent by appending vertex jj to the list at index ii. This also has the convenient property that the parent’s number is always strictly less than the child’s number.
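The generator just described can be sketched as (function name ours):

```python
import random

def random_tree(n, seed=None):
    """Return a list-of-children representation of a random rooted tree
    on vertices 0..n, where 0 is the root: each new vertex attaches to a
    uniformly random earlier vertex, so every parent's number is strictly
    smaller than its child's."""
    rng = random.Random(seed)
    children = [[] for _ in range(n + 1)]
    for v in range(1, n + 1):
        parent = rng.randrange(v)
        children[parent].append(v)
    return children
```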

We then call our heavy light decomposition function on 1000 randomly generated trees. We iterate over all possible vertices for that tree and check that vertex uu is represented in one and only one heavy chain. We can then easily check the children of uu and ensure that at most one child of uu is in that same heavy chain.

For the third condition, we calculate uTu^{\uparrow T} for each uu and count the number of heavy chains that intersect with it. This is the trickiest condition to check because Lemma 4.5 only promises an O(logn)O(\log n) bound on the number of intersecting heavy chains. To check this, we consider increasing values of nn and graph the average number of intersecting heavy chains vs logn\log n, where the average is calculated by summing up the number of intersections across all vertices and then dividing by both the number of vertices and the number of randomly generated trees, in this case 1000. Below, we plot the average number of intersections vs logn\log n, and we see that a line with slope 1.15 dominates the growth. Thus, empirically it appears that the number of intersections is bounded by 1.15logn.1.15\log n. In addition, we also computed the quotient of the number of intersections divided by logn\log n, which gave us the following list: [1.276, 1.196, 1.144, 1.134, 1.126, 1.103, 1.092, 1.090, 1.084, 1.081, 1.079, 1.072]. We can see that this list is monotonically decreasing, which provides further assurance that the number of intersections is O(logn).O(\log n).

Refer to caption
Figure 3: The average number of intersections appears to be bounded by O(logn)O(\log n). The purple line is the best fit line.
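The per-vertex intersection count used for this check is simple to compute once uTu^{\uparrow T} is available (a sketch; here the tree is given as a parent array and the chains as lists of vertices):

```python
def count_intersecting_chains(parent, chains, u):
    """Count the heavy chains containing at least one vertex of u^{up T}."""
    up = set()
    while u != -1:           # climb to the root (whose parent is -1)
        up.add(u)
        u = parent[u]
    return sum(1 for chain in chains if up & set(chain))
```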

In addition, we also empirically check the time complexity for our implementation just like we did for low-stretch spanning trees. Lemma 4.5 states that the algorithm should run in O(n)O(n) time, which indeed appears to be the case below.

Refer to caption
Figure 4: The time to compute the heavy light decomposition increases linearly with the number of vertices, which matches our desired O(n)O(n) time complexity. The purple line is the best fit line.

5 Rebuilding Game

The authors in [2] create a rebuilding game in order to attain their promised near-linear time bound for max flow. As they describe at the beginning of Section 8, this rebuilding game is the key step in going from Theorem 7.1 to Theorem 6.2 (theorem numbers are with respect to [2]). Theorem 6.2 asserts the existence of the main overall data structure in [2] that is crucial for attaining the near-linear asymptotic time bound. The code for this part is included in the file titled rebuilding.ipynb. Although for our purposes we only need the rebuilding game under very specific conditions with specific parameter values, we implement it in full generality as it is described in Section 8 of [2].

5.1 Preliminaries

“The rebuilding game has several parameters: integer parameters size m>0m>0 and depth d>0d>0, update frequency 0<γg<10<\gamma_{g}<1, a rebuilding cost Cr1C_{r}\geq 1, a weight range K1K\geq 1, a recursive size reduction parameter k=m1/d2k=m^{1/d}\geq 2, and finally an integer round count T>0T>0” ([2]).

They define the rebuilding game in Definition 8.1 of [2] which we reproduce here for convenience:

Definition 5.1.

([2]) The rebuilding game is played between a player and an adversary and proceeds in rounds t=1,2,,T.t=1,2,\ldots,T. Additionally, the steps (moves) taken by the player are indexed as s=1,2,s=1,2,\ldots. Every step ss is associated with a tuple prev(s):=(prev0(s),,prevd(s))[T]d+1.\text{prev}^{(s)}:=(\text{prev}_{0}^{(s)},\ldots,\text{prev}_{d}^{(s)})\in[T]^{d+1}. Both the player and adversary know prev(s)\text{prev}^{(s)}. At the beginning of the game, at round t=1t=1 and step s=1s=1, we initially set previ(1)=1\text{prev}_{i}^{(1)}=1 for all levels i{0,1,,d}.i\in\{0,1,\ldots,d\}. At the beginning of each round t1t\geq 1,

  1. 1.

    The adversary first chooses a positive real weight W(t)W^{(t)} satisfying logW(t)(K,K).\log W^{(t)}\in(-K,K). This weight is hidden from the player.

  2. 2.

    Then, while either of the following conditions holds, i=0dWprevi(s)>2(d+1)W(t)\sum_{i=0}^{d}W^{\text{prev}_{i}^{(s)}}>2(d+1)W^{(t)} or for some level l,l, at least γgm/kl\gamma_{g}m/k^{l} rounds have passed since the last rebuild of level l,l, the adversary can (but does not have to) force the player to perform a fixing step. The player may also choose to perform a fixing step, regardless of whether the adversary forces it or not. In a fixing step, the player picks a level i{0,1,,d}i\in\{0,1,\ldots,d\} and we then set prevj(s+1)t\text{prev}_{j}^{(s+1)}\leftarrow t for j{i,i+1,,d},j\in\{i,i+1,\ldots,d\}, and prevj(s+1)prevj(s)\text{prev}_{j}^{(s+1)}\leftarrow\text{prev}_{j}^{(s)} for j{0,,i1}.j\in\{0,\ldots,i-1\}. We call this a fix at level ii and we say the levels jij\geq i have been rebuilt. This move costs Crm/kiC_{r}m/k^{i} time.

  3. 3.

    When the player is no longer performing fixing steps, the round finishes.

There are some aspects of the original explanation in [2] that were underspecified and are clarified here. The first is that the steps indexed by ss refer to fixing steps, and further that each step also has a level ii. For this reason, the step count is incremented each time a level is fixed (thus ss refers to the total number of fixing steps across all levels). Note that this increment of the step count is separate from the fixing counts in the fix array, which are incremented as well until they are reset.

In order to assess whether the adversary is forcing a fixing step, we check whether the condition i=0dWprevi(s)>2(d+1)W(t)\sum_{i=0}^{d}W^{\text{prev}_{i}^{(s)}}>2(d+1)W^{(t)} holds, as specified above in Definition 5.1.

5.2 Implementation and Validation

We implement this rebuilding game through two functions. We first implement a fix function which the main algorithm will call to perform a fixing step. This function is fairly routine and simply takes in a level ii, an index jj, the step index ss, the round tt and an array called prevs. The fix function returns the newly updated prevs array after the fix is complete.
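Because a fix at level ii rebuilds every level jij\geq i, the core of the fix function reduces to a single slice-and-fill (a sketch with a simplified signature; our actual implementation also threads through the step and round bookkeeping described above):

```python
def fix(prevs, level, t):
    """Fix at `level` in round `t` (Definition 5.1): levels level..d are
    rebuilt (set to t) while the lower levels are carried over unchanged.

    prevs is the row (prev_0, ..., prev_d) for the current step; the
    function returns the row for the next step."""
    return prevs[:level] + [t] * (len(prevs) - level)
```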

Next, we have our main rebuilding function. In addition to the parameters m,d,γg,Cr,K,Tm,d,\gamma_{g},C_{r},K,T listed above, we also take in the parameter w,w, which represents a 2d array where the ttht^{th} row of ww is w(t)w^{(t)} and we initialize W(t)=w(t)1W^{(t)}=||w^{(t)}||_{1} as explained in Remark 8.2 of [2]. We also initialize 1D arrays to keep track of the fix and round counts and a 2D prevs array. The prevs array has d+1d+1 columns (one for each of the levels 0,1,,d0,1,\ldots,d) and initially one row of all ones as outlined in Definition 5.1 above.

We provide empirical validation that the algorithm runs in O(CrKdγg(m+T)).O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right). For setting the values of each of these variables, we refer to Section 8.2 of [2], which gives asymptotic values for the variables. We let Cr=exp(log7/8mloglogm),γg=1Cr=exp(log7/8mloglogm),Q=mCr=mexp(log7/8mloglogm),T=(m+Q)CrC_{r}=\exp(\log^{7/8}m\log\log m),\gamma_{g}=\frac{1}{C_{r}}=\exp(-\log^{7/8}m\log\log m),Q=mC_{r}=m\exp(\log^{7/8}m\log\log m),T=\lfloor(m+Q)C_{r}\rfloor and d=5logm8.d=\lfloor 5\cdot\sqrt[8]{\log m}\rfloor. While section 8.2 of [2] states that d=logm8d=\sqrt[8]{\log m} as opposed to O(logm8),O(\sqrt[8]{\log m}), we note that scaling dd by a constant has no bearing on the O(CrKdγg(m+T))O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right) runtime, and the authors also state d=O(logm8)d=O(\sqrt[8]{\log m}) many other times in [2]. The reason for the constant factor of 55 in this implementation is because otherwise we would always have d=1d=1 unless mm was extremely large, and there is no reason to always set d=1.d=1. The exact value of 55 is arbitrary and can be replaced by any other positive integer.
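Written out in code, the parameter choices above are (a sketch mirroring our experimental setup; the function name is ours):

```python
import math

def rebuilding_parameters(m):
    """Parameter settings used in our rebuilding-game experiments."""
    lg = math.log(m)
    c_r = math.exp(lg ** (7 / 8) * math.log(lg))   # C_r = exp(log^{7/8} m loglog m)
    gamma_g = 1.0 / c_r                            # update frequency gamma_g = 1/C_r
    q = m * c_r                                    # Q = m * C_r
    t_rounds = int((m + q) * c_r)                  # T = floor((m + Q) * C_r)
    d = int(5 * lg ** (1 / 8))                     # depth, with the constant factor of 5
    return c_r, gamma_g, t_rounds, d
```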

In addition, we also let WW be a vector with TT entries where the W[i]W[i] are i.i.d Uniform(1, 1000). Here, the number 1000 is arbitrary. Lastly, we set K=logmaxiW[i]K=\lceil\log\max_{i}W[i]\rceil in order to ensure that logW[i](K,K)\log W[i]\in(-K,K) as required by [2].

With these parameters, we run experiments with m=8,9,10,,64.m=8,9,10,\ldots,64. We plot the durations for each of these values of mm vs CrKdγg(m+T).\frac{C_{r}Kd}{\gamma_{g}}(m+T). As promised by [2], we do empirically observe that the runtime increases linearly with CrKdγg(m+T)\frac{C_{r}Kd}{\gamma_{g}}(m+T) and is therefore bounded by a constant times CrKdγg(m+T),\frac{C_{r}Kd}{\gamma_{g}}(m+T), where this constant is the slope of the purple line in the graph below, or 2.021011.2.02\cdot 10^{-11}. This number comes from calculating the slope between the last point and one of the preceding points and increasing the y-intercept slightly, as the graph appears to be progressing linearly once the xx-axis value exceeds 4.4.

Refer to caption
Figure 5: The runtime of the rebuilding game appears to be O(CrKdγg(m+T))O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right) as stated in [2]. The purple line is the best fit line.

6 Interior Point Method

In Section 4 of [2], the authors present an interior point method that solves the min-cost flow problem.

6.1 IPM Preliminaries

We begin by defining the key variables that will be used throughout this section and the remainder of the paper as they are defined in [2]. In the subsequent text, recall that 𝐁\mathbf{B} was defined in Notation 2.3.

Definition 6.1.

For our min-cost flow problem, we let 𝐝V\mathbf{d}\in\mathbb{Z}^{V} represent the demands. We also let the lower and upper capacities be denoted by 𝐮,𝐮+E\mathbf{u}^{-},\mathbf{u}^{+}\in\mathbb{Z}^{E} where all integers are bounded by the variable UU. Finally, we let 𝐜E\mathbf{c}\in\mathbb{Z}^{E} denote the costs.

Definition 6.2.

We define α=def11000logmU.\alpha\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{1}{1000\log mU}. We will use α\alpha to create a barrier xαx^{-\alpha} when solving the L1 IPM.

Definition 6.3.

We define:

𝐟=defargmin𝐁𝐟=𝐝𝐮e𝐟e𝐮e+ for all eE𝐜𝐟.\mathbf{f}^{*}\stackrel{{\scriptstyle\text{def}}}{{=}}\operatorname*{argmin}_{\begin{subarray}{c}\mathbf{B}^{\top}\mathbf{f}=\mathbf{d}\\ \mathbf{u}_{e}^{-}\leq\mathbf{f}_{e}\leq\mathbf{u}_{e}^{+}\ \text{for all}\ e\in E\end{subarray}}\mathbf{c}^{\top}\mathbf{f}.
Definition 6.4.

We let Φ(𝐟)=def20mlog(𝐜𝐟F)+eE((𝐮e+𝐟e)α+(𝐟e𝐮e)α),\Phi(\mathbf{f})\stackrel{{\scriptstyle\text{def}}}{{=}}20m\log(\mathbf{c}^{\top}\mathbf{f}-F^{*})+\sum_{e\in E}((\mathbf{u}_{e}^{+}-\mathbf{f}_{e})^{-\alpha}+(\mathbf{f}_{e}-\mathbf{u}_{e}^{-})^{-\alpha}), where F=def𝐜𝐟F^{*}\stackrel{{\scriptstyle\text{def}}}{{=}}\mathbf{c}^{\top}\mathbf{f}^{*} is the optimal objective value.

Definition 6.5.

([2]) Given a flow 𝐟E\mathbf{f}\in\mathbb{R}^{E} we define lengths E\ell\in\mathbb{R}^{E} as

(𝐟)e=def(𝐮e+𝐟e)1α+(𝐟e𝐮e)1α\mathbf{\ell}(\mathbf{f})_{e}\stackrel{{\scriptstyle\text{def}}}{{=}}(\mathbf{u}_{e}^{+}-\mathbf{f}_{e})^{-1-\alpha}+(\mathbf{f}_{e}-\mathbf{u}_{e}^{-})^{-1-\alpha}

and gradients 𝐠E\mathbf{g}\in\mathbb{R}^{E} as 𝐠(𝐟)=defΦ(𝐟).\mathbf{g}(\mathbf{f})\stackrel{{\scriptstyle\text{def}}}{{=}}\nabla\Phi(\mathbf{f}). More explicitly,

𝐠(𝐟)e=def[Φ(𝐟)]e=20m(𝐜𝐟F)1𝐜e+α(𝐮e+𝐟e)1αα(𝐟e𝐮e)1α.\mathbf{g}(\mathbf{f})_{e}\stackrel{{\scriptstyle\text{def}}}{{=}}[\nabla\Phi(\mathbf{f})]_{e}=20m(\mathbf{c}^{\top}\mathbf{f}-F^{*})^{-1}\mathbf{c}_{e}+\alpha(\mathbf{u}_{e}^{+}-\mathbf{f}_{e})^{-1-\alpha}-\alpha(\mathbf{f}_{e}-\mathbf{u}_{e}^{-})^{-1-\alpha}.
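These formulas translate directly into code, and the analytic gradient can be sanity-checked against a finite-difference gradient of Φ\Phi (a sketch; the function names are ours and FF^{*} is replaced by a lower bound):

```python
import math

def phi(f, c, u_lo, u_hi, f_star, alpha, m):
    """Potential Phi(f) from Definition 6.4 (f_star may be any lower
    bound on the optimal cost c^T f*)."""
    val = 20 * m * math.log(sum(ce * fe for ce, fe in zip(c, f)) - f_star)
    for fe, lo, hi in zip(f, u_lo, u_hi):
        val += (hi - fe) ** (-alpha) + (fe - lo) ** (-alpha)
    return val

def grad_phi(f, c, u_lo, u_hi, f_star, alpha, m):
    """Analytic gradient g(f) = grad Phi(f) from Definition 6.5."""
    slack = sum(ce * fe for ce, fe in zip(c, f)) - f_star
    return [20 * m * ce / slack
            + alpha * (hi - fe) ** (-1 - alpha)
            - alpha * (fe - lo) ** (-1 - alpha)
            for fe, ce, lo, hi in zip(f, c, u_lo, u_hi)]
```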

We now reproduce Theorem 4.3 of [2], which is their main result regarding IPMs for solving the min-cost flow problem (f(t)f^{(t)} is the flow after iteration tt, while the optimal flow ff^{*} is unknown):

Theorem 6.6.

([2]) Suppose we are given a min-cost flow instance given by Equation (1). Let 𝐟\mathbf{f}^{*} denote an optimal solution to the instance.

For all κ(0,1),\kappa\in(0,1), there is a potential reduction interior point method for this problem, that, given an initial flow 𝐟(0)E\mathbf{f}^{(0)}\in\mathbb{R}^{E} such that Φ(𝐟(0))200mlogmU,\Phi(\mathbf{f}^{(0)})\leq 200m\log mU, the algorithm proceeds as follows:

The algorithm runs for O~(mκ2)\tilde{O}(m\kappa^{2}) iterations. At each iteration, let 𝐠(𝐟(t))E\mathbf{g}(\mathbf{f}^{(t)})\in\mathbb{R}^{E} denote the gradient and (𝐟(t))>0E\mathbf{\ell}(\mathbf{f}^{(t)})\in\mathbb{R}^{E}_{>0} denote the lengths given by Definition 6.5. Let 𝐠~E\mathbf{\tilde{g}}\in\mathbb{R}^{E} and ~>0E\mathbf{\tilde{\ell}}\in\mathbb{R}_{>0}^{E} be any vectors such that 𝐋(𝐟(t))1(𝐠~𝐠(𝐟(t)))κ/8||\mathbf{L}(\mathbf{f}^{(t)})^{-1}(\mathbf{\tilde{g}}-\mathbf{g}(\mathbf{f}^{(t)}))||_{\infty}\leq\kappa/8 and ~2(𝐟(t)).\mathbf{\tilde{\ell}}\approx_{2}\mathbf{\ell}(\mathbf{f}^{(t)}).

  1. 1.

    At each iteration, the hidden circulation 𝐟𝐟(t)\mathbf{f}^{*}-\mathbf{f}^{(t)} satisfies

    𝐠(𝐟𝐟(t))100m+𝐋~(𝐟𝐟(t))1α/4.\frac{\mathbf{g}^{\top}(\mathbf{f}^{*}-\mathbf{f}^{(t)})}{100m+||\tilde{\mathbf{L}}(\mathbf{f}^{*}-\mathbf{f}^{(t)})||_{1}}\leq-\alpha/4.
  2. 2.

    At each iteration, given any Δ\Delta satisfying 𝐁Δ=0\mathbf{B}^{\top}\Delta=0 and 𝐠~Δ/𝐋~Δ1κ,\tilde{\mathbf{g}}^{\top}\Delta/||\tilde{\mathbf{L}}\Delta||_{1}\leq-\kappa, it updates 𝐟(t+1)𝐟(t)+ηΔ\mathbf{f}^{(t+1)}\leftarrow\mathbf{f}^{(t)}+\eta\Delta for ηκ2/(50|𝐠~Δ|).\eta\leftarrow\kappa^{2}/(50\cdot|\tilde{\mathbf{g}}^{\top}\Delta|).

  3. 3.

    At the end of O~(mκ2)\tilde{O}(m\kappa^{2}) iterations, we have 𝐜𝐟(t)𝐜𝐟+(mU)10.\mathbf{c}^{\top}\mathbf{f}^{(t)}\leq\mathbf{c}^{\top}\mathbf{f}^{*}+(mU)^{-10}.

6.2 IPM Implementation

All code described in this section can be found in the file ipm.ipynb.

In [2], the authors give a fairly detailed implementation that relies on specific bounds on the gradient, number of iterations etc. In order to implement that version of an interior point method, we would have to implement an oracle for a specific step described in Section 4 of [2], which is nontrivial to implement.

In this section we merely provide an approximation/heuristic for solving the interior point method which relies on scipy.optimize. We define the objective function as given in Definition 6.4 and feed the constraints as well. The minimize function in scipy.optimize then numerically solves for the optima of the potential function.

We empirically check the runtime of this simple implementation and compare it to the O~(mκ2)\tilde{O}(m\kappa^{2}) runtime promised above by Theorem 6.6, where we let κ2=exp(log7/8mloglogm)\kappa^{2}=\exp(-\log^{7/8}m\log\log m) as defined later on in [2] (they say κ=exp(O(log7/8mloglogm)),\kappa=\exp(-O(\log^{7/8}m\log\log m)), but an extra constant factor doesn’t change the following analysis).

[Uncaptioned image]

We notice that the graph is decreasing, which is surprising. Empirically checking the values of mlogmexp(log7/8mloglogm)m\log m\exp(-\log^{7/8}m\log\log m), we find they are decreasing over the values from m=4m=4 to m=100.m=100. In order to inspect this, we take the log of this expression, where we see:

limmlogm+loglogmlog7/8mloglogm=limmlogm(1+loglogmlogmloglogmlogm8)=.\lim_{m\to\infty}\log m+\log\log m-\log^{7/8}m\log\log m=\lim_{m\to\infty}\log m\left(1+\frac{\log\log m}{\log m}-\frac{\log\log m}{\sqrt[8]{\log m}}\right)=\infty.

Specifically, in the final expression above, we know that logm\log m will go to \infty while the expression in the parentheses will go to 11. Thus in the limit the log goes to infinity, which means the original expression goes to infinity as well. Yet the empirically decreasing nature of the function in the graph above raises a question: when does this function start increasing?

To find this, we take a derivative of the logged expression. We first let x=logmx=\log m since mm only appears as logm.\log m. Then, we have:

ddx(x+logxx7/8logx)=1x+17logx8x81x8.\frac{d}{dx}(x+\log x-x^{7/8}\log x)=\frac{1}{x}+1-\frac{7\log x}{8\sqrt[8]{x}}-\frac{1}{\sqrt[8]{x}}.

We want to find out when the derivative is positive. If we take xx\to\infty, we see that the derivative approaches 1,1, but if we plug in a number like x=1000x=1000 (which, we keep in mind, corresponds to m=e1000m=e^{1000}) we get 1.97.-1.97. This function is continuous, so by the Intermediate Value Theorem, we know that it will cross 0 at some point. We will find out where the derivative equals 0 (which corresponds to a local optimum) and then argue that it increases after this point.

Setting our above expression equal to 0, we get:

1x+1=7logx+88x87logx+8=8x1/8+8x7/8.\frac{1}{x}+1=\frac{7\log x+8}{8\sqrt[8]{x}}\implies 7\log x+8=8x^{1/8}+\frac{8}{x^{7/8}}.

If we plug this expression into Wolfram Alpha, we get x7.631010x\approx 7.63\cdot 10^{10} (all numerical calculations below are done using the more precise value of xx returned by the solver). Now, we want to confirm that the derivative is positive whenever x>7.631010.x>7.63\cdot 10^{10}. We have:
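This root can be double-checked numerically by bisecting the derivative computed above (a short sketch):

```python
import math

def deriv(x):
    # d/dx of (x + log x - x^(7/8) * log x), as computed above.
    return 1 / x + 1 - 7 * math.log(x) / (8 * x ** 0.125) - x ** -0.125

lo, hi = 1e10, 1e12   # deriv(lo) < 0 < deriv(hi), so a sign change lies between
for _ in range(200):
    mid = (lo + hi) / 2
    if deriv(mid) < 0:
        lo = mid
    else:
        hi = mid
root = (lo + hi) / 2  # approximately 7.63e10
```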

1x+17logx8x81x8>08x7/8+8x1/8>7logx+8.\frac{1}{x}+1-\frac{7\log x}{8\sqrt[8]{x}}-\frac{1}{\sqrt[8]{x}}>0\iff\frac{8}{x^{7/8}}+8x^{1/8}>7\log x+8.

The derivative of the left hand side is 7x15/8+x7/8\frac{-7}{x^{15/8}}+x^{-7/8} while the derivative of the right hand side is 7x.\frac{7}{x}. Then, we see:

1x7/87x15/8>7xx1/87x7/8>7.\frac{1}{x^{7/8}}-\frac{7}{x^{15/8}}>\frac{7}{x}\iff x^{1/8}-\frac{7}{x^{7/8}}>7.

From Wolfram Alpha, we see this is true whenever x>5764860x>5764860 (which makes sense as 78=57648017^{8}=5764801) and 7.631010>5764860.7.63\cdot 10^{10}>5764860. This means when x=7.631010,x=7.63\cdot 10^{10}, 8x7/8+8x1/8\frac{8}{x^{7/8}}+8x^{1/8} is increasing faster than 7logx+8,7\log x+8, so the first derivative will be positive for x>7.631010x>7.63\cdot 10^{10} and x=7.631010x=7.63\cdot 10^{10} is a local minimum of the logged expression.

Returning to our original expression, we recall that x=logm,x=\log m, so our original function is increasing when m>exp(7.631010)1033142793985103.311010.m>\exp(7.63\cdot 10^{10})\approx 10^{33142793985}\approx 10^{3.31\cdot 10^{10}}. When m=103.311010,m=10^{3.31\cdot 10^{10}}, we have logm=7.631010,loglogm=25.06,logm8=22.93.\log m=7.63\cdot 10^{10},\log\log m=25.06,\sqrt[8]{\log m}=22.93. Plugging these values in, at this local minimum, we have:

mlogmexp(log7/8mloglogm)=m1+loglogmlogmloglogmlogm8=103.311010(1+25.067.63101025.0622.93).m\log m\exp(-\log^{7/8}m\log\log m)=m^{1+\frac{\log\log m}{\log m}-\frac{\log\log m}{\sqrt[8]{\log m}}}=10^{3.31\cdot 10^{10}\left(1+\frac{25.06}{7.63\cdot 10^{10}}-\frac{25.06}{22.93}\right)}.

As stated previously, all calculations are done using the precise values of each numerical quantity rather than the rounded values we use above.

Simplifying, this comes out to 101.07101010^{-1.07\cdot 10^{10}} which is of course extremely, extremely close to 0. Thus, even though our function eventually tends to infinity, it first approaches 0. Further, given that m=103.311010m=10^{3.31\cdot 10^{10}} is many orders of magnitude larger than the number of atoms in the known universe (approximately 108210^{82}), it is not possible to empirically check whether our scipy implementation (or even a fully correct implementation) matches the asymptotic growth of mlogmexp(log7/8mloglogm).m\log m\exp(-\log^{7/8}m\log\log m).

For this reason, it does not make sense practically to consider the runtime of the IPM relative to its asymptotic guarantee. For any future implementations of the IPM algorithm described by Theorem 6.6, the focus should be on the other requirements of Theorem 6.6. With that said, it’s hard to test the other requirements either. While we could calculate ff^{*} via another algorithm to try to test our IPM implementation for accuracy, the scipy.optimize method that we are using only returns the final converged value rather than the successive steps, which makes requirements 1 and 2 of Theorem 6.6 infeasible to check. Even if we had the successive step values, it is unlikely that this heuristic would exactly match those requirements. For requirement 3, we note that the (mU)10(mU)^{-10} bound only holds after O~(mκ2)\tilde{O}(m\kappa^{-2}) iterations, which could be extremely large in practice; this suggests that the scipy implementation may run into numerical issues or may not converge in time, which makes it unreliable for testing. Thus, while the scipy implementation provides a heuristic for solving the IPM problem, it is not complete.

7 Overall Time Complexity

The authors in [2] state that their max flow algorithm is m1+o(1)logUlogCm^{1+o(1)}\log U\log C time or what they sometimes call O~(m)\tilde{O}(m) time. The goal of this section is to more precisely calculate the mo(1)m^{o(1)} term.

For a graph with mm edges and nn vertices to be connected, we must have at least n1n-1 edges. Therefore, n1mn-1\leq m which implies that n=O(m).n=O(m). This allows us to bound nn by mm in the asymptotic calculations that follow.

We refer to Algorithm 7 of [2], which provides the algorithm for their main theorem. We then refer to each component of it and calculate its runtime based on what the authors in [2] say. In cases when they simply say mo(1),m^{o(1)}, we take a deeper dive as needed to give a more precise asymptotic time complexity.

7.1 Preliminaries

We begin by reproducing several Lemmas and Theorems from [2] which we will refer to throughout the subsequent subsections:

Definition 7.1.

([2]) Consider a dynamic graph G(t)G^{(t)} undergoing batches of updates U(1),,U(t),U^{(1)},\ldots,U^{(t)},\ldots consisting of edge insertions/deletions and vertex splits. We say the sequences 𝐠(t),(t),\mathbf{g}^{(t)},\mathbf{\ell}^{(t)}, and U(t)U^{(t)} satisfy the hidden stable-flow chasing property if there are hidden dynamic circulations 𝐜(t)\mathbf{c}^{(t)} and hidden dynamic upper bounds 𝐰(t)\mathbf{w}^{(t)} such that the following holds at all stages tt:

  1. 1.

    𝐜(t)\mathbf{c}^{(t)} is a circulation: 𝐁G(t)𝐜(t)=0.\mathbf{B}^{\top}_{G^{(t)}}\mathbf{c}^{(t)}=0.

  2. 2.

    𝐰(t)\mathbf{w}^{(t)} upper bounds the length of 𝐜(t)\mathbf{c}^{(t)}: |e(t)𝐜e(t)|𝐰e(t)|\mathbf{\ell}_{e}^{(t)}\mathbf{c}_{e}^{(t)}|\leq\mathbf{w}_{e}^{(t)} for all eE(G(t)).e\in E(G^{(t)}).

  3. 3.

    For any edge ee in the current graph G(t)G^{(t)}, and any stage tt,t^{\prime}\leq t, if the edge ee was already present in G(t),G^{(t^{\prime})}, i.e. eG(t)\s=t+1tU(s)e\in G^{(t)}\backslash\bigcup_{s=t^{\prime}+1}^{t}U^{(s)} then 𝐰e(t)2𝐰e(t).\mathbf{w}_{e}^{(t)}\leq 2\mathbf{w}_{e}^{(t^{\prime})}.

  4. 4.

    Each entry of 𝐰(t)\mathbf{w}^{(t)} and (t)\mathbf{\ell}^{(t)} is quasipolynomially lower and upper-bounded: log𝐰e(t)[logO(1)m,logO(1)m]\log\mathbf{w}_{e}^{(t)}\in[-\log^{O(1)}m,\log^{O(1)}m] and loge(t)[logO(1)m,logO(1)m]\log\mathbf{\ell}_{e}^{(t)}\in[-\log^{O(1)}m,\log^{O(1)}m] for all eE(G(t)).e\in E(G^{(t)}).

Definition 7.2.

([2]) Consider a tree TT and a rooted spanning forest FF with E(F)E(T)E(F)\subseteq E(T) on a graph GG equipped with stretch overestimates str~e\widetilde{str}_{e} satisfying the guarantees of Lemma 4.1. We define the core graph 𝒞(G,F)\mathcal{C}(G,F) as a graph with the same edge and vertex set as G/F.G/F. For e=(u,v)E(G)e=(u,v)\in E(G) with image e^E(G/F)\hat{e}\in E(G/F) we define its length as e^𝒞(G,F)=defstr~ee\mathbf{\ell}_{\hat{e}}^{\mathcal{C}(G,F)}\stackrel{{\scriptstyle\text{def}}}{{=}}\widetilde{str}_{e}\mathbf{\ell}_{e} and gradient as 𝐠e^𝒞(G,F)=def𝐠e+𝐠,𝐩(T[u,v]).\mathbf{g}_{\hat{e}}^{\mathcal{C}(G,F)}\stackrel{{\scriptstyle\text{def}}}{{=}}\mathbf{g}_{e}+\langle\mathbf{g},\mathbf{p}(T[u,v])\rangle.

Definition 7.3.

([2]) Given a graph G, forest F, and parameter k, define a (\gamma_{s},\gamma_{c},\gamma_{\ell})-sparsified core graph with embedding as a subgraph \mathcal{S}(G,F)\subseteq\mathcal{C}(G,F) and embedding \Pi_{\mathcal{C}(G,F)\to\mathcal{S}(G,F)} satisfying

  1.

    For any \hat{e}\in E(\mathcal{C}(G,F)), all edges \hat{e}^{\prime}\in\Pi_{\mathcal{C}(G,F)\to\mathcal{S}(G,F)}(\hat{e}) satisfy \mathbf{\ell}_{\hat{e}}^{\mathcal{C}(G,F)}\approx_{2}\mathbf{\ell}_{\hat{e}^{\prime}}^{\mathcal{C}(G,F)}.

  2.

    \operatorname{length}(\Pi_{\mathcal{C}(G,F)\to\mathcal{S}(G,F)})\leq\gamma_{\ell} and \operatorname{econg}(\Pi_{\mathcal{C}(G,F)\to\mathcal{S}(G,F)})\leq k\gamma_{c}.

  3.

    \mathcal{S}(G,F) has at most m\gamma_{s}/k edges.

  4.

    The lengths and gradients of edges in \mathcal{S}(G,F) are the same as in \mathcal{C}(G,F) (Definition 7.2).
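The two quantities in condition 2 can be computed directly when the embedding is given explicitly as a map from core-graph edges to paths in \mathcal{S}(G,F). The following sketch (ours, with a made-up toy embedding) takes length to be the maximum number of edges on any embedding path and econg to be the maximum number of paths through any single edge:

```python
# Sketch of the length and edge-congestion of an embedding, as bounded in
# condition 2 of Definition 7.3.

def embedding_stats(embedding):
    """Return (length, econg): the max path length (in edges) and the max
    number of embedding paths using any single edge of S(G, F)."""
    length = max(len(path) for path in embedding.values())
    usage = {}
    for path in embedding.values():
        for edge in path:
            usage[edge] = usage.get(edge, 0) + 1
    return length, max(usage.values())

# Three core edges embedded into paths over sparsified edges 'a'..'d'.
emb = {1: ['a', 'b'], 2: ['b', 'c', 'd'], 3: ['d']}
print(embedding_stats(emb))  # (3, 2): longest path has 3 edges; 'b' and 'd' are each used twice
```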

Definition 7.4.

([2]) For a graph G, parameter k, and branching factor B, a B-branching tree-chain consists of collections of graphs \{\mathcal{G}_{i}\}_{0\leq i\leq d} such that \mathcal{G}_{0}\stackrel{\text{def}}{=}\{G\}, and we define \mathcal{G}_{i} inductively as follows:

  1.

    For each G\in\mathcal{G}_{i}, i<d, we have a collection of B trees \mathcal{T}^{G}=\{T_{1},T_{2},\ldots,T_{B}\} and a collection of B forests \mathcal{F}^{G}=\{F_{1},F_{2},\ldots,F_{B}\} such that each E(F_{j})\subseteq E(T_{j}) satisfies the conditions of Lemma 4.1.

  2.

    For each G_{i}\in\mathcal{G}_{i} and F\in\mathcal{F}^{G_{i}}, we maintain (\gamma_{s},\gamma_{c},\gamma_{\ell})-sparsified core graphs and embeddings \mathcal{S}(G_{i},F) and \Pi_{\mathcal{C}(G_{i},F)\to\mathcal{S}(G_{i},F)}.

  3.

    We let \mathcal{G}_{i+1}\stackrel{\text{def}}{=}\{\mathcal{S}(G_{i},F):G_{i}\in\mathcal{G}_{i},F\in\mathcal{F}^{G_{i}}\}.

Finally, for each G_{d}\in\mathcal{G}_{d}, we maintain a low-stretch tree F_{d}.

We let a tree chain be a single sequence of graphs G_{0},G_{1},\ldots,G_{d} such that G_{i+1} is the (\gamma_{s},\gamma_{c},\gamma_{\ell})-sparsified core graph \mathcal{S}(G_{i},F_{i}) with embedding \Pi_{\mathcal{C}(G_{i},F_{i})\to\mathcal{S}(G_{i},F_{i})} for some F_{i}\in\mathcal{F}^{G_{i}} for 0\leq i<d, together with a low-stretch tree F_{d} on G_{d}.
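Condition 3 of Definition 7.3 bounds each sparsified core graph by m\gamma_{s}/k edges, so the graphs shrink geometrically from level to level while the number of graphs multiplies by B. A small numerical sketch (ours, with arbitrary illustrative parameter values; the true parameters in [2] depend on m) shows this tradeoff:

```python
# Rough illustration of how a B-branching tree-chain shrinks: level i has
# B^i graphs, each with at most m * (gamma_s / k)^i edges (Definition 7.3,
# condition 3), and the final collection contains B^d trees.

def level_sizes(m, k, gamma_s, B, d):
    sizes = []
    for i in range(d + 1):
        graphs = B ** i                           # |G_i|
        edges_per_graph = m * (gamma_s / k) ** i  # per-graph edge bound
        sizes.append((graphs, edges_per_graph))
    return sizes

# Hypothetical parameters, chosen only to make the shrinkage visible.
for graphs, edges in level_sizes(m=10**6, k=100, gamma_s=2, B=3, d=3):
    print(graphs, round(edges))
```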

Definition 7.5.

([2]) Given a graph G and tree-chain G_{0},G_{1},\ldots,G_{d} where G_{0}=G, define the corresponding spanning tree T^{G_{0},G_{1},\ldots,G_{d}}\stackrel{\text{def}}{=}\bigcup_{i=0}^{d}F_{i} of G as the union of the preimages of the edges of each F_{i} in G=G_{0}.

Define the set of trees corresponding to a branching tree-chain of a graph G as the union of T^{G_{0},G_{1},\ldots,G_{d}} over all tree chains G_{0},G_{1},\ldots,G_{d} with G_{0}=G:

\mathcal{T}^{G}\stackrel{\text{def}}{=}\{T^{G_{0},G_{1},\ldots,G_{d}}:G_{0},G_{1},\ldots,G_{d}\text{ s.t. }G_{i+1}=\mathcal{S}(G_{i},F_{i})\text{ for all }0\leq i<d\}
Definition 7.6.

([2]) Given a dynamic graph G^{(t)} with updates indexed by times t=0,1,\ldots and a corresponding dynamic branching tree-chain (Definition 7.4), we say that the nonnegative integers \text{prev}_{0}^{(t)}\leq\text{prev}_{1}^{(t)}\leq\cdots\leq\text{prev}_{d}^{(t)} are previous rebuild times if \text{prev}_{i}^{(t)} was the most recent time at or before t at which \mathcal{G}_{i} was rebuilt, i.e., for each G\in\mathcal{G}_{i}, the set of trees \mathcal{T}^{G} was reinitialized and sampled.

Lemma 7.7.

([2]) There is a deterministic strategy, given by Algorithm 6, for the player to finish T rounds of the rebuilding game in time O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right).

Theorem 7.8.

([2]) Let G=(V,E) be a dynamic graph undergoing \tau batches of updates U^{(1)},\ldots,U^{(\tau)} containing only edge insertions/deletions, with edge gradients \mathbf{g}^{(t)} and lengths \mathbf{\ell}^{(t)}, such that the update sequence satisfies the hidden stable-flow chasing property (Definition 7.1) with hidden dynamic circulation \mathbf{c}^{(t)} and width \mathbf{w}^{(t)}. There is an algorithm on G that maintains an O(\log n)-branching tree chain corresponding to s=O(\log n)^{d} trees T_{1},T_{2},\ldots,T_{s} (Definition 7.5), and at stage t outputs a circulation \Delta represented by \exp(O(\log^{7/8}m\log\log m)) off-tree edges and paths on some T_{i}, i\in[s].

The output circulation \Delta satisfies \mathbf{B}^{\top}\Delta=0 and, for some \kappa=\exp(-O(\log^{7/8}m\log\log m)),

\frac{\langle\mathbf{g}^{(t)},\Delta\rangle}{||\mathbf{\ell}^{(t)}\circ\Delta||_{1}}\leq\kappa\frac{\langle\mathbf{g}^{(t)},\mathbf{c}^{(t)}\rangle}{\sum_{i=0}^{d}||\mathbf{w}^{(\text{prev}_{i}^{(t)})}||_{1}},

where \text{prev}_{i}^{(t)}, i\in[d], are the previous rebuild times (Definition 7.6) for the branching tree chain.

The algorithm succeeds w.h.p. with total runtime (m+Q)m^{o(1)} for Q\stackrel{\text{def}}{=}\sum_{t=1}^{\tau}|U^{(t)}|\leq\text{poly}(n). Also, levels i,i+1,\ldots,d of the branching tree chain can be rebuilt at any point in m^{1+o(1)}/k^{i} time.

Theorem 7.9.

([2]) There is a data structure that, on a dynamic graph G^{(t)}, maintains a collection of s=O(\log n)^{d} spanning trees T_{1},T_{2},\ldots,T_{s}\subseteq G^{(t)} for d=O(\log^{1/8}m), and supports the following operations:

  • UPDATE(U^{(t)},\mathbf{g}^{(t)},\mathbf{\ell}^{(t)}): Update the gradients and lengths to \mathbf{g}^{(t)} and \mathbf{\ell}^{(t)}. For the update to be supported, we require that U^{(t)} contains only edge insertions/deletions, that \mathbf{g}^{(t)},\mathbf{\ell}^{(t)} and U^{(t)} satisfy the hidden stable-flow chasing property (Definition 7.1) with hidden circulation \mathbf{c}^{(t)} and upper bounds \mathbf{w}^{(t)}, and that, for a parameter \alpha,

    \frac{\langle\mathbf{g}^{(t)},\mathbf{c}^{(t)}\rangle}{||\mathbf{w}^{(t)}||_{1}}\leq\alpha.
  • QUERY(): Returns a tree T_{i} for i\in[s] and a cycle \Delta represented as m^{o(1)} paths on T_{i} (specified by their endpoints and the tree index) and m^{o(1)} explicitly given off-tree edges such that, for \kappa=\exp(-O(\log^{7/8}m\cdot\log\log m)),

    \frac{\langle\mathbf{g}^{(t)},\Delta\rangle}{||\mathbf{L}^{(t)}\Delta||_{1}}\leq-\kappa\alpha.

    Over \tau stages the algorithm succeeds w.h.p. with total runtime m^{o(1)}(m+Q) for Q=\sum_{t\in[\tau]}|U^{(t)}|.

Notation 7.10.

Following the notation of [2], we denote the data structure described in Theorem 7.9 by \mathcal{D}^{(HSFC)}.
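In the spirit of the stubs provided elsewhere in this project, a minimal Python interface for \mathcal{D}^{(HSFC)} might look as follows. The method names mirror the UPDATE and QUERY operations of Theorem 7.9, but the signatures and placeholder bodies are our own choices, not anything prescribed by [2]:

```python
class HiddenStableFlowChasingDS:
    """Stub for the D^(HSFC) data structure of Theorem 7.9.  A full
    implementation would maintain s = O(log n)^d spanning trees of a
    dynamic graph; both operations below are placeholders."""

    def __init__(self, graph, gradients, lengths):
        self.graph = graph
        self.gradients = gradients
        self.lengths = lengths
        self.trees = []  # would hold the s spanning trees

    def update(self, batch, gradients, lengths):
        """Apply edge insertions/deletions and new gradients/lengths.
        Requires the hidden stable-flow chasing property (Definition 7.1)."""
        raise NotImplementedError("see the UPDATE operation of Theorem 7.9")

    def query(self):
        """Return (tree index, cycle Delta) whose gradient-to-length ratio
        is at most -kappa * alpha, per the QUERY guarantee of Theorem 7.9."""
        raise NotImplementedError("see the QUERY operation of Theorem 7.9")
```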

7.2 Rebuilding Game

Now we analyze the time complexity of the rebuilding game, which is given as O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right) in Lemma 7.7. At the end of Section 8.2 of [2], when describing how to use the rebuilding game to obtain the main data structure described in Theorem 7.9, the authors state that d=\sqrt[8]{\log m}, k=\exp(O(\log^{7/8}m)), \gamma_{g}=\exp(-O(\log^{7/8}m\log\log m)), C_{r}=\exp(O(\log^{7/8}m\log\log m)), and T=(m+Q)\exp(O(\log^{7/8}m\log\log m)). Plugging these values in, we get the bound (m+Q)\exp(O(\log^{7/8}m\log\log m)) given in [2].
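To get a feel for these parameter settings, the following sketch evaluates d, k, and the m^{o(1)} overhead factor \exp(\log^{7/8}m\log\log m) for a few values of m, with every hidden constant set to 1 (a simplifying assumption on our part, so these are only ballpark figures):

```python
import math

# Evaluate the parameter settings from Section 8.2 of [2] with every hidden
# constant taken to be 1 (our assumption), using natural logarithms.
def rebuild_params(m):
    L = math.log(m)
    d = L ** (1 / 8)
    k = math.exp(L ** (7 / 8))
    overhead = math.exp(L ** (7 / 8) * math.log(L))  # the m^{o(1)} factor
    return d, k, overhead

for m in (10**4, 10**6, 10**8):
    d, k, overhead = rebuild_params(m)
    print(f"m=10^{round(math.log10(m))}: d={d:.2f}, k={k:.3g}, overhead={overhead:.3g}")
```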

Here, Q=\sum_{t\in[\tau]}|U^{(t)}| as defined in Theorem 7.9, and we need to bound Q in terms of m. To do this, we turn to Lemma 9.4 from [2], which we reproduce below, where Algorithm 7 is the name [2] gives to their overall algorithm:

Lemma 7.11.

([2]) Consider a call to MINCOSTFLOW (Algorithm 7) and let U^{(t)} be as in line 20 [of Algorithm 7]. Then \sum_{t}|U^{(t)}|\leq\tilde{O}(m\kappa^{-2}\alpha^{-2}\varepsilon^{-1}).

Thus, based on Lemma 7.11, we have Q=\tilde{O}(m\kappa^{-2}\alpha^{-2}\varepsilon^{-1}). Here, \kappa=\exp(-O(\log^{7/8}m\log\log m)) as defined in Theorem 7.9 and \alpha=\frac{1}{1000\log mU}, where U is as defined in Definition 6.1. Further, \varepsilon is as defined in the DETECT operation of Lemma 3.1. We treat U and \varepsilon as constants that we absorb into the O notation.

In order to dissect what is being included in the \tilde{O} notation above for Q, we refer to Section 4 of [2], specifically Theorem 6.6 and the text immediately following it, where the authors state that a certain operation decreases the potential \Phi(f) (see Definition 6.4) by \Omega(\kappa^{2}). The potential drops from O(m\log mU) to -O(m\log m), so this takes O(m\log m\cdot\kappa^{-2}) operations, where \kappa^{-2}=\exp(O(\log^{7/8}m\log\log m)). We also note that \alpha^{-2}=O(\log^{2}m).

Now we do some big-O computation. Letting c_{i} denote constants, we have:

Q=O(m\log m\cdot\kappa^{-2}\alpha^{-2})\leq c_{1}m\log^{3}m\exp(O(\log^{7/8}m\log\log m))
\leq\exp\left(c_{2}\log^{7/8}m\log\log m+\log m+3\log\log m+\log c_{1}\right)=m\exp(O(\log^{7/8}m\log\log m)).

The last step follows by pulling out the e^{\log m}=m term and then noticing that \log^{7/8}m\log\log m dominates 3\log\log m and \log c_{1}.

From this, we get:

(m+Q)\exp(O(\log^{7/8}m\log\log m))=m\exp(O(\log^{7/8}m\log\log m))
+m\exp(O(\log^{7/8}m\log\log m))\exp(O(\log^{7/8}m\log\log m))=m\exp(O(\log^{7/8}m\log\log m)).

As a sanity check, we note that \exp(O(\log^{7/8}m\log\log m))=m^{o(1)}, because:

\lim_{m\to\infty}\frac{c_{1}\log^{7/8}m\log\log m}{\log m}=\lim_{m\to\infty}\frac{c_{1}\log\log m}{\log^{1/8}m}=0.
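The limit above can also be checked numerically. The sketch below (ours, with the constant c_{1} taken to be 1) evaluates the ratio at several large values of \log m; note that the ratio actually increases for small m before decaying, consistent with the non-monotone behavior observed in Section 7.4:

```python
import math

# Numerically confirm that (log^{7/8} m * loglog m) / log m -> 0, taking
# c_1 = 1.  We parametrize by L = log m directly, since m itself would be
# astronomically large for the interesting range of L.
def ratio(L):
    return (L ** (7 / 8)) * math.log(L) / L   # simplifies to loglog m / log^{1/8} m

vals = [ratio(10.0 ** e) for e in (4, 6, 9, 12)]  # log m = 10^4 ... 10^12
print(vals)
```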

Now we have the following mathematical lemma:

Lemma 7.12.

Suppose f(m) and g(m) are positive functions that both tend to \infty as m\to\infty, with f(m)=o(g(m)). Then e^{f(m)}=o(e^{g(m)}).

The proof of Lemma 7.12 is short: for any \varepsilon\in(0,1), there exists M such that f(m)\leq\varepsilon\cdot g(m) for all m\geq M. Thus e^{f(m)-g(m)}\leq e^{(\varepsilon-1)g(m)} for all m\geq M, and since \varepsilon-1<0 and g(m)\to\infty, we have \lim_{m\to\infty}e^{(\varepsilon-1)g(m)}=0.

Applying Lemma 7.12 with f(m)=O(\log^{7/8}m\log\log m) and g(m)=\frac{\log m}{\log\log m} (here f/g=O((\log\log m)^{2}/\log^{1/8}m)\to 0), it follows that \lim_{m\to\infty}\frac{\exp(O(\log^{7/8}m\log\log m))}{m^{\frac{1}{\log\log m}}}=0, so we indeed have \exp(O(\log^{7/8}m\log\log m))=m^{o(1)}.

7.3 MinCostFlow Algorithm

In this subsection, we precisely calculate the time complexity for the MinCostFlow Algorithm. This, combined with the previous subsection, allows us to establish the overall time complexity in the next subsection.

In Section 9 of [2], the authors show that the MINCOSTFLOW algorithm satisfies the hypotheses of Theorem 6.6. Thus, we conclude that the MINCOSTFLOW algorithm runs for \tilde{O}(m\kappa^{-2}\alpha^{-2}) iterations, and following the same logic as in Section 7.2, we can rewrite this as m\exp(O(\log^{7/8}m\log\log m)).

For the time per iteration, we note that there are s=O((\log n)^{d})=O\left((\log m)^{\sqrt[8]{\log m}}\right) trees maintained by the \mathcal{D}^{(HSFC)} data structure. For each of these trees, we invoke the Dynamic Tree data structure operations, each of which takes O(\log m) time based on the guarantees of Lemma 3.1.

Therefore, we get a bound of m\exp(O(\log^{7/8}m\log\log m))\cdot O(\log m)\cdot O\left((\log m)^{\sqrt[8]{\log m}}\right). We note that (\log m)^{1+\sqrt[8]{\log m}}=\exp((1+\sqrt[8]{\log m})\log\log m)=\exp(O(\log^{7/8}m\log\log m)). Thus, we still have m\exp(O(\log^{7/8}m\log\log m)) as our overall time complexity.

7.4 Comparison with Edmonds-Karp

The Edmonds-Karp algorithm is another well-known algorithm for solving max flow, which runs in O(mn^{2})=O(m^{3}) time (using n=O(m) for connected graphs). We now compare the growth of m^{3} vs. m\exp(C\log^{7/8}m\log\log m) for large m and constant C. The goal is to find out when m^{3} becomes larger than m\exp(C\log^{7/8}m\log\log m).

We have:

m^{3}>m\exp(C\log^{7/8}m\log\log m)\iff 2\log^{1/8}m>C\log\log m.

Letting x=\log m, we need to solve 2\sqrt[8]{x}>C\log x. If we take C=1, Wolfram Alpha indicates this inequality holds for all x>3.03\cdot 10^{7}, which means m>\exp(3.03\cdot 10^{7})\approx 1.01\cdot 10^{13154921}. If the number of edges of the graph were to exceed this number, then the algorithm of [2] would be faster than Edmonds-Karp. If C=2, the same calculation gives m>3.32\cdot 10^{93334255463}.
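The Wolfram Alpha computation can be reproduced by bisection. The sketch below (ours) solves 2\sqrt[8]{x}=C\log x with natural logarithms, matching the derivation above; since the function is also positive for very small x, we start the bracket at x=100, where it is negative, to converge to the larger root:

```python
import math

# Solve 2 * x^(1/8) = C * log(x) by bisection (natural log, as in the text).
def crossover(C, lo=100.0, hi=1e9):
    f = lambda x: 2 * x ** (1 / 8) - C * math.log(x)
    assert f(lo) < 0 < f(hi)   # bracket must straddle the larger root
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return lo

x = crossover(1)
print(f"x = log m ~ {x:.3g}")  # close to the 3.03e7 quoted above
```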

It is also worth considering how much slower this algorithm is for a more reasonable value of m. Note that because we do not know the value of the constant C, it is unreasonable to analyze small values of m (like 10 or 100). Taking C=1, if we assume our graph has 10^{10} edges, then we have:

\frac{10^{10}\exp(\log^{7/8}(10^{10})\log\log 10^{10})}{10^{30}}\approx 15.58.

Similar to the behavior of the function in Section 6.2, this ratio gets larger before it gets smaller. When m=10^{1000}, it equals 2.95\cdot 10^{941}, and of course these numbers would be larger for a larger C. Thus we can conclude that for any reasonably sized graph (where reasonably sized means able to fit in the storage of the world's largest supercomputers), a traditional algorithm like Edmonds-Karp will be faster.
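These ratios can be recomputed in log-base-10 space so that the m=10^{1000} case does not overflow floating point. The sketch below (ours, again with C=1 and natural logarithms inside the exponent, as in the text) recovers both figures up to rounding:

```python
import math

# Compare m * exp(log^{7/8} m * loglog m) against m^3 in log10 space.
def log10_ratio(log10_m):
    L = log10_m * math.log(10)                          # natural log of m
    ours = log10_m + (L ** (7 / 8)) * math.log(L) / math.log(10)
    return ours - 3 * log10_m                           # log10(ratio)

print(round(10 ** log10_ratio(10), 2))   # ~15.6 for m = 10^10
print(round(log10_ratio(1000)))          # ~941, i.e. a ratio near 10^941 for m = 10^1000
```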

8 Conclusion

8.1 Summary

We began with preliminaries and provided a detailed list of all of the different subparts of the algorithm in [2] that would have to be implemented in order for the main algorithm to be implemented. After this, we discussed our implementations of portions of the algorithm, including the Low Stretch Decomposition, the Rebuilding Game and the Interior Point Method with insights into why certain implementation decisions were made. Finally, we examined the overall time complexity of the algorithm in detail and worked out a more precise asymptotic bound.

8.2 Future Directions of Study

Several portions of the algorithm in [2] remain unimplemented, most notably the expander decomposition and a fully correct Interior Point Method. The nested list provided in Section 2.2, combined with the stubs provided in the GitHub repository, should give a roadmap for any future attempt at implementing this algorithm. Separately, while the algorithm in [2] is the asymptotically fastest algorithm for max flow to date, as discussed in the final subsection it is not designed to be practical. Another interesting area of research would be to find the fastest max flow algorithm that can be implemented practically, or to prove that existing algorithms already meet that bar.

References

  • [1] Ittai Abraham and Ofer Neiman. Using petal-decompositions to build a low stretch spanning tree. SIAM Journal on Computing, 48(2):227–248, 2019.
  • [2] Li Chen, Rasmus Kyng, Yang P. Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva. Maximum flow and minimum-cost flow in almost-linear time. 63rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2022.
  • [3] Julia Chuzhoy and Thatchaphol Saranurak. Deterministic algorithms for decremental shortest paths via layered core decomposition. Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2478–2496, 2021.
  • [4] Shimon Even and Yossi Shiloach. An on-line edge-deletion problem. Journal of the ACM, 28:1–4, 1981.
  • [5] Thatchaphol Saranurak and Di Wang. Expander decomposition and pruning: Faster, stronger, and simpler. Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, pages 2616–2635, 2019.
  • [6] Daniel Sleator and Robert Tarjan. A data structure for dynamic trees. Journal of Computer and System Sciences, 26(3):362–391, 1983.
  • [7] Daniel Sleator and Robert Tarjan. Self-adjusting binary search trees. Journal of the Association for Computing Machinery, 32(3):652–686, 1985.
  • [8] Daniel Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. Proceedings of the 36th Annual ACM Symposium on Theory of Computing, STOC 2004, Chicago, IL, USA, pages 81–90, 2004.
  • [9] Daniel Spielman and Shang-Hua Teng. Solving sparse, symmetric, diagonally-dominant linear systems in time O(m^{1.31}). 44th Annual IEEE Symposium on Foundations of Computer Science, pages 416–427, 2003.