

Partial Implementation of Max Flow and Min Cost Flow in Almost-Linear Time

A thesis presented by
Nithin Kavi

to
Computer Science
in partial fulfillment of the joint concentration
requirements for the degree of Bachelor of Arts
in Computer Science and Mathematics

Harvard College
Cambridge, Massachusetts
April 1, 2024
Abstract

In 2022, Chen et al. proposed an algorithm in [2] that solves the min cost flow problem in $m^{1+o(1)}\log U\log C$ time, where $m$ is the number of edges in the graph, $U$ is an upper bound on capacities and $C$ is an upper bound on costs. However, as far as the authors of [2] know, no one has implemented their algorithm to date. In this paper, we discuss implementations of several key portions of the algorithm given in [2], including the justifications for specific implementation choices. For the portions of the algorithm that we do not implement, we provide stubs. We then go through the entire algorithm and calculate the $m^{o(1)}$ term more precisely. Finally, we conclude with potential directions for future work in this area.

Acknowledgments

I would like to thank Dr. Adam Hesterberg for offering guidance throughout this project as my senior thesis advisor during the 2023-2024 academic year. He was extremely helpful in discussing ideas and offering feedback on this paper, and I could not have done this project without him. I would also like to thank Professors Yang Liu, Sushant Sachdeva and Richard Peng for answering clarifying questions about the algorithm that they present in [2] and offering advice on how to go about implementing things in this project. Their insight was extremely helpful. I am also thankful to Professors Daniel Sleator and Robert Tarjan for their insight on Dynamic Trees. I also sincerely appreciate Professor Madhu Sudan for agreeing to read and evaluate my thesis.

In addition, I want to thank Mincheol, Kevin, Ryan, Skyler, Sunera, Wittmann, and Yiting for being my suitemates and being part of an unforgettable experience at Harvard. I’ve really enjoyed these past 4 years with them. I am also thankful to my parents and older sister for all of their support throughout my Harvard education.

Lastly, I thank the Harvard Computer Science Department and Harvard College for giving me the opportunity to write a senior thesis and undertake this project.

1 Introduction

The maximum-flow problem is formulated as follows:

Suppose we have a connected graph $G$ with $m$ edges and $n$ vertices, where one vertex is designated as a source $s$ and another is designated as a sink $t$. Each edge $e$ in $G$ has a given capacity $c_e$. We want to maximize the amount of flow from $s$ to $t$ such that for all vertices $v \neq s, t$, the flow out of $v$ is equal to the flow into $v$, and for any edge $e$, the flow $f_e$ along edge $e$ satisfies $f_e \leq c_e$.

To solve this problem, Ford and Fulkerson created the Ford-Fulkerson algorithm, which runs in $O(mf^{*})$ time, where $f^{*}$ is the max flow. There is also the Edmonds-Karp algorithm, which runs in $O(nm^{2})$ time and does not depend on the size of the max flow itself.
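As a concrete illustration of these classical algorithms (and not part of the implementation described later in this paper), the Edmonds-Karp approach can be sketched in Python as follows; the adjacency-matrix representation and the function name are our own choices, made for brevity:

```python
from collections import deque

def edmonds_karp(cap, s, t):
    """Max flow via shortest augmenting paths (Edmonds-Karp sketch).

    cap: n x n matrix of residual capacities (modified in place);
    s, t: indices of the source and sink."""
    n = len(cap)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow  # no augmenting path remains
        # Find the bottleneck capacity along the path.
        bottleneck = float('inf')
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        # Augment: decrease forward capacities, increase reverse ones.
        v = t
        while v != s:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck
```

The classical $O(nm^2)$ bound comes from the fact that only $O(nm)$ augmentations can occur when each augmenting path is shortest.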

There is also the related minimum-cost flow problem, where each edge is also assigned a cost along with its capacity. Then the goal is to push the maximum total flow from the source to the sink while minimizing the total cost along the way.

Lastly, we can combine these two ideas as follows: letting the max flow as defined above be $f^{*}$, the min cost flow problem asks for the minimum possible total cost of the flow over the edges such that all requirements of the max flow problem are met and the max flow $f^{*}$ is attained.

In 2022, Chen, Kyng, Liu, Peng, Gutenberg and Sachdeva wrote [2], in which they found an algorithm to compute both maximum flows and minimum-cost flows in $m^{1+o(1)}\log U\log C$ time, which is the asymptotically fastest known algorithm to date. In this paper, we work on implementing the algorithm given in [2]. As far as the authors of [2] are aware, this algorithm has never been implemented before, and this paper constitutes the first attempt at implementing it. This algorithm relies on a number of subparts, including dynamic trees, low-stretch spanning trees, and undirected minimum-ratio cycles. Apart from very well-known things (like binary trees), there do not appear to be existing implementations of any of these subparts that meet the asymptotic time complexities given in [2].

We implement some of these subparts in the following sections in Python. Although Python is slower than many other languages by a large constant factor, the algorithm in [2] was never designed to be fast in practice, so this is not an issue. At the beginning of each section, we briefly describe what code that section corresponds to and explain how that code fits into the broader algorithm. In addition, for the sections where we do not implement all of the code described in the corresponding subpart, we include a description of what we did implement and what was challenging about the parts we did not implement. All of the code referenced in this paper can be found at the following GitHub repository: https://github.com/thinkinavi24/almost_linear_max_flow. The generative AI tool ChatGPT was used to write a significant amount of this code and is credited accordingly in comments of the Jupyter code blocks. In multiple Jupyter notebooks, some code blocks were written by ChatGPT while others were written by hand; any code block not labeled as being written by ChatGPT was written by hand. All code written by generative AI was thoroughly checked, tested and revised as needed, as was all other code in the repository.

2 Preliminaries

We begin with some definitions that we will refer back to throughout this paper. Definitions, notations and remarks with references in this section are copied identically from those references and are reproduced here for convenience. In addition, in later sections throughout this paper, we will quote results including theorems, definitions and lemmas from other articles and we will mention the source of the result immediately before reproducing it. Further, when discussing the implementation of a list of methods/attributes in a data structure that are described in another paper, we will often quote that paper’s description of the methods/attributes. When we do so, we will mention the source of the result immediately before reproducing a list that is copied from the referenced paper.

2.1 Definitions

Definition 2.1.

Given a graph $G$, for all pairs of vertices $(u,v) \in G$ we define $d_G(u,v)$ to be the shortest path distance from $u$ to $v$ in $G$.

Definition 2.2.

([1]) The radius of a connected graph $G$ around a point $x_0 \in G$ is $\max_{v\in G} d(x_0,v)$, where $d(x_0,v)$ represents the shortest distance from $x_0$ to $v$ for each vertex $v \in G$.

Notation 2.3.

([2]) We let a graph $G$ correspond to its vertices $V$ and its edges $E$, and we let $n = |V|$ and $m = |E|$. Sometimes, we may explicitly write $E(G)$ to denote the edges of $G$ or $V(G)$ to denote the vertices of $G$. Further, we let $e^G \in E(G)$ denote an edge of $G$ and $v^G \in V(G)$ denote a vertex of $G$. We assume that each edge $e \in E$ has an implicit direction, used to define its edge-vertex incidence matrix $\mathbf{B}$.

Definition 2.4.

([2]) We let $\deg_G(v)$ denote the degree of the vertex $v$ in $G$, i.e., the number of incident edges. We let $\Delta_{\max}(G)$ and $\Delta_{\min}(G)$ denote the maximum and minimum degree of the graph $G$.

Notation 2.5.

For a graph $G$ with vertices $V$, for any subset $S \subset V$, we let $\overline{S}$ denote $V \backslash S$.

Definition 2.6.

([2]) We define the volume of a set $S \subseteq V$ as $\operatorname{vol}_G(S) = \sum_{v\in S}\deg_G(v)$.

Definition 2.7.

([2]) Given graphs $G$ and $H$ with $V(G) \subseteq V(H)$, we say that $\Pi_{G\to H}$ is a graph-embedding from $G$ into $H$ if it maps each edge $e^G = (u,v) \in E(G)$ to a $u$-$v$ path $\Pi_{G\to H}(e^G)$ in $H$.

Definition 2.8.

([2]) The congestion of an edge $e^H$ is defined as $\operatorname{econg}(\Pi_{G\to H}, e^H) = |\{e^G \in E(G) \mid e^H \in \Pi_{G\to H}(e^G)\}|$, and the congestion of the embedding is $\operatorname{econg}(\Pi_{G\to H}) = \max_{e^H \in E(H)}\operatorname{econg}(\Pi_{G\to H}, e^H)$.

Definition 2.9.

([2]) The congestion of a vertex $v^H \in V(H)$ is defined by $\operatorname{vcong}(\Pi_{G\to H}, v^H) = |\{e^G \in E(G) \mid v^H \in \Pi_{G\to H}(e^G)\}|$. Further, the vertex-congestion of the embedding is $\operatorname{vcong}(\Pi_{G\to H}) = \max_{v^H \in V(H)}\operatorname{vcong}(\Pi_{G\to H}, v^H)$.

Definition 2.10.

([2]) The length of an embedding is defined as $\operatorname{length}(\Pi_{G\to H}) = \max_{e^G \in E(G)}|\Pi_{G\to H}(e^G)|$.
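Definitions 2.8 and 2.10 can be made concrete with a short Python sketch; representing an embedding as a dictionary mapping each edge of $G$ to the list of edges of $H$ on its embedded path is our own choice for illustration:

```python
from collections import Counter

def embedding_stats(embedding):
    """Edge congestion and length of a graph embedding.

    embedding: {edge of G: list of edges of H on its embedded path}.
    Returns (econg, length): the maximum number of embedded paths sharing
    one edge of H (Definition 2.8) and the longest embedded path length
    (Definition 2.10)."""
    use_count = Counter()
    for path in embedding.values():
        for e_H in path:
            use_count[e_H] += 1
    econg = max(use_count.values(), default=0)
    length = max((len(p) for p in embedding.values()), default=0)
    return econg, length
```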

Definition 2.11.

([2]) $G$ is a dynamic graph if it undergoes batches $U^{(1)}, U^{(2)}, \ldots$ of updates consisting of edge insertions/deletions and/or vertex splits that are applied to $G$. We say that the graph $G$, after applying the first $t$ update batches $U^{(1)}, U^{(2)}, \ldots, U^{(t)}$, is at stage $t$, and denote the graph at this stage by $G^{(t)}$.

Remark 2.12.

([2]) For each update batch $U^{(t)}$, we encode edge insertions by a tuple of the tail and head of the new edge, and deletions by a pointer to the edge that is about to be deleted. We further also encode vertex splits by a sequence of edge insertions and deletions as follows: if a vertex $v$ is about to be split and the vertex that is split off is denoted $v^{\text{NEW}}$, we can delete all edges that are incident to $v$ but should be incident to $v^{\text{NEW}}$ from $v$, and then re-insert each such edge via an insertion (we allow insertions to new vertices that do not yet exist in the graph). For technical reasons, we assume that in an update batch $U^{(t)}$, the updates to implement the vertex splits are last, and that we always encode a vertex split of $v$ into $v$ and $v^{\text{NEW}}$ such that $\deg_{G^{(t+1)}}(v^{\text{NEW}}) \leq \deg_{G^{(t+1)}}(v)$. We let the vertex set of the graph $G^{(t)}$ consist of the union of all endpoints of edges in the graph (in particular, if a vertex is split, the new vertex $v^{\text{NEW}}$ is added due to having edge insertions incident to this new vertex $v^{\text{NEW}}$ in $U^{(t)}$).

Definition 2.13.

We say a function $g(n) = \tilde{O}(f(n))$ if there exist constants $c, k$ such that $g(n) \leq c f(n)\log^k(n)$ for sufficiently large $n$. Note that this is equivalent to standard big-$O$ notation except that we also allow arbitrary powers of $\log$.

Definition 2.14.

([2]) We define ENC($u$) of an update $u \in U^{(t)}$ to be the size of the encoding of the update, and note that for edge insertions/deletions we have ENC($u$) $= \tilde{O}(1)$, and for a vertex split of $v$ into $v$ and $v^{\text{NEW}}$ as described above we have ENC($u$) $= \tilde{O}(\deg_{G^{(t+1)}}(v^{\text{NEW}}))$. For a batch of updates $U$, we let ENC($U$) $= \sum_{u\in U}$ ENC($u$).

Remark 2.15.

([2]) The number of updates $|U|$ in an update batch $U$ can be completely different from the actual encoding size ENC($U$) of the update batch $U$.

Definition 2.16.

([5]) For any subsets $S, T \subset V$, we denote by $E(S,T) = \{\{u,v\} \in E \mid u \in S, v \in T\}$ the set of edges between $S$ and $T$.

Definition 2.17.

([5]) The cut-size of a cut $S$ is $\delta(S) = |E(S, \overline{S})|$.

Definition 2.18.

([5]) The conductance of a cut $S$ in $G$ is $\Phi_G(S) = \frac{\delta(S)}{\min(\operatorname{vol}_G(S), \operatorname{vol}_G(V\backslash S))}$. Unless otherwise noted, when speaking of the conductance of a cut $S$, we assume $S$ to be the side of smaller volume. The conductance of a graph $G$ is $\Phi_G = \min_{S\subset V}\Phi_G(S)$. If $G$ is a singleton, we define $\Phi_G = 1$.
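Definitions 2.6, 2.17 and 2.18 translate directly into code. The following sketch, our own illustration assuming a simple undirected graph stored as adjacency sets, computes the conductance of a cut:

```python
def vol(adj, X):
    """Volume of a vertex set (Definition 2.6): sum of degrees."""
    return sum(len(adj[v]) for v in X)

def conductance(adj, S):
    """Conductance of a cut S (Definition 2.18).

    adj: {vertex: set of neighbors} for a simple undirected graph;
    S: a set of vertices."""
    S = set(S)
    S_bar = set(adj) - S
    # |E(S, S-bar)|, the cut-size of Definition 2.17.
    cut_size = sum(1 for u in S for v in adj[u] if v in S_bar)
    denom = min(vol(adj, S), vol(adj, S_bar))
    return cut_size / denom if denom else 1.0
```

For example, cutting a 4-cycle into two opposite pairs of adjacent vertices gives cut-size 2 and volume 4 on each side, hence conductance 1/2.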

Definition 2.19.

([5]) We say a graph $G$ is a $\phi$ expander if $\Phi_G \geq \phi$, and we call a partition $V_1, \ldots, V_k$ of $V$ a $\phi$ expander decomposition if $\min_i \Phi_{G[V_i]} \geq \phi$.

Definition 2.20.

([5]) Given $G = (V,E)$ and a set of vertices $A \subset V$, we say $A$ is a nearly $\phi$ expander in $G$ if $\forall S \subseteq A$ with $\operatorname{vol}(S) \leq \operatorname{vol}(A)/2$: $|E(S, V\backslash S)| \geq \phi\operatorname{vol}(S)$.

Definition 2.21.

([5]) Given a graph $G = (V,E)$, its subdivision graph $G_E = (V', E')$ is the graph where we put a split vertex $x_e$ on each edge $e \in E$ (including self loops). Formally, $V' = V \cup X_E$ where $X_E = \{x_e \mid e \in E\}$ and $E' = \{\{u, x_e\}, \{v, x_e\} \mid e = \{u,v\} \in E\}$. We will call vertices in $X_E$ the split vertices, and the other vertices in $V'$ the regular vertices.
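The subdivision graph of Definition 2.21 can be constructed directly. In this sketch (our own illustration), edges are frozensets and a split vertex $x_e$ is represented by the tuple `('x', e)`, a naming choice made here rather than taken from [5]:

```python
def subdivision_graph(vertices, edges):
    """Build the subdivision graph G_E of Definition 2.21.

    vertices: a set; edges: a set of frozensets {u, v}.
    Returns (V', E') where each original edge e is replaced by the two
    edges {u, x_e} and {v, x_e} through its split vertex x_e."""
    split = {e: ('x', e) for e in edges}            # one split vertex per edge
    new_vertices = set(vertices) | set(split.values())
    new_edges = set()
    for e in edges:
        for endpoint in e:
            new_edges.add(frozenset({endpoint, split[e]}))
    return new_vertices, new_edges
```

On a triangle, this produces 6 vertices (3 regular, 3 split) and 6 edges.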

Convention 2.22.

Whenever we use log\log in this paper, we always mean the natural logarithm. Of course, this is irrelevant when considering only asymptotic runtimes.

2.2 Layout of Algorithm

The authors in [2] present their main algorithm on page 69, at the beginning of Section 9, in Algorithm 7. In order to determine exactly what needs to be implemented (as opposed to the portions of [2] that are not directly part of the final algorithm), we work backwards from Algorithm 7.

We begin by listing the things we need to implement for Algorithm 7. Then for each of these, we list the things we need to implement for them, and so on, until we reach algorithms that can be implemented without referring to any other major algorithms. Note that some items are repeated within the list; this is because certain data structures/algorithms are used at multiple points within the overall main algorithm. The items that we have implemented are labeled as fully implemented or partially implemented.

A quick glance at Algorithm 7 tells us that we need to implement the following:

  1. The data structure $\mathcal{D}^{HSFC}$ (Theorem 6.2, [2])

     1.1. Branching Tree Chain (Theorem 7.1, [2])

          1.1.1. Dynamic Low Stretch Decomposition (LSD) (Lemma 6.5, [2]) (partial implementation in Section 4)

                 1.1.1.1. Low-stretch spanning trees (Theorem 3.2, [2]) (full implementation in Section 4)

                          1.1.1.1.1. create_petal algorithm ([1])

                          1.1.1.1.2. petal_decomposition algorithm ([1])

                          1.1.1.1.3. hierarchical_petal_decomposition algorithm ([1])

                 1.1.1.2. Heavy Light Decomposition (Lemma B.8, [2]) (full implementation in Section 4)

                 1.1.1.3. Rooted tree (Lemma B.9, [2])

                 1.1.1.4. Computing Stretch Overestimates

                          1.1.1.4.1. Computing permutation $\pi$ (Lemma B.6, [2])

                                     1.1.1.4.1.1. Compute congestion (Lemma B.5, [2])

                                                  1.1.1.4.1.1.1. Dynamic Trees (Lemma 3.3, [2]) (external implementation cited in Section 3)

                 1.1.1.5. Tree Decomposition (Lemma B.7 of [2], quotes [9] and [8])

          1.1.2. Sparsified core graph (Definition 6.9, [2]), implemented in Algorithm 4 (Lemma 7.8, [2])

                 1.1.2.1. Dynamic Spanner (Theorem 5.1, [2])

                          1.1.2.1.1. SPARSIFY algorithm (Theorem 5.5, [2])

                                     1.1.2.1.1.1. DECOMPOSE algorithm (Theorem 5.11, [2])

                                                  1.1.2.1.1.1.1. Expander Decomposition (Theorem 1.2 of [5])

                                                                 1.1.2.1.1.1.1.1. Cut-Matching (Theorem 2.2 of [5])

                                                                                  1.1.2.1.1.1.1.1.1. Implicitly update flow $f$ (Appendix B.1 of [5])

                                                                 1.1.2.1.1.1.1.2. Trimming (Theorem 2.1 of [5])

                                     1.1.2.1.1.2. Data structure $\mathcal{DS}_{ExpPath}$ (Theorem 5.12 of [2], quotes [3])

                                                  1.1.2.1.1.2.1. Expander Pruning (Theorem 1.3 of [5])

                                                  1.1.2.1.1.2.2. Theorem 3.8 of [3]

                                                  1.1.2.1.1.2.3. ES-Tree Data Structure ([4])

                          1.1.2.1.2. Update algorithm

                 1.1.2.2. Dynamic Core Graphs (Lemma 7.5, [2])

                          1.1.2.2.1. Dynamic Low Stretch Decomposition (LSD) (Lemma 6.5, [2])

                          1.1.2.2.2. Multiplicative Weights Update (MWU) (Lemma 6.6, [2])

     1.2. Rebuilding Game (Lemma 8.3, [2]) (full implementation in Section 5)

  2. Dynamic Trees (Lemma 3.3, [2]) (external implementation in Section 3)

In order to implement Algorithm 7, we can view this list itself as a flowchart/DAG: the leaves of the tree need to be implemented first, and we work our way up to the root. In order to implement the algorithm at each “node” of the tree, all of its children must first be implemented.

In Section 7, we analyze these different components and the asymptotic guarantees given in [2] to precisely calculate the asymptotic time complexity of the main algorithm in [2]. Lastly, we conclude the paper in Section 8 with a summary of the work and possible directions for future study.

3 Dynamic Trees

Dynamic Trees are described in Lemma 3.3 of [2] which is reproduced below for convenience:

Lemma 3.1.

([2]) There is a deterministic data structure $\mathcal{D}^{(T)}$ that maintains a dynamic tree $T \subseteq G = (V,E)$ under insertion/deletion of edges with gradients $\mathbf{g}$ and lengths $\mathbf{\ell}$ and supports the following operations:

  1. Insert/delete edges $e$ to $T$, under the condition that $T$ is always a tree, or update the gradient $\mathbf{g}_e$ or length $\mathbf{\ell}_e$. The amortized time is $\tilde{O}(1)$ per change.

  2. For a path vector $\Delta = \mathbf{p}(T[u,v])$ for some $u, v \in V$, return $\langle\mathbf{g}, \mathbf{\Delta}\rangle$ or $\langle\mathbf{\ell}, |\mathbf{\Delta}|\rangle$ in time $\tilde{O}(1)$.

  3. Maintain a flow $\mathbf{f} \in \mathbb{R}^E$ under operations $\mathbf{f} \leftarrow \mathbf{f} + \eta\Delta$ for $\eta \in \mathbb{R}$ and path vector $\Delta = \mathbf{p}(T[u,v])$, or query the value $\mathbf{f}_e$, in amortized time $\tilde{O}(1)$.

  4. Maintain a positive flow $\mathbf{f} \in \mathbb{R}^E$ under operations $\mathbf{f} \leftarrow \mathbf{f} + \eta|\Delta|$ for $\eta \in \mathbb{R}_{\geq 0}$ and path vector $\mathbf{\Delta} = \mathbf{p}(T[u,v])$, or query the value $\mathbf{f}_e$, in amortized time $\tilde{O}(1)$.

  5. DETECT(). For a fixed parameter $\varepsilon$, and under positive flow updates (item 4), where $\Delta^{(t)}$ is the update vector at time $t$, returns $S^{(t)} = \left\{e \in E : \mathbf{\ell}_e \sum_{t' \in [\text{last}_e^{(t)}+1, t]} |\Delta_e^{(t')}| \geq \varepsilon\right\}$, where $\text{last}_e^{(t)}$ is the last time before $t$ that $e$ was returned by DETECT(). Runs in time $\tilde{O}(|S^{(t)}|)$.

The data structure in Lemma 3.3 of [2] above is a slight extension of the Dynamic Trees described in [6]. In general, the purpose of dynamic trees is to make operations like cutting a tree or linking two trees possible in $\tilde{O}(1)$ time; these operations can then be efficiently combined to implement operations 1-5 above, also in $\tilde{O}(1)$ time.

This is an important data structure for the algorithm presented in [2]. It is challenging to find an implementation of this data structure online. The authors of [6] pointed us to the following implementation in C++: https://codeforces.com/contest/487/submission/8902202, written by the first author of [6]. They also pointed us to their paper [7], which provides a simpler description of Dynamic Trees.

Since an implementation of Dynamic Trees already exists, albeit in a different language, we do not implement one here and instead defer to that implementation. While that implementation does not include a DETECT() method, the authors of [2] give an algorithm for DETECT() which should be implementable given the Dynamic Tree class linked above. Even though we are not creating our own dynamic trees implementation, we will refer to Lemma 3.1 throughout this paper, so it is important to understand the definition given there.
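Since we defer to the external C++ implementation, we can still record the interface of Lemma 3.1 as a Python stub, in the same spirit as the stubs we provide elsewhere in this paper. The method names below are our own choices, not notation from [2] or [6]; each operation should run in amortized $\tilde{O}(1)$ time in a real implementation:

```python
class DynamicTreeStub:
    """Interface stub for the dynamic tree data structure of Lemma 3.1."""

    def insert_edge(self, e, gradient, length):
        raise NotImplementedError  # operation 1: add e, keeping T a tree

    def delete_edge(self, e):
        raise NotImplementedError  # operation 1: remove e from T

    def update_edge(self, e, gradient=None, length=None):
        raise NotImplementedError  # operation 1: change g_e or l_e

    def path_inner_products(self, u, v):
        raise NotImplementedError  # operation 2: <g, Delta>, <l, |Delta|>

    def add_flow_along_path(self, u, v, eta):
        raise NotImplementedError  # operations 3/4: f <- f + eta * Delta

    def query_flow(self, e):
        raise NotImplementedError  # operations 3/4: return f_e

    def detect(self):
        raise NotImplementedError  # operation 5: edges exceeding epsilon
```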

4 Low Stretch Decomposition

As described in Section 2.2, in this section we implement Low Stretch Decomposition as specified in Lemma 6.5 of [2], which states the following:

Lemma 4.1.

(Dynamic Low Stretch Decomposition). There is a deterministic algorithm with total runtime $\tilde{O}(m)$ that, on a graph $G = (V,E)$ with lengths $\mathbf{\ell} \in \mathbb{R}^E_{>0}$, weights $\mathbf{v} \in \mathbb{R}^E_{>0}$, and parameter $k$, initializes a tree $T$ spanning $V$, a rooted spanning forest $F \subseteq T$, an edge-disjoint partition $\mathcal{W}$ of $F$ into $O(m/k)$ subtrees, and stretch overestimates $\widetilde{\text{str}}_e$. The algorithm maintains $F$ decrementally against $\tau$ batches of updates to $G$, say $U^{(1)}, U^{(2)}, \ldots, U^{(\tau)}$, such that $\widetilde{\text{str}}_e \stackrel{\text{def}}{=} 1$ for any new edge $e$ added by either edge insertions or vertex splits, and:

  1. $F$ has initially $O(m/k)$ connected components and $O(q\log^2 n)$ more after $t$ update batches of total encoding size $q \stackrel{\text{def}}{=} \sum_{t=1}^{\tau}\text{ENC}(U^{(t)})$ satisfying $q \leq \tilde{O}(m)$.

  2. $\text{str}_e^{F,\ell} \leq \widetilde{\text{str}}_e \leq O(k\gamma_{LSST}\log^4 n)$ for all $e \in E$ at all times, including inserted edges $e$.

  3. $\sum_{e\in E^{(0)}} v_e\widetilde{\text{str}}_e \leq O(\|\mathbf{v}\|_1\gamma_{LSST}\log^2 n)$.

  4. Initially, $\mathcal{W}$ contains $O(m/k)$ subtrees. For any piece $W \in \mathcal{W}$, $W \subseteq V$, we have $|\partial W| \leq 1$ and $\operatorname{vol}_G(W\backslash R) \leq O(k\log^2 n)$ at all times, where $R \supseteq \partial\mathcal{W}$ is the set of roots in $F$. Here, $\partial W$ denotes the set of boundary vertices that are in multiple partition pieces.

The factor γLSST\gamma_{LSST} will be specified in Notation 4.3.

We begin by implementing low-stretch spanning trees in Section 4.1 before moving onto the Heavy Light Decomposition in Section 4.2.

4.1 Low-stretch spanning trees

We begin by writing code to implement the LSST algorithm described in [1]. The code can be found in the file titled petal.ipynb.

The main result of [1] can be summarized as follows:

Theorem 4.2.

Suppose $G$ is a connected, undirected graph with vertices $V$ and edges $E$, where $|V| = n$ and $|E| = m$. Then we can construct a spanning tree $T$ such that $\sum_{(u,v)\in E}\frac{d_T(u,v)}{w(u,v)} = O(m\log n\log\log n)$. Further, this tree can be found in $O(m\log n\log\log n)$ time.

The tree $T$ found in Theorem 4.2 is our low-stretch spanning tree (LSST).
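The quantity bounded in Theorem 4.2 is easy to compute directly for a given spanning tree, which is also useful for validation. The following sketch (our own illustration, with edges given as (u, v, w) triples) recomputes tree distances by a simple search, which is fine for small test graphs:

```python
def total_stretch(edges, tree_edges):
    """Total stretch of a spanning tree (the sum in Theorem 4.2):
    for each graph edge (u, v, w), add d_T(u, v) / w."""
    adj = {}
    for u, v, w in tree_edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    def tree_dist(src, dst):
        # A depth-first search suffices because tree paths are unique.
        stack, seen = [(src, 0.0)], {src}
        while stack:
            node, d = stack.pop()
            if node == dst:
                return d
            for nxt, w in adj.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, d + w))
        raise ValueError("tree does not span both endpoints")

    return sum(tree_dist(u, v) / w for u, v, w in edges)
```

For a unit-weight triangle with the tree consisting of two of its edges, the tree edges have stretch 1 each and the off-tree edge has stretch 2, for a total of 4.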

Notation 4.3.

Following the notation of [2], we will refer to the stretch factor $O(m\log n\log\log n)$ of the LSST as $\gamma_{LSST}$ for the remainder of the paper.

The algorithm can be understood through a few functions: create_petal, petal_decomposition and hierarchical_petal_decomposition. We describe the implementation of each of these functions below. In order to get our LSST, we call hierarchical_petal_decomposition on our original graph with specific parameters that are specified in Section 4.1.3.

4.1.1 create_petal

Abraham and Neiman give two algorithms in [1] for create-petal, which both take in a graph, a starting point $x_0$, a target $t$ and a radius $r$ (where $r$ is expressed as a range in the first algorithm), and which both return a cluster (which is the petal) as well as a center of that petal. The algorithm given on page 235 is not fast enough for the overall desired bound in this paper: it requires that the ratio between the largest and smallest distances be $O(n^k)$ for some $k$, which does not necessarily hold in general, and its create-petal routine has a for loop which could potentially run for many iterations, with no clear bound on the number of iterations. For this reason, we implement the faster algorithm on page 245 in Section 6 of [1], which the authors themselves state is necessary for the best asymptotic efficiency.

In the code, we begin with an ordinary implementation of Dijkstra’s Algorithm using Fibonacci Heaps, along with a corresponding shortest_path function. Everything in this portion is standard. After that, we write a function to generate the directed graph $\tilde{G}$ as specified on page 245 of [1]. As Abraham and Neiman point out, all edges in the directed graph have nonnegative length.

With that out of the way, we can write the create_petal function. In essence, the algorithm iteratively carves petals off of the graph, then carves petals off of the remaining part of the graph, and so on until all petals have been carved, at which point any remaining vertices are made part of the stigma of the flower. This function takes the following inputs: a graph $Y$, a target $t$, a starting point $x_0$ and a radius $r$. Note that this is slightly different from the slower function that Abraham and Neiman propose. In particular, the create_petal function does not take in the original graph as input except at the beginning; rather, as described in the following subsection, each time a petal is carved, create_petal is iteratively called on the remaining subgraph. We also do not have any hi and lo parameters, as the faster algorithm described on page 245 of [1] does not include any.

To create a petal, we first run Dijkstra on the given graph from the given starting point $x_0$. We then generate the corresponding directed graph based on the shortest distances from $x_0$ to all other points in the graph and calculate the shortest path to the target $t$. Next, we need to find the special vertex $r'$, defined on page 245 of [1] as the furthest point from $t$ on the shortest path from $t$ to $x_0$ in the original graph $G$ that is within distance $r$ of $t$. To do this, we could perform binary search on the sorted distances along the shortest path from $t$ to $x_0$ in order to find the largest distance less than or equal to $r$. However, in our implementation we simply do a linear search, as this does not impact the overall asymptotic time complexity, which we empirically check later.
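The linear search just described can be sketched as follows; the argument names are our own, and we assume the path is ordered from $t$ outward, so that distances along it are nondecreasing:

```python
def find_r_prime(path_from_t, dist_from_t, r):
    """Linear search for p_{r'}: the furthest vertex from t on the
    shortest t -> x0 path whose distance from t is at most r.

    path_from_t: vertices in order starting at t;
    dist_from_t: {vertex: shortest distance from t}."""
    p = path_from_t[0]  # t itself, at distance 0
    for v in path_from_t:
        if dist_from_t[v] <= r:
            p = v       # still within distance r of t
        else:
            break       # distances only grow from here on
    return p
```

A binary search over the same sorted distances would also work, but as noted above it does not change the asymptotic complexity.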

Next, the algorithm stipulates that the petal is the ball of radius $r/2$ around $t$ in the directed graph, where each edge on the path from $t$ to $p_{r'}$ is set to half of its original weight in $G$. At first, this may seem equivalent to the following: include every vertex within the ball of radius $r/2$, and then also include every vertex on the path from $t$ to $p_{r'}$, including $p_{r'}$ itself. This would make intuitive sense because $p_{r'}$ is defined as the furthest point along the path from $t$ to $x_0$ within distance $r$ of $t$. (If all of the closest points to $t$ are further than $r$ away, then $p_{r'} = t$.) An implementation which does this would not be quite right, however, for the following reason: while edges in the directed graph all have nonnegative length by the triangle inequality, they can have length 0, and this fact influences the implementation of create_petal. In the directed graph, $p_{r'}$ may not be the furthest point within distance $r$: edges of length 0 mean that some points “further” along the path may be at equal distance.

Therefore, it is essential to first set the weight of each edge on the path from $t$ to $p_{r'}$ to half its original weight in $G$. After that, we run Dijkstra on the directed graph $\tilde{G}$ and select all vertices within distance $r/2$ of $t$.
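This step can be sketched as follows; this is our own simplified illustration, with the directed graph stored as nested dictionaries and a minimal Dijkstra included for self-containedness:

```python
import heapq

def dijkstra(digraph, src):
    """Minimal Dijkstra on a digraph given as {u: {v: weight}}."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale heap entry
        for v, w in digraph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def petal_ball(digraph, path_to_pr, t, r):
    """Sketch of the final step of create_petal: halve each edge on the
    t -> p_{r'} path, then return the ball of radius r/2 around t."""
    for u, v in zip(path_to_pr, path_to_pr[1:]):
        digraph[u][v] /= 2  # halve edges on the path, as in [1]
    dist = dijkstra(digraph, t)
    return {v for v, d in dist.items() if d <= r / 2}
```

Halving the path edges first is exactly what guarantees that the whole $t \to p_{r'}$ path lands inside the ball, even in the presence of length-0 edges.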

The function then returns the petal and the point $p_{r'}$, which is the center of the petal and plays a key role in the petal decomposition. In addition, we also choose to return the shortest path from $x_0$ to $t$, since this will come up in the petal_decomposition function described in the next subsection.

4.1.2 petal_decomposition

In our implementation of this function, we begin with some sanity checks that the center $x$ and target $t$ are both in the graph. While this may seem unnecessary, we will see in the next subsection that there are many recursive calls which involve modifying the graph. These assert and raise statements helped catch many errors earlier on in the implementation and ensured everything worked properly once those errors were corrected.

We begin by computing the radius of the graph, defined as the maximum over all vertices $v$ of the length of the shortest path from $x_0$ to $v$, found using Dijkstra. We then initialize the following lists with their defined purposes:

  • Ys: $Y_j$ is defined as the original graph minus the vertices and edges contained in petals $X_1, \ldots, X_j$. In particular, $Y_0$ is initialized to the entire graph before any petals have been removed.

  • Xs: $X_j$ is the petal found on iteration $j$ and is carved from $Y_{j-1}$, except for $X_0$, which is the remaining part of the graph after all petals have been carved (what the authors in [1] call the “stigma” of the flower).

  • xs: $x_j$ is the center of petal $X_j$, i.e., the point $p_{r'}$ when computing the petal $X_j$, except for $x_0$, which is the original input.

  • ts: $t_j$ is the target when creating petal $X_j$.

  • ys: $y_j$ is the neighbor of $x_j$ on the path from $x_0$ to $t_j$ which is closer to $x_0$.

The rest of the implementation follows the pseudocode. The first petal that is carved may be a special petal if the distance from $x_0$ to the target $t$ is greater than $\frac{5}{8}$ of the radius from $x_0$. Abraham and Neiman describe in [1] why they choose the constant $\frac{5}{8}$. The purpose of this petal is to preserve the shortest path from $x_0$ to $t$.

To carve all of the other petals, we iteratively check for points whose distance from $x_0$ is at least $\frac{3r}{4}$. If no such points exist, we have found all petals. Otherwise, we arbitrarily choose one of those points to be the target $t_j$ for petal $X_j$. In our implementation, we simply choose the first key in the dictionary whose keys are vertices and whose values are distances from the center $x_0$; this choice is arbitrary because the keys are only ordered as they were in the original input graph, which is itself arbitrary.
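The target selection just described amounts to the following sketch, where dist_from_x0 is assumed to be the dictionary of distances computed by Dijkstra:

```python
def pick_target(dist_from_x0, r):
    """Choose the next petal target: any vertex at distance >= 3r/4
    from x0.  We take the first such key in the dictionary, an arbitrary
    choice since the key order merely reflects the input graph."""
    for v, d in dist_from_x0.items():
        if d >= 3 * r / 4:
            return v
    return None  # no such points remain: all petals have been carved
```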

We then call create_petal on the remaining part of the original graph (denoted by the element most recently appended to the Ys list) along with our center $x_0$ and target $t_j$. For the radius, we use $r/8$. This choice comes from the pseudocode for petal_decomposition in [1], where the call to create_petal sets lo $= 0$ and hi $= r/8$. However, as we recall from the previous subsection, the faster implementation of create_petal does not have lo or hi parameters, but rather a single radius parameter. For this reason, it was not obvious whether to use $r/8$ or something like $r/16$. We decided to use $r/8$ because this parameter is a radius, and all points within this radius will have distance less than or equal to $r/8$ from the center; having a maximum possible value of $r/8$ (the hi parameter) versus a single value of $r/8$ therefore seemed more analogous. However, this is not a fully rigorous argument, and the testing described in Section 4.1.4 is what provides empirical validation for this choice.

Once we create the petal, we subtract it off from the remaining part of the graph. Using the path from x0x_{0} to tjt_{j}, we compute yjy_{j}, which will be used in the following subsection. We then halve the lengths of all of the edges on the shortest path to tjt_{j} (but beginning at xjx_{j} rather than x0x_{0}) returned by the create_petal function as given in the pseudocode.

Finally, we let X0X_{0} be the last YjY_{j}, namely the remaining part of the graph after all petals have been carved.

4.1.3 hierarchical-petal-decomposition

This is the third and final function necessary to generate a low-stretch spanning tree. For inputs, it takes in the graph, a starting point x0x_{0} and a target t.t. This function is recursive with a base case that the tree on a graph with one vertex is just the single vertex itself.

On each recursive call, we compute the petal decomposition of the graph. As explained in the previous subsection, this returns a set of petals XjX_{j}, their centers xjx_{j}, a set of neighbors yjy_{j} and a set of targets tjt_{j}. We then run hierarchical-petal-decomposition recursively on each petal XjX_{j} where the starting point is xjx_{j} and the target is tj.t_{j}. Each of these recursive calls generates a tree TjT_{j} for each petal Xj.X_{j}. These trees are all disjoint, so we create a new graph which combines all of these TjT_{j} through the edges (xj,yj).(x_{j},y_{j}). This gives us a final low-stretch spanning tree T.T.

Lastly, to compute the tree itself, we call hierarchical-petal-decomposition on the graph with arbitrary starting point x0x_{0} but critically also with target x0.x_{0}. The justification for this decision is explained in [1].

4.1.4 LSST Implementation Validation

The function hierarchical_petal_decomposition was the last function described in [1]. In order to test the LSST implementation, we generate many Erdős-Rényi random graphs. For each random graph, we pick a probability pp and let there be an edge between every pair of distinct vertices with probability pp (so p=1p=1 would correspond to a complete graph).
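A minimal generator for such test graphs might look as follows (function name ours); for LSST testing one would additionally resample or discard disconnected samples:

```python
import random

def erdos_renyi(n, p, seed=None):
    """Sample an Erdos-Renyi G(n, p) graph: include each edge between
    distinct vertices independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p]
```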

Then, for each graph, we compute an LSST using hierarchical_petal_decomposition, measuring its stretch and the duration to compute it. We repeat this for 10 trials and take the average for both the stretch and the duration. There are two primary things to verify: that the stretch is O(mlognloglogn)O(m\log n\log\log n) and that the time complexity is O(mlognloglogn).O(m\log n\log\log n). We also separately checked that the output was always a spanning tree (it reaches every vertex of the original graph and has n1n-1 edges). However, when timing the code, we did not include this extra check since it is not a necessary part of the algorithm once we are confident that the algorithm is running correctly.

To check the asymptotic nature of the stretch and duration to compute the LSST, for different values of m,nm,n we graph the stretch and duration vs mlognloglogn.m\log n\log\log n. We expect that there are some finite constants c1,c2c_{1},c_{2}\in\mathbb{R} such that the stretch with mm edges and nn vertices is bounded above by c1mlognloglognc_{1}m\log n\log\log n and likewise the duration is bounded above by c2mlognloglogn.c_{2}m\log n\log\log n. We check this graphically below.

Refer to caption
Figure 1: For different random graphs, we graph the average stretch vs mlognloglogn.m\log n\log\log n. Empirically, we can see that the stretch is bounded above by 0.045mlognloglogn0.045m\log n\log\log n and that the pattern of growth appears to suggest that this growth rate will continue. The purple line is the best fit line which minimizes squared L2 error.
Refer to caption
Figure 2: For different random graphs, we graph the average duration vs mlognloglogn.m\log n\log\log n. Empirically, we can see that the duration is bounded above by 5.7106mlognloglogn5.7\cdot 10^{-6}m\log n\log\log n and that the pattern of growth appears to suggest that this growth rate will continue. The purple line is the best fit line which minimizes squared L2 error.

In both cases, we can see that for sufficiently large values of mm and n,n, both the duration and stretch are bounded above by a linear constant multiplied by mlognloglogn.m\log n\log\log n. Of course, this does not definitively verify that the implementation is correct (for example, there could hypothetically be a logloglogn\log\log\log n factor that would not show up graphically in the experiments run above) but given that the authors in [1] prove their algorithm and stretch are both O(mlognloglogn)O(m\log n\log\log n) and the implementation followed their algorithm, it is reasonable to conclude that the graphs above provide empirical validation that our implementation works correctly.

4.2 Heavy Light Decomposition

In this subsection we discuss the code that implements the Heavy Light Decomposition of a tree. First, we reproduce a necessary definition from [2], which we have modified slightly to refer to general trees:

Definition 4.4.

([2]) Given a rooted tree TT and vertex u,u, define uTu^{\uparrow T} as the set of its ancestors in TT plus itself. We extend the notation to any subset of vertices by defining RT=uRuT.R^{\uparrow T}=\bigcup_{u\in R}u^{\uparrow T}.
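With the tree stored as a parent array (root's parent set to -1, a representation we assume for illustration), these sets can be computed directly:

```python
def ancestors_plus_self(parent, u):
    """Compute u^{up T}: u together with all of its ancestors in T."""
    result = set()
    while u != -1:
        result.add(u)
        u = parent[u]
    return result

def lifted_set(parent, R):
    """Compute R^{up T} as the union of u^{up T} over all u in R."""
    out = set()
    for u in R:
        out |= ancestors_plus_self(parent, u)
    return out
```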

Now we define the Heavy Light Decomposition as described in Lemma B.8 of [2] (which quotes [6]) which we reproduce below for convenience:

Lemma 4.5.

There is a linear-time algorithm that given a rooted tree TT with nn vertices outputs a collection of vertex disjoint tree paths {P1,,Pt}\{P_{1},\ldots,P_{t}\} (called heavy chains), such that the following hold for every vertex uu:

  1. 1.

    There is exactly one heavy chain PiP_{i} containing u.u.

  2. 2.

    If PiP_{i} is the heavy chain containing uu, at most one child of uu is in Pi.P_{i}.

  3. 3.

    There are at most O(logn)O(\log n) heavy chains that intersect with uT.u^{\uparrow T}.

In addition, edges that are not covered by any heavy chain are called light edges.

4.2.1 Implementation Details

The heavy_light_decomposition function takes in one input: the tree. Within this function, we define two inner functions: dfs_size and decompose_chain, which we describe below.

The dfs_size function takes in a vertex vv and its parent as inputs. The function then uses Depth First Search to recursively visit every vertex and compute the size of the subtree (number of vertices) rooted at that vertex. If the vertex is a leaf node, we say its subtree has size 1 (just itself). Otherwise, we initialize the size of the subtree to 1 and compute size(v)=1+wc(v)size(w)\operatorname{size}(v)=1+\sum_{w\in c(v)}\operatorname{size}(w) where c(v)c(v) is the set of all children of vv.

Next, we have the decompose_chain function. This function takes in a vertex vv, the head of the chain of vv and the parent of vv as inputs. In this function, we first identify the heavy child for that vertex, which corresponds to condition 2 in Lemma 4.5 and is the unique child of uu (if any) within PiP_{i}, the heavy chain containing uu. We find this heavy child by iterating over the children of uu and picking the child of uu which has the maximum subtree size.

The decompose_chain function then recursively calls itself as long as we keep finding new “heavy children.” For each recursive call, we take in the previous heavy_child as our new vertex v,v, the same chain_head as before and the vertex vv as the parent (since the heavy_child is a child of vertex vv).

Once the recursive calls are complete, we iterate over the children of vv, and for every child that isn’t the heavy child, we initialize a new chain and recursively call decompose_chain where our new vertex vv is the child we’re iterating over, the chain_head is the index of the current chain and the parent is our old v.v. By setting the code up in this way, we ensure that we create a chain for every vertex vv and its descendants and that we satisfy the requirements of Lemma 4.5.

With these two functions implemented, we can finish our heavy_light_decomposition implementation. We initialize variables for the number of vertices in the tree, the subtree sizes, and the chains. We then call dfs_size on the root vertex 0 and its “parent” (which is not a real vertex and which we denote 1-1). This allows us to figure out the sizes of all of the subtrees in the tree. Then, we call decompose_chain on our root vertex 0, the chain_head which we initialize to 0 and the parent 1-1. Finally, the heavy_light_decomposition function returns the list of chains.
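A condensed sketch of the whole routine (signatures simplified relative to the description above; the tree is given as a list of children lists with root 0):

```python
import sys

def heavy_light_decomposition(tree):
    """tree[i] is the list of children of vertex i; vertex 0 is the root.
    Returns a list of heavy chains (each a list of vertices)."""
    n = len(tree)
    size = [1] * n
    chains = []

    def dfs_size(v):
        # Subtree size = 1 + sizes of all children's subtrees.
        for w in tree[v]:
            dfs_size(w)
            size[v] += size[w]

    def decompose_chain(v, chain_index):
        chains[chain_index].append(v)
        if tree[v]:
            # The heavy child is the child with the largest subtree.
            heavy = max(tree[v], key=lambda w: size[w])
            decompose_chain(heavy, chain_index)   # extend the current chain
            for w in tree[v]:
                if w != heavy:
                    chains.append([])             # every light child starts a new chain
                    decompose_chain(w, len(chains) - 1)

    sys.setrecursionlimit(max(10000, 2 * n))      # deep random trees recurse far
    dfs_size(0)
    chains.append([])
    decompose_chain(0, 0)
    return chains
```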

4.2.2 Heavy-Light Decomposition Implementation Validation

In order to check our implementation, we check the three conditions of Lemma 4.5 empirically. We produce a function that generates random trees. These trees are implemented as a list of lists, with nodes numbered 0 to nn where 0 is the root. Further, the list at index ii consists of the children of node ii, and the list is empty if and only if node ii is a leaf node. Based on this format, such random trees can be generated as follows: for each vertex j=1,2,,nj=1,2,\ldots,n, randomly choose one of the vertices i=0,1,,j1i=0,1,\ldots,j-1 to be its parent by appending vertex jj to the list at index ii. This also has the convenient property that the parent’s number is always strictly less than the child’s number.
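The generator just described can be sketched as (function name ours):

```python
import random

def random_tree(n, seed=None):
    """Return a list-of-children representation of a random rooted tree
    on vertices 0..n, where 0 is the root: each new vertex attaches to a
    uniformly random earlier vertex, so every parent's number is strictly
    smaller than its child's."""
    rng = random.Random(seed)
    children = [[] for _ in range(n + 1)]
    for v in range(1, n + 1):
        parent = rng.randrange(v)
        children[parent].append(v)
    return children
```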

We then call our heavy light decomposition function on 1000 randomly generated trees. We iterate over all possible vertices for that tree and check that vertex uu is represented in one and only one heavy chain. We can then easily check the children of uu and ensure that at most one child of uu is in that same heavy chain.

For the third condition, we calculate uTu^{\uparrow T} for each uu and count the number of heavy chains that intersect with it. This is the trickiest condition to check because Lemma 4.5 only promises an O(logn)O(\log n) bound on the number of intersecting heavy chains. To check this, we consider increasing values of nn and graph the average number of intersecting heavy chains vs logn\log n, where the average is calculated by summing up the number of intersections across all vertices and then dividing by both the number of vertices and the number of randomly generated trees, in this case 1000. Below, we plot the average number of intersections vs logn\log n, and we see that a line with slope 1.15 dominates the growth. Thus, empirically it appears that the number of intersections is bounded by 1.15logn.1.15\log n. In addition, we also computed the quotient of the number of intersections divided by logn\log n, which gave us the following list: [1.276, 1.196, 1.144, 1.134, 1.126, 1.103, 1.092, 1.090, 1.084, 1.081, 1.079, 1.072]. We can see that this list is monotonically decreasing, which provides further assurance that the number of intersections is O(logn).O(\log n).

Refer to caption
Figure 3: The average number of intersections appears to be bounded by O(logn)O(\log n). The purple line is the best fit line.
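The per-vertex intersection count used for this check is simple to compute once uTu^{\uparrow T} is available (a sketch; here the tree is given as a parent array and the chains as lists of vertices):

```python
def count_intersecting_chains(parent, chains, u):
    """Count the heavy chains containing at least one vertex of u^{up T}."""
    up = set()
    while u != -1:           # climb to the root (whose parent is -1)
        up.add(u)
        u = parent[u]
    return sum(1 for chain in chains if up & set(chain))
```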

In addition, we also empirically check the time complexity for our implementation just like we did for low-stretch spanning trees. Lemma 4.5 states that the algorithm should run in O(n)O(n) time, which indeed appears to be the case below.

Refer to caption
Figure 4: The time to compute the heavy light decomposition increases linearly with the number of vertices, which matches our desired O(n)O(n) time complexity. The purple line is the best fit line.

5 Rebuilding Game

The authors in [2] create a rebuilding game in order to attain their promised near-linear time bound for max flow. As they describe at the beginning of Section 8, this rebuilding game is the key step in going from Theorem 7.1 to Theorem 6.2 (theorem numbers are with respect to [2]). Theorem 6.2 asserts the existence of the main overall data structure in [2] that is crucial for attaining the near-linear asymptotic time bound. The code for this part is included in the file titled rebuilding.ipynb. Although for our purposes we only need the rebuilding game under very specific conditions with specific parameter values, we implement it in full generality as it is described in Section 8 of [2].

5.1 Preliminaries

“The rebuilding game has several parameters: integer parameters size m>0m>0 and depth d>0d>0, update frequency 0<γg<10<\gamma_{g}<1, a rebuilding cost Cr1C_{r}\geq 1, a weight range K1K\geq 1, a recursive size reduction parameter k=m1/d2k=m^{1/d}\geq 2, and finally an integer round count T>0T>0” ([2]).

They define the rebuilding game in Definition 8.1 of [2] which we reproduce here for convenience:

Definition 5.1.

([2]) The rebuilding game is played between a player and an adversary and proceeds in rounds t=1,2,,T.t=1,2,\ldots,T. Additionally, the steps (moves) taken by the player are indexed as s=1,2,s=1,2,\ldots. Every step ss is associated with a tuple prev(s):=(prev0(s),,prevd(s))[T]d+1.\text{prev}^{(s)}:=(\text{prev}_{0}^{(s)},\ldots,\text{prev}_{d}^{(s)})\in[T]^{d+1}. Both the player and adversary know prev(s)\text{prev}^{(s)}. At the beginning of the game, at round t=1t=1 and step s=1s=1, we initially set previ(1)=1\text{prev}_{i}^{(1)}=1 for all levels i{0,1,,d}.i\in\{0,1,\ldots,d\}. At the beginning of each round t1t\geq 1,

  1. 1.

    The adversary first chooses a positive real weight W(t)W^{(t)} satisfying logW(t)(K,K).\log W^{(t)}\in(-K,K). This weight is hidden from the player.

  2. 2.

    Then, while either of the following conditions holds, i=0dWprevi(s)>2(d+1)W(t)\sum_{i=0}^{d}W^{\text{prev}_{i}^{(s)}}>2(d+1)W^{(t)} or for some level l,l, at least γgm/kl\gamma_{g}m/k^{l} rounds have passed since the last rebuild of level l,l, the adversary can (but does not have to) force the player to perform a fixing step. The player may also choose to perform a fixing step, regardless of whether the adversary forces it or not. In a fixing step, the player picks a level i{0,1,,d}i\in\{0,1,\ldots,d\} and we then set prevj(s+1)t\text{prev}_{j}^{(s+1)}\leftarrow t for j{i,i+1,,d},j\in\{i,i+1,\ldots,d\}, and prevj(s+1)prevj(s)\text{prev}_{j}^{(s+1)}\leftarrow\text{prev}_{j}^{(s)} for j{0,,i1}.j\in\{0,\ldots,i-1\}. We call this a fix at level ii and we say the levels jij\geq i have been rebuilt. This move costs Crm/kiC_{r}m/k^{i} time.

  3. 3.

    When the player is no longer performing fixing steps, the round finishes.

There are some aspects of the original explanation in [2] that were underspecified and are clarified here. The first is that the steps indexed by ss refer to fixing steps, and further that each step also has a level ii. For this reason, the step count is incremented each time a level is fixed (thus ss refers to the total number of fixing steps across all levels). Note that this increment of the step count is separate from the fixing counts in the fix array, which are incremented as well until they are reset.

In order to assess whether the adversary is forcing a fixing step, we check whether the condition i=0dWprevi(s)>2(d+1)W(t)\sum_{i=0}^{d}W^{\text{prev}_{i}^{(s)}}>2(d+1)W^{(t)} holds, as specified above in Definition 5.1.

5.2 Implementation and Validation

We implement this rebuilding game through two functions. We first implement a fix function which the main algorithm will call to perform a fixing step. This function is fairly routine and simply takes in a level ii, an index jj, the step index ss, the round tt and an array called prevs. The fix function returns the newly updated prevs array after the fix is complete.
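Because a fix at level ii rebuilds every level jij\geq i, the core of the fix function reduces to a single slice-and-fill (a sketch with a simplified signature; our actual implementation also threads through the step and round bookkeeping described above):

```python
def fix(prevs, level, t):
    """Fix at `level` in round `t` (Definition 5.1): levels level..d are
    rebuilt (set to t) while the lower levels are carried over unchanged.

    prevs is the row (prev_0, ..., prev_d) for the current step; the
    function returns the row for the next step."""
    return prevs[:level] + [t] * (len(prevs) - level)
```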

Next, we have our main rebuilding function. In addition to the parameters m,d,γg,Cr,K,Tm,d,\gamma_{g},C_{r},K,T listed above, we also take in the parameter w,w, which represents a 2d array where the ttht^{th} row of ww is w(t)w^{(t)} and we initialize W(t)=w(t)1W^{(t)}=||w^{(t)}||_{1} as explained in Remark 8.2 of [2]. We also initialize 1D arrays to keep track of the fix and round counts and a 2D prevs array. The prevs array has d+1d+1 columns (one for each of the levels 0,1,,d0,1,\ldots,d) and initially one row of all ones as outlined in Definition 5.1 above.

We provide empirical validation that the algorithm runs in O(CrKdγg(m+T)).O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right). For setting the values of each of these variables, we refer to Section 8.2 of [2], which gives asymptotic values for the variables. We let Cr=exp(log7/8mloglogm),γg=1Cr=exp(log7/8mloglogm),Q=mCr=mexp(log7/8mloglogm),T=(m+Q)CrC_{r}=\exp(\log^{7/8}m\log\log m),\gamma_{g}=\frac{1}{C_{r}}=\exp(-\log^{7/8}m\log\log m),Q=mC_{r}=m\exp(\log^{7/8}m\log\log m),T=\lfloor(m+Q)C_{r}\rfloor and d=5logm8.d=\lfloor 5\cdot\sqrt[8]{\log m}\rfloor. While section 8.2 of [2] states that d=logm8d=\sqrt[8]{\log m} as opposed to O(logm8),O(\sqrt[8]{\log m}), we note that scaling dd by a constant has no bearing on the O(CrKdγg(m+T))O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right) runtime, and the authors also state d=O(logm8)d=O(\sqrt[8]{\log m}) many other times in [2]. The reason for the constant factor of 55 in this implementation is because otherwise we would always have d=1d=1 unless mm was extremely large, and there is no reason to always set d=1.d=1. The exact value of 55 is arbitrary and can be replaced by any other positive integer.
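Written out in code, the parameter choices above are (a sketch mirroring our experimental setup; the function name is ours):

```python
import math

def rebuilding_parameters(m):
    """Parameter settings used in our rebuilding-game experiments."""
    lg = math.log(m)
    c_r = math.exp(lg ** (7 / 8) * math.log(lg))   # C_r = exp(log^{7/8} m loglog m)
    gamma_g = 1.0 / c_r                            # update frequency gamma_g = 1/C_r
    q = m * c_r                                    # Q = m * C_r
    t_rounds = int((m + q) * c_r)                  # T = floor((m + Q) * C_r)
    d = int(5 * lg ** (1 / 8))                     # depth, with the constant factor of 5
    return c_r, gamma_g, t_rounds, d
```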

In addition, we also let WW be a vector with TT entries where the W[i]W[i] are i.i.d Uniform(1, 1000). Here, the number 1000 is arbitrary. Lastly, we set K=logmaxiW[i]K=\lceil\log\max_{i}W[i]\rceil in order to ensure that logW[i](K,K)\log W[i]\in(-K,K) as required by [2].

With these parameters, we run experiments with m=8,9,10,,64.m=8,9,10,\ldots,64. We plot the durations for each of these values of mm vs CrKdγg(m+T).\frac{C_{r}Kd}{\gamma_{g}}(m+T). As promised by [2], we do empirically observe that the runtime increases linearly with CrKdγg(m+T)\frac{C_{r}Kd}{\gamma_{g}}(m+T) and is therefore bounded by a constant times CrKdγg(m+T),\frac{C_{r}Kd}{\gamma_{g}}(m+T), where this constant is the slope of the purple line in the graph below, or 2.021011.2.02\cdot 10^{-11}. This number comes from calculating the slope between the last point and one of the preceding points and increasing the y-intercept slightly, as the graph appears to be progressing linearly once the xx-axis value exceeds 4.4.

Refer to caption
Figure 5: The runtime of the rebuilding game appears to be O(CrKdγg(m+T))O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right) as stated in [2]. The purple line is the best fit line.

6 Interior Point Method

In Section 4 of [2], the authors present an interior point method that solves the min-cost flow problem.

6.1 IPM Preliminaries

We begin by defining the key variables that will be used throughout this section and the remainder of the paper as they are defined in [2]. In the subsequent text, recall that 𝐁\mathbf{B} was defined in Notation 2.3.

Definition 6.1.

For our min-cost flow problem, we let 𝐝V\mathbf{d}\in\mathbb{Z}^{V} represent the demands. We also let the lower and upper capacities be denoted by 𝐮,𝐮+E\mathbf{u}^{-},\mathbf{u}^{+}\in\mathbb{Z}^{E} where all integers are bounded by the variable UU. Finally, we let 𝐜E\mathbf{c}\in\mathbb{Z}^{E} denote the costs.

Definition 6.2.

We define α=def11000logmU.\alpha\stackrel{{\scriptstyle\text{def}}}{{=}}\frac{1}{1000\log mU}. We will use α\alpha to create a barrier xαx^{-\alpha} when solving the L1 IPM.

Definition 6.3.

We define:

𝐟=defargmin𝐁𝐟=𝐝𝐮e𝐟e𝐮e+ for all eE𝐜𝐟.\mathbf{f}^{*}\stackrel{{\scriptstyle\text{def}}}{{=}}\operatorname*{argmin}_{\begin{subarray}{c}\mathbf{B}^{\top}\mathbf{f}=\mathbf{d}\\ \mathbf{u}_{e}^{-}\leq\mathbf{f}_{e}\leq\mathbf{u}_{e}^{+}\ \text{for all}\ e\in E\end{subarray}}\mathbf{c}^{\top}\mathbf{f}.
Definition 6.4.

We let Φ(𝐟)=def20mlog(𝐜𝐟F)+eE((𝐮e+𝐟e)α+(𝐟e𝐮e)α),\Phi(\mathbf{f})\stackrel{{\scriptstyle\text{def}}}{{=}}20m\log(\mathbf{c}^{\top}\mathbf{f}-F^{*})+\sum_{e\in E}((\mathbf{u}_{e}^{+}-\mathbf{f}_{e})^{-\alpha}+(\mathbf{f}_{e}-\mathbf{u}_{e}^{-})^{-\alpha}), where F=def𝐜𝐟F^{*}\stackrel{{\scriptstyle\text{def}}}{{=}}\mathbf{c}^{\top}\mathbf{f}^{*} is the optimal objective value.

Definition 6.5.

([2]) Given a flow 𝐟E\mathbf{f}\in\mathbb{R}^{E} we define lengths E\ell\in\mathbb{R}^{E} as

(𝐟)e=def(𝐮e+𝐟e)1α+(𝐟e𝐮e)1α\mathbf{\ell}(\mathbf{f})_{e}\stackrel{{\scriptstyle\text{def}}}{{=}}(\mathbf{u}_{e}^{+}-\mathbf{f}_{e})^{-1-\alpha}+(\mathbf{f}_{e}-\mathbf{u}_{e}^{-})^{-1-\alpha}

and gradients 𝐠E\mathbf{g}\in\mathbb{R}^{E} as 𝐠(𝐟)=defΦ(𝐟).\mathbf{g}(\mathbf{f})\stackrel{{\scriptstyle\text{def}}}{{=}}\nabla\Phi(\mathbf{f}). More explicitly,

𝐠(𝐟)e=def[Φ(𝐟)]e=20m(𝐜𝐟F)1𝐜e+α(𝐮e+𝐟e)1αα(𝐟e𝐮e)1α.\mathbf{g}(\mathbf{f})_{e}\stackrel{{\scriptstyle\text{def}}}{{=}}[\nabla\Phi(\mathbf{f})]_{e}=20m(\mathbf{c}^{\top}\mathbf{f}-F^{*})^{-1}\mathbf{c}_{e}+\alpha(\mathbf{u}_{e}^{+}-\mathbf{f}_{e})^{-1-\alpha}-\alpha(\mathbf{f}_{e}-\mathbf{u}_{e}^{-})^{-1-\alpha}.
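These formulas translate directly into code, and the analytic gradient can be sanity-checked against a finite-difference gradient of Φ\Phi (a sketch; the function names are ours and FF^{*} is replaced by a lower bound):

```python
import math

def phi(f, c, u_lo, u_hi, f_star, alpha, m):
    """Potential Phi(f) from Definition 6.4 (f_star may be any lower
    bound on the optimal cost c^T f*)."""
    val = 20 * m * math.log(sum(ce * fe for ce, fe in zip(c, f)) - f_star)
    for fe, lo, hi in zip(f, u_lo, u_hi):
        val += (hi - fe) ** (-alpha) + (fe - lo) ** (-alpha)
    return val

def grad_phi(f, c, u_lo, u_hi, f_star, alpha, m):
    """Analytic gradient g(f) = grad Phi(f) from Definition 6.5."""
    slack = sum(ce * fe for ce, fe in zip(c, f)) - f_star
    return [20 * m * ce / slack
            + alpha * (hi - fe) ** (-1 - alpha)
            - alpha * (fe - lo) ** (-1 - alpha)
            for fe, ce, lo, hi in zip(f, c, u_lo, u_hi)]
```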

We now reproduce Theorem 4.3 of [2], which is their main result regarding IPMs for solving the min-cost flow problem (f(t)f^{(t)} is the flow after iteration tt, while the optimal flow ff^{*} is unknown):

Theorem 6.6.

([2]) Suppose we are given a min-cost flow instance given by Equation (1). Let 𝐟\mathbf{f}^{*} denote an optimal solution to the instance.

For all κ(0,1),\kappa\in(0,1), there is a potential reduction interior point method for this problem, that, given an initial flow 𝐟(0)E\mathbf{f}^{(0)}\in\mathbb{R}^{E} such that Φ(𝐟(0))200mlogmU,\Phi(\mathbf{f}^{(0)})\leq 200m\log mU, the algorithm proceeds as follows:

The algorithm runs for O~(mκ2)\tilde{O}(m\kappa^{2}) iterations. At each iteration, let 𝐠(𝐟(t))E\mathbf{g}(\mathbf{f}^{(t)})\in\mathbb{R}^{E} denote the gradient and (𝐟(t))>0E\mathbf{\ell}(\mathbf{f}^{(t)})\in\mathbb{R}^{E}_{>0} denote the lengths given by Definition 6.5. Let 𝐠~E\mathbf{\tilde{g}}\in\mathbb{R}^{E} and ~>0E\mathbf{\tilde{\ell}}\in\mathbb{R}_{>0}^{E} be any vectors such that 𝐋(𝐟(t))1(𝐠~𝐠(𝐟(t)))κ/8||\mathbf{L}(\mathbf{f}^{(t)})^{-1}(\mathbf{\tilde{g}}-\mathbf{g}(\mathbf{f}^{(t)}))||_{\infty}\leq\kappa/8 and ~2(𝐟(t)).\mathbf{\tilde{\ell}}\approx_{2}\mathbf{\ell}(\mathbf{f}^{(t)}).

  1. 1.

    At each iteration, the hidden circulation 𝐟𝐟(t)\mathbf{f}^{*}-\mathbf{f}^{(t)} satisfies

    𝐠(𝐟𝐟(t))100m+𝐋~(𝐟𝐟(t))1α/4.\frac{\mathbf{g}^{\top}(\mathbf{f}^{*}-\mathbf{f}^{(t)})}{100m+||\tilde{\mathbf{L}}(\mathbf{f}^{*}-\mathbf{f}^{(t)})||_{1}}\leq-\alpha/4.
  2. 2.

    At each iteration, given any Δ\Delta satisfying 𝐁Δ=0\mathbf{B}^{\top}\Delta=0 and 𝐠~Δ/𝐋~Δ1κ,\tilde{\mathbf{g}}^{\top}\Delta/||\tilde{\mathbf{L}}\Delta||_{1}\leq-\kappa, it updates 𝐟(t+1)𝐟(t)+ηΔ\mathbf{f}^{(t+1)}\leftarrow\mathbf{f}^{(t)}+\eta\Delta for ηκ2/(50|𝐠~Δ|).\eta\leftarrow\kappa^{2}/(50\cdot|\tilde{\mathbf{g}}^{\top}\Delta|).

  3. 3.

    At the end of O~(mκ2)\tilde{O}(m\kappa^{2}) iterations, we have 𝐜𝐟(t)𝐜𝐟+(mU)10.\mathbf{c}^{\top}\mathbf{f}^{(t)}\leq\mathbf{c}^{\top}\mathbf{f}^{*}+(mU)^{-10}.

6.2 IPM Implementation

All code described in this section can be found in the file ipm.ipynb.

In [2], the authors give a fairly detailed implementation that relies on specific bounds on the gradient, number of iterations etc. In order to implement that version of an interior point method, we would have to implement an oracle for a specific step described in Section 4 of [2], which is nontrivial to implement.

In this section we merely provide an approximation/heuristic for solving the interior point method which relies on scipy.optimize. We define the objective function as given in Definition 6.4 and feed the constraints as well. The minimize function in scipy.optimize then numerically solves for the optima of the potential function.

We empirically check the runtime of this simple implementation and compare it to the O~(mκ2)\tilde{O}(m\kappa^{2}) runtime promised above by Theorem 6.6, where we let κ2=exp(log7/8mloglogm)\kappa^{2}=\exp(-\log^{7/8}m\log\log m) as defined later on in [2] (they say κ=exp(O(log7/8mloglogm)),\kappa=\exp(-O(\log^{7/8}m\log\log m)), but an extra constant factor doesn’t change the following analysis).

[Uncaptioned image]

We notice that the graph is decreasing, which is surprising. Empirically checking the values of mlogmexp(log7/8mloglogm)m\log m\exp(-\log^{7/8}m\log\log m), we find they are decreasing over the values from m=4m=4 to m=100.m=100. In order to inspect this, we take the log of this expression, where we see:

limmlogm+loglogmlog7/8mloglogm=limmlogm(1+loglogmlogmloglogmlogm8)=.\lim_{m\to\infty}\log m+\log\log m-\log^{7/8}m\log\log m=\lim_{m\to\infty}\log m\left(1+\frac{\log\log m}{\log m}-\frac{\log\log m}{\sqrt[8]{\log m}}\right)=\infty.

Specifically, in the final expression above, we know that logm\log m will go to \infty while the expression in the parentheses will go to 11. Thus in the limit the log goes to infinity, which means the original expression goes to infinity as well. Yet the empirically decreasing nature of the function in the graph above raises a question: when does this function start increasing?

To find this, we take a derivative of the logged expression. We first let x=logmx=\log m since mm only appears as logm.\log m. Then, we have:

ddx(x+logxx7/8logx)=1x+17logx8x81x8.\frac{d}{dx}(x+\log x-x^{7/8}\log x)=\frac{1}{x}+1-\frac{7\log x}{8\sqrt[8]{x}}-\frac{1}{\sqrt[8]{x}}.

We want to find out when the derivative is positive. If we take xx\to\infty, we see that the derivative approaches 1,1, but if we plug in a number like x=1000x=1000 (which, we keep in mind, corresponds to m=e1000m=e^{1000}) we get 1.97.-1.97. This function is continuous, so by the Intermediate Value Theorem, we know that it will cross 0 at some point. We will find out where the derivative equals 0 (which corresponds to a local optimum) and then argue that it increases after this point.

Setting our above expression equal to 0, we get:

1x+1=7logx+88x87logx+8=8x1/8+8x7/8.\frac{1}{x}+1=\frac{7\log x+8}{8\sqrt[8]{x}}\implies 7\log x+8=8x^{1/8}+\frac{8}{x^{7/8}}.

If we plug this expression into Wolfram Alpha, we get x7.631010x\approx 7.63\cdot 10^{10} (all numerical calculations below are done using the more precise value of xx returned by the solver). Now, we want to confirm that the derivative is positive whenever x>7.631010.x>7.63\cdot 10^{10}. We have:
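This root can be double-checked numerically by bisecting the derivative computed above (a short sketch):

```python
import math

def deriv(x):
    # d/dx of (x + log x - x^(7/8) * log x), as computed above.
    return 1 / x + 1 - 7 * math.log(x) / (8 * x ** 0.125) - x ** -0.125

lo, hi = 1e10, 1e12   # deriv(lo) < 0 < deriv(hi), so a sign change lies between
for _ in range(200):
    mid = (lo + hi) / 2
    if deriv(mid) < 0:
        lo = mid
    else:
        hi = mid
root = (lo + hi) / 2  # approximately 7.63e10
```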

1x+17logx8x81x8>08x7/8+8x1/8>7logx+8.\frac{1}{x}+1-\frac{7\log x}{8\sqrt[8]{x}}-\frac{1}{\sqrt[8]{x}}>0\iff\frac{8}{x^{7/8}}+8x^{1/8}>7\log x+8.

The derivative of the left hand side is 7x15/8+x7/8\frac{-7}{x^{15/8}}+x^{-7/8} while the derivative of the right hand side is 7x.\frac{7}{x}. Then, we see:

1x7/87x15/8>7xx1/87x7/8>7.\frac{1}{x^{7/8}}-\frac{7}{x^{15/8}}>\frac{7}{x}\iff x^{1/8}-\frac{7}{x^{7/8}}>7.

From Wolfram Alpha, we see this is true whenever x>5764860x>5764860 (which makes sense as 78=57648017^{8}=5764801) and 7.631010>5764860.7.63\cdot 10^{10}>5764860. This means when x=7.631010,x=7.63\cdot 10^{10}, 8x7/8+8x1/8\frac{8}{x^{7/8}}+8x^{1/8} is increasing faster than 7logx+8,7\log x+8, so the first derivative will be positive for x>7.631010x>7.63\cdot 10^{10} and x=7.631010x=7.63\cdot 10^{10} is a local minimum of the logged expression.

Returning to our original expression, we recall that x=logm,x=\log m, so our original function is increasing when m>exp(7.631010)1033142793985103.311010.m>\exp(7.63\cdot 10^{10})\approx 10^{33142793985}\approx 10^{3.31\cdot 10^{10}}. When m=103.311010,m=10^{3.31\cdot 10^{10}}, we have logm=7.631010,loglogm=25.06,logm8=22.93.\log m=7.63\cdot 10^{10},\log\log m=25.06,\sqrt[8]{\log m}=22.93. Plugging these values in, at this local minimum, we have:

mlogmexp(log7/8mloglogm)=m1+loglogmlogmloglogmlogm8=103.311010(1+25.067.63101025.0622.93).m\log m\exp(-\log^{7/8}m\log\log m)=m^{1+\frac{\log\log m}{\log m}-\frac{\log\log m}{\sqrt[8]{\log m}}}=10^{3.31\cdot 10^{10}\left(1+\frac{25.06}{7.63\cdot 10^{10}}-\frac{25.06}{22.93}\right)}.

As stated previously, all calculations are done using the precise values of each numerical quantity rather than the rounded values we use above.

Simplifying, this comes out to 101.07101010^{-1.07\cdot 10^{10}} which is of course extremely, extremely close to 0. Thus, even though our function eventually tends to infinity, it first approaches 0. Further, given that m=103.311010m=10^{3.31\cdot 10^{10}} is many orders of magnitude larger than the number of atoms in the known universe (approximately 108210^{82}), it is not possible to empirically check whether our scipy implementation (or even a fully correct implementation) matches the asymptotic growth of mlogmexp(log7/8mloglogm).m\log m\exp(-\log^{7/8}m\log\log m).

For this reason, it does not make sense practically to consider the runtime of the IPM relative to its asymptotic guarantee. For any future implementations of the IPM algorithm described by Theorem 6.6, the focus should be on the other requirements of Theorem 6.6. With that said, it’s hard to test the other requirements either. While we could calculate ff^{*} via another algorithm to try to test our IPM implementation for accuracy, the scipy.optimize method that we are using only returns the final converged value rather than the successive steps, which makes requirements 1 and 2 of Theorem 6.6 infeasible to check. Even if we had the successive step values, it is unlikely that this heuristic would exactly match those requirements. For requirement 3, we note that the (mU)10(mU)^{-10} bound only holds after O~(mκ2)\tilde{O}(m\kappa^{-2}) iterations, which could be extremely large in practice; this suggests that the scipy implementation may run into numerical issues or may not converge in time, which makes it unreliable for testing. Thus, while the scipy implementation provides a heuristic for solving the IPM problem, it is not complete.

7 Overall Time Complexity

The authors in [2] state that their max flow algorithm is m1+o(1)logUlogCm^{1+o(1)}\log U\log C time or what they sometimes call O~(m)\tilde{O}(m) time. The goal of this section is to more precisely calculate the mo(1)m^{o(1)} term.

For a graph with mm edges and nn vertices to be connected, we must have at least n1n-1 edges. Therefore, n1mn-1\leq m which implies that n=O(m).n=O(m). This allows us to bound nn by mm in the asymptotic calculations that follow.

We refer to Algorithm 7 of [2], which provides the algorithm for their main theorem. We then refer to each component of it and calculate its runtime based on what the authors in [2] say. In cases when they simply say mo(1),m^{o(1)}, we take a deeper dive as needed to give a more precise asymptotic time complexity.

7.1 Preliminaries

We begin by reproducing several Lemmas and Theorems from [2] which we will refer to throughout the subsequent subsections:

Definition 7.1.

([2]) Consider a dynamic graph G(t)G^{(t)} undergoing batches of updates U(1),,U(t),U^{(1)},\ldots,U^{(t)},\ldots consisting of edge insertions/deletions and vertex splits. We say the sequences 𝐠(t),(t),\mathbf{g}^{(t)},\mathbf{\ell}^{(t)}, and U(t)U^{(t)} satisfy the hidden stable-flow chasing property if there are hidden dynamic circulations 𝐜(t)\mathbf{c}^{(t)} and hidden dynamic upper bounds 𝐰(t)\mathbf{w}^{(t)} such that the following holds at all stages tt:

  1. 1.

    𝐜(t)\mathbf{c}^{(t)} is a circulation: 𝐁G(t)𝐜(t)=0.\mathbf{B}^{\top}_{G^{(t)}}\mathbf{c}^{(t)}=0.

  2. 2.

    𝐰(t)\mathbf{w}^{(t)} upper bounds the length of 𝐜(t)\mathbf{c}^{(t)}: |e(t)𝐜e(t)|𝐰e(t)|\mathbf{\ell}_{e}^{(t)}\mathbf{c}_{e}^{(t)}|\leq\mathbf{w}_{e}^{(t)} for all eE(G(t)).e\in E(G^{(t)}).

  3. 3.

    For any edge ee in the current graph G(t)G^{(t)}, and any stage tt,t^{\prime}\leq t, if the edge ee was already present in G(t),G^{(t^{\prime})}, i.e. eG(t)\s=t+1tU(s)e\in G^{(t)}\backslash\bigcup_{s=t^{\prime}+1}^{t}U^{(s)} then 𝐰e(t)2𝐰e(t).\mathbf{w}_{e}^{(t)}\leq 2\mathbf{w}_{e}^{(t^{\prime})}.

  4. 4.

    Each entry of 𝐰(t)\mathbf{w}^{(t)} and (t)\mathbf{\ell}^{(t)} is quasipolynomially lower and upper-bounded: log𝐰e(t)[logO(1)m,logO(1)m]\log\mathbf{w}_{e}^{(t)}\in[-\log^{O(1)}m,\log^{O(1)}m] and loge(t)[logO(1)m,logO(1)m]\log\mathbf{\ell}_{e}^{(t)}\in[-\log^{O(1)}m,\log^{O(1)}m] for all eE(G(t)).e\in E(G^{(t)}).

Definition 7.2.

([2]) Consider a tree TT and a rooted spanning forest FF with E(F)E(T)E(F)\subseteq E(T) on a graph GG equipped with stretch overestimates str~e\widetilde{str}_{e} satisfying the guarantees of Lemma 4.1. We define the core graph 𝒞(G,F)\mathcal{C}(G,F) as a graph with the same edge and vertex set as G/F.G/F. For e=(u,v)E(G)e=(u,v)\in E(G) with image e^E(G/F)\hat{e}\in E(G/F) we define its length as e^𝒞(G,F)=defstr~ee\mathbf{\ell}_{\hat{e}}^{\mathcal{C}(G,F)}\stackrel{{\scriptstyle\text{def}}}{{=}}\widetilde{str}_{e}\mathbf{\ell}_{e} and gradient as 𝐠e^𝒞(G,F)=def𝐠e+𝐠,𝐩(T[u,v]).\mathbf{g}_{\hat{e}}^{\mathcal{C}(G,F)}\stackrel{{\scriptstyle\text{def}}}{{=}}\mathbf{g}_{e}+\langle\mathbf{g},\mathbf{p}(T[u,v])\rangle.

Definition 7.3.

([2]) Given a graph G, forest F, and parameter k, define a (\gamma_{s},\gamma_{c},\gamma_{\ell})-sparsified core graph with embedding as a subgraph \mathcal{S}(G,F)\subseteq\mathcal{C}(G,F) and embedding \Pi_{\mathcal{C}(G,F)\to\mathcal{S}(G,F)} satisfying

  1.

    For any \hat{e}\in E(\mathcal{C}(G,F)), all edges \hat{e}^{\prime}\in\Pi_{\mathcal{C}(G,F)\to\mathcal{S}(G,F)}(\hat{e}) satisfy \mathbf{\ell}_{\hat{e}}^{\mathcal{C}(G,F)}\approx_{2}\mathbf{\ell}_{\hat{e}^{\prime}}^{\mathcal{C}(G,F)}.

  2.

    \operatorname{length}(\Pi_{\mathcal{C}(G,F)\to\mathcal{S}(G,F)})\leq\gamma_{\ell} and \operatorname{econg}(\Pi_{\mathcal{C}(G,F)\to\mathcal{S}(G,F)})\leq k\gamma_{c}.

  3.

    \mathcal{S}(G,F) has at most m\gamma_{s}/k edges.

  4.

    The lengths and gradients of edges in \mathcal{S}(G,F) are the same as in \mathcal{C}(G,F) (Definition 7.2).
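The two quantities in condition 2 can be computed directly when the embedding is given explicitly as a map from core-graph edges to paths in \mathcal{S}(G,F). The following sketch (ours, with a made-up toy embedding) takes length to be the maximum number of edges on any embedding path and econg to be the maximum number of paths through any single edge:

```python
# Sketch of the length and edge-congestion of an embedding, as bounded in
# condition 2 of Definition 7.3.

def embedding_stats(embedding):
    """Return (length, econg): the max path length (in edges) and the max
    number of embedding paths using any single edge of S(G, F)."""
    length = max(len(path) for path in embedding.values())
    usage = {}
    for path in embedding.values():
        for edge in path:
            usage[edge] = usage.get(edge, 0) + 1
    return length, max(usage.values())

# Three core edges embedded into paths over sparsified edges 'a'..'d'.
emb = {1: ['a', 'b'], 2: ['b', 'c', 'd'], 3: ['d']}
print(embedding_stats(emb))  # (3, 2): longest path has 3 edges; 'b' and 'd' are each used twice
```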

Definition 7.4.

([2]) For a graph G, parameter k, and branching factor B, a B-branching tree-chain consists of collections of graphs \{\mathcal{G}_{i}\}_{0\leq i\leq d} such that \mathcal{G}_{0}\stackrel{\text{def}}{=}\{G\}, and we define \mathcal{G}_{i} inductively as follows:

  1.

    For each G\in\mathcal{G}_{i}, i<d, we have a collection of B trees \mathcal{T}^{G}=\{T_{1},T_{2},\ldots,T_{B}\} and a collection of B forests \mathcal{F}^{G}=\{F_{1},F_{2},\ldots,F_{B}\} such that each E(F_{j})\subseteq E(T_{j}) satisfies the conditions of Lemma 4.1.

  2.

    For each G_{i}\in\mathcal{G}_{i} and F\in\mathcal{F}^{G_{i}}, we maintain (\gamma_{s},\gamma_{c},\gamma_{\ell})-sparsified core graphs and embeddings \mathcal{S}(G_{i},F) and \Pi_{\mathcal{C}(G_{i},F)\to\mathcal{S}(G_{i},F)}.

  3.

    We let \mathcal{G}_{i+1}\stackrel{\text{def}}{=}\{\mathcal{S}(G_{i},F):G_{i}\in\mathcal{G}_{i},F\in\mathcal{F}^{G_{i}}\}.

Finally, for each G_{d}\in\mathcal{G}_{d}, we maintain a low-stretch tree F_{d}.

We let a tree chain be a single sequence of graphs G_{0},G_{1},\ldots,G_{d} such that G_{i+1} is the (\gamma_{s},\gamma_{c},\gamma_{\ell})-sparsified core graph \mathcal{S}(G_{i},F_{i}) with embedding \Pi_{\mathcal{C}(G_{i},F_{i})\to\mathcal{S}(G_{i},F_{i})} for some F_{i}\in\mathcal{F}^{G_{i}} for 0\leq i<d, together with a low-stretch tree F_{d} on G_{d}.
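Condition 3 of Definition 7.3 bounds each sparsified core graph by m\gamma_{s}/k edges, so the graphs shrink geometrically from level to level while the number of graphs multiplies by B. A small numerical sketch (ours, with arbitrary illustrative parameter values; the true parameters in [2] depend on m) shows this tradeoff:

```python
# Rough illustration of how a B-branching tree-chain shrinks: level i has
# B^i graphs, each with at most m * (gamma_s / k)^i edges (Definition 7.3,
# condition 3), and the final collection contains B^d trees.

def level_sizes(m, k, gamma_s, B, d):
    sizes = []
    for i in range(d + 1):
        graphs = B ** i                           # |G_i|
        edges_per_graph = m * (gamma_s / k) ** i  # per-graph edge bound
        sizes.append((graphs, edges_per_graph))
    return sizes

# Hypothetical parameters, chosen only to make the shrinkage visible.
for graphs, edges in level_sizes(m=10**6, k=100, gamma_s=2, B=3, d=3):
    print(graphs, round(edges))
```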

Definition 7.5.

([2]) Given a graph G and tree-chain G_{0},G_{1},\ldots,G_{d} where G_{0}=G, define the corresponding spanning tree T^{G_{0},G_{1},\ldots,G_{d}}\stackrel{\text{def}}{=}\bigcup_{i=0}^{d}F_{i} of G as the union of the preimages of the edges of each F_{i} in G=G_{0}.

Define the set of trees corresponding to a branching tree-chain of a graph G as the union of T^{G_{0},G_{1},\ldots,G_{d}} over all tree chains G_{0},G_{1},\ldots,G_{d} with G_{0}=G:

\mathcal{T}^{G}\stackrel{\text{def}}{=}\{T^{G_{0},G_{1},\ldots,G_{d}}:G_{0},G_{1},\ldots,G_{d}\text{ s.t. }G_{i+1}=\mathcal{S}(G_{i},F_{i})\text{ for all }0\leq i<d\}
Definition 7.6.

([2]) Given a dynamic graph G^{(t)} with updates indexed by times t=0,1,\ldots and a corresponding dynamic branching tree-chain (Definition 7.4), we say that the nonnegative integers \text{prev}_{0}^{(t)}\leq\text{prev}_{1}^{(t)}\leq\cdots\leq\text{prev}_{d}^{(t)} are previous rebuild times if \text{prev}_{i}^{(t)} was the most recent time at or before t at which \mathcal{G}_{i} was rebuilt, i.e., for each G\in\mathcal{G}_{i}, the set of trees \mathcal{T}^{G} was reinitialized and sampled.

Lemma 7.7.

([2]) There is a deterministic strategy, given by Algorithm 6, for the player to finish T rounds of the rebuilding game in time O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right).

Theorem 7.8.

([2]) Let G=(V,E) be a dynamic graph undergoing \tau batches of updates U^{(1)},\ldots,U^{(\tau)} containing only edge insertions/deletions, with edge gradients \mathbf{g}^{(t)} and lengths \mathbf{\ell}^{(t)}, such that the update sequence satisfies the hidden stable-flow chasing property (Definition 7.1) with hidden dynamic circulation \mathbf{c}^{(t)} and width \mathbf{w}^{(t)}. There is an algorithm on G that maintains an O(\log n)-branching tree chain corresponding to s=O(\log n)^{d} trees T_{1},T_{2},\ldots,T_{s} (Definition 7.5), and at stage t outputs a circulation \Delta represented by \exp(O(\log^{7/8}m\log\log m)) off-tree edges and paths on some T_{i}, i\in[s].

The output circulation \Delta satisfies \mathbf{B}^{\top}\Delta=0 and, for some \kappa=\exp(-O(\log^{7/8}m\log\log m)),

\frac{\langle\mathbf{g}^{(t)},\Delta\rangle}{||\mathbf{\ell}^{(t)}\circ\Delta||_{1}}\leq\kappa\frac{\langle\mathbf{g}^{(t)},\mathbf{c}^{(t)}\rangle}{\sum_{i=0}^{d}||\mathbf{w}^{(\text{prev}_{i}^{(t)})}||_{1}},

where \text{prev}_{i}^{(t)}, i\in[d], are the previous rebuild times (Definition 7.6) for the branching tree chain.

The algorithm succeeds w.h.p. with total runtime (m+Q)m^{o(1)} for Q\stackrel{\text{def}}{=}\sum_{t=1}^{\tau}|U^{(t)}|\leq\text{poly}(n). Also, levels i,i+1,\ldots,d of the branching tree chain can be rebuilt at any point in m^{1+o(1)}/k^{i} time.

Theorem 7.9.

([2]) There is a data structure that, on a dynamic graph G^{(t)}, maintains a collection of s=O(\log n)^{d} spanning trees T_{1},T_{2},\ldots,T_{s}\subseteq G^{(t)} for d=O(\log^{1/8}m), and supports the following operations:

  • UPDATE(U^{(t)},\mathbf{g}^{(t)},\mathbf{\ell}^{(t)}): Update the gradients and lengths to \mathbf{g}^{(t)} and \mathbf{\ell}^{(t)}. For the update to be supported, we require that U^{(t)} contains only edge insertions/deletions, that \mathbf{g}^{(t)},\mathbf{\ell}^{(t)} and U^{(t)} satisfy the hidden stable-flow chasing property (Definition 7.1) with hidden circulation \mathbf{c}^{(t)} and upper bounds \mathbf{w}^{(t)}, and that, for a parameter \alpha,

    \frac{\langle\mathbf{g}^{(t)},\mathbf{c}^{(t)}\rangle}{||\mathbf{w}^{(t)}||_{1}}\leq\alpha.
  • QUERY(): Returns a tree T_{i} for i\in[s] and a cycle \Delta represented as m^{o(1)} paths on T_{i} (specified by their endpoints and the tree index) and m^{o(1)} explicitly given off-tree edges such that, for \kappa=\exp(-O(\log^{7/8}m\cdot\log\log m)),

    \frac{\langle\mathbf{g}^{(t)},\Delta\rangle}{||\mathbf{L}^{(t)}\Delta||_{1}}\leq-\kappa\alpha.

    Over \tau stages the algorithm succeeds w.h.p. with total runtime m^{o(1)}(m+Q) for Q=\sum_{t\in[\tau]}|U^{(t)}|.

Notation 7.10.

Following the notation of [2], we denote the data structure described in Theorem 7.9 by \mathcal{D}^{(HSFC)}.
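In the spirit of the stubs provided elsewhere in this project, a minimal Python interface for \mathcal{D}^{(HSFC)} might look as follows. The method names mirror the UPDATE and QUERY operations of Theorem 7.9, but the signatures and placeholder bodies are our own choices, not anything prescribed by [2]:

```python
class HiddenStableFlowChasingDS:
    """Stub for the D^(HSFC) data structure of Theorem 7.9.  A full
    implementation would maintain s = O(log n)^d spanning trees of a
    dynamic graph; both operations below are placeholders."""

    def __init__(self, graph, gradients, lengths):
        self.graph = graph
        self.gradients = gradients
        self.lengths = lengths
        self.trees = []  # would hold the s spanning trees

    def update(self, batch, gradients, lengths):
        """Apply edge insertions/deletions and new gradients/lengths.
        Requires the hidden stable-flow chasing property (Definition 7.1)."""
        raise NotImplementedError("see the UPDATE operation of Theorem 7.9")

    def query(self):
        """Return (tree index, cycle Delta) whose gradient-to-length ratio
        is at most -kappa * alpha, per the QUERY guarantee of Theorem 7.9."""
        raise NotImplementedError("see the QUERY operation of Theorem 7.9")
```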

7.2 Rebuilding Game

Now we analyze the time complexity of the rebuilding game, which is given as O\left(\frac{C_{r}Kd}{\gamma_{g}}(m+T)\right) in Lemma 7.7. At the end of Section 8.2 of [2], when describing how to use the rebuilding game to obtain the main data structure described in Theorem 7.9, the authors state that d=\sqrt[8]{\log m}, k=\exp(O(\log^{7/8}m)), \gamma_{g}=\exp(-O(\log^{7/8}m\log\log m)), C_{r}=\exp(O(\log^{7/8}m\log\log m)), and T=(m+Q)\exp(O(\log^{7/8}m\log\log m)). Plugging these values in, we get the bound (m+Q)\exp(O(\log^{7/8}m\log\log m)) given in [2].
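To get a feel for these parameter settings, the following sketch evaluates d, k, and the m^{o(1)} overhead factor \exp(\log^{7/8}m\log\log m) for a few values of m, with every hidden constant set to 1 (a simplifying assumption on our part, so these are only ballpark figures):

```python
import math

# Evaluate the parameter settings from Section 8.2 of [2] with every hidden
# constant taken to be 1 (our assumption), using natural logarithms.
def rebuild_params(m):
    L = math.log(m)
    d = L ** (1 / 8)
    k = math.exp(L ** (7 / 8))
    overhead = math.exp(L ** (7 / 8) * math.log(L))  # the m^{o(1)} factor
    return d, k, overhead

for m in (10**4, 10**6, 10**8):
    d, k, overhead = rebuild_params(m)
    print(f"m=10^{round(math.log10(m))}: d={d:.2f}, k={k:.3g}, overhead={overhead:.3g}")
```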

Here, Q=\sum_{t\in[\tau]}|U^{(t)}| as defined in Theorem 7.9, and we need to bound Q in terms of m. To do this, we turn to Lemma 9.4 from [2], which we reproduce below, where Algorithm 7 is the name [2] gives to their overall algorithm:

Lemma 7.11.

([2]) Consider a call to MINCOSTFLOW (Algorithm 7) and let U^{(t)} be as in line 20 [of Algorithm 7]. Then \sum_{t}|U^{(t)}|\leq\tilde{O}(m\kappa^{-2}\alpha^{-2}\varepsilon^{-1}).

Thus, based on Lemma 7.11, we have Q=\tilde{O}(m\kappa^{-2}\alpha^{-2}\varepsilon^{-1}). Here, \kappa=\exp(-O(\log^{7/8}m\log\log m)) as defined in Theorem 7.9 and \alpha=\frac{1}{1000\log mU}, where U is as defined in Definition 6.1. Further, \varepsilon is as defined in the DETECT operation of Lemma 3.1. We treat U and \varepsilon as constants that we absorb into the O notation.

In order to dissect what is being included in the \tilde{O} notation above for Q, we refer to Section 4 of [2], specifically Theorem 6.6 and the text immediately following it, where the authors state that a certain operation decreases the potential \Phi(f) (see Definition 6.4) by \Omega(\kappa^{2}). The potential drops from O(m\log mU) to -O(m\log m), so this takes O(m\log m\cdot\kappa^{-2}) operations, where \kappa^{-2}=\exp(O(\log^{7/8}m\log\log m)). We also note that \alpha^{-2}=O(\log^{2}m).

Now we do some big-O computation. Letting c_{i} denote constants, we have:

Q=O(m\log m\cdot\kappa^{-2}\alpha^{-2})\leq c_{1}m\log^{3}m\exp(O(\log^{7/8}m\log\log m))
\leq\exp\left(c_{2}\log^{7/8}m\log\log m+\log m+3\log\log m+\log c_{1}\right)=m\exp(O(\log^{7/8}m\log\log m)).

The last step follows by pulling out the e^{\log m}=m term and then noticing that \log^{7/8}m\log\log m dominates 3\log\log m and \log c_{1}.

From this, we get:

(m+Q)\exp(O(\log^{7/8}m\log\log m))=m\exp(O(\log^{7/8}m\log\log m))
+m\exp(O(\log^{7/8}m\log\log m))\exp(O(\log^{7/8}m\log\log m))=m\exp(O(\log^{7/8}m\log\log m)).

As a sanity check, we note that \exp(O(\log^{7/8}m\log\log m))=m^{o(1)}, because:

\lim_{m\to\infty}\frac{c_{1}\log^{7/8}m\log\log m}{\log m}=\lim_{m\to\infty}\frac{c_{1}\log\log m}{\log^{1/8}m}=0.
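The limit above can also be checked numerically. The sketch below (ours, with the constant c_{1} taken to be 1) evaluates the ratio at several large values of \log m; note that the ratio actually increases for small m before decaying, consistent with the non-monotone behavior observed in Section 7.4:

```python
import math

# Numerically confirm that (log^{7/8} m * loglog m) / log m -> 0, taking
# c_1 = 1.  We parametrize by L = log m directly, since m itself would be
# astronomically large for the interesting range of L.
def ratio(L):
    return (L ** (7 / 8)) * math.log(L) / L   # simplifies to loglog m / log^{1/8} m

vals = [ratio(10.0 ** e) for e in (4, 6, 9, 12)]  # log m = 10^4 ... 10^12
print(vals)
```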

Now we have the following mathematical lemma:

Lemma 7.12.

Suppose f(m) and g(m) are positive functions that both tend to \infty as m\to\infty, with f(m)=o(g(m)). Then e^{f(m)}=o(e^{g(m)}).

The proof of Lemma 7.12 is short: for any \varepsilon\in(0,1), there exists M such that f(m)\leq\varepsilon\cdot g(m) for all m\geq M. Thus e^{f(m)-g(m)}\leq e^{(\varepsilon-1)g(m)} for all m\geq M, and since \varepsilon-1<0 and g(m)\to\infty, we have \lim_{m\to\infty}e^{(\varepsilon-1)g(m)}=0.

Applying Lemma 7.12 with f(m)=O(\log^{7/8}m\log\log m) and g(m)=\frac{\log m}{\log\log m} (here f/g=O((\log\log m)^{2}/\log^{1/8}m)\to 0), it follows that \lim_{m\to\infty}\frac{\exp(O(\log^{7/8}m\log\log m))}{m^{\frac{1}{\log\log m}}}=0, so we indeed have \exp(O(\log^{7/8}m\log\log m))=m^{o(1)}.

7.3 MinCostFlow Algorithm

In this subsection, we precisely calculate the time complexity for the MinCostFlow Algorithm. This, combined with the previous subsection, allows us to establish the overall time complexity in the next subsection.

In Section 9 of [2], the authors show that the MINCOSTFLOW algorithm satisfies the hypotheses of Theorem 6.6. Thus, we conclude that the MINCOSTFLOW algorithm runs for \tilde{O}(m\kappa^{-2}\alpha^{-2}) iterations, and following the same logic as in Section 7.2, we can rewrite this as m\exp(O(\log^{7/8}m\log\log m)).

For the time per iteration, we note that there are s=O((\log n)^{d})=O\left((\log m)^{\sqrt[8]{\log m}}\right) trees maintained by the \mathcal{D}^{(HSFC)} data structure. For each of these trees, we invoke the Dynamic Tree data structure operations, each of which takes O(\log m) time based on the guarantees of Lemma 3.1.

Therefore, we get a bound of m\exp(O(\log^{7/8}m\log\log m))\cdot O(\log m)\cdot O\left((\log m)^{\sqrt[8]{\log m}}\right). We note that (\log m)^{1+\sqrt[8]{\log m}}=\exp((1+\sqrt[8]{\log m})\log\log m)=\exp(O(\log^{7/8}m\log\log m)). Thus, we still have m\exp(O(\log^{7/8}m\log\log m)) as our overall time complexity.

7.4 Comparison with Edmonds-Karp

The Edmonds-Karp algorithm is another well-known algorithm for solving max flow, which runs in O(mn^{2})=O(m^{3}) time (using n=O(m) for connected graphs). We now compare the growth of m^{3} vs. m\exp(C\log^{7/8}m\log\log m) for large m and constant C. The goal is to find out when m^{3} becomes larger than m\exp(C\log^{7/8}m\log\log m).

We have:

m^{3}>m\exp(C\log^{7/8}m\log\log m)\iff 2\log^{1/8}m>C\log\log m.

Letting x=\log m, we need to solve 2\sqrt[8]{x}>C\log x. If we take C=1, Wolfram Alpha indicates this inequality holds for all x>3.03\cdot 10^{7}, which means m>\exp(3.03\cdot 10^{7})\approx 1.01\cdot 10^{13154921}. If the number of edges of the graph were to exceed this number, then the algorithm of [2] would be faster than Edmonds-Karp. If C=2, the same calculation gives m>3.32\cdot 10^{93334255463}.
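The Wolfram Alpha computation can be reproduced by bisection. The sketch below (ours) solves 2\sqrt[8]{x}=C\log x with natural logarithms, matching the derivation above; since the function is also positive for very small x, we start the bracket at x=100, where it is negative, to converge to the larger root:

```python
import math

# Solve 2 * x^(1/8) = C * log(x) by bisection (natural log, as in the text).
def crossover(C, lo=100.0, hi=1e9):
    f = lambda x: 2 * x ** (1 / 8) - C * math.log(x)
    assert f(lo) < 0 < f(hi)   # bracket must straddle the larger root
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return lo

x = crossover(1)
print(f"x = log m ~ {x:.3g}")  # close to the 3.03e7 quoted above
```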

It is also worth considering how much slower this algorithm is for a more reasonable value of m. Note that because we do not know the value of the constant C, it is unreasonable to analyze small values of m (like 10 or 100). Taking C=1, if we assume our graph has 10^{10} edges, then we have:

\frac{10^{10}\exp(\log^{7/8}(10^{10})\log\log 10^{10})}{10^{30}}\approx 15.58.

Similar to the behavior of the function in Section 6.2, this ratio gets larger before it gets smaller. When m=10^{1000}, it equals 2.95\cdot 10^{941}, and of course these numbers would be larger for a larger C. Thus we can conclude that for any reasonably sized graph (where reasonably sized means able to fit in the storage of the world's largest supercomputers), a traditional algorithm like Edmonds-Karp will be faster.
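These ratios can be recomputed in log-base-10 space so that the m=10^{1000} case does not overflow floating point. The sketch below (ours, again with C=1 and natural logarithms inside the exponent, as in the text) recovers both figures up to rounding:

```python
import math

# Compare m * exp(log^{7/8} m * loglog m) against m^3 in log10 space.
def log10_ratio(log10_m):
    L = log10_m * math.log(10)                          # natural log of m
    ours = log10_m + (L ** (7 / 8)) * math.log(L) / math.log(10)
    return ours - 3 * log10_m                           # log10(ratio)

print(round(10 ** log10_ratio(10), 2))   # ~15.6 for m = 10^10
print(round(log10_ratio(1000)))          # ~941, i.e. a ratio near 10^941 for m = 10^1000
```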

8 Conclusion

8.1 Summary

We began with preliminaries and provided a detailed list of all of the different subparts of the algorithm in [2] that would have to be implemented in order for the main algorithm to be implemented. After this, we discussed our implementations of portions of the algorithm, including the Low Stretch Decomposition, the Rebuilding Game and the Interior Point Method with insights into why certain implementation decisions were made. Finally, we examined the overall time complexity of the algorithm in detail and worked out a more precise asymptotic bound.

8.2 Future Directions of Study

Several portions of the algorithm in [2] remain unimplemented, most notably the expander decomposition and a fully correct Interior Point Method. The nested list provided in Section 2.2, combined with the stubs provided in the GitHub repository, should give a roadmap for any future attempt at implementing this algorithm. Separately, while the algorithm in [2] is the asymptotically fastest algorithm for max flow to date, as discussed in the final subsection it is not designed to be practical. Another interesting area of research would be to find the fastest max flow algorithm that can be implemented practically, or to prove that existing algorithms already meet that bar.

References

  • [1] Ittai Abraham and Ofer Neiman. Using petal-decompositions to build a low stretch spanning tree. SIAM Journal on Computing, 48(2):227–248, 2019.
  • [2] Li Chen, Rasmus Kyng, Yang P. Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva. Maximum flow and minimum-cost flow in almost-linear time. 63rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2022.
  • [3] Julia Chuzhoy and Thatchaphol Saranurak. Deterministic algorithms for decremental shortest paths via layered core decomposition. Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2478–2496, 2021.
  • [4] Shimon Even and Yossi Shiloach. An on-line edge-deletion problem. Journal of the ACM, 28:1–4, 1981.
  • [5] Thatchaphol Saranurak and Di Wang. Expander decomposition and pruning: Faster, stronger, and simpler. Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, pages 2616–2635, 2019.
  • [6] Daniel Sleator and Robert Tarjan. A data structure for dynamic trees. Journal of Computer and System Sciences, 26(3):362–391, 1983.
  • [7] Daniel Sleator and Robert Tarjan. Self-adjusting binary search trees. Journal of the Association for Computing Machinery, 32(3):652–686, 1985.
  • [8] Daniel Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. Proceedings of the 36th Annual ACM Symposium on Theory of Computing, STOC 2004, Chicago, IL, USA, pages 81–90, 2004.
  • [9] Daniel Spielman and Shang-Hua Teng. Solving sparse, symmetric, diagonally-dominant linear systems in time O(m^{1.31}). 44th Annual IEEE Symposium on Foundations of Computer Science, pages 416–427, 2003.