This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Blocking Trails for ff-factors of Multigraphs

Harold N. Gabow Department of Computer Science, University of Colorado at Boulder, Boulder, Colorado 80309-0430, USA. E-mail: [email protected]
Abstract

Blocking flows were introduced by Dinic [3] to speed up the computation of maximum network flows. They have been used in algorithms for problems such as maximum cardinality matching of bipartite graphs [19, Hopcroft and Karp] and general graphs [24, Micali and Vazirani], maximum weight matching of general graphs [17, Gabow and Tarjan], and many others. The blocking algorithm of [17] for matching is based on depth-first search. We extend the depth-first search approach to find ff-factors of general multigraphs. Here ff is an arbitrary integral-valued function on vertices, an ff-matching is a subgraph where every vertex xx has degree f(x)\leq f(x), an ff-factor has equality in every degree bound. A set of blocking trails for an ff-matching MM is a maximal collection 𝒜{\cal A} of edge-disjoint augmenting trails such that MA𝒜AM\bigoplus_{A\in{{\cal A}}}A is a valid ff-matching.

Blocking trails are needed in efficient algorithms for maximum cardinality ff-matching [20], maximum weight ff-factors/matchings by scaling [4, 12], and approximate maximum weight ff-factors and ff-edge covers [20]. Since these algorithms find many sets of blocking trails, the time to find blocking trails is a dominant factor in the running time. Our blocking trail algorithm runs in linear time O(m)O(m). In independent work and using a different approach, Huang and Pettie [20] present a blocking trails algorithm using time O(mα(m,n))O(m\alpha(m,n)). As examples of the time bounds for the above applications, an approximate maximum weight ff-factor is found in time O(mα(m,n))O(m\,\alpha(m,n)) using [20], and our algorithm eliminates the factor α(m,n)\alpha(m,n). Similarly a maximum weight ff-factor is found in time O(ΦlogΦmα(m,n)log(ΦW))O(\sqrt{\Phi\,{\rm log}\,\Phi}\,m\,\alpha(m,n)\,\,{\rm log}\,(\Phi W))\, using [20], (Φ=vVf(v)\Phi=\sum_{v\in V}f(v), WW the maximum edge weight) and our algorithm eliminates the α(m,n)\alpha(m,n) factor, making the time within a factor logΦ\sqrt{\,{\rm log}\,{\Phi}} of the bound for bipartite multigraphs.

The technical difficulty for this work stems from the fact that previous algorithms for both matching and ff-matching use vertex contractions to form blossoms. The dfs approach necessitates using edge contractions. As an example difficulty, edge contractions require extending the definition of blossom to ”skew blossoms” – configurations that must be reorganized in order to become valid blossoms. (The algorithm of [20] uses vertex contractions.)

1 Introduction

Blocking flows (aka blocking paths) are a fundamental tool for speeding up augmenting-type algorithms. They were introduced by Dinic [3] to obtain the first strongly polynomial algorithm for maximum network flow. Hopcroft and Karp [19] (and independently Karzanov [21]) applied the idea to maximum cardinality bipartite matching. They also proving its applicability to general graph matching, and Micali and Vazirani [24] gave the first blocking paths algorithm for general graph cardinality matching. Gabow and Tarjan applied blocking flows in cost-scaling algorithms for minimum cost network flow (aka degree-constrained subgraphs or ff-matchings) [16] and minimum cost general graph matching [17]. Here we have only cited some first applications of the idea, see reference [25, Sec. 9.6,10.8b,16.7a,17.5a,24.4a] for a comprehensive bibliography for the many uses of blocking paths.

We present an algorithm for blocking trails for ff-factors of general multigraphs. Here “general” is to be contrasted with “bipartite”, an important but much simpler special case. For the definitions of ff-factors, blocking trails, etc., please see the above abstract or the last paragraph of this section.

We turn to discussing recent work that applies blocking trails to multigraph ff-factors. Huang and Pettie [20] give algorithms for a host of problems on general multigraphs: They achieve time O(Φmα(m,n))O(\sqrt{\Phi}\,m\,\alpha(m,n)) for maximum cardinality ff-matching. Here Φ=vVf(v)\Phi=\sum_{v\in V}f(v), i.e., twice the size of an ff-factor. The same bound holds for a minimum cardinality ff-edge cover (i.e., ff is a lower bound for degrees in the desired subgraph). These algorithms use a scaling algorithm that finds an approximation to the weighted version of the problem. The scaling algorithm uses time O(mα(m,n)ϵ1logϵ1)O(m\,\alpha(m,n)\,\epsilon^{-1}\,{\rm log}\,\epsilon^{-1}) to find a (1ϵ)(1-\epsilon)-approximate maximum weight ff-matching, or a (1+ϵ)(1+\epsilon)-approximate minimum weight ff-edge cover.

The next applications are scaling algorithms for maximum weight ff-factors. To put the results in perspective first recall the classic time bounds for (unweighted) bipartite ff-factors [21, 8]:

(1.1) {O(n2/3m)G a simple graphO(Φm)G a multigraph.\begin{cases}O(n^{2/3}\;m)&G\text{ a simple graph}\\ O(\sqrt{\Phi}\;m)&G\text{ a multigraph.}\end{cases}

For maximum weight bipartite ff-factors, Gabow and Tarjan [16] presented a scaling algorithm whose time bound is just a scaling factor above (1.1), i.e., time O(n2/3mlog(nW))O(n^{2/3}\;m\;\,{\rm log}\,(nW)) for simple graphs and O(Φmlog(ΦW))O(\sqrt{\Phi}\;m\;\,{\rm log}\,(\Phi W)) for multigraphs. Here WW is the maximum (integral) edge weight.

Now consider general graphs. Duan, He and Zhang [4] give a scaling algorithm for maximum weight ff-factors of simple general graphs. The time is only logarithmic factors above (1.1), i.e., O~(n2/3mlogW)\widetilde{O}(n^{2/3}\,m\,\,{\rm log}\,W). In fact when all weights are 1 this is the first algorithm to achieve the bound of (1.1), to within logarithmic factors, for unweighted ff-factors of simple general graphs. Returning to weighted ff-factors, Gabow [12] gives a scaling algorithm that achieves the multigraph bound of (1.1), to within logarithmic factors, specifically O(ΦlogΦmα(m,n)log(ΦW))O(\sqrt{\Phi\,{\rm log}\,\Phi}\,m\,\alpha(m,n)\,\,{\rm log}\,(\Phi W)).

All of the above algorithms use a subroutine that finds a set of blocking trails. The time bounds cited above assume the blocking set is found by the subroutine presented by Huang and Pettie [20]. It finds a blocking set (more specifically, a set of blocking trails and cycles) in time O(mα(m,n))O(m\alpha(m,n)). Our blocking trail algorithm runs in time O(m)O(m). Using our algorithm the above time bounds all decrease by a factor α(m,n)\alpha(m,n). (The decrease occurs in [4] but is hidden by the use of O~\widetilde{O}.)

Our algorithm is based on the depth-first-search approach to blocking paths for matching, introduced in [17]; see also [11]. Dfs is a natural approach for finding blocking paths, since backing up prematurely may necessitate reexploring edges later on. However it introduces complications for managing blossoms – they are usually processed immediately on discovery, but they are necessarily postponed in a dfs regime [17, 11]. Extending the dfs approach to ff-factors of general multigraphs introduces new complications, which we now summarize.

Fundamentally, in previous settings blossoms are “shrunk” using vertex contraction. This holds even in previous ff-factor algorithms, e.g., see the definition of blossom in [14]. However dfs requires shrinking by edge contraction. A first consequence is new way of discovering blossoms, which we call “skew blossoms”. Skew blossoms have the same properties as ordinary blossoms. However they require a reorganization in order to achieve valid blossom structure (Lemma 2.1).

A second consequence of dfs is incomplete blossoms, i.e., blossoms that do not get completely processed because of the discovery of an augmenting trail. Incomplete blossoms occur in ordinary matching, but they present additional complexity in ff-factors. Specifically because of edge contraction, an alternating trail from a later search can reenter an incomplete blossom. (See Sec.2.) The analysis of Sec.5 shows these reentries present no problem.

Finally edge-contracted blossoms are not immediately handled by the standard min-max formula for maximum cardinality ff-matching (which we require to prove the blocking property). We show the essential structure of vertex-contracted blossoms is preserved by our algorithm (e.g., Lemma 5.2).

These complications also reveal the high-level difference between our approach and that of Huang and Pettie [20]. Their algorithm uses vertex contraction rather than edge contraction. Its fundamental idea is cycle cancellation, which does not occur in our algorithm.


The paper is organized as follows. Section 2 discusses blossoms, first reviewing the definition of ff-factor blossoms, and discussing the new types of blossoms in our algorithm. Appendix A gives the depth-first search algorithm for blocking ordinary matchings [17, 11], for possible help in the reading of Section 2. Section 3 presents the blocking algorithm. Section 4 proves the algorithm constructs a valid search structure, most importantly, the blossoms are valid. Section 5 proves the augmenting trails found by the algorithm form a blocking set. This completes the analysis of the algorithm, which is summarized in Theorem 5.5.

Terminology and conventions

We often omit set braces from singleton sets, denoting {v}\{v\} as vv. So SvS-v denotes S{v}S-\{v\}. We abbreviate expressions {v}S\{v\}\cup S to v+Sv+S. We use a common summing notation: If ff is a function on elements and SS is a set of elements then f(S)f(S) denotes sSf(s)\sum_{s\in S}f(s).

The trees in this paper are out-trees. Writing xyxy for an arc of a tree implies the arc joins parent xx to child yy. We extend parent and child relations to tree arcs, e.g., arc xyxy is the parent of yzyz. A node xx is an ancestor of a node yy allows the possibility x=yx=y, unless xx is a proper ancestor of yy. Similarly for arcs. A pendant edge has no children.

Graphs in this paper are undirected multigraphs. An edge joining vertices xx and yy is denoted {x,y}\{x,y\}. Note this notation ignores distinctions between multiple copies of an edge. Such distinctions are irrelevant to our algorithm. Also note that a loop at xx is denoted {x,x}\{x,x\}. Finally note that we use the usual shorthand notation xyxy to denote edge {x,y}\{x,y\} if context makes it clear that {x,y}\{x,y\} is a graph edge, not a tree arc. (This point is reiterated at the start of Section 3.)

In graph G=(V,E)G=(V,E) for SVS\subseteq V and MEM\subseteq E, δ(S,M)\delta(S,M) (γ(S,M)\gamma(S,M)) denotes the set of edges of MM with exactly one (respectively two) endpoints in SS. We omit MM (writing δ(S)\delta(S) or γ(S)\gamma(S)) when M=EM=E. A loop at vSv\in S belongs to γ(S)δ(S)\gamma(S)-\delta(S). For multigraphs GG all of these sets are multisets.

Figures in this paper use the following conventions. Matched edges are drawn heavy, unmatched edges light. Free vertices are drawn as rectangles. Trails are indicated by arrowheads on their edges. A figure illustrating the algorithm can be drawn in the given graph GG or in an auxiliary graph 𝒯\cal T (which is essentially the search tree; Fig.5(d.1) and (d.2) shows the two views). The view in GG is intuitive and informative but ambiguous: A given vertex vv can occur many times in the search, and an edge of the search leading to vv can potentially be used at any of these occurrences (again see Fig.5(d)). The same edge drawn in 𝒯\cal T goes to a new occurrence of vv, clearly less informative. Of course there is no such difficulty in ordinary matching.

For an undirected multigraph G=(V,E)G=(V,E) with function f:V+f:V\to\mathbb{Z}_{+}, an ff-factor is a subgraph where each vertex vVv\in V has degree exactly f(v)f(v). In a partial factor each vv has degree f(v)\leq f(v). vv is free if strict inequality holds.

We often call the edges of a partial ff-factor a matching or ff-matching.111An ff-matching is not to be confused with a bb-matching [25]. This paper never discusses the latter. So we refer to an ordinary matching as a 1-matching (i.e., ff is identically 1 for all vertices). def(x)def(x) is the deficiency of vertex xx in the current matching MM, def(x)=f(x)|δ(x,M)|2|γ(x,M)|def(x)=f(x)-|\delta(x,M)|-2|\gamma(x,M)|.

When discussing a matching MM, the M-type of an edge ee is MM or M¯\overline{M} depending on whether ee is matched or unmatched, respectively. We usually denote an arbitrary M-type as μ\mu, and μ(e)\mu(e) denotes the M-type of edge ee.

Consider a graph GG with an ff-matching MM. An augmenting trail is an alternating trail AA that begins and ends at a free vertex, such that MAM\oplus A is a valid matching, i.e., the two ends of the trail still satisfy their degree bound ff. (The trail may be closed, i.e., AA begins and ends at the same vertex. Alternating means consecutive edges of AA have opposite M-types.) An augmenting set is a collection of edge-disjoint augmenting trails 𝒜{\cal A} such that MA𝒜AM\bigoplus_{A\in{{\cal A}}}A is a valid ff-matching (i.e., rematching all the trails keeps every free vertex of MM within its degree bound ff.) A blocking trail set is a maximal augmenting set. It is the analog of a blocking flow.

2 Blossoms in 𝒅d

Blossoms are the main issue, and the main stumbling block, in any algorithm for matching general graphs. So it is appropriate to start with the new features of blossoms in our algorithm. This section starts by reviewing the natural definition of ff-factor blossoms presented in [14]. Then it introduces our new blossom variant, skew blossoms, and overviews the new potential difficulty for our algorithm, incomplete blossoms. Finally it presents a blossom substitute structure that allows our algorithm to treat weighted blossoms as ordinary vertices. Appendix A gives the predecessor of our algorithm and may be useful for Fig.2 showing incomplete blossoms.

Definition of Blossom, in 𝑮G

Our algorithm constructs blossoms that satisfy the definition presented in [14]. We call these ordinary blossoms. We briefly restate their definition below. Section 3 gives a similar definition for ordinary blossoms, but as they occur in our auxiliary graph 𝒯\cal T rather than GG.

Let GG be a multigraph with an ff-matching. A blossom BB is a subgraph of GG that has a distinguished vertex, called the base vertex β\beta, and a distinguished incident edge, called the base edge η\eta, whose end in BB is β\beta. (If β\beta is free then η\eta is an artificial unmatched edge.) The detailed definition is recursive. We begin with a graph G¯\overline{G}, the original graph GG with zero or more recursively defined blossoms contracted. A vertex in G¯\overline{G} is either a contracted blossom or a vertex of GG called an atom. The new blossom BB is defined as a closed trail CC. CC begins and ends at a vertex of G¯\overline{G} called the starter. Removing the starter from CC gives the blossom trail. The blossom trail must be alternating. Alternating means two consecutive edges meeting at an atom have opposite M-types; there is no restriction on edges meeting at a contracted blossom. Any atom may occur arbitrarily many times in CC. However a contracted blossom AA occurs at most once (a starter blossom occurs as both ends of CC). Further if AA is in the blossom trail its base edge must be one of its two incident CC-edges.

In the base case of the recursion the starter is an atom, which is taken to be β\beta. The first and last edges of CC must have the same MM-type, which is called the M-type of BB. η\eta is an edge incident to CC, whose M-type is opposite that of BB. (So a blossom whose base vertex is free has M-type M¯\overline{M}.)

In the recursive case the starter is a blossom (that is being enlarged). There is no restriction on the M-type of the starter’s two incident edges. The base vertex of BB is the base vertex of the starter. The same holds for the blossom’s M-type and its base edge, which must be incident to the trail CC. \Box


Our definition of blossoms differs from [14] in an important way. The algorithms of [14] contract the vertex sets of blossoms. This is not compatible with the depth-first search strategy of our algorithm. Instead our definition of blossoms contracts edge sets. (The first example is Fig.5(d).) This means that a vertex of GG occuring in a contracted blossom of CC may also have atomic occurrences in CC. (Fig.7 illustrates the common case of pendant edges, introduced in Lemma 4.4(ii).) In fact a vertex may also occur in many different blossoms of CC.

Our edge-contracted blossoms have three properties allowing them to function like vertex-contracted blossoms and make our algorithm correct. First is the most fundamental property of blossoms: An augmenting trail in G¯\overline{G} (graph GG with all blossoms contracted) gives an augmenting trail in GG, if we add appropriate trails through contracted blossoms. The appropriate trails are called Pi(v,β)P_{i}(v,\beta) in [14]. Here vv is a vertex in a blossom BB with base vertex β\beta, and the edges of this trail are in the blossom subgraph of BB. The trails are constructed in Lemma 4.4 of [14]. This construction works for edge-contracted blossoms if we make one simple change: In every closed trail CC of a blossom, replace each occurrence of an atomic vertex xx by a new vertex xix_{i}. Clearly the edge contractions used in our definition correspond to vertex contractions in the modified graph. So the PiP_{i} trails exist for edge-contracted blossoms.

The second property is a relaxed version of vertex-contracted blossoms: At any point in our algorithm a given vertex xx of GG occurs in at most one blossom. This is property ()(\dagger), stated after Proposition 4.8. ()(\dagger) insures a given blossom occurs at most once in CC. It is also crucial in establishing the blocking property (starting with the definition of labels in (5.3)).

The third property is that in the search structure, a blossom has one entering edge (its base) and every other edge is leaving. This property fails for edge-contracted blossoms. However it holds when our algorithm halts (Corollary 5.1). This allows us to use the min-max formula for maximum cardinality ff-matching in Section 5 to establish the blocking property of our algorithm.

We continue with several more comments on the definition of blossom. First note that the aforementioned trails Pi(v,β)P_{i}(v,\beta) explain the interpretation of “alternating” in the definition of blossom (the alternating trails through the contracted blossom exist regardless of the edges incident to it in CC). Second, a loop xxxx qualifies as a closed trail. So a blossom may have base vertex xx, closed trail xxxx, whose first and last edges are identical. Finally, a blossom is called heavy (light) when its M-type is MM (M¯\overline{M}), respectively.

Skew Blossoms

Skew blossoms extend the algorithmic definition of blossom. They have the same structure as ordinary blossoms after a reorganization. The analysis of skew blossoms is similar to the construction of PiP_{i} trails in [14]. Skew blossom are illustrated in Fig.1 and defined as follows.

Refer to caption
AAη(A)\eta(A)AAxxyy(b)η(A)\eta(A)η(A)\eta(A)AAxxyy(a)xxyy(d)η(A)\eta(A)xxyy(c)AA
Figure 1: Example blossoms: (a) A minimal (ordinary) blossom. (b) A skew blossom. Structure of a (heavy) skew blossom: (c) xβ(A)x\neq\beta(A). (d) x=β(A)x=\beta(A).

Consider a graph with various blossoms contracted, including a blossom AA that contains a vertex xV(G)x\in V(G). Let TT be an alternating trail, with first vertex xx and last vertex (the contracted) blossom AA, first edge xyxy and last edge η(A)\eta(A). TT is a skew blossom.

Note the first vertex xx is an atom in GG and does not belong to AA. Also it is possible that xy=η(A){xy}=\eta(A). We may also have xyxy a loop at xx.

Lemma 2.1

TT is a valid blossom, with base vertex xx and M-type μ(xy)\mu({xy}).

Proof: We wish to construct a blossom decomposition BB for the given skew blossom SS. This decomposition will be a sequence of closed trails CiC_{i}, i0i\geq 0, satisfying the definition of a valid blossom, and having the same set of edges as SS. The initial blossom trail C0C_{0} will start with the given trail TT from the atom xx to η(A)\eta(A), and follow a trail in AA to an occurrence of xx on an edge of M-type μ(xy)\mu({xy}).

To achieve this goal we will construct a sequence of alternating trails TiT_{i}, i=1,,ki=1,\ldots,k, where each TiT_{i} is a prefix of Ti+1T_{i+1}. The last trail TkT_{k} will be the above initial blossom trail C0C_{0}. Along the way we also construct the desired sequence of blossom trails Ci,i>0C_{i},i>0. The construction maintains the invariant that starting with C0C_{0}, adding the trails CiC_{i} in order gives a valid blossom. Furthermore the construction ends with the CiC_{i} consisting of the same edges as SS.

To begin observe that the vertex xx occurs as an atom in a unique closed trail CC of AA. Let xx refer to some fixed occurrence of xx in CC (chosen arbitrarily if there are more than one). Let β\beta and η\eta be the base vertex and base edge of the blossom corresponding to CC.

Suppose xβx\neq\beta. Let T1T_{1} be the subtrail of CηC\cup\eta that starts with the edge of (δ(x)γ(x))μ(xy)(\delta(x)\cup\gamma(x))\cap\mu({xy}), follows CC to β\beta and then traverses η\eta. T1T_{1} exists since CC alternates at xx. Let C1C_{1} be the trail CT1C-T_{1}. Clearly adding C1C_{1} to TkT_{k} gives a valid blossom, which contains all of CC.

Now suppose x=βx=\beta. We proceed exactly as before. If the M-type of CC is μ(xy)\mu({xy}) then T1T_{1} is the entire trail CηC\cup\eta and C1C_{1} is empty. If the M-type of CC is μ(xy)\mu({xy}) then T1T_{1} is the single edge η\eta and C1C_{1} is CC.

Now inductively assume Ti1T_{i-1} ends with the edge η(Ai1)\eta(A_{i-1}), where Ai1A_{i-1} is a blossom in the closed trail DD of AA. Proceed similar to the base case: Let β\beta and η\eta be the base vertex and base edge of DD. Let TiT_{i} be the subtrail of DηD\cup\eta that starts with edge η(Ai1)\eta(A_{i-1}), follows DD to β\beta and then traverses η\eta. Let CiC_{i} be the trail DTiD-T_{i}. Adding CiC_{i} to Tk1i1CjT_{k}\cup\bigcup_{1}^{i-1}C_{j} gives a valid blossom, which contains all of DD. As a special case it is possible that Ai1A_{i-1} is the starter blossom for DD. In that case Ti=Ti1T_{i}=T_{i-1} and Ci=DC_{i}=D.

Eventually we have Ai1=AA_{i-1}=A. In that case i=ki=k and TkT_{k} is as specified above. \Box

Incomplete Blossoms

A successful search finds an augmenting trail. This leads to the possibility of incomplete blossoms. A blossom BB is incomplete if the blossom step has some d(ui,fi)d(u_{i},f_{i}) invocation leading to a free vertex, so not all of the vertices in BB get scanned. Fig.2 gives examples of incomplete blossoms.

Refer to caption
(a)xyxyyyxxIIrr^{\prime}rrI1I_{1}I2I_{2}(c)(b)rrxxff
Figure 2: (a) An incomplete blossom II formed in successful search SS, with the trail from II to a free vertex. Arrows show the augmenting trail. II is formed by edge xy{xy}. Vertex rr^{\prime} is completely scanned in search SS, but unmatched edges from rr are not scanned. (b) A subsequent search SS^{\prime} enters II at vertex rr. SS^{\prime} cannot enter II at vertex rr^{\prime}. (c) An incomplete blossom may contain smaller incomplete blossoms: The search forms incomplete blossom I1I_{1} and then incomplete blossom I2I_{2}. The trail from xx to free vertex ff may contain other incomplete blossoms. Residual edges are drawn dashed.

In 1-matching incomplete blossoms II present little problem. In detail, the augment step removes all edges in the augmenting path PP. In 1-matching this removes all vertices on PP. The remaining vertices in II have all been completely scanned (by the dfs order). So II has the same properites as an ordinary blossom. This is not the case for multigraphs, since as illustrated in (a), a vertex rr on the augmenting trail remains in the graph with unscanned edges.

Eliminating Weighted Blossoms

Blocking trails are required in two types of applications of our algorithm: algorithms for maximum cardinality ff-factors and scaling algorithms for maximum weight ff-factors. Cardinality algorithms are handled directly by the algorithm of Section 3. But weighted algorithms require a modification of the graph, which we now describe.

In scaling algorithms, a blocking trail is found and rematched after each dual adjustment. The graph for the blocking trail is formed from the input graph by contracting every weighted blossom (i.e., blossom with positive dual variable z(B)z(B)). A blocking set trail can pass through such a blossom only once. In contrast an ordinary vertex that is not in a weighted blossom can appear in many different blocking set trails, and many different times in each trail. In order to have all vertices alike we replace each weighted blossom by a blossom substitute, illustrated in Fig.3.

In detail consider a contracted blossom BB with base vertex β\beta, and base edge η=aβ\eta={a\beta}. (η\eta does not exist if β\beta is a free vertex.) Let BmBm (BuBu) denote a typical matched (unmatched) edge incident to BB other than η\eta. The blossom substitute for BB discards the vertices of BβB-\beta and replaces them by a new vertex denoted bb and a new edge βb{\beta b}. It defines f(β)=f(b)=1f(\beta)=f(b)=1. If BB is a light blossom, βb{\beta b} is unmatched, each BmBm edge is replaced by matched bm{bm}, and each BuBu edge is replaced by unmatched βu{\beta u}. (Note that aside from η\eta, edges incident to β\beta in the original graph are treated as BmBm or BuBu edges in the substitute.) If BB is heavy then βb{\beta b} is matched, each BmBm edge is replaced by matched βm{\beta m}, and each BuBu edge is replaced by unmatched bu{bu}.

It is easy to see that blocking trails in the original graph GG correspond to blocking trails in the graph with substitutes, GG^{\prime}. In detail consider an alternating trail TT in GG and a weighted blossom BB. TT either contains β\beta or it does not. If TT contains β\beta, it contains η\eta (if it exists) and exactly one of the BmBm, BuBu edges. If TT does not contain β\beta it does not contain η\eta or BmBm or BuBu edge. Let TT^{\prime} be an alternating trail in GG^{\prime}. We give the details for a light blossom, heavy blossoms are symmetric. If TT^{\prime} contains β\beta, it contains η\eta (if it exists), and it either contains βb{\beta b}, exactly one bm{bm} edge and no bu{bu} edge, or it contains exactly one edge βu{\beta u} edge and no bm{bm} edge. If TT^{\prime} does not contain β\beta it does not contain η\eta, βb{\beta b} or any βu{\beta u} or bm{bm} edge.

Refer to caption
aammuuβ\betaη\etaaa(a)(b)bbaammη\etauuBBβ\betaaammη\etauuBBβ\betammuuβ\betaη\etabb
Figure 3: Weighted blossom substitute: (a) Light blossom. (b) Heavy blossom.

3 The blocking trail algorithm

This section presents the algorithm. It illustrates the algorithm’s execution and elaborates on details of the algorithm statement.

The overall algorithm is called find_trails. It uses a recursive depth-first search routine called dd. We begin by describing the data structures and giving an overview of dd. Each vertex xx has two lists: GL(x)GL(x) contains the edges ee incident to xx that can be used in a grow step from xx. ee may be matched or unmatched, and may be a loop. A nonloop ee starts out in two GL’s, and is removed from both lists when a grow step is executed from the first end. Similarly the second list BL(x)BL(x) contains the edges incident to xx that can trigger a blossom step. When find_blocking_set begins the lists are initialized as

GL(x)=δ(x)γ(x),BL(x)=.GL(x)=\delta(x)\cup\gamma(x),\ BL(x)=\emptyset.

Each search for an augmenting path begins with the GL’s and BL’s as they were at the end of the previous search. (We shall see that a BL entry is relevant only in the search that created it.)

We manage each BL(x)BL(x) using a routine pop that removes edges from a list. Specifically pop(L)(L) removes and returns an element ee of list LL, where ee can be chosen arbitrarily with one exception: The first invocation of pop(L)(L) must return the first element that was added to LL. Obviously this is a special case of a FIFO queue, but we use pop for clarity and greater generality.

The dd routine works in two phases. It starts by using its GL to do every possible grow step. Then it uses the BL to do every possible blossom step. In this second phase BL(x)BL(x) contains the edges where the dfs retreated from xx.

The algorithm constructs a search tree 𝒯\cal T consisting of the edges added in grow steps. To distinguish 𝒯\cal T from GG we use the terms node and arc for elements of V(𝒯)V({\cal T}) and E(𝒯)E({\cal T}), respectively. We view 𝒯\cal T as an out-tree, so every arc is directed, from parent to child. Consider an edge of GG e={x,y}e={\{x,y\}}. An occurrence of ee in 𝒯\cal T is denoted as xyxy or yxyx, where the arc is directed from the first vertex to the second, e.g., xyxy means xx is the parent.

Since a vertex xV(G)x\in V(G) can occur multiple times in 𝒯\cal T, we identify nodes of 𝒯\cal T by an incident 𝒯\cal T-arc. Specifically let yxyx be an arc in 𝒯\cal T. The notation yx˙y\dot{x} refers to the node of 𝒯\cal T at the xx end of yxyx, and xy˙x\dot{y} is the node at the yy end. So node yx˙{y\dot{x}} has yy the parent of xx or a child of xx.

The low-level algorithm represents blossoms using a data structure for set merging. The universe is the set of 𝒯\cal T-nodes. The sets are the vertex sets of the current blossoms. (So these blossoms are complete or incomplete.) In the pseudocode below, Bxy˙B_{{x\dot{y}}} is maintained as the set of all nodes in the blossom currently containing node xy˙{x\dot{y}}. It is a simple matter to transform this to a low-level linear-time set merging algorithm [15] (also see the simplfied incremental-tree set-merging algorithm of [13]).222A node zx˙{z\dot{x}} not in any blossom has Bxy˙={zx˙}B_{{x\dot{y}}}=\{{z\dot{x}}\}. This condition also holds for a loop xxxx when edge {x,x}\{x,x\} is a singleton blossom. This double usage will not cause any confusion. At any point in time 𝒯¯\overline{{\cal T}} denotes the graph with the current blossoms (i.e., the BB_{\cdot} sets) contracted.

An invocation of dd either returns normally or gets terminated, i.e., it stops execution prematurely. The latter occurs when an augmenting trail is discovered. When this occurs every invocation of dd in the current call chain is terminated. All those terminated invocations correspond to edges on the augmenting trail. This trail is added to 𝒜{\cal A}, the set of all augmenting trails that have been discovered.

find_trails does not rematch the trails of 𝒜{\cal A}, leaving that to the calling routine. For instance in applications of our algorithm for weighted matching, the rematching must undo the blossom substitutes. Rematching is further discussed in detail after Theorem 5.5 of Section 5. The other advantage of postponing the rematching is that it simplifies wording in the analysis.

In contrast, the algorithm maintains the quantities def(x)def(x), xV(G)x\in V(G), as the deficiency of vertex xx when the trails of 𝒜{\cal A} have been rematched. This allows proper identification of the free vertices.

procedure find_trails


initialize 𝒯\cal T to an empty forest and 𝒜{\cal A} to an empty set

for (xV(G)x\in V(G))

initialize def(x)def(x) to the deficiency of xx in the current matching

initialize lists GL(x)GL(x) to δ(x)γ(x)\delta(x)\cup\gamma(x) and BL(x)BL(x) to \emptyset


for (αV(G)\alpha\in V(G))

if (def(α)>0def(\alpha)>0 and no invocation d(zα˙)d({z\dot{\alpha}}) has returned) d(εα˙)d({\varepsilon\dot{\alpha}})

/* ε\varepsilon is an artificial vertex, ϵα\epsilon\alpha a matched artificial arc */

return 𝒜{\cal A}


procedure d(zx˙z\dot{x})


if (def(x)>0def(x)>0 and μ(zx)=M¯\mu(zx)=\overline{M} and (xα(x\neq\alpha or def(α)2)def(\alpha)\geq 2))

/* augment step */

add the 𝒯¯\overline{{\cal T}}-path from α\alpha to xx to 𝒜{\cal A}

decrement def(α)def(\alpha) and def(x)def(x)

terminate every currently executing invocation of dd, including this one


while ( edge xyGL(x)μ¯(zx)\exists\text{ edge }xy\in GL(x)\cap\overline{\mu}(zx))

remove xyxy from GL(x)GL(x) and GL(y)GL(y)

/* grow step */

add an arc xyxy, from node zx˙{z\dot{x}} to a new node yy, to 𝒯\cal T

Bxy˙{xy˙}B_{{x\dot{y}}}\leftarrow\{{x\dot{y}}\}

d(xy˙)d({x\dot{y}})


loop

/* blossom base test */

if (no occurrence of xx is in a blossom)

if ( edge xyBL(x)μ¯(zx)\exists\text{ edge }xy\in BL(x)\cap\overline{\mu}(zx)) xypop(BL(x))xy\leftarrow{\rm pop}(BL(x)) else break

/* blossom enlarge test */

else if (zx˙z\dot{x} or some descendant wx˙w\dot{x} is in a blossom)

if ( edge xyBL(x)\exists\text{ edge }xy\in BL(x)) xypop(BL(x))xy\leftarrow{\rm pop}(BL(x)) else break

/* blossom step */

let PP be the path in 𝒯¯\overline{{\cal T}} from Bzx˙B_{z\dot{x}} to Byx˙B_{{y\dot{x}}}

for (every arc uvuv of PP) merge Buv˙B_{{u\dot{v}}} into Bzx˙B_{{z\dot{x}}}

for (every arc uvuv of PP, as ordered in PP but skipping the first arc)

/* blossom-invocation loop */

d(vu˙)d({v\dot{u}})


add zxzx to BL(x)BL(x)

return

Figure 4: Blocking algorithm for ff-factors.

Fig.4 gives the pseudocode for our algorithm. Fig.5 gives examples of searches of dd.

Refer to caption
aa(a)(b)bbbb^{\prime}(f)(d.1)(d)vvvv(d.2)𝒯\cal Tvv(c)bb^{\prime}bbvvvv^{\prime}vv^{\prime}α\alphaz1z_{1}xxz2z_{2}z3z_{3}z4z_{4}(e)z3z_{3}x3x_{3}z1z_{1}x1x_{1}z2z_{2}x2x_{2}bbα\alpha
Figure 5: Examples of dd search. (a) Augmenting trail in a multigraph. Every edge is added in a grow step, no blossom steps. (b) A blossom step causes an augment, assuming bαb\neq\alpha or def(α)2def(\alpha)\geq 2. (c) bb^{\prime} is on the augmenting trail. The nontrail edges in 𝒯\cal T will not be in future augmenting trails, and they are not scanned again. (d) (d.1) A search executing 2 blossom steps, the second finding an augmenting trail. (d.2) Search tree 𝒯\cal T corresponding to (d.1). The 2 deepest occurrences of vv trigger blossom steps. (e) Cross edges trigger blossom steps: d(z1x˙1)d({z_{1}\dot{x}_{1}}), executes the first blossom step. Then 2 cross edges, drawn dashed, trigger blossom steps in d(zix˙i)d({z_{i}\dot{x}_{i}}), i=2,3i=2,3. (f) When z4x˙z_{4}\dot{x} returns 𝒯\cal T contains the alternating path shown by arrows. A blossom step for edge z4xz_{4}x discovers the augmenting trail (which contains every edge shown except xbxb^{\prime}). Now suppose edge bbb^{\prime}b does not exist. The blossom for z4xz_{4}x is discovered in d(z3x˙)d({z_{3}\dot{x}}), so its base edge is z3xz_{3}x. After that a blossom for z3xz_{3}x is discovered in d(z2x˙)d({z_{2}\dot{x}}). Its base edge is z2xz_{2}x. The blossom is skew. Finally a skew blossom for z2xz_{2}x is discovered in d(z1x˙)d({z_{1}\dot{x}}). Its base edge is z1xz_{1}x.
Refer to caption
AAzzxxzzyyzzxx(a)(b)(c)yyaaxxwwAAbbbbbbAAxx
Figure 6: Three types of blossom steps. (a) Initial formation of a blossom. The blossom AA on PP need not exist when the blossom is discovered, i.e., when yxyx is added to BL(x)BL(x). Special cases: The entire path PP can be one edge, the loop xxxx. PP can have only 1 vertex, P=x,y,xP=x,y,x. (b) Blossom AA is enlarged with a blossom trail. Special case: PP can be the single edge axax, a noop. (c) Skew blossom. Special case: PP can be a single edge, blossom AA’s base edge.

Let us comment on various statements in the algorithm. An invocation d(εα˙)d({\varepsilon\dot{\alpha}}) made in the main routine is called a search (for an augmenting trail). This invocation is terminated iff an augmenting trail is discovered. So d(εα˙)d({\varepsilon\dot{\alpha}}) returns if no augmenting trail is discovered. The discussion following Lemma 4.7 will show α\alpha remains free for the rest of the algorithm.

We make some conventions to justify the terminology for 𝒯\cal T. As usual we call 𝒯\cal T a search tree even though it is a forest (each search starts at a different root). Also we allow 𝒯\cal T to contain loops xxxx (added in grow steps). Similarly we allow the path PP in a blossom step to contain loops.

We continue discussing the blossom step. Fig.6 is a high level illustration of the structure of the three types of blossoms in dd. In the blossom base test, it is unclear why the pop routine returns the correct edge (i.e., it may have the wrong M-type). Lemma 4.1 will show correctness. In the blossom enlarge test note zx˙z\dot{x} is in a blossom is easily checked. Also let us informally explain this test (the explanation is proved rigorously below). The possibility that zx˙z\dot{x} is in a blossom covers ordinary blossoms (the possibility holds when either zx˙z\dot{x} is already the base vertex of a blossom, or zxzx or its reverse occurs on the blossom path). The possibility that zx˙z\dot{x} is not in a blossom but wx˙w\dot{x} is holds for skew blossoms. Next note that in the blossom step we identify each arc of PP by its nodes in 𝒯\cal T. The blossom step may have PP empty, specifically Byx˙=Bzx˙B_{{y\dot{x}}}=B_{{z\dot{x}}}. We call such a blossom step a noop since nothing changes (there are no Buv˙B_{{u\dot{v}}} merges or d(vu˙)d({v\dot{u}}) invocations). An invocation d(xy˙)d({x\dot{y}}) in a grow step adds xyxy to 𝒯\cal T, but d(vu˙)d({v\dot{u}}) in the blossom step does not cause a similar addition. So the call chain of dd can be a proper subset of the current search path. (More precisely, an invocation d(zx˙)d({z\dot{x}}) in the current call chain has the corresponding arc xzxz or zxzx in the current search path. Within blossoms, the call chain may omit search path edges.)

For more motivation let us explain the restriction on uvuv in the blossom-invocation loop. (The explanation gets rigorously proved in our analysis below.) Let abab be the first arc of PP, i.e., d(ab˙)d({a\dot{b}}) is not called in the blossom-invocation loop. (x=ax=a in Fig.6(a) and (c).) Let BB denote the blossom being formed. Consider two cases.

Suppose the first node of PP is not in a blossom. This means BB is either the first blossom formed with base edge zxzx, or BB is a skew blossom (Fig.6(a) or (c)). There is no need to invoke d(ab˙)d({a\dot{b}}) because the edges of GL(x)GL(x) have all been scanned before the blossom step. (In Fig.6(a) the scanning occurs because of edges zxzx and yxyx. In Fig.6(c) the scanning occurs in blossom AA or some subblossom.)

Suppose the first end of PP is in a blossom AA (Fig.6(b)). (We remark that this implies the last end of PP is also in AA.) There is no need for d(ab˙)d({a\dot{b}}) since ab˙a\dot{b} belongs to AA. So just like the skew blossom case, AA gives an invocation d(vu˙)d({v\dot{u}}) with vu˙=ab˙{v\dot{u}}={a\dot{b}} and μ(vu)=μ(ab)\mu(vu)=\mu(ab).

We turn to the issue of blossom completeness. A blossom BB becomes complete when η(B)\eta(B) returns. (Recall there are two possibilities if β(B)\beta(B) is a free vertex, say bb. In a search rooted at bb, i.e., b=αb=\alpha, η(B)\eta(B) is the artificial arc εα\varepsilon\alpha. Alternatively bb may occur in a search where it is not the root. In that case η(B)\eta(B) can be a matched GG-edge. In both cases BB can be completed when d(η(B))d(\eta(B)) returns.)

4 Valid search structure

The goal of this section is to show find_trails constructs a valid search structure, i.e., all blossoms are valid and all search paths are alternating [14]. We adopt this as an invariant maintained by the algorithm:


(I1)       Every step of the algorithm maintains a valid search structure, i.e., every blossom is valid and every path of the search structure is alternating.


The goal is achieved in Lemma 4.13.

All the proofs and other arguments in this section make the implicit assumption that all previous steps satisfy (I1). We explicitly prove (I1) for every step that modifies the search structure. As part of this analysis we will verify that the algorithm is well-defined. Specifically, in the blossom step it is unclear why the path PP exists for arbitrary yxBL(x)yx\in BL(x). The other statements of the algorithm present no problems in terms of their meaning.


We start with a fundamental concept of the algorithm. Consider a vertex xx. Let d(zx˙)d({z\dot{x}}) be the first invocation of dd for xx that returns. Define e1(x)e_{1}(x) to be the edge {z,x}\{z,x\}. This definition requires that d(zx˙)d({z\dot{x}}) is not terminated. For example an invocation d(εα˙)d({\varepsilon\dot{\alpha}}) that gives an unsuccessful search makes e1(α)=εαe_{1}(\alpha)=\varepsilon\alpha. If every search for a vertex xx is successful then e1(x)e_{1}(x) is an edge of GG reached in a search rooted at some free vertex x\neq x. e1(x)e_{1}(x) need not exist in this case.

e1(x)e_{1}(x) may be either arc zxzx or xzxz. In fact we will show the former always holds (Lemma 4.3) but this is not required at the moment. We usually abbreviate e1(x)e_{1}(x) to e1e_{1} when context establishes the identity of the node xx.

We begin the analysis by showing correctness of the pop routine.

Lemma 4.1

When an invocation d(zx˙)d({z\dot{x}}) satisfies the blossom base test, μ(zx)=μ¯(e1(x))\mu(zx)=\overline{\mu}(e_{1}(x)).

Remark: By definition e1(x)e_{1}(x) is the first edge added to BL(x)BL(x). So in d(zx˙)d({z\dot{x}}) pop returns e1(x)e_{1}(x), The lemma shows e1(x)e_{1}(x) has the required M-type μ¯(zx)\overline{\mu}(zx). So pop operates correctly in the blossom base test. Correctness of pop is not an issue in the blossom enlarge test since any edge can be removed from BL(x)BL(x).

Proof: First observe that no operation pop(BL(x))(BL(x)) has been performed previously in the blossom enlarge test, since that would mean a blossom already contains an occurrence of xx.

If the blossom base test is satisfied then BL(x)BL(x) is nonempty so e1(x)e_{1}(x) exists. Suppose for contradiction that μ(zx)=μ(e1)\mu(zx)=\mu(e_{1}). Let xyxy be the edge that satisfies the blossom test, i.e., xyBL(x)μ¯xy\in BL(x)\cap\overline{\mu}. So μ(xy)=μ¯(zx)=μ¯(e1)\mu(xy)=\overline{\mu}(zx)=\overline{\mu}(e_{1}). d(yx˙)d({y\dot{x}}) has returned (since xyBL(x)xy\in BL(x)). e1e_{1} was added to BL(x)BL(x) before d(yx˙)d({y\dot{x}}) returned (by definition of e1e_{1}. Also e1yxe_{1}\neq yx since the edges have opposite M-types.) So d(yx˙)d({y\dot{x}}) satisfied the blossom test before d(zx˙)d({z\dot{x}}). This triggered the operation pop(BL(x))(BL(x)), contradiction. \Box

A blossom has a simple structure in 𝒯\cal T:

Proposition 4.2

The nodes in a blossom BB form a subtree of 𝒯\cal T. β(B)\beta(B) is the subtree’s root and η(B)\eta(B) is the parent arc of β(B)\beta(B).

Proof: We induct on the size of the blossom. Let PP be the path in 𝒯¯\overline{{\cal T}} forming blossom BB. Before the blossom step merges BB_{\cdot} values, the nodes on path PP form a subtree of 𝒯¯\overline{{\cal T}}. The inductive hypothesis, applied to each blossom on PP, implies this is a subtree of 𝒯\cal T. After the merge this subtree corresponds to the set of nodes of BB.

Let d(zx˙)d({z\dot{x}}) be the invocation that forms blossom BB. Suppose the blossom base test is satisfied. PP is the path from zx˙z\dot{x} to Byx˙B_{{y\dot{x}}}. So zx˙z\dot{x} is β(B)\beta(B) and zxzx is η(B)\eta(B), as claimed in the lemma. Suppose the blossom enlarge test is satisfied. By induction zx˙z\dot{x} is β(Bzx˙)\beta(B_{{z\dot{x}}}) before the merge of BB_{\cdot} values. As in the blossom base test, zx˙z\dot{x} remains β(B)\beta(B). \Box

Now we continue to investigate e1(x)e_{1}(x). In a slight abuse of notation we write d(e1(x))d(e_{1}(x)) to denote d(zx˙)d({z\dot{x}}) for the invocation defining e1(x)e_{1}(x). We next show this invocation is for a 𝒯\cal T-arc zxzx. More generally, every invocation for an occurrence of xx in the call chain to d(e1(x))d(e_{1}(x)) corresponds to a 𝒯\cal T-arc. Fig.5(f) illlustrates this with arcs zixz_{i}x, i=1,,4i=1,\ldots,4, z4x=e1(x)z_{4}x=e_{1}(x), as well as bz4=e1(z4)b^{\prime}z_{4}=e_{1}(z_{4}).

Lemma 4.3

Every invocation d(zx˙)d({z\dot{x}}) made before d(e1(x))d(e_{1}(x)) returns has zxzx an arc of 𝒯\cal T.

Remark: The lemma even includes invocations made in searches before d(e1)d(e_{1}) is invoked.

Proof: Suppose d(zx˙)d({z\dot{x}}) is executed for an arc xzxz. We show some invocation d(wx˙)d({w\dot{x}}) returns before d(zx˙)d({z\dot{x}}) is invoked. Clearly this implies d(e1(x))d(e_{1}(x)) returns before d(zx˙)d({z\dot{x}}) is invoked. The lemma follows.

Let axax be the parent arc of xzxz. (axax may be the artificial arc εα\varepsilon\alpha.) We analyze two cases depending on whether d(ax˙)d({a\dot{x}}) returns before or after ax˙{a\dot{x}} has entered a blossom.


Case ax˙{a\dot{x}} is not in a blossom when d(ax˙)d({a\dot{x}}) returns: Since ax˙=zx˙{a\dot{x}}={z\dot{x}} this case implies d(zx˙)d({z\dot{x}}) has not been invoked when d(ax˙)d({a\dot{x}}) returns. So axax is the desired edge wxwx.


Case ax˙{a\dot{x}} is in a blossom when d(ax˙)d({a\dot{x}}) returns: Let BB be the blossom. We claim ax=η(B)ax=\eta(B). To prove the claim suppose the contrary. Proposition 4.2 implies axax is in some blossom path PP. The code for a blossom step implies d(ax˙)d({a\dot{x}}) returns before the blossom is formed. This contradicts the case definition. So the claim holds.

Let yxBL(x)yx\in BL(x) be the edge triggering the blossom step that makes ax˙a\dot{x} a base. yxyx was added to BL(x)BL(x) when d(yx˙)d({y\dot{x}}) returned. So yxyx returns before ax˙a\dot{x} becomes a base vertex. This is before blossom BB is formed. So it is before ax˙=zx˙{a\dot{x}}={z\dot{x}} is in a blossom. Thus it is before d(zx˙)d({z\dot{x}}) is invoked. So yxyx is the desired edge wxwx. \Box

Consider the special case of the lemma for occurrences of xx in searches before d(e1(x))d(e_{1}(x)) is invoked. Every such occurrence of xx is on a trail of 𝒜{\cal A}. In proof, suppose zxzx is the arc leading to the occurrence of xx. The corresponding invocation d(zx˙)d({z\dot{x}}) does not return before d(e1)d(e_{1}) returns (definition of e1e_{1}). So it is terminated, i.e., zxzx is on the augmenting path. Furthermore zxzx is not in a blossom (that would also imply d(zx˙)d({z\dot{x}}) has returned). So zxE(𝒜)zx\in E({{\cal A}}) as claimed.

The next result establishes the significance of e1e_{1}. We extend our notational convention to abbreviate μ(e1(x))\mu(e_{1}(x)) to μ\mu when xx is clear. Let 𝒜{\cal A} be defined when d(e1)d(e_{1}) is invoked, i.e., 𝒜{\cal A} contains the augmenting trails in the searches before e1e_{1} is reached.

Lemma 4.4

Fix a vertex xV(G)x\in V(G).

(ii) Let zxzx be an arc in 𝒯\cal T, zxe1zx\neq e_{1}. There are two possibilities for zxzx:

(i.ai.a) zxzx enters 𝒯\cal T before d(e1)d(e_{1}) is invoked: Either zxE(𝒜)zx\in E({{\cal A}}) or zxzx is an ancestor of e1e_{1} in 𝒯\cal T, as well as 𝒯¯\overline{{\cal T}}, the contraction of 𝒯\cal T when e1e_{1} enters 𝒯\cal T.

(i.bi.b) zxzx enters 𝒯\cal T after d(e1)d(e_{1}) returns: zxzx is a pendant edge of 𝒯\cal T throughout the execution of find_trails. Furthermore zxzx has M-type μ\mu.

(iiii) Consider an invocation d(zx˙)d({z\dot{x}}) where μ(zx)=μ¯\mu(zx)=\overline{\mu}. If d(zx˙)d({z\dot{x}}) pops an edge from BL(x)BL(x) then GL(x)GL(x) is empty. So no edge of δ(x)γ(x)\delta(x)\cup\gamma(x) is added to 𝒯\cal T after the pop (this includes both the current search and future searches).

Remarks: (i.ai.a) applies even if e1e_{1} does not exist. If e1e_{1} exists, it satisfies (i.ai.a) (e1e_{1} is an ancestor of itself). It may also satisfy (i.bi.b) (e1e_{1} has M-type μ\mu and it may be pendant).

Lemma 4.11(iiiiii) extends (i.ai.a) to describe the ancestors of e1e_{1} after e1e_{1} is popped. In Fig.5(d.2) the leaf occurrence of vv illustrates (i.bi.b).

We later prove that (iiii) holds without the requirement on μ(zx)\mu(zx).

Proof: (ii) First observe that (ii) covers all possibilities for zxzx: In proof the only case not covered is that d(zx˙)d({z\dot{x}}) is invoked after d(e1)d(e_{1}) is invoked and before d(e1)d(e_{1}) returns. Such an invocation returns before d(e1)d(e_{1}) returns. This contradicts the definition of e1e_{1}.

(i.ai.a) Assume d(zx˙)d({z\dot{x}}) is invoked in the same search as d(e1(x))d(e_{1}(x)). (If not d(zx˙)d({z\dot{x}}) is invoked in a previous search, and as shown above zxE(𝒜)zx\in E({{\cal A}}).) d(zx˙)d({z\dot{x}}) is invoked before d(e1)d(e_{1}) is invoked (assumption) and does not return before d(e1)d(e_{1}) returns (definition of e1e_{1}). So d(e1)d(e_{1}) returns during the execution of d(zx˙)d({z\dot{x}}). (Notice this accounts for d(zx˙)d({z\dot{x}}) being terminated before it returns.) We have shown e1e_{1} descends from zxzx in 𝒯\cal T. zxzx is not in a contracted blossom BB of 𝒯¯\overline{{\cal T}}: The contrary makes zxzx an arc in the blossom path of BB. That implies d(zx˙)d({z\dot{x}}) returns before BB is formed, contradiction. (zx˙z\dot{x} may be the base of a contracted blossom.)

(i.bi.b) d(e1)d(e_{1}) returns with GL(x)μ¯GL(x)\cap\overline{\mu} empty. zxzx is added to 𝒯\cal T in a grow step so this implies it has M-type μ\mu. Furthermore it implies d(zx˙)d({z\dot{x}}) does not execute any grow steps, i.e., zxzx is a pendant edge of tree 𝒯\cal T. Thus (i.bi.b) holds.

(iiii) zxe1zx\neq e_{1} since the two edges have opposite M-types. So d(zx˙)d({z\dot{x}}) returns after d(e1)d(e_{1}). d(e1)d(e_{1}) returns with GL(x)μ¯GL(x)\cap\overline{\mu} empty. d(zx˙)d({z\dot{x}}) removes every edge of GL(x)μGL(x)\cap\mu before it pops any edge from BL(x)BL(x). So GL(x)GL(x) becomes empty before d(zx˙)d({z\dot{x}}) pops an edge from BL(x)BL(x). \Box

As an example suppose xx becomes a free vertex in an unsuccessful search. In (i.ai.a) observe that there may be a path of multiple occurrences of xx. (i.bi.b) shows that after the unsuccessful search, every occurrence of xx is a leaf. So xx will not be on an augmenting trail after the unsuccessful search.

The lemma contains the seeds of the entire algorithm, which we now sketch. (The sketch along with other structural details is proved in Lemmas 4.94.12.) Consider a fixed vertex xx. Each time an edge of BL(x)BL(x) is popped it triggers a blossom step. The blossom step may be a noop (i.e., the current blossom containing xx does not get enlarged). Ignoring these noops there are three successive “stages” for vertex xx: Stage 1 accounts for the time up until e1(x)e_{1}(x) is popped and its blossom is formed. After that BL(x)BL(x)-pops trigger blossoms for leaf occurrences of xx (corresponds to Lemma 4.4(i.bi.b)). We call this Stage 2. After that BL(x)BL(x)-pops form skew blossoms (corresponds to Lemma 4.4(i.ai.a)). This is Stage 3. The search may halt, because of an augmenting trail, at any point during this progression (e.g., before e1e_{1} is defined, before it is popped, etc.). Also at any point in the progression arbitrarily many blossoms may be formed by pops from lists BL(x)BL(x^{\prime}) for various vertices xxx^{\prime}\neq x, and such blossoms may contain xx. For example in Stage 1 blossoms may absorb occurrences of vertex xx before e1e_{1} is popped.

Vertex xx can occur in a blossom in only one search for an augmenting trail, the search where d(e1)d(e_{1}) returns (Lemma 4.4(i.bi.b) and its proof). So the pendant edge and skew blossom stages can only occur in the search where d(e1)d(e_{1}) returns. (Regarding noops, the blossom step for e1e_{1} may be a noop, e.g., consecutive arcs ax,xzax,xz in a blossom path with ax=e1(x)ax=e_{1}(x)d(zx˙)d({z\dot{x}}) pops axax. Pendant edges and skew blossoms do not give noops. Other noops at arbitrary points may occur.)


As previously mentioned we must show the blossom step is well-defined. It is convenient to record that as the following property:


()(*)        At the moment an invocation d(zx˙)d({z\dot{x}}) pops an edge yxyx from BL(x)BL(x), yx˙{y\dot{x}} descends from zx˙{z\dot{x}} in 𝒯¯\overline{{\cal T}}.


Notice the contraction 𝒯¯\overline{{\cal T}} need not be the same as when yxyx was added to BL(x)BL(x).

It is helpful to give some illustrations for ()(*). Fig. 7 shows ()(*) needn’t hold if we change 𝒯¯\overline{{\cal T}} to 𝒯\cal T.

Refer to caption
xxyysszzzzssxxyyzzssxxyyxxxx
Figure 7: Blossom path P=sy,yxP=sy,yx does not descend from zx˙z\dot{x} in 𝒯\cal T (b), but does so in 𝒯¯\overline{{\cal T}} (c).

Next consider Fig.8. It might seem to violate (I1) because dd executes as follows: The invocation d(zx˙)d({z\dot{x}}) executes a blossom step for edge yxBL(x)yx\in BL(x). The blossom step invokes d(ys˙)d({y\dot{s}}). It executes a blossom step for edge rsBL(s)rs\in BL(s). But Brs˙B_{{r\dot{s}}} is not a descendant of Bys˙=Bzx˙B_{y\dot{s}}=B_{{z\dot{x}}}. This purported counterexample is incorrect because the picture in (a) is impossible, the true picture is (b).

Refer to caption
zzxxssyyrrssyyrrxxzz
Figure 8: (a) A flawed counterexample: Blossom step for edge rsBL(s)rs\in BL(s) violates invariant (I1). However the configuration is impossible – the true configuration is (b) (assuming rr is visited before zz). Note the blossom step for sxBL(x)sx\in BL(x) is a noop.

In our analysis of a fixed vertex xx we assume ()(*) for all previous blossoms. This implies an enhanced version of the assumption: Suppose we are considering a blossom formed when an invocation d(zx˙)d({z\dot{x}}) pops an edge yxyx. Let d(st˙)d({s\dot{t}}) be in the call chain from d(zx˙)d({z\dot{x}}). ()(*) implies that when d(st˙)d({s\dot{t}}) returns, β(Bst˙)\beta(B_{{s\dot{t}}}) descends from zx˙z\dot{x}.

()(*) implies any blossom formed after d(zx˙)d({z\dot{x}}) is invoked but before the pop has its base vertex descending from zx˙z\dot{x}.

To set the stage for the ensuing discussion we extend Section 2’s definition of blossom. That definition specifies the properties of a blossom BB in the given graph GG. We now include the corresponding properties as they occur in the search tree 𝒯\cal T.

Definition of Blossom, in 𝑮G and 𝓣\cal T

\bullet Starter: The starter can be a node xx of 𝒯\cal T where xx does not occur in the 𝒯\cal T-subtree of any blossom. xx becomes the blossom base β\beta. The new blossom BB is formed by adding a blossom path PP that begins at node β\beta and ends at another 𝒯\cal T-occurrence of β\beta. (In the code for dd, PP ends at Byx˙B_{{y\dot{x}}}, and ordinary blossoms have Byx˙={yx˙}B_{{y\dot{x}}}=\{{y\dot{x}}\}.) Blossom BB’s occurrence in GG is formed by identifying the two node occurrences of β\beta. The first and last edges of PP must have the same M-type μ\mu. The parent arc of β\beta is the base edge η\eta, which must have opposite M-type μ¯\overline{\mu}. (A special case is where the blossom path PP is the loop ββ\beta\beta. BB’s occurrence in GG is also that loop.)

A blossom AA can be enlarged by using AA as the starter. Blossom BB is formed by adding a blossom path PP that begins at some node aa of AA, and ends at an occurrence of some GG-vertex in subtree AA. (In the code for dd the starter is the blossom A=Bzx˙A=B_{{z\dot{x}}}. As above PP ends at Byx˙B_{{y\dot{x}}}, and ordinary blossoms have Byx˙={yx˙}B_{{y\dot{x}}}=\{{y\dot{x}}\}.) Blossom BB’s occurrence in GG is formed by identifying yx˙{y\dot{x}} with node zx˙z\dot{x} in AA. (Note from the code of dd that PP starts with some node aa in AA, not necessarily zx˙{z\dot{x}}.) The M-types of the extreme edges of PP are arbitrary. The base vertex and edge of BB are inductively defined as those of AA.

For skew blossoms the starter is a node not in any blossom subtree, which becomes the base vertex β\beta. PP ends in a contracted blossom containing an occurrence of β\beta. (In the code for dd this blossom is Byx˙{yx˙}B_{{y\dot{x}}}\neq\{{y\dot{x}}\}.) All other requirements for skew blossoms are the same as ordinary blossoms.


\bullet Blossom trail: In the code for a blossom step, PP is a path in 𝒯¯\overline{{\cal T}}, the contraction of 𝒯\cal T when blossom BB is triggered. Assuming ()(*), the 𝒯\cal T-path PP is guaranteed to have all required properties of the blossom trail. Specifically, PP is guaranteed to be alternating, since the arcs of PP are in 𝒯\cal T. If AA is a contracted blossom on PP, PP is guaranteed to contain AA’s base edge, which is the parent arc of AA. This also guarantees that AA occurs only once in the trail CC of the blossom. This property extends to a starter blossom AA: Its base edge is the parent of the contracted AA, so AA does not reoccur in the blossom trail. A GG-vertex may occur arbitrarily many times in PP. \Box


We start with some simple properties of blossoms. Consider the moment in time when a blossom BB becomes complete, i.e., d(η(B))d(\eta(B)) returns. Let xx be a vertex that occurs in BB.

Proposition 4.5

When BB becomes complete, an invocation d(wx˙)d({w\dot{x}}) has returned for two edges wxwx of opposite M-type. Moreover e1(x)e_{1}(x) has been popped.

Proof: Wlog assume BB is the first blossom to contain an occurrence of vertex xx.


Case xx occurs as β(B)\beta(B): Let yxyx be the first edge of BL(x)BL(x) to trigger a blossom (not necessarily BB). The invocations d(yx˙)d({y\dot{x}}) and d(η(B))d(\eta(B)) are as desired. Obviously e1(x)=yxe_{1}(x)=yx was popped.


Case xx does not occur as β(B)\beta(B): xx is not the first or last vertex of the blossom path PP, since xx is not the base vertex and xx does not occur in a blossom when BB is formed. So PP contains an arc axax. axax is followed by an arc xbxb in PP, since ax˙a\dot{x} is not in a blossom on PP (again by the choice of BB as first blossom). The invocations d(ax˙)d({a\dot{x}}) and d(bx˙)d({b\dot{x}}) are as desired.

Consider the moment when d(bx˙)d({b\dot{x}}) executes the blossom enlarge test. GL(x)GL(x) has been emptied. In particular d(e1(x))d(e_{1}(x)) has been invoked and has returned, adding e1(x)e_{1}(x) to BL(x)BL(x). So d(bx˙)d({b\dot{x}}) pops e1(x)e_{1}(x) unless some previous invocation popped it. \Box

The next two results give the properties of the two blossom tests.

Proposition 4.6

Suppose d(zx˙)d({z\dot{x}}) executes the blossom enlarge test.

(ii) Suppose zx˙z\dot{x} is not in a blossom. d(zx˙)d({z\dot{x}}) satisfies the blossom enlarge test \Longleftrightarrow\ the pop triggers a skew blossom step.

(iiii) zx˙z\dot{x} is in a blossom \Longleftrightarrow\ some current blossom BB has zx=η(B)zx=\eta(B) or arc xzxz in the blossom path of BB.

Proof: (ii) is clear. For (iiii) consider two cases:

zxzx a 𝒯\cal T-arc: During the execution of d(zx˙)d({z\dot{x}}), the current blossom containing zx˙z\dot{x}, Bzx˙B_{{z\dot{x}}}, has base vertex zx˙z\dot{x} (Proposition 4.2). So the first alternative (zx˙=η(B){z\dot{x}}=\eta(B)) holds.

xzxz a 𝒯\cal T-arc: Since the test is executed by the invocation d(zx˙)d({z\dot{x}}), xzxz is in the blossom path of a current blossom. So the second alternative holds. \Box

Lemma 4.7

(ii) d(zx˙)d({z\dot{x}}) with zxe1(x)zx\neq e_{1}(x) pendant is a noop, i.e., it returns without executing a grow or blossom step.

(iiii) After a search that invokes d(e1)d(e_{1}), xx only occurs as a leaf of 𝒯\cal T. Such occurrences do not enter blossoms.

Proof: (ii) d(zx˙)d({z\dot{x}}) goes directly to the blossom tests. Clearly zx˙z\dot{x} is not in a blossom at this point. So the blossom enlarge test fails. Suppose the blossom base test is satisfied. Then BL(x)BL(x) contains an edge yxyx with

(4.1) μ(yx)=μ¯(zx)=μ¯(e1(x))\mu(yx)=\overline{\mu}(zx)=\overline{\mu}(e_{1}(x))

(Lemma 4.4(i.bi.b) gives the second equality). d(yx˙)d({y\dot{x}}) has returned. d(e1(x))d(e_{1}(x)) returns before d(yx˙)d({y\dot{x}}) returns (yxe1(x)yx\neq e_{1}(x) by (4.1)). So e1(x)e_{1}(x) is in BL(x)BL(x) during the execution of d(yx˙)d({y\dot{x}}). Thus d(yx˙)d({y\dot{x}}) satisfies the blossom base test (using (4.1)). It performs a blossom step. The occurrence of yx˙y\dot{x} in a blossom means d(zx˙)d({z\dot{x}}) does not execute the blossom base test. Contradiction.

(iiii) Lemma 4.4(i.bi.b) shows any occurrence of xx in a search after the search invoking d(e1)d(e_{1}) is a leaf of 𝒯\cal T. A pendant edge yxyx can enter a blossom only as the last edge of the blossom path, i.e., when it is popped from BL(x)BL(x). But part (ii) shows the corresponding execution of dd is a noop, i.e., it does not trigger a blossom step. \Box

For intuition note this extension of part (iiii): The occurrences of xx in part (iiii) never enter augmenting trails. In proof no grow step is executed from the xx occurrence, by part (ii) and the nonoccurrence of xx in a blossom. So the 𝒯\cal T-path to xx is not extended and does not lead to a free vertex. (This property is not required in the logic of our algorithm analysis.) In particular this extension shows that if a free vertex α\alpha has a corresponding invocation d(zα˙)d({z\dot{\alpha}}) that returns (as in the code for dd) α\alpha will never enter an augmenting trail in the rest of the execution of find_trails.

We now begin to track the status of a fixed vertex xV(G)x\in V(G) during the execution of find_trails. The following result applies to searches at the start of find_trails, specifically, before e1(x)e_{1}(x) is discovered.

Proposition 4.8

Consider a search where no invocation d(zx˙)d({z\dot{x}}) returns. Any 𝒯\cal T-arc zxzx of the search is on an augmenting trail. Furthermore xx does not occur in any blossom of the search.

Remarks: The search may contain arcs xzxz directed from xx.

The proposition may or may not apply to a search after e1(x)e_{1}(x) is defined.

Proof: Consider a 𝒯\cal T-arc zxzx. (Such an arc always exists; for a free vertex α\alpha it is the artificial arc εα\varepsilon\alpha.) The corresponding invocation d(zx˙)d({z\dot{x}}) was terminated, i.e., zxzx is on an augmenting trail.

zxzx is not an arc in a blossom path, again since d(zx˙)d({z\dot{x}}) did not return. zxzx is not the base edge of a blossom, since the first blossom containing zx˙{z\dot{x}} contains e1(x)e_{1}(x). So as claimed, xx does not occur in a blossom. \Box

The following property of our edge-contracted blossoms is needed for both Sections 4 and 5:


()(\dagger)        At any moment in find_trails a given vertex xx occurs in at most one blossom.


Observe that xx can enter a blossom only in the search that invokes d(e1)d(e_{1}): Proposition 4.8 shows this for searches before the invocation and Lemma 4.7(iiii) shows it for after. To establish ()(\dagger) we must show it holds throughout the search invoking d(e1)d(e_{1}). We have broken this search up into three time periods, Stages 1–3 for xx. The next several lemmas prove ()(\dagger) in each of these stages. Specifically Lemma 4.9 proves ()(\dagger) at any moment in Stage 1. Lemma 4.11(ii) does this for Stage 2, and Lemma 4.12 for Stage 3.


Now assume some invocation d(zx˙)d({z\dot{x}}) in find_trails returns, i.e., e1(x)e_{1}(x) exists. We account for the time up to and including the formation of the blossom for the pop of e1(x)e_{1}(x). Recall this time period is called Stage 1 for xx. It is possible that the pop of e1(x)e_{1}(x) is a noop, but it is still included Stage 1. Fig. 9(a) gives an example. It is also possible that e1(x)e_{1}(x) does not get popped. In that case Stage 1 ends after the last blossom containing xx is formed. Again this may be a noop.

Refer to caption
BBxxwwxxccbbaaxxdde1(x)e_{1}(x)wweeη\etaffη\etawwxxxxwwhhgg
Figure 9: (a) Stage 1 search: ax˙{a\dot{x}} is the first occurrence of xx to enter a blossom. bx˙{b\dot{x}} is a leaf added after d(ax˙)d({a\dot{x}}) returns. d(cw˙)d({c\dot{w}}) forms a blossom containing dx˙{d\dot{x}}. d(dx˙)d({d\dot{x}}) pops e1(x)e_{1}(x). Stage 1 ends when that blossom step is complete. Only the pendant edge bxbx remains, it will be processed in Stage 2. (b) Stage 3: BB is a complete blossom when d(η)d(\eta) returns. BB is enlarged by skew blossoms in d(xw˙)d({x\dot{w}}) and d(hx˙)d({h\dot{x}}).
Lemma 4.9

xx occurs in at most one blossom at any moment in Stage 1.

Proof: Consider a point in Stage 1 where there is a unique blossom BB containing one or more occurrences of xx. Let CC be the next blossom formed in Stage 1. If xx occurs in CC, we will show η(C)=η(B)\eta(C)=\eta(B). Clearly this implies the lemma holds throughout Stage 1. Note that Stage 1 ends when CC is formed by a pop of e1e_{1}, unless e1e_{1} is never popped because of an augmenting trail.

Observe that η(C)\eta(C) descends from η(B)\eta(B). In proof Proposition 4.5 shows that when d(η(B))d(\eta(B)) returns, e1e_{1} has been popped and so Stage 1 has ended. Thus CC is formed before d(η(B))d(\eta(B)) returns. This implies η(C)\eta(C) descends from η(B)\eta(B).

CC is irrelevant unless it contains an occurrence of xx. If CC does not contain a new occurrence of xx then η(C)=η(B)\eta(C)=\eta(B) as desired. Suppose CC has a new occurrence of xx, say on edge {y,x}\{y,x\}.

{y,x}\{y,x\} is not a pendant edge yxyx. In proof, {y,x}\{y,x\} is in the blossom path of CC. yxyx pendant must be the last edge of this blossom path. So the blossom is triggered by the pop of yxyx. yxyx is popped in an invocation d(zx˙)d({z\dot{x}}). This pop was preceded by the pop of e1(x)e_{1}(x). But that ended Stage 1.

We conclude {y,x}\{y,x\} is an ancestor of e1e_{1}, by Lemma 4.4(ii). Let P1P_{1} be the 𝒯\cal T-path from BB to e1e_{1}. We can assume {y,x}\{y,x\} is the arc yxyx on P1P_{1}. η(C)\eta(C) is an ancestor of e1e_{1} (since CC contains xx, an ancestor of e1e_{1}). η(C)\eta(C) is not an edge of P1P_{1}, since every invocation for an arc on P1P_{1} returns before BB is formed. Thus η(C)=η(B)\eta(C)=\eta(B), as desired. \Box

Note that if e1e_{1} exists but does not get popped, any search after d(e1)d(e_{1}) is invoked can add leaf occurrences of xx but no blossom containing xx (Lemma 4.7(iiii)).

Next we analyze the pop of e1e_{1}. In particular part (iiii) of the next lemma shows that the pop of e1e_{1} satisfies ()(*). Part (iiii) will also be used to show the Stage 2 blossoms satisfy ()(*).

Lemma 4.10

Suppose d(zx˙)d({z\dot{x}}) pops e1(x)e_{1}(x).

(ii) μ(zx)=μ¯(e1)\mu(zx)=\overline{\mu}(e_{1}). So GL(x)=GL(x)=\emptyset when e1e_{1} is popped, and no edge of δ(x)γ(x)\delta(x)\cup\gamma(x) will be added to 𝒯\cal T after that.

(iiii) Let η\eta be the base edge of the first blossom containing an occurrence of xx. At every moment until d(η)d(\eta) returns, every edge of BL(x)BL(x) descends from η\eta. So every edge popped from BL(x)BL(x) during the execution of d(zx˙)d({z\dot{x}}) descends from η\eta.

Remark: Fig.9(a) gives an example of (iiii). For instance BL(x)BL(x) is the list (e1,ex,ax,bx)(e_{1},ex,ax,bx) when the first blossom is formed. In general BL(x)BL(x) may contain arcs directed to or from xx, e.g., after d(dx˙)d({d\dot{x}}) returns dxdx enters BL(x)BL(x).

Proof: (ii) The second assertion follows immediately from the first using Lemma 4.4(iiii). We turn to the first assertion.

Suppose e1e_{1} is popped in the blossom base test. (In particular this implies zxzx is a 𝒯\cal T-arc.) The blossom step makes zxzx the base edge of a blossom, zx=η(B)zx=\eta(B). The test implies μ(e1)=μ¯(zx)\mu(e_{1})=\overline{\mu}(zx), as desired.

Now suppose zxzx satisfies the blossom enlarge test. (Note that d(zx˙)d({z\dot{x}}) may be invoked strictly after moment e1e_{1} enters a blossom. In that case the pop of e1e_{1} is a noop.)


Claim: xx does not occur in a complete blossom.

Proof of Claim: Assume it does occur. Proposition 4.5 implies d(wx˙)d({w\dot{x}}) has been invoked for edges wxwx with opposite M-types. Choose an invocation d(wx˙)d({w\dot{x}}) that has μ(wx)=μ¯(e1)\mu(wx)=\overline{\mu}(e_{1}). e1e_{1} is popped before d(wx˙)d({w\dot{x}}) returns, contradicting completeness. \spadesuit


zx˙z\dot{x} must be in a blossom BB. If not, Proposition 4.6(ii) implies zx˙z\dot{x} is the base vertex of a skew blossom. But then the Claim gives a contradiction.

If zx=η(B)zx=\eta(B) the blossom base test has been executed and as shown above (ii) holds. So Proposition 4.6(iiii) shows xzxz is an arc in the blossom path PP. The node zx˙{z\dot{x}} is not in a blossom on PP, again by the Claim. The algorithm statement shows xzxz is not the first arc of PP. So PP contains an arc axax preceding xzxz. axax is not a base edge so μ(ax)=μ(e1)\mu(ax)=\mu(e_{1}). The alternation at xx implies μ(zx)=μ¯(ax)=μ¯(e1)\mu(zx)=\overline{\mu}(ax)=\overline{\mu}(e_{1}).


(iiii) We first show e1e_{1} descends from η\eta, i.e., ()(*) holds for the pop of e1e_{1}. Consider two cases. If zx=ηzx=\eta then Lemma 4.4(ii) shows e1(x)e_{1}(x) descends from zxzx. Suppose zxηzx\neq\eta. So xx occurs on the path of a blossom BB being formed, say on consecutive arcs ax,xbax,xb, and η=η(B)\eta=\eta(B). (Here we use ()(*) for BB.) The invocation d(ax)d(ax) occurs before BB is formed and pops e1e_{1}, (So ax=zxax=zx.) Lemma 4.4(ii) shows axax is an ancestor of e1e_{1}, and η\eta is an ancestor of axax. Thus η\eta is an ancestor of e1e_{1} and ()(*) holds.

We continue with (iiii). Since e1e_{1} descends from η\eta, BL(x)BL(x) is empty when d(η)d(\eta) is invoked. From that moment on until d(η)d(\eta) returns, every edge added to BL(x)BL(x) descends from η\eta.

The second assertion of (iiii) follows siimilarly: Lemma 4.9 shows that at the moment d(zx˙)d({z\dot{x}}) forms the blossom for e1e_{1}, η\eta is the base of the blossom containing zx˙{z\dot{x}}. \Box

Note from part (ii) that if e1e_{1} is popped, no search after this pop contains an occurrence of xx.

Before proceeding it is worthwhile to give an incorrect proof for the first part of (iiii), i.e., that ()(*) holds for the pop of e1e_{1}.


Incorrect Proof: Lemma 4.9 shows that at the moment the blossom for e1e_{1} is formed, η\eta is the base of the unique blossom containing occurrences of xx. This makes η\eta an ancestor of e1e_{1} since a blossom is a subtree of 𝒯\cal T (Proposition 4.2).


The argument fails since the use of Proposition 4.2 assumes the blossom has already been properly formed, i.e., ()(*) held for the blossom popping e1e_{1}.


We now analyze the time from when d(zx˙)d({z\dot{x}}) pops e1(x)e_{1}(x) until it returns. This is Stage 2.

Refer to caption
e1(x)e_{1}(x)zzxxwwxxxxxx
Figure 10: Preemptive invocation: d(zx˙)d({z\dot{x}}) pops e1(x)e_{1}(x) but a different invocation, d(wx˙)d({w\dot{x}}) for the matched arc xwxw, executes the blossom step for the leaf occurrence of xx.
Lemma 4.11

Suppose d(zx˙)d({z\dot{x}}) pops e1(x)e_{1}(x). In parts (ii) and (iiii) yxyx is an arbitrary edge popped from BL(x)BL(x) during the execution of d(zx˙)d({z\dot{x}}) (i.e., from the moment d(zx˙)d({z\dot{x}}) is invoked and until it returns).

(ii) The pop of yxyx makes it an edge of Bzx˙B_{{z\dot{x}}}. So Bzx˙B_{{z\dot{x}}} remains the only blossom containing an occurrence of xx.

(iiii) The pop of yxyx satisfies ()(*).

(iiiiii) Suppose d(zx˙)d({z\dot{x}}) returns. Every pendant edge yxyx ever created in the execution of find_trails was popped from BL(x)BL(x) during the execution of d(zx˙)d({z\dot{x}}). So at the moment d(zx˙)d({z\dot{x}}) returns, every occurrence of xx in 𝒯\cal T is either in Bzx˙B_{{z\dot{x}}} or is a proper ancestor of β(Bzx˙)\beta(B_{{z\dot{x}}}) not in any blossom.

Remarks: The arc corresponding to d(zx˙)d({z\dot{x}}) can be zxzx (blossom base test) or xzxz (blossom enlarge test).

Stage 2 can have blossom steps for pendant edges as well as noops. For example in Fig.9(a), where we have noted BL(x)BL(x) as the list (e1,ex,ax,bx)(e_{1},ex,ax,bx), exex and axax trigger noops and bxbx is pendant. In general noops are for ancestors of e1(x)e_{1}(x).

The edge zxzx need not be in the first blossom containing an occurrence of xx. For instance in Fig.9(a) d(dx˙)d({d\dot{x}}) pops e1(x)e_{1}(x) but ax˙a\dot{x} is first blossom occurrence of xx.

In (ii), Fig.10 shows a leaf xx can be popped in an invocation other than d(zx˙)d({z\dot{x}}).

Proof: (ii) Lemma 4.9 shows (ii) holds for yx=e1yx=e_{1}, the first pop of BL(x)BL(x).

Inductively assume (ii) has always held up to some point in the execution of d(zx˙)d({z\dot{x}}). As previously observed (proof of Lemma 4.7(iiii)) a pendant edge yxyx can enter a blossom only when it is popped from BL(x)BL(x). Let yxyx be the next edge yxyx to be popped from BL(x)BL(x). yxyx may be popped by an invocation d(wx˙)d({w\dot{x}}), wxzxwx\neq zx. (The arc for d(wx˙)d({w\dot{x}}) may be wxwx or xwxw, Fig.10(a) shows the latter.) wxwx is not pendant (Lemma 4.7(ii)). So wx˙{w\dot{x}} is a 𝒯\cal T-ancestor of e1e_{1} (Lemma 4.4(ii); wxwx may be e1e_{1}). d(wx˙)d({w\dot{x}}) is on the call chain starting from d(zx˙)d({z\dot{x}}). Hence wxwx was in Bzx˙B_{{z\dot{x}}} after the pop of e1e_{1} (and possibly before that). So the pop of yxyx preserves (ii). This completes the induction.


(iiii) Although blossom steps during the execution of d(zx˙)d({z\dot{x}}) may enlarge Bzx˙B_{{z\dot{x}}} without adding new occurrences of xx, η(Bzx˙)\eta(B_{{z\dot{x}}}) remains η\eta. So Lemma 4.10(iiii) shows every edge popped during the execution of d(zx˙)d({z\dot{x}}) descends from η\eta. This gives (iiii).


(iiiiii) Lemma 4.10(ii) shows every pendant edge yxyx is created before e1e_{1} is popped. yxyx is added to BL(x)BL(x) when d(yx˙)d({y\dot{x}}) returns. So yxyx is in BL(x)BL(x) before e1e_{1} is popped.

d(zx˙)d({z\dot{x}}) returns with BL(x)BL(x) empty. Thus every pendant edge gets popped from BL(x)BL(x) during the execution of d(zx˙)d({z\dot{x}}). With (ii) this implies d(zx˙)d({z\dot{x}}) returns with Bzx˙B_{{z\dot{x}}} containing every leaf occurrence of xx in 𝒯\cal T. (ii) also shows Bzx˙B_{{z\dot{x}}} is the only blossom in which xx occurs. \Box

Continuing we will track the execution of find_trails for vertex xx after d(zx˙)d({z\dot{x}}) of Lemma 4.11 returns. The return initiates Stage 3. First consider Stage 3 for the example of Fig.9: d(dx˙)d({d\dot{x}}) (which popped e1e_{1}) returns with BL(x)=(dx)BL(x)=(dx). d(fx˙)d({f\dot{x}}) is on the search path to d(dx˙)d({d\dot{x}}), so eventually it pops dxdx, executes a noop blossom for dxdx, and returns with BL(x)=(fx)BL(x)=(fx). In Fig.9(b) edge cwcw triggers the skew blossom for ww and fxfx triggers the skew blossom for xx.

For the general case we continue with previous notation: d(zx˙)d({z\dot{x}}) pops e1(x)e_{1}(x). η\eta is the base edge of the blossom formed by that pop.

Lemma 4.12

After d(zx˙)d({z\dot{x}}) returns, the blossom steps triggered by pops of BL(x)BL(x) are as follows:

(a) Until d(η)d(\eta) returns each blossom is a noop.

(b) If η\eta is directed to xx the blossom is also a noop.

(c) After d(η)d(\eta) returns each blossom is a skew blossom. ()(*) holds for this blossom.

At all times xx occurs in a unique blossom.

Remark: A blossom of (c) may be incomplete, as in Fig.5(f) if the parent of bb^{\prime} is changed to z3z_{3}.

Proof: We will prove the following invariant holds after d(zx˙)d({z\dot{x}}) returns and until the current search ends. Let BB be the current blossom Bzx˙B_{{z\dot{x}}}. Recall BB is a subtree of 𝒯\cal T rooted at β(B)\beta(B).


(I2)       Any occurrence of xx either belongs to BB or is a proper 𝒯\cal T-ancestor of β(B)\beta(B), not in any blossom. BL(x)BL(x) is a singleton whose edge is in Bη(B)B\cup\eta(B).


Note that (I2) implies the last assertion of the lemma, property ()(\dagger).

The invariant holds at the moment d(zx˙)d({z\dot{x}}) returns. In proof, Lemma 4.11(iiiiii) gives the condition on occurrences of xx. Also BL(x)=(zx)BL(x)=(zx), since the return adds zxzx to the previously emptied BL(x)BL(x), and zxBη(B)zx\in B\cup\eta(B).

From this moment until d(η)d(\eta) returns, noops may change the entry in BL(x)BL(x) to some other occurrence of xx in Bη(B)B\cup\eta(B). But (I2) is maintained. Now consider the return of d(η)d(\eta). If η\eta is an arc directed to xx, i.e., x=β(Bzx˙)x=\beta(B_{{z\dot{x}}}), then the blossom is a noop, BL(x)BL(x) changes to (η)(\eta), and (I2) is preserved.

Now assume (I2) holds after d(η)d(\eta) has returned, and a blossom step creates the next blossom CC. Let b=β(C)b=\beta(C). Let aa be the first proper ancestor of β(B)\beta(B) that is an occurrence of xx, if such exists.

BL(x)BL(x) cannot change before control returns to d(a)d(a). Suppose CC is formed before that. There are two possibilities. If bb is not an ancestor of β(B)\beta(B), then no occurrence of xx enters CC (we use (I1) here). So (I2) is preserved. The other possibility is that bb properly descends from aa. CC is either disjoint from BB or contains BB. Again no new occurrence of xx enters a blossom and (I2) is preserved.

Now assume control returns to d(a)d(a). The blossom enlarge test is satisfied. The entry yxyx in BL(x)BL(x) is an edge in Bη(B)B\cup\eta(B). So CC is a skew blossom with base vertex aa. After this blossom step CC may get enlarged in d(a)d(a) but no occurrence of xx is added (again by (I1)). When d(a)d(a) returns η(C)\eta(C) (the arc directed to aa) is added to BL(x)BL(x). So (I2) is preserved.

To show ()(*) for the skew blossom, observe aa is a 𝒯\cal T-ancestor of yx˙{y\dot{x}}. So aa is an ancestor of yx˙y\dot{x} in the current contraction 𝒯¯\overline{{\cal T}}. \Box

Lemma 4.12 completes the proof of ()(\dagger). As mentioned Lemma 4.10(ii) shows xx does not occur in future searches if Stage 3 has been entered.

Note we have also verified that ()(*) always holds: Lemmas 4.10(iiii), 4.11(iiii), and 4.12(c) establish ()(*) for Stages 1,2, and 3 respectively.

We can now establish the validity of the search structure 𝒯\cal T and 𝒯¯\overline{{\cal T}}. It simply amounts to the validity of the algorithm’s blossoms.

Lemma 4.13

Any blossom formed in the algorithm is valid, i.e., it satisfies the above Definition of Blossom, in GG and 𝒯\cal T.

Proof: As noted in the Definition of Blossom in GG and 𝒯\cal T, the requisite properties of the blossom trail hold automatically. So we need only verify the properties of the Starter. The requirements are satisfied trivially but for completeness we step through them.

The blossom base test forms ordinary blossoms with singleton starters zx˙=β{z\dot{x}}=\beta. The blossom path correctly begins with Bzx˙=zx˙=βB_{{z\dot{x}}}={z\dot{x}}=\beta and ends at Byx˙=yx˙=βB_{{y\dot{x}}}={y\dot{x}}=\beta (recall yx=e1(x)yx=e_{1}(x)). zxzx is the base edge and the blossom test ensures the first and last edges of the blossom trail have the same M-type, specificially μ¯(zx)\overline{\mu}(zx).

Now suppose zx˙z\dot{x} is in a blossom Bzx˙{zx˙}B_{{z\dot{x}}}\neq\{{z\dot{x}}\}. The blossom step is triggered by the blossom enlarge test. Bzx˙B_{{z\dot{x}}} is the starter blossom AA. As specified in the code, PP starts at a node in Bzx˙=AB_{{z\dot{x}}}=A, called vertex aa in the Definition. ()(*) implies the blossom path PP descends from AA. It ends at Byx˙B_{{y\dot{x}}}. Consider two possibilities:


Byx˙{yx˙}B_{{y\dot{x}}}\neq\{{y\dot{x}}\}: This implies Byx˙=AB_{{y\dot{x}}}=A, since AA is the only blossom containing an occurrence of xx. So PP is empty and the blossom step is a noop.

Byx˙={yx˙}B_{{y\dot{x}}}=\{{y\dot{x}}\}: If yxyx is the loop xxxx and xxxx is already a blossom, we again have a noop. Otherwise, identifying yx˙y\dot{x} with zx˙z\dot{x} gives a valid blossom.


In the blossom enlarge test, a skew blossom corresponds to the alternative that zx˙z\dot{x} is a singleton but some descendant is in a blossom. \Box

5 Blocking set

This section proves find_trails returns a valid blocking set. This culminates in the main Theorem 5.5, which is followed by a brief discussion of the linear time bound.

5.1 Search structure

We first give several additional properties of the search structure that hold when find_trails halts. They are needed to establish the blocking property.

We start by fleshing out Proposition 4.5. Again consider a moment in time when a blossom BB becomes complete. Note that every xx occurring in BB is in stage 3. Let β\beta and η\eta be the base vertex and edge of BB, respectively. Let e={x,y}e={\{x,y\}} be any edge incident to BB in GG, say xV(B)∌yx\in V(B)\not\ni y.

Corollary 5.1

At the moment BB becomes complete, either

(ii) ee is a proper 𝒯¯\overline{{\cal T}}-ancestor of η\eta, and ee is in the search path SPSP of d(η)d(\eta), or

(iiii) e=yx=ηe=yx=\eta, or

(iiiiii) e=xye=xy and leaves either BB or SPSP in 𝒯¯\overline{{\cal T}}.

Remark: Part (iiiiii) is illustrated by arc xgxg in Fig.9(b).

Proof: ee is an arc of 𝒯\cal T (in one direction or the other) by Lemma 4.10(ii). Lemma 4.11(iiiiii) shows the xx-end of ee is either in BB or is a proper ancestor of β\beta not in any blossom.


Case yx˙{y\dot{x}} is a node of BB: The nodes of BB form a subtree of 𝒯\cal T (Proposition 4.2). So if ee is the arc yxyx then e=ηe=\eta and (iiii) holds. If ee is arc xyxy then it leaves the subtree of BB and (iiiiii) holds.


Case yx˙{y\dot{x}} is a proper 𝒯\cal T-ancestor of β\beta: If ee is arc yxyx then the invocation d(yx˙)d({y\dot{x}}) precedes d(η)d(\eta) in the call chain. Thus ee is in the current search path SPSP. ee is not in a contracted blossom since BB is the only blossom containing xx. So (ii) holds.

Suppose e=xye=xy. Since yx˙{y\dot{x}} is not in a blossom xyxy is in 𝒯¯\overline{{\cal T}}. It is either in the 𝒯¯\overline{{\cal T}}-path to η\eta or it is incident to that path. In the first case ee is in SPSP as before. In the second case ee is incident to SPSP, giving (iiiiii). \Box

Now assume xx occurs in blossoms that are both complete and incomplete. Let B0B_{0} denote the maximal complete blossom containing an occurrence of xx. Let the sequence of incomplete blossoms containing xx be B1B2BkB_{1}\subset B_{2}\subset\ldots\subset B_{k}, BkB_{k}, k>0k>0. η(B0)\eta(B_{0}) and η(B1)\eta(B_{1}) may be identical or different. However the base edges η(Bi)\eta(B_{i}), i1i\geq 1 are identical. This follows since when d(η(B1))d(\eta(B_{1})) is terminated every invocation of dd in the call chain is terminated, i.e., no new blossoms are formed.

Refer to caption
B2B_{2}B4B_{4}IIOOIIIIIIOOB1B_{1}IIOOIIB3B_{3}
Figure 11: (a) Nested incomplete blossoms B1B_{1}B4B_{4}. The augmenting trail is disjoint from the blossom paths of B2B_{2} and B4B_{4}. (b) Residual edges of B4B_{4}, with vertex labels I,OI,O from Section 5.

The above sequence has some simple special cases, for an arbitrary vertex xx: B0B_{0} may exist with k=0k=0. If B0B_{0} does not exist define B0={x}B_{0}=\{x\}. kk may or may not be 0.

Let 𝒞{\cal C} be the set of maximal complete blossoms when find_trails halts.

Lemma 5.2

A 𝒯\cal T-arc zxzx not in any trail of 𝒜{\cal A} or any set E(B)η(B)E(B)\cup\eta(B), B𝒞B\in{{\cal C}} has M-type μ(e1(x))\mu(e_{1}(x)).

Proof: First observe that e1(x)e_{1}(x) exists, since invocation d(zx˙)d({z\dot{x}}) returns (i.e., it is not terminated). Suppose for the sake of contradiction that zxzx has M-type μ¯(e1(x))\overline{\mu}(e_{1}(x)).


Claim 1 zxzx is not the base edge of any blossom formed in find_trails.


Proof of Claim 1: Suppose on the contrary that zxzx is the base edge of a blossom BB.

Suppose BB is incomplete. By definition the search’s augmenting trail includes η(B)\eta(B). But this contradicts the lemma’s hypothesis on 𝒜{\cal A}.

Suppose BB is complete. Let CC be the maximal complete blossom containing BB. Either zx=η(C)zx=\eta(C) or zxzx is on a blossom path contained in CC or a subblossom of CC. Both alternatives contradict the lemma’s hypothesis on 𝒞{\cal C}. \spadesuit


Claim 2 xx occurs in some blossom when d(zx˙)d({z\dot{x}}) executes the blossom base test.


Proof of Claim 2: zxe1(x)zx\neq e_{1}(x) since the two edges have opposite M-type. Lemma 4.4(ii) shows zxzx is a proper ancestor of e1e_{1}. Thus e1e_{1} enters BL(x)BL(x) during the execution of d(zx˙)d({z\dot{x}}). In fact it enters before d(zx˙)d({z\dot{x}}) executes the blossom test. (It enters during the execution of a grow step in d(zx˙)d({z\dot{x}}).)

Suppose Claim 2 fails. Then xx satisfies the first condition of the blossom base test. e1e_{1} is still in BL(x)BL(x). (If not it was popped, placing e1e_{1} in a blossom. This contradicts the test’s first condition.) The second condition of the base blossom test is satisfied (since μ(zx)=μ¯(e1)\mu(zx)=\overline{\mu}(e_{1})). So e1e_{1} is popped and zxzx becomes the base edge of a new blossom. This contradicts Claim 1. \spadesuit


Using Claim 2, let wxwx be an arc (directed to xx) with wx˙w\dot{x} in a blossom when d(zx˙)d({z\dot{x}}) executes the blossom base test. wxwx and zxzx are both ancestors of e1e_{1}, so one is an ancestor of the other. For the moment assume zxwxzx\neq wx. Let BB be the first blossom containing vertex wx˙w\dot{x}. (It is possible that BB is incomplete.) There are four possibilities for zxzx, but none can hold:



Case zxzx an ancestor of η(B)\eta(B): The execution of d(zx˙)d({z\dot{x}}) makes zxzx the base of a skew blossom, contradicting Claim 1.


Case zx=η(B)zx=\eta(B): Contradicts Claim 1 again.


Case zxzx is on the blossom path of BB: d(zx˙)d({z\dot{x}}) executes the blossom base test before BB is formed. This contradicts the definition of wxwx. The same contradiction occurs if zx=wxzx=wx. (Note wxwx is either η(B)\eta(B) or it is on the blossom path of BB. The latter must hold if wx=zxwx=zx.)


Case zxzx descends from BB: This implies zxzx descends from wxwx. Again d(zx˙)d({z\dot{x}}) executes the blossom base test before BB is formed, contradiction. \Box

Lemma 5.3

A vertex xx that is free at end of a search wherein it occurs in a complete blossom BB has

(5.1) def(x)=1 and x=α.\text{$def(x)=1$ and $x=\alpha$}.

Furthermore choosing BB to be maximal makes η(B)=εα\eta(B)=\varepsilon\alpha.

Remarks: xx will be free when find_trails halts.

The hypothesis of maximality is necessary: In Fig.5(iiii) assume b=αb=\alpha and def(α)=1def(\alpha)=1. The blossom of the figure is complete but does not have base vertex bb. But the maximal complete blossom, formed as a skew blossom at α\alpha, has base vertex α\alpha.

Proof: Let BB be the first blossom formed with an occurrence of xx. xx occurs on an edge of the blossom path PP.


Case xx does not occur as an interior vertex of PP: Since xx is not in a prior blossom, xx is the base vertex of BB and the base edge, say arc zxzx, is unmatched. d(zx˙)d({z\dot{x}}) starts by executing the test for an augment step. The test fails (since BB has not been formed yet). This implies (5.1).


Case xx occurs as an interior vertex of PP: PP alternates at xx so xx is on an unmatched edge {z,x}\{z,x\} of PP. The invocation d(zx˙)d({z\dot{x}}) is made before BB becomes complete. (The invocation may be for arc zxzx or xzxz.) As before it starts by executing the test for an augment, which fails (BB has not been completed). So again (5.1) holds.

If η(B)εα\eta(B)\neq\varepsilon\alpha, eventually control returns to the invocation d(εα)d(\varepsilon\alpha). It executes a skew blossom step. This guarantees the second condition of the lemma. \Box

5.2 Min-max relation

For disjoint sets of vertices S,TV(G)S,T\subseteq V(G) let E[S,T]E[S,T] be the set of all edges with one end in SS and the other in TT. The maximum size an ff-matching (i.e., a subgraph where every vertex xx has degree f(x)\leq f(x)) is the minimum value of the expression

(5.2) f(I)+|γ(O)|+Cf(C)+|E[C,O]|2f(I)+|\gamma(O)|+\sum_{C}\bigg{\lfloor}\frac{f(C)+|E[C,O]|}{2}\bigg{\rfloor}

where II and OO range over all pairs of disjoint vertex sets, and in the summation CC ranges over all connected components of GIOG-I-O. This min-max relation is proved in [25, Theorem 32.1] by reduction; [14, Sec. 5.3] gives an algorithmic proof. This section gives another self-contained algorithmic proof, in the process of establishing the blocking property of our algorithm.

We first give some intuition for the relation. The set names are mnemonics for the tight case of the bound: We will show that equality holds for our matching by taking II as the set of inner atoms and OO as the set of outer atoms. The connected components CC are formed by the remaining vertices, specifically vertices in blossoms and vertices not reached in any search. In the special case of f1f\equiv 1 our min-max relation is Edmonds’ odd set cover formula for a maximum cardinality matching [6, 22, 23]. We view our relation as a “generalized odd set formula”. Just like Edmonds’ formula the nontrivial part of the bound is due to rounding odd sets down by 1/21/2. For 1-matching the CC components are the blossoms and one more, the unreached vertices.

Let us show the expression (5.2) always upper bounds the size of an ff-matching. Classify the edges of G as type I (one or both ends in II), OOOO (both ends in OO), and C^O\widehat{C}O (joins a C vertex to a C or O vertex). Fix an ff-matching MM. MM trivially contains at most f(I)f(I) II-edges and |γ(O)||\gamma(O)| OO edges. Consider a connected component CC of GIOG-I-O. The number of edges of MM in C^O\widehat{C}Ois 12(xCOdeg(x,MC^O))\frac{1}{2}(\sum_{x\in C\cup O}deg(x,M\cap{\widehat{C}O})). The terms for xOx\in O trivially sum to at most |E[C,O]||E[C,O]|. The terms for xCx\in C sum to at most f(C)f(C). So |MC^O||M\cap{\widehat{C}O}| is at most 12(f(C)+|E[C,O]|)\frac{1}{2}(f(C)+|E[C,O]|). Since |MC^O||M\cap{\widehat{C}O}| is integral we can take the floor, thus giving the last term of (5.2).

We will show our algorithm finds a blocking set even if we make augmenting paths through complete blossoms residual. To be precise consider the graph when find_trails  halts. Let 𝒜{\cal A} be the set of all augmenting trails at that point. As before 𝒞{\cal C} is the set of maximal complete blossoms. Define the residual graph to be RG=G𝒜𝒞RG=G-{{\cal A}}\cup{{\cal C}}. Let MM be the matching on RGRG when find_trails halts. Notice MM contains the matched edges of augmenting trails when they occur in blossoms of 𝒞{\cal C}. For xV(G)x\in V(G) let def(x)def(x) be the deficiency computed by find_trails, i.e., the deficiency of xx in the matching on GG after augmenting the trails of 𝒜{\cal A}. The degree constraint function on RGRG is

f(x)=deg(x,M)+def(x).f^{\prime}(x)=deg(x,M)+def(x).

Throughout this section we will write ff^{\prime} and defdef^{\prime} as ff and defdef. This causes no problem. For instance the definition of ff^{\prime} shows deficiencies are identical in GG and RGRG. (We will use defdef values when we apply (5.1) of Lemma 5.3.) We will show find_trails finds a maximum cardinality matching of RGRG.

An edge of RGRG is a residual edge. Note that a blossom B𝒞B\in{{\cal C}} need not have a base edge in RGRG. This occurs when a trail of 𝒜{\cal A} passes through BB (or it simply contains an edge incident of δ(β(B),δ(B))\delta(\beta(B),\delta(B))).

We will use (5.2) to show MM is a maximum cardinality matching of RGRG. To define sets II and OO consider a vertex xV(G)x\in V(G) where the invocation d(e1(x))d(e_{1}(x)) returned but xx does not occur in any complete blossom.

(5.3) x{Iμ(x)=M¯Oμ(x)=M.x\in\begin{cases}I&\mu(x)=\overline{M}\\ O&\mu(x)=M.\end{cases}

As an example a free vertex with deficiency greater than 1 is in OO (Lemma 5.3), so its ff value is not used in (5.2). We often refer to I and O as vertex labels, and call the vertices of GIOG-I-O unlabelled. Mnemonically note that xx is labelled according to arc e1e_{1}: xx is I (O) if e1e_{1} makes xx an inner (outer) vertex. Intuition for the labelling scheme is given in Fig.12.

Refer to caption
η\etaBBIIOOIIOOxxyy
Figure 12: Erasing labels. Vertices are labelled before blossom BB is formed. The bound of (5.2) is not tight, since the unmatched edge xyxy contributes to |γ(O)||\gamma(O)|. If an augmenting trail is discovered before d(η)d(\eta) returns xyxy is on the trail, removing the contribution. Otherwise blossom BB is formed, again removing the contribution.

There are two types of unlabelled vertices: If e1(x)e_{1}(x) exists then an unlabelled xx is in a complete blossom of GG. The second possibility for unlabelled xx is that e1(x)e_{1}(x) does not exist, i.e., no invocation d(zx˙)d({z\dot{x}}) ever returned. Equivalently, every occurrence of xx in 𝒯\cal T was in an augmenting trail AA. Call such a vertex xx an orphan, i.e., xx does not occur in 𝒯\cal T or xx occurs in 𝒯RG{\cal T}\cap RG but has no parent, e.g., vertex aa in Fig.5(b). (Note that a free vertex such a parent.) 𝒯\cal T may contain residual arcs directed from xx. There may also be edges incident to xx that are not in 𝒯\cal T, i.e., no corresponding grow step was executed.

We proceed to analyze the number of matched residual edges for each of the three types I, OO, C^O\widehat{C}O. An II vertex is not free (a free vertex is O or unlabelled). So exactly f(I)f(I) type I edges are matched. Thus MM achieves equality in (5.2) for type I edges.

Next consider an OO edge. Say it occurs as the 𝒯\cal T-arc zxzx. It satisfies Lemma 5.2. (In particular zxzx is not the base edge of a 𝒞{\cal C}-blossom BB, since β(B)\beta(B) is unlabelled.) Thus zxzx has M-type μ(e1(x))=M\mu(e_{1}(x))=M. So every OO edge is matched and MM achieves equality in (5.2) for OO edges.

It remains to consider the C^O\widehat{C}O edges. Let CC be a connected component of unlabelled vertices. There are two similar cases, depending on whether or not CC contains an orphan. We analyze them in turn.


Case 1. CC consists of 𝒞{\cal C}-blossoms: CC is a subtree of 𝒯\cal T (Proposition 4.2, with augmenting trails included). CC consists of one or more 𝒞{\cal C}-blossoms joined by their base edges. The root of this subtree is the base vertex of the “root” blossom B0B_{0}. So a blossom BB in CC has its base edge in CC iff BB0B\neq B_{0}.

Next we analyze the edges of GG incident to a blossom B𝒞B\in{{\cal C}}. Consider a residual edge e={x,y}e={\{x,y\}}, xByx\in B\notin y.


Claim Either e=η(B)e=\eta(B) (ee being arc yxyx) or ee is arc xyxy with

(5.4) yy labelled and μ(e)=μ(e1(y))\mu(e)=\mu(e_{1}(y)) or yy unlabelled and y=β(A)y=\beta(A) for a blossom A𝒞A\in{{\cal C}}.

Proof of Claim: We will apply Corollary 5.1 to blossom BB, adjusting for events that occur after BB becomes complete. Consider the corollary’s three alternatives (ii)–(iiiiii) for ee:

(ii) ee is one of two oppositely directed arcs, say axax and xbxb. BB is maximal complete so the skew blossom SS for xx was not completed. SS was either never triggered or was triggered and is incomplete. The first case has both arcs on the augmenting trail, so neither arc is residual. In the second case axax is on the augmenting trail. xbxb is not, since it is not on the search path of any dd invocation made in the blossom step. xbxb is residual, and in particular bb is not an orphan. So if bb is unlabelled then xbxb is the base edge of a blossom A𝒞A\in{{\cal C}}. The Claim holds. If bb is labelled then Lemma 5.2 applies (with zxzx taken as xbxb). It shows μ(xb)=μ(e1(b))\mu(xb)=\mu(e_{1}(b)). Again the Claim holds.

(iiii) e=η(B)e=\eta(B) as in the Claim.

(iiiiii) ee is arc xyxy. The analysis of xbxb in (ii) applies, i.e., the same two possibilities of the Claim hold. Note that ee may originate from any number of occurrences of xx on SPSP. \spadesuit



We now show the component CC achieves equality in (5.2). The only fact we will use about edges incident to CC is the Claim. We will reuse the Claim when we analyze components CC that contain an orphan.

Let MM be the set of matched residual edges with one or both ends in CC. Define MIM_{I} to be the set of MM-edges of type II, and similarly for MOM_{O} and MC^OM_{\widehat{C}O}. We will show the number of matched edges of type C^O\widehat{C}O for our component CC equals its corresponding term in the summation of (5.2). To do this we show the number of ends of matched C^O\widehat{C}O edges is exactly

(5.5) 2|MC^O|=f(C)+|E[C,O]|ϵ2|M_{\widehat{C}O}|=f(C)+|E[C,O]|-\epsilon

where ϵ{0,1}\epsilon\in\{0,1\}. (Recall ff is the residual degree constraint function.) Since the left-hand side is even, ϵ=1\epsilon=1 implies f(C)+|E[C,O]|f(C)+|E[C,O]| is odd. So the desired equality always holds.

Let β\beta be the base vertex of the root blossom B0B_{0}, and let η\eta be its base edge, also denoted as arc aβa\beta. The above quantity ϵ\epsilon is the sum of three quantities ϵi{0,1},i=1,2,3\epsilon_{i}\in\{0,1\},i=1,2,3, defined by

ϵi=1η is {the artifical edge εαi=1matched with a labelled Ii=2unmatched with a labelled Oi=3.\epsilon_{i}=1{\Longleftrightarrow\ }\eta\text{ is }\begin{cases}\text{the artifical edge $\varepsilon\alpha$}&i=1\\ \text{matched with $a$ labelled I}&i=2\\ \text{unmatched with $a$ labelled O}&i=3.\end{cases}

Clearly at most one of these quantities is 1. Fig. 13 illustrates the possibilities for ϵ\epsilon.

Refer to caption
β\betaaazzB0B_{0}ee^{\prime}ee
Figure 13: Possibilities for the base edge of root blossom B0B_{0}: If e=e1(a)e=e_{1}(a) then aa is labelled O when d(za˙)d({z\dot{a}}) returns and ϵ=ϵ3=1\epsilon=\epsilon_{3}=1. If e=e1(a)e^{\prime}=e_{1}(a) and an augment occurs after d(e1(a))d(e_{1}(a)) returns but before the blossom for e1(a)e_{1}(a) is formed then aa is labelled I and ϵ=0\epsilon=0. If the blossom for e1(a)e_{1}(a) is formed then it does not get completed, since B0B_{0} is the root. Again aa remains labelled I. A final possibility with ϵ=0\epsilon=0 is when arc aβa\beta is not residual. This can occur if a blossom step is triggered by an arc from B0B_{0} to zz, and an augmenting trail passes through B0B_{0}.

The number of ends of matched edges of type C^O\widehat{C}O is exactly

(5.6) 2|MC^O|=xCdeg(x,M)deg(x,MI)+|MC^O|.2|M_{\widehat{C}O}|=\sum_{x\in C}deg(x,M)-deg(x,M_{I})+|M_{\widehat{C}O}|.

To analyze this let e={x,y}e={\{x,y\}} be a residual edge incident to CC, with xCx\in C. The Claim and (5.4) show either e=η(B0)e=\eta(B_{0}) for the root blossom B0B_{0} or ee occurs as arc xyxy with yy labelled and ee is either

matched with yy labelled O or unmatched with yy labelled I.

This implies

(5.7) xCdeg(x,MI)=ϵ2 and |E[C,O]|=|MC^O|+ϵ3.\sum_{x\in C}deg(x,M_{I})=\epsilon_{2}\text{ and }|E[C,O]|=|M_{\widehat{C}O}|+\epsilon_{3}.

Recall (Lemma 5.3) CC contains at most one free vertex, which must be β\beta, with β=α\beta=\alpha and f(β)=deg(β,M)+1f(\beta)=deg(\beta,M)+1. Thus

(5.8) xCdeg(x,M)=f(C)ϵ1.\sum_{x\in C}deg(x,M)=f(C)-\epsilon_{1}.

Combining (5.6) with (5.7) and (5.8) gives (5.5).


Case 2. CC contains an orphan: We start by analyzing the adjacencies of an orphan.

Lemma 5.4

Consider a residual edge e={x,y}e={\{x,y\}} incident to an orphan xx. Either yy is unlabelled and an orphan or yy satisfies (5.4).

Proof: Let ee denote edge {x,y}\{x,y\}.


Case yy is labelled: We will show μ(e)=μ(e1(y))\mu(e)=\mu(e_{1}(y)), so (5.4) holds. If edge ee occurs as a 𝒯\cal T-arc it must be arc xyxy (xx is an orphan). Lemma 5.2 applies (in particular ee is not the base edge of a 𝒞{\cal C}-blossom). It proves μ(e)=μ(e1(y))\mu(e)=\mu(e_{1}(y)) as claimed. Suppose ee is not a 𝒯\cal T-arc. The invocation d(e1(y))d(e_{1}(y)) exists and returns (because of yy’s label). ee was not removed in d(e1(y))d(e_{1}(y)) (xx is an orphan). So again μ(e)=μ(e1(y))\mu(e)=\mu(e_{1}(y)).


Case yy is unlabelled: If yy is an orphan the lemma’s condition holds. The other possibility is that yy is in a complete blossom BB^{\prime}. We can assume B𝒞B^{\prime}\in{{\cal C}}. We have yB∌xy\in B^{\prime}\not\ni x and {x,y}\{x,y\} is the arc xyxy (xx an orphan). The Claim above shows xy=η(B)xy=\eta(B^{\prime}). So (5.4) holds for AA taken as BB^{\prime}. \Box

Define an orphanage to be a connected component of the graph induced on the residual graph by the set of orphans. (Its edges are the residual edges joining two orphans.) Clearly CC contains an orphanage, say RR. Lemma 5.4 and (5.4) show RR can be joined to other unlabelled vertices only by 𝒯\cal T-arcs directed from RR to the base vertex of a complete blossom. Thus CC consists of RR and 0 or more subtrees SS that consist entirely of complete blossoms and are joined to RR by the root blossom’s base edge η\eta. We will show CC satisfies the tightness relation (5.5) with ϵ=0\epsilon=0. To do this we examine the contribution to the right-hand side made from the orphanage RR and from a typical subtree SS.

First consider SS. Case 1 applies and shows (5.5). Furthermore η\eta does not leave the current component CC. So each ϵi=0\epsilon_{i}=0 and ϵ=0\epsilon=0.

Now consider RR. (5.6) holds by definition. Since (5.4) holds for RR, (5.7) holds with ϵ2=ϵ3=0\epsilon_{2}=\epsilon_{3}=0. RR does not contain any free vertex, since a free vertex occurs in an unsuccessful search. So (5.8) holds with ϵ1=0\epsilon_{1}=0. As before combining these equations gives (5.5) with ϵ=0\epsilon=0.

Now combining the instances of (5.5) for RR and all subtrees SS gives (5.5) for CC with ϵ=0\epsilon=0. We have now verified the tightness of (5.2) for our labelling and the residual graph matching of find_trails. So our main result follows:

Theorem 5.5

find_trails finds a blocking trail set 𝒜{\cal A}. In fact find_trails halts with a maximum cardinality matching of the residual graph. \Box

Linear-time bound

It is straightforward to see that find_trails operates in linear time O(m)O(m). Here mm counts each edge according to its multiplicity. Note the blossom enlarge test is easy to implement using Proposition 4.6(ii), since testing for a skew blossom is trivial.

For an algorithm to use our blocking set the trails of 𝒜{\cal A} must be rematched. The inclusion of 𝒞{\cal C} in the residual graph allows any alternating trail through a complete blossom to be rematched. Thus the PiP_{i} trails of [14] may be used (even for skew blossoms). So postprocessing the augmenting trails uses O(m)O(m) time.

A Blocking 1-matchings

Fig.14 is a verbatim statement of the blocking algorithm for 1-matching of [17, 11]. It is the jumping off point for our ff-factor algorithm.


procedure find_ap_set initialize 𝒮\cal S to an empty graph and 𝒫\cal P to an empty set for (each vertex vVv\in V) b(v)vb(v)\leftarrow v /* b(v)b(v) maintains the base vertex of BvB_{v} */ for (each free vertex ff) if fV(𝒫)f\notin V({\cal P}) then add ff to 𝒮\cal S as the root of a new search tree find_ap(f)(f) return 𝒫\cal P
procedure find_ap(xx) /* xx is an outer vertex */ for (each edge xyMxy\notin M) //* scan xyxy from xx /*/ if (yV(𝒮)y\notin V({{\cal S}})) if (yy is free) //* yy completes an augmenting path /*/ add xyxy to 𝒮\cal S and add path yP(x)yP(x) to 𝒫\cal P terminate every currently executing recursive call to find_ap else //* grow step /*/ add xy,yyxy,yy^{\prime} to 𝒮\cal S, where yyMyy^{\prime}\in M find_ap(y)(y^{\prime}) else if (b(y)b(y) is an outer, proper descendant of b(x)b(x) in 𝒮\cal S^{-}) /* blossom step */ /* equivalent test: b(y)b(y) became outer strictly after b(x)b(x) */ let uiu_{i}, i=1,,ki=1,\ldots,k be the inner vertices in 𝒮¯(By,Bx){\overline{\cal S}}(B_{y},B_{x}), ordered so uiu_{i} precedes ui1u_{i-1} for (i1𝐭𝐨ki\leftarrow 1\ {\bf to}\ k) for (every vertex vv with b(v){ui,ui}b(v)\in\{u_{i},u^{\prime}_{i}\}, where uiuiMu_{i}u^{\prime}_{i}\in M) b(v)b(x)b(v)\leftarrow b(x) /* this adds BuiBuiB_{u_{i}}\cup B_{u^{\prime}_{i}} to the new outer blossom */ for (i1𝐭𝐨ki\leftarrow 1\ {\bf to}\ k) find_ap(ui)(u_{i}) /* process uiu_{i} in order of increasing depth */ return

Figure 14: Depth-first search blocking algorithm for ordinary matching.

References

  • [1]
  • [2] W.J. Cook, W.H. Cunningham, W.R. Pulleyblank, and A. Schrijver, Combinatorial Optimization, Wiley and Sons, NY, 1998.
  • [3] E.A. Dinic, ”Algorithm for solution of a problem of maximum flow in a network with power estimation”, Soviet Mathematics Doklady, 11, 1970, pp. 1277–1280. (In Russian.)
  • [4] R. Duan, H. He, and T. Zhang, ”A scaling algorithm for weighted f-factors in general graphs”, Proc. of the 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020), Vol. 168 of LIPIcs, pp. 41:1–41:17, 2020.
  • [5] R. Duan, S. Pettie, and H-H. Su, ”Scaling algorithms for weighted matching in general graphs”, ACM Trans. Algorithms 14, 1, 2018, Article 8, 35 pages.
  • [6] J. Edmonds, ”Paths, trees, and flowers”, Canad. J. Math. 17, 1965, pp. 449–467.
  • [7] J. Edmonds, “Maximum matching and a polyhedron with 0,1-vertices”, J. Res. Nat. Bur. Standards 69B, 1965, pp. 125–130.
  • [8] S. Even and R.E. Tarjan, “Network flow and testing graph connectivity”, SIAM J. Comput., 4, 1975, pp. 507–518.
  • [9] H.N. Gabow, ”An efficient implementation of Edmonds’ algorithm for maximum matching on graphs”, J. ACM, 23, 2, 1976, pp. 221–234.
  • [10] H.N. Gabow, “A scaling algorithm for weighted matching on general graphs,” Proc. 26th Annual Symp. on Found. of Comp. Sci., 1985, pp. 90–100.
  • [11] H.N. Gabow, ”The weighted matching approach to maximum cardinality matching,” Fundamenta Informaticae 154, 1-4, 2017, pp. 109–130.
  • [12] H.N. Gabow, ”A weight-scaling algorithm for ff-factors of multigraphs”, arXiv:2010.01102, 2020.
  • [13] H.N. Gabow, ”A data structure for nearest common ancestors with linking”, ACM Trans. on Algorithms, 13, 4, 2017, Article 45, 28 pages.
  • [14] H.N. Gabow, ”Data structures for weighted matching and extensions to bb-matching and ff-factors,” ACM Trans. on Algorithms, 14, 3, 2018, Article 39, 80 pages.
  • [15] H.N. Gabow and R.E. Tarjan, “A linear-time algorithm for a special case of disjoint set union”, J. Comp. and System Sci., 30, 2, 1985, pp. 209–221.
  • [16] H.N. Gabow and R.E. Tarjan, “Faster scaling algorithms for network problems,” SIAM J. Comput., 18, 5, 1989, pp. 1013–1036.
  • [17] H.N. Gabow and R.E. Tarjan, “Faster scaling algorithms for general graph matching problems”, J. ACM 38, 4, 1991, pp. 815–853.
  • [18] Z. Galil, S. Micali and H.N. Gabow, “An O(EVlogV)O(EV\,{\rm log}\,V) algorithm for finding a maximal weighted matching in general graphs”, SIAM J. Comput., 15, 1, 1986, pp. 120–130.
  • [19] J. Hopcroft and R. Karp, “An n5/2n^{5/2} algorithm for maximum matchings in bipartite graphs”, SIAM J. Comput., 2, 4, 1973, pp. 225–231.
  • [20] D. Huang and S. Pettie, “Approximate generalized matching: ff-matchings and ff-edge covers”, arXiv:1706.05761, 2017.
  • [21] A.V. Karzanov, ”On finding maximum flows in network with special structure and some applications”, Matematicheskie Voprosy Upravleniya Proizvodstvom, Vol 5, 1973, pp. 81–94. (In Russian.)
  • [22] E.L. Lawler, Combinatorial Optimization: Networks and Matroids, Holt, Rinehart and Winston, New York, 1976.
  • [23] L. Lovász and M.D. Plummer, Matching Theory, North-Holland Mathematic Studies 121, North-Holland, New York, 1986.
  • [24] S. Micali and V.V. Vazirani, “An O(|V||E|)O(\sqrt{|V|}\cdot|E|) algorithm for finding maximum matching in general graphs”, Proc. 21st Annual Symp. on Found. of Comp. Sci., 1980, pp. 17–27.
  • [25] A. Schrijver, Combinatorial Optimization: Polyhedra and Efficiency, Springer, NY, 2003.