Online matching games in bipartite expanders and applications
Abstract
We study connections between expansion in bipartite graphs and efficient online matching modeled via several games. In the basic game, an opponent switches on and off nodes on the left side and, at any moment, at most K nodes may be on. Each time a node is switched on, it must be irrevocably matched with one of its neighbors. A bipartite graph has c-expansion up to K if every set S of at most K left nodes has at least c|S| neighbors. If all left nodes have degree D and c is close to D, then the graph is a lossless expander. We show that lossless expanders allow for a polynomial-time strategy in the above game, and, furthermore, with a slight modification, they allow a strategy running in time polylog(N), where N is the number of left nodes. Using this game and a few related variants, we derive applications in data structures and switching networks. Namely, (a) 1-query bitprobe storage schemes for dynamic sets (previous schemes work only for static sets), (b) explicit space- and time-efficient storage schemes for static and dynamic sets with non-adaptive access to memory (the first fully dynamic dictionary with non-adaptive probing using almost optimal space), and (c) non-explicit constant-depth non-blocking N-connectors with polylog(N) time path finding algorithms whose size is optimal within a polylogarithmic factor (previous connectors are double-exponentially slower).
1 Introduction
A bipartite graph has offline matching up to K elements if every set of K left nodes can be covered by pairwise disjoint edges. A graph has c-expansion up to K if every subset S with at most K left nodes has at least c|S| right neighbors. (As usual, left and right are used to indicate the corresponding side of the bipartition containing the node.) The classical Hall's theorem establishes the relation between expansion and matching: a graph has offline matching up to K elements if and only if it has 1-expansion up to K. A matching can be found very efficiently (using algorithms that find a maximum matching, such as [HK73] or [CKL+22]).
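For the offline side, a maximum matching witnessing Hall's condition can be computed with a standard augmenting-path procedure. The following is a generic sketch of that classical technique, not the algorithms of [HK73] or [CKL+22]:

```python
# Minimal sketch: maximum bipartite matching via augmenting paths
# (Kuhn's algorithm). By Hall's theorem, a graph has offline matching
# up to K iff it has 1-expansion up to K.

def maximum_matching(adj, num_right):
    """adj[u] = list of right neighbors of left node u."""
    match_right = [None] * num_right   # match_right[v] = left node matched to v

    def try_augment(u, visited):
        # Try to match u, possibly re-routing previously matched left nodes.
        for v in adj[u]:
            if v not in visited:
                visited.add(v)
                if match_right[v] is None or try_augment(match_right[v], visited):
                    match_right[v] = u
                    return True
        return False

    size = 0
    for u in range(len(adj)):
        if try_augment(u, set()):
            size += 1
    return size, match_right
```

On a graph with 1-expansion up to K, every set of at most K left nodes is fully matched; when expansion fails (e.g., two left nodes sharing a single neighbor), the matching is necessarily smaller.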
Is there anything similar for online matching? We show Hall-type results (but only in one direction), namely we establish expansion properties that are sufficient to guarantee online matching with efficient algorithms, and we do this in different settings modeled by different games. Moreover, we show that results of this kind have interesting applications.
Let us introduce the basic model, which is the cleanest among the types of online matching we consider, and also the most challenging for the design of an efficient matching algorithm. (The other models are used as intermediate steps in some proofs, and some also show up in the applications. They are introduced close to the place where they are used.)
The basic online matching game
For the sake of this discussion, let us interpret left nodes of a graph as clients and right nodes as servers. The bipartite relation models the fact that a client can only be satisfied by certain servers. If the graph has offline matching up to K elements, then for every set of at most K clients, one can assign unique servers. In incremental matching up to K, irrevocable assignments must be made on-the-fly as clients arrive and request access to a server. The condition is that at most K clients arrive. In online matching up to K, each client may, in addition, release the assigned match. The condition is that at most K clients may simultaneously need access to a server. We also allow a relaxed notion, in which a server may be assigned to up to L clients. As mentioned, the formal definition uses a game.
Online matching game. The game with parameters K and L is played on a fixed graph. Two players, called Requester and Matcher, know this graph and alternate turns. Together they maintain a subset M of edges, which is initially empty. Requester starts. At his turn, he may remove zero or more edges from M. After this, M should contain at most K-1 edges. Also, he must select a left node u that is not covered by M. At her turn, Matcher may add an edge to M. After this, u should be incident on an edge of M, and each right node must be incident on at most L edges from M. If these conditions are not satisfied, then Matcher loses.
Definition.
A graph has online matching up to K elements with load L, if Matcher has a strategy in the above game with parameters K and L in which she never loses. If the load is omitted, then L = 1 is assumed.
A fully connected bipartite graph with right size K has online matching up to K. Both graphs in the figure have offline matching up to 2 elements, but neither has online matching up to 2 elements. The right one has incremental matching up to 2 and the left one does not. (Requester wins the game on the right graph with the following sequence of requests and retractions. He first adds the middle left element. Matcher has to assign to it the top right neighbor (otherwise Requester wins at the next step by adding the bottom left node). Requester next adds the top left node, which can only be matched with the bottom right node. At the next step he retracts the middle left node and adds the bottom left node. At each moment at most 2 left nodes have active matching requests, and we conclude that the right graph does not have online matching up to 2.)
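On very small graphs, whether Matcher can avoid ever losing (with load 1) can be decided by brute force, since the game is a finite-state safety game: a matching M is "safe" if Matcher can answer every Requester move and stay within safe states forever. The sketch below computes the greatest such set by iterated removal; it illustrates the definition and is not a procedure used in the paper.

```python
from itertools import combinations

def matcher_wins(adj, K):
    # Brute-force safety-game solver for tiny graphs (load 1).
    n = len(adj)
    edges = [(u, v) for u in range(n) for v in adj[u]]

    def is_matching(es):
        return (len({u for u, _ in es}) == len(es)
                and len({v for _, v in es}) == len(es))

    # all matchings with at most K edges are candidate states
    states = {frozenset(es)
              for r in range(K + 1)
              for es in combinations(edges, r)
              if is_matching(es)}
    safe = set(states)

    def has_reply(M2, u):
        # Matcher must place u on a free right node, reaching a safe state.
        covered_r = {v for _, v in M2}
        return any(v not in covered_r and (M2 | {(u, v)}) in safe
                   for v in adj[u])

    changed = True
    while changed:
        changed = False
        for M in list(safe):
            lost = False
            # Requester: shrink M to any sub-matching with <= K-1 edges,
            # then request any uncovered left node.
            for r in range(min(len(M), K - 1) + 1):
                for sub in combinations(M, r):
                    M2 = frozenset(sub)
                    covered_l = {u for u, _ in M2}
                    if any(not has_reply(M2, u)
                           for u in range(n) if u not in covered_l):
                        lost = True
                        break
                if lost:
                    break
            if lost:
                safe.discard(M)
                changed = True
    return frozenset() in safe
```

For example, the complete bipartite graph with 2 left and 2 right nodes has online matching up to 2, while two left nodes sharing a single right neighbor do not.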
Remark.
The objective in the game is different from the extensively studied dynamic and online matching problems in the literature, in which the graph is not fixed and the request consists of a left node with its edges (adversarially chosen). In dynamic matching, matches may be revoked. In both areas, the objective is to maintain a matching for as many active requests as possible, see for example [HKPS20, LMSVW22]. Our definition is incomparable to this. On one side, the definition is weaker because the graph is fixed and known to the players. On the other side, it is stronger because we require a matching for all requested nodes and we may not change previously assigned matches. See appendix D for more related results.
Feldman, Friedman, and Pippenger [FFP88, proposition 1] have shown that if a graph with left degree D has (D/2)-expansion up to 2K, then it has online matching up to K. By a similar argument, (D/(2L))-expansion up to 2K implies online matching up to K with load L, see appendix A. Unfortunately, matches are computed in time exponential in N.
In applications, parameters N and K are given, and a graph is needed with left size N and online matching up to K. It is desirable that the graph has:
– few right nodes (ideally, close to K),
– few edges (ideally, close to the minimum required for expansion), and
– fast matching time (ideally, the algorithm should be “local,” i.e., it should inspect only the neighborhoods of a few left nodes).
Therefore, an important open question implicitly formulated in [FFP88] is to find graphs that achieve the above 3 objectives. We come close to this goal.
We show that a graph with expansion factor equal to a large fraction of the left degree (i.e., a lossless expander) has polynomial-time online matching and, moreover, if we allow small load, it has logarithmic time online matching.
Proposition 1.1.
If a graph with N left elements and left degree D has (2/3)D-expansion up to 2K, then it has online matching up to K, and each match of the strategy is computed in time poly(N).
Theorem 1.2.
If a graph with N left elements and left degree D has (2/3)D-expansion up to 2K, then it has online matching up to K with load O(log K), and the matching strategy requires polylog(N) amount of computation to compute each match and process each retraction. (The runtimes are expressed assuming the Word RAM model with cell size 1 bit. If the cell size is O(log N), which is common in the graph algorithms literature, then the runtimes decrease accordingly.)
The algorithms receive the game state in the most natural way, see the first paragraphs of section 2 and the definition in section 4 below for details. The algorithms use a datastructure to store information for faster computation of future matches. The load can be reduced to 1 by making O(log K) clones of the right side, each connected to the left side as in the original graph. This yields a graph in which the left degree and the right size increase only by a factor of O(log K) and the runtime remains polylog(N).
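The cloning step can be sketched as follows. This is a minimal illustration under the stated idea (a load-L matching in the original graph becomes a load-1 matching in the cloned graph); the concrete index layout is an assumption, not the paper's construction.

```python
def clone_right(adj, L, num_right):
    # Replace each right node v by L clones v, v + num_right, v + 2*num_right, ...
    # Every left node is connected to all clones of its original neighbors,
    # so the left degree and the right size grow by a factor of L.
    return [[v + i * num_right for v in nbrs for i in range(L)] for nbrs in adj]
```

An edge assigned to clone i of v in the new graph corresponds to one of the up-to-L assignments of v in the original graph.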
In theorem 1.2, the running time for computing a match is double-exponentially faster than in [FFP88]. There exist non-explicit constructions of lossless expanders (see lemma 5.2) in which the right size is K · polylog(N) and the number of edges is N · polylog(N). Hence, the 3 objectives (the right size, number of edges, and matching time) are simultaneously optimal up to a polylog(N) factor. Known explicit constructions of lossless expanders (see theorem 6.2) provide graphs with online matching in which the right size, number of edges, and matching time are optimal up to a factor that is quasipolynomial in log N.
These results add a new entry to the list of wonderful properties of lossless expanders (for an overview, see [HLW06, Chapter 10] or [CRVW02, section 1.3]). The power of the main theorem and of various related online matching games that are used in its proof will be illustrated in 3 applications below.
Proof ideas
Proposition 1.1 is proven in section 2. The idea is to assign arbitrary free neighbors for all requested nodes, with the exception of left nodes that have 1/3 of their neighbors already assigned to other nodes. Such a node is said to be critical and is “protected” by receiving a carefully chosen virtual match. This match is converted into a real match if the node is requested and it is released if the fraction of busy neighbors decreases below 1/3. The large expansion and the unique neighbor property of lossless expanders are used to show that not too many nodes can be simultaneously critical and to find the virtual matches.
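The protection mechanism above can be sketched as a skeleton. This is only an illustration of the greedy-with-protection idea, not the paper's algorithm: in particular, the careful choice of virtual matches via expansion is omitted, and the fallback when no neighbor is free is left out (expansion rules it out in the real strategy).

```python
def make_matcher(adj):
    # Illustrative skeleton of the strategy behind Proposition 1.1.
    busy = set()          # right nodes currently assigned
    match = {}            # left node -> its assigned right node

    def is_critical(u):
        # a left node is critical when at least 1/3 of its neighbors are busy
        return sum(v in busy for v in adj[u]) * 3 >= len(adj[u])

    def request(u):
        free = [v for v in adj[u] if v not in busy]
        if not free:
            return None               # expansion is used to rule this case out
        v = free[0]                   # the real strategy protects critical
        busy.add(v)                   # nodes here with virtual matches
        match[u] = v
        return v

    def retract(u):
        busy.discard(match.pop(u))

    return request, retract, is_critical
```

The point of the criticality test is that, in a lossless expander, not too many nodes can be simultaneously critical, so virtual matches for all of them can be found.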
For theorem 1.2, the idea is to combine 2 matching strategies. The first one is the slower procedure from proposition 1.1. The second one, presented in section 3, is a greedy procedure that runs in time polylog(N) as required, but cannot assign matches for a few problematic left nodes. Fortunately, a small subset containing these nodes can be identified well in advance, i.e., many requests before such a problematic request might happen, and handled by the slower procedure on a separate copy of the graph. In particular, this implies that there are not too many such bad requests, and this leads to a small amortized runtime. We next interlace the two procedures on further copies of the graph. This allows de-amortization and leads to the claimed fast worst-case running time.
Application 1: 1-bitprobe storage schemes for dynamic sets.
The goal is to store a subset S of a large set {1, ..., n} to answer membership queries “Is x in S?”. Let k be the size of S. A simple way is to store S in a sorted list. This requires O(k log n) bits of memory, and given x, one can determine whether x is in S by binary search, reading O(log k · log n) bits from the table. An alternative is to have a table of n bits and set bit x equal to 1 if and only if x is in S. Now the query “Is x in S?” can be answered by reading a single bit. Also, one can insert or delete an element by modifying a single bit. The cost is that the table is long, since typically k is much smaller than n. We show that the advantages of the latter approach can be obtained with a data structure whose size is close to optimal.
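The second approach is the trivial 1-probe scheme; a minimal sketch:

```python
class BitVector:
    # The trivial 1-probe scheme: one bit per element of the universe.
    # Queries and updates touch a single bit, but the table has n bits,
    # which is wasteful when the stored set is much smaller than the universe.
    def __init__(self, universe_size):
        self.bits = bytearray(universe_size)

    def insert(self, x):
        self.bits[x] = 1

    def delete(self, x):
        self.bits[x] = 0

    def query(self, x):
        return self.bits[x] == 1
```

The schemes discussed below keep the single-probe query while shrinking the table far below n bits, at the price of a small error probability.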
A 1-bitprobe storage scheme (also called a bit vector) is a data structure that answers a membership query “Is x in S?” by reading a single bit. It is a fundamental data structure introduced by Minsky and Papert in their book on perceptrons [MP69]. See [BMRV00, Rom14, GR17, DPR22] for historic and recent references. In [BMRV00], lossless expanders are used to build 1-bitprobe storage schemes with short tables in which membership queries are answered probabilistically with small error probability ε. (Such 1-bitprobe storage schemes are different from Bloom filters, which store an approximation of the set. More precisely, a Bloom filter stores a superset S' of the intended S. Thus for every x in S' \ S (the false positives) the error probability of the query “Is x in S?” is 1, and for x in S or x outside S' the error probability is 0; the probability, over the choice of the hash functions used by the Bloom filter, that an element is a false positive is small.) Using a non-explicit expander they obtain storage size O((k/ε²) log n). Note that this is close to the Ω(k log(n/k)) lower bound for any set data structure. They also have an explicit construction achieving a larger storage size. Ta-Shma [Ta-02] and Guruswami, Umans, and Vadhan [GUV09, Theorem 7.4] give explicit constructions with smaller storage size. In all these schemes, making a membership query (i.e., finding the location in the data structure of the bit that is probed) takes time poly(log n).
These 1-bitprobe storage schemes work for static sets, in the sense that any updating of S requires the recomputation of the entire data structure, which takes time polynomial in k. We obtain explicit 1-bitprobe storage schemes for dynamic sets. Membership queries also take time poly(log n). Insertion and deletion of an element takes time poly(log n). The storage size is smaller than in the previous explicit schemes for static sets provided k is not too large, see table 1. Full definitions are given in section 7. The proofs only depend on sections 3 and 6.
| storage size | reference |
|---|---|
| (non-explicit) | [BMRV00] |
| (explicit) | [BMRV00] |
| | [Ta-02] |
| | [GUV09] |
| | Theorem 7.1 |
All previous explicit 1-bitprobes required lossless expanders with a special “list-decoding” property (see [GUV09, theorem 7.2]), while our approach works with any lossless expander. Thus future improvements in explicit lossless expanders will give better dynamic 1-bitprobes. This feature of our method also opens the possibility of practically attractive implementations using constructions based on tabulation hashing [Tho13] or empirical hashing methods; see the remark on page 7.
Proof idea. This result does not follow directly from theorem 1.2 or proposition 1.1. It follows from a related but incomparable result. Let V be a subset of left nodes of a graph with left degree D. A (1-ε)-rich matching for V is a set of edges such that each node in V is covered at least (1-ε)D times. We also want, for some small number L, that each right node is incident to at most L of these edges. If every set V of size at most K has such a matching, we say that the graph has (1-ε)-rich matching up to K with load L. This is stronger than plain matching as soon as (1-ε)D is at least 1. On the other hand, we consider a weaker version of the online matching game by adding the restriction of m-expiration: Requester must retract an edge at most m rounds after it has been added to the matching.
We show that a graph with (1-ε)D-expansion up to K has (1-2ε)-rich online matching up to K with load O(log K) in the game with m-expiration. This follows by a modification of the greedy algorithm in section 3. The modification is given in section 6 (sections 2 and 4 are not needed for this application). Next, composing such a graph with a certain graph based on simple hashing, we obtain explicit graphs with the same type of matching but with constant load. Moreover, the right size is almost optimal, see corollary 6.3. A slightly weaker result is proven in [BZ23, corollary 2.13], without explicitly referring to matching.
For the 1-bitprobe, we use a graph that has (1-ε)-rich matching (with constant load) up to 2k for the game with expiration. To each right node we associate 1 bit of the bitvector, and initially all these bits are set to 0. When an element is inserted in S, it is matched with a (1-ε) fraction of its neighbors and the associated bits are set to 1. When we query some x not in S, we can still match x with a (1-ε) fraction of its neighbors (because the graph has matching up to 2k and there exist only at most k matches for the elements in S), which means that the associated bits are set to 0. Therefore for every x, a (1-ε) fraction of its neighbors indicate if x is in S or not. This assumes that the game has m-expiration, but this requirement can be easily satisfied by periodically refreshing the matches of nodes with old assignments.
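A toy sketch of this scheme follows. Here `rich_match` is a stand-in for the rich-matching strategy (an assumption, not the paper's construction): it is assumed to return the right neighbors matched to x, a (1-ε)-fraction of adj[x].

```python
import random

def make_bitprobe(adj, rich_match):
    # Sketch of the dynamic 1-bitprobe scheme built from rich matching.
    bits = {}       # right node -> bit (absent means 0)
    matched = {}    # element -> its matched right nodes

    def insert(x):
        matched[x] = list(rich_match(x))
        for v in matched[x]:
            bits[v] = 1             # matched neighbors' bits go to 1

    def delete(x):
        for v in matched.pop(x):
            bits[v] = 0

    def query(x):
        v = random.choice(adj[x])   # a single probe; wrong with prob <= eps
        return bits.get(v, 0) == 1

    return insert, delete, query, bits
```

Since a member has a (1-ε) fraction of its neighbor bits set to 1 and a non-member has a (1-ε) fraction set to 0, one random probe errs with probability at most ε.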
Application 2: static and dynamic dictionaries with non-adaptive probes
A dictionary is a datastructure for storing a set S of items, where an item is a pair (key, value) and no two items have the same key. The keys are elements of a large ambient set, called the Universe. We ignore in our discussion the value component, because in most implementations the associated value can be retrieved in constant time from the location of the item in the datastructure. Therefore we view the datastructure as a table of cells, where each cell can store a key. The static dictionary supports the operation query(x), which returns the index of the cell containing x if x is in S, and NIL otherwise. Note that this is stronger than the membership query from Application 1 with bitprobe storage, which only had to return 1 bit indicating if x is in S or not. The dynamic dictionary supports in addition the operations insert(x) and delete(x) for updating the dictionary.
The standard implementations use hash functions and various strategies for handling collisions, such as chaining, linear probing, cuckoo hashing, and so on. The perfect hashing scheme for static dictionaries of Fredman, Komlós and Szemerédi [FKS84] does the query operation in O(1) time. Dietzfelbinger et al. [DKM+94] have a scheme for dynamic dictionaries with update operations running in expected amortized O(1) time. There are improvements in many aspects, but it is not in our scope to present this vast and important area. We refer the reader to [LPPZ23, PY20, LPP+24] for discussions of the literature that is relevant for our application.
An operation(x) (where operation is one of query, insert or delete) is non-adaptive if the locations in the datastructure that are accessed during its execution are a function (possibly probabilistic) of x only (so, they are independent of previous operations).
Note that all the schemes based on hash functions are adaptive: first a description of the hash function h stored in the datastructure has to be retrieved in order to calculate h(x), and next the cell with address h(x) in the hash table must be probed. In many schemes, subsequent additional adaptive probes are required to handle collisions. Binary search and binary search trees (which are stronger than hash tables because they also support predecessor search) also have adaptive access to the memory.
Are there efficient schemes with non-adaptive operations? This is an interesting theoretical question. It also has practical implications, because non-adaptive probing is suitable for parallel algorithms, which can reduce the overall response latency. Persiano and Yeo [PY20, LPP+24] have shown lower bounds implying that query cannot be done non-adaptively with O(1) probes. However, if the probes are done in parallel, the query time can still be constant.
As in Application 1, let n be the size of the Universe and k the size of S (for the dynamic version, the set S has during its entire history at most k items). (Many papers in the datastructure literature, for instance [LPPZ23, LPP+24], use different letters for the size of the Universe and the size of S.) Typically, k is much smaller than n. The lower bound in [PY20] is in the standard cell probe model, in which computation is free of charge and the data structure consists of s memory cells, each cell having size w bits. The lower bound in [PY20] is more general, but in the particular and natural setting w = O(log n), it states that for a static dictionary with non-adaptive query (which can be randomized, but with small error probability) the expected number of probes is Ω(log(n/k)). Recently, Larsen et al. [LPPZ23, LPP+24] have shown that this lower bound is tight, improving upon [BHP+06]. Their construction is non-explicit (more precisely, they build using the probabilistic method a graph with 1-expansion up to k which is assumed to be available to query) and they ask for an efficient explicit scheme. We present such schemes for both static and dynamic dictionaries (note that the result in [LPPZ23, LPP+24] is only for the static case).
Our static dictionary has s = O(k) cells (with cell size w = O(log n)), and the number of non-adaptive probes is polylogarithmic in n. (For simplicity, we did not optimize the constant; for any ε > 0, one can achieve s = (1+ε)k, with easy adjustments in the proof.) It is explicit and deterministic, and the runtime for computing one probe (i.e., computing the location in the datastructure that is read) is poly(log n). The dictionary in [LPPZ23, LPP+24] achieves better parameters but, as mentioned, is non-explicit (we note that their scheme also works for larger cell size w, in which case the number of probes is smaller).
For the dynamic version, we have two schemes which are similar; however, one is probabilistic and the other one is deterministic. Since we use an explicit graph, in both schemes the runtime for doing one probe in the datastructure is poly(log n). In the probabilistic version, all operations are non-adaptive, the datastructure size is s = O(k log n) (with cell size w = O(log n)), and the number of probes is polylogarithmic. The operations query and delete succeed with probability 1, and each execution of insert(x) fails to insert with small probability, where the probability is over the randomness of the previous update operations. As far as we know, this is the first dynamic dictionary in which all operations are non-adaptive (if we ignore the trivial solution of using a bitvector of length n). In the deterministic version, query and delete are non-adaptive but insert is adaptive. (If the history of the dictionary is limited to a bounded number of operations (for instance, if the dictionary is reconstructed after every batch of operations), then all operations can be made non-adaptive.) The advantage is that s = O(k) (with w = O(log n)) is optimal up to the constant factor, and the number of probes is smaller.
Östlin and Pagh [ÖP02] and Berger et al. [BHP+06] have also used a type of online matching in expanders to implement dynamic dictionaries in which query is non-adaptive and insert and delete are adaptive. Compared to our approach, the main difference is that they allow the Matcher to change previous matches. This makes the matching strategy easier, but causes the dictionary updates to be adaptive. Also, they use lossless expanders whereas we use graphs with 1-expansion, and for this reason we obtain a smaller datastructure size.
Proof ideas. For the static dictionary we present here the entire proof. Exactly as in [LPPZ23], we take a bipartite expander that has the set of left nodes equal to the Universe and 1-expansion up to k. Each right node is associated with a memory cell in which elements of the Universe can be written (recall that cell size w = O(log n)). By Hall's theorem, for any subset S of the left side of size at most k, there is an injective mapping σ from S to the right side such that σ(x) is a neighbor of x. We store every element x of S at the memory cell σ(x). When we do query(x) for an arbitrary x in the Universe, we inspect every right neighbor of x to see if one of them contains x. [LPPZ23] obtains the graph with the probabilistic method. Instead, we use the explicit 1-expander in proposition 1.3 and we obtain the static dictionary with the parameters claimed above.
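The static dictionary just described can be sketched directly. Here the injective `matching` is assumed to be given (e.g., computed offline via Hall's theorem, as in the augmenting-path sketch earlier); this is an illustration, not an optimized implementation.

```python
def build_static_dict(adj, S, matching):
    # adj[x] = right neighbors of key x; matching[x] = the right node
    # (a neighbor of x) assigned to x by an injective Hall matching of S.
    cells = {}
    for x in S:
        cells[matching[x]] = x      # store x in its matched memory cell

    def query(x):
        # Non-adaptive: the probed cells depend only on x, namely adj[x].
        for v in adj[x]:
            if cells.get(v) == x:
                return v            # index of the cell containing x
        return None                 # NIL: x is not in S

    return query
```

The number of probes equals the left degree, and all probes can be issued in parallel since they depend only on x.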
Proposition 1.3.
For each n and k, there exists an explicit graph with left size n, right size O(k) · quasipoly(log n), left degree quasipoly(log n), and 1-expansion up to k.
The proof is given in appendix C. It relies on the dispersers in [TSUZ07] and it improves similar constructions in [BMVZ18, section 3.2], [Teu14], and [Zim14, Lemma 4].
For the dynamic dictionary (both the probabilistic and the deterministic versions) the proof is similar, but instead of using the offline matching guaranteed by Hall's theorem, we do online matching in the game with the m-expiration restriction (recall that this means that no match is allowed to last more than m rounds). The key fact is that in section 3 we design a Matcher strategy for these games that finds a match for a left node only by inspecting its neighbors, and so it is non-adaptive. As in the static case, we associate to each right node a memory cell, and when x is inserted it is written in the cell matching it. To do query(x), we check if the cell associated to some neighbor of x contains x or not. By proposition 3.1, the Matcher strategy wins in graphs with 1-expansion, and then, essentially, by instantiating with the explicit expander in proposition 1.3, we obtain small s and probe complexity.
As with the 1-bitprobes, it remains to satisfy the expiration restriction by refreshing the matches of some elements. The choice of which element is refreshed is essentially the difference between the probabilistic and the deterministic versions. In the probabilistic version, we play the online matching game with m-expiration for a relatively large m. When we do an insert, we also refresh a random element of the Universe (which, recall, has size n). Note that insert remains non-adaptive. The probability that some matching is not refreshed for more than m rounds is small. In the deterministic version, we play the online matching game with O(k)-expiration, and now when we do an insert, we refresh the oldest element in the dictionary. This causes insert to be adaptive. The advantage is that the expiration parameter is smaller (O(k) compared to roughly n log n), and therefore we can win the game in a graph with a smaller number of clones, implying a smaller s in the deterministic version.
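The deterministic refresh-the-oldest policy can be sketched as follows. This is illustrative only: the `rematch` callback stands in for re-running the matching strategy on an element, and the bookkeeping details are assumptions.

```python
from collections import OrderedDict

def make_refresher():
    # Sketch of the deterministic refresh policy: on every insert, the match
    # of the oldest stored element is refreshed, so with at most k elements
    # no match survives more than O(k) rounds (the expiration restriction).
    age = OrderedDict()             # elements ordered from oldest to newest

    def insert(x, rematch):
        age[x] = None
        if len(age) > 1:
            oldest = next(iter(k for k in age if k != x))
            rematch(oldest)         # re-run the matching strategy for it
            age.move_to_end(oldest) # it is now the freshest assignment

    def delete(x):
        age.pop(x, None)

    return insert, delete, age
```

Refreshing the oldest element makes insert adaptive (the oldest element depends on the history), which is exactly the trade-off described above; the probabilistic version instead refreshes a uniformly random element of the Universe and stays non-adaptive.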
Application 3: non-blocking networks.
Switching networks are used for the transfer of information between a large number of nodes. For example, in the early days of telephone networks, when there were only a few phones in a town, people made pairwise connections between all phones. When the number of phones grew, this was no longer feasible, and switching stations were introduced. Their theoretical study was initiated by Shannon [Sha50] and was the motivation for introducing expander graphs [BP73, Mar73]. Currently there is a large literature, both in the engineering and the theoretical computer science fields. See the book [Hwa04] for more history.
Nowadays, switching networks are important in various engineering applications where a large number of components need to communicate. Unlike telephone networks, these applications mainly concern a bipartite variant with inputs on one side and outputs on the other side, see [Hwa04]. In such graphs, the aim is to connect all output nodes to any permutation of input nodes using node disjoint paths.
An n-network is a directed acyclic graph in which n nodes are called inputs and n nodes are called outputs. Its size is the number of edges. A rearrangeable n-network is such a network in which for every 1-to-1 function π from outputs to inputs, there exist node-disjoint paths that connect each output node o to the input node π(o).
For example, a fully connected bipartite graph with left and right sets of size n defines a rearrangeable n-network with n^2 edges. Another example is given in figure 1. The goal is to construct networks with a minimal number of edges. Since there are n! different mappings, and a network with s edges has only 2^s subsets of edges available to realize them, the minimum size is at least log2(n!), which is at least n log2(n) - O(n) by Stirling's formula.
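The counting bound can be checked numerically; the helper below computes log2(n!) via the log-gamma function (a standard identity, not part of the paper).

```python
import math

def log2_factorial(n):
    # log2(n!) = lgamma(n + 1) / ln(2)
    return math.lgamma(n + 1) / math.log(2)

# Stirling: log2(n!) >= n*log2(n) - n*log2(e), so the size of a
# rearrangeable n-network is at least n*log2(n) - O(n).
```

For instance, log2(1000!) is about 8530, while the Stirling estimate 1000*log2(1000) - 1000*log2(e) gives about 8523.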
We use a generalized variant of rearrangeability, in which several output nodes may be connected to the same input, but each output is connected to at most 1 input. In terms of broadcasting, this means that several outputs can listen to the same input. Moreover, the connection problem needs to be solved dynamically. For this, 2 closely related requirements exist, which are called strict-sense non-blocking connector and wide-sense non-blocking connector, see [Hwa04]. We use the second one, which is weaker, and is defined by a game.
Connection game. The game is played on an n-network. Two players, called Requester and Connector, both know the network and alternate turns. They maintain a set of at most n trees. The root of each tree must be an input and the leaves must be outputs. The trees must be node-disjoint. Initially, the set of trees is empty. Requester starts.
At his turn, Requester may remove zero or more trees. Afterwards, he may select an input u and an output o such that o does not lie on any of the trees.
At her turn, Connector may create or extend a tree so that the above conditions are satisfied. Afterwards, there should be a tree in which u is the root and o is a leaf. If this is not true, she loses.
Definition.
A wide-sense non-blocking generalized n-connector is an n-network in which Connector has a strategy in which she never loses. We refer to such a network simply as an n-connector.
A fully connected bipartite graph is an n-connector. An n-connector with O(n log n) edges was given in [FFP88]. This is optimal within a constant factor. The graph is explicit, but the path finding algorithm (which is the algorithm that computes Connector's reply) is very slow. Afterwards, in [ALM96] an explicit n-connector of size O(n log n) is constructed in which also the runtime of the path finding algorithm is O(log n), and this is optimal within a constant factor. See [ALM96] or [Hwa04, chapter 2].
The depth of a network is the length of the longest path between an input and an output. We focus on constant-depth n-connectors. In [PY82] it is shown that n-connectors of depth k have at least Ω(n^(1+1/k)) edges. In [FFP88], non-explicit constructions of constant-depth n-connectors are given of size close to this bound, but again the path finding algorithm runs in time exponential in n. They ask whether a generalized connector exists with small size and an efficient path finding algorithm. They do not specify explicitly, but “small size” is usually considered to be a value that is n^(1+o(1)), see [WZ99], and “efficient” should ideally mean that the runtime is poly(log n). Some explicit constant-depth n-connectors are known with path finding algorithms running in time poly(log n), but their size is not optimal, see [Hwa04, chapter 2]. For instance, for odd k, the Clos network of depth k has size O(n^(1+2/(k+1))).
In [WZ99, Th. 5.4] an explicit construction of size n^(1+o(1)) was obtained, but the path finding algorithm is the same slow one from [FFP88].
In section 5, we present a non-explicit constant-depth n-connector whose size is optimal up to n^(o(1)) factors and with a path finding algorithm running in time poly(log n). Here we assume that the input of this algorithm is a description of the state of the connection game that includes the graph (for each node, the input specifies the degree and a list of neighbors in arbitrary order), and the algorithm may use a data structure.
Corollary 1.4.
For all n and constant k, there exists an n-connector of depth k and size n^(1+O(1/k)) · polylog(n), with a poly(log n) time path finding algorithm.
An n-connector is explicit if the i-th neighbor of a node is computed in time poly(log n). We present an explicit connector with small size and a path finding algorithm running in time quasipolynomial in log n.
Corollary 1.5.
For all n and constant k, there exists an explicit n-connector of depth k and size n^(1+O(1/k)) · quasipoly(log n), with a path finding algorithm with runtime quasipoly(log n).
Proof idea. n-connectors are obtained by composing several graphs with online matching, see fig. 2 for the idea. We apply this to lossless expanders, which according to theorem 1.2 have fast online matching. The path finding algorithm for the construction of depth k applies k times an online matching algorithm, and hence is fast as well. By instantiating with the non-explicit expander from lemma 5.2 and with the explicit one from theorem 6.2, we obtain the parameters in the two corollaries.
Open questions
Theorem 1.2 assumes large expansion, logarithmic load, and that the algorithm uses a data structure. We do not know if we can weaken any of these assumptions. The strongest claim that we cannot refute is the following.
Open question 1. Does there exist a constant c < 1 such that each graph with left degree D and cD-expansion up to 2K has online matching up to K with load 1, where the time for computing matches is polylog(N) without using a data structure? (Recall that the input is the state of a game with the underlying graph. We do not need to process retractions, since there is no data structure.)
The following open question is weaker than the above open question. However, if true, then the size of explicit connectors with depth k can be further improved. (Recall that the lower bound is Ω(n^(1+1/k)).)
Open question 2. Does 1-expansion up to K imply online matching up to K with some small load L, in which matches and retractions are processed in time polylog(N) with a data structure? (In other words, can the restriction of m-expiration be removed in proposition 3.1?)
The above claim on the applications follows from the explicit 1-expander in proposition 1.3, which has right size O(k) · quasipoly(log n) and left degree quasipoly(log n). In contrast, the state-of-the-art explicit graph with lossless expansion up to K has larger right size and degree, see theorem 6.2 below, obtained from [LOZ22, theorem 18].
A further relaxation of the last open question is to require runtime poly(N) instead of polylog(N). Proving that such an algorithm does not exist requires some hardness assumption, because if P = NP, the algorithm from [FFP88, proposition 1] runs in time poly(N).
The main result is only used for the 3rd application, and the other applications follow from parts of its proof. The question is whether the main result can be strengthened to get all applications from a single result. We tried hard, without success, to use a single result to obtain both the 1st and the 3rd application. Therefore, we expect a negative answer to the following question.
Open question 3. Does -expansion up to imply -rich online matching up to for some ?
Summary and final remarks
We analyze 3 related online matching games in bipartite graphs.
(A) The basic game, defined on page 1.
(B) The game with -expiration, which adds the requirement that Requester must drop any assignment after at most rounds.
(C) The -rich matching game with -expiration, which adds to game (B) the constraint that the Matcher must match a requested node with -fraction of its neighbors.
From the point of view of the Matcher’s strategy, game (B) is easier than game (A), and game (C) is more difficult than game (B) and incomparable with game (A). The logical dependencies between technical results and applications are given in the following figure. Game (A) is used for application 3 (-connectors), game (B) for application 2 (dictionaries with non-adaptive probes), and game (C) for application 1 (1-bit probes).
The table below provides a summary of various “Hall-type” results for online matching. If a graph with left nodes and left degree satisfies the expansion condition in column 1, then it has matching with the features in columns 2, 3, and 4. The 4th column is the worst-case runtime for finding or retracting one matching assignment.
expansion up to | matching up to | load | runtime per match | reference
---|---|---|---|---
 | offline |  | N/A | Hall’s Theorem
 | online |  |  | [FFP88], Corollary A.2
 | -expiration online |  |  | Proposition 3.1
 | online |  |  | Proposition 1.1
 | online |  |  | Theorem 1.2
For game (C), any graph with -expansion up to has -expiration online -rich matching up to with load , in which a matching assignment/retraction can be done in time . We show this for -expiration in section 6. The case of general -expiration can be shown similarly to proposition 3.1.
Technically, the most difficult result is the main one, theorem 1.2. It shows that lossless expanders admit fast strategies for matching up to in the basic game (A). In combination with known non-explicit and explicit constructions, it yields bipartite graphs that solve the main question in a classical paper of Feldman, Friedman, and Pippenger [FFP88]. These graphs have surprisingly strong properties. Note that the right size is only (where the extra factor is small in the non-explicit case and quasipolynomial in the explicit case), and hence the neighborhoods of the left nodes overlap a lot. Still, for each online matching assignment of a left node, we only inspect its neighborhood and make a few additional simple computations that take less time than scanning the neighborhood.
The proof uses game (B). This game admits a greedy strategy that only inspects the neighborhood of the requested node (and updates a few associated counters). This feature of the algorithm is important for application 2. In fact, this simple strategy makes the bulk of the assignments. A few remaining assignments may theoretically still be needed (we proved this for the adversarial setting of the game, but it is unlikely to happen in practice). For these, a more complex strategy (from proposition 1.1) is used, but running it requires only time per assignment.
The matching strategy for game (C) is a variant of the strategy for game (B). In open question 3, we ask whether we can obtain a strategy for game (C) without the condition of -expiration. Fortunately, in application 1, -expiration is not a big issue, because the condition can be satisfied by simply refreshing old matches, which adds only a constant factor to the runtime of the insertion algorithm.
2 Polynomial time online matching
For notational simplicity, we prove proposition 1.1 with a slightly stronger assumption: we assume expansion up to instead of . The original proposition is proven similarly. (Define nodes to be critical if they have at least matched neighbors (instead of ), and follow the proof with extra ’s and ’s where needed. In the second lemma below, bound the number of critical nodes by instead of .) We state this variant.
Proposition.
If a graph with left size and left degree has -expansion up to then it has an online matching algorithm up to in which each match is computed in time .
This algorithm is used in the faster -time algorithm in theorem 1.2.
Recall the online matching game with . Requester and Matcher maintain a set of edges. They alternate turns, and at their turn they do the following.
• Requester removes edges from so that . He also selects a left node . We say that he requests .
• If is not covered by , then Matcher must reply by adding an edge to . After this, must be a matching. The right node is called the match of .
The aim of Matcher is to provide correct replies indefinitely.
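As a concrete illustration of the round structure above, the following sketch implements a naive greedy Matcher (not the strategies of this paper) that pairs each requested left node with any currently unmatched neighbor; all names and the toy graph are illustrative only.

```python
# Minimal sketch of the online matching game with a naive greedy Matcher.
# The graph is given as adjacency lists; this greedy rule can lose on
# graphs where the strategies in this paper still win.

class GreedyMatcher:
    def __init__(self, adj):
        self.adj = adj              # adj[u] = list of right neighbors of u
        self.match_of = {}          # left node -> its matched right node
        self.used = set()           # right nodes currently matched

    def retract(self, u):
        """Requester switches u off: drop its edge from the matching."""
        r = self.match_of.pop(u, None)
        if r is not None:
            self.used.discard(r)

    def request(self, u):
        """Requester switches u on: reply with an unmatched neighbor."""
        if u in self.match_of:
            return self.match_of[u]
        for r in self.adj[u]:
            if r not in self.used:
                self.match_of[u] = r
                self.used.add(r)
                return r
        raise RuntimeError("Matcher loses: no free neighbor")

# On a complete bipartite toy graph, the greedy Matcher trivially survives.
adj = {u: [0, 1, 2] for u in range(3)}
m = GreedyMatcher(adj)
assert m.request(0) != m.request(1)
m.retract(0)
assert m.request(2) in adj[2]
```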
A matching algorithm has as input the state of the game after Requester’s move, Requester’s move itself, and a data structure (which stores information to speed up computations in future rounds). The state of the game consists of , , the graph, and the matching. Requester’s move consists of a list of retracted edges and the requested left node. In every strategy of this paper, 2 algorithms are executed that each process a part of Requester’s move.
• First, for each retracted edge, the retraction algorithm updates the data structure.
• Afterwards, the match generation algorithm is given the requested left node. It updates the data structure and outputs a match.
The proof uses 2 technical lemmas. Let be the set of neighbors of a set of nodes. Given a set of left nodes, we call a right node a unique neighbor of if has precisely 1 left neighbor in . The following lemma holds in any bipartite graph with left degree .
Lemma.
The number of unique neighbors of is at least .
Proof.
We need to lower bound the number of unique neighbors. The number of vertices in that are not unique equals . There are edges with an endpoint in . For each such edge, the right endpoint is either unique or has at least 2 neighbors in . Hence,
The lower bound of the lemma follows by rearranging. ∎
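The counting in this proof can be checked numerically. The rendering of the symbols below is ours: with left degree d, a set S with neighborhood N and u unique neighbors, the edge count gives d·|S| ≥ u + 2(|N| − u), hence u ≥ 2|N| − d·|S|. A randomized sanity check, with illustrative parameters:

```python
# Check the unique-neighbor bound u >= 2*|N(S)| - d*|S| on random
# left-d-regular bipartite graphs (our rendering of the lemma's symbols).
import random

def unique_neighbors(adj, S):
    """Right nodes with exactly 1 left neighbor in S."""
    count = {}
    for v in S:
        for r in adj[v]:
            count[r] = count.get(r, 0) + 1
    return {r for r, c in count.items() if c == 1}

random.seed(0)
d, L, R = 4, 30, 40
adj = {u: random.sample(range(R), d) for u in range(L)}
for _ in range(100):
    S = random.sample(range(L), random.randint(1, 10))
    N = {r for v in S for r in adj[v]}
    u = len(unique_neighbors(adj, S))
    assert u >= 2 * len(N) - d * len(S)
```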
The following lemma holds for graphs satisfying the assumption in the proposition.
Lemma.
Let be a subset of right nodes with . If a left set contains only nodes with , then .
Proof.
Suppose contains at least elements, and let be a subset of of size exactly . By assumption on , each of its nodes has at most neighbors outside . Thus, by expansion,
This simplifies to . But this contradicts and . ∎
Proof of the proposition..
The idea of the matching algorithm is to assign a “virtual match” to left nodes for which at least neighbors are matched. Note that there are 2 types of matches to which we refer as standard and virtual matches. In the bound, we count both types of matches. A virtual match is treated as an actual match and other nodes can not be matched to it. The virtual matches are stored in the datastructure.
Left nodes with at least matched neighbors (of both types) are called critical. A virtual match will be assigned to a left node if and only if is critical and has no match.
Algorithm for retracting a match . If is critical, then declare to be a virtual match. Otherwise, retract the match and retract all virtual matches of left nodes with fewer than matches.
Virtual matches of critical nodes are unique neighbors.
Generating a matching for a request . If the request is a critical node, then its virtual match is returned, and thus is now a standard match. Otherwise, is matched to any neighbor that does not have a match (of either type). (Such a neighbor exists because a non-critical node has more than unmatched neighbors.)
After this, there might be critical nodes which do not have a match. Let be the set of such nodes. Virtual matches for these elements are assigned 1 by 1 as follows.
Select an unmatched right node that has exactly 1 neighbor in . Below we explain that such a always exists. Let be this single neighbor. Remove from , and declare to be its virtual match. Add to all new critical nodes without a match. Keep repeating until is empty. (This must happen, because an element can be added to at most once.) This finishes the description of the matching algorithm.
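The virtual-match allocation loop at the end of the algorithm can be sketched as follows, with the lost parameters abstracted away: the caller passes the set of critical nodes without a match, and each one receives a virtual match on a right node having exactly 1 neighbor in that set. In the full algorithm, newly critical nodes are appended after each assignment; that step is omitted in this sketch.

```python
# Sketch of the virtual-match allocation loop (schematic, our rendering).
# `pending` is the set of critical nodes without a match; `matched` holds
# the right nodes already used by standard matches.

def assign_virtual(adj, matched, virtual, pending):
    taken = set(matched) | set(virtual.values())
    while pending:
        count, owner = {}, {}
        for v in pending:
            for r in adj[v]:
                count[r] = count.get(r, 0) + 1
                owner[r] = v
        # the text argues such an r (unique and unmatched) always exists
        r = next(r for r, c in count.items() if c == 1 and r not in taken)
        v = owner[r]
        pending.discard(v)
        virtual[v] = r
        taken.add(r)
    return virtual
```

On a toy graph, `assign_virtual({0: [0, 1], 1: [1, 2]}, set(), {}, {0, 1})` gives both pending nodes disjoint virtual matches.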
Note that the 2 algorithms above require amount of computation. We may assume that , since otherwise the proposition is trivial. Hence, the runtime is . In the presentation of the algorithm a claim was made: the set of critical nodes always has at least 1 unique and unmatched neighbor. If this is true, the online matching algorithm always produces matches and the proposition is proven.
We first prove 2 other claims.
Proof that at any moment, at most nodes are critical. In the above algorithm, matches are added 1 by 1. Assume that just before allocating a match there are at most critical nodes. Then the number of standard and virtual matches is at most (and in fact, it is 1 less, but this doesn’t matter). Let be the set of matched right nodes with the new match included, thus . By the second lemma, there are still at most critical nodes.
Proof that at any moment, all nodes in have exactly matched neighbors. By construction, a node is placed in when it has at least matched neighbors. This condition is checked each time after a match is assigned, thus when a node is added to , it has exactly matched neighbors. As long as is nonempty, a virtual match is given to a left node such that has no other neighbors in , and then is removed. Thus for all other nodes in , the number of matched neighbors remains the same.
Proof that in the above matching algorithm, an unmatched node exists that has exactly 1 left neighbor in . Since all nodes in are critical, we have . By the assumption on expansion, . By the first lemma, has at least unique neighbors. At most of the unique neighbors can be matched, by the previous point. Hence, at least right nodes are unique and unmatched. Thus, if , the required right node exists, and if , no unique neighbor is needed. This finishes the proof of the claim inside the algorithm, and hence, the proposition is proven. ∎
3 Fast online matching with -expiration
In this section we present matching strategies for games in which Requester is restricted, culminating with a proof of proposition 3.1. They will be used in the proof of the main result, theorem 1.2, and in the application with non-adaptive dictionaries. Also, similar games define versions of -rich matching in section 6, which are used in the application with bitprobe storage schemes.
• In the incremental matching game, Requester can not remove edges from . Note that such a game can not last for more than rounds. Matcher wins if he can reply times.
• The -round matching game is the same as the original game, but Matcher already wins if he can reply times.
• In the -expiring matching game, for each , Requester must remove the edge added in round during one of the rounds . Matcher wins if he can reply indefinitely.
We say that a graph has incremental matching, respectively, -round matching, and -expiring matching if Matcher has a winning strategy in the corresponding games.
Note that incremental matching up to and -round matching are equivalent, because in the -round game, removing edges from the matching can only help Matcher. Also, -expiring matching implies -round matching.
Examples. Recall the 2 graphs from the introduction, shown again here. The left graph has offline matching up to . It does not have incremental matching up to , because if the middle node is selected first, then 1 of the 2 other nodes can not be matched.
The right graph does have incremental matching up to . But it does not have -round matching up to , and also no -expiring matching up to , because Requester’s strategy from footnote 3 has 3 rounds, and in the 3rd round, the match from the 1st round is retracted.
We now define fast matching algorithms. Graphs are given in adjacency list representation and checking whether an edge belongs to the matching requires time.
Definition.
Consider a graph with left nodes and left degree . We call a matching strategy fast if the strategy can be presented by a retraction and a match generation algorithm as explained in the previous section and the runtime of both algorithms is .
In [MRS11, p229 bottom] and [BZ23, Corollary 2.11] it is proven that -expansion up to implies fast incremental matching up to with load . In the remainder of this section we prove the following extension of this result.
Proposition 3.1.
If a graph has -expansion up to , then it has -expiring fast matching up to with load .
An -clone of a graph is a graph obtained from copies of by identifying the left nodes.
Remarks.
– A graph has -expansion if and only if an -clone has -expansion.
– For each of the different matching games above, the following holds.
A graph has matching with load if and only if an -clone has matching with load 1.
The proposition follows from these remarks, corollary 3.4, and lemma 3.5 below. Corollary 3.4 is a variant of the result from [MRS11, section 2.3], which we prove first.
Lemma 3.2.
If a graph has -expansion up to , then a -clone has incremental matching up to .
Proof.
Let the copies of the clone be ordered. Node is a free neighbor of if edge is not in the matching.
Matching strategy. Given a request , select the first copy in which has a free neighbor, and match to any free neighbor in this copy.
For , correctness is trivial. For larger , we use induction. Assume the statement is already proven for some value of . We prove that with 1 more copy, incremental matching up to is obtained.
Fix a moment in the game. Let be the set of edges in that belong to the first copy. Let be the set of requests whose matches do not belong to . The total number of requests is , and this is bounded by during the incremental matching. It remains to show that , since this implies and the result follows by the inductive hypothesis.
Let denote the neighbors of in the first copy. Note that is covered by edges in , because if request is not matched in the first copy, then its neighbors are covered by by choice of the algorithm. By 1-expansion, we have
Note that the above proof provides a matching strategy which is fast, because it suffices to check all the neighbors of the requested left node in the clone, which is done in time equal to the degree of the -clone times . There is no need for a data structure. However, when we pass from matching in the -clone to matching in the original graph with load, then we do need a data structure for storing the load, since iterating over the edges incident on a right node may take a long time.
Corollary.
If a graph has -expansion up to , then it has fast incremental matching up to with load .
Proof.
We run the algorithm from the previous lemma after merging the copies of each right node. We now use a data structure to maintain the load of each right neighbor, and a requested node is matched to a right neighbor with smallest load. Given an edge, the retraction algorithm simply decreases the load of the right neighbor. The runtime of match generation is . It remains to note that , thus the matching time is , so it is fast. ∎
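The merged-copies strategy of this corollary can be sketched directly: keep one load counter per right node and match each request to a neighbor of minimum current load (retraction simply decrements the counter). The corollary asserts that, given expansion, the load never exceeds the number of copies; the code below only illustrates the mechanics on a toy graph.

```python
# Sketch of fast incremental matching via load counters (illustrative).
# A retraction would simply do load[r] -= 1 for the retracted match (u, r).

def incremental_match(adj, requests):
    load, matching = {}, []
    for u in requests:
        r = min(adj[u], key=lambda x: load.get(x, 0))  # smallest-load neighbor
        load[r] = load.get(r, 0) + 1
        matching.append((u, r))
    return matching, load

matching, load = incremental_match({u: [0, 1, 2, 3] for u in range(4)},
                                   [0, 1, 2, 3])
assert max(load.values()) == 1   # perfectly balanced on this toy graph
```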
Lemma 3.3.
Let be a non-negative power of . If a graph has -expansion up to , then a -clone has -round matching up to .
Proof.
The matching strategy is the same as in lemma 3.2. The proof proceeds by induction on . For , the lemma is already proven, by lemma 3.2. Now assume that the lemma is already proven for some that is a multiple of . We show that with 1 extra copy, requests can be handled. We organize the requests in blocks of length . It suffices to show that while processing each such block, at least matches are assigned using the extra copy, and hence, the remaining requests can be processed by the other copies.
Fix a block and consider a moment during the processing of its requests. Let be the set of all edges of the first copy that at some point have been present in the matching since the processing of the block started (and might still be present). Note that , because at the start of the block at most edges can be present, and at most requests are processed during the block. In fact, we have until the last request is processed.
Let be the set of requests in the current block that were matched outside the first copy. We show that after adding each next match, except perhaps after the last request.
Again, let denote the set of neighbors of in the first copy. As in the previous lemma, is covered by , thus . Since was true during the previous step, after 1 more match, we have . By -expansion up to , we conclude that
and hence after adding each next match, except for the last one. ∎
Corollary 3.4.
If a graph has -expansion up to , then it has fast -round matching up to with load .
Proof.
If , then the result is trivial, because every nonempty graph has matching with load . If , then -round matching is equivalent to incremental matching up to , and the result follows from the previous corollary. Otherwise, we obtain -expansion from a -clone, and apply the lemma above. Next, exactly like in the above corollary, merge the copies of each right node, and use a counter to maintain its load. By the same analysis, each match is done in time, where is the maximal load. Since , the matching is fast. ∎
Lemma 3.5.
If a graph has -round matching, then a -clone has -expiring matching. If the -round matching is fast and with load , so is the matching for the -clone.
Proof.
The matching algorithm processes rounds on the first copy, then the next rounds on the other copy, then again rounds on the first one, and so on. At each switch, the matching has no edges in the copy because of -expiration. ∎
Recall that corollary 3.4 and lemma 3.5 imply proposition 3.1, thus its proof is finished.
Remark.
From proposition 3.1 we obtain -expiring matching. Two properties of the matching algorithm will be used in the application about dictionaries.
Each algorithm for generating matches for a node reads the data structure only for the load of the neighbors of (in fact, the load on copies). Therefore, we may assume that the queried memory only depends on the queried node , and not on the data structure or the matching.
In the proof of lemma 3.2, we used copies with expansion up to . In fact, we may merge graphs with less expansion: it is enough that the -th graph has expansion up to , since each copy allocates half of the remaining matches. Thus, if we merge graphs with expansion for , we obtain a graph that has -expiring matching with load . Moreover, the previous remark still applies.
4 Fast online matching
We finish the proof of the main result, theorem 1.2. The matching strategy combines an time greedy strategy from section 3 with the time strategy of section 2. The greedy strategy allocates most matches, while the polynomial one is used for a few problematic requests that are anticipated well in advance.
Recall that in fast online matching we use a data structure to compute matches. We consider a relaxed notion of fast matching that, besides algorithms to generate matches and process retractions, also has a preparation algorithm. The preparation algorithm is run at regular intervals and does not need to be fast; it prepares the data structure for fast computation of future matches.
Definition.
We say that a graph with left size and left degree has fast online matching with -step preparation and load if there exists an online matching algorithm that computes matches and processes retractions in time . Moreover, each time after matches have been assigned, it runs a preparation algorithm that takes time.
Remark.
A graph with fast online matching with preparation has fast online matching in the amortized sense. De-amortization is obtained as follows. A 2-clone of such a graph has fast online matching (in the standard worst-case sense) because blocks of subsequent requests can be given alternately to the copies: while one copy is used for assigning matches, the other can run its preparation algorithm (in little chunks at each request). Next, if we merge the 2 clones, the load increases only by a factor of .
Let and be graphs with vertices and , and with edges and . The union of and is the graph with vertices and edges .
Lemma.
Consider two graphs with the same left set of size . If the first has -expansion up to and the second has polynomial time online matching up to , then their union has fast online matching up to with load and -step preparation.
Before proving the lemma, we show that it implies the main result. This is not so hard to prove, because with a constant number of clones of the graph from the theorem, we obtain the graphs satisfying the conditions of the lemma. Here are the details.
Proof of theorem 1.2..
The graph in the assumption of the theorem has degree and expansion . By proposition 1.1, graph has polynomial-time online matching up to . Applying the lemma to , we conclude that this graph has online matching up to with load and preparation time.
By the remark above on de-amortization, a 2-clone of has online matching up to with load . Hence, a 4-clone of has such matching up to .
Therefore, the original graph of the theorem has matching up to with load , by multiplying by the constant hidden in . The theorem is proven, except for the lemma. ∎
Proof of the lemma..
Let be the graph with -expansion and let be the graph with polynomial time online matching. We may assume that their right nodes are disjoint, because this affects the load by at most a factor 2.
The preparation algorithm uses as a safety buffer, precomputing in it matches for nodes that are at risk in the sense that they have many busy neighbors in (the precise definition is below). The preparation and retraction algorithms share a queue containing matches in . Let be a polynomial in , to be determined later.
Match generation for the first requests. Apply the fast matching algorithm from corollary 3.4 using graph . Since has -expansion, we obtain matching with load .
Preparation algorithm. Run ’s retraction algorithm for all matches from the queue. Also run it for all precomputed matches from the previous run of the preparation algorithm that are not in the current matching.
We call a right node of disabled if it is matched. The others are called enabled. Let be the set of left nodes with at least disabled neighbors (the at-risk nodes). Compute the induced subgraph of containing the left nodes not in and the enabled right nodes. The set and graph are fixed until the next run of the preparation algorithm and will be used in the fast match generation algorithm below.
Precompute matches in for all nodes in . Do this by generating requests 1-by-1 in any order. (We soon explain that ’s matching algorithm will indeed produce matches.)
Match generation for request . If is in , return its precomputed match in . Otherwise, run the fast -round algorithm from corollary 3.4 on the graph . (We soon explain that has -expansion up to .)
Retracting . If is in , then add the edge to the queue. Otherwise, run the retraction algorithm of .
The value of is chosen to be a polynomial in large enough so that the preparation algorithm can be performed in time . By corollary 3.4, the runtime of computing a match satisfies the conditions.
Above, 2 claims were made that need a proof. After this, the lemma is proven, because by construction, the load of all nodes in is bounded by , and for it is bounded by .
Proof that has 1-expansion up to . Let be a set of left nodes in of size at most . By expansion in , the set has at least neighbors in . By choice of , each element in has at most disabled neighbors in . Thus the number of neighbors in is at least
Proof that the polynomial-time matching algorithm precomputes matches for all the nodes in . For this, we need to show that before each request, the size of ’s matching is at most . First we show that . Suppose that and let be a subset of with exactly elements. Let be the set of matches in . Since all matches in are actual (not pre-computed), we have . Each node in has at most neighbors that are not covered by . Hence, the number of neighbors of in is at most
By the expansion of in , we conclude that
Hence, , but this contradicts .
The claim follows because less than precomputed matches can exist simultaneously. Indeed, there are less than matches computed in the current run and also there are at most matches from previous runs (that became actual matches and have not been retracted). ∎
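The dispatch between the fast graph and the buffer graph in this proof can be sketched as follows. The sketch is schematic (our rendering): the at-risk rule is reduced to a generic threshold, preparation is run on demand, and the real algorithm additionally skips already-matched nodes, batches the preparation, and processes retractions via the shared queue.

```python
# Schematic of the two-graph dispatch: fast greedy matching in G1, with
# precomputed buffer matches in G2 for "at-risk" nodes (many busy G1
# neighbors). Threshold and toy graphs are illustrative assumptions.

class PreparedMatcher:
    def __init__(self, adj1, adj2, risk_threshold):
        self.adj1, self.adj2 = adj1, adj2      # neighborhoods in G1 and G2
        self.t = risk_threshold                # "many busy neighbors" cutoff
        self.used1, self.used2 = set(), set()  # matched right nodes per graph
        self.pre = {}                          # precomputed matches in G2

    def prepare(self):
        """Slow step: precompute G2-matches for all at-risk nodes."""
        self.pre = {}
        for u in self.adj1:
            if sum(r in self.used1 for r in self.adj1[u]) >= self.t:
                r = next(r for r in self.adj2[u] if r not in self.used2)
                self.pre[u] = r                # reserved, not yet used

    def request(self, u):
        if u in self.pre:                      # at-risk: use the buffer graph
            r = self.pre.pop(u)
            self.used2.add(r)
            return ('G2', r)
        r = next(r for r in self.adj1[u] if r not in self.used1)
        self.used1.add(r)
        return ('G1', r)
```

For instance, once node 0 below occupies the single G1-neighbor shared with node 1, preparation marks node 1 as at-risk and its next request is served from G2.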
5 Constant-depth connectors with fast path finding algorithms
Graphs with online matching up to can be composed into -connectors of constant depth . The following construction was given in [FFP88, Proposition 3.2], and obtains an almost minimal number of edges.
Proposition 5.1.
Let be a -th power of an integer. Assume that for some and , for all integers , we have graphs with left nodes, at most right nodes, left degree at most , and online matching up to . Then, there exists an -connector of depth with at most edges.
Recall that each connector has at least edges (see [PY82, proposition 2.1]). Hence, the above result is optimal within a factor .
Remark.
To compute a path or extend a tree in this construction, at most matches need to be computed. Thus, for matching obtained from theorem 1.2 we obtain path finding in time .
For the sake of completeness, we present the construction and prove the proposition. First some observations about the static case are given.
An -network is a directed acyclic graph with input and output nodes. Recall that the network is rearrangeable if every 1-to-1 mapping from outputs to inputs can be realized using node disjoint paths.
The following 2 lemmas are directly obtained from the definitions.
Lemma.
Consider a graph with left and right nodes that has offline matching up to . The concatenation of this graph with a rearrangeable -network is a rearrangeable -network.
Lemma.
Consider rearrangeable -networks with the same set of inputs and disjoint outputs. The union of these networks is a rearrangeable -network.
Proof of the proposition..
Let for an integer . The construction is illustrated for in figure 2.
Construction of a rearrangeable -network. For every , we construct a -rearrangeable network of depth recursively. For , we use the complete bipartite graph with left size and right size .
Suppose for some , we already have such a network . First obtain an -network of depth by introducing input nodes, denoted by the set , and connect them to according to a graph with matching up to in the statement of the proposition. Then merge copies of this graph having the same set of inputs and disjoint sets of outputs.
The rearrangeability property follows from the 2 lemmas above.
Proof that the network has at most edges. We prove this by induction on . For , the network is fully connected and has at most edges.
Let and assume that the construction of a -connector of depth contains at most edges. The -network consists of such connectors and graphs with left nodes. Thus, the total number of edges is at most
With exactly the same construction, connectors are obtained. The matching game on -networks is defined in precisely the same way, and using this game, -connectors are defined. We adapt the 2 lemmas above.
• If a graph with sizes and has online matching up to , then its concatenation with an -connector is an -connector.
• The union of output-disjoint -connectors is an -connector.
For the first item, when Requester selects an input–output pair , this triggers a request for a match for in the graph with online matching, followed by a request to connect the pair in the connector. Since there are outputs, at most matches are simultaneously needed and therefore both requests can be satisfied.
The second item is easy to see, since the path-finding (or rather, tree-extension) algorithms of the separate copies do not interfere. Both items together provide the connectors of the proposition. ∎
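The edge bound in the proof above can be checked against the recursion it describes. The rendering below is ours and fills the lost parameters with generic ones: n = q^k inputs, a complete bipartite q-by-q base case, and each level placing q copies of a degree-D input graph on n nodes in front of q sub-networks.

```python
# Sanity check of the connector edge-count recursion (one plausible
# rendering of the lost parameters; n, q, D, k are illustrative).

def edges(n, q, D, k):
    if k == 1:
        return q * q                           # complete bipartite base case
    return q * edges(n // q, q, D, k - 1) + q * n * D

# Unrolling gives n*q*(1 + (k-1)*D), i.e. O(k * D * q * n) edges in total.
n, q, D, k = 3 ** 4, 3, 5, 4
assert edges(n, q, D, k) == n * q * (1 + (k - 1) * D)
```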
Corollary 1.5 follows by applying the construction above, with the matching algorithm from theorem 1.2 instantiated on the lossless expander of theorem 6.2, as follows. Choose and left size . For each , the expander of theorem 6.2 satisfies:
and this is bounded by for large . Let be the right side of the above. From this expander, we only use the first of the left nodes and drop the others. This yields the graphs satisfying the conditions of proposition 5.1 with .
For non-explicit constructions, we can use an expander with smaller degree. In fact, a random graph has good expansion properties as explained for example in [Vad12, theorem 6.14] or [BZ23, appendix C].
Lemma 5.2.
For each and , there exists a -expander up to with left degree , left size and right size .
Corollary 1.4 follows from proposition 5.1 and theorem 1.2 instantiated with this expander.
6 -rich matching
We consider matchings in which a left node is matched to most of its right neighbors, and present an explicit family of graphs that have such online matchings with -expiration. In the next section, this is used to construct bitprobe storage schemes.
Let a graph with left degree and a set of left nodes be given. A set of edges is an -rich matching with load if
– each element of is incident on at least edges, and
– each right node is incident on at most 1 edge.
Online -rich matching game. This game is defined in the same way as before, but now, Requester needs to remove edges such that at most left nodes are covered, and when he selects a left node , Matcher needs to cover with different edges such that each right node is incident on at most edges.
Definition.
A graph has online -rich matching with load if it has a winning strategy in the online -rich matching game. For brevity, we drop “online” and just use “-rich matching.” Graphs with incremental and -expiring -rich matchings are defined similarly.
The product of 2 graphs with the same left set and right sets and is the graph with left set and right set in which a left node is adjacent to if and only if is adjacent to both and in the respective graphs.
Proposition 6.1.
If a graph with degree has -expansion up to , and another graph has -rich matching up to , then their product has -expiring -rich matching.
Remark.
This is easily generalized from -expiring to -expiring matching, provided the 2nd graph has -rich matching up to . But we do not need this.
To prove proposition 6.1, we first adapt lemma 3.2 about incremental matching.
Lemma.
If a graph has -expansion up to , then it has incremental -rich matching up to with load .
Proof.
Let be the degree of the graph and consider a -clone. It suffices to show that in this clone every left node can be matched to neighbors. At the risk of being too detailed, we state the induction hypothesis exactly. For this, a more general variant of the game is used: a request consists of a node and a number with , and Matcher must assign matches to .
Matching algorithm given a request . For each neighbor of in the original graph, collect the minimal index of a copy in which this neighbor is free. Match to the neighbors with minimal indices.
For , correctness is trivial. Now inductively assume that the graph has expansion up to and that this algorithm computes incremental matches up to . We show that with 1 more clone it also computes incremental matches up to by using only the first copy for at least half of its requests.
Let be the set of requests for which only the first copy was used and let be the set of other requests. Let and be the sets of their neighbors in the first copy.
By choice of the algorithm, we have , because otherwise would have enough free neighbors to be matched in the first copy. By expansion up to , we have
After a calculation, we conclude that . Thus for at least half of the requests, only the first copy is used. ∎
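The strategy in this proof can be sketched in code, in our rendering: in a clone with copies indexed 0, 1, 2, …, each neighbor r of the requested node u is tagged with the lowest copy index where the slot (r, index) is still free, and u is matched to the m neighbors with the smallest tags (the number m of required matches is an illustrative parameter).

```python
# Sketch of the rich-matching strategy in a clone: prefer the lowest-index
# (earliest) copies. `used` holds occupied slots as (right node, copy) pairs.
import itertools

def rich_match(adj, used, u, m):
    options = []
    for r in adj[u]:
        i = next(j for j in itertools.count() if (r, j) not in used)
        options.append((i, r))
    options.sort()                       # prefer neighbors free in early copies
    picked = [(r, i) for i, r in options[:m]]
    used.update(picked)
    return picked

used = set()
assert rich_match({0: [0, 1, 2]}, used, 0, 2) == [(0, 0), (1, 0)]
# a second request for the same node is pushed partly into copy 1
assert rich_match({0: [0, 1, 2]}, used, 0, 2) == [(2, 0), (0, 1)]
```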
We obtain matching with -expiration by applying Lemma 3.5, which holds also for -rich matching (with the same proof). This doubles the load.
Corollary.
If a graph has -expansion up to , then it has -rich matching up to with load and -expiration.
To finish the proof of the proposition, we decrease the load from to by applying the following.
Lemma.
Assume a first graph has -rich matching up to with load , and a second graph has -rich matching up to . Then the product has -rich matching up to . If the matchings in the two graphs above have -expiration, then so does the matching in the product graph.
Proof.
Let and be the graphs in the lemma. The matching strategy in will run the strategy in as well as separate copies of ’s matching strategy for each right node in .
Matching strategy of on input a left node . First run the matching strategy of on input , and let be the match. Run the -th copy of matching strategy on input , and let be the match. Output .
By definition of load , this produces an -rich matching in the -copy of .
The union of all edges in all copies of forms a set that satisfies the definition of -rich matching, because given a request, ’s strategy produces edges covering neighbors , and for each such , the -copy produces edges on neighbors. Since , the lemma is proven. ∎
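The product strategy above can be sketched as follows (hypothetical names: `match1` is the first graph's strategy, and `make_match2` creates a fresh, independent copy of the second graph's strategy, one per right node of the first graph):

```python
def make_product_matcher(match1, make_match2):
    """Product-graph matching: the match of u is a pair (x, y), where x is
    u's match in the first graph and y is u's match produced by the x-th
    copy of the second graph's strategy."""
    copies = {}  # right node of graph 1 -> its own copy of graph 2's strategy

    def match(u):
        x = match1(u)
        if x not in copies:
            copies[x] = make_match2()
        y = copies[x](u)
        return (x, y)

    return match
```

For instance, with `match1 = lambda u: u % 2` and second-graph copies that hand out slots 0, 1, 2, … in order, requests 0 and 2 land in the same copy and receive distinct slots.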
By the two previous results, proposition 6.1 is proved. Finally, we apply it to explicit graphs. The first one is an explicit lossless expander based on [GUV09], and the second one is a standard hash code with prime numbers; see for example [BZ23, lemma 2.4] or appendix B for a proof.
Theorem 6.2 ([LOZ22], Th 18).
For all , , and , there exists an explicit graph with left size , -expansion up to , left degree , and right size .
Lemma.
For all , , and , there exists an explicit graph with left size , right size , and -rich matching up to .
Corollary 6.3.
For all , and , there exists an explicit graph with left degree and right size , that has fast -rich matching up to with -expiration.
Proof.
The graph is obtained as the product of the graphs in the two results above. The data structure for the matching maintains counters for nodes of the first graph. Let be its degree. Selecting the minimal counters takes time , and since , the result follows. ∎
7 1-bitprobe storage scheme for dynamic sets
The goal is to store a -element set , where typically . A 1-bitprobe (or bit vector) storage scheme is a data structure in which queries “Is in ?” are answered probabilistically by reading a single bit. Previous constructions for 1-bitprobes are for static sets. We show that graphs that admit -rich matching can be used to obtain 1-bitprobe storage schemes for dynamic sets: the data structure also allows for efficient insertions and deletions from .
A static 1-bitprobe is a data structure that is described by a size and a probabilistic algorithm pos mapping to , which selects a bit to answer a membership query. Let be 1 if and otherwise.
Formal requirement for a 1-bitprobe with parameters . (For notational convenience, our requirement is slightly stronger than the standard definition: in the latter, an algorithm for membership queries either returns a bit or its negation. This affects the size by at most a factor of 2, by also storing the negation of each bit.) For all with there exists such that for all ,
A dynamic 1-bitprobe additionally has an update function for adding and removing elements from the set. A history is a list of integers that describes these operations chronologically, where a positive integer represents the addition of to the set, and its removal.
For a history , let be the set of elements that remain after the sequence of operations; that is, the set of positive entries in with no later occurrence of the corresponding negative entry. In the definition of 1-bitprobes we consider histories that at any moment encode sets of size at most .
Definition.
History is -legal if for all and if for each prefix of .
Definition.
A dynamic 1-bitprobe with parameters consists of
– a size ,
– a deterministic algorithm , and
– a probabilistic algorithm ,
such that for each -legal history and each
where .
We construct dynamic 1-bitprobes of small size that have efficient implementations for queries and updates.
Theorem 7.1.
There exists a family of dynamic 1-bitprobes with parameters with
– size ,
– query time ,
– update time .
We start the proof with the simpler case of static 1-bitprobes, which follows directly from graphs with -rich incremental matching.
Lemma.
If a left regular graph with left and right sets and has incremental -rich matching up to , then the mapping pos that maps a left node to a random neighbor defines a static 1-bitprobe of size with parameters .
Proof.
Given a -element , run the matching algorithm for all elements of in an arbitrary order and let be the string that has 1’s in precisely those indices in that are covered by the matching.
We prove that satisfies the 1-bitprobe condition. Indeed, if , then at least of 's neighbors are covered, and hence . Assume . If the incremental matching algorithm were given , then it would find right neighbors that are not matched, since the incremental matching is up to . By the choice of , the corresponding indices are , and hence . ∎
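A minimal sketch of this construction, with hypothetical names (`matches` maps each stored element to the list of right nodes it was matched to, and `m` is the right size):

```python
import random

def build_probe_string(matches, m):
    """Build the bit string w: 1 exactly at the right positions covered by
    the matching of the stored set."""
    w = [0] * m
    for vs in matches.values():
        for v in vs:
            w[v] = 1
    return w

def query(adj, w, x, rng=random):
    """Answer 'is x in S?' by reading the single bit at a uniformly random
    neighbor of x (the pos function of the lemma)."""
    return w[rng.choice(adj[x])] == 1
```

If an element's matches cover most of its neighbors, a 1-bit is read with high probability; an element outside the set reads a 0-bit with high probability because few of its neighbors are covered.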
In fact, this proof yields 1-bitprobes in which elements can be dynamically inserted but not removed. If graphs with online -rich matching existed, we could apply a similar argument and be done, but we do not know whether such graphs with small right sizes exist.
Fortunately, it is enough to have graphs with -rich matching with -expiration. The idea is to refresh old elements. More precisely, if an element was inserted, and it was not removed during the next insertions, then we delete and reinsert . After this modification, each insertion in the probe’s history corresponds to rounds of the matching game, and hence, -expiration is required.
Lemma.
Assume that there exists a graph with degree , left set and right set that has -expiring -rich matching up to . Furthermore, assume that the matching algorithm is fast and the datastructure uses space . Then there exists a dynamic 1-bitprobe of size with parameters and update time . Moreover, if the graph is explicit, then the query time is .
Proof.
Let the pos function be defined as in the previous lemma. Note that this implies the “moreover” part of the lemma.
The update function requires the storage of the matching, the last insertions of the history, and the data structure of the matching algorithm. To store the matching, we use a red-black search tree, so that membership can be checked in time and updates can be done in the same time. To store the queue with the last requests, we simply use a singly linked list. This increases the size by for storing the matching, by for the queue, and by for the data structure.
For removals, the update function first checks the presence of the element in the stored set. If not present, it is finished. Otherwise, it runs the retraction algorithm of the matching, and sets the bits of to zero for the right nodes that are no longer covered.
For inserting a node , the update function first checks whether is already present in the stored set. If so, it finishes. Otherwise, it runs the matching algorithm for , and sets the assigned bits to 1. Next, it refreshes the -th oldest insertion (it runs the retraction and then the matching algorithm for it).
To see that this works, we need to verify that every match is retracted after at most requests of the matching algorithm. Indeed, insertions in the probe’s history now correspond to at most requests for the matching algorithm. If an element is removed after at most other insertions, then we are done. Otherwise, the update algorithm will retract it at the -th insertion. ∎
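The refresh bookkeeping can be sketched with a queue of the recent insertions (the names are ours; `do_match` and `do_retract` stand in for the matching and retraction algorithms, and `t` for the refresh window):

```python
from collections import deque

class RefreshQueue:
    """Keep the most recent insertions; on each new insertion, refresh
    (retract and immediately re-match) the oldest one once the window is
    full, so that no match survives long without being renewed."""
    def __init__(self, t, do_match, do_retract):
        self.t, self.do_match, self.do_retract = t, do_match, do_retract
        self.recent = deque()

    def insert(self, u):
        self.do_match(u)
        self.recent.append(u)
        if len(self.recent) > self.t:
            old = self.recent.popleft()
            self.do_retract(old)   # refresh: retract the old match ...
            self.do_match(old)     # ... and re-match it right away
            self.recent.append(old)
```

This sketch omits removals (a removed element would also be dropped from the queue); it only shows why every match is renewed within a bounded number of rounds.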
Proof of theorem 7.1..
Apply the above lemma to the graph from corollary 6.3. (The 1-bitprobe storage scheme in theorem 7.1 has smaller size than the 1-bitprobe storage schemes in [BMRV00, Ta-02, GUV09] (provided , see table 1), even though those schemes have the limitation of handling only static sets. Plugging the lossless expander used in [GUV09] into the above generic construction, one obtains a 1-bitprobe storage scheme for dynamic sets with data structure size equal to (storage size from [GUV09]) , in which pos and upd have runtime . Compared to theorem 7.1, upd is faster; the reason is that the lossless expander in [GUV09] has , smaller than the left degree of the graph in theorem 6.2 (but the right set is larger).) ∎
Remark.
The explicit lossless expander from theorem 6.2 is based on the construction in [GUV09] and is not practical. Otherwise, the algorithms of our 1-bitprobe storage scheme in theorem 7.1 are efficient. Thorup [Tho13] constructed lossless expanders (using tabulation hashing) with worse asymptotic right sizes than [GUV09], but for realistic they seem practical. Since a random graph is a lossless expander with positive probability, it is possible that with pseudorandom graphs (obtained from CityHash, MurmurHash, SHA-3) the data structure is practical, but with weaker proven guarantees.
8 Static and dynamic dictionaries with non-adaptive probing
A dictionary stores (key, value) pairs for fast retrieval of the values corresponding to keys. The keys belong to . Memory is divided into cells having bitsize exactly . The goal is to implement efficient querying by accessing the cells of the memory in parallel. For this, we aim at accessing cells whose indices depend only on the queried key, and not on the data structure.
The constructions below are for the membership datastructure. But they also provide dictionaries, because they all have a feature in common: to determine whether belongs to the set, we read a few cells, and is in the set if and only if one of these cells contains . For dictionaries, one needs to store together with each key its corresponding value. This increases the memory by a constant factor.
Definition.
A static membership datastructure for parameters is a pair where is an algorithm that computes membership queries. For all with there must exist such that for all : . ( denotes the ‘blank’ value.)
Algorithm makes non-adaptive probes if the accessed cells of the probe only depend on the first argument and not on the second argument .
Proposition.
There exists a family of static membership datastructures with parameters with (so consisting of cells of bitsize ) in which query runs in time and makes non-adaptive probes.
Note that the bitsize of any membership datastructure is lower bounded by , hence the above size is optimal up to a constant factor. The construction is the same as in [LPPZ23], except that instead of a non-explicit expander, we use the expander from proposition 1.3. For later reference, we repeat the construction.
Proof.
We use the explicit graph with -expansion up to from proposition 1.3, which has right size and degree . Let .
Construction of . By Hall’s theorem, there exists a matching that covers . For each edge , write in the -th cell.
Query function on input and . Check all cells such that is a neighbor of . If one of these cells contains , output that ; otherwise, output that .
Correctness follows directly from the construction. Since the graph is explicit, the query function can compute all neighbors of a node in polynomial time, and hence query runs in polynomial time. Since no other cells are scanned, the query is non-adaptive. ∎
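The steps above can be sketched as follows (the names are ours; the offline matching guaranteed by Hall's theorem is computed here with standard augmenting paths, which suffices for illustration):

```python
def build_table(adj, S, num_cells):
    """Store each x in S in a cell that is a neighbor of x, using a matching
    that covers S (it exists by Hall's theorem when the graph 1-expands)."""
    owner = {}  # cell index -> element stored there

    def assign(u, seen):
        # Standard augmenting-path search for a cell for u.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in owner or assign(owner[v], seen):
                owner[v] = u
                return True
        return False

    for x in S:
        if not assign(x, set()):
            raise ValueError("no matching covers S")
    table = [None] * num_cells  # None plays the role of the blank value
    for v, x in owner.items():
        table[v] = x
    return table

def query(table, adj, x):
    """Non-adaptive: the probed cells depend only on x, never on the table."""
    return any(table[v] == x for v in adj[x])
```

Note that `query` scans exactly the cells indexed by the neighbors of `x`, so the probe positions are fixed by the key alone.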
A dynamic membership datastructure allows for insertion and removal of elements in . Recall from section 7 our notation for updates of using positive and negative numbers for insertions and removals, as well as the notation and . Also, recall the definition of -legal histories.
Definition.
A dynamic membership datastructure with parameters is a triple consisting of
– a size ,
– an algorithm , and
– an algorithm ,
such that for each -legal history , for all :
. (We use to denote the initial empty table).
Algorithm upd makes non-adaptive cell probes if the cells that it reads and writes to only depend on its first argument .
We present dynamic dictionaries that also have optimal size up to a constant factor.
Proposition 8.1.
There exists a family of dynamic membership datastructures for parameters that use cells of bitsize and in which query and upd run in time. Moreover, query and the removals in upd make non-adaptive probes.
Proof.
We use the construction of graphs with -expiring matching from the remark at the end of section 3. When applied to the expander from proposition 1.3, the right size is . Let be the constant that bounds the load.
The matching is maintained on the cell probe in the same way as for the static dictionaries, except that we allow each right node to be matched to nodes (instead of only one node). On the cell probe we also keep the datastructure for the match generation algorithm, but this datastructure is only used for insertions.
The query function simply checks the presence of in the lists of their neighbors and answers correspondingly. This requires exactly non-adaptive probes. The update algorithm follows the modifications on the matching and datastructure made by the retraction and match generation algorithms.
Retracting means removing from the list of its matched right neighbor and updating the information that is also stored there. Thus this is done with non-adaptive probing.
Inserting means running the matching algorithm and adding to the list of the selected right node. Afterwards, the oldest match is refreshed: it is retracted and reinserted. This makes the probing adaptive, because this left node is stored on the probe. ∎
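A toy version of this datastructure (our own names; a least-loaded-neighbor rule stands in for the matching algorithm's choice of cell, and `load` bounds how many elements one cell may hold):

```python
class DynamicMembership:
    """Each cell (right node) holds a short list of up to `load` elements.
    Queries and removals probe only the neighbors of x, so they are
    non-adaptive; insertions additionally consult the matching state."""
    def __init__(self, adj, num_cells, load):
        self.adj, self.load = adj, load
        self.cells = [[] for _ in range(num_cells)]

    def query(self, x):
        return any(x in self.cells[v] for v in self.adj[x])

    def remove(self, x):
        for v in self.adj[x]:
            if x in self.cells[v]:
                self.cells[v].remove(x)

    def insert(self, x):
        # The matching algorithm chooses the cell; here, least-loaded neighbor.
        v = min(self.adj[x], key=lambda w: len(self.cells[w]))
        assert len(self.cells[v]) < self.load
        self.cells[v].append(x)
```

In the actual construction the cell choice comes from the expiring-matching strategy, which is what guarantees that no cell ever exceeds its load.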
Remark.
In fact, removals by upd are memoryless, which means that the modifications of a cell only depend on this cell and no other cells. This means that retractions are done with a single round of communication, and this is faster than insertions, which require a few rounds. This might be helpful in some applications, for example, think about maintaining a database of available flight tickets in a distributed way over a slow network, where adding new tickets is not time critical, but a conflict between buyers is.
The insertions by upd are close to being non-adaptive in the sense that they only “clean up” the datastructure, and this cleanup can always be postponed for up to other insertions. (For most insertions, no cleanup is needed at all.) In fact, the cleanup can be done by an independent process that does not need to know about insertions and does the work in little pieces.
A fully non-adaptive datastructure exists using probabilistic insertions. Unfortunately, its size is a factor larger than in the above proposition. We do not know whether there exists a fully non-adaptive membership datastructure whose size is optimal up to a constant.
Theorem 8.2.
There exists a family of dynamic membership datastructures with parameters with a probabilistic algorithm upd, that
– consists of cells of bitsize ,
– query and upd run in time and make non-adaptive probes, and
– for each -legal history , with probability : all satisfy
Proof.
Consider the strategy for -expiring matching obtained from proposition 3.1 applied to the expander from proposition 1.3. Recall that the strategy in proposition 3.1 alternates between two copies. One copy is used for assigning matches while the other “rests” until all its matches disappear. After insertions, the roles flip. We call these copies the active and passive copies. The query and update functions follow the matching algorithm in a similar way as in proposition 8.1; however, there are two modifications to the insertion update that have to do with refreshing.
• We do not refresh the oldest match (because then we would need to read what it is from the cell probe). Instead we select a random value of and, if it is matched in the currently passive copy, then this match is retracted and placed on the active copy.
• If at the end of a stage the passive copy still contains a match, then all insertions during the next stage are ignored.
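The first modification can be sketched as follows (hypothetical names; `passive_matches` is the set of left nodes currently matched in the passive copy, and `active_match` moves a node's match to the active copy):

```python
import random

def random_refresh(n, passive_matches, active_match, rng=random):
    """Instead of reading the oldest match from memory, draw a uniformly
    random key; if it happens to be matched in the passive copy, move its
    match to the active copy. The probed cells depend only on the drawn
    key, so insertions stay non-adaptive."""
    u = rng.randrange(n)
    if u in passive_matches:
        passive_matches.discard(u)
        active_match(u)
    return u
```

Each passive match is hit with probability 1/n per insertion, which is what the probability analysis below exploits.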
We transform this graph into a dynamic membership datastructure in the same way as in the proposition above. Note that now each right node is associated with cells. This finishes the construction.
By the two propositions, the size of the probe equals the number of right nodes times the load, which is . By the first modification, insertions of elements are non-adaptive. It remains to prove the last item of the theorem. Fix a history .
Claim. The probability that, in a fixed stage of , all insertions are ignored is at most .
Proof of the claim. Fix an element that is matched in the passive copy at the beginning of the previous stage. The probability that this element is not refreshed in the first stage equals . The probability that this fails during the insertions of the stage is at most . The claim follows by the union bound, since there are at most matches at the beginning of the stage.
This claim implies the theorem, because contains at most elements, and for each element, the stage of its last insertion is good with the probability of the claim. It remains to note that and that for large . ∎
References
- [ABK94] Yossi Azar, Andrei Z. Broder, and Anna R. Karlin. On-line load balancing. Theoretical Computer Science, 130(1):73–84, 1994.
- [AKPP93] Y. Azar, B. Kalyanasundaram, S. Plotkin, and K. Pruhs. Online load balancing of temporary tasks. In Workshop on Algorithms and Data Structures (WADS), Montreal, Canada, pages 119–130, 1993.
- [ALM96] Sanjeev Arora, F. Thomson Leighton, and Bruce Maggs. On-line algorithms for path selection in a nonblocking network. SIAM Journal on Computing, 25:149–158, 1996.
- [Aza05] Yossi Azar. On-line load balancing. Online algorithms: the state of the art, pages 178–195, 2005.
- [BHP+06] Mette Berger, Esben Rune Hansen, Rasmus Pagh, Mihai Pătrașcu, Milan Ružić, and Peter Tiedemann. Deterministic load balancing and dictionaries in the parallel disk model. In Phillip B. Gibbons and Uzi Vishkin, editors, SPAA 2006: Proceedings of the 18th Annual ACM Symposium on Parallelism in Algorithms and Architectures, Cambridge, Massachusetts, USA, July 30 - August 2, 2006, pages 299–307. ACM, 2006.
- [BKV+81] M. Blum, R. M. Karp, O. Vornberger, C. H. Papadimitriou, and M. Yannakakis. The complexity of checking whether a graph is a superconcentrator. Information Processing Letters, 13(4,5):164–167, 1981.
- [BMRV00] Harry Buhrman, Peter Bro Miltersen, Jaikumar Radhakrishnan, and Srinivasan Venkatesh. Are bitvectors optimal? In F. Frances Yao and Eugene M. Luks, editors, Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, May 21-23, 2000, Portland, OR, USA, pages 449–458. ACM, 2000.
- [BMVZ18] Bruno Bauwens, Anton Makhlin, Nikolai K. Vereshchagin, and Marius Zimand. Short lists with short programs in short time. Computational Complexity, 27(1):31–61, 2018.
- [BP73] L. A. Bassalygo and M. S. Pinsker. Complexity of an optimum nonblocking switching network without reconnections. Problems of Information Transmission, 9:64–66, 1973.
- [BRR23] Soheil Behnezhad, Mohammad Roghani, and Aviad Rubinstein. Sublinear time algorithms and complexity of approximate maximum matching. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing, pages 267–280, 2023.
- [BZ23] Bruno Bauwens and Marius Zimand. Universal almost optimal compression and Slepian-Wolf coding in probabilistic polynomial time. J. ACM, 70(2):1–33, 2023. (arxiv version posted in 2019).
- [CKL+22] Li Chen, Rasmus Kyng, Yang P. Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva. Maximum flow and minimum-cost flow in almost-linear time. CoRR, abs/2203.00671, 2022.
- [CRVW02] M. R. Capalbo, O. Reingold, S. P. Vadhan, and A. Wigderson. Randomness conductors and constant-degree lossless expanders. In John H. Reif, editor, STOC, pages 659–668. ACM, 2002.
- [CW18] Ilan Reuven Cohen and David Wajc. Randomized online matching in regular graphs. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 960–979. SIAM, 2018.
- [DKM+94] Martin Dietzfelbinger, Anna R. Karlin, Kurt Mehlhorn, Friedhelm Meyer auf der Heide, Hans Rohnert, and Robert Endre Tarjan. Dynamic perfect hashing: Upper and lower bounds. SIAM J. Comput., 23(4):738–761, 1994.
- [DPR22] Shyam Dhamapurkar, Shubham Vivek Pawar, and Jaikumar Radhakrishnan. Set membership with two classical and quantum bit probes. In Mikolaj Bojanczyk, Emanuela Merelli, and David P. Woodruff, editors, 49th International Colloquium on Automata, Languages, and Programming, ICALP 2022, July 4-8, 2022, Paris, France, volume 229 of LIPIcs, pages 52:1–52:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
- [FFP88] Paul Feldman, Joel Friedman, and Nicholas Pippenger. Wide-sense nonblocking networks. SIAM J. Discret. Math., 1(2):158–173, 1988.
- [FKS84] Michael L. Fredman, János Komlós, and Endre Szemerédi. Storing a sparse table with O(1) worst case access time. J. ACM, 31(3):538–544, 1984.
- [GR17] Mohit Garg and Jaikumar Radhakrishnan. Set membership with non-adaptive bit probes. In Heribert Vollmer and Brigitte Vallée, editors, 34th Symposium on Theoretical Aspects of Computer Science, STACS 2017, March 8-11, 2017, Hannover, Germany, volume 66 of LIPIcs, pages 38:1–38:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
- [GUV09] Venkatesan Guruswami, Christopher Umans, and Salil P. Vadhan. Unbalanced expanders and randomness extractors from Parvaresh–Vardy codes. J. ACM, 56(4), 2009.
- [HK73] John E. Hopcroft and Richard M. Karp. An algorithm for maximum matchings in bipartite graphs. SIAM Journal on Computing, 2(4):225–231, 1973.
- [HKPS20] Monika Henzinger, Shahbaz Khan, Richard Paul, and Christian Schulz. Dynamic matching algorithms in practice. In Fabrizio Grandoni, Grzegorz Herman, and Peter Sanders, editors, 28th Annual European Symposium on Algorithms, ESA 2020, September 7-9, 2020, Pisa, Italy (Virtual Conference), volume 173 of LIPIcs, pages 58:1–58:20. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020.
- [HLW06] S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bull. Amer. Math. Soc., 43:439–561, 2006.
- [Hwa04] Frank K. Hwang. The Mathematical Theory of Nonblocking Switching Networks. World Scientific, 2004. 2nd edition.
- [KVV90] Richard M. Karp, Umesh V. Vazirani, and Vijay V. Vazirani. An optimal algorithm for on-line bipartite matching. In Proceedings of the Twenty-Second Annual ACM Symposium on Theory of Computing, pages 352–358, 1990.
- [LMSVW22] Hung Le, Lazar Milenković, Shay Solomon, and Virginia Vassilevska Williams. Dynamic matching algorithms under vertex updates. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2022.
- [LOZ22] Zhenjian Lu, Igor Carboni Oliveira, and Marius Zimand. Optimal coding theorems in time-bounded Kolmogorov complexity. In Mikolaj Bojanczyk, Emanuela Merelli, and David P. Woodruff, editors, 49th International Colloquium on Automata, Languages, and Programming, ICALP 2022, July 4-8, 2022, Paris, France, volume 229 of LIPIcs, pages 92:1–92:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
- [LPP+24] Kasper Green Larsen, Rasmus Pagh, Giuseppe Persiano, Toniann Pitassi, Kevin Yeo, and Or Zamir. Optimal non-adaptive cell probe dictionaries and hashing. In Proceedings ICALP, 2024. A merge and revision of [LPPZ23] and [PY20].
- [LPPZ23] Kasper Green Larsen, Rasmus Pagh, Toniann Pitassi, and Or Zamir. Optimal non-adaptive cell probe dictionaries and hashing. arXiv preprint arXiv:2308.16042, 2023.
- [Mar73] G. A. Margulis. Explicit constructions of concentrators. Problems of Information Transmission, 9:325–332, 1973.
- [MP69] M. Minsky and S. Papert. Perceptrons. MIT Press, 1969.
- [MP97] Yuan Ma and Serge Plotkin. An improved lower bound for load balancing of tasks with unknown duration. Information Processing Letters, 62(6):301–303, 1997.
- [MRS11] D. Musatov, A. E. Romashchenko, and A. Shen. Variations on Muchnik’s conditional complexity theorem. Theory Comput. Syst., 49(2):227–245, 2011.
- [ÖP02] Anna Östlin and Rasmus Pagh. One-probe search. In International Colloquium on Automata, Languages, and Programming, pages 439–450. Springer, 2002.
- [PL86] Michael D. Plummer and László Lovász. Matching theory. Elsevier, 1986. A rewritten book by the same authors was published in 2009.
- [PY82] N. Pippenger and A. C.-C. Yao. Rearrangeable networks with limited depth. SIAM J. Algebraic Discrete Methods, 2:411–417, 1982.
- [PY20] Giuseppe Persiano and Kevin Yeo. Tight static lower bounds for non-adaptive data structures. CoRR, abs/2001.05053, 2020.
- [Rom14] Andrei E. Romashchenko. Pseudo-random graphs and bit probe schemes with one-sided error. Theory Comput. Syst., 55(2):313–329, 2014.
- [Sha50] C. E. Shannon. Memory requirements in a telephone exchange. Bell Sys. Tech. Jour., 29:343–349, 1950.
- [Ta-02] Amnon Ta-Shma. Storing information with extractors. Inf. Process. Lett., 83(5):267–274, 2002.
- [Teu14] Jason Teutsch. Short lists for shortest descriptions in short time. Computational Complexity, 23(4):565–583, 2014.
- [Tho13] Mikkel Thorup. Simple tabulation, fast expanders, double tabulation, and high independence. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 90–99. IEEE Computer Society, 2013.
- [TSUZ07] A. Ta-Shma, C. Umans, and D. Zuckerman. Lossless condensers, unbalanced expanders, and extractors. Combinatorica, 27(2):213–240, 2007.
- [Vad12] Salil P. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1-3):1–336, 2012.
- [WZ99] Avi Wigderson and David Zuckerman. Expanders that beat the eigenvalue bound: Explicit construction and applications. Combinatorica, 19(1):125–138, 1999.
- [Zim14] Marius Zimand. Short lists with short programs in short time - A short proof. In Arnold Beckmann, Erzsébet Csuhaj-Varjú, and Klaus Meer, editors, Language, Life, Limits - 10th Conference on Computability in Europe, CiE 2014, Budapest, Hungary, June 23-27, 2014. Proceedings, volume 8493 of Lecture Notes in Computer Science, pages 403–408. Springer, 2014.
Appendix A The (slow) online matching algorithm of Feldman, Friedman, and Pippenger
For completeness, we present a special case of [FFP88, Proposition 1]. Our proof is based on the original one. The result implies that if a graph has offline matching up to , then it has online matching up to elements with load .
Theorem A.1.
If a graph has 1-expansion up to and each left set with has at least neighbors, then the graph has online matching up to .
Corollary A.2.
If a graph has -expansion up to , then it has online matching up to with load .
Proof.
We modify by taking three clones of each right node. The new graph satisfies the hypothesis of theorem A.1. Indeed, let be a subset of left nodes with . We partition into a set of size and a set of size . has at least neighbors in the right subset made with the first two clones, and has at least neighbors in the set made with the third clone. Thus, has at least neighbors. Theorem A.1 implies that has online matching up to . By merging the three clones into the original nodes, it follows that has online matching with load . ∎
We continue with the proof of theorem A.1. We start with two technical lemmas.
Definition.
For a set of nodes , let be the set of all neighbors of elements in . A left set is critical if .
Lemma.
If and are critical and , then is also critical.
Proof.
We need to bound the quantity which equals . By the inclusion-exclusion principle this equals
Since and the assumption of the lemma, this is at most
Since and are critical, this is at most . ∎
Lemma.
Assume a graph has 1-expansion up to and has no critical set with . Then, for every left node there exists a right node such that after deleting and , the remaining graph has 1-expansion up to .
Proof.
A right neighbor of is called bad if after deleting , there exists a left set of size at most such that . Note that is critical, and by the 1-expansion of the original graph, contains . We show that by iterated application of the above lemma, the set
is critical. Indeed, for each critical set of size at most , the set is critical by 1-expansion and the previous lemma. Also this set has cardinality at most , thus by the assumption this union must have cardinality at most .
Note that if all neighbors of were bad, then because . Thus
If , then this violates -expansion, and if , this violates the assumption about the sizes of critical sets. Hence, at least one neighbor of is not bad and satisfies the conditions of the lemma. ∎
Proof of theorem A.1..
The online matching strategy maintains a copy of the graph. If Requester makes a matching request for a left node , Matcher replies by searching for a right node that satisfies the condition of the above lemma for the copy graph and adds the edge to the matching . In the copy she deletes the nodes and . When Requester removes an edge from , Matcher restores the nodes and in the copy graph.
It remains to show that in each application of the above lemma, the conditions are satisfied. Note that if Matcher restores the endpoints and of an edge, the conditions always remain true, because if , then and do not change, and otherwise both values increase by .
It remains to show that before any matching request, the copy graph has no critical set with (and thus the Matcher can apply the lemma and satisfy the request). Assume to the contrary that there is such an . In the original graph, has at least neighbors. When a right neighbor is assigned, Matcher deletes it from the copy graph. Therefore before any request, the Matcher has deleted from at most right nodes (since there can be at most active requests), hence, has at least neighbors, thus it is not critical.
Therefore, the conditions of the lemma are always satisfied and the strategy can always proceed by selecting a neighbor . The theorem is proven. ∎
Remark.
In the matching algorithm from [FFP88], the condition on the -expansion up to elements is checked by brute force over all left sets of size at most . This can be done in time. In general, checking whether a graph has -expansion up to elements is -complete, see [BKV+81]. However, this hardness result does not exclude algorithms that run in time for specially chosen graphs.
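The brute-force check and the resulting (slow) match selection can be sketched as follows (the names are ours; the check is exponential in the expansion bound, so this is only for illustration on tiny graphs):

```python
from itertools import combinations

def has_1_expansion(adj, left_nodes, deleted_right, k):
    """Brute-force check that every nonempty left set of size <= k has at
    least as many surviving right neighbors as elements."""
    for r in range(1, k + 1):
        for X in combinations(left_nodes, r):
            nbrs = set()
            for u in X:
                nbrs.update(v for v in adj[u] if v not in deleted_right)
            if len(nbrs) < len(X):
                return False
    return True

def ffp_pick(adj, left_nodes, deleted_right, u, k):
    """Pick a right neighbor v of u such that, after deleting u and v, the
    remaining graph still 1-expands up to k (the condition of the lemma)."""
    rest = [x for x in left_nodes if x != u]
    for v in adj[u]:
        if v not in deleted_right and has_1_expansion(adj, rest, deleted_right | {v}, k):
            return v
    return None
```

Matcher then deletes `u` and the returned `v` from the copy graph and repeats for the next request.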
Appendix B Prime hashing implies -rich matching
Lemma.
For all , , and , there exists an explicit graph with left size , right size , and -rich matching up to .
Proof.
Let . Let denote the -th prime number. Left nodes are , and right nodes are pairs . Note that , and the condition on the right size is satisfied for . For the lemma is trivial.
A left node is connected to all pairs with . The matching strategy is the greedy strategy that matches a node to all unmatched right neighbors.
We prove that this provides -rich matchings. Assume that there are matches for , and let be an element that is not in this set. For each , there are at most common neighbors of and . Hence, at most a fraction of 's neighbors have already been matched. Thus the greedy matching is -rich. ∎
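The construction and the greedy strategy can be sketched as follows (the names are ours). Two distinct left nodes x and y share the neighbor (i, r) only if the i-th prime divides x − y, so only few neighbors of a fresh request can already be taken:

```python
def first_primes(t):
    """The first t primes, by trial division (fine at sketch scale)."""
    ps = []
    n = 2
    while len(ps) < t:
        if all(n % p for p in ps):
            ps.append(n)
        n += 1
    return ps

def neighbors(x, primes):
    """Left node x is adjacent to the pairs (i, x mod p_i)."""
    return [(i, x % p) for i, p in enumerate(primes)]

def greedy_rich_match(xs, primes):
    """Greedy strategy: match each requested x to all of its currently
    unmatched right neighbors."""
    taken = set()
    matching = {}
    for x in xs:
        mine = [v for v in neighbors(x, primes) if v not in taken]
        taken.update(mine)
        matching[x] = mine
    return matching
```

For example, with the primes 2, 3, 5, 7, after storing 10, a request for 4 loses only the neighbors indexed by the primes 2 and 3 (the divisors of 10 − 4) and keeps the other two.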
Appendix C Explicit expander with -expansion up to
Proposition (Proposition 1.3).
For each and there exists an explicit graph with left size , right size , left degree , and -expansion up to .
The proof relies on the construction by Ta-Shma, Umans, and Zuckerman of an explicit disperser [TSUZ07, Th 1.4]. To make the notation of bipartite graphs compatible with the notation in [TSUZ07], we assume that the left side has size , the left degree is and the right side has size . We also assume . The construction of the disperser is done in 3 steps.
Step 1: Obtaining a good disperser. A disperser is a bipartite graph in which every subset with has . We modify the construction in [TSUZ07, Lemma 6.4] by replacing the extractor used there with the improved extractor with small entropy loss from [GUV09, Th 4.21]. This latter extractor has a smaller (namely, ) and a smaller entropy loss, because it has and, therefore, the entropy loss is . By plugging these values into Lemma 6.4 and Lemma 6.5 in [TSUZ07], and assuming , we obtain an explicit disperser with left size , left degree , and right size , where for some constant natural number .
Step 2: Obtaining a good disperser with . We set . We construct a graph obtained from the above disperser by taking clones of the right side. This is a graph such that (because the left degree is multiplied by ) and it has the property that every with has .
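The cloning operation used in this step is elementary. Below is a minimal Python sketch, where `clone_right` is an illustrative name: each right node v is replaced by t tagged copies (j, v), so every left neighbor set, and hence the left degree, is multiplied by t, while distinct clones stay disjoint:

```python
def clone_right(adj, t):
    """Replace the right side by t disjoint clones: right node v
    becomes (j, v) for j = 0..t-1, and every left node is connected
    to all clones of its former neighbors."""
    return {u: [(j, v) for j in range(t) for v in nbrs]
            for u, nbrs in adj.items()}
```

Since each neighbor set is cloned into t disjoint copies, the number of neighbors of any left set is also multiplied by t, which is what the step uses.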
Step 3: Obtaining an expander with -expansion up to . Now we repeat the above construction for and take the union of these graphs, using the same left side for all the graphs but pairwise disjoint right sides. Since all the graphs have the same left side, , and this is also the left side of the union graph. The right side of the union graph has size and the left degree of the union graph is . In this union graph, every set of size has (so it is a -expander up to ). Indeed, suppose . Then in the “ component” of the union, has at least neighbors, and this value is greater than .
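The union step can be sketched similarly; `union_disjoint_right` is an illustrative name. Tagging each right node with the index of its component graph makes the right sides pairwise disjoint while the left side is shared, so a left set keeps, in the union, the neighbors it has in every component:

```python
def union_disjoint_right(graphs):
    """Union of bipartite graphs that share the same left side,
    taken with pairwise disjoint right sides: each right node is
    tagged with the index of the graph it comes from."""
    union = {}
    for j, adj in enumerate(graphs):
        for u, nbrs in adj.items():
            union.setdefault(u, []).extend((j, v) for v in nbrs)
    return union
```

The left degree of the union is the sum of the component degrees, and the neighbor set of any left set contains its neighbor set in each component, which is the property the expansion argument uses.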
Finally, we construct the graph obtained from the above graph by taking clones of the right side. In this new graph, the expansion factor is , as desired. The size of the right side is and the left degree is multiplied by , so it is still .
Appendix D Related work on matching
Matching algorithms have been studied for more than 4 decades, see [PL86], and research in this area continues; see for example [BRR23]. We discuss 3 areas in which variants of online matching algorithms are studied. The algorithms from theorem 1.2 and proposition 1.1 combine the constraints of all these areas, but these algorithms only work for graphs with large expansion (and for theorem 1.2, load is allowed).
Online matching. Let be the maximum cardinality of a matching in a graph . For some that is close to , the objective is to maintain a matching of size while edges and vertices are added to and removed from the graph . Once a match is assigned, it may not be revoked.
A greedy algorithm that maintains a maximal matching, i.e., a matching that is not a strict subset of another matching, obtains this objective for . Note that the greedy algorithm in section 3 maintains a maximal matching in the induced subgraph with left nodes that have “active requests”. A similar algorithm is given in the proof of [LMSVW22, corollary 4].
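The greedy rule can be sketched as follows; this is a minimal Python sketch with illustrative names, not the algorithm of section 3. Note that this simple version does not re-match a waiting node when a departure frees a right neighbor, so it maintains maximality only with respect to arrivals:

```python
def online_greedy_matching(adj, events):
    """Process events ('on', u) and ('off', u) on left nodes.
    An arriving node is greedily matched to a free right neighbor
    if one exists; an active match is never revoked.
    adj maps left nodes to lists of right neighbors."""
    match = {}       # active matches: left node -> right node
    used = set()     # right nodes currently matched
    for op, u in events:
        if op == 'on':
            for v in adj[u]:
                if v not in used:
                    match[u] = v
                    used.add(v)
                    break
        else:        # 'off': the node departs, its match is released
            v = match.pop(u, None)
            if v is not None:
                used.discard(v)
    return match
```

In the example below, node 'b' arrives while its only neighbor is taken and stays unmatched even after that neighbor is freed, illustrating why irrevocable online matching needs structural assumptions such as lossless expansion.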
Perhaps the first paper in this field is [KVV90]. This paper considers the incremental setting in which left nodes arrive but do not depart. They give a probabilistic -approximation algorithm for bipartite graphs with a perfect matching (the adversary is oblivious, i.e., the moves of the adversary are fixed before the randomness of the algorithm is fixed). More recently, this was improved to an -approximation for regular graphs with degree . Unfortunately, the runtime of this algorithm is polynomial in , see [CW18].
Dynamic matching. The objective is again to maintain a matching of at least , but now matches are allowed to be revoked. The aim is to minimize the runtime and it is also important to have few revocations.
In [LMSVW22, theorem 2] an algorithm is given for general graphs that are not necessarily bipartite. A -approximation is given in which the worst-case number of revocations for each assigned match is 1, and the amortized runtime is in the word-RAM model, where is the average degree of the graph. Note that this algorithm is almost irrevocable. If it could be made irrevocable, we would obtain a stronger version of theorem 1.2 that does not require lossless expansion: -expansion up to implies -time matching with load up to . Unfortunately, such an improvement is unlikely, because it would contradict 2 popular conjectures: the “online matrix-vector multiplication conjecture” and the “triangle detection conjecture”, see [LMSVW22, theorem 1].
Load balancing with restricted assignment. In this task, there are servers, and each task arrives with a duration and a subset of servers that can perform it. When a task arrives, a server must be selected immediately. For the full duration of the task, that server’s load is increased by 1. Usually, it is not allowed to reassign a task to a different server. The aim is to minimize the maximal load of a server. Two types of assignment algorithms are studied, depending on whether the input contains the duration of the task. Our game corresponds to the variant in which the duration is not given.
Given a sequence of tasks, the performance of an assignment algorithm is the maximum of the load over all servers and over time. The aim is to minimize the competitive ratio: the ratio of this performance to the performance of an optimal offline algorithm that is given the full sequence of tasks and their durations in advance. We refer to [Aza05] for more background.
For this model, for every deterministic assignment algorithm, there exists a sequence of tasks of length for which the competitive ratio is at least , where is the number of servers, see [ABK94, MP97]. There exists an algorithm that guarantees a competitive ratio of at most , see [AKPP93]. The algorithm from the proof of theorem 1.2 obtains a competitive ratio of , which is typically exponentially smaller than , but it only works for graphs that are lossless expanders.
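As a baseline for this model, the natural greedy rule assigns each arriving task to the currently least-loaded feasible server. This is not the algorithm of [AKPP93]; it is a minimal sketch with illustrative names, and for simplicity tasks are modeled as permanent (durations are unknown and unused, matching the variant discussed above):

```python
def assign_least_loaded(tasks, n_servers):
    """Greedy restricted assignment: send each arriving task to the
    least-loaded server among those able to run it. Tasks here never
    finish, so loads only grow; ties go to the first feasible server."""
    load = [0] * n_servers
    assignment = []
    for allowed in tasks:       # each task = list of feasible servers
        s = min(allowed, key=lambda i: load[i])
        load[s] += 1
        assignment.append(s)
    return assignment, load
```

The rule is irrevocable and uses no duration information, which is exactly the setting of the game studied in this paper; the difference is that theorem 1.2 additionally assumes lossless expansion of the underlying bipartite graph.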