
Dynamic asynchronous iterations

Matthew L. Daggitt and Timothy G. Griffin
Department of Computer Science and Technology, University of Cambridge, UK
Abstract

Many problems can be solved iteratively by multiple participants (processors, servers, routers etc.). Previous mathematical models for such asynchronous iterations assume a single function being iterated by a fixed set of participants. We will call such iterations static, since the system’s configuration does not change. However, in several real-world examples, such as inter-domain routing, both the function being iterated and the set of participants change frequently while the system continues to function. In this paper we extend Üresin & Dubois’s work on static iterations to develop a model for this class of dynamic or “always on” asynchronous iterations. We explore what it means for such an iteration to be implemented correctly, and then prove two different conditions on the set of iterated functions that guarantee that the full asynchronous iteration satisfies this new definition of correctness. These results have been formalised in Agda and the resulting library is publicly available.

keywords:
Asynchronous computation, Iteration, Fixed points, Formal verification, Agda

1 Introduction

Let $S$ be a set. Iterative algorithms aim to find a fixed point $x^{*}$ for some function $F : S \rightarrow S$ by starting from an initial state $x \in S$ and calculating the sequence:

$x,\ F(x),\ F^{2}(x),\ F^{3}(x),\ \ldots$

If a number of iterations $k^{*}$ is found such that $F^{k^{*}}(x) = F^{k^{*}+1}(x)$ then $F^{k^{*}}(x)$ is a fixed point $x^{*}$. Whether or not such a $k^{*}$ exists depends on both the properties of the iterated function $F$ and the chosen initial state $x$. It should be noted that this paper is only interested in functions $F$ which converge to a unique fixed point, i.e. the same $x^{*}$ is reached no matter which initial state $x$ the iteration starts from.
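The stopping test above translates into a few lines of code. The following is a minimal sketch, where the function `F` is a hypothetical example (not from the paper) with the unique fixed point 5:

```python
def find_fixed_point(F, x, max_steps=1000):
    """Iterate F from x until F^k(x) == F^(k+1)(x), returning the fixed point."""
    for _ in range(max_steps):
        fx = F(x)
        if fx == x:          # F^k(x) == F^(k+1)(x): a fixed point x* is found
            return x
        x = fx
    raise RuntimeError("no fixed point found within max_steps")

# A toy function whose unique fixed point 5 is reached from any start state
F = lambda x: max(x - 1, 5)
assert find_fixed_point(F, 100) == 5
assert find_fixed_point(F, 0) == 5
```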

In a distributed version of the iteration, both the set $S$ and the function $F$ are assumed to be decomposable into $n$ parts:

$S = S_{1} \times S_{2} \times \ldots \times S_{n}$
$F = (F_{1}, F_{2}, \ldots, F_{n})$

where $F_{i} : S \rightarrow S_{i}$ computes the $i$th component of the new state. Each node $i$ repeatedly iterates $F_{i}$ on its local view of the current state of the iteration, and propagates its stream of updated values to other nodes so that they may incorporate them in their own iteration. In an asynchronous distributed iteration, the timings between nodes are not actively synchronised. A formal model, $\delta$, for such an asynchronous iteration is described in Section 2.
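This decomposition can be sketched concretely. Below, each node owns one component of the state and a hypothetical Bellman-Ford-style distance computation (the kind of routing iteration mentioned later) plays the role of $F$; the graph and weights are illustrative assumptions, not taken from the paper:

```python
# State decomposed as S = S1 x ... x Sn; each node i owns component i.
INF = float("inf")
# weight[i][j] is the cost of the edge i -> j (absent if no edge)
weight = {0: {}, 1: {0: 1}, 2: {0: 5, 1: 1}}

def F_i(i, x):
    """Component function F_i : S -> S_i (distance from node i to node 0)."""
    if i == 0:
        return 0  # node 0 is the destination
    return min((x[j] + w for j, w in weight[i].items()), default=INF)

def F(x):
    """The full function F = (F_1, ..., F_n), applied synchronously here."""
    return tuple(F_i(i, x) for i in range(len(x)))

x = (INF, INF, INF)
while F(x) != x:       # iterate to the fixed point
    x = F(x)
assert x == (0, 1, 2)  # shortest distances to node 0
```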

Frommer & Szyld [1] provide a survey of the literature describing when such asynchronous iterations are guaranteed to converge to a unique fixed point. One of the unifying features of these results is that they only require conditions on the function $F$, and hence users may prove that an asynchronous iteration converges without ever directly reasoning about unreliable communication or asynchronous event orderings. Applications of these results include routing [2, 3, 4], programming language design [5], peer-to-peer protocols [6], and numerical simulations [7]. Other recent applications of asynchronous iterations include [8, 9, 10], while [11] and [12] provide useful surveys of asynchronous iterations in general.

However there are two main drawbacks to the models used in the current literature. Firstly, they assume the set of participating nodes remains constant over time. While this may be reasonable when modelling an iterative process run on a multi-core computer, it is unrealistic when reasoning about truly distributed “always on” protocols such as routing and consensus algorithms. For example the global Border Gateway Protocol that coordinates routing in the internet has been “on” since the early 1990s and has grown from a few dozen routers to millions. During that time the set of participating routers has been completely replaced many times over. The second problem is that the models assume that the function $F$ being iterated remains constant over time. This may not be the case if it depends on some process external to the iteration (e.g. link latencies in routing) or on the set of participants (e.g. resource allocation/consensus/routing algorithms).

This paper will therefore use the term static to refer to the asynchronous iterations previously described in the literature and dynamic to refer to this new class of asynchronous iterations in which the set of participating nodes and function being iterated may change over time.

When applying the results in the literature to always-on algorithms, it is common for prior work to either informally argue or implicitly assume that the correctness of a dynamic iteration is an immediate consequence of the correctness of an infinite sequence of suitable static iterations. This line of reasoning is:

  • implicitly argued in Section 4.2 of [2].

  • explicitly argued in Section 3.2 of [3].

  • implicitly argued in Section 2.4 of [4].

  • discussed and implicitly argued in point (b) of Section 5 in [13].

The reasoning runs that a dynamic iteration is really a sequence of static iterations, where each new static iteration starts from the final state of the previous one. However this argument is incorrect, as it does not take into account that messages may be shared between the different static iterations in the sequence. For example if node 1 fails, it may still have messages in flight that node 2 will receive in the next static iteration. Not only may such messages prevent convergence in the next iteration, but the model in the existing literature has no way of even representing messages arriving from nodes that are not participating during the current static iteration.

This paper therefore proposes a new, more general model that can be used to reason about dynamic iterations over both continuous and discrete data. Section 2 of the paper describes one of the most commonly used static models, and discusses some of the surrounding literature. Section 3 then presents our new generalised model for dynamic iterations, and discusses what it means for a dynamic iteration to be “correct”. Next, Section 4 proves two different conditions for a dynamic asynchronous iteration to satisfy this definition of correctness. Importantly, and as with the previous static results of Üresin & Dubois, these conditions only constrain the synchronous behaviour of the dynamic system. This means that users of our theorems can prove the correctness of their asynchronous algorithms by purely synchronous reasoning. Section 5 then briefly describes the formalisation of the results in Agda and their application to inter-domain routing protocols. Finally, Section 6 discusses our concluding thoughts and possible directions for future work.

2 Static asynchronous iterations

2.1 Model

A mathematical model for static asynchronous iterations was standardised by work in the 1970s and 80s [13, 14, 15]. The notation and terminology used here is taken from the recent paper [16] which in turn is based on that used by Üresin & Dubois [17].

Assume that the set of times $T$ is a discrete linear order. Each point in time marks the occurrence of events of interest: for example a node computing an update or a message arriving at a node. The set of times can be represented by $\mathbb{N}$ but for notational clarity $T$ will be used. Additionally let $V = \{1, 2, \ldots, n\}$ be the set of nodes that are participating in the computation.

Definition 1 (Static schedule).

A static schedule consists of a pair of functions:

  • $\alpha : T \rightarrow 2^{V}$, the activation function, where $\alpha(t)$ is the set of nodes which activate at time $t$.

  • $\beta : T \times V \times V \rightarrow T$, the data flow function, where $\beta(t,i,j)$ is the time at which the most recent message node $i$ has received from node $j$ at time $t$ was sent by node $j$.

such that:

  1. (SS1)

    $\forall i,j,t : \beta(t+1,i,j) \leq t$

The function $\alpha$ describes when nodes update their values, and the function $\beta$ tracks how the resulting information moves between nodes. Assumption (SS1) enforces causality by stating that information may only flow forward in time. Note that this definition does not forbid the data flow function $\beta$ from delaying, losing, reordering or even duplicating messages (see Figure 1). Prior to recent work [16], static schedules were assumed to satisfy two additional assumptions that guaranteed every node continued to activate indefinitely and that every pair of nodes continued to communicate indefinitely.

Figure 1: Behaviour of the data flow function $\beta$. Messages from node $j$ to node $i$ may be reordered, lost or even duplicated. The only constraint is that every message must arrive after it was sent. Reproduced from [16].
Definition 2 (Static asynchronous state function).

Given a static schedule $(\alpha, \beta)$, the static asynchronous state function, $\delta : T \rightarrow S \rightarrow S$, is defined as follows:

$$\delta^{t}_{i}(x) = \begin{cases} x_{i} & \text{if $t = 0$} \\ \delta^{t-1}_{i}(x) & \text{else if $i \notin \alpha(t)$} \\ F_{i}(\delta^{\beta(t,i,1)}_{1}(x), \delta^{\beta(t,i,2)}_{2}(x), \ldots, \delta^{\beta(t,i,n)}_{n}(x)) & \text{otherwise} \end{cases}$$

where $\delta^{t}_{i}(x)$ is the state of node $i$ at time $t$ when starting from state $x$.

At time 0 the iteration is in the initial state $x$. At subsequent times $t$, if node $i$ is not in the set of active nodes then its state remains unchanged. Otherwise, if node $i$ is in the set of active nodes, it applies its update function $F_{i}$ to its current view of the global state. For example $\delta^{\beta(t,i,1)}_{1}(x)$ is the state of node $1$ at the time of departure of the most recent message node $i$ has received from node $1$ at time $t$.
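The recursive definition of $\delta$ translates almost directly into code. The sketch below animates it for a concrete two-node function and schedule, both of which are hypothetical illustrations (here node 0 activates at odd times, node 1 at even times, and every message is delayed by one time step):

```python
import functools

n = 2
# Hypothetical iterated function: F(x) = (0, x_0 + 1), fixed point (0, 1)
def F_i(i, x):
    return 0 if i == 0 else x[0] + 1

def alpha(t):
    return {0} if t % 2 == 1 else {1}   # alternating activations

def beta(t, i, j):
    return t - 1                        # satisfies (SS1): beta(t+1,i,j) = t <= t

@functools.lru_cache(maxsize=None)
def delta(t, i, x):
    """State of node i at time t, starting from initial state x (a tuple)."""
    if t == 0:
        return x[i]
    if i not in alpha(t):
        return delta(t - 1, i, x)       # inactive: state unchanged
    # active: apply F_i to the local view given by the data flow function
    view = tuple(delta(beta(t, i, j), j, x) for j in range(n))
    return F_i(i, view)

x0 = (7, 7)
# After enough pseudocycles the iteration settles at the fixed point (0, 1)
assert (delta(10, 0, x0), delta(10, 1, x0)) == (0, 1)
```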

2.2 Correctness

In order to precisely define when an asynchronous iteration is expected to converge, it is first necessary to discuss what sort of schedules allow an asynchronous iteration to make progress. As mentioned earlier, previous models made the simplifying assumption that every node activates an infinite number of times and every pair of nodes communicates indefinitely. This essentially says that the schedule is well-behaved forever. In contrast, [16] built upon Üresin & Dubois’s concept of pseudocycles and relaxed this condition to only require that schedules be well-behaved for a finite period of time. This distinction will be important in the dynamic model described later in Section 3, as a dynamic iteration will only have a finite period of time to converge before either the participants or the function being iterated changes.

Definition 3 (Static activation period).

A period of time $[t_{1}, t_{2}]$ is an activation period for node $i$ if there exists a time $t \in [t_{1}, t_{2}]$ such that $i \in \alpha(t)$.

Definition 4 (Static expiry period).

A period of time $[t_{1}, t_{2}]$ is an expiry period for node $i$ if for all nodes $j$ and times $t \geq t_{2}$ we have $t_{1} \leq \beta(t,i,j)$.

Therefore during an activation period node $i$ is guaranteed to activate at least once. In contrast, after an expiry period the node is guaranteed to use only data generated after the start of the expiry period. In other words, all messages in flight to node $i$ at time $t_{1}$ have either arrived or been lost by time $t_{2}$.

Definition 5 (Static pseudocycle).

A period of time $[t_{1}, t_{2}]$ is a pseudocycle if for all nodes $i$ there exists a time $t \in [t_{1}, t_{2}]$ such that $[t_{1}, t]$ is an expiry period for node $i$ and $[t, t_{2}]$ is an activation period for node $i$.

The term “pseudocycle” refers to the fact that during such a period of time the asynchronous iteration will make at least as much progress as a single step of the synchronous iteration. This statement will be made formal later on by Lemma 8 in Section 4.1. When we informally say that a period of time contains $k$ pseudocycles, we implicitly mean $k$ disjoint pseudocycles.
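Definitions 3-5 are directly executable over a concrete schedule. The following sketch (all names hypothetical) checks them over a finite window of times, approximating the universal quantification over all later times by a finite `horizon`:

```python
def is_activation_period(alpha, i, t1, t2):
    """[t1, t2] contains a time at which node i activates."""
    return any(i in alpha(t) for t in range(t1, t2 + 1))

def is_expiry_period(beta, nodes, i, t1, t2, horizon):
    """After t2, node i only uses data sent at or after t1 (checked up to horizon)."""
    return all(t1 <= beta(t, i, j) for t in range(t2, horizon + 1) for j in nodes)

def is_pseudocycle(alpha, beta, nodes, t1, t2, horizon):
    """Every node has an expiry period followed by an activation period in [t1, t2]."""
    return all(
        any(is_expiry_period(beta, nodes, i, t1, t, horizon)
            and is_activation_period(alpha, i, t, t2)
            for t in range(t1, t2 + 1))
        for i in nodes
    )

# Round-robin schedule: node t % 2 activates at time t; one-step message delay
nodes = [0, 1]
alpha = lambda t: {t % 2}
beta = lambda t, i, j: t - 1
assert is_pseudocycle(alpha, beta, nodes, 2, 4, horizon=20)
assert not is_pseudocycle(alpha, beta, nodes, 2, 2, horizon=20)
```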

Using the definition of a pseudocycle, it is now possible to define what it means for an asynchronous iteration to converge for schedules that are only well-behaved for a finite amount of time.

Definition 6 (Static convergence).

The static asynchronous iteration converges over a set of initial states $X = X_{1} \times X_{2} \times \ldots \times X_{n} \subseteq S$ if:

  1. there exists a fixed point $x^{*}$ for $F$ and a number of iterations $k^{*}$.

  2. for every starting state $x \in X$ and schedule containing at least $k^{*}$ pseudocycles, there exists a time $t_{1}$ such that for all $t_{2} \geq t_{1}$, $\delta^{t_{2}}(x) = x^{*}$.

2.3 Results

The survey paper by Frommer & Szyld [1] provides an overview of the convergence results in the literature for this and other related models. Much of the work has been motivated by iterative algorithms in numerical analysis, and consequently many of the proofs of convergence assume that the set $S$ is equipped with a dense ordering. Unfortunately in fields such as routing and consensus algorithms the set $S$ is discrete, and so many of the more common results are inapplicable. However in the late 1980s Üresin & Dubois [17] came up with one of the first conditions for the convergence of discrete asynchronous iterations. Here we use the relaxed version of the conditions as proposed in [16].

Definition 7 (Static ACO).

A function $F$ is an asynchronously contracting operator (ACO) if there exists a sequence of sets $B(k) = B(k)_{1} \times B(k)_{2} \times \ldots \times B(k)_{n}$ for $k \in \mathbb{N}$ such that:

  1. (SA1)

    $\forall x \in S : x \in B(0) \Rightarrow F(x) \in B(0)$.

  2. (SA2)

    $\forall k \in \mathbb{N}, x \in S : x \in B(k) \Rightarrow F(x) \in B(k+1)$.

  3. (SA3)

    $\exists k^{*} \in \mathbb{N}, x^{*} \in S : \forall k \in \mathbb{N} : k^{*} \leq k \Rightarrow B(k) = \{x^{*}\}$.

Theorem 1.

If the function $F$ is an ACO then $\delta$ converges deterministically over the set $B(0)$.

Proof.

See [17] & [16]. ∎

The advantage of the ACO conditions is that they are independent of both $\delta$ and the schedule, and so proving that $\delta$ converges only requires reasoning about the function $F$.

The conditions require that the state space $S$ can be divided into a series of nested boxes $B(k)$ where every application of $F$ moves the state into the next box, and eventually a box $B(k^{*})$ is reached that contains only a single element. See Figure 2 for a visualisation.

Figure 2: If $F$ is an ACO then the space $S$ can be divided up into a series of boxes $B$. Note that this figure is a simplification, as each set $B(k)$ is decomposable into $B(k)_{1} \times \ldots \times B(k)_{n}$, and so in reality the diagram should be $n$-dimensional.

The reason why these conditions guarantee asynchronous convergence, rather than merely synchronous convergence, is that each box must be decomposable over each of the $n$ nodes. Therefore the operator is always contracting even if the nodes have not all performed the same number of updates locally. Note that Theorem 1 only guarantees that $\delta$ will converge from states in the initial set $B(0)$. Hence $B(0)$ can be thought of as a basin of attraction [18].
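To make the conditions concrete, here is a toy single-node ($n = 1$) instance: a hypothetical function $F$ with boxes chosen by hand, and assertions mirroring (SA1)-(SA3). Both $F$ and the boxes are illustrative assumptions, not examples from the paper:

```python
# Toy ACO instance: F(x) = max(x - 1, 5) with boxes B(k) = {5, ..., max(5, 15-k)}
F = lambda x: max(x - 1, 5)
B = lambda k: set(range(5, max(5, 15 - k) + 1))

# (SA1): the initial box B(0) is closed under F
assert all(F(x) in B(0) for x in B(0))
# (SA2): F maps each box B(k) into the next box B(k+1)
assert all(F(x) in B(k + 1) for k in range(20) for x in B(k))
# (SA3): from k* = 10 onwards every box is the singleton {x*} = {5}
assert all(B(k) == {5} for k in range(10, 20))
```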

In practice the set of boxes $B$ can be difficult and non-intuitive to construct, as they must be explicitly centred around the fixed point, whose existence may not even be immediately obvious. Üresin & Dubois recognised this and provided several other stronger conditions that are sufficient to construct an ACO. An alternative set of equivalent conditions was originally described by Gurney [19]. As with the ACO conditions, these conditions were relaxed by [16], and the latter version is now presented.

Definition 8 (Static AMCO).

A function $F$ is an asynchronously metrically contracting operator (AMCO) if for every node $i$ there exists a distance function $d_{i}$ such that, defining $D(x,y) \triangleq \max_{i} d_{i}(x_{i}, y_{i})$, the following hold:

  1. (SU1)

    $\forall i \in V, x, y \in S : d_{i}(x,y) = 0 \Leftrightarrow x = y$

  2. (SU2)

    $\forall i \in V : \exists n \in \mathbb{N} : \forall x, y \in S : d_{i}(x,y) \leq n$

  3. (SU3)

    $\forall x \in S : x \neq F(x) \Rightarrow D(x, F(x)) > D(F(x), F^{2}(x))$

  4. (SU4)

    $\forall x, x^{*} \in S : F(x^{*}) = x^{*} \wedge x \neq x^{*} \Rightarrow D(x^{*}, x) > D(x^{*}, F(x))$

  5. (SU5)

    $S$ is non-empty

The AMCO conditions require the construction of a notion of distance between states such that there exists a maximum distance (SU2) and such that successive iterations become both closer together (SU3) and closer to any fixed point (SU4). Note that, unlike Gurney’s original formulation, the AMCO conditions as defined above do not require $d_{i}$ to obey the typical metric axioms of symmetry and the ultrametric triangle inequality.
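Again as a toy single-node illustration, the sketch below checks (SU1)-(SU4) exhaustively for a hypothetical $F$ on a finite state space, so (SU5) holds trivially. The distance function is an assumption chosen by hand to make $F$ strictly contracting:

```python
# Toy AMCO instance on the finite state space S = {0, ..., 20}
S = range(21)
F = lambda x: max(x - 1, 5)   # unique fixed point x* = 5

def d(x, y):
    # Distance is 0 iff the states are equal; otherwise 1 plus the larger
    # displacement from the fixed point 5 (chosen so F is strictly contracting)
    return 0 if x == y else 1 + max(abs(x - 5), abs(y - 5))

# (SU1) d(x, y) = 0 iff x = y
assert all((d(x, y) == 0) == (x == y) for x in S for y in S)
# (SU2) d is bounded on S
assert all(d(x, y) <= 17 for x in S for y in S)
# (SU3) successive iterates get strictly closer together
assert all(d(x, F(x)) > d(F(x), F(F(x))) for x in S if x != F(x))
# (SU4) iterates get strictly closer to the fixed point x* = 5
assert all(d(5, x) > d(5, F(x)) for x in S if x != 5)
```

Note that the naive choice `d(x, y) = abs(x - y)` would fail (SU3) here, e.g. both $d(7, 6)$ and $d(6, 5)$ would equal 1, which is why the distance is anchored to the fixed point.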

Gurney [19] proves that the AMCO conditions are equivalent to the ACO conditions by constructing reductions in both directions. Consequently the following convergence theorem holds.

Theorem 2.

If $F$ is an AMCO then $\delta$ converges deterministically over the set $S$.

Proof.

See [19] & [16]. ∎

2.4 Motivations for a dynamic model

As discussed in the introduction, prior work applying Üresin & Dubois’s results to “always-on” algorithms often assumes that dynamic iterations can be viewed as a sequence of static iterations. By inspecting the definition of $\delta$, the flaw in this argument can now be formalised. Consider a dynamic iteration with nodes $V$ in which node $i \in V$ has sent out an update message to $j \in V$ and then $i$ ceases participating. The new static iteration would begin immediately with participants $V - \{i\}$, and therefore when $j$ next activates, the static model is incapable of receiving the message from node $i$.

Another feature lacking in the static model is the ability to reboot nodes. It is possible to represent temporary node failure in the static model by excluding the node from the set of active nodes; however, this is still unsatisfactory, as many types of failure result in a node’s state being erased (e.g. replacing a faulty server in a data centre). In reality, after such an event the node is forced to revert back to the initial state. This “rebooting” of a node after a temporary failure cannot be described by the existing static model.

3 Dynamic asynchronous iterations

To overcome these shortcomings we now propose a new, more general model that can describe both dynamic and static iterations.

3.1 Model

Let $V$ be the set of all the nodes that participate at some point during the dynamic iteration. $V$ is still assumed to be finite with $n = |V|$, as the only cases in which $|V|$ could be infinite are if either an infinite number of nodes participated at the same time or an infinite amount of time has passed since the iteration began. Neither case is useful in practice. As before, we assume there exists a product state space $S = S_{1} \times S_{2} \times \ldots \times S_{n}$.

In order to capture the new dynamic nature of the iteration we introduce the concept of an epoch. An epoch is a contiguous period of time in which both the function being iterated and the set of participating nodes remain constant. The set of epochs is denoted as $E$ but, as with time, can be assumed to be an alias for $\mathbb{N}$.

Instead of a single function $F$, we now assume that $F$ is a family of indexed functions, where $F^{ep}$ is the function being computed in epoch $e \in E$ by participants $p \subseteq V$. Furthermore we assume there exists a special non-participating state $\bot \in S$.

A schedule must therefore not only track the activation of nodes and the flow of data between them but also the current epoch and the participating nodes. Given these requirements it is natural to redefine a schedule as follows:

Definition 9 (Dynamic schedule).

A dynamic schedule is a tuple of functions $(\alpha, \beta, \eta, \pi)$ where:

  • $\alpha : T \rightarrow 2^{V}$ is the activation function, where $\alpha(t)$ is the set of nodes which activate at time $t$.

  • $\beta : T \times V \times V \rightarrow T$ is the data flow function, where $\beta(t,i,j)$ is the time at which the information used by node $i$ at time $t$ was sent by node $j$.

  • $\eta : T \rightarrow E$ is the epoch function, where $\eta(t)$ is the epoch at time $t$.

  • $\pi : E \rightarrow 2^{V}$ is the participants function, where $\pi(e)$ is the set of nodes participating in the computation during epoch $e$.

such that:

  1. (DS1)

    $\forall i,j,t : \beta(t+1,i,j) \leq t$ – information only travels forward in time.

  2. (DS2)

    $\forall t_{1}, t_{2} : t_{1} \leq t_{2} \Rightarrow \eta(t_{1}) \leq \eta(t_{2})$ – the epoch number only increases.

The additional assumption (DS2) states that epochs are monotonically increasing. Although not technically required, the assumption is convenient, as it ensures that for any two points in time in the same epoch, every point between them is also in that epoch. This assumption does not reduce the expressive power of the model, as for any non-monotonic $\eta$ it is possible to find a suitable relabelling of epochs that recovers monotonicity. Another possible assumption that might be made is that a node can only activate if it is currently participating in the iteration (i.e. $\forall t : \alpha(t) \subseteq \pi(\eta(t))$). Although the assumption is reasonable, the dynamic asynchronous state function $\delta$ will be defined in such a way that it is not required (see Definition 10).

Given a schedule, we define some additional notation for $\rho(t)$, the set of nodes participating at time $t$, and $F^{t}$, the function being used at time $t$:

$\rho(t) \triangleq \pi(\eta(t))$
$F^{t} \triangleq F^{\eta(t)\rho(t)}$

It is now possible to define the dynamic asynchronous state function as follows:

Definition 10 (Dynamic asynchronous state function).

Given an initial state $x$ and a schedule $(\alpha, \beta, \eta, \pi)$, the dynamic asynchronous state function is defined as:

$$\delta^{t}_{i}(x) = \begin{cases} \bot_{i} & \text{if $i \notin \rho(t)$} \\ x_{i} & \text{else if $t = 0$ or $i \notin \rho(t-1)$} \\ \delta^{t-1}_{i}(x) & \text{else if $i \notin \alpha(t)$} \\ F^{t}_{i}(\delta^{\beta(t,i,1)}_{1}(x), \ldots, \delta^{\beta(t,i,n)}_{n}(x)) & \text{otherwise} \end{cases}$$

where $\delta^{t}_{i}(x)$ is the state of node $i$ at time $t$ starting from state $x$.

If a node is not currently participating then it adopts its non-participating state. If it is participating at time $t$ but was not participating at time $t-1$ then it must have just (re)joined the computation, and it therefore adopts its initial state. If the node is a continuing participant and is inactive at time $t$ then its state remains unchanged. Otherwise, if it is active at time $t$, it updates its state in accordance with the data received from the other nodes in the computation.

Note that in the latter case, at time $t$ nodes can use data from any node in $V$ rather than just the current set of participants $\rho(t)$. Hence nodes that are currently participating may end up processing messages from nodes that are no longer participating in the current epoch. Also note that this new model is a strict generalisation of the static model, as the static definition of $\delta$ is immediately recovered by a schedule with the constant epoch and participant functions $\eta(t) = 0$ and $\pi(0) = V$.
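The dynamic definition can be animated in the same style as the static one. In the sketch below the function, the schedule, and the non-participating state $\bot$ are all hypothetical illustrations; node 1 joins the computation only in epoch 1 and (re)starts from its initial state when it does:

```python
import functools

n = 2
BOT = (0, 0)               # non-participating states (bottom_i)
x0 = (7, 7)                # initial state

# Epoch 0 (times 0-4): only node 0 participates; epoch 1: both nodes do.
eta = lambda t: 0 if t < 5 else 1
pi = lambda e: {0} if e == 0 else {0, 1}
rho = lambda t: pi(eta(t))
alpha = lambda t: {0, 1}           # every node attempts to activate every time
beta = lambda t, i, j: t - 1       # one-step message delay

# F^{ep}: here the same function in every epoch: F(x) = (0, x_0 + 1)
def F_i(e, p, i, x):
    return 0 if i == 0 else x[0] + 1

@functools.lru_cache(maxsize=None)
def delta(t, i):
    if i not in rho(t):
        return BOT[i]                  # non-participating state
    if t == 0 or i not in rho(t - 1):
        return x0[i]                   # node (re)joins with its initial state
    if i not in alpha(t):
        return delta(t - 1, i)         # inactive: state unchanged
    view = tuple(delta(beta(t, i, j), j) for j in range(n))
    return F_i(eta(t), rho(t), i, view)

# During epoch 0 node 1 holds its non-participating state ...
assert delta(3, 1) == BOT[1]
# ... and after it joins in epoch 1 the iteration reaches the fixed point (0, 1)
assert (delta(10, 0), delta(10, 1)) == (0, 1)
```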

3.2 Correctness

In order to define a notion of correctness for dynamic iterations, we first need to update the definition of a pseudocycle. It turns out that only two alterations are needed. The first is that the start and end of activation and expiry periods and consequently pseudocycles must belong to the same epoch. The second is that during a pseudocycle, only the participating nodes need to experience an activation and expiry period. An updated version of the definitions is given below with the changes underlined.

Definition 11 (Dynamic activation period).

A period of time $[t_{1}, t_{2}]$ is a dynamic activation period for node $i$ if $\eta(t_{1}) = \eta(t_{2})$ and there exists a time $t \in [t_{1}, t_{2}]$ such that $i \in \alpha(t)$.

Definition 12 (Dynamic expiry period).

A period of time $[t_{1}, t_{2}]$ is a dynamic expiry period for node $i$ if $\eta(t_{1}) = \eta(t_{2})$ and for all nodes $j$ and times $t \geq t_{2}$ we have $t_{1} \leq \beta(t,i,j)$.

Definition 13 (Dynamic pseudocycle).

A period of time $[t_{1}, t_{2}]$ is a dynamic pseudocycle if $\eta(t_{1}) = \eta(t_{2})$ and for all nodes $i \in \rho(t_{1})$ there exists a time $t \in [t_{1}, t_{2}]$ such that $[t_{1}, t]$ is an expiry period for node $i$ and $[t, t_{2}]$ is an activation period for node $i$.

We can now consider what it means for a dynamic iteration to be implemented correctly. Guaranteeing that a dynamic iteration will always converge to one particular fixed point is impossible, as both the underlying computation and the participants may continue to change indefinitely. Furthermore the epoch durations may be short enough that no fixed point is ever reached, even temporarily. The natural and intuitive notion in such circumstances is to say that an iteration is convergent if, whenever an epoch contains a sufficient number of pseudocycles, $\delta$ will converge to a fixed point for the remainder of that epoch. Furthermore, within an epoch the fixed point reached should be unique, but different epochs may have different unique fixed points.

However, we would also like to be able to reason about the epochs and sets of participants for which a dynamic iteration does not converge. For example, in the case of inter-domain routing it is known that path-vector protocols converge if and only if the network topology is free [20]. Therefore, in the same way that we constrain convergence to some set of initial states, we also constrain convergence to some set of pairs of epochs and sets of participants. We will refer to such pairs as configurations.

Definition 14 (Convergent iteration).

A dynamic asynchronous iteration is convergent over an initial set of states $X = X_{1} \times X_{2} \times \ldots \times X_{n}$ and a set of configurations $C \subseteq E \times 2^{V}$ iff:

  1. for every configuration $(e,p) \in C$ there exists a fixed point $x^{*}_{ep}$ for $F^{ep}$ and a number of iterations $k^{*}_{ep}$.

  2. for every initial state $x \in X$, schedule and time $t_{1}$, if $(\eta(t_{1}), \rho(t_{1})) \in C$ and the time period $[t_{1}, t_{2}]$ contains $k^{*}_{\eta(t_{1})\rho(t_{1})}$ pseudocycles, then for every time $t_{3}$ such that $t_{3} \geq t_{2}$ and $\eta(t_{2}) = \eta(t_{3})$ we have $\delta^{t_{3}}(x) = x^{*}_{\eta(t_{1})\rho(t_{1})}$.

Having now defined what we mean for a dynamic iteration to be correct, in the next section we generalise the static ACO and AMCO conditions described in Section 2.3 and prove analogous correctness theorems for them.

4 Results

Before we generalise the ACO and AMCO conditions, some additional definitions are needed. As would be expected, information from non-participating nodes that is still “in-flight” from a previous epoch may interfere with the convergence of $\delta$ in the current epoch. Therefore a notion is needed of a state only containing information for the current set of participants.

Definition 15 (Accordant states).

A state $x$ is accordant with respect to a set of participants $p$ if every non-participating node is assigned the non-participating state, i.e. $\forall i \notin p : x_{i} = \bot_{i}$. The set of states that are accordant with $p$ is denoted as $A^{p}$.
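The accordance check is a one-liner; in the sketch below the non-participating state and the participant set are hypothetical:

```python
BOT = (0, 0, 0)   # non-participating states (bottom_i), chosen for illustration

def is_accordant(x, p):
    """x is accordant with participants p if x_i = bottom_i for every i not in p."""
    return all(x[i] == BOT[i] for i in range(len(x)) if i not in p)

assert is_accordant((5, 0, 0), {0})        # nodes 1 and 2 hold their bottom states
assert not is_accordant((5, 7, 0), {0})    # node 1 deviates from its bottom state
```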

In the dynamic setting we also need to take more care over the properties of $X$, the set of initial states from which the iteration can converge. The static ACO conditions in Definition 7 implicitly take the first box to be the set of initial states, i.e. $X = B(0)$, whilst the static AMCO conditions implicitly assume any state is valid, i.e. $X = S$. However, the former approach no longer works in the dynamic world, as we are forced to have different sets of boxes for each epoch and set of participants, and the latter is unnecessarily restrictive, as some iterative algorithms may only converge when started from certain states.

In order to solve these problems, we now define the properties that the initial set must satisfy regardless of whether we’re using the ACO or AMCO conditions.

Definition 16 (Valid set of initial states).

An initial set $X = X_{1} \times X_{2} \times \ldots \times X_{n}$ is valid if:

  1. (IS1)

    $\bot \in X$

  2. (IS2)

    $\forall e, p, x : x \in X \Rightarrow F^{ep}(x) \in X$

Assumption (IS1) states that the non-participating state is in the initial set, and (IS2) states that $X$ is closed under every operator. The latter is the counterpart of assumption (SA1) in the definition of a static ACO. Together these ensure that an asynchronous iteration never leaves the initial set (see Lemma 1 for details). Also note that the entire state space $S$ is trivially a valid initial set.

4.1 Dynamic ACO implies convergent

We can now define the dynamic counterpart of the static ACO conditions. While it might be tempting to simply require that every $F^{ep}$ be a static ACO, a couple of additional constraints are necessary.

Definition 17 (Dynamic ACO).

The set of functions $F$ is a dynamic ACO over a set of initial states $X = X_{1} \times X_{2} \times \ldots \times X_{n}$ and a set of configurations $C \subseteq E \times 2^{V}$ if for every epoch $e$ and set of participants $p$ such that $(e,p) \in C$ there exists a sequence of sets $B^{ep}(k) = B^{ep}_{1}(k) \times B^{ep}_{2}(k) \times \ldots \times B^{ep}_{n}(k)$ for $k \in \mathbb{N}$ such that:

  1. (DA1)

    $X \subseteq B^{ep}(0)$

  2. (DA2)

    $\forall k \in \mathbb{N}, i \notin p : \bot_{i} \in B^{ep}_{i}(k)$

  3. (DA3)

    $\forall k \in \mathbb{N}, x \in X \cap A^{p} : x \in B^{ep}(k) \Rightarrow F^{ep}(x) \in B^{ep}(k+1)$

  4. (DA4)

    $\exists k^{*}_{ep} \in \mathbb{N}, x^{*}_{ep} \in X : \forall k \in \mathbb{N} : k^{*}_{ep} \leq k \Rightarrow B^{ep}(k) = \{x^{*}_{ep}\}$

Assumption (DA1) is a new assumption that links the initial boxes of each epoch together by assuming that the initial set of states is a subset of the initial box. Assumption (DA2) is also new and ensures that the box for any non-participating node contains its non-participating state. Assumptions (DA3) & (DA4) are generalised versions of (SA2) & (SA3) respectively. The only difference is that (DA3) has been weakened so that applying $F^{ep}$ only advances a box when the state is accordant with the current set of participants. This means that progress need not be made when stale messages are still being received from nodes that are no longer participating.

We now prove that if $F$ is a dynamic ACO over a valid set of initial states $X$ and set of configurations $C$ then $\delta$ is convergent over $X$ and $C$. Going forwards, the existence of some arbitrary schedule $(\alpha, \beta, \eta, \pi)$ and starting state $x \in X$ is assumed. As with $F^{t} \triangleq F^{\eta(t)\rho(t)}$, we use the shorthands $B^{t} \triangleq B^{\eta(t)\rho(t)}$ and $c(t) \triangleq (\eta(t), \rho(t))$ so that the current boxes and configuration may be indexed by time rather than by epoch and participants. Initially some auxiliary definitions are introduced in order to improve the readability of the proof.

Definition 18.

The state of node ii is in box kk at time tt if:

c(t)Cδit(x)Bit(k)c(t)\in C\Rightarrow\delta^{t}_{i}(x)\in B^{t}_{i}(k)

i.e. if the current configuration is in the set of valid configurations then the current state of node ii is in box kk.

Definition 19.

The messages to node ii are in box kk at time tt if:

c(t)Cs:(s>t)(η(s)=η(t))j:δjβ(s,i,j)(x)Bjt(k)c(t)\in C\Rightarrow\forall s:(s>t)\wedge(\eta(s)=\eta(t))\Rightarrow\forall j:\delta^{\beta(s,i,j)}_{j}(x)\in B^{t}_{j}(k)

i.e. if the current configuration is in the set of valid configurations then any message arriving at node ii after time tt and before the end of the current epoch is guaranteed to be in box kk. A different way of viewing this condition is that node ii’s local view of the global state of the iteration will be in box kk for the remainder of the epoch.

Definition 20.

The messages to node ii are accordant at time tt if:

∀s:(s>t)∧(η(s)=η(t))⇒∀j:j∉ρ(s)⇒δjβ(s,i,j)(x)=⊥j\forall s:(s>t)\wedge(\eta(s)=\eta(t))\Rightarrow\forall j:j\notin\rho(s)\Rightarrow\delta^{\beta(s,i,j)}_{j}(x)=\bot_{j}

i.e. any message arriving at node ii after time tt during the current epoch from a non-participating node jj will always contain the non-participating state j\bot_{j}. This is equivalent to stating that node ii’s local view of the state is accordant.

Definition 21.

The computation at node ii is in box kk at time tt if:

  1. 1.

    the state of node ii is in box kk at time tt.

  2. 2.

    if k>0k>0 the messages to node ii are in box k1k-1 at time tt.

  3. 3.

    if k>0k>0 then the messages to node ii are accordant at time tt.

This definition collects together the pre-conditions required to prove that the state of node ii will always be in box kk for the remainder of the epoch, as shown in Lemma 4. Finally we lift this definition from an individual node to the whole computation as follows:

Definition 22.

The computation is in box kk at time tt if for all nodes iρ(t)i\in\rho(t) then the computation at node ii is in box kk at time tt.

It is interesting to note that Definition 22 does not place any requirements on non-participating nodes. This is because, by the definition of δ\delta, any non-participating node ii is always in the non-participating state i\bot_{i}, which, by assumption (DA2), is in every one of the boxes, including the final one. Also note that all of the above definitions contain some linguistic sleight of hand: being in box kk at time tt and being in box kk at time t+1t+1 do not necessarily refer to the same box if η(t)η(t+1)\eta(t)\neq\eta(t+1).

The proof can now be split into four parts. The first set of closure lemmas proves that the computation is always in box 0, even after changes in the epoch. The second set of stability lemmas establishes the conditions under which, once the computation reaches box kk, it remains in that box for the remainder of the epoch. The third set of progress lemmas demonstrates how the entire computation advances at least one box during a pseudocycle. Finally these results are combined to prove convergence.

4.1.1 Closure lemmas

In order to later apply the other ACO assumptions, we first establish that the initial set XX is closed over δ\delta, i.e. that the iteration never escapes the initial set. As a consequence of this and assumption (DA1), we then prove that both the state and the computation are always in the box 0 of the current epoch.

Lemma 1.

For any xXx\in X and time tt then δt(x)X\delta^{t}(x)\in X.

Proof.

Consider an arbitrary node ii. The proof that δit(x)Xi\delta^{t}_{i}(x)\in X_{i} proceeds by induction over the definition of δ\delta.

Case 1: iρ(t)i\notin\rho(t)
Then δit(x)=i\delta^{t}_{i}(x)=\bot_{i} and iXi\bot_{i}\in X_{i} by assumption (IS1).

Case 2: iρ(t)i\in\rho(t) and (t=0t=0 or iρ(t1)i\notin\rho(t-1))
Then δit(x)=xi\delta^{t}_{i}(x)=x_{i} and xiXix_{i}\in X_{i} by the initial assumption.

Case 3: i∈ρ(t)i\in\rho(t) and i∈ρ(t−1)i\in\rho(t-1) and i∉α(t)i\notin\alpha(t)
Then δit(x)=δit1(x)\delta^{t}_{i}(x)=\delta^{t-1}_{i}(x), and δit1(x)Xi\delta^{t-1}_{i}(x)\in X_{i} by the inductive hypothesis applied to time t1t-1.

Case 4: iρ(t)i\in\rho(t) and iρ(t1)i\in\rho(t-1) and iα(t)i\in\alpha(t)
Then δit(x)=Fit(δ1β(t,i,1)(x),…,δnβ(t,i,n)(x))\delta^{t}_{i}(x)=F^{t}_{i}(\delta^{\beta(t,i,1)}_{1}(x),\ldots,\delta^{\beta(t,i,n)}_{n}(x)). For each jj, δjβ(t,i,j)(x)∈Xj\delta^{\beta(t,i,j)}_{j}(x)\in X_{j} by the inductive hypothesis applied to time β(t,i,j)\beta(t,i,j). Hence Fit(…)∈XiF^{t}_{i}(...)\in X_{i} as XX is closed under FtF^{t} by assumption (IS2). ∎

Lemma 2.

For every time tt and node ii the state of node ii is in box 0 at time tt.

Proof.

Consider an arbitrary time tt and node ii such that c(t)Cc(t)\in C. Then δit(x)Xi\delta^{t}_{i}(x)\in X_{i} by Lemma 1 and XiBit(0)X_{i}\subseteq B^{t}_{i}(0) by assumption (DA1). ∎

Lemma 3.

For every time tt and node ii the computation at node ii is in box 0 at time tt.

Proof.

The state of node ii is in box 0 at time tt by Lemma 2. As we are only considering box 0, Definition 21 does not require us to prove that the messages to node ii are in box 0 and are accordant at time tt. (Note that the latter is not even true in general, as out-of-date messages may still be arriving from nodes that were participating in a previous epoch but are no longer participating in the current epoch.) ∎

4.1.2 Stability lemmas

Guaranteeing that the dynamic iteration makes progress towards the fixed point of the current epoch is complicated by the fact that out-of-date messages may arrive from earlier in the iteration and undo recent progress. The next lemmas establish what conditions are necessary to guarantee that once the state and messages are in box kk then they will remain in box kk for the remainder of the epoch.

Lemma 4.

If the computation at node ii is in box kk at time tt then the state of node ii is in box kk for every time sts\geq t such that η(s)=η(t)\eta(s)=\eta(t).

Proof.

Assume that the computation at node ii is in box kk at time tt for an arbitrary node ii and time tt. We must show that δis(x)Bis(k)\delta^{s}_{i}(x)\in B^{s}_{i}(k) for any sts\geq t such that η(s)=η(t)\eta(s)=\eta(t) and c(s)Cc(s)\in C. If k=0k=0 then the result follows immediately by Lemma 2. Otherwise if k>0k>0 the proof proceeds by induction over time ss and the definition of δ\delta. If s=ts=t then the state of node ii is in box kk at time tt by Definition 21. Otherwise s>ts>t, and as s1[t,s]s-1\in[t,s] and η(t)=η(s)\eta(t)=\eta(s) then η(t)=η(s1)=η(s)\eta(t)=\eta(s-1)=\eta(s), and hence Bt=Bs1=BsB^{t}=B^{s-1}=B^{s} and c(t),c(s1)Cc(t),c(s-1)\in C. Consider the following cases:

Case 1: iρ(s)i\notin\rho(s)
Then δis(x)=i\delta^{s}_{i}(x)=\bot_{i} and iBis(k)\bot_{i}\in B^{s}_{i}(k) by assumption (DA2).

Case 2: iρ(s)i\in\rho(s) and iρ(s1)i\notin\rho(s-1)
As η(s1)=η(s)\eta(s-1)=\eta(s) then ρ(s1)=ρ(s)\rho(s-1)=\rho(s), contradicting the case assumptions.

Case 3: iρ(s)i\in\rho(s) and iρ(s1)i\in\rho(s-1) and iα(s)i\notin\alpha(s)
Then δis(x)=δis1(x)\delta^{s}_{i}(x)=\delta^{s-1}_{i}(x). As c(s1)Cc(s-1)\in C then we have δis1(x)Bis1(k)\delta^{s-1}_{i}(x)\in B^{s-1}_{i}(k) by the inductive hypothesis applied to time s1s-1. As Bis1(k)=Bis(k)B^{s-1}_{i}(k)=B^{s}_{i}(k), we therefore have δis(x)Bis(k)\delta^{s}_{i}(x)\in B^{s}_{i}(k).

Case 4: iρ(s)i\in\rho(s) and iρ(s1)i\in\rho(s-1) and iα(s)i\in\alpha(s)
Then δis(x)=Fis(δ1β(s,i,1)(x),,δnβ(s,i,n)(x))\delta^{s}_{i}(x)=F^{s}_{i}(\delta^{\beta(s,i,1)}_{1}(x),\ldots,\delta^{\beta(s,i,n)}_{n}(x)). As c(t)Cc(t)\in C and all messages to node ii are in box k1k-1 at time tt and are accordant, then δjβ(s,i,j)(x)\delta^{\beta(s,i,j)}_{j}(x) is accordant and in box Bjt(k1)=Bjs(k1)B^{t}_{j}(k-1)=B^{s}_{j}(k-1) for every node jj. Hence Fis()Bis(k)F^{s}_{i}(...)\in B^{s}_{i}(k) by assumption (DA3). ∎

Lemma 5.

If messages to node ii are in box kk at time tt then the messages to node ii are in box kk for all times sts\geq t such that η(s)=η(t)\eta(s)=\eta(t).

Proof.

Consider a time r>sr>s such that η(r)=η(s)\eta(r)=\eta(s). We must show that δjβ(r,i,j)(x)Bjs(k)\delta^{\beta(r,i,j)}_{j}(x)\in B^{s}_{j}(k) for every node jj. As r>tr>t and η(r)=η(t)\eta(r)=\eta(t), by Definition 19 we have that δjβ(r,i,j)(x)∈Bjt(k)\delta^{\beta(r,i,j)}_{j}(x)\in B^{t}_{j}(k). As η(s)=η(t)\eta(s)=\eta(t) then Bs=BtB^{s}=B^{t} and hence we have the required result. ∎

4.1.3 Progress lemmas

Having established that i) the computation is always in box 0 no matter the epoch and ii) once the computation at node ii has reached box kk, it remains in box kk, it is next necessary to establish when the computation advances a box during an epoch. These conditions are intimately tied to the notion of a pseudocycle.

Lemma 6.

If the messages to node ii are accordant and are in box kk at time tt and [t,s][t,s] is an activation period then the state of node ii is in box k+1k+1 at time ss.

Proof.

Assume c(s)Cc(s)\in C. The proof that δis(x)Bis(k+1)\delta^{s}_{i}(x)\in B^{s}_{i}(k+1) proceeds by induction over the definition of δ\delta and time ss. As activation periods are of non-zero length then s>ts>t and as s1[t,s]s-1\in[t,s] and η(t)=η(s)\eta(t)=\eta(s) then Bt=Bs1=BsB^{t}=B^{s-1}=B^{s} and c(t),c(s1)Cc(t),c(s-1)\in C. Consider the following cases:

Case 1: iρ(s)i\notin\rho(s)
Then δis(x)=i\delta^{s}_{i}(x)=\bot_{i} and iBis(k+1)\bot_{i}\in B^{s}_{i}(k+1) by assumption (DA2).

Case 2: iρ(s)i\in\rho(s) and iρ(s1)i\notin\rho(s-1)
As η(s1)=η(s)\eta(s-1)=\eta(s) then ρ(s1)=ρ(s)\rho(s-1)=\rho(s), contradicting the case assumptions.

Case 3: iρ(s)i\in\rho(s) and iρ(s1)i\in\rho(s-1) and iα(s)i\notin\alpha(s)
Then δis(x)=δis1(x)\delta^{s}_{i}(x)=\delta^{s-1}_{i}(x). If s1=ts-1=t then the initial assumptions are contradicted as ii has not activated during the period [t,s][t,s]. Otherwise if s1>ts-1>t then δis1(x)Bis1(k+1)\delta^{s-1}_{i}(x)\in B^{s-1}_{i}(k+1) by applying the inductive hypothesis to time s1s-1. As Bis1(k+1)=Bis(k+1)B^{s-1}_{i}(k+1)=B^{s}_{i}(k+1) we have the required result.

Case 4: iρ(s)i\in\rho(s) and iρ(s1)i\in\rho(s-1) and iα(s)i\in\alpha(s)
Then δis(x)=Fis(δ1β(s,i,1)(x),…,δnβ(s,i,n)(x))\delta^{s}_{i}(x)=F^{s}_{i}(\delta^{\beta(s,i,1)}_{1}(x),\ldots,\delta^{\beta(s,i,n)}_{n}(x)). As c(t)Cc(t)\in C and all messages to node ii are in box kk at time tt and are accordant, then δjβ(s,i,j)(x)\delta^{\beta(s,i,j)}_{j}(x) is accordant and in box Bjt(k)=Bjs(k)B^{t}_{j}(k)=B^{s}_{j}(k) for every node jj. Hence Fis()Bis(k+1)F^{s}_{i}(...)\in B^{s}_{i}(k+1) by assumption (DA3). ∎

Lemma 7.

If the computation is in box kk at time tt and [t,s][t,s] is an expiry period for node ii then the messages to node ii are in box kk at time ss.

Proof.

Assume that the computation is in box kk at time tt and consider two arbitrary nodes ii and jj and time ss such that [t,s][t,s] is an expiry period and c(s)Cc(s)\in C. We must show that for all times r>sr>s such that η(s)=η(r)\eta(s)=\eta(r) then δjβ(r,i,j)(x)Bjs(k)\delta^{\beta(r,i,j)}_{j}(x)\in B^{s}_{j}(k). As [t,s][t,s] is an expiry period then tβ(r,i,j)<rt\leq\beta(r,i,j)<r and therefore η(t)=η(β(r,i,j))=η(r)=η(s)\eta(t)=\eta(\beta(r,i,j))=\eta(r)=\eta(s). If jρ(s)j\notin\rho(s) then δjβ(r,i,j)(x)=j\delta^{\beta(r,i,j)}_{j}(x)=\bot_{j} and jBjs(k)\bot_{j}\in B^{s}_{j}(k) by assumption (DA2). Otherwise if jρ(s)j\in\rho(s) then δjβ(r,i,j)(x)Bjβ(r,i,j)(k)\delta^{\beta(r,i,j)}_{j}(x)\in B^{\beta(r,i,j)}_{j}(k) by Lemma 4 applied to time period [t,β(r,i,j)][t,\beta(r,i,j)] and the fact that the computation at node jj is in box kk at time tt. The required result then follows as Bβ(r,i,j)=BsB^{\beta(r,i,j)}=B^{s} by η(β(r,i,j))=η(s)\eta(\beta(r,i,j))=\eta(s). ∎

Lemmas 6 & 7 prove that during activation and expiry periods the state and the messages are both guaranteed to advance at least one box. The next lemma combines them to prove that during a pseudocycle the whole computation advances at least one box, i.e. during a pseudocycle the asynchronous iteration makes at least as much progress as a single step of the synchronous iteration.

Lemma 8.

If the computation is in box kk at time tt and the period [t,s][t,s] is a pseudocycle then the computation is in box k+1k+1 at time ss.

Proof.

Consider an arbitrary node iρ(t)i\in\rho(t). As [t,s][t,s] is a pseudocycle then as iρ(t)i\in\rho(t) there exists a time rr such that [t,r][t,r] is an expiry period for node ii and [r,s][r,s] is an activation period for node ii.

  • As the messages to node ii are accordant at time tt then they are also accordant at times rr and ss.

  • As [t,r][t,r] is an expiry period and the computation is in box kk at time tt, then the messages to node ii are in box kk at time rr by Lemma 7, and also therefore at time ss by Lemma 5.

  • As [r,s][r,s] is an activation period and the messages to node ii are accordant and in box kk at time rr (by the previous two points), the state of node ii is in box k+1k+1 at time ss by Lemma 6.

Consequently all three requirements for the computation at node ii being in box k+1k+1 at time ss are fulfilled. ∎

4.1.4 Convergence

Now that Lemma 8 has established that during a pseudocycle the whole computation advances one box, the main theorem may be proved.

Theorem 3.

If FF is a dynamic ACO over a valid initial set XX and configurations CC then δ\delta is convergent over XX and CC.

Proof.

To prove that δ\delta is convergent it is first necessary to construct a fixed point xepx^{*}_{ep} and iteration number kepk^{*}_{ep} for every epoch ee and set of participants pp such that (e,p)C(e,p)\in C. Let these be the xepx^{*}_{ep} and kepk^{*}_{ep} respectively as specified by assumption (DA4).

Next consider an arbitrary schedule, starting state xXx\in X and starting time t1t_{1} in epoch e=η(t1)e=\eta(t_{1}) with participants p=ρ(t1)p=\rho(t_{1}) such that (e,p)C(e,p)\in C. We must show that for all times t2t_{2}, if [t1,t2][t_{1},t_{2}] contains kepk^{*}_{ep} pseudocycles then for all times t3t_{3} such that t3t2t_{3}\geq t_{2} and η(t3)=η(t2)\eta(t_{3})=\eta(t_{2}) we have δt3(x)=xep\delta^{t_{3}}(x)=x^{*}_{ep}.

The computation is always in box 0 by Lemma 3. Consequently after kepk^{*}_{ep} pseudocycles, the computation is in box kepk^{*}_{ep} at time t2t_{2} by repeated application of Lemma 8. Hence for any subsequent time t3t_{3} in epoch ee, then δt3(x)Bep(kep)\delta^{t_{3}}(x)\in B^{ep}(k^{*}_{ep}) by Lemma 4 and, as xepx^{*}_{ep} is the only state in Bep(kep)B^{ep}(k^{*}_{ep}) by assumption (DA4), then δt3(x)=xep\delta^{t_{3}}(x)=x^{*}_{ep}. ∎
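Theorem 3's guarantee can also be observed empirically. The sketch below simulates the dynamic asynchronous iteration δ directly from its recursive definition, under an invented schedule (random activations α, message delays β bounded by 3, and two epoch changes η/ρ before the final epoch stabilises) and the same toy contracting function as before; every concrete choice here (the schedule, the function, the constants) is hypothetical, chosen so that the final epoch's fixed point is (0, 0).

```python
import random
from functools import lru_cache

random.seed(0)
n, T = 2, 300
BOT = 0                                       # non-participating state ⊥_i
x0 = (7, 5)                                   # starting state x

# hypothetical schedule: epochs change at t = 40 and t = 80, then stabilise
eta = [0] * T
for t in range(1, T):
    eta[t] = eta[t - 1] + (1 if t in (40, 80) else 0)
rho = {0: {0, 1}, 1: {0}, 2: {0, 1}}          # participants per epoch

alpha = [{i for i in range(n) if random.random() < 0.6} for _ in range(T)]
beta = [[[max(0, t - random.randint(1, 3)) for j in range(n)]
         for i in range(n)] for t in range(T)]

def F(p, x, i):
    # toy epoch function: halve the largest participant state
    # (this toy F_i happens to be the same for every node i)
    return max(x[j] for j in p) // 2

@lru_cache(maxsize=None)
def delta(t, i):
    p = rho[eta[t]]
    if i not in p:                            # non-participant: forced to ⊥
        return BOT
    if t == 0 or i not in rho[eta[t - 1]]:    # (re)joining node restarts from x0
        return x0[i]
    if i not in alpha[t]:                     # not activated: state unchanged
        return delta(t - 1, i)
    msgs = tuple(delta(beta[t][i][j], j) for j in range(n))
    return F(p, msgs, i)                      # activated: apply F to a stale view

print([delta(T - 1, i) for i in range(n)])    # the fixed point of the final epoch
```

Note how node 1 is expelled in epoch 1 and rejoins from its initial state in epoch 2, yet the final epoch still converges: the stale messages crossing the epoch boundary only ever carry ⊥ or old (larger) values that the contraction washes out.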

4.2 Dynamic AMCO implies convergent

Although the dynamic ACO conditions are sufficient to guarantee convergence, they can be tricky to construct in practice. As discussed previously in Section 2.3, the AMCO conditions are often easier to work with. This section defines the dynamic AMCO conditions and shows that they too guarantee that the iteration is convergent, by constructing a reduction from the dynamic AMCO conditions to the dynamic ACO conditions.

Definition 23 (Dynamic AMCO).

The set of functions FF are a dynamic AMCO over a set of initial states X=X1×X2××XnX=X_{1}\times X_{2}\times...\times X_{n} and a set of configurations CE×2VC\subseteq E\times 2^{V} if for every epoch ee and set of participants pp such that (e,p)C(e,p)\in C and for every node iVi\in V there exists a distance function diepd^{ep}_{i} such that if Dep(x,y)maxipdiep(x,y)D^{ep}(x,y)\triangleq\max_{i\in p}\>d^{ep}_{i}(x,y) then:

  1. (DU4)

    iV:x,yS:diep(x,y)=0x=y\forall i\in V:\forall x,y\in S:d^{ep}_{i}(x,y)=0\Leftrightarrow x=y

  2. (DU5)

∀i∈V:∃n:∀x,y∈S:diep(x,y)≤n\forall i\in V:\exists n:\forall x,y\in S:d^{ep}_{i}(x,y)\leq n

  3. (DU6)

    xXAp:Fep(x)xDep(Fep(x),(Fep)2(x))<Dep(x,Fep(x))\forall x{\in}X{\cap}A^{p}:F^{ep}(x)\neq x\Rightarrow D^{ep}(F^{ep}(x),(F^{ep})^{2}(x)){<}D^{ep}(x,F^{ep}(x))

  4. (DU7)

    xXAp,xS:Fep(x)=xxxDep(x,Fep(x))<Dep(x,x)\forall x{\in}X{\cap}A^{p},x^{*}{\in}S{:}F^{ep}(x^{*}){=}x^{*}\wedge x{\neq}x^{*}{\Rightarrow}D^{ep}(x^{*},F^{ep}(x)){<}D^{ep}(x^{*},x)

  5. (DU8)

    xXAp:Fep(x)Ap\forall x{\in}X{\cap}A^{p}:F^{ep}(x)\in A^{p}.

Again assumptions (DU4)–(DU7) are generalisations of (SU1)–(SU4). The crucial difference is that everything is restricted to the set of participants: FepF^{ep} need only be strictly contracting over accordant states ApA^{p}, and the distance function DepD^{ep} is defined as the maximum over participating nodes only. Note that the static assumption (SU5) that SS is non-empty is not needed, as the dynamic model assumes the existence of the non-participating state ⊥∈S\bot\in S. Instead the new assumption (DU8) ensures that the operator FepF^{ep} preserves accordance, i.e. that non-participating nodes remain in the non-participating state. This assumption was not stated explicitly in the dynamic ACO conditions, but there it can be derived from assumptions (DA2) and (DA3).

The proof that these conditions imply that the iteration is convergent is a generalisation of the proof in [16], which in turn was based on the work in [19]. The main thrust of the reduction to the dynamic ACO conditions is relatively simple. As FepF^{ep} is strictly contracting on orbits and its fixed points, it possesses a fixed point xx^{*}. As all distances are bounded above by some value, which we will call kk^{*}, the box Biep(k)B^{ep}_{i}(k) can then be defined as the set of states which are at a distance of no more than kkk^{*}-k from xix^{*}_{i}. This is now fleshed out in more detail.
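As a sanity check of this recipe, the sketch below instantiates it on a hypothetical single-configuration example: all nodes participate, Sᵢ = {0,…,7}, F halves each component, and the invented per-node distance d(a,b) = 0 if a = b else max(a,b)+1 satisfies (DU4)–(DU7). It recovers x* by iterating from ⊥, takes k* to be the distance bound, builds the boxes exactly as described, and brute-forces the resulting (DA3) and (DA4). None of these concrete choices come from the paper.

```python
from itertools import product

n = 2
S_i = range(8)                       # per-node state space {0,...,7}
BOT = (0,) * n                       # ⊥: every node participates here, ⊥_i = 0

def F(x):
    # toy operator: componentwise halving, contracting towards (0, 0)
    return tuple(v // 2 for v in x)

def d(a, b):
    # invented distance: 0 iff equal, otherwise max(a, b) + 1
    return 0 if a == b else max(a, b) + 1

# Step 1: iterating from ⊥ reaches a fixed point x* (distances strictly decrease)
x = BOT
while F(x) != x:
    x = F(x)
x_star = x

# Step 2: k* bounds every distance, as in (DU5)
k_star = max(d(a, b) for a in S_i for b in S_i)

# Step 3: the boxes of the reduction: B(0) is everything, then a shrinking
# ball of radius max(0, k* - k) around x*
def in_B(k, x):
    return k == 0 or all(d(x[i], x_star[i]) <= max(0, k_star - k)
                         for i in range(n))

# (DA3): applying F advances every state at least one box
for k in range(k_star + 1):
    for x in product(S_i, repeat=n):
        if in_B(k, x):
            assert in_B(k + 1, F(x))
# (DA4): from box k* onwards only the fixed point remains
assert [x for x in product(S_i, repeat=n) if in_B(k_star, x)] == [x_star]
print("x* =", x_star, " k* =", k_star)
```

The floor-halving operator is not contracting under the ordinary absolute-difference metric (|2−1| = |1−0|), which is why the invented distance above is not |a−b|; choosing a suitable distance function is typically the creative step when applying the AMCO conditions.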

Theorem 4.

If FF is a dynamic AMCO then FF is a dynamic ACO.

Proof.

Consider an epoch ee and set of participants pp such that (e,p)C{(e,p)\in C}. First we prove that FepF^{ep} has a fixed point. We start by constructing the chain:

,Fep(),(Fep)2(),(Fep)3(),\bot,\ F^{ep}(\bot),\ (F^{ep})^{2}(\bot),\ (F^{ep})^{3}(\bot),\ ...

By assumption (DU8) and the fact \bot is trivially accordant, then we have that every element in the chain is in ApA^{p}. Similarly by assumptions (IS1) & (IS2) we have every element in the chain is in XX. Therefore while (Fep)k()(Fep)k+1()(F^{ep})^{k}(\bot)\neq(F^{ep})^{k+1}(\bot) then by assumption (DU6) the distance between consecutive elements must strictly decrease:

D(,Fep())>D(Fep(),(Fep)2())>D((Fep)2(),(Fep)3())>D(\bot,F^{ep}(\bot))>D(F^{ep}(\bot),(F^{ep})^{2}(\bot))>D((F^{ep})^{2}(\bot),(F^{ep})^{3}(\bot))>...

As this is a decreasing chain in \mathbb{N} it must eventually reach a kk such that D((Fep)k(⊥),(Fep)k+1(⊥))=0D((F^{ep})^{k}(\bot),(F^{ep})^{k+1}(\bot))=0. Therefore (Fep)k(⊥)=(Fep)k+1(⊥)(F^{ep})^{k}(\bot)=(F^{ep})^{k+1}(\bot) by (DU4), and hence x=(Fep)k(⊥)x^{*}=(F^{ep})^{k}(\bot) is a fixed point; furthermore x∈Xx^{*}\in X and x∈Apx^{*}\in A^{p}.

By assumption (DU5) there exists an upper bound on the distance functions diepd^{ep}_{i} for all nodes ii, which we will denote by kk^{*}. Having established the existence of the fixed point xx^{*} and maximum distance kk^{*}, we can now define the ithi^{th} component of the kthk^{th} box as follows:

Biep(k){Siif k=0{i}if k0ip{xidi(xi,xi)max(0,kk)}if k0ipB^{ep}_{i}(k)\triangleq\begin{cases}S_{i}&\text{if $k=0$}\\ \{\bot_{i}\}&\text{if $k\neq 0\wedge i\notin p$}\\ \{x_{i}\mid d_{i}(x_{i},x^{*}_{i})\leq\max(0,k^{*}-k)\}&\text{if $k\neq 0\wedge i\in p$}\end{cases}

We must now verify that the boxes BepB^{ep} fulfil the conditions in Definition 17:

  1. 1.

    (DA1)XBep(0)X\subseteq B^{ep}(0)

    Immediate from the first case of the definition of BepB^{ep}.

  2. 2.

    (DA2)k,ip:iBiep(k)\forall k\in\mathbb{N},i\notin p:\bot_{i}\in B^{ep}_{i}(k)

    Immediate from the first and second cases of the definition of BepB^{ep}.

  3. 3.

    (DA3)k,xXAp:xBep(k)Fep(x)Bep(k+1)\forall k\in\mathbb{N},x\in X\cap A^{p}:x\in B^{ep}(k)\Rightarrow F^{ep}(x)\in B^{ep}(k+1)

    Consider some state xXApx\in X\cap A^{p} and also assume that xBep(k)x\in B^{ep}(k). We must show that for all nodes ii then Fiep(x)Biep(k+1)F^{ep}_{i}(x)\in B^{ep}_{i}(k+1).

If i∉pi\notin p then Fiep(x)=iF^{ep}_{i}(x)=\bot_{i} by assumption (DU8), as x∈Apx\in A^{p}, and so Fiep(x)Biep(k+1)F^{ep}_{i}(x)\in B^{ep}_{i}(k+1). Otherwise if i∈pi\in p it remains to show that diep(xi,Fiep(x))max(0,k(k+1))d^{ep}_{i}(x^{*}_{i},F^{ep}_{i}(x))\leq\max(0,k^{*}-(k+1)).

    If x=xx=x^{*} then:

    diep(xi,Fiep(x))\displaystyle d^{ep}_{i}(x^{*}_{i},F^{ep}_{i}(x)) =diep(xi,Fiep(x))\displaystyle=d^{ep}_{i}(x^{*}_{i},F^{ep}_{i}(x^{*})) (as x=xx=x^{*})
    =diep(xi,xi)\displaystyle=d^{ep}_{i}(x^{*}_{i},x^{*}_{i}) (as Fep(x)=xF^{ep}(x^{*})=x^{*})
    =0\displaystyle=0 (by (DU4))
    max(0,k(k+1))\displaystyle\leq\max(0,k^{*}-(k+1))

    Otherwise if xxx\neq x^{*} then diep(xi,Fiep(x))<max(0,kk)d^{ep}_{i}(x^{*}_{i},F^{ep}_{i}(x))<\max(0,k^{*}-k) as:

    diep(xi,Fiep(x))\displaystyle d^{ep}_{i}(x^{*}_{i},F^{ep}_{i}(x)) Dep(x,Fep(x))\displaystyle\leq D^{ep}(x^{*},F^{ep}(x)) (by definition of DD)
    <Dep(x,x)\displaystyle<D^{ep}(x^{*},x) (by (DU7) & (DU8))
    max(0,kk)\displaystyle\leq\max(0,k^{*}-k) (by xB(k)x\in B(k))

    which implies that diep(xi,Fiep(x))max(0,k(k+1))d^{ep}_{i}(x^{*}_{i},F^{ep}_{i}(x))\leq\max(0,k^{*}-(k+1)).

  4. 4.

    (DA4)k,x:k:kkB(k)={x}\exists k^{*},x^{*}:\forall k:k^{*}\leq k\Rightarrow B(k)=\{x^{*}\}

We have already established the existence of kk^{*} and xx^{*}. Consider a kkk\geq k^{*}. If k=0k=0 then k=0k^{*}=0, so by (DU5) every pair of states is at distance 0 and hence, by (DU4), equal; the result therefore holds trivially. Otherwise if k0k\neq 0 then as kkk\geq k^{*} the definition of Bep(k)B^{ep}(k) simplifies to:

    Biep(k){{i}if ip{xidiep(xi,xi)=0}if ipB^{ep}_{i}(k)\triangleq\begin{cases}\{\bot_{i}\}&\text{if $i\notin p$}\\ \{x_{i}\mid d^{ep}_{i}(x_{i},x^{*}_{i})=0\}&\text{if $i\in p$}\end{cases}

To prove that xBep(k)x^{*}\in B^{ep}(k) we must show that for every node ii we have xiBiep(k)x^{*}_{i}\in B^{ep}_{i}(k). If i∉pi\notin p then xi=⊥ix^{*}_{i}=\bot_{i} as x∈Apx^{*}\in A^{p}. Otherwise if i∈pi\in p then diep(xi,xi)=0d^{ep}_{i}(x^{*}_{i},x^{*}_{i})=0 by (DU4).

Now suppose there exists another state xBep(k)x\in B^{ep}(k). We must show that for every node ii we have xi=xix_{i}=x^{*}_{i}. If i∉pi\notin p then xiBiep(k)x_{i}\in B^{ep}_{i}(k) implies xi=⊥ix_{i}=\bot_{i}, and ⊥i=xi\bot_{i}=x^{*}_{i} as x∈Apx^{*}\in A^{p}. Otherwise if i∈pi\in p then xiBiep(k)x_{i}\in B^{ep}_{i}(k) implies diep(xi,xi)=0d^{ep}_{i}(x_{i},x^{*}_{i})=0, and hence xi=xix_{i}=x^{*}_{i} by (DU4).

Hence the conditions are satisfied and FF is a dynamic ACO. ∎

Theorem 5.

If FF satisfies the dynamic AMCO conditions then δ\delta is convergent.

Proof.

As FF is a dynamic AMCO then FF is a dynamic ACO by Theorem 4. Hence δ\delta is convergent by Theorem 3. ∎

5 Formalisation in Agda

Every result presented in this paper has been formalised in Agda [21], a dependently typed language expressive enough that both programs and proofs may be written in it. The results cover not only the dynamic model but also the previous static model. The proofs are available online [22] as an Agda library, and the library's documentation contains a guide from the definitions and proofs in this paper to the corresponding Agda code.

It is hoped that the library may be of use to others in constructing formal proofs of correctness for a variety of asynchronous algorithms. It is designed in a modular fashion so that users need not be aware of the underlying details, and it has already been used to generalise and formally verify the correctness conditions for inter-domain routing protocols with complex conditional policy languages found in [3].

6 Conclusion

This paper has successfully constructed a more general model for dynamic asynchronous iterations in which both the computation and the set of participants may change over time. It has generalised the ACO and AMCO conditions for the existing static model and shown that the generalisations are sufficient to guarantee the correctness of the dynamic model.

Although we have not directly explored any uses of these results in this paper, we refer interested readers to [23] which contains an in-depth case study of how they may be applied to prove new theoretical results about the Bellman-Ford family of routing protocols.

There are still several open questions regarding the theory of asynchronous iterations. For example, even in the static model questions remain about which conditions are necessary for δ\delta to converge. Üresin & Dubois [17] showed that when SS is finite the ACO conditions are both necessary and sufficient for convergence. As far as the authors are aware, no corresponding conditions exist for the case when SS is infinite.

Another obvious question is whether the dynamic ACO conditions are also necessary for the convergence of the dynamic model when SS is finite. Üresin & Dubois’s static proof is essentially combinatorial in nature, building the ACO boxes BB so that they contain all possible states that can result from static schedules. The challenges in adapting this to the dynamic model are twofold: firstly, the additional combinatorial explosion of possible states introduced by epochs; and secondly, that unlike Üresin & Dubois’s static schedules, dynamic schedules are not assumed to contain an infinite number of pseudoperiods.

Acknowledgements

Matthew L. Daggitt was supported by an EPSRC Doctoral Training grant.

References

  • [1] A. Frommer, D. B. Szyld, On asynchronous iterations, Journal of Computational and Applied Mathematics 123 (1) (2000) 201–216.
  • [2] C.-k. Chau, Policy-based routing with non-strict preferences, SIGCOMM Computer Communication Review 36 (4) (2006) 387–398. doi:10.1145/1151659.1159957.
  • [3] M. L. Daggitt, A. J. T. Gurney, T. G. Griffin, Asynchronous convergence of policy-rich distributed Bellman-Ford routing protocols, in: SIGCOMM proceedings, ACM, 2018.
  • [4] B. Ducourthial, S. Tixeuil, Self-stabilization with path algebra, Theoretical Computer Science 293 (1) (2003) 219–236. doi:10.1016/S0304-3975(02)00238-4.
  • [5] S. A. Edwards, E. A. Lee, The semantics and execution of a synchronous block-diagram language, Science of Computer Programming 48 (1) (2003) 21–42. doi:10.1016/S0167-6423(02)00096-5.
  • [6] S. Y. Ko, I. Gupta, Y. Jo, A new class of nature-inspired algorithms for self-adaptive peer-to-peer computing, ACM Transactions on Autonomous and Adaptive Systems 3 (3) (2008) 11:1–11:34. doi:10.1145/1380422.1380426.
  • [7] M. Chau, Algorithmes parallèles asynchrones pour la simulation numérique, Ph.D. thesis, Institut National Polytechnique de Toulouse (2005).
  • [8] J. Wolfson-Pou, E. Chow, Modeling the asynchronous Jacobi method without communication delays, Journal of Parallel and Distributed Computing 128 (2019) 84–98.
  • [9] F. Magoulès, Q. Zou, Asynchronous time-parallel method based on Laplace transform, International Journal of Computer Mathematics 98 (1) (2021) 179–194.
  • [10] F. Magoulès, G. Gbikpi-Benissan, Q. Zou, Asynchronous iterations of parareal algorithm for option pricing models, Mathematics 6 (4) (2018) 45.
  • [11] P. Spiteri, Parallel asynchronous algorithms: A survey, Advances in Engineering Software 149 (2020) 102896.
  • [12] J. M. Bahi, S. Contassot-Vivier, R. Couturier, Parallel Iterative Algorithms: From Sequential to Grid Computing, CRC Press, 2007.
  • [13] D. P. Bertsekas, J. N. Tsitsiklis, et al., A survey of some aspects of parallel and distributed iterative algorithms, Tech. rep. (1989).
  • [14] G. M. Baudet, Asynchronous iterative methods for multiprocessors, Tech. rep., Department of Computer Science, Carnegie-Mellon University, Pittsburgh (1976).
  • [15] A. Üresin, M. Dubois, Generalized asynchronous iterations, in: International Conference on Parallel Processing, Springer, 1986, pp. 272–278.
  • [16] M. L. Daggitt, R. Zmigrod, T. G. Griffin, A relaxation of Üresin & Dubois’ asynchronous fixed-point theory in Agda, Journal of Automated Reasoning (2019).
  • [17] A. Üresin, M. Dubois, Parallel asynchronous algorithms for discrete data, Journal of the ACM 37 (3) (1990) 588–606.
  • [18] J. Milnor, On the concept of attractor, in: The Theory of Chaotic Attractors, Springer, 1985, pp. 243–264.
  • [19] A. J. T. Gurney, Asynchronous iterations in ultrametric spaces, Tech. rep. (2017).
    URL https://arxiv.org/abs/1701.07434
  • [20] J. L. Sobrinho, Correctness of routing vector protocols as a property of network cycles, IEEE/ACM Transactions on Networking 25 (1) (2017) 150–163.
  • [21] U. Norell, Dependently typed programming in Agda, in: Proceedings of the 4th International Workshop on Types in Language Design and Implementation, 2009.
  • [22] M. L. Daggitt, R. Zmigrod, T. G. Griffin, Routing library (2020).
    URL https://github.com/MatthewDaggitt/agda-routing/tree/jpdc2020
  • [23] M. L. Daggitt, T. G. Griffin, Formally verified convergence of policy-rich DBF routing protocols (2021). arXiv:2106.01184.