Asynchronous Byzantine Approximate Consensus in Directed Networks

Dimitris Sakavalas and Lewis Tseng (Boston College, USA) and Nitin H. Vaidya (Georgetown University, USA)

(2018)

Abstract.

In this work, we study the approximate consensus problem in asynchronous message-passing networks where some nodes may become Byzantine faulty. We answer an open problem raised by Tseng and Vaidya, 2012, proposing the first algorithm of optimal resilience for directed networks. Interestingly, our results show that the tight condition on the underlying communication networks for asynchronous Byzantine approximate consensus coincides with the tight condition for synchronous Byzantine exact consensus. Our results can be viewed as a non-trivial generalization of the algorithm by Abraham et al., 2004, which applies to the special case of complete networks. The tight condition and techniques identified in the paper shed light on the fundamental properties for solving approximate consensus in asynchronous directed networks.

approximate consensus, asynchronous networks, network topology, Byzantine adversary

Conference: PODC ’20: ACM Symposium on Principles of Distributed Computing, August 03–07, 2020, Salerno, Italy. CCS concepts: Theory of computation → Distributed algorithms.

1. Introduction

The extensively studied fault-tolerant consensus problem (Pease et al., 1980) is a fundamental building block of many important distributed computing applications (Lynch, 1996). The FLP result (Fischer et al., 1985b) states that it is impossible to achieve exact consensus in asynchronous networks where nodes may crash (exact consensus requires nonfaulty nodes to reach an agreement on an identical value). The FLP impossibility result led to the study of weaker variations, including approximate consensus (Dolev et al., 1986). With approximate consensus, nonfaulty nodes only need to output values that are within $\epsilon$ of each other for a given $\epsilon>0$ . Practical applications of approximate consensus range from sensor fusion (Benediktsson and Swain, 1992) and load balancing (Cybenko, 1989), to natural systems like flocking (Vicsek et al., 1995) and opinion dynamics (Hegselmann and Krause, 2002). The feasibility of achieving consensus depends on the type of faults considered in the system. The literature has mainly focused on crash and Byzantine faults, the latter being the worst case since the misbehavior of faults may be arbitrary. In this work, we focus on the asynchronous Byzantine approximate consensus problem under the existence of at most $f$ faults.

Another important parameter affecting the feasibility is the topology of the underlying communication network $G=(V,E)$ in which nodes represent participants that reliably exchange messages through edges. The relation between network topology and feasibility in undirected networks was studied shortly after the introduction of the respective problems (e.g., (Lynch, 1996; Dolev, 1982)). For $|V|=n$ , connectivity $\kappa(G)$ of the network and upper bound $f$ on the number of faults, Table 1 summarizes the well-known necessary and sufficient topological conditions for achieving exact consensus and approximate consensus in various settings where $G$ is undirected. In undirected networks, satisfying the necessary graph conditions in Table 1 also implies feasibility of reliable message transmission (RMT) (cf. (Dolev et al., 1993)), which can be exploited to simulate algorithms designed for complete networks.

	Crash fault	Byzantine fault
Synchronous system (exact consensus)	$n>f$ and $\kappa(G)>f$ (Lynch, 1996)	$n>3f$ and $\kappa(G)>2f$ (Dolev, 1982)
Asynchronous system (approximate consensus)	$n>2f$ and $\kappa(G)>f$ (Lynch, 1996)	$n>3f$ and $\kappa(G)>2f$ (Fischer et al., 1985a; Abraham et al., 2004; Dolev et al., 1993)

Table 1. Necessary and Sufficient Conditions for Undirected Graphs

The study of consensus in directed graphs is largely motivated by wireless networks wherein different nodes may have different transmission ranges, resulting in directed communication links. While the necessary and sufficient conditions for undirected graphs have been known for many years, their generalizations for directed graphs appeared only after 2012, e.g., (Tseng and Vaidya, 2012, 2015; LeBlanc et al., 2013; Vaidya et al., 2012). This is mainly because there is no direct relation between reliable message transmission and consensus in directed graphs.

As Table 2 summarizes, for directed graphs, Tseng and Vaidya (Tseng and Vaidya, 2015, 2012) obtained necessary and sufficient conditions for solving consensus in the presence of crash faults in both synchronous and asynchronous systems. However, they were able to obtain such conditions for Byzantine faults only for synchronous systems. The determination of a tight condition for the asynchronous Byzantine model has remained open since 2012. This paper closes this gap. We identify a family of new conditions, which we prove equivalent to the ones obtained in (Tseng and Vaidya, 2012, 2015); this reformulation offers an important intuition that essentially leads to the answer to this open question. Our condition family consists of the 1-reach, 2-reach and 3-reach conditions, which are defined in Section 2. (Footnote 1: The general $k$ -reach condition family, presented in the appendix, encompasses conditions 1-reach, 2-reach and 3-reach, and may be of further interest.) Results from (Tseng and Vaidya, 2012, 2015) imply that the 3-reach condition is tight for exact Byzantine consensus in synchronous systems. A key contribution of this paper is to show that the 3-reach condition is also necessary and sufficient for asynchronous Byzantine approximate consensus in directed graphs.

	Crash fault	Byzantine fault
Synchronous system (Exact consensus)	1-reach condition (see Section 2) (Tseng and Vaidya, 2015)	3-reach condition (see Section 2) (Tseng and Vaidya, 2015)
Asynchronous system (Approximate consensus)	2-reach condition (see Section 2) (Tseng and Vaidya, 2012, 2015)	3-reach condition (this paper); open problem since 2012

Table 2. Necessary and Sufficient Conditions for Directed Graphs

Essentially, obtaining the tight graph conditions for directed graphs is much more difficult than the undirected case, since consensus may be possible even if reliable message transmission (RMT) is not possible between every pair of nodes. This is unlike the case of undirected graphs, as observed previously. For instance, Figure 1(a) presents an undirected network, where synchronous exact Byzantine consensus is possible for $f=1$ . In this graph, all-pair RMT is possible, since $\kappa(G)>2f$ allows any pair of nodes to communicate through at least $2f+1=3$ disjoint paths. Note that removing any edge will reduce $\kappa(G)$ , which will make both RMT and consensus impossible. Such an all-pair RMT is not necessary in directed graphs. In particular, Figure 1(b) shows a network that satisfies the 3-reach condition (stated later in Section 2) – this network includes two cliques, each containing 7 nodes, and eight additional directed edges as shown (edges within each clique are not shown in the figure). Observe that there are pairs of nodes (e.g., $v_{1}$ and $w_{1}$ ) that are connected via only $2f=4$ disjoint paths. Clearly, all-pair RMT is not feasible in this case but consensus can still be achieved, as shown by (Tseng and Vaidya, 2012) and our results. The difficulty posed by directed graphs is further compounded by asynchrony. In this work, we show that the 3-reach condition is necessary and sufficient for asynchronous Byzantine approximate consensus in directed graphs – note that this condition is identical to that proved by Tseng and Vaidya (Tseng and Vaidya, 2015) for synchronous Byzantine exact consensus.

Figure 1. (a) Byzantine exact consensus feasible for $f=1$ .

Related work

Additional related work includes studies of the special class of iterative algorithms, which only utilize local knowledge of the network topology and employ local communication between nodes. A tight condition for iterative approximate Byzantine consensus has been presented in (Vaidya et al., 2012; LeBlanc et al., 2013). A family of tight conditions for approximate Byzantine consensus under the more general class of $k$ -hop iterative algorithms has been presented recently in (Su and Vaidya, 2017) but is restricted to synchronous systems. The feasibility of asynchronous crash-tolerant consensus with respect to the $k$ -hop iterative algorithms has been considered in (Sakavalas et al., 2018). A series of works (Litsas et al., 2013; Pagourtzis et al., 2017a, b) studies the effects of topology knowledge on the feasibility of RMT, and consequently exact consensus in undirected networks with Byzantine faults.

2. Preliminaries and Main Result

For the approximate consensus problem (Dolev et al., 1986), each node is given a real-valued input, and the algorithm needs to satisfy the three properties below.

Definition 1.

Approximate consensus is achieved if the following conditions are satisfied for a given $\epsilon>0$ .

(1) Convergence: the output values of any pair of nonfaulty nodes are within $\epsilon$ of each other.
(2) Validity: the output of any nonfaulty node is within the range of the inputs of the nonfaulty nodes.
(3) Termination: all nonfaulty nodes eventually output a value.
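The three properties can be checked mechanically on the trace of one finished execution. The following is a minimal sketch under stated assumptions: the function name `check_approximate_consensus` and the dictionary-based trace format are hypothetical, and termination (an "eventually" property) is only approximated by asking whether every nonfaulty node has produced an output by the end of the trace.

```python
def check_approximate_consensus(inputs, outputs, nonfaulty, eps):
    """Check one finished execution against the three properties.

    inputs    -- dict: node -> real-valued input
    outputs   -- dict: node -> output value (a node that never decided is absent)
    nonfaulty -- iterable of nonfaulty node ids
    eps       -- the agreement parameter epsilon > 0
    Returns a tuple (convergence, validity, termination) of booleans.
    """
    nonfaulty = list(nonfaulty)
    outs = [outputs.get(v) for v in nonfaulty]
    # Termination: every nonfaulty node has produced an output.
    termination = all(o is not None for o in outs)
    decided = [o for o in outs if o is not None]
    # Validity is measured against the inputs of nonfaulty nodes only.
    lo = min(inputs[v] for v in nonfaulty)
    hi = max(inputs[v] for v in nonfaulty)
    validity = all(lo <= o <= hi for o in decided)
    # Convergence: any two nonfaulty outputs differ by at most eps.
    convergence = (max(decided) - min(decided) <= eps) if decided else True
    return convergence, validity, termination
```

Note that the input of a faulty node plays no role in the validity range, which is why a Byzantine node reporting an extreme value must not be able to drag nonfaulty outputs outside that range.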

System Model

We consider an asynchronous message-passing network. The underlying communication network is modeled as a simple directed graph $G(V,E)$ , where $V=\{1,\dots,n\}$ is the set of $n$ nodes, and $E$ is the set of directed edges between the nodes in $V$ . Node $i$ can reliably transmit messages to node $j$ if and only if the directed edge $(i,j)\in E$ . Each node can send messages to itself as well; however, for convenience, we exclude self-loops from set $E$ . A link is assumed to be reliable, but the message delay is not known a priori.

In the system, at most $f$ nodes may become Byzantine faulty during an execution of the algorithm. A faulty node may misbehave arbitrarily. The faulty nodes may potentially collaborate with each other.

New Graph Conditions

Hereafter, we will use the notation $\overline{X}$ to denote the complement $V\setminus X$ of set $X\subseteq V$ . The subgraph of $G$ induced by node set $Y\subseteq V$ will be denoted by $G_{Y}$ . For a given node set $F\subseteq V$ , we now define the reach set of node $v$ under $F$ , originally introduced in (Tseng and Vaidya, 2015).

Definition 2 (Reach set of $v$ under $F$ ).

For node $v\in V$ and node set $F\subseteq V\setminus\{v\}$ , define

$$reach_{v}(F)=\{u\in\overline{F}:\ u\text{ has a directed path to }v\text{ in graph }G_{\overline{F}}\}.$$
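The reach set just defined can be computed with a backward breadth-first search from $v$ that never enters $F$. The sketch below is illustrative only; it assumes the graph is given as a Python dictionary mapping every node (including nodes with no outgoing edges) to the set of its out-neighbors, and the function name `reach` is chosen here for convenience.

```python
from collections import deque


def reach(graph, v, F):
    """Return reach_v(F): nodes outside F with a directed path to v in G restricted to V minus F.

    graph -- dict: node -> set of out-neighbors (every node must appear as a key)
    v     -- target node, must not belong to F
    F     -- set of excluded nodes
    """
    F = set(F)
    if v in F:
        raise ValueError("v must not belong to F")
    # Reverse adjacency of the subgraph induced by the nodes outside F.
    preds = {u: set() for u in graph if u not in F}
    for u, outs in graph.items():
        if u in F:
            continue
        for w in outs:
            if w not in F:
                preds[w].add(u)
    # Backward BFS from v; v itself is trivially in the reach set.
    seen = {v}
    queue = deque([v])
    while queue:
        w = queue.popleft()
        for u in preds[w]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen
```

For example, in the graph with edges $1\to 2$, $2\to 3$ and $4\to 3$, the call `reach(graph, 3, {2})` returns `{3, 4}`: node 1 can only reach 3 through the excluded node 2.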

Observe that a node $u$ belongs to $reach_{v}(F)$ if $v$ is reachable from $u$ in the subgraph of $G$ induced by node set $V\setminus F$ . Trivially, $v$ is in $reach_{v}(F)$ . With the definition of a reach set, we introduce the 1-reach, 2-reach and 3-reach conditions referred to in Section 1. Intuitively speaking, in the definitions below, the sets $F$ , $F_{v}$ , $F_{u}$ represent potential sets of faulty nodes; thus, these sets are chosen to be of size $\leq f$ . In the following, recall that $\overline{C}$ denotes the set $V\setminus C$ .

Definition 3 (Reach Conditions).

We define three conditions:

• 1-reach: For any $F\subset V$ such that $|F|\leq f$ and any nodes $u,v\in\overline{F}$ , we have $reach_{u}(F)\cap reach_{v}(F)\neq\emptyset.$
• 2-reach: For any nodes $u,v\in V$ and any node subsets $F_{u}$ , $F_{v}$ such that $|F_{u}|,|F_{v}|\leq f$ , $u\in\overline{F_{u}}$ , and $v\in\overline{F_{v}}$ , we have $reach_{v}(F_{v})\cap reach_{u}(F_{u})\neq\emptyset.$
• 3-reach: For any nodes $u,v\in V$ and any node subsets $F$ , $F_{u}$ , $F_{v}$ such that $|F|,|F_{u}|,|F_{v}|\leq f$ , $u\in\overline{F\cup F_{u}}$ , and $v\in\overline{F\cup F_{v}}$ , we have $reach_{v}(F\cup F_{v})\cap reach_{u}(F\cup F_{u})\neq\emptyset.$

It is easy to verify that in a clique, 1-reach, 2-reach, and 3-reach are equivalent to $n>f$ , $n>2f$ , and $n>3f$ , respectively. Details can be found in Appendix A.
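For very small graphs, the three conditions can be checked by exhaustive enumeration, which also gives a quick sanity check of the clique claim. The sketch below is a brute-force illustration, exponential in $f$ and not intended for real networks; the names `satisfies_reach` and `clique` are hypothetical, and graphs are assumed to be dictionaries mapping each node to its set of out-neighbors.

```python
from collections import deque
from itertools import combinations


def reach(graph, v, F):
    """reach_v(F) by backward search from v, never entering F (v is assumed outside F)."""
    seen, queue = {v}, deque([v])
    while queue:
        w = queue.popleft()
        for u, outs in graph.items():
            if u not in F and u not in seen and w in outs:
                seen.add(u)
                queue.append(u)
    return seen


def satisfies_reach(graph, f, k):
    """Brute-force test of the k-reach condition for k in {1, 2, 3}."""
    nodes = sorted(graph)
    small = [frozenset(c) for size in range(f + 1) for c in combinations(nodes, size)]
    empty = [frozenset()]
    # 1-reach uses only the common set F; 2-reach only F_u and F_v; 3-reach uses all three.
    shared = small if k in (1, 3) else empty
    private = small if k in (2, 3) else empty
    for F in shared:
        for Fu in private:
            for Fv in private:
                for u in nodes:
                    if u in F | Fu:
                        continue
                    reach_u = reach(graph, u, F | Fu)
                    for v in nodes:
                        if v in F | Fv:
                            continue
                        if not (reach_u & reach(graph, v, F | Fv)):
                            return False
    return True


def clique(n):
    """Complete directed graph on nodes 0..n-1."""
    return {i: {j for j in range(n) if j != i} for i in range(n)}
```

With $f=1$, `satisfies_reach(clique(4), 1, 3)` holds while `satisfies_reach(clique(3), 1, 3)` does not, matching $n>3f$; similarly, a 3-node clique satisfies 2-reach but a 2-node clique does not, matching $n>2f$.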

Main Results

As noted previously, Tseng and Vaidya (Tseng and Vaidya, 2015) obtained the necessary and sufficient conditions enumerated in Table 2. We show, in Appendix A, that each of their conditions is equivalent to the respective reach condition in Definition 3 above. In particular, based on the results in (Tseng and Vaidya, 2015), we can prove Theorems 4, 5 and 6 below. These results are not used to prove our main results; hence, we defer the presentation of the original conditions in (Tseng and Vaidya, 2015) and the equivalence proofs to Appendix A.

Theorem 4.

Synchronous exact consensus is possible in network $G(V,E)$ in the presence of up to $f$ crash faults if and only if $G$ satisfies the 1-reach condition.

Theorem 5.

Asynchronous approximate consensus is possible in network $G(V,E)$ in the presence of up to $f$ crash faults if and only if $G$ satisfies the 2-reach condition.

Theorem 6.

Synchronous exact consensus is possible in network $G(V,E)$ in the presence of up to $f$ Byzantine faults if and only if $G$ satisfies the 3-reach condition.

Main Result

Theorem 7.

Asynchronous approximate consensus is possible in network $G(V,E)$ in the presence of up to $f$ Byzantine faults if and only if $G$ satisfies the 3-reach condition.

This result solves the open problem in Table 2 in Section 1.

Proving the main result: The sufficiency of the 3-reach condition for asynchronous Byzantine approximate consensus is demonstrated constructively in Section 4, using Algorithm 1 for achieving this goal. The necessity of the 3-reach condition for asynchronous Byzantine approximate consensus follows by standard indistinguishability arguments; the proof is deferred to Appendix B.

Technique Outline: Our result generalizes the result of (Abraham et al., 2004), which shows the sufficiency of condition $n>3f$ for asynchronous Byzantine approximate consensus in a clique. Note that the condition coincides with the tight condition for the synchronous case (cf. Table 1). For directed graphs, we show that 3-reach is the tight condition for both the synchronous and asynchronous cases. Condition 3-reach states that there exists a node that has (i) a directed path to node $u$ in the subgraph induced by the node subset $\overline{F\cup F_{u}}$ , and also (ii) a directed path to node $v$ in the subgraph induced by the node subset $\overline{F\cup F_{v}}$ . This “source of common influence” for any pair of nodes is crucial for achieving consensus. We outline two techniques used towards our generalization, since they may provide useful intuition for other fault-tolerant settings.

Maximal Consistency: We simplify the Reliable-Broadcast subroutine of (Abraham et al., 2004) by essentially replacing several rounds of communication between nodes with flooding. Even in a clique, $r$ communication rounds can be simulated by flooding through propagation paths of length at most $r$ . (Footnote 2: Observe that the definition of a path also applies in a clique network.) The receiver of all these propagated messages can then detect the existence of faults in certain propagation paths if the propagated values are inconsistent (i.e., values from different paths do not match). The technique appears in the use of the Maximal-Consistency condition in Algorithm 1; this simple condition provides properties similar to those of Reliable-Broadcast in (Abraham et al., 2004).

Witness node: The witness technique used in (Abraham et al., 2004) relies on the fact that for any pair of nodes, there is a nonfaulty witness node which provides them with enough common information. The existence of an analogous nonfaulty witness for directed networks is implied by the 3-reach condition. Intuitively, even if two nodes $v,u$ “suspect” different sets $F_{v},F_{u}$ to be faulty, the existence of a common nonfaulty witness guarantees the flow of common information to both. Guaranteeing that all pairs of nonfaulty nodes gather enough common values, while ensuring that nonfaulty nodes that suspect the wrong faulty set are always able to proceed, proved to be the most challenging part of the proposed algorithm. This technique appears in how each node verifies the messages that it has received (the $Verify$ condition in Algorithm 1). Generally speaking, a node tries to collect as many “verified messages” as possible, but it cannot wait for messages that might never arrive (i.e., messages tampered with by faulty nodes).

3. Useful Terminology

Recall that directed graph $G=(V,E)$ represents the network connecting the $n$ nodes in the system. Thus, $n=|V|$ . We will sometimes use the notation $V(G)$ to represent the set of nodes in graph $G$ . In the following, we will use the terms edge and link interchangeably. We now introduce some graph terminology to facilitate the discussion.

• A path is represented by an ordered list of vertices. In particular, $p=\langle v_{1},\ldots,v_{k}\rangle$ is a directed path $p$ comprising nodes $v_{1},\ldots,v_{k}\in V$ and directed edges $(v_{i},v_{i+1})\in E$ , where $1\leq i\leq k-1$ .
• $\mathbf{init(p)}$ and $\mathbf{ter(p)}$ will be used to denote the initial node $v_{1}$ and terminal node $v_{k}$ of a path $p=\langle v_{1},\ldots,v_{k}\rangle$ .
• A $\mathbf{(v_{1},v_{k})}$ -path is a path with $init(p)=v_{1}$ and $ter(p)=v_{k}$ .
• Operation $p||u=\langle v_{1},\ldots,v_{k},u\rangle$ denotes the concatenation of path $p=\langle v_{1},\ldots,v_{k}\rangle$ with node $u$ , assuming that $(v_{k},u)\in E$ . Analogously, if $ter(p)=init(p^{\prime})$ , then $p||p^{\prime}$ denotes the concatenation of paths $p$ and $p^{\prime}$ .
• Redundant path: a path $p$ is a redundant path if $p=p_{1}||p_{2}$ for some simple paths $p_{1}$ and $p_{2}$ (i.e., $p_{1}$ and $p_{2}$ have no cycles), where one of $p_{1},p_{2}$ may be empty. Note that a redundant path may contain cycles and its length is upper bounded by $2n$ .
• The set of all redundant paths in graph $G_{Y}$ (defined above) will be denoted as $\mathcal{P}^{r}_{Y}$ .
• Fully nonfaulty path: a path consisting entirely of nonfaulty nodes.
• $\mathbf{(A,v)}$ -paths: given a set $A\subseteq V$ and a node $v\in V$ , an $(A,v)$ -path $p$ is a path with $init(p)\in A$ and $ter(p)=v$ .
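The notion of a redundant path can be made concrete with a short membership test: a node sequence is redundant if it is a directed path and can be split at some node into two simple pieces that share that node. The sketch below is one possible reading of the definition, with hypothetical helper names; paths are tuples of nodes and the graph is a dictionary mapping each node to its set of out-neighbors.

```python
def is_path(graph, p):
    """True if consecutive nodes of p are joined by directed edges of the graph."""
    return all(b in graph[a] for a, b in zip(p, p[1:]))


def is_simple(p):
    """True if no node repeats in p, i.e., p contains no cycle."""
    return len(set(p)) == len(p)


def is_redundant(graph, p):
    """True if p = p1 || p2 for simple paths p1, p2 (either of which may be empty)."""
    if not p or not is_path(graph, p):
        return False
    if is_simple(p):
        # Covers the case where one of the two pieces is empty.
        return True
    # Try every split node: p1 ends at p[i] and p2 starts at p[i].
    return any(is_simple(p[: i + 1]) and is_simple(p[i:]) for i in range(len(p)))
```

For instance, in a graph containing the cycle $1\to 2\to 3\to 1$, the sequence $\langle 1,2,3,1,2\rangle$ is redundant (split at node 3), whereas $\langle 1,2,3,1,2,3,1\rangle$ is not, since it cannot be written as two simple pieces.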

When convenient, we will interpret a path $p$ as the set of nodes in the path. The next few definitions use this interpretation for a node set $C$ and paths $p=\langle v_{1},\ldots,v_{k}\rangle$ , $p^{\prime}=\langle v^{\prime}_{1},\ldots,v^{\prime}_{k}\rangle$ .

• $C\cap p$ will denote the intersection $C\cap\{v_{1},\ldots,v_{k}\}$ .
• We will say that $p\subseteq C$ if $\{v_{1},\ldots,v_{k}\}\subseteq C$ .
• By $p\cap p^{\prime}$ , we will denote the node intersection $\{v_{1},\ldots,v_{k}\}\cap\{v^{\prime}_{1},\ldots,v^{\prime}_{k}\}$ of paths $p$ and $p^{\prime}$ .

Definition 1 ( $f$ -cover of a path set).

For a set of paths $P$ , a node set $C$ is an $f$ -cover of $P$ if $|C|\leq f$ and $\forall p\in P,\ C\cap p\neq\emptyset$ .
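Deciding whether an $f$-cover exists is a small hitting-set search, which can be written directly from the definition when $f$ and the path set are small. The following sketch is a brute-force illustration; the function name `find_f_cover` and the optional `candidates` argument (which restricts where cover nodes may be drawn from) are assumptions made here for illustration.

```python
from itertools import combinations


def find_f_cover(paths, f, candidates=None):
    """Return some node set C with |C| <= f that intersects every path, or None.

    paths      -- iterable of paths, each a tuple of nodes
    f          -- maximum size of the cover
    candidates -- optional set of nodes the cover may use (default: any node)
    """
    paths = [set(p) for p in paths]
    on_paths = set().union(*paths)
    # Nodes that lie on no path never help, so only nodes on paths are tried.
    pool = sorted(on_paths if candidates is None else on_paths & set(candidates))
    for size in range(f + 1):
        for C in combinations(pool, size):
            if all(p & set(C) for p in paths):
                return set(C)
    return None
```

Under this reading, an empty path set is covered by the empty set, since the universally quantified condition holds vacuously. Intuitively, if no $f$-cover exists for the paths that delivered a value, then at most $f$ faulty nodes cannot have tampered with all of them.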

Definition 2 (Reduced Graph).

For graph $G=(V,E)$ , and sets $F_{1},\,F_{2}\subseteq V$ , such that $|F_{1}|,|F_{2}|\leq f$ , reduced graph $G_{F_{1},F_{2}}=(V,E_{F_{1},F_{2}})$ has set of vertices $V$ , and the set of edges $E_{F_{1},F_{2}}$ is obtained by removing from $E$ all the outgoing links at each node in $F_{1}\cup F_{2}$ . That is,

$$E_{F_{1},F_{2}}=E\setminus\left\{(u,v)\ |\ u\in F_{1}\cup F_{2},\ v\in V,\ v\neq u\right\}.$$

Definition 3 (Source Component).

For graph $G=(V,E)$ , and sets $F_{1},\,F_{2}\subseteq V$ , such that $|F_{1}|,|F_{2}|\leq f$ , source component $S_{F_{1},F_{2}}$ is defined as the set of those nodes in the reduced graph $G_{F_{1},F_{2}}=(V,E_{F_{1},F_{2}})$ that have directed paths to all the nodes in $V$ .

By definition, the nodes in $S_{F_{1},F_{2}}$ form a strongly connected component in $G_{F_{1},F_{2}}$ . The source component $S_{F_{1},F_{2}}$ has other desirable properties that will be introduced when we prove the correctness of our algorithm later.
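Both constructions translate directly into code: the reduced graph drops the outgoing edges of $F_{1}\cup F_{2}$, and the source component collects the nodes that still reach every node. The sketch below is illustrative, with hypothetical function names, and again assumes a dictionary mapping each node to its set of out-neighbors.

```python
def reduced_graph(graph, F1, F2):
    """Copy of the graph in which every node of F1 or F2 loses its outgoing edges."""
    removed = set(F1) | set(F2)
    return {u: (set() if u in removed else set(outs)) for u, outs in graph.items()}


def reachable_from(graph, s):
    """Set of nodes reachable from s by directed paths (s included)."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for w in graph[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def source_component(graph, F1, F2):
    """Nodes of the reduced graph that have directed paths to all nodes in V."""
    g = reduced_graph(graph, F1, F2)
    everyone = set(g)
    return {s for s in g if reachable_from(g, s) == everyone}
```

In a 4-node clique on nodes 0 to 3 with $F_{1}=\{0\}$ and $F_{2}=\{1\}$, the source component is $\{2,3\}$: nodes 0 and 1 keep their incoming edges but can no longer reach anyone else.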

4. Asynchronous approximate consensus in directed networks

We next present an algorithm for approximate Byzantine consensus in asynchronous directed networks. The algorithm is optimal in terms of resilience, meaning that it matches the impossibility condition of the problem, i.e., the algorithm works in any graph that satisfies 3-reach. Our solution is inspired by the asynchronous approximate consensus algorithm of (Abraham et al., 2004) as explained in Section 2. However, the tools used in the algorithm of (Abraham et al., 2004) prove highly non-trivial to generalize in the case of a partially-connected directed network. This is due to the constraint of directed edges. In complete graphs considered in (Abraham et al., 2004), information can flow both directions, and each node can use the same rule to collect information. In the case of directed networks, information may only be able to flow in one direction. We need to devise new tools for nodes to exchange and filter values so that enough common information is shared between any pair of nonfaulty nodes.

Outline of the algorithm

In our algorithm, each node $v$ maintains a state value $x_{v}[r]\in\mathbb{R},\text{ for }r\in\mathbb{N}$ , which is updated regularly, with $x_{v}[0]$ denoting the real-valued input of node $v$ . Value $x_{v}[r]$ represents the $r$ -th update of the state value of node $v$ ; we will also refer to it as the state value of $v$ in (asynchronous) round $r$ . Observe that in asynchronous systems, $v$ updates the value every time it receives enough messages of a certain type (i.e., an event-driven algorithm), thus creating the sequence $\left(x_{v}[r]\right)_{r\in\mathbb{N}}$ . The $r$ -th value update of a node $v$ may happen at a different real time than the respective update of another node $u$ .

The proposed algorithm is structured in two parts presented in Algorithm 1: Byzantine Witness (BW) and Algorithm 3: Filter-and-Average (FA). Algorithm BW intuitively guarantees that all nonfaulty nodes will gather enough common information in any given (asynchronous) round. The value update in each round is described in Algorithm FA, where we propose an appropriate way for a node to filter values received in Algorithm BW and obtain the state value for next round as an average of the filtered values. Each node needs to filter potentially faulty values to guarantee validity.

4.1. Algorithm Preliminaries

The proposed algorithm utilizes the propagation of values through certain redundant paths (defined in Section 3). We next describe tools for handling information received by a node through different paths.

Messages and Message Sets.

In our algorithm, the messages propagated are of the form $(x,p)$ where $x$ is the propagated value and $p$ corresponds to the (redundant) path through which the value $x$ is propagated, i.e., its propagation path. For a message $m=(x,p)$ , we will use the notation $value(m)=x$ and $path(m)=p$ to denote the propagated value and propagation path, respectively. For simplicity, we will also say that $v$ receives value $x$ from $u$ whenever node $v$ receives $x$ through some path $p$ initiating at node $u$ . A message set $\mathcal{M}$ is a set of messages of the form $m=(x,p)$ where $x$ is the value reported through propagation path $p$ . Given $\mathcal{M}$ , we will use $\mathcal{P}(\mathcal{M})$ to denote the set of all propagation paths in $\mathcal{M}$ , i.e.,

$$\mathcal{P}(\mathcal{M})=\{p\ :\ (x,p)\in\mathcal{M}\}.$$

As defined below, given a node set $A$ and a message set $\mathcal{M}$ , the exclusion of $\mathcal{M}$ on $A$ consists of the messages of $\mathcal{M}$ that are propagated on paths that do not include any node in $A$ .

Definition 1 (Exclusion of message set).

Given a message set $\mathcal{M}$ and $A\subseteq V$ , the exclusion of $\mathcal{M}$ on $A$ is the set

$$\mathcal{M}|_{A}=\{(x,p)\in\mathcal{M}\ :\ p\cap A=\emptyset\}.$$

The notions of a consistent message set and full message set, presented below, are used to facilitate fault detection. A message set $\mathcal{M}$ is consistent if all value-path tuples initiating at the same initial node report the same value.

Definition 2 (Consistent message set).

A message set $\mathcal{M}$ is consistent if

$$\text{for any two value-path pairs }(x,p),(x^{\prime},p^{\prime})\in\mathcal{M},\quad init(p)=init(p^{\prime})\ \Rightarrow\ x=x^{\prime}.$$

Given a consistent message set $\mathcal{M}$ , if $(x,p)\in\mathcal{M}$ and $init(p)=v$ , then we define $value_{v}(\mathcal{M})=x$ . That is, for a node $v$ that appears as an initial node of a path in $\mathcal{P}(\mathcal{M})$ , $value_{v}(\mathcal{M})$ denotes the unique value corresponding to $v$ . Note that the value is guaranteed to be unique owing to the definition of a consistent message set.
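Exclusion and consistency combine naturally: a node that "guesses" a fault set $A$ discards every message whose path touches $A$ and then checks whether the remaining messages agree per initial node. The sketch below illustrates this with messages modeled as `(value, path)` pairs, where a path is a tuple of nodes; the function names are hypothetical.

```python
def exclusion(M, A):
    """Messages of M whose propagation path contains no node of A."""
    A = set(A)
    return {(x, p) for (x, p) in M if not A & set(p)}


def is_consistent(M):
    """True if all messages with the same initial node report the same value."""
    first_seen = {}
    for x, p in M:
        # setdefault records the first value seen for init(p) and returns it afterwards.
        if first_seen.setdefault(p[0], x) != x:
            return False
    return True


def value_of(M, v):
    """value_v(M): the unique value reported for initial node v in a consistent M."""
    values = {x for x, p in M if p[0] == v}
    if len(values) != 1:
        raise ValueError("v is not an initial node of M, or M is inconsistent for v")
    return values.pop()
```

As an illustration, suppose node 3 receives value 5.0 from node 1 along $\langle 1,3\rangle$ and $\langle 1,2,3\rangle$ but 7.0 along $\langle 1,4,3\rangle$. The full set is inconsistent, yet its exclusion on $\{4\}$ is consistent with `value_of(..., 1) == 5.0`, which is compatible with node 4 being the faulty one.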

We say that the received message set $\mathcal{M}$ is a full message set for $(A,v)$ , whenever a node $v$ receives messages from all possible incoming redundant paths excluding node set $A$ . The formal definition follows.

Definition 3 (Full message set).

Given $A\subseteq V$ and $v\in V\setminus A$ , a message set $\mathcal{M}$ is full for $(A,v)$ if

$$\{p\in\mathcal{P}^{r}_{\overline{A}}:ter(p)=v\}\subseteq\mathcal{P}(\mathcal{M}).$$
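Fullness can be tested on toy graphs by enumerating every redundant path that ends at $v$ and avoids $A$, and then checking that each one appears in $\mathcal{P}(\mathcal{M})$. The sketch below is exponential and meant only as an illustration; it assumes that the single-node path $\langle v\rangle$ counts as a path, and all function names are hypothetical.

```python
def simple_paths_ending_at(graph, v, banned):
    """All simple paths (tuples) that end at v and avoid the nodes in `banned`."""
    preds = {u: {w for w in graph if w not in banned and u in graph[w]} for u in graph}
    found, stack = [], [(v,)]
    while stack:
        p = stack.pop()
        found.append(p)
        for u in preds[p[0]]:
            if u not in p:
                # Extend the path backwards by one predecessor.
                stack.append((u,) + p)
    return found


def redundant_paths_ending_at(graph, v, A):
    """All paths p1 || p2 ending at v, with p1 and p2 simple and disjoint from A."""
    A = set(A)
    result = set()
    for p2 in simple_paths_ending_at(graph, v, A):
        for p1 in simple_paths_ending_at(graph, p2[0], A):
            # p1 ends where p2 starts, so the shared node is written once.
            result.add(p1 + p2[1:])
    return result


def is_full(M, graph, A, v):
    """True if M contains a message for every redundant path to v that avoids A."""
    return redundant_paths_ending_at(graph, v, A) <= {p for _, p in M}
```

In the graph with edges $1\to 2$, $2\to 1$ and $2\to 3$, the redundant paths ending at node 3 are $\langle 3\rangle$, $\langle 2,3\rangle$, $\langle 1,2,3\rangle$ and $\langle 2,1,2,3\rangle$; excluding $A=\{1\}$ leaves only the first two.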

4.2. Algorithm Byzantine Witness (BW)

The Byzantine Witness (BW) algorithm, presented as Algorithm 1, intuitively guarantees that all nonfaulty nodes will receive enough common state values in a specific asynchronous round of the algorithm; eventually, this common information guarantees convergence of local state values at all nonfaulty nodes. The algorithm is event-driven. That is, whenever a new message is received, each node checks whether a certain set of conditions is satisfied, and whether it should take an action. (In particular, the four upon clauses of Algorithm 1, namely message receipt, the Maximal-Consistency condition, the FIFO-Receive-All condition, and $Verify$ , are re-evaluated upon receipt of a new message.)

Parallel executions.

For the sake of succinctness, we present a parallel version of Algorithm BW; the algorithm makes use of parallel executions (threads), one for each potential fault set $F$ . Note that there is an exponential number of threads. In the parallel thread for set $F$ , a node “guesses” that the actual fault set of this execution is $F$ , and checks for inconsistencies to reject this guess. Observe that in the $Verify$ clause of Algorithm BW, the lock-protected check of the shared boolean variable $nextround$ guarantees that a node will proceed to the next round during at most one parallel execution; we will later prove that there always exists such a parallel execution that proceeds to the next round. For each round $r$ , during this unique parallel execution, the node will execute Algorithm Filter-and-Average (FA), presented as Algorithm 3, through which the value is updated.

Suppose that a node’s parallel thread for set $F^{\prime}$ proceeds to the next round. It is possible that $F^{\prime}$ is not the actual faulty set. Our algorithm is designed in a way that even if the guess is incorrect, the node is still able to collect enough common values and make progress. Moreover, the parallel thread for set $F$ , where $F$ is the actual fault set, is guaranteed to make progress at every nonfaulty node.

Atomicity

Algorithm BW uses the shared variables $\mathcal{M}_{v}$ and $nextround$ ; $\mathcal{M}_{v}$ includes all values received by node $v$ and is updated whenever $v$ receives a new flooded value, while $nextround$ indicates whether a parallel thread has proceeded to the next (asynchronous) round. For certain parts of the algorithm we need access to shared variables to be atomic, i.e., reads and writes to shared variables can be performed by only one parallel thread at a time. To make this explicit, we use the functions lock() and unlock(), which delimit the parts of the code performed in an atomic fashion.
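The role of the lock around $nextround$ can be illustrated with ordinary threads: several threads, one per guessed fault set, race to advance the round, and the lock ensures that exactly one of them performs the update. This is only a sketch of the synchronization pattern, not of Algorithm BW itself; the class `RoundState`, its methods, and the helper `run_threads` are hypothetical names introduced here.

```python
import threading


class RoundState:
    """Shared per-round state: a lock and the nextround flag."""

    def __init__(self):
        self.lock = threading.Lock()
        self.nextround = False

    def try_advance(self, guess, update):
        """Run update(guess) only if no other thread has advanced the round yet."""
        with self.lock:  # plays the role of lock() ... unlock()
            if self.nextround:
                return False
            update(guess)  # stands in for executing Filter-and-Average
            self.nextround = True
            return True


def run_threads(num_guesses):
    """Start one thread per guessed fault set; return the guesses that performed the update."""
    state = RoundState()
    executed = []
    threads = [
        threading.Thread(target=state.try_advance, args=(g, executed.append))
        for g in range(num_guesses)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed
```

Whichever thread acquires the lock first performs the update; every later thread observes `nextround == True` and returns without updating, so `run_threads(8)` always yields a list with exactly one element, although which guess wins depends on scheduling.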

We next describe a flooding subroutine used to propagate state values throughout redundant paths in the network.

RedundantFlood (Redundant Flood) algorithm

At the beginning of each asynchronous round, in Algorithm 1, all nodes flood their values throughout the network. The difference between the RedundantFlood algorithm and standard flooding is that each flooded message is propagated through every redundant path in the network, not just through simple paths. The details of the algorithm are deferred to Appendix E.

FIFO flooding of messages

During the execution of the algorithm, we will employ the FIFO Flood and FIFO Receive procedures, which ensure that the order of messages sent from a sender is preserved during reception of these messages by any receiver, when the propagation path consists entirely of nonfaulty nodes. For the correctness of our algorithm, we only need to FIFO-flood messages through simple paths in the network. (Footnote 3: It is possible to use RedundantFlood everywhere. For efficiency, we only use RedundantFlood to propagate state values at the very beginning of each round.) Thus, a node will propagate a message during FIFO-Flood only if the resulting propagation path does not contain any cycle. We present a high-level description of the FIFO Flood and FIFO Receive procedures in Appendix F.

Algorithm BW: Pseudo-code

Algorithm BW is presented in Algorithm 1. We stress that Algorithm BW is executed for each asynchronous round $r$ . Thus, all messages sent in round $r$ will be tagged with the corresponding round identifier $r$ . For simplicity, we omit round numbers in the presentation and the analysis of the algorithm. The properties of Algorithm BW are proved to hold for any specific asynchronous round $r$ . For brevity, the pseudo-code does not include the termination condition. We defer the discussion on termination to Section 4.6.

Input: State value $x_{v}$ of node $v$ for round $r$
$\triangleright$ Round id $r$ , included in all sent messages, is omitted for simplicity

Code for $v\in V$ :

Initialization
    $\mathcal{M}_{v}\leftarrow\emptyset$    $\triangleright$ shared variable accessed by all parallel threads
    $nextround\leftarrow false$    $\triangleright$ shared variable accessed by all parallel threads
    $FIFORec(F)=false$ , for each $F\subseteq V$ with $|F|\leq f$

RedundantFlood value $x_{v}$

for each $F_{v}\subseteq V\setminus\{v\}$ , with $|F_{v}|\leq f$ do in parallel
    $\triangleright$ only one parallel thread for some $F_{v}$ executes Filter-and-Average, due to the $nextround$ check below

    upon receipt of message $m=(x,p)$ do
        lock()
        $\mathcal{M}_{v}\leftarrow\mathcal{M}_{v}\cup\{m\}$    $\triangleright$ Atomic update of $\mathcal{M}_{v}$
        unlock()

    $\triangleright$ Maximal-Consistency Condition
    upon ( $\mathcal{M}_{v}|_{F_{v}}$ is consistent and full for $(F_{v},v)$ for the first time) do
        FIFO-Flood $(\mathcal{M}_{v}|_{F_{v}},COMPLETE(F_{v}))$

    $\triangleright$ FIFO-Receive-All Condition for $F_{v}$
    upon ( For all $c\in reach_{v}(F_{v})$ , $v$ FIFO-Receives the same message $(\mathcal{M}^{c},COMPLETE(F_{v}))$ from all simple $(c,v)$ -paths $p\subseteq reach_{v}(F_{v})$ ) do
        $FIFORec(F_{v})\leftarrow true$

    upon $Verify(\mathcal{M}_{v},FIFORec(F_{v}))$ do
        $\triangleright$ For verification of a received value, $v$ will wait to receive the same value from enough paths as implied by Algorithm 2.
        lock()
        if $nextround$ = false then
            Execute Algorithm Filter-and-Average $(\mathcal{M}_{v})$    $\triangleright$ Execution of Algorithm 3
            $nextround\leftarrow true$
        unlock()

Function $Verify(\mathcal{M}_{v},FIFORec(F_{v}))$ :
    $validity\leftarrow false$
    if $FIFORec(F_{v})=true$ then
        $validity\leftarrow true$
        for each $(\mathcal{M}^{c},COMPLETE(F_{u}))$ FIFO-received through a simple $(c,v)$ -path $p\subseteq reach_{v}(F_{v})$ , with consistent $\mathcal{M}^{c}$ do
            $validity\leftarrow validity\wedge Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$    $\triangleright$ Completeness: Algorithm 2
    return $validity$

Algorithm 1 BW (for node $v$ and round $r$ )

Algorithm 2 Function $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$

Input: Message sets $\mathcal{M}_{v},\mathcal{M}^{c}$, and set $F_{u}\subseteq V$

Initialization:
  $output\leftarrow true$

for each $F_{w}\subseteq V$ with $F_{w}\neq F_{u}$ and $|F_{w}|\leq f$ do
  for each $q\in S_{F_{u},F_{w}}$ do
    $\mathcal{M}^{\prime}\leftarrow\{(value_{q}(\mathcal{M}^{c}),p)\in\mathcal{M}_{v}:init(p)=q\}$ $\triangleright$ All received messages from $q$ which are consistent with $\mathcal{M}^{c}$
    $output\leftarrow output\wedge(\nexists\text{ an $f$-cover }H\subseteq V\setminus S_{F_{u},F_{w}}\text{ of }\mathcal{P}(\mathcal{M}^{\prime}))$
return $output$

We first remind the reader that $S_{F_{1},F_{2}}$ denotes the source component of reduced graph $G_{F_{1},F_{2}}$ as defined in Definitions 2, 3. Observe that due to the call to function $Verify(\mathcal{M}_{v},FIFORec(F_{v}))$ in Algorithm BW, a node essentially waits to receive additional messages beyond the ones it received upon considering possible faulty set $F_{v}$ (during the parallel execution for $F_{v}$) before it proceeds to update its value through Algorithm Filter-and-Average. Intuitively, for some received message $(\mathcal{M}^{c},COMPLETE(F_{u}))$, $v$ waits for the confirmation of the values in $\mathcal{M}^{c}$ through enough redundant paths from a source component. We will later prove that if message $(\mathcal{M}^{c},COMPLETE(F_{u}))$ is not faulty (i.e., not tampered with) then $v$ will eventually be able to “confirm” the values in $\mathcal{M}^{c}$. For the sake of simplicity, whenever the function $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ at node $v$ is true for some given $\mathcal{M}^{c},F_{u}$, we will simply state that condition $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ is satisfied.
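The covering test at the heart of the Completeness function can be sketched in a few lines of Python. This is only an illustrative brute-force sketch, not the authors' implementation: the names `has_f_cover` and `completeness`, the representation of a message as a `(value, path)` pair with `path[0]` as the initial node, and the choice to exclude the receiving node from cover candidates are all assumptions made here for illustration.

```python
from itertools import combinations

def has_f_cover(paths, f, candidates):
    """True if some set H of at most f candidate nodes intersects every path."""
    path_sets = [set(p) for p in paths]
    if not path_sets:
        return True  # the empty set vacuously covers an empty path set
    pool = list(set(candidates))
    for size in range(1, f + 1):
        for H in combinations(pool, size):
            if all(ps & set(H) for ps in path_sets):
                return True
    return False

def completeness(M_v, claimed, source_components, f, all_nodes, receiver):
    """Hypothetical rendering of Algorithm 2.

    M_v: iterable of (value, path) pairs received by the node.
    claimed: dict mapping a node q to the value reported for q in M^c.
    source_components: one node set S_{F_u,F_w} per candidate set F_w.
    """
    for S in source_components:
        # Assumption: covers come from outside S and never contain the receiver.
        outside = set(all_nodes) - set(S) - {receiver}
        for q in S:
            matching = [p for (x, p) in M_v
                        if p[0] == q and x == claimed.get(q)]
            if has_f_cover(matching, f, outside):
                return False  # the value of q is not yet confirmed
    return True
```

The enumeration is exponential in $f$, which matches the exhaustive flavor of the pseudo-code; it is meant to make the condition concrete, not to be efficient.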

4.3. Properties of Algorithm BW

In the following, we introduce some notions necessary for the analysis of Algorithm BW. We first borrow the notion of propagation from (Tseng and Vaidya, 2012, 2015).

Definition 4 (Propagation between sets).

Given sets $A,B,C\subseteq V$ with $A\cap B=\emptyset$ , $B\subseteq C$ , set $A$ is said to propagate in $C$ to set $B$ if either (i) $B=\emptyset$ , or (ii) for each node $b\in B$ , there exist at least $f+1$ node-disjoint $(A,b)$ -paths in the node subgraph of $G$ induced by node set $C$ , i.e., $G_{C}$ . We will denote the fact that set $A$ propagates in $C$ to set $B$ by $A\stackrel{{\scriptstyle C}}{{\rightsquigarrow}}{B}$ .
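The propagation relation can be checked mechanically with a unit-capacity maximum flow on a node-split graph. The sketch below is one possible way to do so, under the assumption that disjoint $(A,b)$-paths share no node other than $b$ (so each path starts at a different node of $A$); the function names and the edge-set representation of $G$ are hypothetical.

```python
from collections import defaultdict

def count_disjoint_paths(edges, A, b, C):
    """Max number of (A, b)-paths inside G_C sharing no node except b."""
    C = set(C)
    cap = defaultdict(int)
    adj = defaultdict(set)

    def add_arc(u, w):
        cap[(u, w)] += 1
        adj[u].add(w)
        adj[w].add(u)  # reverse arc used by the residual graph

    for x in C:
        add_arc((x, "in"), (x, "out"))  # each node carries at most one path
    for (u, w) in set(edges):
        if u in C and w in C:
            add_arc((u, "out"), (w, "in"))
    source, sink = "source", (b, "in")
    for a in set(A) & C:
        add_arc(source, (a, "in"))

    def augment(u, seen):
        if u == sink:
            return True
        seen.add(u)
        for w in adj[u]:
            if w not in seen and cap[(u, w)] > 0 and augment(w, seen):
                cap[(u, w)] -= 1
                cap[(w, u)] += 1
                return True
        return False

    flow = 0
    while augment(source, set()):
        flow += 1
    return flow

def propagates(edges, A, B, C, f):
    """A propagates in C to B (trivially true when B is empty)."""
    return all(count_disjoint_paths(edges, A, b, C) >= f + 1 for b in B)
```

For example, with edges $a_1\to b$, $a_2\to b$ and $f=1$, the set $\{a_1,a_2\}$ propagates to $\{b\}$ in $\{a_1,a_2,b\}$ but not in $\{a_1,b\}$.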

Note that the $f+1$ disjoint paths implied in Definition 4 are entirely contained in $C$. Next, observe that by Definition 3, for any sets $F_{1},F_{2}\subseteq V$ it holds that $S_{F_{1},F_{2}}=S_{F_{2},F_{1}}$. The following theorem is repeatedly used in our analysis; its proof is based on Corollary 2 in (Tseng and Vaidya, 2012) and the equivalence of the 3-reach condition with the condition in (Tseng and Vaidya, 2012) (proof in Appendix A). Intuitively, Theorem 5 below states that if 3-reach is satisfied then there are at least $f+1$ disjoint paths, excluding nodes in $F_{1}$, that connect a source component $S_{F_{1},F_{2}}$ with any node outside the source component.

Theorem 5.

Suppose that graph $G=(V,E)$ satisfies condition 3-reach. Then, for any $F_{1}\subseteq V$ and $F_{2}\subseteq\overline{F_{1}}$ , such that $|F_{1}|,|F_{2}|\leq f$ , $S_{F_{1},F_{2}}\stackrel{{\scriptstyle\overline{F_{1}}}}{{\rightsquigarrow}}{\overline{F_{1}\cup S_{F_{1},F_{2}}}}$ and $S_{F_{1},F_{2}}\stackrel{{\scriptstyle\overline{F_{2}}}}{{\rightsquigarrow}}{\overline{F_{2}\cup S_{F_{1},F_{2}}}}$ hold.

Using the notions above, we will next show some important properties of Algorithm BW. As defined in line 1, we will say that node $v$ satisfies the Maximal-Consistency Condition for node set $F^{\prime}$ if it receives the message set $\mathcal{M}_{v}$ and $\mathcal{M}_{v}|_{F^{\prime}}$ is consistent and full for $(F^{\prime},v)$ .
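As a rough illustration of the consistency half of the Maximal-Consistency Condition, the following sketch assumes that $\mathcal{M}|_{F}$ keeps exactly the messages whose propagation path avoids $F$, and that a message set is consistent when all messages with the same initial node carry the same value. Both readings rely on definitions given earlier in the paper and are stated here as assumptions; the fullness check is omitted because it requires enumerating the relevant paths of $G_{\overline{F}}$.

```python
def restrict(M, F):
    """Assumed reading of M|_F: drop messages whose path touches F."""
    F = set(F)
    return {(x, p) for (x, p) in M if not F & set(p)}

def is_consistent(M):
    """True if every initial node is associated with a single value."""
    value_of = {}
    for (x, p) in M:
        # setdefault records the first value seen for init(p) = p[0]
        if value_of.setdefault(p[0], x) != x:
            return False
    return True
```

With messages `(1, ("a", "v"))` and `(2, ("a", "b", "v"))`, the full set is inconsistent, while its restriction to paths avoiding `{"b"}` is consistent.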

Lemma 6.

For any nonfaulty node $v$ , the Maximal-Consistency condition will eventually be satisfied during a parallel execution for some set $F^{\prime}$ .

Proof.

Consider $v$’s parallel execution for set $F^{\prime}=F$, where $F$ is the actual faulty set of the execution. If the Maximal-Consistency condition has not been satisfied already, it will eventually be satisfied during the parallel execution for $F^{\prime}=F$. This will happen since every node in $G_{\overline{F}}$ behaves correctly and thus $v$ will eventually receive consistent values from all incoming paths in $G_{\overline{F}}$, i.e., $\mathcal{M}_{v}|_{F}$ will be consistent and full for $(F,v)$. ∎

Lemma 7.

Consider two nonfaulty nodes $v,u$ that satisfy the Maximal-Consistency condition on the same set $F^{\prime}$ . Let the message sets $\mathcal{M}_{v}|_{F^{\prime}}$ and $\mathcal{M}_{u}|_{F^{\prime}}$ be the sets that are used to pass Maximal-Consistency condition at $v$ and $u$ , respectively. Then, the two sets contain the same unique value $\displaystyle x_{w},\forall w\in\bigcup_{\mathclap{\begin{subarray}{c}F^{\prime\prime}\neq F^{\prime}\\ F^{\prime\prime}\subseteq V,|F^{\prime\prime}|\leq f\end{subarray}}}S_{F^{\prime},F^{\prime\prime}}$ .

Proof.

We first prove that for each $w\in S_{F^{\prime},F^{\prime\prime}}$ , both nodes $v$ and $u$ will receive a unique value corresponding to $w$ , contained in the respective sets $\mathcal{M}_{v}|_{F^{\prime}}$ and $\mathcal{M}_{u}|_{F^{\prime}}$ . Observe that for any $F^{\prime\prime}\neq F^{\prime}$ and $w\in S_{F^{\prime},F^{\prime\prime}}$ , by Theorem 5 and the fact that any source component $S_{F^{\prime},F^{\prime\prime}}$ is strongly connected, there exists a simple $(w,v)$ -path in $G_{\overline{F^{\prime}}}$ . Since $\mathcal{M}_{v}|_{F^{\prime}}$ is full for $(F^{\prime},v)$ , $\mathcal{M}_{v}|_{F^{\prime}}$ will contain some value $x_{w}$ corresponding to $w$ . Note that this value might not be the value sent by $w$ , since the above simple path might contain some faulty node. Next, recall that we also require $\mathcal{M}_{v}|_{F^{\prime}}$ to be consistent. Therefore, the previously mentioned $x_{w}$ value contained in $\mathcal{M}_{v}|_{F^{\prime}}$ must be unique . The same argument applies to $\mathcal{M}_{u}|_{F^{\prime}}$ , too.

The 3-reach condition implies the existence of a node $q\in reach_{v}(F\cup F^{\prime})\cap reach_{u}(F\cup F^{\prime})$ for the actual faulty set $F$ . By definition, $q$ is nonfaulty and is connected to both $v,u$ through fully nonfaulty simple paths $p_{q,v}$ and $p_{q,u}$ respectively. By Theorem 5, either $q\in S_{F^{\prime},F^{\prime\prime}}$ or there exist $f+1$ simple disjoint $(S_{F^{\prime},F^{\prime\prime}},q)$ -paths in $G_{\overline{F^{\prime}}}$ . In both cases, there exists a simple $(w,q)$ -path $p_{w,q}$ in $G_{\overline{F^{\prime}}}$ . Note that there might be some faulty nodes in $p_{w,q}$ , since $F^{\prime}$ might not be the actual faulty node set.

This observation implies that in $G_{\overline{F^{\prime}}}$, there exist a redundant $(w,v)$-path $p_{w,v}=p_{w,q}||p_{q,v}$ and a redundant $(w,u)$-path $p_{w,u}=p_{w,q}||p_{q,u}$ such that the first part $p_{w,q}$ is identical in both paths. Note that the $3$-reach condition only implies that $p_{q,v}$ and $p_{q,u}$ are fully nonfaulty. Hence, it is possible that the value sent by node $w$ is $x^{\prime}_{w}$, but the message(s) propagated through $p_{w,v}$ and $p_{w,u}$ are different. Since $\mathcal{M}_{v}|_{F^{\prime}}$ and $\mathcal{M}_{u}|_{F^{\prime}}$ are full, nodes $v$ and $u$ will receive some value from paths $p_{w,v}$ and $p_{w,u}$, respectively. The values received by $v$ and $u$ must be identical. This is because (i) the two redundant paths have a common first part $p_{w,q}$; and (ii) $p_{q,v}$ and $p_{q,u}$ are fully nonfaulty by assumption. Let this value be $x_{w}$ (which may or may not be equal to $x^{\prime}_{w}$, the original value sent by $w$). Finally, since $\mathcal{M}_{v}|_{F^{\prime}}$ is consistent, all the other messages propagated through paths $p$ with $init(p)=w$ and $ter(p)=v$ must also carry $x_{w}$, the value forwarded by $q$. The same argument applies to $\mathcal{M}_{u}|_{F^{\prime}}$. Thus, for each $w$, there exists a common value $x_{w}$ in both $\mathcal{M}_{v}|_{F^{\prime}}$ and $\mathcal{M}_{u}|_{F^{\prime}}$.

∎

Next we prove the main lemma for the correctness of Algorithm BW. By the FIFO-Receive-All condition we refer to the condition stated in the event handler of line 1.

Lemma 8.

Consider a nonfaulty node $v$ such that the FIFO-Receive-All condition is satisfied at node $v$ for some parallel execution. If by the time the FIFO-Receive-All condition is satisfied, $v$ receives $(\mathcal{M}^{c},COMPLETE(F_{u}))$ from a fully nonfaulty path $p$ with $init(p)=c$, then $v$ will eventually receive a message set $\mathcal{M}_{v}$ such that the $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ condition will be satisfied at node $v$.

Proof.

First observe that since path $p$ is fully nonfaulty, $c$ is nonfaulty; also lines 1,1 of BW imply that $c\notin F_{u}$ since it propagates $(\mathcal{M}^{c},COMPLETE(F_{u}))$ . Consider any $F_{w}\neq F_{u}$ with $|F_{w}|\leq f$ and any $q\in S_{F_{u},F_{w}}$ . Let $F$ be the actual faulty set of the execution. Note that since nonfaulty $v$ receives $(\mathcal{M}^{c},COMPLETE(F_{u}))$ from a fully nonfaulty path with $init(p)=c$ , node $c$ must have FIFO-Flooded this message during the execution. By the 3-reach condition of Definition 2, we have the following.

(1)

\forall F^{\prime}\subseteq V\setminus S_{F_{u},F_{w}}\setminus\{v\},\text{ such that }|F^{\prime}|\leq f,\exists z\in reach_{v}(F\cup F^{\prime})\cap reach_{c}(F\cup F_{u})

This, for any $F^{\prime}\subseteq V\setminus S_{F_{u},F_{w}}$, implies the existence of a fully nonfaulty simple $(z,v)$-path $p_{z,v}$ and a fully nonfaulty simple $(z,c)$-path $p_{z,c}$ in graphs $G_{\overline{F\cup F^{\prime}}}$ and $G_{\overline{F\cup F_{u}}}$, respectively. [Footnote 4: Set $F^{\prime}$ is not to be confused with the set $F_{v}$ during the parallel execution of which $v$ satisfies the FIFO-Receive-All condition. Set $F_{v}$ is arbitrary in the proof. Note that $v$ might even receive values from paths intersecting with $F_{v}$ in order to satisfy the $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ condition during its parallel execution for $F_{v}$. This might occur if $F_{v}$ is a wrong “guess” of the actual fault set.] We consider the following two cases for $z$:

•

Case I: $z\in S_{F_{u},F_{w}}$ .

We first prove the following key claim.

Claim 1:

Both $v$ and $c$ receive an identical value $x_{q}$ from node $q$ .

Proof of Claim: Since $S_{F_{u},F_{w}}$ is strongly connected, there exists a simple $(q,z)$ -path $p_{q,z}$ in graph $G_{S_{F_{u},F_{w}}}$ . This implies the existence of the redundant $(q,c)$ -path $p_{q,c}=p_{q,z}||p_{z,c}$ in $G_{\overline{F_{u}}}$ . Recall that by assumption, $c$ is nonfaulty and FIFO-Floods $COMPLETE(F_{u})$ . This means that $c$ has received a unique value $x_{q}$ through all redundant $(q,c)$ -paths in $G_{\overline{F_{u}}}$ , particularly through path $p_{q,c}$ . Note that since $p_{q,z}$ might contain some faulty nodes, $x_{q}$ might be different from the value originally sent by node $q$ . Observe that there also exists the redundant $(q,v)$ -path $p_{q,v}=p_{q,z}||p_{z,v}$ in $G_{\overline{F^{\prime}}}$ which will eventually propagate the same value $x_{q}$ to $v$ . This is because $p_{z,v}$ is fully nonfaulty and the initial part $p_{q,z}$ is identical in both $p_{q,c}$ and $p_{q,v}$ .

Claim 2:

Node $v$ will eventually receive $x_{q}$ from a set of paths $P$ with no $f$ -cover $H\subseteq V\setminus S_{F_{u},F_{w}}$ .

Proof of Claim: Recall that Claim 1 holds for any $F^{\prime}\subseteq V\setminus S_{F_{u},F_{w}}\setminus\{v\}$, with $|F^{\prime}|\leq f$. Then $v$ will eventually receive $x_{q}$ from all redundant paths $p_{q,z}||p^{\prime}_{z,v}$ for any $p^{\prime}_{z,v}$ being a $(z,v)$-path with $p^{\prime}_{z,v}\cap F=\emptyset$. This is because all these $p^{\prime}_{z,v}$ paths are fully nonfaulty and the initial part $p_{q,z}$ propagates $x_{q}$ as implied by Claim 1. The set $P$ of all these $p^{\prime}_{z,v}$ paths does not have an $f$-cover $F^{\prime}\subseteq V\setminus S_{F_{u},F_{w}}$. If there was such an $f$-cover $F^{\prime}$, this would contradict Equation (1) because it would mean that no fully nonfaulty $(z,v)$-path would exist in $G_{\overline{F^{\prime}}}$. [Footnote 5: Observe that if $v\in S_{F_{u},F_{w}}$ and receives $x_{q}$ from a single path that entirely consists of nodes in $S_{F_{u},F_{w}}$, then no $f$-cover $H\subseteq V\setminus S_{F_{u},F_{w}}$ exists for this path. This is because $H\cap S_{F_{u},F_{w}}=\emptyset$ by definition.]
•

Case II: $z\notin S_{F_{u},F_{w}}$ .

Theorem 5 implies that there exist $f+1$ simple disjoint $(S_{F_{u},F_{w}},z)$-paths in $G_{\overline{F_{u}}}$. This, together with the observation that $S_{F_{u},F_{w}}$ is strongly connected, implies the existence of $f+1$ simple $(q,z)$-paths $p_{1},\ldots,p_{f+1}$ which trivially do not have an $f$-cover $H\subseteq V\setminus S_{F_{u},F_{w}}$. As in the previous case, since $c$ FIFO-Floods $COMPLETE(F_{u})$, it must have received the same value $x_{q}$ from all redundant paths $p_{1}||p_{z,c},\ldots,p_{f+1}||p_{z,c}$. Since $|F^{\prime}|\leq f$ and $p_{z,v}$ is fully nonfaulty, one of the paths $p_{1}||p_{z,v},\ldots,p_{f+1}||p_{z,v}$ will also eventually propagate value $x_{q}$ to $v$.

Using the same argument for Claim 2 in Case I, $v$ will eventually receive $x_{q}$ from a set of paths $P$ with no $f$ -cover $H\subseteq V\setminus S_{F_{u},F_{w}}$ .

In both cases, any such value $x_{q}$ received by $v$ will be consistent with values $\mathcal{M}^{c}$ propagated by $c$ , and thus $v$ will eventually satisfy the $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ condition.

∎

In the following, we will consider the case where a node $v$ executes Algorithm Filter-and-Average through line 1, during its parallel execution for set $F_{v}$ . In this case, $v$ has already satisfied the Maximal-Consistency condition corresponding to $F_{v}$ as well as the $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ conditions for all $COMPLETE(F_{u})$ messages it has received by the time it satisfied the FIFO-Receive-All condition. Intuitively, this means that $v$ has received redundant messages corresponding to “suspicious sets” $F_{v},F_{u}$ by the time it executes Filter-and-Average. Note that there exists only one such parallel execution during which Filter-and-Average is executed; this follows easily from the atomic updates of shared variable $\mathcal{M}_{v}$ and lines 1-1 of Algorithm BW. We will use the following notion in our proofs.
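The "only one parallel execution runs Filter-and-Average" guarantee is the familiar check-then-set pattern performed under a lock. A minimal Python sketch of that pattern is given below; the class name `RoundState` and the method `try_filter_and_average` are hypothetical, and the real algorithm would pass the actual value update instead of an arbitrary callable.

```python
import threading

class RoundState:
    """Per-round shared state for the parallel threads of one node."""

    def __init__(self):
        self._lock = threading.Lock()
        self.nextround = False
        self.executions = 0

    def try_filter_and_average(self, update):
        """Invoked by a thread once its Verify condition holds.

        Returns True only for the single thread that performs the update.
        """
        with self._lock:
            if self.nextround:
                return False  # another thread already updated the state
            update()
            self.executions += 1
            self.nextround = True
            return True
```

Because the flag is read and written while holding the same lock, two threads whose Verify conditions become true concurrently cannot both pass the check.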

Definition 9 (Informed node).

A node $v$ that executes Filter-and-Average during its parallel execution for a set $F_{v}$ is informed about set $F_{t}$ if $F_{v}=F_{t}$ , or $v$ has satisfied the $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{t})$ condition after receiving message $(\mathcal{M}^{c},COMPLETE(F_{t}))$ from a fully nonfaulty path $p$ with $init(p)=c$ .

Theorem 10.

Any nonfaulty node $v$ will eventually execute Filter-and-Average during a parallel execution for a set $F^{\prime}$ .

The proof of Theorem 10 relies on the observation that algorithm Filter-and-Average will be executed during parallel execution for actual fault set $F$ if not during any other parallel execution. The full proof is presented in Appendix D.

Theorem 11.

Consider any pair of nonfaulty nodes $v,u$ that execute Filter-and-Average during their parallel executions for sets $F_{v}$ and $F_{u}$, respectively. Then, both nodes $v$ and $u$ will be informed about a node set $F_{t}\in\{F_{v},F_{u}\}$, where $t\in\{v,u\}$, and will both receive a common value $x_{q}$ for each $\displaystyle q\in\bigcup_{\mathclap{\begin{subarray}{c}F_{w}\neq F_{t}\\ |F_{w}|\leq f\end{subarray}}}S_{F_{t},F_{w}}$. More specifically, each value $x_{q}$ will be the unique value corresponding to node $q$ that node $t$ received by the time it satisfied its Maximal-Consistency condition.

Proof.

Theorem 10 implies that sets $F_{v},F_{u}$ are well defined. Observe that if $F_{v}=F_{u}$, the theorem holds trivially due to Definition 9 and Lemma 7; this is because both nodes will trivially be informed about the same set, according to the first part of the definition of an informed node. Thus, we focus on the case where $F_{v}\neq F_{u}$. The FIFO-Receive-All condition for nodes $v$ and $u$ is satisfied in the parallel execution for $F_{v},F_{u}$ respectively by assumption. Let $F$ be the actual fault set. Due to the 3-reach condition, there exists a node $c\in reach_{v}(F\cup F_{v})\cap reach_{u}(F\cup F_{u})$ which is nonfaulty by definition of the reach set and is connected to $v,u$ through fully nonfaulty simple paths $p_{c,v}$ and $p_{c,u}$ respectively. Note that nodes $v,u$ will satisfy their FIFO-Receive-All conditions only if they receive messages of the form $(\mathcal{M}^{c}_{v},COMPLETE(F_{v}))$ and $(\mathcal{M}^{c}_{u},COMPLETE(F_{u}))$, respectively, from $c$ through the existing nonfaulty paths $p_{c,v},p_{c,u}$; this holds since, due to the FIFO-Receive-All event handler of Algorithm BW, node $v$ (analogously node $u$) will wait until it receives $(\mathcal{M}^{c}_{v},COMPLETE(F_{v}))$ from all paths consisting entirely of nodes in $reach_{v}(F_{v})$, which include all nodes on $p_{c,v}$. Thus, $c$ must have sent both $COMPLETE(F_{v})$ and $COMPLETE(F_{u})$ messages. Since these messages are FIFO-flooded from $c$ and there are fully nonfaulty paths connecting $c$ with both $u,v$, one of the two nodes will receive both $COMPLETE(F_{v})$ and $COMPLETE(F_{u})$ messages before satisfying the FIFO-Receive-All condition. Assume without loss of generality that this node is $v$. We will then show that the theorem holds for $F_{t}=F_{u}$.

Arguments similar to the ones used in the proof of Lemma 7 [Footnote 6: This follows from the first paragraph of the proof of Lemma 7 for $c$ being any of the nodes $v,u$.] imply that $c$ must have received a unique value $x_{q}$ for each $\displaystyle q\in\bigcup_{\mathclap{\begin{subarray}{c}F_{w}\neq F_{u}\\ |F_{w}|\leq f\end{subarray}}}S_{F_{u},F_{w}}$ from redundant paths in $G_{\overline{F_{u}}}$ in order to satisfy its Maximal-Consistency condition during the parallel execution for set $F_{u}$. As in previous arguments, by Theorem 5 and the strong connectivity of $S_{F_{u},F_{w}}$, there exists a simple $(q,c)$-path $p_{q,c}$ in $G_{\overline{F_{u}}}$, which propagates this value $x_{q}$ to $c$. Since, by assumption, $u$ executes Filter-and-Average during its parallel execution for $F_{u}$, by the Maximal-Consistency condition, $u$ will also receive the same unique value $x_{q}$, for each such node $q$, propagated by redundant path $p_{q,c}||p_{c,u}$ because $p_{c,u}$ is fully nonfaulty and entirely contained in $G_{\overline{F_{u}}}$. As argued previously, $v$ will receive $(\mathcal{M}^{c}_{u},COMPLETE(F_{u}))$ through a fully nonfaulty path $p_{c,v}$. Consequently, by Lemma 8, $v$ will satisfy the $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ condition and thus, due to Definition 9, $v$ will be informed about $F_{u}$. For the $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ condition to be satisfied at node $v$, it must receive the respective $x_{q}$ values which are consistent with the ones in $\mathcal{M}^{c}$, received through the fully nonfaulty path $p_{c,v}$. Thus, by definition of the $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ condition, $v$ will also receive the same values $x_{q}$ for each $\displaystyle q\in\bigcup_{\mathclap{\begin{subarray}{c}F_{w}\neq F_{u}\\ |F_{w}|\leq f\end{subarray}}}S_{F_{u},F_{w}}$.

∎

Finally, we introduce notions that will be useful for our analysis later.

Definition 12.

Assume nonfaulty nodes $v,u$ which execute Filter-and-Average during their parallel executions for sets $F_{v}$ and $F_{u}$ , respectively. Let $F_{t}\in\{F_{v},F_{u}\}$ be the set about which both $v,u$ are informed and both receive a common value $x_{q}$ for each $\displaystyle q\in\bigcup_{\mathclap{\begin{subarray}{c}F_{w}\neq F_{t}\\ |F_{w}|\leq f\end{subarray}}}S_{F_{t},F_{w}}$ , as implied by Theorem 11. We will refer to set $F_{t}$ as the common fault set of $v,u$ , and to $t$ as the leading node of the pair. Considering the common values $x_{q}$ received by both $v,u$ , for any set $F_{w}\neq F_{t}$ with $F_{w}\subseteq V$ , $|F_{w}|\leq f$ , we define the common value set $R_{F_{w}}$ as:

(2)

R_{F_{w}}=\bigcup_{q\in S_{F_{t},F_{w}}}x_{q}

4.4. Value Update

We next present Algorithm Filter-and-Average (FA), which specifies how a node filters its received messages in order to update its state value by an averaging procedure. Following the intuition of (Su and Vaidya, 2017), a node first sorts all the values in received message set $\mathcal{M}_{v}$, which results in a sorted vector $O_{v}[r]$ in round $r$. In the next step, the node computes the maximal set of lowest values that might have been tampered with by a faulty set (i.e., such that their propagation paths have an $f$-cover) and trims (removes) them from sorted vector $O_{v}[r]$. Analogously, the node also trims from $O_{v}[r]$ the maximal set of the highest values that may have been tampered with by a faulty set. Finally, the remaining sorted values, denoted as $O_{v}^{\prime}[r]$, are averaged as is usual in the majority of the approximate consensus literature (e.g., (Dolev et al., 1986; Kieckhafer and Azadmanesh, 1994; Bonomi et al., 2016)).

Algorithm 3 Filter-and-Average$(\mathcal{M}_{v})$ (for node $v$ in round $r$)

Input: Incoming message history $\mathcal{M}_{v}$ at the point where Filter-and-Average is called in BW in round $r$

Code for $v\in V$

Sort all messages $m\in\mathcal{M}_{v}$ in increasing order with respect to $value(m)$, which results in sorted vector $O_{v}[r]$
$O_{v}^{lo}\leftarrow$ the longest message prefix of $O_{v}[r]$ for which $\exists$ $f$-cover $F_{lo}$ of $\mathcal{P}(O_{v}^{lo})$
$O_{v}^{hi}\leftarrow$ the longest message suffix of $O_{v}[r]$ for which $\exists$ $f$-cover $F_{hi}$ of $\mathcal{P}(O_{v}^{hi})$
$\triangleright$ Trim extreme values and average
Remove messages $O_{v}^{lo},O_{v}^{hi}$ from $O_{v}[r]$, which results in the trimmed vector $O_{v}^{\prime}[r]$
$x_{v}[r+1]=\frac{\max(O_{v}^{\prime}[r])+\min(O_{v}^{\prime}[r])}{2}$
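To make the trimming step concrete, here is a small brute-force Python sketch of Filter-and-Average. It is an illustration under stated assumptions rather than the authors' code: a message is modeled as a `(value, path)` pair, the node's own value arrives on the trivial path `(v,)`, an $f$-cover may not contain the receiving node `v`, and the new state is taken as the midpoint of the largest and smallest surviving values, consistent with the bounds used in the proof of Lemma 18. The helper names are hypothetical.

```python
from itertools import combinations

def has_f_cover(paths, f, candidates):
    """True if at most f candidate nodes suffice to hit every path."""
    path_sets = [set(p) for p in paths]
    if not path_sets:
        return True
    pool = list(set(candidates))
    for size in range(1, f + 1):
        for H in combinations(pool, size):
            if all(ps & set(H) for ps in path_sets):
                return True
    return False

def filter_and_average(M_v, f, v):
    """Trim coverable low/high extremes, then return the midpoint."""
    msgs = sorted(M_v, key=lambda m: m[0])
    # Assumption: the receiver itself is never part of a cover.
    nodes = {x for (_, p) in msgs for x in p} - {v}

    def coverable(chunk):
        return has_f_cover([p for (_, p) in chunk], f, nodes)

    lo = 0  # length of the longest coverable prefix
    while lo < len(msgs) and coverable(msgs[:lo + 1]):
        lo += 1
    hi = 0  # length of the longest coverable suffix
    while hi < len(msgs) and coverable(msgs[len(msgs) - hi - 1:]):
        hi += 1
    values = [x for (x, _) in msgs[lo:len(msgs) - hi]]
    if not values:
        raise ValueError("all messages were trimmed")
    return (max(values) + min(values)) / 2
```

In a toy run with $f=1$ where a faulty node `b` injects `-100` directly and `100` on a relayed path, both extremes fall inside coverable prefix and suffix sets and are trimmed, while the values that arrive over paths with no single-node cover survive.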

Towards proving the convergence property, we will first show that for any nonfaulty nodes $v,u$ running Algorithm Filter-and-Average in round $r$ , there will be a common value in the trimmed vectors $O^{\prime}_{v}[r]$ and $O^{\prime}_{u}[r]$ of the two nodes. For simplicity of presentation, we will omit the round variable $r$ in the following discussion. The theorems presented next guarantee the existence of this common element.

4.5. Properties of Algorithm Filter-and-Average

According to the notation of Definition 12, Theorem 11 implies that for any pair of nonfaulty nodes $v,u$ executing Algorithm Filter-and-Average, there exists a common fault set $F_{t}$ such that both nodes $v$ and $u$ will be informed about $F_{t}$ and will obtain the same common value sets $R_{F_{w}}$ for all $F_{w}\neq F_{t}$. The following theorem guarantees that for $F_{t}$, there will be some source components whose values will appear in the trimmed vectors of $v,u$ regardless of the sets $F_{lo},F_{hi}$ used to trim their vectors. Recall that the notions of common fault set and leading node are introduced in Definition 12.

Theorem 13.

Let $v,u$ be any pair of nonfaulty nodes, $F_{t}$ their common fault set, and $t\in\{v,u\}$ the leading node. Then for $z\in\{v,u\}$ and $f$ -covers $F_{lo}^{z},F_{hi}^{z}$ as identified in Algorithm 3, common value set $R_{F_{lo}^{z}}$ will be included in the vector $O_{z}$ after removing $O_{z}^{lo}$ , and common value set $R_{F_{hi}^{z}}$ will be included in the vector $O_{z}$ after removing $O_{z}^{hi}$ .

Proof.

Without loss of generality, assume that $t=u$ . We first consider the validity of the theorem for node $u$ . The fact that $t=u$ implies that $F_{t}=F_{u}$ is the set corresponding to the parallel execution during which $u$ executes Filter-and-Average. By Theorem 5 and the strong connectivity of $S_{F_{u},F_{lo}^{u}}$ , if $u\notin S_{F_{u},F_{lo}^{u}}$ , it must have received each value of $R_{F_{lo}^{u}}$ (as defined in Definition 12) from $f+1$ disjoint $(S_{F_{u},F_{lo}^{u}},u)$ -paths in $G_{\overline{F_{u}}}$ to satisfy its Maximal-Consistency condition. Since $|F_{lo}^{u}|\leq f$ there will be at least one path propagating each value of $R_{F_{lo}^{u}}$ in $G_{\overline{F_{u}\cup F_{lo}^{u}}}$ . If $u\in S_{F_{u},F_{lo}^{u}}$ , by the strong connectivity of $S_{F_{u},F_{lo}^{u}}$ , $u$ will receive $R_{F_{lo}^{u}}$ from paths entirely within $S_{F_{u},F_{lo}^{u}}$ , i.e., paths in $G_{\overline{F_{u}\cup F_{lo}^{u}}}$ . Thus, in both cases, common value set $R_{F_{lo}^{u}}$ will be included in vector $O_{u}$ after removing $O_{u}^{lo}$ . Similar arguments hold for the case of $O_{u}^{hi}$ .

Next, we consider node $v$ and assume that it executes Filter-and-Average during its parallel execution for $F_{v}$. Since $v\neq t=u$, node $v$ has satisfied the $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$ condition after receiving message $(\mathcal{M}^{c},COMPLETE(F_{u}))$ from a nonfaulty path initiating at $c\in reach_{v}(F\cup F_{v})\cap reach_{u}(F\cup F_{u})$; this holds by an argument identical to that of the proof of Theorem 11. As in the proof of Theorem 11, $c$ will propagate all values of $R_{F_{lo}^{v}}$ to $v$ through its FIFO-flooded message $(\mathcal{M}^{c},COMPLETE(F_{u}))$. Since $v$ satisfies $Completeness(\mathcal{M}_{v},\mathcal{M}^{c},F_{u})$, it receives each value of $R_{F_{lo}^{v}}$ through a path set $P$ with no $f$-cover $H\subseteq V\setminus S_{F_{u},F_{lo}^{v}}$. Consequently, since $|F_{lo}^{v}|\leq f$ and $F_{lo}^{v}\cap S_{F_{u},F_{lo}^{v}}=\emptyset$, one of the paths of $P$ will not contain any node in $F_{lo}^{v}$. This means that for any value $x_{q}\in R_{F_{lo}^{v}}$, there exists a path in $G_{\overline{F_{lo}^{v}}}$ from which $v$ will receive $x_{q}$, and thus, all values of $R_{F_{lo}^{v}}$ will be included in $O_{v}$ after removing $O_{v}^{lo}$. Similar arguments hold for the case of $O_{v}^{hi}$. ∎

The next theorem facilitates the later analysis; it states that there is always an overlap between certain pairs of source components of reduced graphs. Its proof is deferred to Appendix C.

Theorem 14.

Suppose that graph $G=(V,E)$ satisfies condition 3-reach. For any three sets $F_{v},F_{u},F_{w}$ , with $F_{v}\subset V,F_{u},F_{w}\subseteq V\setminus F_{v}$ and $|F_{v}|,|F_{u}|,|F_{w}|\leq f$ , $S_{F_{v},F_{u}}\cap S_{F_{v},F_{w}}\neq\emptyset$ .

We next define some notions, helpful to determine the existence of a common value in the intersection of $O^{\prime}_{v},O^{\prime}_{u}$ for any pair of nonfaulty nodes $v,u$ . As before, we assume that $F_{t}$ is the common fault set of $v,u$ .

Definition 15.

Let $v,u$ be two nonfaulty nodes and $F_{t}$ their common fault set. For $F_{w}\subseteq V$ with $|F_{w}|\leq f$, let $\displaystyle x^{F_{w}}_{\min}=\min_{x\in R_{F_{w}}}x$, i.e., the minimum common value for the source component $S_{F_{t},F_{w}}$. Define the maximum of all these minimum values over all possible $F_{w}$ as

x_{\max\min}=\max_{\mathclap{\begin{subarray}{c}F_{w}\subseteq V\\ |F_{w}|\leq f\end{subarray}}}\ x_{\min}^{F_{w}}

and let $S_{F_{t},F_{l}}$ , be a source component that includes the common value $x_{\max\min}$ , i.e., $x_{\max\min}\in R_{F_{l}}$ . Analogously, let $x^{F_{w}}_{\max}$ be the maximum common value for the source component $S_{F_{t},F_{w}}$ and define minimum of these maximum values as

x_{\min\max}=\min_{\mathclap{\begin{subarray}{c}F_{w}\subseteq V\\ |F_{w}|\leq f\end{subarray}}}\ x_{\max}^{F_{w}}

Similar to before, we assume that $S_{F_{t},F_{h}}$ is a source component that includes the common value $x_{\min\max}$ .

Lemma 16.

For any two nonfaulty nodes $v,u$ , $x_{\max\min}\leq x_{\min\max}$

Proof.

By way of contradiction, assume that

(3)

x_{\max\min}>x_{\min\max}

Then by Definition 15, we have the following two inequalities:

(4)

\text{For all values}~{}x~{}\text{ contained in}~{}R_{F_{l}},~{}~{}~{}x\geq x_{\max\min}

(5)

\text{For all values } y \text{ contained in } R_{F_{h}},\quad y\leq x_{\min\max}

Now we make two observations:

•

Observation 1: Equations 3, 4, and 5 imply that $R_{F_{l}}\cap R_{F_{h}}=\emptyset$ .
•

Observation 2: By Definition 12, for each $w\in S_{F_{t},F_{l}}\cap S_{F_{t},F_{h}}$ there will be a common value $x_{w}$ contained in both $R_{F_{l}}$ and $R_{F_{h}}$ .

These two observations imply that $S_{F_{t},F_{l}}\cap S_{F_{t},F_{h}}=\emptyset$ , a contradiction to Theorem 14. ∎
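The inequality of Lemma 16 can be illustrated numerically. The sketch below treats each common value set $R_{F_{w}}$ as a plain Python set of numbers; the function name and the toy sets are hypothetical. When every pair of sets shares an element, the largest minimum cannot exceed the smallest maximum, whereas disjoint sets can violate the inequality.

```python
def max_min_and_min_max(common_value_sets):
    """Return (x_maxmin, x_minmax) for a family of nonempty value sets."""
    mins = [min(R) for R in common_value_sets]
    maxs = [max(R) for R in common_value_sets]
    return max(mins), min(maxs)

# Pairwise-intersecting sets: the inequality of Lemma 16 holds.
overlapping = [{1, 4}, {4, 8}, {1, 8}]
x_maxmin, x_minmax = max_min_and_min_max(overlapping)
assert x_maxmin <= x_minmax

# Disjoint sets: the inequality can fail, mirroring the contradiction
# reached in the proof when two source components do not overlap.
disjoint = [{1, 2}, {3, 4}]
y_maxmin, y_minmax = max_min_and_min_max(disjoint)
assert y_maxmin > y_minmax
```

In the paper's setting the overlap is supplied by Theorem 14: intersecting source components contribute the same common value to both sets.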

Theorem 17.

For nonfaulty nodes $v,u$ , after the termination of Algorithm Filter-and-Average, we have $O^{\prime}_{v}\cap O^{\prime}_{u}\neq\emptyset$ .

Proof.

We will argue that a common value $x$ contained in $R_{F_{l}}\cap R_{F_{h}}$ will be included in both the trimmed vectors $O_{z}^{\prime}$ , for $z\in\{v,u\}$ . For any $F_{lo}^{z}$ with $|F_{lo}^{z}|\leq f$ chosen by any $z\in\{v,u\}$ , define $x^{z,lo}_{\min}$ as the minimum value contained in $R_{F_{lo}^{z}}$ . Then by the definition of $x_{\max\min}$ , we have $x^{z,lo}_{\min}\leq x_{\max\min}$ .

Due to Theorem 13, value $x^{z,lo}_{\min}$ will be contained in $O_{z}$ after removal of set $O_{z}^{lo}$. Thus, due to the definition of $O_{z}^{lo}$, any value $x$ contained in $O_{z}^{lo}$ will satisfy $x\leq x^{z,lo}_{\min}$, i.e., only values less than or equal to $x^{z,lo}_{\min}$ will be removed from $O_{z}$ due to removal of $O_{z}^{lo}$ and the value $x^{z,lo}_{\min}$ will remain in the trimmed $O_{z}$. Note that, in the event that there are multiple values identical to $x^{z,lo}_{\min}$, at least one instance of $x^{z,lo}_{\min}$ remains in $O_{z}$.

Next, observe that for $z\in\{v,u\}$ and any choice of $F_{hi}^{z}$, $x^{z,hi}_{\max}\geq x_{\min\max}$ holds, where $x^{z,hi}_{\max}$ is the maximum value contained in $R_{F_{hi}^{z}}$, due to the definition of $x_{\min\max}$. Due to Theorem 13, $x^{z,hi}_{\max}$ will be contained in $O_{z}$ after removal of set $O_{z}^{hi}$. As in the previous argument, any value $x$ contained in $O_{z}^{hi}$ will satisfy $x\geq x^{z,hi}_{\max}$ and the value $x^{z,hi}_{\max}$ will remain in the trimmed $O_{z}$. Note that, in the event that there are multiple values identical to $x^{z,hi}_{\max}$, at least one instance of $x^{z,hi}_{\max}$ remains in $O_{z}$. Now, by Lemma 16 we have that

(6)

x^{z,lo}_{\min}\leq x_{\max\min}\leq x_{\min\max}\leq x^{z,hi}_{\max}

Consequently, the only values removed from $O_{z}$ will be less than or equal to $x^{z,lo}_{\min}$ or greater than or equal to $x^{z,hi}_{\max}$, and values $x^{z,lo}_{\min},x^{z,hi}_{\max}$ will not be trimmed. Thus, due to Equation (6), $x_{\max\min}$ and $x_{\min\max}$ will be included in the final trimmed vector $O^{\prime}_{z}$ for $z\in\{v,u\}$.

∎

4.6. Correctness of Algorithm BW

For a given execution round $r\geq 0$, recall that $x_{v}[r]$ is the state variable maintained at node $v$ at the end of round $r-1$. Value $x_{v}[0]$ is assumed to be the input given to node $v$. We denote by $U[r],\mu[r]$ the maximum and the minimum state value at nonfaulty nodes by the end of round $r$. Since the initial state of each node is equal to its input, $U[0]$ and $\mu[0]$ are equal to the maximum and minimum value of the initial input of the nodes, respectively. Consequently, the convergence and validity requirements of approximate consensus can be stated as follows.

• Convergence: $\forall\epsilon>0,\text{ there exists an iteration }r_{\epsilon}\text{ such that for all }r\geq r_{\epsilon},U[r]-\mu[r]<\epsilon$ .
• Validity: $\forall r>0,U[r]\leq U[0]\text{ and }\mu[r]\geq\mu[0]$ .
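The two requirements can be phrased as simple checks over a recorded history of nonfaulty state values. The following Python sketch is illustrative only: the `history` structure (one list of nonfaulty state values per round) and the helper names are assumptions made for this example, and a finite history can only approximate the "for all $r\geq r_{\epsilon}$" quantifier.

```python
def spread(states):
    """U[r] - mu[r] for the nonfaulty state values of one round."""
    return max(states) - min(states)


def validity_holds(history):
    """Check U[r] <= U[0] and mu[r] >= mu[0] for every recorded round."""
    u0, mu0 = max(history[0]), min(history[0])
    return all(max(s) <= u0 and min(s) >= mu0 for s in history)


def first_round_within(history, eps):
    """Smallest recorded r such that the spread stays below eps from r onward.

    Returns None if the last recorded round is not within eps.
    """
    r_eps = None
    for r, states in enumerate(history):
        if spread(states) < eps:
            if r_eps is None:
                r_eps = r
        else:
            r_eps = None  # the spread rose again, so restart the search
    return r_eps
```

For instance, with `history = [[0, 4], [1, 3], [1.5, 2.5], [2, 2]]` the spreads are 4, 2, 1, 0, so validity holds and the spread is below 1.5 from round 2 onward.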

We next prove the convergence of the proposed algorithm which is based on the following lemma.

Lemma 18.

For every round $r$ , it holds that

\displaystyle\frac{U[r]-\mu[r]}{2}\geq U[r+1]-\mu[r+1]

Proof.

For any round $r\geq 0$ , consider any pair of nonfaulty nodes $v,u$ . Without loss of generality, assume $x_{v}[r+1]\geq x_{u}[r+1]$ . We prove the lemma by showing that

(7)

\frac{U[r]-\mu[r]}{2}\geq x_{v}[r+1]-x_{u}[r+1].

Observe that Theorem 17 guarantees the existence of an element $z$ with $z\in O^{\prime}_{v}[r]\cap O^{\prime}_{u}[r]$ . This implies that $\max(O^{\prime}_{u}[r])\geq z$ . (Since $O^{\prime}_{u}[r]$ is a message set sorted with respect to values, we define $\max(O^{\prime}_{u}[r])$ and $\min(O^{\prime}_{u}[r])$ to be the maximum and minimum value, respectively, appearing in this set as the first component of its message pairs.) Moreover, $\min(O^{\prime}_{u}[r])\geq\mu[r]$ holds. Indeed, if $F_{lo}=F$ for node $u$ , then all actual faulty values are removed from $O_{u}[r]$ , resulting in the trimmed vector $O^{\prime}_{u}[r]$ ; thus, any remaining value $x$ in $O^{\prime}_{u}[r]$ satisfies $x\geq\mu[r]$ . If $F_{lo}\neq F$ , then there exists a nonfaulty node $w$ such that $x_{w}[r]$ , the state value at node $w$ in round $r$ , is trimmed and hence is no larger than any $x\in O^{\prime}_{u}[r]$ ; therefore, for any $x\in O^{\prime}_{u}[r]$ , $x\geq x_{w}[r]\geq\mu[r]$ holds. Consequently, since $\max(O^{\prime}_{u}[r])\geq z$ and $\min(O^{\prime}_{u}[r])\geq\mu[r]$ , due to line 3 of Algorithm Filter-and-Average, we have that

x_{u}[r+1]=\frac{\max(O_{u}^{\prime}[r])+\min(O_{u}^{\prime}[r])}{2}\geq\frac{z+\mu[r]}{2}

Similarly, $\min(O^{\prime}_{v}[r])\leq z$ and $\max(O^{\prime}_{v}[r])\leq U[r]$ , which implies

x_{v}[r+1]\leq\frac{z+U[r]}{2}

Equation (7) follows from these two inequalities. ∎
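The halving argument can be checked numerically. The following Python sketch is illustrative only. It assumes that each trimmed vector is modelled as a plain list of values lying in $[\mu[r],U[r]]$, that the two lists share a common value $z$ (as guaranteed by Theorem 17), and that the update sets the new state to the midpoint of the largest and smallest trimmed values. The names `midpoint_update` and `check_halving` are hypothetical.

```python
import random


def midpoint_update(trimmed):
    """Midpoint of the extreme values of a trimmed vector (assumed update rule)."""
    return (max(trimmed) + min(trimmed)) / 2


def check_halving(trials=1000, seed=0):
    """Randomly test that |x_v[r+1] - x_u[r+1]| <= (U[r] - mu[r]) / 2."""
    rng = random.Random(seed)
    for _ in range(trials):
        mu, big_u = sorted(rng.uniform(-10, 10) for _ in range(2))
        z = rng.uniform(mu, big_u)  # value common to both trimmed vectors
        o_v = [z] + [rng.uniform(mu, big_u) for _ in range(rng.randint(0, 5))]
        o_u = [z] + [rng.uniform(mu, big_u) for _ in range(rng.randint(0, 5))]
        gap = abs(midpoint_update(o_v) - midpoint_update(o_u))
        if gap > (big_u - mu) / 2 + 1e-12:  # small tolerance for rounding
            return False
    return True
```

Both midpoints lie in the interval $[\frac{z+\mu[r]}{2},\frac{z+U[r]}{2}]$, whose length is $\frac{U[r]-\mu[r]}{2}$, which is why the check is expected to pass for every sample.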

Lemma 18 and simple arithmetic imply the Convergence property (details appear in the termination study of Algorithm BW presented below). Validity is based on the observation that all extreme values are eliminated by each nonfaulty node, owing to the way the trimmed vector $O_{v}^{\prime}[r]$ is derived. The arguments are similar to those in the proof of Lemma 18. The proof is presented in Appendix G.

Theorem 19 (Validity).

$\forall r\geq 0,U[r]\leq U[0]$ and $\mu[r]\geq\mu[0]$

Termination of Algorithm BW

Recall that the termination requirement for approximate consensus (Definition 1) requires that all nonfaulty nodes output a value. We follow the approach in (Tseng and Vaidya, 2015, 2012). Suppose that the input is within the range $[0,K]$ , where $K\in\mathbb{R}$ is known a priori. If $K<\epsilon$ , then the problem is trivial, so it is assumed that $K\geq\epsilon$ . Repeated application of Lemma 18 implies that $U[r]-\mu[r]\leq\frac{U[0]-\mu[0]}{2^{r}}\leq\frac{K}{2^{r}}$ . This implies that for given $K,\epsilon$ , the state values of the nonfaulty nodes will be within $\epsilon$ of each other after any round $r>\log_{2}\frac{K}{\epsilon}$ . Since $K,\epsilon$ are known a priori, each node can locally compute $\frac{K}{\epsilon}$ and output its value in the first round $r$ such that $r>\log_{2}\frac{K}{\epsilon}$ .
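The local computation of the output round can be sketched as follows. This is a minimal Python illustration, assuming the bound $U[r]-\mu[r]\leq K/2^{r}$ obtained from Lemma 18; the function name `rounds_until_output` is hypothetical and floating-point logarithms are used for simplicity.

```python
import math


def rounds_until_output(K, eps):
    """Smallest integer r with r > log2(K / eps), assuming K >= eps > 0."""
    if eps <= 0 or K < eps:
        raise ValueError("expected K >= eps > 0")
    return math.floor(math.log2(K / eps)) + 1
```

For example, with $K=8$ and $\epsilon=1$ the function returns 4, and indeed $K/2^{4}=0.5<\epsilon$.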

Acknowledgements.

Nitin Vaidya’s work is supported in part by the Army Research Laboratory under Cooperative Agreement W911NF-17-2-0196, and by the National Science Foundation award 1842198. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory, National Science Foundation or the U.S. Government.

References

Abraham et al. (2004) Ittai Abraham, Yonatan Amit, and Danny Dolev. 2004. Optimal Resilience Asynchronous Approximate Agreement. In OPODIS. 229–239.
Benediktsson and Swain (1992) J. A. Benediktsson and P. H. Swain. 1992. Consensus theoretic classification methods. IEEE Transactions on Systems, Man, and Cybernetics 22, 4 (July 1992), 688–704. https://doi.org/10.1109/21.156582
Bonomi et al. (2016) S. Bonomi, A. D. Pozzo, M. Potop-Butucaru, and S. Tixeuil. 2016. Approximate Agreement under Mobile Byzantine Faults. In 2016 IEEE 36th International Conference on Distributed Computing Systems (ICDCS). 727–728. https://doi.org/10.1109/ICDCS.2016.68
Cybenko (1989) George Cybenko. 1989. Dynamic Load Balancing for Distributed Memory Multiprocessors. J. Parallel Distrib. Comput. 7, 2 (1989), 279–301. https://doi.org/10.1016/0743-7315(89)90021-X
Dolev (1982) Danny Dolev. March 1982. The Byzantine Generals Strike Again. Journal of Algorithms 3(1) (March 1982).
Dolev et al. (1993) Danny Dolev, Cynthia Dwork, Orli Waarts, and Moti Yung. 1993. Perfectly Secure Message Transmission. J. ACM 40, 1 (Jan. 1993), 17–47. https://doi.org/10.1145/138027.138036
Dolev et al. (1986) Danny Dolev, Nancy A. Lynch, Shlomit S. Pinter, Eugene W. Stark, and William E. Weihl. 1986. Reaching approximate agreement in the presence of faults. J. ACM 33 (May 1986), 499–516. Issue 3. https://doi.org/10.1145/5925.5931
Fischer et al. (1985a) Michael J. Fischer, Nancy A. Lynch, and Michael Merritt. 1985a. Easy impossibility proofs for distributed consensus problems. In Proceedings of the fourth annual ACM symposium on Principles of distributed computing (PODC ’85). ACM, New York, NY, USA, 59–70. https://doi.org/10.1145/323596.323602
Fischer et al. (1985b) Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson. 1985b. Impossibility of Distributed Consensus with One Faulty Process. J. ACM 32, 2 (April 1985), 374–382. https://doi.org/10.1145/3149.214121
Georgiou et al. (2013) Chryssis Georgiou, Seth Gilbert, Rachid Guerraoui, and Dariusz R. Kowalski. 2013. Asynchronous Gossip. J. ACM 60, 2, Article 11 (May 2013), 42 pages. https://doi.org/10.1145/2450142.2450147
Hegselmann and Krause (2002) Rainer Hegselmann and Ulrich Krause. 2002. Opinion dynamics and bounded confidence: Models, analysis and simulation. Journal of Artificial Societies and Social Simulation 5 (2002), 1–24.
Kieckhafer and Azadmanesh (1994) Roger M. Kieckhafer and Mohammad H. Azadmanesh. 1994. Reaching Approximate Agreement with Mixed-Mode Faults. IEEE Trans. Parallel Distrib. Syst. 5, 1 (1994), 53–63. https://doi.org/10.1109/71.262588
LeBlanc et al. (2013) H. LeBlanc, H. Zhang, X. Koutsoukos, and S. Sundaram. April 2013. Resilient Asymptotic Consensus in Robust Networks. IEEE Journal on Selected Areas in Communications: Special Issue on In-Network Computation 31 (April 2013), 766–781.
Litsas et al. (2013) Chris Litsas, Aris Pagourtzis, and Dimitris Sakavalas. 2013. A Graph Parameter That Matches the Resilience of the Certified Propagation Algorithm. In Ad-hoc, Mobile, and Wireless Network - 12th International Conference, ADHOC-NOW 2013, Wrocław, Poland, July 8-10, 2013. Proceedings (Lecture Notes in Computer Science), Jacek Cichon, Maciej Gebala, and Marek Klonowski (Eds.), Vol. 7960. Springer, 269–280. https://doi.org/10.1007/978-3-642-39247-4_23
Lynch (1996) Nancy A. Lynch. 1996. Distributed Algorithms. Morgan Kaufmann. 177–182 pages.
Pagourtzis et al. (2017a) Aris Pagourtzis, Giorgos Panagiotakos, and Dimitris Sakavalas. 2017a. Reliable broadcast with respect to topology knowledge. Distributed Computing 30, 2 (2017), 87–102. https://doi.org/10.1007/s00446-016-0279-6
Pagourtzis et al. (2017b) Aris Pagourtzis, Giorgos Panagiotakos, and Dimitris Sakavalas. 2017b. Reliable Communication via Semilattice Properties of Partial Knowledge. In Fundamentals of Computation Theory - 21st International Symposium, FCT 2017, Bordeaux, France, September 11-13, 2017, Proceedings (Lecture Notes in Computer Science), Ralf Klasing and Marc Zeitoun (Eds.), Vol. 10472. Springer, 367–380. https://doi.org/10.1007/978-3-662-55751-8_29
Pease et al. (1980) M. Pease, R. Shostak, and L. Lamport. 1980. Reaching Agreement in the Presence of Faults. J. ACM 27, 2 (April 1980), 228–234. https://doi.org/10.1145/322186.322188
Sakavalas and Tseng (2018) Dimitris Sakavalas and Lewis Tseng. 2018. Delivery Delay and Mobile Faults. In 17th IEEE International Symposium on Network Computing and Applications, NCA 2018, Cambridge, MA, USA, November 1-3, 2018. IEEE, 1–8. https://doi.org/10.1109/NCA.2018.8548345
Sakavalas and Tseng (2019) Dimitris Sakavalas and Lewis Tseng. 2019. Network Topology and Fault-Tolerant Consensus. Morgan & Claypool Publishers. https://doi.org/10.2200/S00918ED1V01Y201904DCT016
Sakavalas et al. (2018) Dimitris Sakavalas, Lewis Tseng, and Nitin H. Vaidya. 2018. Effects of Topology Knowledge and Relay Depth on Asynchronous Appoximate Consensus. In 22nd International Conference on Principles of Distributed Systems, OPODIS 2018, December 17-19, 2018, Hong Kong, China (LIPIcs), Jiannong Cao, Faith Ellen, Luis Rodrigues, and Bernardo Ferreira (Eds.), Vol. 125. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 14:1–14:16. https://doi.org/10.4230/LIPIcs.OPODIS.2018.14
Su and Vaidya (2017) Lili Su and Nitin H. Vaidya. 2017. Reaching approximate Byzantine consensus with multi-hop communication. Inf. Comput. 255 (2017), 352–368. https://doi.org/10.1016/j.ic.2016.12.003
Tseng and Vaidya (2012) Lewis Tseng and Nitin H. Vaidya. 2012. Exact Byzantine Consensus in Directed Graphs. CoRR abs/1208.5075 (2012). arXiv:1208.5075 http://arxiv.org/abs/1208.5075
Tseng and Vaidya (2015) Lewis Tseng and Nitin H. Vaidya. 2015. Fault-Tolerant Consensus in Directed Graphs. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing (PODC ’15). ACM, New York, NY, USA, 451–460. https://doi.org/10.1145/2767386.2767399
Vaidya et al. (2012) Nitin H. Vaidya, Lewis Tseng, and Guanfeng Liang. 2012. Iterative Approximate Byzantine Consensus in Arbitrary Directed Graphs. In Proceedings of the thirty-first annual ACM symposium on Principles of distributed computing (PODC ’12). ACM.
Vicsek et al. (1995) Tamás Vicsek, András Czirók, Eshel Ben-Jacob, Inon Cohen, and Ofer Shochet. 1995. Novel Type of Phase Transition in a System of Self-Driven Particles. Phys. Rev. Lett. 75 (Aug 1995), 1226–1229. Issue 6. https://doi.org/10.1103/PhysRevLett.75.1226

Appendix A The $k$ -reach condition family

In this section, we summarize the tight topological conditions related to consensus in directed networks that have appeared in the literature, along with their equivalents from the family of $k$ -reach conditions, for $k=1,2,3$ . The topological conditions CCS (abbreviating Crash-Consensus-Synchronous), CCA (Crash-Consensus-Asynchronous) and BCS (Byzantine-Consensus-Synchronous) were introduced in (Tseng and Vaidya, 2015) and proven tight for the cases of synchronous crash consensus, asynchronous approximate crash consensus and synchronous Byzantine consensus, respectively. The determination of the necessary and sufficient topological condition for solving approximate Byzantine consensus in asynchronous systems has been an open problem since 2012.

We next present some additional definitions that facilitate our presentation.

For set $B\subseteq V$ , process $v$ is said to be an incoming (resp. outgoing) neighbor of set $B$ if $v\not\in B$ , and there exists $u\in B$ such that $(v,u)\in E$ (resp. $(u,v)\in E$ ). The incoming and outgoing neighborhood of a node $v$ are the sets of its incoming and outgoing neighbors respectively and will be denoted with $\mathcal{N}^{-}_{v},\mathcal{N}^{+}_{v}$ respectively. We extend the notion to the incoming (resp. outgoing) neighborhood of a set $B$ , denoted with $\mathcal{N}^{-}_{B}$ (resp. $\mathcal{N}^{+}_{B}$ ) and defined as the set of all incoming (resp. outgoing) neighbors of $B$ . Given subsets of nodes $A$ and $B$ , set $B$ is said to have $k$ incoming neighbors in set $A$ if $A$ contains $k$ distinct incoming neighbors of $B$ . Next, we define a notion which concerns the connectivity of any two node sets of the graph as presented in (Tseng and Vaidya, 2015).

Definition 0.

Given disjoint non-empty subsets of nodes $A$ and $B$ , we will say that $A\stackrel{{\scriptstyle x}}{{\longrightarrow}}{B}$ holds, if $B$ has at least $x$ incoming neighbors in $A$ . The negation of $A\stackrel{{\scriptstyle x}}{{\longrightarrow}}{B}$ will be denoted by $A\stackrel{{\scriptstyle x}}{{\not\longrightarrow}}{B}$ .
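The relation $A\stackrel{x}{\longrightarrow}B$ is straightforward to evaluate on a concrete graph. The following Python sketch is an illustration only; it assumes the graph is given as a set of directed edges `(v, u)`, and the function names are hypothetical.

```python
def incoming_neighbors(edges, B):
    """Nodes outside B that have an edge into some node of B."""
    B = set(B)
    return {v for (v, u) in edges if u in B and v not in B}


def has_x_incoming(edges, A, B, x):
    """True if B has at least x distinct incoming neighbors in A."""
    return len(incoming_neighbors(edges, B) & set(A)) >= x
```

For the edge set `{(1, 3), (2, 3), (3, 4)}`, the set `{3, 4}` has incoming neighbors `{1, 2}`, so `has_x_incoming(edges, {1, 2}, {3, 4}, 2)` holds while `has_x_incoming(edges, {1}, {3, 4}, 2)` does not.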

We also introduce the following useful generalization of the reach set notion. The notion denotes all the multi-hop incoming neighbors of node $v$ in graph $G_{\bar{F}}$ .

Definition 0 (Reach set of $v$ under $F$ ).

For a subgraph $G^{\prime}=(V^{\prime},E^{\prime})$ of $G$ , node $v\in V^{\prime}$ and node set $F\subseteq V\setminus\{v\}$ , we will use the following notation,

reach_{v}^{G^{\prime}}(F)=\{u\in V^{\prime}\setminus F:\text{ $v$ is reachable from $u$ in graph }G^{\prime}_{V^{\prime}\setminus F}\}

Whenever $G^{\prime}=G$ we will omit the superscript $G^{\prime}$ and simply use the notation $reach_{v}(F)$ .

The definitions of conditions CCS, CCA, BCS defined in (Tseng and Vaidya, 2015) follow.

Definition 0 (Condition CCS).

For any partition $F,L,C,R$ of $V$ , where $L$ and $R$ are both non-empty, and $|F|\leq f$ , at least one of the following holds:

• $L\cup C\stackrel{{\scriptstyle 1}}{{\longrightarrow}}{R}$
• $R\cup C\stackrel{{\scriptstyle 1}}{{\longrightarrow}}{L}$

Definition 0 (Condition CCA).

For any partition $L,C,R$ of $V$ , where $L$ and $R$ are both non-empty, at least one of the following holds:

• $L\cup C\stackrel{{\scriptstyle f+1}}{{\longrightarrow}}{R}$
• $R\cup C\stackrel{{\scriptstyle f+1}}{{\longrightarrow}}{L}$

Definition 0 (Condition BCS).

For any partition $F,L,C,R$ of $V$ , where $L$ and $R$ are both non-empty, and $|F|\leq f$ , at least one of the following holds:

• $L\cup C\stackrel{{\scriptstyle f+1}}{{\longrightarrow}}{R}$
• $R\cup C\stackrel{{\scriptstyle f+1}}{{\longrightarrow}}{L}$

Observe that while BCS requires a 4-set partition $F,L,R,C$ of $V$ , condition CCA only requires a 3-set partition $L,R,C$ of $V$ .
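On small graphs, conditions CCA and BCS can be checked by enumerating all partitions. The following Python sketch is a brute-force illustration, exponential in the number of nodes and intended only for toy examples. The edge-set representation and the names `count_incoming`, `condition_holds` and `clique` are assumptions made for this example.

```python
from itertools import product


def count_incoming(edges, A, B):
    """Number of distinct incoming neighbors of B that lie in A."""
    return len({v for (v, u) in edges if u in B and v in A and v not in B})


def condition_holds(nodes, edges, f, with_fault_set):
    """Check BCS (with_fault_set=True, partition F,L,C,R) or CCA (False, partition L,C,R)."""
    nodes = list(nodes)
    alphabet = "FLCR" if with_fault_set else "LCR"
    for labels in product(alphabet, repeat=len(nodes)):
        part = {name: set() for name in "FLCR"}
        for node, name in zip(nodes, labels):
            part[name].add(node)
        F, L, C, R = part["F"], part["L"], part["C"], part["R"]
        if not L or not R or len(F) > f:
            continue  # not an admissible partition
        left_to_right = count_incoming(edges, L | C, R) >= f + 1
        right_to_left = count_incoming(edges, R | C, L) >= f + 1
        if not (left_to_right or right_to_left):
            return False  # this partition violates the condition
    return True


def clique(n):
    """Edge set of the complete directed graph on nodes 0..n-1."""
    return {(a, b) for a in range(n) for b in range(n) if a != b}
```

With $f=1$, the clique on 4 nodes passes the BCS check, whereas the clique on 3 nodes fails BCS but still passes CCA.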

We next present conditions equivalent to CCS, CCA, and BCS, based on reach sets of any two nodes of the graph.

Definitions 1.

In the following, sets $F,F_{v},F_{u}\subseteq V$ intuitively represent possible faulty sets and thus are of cardinality at most $f$ , i.e., $|F|,|F_{v}|,|F_{u}|\leq f$ . We define the three following conditions,

• 1-reach: For any $F\subset V$ such that $|F|\leq f$ and any nodes $u,v\in\overline{F}$ , we have $reach_{u}(F)\cap reach_{v}(F)\neq\emptyset.$
• 2-reach: For any nodes $u,v\in V$ and any node subsets $F_{u}$ , $F_{v}$ such that $|F_{u}|,|F_{v}|\leq f$ , $u\in\overline{F_{u}}$ , and $v\in\overline{F_{v}}$ , we have $reach_{v}(F_{v})\cap reach_{u}(F_{u})\neq\emptyset.$
• 3-reach: For any nodes $u,v\in V$ and any node subsets $F$ , $F_{u}$ , $F_{v}$ such that $|F|,|F_{u}|,|F_{v}|\leq f$ , $u\in\overline{F\cup F_{u}}$ , and $v\in\overline{F\cup F_{v}}$ , we have $reach_{v}(F\cup F_{v})\cap reach_{u}(F\cup F_{u})\neq\emptyset.$

Observe that in a clique, it holds that $reach_{v}(F_{v})\cap reach_{u}(F_{u})=reach_{v}(F_{v}\cup F_{u})\cap reach_{u}(F_{v}\cup F_{u})$ . Thus, for example, condition 3-reach in a clique is equivalent to

reach_{v}(F\cup F_{v}\cup F_{u})\cap reach_{u}(F\cup F_{v}\cup F_{u})\neq\emptyset

which is equivalent to the well-known clique condition $n>3f$ , tight for Byzantine consensus. Analogously, one can show that in a clique, 1-reach and 2-reach are equivalent to $n>f$ and $n>2f$ , respectively.
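The reach conditions can be explored on toy graphs with a brute-force check. The following Python sketch is illustrative only; the edge-set representation and the names `reach`, `subsets_up_to` and `satisfies_reach` are assumptions made for this example, and the enumeration is exponential in $f$.

```python
from itertools import combinations


def reach(nodes, edges, v, F):
    """reach_v(F): nodes outside F from which v is reachable once F is removed."""
    alive = set(nodes) - set(F)
    if v not in alive:
        raise ValueError("v must lie outside F")
    preds = {n: set() for n in alive}
    for a, b in edges:
        if a in alive and b in alive:
            preds[b].add(a)
    seen, stack = {v}, [v]
    while stack:  # backward search from v along surviving edges
        cur = stack.pop()
        for p in preds[cur] - seen:
            seen.add(p)
            stack.append(p)
    return seen


def subsets_up_to(nodes, f):
    """All subsets of nodes with at most f elements."""
    return [set(c) for k in range(f + 1) for c in combinations(nodes, k)]


def satisfies_reach(nodes, edges, f, k):
    """Brute-force check of k-reach for k in {1, 2, 3}."""
    if k not in (1, 2, 3):
        raise ValueError("only k = 1, 2, 3 are covered by this sketch")
    nodes = list(nodes)
    candidates = subsets_up_to(nodes, f)
    common = candidates if k in (1, 3) else [set()]   # the shared set F
    private = candidates if k in (2, 3) else [set()]  # the sets F_v and F_u
    for F in common:
        for Fv in private:
            for Fu in private:
                for v in set(nodes) - F - Fv:
                    for u in set(nodes) - F - Fu:
                        rv = reach(nodes, edges, v, F | Fv)
                        ru = reach(nodes, edges, u, F | Fu)
                        if not (rv & ru):
                            return False
    return True
```

On cliques the outcome matches the thresholds stated above: with $f=1$, the clique on 4 nodes satisfies 3-reach, while the clique on 3 nodes satisfies 2-reach but not 3-reach.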

We next present $k$ -reach, a generalization of the above conditions that defines a family of conditions encompassing them.

Definition 1.

For any sets $F,F_{v}^{1},F_{u}^{1},\ldots,F_{v}^{\lfloor k/2\rfloor},F_{u}^{\lfloor k/2\rfloor}$ , each of cardinality at most $f$ ,

\textbf{$k$-reach: }\begin{cases}reach_{v}(F_{v}^{1}\cup\ldots\cup F_{v}^{k/2})\cap reach_{u}(F_{u}^{1}\cup\ldots\cup F_{u}^{k/2})\neq\emptyset&\text{ if $k$ is even}\\ reach_{v}(F\cup F_{v}^{1}\cup\ldots\cup F_{v}^{(k-1)/2})\cap reach_{u}(F\cup F_{u}^{1}\cup\ldots\cup F_{u}^{(k-1)/2})\neq\emptyset&\text{ if $k$ is odd}\end{cases}

In the following theorem, we show that conditions 1-reach, 2-reach, and 3-reach are equivalent to CCS, CCA, and BCS, respectively.

Theorem 7.

(a) CCS $\Leftrightarrow$ 1-reach
(b) CCA $\Leftrightarrow$ 2-reach
(c) BCS $\Leftrightarrow$ 3-reach

Proof.

$(a)$ Condition 1-reach is trivially equivalent with the existence of a directed rooted tree in $G_{\bar{F}}$ as presented in (Tseng and Vaidya, 2015). In turn, the equivalence of the latter condition with CCS has been proven in (Tseng and Vaidya, 2015; Sakavalas and Tseng, 2019).

$(b)$ Direction “ $\Rightarrow$ ” is implicitly proven in (Tseng and Vaidya, 2015); the claim is implied by the proof of Lemma 7 in (Tseng and Vaidya, 2015). We next prove direction “ $\Leftarrow$ ”.
If CCA does not hold in $G$ , then there exists a partition $L,R,C$ of $V$ with $L,R\neq\emptyset$ such that $L\cup C\stackrel{{\scriptstyle f+1}}{{\not\longrightarrow}}{R}$ and $R\cup C\stackrel{{\scriptstyle f+1}}{{\not\longrightarrow}}{L}$ . Observe that $|\mathcal{N}^{-}_{L}|,|\mathcal{N}^{-}_{R}|\leq f$ . This is because $L,R,C$ is a partition of $V$ and thus $\mathcal{N}^{-}_{L}\subseteq R\cup C$ and $\mathcal{N}^{-}_{R}\subseteq L\cup C$ ; since CCA is not satisfied, the claim holds. Subsequently, let $v\in L,u\in R$ ; these nodes exist since $L,R\neq\emptyset$ as per the definition of CCA. Note that the two sets $F_{v}=\mathcal{N}^{-}_{L},F_{u}=\mathcal{N}^{-}_{R}$ have cardinality at most $f$ and, since removing all incoming neighbors of $L$ (resp. $R$ ) leaves only nodes of $L$ (resp. $R$ ) able to reach $v$ (resp. $u$ ), the following holds,

reach_{v}(\mathcal{N}^{-}_{L})\cap reach_{u}(\mathcal{N}^{-}_{R})\subseteq L\cap R=\emptyset

and thus condition 2-reach does not hold.

$(c)$ For any sets $F,F_{v}\subseteq V$ with $|F|,|F_{v}|\leq f$ and node $w\in V\setminus(F\cup F_{v})$ we will use $reach_{w}^{G_{\bar{F}}}(F_{v})$ as defined in Definition 3. Note that by Definitions 4 and 5, condition BCS is equivalent to the following condition: for all sets $F\subseteq V$ with $|F|\leq f$ , CCA holds in $G_{\bar{F}}$ .

Due to the equivalence of CCA with 2-reach in the previous step $(b)$ , it holds that BCS is equivalent with the following condition: For all sets $F,F_{v},F_{u}\subseteq V$ with $|F|,|F_{v}|,|F_{u}|\leq f$ ,

reach_{v}^{G_{\bar{F}}}(F_{v})\cap reach_{u}^{G_{\bar{F}}}(F_{u})\neq\emptyset

Thus BCS is equivalent to,

reach_{v}(F\cup F_{v})\cap reach_{u}(F\cup F_{u})\neq\emptyset

which coincides with condition 3-reach.

∎

Appendix B Necessity of condition 3-reach

We next show that condition 3-reach is necessary for asynchronous Byzantine approximate consensus to be achieved in a network. With $e\stackrel{{\scriptstyle v}}{{\sim}}e^{\prime}$ we will denote the fact that execution $e$ is indistinguishable from execution $e^{\prime}$ with respect to node $v$ (cf. (Lynch, 1996)). Note that, considering an approximate consensus algorithm, $e\stackrel{{\scriptstyle v}}{{\sim}}e^{\prime}$ implies that node $v$ will output the same value in both executions $e,e^{\prime}$ . To facilitate the proof, for $A,B\subseteq V$ we will use the notation $E(A,B)=\{(v,u):v\in A,u\in B\}$ to denote all edges from set $A$ to set $B$ .

Theorem 1 (Impossibility of Approximate Consensus).

Byzantine asynchronous approximate consensus is impossible in networks where condition 3-reach is not satisfied.

Proof.

Consider a network $G=(V,E)$ where condition 3-reach is not satisfied and assume the existence of algorithm $\mathcal{A}$ that achieves asynchronous approximate consensus in $G$ . This means that there exist sets $F,F_{v},F_{u}\subseteq V$ with $|F|,|F_{v}|,|F_{u}|\leq f$ , and nodes $v\in\overline{F\cup F_{v}}$ , $u\in\overline{F\cup F_{u}}$ such that:

(8)

reach_{v}(F\cup F_{v})\cap reach_{u}(F\cup F_{u})=\emptyset

We define the following three executions of $\mathcal{A}$ determined by the set of faulty nodes and their behavior, the inputs of all nodes and the communication delay. Observe that a Byzantine fault may simply deviate from the protocol by crashing and not sending any message since the Byzantine fault model is strictly stronger than the crash fault. Also, note that we can assume an external notion of time (global clock), not directly observable by the nodes, to facilitate the explicit description of delays. The latter technique has been considered in (Georgiou et al., 2013; Sakavalas and Tseng, 2018).

( $e_{1}$ ) The input of every node $z\in V$ is $x_{z}=0$ , all nodes in $F_{v}$ have crashed from the beginning of the execution and all other nodes are nonfaulty; the latter is possible since $|F_{v}|\leq f$ .

( $e_{2}$ ) The input of every node $z\in V$ is $x_{z}=\epsilon$ , all nodes in $F_{u}$ have crashed from the beginning of the execution and all other nodes are nonfaulty; the latter is possible since $|F_{u}|\leq f$ .

( $e_{3}$ ) Inputs: The input of every node $z\in reach_{v}(F\cup F_{v})$ is $x_{z}=0$ , the input of every node $w\in reach_{u}(F\cup F_{u})$ is $x_{w}=\epsilon$ ; these inputs are well defined because of Eq. 8. All remaining nodes in $V\setminus(reach_{v}(F\cup F_{v})\cup reach_{u}(F\cup F_{u}))$ have arbitrary inputs.

Delivery delays: Message delivery delays are the same as in $e_{1}$ and $e_{2}$ , except for the delays of all messages transmitted through edges $E(F_{v},reach_{v}(F\cup F_{v}))$ and all messages transmitted through edges $E(F_{u},reach_{u}(F\cup F_{u}))$ . We assume that the delivery delay of the latter messages is lower bounded by an arbitrary number $T\in\mathbb{N}$ of time-steps. The exact value of $T$ will be defined in the following. Message deliveries through all other edges are instant.

Faulty set behavior: Node set $F$ is faulty and behaves towards $reach_{v}(F\cup F_{v})$ as in $e_{1}$ and towards $reach_{u}(F\cup F_{u})$ as in $e_{2}$ . More concretely, all messages transmitted through edges in $E(F,reach_{v}(F\cup F_{v}))$ are identical to the messages transmitted through $E(F,reach_{v}(F\cup F_{v}))$ in $e_{1}$ and all messages transmitted through edges in $E(F,reach_{u}(F\cup F_{u}))$ are identical to the messages transmitted in $e_{2}$ . Observe that Eq. 8 implies that $E(F,reach_{v}(F\cup F_{v}))\cap E(F,reach_{u}(F\cup F_{u}))=\emptyset$ and thus the latter behavior is well defined. This holds because if there exists an edge $(w,z)\in E(F,reach_{v}(F\cup F_{v}))\cap E(F,reach_{u}(F\cup F_{u}))$ , then $w\in F$ and $z\in reach_{v}(F\cup F_{v})\cap reach_{u}(F\cup F_{u})$ ; the latter contradicts Eq. 8. Also, this behavior is possible under the Byzantine fault model since $|F|\leq f$ and Byzantine faults can have any arbitrary behavior. (This is a standard argument used towards indistinguishability in distributed systems. Details proving that this faulty behavior is well defined can be found in (Pagourtzis et al., 2017a).)

Note that all three executions are well defined due to the fact that condition 3-reach is not satisfied (Eq. 8). Since $e_{1}$ is a well defined execution of $\mathcal{A}$ , there is a specific time point $t_{L}$ by which node $v$ terminates in $e_{1}$ . Moreover, in order to satisfy the validity condition, $v$ outputs the value $0$ upon termination in $e_{1}$ . Similarly, since $e_{2}$ is a well defined execution of $\mathcal{A}$ , there is a specific time point $t_{R}$ by which node $u$ terminates in $e_{2}$ , outputting value $\epsilon$ . We now assume that the lower bound $T$ for all delivery delays described in execution $e_{3}$ is any $T$ with

T>\max\{t_{L},t_{R}\}

Now consider execution $e_{3}$ ; all messages received by $v$ before it terminates are the same in executions $e_{1},e_{3}$ . Therefore $e_{3}\stackrel{{\scriptstyle v}}{{\sim}}e_{1}$ holds and by the previous argument $v$ will output $0$ in execution $e_{3}$ . Similarly $e_{3}\stackrel{{\scriptstyle u}}{{\sim}}e_{2}$ holds and $u$ will output $\epsilon$ in execution $e_{3}$ . Thus, the convergence property is violated, since the outputs of $v$ and $u$ differ by $\epsilon$ .

∎

Appendix C Proof of Theorem 14

The next theorem states that there is always an overlap between certain pairs of source components of reduced graphs. For ease of presentation we prove the theorem for condition BCS, which was proved equivalent to 3-reach in Theorem 7.

Theorem 1.

Proof.

Observe that by definition of a source component of a reduced graph it holds that

(9)		$\displaystyle\mathcal{N}^{-}_{S_{F_{v},F_{u}}}\subseteq F_{v}\cup F_{u}$
(10)		$\displaystyle\mathcal{N}^{-}_{S_{F_{v},F_{w}}}\subseteq F_{v}\cup F_{w}$

If $S_{F_{v},F_{u}}\cap S_{F_{v},F_{w}}=\emptyset$ then we can consider the following partition $L,R,F,C$ of $V$ .

$\displaystyle L=S_{F_{v},F_{u}}$
$\displaystyle R=S_{F_{v},F_{w}}$
$\displaystyle F=F_{v}$
$\displaystyle C=V\setminus(L\cup R\cup F)$

We will next show that $R\cup C\stackrel{{\scriptstyle f+1}}{{\not\longrightarrow}}{L}$ and $L\cup C\stackrel{{\scriptstyle f+1}}{{\not\longrightarrow}}{R}$ . First, observe that $L,R\neq\emptyset$ by the definition of source component and the fact that BCS holds. Moreover, we have that,

(R\cup C)\cap F=(S_{F_{v},F_{w}}\cup(V\setminus S_{F_{v},F_{u}}\setminus S_{F_{v},F_{w}}\setminus F_{v}))\cap F_{v}=S_{F_{v},F_{w}}\cap F_{v}=\emptyset

where the latter equation holds by definition of $S_{F_{v},F_{w}}$ . Thus, by Equation (9) we have that $\mathcal{N}^{-}_{L}\cap(R\cup C)\subseteq F_{u}$ , which implies $|\mathcal{N}^{-}_{L}\cap(R\cup C)|\leq f$ , which in turn implies that $R\cup C\stackrel{{\scriptstyle f+1}}{{\not\longrightarrow}}{L}$ . Similarly, $L\cup C\stackrel{{\scriptstyle f+1}}{{\not\longrightarrow}}{R}$ can be shown using Equation (10). These two facts imply that partition $L,R,F,C$ violates condition BCS; a contradiction, since BCS is equivalent to 3-reach by Theorem 7. ∎

Appendix D Proof of Theorem 10

Theorem 1.

Any nonfaulty node $v$ will eventually execute Filter-and-Average during a parallel execution for a set $F^{\prime}$ .

Proof.

Assume, for the sake of contradiction, that $v$ does not execute Filter-and-Average in any parallel execution. Due to lines 1-1 of BW, this means that the shared variable $nextround$ remains false. Consider the parallel execution for $F^{\prime}=F$ , where $F$ is the actual faulty set. Next observe that due to Lemma 6, $v$ will satisfy the Maximal-Consistency condition. The same holds for all other nonfaulty nodes. Consequently, since all nodes in $reach_{v}(F)$ are nonfaulty, they will all eventually FIFO-Flood $COMPLETE(F)$ and $v$ will eventually FIFO-receive all of these messages through nonfaulty paths in $G_{\overline{F}}$ , satisfying the FIFO-Receive-All condition. Moreover, by Lemma 8, $v$ will eventually satisfy the $Completeness(\mathcal{M}^{c},F_{u})$ condition corresponding to any received $(\mathcal{M}^{c},COMPLETE(F_{u}))$ message, since all these messages will be received through fully nonfaulty paths, and thus function $Verify(\mathcal{M}_{v})$ will evaluate to true. Finally, since the $nextround$ variable is false, node $v$ will execute Filter-and-Average, a contradiction. ∎

Appendix E The Redundant Flood algorithm (RedundantFlood)

We next present the natural algorithm for flooding a message throughout all redundant paths in the network. To avoid a trivial adversary attack where the adversary lies about the propagation path, each time a node $v$ receives a message $(x,p)$ from node $u$ we assume that $v$ checks if $ter(p)=u$ and rejects the message if this is not the case. This is a feasible check since edges represent reliable communication channels where the recipient knows the identity of the sender.

Input: sender node identifier $s$ , sender’s value $x$

Code for $s$ : send message $(x,\langle s\rangle)$ to all outgoing neighbors. $\triangleright$ local broadcast

Code for $v\neq s$ :

if $m=(x,p)$ is the first message with $path(m)=p$ received from a node $u\in\mathcal{N}^{-}_{v}$ then

    if $w\in\mathcal{N}_{v}^{+}\text{ and }p||v||w$ is a redundant path then

        send message $(x,p||v)$ to $w$

Algorithm 4 RedundantFlood
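The flooding rule can be simulated on a small fault-free graph. The following Python sketch is illustrative only and makes several assumptions: the network is given as a dictionary `out_neighbors`, messages are delivered one at a time from a queue, and the predicate `is_simple` (accepting paths that repeat no node) is merely a stand-in for the redundant-path test, whose definition is not reproduced in this appendix. The function names are hypothetical.

```python
from collections import deque


def is_simple(path):
    """Stand-in predicate (assumption): allow only paths that repeat no node."""
    return len(set(path)) == len(path)


def redundant_flood(out_neighbors, s, x, allowed=is_simple):
    """Simulate fault-free flooding of value x from s.

    Returns {node: {path: value}} with the messages each node accepted.
    """
    received = {v: {} for v in out_neighbors}
    # Each queue entry is (actual sender, receiver, value, claimed path).
    queue = deque((s, w, x, (s,)) for w in out_neighbors[s])
    while queue:
        sender, v, value, p = queue.popleft()
        if p[-1] != sender:
            continue  # reject: ter(p) is not the neighbor that sent the message
        if p in received[v]:
            continue  # only the first message with this path is processed
        received[v][p] = value
        if v == s:
            continue  # the forwarding rule applies to nodes other than s
        for w in out_neighbors[v]:
            if allowed(p + (v, w)):
                queue.append((v, w, value, p + (v,)))
    return received
```

For example, with `out_neighbors = {0: [1, 2], 1: [2], 2: []}` and sender 0, node 2 accepts the value along both paths `(0,)` and `(0, 1)`.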

Appendix F The FIFO-Flood and FIFO-Receive procedures

In the following we present a high-level description of the FIFO Flood and FIFO Receive procedures used in algorithm 1.

FIFO-Flood. During this procedure, each node $i$ maintains a FIFO-counter which is incremented any time the node sends a message. The counter is appended to every message sent by node $i$ . We stress that this counter is a shared variable between all parallel threads at node $i$ .

FIFO-Receive. We will say that node $v$ FIFO-receives a message $m$ from node $u$ propagated through a FIFO-Flood procedure if $v$ has also received all previous messages (with respect to FIFO-counter) sent by $u$ . More concretely, if $v$ FIFO-receives $m$ with FIFO-counter $k$ initiated by node $u$ , then $v$ must have received all messages initiated by $u$ with FIFO-counters $1,\ldots,k-1$ .

The latter procedures clearly implement FIFO channels between nonfaulty nodes which are connected via fully nonfaulty paths. Trivially, the ordering of messages propagated through fully nonfaulty paths will be maintained since all FIFO-counters will be propagated correctly. In the case of a path containing a faulty node, it is obvious that message order is impossible to maintain since the adversary can change the order arbitrarily under any protocol; this holds since the adversary can filter all information propagated through this path.
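The FIFO-Receive rule can be sketched as a small buffering component. The following Python class is a minimal illustration, assuming counters start at 1 and ignoring the propagation paths that real messages carry; the class name `FifoReceiver` and its method are hypothetical.

```python
class FifoReceiver:
    """Buffers messages per originator and releases them in counter order."""

    def __init__(self):
        self.next_expected = {}  # originator -> next counter to release
        self.pending = {}        # originator -> {counter: payload}

    def on_receive(self, origin, counter, payload):
        """Store a message; return the payloads that become FIFO-received."""
        nxt = self.next_expected.setdefault(origin, 1)
        buf = self.pending.setdefault(origin, {})
        if counter >= nxt:
            buf.setdefault(counter, payload)  # keep the first copy only
        delivered = []
        while nxt in buf:  # release every message whose predecessors arrived
            delivered.append(buf.pop(nxt))
            nxt += 1
        self.next_expected[origin] = nxt
        return delivered
```

A message with counter 2 that arrives before counter 1 is held back; once counter 1 arrives, both are released in order.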

Appendix G Proof of Theorem 19

Theorem 1 (Validity).

$\forall r\geq 0,U[r]\leq U[0]$ and $\mu[r]\geq\mu[0]$

Proof.

We prove the claim by induction on the round index $r$ . For $r=0$ both inequalities trivially hold. Assume that the claim holds for round $r$ . For any nonfaulty node $u$ we can show that $\min(O^{\prime}_{u}[r])\geq\mu[r]$ and $\max(O^{\prime}_{u}[r])\leq U[r]$ with arguments identical to those used in the proof of Lemma 18. Consequently, since $x_{u}[r+1]=\frac{\max(O_{u}^{\prime}[r])+\min(O_{u}^{\prime}[r])}{2}$ , $\mu[r]\leq x_{u}[r+1]\leq U[r]$ holds. Combined with the induction hypothesis, this yields $\mu[0]\leq\mu[r+1]$ and $U[r+1]\leq U[0]$ . ∎
