University of Texas at Austin, Austin, TX 78712, [email protected] University of Texas at Austin, Austin, TX 78712, [email protected] \CopyrightXiong Zheng, and Vijay K. Garg\ccsdesc[100]Theory of Computation Distributed Algorithms \supplement
Acknowledgements.
Byzantine Lattice Agreement in Asynchronous Systems
Abstract
We study the Byzantine lattice agreement (BLA) problem in asynchronous distributed message passing systems. In the BLA problem, each process proposes a value from a join semi-lattice and needs to output a value also in the lattice such that all output values of correct processes lie on a chain despite the presence of Byzantine processes. We present an algorithm for this problem with round complexity of which tolerates Byzantine failures in the asynchronous setting without digital signatures, where is the number of processes. We also show how this algorithm can be modified to work in the authenticated setting (i.e., with digital signatures) to tolerate Byzantine failures.
keywords:
Lattice Agreement, Byzantine, Gradecastcategory:
\relatedversion1 Introduction
In distributed systems, reaching agreement in the presence of process failures is a fundamental task. Understanding the kind of agreement that can be reached helps us understand the limitation of distributed systems with failures. Consensus [15] is the most fundamental problem in distributed computing. In this problem, each process proposes some input value and has to decide on some output value such that all correct processes decide on the same valid output. In synchronous message systems with crash failures, consensus cannot be solved in fewer than rounds [8]. In asynchronous systems, consensus is impossible in the presence of even one crash failure [10]. The -set agreement [4] is a generalization of consensus, in which processes can decide on at most values instead of just one single value. The -set agreement cannot be solved in asynchronous systems if the number of crash failures [2, 11]. The paper [5] shows that -set agreement problem cannot be solved within rounds if in crash failure model.
The lattice agreement problem was proposed by Attiya et al [1] to solve the atomic snapshot object problem in shared memory systems. In this problem, each process has input and needs to output such that the following properties are satisfied.
1) Downward-Vaility: for each correct process .
2) Upward-Validity: .
3) Comparability: for any two correct processes and , either or .
The lattice agreement problem is a weaker problem than the consensus problem and the -set agreement problem. It can be solved in rounds in synchronous systems and tolerate crash failures [20]. In asynchronous systems, it can also be solved in rounds but can only tolerate crash failures [19].
Attiya et al in [1] presents a generic algorithm to transform any protocol for the lattice agreement problem to a protocol for implementing an atomic snapshot object in shared memory systems. This transformation can be easily implemented in message passing systems by replacing each read and write step with sending “” and “” messages to all and waiting for acknowledgements from different processes. On the other hand, if we can implement an atomic snapshot object, lattice agreement can also be solved easily both on shared-memory and message passing systems with only crash failures. Thus, solving the lattice agreement problem in message passing systems is equivalent to implementing an atomic snapshot object in message passing systems with only crash failures.
Using lattice agreement protocols, Faleiro et al [9] gives procedures to build a special class of linearizable and serializable replicated state machines which only support query operations and update operations but not mixed query-update operations. Later, Xiong et al [19] proposes some optimizations for their procedure for implementing replicated state machines from lattice agreement in practice. They propose a method to truncate the logs maintained in the procedure in [9]. The recent paper [17] by Skrzypczak et al proposes a protocol based on generalized lattice agreement, which is a multi-shot version of lattice agreement problem, to provide linearizability for state based conflict-free data types [16] the procedure given in [9] in terms of memory consumption, at the expense of progress.
In message passing systems with crash failures, the lattice agreement problem is well studied [1, 19, 20, 13]. The best upper bound for both synchronous systems and asynchronous systems is rounds. In the Byzantine failure model, a variant of the lattice agreement problem is first studied by Nowak et al [14]. Then, Di Luna et al [7] proposes a validity condition which still permits the application of lattice agreement protocol in obtaining atomic snapshots and implementing a special class of replicated state machines. They present an rounds algorithm for the Byzantine lattice agreement problem in asynchronous message systems. For synchronous message systems, a recent preprint [18] gives three algorithms. The first algorithm takes rounds and has the early stopping property. The second and third algorithm takes and rounds but are not early stopping. All three algorithms can tolerate failures. They also show how to modify their algorithms to work for authenticated settings and tolerates failures. The preprint [6] presents an algorithm which takes rounds which can tolerate failures and shows how to improve resilience to by using digital signatures and public-key infrastructure.
In this work, we present new algorithms for the Byzantine lattice agreement (BLA) problem in asynchronous message passing systems. In this problem, each process has input from a join semi-lattice with being the set of elements in the lattice, being the partial order defined on , and being the join operation. Each process has to output some such that the following properties are satisfied. Let denote the set of correct processes in the system and denote the actual number of Byzantine processes in the system.
Comparability: For all and , either or .
Downward-Validity: For all , .
Upward-Validity: , where and .
Our main contribution is summarized as follows.
Theorem 1.1.
There is an rounds algorithm for the BLA problem in asynchronous systems which can tolerate Byzantine failures, where is the number of processes in the system. The algorithm takes messages.
Theorem 1.2.
There is a rounds algorithm for the BLA problem in authenticated asynchronous systems which can tolerate Byzantine failures, where is the number of processes in the system. The algorithm takes messages.
2 System Model
We assume a distributed asynchronous message system with processes with unique ids in . The communication graph is a clique, i.e., each process can send messages to any other process in the system. We assume that the communication channel between any two processes is reliable. There is no upper bound on message delay. We assume that processes can have Byzantine failures but at most processes can be Byzantine in any execution of the algorithm. We use parameter to denote the actual number of Byzantine processes in a system. By our assumption, we must have . Byzantine processes can deviate arbitrarily from the algorithm. We say a process is correct or non-faulty if it is not a Byzantine process. We consider both systems with and without digital signatures. In a system with digital signatures, Byzantine processes cannot forge the signature of correct processes.
3 Rounds Algorithm for the Asynchronous BLA Problem
In this section, we present an algorithm for the BLA problem in asynchronous systems which takes rounds of asynchronous communication and tolerates Byzantine failures. The high level idea of the algorithm is to apply a Byzantine tolerant classifier procedure, similar to the classifier procedure in [20] for crash failure systems, to divide a group of processes into the slave subgroup and the master subgroup such that the values of the slave group is less than the values of the master group. Then, by recursively applying such a classifier procedure within each subgroup, eventually all processes have comparable values. The classifier procedure for crash failure model only needs to guarantee that the value a correct slave process is at most the value of any correct master process and the size of the union of all values of correct slave processes is at most . The parameter is a knowledge threshold.
Fig. 1 shows the classification tree which specifies the threshold parameter for each group when the classifier procedure is invoked recursively. Before recursively invoking the classifier procedure, an initial round is used to let all processes exchange their input values. After this initial round, each process obtains at least values. Each node in the tree represents a group. We also use label to indicate the threshold parameter of a group. Initially, all processes are in the same group with label . Let be a group with label at level . The master group (the right child node) of has label . The slave group (the left child node) of has label . We can observe that all labels in the classification tree up to level are unique. The above properties of the classifier procedure guarantee that processes in a leaf group must have the same value.
In presence of Byzantine processes, the above properties are not enough for recursively applying such classifier procedure within each subgroup. A Byzantine process in a slave group can introduce some new values which are not known by some master process. We introduce the notion of admissible values for a group (to be defined later), which is the set of values that processes in this group can ever have. We present a Byzantine tolerant classifier procedure with threshold parameter which provides the following three properties: 1) Each correct slave process has at most values and each correct master process has more than values. 2) The admissible values of the slave group is a subset of the value of any correct master process. 3) The union of all admissible values in the slave group has size less than the threshold parameter .
Suppose now we have a classifier which guarantees the above properties. The main algorithm, shown in Fig. 2, proceeds in asynchronous rounds. Each process has a label , which is used as the threshold parameter when it invokes the classifier procedure. Initially, each process has the same label . The label of a process is updated at each round according to the classification tree. In some places of our algorithm, we use the reliable broadcast primitive defined by Bracha [3] to send values. In this primitive, a process uses RB_broadcast to send a message and uses RB_broadcast to reliably deliver a message. This primitive guarantees many nice properties. In our algorithm, we need the following two main properties: 1) If a message is reliably delivered by some correct process, then this message will eventually be reliably delivered by each correct process. 2) If a correct process reliably delivers a message from process , then each correct process reliably delivers the same message from .
In the initial round at line 1-2, process RB_broadcast its input to all and waits for RB_deliver from different processes. Then, it updates its value set to be the set of values reliably delivered at this round. When reliable delivering a value, process adds this value into its safe value set for the initial group . The reliable delivering procedure is running on background. So this safe value set for the initial group keeps growing. By the properties of reliable broadcast, this safe value set can only contain at most one value from each process. This ensures Upward-Validity.
After the initial round, we can assume that all values in the initial safe value set of each process are unique, which can be done by associating the sender’s id with the value.
From line 3 to line 8, process executes the classifier procedure (to be presented later) for rounds. At each round, it invokes the classifier procedure to decide whether it is classified as a slave or master and then updates its value accordingly. At round , if process is a master, it updates its label to be . Otherwise, if updates its label to be .
Applying the above three properties provided by the classifier procedure, by induction on the round number, we can readily see that at the end of round , for any two correct process and , if they are in the same group, say with label , then both and must have exactly values and their union also has exactly values. Then, and must have the same set of values. If they are in different group, then by recursively applying property 2, the values of one process must be subset of the values of the other process.
We now show a Byzantine tolerant classifier which satisfies the properties we define.
Code for process :
: input value : output value
: label of process . Initially, :
: value set held by process at round of the algorithm
Map : denote the safe value set for group
/* Initial Round */
1: RB_broadcast(), wait for RB_deliver() from
2: Set as the set of values reliably delivered
/* Round 1 to */
3: for to
4: () :=
5: if then
6: else
7: end for
8:
Upon RB_deliver from
3.1 The Byzantine Tolerant Classifier
The Byzantine tolerant classifier procedure, shown in Fig. 7, is inspired by the asynchronous classifier procedure given in [19] for the crash failure model. We say a process writes a value to at least processes if it sends a “” message containing the value to all processes and waits for different processes to send acknowledgement back. We say a process reads from at least processes if it sends a “” message to all processes and waits for at least processes to send their current values back. We say a process performs a write-read step if it writes its value to at least processes and reads their values.
In the asynchronous classifier procedure for the crash failure model [19], to divide a group into a slave subgroup and a master subgroup, each process in the group first writes its value to at least processes and then reads from at least processes. After that, each process checks whether the union of all values read has size greater than the threshold parameter or not. If true, it is classified as a master process, otherwise, it is classified as a slave process. Slave processes keep their value the same. To guarantee that the value of each slave process is less than the value of each master process, each master process performs a write-read step to write the values obtained at the read step to at least processes and read the values from them. Then it updates its value to be the union of all values read. The second read step guarantees that the size of the union of values of slave processes is at most , since the last slave process which completes the write step must be able to read all values of slave processes.
Constructing such a classifier procedure in presence of Byzantine processes is much more difficult. In order to adapt the above procedure to work in Byzantine failure setting, we need to address the following challenges. First, in the write step or read step, when a process waits for at least different processes to send their values back, a Byzantine process can send arbitrary values. Second, simply ensuring that the values of a slave process is a subset of values of each master process is not enough, since a Byzantine process can introduce some values unknown to a master process in the slave group. For example, even if we can guarantee that the current value of each slave process is less that the value of each master process, in a later round, a Byzantine process can send some new value to a slave process which is unknown to some master process. This is possible in an asynchronous systems since messages can be arbitrarily delayed. Third, ensuring that the union of all values in the slave group has size at most is quite challenging. A simple second read step does not work any more since the last process which completes the write step might be a Byzantine process.
To prevent the first problem, in the Byzantine classifier procedure, when a process wants to perform a write step or read step, it applies the reliable broadcast primitive to broadcast its value. When a process waits for values from at least processes, it only accepts a value if the value is a subset of the values reliably delivered by this process. By property of reliable broadcast, this ensures that each accepted value must be reliably broadcast by some process, which prevents Byzantine processes from introducing arbitrary values.
To tackle the second and third problem, the key idea is to restrict the values that a Byzantine process, which claims itself to be a slave process, can successfully reliable broadcast in later rounds. To achieve that, first we require that that a slave process can only reliable broadcast the value that it has reliably broadcast in the previous round. This prevents Byzantine processes from introducing arbitrary new values into a slave group. Second, we require each process which claims itself as a slave process to prove that it is indeed classified as a slave at the previous round when it tries to reliable broadcast a value at the current round by presenting the set of values it used to do classification. To enforce the above two requirements, we add a validity condition when a process echoes a message in the reliable broadcast primitive. However, this is not enough, since the value of a Byzantine slave process might not be known to a master process if broadcast value of the Byzantine process is arbitrarily delayed. To ensure that the value a Byzantine process reliably broadcast to be read by each correct master process, we force a Byzantine process who wants to be able to reliable broadcast a value in the slave group at next round to actually write its value to at least correct processes, i.e., at least correct processes must have received the value of a Byzantine process before each correct master process tries to read from at least correct processes. These two sets of correct processes must have at least one correct process in common since .
The validity condition we add in the reliable broadcast ensures that the admissible values for the slave group must be subset of the value set of any correct master process and the union of these values has size at most the threshold .
Each process which is classified as master is not required to prove its group identity but the value it tries to broadcast has to be a subset of safe value sets of correct processes. Similar to the asynchronous classifier procedure in [19], to ensure that the value of a slave process is less than the value of a master process, the master process needs to write to and read from at least processes after it is classified as a master process.
3.1.1 Bounded Reliable Broadcast
BRB_broadcast()
denotes the type of the message to be sent, either “” or “”
is an array which is a proof of sender’s group identity
is the value to be sent, is the label of the sender, is the round number
The valid function is defined in Fig. 5
Broadcast INIT to all
Upon receiving INIT
if (first reception of INIT
wait until
Broadcast ECHO to all
endif
Upon receiving ECHO
if ECHO is received from at least different processes
READY has not yet broadcasted
Broadcast READY
endif
Upon receiving READY
if READY received from different processes READY has not been broadcasted
Broadcast READY
endif
if READY received from different processes has not been delivered
BRB_deliver
endif
Before explaining the Byzantine classifier procedure in detail, we modify the reliable broadcast primitive by adding a condition when a process echoes a broadcast message. This condition (to be explained later) restricts the admissible values for each group. For completeness, the modified reliable broadcast procedure is shown in Fig. 3. When a process reliable broadcast a value, it also includes the round number, the label of the group it belongs to and a proof of its group identity. The proof is an array of size denoting the values read by the sender at previous round. When a process receives a broadcast message from process , it waits for the validity condition to hold and then echoes the message. We say a process BRB_broadcasts a message if it executs BRB_broadcast procedure with the massage. We say a process BRB_delivers a message if it executes BRB_deliver with this message.
3.1.2 Group and Admissible Values
In our algorithm, each process has a label , which serves as the threshold when it invokes the classifier procedure. The notion of group defined as below is based on labels of processes.
Definition 3.1 (group).
A is a set of processes which have the same label. The label of a group is the label of the processes in this group. The label of a group is also the threshold value processes in this group use to do classification.
We also use label to indicate a group. We say a process is in group if its message is associated with label . Initially all processes are within the same group with label . The label of each process is updated at each round based on whether it is classified as a slave or a master.
We introduce the notion of admissible values for a group, which are the set of values that processes in the group can ever have.
Definition 3.2 (admissible values for a group).
The admissible values for a group with label is the set of values that can be reliably delivered with label if they are reliably broadcast by some process (possibly Byzantine) with label .
In our classifier, each process in group updates its value set to a subset of the values which are reliably delivered with label . Thus, the value set of each process in group must be a subset of the admissible values for group . Our algorithm ensures that this property holds continues to hold until the end of the algorithm.
3.2 The Classifier
The classifier procedure for process , shown in Fig. 7, has three input parameters: is the current value set of process , is the threshold value used to do the classification, which is also the current label of process , and is the round number.
Classifier for :
: input value set : threshold value : round number
/* Each process keeps track of the following variables */
Array . denotes the label of process sent along its values at round
Map . denotes a safe value set for group
Map . denotes the set of values accepted with label , initially
Map . denote the values process read from process at round .
Map . denote the values process read from process at round .
/* write step*/
1: if then else
2: BRB_broadcast, wait for from different processes
/* read step*/
3: BRB_broadcast, wait for s.t. from
4: Set if , otherwise
/* Classification */
5: Let
6: if /* height is greater than its label */
/* write-read step */
7: Send to all, wait for from s.t.
8: Define
9: return (, master)
10: else
11: return (, slave)
Upon BRB_Deliver
if
/* Construct safe value set for group */
/* Record the label of a process at round */
Send to
elif
Send to
endif
Upon receiving from
wait until
Send to
In line 1-2, process writes its current value set to at least processes by using the BRB_broadcast procedure to send a “” message. If process is classified as a slave at the previous round, it needs to include the array of values it read from at least processes at previous round as a proof of its group identity. This proof is used by every other process in the valid function to decide whether to echo the “” message or not. When process BRB_delivers a “” message with label at round , it includes the value in it into its safe value set for group . The safe value set is used to restrict the set of values that can be delivered in the master group . Due to this step, we can see that the admissible values in the master subgroup must be a subset of the admissible values at the current group. Process also includes the value contained in the “” message into , which stores the set of values reliably delivered with label at round .
From line 3 to line 4, process reads values from at least processes by using the BRB_broadcast procedure to send a “” message to all. In the valid function, each process echos a “” message from process only if it has BRB_delivered the “” message from process sent at line 2. This step is used to ensure that for any process, possibly Byzantine, to read from other processes, it must have written its value to at least correct processes, otherwise it cannot have enough processes echo its “” message in the BRB_broadcast. When process BRB_delivers a “” message with label from process at round , it records the set of values it has reliably delivered with label in . Then process sends back a message along with the set of reliably delivered values with label at round to process . At line 3, after the “” message is sent, process has to wait for valid message from processes. A message is valid if the value set contained in it is a subset of , which is the set of values reliably delivered with label at round . Consider a message from a correct process . Since is correct, each value in must have been reliably delivered by process . By property of reliable broadcast, each value in will eventually be reliably delivered by process , thus . Thus, eventually process can obtain valid message. At line 4, process records the set of valid s obtained at line 3 into array . So, this array stores the values reliably delivered with label that process read from all processes. This array is used to do classification in line 5-11 and also used as the proof of group identity of process when it writes at next round.
Line 5-11 is the classification step. Process is classified as a master process if the size of the union of valid values obtained in the read step is greater than its label , otherwise, it is classified as a slave process. If it is classified as a slave process, it returns its input value set. If it is classified as a master process, process performs a write-read step by sending a message which includes the set of values it uses to do classification to all and wait for valid message back at line 7. Similar to line 3, a message is valid if each value contained in it has been reliably delivered with correct label. When a process receives a message with value set and label at round , it first waits until all values in are reliably delivered. Then it sends back a message along with the set of values reliably delivered with label at round . The waiting is used to ensure that each value in is valid, i.e., be reliably delivered, because a Byzantine process can send arbitrary values in its message at line 7. By a similar reasoning as line 3, process will eventually obtain valid message from at least different processes. After the write-read step, at line 8, process updates its value set to be the union of values obtained at line 7.
function for process :
if
return
else
return
endif
function for process :
if
return
else
return
The valid function is defined in Fig. 5. In the this function, we first consider the “” messages. If the message has been sent by a process that claims to be a master, then it is considered valid if the value in this message is contained in the safe value set . If the message has been sent by a process that claims to be a slave, then process checks (1) whether process has BRB_delivered the “” message containing the same value at the previous round, (2) whether the entry in array matches the value process read from in the previous round, and (3) whether the the number of values contained in the proof is at most . The condition (1) ensures that a slave process sends the same value as the previous round since a correct slave process must keep its value same as in the previous round. The condition (2) ensures that the proof sent by the slave process uses values that it read at round . The condition (3) checks that the sender classified itself correctly.
If the message is a “” with label at round , process considers it as valid if it BRB_deliverd a “” message with label at round from the sender. This is used to make sure that the sender (possibly Byzantine) must complete its write step in line 1-2 before trying to read at line 3-4.
The function invoked in the valid function simply checks whether the label of the sender matches the label update rule by comparing it with the label at previous round.
3.3 Proof of Correctness
We first define the notion of committing a message.
Definition 3.3.
We say a process commits a message if it reliably broadcasts the message and the message is reliably delivered. We say a process commits a message at time if this message is reliably delivered by the first process at time .
By properties of reliable broadcast, we observe that each process (possibly Byzantine) can commit at most one “” message and at most one “” message at each round.
Variable | Definition |
A group of processes at round with label | |
The slave subgroup of , i.e., the processes with label at round | |
The master subgroup of , i.e., the processes with label at round | |
The value set of process at the beginning of round | |
The safe value map of process at the beginning of round is the safe value set of process for group at the beginning of round | |
The set of admissible values for group at round |
Define and . The variables we use in the proof are shown in Table. 1. Consider the classification step in group at round . The following lemma shows that if a Byzantine process wants to commit a “” message at round with a slave label, then it must commit a “” message which contains the same value as and a “” message at round with label . Also, it must commit its “” message before its “” message at round with label .
Lemma 3.4.
Suppose that process (possibly Byzantine) commits a write message
. Then
1) The message and the message must be committed by process .
2) Let denote the time that message is committed. Then, the message must have been reliably delivered by at least correct processes before time .
Proof 3.5.
For correct process , the claim is obvious.
Suppose is Byzantine. For it to commit a message with label at round , its broadcast message has to be echoed by at least different processes. Thus, it has to prove the values it read at round to at least different correct processes, which implies that it must commit . For its read message to be reliably delivered, by the condition to echo a read message, we know that process ’s write message with label and value at round must be reliably delivered by at least correct processes.
The above lemma guarantees that if a Byzantine process wants to introduce some values in the admissible values of a slave group of group , then it must first complete its write step and then complete its read step. Enforcing this order guarantees that the last slave process (possibly Byzantine) which completes its write step at line 2 must be able to read all values committed by slave processes in its read step at line 3. This set of values are exactly the set of admissible values for the slave group, since a slave process can only commit a “” message which contains the same value as its “” message at previous round by the valid function. Then, since this last slave process is a valid slave process, which is verified in the valid function, we have that the union of all admissible values for the slave group has size at most .
The following lemma shows that the classifier guarantees the properties we defined.
Lemma 3.6.
Let be a group at round with label . Let and be two nonnegative integers such that . If for each correct process , and , then
(p1) For each correct process ,
(p2) For each correct process ,
(p3)
(p4)
(p5)
(p6)
(p7) For each correct process ,
(p8) Each correct process can commits its value set at round , i.e.,
(p9) Each correct process can commit its value set at round , i.e.,
(p10)
(p11)
Proof 3.7.
(p1)-(p2) : Immediate from the classifier procedure.
(p3): A slave process can only commit the write message that it has reliably broadcast at previous round. Thus, .
(p4): The safe value set of each correct process for group is the union of values reliably broadcast by processes in group at round . Thus, .
(p5): Immediate from
(p6): Consider group at round . From Lemma LABEL:lem:unique_label, we know that this group must be the slave group of group at round . Let denote the set of processes who commit a write message at round with label . For each process , let denote the message that is committed by process . Then, . From part 2) of Lemma 3.4, we have that for each process the message must have been reliably delivered by at least correct processes. Let process be the last process such that its write message is reliably delivered by at least different correct processes. Let denote the set of correct processes which echoed message . We have . By the condition of echoing a “” message, we have for each .
Consider an arbitrary process . Let denote the set of the first correct processes which reliably delivered message . Since , there exists a correct process . Let denote the time that process sets its as . This happens after is reliably delivered by process . From part 2) Lemma of 3.4, the message must be reliably delivered by at least correct processes before time . Since is the last process in such that its write message is reliably delivered by at least different correct processes, process ’s write message must be reliably delivered by each process in before time . Hence, when process sets as , the value set has been added into by process . Thus, . Since is an arbitrary process in , we have that for each process , there exists a process such that . Hence, . Since , we have .
(p7): Consider group at round . Let denote the set of processes who commit a “” message at round with label . For each process , let denote the message that is commitd by process . Then, . We show that for each , . From Lemma 3.4, we have the message must have been reliably delivered by at least correct processes. Let denote the set of correct processes which echoed for message . We have . Then, process ’s “” message and write message must be reliably delivered by each process in . Also, by the echoing condition, we have for each and . Thus for each .
Let denote the set of correct processes which delivered the message sent by correct process at line 7 of round and we have . Since , there exists a correct process such that it reliably delivers of process and message from process . We show that process must deliver before . Suppose that process delivers before for contradiction. Then, we have when process delivers . We also have that process must reliable deliver after . Thus, when delivers . Since , we have . Since process is correct and , then . Thus, , contradiction. Therefore, process delivers before . Then, when receives from , we must have . Thus, .
(p8): Since process is correct, at round , it must read from at least correct processes. Let denote this set of correct processes. Then, at round , each process in will echo the reliable broadcast message of process . Thus, there will be at least echo messages. Since , we have . Hence, eventually the message of will be reliable delivered.
(p9): Consider round , from line 8, we know that . From the property of reliable broadcast, any value is reliably broadcast by some process and will eventually be reliably delivered by each correct process. Hence, value will be included into the safe value set of each correct process for the group with label . Thus, at round , will be eventually reliable delivered by each correct process.
(p10): Implied by and .
(p11): Implied by and .
The following lemma shows that the value set of a correct process is non-decreasing.
Lemma 3.8.
For any correct process and round , .
Proof 3.9.
A slave process keeps its value set unchanged and a master process updates its value set to be the set of reliably delivered values which contains its own value set.
The following lemma is used later to show that processes in the same group at the end of the algorithm must have the same set of values. A similar lemma is given in [20, 19]. We defer its detailed proof in Appendix A.
Lemma 3.10.
Let be a group of processes at round with label . Then
(1) for each correct process ,
(2)
Proof 3.11.
By induction on round number and apply , , and of Lemma 3.6.
Lemma 3.12.
Let and be two correct processes that are within the same group with label at the beginning of round . Then and are equal.
Proof 3.13.
Let be the parent of with label . Assume without loss of generality that . The proof for the case follows in the same manner. Since is a group at round , by Lemma 3.10, we have:
(1) for each correct process , , and
(2)
Since and , (1) and (2) hold for both process and . By the assumption that , process and execute the Classifier procedure with label and are both classified as master. Let and , then by applying Lemma 3.6() we have and , thus . By () of Lemma 3.6, we have . Thus, . Therefore, and are equal at the beginning of round .
Lemma 3.14.
(Comparability) For any two correct process and , and are comparable.
See 1.1
Proof 3.16.
Downward-Validity. After the initial round, . By Lemma 3.8, we have . Thus . Since , we have .
Comparability follows from Lemma 3.14.
Upward-Validity. In the initial round, reliable broadcast is used to construct the safe value set for the initial group with label . By property of reliable broadcast, each Byzantine process can introduce at most one value into the safe value set for group . After the initial round, all admissible values for each group must be subset of the values reliably delivered at the initial round. Thus, the union of all value sets held by correct processes must subset of the union of and a set such that . Therefore, , where and .
Remark 3.17.
The reliable broadcast primitive we use can be replaced by a more efficient one proposed by Imbs et al [12], which can only tolerate Byzantine failures but takes 2 asynchronous communication rounds. This suffices for our application.
4 Rounds Algorithm for Authenticated BLA Problem
In this section, we present an rounds algorithm for the BLA problem in authenticated (i.e., assuming digital signatures and public-key infrastructure) setting that can tolerate Byzantine failures by modifying the Byzantine tolerant classifier procedure in previous section. The Byzantine classifier procedure in authenticated setting is shown in Fig. 6. The primary difference lies in what a process does when it reliably delivers some message and the validity condition for echoing a broadcast message. The basic idea is to let a process sign the message that it needs to send. Each process uses the set of signed messages as proof of its completion of a write step or read step. In this section, we use to denote a message signed by process , i.e., , where is the signature produced by process using its private signing key. We say a message is correctly signed by process if the signature within the message is a correct signature produced by process .
4.1 The Authenticated Byzantine Tolerant Classifier
The classifier in the authenticated setting is shown in Fig. 6. The primary difference between the classifier in previous section and the authenticated classifier is that in the authenticated classifier each process uses signed messages as proof of its group identity.
Classifier:
: input value set : threshold value : round number
Each process keeps track of the same variables as the classifier in Fig. 7
Set , which stores the set of signed message in the read step of previous round
/* write step */
1: if then else
2: BRB_broadcast, wait for valid from
3: Let denote the set of delivered at line 2
/* read step */
4: Send to all, wait for valid s.t. from
5: Set
/* Classification */
6: Let
/* write-read step */
7: if /* height of is greater than its label */
8: Send to all, wait for s.t. from
9: Define
10: return (, master)
11: else
12: return (, slave)
Upon BRB_deliver
if
Send to
endif
Upon receiving from
if
Send to
endif
Upon receiving from
wait until
Send to
At lines 1-2, each process writes its current value set by using the BRB_broadcast procedure to send a “” message. If the process is a slave process, it also includes the set of at least signed messages it received at the previous round as a proof that it is indeed classified as a slave. At line 2, each process waits for correctly signed message from at least different processes. This set of signed message is used as the proof of its completion of the write step when this process tries to read from other processes. When a process BRB_delivers a “” message, it performs similar steps as the algorithm in previous section except that it sends a signed message back.
At line 4-5, each process reads from at least processes. Different from the classifier procedure in previous section, each process directly sends a read message along with the set of correctly signed messages obtained at line 2 to all (instead of using the BRB_broadcast procedure). When a process receives a “” message with label for round , if uses the validSignature function to check whether the “” message contains correctly signed message for round from at least different processes. If so, it sends back to the sender a signed message along with the reliably delivered values with label at round . This ensures that if a process (possibly Byzantine) tries to read from correct processes, it must complete its write step first.
The classification step from line 6-12 is the same as the classification step of the algorithm in previous section. A mater process performs a write-read step by sending a message along the set of value obtained at line 6. Then it waits for valid messages and updates its value set to be the set of values contained in these messages. When a process receives a message, it performs the same steps as in the classifier in previous section.
The valid function is different from the one given in previous section. First, only “” messages are reliably broadcast. Second, the proof is a set of signed messages instead of an array in previous section. To verify the proof, the valid function invokes the validSignature function to check whether the proof contains correctly signed message for previous round from at least different processes.
function :
if
return
else
return
endif
function :
if ( contains correctly signed from processes)
( contains correctly signed from processes)
return
else
return
endif
For the proof of correctness, we just need to prove the classifier procedure satisfies the properties given Lemma 3.6 under the assumption that . The proof of and is similar to the proof in previous section. Thus, we do not give the formal mathematical proof.
Lemma 4.1.
Proof 4.2.
(p1)-(p5) : similar to the proofs given in Lemma 3.6.
(p6): Consider group at round . Let denote the set of processes who commit their “” message with label at round . For process (possibly Byzantine) to commit this message, this message has to be echoed by at least different processes. By the condition of echo, process must have received correctly signed message from at least processes at line 4 of round , among which there are at least correct processes. When a process tries to read at round , it has to prove that it has received at least correctly signed messages. Thus, process must have committed a write message and this message has been reliably delivered by at least correct processes. Let process be the last process in such that its write message is reliably delivered by at least different correct processes. When process read from at least different correct processes at round , each process in must have written its value in at least different correct processes. Then, since by the assumption of , process must have read all values reliably broadcast by processes in . These values are exactly the set . Since is a slave process and the size of must be at most , we have .
(p7): Consider group at round . For a process to commit its value set at round , process must reliably broadcast the same value set as round and must be able to verify the value set it reads at round . By a similar argument as the proof for , process ’s value set must be reliably delivered by at least correct processes at round before its reading step. Let denote the set of correct processes which delivered process ’s value before its reading step at round and . Let denote the set of correct processes which delivered the message sent by process at line 11 of round and . Then, since by the assumption of , there exists a correct process such that it delivered both and at round . Since process is a slave process, process must send to process before sending to process , otherwise, process would be classified as master. Then, process must receive from process and include into . Thus, .
(p8): Since process is correct, at round , it must have received at least correctly signed message. Then, at round , it can prove to at least correct processes. Hence, the write message of at round will be reliably delivered by each correct process.
(p9): Consider round , from line 8, we know that . From the property of reliable broadcast, any value is reliably broadcast by some process with label at round and will eventually be reliably delivered by each correct process. Hence, value will be included into the safe value set of each correct process for the group . Then, eventually for each correct process . Thus, when process tries to reliable broadcast at round , the echo condition eventually holds for each correct process .
See 1.2
5 Conclusion
In this paper, we present an rounds algorithm for the Byzantine lattice agreement problem in asynchronous systems which can tolerates Byzantine failures. We also give an rounds algorithm for the authenticated setting that can tolerate Byzantine failures. One open problem left is to design an algorithm which has resilience of and takes rounds.
References
- [1] Hagit Attiya, Maurice Herlihy, and Ophir Rachman. Atomic snapshots using lattice agreement. Distributed Computing, 8(3):121–132, 1995.
- [2] Martin Biely, Zarko Milosevic, Nuno Santos, and Andre Schiper. S-paxos: Offloading the leader for high throughput state machine replication. In 2012 IEEE 31st Symposium on Reliable Distributed Systems, pages 111–120. IEEE, 2012.
- [3] Gabriel Bracha. Asynchronous Byzantine agreement protocols. Information and Computation, 75(2):130–143, 1987.
- [4] Soma Chaudhuri. More choices allow more faults: Set consensus problems in totally asynchronous systems. Inf. Comput., 105(1):132–158, 1993.
- [5] Soma Chaudhuri, Maurice Erlihy, Nancy A Lynch, and Mark R Tuttle. Tight bounds for k-set agreement. Journal of the ACM (JACM), 47(5):912–943, 2000.
- [6] Giuseppe Antonio Di Luna, Emmanuelle Anceaume, Silvia Bonomi, and Leonardo Querzoni. Synchronous byzantine lattice agreement in rounds. arXiv preprint arXiv:2001.02670, 2020.
- [7] Giuseppe Antonio Di Luna, Emmanuelle Anceaume, and Leonardo Querzoni. Byzantine generalized lattice agreement. arXiv preprint arXiv:1910.05768, 2019.
- [8] Danny Dolev and H Raymond Strong. Authenticated algorithms for Byzantine agreement. SIAM Journal on Computing, 12(4):656–666, 1983.
- [9] Jose M Faleiro, Sriram Rajamani, Kaushik Rajan, G Ramalingam, and Kapil Vaswani. Generalized lattice agreement. In Proceedings of the 2012 ACM symposium on Principles of distributed computing, pages 125–134. ACM, 2012.
- [10] Michael J Fischer, Nancy A Lynch, and Michael S Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM (JACM), 32(2):374–382, 1985.
- [11] Maurice Herlihy, Sergio Rajsbaum, and Mark R Tuttle. Unifying synchronous and asynchronous message-passing models. In Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing, pages 133–142, 1998.
- [12] Damien Imbs and Michel Raynal. Trading off t-resilience for efficiency in asynchronous byzantine reliable broadcast. Parallel Processing Letters, 26(04):1650017, 2016.
- [13] Marios Mavronicolasa. A bound on the rounds to reach lattice agreement. http://www.cs.ucy.ac.cy/ mavronic/pdf/lattice.pdf, 2018.
- [14] Thomas Nowak and Joel Rybicki. Byzantine approximate agreement on graphs. In 33rd International Symposium on Distributed Computing (DISC 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.
- [15] Marshall Pease, Robert Shostak, and Leslie Lamport. Reaching agreement in the presence of faults. Journal of the ACM (JACM), 27(2):228–234, 1980.
- [16] Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. Conflict-free replicated data types. In Symposium on Self-Stabilizing Systems, pages 386–400. Springer, 2011.
- [17] Jan Skrzypczak, Florian Schintke, and Thorsten Schütt. Linearizable state machine replication of state-based crdts without logs. arXiv preprint arXiv:1905.08733, 2019.
- [18] Xiong Zheng and Vijay Garg. Byzantine lattice agreement in synchronous systems. arXiv preprint arXiv:1910.14141, 2019.
- [19] Xiong Zheng, Vijay K Garg, and John Kaippallimalil. Linearizable replicated state machines with lattice agreement. arXiv preprint arXiv:1810.05871, 2018.
- [20] Xiong Zheng, Changyong Hu, and Vijay K Garg. Lattice agreement in message passing systems. In 32nd International Symposium on Distributed Computing (DISC 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.
Appendix A Proof of Lemma 3.10
See 3.10
Proof A.1.
By induction on . Consider the base case with , . After the initial round, each process must receive at least different values and at most values by the property of reliable broadcast. Thus, . Any value in must be reliably delivered by at least one correct process at the initial round. Thus, .
For the induction step, assume the above lemma holds for all groups at round . Consider an arbitrary group at round with label . Let be the parent group of at round with label . Consider the Classifier procedure executed by all processes in with label . By induction hypothesis, we have:
(1) for each correct process ,
(2) .
Let and , then (1) and (2) are exactly the conditions of Lemma 3.6. Consider the following two cases:
Case 1: . Then . From () and () of Lemma 3.6, we have:
(1) for each correct process ,
(2) .
Case 2: . Then . Similarly, from () and () of Lemma 3.6, we have the same equations.