This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Asynchronous Byzantine Reliable Broadcast
With a Message Adversary

Timothé Albouy,  Davide Frey,  Michel Raynal,  François Taïani 
 
Univ Rennes, IRISA, CNRS, Inria, 35042 Rennes, France
{timothe.albouy,michel.raynal,francois.taiani}@irisa.fr, [email protected]
Abstract

This paper considers the problem of reliable broadcast in asynchronous authenticated systems, in which nn processes communicate using signed messages and up to tt processes may behave arbitrarily (Byzantine processes). In addition, for each message mm broadcast by a correct (i.e., non-Byzantine) process, a message adversary may prevent up to dd correct processes from receiving mm. (This message adversary captures network failures such as transient disconnections, silent churn, or message losses.) Considering such a “double” adversarial context and assuming n>3t+2dn>3t+2d, a reliable broadcast algorithm is presented. Interestingly, when there is no message adversary (i.e., d=0d=0), the algorithm terminates in two communication steps (so, in this case, this algorithm is optimal in terms of both Byzantine tolerance and time efficiency). It is then shown that the condition n>3t+2dn>3t+2d is necessary for implementing reliable broadcast in the presence of both Byzantine processes and a message adversary (whether the underlying system is enriched with signatures or not).

Keywords: Asynchronous system, Byzantine processes, Churn, Message adversary, Message losses, Message-passing, Message signatures, Reliable broadcast, Transient disconnection.

1 Introduction

Reliable broadcast

Introduced in the mid-eighties, Reliable Broadcast is a fundamental communication abstraction that lies at the center of fault-tolerant asynchronous distributed systems. Formally defined in [7, 8], it allows each process to broadcast messages in the presence of process failures, with well-defined delivery properties111The term delivery refers here to the application layer where a process receives and processes the content of an application message (see Section 2.1).. In turn, these properties make it possible to design provably correct distributed software for upper-layer applications based on such a broadcast abstraction.

Intuitively, reliable broadcast guarantees that the non-faulty processes deliver the same set of messages, which includes at least all the messages they broadcast. This set may also contain messages broadcast by faulty processes. The fundamental property of reliable broadcast lies in the fact that no two non-faulty processes deliver different sets of messages [9, 24].

When some processes may suffer from Byzantine failures [20], designing a reliable broadcast communication abstraction that tolerate such failures is far from trivial. Such an algorithm is called Byzantine-tolerant reliable broadcast (BRB) and we say that a process brb-broadcasts and brb-delivers messages. The most famous BRB algorithm is due to Bracha [7] (1987). For an application message, this algorithm gives rise to three sequential communication steps and up to (n1)(2n+1)(n-1)(2n+1) implementation messages sent by correct processes. This algorithm requires n>3tn>3t, which is optimal in terms of fault tolerance.

Recent works related to reliable broadcast

Due to its fundamental nature, BRB has been addressed by many authors. Here are a few recent results. Similarly to Bracha’s algorithm, all these algorithms assume an underlying fully connected reliable network.

  • The versatility dimension of Bracha’s algorithm has been analyzed in [18, 25].

  • Addressing efficiency issues, the BRB algorithm presented in [19] implements the reliable broadcast of an application message with only two communication steps and up to n21n^{2}-1 implementation messages sent by correct processes. The price to pay for this gain in efficiency is a weaker tt-resilience, namely t<n/5t<n/5. Hence, this algorithm and Bracha’s algorithm differ in their trade-off between tt-resilience and message/time efficiency.

  • Scalable BRB is addressed in [17]. The goal of this work is to avoid paying the O(n2)O(n^{2}) message complexity price. To this end, the authors use a non-trivial message-gossiping approach which allows them to design a sophisticated BRB algorithm satisfying probability-dependent properties.

  • BRB in dynamic systems is addressed in [16] (dynamic means that a process can enter and leave the system at any time). In their article, the authors present an efficient BRB algorithm for such a context. This algorithm assumes that, at any time, there are at least two times more correct processes than Byzantine ones in the system.

  • An efficient algorithm for BRB with long inputs of bb bits using lower costs than bb single-bit instances is presented in [21]. This algorithm, which assumes t<n/3t<n/3, achieves the best possible communication complexity of Θ(nb)\Theta(nb) input sizes. This article also presents an authenticated extension of this solution.

The work presented in this paper222A very preliminary version of this work appeared in [3]. goes beyond the previous proposals by considering the conjunction of two types of adversary: as in the above works, processes may be Byzantine, but in addition a message adversary may also remove implementation messages between correct processes. More precisely, this work addresses the problem of fault-tolerant reliable broadcast in asynchronous nn-process message-passing systems enriched with message signatures, in which up to tt processes are Byzantine, and a message adversary that may prevent up to dd non-Byzantine processes from delivering an implementation message broadcast by a non-Byzantine process. This dual fault model originated from our research on the reconciliation of local process states in distributed Byzantine-tolerant money transfer systems (a.k.a. cryptocurrencies), in which processes become temporarily disconnected. Several researchers have indeed pointed out the fundamental role that broadcast abstractions play in Byzantine money transfer systems (see, for instance, [5, 11, 12, 13, 15, 16]). This crucial role naturally leads to considering how Byzantine broadcast can be expanded to more volatile and dynamic settings, thus motivating our proposal to combine traditional Byzantine faults with a message adversary.

Acronyms Meaning
BRB Byzantine-tolerant reliable broadcast
MA Message adversary
MBRB Message adversary- and Byzantine-tolerant reliable broadcast
Notations Meaning
nn number of processes in the network
tt upper bound on the number of Byzantine processes
dd power of the message adversary
cc effective number of correct processes in a run (ntcnn-t\leq c\leq n)
\ell minimal nb of correct processes that mbrb-deliver a message
λ\lambda time complexity of MBRB
μ\mu message complexity of MBRB
Table 1: Acronyms and notations

The paper is made up of 5 sections.

  • Section 2 defines the computing model and the Message Adversary-Tolerant Byzantine Reliable Broadcast communication abstraction (or MBRB for short).

  • Section 3 presents a signature-based algorithm implementing the MBRB abstraction, proves it is correct, and evaluates its cost. When there is no message adversary, this algorithm is optimal from both Byzantine resilience and the number of communication steps333The signature-free BRB algorithm described in [7] is optimal with respect to Byzantine resilience (t<n/3t<n/3), but requires three communication steps, while the signature-free BRB algorithm described in [19] is optimal with respect to the number of communication steps (2) but is not with respect to Byzantine resilience (it requires t<n/5t<n/5)..

  • Section 4 shows that the condition n>3t+2dn>3t+2d is necessary and sufficient for implementing the MBRB communication abstraction (be the underlying system enriched with signatures or not).

  • Finally, Section 5 concludes the article.

2 Computing Model and MBRB Abstraction

2.1 Computing Model

Process model

The system is composed of nn asynchronous sequential processes denoted p1p_{1}, …, pnp_{n}. Each process pip_{i} has an identity, and all the identities are different and known by all processes. To simplify, we assume that ii is the identity of pip_{i}.

Regarding failures, up to tt processes can be Byzantine, where a Byzantine process is a process whose behavior does not follow the code specified by its algorithm [20, 22]. Let us notice that Byzantine processes can collude to fool the non-Byzantine processes (also called correct processes). Let us also notice that, in this model, the premature stop (crash) of a process is a Byzantine failure.

Moreover, given an execution, cc denotes the number of processes that effectively behave correctly in that execution. We always have ntcnn-t\leq c\leq n. While this number remains unknown to correct processes, it is used in the following to analyze and characterize (more precisely than using its worse value ntn-t) the guarantees provided by the proposed algorithms.

Communication model

The processes communicate through a fully connected asynchronous point-to-point communication network. Although this network is assumed to be reliable—in the sense that it neither corrupts, duplicates, nor creates messages—it may nevertheless lose messages due to the actions of a message adversary (defined below).

Let msg be a message type and vv the associated value. A process can invoke the unreliable operation 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast} msg(v)\textsc{msg}(v), which is a shorthand for “for all i{1,,n}i\in\{1,\cdots,n\} do 𝗌𝖾𝗇𝖽\mathsf{send} msg(v)\textsc{msg}(v) to pjp_{j} end for”. It is assumed that all the correct processes invoke 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast} to send messages. As we can see, the operation 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast} msg(v)\textsc{msg}(v) is not reliable. As an example, if the invoking process crashes during its invocation, an arbitrary subset of processes receive the message implementation message msg(v)\textsc{msg}(v). Moreover, due to its very nature, a Byzantine process can send messages without using the macro-operation 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast}.

From a terminology point of view, at the system/network level, we say that messages are broadcast and received. Moreover, a message generated by the algorithm is said to be an implementation message (imp-message in short), while a message generated by the application layer is said to be an application message (app-message in short).

Message adversary

The notion of a message adversary (MA) was implicitly introduced in [27] (under the name transient faults and ubiquitous faults) and then used (sometimes implicitly) in many works (e.g., [2, 10, 26, 28, 29]). A short tutorial on message adversaries is presented in [23].

Let dd be an integer constant such that 0d<c0\leq d<c. The communication network is under the control of an adversary which eliminates imp-messages sent by processes, so that these imp-messages appear as being lost. More precisely, when a correct process invokes 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast} msg(v)\textsc{msg}(v), the message adversary is allowed to arbitrarily suppress up to dd copies of the imp-message msg(v)\textsc{msg}(v) that were intended to correct processes. This means that, despite the fact the sender is correct, up to dd correct processes may miss the imp-message msg(v)\textsc{msg}(v)444 A close but different notion was introduced by Dolev in [14] (and explored in subsequent works, such as [6]) which considers static kk-connected networks. If the adversary selects statically for each correct sender dd correct processes that do not receive this sender’s imp-messages, the proposed model includes Dolev’s model with k=ndk=n-d..

As an example, consider a set DD of correct processes, where 1|D|d1\leq|D|\leq d, such that during some period of time, the adversary suppresses all the imp-messages sent to them. It follows that, during this period of time, this set of processes appears as a set of correct processes that are (unknowingly) input-disconnected from the other correct processes. Depending on the message adversary, the set DD may vary with time. Let us notice that d=0d=0 corresponds to the weakest possible message adversary: it corresponds to a classical static system where some processes are Byzantine but no imp-message is lost (the network is fully reliable).

Let us remark that this type of message adversary is stronger, and therefore covers, the more specific case of silent churn, in which processes (nodes) may decide to disconnect from the network. While disconnected, such a process silently pauses its algorithm (a legal behavior in our asynchronous model), and is implicitly moved (by the adversary) to the DD adversary-defined set. Upon coming back, the node resumes its execution, and is removed from DD by the adversary.555So the notion of a message adversary implicitly includes the notion of imp-message omission failures.

Informally, in a silent churn environment, a correct process may miss imp-messages sent by other processes while it is disconnected from the network. The adjective “silent” in silent churn expresses the fact that no notification is sent on the network by processes whenever they leave or join the system: there is no explicit “attendance list” of connected processes, and processes are given no information on the status (connected/disconnected) of their peers. In this regard, the silent churn model diverges from the classical approach when designing dynamic distributed systems, in which processes send imp-messages on the network notifying their connection or disconnection [16]. The silent churn model is a good representation of real-life large-scale peer-to-peer systems, where peers leaving the network typically do so in a completely silent manner (i.e., without warning other peers).

Let us also observe that silent churn allows us to model input-disconnections due to process mobility. When a process moves from one location to another, the sender’s broadcasting range may not be large enough to ensure that the moving process remains input-connected. An even more prosaic example would be one where a user simply turns off their device, or disable its Internet connection, preventing it from receiving or sending any further imp-messages. In this context, we consider that the message adversary removes all the incoming imp-messages from the corresponding process until the device reconnects.

Let us mention that the loss of imp-messages caused by a message adversary may be addressed using a reliable unicast protocol. These protocols were originally introduced to provide reliable channels on top of an unreliable network subject to imp-message losses. The principle is simple: the sender keeps sending idempotent imp-messages at regular intervals through an unreliable channel until it receives an acknowledgement from the receiver. This principle notoriously lies at the core of the Transmission Control Protocol (TCP), although with important practical adaptations, as TCP uses timeouts to close a malfunctioning or otherwise idle connection, typically after a few minutes.

But because there is no way to detect that a process has crashed or disconnected in an asynchronous environment, an ideal reliable unicast protocol (i.e. one that keeps on re-transmitting until success) needs to treat disconnected processes the same way as slow processes or as if there were packet losses in the network: the sender will thus potentially send infinitely many imp-messages to a disconnected receiver. To overcome this issue, some works leverage causal dependencies to avoid resending old imp-messages: if an acknowledgement is received by the sender for a given imp-message, then it can stop resending the imp-messages that causally precede this imp-message and that have not been acknowledged yet (e.g. [13]). However, this approach still assumes that eventually, every communication channel lets some imp-messages pass, which is not always the case in our model, where the message adversary can permanently sever up to dd channels.

Digital signatures

We assume the availability of an asymmetric cryptosystem to sign data (in practice, imp-messages) and verify its authenticity. We assume signatures are secure, and therefore that the computing power of the adversary is bounded. Every process in the network has a public/private key pair. We suppose that the public keys are known to everyone, and that the private keys are kept secret by their owner. Everyone also knows the mapping between any process’ identity ii and its public key. Additionally, we suppose that each process can produce at most one signature per imp-message.

The signatures are used to cope with the net effect of the Byzantine processes and the fact that imp-messages broadcast (sent) by correct processes can be eliminated by the message adversary. A noteworthy advantage of signatures is that, despite the unauthenticated nature of the point-to-point communication channels, signatures allow correct processes to verify the authenticity of imp-messages that have not been directly received from their initial sender, but rather relayed through intermediary processes. Signatures provide us with a network-wide non-repudiation mechanism: if a Byzantine process issues two conflicting imp-messages to two different subsets of correct processes, then the correct processes can detect the malicious behavior by disclosing to each other the Byzantine signed imp-messages.666 The fact that the algorithm uses signed imp-messages does not mean that MBRB-broadcast requires signatures to be implemented, see [4].

2.2 Message Adversary-Tolerant Byzantine Reliable Broadcast (MBRB)

This paper introduces a new broadcast abstraction we have called Message Adversary-Tolerant Byzantine Reliable Broadcast (MBRB for short.) The MBRB communication abstraction is composed of two matching operations, denoted 𝗆𝖻𝗋𝖻_𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{mbrb\_broadcast} and 𝗆𝖻𝗋𝖻_𝖽𝖾𝗅𝗂𝗏𝖾𝗋\mathsf{mbrb\_deliver}. It considers that an identity (i,𝑠𝑛)(i,\mathit{sn}) (sender identity, sequence number) is associated with each app-message, and assumes that any two app-messages mbrb-broadcast by the same correct process have different sequence numbers. Sequence numbers are one of the most natural ways to design “multi-shot” reliable broadcast algorithms, that is, algorithms where the broadcast operation can be invoked multiple times with different app-messages.

When, at the application level, a process pip_{i} invokes 𝗆𝖻𝗋𝖻_𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍(m,𝑠𝑛)\mathsf{mbrb\_broadcast}(m,\mathit{sn}), where mm is the app-message and 𝑠𝑛\mathit{sn} the associated sequence number, we say pip_{i} “mbrb-broadcasts (m,𝑠𝑛)(m,\mathit{sn})”. Similarly, when pip_{i} invokes 𝗆𝖻𝗋𝖻_𝖽𝖾𝗅𝗂𝗏𝖾𝗋(m,𝑠𝑛,j)\mathsf{mbrb\_deliver}(m,\mathit{sn},j), where pjp_{j} is the sender process, we say pip_{i} “mbrb-delivers (m,𝑠𝑛,j)(m,\mathit{sn},j)”. We say that the app-messages are mbrb-broadcast and mbrb-delivered (while, as said previously, the imp-messages algorithm are broadcast and received).

Correctness specification

Because of the message adversary, we cannot always guarantee that an app-message mbrb-delivered by a correct process is eventually mbrb-delivered by all correct processes. Hence, in the MBRB specification, we introduce a variable \ell which indicates the strength of the global delivery guarantee of the primitive: if one correct process mbrb-delivers an app-message then \ell correct processes eventually mbrb-deliver this app-message.777If there is no message adversary (i.e., d=0d=0), we should have =cnt\ell=c\geq n-t. The MBRB-broadcast abstraction is defined by the following properties:

  • Safety:

    • MBRB-Validity (no spurious message). If a correct process pip_{i} mbrb-delivers an app-message mm from a correct process pjp_{j} with sequence number 𝑠𝑛\mathit{sn}, then pjp_{j} mbrb-broadcast mm with sequence number 𝑠𝑛\mathit{sn}.

    • MBRB-No-duplication. A correct process pip_{i} mbrb-delivers at most one app-message mm from a process pjp_{j} with sequence number 𝑠𝑛\mathit{sn}.

    • MBRB-No-duplicity. No two different correct processes mbrb-deliver different app-messages from a process pip_{i} with the same sequence number 𝑠𝑛\mathit{sn}.

  • Liveness:

    • MBRB-Local-delivery. If a correct process pip_{i} mbrb-broadcasts an app-message mm with sequence number 𝑠𝑛\mathit{sn}, then at least one correct process pjp_{j} eventually mbrb-delivers mm from pip_{i} with sequence number 𝑠𝑛\mathit{sn}.

    • MBRB-Global-delivery. If a correct process pip_{i} mbrb-delivers an app-message mm from a process pjp_{j} with sequence number 𝑠𝑛\mathit{sn}, then at least \ell correct processes mbrb-deliver mm from pjp_{j} with sequence number 𝑠𝑛\mathit{sn}.

It is implicitly assumed that a correct process does not use the same sequence number twice. Let us observe that, since at the implementation level the message adversary can always suppress all the imp-messages sent to a fixed set DD of dd processes, the best-guaranteed value for \ell is cdc-d. Furthermore, let us notice that the constraint n>2dn>2d prevents the message adversary from partitioning the system.

Performance metrics

In addition to the correctness specification, we define two metrics that capture the performance of an algorithm implementing the MBRB specification: λ\lambda and μ\mu, which respectively denote the communication step complexity and the imp-message complexity of the algorithm. They are defined as follows:

  • MBRB-Time-cost. If a correct process pip_{i} mbrb-broadcasts an app-message mm with sequence number 𝑠𝑛\mathit{sn}, then \ell correct processes mbrb-deliver mm from pip_{i} with sequence number 𝑠𝑛\mathit{sn} in at most λ\lambda communication steps.

  • MBRB-Message-cost. The mbrb-broadcast of an app-message by a correct process pip_{i} entails the sending of at most μ\mu imp-messages by correct processes.

Byzantine Reliable Broadcast (BRB)

If =c\ell=c (obtained when d=0d=0), the previous specification boils down to Bracha’s seminal specification [7], which defines the Byzantine reliable broadcast (BRB) communication abstraction. Hence, the BRB abstraction is a sub-case of MBRB.

3 A Signature-based Algorithm Implementing the MBRB Abstraction

This section presents Algorithm 1, which implements the MBRB communication abstraction in an asynchronous setting under the constraint n>3t+2d>0n>3t+2d>0. When considering d=0d=0, this algorithm provides both tt-tolerance optimality (as in [7]) and step optimality (as in [19]): it only assumes n>3tn>3t, and guaranteed mbrb-delivery of an app-message in only two communication steps888Signature-based BRB in only two communication steps is a known result [1], however, to the best of our knowledge, no existing BRB algorithm tolerates message adversaries as well as ours.. It follows that signatures can help save one communication step compared to classical signature-free BRB algorithms that assume t<n/3t<n/3. Algorithm 1 fulfills the MBRB-Global-delivery property with =cd\ell=c-d under the following assumption:

  • mbrb-Assumption: n>3t+2dn>3t+2d.

3.1 Preliminaries

Implementation message types

The algorithm uses only one imp-message type, bundle, that carries the signatures backing a given app-message mm, along with mm’s content, sequence number, and emitter. bundle imp-messages propagate through the network using controlled flooding.

Local data structures

Each (correct) process saves locally the valid signatures (i.e., the signed fixed-size digests of a certain data) that it has received from other processes using bundle imp-messages. Each signature “endorses” a certain app-message (m,𝑠𝑛,j)(m,\mathit{sn},j). When certain conditions are met (described below), a process further broadcasts in a bundle imp-message all signatures it knows for a given triplet (m,𝑠𝑛,j)(m,\mathit{sn},j). A correct process pip_{i} saves at most one signature for a given triplet (m,𝑠𝑛,j)(m,\mathit{sn},j) per signing process pkp_{k}.

Time measurement

For the proofs related to MBRB-Time-cost (Lemmas 7-10), we assume that the duration of local computations is negligible compared to that of imp-message transfer delays, and consider them to take zero time units. As the system is asynchronous, the time is measured under the traditional assumption that all the imp-messages have the same transfer delay.

3.2 Algorithm

At a high level, Algorithm 1 works by producing, forwarding, and accumulating witnesses of an initial mbrb-broadcast operation, until a large-enough quorum is observed by at least one correct process, at which point this quorum is propagated in one final unreliable 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast} operation.

Witnesses take the form of signatures for a given triplet (m,𝑠𝑛,i)(m,\mathit{sn},i), where mm is the app-message, 𝑠𝑛\mathit{sn} its associated sequence number and ii the identity of the sender pip_{i} (which also produces a signature for (m,𝑠𝑛,i)(m,\mathit{sn},i)). Signatures serve to ascertain the provenance and authenticity of these propagated bundle imp-messages, thus providing a key ingredient to tolerate the limited reliability of the underlying network. They also authenticate the invoker of the 𝗆𝖻𝗋𝖻_𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{mbrb\_broadcast} operation, and finally, in the last phase of the algorithm, they allow the propagation of a cryptographic proof that a quorum has been reached, thereby ensuring that enough correct processes eventually mbrb-deliver the app-message that was mbrb-broadcast.

operation 𝗆𝖻𝗋𝖻_𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍(m,𝑠𝑛)\mathsf{mbrb\_broadcast}(m,\mathit{sn}) is (1) save signature for (m,𝑠𝑛,i)(m,\mathit{sn},i) by pip_{i}; (2) 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast} bundle(m,𝑠𝑛,i,{all saved signatures for (m,𝑠𝑛,i)})\textsc{bundle}(m,\mathit{sn},i,\{\text{all saved signatures for }(m,\mathit{sn},i)\}). when bundle(m,𝑠𝑛,j,𝑠𝑖𝑔𝑠)\textsc{bundle}(m,\mathit{sn},j,\mathit{sigs}) is 𝗋𝖾𝖼𝖾𝗂𝗏𝖾𝖽\mathsf{received} do (3) if ((,𝑠𝑛,j)(-,\mathit{sn},j) not already mbrb-delivered \land 𝑠𝑖𝑔𝑠\mathit{sigs} contains the valid signature for (m,𝑠𝑛,j)(m,\mathit{sn},j) by pjp_{j}) then (4) save all unsaved valid signatures for (m,𝑠𝑛,j)(m,\mathit{sn},j) of 𝑠𝑖𝑔𝑠\mathit{sigs}; (5) if ((,𝑠𝑛,j)(-,\mathit{sn},j) not already signed by pip_{i}) then (6) save signature for (m,𝑠𝑛,j)(m,\mathit{sn},j) by pip_{i}; (7) 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast} bundle(m,𝑠𝑛,j,{all saved signatures for (m,𝑠𝑛,j)})\textsc{bundle}(m,\mathit{sn},j,\{\text{all saved signatures for }(m,\mathit{sn},j)\}) (8) end if; (9) if (strictly more than n+t2\frac{n+t}{2} signatures for (m,𝑠𝑛,j)(m,\mathit{sn},j) are saved) then (10) 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast} bundle(m,𝑠𝑛,j,{all saved signatures for (m,𝑠𝑛,j)})\textsc{bundle}(m,\mathit{sn},j,\{\text{all saved signatures for }(m,\mathit{sn},j)\}); (11) 𝗆𝖻𝗋𝖻_𝖽𝖾𝗅𝗂𝗏𝖾𝗋(m,𝑠𝑛,j)\mathsf{mbrb\_deliver}(m,\mathit{sn},j) (12) end if (13) end if.

Algorithm 1: A signature-based implementation of the MBRB communication abstraction (code for pip_{i})

In more detail, when a (correct) process pip_{i} invokes 𝗆𝖻𝗋𝖻_𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍(m,𝑠𝑛)\mathsf{mbrb\_broadcast}(m,\mathit{sn}), it builds and signs the triplet (m,𝑠𝑛,i)(m,\mathit{sn},i) to guarantee its non-repudiation, and saves locally the resulting signature (line 1). Next, pip_{i} broadcasts the bundle imp-message containing the signature that it just produced (line 1).

When a correct process pip_{i} receives a bundle(m,𝑠𝑛,j,𝑠𝑖𝑔𝑠)\textsc{bundle}(m,\mathit{sn},j,\mathit{sigs}) imp-message, it first checks if no app-message has already been mbrb-delivered for the given sequence number 𝑠𝑛\mathit{sn} and sender pjp_{j}, and if pjp_{j} signed the app-message (line 1). If this condition is satisfied, pip_{i} saves all the new valid signatures inside the 𝑠𝑖𝑔𝑠\mathit{sigs} set (line 1). Next, pip_{i} creates and saves its own signature for (m,𝑠𝑛,j)(m,\mathit{sn},j) and then broadcasts it in a bundle imp-message, if it has not already done so previously (line 1-1). Finally, if pip_{i} has saved a quorum of strictly more than n+t2\frac{n+t}{2} signatures for the same triplet (m,𝑠𝑛,j)(m,\mathit{sn},j), it broadcasts a bundle imp-message containing all these signatures and mbrb-delivers the triplet (lines 1-1).999The pseudo-code presented in Algorithm 1 favors readability, and is therefore not fully optimized. For instance, in some cases, a process might unreliably broadcast exactly the same content at lines  1 and 1. This could be avoided by either using an appropriate flag, or by tracking and preventing the repeated broadcast of identical bundle imp-messages.

Remark

The reader can notice that the system parameters nn and tt appear in the algorithm, whereas the system parameter dd does not. Naturally, they all explicitly appear in the proof.

3.3 Algorithm proof

This section proves the correctness and performance properties of MBRB.

Theorem 1.

If the mbrb-Assumption is satisfied, Algorithm 1 implements the mbrb-broadcast of an app-message by a correct process with the following guarantees:

  • =cd\ell=c-d correct processes,

  • λ={2ifd<cn+t2n+t2+13ifd<cc×n+t2>3otherwise}\lambda=\left\{\begin{array}[]{ll}2&\text{if}\leavevmode\nobreak\ \leavevmode\nobreak\ d<\frac{c-\lfloor\frac{n+t}{2}\rfloor}{\lfloor\frac{n+t}{2}\rfloor+1}\\ 3&\text{if}\leavevmode\nobreak\ \leavevmode\nobreak\ d<c-\sqrt{c\times\frac{n+t}{2}}\\ >3&\text{otherwise}\end{array}\right\} communication steps,

  • μ=2n2\mu=2n^{2} imp-messages.

The proof follows from the next lemmas.

Lemma 1 (MBRB-Validity).

If a correct process pip_{i} mbrb-delivers mm from a correct process pjp_{j} with sequence number 𝑠𝑛\mathit{sn}, then pjp_{j} has previously mbrb-broadcast mm with sequence number 𝑠𝑛\mathit{sn}.

Proof.

If a correct process pip_{i} mbrb-delivers (m,𝑠𝑛,j)(m,\mathit{sn},j) (where pjp_{j} is correct) at line 1, then it has passed the condition at line 1, which means that it must have witnessed a valid signature for (m,𝑠𝑛,j)(m,\mathit{sn},j) by pjp_{j}. Since signatures are secure, the only way to create this signature is for pjp_{j} to execute the instruction at line 1, during the 𝗆𝖻𝗋𝖻_𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍(m,𝑠𝑛)\mathsf{mbrb\_broadcast}(m,\mathit{sn}) invocation. ∎

Lemma 2 (MBRB-No-duplication).

A correct process pip_{i} mbrb-delivers at most one app-message from a process pjp_{j} with a given sequence number 𝑠𝑛\mathit{sn}.

Proof.

This property derives trivially from the condition at line 1. ∎

Lemma 3 (MBRB-No-duplicity).

No two different correct processes mbrb-deliver different app-messages from a process pip_{i} with the same sequence number 𝑠𝑛\mathit{sn}.

Proof.

Let us consider two correct processes pap_{a} and pbp_{b} which respectively mbrb-deliver (m,𝑠𝑛,i)(m,\mathit{sn},i) and (m,𝑠𝑛,i)(m^{\prime},\mathit{sn},i). Due to the condition at line 1, pap_{a} and pbp_{b} must have saved (and thus received) two sets QaQ_{a} and QbQ_{b} containing strictly more than n+t2\frac{n+t}{2} signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i) and (m,𝑠𝑛,i)(m^{\prime},\mathit{sn},i), respectively. We thus have |Qa|>n+t2|Q_{a}|>\frac{n+t}{2} and |Qb|>n+t2|Q_{b}|>\frac{n+t}{2}.

As we have |AB|=|A|+|B||AB||A|+|B|n>2×n+t2n=t|A\cap B|=|A|+|B|-|A\cup B|\geq|A|+|B|-n>2\times\frac{n+t}{2}-n=t, AA and BB have at least one correct process pkp_{k} in common, which must have signed both (m,𝑠𝑛,i)(m,\mathit{sn},i) and (m,𝑠𝑛,i)(m^{\prime},\mathit{sn},i). But before signing (m,𝑠𝑛,i)(m,\mathit{sn},i) at line 1 or 1, pkp_{k} checks that it did not sign a different app-message from the same sender and with the same sequence number, whether it be implicitly during a 𝖻𝗋𝖻_𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍(m,𝑠𝑛)\mathsf{brb\_broadcast}(m,\mathit{sn}) invocation or at line 1. Thereby, mm is necessarily equal to mm^{\prime}. ∎

Lemma 4 (MBRB-Local-delivery).

If a correct process pip_{i} mbrb-broadcasts an app-message mm with sequence number 𝑠𝑛\mathit{sn}, then at least one correct process pjp_{j} mbrb-delivers mm from pip_{i} with sequence number 𝑠𝑛\mathit{sn}.

Proof.

If a correct process pip_{i} mbrb-broadcasts (m,𝑠𝑛)(m,\mathit{sn}), then it broadcasts its own signature 𝑠𝑖𝑔i\mathit{sig}_{i} for (m,𝑠𝑛,i)(m,\mathit{sn},i) in a bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔i})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{i}\}) message at line 1. As pip_{i} is correct, it does not sign another triplet (m,𝑠𝑛,i)(m^{\prime},\mathit{sn},i) where mmm^{\prime}\neq m, therefore it is impossible for a correct process to mbrb-deliver (m,𝑠𝑛,i)(m^{\prime},\mathit{sn},i) at line 1, because it cannot pass the condition at line 1.

Let us denote by KK the set of correct processes that receive a message bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔i,})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{i},...\}) at least once. The first one of such bundle messages that a process of KK receives can be the one pip_{i} initially broadcast at line 1, but it can also be a bundle message broadcast by a correct process at lines 1 or 1, or it can even be a bundle message sent by a Byzantine process. In any case, the first time the processes of KK receive such a bundle message, they pass the condition at line 1, and they also pass the condition at line 1, except for pip_{i} if it belongs to KK. Consequently, each process pkp_{k} of KK necessarily broadcasts its own signature 𝑠𝑖𝑔k\mathit{sig}_{k} for (m,𝑠𝑛,i)(m,\mathit{sn},i) in a bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔k,𝑠𝑖𝑔i,})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{k},\mathit{sig}_{i},...\}) message.

By construction of the algorithm, the set KK of correct processes that receive a bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔i,})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{i},...\}) message is equal to the set of correct processes pkp_{k} that broadcast a bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔k,𝑠𝑖𝑔i,})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{k},\mathit{sig}_{i},...\}). By the definition of the message adversary, a message bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔k,𝑠𝑖𝑔i,})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{k},\mathit{sig}_{i},...\}) broadcast by a correct process pkp_{k} is eventually received by at least cdc-d correct processes. Hence, the minimum number of signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i) made by processes of KK that is also received by processes of KK globally is |K|(cd)|K|(c-d). It follows that a given process of KK individually receives on average the distinct signatures of at least |K|(cd)/|K|=cd|K|(c-d)/|K|=c-d processes of KK.

From mbrb-Assumption, we have 3t+2d<nn+3t+2d<2nn+t<2n2t2dn+t2<ntdcd3t+2d<n\iff n+3t+2d<2n\iff n+t<2n-2t-2d\iff\frac{n+t}{2}<n-t-d\leq c-d (as ntcn-t\leq c). As a result, at least one process pjp_{j} of KK (ergo one correct process) receives a set SS (in possibly multiple bundle messages) of strictly more than n+t2\frac{n+t}{2} valid distinct signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i). When pjp_{j} receives the last signature of SS, there are two cases:

  • Case if pjp_{j} does not pass the condition at line 1.

    As processes of KK are correct, then when they broadcast a bundle(m,𝑠𝑛,i,𝑠𝑖𝑔𝑠)\textsc{bundle}(m,\mathit{sn},i,\mathit{sigs}) message, they necessarily include 𝑠𝑖𝑔i\mathit{sig}_{i} in 𝑠𝑖𝑔𝑠\mathit{sigs}, which implies that 𝑠𝑖𝑔i\mathit{sig}_{i} is necessarily in SS. Therefore, if pjp_{j} does not pass the condition at line 1, it is because pjp_{j} already mbrb-delivered some (,𝑠𝑛,i)(-,\mathit{sn},i). But let us remind that, as pip_{i} is correct, it is impossible for pjp_{j} to mbrb-deliver anything different from (m,𝑠𝑛,i)(m,\mathit{sn},i). Therefore, pjp_{j} has already mbrb-delivered (m,𝑠𝑛,i)(m,\mathit{sn},i).

  • Case if pjp_{j} passes the condition at line 1.

    Process pjp_{j} then saves all signatures of SS at line 1, and after it passes the condition at line 1 (as |S|>n+t2|S|>\frac{n+t}{2}) and finally mbrb-delivers (m,𝑠𝑛,i)(m,\mathit{sn},i) at line 1. ∎

Lemma 5 (MBRB-Global-delivery).

If a correct process pip_{i} mbrb-delivers an app-message mm from pjp_{j} with sequence number 𝑠𝑛\mathit{sn}, then at least =cd\ell=c-d correct processes mbrb-deliver mm from pjp_{j} with sequence number 𝑠𝑛\mathit{sn}.

Proof.

If a correct process pip_{i} mbrb-delivers (m,𝑠𝑛,j)(m,\mathit{sn},j) at line 1, it must have saved a set 𝑠𝑖𝑔𝑠\mathit{sigs} of strictly more than n+t2\frac{n+t}{2} valid distinct signatures because of the condition at line 1. Let us remark that 𝑠𝑖𝑔𝑠\mathit{sigs} necessarily contains the signature for (m,𝑠𝑛,i)(m,\mathit{sn},i) by pip_{i} because of the condition at line 1. Additionally, pip_{i} must also have broadcast bundle(m,𝑠𝑛,i,𝑠𝑖𝑔𝑠)\textsc{bundle}(m,\mathit{sn},i,\mathit{sigs}) at line 1, that, by definition of the message adversary, is received by a set KK of at least cdc-d correct processes. For each process pkp_{k} of KK:

  • If pkp_{k} does not pass the condition at line 1, it is necessarily because it has already mbrb-delivered some (,𝑠𝑛,j)(-,\mathit{sn},j) at line 1. But because of MBRB-No-duplicity, pkp_{k} has necessarily mbrb-delivered (m,𝑠𝑛,j)(m,\mathit{sn},j).

  • If pkp_{k} passes the condition at line 1, then it saves all signatures of 𝑠𝑖𝑔𝑠\mathit{sigs} at line 1 and then passes the condition at line 1 and finally mbrb-delivers (m,𝑠𝑛,j)(m,\mathit{sn},j) at line 1.

Therefore, all processes of KK (which, as a reminder, are at least cd=c-d=\ell) necessarily mbrb-deliver (m,𝑠𝑛,j)(m,\mathit{sn},j) at line 1. ∎

Lemma 6.

cd>n+t2c-d>\big{\lfloor}\frac{n+t}{2}\big{\rfloor}.

Proof.

We have the following:

cd\displaystyle c-d ntd=2n2t2d2,\displaystyle\geq n-t-d=\frac{2n-2t-2d}{2}, (by definition of cc)
>n+3t+2d2t2d2,\displaystyle>\frac{n+3t+2d-2t-2d}{2}, (by mbrb-Assumption)
>n+t2n+t2.\displaystyle>\frac{n+t}{2}\geq\Big{\lfloor}\frac{n+t}{2}\Big{\rfloor}.\qed
Lemma 7.

If a correct process pip_{i} mbrb-broadcasts (m,𝑠𝑛)(m,\mathit{sn}), then at least cddn+t2cdn+t2c-d-\Big{\lfloor}\frac{d\lfloor\frac{n+t}{2}\rfloor}{c-d-\lfloor\frac{n+t}{2}\rfloor}\Big{\rfloor} correct processes mbrb-deliver (m,𝑠𝑛,i)(m,\mathit{sn},i) at most two communication steps later.

Proof.

If a correct process pip_{i} mbrb-broadcasts (m,𝑠𝑛)(m,\mathit{sn}), then it broadcasts its own signature 𝑠𝑖𝑔i\mathit{sig}_{i} for (m,𝑠𝑛,i)(m,\mathit{sn},i) in a bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔i})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{i}\}) imp-message at line 1. Let us denote by KK the set of correct processes that receive this bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔i})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{i}\}) imp-message from pip_{i} during the same communication step, and let kk be the number of processes in KK, such that cdk=|K|cc-d\leq k=|K|\leq c (by definition of the message adversary). By construction of the algorithm, every process pxp_{x} of KK passes the condition at line 1, and therefore broadcasts a bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔x,𝑠𝑖𝑔i})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{x},\mathit{sig}_{i}\}) imp-message, whether it be at line 1 for pip_{i}, or at line 1 for any other process of KK.

Let AA and BB define two partitions of the set of all correct processes (ABA\cup B is the set of all correct processes, and AB=A\cap B=\varnothing). AA denotes the set of correct processes that receive strictly more than n+t2\frac{n+t}{2} signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i) from processes of KK two communication steps after pip_{i} mbrb-broadcast (m,𝑠𝑛)(m,\mathit{sn}), while BB denotes the set of remaining correct processes of KK that receive at most n+t2\frac{n+t}{2} signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i) from processes of KK two communication steps after pip_{i} mbrb-broadcast (m,𝑠𝑛)(m,\mathit{sn}). Let 2\ell_{2} be the size of AA: 2=|A|\ell_{2}=|A|. By construction, |B|=c2|B|=c-\ell_{2}. Let sAs_{A} and sBs_{B} respectively denote the number of signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i) from processes of KK received by processes of AA and BB at most two communication steps after pip_{i} mbrb-broadcast (m,𝑠𝑛)(m,\mathit{sn}). Figure 1 represents the distribution of such signatures among processes of KK, sorted by decreasing number of signatures received. Each processes of AA can receive at most kk signatures (that is, all signatures) from processes of KK, while each process of BB can receive at most n+t2\lfloor\frac{n+t}{2}\rfloor signatures from processes of KK two communication steps after pip_{i} mbrb-broadcasts (m,𝑠𝑛)(m,\mathit{sn}). For the sake of simplicity, we use qq in the place of n+t2\lfloor\frac{n+t}{2}\rfloor in some parts of this proof.

# receivedsignatures# correctprocesseskk2\ell_{2}AAsAs_{A}n+t2=q\big{\lfloor}\frac{n+t}{2}\big{\rfloor}=qccBBsBs_{B}
Figure 1: Distribution of signatures among processes of AA and BB two communication steps after pip_{i} mbrb-broadcast (m,𝑠𝑛)(m,\mathit{sn})

From these observations, we infer the following inequalities:

2k\displaystyle\ell_{2}k sA,\displaystyle\geq s_{A},
(c2)q\displaystyle(c-\ell_{2})q sB.\displaystyle\geq s_{B}.

By the definition of the message adversary, a bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔x,𝑠𝑖𝑔i})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{x},\mathit{sig}_{i}\}) imp-message broadcast by a correct process pxp_{x} is eventually received by at least cdc-d correct processes. As a consequence, in total, the minimum number of signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i) collectively received by correct processes as a result of broadcasts by processes in KK in the first two asynchronous communication steps is k(cd)k(c-d). We thus have:

sA+sB\displaystyle s_{A}+s_{B} k(cd).\displaystyle\geq k(c-d).

By combining the previous inequalities, we obtain:

2k+(c2)q\displaystyle\ell_{2}k+(c-\ell_{2})q k(cd),\displaystyle\geq k(c-d),
2k+cq2q\displaystyle\ell_{2}k+cq-\ell_{2}q k(cd),\displaystyle\geq k(c-d),
2k2q\displaystyle\ell_{2}k-\ell_{2}q k(cd)cq,\displaystyle\geq k(c-d)-cq,
2(kq)\displaystyle\ell_{2}(k-q) k(cd)cq.\displaystyle\geq k(c-d)-cq. (1)

By Lemma 6, we know that kcd>n+t2=qk\geq c-d>\big{\lfloor}\frac{n+t}{2}\big{\rfloor}=q, so we can rewrite (1) into:

2\displaystyle\ell_{2} k(cd)cqkq.\displaystyle\geq\frac{k(c-d)-cq}{k-q}. (2)

Let us define a function ff such that f(k)=k(cd)cqkqf(k)=\frac{k(c-d)-cq}{k-q}. As we seek the lowest guaranteed value for 2\ell_{2}, we want to find the minimum of ff on k[cd,c]k\in[c-d,c]. To this end, let us first study the derivative of ff. The image f(k)f(k) is of the form uv\frac{u}{v}, so we have:

f(k)\displaystyle f^{\prime}(k) =uvuvv2=(cd)(kq)(k(cd)qc)(kq)2,\displaystyle=\frac{u^{\prime}v-uv^{\prime}}{v^{2}}=\frac{(c-d)(k-q)-(k(c-d)-qc)}{(k-q)^{2}},
=(cd)(kq)k(cd)+qc(kq)2=qcq(cd)(kq)2=qd(kq)2.\displaystyle=\frac{(c-d)(k-q)-k(c-d)+qc}{(k-q)^{2}}=\frac{qc-q(c-d)}{(k-q)^{2}}=\frac{qd}{(k-q)^{2}}.

As qq and dd are by definition positive, we know that f(k)=qd(kq)2f^{\prime}(k)=\frac{qd}{(k-q)^{2}} is positive, or null when d=0d=0. Therefore, ff is monotonically increasing on k[cd,c]k\in[c-d,c], and the minimum value for 2\ell_{2} can be found when kk is also minimum, that is, when k=cdk=c-d. Thus, when we replace kk by cdc-d in (2), we obtain:

2\displaystyle\ell_{2} (cd)(cd)cqcdq=(cd)(cdq)qdcdq,\displaystyle\geq\frac{(c-d)(c-d)-cq}{c-d-q}=\frac{(c-d)(c-d-q)-qd}{c-d-q},
cdqdcdq.\displaystyle\geq c-d-\frac{qd}{c-d-q}. (3)

Let us denote by 2,𝗆𝗂𝗇\ell_{2,\mathsf{min}} the minimum number of correct processes that receive a quorum of strictly more than n+t2\frac{n+t}{2} valid distinct signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i) two communication steps after pip_{i} mbrb-broadcast (m,𝑠𝑛)(m,\mathit{sn}), such that 2,𝗆𝗂𝗇2=|A|\ell_{2,\mathsf{min}}\leq\ell_{2}=|A|. As the right hand side of (3) is not always an integer, we have:

2,𝗆𝗂𝗇\displaystyle\ell_{2,\mathsf{min}} =cdqdcdq=cd+qdcdq,\displaystyle=\Big{\lceil}c-d-\frac{qd}{c-d-q}\Big{\rceil}=c-d+\Big{\lceil}-\frac{qd}{c-d-q}\Big{\rceil},
=cdqdcdq,\displaystyle=c-d-\Big{\lfloor}\frac{qd}{c-d-q}\Big{\rfloor}, (as x,x=x\operatorname{\forall\,}x\in\mathbb{R},\lceil-x\rceil=-\lfloor x\rfloor)
=cddn+t2cdn+t2.\displaystyle=c-d-\Big{\lfloor}\frac{d\lfloor\frac{n+t}{2}\rfloor}{c-d-\lfloor\frac{n+t}{2}\rfloor}\Big{\rfloor}. (by definition of qq)

Hence, at least 2,𝗆𝗂𝗇=cddn+t2cdn+t2\ell_{2,\mathsf{min}}=c-d-\Big{\lfloor}\frac{d\lfloor\frac{n+t}{2}\rfloor}{c-d-\lfloor\frac{n+t}{2}\rfloor}\Big{\rfloor} processes of KK receive strictly more than n+t2\frac{n+t}{2} valid distinct signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i) two communication steps after pip_{i} mbrb-broadcasts (m,𝑠𝑛)(m,\mathit{sn}). For every process pap_{a} of AA:

  • If pap_{a} does not pass the condition at line 1 after receiving the last signature of the quorum in a bundle imp-message, it is necessarily because pap_{a} already mbrb-delivered some (,𝑠𝑛,i)(-,\mathit{sn},i), because processes of KK are correct and all their bundle imp-messages include the signature for (m,𝑠𝑛,i)(m,\mathit{sn},i) by pip_{i}. But let us remind that, as the sender pip_{i} is correct, it is impossible for pap_{a} to mbrb-deliver anything different from (m,𝑠𝑛,i)(m,\mathit{sn},i). Therefore, pap_{a} has already mbrb-delivered (m,𝑠𝑛,i)(m,\mathit{sn},i) at line 1.

  • If pap_{a} passes the condition at line 1 after processing the last bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔i,𝑠𝑖𝑔x})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{i},\mathit{sig}_{x}\}) imp-message of the quorum from a process pxp_{x}, then pap_{a} saves the signature 𝑠𝑖𝑔x\mathit{sig}_{x} at line 1, and after it passes the condition at line 1 (as it has saved strictly more than n+t2\frac{n+t}{2} signatures) and finally mbrb-delivers (m,𝑠𝑛,i)(m,\mathit{sn},i) at line 1.

Therefore, all processes of AA, which are at least 2,𝗆𝗂𝗇=cddn+t2cdn+t2\ell_{2,\mathsf{min}}=c-d-\Big{\lfloor}\frac{d\lfloor\frac{n+t}{2}\rfloor}{c-d-\lfloor\frac{n+t}{2}\rfloor}\Big{\rfloor}, mbrb-deliver (m,𝑠𝑛,i)(m,\mathit{sn},i) at line 1 at most two communication steps after pip_{i} mbrb-broadcast (m,𝑠𝑛)(m,\mathit{sn}). ∎

Lemma 8.

If a correct process pip_{i} mbrb-broadcasts (m,𝑠𝑛)(m,\mathit{sn}) and d<cc×n+t2d<c-\sqrt{c\times\frac{n+t}{2}}, then at least cdc-d correct processes mbrb-deliver (m,𝑠𝑛,i)(m,\mathit{sn},i) at most three communication steps later.

Proof.

Let us assume that a correct process pip_{i} mbrb-broadcasts (m,𝑠𝑛)(m,\mathit{sn}) and that d<cc×n+t2d<c-\sqrt{c\times\frac{n+t}{2}}. Process pip_{i} must unreliably broadcast a first bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔i})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{i}\}) imp-message (where 𝑠𝑖𝑔i\mathit{sig}_{i} is the signature of (m,𝑠𝑛,i)(m,\mathit{sn},i) by pip_{i}) at line 1. This initial imp-message is received by at least (cd1)(c-d-1) other correct processes, due to our assumption on the message adversary. This counts for a first communication step.

In the second communication step, each process pjp_{j} of these (cd1)(c-d-1) correct processes unreliably broadcasts its own bundle(m,𝑠𝑛,i,{𝑠𝑖𝑔j,𝑠𝑖𝑔i})\textsc{bundle}(m,\mathit{sn},i,\{\mathit{sig}_{j},\mathit{sig}_{i}\}) imp-message (where 𝑠𝑖𝑔j\mathit{sig}_{j} is the signature of (m,𝑠𝑛,i)(m,\mathit{sn},i) by pjp_{j}) at line 1. At the end of the second communication step, in total, at least (cd)(c-d) distinct signatures for (m,𝑠𝑛,i)(m,\mathit{sn},i) have been created and unreliably broadcast by correct processes (counting that of pip_{i}), resulting in at least (cd)2(c-d)^{2} receptions of said signatures by correct processes. As there are cc correct processes, this means that, on average, each correct process has received at least (cd)2c\frac{(c-d)^{2}}{c} signatures by the end of the second communication step, and that at least one correct process, pkp_{k}, receives (and saves at line 1) at least this number of signatures.

From the Lemma hypothesis d<cc×n+t2d<c-\sqrt{c\times\frac{n+t}{2}} and using simple algebraic transformations, we can derive (cd)2c>n+t2\frac{(c-d)^{2}}{c}>\frac{n+t}{2}. Therefore, pkp_{k} reaches a quorum of signatures, that is, it passes the condition at line 1 and unreliably broadcast this quorum of signatures at line 1, two communication steps after the mbrb-broadcast of (m,𝑠𝑛)(m,\mathit{sn}) by pip_{i}. By definition of the message adversary, this quorum of signatures is received by cdc-d correct processes, which save it at line 1 and thus pass the condition at line 1 and finally mbrb-deliver (m,𝑠𝑛,i)(m,\mathit{sn},i) at line 1, three communication steps after the mbrb-broadcast of (m,𝑠𝑛)(m,\mathit{sn}) by pip_{i}. ∎

Lemma 9 (MBRB-Time-cost).

If a correct process pip_{i} mbrb-broadcasts an app-message mm with sequence number 𝑠𝑛\mathit{sn}, then =cd\ell=c-d correct processes mbrb-deliver mm from pip_{i} with sequence number 𝑠𝑛\mathit{sn} at most
 
λ={2ifd<cn+t2n+t2+13ifd<cc×n+t2>3otherwise}\lambda=\left\{\begin{array}[]{ll}2&\text{if}\leavevmode\nobreak\ \leavevmode\nobreak\ d<\frac{c-\lfloor\frac{n+t}{2}\rfloor}{\lfloor\frac{n+t}{2}\rfloor+1}\\ 3&\text{if}\leavevmode\nobreak\ \leavevmode\nobreak\ d<c-\sqrt{c\times\frac{n+t}{2}}\\ >3&\text{otherwise}\end{array}\right\} communication steps later.

Proof.

Let us consider a correct process pip_{i} that mbrb-broadcasts (m,𝑠𝑛)(m,\mathit{sn}). By exhaustion:

  • Case where d<cn+t2n+t2+1d<\frac{c-\lfloor\frac{n+t}{2}\rfloor}{\lfloor\frac{n+t}{2}\rfloor+1}.

    By Lemma 7, at least cddn+t2cdn+t2c-d-\Big{\lfloor}\frac{d\lfloor\frac{n+t}{2}\rfloor}{c-d-\lfloor\frac{n+t}{2}\rfloor}\Big{\rfloor} correct processes mbrb-deliver (m,𝑠𝑛,i)(m,\mathit{sn},i) two communication steps after pip_{i} has mbrb-broadcast (m,𝑠𝑛)(m,\mathit{sn}). We have:

    d\displaystyle d <cn+t2n+t2+1,\displaystyle<\frac{c-\lfloor\frac{n+t}{2}\rfloor}{\lfloor\frac{n+t}{2}\rfloor+1}, (case assumption)
    dn+t2+d\displaystyle d\Big{\lfloor}\frac{n+t}{2}\Big{\rfloor}+d <cn+t2,\displaystyle<c-\Big{\lfloor}\frac{n+t}{2}\Big{\rfloor}, (as n+t2+1>0\lfloor\frac{n+t}{2}\rfloor+1>0)
    dn+t2\displaystyle d\Big{\lfloor}\frac{n+t}{2}\Big{\rfloor} <cdn+t2,\displaystyle<c-d-\Big{\lfloor}\frac{n+t}{2}\Big{\rfloor},
    dn+t2cdn+t2\displaystyle\frac{d\lfloor\frac{n+t}{2}\rfloor}{c-d-\lfloor\frac{n+t}{2}\rfloor} <1,\displaystyle<1, (as cd>n+t2c-d>\lfloor\frac{n+t}{2}\rfloor by Lemma 6)
    dn+t2cdn+t2\displaystyle\left\lfloor\frac{d\lfloor\frac{n+t}{2}\rfloor}{c-d-\lfloor\frac{n+t}{2}\rfloor}\right\rfloor 0,\displaystyle\leq 0,
    cddn+t2cdn+t2\displaystyle c-d-\left\lfloor\frac{d\lfloor\frac{n+t}{2}\rfloor}{c-d-\lfloor\frac{n+t}{2}\rfloor}\right\rfloor cd=.\displaystyle\geq c-d=\ell.

    Hence, \ell correct processes mbrb-deliver (m,𝑠𝑛,i)(m,\mathit{sn},i) at most two communication steps after pip_{i} has mbrb-broadcast (m,𝑠𝑛)(m,\mathit{sn}).

  • Case where d<cc×n+t2d<c-\sqrt{c\times\frac{n+t}{2}}.

    Lemma 8 applies and at least cd=c-d=\ell correct processes mbrb-deliver (m,𝑠𝑛,i)(m,\mathit{sn},i) at most three communication steps after pip_{i} has mbrb-broadcast (m,𝑠𝑛)(m,\mathit{sn}). ∎

Lemma 10 (MBRB-Message-cost).

The mbrb-broadcast of an app-message by a correct process pip_{i} entails the sending of at most μ=2n2\mu=2n^{2} imp-messages by correct processes.

Proof.

The broadcast of an imp-message by a correct process at line 1 entails its forwarding by at most n1n-1 other correct processes at line 1. As each broadcast by correct process corresponds to the sending of nn imp-messages, then at most n2n^{2} imp-messages are sent in a first step.

In a second step, at least one correct process reaches a quorum of signatures and passes the condition at line 1, and then broadcasts this quorum of signatures at line 1. Upon receiving this quorum, every correct process also passes the condition at line 1 (if it has not done it already) and broadcasts the imp-message containing the quorum at line 1. Hence, at most n2n^{2} imp-messages are also sent in this second step, which amounts to a maximum of μ=2n2\mu=2n^{2} imp-messages sent in total. ∎

An additional property

The reader can check from the previous proofs that the algorithm satisfies the following MBRB-delivery property. If there is a set KK of kk correct processes, 1kd1\leq k\leq d, such that there is a finite time τ\tau after which the message adversary never eliminates the imp-messages sent to them, then, after τ\tau, each process of KK mbrb-delivers all the app-messages mbrb-broadcast by correct processes.

4 A Tightness Bound

Definition

An algorithm implementing a broadcast communication abstraction is event-driven if, as far as the correct processes are concerned, only (i) the invocation of the broadcast operation that is provided to the application by the broadcast communication abstraction, or (ii) the reception of an imp-message—sent by a correct or a Byzantine process—can generate the sending of imp-messages (using the underlying unreliable network-level 𝖻𝗋𝗈𝖺𝖽𝖼𝖺𝗌𝗍\mathsf{broadcast} operation).

Theorem 2 (MBRB-Necessary-condition).

When n3t+2dn\leq 3t+2d, there is no event-driven (signature-free or signature-based) algorithm implementing the MBRB communication abstraction on top of an nn-process asynchronous system in which up to tt processes may be Byzantine and where a message adversary may suppress up to dd copies of each imp-message broadcast by a correct process.101010Let us recall that the underlying communication operation offered by the system is an unreliable broadcast defined in Section 2.1.

Proof.

Without loss of generality the proof considers the case n=3t+2dn=3t+2d. Let us partition the nn processes into five sets Q1,Q2,Q3Q_{1},Q_{2},Q_{3}, D1D_{1}, and D2D_{2}, such that |D1|=|D2|=d|D_{1}|=|D_{2}|=d and |Q1|=|Q2|=|Q3|=t|Q_{1}|=|Q_{2}|=|Q_{3}|=t.111111For the case n<3t+2dn<3t+2d, the partition is such that 𝗆𝗂𝗇(|Q1|,|D2|)d\mathsf{min}(|Q_{1}|,|D_{2}|)\leq d and 𝗆𝗂𝗇(|Q1|,|Q2|,|Q3|)t\mathsf{min}(|Q_{1}|,|Q_{2}|,|Q_{3}|)\leq t. So, when considering the sets Q1Q_{1}, Q2Q_{2}, and Q3Q_{3}, there are executions in which all the processes of either Q1Q_{1} or Q2Q_{2} or Q3Q_{3} can be Byzantine, while the processes of the two other sets are not.

The proof is by contradiction. So, assuming that there is an event-driven algorithm AA that builds the MBRB-broadcast abstraction for n=3t+2dn=3t+2d, let us consider an execution EE of AA in which the processes of Q1Q_{1}, Q2Q_{2}, D1D_{1}, and Q2Q_{2} are not Byzantine while all the processes of Q3Q_{3} are Byzantine.

Let us observe that the message adversary can isolate up to dd processes by preventing them from receiving any imp-messages. Without loss of generality, let us assume that the adversary isolates a set of dd correct processes not containing the sender of the app-message. As AA is event-driven, these dd isolated processes do not send imp-messages during the execution EE of AA. As a result, no correct process can expect imp-messages from more than (ntd)(n-t-d) different processes without risking being blocked forever. Thanks to the mbrb-Assumption n=3t+2dn=3t+2d, this translates as “no correct process can expect imp-messages from more than (2t+d)(2t+d) different processes without risking being blocked forever”.

In the execution EE, the (Byzantine) processes of Q3Q_{3} simulate the mbrb-broadcast of an app-message such that this app-message appears as being mbrb-broadcast by one of them and is mbrb-delivered as the app-message mm to the processes of Q1Q_{1} (hence the processes of Q3Q_{3} appear, to the processes of Q1Q_{1}, as if they were correct) and as the app-message mmm^{\prime}\neq m to the processes of Q2Q_{2} (hence, similarly to the previous case, the processes of Q3Q_{3} appear to the processes of Q2Q_{2} as if they were correct). Let us call mm-messages (resp., mm^{\prime}-messages) the imp-messages generated by the event-driven algorithm AA that entails the mbrb-delivery of mm (resp., mm^{\prime}). Moreover, the execution EE is such that:

  • concerning the mm-messages: the message adversary suppresses all the mm-messages sent to the processes of D2D_{2}, and asynchrony delays the reception of all the mm-messages sent to Q2Q_{2} until some time τ\tau defined below.121212 Equivalently, we could also say that asynchrony delays the reception of all the mm-messages sent to D2Q2D_{2}\cup Q_{2} until time τ\tau. The important point is here that, due to the assumed existence of Algorithm AA, the processes of Q1Q_{1} and and D1D_{1} mbrb-deliver mm with mm-messages from at most 2t+d2t+d different processes. So, as |Q1D1Q3|=ntd=2t+d|Q_{1}\cup D_{1}\cup Q_{3}|=n-t-d=2t+d, Algorithm A will cause the processes of Q1Q_{1} and D1D_{1} to mbrb-deliver mm.131313 Let us notice that this is independent from the fact that the processes in Q3Q_{3} are Byzantine or not.

  • concerning the mm^{\prime}-messages: the message adversary suppresses all the mm^{\prime}-messages sent to the processes of D1D_{1}, and the asynchrony delays the reception of all the mm^{\prime}-messages sent to Q1Q_{1} until time τ\tau. As previously, as |Q2D2Q3|=ntd=2t+d|Q_{2}\cup D_{2}\cup Q_{3}|=n-t-d=2t+d, Algorithm AA will cause the processes of Q2Q_{2} and D2D_{2} to mbrb-deliver mm^{\prime}.

  • Finally, the time τ\tau occurs after the mbrb-delivery of mm by the processes of D1D_{1} and Q1Q_{1}, and after the mbrb-delivery of mm^{\prime} by the processes of D2D_{2} and Q2Q_{2}.

It follows that different non-Byzantine processes mbrb-deliver different app-messages for the same mbrb-broadcast (or a fraudulent simulation of it) issued by a Byzantine process (with possibly the help of other Byzantine processes). This contradicts the MBRB-No-Duplicity property, which concludes the proof of the theorem. ∎

Theorem 3 (Algorithm optimality).

Considering an asynchronous nn-process system in which up to tt processes can be Byzantine and where a dd-message adversary can suppress imp-messages, Algorithm 1 is optimal with respect to the pair of values t,d\langle t,d\rangle.

Proof.

Theorem 2 has shown that the condition n>3t+2dn>3t+2d is necessary, while Algorithm 1 has shown that this condition is sufficient (Theorem 1). ∎

5 Conclusion

This article has presented a new communication abstraction (denoted MBRB) that extends Byzantine reliable broadcast (as defined by Bracha and Toueg [7, 8]) to systems where, at the underlying implementation level, an adversary may suppress some subset of implementation messages used by the processes to co-operate. From a practical point of view, this kind of message loss captures phenomena such as silent churn, input-disconnection, etc. A signature-based algorithm implementing the corresponding Byzantine-tolerant reliable broadcast in the presence of a message adversary has been presented and proven correct. This algorithm assumes n>3t+2dn>3t+2d (where nn is the number of processes, tt is the maximum number of Byzantine processes, and dd is an upper bound on the power of the message adversary), which has been shown to be a necessary and sufficient condition. message adversary),

When there is no message adversary, this algorithm is optimal from both Byzantine resilience and the number of communication steps. These properties are also satisfied in other circumstances including a message adversary whose power dd is restricted to some well-defined threshold.

Acknowledgments

This work was partially supported by the French ANR projects ByBloS (ANR-20-CE25-0002-01) and PriCLeSS (ANR-10-LABX-07-81) devoted to the modular design of building blocks for large-scale Byzantine-tolerant multi-users applications. The authors want to thank Colette Johnen, Elad Schiller, and Stefan Schmid for their kind invitation to participate in the SSS 2021 conference.

References

  • [1] Abraham I., Nayak K., Ren L., and Xiang Z., Good-case latency of Byzantine broadcast: a complete categorization. Proc. 40th ACM Symposium on Principles of Distributed Computing (PODC’21), ACM Press, pp. 331-341 (2021) (arXiv:2102.07240v2)
  • [2] Afek Y. and Gafni E., Asynchrony from synchrony. Proc. Int’l Conference on Distributed Computing and Networking (ICDCN’13), Springer LNCS 7730, pp. 225-239, (2013)
  • [3] Albouy T., Frey D., Raynal M., and Taïani F., Byzantine-tolerant reliable broadcast in the presence of silent churn (Invited Talk). Proc. 23th Int’l Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS’21) Springer LNCS 13046, pp. 21-33 (2021)
  • [4] Albouy T., Frey D., Raynal M., and Taïani F., kk\ell-cast: on the foundations of Byzantine reliable broadcast in the presence of message adversaries. (May 2022) (arXiv:2204.13388)
  • [5] Auvolat A., Frey D., Raynal M., and Taïani F., Money Transfer Made Simple: a Specification, a Generic Algorithm, and its Proof. Bulletin of the EATCS, 132 (2020)
  • [6] Bonomi S., Decouchant J., Farina G., Rahli V., and Tixeuil S., Practical Byzantine Reliable Broadcast on Partially Connected Networks. 41th IEEE International Conference on Distributed Computing Systems, ICDCS 2021, IEEE, pp. 506-516 (2021)
  • [7] Bracha G., Asynchronous Byzantine agreement protocols. Information & Computation, 75(2):130-143 (1987)
  • [8] Bracha G. and Toueg S., Asynchronous consensus and broadcast protocols. Journal of the ACM, 32(4):824-840 (1985)
  • [9] Cachin Ch., Guerraoui R., and Rodrigues L., Reliable and secure distributed programming, Springer, 367 pages, ISBN 978-3-642-15259-7 (2011)
  • [10] Charron-Bost B., and Schiper A., The heard-of model: computing in distributed systems with benign faults. Distributed Computing, 22(1):49-71 (2009)
  • [11] Cohen S., and Keidar I., Tame the Wild with Byzantine Linearizability: Reliable Broadcast, Snapshots, and Asset Transfer. Proc. 35rd Int’l Symposium on Distributed Computing (DISC’21), pp. 18:1-18:18 (2021)
  • [12] Collins D., Guerraoui R., Komatovic J., Kuznetsov P., Monti M., Pavlovic M., Pignolet Y.-A., Seredinschi D.-A., Tonkikh A., and Xygkis A., Online Payments by Merely Broadcasting Messages. Proc. 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2020), pp. 26-38 (2020)
  • [13] Danezis G., Kokoris-Kogias L., Sonnino A., and Spiegelman A., Narwhal and Tusk: a DAG-based mempool and efficient BFT consensus. Proc. 17th European Conference on Computer Systems (EUROSYS’22), ACM Press, pp. 34-50 (2022)
  • [14] Dolev D., The Byzantine generals strike again. Journal of Algorithms, 3:14-20 (1982)
  • [15] Guerraoui R., Kuznetsov P., Monti M., Pavlovic M., and Seredinschi D.-A., The Consensus Number of a Cryptocurrency. Proc. 38th ACM Symposium on Principles of Distributed Computing (PODC’19), ACM Press, pp. 307-316 (2019)
  • [16] Guerraoui R., Komatovic J., Kuznetsov P., Pignolet P.A., Seredinschi D.-A., and Tonkikh A., Dynamic Byzantine reliable broadcast. Proc. 24th Int’l Conference on Principles of Distributed Systems (OPODIS’20), LIPIcs Vol. 184, Article 23, 18 pages (2020)
  • [17] Guerraoui G., Kuznetsov P., Monti M., Pavlovic M., and Seredinschi D.-A., Scalable Byzantine reliable broadcast. Proc. 33rd Int’l Symposium on Distributed Computing (DISC’19), LIPIcs Vol. 146, Article 22, 16 pages (2019)
  • [18] Hirt M., Kastrato A., and Liu-Zhang C.-D., Multi-threshold asynchronous reliable broadcast and consensus. Proc. 24th Int’l Conference on Principles of Distributed Systems (OPODIS’20), LIPICs Vol. 184, Article 6, 16 pages (2020)
  • [19] Imbs D. and Raynal M., Trading tt-resilience for efficiency in asynchronous Byzantine reliable broadcast. Parallel Processing Letters, Vol. 26(4), 8 pages (2016)
  • [20] Lamport L., Shostack R., and Pease M., The Byzantine generals problem. ACM Transactions on Programming Languages and Systems, 4(3)-382-401, (1982)
  • [21] Nayak K., Ren L., Shi E., Vaidya N.H., and Xiang Z., Improved extension protocols for Byzantine broadcast and agreement. Proc. 34rd Int’l Symposium on Distributed Computing (DISC’20), LIPIcs Vol. 179, Article 28, 16 pages (2020)
  • [22] Pease M., Shostak R., and Lamport L., Reaching agreement in the presence of faults. Journal of the ACM, 27:228-234 (1980)
  • [23] Raynal M., Message adversaries. Encyclopedia of Algorithms, Springer (2015)
  • [24] Raynal M., Fault-tolerant message-passing distributed systems: an algorithmic approach. Springer, 480 pages, ISBN 978-3-319-94140-0 (2018)
  • [25] Raynal M., On the versatility of Bracha’s Byzantine reliable broadcast algorithm. Parallel Processing Letters, 31(3), 2150006 (9 pages) (2021)
  • [26] Raynal M. and Stainer J., Synchrony weakened by message adversaries vs asynchrony restricted by failure detectors. Proc. 32nd ACM Symposium on Principles of Distributed Computing (PODC’13), ACM Press, pp. 166-175 (2013)
  • [27] Santoro N. and Widmayer P., Time is not a healer. Proc. 6th Annual Symposium on Theoretical Aspects of Computer Science (STACS’89), Springer LNCS 349, pp. 304-316 (1989)
  • [28] Santoro N. and Widmayer P., Agreement in synchronous networks with ubiquitous faults. Theoretical Computer Science, 384(2-3): 232-249 (2007)
  • [29] Tseng L., Zhang Q., Kumar S., and Zhang Y., Exact Consensus under Global Asymmetric Byzantine Links. Proc. 40th IEEE International Conference on Distributed Computing Systems (ICDCS 2020), pp. 721-731 (2020)