Consensus in human systems is usually pretty easy because of the social layer of society. This fault tolerance against 51% attacks comes from the fact that convincing the community that an engineered ‘truth’ is the real one requires subverting every trusted member of the community, most notably media and news sources (which is also why systems of authoritarian power are so scary).

It is nonetheless a difficult problem for governance within communities.

Note that this is inherently different from collaboration methods like CRDTs. Collaboration involves keeping all edits and merging them. Consensus involves picking one of several proposed values and agreeing on it.
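To make the contrast concrete, here is a minimal sketch in Go (the types and names are hypothetical, not from any library): a grow-only set CRDT keeps every edit and converges by merging, while consensus discards all but one of the proposed values.

```go
package main

import "fmt"

// GSet is a grow-only set CRDT: every edit is kept, and replicas
// converge by merging (set union). Nothing is ever "chosen".
type GSet map[string]struct{}

func (s GSet) Add(v string) { s[v] = struct{}{} }

func (s GSet) Merge(other GSet) {
	for v := range other {
		s[v] = struct{}{}
	}
}

// decide is a stand-in for a consensus protocol: out of several proposed
// values, exactly one is selected and everyone agrees on it. (Here we
// trivially pick the first proposal; a real protocol runs rounds of
// voting to make this choice fault-tolerantly.)
func decide(proposals []string) string {
	return proposals[0]
}

func main() {
	// Collaboration: both replicas' edits survive the merge.
	a, b := GSet{}, GSet{}
	a.Add("edit-from-alice")
	b.Add("edit-from-bob")
	a.Merge(b)
	fmt.Println("CRDT keeps:", len(a), "edits") // 2

	// Consensus: only one proposed value is agreed upon.
	fmt.Println("Consensus picks:", decide([]string{"value-A", "value-B"}))
}
```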

Example applications include state machine replication (SMR) and Byzantine Agreement.

Consensus and Humming in the IETF

Source: IETF

On rough consensus

  • Humming serves as a ‘temp check’ that lets people voice disagreement, while the default assumption remains optimistic trust
  • “While counting heads might give a good guess as to what the rough consensus will be, doing so can allow important minority views to get lost in the noise. One of the strengths of a consensus model is that minority views are addressed, and using a rough consensus model should not take away from that.”
  • “We can’t know who the “members” of any given working group would be at any one time, and we certainly can’t know who all of the “members” of the IETF would be: That’s why we refer to “participants” in the IETF; the IETF doesn’t really have “members”. Indeed, we often recruit additional implementers and other experts into working groups in order to ensure that broader views are brought into the discussion. So, voting is simply not practical.”

Algorithmic Consensus

There are four requirements for such an algorithm (illustrated in the sketch after this list):

  1. Validity. The result must be a value that was submitted by at least one of the processes. The consensus algorithm cannot just make up a value.
  2. Uniform agreement. All nodes must select the same value.
  3. Integrity. A node can select only a single value. That is, a node cannot announce one outcome and later change its mind.
  4. Termination. Also known as progress, every node must eventually reach a decision.
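As a rough illustration (not tied to any particular protocol; the Decision type and the check functions below are hypothetical), the four requirements can be read as assertions over a finished run:

```go
package consensus

// Decision records what one node submitted and what it ultimately decided.
type Decision struct {
	NodeID     string
	Submitted  string
	Decided    string
	HasDecided bool
}

// Validity: every decided value must have been submitted by some node.
func checkValidity(run []Decision) bool {
	submitted := map[string]bool{}
	for _, d := range run {
		submitted[d.Submitted] = true
	}
	for _, d := range run {
		if d.HasDecided && !submitted[d.Decided] {
			return false // a value was "made up"
		}
	}
	return true
}

// Uniform agreement: all nodes that decided must have decided the same value.
func checkAgreement(run []Decision) bool {
	var first string
	seen := false
	for _, d := range run {
		if !d.HasDecided {
			continue
		}
		if seen && d.Decided != first {
			return false
		}
		first, seen = d.Decided, true
	}
	return true
}

// Termination: every node must eventually decide. (Integrity -- a node
// never changes its mind -- corresponds to Decided being written once.)
func checkTermination(run []Decision) bool {
	for _, d := range run {
		if !d.HasDecided {
			return false
		}
	}
	return true
}
```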

There are two main protocol paradigms for achieving consensus in the presence of Byzantine nodes:

  1. Classic BFT protocols: typically use two voting rounds to ensure consistency (see the quorum-certificate sketch after this list)
    1. One phase guarantees proposal uniqueness using a quorum certificate of votes
    2. The other phase convinces replicas that the leader is safe to propose new entries
    3. Examples include: Tendermint, Tangaroa, HotStuff, PBFT
  2. Longest-chain consensus
    1. Examples include: most consensus mechanisms for cryptocurrencies like Bitcoin, Ethereum
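A minimal sketch of the quorum-certificate idea used in the first phase, assuming n = 3f + 1 replicas and eliding signature verification (the types and function names are illustrative, not taken from any specific protocol):

```go
package bft

// Vote is a replica's signed vote for a proposal in a given view.
// Signature checking is elided for brevity.
type Vote struct {
	ReplicaID int
	View      uint64
	BlockHash string
}

// QuorumCert certifies that 2f+1 distinct replicas voted for the same
// proposal. This guarantees uniqueness: two conflicting certificates in
// the same view would have to share at least one honest replica, and an
// honest replica votes only once per view.
type QuorumCert struct {
	View      uint64
	BlockHash string
	Votes     []Vote
}

// BuildQC assembles a quorum certificate once a quorum (2f+1 out of
// n = 3f+1) of matching votes has been gathered.
func BuildQC(votes []Vote, n int) (QuorumCert, bool) {
	f := (n - 1) / 3
	quorum := 2*f + 1

	// Count distinct replicas voting for each (view, block) pair.
	type key struct {
		view uint64
		hash string
	}
	byProposal := map[key]map[int]Vote{}
	for _, v := range votes {
		k := key{v.View, v.BlockHash}
		if byProposal[k] == nil {
			byProposal[k] = map[int]Vote{}
		}
		byProposal[k][v.ReplicaID] = v
	}

	for k, voters := range byProposal {
		if len(voters) >= quorum {
			qc := QuorumCert{View: k.view, BlockHash: k.hash}
			for _, v := range voters {
				qc.Votes = append(qc.Votes, v)
			}
			return qc, true
		}
	}
	return QuorumCert{}, false
}
```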
| | Classic BFT | Longest-chain consensus |
| --- | --- | --- |
| Safety/liveness tradeoff | Favours safety in the face of an attack | Favours liveness in the face of an attack |
| Finality | Instant and deterministic | Probabilistic (at risk of potentially large chain reorganizations and double-spend attacks) |
| Fork behaviour | Forks are rare but difficult to recover from | Embraces forks; uses in-protocol methods to resolve ambiguity as to which fork is correct |
| FLP Result behaviour | Sacrifices either liveness or consistency in the face of an attack (assuming fewer than 1/3 of replicas are Byzantine) | Does not apply, as longest-chain consensus is non-deterministic |
| Permission model | Generally permissioned (see Sandglass) | Permissionless |
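To see why longest-chain finality is only probabilistic, here is a rough sketch based on the gambler's-ruin analysis in the Bitcoin whitepaper: the chance that an attacker with a minority of the hash power ever overtakes the honest chain shrinks exponentially with the number of confirmations, but never reaches zero.

```go
package main

import (
	"fmt"
	"math"
)

// catchUpProbability returns the chance that an attacker controlling a
// fraction q of the hash power ever overtakes the honest chain when it
// is currently z blocks behind. With p = 1-q honest, the gambler's-ruin
// analysis gives (q/p)^z for q < p, and certainty otherwise.
func catchUpProbability(q float64, z int) float64 {
	p := 1 - q
	if q >= p {
		return 1
	}
	return math.Pow(q/p, float64(z))
}

func main() {
	// With 10% of the hash power, waiting for more confirmations makes a
	// reorganization exponentially less likely -- but never impossible,
	// which is what "probabilistic finality" means.
	for _, z := range []int{1, 3, 6} {
		fmt.Printf("q=0.10, z=%d: %.6f\n", z, catchUpProbability(0.10, z))
	}
}
```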

Note that there have been attempts to bridge classic BFT with Nakamoto-style consensus via hybrid models, which use a permissionless chain to determine a participant/proposer rotation in a reconfigurable BFT engine.

Comparisons between different BFT SMR protocols

All of the protocols compared here share the following properties:

  1. They are protocols for Byzantine fault-tolerant SMR.
  2. They work in the partially synchronous system model and obtain safety (always) and liveness (after GST) in the face of an adversary that controls f replicas out of a total of n = 3f + 1 replicas (per the 33 percent Impossibility Result). A sketch of how liveness is recovered after GST follows this list.
  3. They are based on the classic leader-based primary-backup approach, where leaders are replaced in a view-change (or election, to use Raft terminology) protocol.
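One common way to regain liveness after GST, as in PBFT-style view changes, is to double the view timeout on every failed view: once message delays are bounded, the timeout eventually grows past the real (unknown) bound and an honest leader gets enough time to drive a decision. A small sketch (hypothetical helper, not from any codebase):

```go
package bft

import "time"

// viewTimeout returns the timeout used after a number of consecutive
// failed views. Exponential back-off guarantees that, after GST, the
// timeout eventually exceeds the actual message-delay bound.
func viewTimeout(base time.Duration, failedViews uint64) time.Duration {
	if failedViews > 16 {
		failedViews = 16 // cap the back-off so the timeout stays sane
	}
	return base << failedViews
}
```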

Below is a comparison of a few leading protocols and their tradeoffs in latency, communication (authenticator) complexity, leader rotation, and responsiveness.

| Protocol | Best-case latency (rounds) | Normal-case communication | View-change communication | Leader rotation | Responsiveness |
| --- | --- | --- | --- | --- | --- |
| PBFT | 2 | O(n²) | O(n³) | On suspected fault | Yes |
| Tendermint | 2 | O(n) using threshold signatures, otherwise O(n²) | O(n) using threshold signatures, otherwise O(n²) | Every round | No |
| SBFT | 1 | O(n) | O(n²) | On suspected fault | Yes |
| HotStuff | 3 | O(n) | O(n) | Every round | Yes |

Responsiveness here refers to the property that a non-faulty leader can drive the protocol to consensus in a time depending on actual message delays and not the theoretical upper bound on message transmission delays. In partially synchronous models, we use optimistic responsiveness, which requires responsiveness only after GST is reached.

Leader rotation tradeoff (contrasted in the sketch after this list):

  • maintaining a stable leader means less overhead and better performance while the leader is honest and trusted
  • constantly rotating the leader provides a stronger fairness guarantee against stable malicious leaders
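The two policies can be contrasted in a few lines (hypothetical helpers, assuming replicas are indexed 0..n-1):

```go
package bft

// rotatingLeader changes the leader every view: no single replica can
// sit on the leader role, which gives a fairness guarantee against a
// stable malicious leader.
func rotatingLeader(view uint64, n int) int {
	return int(view % uint64(n))
}

// stableLeader keeps the same leader until it is suspected faulty:
// fewer view changes and better performance while the leader is honest,
// at the cost of letting a subtly unfair leader persist.
func stableLeader(currentLeader int, suspectedFaulty bool, n int) int {
	if !suspectedFaulty {
		return currentLeader
	}
	return (currentLeader + 1) % n // move to the next replica on a fault
}
```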

Pipelining

In PBFT, SBFT, and HotStuff the leader maintains a window of open slots and is allowed to work concurrently on committing all open slots in its active window. Conceptually, this is like TCP, where a sender does not have to wait for the ACK of packet k before sending packet k + 1. This window can significantly increase throughput by allowing the leader to coordinate several slot commitments concurrently.
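A sketch of the windowing idea (the SlotWindow type and its fields are illustrative; real protocols also track per-slot quorum certificates and checkpoint watermarks):

```go
package bft

// SlotWindow tracks which log slots the leader may work on concurrently,
// much like a TCP send window: the leader need not wait for slot k to
// commit before proposing in slot k+1, as long as the slot falls inside
// the active window.
type SlotWindow struct {
	LowWatermark uint64 // lowest slot not yet committed
	Size         uint64 // number of slots the leader may keep open at once
	committed    map[uint64]bool
}

// CanPropose reports whether the leader is allowed to open the given slot.
func (w *SlotWindow) CanPropose(slot uint64) bool {
	return slot >= w.LowWatermark && slot < w.LowWatermark+w.Size
}

// MarkCommitted records a committed slot and slides the window forward
// past any contiguous prefix of committed slots.
func (w *SlotWindow) MarkCommitted(slot uint64) {
	if w.committed == nil {
		w.committed = map[uint64]bool{}
	}
	w.committed[slot] = true
	for w.committed[w.LowWatermark] {
		delete(w.committed, w.LowWatermark)
		w.LowWatermark++
	}
}
```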

Impossibility Results

When consensus is impossible to achieve:

  1. 33 percent Impossibility Result
  2. PSL-FLM Impossibility Result
  3. FLP Result
  4. LR Permissionless Result