Consensus in human systems is actually usually pretty easy because of the social layer of society. This fault tolerance against 51% attacks is due to the fact that convincing the community that any engineered ’truth’ is the real on requires subverting every trusted member in the community, most notably media and news sources (also why systems of authoritarian power are so scary).
Note that this is inherently different from collaboration methods like CRDTs. Collaboration involves keeping all edits and merging them. Consensus involves picking one of several proposed values and agreeing on it.
Consensus and Humming in the IETF
On rough consensus
- Humming as ‘temp checks’ for people to voice disagreement but default assumption is optimistic trust
- “While counting heads might give a good guess as to what the rough consensus will be, doing so can allow important minority views to get lost in the noise. One of the strengths of a consensus model is that minority views are addressed, and using a rough consensus model should not take away from that.”
- “We can’t know who the “members” of any given working group would be at any one time, and we certainly can’t know who all of the “members” of the IETF would be: That’s why we refer to “participants” in the IETF; the IETF doesn’t really have “members”. Indeed, we often recruit additional implementers and other experts into working groups in order to ensure that broader views are brought into the discussion. So, voting is simply not practical.”
There are four requirements to such an algorithm:
- Validity. The result must be a value that was submitted by at least one of the processes. The consensus algorithm cannot just make up a value.
- Uniform agreement. All nodes must select the same value.
- Integrity. A node can select only a single value. That is, a node cannot announce one outcome and later change its mind.
- Termination. Also known as progress, every node must eventually reach a decision.
There are two main protocol paradigms for achieving consensus in the presence of Byzantine nodes:
- Classic BFT protocols: typically uses two voting rounds to ensure consistency
- Longest-chain consensus
|Classic BFT||Longest-chain Consensus|
|Safety/Liveness tradeoff||Favours safety in the face of an attack||Favour liveness in the face of an attack|
|Finality||Instant and deterministic||Probabilistic (at risk of potentially large chain reorganizations and double-spend attacks)|
|Fork behaviour||Rare but difficult to recover from||Embrace forks, uses in-protocol methods for resolving ambiguity as to which fork is correct|
|FLP Result Behaviour||sacrifice either liveness or consistency in the face of an attack (assuming <33% Byzantine as per FLP Result)||Does not apply as longest-chain consensus is non-deterministic|
|Permission model||Generally permissioned (see Sandglass)||Permissionless|
Note that there have been attempts to bridge Classic BFT models with Nakamoto-style consensus ones with hybrid consensus models which use a permissionless chain to determine a participant/proposer rotation in a reconfigurable BFT engine.
Comparisons between different BFT SMR protocols
All protocols are of the following:
- protocols for byzantine fault-tolerant SMR
- All work in the partially synchronous system model and obtain safety (always) and liveness (after GST) in the face of an adversary that controls replicas out of a total of replicas (per FLP Result)
- All these protocols are based on the classic leader-based primary-backup approach where leaders are replaced in a view-change (or election to use Raft terminology) protocol.
Below is a comparison of a few top protocols and their tradeoffs in authenticator complexity
|Best-case Latency (rounds)||Normal-case Communication||View-change Communication||Leader Rotation||Responsiveness|
|PBFT||2||On suspected fault||Yes|
|Tendermint||2||using thresholded signatures, otherwise||Every round||No|
|SBFT||1||On suspected fault||Yes|
Responsiveness here refers to the property that a non-faulty leader can drive the protocol to consensus in a time depending on actual message delays and not the theoretical upper bound on message transmission delays. In partially synchronous models, we use optimistic responsiveness, which requires responsiveness only after GST is reached.
Leader rotation tradeoff:
- maintaining a stable leader means less overhead and better performance due to stability when the leader is honest and trusted
- constantly rotating the leader provides a stronger fairness guarantee against stable malicious leaders
In PBFT, SBFT, and HotStuff the leader maintains a window of open slots and is allowed to concurrently work on committing all open slots in his active window. Conceptually, this is like TCP where a sender does not have to wait for the ACK of packet before sending message . This window can significantly increase throughput by allowing the leader to concurrently coordinate several actions of slot commitments.
When consensus is impossible to achieve: