How do we capture assumptions in a system model for distributed systems?

Network behaviour (e.g. message loss)

  • Reliable: message is received if and only if it is sent, messages may be reordered
  • Fair-loss: messages may be lost, duplicated, or reordered. A message eventually gets through if you keep retrying (can be upgraded to reliable using retry + packet deduplication)
  • Arbitrary: active adversary, may interfere with messages (can be upgraded to fair-loss using TLS)
  • Network partition: some links dropping/delaying all messages for an extended period of time
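
The fair-loss to reliable upgrade mentioned above (retry plus deduplication) can be sketched as a small simulation. Everything here is hypothetical and for illustration only: the `FairLossLink`, `Receiver`, and `reliable_send` names, the loss and duplication probabilities, and the choice to send acknowledgements back over the same lossy link. Reordering is not modelled.

```python
import random


class FairLossLink:
    """Hypothetical lossy channel: a packet may be dropped or duplicated."""

    def __init__(self, loss=0.5, dup=0.3, seed=0):
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.loss = loss
        self.dup = dup

    def transmit(self, packet):
        """Return the copies that actually arrive (possibly none)."""
        if self.rng.random() < self.loss:
            return []
        if self.rng.random() < self.dup:
            return [packet, packet]
        return [packet]


class Receiver:
    """Delivers each message ID to the application at most once."""

    def __init__(self):
        self.seen = set()
        self.delivered = []

    def on_packet(self, packet):
        msg_id, payload = packet
        if msg_id not in self.seen:  # deduplicate retries and duplicates
            self.seen.add(msg_id)
            self.delivered.append(payload)
        return msg_id  # acknowledge every copy, including duplicates


def reliable_send(link, receiver, msg_id, payload, max_attempts=1000):
    """Retry until an acknowledgement gets back; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        acked = False
        for copy in link.transmit((msg_id, payload)):
            ack = receiver.on_packet(copy)
            if link.transmit(ack):  # the ack itself may be lost
                acked = True
        if acked:
            return attempt
    raise RuntimeError("no acknowledgement received; giving up")


link = FairLossLink(loss=0.5, dup=0.3, seed=42)
receiver = Receiver()
for i in range(20):
    reliable_send(link, receiver, i, f"m{i}")
```

The sender keeps retrying because a lost acknowledgement is indistinguishable from a lost message, which is exactly why the receiver has to deduplicate by message ID.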

Node behaviour (e.g. crashes)

  • Crash-stop: node is faulty if it crashes. After it crashes, it stops executing forever
  • Crash-recovery: node may crash at any moment, losing its in-memory state. It may resume executing some time later (sometimes called an omission fault)
  • Byzantine: a node is faulty if it deviates from the algorithm. Faulty nodes may do anything, including crashing or malicious behaviour
  • Correct: not faulty
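
As a rough illustration of the crash-recovery model, the sketch below separates state that survives a crash (written to disk) from state that is lost (held only in memory). The `RecoverableNode` class, its JSON file layout, and the counter it stores are assumptions made for this example, not part of any particular system.

```python
import json
import os
import tempfile


class RecoverableNode:
    """Hypothetical node that keeps a counter in stable storage."""

    def __init__(self, path):
        self.path = path
        self.recover()

    def recover(self):
        """Rebuild state after a (re)start: only on-disk state survives."""
        self.scratch = []  # volatile, always starts empty
        if os.path.exists(self.path):
            with open(self.path) as f:
                self.durable = json.load(f)
        else:
            self.durable = {"count": 0}

    def increment(self):
        self.durable["count"] += 1
        self.scratch.append(self.durable["count"])
        # Write to a temporary file, flush to disk, then rename atomically,
        # so a crash mid-write never leaves a half-written state file.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.durable, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def crash(self):
        """Simulate a crash by discarding everything held in memory."""
        self.scratch = None
        self.durable = None


state_file = os.path.join(tempfile.mkdtemp(), "node_state.json")
node = RecoverableNode(state_file)
node.increment()
node.increment()
node.crash()
node.recover()
```

After `recover()`, the counter is restored from disk while the in-memory `scratch` list is empty again. Under crash-stop there would be no `recover()` step at all.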

Timing behaviour (e.g. latency)

  • Synchronous: message latency no greater than a known upper bound
  • Partially synchronous: asynchronous for some finite (but unknown, possibly arbitrarily large) periods of time, synchronous otherwise
    • Like the synchronous model, it assumes a shared global clock with bounded drift
    • There is an unknown transition point GST (global stabilization time) where the system goes from asynchronous to synchronous.
      • All messages sent in an asynchronous period are delivered by time GST + Δ, where Δ is the known bound on message delay once the network is synchronous
      • All messages sent at time t in the synchronous period arrive by time t + Δ
    • The key difference is that a node can wait for a sufficiently long delay (Δ) after the start of a round: if the network has reached synchrony, it is then guaranteed to have received all messages from all non-Byzantine nodes
  • Asynchronous: messages can be delayed arbitrarily, no timing guarantees
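
The partially synchronous delivery guarantee can be written as a one-line formula. This sketch assumes the usual formulation: a message delay bound `delta` (Δ) that holds from the global stabilization time `gst` onward. The function names are hypothetical. GST is unknown to the nodes themselves, so the functions are useful for analysing a protocol, not for running inside one.

```python
def delivery_deadline(send_time, gst, delta):
    """Latest arrival time of a message sent at send_time.

    Before GST the message may be delayed, but it arrives by gst + delta.
    From GST onward it arrives within delta of being sent.
    """
    return max(send_time, gst) + delta


def wait_is_sufficient(round_start, wait, gst, delta):
    """True if waiting `wait` after round_start is guaranteed to catch
    every message that was sent at round_start."""
    return round_start + wait >= delivery_deadline(round_start, gst, delta)


# Sent before GST: only the gst + delta bound applies.
early = delivery_deadline(send_time=3, gst=10, delta=2)
# Sent after GST: the ordinary synchronous bound applies.
late = delivery_deadline(send_time=15, gst=10, delta=2)
```

With these numbers, a round starting at time 15 can safely wait for Δ = 2, whereas a round starting at time 3 cannot, because the network has not yet stabilized.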

Identity and Messages

  • Authenticated: a Byzantine node cannot forge a message or change the contents of a received message before it relays the message to other nodes
  • Non-authenticated: nodes have no way of verifying the authenticity of a received message
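
A minimal sketch of what authentication buys, using only the Python standard library. It assumes sender A and receiver C share a secret key that the relaying node B does not know, and uses an HMAC as the authenticator. Systems that need a message to be verifiable by every node typically use public-key signatures instead, which the standard library does not provide. The key, the message contents, and the helper names are all hypothetical.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 authenticator for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the authenticator using a constant-time comparison."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical key known to A and C, but not to the relay B.
key_ac = b"example key shared by A and C only"

message = b"commit block 7"
tag = make_tag(key_ac, message)

# An honest relay forwards the pair unchanged.
honest = (message, tag)
# A Byzantine relay alters the contents but cannot recompute the tag.
tampered = (b"commit block 8", tag)

honest_ok = verify(key_ac, *honest)
tampered_ok = verify(key_ac, *tampered)
```

In the non-authenticated model C has no such check, so it cannot tell the honest relay from the tampering one.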

Permissioning

  • Permissioned: all nodes in the cluster are known ahead of time
  • Permissionless: anyone can join the cluster