A node that has a copy of the data is called a replica
Replication is the act of ensuring consistency of data across replicas. If one replica is faulty, others are ideally still accessible
Of course, if data doesn’t change, this is an easy problem: just copy it. Hard problem is when the data changes.
Can take inspiration from hardware systems! RAID (Redundant Array of Independent Disks) which is used to replicate within a single computer fills a similar role but RAID has a single controlled whereas distributed systems have nodes that act independently.
An important concept in replication (and message broadcast) is making sure that we avoid cases where losing an ACK could lead to users doing an action multiple times (e.g. pressing the like button)?
This can be done by ensuring idempotence in our actions.
# State machine replication
Can be done by FIFO-total order broadcasting every update to all replicas. Whenever a replica delivers an update message, it applies it to its own state