A protocol for peer-to-peer data stores.
- We made Willow to make running a network together a sustainable practice.
- We made Willow to reconcile peer-to-peer networks with social realities. Wrangling the complexity of distributed systems shouldn’t mean we trade away basic features like deletion, or accept data structures which can only grow without limit.
Properties
- As many of these stores as you want, keyed to different namespaces. When stores from different devices belong to the same namespace, they deterministically sync with each other.
- Private and end-to-end encrypted. Other users can’t find out what you’re interested in unless they already know about it themselves.
- Total delete via prefix pruning (essentially cutting a tree of causal dependencies by trimming down to root and marking that with a single tombstone). Destructive edits. When you update a value, the old values and associated metadata are overwritten.
- Fine grained access control. Restrict read and write access by semantically meaningful ranges of data, or time range.
- Peers can communicate resource budgets, so devices with very limited memory can sync too.
Data Model
Willow is a system for giving meaningful, hierarchical names to arbitrary sequences of bytes (called payloads).
For any given subspace, you can address payloads via paths (e.g. blog/idea/1
and blog/idea/2
).
- Entries can be overwritten by more recent entries (Willow tracks timestamps and deletes actually delete everything except the metadata)
- Deletes are hierarchical (e.g. deleting
blog
will delete all of tis subpaths).- This is called prefix pruning
Entries live in separate subspace owned by different users (intuitively, each user writes to their own, separate universe of data. Willow allows for various ways of controlling who gets to write to which subspace)
Interestingly, namespaces can also be aggregated into namespaces.
Some types
Payload
: at most bytesEntry
: metadata for a payloadNamespaceId
: keys namespacesSubspaceID
: keys subspacesPath
: parametrized bymax_component_length
: max length for a path segmentmax_component_count
: max path segments in a pathmax_path_length
: overall limit for size of path
Timestamp
: time in microseconds since Unix epoch timepayload_length
PayloadDigest
: content addressing for a payload- calculated from
hash_payload(payload) -> hash
: mapsPayload
toPayloadDigest
- calculated from
An entry e1
is newer than an entry e2
if:
e2.timestamp < e1.timestamp
ore2.timestamp == e1.timestamp && e2.payload_digest < e1.payload_digest
ore2.timestamp == e1.timestamp && e2.payload_digest == e1.payload_digest && e2.payload_length < e1.payload_length
Auth:
AuthorisationToken
: proving write permissionis_authorized_write(entry, token) -> bool
: maps a path and token to whether that token proves a valid permission to write to entryPossiblyAuthorisedEntry
is a pair of anEntry
and anAuthorisationToken
AuthorisedEntry
is aPossiblyAuthorisedEntry
for whichis_authorised_write
returns true
Stores are collection of AuthorisedEntry
:
- All
Entry
have the sameNamespaceId
- An invariant such that an
Entry
a
cannot both a prefix of anotherEntry
b
anda
be newer thanb
- That is, updating
blog
will invalidateblog/abc
(is this true?)
- That is, updating
A join of two stores is obtained by:
- Start with the union of the two stores
- Remove all
Entry
e1
where there is someEntry
e2
such thate2.path
is a parent ofe1.path
ande2
is newer thane1
- For all
Entry
with the sameSubspaceID
,Path
, andTimestamp
, remove them all except for the one with the greatestPayloadDigest
- For all
Entry
with the sameSubspaceID
,Path
,Timestamp
, andPayloadDigest
, remove them all except for the one with the greatestpayload_length
Stores form a state-based CRDT under the join operation.
Grouping Entries
An application might want to access all chess games that a certain author played in the past week. This kind of query corresponds to a box in the three-dimensional Willow space.
There are one-dimensional queries called ranges which work along SubspaceId
, Path
, or Timestamp
A 3dRange
is a 3-tuple of ranges across all three dimensions:
struct 3dRange
subspaces: SubspaceRange
paths: PathRange
times: TimeRange
Sync Protocol
Requirements:
- Incremental sync: peers can detect regions of shared data with relatively sparse communication to avoid redundant data transfer
- Partial sync: peers synchronise only those regions of data they both care about
- Private area intersection: peers can discover common interests without disclosing any non-shared information to each other
- Resource control: peers communicate (and enforce) their computational resource limits so as not to overload each other
- Transport independence
- General efficiency: peers can make use of efficient implementation techniques, and the overall bandwidth consumption stays low