A protocol for peer-to-peer data stores.
- We made Willow to make running a network together a sustainable practice.
- We made Willow to reconcile peer-to-peer networks with social realities. Wrangling the complexity of distributed systems shouldn’t mean we trade away basic features like deletion, or accept data structures which can only grow without limit.
Properties
- As many of these stores as you want, keyed to different namespaces. When stores from different devices belong to the same namespace, they deterministically sync with each other.
- Private and end-to-end encrypted. Other users can’t find out what you’re interested in unless they already know about it themselves.
- Total delete via prefix pruning (essentially cutting a tree of causal dependencies by trimming down to root and marking that with a single tombstone). Destructive edits. When you update a value, the old values and associated metadata are overwritten.
- Fine grained access control. Restrict read and write access by semantically meaningful ranges of data, or time range.
- Peers can communicate resource budgets, so devices with very limited memory can sync too.
Data Model
Willow is a system for giving meaningful, hierarchical names to arbitrary sequences of bytes (called payloads).
For any given subspace, you can address payloads via paths (e.g. blog/idea/1
and blog/idea/2
).
- Entries can be overwritten by more recent entries (Willow tracks timestamps and deletes actually delete everything except the metadata)
- Deletes are hierarchical (e.g. deleting
blog
will delete all of tis subpaths).- This is called prefix pruning
Entries live in separate subspace owned by different users (intuitively, each user writes to their own, separate universe of data. Willow allows for various ways of controlling who gets to write to which subspace)
Interestingly, namespaces can also be aggregated into namespaces.
Some types
Payload
: at most bytesEntry
: metadata for a payloadNamespaceId
: keys namespacesSubspaceID
: keys subspacesPath
: parametrized bymax_component_length
: max length for a path segmentmax_component_count
: max path segments in a pathmax_path_length
: overall limit for size of path
Timestamp
: time in microseconds since Unix epoch timepayload_length
PayloadDigest
: content addressing for a payload- calculated from
hash_payload(payload) -> hash
: mapsPayload
toPayloadDigest
- calculated from
An entry e1
is newer than an entry e2
if:
e2.timestamp < e1.timestamp
ore2.timestamp == e1.timestamp && e2.payload_digest < e1.payload_digest
ore2.timestamp == e1.timestamp && e2.payload_digest == e1.payload_digest && e2.payload_length < e1.payload_length
Auth:
AuthorisationToken
: proving write permissionis_authorized_write(entry, token) -> bool
: maps a path and token to whether that token proves a valid permission to write to entryPossiblyAuthorisedEntry
is a pair of anEntry
and anAuthorisationToken
AuthorisedEntry
is aPossiblyAuthorisedEntry
for whichis_authorised_write
returns true
Stores are collection of AuthorisedEntry
:
- All
Entry
have the sameNamespaceId
- An invariant such that an
Entry
a
cannot both a prefix of anotherEntry
b
anda
be newer thanb
- That is, updating
blog
will invalidateblog/abc
(is this true?)
- That is, updating
A join of two stores is obtained by:
- Start with the union of the two stores
- Remove all
Entry
e1
where there is someEntry
e2
such thate2.path
is a parent ofe1.path
ande2
is newer thane1
- For all
Entry
with the sameSubspaceID
,Path
, andTimestamp
, remove them all except for the one with the greatestPayloadDigest
- For all
Entry
with the sameSubspaceID
,Path
,Timestamp
, andPayloadDigest
, remove them all except for the one with the greatestpayload_length
Stores form a state-based CRDT under the join operation.
Grouping Entries
An application might want to access all chess games that a certain author played in the past week. This kind of query corresponds to a box in the three-dimensional Willow space.
There are one-dimensional queries called ranges which work along SubspaceId
, Path
, or Timestamp
A 3dRange
is a 3-tuple of ranges across all three dimensions:
struct 3dRange
subspaces: SubspaceRange
paths: PathRange
times: TimeRange
Sync Protocol
Requirements:
- Incremental sync: peers can detect regions of shared data with relatively sparse communication to avoid redundant data transfer
- Partial sync: peers synchronise only those regions of data they both care about
- Private area intersection: peers can discover common interests without disclosing any non-shared information to each other
- Resource control: peers communicate (and enforce) their computational resource limits so as not to overload each other
- Transport independence
- General efficiency: peers can make use of efficient implementation techniques, and the overall bandwidth consumption stays low
3D Range-based Set Reconciliation
- To reconcile two sets, one peer first computes a hash over all items in its set, and sends this fingerprint to the other peer. That peer then computes the fingerprint over its items as well. If the fingerprints match, they are done reconciling.
- If they do not match, there are two options.
- First, the peer can split its set in half and then initiate set reconciliation for each half concurrently (by transmitting its hashes for both halves).
- Second, if the set is sufficiently small, the peer can instead simply transmit its items in the set. The other peer responds to this with all other items that it held in the set, completing the process of reconciliation.
Capabilities System
When interacting with a peer in Willow, there are two fundamental operations:
- writing data — asking your peer to add
Entries
to their stores, and - reading data — asking your peer to send
Entries
to you
Both operations should be restricted. In Willow, this is done via a capabilities system called Meadowcap.
A capability is an unforgeable token that bestows read or write access for some data to a particular person, issued by the owner of that data. A capability bestows not only access rights but also the ability to mint new capabilities for the same resources but to another peer
Capability Delegation
A capability bestows not only access rights but also the ability to mint new capabilities for the same resources but to another peer.
Consider Alfie and Betty, each holding a key pair. Alfie can mint a new capability for Betty by signing his own capability together with her public key: sign(capability + betty.pub_key, alfie.priv_key)
Note that capability can be turned into a more restrictive subset.
Encrypted Willow
- Payloads:
- Append a nonce to each plaintext (so that equal plaintexts at different paths have different digests)
- Apply some padding to a prespecified length (so that the length of the plaintext is not leaked)
- Encrypt the result (so that the contents stay confidential), and
- Use the resulting cyphertext as a Payload
- Computing joins of stores necessitates comparing Timestamps numerically.
- Willow deals in plaintext Timestamps only.
- Unfortunately this still leaks information (i.e. it isn’t indistinguishable from random data)
- There is work around order-preserving encryption