jzhao.xyz

Rhizome Proposal

Last updated Mar 14, 2022

Companies of the future should derive value from the intelligence they provide on top of existing data rather than have the value be just the data.

How can we imagine more rhizomatic structures instead of arborescent systems?

DISCLAIMER: To borrow words from Robin Sloan: While it is okay to share this link, I want to underscore that I am sending it specifically to you with the hope that you will really think about it! At such a primordial stage, a proposal like this doesn’t need diffuse, drive-by attention. It needs, instead, close consideration and generous imagination.

The competitive advantage of the vast majority of today’s centralized platforms lies in their data moats and network effects. These platforms remain dominant mainly because of their data and users, not because of the quality of their service.

As a result, apps have become inseparable from data. In an ideal world, there is data-neutrality. Much like how net neutrality strives to maintain separation of provider and content markets, data neutrality strives to maintain the separation of data and application markets.

In an ideal world, we focus on local-first software that works independently of large platforms – at the end of the day platforms should be used to support efficiency of collaboration at scale, not to gate users from moving their data for the sake of retention.

I’ve spent a lot of time looking at the retrospectives of peer-to-peer protocols and distributed applications and there are 3 common themes I’ve found in all of them:

  1. Running your own infrastructure is hard. We need to think about the average non-technical user.
  2. Data availability and durability are largely unsolved. In most p2p systems, offline collaboration isn’t possible.
  3. Little thought is given to off-ramping from existing systems. We have shiny new systems, but how do we get people to switch to them?

While blockchain can be used in creative ways to overcome most of these, it currently comes with a large set of downsides that make it hard to build on top of it (e.g. expensive to store things completely on chain, slow confirmation times).

While I hope these will be mitigated in the future, I wanted to spend time exploring alternative and potentially more general-purpose means of addressing these main problems without blockchains.

My main research question is about how we can enable data-neutrality on a web dominated by data moats. A few consequences of this work:

  1. Single-purpose apps backed by general-purpose data. If two apps are views on the same data, any change to the underlying data will instantly update both apps.
  2. Applications ask for access rather than store their own data. You give apps permission to read or write specific parts of your data
  3. As there are separate markets for data and applications, it creates competition based on service quality rather than on data ownership
  4. We can get the convenience of a single centralized platform without the lack of agency that typically comes with it.
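
To make the first two points concrete, here is a minimal sketch of two apps sharing one store with per-app permissions. Everything in it is hypothetical: `Pod`, `grant`, the app names, and the `events` collection are made up for illustration and are not an actual Rhizome API.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical personal data store; not part of any real Rhizome API."""
    data: dict = field(default_factory=dict)    # collection name -> list of items
    grants: dict = field(default_factory=dict)  # app name -> {collection: set of modes}

    def grant(self, app, collection, *modes):
        # The owner gives an app "read" and/or "write" access to one collection.
        self.grants.setdefault(app, {}).setdefault(collection, set()).update(modes)

    def _check(self, app, collection, mode):
        if mode not in self.grants.get(app, {}).get(collection, set()):
            raise PermissionError(f"{app} may not {mode} {collection}")

    def read(self, app, collection):
        self._check(app, collection, "read")
        return list(self.data.get(collection, []))

    def write(self, app, collection, item):
        self._check(app, collection, "write")
        self.data.setdefault(collection, []).append(item)


pod = Pod()
pod.grant("calendar", "events", "read", "write")
pod.grant("agenda", "events", "read")

pod.write("calendar", "events", {"title": "Reading group", "day": "Tue"})

# Both apps are views on the same collection, so the agenda app sees the write.
agenda_view = [e["title"] for e in pod.read("agenda", "events")]
calendar_view = pod.read("calendar", "events")

# The agenda app was only granted read access, so its write is rejected.
try:
    pod.write("agenda", "events", {"title": "Spam"})
    agenda_blocked = False
except PermissionError:
    agenda_blocked = True
```

The apps hold no data of their own here. Each one only holds a grant, so swapping one calendar app for another would not require moving anything.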

I’m tentatively calling this project Rhizome. It aims to be a data-persistence and identity layer for the distributed web.

  1. A personal data pod that you own. Think iCloud or Dropbox but you have agency over how much storage you want, who has access to it, and what you want to do with it.
  2. A framework for easily developing cohesive peer-to-peer applications on top of data from the previous layer.

As a whole, it forms the basis for a new model of the internet where first and foremost, people own their own data.

This is a summarized version of the full vision of Rhizome. Read the full essay on data neutrality.

# Technical Details

Rhizome is a set of abstractions on top of DIDs, IPFS (specifically IPLD), Filecoin, and the Raft consensus protocol. It can be analogized to a generalized implementation of state channels which don’t need to be anchored to a chain.

When Root and Trunk are combined, their properties handily solve or avert the three problems listed above:

  1. Data replication is considered solved as devices under a single DID sync with each other. Data availability is solved with a cloud peer which can be bought from a distributed and decentralized network of providers.
  2. Users no longer need to run their own server infrastructure as compute happens natively on a user’s device rather than on some remote server. When a user needs more compute, they can utilize a cloud peer, which is like renting compute from a neutral provider.
  3. As all apps have a public schema which describes what types of events they add to the append-only event log, interoperability and data lensing are zero-cost to developers. To interoperate with outside apps, anyone can publish a schema file for the output of a data export or API call, for example.
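
As a rough illustration of the third point, the sketch below pairs an append-only event log with a public schema and a small lens that translates one app's events into another's. The schema format, `EventLog`, and `notes_to_todos` are assumptions made up for this example; the proposal does not specify what Rhizome schemas or lenses look like.

```python
from dataclasses import dataclass, field

# Hypothetical public schemas: event type -> required field names.
NOTES_SCHEMA = {"note_added": {"id", "text"}}
TODO_SCHEMA = {"todo_created": {"id", "title", "done"}}


@dataclass
class EventLog:
    schema: dict
    events: list = field(default_factory=list)

    def append(self, event_type, payload):
        required = self.schema.get(event_type)
        if required is None:
            raise ValueError(f"unknown event type: {event_type}")
        if set(payload) != required:
            raise ValueError(f"payload fields {sorted(payload)} do not match schema")
        # Append-only: existing entries are never modified or removed.
        self.events.append({"type": event_type, **payload})


def notes_to_todos(event):
    """Lens: reinterpret a note_added event as a todo_created event."""
    return {"type": "todo_created", "id": event["id"],
            "title": event["text"], "done": False}


notes = EventLog(NOTES_SCHEMA)
notes.append("note_added", {"id": 1, "text": "Buy oat milk"})

# A todo app consumes the notes log through the lens, validated by its own schema.
todos = EventLog(TODO_SCHEMA)
for event in notes.events:
    translated = notes_to_todos(event)
    payload = {k: v for k, v in translated.items() if k != "type"}
    todos.append(translated["type"], payload)
```

Because both schemas are public, the lens can be written by anyone, including a third party, without either app's developers needing to coordinate.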

Rough architecture diagram as of June 1st

# Differentiation from existing work

# Output

# Research artifacts

Blog posts explaining distributed systems concepts as I learn and become more familiar with them

# Root

The data replication and identity part of Rhizome

# Trunk

The application-level event log management and collaboration


You can find the ongoing research log here.

# Acknowledgements

Thank you to Anson Yu, Spencer Chang, Sebastien Zany, Jamie Wang, Raymond Zhong, Vincent Huang, Justin Glibert, Morgan Gallant, Ryan Johnson, David Zhou, Aadil Ali, JZ, Nishant Medicharla, Anh Pham, Farzaa Majeed, Amir Bolous, Aaron Pham, Rishi Kothari, Jasmine Sun, and Athena Leong for your continued support. This project wouldn’t be possible without all of you.