Summarization of the Kademlia paper

A peer-to-peer distributed hash table (DHT)

Participating computers each have a node ID in the 160-bit key space. Key-value pairs are stored on nodes with IDs “close” to the key for some notion of closeness. A node-ID-based routing algorithm lets anyone efficiently locate servers near any given target key

A get(key) operation traverses the identifier space and, upon hitting a node storing key, returns the key’s corresponding contact list. Then, the requesting node can contact these nodes, in parallel or in some application-specific way, to download the stored data.

Core ideas

  • Uniform ID Space: names for both data and nodes share the same ID space
    • But collisions are ok as they mean different things depending on the context
  • Decide what machine to place data on depending on a measure of ‘closeness’ between the name of the data and name of a machine
    • Euclidean distance can be ambiguous, two entries could have the same distance
    • XOR distance between and some arbitrary will always be unique
  • Each node only needs information about its neighbours.
    • If the structure of the network is correctly chosen, then global properties apply to the whole network
    • Nested routing similar to IP routing
    • Each machine has a lot of info about machines closest to it, less about machines far away from it

Security

  • It is very vulnerable to Sybil attacks, which can result in the modification or erasure of any data in the network.
  • It also uses no message encryption whatsoever