Distributed hash table

A distributed hash table (English Distributed Hash Table, DHT) is a data structure that can be for example, used to store the location of a file in a P2P system. The focus is on decentralization and the efficiency of data storage in the foreground.

The data are distributed as evenly as possible over the existing storage nodes. Each storage node corresponds to an entry in the hash table. The self-organizing data structure can reproduce the failure, the entry and exit of nodes. The basis for distributed hash tables consistent hash functions.

A distinction DHTs after the memory scheme. The data can be stored directly within the DHT (direct storage) or in the DHT, a reference can be kept on the data (indirect storage). Direct storage is only appropriate for small data (<1 kB ), otherwise the system would be too inflexible.

Properties

Properties of DHTs are:

  • Fault tolerance: The system should function reliably even when nodes fail or leave the system.
  • Load Distribution: keys are distributed evenly across all nodes.
  • Robustness: The system should be able to work "correctly", even if some (possibly a majority ) of the node attempts to disrupt the system.
  • Self-organization: There is no manual configuration is needed.
  • Scalability: The system should be able, with a large number of nodes to remain functional.

General Operation

Using a hash function be assigned to the data objects key in a linear range of values, which is distributed as evenly as possible over the nodes of the node set. Each node is responsible for at least a portion of the key space, but often are also several nodes responsible for the same area, change the responsibilities dynamically. Accession Protocol provides for the addition of new nodes in the existing system. The protocol then establishes the links to the neighboring nodes and sets usually also the construction of routing tables.

The routing tables are used by the DHT node to identify other nodes that are responsible for specific records. The definition of " distance " is connected to the structure and topology, and varies in different systems. They need not necessarily coincide with the physical organization of the nodes. A distributed hash table that placed their nodes in a Euclidean space could choose the node with the smallest Euclidean distance to a key. The routing tables that should allow each node to reach the next node to a key in O ( log n) search steps.

By providing a generic interface, the only two functions publish ( key, content) and lookup ( key ), the implemented algorithms have it replaced.

Implementations

Currently, there are, among others, the following implementations of distributed hash tables:

  • Chord
  • Kademlia - structures based on this algorithm exist in several P2P networks, however, are usually not compatible with each other. implementations: KAD - development of eMule development teams on the Kademlia algorithm to replace the failing based with the time server of the eDonkey2000 network.
  • Mojito - development of LimeWire development teams to quickly source determination within the Gnutella network.

Applications

DHTs for data storage

  • OpenDHT
  • Bamboo
  • The Chord / DHash Project
  • FreePastry
  • TomP2P

Software

  • Apt- p2p ( Package Management System): A distributed update based on apt
  • Azureus ( file sharing client ): BitTorrent-Client
  • BitComet BitTorrent client
  • BitTorrent (Version 4.1.0)
  • Coral: distribution network for content
  • Deluge ( BitTorrent Client)
  • EDonkey2000, Name: Overnet
  • EMule (version 0.40) and aMule (version 2.1.0 ), Name: Kad
  • Free Download Manager: Free download manager, which also dominates BitTorrent and DHT
  • Freenet: Anonymous data storage.
  • Halite BitTorrent Client
  • KTorrent BitTorrent client
  • LimeWire ( Name: Mojito )
  • Retroshare: server -less instant messenger, anonymous / file sharing, newsgroup, Voice over IP, e-mail
  • MLDonkey ( Overnet and Kademlia )
  • qBittorrent: BitTorrent-Client
  • Rtorrent: BitTorrent-Client
  • Transmission ( BitTorrent ): BitTorrent-Client
  • CSpace: Instant Messenger with Kademlia
  • UTorrent BitTorrent client
  • YaCy: Distributed Search Engine

DHT research

  • Project IRIS ( Infrastructure for Resilient Internet Systems )
  • Data structure
  • Hash
291118
de