Kademlia

Kademlia is a technology for peer-to -peer network that implements a distributed hash table, that stores information in a distributed network. Kademlia defines only the type and structure of the network. It was developed by Petar Maymounkov and David Mazières. It is commonly used file-sharing tools, but is not limited to this application.

Use file-sharing

Although Kademlia is commonly used by file sharing programs, files can not be transmitted with this protocol. Rather, the information needed to locate the desired data stored in the distributed network. The files can then be transferred to another protocol as eDonkey2000 or BitTorrent.

File sharing programs of the third generation, whose protocols Kademlia also heard that store information about the network in a distributed hash table, so each node stores a redundant part of this information. This structure replaces the central indexing server programs of the first generation. At the same time the cost to O ( log n) decreases to find desired files.

Operation

Most Kademlia is used over the Internet using the connectionless User Datagram Protocol ( UDP / IP).

Each node is identified by a unique number ( called "Node ID") identified. This number is used not only to identify, but is used by Kademlia simultaneously for other purposes. The own node calculates a random node ID, if he has never been on the net before.

A node that wants to join the network, a " bootstrapping " must first pass through said process: In this phase, the algorithm from a server or user is on the network, the IP address of some nodes that are already known in the Kademlia network. Now from these first node receives the IP addresses of other nodes so that no dependency exists from each node. To this end, he first searches for nodes that are similar to one's own ID to position themselves as favorably as possible where such an ID is expected.

Since there is no central instance that takes an index of available information, this task is shared between all nodes equally. A node which has information, first calculates the hash value of the characteristic ("ID" referred to ) this information. The hash function used in a Kademlia network must always be the same. That node will search in the network, the node whose ID have the smallest " distance" to the hash, and communicated to them his contact information.

Finds a node exactly this information, he takes the same procedure and thereby reaches the nodes that have stored, who owns this web information. Often, the seeker is only the hash value of the data available. Now he can go and receive the data a direct connection to the sources. It is thus ensured that the seeker finds the contact details of the source at the exact spot where this left them. Since the network is usually in constant flux, the contact information will be distributed to multiple nodes and updated by the source after a certain time.

To find a node, the algorithm lurches ever closer to it until he is found or returns an error. The number of search during this maximum to be interviewed node corresponds to the distance of this node to itself is a If the number of participants doubled in the network, so you do not ask twice as many nodes, but only one node per doubling more. Thus, the required bandwidth does not scale linearly with the size of the network, but logarithmic.

Another advantage lies in the decentralized structure that significantly increases the resistance against DDoS attacks. Even if a number of nodes is attacked, which itself has no great impact on the network. Over time, the network then knit these new "holes" around.

" Distance" of nodes

The above-mentioned " distance" has nothing to do with geography, but means the distance within the ID range. It may happen that, for example, a node from Germany and one from Australia, so to speak, " neighbors " are. The distance between two nodes and hash values ​​are calculated by the mathematical XOR function and is ID1 ID2 XOR.

Clients

Clients that implement a Kademlia algorithm (the actual networks for user data such as files are usually not compatible with each other ):

  • EMule (version 0:42 ) and aMule (version 2.1.0 ), Name: Kad
  • MLDonkey Name: Kad & Overnet
  • BitTorrent (version 4.1.0 beta )
  • UTorrent
  • KTorrent
  • Vuze
  • CSpace ( an instant messenger )
  • Deluge ( implemented using libtorrent, compatible with uTorrent )
  • Rtorrent (version 0.8.0 )
  • EDonkey2000, Name: Overnet ( development set )
459607
de