Voldemort (distributed data store)
Voldemort is a distributed database management system designed as a persistent, fault-tolerant key-value store. LinkedIn uses it as a highly scalable storage system. The name is borrowed from Lord Voldemort, the villain of the Harry Potter novel series. Voldemort's development is not yet complete.
Benefits
Voldemort has a number of advantages over other databases:
- It combines an in-memory cache with the storage system, so a separate caching tier is unnecessary. The storage system itself is suitably fast.
- The storage layer can be emulated. This makes developing and testing components very easy, because they can be developed and tested against a throwaway in-memory store. There is no need to set up a real cluster or a real storage system.
- Reads and writes scale horizontally.
- Simple programming: the programming interface decides on data replication and data distribution, and leaves room for a variety of application-specific strategies.
- Transparent data partitioning allows the cluster to be expanded without redistributing all of the data.
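
The last point can be illustrated with consistent hashing, a common technique for this kind of partitioning. The following is a minimal sketch that assumes consistent hashing for illustration; it is not Voldemort's implementation, and the names `HashRing`, `node_for`, `moved_fraction`, and the `vnodes` default are hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring; names and defaults are illustrative only."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node owns several virtual positions to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        # The first ring position at or after the key's hash owns the key;
        # the modulo wraps around to the start of the ring.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


def moved_fraction(num_keys=10_000):
    """Fraction of keys whose owner changes when a fourth node joins."""
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"key-{i}" for i in range(num_keys)]
    before = {k: ring.node_for(k) for k in keys}
    ring.add_node("node-d")
    moved = sum(1 for k in keys if ring.node_for(k) != before[k])
    return moved / num_keys
```

Under these assumptions, adding a fourth node should reassign only roughly a quarter of the keys, and every reassigned key moves to the new node; the remaining keys stay where they were.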
Disadvantages
Voldemort has a number of disadvantages compared to other databases:
- Relationships between data items cannot be modeled.
- There is no query language, so a key must be known in order to retrieve its value.
- There are no transactions and therefore no ACID guarantees.
- The project is still in an early development phase, so its use in production systems should be weighed carefully.
Properties
The distributed database Voldemort has the following properties:
- Data distribution: Pluggable data distribution strategies are supported, for example to spread data across geographically distant data centers.
- Data replication: Data is automatically replicated to multiple servers.
- Data partitioning: Data is partitioned automatically, so that each server holds only a subset of the total data.
- Good single-node performance: 10,000 to 20,000 operations per second are possible, depending on the machine, network, disk system, and data replication factor.
- Independent nodes: Each node is independent of the others, with no central coordination. There is no single point of failure.
- Pluggable serialization: Both keys and values can be structured, including lists and tuples with named fields, and common serialization frameworks can be integrated. Examples of such frameworks are Avro, Java serialization, Protocol Buffers, and Thrift.
- Transparent failure handling: Server failures are handled transparently, so users do not notice them.
- Versioning: Data items are versioned to maximize data integrity in failure scenarios without restricting the availability of the system.
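
How versioning can preserve integrity without blocking writes can be sketched with vector clocks, a scheme often used in distributed key-value stores. The example below assumes such a scheme purely for illustration; the function names `increment`, `compare`, and `merge` and the node identifiers are hypothetical and do not describe Voldemort's API.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent' for clocks a and b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge(a, b):
    """Combine two clocks by taking the per-node maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


# Two replicas accept writes independently after a shared first version.
v1 = increment({}, "node-a")
v2 = increment(v1, "node-a")  # a later write handled by node-a
v3 = increment(v1, "node-b")  # a write handled by node-b without seeing v2
```

Here `compare(v1, v2)` reports `"before"`, so `v2` can safely replace `v1`, while `compare(v2, v3)` reports `"concurrent"`: neither version supersedes the other, so both would have to be kept until the application reconciles them, after which `merge(v2, v3)` yields a clock that dominates both.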