Data integrity

Consistency refers to the correctness of the data stored in a database. Inconsistent databases can lead to serious errors if the application layer built on top of them does not expect the inconsistency. Two basic perspectives on data consistency can be distinguished: one from the world of "classical" relational databases, the other from the world of distributed systems.

Distributed Systems

Distributed storage systems have gained considerable momentum and prominence with the rise of cloud computing. In such systems, data is typically replicated several times across different servers, both to increase the availability of the data and to reduce access time. The former follows because the probability that several servers fail simultaneously is significantly lower than the probability that a single one fails. The latter is explained by the fact that requests can be sent to a geographically closer replica, or that an overloaded server is relieved when some of the requests are handled by another server. In this context, consistency means that all replicas of a data item are identical. In particular, this means that a distributed storage system can be consistent for a data item A and at the same time inconsistent for a data item B. If all replicas are always identical, this is called strict consistency.
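As a minimal sketch of this per-item notion of consistency, the following Python fragment compares the values that several replicas hold for one key. The `Replica` type, the function name, and the sample data are assumptions made for illustration only.

```python
from typing import Dict, List

# Hypothetical model: each replica is a plain mapping from key to value.
Replica = Dict[str, str]


def is_consistent(replicas: List[Replica], key: str) -> bool:
    """Return True if every replica holds the same value for `key`.

    A replica that lacks the key contributes None, so it counts as
    differing from replicas that do hold a value.
    """
    values = {replica.get(key) for replica in replicas}
    return len(values) <= 1


replicas = [
    {"A": "1", "B": "x"},
    {"A": "1", "B": "y"},  # this replica disagrees on B only
    {"A": "1", "B": "x"},
]
```

With this sample data the system is consistent for item A and inconsistent for item B at the same time.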

Since it is not always practical in distributed systems to keep all replicas consistent at all times, there is also so-called weak consistency, in which no consistency guarantees are given, and so-called eventual consistency, which states that a data item will eventually become consistent provided that no write operations and no failures occur for a sufficiently long time.
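One way eventual consistency can be reached is a background exchange in which replicas adopt the newest version known for each key. The sketch below assumes that every value carries an integer version and that the highest version wins; both the `Store` type and `anti_entropy_round` are hypothetical names, not part of any particular system.

```python
from typing import Dict, List, Tuple

# Hypothetical model: key -> (version, value); higher version is newer.
Store = Dict[str, Tuple[int, str]]


def anti_entropy_round(replicas: List[Store]) -> None:
    """Bring every replica up to the newest version known for each key."""
    newest: Store = {}
    for store in replicas:
        for key, (version, value) in store.items():
            if key not in newest or version > newest[key][0]:
                newest[key] = (version, value)
    # With no concurrent writes, all replicas are identical afterwards.
    for store in replicas:
        store.update(newest)


stores: List[Store] = [{"k": (2, "new")}, {"k": (1, "old")}, {}]
anti_entropy_round(stores)
```

The sketch only converges if no writes arrive during the round, which mirrors the condition stated above.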

In the spectrum between eventual consistency and strict consistency there are several intermediate levels. A distinction is made between client-centric consistency and data-centric consistency. The former describes consistency guarantees from the perspective of the client, the latter the internal consistency guarantees of the system.

Client-Centric Consistency

Monotonic Read Consistency

Once a distributed storage system has answered a client's read request for a given key with version N, every subsequent read by that client returns only versions that are at least as new as N.
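A client-side check for this guarantee might look as follows. It is a sketch under the assumption that replicas expose an integer version (starting at 1) alongside each value; `MonotonicReadClient` and `StaleReplicaError` are illustrative names.

```python
from typing import Dict, Tuple


class StaleReplicaError(Exception):
    """Raised when a replica would return an older version than already seen."""


class MonotonicReadClient:
    def __init__(self) -> None:
        # Newest version this client has observed, per key.
        self.last_seen: Dict[str, int] = {}

    def read(self, replica: Dict[str, Tuple[int, str]], key: str) -> str:
        version, value = replica[key]
        if version < self.last_seen.get(key, 0):
            # Accepting this answer would move the client backwards in time.
            raise StaleReplicaError(f"{key}: version {version} is too old")
        self.last_seen[key] = version
        return value
```

A real system would more likely retry on another replica than raise an error; the exception merely makes the violation visible.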

Monotonic Write Consistency

If a particular client first writes value 1 and then value 2 for a particular key, the system is guaranteed to apply these writes internally in the same order. In particular, this means that (in the absence of further writes) value 1 will never overwrite value 2 on any replica.
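One possible way to obtain this ordering is for each client to number its writes and for each replica to apply them strictly in that order, holding back writes that arrive early. The per-client sequence numbers and the `Replica` class below are assumptions of this sketch.

```python
from typing import Dict, Tuple


class Replica:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.next_seq: Dict[str, int] = {}  # client id -> next expected number
        self.pending: Dict[Tuple[str, int], Tuple[str, str]] = {}

    def receive(self, client: str, seq: int, key: str, value: str) -> None:
        """Accept a write; apply it only once all earlier writes were applied."""
        expected = self.next_seq.get(client, 1)
        if seq < expected:
            return  # duplicate of a write that was already applied
        self.pending[(client, seq)] = (key, value)
        # Apply every buffered write that is now next in line.
        while (client, expected) in self.pending:
            k, v = self.pending.pop((client, expected))
            self.data[k] = v
            expected += 1
        self.next_seq[client] = expected
```

If write 2 reaches a replica before write 1, it stays buffered until write 1 has been applied, so value 1 can never end up overwriting value 2.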

Read Your Writes Consistency

Here the storage system guarantees that a process that has written a data item with version number N will not subsequently read versions older than N. A trivial implementation would be replicas that are held locally by the client and never synchronized. However, this would guarantee only weak consistency, not eventual consistency. In practice the guarantee is implemented as so-called session consistency, in which it holds only for the duration of a session. For example, all requests (reads and writes alike) of a given process can be routed to the same replica. If that replica becomes unavailable, the session is terminated.
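The routing idea can be sketched as a session object that is pinned to one replica. The `Replica` and `Session` classes and the `available` flag are hypothetical and stand in for whatever health check a real system would use.

```python
from typing import Dict


class SessionTerminated(Exception):
    """Raised when the session can no longer be served by its replica."""


class Replica:
    def __init__(self) -> None:
        self.available = True
        self.data: Dict[str, str] = {}


class Session:
    """Routes every read and write of one process to the same replica."""

    def __init__(self, replica: Replica) -> None:
        self._replica = replica
        self._open = True

    def _pinned(self) -> Replica:
        if not self._open:
            raise SessionTerminated("session was already terminated")
        if not self._replica.available:
            self._open = False  # no silent failover to another replica
            raise SessionTerminated("pinned replica is unavailable")
        return self._replica

    def write(self, key: str, value: str) -> None:
        self._pinned().data[key] = value

    def read(self, key: str) -> str:
        return self._pinned().data[key]
```

Because reads go to the replica that received the writes, the process always sees its own writes for as long as the session lasts.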

Write Follows Reads Consistency

If a process has read a data item X in version n and the same process subsequently overwrites this item, Write Follows Reads consistency guarantees that the write is performed only on replicas that hold at least version n.
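Expressed as a filter over replicas, the rule could be sketched like this. The mapping from replica name to the version it currently holds for X is an assumed, simplified representation.

```python
from typing import Dict, List


def eligible_replicas(replica_versions: Dict[str, int], version_read: int) -> List[str]:
    """Return the replicas that may accept the write.

    `replica_versions` maps a (hypothetical) replica name to the version of
    the item it currently holds; `version_read` is the version n that the
    writing process read beforehand.
    """
    return [
        name
        for name, version in replica_versions.items()
        if version >= version_read
    ]
```

For example, with replicas at versions 3, 5 and 4 and a prior read of version 4, only the last two replicas may apply the write immediately; the first must catch up before it does.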

Data-Centric Consistency

Causal Consistency

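Causal consistency means that all operations that are causally related must be serialized in the same order on all replicas. An operation O causally depends on an operation P if and only if one or more of the following conditions hold:

  • O and P were both issued by the same client, and P preceded O in time.
  • O is a read operation and reads the result of the write operation P.
  • O causally depends on another operation X, which in turn causally depends on P.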
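Causal relationships between operations are commonly tracked with vector clocks, a technique the text above does not name and which is used here only as an illustrative assumption. Each operation carries a mapping from client id to a counter; one operation causally precedes another if its clock is less than or equal in every component and strictly less in at least one.

```python
from typing import Dict

VectorClock = Dict[str, int]  # client id -> number of events seen from it


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if the operation stamped `a` causally precedes the one stamped `b`."""
    clients = set(a) | set(b)
    no_component_greater = all(a.get(c, 0) <= b.get(c, 0) for c in clients)
    some_component_less = any(a.get(c, 0) < b.get(c, 0) for c in clients)
    return no_component_greater and some_component_less


def causally_related(a: VectorClock, b: VectorClock) -> bool:
    """True if the two operations must be ordered identically on all replicas."""
    return happened_before(a, b) or happened_before(b, a)
```

Operations whose clocks are incomparable are concurrent, and causal consistency allows different replicas to apply them in different orders.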

Sequential Consistency

Sequential consistency is stricter than causal consistency in that the model requires all operations to be serialized in the same order on all replicas, with the operations of each client process executed in the order in which that client issued them.
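The two requirements can be checked on recorded replica logs, as in the following simplified sketch. It assumes that each operation is identified by a client id and a per-client sequence number and that each replica records the order in which it applied operations; these representations are illustrative, not a complete consistency checker.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, int]  # (client id, per-client sequence number)


def respects_client_order(log: List[Op]) -> bool:
    """True if each client's operations appear in ascending sequence order."""
    last_seq: Dict[str, int] = {}
    for client, seq in log:
        if seq <= last_seq.get(client, 0):
            return False
        last_seq[client] = seq
    return True


def is_sequentially_consistent(replica_logs: List[List[Op]]) -> bool:
    """True if all replicas applied one common order that keeps client order."""
    if not replica_logs:
        return True
    first = replica_logs[0]
    same_order_everywhere = all(log == first for log in replica_logs)
    return same_order_everywhere and respects_client_order(first)
```

Note that the common order need not match wall-clock time; operations of different clients may be interleaved arbitrarily as long as every replica agrees.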

Linearizability

Linearizability is stricter than sequential consistency in that the model additionally requires that the uniform order of operations correspond to the actual chronological order, and that every request appear to take effect at a single instant within the time interval between its invocation and its response.
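The additional real-time requirement can be sketched as a check on a proposed total order: an operation that completed before another one was invoked must not be placed after it. The `Op` record with `invoked` and `responded` timestamps is an assumption of this example, and the check covers only the ordering aspect, not the values returned by reads.

```python
from typing import List, NamedTuple


class Op(NamedTuple):
    name: str
    invoked: float    # time the request was issued
    responded: float  # time the response was received


def respects_real_time(order: List[Op]) -> bool:
    """True if `order` never places an operation after one it fully preceded."""
    for i, later in enumerate(order):
        for earlier in order[:i]:
            # `later` finished before `earlier` even started: order is invalid.
            if later.responded < earlier.invoked:
                return False
    return True
```

Operations whose time intervals overlap may be ordered either way, which is why a linearizable history can still admit more than one valid order.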

Consistency in classical relational databases

In relational databases, consistency is understood to mean the integrity of the data. It is defined by setting up so-called integrity constraints. There are different kinds of integrity constraints:

  • Domain integrity: The value of each attribute must lie within a specified range of values.
  • Entity integrity: The primary key of each object must be unique. It must never be null.
  • Referential integrity: The value of a foreign key must either be null, or an object with that key must exist.
  • Logical consistency: The user can also define additional integrity rules (for example, in a genealogy database: children must have been born after their parents). Such conditions usually cannot be checked by the database system and must therefore be ensured by the user.
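The first three kinds of constraint can be declared directly in a schema. The following sketch uses Python's built-in sqlite3 module; the `person` table, its columns, and the permitted year range are hypothetical choices made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """
    CREATE TABLE person (
        id         INTEGER PRIMARY KEY,                  -- entity integrity
        birth_year INTEGER NOT NULL
                   CHECK (birth_year BETWEEN 1000 AND 2100),  -- domain integrity
        parent_id  INTEGER REFERENCES person(id)         -- referential integrity
    )
    """
)


def violates(sql: str) -> bool:
    """Run a statement and report whether it breaks an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


conn.execute("INSERT INTO person VALUES (1, 1950, NULL)")
```

A rule such as "children must have been born after their parents" is not expressed by these declarations and would have to be checked separately, in line with the last list item.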

A database is consistent only if it satisfies all integrity constraints. A state in which at least one of the constraints is violated is called inconsistent.

Consistency in classical relational databases is a superset of the consistency definition from the world of distributed systems; that is, as long as not all replicas are identical, the constraints cannot all be satisfied.

Consistent transformations

Consistency is one of the four ACID properties required of database transactions. Each transaction must take the database from one consistent state to another consistent state. While the transaction is being processed, however, the consistency of the database may be temporarily violated.

After each sequence of changes to the data that a transaction represents (inserts, deletes, or updates), the database is checked against the integrity constraints. If they are not satisfied, the entire transaction must be undone so that the previous (consistent) state is restored ("rollback").
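The rollback behaviour can be observed with Python's built-in sqlite3 module, whose connection object commits on success and rolls back on an exception when used as a context manager. The `account` table and the `transfer` function are hypothetical and serve only as an example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(amount: int) -> bool:
    """Move `amount` from a to b; undo both updates if a constraint fails."""
    try:
        with conn:  # commit on success, rollback on exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'b'",
                (amount,),
            )
            # Between the two updates the database is temporarily inconsistent.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'a'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances() -> dict:
    return dict(conn.execute("SELECT name, balance FROM account"))
```

A transfer that would drive the balance of `a` below zero fails the CHECK constraint, and the credit already applied to `b` is undone together with it.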

Concurrent transactions require special care.
