Relational database

A relational database is used for electronic data management in computer systems and is based on the relational database model. This was first proposed in 1970 by Edgar F. Codd and is still, despite some criticisms of an established standard for databases.

The associated database management system is called a relational database management system or RDBMS (Relational Database Management System). To query and manipulate the data, the database language SQL (Structured Query Language) is mainly used.

Basis of the concept of relational databases is the relation, a well-defined term in the mathematical sense. It is a mathematical description of a table; see database relation. Operations on these relations are determined by the relational algebra. The relational algebra is thus the theoretical foundation of SQL.

Despite the mathematical, abstract definition of the database model relational databases are relatively easy and flexible to handle. This had a great impact on the success of this database technology.

  • 4.1 Understanding the relational algebra
  • 4.2 limitations of relational algebra

Basic Concepts

A relational database can be thought of as a collection of tables ( the relations), in which records are stored. Each row ( tuple ) in a table is a record (record). Each tuple consists of a series of attribute values ​​( attribute = properties ), the columns of the table. The relation scheme shall specify the number and type of attributes for a relation. The picture illustrates the relation R with attributes A1 to An in the columns.

For example, a book is described in a library by the record (Book ID, author, publisher, publishing year, title, date of recording). A record must be clearly identifiable. This is done through one or more keys (English key). In this case, book - ID contains the keys. A key may never change. It refers to the data set and not in the position in the table.

Relationships between tables

Further links can be used to express the relationships between tables. A library database could thus be implemented as follows with three tables:

Table book, which contains one row for each book:

  • Each line consists of the table columns ( attributes): book ID, author, publisher, publishing year, title, date of recording.
  • The key is the book - ID because it uniquely identifies each book.

Table user that contains the data of all registered library users:

  • The attributes might include user ID, first name, last name.

You'll also need a third table borrowed that contains information about the availability of the book. They would include the attributes User- ID and book ID. Each row of this table borrowed assigns a user ID to a book ID.

The entry ( 10,3) would therefore mean that the user with the ID 10 (" Hans prolific writer " ) has borrowed the book with the ID3 ( " My Life with Asterix "). The same user has also the book "Printing Made Easy" loan, which is indicated by the table entry (10, 2) occupied. As a key one here takes the set of attributes (user ID, Book ID). At the same user ID each entry in the table for loan with an entry in the table users, and the book ID associates connects each entry of loan with an entry in the table book. Therefore, these attributes are called in this context, foreign key (german foreign key ). Tables without foreign key is called flat tables.

As used herein the term relation does not correspond to the facts of relationship / relationship ( as used in the Entity -Relationship Model ), but is associated with relational databases as an English term for table (mostly from entity arising in the ERM ).

Demarcation

In addition to the relational database model, there are various alternative approaches, which allow you to manage data in other structures. These concepts often have only a small importance or have not yet enforced. Nevertheless, they provide for certain applications a simpler connection of data to be managed.

Older approaches

In the 60s and 70s, hierarchical database systems and network database systems have been used for operational data processing. In these, the data structure or table is defined in the design phase and can not be varied during the search. They are used in special cases even today.

Object-oriented databases

With the advent of object-oriented programming languages ​​are increasingly object databases were offered. Thus, objects of OO languages ​​such as Java can be held directly in the database - an illustration of the objects to the relational table structure, object-relational mapping, is then no longer necessary. This approach has advantages over the relational design if you want to store complex data objects that can be mapped only hard on the flat relational table structures. However, object databases still have disadvantages over relational databases when processing large amounts of data. This is caused for example by access paths to objects via multiple path types (eg inheritance and association ). This results in write operations in the lock management to an exponential complexity and thus in poor performance. The performance issues were addressed in the object-relational databases in which only the constructs of object-oriented databases were acquired with a lower complexity ( for example ).

Object-relational databases

Some vendors add their relational databases added object-oriented features and then call this object-relational databases. However, these are not intended for direct imaging of objects of the programming language - they are just using the concept of inheritance in defining and querying tables with similar field structures, thus simplifying their handling. The SQL - 99 standard has been extended to object-relational language elements.

Semi-structured Databases

Newer concepts are the semi-structured databases. They differ from conventional database models is that they do not have a fixed predefined schema. The database is hierarchical, tree-like and each database unit ( engl. entity) of the same type can have different sets of attributes.

Typical representatives of this type are XML databases, which manage the data as XML fragments. The XML data here are structured hierarchically and can contain as long as they are well-formed according to XML definition arbitrary structures. The data can be queried using XQuery or XPath. Today, proprietary language extensions are used to manipulate. Disadvantage of current XML databases is the lower in comparison to relational systems performance.

Semi-structured databases can be realized through extensions or server programming with relational DB, the relational model, but is no longer used.

Theory of relational databases

The foundations of the theory of relational database were laid by Edgar F. Codd in the 1960s and 1970s and described in his work A Relational Model of Data for Large Shared Data Banks. Theoretically, based on all operations of relational algebra.

Understanding the relational algebra

The relational algebra is an algebraic model that describes how data can be stored, retrieved and manipulated. The essential operations of which all can be further derived are the following:

  • Projection ( projection engl. )
  • Selection (English selection)
  • Cross product or Cartesian product (german cross product, cross join or Cartesian product )
  • Renaming (English rename)
  • Association ( engl. union)
  • Difference ( engl. set difference or minus)

All requests that are made to a relational database using SQL are mapped by the database management system on these operators, that is translated. In practice, there are other operators, such as the join operator, but which also is only a combination of the cross product selection and projection.

Limitations of relational algebra

The relational algebra does not provide support for the computation of recursive queries ( transitive closure ). This means for example that it is not possible to calculate in a query all the ancestors of a person when they are stored in a relation person and is connected via a relation VorfahreVon with the respective ancestors in person. The ancestors can only be determined by a sequence of requests.

With the introduction of SQL -99, however, an extended relational algebra introduced which allows an operation for calculating the transitive closure.

Database schema and modeling

An important component of a relational database is its schema. The schema defines what data is stored in the database and how these data in relation to each other. The process for creating a schema is called data modeling.

For the modeling of relational databases and the Entity-Relationship model is used. It is used to design a conceptual schema which can be implemented using a database management system ( DBMS). This step is called logical design or data model mapping and has as a result of a database schema in the implementation data model of the DBMS.

An important step in the modeling process is the normalization. This is intended to reduce redundancies and avoid anomalies, so as to simplify the maintenance of a database, as well as to ensure the consistency of the data. Edgar F. Codd has proposed four normal forms which have since come in the relational database design are used and have been supplemented by other.

Criticism of the relational database model

With the collective term NoSQL non-relational database models are referred to the problems like the aforementioned intend to solve by alternative approaches.

Juxtaposition of basic concepts

674665
de