Entity–relationship model

The entity–relationship model (ER model or ERM for short; the German equivalent, roughly "object–relationship model", is seldom used) is used in semantic data modeling to describe the section of the real world that is relevant in a given context, for example a project to build an information system. An ER model consists of a diagram (ER diagram, abbreviated ERD) and a description of the elements used in it, which sets out their meaning (semantics) and their structure.

An ER model serves two purposes. In the conceptual phase of application development it supports communication between users and developers; at this stage only the "what" is treated (the subject-matter facts), not the "how" (the technology). In the implementation phase it is the basis for designing the database, which is usually relational.

ER models are the de facto standard for data modeling, even though various graphical notations for data models exist.

The ER model was introduced in 1976 by Peter Chen in his publication The Entity-Relationship Model. The descriptive constructs for generalization and aggregation were introduced by Smith and Smith in 1977. Several further developments followed, for example by Wong and Katz in the late 1980s.

Terms

Entity–relationship models are based on the typing of objects, of their relationships to one another, and of the information to be kept about them ("attributes").

Basic Components

Discussions, examples, and texts at the conceptual level refer to objects and facts of the real world (within the context under consideration); examples: "the customer orders ...", "if the account is overdrawn, then ...". These are called:

  • Entity: an individually identifiable object of reality; for example, the employee Müller, the project 3232.
  • Relationship: a link between two or more entities; for example, employee Müller heads project 3232.
  • Property (attribute value): what is of interest about an entity in the given context; for example, the entry date of the employee Müller.

In modeling, types are formed from similar instances of the items above. Every type identified is precisely defined and described in the model. The following kinds of type are distinguished:

  • Entity type: typing of similar entities; e.g. employee, project, book, author, publisher.
  • Relationship type: typing of similar relationships; e.g. employee heads project.
  • Attribute: typing of similar properties, such as last name, first name, and entry date for the entity type Employee. An attribute or combination of attributes whose value(s) describe an entity uniquely, i.e. identify it, is called an identifying attribute (or identifying attributes); for example, the attribute project number is identifying for the entity type Project.
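The distinction between the type level and individual entities can be illustrated with a minimal Python sketch. The class names `Attribute`, `EntityType`, and `RelationshipType`, the helper `identity`, and the attribute `personnel_number` are assumptions made for illustration; only the attributes last name, first name, entry date, and project number come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    identifying: bool = False  # True if the attribute is part of the identifier


@dataclass(frozen=True)
class EntityType:
    name: str
    attributes: tuple  # tuple of Attribute

    def identifier(self):
        """Names of the identifying attribute(s) of this entity type."""
        return tuple(a.name for a in self.attributes if a.identifying)


@dataclass(frozen=True)
class RelationshipType:
    name: str
    participants: tuple  # tuple of EntityType


def identity(entity_type, entity):
    """Values that identify one concrete entity (given here as a dict)."""
    return tuple(entity[name] for name in entity_type.identifier())


# Type level: entity types and a relationship type.
employee = EntityType("Employee", (
    Attribute("personnel_number", identifying=True),  # assumed identifier
    Attribute("last_name"),
    Attribute("first_name"),
    Attribute("entry_date"),
))
project = EntityType("Project", (Attribute("project_number", identifying=True),))
heads = RelationshipType("heads", (employee, project))

# Instance level: one concrete entity of type Project.
project_3232 = {"project_number": 3232}
print(identity(project, project_3232))  # prints (3232,)
```

Keeping the identifying flag on the attribute rather than on the entity type is just one possible design; it makes combined identifiers fall out naturally.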

Specific situations

For describing and representing special circumstances, ER modeling provides the following constructs:

  • Strong entity type: an entity can be identified by one or more attribute values of the same entity; e.g. the order number is identifying for the entity type Order.
  • Weak entity type: identifying such an entity additionally requires an attribute value of a strong entity that is related to the weak entity; for example, identifying the weak entity type Order Item requires, besides the item number, the order number of the strong entity type Order. In extensions of the ER model such as SERM, the weak entity type and the associated relationship type are combined into a so-called ER type, which makes diagrams more compact.
  • Cardinality: the cardinality specifies (at the level of the relationship type), for each participating entity type, in how many concrete relationships of this type its entities may or must participate. Various notations have been developed for representing cardinality, and a given modeling tool usually supports only some of them.
  • Reflexive (self-referential) relationship: a relationship between individual entities of the same entity type, and thus a relationship type that connects an entity type with itself (for example, the tree structure of an organization via "organizational unit is subdivided into organizational unit", and the network structure of a bill of materials via "part is used in part"). Synonym: recursive relationship.
  • Degree or complexity of a relationship type: the number of entity types that participate in a relationship type. Degree 2 (binary relationship type) is the norm; degree 3 (ternary relationship type) or higher occurs rarely. Ternary and higher-degree relationship types can be approximated by binary relationship types by introducing a new entity type that corresponds to the original relationship type. Example: "employee looks after supplier (for product group)"; the new entity type Supplier Support has relationships to the three original entity types. Such an approximation can be lossy, i.e. there are facts that can be represented exactly only by relationship types of higher degree.
  • Relationship attributes: usually relationship types have no attributes, since they only connect the participating entity types with each other. If additional attributes are required, the relationship type can, as with higher-degree relationships, be turned into an independent entity type with relationship types to the entity types originally involved. The attribute is then assigned to the new (weak) entity type (for example, the degree of project participation in the relationship type "employee works on project"). Depending on the modeling methodology, "attributed" relationships can also be formulated directly, but in practice a new entity type is often created instead.
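Two of the constructs above, the weak entity type and the relationship type turned into an entity type of its own, can be sketched in Python as follows. This is a minimal illustration under assumptions: the class and field names and the sample values (order numbers 1001 and 1002, 50 percent participation) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_number: int  # identifying attribute of the strong entity type


@dataclass(frozen=True)
class OrderItem:
    order: Order       # the strong entity this item depends on
    item_number: int   # unique only within one order

    def identifier(self):
        """A weak entity is identified by the owner's key plus its own number."""
        return (self.order.order_number, self.item_number)


@dataclass(frozen=True)
class ProjectParticipation:
    """The relationship 'employee works on project' as an entity type of its own,
    so that it can carry the relationship attribute."""
    employee: str
    project_number: int
    participation_percent: int  # the relationship attribute


# Two items share item number 1 but belong to different orders.
first = OrderItem(Order(1001), 1)
second = OrderItem(Order(1002), 1)
participation = ProjectParticipation("Müller", 3232, 50)
```

The item number alone does not distinguish `first` from `second`; only the combination with the order number does, which is exactly what makes Order Item a weak entity type.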

Relationships with special semantics

In the ER diagram, the substantive meaning of a relationship between entity types is expressed only by a short text in the diamond (usually a verb) or by a label on the edge, and the modeler is free to choose the label. There are, however, relationships with special semantics that occur relatively often in modeling. Special names and graphic symbols have therefore been defined for these relationship types. Specialization and generalization, and aggregation and decomposition, are complementary descriptive constructs with special semantics. With these two kinds of relationship type, real-world facts can be modeled and displayed more precisely and closer to their actual meaning. The predefined names and special graphic symbols show that these are semantically standardized relationships with special rules.

Entity and relationship types modeled in this special way, usually only in semantic data models, can be implemented in the database in different ways, for example (identically to the model) as separate tables, or as a shared table with comments or attribute names that characterize the special relationship. The implementation decision, like the determination of the cardinality of these special relationships, is made during database modeling.

Specialization and generalization by is-a relationship

In specialization, an entity type is recognized and declared as a subset of another entity type. The specialized entity set is distinguished from the superordinate, generalized set by special properties (attributes and/or relationships that apply only to it). In particular, besides the identification, all relationships of the generalized individual object also apply to the specialized individual object, since it is "the same" individual object, and all properties of the generalized set also hold for an individual object of the specialized set. Relationship types of the kind "specialization/generalization" are described by is-a/can-be ("is a ..."/"can be a ..."). Instead of is-a, a-kind-of ("a kind of ...") is occasionally used. These are 1:c relationships.

Example of an is-a relationship: air travel is-a journey and, read in the other direction, journey can-be air travel, with attributes such as date and price (for the journey) and relationships to the entity type Flight (for the air travel).

The is-a relationship described here (between identical individual objects) must not be confused with the is-element-of relationship (the assignment of an individual object to another), for which the notation is-a is also occasionally used, e.g. "flight is-a flight" (which would be semantically wrong).
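One possible reading of this distinction in an object-oriented language is sketched below in Python: subclassing expresses the is-a relationship between the types, while membership of one concrete object in a type corresponds to is-element-of. The class layout, the field names, and all sample values (date, price, flight number) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Journey:
    start_date: date
    price: float


@dataclass
class Flight:
    flight_number: str  # hypothetical attribute


@dataclass
class AirTravel(Journey):
    # Property that applies only to the specialized entity type.
    flights: list = field(default_factory=list)


trip = AirTravel(date(2024, 5, 1), 499.0, [Flight("XY123")])

# is-a between types: every air travel is a journey.
air_travel_is_a_journey = issubclass(AirTravel, Journey)

# is-element-of: one concrete trip belongs to the set of journeys.
trip_is_element_of_journey = isinstance(trip, Journey)

# The specialized object carries all properties of the generalized type.
inherited_price = trip.price
```

Class inheritance is only an approximation of ER specialization: in the ER model the same individual object is a member of both entity sets, which the sketch reflects through `trip` being an instance of both classes.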

A specialization may also declare several specialized entity types. It must then be determined whether individual entities of the generalized entity type may be absent from all specializations, and whether they may occur in only one specialized entity set (as alternatives) or in several specialized entity sets at the same time. Example: a customer is a private individual or a corporate client; one of the two relationships must be present.

Whereas specializations arise by forming subsets of given entity sets, generalization combines common attributes and relationships that occur in different entity types into a new entity type. For example, customers and suppliers can be combined into business partners, since name, address, bank account, etc. occur for both customers and suppliers. In this example the resulting generalization relationship type starts from Business Partner and leads to the two entity types Customer and Supplier. Whether, for an individual business partner, the relationship may exist to only one of the two or to both must be specified through the cardinality.

The distinction above between specialization and generalization results only from the order in which the entity types are identified during modeling; the result is always relationship types that are a specialization in one direction and a generalization in the other. Where needed, several specializations/generalizations can exist for the same entity type. For example, employees can be specialized into external employees or internal employees (disjoint) and additionally into "senior staff". Specialized entity types can themselves be specialized or generalized again (cascaded).

The graphical representation of specializations and generalizations is not provided for in the original ER diagram, but it is used in extensions such as SERM.

Aggregation and decomposition by is-part-of relationship

If several individual objects (e.g. person and hotel) are combined into a self-contained individual object (e.g. reservation), this is called aggregation. The superordinate, self-contained whole is called the aggregate; the parts of which it is composed are called components. Aggregate and components are each declared as entity types.

Aggregation/decomposition distinguishes between role aggregation and set aggregation. Role aggregation occurs when several role-specific components exist and are combined into one aggregate; this is a 1:c relationship.

Example of an is-part-of relationship: football team is-part-of football game and venue is-part-of football game; read in the other direction: football game consists-of football team and venue.

Set aggregation occurs when the aggregate is formed by combining individual objects of exactly one component. This is a 1:cN relationship.

Example of set aggregation: football player is-part-of football team; read in the other direction: football team consists-of (several, N) football players.
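The two kinds of aggregation in the football example can be sketched in Python roughly as follows. The split into a home and an away team, the method `components`, and all names of players, teams, and the venue are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FootballPlayer:
    name: str


@dataclass
class FootballTeam:
    name: str
    # Set aggregation: N components of exactly one type.
    players: list = field(default_factory=list)


@dataclass
class Venue:
    name: str


@dataclass
class FootballGame:
    # Role aggregation: each component fills a specific role in the aggregate.
    home_team: FootballTeam  # assumed role
    away_team: FootballTeam  # assumed role
    venue: Venue

    def components(self):
        """Read in the direction 'football game consists-of ...'."""
        return [self.home_team, self.away_team, self.venue]


home = FootballTeam("Team A", [FootballPlayer("Player 1"), FootballPlayer("Player 2")])
away = FootballTeam("Team B", [FootballPlayer("Player 3")])
game = FootballGame(home, away, Venue("City Stadium"))
```

In this sketch the role aggregation appears as named fields of different kinds, while the set aggregation appears as one list of like components.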

Contents of the ER model

ER diagrams

The graphical representation of entity types and relationship types is called an entity–relationship diagram (ERD) or ER diagram. These are overviews or graphics that, depending on the size of the model, often contain between 10 and 40 entity types. Showing all their relationships can produce a complex-looking, net-like structure. For very large models, usually only submodels (excerpts from the overall model) are shown graphically, for reasons of clarity. Colloquially, ERDs are sometimes loosely called the "data model"; in a broader sense, however, that term also covers the textual descriptions.

Forms of notation in ER diagrams:

Various forms of representation are in use. Usually a rectangle is used for the entity type, and the relationship type is mostly drawn as a connecting line with special line ends or labels that represent its cardinality.

Today there are a variety of notations, which differ in clarity, in the scope of the graphical language, and in their support by standards and tools. Some important examples follow; they show above all that, despite all graphical differences, the core statements of ER diagrams are almost identical.

Of particular, partly historical, importance are:

  • The Chen notation by Peter Chen, the developer of ER diagrams, 1976; extended to include the representation of attributes by the Modified Chen notation (MC notation);
  • IDEF1X, a long-standing de facto standard for U.S. authorities;
  • The Bachman notation by Charles Bachman, a widespread diagram language in tools;
  • The Martin notation (crow's foot notation), a widespread diagram language in tools (Information Engineering);
  • The (min, max) notation by Jean-Raymond Abrial, 1974;
  • UML, a standard that even ISO uses in its own standards as a replacement for ER diagrams. Attributes (not shown in the diagram) can be represented as class attributes; relationship attributes, by contrast, are modeled with association classes.

Each of these notations expresses the following facts in its own way:

  • A person is born in one (1) place. A place is the birthplace of any number of persons.
  • Whether every person must refer to a birthplace (in which case a place "unknown" would have to exist) and/or whether there may be places in which, according to the data, no person was born is not shown in the Chen notation; the other notations show it graphically with different symbols.

The (min, max) notation differs fundamentally from the other notations in how the cardinality is determined and in where it is written in the ER diagram. In all other notations, the cardinality of a relationship type is determined by asking, for one entity of an entity type, how many entities of the other entity type can participate. In the (min, max) notation the cardinality is defined differently: for each entity type participating in a relationship type, one asks for the minimum and maximum number of relationships in which an entity of that type participates. The resulting (min, max) pair is written next to the entity type for which the question was asked.

The numerical difference between the (min, max) notation and all other notations only shows up in ternary and higher-degree relationship types. For binary relationship types, the only difference is that the cardinalities are swapped.

Cardinality notations that use 'n' without a min/max specification have a semantic deficit: they give no indication of whether 'n' includes 0, i.e. whether the relationship is optional. Whether, for example, with a 1:n relationship "heads" between employee and project, a project may be without a heading employee, even temporarily, remains open and must be stated explicitly in words.
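For a binary relationship type, the swap of positions and the missing minimum can be made concrete with a small Python sketch. The function name `chen_to_min_max` is hypothetical, and the minimum values passed in the example are assumptions, since the Chen-style labels do not state them.

```python
def chen_to_min_max(chen_at_a, chen_at_b, min_a, min_b):
    """Convert Chen-style labels of a binary relationship type A-B
    into (min, max) pairs.

    chen_at_a, chen_at_b: labels written next to A and B, e.g. "1" or "N".
    min_a, min_b: minimum participation of A and B; these must be supplied
    separately because the Chen-style labels carry no minimum.
    Returns ((min, max) at A, (min, max) at B).
    """
    # The maximum noted at A in (min, max) notation is the Chen label at B,
    # and vice versa: the positions are swapped.
    return (min_a, chen_at_b), (min_b, chen_at_a)


# "Person is born in Place": N persons to 1 place in Chen notation.
# Assumed minimums: every person has a birthplace (1),
# and a place need not be anyone's birthplace (0).
person_side, place_side = chen_to_min_max("N", "1", min_a=1, min_b=0)
print(person_side, place_side)  # prints (1, '1') (0, 'N')
```

The sketch applies to binary relationship types only; for ternary and higher-degree relationship types the two definitions of cardinality are not related by a simple swap.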

Description of the model components

While the ER diagram shows the entity types relevant in the context and their relationships (at type level), the details are recorded in separate descriptions. The purpose of this documentation is to make the facts worked out understandable and communicable in a consistent and unambiguous way (same terms!) and to provide the information that is available from a conceptual point of view for the later phases of the project.

Examples of possible contents:

  • For entity types: name, short name, definition, example(s), further explanations, estimated volumes, new or already existing, ...
  • For relationship types: short name, entity types involved, relationship statement 1 ("employee heads project"), relationship statement 2 (in the reverse direction), cardinality, any further conditions for the relationship ("only for private individuals"), ...
  • For attributes: name, short name, definition, example, further explanations, format (e.g. number, 2 decimal places), value range (1–99), identifying for the entity (Y/N/partially), ...

Which contents are used in detail can be prescribed by the modeling tool or defined by the organization (e.g. through document templates). If objects occur in the ER model that already exist in the organization, they are usually adopted in their existing form (as copies ...). Conversely, new objects from the ERM flow into the company's central data model after the project.

Use of the ERM during database design

The ER model is frequently used in connection with the design of databases. Here the semantic ERM is either extended directly, or a copy of it is taken as the basis for a new ER model, which is extended until it forms the basis for implementing the database. Translating the facts recognized in the real world (and modeled) into a database schema takes several steps:

  • Identifying entities and combining them into entity types by abstraction (e.g. the colleagues Fritz Maier and Paul Lehmann and many more into the entity type Employee);
  • Identifying relationships between pairs of objects and combining them into a relationship type (example: the clerk Paul Lehmann heads the project "Improvement of the working environment", the clerk Fritz Maier heads the project "Increasing efficiency in the administration"; these findings lead to the relationship type "employee heads project");
  • Determining the cardinalities, i.e. the frequency of occurrence (e.g. a project is always headed by exactly one employee; an employee may head several projects).

These steps can be represented in an ER model according to the examples shown above.

In addition, the following steps are required, whose results are often not represented graphically but only added as descriptive text:

  • Determining and describing in detail the relevant attributes of each entity type, such as field length, value range, whether a value is required, etc.
  • Determining suitable attributes of an entity type as identifying attribute(s), so-called key attributes. If necessary, "artificial keys" must be defined.
  • Specifying further details on the implementation of relationship types, such as mandatory relationship, foreign key or relationship table, and referential integrity.
  • Generating the schema of a relational database with all its tables and the associated field definitions with their respective data types.

Depending on the modeling tools used and on the methodology prescribed for the project, it is not always necessary to distinguish between the ERM and the database model. This may be the case, for example, with small database projects or tasks in which the database is designed with an end-user database (e.g. Microsoft Access) and the documentation, including the ERD, is supported by functions of the same system.

Depending on the tool, it is also possible to transfer the model content into another tool for database design and to edit it further there. Especially in this case, care should be taken to keep the two design levels consistent.

Transfer to a relational model

The transfer of an entity–relationship model into the relational model is essentially based on the following mappings:

  • Entity type → relation
  • Relationship type → foreign key; in the case of an n:m relationship type → additional relation
  • Attribute → attribute

The exact transfer, which can be automated, is done in 7 steps:

  1. For each strong entity type, a relation R is created whose attributes are a primary key k together with the attributes of the entity type.

  2. For each weak entity type, a relation R is created with the attributes of the entity type, its own key k, and a foreign key to the strong entity type. The primary key consists of k together with that foreign key, where k identifies the weak entity and the foreign key identifies the strong entity.

  3. For a 1:1 relationship type between entity types T and S, one of the two relations is extended by a foreign key to the other relation.

  4. For a 1:N relationship type between entity types T and S, the relation T that enters with cardinality N (or 1 in (min, max) notation) is extended by a foreign key to relation S.

  5. For each N:M relationship type, a new relation R is created whose attributes are the attributes of the relationship type together with the primary keys of the participating relations.

  6. For each multi-valued attribute in T, a relation R is created whose attributes are the multi-valued attribute and k, a foreign key to T.

  7. For each relationship type with a degree n > 2, a relation R is created whose attributes are the foreign keys to the participating entity types together with the attributes of the relationship type. If all participating entity types enter with a cardinality greater than 1, the primary key is the set of all foreign keys. In all other cases the primary key contains n − 1 foreign keys, and the foreign keys to entity types that enter with a cardinality greater than 1 must always be contained in the primary key.
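The rules for strong entity types, 1:N relationship types, and N:M relationship types can be tried out with a small sketch that uses Python's standard `sqlite3` module. The table and column names, the helper `build_database`, and the sample rows are assumptions for illustration; the schema follows the example "a project is headed by exactly one employee, an employee may head several projects" and adds an N:M relationship "works on" with one relationship attribute.

```python
import sqlite3

SCHEMA = """
-- Strong entity types: one relation each, with a primary key.
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    last_name   TEXT NOT NULL
);
-- 1:N relationship type 'employee heads project': the relation on the
-- N side (project) is extended by a foreign key to employee.
CREATE TABLE project (
    project_number INTEGER PRIMARY KEY,
    head_id        INTEGER NOT NULL REFERENCES employee(employee_id)
);
-- N:M relationship type 'employee works on project': a relation of its own,
-- keyed by the primary keys of both participating relations.
CREATE TABLE works_on (
    employee_id           INTEGER NOT NULL REFERENCES employee(employee_id),
    project_number        INTEGER NOT NULL REFERENCES project(project_number),
    participation_percent INTEGER,
    PRIMARY KEY (employee_id, project_number)
);
"""


def build_database():
    """Create an in-memory database with foreign-key checks enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


conn = build_database()
conn.execute("INSERT INTO employee VALUES (1, 'Müller')")
conn.execute("INSERT INTO project VALUES (3232, 1)")
conn.execute("INSERT INTO works_on VALUES (1, 3232, 50)")
```

Declaring `head_id` as NOT NULL encodes the assumption that a project always has a heading employee; dropping the constraint would make the relationship optional on the project side.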
