Data type

Formally, a data type in computer science is the combination of sets of objects with the operations defined on them. By means of a so-called signature, such a data type specifies only the names of these object sets and operations. A data type specified in this way does not yet have any semantics.

The far more commonly used, but more specific, meaning of the term comes from programming languages and denotes the combination of concrete value ranges and the operations defined on them into a unit. Examples include integers or decimal numbers, strings, or more complex types such as date/time or objects. To distinguish the two meanings, the literature also uses the term concrete data type for these data types. For a discussion of how programming languages deal with data types, see typing.

The conceptual transition from the formal definition to the concrete data types used in programming languages takes place through the step-by-step introduction of semantics for the formally specified names of the object sets and operation sets. Giving meaning to the operation set leads to abstract data types, or algebras. Additionally fixing the object set yields the concrete data type.


Formal definition of a data type by a signature

A signature is a pair (sorts, operations), where the sorts are names for sets of objects and the operations are names for operations on these sets. An example is shown here for integers, called Simple Integer because it is a simplified version of the well-known (concrete) data type Integer described in more detail below:

  Simple Integer
    sorts:       int
    operations:  zero:            -> int
                 +   : int x int  -> int
                 -   : int x int  -> int

This is a signature for an assumed data type Simple Integer, on which only the two operations + and - (besides the "generator" operation zero) are allowed. We call the only sort int. The operation zero is used to generate an int element. The operations + and - are each binary and each in turn yield an element of the sort int. It is important to note that this is a purely syntactic specification. What an int is, is not defined anywhere. That would require assigning the sort name to a set; a reasonable assignment in this case would be the set of natural numbers. Likewise, nothing is stated about how the operations work beyond their arity and their result sort. Whether the symbol + corresponds to the addition operation is not defined here; this would not even be possible, since it is not known whether the operation works on the natural numbers at all. Such assignments belong to the realm of semantics. A specification extended by semantics could, for example, assign the set of natural numbers to the sort int, the number 0 to zero, and addition and subtraction of natural numbers to + and -.

With this, however, the scope of a signature is already exceeded. Such a specification would instead be called an algebra. In this form, however, the specification comes closer to the programming-language understanding of the term data type, to which much of the rest of this article is devoted.
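The difference between a signature and an algebra can be sketched in Python. The following is only an illustration, not a formal construction: the names SORTS, OPERATIONS, INTERPRETATION, arity, and apply are hypothetical, and interpreting the sort int with Python's built-in integers (rather than with the natural numbers) is an assumption made for brevity.

```python
from typing import Callable, Dict, Tuple

# The signature is purely syntactic: sort names, plus operation names
# with their argument sorts and result sort. No meaning is attached.
SORTS = {"int"}
OPERATIONS: Dict[str, Tuple[Tuple[str, ...], str]] = {
    "zero": ((), "int"),             # zero:            -> int
    "+": (("int", "int"), "int"),    # +   : int x int  -> int
    "-": (("int", "int"), "int"),    # -   : int x int  -> int
}


def arity(op: str) -> int:
    """Number of arguments of an operation, as stated by the signature."""
    return len(OPERATIONS[op][0])


# An algebra adds semantics. Assumption of this sketch: the sort int is
# interpreted by Python integers, zero by 0, + and - by the usual arithmetic.
INTERPRETATION: Dict[str, Callable[..., int]] = {
    "zero": lambda: 0,
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
}


def apply(op: str, *args: int) -> int:
    """Evaluate an operation after checking its arity against the signature."""
    if len(args) != arity(op):
        raise TypeError(f"{op} expects {arity(op)} argument(s)")
    return INTERPRETATION[op](*args)
```

Replacing INTERPRETATION with a different mapping (for example, arithmetic modulo some number) would give a different algebra for the same signature, which is exactly why the signature alone carries no meaning.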

Data types in programming languages

Many programming languages offer their own set of predefined data types whose value ranges are the same in principle, such as integers, floating-point numbers, or strings. The actual names of these data types and the exact definitions of their value ranges and associated operations differ, however, in some cases considerably, since they depend on the programming language, the computer platform, and other compiler-dependent factors.

Data types are used in programming to assign concrete semantics to memory areas. These memory areas are called variables or constants. Data types allow a compiler or runtime environment to check the type compatibility of the operations specified by the programmer. Some illegal operations are detected as early as compile time, so that, for example, dividing the string 'HANS' by the number 5, which is not meaningful and is undefined in conventional programming languages, is prevented.

A distinction is made between elementary and composite data types. Another classification concept is the ordinal data type.
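As a small illustration of such a check, the following Python sketch attempts the division mentioned above. Python is dynamically typed, so here the runtime environment, not a compiler, rejects the operation; a statically typed language would typically refuse to compile the equivalent code. The helper names divide and try_divide are hypothetical.

```python
def divide(a, b):
    """Divide a by b; Python checks the operand types when the operation runs."""
    return a / b


def try_divide(a, b):
    """Return the quotient, or the name of the error raised by the type check."""
    try:
        return divide(a, b)
    except TypeError as exc:
        return type(exc).__name__


print(try_divide(10, 5))      # a meaningful division of two numbers
print(try_divide("HANS", 5))  # rejected: a string cannot be divided by a number
```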

Ordinal data types

Ordinal data types are characterized by a fixed order relation defined on them, which assigns each of their values a unique ordinal number. The order of the values is thus fixed. As a consequence,

  • every value except the first has exactly one direct predecessor, and
  • every value except the last has exactly one direct successor.

Whether an elementary data type is also an ordinal data type depends on the definition in the specific programming language. Examples:

  • The enumeration type is an ordinal data type in Pascal, since its values are ordered from left to right in the definition; successors and predecessors can be determined via standard functions. In C this is not the case.
  • Boolean is a special enumeration type with the two values false (ordinal value 0) and true (ordinal value 1), usually written with the English words "false" and "true".
  • Integers and natural numbers are inherently ordinal data types.
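The predecessor and successor properties can be sketched with a Python enumeration. This is only an illustration: the type Color and the functions succ and pred are hypothetical and merely imitate what Pascal offers as standard functions.

```python
from enum import IntEnum


class Color(IntEnum):
    """Enumeration whose values are ordered by their ordinal numbers."""
    BLACK = 0
    RED = 1
    BLUE = 2
    YELLOW = 3


def succ(value: Color) -> Color:
    """Direct successor; raises ValueError for the last value, which has none."""
    return Color(value + 1)


def pred(value: Color) -> Color:
    """Direct predecessor; raises ValueError for the first value, which has none."""
    return Color(value - 1)
```

Python's own Boolean type behaves in the way described above as well: int(False) is 0 and int(True) is 1, so False < True holds.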

Elementary data types

Elementary data types, also called simple or primitive data types, can hold only a single value from the corresponding value range. They have a fixed number of values (discreteness) and a defined upper and lower bound (finiteness). For this reason, real numbers can be represented only as floating-point numbers with a certain precision. A programming language defines basic operations for elementary data types; for numbers these are the basic arithmetic operations. Depending on the programming language and value range, the data types have different names and are written in upper or lower case (here, for orientation, all in upper case).

Integers

  • Names: BIGINT, BIN, BIN FIXED, BINARY, COMP, INT, INTEGER, LONG, LONG INT, LONGINT, MEDIUMINT, SHORT, SHORTINT, SMALLINT
  • Value range: mostly 32 bits (-2^31 ... 2^31-1), also 16 bits or 64 bits
  • Operations: +, -, *, <, >, =, division with remainder and modulo

Natural Numbers

  • Names: BYTE, CARDINAL, NATURAL, UNSIGNED, UNSIGNED CHAR, INT UNSIGNED, UNSIGNED LONG, UNSIGNED SHORT, WORD
  • Value range: mostly 32 bits (0 ... 2^32-1), also 8 bits, 16 bits, or 64 bits
  • Operations: +, -, *, <, >, =, division with remainder and modulo
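The value ranges quoted for integers and natural numbers follow directly from the number of bits. A minimal sketch, assuming the common two's complement representation for signed integers (the function names signed_range and unsigned_range are hypothetical):

```python
from typing import Tuple


def signed_range(bits: int) -> Tuple[int, int]:
    """Smallest and largest value of a two's complement integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits: int) -> Tuple[int, int]:
    """Smallest and largest value of an unsigned integer with `bits` bits."""
    return 0, 2 ** bits - 1


for width in (8, 16, 32, 64):
    print(width, signed_range(width), unsigned_range(width))
```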

Fixed-point numbers ( decimal )

  • Names: COMP-3, CURRENCY, PACKED DECIMAL, DEC, DECIMAL, NUMERIC
  • Value range: depends directly on the maximum number of digits, which usually has to be specified; CURRENCY (64 bits): -922,337,203,685,477.5808 ... 922,337,203,685,477.5807
  • Operations: +, -, *, <, >, =, division with remainder and modulo
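The CURRENCY range quoted above is what results when a signed 64-bit integer is read as a number with four decimal places. The following sketch reproduces the two bounds; the scaling factor of 10,000 is inferred from the four decimal digits in the quoted range, and the names SCALE and currency_from_raw are hypothetical.

```python
from decimal import Decimal

SCALE = 10_000  # four decimal places, inferred from the quoted CURRENCY range


def currency_from_raw(raw: int) -> Decimal:
    """Interpret a signed 64-bit integer as a fixed-point value with four decimals."""
    if not -(2 ** 63) <= raw <= 2 ** 63 - 1:
        raise OverflowError("raw value does not fit into 64 bits")
    return Decimal(raw) / SCALE


CURRENCY_MIN = currency_from_raw(-(2 ** 63))
CURRENCY_MAX = currency_from_raw(2 ** 63 - 1)
print(CURRENCY_MIN, CURRENCY_MAX)
```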

Enumerated Types

  • Names: ENUM, SET, or implicit
  • Value range: freely selectable, for example (BLACK, RED, BLUE, YELLOW)
  • Operations: <, >, =

Boolean ( logical values)

  • Names: BOOL, BOOLEAN, LOGICAL, or implicit (without a name)
  • Value range: (TRUE, FALSE) or (= 0, ≠ 0) or (= -1, = 0)
  • Operations: NOT, AND, XOR, NOR, NAND, OR, =, ≠

Character ( single character )

  • Names: CHAR, CHARACTER
  • Value range: all elements of the character set (for example, letters)
  • Operations: <, >, =, conversion to INTEGER, ...

Floating-point numbers

  • Names: DOUBLE, DOUBLE PRECISION, EXTENDED, FLOAT, HALF, LONG REAL, REAL, SINGLE, SHORT REAL
  • Value range: various definitions
  • Operations: +, -, *, /, <, >, =

Bit sets

Bit sets represent a set of several bits. To preserve type safety, some programming languages provide a separate data type and dedicated operators (for example, for union or intersection) for bit sets.

Bit sets are not to be confused with enumerated types or arrays, because several elements of the data type (that is, of the set) can be addressed simultaneously. Many programming languages use integer data types to represent bit sets, so that numbers and bit sets are assignment-compatible, although arithmetic operators make no sense for bit sets, nor set operators for integers.

  • Names: SET, BITSET
  • Value range: { } for the empty set, {i} for the set containing the element i, {i, j} for the set containing the elements i and j
  • Operations: comparison operators, conversion to an integer or to an element of a character set, set operators
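The representation of bit sets by integers can be sketched as follows. The helpers bitset and members are hypothetical; the set operators are Python's ordinary bitwise operators on integers.

```python
from typing import List


def bitset(*elements: int) -> int:
    """Build a bit set: bit i of the result is 1 exactly if i is an element."""
    result = 0
    for i in elements:
        result |= 1 << i
    return result


def members(bits: int) -> List[int]:
    """List the elements contained in a bit set, in ascending order."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


EMPTY = bitset()      # { }
a = bitset(1, 3)      # {1, 3}
b = bitset(3, 4)      # {3, 4}

print(members(a | b))  # union
print(members(a & b))  # intersection
```

Because a and b are plain integers here, an expression such as a + 1 is accepted as well, although it has no meaning for sets. This is the loss of type safety described above.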

Pointer Types / Dynamic Data Types

Pointers are a special case: in many programming languages their actual value range remains anonymous, because they are "only" references to arbitrary other data types. Depending on the referenced type, pointers to certain elements have names of their own, such as pointers to files, printers, or pipes.

Object-oriented programming languages store the data type referenced by the pointer (for example, for instance variables) together with the address to which the pointer points, so that assignment compatibility can be checked not only for the data type of the address but also for the referenced content. This is possible even at run time and is also required for some applications (for example, polymorphism).

Pointer

  • Names: ACCESS, POINTER, IntPtr, or simply an asterisk (*)
  • Value range: address of the base type (often anonymous)
  • Operations: reference, dereference; in some languages also +, -, *, /

Constant null pointer

  • Names: NULL, VOID, None, NIL, Nothing
  • Value range: none
  • Operations: =
  • Meaning: this pointer differs from all pointers to objects.

Procedure types

Some programming languages, such as Oberon, support procedure types. These are used for pointer variables that can point to different procedures with the same formal parameter list.
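The idea of a procedure type can be approximated in Python with a type alias for callables. This is only a sketch: Python does not enforce the annotation at run time, and the names BinaryOp, add, multiply, and fold are hypothetical.

```python
from typing import Callable, Iterable

# "Procedure type": any procedure taking two integers and returning an integer.
BinaryOp = Callable[[int, int], int]


def add(x: int, y: int) -> int:
    return x + y


def multiply(x: int, y: int) -> int:
    return x * y


def fold(op: BinaryOp, values: Iterable[int], start: int) -> int:
    """Combine all values with the procedure that `op` currently refers to."""
    result = start
    for value in values:
        result = op(result, value)
    return result


# The same variable can refer to different procedures with the same parameter list.
operation: BinaryOp = add
total = fold(operation, [1, 2, 3, 4], 0)
operation = multiply
product = fold(operation, [1, 2, 3, 4], 1)
print(total, product)
```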

Composite Data Types

Composite data types are data constructs built from simpler data types. Since they can, in theory, be arbitrarily complex, they are often already counted among the data structures. Common to most programming languages are:

Array (sequence, tuple, table; "field" is ambiguous)

  • Names: ARRAY (or implicit definition with [n] or (n), without a name)
  • Value range: mapping from a finite set (the index set) to the value range of a base type (the element type). The index set must be ordinal. Applying several indexes yields a multi-dimensional array.
  • Operations: <, >, =, assignment where the types are assignment-compatible
  • Example: type 3D_VECTOR is ARRAY (1 to 3) of INTEGER;

Fixed-length strings

  • Names: ARRAY OF CHAR, CHAR(n), CHAR[n]
  • Value range: all possible strings
  • Operations: string functions (substring, concatenation), <, >, =

Variable-length strings

  • Names: STRING, ARRAY OF CHAR, VARCHAR, CLOB, TEXT
  • Value range: strings of variable length
  • Operations: string functions (substring, length, concatenation), <, >, =

Variable-length binary strings

  • Names: BLOB
  • Value range: binary strings of variable length
  • Operations: length, concatenation, =

Record (composite, structure)

  • Names: RECORD, STRUCT, CLASS (extended meaning), or implicit definition via level numbers
  • Value range: a record contains a sequence of components, which may have different data types. Any type is allowed as a component type. In some object-oriented programming languages (for example, Oberon), type-bound procedures (methods) can also be attached to a record to describe the behavior of its components.
  • Operations: comparison (only equality or inequality), assignment where the types are assignment-compatible (strongly dependent on the programming language)
  • Example: type EXAM is RECORD (subject: STRING, student: STRING, points: INTEGER, teacher: STRING, date: DATE)
  • Many programming languages offer ways to interpret the storage area of a record in several different ways. This is called a variant record or UNION. In this case, however, there is usually no type safety.
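The array and record examples from the list above might look as follows in Python. This is a loose transcription, not an exact equivalent: the names Vector3D and Exam and the English field names are hypothetical, and a Python tuple is immutable, unlike an ARRAY in most languages.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple

# Array: a fixed number of components of the same element type.
Vector3D = Tuple[int, int, int]


@dataclass
class Exam:
    """Record: named components that may have different data types."""
    subject: str
    student: str
    points: int
    teacher: str
    exam_date: date


origin: Vector3D = (0, 0, 0)
first = Exam("Mathematics", "A. Student", 12, "B. Teacher", date(2024, 6, 1))
second = Exam("Mathematics", "A. Student", 12, "B. Teacher", date(2024, 6, 1))

# Records support comparison for equality or inequality, component by component.
print(first == second)
```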

Additional individual format specifications

When data types are used in the source code of a program, additional individual format specifications are often applied to a chosen data type. For example, a date (or, more generally, a point in time) may be stored as an elementary integer type, supplemented by information about the form in which it is processed or presented. The date is then stored, for example, as the number of milliseconds since January 1, 1970, 00:00, and converted on that basis into other forms (such as 'dd.mm.yyyy' or 'MM.DD hh:ss'). Alternatively, a date could of course also be represented as a composite (for example, of three numbers for day, month, and year).

Functions as first-class values

In many contemporary programming languages, regular function values, function literals, and anonymous functions are available in addition to function pointers. These were developed on the basis of the lambda calculus and were implemented in LISP as early as 1958 (albeit with flawed dynamic binding). Correct, that is, static binding was specified, for example, for Algol 68. That functions are to this day still only partly realized as values is due to the fact that this concept is only now beginning to spread outside of computer science.
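The two representations of a date described above can be sketched in Python. The function names and the chosen output formats are assumptions made for illustration; the counter is interpreted in UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_millis(ms: int) -> datetime:
    """Interpret an integer as milliseconds since January 1, 1970, 00:00 UTC."""
    return EPOCH + timedelta(milliseconds=ms)


def format_day_first(ms: int) -> str:
    """Present the stored integer in the form dd.mm.yyyy."""
    return from_millis(ms).strftime("%d.%m.%Y")


def as_composite(ms: int) -> Tuple[int, int, int]:
    """Alternative representation as a composite of day, month, and year."""
    moment = from_millis(ms)
    return moment.day, moment.month, moment.year


print(format_day_first(0))
print(as_composite(86_400_000))  # one day after the epoch
```

The stored value stays a plain integer in both cases; only the format specification decides how it is presented.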
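A short Python sketch of functions used as values, including a function literal and static (lexical) binding. The names make_adder, compose, and add_five are hypothetical.

```python
def make_adder(n):
    """Return a new function; n is bound statically, where the lambda is written."""
    return lambda x: x + n


def compose(f, g):
    """Functions can be passed as arguments and returned as results."""
    return lambda x: f(g(x))


add_five = make_adder(5)

# This global n is unrelated to the n captured by add_five. Under static
# binding it has no influence on the result; under dynamic binding it would.
n = 100

double_then_add_five = compose(add_five, lambda x: x * 2)
print(add_five(1), double_then_add_five(10))
```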

Universal data type

A universal data type is the type of the values in a programming language that supports untyped variables. It is usually the discriminated union of the types of the values that occur (elementary, composite, functions, and so on). The universal data type characteristically occurs in universal scripting languages. Notable examples of universal data types in languages of other kinds are the lambda calculus, in which functions are the only values, and Prolog, in which the data are given by the Herbrand structure.

Abstract Data Types

In an abstract data type (ADT), access to the data (reading or writing) takes place only through the specified operations, so the data is encapsulated from the outside. Each ADT contains a data type or a data structure.

Object-oriented programming languages support the creation of ADTs through their class concept, since data and operations are bound together there and the data can be protected. Some modular programming languages, such as Ada or Modula-2, also specifically support the creation of abstract data types.

From a domain point of view, an abstract data type defines a value range with domain-specific meaning and its particular characteristics. For example, the data type 'customer number' may be based on the elementary type 'integer' but differs from it by a defined length and, for example, a check digit in the last position. It thus forms a subset of all integers of the defined length. Several interdependent data items can also be combined into one ADT. An example is the common representation of time periods: a start date and an end date (both of data type 'date') are linked by an integrity constraint, so that the permitted value range of the end date is subject to additional conditions. Ultimately, an ADT is an arbitrarily complex value range that is tied to static and/or dynamic values and to associated rules for determining values.
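The time period example can be sketched as a Python class. This is one possible design under stated assumptions: the class name Period, its method days, and the choice to reject an end date before the start date with a ValueError are all hypothetical.

```python
from datetime import date


class Period:
    """Time period ADT: the end date must not lie before the start date."""

    def __init__(self, start: date, end: date) -> None:
        if end < start:
            raise ValueError("end date must not precede start date")
        self._start = start
        self._end = end

    # Read-only properties: the data can be reached only through
    # the operations the type defines, never assigned directly.
    @property
    def start(self) -> date:
        return self._start

    @property
    def end(self) -> date:
        return self._end

    def days(self) -> int:
        """Length of the period in days."""
        return (self._end - self._start).days


january = Period(date(2024, 1, 1), date(2024, 1, 31))
print(january.days())
```

Because the constraint is checked in the constructor and the fields are read-only, a Period whose end lies before its start cannot be produced through the public operations.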

Anonymous data types

Some programming languages, as well as the XML schema definition language XML Schema, support the concept of the anonymous data type. This is a data type for which no name is defined.
