XML Schema (W3C)

XML Schema, abbreviated XSD (XML Schema Definition), is a W3C Recommendation for defining the structure of XML documents. Unlike the traditional XML DTDs, the structure is itself described in the form of an XML document. In addition, a large number of data types are supported.

XML Schema is a complex schema language for describing data types, individual XML Schema instances (documents), and groups of such instances. A specific XML schema is also called an XSD (XML Schema Definition) and, as a file, usually has the extension ".xsd". In contrast to DTDs, XML Schema can distinguish between the name of the XML type and the name of the XML tag used in the instance.

Besides XML Schema, other approaches to defining XML structures exist, each with different intentions, such as DTD, RELAX NG, or Schematron.

Data types

XML Schema distinguishes between simple (atomic) data types and complex data types. In the text below, the term type refers to the abstract specification of the structure of a section within an XML document. Data types in XML Schema are classified as either predefined (built-in) or custom (user-defined).

The W3C specification for XML Schema defines 19 predefined primitive data types (such as boolean, string, float, date, and NOTATION) and another 25 data types derived from them (such as ID and integer).

Simple Types

XML Schema provides some basic atomic data types. These include the "classical" types that are also specified, to some extent, in other type systems (e.g. C, Java, or SQL):

  • xs:string
  • xs:decimal
  • xs:integer
  • xs:float
  • xs:boolean
  • xs:date
  • xs:time

There are also other XML-specific atomic types, including:

  • QName: qualified name, a globally unique identifier. It is made up of so-called NCNames (non-colonized names): an optional prefix, which stands for a namespace, and a local name within that namespace. The prefix and the local name are joined by a colon (:).
  • anyURI: Uniform Resource Identifier (URI)
  • language: language code, e.g. en-US, fr
  • ID: identifier attribute within XML elements
  • IDREF: reference to an ID value

Simple XML data types may neither contain XML child elements nor have XML attributes.

In addition to the atomic data types, lists and unions (composed of atomic elements and lists) also belong to the simple types:

  • Lists: a list type is built from an existing simple type. For example, a new XML data type named monatInt can be defined together with a list of this new type. The content of an instance of the list type might look like this:

    1 2 3 4 5 6 7 8 9 10 11 12

    The individual items in a list are separated by whitespace (here, spaces).
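The monatInt type and a list built on it, as described above, could be declared roughly as follows. This is a minimal sketch: the restriction to the range 1 to 12 follows the description given later in this article, while the names monthList and months are assumptions introduced only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- monatInt: integers restricted to the range 1 to 12 -->
  <xs:simpleType name="monatInt">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="12"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- "monthList" is an assumed name for the list type -->
  <xs:simpleType name="monthList">
    <xs:list itemType="monatInt"/>
  </xs:simpleType>

  <!-- "months" is an assumed element name, declared so that an instance can be validated -->
  <xs:element name="months" type="monthList"/>
</xs:schema>
```

Under these assumptions, an instance such as `<months>1 2 3 4 5 6 7 8 9 10 11 12</months>` would be valid, whereas a list containing 13 would not.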

  • Unions also belong to the simple types.

A new type is defined as the union of existing types. Each instance then chooses its type from this set. For example, a further type monthName can be defined and combined with monatInt into a union type month.
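A union of this kind might be sketched as follows. The spelling monthName, the enumerated values shown, and the element declaration at the end are assumptions made for illustration; only the idea of combining month names with monatInt comes from the text.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="monatInt">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="12"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- monthName: enumeration of month names (spelling of the type name is assumed) -->
  <xs:simpleType name="monthName">
    <xs:restriction base="xs:string">
      <xs:enumeration value="January"/>
      <xs:enumeration value="February"/>
      <xs:enumeration value="March"/>
      <!-- and so on ... -->
    </xs:restriction>
  </xs:simpleType>

  <!-- month: each value must be valid against one of the member types -->
  <xs:simpleType name="month">
    <xs:union memberTypes="monthName monatInt"/>
  </xs:simpleType>

  <!-- assumed element declaration; types and elements have separate name spaces -->
  <xs:element name="month" type="month"/>
</xs:schema>
```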

XML elements of type month may contain either integer values in the range 1 to 12 or one of the corresponding month names as a string. Valid instances are, for example:

    <month>January</month>
    <month>2</month>

Complex Types

In addition to the simple types, complex XML data type definitions make it possible to define coherent element structures. Such structures can include further elements and attributes.
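As a sketch, a complex type describing a computer, here called pc-Type, could look like this. The child element names and their occurrence constraints follow the description of this type elsewhere in the article; the element data types (xs:string, xs:integer) and the type of the id attribute are assumptions.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="pc-Type">
    <xs:sequence>
      <!-- exactly once each (minOccurs and maxOccurs default to 1) -->
      <xs:element name="name"         type="xs:string"/>
      <xs:element name="manufacturer" type="xs:string"/>
      <xs:element name="processor"    type="xs:string"/>
      <!-- zero or one time -->
      <xs:element name="mhz"          type="xs:integer" minOccurs="0"/>
      <!-- any number of times, or not at all -->
      <xs:element name="comment"      type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- attribute type is an assumption -->
    <xs:attribute name="id" type="xs:integer"/>
  </xs:complexType>
</xs:schema>
```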

The possibilities for defining complex types are explained here only by way of example. The interested reader is referred to the pages of the W3C.

The child elements of a complex type can be combined in three different ways:

  • xs:sequence: A list of child elements is specified. Each of these elements can occur zero, one, or more times (attributes minOccurs and maxOccurs). If an occurs attribute is absent, the default value 1 is used in both cases. The elements within a sequence must occur in the specified order. In the example type pc-Type, the elements name, manufacturer and processor must occur exactly once, the mhz element can occur zero or one time, and comment elements can occur any number of times or not at all.
  • xs:choice: Exactly one element can be selected from a list of alternatives. For example, a new type computer can be defined whose child element is either a desktop element (of type pc-Type) or a laptop element.
  • xs:all: The xs:all tag defines a group of child elements, each of which may occur at most once (the minOccurs and maxOccurs attributes of the child elements may only take the values 0 or 1). The order of the elements is arbitrary.
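The xs:choice and xs:all compositors might be used as in the following sketch. The type computer with its desktop and laptop children follows the text; laptop-Type, the ports type, and the reduced pc-Type stand-in are hypothetical and serve only to keep the schema self-contained.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Minimal stand-ins for the referenced types; laptop-Type is a hypothetical name -->
  <xs:complexType name="pc-Type">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="laptop-Type">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- xs:choice: exactly one of the two alternatives appears -->
  <xs:complexType name="computer">
    <xs:choice>
      <xs:element name="desktop" type="pc-Type"/>
      <xs:element name="laptop"  type="laptop-Type"/>
    </xs:choice>
  </xs:complexType>

  <!-- xs:all: hypothetical type whose children may appear in any order, each at most once -->
  <xs:complexType name="ports">
    <xs:all>
      <xs:element name="usb"      type="xs:integer"/>
      <xs:element name="ethernet" type="xs:integer" minOccurs="0"/>
    </xs:all>
  </xs:complexType>
</xs:schema>
```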

Generic content

XML elements with arbitrary content can be defined by means of the base type anyType. A comment element declared with this type, for example, may have any content: both complex XML elements and text can occur.
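A minimal sketch of such a declaration, assuming the element is declared globally in a schema without a target namespace:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- a comment element whose content is not constrained -->
  <xs:element name="comment" type="xs:anyType"/>
</xs:schema>
```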

If text and tags are to be mixed within the content, the attribute mixed must be set to "true".
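Mixed content could be declared as in the following sketch. The element names note and emphasis are hypothetical; with mixed="true", text may appear between the declared child elements.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- "note" and "emphasis" are assumed names used only for illustration -->
  <xs:element name="note">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="emphasis" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
        <!-- Other elements ... -->
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under these assumptions, `<note>This is <emphasis>very</emphasis> important.</note>` would be a valid instance.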

Empty elements

An XML element is called empty when it consists of only a single XML tag and encloses no other XML elements or text (e.g. the XHTML line break <br />). XML Schema uses a small trick here: xs:complexType is used to define a new type without specifying any child element. Since xs:complexType by default permits only complex XML child elements as content, the element remains empty in this case.
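Applied to the line break element, the trick might look like this (a sketch; a real XHTML schema declares br with additional attributes):

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- a complex type without child elements and without mixed content: the element must be empty -->
  <xs:element name="br">
    <xs:complexType/>
  </xs:element>
</xs:schema>
```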

Derivation of new types

New data types can be created either by defining a completely new type (see the previous section) or by deriving a new type from an existing one.

Deriving a new type is not inheritance in the object-oriented sense, since no properties comparable to the methods or attributes of object-oriented classes are inherited. Rather, it is a matter of reusing existing type definitions. Accordingly, deriving new types does not provide implicit substitutability, as is common in other type systems (explicit type conversions are possible, however).

A new type can be derived in two ways: extension or restriction.

Extension of a type

When an existing type is extended (extension), further properties are added, that is, new elements or attributes. For example, the type pc-Type defined above can be extended by an element ram to form a new type myPC-Type.
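Such an extension might be written as follows. The sketch repeats a reduced pc-Type so that the schema is self-contained; the data type of ram is an assumption.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- reduced stand-in for the base type -->
  <xs:complexType name="pc-Type">
    <xs:sequence>
      <xs:element name="name"         type="xs:string"/>
      <xs:element name="manufacturer" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- myPC-Type: all children of pc-Type, followed by the new element ram -->
  <xs:complexType name="myPC-Type">
    <xs:complexContent>
      <xs:extension base="pc-Type">
        <xs:sequence>
          <!-- element type is an assumption -->
          <xs:element name="ram" type="xs:integer"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```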

The newly defined XML type myPC-Type consists of all child elements of the type pc-Type plus the element ram. The latter is appended to the existing child elements, as in an xs:sequence definition. Since there is no substitutability, an element of type myPC-Type cannot simply be used at a place where an element of type pc-Type is expected.

Restriction of a type

New definitions can also be derived by restricting existing types (restriction). To this end, all element definitions of the base type must be repeated, modified by the more restrictive constraints. For example, a new type myPC2-Type can be derived from pc-Type in which a comment element may occur at most once (as opposed to any number of times in pc-Type).
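A restriction of this kind could be sketched as follows. The element data types are assumptions; the only difference between base and derived type is the maximum occurrence of comment.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- base type, repeated here so that the schema is self-contained -->
  <xs:complexType name="pc-Type">
    <xs:sequence>
      <xs:element name="name"         type="xs:string"/>
      <xs:element name="manufacturer" type="xs:string"/>
      <xs:element name="processor"    type="xs:string"/>
      <xs:element name="mhz"          type="xs:integer" minOccurs="0"/>
      <xs:element name="comment"      type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- myPC2-Type: every element definition is repeated; comment may now occur at most once -->
  <xs:complexType name="myPC2-Type">
    <xs:complexContent>
      <xs:restriction base="pc-Type">
        <xs:sequence>
          <xs:element name="name"         type="xs:string"/>
          <xs:element name="manufacturer" type="xs:string"/>
          <xs:element name="processor"    type="xs:string"/>
          <xs:element name="mhz"          type="xs:integer" minOccurs="0"/>
          <xs:element name="comment"      type="xs:string" minOccurs="0"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```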

In addition to restricting complex types, it is also possible to define new types as restrictions of simple types. An example of such a definition appears in the section on simple types: the new type monatInt is defined as a restriction of the type integer to the range 1 to 12. The following facets are available for describing restrictions on simple types:

  • length, maxLength, minLength - limit the length of a string or a list
  • enumeration - restriction by specifying alternative values
  • pattern - restriction by specifying a regular expression
  • minExclusive, minInclusive, maxExclusive, maxInclusive - restriction of the range of values
  • totalDigits, fractionDigits - restriction of the number of digits (total number of digits and number of decimal places)
  • whiteSpace - treatment of spaces and tabs

The following examples illustrate the use of these facets:

  • Body temperature: 3 digits in total, 1 decimal place, minimum and maximum value
  • German postal codes: optional "D" followed by five digits
  • Size specification
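The three cases could be sketched with facets as follows. All type names, the temperature bounds, the exact pattern, and the enumerated sizes are assumptions chosen for illustration; only the kind of restriction is taken from the list above.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- body temperature: 3 digits in total, 1 decimal place; the bounds are assumed values -->
  <xs:simpleType name="bodyTemperature">
    <xs:restriction base="xs:decimal">
      <xs:totalDigits value="3"/>
      <xs:fractionDigits value="1"/>
      <xs:minInclusive value="35.0"/>
      <xs:maxInclusive value="42.5"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- German postal code: optional "D" followed by five digits (pattern is an assumption) -->
  <xs:simpleType name="postalCode">
    <xs:restriction base="xs:string">
      <xs:pattern value="D?[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- size specification: the enumerated values are assumed -->
  <xs:simpleType name="size">
    <xs:restriction base="xs:string">
      <xs:enumeration value="XS"/>
      <xs:enumeration value="S"/>
      <xs:enumeration value="M"/>
      <xs:enumeration value="L"/>
      <xs:enumeration value="XL"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```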

When defining a type, it is possible to specify whether and in what way further XML element types may be derived from it. For example, one can specify that further types may be derived from a type pc-Type only by adding further restrictions, and not by adding new child elements.
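In XML Schema this kind of control is expressed with the final attribute of a type definition. A minimal sketch, using a reduced stand-in for pc-Type:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- final="extension": deriving by extension is prohibited, deriving by restriction stays allowed -->
  <xs:complexType name="pc-Type" final="extension">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```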

Element definition

As explained in the previous section, XML Schema makes it possible to define new XML data types and to use them when defining one's own XML elements. For example, the already defined type pc-Type can be used within a list of pc elements.
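An element declaration for such a list might be sketched as follows. The element name pc-list is an assumption, and pc-Type is reduced to a stand-in; the anonymous complex type inside the element declaration is the point of the example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- reduced stand-in for the named, externally specified type -->
  <xs:complexType name="pc-Type">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- "pc-list" is an assumed element name; its type is anonymous and defined in place -->
  <xs:element name="pc-list">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="pc" type="pc-Type" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```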

The pc elements of a corresponding XML element might look like this:

    <pc>
      <name>Dimension 3100</name>
      <manufacturer>Dell</manufacturer>
      <processor>AMD</processor>
      <mhz>3060</mhz>
      <comment>workstation</comment>
    </pc>
    <pc>
      <name>T 42</name>
      <manufacturer>IBM</manufacturer>
      <processor>Intel</processor>
      <mhz>1600</mhz>
      <comment>laptop</comment>
    </pc>

In this example, the anonymous list type is specified directly within the element definition, while pc-Type is specified externally.

When designing a complex XML schema, the reusability and extensibility of the individual XML element types as well as the readability of the schema itself should be taken into account. Using anonymous XML element types as part of larger elements generally makes smaller XML schemas more readable. Defining and naming individual, small, reusable XML element types, on the other hand, allows a more modular schema structure. Because of the large number of possible application scenarios, no generally accepted design principles for XML schemas have emerged so far (comparable to the normal forms for relational databases).

Advanced Concepts and Features

Unique keys

Similar to primary keys in relational databases, unique keys can be defined with XML Schema. XML Schema distinguishes between uniqueness (unique) and the key property (key).
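Both kinds of constraint are declared inside an element declaration, after its type. The following sketch assumes a list element named pc-list and a reduced pc-Type; the constraint name manufacturerName is hypothetical, while idKey is the key name used in this article.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="pc-Type">
    <xs:sequence>
      <xs:element name="name"         type="xs:string"/>
      <xs:element name="manufacturer" type="xs:string"/>
    </xs:sequence>
    <!-- required here so that every pc element has a key value -->
    <xs:attribute name="id" type="xs:integer" use="required"/>
  </xs:complexType>

  <!-- "pc-list" is an assumed element name -->
  <xs:element name="pc-list">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="pc" type="pc-Type" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <!-- uniqueness: the combination of name and manufacturer must not repeat -->
    <xs:unique name="manufacturerName">
      <xs:selector xpath="pc"/>
      <xs:field xpath="name"/>
      <xs:field xpath="manufacturer"/>
    </xs:unique>
    <!-- key: the id attribute must be present and unique, and can be referenced -->
    <xs:key name="idKey">
      <xs:selector xpath="pc"/>
      <xs:field xpath="@id"/>
    </xs:key>
  </xs:element>
</xs:schema>
```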

The unique and key elements each use an XPath path expression (here: pc) to select a set of pc elements. The respective uniqueness or key constraint must be satisfied for this set. A unique constraint can, for example, specify that the combination of the elements name and manufacturer must be unique for every pc element within the list. A key element, named idKey in this article, specifies that the id attribute must be unique within the list and can be referenced from outside.

This key is referenced with the refer attribute of a keyref element; in this article's example, the referencing values are held in an attribute that is selected with @references.
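A key reference could be sketched as follows. The names inventory, computerRef and idRef are hypothetical; idKey and the references attribute come from the text. Key and key reference are declared on the same enclosing element here so that the sketch is self-contained.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="pc-Type">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:integer" use="required"/>
  </xs:complexType>

  <!-- "inventory" and "computerRef" are assumed names -->
  <xs:element name="inventory">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="pc" type="pc-Type" maxOccurs="unbounded"/>
        <xs:element name="computerRef" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="references" type="xs:integer" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <xs:key name="idKey">
      <xs:selector xpath="pc"/>
      <xs:field xpath="@id"/>
    </xs:key>
    <!-- refer names the key constraint (idKey), not the key field -->
    <xs:keyref name="idRef" refer="idKey">
      <xs:selector xpath="computerRef"/>
      <xs:field xpath="@references"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```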

Note: refer refers to the name attribute of a key constraint (idKey from the example above), not to the key field. The values in references must therefore always be found among the keys of the computers. (The purpose of this construct is to ensure referential integrity, as known from relational database systems.)

Import, Include and Redefine

XML Schema allows foreign schemas to be reused. For this purpose, both the include tag and the import tag are available, as well as the option of redefining or adapting foreign schemas when they are incorporated.

Type definitions within one namespace that are spread across several files can be combined using include.

    <schema xmlns="http://www.w3.org/2001/XMLSchema"
            xmlns:pcTeile="http://www.example.com/pcTeile"
            targetNamespace="http://www.example.com/pcTeile">
      ...
      <include schemaLocation="harddisk.xsd"/>
      ...
    </schema>

  • Several schemas can be included.
  • The target namespace of harddisk.xsd must match that of the including schema.

The same example as before, now assuming that the schema harddisk.xsd contains a complex type manufacturer.

    <schema xmlns="http://www.w3.org/2001/XMLSchema"
            xmlns:pcTeile="http://www.example.com/pcTeile"
            targetNamespace="http://www.example.com/pcTeile">
      ...
      <redefine schemaLocation="harddisk.xsd">
        <!-- Redefinition of manufacturer -->
        <complexType name="manufacturer">
          <complexContent>
            <!-- Redefinition of manufacturer with restriction or extension etc. -->
            ...
          </complexContent>
        </complexType>
      </redefine>
      ...
    </schema>

  • redefine can be used instead of include.
  • The name of the type does not change.

The import tag makes it possible to import elements from other namespaces, to assign them a prefix, and thus to reuse schema components from different namespaces. Assume that a type superType is defined in pcTeile.

    <schema xmlns="http://www.w3.org/2001/XMLSchema"
            xmlns:pcTeile="http://www.example.com/pcTeile"
            targetNamespace="http://www.example.com/firma">
      ...
      <import namespace="http://www.example.com/pcTeile"/>
      ...
      <... type="pcTeile:superType" .../>
      ...
    </schema>

Use of XML schemas

To apply an XML schema to an XML file, the attribute schemaLocation of the schema instance namespace can be used to announce the address of the schema. This enables an application, such as an XML parser, to load the schema if it does not already know it. Alternatively, the schema can be made known to the application by other means, for example via configuration files. The latter option is not standardized, however, and therefore differs from application to application.

The following example states that the default namespace is http://www.w3.org/1999/xhtml and then specifies that the XML schema for this namespace can be found at http://www.w3.org/1999/xhtml.xsd.

    <html xmlns="http://www.w3.org/1999/xhtml"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.w3.org/1999/xhtml
                              http://www.w3.org/1999/xhtml.xsd">

The declaration applies to the XML element on which the attributes are specified and to all of its child elements.

If an XML schema is to be assigned to elements that do not belong to any namespace, the attribute noNamespaceSchemaLocation is used.
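A sketch of such an instance document follows. The root element pc-list, its content, and the schema address http://www.example.com/pc-list.xsd are hypothetical; only the attribute and the schema instance namespace are fixed by the specification.

```xml
<!-- root element without a namespace; element name and schema URL are assumptions -->
<pc-list xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="http://www.example.com/pc-list.xsd">
  <pc>
    <name>T 42</name>
  </pc>
</pc-list>
```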

An XML structure that conforms to the schema is this: