XML Schema (W3C)
XML Schema, abbreviated XSD (XML Schema Definition), is a W3C Recommendation for defining structures for XML documents. Unlike the traditional XML DTDs, the structure is itself described in the form of an XML document. In addition, a large number of data types are supported.
XML Schema is a complex schema language for describing data types, individual XML Schema instances (documents) and groups of such instances. A concrete XML schema is also called an XSD (XML Schema Definition) and, as a file, usually has the extension ".xsd". In contrast to DTDs, XML Schema can distinguish between the name of an XML type and the name of the XML tag used in the instance.
Besides XML Schema, there are other approaches to defining XML structures, with different aims, such as DTD, RELAX NG or Schematron.
Data types
XML Schema distinguishes between simple (atomic) data types and complex data types. In the text below, the term type refers to the abstract specification of the structure of a section within an XML document. Data types in XML Schema are classified into predefined (built-in) and user-defined data types.
The W3C specification for XML Schema defines 19 built-in primitive data types (such as boolean, string, float, date and NOTATION) and a further 25 data types derived from them (such as ID and integer).
Simple Types
XML Schema provides a number of basic atomic data types. These include the "classical" types that are also specified, to some extent, in other type systems (e.g. C, Java or SQL):
- xs:string
- xs:decimal
- xs:integer
- xs:float
- xs:boolean
- xs:date
- xs:time
There are also other XML-specific atomic types, including:
- QName: qualified name, a globally unique identifier. It is built from NCNames (non-colonized names): an optional prefix, which is bound to a namespace, and a local name within that namespace. The prefix and the local name are joined by a colon (:) to form the QName.
- anyURI: Uniform Resource Identifier (URI)
- language: language name, e.g. en-US, fr
- ID: identifier attribute within XML elements
- IDREF: reference to an ID value
Simple XML data types may neither contain XML child elements nor have XML attributes.
Besides the atomic data types, the simple types also include lists and unions (composed of atomic elements and lists):
- Lists: a list type holds a sequence of values of a simple item type. For example, a new XML data type named monatInt can be defined together with a list of this new type.
- Unions: a new type is defined as the union of existing types. Each instance then chooses its type from this set. For example, a further type for month names can be defined and combined with monatInt into a union type for months.
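The list and union types described above can be sketched as follows. Only the name monatInt comes from the text; the names monatIntList, monthName and month, the value range 1 to 12 and the abbreviated month names are assumptions made for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- monatInt: a month number, assumed here to be restricted to 1..12 -->
  <xs:simpleType name="monatInt">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="12"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- List type: whitespace-separated monatInt values, e.g. "1 4 12" -->
  <xs:simpleType name="monatIntList">
    <xs:list itemType="monatInt"/>
  </xs:simpleType>

  <!-- Month names as an enumeration (hypothetical values, shortened for brevity) -->
  <xs:simpleType name="monthName">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Jan"/>
      <xs:enumeration value="Feb"/>
      <xs:enumeration value="Mar"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union type: an instance value is either a monthName or a monatInt -->
  <xs:simpleType name="month">
    <xs:union memberTypes="monthName monatInt"/>
  </xs:simpleType>

</xs:schema>
```

With these definitions, both "Feb" and "2" would be valid values of the union type month.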
Complex Types
In addition to the simple types, complex XML data type definitions make it possible to define coherent element structures. Such structures can contain further elements and attributes.
The child elements of a complex type can be combined in three different ways:
- xs:sequence: A list of child elements is specified. Each of these elements can occur zero, one or several times (attributes minOccurs and maxOccurs). If neither occurrence attribute is present, the default value 1 is used in both cases. The elements within a sequence must occur in the specified order. In a type pc-type, for example, the elements name, manufacturer and processor must each occur exactly once, the mhz element may occur zero times or once, and comment elements may occur any number of times or not at all.
- xs:choice: One element can be selected from a list of alternatives. For example, a new type computer can have as its child element either a desktop element (of type pc-type) or a laptop element.
- xs:all: The listed child elements may occur in any order.
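As a minimal sketch of the sequence and choice cases described above, the following schema defines pc-type and computer. The element and type names follow the text; the built-in types chosen for the child elements and the type of the laptop element are assumptions.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- xs:sequence: children must appear in exactly this order -->
  <xs:complexType name="pc-type">
    <xs:sequence>
      <!-- no minOccurs/maxOccurs given: both default to 1 -->
      <xs:element name="name" type="xs:string"/>
      <xs:element name="manufacturer" type="xs:string"/>
      <xs:element name="processor" type="xs:string"/>
      <!-- optional: zero times or once -->
      <xs:element name="mhz" type="xs:integer" minOccurs="0"/>
      <!-- any number of times, including not at all -->
      <xs:element name="comment" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- xs:choice: either a desktop or a laptop child element -->
  <xs:complexType name="computer">
    <xs:choice>
      <xs:element name="desktop" type="pc-type"/>
      <!-- assumption: the text does not name the laptop type, so pc-type is reused -->
      <xs:element name="laptop" type="pc-type"/>
    </xs:choice>
  </xs:complexType>

</xs:schema>
```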
Generic content
XML elements with arbitrary content can be defined by means of the base type anyType. For example, a comment element declared with this type may have any content, that is, both complex XML elements and text can occur in it.
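A declaration of such a comment element might look like the following sketch (the xs prefix is assumed to be bound to the XML Schema namespace, as shown):

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- comment may contain text, arbitrary child elements, or a mix of both -->
  <xs:element name="comment" type="xs:anyType"/>
</xs:schema>
```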
An XML element is called empty when it consists of only a single XML tag and encloses no other XML elements or text (e.g. the XHTML line break <br />).
XML Schema uses a small trick at this point: xs:complexType is used to define a new type without specifying any child element.
Since xs:complexType by default permits only XML child elements as content, and no text, the element remains empty in this case.
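A sketch of this technique, using the br element from XHTML as the illustration (the declaration is an assumption for demonstration, not taken from the actual XHTML schema):

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- A complex type without child elements and without mixed content:
       the element can only be written as <br/> or <br></br> -->
  <xs:element name="br">
    <xs:complexType/>
  </xs:element>
</xs:schema>
```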
Derivation of new types
New data types can be created either by defining a new type from scratch (see previous section) or by deriving a new type from an existing one.
Deriving a new type is not inheritance in the object-oriented sense, since nothing comparable to the methods or attributes of object-oriented classes is inherited. Rather, it is a way of reusing existing type definitions. Accordingly, deriving a new type does not provide implicit substitutability, as is common in other type systems (explicit type casts are possible, however).
A new type can be derived in two ways: extension or restriction.
Extension of a type
Extending an existing type (extension) adds further properties to it, that is, new elements or attributes. For example, the type pc-type described above can be extended by an element ram.
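Such an extension might be sketched as follows. The fragment is meant to be placed inside an xs:schema element that also defines pc-type; the name myPC-type and the type xs:integer for ram are assumptions.

```xml
<!-- Derived type: all children of pc-type, followed by an additional ram element -->
<xs:complexType name="myPC-type">
  <xs:complexContent>
    <xs:extension base="pc-type">
      <xs:sequence>
        <xs:element name="ram" type="xs:integer"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
```

The elements added by the extension are appended after the content of the base type, so ram follows the last comment element in an instance.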
Restriction of a type
New definitions can also be derived by restricting existing types (restriction). For this purpose, all element definitions of the base type must be repeated, modified by the respective restrictions. For example, a new type myPC2-type can be derived from pc-type in which a comment element may occur at most once (as opposed to any number of times in pc-type).
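A sketch of this restriction, assuming that pc-type is defined in the same schema as a sequence of name, manufacturer, processor, an optional mhz and any number of comment elements (the child element types are assumptions):

```xml
<!-- All element declarations of pc-type are repeated;
     only the occurrence of comment is narrowed to at most once -->
<xs:complexType name="myPC2-type">
  <xs:complexContent>
    <xs:restriction base="pc-type">
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="manufacturer" type="xs:string"/>
        <xs:element name="processor" type="xs:string"/>
        <xs:element name="mhz" type="xs:integer" minOccurs="0"/>
        <xs:element name="comment" type="xs:string"
                    minOccurs="0" maxOccurs="1"/>
      </xs:sequence>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
```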
Simple types can be restricted by means of the following components (facets):
- length, maxLength, minLength - limits the length of a string or a list
- enumeration - restriction by specifying alternative values
- pattern - restriction by specifying a regular expression
- minExclusive, minInclusive, maxExclusive, maxInclusive - restriction of the range of values
- totalDigits, fractionDigits - restriction of the number of digits (total number of digits and number of fractional digits)
- whiteSpace - treatment of spaces and tabs
The following case illustrates the use of these components:
- Body temperature: 3 digits in total, 1 fractional digit, minimum and maximum value
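The body temperature case could be written as in the sketch below. The type name bodyTemperature and the bounds 35.0 and 42.5 are assumptions chosen for illustration; only the digit counts come from the text.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="bodyTemperature">
    <xs:restriction base="xs:decimal">
      <!-- at most 3 digits in total, at most 1 of them after the decimal point -->
      <xs:totalDigits value="3"/>
      <xs:fractionDigits value="1"/>
      <!-- hypothetical lower and upper bounds -->
      <xs:minInclusive value="35.0"/>
      <xs:maxInclusive value="42.5"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

Under these assumptions, 36.6 would be accepted, while 36.65 (two fractional digits) and 45.0 (above the maximum) would be rejected.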
Element definition
As explained in the previous section, XML Schema makes it possible to define new XML data types and to use them when defining one's own XML elements. For example, the already defined type pc-type can be used within a list of pc elements.
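A sketch of such an element definition, to be placed inside an xs:schema element that also defines pc-type; the element name pc-list is an assumption.

```xml
<!-- Global element with an anonymous complex type:
     a list of one or more pc elements, each of type pc-type -->
<xs:element name="pc-list">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="pc" type="pc-type" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```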
When designing a complex XML schema, both the reusability and extensibility of the individual XML element types and the readability of the schema itself should be taken into account. Using anonymous XML element types as part of larger elements generally makes smaller XML schemas easier to read. Defining and naming individual, small, reusable XML element types, by contrast, allows a more modular schema structure. Because of the large number of possible application scenarios, no generally accepted design principles for XML schemas have emerged so far (comparable to the normal forms for relational databases).
Advanced Concepts and Features
Unique keys
Similar to primary keys in relational databases, unique keys can be defined with XML Schema. XML Schema distinguishes between uniqueness (unique) and the key property (key).
Such a key is referenced by means of the refer attribute of an xs:keyref element; the referencing value is taken here from an attribute selected with the expression @references.
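The interplay of key and keyref might be sketched as follows. Only the refer attribute and the @references expression come from the text; the element names pc-list and pc, the id attribute and the constraint names pcKey and pcRef are assumptions.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="pc-list">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="pc" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="id" type="xs:string" use="required"/>
            <xs:attribute name="references" type="xs:string"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- key: every pc must have an id, and all ids must be distinct -->
    <xs:key name="pcKey">
      <xs:selector xpath="pc"/>
      <xs:field xpath="@id"/>
    </xs:key>

    <!-- keyref: each references value, if present, must match an existing id -->
    <xs:keyref name="pcRef" refer="pcKey">
      <xs:selector xpath="pc"/>
      <xs:field xpath="@references"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

Replacing xs:key with xs:unique would still require distinct values, but would no longer require every pc element to carry the id attribute.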
Import, Include and Redefine
XML Schema allows the reuse of external schemas. For this purpose, both the include tag and the import tag are available, as well as the possibility of redefining and adapting external schemas when incorporating them.
Type definitions within one namespace that are spread across multiple files can be put together with include.
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:pcTeile="http://www.example.com/pcTeile"
        targetNamespace="http://www.example.com/pcTeile">
...
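As a hedged sketch of a complete schema document using include: the namespace declarations are those of the fragment above, the file name harddisk.xsd is mentioned later in the text, and the file name ram.xsd is a hypothetical addition.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:pcTeile="http://www.example.com/pcTeile"
        targetNamespace="http://www.example.com/pcTeile">
  <!-- Included files must use the same target namespace (or none at all) -->
  <include schemaLocation="harddisk.xsd"/>
  <include schemaLocation="ram.xsd"/>
</schema>
```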
Redefine works on the same example as above. Assume that the schema harddisk.xsd contains a complex type manufacturer.
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:pcTeile="http://www.example.com/pcTeile"
        targetNamespace="http://www.example.com/pcTeile">
...
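A redefinition of the manufacturer type might look like the following sketch. The added country element is a hypothetical adaptation chosen for illustration.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:pcTeile="http://www.example.com/pcTeile"
        targetNamespace="http://www.example.com/pcTeile">
  <redefine schemaLocation="harddisk.xsd">
    <!-- The redefined type keeps its name and must be derived from itself -->
    <complexType name="manufacturer">
      <complexContent>
        <extension base="pcTeile:manufacturer">
          <sequence>
            <!-- hypothetical additional element -->
            <element name="country" type="string"/>
          </sequence>
        </extension>
      </complexContent>
    </complexType>
  </redefine>
</schema>
```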
The import tag makes it possible to import elements from other namespaces, bind them to a prefix and thus reuse schema components from different namespaces. Assume that a type superType is defined in pcTeile.
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:pcTeile="http://www.example.com/pcTeile"
        targetNamespace="http://www.example.com/firma">
...
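A complete import might be sketched as follows. The file name pcTeile.xsd, the element name supercomputer and the spelling superType of the imported type are assumptions.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:pcTeile="http://www.example.com/pcTeile"
        targetNamespace="http://www.example.com/firma">
  <!-- Makes the components of the pcTeile namespace available in this schema -->
  <import namespace="http://www.example.com/pcTeile"
          schemaLocation="pcTeile.xsd"/>
  <!-- The imported type is addressed through the pcTeile prefix -->
  <element name="supercomputer" type="pcTeile:superType"/>
</schema>
```

Unlike include, import is used when the reused components belong to a different namespace than the target namespace of the importing schema.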
To apply an XML schema to an XML file, the schemaLocation attribute of the schema-instance namespace can be used to make the address of the schema known. This enables an application, such as an XML parser, to load the schema if it does not already know it. Alternatively, the schema can be made known to the application by other means, for example via configuration files. The latter option is not standardized, however, and therefore differs from application to application.
The following example states that the default namespace is http://www.w3.org/1999/xhtml and then specifies that the XML schema for this namespace can be found at http://www.w3.org/1999/xhtml.xsd.
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/1999/xhtml http://www.w3.org/1999/xhtml.xsd">
The declaration applies to the XML element on which the attributes are specified and to all of its child elements.
If an XML schema is to be assigned to elements that do not belong to any namespace, this is done using the attribute noNamespaceSchemaLocation.
An XML structure that conforms to the schema is this:
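As a minimal sketch of such an instance, assume a schema without a target namespace, stored in a hypothetical file pc-list.xsd, that declares a pc-list element with pc children of type pc-type. A document could then reference it via noNamespaceSchemaLocation; all element values are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<pc-list xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="pc-list.xsd">
  <pc>
    <name>Example PC</name>
    <manufacturer>Example Corp</manufacturer>
    <processor>Example CPU</processor>
    <!-- mhz is optional; comment may be repeated or left out -->
    <mhz>3200</mhz>
    <comment>first comment</comment>
    <comment>second comment</comment>
  </pc>
</pc-list>
```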
See also
- List of XML namespaces
- RELAX NG
- WSDL
- Schematron
- Document Structure Description (DSD)
- RailML