Document type definition

A document type definition (DTD; also called a schema definition or, loosely, DOCTYPE) is a set of rules used to declare documents of a certain type. A document type is a class of similar documents, such as telephone books or inventory records. A document type definition consists of declarations of element types, attributes of elements, entities, and notations. Specifically, a DTD defines the order and nesting of the elements and the kind of content that attributes may hold: in short, the structure of the document.

The term is also used for the concrete DTD syntax, because DTDs are described within the SGML and XML specifications. Various other schema languages exist for expressing document type definitions for XML documents; the best known are XML Schema (XSD) and RELAX NG.

A DTD specifies the syntax of an application of SGML or XML, such as the languages HTML or XHTML that are derived from them. This syntax is normally less general than the SGML or XML syntax itself.

DTD in XML

The syntax and semantics of DTDs are part of the XML specification. This decision was later criticized because DTD syntax is not itself XML. With the Document Schema Definition Languages (DSDL), a separate specification exists for defining document structures, data types, and data relationships in structured information sources.

Document type declaration

The DTD is specified in the document type declaration at the beginning of an XML document, before the root element. The grammar rules of the DTD can be given within the XML document itself (internal DTD), in an external file, or both. There are three possibilities: a reference by system identifier, a reference by public identifier together with a system identifier, or an internal DTD alone. In the first two cases the square brackets can be omitted if they are empty.
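The three forms can be sketched as follows. The root element name root, the file name file.dtd, and the public identifier are placeholders and do not refer to any real document type. Each line is an alternative, since a document carries at most one document type declaration.

```xml
<!-- 1. External DTD referenced by a system identifier (a URI) -->
<!DOCTYPE root SYSTEM "file.dtd" [ <!-- optional internal declarations --> ]>

<!-- 2. External DTD referenced by a public identifier plus a system identifier -->
<!DOCTYPE root PUBLIC "-//Example//DTD Root 1.0//EN" "file.dtd" [ <!-- optional internal declarations --> ]>

<!-- 3. Internal DTD only -->
<!DOCTYPE root [ <!-- markup declarations --> ]>
```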

Any URI can be specified as the reference to the file. For standardized DTDs there are well-known public identifiers (for example, "-//W3C//DTD XHTML 1.0 Strict//EN" for XHTML), so that programs that recognize the public identifier do not need to reload the file every time.

Within a DTD file or within the square brackets, various markup declarations can occur, which together define the document type.

Markup declarations

Within a DTD, the document structure can be defined with declarations of element types, attribute lists, entities, notations, and text blocks. Special parameter entities can be used for this purpose; they contain parts of the DTD and are permitted only within the DTD.

Text blocks are either unparsed text (CDATA) or PCDATA.

The structural elements (building blocks) are defined using attribute mappings:

  • Element
  • Attribute
  • Entity
  • CDATA
  • PCDATA

PCDATA

PCDATA is the keyword for a block of text that, unlike CDATA, can contain further instructions to the parser. The content of this text block is therefore also analyzed by the parser. PCDATA is used for mixed content, that is, content that may contain other elements as well as blocks of text.

The term PCDATA stands for "parsed character data", which is somewhat misleading. It dates back to SGML.

Element type declarations

An element type declaration defines an element and its possible content. In a valid XML document, only elements that are declared in the DTD may occur.

The content of an element is specified by naming other elements and by using certain keywords and characters:

  • EMPTY for no content
  • ANY for any content
  • , for sequences
  • | for alternatives (in the sense of "either ... or")
  • ( ) for grouping
  • * for any number of times
  • + for at least once
  • ? for zero or exactly once
  • If no asterisk, plus sign, or question mark is specified, the element must appear exactly once
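A minimal sketch of how these keywords and operators combine, using a hypothetical address-book vocabulary (none of the element names come from a real DTD):

```xml
<!-- An addressbook holds any number of person elements -->
<!ELEMENT addressbook (person*)>
<!-- A name first, then an optional nickname, then at least one phone or email -->
<!ELEMENT person (name, nickname?, (phone | email)+)>
<!ELEMENT name     (#PCDATA)>
<!ELEMENT nickname (#PCDATA)>
<!ELEMENT phone    (#PCDATA)>
<!-- email has no content -->
<!ELEMENT email    EMPTY>
<!-- note may contain text and any declared elements -->
<!ELEMENT note     ANY>
```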

<!ELEMENT div (#PCDATA | p | ul | ol | dl | table | pre | hr |
               h1 | h2 | h3 | h4 | h5 | h6 | blockquote | address | fieldset)*>
<!ELEMENT dl (dt | dd)+>

Attribute-list declarations

The list of possible attributes of an element is specified in a DTD with <!ATTLIST element-name attribute-list>. The attribute list contains the name, type, and default specification of each attribute, separated by spaces or line breaks.

Examples of attribute types:

  • ID
  • IDREF and IDREFS
  • NMTOKEN and NMTOKENS
  • ENTITY and ENTITIES
  • Enumerations and NOTATION lists

The attribute defaults specify whether an attribute must be present (#REQUIRED), may be omitted (#IMPLIED), or has a fixed value (#FIXED), and which value is used as the default if the attribute is not specified in a tag.

Example
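A minimal sketch of an attribute-list declaration, assuming a hypothetical img element; the attribute names and default values are chosen only for illustration:

```xml
<!-- Attributes of a hypothetical img element -->
<!ATTLIST img
  id     ID              #IMPLIED
  src    CDATA           #REQUIRED
  align  (left | right)  "left"
  type   NMTOKEN         #FIXED "image"
>
```

Here id may be omitted, src must be present, align falls back to "left" when it is not given, and type can only ever have the value "image".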

Entity declarations

An entity is a named abbreviation for a string or an external document, which can be used within the DTD or within an XML document that uses this DTD. An entity reference of the form &name; is replaced by the content of the entity. (For general use, see entities in markup languages.)

Internal entities consist of strings. These can themselves contain entity references and well-formed XML markup.
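A short sketch of internal entity declarations; the entity names and values are invented for illustration:

```xml
<!-- A plain string -->
<!ENTITY name "Ada Lovelace">
<!-- A value that references another entity -->
<!ENTITY author "&name;, mathematician">
<!-- A value containing well-formed markup (the a element is assumed to be declared elsewhere) -->
<!ENTITY wplink "<a href='https://www.wikipedia.org/'>Wikipedia</a>">
```

In a document using these declarations, the reference &author; would be replaced by the text "Ada Lovelace, mathematician".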

External entities refer to a file, for example by means of a public identifier:

<!ENTITY important PUBLIC "-//private//IMPORTANT//" "wichtig.xml">

For external entities it can additionally be stated that the entity is an unparsed entity, whose content consists of arbitrary data that is not parsed and is not substituted for an entity reference. In this case a notation must be specified (for example, "gif").
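A minimal sketch of an unparsed entity, assuming a hypothetical image file and a notation named gif; the system identifiers are placeholders:

```xml
<!-- The notation names the data format; the identifier "image/gif" is an assumption -->
<!NOTATION gif SYSTEM "image/gif">
<!-- NDATA marks the entity as unparsed and ties it to the gif notation -->
<!ENTITY logo SYSTEM "images/logo.gif" NDATA gif>
```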

Notation declarations

Notations are hints for interpreting external data that is not processed directly by the XML parser. A notation may refer, for example, to a file format for images.
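As an illustrative sketch, notations can be declared with a public or a system identifier and then used as the permitted values of a NOTATION attribute. The notation names, identifiers, and the picture element are assumptions made for this example:

```xml
<!-- Two notation declarations, one per identifier style -->
<!NOTATION png  PUBLIC "-//Example//NOTATION PNG//EN">
<!NOTATION jpeg SYSTEM "image/jpeg">
<!-- A NOTATION-typed attribute restricts its value to declared notation names -->
<!ATTLIST picture format NOTATION (png | jpeg) #REQUIRED>
```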

NMTOKEN declarations

An NMTOKEN (name token) is related to an XML name but is subject to more permissive naming rules. For example, an NMTOKEN may begin with a digit or a period, whereas an XML name may begin only with a letter, an ideograph, or an underscore. Every XML name is therefore also an NMTOKEN, but not vice versa. Examples of NMTOKENs:

12alpha
.crc

Declaration example:
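A minimal sketch of such a declaration, assuming a hypothetical birthdate element with a year attribute:

```xml
<!-- year accepts name tokens such as 1984, which would not be valid XML names -->
<!ATTLIST birthdate year NMTOKEN #REQUIRED>
```

With this declaration, an element such as birthdate with year="1984" would be valid, although 1984 begins with a digit.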

Parameter entities

Parameter entities contain a named string that can be inserted with %name; at almost any location within a DTD. In this way, for example, external files can be embedded in a DTD and recurring groups of elements can be abbreviated. Parameter entities are declared like general entities, except that a single percent sign precedes the entity name. Example:
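A sketch of both uses, written as a fragment of an external DTD file. The file name other-declarations.ent and the entity inline are assumptions made for illustration:

```xml
<!-- Declare a parameter entity that points to an external file, then pull it in -->
<!ENTITY % file SYSTEM "other-declarations.ent">
%file;

<!-- Declare a parameter entity as shorthand for a recurring group of elements -->
<!ENTITY % inline "em | strong | code">
<!-- A reference inside a declaration is only allowed in the external DTD subset -->
<!ELEMENT p (#PCDATA | %inline;)*>
```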

%file;

Conditional sections

A conditional section is a construct for switching declarations on or off. Example:

<![INCLUDE[ <!ENTITY hi "…"> ]]>

switches on the declaration of hi. Accordingly,

<![IGNORE[ <!ENTITY hi "…"> ]]>

switches hi off.
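Written out in full, the two cases might look like the following sketch. The entity value "world" is an assumption, and conditional sections are permitted only in the external DTD subset, so this fragment would belong in a separate DTD file:

```xml
<!-- INCLUDE: the declaration inside is read, so &hi; is defined -->
<![INCLUDE[
  <!ENTITY hi "world">
]]>

<!-- IGNORE: the contents are skipped, so no entity is declared here -->
<![IGNORE[
  <!ENTITY hi "world">
]]>
```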

Conditional sections are usually not used on their own as above, however, but in conjunction with parameter entities:

<![%soft;[ <!ENTITY hi "…"> ]]>

The parameter entity %soft; is set to one of the two possible keywords, INCLUDE or IGNORE. Depending on the setting, the entity hi is declared or not.

With this notation, a conditional section can be adjusted by overriding the parameter entity.
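The override can be sketched as follows, assuming a hypothetical external DTD file hi.dtd and a root element greeting. The internal subset is read before the external one, and the first declaration of an entity is binding, so the document's own setting wins:

```xml
<!-- Contents of hi.dtd (external subset): the switch defaults to INCLUDE -->
<!ELEMENT greeting (#PCDATA)>
<!ENTITY % soft "INCLUDE">
<![%soft;[
  <!ENTITY hi "world">
]]>

<!-- The document: its internal subset sets the switch to IGNORE first -->
<!DOCTYPE greeting SYSTEM "hi.dtd" [
  <!ENTITY % soft "IGNORE">
]>
<greeting>Hello</greeting>
```

With this document the entity hi is not declared; removing the internal declaration of soft would switch it on again.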

Example

Short XML document with reference to an external DTD:
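A minimal sketch of such a document, assuming the DTD is stored in a hypothetical file hi.dtd next to it:

```xml
<!-- The document -->
<!DOCTYPE hi SYSTEM "hi.dtd">
<hi>Hello world!</hi>

<!-- Contents of hi.dtd -->
<!ELEMENT hi (#PCDATA)>
```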

Short XML document with internal DTD:

<!DOCTYPE hi [
  <!ELEMENT hi (#PCDATA)>
]>
<hi>Hello world!</hi>

Web Links

  • HTML document types in SELFHTML (German)
  • Web document types, W3C (English)