HTML5 is a text-based markup language for structuring and semantically marking up content such as text, images, and hyperlinks in documents. The language is still in development, but two development groups have already produced fairly mature drafts. It is intended to succeed HTML4. It replaces the document description standards HTML 4.01, XHTML 1.0 and DOM HTML Level 2 and offers many new features, among them video, audio, local storage, and dynamic 2D and 3D graphics. HTML4 does not support these directly, and without HTML5 they can only be implemented with additional plugins such as Adobe Flash.


Following the release of the HTML 4.0 specification in December 1997, the development of HTML lay fallow for a long time. Apart from version 4.01 in December 1999, which contained only bug fixes, there were no updates to the markup language until April 2009. The World Wide Web Consortium (W3C) bet on XML, which was to become the successor to HTML, and reformulated HTML 4.01 as the XML-based markup language XHTML 1.0. The feature set of HTML 4.01 was carried over unchanged. The W3C then began developing XHTML 1.1 and later XHTML 2.0, which had little in common with HTML 4.01 and aimed to do many things better than HTML. As a result, XHTML 1.1 and XHTML 2.0 were no longer backwards compatible. In addition, creating XHTML 2.0 documents was in many respects considerably harder than creating HTML and required a lot of background knowledge. The development of CSS was also progressing only slowly at the time, so the W3C drew more and more criticism.

To counter these developments, the Web Hypertext Application Technology Working Group (WHATWG), founded by several browser vendors, published the first proposal for HTML5 in mid-2004 under the name Web Applications 1.0.

On 27 October 2006, Tim Berners-Lee, the founder and director of the World Wide Web Consortium, announced a new working group with the goal of developing HTML further. As the basis for its work on HTML5, the W3C used a fork of the WHATWG's version. The W3C thereby created in-house competition, since it was also pushing ahead with XHTML 2.0, a purely XML-based format for marking up websites.

To defuse this competition within the W3C, the existing working groups were reorganized between November 2006 and March 2007. HTML5 and XHTML 2.0 were defined as related languages with different target audiences.

In May 2007, the members of the HTML Working Group voted to use the WHATWG's Web Applications 1.0 draft as the starting point for discussing and further developing HTML. Since then, the W3C and the WHATWG have worked on the HTML5 specification together.

In mid-2009, the W3C announced that development of XHTML 2.0 would be discontinued at the end of that year. The next generation of markup languages for the Web is therefore not a new variant of XHTML, but HTML5.

Different working models of W3C and WHATWG

The WHATWG follows a versionless development model. It works on a so-called living standard, i.e. a specification that is subject to continuous correction and extension. The WHATWG has therefore dropped the version number "5" and speaks only of the "HTML standard".

The aim of the W3C's HTML Working Group, by contrast, is to publish a stable snapshot of this specification under the name HTML5. To that end, the specification passes through a predefined process until it finally matures into a W3C Recommendation. The W3C expects that the complete HTML5 specification will be broadly supported by 2014 and can then be published as a Recommendation.

Relationship between the W3C and WHATWG specifications

The editor of the specification is Ian Hickson, founder of the WHATWG and an employee of Google. Different specifications are generated from the source text he edits, both by the WHATWG and by the W3C.

The WHATWG HTML specification integrates several related sub-specifications that the W3C splits into individual documents. These can pass through the W3C development process independently of the main HTML5 specification. The separate standards are Microdata for metadata, the 2D drawing context of the canvas element, and cross-document messaging (HTML5 Web Messaging).

W3C publications

The W3C publishes the HTML5 drafts as regular working drafts (Working Draft). In addition, it publishes so-called Editor's Drafts at intervals of a few days. The current daily version of the draft, extended by WHATWG-specific elements, is available on the WHATWG website.

Progress of development

Note that some parts of the WHATWG specification are more mature than others. Some of the more mature parts are already implemented in current browser versions and can be used.

Completion of HTML5

According to the W3C's schedule, HTML5 is to be officially adopted in 2014, that is, to become a W3C Recommendation. In May 2011, HTML5 reached "Last Call" status at the W3C, which serves as a final invitation to comment on the HTML5 draft. The WHATWG had already declared "Last Call" status on October 27, 2009. "Last Call" status also means that HTML5 has de facto reached a finished state, comparable to a release candidate. Most browsers already implement HTML5, albeit incompletely. Where all current browsers support the relevant features, web developers can already use parts of HTML5 (optionally with a fallback). The W3C also recommends this.


The first important goals for HTML5 were set by Tim Berners-Lee in his blog entry "Reinventing HTML": above all, the groups that use HTML (web authors, browser vendors) are to be involved in its development. HTML must be developed incrementally, that is, by revising and extending the previous version, and the transition to well-formed documents is to continue. Forms in HTML are to be extended, ideally as a step from the existing form structure towards XForms.

In the course of setting up the new HTML Working Group, and as part of the architectural vision for HTML, XHTML and XForms 2.0, these goals were specified in more detail, partly modified, and supplemented by further points:

  • In contrast to the previous approach of describing only the differences from an older version in each specification, a complete specification is to be written.
  • The HTML vocabulary must be writable both as classic HTML and as an XML dialect. Regardless of the form, the vocabulary must have a defined infoset, i.e. it must be convertible into a DOM representation of the source code.

The working group's mandate also includes defining DOM interfaces for working with the HTML vocabulary and a separate set for embedded media. The working group is to further develop forms and general user input elements such as progress bars or menus, and to define interfaces for WYSIWYG editing features.

After the working group was established, the HTML Design Principles were published as its first document. It explains further goals in detail.


HTML5, as defined by the W3C, consists of several specifications and documents, whose content is described in the following section.

HTML5 - A vocabulary and associated interfaces for HTML and XHTML

"HTML5" is the main specification, which sets out the fundamentals of HTML5.

The vocabulary

The vocabulary of HTML5 is made up of the vocabulary of previous HTML specifications, previously proprietary elements, and some new elements, including, for example, parts of the Ruby element group introduced in XHTML 1.1. It comprises only the part of HTML that authors may use to create documents and web applications.

In addition to the vocabulary, a clear content model is defined, i.e. the rules by which the various elements may be nested.

The HTML parser

For the first time since the emergence of HTML, the language is no longer defined as an application of SGML but is itself defined as a generalized language modelled on SGML. The reason is that even modern browsers do not process HTML with an SGML parser but with a parser tailored to the web.

How this parser works has not been defined so far. HTML5 changes this by defining an HTML parser, thereby avoiding differences between the HTML parsers of different browser vendors.

The special feature of the parser defined in HTML5 is that it understands not only the permitted vocabulary but also other elements that existed in earlier versions or exist only as proprietary elements. Through this exact definition, HTML5 aims to ensure that the parser is backwards compatible with documents already on the web.
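
The difference between "understood by the parser" and "permitted for authors" can be illustrated with a small sketch. It uses Python's standard html.parser module, which is not an implementation of the HTML5 parsing algorithm but is similarly tolerant. The NON_CONFORMING set, the function names and the sample markup are assumptions made purely for this illustration.

```python
from html.parser import HTMLParser

# A small, illustrative subset of elements that a parser still has to
# understand although authors may no longer use them (assumption for
# this sketch, not a complete list).
NON_CONFORMING = {"font", "center", "acronym", "blink", "marquee"}


class TagCollector(HTMLParser):
    """Records every start tag, whether or not authors may still use it."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case.
        self.seen.append(tag)


def classify(markup):
    """Return (tag, allowed_for_authors) pairs in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return [(tag, tag not in NON_CONFORMING) for tag in collector.seen]


legacy_page = "<p><FONT color=red>old</FONT> <blink>text</blink> <mark>new</mark></p>"
for tag, allowed in classify(legacy_page):
    print(tag, "allowed" if allowed else "parsed, but not allowed for authors")
```

The parser reports all four elements; only the classification step distinguishes the permitted vocabulary from the legacy one.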

HTML, XHTML, and the DOM

Every element and attribute known to HTML5 is defined in terms of the Document Object Model. This applies regardless of whether the element or attribute is a permitted part of the language. This means that, besides the meaning of each element, its DOM interface and related interfaces (methods and properties) are defined as well.

Based on this definition, HTML5 allows documents to be represented in three variants:

  • Documents with the media type "text/html" are treated as HTML documents. They are processed by the HTML parser. This variant is colloquially referred to as HTML5.
  • Documents with an XML media type, for example "application/xhtml+xml" or "application/xml", are treated as XML documents and are processed by an XML parser. This variant is colloquially referred to as XHTML5.
  • Both kinds of document share a common Document Object Model. In this context the DOM is often referred to as DOM5.
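
As a rough sketch of this distinction, the following Python function maps a media type to the variant it selects. The function name and the returned labels are hypothetical, and the sketch looks only at the media type itself.

```python
# XML media types named in the text above; other XML types are left out
# of this sketch on purpose.
XML_MEDIA_TYPES = {"application/xhtml+xml", "application/xml"}


def variant_for(content_type):
    """Map a Content-Type value to the colloquial name of the variant."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "HTML5"      # processed by the HTML parser
    if media_type in XML_MEDIA_TYPES:
        return "XHTML5"     # processed by an XML parser
    return None             # not covered by this sketch


for value in ("text/html; charset=utf-8", "application/xhtml+xml", "image/png"):
    print(value, "->", variant_for(value))
```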

HTML5 tries to reduce the differences between these three variants to the limitations inherent in each. For example, the string "-->" is invalid inside a comment in HTML and XML, whereas it can be represented in the DOM.

Another example is the attempt to reduce the differences between HTML and XHTML: the DOM core specification states that HTML elements are in the null namespace (while XHTML elements belong to the namespace "http://www.w3.org/1999/xhtml"). HTML5, however, specifies that HTML elements are also assigned to the namespace "http://www.w3.org/1999/xhtml".

The default rendering

HTML5 tries to reflect authors' expectations of the default rendering of elements. For all elements and their attributes there is therefore an "expected rendering", which is defined by CSS properties. HTML5 distinguishes between rendering properties that apply in standards mode and those that apply in the compatibility-oriented processing of web pages.

The browsing context

HTML5 introduces the concept of a browsing context: in each browsing context a document is loaded or (in the case of frames) further browsing contexts are created. Most components of the browsing context are JavaScript objects that previously belonged to no standard, such as the History object, which stores the sequence of visited web pages. The aim is to unify browser behavior and place it under a common definition.

HTML Microdata

This specification defines how machine-readable information is embedded in HTML documents. The aim is a clearly defined mechanism that is compatible with other formats such as RDF and JSON.

HTML Canvas 2D Context

This specification defines interfaces for drawing two-dimensional shapes. The drawing surface is the canvas element, which is introduced in the main specification.

Lines, shadows, simple and complex shapes (paths), as well as text and images contained in the document can be drawn.

HTML5 Web Messaging

This working draft defines two methods that allow independent browsing contexts to exchange data:

  • "Cross-document messaging", which is intended to enable communication between documents embedded in one another (e.g. via iframe), and
  • "Channel messaging", which is intended to enable communication between independent documents (e.g. in two separate browser windows).

HTML RDFa - A mechanism for embedding RDF in HTML

This document adapts the embedding of RDF in XHTML documents to HTML as defined in HTML5.

Supporting documents

HTML: The Markup Language

"HTML: The Markup Language" is not a specification but a supporting document that describes the HTML markup language precisely. It aims to give authors details about the correct use of the language, but it is not a tutorial or manual. The document makes no statements or definitions about how HTML should be processed.

Differences between HTML5 and HTML4

This document lists the differences between HTML4 (more precisely, HTML 4.01 and XHTML 1.0, as well as parts of DOM Level 2 HTML) and HTML5 and gives reasons for the changes. It is updated and published with each new working draft of the main specification.

HTML5 techniques for useful text alternatives

This guide for HTML authors describes which alternative texts to choose for images (especially in the alt attribute of the img element). These matter so that content conveyed through images is accessible to, for example, blind web users.

"Polyglot" Markup: HTML-Compatible XHTML Documents

The document describes rules for HTML5 documents that are written in XHTML syntax and can therefore be processed by both HTML5 parsers and XML parsers.
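
One necessary condition for such a document is that it is well-formed XML. The following Python sketch checks only that condition, using the standard xml.etree.ElementTree module; the function name and both sample documents are made up for this illustration, and the further rules of the polyglot document are not checked.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup):
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Written in XHTML syntax: namespace declared, void element closed as <br/>.
polyglot_candidate = (
    "<!DOCTYPE html>"
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Demo</title></head>"
    "<body><p>line<br/>break</p></body></html>"
)

# Fine for an HTML parser, but the unclosed <br> is not well-formed XML.
html_only = (
    "<!DOCTYPE html>"
    "<html><head><title>Demo</title></head>"
    "<body><p>line<br>break</p></body></html>"
)

print(is_well_formed_xml(polyglot_candidate))
print(is_well_formed_xml(html_only))
```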

Differences from existing standards

The following overview of new features in HTML5 does not claim to be exhaustive and is subject to change as the specifications evolve.

In principle, virtually all elements from HTML 4.01 are also included in HTML5. In that sense, HTML4 is a subset of HTML5.

Differences from HTML 4.01

The document type declaration

The document type declaration in HTML5 documents consists of the string "<!DOCTYPE html>", in which upper and lower case do not matter. In all modern browsers this string triggers processing of the source code in standards mode.

Since browsers do not distinguish HTML documents by version, any kind of versioning was deliberately omitted. This also reflects that HTML5 is defined as a superset of HTML 4.01.
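
A minimal sketch of how a tool might recognize this declaration, assuming a simple regular expression is sufficient: the pattern and function name are illustrative, and the sketch deliberately ignores any other doctype forms a specification may permit.

```python
import re

# "<!DOCTYPE html>" matched case-insensitively, allowing leading whitespace
# and whitespace before the closing ">".
DOCTYPE_RE = re.compile(r"\s*<!doctype\s+html\s*>", re.IGNORECASE)


def has_html5_doctype(source):
    """Return True if the source starts with the versionless doctype."""
    return DOCTYPE_RE.match(source) is not None


html401 = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">'

print(has_html5_doctype("<!DOCTYPE html><title>t</title>"))
print(has_html5_doctype("<!doctype HTML>"))
print(has_html5_doctype(html401))
```

The HTML 4.01 declaration is rejected because it carries a public identifier, i.e. exactly the version information that HTML5 omits.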

Integration of SVG and MathML

HTML5 provides a simple way to embed SVG and MathML in HTML source. There are only two restrictions:

  • The elements must not contain namespace prefixes.
  • The namespace prefix for XLink (on attributes) must be "xlink".

As a side effect, all named entities from SVG and MathML are permitted in HTML.

New elements

HTML5 introduces many new elements that are presented below.

Structuring elements

The elements section, nav, article, aside, header and footer enable better structuring. Unlike the div boxes previously used to structure HTML documents, the element itself defines what kind of content it holds. For example, section is a portion of running text, nav a menu, article an article, and footer a footer.

A study by the search engine provider Google found that the most frequently used class names in HTML documents map well onto the new HTML elements.

Some of these elements (section, nav, article and aside) also have a special effect in combination with the heading elements h1 to h6. The heading hierarchy is no longer determined solely by the rank of the heading elements but also by their position within the new elements. If, for example, a document uses a first-level heading and then another first-level heading inside an article element, the nested heading is subordinate to the non-nested one. This holds even if the non-nested heading is of lower rank.
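
The idea can be sketched in Python with the standard html.parser module. This is a strong simplification and not the outline algorithm of the specification: it only records, for each heading, how deeply it is nested in sectioning elements. The class name, function name and sample markup are assumptions for this illustration.

```python
from html.parser import HTMLParser

SECTIONING = {"section", "nav", "article", "aside"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingDepths(HTMLParser):
    """Collects (text, rank, sectioning depth) for every heading."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # how many sectioning elements are open
        self.current = None   # (rank, depth, text parts) of the open heading
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.depth += 1
        elif tag in HEADINGS:
            self.current = (int(tag[1]), self.depth, [])

    def handle_data(self, data):
        if self.current is not None:
            self.current[2].append(data)

    def handle_endtag(self, tag):
        if tag in SECTIONING:
            self.depth = max(0, self.depth - 1)
        elif tag in HEADINGS and self.current is not None:
            rank, depth, parts = self.current
            self.headings.append(("".join(parts).strip(), rank, depth))
            self.current = None


def heading_depths(markup):
    parser = HeadingDepths()
    parser.feed(markup)
    parser.close()
    return parser.headings


page = (
    "<h2>Site title</h2>"
    "<article><h1>Story</h1>"
    "<section><h1>Details</h1></section>"
    "</article>"
)
for text, rank, depth in heading_depths(page):
    print("  " * depth + f"{text} (h{rank})")
```

In the sample, the h1 inside the article ends up below the non-nested h2, because nesting depth takes precedence over heading rank.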

Older browsers do not recognize these new elements (in particular Internet Explorer 6, but also later versions up to and including IE8). For these versions, JavaScript is needed so that IE recognizes the new elements as such and can render them.

The grouping element figure

The figure element and its caption element figcaption were added to make it easier to mark up supplementary content, such as illustrations with captions.

Text-level markup elements

At the text level, the following elements were added: time for machine-readable time values; mark for highlighted text passages; ruby, rt and rp for simple Ruby annotations; and the previously proprietary wbr element, which marks line-break opportunities within long words.

Multimedia elements

HTML5 introduces dedicated elements for embedding audio and video files. One or more sources in different formats can be specified for them, from which the browser then selects a format it understands.
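
The selection step can be sketched as follows, assuming the browser simply takes the first listed source whose media type it can play. The function name, the file names and the sets of supported types are hypothetical.

```python
def pick_source(sources, playable_types):
    """Return the URL of the first source the browser can play, else None.

    sources        -- list of (url, media_type) pairs in document order
    playable_types -- set of media types this (hypothetical) browser supports
    """
    for url, media_type in sources:
        if media_type in playable_types:
            return url
    return None


# Hypothetical sources as an author might list them for one video.
sources = [
    ("clip.mp4", "video/mp4"),
    ("clip.webm", "video/webm"),
    ("clip.ogv", "video/ogg"),
]

print(pick_source(sources, {"video/webm", "video/ogg"}))
print(pick_source(sources, {"video/mp4"}))
print(pick_source(sources, set()))
```

Because order matters in this sketch, an author would list the preferred format first and less preferred formats after it.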

For embedding applications or interactive, non-HTML content, HTML5 specifies the formerly proprietary embed element.

In addition, a drawing surface (the canvas element) was added, on which two-dimensional images can be drawn using scripts.

Form elements

The input element has been extended with various types, e.g. for entering search terms, telephone numbers, URLs, e-mail addresses, dates and times, numbers, and colors.

The following form elements were also added: datalist, which supplies, for example, completion suggestions; output, which represents the result of a calculation; progress, which reflects the progress of an action; meter, which is intended for values within a measurable range (e.g. disk usage); and the previously proprietary keygen element, which is used to generate key pairs.

Interactive elements

The details and summary elements are similar in structure to the figure and figcaption elements. The content of the summary element is always displayed; the remaining content of the details element can be shown and hidden.

For creating toolbars and (context) menus, the menu element is defined as the structural base and the command element as the point of interaction.

Elements with changed meaning

Some elements that formerly served presentation (such as b, i, hr or small) were given a semantic meaning. The respective definitions are rather broad, so that the new meaning does not contradict usage in existing websites.

In contrast, some elements (e.g. cite) are redefined. As a result, documents written against older standards can be ported directly only under certain conditions.

Some elements are omitted from the HTML5 vocabulary (e.g. acronym, center, font, frame). However, it is still defined how a browser has to handle these elements. This ensures compatibility with existing websites.

Elements and attributes in HTML and XHTML

The noscript element is permitted in HTML5 but not in XHTML5, because its content model is not compatible with XML processing rules.

In HTML5, the xml:base attribute is not allowed, as it has no effect there.

Differences from HTML DOM Level 2

HTML5 defines several DOM interfaces intended for building web applications, including interfaces for:

  • Controlling multimedia elements,
  • Manipulating the History (finer-grained forward and back navigation),
  • Drag and drop,
  • Editable content,
  • Offline applications,
  • Storing application data (usually 5 MB per domain).

Some properties and methods that were formerly proprietary or available only in function libraries, such as innerHTML and getElementsByClassName(), have been added to the specification.

Furthermore, the behavior of elements in HTML has been aligned with the behavior of elements in XHTML, namely:

  • localName returns element or attribute names in lower case.
  • namespaceURI of every element is "http://www.w3.org/1999/xhtml".
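
The alignment can be illustrated with Python's standard library, under the assumption that html.parser and xml.dom.minidom are acceptable stand-ins for a browser's HTML and XML parsers (neither implements HTML5 itself). The helper names and sample documents are made up for this sketch. The HTML side shows only the lower-cased names; the namespace is visible on the XML side, where minidom exposes localName and namespaceURI.

```python
from html.parser import HTMLParser
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"


class LocalNames(HTMLParser):
    """Collects element names as reported by a tolerant HTML parser."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.append(tag)  # already lower case


def html_local_names(markup):
    parser = LocalNames()
    parser.feed(markup)
    parser.close()
    return parser.names


def xhtml_names(markup):
    """Return (localName, namespaceURI) for every element of an XML document."""
    doc = minidom.parseString(markup)
    return [(el.localName, el.namespaceURI) for el in doc.getElementsByTagName("*")]


html_doc = "<HTML><BODY><P>x</P></BODY></HTML>"
xhtml_doc = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>x</p></body></html>'

print(html_local_names(html_doc))
print(xhtml_names(xhtml_doc))
```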


HTML5 is a very large and wide-ranging project, and there are correspondingly many critical voices. Note that HTML5 is still in development. Some criticisms may therefore have lost relevance through revisions of the draft.

Critique of the development of HTML

Joe Clark, author and web activist for accessibility, concedes that the current version of HTML has problems but does not believe that the language should be developed further with regard to existing elements. In his view, even very poorly written code already leads to a satisfactory result in all browsers.

Clark criticizes that the development of HTML is heading in the wrong direction. The W3C, he argues, should work on far more important issues, such as the WCAG 2.0 standard, which is controversial in the accessibility community.

Shane McCarron, who is responsible as editor for numerous XHTML specifications, surmised that HTML is being developed further because the major browser vendors are not up to the challenge of implementing the Semantic Web. He concludes that the W3C abandoned the goal of an XML-based ecosystem under pressure from these vendors.

Criticism of the development process of HTML5

Sole write permission by Ian Hickson

Through his commitment, Hickson grew into the role of a "benevolent dictator". He alone has the right to edit the HTML5 draft and is solely responsible for many decisions affecting the specification.

This approach hardly matches that of the W3C, whose specifications are based on democratic consensus. Many web authors, for example Kyle Weems, therefore ask whether the end (a working specification) justifies the means (a consistent but undemocratic way of producing that specification). The answer is left open.

Lack of serious consideration for internal proposals

Mathias Schäfer, web developer and co-author of various documentation (for example SELFHTML), criticized the lack of objectivity in the discussions during the development of HTML5. Using the example of "Distributed Extensibility" (extending the HTML vocabulary with proprietary or private elements, such as company-internal ones), he shows that proposals are not taken seriously, although Schäfer believes that precisely this approach could lead HTML5 to success.

Criticism of the WHATWG

The WHATWG itself is criticized for still presenting its specification on its website as HTML5. In fact, however, it is not HTML5 but, by its own account, the next generation of HTML. The draft thus contains the current state of HTML5 but extends it with new, less mature features.

In addition, the WHATWG draft differs in a few places from the W3C version. The differences in normative passages are problematic, because the W3C defines the standard while browser vendors generally follow the WHATWG draft.

Audio and video elements

HTML5 defines the audio and video elements for embedding audio or video data. However, since no format is defined that must be supported as a minimum standard, there is currently no format that is supported by all browsers.

Formats discussed in the past include Ogg Vorbis and Theora (which can be used without paying royalties and were originally defined as the minimum standard), H.264 (which offers better quality but can be used only against payment of fees), and WebM.

Digital rights management

Companies such as Google, Microsoft and Netflix have asked the W3C to allow encrypted media content to be delivered via HTML5 through the so-called Encrypted Media Extensions (EME). Critics fear that this would introduce Digital Rights Management (DRM) into every browser through the back door.