Efficient XML Interchange

Magic number: 24 45 58 49 (hex), $EXI (ASCII, C notation)

Efficient XML Interchange (EXI) is a format proposed by the World Wide Web Consortium (W3C) for the binary representation of XML information sets. Compared with text-based XML documents, documents in EXI format can be processed faster and require less bandwidth when transferred over a network. Besides EXI there are other approaches to establishing a binary representation of XML (see Binary XML).

History

Based on the conclusions of the XML Binary Characterization Working Group, the Efficient XML Interchange (EXI) Working Group was established in November 2005 with the aim of defining a binary description format for XML. After an analysis and comparison of several approaches (including XML with gzip, Fast Infoset, Fujitsu Binary, Xebu and esXML), Efficient XML was chosen in November 2006 as the basis for EXI. In July 2007, the first draft of the Efficient XML Interchange standard was published.

The working group originally planned to publish EXI as a W3C Recommendation in September 2009. A proposal for a W3C Recommendation was then published in January 2011, and the W3C Recommendation building on it followed in March 2011.

Concept

The algorithm uses a grammar to determine what is likely to occur at a particular location in an XML document. The most likely alternative is then encoded with fewer bits than the less probable ones (see entropy). In general, this algorithm can be applied to any language that is described by a grammar (for example EPS, Java, HTML, etc.). EXI is, however, optimized for XML languages.

The grammar accepts any XML document or fragment as input. To make more accurate predictions about what occurs at a given point, the grammar can be extended by means of various schema languages (such as DTD or XML Schema).

With the aid of the grammar, the encoder turns the input into an event stream, which consists of a series of simple variable-length codes. These event codes are similar to Huffman codes, but are much easier to compute and maintain. In addition, the event codes can be compressed by run-length encoding.
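The idea of giving likely events shorter codes can be illustrated with a small sketch. It is a deliberate simplification and not the procedure defined in the EXI specification: the function names (`bits_needed`, `assign_event_codes`) and the example event labels are hypothetical. It assumes that the grammar offers a short list of expected events at the current position, plus a list of less likely events that share an escape value followed by a second code part.

```python
def bits_needed(n_alternatives: int) -> int:
    """Smallest number of bits that can distinguish n alternatives."""
    return (n_alternatives - 1).bit_length()


def to_bits(value: int, width: int) -> str:
    """Fixed-width binary string; an empty string when no bits are needed."""
    return format(value, f"0{width}b") if width > 0 else ""


def assign_event_codes(likely: list[str], unlikely: list[str]) -> dict[str, str]:
    """Assign short one-part codes to likely events and longer
    two-part codes (escape value + index) to unlikely events."""
    # The first part must distinguish every likely event plus one escape value.
    first_count = len(likely) + (1 if unlikely else 0)
    first_width = bits_needed(first_count)

    codes = {}
    for index, event in enumerate(likely):
        codes[event] = to_bits(index, first_width)

    if unlikely:
        escape = to_bits(len(likely), first_width)
        second_width = bits_needed(len(unlikely))
        for index, event in enumerate(unlikely):
            codes[event] = escape + to_bits(index, second_width)
    return codes


# Hypothetical grammar state: a schema predicts a <title> element or the
# end of the current element; everything else is considered unlikely.
codes = assign_event_codes(
    likely=["SE(title)", "EE"],
    unlikely=["SE(*)", "CH", "CM"],
)
for event, code in codes.items():
    print(f"{event:10s} {code}")
```

In this example the two expected events cost two bits each, while the three unexpected ones cost four bits. Unlike Huffman coding, no frequency table has to be built or transmitted, which matches the statement above that such codes are easier to compute and maintain.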

Magic Number

To distinguish EXI streams from XML streams, two distinguishing bits were introduced. The first two bits of the first octet must have the values '1' and '0', in that order. This sequence cannot occur in well-formed XML 1.0 documents in the usual character encodings. However, to guarantee the distinction for possible future encodings as well, the introduction of a magic cookie was proposed early on.

The specification for format version 1.0 states that the EXI header can begin with the so-called EXI cookie, the ASCII string $EXI (0x24455849). The two distinguishing bits must follow immediately afterwards. Except for the first two and the fourth bit of the first octet (after the EXI cookie), the other five bits are variable. This theoretically results in 32 different magic numbers.

Although the use of the EXI cookie is optional, it is strongly recommended in the specification.
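The checks described in this section can be sketched in a few lines of Python. This is a minimal illustration, not a conforming EXI header parser: the function names are hypothetical, and the interpretation of the third bit as the "options present" flag and of the fourth bit as fixed to 0 is an assumption inferred from the two example octets 80 and AF given in this article.

```python
EXI_COOKIE = b"$EXI"  # the bytes 24 45 58 49


def inspect_exi_start(data: bytes) -> dict:
    """Look at the optional cookie and the first header octet of a stream."""
    has_cookie = data.startswith(EXI_COOKIE)
    offset = len(EXI_COOKIE) if has_cookie else 0
    if len(data) <= offset:
        raise ValueError("stream too short to contain a header octet")

    octet = data[offset]
    # The two most significant bits must be '1' and '0', in that order.
    if octet >> 6 != 0b10:
        raise ValueError("distinguishing bits '10' not found")

    return {
        "has_cookie": has_cookie,
        # Assumption: the third bit signals that EXI options follow.
        "options_present": bool(octet & 0b0010_0000),
        "fourth_bit": (octet >> 4) & 1,
        "low_nibble": octet & 0x0F,
    }


def possible_first_octets() -> list[int]:
    """All octets with bits 1-2 fixed to '10' and bit 4 fixed to '0'."""
    return [o for o in range(256)
            if o >> 6 == 0b10 and not o & 0b0001_0000]


print(inspect_exi_start(bytes.fromhex("2445584980")))
print(inspect_exi_start(bytes.fromhex("24455849AF")))
print(len(possible_first_octets()))  # 32
```

A text-based XML document starting with `<` (0x3C) fails the distinguishing-bit check, because its first two bits are '00'.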

Example

An EXI stream of version 1 with the EXI cookie and without EXI options would begin with the following bytes:

24 45 58 49 80

An EXI stream of version 16 or higher with the EXI cookie and with EXI options would begin as follows:

24 45 58 49 AF

Implementations

The website of the Efficient XML Interchange Working Group provides a detailed description of the implementations.

  • EXIficient: An open-source project, supported by Siemens, that implements the EXI specification in Java.
  • Efficient XML: A commercial implementation of the EXI specification in Java, .NET and C, marketed by AgileDelta.
  • OpenEXI: An open-source project, driven by Fujitsu, the Naval Postgraduate School (NPS) and Optima Logic, that implements the EXI specification in Java.
  • Exi Connexion: An open-source implementation of the EXI Working Draft of 26 March 2008.