XProc (short for XML Processing) is an XML language, standardized by the W3C, for defining processing chains for XML documents (so-called XML pipelines). It has been a W3C Recommendation since May 2010 and serves the increased demand for bulk processing of XML-based formats such as docx.


Processing XML documents typically involves several steps that follow one another. For example, when a manual is published, the DocBook source document is first validated against a RelaxNG schema and then converted into an HTML version and a PDF version using XSLT. With XProc, such processing chains are described as XML documents, independently of any particular software or platform. XProc processors can then execute the processing chains described in these XProc documents.
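The manual example above might be sketched in XProc 1.0 roughly as follows. This is a minimal sketch rather than a complete publishing setup: the file names docbook.rng and docbook-to-html.xsl are assumptions made for illustration, and the PDF branch is left out.

```xml
<!-- Sketch: validate a DocBook document, then transform it to HTML.
     docbook.rng and docbook-to-html.xsl are hypothetical file names. -->
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">

  <!-- Step 1: validate the pipeline's source document against a RelaxNG schema -->
  <p:validate-with-relax-ng>
    <p:input port="schema">
      <p:document href="docbook.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <!-- Step 2: transform the validated document with an XSLT stylesheet -->
  <p:xslt>
    <p:input port="stylesheet">
      <p:document href="docbook-to-html.xsl"/>
    </p:input>
  </p:xslt>

</p:pipeline>
```

The two steps are connected through their primary ports, so the output of the validation flows into the transformation without any explicit wiring.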

This is also useful when one or more operations, such as renaming an XML element, have to be applied to a large number of similar XML documents.
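As a hedged illustration of such an operation, the following sketch renames an element with the standard p:rename step. The element names para and p are assumptions chosen for the example; the same pipeline would be run once for each document to be processed.

```xml
<!-- Sketch: rename every <para> element to <p>.
     The element names are hypothetical and carry no namespace here. -->
<p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:rename match="para" new-name="p"/>
</p:pipeline>
```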

Structure of an XProc pipeline

The code of an XProc pipeline is written in XML syntax, which is then read and executed by an interpreter. As with any well-formed XML document, an XProc pipeline always has a root element. This root element assigns the document to at least one of the three XProc namespaces. The central elements of the pipeline are the steps, which are enclosed by the root element, described there, and processed in sequence. A pipeline can read zero or more XML documents and produce zero or more XML documents.
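A minimal skeleton of such a document could look like the sketch below. It assumes XProc 1.0 and uses p:declare-step as the root element together with the standard p:identity step, which simply copies its input to its output.

```xml
<!-- Sketch: the smallest useful pipeline, one root element and one step. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- The pipeline reads one document ... -->
  <p:input port="source"/>
  <!-- ... and produces one document. -->
  <p:output port="result"/>

  <!-- The only step: pass the input through unchanged. -->
  <p:identity/>
</p:declare-step>
```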

Steps

Steps are the key elements of an XML pipeline described with XProc. There are three kinds of steps:

  • Atomic steps

An atomic step performs exactly one processing operation, such as renaming or deleting an element within the XML document.

  • Compound steps

Steps can also be combined, which is then called a compound step. A pipeline itself consists of nothing more than a number of steps, so it can be embedded in another pipeline, where it is called a subpipeline. Compound steps make more complex structures possible, such as loops.

  • Multi-container steps

These steps make it possible to define several subpipelines side by side, which is used, among other things, for error-handling constructs.
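A loop built from a compound step might be sketched as follows, assuming XProc 1.0. The element names chapter and section are hypothetical. The p:for-each step runs its subpipeline, here a single atomic p:rename step, once for every selected node.

```xml
<!-- Sketch: a compound step (p:for-each) wrapping an atomic step (p:rename).
     "chapter" and "section" are hypothetical element names. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <!-- The loop may emit several documents, so the output is a sequence. -->
  <p:output port="result" sequence="true"/>

  <p:for-each>
    <!-- Each <chapter> element is handed to the subpipeline as its own document. -->
    <p:iteration-source select="//chapter"/>
    <!-- Subpipeline: rename the document element of the current document. -->
    <p:rename match="/*" new-name="section"/>
  </p:for-each>
</p:declare-step>
```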


The inputs and outputs of the steps of an XProc pipeline are realized through ports. Primary ports are used to connect the individual steps to one another, or to the pipeline itself (for the first or last step), automatically, and they do not necessarily have to be named. When they are connected automatically, the primary ports are said to be specified implicitly. In the opposite case they are specified explicitly, i.e. the primary port is named. Ports have unique names, such as source for the primary input port or result for the primary output port. Another example is the schema port, which takes XML Schema files.


XProc uses three namespaces of its own. The namespace http://www.w3.org/ns/xproc (by convention with the prefix p:) describes the XML vocabulary of XProc. The namespace http://www.w3.org/ns/xproc-step (by convention with the prefix c:) is used for documents that are generated within a processing chain as the defined input or output of individual steps, independently of the namespaces of the external documents being processed. Finally, the namespace http://www.w3.org/ns/xproc-error (by convention with the prefix err:) is used for error handling.
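The following sketch shows where the three namespaces typically appear, assuming XProc 1.0. It uses the multi-container step p:try with a p:catch branch; the step name failed is an assumption made for the example. If the XInclude step fails, the pipeline returns the error report instead, which is a c:errors document whose c:error children carry codes from the err: namespace.

```xml
<!-- Sketch: error handling with p:try / p:catch.
     The step name "failed" is hypothetical. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:err="http://www.w3.org/ns/xproc-error"
                version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:try>
    <!-- Normal case: resolve XInclude references. -->
    <p:group>
      <p:xinclude/>
    </p:group>
    <!-- Error case: return the c:errors document describing the failure. -->
    <p:catch name="failed">
      <p:identity>
        <p:input port="source">
          <p:pipe step="failed" port="error"/>
        </p:input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

The c: and err: prefixes are declared here only to document the convention; the pipeline itself would also run without those two declarations.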


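    <p:pipeline xmlns:p="http://www.w3.org/ns/xproc" name="xinclude-and-validate" version="1.0">
      <p:input port="schemas" sequence="true"/>

      <p:xinclude name="included">
        <p:input port="source">
          <p:pipe step="xinclude-and-validate" port="source"/>
        </p:input>
      </p:xinclude>

      <p:validate-with-xml-schema name="validated">
        <p:input port="source">
          <p:pipe step="included" port="result"/>
        </p:input>
        <p:input port="schema">
          <p:pipe step="xinclude-and-validate" port="schemas"/>
        </p:input>
      </p:validate-with-xml-schema>
    </p:pipeline>

This is a pipeline that consists of two atomic steps, XInclude and Validate. The pipeline itself has two inputs, source (a source document) and schemas (a list of W3C XML Schemas). The XInclude step reads the pipeline input source and produces a result document. The Validate step reads the pipeline input schemas and the result of the XInclude step and produces a result document. The result of the validation, result, is the result of the processing chain.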

The same pipeline can be written more concisely if its primary ports are specified implicitly:

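    <p:pipeline xmlns:p="http://www.w3.org/ns/xproc" name="xinclude-and-validate" version="1.0">
      <p:input port="schemas" sequence="true"/>

      <p:xinclude/>

      <p:validate-with-xml-schema>
        <p:input port="schema">
          <p:pipe step="xinclude-and-validate" port="schemas"/>
        </p:input>
      </p:validate-with-xml-schema>
    </p:pipeline>

Implementations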

  • Calabash by Norman Walsh
  • EMC Documentum XProc Engine
  • QuiXProc by Innovimax
  • Yax, an XProc (XML pipeline) implementation that is still based on an XProc Working Draft
  • AntillesXML, a free XML toolbox with a GUI and built-in Calabash