Augmented Backus–Naur Form

The Augmented Backus-Naur Form (ABNF) is a variant of the Backus-Naur Form (BNF) metalanguage for describing syntax notations. It was originally published as RFC 2234 to allow the unambiguous specification of the IETF's RFC Internet standards, and it is suitable for the syntactic definition of technical languages and protocols.

  • 2.1 Comments
  • 2.2 Terminal symbols
  • 2.3 Naming rules
  • 2.4 Value ranges
  • 2.5 Repetitions
  • 2.6 Groups
  • 2.7 Sequences
  • 2.8 Optional sequences
  • 2.9 Alternatives
      • 2.9.1 Incremental alternatives

Origin

As the RFC standards were being written, the need arose to present the required syntax descriptions in a standardized BNF variant. RFC 2234 unified the slightly different versions that had been used across the previously published RFCs. Newer RFCs no longer needed to contain a definition of the metalanguage they used; a reference to RFC 2234 was enough.

The document contains a self-definition of the ABNF syntax: ABNF itself is specified using ABNF notation.

Subsequent versions

Later, the corrected editions RFC 4234 and RFC 5234 replaced the first version.

Properties

ABNF notation is based on BNF, so the fundamentals can also be found there.

The extensions over BNF consist of modified naming rules, repetitions, alternatives, value ranges, and a set of predefined core rules. They allow a more convenient and more expressive formulation of the structures to be described. The notation is aimed primarily at the definition of character strings. Mechanisms that presuppose a particular character encoding (e.g. ASCII) were defined deliberately. If, for example, character codes or value ranges are used, the definitions depend on the underlying character encoding and generally have to be adapted for other encodings.

Comments

A semicolon (;) starts a comment. The comment text follows and extends to the end of the line (line comment). Multiline comments require a semicolon on each line.
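As a rough illustration of the line-comment rule, the following Python sketch removes a trailing comment from a single grammar line. The function name `strip_comment` is hypothetical. The sketch assumes that a semicolon inside a double-quoted literal does not start a comment and that the grammar contains no prose values in angle brackets.

```python
def strip_comment(line: str) -> str:
    """Return one ABNF line without its trailing ';' comment.

    Assumption: semicolons inside double-quoted literals are kept,
    and prose values in angle brackets are not handled.
    """
    in_quotes = False
    for index, char in enumerate(line):
        if char == '"':
            # ABNF literals cannot contain a quote, so toggling is enough.
            in_quotes = not in_quotes
        elif char == ";" and not in_quotes:
            return line[:index].rstrip()
    return line.rstrip()


plain = strip_comment("CRLF = CR LF ; Internet standard newline")
quoted = strip_comment('semi = ";" ; a literal semicolon')
```

Here `plain` keeps only the rule text, and `quoted` keeps the quoted semicolon while dropping the comment that follows it.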

Terminal symbols

Terminal symbols are the values from which the rule definitions are ultimately built. Terminal symbols include:

  • Literal strings, which are not case-sensitive. They are enclosed in double quotes ("). If a quotation mark is needed within the string, the string must be expressed as a sequence that includes the character code for the quotation mark. Example: "PROGRAM"
  • Character codes in decimal notation: the prefix %d indicates the decimal system. Example: %d13 for the character with code value 13 (in ASCII, the carriage return, CR for short)
  • Character codes in hexadecimal notation: the prefix %x indicates the hexadecimal system. Example: %x0d for the character with code value 13
  • Character codes in binary notation: the prefix %b indicates the binary system. Example: %b00001101 for the character with code value 13
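The three numeric notations above describe the same code value. A minimal Python sketch can make that concrete; the helper name `terminal_code` is hypothetical, and the sketch handles only single values, not ranges or dot-separated sequences.

```python
import re

# Number-system prefixes from the list above and their bases.
BASES = {"d": 10, "x": 16, "b": 2}


def terminal_code(term: str) -> int:
    """Return the code value of a single numeric terminal such as %d13."""
    match = re.fullmatch(r"%([dxb])([0-9A-Fa-f]+)", term, re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a numeric terminal: {term!r}")
    prefix, digits = match.groups()
    # int() raises ValueError if a digit does not fit the base, e.g. %b2.
    return int(digits, BASES[prefix.lower()])


codes = [terminal_code(t) for t in ("%d13", "%x0d", "%b00001101")]
```

All three entries of `codes` are the integer 13, which in ASCII is the carriage return.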

Naming rules

Names for definitions consist of the characters A-Z, a-z, 0-9, and the minus sign -; the first character must be a letter. Unlike in BNF, angle brackets <> are not required around names; however, they are permitted for compatibility reasons. The definition of a rule starts with the name and an equals sign =. It continues for as long as continuation lines with deeper indentation follow.

    Rule1 = rule-element1
            rule-element2
            rule-element3
    Rule2 = ...

One exception is the incremental alternative (see below), which extends an already existing definition.

Value ranges

Value ranges represent a set of characters whose code values lie within the specified range. They are a special form of alternatives. A range is formed from the code values at its two boundaries. The two values are joined by a minus sign -, and the prefix for the number system is given only before the first number. For example, the following defines the range of hexadecimal character codes 0x30 to 0x39 (decimal 48 to 57). In the usual ASCII encoding, this corresponds to the digits '0', '1', and so on up to '9':

    Digit = %x30-39

corresponds to the alternative

    Digit = "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"

Repetitions

Repetition specifiers are placed in front of the element and can contain a minimum and/or a maximum number of occurrences. The explicit form is <a>*<b>element, where a missing minimum is taken as 0 (the element may be absent) and a missing maximum as infinity (unlimited occurrences). An exact number of repetitions n is expressed by the single number n.

    any-number    = *my-value
    exactly-three = 3good-things
    at-least-two  = 2*participants
    at-most-three = *3free-attempts
    one-to-two    = 1*2first-names

Groups

Groups make the precedence in compound expressions unambiguous and are formed with parentheses ( and ).

    String1 = (To be) / (not to be)
    String2 = To (be / not) to be

The first example is a choice between "To be" and "not to be".

In the second example, a sequence is formed from "To", then either "be" or "not", and then "to be".

Sequences

In a sequence, all the listed expressions are expected exactly in the order given. Sequences are formed simply by writing expressions one after another (separated by white space).

    Sequence = one after the other

Optional sequences

Optional sequences may be present but do not have to be. They are formed with square brackets [ and ]. The following forms are equivalent:

  • [optional-expression]
  • *1optional-expression
  • 0*1optional-expression
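Why these forms are equivalent can be seen by reading each repetition prefix as a pair of bounds. The Python sketch below does this under the rules described in the Repetitions section; the name `repeat_bounds` is hypothetical, and `None` stands for "no upper limit".

```python
import re
from typing import Optional, Tuple


def repeat_bounds(prefix: str) -> Tuple[int, Optional[int]]:
    """Parse a repetition prefix into (minimum, maximum).

    A missing minimum counts as 0, a missing maximum as unbounded (None),
    a bare number n means exactly n, and no prefix means exactly once.
    """
    if prefix == "":
        return (1, 1)
    match = re.fullmatch(r"([0-9]*)\*([0-9]*)", prefix)
    if match:
        low, high = match.groups()
        return (int(low) if low else 0, int(high) if high else None)
    if re.fullmatch(r"[0-9]+", prefix):
        return (int(prefix), int(prefix))
    raise ValueError(f"not a repetition prefix: {prefix!r}")


# Square brackets mean "zero or one occurrence".
OPTIONAL_BOUNDS = (0, 1)
same = repeat_bounds("*1") == repeat_bounds("0*1") == OPTIONAL_BOUNDS
```

Both prefixes resolve to the bounds (0, 1), which is exactly what the square brackets express, so `same` is true.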

Alternatives

With alternatives, exactly one of the listed variants may occur. Alternatives are separated by a solidus (slash) /.

    Selection = being / non-being

Incremental alternatives

Existing definitions can be extended incrementally with further alternatives. This makes decentralized definitions possible, although clarity may suffer if the parts of a definition are far apart. The rule name must be repeated, followed by =/.
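How a tool might fold such incremental definitions into one rule can be sketched in Python. This is only an illustration under simplifying assumptions: one definition per line, no comments, and no slashes or equals signs inside quoted literals or groups. The function name `merge_rules` and the rule names in the usage lines are hypothetical.

```python
from typing import Dict, List


def merge_rules(lines: List[str]) -> Dict[str, List[str]]:
    """Collect the alternatives of each rule, appending '=/' definitions."""
    rules: Dict[str, List[str]] = {}
    for line in lines:
        name, _, rest = line.partition("=")
        name = name.strip()
        # '=/' marks an incremental alternative for an existing rule.
        incremental = rest.startswith("/")
        if incremental:
            rest = rest[1:]
        alternatives = [alt.strip() for alt in rest.split("/")]
        if incremental:
            if name not in rules:
                raise ValueError(f"rule {name!r} extended before definition")
            rules[name].extend(alternatives)
        else:
            rules[name] = alternatives
    return rules


merged = merge_rules(["Answer = Yes", "Answer =/ No", "Answer =/ Maybe"])
```

After the call, `merged["Answer"]` holds the three alternatives in the order in which they were defined.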

    Status = Yes
    ...
    Status =/ No
    ...
    Status =/ Unknown

corresponds to

    Status = Yes / No / Unknown

Precedence

In compound expressions, the following order of precedence applies, from highest (binding most tightly) at the top to lowest at the bottom:

  • Rule names, strings, terminal values
  • Comments
  • Value ranges
  • Repetitions
  • Groups and optional sequences
  • Sequences
  • Alternatives

The RFC recommends using groups to make the precedence explicit in expressions that mix sequences and alternatives.

Predefined rules

Frequently used definitions are predefined as core rules. They include general classes such as digits, letters, and white space.

Comparison with the EBNF

For orientation, the differences between ABNF and EBNF are tabulated here.

Both notations allow syntax definitions at the same level of expressiveness.
