Character encodings in HTML

Character references are special character sequences that represent characters in SGML and XML documents (and thus also in HTML and XHTML documents). They begin with an ampersand (&) and end with a semicolon (;).

Character references are necessary for the metacharacters of these languages ("<", ">", "&", the double quote and the apostrophe) whenever the character itself is meant (escaping). They are also useful for characters that are rarely used, that are difficult or impossible to enter in the editor, that cannot be represented in the character set or character encoding used, or that are hard to tell apart visually. They are divided into numeric and named character references.
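As an illustration of escaping, the following is a minimal sketch that uses Python's standard html module; the helper name escape_markup is hypothetical and chosen only for this example. Note that html.escape writes the apostrophe as the hexadecimal numeric reference &#x27; rather than as a named reference.

```python
import html


def escape_markup(text: str) -> str:
    """Replace the markup metacharacters in text with character references."""
    # quote=True also escapes the double quote and the apostrophe,
    # which matters inside attribute values.
    return html.escape(text, quote=True)


print(escape_markup('a < b & "c"'))
print(escape_markup("it's"))
```

The escaped result can be placed in element content or attribute values without being read as markup.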

Numeric character references

In numeric character references, the "&" character is followed by a "#" character, then by the decimal or hexadecimal code position of the character in the UCS character set (Unicode), and finally by the closing semicolon.

Decimal notation

The decimal notation of numeric character references follows the pattern:

&#n;

where n is the decimal code position of the character to be displayed. Examples:

  • &#123; opening curly brace ({, U+007B)
  • &#229; small letter a with ring above (å, U+00E5)
  • &#1048; Cyrillic capital letter I (И, U+0418)
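The decimal pattern can be sketched in Python as follows. The function name decimal_reference is an assumption made for this example; ord and html.unescape are part of the standard library.

```python
import html


def decimal_reference(char: str) -> str:
    """Return the decimal numeric character reference for one character."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    # ord() yields the Unicode code position as a decimal integer.
    return f"&#{ord(char)};"


# The three characters from the examples: U+007B, U+00E5, U+0418.
for ch in ("{", "\u00e5", "\u0418"):
    ref = decimal_reference(ch)
    # html.unescape decodes the reference back into the character.
    print(ref, html.unescape(ref) == ch)
```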

Hexadecimal notation

The hexadecimal notation of numeric character references differs from the decimal notation only in that the hexadecimal code position is preceded by an "x". The notation therefore follows the pattern:

&#xh;

Here, h is the code position of the character in hexadecimal notation. In HTML the "x" may also be written as a capital letter, but in XML it must be lower case. The case of the hexadecimal digits themselves, however, does not matter in either language. Examples:

  • &#x6C34; Chinese character for water (水, U+6C34)
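A corresponding sketch for the hexadecimal pattern, again in Python. The name hex_reference is hypothetical. The function always emits a lower-case "x" so that the result is also valid in XML. The decoding step uses html.unescape, which follows HTML rules and therefore accepts a capital "X" as well.

```python
import html


def hex_reference(char: str) -> str:
    """Return the hexadecimal numeric character reference for one character."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    # Lower-case "x" is required in XML; the digits may be in either case.
    return f"&#x{ord(char):X};"


water = "\u6c34"  # Chinese character for water
print(hex_reference(water))
# The case of the digits and, in HTML, of the "x" does not change the result.
print(html.unescape("&#x6c34;") == html.unescape("&#X6C34;") == water)
```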

Named character references

Named character references (also called entity references or character entity references) use defined names instead of the code position of the character. The notation:

&name;

Here, name is a defined name for a character. How many names are available, and which ones, depends on the markup language or DTD.

Examples of the names defined in XML:

  • &quot; quotation mark (", U+0022)
  • &amp; ampersand (&, U+0026)
  • &apos; apostrophe (', U+0027)
  • &lt; less-than sign (<, U+003C)
  • &gt; greater-than sign (>, U+003E)
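The five names defined in XML can be resolved with a short Python sketch. It relies on html.unescape, which uses the HTML5 name table (a superset of the XML names), so it illustrates the lookup rather than XML processing as such. The helper describe_named_reference is a hypothetical name.

```python
import html


def describe_named_reference(name: str) -> str:
    """Resolve &name; and report the code position of the resulting character."""
    ref = f"&{name};"
    char = html.unescape(ref)
    # Unknown names are left undecoded, and a few HTML5 names expand to
    # two characters; both cases are rejected in this sketch.
    if len(char) != 1:
        raise ValueError(f"{ref} does not resolve to a single character")
    return f"{ref} -> U+{ord(char):04X}"


for name in ("quot", "amp", "apos", "lt", "gt"):
    print(describe_named_reference(name))
```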