ISO/IEC 8859-1

ISO 8859-1, more precisely ISO/IEC 8859-1 and also known as Latin-1, is an ISO standard for 8-bit character encoding in information technology, last updated in 1998. It is the first part of the ISO/IEC 8859 series of standards.

The characters that can be encoded in 7-bit US-ASCII keep their values, with the leading bit set to zero. In addition to the 95 printable ASCII characters (20₁₆ to 7E₁₆), ISO 8859-1 encodes 96 more (A0₁₆ to FF₁₆), for a total of 191 of the 256 (= 2⁸) theoretically possible values. Positions 00₁₆ to 1F₁₆ and 7F₁₆ to 9F₁₆ are not assigned any characters in ISO/IEC 8859, and therefore none in ISO/IEC 8859-1. The character set registered by IANA as ISO-8859-1 (with a hyphen after "ISO") denotes the combination of the characters of this standard with the non-printable control characters of ISO/IEC 6429.
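The counts given above can be checked with a short Python sketch. It assumes Python's built-in `latin-1` codec, which follows the IANA variant and therefore maps all 256 byte values; the constant names are illustrative only.

```python
# Printable positions that ISO/IEC 8859-1 itself assigns.
ASCII_PRINTABLE = range(0x20, 0x7F)   # 0x20 to 0x7E
HIGH_PRINTABLE = range(0xA0, 0x100)   # 0xA0 to 0xFF

assigned = len(ASCII_PRINTABLE) + len(HIGH_PRINTABLE)
print(len(ASCII_PRINTABLE), len(HIGH_PRINTABLE), assigned, 2 ** 8)

# ASCII text produces identical bytes in both encodings,
# and every byte has its leading (eighth) bit cleared.
text = "Hello, world!"
encoded = text.encode("latin-1")
assert encoded == text.encode("ascii")
assert all(byte < 0x80 for byte in encoded)
```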

ISO/IEC 8859-1 attempts to cover as many characters of Western European languages as possible. Because some characters needed for completeness are missing, notably the euro sign (€) and several characters used in French, ISO/IEC 8859-15 was created as an alternative.

Windows-1252 (Western European) is an 8-bit character encoding of the Microsoft Windows operating system that supports most Western European languages. It builds on ISO 8859-1 and also covers the characters that ISO 8859-15 added.

Some applications conflate the definitions of ISO 8859-1 and Windows-1252. The two encodings differ only in the range 80₁₆ to 9F₁₆, where ISO-8859-1 has control characters. Because these control characters have no meaning in HTML, for example, the printable characters of Windows-1252 are often used in their place. For this reason, the HTML5 standard prescribes that text labelled as ISO 8859-1 is to be interpreted as Windows-1252.
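The effect of this conflation can be sketched in Python using the built-in `latin-1` and `cp1252` codecs. The helper `decode_as_browser_would` is a hypothetical name introduced here for illustration; it only mimics the relabelling rule and is not an implementation of the HTML5 decoding algorithm.

```python
import codecs

# Bytes as a Windows-1252 editor might write them: curly quotes and a euro sign.
raw = b"\x93Latin-1\x94 costs 5\x80"

as_latin1 = raw.decode("latin-1")  # 0x80, 0x93, 0x94 become invisible control characters
as_cp1252 = raw.decode("cp1252")   # the same bytes become printable characters

def decode_as_browser_would(data: bytes, label: str) -> str:
    """Hypothetical helper: treat any ISO 8859-1 label as Windows-1252."""
    if codecs.lookup(label).name == codecs.lookup("latin-1").name:
        label = "cp1252"
    return data.decode(label)

print(repr(as_latin1))
print(as_cp1252)
print(decode_as_browser_would(raw, "ISO-8859-1"))
```

Only bytes in the range 0x80 to 0x9F are affected; the rest of the text decodes identically under both codecs.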

History

ISO 8859-1 is based on the DEC Multinational Character Set, which was used by the Digital Equipment Corporation VT220 terminal. It was originally developed by the European Computer Manufacturers Association (ECMA) and released in March 1985 as ECMA-94. The second edition of ECMA-94 also contained ISO 8859-2, ISO 8859-3, and ISO 8859-4 as part of the specification.

Tables

ISO/IEC 8859-1

SP (20₁₆, "space") is the space, NBSP (A0₁₆, "non-breaking space") is the non-breaking space, and SHY (AD₁₆, "soft hyphen") is the conditional hyphen, which normally becomes visible only at the end of a line.
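A code table of this kind can be regenerated from Python's built-in `latin-1` codec. The following is a minimal sketch; the function names and the row layout are assumptions made for illustration, and the positions that ISO/IEC 8859-1 leaves unassigned are printed as empty cells.

```python
# Symbolic names for the three characters that have no visible glyph.
NAMES = {0x20: "SP", 0xA0: "NBSP", 0xAD: "SHY"}

def cell(code: int) -> str:
    """Return the table entry for one byte value."""
    if code < 0x20 or 0x7F <= code <= 0x9F:
        return ""  # not assigned by ISO/IEC 8859-1
    return NAMES.get(code, bytes([code]).decode("latin-1"))

def table_rows() -> list:
    """Build one text row per block of 16 code positions, from 2x to Fx."""
    rows = []
    for base in range(0x20, 0x100, 16):
        cells = " ".join(f"{cell(code):>4}" for code in range(base, base + 16))
        rows.append(f"{base >> 4:X}x {cells}")
    return rows

# Header row with the low hexadecimal digit, then the table itself.
print("   " + " ".join(f"{i:>4X}" for i in range(16)))
for row in table_rows():
    print(row)
```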

ISO/IEC 8859-1 combined with control characters from ISO/IEC 6429

IANA has registered the following equivalent, case-insensitive labels for this code table for use in Internet applications such as MIME:

  • ISO_8859-1:1987
  • ISO_8859-1
  • ISO-8859-1
  • iso-ir-100
  • csISOLatin1
  • latin1
  • l1
  • IBM819
  • CP819
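Many runtimes accept these labels as aliases of a single codec. As an illustration, the sketch below asks Python's `codecs` module to resolve each label and checks that all of them lead to the same codec. The list name `IANA_LABELS` is an assumption; the behaviour shown is that of Python's alias table, not of the IANA registry itself.

```python
import codecs

IANA_LABELS = [
    "ISO_8859-1:1987", "ISO_8859-1", "ISO-8859-1", "iso-ir-100",
    "csISOLatin1", "latin1", "l1", "IBM819", "CP819",
]

# Name under which Python registers its ISO 8859-1 codec.
canonical = codecs.lookup("latin-1").name

resolved = {label: codecs.lookup(label).name for label in IANA_LABELS}
for label, name in resolved.items():
    print(f"{label:<16} -> {name}")

# Lookup ignores case, so differently cased labels resolve identically.
assert codecs.lookup("LATIN1").name == canonical
```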

Windows-1252

Windows-1252, also known as CP 1252, is referred to as Western European.

This character set differs from ISO 8859-1 in the range 80₁₆ to 9F₁₆: of the 32 positions there, it assigns 27 to displayable characters, including those added by ISO 8859-15 and some needed for better typography. The differences between all of these encodings, and a general lack of consistent support for the various character sets, are a common interoperability problem.
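Both figures can be verified against Python's built-in codecs, as in the sketch below. It assumes that Python's `cp1252` and `iso8859-15` codecs reflect the respective code tables; the function and variable names are illustrative.

```python
def assigned_high_controls(encoding: str = "cp1252") -> dict:
    """Map each byte 0x80 to 0x9F to its character, skipping unassigned bytes."""
    found = {}
    for byte in range(0x80, 0xA0):
        try:
            found[byte] = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # this position is undefined in the codec
    return found

cp1252_extra = assigned_high_controls()
print(len(cp1252_extra), "of", 0xA0 - 0x80, "positions assigned")

# Characters whose position in ISO 8859-15 differs from ISO 8859-1.
added_by_8859_15 = {
    bytes([b]).decode("iso8859-15")
    for b in range(0xA0, 0x100)
    if bytes([b]).decode("iso8859-15") != bytes([b]).decode("latin-1")
}
print(sorted(added_by_8859_15))

# All of them are available in the 0x80 to 0x9F range of Windows-1252.
assert added_by_8859_15 <= set(cp1252_extra.values())
```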

Windows-1252 is also registered with IANA.

ISO 8859-1 vs. ISO 8859-15 vs. Windows-1252 vs. Unicode

Because ISO 8859-1 was so widely used, the Unicode standard was designed as an extension of ISO 8859-1. A character that ISO 8859-1 encodes as the byte value x is assigned the code point x in Unicode. The byte sequence actually used may differ from the code point, for example in the UTF-8 encoding.
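This relationship is easy to demonstrate in Python, where `ord` returns the Unicode code point of a character. The sketch uses only built-in codecs; the sample character é is chosen arbitrarily.

```python
# Every ISO 8859-1 byte value x decodes to the Unicode code point x.
for x in range(256):
    assert ord(bytes([x]).decode("latin-1")) == x

ch = "é"
code_point = ord(ch)
latin1_bytes = ch.encode("latin-1")  # a single byte equal to the code point
utf8_bytes = ch.encode("utf-8")      # two bytes that differ from the code point

print(hex(code_point), latin1_bytes.hex(), utf8_bytes.hex())
```

For characters in the ASCII range the UTF-8 bytes coincide with the ISO 8859-1 bytes; from 80₁₆ upward UTF-8 uses at least two bytes per character.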

Use

Alongside US-ASCII and UTF-8 (a Unicode encoding), ISO 8859-1 is probably the most widespread encoding for Latin scripts.

ISO 8859-1 is sufficient for at least the following languages:

  • Afrikaans (È/è, É/é, Ê/ê, Ë/ë, Î/î, Ï/ï, Ô/ô, Û/û),
  • Albanian (Ç/ç, Ë/ë),
  • Basque (Ñ/ñ),
  • Danish (Å/å, Æ/æ, Ø/ø),
  • German (Ä/ä, Ö/ö, Ü/ü, ß; in foreign words: É/é; not the euro sign and possibly the long s, ſ),
  • English (£, ¢; becoming obsolete: Æ/æ, ä, ë, ï, ö, ü; not Œ/œ),
  • Faroese (Á/á, Ð/ð, Í/í, Ó/ó, Ú/ú, Ý/ý, Æ/æ, Ø/ø),
  • Finnish (Ä/ä, Ö/ö; in foreign words: Å/å; not Š/š, Ž/ž),
  • French (Æ/æ, À/à, Â/â, È/è, É/é, Ê/ê, Ë/ë, Î/î, Ï/ï, Ô/ô, Ù/ù, Û/û, Ç/ç, Ü/ü, ÿ; not Œ/œ, Ÿ),
  • Irish Gaelic orthography (Á/á, É/é, Í/í, Ó/ó, Ú/ú),
  • Icelandic (Á/á, Ð/ð, É/é, Í/í, Ó/ó, Ú/ú, Ý/ý, Þ/þ, Æ/æ, Ö/ö),
  • Italian (À/à, È/è, É/é, Ò/ò, Ù/ù),
  • Catalan (À/à, Ç/ç, È/è, É/é, Í/í, Ï/ï, Ò/ò, Ó/ó, Ú/ú, Ü/ü; not, however, Ŀ/ŀ),
  • Dutch (not the ligature Ĳ/ĳ, but ÿ),
  • Norwegian, Bokmål and Nynorsk (Å/å, Æ/æ, Ø/ø, Ò/ò),
  • Portuguese, including Brazilian Portuguese (À/à, Á/á, Â/â, Ã/ã, Ç/ç, É/é, Ê/ê, Í/í, Ó/ó, Ô/ô, Õ/õ, Ú/ú, Ü/ü),
  • Romansh,
  • Scottish Gaelic (À/à, È/è, Ì/ì, Ò/ò, Ù/ù),
  • Swedish (Å/å, Ä/ä, Ö/ö),
  • Spanish (¡, ¿, ª, º, Á/á, É/é, Í/í, Ñ/ñ, Ó/ó, Ú/ú, Ü/ü; formerly Ç/ç),
  • Swahili and
  • Walloon (Â/â, Å/å, Ç/ç, È/è, É/é, Ê/ê, Î/î, Ô/ô, Û/û).
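Whether a given text fits into ISO 8859-1 can be tested by attempting to encode it. The helper `unencodable` below is a hypothetical function written for this illustration; it relies on Python's built-in `latin-1` and `iso8859-15` codecs, and the sample sentence is made up.

```python
def unencodable(text: str, encoding: str) -> list:
    """Return the distinct characters of text that encoding cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing

sample = "Œuvre à 5 €, señor, Straße"
missing_latin1 = unencodable(sample, "latin-1")
missing_latin9 = unencodable(sample, "iso8859-15")

print(missing_latin1)  # ['Œ', '€']
print(missing_latin9)  # []
```

The two characters reported for ISO 8859-1 are among those that ISO 8859-15 added.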

Since these are the most widely used written languages in Western Europe, the Americas, and Australia, it is the dominant character encoding throughout these regions. It is also widespread in those parts of Africa where the Arabic script is not used, although some special characters are often missing there; these are not available in any other 8-bit encoding either. See, for example, the Pan-Nigerian alphabet.
