JBIG2

JBIG2 is an image compression method for binary (bi-level) images that supports both lossless and lossy compression. It was developed by the Joint Bi-level Image Experts Group and published in 2000 as the international standard ITU-T T.88 and in 2001 as ISO/IEC 14492. It is a further development of JBIG.

Operation

Although the JBIG2 standard specifies only the decoding, the encoder is expected to divide the pages of the input document into three types of regions: text, raster graphics, and generic regions. The latter are objects that can be classified neither as text nor as image, for example lines or noise.

A text region consists of a number of symbols placed on a background. A symbol typically corresponds to a character (e.g. a letter) that occurs in the text. The symbols are stored in a symbol dictionary and can be reused by specifying their indices. A dictionary entry is stored either as an encoded bitmap or as a refinement of another dictionary entry, in which case only the difference from the original entry is stored. In lossy compression, slightly different symbols also refer to the same symbol dictionary entry.
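The dictionary idea can be illustrated with a minimal Python sketch. It is not the coding procedure of the standard: the names `encode_symbols` and `max_diff`, and the use of a simple pixel-difference count as the similarity test, are assumptions made for illustration only. With `max_diff=0` only identical symbols share an entry (lossless); with a larger value, slightly different symbols are mapped to the same entry (lossy).

```python
from typing import List, Tuple

# A symbol bitmap as rows of 0/1 pixels (assumed non-empty and rectangular).
Bitmap = Tuple[Tuple[int, ...], ...]


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode_symbols(symbols: List[Bitmap], max_diff: int = 0):
    """Return (dictionary, indices) for a sequence of symbol bitmaps.

    max_diff is a hypothetical tolerance: 0 gives lossless matching,
    larger values let similar symbols reuse an existing entry.
    """
    dictionary: List[Bitmap] = []
    indices: List[int] = []
    for sym in symbols:
        for i, entry in enumerate(dictionary):
            same_size = len(entry) == len(sym) and len(entry[0]) == len(sym[0])
            if same_size and hamming(entry, sym) <= max_diff:
                indices.append(i)  # reuse the existing dictionary entry
                break
        else:
            dictionary.append(sym)  # no match found: add a new entry
            indices.append(len(dictionary) - 1)
    return dictionary, indices


# Two 2x2 symbols that differ in a single pixel.
A: Bitmap = ((1, 1), (1, 0))
B: Bitmap = ((1, 1), (1, 1))

lossless = encode_symbols([A, B, A], max_diff=0)
lossy = encode_symbols([A, B, A], max_diff=1)
```

In the lossless call the dictionary holds both symbols and the index list is `[0, 1, 0]`; in the lossy call `B` is replaced by `A`, so the dictionary holds one entry and every index is 0.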

Raster graphics are compressed by reconstructing the grayscale image and storing frequently occurring patterns in a dictionary. Lossless and lossy coding are handled in the same way as for text regions.

The regions defined by the encoder need not be disjoint. Where regions overlap, the way they are combined is specified by a logical operator (OR, AND, XOR, XNOR, or REPLACE).
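The effect of the five operators on overlapping pixels can be shown with a small Python sketch. The function name `compose` and the list-of-lists page representation are assumptions for illustration; only the operator names come from the text above.

```python
from typing import Callable, Dict, List

# Each operator maps (existing page pixel, region pixel) to the new page pixel.
OPERATORS: Dict[str, Callable[[int, int], int]] = {
    "OR": lambda dst, src: dst | src,
    "AND": lambda dst, src: dst & src,
    "XOR": lambda dst, src: dst ^ src,
    "XNOR": lambda dst, src: 1 - (dst ^ src),
    "REPLACE": lambda dst, src: src,
}


def compose(page: List[List[int]], region: List[List[int]],
            x: int, y: int, op: str) -> List[List[int]]:
    """Draw region onto page in place, top-left corner at column x, row y."""
    combine = OPERATORS[op]
    for row_idx, row in enumerate(region):
        for col_idx, pixel in enumerate(row):
            py, px = y + row_idx, x + col_idx
            # Pixels that fall outside the page are ignored in this sketch.
            if 0 <= py < len(page) and 0 <= px < len(page[py]):
                page[py][px] = combine(page[py][px], pixel)
    return page


# A 2x2 page and a 1x2 region placed over its first row.
result_or = compose([[1, 0], [1, 0]], [[1, 1]], 0, 0, "OR")
result_xor = compose([[1, 0], [1, 0]], [[1, 1]], 0, 0, "XOR")
```

With OR the first row becomes `[1, 1]`, while XOR clears the pixel that was already set and yields `[0, 1]`; the second row is untouched in both cases.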

JBIG2 files are divided into segments. A page of a document consists, for example, of a page information segment, a symbol dictionary segment, a text region segment, a pattern dictionary segment, a halftone region segment, and an end-of-page segment. The dictionary segments contain raster graphics that are referenced by the region segments. Because symbols and patterns on different pages can refer to the same dictionary segment, compression works across pages. Segments are uniquely numbered and consist of a segment header, a data header, and data. The segment header contains the segment number, the segment numbers of any other segments referenced in the data part, and the number of the page on which the decoded image is to be placed; for global segments the page number is 0.
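The relationship between segments can be modelled with a short Python sketch. It is a simplified in-memory model, not the binary header layout of the standard; the class `Segment`, its field names, and the helper `dictionaries_for_page` are hypothetical. It shows how text regions on two pages can reference one global dictionary segment with page number 0.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Segment:
    number: int                     # unique segment number
    kind: str                       # e.g. "symbol dictionary", "text region"
    page: int                       # page the segment belongs to; 0 = global
    referred_to: List[int] = field(default_factory=list)  # referenced segments


def dictionaries_for_page(segments: List[Segment], page: int) -> Set[int]:
    """Return the numbers of dictionary segments referenced from a page."""
    by_number = {s.number: s for s in segments}
    used: Set[int] = set()
    for seg in segments:
        if seg.page != page:
            continue
        for ref in seg.referred_to:
            if by_number[ref].kind.endswith("dictionary"):
                used.add(ref)
    return used


# One global symbol dictionary shared by the text regions of two pages.
segments = [
    Segment(0, "symbol dictionary", page=0),
    Segment(1, "page information", page=1),
    Segment(2, "text region", page=1, referred_to=[0]),
    Segment(3, "end of page", page=1),
    Segment(4, "page information", page=2),
    Segment(5, "text region", page=2, referred_to=[0]),
    Segment(6, "end of page", page=2),
]

shared = dictionaries_for_page(segments, 1) & dictionaries_for_page(segments, 2)
```

Here `shared` contains segment 0: both pages draw their symbols from the same dictionary, so the symbol bitmaps are stored only once.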

Compression method

For compression, three different methods are used:

  • Arithmetic coding
  • MMR (Modified Modified READ), also known as Group 4 fax or two-dimensional coding
  • Huffman coding

Use

JBIG2 data can occur as standalone files or embedded in other file formats such as PDF (from version 1.4).

Open-source decoders for JBIG2 include jbig2dec (written in C) and JBIG2 ImageIO (written in Java).

Disadvantages

Because lossy compression substitutes non-identical symbols for one another, an unsuitable choice of compression parameters can corrupt document details (e.g. digits). Unlike the artifacts of other compression methods, such errors are not visually recognizable as errors; see the scan copier problem discovered in August 2013.
