The term digitization refers to the conversion of analog values into discrete (stepped) values so that they can be stored or processed electronically. The end product of digitization is sometimes called a digitized version (in German, Digitalisat).

In a more general sense, digitization can also mean the entire process, from acquisition through processing to storage, of transferring analog information to a digital storage medium (e.g. a CD).

It is estimated that by 2007, 94 % of the world's technological information capacity was already digital (compared with only 3 % in 1993). It is believed that in 2002 humanity was able for the first time to store more information in digital than in analog form (the start of the "digital age").


Digitization is generally understood as the preparation of information for processing or storage in a digital technical system. The information initially exists in some analog form and is then converted in several steps into a digital signal consisting only of discrete values.

The quantity to be digitized can be anything that can be measured by sensors. Typical examples include

  • Sound pressure, when recording sound with a microphone,
  • Brightness, when recording images and video with a CCD sensor,
  • Text, read out of a scanned document using special programs,
  • Temperature,
  • Magnetic fields,
  • Etc.

The sensor measures the physical quantity and outputs it as a (still analog) electrical voltage. This voltage is then converted by an analog-to-digital converter into a digital value, in the form of a (usually electrical) digital signal. From this point on, the quantity is digitized and can be processed further by a digital technical system (e.g. a home PC or a digital signal processor) or stored (e.g. on a CD or a USB stick).
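
The conversion step can be sketched in a few lines of Python. This is only an illustration, not a model of any particular converter: the function names, the 0 to 5 V input range and the 8-bit resolution are assumptions chosen for the example.

```python
def quantize_voltage(voltage, v_ref=5.0, bits=8):
    """Map an analog voltage in [0, v_ref] to an integer code.

    Hypothetical helper: v_ref and bits are illustrative assumptions.
    """
    levels = 2 ** bits                        # number of discrete steps
    clamped = min(max(voltage, 0.0), v_ref)   # keep the input inside the converter range
    code = int(clamped / v_ref * levels)      # scale the voltage to a step index
    return min(code, levels - 1)              # full-scale input maps to the highest code


def code_to_voltage(code, v_ref=5.0, bits=8):
    """Return the voltage at the lower edge of the step that a code stands for."""
    return code * v_ref / (2 ** bits)


if __name__ == "__main__":
    for v in (0.0, 1.234, 2.5, 5.0):
        code = quantize_voltage(v)
        print(f"{v:.3f} V -> code {code:3d} ({code:08b}) -> {code_to_voltage(code):.3f} V")
```

The voltage reconstructed from a code generally differs slightly from the measured voltage; this difference is the quantization error, and it shrinks as the number of bits grows.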

Today's digital technology generally processes binary signals exclusively. Since only two signal states (0 or 1, or low or high) need to be distinguished, the demands on the accuracy of the components are low, and so, as a result, are the production costs.

Representation of digital data

How the digitized values are then represented internally depends on the system.

In common usage, further processing by computer is meant. In this case there are many different ways to represent the digitized information. Usually the digitized version is stored in a file whose format depends on the type of information, the programs used, and the intended later use.

Three basic forms can be distinguished:

Universal code

Here a very limited set of characters is used (for example, ASCII), with which simply structured information (for example, sentences) can be transported. A typical application is the transmission of digital text. The appearance of a letter is not transmitted, only the letter itself. The appearance is produced only at the output device (monitor, printer, ...). Transferring a universal code is simple and economical.
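
As a small illustration of a universal code, the following Python sketch turns a word into ASCII code numbers and back. The helper names `to_ascii_codes` and `from_ascii_codes` are made up for this example; only Python's built-in `str.encode` and `bytes.decode` are used.

```python
def to_ascii_codes(text):
    """Return the ASCII code number of every character in the text."""
    return list(text.encode("ascii"))  # raises UnicodeEncodeError for non-ASCII input


def from_ascii_codes(codes):
    """Turn a sequence of ASCII code numbers back into text."""
    return bytes(codes).decode("ascii")


if __name__ == "__main__":
    codes = to_ascii_codes("Digital")
    print(codes)                                # the transmitted code numbers
    print(" ".join(f"{c:07b}" for c in codes))  # the same codes as 7-bit patterns
    print(from_ascii_codes(codes))              # the receiver restores the letters
```

Nothing about typeface or size appears in the codes, which is why the same text can look different on every output device.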

Matrix code

This is used to transport complex information (e.g. a photograph). Within a defined space (often a surface), a uniform grid (the matrix) is formed: the entire space is divided into equal elements (e.g. pixels). For each element, a fixed amount of storage is reserved (for example, a number of bits or bytes). This storage is reserved even where the space contains no information at that point (for example, where the image is empty).

Transferring a matrix code is technically demanding and memory-intensive.
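
The memory demand follows directly from the grid: every cell is reserved whether or not it carries information. A rough Python sketch, assuming an uncompressed, tightly packed raster (real file formats add headers and may pad rows; the function name is hypothetical):

```python
def raster_size_bytes(width, height, bits_per_pixel):
    """Storage for an uncompressed raster; empty areas cost as much as detailed ones."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # A 1000 x 1000 pixel image at 24 bits per pixel, blank or not
    print(raster_size_bytes(1000, 1000, 24), "bytes")
    # The same grid with one bit per pixel
    print(raster_size_bytes(1000, 1000, 1), "bytes")
```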

Vector code

A vector code uses coordinates to describe the course of a curve in a space (or on a surface). A typical use case is the appearance of letters: the vector code describes the curvature, length and thickness of the lines that determine the appearance of a character. Transmitting a vector code is economical but technically complex.

End product

The end product of digitization consists of one or more files, which can be called the digitized version (the German term Digitalisat is modeled on words such as Kondensat, "condensate", or Korrelat, "correlate").

  • The result is a file containing the desired image points.
  • A resulting PDF file consists of several individual elements: raster, vector and text data.
  • The PDF format combines the individual elements into a single file in a memory-saving way.
  • Each individual element is a complete digitized version of one part of the original. However, only combining the elements in the end product yields a usable file, because that file links the elements meaningfully in their original arrangement (layout).

Reasons for digitization

The presence of information and data in digital form has many advantages:

  • Digital data can be used, processed, distributed, indexed and reproduced in electronic data processing systems.
  • Digital data can be processed by machine and therefore more quickly.
  • They can be searched (including for individual words).
  • Today they require significantly less storage space.
  • Even after long transmission paths and repeated processing, errors and distortions (e.g. superimposed noise) are small compared with analog processing, or can be ruled out entirely.

Another reason for digitizing analog content is long-term archiving. Assuming that no storage medium lasts forever, continual migration is unavoidable. It is also a fact that analog content loses quality with each copy. Digital content, by contrast, consists of discrete values that are either readable, and therefore equivalent to the digital original, or no longer readable; the latter can be prevented by redundant storage of the content or by error-correction algorithms.
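
The role of redundant storage can be illustrated with a small Python sketch. It assumes the simplest possible scheme, three full copies combined by a per-byte majority vote; the function name `recover` is hypothetical, and real storage media generally rely on more efficient error-correcting codes.

```python
from collections import Counter


def recover(copies):
    """Rebuild a byte string from several stored copies by per-position majority vote."""
    if len({len(c) for c in copies}) != 1:
        raise ValueError("all copies must have the same length")
    recovered = bytearray()
    for column in zip(*copies):                       # the same byte position in every copy
        value, _count = Counter(column).most_common(1)[0]
        recovered.append(value)                       # keep the most frequent value
    return bytes(recovered)


if __name__ == "__main__":
    original = b"archive"
    damaged = b"arcXive"                              # one copy with a corrupted byte
    print(recover([original, damaged, original]) == original)
```

As long as a majority of the copies agree at every position, the recovered data is bit-for-bit identical to the original, something an analog copy chain cannot offer.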

Finally, analog originals can be digitized to create working copies, which spares the original. Many media, including records, analog feature films and color slides, lose quality merely by being played back.

It should be noted that digitization always involves some loss of quality, because the resolution remains finite. In many cases, however, a digitized version can be so accurate that it suffices for a large share of possible (including future) applications. If the digitized version achieves this quality, one speaks of preservation digitization, i.e. digitization for preservation (a replacement copy). The term overlooks, however, that not all future applications can be known. For example, a high-resolution photograph allows the text of a parchment manuscript to be read, but it cannot be used for physical or chemical methods of determining the manuscript's age.

Historical development

Digitization has a long history. Universal codes were in use long ago; historically early examples are Braille (1829) and Morse code (1837). The basic principle of using fixed codes to transmit information also worked under technically unfavorable conditions, by means of light and sound signals (radio, telephone, telegraphy). Later came the teleprinter (using, among others, the Baudot code), fax and e-mail. Today's computers process information exclusively in digital form.

Areas of digitizing

Generally, digitization is performed by an analog-to-digital converter, which measures the analog input signal at fixed intervals, whether these are time intervals for linear recordings or the spacing of the photocells in a scanner (see sampling rate), and encodes the values digitally (see codec) with a certain accuracy (see quantization).

Digitization is penetrating further and further into the classic areas of communication. The Internet, mobile telephony and digital television are increasingly being combined.

Depending on the nature of the analog source material and the purpose of digitization, different methods are used.

Digitization of text

When text is digitized, the document is first digitized like an image, that is, scanned. If the digitized version is to reproduce the original appearance of the document as accurately as possible, no further processing takes place and only the image of the text is saved.

If only the linguistic content of the document is of interest, a text recognition program translates the digitized text image back into a character set (e.g. ASCII, or Unicode for non-Latin characters), and only the recognized text is stored. The memory requirement is significantly lower than for the image, but information that cannot be represented in plain text (e.g. formatting) may be lost.

A further possibility combines the two: in addition to the digitized image of the text, the content is recognized and stored as metadata. Users can then search for terms in the text while still being shown the (digitized) original document (e.g. Google Books).

Digitization of images

To digitize an image, it is scanned: the image is divided into rows and columns (a matrix), and for each resulting picture element the color value is read out and stored with a given quantization. This can be done by scanners, by digital photography, or by satellite-based or medical sensors. For final storage of the digitized version, image compression methods can be used if necessary.

In a black-and-white raster image without gray tones, each pixel takes the value 0 for black or 1 for white. The matrix is read row by row, yielding a sequence of the digits 0 and 1 that represents the image. In this case a quantization of one bit is used.
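
The row-by-row readout can be sketched in Python as follows. The 3 x 3 example image and the helper names are made up for illustration.

```python
def flatten_bitmap(rows):
    """Read a 1-bit image row by row into one sequence of 0s and 1s."""
    return [pixel for row in rows for pixel in row]


def rebuild_bitmap(sequence, width):
    """Cut the sequence back into rows of the given width."""
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


# 0 = black, 1 = white, as in the text: a black plus sign on a white background
image = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 0, 1],
]

if __name__ == "__main__":
    bits = flatten_bitmap(image)
    print("".join(str(b) for b in bits))
    print(rebuild_bitmap(bits, width=3) == image)
```

The bit sequence alone is not enough to restore the picture: the receiver also has to know the row width, which is why image formats store the dimensions alongside the pixel data.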

To represent a color or grayscale image digitally, a higher quantization is needed. When digitizing in the RGB color space, the color value of each pixel is decomposed into its red, green and blue components, and these are stored individually with the same quantization (typically one byte per color component, i.e. 24 bits per pixel). For example, a pure red pixel corresponds to R = 255, G = 0, B = 0.
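
A minimal Python sketch of the 24-bit layout follows. Placing red in the highest byte is an assumption made for the example; actual file formats and graphics interfaces differ in their byte order.

```python
def pack_rgb(r, g, b):
    """Pack three 8-bit color components into one 24-bit integer (assumed order: R, G, B)."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each component must fit into one byte (0..255)")
    return (r << 16) | (g << 8) | b


def unpack_rgb(value):
    """Split a 24-bit integer back into its (R, G, B) components."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


if __name__ == "__main__":
    pure_red = pack_rgb(255, 0, 0)
    print(f"{pure_red:024b}")    # eight 1 bits for red, then sixteen 0 bits
    print(unpack_rgb(pure_red))
```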

In the YUV color model, the color values of a pixel can be stored with different quantization, because here the luminance (brightness), which the human eye registers more precisely, is separated from the chrominance (color), which the human eye registers less precisely. This allows a smaller storage volume at approximately the same quality for the human viewer.
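
The saving can be estimated with a short Python sketch. It assumes one common way of storing chrominance less precisely, namely keeping the two color channels at reduced resolution (here halved in each direction) while luminance stays at full resolution; the function name and default values are illustrative assumptions, and reducing the number of bits per chroma value would be another option.

```python
def yuv_frame_bytes(width, height, y_bits=8, chroma_bits=8, sub_x=2, sub_y=2):
    """Uncompressed size of one frame with full-resolution Y and subsampled U and V."""
    luma_bits = width * height * y_bits
    chroma_w = -(-width // sub_x)    # ceiling division: chroma samples per row
    chroma_h = -(-height // sub_y)   # ceiling division: chroma rows
    chroma_total_bits = 2 * chroma_w * chroma_h * chroma_bits  # U and V planes
    return (luma_bits + chroma_total_bits + 7) // 8


if __name__ == "__main__":
    full = yuv_frame_bytes(640, 480, sub_x=1, sub_y=1)  # no subsampling, like 24-bit RGB
    reduced = yuv_frame_bytes(640, 480)                 # chroma halved in both directions
    print(full, reduced, reduced / full)
```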

Digitization of print films

In a large-format scanner, the individual color separations of the print films are scanned, assembled and "descreened", so that the data is again available in digital form for CtP (computer-to-plate) exposure.

Digitizing audio

Digitizing audio is often referred to as "sampling". Sound waves that have previously been converted into analog electrical oscillations (e.g. by a microphone) are measured in rapid succession at sample points and stored as digital values. Conversely, these values can be played back in rapid succession and "assembled" into an analog sound wave, which can then be made audible again. On reconversion, the measured values would actually yield a stepped waveform, the more angular the lower the sampling frequency; this can be compensated for by mathematical methods (interpolation). Bit depth denotes the "space" in bits available for each sample value, which determines, among other things, the resolution of the dynamic range. A sampling frequency of 44.1 kHz with a resolution of 16 bits is referred to as CD quality.

Because of the large amount of data, lossless and lossy compression techniques are used. These make it possible to store audio data on data carriers in a space-saving way (see FLAC, MP3).

Common audio file formats include WAV, AIFF, FLAC, MP3, AAC, Ogg Vorbis and SND.

For common conversion methods, see analog-to-digital converter.
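
Sampling and bit depth can be illustrated with the following Python sketch, which samples a pure sine tone and quantizes each sample to a signed integer. The function names are hypothetical, the test tone is arbitrary, and the two channels used for the data rate are an assumption (stereo) that is not stated in the text.

```python
import math


def sample_sine(freq_hz, duration_s, sample_rate=44_100, bits=16):
    """Sample a sine tone at fixed intervals and quantize each sample to a signed integer."""
    max_amplitude = 2 ** (bits - 1) - 1           # 32767 for 16 bits
    n_samples = round(duration_s * sample_rate)   # number of measuring points
    return [
        round(max_amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(n_samples)
    ]


def pcm_bit_rate(sample_rate=44_100, bits=16, channels=2):
    """Uncompressed data rate in bits per second (channels=2 assumes stereo)."""
    return sample_rate * bits * channels


if __name__ == "__main__":
    samples = sample_sine(441, 1.0)
    print(len(samples), "samples, first five:", samples[:5])
    print(pcm_bit_rate(), "bits per second uncompressed")
```

The resulting data rate of well over a million bits per second shows why compression is attractive for audio.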

Optical scanning of records

Sound carriers such as records can be read and digitized without contact, with software support, by having a program "scan" a high-resolution optical digital image of the sound carrier. This method is used in the reconstruction of historical recordings.

Digitization in metrology

Digitization of archaeological objects

This mostly concerns the digital recording of archaeological objects in text and images. All available information about an archaeological object (such as a vessel, stone tool or sword), including classification, dating, dimensions and properties, is recorded digitally, supplemented by electronic images and drawings, and stored in a database. The objects can then be integrated, by way of a data import, into an object portal such as museum-digital, where anyone can research them freely. The reason for digitizing archaeological objects is usually to record larger holdings, such as archaeological collections in museums or in the offices responsible for heritage management, in order to present them to the public. Since day-to-day museum work never allows all objects in a collection to be shown in exhibitions and publications, digitization is a way to present the objects to the general public and the scientific world nonetheless. In addition, it provides an electronic backup of the holdings, an aspect that is not insignificant in view of the collapse of the Historical Archive of the City of Cologne. In special cases, non-destructive digital imaging methods are used to document the find situation of an object and to provide a basis for deciding how to proceed with securing and restoring it, for example in the case of the gold hoard of Gessel.

Social and economic consequences of digitization

The basic benefits of digitization are the speed and universality of information dissemination. Inexpensive hardware and software for digitization, together with increasing networking via the Internet, are rapidly creating new opportunities, but also dangers. Some examples follow.

Copy protection

The possibility of simplified and lossless reproduction has led to various conflicts between creators and users of digital content. Industry and collecting societies respond to the changed conditions with strategies of artificial scarcity, in particular with copyright protection of intellectual property and the technological implementation of copy protection.

Changes in costs

A key feature of digital content is a change in the cost structure. Costs for copying and transport (e.g. via the Internet) are often reduced. Thus, once the original content has been created, the cost of each additional digital copy (the marginal cost of production) is often regarded as low.

Once digital content is established in large companies, however, costs currently tend to rise again through increased spending on copyright protection of intellectual property and on the technological implementation of copy protection. The high security expected of data transmission and the high reliability expected of computer systems also drive up costs. In addition, large investments are often made in future technologies that frequently prove unprofitable.

Impact on the legal system

Digitization also changes the legal system. Legal scholarship has only just begun to address this problem. The "theory of fuzzy law" assumes that law as a whole changes fundamentally in a digitized environment. According to this theory, the importance of law as a means of governing society is considerably diminished.

Impact on business processes in companies

In a company's operational processes, digitization enables an increase in efficiency and thus an improvement in profitability. The reason is that operations can be carried out more quickly and cost-effectively using information and communication technology than would be possible without digitization. This is achieved, for example, by converting physical documents and analog information into digital form. Many companies, for example, scan the letters they receive in physical form and distribute them by e-mail.

However, in addition to the advantages mentioned, increasing digitization in business also entails several dangers. There is a risk of dependence on particular providers and hence on proprietary standards for storage media and formats. For example, while documents archived on microfilm can largely be read with any reader, certain digital media and file formats require special reading devices and software, which sometimes have short life cycles and/or are incompatible with one another. A typical example of this dependence is the use of Microsoft data formats in the day-to-day work of most German authorities.