Discrete cosine transform

The discrete cosine transform (English discrete cosine transformation DCT) is a transformation of the numerical analysis. It is used for example in the lossy compression of the audio and image data. For image data, it is for example used in the JPEG file format, in the field of audio data compression is a modified discrete cosine transform ( MDCT) application, for example as part of the MP3 format. The Discrete Cosine Transform in 1974 by N. Ahmed et al. first described.

  • 5.1 Application of 2D - DCT
  • 5.2 Application of 3D - DCT


The discrete cosine transform is one of the real-valued, discrete, linear, orthogonal transforms, the transform to the frequency domain similar to the discrete Fourier transform ( DFT), a discrete-time signal from the time domain (for time signals ), or the local site (for spatial signals).

Audio and video signals are typically in the lower spectral range, high energy signal to the de-correlation, the DCT is particularly suitable, and explains the widespread use of this transformation. A subset of these applications are in the solution of partial differential equations using spectral solution methods. The DCT, within the framework of the Chebyshev approximation is mathematically related to the Chebyshev polynomials. She is also closely related to the discrete sine transform (DST).


Other transformations, such as discrete frequency suppresses the discrete cosine transform of a finite input signal sequence as a finite sum of weighted trigonometric functions with different frequencies. The cosine cosine functions only apply.

The discrete Fourier transform, which is defined over a finite input sequence has implicitly by the type of transformation and its border conditions, a determination of how the input data sequence is continued outside this finite sequence. In the case of the discrete Fourier transformation this is a periodic continuation in the case of the discrete cosine transform, this is a straight continuation of the generating signal sequence.

The type of continuation of the input data sequence and their differentiation into even and odd continuation Different combinations result. There are two edge portions of the input sequence ( the beginning and end of the sequence ), which can be continued in each even or odd. This results in four different ways. Next is to be determined, from which position has to be made in consequence, the continuation. The continuation can be done exactly the value or between two values. If continuation is between two values ​​, the first and last value of the sequence is duplicated. This determination is done separately for the beginning and end of the episode, which again results in four different combinations. So there are various forms of continuation or possible forms of the boundary values ​​.

The eight boundary conditions in the sequel, which have a straight continuation of the start of a sequence, are included among the discrete cosine transform, the remaining eight types of an odd continued at the beginning of the sequence give the Discrete Sine Transform (DST). The different forms are referred to in the literature as DCT I- DCT and DST VIII - I to VIII - DST. The four essential in the field of data compression variants DCT - I to DCT -IV and their sequels are shown in the adjacent figure. The other four variants of the DCT and the DST all 8 variants have no meaning in the field of data compression.

These different consequences continued essentially determine the property of the individual transformations and their practical significance. In the solution of partial differential equations using spectral doing all variants of the DCT and DST are used depending on the problem. In the lossy data compression of image signals as in the JPEG DCT - II found in a two-dimensional transform application, which is why the commonly understood " DCT " or the FDCT for forward discrete cosine transform of the DCT type II. The inverse of DCT - II is designated for reasons of symmetry and up to a constant factor DCT -III, as IDCT inverse discrete cosine transform ( dt inverse discrete cosine transform ). In the field of lossy audio compression, such as the MP3 file format, a continuous discrete audio data stream has to be transformed, whereby in order to avoid aliasing in the time domain, the MDCT based on a one-dimensional DCT-IV, are used.

In the field of image and audio compression determines the type of continuation and thus the boundary values ​​, how well the transformation for data compression is. The reason for this is that jumps in the signal sequence lead to high coefficient values ​​in all frequency bands and thus in particular to high-frequency spectral components. This also applies if these jumps occur at the edges of signal sequence due to an unfavorable sequel.

The discrete Fourier transform is in general a complex-valued transformation and by the periodic continuation jumps can occur at the edge points in the waveform. This also applies to the discrete sine transform, which has an odd continued at the beginning of the sequence.

In contrast to the discrete Fourier transformation, all of the DCT transformation are real and provide real coefficients.

  • The DCT transforms, due to the choice of the type of continuing around the edges, signals (eg, visual or audio signals ) into its spectral components.
  • The DCT can be efficiently implemented both in software and in hardware. There are similar implementations, such as in the fast Fourier transform ( FFT). Using signal processors and their multiply-accumulate instructions ( MAC) allows the DCT calculation implemented efficiently.


The different types of DCT respectively form the real-valued input sequence ( spatial or time domain ) with N elements x [ n] to a real-valued output sequence ( spectral ) X [ n] from:


The DCT -I is in terms of their boundary values ​​just starting to x0 and just at the end to xN -1.

DCT -I corresponds to a factor of 2, the DFT of a real sequence of length 2 N 2 with a straight symmetry. For example, the DCT -I of N = 5 numbers long sequence abcde is up to a factor of 2 is identical to the DFT of the sequence abcdedcb. The DCT -I is only defined for sequences with a minimum length of 2, for all other DCT variants N must be positive integer.


The DCT - II is the usual DCT. It is in terms of their boundary values ​​just beginning to x -1/ 2 and just at the end to xN -1/ 2

The DCT - II corresponds to the constant factor 2 of the DFT of a real sequence of 4N elements with even symmetry, where all elements with even index have the value 0.


DCT -III is the inverse of the DCT -II. It is in terms of their boundary values ​​just starting to x0 and odd at the end to xN. The output sequence is just starting to X-1 /2 and just at the end to XN -1/ 2


DCT -IV with respect to their boundary values ​​just beginning to x -1/2 and at the end of odd xN -1/ 2

A variant of the DCT-IV is the modified discrete cosine transform ( MDCT), in which case the data sequence of the individual data blocks are overlapped, similar to the overlap-add method to obtain a non-periodic profile.


As each transformation, the DCT are reversed. The inverse of the DCT, the DCT is I -I by a factor of 2 / (N -1). The inverse of the DCT-IV is the DCT-IV with a factor of 2 / N. The inverse of the DCT -II, DCT - III with a factor of 2 / N or vice versa.

The pre-factors of the DCT are not uniformly established in the literature. For example, some authors introduced in the DCT, an additional factor to avoid the added factor in the inverse operation. By suitable choice of the constant factor, the transformation matrix can be an orthogonal matrix.

Multidimensional DCT

In particular, in the digital image processing plays a two-dimensional DCT, based on the DCT -II, a significant role. The extension to multiple dimensions is done in the simplest case by a column - or row-wise application of the transformation. For practical implementations exist for the calculation of higher-dimensional transformations more efficient algorithms.

The right figure shows a simple example with all the spectral components of a two-dimensional DCT - II in each dimension eight coefficients. The field at the top left (index 0.0 ) represents the DC component of the signal in the horizontal direction, the horizontal frequency components are in ascending order. About two indices across the frequency is doubled. The same applies to the vertical direction and for the linear combination of the two directions.

Applying 2D DCT

JPEG in each color component (Y, Cb, and Cr ) of the image into 8 x 8 blocks is divided, are subjected to a 2-D DCT.

Application of 3D - DCT

In film formats and 3D - DCT is applied sometimes. Here, a group of pictures ( Group of Pictures, GoP ) is viewed by several images also in terms of temporal change. This method is used for example in Soft Cast application.