Image registration

Image registration is an important process in digital image processing that is used to bring two or more images of the same scene, or of scenes as similar as possible, into alignment with one another. One of the images is designated the reference image; the others are called object images. To fit these optimally to the reference image, a compensating transformation is calculated. The images to be registered differ from one another because they were taken from different positions, at different times, or with different sensors.

Image registration methods are especially common in medical image processing. Images captured with different imaging techniques (modalities) are aligned with one another in order to gain additional insight from their combination. If, for example, MRI images, which depict soft tissue and brain structures well, are superimposed with PET images, which make certain metabolic processes visible, one can determine in which areas of the brain those metabolic processes take place. The superposition is referred to as image fusion.

Another example is the combination of several satellite images into a large map. Since the earth's surface is curved and the position of the satellite changes from image to image, small distortions arise within the images, which can be aligned with one another by a registration procedure; see image correlation.

The goal of image registration is to find the transformation T that brings a given source image (object image) F into the best possible agreement with a target image (reference image) G. For this purpose, a measure D of the similarity or dissimilarity of the images is defined. Image registration is thus an optimization problem in which D(T(F), G) is to be minimized (if D measures dissimilarity) or maximized (if D measures similarity).
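
For illustration, a minimal sketch of this optimization in Python, assuming D is the sum of squared differences (a dissimilarity measure, hence minimized) and T a pure translation; the measure, the optimizer and the synthetic images are illustrative choices, not a prescribed method:

    import numpy as np
    from scipy import ndimage, optimize

    # Synthetic example: the object image is the reference image shifted by (3, -2)
    ref_img = np.random.rand(64, 64)
    obj_img = ndimage.shift(ref_img, (3, -2), order=1)

    def dissimilarity(params, f, g):
        # D(T(F), G): sum of squared differences after translating F
        ty, tx = params
        transformed = ndimage.shift(f, (ty, tx), order=1)
        return np.sum((transformed - g) ** 2)

    # Minimize D over the translation parameters of T
    result = optimize.minimize(dissimilarity, x0=[0.0, 0.0],
                               args=(obj_img, ref_img), method='Powell')
    print(result.x)  # approximately (-3, 2), the shift that undoes the offset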


Overview

As indicated above, the applications of image registration can be divided into the following areas:

  • Different camera positions: the images to be registered show the same object or the same scene but were taken from different camera positions (see parallax). Registration can then be used to obtain a larger two-dimensional field of view or even for 3D reconstruction.
  • Different points in time: the images to be registered show the same object or the same scene but at different times. By means of registration, changes that have developed over time can be detected (time series analysis).
  • Different sensors: the images to be registered contain the same object or the same scene but were recorded with different sensors, i.e. with different cameras or different imaging modalities. The goal of image registration here is to obtain richer and more detailed information from the images.
  • Scene-to-model registration: one or more images of an object or a scene are registered with a model of the object or the scene. The registered images can then be compared with the given model.

Because of the wide range of applications and the different types of images, there is no registration procedure that is universally applicable. Rather, registration procedures are designed specifically for particular applications, for which they then perform optimally. Most registration procedures, however, can be divided into the following four main steps, illustrated by the sketch after the list:

  • Feature extraction: features such as corners, edges, contours or similar are detected in the images to be registered, either manually or automatically.
  • Feature matching: the correspondence between the extracted feature points is established.
  • Transformation calculation: a suitable type of transformation, e.g. affine or projective, is chosen and its parameters are calculated.
  • Transformation: the object image is transformed with the transformation calculated in the previous step. Interpolation techniques are also used here.
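
As an illustration, a sketch of these four steps in Python with OpenCV (ORB features, brute-force matching, a RANSAC-fitted homography); the file names are placeholders, and the concrete choices of detector and transformation model are assumptions for the example:

    import cv2
    import numpy as np

    # Load the reference and object images as grayscale
    ref = cv2.imread('reference.png', cv2.IMREAD_GRAYSCALE)
    obj = cv2.imread('object.png', cv2.IMREAD_GRAYSCALE)

    # 1. Feature extraction: detect keypoints and compute descriptors
    orb = cv2.ORB_create()
    kp_obj, des_obj = orb.detectAndCompute(obj, None)
    kp_ref, des_ref = orb.detectAndCompute(ref, None)

    # 2. Feature matching: establish correspondences between descriptors
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_obj, des_ref), key=lambda m: m.distance)

    # 3. Transformation calculation: fit a projective transform with RANSAC
    src = np.float32([kp_obj[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_ref[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # 4. Transformation: warp the object image into the reference frame
    registered = cv2.warpPerspective(obj, H, (ref.shape[1], ref.shape[0]))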

Feature extraction

Registration procedures can be classified into two categories, feature-based and area-based methods. In area-based methods, the registration is performed directly with the intensity values; no features need to be extracted. The feature extraction step is therefore omitted in these methods.

The second category comprises the feature-based methods, in which a certain, usually relatively small, number of features is extracted from the images. This is done either manually or automatically. In manual feature extraction, significant points are marked in the images by a person. In automatic methods, the images are searched for salient features that can be found again in all images. The selected features should be spread as evenly as possible over the entire image and not be concentrated in specific regions. Registration is then performed by matching the selected features. The individual groups of features are explained in more detail below:

  • Regions: region features are areas in the image that stand out clearly from the areas around them, for example lakes in satellite images. Regions are usually represented by their centroid and can be detected by segmentation methods.
  • Lines: lines or edges can exist in the image as contours of regions or simply as lines themselves. They can be represented by the pair of their endpoints or by their midpoint and are extracted by means of edge detection.
  • Points: points can occur in the image as intersections of lines or as corners of contours. They can be extracted by corner detectors.

The advantages of feature-based methods compared to area-based ones are, on the one hand, the generally lower computational effort for the feature matching, since the number of features is kept fairly small, and on the other hand the lower susceptibility to noise, since the registration is not performed directly on the intensity values. The disadvantage, however, is precisely the feature extraction itself, which is an additional processing step. With automatic feature extraction it is often not easy to select features that can be found again reliably in all images, or to keep the number of features small. Feature-based methods should therefore be chosen only if a small number of well-extractable features, distributed uniformly over the image, can be expected to be present in all images.

Feature matching

Area-based methods

In area-based methods, the feature extraction step merges with the feature matching, since in a sense every pixel is a feature point. The correspondence between the object image and the reference image can be established over a window of a certain size, so that correspondence is made window to window; the entire image can also be used. For the registration one then uses either a function that measures the difference between the object image and the reference image, or a function that indicates their similarity. This function has to be minimized or maximized accordingly.

Correlation methods

A widely used approach among the area-based methods is the cross-correlation function, which is commonly used in template matching and pattern recognition. Let f and g be two image sections of the same dimensions from the object image and the reference image. Using the normalized cross-correlation function

$C(f, g) = \frac{\sum_{x,y} f(x,y)\, g(x,y)}{\sqrt{\sum_{x,y} f(x,y)^2 \; \sum_{x,y} g(x,y)^2}}$

a value is calculated that lies in the range [0, 1] (for non-negative intensity values) and represents the similarity between f and g. The higher the value, the more similar the image sections. This similarity measure is calculated for pairs of image sections from the object image and the reference image. The pair of image sections with the highest value is then set as corresponding. This method works only if the difference between the object image and the reference image consists of translations; under rotation, scaling or other deformations the method fails in this form.
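
A minimal NumPy sketch of this similarity measure and a naive template search over the reference image; the exhaustive double loop is for clarity only:

    import numpy as np

    def ncc(f, g):
        # Normalized cross-correlation of two equally sized image sections;
        # for non-negative intensities the value lies in [0, 1]
        den = np.sqrt(np.sum(f ** 2) * np.sum(g ** 2))
        return np.sum(f * g) / den

    def match_template(ref, template):
        # Slide the template over the reference image and keep the
        # position with the highest similarity value
        h, w = template.shape
        best, best_pos = -1.0, (0, 0)
        for y in range(ref.shape[0] - h + 1):
            for x in range(ref.shape[1] - w + 1):
                c = ncc(ref[y:y + h, x:x + w], template)
                if c > best:
                    best, best_pos = c, (y, x)
        return best_pos, best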

Fourier methods

If the images are subject to frequency-dependent noise, Fourier methods offer a better solution than the correlation methods. In addition, the computation time can be reduced compared to the correlation methods. Translation, rotation and scaling have corresponding counterparts in the frequency domain and can thus be handled by these methods. The Fourier coefficients of an image can be computed efficiently, either through a hardware implementation or by using the fast Fourier transform (FFT).

One way to register two images that differ from each other only by a translation is phase correlation. Phase correlation is based on the shift theorem.

Let f and g be two images that differ by a translation (u, v), that is, f(x, y) = g(x − u, y − v). Then their Fourier transforms are related as follows:

$F(\xi, \eta) = e^{-i 2\pi (\xi u + \eta v)}\, G(\xi, \eta)$

The Fourier transforms F and G of the images thus differ only by a phase shift that is directly related to the translation. By means of the cross power spectrum of the images f and g,

$\frac{F(\xi, \eta)\, G^{*}(\xi, \eta)}{\lvert F(\xi, \eta)\, G^{*}(\xi, \eta) \rvert} = e^{-i 2\pi (\xi u + \eta v)},$

the phase shift can be calculated. For this purpose, one searches for the maximum in the inverse Fourier transform of the cross power spectrum. The position of the maximum then yields the translation parameters.
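
A minimal NumPy sketch of phase correlation for two equally sized grayscale images f and g:

    import numpy as np

    def phase_correlation(f, g):
        # Cross power spectrum of the two images
        F = np.fft.fft2(f)
        G = np.fft.fft2(g)
        R = F * np.conj(G)
        R /= np.abs(R) + 1e-12          # normalize to pure phase
        r = np.fft.ifft2(R)
        # The peak of the inverse transform marks the translation
        dy, dx = np.unravel_index(np.argmax(np.abs(r)), r.shape)
        # Shifts larger than half the image size wrap around
        if dy > f.shape[0] // 2:
            dy -= f.shape[0]
        if dx > f.shape[1] // 2:
            dx -= f.shape[1]
        return dy, dx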

Methods based on mutual information

Methods that use the mutual information of the images produce good results primarily with images in which the intensity values vary greatly. This variation occurs, for example, when the images were recorded with different sensors, such as MRI and CT.

Important for the explanation of mutual information is the entropy

$H(X) = -\sum_{x} p(x) \log p(x)$

where X is a random variable, x a discrete value of the random variable X, and p a probability density. In the case of image registration, the random variable X is naturally taken to be the intensity values of an image. The entropy is then a measure of the disorder of an image: if all intensity values of an image are equally likely, the entropy is greatest; if the image has only a single intensity value, the entropy is zero. In image registration, however, one needs the joint entropy of two random variables X and Y,

$H(X, Y) = -\sum_{x} \sum_{y} p(x, y) \log p(x, y),$

since at least two images are compared. If this measure is minimal, the images are in the best possible agreement. However, the joint entropy decreases not only when the images are better adapted to each other, but also when the entropy of one of the images decreases. A measure of correspondence should therefore also take the entropies of the individual images into account. The mutual information

$MI(X, Y) = H(X) + H(Y) - H(X, Y)$

is such a measure. The mutual information is maximal when the joint entropy is minimal; Venn diagrams of the individual and joint entropies make this relationship easy to visualize. The goal of registration with mutual information is thus to maximize it: the images are in the best possible agreement when the mutual information is maximal. To obtain good estimates of the mutual information, a good estimate of the probability density p is required. Many pixels must therefore be involved, which is also a disadvantage, since the registration becomes very expensive.
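
A sketch estimating the mutual information from a joint histogram of two equally sized grayscale images; the number of bins is an illustrative choice:

    import numpy as np

    def mutual_information(img1, img2, bins=32):
        # Joint histogram as an estimate of the joint density p(x, y)
        joint, _, _ = np.histogram2d(img1.ravel(), img2.ravel(), bins=bins)
        pxy = joint / joint.sum()
        px = pxy.sum(axis=1)            # marginal density of img1
        py = pxy.sum(axis=0)            # marginal density of img2
        nz = pxy > 0                    # avoid log(0)
        h_xy = -np.sum(pxy[nz] * np.log(pxy[nz]))
        h_x = -np.sum(px[px > 0] * np.log(px[px > 0]))
        h_y = -np.sum(py[py > 0] * np.log(py[py > 0]))
        return h_x + h_y - h_xy         # MI(X, Y) = H(X) + H(Y) - H(X, Y)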

Feature-based methods

Given are two sets of features: one contains the features of the object image, the other those of the reference image. The features are represented by so-called control points. These can be the features themselves, if they are points, or endpoints of lines, centroids of regions or the like. The aim of feature matching is to establish the correspondence between the features of the object image and those of the reference image.

Methods that use spatial relations

In these methods, information about the distances of the control points from one another and about their spatial distribution is used to establish the correspondence between the control points of the object image and those of the reference image.

One possible way to perform the matching is the following. First, n control points are selected in the object image. Then n control points in the reference image are set as corresponding to the selected control points in the object image. On the basis of this correspondence, the transformation is calculated and applied. It is then checked how many of the remaining control points lie sufficiently close on top of one another, or at least close together. If the percentage of superimposed remaining control points is below a certain threshold, new control points have to be determined in the reference image and the process is repeated.
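
A minimal NumPy sketch of this hypothesize-and-verify scheme (closely related to RANSAC); the affine model and the values of n, tol and thresh are illustrative assumptions:

    import numpy as np

    def fit_affine(src, dst):
        # Least-squares affine transform mapping the points src onto dst
        A = np.hstack([src, np.ones((len(src), 1))])
        T, *_ = np.linalg.lstsq(A, dst, rcond=None)
        return T                        # 3x2 matrix, applied as [x y 1] @ T

    def match_by_spatial_relations(obj_pts, ref_pts, n=3, trials=1000,
                                   tol=3.0, thresh=0.8):
        rng = np.random.default_rng(0)
        ones = np.ones((len(obj_pts), 1))
        for _ in range(trials):
            # Hypothesis: n control points in each image correspond
            io = rng.choice(len(obj_pts), n, replace=False)
            ir = rng.choice(len(ref_pts), n, replace=False)
            T = fit_affine(obj_pts[io], ref_pts[ir])
            # Verification: transform all object control points and check
            # how many land sufficiently close to a reference control point
            mapped = np.hstack([obj_pts, ones]) @ T
            d = np.linalg.norm(mapped[:, None] - ref_pts[None, :], axis=2)
            if np.mean(d.min(axis=1) < tol) >= thresh:
                return T
        return None                     # no acceptable correspondence found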

Methods that use the invariant descriptors

Another method of establishing the correspondence between the features is to exploit certain properties that characterize the features. These properties are called descriptors and should be as invariant as possible with respect to the expected distortions. The descriptors should meet the following conditions:

  • Invariance: the descriptors of corresponding features of the object image and the reference image should be identical.
  • Uniqueness: two different features should have different descriptors.
  • Stability: the descriptors of a deformed feature should be similar to those of the undeformed feature.
  • Independence: if the descriptor is a vector, its elements should be functionally independent of one another.

However, all of these conditions cannot always be satisfied simultaneously, so a suitable compromise has to be found in the choice of descriptors. The selection of the descriptor depends on the characteristics of the features and on the expected deformation between the object image and the reference image. If the features are, for example, regions and the deformation consists only of translation and rotation, the area of the region can be chosen as descriptor, since it is invariant under rotation and translation. If scaling is added, however, this property is no longer invariant under the transformation. In the feature matching, those features of the object image and the reference image whose descriptors are most similar are then determined as corresponding.
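
As an illustrative sketch, regions can be described by their Hu moments, which (unlike the plain area) are invariant under translation, rotation and scaling; binary_img is assumed to be a uint8 segmentation mask, and the nearest-neighbor matching is a deliberately simple choice:

    import cv2
    import numpy as np

    def region_descriptors(binary_img):
        # Extract region contours and describe each region by its Hu moments
        contours, _ = cv2.findContours(binary_img, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        return [cv2.HuMoments(cv2.moments(c)).ravel() for c in contours]

    def match_descriptors(desc_obj, desc_ref):
        # Features whose descriptors are most similar are set as corresponding
        pairs = []
        for i, d in enumerate(desc_obj):
            dists = [np.linalg.norm(d - r) for r in desc_ref]
            pairs.append((i, int(np.argmin(dists))))
        return pairs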

Invariant descriptors can also be used if no features are explicitly extracted beforehand; instead, a window is run over the entire image and the descriptors are calculated for each window position.

Transformation calculation

After the correspondence between the features has been established as described in the previous section, this section describes how the transformation is constructed with which the object image is transformed in order to adapt it to the reference image. The correspondence of the control points of the object image and the reference image, and the condition that corresponding control points should be transformed as close to each other as possible, enter into the design of the transformation.

The task to be solved is the selection of a family of functions and the calculation of the parameters of the mapping function. The family of functions must be chosen with regard to the expected image differences and the required accuracy of the transformation. The simplest case is a translation, for which only two parameters need to be calculated. More complex are, for example, affine or perspective transformations; the more complex the family of functions, the greater the number of parameters to be determined. Another factor in the choice of the family of functions is the cause of the differences between the images. For perspective distortions caused by different camera positions, for example, the choice of a perspective transformation as the family of functions is obvious.

The transformations can be divided into two broad categories, depending on the extent of the data used. Global transformations use all control points to calculate one parameter set for the entire image; a global transformation thus consists of a single function that is applied to every pixel. Local transformations divide the image into several regions (in the extreme case, every pixel is a separate region), and the parameters are then calculated for each region separately. In this way, differences that vary in strength across the image can be handled. A local transformation is composed of several functions, one for each region.

Global transformations

One of the most widespread global transformation models uses bivariate polynomials of low degree, usually of the first degree. The similarity transformation is the simplest model. The point (x, y) is mapped to the point (x', y') by the following equations:

$x' = s\,(x \cos\varphi - y \sin\varphi) + t_x$
$y' = s\,(x \sin\varphi + y \cos\varphi) + t_y$

where φ is the rotation angle, s the scaling factor, and t_x and t_y the translation parameters. This transformation is also known as shape-preserving, since angles and length ratios remain unchanged. One advantage is that only two control points are needed; the disadvantage, however, is that only rotation, translation and scaling can be realized this way.

A more general model is the affine transformation. The point (x, y) is mapped to the following point (x', y'):

$x' = a_{11}\, x + a_{12}\, y + t_x$
$y' = a_{21}\, x + a_{22}\, y + t_y$

where the coefficients a_{11}, a_{12}, a_{21}, a_{22} comprise the scaling and shear factors, and t_x and t_y are the translation parameters. Three control points are needed, but in addition shear can now be realized.

If perspective distortions are to be expected in the images to be registered, the perspective transformation should be used:

$x' = \frac{a_{11}\, x + a_{12}\, y + a_{13}}{a_{31}\, x + a_{32}\, y + 1}, \qquad y' = \frac{a_{21}\, x + a_{22}\, y + a_{23}}{a_{31}\, x + a_{32}\, y + 1}$

Here, four control points are required.

If more complex distortions are to be expected in the images, polynomials of second or third degree can also be used; polynomials of higher degree are usually not used. In general, however, considerably more control points are used for the registration than the minimum numbers specified here. The parameters of the selected transformation are then usually calculated by the method of least squares, so that the transformation equations minimize the sum of the squared errors at the control points. It can therefore happen that the control points are not transformed exactly onto one another, but only as close together as possible.
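
A sketch of this least-squares calculation for an affine model in NumPy; the control point coordinates are arbitrary example values:

    import numpy as np

    # Control points in the object image and their correspondences in the
    # reference image (more pairs than the three minimally required)
    obj_pts = np.array([[10, 10], [200, 30], [40, 180], [150, 160], [90, 90]], float)
    ref_pts = np.array([[14, 22], [205, 48], [38, 195], [152, 178], [92, 105]], float)

    # Affine model: x' = a11*x + a12*y + tx,  y' = a21*x + a22*y + ty
    A = np.hstack([obj_pts, np.ones((len(obj_pts), 1))])
    params, *_ = np.linalg.lstsq(A, ref_pts, rcond=None)

    # Residuals: the transformed control points need not lie exactly on their
    # correspondences, only as close as possible in the squared-error sense
    print(A @ params - ref_pts)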

Local transformations

With global transformations, differences of locally varying strength in the images can be aligned only poorly or not at all; local transformations are better suited for this. Here the transformation consists of several functions: not all control points are used for a single function anymore; instead, each function has its own control points.

One method that implements local transformations is piecewise interpolation. Here a function is defined that interpolates between those control points of the object image and the reference image that have been placed in correspondence. One possibility is triangulation: with the help of the control points, the image is divided into triangular regions, with corresponding triangles in the two images covering corresponding areas. In the transformation, a number of functions are used, each of which is valid within one triangle. To obtain sufficiently good results with this approach, the vertices of each triangle must not be too far apart, i.e. a sufficient number of control points must be given. Since the number of required control points is thus very high, the computational effort is correspondingly high.
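
A sketch of such a piecewise transformation using scikit-image, which triangulates the control points internally and fits one affine transform per triangle; the point coordinates and the random stand-in image are arbitrary example values:

    import numpy as np
    from skimage.transform import PiecewiseAffineTransform, warp

    # Corresponding control points; ref_pts in the reference image
    # correspond to obj_pts in the object image
    ref_pts = np.array([[0, 0], [0, 99], [99, 0], [99, 99], [58, 46]], float)
    obj_pts = np.array([[0, 0], [0, 99], [99, 0], [99, 99], [50, 40]], float)

    # warp expects the inverse mapping, i.e. from coordinates of the
    # registered image back into the object image, so the transform is
    # estimated from reference points to object points
    tform = PiecewiseAffineTransform()
    tform.estimate(ref_pts, obj_pts)

    obj_img = np.random.rand(100, 100)   # stand-in for the object image
    registered = warp(obj_img, tform)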

Radial basis functions

Radial basis functions yield global transformations, but they are also able to adapt to local variations. Every function f with the following property is a radial basis function:

$f(x) = f(\lVert x - c \rVert)$

where c is the center of the function f. When registering with radial basis functions, each control point is the center of one basis function. The entire transformation is then a linear combination of all these radial basis functions plus a polynomial of low degree (p_1 and p_2 below). Let N control points be given, (x_i, y_i) the coordinates of the i-th control point, and a_i and b_i weights indicating how strongly the function whose center is the i-th control point enters into the overall transformation. The pixel (x, y) is then transferred to the pixel (x', y') as follows:

$x' = p_1(x, y) + \sum_{i=1}^{N} a_i\, f(\lVert (x, y) - (x_i, y_i) \rVert)$
$y' = p_2(x, y) + \sum_{i=1}^{N} b_i\, f(\lVert (x, y) - (x_i, y_i) \rVert)$

By means of these equations, the control points are placed in correspondence, and the remaining pixels are then interpolated between the control points by means of the radial basis functions.

The most widely used form for registration with radial basis functions are the thin-plate splines. Their radial basis function f is defined as follows:

$f(r) = r^2 \log r$

Registration with thin-plate splines can, however, become very time-consuming if many control points are used.
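
A sketch using SciPy's RBFInterpolator, whose 'thin_plate_spline' kernel implements exactly this basis function together with a first-degree polynomial term; the control point coordinates are arbitrary example values:

    import numpy as np
    from scipy.interpolate import RBFInterpolator

    # Corresponding control points (object image -> reference image)
    obj_pts = np.array([[10, 10], [90, 15], [20, 85], [80, 80], [50, 50]], float)
    ref_pts = np.array([[12, 14], [92, 20], [18, 88], [83, 84], [55, 53]], float)

    # One thin-plate basis function is centred on every control point;
    # the low-degree polynomial term is included automatically
    tps = RBFInterpolator(obj_pts, ref_pts,
                          kernel='thin_plate_spline', degree=1)

    print(tps(obj_pts))                  # reproduces ref_pts exactly
    # Transformed position of every pixel of a 100x100 image
    grid = np.indices((100, 100)).reshape(2, -1).T.astype(float)
    mapped = tps(grid)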

Elastic models

Another approach for the registration of images that exhibit very complex local distortions is registration by means of elastic models. Registration is performed in most cases iteratively by minimizing an energy functional of the form

$J[u] = D[F, G; u] + \alpha\, S[u, u]$

where u denotes the displacement field, the functional D describes the dissimilarity of the images, the bilinear form S (the regularization) is a suitable penalty term, and α is a positive regularization parameter. The bilinear form is often induced by the elliptic operator

$L[u] = \mu\, \Delta u + (\lambda + \mu)\, \nabla(\nabla \cdot u)$

with Neumann boundary conditions and the elastic (Lamé) constants μ and λ given. Registration works by iteratively solving the Euler-Lagrange equations

$\mu\, \Delta u + (\lambda + \mu)\, \nabla(\nabla \cdot u) = f(x, u(x)),$

where the force field f is derived from the dissimilarity functional D.

As a result, the images are modeled as elastic surfaces or viscous fluids that are deformed by external forces. The deformation is counteracted by the internal forces and scaled appropriately by the parameter α. The steps of feature matching and transformation calculation coincide in this approach.

Transformation

The calculated transformation is then used to transform the object image in order to register the images. The transformation can be carried out forward or backward. If it is carried out forward, a new position is calculated for each pixel of the object image by means of the transformation function. However, this approach has distinct disadvantages: on the one hand, several pixels of the object image may be transformed onto one and the same new image point, and on the other hand holes may arise in the transformed image. A hole is formed when there is a point (x, y) in the transformed image onto which no pixel of the object image is transformed.

If the transformation is executed backward, the intensity value at position (x, y) in the transformed image is calculated as follows. First, a position (x', y') in the object image is calculated from the position (x, y) by means of the inverse transformation. Then the intensity value at (x, y) in the transformed image is calculated by interpolation from the pixels surrounding (x', y'). Frequently applied interpolation techniques are, for example, bilinear and bicubic interpolation. In this way, an intensity value is computed for every pixel of the transformed image.
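
A sketch of this backward transformation with bilinear interpolation via scipy.ndimage; the inverse map shown (a pure translation) is an illustrative assumption:

    import numpy as np
    from scipy import ndimage

    def backward_transform(obj_img, inverse_map, output_shape):
        # For every pixel (x, y) of the transformed image, compute its
        # position (x', y') in the object image via the inverse transform
        ys, xs = np.mgrid[0:output_shape[0], 0:output_shape[1]]
        src_x, src_y = inverse_map(xs, ys)
        # Bilinear interpolation (order=1) from the surrounding pixels
        return ndimage.map_coordinates(obj_img, [src_y, src_x], order=1)

    # Example: the forward transform translates by (3, 5), so the inverse
    # map subtracts this offset
    obj_img = np.random.rand(64, 64)
    registered = backward_transform(obj_img, lambda x, y: (x - 3, y - 5),
                                    obj_img.shape)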
