Incomplete LU factorization

As ILU- decomposition ( of incomplete LU decomposition ) or incomplete LU decomposition is called in numerical mathematics the faulty decomposition of a matrix into the product of a lower triangular matrix L and an upper triangular matrix U

In which only the entries of a given occupation structure are calculated by the decomposition matrices L and U.

When calculating a normal LU decomposition of a sparse matrix can not take advantage of the occupation structure in general. It is therefore very much more memory than the original matrix and the number of necessary arithmetic operations is not lower than that of a fully occupied matrix. By specifying a maximum occupation structure of this problem is avoided by accepting an erroneous decomposition.

The ILU decomposition is successfully used as a preconditioner to accelerate the iterative solution of large sparse linear systems of equations by means of Krylov subspace methods. No properties of the actual problem are doing (usually the numerical solution of a partial differential equation) exploited. Thus, it is not restricted to certain classes of problems, and has found its way into many areas of numerical simulation, for example in computational fluid dynamics, the technology is widespread.

The process in 1960 by Richard S. Varga and Nikolai Ivanovich Bulejew (NI Buleev ) was first mentioned. A more detailed analysis was published by JA Meijerink and van der Vorst 1977. They examined preconditioners for the CG method and proposed an incomplete Cholesky decomposition of symmetric matrices. At the same time they mentioned an extension to general matrices.

  • 3.1 ILU ( p)
  • 3.2 Ilut
  • 3.3 Other variants
  • 3.4 parallelization

Application

For use as a preconditioner, the equation system is formally multiplied by,

Turning on Krylov subspace methods, as these converge better, because the matrix is closer to the identity matrix as A. For these processes are required matrix-vector multiplications in each step. Since A is sparse, the computation of low computational effort. For the solution of can be the equivalent system of equations

Efficiently solved by forward-reverse insertion. Here, the thin Busy awareness of the matrices L and U can be exploited.

The calculation of an ILU decomposition, is about as compared to splitting -based preconditioners such as Gauss - Seidel, relatively expensive, the expense is directly related to the size of the permitted occupation structure. This is well balanced by the acceleration of Krylov subspace methods, which is better in general, the larger the allowed occupation structure. Directly behind the other multiple systems with the same matrix but different right-hand sides resolved, it is thus appropriate to invest more in the calculation of the ILU. In the numerical solution of time-dependent partial differential equations, often involving sequences of thousands of similar systems of linear equations have to be solved sequentially, a once calculated ILU decomposition is usually frozen and re-calculated periodically.

ILU factorization or variants are part of every major program library for solving sparse linear systems of equations, such as from PETSc or MATLAB.

Basic form

In its basic form is given as the occupation structure of P A. The decomposition of the matrices L and U is then defined by the following conditions:

In addition, a normalization, that is, determining on the diagonal of one of the two incompletely occupied triangular matrices is considered. In this case, either normalized diagonal elements of the bottom triangular matrix to 1:

Or the diagonal elements of the upper triangular matrix:

The decomposition algorithms, which depending on the implementation also has an impact on the calculation efficiency differ depending on the normalization.

Since it is not required for, is possible (this motivated once again the name of incomplete LU decomposition ).

Given the matrix. The incomplete decomposition matrices L and U are then stored together in a new matrix, the previously known ones are not stored on the diagonal of L and U respectively. The matrix M is initialized such that it is set for entries of P identical to A, otherwise to zero.

When selecting the normalization, the calculation of the decomposition is then performed using the following algorithm:

For, do     For and if, do               For and if, do           The sequence of the loops in the above algorithm can be modified to improve the efficiency depending on the data structure. If the matrix for example by lines stored, the memory accesses happen in the last loop not on adjacent memory blocks. In such cases, a permutation of loops is useful.

Existence

Existence results of the decomposition are available for M- matrices and H- matrices. General- matrices there are counter examples in which the algorithm terminates prematurely, because a zero appears on the diagonal, which results in a division by zero. Nevertheless, in practice, is not observed Cancel the calculation of the decomposition.

Variants

There are many variants of the original ILU breakdown. This attempt either to improve the approximation properties or to reduce the computational effort in a similar approximation.

ILU ( p)

Widely used are the ILU ( p) factorization, which were first considered by Watts in 1981 in the simulation of an oil reservoir. Here p denotes the level of fill. The basic version of ILU has the Level 0 Level 1 is defined by the fact that the occupation structure of the product of the matrices L and U from the ILU ( 0) is considered. Level 2 results from the decomposition matrices of ILU (1 ), etc. To determine the structure of an occupation ILU ( p) factorization, it is not necessary to calculate the decompositions of the lower level in advance. For this purpose, one of the non-zero entries of the matrix A at the beginning of the level to 0, the zero entries, however, infinite. Then you can step through the elements of the matrix as it would happen in the regular algorithm and each time the element would be modified in the innermost loops, the level is updated by

Thus, it is possible to provide the memory for the ILU factorization before starting the algorithm. When using an ILU (p ), note that for a calculation of the decomposition is more expensive more expensive than in the basic version and also the application because the preconditioner has more non-zero entries. This is no longer necessarily result in high levels from about 3, the reductions in the numbers of iterations in the Krylov subspace method to shorten the CPU times. Moreover, it may especially when indefinite matrices even lead to a deterioration of iterations compared to the baseline version.

Ilut

The ILU ( p) have the disadvantage that the non- zero entries are not selected on the basis of approximation properties and thus computation time for near-zero entries can be wasted. This is Ilut in the ILU decomposition proposed in 1994 by Yousef Saad with threshold, called into account. Here, additional conditions are permitted in addition to the use of a cast structure nor, after which entries will not be considered if they are below a certain tolerance. Approximately certain diagonally dominant M- matrices can then be proved the existence of the decomposition. The implementation of an efficient Ilut is more difficult than in the other variants, it often has higher levels of fill are possible than in a pure ILU (p).

Other variants

The ILU is expandable without major problems on block matrices, this must be multiplied instead of dividing by the diagonal element of the inverse of the corresponding diagonal block.

A special case, however, is the incomplete Cholesky factorization (IC). This applies the concept of the ILU decomposition to symmetric and positive definite matrices, analogous to the Cholesky decomposition. This introduced in 1977 as the first ILU variant method is often used as a preconditioner for the CG method. The combination of CG with IC is also called ICCG.

Parallelization

The ILU factorization is strictly sequential and therefore difficult to parallelize. However, variants have been developed that make use of the central ideas in order to make parallel processing possible. These include in particular multi-level techniques such as ILUM. This independent sets can be used to eliminate a set of unknown blocks. The resulting fill-in is limited by a threshold. Then, a new independent set is searched in the remaining unknowns and the step is repeated until the remaining block has become small enough for a direct solution.

Influence the numbering

The numbering of the unknowns in A has a not to be underestimated impact on the efficiency of the preconditioner. This is because the fill-in in the exact decomposition exactly depends on, and thus the numbering has influence on how well approximated the faulty ILU decomposition A. Moreover, the numbering affects the size of the entries on the diagonal, and thus the stability.

Again, there is no firm statements which numbering for what kind of problems is useful. In particular, when using the basic version of ILU ( 0) no convincing heuristics are known. For the stronger preconditioner Ilut or ILU (p ) with p > 0, the reverse Cuthill -McKee numbering has been found to be low in many cases.

409896
de