Gauss–Newton algorithm


The Gauss–Newton method (named after Carl Friedrich Gauss and Isaac Newton) is a numerical method for solving nonlinear minimization problems that arise when the method of least squares is applied to nonlinear models. The method is an extension and simplification of Newton's method: in each step, the function to be minimized is replaced by a quadratic approximation whose minimum can be computed explicitly. Owing to the specific structure of the function to be minimized (a "sum of squared errors"), the Gauss–Newton method, in contrast to Newton's method, does not require the second derivative of the function.

The iteration

In contrast to Newton's method, which in each iteration approaches the minimum by minimizing a Taylor approximation of the actual function, the Gauss–Newton method minimizes the residuals by means of the least-squares method. In this way, the procedure converts a nonlinear system of equations into a sequence of linear least-squares problems.
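
This conversion can be made explicit in a short derivation, stated here in the notation introduced below (where $r$ is the residual vector and $D$ its Jacobian):

    % Linearize the residuals around the current iterate a^(s):
    r(a^{(s)} + \Delta) \;\approx\; r(a^{(s)}) + D\,\Delta
    % Each Gauss-Newton step minimizes the squared norm of this
    % linearization, a linear least-squares problem in \Delta:
    \min_{\Delta}\; \bigl\| r(a^{(s)}) + D\,\Delta \bigr\|_2^2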

Outline of the procedure

Hereinafter, it is assumed that data with the following properties are available:

  • The table of values has $m$ rows, i.e. $m$ measurements were made.
  • For each measurement, the control variables $x_1, \dots, x_n$ were set and the result $y$ was measured.
  • There is a model function $f(x_1, \dots, x_n;\, a_1, \dots, a_p)$ that describes the relationship between $x$ and $y$. This function contains $p$ parameters $a_1, \dots, a_p$, which are now to be determined as accurately as possible from the measurements. The special thing about the function $f$ is that it is not linear in the parameters.

It is not necessary for the number of parameters $p$ to be equal to the number of variables $n$. If one searches, for example, for the coefficients of a cubic polynomial $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3$, there is only one $x$ value per record, but each power of the polynomial has its own coefficient, so that (including the constant term) four coefficients are to be determined.

If the measurements are available as a table, they can be represented as follows:

    Measurement | Control variables            | Result
    1           | x_{11}  x_{12}  ...  x_{1n}  | y_1
    2           | x_{21}  x_{22}  ...  x_{2n}  | y_2
    ...         | ...                          | ...
    m           | x_{m1}  x_{m2}  ...  x_{mn}  | y_m

The approach of minimizing the sum of squared errors yields:

    \min_{a_1, \dots, a_p} \; S(a) = \sum_{i=1}^{m} \bigl( y_i - f(x_{i1}, \dots, x_{in};\, a_1, \dots, a_p) \bigr)^2
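
As a concrete illustration, here is a minimal sketch of such an objective in Python; the exponential model $f(x; a_1, a_2) = a_1 e^{a_2 x}$ and the measurement data are hypothetical, chosen only for demonstration:

    import numpy as np

    # Hypothetical model function: nonlinear in the parameter a[1].
    def f(x, a):
        return a[0] * np.exp(a[1] * x)

    # Hypothetical measurements: m = 5 rows, n = 1 control variable.
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([2.1, 3.3, 5.2, 8.3, 13.1])

    # Sum of squared errors S(a) -- the quantity to be minimized.
    def S(a):
        return np.sum((y - f(x, a)) ** 2)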

Algorithm

To determine the parameters, one proceeds according to the following scheme:

Preparation

  • $f(x_1, \dots, x_n;\, a_1, \dots, a_p)$ is the model function with $n$ variables and the $p$ sought parameters.
  • Set up the residual function $r_i(a) = y_i - f(x_{i1}, \dots, x_{in};\, a)$ and the residual vector $r = (r_1, \dots, r_m)^T$.
  • Calculate the partial derivatives of the residual function with respect to each parameter $a_k$ ($k = 1, \dots, p$): $\partial r_i / \partial a_k$.
  • Build the matrices for the iteration (see the sketch after this list):
  • The matrix $D$ and the column vector $r$ have $m$ rows, that is, one for each row of the table above.
  • The column vector $a$ has $p$ rows, that is, one for each parameter.
  • The columns of the matrix $D$ contain the partial derivatives with respect to the parameters. The order of the columns in $D$ corresponds to the order of the parameters in $a$: if the parameter $a_1$ stands in row 1 of $a$, the first column of $D$ must contain the derivatives with respect to $a_1$. Accordingly, $D$ has $p$ columns, i.e. one for each parameter.
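
Continuing the hypothetical exponential model from the sketch above, this preparation step might look as follows (the analytic derivatives are specific to that assumed model):

    # Residual vector r(a): m rows, one per measurement.
    def residual(a):
        return y - f(x, a)

    # Jacobian D of the residuals: m rows and p = 2 columns.
    # For r_i = y_i - a1*exp(a2*x_i) the partial derivatives are
    #   dr_i/da1 = -exp(a2*x_i)  and  dr_i/da2 = -a1*x_i*exp(a2*x_i).
    def jacobian(a):
        e = np.exp(a[1] * x)
        return np.column_stack((-e, -a[0] * x * e))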

Iteration

  • The iteration is carried out with the following matrix equation, where $D$ and $r$ are evaluated at the current iterate: $a^{(s+1)} = a^{(s)} - (D^T D)^{-1} D^T r$ (see the sketch after this list).
  • Once $a^{(s+1)}$ has been calculated, the matrices $D$ and $r$ must also be recalculated in order to prepare the next iteration. To reduce the computational effort, one may also iterate several times without recalculating $D$. This procedure is often recommended for Newton's method, but it reduces the speed of convergence and should only be used if $a$ changes only little.
  • The iteration is terminated if $\|a^{(s+1)} - a^{(s)}\| < \varepsilon$, i.e. when the magnitude of the change is smaller than the error bound $\varepsilon$.
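
Putting the pieces together, a compact sketch of the iteration for the hypothetical model above (starting values and error bound are illustrative):

    def gauss_newton(a0, eps=1e-8, max_iter=50):
        a = np.asarray(a0, dtype=float)
        for _ in range(max_iter):
            D = jacobian(a)
            r = residual(a)
            # Solve the normal equations (D^T D) delta = D^T r.
            delta = np.linalg.solve(D.T @ D, D.T @ r)
            a = a - delta   # a_new = a_old - (D^T D)^{-1} D^T r
            # Terminate once the change is below the error bound.
            if np.linalg.norm(delta) < eps:
                break
        return a

    a_fit = gauss_newton([1.0, 0.5])
    print(a_fit, S(a_fit))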

Comments

  • Since $D^T D$ is symmetric and positive definite, the Cholesky decomposition is particularly well suited for solving the system of equations (see the sketch after this list).
  • The number of rows $m$ in the table must always be greater than the number of parameters $p$. If $m = p$, the method determines the parameters exactly (within the accuracy of the iteration), so the result is not merely the optimal solution in the sense of least squares. The system is underdetermined if the number of parameters is larger than $m$.
  • To obtain a convergent approximation, one should have a rough idea of the magnitude of the parameters. The method diverges if the initial values of the parameters are chosen unfavorably.
  • By introducing a step-size parameter $\lambda$, a descent, i.e. a reduction of the error sum of squares, can be forced by evaluating the function at the new location for different $\lambda$ (see also the Levenberg–Marquardt algorithm).
  • The closer the error bound $\varepsilon$ is to zero, the more accurate the solution; the side effect, however, is that the number of iterations increases.
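
A minimal sketch of the Cholesky-based solve mentioned in the first comment, as a drop-in replacement for the np.linalg.solve call in the iteration sketch above (using SciPy's cho_factor/cho_solve):

    from scipy.linalg import cho_factor, cho_solve

    def gn_step_cholesky(a):
        D = jacobian(a)
        r = residual(a)
        # D^T D is symmetric positive definite, so factor it once
        # and solve by forward/back substitution.
        c, low = cho_factor(D.T @ D)
        return a - cho_solve((c, low), D.T @ r)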

Difference between the Gauss-Newton method and Newton's method

In Newton's method, the Jacobian and the Hessian matrix are used to iteratively determine the minimum of a nonlinear function (in this case the sum of squared errors).

In the Gauss–Newton method, by contrast, one always has a sum of squared errors (between measured values and function values) that is to be minimized iteratively. In each Gauss–Newton step, every one of the functions is replaced by a linear approximation (a first-order Taylor expansion). The sum of squared errors thus becomes quadratic in the parameters, and its minimum can be computed explicitly, without the exact Hessian matrix of the sum of squared errors being required (a prerequisite for Newton's method). The step is calculated (as in Newton's method), and the procedure is then repeated at the new location.
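
The relationship between the two methods can be made explicit by differentiating the sum of squares $S(a) = \sum_i r_i(a)^2$ twice (a standard derivation, stated in the notation used above):

    % Gradient and exact Hessian of S(a) = \sum_i r_i(a)^2:
    \nabla S = 2\, D^T r,
    \qquad
    \nabla^2 S = 2\, D^T D + 2 \sum_{i=1}^{m} r_i \, \nabla^2 r_i
    % Newton's method needs the full Hessian; Gauss-Newton drops the
    % second-derivative term and uses the approximation
    \nabla^2 S \approx 2\, D^T D,
    % which is good when the residuals r_i are small (or the model is
    % nearly linear), and requires only first derivatives.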
