Newton's method

Newton's method, also known as the Newton–Raphson method (named after Sir Isaac Newton, 1669, and Joseph Raphson, 1690), is a standard method in mathematics for the numerical solution of nonlinear equations and systems of equations. For an equation in one variable, given a continuously differentiable function f, it finds approximations to solutions of the equation f(x) = 0, i.e. approximations to the zeros of this function. The basic idea of the method is to linearize the function at a base point, that is, to determine its tangent there, and to use the zero of the tangent as an improved approximation of the zero of the function. The approximation obtained then serves as the starting point for a further improvement step. This iteration is repeated until the change in the approximate solution has fallen below a set limit. At best, the iteration converges asymptotically with quadratic order of convergence; the number of correct decimal places then doubles in each step.

Formally, the following rule is applied repeatedly until sufficient accuracy has been achieved:

  x_{n+1} = x_n − f(x_n) / f'(x_n)

Newton's method for real functions of one variable

Historical background of Newton's method

Isaac Newton wrote in the period 1664 to 1671 the work "Methodus fluxionum et serierum infinitarum" (Latin for: On the method of fluxions and infinite series). In it he explains a new algorithm for solving a polynomial equation using the example x³ − 2x − 5 = 0. For this equation one can easily guess the point x = 2 as a first approximation. Newton made the ansatz x = 2 + p with a p assumed to be "small" and substituted it into the equation:

Expanding by the binomial theorem gives

  p³ + 6p² + 10p − 1 = 0.

Since p is supposed to be "small", the higher-order terms can be neglected against the linear and constant ones, which leaves 10p − 1 = 0, i.e. p = 0.1. This procedure can be repeated: one sets p = 0.1 + q, substitutes this into the second equation, omits the higher-order terms, and obtains 11.23q + 0.061 = 0, i.e. q ≈ −0.0054.

Joseph Raphson described this computational process formally in 1690 in the work "Analysis aequationum universalis" and illustrated the formalism on the general equation of third degree, finding the corresponding iteration rule.

The abstract form of the method, using the derivative f', is due to Thomas Simpson.

Construction on the graph

The procedure can be arrived at graphically as follows: let f be a continuously differentiable real function, of which we know a point x_n in the domain with a "small" function value. We want to find a point x_{n+1} near x_n that is an improved approximation of the zero. To do so, we linearize the function f at the point x_n, that is, we replace it by its tangent at the point (x_n, f(x_n)) with slope f'(x_n).

The tangent is given by the function

  t(x) = f(x_n) + f'(x_n)·(x − x_n).

We choose x_{n+1} as the unique zero of this linear function:

  0 = f(x_n) + f'(x_n)·(x_{n+1} − x_n),   i.e.   x_{n+1} = x_n − f(x_n)/f'(x_n).

Applying this construction repeatedly, from a first point x_0 we obtain an infinite sequence of points (x_n), defined by the recursion

  x_{n+1} = N_f(x_n) := x_n − f(x_n) / f'(x_n).

This recursion is referred to as the Newton iteration, and the function N_f as the Newton operator. The Newton iteration is a special case of a fixed-point iteration: if the sequence converges to a limit x*, then x* = x* − f(x*)/f'(x*), and therefore f(x*) = 0.
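
As an illustration, the iteration can be written in a few lines of Python; this is a minimal sketch, not part of the original article, and the names newton, f and fprime are chosen freely:

    def newton(f, fprime, x0, tol=1e-12, max_iter=50):
        """Newton iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
        x = x0
        for _ in range(max_iter):
            step = f(x) / fprime(x)
            x -= step
            if abs(step) < tol:      # change below the set limit: stop
                return x
        return x                     # may be inaccurate if max_iter was reached

    # example: the positive zero of f(x) = x^2 - 2, i.e. sqrt(2)
    print(newton(lambda x: x*x - 2, lambda x: 2*x, x0=1.0))  # 1.414213562...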

The art of applying Newton's method consists in finding suitable starting values x_0. The more that is known about the function f, the smaller the necessary set of candidate starting values can be made.

Many nonlinear equations have several solutions; a polynomial of degree n, for instance, has up to n zeros. If one wants to find all zeros in a certain region, a suitable starting value must be found for each zero, from which the Newton iteration converges to it. Sufficiently small isolating intervals for each zero can be determined, for example, by bisection.

First example

The square root of a number a > 0 is the positive zero of the function f(x) = 1 − a/x². This function has the derivative f'(x) = 2a/x³, so the Newton iteration proceeds according to the rule

  x_{n+1} = x_n − f(x_n)/f'(x_n) = (x_n/2)·(3 − x_n²/a).

The advantage of this rule compared to Heron's square-root method (see below) is that it is division-free once the reciprocal value 1/a has been determined. Tabulating the iterates, truncated at the first inaccurate digit, one sees that after a few steps the number of valid digits grows rapidly.
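
Such a table is easy to regenerate; the following Python sketch (the choices a = 2 and x_0 = 1 are illustrative) prints the iterates of the division-free rule, and comparison with √2 = 1.41421356... shows the number of valid digits roughly doubling:

    a = 2.0
    inv_a = 1.0 / a   # the reciprocal of a: the only division needed
    x = 1.0           # illustrative starting value
    for n in range(6):
        x = 0.5 * x * (3.0 - x * x * inv_a)   # x_{n+1} = (x_n/2)(3 - x_n^2/a)
        print(n + 1, x)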

Consider the difference of x_{n+1} to the limit √a: by means of the binomial formulas, the factor (x_n − √a) can be split off twice from this difference:

  x_{n+1} − √a = −(x_n − √a)² · (x_n + 2√a) / (2a).

By the inequality of the arithmetic and geometric means, x_{n+1} ≤ √a holds for every iterate, so that for n ≥ 1 the second factor can be meaningfully bounded by (x_n + 2√a)/(2a) ≤ 3/(2√a). If the difference to √a in the n-th step is a small number e, then the difference in the (n+1)-th step is proportional to e², that is, much smaller. Squaring a small error e thus yields an error estimate proportional to e². Therefore one says that the number of valid digits roughly doubles in each step of the Newton iteration.

Convergence considerations

Newton's method is a so-called locally convergent method. Convergence of the sequence generated by the Newton iteration to a zero is therefore guaranteed only if the starting value, i.e. the 0-th member of the sequence, is already "sufficiently close" to the zero. If the starting value is too far away, the convergence behavior is undetermined: divergence of the sequence is possible, as are oscillation (in which finitely many function values alternate) and convergence to a different zero of the function under consideration.

If the starting value is chosen such that Newton's method converges, the convergence is, however, quadratic, that is, of convergence order 2 (provided the derivative does not vanish at the zero). The set of starting points for which Newton's method converges to a particular zero forms the basin of attraction of that zero. If, for a polynomial with real or complex coefficients, the basins of attraction of the different zeros in the complex plane are colored differently, the result is a Newton fractal. In it one can see that the basins of attraction contain disks around the zeros, from which the Newton iteration converges stably to the zero at their center. But one can also see that the edges of the basins are "frayed"; they even have a fractal structure. Slight deviations in the starting point can therefore lead to different zeros. If, however, an interval contains exactly one zero, f' > 0 and f'' < 0 hold throughout on it, and the starting value is chosen to the left of the zero, then the sequence in Newton's method always converges, and indeed strictly increasingly.

Examples of non-convergence
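
A classic illustration of oscillation (a standard example chosen here for illustration) is f(x) = x³ − 2x + 2 with starting value x_0 = 0: the Newton iterates cycle forever between 0 and 1, as the following Python sketch shows:

    f = lambda x: x**3 - 2*x + 2
    fprime = lambda x: 3*x**2 - 2
    x = 0.0
    for n in range(6):
        x = x - f(x) / fprime(x)
        print(n + 1, x)   # 1.0, 0.0, 1.0, 0.0, ...: a 2-cycle, no convergence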

Local quadratic convergence

Let f be a twice continuously differentiable real function and a a zero of f at which the derivative has no zero. This means that the graph of f crosses the x-axis transversally, that is, without touching it. Let x be a point near a. Then the Taylor formula of second degree (with Lagrange remainder term) gives

  0 = f(a) = f(x) + f'(x)·(a − x) + (1/2)·f''(ξ)·(a − x)²,   with ξ between a and x.

This can be rearranged according to the difference a − x:

  a − x = −f(x)/f'(x) − f''(ξ)/(2·f'(x)) · (a − x)².

It is now rearranged so that the Newton operator N_f(x) = x − f(x)/f'(x) appears on the right-hand side:

  a − N_f(x) = −f''(ξ)/(2·f'(x)) · (a − x)².

Let I be an interval around a containing no zero of the derivative, and let m := min over I of |f'| and M := max over I of |f''| be bounds for the derivatives on I. Then for all x in I the estimate

  |a − N_f(x)| ≤ M/(2m) · |a − x|²

follows.

Let the constant factor be denoted K := M/(2m). In each Newton step the distance d_{n+1} := |a − x_{n+1}| will thus be smaller than K times the square of the distance d_n in the previous step. By complete induction it follows that

  K·d_n ≤ (K·d_0)^(2^n).

If for the starting point of the iteration the estimate K·|a − x_0| < 1 can thus be guaranteed, for example by an interval length less than 1/K, then the sequence of Newton iterates converges to the zero a, since by the given estimate the sequence (d_n) is a null sequence. The required shrinking of the interval can be achieved by a few iterations of a slower zero-bracketing method, for example the bisection method or the regula falsi.

Because of the following estimates, this convergence speed is referred to as quadratic: the (logarithmic) precision, that is, the number of valid digits, doubles in each step. The estimate of the distance to the zero is often given linearly in |a − x_0|; for example:

  • If |a − x_0| ≤ 1/(2K), i.e. the length of the interval is smaller than 1/K, then |a − x_n| ≤ 2^(−(2^n − 1)) · |a − x_0|; this gives an estimate of the valid digits in the binary system.
  • If |a − x_0| ≤ 1/(10K), that is, the starting point is close enough to the zero, then |a − x_n| ≤ 10^(−(2^n − 1)) · |a − x_0|, and the number of valid decimal digits doubles in each step.

Local quadratic convergence at multiple zeros by modification

In the case that f has a multiple zero at a, the speed of convergence can also be estimated, and quadratic convergence can be forced again by a slight modification.

If f has an m-fold zero at a, then f can be written as f(x) = (x − a)^m · g(x) with g(a) ≠ 0.

Then by the product rule f'(x) = m·(x − a)^(m−1)·g(x) + (x − a)^m·g'(x), and thus the expression

  f(x)/f'(x) = (x − a) · g(x) / (m·g(x) + (x − a)·g'(x)).

Substituting this into the iteration x_{n+1} = x_n − f(x_n)/f'(x_n), we obtain

  x_{n+1} = x_n − (x_n − a) · g(x_n) / (m·g(x_n) + (x_n − a)·g'(x_n)),

and, after subtracting a from both sides, the iteration error

  x_{n+1} − a = (x_n − a) · (1 − g(x_n) / (m·g(x_n) + (x_n − a)·g'(x_n))).

Now, if the term (x_n − a) has become very small, the summand (x_n − a)·g'(x_n) in the denominator is much smaller than m·g(x_n), so that the bracket on the right more and more approaches the value 1 − 1/m = (m − 1)/m. For a simple zero, m = 1, this is a small value close to 0, so that the error shrinks rapidly. For m = 2 it is about 0.5, so that the distance to the zero only roughly halves from step to step, and one needs about 10 steps to increase the accuracy by merely 3 further decimal places. For m = 3 it is about 0.67, so that only after about 16 steps does the accuracy increase by 3 further decimal places, and so on.

One can therefore estimate the multiplicity of the zero from the convergence behavior, if it is not known for other reasons, and, as is now described, optimize the method.

At an m-fold zero, one modifies Newton's method with the factor m:

  x_{n+1} = x_n − m · f(x_n)/f'(x_n).

This then gives

  x_{n+1} − a = (x_n − a) · (1 − m·g(x_n) / (m·g(x_n) + (x_n − a)·g'(x_n)))
              = (x_n − a)² · g'(x_n) / (m·g(x_n) + (x_n − a)·g'(x_n)).

If (x_n − a) is now again very small, the summand (x_n − a)·g'(x_n) in the denominator is much smaller than m·g(x_n), and one obtains

  x_{n+1} − a ≈ (x_n − a)² · g'(x_n) / (m·g(x_n)),

with the right-hand factor converging to the fixed value g'(a)/(m·g(a)). As one can see, quadratic convergence is thus available again.

An example shows the convergence behavior very nicely. The function considered has a simple zero. The left column of the Excel table shows the rapid convergence for the starting value 1: after 4 steps the accuracy can no longer be increased in Excel, and the number of zero digits after the decimal point of the error (at least) doubles in each step. If the function is squared (middle column), the zero becomes a double zero, and the behavior described above appears: without modification, the error only roughly halves in each step. If the method is modified in this case with the factor m = 2 (right column), it shows the same behavior as for the simple zero (left column).
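
The experiment is easy to reproduce in Python; here f(x) = x² − 2 with its simple zero √2 is an illustrative stand-in for the function used in the original table:

    import math

    def errors(f, fp, x0, m=1, steps=8):
        x, errs = x0, []
        for _ in range(steps):
            x = x - m * f(x) / fp(x)               # (modified) Newton step
            errs.append(abs(x - math.sqrt(2)))
        return errs

    f,  fp = lambda x: x*x - 2, lambda x: 2*x                  # simple zero
    g,  gp = lambda x: (x*x - 2)**2, lambda x: 4*x*(x*x - 2)   # squared: double zero

    print(errors(f, fp, 1.0))         # quadratic convergence
    print(errors(g, gp, 1.0))         # error only roughly halves per step
    print(errors(g, gp, 1.0, m=2))    # factor m = 2 restores quadratic convergence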

Remarks

  • The local convergence proof can also be carried out in the same way in the multidimensional case, but it is then technically somewhat more difficult, since one works with two- and three-stage tensors for the first and second derivative. Essentially, the constant K = M/(2m) has to be replaced by one formed with suitably induced operator norms.
  • The local convergence proof presupposes that an interval containing a zero is known. Its proof, however, yields no way of testing this quickly. A convergence proof that provides such a criterion was first given by Leonid Kantorovich and is known as the Kantorovich theorem.
  • To find a suitable starting point, other ("coarser") methods are occasionally used. For example, one can determine an approximate solution with the gradient method and then refine it with Newton's method.
  • If no starting point is known, one can use a homotopy to deform the function f, of which a zero is sought, into a simpler function g of which (at least) one zero is known. One then traverses the deformation backwards in the form of a finite sequence of functions that differ from one another only "slightly". Of the first function one knows a zero; as starting value for the Newton iteration of the currently considered function of the sequence, one uses the approximation of a zero of the function preceding it. For the exact procedure, see homotopy methods; a minimal sketch follows this list.
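
The following Python sketch uses the standard "Newton homotopy" H(x, t) = f(x) − (1 − t)·f(x_0); the target function and the number of deformation steps are illustrative choices, not prescribed by the article:

    import math

    def newton(f, fp, x, tol=1e-12, it=50):
        for _ in range(it):
            step = f(x) / fp(x)
            x -= step
            if abs(step) < tol:
                break
        return x

    f  = lambda x: math.cos(x) - x     # function whose zero is sought
    fp = lambda x: -math.sin(x) - 1.0

    x0 = 0.0
    fx0 = f(x0)                        # H(x, 0) = f(x) - f(x0) has the known zero x0
    x = x0
    for k in range(1, 11):             # traverse the deformation in 10 small steps
        t = k / 10
        h = lambda x, t=t: f(x) - (1.0 - t) * fx0
        x = newton(h, fp, x)           # previous zero serves as new starting value
    print(x)                           # ~0.739085, the solution of cos(x) = x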

Termination criteria

Possible termination criteria with respect to a residual quantity (for example, in computer arithmetic) are

  ‖f(x_n)‖ ≤ ε₁   or   ‖x_{n+1} − x_n‖ ≤ ε₂,

where ε determines the quality of the "zero". In both cases it may happen that the termination criterion is fulfilled at a "bad" time.
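
In code, the two criteria might be combined as in the following sketch (the tolerance names eps_res and eps_step are illustrative):

    def newton_stop(f, fp, x, eps_res=1e-10, eps_step=1e-12, max_iter=100):
        for _ in range(max_iter):
            fx = f(x)
            if abs(fx) <= eps_res:     # residual criterion ||f(x_n)|| <= eps_1
                break
            step = fx / fp(x)
            x -= step
            if abs(step) <= eps_step:  # step criterion ||x_{n+1} - x_n|| <= eps_2
                break
        return x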

Further application examples

Calculating the square root

A special case of Newton's approximation method is the Babylonian square-root computation, also known as Heron's method after Hero of Alexandria:

If the iteration is applied to extract the root of a, using the function

  f(x) = x² − a,

then, because of the derivative f'(x) = 2x, one obtains for the solution the approximation scheme

  x_{n+1} = x_n − (x_n² − a)/(2·x_n) = (x_n + a/x_n)/2.

This method converges for every a ≥ 0 and every positive starting value x_0.
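
A short sketch in Python (the values a = 5 and x_0 = 1 are illustrative):

    a, x = 5.0, 1.0
    for _ in range(8):
        x = 0.5 * (x + a / x)   # Heron's rule x_{n+1} = (x_n + a/x_n)/2
    print(x)  # 2.23606797749979 = sqrt(5)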

Computation of the cube root

If the iteration is applied to extract the cube root of a, using the function

  f(x) = x³ − a,

then, because of the derivative f'(x) = 3x², one obtains for the solution the approximation scheme

  x_{n+1} = x_n − (x_n³ − a)/(3·x_n²) = (2·x_n + a/x_n²)/3.

For negative radicands a < 0, the conversion ∛a = −∛(−a) is recommended.

This method converges for a ≠ 0 and any starting value x_0 for which a and x_0 have the same sign.
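
A sketch including the recommended sign conversion (the function name cbrt is chosen freely):

    def cbrt(a, x0=1.0, steps=60):
        if a < 0:                    # conversion for negative radicands
            return -cbrt(-a, x0, steps)
        if a == 0:
            return 0.0
        x = x0
        for _ in range(steps):
            x = (2.0 * x + a / (x * x)) / 3.0   # Newton step for x^3 - a
        return x

    print(cbrt(27.0))   # 3.0
    print(cbrt(-8.0))   # -2.0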

Intersection of two functions

Similarly, the x-value of the intersection point of two functions f and g can be determined:

Since the problem of equating the two functions, f(x) = g(x), can always be brought by rearrangement into the form

  h(x) = f(x) − g(x) = 0,

to which Newton's approximation method can be applied, the intersection point is found as a zero of h.
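
For example (the two functions are an illustrative choice), to intersect f(x) = e^x with g(x) = 2 − x:

    import math

    h  = lambda x: math.exp(x) - (2.0 - x)   # h(x) = f(x) - g(x)
    hp = lambda x: math.exp(x) + 1.0         # derivative of h

    x = 0.0
    for _ in range(20):
        x -= h(x) / hp(x)
    print(x, math.exp(x))   # x-value of the intersection and the common value there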

Mixed goniometric function

Sought is the positive solution of the problem cos(x) = x³. The problem can be reformulated as cos(x) − x³ = 0. Zeros of f(x) = cos(x) − x³ are therefore sought.

We now have f'(x) = −sin(x) − 3x². Since |cos(x)| ≤ 1 holds for all x, while x³ > 1 for x > 1, we know that the zero lies between 0 and 1. We start the iteration with the value x_0 = 0.5.

After a few iterations, the first twelve digits of the root are known.
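
A Python check of this computation (assuming, as reconstructed above, the example cos(x) = x³):

    import math

    f  = lambda x: math.cos(x) - x**3
    fp = lambda x: -math.sin(x) - 3.0 * x**2

    x = 0.5
    for n in range(7):
        x -= f(x) / fp(x)
        print(n + 1, x)   # settles at 0.865474033101..., stable to 12+ digits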

Newton's method in the multidimensional case

Newton's method can also be used to determine zeros of multidimensional functions f from R^n to R^n. A specific application is the combination with the Gaussian least-squares method in the Gauss–Newton method. For the general case, the starting point is the fixed-point equation given above:

  x = x − (f'(x))^(-1) · f(x).

This inspires Newton's method:

  x_{n+1} = x_n − J(x_n)^(-1) · f(x_n),

where J(x_n) = f'(x_n) is the Jacobian matrix, that is, the matrix of the partial derivatives of f, evaluated at x_n.

Since solving this via the computation of the inverse of a matrix and a subsequent matrix-vector multiplication is laborious and numerically less favorable, the system of linear equations

  J(x_n) · Δx_n = −f(x_n)

is solved instead. The next iterate is then obtained from

  x_{n+1} = x_n + Δx_n.

For solving the system, a number of solution methods exist (see list of numerical methods). If the Jacobian matrix is invertible at the zero and Lipschitz continuous in a neighborhood of the zero, then the method converges locally quadratically.
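
A compact sketch in Python with NumPy (the 2×2 system is purely illustrative):

    import numpy as np

    def newton_nd(F, J, x, tol=1e-12, max_iter=50):
        for _ in range(max_iter):
            dx = np.linalg.solve(J(x), -F(x))   # solve J(x_n) dx = -f(x_n)
            x = x + dx                          # x_{n+1} = x_n + dx
            if np.linalg.norm(dx) < tol:
                break
        return x

    # example system: x^2 + y^2 = 1 and y = x^3
    F = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[1] - v[0]**3])
    J = lambda v: np.array([[2.0*v[0],     2.0*v[1]],
                            [-3.0*v[0]**2, 1.0     ]])
    print(newton_nd(F, J, np.array([1.0, 1.0])))   # ~ [0.826, 0.564]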

Variants of Newton's method

The biggest problem in applying Newton's method is that it requires the first derivative of the function. Its calculation is usually time-consuming, and in many applications a function is not given analytically but, for example, only by a computer program (see also automatic differentiation). In the one-dimensional case, the regula falsi, in which a secant rather than the tangent is used, is then preferable. In the multidimensional case, other alternatives must be sought. Here the problem is even more dramatic, since the derivative is a matrix with n² entries, so the cost of its calculation grows with the square of the dimension.

Simplified Newton's method

Instead of computing the derivative in every Newton step, it is also possible to compute it only in every second step. This lowers the cost per iteration drastically; the price is a loss of convergence speed. The convergence is then no longer quadratic, but superlinear convergence can still be achieved.
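
A sketch of this variant, reusing the names from the multidimensional example above (the parameter refresh is illustrative):

    import numpy as np

    def simplified_newton(F, J, x, refresh=2, tol=1e-12, max_iter=100):
        for k in range(max_iter):
            if k % refresh == 0:
                Jx = J(x)                    # derivative only every few steps
            dx = np.linalg.solve(Jx, -F(x))  # otherwise reuse the old Jacobian
            x = x + dx
            if np.linalg.norm(dx) < tol:
                break
        return x

In practice one would factor Jx once (for example by an LU decomposition) and reuse the factorization in the intermediate steps, which is where the actual saving comes from.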

Inexact Newton method

A similar idea is to compute in each step an approximation of the derivative, for example via finite differences. A quantitative convergence statement is difficult in this case; as a rule of thumb, the worse the approximation of the derivative, the worse the convergence becomes. An example of such a method is the secant method.
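
A sketch of the secant method (the zero of x² − 2 is an illustrative choice):

    def secant(f, x0, x1, tol=1e-12, max_iter=100):
        for _ in range(max_iter):
            denom = f(x1) - f(x0)            # finite difference replaces f'
            if denom == 0.0:                 # secant has become horizontal
                break
            x0, x1 = x1, x1 - f(x1) * (x1 - x0) / denom
            if abs(x1 - x0) < tol:
                break
        return x1

    print(secant(lambda x: x*x - 2, 1.0, 2.0))   # 1.414213562...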

Newton–Krylov method

For the numerical solution of nonlinear partial differential equations, Newton's method offers itself in principle as the basic solver. The corresponding Jacobian matrix is typically sparse, and Krylov subspace methods then offer themselves for the solution of the linear systems. This combination is called the Newton–Krylov method. In the Krylov method itself, the Jacobian matrix occurs only in matrix-vector products, which can be interpreted as directional derivatives. If these are approximated by finite differences, one obtains completely matrix-free methods.
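
A minimal matrix-free sketch with SciPy follows; the tiny system stands in for a discretized PDE, and SciPy also ships a ready-made implementation as scipy.optimize.newton_krylov:

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, gmres

    def newton_krylov(F, x, tol=1e-10, max_iter=30, eps=1e-7):
        for _ in range(max_iter):
            Fx = F(x)
            if np.linalg.norm(Fx) < tol:
                break
            # J(x) v is approximated by the directional finite difference
            mv = lambda v: (F(x + eps * v) - Fx) / eps
            Jop = LinearOperator((x.size, x.size), matvec=mv)
            dx, _ = gmres(Jop, -Fx)          # Krylov solver needs only matvecs
            x = x + dx
        return x

    F = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[1] - v[0]**3])
    print(newton_krylov(F, np.array([1.0, 1.0])))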
