Gradient descent

The gradient method, also called the method of steepest descent, is a procedure used in numerical mathematics to solve general optimization problems. Starting (in the case of a minimization problem) from an initial approximation, one proceeds in the direction of the negative gradient, which indicates the direction of steepest descent from this approximation, until no further numerical improvement is achieved.

The method often converges very slowly, since it approaches the optimum in a pronounced zig-zag course. For the solution of symmetric positive definite systems of linear equations, the conjugate gradient method provides a substantial improvement. Gradient descent is related to the hill climbing algorithm.

The optimization problem

The gradient method is used for the minimization of a real-valued, differentiable function $f\colon \mathbb{R}^n \to \mathbb{R}$, that is, for the optimization problem

$$\min_{x \in \mathbb{R}^n} f(x).$$

This is an optimization problem without constraints, also called an unconstrained optimization problem.

The procedure

Starting from an initial point $x^{(0)}$, the direction of steepest descent is determined by the derivative: it is the negative gradient $-\nabla f(x^{(k)})$, where $\nabla$ denotes the nabla operator, i.e. the vector of partial derivatives with respect to the variables. One then iterates as follows:

$$x^{(k+1)} = x^{(k)} + \sigma_k\, d^{(k)}.$$

Here $d^{(k)} = -\nabla f(x^{(k)})$ is the negative gradient, i.e. the descent direction of this method, and $\sigma_k > 0$ denotes the step size. The step size must be determined in each iteration step; there are various possibilities for this. One method determines it by minimizing the function on the (one-dimensional) ray that points from $x^{(k)}$ in the direction of the negative gradient:

In this case one thus computes the step size by

$$\sigma_k = \operatorname*{arg\,min}_{\sigma \ge 0} \, f\bigl(x^{(k)} - \sigma \nabla f(x^{(k)})\bigr).$$

This is a simple one-dimensional optimization problem, for which there are specific methods of step-size determination (line search).
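As an illustration of the general method, here is a minimal sketch in Python. The test function (Rosenbrock), the backtracking line search used in place of an exact minimization along the ray, and all parameter values are assumptions chosen for this example, not part of the original description.

import numpy as np

def gradient_descent(f, grad, x0, tol=1e-8, max_iter=1000):
    """Gradient method with a backtracking (Armijo) line search.

    The text describes choosing the step size by minimizing f along the ray
    x - sigma * grad f(x); here a backtracking rule is used as a common
    practical stand-in for that one-dimensional minimization.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:          # gradient small enough -> stop
            break
        fx = f(x)
        sigma = 1.0
        # shrink sigma until a sufficient decrease is achieved
        while sigma > 1e-14 and f(x - sigma * g) > fx - 1e-4 * sigma * g.dot(g):
            sigma *= 0.5
        x = x - sigma * g                    # step in the direction of steepest descent
    return x

# Example: minimize the Rosenbrock function (assumed test problem)
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([
    -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
    200 * (x[1] - x[0]**2),
])
print(gradient_descent(f, grad, np.array([-1.0, 1.0]), max_iter=50000))

On a test function like this, the iterates exhibit the slow, zig-zagging approach to the optimum mentioned above.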

Special case: Quadratic Functional

Typically, an energy functional of the form

$$F(x) = \tfrac{1}{2}\, x^\top A x - b^\top x$$

is minimized, where $A \in \mathbb{R}^{n \times n}$ is a symmetric positive definite matrix. Starting from an initial point $x^{(0)}$, the direction of steepest descent is again determined by the derivative; it is the residual $r^{(k)} = b - A x^{(k)} = -\nabla F(x^{(k)})$, where $\nabla$ denotes the nabla operator. One then iterates as follows:

$$x^{(k+1)} = x^{(k)} + \sigma_k\, r^{(k)}.$$

Here $\sigma_k$ denotes the step size of the method, and it is calculated by

$$\sigma_k = \frac{{r^{(k)}}^{\top} r^{(k)}}{{r^{(k)}}^{\top} A\, r^{(k)}}.$$

The method converges for every starting value to the minimizer $x^\ast$ of the energy functional, i.e. to the solution of $A x = b$.
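The step-size formula can be obtained by minimizing $F$ along the ray through $x^{(k)}$ in the direction of the residual; a short sketch of this standard one-dimensional calculation:

$$\varphi(\sigma) = F\bigl(x^{(k)} + \sigma r^{(k)}\bigr), \qquad \varphi'(\sigma) = {r^{(k)}}^{\top}\bigl(A(x^{(k)} + \sigma r^{(k)}) - b\bigr) = -{r^{(k)}}^{\top} r^{(k)} + \sigma\, {r^{(k)}}^{\top} A\, r^{(k)},$$

and setting $\varphi'(\sigma) = 0$ yields exactly $\sigma_k = {r^{(k)}}^{\top} r^{(k)} / {r^{(k)}}^{\top} A\, r^{(k)}$.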

Algorithm in pseudocode

Input: a suitable starting vector x^(0)
r^(0) = b - A x^(0)
for k = 0 to n-1
    sigma_k = (r^(k)^T r^(k)) / (r^(k)^T A r^(k))
    x^(k+1) = x^(k) + sigma_k r^(k)
    r^(k+1) = b - A x^(k+1)
end
Output: x^(n) as an approximation of the minimizer of the energy functional.
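A direct transcription of this pseudocode into Python might look as follows; the matrix, right-hand side, and iteration count in the usage example are assumptions chosen for illustration.

import numpy as np

def gradient_method_quadratic(A, b, x0, n_iter):
    """Gradient method for the energy functional F(x) = 0.5 x^T A x - b^T x,
    with A symmetric positive definite and the exact step size from the text."""
    x = np.asarray(x0, dtype=float)
    r = b - A @ x                         # residual = negative gradient of F
    for _ in range(n_iter):
        sigma = r.dot(r) / r.dot(A @ r)   # exact step size along the residual
        x = x + sigma * r
        r = b - A @ x
    return x

# Usage example (assumed data): a small SPD system A x = b
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = gradient_method_quadratic(A, b, x0=np.zeros(2), n_iter=50)
print(x, np.linalg.solve(A, b))           # the iterate approaches A^{-1} b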

Convergence speed

The gradient method yields a sequence $x^{(k)}$ with

$$\|x^{(k)} - x^\ast\|_A \le \left(\frac{\kappa(A) - 1}{\kappa(A) + 1}\right)^{k} \|x^{(0)} - x^\ast\|_A.$$

Here $\|x\|_A := \sqrt{x^\top A x}$ denotes the so-called energy norm and $\kappa(A)$ the condition number of the matrix $A$ with respect to the spectral norm.
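A small numerical sketch in Python of how the energy-norm error compares with this bound; the SPD matrix, right-hand side, and starting vector are arbitrary assumptions chosen for the illustration.

import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite (assumed example)
b = np.array([1.0, 2.0])
x_star = np.linalg.solve(A, b)           # exact minimizer of the energy functional
kappa = np.linalg.cond(A, 2)             # condition number w.r.t. the spectral norm
rate = (kappa - 1) / (kappa + 1)

energy_norm = lambda v: np.sqrt(v @ A @ v)

x = np.zeros(2)
r = b - A @ x
e0 = energy_norm(x - x_star)
for k in range(1, 11):
    sigma = r.dot(r) / r.dot(A @ r)
    x = x + sigma * r
    r = b - A @ x
    # the energy-norm error stays below the theoretical bound rate**k * e0
    print(k, energy_norm(x - x_star), rate**k * e0)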
