Gradient descent
The gradient method, also called the method of steepest descent, is a numerical procedure for solving general optimization problems. For a minimization problem, the method starts from an initial approximation and repeatedly moves in the direction of the negative gradient, which indicates the direction of steepest descent at the current approximation, until no further numerical improvement is obtained.
The method often converges very slowly, since it approaches the optimum in a pronounced zig-zag pattern. For solving symmetric positive definite systems of linear equations, the method of conjugate gradients offers a substantial improvement. Gradient descent is related to the hill climbing algorithm.
The optimization problem
The gradient method is used to minimize a real-valued, differentiable function f: R^n → R; that is, to solve the optimization problem

  min f(x),  x ∈ R^n.

This is an optimization problem without constraints, also called an unconstrained optimization problem.
The procedure
Starting from an initial point x^0 ∈ R^n, the direction of steepest descent is determined by the derivative: it is the negative gradient −∇f(x^0), where ∇ denotes the nabla operator, i.e. the vector of partial derivatives with respect to the variables x_1, …, x_n. One then iterates as follows:

  x^{k+1} = x^k + t_k d^k,  k = 0, 1, 2, …

Here d^k = −∇f(x^k) is the negative gradient, the descent direction of the method, and t_k denotes the step size. This step size must be determined in each step of the iteration; for this there are different possibilities. One method is to determine it by minimizing the function along the (one-dimensional) ray that points from x^k in the direction of the negative gradient. In this case the step size is computed by

  t_k = argmin_{t ≥ 0} f(x^k + t d^k).

This is a simple, one-dimensional optimization problem, for which there are dedicated methods of step-size determination.
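The iteration above can be sketched in a few lines. The following is a minimal illustration, not a production implementation: the test function, the starting point, and the search interval [0, 1] for the one-dimensional problem are assumptions of this example, and the exact line search is approximated by a ternary search (valid here because the function is convex along each ray).

```python
def gradient_descent(f, grad, x0, iters=200):
    """Steepest descent with an approximate exact line search (sketch)."""
    x = list(x0)
    for _ in range(iters):
        d = [-g for g in grad(x)]  # descent direction d^k = -grad f(x^k)

        def phi(t):  # the one-dimensional function along the ray
            return f([xi + t * di for xi, di in zip(x, d)])

        lo, hi = 0.0, 1.0  # assumed bracket for the step size
        for _ in range(60):  # ternary search; phi is unimodal here
            m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
            if phi(m1) < phi(m2):
                hi = m2
            else:
                lo = m1
        t = (lo + hi) / 2
        x = [xi + t * di for xi, di in zip(x, d)]  # x^{k+1} = x^k + t_k d^k
    return x

# Example (assumed): f(x, y) = (x - 1)^2 + 10 y^2, minimized at (1, 0).
f = lambda v: (v[0] - 1) ** 2 + 10 * v[1] ** 2
grad = lambda v: [2 * (v[0] - 1), 20 * v[1]]
x = gradient_descent(f, grad, [0.0, 1.0])
```

Because the coefficient 10 makes the level sets elongated ellipses, the iterates exhibit the zig-zag behaviour mentioned above before settling near (1, 0).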
Special case: quadratic functional
Typically, an energy functional of the form

  f(x) = ½ xᵀAx − bᵀx

is minimized, where A ∈ R^{n×n} is a symmetric positive definite matrix and b ∈ R^n. Starting from an initial point x^0, the direction of steepest descent is again determined by the derivative: it is the negative gradient

  d^k = −∇f(x^k) = b − Ax^k,

where ∇ denotes the nabla operator. One then iterates as follows:

  x^{k+1} = x^k + t_k d^k,

where t_k denotes the step size of the method, which is computed by

  t_k = (d^k)ᵀd^k / ((d^k)ᵀAd^k).

The method converges for every starting value x^0 to the value x* = A^{−1}b, which minimizes f.
Algorithm in pseudocode
Input: symmetric positive definite matrix A, vector b, a suitable starting vector x^0, number of iterations n

for k = 0 to n−1
    d^k = b − Ax^k
    t_k = (d^k)ᵀd^k / ((d^k)ᵀAd^k)
    x^{k+1} = x^k + t_k d^k
end

Output: x^n, an approximation of the minimizer of the energy functional.
Convergence speed
The gradient method yields a sequence (x^k) with

  ‖x^k − x*‖_A ≤ ((κ − 1)/(κ + 1))^k ‖x^0 − x*‖_A,

where ‖x‖_A = √(xᵀAx) denotes the so-called energy norm and κ = cond_2(A) the condition number of the matrix A with respect to the spectral norm.
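This bound can be checked numerically. The sketch below uses a diagonal matrix so that the condition number is known exactly (κ = 10, giving a contraction factor of 9/11 per step); the concrete matrix, right-hand side, and iteration count are assumptions of this example.

```python
import math

A = [[10.0, 0.0], [0.0, 1.0]]  # SPD, eigenvalues 10 and 1, so cond_2(A) = 10
b = [10.0, 1.0]                # exact minimizer x* = (1, 1)
x_star = [1.0, 1.0]
kappa = 10.0
rate = (kappa - 1.0) / (kappa + 1.0)  # theoretical contraction factor 9/11

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def energy_norm(A, v):
    """The energy norm ||v||_A = sqrt(v^T A v)."""
    return math.sqrt(dot(v, matvec(A, v)))

x = [0.0, 0.0]
errors = [energy_norm(A, [xi - si for xi, si in zip(x, x_star)])]
for _ in range(20):
    d = [bi - ai for bi, ai in zip(b, matvec(A, x))]  # d^k = b - A x^k
    t = dot(d, d) / dot(d, matvec(A, d))              # exact step size
    x = [xi + t * di for xi, di in zip(x, d)]
    errors.append(energy_norm(A, [xi - si for xi, si in zip(x, x_star)]))

# Per-step error ratios in the energy norm; each should be <= (kappa-1)/(kappa+1).
ratios = [e1 / e0 for e0, e1 in zip(errors, errors[1:]) if e0 > 1e-12]
```

In this run every observed ratio stays below the bound of 9/11 ≈ 0.818; for a less favourable starting point the bound can be approached much more closely, which is exactly the slow zig-zag behaviour mentioned in the introduction.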