Automatic differentiation

Automatic differentiation, also called differentiation of algorithms, is a method from computer science and applied mathematics. For a function of several variables that is given as a procedure in a programming language or as a computational graph, an extended procedure is generated that evaluates both the function and one or any number of gradients, up to the full Jacobian matrix. If the source program contains loops, the number of loop iterations must not depend on the independent variables.

These derivatives are required, for example, for solving systems of nonlinear equations with Newton's method and for methods of nonlinear optimization.

The main tools are the chain rule and the fact that, for the elementary functions available on a computer, such as sin, cos, exp and log, the derivatives are known and can be computed just as exactly. The cost of computing the derivatives is therefore proportional (with a small factor) to the cost of evaluating the original function.


Calculation of derivatives

Task: Given a function $f: \mathbb{R}^n \to \mathbb{R}^m$, $x \mapsto y$.

Wanted: the code or function that computes directional derivatives or the full Jacobian matrix $J = \partial y / \partial x$.

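Different approaches exist for this, among them deriving and coding the derivative by hand, symbolic differentiation with a computer algebra system, numerical approximation by finite differences, and automatic differentiation.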

The idea of automatic differentiation (AD)

Any program that evaluates a function can be described as a sequence of intermediate steps in which intermediate results are transformed in an elementary way. One can picture this as a (potentially infinite) sequence of intermediate values and functions, where each function really depends on only one or two variables. The function is evaluated by first assigning the independent variables to the initial intermediate values and then determining each further intermediate value, one after the other, from those already computed.

This can be arranged so that the function values of f are held in the most recently evaluated intermediate results, i.e. at the end these are assigned to the dependent variables.
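The decomposition into elementary steps can be sketched as follows. The function f(x1, x2) = x1*x2 + sin(x1) and the names evaluate_trace and z are assumptions chosen purely for illustration; they do not come from the article.

```python
import math

def evaluate_trace(x1, x2):
    """Evaluate f(x1, x2) = x1*x2 + sin(x1) as a list of intermediate values."""
    z = [x1, x2]              # the first intermediate values are the inputs
    z.append(z[0] * z[1])     # elementary step that depends on two earlier values
    z.append(math.sin(z[0]))  # elementary step that depends on one earlier value
    z.append(z[2] + z[3])     # the last intermediate value holds the result of f
    return z

z = evaluate_trace(2.0, 3.0)
y = z[-1]  # the function value sits in the most recently computed intermediate
```

Every step reads at most two earlier entries of z, which is what makes it possible to differentiate the program one step at a time.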

AD describes a set of procedures whose goal is to create a new program that evaluates the Jacobian matrix of f. The input variables x are called independent variables, the output variable(s) y dependent variables. AD distinguishes at least two different modes: forward mode and reverse mode.

Forward mode

In forward mode, one computes the matrix product J·S of the Jacobian matrix J with an arbitrary matrix S (the seed matrix), without first determining the components of the Jacobian matrix.

Example 1

In forward mode, directional derivatives are propagated along the control flow of the computation of f. For each scalar variable v, the AD-generated code creates a vector Dv whose i-th component contains the directional derivative along the i-th independent variable.
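A minimal sketch of this idea, assuming operator overloading as the implementation technique (AD tools may instead transform source code). The class Dual, the helper sin_dual and the function f(u, v) = u*v + sin(u) are hypothetical names and choices made for illustration only.

```python
import math

class Dual:
    """A scalar value v together with its derivative vector Dv."""
    def __init__(self, value, deriv):
        self.value = value
        self.deriv = list(deriv)

    def __add__(self, other):
        # sum rule, applied component by component
        return Dual(self.value + other.value,
                    [a + b for a, b in zip(self.deriv, other.deriv)])

    def __mul__(self, other):
        # product rule, applied component by component
        return Dual(self.value * other.value,
                    [a * other.value + self.value * b
                     for a, b in zip(self.deriv, other.deriv)])

def sin_dual(x):
    # chain rule for the elementary function sin
    return Dual(math.sin(x.value), [math.cos(x.value) * d for d in x.deriv])

def f(u, v):
    return u * v + sin_dual(u)

# Seeding with the rows of the identity matrix yields the full gradient.
u = Dual(2.0, [1.0, 0.0])
v = Dual(3.0, [0.0, 1.0])
s = f(u, v)
# s.deriv[0] is ds/du = v + cos(u), s.deriv[1] is ds/dv = u
```

Seeding u and v with other vectors, for example [1.0] and [2.0], would instead give a single directional derivative along (1, 2).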

Example 2

Consider a function that is to be computed. Automatic differentiation in forward mode would produce from it an extended function that returns the derivative values alongside the function value.

Reverse mode

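Reverse mode consists of two phases. In phase 1, the original program is executed and certain intermediate data are stored. In phase 2, the program is traversed in reverse order; derivatives are propagated backwards using the data stored in phase 1.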

In phase 2, a vector is introduced for each scalar variable v. Its i-th component contains the i-th directional derivative (in the direction of v). The seed matrix S is used to initialize these vectors for the dependent variables. In reverse mode, the result obtained is the product S·J.

Example 1

Example 2

For each line of the computation, the derivatives with respect to u and v are augmented by the contribution of the corresponding path to s.

We seek the derivatives of s with respect to u and v. Each of the variables u, v and s is given an associated derivative value; the one belonging to s is initialized to 1, all others are initialized to 0.
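The two phases and the initialization described above can be sketched by hand for a small function. The function s = u*v + sin(u), the intermediate names a and b, and the suffix _bar for the associated derivative values are assumptions made for illustration, not the article's original example.

```python
import math

def f_with_gradient(u, v):
    # Phase 1: run the original computation and keep the intermediate values.
    a = u * v
    b = math.sin(u)
    s = a + b

    # Phase 2: walk through the computation backwards.
    # The derivative value of s starts at 1, all others start at 0.
    s_bar = 1.0
    a_bar = b_bar = u_bar = v_bar = 0.0

    # reverse of: s = a + b
    a_bar += s_bar
    b_bar += s_bar
    # reverse of: b = sin(u)
    u_bar += b_bar * math.cos(u)
    # reverse of: a = u * v
    u_bar += a_bar * v
    v_bar += a_bar * u

    return s, u_bar, v_bar

s, ds_du, ds_dv = f_with_gradient(2.0, 3.0)
```

A single backward sweep delivers the derivatives with respect to both inputs, which is why reverse mode is attractive when there are many independent variables and few dependent ones.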

Efficiency considerations

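The efficiency of AD algorithms depends on the mode and on the parameter p, the number of columns or rows of the seed matrix. The choice of mode and of p depends on the purpose for which the Jacobian matrix is computed. For both modes presented, the running time grows roughly in proportion to p compared with evaluating f alone.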

The computation as a chain of computations

Given a line of the computation that defines s in terms of u and v, the question is: how is the derivative of s modified during the second phase to obtain the derivatives of u and v?

F(x) is interpreted as a sequence of programs. In the example "optimization of a wing", the computation comprises the following steps.

  • Superposition of the wing with so-called "mode functions"
  • Computation of a grid that is laid around the wing
  • Solution of the Navier-Stokes equations on the grid and computation of their integrals.

Overall, the function F is obtained as the composition of these three steps.

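A naive approach would compute the Jacobian matrix of each step separately and then multiply the resulting matrices; the same applies analogously in reverse mode. A better approach is to use the result of each computation as the seed matrix of the following one.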

Since the number of rows of each matrix is 8 (p = 8), the time and memory requirements likewise increase by at most a factor of 8.
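Passing the derivative result of one step on as the seed of the next can be sketched in forward mode as follows. The three toy functions step1, step2 and step3 are hypothetical stand-ins and have nothing to do with the actual wing computation; each takes a value list x and a derivative matrix dx whose i-th row belongs to x[i].

```python
import math

def step1(x, dx):
    # (x0, x1) -> (x0*x1, x0 + x1)
    y = [x[0] * x[1], x[0] + x[1]]
    dy = [
        [x[1] * a + x[0] * b for a, b in zip(dx[0], dx[1])],
        [a + b for a, b in zip(dx[0], dx[1])],
    ]
    return y, dy

def step2(x, dx):
    # (x0, x1) -> (sin(x0), x0*x1)
    y = [math.sin(x[0]), x[0] * x[1]]
    dy = [
        [math.cos(x[0]) * a for a in dx[0]],
        [x[1] * a + x[0] * b for a, b in zip(dx[0], dx[1])],
    ]
    return y, dy

def step3(x, dx):
    # (x0, x1) -> x0 + x1
    y = [x[0] + x[1]]
    dy = [[a + b for a, b in zip(dx[0], dx[1])]]
    return y, dy

def composite(x, seed):
    # The derivative output of each step is the seed of the next one,
    # so no intermediate Jacobian matrix is ever formed explicitly.
    y, dy = step1(x, seed)
    y, dy = step2(y, dy)
    y, dy = step3(y, dy)
    return y, dy

x = [2.0, 3.0]
seed = [[1.0, 0.0], [0.0, 1.0]]  # identity seed, so p = 2 here
y, jac = composite(x, seed)
```

The width of the derivative rows stays equal to p through all three steps, so the extra cost per step is bounded by a factor that depends only on p.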
