Mathematical optimization

The field of optimization in applied mathematics is concerned with finding optimal parameters of a usually complex system. "Optimal" means that an objective function is minimized or maximized. Optimization problems arise in business mathematics, statistics, operations research, and generally in all scientific disciplines that work with unknown parameters, such as physics, chemistry, and meteorology. Frequently, an analytical solution of optimization problems is not possible, and numerical methods must be used.


Example of a simple optimization problem

The simplest optimization problem is finding a minimum or maximum of an analytic one-dimensional function, which as a rule is achieved by finding the zeros of the first derivative.
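As a minimal illustration, the following Python sketch finds the minimizer of the illustrative function f(x) = (x − 2)² + 1 by locating the zero of its first derivative with bisection; both the function and the bracketing interval are assumptions chosen purely for the example.

```python
# Minimal sketch: minimize f(x) = (x - 2)**2 + 1 by locating the zero of its
# first derivative f'(x) = 2*(x - 2). Function and bracket are illustrative.

def f_prime(x):
    return 2.0 * (x - 2.0)

def bisect_zero(g, lo, hi, tol=1e-10):
    """Find a zero of g in [lo, hi] by bisection (g must change sign)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x_star = bisect_zero(f_prime, -10.0, 10.0)
print(x_star)  # ~2.0, the minimizer of f
```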

Examples of optimization problems

Demarcation

Related to optimization is the field of approximation in numerical analysis. Conversely, one can say that an approximation problem is the problem of minimizing the distance (in some metric) between two functions.

Terms: objective function, constraints, feasible set, local and global optimization

Assume a minimization problem in what follows. That which is to be minimized, for example a distance, is called the objective function. What is varied are the parameters or variables of the objective function. In a two-dimensional optimization problem (that is, two independent parameters), the objective function can be pictured spatially, with the parameters spanning the length and depth axes. The height is then the objective function value. In general, the result is a "mountain range" with peaks and valleys.

If the optimization task is in fact an approximation problem, the "mountain range" is sometimes called the fit topology. In that case, the sum of squared errors is used as the objective function in the vast majority of cases; for details see the article on the method of least squares.

Since the objective function forms a "mountain range", the optimization problem amounts to finding the deepest valley (minimization) or the highest peak (maximization) in these mountains. The effort required to solve the task depends crucially on the shape of the "mountains". An extreme example of a minimization problem would be an absolutely flat plane from which individual needle-shaped spikes protrude at random points. In this case no search algorithm helps; one can only search randomly (Monte Carlo method) or systematically scan the entire surface. One of the simplest cases of a two-dimensional optimization problem is when the mountain range has the shape of a parabola rotated about the height axis (opening downward for a maximization problem, upward for a minimization problem), whose vertex is to be found.
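As an illustration of such a purely random (Monte Carlo) search, the following Python sketch samples the domain at random and keeps the best point found; the objective function and the search domain are illustrative assumptions.

```python
import random

# Minimal sketch of a Monte Carlo search, assuming an objective that can only
# be evaluated point-wise (no useful gradient information). The search domain
# [-5, 5] x [-5, 5] and the objective below are illustrative choices.

def objective(x, y):
    return (x - 1.0) ** 2 + (y + 2.0) ** 2  # stand-in "mountain range"

best = None
for _ in range(100_000):
    x = random.uniform(-5.0, 5.0)
    y = random.uniform(-5.0, 5.0)
    value = objective(x, y)
    if best is None or value < best[0]:
        best = (value, x, y)

print(best)  # best value found and where it was found
```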

If the optimization task is to find, starting from a given point in the mountains, the nearest relative (local) minimum or maximum in the neighborhood, one speaks of local optimization. If the task is to find the absolute minimum or maximum in the entire mountain range, one speaks of global optimization. The two tasks have very different levels of difficulty: for local optimization there are numerous methods that, in all cases that are not too difficult, lead more or less quickly and with great certainty to the goal. For global optimization, the solvability of the problem within a given or realizable computational budget depends strongly on the topology of the objective function.

Often one is only interested in parameter values that satisfy additional constraints; these constraints are also called restrictions. They may be given in the form of equations or inequalities, or they may explicitly describe a set (for example, only integer solutions). The set of parameter values that satisfy all constraints is called the feasible set. In the mountain picture, the constraints would narrow down the region to be searched. The optimization problem under consideration is called admissible if the feasible set, i.e. the region to be searched, is not empty. One distinguishes active and passive constraints: a constraint of the form g(x) ≤ 0 is active when the optimum lies on the boundary of the feasible region, so that g(x) = 0 holds there. If the constraint is passive, it does not represent a restriction at the optimum; the optimum then lies in the interior of the region and not on its boundary. A constraint of the form h(x) = 0 is always active.
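For illustration (the problem is chosen purely as an example): minimizing f(x) = x² subject to g(x) = 1 − x ≤ 0 (i.e. x ≥ 1) has its optimum at x = 1 on the boundary, so the constraint is active with g(x) = 0; with the constraint x ≥ −1 instead, the optimum x = 0 lies in the interior and the constraint is passive.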

Classification

Scalar Optimization Problems

A scalar optimization problem can be expressed mathematically as

"minimize (or maximize) f(x) subject to x ∈ M",

where f: ℝⁿ → ℝ is a real-valued function and M ⊆ ℝⁿ is the feasible set. Often the feasible set M is given indirectly by functions, usually normalized to the form

M = { x ∈ ℝⁿ : g(x) ≤ 0, h(x) = 0 }

with vector-valued functions g and h.

Depending on the form of f and M, scalar optimization problems can be classified as follows:

  • Variational problem: M is infinite-dimensional, in particular a function space.
  • Optimal control problem: classical variational problem with a differential equation as constraint.
  • Linear program (LP): M is described by linear equations and inequalities, and f is linear.
  • Quadratic program (QP): as above, but f is a quadratic function.
  • Quadratic program with quadratic constraints (QCQP)
  • Non-linear program: f, g and h are arbitrary functions (usually assumed to be continuously differentiable; in the immediate neighborhood of an optimum a quadratic approximation can often be used, which leads to some of the practical methods).
  • Integer program (also discrete program): in addition, the feasible set is restricted to integer or, more generally, discrete values.
  • Stochastic program: some parameters in the description of M are unknown (but their probability distribution is known).
  • Convex program: M is convex and f is convex (concave) if it is to be minimized (maximized).

Each of these branches of optimization has solution methods specially adapted to it.

A note on terminology: here "program" is to be understood as a synonym for "optimization problem" (rather than "computer program"). The term "program" is used for historical reasons: the first applications of optimization were military problems for which a program of actions was to be found.

Vector optimization problems

An optimization problem from vector optimization (also called Pareto optimization) is, in contrast, a problem in which the values of several objective functions must be optimized simultaneously. This can be formalized by optimizing a vector-valued objective function. Solutions that bring all components of the objective function to an optimum at the same time are in general not available; instead, one generally obtains a larger set of solutions, from which a single optimal point can be selected, for example by scalarization (a weighting of the individual components of the objective function).
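A minimal Python sketch of such a weighted-sum scalarization is shown below; the two objectives, the weights, and the crude grid search are illustrative assumptions, and different weight choices would select different Pareto-optimal points.

```python
# Minimal sketch of weighted-sum scalarization for a two-objective problem.
# Objectives and weights are illustrative assumptions.

def f1(x):
    return (x - 1.0) ** 2      # first objective

def f2(x):
    return (x + 1.0) ** 2      # second objective

def scalarized(x, w1=0.5, w2=0.5):
    return w1 * f1(x) + w2 * f2(x)

# Crude grid search over x in [-2, 2] just to show the idea.
candidates = [i / 1000.0 for i in range(-2000, 2001)]
x_best = min(candidates, key=scalarized)
print(x_best, f1(x_best), f2(x_best))  # a single compromise solution
```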

Linear and Integer Programming

An important special case is linear optimization. Here the objective function is linear, and the constraints can be represented by a system of linear equations and inequalities. Every local optimum is automatically a global optimum, since the feasible region is convex. There are methods to compute the global optimum exactly in principle, of which the most famous is the simplex method (not to be confused with the downhill simplex method mentioned below). Since the 1990s, however, there have also been efficient interior-point methods, which can be competitive with the simplex method for certain classes of optimization problems.
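A minimal sketch of a small linear program, assuming SciPy is installed, is shown below; the objective and constraints are illustrative assumptions, and by default linprog uses the HiGHS solver, which provides simplex- and interior-point-type methods.

```python
# Minimal sketch of a linear program solved with SciPy (assuming SciPy is
# installed). The problem data are illustrative assumptions.
from scipy.optimize import linprog

# maximize  x + 2*y  subject to  x + y <= 4,  0 <= x <= 2,  y >= 0
# linprog minimizes, so the objective coefficients are negated.
c = [-1.0, -2.0]
A_ub = [[1.0, 1.0]]
b_ub = [4.0]
bounds = [(0.0, 2.0), (0.0, None)]

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(result.x, -result.fun)  # optimal point and maximal objective value
```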

A restriction to integer variables makes the problem much harder but at the same time broadens the range of applications. This so-called integer linear programming is used, for example, in production planning, in scheduling, in route planning, and in the planning of telecommunication or transport networks.

Nonlinear Optimization

Methods of local nonlinear optimization with constraints

More difficult than linear optimization is the case of nonlinear optimization, in which the objective function, the constraints, or both are nonlinear. The solution is obtained by reducing the problem to the optimization of an auxiliary function without constraints. The methods of nonlinear optimization without constraints described below can then be applied to this new problem. The procedure can be illustrated with a contact problem: two balls in a trough try to occupy the lowest possible point but must not penetrate each other. The objective function is thus the potential energy of the balls, which attains a minimum in equilibrium. The constraint here would be the penetration of the balls, where a negative penetration means a positive distance. There are several methods for constructing the auxiliary function; some of them (Lagrange multipliers, penalty functions, the augmented Lagrangian method) are described in the sections below.

Methods of local nonlinear optimization without constraints

For local optimization, the choice of method depends on the exact problem: Can the objective function be evaluated arbitrarily exactly? (For stochastic objective functions this is often not the case.) Is the objective function strictly monotonic or monotonic in the neighborhood, or could there even be small relative extrema "along the way"? How expensive is it to determine a gradient?

Examples of methods:

Derivative -free methods

  • Interval bisection method, golden-section search and other methods for optimizing one-dimensional functions, or line search (sequential search) for multi-dimensional functions.
  • Downhill Simplex method

These methods require many iterations but are (in part) relatively robust against problems in the objective function, such as small relative extrema, and they do not require the calculation of a gradient. The latter can be very costly if only a relatively imprecise result is sought.
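The following Python sketch shows a golden-section search on a unimodal one-dimensional function; the objective and the bracketing interval are illustrative assumptions.

```python
import math

# Minimal sketch of golden-section search for a unimodal 1D function on [a, b].
# The objective below is an illustrative assumption.

def golden_section_minimize(f, a, b, tol=1e-8):
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0   # 1/phi ~ 0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c                      # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d                      # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

x_min = golden_section_minimize(lambda x: (x - 1.5) ** 2 + 0.5, 0.0, 4.0)
print(x_min)  # ~1.5
```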

Procedures for which the first derivative is required

  • Secant method
  • Gradient method and conjugate gradient method
  • Quasi-Newton methods

These methods are faster than the derivative-free methods, provided that a gradient can be calculated quickly, and they are similarly robust to the derivative-free methods.
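A minimal Python sketch of plain gradient descent on a smooth two-dimensional function follows; the objective, the step size and the starting point are illustrative assumptions.

```python
# Minimal sketch of plain gradient descent. Objective, gradient, step size
# and starting point are illustrative assumptions.

def grad(x, y):
    # gradient of f(x, y) = (x - 1)**2 + 10 * (y + 2)**2
    return 2.0 * (x - 1.0), 20.0 * (y + 2.0)

x, y = 0.0, 0.0          # starting point
step = 0.04              # fixed step size (must be small enough)
for _ in range(500):
    gx, gy = grad(x, y)
    x -= step * gx
    y -= step * gy

print(x, y)  # close to the minimizer (1, -2)
```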

Procedures for which the second derivative is required

  • Newton's method or Newton-Raphson method

In general, Newton's method is known as a method for determining a zero, and it requires the first derivative. Accordingly, it can also be applied to the derivative of the objective function, since the optimization problem amounts to determining the zeros of the first derivative. Newton's method is very fast but not very robust. If one is not sure of the "good nature" of the optimization problem, globalization strategies such as line search or trust-region methods must additionally be used.
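The following Python sketch applies Newton's method to the first derivative of an illustrative objective; no globalization strategy is included, so the iteration can diverge for less well-behaved problems.

```python
# Minimal sketch of Newton's method applied to f'(x) = 0 for the illustrative
# objective f(x) = x**4 - 4*x**2. Starting point chosen near a local minimum.

def f_prime(x):
    return 4.0 * x ** 3 - 8.0 * x

def f_second(x):
    return 12.0 * x ** 2 - 8.0

x = 1.5
for _ in range(20):
    x -= f_prime(x) / f_second(x)   # Newton step on the first derivative

print(x)  # ~sqrt(2), a local minimizer of f
```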

For the problems frequently occurring in practice, in which the objective function to be minimized has the special form of the squared norm of a vector-valued function (method of least squares), the Gauss-Newton method is available. It exploits the fact that for functions of this form, under certain additional assumptions, the expensive second derivative (Hessian matrix) can be approximated very well as a function of the Jacobian matrix, without computing it explicitly. In this way a superlinear rate of convergence similar to that of Newton's method is achieved near the optimum. Since this method inherits the stability problems of Newton's method, so-called globalization and stabilization strategies are also necessary here to guarantee convergence at least to the nearest local minimum. A popular variant is the Levenberg-Marquardt algorithm.
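A minimal sketch of such a least-squares fit, assuming SciPy is installed, is shown below; the model, the synthetic data and the starting values are illustrative assumptions, and method="lm" selects a Levenberg-Marquardt implementation.

```python
# Minimal sketch of a nonlinear least-squares fit with SciPy (assuming SciPy
# is installed). Model, data and starting values are illustrative assumptions.
import numpy as np
from scipy.optimize import least_squares

t = np.linspace(0.0, 4.0, 30)
y = 2.5 * np.exp(-1.3 * t) + 0.01 * np.random.default_rng(0).normal(size=t.size)

def residuals(params):
    a, b = params
    return a * np.exp(b * t) - y          # vector-valued residual function

fit = least_squares(residuals, x0=[1.0, -1.0], method="lm")
print(fit.x)  # roughly (2.5, -1.3)
```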

Methods of global non-linear optimization

In contrast to local optimization, global optimization is a quasi-unsolved problem in mathematics: there are practically no methods whose application yields, in most cases, a point that is certainly, or even just probably, the absolute extremum.

All methods for global optimization have in common that they repeatedly visit local minima/maxima according to some scheme.

Evolutionary algorithms are most frequently applied here. These give particularly good results if the arrangement of relative minima and maxima exhibits a certain regularity, knowledge of which can be inherited. A very good approach can also be to choose the starting points for the search for local minima/maxima randomly and then to examine the search results for regularities using statistical methods.
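One simple variant of this idea is a multi-start strategy, sketched below in Python; the objective function (with many local minima), the search domain and all tuning constants are illustrative assumptions.

```python
import math
import random

# Minimal sketch of a multi-start strategy: local searches from random
# starting points, keeping the best local minimum found. Objective, domain
# and tuning constants are illustrative assumptions.

def objective(x):
    return 0.1 * x * x + math.sin(3.0 * x)     # many local minima

def local_descent(x, step=1e-3, iters=5000):
    h = 1e-6
    for _ in range(iters):
        grad = (objective(x + h) - objective(x - h)) / (2.0 * h)
        x -= step * grad                       # crude gradient step
    return x

rng = random.Random(0)
starts = [rng.uniform(-10.0, 10.0) for _ in range(30)]
local_minima = [local_descent(x0) for x0 in starts]
best = min(local_minima, key=objective)
print(best, objective(best))
```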

In practice, however, the actual search criterion is often not sufficiently reflected. It is often more important not to find the global optimum but a parameter region in which there are as many relative minima as possible. Methods of cluster analysis or neural networks are suitable here.

Other methods of nonlinear global optimization:

  • Nature-analogue optimization methods such as the great deluge algorithm
  • Simulated annealing
  • Metropolis algorithm
  • Threshold accepting
  • Ant colony optimization
  • Hill climbing
  • Stochastic tunneling
  • RANSAC algorithm (improves the handling of measurement data with outliers)
  • Primal-dual active set algorithm for solving a quadratic optimization problem over a convex subset of a Hilbert space

The performance of optimization algorithms is often analyzed on test problems with a complex structure of minima or maxima for which the exact solution is known. An example of such a test function is the Rastrigin function.
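For reference, a minimal Python definition of the two-dimensional Rastrigin function is sketched below; it has its global minimum at the origin and a regular grid of local minima around it.

```python
import math

# Minimal sketch of the two-dimensional Rastrigin test function, a standard
# benchmark with many regularly arranged local minima and global minimum
# f(0, 0) = 0.

def rastrigin(x, y, A=10.0):
    return (A * 2
            + (x * x - A * math.cos(2.0 * math.pi * x))
            + (y * y - A * math.cos(2.0 * math.pi * y)))

print(rastrigin(0.0, 0.0))   # 0.0, the global minimum
print(rastrigin(0.5, 0.5))   # a much larger value nearby
```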

Theoretical statements

For the optimization of a (differentiable) function f without constraints, it is known that minima/maxima can only occur at points x with ∇f(x) = 0 (or f'(x) = 0 in one dimension). This condition is exploited in the construction of many solution methods. For optimization with constraints there are analogous theoretical statements: duality and Lagrange multipliers.

Convex problems

In convex problems, the region to be searched and the objective function are convex. For a convex region, all points on the line segment connecting any two points of the region also lie entirely within the region. Mathematically: for all x₁, x₂ ∈ M and all λ ∈ [0, 1], the point λx₁ + (1 − λ)x₂ also lies in M.

Examples of convex domains are circles, ellipses, triangles and squares.

The objective function is convex if, for all points on the line segment connecting two points of the region, the corresponding values of the objective function lie below or on the straight line connecting the two function values. Mathematically:

f(λx₁ + (1 − λ)x₂) ≤ λ f(x₁) + (1 − λ) f(x₂) for all x₁, x₂ ∈ M and all λ ∈ [0, 1].

For a maximization problem, the objective function must instead be concave, i.e. the inequality is reversed.

The parabola-shaped optimization problem described above has a convex objective function.

For convex optimization problems, every local optimum is also a global optimum. If points x₁ and x₂ are optimal, then all points "between" these points are optimal. Mathematically: λx₁ + (1 − λ)x₂ is optimal for all λ ∈ [0, 1].

Duality

The (Lagrange) dual problem associated with an optimization problem of the form "minimize f(x) subject to g(x) ≤ 0" is

maximize over λ ≥ 0 the value of min over x of L(x, λ),

where L(x, λ) = f(x) + λᵀ g(x) is the associated Lagrangian. The λ are called Lagrange multipliers, dual variables, or shadow prices. Intuitively, one allows a violation of the constraints but penalizes it in the objective function by additional costs of λ per unit of violation. A solution λ for which it does not pay to violate the constraints solves the dual problem. For convex (in particular linear) problems, the value of the dual problem equals the value of the original problem. For linear and convex quadratic problems, the inner minimization can be solved in closed form, and the dual problem is again a linear or convex quadratic problem, respectively.
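As a small worked example (the problem is chosen purely for illustration): consider minimizing f(x) = x² subject to g(x) = 1 − x ≤ 0. The Lagrangian is L(x, λ) = x² + λ(1 − x); the inner minimization gives x = λ/2 and the dual function λ − λ²/4, which is maximized over λ ≥ 0 at λ = 2 with value 1. This equals the value of the original problem, attained at x = 1, so the duality gap is zero for this convex problem.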

Lagrange multipliers

The Lagrange multiplier theorem states that solutions of the constrained optimization problem (with equality constraints h(x) = 0) can only be found at points x for which there exist Lagrange multipliers λ satisfying the condition

∇f(x) + λᵀ ∇h(x) = 0.

This condition is the direct generalization of the derivative condition above. Like it, the Lagrange multiplier theorem gives a necessary condition for a minimum or maximum; a sufficient condition can be obtained by considering the second derivatives.

The Lagrange multiplier theorem applies only to the case in which the constraints are given by equations. The generalization to inequalities is given by the Karush-Kuhn-Tucker theorem.
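As a standard worked example (chosen only for illustration): to find the extrema of f(x, y) = x + y subject to h(x, y) = x² + y² − 1 = 0, the condition ∇f + λ∇h = 0 gives 1 + 2λx = 0 and 1 + 2λy = 0, hence x = y = −1/(2λ); inserting this into the constraint yields λ = ±1/√2 and the candidate points ±(1/√2, 1/√2), the maximum with value √2 and the minimum with value −√2.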

Penalty functions

In the limit of a penalty parameter growing towards infinity, the solution found with penalty functions converges to the solution found with the Lagrange multipliers.
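The following Python sketch illustrates this behavior with a quadratic penalty function; the objective, the constraint and the crude grid minimizer are illustrative assumptions.

```python
# Minimal sketch of the quadratic penalty method: for increasing penalty
# parameters mu the unconstrained minimizer of the penalized function
# approaches the solution of the constrained problem.

def f(x):
    return (x + 3.0) ** 2              # objective

def g(x):
    return 1.0 - x                     # constraint g(x) <= 0, i.e. x >= 1

def penalized(x, mu):
    return f(x) + mu * max(0.0, g(x)) ** 2

grid = [i / 10000.0 for i in range(-50000, 50001)]   # x in [-5, 5]
for mu in (1.0, 10.0, 100.0, 1000.0):
    x_mu = min(grid, key=lambda x: penalized(x, mu))
    print(mu, x_mu)   # approaches the constrained minimizer x = 1 as mu grows
```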

Augmented Lagrangian method

In the limit of infinitely many iterations, the solution found with the augmented Lagrangian method likewise converges to the solution found with the Lagrange multipliers.
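The following Python sketch illustrates such an iteration for a single equality constraint; the objective, the constraint, the crude grid minimizer and all constants are illustrative assumptions.

```python
# Minimal sketch of an augmented Lagrangian iteration for an equality
# constraint h(x) = 0: alternately minimize the augmented Lagrangian in x
# and update the multiplier estimate.

def f(x):
    return (x + 3.0) ** 2              # objective

def h(x):
    return x - 1.0                     # equality constraint h(x) = 0

def aug_lagrangian(x, lam, mu):
    return f(x) + lam * h(x) + 0.5 * mu * h(x) ** 2

grid = [i / 10000.0 for i in range(-50000, 50001)]   # x in [-5, 5]
lam, mu = 0.0, 10.0
for _ in range(10):
    x = min(grid, key=lambda x: aug_lagrangian(x, lam, mu))
    lam += mu * h(x)                   # multiplier update
print(x, lam)  # x tends to 1, lam tends to the Lagrange multiplier (-8)
```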
