Nonlinear programming

In mathematics, nonlinear optimization (also called nonlinear programming, NLP) is the task of optimizing a scalar objective function of one or several real variables over a restricted region, where the objective function or the boundaries of the region are nonlinear (not affine). It is a branch of mathematical optimization and contains convex optimization as a special case. In contrast to those articles, the application to differentiable nonlinear objective functions is described here, without restricting the objective function or the search region to convexity. For disambiguation, reading up on the terms objective function, constraints, feasible set, and local and global optimization is recommended.


Fields of application

Nonlinear programs arise in many different ways in science and engineering.

In economics, one may wish to minimize the cost of a process subject to limitations on the availability of resources and capacities. The cost function may be nonlinear. In theoretical mechanics, Hamilton's principle is an extremal principle whose solution represents a nonlinear program with nonlinear constraints.

Modern engineering applications often involve optimization tasks in an intricate way. For example, the weight of a component may be minimized while the component must satisfy specific requirements (installation-space restrictions or limits on deformation under given loads, for instance).

In the application of a mathematical model, the task may be to fit the parameters of the model to measured values. Nonlinear influence of the parameters and constraints on the parameters (for example, that only positive values are admissible) lead to a nonlinear program.

For these problems it is often not known a priori whether the problem posed is convex or not. Sometimes one observes that the optimal solution found depends on the starting point of the search. In that case one has found local optima, and the problem is certainly not convex.

Problem definition

There are many possible formulations of a nonlinear program. At this point a form as general as possible is chosen. The input parameter is $x \in \mathbb{R}^n$, that is, the problem depends on $n$ influencing parameters, which are collected in the vector $x$. The objective function $f\colon \mathbb{R}^n \to \mathbb{R}$ is once continuously differentiable. Furthermore, constraints are given in inequality form $g_i(x) \le 0$, $i = 1, \dots, m$, and in equality form $h_j(x) = 0$, $j = 1, \dots, p$, with once continuously differentiable functions $g_i$ and $h_j$. Then the problem reads mathematically:

$\min_{x \in \mathbb{R}^n} f(x) \quad \text{subject to} \quad g_i(x) \le 0,\ i = 1, \dots, m, \qquad h_j(x) = 0,\ j = 1, \dots, p.$

The feasible region $X = \{x \in \mathbb{R}^n : g_i(x) \le 0,\ h_j(x) = 0\}$ is restricted by the constraints: for all parameter values in the feasible region, the constraints must be met. The problem is called feasible if the feasible region is not empty.

The theoretical treatment of nonlinear optimization is mostly limited to minimization problems. Indeed, the maximization of a function $f$ can be reformulated as the minimization of $-f$, or of $1/f$ if $f > 0$ is ensured.
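To make the general form concrete, here is a minimal sketch in Python with scipy.optimize (the choice of tooling is an assumption of this sketch, not of the article), applied to the example treated later in the article: maximize $x_1 x_2$ subject to $x_1 + x_2 \le 14$. Note that SciPy expects inequality constraints in the form $\mathrm{fun}(x) \ge 0$, so $g(x) \le 0$ is passed as $-g(x) \ge 0$.

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: -(x[0] * x[1])      # maximize x1*x2 == minimize -x1*x2
g = lambda x: x[0] + x[1] - 14.0  # constraint in the form g(x) <= 0

res = minimize(f, x0=np.array([1.0, 2.0]),
               constraints=[{"type": "ineq", "fun": lambda x: -g(x)}])
print(res.x)                      # approximately [7. 7.]
```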

Procedure

With the methods described below, the problem is reduced to the optimization of an auxiliary function without constraints. In order to use gradient-based methods to advantage, the search region is divided, where appropriate, into sub-regions on which the objective function is differentiable. If possible, the sub-regions and the objective function on them should be convex. Then the global extrema in the sub-regions can be computed with the procedures set out in Mathematical optimization and Convex optimization, and the best of them selected.

The construction of the auxiliary function can be explained with a simple example: two balls in a trough try to occupy the lowest possible point but must not penetrate each other. The objective function is then the potential energy of the balls, which assumes a minimum in equilibrium. The constraint would be the penetration $g$ of the balls, where a negative penetration means a positive distance.

The advantages and disadvantages of the methods are summarized below:

Lagrange multipliers:

  • Satisfies the constraint exactly.
  • The sign of the multiplier indicates whether the constraint is active or not.
  • Introduces an additional unknown.
  • Introduces vanishing diagonal terms into the system of equations (problematic with the Cholesky decomposition).

Barrier function:

  • The constraints are strictly complied with.
  • As the barrier parameter tends to zero, the solution found passes over into the one found with Lagrange multipliers.
  • The contour lines of the auxiliary function do not coincide with those of the objective function; in extreme cases there is no optimum for some values of the barrier parameter.
  • When approaching the boundary of the feasible region, the Hessian matrix becomes ill-conditioned.

Penalty method:

  • The constraint is only approximately satisfied.
  • For large penalty parameters, the Hessian matrix becomes ill-conditioned.
  • Whether a constraint is active can only be checked by evaluating the functions $g$ and $h$.

Augmented Lagrangian method:

  • Satisfies the constraint with arbitrary precision.
  • The sign of the multiplier indicates whether the constraint is active or not.
  • As the number of iterations grows, the solution found passes over into the one found with Lagrange multipliers.

Elimination of degrees of freedom:

  • Satisfies equality constraints exactly.
  • Reduces the number of unknowns.

Optimization theory

Isolated points

In a nonlinear program, the constraints can reduce the feasible region to isolated points, i.e. points $x^*$ that are feasible although no other point of their neighborhood lies in the feasible region. Mathematically this means that there is a neighborhood $U(x^*)$ such that

$U(x^*) \cap X = \{x^*\}$

holds. Isolated points must each be tested individually for optimality.

Regularity conditions, tangent cone and linearizing cone

For the formulation of optimality conditions, the constraints must meet certain requirements, called constraint qualifications (CQ). Essentially, their purpose is to exclude from consideration points that are isolated or at which redundant constraints are present. There are several formulations of different strength which ensure the fulfillment of these CQ. Points at which the requirements are met are called regular. Irregular points, at which none of these requirements holds, must be excluded or considered separately.

Central to the formulation of the requirements on the constraints and of the optimality conditions are the notions of the tangent cone and the linearizing cone. To picture them, imagine walking towards a point of the feasible region while respecting the constraints (the constraints can be thought of as impenetrable walls) until the target point is reached. The tangent cone is then the set of all directions from which one can arrive at the target point. For the linearizing cone, the constraints are first linearized, that is, replaced by their tangents at the target point; the linearizing cone is then the set of all directions from which one can arrive at the target point with respect to the linearized constraints. The tangent cone and the linearizing cone differ, for example, where two walls run parallel at the target point and the target point lies, as it were, in a corridor of zero width. In the linearizing cone one can then arrive from both directions of the corridor, since it replaces the walls by their tangents. If, however, the initially parallel walls immediately lose their parallelism in one direction and close off the path, so that no step in this direction, however small, is possible, then in the tangent cone one can arrive only from the open direction. That is the difference; see the first pathological case below. In the accompanying figure, the tangent cone and the linearizing cone coincide at the optimum point and are drawn in red.

The requirements on the constraints ensure that at the optimal point the tangent cone and the linearizing cone coincide and that the optimal point is not isolated. Examples of such formulations are:

  • Convexity: The objective function and the constraints are convex, and the optimal point $x^*$ is not isolated.
  • Slater condition: The point $x^*$ is not isolated and there is a feasible $\hat x$ such that $g_i(\hat x) < 0$ for all nonlinear inequality constraints.
  • Linear independence constraint qualification (LICQ): The gradients of the active inequality constraints and the gradients of the equality constraints are linearly independent at the point $x^*$.
  • Mangasarian-Fromovitz constraint qualification (MFCQ): The gradients of the active inequality constraints and the gradients of the equality constraints are positive-linearly independent at the point $x^*$.
  • Constant rank constraint qualification (CRCQ): For each subset of the gradients of the active inequality constraints and of the gradients of the equality constraints, the rank is constant in a neighborhood of $x^*$.
  • Constant positive-linear dependence constraint qualification (CPLD): For each subset of the gradients of the active inequality constraints and of the gradients of the equality constraints, if a positive-linear dependence exists at the point $x^*$, then a positive-linear dependence also exists in a neighborhood of $x^*$.

One can show that the following two chains of implication hold:

$\text{LICQ} \Rightarrow \text{MFCQ} \Rightarrow \text{CPLD} \qquad \text{and} \qquad \text{LICQ} \Rightarrow \text{CRCQ} \Rightarrow \text{CPLD},$

although MFCQ is not equivalent to CRCQ. In practice, weaker constraint qualifications are preferred, since they provide stronger optimality conditions.
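At a concrete point, LICQ can be tested numerically by stacking the gradients of the active inequality constraints and of the equality constraints and checking their rank. A minimal sketch (the helper licq_holds is hypothetical and not part of the article):

```python
import numpy as np

def licq_holds(grads_active):
    """True if the given gradient vectors are linearly independent."""
    if not grads_active:
        return True               # no active constraints, nothing to check
    A = np.vstack(grads_active)
    return np.linalg.matrix_rank(A) == len(grads_active)

# In the example below, g(x) = x1 + x2 - 14 is active at x* = (7, 7)
# with gradient (1, 1); a single nonzero gradient is independent.
print(licq_holds([np.array([1.0, 1.0])]))   # True
```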

Pathological cases

The constraint qualifications serve to exclude points such as the origin in the following examples from consideration:

Optimality conditions

The Karush-Kuhn-Tucker conditions (also known as the KKT conditions) are necessary for the optimality of a solution in nonlinear optimization. They were first stated in 1939 in the unpublished master's thesis of William Karush. They became more widely known only in 1951, after a conference paper by Harold W. Kuhn and Albert W. Tucker. They are the generalization of the necessary condition $\nabla f(x^*) = 0$ of optimization problems without constraints.

Necessary condition and theorem of Karush-Kuhn-Tucker

In words, the theorem of Karush-Kuhn-Tucker says roughly: if $x^*$ is a feasible, regular and optimal point, then the gradient of the objective function can be represented as a positive linear combination of the gradients of the active constraints; see also the picture above.

Let $f\colon \mathbb{R}^n \to \mathbb{R}$ be the objective function and let $g_i\colon \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, m$, and $h_j\colon \mathbb{R}^n \to \mathbb{R}$, $j = 1, \dots, p$, be the constraint functions. All functions are once continuously differentiable. Let $x^*$ be a regular point, i.e. a point at which the regularity requirement (CQ) above is satisfied. If $x^*$ is a local optimum, then there exist constants $\mu_i \ge 0$ and $\lambda_j$ such that

$\nabla f(x^*) + \sum_{i=1}^{m} \mu_i \nabla g_i(x^*) + \sum_{j=1}^{p} \lambda_j \nabla h_j(x^*) = 0,$

$g_i(x^*) \le 0, \quad h_j(x^*) = 0, \quad \mu_i\, g_i(x^*) = 0 \quad \text{for all } i, j.$

Every point at which these conditions are satisfied is called a Karush-Kuhn-Tucker point (KKT point for short).

If $x$ is a point of the feasible region at which no constraints are active and, in particular, there are no equality constraints, then $\mu_i = 0$ for all $i$ and the above conditions reduce to the known necessary condition $\nabla f(x) = 0$ of unconstrained problems.
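For the example solved later in the article (maximize $x_1 x_2$ with $x_1 + x_2 \le 14$, optimum $x^* = (7, 7)$ with multiplier $\lambda = 7$), the KKT conditions can be verified directly. A sketch, using the sign convention of the Lagrange function $L = f - \lambda g$ employed below:

```python
import numpy as np

x = np.array([7.0, 7.0])                  # candidate point
lam = 7.0                                 # multiplier from the example
grad_f = np.array([x[1], x[0]])           # gradient of f(x) = x1*x2
grad_g = np.array([1.0, 1.0])             # gradient of g(x) = x1 + x2 - 14
g = x[0] + x[1] - 14.0

print(np.allclose(grad_f - lam * grad_g, 0.0))         # stationarity
print(g <= 0.0, lam >= 0.0, np.isclose(lam * g, 0.0))  # feasibility, sign,
                                                       # complementarity
```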

Sufficient conditions

If $x^*$ is a KKT point and the direction of steepest descent $-\nabla f(x^*)$ makes an angle of more than 90° with every direction of the tangent cone, then $x^*$ is a minimum point (for a maximum point, correspondingly, an angle of less than 90°). Mathematically: if

$\nabla f(x^*)^\top d > 0 \quad (\text{or} < 0) \quad \text{for all } d \in T(x^*) \setminus \{0\},$

then $x^*$ is a local minimum (or maximum). This is a sufficient optimality condition of first order. A sufficient optimality condition of second order for a KKT point $x^*$ states: if $x^*$ is a stationary point of the Lagrange function and the Hessian matrix of the Lagrange function is positive (negative) definite on all vectors of the tangent cone, then $x^*$ is a local minimum (or maximum). Mathematically:

$d^\top \nabla^2_{xx} L(x^*, \mu, \lambda)\, d > 0 \quad (\text{or} < 0) \quad \text{for all } d \in T(x^*) \setminus \{0\}.$

Here $T(x^*)$ is the tangent cone; see the section on regularity conditions, tangent cone and linearizing cone.
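For the example below, the second-order condition can be evaluated by hand: with $L = x_1 x_2 - \lambda (x_1 + x_2 - 14)$ the Hessian of the Lagrange function is constant, and with the constraint active the tangent cone directions satisfy $d_1 + d_2 = 0$. A sketch:

```python
import numpy as np

H = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # Hessian of the Lagrange function
d = np.array([1.0, -1.0])    # a tangent cone direction with d1 + d2 = 0
print(d @ H @ d)             # -2.0 < 0: negative definite on the tangent
                             # cone, hence a local maximum
```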

Theorems on the approximation methods

Example

The five methods compared above will now be explained by solving a simple example problem. The product of two positive real numbers is to be maximized while their sum is at most fourteen. Mathematically, we seek

$\max_{x_1, x_2 > 0} x_1 x_2$

with the constraint

$g(x_1, x_2) = x_1 + x_2 - 14 \le 0.$

It is intuitively clear that the constraint is active at the optimum, for otherwise a better solution could easily be found. The only stationary point of the objective function is the saddle point at $x_1 = x_2 = 0$, and a gradient-based search for the maximum therefore moves in the direction of a growing product. The constraint then lies in its path, so to speak, so that the algorithm "notices" it.

Elimination of degrees of freedom

From the constraint, which was recognized to be active, one determines

$x_2 = 14 - x_1,$

and the auxiliary function

$\bar f(x_1) = x_1 (14 - x_1)$

depends only on $x_1$, so that the solution can be calculated by curve sketching: $\bar f'(x_1) = 14 - 2 x_1 = 0$ yields $x_1 = x_2 = 7$ with the maximum product $\bar f(7) = 49$.

The elimination of degrees of freedom:

  • Satisfies the equality constraint exactly.
  • Reduces the number of unknowns.
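A sketch of the elimination in code, under the assumption, justified above, that the constraint is active:

```python
f_bar = lambda x1: x1 * (14.0 - x1)  # auxiliary function after eliminating x2
# f_bar'(x1) = 14 - 2*x1 = 0  =>  x1 = 7
x1 = 7.0
x2 = 14.0 - x1
print(x1, x2, f_bar(x1))             # 7.0 7.0 49.0
```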

Lagrange multiplier

Here, $\lambda$ times the constraint is subtracted from the objective function, where the factor $\lambda$ is the Lagrange multiplier and is treated as an additional unknown. The subtraction is chosen so that a violation of the constraint is penalized. The auxiliary or Lagrange function here is thus:

$L(x_1, x_2, \lambda) = x_1 x_2 - \lambda (x_1 + x_2 - 14).$

At the extremum, the derivatives with respect to all variables vanish:

$\frac{\partial L}{\partial x_1} = x_2 - \lambda = 0, \quad \frac{\partial L}{\partial x_2} = x_1 - \lambda = 0, \quad \frac{\partial L}{\partial \lambda} = -(x_1 + x_2 - 14) = 0,$

and the solution $x_1 = x_2 = \lambda = 7$ has been found. Because $\lambda = 7 \ge 0$, the Karush-Kuhn-Tucker condition is satisfied. The above system of equations can be written as a matrix equation:

$\begin{pmatrix} 0 & 1 & -1 \\ 1 & 0 & -1 \\ -1 & -1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \lambda \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -14 \end{pmatrix}.$

The method of Lagrange multipliers:

  • Satisfies the constraint exactly.
  • Introduces an additional unknown ($\lambda$).
  • Introduces vanishing diagonal elements into the system of equations, which are problematic when using the Cholesky decomposition.
  • Allows one to judge from the sign of $\lambda$ whether the constraint is active or not (positive when active).
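The matrix equation above can be solved directly, for example with NumPy; the zeros on the diagonal, which rule out a Cholesky decomposition of this system, are clearly visible:

```python
import numpy as np

A = np.array([[ 0.0,  1.0, -1.0],   # dL/dx1     = x2 - lambda = 0
              [ 1.0,  0.0, -1.0],   # dL/dx2     = x1 - lambda = 0
              [-1.0, -1.0,  0.0]])  # dL/dlambda = -(x1 + x2 - 14) = 0
b = np.array([0.0, 0.0, -14.0])
print(np.linalg.solve(A, b))        # [7. 7. 7.] -> x1 = x2 = lambda = 7
```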

Barrier function

With barrier functions, constraints can be satisfied with certainty, at the price that at the optimum the constraint is not fully exploited. In the search for the solution, $\mu$ times a barrier function is added to the objective function, for example:

$\bar f(x_1, x_2) = x_1 x_2 + \mu \ln(14 - x_1 - x_2).$

Here $\ln(14 - x_1 - x_2)$ is a logarithmic barrier function and $\mu > 0$ the barrier parameter. At the extremum, all derivatives vanish:

$\frac{\partial \bar f}{\partial x_1} = x_2 - \frac{\mu}{14 - x_1 - x_2} = 0, \quad \frac{\partial \bar f}{\partial x_2} = x_1 - \frac{\mu}{14 - x_1 - x_2} = 0,$

and therefore $x_1 = x_2$ as well as $2 x_1^2 - 14 x_1 + \mu = 0$, which has the solution

$x_1 = x_2 = \frac{7 + \sqrt{49 - 2\mu}}{2},$

defined for $\mu \le 24.5$. In an iterative search with the Newton-Raphson method one obtains the rule

$H\, \Delta x = -\nabla \bar f$

for the calculation of the increment $\Delta x$, with the gradient $\nabla \bar f$ and the Hessian matrix $H$ of the auxiliary function. The determinant of the Hessian matrix is:

$\det(H) = \frac{2 \mu}{(14 - x_1 - x_2)^2} - 1.$

It provides:

  • The constraint is strictly complied with.
  • In the limit $\mu \to 0$ the exact solution $x_1 = x_2 = 7$ is obtained.
  • For $\mu > 24.5$ there is no optimum point. In general, the contour lines of the auxiliary function do not coincide with those of the objective function.
  • When approaching the solution with $\mu \to 0$, the Hessian matrix becomes ill-conditioned.
  • In a gradient-based search it must be ensured that the increment in the unknowns is not so large that one accidentally lands on the other side of the barrier, where the barrier function is not defined in this example.
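A sketch tracing the closed-form barrier solution derived above as the barrier parameter decreases:

```python
import numpy as np

for mu in [10.0, 1.0, 0.1, 0.001]:
    x = (7.0 + np.sqrt(49.0 - 2.0 * mu)) / 2.0  # barrier solution x1 = x2
    print(mu, x)   # x approaches 7 from inside the feasible region
```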

Penalty method

With penalty methods, constraints can be approximately satisfied. In the search for the solution, $r$ times a penalty function, which is supposed to punish the violation of the constraint, is subtracted from the objective function:

$\bar f(x_1, x_2) = x_1 x_2 - r \langle x_1 + x_2 - 14 \rangle^2.$

Here $r > 0$ is the penalty parameter and $\langle g \rangle = \max(0, g)$. With the penalty function $\langle g \rangle$ the method is called exact, but that function is not differentiable; here $\langle g \rangle^2$ is to be used. At the extremum, all derivatives vanish:

$\frac{\partial \bar f}{\partial x_1} = x_2 - 2 r \langle x_1 + x_2 - 14 \rangle = 0, \quad \frac{\partial \bar f}{\partial x_2} = x_1 - 2 r \langle x_1 + x_2 - 14 \rangle = 0.$

With $x_1 + x_2 \le 14$ one gets $x_1 = x_2 = 0$, so the search has to start in the "forbidden" region $x_1 + x_2 > 14$. There one has the system of equations

$-2 r\, x_1 + (1 - 2 r)\, x_2 = -28 r, \quad (1 - 2 r)\, x_1 - 2 r\, x_2 = -28 r$

with the solution

$x_1 = x_2 = \frac{28 r}{4 r - 1},$

which tends to the solution $x_1 = x_2 = 7$ for $r \to \infty$.

It provides:

  • The constraint is only approximately fulfilled, ever better with increasing penalty parameter, but only because the system can be solved exactly here. In a numerical solution of the system of equations, rounding errors would lead to growing errors as the penalty parameter increases.
  • The reason for this is that with increasing penalty parameter the determinant of the normalized system of equations tends to zero; the problem becomes increasingly ill-conditioned.
  • A compromise must therefore be found between the conditioning of the system of equations and the accuracy of the fulfillment of the constraint.
  • Substituting $x_1$ and $x_2$ into the constraint shows how strongly it is violated.
  • No additional unknowns or vanishing diagonal elements are introduced, a solution exists for all penalty parameters, and the penalty method is therefore considered numerically robust.
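A sketch tracing the closed-form penalty solution derived above as the penalty parameter grows:

```python
for r in [1.0, 10.0, 100.0, 1000.0]:
    x = 28.0 * r / (4.0 * r - 1.0)   # penalty solution x1 = x2
    print(r, x, 2.0 * x - 14.0)      # x -> 7, constraint violation -> 0
```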

Extended or generalized Lagrangian method

The extended or generalized Lagrangian method (English: augmented Lagrangian method) uses the penalty function to compute approximations of the Lagrange multipliers. In the search for the solution, $\lambda$ times the constraint and $r$ times a penalty function (penalty idea) are subtracted from the objective function:

$\bar f(x_1, x_2) = x_1 x_2 - \lambda (x_1 + x_2 - 14) - r (x_1 + x_2 - 14)^2.$

At the extremum, all derivatives vanish:

$\frac{\partial \bar f}{\partial x_1} = x_2 - \lambda - 2 r (x_1 + x_2 - 14) = 0, \quad \frac{\partial \bar f}{\partial x_2} = x_1 - \lambda - 2 r (x_1 + x_2 - 14) = 0.$

With $\lambda = 7$ one gets the exact solution $x_1 = x_2 = 7$. Otherwise, one gets from these conditions the system of equations

$-2 r\, x_1 + (1 - 2 r)\, x_2 = \lambda - 28 r, \quad (1 - 2 r)\, x_1 - 2 r\, x_2 = \lambda - 28 r,$

which has the solution

$x_1 = x_2 = \frac{28 r - \lambda}{4 r - 1}.$

The numerical search for the extremum with the extended Lagrangian method proceeds iteratively: a start value $\lambda^0$ must be specified so that this point can be found, and after each converged solution the multiplier is updated, here with $\lambda^{k+1} = \lambda^k + 2 r\,(x_1 + x_2 - 14)$. With $\lambda^0 = 0$ and $r = 1$ the iteration yields

$\lambda\colon\ 0 \to 9.3333 \to 6.2222 \to 7.2593 \to 6.9136 \to 7.0288 \to \dots \to 7,$

so that $x_1$, $x_2$ and $\lambda$ converge to the exact solution $7$.

The extended Lagrangian method:

  • Satisfies the constraint with arbitrary precision.
  • Introduces no new unknowns and does not affect the conditioning of the system of equations.
  • Requires for this several converged solutions of the global problem (one for each update of $\lambda$).
  • Allows the activity of the constraint to be read off from the sign of $\lambda$.
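A sketch of the multiplier iteration described above with start value $\lambda^0 = 0$ and penalty parameter $r = 1$; the update $\lambda^{k+1} = \lambda^k + 2 r\, g(x^k)$ is the standard augmented-Lagrangian update assumed here:

```python
r, lam = 1.0, 0.0
for k in range(12):
    x = (28.0 * r - lam) / (4.0 * r - 1.0)  # inner solution for fixed lambda
    lam += 2.0 * r * (2.0 * x - 14.0)       # multiplier update from g(x)
    print(k, x, lam)
# x and lambda converge to 7; the constraint is met with arbitrary precision
```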