SNOPT Description of the SQP method

From TomWiki

Jump to: navigation, search

Notice.png

This page is part of the SNOPT Manual. See SNOPT.

Here we summarize the main features of the SQP algorithm used in SNOPT and introduce some terminology used in the description of the subroutine and its arguments. The SQP algorithm is fully described by Gill, Murray and Saunders.

Contents

Constraints and slack variables

The upper and lower bounds on the m components of f (x) and AL x are said to define the general constraints of the problem. SNOPT converts the general constraints to equalities by introducing a set of slack variables. For example, the linear constraint 5 \le 2x_1 + 3x_2 \le
+\infty is replaced by 2x_1 + 3x_2 - s_1 = 0\, together with the bounded slack 5\le s_1 \le +\infty. SparseNP can therefore be written in the equivalent form


\min_{x,s} \quad f_0(x)


\text{subject to} \quad 
\left(\begin{array}{c}
f(x)\\
A_Lx
\end{array}\right) - s = 0,
l \le \left(\begin{array}{c}
x\\
s
\end{array}\right)
\le u.

The general constraints become the equalities f (x) - sN = 0 and ALx - sL = 0, where sL and sN are known as the linear and nonlinear slacks.

Major iterations

The basic structure of the SQP algorithm involves major and minor iterations. The major iterations generate a sequence of iterates {xk} that satisfy the linear constraints and converge to a point that satisfies the first-order

conditions for optimality. At each iterate a QP subproblem is used to generate a search direction towards the next iterate xk+1 . The constraints of the subproblem are formed from the linear constraints ALx - sL = 0 and the nonlinear constraint linearization

f(xk) + f'(xk)(xxk) − sN = 0,

where f'(xk ) denotes the Jacobian matrix, whose elements are the first derivatives of f (x) evaluated at xk . The QP constraints therefore comprise the m linear constraints


\begin{array}{ccc}
f'(x_k)x - s_N & = & f(x_k) + f(x_k)'x_k,\\
A_LX - s_L & = & 0,
\end{array}

where x and s are bounded above and below by u and l as before. If the m × n matrix A and m-vector b are defined as


A = \left(\begin{array}{c}
f'(x_k)\\
A_L
\end{array}\right) \quad and \quad b = \left(\begin{array}{c}
-f(x_k)+f'(x_k)x_k\\
0
\end{array}\right),

then the QP subproblem can be written as


\min_{x,s} \quad q(x) \quad
\text{subject to} \quad Ax - s = b, \quad l \le \left(\begin{array}{c}x \\ s\end{array}\right) \le u,

where q(x) is a quadratic approximation to a modified Lagrangian function.

Minor iterations

Solving the QP subproblem is itself an iterative procedure. The iterations of the QP solver are the minor iterations of the SQP method. At each minor iteration, the constraints Ax - s = b are (conceptually) partitioned into the form

BxB + SxS + NxN = b,

where the basis matrix B is square and nonsingular. The elements of xB, xS and xN are called the basic, superbasic and nonbasic variables respectively; they are a permutation of the elements of x and s. At a QP solution, the basic and superbasic variables will lie somewhere between their bounds, while the nonbasic variables will normally be equal to one of their bounds. At each iteration, xS is regarded as a set of independent variables that are free to move in any desired direction, namely one that will improve the value of the QP objective (or the sum of infeasibilities). The basic variables are then adjusted in order to ensure that (x, s) continues to satisfy Ax - s = b. The number of superbasic variables (nS , say) therefore indicates the number of degrees of freedom remaining after the constraints have been satisfied. In broad terms, nS is a measure of how nonlinear the problem is. In particular, nS will always be zero for LP problems.

If it appears that no improvement can be made with the current definition of B, S and N , a nonbasic variable is selected to be added to S, and the process is repeated with the value of nS increased by one. At all stages, if a basic or superbasic variables encounters one of its bounds, the variables is made nonbasic and the value of nS is decreased by one.

Associated with each of the m equality constraints Ax - s = b are the dual variables p. Similarly, each variable in (x, s) has an associated reduced gradient dj . The reduced gradients for the variables x are the quantities g-ATπ, where g is the gradient of the QP objective, and the reduced gradients for the slacks are the dual variables p. The QP subproblem is optimal if dj = 0 for all nonbasic variables at their lower bounds, dj = 0 for all nonbasic variables at their upper bounds, and dj = 0 for other variables, including superbasics. In practice, an approximate QP solution (\hat{x}_k, \hat{t}_k, \hat{\pi}_k) is found by relaxing these conditions.

The merit function

After a QP subproblem has been solved, new estimates of the SparseNP solution are computed using a line search on the augmented Lagrangian merit function

f(x,s,π) = f0(x) − πT(f(x) − sN) + (f(x) − SN)TD(f(x) − SN)

where D is a diagonal matrix of penalty parameters (Dii >= 0). If (xk, sk , pk) denotes the current solution estimate and (\hat{x}_k, \hat{t}_k, \hat{pi}_k) denotes the QP solution, the line search determines a step \alpha_k (0 < \alpha_k \le 1) such that the new point


\left(\begin{array}{c}
x_{k+1}\\s_{k+1}\\\pi_{k+1}
\end{array}\right)
= \left(\begin{array}{c}
x_k\\s_k\\\pi_k
\end{array}\right) + \alpha_k \left(\begin{array}{c}
\hat{x}_k-x_k\\\hat{s}_k-s_k\\\hat{\pi}_k-\pi_k
\end{array}\right)

gives a sufficient decrease in the merit function. When necessary, the penalties in D are increased by the minimum-norm perturbation that ensures descent for M .

As in NPSOL, sN is adjusted to minimize the merit function as a function of s prior to the solution of the QP subproblem.

Treatment of constraint infeasibilities

SNOPT makes explicit allowance for infeasible constraints. First, infeasible linear constraints are detected by solving the linear program (FLP)


\min_{x,v,w} \quad e^T(v + w)


\text{subject to} \quad l \le \left(\begin{array}{c}
x\\A_Lx - v + w
\end{array}\right) \le u, \quad v \ge 0, \quad w \ge 0,

where e is a vector of ones, and the nonlinear constraint bounds are temporarily excluded from l and u. This is equivalent to minimizing the sum of the general linear constraint violations subject to the bounds on x. (The sum is the £1 -norm of the linear constraint violations. In the linear programming literature, the approach is called elastic programming.)

The linear constraints are infeasible if the optimal solution of FLP has v ≠ 0 or w ≠ 0. SNOPT then terminates without computing the nonlinear functions.

Otherwise, all subsequent iterates satisfy the linear constraints. (Such a strategy allows linear constraints to be used to define a region in which the functions can be safely evaluated.) SNOPT proceeds to solve SparseNP as given, using search directions obtained from the sequence of QP subproblems.

If a QP subproblem proves to be infeasible or unbounded (or if the dual variables π for the nonlinear constraints become large), SNOPT enters "elastic" mode and thereafter solves the problem (NP(γ))


\min_{x,v,w} \quad f_0(x) + \gamma e^T(v + w)


\text{subject to} \quad l \le \left(\begin{array}{c}
x\\f(x) - v + w \\A_Lx
\end{array}\right)
\le u \quad v \ge 0, \quad w \ge 0

where γ is a nonnegative parameter (the elastic weight ), and f0(x) + γeT(v + w) is called a composite objective (the \ell_1 penalty function for the nonlinear constraints).

The value of γ may increase automatically by multiples of 10 if the optimal v and w continue to be nonzero. If γ is sufficiently large, this is equivalent to minimizing the sum of the nonlinear constraint violations subject to the linear constraints and bounds. A similar \ell_1 formulation of SparseNP is fundamental to the \ell_1 QP algorithm of Fletcher. See also Conn.

The initial value of γ is controlled by the optional parameters Elastic mode and Elastic weight.

Personal tools