MINOS Solver Options

From TomWiki
Revision as of 09:36, 14 August 2013 by Bjorn (talk | contribs) (→‎Options for All Problems)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

Options for All Problems

The following options have the same purpose for all problems, whether they linear or nonlinear.

Check frequency k Default = 60

Every k-th minor iteration after the most recent basis factorization, a numerical test is made to see if the current solution x satisfies the general linear constraints (including linearized nonlinear constraints, if any). The constraints are of the form Ax + s = b, where s is the set of slack variables. To perform the numerical test, the residual vector r = b - Ax - s is computed. If the largest component of r is judged to be too large, the current basis is refactorized and the basic variables are recomputed to satisfy the general constraints more accurately.

Check frequency 1 is useful for debugging purposes, but otherwise this option should not be needed.

Cycle limit l Default = 1
Cycle print p Default = 1
Cycle tolerance t Default = 0.0
Phantom columns c Default = 0
Phantom elements e Default = 0
Debug level l Default = 0

This causes various amounts of information to be output to the Print file. Most debug levels are not helpful to normal users, but they are listed here for completeness.

l = 0 No debug output.
l = 2 (or more) Output from m5setx showing the maximum residual after a row check.
l = 40 Output from lu8rpc (which updates the LU factors of the basis matrix), showing the position of the last nonzero in the transformed incoming column.
l = 50 Output from lu1mar (which computes the LU factors each refactorization), showing each pivot row and column and the dimensions of the dense matrix involved in the associated elimination.
l = 100 Output from m2bfac and m5log listing the basic and superbasic variables and their values at every iteration.
Expand frequency k Default = 10000

This option is part of an anti-cycling procedure designed to guarantee progress even on highly degenerate problems.

"Cycling" can occur only if zero steplengths are allowed. Here, the strategy is to force a positive step at every iteration, at the expense of violating the bounds on the variables by a small amount. Suppose that the Feasibility tolerance is d. Over a period of k iterations, the tolerance actually used by MINOS increases from 0.5δ to δ (in steps of 0.5δ/k).

Every k iterations, or when feasibility and optimality are first detected, a resetting procedure eliminates any infeasible nonbasic variables. Some additional iterations may be needed to restore feasibility and optimality. Increasing k reduces that likelihood, but it gives less freedom to choose large pivot elements during basis changes. (See Pivot tolerance.)

Factorization frequency k Default = 100 (LP) or 50 (NLP)

With linear programs, most iterations cause a basis change, in which one column of the basis matrix B is replaced by another. The LU factors of B must be updated accordingly. At most k updates are performed before the current B is factorized directly.

Each update tends to add nonzeros to the LU factors. Since the updating method is stable, k mainly affects the efficiency of minor iterations, rather than stability.

High values of k (such as 100 or 200) may be more efficient on "dense" problems, when the factors of B tend to have two or three times as many nonzeros as B itself. Lower values of k may be more efficient on problems that are very sparse.

Feasibility tolerance t Default = 1.0e-6

This sets the feasibility tolerance δfea = t (see #Options for Linear Programming). A variable or constraint is considered feasible if it does not lie outside its bounds by more than δfea.

MINOS first attempts to satisfy the linear constraints and bounds. If the sum of infeasibilities cannot be reduced to zero, the problem is declared infeasible. Let sinf be the corresponding sum of infeasibilities. If sinf is quite small, it may be appropriate to raise t by a factor of 10 or 100. Otherwise, some error in the data should be suspected. If sinf is not small, there may be other points that have a significantly smaller sum of infeasibilities. MINOS does not attempt to find a solution that minimizes the sum.

For Scale option 1 or 2, feasibility is defined in terms of the scaled problem (since it is then more likely to be meaningful). The final unscaled solution can therefore be infeasible by an unpredictable amount, depending on the size of the scale factors. Precautions are taken so that in a "feasible solution" the original variables will never be infeasible by more than 0.1. Values that large are very unlikely.

Iterations limit k Default = 3m

MINOS stops after k iterations even if the simplex method has not yet reached a solution. If k = 0, no iterations are performed, but the starting point is tested for both feasibility and optimality.

LU factor tolerance t1 Default = 100.0 (LP) or 5.0 (NLP)
LU update tolerance t2 Default = 10.0 (LP) or 5.0 (NLP)

These tolerances affect the stability and sparsity of the basis factorization B = LU during refactorization and updating, respectively. They must satisfy t1 , t2 >= 1.0. The matrix L is a product of matrices of the form

where the multipliers µ satisfy |µ| = ti . Values near 1.0 favor stability, while larger values favor sparsity. The default values usually strike a good compromise. For large and relatively dense problems, t1 = 10.0 or 5.0 (say) may give a useful improvement in stability without impairing sparsity to a serious degree.

For certain very regular structures (e.g., band matrices) it may be necessary to reduce t1 and/or t2 in order to achieve stability. For example, if the columns of A include a submatrix of the form

one should set both and to values in the range

LU density tolerance t3 Default = 0.5
LU singularity tolerance t4 Default = E0.673.25e-11

The density tolerance t3 is used during LU factorization of the basis matrix. Columns of L and rows of U are formed one at a time, and the remaining rows and columns of the basis are altered appropriately. At any stage, if the density of the remaining matrix exceeds t3, the Markowitz strategy for choosing pivots is altered to reduce the time spent searching for each remaining pivot. Raising the density tolerance towards 1.0 may give slightly sparser LU factors, with a slight increase in factorization time.

The singularity tolerance t4 helps guard against ill-conditioned basis matrices. When the basis is refactorized, the diagonal elements of U are tested as follows: if or , the j-th column of the basis is replaced by the corresponding slack variable. (This is most likely to occur after a restart, or at the start of a major iteration.)

In some cases, the Jacobian matrix may converge to values that make the basis exactly singular. (For example, a whole row of the Jacobian could be zero at an optimal solution.) Before exact singularity occurs, the basis could become very ill-conditioned and the optimization could progress very slowly (if at all). Setting t4 = 1.0e-5, say, may help cause a judicious change of basis.

Maximize Default
Minimize Default

This specifies the required direction of optimization.

Multiple price k Default = 1

It is not normal to set k > 1 for linear programs, as it causes MINOS to use the reduced-gradient method rather than the simplex method. The number of iterations, and the total work, are likely to increase.

The reduced-gradient iterations do not correspond to the very efficient multiple pricing "minor iterations" carried out by certain commercial linear programming systems. Such systems require storage for k dense vectors of dimension m, so that k is usually limited to 5 or 6. In MINOS, the total storage requirements increase only slightly with k. (The Superbasics limit must be at least k.)

Optimality tolerance t Default = 1.0e-6

This is used to judge the size of the reduced gradients , where gj is the gradient of the objective function corresponding to the j-th variable, aj is the associated column of the constraint matrix (or Jacobian), and Πis the set of dual variables.

By construction, the reduced gradients for basic variables are always zero. Optimality is declared if the reduced gradients for nonbasic variables at their lower or upper bounds satisfy or respectively, and if for superbasic variables.

In those tests, is a measure of the size of the dual variables. It is included to make the tests independent of a large scale factor on the objective function. The quantity actually used is defined by

so that only scale factors larger than 1.0 are allowed for. If the objective is scaled down to be very small, the optimality test reduces to comparing dj against t.

Partial price p Default = 10 (LP) or 1 (NLP)

This parameter is recommended for large problems that have significantly more variables than constraints. It reduces the work required for each "pricing" operation (when a nonbasic variable is selected to become basic or superbasic).

When p = 1, all columns of the constraint matrix ( A I ) are searched. Otherwise, A and I are partitioned to give p roughly equal segments Aj , Ij (j = 1 to p). If the previous pricing search was successful on Aj , Ij, the next search begins on the segments Aj+1 , Ij+1 . (Subscripts are modulo p.)

If a reduced gradient is found that is larger than some dynamic tolerance, the variable with the largest such reduced gradient (of appropriate sign) is selected to become superbasic. (Several may be selected if multiple pricing has been specified.) If nothing is found, the search continues on the next segments Aj+2 , Ij+2 , and so on.

Partial price t (or t/2 or t/3) may be appropriate for time-stage models having t time periods.

Pivot tolerance t Default = E2/3 ≈ 10-11

Broadly speaking, the pivot tolerance is used to prevent columns entering the basis if they would cause the basis to become almost singular. When x changes to x + αp for some search direction p, a "ratio test" is used to determine which component of x reaches an upper or lower bound first. The corresponding element of p is called the pivot element.

For linear problems, elements of p are ignored (and therefore cannot be pivot elements) if they are smaller than the pivot tolerance t. For nonlinear problems, elements smaller than are ignored.

It is common for two or more variables to reach a bound at essentially the same time. In such cases, the Feasibility tolerance provides some freedom to maximize the pivot element and thereby improve numerical stability. Excessively small Feasibility tolerances should therefore not be specified.

To a lesser extent, the Expand frequency also provides some freedom to maximize the pivot element. Excessively large Expand frequencies should therefore not be specified.

Scale option l Default = 2 (LP) or 1 (NLP)
Scale Yes
Scale No
Scale linear variables Same as Scale option 1
Scale nonlinear variables Same as Scale option 2
Scale all variables Same as Scale option 2
Scale tolerance t Default = 0.9
Scale, Print
Scale, Print, Tolerance t

Three scale options are available as follows:

l = 0 No scaling. If storage is at a premium, this option saves m + n words of workspace.
l = 1 possible to 1.0. This sometimes improves the performance of the solution procedures.
l = 2 helpful if the right-hand side b or the solution x is large. This takes into account columns of ( A I ) that are fixed or have positive lower bounds or negative upper bounds.

Scale Yes sets the default scaling. (Caution : If all variables are nonlinear, Scale Yes unexpectedly does nothing, because there are no linear variables to scale.) Scale No suppresses scaling (equivalent to Scale option 0).

If nonlinear constraints are present, Scale option 1 or 0 should generally be tried at first. Scale option 2 gives scales that depend on the initial Jacobian, and should therefore be used only if (a) a good starting point is provided, and (b) the problem is not highly nonlinear.

Scale, Print causes the row-scales r(i) and column-scales c(j) to be printed. The scaled matrix coefficients are , and the scaled bounds on the variables and slacks are , where if .

All forms except Scale option may specify a tolerance t, where 0 < t < 1 (for example: Scale, Print, Tolerance = 0.99). This affects how many passes might be needed through the constraint matrix. On each pass, the scaling procedure computes the ratio of the largest and smallest nonzero coefficients in each column:

If is less than t times its previous value, another scaling pass is performed to adjust the row and column scales. Raising t from 0.9 to 0.99 (say) usually increases the number of scaling passes through A. At most 10 passes are made.

If a Scale option has not already been specified, Scale, Print or Scale tolerance both set the default scaling.

Weight on linear objective w Default = 0.0

This keyword invokes the so-called composite objective technique, if the first solution obtained is infeasible, and if the objective function contains linear terms. While trying to reduce the sum of infeasibilities, the method also attempts to optimize the linear objective. At each infeasible iteration, the objective function is defined to be

where σ = 1 for minimization, σ = -1 for maximization, and c is the linear objective. If an "optimal" solution is reached while still infeasible, w is reduced by a factor of 10. This helps to allow for the possibility that the initial w is too large. It also provides dynamic allowance for the fact that the sum of infeasibilities is tending towards zero.

The effect of w is disabled after 5 such reductions, or if a feasible solution is obtained.

The Weight option is intended mainly for linear programs. It is unlikely to be helpful on nonlinear problems.