CGO ego

From TomWiki
Revision as of 07:11, 4 March 2014 by Ango (talk | contribs)


This page is part of the CGO Manual. See CGO Manual.

Purpose

Solve general constrained mixed-integer global black-box optimization problems with costly objective functions.

The optimization problem is of the following form:

 min   f(x)
 s.t.  x_L <=  x   <= x_U
       b_L <= A x  <= b_U
       c_L <= c(x) <= c_U
       x_j integer, for all j in I

where f(x) is a real scalar; x, x_L, x_U are d-vectors; the linear constraints are defined by the matrix A and the vectors b_L, b_U; and the nonlinear constraint vector c(x) is bounded by c_L, c_U. The variables x_j, j in I, are restricted to be integers, where I is an index subset of {1, ..., d}, possibly empty. It is assumed that the function f(x) is continuous with respect to all variables, even if some variables are required to take only integer values. Otherwise it would not make sense to do the surrogate modeling of f(x) used by all CGO solvers.

f(x) is assumed to be a costly function, while c(x) is assumed to be cheap to compute. Any costly constraints can be treated by adding penalty terms to the objective function in the following way:

 p(x) = f(x) + sum_j w_j * max( 0, c_j(x) - c_U,j , c_L,j - c_j(x) )

where weighting parameters w_j have been added. The user then returns p(x) instead of f(x) to the CGO solver.
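As a sketch only, such a penalized objective could look like the function below. Here costly_f, costly_c, and the inputs w, cL, cU are placeholders for the user's own costly problem, weights and bounds; they are not part of the TOMLAB API.

```matlab
% Sketch of a penalized objective for costly constraints.
% costly_f, costly_c, w, cL and cU are hypothetical user-supplied items.
function p = pen_f(x, cL, cU, w)
    f  = costly_f(x);                       % expensive objective f(x)
    cc = costly_c(x);                       % expensive constraint vector
    viol = max(0, max(cc - cU, cL - cc));   % elementwise violation of cL <= cc <= cU
    p  = f + sum(w .* viol);                % p(x), returned to the CGO solver instead of f(x)
end
```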

Calling Syntax

Result=ego(Prob,varargin) 
Result = tomRun('ego', Prob);

Description of Inputs

Problem description structure. The following fields are used:

Field Description
Name Name of the problem. Used for security when doing warm starts.
FUNCS.f Name of function to compute the objective function.
FUNCS.c Name of function to compute the nonlinear constraint vector.
x_L Lower bounds on the variables. Must be finite.
x_U Upper bounds on the variables. Must be finite.
b_U Upper bounds for the linear constraints.
b_L Lower bounds for the linear constraints.
A Linear constraint matrix.
c_L Lower bounds for the nonlinear constraints.
c_U Upper bounds for the nonlinear constraints.
WarmStart Set true (non-zero) to load data from a previous run from cgoSave.mat and resume optimization from where the last run ended. If Prob.CGO.WarmStartInfo has been defined through a call to WarmDefGLOBAL, this field is used instead of the cgoSave.mat file. All CGO solvers use the same mat-file and structure field and can read the output of one another.
MaxCPU Maximal CPU Time (in seconds) to be used.
user User field used to send information to low-level functions.
PriLevOpt Print level. 0 = silent. 1 = Summary, 2 = Printing each iteration, 3 = Info about local / global solution, 4 = Progress in x.
PriLevSub Print Level in subproblem solvers.
optParam Structure with optimization parameters. The following fields are used:
MaxFunc Maximal number of costly function evaluations, default 300 for rbfSolve and arbfMIP, and default 200 for ego. MaxFunc must be <= 5000. If WarmStart = 1 and MaxFunc <= nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc.
IterPrint Print one information line each iteration, and the new x tried. Default IterPrint = 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible).
fGoal Goal for function value, not used if inf or empty.
eps_f Relative accuracy for function value, fTol == eps_f. Stop if |f - fGoal| <= |fGoal| * fTol when fGoal ~= 0, and stop if |f - fGoal| <= fTol when fGoal == 0.
bTol Linear constraint tolerance.
cTol Nonlinear constraint tolerance.
MaxIter Maximal number of iterations used in the local optimization on the response surface in each step. Default 1000, except for pure IP problems, then max(GO.MaxFunc, MaxIter).
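As an illustrative sketch, these options are set as fields in Prob.optParam before calling the solver; the values below are arbitrary examples, not recommended defaults.

```matlab
Prob.optParam.MaxFunc   = 150;    % at most 150 costly f(x) evaluations (<= 5000)
Prob.optParam.IterPrint = 1;      % one information line per iteration
Prob.optParam.eps_f     = 1e-4;   % relative accuracy used in the fGoal test
Prob.optParam.bTol      = 1e-6;   % linear constraint tolerance
Prob.optParam.cTol      = 1e-5;   % nonlinear constraint tolerance
Prob.optParam.MaxIter   = 1000;   % iterations for each local surface search
```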
CGO Structure (Prob.CGO) with parameters concerning global optimization options.

The following general fields in Prob.CGO are used:

Percent Type of strategy to get the initial sampled values:

Percent   Experimental Design             ExD

Corner strategies:
900       All corners                     1
997       x_L + x_U + adjacent corners    2
998       x_U + adjacent corners          3
999       x_L + adjacent corners          4

Deterministic strategies:
0         User given initial points       5
94        DIRECT solver glbFast           6
95        DIRECT solver glcFast           6
96        DIRECT solver glbSolve          6
97        DIRECT solver glcSolve          6
98        DIRECT solver glbDirect         6
99        DIRECT solver glcDirect         6

Latin based sampling:
1         Maximin LHD 1-norm              7
2         Maximin LHD 2-norm              8
3         Maximin LHD Inf-norm            9
4         Minimal Audze-Eglais            10
5         Minimax LHD (only 2 dim)        11
6         Latin Hypercube                 12
7         Orthogonal Sampling             13

Random strategies (pp in %):
1pp       Circle surrounding              14
2pp       Ellipsoid surrounding           15
3pp       Rectangle surrounding           16

Negative values of Percent result in constrained versions of the experimental design methods 7-16. It means that all points sampled are feasible with respect to all given constraints.

For ExD 5,6-12,14-16 user defined points are used.

nSample Number of sample points to be used in initial experimental design. nSample is used differently dependent on the value of Percent:
ExD      < 0         = 0      > 0             []
1        2^d
6        |n|
7-11     d + 1       d + 1    max(d + 1, n)   (d + 1)(d + 2)/2
12       LATIN(k)
13       |n|
14-16    d + 1

where LATIN = [21 21 33 41 51 65 65] and k = |nSample|. Otherwise nSample as input does not matter.

Description of the experimental designs:

ExD 1, All Corners. The initial points are the corner points of the box given by Prob.x_L and Prob.x_U. Generates 2^d points, which results in too many points when the dimension d is high.

ExD 2, Lower and upper corner points + adjacent points. Initial points are 2*d + 2 corners: the lower left corner x_L and its d adjacent corners x_L + (x_U(i) - x_L(i)) * e_i, i = 1, ..., d, and the upper right corner x_U and its d adjacent corners x_U - (x_U(i) - x_L(i)) * e_i, i = 1, ..., d.

ExD 3. Initial points are the upper right corner x_U and its d adjacent corners x_U - (x_U(i) - x_L(i)) * e_i, i = 1, ..., d.

ExD 4. Initial points are the lower left corner x_L and its d adjacent corners x_L + (x_U(i) - x_L(i)) * e_i, i = 1, ..., d.

ExD 5. User given initial points, given as a matrix in CGO.X. Each column is one sampled point. If d = length(Prob.x_L), then size(X,1) = d and size(X,2) >= d + 1 is needed. CGO.F should be defined as empty, or contain a vector of corresponding f(x) values. Any CGO.F value set as NaN will be computed by the solver routine.
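A user-given design for ExD 5 might be set up as in the following sketch; the particular sample points (midpoint plus random interior points) are arbitrary illustrations.

```matlab
d = length(Prob.x_L);                                 % problem dimension (assumes d >= 2)
Prob.CGO.Percent = 0;                                 % ExD 5: user given initial points
mid = 0.5*(Prob.x_L + Prob.x_U);                      % midpoint of the box
R = Prob.x_L + rand(d, d-2).*(Prob.x_U - Prob.x_L);   % d-2 random interior points
Prob.CGO.X = [Prob.x_L, Prob.x_U, mid, R];            % d+1 columns, one per sampled point
Prob.CGO.F = [];                                      % empty: solver computes all f(x) values
```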

ExD 6. Use deterministic global optimization methods to find the initial design. Currently available methods (all DIRECT methods), dependent on the value of Percent:

99 = glcDirect, 98 = glbDirect, 97 = glcSolve, 96 = glbSolve, 95 = glcFast, 94 = glbFast.

ExD 7-11. Optimal Latin Hypercube Designs (LHD) with respect to different norms. The following norms and designs are available, dependent on the value of Percent:

1 = Maximin 1-Norm, 2 = Maximin 2-Norm, 3 = Maximin Inf-Norm, 4 = Audze-Eglais Norm, 5 = Minimax 2-Norm.

All designs taken from: http://www.spacefillingdesigns.nl/

Constrained versions will try bigger and bigger designs, up to M = max(10 * d, nTrial) different designs, stopping when nSample feasible points have been found.

ExD 12. Latin hypercube space-filling design. For nSample < 0, k = |nSample| should in principle be the problem dimension. The number of points

sampled is:

k : 2 3 4 5 6 > 6

Points : 21 33 41 51 65 65

The call made is: X = daceInit(abs(nSample),Prob.x_L,Prob.x_U);

Set nSample = [] to get (d+1)*(d+2)/2 sampled points:

d : 1 2 3 4 5 6 7 8 9 10

Points : 3 6 10 15 21 28 36 45 55 66

This is a more efficient number of points to use.

If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points.

The constrained version of the Latin hypercube design only keeps points that fulfill the linear and nonlinear constraints. The algorithm will try up to M = max(10 * d, nTrial) points, stopping when it has found nSample feasible points (d + 1 points if nSample < 0).

ExD 13. Orthogonal Sampling, LH with subspace density demands.

ExD 14-16. Random strategies, the |Percent| value gives the percentage size of an ellipsoid, circle or rectangle around the so far sampled points that new points are not allowed in. Range 1%-50%. Recommended values 10% - 20%.

If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points.

X,F,CX The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design.
X A matrix of initial x values. One column for every x value. If ExD == 5, size(X,2) >= dim(x)+1 needed.
F A vector of initial f (x) values. If any element is set to NaN it will be computed.
CX Optionally a matrix of nonlinear constraint c(x) values. If nonempty, then size(CX,2) == size(X,2). If any element is set as NaN, the vector c(x) = CX(:,i) will be recomputed.
RandState If = 0, rand('state', RandState) is set to initialize the pseudo-random generator. If < 0, rand('state', 100 * clock) is set to give a new set of random values each run. If isnan(RandState), the random state is not initialized. RandState will influence if a stochastic initial experimental design is applied, see input Percent and nSample. RandState will also influence if using the multiMin solver, but the random state seed is not reset in multiMin. The state of the random generator is saved in the warm start output rngState, and the random generator is reinitialized with this state if warm start is used. Default RandState = 0.
AddMP If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999.
nTrial For experimental design CLH, the method generates M = max(10 * d, nTrial) trial points, and evaluate them until nSample feasible points are found. In the random designs, nTrial is the maximum number of trial points randomly generated for each new point to sample.
CLHMethod Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added.

1 - Take the sampled infeasible points in order.

2 - Take a random sample of the infeasible points.

3 - Use points with lowest constraint error (cErr).

SCALE 0 - Original search space (default if any integer values).

1 - Transform search space to unit cube (default if no integers).

REPLACE 0 - No replacement, default for constrained problems.

1 - Large function values are replaced by the median.

> 1 - Large values Z are replaced by new values. The replacement is defined as Z := FMAX + log10(Z - FMAX + 1), where FMAX = 10^REPLACE if min(F) < 0, and FMAX = 10^(ceil(log10(min(F)))+REPLACE) if min(F) >= 0. A new replacement is computed in every iteration, because min(F) may change. Default REPLACE = 5 if there are no linear or nonlinear constraints.
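Written out in MATLAB, the replacement rule reads as the following sketch, where F is assumed to be a vector of sampled function values:

```matlab
REPLACE = 5;                                  % default when unconstrained
if min(F) < 0
    FMAX = 10^REPLACE;
else
    FMAX = 10^(ceil(log10(min(F))) + REPLACE);
end
F_m = F;                                      % replaced function values
big = F > FMAX;                               % the "large" values Z
F_m(big) = FMAX + log10(F(big) - FMAX + 1);   % compressed toward FMAX
```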

LOCAL 0 - No local searches after global search. If RBF surface is inaccurate, might be an advantage.

1 - Local search from best points after global search. If equal best function values, up to 20 local searches are done.

SMOOTH 1 - The problem is smooth enough for local search using numerical gradient estimation methods (default).

0 - The problem is nonsmooth or noisy, and local search methods using numerical gradient estimation are likely to produce garbage search directions.

globalSolver Global optimization solver used for subproblem optimization. Default glcCluster (SMOOTH=1) or glcDirect (SMOOTH=0). If the globalSolver is glcCluster, the fields Prob.GO.maxFunc1, Prob.GO.maxFunc2, Prob.GO.maxFunc3, Prob.GO.localSolver, Prob.GO.DIRECT and other fields set in Prob.GO are used. See the help for these parameters in glcCluster.
localSolver Local optimization solver used for subproblem optimization. If not defined, the TOMLAB default constrained NLP solver is used.

- Special EGO algorithm parameters in Prob.CGO -

EGOAlg Main algorithm in the EGO solver (default EGOAlg == 1)

=1 Run expected improvement steps (modN=0,1,2,...). If there is no f(x) improvement, use the DACE surface minimum (modN=-1) in one step.

=2 Run expected improvement steps (modN=0) until ExpI/|yMin| < TolExpI for 3 successive steps (modN=1,2,3) without f(x) improvement (fRed = 0), where yMin is fMin transformed by TRANSFORM. After 2 such steps (when modN=2), 1 step using the DACE surface minimum (modN=-1) is tried. If then fRed > 0, reset to modN=0 steps.

pEst Norm parameters p, fixed or estimated; see also p0, pLow, pUpp (default pEst = 0).

0 = Fixed constant p-value for all components (default, p0=1.99).

1 = Estimate one p-value valid for all components.

> 1 = Estimate d p-parameters, one for each component.

p0 Fixed p-value (pEst==0, default = 1.99) or initial p-value (pEst == 1, default 1.9) or d-vector of initial p-values (pEst > 1, default 1.9*ones(d,1))
pLow Lower bound on p.

If pEst == 0, not used

if pEst == 1, lower bound on p-value (default 1.0)

if pEst > 1, lower bounds on p (default ones(d,1))

pUpp Upper bound on p.

If pEst == 0, not used

if pEst == 1, upper bound on p-value (default 2.0)

if pEst > 1, upper bounds on p (default 2*ones(d,1))
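For example, estimating one p parameter per component within tightened bounds could be set up as in this sketch; the bound values are illustrative.

```matlab
d = length(Prob.x_L);
Prob.CGO.pEst = 2;                 % > 1: estimate d p-parameters, one per component
Prob.CGO.p0   = 1.9*ones(d,1);     % initial p-values
Prob.CGO.pLow = 1.5*ones(d,1);     % lower bounds (default ones(d,1))
Prob.CGO.pUpp = 2.0*ones(d,1);     % upper bounds (default 2*ones(d,1))
```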

TRANSFORM Function value transformation.

0 - No transformation made.

1 - Median value transformation. Use REPLACE instead.

2 - log(y) transformation made.

3 - -log(-y) transformation made.

4 - -1/y transformation made.

By default, ego computes the best possible transformation from the initial set of data. Note! No check is made on illegal y values if the user gives TRANSFORM.

EITRANSFORM Transformation of expected improvement function (default 1).

= 0 No transformation made.

= 1 -log(-f) transformation made.

= 2 -1/f transformation made.

TolExpI Convergence tolerance for expected improvement (default 1e-6).
SAMPLEF Sample criterion function:

0 = Expected improvement (default)

1 = Kushner's criterion (related option: KEPS)

2 = Lower confidence bounding (related option: LCBB)

3 = Generalized expected improvement (related option: GEIG)

4 = Maximum variance

5 = Watson and Barnes 2

KEPS The ε parameter in Kushner's criterion (default: -0.01).

If KEPS > 0, then ε = KEPS.

If KEPS < 0, then ε = |KEPS| * fMin.

GEIG The exponent g in the generalized expected improvement function (default 2.0).
LCBB Lower Confidence Bounding parameter b (default 2.0).
GO Structure Prob.GO (Default values are set for all fields).

The following fields are used:

MaxFunc Maximal number of function evaluations in each global search.
MaxIter Maximal number of iterations in each global search.
DIRECT DIRECT solver used in glcCluster, either glcSolve or glcDirect(default).
maxFunc1 glcCluster parameter, maximum number of function evaluations in the first call. Only used if globalSolver is glcCluster, see help globalSolver.
maxFunc2 glcCluster parameter, maximum number of function evaluations in the second call. Only used if globalSolver is glcCluster, see help globalSolver.
maxFunc3 glcCluster parameter, maximum sum of function evaluations in repeated first calls to DIRECT routine when trying to get feasible. Only used if globalSolver is glcCluster, see help globalSolver.
localSolver The local solver used by glcCluster. If not defined, then Prob.CGO.localSolver is used
MIP Structure in Prob, Prob.MIP.

Defines integer optimization parameters. Fields used:

IntVars If empty, all variables are assumed non-integer.

If islogical(IntVars) (=all elements are 0/1), then 1 = integer variable, 0 = continuous variable. If any element > 1, IntVars is the indices for integer variables.
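Both forms of IntVars, sketched for a problem with four variables where variables 2 and 4 are integers:

```matlab
% Logical form: one 0/1 flag per variable
Prob.MIP.IntVars = logical([0 1 0 1]);
% Index form: equivalent specification
Prob.MIP.IntVars = [2 4];
```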

varargin Other arguments sent directly to low level functions.

Description of Outputs

Structure with result from optimization.

Output Description
x_k Matrix with the best points as columns.
f_k The best function value found so far.
Iter Number of iterations.
FuncEv Number of function evaluations.
ExitText Text string with information about the run.
ExitFlag Always 0.
CGO Subfield WarmStartInfo saves warm start information, the same information as in cgoSave.mat, see below.
Inform Information parameter.

0 = Normal termination.

1 = Function value f(x) is less than fGoal.

2 = Absolute error in function value f(x): abs(f - fGoal) <= fTol, fGoal == 0.

3 = Relative error in function value f(x) is less than fTol, i.e. abs(f - fGoal)/abs(fGoal) <= fTol.

4 = No new point sampled for N iteration steps.

5 = All sample points same as the best point for N last iterations.

6 = All sample points same as previous point for N last iterations.

7 = All feasible integers tried.

9 = Max CPU Time reached.

10 = Expected improvement low for three iterations.

Result Structure with result from optimization.
cgoSave.mat To make a warm start possible, all CGO solvers save information in the file cgoSave.mat. The file is created independent of the solver, which enables the user to call any CGO solver using the warm start information. cgoSave.mat is a MATLAB mat-file saved to the current directory. If the parameter SAVE is 1, the CGO solver saves the mat-file every iteration, which enables the user to break the run and restart using warm start from the current state. SAVE = 1 is currently always set by the CGO solvers. If the cgoSave.mat file fails to open for writing, the information is also available in the output field Result.CGO.WarmStartInfo, provided the run was concluded without interruption. Through a call to WarmDefGLOBAL, the Prob structure can be set up for warm start. In this case, the CGO solver will not load the data from cgoSave.mat. The file contains the following variables:
Name Problem name. Checked against the Prob.Name field if doing a warmstart.
O Matrix with sampled points (in original space).
X Matrix with sampled points (in unit space if SCALE==1)
F Vector with function values (penalty added for costly Cc(x)).
F_m Vector with function values (replaced).
F00 Vector of pure function values, before penalties.
Cc Matrix with costly constraint values, Cc(x).
nInit Number of initial points.
Fpen Vector with function values + additional penalty if infeasible using the linear constraints and noncostly nonlinear c(x).
fMinIdx Index of the best point found.
rngState Current state of the random number generator used.
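The saved variables can be inspected directly, e.g. to check how far a run has come before warm starting. This is a sketch using the fields listed above:

```matlab
S = load('cgoSave.mat');                       % read the warm start file
fprintf('%s: %d points sampled (%d initial), best f = %g\n', ...
        S.Name, size(S.X,2), S.nInit, S.F(S.fMinIdx));
```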

Description

ego implements the algorithm EGO by D. R. Jones, Matthias Schonlau and William J. Welch presented in the paper "Efficient Global Optimization of Expensive Black-Box Functions".

Please note that Jones et al. use a slightly different problem formulation. The TOMLAB version of ego treats linear and nonlinear constraints separately.

ego samples points to which a response surface is fitted. The algorithm then balances between sampling new points and minimization on the surface.

ego and rbfSolve use the same format for saving warm start data. This means that it is possible to try one solver for a certain number of iterations/function evaluations and then do a warm start with the other. Example:

>> Prob	= probInit('glc_prob',1);		%   Set up problem structure
>> Result_ego = tomRun('ego',Prob);		%   Solve for a while with  ego
>> Prob.WarmStart = 1; 				%   Indicate a warm start
>> Result_rbf = tomRun('rbfSolve',Prob);	%   Warm start with rbfSolve

M-files Used

iniSolve.m, endSolve.m, conAssign.m, glcAssign.m

See Also

rbfSolve

Warnings

Observe that when cancelling with CTRL+C during a run, some memory allocated by ego will not be deallocated.

To deallocate, do:

>> clear egolib