CGO: Difference between revisions
(Created page with "==Introduction== ===Overview=== Welcome to the TOMLAB /CGO User's Guide. TOMLAB /CGO includes the solvers, ''rbfSolve'', ''ego ''and ''arbfMIP''. The solvers are specificall...") |
No edit summary |
||
Line 9: | Line 9: | ||
===Contents of this manual=== | ===Contents of this manual=== | ||
* | *[[#Introduction]] provides a basic overview of the TOMLAB /CGO solver package. | ||
* | * [[#Using the Matlab Interface]] provides an overview of the solver interface. | ||
* | * [[#Setting CGO Options]] describes how to set CGO solver options from Matlab. | ||
* | * [[#TOMLAB /CHO Test Examples]] provides information regarding (non-costly) TOMLAB /CGO test examples. | ||
* | * [[#TOMLAB /CGO Solver Reference]] gives detailed information about the interface routines ''rbfSolve'', ''ego ''and ''arbfMIP''. | ||
===More information=== | ===More information=== | ||
Line 43: | Line 43: | ||
|''arbfMIP''||Costly global solver routine called by the TOMLAB driver routine ''tomRun''. This routine will also call local and global subsolvers. | |''arbfMIP''||Costly global solver routine called by the TOMLAB driver routine ''tomRun''. This routine will also call local and global subsolvers. | ||
|} | |} | ||
</figtable> | |||
==Setting CGO Options== | ==Setting CGO Options== | ||
Line 58: | Line 59: | ||
</pre> | </pre> | ||
A complete description of the available CGO parameters can be found in | A complete description of the available CGO parameters can be found in [[#TOMLAB /CGO Solver Reference]]. | ||
==TOMLAB /CGO Test Examples== | ==TOMLAB /CGO Test Examples== | ||
Line 100: | Line 101: | ||
\MATHSET{R}^{m_2}</math>. | \MATHSET{R}^{m_2}</math>. | ||
The variables <math>x_I</math> are restricted to be integers, | The variables <math>x_I</math> are restricted to be integers, | ||
where <math>\MATHSET{I}</math> is an index subset of \{1,\ldots, | where <math>\MATHSET{I}</math> is an index subset of <math>\{1,\ldots,d\},</math> possibly empty. | ||
It is assumed that the function <math>f(x)</math> is continuous with respect to all | It is assumed that the function <math>f(x)</math> is continuous with respect to all | ||
variables, | variables, even if there is a demand that some variables only take integer values. | ||
even if there is a demand that some variables only take integer values. | Otherwise it would not make sense to do the surrogate modeling of <math>f(x)</math> used by all CGO solvers. | ||
Otherwise it would not make | |||
sense to do the surrogate modeling of <math>f(x)</math> used by all CGO solvers. | |||
''f ''(''x'') is assumed to be a costly function while ''c''(''x'') is assumed to be cheaply computed. Any costly constraints can be treated by adding penalty terms to the objective function in the following way: | ''f ''(''x'') is assumed to be a costly function while ''c''(''x'') is assumed to be cheaply computed. Any costly constraints can be treated by adding penalty terms to the objective function in the following way: | ||
<math> | <math> | ||
\ | \min_x \quad p(x)= f(x)+ \sum_{j} | ||
w_j\max\left(0, | w_j \max \left(0,c^j(x)-c_U^j, c_L^j-c^j(x) \right), | ||
</math> | </math> | ||
where weighting parameters '' | where weighting parameters ''w<sub>j</sub>'' have been added. The user then returns ''p''(''x'') instead of ''f ''(''x'') to the CGO solver. | ||
====Calling Syntax==== | ====Calling Syntax==== | ||
<pre> | |||
Result = rbfSolve(Prob,varargin) | Result = rbfSolve(Prob,varargin) | ||
Result = tomRun('rbfSolve', Prob); | Result = tomRun('rbfSolve', Prob); | ||
</pre> | |||
====Description of Inputs==== | ====Description of Inputs==== | ||
Line 163: | Line 163: | ||
|''optParam''||Structure with optimization parameters. The following fields are used: | |''optParam''||Structure with optimization parameters. The following fields are used: | ||
|-valign="top" | |-valign="top" | ||
|''MaxFunc''||Maximal number of costly function evaluations, default 300 for ''rbfSolve ''and ''arbfMIP'', and default 200 for ''ego''. ''MaxFunc ''must be ''= ''5000. If WarmStart = 1 and MaxFunc ''= ''nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc. | |''MaxFunc''||Maximal number of costly function evaluations, default 300 for ''rbfSolve ''and ''arbfMIP'', and default 200 for ''ego''. ''MaxFunc ''must be ''<= ''5000. If WarmStart = 1 and MaxFunc ''<= ''nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc. | ||
|-valign="top" | |-valign="top" | ||
|''IterPrint''||Print one information line each iteration, and the new x tried. Default IterPrint | |''IterPrint''||Print one information line each iteration, and the new x tried. Default IterPrint = 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible). | ||
= 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible). | |||
|-valign="top" | |-valign="top" | ||
|''fGoal''||Goal for function value, not used if ''inf ''or empty. | |''fGoal''||Goal for function value, not used if ''inf ''or empty. | ||
|-valign="top" | |-valign="top" | ||
'' | |''eps_f ''||Relative accuracy for function value, ''fTol'' == ''eps_f''. Stop if ''<nowiki>|</nowiki>f - f Goal<nowiki>|</nowiki> <='' ''<nowiki>|</nowiki>fGoal<nowiki>|</nowiki> * fTol'', if ''fGoal'' ≠ 0. Stop if ''<nowiki>|</nowiki>f - fGoal<nowiki>|</nowiki> <= fTol'', if ''fGoal'' = 0. See the output field maxTri. | ||
|-valign="top" | |||
|''bTol ''||Linear constraint tolerance. | |||
|-valign="top" | |-valign="top" | ||
|''cTol''||Nonlinear constraint tolerance. | |''cTol''||Nonlinear constraint tolerance. | ||
Line 188: | Line 188: | ||
|900||All Corners||1 | |900||All Corners||1 | ||
|-valign="top" | |-valign="top" | ||
|997||'' | |997||''x<sub>L</sub> ''+ ''x<sub>U</sub> ''+ adjacent corners||2 | ||
|-valign="top" | |-valign="top" | ||
|998||'' | |998||''x<sub>U</sub> ''+ adjacent corners||3 | ||
|-valign="top" | |-valign="top" | ||
|999||'' | |999||''x<sub>L</sub> ''+ adjacent corners||4 | ||
|-valign="top" | |-valign="top" | ||
||| | ||| | ||
Line 229: | Line 229: | ||
|-valign="top" | |-valign="top" | ||
||| | ||| | ||
'''Random Strategies (pp in %)''' | '''Random Strategies (pp in %)''' | ||
|-valign="top" | |-valign="top" | ||
|1pp||Circle surrounding||14 | |1pp||Circle surrounding||14 | ||
Line 267: | Line 267: | ||
'''Description of the experimental designs:''' | '''Description of the experimental designs:''' | ||
'''ExD 1,'''All Corners. Initial points is the corner points of the box given by Prob. | '''ExD 1,''' All Corners. Initial points is the corner points of the box given by Prob.x_L and Prob.x_U. Generates 2''<sup>d</sup> ''points, which results in too many points when the dimension is high. | ||
'''ExD 2, '''Lower and Upper Corner point + adjacent points. Initial points are 2 ''* d ''+ 2 corners: the lower left corner '' | '''ExD 2, '''Lower and Upper Corner point + adjacent points. Initial points are 2 ''* d ''+ 2 corners: the lower left corner ''x<sub>L</sub> ''and its ''d '' adjacent corners ''x<sub>L</sub> ''+ (''x<sub>U</sub>''(''i'') - x<sub>L</sub> ''(''i'')) ''* e<sub>i</sub>, i ''= 1'', ..., d and the upper right corner ''x<sub>U</sub> ''and its ''d ''adjacent corners ''x''<sub>U</sub> - ''(''x<sub>U</sub> ''(''i'') ''- x<sub>L</sub> ''(''i'')) ''* e<sub>i</sub>, i ''= 1'', ..., d'' | ||
'''ExD 3. '''Initial points are the upper right corner '' | '''ExD 3. '''Initial points are the upper right corner ''x<sub>U</sub> ''and its ''d ''adjacent corners ''x<sub>U</sub> - ''(''x<sub>U</sub> ''(''i'') ''- x<sub>L</sub> ''(''i'')) ''* e<sub>i</sub> , i ''= 1'', ..., d'' | ||
'''ExD 4. '''Initial points are the lower left corner '' | '''ExD 4. '''Initial points are the lower left corner ''x<sub>L</sub> ''and its ''d ''adjacent corners ''x<sub>L</sub> ''+ (''x<sub>U</sub> ''(''i'') ''- x<sub>L</sub> ''(''i'')) ''* e<sub>i</sub> , i ''= 1'', ..., d'' | ||
'''ExD 5. '''User given initial points, given as a matrix in CGO.X. Each column is one sampled point. If ''d ''= length(Prob.x L), then size(X,1) = d, size(X,2) ''='' ''d ''+ 1. CGO.F should be defined as empty, or contain a vector of corresponding ''f ''(''x'') values. Any CGO.F value set as NaN will be computed by solver routine. | '''ExD 5. '''User given initial points, given as a matrix in CGO.X. Each column is one sampled point. If ''d ''>= length(Prob.x L), then size(X,1) = d, size(X,2) ''='' ''d ''+ 1. CGO.F should be defined as empty, or contain a vector of corresponding ''f ''(''x'') values. Any CGO.F value set as NaN will be computed by solver routine. | ||
'''ExD 6. '''Use determinstic global optimization methods to find the initial design. Current methods available (all DIRECT methods), dependent on the value of Percent: | '''ExD 6. '''Use determinstic global optimization methods to find the initial design. Current methods available (all DIRECT methods), dependent on the value of Percent: | ||
Line 290: | Line 290: | ||
'''ExD 12. '''Latin hypercube space-filling design. For nSample ''< ''0, ''k ''= ''<nowiki>|</nowiki>nSample<nowiki>|</nowiki> ''should in principle be the problem dimension. The number of points sampled is: | '''ExD 12. '''Latin hypercube space-filling design. For nSample ''< ''0, ''k ''= ''<nowiki>|</nowiki>nSample<nowiki>|</nowiki> ''should in principle be the problem dimension. The number of points sampled is: | ||
k : 2 3 4 5 6 ''> ''6 | k : 2 3 4 5 6 ''> ''6 | ||
Points : 21 33 41 51 65 65 | Points : 21 33 41 51 65 65 | ||
The call made is: X = daceInit(abs(nSample),Prob.x_L,Prob.x_U); Set nSample = [] to get (d+1)*(d+2)/2 sampled points: | The call made is: X = daceInit(abs(nSample),Prob.x_L,Prob.x_U); | ||
Set nSample = [] to get (d+1)*(d+2)/2 sampled points: | |||
d : 1 2 3 4 5 6 7 8 9 10 | d : 1 2 3 4 5 6 7 8 9 10 | ||
Points : 3 6 10 15 21 28 36 45 55 66 | Points : 3 6 10 15 21 28 36 45 55 66 | ||
This is a more efficient number of points to use. | This is a more efficient number of points to use. | ||
If CGO.X is nonempty, these points are verified as in ExD 5, and treated as | If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | ||
Constrained version of Latin hypercube only keep points that fulfill the linear and nonlinear constraints. The algorithm will try up to ''M'' = ''max''(10 ''* d, nTrial'') points, stopping when it has found nSample feasible points (''d ''+ 1 points if ''nSample < ''0). | Constrained version of Latin hypercube only keep points that fulfill the linear and nonlinear constraints. The algorithm will try up to ''M'' = ''max''(10 ''* d, nTrial'') points, stopping when it has found nSample feasible points (''d ''+ 1 points if ''nSample < ''0). | ||
Line 306: | Line 311: | ||
'''ExD 13. '''Orthogonal Sampling, LH with subspace density demands. | '''ExD 13. '''Orthogonal Sampling, LH with subspace density demands. | ||
'''ExD 14-16'''. Random strategies, the ''<nowiki>|</nowiki>Percent<nowiki>|</nowiki> ''value gives the percentage size of an ellipsoid, circle or rectangle around the so far sampled points that new points are not allowed in. Range 1%-50%. Recommended values 10% - 20%. If CGO.X is nonempty, these points are verified as in ExD 5, and treated as | '''ExD 14-16'''. Random strategies, the ''<nowiki>|</nowiki>Percent<nowiki>|</nowiki> ''value gives the percentage size of an ellipsoid, circle or rectangle around the so far sampled points that new points are not allowed in. Range 1%-50%. Recommended values 10% - 20%. If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | ||
|-valign="top" | |-valign="top" | ||
|''X,F,CX''||The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design. | |''X,F,CX''||The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design. | ||
|-valign="top" | |-valign="top" | ||
|''X''||A matrix of initial x values. One column for every x value. If ExD == 5, size(X,2) ''= ''dim(x)+1 needed. | |''X''||A matrix of initial x values. One column for every x value. If ExD == 5, size(X,2) ''>= ''dim(x)+1 needed. | ||
|-valign="top" | |-valign="top" | ||
|''F''||A vector of initial ''f ''(''x'') values. If any element is set to NaN it will be computed. | |''F''||A vector of initial ''f ''(''x'') values. If any element is set to NaN it will be computed. | ||
Line 316: | Line 321: | ||
|''CX''||Optionally a matrix of nonlinear constraint c(x) values. If nonempty, then size(CX,2) == size(X,2). If any element is set as NaN, the vector c(x) = CX(:,i) will be recomputed. | |''CX''||Optionally a matrix of nonlinear constraint c(x) values. If nonempty, then size(CX,2) == size(X,2). If any element is set as NaN, the vector c(x) = CX(:,i) will be recomputed. | ||
|-valign="top" | |-valign="top" | ||
|''RandState''||If ''= ''0, ''rand''('' | |''RandState''||If ''>= ''0, ''rand''(''<nowiki>'</nowiki>state<nowiki>'</nowiki>, RandState'') is set to initialize the pseudo-random generator. If ''< ''0, ''rand''(''<nowiki>'</nowiki>state<nowiki>'</nowiki>, ''100 ''* clock'') is set to give a new set of random values each run. If isnan(RandState), the random state is not initialized. RandState will influence if a stochastic initial experimental design is applied, see input Percent and nSample. RandState will also influence if using the ''multiMin ''solver, but the random state seed is not reset in ''multiMin''. The state of the random generator is saved in the warm start output rngState, and the random generator is reinitialized with this state if warm start is used. Default RandState = 0. | ||
|-valign="top" | |-valign="top" | ||
|''AddMP''||If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999. | |''AddMP''||If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999. | ||
|-valign="top" | |-valign="top" | ||
|''nTrial''||For experimental design CLH, the method generates ''M ''= ''max''(10 ''* d, | |''nTrial''||For experimental design CLH, the method generates ''M ''= ''max''(10 ''* d, nTrial'') trial points, and evaluate them until ''nSample ''feasible points are found. In the random designs, ''nTrial ''is the maximum number of trial points randomly generated for each new point to sample. | ||
|-valign="top" | |-valign="top" | ||
|''CLHMethod''||Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added. | |''CLHMethod''||Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added. | ||
Line 372: | Line 377: | ||
=1 for the case ''idea ''=1, ''fStarRule ''=3. | =1 for the case ''idea ''=1, ''fStarRule ''=3. | ||
|-valign="top" | |-valign="top" | ||
|''fStarRule''||Global-Local search strategy in idea 1, where N is the cycle length. Define ''minsn ''as the global minimum on the RBF surface. The following strategies for setting the target value ''fStar ''is defined: 1: '' | |''fStarRule''||Global-Local search strategy in idea 1, where N is the cycle length. Define ''minsn ''as the global minimum on the RBF surface. The following strategies for setting the target value ''fStar ''is defined: 1: ''fStar ''= ''min<sub>sn</sub> - ''((''N - ''(''n - nInit''))''/N '')<sup>2</sup> ''* ''Δ''n ''(Default), 2: ''fStar ''= ''min<sub>sn</sub> - ''(''N - ''(''n - nInit''))''/N * ''Δ''n ''. | ||
'' | Strategy 1 and 2 depends on Δ ''<sub>n</sub> ''estimate (see DeltaRule). If ''infStep ''=1, add <math>-\infty</math>-step first in cycle. 3: fStar = <math>-\infty</math>-step, ''min<sub>sn</sub>-k *''0''.''1''*<nowiki>|</nowiki>min<sub>sn</sub><nowiki>|</nowiki>k ''= ''N, ..., ''0. | ||
These strategies had the following names in Gutmanns thesis: III, II, I. | These strategies had the following names in Gutmanns thesis: III, II, I. | ||
Line 382: | Line 385: | ||
|''DeltaRule''||1 = Skip large f(x) when computing f(x) interval ?. 0 = Use all points. Default 1. | |''DeltaRule''||1 = Skip large f(x) when computing f(x) interval ?. 0 = Use all points. Default 1. | ||
|-valign="top" | |-valign="top" | ||
|''AddSurfMin''||Add up to AddSurfMin interior local minima on RBF surface as search points, based on estimated Lipschitz constants. AddSurfMin=0 implies no additional minimum added (Default). This option is only possible if ''globalSolver ''= ''multiMin''. Test for additional minimum is done in the local step (modN == N) If these additional | |''AddSurfMin''||Add up to AddSurfMin interior local minima on RBF surface as search points, based on estimated Lipschitz constants. AddSurfMin=0 implies no additional minimum added (Default). This option is only possible if ''globalSolver ''= ''multiMin''. Test for additional minimum is done in the local step (modN == N) If these additional local minima are used, in the printout modN = ''-''2'', -''3'', -''4'', ... ''are the iteration steps with these search points. | ||
|-valign="top" | |-valign="top" | ||
|''TargetMin''||Which minimum, if several minima found, to select in the target value problem: | |''TargetMin''||Which minimum, if several minima found, to select in the target value problem: | ||
Line 392: | Line 395: | ||
=2 Use best interior local minima, if none use RBF interior minimum. | =2 Use best interior local minima, if none use RBF interior minimum. | ||
=3 Use best minimum with lowest number of coefficients on bounds. Default is ''TargetMin ''= 3. | =3 Use best minimum with lowest number of coefficients on bounds. | ||
Default is ''TargetMin ''= 3. | |||
|-valign="top" | |-valign="top" | ||
|''eps_sn''||Relative tolerance used to test if the minimum of the RBF surface, '' | |''eps_sn''||Relative tolerance used to test if the minimum of the RBF surface, ''min<sub>sn</sub> '', is sufficiently lower than the best point (''fM in '') found (default is 10<sup>-7</sup> ). | ||
|-valign="top" | |-valign="top" | ||
|''MaxCycle''||Max number of cycles without progress before stopping, default 10. | |''MaxCycle''||Max number of cycles without progress before stopping, default 10. | ||
Line 437: | Line 442: | ||
|''f_k''||The best function value found so far. | |''f_k''||The best function value found so far. | ||
|-valign="top" | |-valign="top" | ||
''Iter | |''Iter''||Number of iterations. | ||
|-valign="top" | |-valign="top" | ||
''FuncEv ''Number of function evaluations. | |''FuncEv''||Number of function evaluations. | ||
|-valign="top" | |-valign="top" | ||
''ExitText | |''ExitText''||Text string with information about the run. | ||
|-valign="top" | |-valign="top" | ||
''ExitFlag ''Always 0. | |''ExitFlag''||Always 0. | ||
|-valign="top" | |-valign="top" | ||
''CGO ''Subfield ''WarmStartInfo | |''CGO''||Subfield ''WarmStartInfo'' saves warm start information, the same information as in cgoSave.mat, see below. | ||
|-valign="top" | |-valign="top" | ||
''Inform | |''Inform''||Information parameter. | ||
0 = Normal termination. | 0 = Normal termination. | ||
Line 453: | Line 458: | ||
1 = Function value f(x) is less than fGoal. | 1 = Function value f(x) is less than fGoal. | ||
2 = Error in function value ''f ''(''x'')'', | 2 = Error in function value ''f ''(''x'')'', <nowiki>|</nowiki>f - fGoal<nowiki>|</nowiki> <= fTol, fGoal ''= 0''.'' | ||
3 = Relative | 3 = Relative Error in function value ''f ''(''x'') is less than fTol, i.e. <nowiki>|</nowiki>f - fGoal<nowiki>|</nowiki> <nowiki>|</nowiki>fGoal<nowiki>|</nowiki> <= fTol. | ||
4 = No new point sampled for MaxCycle iteration steps. | 4 = No new point sampled for MaxCycle iteration steps. | ||
Line 467: | Line 470: | ||
7 = All feasible integers tried. | 7 = All feasible integers tried. | ||
8 = No progress for '' | 8 = No progress for ''MaxCycle * ''(''N ''+ 1) + 1 function evaluations (''> MaxCycle'' cycles, input CGO.MaxCycle). | ||
cycles, input CGO.MaxCycle). | |||
9 = Max CPU Time reached. | 9 = Max CPU Time reached. | ||
|-valign="top" | |-valign="top" | ||
''cgoSave.mat | |''cgoSave.mat''||To make a warm start possible, all CGO solvers saves information in the file cgoSave.mat. The file is created independent of the solver, which enables the user to call any CGO solver using the warm start information. cgoSave.mat is a MATLAB mat-file saved to the current directory. If the parameter SAVE is 1, the CGO solver saves the mat file every iteration, which enables the user to break the run and restart using warm start from the current state. SAVE = 1 is currently always set by the CGO solvers. If the cgoSave.mat file fails to open for writing, the information is also available in the output field Result.CGO.WarmStartInfo, if the run was concluded without interruption. Through a call to WarmDefGLOBAL, the Prob structure can be setup for warm start. In this case, the CGO solver will not load the data from cgoSave.mat. The file contains the following variables: | ||
|-valign="top" | |-valign="top" | ||
|''Name''||Problem name. Checked against the ''Prob.Name ''field if doing a warmstart. | |''Name''||Problem name. Checked against the ''Prob.Name ''field if doing a warmstart. | ||
Line 496: | Line 497: | ||
====Description==== | ====Description==== | ||
''rbfSolve ''implements the Radial Basis Function (RBF) algorithm | ''rbfSolve ''implements the Radial Basis Function (RBF) algorithm based on the work by Gutmann. The RBF method is enhanced to handle linear equality and inequality constraints, and nonlinear equality and inequality constraints, as well as mixed-integer problems. | ||
A response surface based on radial basis functions is fitted to a collection of sampled points. The algorithm then balances between minimizing the fitted function and adding new points to the set. | A response surface based on radial basis functions is fitted to a collection of sampled points. The algorithm then balances between minimizing the fitted function and adding new points to the set. | ||
Line 545: | Line 546: | ||
\MATHSET{R}^{m_2}</math>. | \MATHSET{R}^{m_2}</math>. | ||
The variables <math>x_I</math> are restricted to be integers, | The variables <math>x_I</math> are restricted to be integers, | ||
where <math>\MATHSET{I}</math> is an index subset of \{1,\ldots, | where <math>\MATHSET{I}</math> is an index subset of <math>\{1,\ldots,d\},</math> possibly empty. | ||
It is assumed that the function <math>f(x)<math> is continuous with respect to all | It is assumed that the function <math>f(x)</math> is continuous with respect to all | ||
variables, even if there is a demand that some variables only take integer values. | variables, even if there is a demand that some variables only take integer values. | ||
Otherwise it would not make sense to do the surrogate modeling of <math>f(x)</math> used by all CGO solvers. | Otherwise it would not make sense to do the surrogate modeling of <math>f(x)</math> used by all CGO solvers. | ||
Line 607: | Line 608: | ||
|''optParam''||Structure with optimization parameters. The following fields are used: | |''optParam''||Structure with optimization parameters. The following fields are used: | ||
|-valign="top" | |-valign="top" | ||
|''MaxFunc''||Maximal number of costly function evaluations, default 300 for ''rbfSolve ''and ''arbfMIP'', and default 200 for ''ego''. ''MaxFunc ''must be ''= ''5000. If WarmStart = 1 and MaxFunc ''= ''nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc. | |''MaxFunc''||Maximal number of costly function evaluations, default 300 for ''rbfSolve ''and ''arbfMIP'', and default 200 for ''ego''. ''MaxFunc ''must be ''<= ''5000. If WarmStart = 1 and MaxFunc ''<= ''nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc. | ||
|-valign="top" | |-valign="top" | ||
|''IterPrint''||Print one information line each iteration, and the new x tried. Default IterPrint | |''IterPrint''||Print one information line each iteration, and the new x tried. Default IterPrint = 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible). | ||
= 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible). | |||
|-valign="top" | |-valign="top" | ||
|''fGoal''||Goal for function value, not used if ''inf ''or empty. | |''fGoal''||Goal for function value, not used if ''inf ''or empty. | ||
|-valign="top" | |-valign="top" | ||
|''eps_f''||Relative accuracy for function value, ''fTol ''== ''eps_f ''. Stop if ''<nowiki>|</nowiki>f - f Goal<nowiki>|</nowiki> ='' ''<nowiki>|</nowiki>f Goal<nowiki>|</nowiki> * fTol '', if ''f Goal ''= 0. Stop if ''<nowiki>|</nowiki>f - f Goal<nowiki>|</nowiki> = fTol '', if ''f Goal ''= 0. See the output field maxTri. | |''eps_f''||Relative accuracy for function value, ''fTol ''== ''eps_f ''. Stop if ''<nowiki>|</nowiki>f - f Goal<nowiki>|</nowiki> ='' ''<nowiki>|</nowiki>f Goal<nowiki>|</nowiki> * fTol '', if ''f Goal ''= 0. Stop if ''<nowiki>|</nowiki>f - f Goal<nowiki>|</nowiki> = fTol '', if ''f Goal ''= 0. See the output field maxTri. | ||
|-valign="top" | |||
|''bTol''||Linear constraint tolerance. | |''bTol''||Linear constraint tolerance. | ||
|-valign="top" | |-valign="top" | ||
Line 635: | Line 635: | ||
|900||All Corners||1 | |900||All Corners||1 | ||
|-valign="top" | |-valign="top" | ||
|997||'' | |997||''x<sub>L</sub> ''+ ''x<sub>U</sub> ''+ adjacent corners||2 | ||
|-valign="top" | |-valign="top" | ||
|998||'' | |998||''x<sub>U</sub> ''+ adjacent corners||3 | ||
|-valign="top" | |-valign="top" | ||
|999||'' | |999||''x<sub>L</sub> ''+ adjacent corners||4 | ||
|-valign="top" | |-valign="top" | ||
||| | ||| | ||
Line 715: | Line 715: | ||
'''ExD 1, '''All Corners. Initial points is the corner points of the box given by Prob.x L and Prob.x U. Generates 2''d ''points, which results in too many points when the dimension is high. | '''ExD 1, '''All Corners. Initial points is the corner points of the box given by Prob.x L and Prob.x U. Generates 2''d ''points, which results in too many points when the dimension is high. | ||
'''ExD 2, '''Lower and Upper Corner point + adjacent points. Initial points are 2 ''* d ''+ 2 corners: the lower left corner '' | '''ExD 2, '''Lower and Upper Corner point + adjacent points. Initial points are 2 ''* d ''+ 2 corners: the lower left corner ''x<sub>L</sub> ''and its ''d ''adjacent corners ''x<sub>L</sub> ''+ (''x<sub>U</sub> ''(''i'') ''- x<sub>L</sub> ''(''i'')) ''* e<sub>i</sub> , i ''= 1'', ..., d ''and the upper right corner ''x<sub>U</sub> ''and its ''d ''adjacent corners ''x<sub>U</sub> - ''(''x<sub>U</sub>''(''i'') ''- x<sub>L</sub>''(''i'')) ''* e<sub>i</sub> , i ''= 1'', ..., d'' | ||
'''ExD 3. '''Initial points are the upper right corner '' | '''ExD 3. '''Initial points are the upper right corner ''x<sub>U</sub>'' and its ''d ''adjacent corners ''x<sub>U</sub> - ''(''x<sub>U</sub>''(''i'') ''- x<sub>L</sub>''(''i'')) ''* e<sub>i</sub> , i ''= 1'', ..., d'' | ||
'''ExD 4. '''Initial points are the lower left corner '' | '''ExD 4. '''Initial points are the lower left corner ''x<sub>L</sub> ''and its ''d ''adjacent corners ''x<sub>L</sub> ''+ (''x<sub>U</sub> ''(''i'') ''- x<sub>L</sub> ''(''i'')) ''* e<sub>i</sub> , i ''= 1'', ..., d'' | ||
'''ExD 5. '''User given initial points, given as a matrix in CGO.X. Each column is one sampled point. If ''d ''= length(Prob. | '''ExD 5. '''User given initial points, given as a matrix in CGO.X. Each column is one sampled point. If ''d ''= length(Prob.x_L), then size(X,1) = d, size(X,2) ''='' ''d ''+ 1. CGO.F should be defined as empty, or contain a vector of corresponding ''f ''(''x'') values. Any CGO.F value set as NaN will be computed by solver routine. | ||
'''ExD 6. '''Use determinstic global optimization methods to find the initial design. Current methods available (all DIRECT methods), dependent on the value of Percent: | '''ExD 6. '''Use determinstic global optimization methods to find the initial design. Current methods available (all DIRECT methods), dependent on the value of Percent: | ||
Line 740: | Line 740: | ||
k : 2 3 4 5 6 ''> ''6 | k : 2 3 4 5 6 ''> ''6 | ||
Points : 21 33 41 51 65 65 | Points : 21 33 41 51 65 65 | ||
The call made is: X = daceInit(abs(nSample),Prob.x L,Prob.x U); Set nSample = [] to get (d+1)*(d+2)/2 sampled points: | The call made is: X = daceInit(abs(nSample),Prob.x L,Prob.x U); | ||
Set nSample = [] to get (d+1)*(d+2)/2 sampled points: | |||
d : 1 2 3 4 5 6 7 8 9 10 | d : 1 2 3 4 5 6 7 8 9 10 | ||
Points : 3 6 10 15 21 28 36 45 55 66 | Points : 3 6 10 15 21 28 36 45 55 66 | ||
This is a more efficient number of points to use. | This is a more efficient number of points to use. | ||
If CGO.X is nonempty, these points are verified as in ExD 5, and treated as | If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | ||
Constrained version of Latin hypercube only keep points that fulfill the linear and nonlinear constraints. The algorithm will try up to ''M ''= ''max''(10 ''* d, | Constrained version of Latin hypercube only keep points that fulfill the linear and nonlinear constraints. The algorithm will try up to ''M ''= ''max''(10 ''* d, nTrial'') points, stopping when it has found nSample feasible points (''d ''+ 1 points if ''nSample < ''0). | ||
'''ExD 13. '''Orthogonal Sampling, LH with subspace density demands. | '''ExD 13. '''Orthogonal Sampling, LH with subspace density demands. | ||
Line 759: | Line 763: | ||
If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | ||
|-valign="top" | |-valign="top" | ||
''X,F,CX'' The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design. | |''X,F,CX''||The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design. | ||
|-valign="top" | |-valign="top" | ||
|''X''||A matrix of initial | |''X''||A matrix of initial x values. One column for every x value. If ExD == 5, size(X,2) ''>= ''dim(x)+1 needed. | ||
|-valign="top" | |-valign="top" | ||
|''F''||A vector of initial ''f ''(''x'') values. If any element is set to NaN it will be computed. | |''F''||A vector of initial ''f ''(''x'') values. If any element is set to NaN it will be computed. | ||
|-valign="top" | |-valign="top" | ||
|''CX''||Optionally | |''CX''||Optionally a matrix of nonlinear constraint c(x) values. If nonempty, then size(CX,2) == size(X,2). If any element is set as NaN, the vector c(x) = CX(:,i) will be recomputed. | ||
|-valign="top" | |-valign="top" | ||
|''RandState''||If ''= ''0, ''rand''('' | |''RandState''||If ''= ''0, ''rand''(''<nowiki>'</nowiki>state<nowiki>'</nowiki>, RandState'') is set to initialize the pseudo-random generator. If ''< ''0, ''rand''('''state', ''100 ''* clock'') is set to give a new set of random values each run. If isnan(RandState), the random state is not initialized. RandState will influence if a stochastic initial experimental design is applied, see input Percent and nSample. RandState will also influence if using the ''multiMin ''solver, but the random state seed is not reset in ''multiMin''. The state of the random generator is saved in the warm start output rngState, and the random generator is reinitialized with this state if warm start is used. Default RandState = 0. | ||
|-valign="top" | |-valign="top" | ||
|''AddMP''||If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999. | |''AddMP''||If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999. | ||
|-valign="top" | |-valign="top" | ||
|''nTrial''||For experimental design CLH, the method generates ''M ''= ''max''(10 ''* d, | |''nTrial''||For experimental design CLH, the method generates ''M ''= ''max''(10 ''* d, nTrial'') trial points, and evaluate them until ''nSample ''feasible points are found. In the random designs, ''nTrial ''is the maximum number of trial points randomly generated for each new point to sample. | ||
|-valign="top" | |-valign="top" | ||
|''CLHMethod''||Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added. | |''CLHMethod''||Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added. | ||
Line 789: | Line 793: | ||
1 - Large function values are replaced by the median. | 1 - Large function values are replaced by the median. | ||
''> ''1 - Large values Z are replaced by new values. The replacement is defined as ''Z '':= '' | ''> ''1 - Large values Z are replaced by new values. The replacement is defined as ''Z '':= ''FMAX ''+ ''log''10(''Z - FMAX ''+ 1), where ''FMAX ''= 10''<sup>REPLACE</sup>'', if ''min''(''F '') ''< ''0 and ''FMAX ''= 10<sup>(''ceil''(''log''10(''min''(''F '')))+''REPLACE'')</sup> , if ''min''(''F '') ''>= ''0. A new replacement is computed in every iteration, because ''min''(''F '') may change. Default REPLACE = 5, if no linear or nonlinear constraints. | ||
|-valign="top" | |-valign="top" | ||
|''LOCAL''||0 - No local searches after global search. If RBF surface is inaccurate, might be an advantage. | |''LOCAL''||0 - No local searches after global search. If RBF surface is inaccurate, might be an advantage. | ||
Line 943: | Line 947: | ||
1 = Function value ''f ''(''x'') is less than fGoal. | 1 = Function value ''f ''(''x'') is less than fGoal. | ||
2 = Error in function value ''f ''(''x''), ''abs''(''f - | 2 = Error in function value ''f ''(''x''), ''abs''(''f - fGoal'') ''<''= ''fTol'', fGoal=0. | ||
3 = Relative Error in function value f(x) is less than fTol, | 3 = Relative Error in function value f(x) is less than fTol, i.e. ''abs''(''f -'' ''fGoal'')''/abs''(''fGoal'') ''<''= ''fTol''. | ||
4 = No new point sampled for ''N ''iteration steps. | 4 = No new point sampled for ''N ''iteration steps. | ||
Line 1,046: | Line 1,050: | ||
\MATHSET{R}^{m_2}</math>. | \MATHSET{R}^{m_2}</math>. | ||
The variables <math>x_I</math> are restricted to be integers, | The variables <math>x_I</math> are restricted to be integers, | ||
where <math>\MATHSET{I}</math> is an index subset of \{1,\ldots, | where <math>\MATHSET{I}</math> is an index subset of <math>\{1,\ldots,d\},</math> possibly empty. | ||
It is assumed that the function <math>f(x)<math> is continuous with respect to all | It is assumed that the function <math>f(x)</math> is continuous with respect to all | ||
variables, even if there is a demand that some variables only take integer values. | variables, even if there is a demand that some variables only take integer values. | ||
Otherwise it would not make sense to do the surrogate modeling of <math>f(x)</math> used by all CGO solvers. | Otherwise it would not make sense to do the surrogate modeling of <math>f(x)</math> used by all CGO solvers. | ||
Line 1,110: | Line 1,114: | ||
|''optParam''||Structure with optimization parameters. The following fields are used: | |''optParam''||Structure with optimization parameters. The following fields are used: | ||
|-valign="top" | |-valign="top" | ||
|''MaxFunc''||Maximal number of costly function evaluations, default 300 for ''rbfSolve ''and ''arbfMIP'', and default 200 for ''ego''. ''MaxFunc ''must be ''= ''5000. If WarmStart = 1 and MaxFunc ''= ''nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc. | |''MaxFunc''||Maximal number of costly function evaluations, default 300 for ''rbfSolve ''and ''arbfMIP'', and default 200 for ''ego''. ''MaxFunc ''must be ''<= ''5000. If WarmStart = 1 and MaxFunc ''<= ''nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc. | ||
|-valign="top" | |-valign="top" | ||
|''IterPrint''||Print one information line each iteration, and the new x tried. Default IterPrint | |''IterPrint''||Print one information line each iteration, and the new x tried. Default IterPrint = 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible). | ||
= 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible). | |||
|-valign="top" | |-valign="top" | ||
|''fGoal''||Goal for function value, not used if ''inf ''or empty. | |||
|-valign="top" | |-valign="top" | ||
|''eps_f''||Relative accuracy for function value, ''fTol ''== ''eps_f ''. Stop if ''<nowiki>|</nowiki>f - fGoal<nowiki>|</nowiki> ='' ''<nowiki>|</nowiki>f_Goal<nowiki>|</nowiki> * fTol '', if ''fGoal ''= 0. Stop if ''<nowiki>|</nowiki>f - fGoal<nowiki>|</nowiki> = fTol '', if ''fGoal ''= 0. See the output field maxTri. | |''eps_f''||Relative accuracy for function value, ''fTol ''== ''eps_f ''. Stop if ''<nowiki>|</nowiki>f - fGoal<nowiki>|</nowiki> ='' ''<nowiki>|</nowiki>f_Goal<nowiki>|</nowiki> * fTol '', if ''fGoal ''= 0. Stop if ''<nowiki>|</nowiki>f - fGoal<nowiki>|</nowiki> = fTol '', if ''fGoal ''= 0. See the output field maxTri. | ||
Line 1,244: | Line 1,246: | ||
k : 2 3 4 5 6 ''> ''6 | k : 2 3 4 5 6 ''> ''6 | ||
Points : 21 33 41 51 65 65 | Points : 21 33 41 51 65 65 | ||
Line 1,249: | Line 1,252: | ||
d : 1 2 3 4 5 6 7 8 9 10 | d : 1 2 3 4 5 6 7 8 9 10 | ||
Points : 3 6 10 15 21 28 36 45 55 66 | Points : 3 6 10 15 21 28 36 45 55 66 | ||
This is a more efficient number of points to use. | This is a more efficient number of points to use. | ||
If CGO.X is nonempty, these points are verified as in ExD 5, and treated as | If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | ||
Constrained version of Latin hypercube only keep points that fulfill the linear and nonlinear constraints. The algorithm will try up to ''M ''= ''max''(10 ''* d, nTrial'') points, stopping when it has found nSample feasible points (''d ''+ 1 points if ''nSample < ''0). | Constrained version of Latin hypercube only keep points that fulfill the linear and nonlinear constraints. The algorithm will try up to ''M ''= ''max''(10 ''* d, nTrial'') points, stopping when it has found nSample feasible points (''d ''+ 1 points if ''nSample < ''0). | ||
Line 1,261: | Line 1,265: | ||
'''ExD 14-16'''. Random strategies, the ''<nowiki>|</nowiki>Percent<nowiki>|</nowiki> ''value gives the percentage size of an ellipsoid, circle or rectangle around the so far sampled points that new points are not allowed in. Range 1%-50%. Recommended values 10% - 20%. | '''ExD 14-16'''. Random strategies, the ''<nowiki>|</nowiki>Percent<nowiki>|</nowiki> ''value gives the percentage size of an ellipsoid, circle or rectangle around the so far sampled points that new points are not allowed in. Range 1%-50%. Recommended values 10% - 20%. | ||
If CGO.X is nonempty, these points are verified as in ExD 5, and treated as | If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | ||
|-valign="top" | |-valign="top" | ||
|''X,F,CX''||The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design. | |''X,F,CX''||The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design. | ||
|-valign="top" | |-valign="top" | ||
|''X''||A matrix of initial x values. One column for every x value. If ExD == 5, size(X,2) ''= ''dim(x)+1 needed. | |''X''||A matrix of initial x values. One column for every x value. If ExD == 5, size(X,2) ''>= ''dim(x)+1 needed. | ||
|-valign="top" | |-valign="top" | ||
|''F''||A vector of initial ''f ''(''x'') values. If any element is set to NaN it will be computed. | |''F''||A vector of initial ''f ''(''x'') values. If any element is set to NaN it will be computed. | ||
Line 1,271: | Line 1,275: | ||
|''CX''||Optionally a matrix of nonlinear constraint c(x) values. If nonempty, then size(CX,2) == size(X,2). If any element is set as NaN, the vector c(x) = CX(:,i) will be recomputed. | |''CX''||Optionally a matrix of nonlinear constraint c(x) values. If nonempty, then size(CX,2) == size(X,2). If any element is set as NaN, the vector c(x) = CX(:,i) will be recomputed. | ||
|-valign="top" | |-valign="top" | ||
|''RandState''||If ''>= ''0, ''rand''('' | |''RandState''||If ''>= ''0, ''rand''(''<nowiki>'</nowiki>state<nowiki>'</nowiki>, RandState'') is set to initialize the pseudo-random generator. If ''< ''0, ''rand''(''<nowiki>'</nowiki>state<nowiki>'</nowiki>, ''100 ''* clock'') is set to give a new set of random values each run. If isnan(RandState), the random state is not initialized. RandState will influence if a stochastic initial experimental design is applied, see input Percent and nSample. RandState will also influence if using the ''multiMin'' solver, but the random state seed is not reset in ''multiMin''. The state of the random generator is saved in the warm start output rngState, and the random generator is reinitialized with this state if warm start is used. Default RandState = 0. | ||
|-valign="top" | |-valign="top" | ||
|''AddMP''||If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999. | |''AddMP''||If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999. | ||
|-valign="top" | |-valign="top" | ||
|''nTrial''||For experimental design CLH, the method generates ''M ''= ''max''(10 ''* d, | |''nTrial''||For experimental design CLH, the method generates ''M ''= ''max''(10 ''* d, nTrial'') trial points, and evaluate them until ''nSample ''feasible points are found. In the random designs, ''nTrial ''is the maximum number of trial points randomly generated for each new point to sample. | ||
|-valign="top" | |-valign="top" | ||
|''CLHMethod''||Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added. | |''CLHMethod''||Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added. | ||
Line 1,293: | Line 1,297: | ||
1 - Large function values are replaced by the median. | 1 - Large function values are replaced by the median. | ||
''> ''1 - Large values Z are replaced by new values. The replacement is defined as ''Z '':= ''F M AX ''+ ''log''10(''Z - F M AX ''+ 1), where '' | ''> ''1 - Large values Z are replaced by new values. The replacement is defined as ''Z '':= ''F M AX ''+ ''log''10(''Z - F M AX ''+ 1), where ''FMAX ''= 10<sup>''REPLACE''</sup>, if ''min''(''F '') ''< ''0 and ''FMAX ''= 10<sup>(''ceil''(''log''10(''min''(''F '')))+''REPLACE'')</sup>, if ''min''(''F '') ''>= ''0. A new replacement is computed in every iteration, because ''min''(''F '') may change. Default REPLACE = 5, if no linear or nonlinear constraints. | ||
|-valign="top" | |-valign="top" | ||
|''LOCAL''||0 - No local searches after global search. If RBF surface is inaccurate, might be an advantage. | |''LOCAL''||0 - No local searches after global search. If RBF surface is inaccurate, might be an advantage. | ||
Line 1,320: | Line 1,324: | ||
=1 for the case ''idea ''=1, ''fStarRule ''=3. | =1 for the case ''idea ''=1, ''fStarRule ''=3. | ||
|-valign="top" | |-valign="top" | ||
|''fStarRule''||Global-Local search strategy in idea 1, where N is the cycle length. Define ''minsn ''as the global minimum on the RBF surface. The following strategies for setting the target value ''fStar ''is defined: 1: '' | |''fStarRule''||Global-Local search strategy in idea 1, where N is the cycle length. Define ''minsn ''as the global minimum on the RBF surface. The following strategies for setting the target value ''fStar ''is defined: 1: ''fStar ''= ''min<sub>sn</sub> - ''((''N - ''(''n - nInit''))''/N '')<sup>2</sup> ''* ''Δ''n ''(Default), 2: ''fStar ''= ''min<sub>sn</sub> - ''(''N - ''(''n - nInit''))''/N * ''Δ''n ''. | ||
Strategy 1 and 2 depends on Δ ''<sub>n</sub> ''estimate (see DeltaRule). If ''infStep ''=1, add <math>-\infty</math>-step first in cycle. 3: fStar = <math>-\infty</math>-step, ''min<sub>sn</sub>-k *''0''.''1''*<nowiki>|</nowiki>min<sub>sn</sub><nowiki>|</nowiki>k ''= ''N, ..., ''0. | |||
These strategies had the following names in Gutmanns thesis: III, II, I. | These strategies had the following names in Gutmanns thesis: III, II, I. | ||
|-valign="top" | |-valign="top" | ||
|''DeltaRule''||1 = Skip large f(x) when computing f(x) interval | |''DeltaRule''||1 = Skip large f(x) when computing f(x) interval δ. 0 = Use all points. Default 1. | ||
|-valign="top" | |-valign="top" | ||
|''eps_sn''||Relative tolerance used to test if the minimum of the RBF surface, ''minsn '', is sufficiently lower than the best point (''fM in '') found (default is 10''-''7 ). | |''eps_sn''||Relative tolerance used to test if the minimum of the RBF surface, ''minsn '', is sufficiently lower than the best point (''fM in '') found (default is 10''-''7 ). | ||
Line 1,380: | Line 1,386: | ||
1 = Function value f(x) is less than fGoal. | 1 = Function value f(x) is less than fGoal. | ||
2 = Error in function value ''f ''(''x'')'', <nowiki>|</nowiki>f - f Goal<nowiki>|</nowiki> = | 2 = Error in function value ''f ''(''x'')'', <nowiki>|</nowiki>f - f Goal<nowiki>|</nowiki> = fTol, fGoal ''= 0''.'' | ||
3 = Relative | 3 = Relative Error in function value ''f ''(''x'') is less than fTol, i.e. ''<nowiki>|</nowiki>f -'' ''f Goal<nowiki>|</nowiki>/<nowiki>|</nowiki>f Goal<nowiki>|</nowiki> = fTol''. | ||
4 = No new point sampled for MaxCycle iteration steps. | 4 = No new point sampled for MaxCycle iteration steps. | ||
Line 1,419: | Line 1,425: | ||
====Description==== | ====Description==== | ||
''arbfMIP ''implements the Adaptive Radial Basis Function (ARBF) algorithm | ''arbfMIP ''implements the Adaptive Radial Basis Function (ARBF) algorithm. The ARBF method handles linear equality and inequality constraints, and nonlinear equality and inequality constraints, as well as mixed-integer problems. | ||
====M-files Used==== | ====M-files Used==== | ||
Line 1,453: | Line 1,459: | ||
===Introduction=== | ===Introduction=== | ||
The task of global optimization is to find the set of parameters ''x ''in the feasible region <math>\Omega \subset R^d</math> for which the objective function ''f''(''x'') obtains its smallest value. In other words, a point <math>x^{\ast}</math> is a ''global optimizer ''to ''f ''(''x'') on <math>\Omega</math>, if <math>f(x^{\ast}) \leq f(x)</math>. On the other hand, a point | The task of global optimization is to find the set of parameters ''x ''in the feasible region <math>\Omega \subset R^d</math> for which the objective function ''f''(''x'') obtains its smallest value. In other words, a point <math>x^{\ast}</math> is a ''global optimizer ''to ''f ''(''x'') on <math>\Omega</math>, if <math>f(x^{\ast}) \leq f(x)</math>. On the other hand, a point <math>\hat{x}</math> is a ''local optimizer ''to ''f ''(''x''), if ''f ''(''x'') ''= f ''(''x'') for all ''x'' in some neighborhood around ''x''. Obviously, when the objective function has several local minima, there could be solutions that are locally optimal but not globally optimal and standard local optimization techniques are likely to get stuck before the global minimum is reached. Therefore, some kind of global search is needed to find the global minimum with some reliability. | ||
Previously a Matlab implementations of the DIRECT has been made, the new constrained DIRECT and the Efficient Global Optimization (EGO) algorithms. The implementations are part of the TOMLAB optimization environment. The implementation of the DIRECT algorithm is further discussed and analyzed in Bjorkman, Holmström. Since the objective functions in our applications often are expensive to compute, we have to focus on very efficient methods. At the IFIP TC7 Conference on System Modelling and Optimization in Cambridge 1999, Hans-Martin Gutmann presented his work on the RBF algorithm. The idea of the RBF algorithm is to use radial basis function interpolation to define a utility function (Powell). The next point, where the original objective function should be evaluated, is determined by optimizing on this utility function. The combination of our need for efficient global optimization software and the interesting ideas of Powell and Gutmann led to the development of an improved RBF algorithm implemented in Matlab. | Previously a Matlab implementations of the DIRECT has been made, the new constrained DIRECT and the Efficient Global Optimization (EGO) algorithms. The implementations are part of the TOMLAB optimization environment. The implementation of the DIRECT algorithm is further discussed and analyzed in Bjorkman, Holmström. Since the objective functions in our applications often are expensive to compute, we have to focus on very efficient methods. At the IFIP TC7 Conference on System Modelling and Optimization in Cambridge 1999, Hans-Martin Gutmann presented his work on the RBF algorithm. The idea of the RBF algorithm is to use radial basis function interpolation to define a utility function (Powell). The next point, where the original objective function should be evaluated, is determined by optimizing on this utility function. The combination of our need for efficient global optimization software and the interesting ideas of Powell and Gutmann led to the development of an improved RBF algorithm implemented in Matlab. | ||
Line 1,459: | Line 1,465: | ||
===The RBF Algorithm=== | ===The RBF Algorithm=== | ||
Our RBF algorithm is based on the ideas presented by Gutmann, with some extensions and further | Our RBF algorithm is based on the ideas presented by Gutmann, with some extensions and further development. The algorithm is implemented in the Matlab routine ''rbfSolve''. | ||
The RBF algorithm deals with problems of the form | The RBF algorithm deals with problems of the form | ||
Line 1,473: | Line 1,479: | ||
where <math>f \in R</math> and <math>x, x_{L}, x_{U} \in R^{d}</math>. We assume that no derivative information is available and that each function evaluation is very expensive. For example, the function value could be the result of a time-consuming experiment or computer simulation. | where <math>f \in R</math> and <math>x, x_{L}, x_{U} \in R^{d}</math>. We assume that no derivative information is available and that each function evaluation is very expensive. For example, the function value could be the result of a time-consuming experiment or computer simulation. | ||
===Description of the Algorithm=== | ====Description of the Algorithm==== | ||
We now consider the question of choosing the next point where the objective function should be evaluated. The idea of the RBF algorithm is to use radial basis function interpolation and a measure of 'bumpiness' of a radial function, ''σ ''say. A target value <math>f_n^*</math> is chosen that is an estimate of the global minimum of ''f ''. For each | We now consider the question of choosing the next point where the objective function should be evaluated. The idea of the RBF algorithm is to use radial basis function interpolation and a measure of 'bumpiness' of a radial function, ''σ ''say. A target value <math>f_n^*</math> is chosen that is an estimate of the global minimum of ''f ''. For each <math>y \neq \{x_1, ..., x_n \}</math> there exists a radial basis function that satisfies the interpolation conditions | ||
<math> | |||
s_y(x_i) = f(x_i), \quad i = 1, ..., n, | |||
</math> | |||
Here, the radial basis function interpolant '' | <math> | ||
s_y(y) = f^*_n | |||
</math> | |||
The next point ''x''<sub>n+1</sub> is calculated as the value of ''y ''in the feasible region that minimizes <math>\sigma(s_y)</math>. It turns out that the function <math>y \mapsto \sigma(s_y)</math> is much cheaper to compute than the original function. | |||
Here, the radial basis function interpolant ''s<sub>n</sub> ''has the form | |||
<math> | <math> | ||
Line 1,502: | Line 1,516: | ||
</math> | </math> | ||
where <math>\Phi</math> is the ''n × n ''matrix with <math>\Phi_{ij}=\phi\left(\left\|x_i-x_j\right\|_2\right)</math> and | |||
where <math>Phi</math> is the ''n × n ''matrix with <math>\Phi_{ij}=\phi\left(\left\|x_i-x_j\right\|_2\right)</math> and | |||
<math> | <math> | ||
Line 1,537: | Line 1,550: | ||
\right). | \right). | ||
</math> | </math> | ||
<math>s_y</math> could be obtained accordingly, but there is no need to do that as one is only interested in <math>\sigma(s_y)</math>. Powell shows that if the rank of ''P ''is ''d ''+ 1, then the matrix | <math>s_y</math> could be obtained accordingly, but there is no need to do that as one is only interested in <math>\sigma(s_y)</math>. Powell shows that if the rank of ''P ''is ''d ''+ 1, then the matrix | ||
Line 1,562: | Line 1,572: | ||
<math>\sigma(s_y) = \sigma(s_n) + \mu_n(y)\left[s_n(y)-f_n^*\right]^2,\ \ | <math>\sigma(s_y) = \sigma(s_n) + \mu_n(y)\left[s_n(y)-f_n^*\right]^2,\ \ | ||
y \notin \{x_1, \ldots ,x_n\}. </math> | y \notin \{x_1, \ldots ,x_n\}. </math> | ||
Thus minimizing <math>\sigma(s_y)</math> subject to constraints is equivalent to minimizing '' | Thus minimizing <math>\sigma(s_y)</math> subject to constraints is equivalent to minimizing ''g<sub>n</sub>'' defined as | ||
<math> | <math> | ||
Line 1,572: | Line 1,581: | ||
</math> | </math> | ||
where <math>\mu_n(y)</math> is the coefficient corresponding to ''y ''of the Lagrangian function ''L ''that satisfies <math>L(x_i)=0 | where <math>\mu_n(y)</math> is the coefficient corresponding to ''y ''of the Lagrangian function ''L ''that satisfies <math>L(x_i)=0</math>, <math>i=1, \ldots, n</math> and ''L''(''y'') = 1. It can be computed as follows. F is extended to | ||
<math> | <math> | ||
Line 1,624: | Line 1,633: | ||
is differentiable everywhere on <math>\Omega</math>, and is thus a better choice as objective function. Instead of minimizing <math>g_n(y)</math> one may minimize <math>-h_n(y)</math>. In our implementation we instead minimize <math>-\log (h_n(y))</math>. By this we avoid a flat minimum and numerical trouble when <math>h_n(y)</math> is very small. | is differentiable everywhere on <math>\Omega</math>, and is thus a better choice as objective function. Instead of minimizing <math>g_n(y)</math> one may minimize <math>-h_n(y)</math>. In our implementation we instead minimize <math>-\log (h_n(y))</math>. By this we avoid a flat minimum and numerical trouble when <math>h_n(y)</math> is very small. | ||
===The Choice of ''f *''=== | ====The Choice of ''f *''==== | ||
For the value of <math>$f_n^*</math> it should hold that | For the value of <math>$f_n^*</math> it should hold that | ||
Line 1,632: | Line 1,641: | ||
</math> | </math> | ||
The case <math>f_n^*=\min\limits_{y \in \Omega} s_n(y)</math> is only admissible if <math>\min\limits_{y \in \Omega} s_n(y)<s_n(x_i) | The case <math>f_n^*=\min\limits_{y \in \Omega} s_n(y)</math> is only admissible if <math>\min\limits_{y \in \Omega} s_n(y)<s_n(x_i)</math>, <mah>i=1, \ldots, n</math>. There are two special cases for the choice of <math>f_n^*</math>. In the case when <math>f_n^*=\min\limits_{y \in \Omega} s_n(y)</math>, then minimizing is equivalent to | ||
\ldots, n</math>. There are two special cases for the choice of <math>f_n^*</math>. In the case when <math>f_n^*=\min\limits_{y \in \Omega} s_n(y)</math>, then minimizing is equivalent to | |||
<math> | <math> | ||
Line 1,675: | Line 1,683: | ||
\min & f^*(y) \\ | \min & f^*(y) \\ | ||
\mathrm{s.t.} & \mu_n(y)\left[s_n(y)-f^*(y) \right]^2 \leq \alpha_n^2 \\ | \mathrm{s.t.} & \mu_n(y)\left[s_n(y)-f^*(y) \right]^2 \leq \alpha_n^2 \\ | ||
& y \in \Omega, | |||
\end{array} | \end{array} | ||
</math> | </math> | ||
Line 1,707: | Line 1,715: | ||
Denoting the minimizer by <math>y^*</math>, and choosing <math>f_n^*=f^*(y^*)</math>, it is evident that <math>y^*</math> minimizes <math>\mu_n(y)\left[s_n(y)-f_n^*\right]^2</math> and hence <math>g_n(y)</math>. | Denoting the minimizer by <math>y^*</math>, and choosing <math>f_n^*=f^*(y^*)</math>, it is evident that <math>y^*</math> minimizes <math>\mu_n(y)\left[s_n(y)-f_n^*\right]^2</math> and hence <math>g_n(y)</math>. | ||
For both strategies ('''idea 1 '''and '''idea 2'''), a check is performed when <math> | For both strategies ('''idea 1 '''and '''idea 2'''), a check is performed when <math>(n-n_{init})\mathrm{mod}(N+1)=N</math>. This is the stage when a purely local search is performed, so it is important to make sure that the minimizer of ''sn ''is not one of the interpolation points or too close to one. The test used is | ||
<math> | <math> | ||
Line 1,730: | Line 1,738: | ||
otherwise <math>\alpha_{n_0+3}</math> is set to 0. | otherwise <math>\alpha_{n_0+3}</math> is set to 0. | ||
===Factorizations and Updates=== | ====Factorizations and Updates==== | ||
In Powell, a factorization algorithm is presented for the solution. The algorithm makes use of the conditional definiteness of <math>\Phi</math>, i.e. <math>\lambda^T \Phi \lambda > 0, ~\lambda \neq 0</math> and <math>P^T \lambda=0</math>. If | In Powell, a factorization algorithm is presented for the solution. The algorithm makes use of the conditional definiteness of <math>\Phi</math>, i.e. <math>\lambda^T \Phi \lambda > 0, ~\lambda \neq 0</math> and <math>P^T \lambda=0</math>. If | ||
Line 1,784: | Line 1,792: | ||
Z_{y}^T \Phi_y Z_{y} = L_y L_y^T | Z_{y}^T \Phi_y Z_{y} = L_y L_y^T | ||
</math> | </math> | ||
is the Cholesky factorization, then the vector | is the Cholesky factorization, then the vector | ||
Line 1,832: | Line 1,839: | ||
where <math>H</math> is an orthogonal matrix obtained by ''d ''+ 1 Givens rotations and for <math>i=d+2, \ldots, n</math> the ''i''-th column of ''H'' is the ''i''-th unit vector. Denote <math>B=Q^T \Phi Q</math>. Using <math>\Phi_y</math> as defined in (10) consider the expanded ''B ''matrix | where <math>H</math> is an orthogonal matrix obtained by ''d ''+ 1 Givens rotations and for <math>i=d+2, \ldots, n</math> the ''i''-th column of ''H'' is the ''i''-th unit vector. Denote <math>B=Q^T \Phi Q</math>. Using <math>\Phi_y</math> as defined in (10) consider the expanded ''B ''matrix | ||
<math> | <math> | ||
\begin{array}{lll} | \begin{array}{lll} | ||
Line 1,842: | Line 1,850: | ||
Q & 0 \\ | Q & 0 \\ | ||
0 & 1 | 0 & 1 | ||
\end{array} \right) H = \\ | \end{array} \right) | ||
H = \\ & = & H^T \left( | |||
\begin{array}{cc} | \begin{array}{cc} | ||
B | B & Q^T \phi_y \\ | ||
\phi_y^T Q & 0 | \phi_y^T Q & 0 | ||
\end{array} \right) H. | \end{array} \right) H. | ||
Line 1,867: | Line 1,875: | ||
Z_{y}^T \Phi Z_{y} = \left( | Z_{y}^T \Phi Z_{y} = \left( | ||
\begin{array}{cc} | \begin{array}{cc} | ||
Z^T \Phi Z & v \\ | |||
v^T & \gamma | |||
\end{array} \right) | \end{array} \right) | ||
</math> | </math> | ||
Line 1,899: | Line 1,907: | ||
l^T L & l^T l + \beta^2 | l^T L & l^T l + \beta^2 | ||
\end{array} \right) = \\ | \end{array} \right) = \\ | ||
& = & \left( | |||
\begin{array}{cc} | \begin{array}{cc} | ||
Z^T \Phi Z & v \\ | Z^T \Phi Z & v \\ | ||
Line 1,925: | Line 1,933: | ||
<math> | <math> | ||
\min_{\lambda, c}\ \ | \min_{\lambda, c} \quad \textonehalf\lambda\T \Phi \lambda + \lambda\T (P c -F). | ||
</math> | </math> | ||
Viewing ''c ''as Lagrange multipliers for the linear equalities in (4), (47) is equivalent to the QP problem in '' | Viewing ''c ''as Lagrange multipliers for the linear equalities in (4), (47) is equivalent to the QP problem in ''λ ''defined as | ||
<math> | <math> | ||
\min_{\lambda}\ \ | \min_{\lambda} \quad \textonehalf \lambda\T \Phi \lambda - F^T\lambda \quad | ||
\lambda | \text{subject to} \quad P^T \lambda = 0. | ||
\text{subject to} P | |||
</math> | </math> | ||
Line 1,941: | Line 1,948: | ||
<math> | <math> | ||
\ | \left(\begin{array}{cc} | ||
\ | \Phi & P \\ P^T & 0 | ||
\ | \end{array}\right) | ||
\left(\begin{array}{c} | |||
\lambda \\ c} | |||
\end{array}\right) | |||
= | |||
\left(\begin{array}{c} | |||
g \\ r | |||
\end{array}\right). | |||
</math> | </math> | ||
Using the ''QR''-factorization in (28) the steps | Using the ''QR''-factorization in (28) the steps | ||
<math> | <math> | ||
R | \begin{array}{ccc} | ||
\\ Z | R^T v & = & r,\\ | ||
\\ \lambda &=& Yv + Zw, | Z^T \Phi Zw & = & Z^T(g - \Phi Yv),\\ | ||
\\ R c &=& Y | \lambda &=& Yv + Zw,\\ | ||
R c &=& Y^T(g - \Phi \lambda) | |||
\end{array} | |||
</math> | </math> | ||
Line 1,962: | Line 1,976: | ||
<math> | <math> | ||
\ | \left(\begin{array}{cc|c} | ||
\\ \phi^T & | \Phi & \phi & P\\ | ||
\\ \hline | \phi^T & 0 & p^T\\ | ||
P^T | \hline | ||
\ | P^T & p & 0 | ||
\ | \end{array}\right) | ||
\left(\begin{array}{c} | |||
\bar{\lambda} \\ \mu \\ \bar{c} | |||
\end{array}\right) = | |||
\left(\begin{array}{c} | |||
0 \\ 1 \\ 0 | |||
\end{array}\right), | |||
</math> | </math> | ||
where <math>p | where <math>p^T = (\,y^T\ \ 1\,)</math>. This permutes to | ||
<math> | <math> | ||
\ | \left(\begin{array}{cc|c} | ||
\\ | \Phi & P & \phi\\ | ||
\\ \hline | P^T & 0 & p\\ | ||
\phi^T & p^T & 0} | \hline | ||
\ | \phi^T & p^T & 0 | ||
\ | \end{array}\right) | ||
\left(\begin{array}{c} | |||
\bar{\lambda}\\ | |||
\bar{c} \\ | |||
\mu | |||
\end{array}\right) | |||
= | |||
\left(\begin{array}{c} | |||
0 \\ | |||
0 \\ | |||
1 | |||
\end{array}\right), | |||
</math> | </math> | ||
Line 1,984: | Line 2,015: | ||
<math> | <math> | ||
\ | \left(\begin{array}{cc} | ||
\ | \Phi & P \\ P^T & 0 | ||
\\ | \end{array}\right) | ||
\\ | |||
\left(\begin{array}{c} | |||
\hat{\lambda} \\ \hat{c}} | |||
\end{array}\right) | |||
= | |||
\left(\begin{array}{c} | |||
\phi \\ p | |||
\end{array}\right), | |||
</math> | |||
<math> | |||
\\\mu &=& -1 / (\phi^T \hat{\lambda} + p^T \hat{c}), | |||
</math> | |||
<math> | |||
\left(\begin{array}{c} | |||
\bar{\lambda} \\ \bar{c} | |||
\end{array}\right) | |||
= | |||
-\mu \left(\begin{array}{c} | |||
\hat{\lambda} \\ \hat{c} | |||
\end{array}\right). | |||
</math> | </math> | ||
Thus, each ''y ''requires little more than solving for <math>(\ | Thus, each ''y ''requires little more than solving for <math>(\hat{\lambda},\hat{c})</math> using the current factorizations (two operations each with ''Q'', ''R ''and ''L''). This is cheaper than updating the factors for each ''y'', and should be reliable unless the matrix in (4) is nearly singular. The updating procedure is best numerically, and it is still needed once when the final <math>y = x_{n+1}</math> is chosen. | ||
===A Compact Algorithm Description=== | ====A Compact Algorithm Description==== | ||
Section [[#Description of the Algorithm]]-[[#Factorizations and Updates]] described all the elements of the RBF algorithm as implemented in our Matlab routine ''rbfSolve'', but our discussion has covered several pages. We now summarize everything in a compact step-by-step description. Steps 2, 6 and 7 are different in '''idea 1 '''and '''idea 2'''. | Section [[#Description of the Algorithm]]-[[#Factorizations and Updates]] described all the elements of the RBF algorithm as implemented in our Matlab routine ''rbfSolve'', but our discussion has covered several pages. We now summarize everything in a compact step-by-step description. Steps 2, 6 and 7 are different in '''idea 1 '''and '''idea 2'''. | ||
{|class="wikitable" | {|class="wikitable" | ||
!idea 1||idea 2 | !||idea 1||idea 2 | ||
|-valign="top" | |-valign="top" | ||
|'''1:'''||Choose ''n ''initial points ''x''1 '', . . . , xn ''(normally the 2''d ''box corner points defined by the variable bounds). Compute ''Fi ''= ''f ''(''xi ''), ''i ''= 1'', ''2'', . . . , n ''and set ''ninit ''= ''n''. | |'''1:'''||Choose ''n ''initial points ''x''1 '', . . . , xn ''(normally the 2''d ''box corner points defined by the variable bounds). Compute ''Fi ''= ''f ''(''xi ''), ''i ''= 1'', ''2'', . . . , n ''and set ''ninit ''= ''n''. | ||
|-valign="top" | |-valign="top" | ||
'''2: '''Start a cycle of length 6.||Start a cycle of length 4. | |'''2: '''||Start a cycle of length 6.||Start a cycle of length 4. | ||
|-valign="top" | |-valign="top" | ||
'''3: '''If the maximum number of function evalua- tions reached, quit. | |'''3: '''||If the maximum number of function evalua- tions reached, quit. | ||
|-valign="top" | |-valign="top" | ||
'''4: '''Compute the radial basis function interpolant ''sn ''by solving the system of linear equations (4). | |'''4: '''||Compute the radial basis function interpolant ''sn ''by solving the system of linear equations (4). | ||
|-valign="top" | |-valign="top" | ||
'''5: '''Solve the minimization problem min ''sn ''(''y''). ''y?''? | |'''5: '''||Solve the minimization problem min ''sn ''(''y''). ''y?''? | ||
|-valign="top" | |-valign="top" | ||
'''6: '''Compute ''f * ''in (18) corresponding to the current position in the cycle.||Compute ''an ''in (22) corresponding to the current position in the cycle. | |'''6: '''||Compute ''f * ''in (18) corresponding to the current position in the cycle.||Compute ''an ''in (22) corresponding to the current position in the cycle. | ||
|-valign="top" | |-valign="top" | ||
'''7: '''New point ''xn''+1 is the value of ''y ''that mini- mizes ''gn ''(''y'') in (9).||New point ''x''<sub>n+1</sub> is the value of ''y ''that min- imizes ''f *''(''y'') in (24). | |'''7: '''||New point ''xn''+1 is the value of ''y ''that mini- mizes ''gn ''(''y'') in (9).||New point ''x''<sub>n+1</sub> is the value of ''y ''that min- imizes ''f *''(''y'') in (24). | ||
|-valign="top" | |-valign="top" | ||
'''8: '''Compute ''Fn''+1 = ''f ''(''xn''+1 ) and set ''n ''= ''n ''+ 1. | |'''8: '''||Compute ''Fn''+1 = ''f ''(''xn''+1 ) and set ''n ''= ''n ''+ 1. | ||
|-valign="top" | |-valign="top" | ||
'''9: '''If end of cycle, go to '''2'''. Otherwise go to '''4'''. | |'''9: '''||If end of cycle, go to '''2'''. Otherwise go to '''4'''. | ||
|} | |} | ||
===Some Implementation Details=== | ====Some Implementation Details==== | ||
The first question that arise is how to choose the points <math>x_1, | The first question that arise is how to choose the points <math>x_1, | ||
\ldots ,x_{n_{init}}</math> to include in the initial set. We only consider box constrained problems, and choose the corners of the box as initial points, i.e. <math>n_{init}=2^d</math>. Starting with other points is likely to lead to the corners during the iterations anyway. But as Gutmann suggests, having a "good" point beforehand, one can include it in the initial set. | \ldots ,x_{n_{init}}</math> to include in the initial set. We only consider box constrained problems, and choose the corners of the box as initial points, i.e. <math>n_{init}=2^d</math>. Starting with other points is likely to lead to the corners during the iterations anyway. But as Gutmann suggests, having a "good" point beforehand, one can include it in the initial set. |
Revision as of 05:09, 26 September 2011
Introduction
Overview
Welcome to the TOMLAB /CGO User's Guide. TOMLAB /CGO includes the solvers, rbfSolve, ego and arbfMIP. The solvers are specifically designed to solve costly (expensive) global optimization problems with up to roughly 30 decision variables. The costly component is only the objective function, i.e. if the constraints are costly as well they need to be integrated in the objective.
The overall solution approach followed by TOMLAB /CGO is based on the seamless combination of the global and local search strategies. The package requires the presence of a global solver and a local solver.
Contents of this manual
- #Introduction provides a basic overview of the TOMLAB /CGO solver package.
- #Using the Matlab Interface provides an overview of the solver interface.
- #Setting CGO Options describes how to set CGO solver options from Matlab.
- #TOMLAB /CHO Test Examples provides information regarding (non-costly) TOMLAB /CGO test examples.
- #TOMLAB /CGO Solver Reference gives detailed information about the interface routines rbfSolve, ego and arbfMIP.
More information
Please visit the following links for more information and see the references at the end of this manual.
- http://tomopt.com/tomlab/products/cgo/
- http://tomopt.com/tomlab/products/cgo/solvers/rbfSolve.php
- http://tomopt.com/tomlab/products/cgo/solvers/ego.php
- http://tomopt.com/tomlab/products/cgo/solvers/arbfMIP.php
Prerequisites
In this concise manual we assume that the user is familiar with global optimization and nonlinear programming, setting up problems in TOMLAB (in particular global constrained nonlinear (glc) problems) and with the Matlab language in general.
Using the Matlab Interface
The CGO solver package is accessed via the tomRun driver routine, which calls the rbfSolve, ego or arbfMIP routines.
<figtable id="costGlobSolv">
Function | Description |
---|---|
rbfSolve | Costly global solver routine called by the TOMLAB driver routine tomRun. This routine will also call local and global subsolvers. |
ego | Costly global solver routine called by the TOMLAB driver routine tomRun. This routine will also call local and global subsolvers. |
arbfMIP | Costly global solver routine called by the TOMLAB driver routine tomRun. This routine will also call local and global subsolvers. |
</figtable>
Setting CGO Options
All control parameters can be set directly from Matlab.
The parameters can be set as subfields in the Prob.CGO, Prob.optParam and Prob.GO structures. The following example shows how to set a limit on the maximum number of iterations when using a global subsolver to solve some sub problem and the global search idea (surface search strategy) used by rbfSolve. The major thing is most often to set the limit MaxFunc, defining how many costly function evaluations the CGO solver is allowed to use.
Prob = glcAssign(...) % Setup problem, see help glcAssign for more information Prob.GO.MaxIter = 50; % Setting the maximum number iterations. Prob.CGO.idea = 1; % Idea set to first option. Prob.optParam.MaxFunc = 90; % Maximal number of costly function evaluations
A complete description of the available CGO parameters can be found in #TOMLAB /CGO Solver Reference.
TOMLAB /CGO Test Examples
There are several test examples included in the general TOMLAB distribution. The examples are located in the testprob folder in TOMLAB. lgo1_prob contains one dimensional test problems while lgo2_prob includes two- and higher-dimensional. Several problems are also available in glb_prob, glc_prob, glcIP_prob and minlp_prob.
To test the solution of these problem sets with CGO, the following type of code can be used:
Prob = probInit('lgo1_prob', 1); Result = tomRun('rbfSolve', Prob, 1);
TOMLAB /CGO Solver Reference
A detailed description of the TOMLAB /CGO solvers is given below. Also see the M-file help for rbfSolve.m, ego.m and arbfMIP.m.
rbfSolve
Purpose
Solve general constrained mixed-integer global black-box optimization problems with costly objective functions.
The optimization problem is of the following form
Failed to parse (unknown function "\multicolumn"): {\displaystyle \begin{array}{cccccc} \min\limits_{x} & f(x) & & & & \\ s/t & x_{L} & \leq & x & \leq & x_{U} \\ & b_{L} & \leq & Ax & \leq & b_{U} \\ & c_{L} & \leq & c(x) & \leq & c_{U} \\ &\multicolumn{5}{c}{ ~x_{j} \in \MATHSET{N}\ ~~\forall j \in \MATHSET{I}}, \\ \end{array} }
where Failed to parse (unknown function "\MATHSET"): {\displaystyle f(x) \in \MATHSET{R}} ; Failed to parse (unknown function "\MATHSET"): {\displaystyle x_L,~x,~x_U \in \MATHSET{R}^d} ; the linear constraints are defined by Failed to parse (unknown function "\MATHSET"): {\displaystyle A \in \MATHSET{R}^{m_1 \times d}} , Failed to parse (unknown function "\MATHSET"): {\displaystyle b_L,~b_U \in \MATHSET{R}^{m_1}} ; and the nonlinear constraints are defined by Failed to parse (unknown function "\MATHSET"): {\displaystyle c_L,~c(x),~c_U \in~ \MATHSET{R}^{m_2}} . The variables are restricted to be integers, where Failed to parse (unknown function "\MATHSET"): {\displaystyle \MATHSET{I}} is an index subset of possibly empty. It is assumed that the function is continuous with respect to all variables, even if there is a demand that some variables only take integer values. Otherwise it would not make sense to do the surrogate modeling of used by all CGO solvers.
f (x) is assumed to be a costly function while c(x) is assumed to be cheaply computed. Any costly constraints can be treated by adding penalty terms to the objective function in the following way:
where weighting parameters wj have been added. The user then returns p(x) instead of f (x) to the CGO solver.
Calling Syntax
Result = rbfSolve(Prob,varargin) Result = tomRun('rbfSolve', Prob);
Description of Inputs
Problem description structure. The following fields are used:
Field | Description | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Name | Name of the problem. Used for security when doing warm starts. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
FUNCS.f | Name of function to compute the objective function. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
FUNCS.c | Name of function to compute the nonlinear constraint vector. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
x_L | Lower bounds on the variables. Must be finite. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
x_U | Upper bounds on the variables. Must be finite. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
b_U | Upper bounds for the linear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
b_L | Lower bounds for the linear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
A | Linear constraint matrix. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
c_L | Lower bounds for the nonlinear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
c_U | Upper bounds for the nonlinear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
WarmStart | Set true (non-zero) to load data from previous run from cgoSave.mat and resume optimization from where the last run ended. If Prob.CGO.WarmStartInfo has been defined through a call to WarmDefGLOBAL, this field is used instead of the cgoSave.mat file. All CGO solvers uses the same mat-file and structure field and can read the output of one another. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxCPU | Maximal CPU Time (in seconds) to be used. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
user | User field used to send information to low-level functions. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
PriLevOpt | Print Level. 0 = silent. 1 = Summary 2 = Printing each iteration. 3 = Info about local / global solution. 4 = Progress in x. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
PriLevSub | Print Level in subproblem solvers, see help in snSolve and gnSolve. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
f_Low | Lower bound on the optimal function value. If defined, used to restrict the target values into interval \[f Low,min(surface)\]. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
optParam | Structure with optimization parameters. The following fields are used: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxFunc | Maximal number of costly function evaluations, default 300 for rbfSolve and arbfMIP, and default 200 for ego. MaxFunc must be <= 5000. If WarmStart = 1 and MaxFunc <= nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
IterPrint | Print one information line each iteration, and the new x tried. Default IterPrint = 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
fGoal | Goal for function value, not used if inf or empty. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
eps_f | Relative accuracy for function value, fTol == eps_f. Stop if |f - f Goal| <= |fGoal| * fTol, if fGoal ≠ 0. Stop if |f - fGoal| <= fTol, if fGoal = 0. See the output field maxTri. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
bTol | Linear constraint tolerance. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
cTol | Nonlinear constraint tolerance. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxIter | Maximal number of iterations used in the local optimization on the re- sponse surface in each step. Default 1000, except for pure IP problems, then max(GO.MaxFunc, MaxIter);. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CGO | Structure (Prob.CGO) with parameters concerning global optimization options. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Percent | Type of strategy to get the initial sampled values:
Negative values of Percent result in constrained versions of the experimental design methods 7-16. It means that all points sampled are feasible with respect to all given constraints. For ExD 5,6-12,14-16 user defined points are used. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
nSample | Number of sample points to be used in initial experimental design. nSample is used differently dependent on the value of Percent:
where LATIN = [21 21 33 41 51 65 65] and k = |nSample|. Otherwise nSample as input does not matter. Description of the experimental designs: ExD 1, All Corners. Initial points is the corner points of the box given by Prob.x_L and Prob.x_U. Generates 2d points, which results in too many points when the dimension is high. ExD 2, Lower and Upper Corner point + adjacent points. Initial points are 2 * d + 2 corners: the lower left corner xL and its d adjacent corners xL + (xU(i) - xL (i)) * ei, i = 1, ..., d and the upper right corner xU and its d adjacent corners xU - (xU (i) - xL (i)) * ei, i = 1, ..., d ExD 3. Initial points are the upper right corner xU and its d adjacent corners xU - (xU (i) - xL (i)) * ei , i = 1, ..., d ExD 4. Initial points are the lower left corner xL and its d adjacent corners xL + (xU (i) - xL (i)) * ei , i = 1, ..., d ExD 5. User given initial points, given as a matrix in CGO.X. Each column is one sampled point. If d >= length(Prob.x L), then size(X,1) = d, size(X,2) = d + 1. CGO.F should be defined as empty, or contain a vector of corresponding f (x) values. Any CGO.F value set as NaN will be computed by solver routine. ExD 6. Use determinstic global optimization methods to find the initial design. Current methods available (all DIRECT methods), dependent on the value of Percent: 99 = glcDirect, 98 = glbDirect, 97 = glcSolve, 96 = glbSolve, 95 = glcFast, 94 = glbFast. ExD 7-11. Optimal Latin Hypercube Designs (LHD) with respect to different norms. The following norms and designs are available, dependent on the value of Percent: 1 = Maximin 1-Norm, 2 = Maximin 2-Norm, 3 = Maximin Inf-Norm, 4 = Audze-Eglais Norm, 5 = Minimax 2-Norm. All designs taken from: http://www.spacefillingdesigns.nl/ Constrained versions will try bigger and bigger designs up to M = max(10 * d, nTrial) different designs, stopping when it has found nSample feasible points. ExD 12. Latin hypercube space-filling design. For nSample < 0, k = |nSample| should in principle be the problem dimension. The number of points sampled is: k : 2 3 4 5 6 > 6 Points : 21 33 41 51 65 65 The call made is: X = daceInit(abs(nSample),Prob.x_L,Prob.x_U); Set nSample = [] to get (d+1)*(d+2)/2 sampled points: d : 1 2 3 4 5 6 7 8 9 10 Points : 3 6 10 15 21 28 36 45 55 66 This is a more efficient number of points to use. If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. Constrained version of Latin hypercube only keep points that fulfill the linear and nonlinear constraints. The algorithm will try up to M = max(10 * d, nTrial) points, stopping when it has found nSample feasible points (d + 1 points if nSample < 0). ExD 13. Orthogonal Sampling, LH with subspace density demands. ExD 14-16. Random strategies, the |Percent| value gives the percentage size of an ellipsoid, circle or rectangle around the so far sampled points that new points are not allowed in. Range 1%-50%. Recommended values 10% - 20%. If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
X,F,CX | The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
X | A matrix of initial x values. One column for every x value. If ExD == 5, size(X,2) >= dim(x)+1 needed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
F | A vector of initial f (x) values. If any element is set to NaN it will be computed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CX | Optionally a matrix of nonlinear constraint c(x) values. If nonempty, then size(CX,2) == size(X,2). If any element is set as NaN, the vector c(x) = CX(:,i) will be recomputed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
RandState | If >= 0, rand('state', RandState) is set to initialize the pseudo-random generator. If < 0, rand('state', 100 * clock) is set to give a new set of random values each run. If isnan(RandState), the random state is not initialized. RandState will influence if a stochastic initial experimental design is applied, see input Percent and nSample. RandState will also influence if using the multiMin solver, but the random state seed is not reset in multiMin. The state of the random generator is saved in the warm start output rngState, and the random generator is reinitialized with this state if warm start is used. Default RandState = 0. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
AddMP | If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
nTrial | For experimental design CLH, the method generates M = max(10 * d, nTrial) trial points, and evaluate them until nSample feasible points are found. In the random designs, nTrial is the maximum number of trial points randomly generated for each new point to sample. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CLHMethod | Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added.
1 - Take the sampled infeasible points in order. 2 - Take a random sample of the infeasible points. 3 - Use points with lowest constraint error (cErr). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SCALE | 0 - Original search space (default if any integer values).
1 - Transform search space to unit cube (default if no integers). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
REPLACE | 0 - No replacement, default for constrained problems.
1 - Large function values are replaced by the median. > 1 - Large values Z are replaced by new values. The replacement is defined as Z := FMAX + log10(Z - FMAX + 1), where FMAX = 10REPLACE , if min(F ) < 0 and FMAX = 10(ceil(log10(min(F )))+REPLACE), if min(F ) = 0. A new replacement is computed in every iteration, because min(F ) may change. Default REPLACE = 5, if no linear or nonlinear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
LOCAL | 0 - No local searches after global search. If RBF surface is inaccurate, might be an advantage.
1 - Local search from best points after global search. If equal best function values, up to 20 local searches are done. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SMOOTH | 1 - The problem is smooth enough for local search using numerical gradient estimation methods (default).
0 - The problem is nonsmooth or noisy, and local search methods using numer- ical gradient estimation are likely to produce garbage search directions. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
globalSolver | Global optimization solver used for subproblem optimization. Default glcCluster (SMOOTH=1) or glcDirect (SMOOTH=0). If the global- Solver is glcCluster, the fields Prob.GO.maxFunc1, Prob.GO.maxFunc2, Prob.GO.maxFunc3, Prob.GO.localSolver, Prob.GO.DIRECT and other fields set in Prob.GO are used. See the help for these parameters in glcCluster. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
localSolver | Local optimization solver used for subproblem optimization. If not defined, the TOMLAB default constrained NLP solver is used.
- Special RBF algorithm parameters in Prob.CGO - | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
rbfType | Type of radial basis function: 1 - thin plate spline; 2 - Cubic Spline (default); 3 - Multiquadric; 4 - Inverse multiquadric; 5 - Gaussian; 6 - Linear. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
idea | Type of search strategy on the response surface.
idea = 1 - cycle of N+1 points in target value fnStar. if fStarRule =3, then N=1 default, otherwise N=4 default. By default idea =1, fStarRule =1, i.e. N =4. To change N, see below. idea = 2 - cycle of 4 points (N+1, N=3 always) in alpha. alpha is a bound on an algorithmic constraint that implicitly sets a target value fStar. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
N | Cycle length in idea 1 (default N=1 for fStarRule 3, otherwise default N=4) or idea 2 (always N=3). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
infStep | If =1, add search step with target value -8 first in cycle. Default 0. Always
=1 for the case idea =1, fStarRule =3. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
fStarRule | Global-Local search strategy in idea 1, where N is the cycle length. Define minsn as the global minimum on the RBF surface. The following strategies for setting the target value fStar is defined: 1: fStar = minsn - ((N - (n - nInit))/N )2 * Δn (Default), 2: fStar = minsn - (N - (n - nInit))/N * Δn .
Strategy 1 and 2 depends on Δ n estimate (see DeltaRule). If infStep =1, add -step first in cycle. 3: fStar = -step, minsn-k *0.1*|minsn|k = N, ..., 0. These strategies had the following names in Gutmanns thesis: III, II, I. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
DeltaRule | 1 = Skip large f(x) when computing f(x) interval ?. 0 = Use all points. Default 1. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
AddSurfMin | Add up to AddSurfMin interior local minima on RBF surface as search points, based on estimated Lipschitz constants. AddSurfMin=0 implies no additional minimum added (Default). This option is only possible if globalSolver = multiMin. Test for additional minimum is done in the local step (modN == N) If these additional local minima are used, in the printout modN = -2, -3, -4, ... are the iteration steps with these search points. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
TargetMin | Which minimum, if several minima found, to select in the target value problem:
=0 Use global minimum. =1 Use best interior local minima, if none use global minimum. =2 Use best interior local minima, if none use RBF interior minimum. =3 Use best minimum with lowest number of coefficients on bounds. Default is TargetMin = 3. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
eps_sn | Relative tolerance used to test if the minimum of the RBF surface, minsn , is sufficiently lower than the best point (fM in ) found (default is 10-7 ). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxCycle | Max number of cycles without progress before stopping, default 10. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
GO | Structure Prob.GO (Default values are set for all fields).
The following fields are used: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxFunc | Maximal number of function evaluations in each global search. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxIter | Maximal number of iterations in each global search. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
DIRECT | DIRECT solver used in glcCluster, either glcSolve or glcDirect(default). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
maxFunc1 | glcCluster parameter, maximum number of function evaluations in the first call. Only used if globalSolver is glcCluster, see help globalSolver. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
maxFunc2 | glcCluster parameter, maximum number of function evaluations in the second call. Only used if globalSolver is glcCluster, see help globalSolver. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
maxFunc3 | glcCluster parameter, maximum sum of function evaluations in repeated first calls to DIRECT routine when trying to get feasible. Only used if globalSolver is glcCluster, see help globalSolver. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
localSolver | The local solver used by glcCluster. If not defined, then Prob.CGO.localSolver is used | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MIP | Structure in Prob, Prob.MIP.
Defines integer optimization parameters. Fields used: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
IntVars | If empty, all variables are assumed non-integer.
If islogical(IntVars) (=all elements are 0/1), then 1 = integer variable, 0 = continuous variable. If any element > 1, IntVars is the indices for integer variables. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
varargin | Other parameters directly sent to low level routines. |
Description of Outputs
Structure with result from optimization. The following fields are changed:
Field | Description |
---|---|
x_k | Matrix with the best points as columns. |
f_k | The best function value found so far. |
Iter | Number of iterations. |
FuncEv | Number of function evaluations. |
ExitText | Text string with information about the run. |
ExitFlag | Always 0. |
CGO | Subfield WarmStartInfo saves warm start information, the same information as in cgoSave.mat, see below. |
Inform | Information parameter.
0 = Normal termination. 1 = Function value f(x) is less than fGoal. 2 = Error in function value f (x), |f - fGoal| <= fTol, fGoal = 0. 3 = Relative Error in function value f (x) is less than fTol, i.e. |f - fGoal| |fGoal| <= fTol. 4 = No new point sampled for MaxCycle iteration steps. 5 = All sample points same as the best point for MaxCycle last iterations. 6 = All sample points same as previous point for MaxCycle last iterations. 7 = All feasible integers tried. 8 = No progress for MaxCycle * (N + 1) + 1 function evaluations (> MaxCycle cycles, input CGO.MaxCycle). 9 = Max CPU Time reached. |
cgoSave.mat | To make a warm start possible, all CGO solvers saves information in the file cgoSave.mat. The file is created independent of the solver, which enables the user to call any CGO solver using the warm start information. cgoSave.mat is a MATLAB mat-file saved to the current directory. If the parameter SAVE is 1, the CGO solver saves the mat file every iteration, which enables the user to break the run and restart using warm start from the current state. SAVE = 1 is currently always set by the CGO solvers. If the cgoSave.mat file fails to open for writing, the information is also available in the output field Result.CGO.WarmStartInfo, if the run was concluded without interruption. Through a call to WarmDefGLOBAL, the Prob structure can be setup for warm start. In this case, the CGO solver will not load the data from cgoSave.mat. The file contains the following variables: |
Name | Problem name. Checked against the Prob.Name field if doing a warmstart. |
O | Matrix with sampled points (in original space). |
X | Matrix with sampled points (in unit space if SCALE==1) |
F | Vector with function values (penalty added for costly Cc(x)) |
F_m | Vector with function values (replaced). |
F00 | Vector of pure function values, before penalties. Cc MMatrix with costly constraint values, C c(x). nInit Number of initial points. |
Fpen | Vector with function values + additional penalty if infeasible using the linear constraints and noncostly nonlinear c(x). |
fMinIdx | Index of the best point found. |
rngState | Current state of the random number generator used. |
Description
rbfSolve implements the Radial Basis Function (RBF) algorithm based on the work by Gutmann. The RBF method is enhanced to handle linear equality and inequality constraints, and nonlinear equality and inequality constraints, as well as mixed-integer problems.
A response surface based on radial basis functions is fitted to a collection of sampled points. The algorithm then balances between minimizing the fitted function and adding new points to the set.
M-files Used
daceInit.m, iniSolve.m, endSolve.m, conAssign.m, glcAssign.m, snSolve.m, gnSolve.m, expDesign.m.
MEX-files Used
tomsol
See Also
ego.m
Warnings
Observe that when cancelling with CTRL+C during a run, some memory allocated by rbfSolve will not be deal- located. To deallocate, do:
>> clear cgolib
ego
Purpose
Solve general constrained mixed-integer global black-box optimization problems with costly objective functions.
The optimization problem is of the following form
Failed to parse (unknown function "\multicolumn"): {\displaystyle \begin{array}{cccccc} \min\limits_{x} & f(x) & & & & \\ s/t & x_{L} & \leq & x & \leq & x_{U} \\ & b_{L} & \leq & Ax & \leq & b_{U} \\ & c_{L} & \leq & c(x) & \leq & c_{U} \\ &\multicolumn{5}{c}{ ~x_{j} \in \MATHSET{N}\ ~~\forall j \in \MATHSET{I}}, \\ \end{array} }
where Failed to parse (unknown function "\MATHSET"): {\displaystyle f(x) \in \MATHSET{R}}
; Failed to parse (unknown function "\MATHSET"): {\displaystyle x_L,~x,~x_U \in \MATHSET{R}^d}
;
the linear constraints are defined by Failed to parse (unknown function "\MATHSET"): {\displaystyle A \in \MATHSET{R}^{m_1 \times d}}
, Failed to parse (unknown function "\MATHSET"): {\displaystyle b_L,~b_U \in \MATHSET{R}^{m_1}}
;
and the nonlinear constraints are defined by Failed to parse (unknown function "\MATHSET"): {\displaystyle c_L,~c(x),~c_U \in~ \MATHSET{R}^{m_2}}
.
The variables are restricted to be integers,
where Failed to parse (unknown function "\MATHSET"): {\displaystyle \MATHSET{I}}
is an index subset of possibly empty.
It is assumed that the function is continuous with respect to all
variables, even if there is a demand that some variables only take integer values.
Otherwise it would not make sense to do the surrogate modeling of used by all CGO solvers.
f (x) is assumed to be a costly function while c(x) is assumed to be cheaply computed. Any costly constraints can be treated by adding penalty terms to the objective function in the following way:
where weighting parameters wj have been added. The user then returns p(x) instead of f (x) to the CGO solver.
Calling Syntax
Result=ego(Prob,varargin) Result = tomRun('ego', Prob);
Description of Inputs
Problem description structure. The following fields are used:
Field | Description | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Name | Name of the problem. Used for security when doing warm starts. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
FUNCS.f | Name of function to compute the objective function. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
FUNCS.c | Name of function to compute the nonlinear constraint vector. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
x_L | Lower bounds on the variables. Must be finite. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
x_U | Upper bounds on the variables. Must be finite. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
b_U | Upper bounds for the linear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
b_L | Lower bounds for the linear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
A | Linear constraint matrix. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
c_L | Lower bounds for the nonlinear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
c_U | Upper bounds for the nonlinear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
WarmStart | Set true (non-zero) to load data from previous run from cgoSave.mat and re- sume optimization from where the last run ended. If Prob.CGO.WarmStartInfo has been defined through a call to WarmDefGLOBAL, this field is used instead of the cgoSave.mat file. All CGO solvers uses the same mat-file and structure field and can read the output of one another. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxCPU | Maximal CPU Time (in seconds) to be used. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
user | User field used to send information to low-level functions. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
PriLevOpt | Print level. 0 = silent. 1 = Summary, 2 = Printing each iteration, 3 = Info about local / global solution, 4 = Progress in x. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
PriLevSub | Print Level in subproblem solvers. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
optParam | Structure with optimization parameters. The following fields are used: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxFunc | Maximal number of costly function evaluations, default 300 for rbfSolve and arbfMIP, and default 200 for ego. MaxFunc must be <= 5000. If WarmStart = 1 and MaxFunc <= nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
IterPrint | Print one information line each iteration, and the new x tried. Default IterPrint = 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
fGoal | Goal for function value, not used if inf or empty. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
eps_f | Relative accuracy for function value, fTol == eps_f . Stop if |f - f Goal| = |f Goal| * fTol , if f Goal = 0. Stop if |f - f Goal| = fTol , if f Goal = 0. See the output field maxTri. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
bTol | Linear constraint tolerance. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
cTol | Nonlinear constraint tolerance. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxIter | Maximal number of iterations used in the local optimization on the re- sponse surface in each step. Default 1000, except for pure IP problems, then max(GO.MaxFunc, MaxIter);. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CGO | Structure (Prob.CGO) with parameters concerning global optimization options.
The following general fields in Prob.CGO are used: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Percent | Type of strategy to get the initial sampled values:
Negative values of Percent result in constrained versions of the experimental design methods 7-16. It means that all points sampled are feasible with respect to all given constraints. For ExD 5,6-12,14-16 user defined points are used. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
nSample | Number of sample points to be used in initial experimental design. nSample is used differently dependent on the value of Percent:
where LATIN = [21 21 33 41 51 65 65] and k = |nSample|. Otherwise nSample as input does not matter. Description of the experimental designs: ExD 1, All Corners. Initial points is the corner points of the box given by Prob.x L and Prob.x U. Generates 2d points, which results in too many points when the dimension is high. ExD 2, Lower and Upper Corner point + adjacent points. Initial points are 2 * d + 2 corners: the lower left corner xL and its d adjacent corners xL + (xU (i) - xL (i)) * ei , i = 1, ..., d and the upper right corner xU and its d adjacent corners xU - (xU(i) - xL(i)) * ei , i = 1, ..., d ExD 3. Initial points are the upper right corner xU and its d adjacent corners xU - (xU(i) - xL(i)) * ei , i = 1, ..., d ExD 4. Initial points are the lower left corner xL and its d adjacent corners xL + (xU (i) - xL (i)) * ei , i = 1, ..., d ExD 5. User given initial points, given as a matrix in CGO.X. Each column is one sampled point. If d = length(Prob.x_L), then size(X,1) = d, size(X,2) = d + 1. CGO.F should be defined as empty, or contain a vector of corresponding f (x) values. Any CGO.F value set as NaN will be computed by solver routine. ExD 6. Use determinstic global optimization methods to find the initial design. Current methods available (all DIRECT methods), dependent on the value of Percent: 99 = glcDirect, 98 = glbDirect, 97 = glcSolve, 96 = glbSolve, 95 = glcFast, 94 = glbFast. ExD 7-11. Optimal Latin Hypercube Designs (LHD) with respect to different norms. The following norms and designs are available, dependent on the value of Percent: 1 = Maximin 1-Norm, 2 = Maximin 2-Norm, 3 = Maximin Inf-Norm, 4 = Audze-Eglais Norm, 5 = Minimax 2-Norm. All designs taken from: http://www.spacefillingdesigns.nl/ Constrained versions will try bigger and bigger designs up to M = max(10 * d, nT rial) different designs, stopping when it has found nSample feasible points. ExD 12. Latin hypercube space-filling design. For nSample < 0, k = |nSample| should in principle be the problem dimension. The number of points sampled is: k : 2 3 4 5 6 > 6 Points : 21 33 41 51 65 65 The call made is: X = daceInit(abs(nSample),Prob.x L,Prob.x U); Set nSample = [] to get (d+1)*(d+2)/2 sampled points: d : 1 2 3 4 5 6 7 8 9 10 Points : 3 6 10 15 21 28 36 45 55 66 This is a more efficient number of points to use. If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. Constrained version of Latin hypercube only keep points that fulfill the linear and nonlinear constraints. The algorithm will try up to M = max(10 * d, nTrial) points, stopping when it has found nSample feasible points (d + 1 points if nSample < 0). ExD 13. Orthogonal Sampling, LH with subspace density demands. ExD 14-16. Random strategies, the |Percent| value gives the percentage size of an ellipsoid, circle or rectangle around the so far sampled points that new points are not allowed in. Range 1%-50%. Recommended values 10% - 20%. If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
X,F,CX | The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
X | A matrix of initial x values. One column for every x value. If ExD == 5, size(X,2) >= dim(x)+1 needed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
F | A vector of initial f (x) values. If any element is set to NaN it will be computed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CX | Optionally a matrix of nonlinear constraint c(x) values. If nonempty, then size(CX,2) == size(X,2). If any element is set as NaN, the vector c(x) = CX(:,i) will be recomputed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
RandState | If = 0, rand('state', RandState) is set to initialize the pseudo-random generator. If < 0, rand('state', 100 * clock) is set to give a new set of random values each run. If isnan(RandState), the random state is not initialized. RandState will influence if a stochastic initial experimental design is applied, see input Percent and nSample. RandState will also influence if using the multiMin solver, but the random state seed is not reset in multiMin. The state of the random generator is saved in the warm start output rngState, and the random generator is reinitialized with this state if warm start is used. Default RandState = 0. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
AddMP | If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
nTrial | For experimental design CLH, the method generates M = max(10 * d, nTrial) trial points, and evaluate them until nSample feasible points are found. In the random designs, nTrial is the maximum number of trial points randomly generated for each new point to sample. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CLHMethod | Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added.
1 - Take the sampled infeasible points in order. 2 - Take a random sample of the infeasible points. 3 - Use points with lowest constraint error (cErr). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SCALE | 0 - Original search space (default if any integer values).
1 - Transform search space to unit cube (default if no integers). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
REPLACE | 0 - No replacement, default for constrained problems.
1 - Large function values are replaced by the median. > 1 - Large values Z are replaced by new values. The replacement is defined as Z := FMAX + log10(Z - FMAX + 1), where FMAX = 10REPLACE, if min(F ) < 0 and FMAX = 10(ceil(log10(min(F )))+REPLACE) , if min(F ) >= 0. A new replacement is computed in every iteration, because min(F ) may change. Default REPLACE = 5, if no linear or nonlinear constraints. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
LOCAL | 0 - No local searches after global search. If RBF surface is inaccurate, might be an advantage.
1 - Local search from best points after global search. If equal best function values, up to 20 local searches are done. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SMOOTH | 1 - The problem is smooth enough for local search using numerical gradient estimation methods (default).
0 - The problem is nonsmooth or noisy, and local search methods using numer- ical gradient estimation are likely to produce garbage search directions. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
globalSolver | Global optimization solver used for subproblem optimization. Default glcCluster (SMOOTH=1) or glcDirect (SMOOTH=0). If the global- Solver is glcCluster, the fields Prob.GO.maxFunc1, Prob.GO.maxFunc2, Prob.GO.maxFunc3, Prob.GO.localSolver, Prob.GO.DIRECT and other fields set in Prob.GO are used. See the help for these parameters in glcCluster. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
localSolver | Local optimization solver used for subproblem optimization. If not defined, the TOMLAB default constrained NLP solver is used.
- Special EGO algorithm parameters in Prob.CGO - | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
EGOAlg | Main algorithm in the EGO solver (default EGOAlg == 1)
=1 Run expected improvement steps (modN=0,1,2,...). If no f (x) improve- ment, use DACE surface minimum (modN=-1) in 1 step =2 Run expected improvement steps (modN=0) until ExpI/-yMin- ¡ Tol- ExpI for 3 successive steps (modN=1,2,3) without f (x) improvement (fRed = 0), where yMin is fMin transformed by TRANSFORM After 2 such steps (when modN=2), 1 step using the DACE surface minimum (modN=-1) is tried. If then fRed ¿0, reset to modN=0 steps. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
pEst | 1 - Estimate d-vector, p parameters (default), 0 - fix p=2. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
pEst | Norm parameters, fixed or estimated, also see p0, pLow, pUpp (default pEst = 0).
0 = Fixed constant p-value for all components (default, p0=1.99). 1 = Estimate one p-value valid for all components. > 1 = Estimate d||||p parameters, one for each component. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
p0 | Fixed p-value (pEst==0, default = 1.99) or initial p-value (pEst == 1, default 1.9) or d-vector of initial p-values (pEst > 1, default 1.9*ones(d,1)) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
pLow | Lower bound on p.
If pEst == 0, not used if pEst == 1, lower bound on p-value (default 1.0) if pEst > 1, lower bounds on p (default ones(d,1)) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
pUpp | Upper bound on p.
If pEst == 0, not used if pEst == 1, upper bound on p-value (default 2.0) if pEst > 1, upper bounds on p (default 2*ones(d,1)) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
TRANSFORM | Function value transformation.
0 - No transformation made. 1 - Median value transformation. Use REPLACE instead. 2 - log(y) transformation made. 3 - -log(-y) transformation made. 4 - -1/y transformation made. Default EGO is computing the best possible transformation from the initial set of data. Note! No check is made on illegal y if user gives TRANSFORM. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
EITRANSFORM | Transformation of expected improvement function (default 1).
= 0 No transformation made. = 1 - log(-f ) transformation made. = 2 -1/f transformation made. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
TolExpI | Convergence tolerance for expected improvement (default 10-6 ). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SAMPLEF | Sample criterion function:
0 = Expected improvment (default) 1 = Kushner's criterion (related option: KEPS) 2 = Lower confidence bounding (related option: LCBB) 3 = Generalized expected improvement (related option: GEIG) 4 = Maximum variance 5 = Watson and Barnes 2 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
KEPS | The ε parameter in the Kushner's criterion (default: -0.01).
If KEPS > 0, then E = K EP S. If KEPS < 0, then E = \|K EP S\| * fM in . | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
GEIG | The exponent g in the generalized expected improvement function (default 2.0). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
LCBB | Lower Confidence Bounding parameter b (default 2.0). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
GO | Structure Prob.GO (Default values are set for all fields).
The following fields are used: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxFunc | Maximal number of function evaluations in each global search. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxIter | Maximal number of iterations in each global search. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
DIRECT | DIRECT solver used in glcCluster, either glcSolve or glcDirect(default). | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
maxFunc1 | glcCluster parameter, maximum number of function evaluations in the first call. Only used if globalSolver is glcCluster, see help globalSolver. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
maxFunc2 | glcCluster parameter, maximum number of function evaluations in the second call. Only used if globalSolver is glcCluster, see help globalSolver. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
maxFunc3 | glcCluster parameter, maximum sum of function evaluations in repeated first calls to DIRECT routine when trying to get feasible. Only used if globalSolver is glcCluster, see help globalSolver. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
localSolver | The local solver used by glcCluster. If not defined, then Prob.CGO.localSolver is used | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MIP | Structure in Prob, Prob.MIP.
Defines integer optimization parameters. Fields used: | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
IntVars | If empty, all variables are assumed non-integer.
If islogical(IntVars) (=all elements are 0/1), then 1 = integer variable, 0 = continuous variable. If any element > 1, IntVars is the indices for integer variables. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
varargin | Other arguments sent directly to low level functions. |
Description of Outputs
Structure with result from optimization.
Output | Description |
---|---|
x_k | Matrix with the best points as columns. |
f_k | The best function value found so far. |
Iter | Number of iterations. |
FuncEv | Number of function evaluations. |
ExitText | Text string with information about the run. |
ExitFlag | Always 0. |
CGO | Subfield WarmStartInfo saves warm start information, the same information as in cgoSave.mat, see below. |
Inform | Information parameter.
0 = Normal termination. 1 = Function value f (x) is less than fGoal. 2 = Error in function value f (x), abs(f - fGoal) <= fTol, fGoal=0. 3 = Relative Error in function value f(x) is less than fTol, i.e. abs(f - fGoal)/abs(fGoal) <= fTol. 4 = No new point sampled for N iteration steps. 5 = All sample points same as the best point for N last iterations. 6 = All sample points same as previous point for N last iterations. 7 = All feasible integers tried. 9 = Max CPU Time reached. 10 = Expected improvement low for three iterations. |
Result | Structure with result from optimization. |
cgoSave.mat | To make a warm start possible, all CGO solvers saves information in the file cgoSave.mat. The file is created independent of the solver, which enables the user to call any CGO solver using the warm start information. cgoSave.mat is a MATLAB mat-file saved to the current directory. If the parameter SAVE is 1, the CGO solver saves the mat file every iteration, which enables the user to break the run and restart using warm start from the current state. SAVE = 1 is currently always set by the CGO solvers. If the cgoSave.mat file fails to open for writing, the information is also available in the output field Result.CGO.WarmStartInfo, if the run was concluded without interruption. Through a call to WarmDefGLOBAL, the Prob structure can be setup for warm start. In this case, the CGO solver will not load the data from cgoSave.mat. The file contains the following variables: |
Name | Problem name. Checked against the Prob.Name field if doing a warmstart. |
O | Matrix with sampled points (in original space). |
X | Matrix with sampled points (in unit space if SCALE==1) |
F | Vector with function values (penalty added for costly Cc(x)) |
F_m | Vector with function values (replaced). |
F00 | Vector of pure function values, before penalties. |
Cc | MMatrix with costly constraint values, C c(x). |
nInit | Number of initial points. |
Fpen | Vector with function values + additional penalty if infeasible using the linear constraints and noncostly nonlinear c(x). |
fMinIdx | Index of the best point found. |
rngState | Current state of the random number generator used. |
Description
ego implements the algorithm EGO by D. R. Jones, Matthias Schonlau and William J. Welch presented in the paper "Efficient Global Optimization of Expensive Black-Box Functions".
Please note that Jones et al. has a slightly different problem formulation. The TOMLAB version of ego treats linear and nonlinear constraints separately.
ego samples points to which a response surface is fitted. The algorithm then balances between sampling new points and minimization on the surface.
ego and rbfSolve use the same format for saving warm start data. This means that it is possible to try one solver for a certain number of iterations/function evaluations and then do a warm start with the other. Example:
>> Prob = probInit('glc_prob',1); % Set up problem structure >> Result_ego = tomRun('ego',Prob); % Solve for a while with ego >> Prob.WarmStart = 1; % Indicate a warm start >> Result_rbf = tomRun('rbfSolve',Prob); % Warm start with rbfSolve
M-files Used
iniSolve.m, endSolve.m, conAssign.m, glcAssign.m
See Also
rbfSolve
Warnings
Observe that when cancelling with CTRL+C during a run, some memory allocated by ego will not be deallocated.
To deallocate, do:
>> clear egolib
arbfMIP
Purpose
Solve general constrained mixed-integer global black-box optimization problems with costly objective functions.
The optimization problem is of the following form
Failed to parse (unknown function "\multicolumn"): {\displaystyle \begin{array}{cccccc} \min\limits_{x} & f(x) & & & & \\ s/t & x_{L} & \leq & x & \leq & x_{U} \\ & b_{L} & \leq & Ax & \leq & b_{U} \\ & c_{L} & \leq & c(x) & \leq & c_{U} \\ &\multicolumn{5}{c}{ ~x_{j} \in \MATHSET{N}\ ~~\forall j \in \MATHSET{I}}, \\ \end{array} }
where Failed to parse (unknown function "\MATHSET"): {\displaystyle f(x) \in \MATHSET{R}}
; Failed to parse (unknown function "\MATHSET"): {\displaystyle x_L,~x,~x_U \in \MATHSET{R}^d}
;
the linear constraints are defined by Failed to parse (unknown function "\MATHSET"): {\displaystyle A \in \MATHSET{R}^{m_1 \times d}}
, Failed to parse (unknown function "\MATHSET"): {\displaystyle b_L,~b_U \in \MATHSET{R}^{m_1}}
;
and the nonlinear constraints are defined by Failed to parse (unknown function "\MATHSET"): {\displaystyle c_L,~c(x),~c_U \in~ \MATHSET{R}^{m_2}}
.
The variables are restricted to be integers,
where Failed to parse (SVG with PNG fallback (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \MATHSET{I}}
is an index subset of possibly empty.
It is assumed that the function is continuous with respect to all
variables, even if there is a demand that some variables only take integer values.
Otherwise it would not make sense to do the surrogate modeling of used by all CGO solvers.
f (x) is assumed to be a costly function while c(x) is assumed to be cheaply computed. Any costly constraints can be treated by adding penalty terms to the objective function in the following way:
where weighting parameters wj have been added. The user then returns p(x) instead of f (x) to the CGO solver.
Calling Syntax
Result = arbfMIP(Prob,varargin) Result = tomRun('arbfMIP', Prob);
Description of Inputs
Problem description structure. The following fields are used:
Input | Description | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Name | Name of the problem. Used for security when doing warm starts. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
FUNCS.f | Name of function to compute the objective function. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
FUNCS.c | Name of function to compute the nonlinear constraint vector. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
x_L | Lower bounds on the variables. Must be finite. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
x_U | Upper bounds on the variables. Must be finite. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
b_U | Upper bounds for the linear constraints. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
b_L | Lower bounds for the linear constraints. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
A | Linear constraint matrix. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
c_L | Lower bounds for the nonlinear constraints. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
c_U | Upper bounds for the nonlinear constraints. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
WarmStart | Set true (non-zero) to load data from previous run from cgoSave.mat and re- sume optimization from where the last run ended. If Prob.CGO.WarmStartInfo has been defined through a call to WarmDefGLOBAL, this field is used instead of the cgoSave.mat file. All CGO solvers uses the same mat-file and structure field and can read the output of one another. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxCPU | Maximal CPU Time (in seconds) to be used. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
user | User field used to send information to low-level functions. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
PriLevOpt | Print Level. 0 = silent. 1 = Summary 2 = Printing each iteration. 3 = Info about local / global solution. 4 = Progress in x. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
PriLevSub | Print Level in subproblem solvers, see help in snSolve and gnSolve. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
f_Low | Lower bound on the optimal function value. If defined, used to restrict the target values into interval \[f Low,min(surface)\]. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
optParam | Structure with optimization parameters. The following fields are used: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxFunc | Maximal number of costly function evaluations, default 300 for rbfSolve and arbfMIP, and default 200 for ego. MaxFunc must be <= 5000. If WarmStart = 1 and MaxFunc <= nFunc (Number of f(x) used) then set MaxFunc := MaxFunc + nFunc. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
IterPrint | Print one information line each iteration, and the new x tried. Default IterPrint = 1. fMinI means the best f(x) is infeasible. fMinF means the best f(x) is feasible (also integer feasible). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
fGoal | Goal for function value, not used if inf or empty. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
eps_f | Relative accuracy for function value, fTol == eps_f . Stop if |f - fGoal| = |f_Goal| * fTol , if fGoal = 0. Stop if |f - fGoal| = fTol , if fGoal = 0. See the output field maxTri. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
bTol | Linear constraint tolerance. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
cTol | Nonlinear constraint tolerance. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxIter | Maximal number of iterations used in the local optimization on the re- sponse surface in each step. Default 1000, except for pure IP problems, then max(GO.MaxFunc, MaxIter);. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CGO | Structure (Prob.CGO) with parameters concerning global optimization options.
The following general fields in Prob.CGO are used: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Percent | Type of strategy to get the initial sampled values:
Negative values of Percent result in constrained versions of the experimental design methods 7-16. It means that all points sampled are feasible with respect to all given constraints. For ExD 5,6-12,14-16 user defined points are used. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
nSample | Number of sample points to be used in initial experimental design. nSample is used differently dependent on the value of Percent:
where LATIN = [21 21 33 41 51 65 65] and k = |nSample|. Otherwise nSample as input does not matter. Description of the experimental designs: ExD 1, All Corners. Initial points is the corner points of the box given by Prob.x L and Prob.x U. Generates 2d points, which results in too many points when the dimension is high. ExD 2, Lower and Upper Corner point + adjacent points. Initial points are 2 * d + 2 corners: the lower left corner xL and its d adjacent corners xL + (xU (i) - xL (i)) * ei , i = 1, ..., d and the upper right corner xU and its d adjacent corners xU - (xU (i) - xL (i)) * ei , i = 1, ..., d ExD 3. Initial points are the upper right corner xU and its d adjacent corners xU - (xU (i) - xL (i)) * ei , i = 1, ..., d ExD 4. Initial points are the lower left corner xL and its d adjacent corners xL + (xU (i) - xL (i)) * ei , i = 1, ..., d ExD 5. User given initial points, given as a matrix in CGO.X. Each column is one sampled point. If d = length(Prob.x L), then size(X,1) = d, size(X,2) = d + 1. CGO.F should be defined as empty, or contain a vector of corresponding f (x) values. Any CGO.F value set as NaN will be computed by solver routine. ExD 6. Use determinstic global optimization methods to find the initial design. Current methods available (all DIRECT methods), dependent on the value of Percent: 99 = glcDirect, 98 = glbDirect, 97 = glcSolve, 96 = glbSolve, 95 = glcFast, 94 = glbFast. ExD 7-11. Optimal Latin Hypercube Designs (LHD) with respect to different norms. The following norms and designs are available, dependent on the value of Percent: 1 = Maximin 1-Norm, 2 = Maximin 2-Norm, 3 = Maximin Inf-Norm, 4 = Audze-Eglais Norm, 5 = Minimax 2-Norm. All designs taken from: http://www.spacefillingdesigns.nl/ Constrained versions will try bigger and bigger designs up to M = max(10 * d, nT rial) different designs, stopping when it has found nSample feasible points. ExD 12. Latin hypercube space-filling design. For nSample < 0, k = |nSample| should in principle be the problem dimension. The number of points sampled is: k : 2 3 4 5 6 > 6 Points : 21 33 41 51 65 65 The call made is: X = daceInit(abs(nSample),Prob.x L,Prob.x U); Set nSample = [ ] to get (d+1)*(d+2)/2 sampled points: d : 1 2 3 4 5 6 7 8 9 10 Points : 3 6 10 15 21 28 36 45 55 66 This is a more efficient number of points to use. If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. Constrained version of Latin hypercube only keep points that fulfill the linear and nonlinear constraints. The algorithm will try up to M = max(10 * d, nTrial) points, stopping when it has found nSample feasible points (d + 1 points if nSample < 0). ExD 13. Orthogonal Sampling, LH with subspace density demands. ExD 14-16. Random strategies, the |Percent| value gives the percentage size of an ellipsoid, circle or rectangle around the so far sampled points that new points are not allowed in. Range 1%-50%. Recommended values 10% - 20%. If CGO.X is nonempty, these points are verified as in ExD 5, and treated as already sampled points. Then nSample additional points are sampled, restricted to be close to the given points. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
X,F,CX | The fields X,F,CX are used to define user given points. ExD = 5 (Percent = 0) needs this information. If ExD == 6-12,14-16 these points are included into the design. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
X | A matrix of initial x values. One column for every x value. If ExD == 5, size(X,2) >= dim(x)+1 needed. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
F | A vector of initial f (x) values. If any element is set to NaN it will be computed. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CX | Optionally a matrix of nonlinear constraint c(x) values. If nonempty, then size(CX,2) == size(X,2). If any element is set as NaN, the vector c(x) = CX(:,i) will be recomputed. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
RandState | If >= 0, rand('state', RandState) is set to initialize the pseudo-random generator. If < 0, rand('state', 100 * clock) is set to give a new set of random values each run. If isnan(RandState), the random state is not initialized. RandState will influence if a stochastic initial experimental design is applied, see input Percent and nSample. RandState will also influence if using the multiMin solver, but the random state seed is not reset in multiMin. The state of the random generator is saved in the warm start output rngState, and the random generator is reinitialized with this state if warm start is used. Default RandState = 0. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
AddMP | If = 1, add the midpoint as extra point in the corner strategies. Default 1 for any corner strategy, i.e. Percent is 900, 997, 998 or 999. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
nTrial | For experimental design CLH, the method generates M = max(10 * d, nTrial) trial points, and evaluate them until nSample feasible points are found. In the random designs, nTrial is the maximum number of trial points randomly generated for each new point to sample. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CLHMethod | Different search strategies for finding feasible LH points. First of all, the least infeasible point is added. Then the linear feasible points are considered. If more points are needed still, the nonlinear infeasible points are added.
1 - Take the sampled infeasible points in order. 2 - Take a random sample of the infeasible points. 3 - Use points with lowest constraint error (cErr). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SCALE | 0 - Original search space (default if any integer values).
1 - Transform search space to unit cube (default if no integers). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
REPLACE | 0 - No replacement, default for constrained problems.
1 - Large function values are replaced by the median. > 1 - Large values Z are replaced by new values. The replacement is defined as Z := F M AX + log10(Z - F M AX + 1), where FMAX = 10REPLACE, if min(F ) < 0 and FMAX = 10(ceil(log10(min(F )))+REPLACE), if min(F ) >= 0. A new replacement is computed in every iteration, because min(F ) may change. Default REPLACE = 5, if no linear or nonlinear constraints. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
LOCAL | 0 - No local searches after global search. If RBF surface is inaccurate, might be an advantage.
1 - Local search from best points after global search. If equal best function values, up to 20 local searches are done. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
SMOOTH | 1 - The problem is smooth enough for local search using numerical gradient estimation methods (default).
0 - The problem is nonsmooth or noisy, and local search methods using numer- ical gradient estimation are likely to produce garbage search directions. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
globalSolver | Global optimization solver used for subproblem optimization. Default glcCluster (SMOOTH=1) or
glcDirect (SMOOTH=0). If the global Solver is glcCluster, the fields Prob.GO.maxFunc1, Prob.GO.maxFunc2, Prob.GO.maxFunc3, Prob.GO.localSolver, Prob.GO.DIRECT and other fields set in Prob.GO are used. See the help for these parameters in glcCluster. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
localSolver | Local optimization solver used for subproblem optimization. If not defined, the TOMLAB default constrained NLP solver is used.
- Special RBF algorithm parameters in Prob.CGO - | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
rbfType | Type of radial basis function: 1 - thin plate spline; 2 - Cubic Spline (default); 3 - Multiquadric; 4 - Inverse multiquadric; 5 - Gaussian; 6 - Linear. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
idea | Global search type, always idea = 1, i.e. use fnStar values. if fStarRule =3, then N=1 default, otherwise N=4 default. By default idea =1, fStarRule =1, i.e. N =4. To change N, see below. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
N | Cycle length in idea 1 (default N=1 for fStarRule 3, otherwise default N=4) or idea 2 (always N=3). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
infStep | If =1, add search step with target value -8 first in cycle. Default 0. Always
=1 for the case idea =1, fStarRule =3. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
fStarRule | Global-Local search strategy in idea 1, where N is the cycle length. Define minsn as the global minimum on the RBF surface. The following strategies for setting the target value fStar is defined: 1: fStar = minsn - ((N - (n - nInit))/N )2 * Δn (Default), 2: fStar = minsn - (N - (n - nInit))/N * Δn .
Strategy 1 and 2 depends on Δ n estimate (see DeltaRule). If infStep =1, add -step first in cycle. 3: fStar = -step, minsn-k *0.1*|minsn|k = N, ..., 0. These strategies had the following names in Gutmanns thesis: III, II, I. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
DeltaRule | 1 = Skip large f(x) when computing f(x) interval δ. 0 = Use all points. Default 1. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
eps_sn | Relative tolerance used to test if the minimum of the RBF surface, minsn , is sufficiently lower than the best point (fM in ) found (default is 10-7 ). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxCycle | Max number of cycles without progress before stopping, default 10. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
GO | Structure Prob.GO (Default values are set for all fields). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxFunc | Maximal number of function evaluations in each global search. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
MaxIter | Maximal number of iterations in each global search. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
DIRECT | DIRECT solver used in glcCluster, either glcSolve or glcDirect(default). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
maxFunc1 | glcCluster parameter, maximum number of function evaluations in the first call. Only used if globalSolver is glcCluster, see help globalSolver. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
maxFunc2 | glcCluster parameter, maximum number of function evaluations in the second call. Only used if globalSolver is glcCluster, see help globalSolver. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
maxFunc3 | glcCluster parameter, maximum sum of function evaluations in repeated first calls to DIRECT routine when trying to get feasible. Only used if globalSolver is glcCluster, see help globalSolver. localSolver The local solver used by glcCluster. If not defined, then Prob.CGO.localSolver is used MIP Structure in Prob, Prob.MIP.
Defines integer optimization parameters. Fields used: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
IntVars | If empty, all variables are assumed non-integer.
If islogical(IntVars) (=all elements are 0/1), then 1 = integer variable, 0 = continuous variable. If any element > 1, IntVars is the indices for integer variables. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
varargin | Other parameters directly sent to low level routines. |
Description of Outputs
Structure with result from optimization. The following fields are changed:
Output | Description |
---|---|
x_k | Matrix with the best points as columns. |
f_k | The best function value found so far. |
Iter | Number of iterations. |
FuncEv | Number of function evaluations. |
ExitText | Text string with information about the run. |
ExitFlag | Always 0. |
CGO | Subfield WarmStartInfo saves warm start information, the same information as in cgoSave.mat, see below. |
Inform | Information parameter.
0 = Normal termination. 1 = Function value f(x) is less than fGoal. 2 = Error in function value f (x), |f - f Goal| = fTol, fGoal = 0. 3 = Relative Error in function value f (x) is less than fTol, i.e. |f - f Goal|/|f Goal| = fTol. 4 = No new point sampled for MaxCycle iteration steps. 5 = All sample points same as the best point for MaxCycle last iterations. 6 = All sample points same as previous point for MaxCycle last iterations. 7 = All feasible integers tried. 9 = Max CPU Time reached. |
cgoSave.mat | To make a warm start possible, all CGO solvers saves information in the file cgoSave.mat. The file is created independent of the solver, which enables the user to call any CGO solver using the warm start information. cgoSave.mat is a MATLAB mat-file saved to the current directory. If the parameter SAVE is 1, the CGO solver saves the mat file every iteration, which enables the user to break the run and restart using warm start from the current state. SAVE = 1 is currently always set by the CGO solvers. If the cgoSave.mat file fails to open for writing, the information is also available in the output field Result.CGO.WarmStartInfo, if the run was concluded without interruption. Through a call to WarmDefGLOBAL, the Prob structure can be setup for warm start. In this case, the CGO solver will not load the data from cgoSave.mat. The file contains the following variables: |
Name | Problem name. Checked against the Prob.Name field if doing a warmstart. |
O | Matrix with sampled points (in original space). |
X | Matrix with sampled points (in unit space if SCALE==1) |
F | Vector with function values (penalty added for costly Cc(x)) |
F m | Vector with function values (replaced). |
F00 | Vector of pure function values, before penalties. |
Cc | MMatrix with costly constraint values, C c(x). nInitNumber of initial points. |
Fpen | Vector with function values + additional penalty if infeasible using the linear constraints and noncostly nonlinear c(x). |
fMinIdx | Index of the best point found. |
rngState | Current state of the random number generator used. |
Description
arbfMIP implements the Adaptive Radial Basis Function (ARBF) algorithm. The ARBF method handles linear equality and inequality constraints, and nonlinear equality and inequality constraints, as well as mixed-integer problems.
M-files Used
daceInit.m, iniSolve.m, endSolve.m, conAssign.m, glcAssign.m, snSolve.m, gnSolve.m, expDesign.m.
MEX-files Used
tomsol
See Also
rbfSolve.m and ego.m
Warnings
Observe that when cancelling with CTRL+C during a run, some memory allocated by arbfMIP will not be deallocated. To deallocate, do:
''>> ''clear cgolib
rbfSolve description
Following is a detailed description of the rbfSolve algorithm.
Summary
The manual considers global optimization of costly objective functions, i.e. the problem of finding the global minimum when there are several local minima and each function value takes considerable CPU time to compute. Such problems often arise in industrial and financial applications, where a function value could be a result of a time- consuming computer simulation or optimization. Derivatives are most often hard to obtain, and the algorithms presented make no use of such information.
The emphasis is on a new method by Gutmann and Powell, A radial basis function method for global optimization. This method is a response surface method, similar to the Efficient Global Optimization (EGO) method of Jones. The TOMLAB implementation of the Radial Basis Function (RBF) method is described in detail.
Introduction
The task of global optimization is to find the set of parameters x in the feasible region for which the objective function f(x) obtains its smallest value. In other words, a point is a global optimizer to f (x) on , if . On the other hand, a point is a local optimizer to f (x), if f (x) = f (x) for all x in some neighborhood around x. Obviously, when the objective function has several local minima, there could be solutions that are locally optimal but not globally optimal and standard local optimization techniques are likely to get stuck before the global minimum is reached. Therefore, some kind of global search is needed to find the global minimum with some reliability.
Previously a Matlab implementations of the DIRECT has been made, the new constrained DIRECT and the Efficient Global Optimization (EGO) algorithms. The implementations are part of the TOMLAB optimization environment. The implementation of the DIRECT algorithm is further discussed and analyzed in Bjorkman, Holmström. Since the objective functions in our applications often are expensive to compute, we have to focus on very efficient methods. At the IFIP TC7 Conference on System Modelling and Optimization in Cambridge 1999, Hans-Martin Gutmann presented his work on the RBF algorithm. The idea of the RBF algorithm is to use radial basis function interpolation to define a utility function (Powell). The next point, where the original objective function should be evaluated, is determined by optimizing on this utility function. The combination of our need for efficient global optimization software and the interesting ideas of Powell and Gutmann led to the development of an improved RBF algorithm implemented in Matlab.
The RBF Algorithm
Our RBF algorithm is based on the ideas presented by Gutmann, with some extensions and further development. The algorithm is implemented in the Matlab routine rbfSolve.
The RBF algorithm deals with problems of the form
where and . We assume that no derivative information is available and that each function evaluation is very expensive. For example, the function value could be the result of a time-consuming experiment or computer simulation.
Description of the Algorithm
We now consider the question of choosing the next point where the objective function should be evaluated. The idea of the RBF algorithm is to use radial basis function interpolation and a measure of 'bumpiness' of a radial function, σ say. A target value is chosen that is an estimate of the global minimum of f . For each there exists a radial basis function that satisfies the interpolation conditions
The next point xn+1 is calculated as the value of y in the feasible region that minimizes . It turns out that the function is much cheaper to compute than the original function.
Here, the radial basis function interpolant sn has the form
with , , and is either cubic with or the thin plate spline . Gutmann considers other choices of and of the additional polynomial, but later concludes that the situation in the multiquadric and Gaussian cases is disappointing.
The unknown parameters , b and a are obtained as the solution of the system of linear equations
where is the n × n matrix with and
could be obtained accordingly, but there is no need to do that as one is only interested in . Powell shows that if the rank of P is d + 1, then the matrix
is nonsingular and the linear system (4) has a unique solution.
For sn it is
Further, it is shown that is
Failed to parse (syntax error): {\displaystyle \sigma(s_y) = \sigma(s_n) + \mu_n(y)\left[s_n(y)-f_n^*\right]^2,\ \ y \notin \{x_1, \ldots ,x_n\}. }
Thus minimizing subject to constraints is equivalent to minimizing gn defined as
where is the coefficient corresponding to y of the Lagrangian function L that satisfies , and L(y) = 1. It can be computed as follows. F is extended to
where , and P is extended to
Then is the (n + 1)-th component of that solves the system
We use the notation 0n and for column vectors with all entries equal to zero and with dimension n and (d + 1), respectively. The computation of is done for many different y when minimizing . This requires operations if not exploiting the structure of and . Hence it does not make sense to solve the full system each time. A better alternative is to factorize the interpolation matrix and update the factorization for each y. An algorithm that requires operations is described in #Factorizations and Updates.
When there are large differences between function values, the interpolant has a tendency to oscillate strongly. It might also happen that is much lower than the best known function value, which leads to a choice of that overemphasizes global search. To handle these problems, large function values are in each iteration replaced by the median of all computed function values. Note that and are not defined at and
This will cause problems when is evaluated at a point close to one of the known points. The function defined by
is differentiable everywhere on , and is thus a better choice as objective function. Instead of minimizing one may minimize . In our implementation we instead minimize . By this we avoid a flat minimum and numerical trouble when is very small.
The Choice of f *
For the value of it should hold that
The case is only admissible if , <mah>i=1, \ldots, n</math>. There are two special cases for the choice of . In the case when , then minimizing is equivalent to
In the case when , then minimizing is equivalent to
So how should be chosen? If , then the algorithm will choose the new point in an unexplored region, which is good from a global search point of view, but the objective function will not be exploited at all. If , the algorithm will show good local behaviour, but the global minimum might be missed. Therefore, there is a need for a mixture of values for close to and far away from . Gutmann describes two different strategies for the choice of .
The first strategy, denoted idea 1, is to perform a cycle of length N + 1 and choose as
with
where is the number of initial points. Here, N = 5 is fixed and is not taken over all points, except for the first step of the cycle. In each of the subsequent steps the points with largest function value are removed (not considered) when taking the maximum. Hence the quantity is decreasing until the cycle is over. Then all points are considered again and the cycle starts from the beginning. More formally, if , , otherwise
Failed to parse (syntax error): {\displaystyle n_{max}=\max\left\{2,n_{max}-\mathrm{floor}((n-n_{init})/N) }
The second strategy, denoted idea 2, is to consider as the optimal value of
and then perform a cycle of length N + 1 on the choice of an . Here, N = 3 is fixed and
where is set to n at the beginning of each cycle. For this strategy, is taken over all points in all parts of the cycle.
Note that for a fixed y the optimal is the one for which
Substituting this equality constraint into the objective simplifies the problem to the minimization of
Denoting the minimizer by , and choosing , it is evident that minimizes and hence .
For both strategies (idea 1 and idea 2), a check is performed when . This is the stage when a purely local search is performed, so it is important to make sure that the minimizer of sn is not one of the interpolation points or too close to one. The test used is
where is the best function value found so far, i.e. , . For the first strategy (idea 1), then
otherwise is set to 0. For the second strategy (idea 2), then an (or more correctly ) is set
otherwise is set to 0.
Factorizations and Updates
In Powell, a factorization algorithm is presented for the solution. The algorithm makes use of the conditional definiteness of , i.e. and . If
is the QR decomposition of P , then the columns of Z span the null space of , and every with can be expressed as for some vector z. Thus the conditional positive definiteness implies that
This shows that is positive definite, and thus its Cholesky factorization
exists. This property can be used to solve (4) as follows. Consider the interpolation condition in (4). Multiply from left by and replace by . Because , the interpolation condition simplifies to
Solving this system using the Cholesky factorization gives z. Then compute and solve
for c using the QR decomposition of P as
The same principle can be applied to solve (12) for a given y to get . In analogy to the discussion above, if the extended matrices and in (10) and (11), respectively, are given, and if
and
is the Cholesky factorization, then the vector
yields , where z(y) solves
The Cholesky factorization is the most expensive part of this procedure. It requires operations. As must be computed for many different y this is inacceptable. However, if one knows the QR factors of P and the Cholesky factor of Z T FZ , the QR factorization of Py and the new Cholesky factor Ly can be computed in operations.
The new is
where , . The new P (y) is
Compute the QR factorization of Py, defined in (10). Given , the QR factorization of may be written as
where is an orthogonal matrix obtained by d + 1 Givens rotations and for the i-th column of H is the i-th unit vector. Denote . Using as defined in (10) consider the expanded B matrix
Multiplications from the right and left with H affects only the first (d + 1) rows and columns and the last row and the last columns of the matrix in the middle. (Remember, d is the dimension of the problem). Hence
where * denotes entries not important for the moment. From the form of By it follows that
holds. The Cholesky factorization of is already known. The new Cholesky factor is found by solving the lower triangular system for l, computing , and setting
It is easily seen that because
Note that in practice we do the following: First compute the factorization of P , i.e. , using Givens rotations. Then, since we are only interested in v and in (42), it is not necessary to compute the matrix By in (41). Setting v to the last column in Qyand computing ( is symmetric), gives v and by multiplying the last (n - d) columns in Qy by , i.e.
Using this algorithm, v and γ are computed using ((n + 1) + (n - d)) inner products instead of the two matrix multiplications in (41).
Note that the factorization algorithm is a normal 'null-space' method for solving an optimization problem involving linear equality constraints. The system of linear equations in (4) defines the necessary conditions for a stationary point to the unconstrained quadratic programming (QP) problem
Failed to parse (unknown function "\textonehalf"): {\displaystyle \min_{\lambda, c} \quad \textonehalf\lambda\T \Phi \lambda + \lambda\T (P c -F). }
Viewing c as Lagrange multipliers for the linear equalities in (4), (47) is equivalent to the QP problem in λ defined as
Failed to parse (SVG with PNG fallback (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \min_{\lambda} \quad \textonehalf \lambda\T \Phi \lambda - F^T\lambda \quad \text{subject to} \quad P^T \lambda = 0. }
The first condition in the conditional positive definiteness definition is the same as saying that the reduced Hessian must be positive definite at the solution of the QP problem if that solution is to be unique.
The type of update procedure described above is suitable each time an optimal point y = xn+1 is added. However, when evaluating all candidates y an even more efficient algorithm can be formulated. What is needed is a black-box procedure to solve linear systems with a general right-hand side:
Failed to parse (unknown function "\begin{array}"): {\displaystyle \left(\begin{array}{cc} \Phi & P \\ P^T & 0 \end{array}\right) \left(\begin{array}{c} \lambda \\ c} \end{array}\right) = \left(\begin{array}{c} g \\ r \end{array}\right). }
Using the QR-factorization in (28) the steps
simplify when r = 0 as in (4), but all steps are useful for solving the extended system (49); see next.
For each of many vectors y, the extended system takes the form
where . This permutes to
which may be solved by block-LU factorization (also known as the Schur-complement method). It helps that most of the right-hand side is zero. The solution is given by the steps
Failed to parse (unknown function "\begin{array}"): {\displaystyle \left(\begin{array}{cc} \Phi & P \\ P^T & 0 \end{array}\right) \left(\begin{array}{c} \hat{\lambda} \\ \hat{c}} \end{array}\right) = \left(\begin{array}{c} \phi \\ p \end{array}\right), }
Failed to parse (syntax error): {\displaystyle \\\mu &=& -1 / (\phi^T \hat{\lambda} + p^T \hat{c}), }
Thus, each y requires little more than solving for using the current factorizations (two operations each with Q, R and L). This is cheaper than updating the factors for each y, and should be reliable unless the matrix in (4) is nearly singular. The updating procedure is best numerically, and it is still needed once when the final is chosen.
A Compact Algorithm Description
Section #Description of the Algorithm-#Factorizations and Updates described all the elements of the RBF algorithm as implemented in our Matlab routine rbfSolve, but our discussion has covered several pages. We now summarize everything in a compact step-by-step description. Steps 2, 6 and 7 are different in idea 1 and idea 2.
idea 1 | idea 2 | |
---|---|---|
1: | Choose n initial points x1 , . . . , xn (normally the 2d box corner points defined by the variable bounds). Compute Fi = f (xi ), i = 1, 2, . . . , n and set ninit = n. | |
2: | Start a cycle of length 6. | Start a cycle of length 4. |
3: | If the maximum number of function evalua- tions reached, quit. | |
4: | Compute the radial basis function interpolant sn by solving the system of linear equations (4). | |
5: | Solve the minimization problem min sn (y). y?? | |
6: | Compute f * in (18) corresponding to the current position in the cycle. | Compute an in (22) corresponding to the current position in the cycle. |
7: | New point xn+1 is the value of y that mini- mizes gn (y) in (9). | New point xn+1 is the value of y that min- imizes f *(y) in (24). |
8: | Compute Fn+1 = f (xn+1 ) and set n = n + 1. | |
9: | If end of cycle, go to 2. Otherwise go to 4. |
Some Implementation Details
The first question that arise is how to choose the points to include in the initial set. We only consider box constrained problems, and choose the corners of the box as initial points, i.e. . Starting with other points is likely to lead to the corners during the iterations anyway. But as Gutmann suggests, having a "good" point beforehand, one can include it in the initial set.
The subproblem
is itself a problem which could have more than one local minima. To solve (51) (at least approximately), we start from the interpolation point with the least function value, i.e. , , and perform a local search. In many cases this leads to the minimum of sn . Of course, there is no guarantee that it does. We use analytical expressions for the derivatives of sn and perform the local optimization using ucSolve in TOMLAB running the inverse BFGS algorithm.
To minimize for the first strategy, or for the second strategy, we use our Matlab routine glbSolve implementing the DIRECT algorithm (see the TOMLAB manual). We run glbSolve for 500 function evaluations and choose xn+1 as the best point found by glbSolve. When (when a purely local search is performed) and the minimizer of sn is not too close to any of the interpolation points, i.e. (25) is not true, glbSolve is not used to minimize or . Instead, we choose the minimizer of (51) as the new point xn+1. The TOMLAB routine AppRowQR is used to update the QR decomposition.
Our experience so far with the RBF algorithm shows that for the second strategy (idea2), the minimum of (24) is very sensitive for the scaling of the box constraints. To overcome this problem we transform the search space to the unit hypercube. This algorithm improvement is necessary to avoid rank deficiency in the interpolation matrix for the train design problem.
In our implementation it is possible to restart the optimization with the final status of all parameters from the previous run.