TOMLAB Approximation of Derivatives



This page is part of the TOMLAB Manual. See TOMLAB Manual.

This section about derivatives is particularly important from a practical point of view: it is often either nearly impossible, or at least difficult, for the user to code the derivatives by hand.

For information about supplying partial derivatives, see the Prob parameter CheckNaN in TOMLAB Appendix A.

Options Summary

Observe that the usage depends on which solver is used. If a global solver such as glbFast is used, derivatives are not used at all and the following sections do not apply. Some solvers use only first order information, while others require second order information. See 'help iniSolve' and the TOMLAB interface for each solver (e.g. snoptTL) for information on the derivative level needed. If a solver requires second order information, it is highly recommended that at least the first order information is given analytically, or that the requested accuracy is changed.

Prob.NumDiff and Prob.ConsDiff options 11-15 require that the patterns (Prob.HessPattern, Prob.JacPattern and Prob.ConsPattern) are properly set. They should also be set for options 1-5, but are not required there. The patterns are most easily set by calling the corresponding user routines with two random sets of variables and combining the resulting sparse matrices. The functions estConsPattern, estJacPattern and estHessPattern automate this process with three safe trials.

If Prob.LargeScale is set to 1 and the respective user routines are not given, then estConsPattern, estJacPattern and estHessPattern are automatically executed before the solution process starts.
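
The estimation routines can also be called explicitly. A minimal sketch, assuming Prob is a fully defined problem structure and that each routine takes Prob as input and returns the corresponding sparse 0-1 pattern:

Prob.ConsPattern = estConsPattern(Prob); % constraint Jacobian pattern
Prob.JacPattern  = estJacPattern(Prob);  % residual Jacobian pattern
Prob.HessPattern = estHessPattern(Prob); % Hessian pattern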

If first order information is provided, the user should put a minus sign (-) in front of the chosen option. For example, if the objective function and gradient are given, Prob.NumDiff = -1 makes sure that only the Hessian is estimated by numerical differentiation.
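
For instance, a sketch of a setup where the gradient and constraint Jacobian are coded analytically but second order information is not (assuming the negative-sign convention applies to Prob.ConsDiff as well):

Prob.NumDiff  = -1; % gradient given analytically; estimate only the Hessian
Prob.ConsDiff = -1; % constraint Jacobian given; estimate only second order info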

The table The differentiation options in TOMLAB below describes the differentiation options available in TOMLAB.

The table Callback flags in TOMLAB below shows the flags that are included in the callbacks to the user routines for the objective, constraints and other functions. These flags should be used to optimize the performance of numerical differentiation. For example, if FDVar = [1 4], then only the constraints that depend on decision variables 1 and 4 need to be calculated.

The following applies to the different user supplied functions; a sketch of a constraint routine exploiting these flags follows the list:

User c: If Prob.rows is nonempty, only these constraints need to be computed; the others can be set to 0. If it is easier to determine which constraints to compute from variable dependencies, Prob.FDVar says which variables are changed, i.e. which variables numerical derivatives are computed for. Only constraints that depend on the variable indices given in FDVar need to be computed. If FDVar == 0, nothing is known and all constraints must be computed.

User dc: If Prob.rows is nonempty, only these rows of the constraint Jacobian need to be computed; the others can be set to 0. If Prob.cols is nonempty, only these columns of the constraint Jacobian need to be computed; the others can be set to 0. Only if Prob.cols is empty and Prob.FDVar(1) > 0 can Prob.FDVar be used to simplify the computations.

User f: If Prob.FDVar(1) > 0, only the variables in Prob.FDVar have changed since the last call with Prob.FDVar == 0. If the parts of the function that depend on the unchanged variables were saved at the last call with Prob.FDVar == 0, they may be reused to speed up the computations.

User g: If Prob.FDVar(1) > 0, the variables in Prob.FDVar are changed. They influence the gradient elements in Prob.cols (if nonempty), which means that g(Prob.cols) are the elements accessed to obtain the numerical Hessian.

User r: If Prob.rows is nonempty, only these residuals need to be computed; the others can be set to 0. If it is easier to determine which residuals to compute from variable dependencies, Prob.FDVar says which variables are changed, i.e. which variables numerical derivatives are computed for. Only residuals that depend on the variable indices given in FDVar need to be computed; the rest can be set to 0. If FDVar == 0, nothing is known and all residuals must be computed.
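
A minimal sketch of a user constraint routine that exploits these flags (the constraint expressions, their dependencies and the dimension three are hypothetical):

function cx = mycon(x, Prob)
% Hypothetical constraint routine exploiting Prob.rows; the dependency
% comments show how Prob.FDVar could be used in the same way.
cx   = zeros(3,1);
rows = Prob.rows;
if isempty(rows) || all(rows == 0), rows = 1:3; end % no info: compute all
if any(rows == 1), cx(1) = x(1)^2 + x(2); end       % depends on x(1), x(2)
if any(rows == 2), cx(2) = x(2)*x(3);     end       % depends on x(2), x(3)
if any(rows == 3), cx(3) = sum(x.^2);     end       % depends on all of x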

Table: The differentiation options in TOMLAB.

Prob.ADObj
   1  MAD is used for first order information (the gradient is automatically calculated with floating point precision). Some functions are not supported by MAD.
  -1  MAD is used for second order information (the Hessian), i.e. the gradient needs to be specified. Users should make sure that MAD variables are allocated properly in their files and that they are not overwritten.

Prob.ADCons
   1  MAD is used for first order information (the Jacobian of the constraints is automatically calculated with floating point precision). Some functions are not supported by MAD. Users should make sure that MAD variables are allocated properly in their files and that they are not overwritten.
  -1  MAD is used for second order information (d2c, the second part of the Hessian of the Lagrangian), i.e. the Jacobian needs to be specified. Users should make sure that MAD variables are allocated properly in their files and that they are not overwritten.

Prob.NumDiff
   1  fdng calculates the gradient. FDJac calculates the Jacobian of the residuals. If needed, the Hessian is estimated by FDHess. Default in TOMLAB.
  -1  Same as option 1, but the gradient is not estimated since it is given. A negative sign has the same functionality for all options.
  11  Same as option 1, but the function findpatt is called from iniSolve. Prob.HessIx and Prob.JacIx are set to improve the performance of the differentiation routines. These are only for internal use.
 -11  Same as option 11, but the gradient is not estimated since it is given.
   2  Standard Matlab splines (no additional toolboxes needed). Prob.optParam.CentralDiff is used by this routine. fdng2 calculates the gradient. FDJac2 calculates the Jacobian of the residuals. If needed, the Hessian is estimated by FDHess2.
  12  Same as option 2, but Prob.HessIx and Prob.JacIx are set for improved performance.
   3  csaps is used (Splines Toolbox needed). Prob.optParam.splineSmooth is used by this routine. fdng2 calculates the gradient. FDJac2 calculates the Jacobian of the residuals. If needed, the Hessian is estimated by FDHess2.
  13  Same as option 3, but Prob.HessIx and Prob.JacIx are set for improved performance.
   4  spaps is used (Splines Toolbox needed). Prob.optParam.splineTol is used by this routine. If Prob.optParam.splineTol < 0, csapi is used instead. fdng2 calculates the gradient. FDJac2 calculates the Jacobian of the residuals. If needed, the Hessian is estimated by FDHess2.
  14  Same as option 4, but Prob.HessIx and Prob.JacIx are set for improved performance.
   5  A routine using complex variables. fdng3 calculates the gradient. FDJac3 calculates the Jacobian of the residuals. This option is not available for solvers requiring second order information.
  15  Same as option 5, but Prob.JacIx is set for improved performance.
   6  The derivatives are estimated by the solver (only available for some solvers).

Prob.ConsDiff
   1  FDJac calculates the Jacobian of the constraints. If needed, the nonlinear constraint Hessian is estimated by FDcHess.
  11  Same as option 1, but the function findpatt is called from iniSolve. Prob.ConIx is set to improve the performance of the differentiation routines. This is only for internal use.
   2  Standard Matlab splines (no additional toolboxes needed). Prob.optParam.CentralDiff is used by this routine. FDJac2 calculates the Jacobian of the constraints. If needed, the nonlinear constraint Hessian is estimated by FDcHess.
  12  Same as option 2, but Prob.ConIx is set for improved performance.
   3  csaps is used (Splines Toolbox needed). Prob.optParam.splineSmooth is used by this routine. FDJac2 calculates the Jacobian of the constraints. If needed, the nonlinear constraint Hessian is estimated by FDcHess.
  13  Same as option 3, but Prob.ConIx is set for improved performance.
   4  spaps is used (Splines Toolbox needed). Prob.optParam.splineTol is used by this routine. If Prob.optParam.splineTol < 0, csapi is used instead. FDJac2 calculates the Jacobian of the constraints. If needed, the nonlinear constraint Hessian is estimated by FDcHess.
  14  Same as option 4, but Prob.ConIx is set for improved performance.
   5  A routine using complex variables. FDJac3 calculates the Jacobian of the constraints. This option is not available for solvers requiring second order information.
  15  Same as option 5, but Prob.ConIx is set for improved performance.
   6  The derivatives are estimated by the solver.

Table: Callback flags in TOMLAB.

Prob.FDVar  (0 or vector)  The variables being perturbed in the callback to a numerical differentiation routine. The user should make sure that unnecessary calculations are not made.
Prob.rows   (0 or vector)  The rows of the user computed vector/matrix that will be accessed and therefore need to be set. If empty, no information is available and all elements need to be set.
Prob.cols   (0 or vector)  The columns of the user computed matrix that will be accessed and therefore need to be set. If empty, no information is available and all elements need to be set.

Both numerical differentiation and automatic differentiation are possible. There are six ways to compute numerical differentiation. Furthermore, the SOL solvers MINOS, NPSOL, NLSSOL and SNOPT, as well as other solvers, include built-in numerical differentiation.

Numerical differentiation is automatically used for gradient, Jacobian, constraint Jacobian and Hessian if a user routine is not present.

Especially for large problems it is important to tell the system which values are nonzero when estimating the Jacobian, the constraint Jacobian, or the Hessian matrix. Define a sparse 0-1 matrix with ones for the nonzero elements, and set it in the Prob structure using the fields Prob.JacPattern, Prob.ConsPattern, Prob.HessPattern or Prob.d2LPattern. If there are many zeros in the matrix, this leads to significant savings in each iteration of the optimization. The TOMLAB estimation routines mentioned above may be used to set these fields.
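
For example, a hypothetical problem with ten variables where only the first three interact in the objective could declare its Hessian pattern as follows:

% Hypothetical 10-variable problem where only x(1)..x(3) interact
Prob.HessPattern = sparse([ones(3,3), zeros(3,7); zeros(7,10)]);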

The field Prob.FDVar is set dynamically and indicates which decision variables are being perturbed during a call to the user routines. For example, if FDVar = [1,3,5], the variables with indices 1, 3 and 5 are being used in an estimation. When, for example, the Jacobian is calculated by repeated calls to the constraint (or residual) routine, the user can make sure that unnecessary calculations are avoided.

FDVar = Prob.FDVar;
if any(FDVar == 0) || any(FDVar == 4)
    % Calculate some constraints
end
% The constraints above are only calculated if FDVar == 0,
% which means that all are required, or if FDVar contains 4, i.e.
% if decision variable number 4 is being perturbed.

To allow efficient estimation of the derivatives with the five different options, it is possible to precompute the indices needed for the constraint (Prob.ConIx), Jacobian (Prob.JacIx) and Hessian (Prob.HessIx) evaluations. The function findpatt is called automatically by iniSolve if a 1 is added in front of the number assigned to the differentiation method. The same functionality, illustrated below, applies to Prob.ConsDiff.

Prob.NumDiff = 11; % Use index information for NumDiff = 1
Prob.NumDiff = 12; % Use index information for NumDiff = 2
  ...
Prob.NumDiff = 15; % Use index information for NumDiff = 5

If a set of problems with identical indices for the differentiation routines is optimized in sequence, the following code can be used. This avoids regenerating the indices:

Prob.ConIx  = findpatt(Prob.ConsPattern); 
Prob.JacIx  = findpatt(Prob.JacPattern); 
Prob.HessIx = findpatt(Prob.HessPattern);

Prob.NumDiff = 1; % NumDiff = 11 used automatically

Forward or Backward Difference Approximations

Prob.NumDiff = 1 (11) or Prob.ConsDiff = 1 (11). Default in TOMLAB.

The default approach is the classical one with forward or backward differences, together with an optional automatic step size selection procedure; set Prob.GradTolg = -1 to run it. The differentiation is handled by the routine fdng, which is a direct implementation of the FD algorithm.

The fdng routine uses the parameter field DiffInt in the structure optParam (see TOMLAB Appendix A) as the numerical step size. The user can either change this field or set the field Prob.GradTolg. Prob.GradTolg may be either a scalar value or a vector of step sizes of the same length as the number of unknown parameters x. The advantage of a vector is that individual step sizes can be used when the variables have very different properties or very different scales. If Prob.GradTolg is set to a negative number, the fdng routine estimates a suitable step size for each of the unknown parameters. This is a costly operation in terms of function evaluations and is normally not needed for well-scaled problems.

Similar to fdng, there are two routines FDJac and FDHess. FDJac numerically estimates the Jacobian in nonlinear least squares problems or the constraint Jacobian in constrained nonlinear problems. FDJac checks whether the field Prob.GradTolJ is defined, with the same action as fdng. FDHess estimates the Hessian matrix in nonlinear problems and checks for the definition of the field Prob.GradTolH. Both routines use the field Prob.optParam.DiffInt as the default tolerance if the other field is empty. Note that FDHess has no automatic step size estimation. The implementations in fdng, FDJac and FDHess avoid taking steps outside the lower and upper bounds on the decision variables. This feature is important if the function is undefined outside the bounds.
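
The principle of a forward difference estimate that steps backward at an active upper bound can be sketched as follows (an illustration only, not the actual fdng implementation; f is assumed to be a function handle for the objective and x_U a vector of upper bounds):

n  = length(x);
g  = zeros(n,1);
fx = f(x);                  % objective value at the current point x
h  = Prob.optParam.DiffInt; % basic step size (or Prob.GradTolg if set)
for i = 1:n
    hi = h;
    if x(i) + hi > x_U(i), hi = -hi; end % backward step at an upper bound
    xp    = x;
    xp(i) = x(i) + hi;
    g(i)  = (f(xp) - fx)/hi; % forward (or backward) difference
end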

Splines

Standard Matlab splines are selected by setting Prob.NumDiff or Prob.ConsDiff to 2 (12). csaps is used if Prob.NumDiff or Prob.ConsDiff is set to 3 (13), spaps if set to 4 (14), and csapi if set to 4 (14) with Prob.optParam.splineTol < 0.

The first spline option can be used without the Spline Toolbox installed. If the Spline Toolbox is installed, gradient, Jacobian, constraint Jacobian and Hessian approximations can be computed in three different ways, depending on which of the three routines csaps, spaps or csapi the user chooses.

The routines fdng2, FDJac2 and FDHess2 implement the gradient estimation procedures for the different approximations needed. All routines use the tolerance in the field Prob.optParam.CentralDiff as the numerical step length. The basic principle is central differences, taking a small step in both the positive and the negative direction.
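
The central difference principle can be sketched as follows for one gradient component (an illustration only, not the actual fdng2 code; f, x, i and g are assumed given):

h  = Prob.optParam.CentralDiff; % step taken in both directions
xp = x; xp(i) = x(i) + h;       % positive perturbation of variable i
xm = x; xm(i) = x(i) - h;       % negative perturbation of variable i
g(i) = (f(xp) - f(xm))/(2*h);   % central difference estimate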

Complex Variables

Prob.NumDiff = 5 (15) or Prob.ConsDiff = 5 (15).

The fifth approximation method is a method by Squire and Trapp [72], which uses complex variables to estimate the derivative of real functions. The method is not particularly sensitive to the choice of step length, as long as it is very small. The routine fdng3 implements the complex variable method for the estimation of the gradient, and FDJac3 implements the similar procedure to estimate the Jacobian matrix or the constraint Jacobian matrix. The tolerance is hard coded as 1E-20. There are some pitfalls in using Matlab code for this strategy. In the paper by Martins et al., important hints are given about how to implement the functions in Matlab. They were essential in getting the predefined TOMLAB examples to work, and the user is advised to read this paper before writing new code that uses this differentiation strategy. However, the insensitivity to the numerical step size might make it worthwhile if traditional gradient estimation methods have accuracy problems.
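
For a real analytic function, a purely imaginary perturbation gives a derivative estimate without subtractive cancellation, which is why the tiny step is harmless. A sketch for one gradient component (not the actual fdng3 code; f must accept complex input, and x, i and g are assumed given):

h  = 1e-20;                  % the step size hard coded in TOMLAB
xp = x; xp(i) = x(i) + 1i*h; % purely imaginary perturbation of variable i
g(i) = imag(f(xp))/h;        % no subtraction, hence no cancellation errors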

Automatic Differentiation

Automatic differentiation is performed by use of the MAD toolbox. MAD is a TOMLAB toolbox which is documented in a separate manual, see http://tomopt.com.

MAD should be initialized by calling madinitglobals before running TOMLAB with automatic differentiation. Note that in order for TOMLAB to be fully compatible with MAD, the functions must be defined according to the MAD requirements and restrictions. Some of the predefined test problems in TOMLAB do not fulfill those requirements.

In the Graphical User Interface, the differentiation strategy is selected from the Differentiation Method menu, reachable in the General Parameters mode. If the Only 2ndD check-box is set, only unknown second derivatives are estimated. This is accomplished by changing the sign of Prob.NumDiff to negative, signaling that first order derivatives are estimated only if the gradient routine is empty, not otherwise. The method used to obtain derivatives for the constraint Jacobian is selected in the Constraint Jacobian diff. method menu in the General Parameters mode.

When running the menu program tomRemote/tomMenu, push/select the How to compute derivatives button in the Optimization Parameter Menu.

To choose the differentiation strategy when running the driver routines or when directly calling the actual solver, set Prob.ADObj equal to -1 or 1 for automatic differentiation, or Prob.NumDiff to 1 (11), 2 (12), 3 (13), 4 (14) or 5 (15) for numerical differentiation, before calling drivers or solvers. Note that Prob.NumDiff = 1 runs the fdng routine. Prob.NumDiff = 2, 3, 4 makes the fdng2 routine use standard Matlab splines or call the Spline Toolbox routines csaps, spaps, and csapi, respectively. The csaps routine needs a smoothness parameter and the spaps routine needs a tolerance parameter. Default values for these parameters are set in the structure optParam (see TOMLAB Appendix A), fields splineSmooth and splineTol. The user should be aware that there is no guarantee that the default values of splineSmooth and splineTol are the best for a particular problem; they work on the predefined examples in TOMLAB.

To use the built-in numerical differentiation in the SOL solvers MINOS, NPSOL, NLSSOL and SNOPT, set Prob.NumDiff = 6. Note that the SOL parameter DERIVATIVE LEVEL must be set properly to tell the SOL solvers which derivatives are known and which are not. The field DerLevel in Prob.optParam is used by TOMLAB to send this information to the solver. To select the method used to obtain derivatives for the constraint Jacobian, set the field Prob.ConsDiff to 1-6 (11-15; 16 is not valid), with the same meaning as for Prob.NumDiff described above. Prob.ADCons is the equivalent variable for automatic differentiation of the nonlinear constraints.
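
For example, a sketch of letting SNOPT estimate all derivatives internally (the DerLevel value 0, meaning that no derivatives are supplied, is an assumption based on the SOL DERIVATIVE LEVEL convention):

Prob.NumDiff  = 6;          % let the SOL solver difference the objective
Prob.ConsDiff = 6;          % let the SOL solver difference the constraints
Prob.optParam.DerLevel = 0; % assumed: DERIVATIVE LEVEL 0 = none supplied
Result = tomRun('snopt', Prob, 1);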

Here follow some examples of the use of approximate derivatives when solving problems with ucSolve and clsSolve. The examples are part of the TOMLAB distribution, in the file diffDemo in the directory examples. To use automatic differentiation, the MAD toolbox must be installed.

Automatic Differentiation example

madinitglobals;       % Initiate MAD variables
probFile = 'uc_prob'; % Name of Init File
P = 1;                % Problem number
Prob = probInit(probFile, P);
Prob.Solver.Alg = 2;  % Use the safeguarded standard BFGS
Prob.ADObj = 1;       % Use Automatic Differentiation
Result = tomRun('ucSolve', Prob, 2);

FD example

% Finite differentiation using the FD algorithm

probFile = 'uc_prob'; % Name of Init File
P = 1;                % Problem number
Prob = probInit(probFile, P);
Prob.Solver.Alg = 2;
Prob.NumDiff = 1;     % Use the fdng routine with the FD algorithm
Result = tomRun('ucSolve', Prob, 2);

% Change the tolerances used by algorithm FD
Prob.GradTolg = [1E-5; 1E-6]; % Change the predefined step sizes
Result = tomRun('ucSolve', Prob, 1);

% The change leads to worse accuracy.
% Instead, let an algorithm determine the best possible GradTolg.
% It needs some extra f(x) evaluations, but the result is much better.

Prob.GradTolg = -1; % A negative number demands that suitable step sizes
                    % are estimated by the algorithm at the initial point

% Avoid setting GradTolg empty; then Prob.optParam.DiffInt is used instead.
Result = tomRun('ucSolve', Prob, 1);

% Now the result is perfect, very close to the optimum == 0.
Prob.NumDiff = 5; % Use the complex variable technique

% The advantage is that it is insensitive to the choice of step length
Result = tomRun('ucSolve', Prob, 1);

% When it works, like in this case, it gives an absolutely perfect result.

Increasing the tolerances used as step sizes for the individual variables leads to a worse solution being found, but no fewer function evaluations before convergence. Using the automatic step size selection method gives very good results. The complex variable method gives absolutely perfect results; the optimum is found with very high accuracy.

The following example illustrates the use of spline functions to approximate derivatives.

Spline example

probFile        = 'ls_prob'; % Name of Init File
P               = 1;         % Problem number
Prob            = probInit(probFile, P);
Prob.Solver.Alg = 0;         % Use the default algorithm in clsSolve
Prob.NumDiff    = 2;         % Use Matlab splines
Result          = tomRun('clsSolve', Prob, 2);

FD example with patterns

% Finite differentiation using the FD algorithm
%
% This example illustrates how to set nonzeros in HessPattern
% to tell TOMLAB which elements to estimate in the Hessian.
% All elements in the Hessian with corresponding zeros in HessPattern are
% set to 0, and no numerical estimate is needed.
%
% This saves very much time for large problems.
% In a similar way, Prob.ConsPattern is set for the constraint gradient
% matrix for the nonlinear constraints, and Prob.JacPattern for the
% Jacobian matrix in nonlinear least squares problems.

probFile         = 'con_prob';
P                = 12;
Prob             = probInit(probFile, P);
Prob.Solver.Alg  = 1;
Prob.HessPattern = sparse([ones(6,6), zeros(6,6); zeros(6,12)]);

% Note that setting Prob.NumDiff = 1 would also estimate the gradient
% with numerical differences, which is not recommended:
% estimating two levels of derivatives is an ill-conditioned process.
% Setting Prob.NumDiff = -1, only the Hessian is estimated.

% Use qpSolve in the base module for QP subproblems
Prob.SolverQP = 'qpSolve';

Prob.NumDiff = -1; % Use the fdng routine with the FD algorithm
Result = tomRun('nlpSolve', Prob, 1);

% Setting Prob.NumDiff = -11 makes TOMLAB analyze HessPattern
% and use a faster algorithm for large-scale problems.
% In this small-scale case there is no advantage;
% it just takes more CPU time.

Prob.NumDiff = -11; % Use the fdng routine with the FD algorithm
Result       = tomRun('nlpSolve', Prob, 1);

% Run the same problem, estimating the Hessian with Matlab splines.
% This needs more gradient calculations, because it in principle uses
% smooth central differences to obtain the numerical Hessian.
% If the problem is noisy, then this method is recommended.

Prob.NumDiff = -2; % Matlab splines
Result       = tomRun('nlpSolve', Prob, 1);