MAD High-Level Interfaces to Matlab's ODE and Optimization Solvers
Latest revision as of 08:17, 1 December 2011
This page is part of the MAD Manual. See MAD Manual.
In many cases users wish to use AD to calculate the derivatives needed by ODE or optimisation solvers. When using the TOMLAB optimisation solvers, AD is easily enabled by setting the flags Prob.NumDiff and Prob.ConsDiff, as noted in Section 3. When solving stiff ODEs using Matlab's ODE solvers, or solving optimization problems using Matlab's Optimization Toolbox functions, the user can either:
- Directly use calls to Mad's functions to calculate required derivatives as detailed in Sections 4-5.
- Or use the facilities of Mad's high-level interfaces [FK04] to the ODE and optimization solvers as detailed in this section.
For a vector-valued function F(x), F: R^n → R^m, Mad currently provides three techniques for computing the Jacobian JF:
- Forward mode AD with full storage (see Section 4) - this is most likely to be efficient when n is small.
- Forward mode AD with sparse storage (see Section 5.3) - this is most likely to be efficient for larger n, particularly if the Jacobian is sparse. Even if the Jacobian is dense, provided n is sufficiently large this method is likely to outperform full storage because there is always sparsity present in the early stages of a Jacobian calculation. For small n this technique is likely to be slower than full storage because of the overhead of manipulating a sparse data structure.
- Forward mode AD with compressed storage (see Section 5.4) - this is likely to be most efficient for large n when the Jacobian is sparse and has a favourable sparsity pattern (e.g., diagonal, banded), in which case the number of directional derivatives required to compute the Jacobian may be reduced to p < n, with p dependent on the sparsity pattern and bounded below by the maximum number of nonzeros in a row of the Jacobian. Note that sparse storage always uses fewer floating-point operations than compressed storage - but provided p is sufficiently small, compression outperforms sparse storage. This method requires the sparsity pattern of the Jacobian (or a safe over-estimate of it) to be known. If the sparsity pattern is unknown but is known to be fixed, then it may be estimated using a Jacobian calculation with sparse storage and a randomly perturbed x (to reduce the chance of fortuitous cancellation giving a Jacobian entry of zero when in general it is non-zero). If the sparsity pattern changes between evaluations of F (perhaps because of branching), then either compression should not be used or the supplied Jacobian pattern must be an over-estimate covering any possible pattern.
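To make the compressed-storage idea concrete, the following sketch (plain illustrative Matlab, not MAD code) shows how a tridiagonal sparsity pattern lets a seed matrix with p = 3 columns replace n directional derivatives; the cyclic coloring used here is a simplifying assumption that happens to be valid for this pattern.

<source lang="matlab">
% Illustrative sketch: column compression for a tridiagonal Jacobian.
n = 6;
S = spdiags(ones(n,3),-1:1,n,n);    % known tridiagonal sparsity pattern
color = mod(0:n-1,3)+1;             % cyclic coloring: columns 3 apart share no row
V = full(sparse(1:n,color,1,n,3));  % seed matrix with p = 3 columns
% Forward mode propagated with derivative matrix V yields the compressed
% product J*V, from which the nonzeros of J are recovered via the pattern S.
</source>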
In general it is not possible to predict a priori which technique will be most efficient for a particular problem, since this depends on many factors: the sparsity pattern of JF; the relative sizes of n, m and, when compression is possible, p; the efficiency of sparse data access; and the performance of the computer being used. A rational approach is to try each technique (except defaulting to full storage for small n and rejecting full storage for large n), timing its evaluation and choosing the most efficient. To make this easy for the user, this approach has been implemented within the function MADJacInternalEval, and interfaces to MADJacInternalEval for the ODE and optimisation solvers have been provided. Consequently users need never access MADJacInternalEval directly and needn't concern themselves with the exact AD technique being used, confident that a reasonably efficient technique for their problem has been selected automatically. Of course, this selection process has an associated overhead, so the interface allows the user to mandate a technique if they so wish, perhaps based on experiments conducted previously.
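The try-and-time strategy that MADJacInternalEval automates can be sketched as follows; the Jacobian-evaluation handles here are hypothetical placeholders for illustration, not MAD's internals.

<source lang="matlab">
% Hedged sketch of timing-based technique selection (illustrative only).
techniques = {@jac_fwd_full,@jac_fwd_sparse,@jac_fwd_compressed}; % hypothetical
times = inf(1,numel(techniques));
for k = 1:numel(techniques)
    t0 = cputime;
    J = techniques{k}(x);          % evaluate the Jacobian with technique k
    times(k) = cputime - t0;
end
[~,best] = min(times);             % remember and reuse the fastest technique
</source>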
Stiff ODE Solution
When solving the ordinary differential equation (ODE)

dy/dt = F(t, y, p)

for t ∈ [t0, tf], where p denotes fixed parameters, using Matlab's stiff ODE solvers ode15s, ode23s, ode23t, and ode23tb, it is necessary to calculate, or at least approximate accurately, the Jacobian

JF = ∂F/∂y,

which the solvers use in the quasi-Newton solves within their implicit algorithms. By default such Jacobians are approximated by finite-differences. If the Jacobian is sparse and the sparsity pattern is supplied, then Jacobian compression is used [SR97] (c.f. Section 5.4). Using MADODEJacEval we may easily take advantage of AD.
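What MADODEJacEval automates can also be done by hand with fmad (Section 4); a minimal sketch for a two-variable ODE right-hand side such as the Van der Pol function used in the next example, assuming the fmad constructor and getinternalderivs extractor described earlier in the manual:

<source lang="matlab">
% Hedged sketch: one Jacobian evaluation via forward-mode fmad.
y  = [2; 0];  MU = 1000;  t = 0;
yAD = fmad(y,eye(length(y)));   % initialise derivatives to the identity
FAD = vdpode_f(t,yAD,MU);       % evaluate the ODE function on fmad objects
J   = getinternalderivs(FAD);   % extract the 2x2 Jacobian dF/dy
</source>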
Example MADEXvdpode: Jacobian driver for the Van der Pol ODE.
This example shows how to use MAD's high-level ODE Jacobian function MADODEJacEval to solve the stiff Van der Pol ODE problem.
See also: vdpode_f, vdpode_J, vdpode, ode15s, MADODERegister, MADODEJacEval, MADODEReport
Contents
- Set up the ODE problem
- Integrate using finite-differencing for the Jacobian
- Use the analytic Jacobian
- Use the MAD Jacobian function MADODEJacEval
- Repeat of MADODEJacEval use
- MADODEReport
- Conclusions
Set up the ODE problem
First we define:
<source lang="matlab">
MU = 1000; % default stiffness parameter for Van der Pol problem
tspan = [0; max(20,3*MU)]; % integration timespan of several periods
y0 = [2; 0]; % initial conditions
options=odeset('Stats','on'); % display ODE statistics
</source>
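The right-hand-side function vdpode_f is supplied with MAD's examples; for reference, a minimal version consistent with the standard Van der Pol formulation would look like the following sketch (the shipped version's handling of Vectorized mode is omitted):

<source lang="matlab">
function dydt = vdpode_f(t,y,MU)
% Van der Pol oscillator: y1' = y2, y2' = MU*(1-y1^2)*y2 - y1
dydt = [y(2); MU*(1-y(1).^2).*y(2)-y(1)];
</source>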
Integrate using finite-differencing for the Jacobian
Using the default finite-differencing a solution is easily found.
<source lang="matlab">
t1=cputime;
[t,y] = ode15s(@vdpode_f,tspan,y0,options,MU);
tfd=cputime-t1;
disp(['Jacobian by Finite-Differencing - CPU time = ',num2str(tfd)])
</source>
<pre>
591 successful steps
225 failed attempts
1883 function evaluations
45 partial derivatives
289 LU decompositions
1747 solutions of linear systems
Jacobian by Finite-Differencing - CPU time = 0.75108
</pre>
Use the analytic Jacobian
Since the Jacobian is recalculated many times supplying an analytic Jacobian should improve performance.
<source lang="matlab">
options = odeset(options,'Jacobian',@vdpode_J);
t1=cputime;
[t,y] = ode15s(@vdpode_f,tspan,y0,options,MU);
tjac=cputime-t1;
disp(['Jacobian by hand-coding - CPU time = ',num2str(tjac)])
</source>
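The analytic Jacobian function vdpode_J is likewise supplied with the examples; differentiating the right-hand side by hand gives the following sketch of what such a function contains:

<source lang="matlab">
function J = vdpode_J(t,y,MU)
% Analytic Jacobian dF/dy of the Van der Pol system
J = [ 0,                 1
     -2*MU*y(1)*y(2)-1,  MU*(1-y(1)^2) ];
</source>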
<pre>
591 successful steps
225 failed attempts
1749 function evaluations
45 partial derivatives
289 LU decompositions
1747 solutions of linear systems
Jacobian by hand-coding - CPU time = 0.75108
</pre>
Use the MAD Jacobian function MADODEJacEval
The automated MAD Jacobian calculation avoids the need for the user to supply the Jacobian. First we use odeset to specify that the Jacobian will be calculated by MADODEJacEval. We then use MADODEDelete to clear any existing settings for vdpode_f, then MADODERegister to register the function, and then integrate.
<source lang="matlab">
options = odeset(options,'Jacobian',@MADODEJacEval);
MADODEDelete; % deletes all previous settings for ODE evaluation
MADODERegister(@vdpode_f); % all we must do is supply the ODE function handle
t1=cputime;
[t,y] = ode15s(@vdpode_f,tspan,y0,options,MU);
tmadjac=cputime-t1;
disp(['Jacobian by MADODEJacEval - CPU time = ',num2str(tmadjac)])
</source>
<pre>
591 successful steps
225 failed attempts
1749 function evaluations
45 partial derivatives
289 LU decompositions
1747 solutions of linear systems
Jacobian by MADODEJacEval - CPU time = 0.88127
</pre>
Repeat of MADODEJacEval use
On a second use (without calling MADODEDelete) MADODEJacEval should be slightly faster since it "remembers"
which technique to use to calculate the Jacobian.
<source lang="matlab">
t1=cputime;
[t,y] = ode15s(@vdpode_f,tspan,y0,options,MU);
tmadjac2=cputime-t1;
disp(['Repeat of Jacobian by MADODEJacEval - CPU time = ',num2str(tmadjac2)])
</source>
<pre>
591 successful steps
225 failed attempts
1749 function evaluations
45 partial derivatives
289 LU decompositions
1747 solutions of linear systems
Repeat of Jacobian by MADODEJacEval - CPU time = 0.91131
</pre>
MADODEReport
Using MADODEReport we see which technique is used and why - here full storage is used due to the small size (2 × 2) of the Jacobian required.
<source lang="matlab">
MADODEReport % report on which technique used
</source>
<pre>
MADODEreport
Function name         vdpode_f
Jacobian of size      2x2
Fwd AD - full         0.010014 s
Fwd AD - compressed   Inf s
FWD AD - full selected
MADJacInternalEval: n< MADMinSparseN=10 so using forward mode full
</pre>
Conclusions
Using MAD for such small ODE problems gives similar performance to finite-differencing. MAD is more useful for large ODE problems, particularly those with sparse Jacobians in, for example the Brusselator problem of MADEXbrussode.
<source lang="matlab">
bar([tfd,tjac,tmadjac,tmadjac2])
colormap hsv
ylabel('CPU time (s)')
title('Solving the Van der Pol Problem with Different Jacobian Techniques')
text(1,tfd/2,'FD','Rotation',90)
text(2,tfd/2,'Analytic Jacobian','Rotation',90)
text(3,tfd/2,'MADJacODE','Rotation',90)
text(4,tfd/2,'Repeat of MADJacODE','Rotation',90)
</source>
Example MADEXbrussode: Jacobian driver for the Brusselator ODE.
This example shows how to use MAD's high-level ODE Jacobian function MADODEJacEval to solve the stiff Brusselator ODE problem.
See also: brussode_f, brussode_S, brussode, ode15s, MADODERegister, MADODEFuncEval, MADODEJacEval, MADODEReport
Contents
- Set up the ODE problem
- Integrate Brusselator problem using finite-differencing for the Jacobian
- Integrate Brusselator problem using compressed finite-differencing
- Use the MAD ODE Jacobian function - MADODEJacEval
- Specifying that the sparsity pattern is fixed
- Specifying the sparsity pattern
- Difference in solutions
- Specifying a Jacobian technique
- Conclusions
Set up the ODE problem
First we define:
<source lang="matlab">
N = 80;                 % number of mesh points
tspan = [0; 50];        % timespan of integration
t=tspan(1);             % initial time
y0 = [1+sin((2*pi/(N+1))*(1:N)); repmat(3,1,N)];
y0=reshape(y0,[2*N 1]); % initial conditions
S=brussode_S(N);        % find sparsity pattern
</source>
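For reference, a pentadiagonal sparsity pattern of this shape can be built with spdiags. The sketch below, using the hypothetical name my_bruss_sparsity, constructs a conservative superset of the true pattern; the shipped brussode_S zeroes some of these entries, but a superset remains a valid pattern for the solver.

<source lang="matlab">
% Sketch: superset of the 2N-by-2N Brusselator Jacobian sparsity pattern,
% taking all five diagonals -2,...,2 as potentially nonzero. The shipped
% brussode_S is sparser; a superset is still usable as 'JPattern'.
function S = my_bruss_sparsity(N)
  n = 2*N;                 % two species per mesh point
  B = ones(n,5);           % one column of spdiags input per diagonal
  S = spdiags(B,-2:2,n,n); % mark diagonals -2,...,2 as nonzero
end
</source>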
====Integrate Brusselator problem using finite-differencing for the Jacobian====
Initially we use finite-differences to approximate the Jacobian, ignoring any sparsity, timing and running the calculation.
<source lang="matlab">
options = odeset('Vectorized','on','Stats','on');
t1=cputime;
[tfd,yfd] = ode15s(@brussode_f,tspan,y0,options,N);
cpufd=cputime-t1;
disp(['FD-Jacobian : cputime = ',num2str(cpufd)])
</source>
<pre>
339 successful steps
54 failed attempts
2868 function evaluations
13 partial derivatives
110 LU decompositions
774 solutions of linear systems
FD-Jacobian: cputime = 2.0329
</pre>
====Integrate Brusselator problem using compressed finite-differencing====
Finite-differencing can take advantage of sparsity when approximating the Jacobian. We set options to specify the Jacobian sparsity pattern, print out statistics and turn on vectorization. We then time and run the calculation.
<source lang="matlab">
options = odeset('Vectorized','on','JPattern',S,'Stats','on');
t1=cputime;
[tfd,yfd] = ode15s(@brussode_f,tspan,y0,options,N);
cpucompfd=cputime-t1;
disp(['Compressed FD-Jacobian : cputime = ',num2str(cpucompfd)])
</source>
<pre>
339 successful steps
54 failed attempts
840 function evaluations
13 partial derivatives
110 LU decompositions
774 solutions of linear systems
Compressed FD-Jacobian : cputime = 0.91131
</pre>
====Use the MAD ODE Jacobian function - MADODEJacEval====
First we delete any existing settings for MADODE using MADODEDelete. Then we register the function brussode_f for MADODEJacEval using MADODERegister and use odeset to specify that MADODEJacEval will calculate the required Jacobian. Finally we time and run the calculation and use MADODEReport to provide information on the Jacobian technique used.
<source lang="matlab">
MADODEDelete                % deletes any internal settings
MADODERegister(@brussode_f) % supplies handle to ODE function
options=odeset(options,'Jacobian',@MADODEJacEval);
t1=cputime;
[tmadjac,ymadjac]=ode15s(@brussode_f,tspan,y0,options,N);
cpumadjac=cputime-t1;
disp(['MAD Jacobian : cputime = ',num2str(cpumadjac)])
MADODEReport
</source>
<pre>
339 successful steps
54 failed attempts
776 function evaluations
13 partial derivatives
110 LU decompositions
774 solutions of linear systems
MAD Jacobian : cputime = 1.1416
MADODEreport
Function name brussode_f
Jacobian of size 160x160
Fwd AD - full        Inf s
Fwd AD - sparse      0.020029 s
Fwd AD - compressed  Inf s
FWD AD - sparse selected
Sparsity not fixed so compression rejected. Large system reject full method.
</pre>
====Specifying that the sparsity pattern is fixed====
For this problem the Jacobian is large and sparse, but MAD does not "know" that the sparsity pattern is fixed and so can only use sparse forward mode and not compression. We first delete any internal settings, specify that the sparsity pattern is fixed, i.e. it does not change from one evaluation to the next due to branching, and then use the MAD Jacobian. The Jacobian sparsity pattern is calculated using fmad's sparse forward mode for a point randomly perturbed about y0.
<source lang="matlab">
MADODEDelete % delete any internal settings
MADODERegister(@brussode_f,'sparsity_fixed','true') % specify fixed sparsity
options=odeset(options,'Jacobian',@MADODEJacEval);  % Use MAD Jacobian
t1=cputime;
[tmadjac,ymadjac]=ode15s(@brussode_f,tspan,y0,options,N);
cpumadjac_fixed=cputime-t1;
disp(['MAD Jacobian (sparsity fixed): cputime = ',num2str(cpumadjac_fixed)])
MADODEReport
% Compressed forward mode should now have been used.
</source>
<pre>
339 successful steps
54 failed attempts
776 function evaluations
13 partial derivatives
110 LU decompositions
774 solutions of linear systems
MAD Jacobian (sparsity fixed): cputime = 1.1416
MADODEreport
Function name brussode_f
Jacobian of size 160x160
Fwd AD - full        Inf s
Fwd AD - sparse      0.020029 s
Fwd AD - compressed  0.020029 s
Percentage Sparsity = 2.4844%
Maximum non-zeros/row = 4 of 160
Maximum non-zeros/col = 4 of 160
Size of compressed seed = 160x4
Percentage compressed to full seed = 2.5%
FWD AD - compressed selected
Large system reject full method. Sparse is worse than compressed - discarding Sparse.
</pre>
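Compression works because columns of the Jacobian that share no nonzero rows can be evaluated together through a single column of a seed matrix. The following sketch illustrates the idea with a simple greedy column-grouping on a small tridiagonal pattern; MAD's internal grouping algorithm may differ.

<source lang="matlab">
% Sketch: group Jacobian columns so that no two columns in a group share
% a nonzero row; one directional derivative per group then recovers all
% entries. This is the idea behind compressed seeding, not MAD's exact code.
S = spdiags(ones(6,3),-1:1,6,6); % example 6x6 tridiagonal pattern
n = size(S,2);
group = zeros(1,n);              % group(j) = colour assigned to column j
for j = 1:n
  g = 1;
  % advance to the first group whose covered rows do not clash with column j
  while any(S(:,j) & any(S(:,group==g),2))
    g = g + 1;
  end
  group(j) = g;
end
seed = full(sparse(1:n,group,1)); % compressed seed matrix, n-by-(number of groups)
</source>

For a tridiagonal pattern three groups suffice, so the seed has three columns instead of n.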
====Specifying the sparsity pattern====
If available, the sparsity pattern can be supplied directly when registering the function with MADODERegister.
<source lang="matlab">
MADODEDelete % delete any internal settings
MADODERegister(@brussode_f,'sparsity_pattern',S)
options=odeset(options,'Jacobian',@MADODEJacEval);
t1=cputime;
[tmadjac,ymadjac]=ode15s(@brussode_f,tspan,y0,options,N);
cpumadjac_S=cputime-t1;
disp(['MAD Jacobian (sparsity given) : cputime = ',num2str(cpumadjac_S)])
MADODEReport
</source>
<pre>
339 successful steps
54 failed attempts
776 function evaluations
13 partial derivatives
110 LU decompositions
774 solutions of linear systems
MAD Jacobian (sparsity given) : cputime = 1.1216
MADODEreport
Function name brussode_f
Jacobian of size 160x160
Fwd AD - full        Inf s
Fwd AD - sparse      0.030043 s
Fwd AD - compressed  0.010014 s
Percentage Sparsity = 2.4844%
Maximum non-zeros/row = 4 of 160
Maximum non-zeros/col = 4 of 160
Size of compressed seed = 160x4
Percentage compressed to full seed = 2.5%
FWD AD - compressed selected
Large system reject full method. Sparse is worse than compressed - discarding Sparse.
</pre>
====Difference in solutions====
Of course, there should be no discernible difference between the finite-difference and AD-enabled solutions.
<source lang="matlab">
plot(tfd,yfd(:,1),'-',tmadjac,ymadjac(:,1),'o')
xlabel('t')
ylabel('y')
legend('FD Jacobian','MAD Jacobian')
title('Comparison of Brusselator solutions using FD and AD for Jacobians')
</source>
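The agreement can also be checked numerically, for instance by interpolating the MAD solution onto the finite-difference time grid. This is a sketch using the solution arrays computed above:

<source lang="matlab">
% Sketch: interpolate the MAD-Jacobian solution onto the FD time grid and
% report the largest pointwise difference in the first component.
yi = interp1(tmadjac,ymadjac(:,1),tfd);
fprintf('Max difference in y(1): %g\n',max(abs(yi-yfd(:,1))));
</source>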
====Specifying a Jacobian technique====
We can specify from the outset that compression is to be used.
<source lang="matlab">
MADODEDelete % delete any internal settings
MADODERegister(@brussode_f,'sparsity_pattern',S,'ad_technique','ad_fwd_compressed')
options=odeset(options,'Jacobian',@MADODEJacEval);
t1=cputime;
[tmadjac,ymadjac]=ode15s(@brussode_f,tspan,y0,options,N);
cpumadjac_comp=cputime-t1;
disp(['MAD Jacobian (compressed specified) : cputime = ',num2str(cpumadjac_comp)])
MADODEReport
</source>
<pre>
MADJacInternalRegister: AD technique selected is ad_fwd_compressed
339 successful steps
54 failed attempts
776 function evaluations
13 partial derivatives
110 LU decompositions
774 solutions of linear systems
MAD Jacobian (compressed specified) : cputime = 1.1216
MADODEreport
Function name brussode_f
Jacobian of size 160x160
Fwd AD - compressed  0.010014 s
Percentage Sparsity = 2.4844%
Maximum non-zeros/row = 4 of 160
Maximum non-zeros/col = 4 of 160
Size of compressed seed = 160x4
Percentage compressed to full seed = 2.5%
FWD AD - compressed selected
AD_FWD_COMPRESSED specified by call to MADJacInternalRegister
</pre>
====Conclusions====
For this problem compressed finite-differencing is most efficient. If the sparsity pattern is not known, then sparse forward mode AD outperforms finite-differencing. On supplying the sparsity pattern, the performance of compressed forward mode AD approaches that of compressed finite-differencing. Simply using the MADODEJacEval interface yields acceptable performance.
<source lang="matlab">
bar([cpufd,cpucompfd,cpumadjac,cpumadjac_fixed,cpumadjac_S,cpumadjac_comp])
colormap hsv
ylabel('CPU time (s)')
title(['Solving the Brusselator Problem with Different Jacobian Techniques (N= ',num2str(N),')'])
text(1,cpucompfd/2,'FD','Rotation',90)
text(2,cpucompfd/2,'Compressed FD (sparsity supplied)','Rotation',90)
text(3,cpucompfd/2,'MAD','Rotation',90)
text(4,cpucompfd/2,'MAD(sparsity fixed)','Rotation',90)
text(5,cpucompfd/2,'MAD(sparsity supplied)','Rotation',90)
text(6,cpucompfd/2,'MAD(compression specified)','Rotation',90)
</source>
MADEXpolluode provides a third example of these techniques.
===Unconstrained Optimization===
High-level interfaces to MAD's functionality for Matlab's optimization solvers have also been developed. In this section we cover unconstrained optimisation; in Section 6.3 we cover the constrained case.
===Example MADEXbanana: Unconstrained Optimization Using High-Level Interfaces===
This example illustrates how the gradient for an unconstrained optimization problem, the banana problem of bandem, may be calculated using the high-level interface function MADOptObjEval and used within fminunc.
See also MADOptObjEval, MADOptObjDelete, MADOptObjRegister, MADOptObjReport, fminunc, banana_f, banana_fg, bandem
====Default BFGS optimization without gradient====
If the gradient is not supplied then fminunc will approximate it using finite-differences.
<source lang="matlab">
format compact
x=[0 1];
options=optimset('LargeScale','off');
t1=cputime;
[xsol,fval,exitflag,output] = fminunc(@banana_f,x,options);
xsol
disp(['Default with no gradient took ',num2str(output.iterations),...
    ' iterations and CPU time = ',num2str(cputime-t1)])
</source>
<pre>
Optimization terminated: relative infinity-norm of gradient less than options.TolFun.
xsol =
    1.0000    1.0000
Default with no gradient took 20 iterations and CPU time = 0.030043
</pre>
====Use of exact gradient====
The function is a simple polynomial and may easily be differentiated by hand to give the analytic gradient in banana_fg.
<source lang="matlab">
options=optimset(options,'GradObj','on');
t1=cputime;
[xsol,fval,exitflag,output] = fminunc(@banana_fg,x,options);
xsol
disp(['Default with analytic gradient took ',num2str(output.iterations),...
    ' iterations and CPU time = ',num2str(cputime-t1)])
</source>
<pre>
Optimization terminated: relative infinity-norm of gradient less than options.TolFun.
xsol =
    1.0000    1.0000
Default with analytic gradient took 20 iterations and CPU time = 0.020029
</pre>
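For reference, the banana (Rosenbrock) function and its hand-derived gradient take roughly the following form; this is a sketch under the hypothetical name my_banana_fg, and the shipped banana_f/banana_fg may differ in interface or detail.

<source lang="matlab">
% Sketch of the Rosenbrock "banana" function and its analytic gradient.
% Hypothetical name; the supplied banana_fg may differ in detail.
function [f,g] = my_banana_fg(x)
  f = 100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
  if nargout > 1 % gradient requested
    g = [-400*x(1)*(x(2)-x(1)^2) - 2*(1-x(1));
          200*(x(2)-x(1)^2)];
  end
end
</source>

The gradient vanishes at the minimiser [1 1], consistent with the solutions found above.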
====Use of MAD's high-level interface MADOptObjEval====
Using MADOptObjEval is easy: we first delete any previous settings, register the objective function banana_f, and then use MADOptObjEval as the objective function, which will automatically calculate the gradient when needed.
<source lang="matlab">
MADOptObjDelete
MADOptObjRegister(@banana_f)
options=optimset(options,'GradObj','on');
t1=cputime;
[xsol,fval,exitflag,output] = fminunc(@MADOptObjEval,x,options);
xsol
disp(['Jacobian by MADFandGradObjOpt took ',num2str(output.iterations),...
    ' iterations and CPU time = ',num2str(cputime-t1)])
</source>
<pre>
Optimization terminated: relative infinity-norm of gradient less than options.TolFun.
xsol =
    1.0000    1.0000
Jacobian by MADFandGradObjOpt took 20 iterations and CPU time = 0.09013
</pre>
====Information on the AD techniques used====
The function MADOptObjReport displays information on the techniques used. In this case since there are only 2 independent variables forward mode with full storage is used.
<source lang="matlab">
MADOptObjReport
</source>
<pre>
MADOptObjReportODE
Function name banana_f
Jacobian of size 1x2
Fwd AD - full        0.010014 s
Fwd AD - compressed  Inf s
FWD AD - full selected
MADJacInternalEval: n< MADMinSparseN=10 so using forward mode full
</pre>
===Constrained Optimization===
If we have a constrained optimization problem then we use MADOptObjEval for the objective, as in Example 6.3, and MADOptConstrEval for the constraint as Example 6.4 demonstrates.
===Example MADEXconfun: Constrained Optimization Using High-Level Interfaces===
This example shows how to use MAD's high-level interface functions to obtain both the objective function gradient and the constraint function Jacobian in a constrained optimisation problem solved using MATLAB's fmincon function.
See also MADEXbanana, MADOptConstrRegister, MADOptConstrEval, MADOptConstrReport, MADOptConstrDelete, MADOptObjRegister, MADOptObjEval, MADOptObjReport, MADOptObjDelete
====Default optimization without gradients====
Here we simply use the medium-scale algorithm with gradients evaluated by default finite-differencing.
<source lang="matlab">
options = optimset('LargeScale','off');
x0 = [-1 1];
t1=cputime;
[x,fval,exitflag,output] = fmincon(@objfun,x0,[],[],[],[],[],[],@confun,options);
t2=cputime-t1;
disp(['Default fmincon took ',num2str(output.iterations),...
    ' iterations and cpu= ',num2str(t2)])
</source>
<pre>
Optimization terminated: first-order optimality measure less than options.TolFun
 and maximum constraint violation is less than options.TolCon.
Active inequalities (to within options.TolCon = 1e-006):
  lower   upper   ineqlin   ineqnonlin
                                1
                                2
Default fmincon took 8 iterations and cpu= 0.040058
</pre>
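For reference, objfun and confun here are the standard Optimization Toolbox constrained example. The sketch below, under the hypothetical names my_objfun and my_confun, shows roughly what they compute; the shipped versions may differ in detail.

<source lang="matlab">
% Sketch of the standard fmincon demo problem (hypothetical names;
% the shipped objfun/confun may differ in detail).
function f = my_objfun(x)
  f = exp(x(1))*(4*x(1)^2 + 2*x(2)^2 + 4*x(1)*x(2) + 2*x(2) + 1);
end

function [c,ceq] = my_confun(x)
  c = [1.5 + x(1)*x(2) - x(1) - x(2);  % inequality constraints c(x) <= 0
       -x(1)*x(2) - 10];
  ceq = [];                            % no equality constraints
end
</source>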
====Using analytic gradients====
The MathWorks supplies functions objfungrad and confungrad, which return the objective's gradient and the constraint Jacobian when needed; these can be used for analytic evaluation of the required derivatives.
<source lang="matlab">
options = optimset(options,'GradObj','on','GradConstr','on');
t1=cputime;
[x,fval,exitflag,output] = fmincon('objfungrad',x0,[],[],[],[],[],[],...
    'confungrad',options);
t2=cputime-t1;
disp(['Hand-coded derivs in fmincon took ',num2str(output.iterations),...
    ' iterations and cpu= ',num2str(t2)])
</source>
<pre>
Optimization terminated: first-order optimality measure less than options.TolFun
 and maximum constraint violation is less than options.TolCon.
Active inequalities (to within options.TolCon = 1e-006):
  lower   upper   ineqlin   ineqnonlin
                                1
                                2
Hand-coded derivs in fmincon took 8 iterations and cpu= 0.040058
</pre>
====Using MAD's high-level interfaces====
Calculating the gradients and Jacobians using MAD's high-level interfaces is easy: we first use MADOptObjDelete and MADOptConstrDelete to delete any old settings, then register the objective and constraint functions using MADOptObjRegister and MADOptConstrRegister, and then optimize using MADOptObjEval and MADOptConstrEval to supply the objective and constraints and, when needed, their derivatives.
<source lang="matlab">
MADOptObjDelete
MADOptConstrDelete
MADOptObjRegister(@objfun)
MADOptConstrRegister(@confun)
t1=cputime;
[x,fval,exitflag,output] = fmincon(@MADOptObjEval,x0,[],[],[],[],[],[],...
    @MADOptConstrEval,options);
t2=cputime-t1;
disp(['MAD driver in fmincon took ',num2str(output.iterations),...
    ' iterations and cpu= ',num2str(t2)])
</source>
<pre>
Optimization terminated: first-order optimality measure less than options.TolFun
 and maximum constraint violation is less than options.TolCon.
Active inequalities (to within options.TolCon = 1e-006):
  lower   upper   ineqlin   ineqnonlin
                                1
                                2
MAD driver in fmincon took 8 iterations and cpu= 0.10014
</pre>
====Seeing which AD techniques are used====
The functions MADOptObjReport and MADOptConstrReport report back which techniques have been used and why. Here full mode is used since the number of derivatives is small.
<source lang="matlab">
MADOptObjReport
MADOptConstrReport
</source>
<pre>
MADOptObjReportODE
Function name objfun
Jacobian of size 1x2
Fwd AD - compressed  Inf s
FWD AD - full selected
MADJacInternalEval: n< MADMinSparseN=10 so using forward mode full
MADOptConstrReport
Function name confun
Jacobian of size 2x2
Fwd AD - full        0.010014 s
Fwd AD - compressed  Inf s
FWD AD - full selected
MADJacInternalEval: n< MADMinSparseN=10 so using forward mode full
</pre>