MAD Basic Use of Forward Mode for First Derivatives

From TomWiki
Jump to navigationJump to search

Notice.png

This page is part of the MAD Manual. See MAD Manual.

Differentiating Matlab Expressions

Using forward mode AD, Mad can easily compute first derivatives of functions defined via expressions built-up of the arithmetic operations and intrinsic functions of Matlab. For the moment let us simply consider the trivial operation y = f (x) = x2 for which the derivative is 2x. Let us determine this derivative at x = 3 for which f (3) = 9 and df (3)/dx = 6. We will make use of the fmad constructor function fmad, and the functions getvalue and getderivs for extracting values and derivatives from fmad objects.

Example MADEXForward1: Forward Mode Example

This example demonstrates how to differentiate simple arithmetic expressions in Matlab. See also MADEXForward2

Contents

  • Differentiating
  • Using fmad
  • Calculating y
  • Extracting the value
  • Extracting the derivatives

Differentiating

MAD can easily compute first derivatives of functions defined via expressions built-up of the arithmetic operations and intrinsic functions of Matlab using forward mode AD. For the moment let us simply consider the trivial operation

for which the derivative is

Let us determine this derivative at x=3 for which f(3)=9 and df(3)/dx=6.

Using fmad

We first define x to be of fmad class and have value 3 and derivative 1 (since the derivative with respect to the independent variable is one, dx/dx=1). The command and Matlab output is as below.

format compact % so there are no blank lines in the output
x=fmad(3,1)    % creates the fmad object with value 3 and derivative 1
value =
     3
derivatives =
     1

Calculating y

The value of the object x is 3 and its derivatives correspond to an object of size [1 1] (a scalar) with one derivative of value 1. We now calculate y exactly as we would normally, allowing the fmad function libraries to take care of the arithmetic associated with values and derivatives.

y=x^2

value =
     9
derivatives =
     6

Extracting the value

We extract the value of y using the MAD function getvalue.

yvalue=getvalue(y)

yvalue =
     9

Extracting the derivatives

Similarly we extract the derivatives using getderivs.

yderivs=getderivs(y)

yderivs =
     6

Vector-Valued Functions

Let us now consider a simple, linear vector-valued function for which we may use fmad to calculate Jacobian-vector products or the function's Jacobian. We also see, for the first time fmad's getinternalderivs function.

Example MADEXForward2: Forward Mode Example

This example demonstrates how to calculate Jacobian-vector products and how to calculate the Jacobian for a simple vector-valued function.

See also MADEXForward1

Contents

  • Differentiating y = f (x) = A * x
  • Directional derivative in [1;0;0] direction
  • Directional derivative in [0;1;0] direction
  • Calculating the Jacobian
  • But dy is a 3-D array not a matrix!

Differentiating

MAD easily deals with vector-valued functions. Let us consider the function

at x=[1; 1 ;1 ] where A is the matrix,

A=[ 1  4  7
    2  5  8
    3  6  9]

for which the Jacobian is the matrix A and directional derivatives in the 3 respective coordinate directions are the 3 respective columns of A.

format  compact;
A=[1 4 7;2 5 8; 3 6 9];
xval=[1;1;1];

Directional derivative in [1;0;0] direction

Using the fmad constructor function we initialise x with its directional derivative, calculate y and extract the derivative. We find the correct directional derivative [1;2;3].

x=fmad(xval,[1;0;0]) 
y=A*x; 
dy=getderivs(y)
fmad object x
value =
     1
     1
     1
derivatives =
     1
     0
     0
dy =
     1
     2
     3

Directional derivative in [0;1;0] direction

Similarly in the [0;1;0] direction we get [4;5;6].

x=fmad(xval,[0;1;0]);
y=A*x;
dy=getderivs(y)
dy =
     4
     5
     6

Calculating the Jacobian

To obtain the full Jacobian we initialise the derivative with the 3 x 3 identity matrix.

x=fmad(xval,eye(3))
y=A*x
dy=getderivs(y)
fmad object x
value =
     1
     1
     1
derivvec object derivatives
Size = 3  1
No. of derivs = 3
derivs(:,:,1) =
     1
     0
     0
derivs(:,:,2) =
     0
     1
     0
derivs(:,:,3) =
     0
     0
     1
fmad object y
value =
    12
    15
    18
derivvec object derivatives
Size = 3  1
No. of derivs = 3
derivs(:,:,1) =
     1
     2
     3
derivs(:,:,2) =
     4
     5
     6
derivs(:,:,3) =
     7
     8
     9
dy(:,:,1) =
     1
     2
     3
dy(:,:,2) =
     4
     5
     6
dy(:,:,3) =
     7
     8
     9

But dy is a 3-D array not a matrix!

Notice now that we have a 3-dimensional array returned for a result we expect to be a matrix. This is easily remedied using the Matlab intrinsic squeeze, which squeezes out singleton dimensions, or more conveniently using getinternalderivs to extract the derivatives from the fmad object y.

dy=squeeze(dy)
dy=getinternalderivs(y)
dy =
     1     4     7
     2     5     8
     3     6     9
dy =
     1     4     7
     2     5     8
     3     6     9

This example brings us to an important topic, how are the derivatives stored in Mad?

Storage of Derivatives in Mad

Mad stores derivatives in two ways depending upon whether a variable of fmad class possesses single,

or multiple directional derivatives.

Single Directional Derivatives

For a variable of fmad class with a single directional derivatives the derivatives are stored as an array of class double with the same shape as the variable's value. The getderivs function will therefore return this array. We have seen this behaviour in Example 4.1 and Example 4.2.

Notice.png

This page is part of the MAD Manual. See MAD Manual.

Multiple Directional Derivatives

For a variable of fmad class with multiple directional derivatives, the derivatives are stored in an object of class derivvec. Mad has been written so that multiple directional derivatives may be thought of as an array of one-dimension higher than that of the corresponding value using, what we term, the derivative's external representation. Mad also has an internal representation of derivatives that may often be of use, particularly when interfacing Mad with another Matlab package. Accessing the internal representation is covered in MAD Advanced Use of Forward Mode for First Derivatives#Accessing The Internal Representation of Derivatives. Consider an N-dimensional Matlab array A with size(A)=[n1 n2 n3 ... nN], such that (n1 n2 ... nN) are positive integers. Then the external representation of A's derivatives, which we shall call dA is designed to have size (dA)=[n1 n2 n3 ... nN nD], where nD> 1 is the number of directional derivatives. An example or two may make things clearer.

Example MADEXExtRep1: External representation of a matrix

This example illustrates MAD's external representation of derivatives for a matrix. See also MADEXExtRep2, MADEXIntRep1

Contents
  • Define a 2 x 2 matrix A.
  • Multiple directional derivatives for A
  • The fmad object
  • The derivs component
Define a 2 x 2 matrix A.
format compact
A=[1 2;  3 4]
A  =
1	2

3	4
Multiple directional derivatives for A

Define a 2 x 2 x 3 matrix (note the transpose) to provide 3 directional derivatives for dA.

dA=reshape([1  2 3 4;5  6 7 8;9  10 11 12]',[2 2 3])
dA(:,:,1) =
     1     3
     2     4
dA(:,:,2) =
     5     7
     6     8
dA(:,:,3) =
     9    11
    10    12
The fmad object

We create the fmad object,

A=fmad(A,dA)
value =
     1     2
     3     4
Derivatives
Size = 2  2
No. of derivs = 3
derivs(:,:,1) =
     1     3
     2     4
derivs(:,:,2) =
     5     7
     6     8
derivs(:,:,3) =
     9    11
    10    12
The derivs component

The last index of the A's derivs component refers to the directional derivatives and we see that, as expected, we have a 2 x 2 x 3 array which makes up 3 sets of 2 x 2 directional derivative matrices.

Note that fmad (like Matlab and Fortran) assumes a column major (or first index fastest) ordering for array storage. The derivatives supplied to the constructor function fmad are therefore assumed to be in column major order and the external representation reshapes them appropriately. We therefore needn't bother reshaping the derivatives before passing them to fmad as the example below shows.

Example MADEXExtRep2: External Representation Example

This example shows the use of column-major ordering for an input derivative. See also MADEXExtRep1, MADEXIntRep1

Contents
  • Define a 2 x 2 matrix A.
  • Multiple directional derivatives for A
  • The fmad object
  • The derivs component
  • Define a 2 x 2 matrix A.
format  compact
A=[1 2; 3 4]
A =
     1     2
     3     4
Multiple directional derivatives for A

Define a vector to provide directional derivatives for dA.

dA=[1 2 3 4 5 6 7 8 9 10 11 12];
The fmad object

We create the fmad object,

A=fmad(A,dA)
value =
     1     2
     3     4
Derivatives
Size = 2  2
No. of derivs = 3
derivs(:,:,1) =
     1     3
     2     4
derivs(:,:,2) =
     5     7
     6     8
derivs(:,:,3) =
     9    11
    10    12
The derivs component

Notice how the elements of dA are taken in column major order to provide the 3 directional derivatives of A in turn.

Use of Intrinsics (Operators and Functions)

We have produced overloaded fmad functions for most commonly used Matlab operators and functions.

For example addition of two fmad objects is determined by the function @fmad\plus, subtraction @fmad\minus, etc. To see which overloaded intrinsic operators and functions are provided type help fmad.Contents at the Matlab prompt. The help information for individual functions describes the function and any additional information such as restrictions, e.g. help fmad.plus

Additional Functions

We have also coded a small number of additional functions which are not part of Matlab but application specific. For example, for a matrix A, logdet(A) calculates log(abs(det(A))) a function used in statistical parameter estimation. This composition of functions may be differentiated very efficiently by considering it in this composite form [Kub94, For01]. Note that a version of the function valid for arguments of classes double and sparse is provided in the MAD/madutil.

Differentiating User Defined Functions

In section 4.1 we saw how simple Matlab expressions could be differentiated using Mad. It is usually straight-forward to apply the same methodology to differentiating user defined functions.

Example 4.5 (Brusselator ODE Function) Consider the function which defines the Brusselator stiff ODE problem from the Matlab ODE toolbox, supplied as brussode_f.m and shown in #Comparing the AD and FD Jacobian. For a given scalar time t (unused) and vector y of 2N rows, this function brussode f will supply the right-hand-side function for an ODE of the form, dy = F(t, y). dt

The Jacobian of the function is required to solve the ODE efficiently via an implicit stiff ODE solver (e.g., ode15s in Matlab). The function has actually been written in so-called vectorised form, i.e. if y is a matrix then the dydt supplied will also be a matrix with each column of dydt corresponding to the appropriate column of y. This approach allows for the efficient approximation of the function's Jacobian via finite-differences. It is trivial to calculate the Jacobian of the function of #Comparing the AD and FD Jacobian using Mad. The commands to do so are given in the script MADEXbrussodeJac.m. The script also includes a simple one-sided finite-difference (FD) calculation of the Jacobian.

MADEXbrussodeJac: Full Jacobian of Brusselator Problem

This example shows how to use the fmad class to calculate the Jacobian of the Brusselator problem and compare the result with those from finite-differencing. More efficient AD approaches are available as shown in MADEXbrussodeJacSparse, MADEXbrussodeJacComp.

See also: brussode f, brussode S, MADEXbrussodeJacSparse, MADEXbrussodeJacComp, brussode

Contents

  • Initialise variables
  • Use fmad to calculate the Jacobian
  • Using finite-differencing
  • Comparing the AD and FD Jacobian

Initialise variables

We define the input variables.

N=3; 
t=0; 
y0=ones(2*N,1);

Use fmad to calculate the Jacobian

We initialise y with the value y0 and derivatives given by the identity matrix of size equal to the length of y0. We may then evaluate the function and extract the value and Jacobian.

y=fmad(y0,eye(length(y0)));
F=brussode_f(t,y,N);
F_fmad=getvalue(F); % grab value
JF_fmad=getinternalderivs(F) % grab Jacobian
JF_fmad =
   -2.6400    1.0000    0.3200         0         0         0
    1.0000   -1.6400         0    0.3200         0         0
    0.3200         0   -2.6400    1.0000    0.3200         0
         0    0.3200    1.0000   -1.6400         0    0.3200
         0         0    0.3200         0   -2.6400    1.0000
         0         0         0    0.3200    1.0000   -1.6400

Using finite-differencing

We make 2N extra copies of y, add a perturbation to the copies, perform all function evaluations taking advantage of the vectorization of the function brussode f, and approximate the derivatives using 1-sided finite-differences.

y=repmat(y0,[1 2*N+1]);
y(:,2:end)=y(:,2:end)+sqrt(eps)*eye(2*N);
F=brussode_f(t,y,N);
F_fd=F(:,1);
JF_fd=(F(:,2:end)-repmat(F_fd,[1 2*N]))./sqrt(eps)
JF_fd =
   -2.6400    1.0000    0.3200         0         0         0
    1.0000   -1.6400         0    0.3200         0         0
    0.3200         0   -2.6400    1.0000    0.3200         0
         0    0.3200    1.0000   -1.6400         0    0.3200
         0         0    0.3200         0   -2.6400    1.0000
         0         0         0    0.3200    1.0000   -1.6400

Comparing the AD and FD Jacobian

The function values computed are, of course, identical. The Jacobians disagree by about 1e-8, this is due to the truncation error of the finite-difference approximation.

disp(['Function values norm(F_fd-F_fmad) = ',num2str(norm(F_fd-F_fmad,inf))])
disp(['Function Jacobian  norm(JF_fd-JF_fmad)=  ',num2str(norm(JF_fd-JF_fmad,inf))])

Function values  norm(F_fd-F_fmad)     = 0
Function Jacobian  norm(JF_fd-JF_fmad) =  2.861e-008

function dydt=brussode_f(t,y,N)
% brussode_f: defines ODE  for the  Brusselator  problem.
%	Taken from  matlab  6.5  brussode.
%
%   See also  MADEXbrussode,  brussode_S,  brussode

%   AUTHOR:  S.A.Forth
%   DATE: 30/5/06
%   Copyright 2006:  S.A.Forth, Cranfield University
%
%   REVISIONS:
%   DATE	WHO	WHAT
c     = 0.02 * (N+1)^2;
dydt  = zeros(2*N,size(y,2)); % preallocate dy/dt

%   Evaluate  the  2 components of  the  function at  one edge of  the  grid
%   (with edge conditions). i = 1;
dydt(i,:)   = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + c*(1-2*y(i,:)+y(i+2,:));
dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + c*(3-2*y(i+1,:)+y(i+3,:));

%   Evaluate  the  2 components of  the  function at  all interior grid points. i = 3:2:2*N-3;
dydt(i,:)   = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + ... c*(y(i-2,:)-2*y(i,:)+y(i+2,:));
dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + ... c*(y(i-1,:)-2*y(i+1,:)+y(i+3,:));

%   Evaluate  the  2 components of  the  function at  the  other   edge of  the  grid
%   (with edge conditions). i = 2*N-1;
dydt(i,:)   = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + c*(y(i-2,:)-2*y(i,:)+1);
dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + c*(y(i-1,:)-2*y(i+1,:)+3);

We should note that the above approach is not efficient either for FD or AD for large N. The Jacobian is sparse and exploiting the sparsity greatly increases the efficiency of both approaches. Exploiting sparsity will be covered in MAD Advanced Use of Forward Mode for First Derivatives#Dynamic Propagation of Sparse Derivatives and MAD Advanced Use of Forward Mode for First Derivatives#Sparse Jacobians via Compression.

Possible Problems Using the Forward Mode

In many cases fmad will be able to differentiate fairly involved functions as shown in Figure 1. However some problems in terms of robustness and efficiency may occur.

1. Assigning an fmad object to a variable of class double. For example,

>> x=fmad(1,1);
>> y=zeros(2,1);
>> y(1)=x;
??? Conversion  to  double  from  fmad is not  possible.

See MAD Advanced Use of Forward Mode for First Derivatives#Preallocation of Matrices/Arrays and example 5.4 in MAD Advanced Use of Forward Mode for First Derivatives#Example (Preallocating fmad Arrays) for a solution to this problem.

2. Differentiation for a particular function is not supported. For example,

>> x=fmad(1,1);
>> y=tanh(x)
??? Error using  ==> tanh

Function 'tanh' not defined for variables of class 'fmad'. Please contact the author to arrange for the fmad class to be extended for your case.3. Derivatives are not accurate when differentiating an iteratively defined function. See section 5.5 for an approach to this issue.

4. Differentiation is slow compared to finite-differences. The fmad and derivvec class are designed to be competitive with finite-differencing in terms of speed of execution. They will of course return derivatives correct to within machine rounding. If this is not the case then please contact the author.

5. Derivatives are incorrect. Then you have discovered a bug, please contact the author.