# MAD Basic Use of Forward Mode for First Derivatives

### From TomWiki

This page is part of the MAD Manual. See MAD Manual. |

## Differentiating Matlab Expressions

Using forward mode AD, Mad can easily compute first derivatives of functions defined via expressions built-up of the arithmetic operations and intrinsic functions of Matlab. For the moment let us simply consider the trivial operation *y *= *f *(*x*) = *x*2 for which the derivative is 2*x*. Let us determine this derivative at *x *= 3 for which *f *(3) = 9 and *df *(3)*/dx *= 6. We will make use of the fmad constructor function fmad, and the functions getvalue and getderivs for extracting values and derivatives from fmad objects.

### Example MADEXForward1: Forward Mode Example *y* = *x*^{2}

*This example demonstrates how to differentiate simple arithmetic expressions in Matlab. See also MADEXForward2*

#### Contents

- Differentiating
*y*=*f*(*x*) =*x*^{2}

- Using fmad

- Calculating y

- Extracting the value

- Extracting the derivatives

#### Differentiating *y* = *f*(*x*) = *x*^{2}

*MAD can easily compute first derivatives of functions defined via expressions built-up of the arithmetic operations and intrinsic functions of Matlab using forward mode AD. For the moment let us simply consider the trivial operation*

*y* = *f*(*x*) = *x*^{2}

*for which the derivative is*

*d**y* / *d**x* = 2*x*.

*Let us determine this derivative at x=3 for which f(3)=9 and df(3)/dx=6.*

#### Using fmad

*We first define x to be of fmad class and have value 3 and derivative 1 (since the derivative with respect to the independent variable is one, dx/dx=1). The command and Matlab output is as below.*

format compact % so there are no blank lines in the output x=fmad(3,1) % creates the fmad object with value 3 and derivative 1

value = 3 derivatives = 1

#### Calculating y

*The value of the object x is 3 and its derivatives correspond to an object of size [1 1] (a scalar) with one derivative of value 1. We now calculate y exactly as we would normally, allowing the fmad function libraries to take care of the arithmetic associated with values and derivatives.*

y=x^2 value = 9 derivatives = 6

#### Extracting the value

*We extract the value of y using the MAD function getvalue.*

yvalue=getvalue(y) yvalue = 9

#### Extracting the derivatives

*Similarly we extract the derivatives using getderivs.*

yderivs=getderivs(y) yderivs = 6

## Vector-Valued Functions

Let us now consider a simple, linear vector-valued function for which we may use fmad to calculate Jacobian-vector products or the function's Jacobian. We also see, for the first time fmad's getinternalderivs function.

### Example MADEXForward2: Forward Mode Example *y* = *A* * *x*

*This example demonstrates how to calculate Jacobian-vector products and how to calculate the Jacobian for a simple vector-valued function.*

*See also MADEXForward1*

#### Contents

- Differentiating y
*=*f*(*x*) =*A * x

- Directional derivative in [1;0;0] direction

- Directional derivative in [0;1;0] direction

- Calculating the Jacobian

- But dy is a 3-D array not a matrix!

#### Differentiating *y* = *f*(*x*) = *A* * *x*

*MAD easily deals with vector-valued functions. Let us consider the function*

*y* = *f*(*x*) = *A* * *x*

*at x=[1; 1 ;1 ] where A is the matrix,*

A=[ 1 4 7 2 5 8 3 6 9]

*for which the Jacobian is the matrix A and directional derivatives in the 3 respective coordinate directions are the 3 respective columns of A.*

format compact; A=[1 4 7;2 5 8; 3 6 9]; xval=[1;1;1];

#### Directional derivative in [1;0;0] direction

*Using the fmad constructor function we initialise x with its directional derivative, calculate y and extract the derivative. We find the correct directional derivative [1;2;3].*

x=fmad(xval,[1;0;0]) y=A*x; dy=getderivs(y)

fmad object x value = 1 1 1 derivatives = 1 0 0 dy = 1 2 3

**Directional derivative in [0;1;0] direction**

*Similarly in the [0;1;0] direction we get [4;5;6].*

x=fmad(xval,[0;1;0]); y=A*x; dy=getderivs(y)

dy = 4 5 6

#### Calculating the Jacobian

*To obtain the full Jacobian we initialise the derivative with the 3 x 3 identity matrix.*

x=fmad(xval,eye(3)) y=A*x dy=getderivs(y)

fmad object x value = 1 1 1 derivvec object derivatives Size = 3 1 No. of derivs = 3 derivs(:,:,1) = 1 0 0 derivs(:,:,2) = 0 1 0 derivs(:,:,3) = 0 0 1 fmad object y value = 12 15 18 derivvec object derivatives Size = 3 1 No. of derivs = 3 derivs(:,:,1) = 1 2 3 derivs(:,:,2) = 4 5 6 derivs(:,:,3) = 7 8 9 dy(:,:,1) = 1 2 3 dy(:,:,2) = 4 5 6 dy(:,:,3) = 7 8 9

#### But dy is a 3-D array not a matrix!

*Notice now that we have a 3-dimensional array returned for a result we expect to be a matrix. This is easily remedied using the Matlab intrinsic squeeze, which squeezes out singleton dimensions, or more conveniently using getinternalderivs to extract the derivatives from the fmad object y.*

dy=squeeze(dy) dy=getinternalderivs(y)

dy = 1 4 7 2 5 8 3 6 9 dy = 1 4 7 2 5 8 3 6 9

This example brings us to an important topic, how are the derivatives stored in Mad?

## Storage of Derivatives in Mad

Mad stores derivatives in two ways depending upon whether a variable of fmad class possesses single,

or multiple directional derivatives.

### Single Directional Derivatives

For a variable of fmad class with a single directional derivatives the derivatives are stored as an array of class double with the same shape as the variable's value. The getderivs function will therefore return this array. We have seen this behaviour in Example 4.1 and Example 4.2.

This page is part of the MAD Manual. See MAD Manual. |

### Multiple Directional Derivatives

For a variable of fmad class with multiple directional derivatives, the derivatives are stored in an object of class derivvec. Mad has been written so that multiple directional derivatives may be thought of as an array of one-dimension higher than that of the corresponding value using, what we term, the derivative's *external representation*. Mad also has an *internal representation *of derivatives that may often be of use, particularly when interfacing Mad with another Matlab package. Accessing the internal representation is covered in MAD Advanced Use of Forward Mode for First Derivatives#Accessing The Internal Representation of Derivatives. Consider an N-dimensional Matlab array A with size(A)=[n1 n2 n3 ... nN], such that (n1 n2 ... nN) are positive integers. Then the external representation of A's derivatives, which we shall call dA is designed to have size (dA)=[n1 n2 n3 ... nN nD], where nD*> *1 is the number of directional derivatives. An example or two may make things clearer.

#### Example MADEXExtRep1: External representation of a matrix

*This example illustrates MAD's external representation of derivatives for a matrix. See also MADEXExtRep2, MADEXIntRep1*

##### Contents

- Define a 2 x 2 matrix A.

- Multiple directional derivatives for A

- The fmad object

- The derivs component

##### Define a 2 x 2 matrix A.

format compact A=[1 2; 3 4]

A = 1 2 3 4

##### Multiple directional derivatives for A

*Define a 2 x 2 x 3 matrix (note the transpose) to provide 3 directional derivatives for dA.*

dA=reshape([1 2 3 4;5 6 7 8;9 10 11 12]',[2 2 3])

dA(:,:,1) = 1 3 2 4 dA(:,:,2) = 5 7 6 8 dA(:,:,3) = 9 11 10 12

##### The fmad object

*We create the fmad object,*

A=fmad(A,dA)

value = 1 2 3 4 Derivatives Size = 2 2 No. of derivs = 3 derivs(:,:,1) = 1 3 2 4 derivs(:,:,2) = 5 7 6 8 derivs(:,:,3) = 9 11 10 12

##### The derivs component

*The last index of the A's derivs component refers to the directional derivatives and we see that, as expected, we have a 2 x 2 x 3 array which makes up 3 sets of 2 x 2 directional derivative matrices.*

Note that fmad (like Matlab and Fortran) assumes a *column major *(or first index fastest) ordering for array storage. The derivatives supplied to the constructor function fmad are therefore assumed to be in column major order and the external representation reshapes them appropriately. We therefore needn't bother reshaping the derivatives before passing them to fmad as the example below shows.

#### Example MADEXExtRep2: External Representation Example

*This example shows the use of column-major ordering for an input derivative. See also MADEXExtRep1, MADEXIntRep1*

##### Contents

- Define a 2 x 2 matrix A.

- Multiple directional derivatives for A

- The fmad object

- The derivs component

- Define a 2 x 2 matrix A.

format compact A=[1 2; 3 4]

A = 1 2 3 4

##### Multiple directional derivatives for A

*Define a vector to provide directional derivatives for dA.*

dA=[1 2 3 4 5 6 7 8 9 10 11 12];

##### The fmad object

*We create the fmad object,*

A=fmad(A,dA)

value = 1 2 3 4 Derivatives Size = 2 2 No. of derivs = 3 derivs(:,:,1) = 1 3 2 4 derivs(:,:,2) = 5 7 6 8 derivs(:,:,3) = 9 11 10 12

##### The derivs component

*Notice how the elements of dA are taken in column major order to provide the 3 directional derivatives of A in turn.*

## Use of Intrinsics (Operators and Functions)

We have produced overloaded fmad functions for most commonly used Matlab operators and functions.

For example addition of two fmad objects is determined by the function @fmad\plus, subtraction @fmad\minus, etc. To see which overloaded intrinsic operators and functions are provided type help fmad.Contents at the Matlab prompt. The help information for individual functions describes the function and any additional information such as restrictions, e.g. help fmad.plus

## Additional Functions

We have also coded a small number of additional functions which are not part of Matlab but application specific. For example, for a matrix A, logdet(A) calculates log(abs(det(A))) a function used in statistical parameter estimation. This composition of functions may be differentiated very efficiently by considering it in this composite form [Kub94, For01]. Note that a version of the function valid for arguments of classes double and sparse is provided in the MAD/madutil.

## Differentiating User Defined Functions

In section 4.1 we saw how simple Matlab expressions could be differentiated using Mad. It is usually straight-forward to apply the same methodology to differentiating user defined functions.

**Example 4.5 (Brusselator ODE Function) ***Consider the function which defines the Brusselator stiff ODE problem from the *Matlab *ODE toolbox, supplied as *brussode_f.m *and shown in #Comparing the AD and FD Jacobian. For a given scalar time *t *(unused) and vector *y *of *2N *rows, this function *brussode f *will supply the right-hand-side function for an ODE of the form,* *d***y **= **F**(*t, ***y**)*. dt*

*The Jacobian of the function is required to solve the ODE efficiently via an implicit stiff ODE solver (e.g., *ode15s *in *Matlab*). The function has actually been written in so-called *vectorised *form, i.e. if *y *is a matrix then the *dydt *supplied will also be a matrix with each column of *dydt *corresponding to the appropriate column of *y*. This approach allows for the efficient approximation of the function's Jacobian via finite-differences.* *It is trivial to calculate the Jacobian of the function of #Comparing the AD and FD Jacobian using *Mad*. The commands to do so are given in the script *MADEXbrussodeJac.m*. The script also includes a simple one-sided finite-difference (FD) calculation of the Jacobian.*

### MADEXbrussodeJac: Full Jacobian of Brusselator Problem

*This example shows how to use the fmad class to calculate the Jacobian of the Brusselator problem and compare the result with those from finite-differencing. More efficient AD approaches are available as shown in MADEXbrussodeJacSparse, MADEXbrussodeJacComp*.

*See also: brussode f, brussode S, MADEXbrussodeJacSparse, MADEXbrussodeJacComp, brussode*

#### Contents

- Initialise variables

- Use fmad to calculate the Jacobian

- Using finite-differencing

- Comparing the AD and FD Jacobian

#### Initialise variables

*We define the input variables.*

N=3; t=0; y0=ones(2*N,1);

#### Use fmad to calculate the Jacobian

*We initialise y with the value y0 and derivatives given by the identity matrix of size equal to the length of y0. We may then evaluate the function and extract the value and Jacobian.*

y=fmad(y0,eye(length(y0))); F=brussode_f(t,y,N); F_fmad=getvalue(F); % grab value JF_fmad=getinternalderivs(F) % grab Jacobian

JF_fmad = -2.6400 1.0000 0.3200 0 0 0 1.0000 -1.6400 0 0.3200 0 0 0.3200 0 -2.6400 1.0000 0.3200 0 0 0.3200 1.0000 -1.6400 0 0.3200 0 0 0.3200 0 -2.6400 1.0000 0 0 0 0.3200 1.0000 -1.6400

#### Using finite-differencing

*We make 2N extra copies of y, add a perturbation to the copies, perform all function evaluations taking advantage of the vectorization of the function brussode f, and approximate the derivatives using 1-sided finite-differences.*

y=repmat(y0,[1 2*N+1]); y(:,2:end)=y(:,2:end)+sqrt(eps)*eye(2*N); F=brussode_f(t,y,N); F_fd=F(:,1); JF_fd=(F(:,2:end)-repmat(F_fd,[1 2*N]))./sqrt(eps)

JF_fd = -2.6400 1.0000 0.3200 0 0 0 1.0000 -1.6400 0 0.3200 0 0 0.3200 0 -2.6400 1.0000 0.3200 0 0 0.3200 1.0000 -1.6400 0 0.3200 0 0 0.3200 0 -2.6400 1.0000 0 0 0 0.3200 1.0000 -1.6400

#### Comparing the AD and FD Jacobian

*The function values computed are, of course, identical. The Jacobians disagree by about 1e-8, this is due to the truncation error of the finite-difference approximation.*

disp(['Function values norm(F_fd-F_fmad) = ',num2str(norm(F_fd-F_fmad,inf))]) disp(['Function Jacobian norm(JF_fd-JF_fmad)= ',num2str(norm(JF_fd-JF_fmad,inf))]) Function values norm(F_fd-F_fmad) = 0 Function Jacobian norm(JF_fd-JF_fmad) = 2.861e-008 function dydt=brussode_f(t,y,N) % brussode_f: defines ODE for the Brusselator problem. % Taken from matlab 6.5 brussode. % % See also MADEXbrussode, brussode_S, brussode % AUTHOR: S.A.Forth % DATE: 30/5/06 % Copyright 2006: S.A.Forth, Cranfield University % % REVISIONS: % DATE WHO WHAT c = 0.02 * (N+1)^2; dydt = zeros(2*N,size(y,2)); % preallocate dy/dt % Evaluate the 2 components of the function at one edge of the grid % (with edge conditions). i = 1; dydt(i,:) = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + c*(1-2*y(i,:)+y(i+2,:)); dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + c*(3-2*y(i+1,:)+y(i+3,:)); % Evaluate the 2 components of the function at all interior grid points. i = 3:2:2*N-3; dydt(i,:) = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + ... c*(y(i-2,:)-2*y(i,:)+y(i+2,:)); dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + ... c*(y(i-1,:)-2*y(i+1,:)+y(i+3,:)); % Evaluate the 2 components of the function at the other edge of the grid % (with edge conditions). i = 2*N-1; dydt(i,:) = 1 + y(i+1,:).*y(i,:).^2 - 4*y(i,:) + c*(y(i-2,:)-2*y(i,:)+1); dydt(i+1,:) = 3*y(i,:) - y(i+1,:).*y(i,:).^2 + c*(y(i-1,:)-2*y(i+1,:)+3);

We should note that the above approach is not efficient either for FD or AD for large N. The Jacobian is sparse and exploiting the sparsity greatly increases the efficiency of both approaches. Exploiting sparsity will be covered in MAD Advanced Use of Forward Mode for First Derivatives#Dynamic Propagation of Sparse Derivatives and MAD Advanced Use of Forward Mode for First Derivatives#Sparse Jacobians via Compression.

## Possible Problems Using the Forward Mode

In many cases fmad will be able to differentiate fairly involved functions as shown in Figure 1. However some problems in terms of robustness and efficiency may occur.

1. Assigning an fmad object to a variable of class double. For example,

>> x=fmad(1,1); >> y=zeros(2,1); >> y(1)=x; ??? Conversion to double from fmad is not possible.

See MAD Advanced Use of Forward Mode for First Derivatives#Preallocation of Matrices/Arrays and example 5.4 in MAD Advanced Use of Forward Mode for First Derivatives#Example (Preallocating fmad Arrays) for a solution to this problem.

2. Differentiation for a particular function is not supported. For example,

>> x=fmad(1,1); >> y=tanh(x) ??? Error using ==> tanh

Function 'tanh' not defined for variables of class 'fmad'. Please contact the author to arrange for the fmad class to be extended for your case.3. Derivatives are not accurate when differentiating an iteratively defined function. See section 5.5 for an approach to this issue.

4. Differentiation is slow compared to finite-differences. The fmad and derivvec class are designed to be competitive with finite-differencing in terms of speed of execution. They will of course return derivatives correct to within machine rounding. If this is not the case then please contact the author.

5. Derivatives are incorrect. Then you have discovered a bug, please contact the author.