MAD
Mad is a Matlab library of functions and utilities for the automatic differentiation of Matlab functions/statements via operator and function overloading. Currently the forward mode of automatic differentiation is supported via the fmad class. For a single directional derivative, objects of the fmad class use Matlab arrays of the same size for a variable's value and its directional derivative. Multiple directional derivatives are stored in objects of the derivvec class, which uses an internal 2-D matrix storage. This allows the use of sparse matrix storage for derivatives and ensures efficient linear combination of derivative vectors via high-level Matlab functions. This user guide covers:
- installation of Mad on UNIX and PC platforms,
- using TOMLAB /MAD,
- basic use of the forward mode for differentiating expressions and functions,
- advanced use of the forward mode including:
  - dynamic propagation of sparse derivatives,
  - sparse derivatives via compression,
  - differentiating implicitly defined functions,
  - control of dependencies,
- use of high-level interfaces for solving ODEs and optimization problems outside of the TOMLAB framework,
- differentiating black-box functions for which derivatives are known.
Introduction
Automatic (or algorithmic) differentiation (AD) is now a widely used tool within scientific computing. AD concerns the mathematical and software process of taking a computer code, which calculates some outputs y in terms of inputs x through the action of some function y = F(x), and calculating the function's derivatives, e.g. the Jacobian F'(x) = ∂F/∂x. The standard reference is the book [Gri00]. It is common notation to take x ∈ R^n and y ∈ R^m. In AD the function F is considered to be defined by the computer code, which is termed the source or source code.
A variety of tools exist for AD of the standard programming languages including:
Fortran: ADIFOR [BCKM96], TAF/TAMC [GK96], Tapenade [INR05], AD01 [PR98].
C, C++: ADIC [BRM97], ADOL-C [GJU96].
All these tools will calculate Jacobian-vector products F' v by the forward (tangent-linear) mode of AD. Some will calculate vector-Jacobian products u^T F' = (F'^T u)^T via the reverse (adjoint) mode. Some will efficiently calculate matrix products F' V and U^T F', and thus facilitate calculation of the Jacobian F' by taking V = I_n or U = I_m (where I_n is the n × n identity matrix). Additionally a few will determine second derivatives (Hessians, Hessian-vector products) and even arbitrary order derivatives. An up-to-date list of AD tools may be found at the website [aut05].
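As a small worked illustration of these products (the function below is chosen here purely as an example and does not come from any of the tools cited), take n = m = 2:

```latex
\begin{align*}
F(x) &= \begin{pmatrix} x_1 x_2 \\ \sin x_1 \end{pmatrix},
&
F'(x) &= \begin{pmatrix} x_2 & x_1 \\ \cos x_1 & 0 \end{pmatrix}, \\[4pt]
v = \begin{pmatrix} 1 \\ 0 \end{pmatrix}:\quad
F'(x)\,v &= \begin{pmatrix} x_2 \\ \cos x_1 \end{pmatrix}
&&\text{(forward mode: first column of } F'\text{)}, \\[4pt]
u = \begin{pmatrix} 1 \\ 0 \end{pmatrix}:\quad
u^T F'(x) &= \begin{pmatrix} x_2 & x_1 \end{pmatrix}
&&\text{(reverse mode: first row of } F'\text{)}.
\end{align*}
```

Taking V = I_2 in F' V therefore assembles the Jacobian column by column, while U = I_2 in U^T F' assembles it row by row.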
The recent textbook [Gri00] and conference proceedings [BCH+05] detail the current state of the art in the theory and implementation of AD techniques.
For the languages listed above, AD is implemented either via source-text transformation [BCKM96, GK96, INR05, BRM97] or overloading [PR98, GJU96]. In source-text transformation the original code that numerically defines the function F is transformed, via sophisticated compiler techniques, to new code that calculates the required derivatives. In overloading, the ability offered by modern programming languages to define new data-types and overload arithmetic operations and intrinsic function calls is utilised to systematically propagate derivatives through the source code and hence calculate the necessary derivatives of the function.
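The overloading approach can be sketched in a few lines. The following is a minimal illustration in Python, not code from any of the tools cited: the class name `Dual`, the helper `sin` and the test function `f` are all hypothetical names chosen here. A new data type carries a value together with one directional derivative, and the overloaded `+`, `*` and `sin` apply the sum, product and chain rules as the unmodified function body executes.

```python
import math


class Dual:
    """A value paired with one directional derivative (illustrative only)."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        # Plain numbers are constants: their derivative is zero.
        return other if isinstance(other, Dual) else Dual(other, 0.0)

    def __add__(self, other):
        other = Dual._lift(other)
        # Sum rule.
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def sin(x):
    """Overloaded intrinsic: applies the chain rule when given a Dual."""
    if isinstance(x, Dual):
        return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)
    return math.sin(x)


def f(x):
    # The "source code": written once, using only +, * and sin.
    return x * x + 3.0 * sin(x)


x = Dual(2.0, 1.0)   # evaluate at x = 2 with seed derivative 1
y = f(x)             # y.value is f(2), y.deriv is f'(2)
```

The same function `f` still accepts ordinary floats, which is the practical appeal of overloading: the source is left untouched and only the type of the input changes.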
In Matlab progress has been slower. Rich and Hill [RH92] detail a TURB-C function, callable from Matlab, that allows a function defined by a character string to be differentiated by the forward mode and returned as another character string. The resulting string representing the derivatives may then be evaluated for any particular input x. We view this approach more as a symbolic algebra technique for differentiating a single statement than as AD, which is concerned with many lines of code containing subfunctions, looping, etc. and which may make use of symbolic/compiler techniques.
Modern developments began when Coleman & Verma produced the ADMIT Toolbox [CV00], allowing functions specified by C-code to be differentiated and accessed from within Matlab through use of ADOL-C. Of even greater relevance is the pioneering work of Verma, who demonstrated in his ADMAT tool [Ver99, Ver98b] how the object-oriented features of Matlab version 5 (not available to Rich and Hill in 1992) could be used to build up overloaded libraries of Matlab functions allowing first and second order derivatives to be calculated using both forward and reverse AD. ADMIT may now calculate Jacobians via ADMAT.
Our experience of ADMAT is that, although it has floating-point operation counts in line with theory, it runs rather slowly. Additionally we have had some difficulties with robustness of the software. With these deficiencies in mind we reimplemented the forward mode of AD for first derivatives in a new Matlab toolbox named Mad (Matlab Automatic Differentiation) [For06].
In common with other overloaded, forward mode AD tools, Mad evaluates a function F(x) at x = x0 and in the course of this also evaluates the directional derivative F'(x0) v. This is done by first initialising x as an fmad object using (in Matlab syntax) x = fmad(x0,v) to specify the point x0 of evaluation and the directional derivative v. Then evaluation of y = F(x) propagates fmad objects which, through overloading of the elementary operations and functions of Matlab, ensures that directional derivatives and values of all successive quantities, and in particular y, are calculated. By specifying v = V, a matrix whose columns are several directions, several directional derivatives may be propagated simultaneously. In particular, if V = I, the identity matrix, then the Jacobian F'(x) at x = x0 is evaluated. The tool is not restricted to vector valued functions. If the result of a calculation is an N-dimensional array, then directional derivatives are N-dimensional arrays, and an (N + 1)-dimensional array may be propagated to handle multiple directional derivatives.
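The idea of seeding with a matrix of directions to obtain a Jacobian can be sketched outside Matlab as follows. This is a Python illustration under stated assumptions, not Mad's implementation: the class `FwdVec`, the helper `seed` (standing in for the role played by x = fmad(x0,V)) and the test function `F` are hypothetical names introduced here. Each variable carries its value and one derivative entry per seed direction, and seeding with the identity matrix yields the Jacobian row by row.

```python
import math


class FwdVec:
    """A value plus a list of directional derivatives, one per seed direction."""

    def __init__(self, value, derivs):
        self.value = value
        self.derivs = list(derivs)

    def _lift(self, other):
        # Constants get a zero derivative in every direction.
        if isinstance(other, FwdVec):
            return other
        return FwdVec(other, [0.0] * len(self.derivs))

    def __add__(self, other):
        other = self._lift(other)
        return FwdVec(self.value + other.value,
                      [a + b for a, b in zip(self.derivs, other.derivs)])

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule applied to every direction at once.
        return FwdVec(self.value * other.value,
                      [a * other.value + self.value * b
                       for a, b in zip(self.derivs, other.derivs)])

    __rmul__ = __mul__


def sin(x):
    # Chain rule for sine, applied to every direction.
    return FwdVec(math.sin(x.value), [math.cos(x.value) * d for d in x.derivs])


def seed(x0, V):
    """Hypothetical analogue of x = fmad(x0, V): row i of V seeds x[i]."""
    return [FwdVec(xi, row) for xi, row in zip(x0, V)]


def F(x):
    # Example function F: R^2 -> R^2.
    return [x[0] * x[1], sin(x[0]) + 2.0 * x[1]]


x0 = [1.0, 2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]

y = F(seed(x0, identity))              # V = I propagates two directions
jacobian = [yi.derivs for yi in y]     # rows of F'(x0)

single = F(seed(x0, [[1.0], [1.0]]))   # one direction v = (1, 1)^T
```

With V = I the list `jacobian` holds the rows of F'(x0); with a single column v the same code returns only the product F'(x0) v, at proportionally lower cost.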
Our implementation features:
- A new Matlab class fmad which overloads the builtin Matlab arithmetic and some intrinsic functions.
- Separation of storage and manipulation of derivatives into a separate class derivvec. This greatly reduces the complexity of the Matlab code for the fmad class. Additionally it allows for the straightforward optimisation of the derivative vector operations using the high level matrix/array functions of Matlab.
- The derivvec class also allows for the use of Matlab's sparse matrix representation to exploit sparsity in the derivative calculations at runtime for arbitrary dimension arrays.
- New for version 1.2: Support for compressed Jacobian calculations when a sparse Jacobian has a known sparsity pattern, detailed in Section 5.4.
- New for version 1.2: High-level interfaces for enabling use of Mad with Matlab's Optimization Toolbox and stiff ODE solvers as described in Section 6.
- New for version 1.2: Black box interfaces, described in Section 7, for interfacing Mad with code for which derivative code has already been developed, or for calling external procedures for which derivatives are known.
- New for version 1.2: All examples are now available as script M-files and are added to the Matlab path on startup.
- New for version 1.2: Improved help facilities, introduced in Section 2.5, which allow removal of the Appendix of previous versions of this manual.
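The compressed Jacobian calculation mentioned in the list above relies on a known sparsity pattern. One widely used way to exploit such a pattern is column compression in the style of Curtis, Powell and Reid: columns that share no nonzero row are summed into a single seed direction, so fewer directional derivatives need to be propagated. The sketch below is a Python illustration of that general idea only. Whether Section 5.4 uses exactly this scheme is an assumption, all function names are hypothetical, and the product J S is formed by plain matrix multiplication where an AD tool would propagate the columns of S as directional derivatives.

```python
def column_groups(pattern):
    """Greedily group columns that share no nonzero row."""
    n_cols = len(pattern[0])
    groups = []   # list of (columns in group, rows already occupied)
    for j in range(n_cols):
        rows_j = {i for i, row in enumerate(pattern) if row[j]}
        for cols, used in groups:
            if not (rows_j & used):
                cols.append(j)
                used.update(rows_j)
                break
        else:
            groups.append(([j], set(rows_j)))
    return [cols for cols, _ in groups]


def seed_matrix(groups, n_cols):
    """Column g of S is the sum of the unit vectors of the columns in group g."""
    S = [[0.0] * len(groups) for _ in range(n_cols)]
    for g, cols in enumerate(groups):
        for j in cols:
            S[j][g] = 1.0
    return S


def decompress(B, pattern, groups):
    """Recover J from B = J * S using the known sparsity pattern."""
    J = [[0.0] * len(pattern[0]) for _ in pattern]
    for g, cols in enumerate(groups):
        for j in cols:
            for i, row in enumerate(pattern):
                if row[j]:
                    J[i][j] = B[i][g]
    return J


# Example: a 5 x 5 tridiagonal Jacobian with made-up entries.
n = 5
pattern = [[abs(i - j) <= 1 for j in range(n)] for i in range(n)]
J_true = [[float(10 * (i + 1) + (j + 1)) if pattern[i][j] else 0.0
           for j in range(n)] for i in range(n)]

groups = column_groups(pattern)   # 3 groups instead of 5 columns
S = seed_matrix(groups, n)

# Stand-in for forward-mode AD: the compressed product B = J * S.
B = [[sum(J_true[i][k] * S[k][g] for k in range(n))
      for g in range(len(groups))] for i in range(n)]

J_rec = decompress(B, pattern, groups)
```

For this tridiagonal pattern three seed directions suffice regardless of n, compared with n directions when seeding with the full identity matrix.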
Coverage of Matlab's huge range of intrinsic functions is constantly being extended, with new functionality added as demanded. This allows for careful testing on users' test cases before widespread release.
An implementation of reverse mode AD is currently under development.
This document constitutes a User Guide for the fmad class of Mad. Section 2 covers installing Mad. Section 4 introduces basic usage of the forward mode via some simple examples. Section 5 details more advanced usage of the forward mode and in particular provides guidance on how Matlab functions should be written to give effective usage of Mad. Section 6 introduces the high-level interfaces first described in [FK04]. These allow users of Matlab's Optimization Toolbox [Mat06b] or Matlab's stiff ODE solvers [mat06a, Chapter 5] to easily use Mad for calculating function gradients or Jacobians. Section 7 details black box interfaces allowing users to interface Mad to functions whose derivatives are known and which are possibly coded in external mex files written in Fortran or C.