Matrix Calculus





Notation

Derivatives

In the main part of this page we express results in terms of differentials rather than derivatives for two reasons: they avoid notational disagreements and they cope easily with the complex case. In most cases, however, the differentials have been written in the form dY: = dY/dX dX:, so that the corresponding derivative may be easily extracted.

Derivatives with respect to a real matrix

If X is p#q and Y is m#n, then dY: = dY/dX dX: where the derivative dY/dX is a large mn#pq matrix. If X and/or Y are column vectors or scalars, then the vectorization operator : has no effect and may be omitted. dY/dX is also called the Jacobian Matrix of Y: with respect to X: and det(dY/dX) is the corresponding Jacobian. The Jacobian occurs when changing variables in an integration, where its absolute value is used: Integral(f(Y)dY:)=Integral(f(Y(X)) |det(dY/dX)| dX:).
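As a rough numerical illustration of dY: = dY/dX dX:, the sketch below takes the linear function Y = AXB and compares the known Jacobian matrix for that case, kron(B^T, A), with one built by perturbing each element of X: in turn. The matrices A and B, the sizes, and the helper vec() are assumptions made for the example; vec() stacks columns, which is the usual meaning of the : operator.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one long vector (the ':' operator)."""
    return M.reshape(-1, order="F")

rng = np.random.default_rng(0)
p, q, m, n = 3, 2, 4, 5
A = rng.standard_normal((m, p))   # hypothetical constant matrix, m#p
B = rng.standard_normal((q, n))   # hypothetical constant matrix, q#n
X = rng.standard_normal((p, q))   # X is p#q

def Y(X):
    return A @ X @ B              # Y is m#n

# Analytic Jacobian for this particular Y: vec(A X B) = kron(B^T, A) vec(X)
J_analytic = np.kron(B.T, A)      # mn#pq

# Numerical Jacobian: perturb one element of X: at a time
eps = 1e-6
J_numeric = np.zeros((m * n, p * q))
for k in range(p * q):
    dx = np.zeros(p * q)
    dx[k] = eps
    dX = dx.reshape((p, q), order="F")
    J_numeric[:, k] = vec(Y(X + dX) - Y(X)) / eps

print(J_analytic.shape)
print(np.allclose(J_analytic, J_numeric, atol=1e-6))
```

Because Y is linear in X here, the finite-difference Jacobian should agree with the analytic one up to rounding error.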

Although they do not generalise so well, other authors use alternative notations for the cases when X and Y are both vectors or when one is a scalar. In particular:

Derivatives with respect to a complex matrix

If X is complex then dY: = dY/dX dX: holds if and only if Y(X) is an analytic function, which normally implies that Y(X) does not depend on XC or XH.

Even for non-analytic functions we can write uniquely dY: = dY/dX dX: + dY/dXC dXC: provided that Y(X) is analytic with respect to X and XC individually (or equivalently with respect to XR and XI individually). dY/dX is the Generalized Complex Derivative and dY/dXC is the Complex Conjugate Derivative [R.4, R.9].

We define the generalized derivatives in terms of partial derivatives with respect to XR and XI:
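As an illustration, the sketch below assumes the standard Wirtinger-style definitions dY/dX = ½ (dY/dXR - j dY/dXI) and dY/dXC = ½ (dY/dXR + j dY/dXI); these are stated as an assumption rather than quoted from this page. It estimates both derivatives by central differences for a scalar function of a complex scalar. The function name generalized_derivatives and the test point are hypothetical.

```python
import numpy as np

def generalized_derivatives(f, x, eps=1e-6):
    """Return (df/dx, df/dxC) for a scalar function f of a complex scalar x.

    Uses central differences along the real and imaginary axes, then
    combines them with the assumed Wirtinger-style definitions.
    """
    d_real = (f(x + eps) - f(x - eps)) / (2 * eps)            # df/dxR
    d_imag = (f(x + 1j * eps) - f(x - 1j * eps)) / (2 * eps)  # df/dxI
    d_x = 0.5 * (d_real - 1j * d_imag)    # generalized complex derivative
    d_xc = 0.5 * (d_real + 1j * d_imag)   # complex conjugate derivative
    return d_x, d_xc

x0 = 1.5 - 0.5j

# Analytic function x**2: expect df/dx close to 2*x0 and df/dxC close to 0
print(generalized_derivatives(lambda x: x**2, x0))

# Non-analytic function |x|^2 = x * xC: expect df/dx close to conj(x0)
# and df/dxC close to x0
print(generalized_derivatives(lambda x: x * np.conj(x), x0))
```

The first case shows the conjugate derivative vanishing for an analytic function; the second shows a function that depends on XC and therefore has a non-zero conjugate derivative.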

We have the following relationships for both analytic and non-analytic functions Y(X):

Complex Gradient Vector

If f(x) is a real function of a complex vector then df/dxC= (df/dx)C and we can define grad(f(x)) = 2 (df/dx)H = (df/dxR+j df/dxI)T as the Complex Gradient Vector [R.9] with the following properties:
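The definition grad(f(x)) = (df/dxR + j df/dxI)T can be checked numerically. The sketch below uses the real function f(x) = xH A x with A Hermitian, for which 2 (df/dx)H works out to 2 A x. The matrix A, the test vector, and the helper complex_gradient are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = M + M.conj().T                 # Hermitian, so x^H A x is real
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def f(x):
    # Real-valued function of a complex vector
    return np.real(x.conj() @ A @ x)

def complex_gradient(f, x, eps=1e-6):
    """Estimate (df/dxR + j df/dxI)^T by central differences."""
    g = np.zeros(x.size, dtype=complex)
    for k in range(x.size):
        e = np.zeros(x.size)
        e[k] = eps
        d_real = (f(x + e) - f(x - e)) / (2 * eps)            # df/dxR(k)
        d_imag = (f(x + 1j * e) - f(x - 1j * e)) / (2 * eps)  # df/dxI(k)
        g[k] = d_real + 1j * d_imag
    return g

grad_numeric = complex_gradient(f, x)
grad_analytic = 2 * A @ x          # 2 (df/dx)^H for this particular f

print(np.allclose(grad_numeric, grad_analytic, atol=1e-5))
```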

Basic Properties

Differentials of Linear Functions

Differentials of Quadratic Products

Differentials of Cubic Products

Differentials of Inverses

Differentials of Trace

Note: matrix dimensions must result in an n#n argument for tr().

Differentials of Determinant

Note: matrix dimensions must result in an n#n argument for det(). Some of the expressions below involve inverses: these forms apply only if the quantity being inverted is square and non-singular; alternative forms involving the adjoint, ADJ(), do not have the non-singular requirement.
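To illustrate the difference between the inverse and adjoint forms, the sketch below uses the standard identities d(det(X)) = det(X) tr(X^-1 dX) and d(det(X)) = tr(ADJ(X) dX), which are assumed here rather than quoted from this page. The helper adjugate() is a hypothetical cofactor-based implementation suitable only for small matrices.

```python
import numpy as np

def adjugate(X):
    """Adjugate (classical adjoint) of a square matrix via cofactors."""
    n = X.shape[0]
    C = np.zeros_like(X)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(X, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

rng = np.random.default_rng(2)
n = 4
X = rng.standard_normal((n, n))
dX = 1e-6 * rng.standard_normal((n, n))   # small perturbation

# Non-singular X: both forms approximate the actual change in det
actual = np.linalg.det(X + dX) - np.linalg.det(X)
via_inverse = np.linalg.det(X) * np.trace(np.linalg.inv(X) @ dX)
via_adjoint = np.trace(adjugate(X) @ dX)
print(actual, via_inverse, via_adjoint)

# Singular X: the inverse form is unavailable, the adjoint form still applies
X_sing = X.copy()
X_sing[-1] = X_sing[0] + X_sing[1]        # make the last row dependent
actual_sing = np.linalg.det(X_sing + dX) - np.linalg.det(X_sing)
via_adjoint_sing = np.trace(adjugate(X_sing) @ dX)
print(actual_sing, via_adjoint_sing)
```

The agreement is only to first order in dX, so the printed values should match to several significant figures rather than exactly.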

Jacobian

dY/dX is called the Jacobian Matrix of Y: with respect to X: and JX(Y)=det(dY/dX) is the corresponding Jacobian. The Jacobian occurs when changing variables in an integration, where its absolute value is used: Integral(f(Y)dY:)=Integral(f(Y(X)) |det(dY/dX)| dX:).
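A familiar instance of this change of variables is the move from Cartesian to polar coordinates. In the sketch below, y is the Cartesian point, x = (r, theta) holds the polar coordinates, and the Jacobian evaluates to r. The integrand exp(-|y|^2), the unit-disc region, and the grid size are assumptions chosen because the exact answer, pi (1 - exp(-1)), is known.

```python
import numpy as np

def y_of_x(x):
    """Map polar coordinates x = (r, theta) to Cartesian y."""
    r, theta = x
    return np.array([r * np.cos(theta), r * np.sin(theta)])

def jacobian_matrix(x):
    """dy/dx for the polar-to-Cartesian map; its determinant is r."""
    r, theta = x
    return np.array([[np.cos(theta), -r * np.sin(theta)],
                     [np.sin(theta),  r * np.cos(theta)]])

def f(y):
    return np.exp(-(y[0] ** 2 + y[1] ** 2))

# Midpoint rule over r in [0, 1], theta in [0, 2*pi]
N = 100
dr, dtheta = 1.0 / N, 2 * np.pi / N
r_mid = (np.arange(N) + 0.5) * dr
t_mid = (np.arange(N) + 0.5) * dtheta

total = 0.0
for r in r_mid:
    for t in t_mid:
        x = (r, t)
        jac = abs(np.linalg.det(jacobian_matrix(x)))
        total += f(y_of_x(x)) * jac * dr * dtheta

exact = np.pi * (1 - np.exp(-1))   # integral of exp(-|y|^2) over the unit disc
print(total, exact)
```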

Hessian matrix

If f is a real function of x then the Hermitian matrix Hx f = (d/dx (df/dx)H)T is the Hessian matrix of f(x). A value of x for which grad f(x) = 0 corresponds to a minimum, maximum or saddle point according to whether Hx f is positive definite, negative definite or indefinite.
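The classification can be carried out from the eigenvalues of the Hessian, since a Hermitian matrix is positive definite, negative definite or indefinite according to the signs of its eigenvalues. The sketch below is one minimal way to do this; the function name classify_stationary_point and the tolerance are assumptions, and the semidefinite case, which the rule above does not cover, is reported as inconclusive.

```python
import numpy as np

def classify_stationary_point(H, tol=1e-10):
    """Classify a stationary point from its Hermitian Hessian H."""
    eig = np.linalg.eigvalsh(H)          # real eigenvalues, ascending order
    if np.all(eig > tol):
        return "minimum"                 # positive definite
    if np.all(eig < -tol):
        return "maximum"                 # negative definite
    if eig[0] < -tol and eig[-1] > tol:
        return "saddle point"            # indefinite
    return "inconclusive"                # semidefinite within tolerance

# Hessians at x = 0 of three simple real functions of a real 2-vector
H_bowl = np.array([[2.0, 0.0], [0.0, 2.0]])     # f = x1^2 + x2^2
H_dome = -H_bowl                                # f = -(x1^2 + x2^2)
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])  # f = x1^2 - x2^2

for H in (H_bowl, H_dome, H_saddle):
    print(classify_stationary_point(H))
```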


This page is part of The Matrix Reference Manual. Copyright © 1998-2005 Mike Brookes, Imperial College, London, UK. See the file gfl.html for copying instructions. Please send any comments or suggestions to "mike.brookes" at "imperial.ac.uk".
Updated: $Id: calculus.html 3437 2013-09-16 14:55:24Z dmb $