Matrix Calculus

Notation

Derivatives

In the main part of this page we express results in terms of differentials rather than derivatives for two reasons: differentials avoid notational disagreements and they cope easily with the complex case. In most cases, however, the differentials have been written in the form dY: = dY/dX dX: so that the corresponding derivative may be easily extracted.

Derivatives with respect to a real matrix

If X is p#q and Y is m#n, then dY: = dY/dX dX: where the derivative dY/dX is a large mn#pq matrix. If X and/or Y are column vectors or scalars, then the vectorization operator : has no effect and may be omitted. dY/dX is also called the Jacobian Matrix of Y: with respect to X: and, when mn = pq so that this matrix is square, det(dY/dX) is the corresponding Jacobian. The Jacobian occurs when changing variables in an integration: Integral(f(Y) dY:) = Integral(f(Y(X)) |det(dY/dX)| dX:), where the magnitude of the Jacobian is taken because the integrals are over unoriented regions.
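As an illustration of the mn#pq derivative, the sketch below checks numerically that for the linear function Y = AXB the relation dY: = dY/dX dX: holds with dY/dX = kron(B^T, A). The matrices A and B, the sizes chosen and the helper `vec` are assumptions made for this example only; `vec` stacks columns, which is the usual meaning of the : operator.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one long vector (the ':' operator)."""
    return M.reshape(-1, order="F")

rng = np.random.default_rng(0)
m, p, q, n = 2, 3, 4, 5
A = rng.standard_normal((m, p))      # illustrative constant matrices
B = rng.standard_normal((q, n))
X = rng.standard_normal((p, q))      # X is p#q
dX = 1e-6 * rng.standard_normal((p, q))

# Y = A X B is m#n and linear in X, so dY = A dX B and dY/dX = kron(B^T, A).
J = np.kron(B.T, A)

dY = A @ (X + dX) @ B - A @ X @ B
jacobian_shape = J.shape
max_error = np.max(np.abs(vec(dY) - J @ vec(dX)))

print(jacobian_shape, max_error)
```

Here `jacobian_shape` is (mn, pq) and `max_error` is at rounding level because the function is linear in X.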

Although they do not generalise so well, other authors use alternative notations for the cases when X and Y are both vectors or when one is a scalar. In particular:

Derivatives with respect to a complex matrix

If X is complex then dY: = dY/dX dX: is generally true only if Y(X) is an analytic function, which normally implies that Y(X) does not depend explicitly on X^C or X^H.

Even for non-analytic functions we can treat X and X^C (with X^H = (X^C)^T) as distinct variables and write uniquely dY: = ðY/ðX dX: + ðY/ðX^C dX^C: provided that Y is analytic with respect to X and X^C individually (or equivalently with respect to X^R and X^I individually). ðY/ðX is the Generalized Complex Derivative and ðY/ðX^C is the Complex Conjugate Derivative [R.4, R.9].

We define the generalized derivatives in terms of partial derivatives with respect to X^R and X^I:
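A minimal numerical sketch for a scalar complex variable, assuming the usual Wirtinger-style definitions ðf/ðx = ½(ðf/ðx^R − j ðf/ðx^I) and ðf/ðx^C = ½(ðf/ðx^R + j ðf/ðx^I), which agree with the Complex Gradient Vector formula given on this page. The helper name `generalized_derivatives`, the step size and the two test functions are illustrative choices, not part of the original text.

```python
import numpy as np

def generalized_derivatives(f, x, h=1e-6):
    """Return (df/dx, df/dx^C) for a scalar complex x by central differences."""
    d_real = (f(x + h) - f(x - h)) / (2 * h)            # partial wrt x^R
    d_imag = (f(x + 1j * h) - f(x - 1j * h)) / (2 * h)  # partial wrt x^I
    return 0.5 * (d_real - 1j * d_imag), 0.5 * (d_real + 1j * d_imag)

x0 = 0.7 - 1.3j

# Analytic function f(x) = x^2: df/dx = 2x and df/dx^C = 0.
d_an, dc_an = generalized_derivatives(lambda x: x**2, x0)

# Non-analytic function f(x) = x x^C = |x|^2: df/dx = x^C and df/dx^C = x.
d_na, dc_na = generalized_derivatives(lambda x: x * np.conj(x), x0)

print(d_an, dc_an)
print(d_na, dc_na)
```

The conjugate derivative vanishes for the analytic function and not for |x|^2, which is why the single-term form dY: = dY/dX dX: cannot hold for the latter.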

We have the following relationships for both analytic and non-analytic functions Y(X):

Complex Gradient Vector

If f(x) is a real function of a complex vector then ðf/ðx^C = (ðf/ðx)^C and we can define the complex-valued column vector grad(f(x)) = 2 (ðf/ðx)^H = (ðf/ðx^R + j ðf/ðx^I)^T as the Complex Gradient Vector [R.9] with the following properties:
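As a hedged example of the definition, consider the real quadratic form f(x) = x^H A x with A Hermitian. Treating x^C as constant gives ðf/ðx = x^H A, so the Complex Gradient Vector should be 2 (x^H A)^H = 2 A x. The sketch below compares this with (ðf/ðx^R + j ðf/ðx^I)^T evaluated by central differences; the function names, the matrix A and the step size are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = M + M.conj().T                       # Hermitian, so f below is real
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def f(x):
    """Real-valued quadratic form x^H A x."""
    return np.real(np.conj(x) @ A @ x)

def complex_gradient(f, x, h=1e-6):
    """(df/dx^R + j df/dx^I)^T by central differences, one element at a time."""
    g = np.zeros(x.size, dtype=complex)
    for k in range(x.size):
        e = np.zeros(x.size)
        e[k] = h
        d_real = (f(x + e) - f(x - e)) / (2 * h)            # wrt real part
        d_imag = (f(x + 1j * e) - f(x - 1j * e)) / (2 * h)  # wrt imaginary part
        g[k] = d_real + 1j * d_imag
    return g

grad_numeric = complex_gradient(f, x)
grad_closed_form = 2 * A @ x             # 2 (df/dx)^H with df/dx = x^H A

print(np.max(np.abs(grad_numeric - grad_closed_form)))
```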

Basic Properties

Differentials of Linear Functions

Differentials of Quadratic Products

Differentials of Cubic Products

Differentials of Inverses
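A standard result of this kind is d(X^-1) = -X^-1 dX X^-1 for a non-singular square X. The sketch below checks it to first order; the particular matrix X and the size of the perturbation dX are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# A small, well-conditioned (diagonally dominant) example matrix.
X = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
dX = 1e-6 * rng.standard_normal((3, 3))

X_inv = np.linalg.inv(X)
actual = np.linalg.inv(X + dX) - X_inv   # exact change in the inverse
first_order = -X_inv @ dX @ X_inv        # d(X^-1) = -X^-1 dX X^-1
residual = np.max(np.abs(actual - first_order))

print(residual)
```

The residual is second order in dX, so it is several orders of magnitude smaller than the entries of `first_order`.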

Differentials of Trace

Note: matrix dimensions must result in an n#n argument for tr().

Differentials of Determinant

Note: matrix dimensions must result in an n#n argument for det(). Some of the expressions below involve inverses: these forms apply only if the quantity being inverted is square and non-singular; alternative forms involving the adjoint, ADJ(), do not have the non-singular requirement.
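To illustrate the difference between the two kinds of expression, the sketch below uses the standard identities d(det(X)) = det(X) tr(X^-1 dX) and d(det(X)) = tr(ADJ(X) dX). The helper `adj`, which builds the adjoint from cofactors, and the example matrices are assumptions made for this example; the cofactor construction is chosen for clarity rather than efficiency.

```python
import numpy as np

def adj(X):
    """Adjoint (adjugate) of a square matrix: transpose of the cofactor matrix."""
    n = X.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(X, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

rng = np.random.default_rng(3)
dX = 1e-6 * rng.standard_normal((3, 3))

# Non-singular X: the inverse form and the adjoint form agree.
X = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
actual = np.linalg.det(X + dX) - np.linalg.det(X)
via_inverse = np.linalg.det(X) * np.trace(np.linalg.inv(X) @ dX)
via_adjoint = np.trace(adj(X) @ dX)

# Singular X (rank 2): only the adjoint form can be evaluated.
S = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.0, 1.0, 1.0]])
actual_singular = np.linalg.det(S + dX) - np.linalg.det(S)
via_adjoint_singular = np.trace(adj(S) @ dX)

print(actual, via_inverse, via_adjoint)
print(actual_singular, via_adjoint_singular)
```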

Jacobian

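dY/dX is called the Jacobian Matrix of Y: with respect to X: and, when this matrix is square, J_X(Y) = det(dY/dX) is the corresponding Jacobian. The Jacobian occurs when changing variables in an integration: Integral(f(Y) dY:) = Integral(f(Y(X)) |det(dY/dX)| dX:), where the magnitude of the Jacobian is taken because the integrals are over unoriented regions.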
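A small numerical sketch of the change of variables, using the magnitude of the Jacobian as is usual for integrals over unoriented regions. It assumes the linear map y = A x, for which dy/dx = A, and the integrand f(y) = exp(-y^T y), whose integral over the plane is pi. The matrix A, the grid spacing and the integration limits are illustrative choices.

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.0, -1.5]])   # det(A) = -3, so the sign matters
jacobian = np.linalg.det(A)               # dy/dx = A for the linear map y = A x

# Grid over x; the integrand is negligible outside [-6, 6]^2.
h = 0.02
t = np.arange(-6.0, 6.0 + h / 2, h)
x1, x2 = np.meshgrid(t, t, indexing="ij")
y1 = A[0, 0] * x1 + A[0, 1] * x2
y2 = A[1, 0] * x1 + A[1, 1] * x2

# Integral(f(y(x)) |det(dy/dx)| dx) with f(y) = exp(-y^T y), as a Riemann sum.
integral_over_x = np.sum(np.exp(-(y1**2 + y2**2))) * abs(jacobian) * h * h

print(integral_over_x, np.pi)
```

Using the signed determinant here would give -pi, which is why the magnitude is taken when the orientation of the region is ignored.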

Hessian matrix

If f is a real function of x then the Hermitian matrix H_x f = (d/dx (df/dx)^H)^T is the Hessian matrix of f(x). A value of x for which grad f(x) = 0 corresponds to a minimum, maximum or saddle point according to whether H_x f is positive definite, negative definite or indefinite; if H_x f is only semi-definite, the test is inconclusive.
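The classification can be sketched numerically for the special case of a real vector x, where the Hessian is real and symmetric. The functions `hessian` and `classify`, the tolerance and the three test functions are hypothetical names and choices for this example; each test function has zero gradient at the origin.

```python
import numpy as np

def hessian(f, x, h=1e-4):
    """Central-difference Hessian of a real function of a real vector."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H

def classify(H, tol=1e-6):
    """Classify a stationary point from the eigenvalues of its Hessian."""
    eig = np.linalg.eigvalsh((H + H.T) / 2)   # ascending eigenvalues
    if np.all(eig > tol):
        return "minimum"
    if np.all(eig < -tol):
        return "maximum"
    if eig[0] < -tol and eig[-1] > tol:
        return "saddle point"
    return "inconclusive"                     # semi-definite within tolerance

origin = np.zeros(2)
bowl = lambda x: x[0]**2 + 3 * x[1]**2 + x[0] * x[1]
hill = lambda x: -bowl(x)
saddle = lambda x: x[0]**2 - x[1]**2

results = [classify(hessian(f, origin)) for f in (bowl, hill, saddle)]
print(results)
```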


This page is part of The Matrix Reference Manual. Copyright © 1998-2005 Mike Brookes, Imperial College, London, UK. See the file gfl.html for copying instructions. Please send any comments or suggestions to "mike.brookes" at "imperial.ac.uk".
Updated: $Id: calculus.html 5146 2014-09-16 09:26:15Z dmb $