Matrix Properties

Adjoint or Adjugate

The adjoint (or adjugate) of A, ADJ(A), is the transpose of the matrix formed by taking the cofactor of each element of A.

ADJ(A) A = det(A) I
- If det(A) ≠ 0, then A^-1 = ADJ(A) / det(A) but this is a numerically and computationally poor way of calculating the inverse.
ADJ(A^T)=ADJ(A)^T
ADJ(A^H)=ADJ(A)^H
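
The identities above can be checked numerically. The following is a minimal NumPy sketch for real matrices; the function name adjugate and the test matrix are illustrative assumptions, and the cofactor route is shown only for illustration since it is a poor way of calculating an inverse.

```python
import numpy as np

def adjugate(A):
    """Adjugate of a real square matrix: transpose of its cofactor matrix."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    cof = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # Delete row i and column j, then apply the sign (-1)^(i+j).
            sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(sub)
    return cof.T

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
adjA = adjugate(A)
detA = np.linalg.det(A)
print(np.allclose(adjA @ A, detA * np.eye(3)))      # ADJ(A) A = det(A) I
print(np.allclose(np.linalg.inv(A), adjA / detA))   # A^-1 = ADJ(A) / det(A)
```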

Characteristic Equation

The characteristic equation of a matrix A_[n#n] is |tI-A| = 0. It is a polynomial equation in t.

The properties of the characteristic equation are described in the section on eigenvalues.

Characteristic Matrix

The characteristic matrix of A_[n#n] is (tI-A) and is a function of the scalar t.

The properties of the characteristic matrix are described in the section on eigenvalues.

Characteristic Polynomial

The characteristic polynomial, p(t), of a matrix A_[n#n] is p(t) = |tI - A|.

The properties of the characteristic polynomial are described in the section on eigenvalues.

Cofactor

The cofactor of a minor of A:n#n is equal to the product of (i) the determinant of the submatrix consisting of all the rows and columns that are not in the minor and (ii) -1 raised to the power of the sum of all the row and column indices that are in the minor.

The cofactor of the element a(i,j) equals (-1)^(i+j) det(B) where B is the matrix formed by deleting row i and column j from A.

See Minor, Adjoint

Compound Matrix

The k^th compound matrix of A_[m#n] is the [m!/(k!(m-k)!)] # [n!/(k!(n-k)!)] matrix formed from the determinants of all k#k submatrices of A arranged with the submatrix index sets in lexicographic order. Within this section, we denote this matrix by C_k(A).

C₁(A) = A
C_n(A_[n#n]) = det(A)
C_k(AB) = C_k(A)C_k(B)
C_k(aX) = a^kC_k(X)
C_k(I) = I
C_k(A^H) = C_k(A)^H
C_k(A^T) = C_k(A)^T
C_k(A^-1) = C_k(A)^-1
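
As an illustration of the definition, the sketch below forms C_k(A) from all k#k minors. It assumes real matrices; the function name compound and the test matrices are hypothetical choices made for this example.

```python
import numpy as np
from itertools import combinations

def compound(A, k):
    """k-th compound matrix: all k#k minors, index sets in lexicographic order."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    rows = list(combinations(range(m), k))   # combinations() is lexicographic
    cols = list(combinations(range(n), k))
    return np.array([[np.linalg.det(A[np.ix_(r, c)]) for c in cols] for r in rows])

A = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [4.0, 0.0, 1.0]])
B = np.array([[2.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 3.0, 1.0]])
print(np.allclose(compound(A @ B, 2), compound(A, 2) @ compound(B, 2)))  # C_k(AB) = C_k(A)C_k(B)
print(np.allclose(compound(A, 3), [[np.linalg.det(A)]]))                 # C_n(A) = det(A)
```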

Condition Number

The condition number of a matrix is its largest singular value divided by its smallest singular value.

If Ax=y and A(x+p)=y+q then ||p||/||x|| ≤ k ||q||/||y|| where k is the condition number of A. Thus it provides a sensitivity bound for the solution of a linear equation.
If A_[2#2] is hermitian positive definite then its condition number, r, satisfies 4 ≤ tr(A)²/det(A) = (r+1)²/r. This expression is symmetric between r and r^-1 and is monotonically increasing for r>1. It therefore provides an easy way to check on the range of r.
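
Both properties can be checked with a short NumPy sketch. The matrix, the vectors x and p, and the function name condition_number are assumptions chosen only for illustration.

```python
import numpy as np

def condition_number(A):
    """Largest singular value divided by the smallest."""
    s = np.linalg.svd(A, compute_uv=False)   # singular values, largest first
    return s[0] / s[-1]

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                   # real symmetric, positive definite
r = condition_number(A)
trace_det_ratio = np.trace(A) ** 2 / np.linalg.det(A)   # equals (r+1)^2 / r

# Sensitivity bound: Ax = y and A(x+p) = y+q
x = np.array([1.0, -2.0])
p = np.array([0.01, 0.02])
y, q = A @ x, A @ p
rel_out = np.linalg.norm(p) / np.linalg.norm(x)
rel_in = np.linalg.norm(q) / np.linalg.norm(y)
print(rel_out <= r * rel_in)
```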

Conjugate Transpose

X=Y^H is the Hermitian transpose or Conjugate transpose of Y iff x_i,j=y_j,i^C.

See Hermitian Transpose.

Constructibility

The pair of matrices {A_[n#n], C_[m#n]} are constructible iff {A^H, C^H} are controllable.

If {A, C} are observable then they are constructible.
If det(A)≠0 and {A, C} are constructible then they are observable.
If {A, C} are constructible then they are detectable.

Controllability

The pair of matrices {A_[n#n], B_[n#m]} are controllable iff any of the following equivalent conditions is true

There exists a G_[mn#n] such that Aⁿ = CG where C = [B AB A²B ... A^(n-1)B]_[n#mn] is the controllability matrix.
If x^TA^rB = 0 for 0≤r<n then x^TAⁿ = 0.
If x^TB = 0 and x^TA = kx^T then either k=0 or else x = 0.

If {A, B} are reachable then they are controllable.
If det(A)≠0 and {A, B} are controllable then they are reachable.
If {A, B} are controllable then they are stabilizable.
{DIAG(a), b} are controllable iff all non-zero elements of a are distinct and all the corresponding elements of b are non-zero.

Definiteness

A Hermitian square matrix A is

positive definite if x^HAx > 0 for all non-zero x.
positive semidefinite or non-negative definite if x^HAx ≥0 for all non-zero x.
indefinite if x^HAx is > 0 for some x and < 0 for some other x.

This definition only applies to Hermitian and real-symmetric matrices; if A is non-real and non-Hermitian then x^HAx is complex for some values of x and so the concept of definiteness does not make sense. Some authors also call a real non-symmetric matrix positive definite if x^HAx > 0 for all non-zero real x; this is true iff its symmetric part is positive definite (see below). We abbreviate positive as +ve below.

A (not necessarily symmetric) real matrix A satisfies x^HAx > 0 for all non-zero real x iff its symmetric part B=(A+A^T)/2 is +ve definite. Indeed x^TAx= x^TBx for all x.
The following are equivalent
- A is Hermitian and +ve semidefinite
- A is Hermitian and all its eigenvalues are ≥0
- A=B^HB for some B (not necessarily square)
- A=C² for some Hermitian C.
- D^HAD is Hermitian and +ve semidefinite for any D
If A is +ve definite then A^-1 exists and is +ve definite.
If A is +ve semidefinite, then
- the eigenvalues of A are equal to its singular values
- for any integer k>0 there exists a unique +ve semidefinite B with A=B^k. This B also satisfies:
  - AB=BA
  - B=p(A) for some polynomial p()
  - rank(B) = rank(A)
  - if A is real then so is B.
- |a_i,j| ≤ √(a_i,ia_j,j) [3.6]
- |a^HAb|² ≤ a^HAa×b^HAb for any a, b [3.6]
A is +ve definite iff all its eigenvalues are > 0.
- If A is +ve definite then det(A) > 0 and tr(A) > 0.
  - A Hermitian matrix A_[2#2] is +ve definite iff det(A) >0 and tr(A) > 0.
The columns of B_[m#n] are linearly independent iff B^HB is +ve definite.
If A and B are +ve semidefinite, then A+B is +ve semidefinite
If B is +ve definite and A is +ve semidefinite then:
- B^-1A is diagonalizable (i.e. similar to a diagonal matrix) and has non-negative eigenvalues [3.7]
- tr(B^-1A) = 0 iff A=0
- A+B is positive definite
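
A minimal sketch of the eigenvalue test for definiteness follows. The function name is_positive_definite, the tolerance and the example matrices are assumptions made for this illustration.

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    """A Hermitian matrix is +ve definite iff all its eigenvalues are > 0."""
    A = np.asarray(A)
    if not np.allclose(A, A.conj().T):
        raise ValueError("A must be Hermitian")
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

B = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 1.0]])   # linearly independent columns
D = np.array([[1.0, 2.0], [2.0, 4.0]])               # linearly dependent columns
M = np.array([[2.0, 1.0], [-1.0, 2.0]])              # real but not symmetric

print(is_positive_definite(B.T @ B))        # B^H B is +ve definite
print(is_positive_definite(D.T @ D))        # only +ve semidefinite
print(is_positive_definite((M + M.T) / 2))  # test the symmetric part of M
```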

Detectability

The pair of matrices {A_[n#n], C_[m#n]} are detectable iff {A^H, C^H} are stabilizable.

If {A, C} are observable or constructible then they are detectable.

Determinant

For an n#n matrix A, det(A) is a scalar number defined by det(A)=sgn(PERM(n))'*prod(A(1:n,PERM(n)))

This is the sum of n! terms each involving the product of n matrix elements of which exactly one comes from each row and each column. Each term is multiplied by the signature (+1 or -1) of the column-order permutation. See the notation section for definitions of sgn(), prod() and PERM().

The determinant is important because INV(A) exists iff det(A) ≠ 0.
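
The permutation-sum definition can be written out directly. The sketch below is for illustration only (it costs about n × n! operations); the function names perm_sign and det_by_permutations are assumptions, not notation from this manual.

```python
from itertools import permutations
import numpy as np

def perm_sign(p):
    """Signature (+1 or -1) of a permutation, from its inversion count."""
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

def det_by_permutations(A):
    """Sum over all n! column permutations of signature times product."""
    n = len(A)
    return sum(perm_sign(p) * np.prod([A[i][p[i]] for i in range(n)])
               for p in permutations(range(n)))

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
print(det_by_permutations(A), np.linalg.det(A))
```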

Geometric Interpretation

The determinant of a matrix equals the signed n-dimensional volume (area when n=2) of the parallelepiped that has the matrix columns as n of its edges. If a vector space is transformed by multiplying by a matrix A, then all such volumes will be multiplied by det(A).

Properties of Determinants

det(A^T) = det(A)
det(A^H) = conj(det(A))
det(cA) = cⁿ det(A)
det(A^k) = (det(A))^k, where k must be positive if det(A)=0.
Interchanging any pair of columns of a matrix multiplies its determinant by -1 (likewise rows).
Multiplying any column of a matrix by c multiplies its determinant by c (likewise rows).
Adding any multiple of one column onto another column leaves the determinant unaltered (likewise rows).
det(A) ≠ 0 iff INV(A) exists.
[A,B:n#m ; m≥n]: If Q = CHOOSE(m,n) and d(k) = det(A(:,Q(k,:))) det(B(:,Q(k,:))) for k=1:rows(Q) then det(AB^T) = sum(d). This is the Binet-Cauchy theorem.
Suppose that for some r, P = CHOOSE(n,r) and Q = CHOOSE(n,n-r) with the rows of Q ordered so that P(k,:) and Q(k,:) have no elements in common. If we define D(m,k) = (-1)^sum([P(m,:) P(k,:)]) det(A(P(m,:)^T,P(k,:))) det(A(Q(m,:)^T,Q(k,:))) for m,k=1:rows(P) then det(A) = sum(D(m,:)) = sum(D(:,k)) for any k or m. This is the Laplace expansion theorem.
- If we set k=r=1 then P(m,:)=[m] and we obtain the familiar expansion by the first column:
  d(m)=(-1)^(m+1) A(m,1) det(A([1:m-1 m+1:n]^T,2:n)) and det(A)=sum(d).
det(A) = 0 iff the columns of A are linearly dependent (likewise rows).
- det(A) = 0 if two columns are identical (likewise rows).
- det(A) = 0 if any column consists entirely of zeros (likewise rows).
If A = [a₁ a₂ ... a_n] then |det(A)| ≤ prod(||a_i||) with equality iff the a_i are mutually orthogonal where ||a|| is the Euclidean norm; this is the Hadamard inequality.
- If |a_i,j|≤B for all i,j then |det(A)| ≤ n^(n/2) Bⁿ
- [A +ve semidefinite]: det(A) ≤ prod(diag(A))
[A:3#3]: If A = [a b c] then det(A) = det([a b c]) = a^T SKEW(b) c = b^T SKEW(c) a = c^T SKEW(a) b

Determinants of simple matrices

det([a b; c d]) = ad - bc
det([a b c]) = a₁b₂c₃ - a₁b₃c₂ - a₂b₁c₃ + a₂c₁b₃ + a₃b₁c₂ - a₃c₁b₂
The determinant of a diagonal or triangular matrix is the product of its diagonal elements.
The determinant of a unitary matrix has an absolute value of 1.
- The determinant of an orthogonal matrix is +1 or -1.
The determinant of a permutation matrix equals the signature of the column permutation.

Determinants of sums and products

[A,B:n#n ]:det(AB) = det(A) det(B)
[A,B:m#n ]:det(I + A^TB) = det(I + AB^T) = det(I + B^TA) = det(I + BA^T) [3.2]
[A:n#n ]:det(A+xy^T) = (1+y^TA^-1x) det(A) [3.4]
- det(I+xy^T) = 1+y^Tx = 1+x^Ty [3.3]
- det(kI+xy^T) = kⁿ+k^(n-1)y^Tx = kⁿ+k^(n-1)x^Ty
[A,B: n#n, symmetric, +ve semidefinite]:
- (det(A+B))^(1/n) ≥ (det(A))^(1/n) + (det(B))^(1/n); this is the Minkowski determinant inequality.
  - If 0≤k≤1, then (det(kA+(1-k)B))^(1/n) ≥ k(det(A))^(1/n) + (1-k)(det(B))^(1/n)
- If 0≤k≤1, then det(kA+(1-k)B) ≥ (det(A))^k (det(B))^(1-k)
  - det(A+B) ≥ √(det(4AB))
- For any integer m>0, n(det(A)det(B))^(m/n) ≤ tr(A^mB^m)
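
A few of these identities are easy to spot-check numerically. The sketch below uses small fixed real matrices chosen for illustration; all variable names are assumptions.

```python
import numpy as np

A = np.array([[3.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 5.0]])              # symmetric, +ve definite, det(A) = 52
x = np.array([1.0, 2.0, -1.0])
y = np.array([0.5, -1.0, 2.0])

# det(A + x y^T) = (1 + y^T A^-1 x) det(A)
lemma_lhs = np.linalg.det(A + np.outer(x, y))
lemma_rhs = (1.0 + y @ np.linalg.solve(A, x)) * np.linalg.det(A)

# det(I + F G^T) = det(I + F^T G) for F, G of the same m#n shape
F = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, -1.0]])
G = np.array([[2.0, 0.0, 1.0], [1.0, 1.0, 1.0]])
small = np.linalg.det(np.eye(2) + F @ G.T)
large = np.linalg.det(np.eye(3) + F.T @ G)

# Minkowski determinant inequality with n = 3
S = np.array([[2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 2.0]])   # +ve definite
mink_lhs = np.linalg.det(A + S) ** (1 / 3)
mink_rhs = np.linalg.det(A) ** (1 / 3) + np.linalg.det(S) ** (1 / 3)
print(np.isclose(lemma_lhs, lemma_rhs), np.isclose(small, large), mink_lhs >= mink_rhs)
```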

Determinants of block matrices

In this section we have A_[m#m], B_[m#n], C_[n#m] and D_[n#n].

det([A, B; C, D]) = det([D, C; B, A]) = det(A)*det(D-CA^-1B) = det(D)*det(A-BD^-1C) [3.1]
- det([a, b^T; c, D]) = (a - b^TD^-1c)det(D)
det([I, B; C, I]) = det(I_[m#m]-BC) = det(I_[n#n]-CB)
det([A, B; 0, D]) = det([A, 0; C, D]) = det(A) det(D)
- det([a, b^T; 0, D]) = det([a, 0; c, D]) = a det(D)
For the special case when m=n (i.e. A, B, C, D all n#n):
- det([A, B; C, 0]) = (-1)ⁿ det(BC^T) = (-1)ⁿ det(B) det(C)
- [AB=BA]: det([A, B; C, D]) = det(DA-CB)
- [AC=CA]: det([A, B; C, D]) = det(AD-CB)
- [BD=DB]: det([A, B; C, D]) = det(DA-BC)
- [CD=DC]: det([A, B; C, D]) = det(AD-BC)
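
The Schur-complement factorisations of a block determinant can be verified as follows. The block values are arbitrary illustrative choices (diagonally dominant so that A, D and the whole matrix are non-singular).

```python
import numpy as np

A = np.array([[5.0, 1.0], [1.0, 6.0]])                               # m#m, m = 2
B = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]])                     # m#n
C = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]])                   # n#m
D = np.array([[7.0, 1.0, 0.0], [1.0, 8.0, 1.0], [0.0, 1.0, 9.0]])    # n#n, n = 3
M = np.block([[A, B], [C, D]])

det_M = np.linalg.det(M)
via_A = np.linalg.det(A) * np.linalg.det(D - C @ np.linalg.solve(A, B))
via_D = np.linalg.det(D) * np.linalg.det(A - B @ np.linalg.solve(D, C))

# Block triangular case: det([A, B; 0, D]) = det(A) det(D)
T = np.block([[A, B], [np.zeros((3, 2)), D]])
det_T = np.linalg.det(T)
print(np.isclose(det_M, via_A), np.isclose(det_M, via_D))
```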

Displacement Rank

The displacement rank of X_[m#n] is given by dis_rank(X) = rank(X - ZXZ^T) where the Z are shift matrices of size m#m and n#n respectively.

dis_rank(X+Y) ≤ dis_rank(X) + dis_rank(Y)
dis_rank(XY) ≤ dis_rank(X) + dis_rank(Y)
dis_rank(X^-1)=dis_rank(JXJ) where J is the exchange matrix.
[X: Toeplitz] dis_rank(X) = 2 unless X is upper or lower triangular, in which case dis_rank(X)=1, or X = 0, in which case dis_rank(X)=0.
- [X_[n#n]: Toeplitz] If a = X_1,1 and b = (X²)_1,1, then the characteristic polynomial of X - ZXZ^T is (t² - at + a²-b) t^(n-2)

Eigenvalues

The eigenvalues of A are the roots of its characteristic equation: |tI-A| = 0.

The properties of the eigenvalues are described in the section on eigenvalues.

Field of Values

The field of values of a square matrix A is the set of complex numbers x^HAx for all x with ||x||=1.

The field of values is a closed convex set.
The field of values contains the convex hull of the eigenvalues of A.
If A is normal then the field of values equals the convex hull of its eigenvalues.
- [n<5] A_[n#n] is normal iff its field of values is the convex hull of its eigenvalues.
A is hermitian iff its field of values is a real interval.
If A and B are unitarily similar, they have the same field of values.

Generalized Inverse

A generalized inverse of X:m#n is any matrix, X^#:n#m satisfying XX^#X=X. Note that if X is singular or non-square, then X^# is not unique. This is also called a weak generalized inverse to distinguish it from the pseudoinverse.

If X is square and non-singular, X^# is unique and equal to X^-1.
(X^#)^H is a generalized inverse of X^H.
[k≠0] X^#/k is a generalized inverse of kX.
[A,B non-singular] B^-1X^#A^-1 is a generalized inverse of AXB
rank(X^#) ≥ rank(X).
rank(X)=rank(X^#) iff X is also the generalized inverse of X^# (i.e. X^#XX^#=X^#).
XX^# and X^#X are idempotent and have the same rank as X.
- I-XX^# and I-X^#X are also idempotent.
If Ax=b has any solutions, then x=A^#b is a solution.
If AA^# is hermitian, a value of x that minimizes ||Ax-b|| is given by x=A^#b. With this value of x, the error Ax-b is orthogonal to the columns of A. If we define the projection matrix P=AA^#, then Ax=Pb and Ax-b=-(I-P)b.
If X:m#n has rank r, we can find A:n#n-r, B:n#r and C:m#m-r whose columns form bases for the null space of X, the range of X⁺X and the null space of X^H respectively.

The set of generalized inverses of X is precisely given by X^#=X⁺+AY+BZC^H for arbitrary Y:n-r#m and Z:r#m-r where X⁺ is the pseudoinverse.
For a given choice of A, B and C, each X^# corresponds to a unique Y and Z.
XX^# is hermitian iff Z=0.

If X:m#n has rank r, we can find A:n#n-r, F:n#r and C:m#m-r whose columns form bases for the null space of X, the range of X⁺ and the null space of X^H respectively. We can also find G:m#r such that X⁺=FG^H.

The set of generalized inverses X^# of X for which X is also a generalized inverse of X^# is precisely given by X^#=(F+AV)(G+CW)^H for arbitrary V:n-r#r and W:m-r#r.
For a given choice of A, C, F and G each X^# corresponds to a unique V and W.

Gram Matrix

The gram matrix of X, GRAM(X), is the matrix X^HX.

GRAM(X) is positive semi-definite hermitian.
det(GRAM(X)) = 0 iff a principal minor of GRAM(X) is zero.
rank(GRAM(X)) = rank(X)
trace(GRAM(X)) = ||X||_F², the squared Frobenius matrix norm.
y is an eigenvector of X^HX iff Xy is an eigenvector of XX^H. The corresponding eigenvalue is the same in both cases.

If X is m#n, the elements of GRAM(X) are the n² possible inner products between pairs of its columns. We can form such a matrix from n vectors in any vector space having an inner product.

Grammian

The grammian of a matrix X, gram(X), equals det(GRAM(X)) = det(X^HX).

gram(X) is real and ≥ 0.
gram(X) > 0 iff the columns of X are linearly independent, i.e. iff Xy = 0 implies y = 0
- [X:m#n]: gram(X)=0 if m<n.
gram(X) = 0 iff a principal minor of GRAM(X) is zero.
[X:n#n]: gram(X) = gram(X^H) = |det(X)|²
gram(x) = x^Hx
gram([X Y]) = gram([Y X]) = gram(X)*det(Y^HY-Y^HX(X^HX)^-1X^HY) = gram(X)*det(Y^H(I-X(X^HX)^-1X^H)Y)
- gram([X y]) = gram([y X]) = gram(X)*(y^Hy-y^HX(X^HX)^-1X^Hy) = gram(X)*y^H(I-X(X^HX)^-1X^H)y
gram([X y]) = gram(X) ||XX^#y - y||² where X^# is the generalized inverse so that ||XX^#y - y|| equals the distance between y and its orthogonal projection onto the space spanned by the columns of X.
gram([X Y]) ≤ gram(X) gram(Y); this is the generalized Hadamard inequality.
- gram([X Y]) = gram(X) gram(Y) iff either X^HY = 0 or gram(X) gram(Y) = 0
- If X = [x₁ x₂ ... x_n] then gram(X) ≤ prod(||x_i||²) = prod(diag(X^HX))
  - [X:n#n]: |det(X)|² ≤ prod(||x_i||²) = prod(diag(X^HX)); this is the Hadamard inequality.
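
The projection formula and the Hadamard bound can be checked with the sketch below. The function name gram mirrors the notation above, while the data and the use of a least-squares solve to obtain the projection residual are illustrative assumptions.

```python
import numpy as np

def gram(X):
    """Grammian det(X^H X); real and non-negative up to rounding."""
    X = np.asarray(X)
    return np.linalg.det(X.conj().T @ X).real

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
y = np.array([[1.0], [0.0], [1.0]])

# Residual of the orthogonal projection of y onto the columns of X.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid_sq = float(np.sum((y - X @ coef) ** 2))

lhs = gram(np.hstack([X, y]))            # gram([X y])
rhs = gram(X) * resid_sq                 # gram(X) * squared distance
hadamard_bound = np.prod(np.sum(np.abs(X) ** 2, axis=0))   # prod(||x_i||^2)
print(np.isclose(lhs, rhs), gram(X) <= hadamard_bound)
```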

Geometric Interpretation

The grammian of X_[m#n] is the squared "volume" of the n-dimensional parallelepiped spanned by the columns of X.

Hermitian Transpose or Conjugate Transpose

X=Y^H is the Hermitian transpose or Conjugate transpose of Y iff x(i,j)=conj(y(j,i)).

Inertia

The inertia of an m#m square matrix is the scalar triple (p,n,z) where p+n+z=m and p, n and z are respectively the number of eigenvalues, counting multiplicities, with positive, negative and zero real parts. Some authors call this the signature rather than the inertia.

If the inertia of a Hermitian matrix, A, is (p,n,z) then
- the rank of A is p+n
- the signature of A is p-n but note that some authors use signature to denote the triple (p,n,z) itself.
If A is Hermitian, it is conjunctive to a diagonal matrix of the form D=DIAG(I_p#p,-I_n#n,0_z#z) iff the inertia of A equals (p,n,z). D is the inertia matrix of A and A=X^HDX for some non-singular X. This is Sylvester's law of inertia.
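
For a Hermitian matrix the inertia can be read off the eigenvalues. The sketch below also illustrates that X^HDX keeps the inertia of D for non-singular X; the function name inertia, the tolerance and the matrices are assumptions for this example.

```python
import numpy as np

def inertia(A, tol=1e-10):
    """Inertia (p, n, z) of a Hermitian matrix from its real eigenvalues."""
    w = np.linalg.eigvalsh(A)
    p = int(np.sum(w > tol))
    n = int(np.sum(w < -tol))
    return p, n, len(w) - p - n

D = np.diag([2.0, -1.0, 0.0])
X = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])      # non-singular: det(X) = 3
A = X.T @ D @ X                      # conjunctive to D

print(inertia(D), inertia(A))        # the two triples agree
print(np.linalg.matrix_rank(A))      # rank = p + n
```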

Inverse

B is a left inverse of A if BA=I. B is a right inverse of A if AB=I.

If BA=AB=I then B is the inverse of A and we write B=A^-1.

[A:n#n] AB=I iff BA=I, hence inverse, left inverse and right inverse are all equivalent for square matrices.
[A,B:n#n] (AB)^-1=B^-1A^-1
[A:m#n] A has a left inverse iff rank(A)=n and a right inverse iff rank(A)=m.
[A:n#m, B:m#n] AB=I implies that n≤m and that rank(A)=rank(B)=n.

Inverse of Block Matrices

[A, B; C, D]^-1 = [Q^-1, -Q^-1BD^-1; -D^-1CQ^-1, D^-1(I+CQ^-1BD^-1)] where Q =(A-BD^-1C) is the Schur Complement of D [3.5]
= [A^-1(I+BP^-1CA^-1), -A^-1BP^-1; -P^-1CA^-1, P^-1] where P =(D-CA^-1B) is the Schur Complement of A [3.5]
=[ I, -A^-1B; -D^-1C, I] DIAG((A-BD^-1C)^-1, (D-CA^-1B)^-1)
=DIAG((A-BD^-1C)^-1, (D-CA^-1B)^-1) [ I, -BD^-1; -CA^-1, I]
=DIAG(A^-1, 0) + [-A^-1B; I] (D-CA^-1B)^-1[-CA^-1, I]
=DIAG(0, D^-1) + [I; -D^-1C] (A-BD^-1C)^-1[I, -BD^-1]
- [A, 0; C, D]^-1 = [A^-1, 0; -D^-1CA^-1, D^-1]
  =[ I, 0; -D^-1C, I] DIAG(A^-1, D^-1)
  =DIAG(A^-1, D^-1) [ I, 0; -CA^-1, I]
- [A, B; C, 0]^-1 = DIAG(A^-1, 0) - [-A^-1B; I] (CA^-1B)^-1[-CA^-1, I]

[A, b; c^T, d]^-1 = [Q^-1, -d^-1Q^-1b; -d^-1c^TQ^-1, d^-1(1+d^-1c^TQ^-1b)] where Q =(A-d^-1bc^T),
= [A^-1(I+p^-1bc^TA^-1), -p^-1A^-1b; -p^-1c^TA^-1, p^-1] where p =(d-c^TA^-1b)
=[ I, -A^-1b; -d^-1c^T, 1] DIAG((A-d^-1bc^T)^-1, (d-c^TA^-1b)^-1)
=DIAG((A-d^-1bc^T)^-1, (d-c^TA^-1b)^-1) [ I, -bd^-1; -c^TA^-1, 1]
=DIAG(A^-1, 0) + (d-c^TA^-1b)^-1[A^-1b; -1] [c^TA^-1, -1]
=DIAG(0, d^-1) + [I; -d^-1c^T] (A-d^-1bc^T)^-1[I, -d^-1b]
- [A, 0; c^T, d]^-1 = [A^-1, 0; -d^-1c^TA^-1, d^-1]
  =[ I, 0; -d^-1c^T, 1] DIAG(A^-1, d^-1)
  =DIAG(A^-1, d^-1) [ I, 0; -c^TA^-1, 1]
- [A, b; c^T, 0]^-1 = DIAG(A^-1, 0) - (c^TA^-1b)^-1[A^-1b; -1] [c^TA^-1, -1]
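
Two of the block-inverse forms above are checked in the sketch below against a direct inverse. The block values are arbitrary illustrative choices (diagonally dominant, so every inverse used exists).

```python
import numpy as np

inv = np.linalg.inv
A = np.array([[5.0, 1.0], [1.0, 6.0]])
B = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]])
C = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]])
D = np.array([[7.0, 1.0, 0.0], [1.0, 8.0, 1.0], [0.0, 1.0, 9.0]])
M = np.block([[A, B], [C, D]])

# Form 1: uses Q = A - B D^-1 C, the Schur complement of D
Q = A - B @ inv(D) @ C
M_inv1 = np.block([
    [inv(Q),               -inv(Q) @ B @ inv(D)],
    [-inv(D) @ C @ inv(Q),  inv(D) @ (np.eye(3) + C @ inv(Q) @ B @ inv(D))],
])

# Form 2: DIAG(A^-1, 0) + [-A^-1 B; I] P^-1 [-C A^-1, I] with P = D - C A^-1 B
P = D - C @ inv(A) @ B
left = np.vstack([-inv(A) @ B, np.eye(3)])     # 5#3
right = np.hstack([-C @ inv(A), np.eye(3)])    # 3#5
base = np.zeros((5, 5))
base[:2, :2] = inv(A)
M_inv2 = base + left @ inv(P) @ right
print(np.allclose(M_inv1, inv(M)), np.allclose(M_inv2, inv(M)))
```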

Kernel

The kernel (or null space) of A is the subspace of vectors x for which Ax = 0. The dimension of this subspace is the nullity of A.

The kernel of A is the orthogonal complement of the range of A^H

Linear Independence

The columns of A are linearly independent iff the only solution to Ax=0 is x=0.

rank(A_[m#n]) = n iff its columns are linearly independent. [1.5]
If the columns of A_[m#n] are linearly independent then m ≥ n [1.3, 1.5]
If A has linearly independent columns and A=F_[m#r]G_[r#n] then r≥n. [1.1]

Matrix Norms

A matrix norm is a real-valued function of a square matrix satisfying the four axioms listed below. A generalized matrix norm satisfies only the first three.

Positive: ||X||=0 iff X=0 else ||X||>0
Homogeneous: ||cX||=|c| ||X|| for any real or complex scalar c
Triangle Inequality: ||X+Y||≤||X||+||Y||
Submultiplicative: ||XY||≤||X|| ||Y||

Induced Matrix Norm

If ||y|| is a vector norm, then we define the induced matrix norm to be ||X||=max(||Xy|| for ||y||=1)

Euclidean or Frobenius Norm

The Euclidean or Frobenius norm of a matrix A is given by ||A||_F = √(sum(ABS(A).²)). It is always a real number. The closely related Hilbert-Schmidt norm of a square matrix A_n#n is given by ||A||_HS = n^-½ ||A||_F.

||A||_F = ||A^T||_F = ||A^H||_F
||A||_F² = tr(A^HA) = sum(CONJ(A).*A)
[Q: orthogonal]: ||A||_F = ||QA||_F = ||AQ||_F

p-Norms

||A||_p = max(||Ax||_p) where the max() is taken over all x with ||x||_p = 1 where ||x||_p = sum(abs(x)^•p)^(1/p) denotes the vector p-norm for p≥1.

||AB||_p ≤ ||A||_p ||B||_p
||Ax||_p ≤ ||A||_p ||x||_p
[A:m#n]: ||A||₂ ≤ ||A||_F ≤ n^½ ||A||₂
[A:m#n]: max(ABS(A)) ≤ ||A||₂ ≤ √(mn) max(ABS(A))
||A||₂ ≤ √(||A||₁ ||A||_inf)
||A||₁ = max(sum(ABS(A^T)))
||A||_inf = max(sum(ABS(A)))
[A:m#n]: ||A||_inf ≤ √(n) ||A||₂ ≤ √(mn) ||A||_inf
[A:m#n]: ||A||₁ ≤ √(m) ||A||₂ ≤ √(mn) ||A||₁
[Q: orthogonal]: ||A||₂ = ||QA||₂ = ||AQ||₂
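
The norm inequalities above can be spot-checked with numpy.linalg.norm. The matrix is an arbitrary example, and the comments describe NumPy's conventions rather than the sum() notation of this manual.

```python
import numpy as np

A = np.array([[1.0, -2.0, 3.0],
              [4.0, 0.0, -1.0]])
m, n = A.shape
norm1 = np.linalg.norm(A, 1)            # maximum absolute column sum
norm2 = np.linalg.norm(A, 2)            # largest singular value
norm_inf = np.linalg.norm(A, np.inf)    # maximum absolute row sum
norm_fro = np.linalg.norm(A, "fro")     # Frobenius norm
max_abs = np.max(np.abs(A))

eps = 1e-12   # slack for rounding
checks = [
    norm2 <= norm_fro + eps and norm_fro <= np.sqrt(n) * norm2 + eps,
    max_abs <= norm2 + eps and norm2 <= np.sqrt(m * n) * max_abs + eps,
    norm2 <= np.sqrt(norm1 * norm_inf) + eps,
    norm_inf <= np.sqrt(n) * norm2 + eps and np.sqrt(n) * norm2 <= np.sqrt(m * n) * norm_inf + eps,
    norm1 <= np.sqrt(m) * norm2 + eps and np.sqrt(m) * norm2 <= np.sqrt(m * n) * norm1 + eps,
]
print(all(checks))
```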

Minor

A kth-order minor of A is the determinant of a k#k submatrix of A.

A principal minor is the determinant of a submatrix whose diagonal elements lie on the principal diagonal of A.

Null Space

The null space (or kernel) of A is the subspace of vectors x for which Ax = 0.

The null space of A is the orthogonal complement of the range of A^H
The dimension of the null space of A is the nullity of A.
Given a vector x, we can choose a Householder matrix P=I-2vv^H with v = (x + ke₁)/||x + ke₁|| where k=sgn(x(1))*||x|| and e₁ is the first column of the identity matrix. The first row of P equals -k^-1x^T and the remaining rows form an orthonormal basis for the null space of x^T.
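
A minimal sketch of this Householder construction for a real vector follows. The vector x is an arbitrary example and the code assumes x(1) is non-zero, since NumPy's sign function returns 0 for a zero argument.

```python
import numpy as np

x = np.array([3.0, 4.0, 0.0, 12.0])
k = np.sign(x[0]) * np.linalg.norm(x)        # assumes x[0] != 0
e1 = np.zeros_like(x)
e1[0] = 1.0
v = x + k * e1
v = v / np.linalg.norm(v)
P = np.eye(len(x)) - 2.0 * np.outer(v, v)    # Householder matrix (real case)
N = P[1:, :]                                 # rows 2..n of P

print(np.allclose(P[0], -x / k))             # first row equals -x^T / k
print(np.allclose(N @ x, 0.0))               # remaining rows are orthogonal to x
print(np.allclose(N @ N.T, np.eye(3)))       # and are orthonormal
```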

Nullity

The nullity of a matrix A is the dimension of the null space of A.

The nullity of A is the geometric multiplicity of the eigenvalue 0.

Observability

The pair of matrices {A_[n#n], C_[m#n]} are observable iff {A^H, C^H} are reachable.

If {A, C} are observable then they are constructible and detectable.
If det(A)≠0 and {A, C} are constructible then they are observable.

Permanent

For an n#n matrix A, pet(A) is a scalar number defined by pet(A)=sum(prod(A(1:n,PERM(n))))

This is the same as the determinant except that the individual terms within the sum are not multiplied by the signatures of the column permutations.

Properties of Permanents

pet(A^T) = pet(A)
pet(A^H) = conj(pet(A))
pet(cA) = cⁿ pet(A)
[P: permutation matrix]: pet(PA) = pet(AP) = pet(A)
[D: diagonal matrix]: pet(DA) = pet(AD) = pet(A) pet(D) = pet(A) prod(diag(D))

Permanents of simple matrices

pet([a b; c d]) = ad + bc
The permanent of a diagonal or triangular matrix is the product of its diagonal elements.
The permanent of a permutation matrix equals 1.
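
The definition translates directly into a sum over permutations without signs. This pure-Python sketch is for illustration only (cost grows as n!); the function name permanent is an assumption.

```python
from itertools import permutations
from math import prod

def permanent(A):
    """Sum over all column permutations of the product of selected elements."""
    n = len(A)
    return sum(prod(A[i][p[i]] for i in range(n)) for p in permutations(range(n)))

print(permanent([[1, 2], [3, 4]]))                       # ad + bc
print(permanent([[2, 5, 1], [0, 3, 7], [0, 0, 4]]))      # triangular: product of diagonal
print(permanent([[0, 1, 0], [0, 0, 1], [1, 0, 0]]))      # permutation matrix
```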

Potency

The potency of a non-negative matrix A is the smallest n>0 such that diag(Aⁿ) > 0 i.e. all diagonal elements of Aⁿ are strictly positive. If no such n exists then A is impotent.

Pseudoinverse

The pseudoinverse (also called the Natural Inverse or Moore-Penrose Pseudoinverse) of X_[m#n] is the unique [1.20] n#m matrix X⁺ that satisfies:

XX⁺X=X (i.e. X⁺ is a generalized inverse of X).
X⁺XX⁺=X⁺ (i.e. X is a generalized inverse of X⁺).
(XX⁺)^H=XX⁺
(X⁺X)^H=X⁺X

If X is square and non-singular then X⁺=X^-1.
If X=UDV^H is the singular value decomposition of X, then X⁺=VD⁺U^H where D⁺ is formed by inverting all the non-zero elements of D^T.
- If D is a (not necessarily square) diagonal matrix, then D⁺ is formed by inverting all the non-zero elements of D^T.
The pseudoinverse of X is the generalized inverse having the lowest Frobenius norm.
If X is real then so is X⁺.
(X⁺)⁺=X
(X^T)⁺=(X⁺)^T
(X^H)⁺=(X⁺)^H
(cX)⁺=c^-1X⁺ for any real or complex scalar c.
X⁺=X^H(XX^H)⁺=(X^HX)⁺X^H.
If X_[m#n] = F_[m#r] G_[r#n] has rank r then X⁺= G⁺F⁺= G^H(F^HXG^H)^-1F^H.
- If X_[m#n] has rank n (i.e. the columns are linearly independent) then X⁺=(X^HX)^-1X^H and X⁺X=I.
- If X_[m#n] has rank m (i.e. the rows are linearly independent) then X⁺=X^H(XX^H)^-1 and XX⁺=I.
- If X has orthonormal rows or orthonormal columns then X⁺= X^H.
XX⁺ is a projection onto the column space of X.
[rank(X)=1]: X⁺ = X^H/tr(X^HX) = X^H/ ||X||_F² where ||X||_F is the Frobenius Norm (see rank-1 matrices)
- (xy^H)⁺ = yx^H/(x^Hxy^Hy)
- x⁺ = x^H/(x^Hx)
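
The SVD construction and the four defining conditions can be illustrated as follows. The function name pinv_svd, the tolerance used to decide which singular values count as non-zero, and the rank-1 test matrix are assumptions for this sketch.

```python
import numpy as np

def pinv_svd(X, tol=1e-12):
    """Pseudoinverse via X = U D V^H: invert only the non-zero singular values."""
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    s_inv = np.array([1.0 / v if v > tol else 0.0 for v in s])
    return Vh.conj().T @ np.diag(s_inv) @ U.conj().T

X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])            # rank 1, so X has no ordinary inverse
Xp = pinv_svd(X)

print(np.allclose(X @ Xp @ X, X))                 # X X+ X = X
print(np.allclose(Xp @ X @ Xp, Xp))               # X+ X X+ = X+
print(np.allclose((X @ Xp).conj().T, X @ Xp))     # X X+ is Hermitian
print(np.allclose((Xp @ X).conj().T, Xp @ X))     # X+ X is Hermitian
print(np.allclose(Xp, X.T / np.sum(X ** 2)))      # rank 1: X+ = X^H / ||X||_F^2
```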

Rank

The rank of an m#n matrix A is the smallest r for which there exist F_[m#r] and G_[r#n] such that A=FG. Such a decomposition is a full-rank decomposition. As a special case, the rank of 0 is 0.

A=F_[m#r]G_[r#n] implies that rank(A) ≤ r.
rank(A)=1 iff A = xy^T for some x and y.
rank(A_[m#n]) ≤ min(m,n). [1.3]
rank(A_[m#n]) = n iff its columns are linearly independent. [1.5]
rank(A) = rank(A^T) = rank(A^H)
rank(A) = maximum number of linearly independent columns (or rows) of A.
rank(A) is the dimension of the range of A.
rank(A_[n#n]) + nullity(A_[n#n]) = n
- rank(A_[n#n]) = n - 1 if 0 is an eigenvalue of A with algebraic multiplicity 1.
det(A_[n#n])=0 iff rank(A_[n#n])<n.
rank(A + B) ≤ rank(A) + rank(B)
rank([A B]) = rank(A) + rank(B - AA^#B) where A^# is a generalized inverse of A.
- rank([A; C]) = rank(A) + rank(C - CA^#A)
- rank([A B; C 0]) = rank(B) + rank(C) + rank((I - BB^#)A(I - C^#C))
rank(AA^H) = rank(A^HA) = rank(A) [see grammian]
rank(AB) + rank(BC) ≤ rank(B) + rank(ABC)
- rank(A_[m#n]) + rank(B) - n ≤ rank(AB) ≤ min(rank(A), rank(B))
[X: non-singular]: rank(XA) = rank(AX) = rank(A)
rank(KRON(A,B)) = rank(A)rank(B)
rank(DIAG(A,B,...,Z)) = sum(rank(A), rank(B), ..., rank(Z))
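
Several of these rank relations can be spot-checked with numpy.linalg.matrix_rank, which estimates rank from the singular values. The matrices below are arbitrary small examples.

```python
import numpy as np

rank = np.linalg.matrix_rank
A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 1.0]])      # 3#2, rank 2
B = np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])        # 2#3, rank 2
n = A.shape[1]

r_A, r_B, r_AB = rank(A), rank(B), rank(A @ B)
sylvester_ok = r_A + r_B - n <= r_AB <= min(r_A, r_B)   # bounds on rank(AB)
kron_ok = rank(np.kron(A, B)) == r_A * r_B              # rank(KRON(A,B))
gram_ok = rank(A.T @ A) == rank(A @ A.T) == r_A         # rank(A^H A) = rank(A A^H)
outer_ok = rank(np.outer([1.0, 2.0, 3.0], [4.0, 5.0])) == 1   # rank(x y^T) = 1
print(sylvester_ok, kron_ok, gram_ok, outer_ok)
```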

Range

The range (or image) of A is the subspace of vectors that equal Ax for some x. The dimension of this subspace is the rank of A.

[A:m#n] The range of A is the orthogonal complement of the null space of A^H.

Reachability

The pair of matrices {A_[n#n], B_[n#m]} are reachable iff any of the following equivalent conditions is true

rank(C)=n where C = [B AB A²B ... A^(n-1)B]_[n#mn] is the controllability matrix.
If x^HA^rB = 0 for 0≤r<n then x = 0.
If x^HB = 0 and x^HA = kx^H then x = 0.
For any v, it is possible to choose L_[n#m] such that eig(A+BL^H)=v.

If {A, B} are reachable then they are controllable and stabilizable.
If det(A)≠0 and {A, B} are controllable then they are reachable.
{DIAG(a), b} are reachable iff all elements of a are distinct and all elements of b are non-zero.
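
The rank condition on the controllability matrix gives a direct numerical test. In this sketch the function names controllability_matrix and is_reachable are assumptions, and the examples use the diagonal case described above.

```python
import numpy as np

def controllability_matrix(A, B):
    """C = [B AB A^2B ... A^(n-1)B] for A (n#n) and B (n#m)."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_reachable(A, B):
    """Reachable iff the controllability matrix has rank n."""
    return bool(np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0])

ones = np.ones((3, 1))
print(is_reachable(np.diag([1.0, 2.0, 3.0]), ones))                          # distinct a, non-zero b
print(is_reachable(np.diag([1.0, 1.0, 3.0]), ones))                          # repeated element of a
print(is_reachable(np.diag([1.0, 2.0, 3.0]), np.array([[1.0], [0.0], [1.0]])))  # zero element of b
```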

Schur Complement

Given a block matrix M = [A_[m#m], B; C, D_[n#n]], then P_[n#n]=D-CA^-1B and Q_[m#m]=A-BD^-1C are respectively the Schur Complements of A and D in M.

det([A, B; C, D]) = det([D, C; B, A]) = det(A)*det(P) = det(Q)*det(D) [3.1]
[A, B; C, D]^-1 = [Q^-1, -Q^-1BD^-1; -D^-1CQ^-1, D^-1(I+CQ^-1BD^-1)]= [A^-1(I+BP^-1CA^-1), -A^-1BP^-1; -P^-1CA^-1, P^-1] [3.5]

Spectral Radius

The spectral radius, rho(A), of A_[n#n] is the maximum modulus of any of its eigenvalues.

rho(A) ≤ ||A|| where ||A|| is any matrix norm.
For any a>0, there exists a matrix norm such that ||A|| - a ≤ rho(A) ≤ ||A||.
If ABS(A)≤B then rho(A)≤rho(ABS(A))≤rho(B)
- [A,B: real] If B≥A≥0 then rho(B)≥rho(A)
[A: real] If A≥0 then rho(A)≥a_ij for all i,j
[A,B: Hermitian] abs(eig(A+B)-eig(A))≤rho(B) where eig(A) contains the eigenvalues of A sorted into ascending order. This shows that perturbing a hermitian matrix slightly doesn't have too big an effect on its eigenvalues.

Spectrum

The spectrum of A_[n#n] is the set of all its eigenvalues.

Stabilizability

The pair of matrices {A_[n#n], B_[n#m]} are stabilizable iff either of the following equivalent conditions is true

If x^TB = 0 and x^TA = kx^T then either |k|< 1 or else x = 0.
It is possible to choose L_[n#m] such that all elements of eig(A+BL^H) have absolute value < 1.

If {A, B} are reachable or controllable then they are stabilizable.
{DIAG(a), b} are stabilizable iff all elements of a with modulus ≥1 are distinct and all the corresponding elements of b are non-zero.

Submatrix

A submatrix of A is a matrix formed by the elements a(i,j) where i ranges over a subset of the rows and j ranges over a subset of the columns.

Trace

The trace of a square matrix is the sum of its diagonal elements: tr(A)=sum(diag(A))

In the formulae below, we assume that matrix dimensions ensure that the argument of tr() is square.

tr(aA) = a × tr(A)
tr(A^T) = tr(A)
tr(A^H) = tr(A)^C
tr(A+B) = tr(A) + tr(B)
tr(AB) = tr(BA) [1.17]
- tr((AB)^k) =tr((BA)^k)
- tr(ab^T) = a^Tb
- tr(Xba^T) = a^TXb
- tr(ab^H) = (a^Hb)^C
- tr(ABCD) = tr(BCDA) = tr(CDAB) = tr(DABC)
- Similar matrices have the same trace: tr(X^-1AX) = tr(A)
tr(AB) = A:^TB^T: = A^T:^TB: = A^H:^HB: = (A:^HB^H:)^C [1.18]
- tr(A^TB) = tr(AB^T) = sum(A: • B:) = A:^T B:
- tr(A^HB) = tr(BA^H) = sum(A^C: • B:) = A:^H B:
  - tr(A^HA) = tr(AA^H) = A:^H A: = ( ||A||_F )² where ||A||_F is the Frobenius matrix norm.
tr([A B]^T [C D]) = tr(A^TC) + tr(B^TD) [1.19]
- tr([A b]^T [C d]) = tr(A^TC) + b^Td
- tr([A B]^T X[C D]) = tr(A^TXC) + tr(B^TXD)
  - tr([A b]^T X[C d]) = tr(A^TXC) + b^TXd
[A,B: n#n] tr(A ⊗ B) = tr(A) tr(B) where ⊗ denotes the Kronecker product.
[D is diagonal] tr(XDX^T) = sum_i(d_i x_i^Tx_i) and tr(XDX^H) = sum_i(d_i x_i^Hx_i) = sum_i(d_i |x_i|²) [1.16]
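
A few of the trace identities are checked below on small real matrices. The values are arbitrary, and flatten(order="F") is used here as an assumed NumPy stand-in for the column-stacking operator written as a trailing colon in this manual.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])             # 2#3
B = np.array([[2.0, 1.0],
              [0.0, 1.0],
              [4.0, 1.0]])                  # 3#2

t_ab = np.trace(A @ B)                      # trace of a 2#2 product
t_ba = np.trace(B @ A)                      # trace of a 3#3 product
vec_form = A.flatten(order="F") @ B.T.flatten(order="F")   # A:^T B^T:
fro_sq = np.trace(A.T @ A)                  # tr(A^H A) = ||A||_F^2

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -1.0, 0.5])
t_outer = np.trace(np.outer(a, b))          # tr(a b^T) = a^T b
print(t_ab, t_ba, vec_form, fro_sq, t_outer)
```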

Transpose

X=Y^T is the transpose of Y iff x(i,j)=y(j,i).

Vectorization

The vector formed by concatenating all the columns of X is written vec(X) or, on this website, X:. If y = X_[m#n]: then y_(i+m(j-1)) = x_i,j.

a ⊗ b=(ba^T): where ⊗ denotes the Kronecker product.
sum((A • B):) = tr(A^TB) = sum(A: • B:) = A:^T B: = (A^T:)^T B^T: where A • B denotes the Hadamard or elementwise product.
tr(A^HB) = sum(A^C: • B:) = A:^H B:
- [A, B Hermitian] tr(A^HB) = tr(B^HA) = A:^H B: = B:^H A: is real-valued.
(ABC): = (C^T ⊗ A) B:
- (AB): = (I ⊗ A) B: = (B^T ⊗ I) A:= (B^T ⊗ A) I:
- (Abc^T): = (c ⊗ A) b = c ⊗ Ab
- ABc = (c^T ⊗ A) B:
- a^TBc = (c ⊗ a)^T B: = (c^T ⊗ a^T) B: = (ac^T):^T B: = B:^T (c ⊗ a) = B:^T (ac^T):
- ab^H ⊗ cd^H = (a ⊗ c)(b ⊗ d)^H = (ca^T):(db^T):^H
- a^Hbc^Hd = a^Hb ⊗ c^Hd = (a ⊗ c)^H(b ⊗ d) = (ca^T):^H(db^T):
(ABC):^T = B:^T (C ⊗ A^T)
- (AB):^T = B:^T (I ⊗ A^T) = A:^T (B ⊗ I) = I:^T (B ⊗ A^T)
- (Abc^T):^T = b^T(c^T ⊗ A^T) = c^T ⊗ b^TA^T
- a^TB^TC = B:^T (a ⊗ C)
If Y=AXB+CXD+... then X: = (B^T ⊗ A + D^T ⊗ C+...)^-1 Y: however this is a slow and often ill-conditioned way of solving such equations.
(A_[m#n]^T): = TVEC(m,n) (A:) [see vectorized transpose]
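
The main vectorization identities can be verified numerically. In this sketch the helper vec and the seeded random test matrices are assumptions; column-major flattening plays the role of the trailing-colon operator.

```python
import numpy as np

def vec(X):
    """Stack the columns of X into a single vector (column-major order)."""
    return np.asarray(X).flatten(order="F")

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2))
C = rng.standard_normal((2, 4))
a = rng.standard_normal(3)
b = rng.standard_normal(2)

print(np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B)))     # (ABC): = (C^T ⊗ A) B:
print(np.allclose(vec(A @ B), np.kron(np.eye(2), A) @ vec(B)))   # (AB): = (I ⊗ A) B:
print(np.allclose(np.kron(a, b), vec(np.outer(b, a))))           # a ⊗ b = (b a^T):

# Index relation with 0-based indices: y[i + m*j] = X[i, j]
m = A.shape[0]
y = vec(A)
print(all(y[i + m * j] == A[i, j] for i in range(2) for j in range(3)))
```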

Vector Norms

A vector norm is a real-valued function of a vector satisfying the three axioms listed below.

Positive: ||x||=0 iff x=0 else ||x||>0
Homogeneous: ||cx||=|c| ||x|| for any real or complex scalar c
Triangle Inequality: ||x+y||≤||x||+||y||

Inner Product Norm

If <x, y> is an inner product then ||x|| = √(<x, x>) is a vector norm.

A vector norm may be derived from an inner product iff it satisfies the parallelogram identity: ||x+y||²+||x-y||²=2||x||²+2||y||²
If ||x|| is derived from <x, y> then 4Re(<x, y>) = ||x+y||²-||x-y||² = 2(||x+y||²-||x||²-||y||²)

Euclidean Norm

The Euclidean norm of a vector x equals the square root of the sum of the squares of the absolute values of all its elements and is written ||x||. It is always a real number and corresponds to the normal notion of the vector's length.

||x||² = x^Hx = tr(xx^H)
Cauchy-Schwarz inequality: |x^Hy| ≤ ||x|| ||y||
[Q: orthogonal]: ||Qx|| = ||x||

Hölder Norms or p-Norms

The p-norm of a vector x is defined by ||x||_p = sum(abs(x)^•p)^(1/p) for p≥1. The most common values of p are 1, 2 and infinity.

City-Block Norm: ||x||₁ = sum(abs(x))
Euclidean Norm: ||x|| = ||x||₂ = √(x^Hx)
Infinity Norm: ||x||_inf = max(abs(x))
Hölder inequality: abs(x)^Tabs(y) ≤ ||x||_p ||y||_q where 1/p + 1/q = 1
||x||_inf ≤ ||x||₂ ≤ ||x||₁ ≤ √(n) ||x||₂ ≤ n ||x||_inf
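
The chain of inequalities and the Hölder inequality can be spot-checked with numpy.linalg.norm. The vectors and the choice p=3, q=1.5 are arbitrary illustrative values.

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0, 12.0])
y = np.array([1.0, 2.0, -2.0, 0.5])
n = x.size

n1 = np.linalg.norm(x, 1)            # city-block norm
n2 = np.linalg.norm(x, 2)            # Euclidean norm
ninf = np.linalg.norm(x, np.inf)     # infinity norm

chain_ok = ninf <= n2 <= n1 <= np.sqrt(n) * n2 <= n * ninf

p, q = 3.0, 1.5                      # 1/p + 1/q = 1
holder_ok = np.abs(x) @ np.abs(y) <= np.linalg.norm(x, p) * np.linalg.norm(y, q)
print(n1, n2, ninf, chain_ok, holder_ok)
```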

This page is part of The Matrix Reference Manual. Copyright © 1998-2022 Mike Brookes, Imperial College, London, UK. See the file gfl.html for copying instructions. Please send any comments or suggestions to "mike.brookes" at "imperial.ac.uk".
Updated: $Id: property.html 11291 2021-01-05 18:26:10Z dmb $