Matrix Reference Manual
Proofs Section 1: Basic Theorems


1.1 If A[m#n]=F[m#r]G[r#n] has linearly independent columns then r >= n and we can find B[r#r] and a subset of F's columns, D[m#r-n], such that F = [D A]B.

To prove this, we replace the columns of F one at a time with columns taken from A and show that we can do this for all of A's columns. Define F[k] = [D[k] a_1:k], consisting of r-k columns selected from F followed by the first k columns of A. We prove by induction that, for each k in the range 0:n, we can choose D[k] and B[k] so that F=F[k]B[k]. For k=0, the proposition is true with D[0]=F and B[0]=I. Now suppose F=F[k]B[k] is true for some k<n. Since A = FG = F[k]B[k]G, we can write a_k+1 = F[k]c for some vector c. At least one entry of c multiplying a column of D[k] must be non-zero, for otherwise a_k+1 would be a linear combination of a_1, ..., a_k, contradicting the linear independence of A's columns. Replace that column of D[k] by a_k+1 to form F[k+1]; the replaced column is then expressible as a linear combination of the columns of F[k+1], so every column of F[k] is, and we can therefore find B[k+1] with F=F[k+1]B[k+1]. Taking k=n gives F = [D A]B with D=D[n] and B=B[n]. Finally, r >= n because D[k] must contain at least one column at each of the n induction steps, i.e. r-k >= 1 for k=0, ..., n-1.

1.2 For any non-zero A[m#n] we can find a linearly independent subset of A's columns, F, such that A=F[m#r]G[r#n] with G upper triangular and r<=n.

We prove this by induction on n. If n=1, set F=A and G=1. Now suppose A, F and G satisfy the theorem and consider the m#(n+1) matrix A+=[A x]. If x=Fy for some y then set F+=F and G+=[G y]; otherwise set F+=[F x] and G+=[G 0; 0^T 1]. It is straightforward to show in both cases that A+=F+G+, that G+ is upper triangular and that r+ <= n+1. We must also show in the second case (i.e. if x != Fy for any y) that the columns of [F x] are linearly independent. If [F x][a; b] = 0, then Fa=-bx. We must have b=0 since otherwise x=F(-a/b), which contradicts the assumption that x != Fy. Hence Fa = 0, which implies that a=0 since the columns of F are linearly independent. Thus the columns of [F x] are also linearly independent.

1.3 For any matrix A[m#n] we have rank(A) <= m and rank(A) <= n.

We have A = A[m#n]I[n#n] = I[m#m]A[m#n]. Since rank(A) is the minimum inner dimension of such a decomposition, we must have rank(A) <= m and rank(A) <= n.

1.4 rank(AB) <= rank(A) and rank(AB) <= rank(B).

If A=FG is a full-rank decomposition, then we can decompose AB as F(GB). Hence rank(AB) is at most the inner dimension of this decomposition, which is rank(A). A similar argument shows that rank(AB) <= rank(B).

1.5 rank(A)=n iff A[m#n] has linearly independent columns.
If the columns of A are linearly independent and A=F[m#r]G[r#n] is a full-rank decomposition, then from 1.1 we have rank(A) = r >= n, but from 1.3, rank(A) <= n. Hence rank(A)=n. If A's columns are not linearly independent then from 1.2 we can find a linearly independent subset F with A=F[m#r]G[r#n]. This implies that rank(A) <= r < n since the rank is the minimum inner dimension of such a decomposition.

1.6 rank(I[n#n])=n.

The columns of I are linearly independent since Ix=0 implies x=0. Hence from 1.5 its rank is n.

1.7 If A has linearly independent columns then we can extend it to a square matrix [D[m#m-n] A[m#n]] with linearly independent columns.

We have A=IA, so from 1.1 we can find D and B such that I[m#m]=[D A][m#m]B. From 1.6, 1.4 and 1.3 we have m = rank(I) <= rank([D A]) <= m. Hence rank([D A])=m and by 1.5 its columns are linearly independent.

1.8 If A[m#n] has linearly independent columns, then there exist Q[m#n] and upper triangular R[n#n] and S[n#n] such that A=QR and Q^H Q = RS = I. If A is real then Q, R and S may be chosen to be real.

We prove this by induction on n. If n=1, we set Q=a/||a||, R=||a|| and S=||a||^-1 where a=A. Note that ||a|| cannot equal 0 since this would violate the assumption that the columns of A are linearly independent. Now assume the theorem to be true for A[m#n] with Q, R and S defined as in the theorem. Given an m#(n+1) matrix A+ = [A x] with linearly independent columns, we define y = x - QQ^H x and note that y cannot equal 0 since we would then have x = QQ^H x = ASQ^H x, contradicting the assumption that the columns of A+ are linearly independent. We set Q+ = [Q y/||y||], R+ = [R Q^H x; 0^T ||y||] and S+ = [S -SQ^H x/||y||; 0^T ||y||^-1]. It is straightforward to show that A+, Q+, R+ and S+ satisfy the theorem by performing the appropriate multiplications. Note that if A is real then all the above calculations give real results.
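As a numerical illustration of the recursive construction in 1.8 (a sketch only, assuming numpy is available; the function name and test matrix are choices made for this example, not part of the manual):

```python
import numpy as np

def gram_schmidt_qr(A):
    """Return Q, R with A = Q R, Q^H Q = I and R upper triangular,
    built column by column as in the proof of 1.8. Requires the
    columns of A to be linearly independent."""
    m, n = A.shape
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for k in range(n):
        x = A[:, k]
        R[:k, k] = Q[:, :k].conj().T @ x     # Q^H x
        y = x - Q[:, :k] @ R[:k, k]          # y = x - Q Q^H x
        R[k, k] = np.linalg.norm(y)          # ||y|| = 0 only if columns are dependent
        Q[:, k] = y / R[k, k]
    return Q, R

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
Q, R = gram_schmidt_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.conj().T @ Q, np.eye(2))
assert np.allclose(R, np.triu(R))
```

In practice a Householder-based routine such as numpy.linalg.qr should be preferred, for the numerical reasons noted below.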
This recursive procedure is called Gram-Schmidt orthogonalization, but it is not recommended in practice because of its poor numerical properties.

1.9 For any A[m#n] there exist Q[m#m] and upper triangular R[m#n] such that A=QR and Q^H Q = I. If A is real, then Q and R may be taken as real.

From 1.2 we decompose A=FG with G upper triangular and the columns of F[m#r] linearly independent. From 1.7 we find D such that [F D][m#m] has linearly independent columns. From 1.8 we decompose [F D] = QU where Q^H Q = I and U is upper triangular. We now have A = FG = [F D][I; 0]G = QU[I; 0]G = QR where R = U[I; 0]G is the product of three upper triangular matrices and is therefore upper triangular itself. If A is real then all of these operations preserve realness. If m >= n, the lower m-n rows of R are all zero and so the product A=QR is unaffected if we delete the last m-n columns of Q and the last m-n rows of R.

1.10 The diagonal elements of a skew-symmetric matrix are all 0. [Provided that the underlying field does not have characteristic 2.]

If A is skew-symmetric then by definition a(i,j)=-a(j,i). Setting j=i gives a(i,i)=-a(i,i), from which a(i,i)=0.

1.11 The rank of a skew-symmetric matrix is even. [Provided that the underlying field does not have characteristic 2.]

We prove this by induction starting with a 1#1 skew-symmetric matrix, which must equal 0 and hence has even rank. Given a skew-symmetric A[n+1#n+1], we can, by [1.10], partition it as A = [C b; -b^T 0] where C[n#n] is skew-symmetric and rank(C) is even by the induction hypothesis. If b = Cx for some x, then A = [I x]^T C [I x] and also C = [I 0] A [I 0]^T, and so, by [1.4], rank(A) = rank(C), which is even. If, on the other hand, b is linearly independent of the columns of C, then -b^T is also linearly independent of the rows of C = -C^T. Hence rank(A) = rank([C b; -b^T 0]) = rank([C b]) + 1 = rank(C) + 2, which is even.

1.12 If D is diagonal then AD = DA iff a_i,j=0 whenever d_i,i != d_j,j.
(AD)_i,j = a_i,j d_j,j and (DA)_i,j = d_i,i a_i,j, so AD = DA ==> a_i,j(d_i,i - d_j,j) = 0 ==> a_i,j = 0 or d_i,i = d_j,j.

1.13 If D = DIAG(c_1 I_1, c_2 I_2, ..., c_M I_M) where the c_k are distinct scalars and the I_k are identity matrices, then AD = DA iff A = DIAG(A_1, A_2, ..., A_M) where each A_k is the same size as the corresponding I_k.

From [1.12], a_i,j=0 except when d_i,i = d_j,j. However, since the c_k are distinct, d_i,i = d_j,j only when i and j lie within the same c_k I_k block. It follows that a_i,j=0 except when i and j lie in the same block. The result follows.

1.14 A=UDV^H=LDM^H are alternative singular value decompositions of A iff U^H L = DIAG(Q_1, Q_2, ..., Q_k, R) and V^H M = DIAG(Q_1, Q_2, ..., Q_k, S) where Q_1, Q_2, ..., Q_k are unitary matrices whose sizes are given by the multiplicities of the corresponding distinct non-zero singular values and R and S are unitary matrices whose size equals the number of zero singular values. [R.11]

Define the unitary matrices P=U^H L and Q=V^H M. Then U D^2 U^H = AA^H = L D^2 L^H = U P D^2 P^H U^H ==> D^2 P = P D^2. From [1.13] it follows that P (and also Q, by a similar argument based on A^H A) is block diagonal with unitary blocks whose sizes match the multiplicities of the singular values in D. It follows that PD = DP. We have U^H A V = D = U^H L D M^H V = P D Q^H ==> DQ = PD = DP. Hence D(P - Q) = 0 and P must equal Q except for rows in which d_i,i = 0. The result follows.

1.15 If D is diagonal then XDX^T = sum_i(d_i x_i x_i^T) and XDX^H = sum_i(d_i x_i x_i^H).

(XDX^T)_p,q = sum_i(x_p,i d_i x_q,i) = sum_i(d_i x_p,i x_q,i) = sum_i(d_i (x_i x_i^T)_p,q) = (sum_i(d_i x_i x_i^T))_p,q. The proof for XDX^H is identical.

1.16 If D is diagonal then tr(XDX^T) = sum_i(d_i x_i^T x_i) and tr(XDX^H) = sum_i(d_i x_i^H x_i) = sum_i(d_i |x_i|^2).

tr(XDX^T) = tr(sum_i(d_i x_i x_i^T)) [1.15] = sum_i(d_i tr(x_i x_i^T)) = sum_i(d_i x_i^T x_i). The proof for tr(XDX^H) is identical.
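The identities 1.15 and 1.16 are easy to spot-check numerically; the following sketch assumes numpy, and the matrix sizes and random seed are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 3))
d = rng.standard_normal(3)          # diagonal of D
D = np.diag(d)

# 1.15: X D X^T = sum_i d_i x_i x_i^T, where x_i is the i'th column of X
lhs = X @ D @ X.T
rhs = sum(d[i] * np.outer(X[:, i], X[:, i]) for i in range(3))
assert np.allclose(lhs, rhs)

# 1.16: tr(X D X^T) = sum_i d_i x_i^T x_i
assert np.isclose(np.trace(lhs),
                  sum(d[i] * (X[:, i] @ X[:, i]) for i in range(3)))
```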
1.17 tr(AB) = tr(BA).

(AB)_i,j = sum_k(a_i,k b_k,j), so tr(AB) = sum_i((AB)_i,i) = sum_i,k(a_i,k b_k,i) = sum_k,i(b_k,i a_i,k) = sum_k((BA)_k,k) = tr(BA).

1.18 tr(AB) = A:^T B^T: = A^T:^T B: = A^H:^H B: = (A:^H B^H:)^C.

(AB)_i,j = sum_k(a_i,k b_k,j), so tr(AB) = sum_i((AB)_i,i) = sum_i,k(a_i,k b_k,i) = sum_i,k(A_i,k (B^T)_i,k) = A:^T B^T:. Also tr(AB) = tr(BA) [1.17] = B:^T A^T: = A^T:^T B:. Next, tr(AB) = A^T:^T B: = A^H:^H B: since A^T:^T = A^H:^H. Finally, tr(AB) = A:^T B^T: = (A:^H B^H:)^C.

1.19 tr([A B]^T [C D]) = tr(A^T C) + tr(B^T D).

Note that conformality implies that the matrix dimensions are A[n#m], B[n#k], C[n#m] and D[n#k]. tr([A B]^T [C D]) = tr([A^T C A^T D; B^T C B^T D]) = tr(A^T C) + tr(B^T D) since these diagonal blocks are square.

1.20 The pseudoinverse is unique.

Assume that both X+ and Y satisfy the pseudoinverse equations; we will show that Y = X+:
(1a) XX+X = X and (1b) XYX = X
(2a) X+XX+ = X+ and (2b) YXY = Y
(3a) (XX+)^H = XX+ and (3b) (XY)^H = XY
(4a) (X+X)^H = X+X and (4b) (YX)^H = YX
Y = YXY [2b] = YY^H X^H [3b] = YY^H X^H (X+)^H X^H [1a] = YXY(X+)^H X^H [3b] = YXYXX+ [3a] = YXX+ [2b] = X^H Y^H X+ [4b] = X^H (X+)^H X^H Y^H X+ [1a] = X+XX^H Y^H X+ [4a] = X+XYXX+ [4b] = X+XX+ [1b] = X+ [2a].

1.21 C=TOE(a)[m#r]TOE(b)[r#n] is Toeplitz iff a_r+1:r+m-1 b_1:n-1^T = a_1:m-1 b_r+1:r+n-1^T.

The product, C, is Toeplitz iff c_i+1,j+1 - c_i,j = 0 for all 1<=i<=m-1 and 1<=j<=n-1. We have c_i+1,j+1 - c_i,j = sum_p=1:r(A_i+1,p B_p,j+1) - sum_q=1:r(A_i,q B_q,j) where A_u,v = a_u-v+r and B_u,v = b_u-v+n from the Toeplitz property. Hence c_i+1,j+1 - c_i,j = sum_p=1:r(a_i+1-p+r b_p-j-1+n) - sum_q=1:r(a_i-q+r b_q-j+n). Substituting p=q+1 in the first summation gives c_i+1,j+1 - c_i,j = sum_q=0:r-1(a_i-q+r b_q-j+n) - sum_q=1:r(a_i-q+r b_q-j+n). The summation terms all cancel except for q=0 and q=r, which leaves c_i+1,j+1 - c_i,j = a_i+r b_n-j - a_i b_n-j+r = a_i+r b_k - a_i b_k+r where k=n-j. Note that 1<=j<=n-1 implies 1<=k<=n-1. Thus C is Toeplitz iff a_i+r b_k = a_i b_k+r for all 1<=i<=m-1 and 1<=k<=n-1, which gives a_r+1:r+m-1 b_1:n-1^T = a_1:m-1 b_r+1:r+n-1^T as required.
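The four equations in 1.20 can be checked numerically against a library pseudoinverse; this sketch assumes numpy, uses a real matrix (so ^H reduces to ^T), and the test matrix is an arbitrary random choice:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((4, 3))
Xp = np.linalg.pinv(X)              # numpy's Moore-Penrose pseudoinverse

assert np.allclose(X @ Xp @ X, X)        # (1a) X X+ X = X
assert np.allclose(Xp @ X @ Xp, Xp)      # (2a) X+ X X+ = X+
assert np.allclose((X @ Xp).T, X @ Xp)   # (3a) (X X+)^H = X X+
assert np.allclose((Xp @ X).T, Xp @ X)   # (4a) (X+ X)^H = X+ X
```

By the uniqueness proof, any other matrix satisfying all four equations must equal this one.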
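The Toeplitz condition of 1.21 can also be illustrated numerically. This sketch assumes numpy; the helper `toe` and the geometric sequences a_i = t^i, b_j = t^j (which satisfy a_i+r b_k = a_i b_k+r exactly) are choices made for this example:

```python
import numpy as np

def toe(a, m, r):
    """TOE(a)[m#r]: the m#r Toeplitz matrix with A_u,v = a_u-v+r
    (1-based indices; a must have m+r-1 entries)."""
    a = np.asarray(a)
    return np.array([[a[u - v + r - 1] for v in range(1, r + 1)]
                     for u in range(1, m + 1)])

def is_toeplitz(C):
    """True iff C is constant along its diagonals."""
    m, n = C.shape
    return all(np.isclose(C[i + 1, j + 1], C[i, j])
               for i in range(m - 1) for j in range(n - 1))

m, r, n, t = 4, 3, 5, 2.0
a = t ** np.arange(1, m + r)        # a_1..a_{m+r-1}: geometric, so the condition holds
b = t ** np.arange(1, r + n)        # b_1..b_{r+n-1}
C = toe(a, m, r) @ toe(b, r, n)
assert is_toeplitz(C)

a_bad = a.copy()
a_bad[0] += 1.0                     # perturbing a_1 breaks the condition at i=1
assert not is_toeplitz(toe(a_bad, m, r) @ toe(b, r, n))
```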

This page is part of The Matrix Reference Manual. Copyright © 1998-2019 Mike Brookes, Imperial College, London, UK. See the file gfl.html for copying instructions. Please send any comments or suggestions to "mike.brookes" at "imperial.ac.uk".
Updated: 2019-08-20