1.1 
If A_{[m#n]}=F_{[m#r]}G_{[r#n]} has linearly independent columns then r >= n and we can find B_{[r#r]} and a subset of F's columns, D_{[m#r-n]}, such that F = [D A]B.
To prove this, we replace the columns of F one at a time with columns
taken from A and show that we can do this for all of A's
columns.
 Define F^{[k]} = [D^{[k]} a_{1:k}] consisting of r-k columns selected from F followed by the first k columns of A. We prove by induction that, for each k in the range 0:n, we can choose D^{[k]} and B^{[k]} so that F=F^{[k]}B^{[k]}.
 For k=0, the proposition is true with
D^{[0]}=F and B^{[0]}=I.
 Suppose F=F^{[k]}B^{[k]}
is true for some k<n. Then,
A=F^{[k]}B^{[k]}G and
if we define x =
(B^{[k]}G)_{k+1}, the
(k+1)^{th} column of B^{[k]}G, we have
a_{k+1}
= F^{[k]}x.
 Suppose x_{i} is the first nonzero element of x. We must have i <= r-k for otherwise a_{k+1} = [0 a_{1:k}]x or equivalently A[x_{r-k+1:r}; -1; 0_{[n-k-1#1]}] = 0. This would contradict the linear independence of A's columns.
 If we define D^{[k+1]} to be D^{[k]} with its i^{th} column removed then F^{[k]} = F^{[k+1]}S where S=[I_{[i-1#i-1]} 0 0; 0 -x_{i}^{-1}x_{i+1:r} I_{[r-i#r-i]}; 0_{[1#i-1]} x_{i}^{-1} 0_{[1#r-i]}]. It follows that if B^{[k+1]}=SB^{[k]} we have F=F^{[k+1]}B^{[k+1]} as required.
 Finally, we have F=F^{[n]}B^{[n]} = [D^{[n]} A]B^{[n]} where D^{[n]} consists of r-n columns selected from F. In order for D^{[n]} to exist, we must therefore have r>=n.
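The exchange argument above is constructive and can be run numerically. The following NumPy sketch (the name `exchange_columns` and the use of least squares to recover x are my own choices, not part of the text) replaces one column of F per induction step, always dropping the first column of D^{[k]} with a nonzero coefficient:

```python
import numpy as np

def exchange_columns(F, A, tol=1e-10):
    """Column exchange from the proof of 1.1: given F (m x r, full column
    rank) and A = F @ G with linearly independent columns (so n <= r),
    return D, a set of r-n columns of F such that [D A] spans col(F)."""
    r, n = F.shape[1], A.shape[1]
    D = [F[:, j] for j in range(r)]                   # D^{[0]} = all of F
    for k in range(n):
        # F^{[k]} = [D^{[k]} a_{1:k}]; solve a_{k+1} = F^{[k]} x
        Fk = np.column_stack(D + [A[:, j] for j in range(k)])
        x, *_ = np.linalg.lstsq(Fk, A[:, k], rcond=None)
        # first nonzero entry must lie within D^{[k]} (i <= r-k, per proof)
        i = next(j for j in range(len(D)) if abs(x[j]) > tol)
        del D[i]                                      # form D^{[k+1]}
    return np.column_stack(D)

rng = np.random.default_rng(0)
F = rng.standard_normal((5, 4))                       # full column rank
A = F @ rng.standard_normal((4, 2))                   # independent columns
D = exchange_columns(F, A)
DA = np.column_stack([D, A])                          # [D A] spans col(F),
B, *_ = np.linalg.lstsq(DA, F, rcond=None)            # so F = [D A] B
assert np.allclose(DA @ B, F)
```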

1.2 
For any nonzero A_{[m#n]} we can find a linearly independent subset of A's columns, F, such that A=F_{[m#r]}G_{[r#n]} with G upper triangular and r<=n.
We prove this by induction on n. If n=1, set F=A
and G=1.
Now suppose A, F and G satisfy the theorem and consider the m#(n+1) matrix A^{+}=[A x]. If x=Fy for some y then set F^{+}=F and G^{+}=[G y], otherwise set F^{+}=[F x] and G^{+}=[G 0; 0^{T} 1]. It is straightforward to show in both cases that A^{+}=F^{+}G^{+}, that G^{+} is upper triangular and that r^{+}<=n+1.
We must also show, in the second case (i.e. if x != Fy for any y), that the columns of [F x] are linearly independent. If [F x][a; b] = 0, then Fa = -bx. We must have b=0 since otherwise x = F(-a/b) which contradicts the assumption that x != Fy for any y. Hence Fa = 0 which implies that a=0 since the columns of F are linearly independent. Thus the columns of [F x] are also linearly independent.
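The induction in this proof is directly constructive. A minimal NumPy sketch (the name `fullrank_decompose` and the least-squares test for x=Fy are assumptions of mine; like the base case of the proof, it assumes the first column of A is nonzero):

```python
import numpy as np

def fullrank_decompose(A, tol=1e-10):
    """From 1.2: pick a linearly independent subset F of A's columns
    (left to right) with A = F @ G and G upper triangular, r <= n."""
    F = A[:, :1]                      # base case n=1: F = A, G = 1
    G = np.eye(1)
    for j in range(1, A.shape[1]):
        x = A[:, j]
        y, *_ = np.linalg.lstsq(F, x, rcond=None)
        if np.linalg.norm(F @ y - x) <= tol:   # x = Fy: keep F, append y
            G = np.column_stack([G, y])
        else:                                  # x independent: grow F
            F = np.column_stack([F, x])
            G = np.block([[G, np.zeros((G.shape[0], 1))],
                          [np.zeros((1, G.shape[1])), np.eye(1)]])
    return F, G

A = np.array([[1., 2., 0., 3.],
              [0., 0., 1., 1.],
              [1., 2., 1., 4.]])      # col2 = 2*col1, col4 = 3*col1 + col3
F, G = fullrank_decompose(A)
assert F.shape == (3, 2)              # r = 2 independent columns kept
assert np.allclose(F @ G, A)
assert np.allclose(G, np.triu(G))     # G is upper triangular
```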

1.3 
For any matrix A_{[m#n]} we have
rank(A) <= m and
rank(A) <= n.
 We have A = A_{[m#n]}I_{[n#n]} = I_{[m#m]}A_{[m#n]}. Since rank(A) is the minimum inner dimension of such a decomposition,
we must have rank(A) <= m and
rank(A) <= n.

1.4 
rank(AB) <= rank(A) and
rank(AB) <= rank(B)
 If A=FG is a full-rank decomposition, then we can decompose AB as F(GB). Hence rank(AB) is at most the inner dimension of this decomposition, which is rank(A). A similar argument applied to a full-rank decomposition of B shows that rank(AB) <= rank(B).
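A quick numerical spot-check of this inequality (using NumPy's `matrix_rank`; the particular factor shapes are arbitrary choices of mine):

```python
import numpy as np

# rank(AB) never exceeds rank(A) or rank(B)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 6))  # rank 3
B = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))  # rank 2
rA = np.linalg.matrix_rank(A)
rB = np.linalg.matrix_rank(B)
rAB = np.linalg.matrix_rank(A @ B)
assert rAB <= min(rA, rB)
```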

1.5 
rank(A)=n iff
A_{[m#n]} has linearly
independent columns.
 If the columns of A are linearly independent and
A=F_{[m#r]}G_{[r#n]}
is a fullrank decomposition, then from 1.1, we
have rank(A) = r >= n, but from
1.3, rank(A)<=n. Hence
rank(A)=n.
 If A's columns are not linearly independent then from 1.2 we can find a linearly independent subset F of A's columns, necessarily with r < n, such that A=F_{[m#r]}G_{[r#n]}. This implies that rank(A) <= r < n since the rank is the minimum inner dimension of such a decomposition.

1.6 
rank(I_{[n#n]})=n
 The columns of I are linearly independent since Ix=0
implies x=0. Hence from 1.5 its rank
is n.

1.7 
If A has linearly independent
columns then we can extend it to a square matrix
[D_{[m#m-n]} A_{[m#n]}] with linearly
independent columns.
 We have A=IA so from 1.1 we
can find D and B such that
I_{[m#m]}=[D A]_{[m#m]}B. From
1.6, 1.4 and 1.3 we have
m = rank(I) <= rank([D
A]) <= m. Hence rank([D A])=m and by
1.5 its columns are linearly independent.

1.8 
If
A_{[m#n]} has linearly independent columns, then
there exists a Q_{[m#n]} and upper triangular
R_{[n#n]} and
S_{[n#n]} such that A=QR and
Q^{H}Q = RS = I. If A is real
then Q, R and S may be chosen to be real.
We prove this by induction on n. If n=1, we set Q=a/||a||, R=||a|| and S=||a||^{-1} where a=A. Note that a cannot equal 0 since this would violate the assumption that the columns of A are linearly independent.
Now assume the theorem to be true for A_{[m#n]}
with Q, R and S defined as in the theorem. Given an
m#(n+1) matrix A^{+} = [A x] with
linearly independent columns, we define y = x - QQ^{H}x and note that y cannot equal
0 since we would then have x =
QQ^{H}x = ASQ^{H}x
contradicting the assumption that the columns of A^{+} are
linearly independent.
We set Q^{+} = [Q y/||y||], R^{+} = [R Q^{H}x; 0^{T} ||y||] and S^{+} = [S -SQ^{H}x/||y||; 0^{T} ||y||^{-1}]. It is
straightforward to show that A^{+}, Q^{+},
R^{+} and S^{+} satisfy the theorem by performing
the appropriate multiplications. Note that if A is real then all the
above calculations give real results.
This recursive procedure is called Gram-Schmidt orthogonalization but is not recommended because of its poor numerical properties.
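The induction translates directly into code. This NumPy sketch (the name `gram_schmidt` is mine) builds Q and R column by column exactly as in the proof; as the text warns, this classical variant loses orthogonality in finite precision for ill-conditioned A:

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt from 1.8: A = Q R with Q^H Q = I and R
    upper triangular. Assumes A has linearly independent columns."""
    m, n = A.shape
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for k in range(n):
        R[:k, k] = Q[:, :k].conj().T @ A[:, k]   # Q^H x
        y = A[:, k] - Q[:, :k] @ R[:k, k]        # y = x - Q Q^H x
        R[k, k] = np.linalg.norm(y)              # ||y||, nonzero by independence
        Q[:, k] = y / R[k, k]
    return Q, R

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
Q, R = gram_schmidt(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.conj().T @ Q, np.eye(3))    # Q^H Q = I
assert np.allclose(R, np.triu(R))                # R upper triangular
S = np.linalg.inv(R)                             # S = R^{-1}, also upper triangular
assert np.allclose(R @ S, np.eye(3))
```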

1.9 
For any
A_{[m#n]} there exists a
Q_{[m#m]} and upper triangular
R_{[m#n]} such that A=QR and
Q^{H}Q = I. If A is real, then
Q and R may be taken as real.
 From 1.2 we decompose A=FG with G upper triangular and F_{[m#r]} having linearly independent columns.
 From 1.7 we find D such that [F
D]_{[m#m]} has linearly independent columns.
 From 1.8 we decompose [F D] = QU where Q^{H}Q = I.
We now have A = FG = [F D] [I; 0]
G = Q U [I; 0] G = QR where R
= U [I; 0] G is the product of three upper
triangular matrices and is therefore upper triangular itself. If A is real then
all of these operations preserve realness.
If m>=n, the lower m-n rows of R are all zero and so the product A=QR is unaffected if we delete the last m-n columns of Q and the last m-n rows of R.

1.10 
The diagonal elements of a skew-symmetric matrix are all 0.
[Provided that the underlying field does not have characteristic 2]
 If A is skew-symmetric then by definition a(i,j)=-a(j,i). By setting j=i, we get a(i,i)=-a(i,i), i.e. 2a(i,i)=0, from which a(i,i)=0.

1.11 
The rank of a skew-symmetric matrix is even.
[Provided that the underlying field does not have characteristic 2]
 We prove this by induction starting with a 1#1 skew-symmetric matrix which must equal 0 and hence has even rank.
 Given a skew-symmetric A_{[n+1#n+1]}, we can, by [1.10], partition it as A = [C b ; -b^{T} 0] where C_{[n#n]} is skew-symmetric and rank(C) is even, by the induction hypothesis.
 If b = Cx for some x, then A = [I
x]^{T} C [I x] and also C =
[I 0] A [I 0]^{T} and so, by
[1.4], rank(A) = rank(C) which is
even.
 If, on the other hand, b is linearly independent of the columns of C then b^{T} is also linearly independent of the rows of C, since C^{T} = -C. Hence rank(A) = rank([C b ; -b^{T} 0]) = rank([C b]) + 1 = rank(C) + 2 which is even.
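A numerical spot-check of this result over the reals (the matrix sizes are arbitrary choices of mine):

```python
import numpy as np

# Every real skew-symmetric matrix has even rank.
rng = np.random.default_rng(0)
for n in range(1, 7):
    X = rng.standard_normal((n, n))
    A = X - X.T                           # skew-symmetric: A^T = -A
    assert np.linalg.matrix_rank(A) % 2 == 0
```

Note that for odd n this forces the matrix to be singular: a full-rank odd-dimensional skew-symmetric matrix would have odd rank.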

1.12 
If D is diagonal then AD =
DA iff a_{i,j}=0 whenever d_{i,i} !=
d_{j,j}.
 (AD)_{i,j} = a_{i,j}d_{j,j} = (DA)_{i,j} = d_{i,i}a_{i,j} ==> a_{i,j}(d_{i,i} - d_{j,j}) = 0 ==> a_{i,j} = 0 or d_{i,i} = d_{j,j}
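Both directions of the iff can be spot-checked numerically (the example matrices are arbitrary choices of mine):

```python
import numpy as np

# AD = DA exactly when a_ij = 0 wherever d_ii != d_jj.
d = np.array([2., 2., 5.])
D = np.diag(d)
A = np.array([[1., 3., 0.],
              [4., 2., 0.],
              [0., 0., 7.]])             # zero wherever d_ii != d_jj
assert np.allclose(A @ D, D @ A)

A[0, 2] = 1.0                            # violate the condition: d_11 != d_33
assert not np.allclose(A @ D, D @ A)
```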

1.13 
If D =
DIAG(c_{1}I_{1},
c_{2}I_{2}, ...,
c_{M}I_{M}) where the
c_{k} are distinct scalars and the
I_{j} are identity matrices, then AD = DA
iff A = DIAG(A_{1}, A_{2}, ...,
A_{M}) where each A_{k} is the same
size as the corresponding I_{k}.
 From [1.12] a_{i,j}=0 except
when d_{i,i} = d_{j,j}. However since the
c_{k} are distinct, d_{i,i} =
d_{j,j} only within one of the
c_{k}I_{k} blocks. It follows that
a_{i,j}=0 except when i and j lie in the same
block. The result follows.

1.14 
A=UDV^{H}=LDM^{H}
are alternative singular value decompositions of A iff
U^{H}L = DIAG(Q_{1},
Q_{2}, ..., Q_{k}, R) and
V^{H}M = DIAG(Q_{1},
Q_{2}, ..., Q_{k}, S) where
Q_{1}, Q_{2}, ..., Q_{k}
are unitary matrices whose sizes are given by the multiplicities of the
corresponding distinct nonzero singular values and R and S are
unitary matrices whose size equals the number of zero singular values.
[R.11]
 Define the unitary matrices P=U^{H}L
and Q=V^{H}M
 UD^{2}U^{H} =
AA^{H} = LD^{2}L^{H}
= UPD^{2}P^{H}U^{H}
==> D^{2}P = PD^{2}
 From [1.13], it follows that P (and also Q, by a similar argument based on A^{H}A) is block diagonal with unitary blocks whose sizes match the multiplicities of the singular values in D. It follows that PD = DP.
 We have U^{H}AV = D =
U^{H}LDM^{H} V=
PDQ^{H} ==> DQ = PD = DP
 Hence D(P - Q) = 0 and P must equal Q except for rows in which d_{ii} = 0. The result follows.

1.15 
If D is diagonal then
XDX^{T} = sum_{i}(d_{i}
× x_{i}x_{i}^{T}) and
XDX^{H} = sum_{i}(d_{i}
×
x_{i}x_{i}^{H})
 (XDX^{T})_{pq} =
sum_{i}(x_{pi}d_{i}x_{qi}) =
sum_{i}(d_{i} ×
x_{pi}x_{qi}) =
sum_{i}(d_{i} ×
(x_{i}x_{i}^{T})_{pq})
= (sum_{i}(d_{i} ×
x_{i}x_{i}^{T}))_{pq}
 The proof for XDX^{H} is identical.
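The identity is easy to verify numerically in the complex case (the dimensions are arbitrary choices of mine):

```python
import numpy as np

# X D X^H equals the d_i-weighted sum of outer products of X's columns.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
d = rng.standard_normal(3)
lhs = X @ np.diag(d) @ X.conj().T
rhs = sum(d[i] * np.outer(X[:, i], X[:, i].conj()) for i in range(3))
assert np.allclose(lhs, rhs)
```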

1.16 
If D is diagonal then
tr(XDX^{T}) = sum_{i}(d_{i}
× x_{i}^{T}x_{i}) and
tr(XDX^{H}) = sum_{i}(d_{i}
×
x_{i}^{H}x_{i}) = sum_{i}(d_{i} × ||x_{i}||^{2})
 tr(XDX^{T}) =
tr(sum_{i}(d_{i} ×
x_{i}x_{i}^{T}))
[1.15] = tr(sum_{i}(d_{i} ×
x_{i}^{T}x_{i})) =
sum_{i}(d_{i} ×
x_{i}^{T}x_{i})
 The proof for tr(XDX^{H}) is identical.

1.17 
tr(AB) =
tr(BA)
 (AB)_{i,j} =
sum_{k}(a_{ik}b_{kj})
 tr(AB) = sum_{i}(AB)_{ii} =
sum_{i,k}(a_{ik}b_{ki}) =
sum_{k,i}(b_{ki}a_{ik}) =
sum_{k}(BA)_{kk} = tr(BA)
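A quick check with non-square factors, which shows the identity holds even though AB and BA have different sizes (the shapes are arbitrary choices of mine):

```python
import numpy as np

# tr(AB) = tr(BA) although AB is 3x3 and BA is 5x5.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
```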

1.18 
tr(AB) =
A:^{T}B^{T}: =
A^{T}:^{T}B: =
A^{H}:^{H}B: =
(A:^{H}B^{H}:)^{C}
 (AB)_{i,j} =
sum_{k}(a_{ik}b_{kj})
 tr(AB) = sum_{i}(AB)_{ii} =
sum_{i,k}(a_{ik}b_{ki}) =
sum_{i,k}(A_{ik}(B^{T})_{ik})
= A:^{T}B^{T}:
 tr(AB) = tr(BA) [1.17] = B:^{T}A^{T}:
[1.18] =
A^{T}:^{T}B:
 tr(AB) = A^{T}:^{T}B:
= A^{H}:^{H}B: since
A^{T}:^{T} =
A^{H}:^{H}
 tr(AB) = A:^{T}B^{T}: =
(A:^{H}B^{H}:)^{C}

1.19 
tr([A
B]^{T} [C D]) =
tr(A^{T}C) +
tr(B^{T}D)
 Note that conformality implies that the matrix dimensions are
A_{[n#m]}, B_{[n#k]},
C_{[n#m]} and D_{[n#k]}
 tr([A B]^{T} [C D]) =
tr([A^{T}C A^{T}D ;
B^{T}C B^{T}D]) =
tr(A^{T}C) +
tr(B^{T}D) since these diagonal blocks are
square.

1.20 
The pseudoinverse is
unique
 Assume that both X^{+} and Y satisfy the pseudoinverse equations. We will show that Y = X^{+}.
 (a) XX^{+}X=X and (b)
XYX=X
 (a) X^{+}XX^{+}=X^{+} and (b)
YXY=Y
 (a) (XX^{+})^{H}=XX^{+}
and (b) (XY)^{H}=XY
 (a)
(X^{+}X)^{H}=X^{+}X
and (b) (YX)^{H}=YX
 Y = YXY [2b] =
YY^{H}X^{H} [3b] =
YY^{H}X^{H}(X^{+})^{H}X^{H}
[1a] =
YXY(X^{+})^{H}X^{H}
[3b] = YXYXX^{+} [3a] = YXX^{+} [2b] =
X^{H}Y^{H}X^{+}
[4b] =
X^{H}(X^{+})^{H}X^{H}Y^{H}X^{+}
[1a] =
X^{+}XX^{H}Y^{H}X^{+}
[4a] =
X^{+}XYXX^{+} [4b] = X^{+}XX^{+} [1b] = X^{+} [2a]
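The four equations (1a)-(4a) used in this proof are the Moore-Penrose conditions, and can be checked against NumPy's `pinv` (the real test matrix is an arbitrary choice of mine, so `.T` serves as the Hermitian transpose):

```python
import numpy as np

# Verify that np.linalg.pinv satisfies the four pseudoinverse equations.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
P = np.linalg.pinv(X)                     # candidate X^+
assert np.allclose(X @ P @ X, X)          # (1) X X^+ X = X
assert np.allclose(P @ X @ P, P)          # (2) X^+ X X^+ = X^+
assert np.allclose((X @ P).T, X @ P)      # (3) X X^+ is Hermitian
assert np.allclose((P @ X).T, P @ X)      # (4) X^+ X is Hermitian
```

By the uniqueness proved here, any other matrix satisfying all four conditions must equal `P`.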

1.21 
C=TOE(a)_{[m#r]}TOE(b)_{[r#n]} is Toeplitz iff a_{r+1:r+m-1}b_{1:n-1}^{T} = a_{1:m-1}b_{r+1:r+n-1}^{T}
 The product, C, is Toeplitz iff c_{i+1,j+1}-c_{i,j} = 0 for all 1<=i<=m-1 and 1<=j<=n-1.

c_{i+1,j+1}-c_{i,j} = sum_{p=1:r}(A_{i+1,p}B_{p,j+1}) - sum_{q=1:r}(A_{i,q}B_{q,j}) where A_{u,v}=a_{u-v+r} and B_{u,v}=b_{u-v+n} from the Toeplitz property.
 Hence c_{i+1,j+1}-c_{i,j} = sum_{p=1:r}(a_{i+1-p+r}b_{p-j-1+n}) - sum_{q=1:r}(a_{i-q+r}b_{q-j+n})
 Substituting p=q+1 in the first summation gives c_{i+1,j+1}-c_{i,j} = sum_{q=0:r-1}(a_{i-q+r}b_{q-j+n}) - sum_{q=1:r}(a_{i-q+r}b_{q-j+n})
 The summation terms all cancel except for q=0 and q=r which leaves c_{i+1,j+1}-c_{i,j} = a_{i+r}b_{n-j} - a_{i}b_{n-j+r} = a_{i+r}b_{k} - a_{i}b_{k+r} where k=n-j. Note that 1<=j<=n-1 implies 1<=k<=n-1.
 Thus C is Toeplitz iff a_{i+r}b_{k} = a_{i}b_{k+r} for all 1<=i<=m-1 and 1<=k<=n-1 which gives a_{r+1:r+m-1}b_{1:n-1}^{T} = a_{1:m-1}b_{r+1:r+n-1}^{T} as required.
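The condition a_{i+r}b_{k} = a_{i}b_{k+r} is satisfied, for example, by geometric sequences with a common ratio, which gives a concrete check of the theorem. In this NumPy sketch the helper `toe` (my own construction, following the indexing A_{u,v}=a_{u-v+r} used above) builds the Toeplitz factors:

```python
import numpy as np

def toe(a, rows, cols):
    """rows x cols Toeplitz matrix with (1-based) entries A[u,v] = a_{u-v+cols},
    where a holds a_1 .. a_{rows+cols-1} (the indexing used in 1.21)."""
    a = np.asarray(a)
    u = np.arange(rows)[:, None]
    v = np.arange(cols)[None, :]
    return a[u - v + cols - 1]

# Geometric sequences satisfy a_{i+r} b_k = a_i b_{k+r}, so the product
# of the two Toeplitz factors should itself be Toeplitz.
m, r, n = 4, 3, 5
t = 1.5
a = t ** np.arange(1, m + r)             # a_1 .. a_{m+r-1}
b = t ** np.arange(1, r + n)             # b_1 .. b_{r+n-1}
C = toe(a, m, r) @ toe(b, r, n)
assert np.allclose(C[1:, 1:], C[:-1, :-1])   # constant along diagonals
```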
