4.1 |
diag(abT) =
a • b
- If X = abT, then xi,j =
aibj and the diagonal elements of X,
xi,i = aibi. This is
precisely the ith element of a • b.
|
4.2 |
(A-1 +
UVH)-1 = A - (AU(I +
VHAU)-1)VHA
We show that A-1 + UVH and
A - (AU(I +
VHAU)-1)VHA
are inverses by multiplying them together:
- (A-1 + UVH)( A -
(AU(I +
VHAU)-1)VHA)
- =
I+UVHA-(U+UVHAU)(I
+
VHAU)-1VHA
- =
I+UVHA-U(I+VHAU)(I
+
VHAU)-1VHA
- =
I+UVHA-UVHA
= I
|
4.3 |
det(A-1 +
UVH) = det(I +
VHAU) ×
det(A-1)
From [3.1] we have that
det([A-1, -U; VH, I])
= det(A-1) ×
det(I+VHAU) =
det(I)*det(A-1+UI-1VH) which
simplifies to det([A-1, -U;
VH, I]) = det(A-1)
× det(I+VHAU) =
det(A-1+UVH)
|
4.4 |
If
D[n#n]= DIAG(d) is real with
di≥di+1 for all i,
and if Y[n#k] is constrained to satisfy
YHY=I then (a) maxY
tr(YHDY) = sum(d1:k)
and is attained by Y=[I; 0] and (b)
minY tr(YHDY) =
sum(dn-k+1:n) and is attained by
Y=[0; I].
- Define vi =
sumj(|yij|2). We must have
0≤vi≤1 since Y consists of k columns
from an n#n unitary matrix. Also sum(v)=k since
each of the k columns of Y has unit length.
- tr(YHDY) =
vTd which is a weighted sum of the
di with the constraints on v given above. (a) To
maximize this sum, we set v1:k=1 and
vk+1:n=0 which gives
tr(YHDY) = sum(d1:k).
(b) To minimize this sum, we set v1:n-k=0 and
vn-k+1:n=1 which gives
tr(YHDY) =
sum(dn-k+1:n).
|
4.5 |
If
D[n#n]= DIAG(d) with
di≥di+1 for all i,
then maxy (yHDy |
yHy=1) = d1
and miny (yHDy |
yHy=1) = dn and these bounds
are attained by y=e1 and
y=en respectively.
This is a special case of [4.4] with k=1, but may also
be proved directly.
- yHDy = sum(d •
yC • y) ≤ max(d) ×
sum(yC • y) = d1 ×
yHy = d1
- yHDy = sum(d •
yC • y) ≥ min(d) ×
sum(yC • y) = dn ×
yHy = dn
|
4.6 |
If
D[n#n]= DIAG(d) with
di≥di+1 for all i,
then minW:[n#k] maxy
(yHDy | yHy=1
and WHy=0) =
dn-k and this bound is attained by
W=[0; I[k#k]] and
y=en-k.
First define V = [0;
I[k#k]] and note that
VHy=0 implies that
y=PPHy where
P=(I[n-k#n-k]; 0) since
PPH+VVH=I
For any fixed W[k#k],
maxy (yHDy |
yHy=1 and
WHy=0)
≥ maxy (yHDy |
yHy=1 and
WHy=0 and
VHy=0)
=maxy
(yHPPHDPPHy
| yHy=1 and
WHy=0 and
VHy=0)
≥ min(d1:n-k) from [4.5] since
PHDP=diag(d1:n-k)
=dn-k since the entries of d are in
descending order.
From [4.5] this bound is attained when
PHy=en-k which
is in turn when y=en-k.
Since this is true for any W[k#k], we must
have minW:[n#k] maxy
(yHDy | yHy=1
and WHy=0) ≥
dn-k But if we choose W=V,
then we attain this bound with y=en-k.
where ei is the i'th column of the identity
matrix.
|
4.7 |
If
H[n#n]=UDUH is
hermitian, U is unitary and
D=diag(d)=diag(eig(H)), then minW
maxx (xHHx |
xHx=1 and
W[n#k]Hx=0)
= dn-k and this bound is attained by
W=U:,n-k+1:n and
y=un-k
- Set y=UHx, then
xHHx = yHDy and
xHx = yHy
- minW maxx
(xHHx | xHx=1
and W[n#k]Hx=0)
= minW maxy
(yHDy |
yHy=1 and
W[n#k]HUy=0)
From [4.6] we
attain the bound with UHW= [0;
I[k#k]] and
y=en-k.
Hence W = U[0; I[k#k]] =
U:,n-k+1:n
and x = Uen-k =
un-k.
|
4.8 |
If
H[n#n]=UDUH is
hermitian, U is unitary and
D=diag(d)=diag(eig(H)) contains the eigenvalues in
decreasing order, then maxx
(xHHx | xHx=1)
= d1 and minx
(xHHx | xHx=1)
= dn and these bounds are attained
by x=u1 and y=un
respectively.
From [4.5] and setting
y=UHx, d1=
maxy (yHDy |
yHy=1) = maxx
((UHx)HD(UHx)
|
(UHx)H(UHx)=1)
= maxx (xHHx |
xHx=1)
Similarly, dn= miny
(yHDy | yHy=1)
= minx
((UHx)HD(UHx)
|
(UHx)H(UHx)=1)
= minx (xHHx |
xHx=1)
|
4.9 |
If
D[n#n]= DIAG(d) is real with
di≥di+1>0 for all
i, and if Y[n#k] is constrained to have
rank k, then (a)
max(det(YHDY)/det(YHY))
= prod(d1:k) and is attained by Y=[I;
0] and (b)
min(det(YHDY)/det(YHY))
= prod(dn-k+1:n) and is attained by
Y=[0; I].
- At an extreme value, the complex
gradient must be zero. Hence 0 =
d(ln(det(YHDY)/det(YHY)))/dYC
=
d(ln(det(YHDY)))/dYC
-
d(ln(det(YHY)))/dYC
= (DY(YHDY)-1 -
Y(YHY)-1):
[2.16]
- Hence DY(YHDY)-1 =
Y(YHY)-1 and so DY =
Y(YHY)-1YHDY
- Taking the Hermitian transpose, we get
YHDY(YHY)-1
YH = YH D
- Since rank(Y)=k, YH must contain
k linearly independent columns and so the columns of
YH contain a complete set of eigenvectors for the
k#k matrix
YHDY(YHY)-1
with eigenvalues given by the corresponding entries in d.
-
det(YHDY)/det(YHY)
=
det(YHDY(YHY)-1)
equals the product of the eigenvalues and is therefore the product of k
elements of d. Since the elements of d and in decreasing order, the
maximum possible product is prod(d1:k) and the minimum
is prod(dn-k+1:n). These values are
attained by Y=[I; 0] and Y=[0; I]
respectively.
|
4.10 |
If A is hermitian and B is
+ve definite hermitian there exists X such that
XHBX=I and
XHAX=D where X and
D may be obtained from the eigendecomposition
B-1A=XDX-1 with
D=DIAG(d) a diagonal matrix
of eigenvalues in non-increasing order. If S is the +ve definite
hermitian square root of B-1 (i.e.
S2B=I) then the eigendecomposition of the
hermitian matrix SAS is
SAS=WDWH where
W=S-1X is
unitary.
- Since B is +ve definite hermitian, we can decompose it as
B = ULUH where U is unitary and L
is diagonal with real diagonal entries >0.
- Define the real diagonal matrix M =
L-½.
- The matrix MHUHAUM
is hermitian and can therefore be decomposed as
MHUHAUM =
VDVH where V is unitary and
D=DIAG(d) is real. We may order the columns of V so that
the elements of d are in non-increasing order.
- Define X = UMV. Then XHBX =
VHMHUHBUMV
= VHMHLMV =
VHV = I and
XHAX =
VHMHUHAUMV
= VHVDVHV = D
- B-1A =
(ULUH)-1(UM-1VDVHM-1UH)
= (UM2UH)
(UM-1VDVHM-1UH)
=
UMVDVHM-1UH
= (UMV) D (UMV)-1 =
XDX-1
- The hermitian +ve definite square root of B-1 is S
= UMUH. Therefore SAS =
(UMUH)
(UM-1VDVHM-1UH)
(UMUH) =
UVDVHUH = (UV)
D
(UV)H=WDWH.
- If A and B are real, then all the above
matrices can be taken as real.
|
4.11 |
If W is +ve definite Hermitian
and B is Hermitian, then if X[n#k] is
restricted to have rank(X)=k, maxX
tr((XHWX)-1
XHBX) = sum(d1:k)
where d are the eigenvalues of W-1B sorted into
decreasing order and this is attained by taking the columns of X to be
the corresponding eigenvectors.
- From [4.10] we can find
G[n#n] such that
GHWG = I and
GHBG = D = DIAG(d) where
d and G are the eigenvalues and corresponding eigenvectors
of W-1B with the elements of d in
non-increasing order. Since G is non-singular, the range of
X=GY[n#k] over rank(Y)=k
includes all n#k matrices with rank k.
- Hence, maxX
tr((XHWX)-1
XHBX) = maxY
tr(((GY)HW(GY))-1
(GY)HB(GY)) = maxY
tr((YHGHWGY)-1
YHGHBGY) =
maxY tr((YHY)-1
YHDY).
- From [1.9] any
Y[n#k] with rank k may be decomposed as
Y=Q[n#k]R[k#k]
with QHQ=I and R
non-singular.
- Hence, maxY
tr((YHY)-1
YHDY) = maxQ,R
tr((RHQHQR)-1
RHQHDQR) =
maxQ,R
tr(R-1R-H
RHQHDQR) =
maxQ,R
tr(R-1QHDQR) =
maxQ,R
tr(QHDQRR-1) =
maxQ,R tr(QHDQ) =
maxQ tr(QHDQ)
- From [4.4], the maximum is
sum(d1:k) and is attained by Y=Q=[I;
0]. From this we find that X=GY consists of the first
k columns of G, that is, the eigenvectors corresponding to the
k highest eigenvalues.
|
4.12 |
If W is +ve definite Hermitian
and B is Hermitian, then if X[n#k] is
restricted to have rank(X)=k, maxX
det((XHWX)-1
XHBX) = prod(d1:k)
where d are the eigenvalues of W-1B sorted into
decreasing order and this is attained by taking the columns of X to be
the corresponding eigenvectors.
- From [4.10] we can find
G[n#n] such that
GHWG = I and
GHBG = D = DIAG(d) where
d and G are the eigenvalues and corresponding eigenvectors
of W-1B with the elements of d in
non-increasing order. Since G is non-singular, the range of
X=GY[n#k] over rank(Y)=k
includes all n#k matrices with rank k.
- Hence, maxX
det((XHWX)-1
XHBX) = maxY
det(((GY)HW(GY))-1
(GY)HB(GY)) = maxY
det((YHGHWGY)-1
YHGHBGY) =
maxY det((YHY)-1
YHDY).
- From [4.9], the maximum is
prod(d1:k) and is attained by Y=[I; 0].
From this we find that X=GY consists of the first k
columns of G, that is, the eigenvectors corresponding to the k
highest eigenvalues.
|
4.13 |
If W is +ve definite Hermitian
and B is Hermitian and A[n#m] is a given
matrix, then if X[n#k] is restricted such that
rank([A X])=m+k, maxX tr(([A
X]HW[A X])-1 [A
X]HB[A X]) =
tr((AHWA)-1AHBA)
+ sum(d1:k) where d are the eigenvalues of
(W-1-A(AHWA)-1AH)B
sorted into decreasing order and this is attained by taking the columns of
X to be the corresponding eigenvectors.
- From [4.10] we can find
G[n#n] such that
GHWG = I and
GHBG = D = DIAG(d) where
d and G are the eigenvalues and corresponding eigenvectors
of W-1B with the elements of d in
non-increasing order.
- We can do the QR decomposition
G-1A =
U[n#m]R[m#m] =
[U[n#m]
V[n#n-m]][R; 0] where [U
V]H[U V] = I
Note that
(AHWA)-1
AHBA =
(RHUHGHWGUR)-1
RHUHGHBGUR=
R-1UHDUR
- Now for any X[n#k] satisfying rank([A
X])=m+k,
- We form the QR decomposition
VHG-1X =
K[n-m#k]S[k#k] and we
define the upper triangular matrix T[m+k#m+k]
= [R UHG-1X; 0
S] giving [A X] = G[U VK]T.
T must be non singular since rank([A X])=m+k.
- Now we have tr(([A X]HW[A
X])-1 [A X]HB[A X])
= tr((TH[U
VK]HGHWG[U
VK]T)-1 TH[U
VK]HGHBG[U
VK]T) since [A X] = G[U
VK]T
= tr(T-1([U VK]H[U
VK])-1
T-HTH[U
VK]HD[U VK]T) since T is non-singular, GHWG
= I
= tr([U VK]HD[U VK]) since [U VK]H[U VK]
= I[m+k#m+k] and from [1.17]
= tr(UHDU) +
tr(KHVHDVK)
[1.19] =
tr((AHWA)-1
AHBA) +
tr(KHVHDVK)
- Thus we need to maximize
tr(KHVHDVK) subject to
KHK = I . From [4.11] (with W=I) the maximum equals the sum of the
top k eigenvalues of VHDV and is
attained by setting K to the corresponding eigenvectors. From K
we can derive X = GVK. Note that S does not affect our
objective function and so can be taken to be the identity.
- We have VHDV K = K L where L is a
diagonal matrix containing the top k eigenvalues of
VHDV.
Therefore X L = GVK L = GVVHDV K =
GVVHGHB GVK =
GVVHGHB X which means
that X consists of the eigenvectors of
GVVHGHB with
eigenvalues diag(L).
- We can use the fact that A=GUR, W =
G-HG-1 and
UHU=I to
give AHWA =
RHUHGHG-HG-1GUR
= RHR. Thus
A(AHWA)-1AH
= GUR R-1R-H
RHUHGH
= GUUHGH
- Therefore, since
UUH+VVH=I , we get
GVVHGHB=
G(I-UUH)GHB =
(GGH-GUUHGH)B
=
(W-1-A(AHWA)-1AH)B
|
4.14 |
If W=FHF is +ve
definite Hermitian, B is Hermitian and
A[n#m] is a given matrix and the columns of
V are an orthonormal basis for the null space of
AHFH, then if
X[n#k] is restricted such that rank([A
X])=m+k, maxX tr(([A
X]HW[A X])-1 [A
X]HB[A X]) =
tr((AHWA)-1AHBA)
+ sum(d1:k) where d are the eigenvalues of
VHF-HBF-1V
sorted into decreasing order and this is attained by taking the columns of
X to be the corresponding eigenvectors multiplied by
F-1V.
- For any X[n#k] we can do a QR decomposition VHFX =
K[n-m#k]S[k#k] and we define
the upper triangular matrix T[m+k#m+k] =
[R UHFX; 0
S[k#k]] where FA =
U[n#m]R[m#m] is
also a QR decomposition. We note that
Tis non singular iff rank([A X])=m+k.
- Now, tr(([A X]HW[A X])-1
[A X]HB[A X]) =
tr((TH[U
VK]HF-HFHFF-1[U
VK]T)-1 TH[U
VK]HF-HBF-1[U
VK]T) = tr(T-1([U
VK]H[U VK])-1
T-HTH[U
VK]HD[U VK]T) = tr([U
VK]HF-HBF-1[U
VK]) =
tr(UHF-HBF-1U)
+
tr(KHVHF-HBF-1VK)
- The first term is independent of X, while the maximum value of the second
subject to KHK=I is equal to the sum of
the k highest eigenvalues of
VHF-HBF-1V
with the columns of K the corresponding eigenvectors. A suitable
X is then given by X = F-1VK.
|
4.15 |
If W is +ve definite Hermitian
and B is Hermitian and A[n#m] is a given
matrix, then maxX det(([A
X]HW[A X])-1 [A
X]HB[A X] | rank([A
X[n#k]])=m+k) =
det((AHWA)-1AHBA)×prod(l1:k)
where l are the eigenvalues of
W-1B(I - A
(AHBA)-1AHB
) sorted into decreasing order and this maximum may be attained by taking the
columns of X to be the corresponding eigenvectors.
- From [4.10] we can find
G[n#n] such that
GHWG = I and
GHBG = D = DIAG(d) where
d and G are the eigenvalues and corresponding eigenvectors
of W-1B with the elements of d in
non-increasing order.
- We can do the QR decomposition
G-1A =
U[n#m]R[m#m] =
[U[n#m]
V[n#n-m]][R; 0] where [U
V]H[U V] = I
Note that
(AHWA)-1
AHBA =
(RHUHGHWGUR)-1
RHUHGHBGUR=
R-1UHDUR
- Now for any X[n#k] satisfying rank([A
X])=m+k,
- We form the QR decomposition
VHG-1X =
K[n-m#k]S[k#k] and we define
the upper triangular matrix T[m+k#m+k] =
[R UHG-1X; 0 S]
giving [A X] = G[U VK]T. T must
be non singular since rank([A X])=m+k.
- Now we have det(([A X]HW[A
X])-1 [A X]HB[A X])
= det((TH[U
VK]HGHWG[U
VK]T)-1 TH[U
VK]HGHBG[U
VK]T) since [A X] = G[U
VK]T
= det(T-1([U VK]H[U
VK])-1
T-HTH[U
VK]HD[U VK]T) since T is non-singular, GHWG
= I
= det([U VK]HD[U VK]) since [U VK]H[U VK]
= I[m+k#m+k]
= det([UHDU
UHDVK ;
KHVHDU
KHVHDVK )
=
det(UHDU)×det(KHVHDVK
- KHVHDU
(UHDU)-1UHDVK)
[3.1] =
det(UHDU)×det(KH
(VHDV - VHDU
(UHDU)-1UHDV)
K)
= det((AHWA)-1
AHBA)×det(KH
(VHDV - VHDU
(UHDU)-1UHDV)
K)
- Thus we need to maximize det(KH
(VHDV - VHDU
(UHDU)-1UHDV)
K) subject to KHK = I.
From [4.12] (with
W=I) the maximum equals the product of the top k
eigenvalues of VHDV - VHDU
(UHDU)-1UHDV
and is attained by setting K to the corresponding eigenvectors. From
K we can derive X = GVK as one possible X. Note
that S does not affect our objective function and so can be taken to be
the identity.
- We can manipulate VHDV -
VHDU
(UHDU)-1UHDV
= VHGHBGV -
VHGHBGU
(UHGHBGU)-1UHGHBGV
= VHGHBGV -
VHGHBGUR(RHUHGHBGUR)-1RHUHGHBGV
= VHGHBGV -
VHGHBA
(AHBA)-1AHBGV
= VHGH(I - BA
(AHBA)-1AH)BGV
- So if K contains eigenvectors of
VHGH(I - BA
(AHBA)-1AH)BGV,
we have VHGH(I -
BA
(AHBA)-1AH)BGV
K = K L for some diagonal L. Hence
X L = GVK L = GVVHGH(I
- BA
(AHBA)-1AH)BGV
K
= G (I-UUH)GH(I -
BA
(AHBA)-1AH)
BX
= (GGH -
GUUHGH - G
GHBA
(AHBA)-1AH
+ GUUHGHBA
(AHBA)-1AH
)BX
= (GGH -
AR-1R-HAH
- G GHBA
(AHBA)-1AH
+
AR-1R-HAHBA
(AHBA)-1AH
)BX
= (GGH -
AR-1R-HAH
- G GHBA
(AHBA)-1AH
+ AR-1R-HAH
)BX
= (GGH - G
GHBA
(AHBA)-1AH )BX
= (W-1B - W-1BA
(AHBA)-1AHB
)X
= W-1B(I - A
(AHBA)-1AHB
)X
- So X consists of the eigenvectors of
W-1B(I - A
(AHBA)-1AHB
) corresponding to the k highest eigenvalues [there must be an easier
proof than this methinks]
|
4.16 |
|xHy|2 =
xHyyHx ≤
xHxyHy for any complex
vectors x and y
- If yHy=0 then the inequality is true since
y=0 making both sides of the inequality zero. Hence we assume
that yHy>0.
- 0 ≤ |xyHy -
yyHx|2
= (yHyxH -
xHyyH)(xyHy
- yyHx)
=
yHyxHxyHy
-
xHyyHxyHy
-
yHyxHyyHx
+
xHyyHyyHx
= xHxyHy -
xHyyHx -
xHyyHx +
xHyyHx (after dividing
all terms by yHy)
= xHxyHy -
xHyyHx
|
4.17 |
XHY(YHY)-1YHX
≤ XHX where ≤ represents the
Loewner partial order.
- Let a be an arbitrary vector
-
aHXHY(YHY)-1YHXa
= aHXH ×
Y(YHY)-1YHXa
which is of the form uHv
- Applying the scalar Cauchy-Schwarz inequality [4.16] gives
(aHXHY(YHY)-1YHXa)2
≤ aHXHXa
×
aHXHY(YHY)-1YHY(YHY)-1YHXa
= aHXHXa
×
aHXHY(YHY)-1YHXa
- If
aHXHY(YHY)-1YHXa
> 0 we can divide through to obtain
aHXHY(YHY)-1YHXa
≤ aHXHXa = | Xa
|2
- If, however
aHXHY(YHY)-1YHXa
= 0, then the inequality is true in any case since the right side is ≥
0.
- Hence
aHXHY(YHY)-1YHXa
≤ aHXHXa for any
vector a.
- Hence
XHY(YHY)-1YHX
≤ XHX in the sense of the Loewner partial
order.
|