Matrix Reference Manual
Proofs Section 2: Matrix Calculus


2.1 d(X^{-1}) = -X^{-1} dX X^{-1}
  • 0 = dI = d(X X^{-1}) = dX X^{-1} + X d(X^{-1})
  • ⇒ d(X^{-1}) = -X^{-1} dX X^{-1}
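
The differential in 2.1 can be sanity-checked numerically. The sketch below assumes NumPy; the test matrix, the seed and the perturbation size are arbitrary choices made for illustration and are not part of the manual.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n)) + n * np.eye(n)  # diagonal shift keeps X well conditioned
dX = 1e-6 * rng.standard_normal((n, n))          # small perturbation standing in for dX

X_inv = np.linalg.inv(X)
actual = np.linalg.inv(X + dX) - X_inv           # true change in the inverse
predicted = -X_inv @ dX @ X_inv                  # first-order change from 2.1

# The mismatch is second order in dX, so it should be tiny next to the change itself.
rel_err = np.linalg.norm(actual - predicted) / np.linalg.norm(predicted)
print(f"relative error: {rel_err:.2e}")
```

Shrinking the perturbation by a factor of ten should shrink the relative error by roughly the same factor, which is the signature of a correct first-order formula.
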
2.2 dX/dx_{ij} = e_i e_j^T, where e_i is the i-th column of I.

Note that e_i e_j^T is a matrix containing a 1 in position (i, j) and zeros elsewhere.

2.3 d/dx_{ij} (X^{-1}) = -X^{-1} e_i e_j^T X^{-1} = -X^{-1} e_i (X^{-T} e_j)^T

This follows from 2.1 on substituting dX = e_i e_j^T dx_{ij} from 2.2; the second form uses e_j^T X^{-1} = (X^{-T} e_j)^T.
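
As an illustration of 2.2 and 2.3, the following sketch builds the single-entry matrix and compares both forms of the derivative with a central difference. It assumes NumPy and uses 0-based indices; the particular values of `n`, `i`, `j` and the step `h` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, i, j = 3, 0, 2                                # perturb element x_ij (0-based indices)
X = rng.standard_normal((n, n)) + n * np.eye(n)
I = np.eye(n)
E_ij = np.outer(I[:, i], I[:, j])                # e_i e_j^T: a single 1 at position (i, j)

X_inv = np.linalg.inv(X)
analytic = -X_inv @ E_ij @ X_inv                 # first form in 2.3
rank_one = -np.outer(X_inv[:, i], X_inv[j, :])   # second form: column i of X^-1 times row j of X^-1

h = 1e-6                                         # step for the central difference in x_ij
numeric = (np.linalg.inv(X + h * E_ij) - np.linalg.inv(X - h * E_ij)) / (2 * h)
print(np.max(np.abs(numeric - analytic)))
```

The second form makes it visible that the derivative with respect to a single element is always a rank-one matrix.
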

2.4 d/dX (tr(AXB)) = A^T B^T
  • d/dx_{ij} (tr(AXB)) = tr(A e_i e_j^T B) = tr(e_j^T B A e_i) = e_j^T B A e_i = (BA)_{ji} = (A^T B^T)_{ij} [2.2]
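
The element-by-element argument in 2.4 translates directly into a finite-difference check. In this sketch (NumPy assumed) the helper `numerical_gradient` is a hypothetical name introduced here; it perturbs one x_ij at a time, mirroring the proof.

```python
import numpy as np

def numerical_gradient(f, X, h=1e-5):
    """Central-difference estimate of the matrix of partial derivatives df/dx_ij."""
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        E = np.zeros_like(X)
        E[idx] = h                               # h * e_i e_j^T
        G[idx] = (f(X + E) - f(X - E)) / (2 * h)
    return G

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))
X = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

analytic = A.T @ B.T                             # claimed d/dX tr(AXB)
numeric = numerical_gradient(lambda M: np.trace(A @ M @ B), X)
print(np.max(np.abs(numeric - analytic)))
```

Because tr(AXB) is linear in X, the central difference is exact up to rounding, and the gradient does not depend on X at all.
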
2.5 d/dX (tr(A X^{-1} B)) = -X^{-T} A^T B^T X^{-T}
  • d/dx_{ij} (tr(A X^{-1} B)) = -tr(A X^{-1} e_i e_j^T X^{-1} B) = -tr(e_j^T X^{-1} B A X^{-1} e_i) = -e_j^T X^{-1} B A X^{-1} e_i = -(X^{-1} B A X^{-1})_{ji} = -(X^{-T} A^T B^T X^{-T})_{ij} [2.3]
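
A similar numerical check applies to 2.5. The sketch assumes NumPy; `numerical_gradient` is a hypothetical helper defined here, and the diagonal shift on `X` is only there to keep the random test matrix safely invertible.

```python
import numpy as np

def numerical_gradient(f, X, h=1e-5):
    """Central-difference estimate of the matrix of partial derivatives df/dx_ij."""
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        E = np.zeros_like(X)
        E[idx] = h
        G[idx] = (f(X + E) - f(X - E)) / (2 * h)
    return G

rng = np.random.default_rng(3)
n = 3
A = rng.standard_normal((2, n))
B = rng.standard_normal((n, 2))
X = rng.standard_normal((n, n)) + n * np.eye(n)

X_inv_T = np.linalg.inv(X).T                     # X^-T
analytic = -X_inv_T @ A.T @ B.T @ X_inv_T        # claimed d/dX tr(A X^-1 B)
numeric = numerical_gradient(lambda M: np.trace(A @ np.linalg.inv(M) @ B), X)
print(np.max(np.abs(numeric - analytic)))
```
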
2.6 [D=D^H] d{tr((AXB+C) D (AXB+C)^H)} = {(2A^H (AXB+C) D B^H):^H dX:}^R
  • d{tr((AXB+C) D (AXB+C)^H)} = tr{(A(dX)B) D (AXB+C)^H} + tr{(AXB+C) D (A(dX)B)^H}
  • = tr{A(dX)B D (AXB+C)^H} + tr{(AXB+C) D B^H (dX)^H A^H}
  • = tr{B D (AXB+C)^H A (dX)} + tr{A^H (AXB+C) D B^H (dX)^H}   [1.17]
  • = (B D (AXB+C)^H A)^H:^H dX: + ((A^H (AXB+C) D B^H):^H dX:)^C   [1.18]
  • = ((A^H (AXB+C) D B^H):^H dX:) + ((A^H (AXB+C) D B^H):^H dX:)^C   since D = D^H
  • = {(2A^H (AXB+C) D B^H):^H dX:}^R
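
The complex differential in 2.6 can be checked on random data. This sketch assumes NumPy; `crandn` is a hypothetical helper for complex Gaussian matrices, and the sizes are arbitrary. `np.vdot` conjugates its first argument and flattens both, so `np.vdot(G, dX)` plays the role of G:^H dX:.

```python
import numpy as np

rng = np.random.default_rng(4)

def crandn(*shape):
    """Complex matrix with standard-normal real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

A, X, B, C = crandn(3, 2), crandn(2, 4), crandn(4, 5), crandn(3, 5)
D = crandn(5, 5)
D = D + D.conj().T                               # enforce D = D^H

def f(M):
    R = A @ M @ B + C
    return np.trace(R @ D @ R.conj().T).real     # the trace is real because D is Hermitian

dX = 1e-3 * crandn(2, 4)
G = 2 * A.conj().T @ (A @ X @ B + C) @ D @ B.conj().T
predicted = np.vdot(G, dX).real                  # {G:^H dX:}^R

# f is quadratic in X, so the central difference isolates the linear term exactly.
actual = (f(X + dX) - f(X - dX)) / 2
print(predicted, actual)
```
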
2.7 [D=D^H] argmin_X {tr((AXB+C) D (AXB+C)^H)} = -(A^H A)^{-1} A^H C D B^H (B D B^H)^{-1}
  • d{tr((AXB+C) D (AXB+C)^H)} = 0
  • ⇒ {(2A^H (AXB+C) D B^H):^H dX:}^R = 0  [2.6]
  • ⇒ (2A^H (AXB+C) D B^H): = 0  since it must be true for any dX
  • ⇒ A^H (AXB+C) D B^H = 0  removing the vectorization
  • ⇒ A^H A X B D B^H = -A^H C D B^H
  • ⇒ X = -(A^H A)^{-1} A^H C D B^H (B D B^H)^{-1}  provided A^H A and B D B^H are nonsingular
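
The closed form in 2.7 can be exercised numerically. The sketch below assumes NumPy and, as an extra assumption not stated in the result, takes D to be Hermitian positive definite so that the stationary point is a genuine minimum. `crandn`, `herm` and `cost` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(5)

def crandn(*shape):
    """Complex matrix with standard-normal real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def herm(M):
    """Conjugate transpose, written M^H in the text."""
    return M.conj().T

A, B, C = crandn(5, 2), crandn(3, 6), crandn(5, 6)
Z = crandn(6, 6)
D = Z @ herm(Z) + np.eye(6)                      # assumption: Hermitian positive definite

def cost(X):
    R = A @ X @ B + C
    return np.trace(R @ D @ herm(R)).real

# Closed form from 2.7
W = herm(A) @ C @ D @ herm(B)                    # A^H C D B^H
X_opt = -np.linalg.solve(herm(A) @ A, W) @ np.linalg.inv(B @ D @ herm(B))

# Stationarity condition from the proof: A^H (A X B + C) D B^H = 0
residual = herm(A) @ (A @ X_opt @ B + C) @ D @ herm(B)

# Random perturbations of the minimiser should only ever raise the cost.
increases = [cost(X_opt + 0.1 * crandn(2, 3)) - cost(X_opt) for _ in range(5)]
print(np.linalg.norm(residual), min(increases))
```
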
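2.8 [D=D^H] argmin_X {tr((AXB+C) D (AXB+C)^H) | EXF-G=0} = (A^H A)^{-1} (E^H {E (A^H A)^{-1} E^H}^{-1} {E (A^H A)^{-1} A^H C D B^H (B D B^H)^{-1} F + G} {F^H (B D B^H)^{-1} F}^{-1} F^H - A^H C D B^H) (B D B^H)^{-1}
  • Introducing a matrix of Lagrange multipliers, K, the constrained minimum satisfies ∂{tr((AXB+C) D (AXB+C)^H) + tr(K^H (EXF-G)) + tr((EXF-G)^H K)}/∂X = 0
  • ⇒ A^H (AXB+C) D B^H + E^H K F^H = 0    [2.4]
  • ⇒ A^H A X B D B^H = -(A^H C D B^H + E^H K F^H)
  • ⇒ X = -(A^H A)^{-1} (A^H C D B^H + E^H K F^H) (B D B^H)^{-1}
  • Substituting this into the constraint, EXF-G=0, gives
  • -E (A^H A)^{-1} (A^H C D B^H + E^H K F^H) (B D B^H)^{-1} F = G
  • ⇒ E (A^H A)^{-1} E^H K F^H (B D B^H)^{-1} F = -(G + E (A^H A)^{-1} A^H C D B^H (B D B^H)^{-1} F)
  • ⇒ K = -(E (A^H A)^{-1} E^H)^{-1} (E (A^H A)^{-1} A^H C D B^H (B D B^H)^{-1} F + G) (F^H (B D B^H)^{-1} F)^{-1}
  • Finally, substituting this back into the previous expression for X gives
  • X = (A^H A)^{-1} (E^H {E (A^H A)^{-1} E^H}^{-1} {E (A^H A)^{-1} A^H C D B^H (B D B^H)^{-1} F + G} {F^H (B D B^H)^{-1} F}^{-1} F^H - A^H C D B^H) (B D B^H)^{-1}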
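
The constrained solution in 2.8 is easy to get wrong when transcribing, so a numerical check is useful. This sketch assumes NumPy, takes D Hermitian positive definite (an added assumption so that a minimum exists), and chooses E with full row rank and F with full column rank. All helper names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)

def crandn(*shape):
    """Complex matrix with standard-normal real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def herm(M):
    """Conjugate transpose, written M^H in the text."""
    return M.conj().T

A, B, C = crandn(5, 3), crandn(4, 6), crandn(5, 6)
E, F, G = crandn(2, 3), crandn(4, 2), crandn(2, 2)
Z = crandn(6, 6)
D = Z @ herm(Z) + np.eye(6)                      # assumption: Hermitian positive definite

inv = np.linalg.inv
P = inv(herm(A) @ A)                             # (A^H A)^-1
Q = inv(B @ D @ herm(B))                         # (B D B^H)^-1
W = herm(A) @ C @ D @ herm(B)                    # A^H C D B^H

# Lagrange multipliers and minimiser, following the steps of the proof
K = -inv(E @ P @ herm(E)) @ (E @ P @ W @ Q @ F + G) @ inv(herm(F) @ Q @ F)
X_opt = -P @ (W + herm(E) @ K @ herm(F)) @ Q

# The same minimiser written as in the statement of 2.8
X_stated = P @ (herm(E) @ inv(E @ P @ herm(E)) @ (E @ P @ W @ Q @ F + G)
                @ inv(herm(F) @ Q @ F) @ herm(F) - W) @ Q

def cost(X):
    R = A @ X @ B + C
    return np.trace(R @ D @ herm(R)).real

constraint_residual = E @ X_opt @ F - G
stationarity = herm(A) @ (A @ X_opt @ B + C) @ D @ herm(B) + herm(E) @ K @ herm(F)

# A feasible perturbation N satisfies E N F = 0; projecting onto the
# null space of E is one simple way to construct such an N.
N = (np.eye(3) - np.linalg.pinv(E) @ E) @ crandn(3, 4)
increase = cost(X_opt + N) - cost(X_opt)
print(np.linalg.norm(constraint_residual), increase)
```
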
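2.9 d/dX (a^T X^{-1} b) = -X^{-T} a b^T X^{-T}
  • d/dx_{ij} (a^T X^{-1} b) = -a^T X^{-1} e_i e_j^T X^{-1} b = -(a^T X^{-1} e_i)(e_j^T X^{-1} b) = -(e_i^T X^{-T} a)(b^T X^{-T} e_j) = -e_i^T X^{-T} a b^T X^{-T} e_j [2.3]
  • Hence d/dX (a^T X^{-1} b) = -X^{-T} a b^T X^{-T}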
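
Result 2.9 is the special case of 2.5 in which A is a row vector and B a column vector, so the gradient is a rank-one matrix. The sketch below (NumPy assumed, all names illustrative) checks both that relationship and the first-order change along a random direction.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
a, b = rng.standard_normal(n), rng.standard_normal(n)
X = rng.standard_normal((n, n)) + n * np.eye(n)
dX = 1e-5 * rng.standard_normal((n, n))
X_inv = np.linalg.inv(X)

# -X^-T a b^T X^-T is minus the outer product of X^-T a with X^-1 b.
grad = -np.outer(X_inv.T @ a, X_inv @ b)

# The same matrix obtained from 2.5 with A = a^T (1 x n) and B = b (n x 1).
A, B = a[np.newaxis, :], b[:, np.newaxis]
grad_from_2_5 = -X_inv.T @ A.T @ B.T @ X_inv.T

def f(M):
    return a @ np.linalg.inv(M) @ b              # a^T M^-1 b

predicted = np.sum(grad * dX)                    # first-order change along dX
actual = (f(X + dX) - f(X - dX)) / 2             # central difference along dX
print(predicted, actual)
```
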
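2.10 d(det(X)) = ADJ(X)^T:^T dX: = [X: nonsingular] det(X) (X^{-T}):^T dX:
  • det(X) I = X ADJ(X), so it follows that det(X) = sum_j (x_{ij} ADJ(X)_{ji}) for each i
  • d/dx_{ij} det(X) = ADJ(X)_{ji} = (ADJ(X)^T)_{ij}, since none of the ADJ(X)_{ki} depends on the elements of row i of X
  • ⇒ d(det(X)) = ADJ(X)^T:^T dX:
  • If X is nonsingular, ADJ(X) = det(X) X^{-1}, which gives the second form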
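
The proof of 2.10 rests on the cofactor structure of the adjugate, which the following sketch makes explicit. It assumes NumPy; `adjugate` is a hypothetical helper that builds ADJ(X) from signed minors, which is only practical for small matrices.

```python
import numpy as np

def adjugate(X):
    """Adjugate of a square matrix: the transpose of its cofactor matrix."""
    n = X.shape[0]
    cof = np.empty_like(X)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(X, i, axis=0), j, axis=1)  # drop row i, column j
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T

rng = np.random.default_rng(8)
X = rng.standard_normal((3, 3)) + 3 * np.eye(3)
dX = 1e-4 * rng.standard_normal((3, 3))

adj = adjugate(X)
G = adj.T                                        # ADJ(X)^T, the matrix of d det(X)/dx_ij
predicted = np.sum(G * dX)                       # ADJ(X)^T:^T dX:
actual = (np.linalg.det(X + dX) - np.linalg.det(X - dX)) / 2
print(predicted, actual)
```

The sum of elementwise products `np.sum(G * dX)` is the plain-array equivalent of the vectorized inner product G:^T dX: used in the text.
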
2.11 d(det(A^T X B)) = d(det(B^T X^T A)) = (A ADJ(A^T X B)^T B^T):^T dX: = [A,B,X: nonsingular] det(A^T X B) × (X^{-T}):^T dX:
  • d(det(A^T X B)) = ADJ(A^T X B)^T:^T d(A^T X B): [2.10]
  • = ADJ(A^T X B)^T:^T (B ⊗ A)^T dX:
  • = ((B ⊗ A) ADJ(A^T X B)^T:)^T dX:
  • = (A ADJ(A^T X B)^T B^T):^T dX:
  • (A ADJ(A^T X B)^T B^T):^T dX: = [A,B,X: nonsingular] det(A^T X B) × (A (A^T X B)^{-T} B^T):^T dX: = det(A^T X B) × (A A^{-1} X^{-T} B^{-T} B^T):^T dX: = det(A^T X B) × (X^{-T}):^T dX:
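
The Kronecker-product step in 2.11 depends on the vectorization stacking columns. The sketch below (NumPy assumed; `vec` is a hypothetical helper using column-major order) checks that step and then compares the general and simplified forms of the differential with a central difference.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one vector (the colon operator in the text)."""
    return M.flatten(order="F")

rng = np.random.default_rng(9)
n = 3
A = rng.standard_normal((n, n)) + n * np.eye(n)
B = rng.standard_normal((n, n)) + n * np.eye(n)
X = rng.standard_normal((n, n)) + n * np.eye(n)
dX = 1e-4 * rng.standard_normal((n, n))

# Kronecker step of the proof: (A^T dX B): = (B kron A)^T dX:
lhs = vec(A.T @ dX @ B)
rhs = np.kron(B, A).T @ vec(dX)

M = A.T @ X @ B
adj_M = np.linalg.det(M) * np.linalg.inv(M)      # adjugate, valid because M is nonsingular here
grad = A @ adj_M.T @ B.T                         # A ADJ(A^T X B)^T B^T
general = vec(grad) @ vec(dX)
simplified = np.linalg.det(M) * (vec(np.linalg.inv(X).T) @ vec(dX))

def f(Y):
    return np.linalg.det(A.T @ Y @ B)

actual = (f(X + dX) - f(X - dX)) / 2             # central difference along dX
print(general, simplified, actual)
```
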
2.12 d(ln(det(A^T X B))) = [A,B,X: nonsingular] (X^{-T}):^T dX:
  • d(ln(det(A^T X B))) = det(A^T X B)^{-1} × d(det(A^T X B)) = [A,B,X: nonsingular] det(A^T X B)^{-1} × det(A^T X B) × (X^{-T}):^T dX: [2.11] = (X^{-T}):^T dX:
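
A notable feature of 2.12 is that A and B drop out of the differential entirely. The sketch below illustrates this with several unrelated (A, B) pairs. It assumes NumPy and uses `np.linalg.slogdet`, working with ln|det| so that the sign of the determinant does not matter; that substitution is a convenience of this example rather than part of the result.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 3
X = rng.standard_normal((n, n)) + n * np.eye(n)
dX = 1e-5 * rng.standard_normal((n, n))

G = np.linalg.inv(X).T                           # X^-T
predicted = np.sum(G * dX)                       # (X^-T):^T dX:

def log_abs_det(A, B, M):
    # slogdet returns (sign, ln|det|); ln|det| and ln(det) have the same differential.
    return np.linalg.slogdet(A.T @ M @ B)[1]

changes = []
for _ in range(3):                               # three unrelated (A, B) pairs
    A = rng.standard_normal((n, n)) + n * np.eye(n)
    B = rng.standard_normal((n, n)) + n * np.eye(n)
    changes.append((log_abs_det(A, B, X + dX) - log_abs_det(A, B, X - dX)) / 2)
print(predicted, changes)
```
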
2.13 d(det(X)^k) = k × det(X^k) × (X^{-T}):^T dX:
  • d(det(X)^k) = k × det(X)^{k-1} × d(det(X)) = k × det(X)^{k-1} × det(X) × (X^{-T}):^T dX: [2.10] = k × det(X)^k × (X^{-T}):^T dX: = k × det(X^k) × (X^{-T}):^T dX:
2.14 d(det(X^T C X)) = [C=C^T] 2 det(X^T C X) × (C X (X^T C X)^{-1}):^T dX:
  • d(det(X^T C X)) = det(X^T C X) × (X^T C X)^{-T}:^T (d(X^T C X)): [2.10]
  • = det(X^T C X) × (X^T C X)^{-T}:^T ((I ⊗ X^T C) dX: + (X^T C^T ⊗ I) dX^T:)
  • = det(X^T C X) × ((C^T X (X^T C X)^{-T}):^T dX: + ((X^T C X)^{-T} X^T C^T):^T dX^T:)
  • = det(X^T C X) × (C^T X (X^T C^T X)^{-1} + C X (X^T C X)^{-1}):^T dX:
  • = [C=C^T] 2 det(X^T C X) × (C X (X^T C X)^{-1}):^T dX:
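
Both the general expression reached just before C = C^T is applied and the final symmetric form of 2.14 can be checked numerically. This sketch assumes NumPy; the function names, the sizes, and the way the test matrices are built are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(11)
n, k = 4, 2
X = rng.standard_normal((n, k))
dX = 1e-5 * rng.standard_normal((n, k))
S = rng.standard_normal((n, n))
C_sym = S @ S.T + np.eye(n)                          # symmetric positive definite
C_gen = C_sym + 0.1 * rng.standard_normal((n, n))    # mildly asymmetric variant

def f(C, Z):
    return np.linalg.det(Z.T @ C @ Z)

def grad_general(C, Z):
    """det(Z^T C Z) (C^T Z (Z^T C^T Z)^-1 + C Z (Z^T C Z)^-1), valid for any square C."""
    M = Z.T @ C @ Z
    return f(C, Z) * (C.T @ Z @ np.linalg.inv(M.T) + C @ Z @ np.linalg.inv(M))

def directional(C):
    return (f(C, X + dX) - f(C, X - dX)) / 2         # central difference along dX

G_gen = grad_general(C_gen, X)
G_sym = 2 * f(C_sym, X) * (C_sym @ X @ np.linalg.inv(X.T @ C_sym @ X))
print(directional(C_gen), np.sum(G_gen * dX))
print(directional(C_sym), np.sum(G_sym * dX))
```
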
2.15 d(det(X^H C X)) = det(X^H C X) × ((C^T X^C (X^T C^T X^C)^{-1}):^T dX: + (C X (X^H C X)^{-1}):^T dX^C:)
  • d(det(X^H C X)) = det(X^H C X) × (X^H C X)^{-T}:^T (d(X^H C X)): [2.10]
  • = det(X^H C X) × (X^H C X)^{-T}:^T ((I ⊗ X^H C) dX: + (X^T C^T ⊗ I) dX^H:)
  • = det(X^H C X) × ((C^T X^C (X^H C X)^{-T}):^T dX: + ((X^H C X)^{-T} X^T C^T):^T dX^H:)
  • = det(X^H C X) × ((C^T X^C (X^T C^T X^C)^{-1}):^T dX: + (C X (X^H C X)^{-1}):^T dX^C:)
2.16 d(ln(det(X^H C X))) = (C^T X^C (X^T C^T X^C)^{-1}):^T dX: + (C X (X^H C X)^{-1}):^T dX^C:
  • d(ln(det(X^H C X))) = (det(X^H C X))^{-1} × d(det(X^H C X)) = (C^T X^C (X^T C^T X^C)^{-1}):^T dX: + (C X (X^H C X)^{-1}):^T dX^C: [2.15]
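
The complex results 2.15 and 2.16 treat dX and its conjugate dX^C as separate terms. The sketch below checks 2.16 for a generic complex C that is not assumed Hermitian. It assumes NumPy; `crandn` and `det_form` are hypothetical helpers, and the logarithm is taken of a ratio close to 1 so that the principal branch of the complex logarithm is safe.

```python
import numpy as np

rng = np.random.default_rng(12)

def crandn(*shape):
    """Complex matrix with standard-normal real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

n, k = 4, 2
X = crandn(n, k)
C = n * np.eye(n) + 0.5 * crandn(n, n)           # generic complex C, not Hermitian
dX = 1e-6 * crandn(n, k)

def det_form(Z):
    return np.linalg.det(Z.conj().T @ C @ Z)     # det(Z^H C Z)

M = X.conj().T @ C @ X
coef_dX = C.T @ X.conj() @ np.linalg.inv(X.T @ C.T @ X.conj())  # multiplies dX:
coef_dXc = C @ X @ np.linalg.inv(M)                             # multiplies dX^C:

# (P):^T dY: is the plain, unconjugated sum of elementwise products.
predicted = np.sum(coef_dX * dX) + np.sum(coef_dXc * dX.conj())

# Central difference of ln det, written as half the log of a ratio near 1.
actual = 0.5 * np.log(det_form(X + dX) / det_form(X - dX))
print(predicted, actual)
```

Multiplying `predicted` by `det_form(X)` gives the corresponding first-order change in the determinant itself, which is the statement of 2.15.
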
2.17 If C=C^H, then H_x (Ax+b)^H C (Ax+b) = (A^H C A)^T
  • (d/dx (d/dx (Ax+b)^H C (Ax+b))^H)^T = (d/dx ((Ax+b)^H C A)^H)^T = (d/dx (A^H C (Ax+b)))^T = (A^H C A)^T

This page is part of The Matrix Reference Manual. Copyright © 1998-2017 Mike Brookes, Imperial College, London, UK. See the file gfl.html for copying instructions. Please send any comments or suggestions to "mike.brookes" at "imperial.ac.uk".
Updated: $Id: proof002.html 10094 2017-09-01 17:31:37Z dmb $