[Math] Matrix Calculus

Math notes: matrix calculus
線代啟示錄
Wiki Matrix calculus

Overview: definitions and basic properties of matrix derivatives

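Definition: derivative of a scalar by a vector

Let $f$ be a function of $p$ independent variables $x_1, x_2, \dots, x_p$, collected in the vector
$$\mathbf{x} = [x_1, x_2, \dots, x_p]^T$$
The derivative of $f$ with respect to $\mathbf{x}$ is the column vector of partial derivatives
$$\frac{\partial f}{\partial \mathbf{x}} = \left[\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_p}\right]^T$$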

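Theorem (1): $\dfrac{\partial\, \mathbf{x}^T\mathbf{x}}{\partial \mathbf{x}} = 2\mathbf{x}$

Proof:
$$
\begin{aligned}
\frac{\partial\, \mathbf{x}^T\mathbf{x}}{\partial \mathbf{x}}
&= \left[\frac{\partial\, \mathbf{x}^T\mathbf{x}}{\partial x_1}, \frac{\partial\, \mathbf{x}^T\mathbf{x}}{\partial x_2}, \dots, \frac{\partial\, \mathbf{x}^T\mathbf{x}}{\partial x_p}\right]^T \\
&= \left[\frac{\partial \sum_{i=1}^{p} x_i^2}{\partial x_1}, \frac{\partial \sum_{i=1}^{p} x_i^2}{\partial x_2}, \dots, \frac{\partial \sum_{i=1}^{p} x_i^2}{\partial x_p}\right]^T \\
&= [2x_1, 2x_2, \dots, 2x_p]^T = 2[x_1, x_2, \dots, x_p]^T = 2\mathbf{x}
\end{aligned}
$$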
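As a quick sanity check of Theorem (1), the following minimal sketch compares a central-difference gradient of $\mathbf{x}^T\mathbf{x}$ with $2\mathbf{x}$. The helper name `numerical_gradient`, the step size `h`, and the sample vector are assumptions chosen for illustration; they do not come from the original post.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of [df/dx_1, ..., df/dx_p]."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def squared_norm(x):
    """f(x) = x^T x = sum of x_i^2."""
    return sum(xi * xi for xi in x)


x = [1.0, -2.0, 0.5]                      # arbitrary sample point
approx = numerical_gradient(squared_norm, x)
exact = [2 * xi for xi in x]              # Theorem (1): gradient is 2x
max_error = max(abs(a - e) for a, e in zip(approx, exact))
print(max_error)                          # should be tiny (rounding error only)
```

The same helper works for any scalar function of a vector, so it can be reused to test the other scalar-by-vector identities.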

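Theorem (2): $\dfrac{\partial\, \mathbf{x}^T\mathbf{y}}{\partial \mathbf{x}} = \mathbf{y}$ (scalar analogue: $\dfrac{\partial\, xy}{\partial x} = y$)

Proof:
$$
\begin{aligned}
\frac{\partial\, \mathbf{x}^T\mathbf{y}}{\partial \mathbf{x}}
&= \left[\frac{\partial\, \mathbf{x}^T\mathbf{y}}{\partial x_1}, \frac{\partial\, \mathbf{x}^T\mathbf{y}}{\partial x_2}, \dots, \frac{\partial\, \mathbf{x}^T\mathbf{y}}{\partial x_p}\right]^T \\
&= \left[\frac{\partial \sum_{i=1}^{p} x_i y_i}{\partial x_1}, \frac{\partial \sum_{i=1}^{p} x_i y_i}{\partial x_2}, \dots, \frac{\partial \sum_{i=1}^{p} x_i y_i}{\partial x_p}\right]^T \\
&= [y_1, y_2, \dots, y_p]^T = \mathbf{y}
\end{aligned}
$$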

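Theorem (3): $\dfrac{\partial\, A^T\mathbf{x}}{\partial \mathbf{x}} = A$ (scalar analogue: $\dfrac{\partial\, ax}{\partial x} = a$)

Proof: Let $A = [a_{ij}]_{p \times p}$, so that
$$A^T\mathbf{x} = \left[\sum_{i=1}^{p} a_{i1}x_i, \sum_{i=1}^{p} a_{i2}x_i, \dots, \sum_{i=1}^{p} a_{ip}x_i\right]^T$$
In keeping with the column-vector gradient defined above, each component of $A^T\mathbf{x}$ contributes its gradient as one column, so entry $(i, j)$ of the derivative is $\partial (A^T\mathbf{x})_j / \partial x_i$:
$$
\begin{aligned}
\frac{\partial\, A^T\mathbf{x}}{\partial \mathbf{x}}
&= \begin{bmatrix} \dfrac{\partial}{\partial \mathbf{x}}\sum_{i=1}^{p} a_{i1}x_i & \dfrac{\partial}{\partial \mathbf{x}}\sum_{i=1}^{p} a_{i2}x_i & \cdots & \dfrac{\partial}{\partial \mathbf{x}}\sum_{i=1}^{p} a_{ip}x_i \end{bmatrix} \\
&= \begin{bmatrix}
\dfrac{\partial}{\partial x_1}\sum_{i} a_{i1}x_i & \dfrac{\partial}{\partial x_1}\sum_{i} a_{i2}x_i & \cdots & \dfrac{\partial}{\partial x_1}\sum_{i} a_{ip}x_i \\
\dfrac{\partial}{\partial x_2}\sum_{i} a_{i1}x_i & \dfrac{\partial}{\partial x_2}\sum_{i} a_{i2}x_i & \cdots & \dfrac{\partial}{\partial x_2}\sum_{i} a_{ip}x_i \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial}{\partial x_p}\sum_{i} a_{i1}x_i & \dfrac{\partial}{\partial x_p}\sum_{i} a_{i2}x_i & \cdots & \dfrac{\partial}{\partial x_p}\sum_{i} a_{ip}x_i
\end{bmatrix} \\
&= \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1p} \\
a_{21} & a_{22} & \cdots & a_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
a_{p1} & a_{p2} & \cdots & a_{pp}
\end{bmatrix} = A
\end{aligned}
$$
The same computation applied to $A\mathbf{x}$ gives $A^T$ in this layout.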
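The layout matters for Theorem (3), so a numerical check is useful. The sketch below assumes the convention in which the gradient of a scalar is a column vector, so that entry $(i, j)$ of the derivative of a vector function $g$ is $\partial g_j / \partial x_i$. Under that assumption it checks that the derivative of $A^T\mathbf{x}$ is $A$. The helper names and sample values are hypothetical.

```python
def transpose_times(A, x):
    """Return A^T x for a square list-of-lists A and a list x."""
    p = len(x)
    return [sum(A[i][j] * x[i] for i in range(p)) for j in range(p)]


def derivative_matrix(g, x, h=1e-6):
    """Entry (i, j) is the central-difference estimate of d g(x)_j / d x_i."""
    rows = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        g_forward, g_backward = g(forward), g(backward)
        rows.append([(a - b) / (2 * h) for a, b in zip(g_forward, g_backward)])
    return rows


A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 4.0],
     [0.5, 0.0, -2.0]]                    # deliberately not symmetric
x = [0.3, -1.2, 2.0]

D = derivative_matrix(lambda v: transpose_times(A, v), x)
max_error = max(abs(D[i][j] - A[i][j]) for i in range(3) for j in range(3))
print(max_error)                          # small: the derivative matches A
```

Because `A` is not symmetric, the check would fail if the derivative were $A^T$ instead, which is what distinguishes the two layout conventions.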

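Theorem (4):
(i) $\dfrac{\partial\, \mathbf{x}^TA\mathbf{x}}{\partial \mathbf{x}} = (A + A^T)\mathbf{x}$
(ii) If $A = A^T$, then $\dfrac{\partial\, \mathbf{x}^TA\mathbf{x}}{\partial \mathbf{x}} = 2A\mathbf{x}$ (scalar analogue: $\dfrac{\partial\, ax^2}{\partial x} = 2ax$)

Proof: Let $A = [a_{ij}]_{p \times p}$ and
$$f(\mathbf{x}) = \mathbf{x}^TA\mathbf{x} = \sum_{i=1}^{p}\sum_{j=1}^{p} x_i a_{ij} x_j$$
Differentiating with respect to a single component $x_i$ (the summation indices are renamed to avoid a clash):
$$
\begin{aligned}
\frac{\partial f}{\partial x_i}
&= \frac{\partial}{\partial x_i}\left[\sum_{k=1}^{p}\sum_{j=1}^{p} x_k a_{kj} x_j\right] \\
&= \sum_{j=1}^{p} a_{ij}x_j + \sum_{j=1}^{p} x_j a_{ji} \\
&= \sum_{j=1}^{p} a_{ij}x_j + \sum_{j=1}^{p} a_{ji}x_j
\end{aligned}
$$
The first sum is the $i$-th component of $A\mathbf{x}$ and the second is the $i$-th component of $A^T\mathbf{x}$, so
$$\frac{\partial f}{\partial \mathbf{x}} = A\mathbf{x} + A^T\mathbf{x} = (A + A^T)\mathbf{x}$$
When $A = A^T$ this reduces to $2A\mathbf{x}$, which proves (ii).

Product rule: if $h(x) = f(x)g(x)$, where $f(x)$ and $g(x)$ are both differentiable in $x$, then
$h'(x) = f'(x)g(x) + f(x)g'(x)$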
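To illustrate why Theorem (4) needs $(A + A^T)\mathbf{x}$ rather than $2A\mathbf{x}$ when $A$ is not symmetric, here is a small numerical sketch. The matrix, the point, and the helper names are assumptions made up for the example.

```python
def quadratic_form(A, x):
    """f(x) = x^T A x = sum_i sum_j x_i a_ij x_j."""
    p = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(p) for j in range(p))


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of [df/dx_1, ..., df/dx_p]."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


A = [[1.0, 2.0],
     [0.0, 3.0]]                          # not symmetric
x = [1.0, -1.0]
p = len(x)

approx = numerical_gradient(lambda v: quadratic_form(A, v), x)
# Theorem (4)(i): component i of (A + A^T) x
exact = [sum((A[i][j] + A[j][i]) * x[j] for j in range(p)) for i in range(p)]
# The symmetric-only shortcut 2 A x, which does not apply to this A
shortcut = [2 * sum(A[i][j] * x[j] for j in range(p)) for i in range(p)]

max_error = max(abs(a - e) for a, e in zip(approx, exact))
print(exact, shortcut, max_error)
```

For this sample, $f(\mathbf{x}) = x_1^2 + 2x_1x_2 + 3x_2^2$, so the gradient at $(1, -1)$ is $(0, -4)$ by hand as well, while the shortcut gives $(-2, -6)$.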

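Definition: derivative of a scalar by a matrix

Let $f$ be a function of an $m \times n$ matrix variable
$$X_{m \times n} = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & \cdots & x_{mn}
\end{bmatrix}$$
and assume that every partial derivative $\dfrac{\partial f}{\partial x_{ij}}$ exists. Then
$$\frac{\partial f}{\partial X} = \begin{bmatrix}
\dfrac{\partial f}{\partial x_{11}} & \dfrac{\partial f}{\partial x_{12}} & \cdots & \dfrac{\partial f}{\partial x_{1n}} \\
\dfrac{\partial f}{\partial x_{21}} & \dfrac{\partial f}{\partial x_{22}} & \cdots & \dfrac{\partial f}{\partial x_{2n}} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial f}{\partial x_{m1}} & \dfrac{\partial f}{\partial x_{m2}} & \cdots & \dfrac{\partial f}{\partial x_{mn}}
\end{bmatrix}_{m \times n}$$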

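Definition: derivative of a matrix by a scalar

Let $F = [f_{ij}]$ be an $m \times n$ matrix-valued function of a scalar variable $x$,
and assume that every derivative $\dfrac{\partial f_{ij}}{\partial x}$ exists. Then
$$\frac{\partial F}{\partial x} = \begin{bmatrix}
\dfrac{\partial f_{11}}{\partial x} & \dfrac{\partial f_{12}}{\partial x} & \cdots & \dfrac{\partial f_{1n}}{\partial x} \\
\dfrac{\partial f_{21}}{\partial x} & \dfrac{\partial f_{22}}{\partial x} & \cdots & \dfrac{\partial f_{2n}}{\partial x} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial f_{m1}}{\partial x} & \dfrac{\partial f_{m2}}{\partial x} & \cdots & \dfrac{\partial f_{mn}}{\partial x}
\end{bmatrix}_{m \times n}$$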
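The matrix-by-scalar definition is entry-wise, which the following sketch tries to make concrete. The matrix function `F` below is a made-up example (not from the original post); its hand-written derivative is compared with a central-difference estimate of each entry.

```python
import math


def F(x):
    """A hypothetical 2x2 matrix function of the scalar x."""
    return [[x, x ** 2],
            [math.sin(x), 1.0]]


def dF_exact(x):
    """Entry-wise derivatives d f_ij / d x, written out by hand."""
    return [[1.0, 2 * x],
            [math.cos(x), 0.0]]


def dF_numeric(func, x, h=1e-6):
    """Central-difference estimate of each d f_ij / d x."""
    plus, minus = func(x + h), func(x - h)
    return [[(a - b) / (2 * h) for a, b in zip(row_p, row_m)]
            for row_p, row_m in zip(plus, minus)]


x0 = 0.5
approx = dF_numeric(F, x0)
exact = dF_exact(x0)
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(2) for j in range(2))
print(max_error)                          # small: the two 2x2 matrices agree
```

The result has the same shape as `F(x)` itself, matching the $m \times n$ layout in the definition.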

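Theorem (5): For $X_{n \times p}$, $\dfrac{\partial \operatorname{tr}(X^TX)}{\partial X} = 2X$

Proof:
$$
X^TX = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{bmatrix}^T
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{bmatrix}
= \begin{bmatrix}
\sum_{i=1}^{n} x_{i1}x_{i1} & \sum_{i=1}^{n} x_{i1}x_{i2} & \cdots & \sum_{i=1}^{n} x_{i1}x_{ip} \\
\sum_{i=1}^{n} x_{i2}x_{i1} & \sum_{i=1}^{n} x_{i2}x_{i2} & \cdots & \sum_{i=1}^{n} x_{i2}x_{ip} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{n} x_{ip}x_{i1} & \sum_{i=1}^{n} x_{ip}x_{i2} & \cdots & \sum_{i=1}^{n} x_{ip}x_{ip}
\end{bmatrix}
$$
$$\operatorname{tr}(X^TX) = \sum_{i=1}^{n} x_{i1}x_{i1} + \sum_{i=1}^{n} x_{i2}x_{i2} + \cdots + \sum_{i=1}^{n} x_{ip}x_{ip} = \sum_{i=1}^{n}\sum_{j=1}^{p} x_{ij}^2$$
Only the term $x_{ij}^2$ depends on $x_{ij}$, so
$$\frac{\partial \operatorname{tr}(X^TX)}{\partial x_{ij}} = \frac{\partial}{\partial x_{ij}}\sum_{k=1}^{n}\sum_{l=1}^{p} x_{kl}^2 = 2x_{ij}
\quad\Longrightarrow\quad
\frac{\partial \operatorname{tr}(X^TX)}{\partial X} = 2X$$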
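The scalar-by-matrix definition can be exercised numerically on Theorem (5). The sketch below perturbs one entry $x_{ij}$ at a time and stores the estimated partial derivative at position $(i, j)$, so the result has the same $n \times p$ shape as $X$. The helper names (`transpose`, `matmul`, `trace`, `numerical_matrix_gradient`) and the sample matrix are assumptions for illustration.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def numerical_matrix_gradient(f, X, h=1e-6):
    """Matrix whose (i, j) entry estimates d f / d x_ij by central differences."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad


X = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.5]]                    # n = 2, p = 3

approx = numerical_matrix_gradient(lambda M: trace(matmul(transpose(M), M)), X)
exact = [[2 * v for v in row] for row in X]   # Theorem (5): 2X
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(2) for j in range(3))
print(max_error)
```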

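Theorem (6): For $A_{n \times p}$ and $X_{n \times p}$, $\dfrac{\partial \operatorname{tr}(A^TX)}{\partial X} = A$

Proof:
$$
A^TX = \begin{bmatrix}
a_{11} & a_{21} & \cdots & a_{n1} \\
a_{12} & a_{22} & \cdots & a_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
a_{1p} & a_{2p} & \cdots & a_{np}
\end{bmatrix}
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{bmatrix}
= \begin{bmatrix}
\sum_{i=1}^{n} a_{i1}x_{i1} & \sum_{i=1}^{n} a_{i1}x_{i2} & \cdots & \sum_{i=1}^{n} a_{i1}x_{ip} \\
\sum_{i=1}^{n} a_{i2}x_{i1} & \sum_{i=1}^{n} a_{i2}x_{i2} & \cdots & \sum_{i=1}^{n} a_{i2}x_{ip} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{n} a_{ip}x_{i1} & \sum_{i=1}^{n} a_{ip}x_{i2} & \cdots & \sum_{i=1}^{n} a_{ip}x_{ip}
\end{bmatrix}
$$
$$\operatorname{tr}(A^TX) = \sum_{i=1}^{n}\sum_{j=1}^{p} a_{ij}x_{ij}
\quad\Longrightarrow\quad
\frac{\partial \operatorname{tr}(A^TX)}{\partial x_{ij}} = a_{ij}
\quad\Longrightarrow\quad
\frac{\partial \operatorname{tr}(A^TX)}{\partial X} = A$$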

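Theorem (7): For $A_{n \times n}$, $X_{n \times p}$ and $B_{p \times n}$, $\dfrac{\partial \operatorname{tr}(A^TXB)}{\partial X} = AB^T$

Proof:
$$
\begin{aligned}
A^TXB &= \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}^T
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{bmatrix}
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
b_{p1} & b_{p2} & \cdots & b_{pn}
\end{bmatrix} \\
&= \begin{bmatrix}
\sum_{i=1}^{n} a_{i1}x_{i1} & \sum_{i=1}^{n} a_{i1}x_{i2} & \cdots & \sum_{i=1}^{n} a_{i1}x_{ip} \\
\sum_{i=1}^{n} a_{i2}x_{i1} & \sum_{i=1}^{n} a_{i2}x_{i2} & \cdots & \sum_{i=1}^{n} a_{i2}x_{ip} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{n} a_{in}x_{i1} & \sum_{i=1}^{n} a_{in}x_{i2} & \cdots & \sum_{i=1}^{n} a_{in}x_{ip}
\end{bmatrix}
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
b_{p1} & b_{p2} & \cdots & b_{pn}
\end{bmatrix} \\
&= \begin{bmatrix}
\sum_{j=1}^{p}\sum_{i=1}^{n} a_{i1}x_{ij}b_{j1} & \sum_{j=1}^{p}\sum_{i=1}^{n} a_{i1}x_{ij}b_{j2} & \cdots & \sum_{j=1}^{p}\sum_{i=1}^{n} a_{i1}x_{ij}b_{jn} \\
\sum_{j=1}^{p}\sum_{i=1}^{n} a_{i2}x_{ij}b_{j1} & \sum_{j=1}^{p}\sum_{i=1}^{n} a_{i2}x_{ij}b_{j2} & \cdots & \sum_{j=1}^{p}\sum_{i=1}^{n} a_{i2}x_{ij}b_{jn} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{j=1}^{p}\sum_{i=1}^{n} a_{in}x_{ij}b_{j1} & \sum_{j=1}^{p}\sum_{i=1}^{n} a_{in}x_{ij}b_{j2} & \cdots & \sum_{j=1}^{p}\sum_{i=1}^{n} a_{in}x_{ij}b_{jn}
\end{bmatrix}
\end{aligned}
$$
$$\operatorname{tr}(A^TXB) = \sum_{k=1}^{n}\sum_{j=1}^{p}\sum_{i=1}^{n} a_{ik}x_{ij}b_{jk}
\quad\Longrightarrow\quad
\frac{\partial \operatorname{tr}(A^TXB)}{\partial x_{ij}} = \sum_{k=1}^{n} a_{ik}b_{jk}
\quad\Longrightarrow\quad
\frac{\partial \operatorname{tr}(A^TXB)}{\partial X} = AB^T$$
since
$$AB^T = \begin{bmatrix}
\sum_{i=1}^{n} a_{1i}b_{1i} & \sum_{i=1}^{n} a_{1i}b_{2i} & \cdots & \sum_{i=1}^{n} a_{1i}b_{pi} \\
\sum_{i=1}^{n} a_{2i}b_{1i} & \sum_{i=1}^{n} a_{2i}b_{2i} & \cdots & \sum_{i=1}^{n} a_{2i}b_{pi} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{n} a_{ni}b_{1i} & \sum_{i=1}^{n} a_{ni}b_{2i} & \cdots & \sum_{i=1}^{n} a_{ni}b_{pi}
\end{bmatrix}$$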
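The index bookkeeping in Theorem (7) is easy to get wrong, so a numerical cross-check may help. The following sketch, with made-up matrices of shapes $2 \times 2$, $2 \times 3$ and $3 \times 2$, compares a finite-difference gradient of $\operatorname{tr}(A^TXB)$ with $AB^T$. All helper names are hypothetical.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def numerical_matrix_gradient(f, X, h=1e-6):
    """Matrix whose (i, j) entry estimates d f / d x_ij by central differences."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad


A = [[1.0, 2.0], [3.0, 4.0]]                      # n x n, n = 2
X = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]          # n x p, p = 3
B = [[1.0, 0.0], [2.0, -1.0], [0.0, 3.0]]         # p x n


def f(M):
    """f(X) = tr(A^T X B)."""
    return trace(matmul(matmul(transpose(A), M), B))


approx = numerical_matrix_gradient(f, X)
exact = matmul(A, transpose(B))                   # Theorem (7): A B^T, shape n x p
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(2) for j in range(3))
print(exact, max_error)
```

Since $\operatorname{tr}(A^TXB)$ is linear in $X$, the gradient does not depend on the particular `X` chosen.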

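Theorem (8): For $A_{n \times n}$ and $X_{n \times p}$, $\dfrac{\partial \operatorname{tr}(X^TAX)}{\partial X} = (A + A^T)X$

Proof:
$$
\begin{aligned}
X^TAX &= \begin{bmatrix}
x_{11} & x_{21} & \cdots & x_{n1} \\
x_{12} & x_{22} & \cdots & x_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
x_{1p} & x_{2p} & \cdots & x_{np}
\end{bmatrix}
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{bmatrix} \\
&= \begin{bmatrix}
\sum_{i=1}^{n} x_{i1}a_{i1} & \sum_{i=1}^{n} x_{i1}a_{i2} & \cdots & \sum_{i=1}^{n} x_{i1}a_{in} \\
\sum_{i=1}^{n} x_{i2}a_{i1} & \sum_{i=1}^{n} x_{i2}a_{i2} & \cdots & \sum_{i=1}^{n} x_{i2}a_{in} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{n} x_{ip}a_{i1} & \sum_{i=1}^{n} x_{ip}a_{i2} & \cdots & \sum_{i=1}^{n} x_{ip}a_{in}
\end{bmatrix}
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{bmatrix} \\
&= \begin{bmatrix}
\sum_{j=1}^{n}\sum_{i=1}^{n} x_{i1}a_{ij}x_{j1} & \sum_{j=1}^{n}\sum_{i=1}^{n} x_{i1}a_{ij}x_{j2} & \cdots & \sum_{j=1}^{n}\sum_{i=1}^{n} x_{i1}a_{ij}x_{jp} \\
\sum_{j=1}^{n}\sum_{i=1}^{n} x_{i2}a_{ij}x_{j1} & \sum_{j=1}^{n}\sum_{i=1}^{n} x_{i2}a_{ij}x_{j2} & \cdots & \sum_{j=1}^{n}\sum_{i=1}^{n} x_{i2}a_{ij}x_{jp} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{j=1}^{n}\sum_{i=1}^{n} x_{ip}a_{ij}x_{j1} & \sum_{j=1}^{n}\sum_{i=1}^{n} x_{ip}a_{ij}x_{j2} & \cdots & \sum_{j=1}^{n}\sum_{i=1}^{n} x_{ip}a_{ij}x_{jp}
\end{bmatrix}
\end{aligned}
$$
$$\operatorname{tr}(X^TAX) = \sum_{k=1}^{p}\sum_{j=1}^{n}\sum_{i=1}^{n} x_{ik}a_{ij}x_{jk}$$
Each entry $x_{ik}$ appears once as the left factor and once as the right factor, so
$$\frac{\partial \operatorname{tr}(X^TAX)}{\partial x_{ik}} = \sum_{j=1}^{n} a_{ij}x_{jk} + \sum_{j=1}^{n} x_{jk}a_{ji} = \sum_{j=1}^{n} a_{ij}x_{jk} + \sum_{j=1}^{n} a_{ji}x_{jk}$$
The two sums are the $(i, k)$ entries of $AX$ and $A^TX$ respectively, hence
$$\frac{\partial \operatorname{tr}(X^TAX)}{\partial X} = (A + A^T)X$$
The differentiation step uses the product rule: if $h(x) = f(x)g(x)$ with $f(x)$ and $g(x)$ both differentiable in $x$, then
$h'(x) = f'(x)g(x) + f(x)g'(x)$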
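Theorem (8) can be checked the same way. This is a minimal sketch with an arbitrary non-symmetric $A$ (so that $A + A^T \neq 2A$); the helper names and the sample data are assumptions, not part of the original derivation.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def numerical_matrix_gradient(f, X, h=1e-6):
    """Matrix whose (i, j) entry estimates d f / d x_ij by central differences."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad


A = [[1.0, 2.0], [0.0, 3.0]]                      # n x n, not symmetric
X = [[1.0, -1.0, 0.5], [2.0, 0.0, -2.0]]          # n x p with n = 2, p = 3
n = len(A)


def f(M):
    """f(X) = tr(X^T A X)."""
    return trace(matmul(matmul(transpose(M), A), M))


A_plus_AT = [[A[i][j] + A[j][i] for j in range(n)] for i in range(n)]
exact = matmul(A_plus_AT, X)                      # Theorem (8): (A + A^T) X
approx = numerical_matrix_gradient(f, X)
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(2) for j in range(3))
print(max_error)
```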

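Theorem (9): For $A_{p \times p}$ and $X_{n \times p}$, $\dfrac{\partial \operatorname{tr}(XAX^T)}{\partial X} = X(A + A^T)$

Proof:
$$
\begin{aligned}
XAX^T &= \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{bmatrix}
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1p} \\
a_{21} & a_{22} & \cdots & a_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
a_{p1} & a_{p2} & \cdots & a_{pp}
\end{bmatrix}
\begin{bmatrix}
x_{11} & x_{21} & \cdots & x_{n1} \\
x_{12} & x_{22} & \cdots & x_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
x_{1p} & x_{2p} & \cdots & x_{np}
\end{bmatrix} \\
&= \begin{bmatrix}
\sum_{i=1}^{p} x_{1i}a_{i1} & \sum_{i=1}^{p} x_{1i}a_{i2} & \cdots & \sum_{i=1}^{p} x_{1i}a_{ip} \\
\sum_{i=1}^{p} x_{2i}a_{i1} & \sum_{i=1}^{p} x_{2i}a_{i2} & \cdots & \sum_{i=1}^{p} x_{2i}a_{ip} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{p} x_{ni}a_{i1} & \sum_{i=1}^{p} x_{ni}a_{i2} & \cdots & \sum_{i=1}^{p} x_{ni}a_{ip}
\end{bmatrix}
\begin{bmatrix}
x_{11} & x_{21} & \cdots & x_{n1} \\
x_{12} & x_{22} & \cdots & x_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
x_{1p} & x_{2p} & \cdots & x_{np}
\end{bmatrix} \\
&= \begin{bmatrix}
\sum_{j=1}^{p}\sum_{i=1}^{p} x_{1i}a_{ij}x_{1j} & \sum_{j=1}^{p}\sum_{i=1}^{p} x_{1i}a_{ij}x_{2j} & \cdots & \sum_{j=1}^{p}\sum_{i=1}^{p} x_{1i}a_{ij}x_{nj} \\
\sum_{j=1}^{p}\sum_{i=1}^{p} x_{2i}a_{ij}x_{1j} & \sum_{j=1}^{p}\sum_{i=1}^{p} x_{2i}a_{ij}x_{2j} & \cdots & \sum_{j=1}^{p}\sum_{i=1}^{p} x_{2i}a_{ij}x_{nj} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{j=1}^{p}\sum_{i=1}^{p} x_{ni}a_{ij}x_{1j} & \sum_{j=1}^{p}\sum_{i=1}^{p} x_{ni}a_{ij}x_{2j} & \cdots & \sum_{j=1}^{p}\sum_{i=1}^{p} x_{ni}a_{ij}x_{nj}
\end{bmatrix}
\end{aligned}
$$
$$\operatorname{tr}(XAX^T) = \sum_{k=1}^{n}\sum_{j=1}^{p}\sum_{i=1}^{p} x_{ki}a_{ij}x_{kj}$$
$$\frac{\partial \operatorname{tr}(XAX^T)}{\partial x_{ki}} = \sum_{j=1}^{p} a_{ij}x_{kj} + \sum_{j=1}^{p} x_{kj}a_{ji} = \sum_{j=1}^{p} x_{kj}a_{ji} + \sum_{j=1}^{p} x_{kj}a_{ij}$$
The two sums are the $(k, i)$ entries of $XA$ and $XA^T$ respectively, hence
$$\frac{\partial \operatorname{tr}(XAX^T)}{\partial X} = X(A + A^T)$$
The differentiation step uses the product rule: if $h(x) = f(x)g(x)$ with $f(x)$ and $g(x)$ both differentiable in $x$, then
$h'(x) = f'(x)g(x) + f(x)g'(x)$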
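In Theorem (9) the factor $(A + A^T)$ multiplies $X$ from the right, because here $A$ is $p \times p$ and acts on the columns of $X$. The sketch below, using hypothetical sample matrices and helper names, checks $\partial \operatorname{tr}(XAX^T) / \partial X = X(A + A^T)$ by finite differences.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


def numerical_matrix_gradient(f, X, h=1e-6):
    """Matrix whose (i, j) entry estimates d f / d x_ij by central differences."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad


A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0],
     [4.0, 0.0, 2.0]]                             # p x p, p = 3, not symmetric
X = [[1.0, 0.5, -1.0],
     [2.0, -2.0, 0.0]]                            # n x p, n = 2
p = len(A)


def f(M):
    """f(X) = tr(X A X^T)."""
    return trace(matmul(matmul(M, A), transpose(M)))


A_plus_AT = [[A[i][j] + A[j][i] for j in range(p)] for i in range(p)]
exact = matmul(X, A_plus_AT)                      # Theorem (9): X (A + A^T), n x p
approx = numerical_matrix_gradient(f, X)
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(2) for j in range(3))
print(max_error)
```

With these shapes the product $(A + A^T)X$ is not even defined, which is a quick way to remember which side the factor belongs on.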

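Theorem (10): For an invertible $X_{p \times p}$, $\dfrac{\partial |X|}{\partial X} = |X|(X^{-1})^T$

Proof: Expanding the determinant along row $i$,
$$|X| = \det X = \sum_{j=1}^{p} x_{ij}(-1)^{i+j}\det \tilde{X}_{ij} = \sum_{j=1}^{p} x_{ij}c_{ij}$$
where $\tilde{X}_{ij}$ is $X$ with row $i$ and column $j$ deleted and $c_{ij}$ is the corresponding cofactor. None of the cofactors $c_{i1}, \dots, c_{ip}$ contains an entry from row $i$, so they are constants with respect to $x_{ij}$ and
$$\frac{\partial \det X}{\partial x_{ij}} = \frac{\partial \sum_{k} x_{ik}c_{ik}}{\partial x_{ij}} = c_{ij} = \left((\operatorname{adj} X)^T\right)_{ij} = \left((\det X)(X^{-1})^T\right)_{ij}$$
Therefore
$$\frac{\partial |X|}{\partial X} = |X|(X^{-1})^T$$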
----------------------------------------------------------------------------------------------------------------------------------------
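For an $n \times n$ matrix $X$, cofactor expansion along row $i$ gives
$$\det X = \sum_{j=1}^{n} x_{ij}c_{ij} = \sum_{j=1}^{n} (-1)^{i+j}x_{ij}\det \tilde{X}_{ij}$$
Collecting the cofactors into $C = [c_{ij}]$:
$$\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nn}
\end{bmatrix}
\begin{bmatrix}
c_{11} & c_{21} & \cdots & c_{n1} \\
c_{12} & c_{22} & \cdots & c_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
c_{1n} & c_{2n} & \cdots & c_{nn}
\end{bmatrix}
= \begin{bmatrix}
\det X & 0 & \cdots & 0 \\
0 & \det X & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \det X
\end{bmatrix}
\quad\Longrightarrow\quad XC^T = (\det X)I$$
$C^T$ is the adjugate (also called the classical adjoint) of $X$, written $\operatorname{adj} X$.
If $X$ is invertible, then $C^T = (\det X)X^{-1}$.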

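Why do only the diagonal entries equal $\det X$, with every other entry $0$?
If $i \neq j$, the entry in row $i$, column $j$ of $XC^T$ is $\sum_{k=1}^{n} x_{ik}c_{jk}$.
By the Laplace expansion, this sum equals $0$
(it is the determinant of the matrix obtained from $X$ by replacing row $j$ with row $i$; two rows are then identical, so the determinant is $0$).
Take the product of row 2 of $X$ with column 1 of $C^T$ as an example:
consider the term $x_{21}c_{11}$ as in the figure (left: the current computation; right: the usual cofactor expansion).
$$\sum_{k=1}^{n} x_{2k}c_{1k} = \det\begin{bmatrix}
x_{21} & x_{22} & \cdots & x_{2n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nn}
\end{bmatrix} = 0$$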
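The identity $XC^T = (\det X)I$, including the vanishing off-diagonal entries, can be seen directly on a small example. The sketch below computes determinants by cofactor expansion along the first row; the function names and the integer sample matrix are assumptions chosen so that the arithmetic is exact.

```python
def minor(X, i, j):
    """X with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(X) if k != i]


def det(X):
    """Determinant by cofactor expansion along the first row."""
    if len(X) == 1:
        return X[0][0]
    return sum((-1) ** j * X[0][j] * det(minor(X, 0, j)) for j in range(len(X)))


def cofactor_matrix(X):
    """C with c_ij = (-1)^(i+j) * det of the (i, j) minor."""
    n = len(X)
    return [[(-1) ** (i + j) * det(minor(X, i, j)) for j in range(n)]
            for i in range(n)]


X = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]
n = len(X)
C = cofactor_matrix(X)

# Entry (i, j) of X C^T is sum_k x_ik * c_jk
product = [[sum(X[i][k] * C[j][k] for k in range(n)) for j in range(n)]
           for i in range(n)]

print(det(X))     # 39
print(product)    # [[39, 0, 0], [0, 39, 0], [0, 0, 39]]
```

The off-diagonal zeros are exactly the "row of $X$ times cofactors of a different row" sums discussed above.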

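Theorem (11): For an invertible $X_{p \times p}$ with $|X| > 0$, $\dfrac{\partial \ln|X|}{\partial X} = (X^{-1})^T$

Proof: By Theorem (10) and the chain rule,
$$\frac{\partial \ln|X|}{\partial X} = \frac{1}{|X|}\frac{\partial |X|}{\partial X} = \frac{1}{|X|}\,|X|(X^{-1})^T = (X^{-1})^T$$
Chain rule:
if $y = f(u)$ is differentiable with respect to $u$, and $u = g(x)$ is differentiable with respect to $x$,
then $y = f(g(x))$ is differentiable with respect to $x$,
and $\dfrac{dy}{dx} = \dfrac{dy}{du}\cdot\dfrac{du}{dx}$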
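Theorems (10) and (11) can both be checked on one small matrix. The sketch below assumes a sample $3 \times 3$ matrix with positive determinant and builds $(X^{-1})^T$ from the cofactor matrix as $C / \det X$; the helper names are hypothetical, and the finite-difference gradient is computed independently of that formula.

```python
import math


def minor(X, i, j):
    """X with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(X) if k != i]


def det(X):
    """Determinant by cofactor expansion along the first row."""
    if len(X) == 1:
        return X[0][0]
    return sum((-1) ** j * X[0][j] * det(minor(X, 0, j)) for j in range(len(X)))


def cofactor_matrix(X):
    n = len(X)
    return [[(-1) ** (i + j) * det(minor(X, i, j)) for j in range(n)]
            for i in range(n)]


def numerical_matrix_gradient(f, X, h=1e-6):
    """Matrix whose (i, j) entry estimates d f / d x_ij by central differences."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad


X = [[2.0, 0.0, 1.0],
     [1.0, 3.0, -1.0],
     [0.0, 5.0, 4.0]]                             # det X = 39 > 0
n = len(X)
d = det(X)
C = cofactor_matrix(X)
inv_T = [[C[i][j] / d for j in range(n)] for i in range(n)]   # (X^{-1})^T = C / det X

grad_det = numerical_matrix_gradient(det, X)
grad_logdet = numerical_matrix_gradient(lambda M: math.log(det(M)), X)

# Theorem (10): d|X|/dX = |X| (X^{-1})^T;  Theorem (11): d ln|X| / dX = (X^{-1})^T
err_det = max(abs(grad_det[i][j] - d * inv_T[i][j])
              for i in range(n) for j in range(n))
err_logdet = max(abs(grad_logdet[i][j] - inv_T[i][j])
                 for i in range(n) for j in range(n))
print(err_det, err_logdet)
```

The recursive determinant is only practical for tiny matrices; it is used here because it mirrors the cofactor expansion in the proof.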
