[ML] Backpropagation

Based on Professor Hung-yi Lee's (李宏毅) course

Introduction: Backpropagation

Intuition
Simple derivation
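Where the total differential comes in

The total differential of L sums the contribution of every intermediate variable u_i. Treating each u_i as a function of x then gives the multivariable chain rule in the last line.

$$ \begin{align*}
dL(u_1,u_2,\cdots ,u_N) &= du_1 \cdot \frac{\partial L}{\partial u_1} + du_2 \cdot \frac{\partial L}{\partial u_2} + \cdots + du_N \cdot \frac{\partial L}{\partial u_N} \\
&= \sum_{i=1}^{N} du_i \cdot \frac{\partial L}{\partial u_i}\\
\frac{\partial L(u_1,u_2,\cdots ,u_N)}{\partial x} &= \sum_{i=1}^{N} \frac{\partial u_i}{\partial x} \cdot \frac{\partial L}{\partial u_i}\\
\end{align*} $$

Derivation

Here z is the weighted input of a neuron that receives x_1 through the weight W_1, a = f(z) is its activation, and z' and z'' are the weighted inputs of the two next-layer neurons that receive a through W_3 and W_4. C is the cost.

$$ \begin{align*}
\frac{\partial C}{\partial W_1} &= \frac{\partial z}{\partial W_1}\frac{\partial C}{\partial z}\\
\frac{\partial z}{\partial W_1} &= x_1\\
\frac{\partial C}{\partial z} &= \frac{\partial a}{\partial z}\frac{\partial C}{\partial a}\\
&= {f}'(z)\left ( \frac{\partial {z}'}{\partial a}\frac{\partial C}{\partial {z}'} + \frac{\partial {z}''}{\partial a}\frac{\partial C}{\partial {z}''}\right )\\
&= {f}'(z)\left ( W_3\frac{\partial C}{\partial {z}'} + W_4\frac{\partial C}{\partial {z}''}\right )\\
\end{align*} $$

The last line is the backward pass: the gradient at z is obtained from the gradients already computed at z' and z'', weighted by W_3 and W_4 and scaled by f'(z).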
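The derivation above can be checked numerically. The following is a minimal sketch, not the course's code: it assumes a sigmoid activation for f, no bias terms, output activations y1 = f(z') and y2 = f(z''), and a squared-error cost C. None of these choices come from the original notes, and the function names are made up for illustration. The analytic gradient follows the formula for ∂C/∂W_1 and is compared against a central finite difference.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(W1, W2, W3, W4, x1, x2, t1, t2):
    """Forward pass of the assumed toy network; returns intermediates and cost."""
    z = W1 * x1 + W2 * x2          # weighted input of the hidden neuron
    a = sigmoid(z)                 # a = f(z)
    z_p = W3 * a                   # z'
    z_pp = W4 * a                  # z''
    y1, y2 = sigmoid(z_p), sigmoid(z_pp)
    C = 0.5 * ((y1 - t1) ** 2 + (y2 - t2) ** 2)   # assumed squared-error cost
    return a, y1, y2, C

def backprop_dC_dW1(W1, W2, W3, W4, x1, x2, t1, t2):
    """dC/dW1 = x1 * f'(z) * (W3 * dC/dz' + W4 * dC/dz'')."""
    a, y1, y2, _ = forward(W1, W2, W3, W4, x1, x2, t1, t2)
    dC_dzp = (y1 - t1) * y1 * (1.0 - y1)      # dC/dz' for the assumed cost
    dC_dzpp = (y2 - t2) * y2 * (1.0 - y2)     # dC/dz''
    f_prime = a * (1.0 - a)                   # sigmoid derivative f'(z)
    dC_dz = f_prime * (W3 * dC_dzp + W4 * dC_dzpp)
    return x1 * dC_dz                         # dz/dW1 = x1

def numeric_dC_dW1(W1, W2, W3, W4, x1, x2, t1, t2, eps=1e-6):
    """Central finite difference on W1, used only as a sanity check."""
    c_plus = forward(W1 + eps, W2, W3, W4, x1, x2, t1, t2)[-1]
    c_minus = forward(W1 - eps, W2, W3, W4, x1, x2, t1, t2)[-1]
    return (c_plus - c_minus) / (2.0 * eps)

# Arbitrary illustrative values
params = dict(W1=0.5, W2=-0.3, W3=0.8, W4=-0.6, x1=1.0, x2=2.0, t1=1.0, t2=0.0)
analytic = backprop_dC_dW1(**params)
numeric = numeric_dC_dW1(**params)
print(analytic, numeric)
```

If the derivation holds, the two printed numbers should agree to several decimal places. The remaining difference comes from the finite-difference approximation.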
Matrix form

$$ \begin{align*}
\frac{\partial C}{\partial \mathbf{w_1}} &= \frac{\partial z}{\partial \mathbf{w_1}}\frac{\partial C}{\partial z}\\
&= \mathbf{x}^T {f}'(z)\mathbf{w_2} \begin{bmatrix} \frac{\partial C}{\partial {z}'}\\ \frac{\partial C}{\partial {z}''} \end{bmatrix} \\
\end{align*} $$

where the input is a column vector and the weights are row vectors:

$$ \begin{align*}
\mathbf{x} &= \begin{bmatrix} x_1\\ x_2 \end{bmatrix}\\
\mathbf{w_1} &= \begin{bmatrix} W_1& W_2 \end{bmatrix}\\
\mathbf{w_2} &= \begin{bmatrix} W_3& W_4 \end{bmatrix}\\
\end{align*} $$
With x, w_1 and w_2 as defined above, stack the next-layer weighted inputs into a column vector:

$$ \begin{align*}
\mathbf{z_2} &= \begin{bmatrix} {z}'\\ {z}'' \end{bmatrix}\\
\end{align*} $$

The matrix form then follows step by step. Here f'(z) and the product of w_2 with the gradient column are both scalars, so the result is a row vector with the same shape as w_1.

$$ \begin{align*}
\frac{\partial C}{\partial \mathbf{w_1}} &= \frac{\partial z}{\partial \mathbf{w_1}}\frac{\partial C}{\partial z}\\
&= \begin{bmatrix} \frac{\partial z}{\partial W_1}& \frac{\partial z}{\partial W_2} \end{bmatrix} \frac{\partial a}{\partial z}\frac{\partial C}{\partial a} \\
&= \mathbf{x}^T {f}'(z)\frac{\partial \mathbf{z_2}}{\partial a}\frac{\partial C}{\partial \mathbf{z_2}} \\
&= \mathbf{x}^T {f}'(z)\begin{bmatrix} \frac{\partial {z}'}{\partial a} & \frac{\partial {z}''}{\partial a} \end{bmatrix} \frac{\partial C}{\partial \mathbf{z_2}} \\
&= \mathbf{x}^T {f}'(z)\mathbf{w_2} \begin{bmatrix} \frac{\partial C}{\partial {z}'}\\ \frac{\partial C}{\partial {z}''} \end{bmatrix} \\
\end{align*} $$
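The matrix form can be sketched the same way. This is an illustrative check under assumptions that are not in the original notes: f is a sigmoid, there are no bias terms, the outputs are f(z') and f(z''), and C is a squared-error cost. Plain Python lists stand in for the vectors x, w_1 and w_2 so the sketch has no dependencies, and all helper names are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def dot(u, v):
    """Inner product of two equal-length lists (row vector times column vector)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cost(w1, w2, x, t):
    """Forward pass of the assumed toy network, returning the cost C."""
    a = sigmoid(dot(w1, x))                   # z = w1 x, a = f(z)
    y = [sigmoid(w * a) for w in w2]          # outputs f(z'), f(z'')
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, t))

def grad_w1(w1, w2, x, t):
    """dC/dw1 = x^T * f'(z) * (w2 . dC/dz2), returned as a row (list)."""
    a = sigmoid(dot(w1, x))
    y = [sigmoid(w * a) for w in w2]
    # dC/dz2 for the assumed squared-error cost with sigmoid outputs
    dC_dz2 = [(yi - ti) * yi * (1.0 - yi) for yi, ti in zip(y, t)]
    scalar = a * (1.0 - a) * dot(w2, dC_dz2)  # f'(z) times the scalar w2 . dC/dz2
    return [xi * scalar for xi in x]          # x^T scaled by that scalar

def numeric_grad_w1(w1, w2, x, t, eps=1e-6):
    """Central finite differences on each entry of w1, as a sanity check."""
    grads = []
    for i in range(len(w1)):
        plus, minus = list(w1), list(w1)
        plus[i] += eps
        minus[i] -= eps
        grads.append((cost(plus, w2, x, t) - cost(minus, w2, x, t)) / (2.0 * eps))
    return grads

# Arbitrary illustrative values
w1, w2 = [0.5, -0.3], [0.8, -0.6]
x, t = [1.0, 2.0], [1.0, 0.0]
analytic = grad_w1(w1, w2, x, t)
numeric = numeric_grad_w1(w1, w2, x, t)
print(analytic)
print(numeric)
```

Because the backpropagated factor is a single scalar, every entry of the gradient is that scalar times the matching input, which is why the result has the shape of x^T.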

References

深度學習的數學地圖:用 Python 實作神經網路的數學模型
機器學習的數學基礎 : AI、深度學習打底必讀
