Principle and Derivation of the BP (Backpropagation) Algorithm
The given neural network structure:

And the given conditions:
- $\mathbf{a}^{(j)}=\mathbf{f}\left(\mathbf{z}^{(j)}\right)$
- $\mathbf{z}^{(j)}=\mathbf{W}^{(j)}\mathbf{a}^{(j-1)}+\mathbf{b}^{(j)}$, where $\mathbf{\theta}^{(j)}=\left\{\mathbf{W}^{(j)},\mathbf{b}^{(j)}\right\}$
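The two conditions above amount to a layer-by-layer forward pass. The following is a minimal sketch under stated assumptions, not the original author's code: the activation f is taken to be tanh, the helper names `affine` and `forward` are hypothetical, and the weights and input are made-up values for a 2-2-1 network.

```python
import math

def affine(W, a_prev, b):
    """Compute z = W a_prev + b, with W a list of rows and a_prev, b lists."""
    return [sum(w * v for w, v in zip(row, a_prev)) + b_i
            for row, b_i in zip(W, b)]

def forward(x, layers, f=math.tanh):
    """Return all pre-activations z^(j) and activations a^(j).

    layers is a list of (W, b) pairs, i.e. theta^(j) = {W^(j), b^(j)}.
    a[0] is the input x, which plays the role of a^(0).
    f = tanh is an assumption; the conditions leave f unspecified.
    """
    a, z = [x], []
    for W, b in layers:
        z_j = affine(W, a[-1], b)        # z^(j) = W^(j) a^(j-1) + b^(j)
        z.append(z_j)
        a.append([f(v) for v in z_j])    # a^(j) = f(z^(j))
    return z, a

# Illustrative parameters (assumed values, not from the text).
layers = [
    ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1]),  # W^(1), b^(1)
    ([[1.0, -1.0]], [0.2]),                   # W^(2), b^(2)
]
z, a = forward([1.0, 2.0], layers)
print(z[0], a[-1])
```

Storing every z^(j) and a^(j) during the forward pass is a deliberate choice: the backward pass reuses them when evaluating the partial derivatives.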
For the network above, if we want to obtain $\frac{\partial \mathbf{l}}{\partial \mathbf{\theta}^{(j)}}$, we can connect $\mathbf{l}$ and $\mathbf{\theta}^{(j)}$ through $\mathbf{z}^{(j)}$, that is, $\frac{\partial \mathbf{l}}{\partial \mathbf{\theta}^{(j)}}=\frac{\partial \mathbf{l}}{\partial \mathbf{z}^{(j)}}\cdot\frac{\partial \mathbf{z}^{(j)}}{\partial \mathbf{\theta}^{(j)}}$. The connection between $\mathbf{l}$ and $\mathbf{z}^{(j)}$ can in turn be established through $\mathbf{z}^{(j+1)}$: $\frac{\partial \mathbf{l}}{\partial \mathbf{z}^{(j)}}=\frac{\partial \mathbf{l}}{\partial \mathbf{z}^{(j+1)}}\cdot\frac{\partial \mathbf{z}^{(j+1)}}{\partial \mathbf{z}^{(j)}}=\frac{\partial \mathbf{l}}{\partial \mathbf{z}^{(j+1)}}\cdot\frac{\partial \mathbf{z}^{(j+1)}}{\partial \mathbf{a}^{(j)}}\cdot\frac{\partial \mathbf{a}^{(j)}}{\partial \mathbf{z}^{(j)}}$. From this we obtain $\frac{\partial \mathbf{l}}{\partial \mathbf{\theta}^{(j)}}=\frac{\partial \mathbf{l}}{\partial \mathbf{z}^{(j+1)}}\cdot\frac{\partial \mathbf{z}^{(j+1)}}{\partial \mathbf{a}^{(j)}}\cdot\frac{\partial \mathbf{a}^{(j)}}{\partial \mathbf{z}^{(j)}}\cdot\frac{\partial \mathbf{z}^{(j)}}{\partial \mathbf{\theta}^{(j)}}$.
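The chain of partial derivatives can be checked numerically. The sketch below is illustrative only and rests on assumptions the text does not state: f is tanh, l is taken to be the squared error 0.5·‖a^(L) − y‖², and the function names, weights, input, and target are all hypothetical. It propagates ∂l/∂z from the last layer backwards, then compares one entry of ∂l/∂W^(1) with a central finite difference.

```python
import math

def forward(x, layers):
    """Forward pass: z^(j) = W^(j) a^(j-1) + b^(j), a^(j) = tanh(z^(j))."""
    a, z = [x], []
    for W, b in layers:
        z_j = [sum(w * v for w, v in zip(row, a[-1])) + b_i
               for row, b_i in zip(W, b)]
        z.append(z_j)
        a.append([math.tanh(v) for v in z_j])
    return z, a

def loss(x, y, layers):
    """Assumed loss: l = 0.5 * ||a^(L) - y||^2."""
    _, a = forward(x, layers)
    return 0.5 * sum((o - t) ** 2 for o, t in zip(a[-1], y))

def backward(x, y, layers):
    """Return [(dl/dW, dl/db)] per layer, using the chain rule above."""
    _, a = forward(x, layers)
    # Last layer: dl/dz = (a - y) * f'(z), where tanh'(z) = 1 - tanh(z)^2.
    delta = [(o - t) * (1.0 - o * o) for o, t in zip(a[-1], y)]
    grads = [None] * len(layers)
    for j in reversed(range(len(layers))):
        # dz/dW brings in the layer input a[j]; dz/db is the identity.
        grads[j] = ([[d * v for v in a[j]] for d in delta], list(delta))
        if j > 0:
            W = layers[j][0]
            # dl/da for the previous layer: W^T times the current dl/dz.
            da = [sum(W[r][c] * delta[r] for r in range(len(W)))
                  for c in range(len(W[0]))]
            # dl/dz for the previous layer: dl/da * da/dz.
            delta = [g * (1.0 - s * s) for g, s in zip(da, a[j])]
    return grads

# Illustrative 2-2-1 network, input, and target (assumed values).
layers = [
    ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.2]),
]
x, y = [1.0, 2.0], [0.5]
grads = backward(x, y, layers)
analytic = grads[0][0][0][1]          # dl/dW^(1)[0][1] from the chain rule

# Central finite difference on the same weight.
eps = 1e-6
layers[0][0][0][1] += eps
l_plus = loss(x, y, layers)
layers[0][0][0][1] -= 2 * eps
l_minus = loss(x, y, layers)
layers[0][0][0][1] += eps             # restore the original weight
numeric = (l_plus - l_minus) / (2 * eps)
print(analytic, numeric)
```

The two printed values should agree closely, which indicates that multiplying the local derivatives layer by layer reproduces the derivative of l with respect to the parameters.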






