$$W_i \leftarrow W_i - \alpha \frac{\partial L}{\partial W_i} \;\to\; \min(L)$$
$$L = (Y - \hat{Y})^2$$
where $\hat{Y}$ is the ground truth.
$$Y = \varphi(W_3 Y_2) = \varphi(z)$$
$$z(W, Y) = W Y$$
$$Y_2 = f(W_2 Y_1)$$
$$f(x) = \mathrm{ReLU}(x) = \max(x, 0)$$
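The definitions above can be sketched as a forward pass. This is a minimal scalar sketch, not a definitive implementation: it assumes $\varphi$ is the logistic sigmoid (consistent with a derivative of the form $\varphi(z)(1-\varphi(z))$), treats all weights and activations as scalars, and uses hypothetical names (`forward`, `loss`, `w2`, `w3`, `y1`, `y_hat`).

```python
import math


def relu(x):
    # f(x) = ReLU(x) = max(x, 0)
    return max(x, 0.0)


def sigmoid(z):
    # Assumption: phi is the logistic sigmoid.
    return 1.0 / (1.0 + math.exp(-z))


def forward(w2, w3, y1):
    # Y2 = f(W2 * Y1), z = W3 * Y2, Y = phi(z)
    y2 = relu(w2 * y1)
    z = w3 * y2
    y = sigmoid(z)
    return y2, z, y


def loss(y, y_hat):
    # L = (Y - Y_hat)^2, where y_hat is the ground truth
    return (y - y_hat) ** 2


# Illustrative values only.
y2, z, y = forward(w2=0.5, w3=-1.0, y1=2.0)
print(y2, z, y, loss(y, y_hat=1.0))
```

Returning the intermediate values `y2` and `z` alongside `y` is a design choice: the backward pass reuses them, so they do not need to be recomputed.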
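$$\begin{aligned}
\frac{\partial L}{\partial W_3} &= \frac{\partial L}{\partial Y}\cdot\frac{\partial Y}{\partial W_3} \\
&= \frac{\partial L}{\partial Y}\cdot\frac{\partial Y}{\partial z}\cdot\frac{\partial z}{\partial W_3} \\
&= 2(Y-\hat{Y})\cdot\varphi(z)\,(1-\varphi(z))\cdot Y_2 \\
&= 2(Y-\hat{Y})\cdot\varphi(W_3 Y_2)\,(1-\varphi(W_3 Y_2))\cdot Y_2
\end{aligned}$$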
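One way to sanity-check the expression for the gradient with respect to $W_3$ is to compare it against a central finite difference. The sketch below assumes scalar quantities and a logistic sigmoid for $\varphi$; the function names and numeric values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    # Assumption: phi is the logistic sigmoid.
    return 1.0 / (1.0 + math.exp(-z))


def loss_w3(w3, y2, y_hat):
    # L as a function of W3, with Y2 held fixed
    y = sigmoid(w3 * y2)
    return (y - y_hat) ** 2


def grad_w3(w3, y2, y_hat):
    # dL/dW3 = 2 (Y - Y_hat) * phi(z) (1 - phi(z)) * Y2, with z = W3 * Y2
    y = sigmoid(w3 * y2)
    return 2.0 * (y - y_hat) * y * (1.0 - y) * y2


w3, y2, y_hat = -1.0, 1.0, 1.0
analytic = grad_w3(w3, y2, y_hat)

# Central finite difference as an independent numerical estimate.
eps = 1e-6
numeric = (loss_w3(w3 + eps, y2, y_hat) - loss_w3(w3 - eps, y2, y_hat)) / (2 * eps)

print(analytic, numeric)
```

If the two printed values agree to several decimal places, the chain-rule expression is consistent with the loss it was derived from.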
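$$\begin{aligned}
\frac{\partial L}{\partial W_2} &= \frac{\partial L}{\partial Y_2}\cdot\frac{\partial Y_2}{\partial W_2} \\
&= \frac{\partial L}{\partial Y}\cdot\frac{\partial Y}{\partial z}\cdot\frac{\partial z}{\partial Y_2}\cdot\frac{\partial Y_2}{\partial W_2}
\end{aligned}$$
If $W_2 Y_1 \geq 0$, then $f(W_2 Y_1) = W_2 Y_1$, so $\partial Y_2 / \partial W_2 = Y_1$. With $\partial z / \partial Y_2 = W_3$, this gives
$$\frac{\partial L}{\partial W_2} = 2(Y-\hat{Y})\cdot\varphi(z)\,(1-\varphi(z))\cdot W_3\cdot Y_1$$
If $W_2 Y_1 < 0$, then $f(W_2 Y_1) = 0$ and the gradient with respect to $W_2$ is zero.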
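Putting the pieces together, the chain rule for $W_2$ passes through $\partial z / \partial Y_2 = W_3$ and the ReLU derivative, and both gradients feed the update $W_i \leftarrow W_i - \alpha\, \partial L / \partial W_i$. The following is a minimal scalar sketch under the same assumptions as the derivation (logistic sigmoid for $\varphi$, scalar weights). All names, the starting values, and the learning rate `alpha` are hypothetical.

```python
import math


def sigmoid(z):
    # Assumption: phi is the logistic sigmoid.
    return 1.0 / (1.0 + math.exp(-z))


def forward(w2, w3, y1):
    pre = w2 * y1              # argument of the ReLU
    y2 = max(pre, 0.0)         # Y2 = f(W2 * Y1)
    y = sigmoid(w3 * y2)       # Y = phi(W3 * Y2)
    return pre, y2, y


def loss(w2, w3, y1, y_hat):
    _, _, y = forward(w2, w3, y1)
    return (y - y_hat) ** 2


def grads(w2, w3, y1, y_hat):
    pre, y2, y = forward(w2, w3, y1)
    delta = 2.0 * (y - y_hat) * y * (1.0 - y)   # dL/dz
    d_w3 = delta * y2                           # dz/dW3 = Y2
    # ReLU derivative: 1 where W2*Y1 >= 0 (the convention used in the
    # derivation; the derivative at exactly 0 is a choice), else 0.
    relu_grad = 1.0 if pre >= 0 else 0.0
    d_w2 = delta * w3 * relu_grad * y1          # dz/dY2 = W3, dY2/dW2 = Y1
    return d_w2, d_w3


# Illustrative starting point and learning rate.
w2, w3, y1, y_hat, alpha = 0.5, 0.5, 2.0, 1.0, 0.5
initial_loss = loss(w2, w3, y1, y_hat)

for _ in range(100):
    d_w2, d_w3 = grads(w2, w3, y1, y_hat)
    w2 -= alpha * d_w2   # W_i <- W_i - alpha * dL/dW_i
    w3 -= alpha * d_w3

final_loss = loss(w2, w3, y1, y_hat)
print(initial_loss, final_loss)
```

The shared factor `delta` is computed once and reused for both gradients, which mirrors how the two derivations share the terms $\partial L/\partial Y \cdot \partial Y/\partial z$.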