BP Neural Network
1 Structure and Principles of the BP Neural Network

Definitions:
(1) $n_l$: the number of network layers, here 4.
(2) $L_l$: the $l$-th layer; $L_1$ is the input layer, $L_{n_l}$ is the output layer, and the layers in between are hidden layers.
(3) $w_{ij}^{(l)}$: the connection weight between unit $i$ of layer $l+1$ and unit $j$ of layer $l$.
(4) $b_i^{(l)}$: the bias term (activation threshold) associated with unit $i$ of layer $l+1$.
(5) $z_i^{(l)}$: the accumulated weighted input of unit $i$ in layer $l$.
(6) $a_i^{(l)}$: the activation value (output) of unit $i$ in layer $l$.
(7) $h_{w,b}(X)$: the final output of the network.
(8) $S_l$: the number of neurons in layer $l$.
(9) The number of samples is $m$ and the number of features is $n$.
From the definitions above:
Layer 1: when $l = 1$, $a_i^{(1)} = x_i$.
Layer 2 (the input layer has $S_1 = 3$ units, so the sum runs over $j = 1, \dots, 3$):
$$\begin{array}{c} z_{1}^{(2)}=\sum_{j=1}^{3}\left(w_{1 j}^{(1)} a_{j}^{(1)}\right)+b_{1}^{(1)} \\ a_{1}^{(2)}=f\left(z_{1}^{(2)}\right) \\ a_{2}^{(2)}=f\left(w_{21}^{(1)} x_{1}+w_{22}^{(1)} x_{2}+w_{23}^{(1)} x_{3}+b_{2}^{(1)}\right) \\ \cdots \\ a_{4}^{(2)}=f\left(w_{41}^{(1)} x_{1}+w_{42}^{(1)} x_{2}+w_{43}^{(1)} x_{3}+b_{4}^{(1)}\right) \end{array}$$
Layer 3:
$$\begin{array}{c} z_{1}^{(3)}=\sum_{j=1}^{4}\left(w_{1 j}^{(2)} a_{j}^{(2)}\right)+b_{1}^{(2)} \\ a_{1}^{(3)}=f\left(z_{1}^{(3)}\right) \\ \cdots \end{array}$$
Layer 4:
$$\begin{array}{c} z_{1}^{(4)}=\sum_{j=1}^{4}\left(w_{1 j}^{(3)} a_{j}^{(3)}\right)+b_{1}^{(3)} \\ h_{w,b}(X)=\left(a_{1}^{(4)}, a_{2}^{(4)}\right)^{T} \end{array}$$
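The layer-by-layer forward computation above can be sketched in NumPy. This is a minimal illustration rather than a definitive implementation: the sigmoid activation, the small random initial weights, and the `forward` helper name are all assumptions made for the example; the layer sizes follow the 3-4-4-2 network used above.

```python
import numpy as np

def f(z):
    """Sigmoid activation (one common choice; the text leaves f unspecified)."""
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes S_1..S_4 for the example network: 3 inputs, two hidden
# layers of 4 units, 2 outputs.
sizes = [3, 4, 4, 2]
rng = np.random.default_rng(0)

# W[l] holds the weights w^{(l+1)} between consecutive layers (0-based
# list index): row i of W[l] contains the weights into unit i of the
# next layer. b[l] holds the matching bias vector.
W = [rng.normal(scale=0.1, size=(sizes[l + 1], sizes[l])) for l in range(3)]
b = [np.zeros(sizes[l + 1]) for l in range(3)]

def forward(x):
    """a^(1) = x; then z^(l+1) = W^(l) a^(l) + b^(l), a^(l+1) = f(z^(l+1))."""
    a = np.asarray(x, dtype=float)
    activations = [a]
    for Wl, bl in zip(W, b):
        a = f(Wl @ a + bl)
        activations.append(a)
    return activations  # activations[-1] is h_{w,b}(X)

# h has shape (2,): the output vector (a_1^(4), a_2^(4))^T.
h = forward([0.5, -0.2, 0.8])[-1]
```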
2 Implementation Flow of the BP Neural Network
- Perform the forward propagation computation to obtain the activation values of $L_2, L_3, \cdots, L_{n_l}$.
- For the last layer, i.e. layer $n_l$, compute the error: $\delta_{i}^{\left(n_{l}\right)}=-\left(y_{i}-a_{i}^{\left(n_{l}\right)}\right) \cdot f^{\prime}\left(z_{i}^{\left(n_{l}\right)}\right)$
- For $l = n_l-1, n_l-2, n_l-3, \cdots, 2$:
$$\delta_{i}^{(l)}=\sum_{j=1}^{S_{l+1}}\left[\delta_{j}^{(l+1)} \cdot w_{j i}^{(l)}\right] f^{\prime}\left(z_{i}^{(l)}\right)$$
- Update the weights and biases; here, when $l = 1$, $a^{(l)}$ is simply the input $x$:
$$\begin{array}{l} w_{i j}^{(l)}=w_{i j}^{(l)}-\alpha \cdot a_{j}^{(l)} \delta_{i}^{(l+1)} \\ b_{i}^{(l)}=b_{i}^{(l)}-\alpha \cdot \delta_{i}^{(l+1)} \end{array}$$
If regularization is taken into account, the weight update equation becomes: $w_{i j}^{(l)}=w_{i j}^{(l)}(1-\alpha \lambda)-\alpha \cdot a_{j}^{(l)} \delta_{i}^{(l+1)}$
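The full training step above (forward pass, output-layer error, back-propagated errors, and the regularized updates) can be sketched as follows. This is an illustrative sketch, assuming a sigmoid $f$ (so $f'(z) = f(z)(1 - f(z))$) and single-sample updates; the `train_step` helper, the 3-4-4-2 layer sizes, and the learning-rate and regularization defaults are choices made for the example, not part of the article.

```python
import numpy as np

def f(z):
    return 1.0 / (1.0 + np.exp(-z))

def f_prime(z):
    s = f(z)
    return s * (1.0 - s)  # sigmoid derivative

sizes = [3, 4, 4, 2]
rng = np.random.default_rng(1)
W = [rng.normal(scale=0.1, size=(sizes[l + 1], sizes[l])) for l in range(3)]
b = [np.zeros(sizes[l + 1]) for l in range(3)]

def train_step(x, y, alpha=0.5, lam=0.0):
    """One forward/backward pass on a single sample.

    Returns the network output computed *before* this step's update.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    # Forward pass, keeping every z^(l) and a^(l) for the backward pass.
    zs, activations = [], [x]
    a = x
    for Wl, bl in zip(W, b):
        z = Wl @ a + bl
        zs.append(z)
        a = f(z)
        activations.append(a)
    # Output-layer error: delta^(n_l) = -(y - a^(n_l)) * f'(z^(n_l)).
    delta = -(y - activations[-1]) * f_prime(zs[-1])
    # Walk back through the layers, updating W^(l) and b^(l).
    for l in range(len(W) - 1, -1, -1):
        prev_delta = None
        if l > 0:
            # delta_i^(l) = sum_j [delta_j^(l+1) w_ji^(l)] f'(z_i^(l)),
            # computed with the weights *before* this step's update.
            prev_delta = (W[l].T @ delta) * f_prime(zs[l - 1])
        # Regularized update: w = w(1 - alpha*lam) - alpha * a_j^(l) delta_i^(l+1).
        W[l] = W[l] * (1.0 - alpha * lam) - alpha * np.outer(delta, activations[l])
        b[l] = b[l] - alpha * delta
        if prev_delta is not None:
            delta = prev_delta
    return activations[-1]
```

Note that the error for layer $l$ is computed before $W^{(l)}$ is overwritten, so the backward pass uses the same weights as the forward pass, matching the update equations above.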
This article has described the structure and principles of the BP (backpropagation) neural network in detail, including how the activation values of the input, hidden, and output layers are computed. It has also covered forward propagation and the error backpropagation algorithm used to update the weights and biases, illustrating through a worked example how the accumulated weighted inputs and activation functions are applied, and how gradient descent adjusts the network parameters to reduce prediction error.
