Andrew Ng · Machine Learning || chap8 Neural Networks: Representation (notes)

8 Neural Networks: Representation

8-1 Non-linear hypotheses

Non-linear classification (labels 0/1)

8-2 Neurons and the brain

Neural Networks

Origins: algorithms that try to mimic the brain.
Very widely used in the 80s and early 90s; popularity diminished in the late 90s.
Recent resurgence: state-of-the-art technique for many applications.

The “one learning algorithm” hypothesis

Neuro-rewiring experiments

Sensor representation in brain

8-3 Model representation I

Neuron in brain

Neuron model: logistic unit

Sigmoid (logistic) activation function.
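The logistic unit above computes $g(\theta^T x)$ with the sigmoid as $g$. A minimal sketch in NumPy (the weight values are hypothetical, chosen only so the output saturates near 1):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid (logistic) activation: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# A single logistic unit: output = g(theta^T x), with x_0 = 1 as the bias term.
theta = np.array([-30.0, 20.0, 20.0])  # hypothetical example weights
x = np.array([1.0, 1.0, 1.0])          # [x_0 (bias), x_1, x_2]
print(sigmoid(theta @ x))              # saturates close to 1
```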

Neural Network

input layer

hidden layer

output layer

$a_i^{(j)} =$ “activation” of unit $i$ in layer $j$

$\theta^{(j)} =$ matrix of weights controlling the function mapping from layer $j$ to layer $j+1$

$$
\begin{aligned}
a_1^{(2)} &= g(\theta_{10}^{(1)}x_0+\theta_{11}^{(1)}x_1+\theta_{12}^{(1)}x_2+\theta_{13}^{(1)}x_3)\\
a_2^{(2)} &= g(\theta_{20}^{(1)}x_0+\theta_{21}^{(1)}x_1+\theta_{22}^{(1)}x_2+\theta_{23}^{(1)}x_3)\\
a_3^{(2)} &= g(\theta_{30}^{(1)}x_0+\theta_{31}^{(1)}x_1+\theta_{32}^{(1)}x_2+\theta_{33}^{(1)}x_3)\\
h_\theta(x) &= a_1^{(3)} = g(\theta_{10}^{(2)}a_0^{(2)}+\theta_{11}^{(2)}a_1^{(2)}+\theta_{12}^{(2)}a_2^{(2)}+\theta_{13}^{(2)}a_3^{(2)})
\end{aligned}
$$

If a network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j+1$, then $\theta^{(j)}$ will be of dimension $s_{j+1}\times(s_j+1)$.
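The dimension rule can be sanity-checked in code. A sketch for the example network above (3 inputs, 3 hidden units, 1 output; the layer sizes are the ones used in the equations, the zero weights are placeholders):

```python
import numpy as np

# With s_j units in layer j and s_{j+1} units in layer j+1,
# theta^{(j)} has shape (s_{j+1}, s_j + 1): the +1 column handles the bias unit.
s = [3, 3, 1]  # units per layer: 3 inputs, 3 hidden, 1 output
thetas = [np.zeros((s[j + 1], s[j] + 1)) for j in range(len(s) - 1)]
for j, t in enumerate(thetas):
    print(f"theta^({j + 1}) has shape {t.shape}")  # (s_{j+1}, s_j + 1)
```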

8-4 Model representation II

Forward propagation: Vectorized implementation

$x = \begin{bmatrix}x_0\\x_1\\x_2\\x_3\end{bmatrix}$, $z^{(2)} = \begin{bmatrix}z_1^{(2)}\\z_2^{(2)}\\z_3^{(2)}\end{bmatrix}$

$z^{(2)} = \theta^{(1)}x$

$a^{(2)} = g(z^{(2)})$

Add $a_0^{(2)} = 1$

$z^{(3)} = \theta^{(2)}a^{(2)}$

$h_\theta(x) = a^{(3)} = g(z^{(3)})$
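The vectorized steps above can be sketched directly in NumPy. This is a minimal sketch, assuming the 3-input / 3-hidden-unit / 1-output network from the earlier equations; the random weights are hypothetical stand-ins for trained parameters:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, theta1, theta2):
    """Forward propagation, mirroring z^(2) = theta^(1) x, a^(2) = g(z^(2)), etc."""
    a1 = np.insert(x, 0, 1.0)            # add bias unit x_0 = 1
    z2 = theta1 @ a1                     # z^(2) = theta^(1) x
    a2 = np.insert(sigmoid(z2), 0, 1.0)  # a^(2) = g(z^(2)), then add a_0^(2) = 1
    z3 = theta2 @ a2                     # z^(3) = theta^(2) a^(2)
    return sigmoid(z3)                   # h_theta(x) = a^(3) = g(z^(3))

# Hypothetical weights for a 3 -> 3 -> 1 network (shapes follow the dimension rule).
rng = np.random.default_rng(0)
theta1 = rng.normal(size=(3, 4))  # maps layer 1 (3 units + bias) to layer 2
theta2 = rng.normal(size=(1, 4))  # maps layer 2 (3 units + bias) to layer 3
print(forward(np.array([1.0, 0.5, -1.0]), theta1, theta2))
```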

Neural Network learning its own features

Other network architectures

8-5 Examples and intuitions I

XOR/XNOR

AND

OR

8-6 Examples and intuitions II

Negation

Putting it together:

AND, (NOT $x_1$) AND (NOT $x_2$), OR $\longrightarrow$ $x_1$ XNOR $x_2$

(Similar to the combination of multiple logic circuits)
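The combination can be sketched end to end. The weights below are the ones used in the lecture's AND/NOR/OR examples (to the best of my recollection; any similar large-magnitude weights would give the same truth table):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Per-unit weight vectors (bias weight first), as in the lecture examples:
AND_W = np.array([-30.0, 20.0, 20.0])  # g(-30 + 20 x1 + 20 x2) ~ x1 AND x2
NOR_W = np.array([10.0, -20.0, -20.0]) # (NOT x1) AND (NOT x2)
OR_W = np.array([-10.0, 20.0, 20.0])   # x1 OR x2

def unit(theta, inputs):
    """One logistic unit: g(theta^T [1; inputs])."""
    return sigmoid(theta @ np.insert(inputs, 0, 1.0))

def xnor(x1, x2):
    a1 = unit(AND_W, np.array([x1, x2]))   # hidden unit 1: x1 AND x2
    a2 = unit(NOR_W, np.array([x1, x2]))   # hidden unit 2: NOT(x1) AND NOT(x2)
    return unit(OR_W, np.array([a1, a2]))  # output layer: a1 OR a2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xnor(x1, x2)))  # truth table of XNOR
```

Just as with logic circuits, each hidden unit computes a simple function of the inputs, and the output layer combines them into a more complex one.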

Handwritten digit classification

8-7 Multi-class classification

Multiple output units: One-vs-all

Want $h_\theta(x)\approx\begin{bmatrix}1\\0\\0\\0\end{bmatrix}$, $h_\theta(x)\approx\begin{bmatrix}0\\1\\0\\0\end{bmatrix}$, $h_\theta(x)\approx\begin{bmatrix}0\\0\\1\\0\end{bmatrix}$, etc.

Training set: $(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\cdots,(x^{(m)},y^{(m)})$

$y^{(i)}$ is one of $\begin{bmatrix}1\\0\\0\\0\end{bmatrix}$, $\begin{bmatrix}0\\1\\0\\0\end{bmatrix}$, $\begin{bmatrix}0\\0\\1\\0\end{bmatrix}$, $\begin{bmatrix}0\\0\\0\\1\end{bmatrix}$
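A short sketch of how the one-vs-all output is used in practice: the predicted class is the index of the largest output unit, and integer labels are encoded as the one-hot vectors $y^{(i)}$ shown above (the output values here are hypothetical):

```python
import numpy as np

# With 4 output units, h_theta(x) is a length-4 vector; the predicted class is
# the index of the largest output (hypothetical example values).
h = np.array([0.1, 0.05, 0.9, 0.02])     # example network output h_theta(x)
predicted_class = int(np.argmax(h)) + 1  # 1-indexed, as in the course
print(predicted_class)

def one_hot(label, num_classes=4):
    """Encode an integer label (1..num_classes) as the one-hot vector y^(i)."""
    y = np.zeros(num_classes)
    y[label - 1] = 1.0
    return y

print(one_hot(3))  # [0. 0. 1. 0.]
```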
