The Origin of Neural Networks
Neural Network: input layer, hidden layer, output layer
$$\begin{bmatrix}x_0\\x_1\\x_2\\x_3\end{bmatrix}\rightarrow\begin{bmatrix}a_1^{(2)}\\a_2^{(2)}\\a_3^{(2)}\end{bmatrix}\rightarrow h_\Theta(x)$$
$a_i^{(j)}$ = "activation" of unit $i$ in layer $j$
$\Theta^{(j)}$ = matrix of weights controlling function mapping from layer $j$ to layer $j+1$
$$\begin{aligned}
a_1^{(2)} &= g(\Theta_{10}^{(1)}x_0+\Theta_{11}^{(1)}x_1+\Theta_{12}^{(1)}x_2+\Theta_{13}^{(1)}x_3)\\
a_2^{(2)} &= g(\Theta_{20}^{(1)}x_0+\Theta_{21}^{(1)}x_1+\Theta_{22}^{(1)}x_2+\Theta_{23}^{(1)}x_3)\\
a_3^{(2)} &= g(\Theta_{30}^{(1)}x_0+\Theta_{31}^{(1)}x_1+\Theta_{32}^{(1)}x_2+\Theta_{33}^{(1)}x_3)\\
h_\Theta(x) &= a_1^{(3)} = g(\Theta_{10}^{(2)}a_0^{(2)}+\Theta_{11}^{(2)}a_1^{(2)}+\Theta_{12}^{(2)}a_2^{(2)}+\Theta_{13}^{(2)}a_3^{(2)})
\end{aligned}$$
If a network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j+1$, then $\Theta^{(j)}$ will be of dimension $s_{j+1}\times(s_j+1)$. (The $+1$ in the column count comes from the bias unit.)
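As a quick sanity check of the dimension rule, here is a NumPy sketch (the layer sizes are made up for illustration):

```python
import numpy as np

# Hypothetical layer sizes: s_1 = 3 input units, s_2 = 4 hidden units.
s1, s2 = 3, 4

# Theta^(1) maps layer 1 to layer 2, so its shape is s_2 x (s_1 + 1);
# the extra column multiplies the bias unit x_0 = 1.
Theta1 = np.zeros((s2, s1 + 1))
print(Theta1.shape)  # (4, 4)
```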
Vectorization
$$z_k^{(2)}=\Theta_{k,0}^{(1)}x_0+\Theta_{k,1}^{(1)}x_1+\cdots+\Theta_{k,n}^{(1)}x_n$$
$$a_1^{(2)}=g(z_1^{(2)}),\quad a_2^{(2)}=g(z_2^{(2)}),\quad a_3^{(2)}=g(z_3^{(2)})$$
$$x=\begin{bmatrix}x_0\\x_1\\\vdots\\x_n\end{bmatrix}=a^{(1)}\qquad a^{(1)}=\begin{bmatrix}a_0^{(1)}\\a_1^{(1)}\\\vdots\\a_n^{(1)}\end{bmatrix}\qquad z^{(j)}=\begin{bmatrix}z_1^{(j)}\\z_2^{(j)}\\\vdots\\z_n^{(j)}\end{bmatrix}$$
$$z^{(2)}=\Theta^{(1)}x=\Theta^{(1)}a^{(1)}$$

$$a^{(2)}=g(z^{(2)})$$
Add the bias unit $a_0^{(2)}=1$: then $z^{(2)}\in\mathbb{R}^{m}$ while the augmented $a^{(2)}\in\mathbb{R}^{n}$, where $n=m+1$.
$$h_\Theta(x)=a^{(3)}=g(z^{(3)})$$

In general, for each layer:

$$a^{(j)}=g(z^{(j)})$$

$$z^{(j+1)}=\Theta^{(j)}a^{(j)}$$

$$h_\Theta(x)=a^{(j+1)}=g(z^{(j+1)})$$
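The vectorized recurrence can be sketched in NumPy (a minimal illustration; the network shape and weights below are arbitrary, not trained):

```python
import numpy as np

def g(z):
    """Sigmoid activation: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, thetas):
    """Forward propagation: repeatedly apply
    z^(j+1) = Theta^(j) a^(j),  a^(j+1) = g(z^(j+1)),
    prepending the bias unit a_0 = 1 before each layer."""
    a = np.asarray(x, dtype=float)
    for Theta in thetas:
        a = np.insert(a, 0, 1.0)  # add bias unit a_0 = 1
        a = g(Theta @ a)          # z^(j+1) = Theta^(j) a^(j)
    return a

# Arbitrary weights for a 3-4-1 network: shapes 4x(3+1) and 1x(4+1).
rng = np.random.default_rng(0)
thetas = [rng.normal(size=(4, 4)), rng.normal(size=(1, 5))]
h = forward([0.5, -1.0, 2.0], thetas)
print(h.shape)  # (1,)
```

Each `Theta` has one more column than the previous layer's unit count, matching the $s_{j+1}\times(s_j+1)$ dimension rule above.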
Example
Example 1
$$h_\Theta(x)=g(-30+20x_1+20x_2)$$

$$\begin{aligned}
x_1=0,\ x_2=0 &\Rightarrow g(-30)\approx 0\\
x_1=0,\ x_2=1 &\Rightarrow g(-10)\approx 0\\
x_1=1,\ x_2=0 &\Rightarrow g(-10)\approx 0\\
x_1=1,\ x_2=1 &\Rightarrow g(10)\approx 1
\end{aligned}$$

This network computes the logical AND of $x_1$ and $x_2$.
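The four cases can be verified numerically with a short Python sketch (`g` is the sigmoid):

```python
import numpy as np

def g(z):
    """Sigmoid: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# h(x) = g(-30 + 20*x1 + 20*x2) behaves like logical AND:
# the output rounds to 1 only when both inputs are 1.
for x1 in (0, 1):
    for x2 in (0, 1):
        h = g(-30 + 20 * x1 + 20 * x2)
        print(f"x1={x1} x2={x2} h={h:.5f} -> {round(h)}")
```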
Example 2
$$\text{AND: }\Theta^{(1)}=\begin{bmatrix}-30&20&20\end{bmatrix}\qquad \text{NOR: }\Theta^{(1)}=\begin{bmatrix}10&-20&-20\end{bmatrix}\qquad \text{OR: }\Theta^{(1)}=\begin{bmatrix}-10&20&20\end{bmatrix}$$
$$\begin{bmatrix}x_0\\x_1\\x_2\end{bmatrix}\rightarrow\begin{bmatrix}a_1^{(2)}\\a_2^{(2)}\end{bmatrix}\rightarrow\begin{bmatrix}a^{(3)}\end{bmatrix}\rightarrow h_\Theta(x)$$
$$\Theta^{(1)}=\begin{bmatrix}-30&20&20\\10&-20&-20\end{bmatrix}$$

$$\Theta^{(2)}=\begin{bmatrix}-10&20&20\end{bmatrix}$$

The first hidden unit computes AND, the second computes NOR, and the output unit ORs them, so the network as a whole computes $x_1\ \text{XNOR}\ x_2$.
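Putting $\Theta^{(1)}$ and $\Theta^{(2)}$ together, the network's output can be checked with a short forward pass (a Python sketch):

```python
import numpy as np

def g(z):
    """Sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-z))

Theta1 = np.array([[-30.0,  20.0,  20.0],    # hidden unit 1: AND
                   [ 10.0, -20.0, -20.0]])   # hidden unit 2: NOR
Theta2 = np.array([[-10.0,  20.0,  20.0]])   # output unit: OR

def h(x1, x2):
    a1 = np.array([1.0, x1, x2])   # input layer with bias x_0 = 1
    a2 = g(Theta1 @ a1)            # hidden layer activations
    a2 = np.insert(a2, 0, 1.0)     # add bias a_0^(2) = 1
    return g(Theta2 @ a2)[0]       # output h_Theta(x)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(h(x1, x2)))  # 1 exactly when x1 == x2 (XNOR)
```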
Multiclass Classification