Deep Learning and CNN in Practice - CSDN Blog

Article link: https://blog.youkuaiyun.com/weixin_43343116/article/details/93634404

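What I did today

Finished the blog post I didn't complete yesterday
Lecture 4: Backpropagation & Multi-layer Perceptrons & CNN
Implemented part of the code
Wrote today's blog post

I didn't actually get much done today. The main reason is that I'm not familiar with Ubuntu: running the code on my RTX 2080 desktop kept producing strange errors, which have gone away for now after I updated the libraries in my conda environment. On top of that, while studying backpropagation I still haven't fully understood the underlying math (the matrix operations).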

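Backpropagation

The chain rule

The chain rule is the key to BP. We learned it in calculus, but one point needs attention: when taking the partial derivative with respect to a variable, you must sum the contributions of every chain (path) that reaches that variable.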
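To make the "sum over every path" rule concrete, here is a minimal sketch. The function and the names `a`, `b`, `f` are made up for illustration and do not come from the lecture: `x` reaches `f` along two paths, and the two contributions are added. A central-difference estimate is computed alongside as a sanity check.

```python
# Hypothetical example: x feeds two branches, a = x**2 and b = 3*x, and f = a * b.
def forward(x):
    a = x ** 2
    b = 3.0 * x
    return a * b, a, b

x = 2.0
f, a, b = forward(x)

# Local derivatives along each path from f back to x.
dfda = b          # d(a*b)/da
dfdb = a          # d(a*b)/db
dadx = 2.0 * x    # d(x**2)/dx
dbdx = 3.0        # d(3*x)/dx

# Chain rule: add the contributions of both paths that reach x.
dfdx = dfda * dadx + dfdb * dbdx

# Central-difference estimate of the same derivative.
h = 1e-5
dfdx_numeric = (forward(x + h)[0] - forward(x - h)[0]) / (2 * h)

print(dfdx, dfdx_numeric)
```

Dropping either term would give the wrong answer here: since f = 3x^3, the true derivative at x = 2 is 9x^2 = 36, which is only reached when both paths are counted.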

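A simple example

Q: A small question here: why is the gradient at the very start of the backward pass equal to 1?
A: There are two ways to understand it:

The final gradient, i.e. the 1 here, comes from the total differential of the loss function: the derivative of the output with respect to itself is 1.
You can append an extra *1 operation at the end, which leaves the loss function unchanged, but …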

# set some inputs
x = -2; y = 5; z = -4

# perform the forward pass
q = x + y # q becomes 3
f = q * z # f becomes -12

# perform the backward pass (backpropagation) in reverse order:
# first backprop through f = q * z
dfdz = q # df/dz = q, so gradient on z becomes 3
dfdq = z # df/dq = z, so gradient on q becomes -4
# now backprop through q = x + y
dfdx = 1.0 * dfdq # dq/dx = 1. And the multiplication here is the chain rule!
dfdy = 1.0 * dfdq # dq/dy = 1
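One way to convince yourself that the gradients above are right is to compare them with a numerical estimate. The sketch below wraps the same circuit in functions and makes the starting gradient `df = 1.0` explicit. The helper names `circuit`, `circuit_backward`, and `numeric_grad` are my own and not from the lecture.

```python
def circuit(x, y, z):
    """Forward pass of f = (x + y) * z."""
    return (x + y) * z

def circuit_backward(x, y, z):
    """Backward pass, seeded with df/df = 1."""
    q = x + y
    df = 1.0              # gradient of the output with respect to itself
    dfdq = z * df         # multiply gate: local gradient is the other input
    dfdz = q * df
    dfdx = 1.0 * dfdq     # add gate: local gradient is 1
    dfdy = 1.0 * dfdq
    return dfdx, dfdy, dfdz

def numeric_grad(fn, args, i, h=1e-5):
    """Central-difference estimate of d fn / d args[i]."""
    plus = list(args)
    minus = list(args)
    plus[i] += h
    minus[i] -= h
    return (fn(*plus) - fn(*minus)) / (2 * h)

inputs = (-2.0, 5.0, -4.0)
analytic = circuit_backward(*inputs)
numeric = tuple(numeric_grad(circuit, inputs, i) for i in range(3))

print(analytic)
print(numeric)
```

The two tuples should agree up to floating-point error, which is a cheap check to run before trusting a hand-written backward pass.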

Backpropagation through the sigmoid function

$\sigma(x) = \frac{1}{1+e^{-x}} \quad \rightarrow \quad \frac{d\sigma(x)}{dx} = \frac{e^{-x}}{(1+e^{-x})^2} = \left( \frac{1 + e^{-x} - 1}{1 + e^{-x}} \right) \left( \frac{1}{1+e^{-x}} \right) = \left( 1 - \sigma(x) \right) \sigma(x)$
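The identity above means the sigmoid's backward pass only needs the forward output. Here is a small NumPy sketch of that idea; the function names `sigmoid` and `sigmoid_backward` and the test inputs are my own choices, not code from the lecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_backward(x, dout):
    """Gradient w.r.t. x, given the upstream gradient dout."""
    s = sigmoid(x)
    return (1.0 - s) * s * dout   # local gradient (1 - sigma) * sigma, times upstream

x = np.array([-2.0, 0.0, 3.0])
dx = sigmoid_backward(x, np.ones_like(x))

# Central-difference estimate for comparison.
h = 1e-5
dx_numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

print(dx)
print(dx_numeric)
```

At x = 0 the sigmoid is 0.5, so the gradient there is 0.5 * 0.5 = 0.25, the largest value the local gradient can take.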

Remember a few important gates

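The original figure is not available in this copy, so which gates it showed is an assumption on my part. The ones usually singled out in backpropagation notes are the add, multiply, and max gates, and their backward behaviour can be sketched as follows. The function names are hypothetical.

```python
def add_backward(x, y, dout):
    # Add gate: passes the upstream gradient unchanged to both inputs.
    return dout, dout

def mul_backward(x, y, dout):
    # Multiply gate: each input receives the other input times the upstream gradient.
    return y * dout, x * dout

def max_backward(x, y, dout):
    # Max gate: routes the whole gradient to the larger input (ties go to x here).
    return (dout, 0.0) if x >= y else (0.0, dout)

dout = 2.0
g_add = add_backward(3.0, -4.0, dout)
g_mul = mul_backward(3.0, -4.0, dout)
g_max = max_backward(3.0, -4.0, dout)

print(g_add)  # (2.0, 2.0)
print(g_mul)  # (-8.0, 6.0)
print(g_max)  # (2.0, 0.0)
```

Note how the multiply gate swaps the inputs: the smaller input gets the larger gradient and vice versa.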
Using matrix operations

import numpy as np

# forward pass
W = np.random.randn(5, 10)
X = np.random.randn(10, 3)
D = W.dot(X)

# now suppose we had the gradient on D from above in the circuit
dD = np.random.randn(*D.shape) # same shape as D
dW = dD.dot(X.T) #.T gives the transpose of the matrix
dX = W.T.dot(dD)
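Since the matrix part is where I get confused, a shape check plus a numerical gradient check helps. The sketch below assumes a scalar loss `np.sum(D * dD)`, which I chose only because its gradient with respect to `D` is exactly `dD`. The helper `numeric_grad` and the fixed random seed are also my own additions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 10))
X = rng.standard_normal((10, 3))
dD = rng.standard_normal((5, 3))   # pretend upstream gradient on D = W.dot(X)

def loss(W, X):
    # Scalar whose gradient with respect to D is exactly dD.
    return np.sum(W.dot(X) * dD)

# Analytic gradients, same formulas as above.
dW = dD.dot(X.T)
dX = W.T.dot(dD)

# A gradient always has the shape of the thing it is a gradient of.
assert dW.shape == W.shape and dX.shape == X.shape

def numeric_grad(f, A, h=1e-6):
    """Central differences over every entry of A (A is perturbed in place and restored)."""
    grad = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        old = A[idx]
        A[idx] = old + h
        fp = f()
        A[idx] = old - h
        fm = f()
        A[idx] = old
        grad[idx] = (fp - fm) / (2 * h)
    return grad

dW_num = numeric_grad(lambda: loss(W, X), W)
dX_num = numeric_grad(lambda: loss(W, X), X)

print(np.max(np.abs(dW - dW_num)), np.max(np.abs(dX - dX_num)))
```

The shape rule is a handy way to remember the formulas: `dD` is 5x3 and `X` is 10x3, so the only product that yields a 5x10 result is `dD.dot(X.T)`.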

CNN

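Why CNN for images

A plain DNN (fully connected network) would need far too many parameters, and a CNN is a way of simplifying it. A neuron that detects a pattern in an image does not need to be connected to the whole image. When a person looks at a picture, they generally understand it along the following dimensions: (1) they look for features; (2) where in the picture a feature appears does not really matter, so the same detector can be used at every position, which in a CNN shows up as parameter sharing; (3) the size of the picture has little effect on how you understand it, which in a CNN shows up as max pooling.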
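The three points can be sketched in plain NumPy. This is a toy illustration under my own assumptions (a 6x6 image, one hand-written 3x3 vertical-line filter, and naive loops), not how a real framework implements convolution. The names `conv2d_valid` and `max_pool_2x2` are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one small filter over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # The same kernel weights are reused at every position (parameter sharing).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(fmap):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h2, w2 = fmap.shape[0] // 2, fmap.shape[1] // 2
    trimmed = fmap[:h2 * 2, :w2 * 2]
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

image = np.zeros((6, 6))
image[1:4, 2] = 1.0                                   # a short vertical bar
shifted = np.roll(image, shift=(1, 2), axis=(0, 1))   # same bar, different place

kernel = np.array([[0, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0]], dtype=float)           # vertical-line detector

fmap = conv2d_valid(image, kernel)        # 4x4 feature map
fmap_shifted = conv2d_valid(shifted, kernel)
pooled = max_pool_2x2(fmap)               # 2x2 after pooling

# Weights needed to produce a 4x4 output from a 6x6 input (biases ignored).
dense_params = image.size * fmap.size     # fully connected: 36 * 16
conv_params = kernel.size                 # one shared 3x3 filter

print(fmap.max(), fmap_shifted.max())
print(pooled)
print(dense_params, conv_params)
```

The filter responds equally strongly to the bar in both positions using the same 9 weights, while a fully connected layer with the same output size would need 576. Pooling then shrinks the feature map while keeping the strongest response.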