Python convolution function: [Implementing a convolutional neural network in Python] Backpropagation in the Conv2D convolutional layer

Latest recommended article published on 2023-03-06 23:45:56

weixin_39638623

Tags: Python convolution function

Implementation of the activation functions (sigmoid, softmax, tanh, relu, leakyrelu, elu, selu, softplus): https://www.cnblogs.com/xiximayou/p/12713081.html

Implementation of the optimizers (SGD, Nesterov, Adagrad, Adadelta, RMSprop, Adam): https://www.cnblogs.com/xiximayou/p/12713594.html

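This section continues working through the code, this time to study backpropagation in the convolutional layer.

Only the forward-pass and backward-pass code of Conv2D is shown here: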

def forward_pass(self, X, training=True):
    batch_size, channels, height, width = X.shape
    self.layer_input = X
    # Turn image shape into column shape
    # (enables dot product between input and weights)
    self.X_col = image_to_column(X, self.filter_shape, stride=self.stride, output_shape=self.padding)
    # Turn weights into column shape
    self.W_col = self.W.reshape((self.n_filters, -1))
    # Calculate output
    output = self.W_col.dot(self.X_col) + self.w0
    # Reshape into (n_filters, out_height, out_width, batch_size)
    output = output.reshape(self.output_shape() + (batch_size, ))
    # Redistribute axes so that batch size comes first
    return output.transpose(3, 0, 1, 2)

def backward_pass(self, accum_grad):
    # Reshape accumulated gradient into column shape
    accum_grad = accum_grad.transpose(1, 2, 3, 0).reshape(self.n_filters, -1)

    if self.trainable:
        # Take dot product between column shaped accum. gradient and column shape
        # layer input to determine the gradient at the layer with respect to layer weights
        grad_w = accum_grad.dot(self.X_col.T).reshape(self.W.shape)
        # The gradient with respect to bias terms is the sum similarly to in Dense layer
        grad_w0 = np.sum(accum_grad, axis=1, keepdims=True)

        # Update the layer's weights
        self.W = self.W_opt.update(self.W, grad_w)
        self.w0 = self.w0_opt.update(self.w0, grad_w0)

    # Recalculate the gradient which will be propagated back to prev. layer
    accum_grad = self.W_col.T.dot(accum_grad)
    # Reshape from column shape to image shape
    accum_grad = column_to_image(accum_grad,
                                 self.layer_input.shape,
                                 self.filter_shape,
                                 stride=self.stride,
                                 output_shape=self.padding)

    return accum_grad
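The weight and bias gradients above can be checked numerically. The following is a minimal sketch, assuming stride 1 and no padding; `im2col`, `conv_forward` and `conv_backward_params` are simplified stand-ins written for this illustration, not the `image_to_column` helper used by the layer. It follows the same column layout as the code above, with columns ordered as (out_height, out_width, batch), and compares the analytic `grad_w` with a central finite difference.

```python
import numpy as np

def im2col(X, fh, fw):
    """Stride-1, no-padding im2col (illustrative stand-in).
    Returns an array of shape (C*fh*fw, out_h*out_w*N)."""
    N, C, H, W = X.shape
    out_h, out_w = H - fh + 1, W - fw + 1
    cols = np.empty((C * fh * fw, out_h * out_w * N))
    row = 0
    for c in range(C):
        for di in range(fh):
            for dj in range(fw):
                patch = X[:, c, di:di + out_h, dj:dj + out_w]  # (N, out_h, out_w)
                # Column order is (out_h, out_w, N), batch index varying fastest
                cols[row] = patch.transpose(1, 2, 0).reshape(-1)
                row += 1
    return cols

def conv_forward(X, W, w0):
    n_filters, _, fh, fw = W.shape
    N, _, H, Wd = X.shape
    out_h, out_w = H - fh + 1, Wd - fw + 1
    X_col = im2col(X, fh, fw)
    W_col = W.reshape(n_filters, -1)
    out = W_col.dot(X_col) + w0                  # w0 has shape (n_filters, 1)
    out = out.reshape(n_filters, out_h, out_w, N).transpose(3, 0, 1, 2)
    return out, X_col

def conv_backward_params(accum_grad, X_col, W_shape):
    # Same steps as the trainable branch of backward_pass above
    G = accum_grad.transpose(1, 2, 3, 0).reshape(W_shape[0], -1)
    grad_w = G.dot(X_col.T).reshape(W_shape)
    grad_w0 = np.sum(G, axis=1, keepdims=True)
    return grad_w, grad_w0

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 3, 5, 5))    # (batch, channels, height, width)
W = rng.standard_normal((4, 3, 3, 3))    # (n_filters, channels, fh, fw)
w0 = np.zeros((4, 1))
R = rng.standard_normal((2, 4, 3, 3))    # pretend upstream gradient dL/d(output)

out, X_col = conv_forward(X, W, w0)
grad_w, grad_w0 = conv_backward_params(R, X_col, W.shape)

def loss(W_):
    # Scalar loss whose gradient with respect to the output is exactly R
    return np.sum(conv_forward(X, W_, w0)[0] * R)

# Central finite difference on a single weight entry
eps = 1e-6
idx = (1, 2, 0, 1)
W_plus, W_minus = W.copy(), W.copy()
W_plus[idx] += eps
W_minus[idx] -= eps
numeric = (loss(W_plus) - loss(W_minus)) / (2 * eps)

print("analytic:", grad_w[idx], "numeric:", numeric)
```

The two printed values should agree to several decimal places. The bias gradient is simply the upstream gradient summed over every axis except the filter axis.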

The network itself is defined in neural_network.py, which contains the training step:

def train_on_batch(self, X, y):
    """Single gradient update over one batch of samples"""
    y_pred = self._forward_pass(X)
    loss = np.mean(self.loss_function.loss(y, y_pred))
    acc = self.loss_function.acc(y, y_pred)
    # Calculate the gradient of the loss function wrt y_pred
    loss_grad = self.loss_function.gradient(y, y_pred)
    # Backpropagate. Update weights
    self._backward_pass(loss_grad=loss_grad)

    return loss, acc

We also need to look at self._forward_pass and self._backward_pass:

def _forward_pass(self, X, training=True):
    """Calculate the output of the NN"""
    layer_output = X
    for layer in self.layers:
        layer_output = layer.forward_pass(layer_output, training)

    return layer_output

def _backward_pass(self, loss_grad):
    """Propagate the gradient 'backwards' and update the weights in each layer"""
    for layer in reversed(self.layers):
        loss_grad = layer.backward_pass(loss_grad)
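To make the two loops concrete, here is a toy sketch in plain Python. The `Scale` layer, its learning rate and the squared-error loss are hypothetical and exist only for this illustration. Each layer exposes the same `forward_pass` / `backward_pass` interface as above, updates its own weight with a plain SGD step, and returns the gradient with respect to its input so that the previous layer can continue the chain.

```python
class Scale:
    """Toy trainable layer y = w * x (hypothetical, for illustration only)."""

    def __init__(self, w, lr=0.1):
        self.w = w
        self.lr = lr

    def forward_pass(self, x, training=True):
        self.layer_input = x              # cache the input for the backward pass
        return self.w * x

    def backward_pass(self, accum_grad):
        w_old = self.w
        grad_w = accum_grad * self.layer_input   # dL/dw
        self.w = self.w - self.lr * grad_w       # plain SGD update
        return accum_grad * w_old                # dL/dx, passed to the previous layer

layers = [Scale(2.0), Scale(3.0)]
x, y = 1.0, 5.0

# Forward: feed each layer's output into the next layer
out = x
for layer in layers:
    out = layer.forward_pass(out)

# Gradient of the loss 0.5 * (out - y)**2 with respect to out
loss_grad = out - y

# Backward: walk the layers in reverse, passing the gradient along
grad = loss_grad
for layer in reversed(layers):
    grad = layer.backward_pass(grad)

print(out, loss_grad, grad)                # 6.0 1.0 6.0
print([layer.w for layer in layers])       # approximately [1.7, 2.8]
```

Note that each layer returns the input gradient computed with the weight used in the forward pass, not the freshly updated one. The Conv2D layer above behaves the same way, because `self.W_col` was built during the forward pass.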

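As we can see, the forward pass computes the output of every layer in self.layers in turn, including convolution, pooling, activation and normalization layers. The backward pass then walks the layers from last to first, passing the gradient along and updating each layer.

Take a convolutional layer + fully connected layer + loss function as an example. Once the forward pass has finished, the first gradient obtained is that of the loss function. It is passed to the fully connected layer, which computes its own gradient and passes the result on to the convolutional layer, whose backward_pass() method is then called. Inside the convolutional layer's backward_pass(), if self.trainable is set, the gradients with respect to the weights W and the bias w0 are computed, and the optimizers W_opt and w0_opt update those parameters. The gradient with respect to the previous layer is computed afterwards. Finally, column_to_image() converts that column-shaped gradient back into the shape of the input images: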

def column_to_image(cols, images_shape, filter_shape, stride, output_shape='same'):
    batch_size, channels, height, width = images_shape
    pad_h, pad_w = determine_padding(filter_shape, output_shape)
    height_padded = height + np.sum(pad_h)
    width_padded = width + np.sum(pad_w)
    # Start from zeros: np.add.at below accumulates into this buffer,
    # so it must not contain uninitialized values (as np.empty would give)
    images_padded = np.zeros((batch_size, channels, height_padded, width_padded))

    # Calculate the indices where the dot products are applied between weights
    # and the image
    k, i, j = get_im2col_indices(images_shape, filter_shape, (pad_h, pad_w), stride)

    cols = cols.reshape(channels * np.prod(filter_shape), -1, batch_size)
    cols = cols.transpose(2, 0, 1)
    # Add column content to the images at the indices
    np.add.at(images_padded, (slice(None), k, i, j), cols)

    # Return image without padding
    return images_padded[:, :, pad_h[0]:height + pad_h[0], pad_w[0]:width + pad_w[0]]
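The reason column_to_image() uses np.add.at is that neighbouring filter windows overlap, so the same input pixel appears in several columns and its gradient contributions must be summed. The sketch below is a one-dimensional analogue made up for illustration (a length-4 signal, filter width 2, stride 1). It contrasts np.add.at with ordinary fancy-index assignment, which applies a repeated index only once.

```python
import numpy as np

# Window positions 0..2 cover signal indices [0, 1], [1, 2], [2, 3];
# indices 1 and 2 each belong to two windows.
idx = np.array([[0, 1, 2],     # first element of each window
                [1, 2, 3]])    # second element of each window
cols = np.ones((2, 3))         # pretend every column entry carries gradient 1

accumulated = np.zeros(4)
np.add.at(accumulated, idx, cols)   # unbuffered: repeated indices are summed

buffered = np.zeros(4)
buffered[idx] += cols               # buffered: each repeated index is written once

print(accumulated)   # [1. 2. 2. 1.]
print(buffered)      # [1. 1. 1. 1.]
```

Because np.add.at adds onto whatever the target already holds, the target buffer has to start at zero for the result to be a pure sum of the contributions.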