LeNet5: Reading the Paper and Source Code


Last modified 2023-01-10 23:04:09


License: CC 4.0 BY-SA

Tags:

#deep-learning #cnn #neural-networks

First published 2023-01-08 23:47:51

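This article describes the LeNet5 architecture in detail: the convolutional layers C1, C3, and C5, the subsampling layers S2 and S4, the fully connected layer F6, and the Output layer, with an explanation of what each layer does. It then shows how to implement LeNet5 in PyTorch, covering data preprocessing, model construction, training, and saving the model. Finally, it provides test code that loads the saved model and predicts image classes.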

Image Classification with LeNet5

🐬 Contents:

1. Overview
2. Selected Readings from the Paper
3. Close Reading of the Source Code
4. References

1. Overview

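LeNet-5 is a classic convolutional neural network architecture that was already in practical use by 1998. Its first application was handwritten character recognition. Convolutional neural networks are widely considered to have begun with the LeNet networks proposed by LeCun et al., so LeCun et al. can fairly be called the founders of the CNN, and LeNet is their classic work.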

2. Selected Readings from the Paper

Paper: "Gradient-Based Learning Applied to Document Recognition"

LeNet-5 comprises seven layers, not counting the input, all of which contain trainable parameters (weights). The input is a 32×32 pixel image.

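Interpretation: LeNet5 has 7 layers in total, not counting the input, and takes a 32×32 pixel image as input, as shown in the figure below: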

[Figure: LeNet-5 network architecture]

Layer C1 is a convolutional layer with six feature maps. Each unit in each feature map is connected to a 5×5 neighborhood in the input.

Interpretation: C1 is a convolutional layer. It convolves the input with six 5×5 kernels, using padding=0 and stride=1, and produces six 28×28 feature maps: 32-5+1=28.
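The 32-5+1=28 arithmetic is a special case of the usual output-size formula for a convolution. The sketch below checks it in plain Python; the helper names `conv_output_size` and `conv_param_count` are made up for this illustration and are not part of the paper or of PyTorch. The parameter count assumes one bias per kernel.

```python
def conv_output_size(in_size: int, kernel: int, padding: int = 0, stride: int = 1) -> int:
    """Output size along one spatial dimension (hypothetical helper)."""
    return (in_size + 2 * padding - kernel) // stride + 1


def conv_param_count(in_channels: int, out_channels: int, kernel: int) -> int:
    """Trainable parameters, assuming one bias per output kernel (hypothetical helper)."""
    return out_channels * (in_channels * kernel * kernel + 1)


# C1: 32x32 single-channel input, six 5x5 kernels, padding=0, stride=1
c1_size = conv_output_size(32, kernel=5, padding=0, stride=1)
c1_params = conv_param_count(in_channels=1, out_channels=6, kernel=5)
print(c1_size, c1_params)  # 28 156
```

With padding=0 and stride=1 the formula reduces to `in_size - kernel + 1`, which is the expression used in the text.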

Layer S2 is a subsampling layer with six feature maps of size 14×14. Each unit in each feature map is connected to a 2×2 neighborhood in the corresponding feature map in C1. The four inputs to a unit in S2 are added, then multiplied by a trainable coefficient, and then added to a trainable bias. The result is passed through a sigmoidal function.
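The S2 operation described in the passage above can be sketched for a single feature map as follows. This is only an illustration: the function name `s2_subsample` is hypothetical, the logistic sigmoid stands in for the paper's "sigmoidal function" as an assumption, and the coefficient and bias are fixed example values rather than trained ones.

```python
import math


def s2_subsample(feature_map, coeff, bias):
    """Sum each non-overlapping 2x2 window, scale, shift, then squash.

    `coeff` and `bias` play the role of the trainable coefficient and bias
    shared by all units of one S2 feature map. The logistic sigmoid is an
    assumption made for this sketch.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    out = []
    for r in range(0, rows - 1, 2):          # stride 2 vertically
        out_row = []
        for c in range(0, cols - 1, 2):      # stride 2 horizontally
            window_sum = (feature_map[r][c] + feature_map[r][c + 1]
                          + feature_map[r + 1][c] + feature_map[r + 1][c + 1])
            out_row.append(1.0 / (1.0 + math.exp(-(coeff * window_sum + bias))))
        out.append(out_row)
    return out


# A dummy 28x28 C1 feature map filled with zeros
c1_map = [[0.0] * 28 for _ in range(28)]
s2_map = s2_subsample(c1_map, coeff=1.0, bias=0.0)
print(len(s2_map), len(s2_map[0]))  # 14 14
```

Each of the six S2 feature maps would have its own coefficient and bias, so this function would be applied once per map.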

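Interpretation: S2 is a subsampling layer. It pools each of the 6 feature maps with a 2×2 window, using padding=0 and stride=2, and produces six 14×14 feature maps: 28/2=14. In effect, S2 is a subsampling layer plus an activation layer: it subsamples first, then applies the sigmoid activation for a nonlinear output. First, for the C1 layer's 2x2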