ImageNet Classification with Deep Convolutional Neural Networks (AlexNet): Reading Notes

Erin__Zhang

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/m0_37749527/article/details/79248242

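This post examines AlexNet's main contributions, including its efficient GPU implementation and its methods for reducing overfitting, and summarizes the network's architecture and details of learning.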

ImageNet Classification with Deep Convolutional Neural Networks-AlexNet

authors: Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton

AlexNet

Ⅰ new contributions:

1. a highly-optimized GPU implementation of 2D convolution

2. new and unusual features that improve the network's performance and reduce its training time

3. several effective techniques for preventing overfitting

Ⅱ Dataset: ImageNet

ImageNet is a dataset of over 15 million labeled high-resolution images belonging to roughly 22,000 categories. The images were collected from the web and labeled by human labelers using Amazon’s Mechanical Turk crowd-sourcing tool.

Ⅲ Features:

1. ReLU Nonlinearity

Rectified Linear Units (ReLUs)

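Non-saturating neurons. (My current understanding: the output is not bounded above, whereas a saturating neuron such as the sigmoid has its output confined to (0, 1), so its gradient shrinks toward zero for large inputs.) Of the features listed here, this one has the most significant effect on training speed.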
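To make "non-saturating" concrete, here is a minimal sketch comparing the gradients of ReLU and the sigmoid. The function names are illustrative and not taken from the paper or any library.

```python
import math

def relu(x):
    """Rectified linear unit: f(x) = max(0, x)."""
    return max(0.0, x)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    """Logistic sigmoid, with outputs strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)

# As x grows, the sigmoid gradient collapses while the ReLU gradient stays at 1.
for x in (0.5, 5.0, 20.0):
    print(f"x={x:5.1f}  relu'={relu_grad(x):.1f}  sigmoid'={sigmoid_grad(x):.2e}")
```

For large positive inputs the sigmoid gradient is close to zero, so gradient descent barely updates the weights feeding that neuron; the ReLU gradient stays at 1, which is one common explanation for the faster training.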

2. Training on Multiple GPUs

The net is spread across two GPUs.

A parallelization scheme is employed.

The GPUs communicate only in certain layers.

3. Local Response Normalization LRN

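(Not fully understood yet.) A form of lateral inhibition: a normalization across neighboring kernel maps. The paper applies it after the ReLU in certain layers.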
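As an aid to understanding LRN, the following is a minimal sketch of the normalization at a single spatial position: each activation is divided by a term that grows with the squared activations of the n neighboring kernel maps. The default values k = 2, n = 5, alpha = 1e-4, beta = 0.75 are those reported in the paper rather than in these notes; the function name and the list-based representation are assumptions made for illustration.

```python
def local_response_norm(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Normalize activations across kernel maps at one spatial position.

    a: list of ReLU outputs, one per kernel map (channel), at a fixed (x, y).
    Returns b with b[i] = a[i] / (k + alpha * sum_j a[j]**2) ** beta,
    where j runs over the n maps centred on i (clipped at the edges).
    """
    num_maps = len(a)
    half = n // 2
    out = []
    for i in range(num_maps):
        lo = max(0, i - half)
        hi = min(num_maps - 1, i + half)
        sq_sum = sum(a[j] ** 2 for j in range(lo, hi + 1))
        out.append(a[i] / (k + alpha * sq_sum) ** beta)
    return out

# A weak activation next to a very strong one is damped more than it
# would be on its own: this is the "lateral inhibition" effect.
print(local_response_norm([1.0, 100.0, 1.0]))
print(local_response_norm([1.0]))
```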

4. Overlapping Pooling

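Models with overlapping pooling are slightly more difficult to overfit.

Stride s = 2, window size z = 3.

Because s < z, neighboring pooling windows overlap. Compared with non-overlapping pooling (s = 2, z = 2), the paper reports top-1 and top-5 error rates lower by 0.4% and 0.3%, respectively.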
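A minimal one-dimensional sketch of what s = 2, z = 3 means in practice. The helper names are illustrative, and real pooling layers operate on 2-D feature maps rather than on a flat list.

```python
def pool_windows(length, z=3, s=2):
    """Index ranges [start, end) covered by each pooling window (no padding)."""
    return [(i, i + z) for i in range(0, length - z + 1, s)]

def max_pool_1d(values, z=3, s=2):
    """Max-pool a 1-D sequence with window size z and stride s."""
    return [max(values[a:b]) for a, b in pool_windows(len(values), z, s)]

row = [1, 5, 2, 4, 3, 9, 0]

# With z=3, s=2, adjacent windows share one element (indices 2 and 4 here).
print(pool_windows(len(row), z=3, s=2))
print(max_pool_1d(row, z=3, s=2))

# With z=2, s=2, the windows tile the input without overlap.
print(pool_windows(len(row), z=2, s=2))
print(max_pool_1d(row, z=2, s=2))
```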

5. Reducing Overfitting

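Data Augmentation: label-preserving transformations are applied to the training images; at test time, ten patches are extracted and their predictions are averaged.

For example, during training the original image is cropped and horizontally flipped to obtain multiple patches.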
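The ten test-time patches can be sketched as follows. The sizes (224 x 224 patches from a 256 x 256 image) and the choice of four corners plus the center, each with its horizontal reflection, follow the paper's description; the function names and the nested-list image representation are assumptions made for illustration.

```python
def ten_crop_boxes(image_size=256, patch_size=224):
    """Return (top, left, flipped) for the ten test-time patches."""
    margin = image_size - patch_size
    center = margin // 2
    corners_and_center = [
        (0, 0), (0, margin), (margin, 0), (margin, margin), (center, center),
    ]
    return [
        (top, left, flipped)
        for flipped in (False, True)
        for top, left in corners_and_center
    ]

def crop(image, top, left, size, flip=False):
    """Cut a size x size patch from a 2-D list of pixels; optionally mirror it."""
    rows = [row[left:left + size] for row in image[top:top + size]]
    return [row[::-1] for row in rows] if flip else rows

boxes = ten_crop_boxes()
print(len(boxes))
print(boxes[:5])
```

At training time the same `crop` helper could be called with a random `top`, `left`, and `flip` instead of the fixed positions used here.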

Dropout: setting to zero the output of each hidden neuron with probability 0.5. The neurons which are “dropped out” in this way do not contribute to the forward pass and do not participate in back-propagation.

During training, each neuron is switched off with probability p (typically 0.5). At prediction time all neurons are used, but their outputs are multiplied by 0.5.

Dropout is used in the first two fully-connected layers.
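A minimal sketch of the two modes described above, operating on a plain list of activations. The function names and the `rng` parameter are illustrative assumptions, not part of the paper.

```python
import random

def dropout_train(activations, p_drop=0.5, rng=random):
    """Training: zero each activation independently with probability p_drop."""
    return [0.0 if rng.random() < p_drop else a for a in activations]

def dropout_test(activations, p_drop=0.5):
    """Test: keep every neuron but scale its output by the keep probability."""
    keep = 1.0 - p_drop
    return [a * keep for a in activations]

rng = random.Random(0)          # seeded so the run is repeatable
hidden = [1.0] * 10
print(dropout_train(hidden, rng=rng))
print(dropout_test(hidden))
```

Scaling at test time keeps the expected input to the next layer the same as during training. Many current frameworks instead scale the surviving activations by 1 / (1 - p) during training and leave test-time outputs untouched; that variant is not what the paper describes.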

Ⅳ Overall Architecture

● 8 layers: 5 convolutional + 3 fully-connected

● The output of the last fully-connected layer is fed to a 1000-way softmax which produces a distribution over the 1000 class labels.

● Training maximizes the multinomial logistic regression objective, i.e. the average over training cases of the log-probability of the correct label.

● The kernels of the second, fourth, and fifth convolutional layers are connected only to those kernel maps in the previous layer which reside on the same GPU. The kernels of the third convolutional layer are connected to all kernel maps in the second layer.

● Response-normalization (LRN) layers follow the first and second convolutional layers.

● Max-pooling layers follow both response-normalization layers as well as the fifth convolutional layer.

● The ReLU non-linearity is applied to the output of every convolutional and fully-connected layer.
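The softmax output and the multinomial logistic regression objective from the list above can be sketched as follows. The example uses a toy 3-class problem instead of 1000 classes, and the function names are illustrative assumptions.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_log_prob(batch_logits, labels):
    """Average log-probability of the correct label over a batch.

    Training aims to maximize this quantity (equivalently, to minimize
    its negative, the cross-entropy loss).
    """
    total = 0.0
    for logits, label in zip(batch_logits, labels):
        total += math.log(softmax(logits)[label])
    return total / len(labels)

batch = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]   # scores from the last layer
labels = [0, 2]                                # index of the correct class
print(softmax(batch[0]))
print(mean_log_prob(batch, labels))
```

The objective is always at most 0 and approaches 0 as the network assigns probability close to 1 to the correct label of every training case.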