Paper Reading: ImageNet Classification with Deep Convolutional Neural Networks

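This post introduces AlexNet, a deep convolutional neural network for large-scale image classification on the ImageNet dataset. It describes AlexNet's novel features in detail, including the ReLU nonlinearity, training on multiple GPUs, local response normalization, and overlapping pooling, and discusses techniques for preventing overfitting such as data augmentation and Dropout. AlexNet's success demonstrated the advantages of deep, large networks on complex image recognition tasks.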

AlexNet

Section 1 & 2

Current problem

  1. Current networks perform relatively well on small datasets like MNIST.
  2. The immense complexity of object recognition means that even a dataset as large as ImageNet cannot fully specify the problem. Thus, the model must have lots of prior knowledge to compensate for the data it lacks.
  3. The model itself must have large learning capacity.

Author’s work

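  1. Trained one of the largest convolutional neural networks to date on the subsets of ImageNet used in the ILSVRC-2010 and ILSVRC-2012 competitions.
  2. Wrote a highly optimized GPU implementation of 2D convolution.
  3. Introduced some new and unusual features that speed up training and improve performance, detailed in Section 3.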
  4. Used several effective techniques for preventing overfitting.

Section 3

Overall architecture

Contains 8 learned layers

  • 5 convolutional layers
  • 3 fully-connected layers
  • a 1000-way softmax layer afterwards
    [Figure: AlexNet architecture]

Notes:

  • The 1st and 2nd convolutional layers are each followed by an LRN layer.
  • Each LRN, as well as the 5th convolutional layer, is followed by a max pooling layer.
  • The architecture graph is divided vertically into two parts, and distributed on two GPUs.
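
To make the layer layout above concrete, here is a minimal sketch that traces the spatial size of the feature maps through the convolutional and pooling layers. The kernel sizes and strides follow the paper; the 227×227 input size and the padding values are assumptions taken from common re-implementations (the paper states 224×224 and does not list padding), and `out_size`, `LAYERS`, and `trace` are hypothetical names used only for this illustration.

```python
def out_size(in_size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (in_size + 2 * pad - kernel) // stride + 1

# (name, kernel, stride, pad); padding values are assumptions, not from the paper
LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]

def trace(in_size, layers):
    """Return a list of (layer name, spatial size after that layer)."""
    sizes = []
    size = in_size
    for name, kernel, stride, pad in layers:
        size = out_size(size, kernel, stride, pad)
        sizes.append((name, size))
    return sizes

for name, size in trace(227, LAYERS):  # 227 is an assumed input size
    print(f"{name}: {size}x{size}")
```

Under these assumptions the last pooling layer produces 6×6 maps, which are flattened and fed into the first fully-connected layer.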

Novel and unusual features

ReLU Nonlinearity

In terms of training time, a non-saturating activation function ($f(x)=\max(0,x)$) trains faster than a saturating one ($f(x)=\tanh(x)$ or $f(x)=\frac{1}{1+e^{-x}}$).
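
One common intuition for this difference is that the gradient of a saturating function shrinks toward zero for large inputs, while the ReLU gradient stays at 1 for any positive input. The sketch below compares the three derivatives; the function names are hypothetical and the code only illustrates the gradients, not the paper's training experiment.

```python
import math

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # The derivative is 1 for every positive input, however large.
    return 1.0 if x > 0 else 0.0

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2, which approaches 0 as |x| grows.
    return 1.0 - math.tanh(x) ** 2

def sigmoid_grad(x):
    # d/dx sigmoid(x) = s * (1 - s), which also approaches 0 as |x| grows.
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

for x in (0.5, 2.0, 5.0):
    print(x, relu_grad(x), tanh_grad(x), sigmoid_grad(x))
```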

Training on Multiple GPUs

A single GPU at that time (a GTX 580 with 3 GB of memory) could not hold the whole network, so the authors split the network across two GPUs.

Local Response Normalization

LRN for short. Each activation is divided by a term that grows with the squared activations of neighboring kernel maps at the same spatial position. This creates competition among neurons for large responses and reduces the error rate. The VGG paper (Very Deep Convolutional Networks for Large-Scale Image Recognition), however, reported that LRN did not improve its results.
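
A minimal sketch of the normalization at a single spatial position, assuming the activations of all kernel maps at that position are given as a plain list. The default constants (k=2, n=5, alpha=1e-4, beta=0.75) are the values reported in the paper; the function name `lrn` is hypothetical.

```python
def lrn(activations, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Normalize each channel by the squared activity of its n nearest channels."""
    num = len(activations)
    half = n // 2
    out = []
    for i, a in enumerate(activations):
        lo = max(0, i - half)
        hi = min(num - 1, i + half)
        # Sum of squares over the neighboring kernel maps (including channel i)
        sq = sum(activations[j] ** 2 for j in range(lo, hi + 1))
        out.append(a / (k + alpha * sq) ** beta)
    return out

# The same activation is damped more when its neighbor responds strongly.
print(lrn([10.0, 1.0])[0])
print(lrn([10.0, 100.0])[0])
```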

Overlapping Pooling

The pooling windows overlap, i.e. the stride (2) is smaller than the window size (3), which reduces the top-1 and top-5 error rates slightly (by 0.4% and 0.3%).
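
The difference between the two pooling schemes can be sketched in one dimension. This is a toy illustration on a short list, not the paper's 2D implementation; `max_pool_1d` is a hypothetical helper.

```python
def max_pool_1d(values, size, stride):
    """Max pooling over a 1D list with the given window size and stride."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, stride)
    ]

row = [1, 5, 2, 8, 3, 7, 9]

# Non-overlapping: window size equals stride, each input is seen once.
non_overlapping = max_pool_1d(row, size=2, stride=2)

# Overlapping: window size 3 with stride 2, neighboring windows share one input.
overlapping = max_pool_1d(row, size=3, stride=2)

print(non_overlapping, overlapping)
```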

Section 4

This part introduces techniques that prevent overfitting.

Data Augmentation

In short, artificially enlarging the dataset.

  1. Extract random 224×224 patches from the larger 256×256 images, and train on them together with their horizontal reflections.
  2. Alter the intensities of the RGB channels in the training images, using PCA over the RGB pixel values.
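
The first technique can be sketched as follows, with an image represented as a nested list of pixel values. This is a simplified illustration with hypothetical helper names; it does not cover the PCA-based color augmentation, and the 50% flip probability is an assumption.

```python
import random

def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w patch at a random position."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - crop_h)
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def augment(image, crop_h, crop_w, rng):
    """Random crop, then a horizontal flip with probability 0.5 (assumed)."""
    patch = random_crop(image, crop_h, crop_w, rng)
    if rng.random() < 0.5:
        patch = horizontal_flip(patch)
    return patch

# Toy 4x4 "image"; the paper crops 224x224 patches from 256x256 images.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patch = augment(image, 3, 3, random.Random(0))
print(patch)
```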

Dropout

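Randomly deactivate neurons during training: the output of each hidden neuron is set to zero with probability 0.5. At test time all neurons are used, with their outputs multiplied by 0.5.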
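
A minimal sketch of this idea on a plain list of activations, assuming the variant described in the paper (drop with probability 0.5 during training, scale outputs by the keep probability at test time). The function names are hypothetical.

```python
import random

def dropout_train(activations, p_drop, rng):
    """Set each activation to zero independently with probability p_drop."""
    return [0.0 if rng.random() < p_drop else a for a in activations]

def dropout_test(activations, p_drop):
    """Use every neuron, scaled by the keep probability (0.5 when p_drop=0.5)."""
    return [a * (1.0 - p_drop) for a in activations]

rng = random.Random(0)
train_out = dropout_train([1.0] * 1000, 0.5, rng)
test_out = dropout_test([2.0, 4.0], 0.5)
print(sum(train_out), test_out)
```

Many current frameworks instead scale the surviving activations during training ("inverted dropout"), which leaves the test-time forward pass unchanged.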

Section 5 & 6 & 7

Details, results and thoughts afterwards. In the end, the authors conclude that depth and size really matter: removing a single convolutional layer degrades performance.

### ImageNet Classification Using Deep Convolutional Neural Networks Paper Implementation and Explanation

#### Overview of the Approach

The approach uses a deep convolutional neural network (ConvNet) to classify images from the ImageNet dataset. When an unseen image enters the system, it is forward-propagated through the ConvNet. The outcome is a set of probabilities, one for each class the input could belong to[^1]. These probabilities are computed with the weights optimized during training.

#### Training Process Insights

Training optimizes these weights so that the network classifies the training examples correctly. A sufficiently large training set improves generalization; after training, the model should therefore still assign appropriate labels to entirely novel inputs based on learned features rather than memorized instances.

#### Historical Context and Impact

In 2012, the paper "ImageNet Classification with Deep Convolutional Neural Networks" was published, marking a significant advance in computer vision. It introduced deeper architectures than earlier models, along with techniques such as the ReLU activation function, which accelerated learning considerably compared with earlier methods[^2].

#### Detailed Architecture Review

For readers interested in developments in CNNs up until around 2019, surveys provide comprehensive reviews of the architectural improvements made since AlexNet's introduction in 2012[^3]. Such resources cover not only specific design choices but also the broader trends shaping modern visual recognition systems that handle complex tasks efficiently while maintaining high accuracy across diverse datasets of similar or even larger scale than the original ImageNet.

```python
import torch
from torchvision import models

# Load a ResNet-50 model pretrained on ImageNet
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Switch to evaluation mode
model.eval()

def predict_image(image_tensor):
    """Predict the class index for a preprocessed image tensor of shape (3, H, W)."""
    with torch.no_grad():
        outputs = model(image_tensor.unsqueeze(0))  # add a batch dimension
    _, predicted_class = torch.max(outputs, 1)
    return predicted_class.item()
```