Paper Notes | Deep Residual Learning for Image Recognition

These notes cover deep residual learning: how residual networks gain optimization efficiency and accuracy from greatly increased depth while keeping complexity low, why residual mappings are easier to optimize than the original mappings, and the types of shortcut connections used to address the degradation problem.


Authors

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Abstract

Residual networks are easier to optimize and gain accuracy from considerably increased depth, yet they have lower complexity than VGG nets.

1 Introduction

We denote the desired underlying mapping as H(x) and let the stacked nonlinear layers fit the residual F(x) = H(x) - x, so that H(x) = F(x) + x. We hypothesize that it is easier to optimize the residual mapping than the original, unreferenced mapping. In the extreme, if an identity mapping were optimal, it would be easier to push the residual to zero than to fit an identity mapping by a stack of nonlinear layers.
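The idea above can be sketched as a building block in PyTorch. This is a minimal sketch, not the authors' code; the class name, channel count, and BN placement are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Residual building block: output = relu(F(x) + x),
    where F is a small stack of nonlinear layers (here two 3x3 convs).
    Illustrative sketch; names/hyperparameters are assumptions."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        f = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(f + x)  # H(x) = F(x) + x
```

If the optimal H is close to the identity, the block only has to drive F toward zero, which is what makes the formulation easy to optimize.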

2 Related Work

2.1 Residual Representations

F. Perronnin and C. Dance. Fisher kernels on visual vocabularies for image categorization. In CVPR, 2007.
H. Jegou, F. Perronnin, M. Douze, J. Sanchez, P. Perez, and C. Schmid. Aggregating local image descriptors into compact codes. TPAMI, 2012.
W. L. Briggs, S. F. McCormick, et al. A Multigrid Tutorial. SIAM, 2000.

2.2 Shortcut Connections

#highway
R. K. Srivastava, K. Greff, and J. Schmidhuber. Highway networks. arXiv:1505.00387, 2015.
R. K. Srivastava, K. Greff, and J. Schmidhuber. Training very deep networks. arXiv:1507.06228, 2015.
#LSTM
S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735-1780, 1997.

3 Deep Residual Learning

3.1 Residual Learning

The degradation problem suggests that the solver might have difficulty approximating identity mappings with multiple nonlinear layers.

3.2 Identity Mapping by Shortcuts

When the dimensions of x and F(x) differ, we can perform a linear projection W_s by the shortcut connection to match the dimensions: y = F(x, {W_i}) + W_s x.
In these experiments, F has two or three layers; it cannot have only one layer, because y = W_1 x + x would be similar to a plain linear layer and offer no advantage.
Plain Network: the convolutional layers mostly have 3x3 filters and follow two simple design rules: 1) for the same output feature map size, the layers have the same number of filters; 2) if the feature map size is halved, the number of filters is doubled so as to preserve the time complexity per layer.
Residual Network: when the dimensions increase (dotted-line shortcuts), there are two options: 1) zero-padding for the increased dimensions; 2) 1x1 projection convolutions (slightly better).
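Option 2 above (the 1x1 projection shortcut) can be sketched as follows. This is an illustrative sketch, not the authors' code: the class name and layer arrangement are assumptions, but it follows the stated rules, halving the feature map with stride 2 while doubling the filters.

```python
import torch
import torch.nn as nn

class DownBlock(nn.Module):
    """Residual block across a dimension change (a dotted-line shortcut):
    the main path halves the spatial size and doubles the channels, and a
    1x1 convolution projects x so the addition is well-defined.
    Illustrative sketch; names are assumptions."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 2):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 projection shortcut: matches both channel count and spatial size
        self.proj = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        f = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(f + self.proj(x))  # y = F(x) + W_s x
```

The zero-padding option (option 1) would instead keep the identity shortcut and pad the extra channels with zeros, adding no parameters.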

4 Experiments

BN ensures that forward-propagated signals have non-zero variances.

S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.

ResNet eases the optimization by providing faster convergence at the early stage.
Shortcuts: identity or projection?
deep bottleneck architectures
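The bottleneck design replaces the two 3x3 layers with a 1x1 / 3x3 / 1x1 stack: the first 1x1 reduces the channels, the 3x3 operates on the reduced dimensions, and the last 1x1 restores them. A sketch, assuming the paper's 4x reduction factor; class and variable names are illustrative.

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """Deep bottleneck block (as used in the 50/101/152-layer models):
    1x1 reduce -> 3x3 -> 1x1 restore, plus the identity shortcut.
    Illustrative sketch; names are assumptions."""
    reduction = 4  # the paper reduces channels 4x inside the block

    def __init__(self, channels: int):
        super().__init__()
        mid = channels // self.reduction
        self.f = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.f(x) + x)
```

The point of the design is economy: the 3x3 convolution runs on channels/4 inputs and outputs, so depth can grow substantially at similar time complexity.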
Analysis of layer responses

Object Detection Improvements

box refinement:

S. Gidaris and N. Komodakis. Object detection via a multi-region &
semantic segmentation-aware cnn model. In ICCV, 2015.

global context:
treat the full image as an RoI and pool a global feature (via an SPP layer) to combine with each region's feature.

Conclusions
