Classic CNNs: ResNet

https://arxiv.org/pdf/1512.03385.pdf#page=9&zoom=100,0,157

  1. Abstract
    1. residual: what remains, the leftover part
    2. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
  2. Introduction
    1. Recent evidence reveals that network depth is of crucial importance, and the leading results on the challenging ImageNet dataset all exploit “very deep” models, with a depth of sixteen to thirty.
    2. Problems that come with deeper networks:
      1. the notorious problem of vanishing/exploding gradients, largely addressed by normalized initialization and batch normalization
      2. with the network depth increasing, accuracy gets saturated (which might be unsurprising) and then degrades rapidly
    3. A model is essentially a learned mapping: instead of fitting the desired mapping H(x) directly, the stacked layers are made to fit the residual F(x) := H(x) − x
    4. shortcut connection
      1. adds no extra parameters and no extra computation
      2. $y = F(x, \{W_i\}) + x$, where $F = W_2\,\sigma(W_1 x)$ and $\sigma$ is ReLU (a second ReLU is applied after the addition); when dimensions differ, a projection is used: $y = F(x, \{W_i\}) + W_s x$
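The shortcut formulation above can be sketched in plain Python (a minimal sketch with hypothetical toy weights; real ResNet blocks use convolutional layers, here plain matrices stand in for them):

```python
def relu(v):
    # element-wise ReLU on a vector
    return [max(0.0, x) for x in v]

def matvec(W, v):
    # matrix-vector product, standing in for a conv layer
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def residual_block(x, W1, W2):
    # F(x) = W2 * relu(W1 * x); output y = relu(F(x) + x)
    f = matvec(W2, relu(matvec(W1, x)))
    return relu([fi + xi for fi, xi in zip(f, x)])  # identity shortcut adds x

# With all-zero weights F(x) = 0 and the block reduces to the identity
# (for non-negative inputs), without learning anything:
Z = [[0.0, 0.0], [0.0, 0.0]]
print(residual_block([1.0, 2.0], Z, Z))  # -> [1.0, 2.0]
```

Because the identity path carries x unchanged and costs no parameters, pushing F toward zero recovers the identity mapping, which is exactly what the degradation argument below relies on.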
  3. Related work
    1. residual representation
    2. shortcut connection
  4. Deep residual learning
    1. Residual learning
      1. If one hypothesizes that multiple nonlinear layers can asymptotically approximate complicated functions, then it is equivalent to hypothesize that they can asymptotically approximate the residual functions, i.e., H(x) − x (assuming that the input and output are of the same dimensions). The two formulations are equally expressive, but how easy they are to train may differ.
      2. The degradation problem suggests that the solvers might have difficulties in approximating identity mappings by multiple nonlinear layers: if extra layers could easily learn the identity, a deeper model would perform at least as well as its shallower counterpart, yet in practice it does not.
    2. Identity Mapping by Shortcuts
      1. a shortcut connection and element-wise addition
      2. the form of the residual function F is flexible; in practice it has two or three layers (with only a single layer, the block is similar to a plain linear layer plus shortcut and shows no advantage)
    3. Network architecture
      1. When the dimensions increase, we consider two options: (A) The shortcut still performs identity mapping, with extra zero entries padded for increasing dimensions. This option introduces no extra parameter; (B) The projection shortcut in Eqn.(2) is used to match dimensions (done by 1×1 convolutions). For both options, when the shortcuts go across feature maps of two sizes, they are performed with a stride of 2.

      2. each layer follows convolution + BN + activation (BN is applied right after each convolution, before the activation)
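The convolution → BN → activation ordering can be sketched for a single channel (a minimal sketch: the convolution output is replaced by a hand-written list, and `gamma`/`beta` are BN's learnable scale and shift):

```python
def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    # normalize one feature across the batch, then scale and shift
    n = len(batch)
    mean = sum(batch) / n
    var = sum((v - mean) ** 2 for v in batch) / n
    return [gamma * (v - mean) / (var + eps) ** 0.5 + beta for v in batch]

def relu(batch):
    return [max(0.0, v) for v in batch]

conv_out = [-2.0, 0.0, 2.0]   # pretend convolution output for one channel
activated = relu(batch_norm(conv_out))  # conv -> BN -> activation
print(activated)
```

Normalizing before the activation keeps each layer's input distribution centered, which is part of how very deep stacks avoid vanishing/exploding gradients.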

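The two shortcut options (A) and (B) for dimension-increasing blocks can be sketched numerically (assumed shapes; 1-D vectors stand in for feature maps, so the stride-2 subsampling here is only illustrative):

```python
def option_a(x, out_channels):
    # identity shortcut: subsample with stride 2, zero-pad the new channels;
    # introduces no extra parameters
    sub = x[::2]
    return sub + [0.0] * (out_channels - len(sub))

def option_b(x, W):
    # projection shortcut: a learned 1x1 "convolution" (here a plain matrix)
    # applied after stride-2 subsampling to match dimensions
    sub = x[::2]
    return [sum(w * v for w, v in zip(row, sub)) for row in W]

x = [1.0, 2.0, 3.0, 4.0]
print(option_a(x, 4))  # -> [1.0, 3.0, 0.0, 0.0]
print(option_b(x, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]))
```

Option A keeps the network parameter-free on the shortcut path; option B spends a few parameters to let the network learn how to mix channels when the dimensions change.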
  5. Other models:
