Paper Reading: Rethinking the Inception Architecture for Computer Vision

License: CC 4.0 BY-SA

Tags: #c++ #python #deep-learning


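This post discusses ways to optimize the InceptionV1 network: avoiding feature compression early in the network, increasing the activations per convolutional block, factorizing large convolution kernels into smaller ones, and using auxiliary classifiers as regularizers. These optimizations improve network performance without significantly increasing computation. The post also covers a strategy for efficiently reducing feature map size, and how InceptionV2 and InceptionV3 evolved.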

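Contents

1 Abstract
- 1.1 The problem this paper addresses (optimizing InceptionV1)
2 General principles and optimization methods
3 Optimization 1: factorizing large convolution kernels into smaller ones (Factorizing Convolutions with Large Filter Size)
4 Optimization 2: auxiliary classifiers (Utility of Auxiliary Classifiers)
5 Optimization 3: efficiently reducing feature map size (Efficient Grid Size Reduction)
6 InceptionV2
7 InceptionV3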

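1 Abstract

1.1 The problem this paper addresses (optimizing InceptionV1)

Here we will describe a few design principles based on large-scale experimentation with various architectural choices with convolutional networks.
The paper proposes a set of general principles and optimization methods, shown to be useful in the authors' experiments, for scaling up network architectures, and applies them to optimize InceptionV1.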

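2 General principles and optimization methods

2.1 Avoid compressing features heavily early in the network, because doing so discards a great deal of information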

One should avoid bottlenecks with extreme compression. In general the representation size should gently decrease from the inputs to the outputs before reaching the final representation used for the task at hand.
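
One rough way to check a design against this principle is to track the representation size (height x width x channels) from stage to stage and look for abrupt drops. The sketch below is only an illustration: the helper names and the stage shapes are assumptions chosen for the example, not something prescribed by the post.

```python
def representation_size(shape):
    """Number of activations in a feature map of shape (height, width, channels)."""
    h, w, c = shape
    return h * w * c


def compression_ratios(shapes):
    """Ratio of each stage's size to the next stage's size (larger means harsher compression)."""
    sizes = [representation_size(s) for s in shapes]
    return [prev / cur for prev, cur in zip(sizes, sizes[1:])]


# Illustrative stage shapes (assumed for this example).
gentle = [(35, 35, 288), (17, 17, 768), (8, 8, 1280)]
# Same stages, but with a hypothetical 64-channel bottleneck in the middle.
abrupt = [(35, 35, 288), (17, 17, 64), (8, 8, 1280)]

print([representation_size(s) for s in gentle])  # [352800, 221952, 81920]
print(max(compression_ratios(gentle)))           # below 3: size decreases gently
print(max(compression_ratios(abrupt)))           # about 19: an extreme bottleneck
```

In the gentle sequence each stage shrinks the representation by less than 3x, while the hypothetical bottleneck shrinks it by roughly 19x in a single step, which is the kind of compression the principle warns against.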

2.2 Increase the activations per convolutional block (I do not fully understand this one)

Higher dimensional representations are easier to process locally within a network. Increasing the activations per tile in a convolutional network allows for more disentangled features. The resulting networks will train faster.

2.3 Reducing dimensions with a 1x1 convolution before a 3x3 convolution loses little information and speeds up training

Spatial aggregation can be done over lower dimensional embeddings without much or any loss in representational power.
Given that these signals should be easily compressible, the dimension reduction even promotes faster learning.
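
To make the saving concrete, the sketch below counts the weights of a direct 3x3 convolution and compares them with a 1x1 reduction followed by a 3x3 convolution. The channel counts (256 in, 64 reduced, 256 out) are hypothetical values picked for illustration, and biases are ignored.

```python
def conv_weights(c_in, c_out, k):
    """Weight count of a k x k convolution with c_in input and c_out output channels (no bias)."""
    return k * k * c_in * c_out


c_in, c_mid, c_out = 256, 64, 256  # hypothetical channel counts

# Direct 3x3 convolution on the full-width input.
direct = conv_weights(c_in, c_out, k=3)

# 1x1 convolution reduces to c_mid channels, then the 3x3 convolution restores c_out.
reduced = conv_weights(c_in, c_mid, k=1) + conv_weights(c_mid, c_out, k=3)

print(direct)            # 589824
print(reduced)           # 163840
print(direct / reduced)  # 3.6
```

The multiply-accumulate count per output position scales the same way, so under these assumed channel counts the reduced version costs about 3.6 times less than the direct one.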

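2.4 Balance the depth and width of the network (by adjusting filter parameters, the number of filters per layer, the network structure, and so on) to improve performance without increasing computation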

Balance the width and depth of the network. Optimal performance of the network can be reached by balancing the number of filters per stage and the depth of the network. Increasing both the width and the depth of the network can contribute to higher quality networks. However, the optimal improvement for a constant amount of computation can be reached if both are increased in parallel. The computational budget should therefore be distributed in a balanced way between the depth and width of the network.
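
A simplified cost model helps show why the budget has to be shared between the two. Assume a plain stack of `depth` 3x3 convolution layers, each with `width` input and output channels. This is an assumption made for illustration, not the Inception architecture itself. Cost then grows linearly with depth and quadratically with width.

```python
def stack_cost(depth, width, k=3):
    """Multiply-accumulates per spatial position for `depth` k x k conv layers of `width` channels."""
    return depth * k * k * width * width


def balanced_scale(budget_factor):
    """Common factor s for scaling depth and width together so that cost grows by
    budget_factor (cost scales as s * s**2 = s**3 in this simplified model)."""
    return budget_factor ** (1 / 3)


base = stack_cost(depth=10, width=64)
wider = stack_cost(depth=10, width=128)   # doubling width alone
deeper = stack_cost(depth=20, width=64)   # doubling depth alone

print(wider // base)      # 4
print(deeper // base)     # 2
print(balanced_scale(2))  # about 1.26
```

Under this model, doubling the budget while growing depth and width in parallel means scaling each by about 1.26, rather than spending the whole budget on either one.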