R-FCN Network Study Notes

Article link: https://blog.youkuaiyun.com/qinguanggai9953/article/details/88986824

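To address the inefficiency of traditional detection networks, R-FCN shares almost all computation over the entire image, aiming to reconcile the translation invariance wanted in image classification with the translation variance needed in object detection. Popular detectors consist of a fully convolutional subnetwork and an RoI-wise subnetwork. With ResNets and GoogLeNets, an RoI pooling layer is inserted between the convolutional layers to break translation invariance, which costs speed; R-FCN tries to improve detection accuracy without that loss of speed.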


Traditional detection networks apply a costly per-region subnetwork hundreds of times.

R-FCN: almost all computation is shared on the entire image.

Problem addressed: the dilemma between translation invariance in image classification and translation variance in object detection.
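The dilemma can be illustrated with a toy 1-D example. This is only a sketch and not code from the R-FCN paper: a list of numbers stands in for a feature map, and `shift_right`, `global_max_score` and `peak_position` are hypothetical helper names chosen for illustration.

```python
def shift_right(signal, k):
    """Shift a 1-D signal right by k positions, padding with zeros."""
    if k == 0:
        return list(signal)
    return [0.0] * k + list(signal[:-k])


def global_max_score(signal):
    """Classification-style response: global max pooling discards position."""
    return max(signal)


def peak_position(signal):
    """Detection-style response: index of the strongest activation."""
    return signal.index(max(signal))


# Toy "feature map" with one object-like peak at index 2.
feature = [0.0, 0.1, 0.9, 0.1, 0.0, 0.0, 0.0, 0.0]
shifted = shift_right(feature, 3)

# Translation invariance: the classification score does not change.
same_score = global_max_score(feature) == global_max_score(shifted)

# Translation variance: the localization output moves with the object.
offset = peak_position(shifted) - peak_position(feature)
```

A classifier wants `same_score` to hold under any shift, while a detector needs `offset` to track the shift, which is why a single set of fully shared, position-agnostic features does not serve both tasks equally well.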

Introduction:

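Popular detection networks are generally divided by the RoI pooling layer into two subnetworks: (1) a fully convolutional subnetwork that is independent of RoIs (computation is shared)

(2) an RoI-wise subnetwork (computation is not shared)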
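The two-subnetwork split can be sketched in pure Python. This is a simplified illustration under stated assumptions: `shared_backbone` and `roi_wise_head` are hypothetical stand-ins (real subnetworks are stacks of learned layers), RoIs are given as end-exclusive `(x0, y0, x1, y1)` boxes in feature-map coordinates, and `roi_max_pool` uses floor/ceil bin edges in the spirit of RoI pooling.

```python
import math


def shared_backbone(image):
    """Hypothetical stand-in for the fully convolutional subnetwork.

    It runs once per image; here it simply doubles every pixel.
    """
    return [[2.0 * v for v in row] for row in image]


def roi_max_pool(feature, roi, out_h, out_w):
    """Max-pool the RoI (x0, y0, x1, y1), end-exclusive, into out_h x out_w bins."""
    x0, y0, x1, y1 = roi
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # floor/ceil bin edges make the bins cover the whole RoI.
        ys = y0 + math.floor(i * h / out_h)
        ye = y0 + math.ceil((i + 1) * h / out_h)
        row = []
        for j in range(out_w):
            xs = x0 + math.floor(j * w / out_w)
            xe = x0 + math.ceil((j + 1) * w / out_w)
            row.append(max(feature[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


def roi_wise_head(pooled):
    """Hypothetical stand-in for the RoI-wise subnetwork: runs once per RoI."""
    return sum(sum(row) for row in pooled)


image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
rois = [(0, 0, 4, 4), (2, 2, 6, 6)]

feature = shared_backbone(image)                         # computed once per image
pooled = [roi_max_pool(feature, r, 2, 2) for r in rois]  # fixed-size output per RoI
scores = [roi_wise_head(p) for p in pooled]              # computed once per RoI
```

Only the last two lines scale with the number of RoIs, which is the part of the pipeline whose cost the rest of this post is concerned with.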

Historical background: in classic networks such as AlexNet and VGG Nets, the last spatial pooling layer is naturally turned into the RoI pooling layer in object detection.

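Recent networks: ResNets and GoogLeNets are by design fully convolutional. In an object detection architecture it would be natural to use all of the convolutional layers to build the shared convolutional subnetwork, leaving the RoI-wise subnetwork with no hidden layer. However, this lowers detection accuracy, which then fails to match the networks' classification accuracy. To remedy this, the ResNet paper describes how the RoI pooling layer of Faster R-CNN is unnaturally inserted between two sets of convolutional layers. This breaks down translation invariance (the post-RoI convolutional layers are no longer translation-invariant when evaluated across different regions) and improves accuracy, but it sacrifices speed because of the unshared per-RoI computation, since it introduces a considerable number of region-wise layers.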
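The speed trade-off can be made concrete with a small cost model. This is a rough sketch with made-up, unit-less numbers, not measurements from any paper: `detection_cost` is a hypothetical helper, `shared_costs[i]` is the assumed cost of running layer `i` once on the whole image, and `roi_costs[i]` is the assumed cost of running the same layer on one RoI's pooled features.

```python
def detection_cost(shared_costs, roi_costs, split, num_rois):
    """Per-image cost when the RoI pooling layer sits after `split` layers.

    Layers [0, split) are shared and run once on the whole image;
    layers [split, end) are region-wise and run once for every RoI.
    """
    shared = sum(shared_costs[:split])
    per_roi = sum(roi_costs[split:])
    return shared + num_rois * per_roi


# Hypothetical costs for a 5-stage backbone (illustrative only).
shared_costs = [10, 10, 10, 10, 10]
roi_costs = [1, 1, 1, 1, 1]
num_rois = 300

# All conv layers shared, RoI-wise subnetwork has no hidden layer.
all_shared = detection_cost(shared_costs, roi_costs, split=5, num_rois=num_rois)

# RoI pooling inserted before the last stage, which then runs per RoI.
late_split = detection_cost(shared_costs, roi_costs, split=4, num_rois=num_rois)
```

Under these assumed numbers, moving a single stage behind the RoI pooling layer raises the cost from 50 to 340, because that stage is now repeated for each of the 300 RoIs even though one pass over a RoI is cheap.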