Deep Learning Series: A Personal Summary of Focal Loss

Article link: https://blog.youkuaiyun.com/LeeWanzhi/article/details/80069592

1. Introduction

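In terms of pipeline, object detectors generally fall into two broad categories. One is the two-stage detector (such as the classic Faster R-CNN); the other is the one-stage detector (such as SSD and the YOLO series).
One-stage detectors are far faster than two-stage detectors, but their mAP cannot keep up.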
So, why?
The reason is class imbalance (an imbalance between positive and negative samples).
A one-stage detector evaluates 10^4 to 10^5 candidate locations per image, but only a few of those locations contain objects.

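The problem this causes: the samples contain a huge number of easy examples, and they are all negative samples (samples that belong to the background). As a result, easy negative examples en masse (a nice phrase; I picked it up from The Economist) contribute most of the loss and dominate the direction of the gradient updates.
The network then learns little that is useful and cannot classify objects accurately.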
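
To make the "easy negatives dominate the loss" argument concrete, here is a minimal sketch with plain cross-entropy. All of the numbers (sample counts and predicted probabilities) are hypothetical and chosen only for illustration; they do not come from any real detector.

```python
import math

def cross_entropy(p_true: float) -> float:
    """Cross-entropy loss of one example, given the probability assigned to its true class."""
    return -math.log(p_true)

# Hypothetical numbers, for illustration only.
num_easy_negatives = 100_000  # background locations the model already classifies well
num_hard_positives = 100      # object locations the model classifies poorly
p_easy = 0.99                 # probability assigned to the true class (background)
p_hard = 0.30                 # probability assigned to the true class (object)

easy_total = num_easy_negatives * cross_entropy(p_easy)
hard_total = num_hard_positives * cross_entropy(p_hard)
easy_share = easy_total / (easy_total + hard_total)

print(f"loss per easy negative: {cross_entropy(p_easy):.4f}")
print(f"loss per hard positive: {cross_entropy(p_hard):.4f}")
print(f"total loss from easy negatives: {easy_total:.1f}")
print(f"total loss from hard positives: {hard_total:.1f}")
print(f"share of total loss from easy negatives: {easy_share:.1%}")
```

Each easy negative contributes a tiny loss, but with this many of them their sum still outweighs the hard positives, so under these assumptions the summed gradient mostly reflects the background.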

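One more question: why don't two-stage detectors have this problem, or why is it less severe for them than for one-stage detectors?
Because a two-stage detector first uses an RPN to generate region proposals, and this step alone already discards many easy examples. We then filter these region proposals, which lets us manually fix the ratio of positive to negative samples at 1:3.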
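
The 1:3 ratio control can be sketched as a simple sampler over labeled proposals. This is only an illustration, not the actual Faster R-CNN code: the function name `sample_proposals`, the batch size of 128, and the toy label list are all assumptions.

```python
import random

def sample_proposals(labels, batch_size=128, positive_fraction=0.25, seed=0):
    """Sample proposal indices so positives make up at most `positive_fraction` of the batch.

    labels: list of 0 (background) or 1 (object), one entry per region proposal.
    Returns (positive_indices, negative_indices).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]

    # Cap positives at 25% of the batch, then fill the rest with negatives.
    num_pos = min(len(pos), int(batch_size * positive_fraction))
    num_neg = min(len(neg), batch_size - num_pos)
    return rng.sample(pos, num_pos), rng.sample(neg, num_neg)

# Toy input: 40 object proposals and 1960 background proposals.
labels = [1] * 40 + [0] * 1960
pos_idx, neg_idx = sample_proposals(labels)
print(len(pos_idx), len(neg_idx))
```

With a positive fraction of 0.25, the sampled batch holds one positive for every three negatives whenever enough positives are available.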
In addition, negative samples can be selected by