1. Motivation
Why few-shot learning matters:
- In other words, we are unable to alleviate the situation of scarce cases by simply spending more money on annotation, even when big data is accessible.
- Therefore, the study of few-shot learning is an imperative and long-lasting task.
- Long-tail distribution is also a real-world problem: data is scarce (data scarcity).
- Long-tail distribution is an inherent characteristic of the real world.
- Its performance is largely affected by the data scarcity of novel classes.
FSOD (Few-Shot Object Detection) involves novel classes and base classes: base classes are those with a large amount of annotated data, whereas novel classes have only a small amount of annotated data.
- In FSOD, there are base classes in which sufficient objects are annotated with bounding boxes and novel classes in which very few labeled objects are available.
The goal of FSOD is to learn from the limited data in the novel classes with the help of the base classes, so that all novel objects can be detected at test time.
- The few-shot detectors are expected to learn from limited data in novel classes with the aid of abundant data in base classes and to be able to detect all novel objects in a held-out testing set.
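To make the base/novel setup above concrete, here is a minimal sketch of how a few-shot novel set could be drawn from a pool of annotations. The annotation tuple format `(image_id, class_name, bbox)`, the function name `sample_k_shot`, and the fixed seed are assumptions made for this illustration; the paper's actual data splits are not described in these notes.

```python
import random
from collections import defaultdict


def sample_k_shot(annotations, novel_classes, k, seed=0):
    """Keep at most k labeled objects per novel class.

    annotations: list of (image_id, class_name, bbox) tuples
                 (hypothetical format, chosen for this sketch).
    novel_classes: set of class names treated as novel.
    """
    rng = random.Random(seed)  # local RNG so the split is reproducible

    # Group the annotations of novel classes by class name.
    by_class = defaultdict(list)
    for ann in annotations:
        if ann[1] in novel_classes:
            by_class[ann[1]].append(ann)

    # Draw up to k objects per class; classes with fewer than k keep all.
    shots = []
    for cls in sorted(by_class):
        pool = by_class[cls]
        shots.extend(rng.sample(pool, min(k, len(pool))))
    return shots
```

Base-class annotations would be left untouched in this scheme; only the novel classes are cut down to a handful of labeled objects.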
Most current methods adopt meta-learning and metric learning and apply them to fully supervised detectors.
- To achieve this, most recent few-shot detection methods adopt the ideas from meta-learning and metric learning for few-shot recognition and apply them to conventional detection frameworks, e.g. Faster R-CNN [35], YOLO [34].
In this paper, explicit shots and implicit shots are defined as follows:
Explicit shots are the k labeled shots of the novel classes in FSOD; implicit shots come from the dataset used to pretrain the model.
- The explicit shots refer to the available labeled objects from the novel classes.
- In terms of implicit shots, initializing the backbone network with a model pretrained on a large-scale image classification dataset is a common practice for training an object detector.
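However, the categories in the large-scale pretraining dataset may overlap with the novel classes. As Figure 1 shows, once the implicit shots are removed, most methods perform worse at the same number of explicit shots than they do when the implicit shots are available.
The authors attribute this problem to exclusive reliance on visual information: as the training data shrinks, the visual information becomes increasingly limited.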
- We believe the reason for shot sensitivity is due to exclusive dependence on the visual information.
- As a result, visual information becomes limited as image data becomes scarce.
However, the authors point out that one thing stays constant: the semantic relation between base classes and novel classes.
2. Contribution
- To our knowledge, our work is the first to investigate semantic relation reasoning for the few-shot detection task and show its potential to improve a strong baseline.
- Our SRR-FSD achieves stable performance w.r.t. the shot variation, outperforming state-of-the-art FSOD methods under several existing settings, especially when the novel class data is extremely limited.
- We suggest a more realistic FSOD setting in which implicit shots of novel classes are removed from the classification dataset for the pretrained model, and show that our SRR-FSD can maintain a more steady performance compared to previous methods if using the new pretrained model.
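The "implicit shots removed" setting in the last contribution can be pictured as filtering the pretraining class list before the backbone is trained. The sketch below is only an illustration under stated assumptions: the function name `remove_implicit_shots` and the hand-written `related` mapping (novel class to corresponding pretraining class names) are hypothetical, and these notes do not record how the authors actually identified the overlapping classes.

```python
def remove_implicit_shots(pretrain_classes, novel_classes, related=None):
    """Return the pretraining classes that do not overlap the novel classes.

    pretrain_classes: list of class names in the classification dataset.
    novel_classes: iterable of novel detection class names.
    related: optional mapping novel class -> set of pretraining class
             names judged to correspond to it (hypothetical, hand-made).
    """
    related = related or {}

    # Collect every pretraining class name that should be dropped.
    blocked = set()
    for cls in novel_classes:
        blocked.add(cls)                      # exact name match
        blocked.update(related.get(cls, ()))  # finer-grained matches

    # Preserve the original ordering of the remaining classes.
    return [c for c in pretrain_classes if c not in blocked]
```

A backbone pretrained only on the returned classes would then have seen no images of the novel classes, so the k explicit shots are the only supervision for them.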
3. Method
3.1 FSOD Preliminaries
Base classes $C_b$ with dataset $D_b$; $D_b$ contains pairs $\{x_i, y_i\}$