[Object Detection Series: Part 1] Survey Reading Notes: Deep Learning for Generic Object Detection: A Survey

Copyright: CC 4.0 BY-SA

[Abstract]

The survey covers more than 250 key techniques spanning many aspects of generic object detection research: leading detection frameworks and fundamental subproblems, including object feature representation, object proposal generation, context information modeling, and training strategies; and evaluation issues, specifically benchmark datasets, evaluation metrics, and state-of-the-art performance. It closes with a discussion of directions for future research.

[Date] September 2018

[Reference links]

1.https://arxiv.org/abs/1809.02165

2.GitHub - hoya012/deep_learning_object_detection: A paper list of object detection using deep learning.

1 Background

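Tasks

Recognition problems related to generic object detection: (a) image-level object classification, (b) bounding-box-level generic object detection, (c) pixel-level semantic segmentation, (d) instance-level semantic segmentation.

Image classification
Classification and localization
Object detection
Semantic segmentation (moves to the pixel level)
Instance segmentation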
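
The tasks above differ mainly in how fine-grained their outputs are. As a rough illustration (the type names, toy data, and the helper `mask_to_box` are assumptions made for these notes, not something defined in the survey), the outputs can be sketched as plain Python data:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical box convention: (x_min, y_min, x_max, y_max), inclusive pixel indices.
Box = Tuple[int, int, int, int]
Mask = List[List[int]]


@dataclass
class Detection:
    box: Box
    label: str
    score: float


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Tightest box around the nonzero pixels of a binary mask, or None if empty."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    if not ys:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# (a) Image classification: one label for the whole image.
image_label = "dog"

# (c) Semantic segmentation: one class id per pixel (1 = dog, 0 = background).
# Two adjacent dogs share the same class id and are not told apart.
semantic_map = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
]

# (d) Instance segmentation: one binary mask per object instance.
instance_masks = [
    [[1, 1, 0, 0, 0], [1, 1, 0, 0, 0]],
    [[0, 0, 0, 1, 1], [0, 0, 0, 1, 1]],
]

# (b) Object detection: one box, label, and score per instance.
# The score of 1.0 is a placeholder value.
detections = [Detection(mask_to_box(m), "dog", 1.0) for m in instance_masks]
```

Treating the semantic map as a single mask yields one box covering both dogs, while the instance masks yield one box per dog. That is the practical difference between (c) and (d).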

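Object detection can be divided into two types: detection of specific instances and detection of specific categories.
Examples of the former include Donald Trump's face or the Pentagon building; examples of the latter include people, cars, bicycles, and dogs.

A good detector must localize accurately, classify accurately, and run efficiently.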

2 Frameworks

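To keep detection efficient, commonly adopted strategies include cascading, sharing feature computation, and reducing per-window computation.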
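
To make the cascading idea concrete, here is a minimal sketch. It assumes a cascade is a list of `(score_fn, threshold)` stages ordered from cheapest to most expensive, and that a window is dropped at the first stage it fails. The function name `run_cascade` and the toy integer "windows" are illustrative only and do not come from the survey.

```python
def run_cascade(windows, stages):
    """Run each window through the stages in order, rejecting early.

    stages: list of (score_fn, threshold) pairs, cheapest stage first.
    Returns (surviving_windows, number_of_score_fn_calls).
    """
    survivors = []
    calls = 0
    for window in windows:
        accepted = True
        for score_fn, threshold in stages:
            calls += 1
            if score_fn(window) < threshold:
                accepted = False
                break  # later (more expensive) stages are skipped
        if accepted:
            survivors.append(window)
    return survivors, calls


# Toy stand-ins for real classifiers: windows are just integers here.
cheap_stage = (lambda w: w, 5)                             # rejects w < 5
costly_stage = (lambda w: 1.0 if w % 2 == 0 else 0.0, 0.5)  # keeps even w

survivors, calls = run_cascade(range(10), [cheap_stage, costly_stage])
```

In this toy run the costly stage is evaluated only for the five windows that pass the cheap stage, so 15 stage evaluations are made instead of the 20 needed when every window goes through every stage.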

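Detection frameworks fall into two categories:

Two stage detection framework: includes a region proposal step. ROIs are obtained first, then each ROI is classified and its bounding box is regressed. The RCNN family is representative.

RCNN, SPPNet, Fast RCNN, Faster RCNN, RFCN (Region based Fully Convolutional Network), Mask RCNN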
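
The two-stage flow can be sketched roughly as follows. This is only an outline under stated assumptions: `proposals` stands in for the output of the region proposal stage, `head` is a hypothetical second-stage function returning `(label, score, deltas)` for an ROI, and the box refinement uses the center/size offset parameterization commonly associated with the RCNN family. None of these names come from the survey.

```python
import math


def apply_deltas(roi, deltas):
    """Refine an ROI (x_min, y_min, x_max, y_max) with (dx, dy, dw, dh) offsets.

    dx, dy shift the center in units of the ROI width and height;
    dw, dh rescale the width and height in log space.
    """
    x0, y0, x1, y1 = roi
    w, h = x1 - x0, y1 - y0
    cx, cy = x0 + 0.5 * w, y0 + 0.5 * h
    dx, dy, dw, dh = deltas
    cx, cy = cx + dx * w, cy + dy * h
    w, h = w * math.exp(dw), h * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


def two_stage_detect(proposals, head, score_threshold=0.5):
    """Stage 2 of a two-stage detector: classify and refine each proposed ROI."""
    results = []
    for roi in proposals:
        label, score, deltas = head(roi)
        if score >= score_threshold:
            results.append((apply_deltas(roi, deltas), label, score))
    return results


# Toy stand-in for a learned head: wide ROIs are "dog", the rest background.
def toy_head(roi):
    if roi[2] - roi[0] > 5:
        return ("dog", 0.9, (0.1, 0.0, 0.0, 0.0))
    return ("background", 0.1, (0.0, 0.0, 0.0, 0.0))


results = two_stage_detect([(0, 0, 10, 10), (0, 0, 2, 2)], toy_head)
```

In a real detector the proposals would come from a method such as selective search or a region proposal network, and the head would be a learned network operating on features pooled from each ROI.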

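One stage detection framework: no region proposal step. The whole image is divided into a grid, and classification and regression are performed for each grid cell. The YOLO family is representative.

These architectures predict class probabilities and bounding box offsets directly from the full image, with no region proposal generation or post classification.
YOLO, SSD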
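
As a rough sketch of the grid idea, the snippet below decodes per-cell predictions into image-space boxes. It assumes a YOLO-v1-like convention in which each cell predicts a confidence, a box center as an offset in [0, 1] inside the cell, and a width and height relative to the whole image. The function names and the layout of `preds` are assumptions made for illustration.

```python
def decode_cell(row, col, pred, grid_size, img_w, img_h):
    """Turn one cell's (x_off, y_off, w_rel, h_rel) into (x_min, y_min, x_max, y_max)."""
    x_off, y_off, w_rel, h_rel = pred
    cx = (col + x_off) / grid_size * img_w  # center x in pixels
    cy = (row + y_off) / grid_size * img_h  # center y in pixels
    w, h = w_rel * img_w, h_rel * img_h
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def decode_grid(preds, img_w, img_h, conf_threshold=0.5):
    """preds[row][col] = (conf, x_off, y_off, w_rel, h_rel) on a square grid.

    Returns (box, conf) for every cell whose confidence passes the threshold.
    """
    grid_size = len(preds)
    boxes = []
    for row in range(grid_size):
        for col in range(grid_size):
            conf, *box_pred = preds[row][col]
            if conf >= conf_threshold:
                box = decode_cell(row, col, box_pred, grid_size, img_w, img_h)
                boxes.append((box, conf))
    return boxes


# Toy 2x2 grid with a single confident cell at row 0, col 1.
empty = (0.0, 0.0, 0.0, 0.0, 0.0)
preds = [
    [empty, (0.8, 0.5, 0.5, 0.5, 0.5)],
    [empty, empty],
]
boxes = decode_grid(preds, img_w=400, img_h=400)
```

A full one-stage detector would also predict class probabilities per cell and apply non-maximum suppression to the decoded boxes; both are omitted here to keep the sketch short.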

3 Fundamental SubProblems

Improving Object Representation

Multiscale object detection methods fall into three categories:

Detecting with combined features of multiple CNN layers

Hypercolumns, HyperNet, ION
Detecting at multiple CNN layers

FCN combines coarse-to-fine predictions from multiple layers by averaging segmentation probabilities. SSD, MSCNN, RFBNet, and DSOD combine predictions from multiple feature maps to handle objects of various sizes.
Combinations of the above two methods

SharpMask, DSSD (Deconvolutional Single Shot Detector), FPN (Feature Pyramid Network), TDM (Top Down Modulation), RON (Reverse connection with Objectness prior Network), ZIP, STDN (Scale Transfer Detection Network), RefineDet, StairNet
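
To illustrate how features from different CNN layers can be combined, here is a toy sketch on single-channel feature maps stored as nested lists. `stack_features` loosely mirrors the first category (per-pixel features gathered from several layers), and `top_down_merge` loosely mirrors the top-down pathway used by FPN-style methods, with the learned lateral convolutions left out. All function names are assumptions, and the coarse map is assumed to be exactly half the resolution of the fine one.

```python
def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def stack_features(coarse, fine):
    """Per-pixel feature vector (fine_value, upsampled_coarse_value)."""
    up = upsample2x(coarse)
    return [list(zip(r_fine, r_up)) for r_fine, r_up in zip(fine, up)]


def top_down_merge(coarse, fine):
    """Upsample the coarse map and add it element-wise to the fine map."""
    up = upsample2x(coarse)
    return [[a + b for a, b in zip(r_up, r_fine)] for r_up, r_fine in zip(up, fine)]


coarse = [[1, 2]]                       # deeper layer, low resolution
fine = [[10, 10, 10, 10],
        [10, 10, 10, 10]]               # shallower layer, high resolution

stacked = stack_features(coarse, fine)
merged = top_down_merge(coarse, fine)
```

Methods in the third category typically run detection on several such merged maps, one per scale, rather than on a single combined map.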

Context Modeling

Contextual information falls into three categories:

Semantic context: The likelihood of an object to be found in some scenes but not in others;

Spatial context: The likelihood of finding an object in some position and not others with respect to other objects in the scene;

Scale context: Objects have a limited set of sizes relative to other objects in the scene.
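
As a toy illustration of semantic context, the "likelihood of an object to be found in some scenes but not in others" can be estimated as an empirical conditional probability from (scene, object) observations. The data and the function name `object_given_scene` below are made up for illustration; the survey does not prescribe this computation.

```python
from collections import Counter


def object_given_scene(observations):
    """Empirical P(object | scene) from a list of (scene, object_label) pairs.

    Returns a dict keyed by (scene, object_label).
    """
    scene_counts = Counter(scene for scene, _ in observations)
    pair_counts = Counter(observations)
    return {pair: n / scene_counts[pair[0]] for pair, n in pair_counts.items()}


# Made-up observations, purely for illustration.
observations = [
    ("street", "car"),
    ("street", "car"),
    ("street", "person"),
    ("kitchen", "cup"),
]
prior = object_given_scene(observations)

# Pairs never observed together get probability 0 via a default.
p_car_in_kitchen = prior.get(("kitchen", "car"), 0.0)
```

A detector could use such a prior to down-weight detections that are implausible for the scene, though how context is actually injected varies widely between methods.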