Object Detection [Paper Collection]

https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html


Papers

          R-CNN

          Fast R-CNN

          Faster R-CNN

          YOLO

          YOLOv2

          YOLOv3

          DenseBox

          SSD

          OHEM

          R-FCN

          Feature Pyramid Network (FPN)

          RetinaNet

Non-Maximum Suppression (NMS)

Adversarial Examples

Weakly Supervised Object Detection

Video Object Detection

Object Detection on Mobile Devices

Object Detection in 3D

Object Detection on RGB-D

Zero-Shot Object Detection

Salient Object Detection

Video Saliency Detection

Visual Relationship Detection

Face Detection

          MTCNN

          Detect Small Faces

Person Head Detection

Pedestrian Detection / People Detection

          Multispectral Pedestrian Detection

Vehicle Detection

Traffic-Sign Detection

Skeleton Detection

Fruit Detection

          Shadow Detection

Others Detection

Object Proposal

Localization

Tutorials / Talks

Projects

Leaderboard

Tools

Blogs


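| Method | Backbone | Test size | VOC2007 | VOC2010 | VOC2012 | ILSVRC 2013 | MSCOCO 2015 | Speed |
|---|---|---|---|---|---|---|---|---|
| OverFeat | | | | | | 24.3% | | |
| R-CNN | AlexNet | | 58.5% | 53.7% | 53.3% | 31.4% | | |
| R-CNN | VGG16 | | 66.0% | | | | | |
| SPP_net | ZF-5 | | 54.2% | | | 31.84% | | |
| DeepID-Net | | | 64.1% | | | 50.3% | | |
| NoC | | | 73.3% | | 68.8% | | | |
| Fast-RCNN | VGG16 | | 70.0% | 68.8% | 68.4% | | 19.7%(@[0.5-0.95]), 35.9%(@0.5) | |
| MR-CNN | | | 78.2% | | 73.9% | | | |
| Faster-RCNN | VGG16 | | 78.8% | | 75.9% | | 21.9%(@[0.5-0.95]), 42.7%(@0.5) | 198ms |
| Faster-RCNN | ResNet101 | | 85.6% | | 83.8% | | 37.4%(@[0.5-0.95]), 59.0%(@0.5) | |
| YOLO | | | 63.4% | | 57.9% | | | 45 fps |
| YOLO VGG-16 | | | 66.4% | | | | | 21 fps |
| YOLOv2 | | 448x448 | 78.6% | | 73.4% | | 21.6%(@[0.5-0.95]), 44.0%(@0.5) | 40 fps |
| SSD | VGG16 | 300x300 | 77.2% | | 75.8% | | 25.1%(@[0.5-0.95]), 43.1%(@0.5) | 46 fps |
| SSD | VGG16 | 512x512 | 79.8% | | 78.5% | | 28.8%(@[0.5-0.95]), 48.5%(@0.5) | 19 fps |
| SSD | ResNet101 | 300x300 | | | | | 28.0%(@[0.5-0.95]) | 16 fps |
| SSD | ResNet101 | 512x512 | | | | | 31.2%(@[0.5-0.95]) | 8 fps |
| DSSD | ResNet101 | 300x300 | | | | | 28.0%(@[0.5-0.95]) | 8 fps |
| DSSD | ResNet101 | 500x500 | | | | | 33.2%(@[0.5-0.95]) | 6 fps |
| ION | | | 79.2% | | 76.4% | | | |
| CRAFT | | | 75.7% | | 71.3% | 48.5% | | |
| OHEM | | | 78.9% | | 76.3% | | 25.5%(@[0.5-0.95]), 45.9%(@0.5) | |
| R-FCN | ResNet50 | | 77.4% | | | | | 0.12sec(K40), 0.09sec(TitanX) |
| R-FCN | ResNet101 | | 79.5% | | | | | 0.17sec(K40), 0.12sec(TitanX) |
| R-FCN(ms train) | ResNet101 | | 83.6% | | 82.0% | | 31.5%(@[0.5-0.95]), 53.2%(@0.5) | |
| PVANet 9.0 | | | 84.9% | | 84.2% | | | 750ms(CPU), 46ms(TitanX) |
| RetinaNet | ResNet101-FPN | | | | | | | |
| Light-Head R-CNN | Xception* | 800/1200 | | | | | 31.5%@[0.5:0.95] | 95 fps |
| Light-Head R-CNN | Xception* | 700/1100 | | | | | 30.7%@[0.5:0.95] | 102 fps |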
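
The MSCOCO figures above are qualified by an IoU (intersection over union) threshold: "@0.5" counts a detection as correct when its box overlaps a ground-truth box with IoU of at least 0.5, while "@[0.5-0.95]" averages the score over thresholds from 0.50 to 0.95 in steps of 0.05. The following is a minimal sketch of that overlap test only, not of a full mAP evaluation. The function names, the `COCO_IOU_THRESHOLDS` constant, and the `(x1, y1, x2, y2)` corner format are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Thresholds behind the "@[0.5-0.95]" entries: 0.50, 0.55, ..., 0.95.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the predicted box overlaps the ground truth enough to count."""
    return iou(pred_box, gt_box) >= threshold
```

A detector that localizes loosely can score well at "@0.5" and much lower at "@[0.5-0.95]", which is why several rows in the table report both numbers.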

Papers

Deep Neural Networks for Object Detection

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

R-CNN

Rich feature hierarchies for accurate object detection and semantic segmentation

Fast R-CNN

Fast R-CNN

A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection

Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

R-CNN minus R

Faster R-CNN in MXNet with distributed implementation and data parallelization

Contextual Priming and Feedback for Faster R-CNN

An Implementation of Faster RCNN with Study for Region Sampling

Interpretable R-CNN

Light-Head R-CNN: In Defense of Two-Stage Object Detector

Cascade R-CNN: Delving into High Quality Object Detection

Scalable Object Detection using Deep Neural Networks

Scalable, High-Quality Object Detection

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection

Object Detectors Emerge in Deep Scene CNNs

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

Object Detection Networks on Convolutional Feature Maps

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

DeepBox: Learning Objectness with Convolutional Networks

Object detection via a multi-region & semantic segmentation-aware CNN model

YOLO

You Only Look Once: Unified, Real-Time Object Detection

darkflow - translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++

Start Training YOLO with Our Own Data

YOLO: Core ML versus MPSNNGraph

TensorFlow YOLO object detection on Android

Computer Vision in iOS – Object Detection

YOLOv2

YOLO9000: Better, Faster, Stronger

darknet_scripts

Yolo_mark: GUI for marking bounded boxes of objects in images for training Yolo v2

LightNet: Bringing pjreddie’s DarkNet out of the shadows

https://github.com//explosion/lightnet

YOLO v2 Bounding Box Tool

YOLOv3

YOLOv3: An Incremental Improvement

YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers

https://arxiv.org/abs/1811.05588

Spiking-YOLO: Spiking Neural Network for Real-time Object Detection

https://arxiv.org/abs/1903.06530


AttentionNet: Aggregating Weak Directions for Accurate Object Detection

DenseBox

DenseBox: Unifying Landmark Localization with End to End Object Detection

SSD

SSD: Single Shot MultiBox Detector

What’s the diffience in performance between this new code you pushed and the previous code? #327

https://github.com/weiliu89/caffe/issues/327

DSSD : Deconvolutional Single Shot Detector

Enhancement of SSD by concatenating feature maps for object detection

Context-aware Single-Shot Detector

Feature-Fused SSD: Fast Detection for Small Objects

https://arxiv.org/abs/1709.05054

FSSD: Feature Fusion Single Shot Multibox Detector

https://arxiv.org/abs/1712.00960

Weaving Multi-scale Context for Single Shot Detector

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for Real-time Embedded Object Detection

https://arxiv.org/abs/1802.06488

MDSSD: Multi-scale Deconvolutional Single Shot Detector for small objects

Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

Adaptive Object Detection Using Adjacency and Zoom Prediction

G-CNN: an Iterative Grid Based Object Detector

Factors in Finetuning Deep Model for object detection

Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution

We don’t need no bounding-boxes: Training object class detectors using only human verification

HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection

A MultiPath Network for Object Detection

CRAFT Objects from Images

OHEM

Training Region-based Object Detectors with Online Hard Example Mining

S-OHEM: Stratified Online Hard Example Mining for Object Detection

https://arxiv.org/abs/1705.02233


Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers

R-FCN

R-FCN: Object Detection via Region-based Fully Convolutional Networks

R-FCN-3000 at 30fps: Decoupling Detection and Classification

https://arxiv.org/abs/1712.01802

Recycle deep features for better object detection

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

Multi-stage Object Detection with Group Recursive Learning

Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection

PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

Gated Bi-directional CNN for Object Detection

Crafting GBD-Net for Object Detection

StuffNet: Using ‘Stuff’ to Improve Object Detection

Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

Hierarchical Object Detection with Deep Reinforcement Learning

Learning to detect and localize many objects from few examples

Speed/accuracy trade-offs for modern convolutional object detectors

SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving

Feature Pyramid Network (FPN)

Feature Pyramid Networks for Object Detection

Action-Driven Object Detection with Top-Down Visual Attentions

Beyond Skip Connections: Top-Down Modulation for Object Detection

Wide-Residual-Inception Networks for Real-time Object Detection

Attentional Network for Visual Object Detection

Learning Chained Deep Features and Classifiers for Cascade in Object Detection

DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling

Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries

Spatial Memory for Context Reasoning in Object Detection

Accurate Single Stage Detector Using Recurrent Rolling Convolution

Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection

https://arxiv.org/abs/1704.05775

LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems

Point Linking Network for Object Detection

Perceptual Generative Adversarial Networks for Small Object Detection

https://arxiv.org/abs/1706.05274

Few-shot Object Detection

https://arxiv.org/abs/1706.08249

Yes-Net: An effective Detector Based on Global Information

https://arxiv.org/abs/1706.09180

SMC Faster R-CNN: Toward a scene-specialized multi-object detector

https://arxiv.org/abs/1706.10217

Towards lightweight convolutional neural networks for object detection

https://arxiv.org/abs/1707.01395

RON: Reverse Connection with Objectness Prior Networks for Object Detection

Mimicking Very Efficient Network for Object Detection

Residual Features and Unified Prediction Network for Single Stage Detection

https://arxiv.org/abs/1707.05031

Deformable Part-based Fully Convolutional Network for Object Detection

Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors

Recurrent Scale Approximation for Object Detection in CNN

DSOD: Learning Deeply Supervised Object Detectors from Scratch

Object Detection from Scratch with Deep Supervision

https://arxiv.org/abs/1809.09294

RetinaNet

Focal Loss for Dense Object Detection

Focal Loss Dense Detector for Vehicle Surveillance

https://arxiv.org/abs/1803.01114

CoupleNet: Coupling Global Structure with Local Parts for Object Detection

Incremental Learning of Object Detectors without Catastrophic Forgetting

Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection

https://arxiv.org/abs/1709.04347

StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection

https://arxiv.org/abs/1709.05788

Dynamic Zoom-in Network for Fast Object Detection in Large Images

https://arxiv.org/abs/1711.05187

Zero-Annotation Object Detection with Web Knowledge Transfer

MegDet: A Large Mini-Batch Object Detector

Single-Shot Refinement Neural Network for Object Detection

Receptive Field Block Net for Accurate and Fast Object Detection

An Analysis of Scale Invariance in Object Detection - SNIP

Feature Selective Networks for Object Detection

https://arxiv.org/abs/1711.08879

Learning a Rotation Invariant Detector with Rotatable Bounding Box

Scalable Object Detection for Stylized Objects

Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids

Deep Regionlets for Object Detection

Training and Testing Object Detectors with Virtual Images

Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video

  • keywords: object mining, object tracking, unsupervised object discovery by appearance-based clustering, self-supervised detector adaptation
  • arxiv: https://arxiv.org/abs/1712.08832

Spot the Difference by Object Detection

Localization-Aware Active Learning for Object Detection

Object Detection with Mask-based Feature Encoding

https://arxiv.org/abs/1802.03934

LSTD: A Low-Shot Transfer Detector for Object Detection

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Pseudo Mask Augmented Object Detection

https://arxiv.org/abs/1803.05858

Revisiting RCNN: On Awakening the Classification Power of Faster RCNN

Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection

Learning Region Features for Object Detection

Single-Shot Bidirectional Pyramid Networks for High-Quality Object Detection

Object Detection for Comics using Manga109 Annotations

Task-Driven Super Resolution: Object Detection in Low-resolution Images

https://arxiv.org/abs/1803.11316

Transferring Common-Sense Knowledge for Object Detection

https://arxiv.org/abs/1804.01077

Multi-scale Location-aware Kernel Representation for Object Detection

Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors

DetNet: A Backbone network for Object Detection

Robust Physical Adversarial Attack on Faster R-CNN Object Detector

https://arxiv.org/abs/1804.05810

AdvDetPatch: Attacking Object Detectors with Adversarial Patches

https://arxiv.org/abs/1806.02299

Attacking Object Detectors via Imperceptible Patches on Background

https://arxiv.org/abs/1809.05966

Physical Adversarial Examples for Object Detectors

Quantization Mimic: Towards Very Tiny CNN for Object Detection

https://arxiv.org/abs/1805.02152

Object detection at 200 Frames Per Second

Object Detection using Domain Randomization and Generative Adversarial Refinement of Synthetic Images

SNIPER: Efficient Multi-Scale Training

Soft Sampling for Robust Object Detection

https://arxiv.org/abs/1806.06986

MetaAnchor: Learning to Detect Objects with Customized Anchors

Localization Recall Precision (LRP): A New Performance Metric for Object Detection

Auto-Context R-CNN

Pooling Pyramid Network for Object Detection

Modeling Visual Context is Key to Augmenting Object Detection Datasets

Dual Refinement Network for Single-Shot Object Detection

https://arxiv.org/abs/1807.08638

Acquisition of Localization Confidence for Accurate Object Detection

CornerNet: Detecting Objects as Paired Keypoints

Unsupervised Hard Example Mining from Videos for Improved Object Detection

SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection

https://arxiv.org/abs/1808.04974

A Survey of Modern Object Detection Literature using Deep Learning

https://arxiv.org/abs/1808.07256

Tiny-DSOD: Lightweight Object Detection for Resource-Restricted Usages

Deep Feature Pyramid Reconfiguration for Object Detection

MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection

Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks

https://arxiv.org/abs/1809.03193

Deep Learning for Generic Object Detection: A Survey

https://arxiv.org/abs/1809.02165

Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples

ScratchDet: Exploring to Train Single-Shot Object Detectors from Scratch

Fast and accurate object detection in high resolution 4K and 8K video using GPUs

  • intro: Best Paper Finalist at IEEE High Performance Extreme Computing Conference (HPEC) 2018
  • intro: Carnegie Mellon University
  • arxiv: https://arxiv.org/abs/1810.10551

Hybrid Knowledge Routed Modules for Large-scale Object Detection

Gradient Harmonized Single-stage Detector

M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network

BAN: Focusing on Boundary Context for Object Detection

https://arxiv.org/abs/1811.05243

Multi-layer Pruning Framework for Compressing Single Shot MultiBox Detector

R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy

DeRPN: Taking a further step toward more general object detection

Fast Efficient Object Detection Using Selective Attention

https://arxiv.org/abs/1811.07502

Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects

https://arxiv.org/abs/1811.10862

Efficient Coarse-to-Fine Non-Local Module for the Detection of Small Objects

https://arxiv.org/abs/1811.12152

Deep Regionlets: Blended Representation and Deep Learning for Generic Object Detection

https://arxiv.org/abs/1811.11318

Grid R-CNN

Transferable Adversarial Attacks for Image and Video Object Detection

https://arxiv.org/abs/1811.12641

Anchor Box Optimization for Object Detection

AutoFocus: Efficient Multi-Scale Inference

Few-shot Object Detection via Feature Reweighting

https://arxiv.org/abs/1812.01866

Practical Adversarial Attack Against Object Detector

https://arxiv.org/abs/1812.10217

Learning Efficient Detector with Semi-supervised Adaptive Distillation

Scale-Aware Trident Networks for Object Detection

Region Proposal by Guided Anchoring

Consistent Optimization for Single-Shot Object Detection

Bottom-up Object Detection by Grouping Extreme and Center Points

A Single-shot Object Detector with Feature Aggragation and Enhancement

https://arxiv.org/abs/1902.02923

Bag of Freebies for Training Object Detection Neural Networks

Augmentation for small object detection

https://arxiv.org/abs/1902.07296

Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression

SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition

BayesOD: A Bayesian Approach for Uncertainty Estimation in Deep Object Detectors

DetNAS: Neural Architecture Search on Object Detection

ThunderNet: Towards Real-time Generic Object Detection

https://arxiv.org/abs/1903.11752

Feature Intertwiner for Object Detection

Few-shot Adaptive Faster R-CNN

Improving Object Detection with Inverted Attention

https://arxiv.org/abs/1903.12255

FCOS: Fully Convolutional One-Stage Object Detection

https://arxiv.org/abs/1904.01355

Non-Maximum Suppression (NMS)

End-to-End Integration of a Convolutional Network, Deformable Parts Model and Non-Maximum Suppression

A convnet for non-maximum suppression

Improving Object Detection With One Line of Code

Soft-NMS – Improving Object Detection With One Line of Code

Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection

Learning non-maximum suppression

Relation Networks for Object Detection

Learning Pairwise Relationship for Multi-object Detection in Crowded Scenes

Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples

https://arxiv.org/abs/1902.02067

Adversarial Examples

Adversarial Examples that Fool Detectors

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

Weakly Supervised Object Detection

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection

Weakly supervised object detection using pseudo-strong labels

Saliency Guided End-to-End Learning for Weakly Supervised Object Detection

Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection

Video Object Detection

Learning Object Class Detectors from Weakly Annotated Video

Analysing domain shift factors between videos and images for object detection

Video Object Recognition

Deep Learning for Saliency Prediction in Natural Video

T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos

Object Detection from Video Tubelets with Convolutional Neural Networks

Object Detection in Videos with Tubelets and Multi-context Cues

Context Matters: Refining Object Detection in Video with Recurrent Neural Networks

CNN Based Object Detection in Large Video Images

Object Detection in Videos with Tubelet Proposal Networks

Flow-Guided Feature Aggregation for Video Object Detection

Video Object Detection using Faster R-CNN

Improving Context Modeling for Video Object Detection and Tracking

http://image-net.org/challenges/talks_2017/ilsvrc2017_short(poster).pdf

Temporal Dynamic Graph LSTM for Action-driven Video Object Detection

Mobile Video Object Detection with Temporally-Aware Feature Maps

https://arxiv.org/abs/1711.06368

Towards High Performance Video Object Detection

https://arxiv.org/abs/1711.11577

Impression Network for Video Object Detection

https://arxiv.org/abs/1712.05896

Spatial-Temporal Memory Networks for Video Object Detection

https://arxiv.org/abs/1712.06317

3D-DETNet: a Single Stage Video-Based Vehicle Detector

https://arxiv.org/abs/1801.01769

Object Detection in Videos by Short and Long Range Object Linking

https://arxiv.org/abs/1801.09823

Object Detection in Video with Spatiotemporal Sampling Networks

Towards High Performance Video Object Detection for Mobiles

Optimizing Video Object Detection via a Scale-Time Lattice

Pack and Detect: Fast Object Detection in Videos Using Region-of-Interest Packing

https://arxiv.org/abs/1809.01701

Fast Object Detection in Compressed Video

https://arxiv.org/abs/1811.11057

Tube-CNN: Modeling temporal evolution of appearance for object detection in video

AdaScale: Towards Real-time Video Object Detection Using Adaptive Scaling

SCNN: A General Distribution based Statistical Convolutional Neural Network with Application to Video Object Detection

Looking Fast and Slow: Memory-Guided Mobile Video Object Detection

Progressive Sparse Local Attention for Video object detection

Object Detection on Mobile Devices

Pelee: A Real-Time Object Detection System on Mobile Devices

Object Detection in 3D

Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks

Complex-YOLO: Real-time 3D Object Detection on Point Clouds

Focal Loss in 3D Object Detection

3D Object Detection Using Scale Invariant and Feature Reweighting Networks

3D Backbone Network for 3D Object Detection

https://arxiv.org/abs/1901.08373

Object Detection on RGB-D

Learning Rich Features from RGB-D Images for Object Detection and Segmentation

Differential Geometry Boosts Convolutional Neural Networks for Object Detection

A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation

https://arxiv.org/abs/1703.03347

Cross-Modal Attentional Context Learning for RGB-D Object Detection

Zero-Shot Object Detection

Zero-Shot Detection

Zero-Shot Object Detection

https://arxiv.org/abs/1804.04340

Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts

Zero-Shot Object Detection by Hybrid Region Embedding

Salient Object Detection

This task involves predicting the salient regions of an image, as indicated by human eye fixations.

Best Deep Saliency Detection Models (CVPR 2016 & 2015)

http://i.cs.hku.hk/~yzyu/vision.html

Large-scale optimization of hierarchical features for saliency prediction in natural images

Predicting Eye Fixations using Convolutional Neural Networks

Saliency Detection by Multi-Context Deep Learning

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

Shallow and Deep Convolutional Networks for Saliency Prediction

Recurrent Attentional Networks for Saliency Detection

Two-Stream Convolutional Networks for Dynamic Saliency Prediction

Unconstrained Salient Object Detection

Unconstrained Salient Object Detection via Proposal Subset Optimization

DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection

Salient Object Subitizing

Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection

Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs

Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection

A Deep Multi-Level Network for Saliency Prediction

Visual Saliency Detection Based on Multiscale Deep CNN Features

A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection

Deeply supervised salient object detection with short connections

Weakly Supervised Top-down Salient Object Detection

SalGAN: Visual Saliency Prediction with Generative Adversarial Networks

Visual Saliency Prediction Using a Mixture of Deep Neural Networks

A Fast and Compact Salient Score Regression Network Based on Fully Convolutional Network

Saliency Detection by Forward and Backward Cues in Deep-CNNs

https://arxiv.org/abs/1703.00152

Supervised Adversarial Networks for Image Saliency Detection

https://arxiv.org/abs/1704.07242

Group-wise Deep Co-saliency Detection

https://arxiv.org/abs/1707.07381

Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection

Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection

Learning Uncertain Convolutional Features for Accurate Saliency Detection

Deep Edge-Aware Saliency Detection

https://arxiv.org/abs/1708.04366

Self-explanatory Deep Salient Object Detection

PiCANet: Learning Pixel-wise Contextual Attention in ConvNets and Its Application in Saliency Detection

https://arxiv.org/abs/1708.06433

DeepFeat: A Bottom Up and Top Down Saliency Model Based on Deep Features of Convolutional Neural Nets

https://arxiv.org/abs/1709.02495

Recurrently Aggregating Deep Features for Salient Object Detection

Deep saliency: What is learnt by a deep network about saliency?

Contrast-Oriented Deep Neural Networks for Salient Object Detection

Salient Object Detection by Lossless Feature Reflection

HyperFusion-Net: Densely Reflective Fusion for Salient Object Detection

https://arxiv.org/abs/1804.05142

Video Saliency Detection

Deep Learning For Video Saliency Detection

Video Salient Object Detection Using Spatiotemporal Deep Features

https://arxiv.org/abs/1708.01447

Predicting Video Saliency with Object-to-Motion CNN and Two-layer Convolutional LSTM

https://arxiv.org/abs/1709.06316

Visual Relationship Detection

Visual Relationship Detection with Language Priors

ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection

Visual Translation Embedding Network for Visual Relation Detection

Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection

Detecting Visual Relationships with Deep Relational Networks

Identifying Spatial Relations in Images using Convolutional Neural Networks

https://arxiv.org/abs/1706.04215

PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN

Natural Language Guided Visual Relationship Detection

https://arxiv.org/abs/1711.06032

Detecting Visual Relationships Using Box Attention

Google AI Open Images - Visual Relationship Track

Context-Dependent Diffusion Network for Visual Relationship Detection

A Problem Reduction Approach for Visual Relationships Detection

Exploring the Semantics for Visual Relationship Detection

https://arxiv.org/abs/1904.02104

Face Detection

Multi-view Face Detection Using Deep Convolutional Neural Networks

From Facial Parts Responses to Face Detection: A Deep Learning Approach

Compact Convolutional Neural Network Cascade for Face Detection

Face Detection with End-to-End Integration of a ConvNet and a 3D Model

CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection

Towards a Deep Learning Framework for Unconstrained Face Detection

Supervised Transformer Network for Efficient Face Detection

UnitBox: An Advanced Object Detection Network

Bootstrapping Face Detection with Hard Negative Examples

Grid Loss: Detecting Occluded Faces

A Multi-Scale Cascade Fully Convolutional Network Face Detector

MTCNN

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks

Face Detection using Deep Learning: An Improved Faster RCNN Approach

Faceness-Net: Face Detection through Deep Facial Part Responses

Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained “Hard Faces”

End-To-End Face Detection and Recognition

https://arxiv.org/abs/1703.10818

Face R-CNN

https://arxiv.org/abs/1706.01061

Face Detection through Scale-Friendly Deep Convolutional Networks

https://arxiv.org/abs/1706.02863

Scale-Aware Face Detection

Detecting Faces Using Inside Cascaded Contextual CNN

Multi-Branch Fully Convolutional Network for Face Detection

https://arxiv.org/abs/1707.06330

SSH: Single Stage Headless Face Detector

Dockerface: an easy to install and use Faster R-CNN face detector in a Docker container

https://arxiv.org/abs/1708.04370

FaceBoxes: A CPU Real-time Face Detector with High Accuracy

S3FD: Single Shot Scale-invariant Face Detector

Detecting Faces Using Region-based Fully Convolutional Networks

https://arxiv.org/abs/1709.05256

AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection

https://arxiv.org/abs/1709.07326

Face Attention Network: An effective Face Detector for the Occluded Faces

https://arxiv.org/abs/1711.07246

Feature Agglomeration Networks for Single Stage Face Detection

https://arxiv.org/abs/1712.00721

Face Detection Using Improved Faster RCNN

PyramidBox: A Context-assisted Single Shot Face Detector

PyramidBox++: High Performance Detector for Finding Tiny Face

A Fast Face Detection Method via Convolutional Neural Network

Beyond Trade-off: Accelerate FCN-based Face Detector with Higher Accuracy

Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks

SFace: An Efficient Network for Face Detection in Large Scale Variations

Survey of Face Detection on Low-quality Images

https://arxiv.org/abs/1804.07362

Anchor Cascade for Efficient Face Detection

Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization

Selective Refinement Network for High Performance Face Detection

https://arxiv.org/abs/1809.02693

DSFD: Dual Shot Face Detector

https://arxiv.org/abs/1810.10220

Learning Better Features for Face Detection with Feature Fusion and Segmentation Supervision

https://arxiv.org/abs/1811.08557

FA-RPN: Floating Region Proposals for Face Detection

https://arxiv.org/abs/1812.05586

Robust and High Performance Face Detector

https://arxiv.org/abs/1901.02350

DAFE-FD: Density Aware Feature Enrichment for Face Detection

https://arxiv.org/abs/1901.05375

Improved Selective Refinement Network for Face Detection

Revisiting a single-stage method for face detection

https://arxiv.org/abs/1902.01559

MSFD: Multi-Scale Receptive Field Face Detector

Detect Small Faces

Finding Tiny Faces

Detecting and counting tiny faces

Seeing Small Faces from Robust Anchor’s Perspective

Face-MagNet: Magnifying Feature Maps to Detect Small Faces

Robust Face Detection via Learning Small Faces on Hard Images

SFA: Small Faces Attention Face Detector

Person Head Detection

Context-aware CNNs for person head detection

Detecting Heads using Feature Refine Net and Cascaded Multi-scale Architecture

https://arxiv.org/abs/1803.09256

A Comparison of CNN-based Face and Head Detectors for Real-Time Video Surveillance Applications

https://arxiv.org/abs/1809.03336

FCHD: A fast and accurate head detector

Pedestrian Detection / People Detection

Pedestrian Detection aided by Deep Learning Semantic Tasks

Deep Learning Strong Parts for Pedestrian Detection

Taking a Deeper Look at Pedestrians

Convolutional Channel Features

End-to-end people detection in crowded scenes

Learning Complexity-Aware Cascades for Deep Pedestrian Detection

Deep convolutional neural networks for pedestrian detection

Scale-aware Fast R-CNN for Pedestrian Detection

New algorithm improves speed and accuracy of pedestrian detection

Pushing the Limits of Deep CNNs for Pedestrian Detection

  • intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
  • arxiv: http://arxiv.org/abs/1603.04525

A Real-Time Deep Learning Pedestrian Detector for Robot Navigation

A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation

Is Faster R-CNN Doing Well for Pedestrian Detection?

Unsupervised Deep Domain Adaptation for Pedestrian Detection

Reduced Memory Region Based Deep Convolutional Neural Network Detection

Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection

Detecting People in Artwork with CNNs

Deep Multi-camera People Detection

Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters

What Can Help Pedestrian Detection?

Illuminating Pedestrians via Simultaneous Detection & Segmentation

https://arxiv.org/abs/1706.08564

Rotational Rectification Network for Robust Pedestrian Detection

STD-PD: Generating Synthetic Training Data for Pedestrian Detection in Unannotated Videos

Too Far to See? Not Really! — Pedestrian Detection with Scale-aware Localization Policy

https://arxiv.org/abs/1709.00235

Repulsion Loss: Detecting Pedestrians in a Crowd

https://arxiv.org/abs/1711.07752

Aggregated Channels Network for Real-Time Pedestrian Detection

https://arxiv.org/abs/1801.00476

Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection

https://arxiv.org/abs/1804.00872

Pedestrian-Synthesis-GAN: Generating Pedestrian Data in Real Scene and Beyond

https://arxiv.org/abs/1804.02047

PCN: Part and Context Information for Pedestrian Detection with CNNs

Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors

Small-scale Pedestrian Detection Based on Somatic Topology Localization and Temporal Feature Aggregation

Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd

Bi-box Regression for Pedestrian Detection and Occlusion Estimation

Pedestrian Detection with Autoregressive Network Phases

SSA-CNN: Semantic Self-Attention CNN for Pedestrian Detection

https://arxiv.org/abs/1902.09080

Multispectral Pedestrian Detection

Multispectral Deep Neural Networks for Pedestrian Detection

Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection

Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation

The Cross-Modality Disparity Problem in Multispectral Pedestrian Detection

https://arxiv.org/abs/1901.02645

Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection

https://arxiv.org/abs/1902.05291

GFD-SSD: Gated Fusion Double SSD for Multispectral Pedestrian Detection

https://arxiv.org/abs/1903.06999

Vehicle Detection

DAVE: A Unified Framework for Fast Vehicle Detection and Annotation

Evolving Boxes for fast Vehicle Detection

Fine-Grained Car Detection for Visual Census Estimation

SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection

Label and Sample: Efficient Training of Vehicle Object Detector from Sparsely Labeled Data

Domain Randomization for Scene-Specific Car Detection and Pose Estimation

https://arxiv.org/abs/1811.05939

ShuffleDet: Real-Time Vehicle Detection Network in On-board Embedded UAV Imagery

Traffic-Sign Detection

Traffic-Sign Detection and Classification in the Wild

Evaluating State-of-the-art Object Detector on Challenging Traffic Light Data

Detecting Small Signs from Large Images

Localized Traffic Sign Detection with Multi-scale Deconvolution Networks

https://arxiv.org/abs/1804.10428

Detecting Traffic Lights by Single Shot Detection

A Hierarchical Deep Architecture and Mini-Batch Selection Method For Joint Traffic Sign and Light Detection

Skeleton Detection

Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs

DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images

SRN: Side-output Residual Network for Object Symmetry Detection in the Wild

Hi-Fi: Hierarchical Feature Integration for Skeleton Detection

https://arxiv.org/abs/1801.01849

Fruit Detection

Deep Fruit Detection in Orchards

Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards

Shadow Detection

Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network

https://arxiv.org/abs/1709.09283

A+D-Net: Shadow Detection with Adversarial Shadow Attenuation

https://arxiv.org/abs/1712.01361

Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal

https://arxiv.org/abs/1712.02478

Direction-aware Spatial Context Features for Shadow Detection

Direction-aware Spatial Context Features for Shadow Detection and Removal

Others Detection

Deep Deformation Network for Object Landmark Localization

Fashion Landmark Detection in the Wild

Deep Learning for Fast and Accurate Fashion Item Detection

OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)

Selfie Detection by Synergy-Constraint Based Convolutional Neural Network

Associative Embedding: End-to-End Learning for Joint Detection and Grouping

Deep Cuboid Detection: Beyond 2D Bounding Boxes

Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection

Deep Learning Logo Detection with Data Expansion by Synthesising Context

Scalable Deep Learning Logo Detection

https://arxiv.org/abs/1803.11417

Pixel-wise Ear Detection with Convolutional Encoder-Decoder Networks

Automatic Handgun Detection Alarm in Videos Using Deep Learning

Objects as context for part detection

https://arxiv.org/abs/1703.09529

Using Deep Networks for Drone Detection

Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

Target Driven Instance Detection

https://arxiv.org/abs/1803.04610

DeepVoting: An Explainable Framework for Semantic Part Detection under Partial Occlusion

https://arxiv.org/abs/1709.04577

VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition

Grab, Pay and Eat: Semantic Food Detection for Smart Restaurants

https://arxiv.org/abs/1711.05128

ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos

Deep Learning Object Detection Methods for Ecological Camera Trap Data

EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection

https://arxiv.org/abs/1806.05525

Towards End-to-End Lane Detection: an Instance Segmentation Approach

iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection

Densely Supervised Grasp Detector (DSGD)

https://arxiv.org/abs/1810.03962

Object Proposal

DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

Scale-aware Pixel-wise Object Proposal Networks

Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization

Learning to Segment Object Proposals via Recursive Neural Networks

Learning Detection with Diverse Proposals

  • intro: CVPR 2017
  • keywords: differentiable Determinantal Point Process (DPP) layer, Learning Detection with Diverse Proposals (LDDP)
  • arxiv: https://arxiv.org/abs/1704.03533

ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond

Improving Small Object Proposals for Company Logo Detection

Open Logo Detection Challenge

AttentionMask: Attentive, Efficient Object Proposal Generation Focusing on Small Objects

Localization

Beyond Bounding Boxes: Precise Localization of Objects in Images

Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

Weakly Supervised Object Localization Using Size Estimates

Active Object Localization with Deep Reinforcement Learning

Localizing objects using referring expressions

LocNet: Improving Localization Accuracy for Object Detection

Learning Deep Features for Discriminative Localization

ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization

Ensemble of Part Detectors for Simultaneous Classification and Localization

https://arxiv.org/abs/1705.10034

STNet: Selective Tuning of Convolutional Networks for Object Localization

https://arxiv.org/abs/1708.06418

Soft Proposal Networks for Weakly Supervised Object Localization

Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN

Tutorials / Talks

Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection

Towards Good Practices for Recognition & Detection

Work in progress: Improving object detection and instance segmentation for small objects

https://docs.google.com/presentation/d/1OTfGn6mLe1VWE8D0q6Tu_WwFTSoLGd4OF8WCYnOWcVo/edit#slide=id.g37418adc7a_0_229

Object Detection with Deep Learning: A Review

https://arxiv.org/abs/1807.05511

Projects

Detectron

TensorBox: a simple framework for training neural networks to detect objects in images

Object detection in torch: Implementation of some object detection frameworks in torch

Using DIGITS to train an Object Detection network

FCN-MultiBox Detector

KittiBox: A car detection model implemented in Tensorflow.

Deformable Convolutional Networks + MST + Soft-NMS

How to Build a Real-time Hand-Detector using Neural Networks (SSD) on Tensorflow

Metrics for object detection

MobileNetv2-SSDLite

Leaderboard

Detection Results: VOC2012

Tools

BeaverDam: Video annotation tool for deep learning training labels

https://github.com/antingshen/BeaverDam

Blogs

Convolutional Neural Networks for Object Detection

http://rnd.azoft.com/convolutional-neural-networks-object-detection/

Introducing automatic object detection to visual search (Pinterest)

Deep Learning for Object Detection with DIGITS

Analyzing The Papers Behind Facebook’s Computer Vision Approach

Easily Create High Quality Object Detectors with Deep Learning

How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit

Object Detection in Satellite Imagery, a Low Overhead Approach

You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks

Faster R-CNN Pedestrian and Car Detection

Small U-Net for vehicle detection

Region of interest pooling explained

Supercharge your Computer Vision models with the TensorFlow Object Detection API

Understanding SSD MultiBox — Real-Time Object Detection In Deep Learning

https://towardsdatascience.com/understanding-ssd-multibox-real-time-object-detection-in-deep-learning-495ef744fab

One-shot object detection

http://machinethink.net/blog/object-detection/

An overview of object detection: one-stage methods

https://www.jeremyjordan.me/object-detection-one-stage/

deep learning object detection

 
