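An outstanding object detection reading list: very comprehensive and still being updated. Reposted from: Object Detection - handong1587. If this infringes any rights, contact me and I will remove it.
Last updated: 20181026
I will keep following the original author's blog and updating this post, adding my own notes on new research and paper readings in object detection. To find what you need, search this post by keyword; for example, searching for 2018 turns up the latest papers.
Table of contents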
- Papers
- R-CNN
- Fast R-CNN
- Faster R-CNN
- Light-Head R-CNN
- MultiBox
- SPP-Net
- Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
- DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
- Object Detectors Emerge in Deep Scene CNNs
- segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection
- Object Detection Networks on Convolutional Feature Maps
- Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
- DeepBox: Learning Objectness with Convolutional Networks
- MR-CNN
- YOLO
- You Only Look Once: Unified, Real-Time Object Detection
- darkflow - translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++
- Start Training YOLO with Our Own Data
- YOLO: Core ML versus MPSNNGraph
- TensorFlow YOLO object detection on Android
- Computer Vision in iOS – Object Detection
- YOLOv2
- YOLOv3
- DenseBox
- SSD
- DSSD
- FSSD
- ESSD
- Inside-Outside Net (ION)
- Factors in Finetuning Deep Model for object detection
- CRAFT
- OHEM
- R-FCN
- MS-CNN
- PVANET
- GBD-Net
- Gated Bi-directional CNN for Object Detection
- Crafting GBD-Net for Object Detection
- StuffNet: Using ‘Stuff’ to Improve Object Detection
- Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene
- Hierarchical Object Detection with Deep Reinforcement Learning
- Learning to detect and localize many objects from few examples
- Speed/accuracy trade-offs for modern convolutional object detectors
- SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving
- Feature Pyramid Network (FPN)
- Feature Pyramid Networks for Object Detection
- Action-Driven Object Detection with Top-Down Visual Attentions
- Beyond Skip Connections: Top-Down Modulation for Object Detection
- Wide-Residual-Inception Networks for Real-time Object Detection
- Attentional Network for Visual Object Detection
- Learning Chained Deep Features and Classifiers for Cascade in Object Detection
- DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling
- Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
- Spatial Memory for Context Reasoning in Object Detection
- Accurate Single Stage Detector Using Recurrent Rolling Convolution
- Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection
- LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems
- Point Linking Network for Object Detection
- Perceptual Generative Adversarial Networks for Small Object Detection
- Few-shot Object Detection
- Yes-Net: An effective Detector Based on Global Information
- SMC Faster R-CNN: Toward a scene-specialized multi-object detector
- Towards lightweight convolutional neural networks for object detection
- RON: Reverse Connection with Objectness Prior Networks for Object Detection
- Mimicking Very Efficient Network for Object Detection
- Residual Features and Unified Prediction Network for Single Stage Detection
- Deformable Part-based Fully Convolutional Network for Object Detection
- Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors
- Recurrent Scale Approximation for Object Detection in CNN
- DSOD
- DSOD: Learning Deeply Supervised Object Detectors from Scratch
- Object Detection from Scratch with Deep Supervision
- Focal Loss for Dense Object Detection
- Focal Loss Dense Detector for Vehicle Surveillance
- CoupleNet: Coupling Global Structure with Local Parts for Object Detection
- Incremental Learning of Object Detectors without Catastrophic Forgetting
- Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection
- StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection
- Dynamic Zoom-in Network for Fast Object Detection in Large Images
- Zero-Annotation Object Detection with Web Knowledge Transfer
- MegDet
- MegDet: A Large Mini-Batch Object Detector
- Single-Shot Refinement Neural Network for Object Detection
- Receptive Field Block Net for Accurate and Fast Object Detection
- An Analysis of Scale Invariance in Object Detection - SNIP
- Feature Selective Networks for Object Detection
- Learning a Rotation Invariant Detector with Rotatable Bounding Box
- Scalable Object Detection for Stylized Objects
- Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids
- Deep Regionlets for Object Detection
- Training and Testing Object Detectors with Virtual Images
- Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video
- Spot the Difference by Object Detection
- Localization-Aware Active Learning for Object Detection
- Object Detection with Mask-based Feature Encoding
- LSTD: A Low-Shot Transfer Detector for Object Detection
- Domain Adaptive Faster R-CNN for Object Detection in the Wild
- Pseudo Mask Augmented Object Detection
- Revisiting RCNN: On Awakening the Classification Power of Faster RCNN
- Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection
- Learning Region Features for Object Detection
- Single-Shot Bidirectional Pyramid Networks for High-Quality Object Detection
- Object Detection for Comics using Manga109 Annotations
- Task-Driven Super Resolution: Object Detection in Low-resolution Images
- Transferring Common-Sense Knowledge for Object Detection
- Multi-scale Location-aware Kernel Representation for Object Detection
- Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors
- DetNet: A Backbone network for Object Detection
- Robust Physical Adversarial Attack on Faster R-CNN Object Detector
- AdvDetPatch: Attacking Object Detectors with Adversarial Patches
- Attacking Object Detectors via Imperceptible Patches on Background
- Physical Adversarial Examples for Object Detectors
- Quantization Mimic: Towards Very Tiny CNN for Object Detection
- Object detection at 200 Frames Per Second
- Object Detection using Domain Randomization and Generative Adversarial Refinement of Synthetic Images
- SNIPER: Efficient Multi-Scale Training
- Soft Sampling for Robust Object Detection
- MetaAnchor: Learning to Detect Objects with Customized Anchors
- Localization Recall Precision (LRP): A New Performance Metric for Object Detection
- Auto-Context R-CNN
- Pooling Pyramid Network for Object Detection
- Modeling Visual Context is Key to Augmenting Object Detection Datasets
- Dual Refinement Network for Single-Shot Object Detection
- Acquisition of Localization Confidence for Accurate Object Detection
- CornerNet: Detecting Objects as Paired Keypoints
- Unsupervised Hard Example Mining from Videos for Improved Object Detection
- SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection
- A Survey of Modern Object Detection Literature using Deep Learning
- Tiny-DSOD: Lightweight Object Detection for Resource-Restricted Usages
- Deep Feature Pyramid Reconfiguration for Object Detection
- MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection
- Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks
- Deep Learning for Generic Object Detection: A Survey
- Non-Maximum Suppression (NMS)
- Adversarial Examples
- Weakly Supervised Object Detection
- Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection
- Weakly supervised object detection using pseudo-strong labels
- Saliency Guided End-to-End Learning for Weakly Supervised Object Detection
- Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection
- Video Object Detection
- Learning Object Class Detectors from Weakly Annotated Video
- Analysing domain shift factors between videos and images for object detection
- Video Object Recognition
- Deep Learning for Saliency Prediction in Natural Video
- T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos
- Object Detection from Video Tubelets with Convolutional Neural Networks
- Object Detection in Videos with Tubelets and Multi-context Cues
- Context Matters: Refining Object Detection in Video with Recurrent Neural Networks
- CNN Based Object Detection in Large Video Images
- Object Detection in Videos with Tubelet Proposal Networks
- Flow-Guided Feature Aggregation for Video Object Detection
- Video Object Detection using Faster R-CNN
- Improving Context Modeling for Video Object Detection and Tracking
- Temporal Dynamic Graph LSTM for Action-driven Video Object Detection
- Mobile Video Object Detection with Temporally-Aware Feature Maps
- Towards High Performance Video Object Detection
- Impression Network for Video Object Detection
- Spatial-Temporal Memory Networks for Video Object Detection
- 3D-DETNet: a Single Stage Video-Based Vehicle Detector
- Object Detection in Videos by Short and Long Range Object Linking
- Object Detection in Video with Spatiotemporal Sampling Networks
- Towards High Performance Video Object Detection for Mobiles
- Optimizing Video Object Detection via a Scale-Time Lattice
- Pack and Detect: Fast Object Detection in Videos Using Region-of-Interest Packing
- Object Detection on Mobile Devices
- Object Detection in 3D
- Object Detection on RGB-D
- Zero-Shot Object Detection
- Salient Object Detection
- Best Deep Saliency Detection Models (CVPR 2016 & 2015)
- Large-scale optimization of hierarchical features for saliency prediction in natural images
- Predicting Eye Fixations using Convolutional Neural Networks
- Saliency Detection by Multi-Context Deep Learning
- DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection
- SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection
- Shallow and Deep Convolutional Networks for Saliency Prediction
- Recurrent Attentional Networks for Saliency Detection
- Two-Stream Convolutional Networks for Dynamic Saliency Prediction
- Unconstrained Salient Object Detection
- Unconstrained Salient Object Detection via Proposal Subset Optimization
- DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection
- Salient Object Subitizing
- Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection
- Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs
- Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection
- A Deep Multi-Level Network for Saliency Prediction
- Visual Saliency Detection Based on Multiscale Deep CNN Features
- A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection
- Deeply supervised salient object detection with short connections
- Weakly Supervised Top-down Salient Object Detection
- SalGAN: Visual Saliency Prediction with Generative Adversarial Networks
- Visual Saliency Prediction Using a Mixture of Deep Neural Networks
- A Fast and Compact Salient Score Regression Network Based on Fully Convolutional Network
- Saliency Detection by Forward and Backward Cues in Deep-CNNs
- Supervised Adversarial Networks for Image Saliency Detection
- Group-wise Deep Co-saliency Detection
- Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection
- Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection
- Learning Uncertain Convolutional Features for Accurate Saliency Detection
- Deep Edge-Aware Saliency Detection
- Self-explanatory Deep Salient Object Detection
- PiCANet: Learning Pixel-wise Contextual Attention in ConvNets and Its Application in Saliency Detection
- DeepFeat: A Bottom Up and Top Down Saliency Model Based on Deep Features of Convolutional Neural Nets
- Recurrently Aggregating Deep Features for Salient Object Detection
- Deep saliency: What is learnt by a deep network about saliency?
- Contrast-Oriented Deep Neural Networks for Salient Object Detection
- Salient Object Detection by Lossless Feature Reflection
- HyperFusion-Net: Densely Reflective Fusion for Salient Object Detection
- Video Saliency Detection
- Visual Relationship Detection
- Visual Relationship Detection with Language Priors
- ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection
- Visual Translation Embedding Network for Visual Relation Detection
- Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection
- Detecting Visual Relationships with Deep Relational Networks
- Identifying Spatial Relations in Images using Convolutional Neural Networks
- PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN
- Natural Language Guided Visual Relationship Detection
- Detecting Visual Relationships Using Box Attention
- Google AI Open Images - Visual Relationship Track
- Context-Dependent Diffusion Network for Visual Relationship Detection
- A Problem Reduction Approach for Visual Relationships Detection
- Face Detection
- Multi-view Face Detection Using Deep Convolutional Neural Networks
- From Facial Parts Responses to Face Detection: A Deep Learning Approach
- Compact Convolutional Neural Network Cascade for Face Detection
- Face Detection with End-to-End Integration of a ConvNet and a 3D Model
- CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection
- Towards a Deep Learning Framework for Unconstrained Face Detection
- Supervised Transformer Network for Efficient Face Detection
- UnitBox: An Advanced Object Detection Network
- Bootstrapping Face Detection with Hard Negative Examples
- Grid Loss: Detecting Occluded Faces
- A Multi-Scale Cascade Fully Convolutional Network Face Detector
- MTCNN
- Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks
- Face Detection using Deep Learning: An Improved Faster RCNN Approach
- Faceness-Net: Face Detection through Deep Facial Part Responses
- Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained “Hard Faces”
- End-To-End Face Detection and Recognition
- Face R-CNN
- Face Detection through Scale-Friendly Deep Convolutional Networks
- Scale-Aware Face Detection
- Detecting Faces Using Inside Cascaded Contextual CNN
- Multi-Branch Fully Convolutional Network for Face Detection
- SSH: Single Stage Headless Face Detector
- Dockerface: an easy to install and use Faster R-CNN face detector in a Docker container
- FaceBoxes: A CPU Real-time Face Detector with High Accuracy
- S3FD: Single Shot Scale-invariant Face Detector
- Detecting Faces Using Region-based Fully Convolutional Networks
- AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection
- Face Attention Network: An effective Face Detector for the Occluded Faces
- Feature Agglomeration Networks for Single Stage Face Detection
- Face Detection Using Improved Faster RCNN
- PyramidBox: A Context-assisted Single Shot Face Detector
- A Fast Face Detection Method via Convolutional Neural Network
- Beyond Trade-off: Accelerate FCN-based Face Detector with Higher Accuracy
- Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
- SFace: An Efficient Network for Face Detection in Large Scale Variations
- Survey of Face Detection on Low-quality Images
- Anchor Cascade for Efficient Face Detection
- Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization
- Selective Refinement Network for High Performance Face Detection
- Detect Small Faces
- Person Head Detection
- Pedestrian Detection / People Detection
- Pedestrian Detection aided by Deep Learning Semantic Tasks
- Deep Learning Strong Parts for Pedestrian Detection
- Taking a Deeper Look at Pedestrians
- Convolutional Channel Features
- End-to-end people detection in crowded scenes
- Learning Complexity-Aware Cascades for Deep Pedestrian Detection
- Deep convolutional neural networks for pedestrian detection
- Scale-aware Fast R-CNN for Pedestrian Detection
- New algorithm improves speed and accuracy of pedestrian detection
- Pushing the Limits of Deep CNNs for Pedestrian Detection
- A Real-Time Deep Learning Pedestrian Detector for Robot Navigation
- A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation
- Is Faster R-CNN Doing Well for Pedestrian Detection?
- Unsupervised Deep Domain Adaptation for Pedestrian Detection
- Reduced Memory Region Based Deep Convolutional Neural Network Detection
- Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection
- Detecting People in Artwork with CNNs
- Multispectral Deep Neural Networks for Pedestrian Detection
- Deep Multi-camera People Detection
- Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters
- What Can Help Pedestrian Detection?
- Illuminating Pedestrians via Simultaneous Detection & Segmentation
- Rotational Rectification Network for Robust Pedestrian Detection
- STD-PD: Generating Synthetic Training Data for Pedestrian Detection in Unannotated Videos
- Too Far to See? Not Really! — Pedestrian Detection with Scale-aware Localization Policy
- Repulsion Loss: Detecting Pedestrians in a Crowd
- Aggregated Channels Network for Real-Time Pedestrian Detection
- Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection
- Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection
- Pedestrian-Synthesis-GAN: Generating Pedestrian Data in Real Scene and Beyond
- PCN: Part and Context Information for Pedestrian Detection with CNNs
- Small-scale Pedestrian Detection Based on Somatic Topology Localization and Temporal Feature Aggregation
- Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd
- Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation
- Vehicle Detection
- DAVE: A Unified Framework for Fast Vehicle Detection and Annotation
- Evolving Boxes for fast Vehicle Detection
- Fine-Grained Car Detection for Visual Census Estimation
- SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection
- Label and Sample: Efficient Training of Vehicle Object Detector from Sparsely Labeled Data
- Traffic-Sign Detection
- Traffic-Sign Detection and Classification in the Wild
- Evaluating State-of-the-art Object Detector on Challenging Traffic Light Data
- Detecting Small Signs from Large Images
- Localized Traffic Sign Detection with Multi-scale Deconvolution Networks
- Detecting Traffic Lights by Single Shot Detection
- A Hierarchical Deep Architecture and Mini-Batch Selection Method For Joint Traffic Sign and Light Detection
- Skeleton Detection
- Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs
- DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images
- SRN: Side-output Residual Network for Object Symmetry Detection in the Wild
- Hi-Fi: Hierarchical Feature Integration for Skeleton Detection
- Fruit Detection
- Shadow Detection
- Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network
- A+D-Net: Shadow Detection with Adversarial Shadow Attenuation
- Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal
- Direction-aware Spatial Context Features for Shadow Detection
- Direction-aware Spatial Context Features for Shadow Detection and Removal
- Others Detection
- Deep Deformation Network for Object Landmark Localization
- Fashion Landmark Detection in the Wild
- Deep Learning for Fast and Accurate Fashion Item Detection
- OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)
- Selfie Detection by Synergy-Constraint Based Convolutional Neural Network
- Associative Embedding: End-to-End Learning for Joint Detection and Grouping
- Deep Cuboid Detection: Beyond 2D Bounding Boxes
- Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection
- Deep Learning Logo Detection with Data Expansion by Synthesising Context
- Scalable Deep Learning Logo Detection
- Pixel-wise Ear Detection with Convolutional Encoder-Decoder Networks
- Automatic Handgun Detection Alarm in Videos Using Deep Learning
- Objects as context for part detection
- Using Deep Networks for Drone Detection
- Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection
- Target Driven Instance Detection
- DeepVoting: An Explainable Framework for Semantic Part Detection under Partial Occlusion
- VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition
- Grab, Pay and Eat: Semantic Food Detection for Smart Restaurants
- ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos
- Deep Learning Object Detection Methods for Ecological Camera Trap Data
- EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection
- Towards End-to-End Lane Detection: an Instance Segmentation Approach
- iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
- Densely Supervised Grasp Detector (DSGD)
- Object Proposal
- DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers
- Scale-aware Pixel-wise Object Proposal Networks
- Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization
- Learning to Segment Object Proposals via Recursive Neural Networks
- Learning Detection with Diverse Proposals
- ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond
- Improving Small Object Proposals for Company Logo Detection
- Open Logo Detection Challenge
- Localization
- Beyond Bounding Boxes: Precise Localization of Objects in Images
- Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
- Weakly Supervised Object Localization Using Size Estimates
- Active Object Localization with Deep Reinforcement Learning
- Localizing objects using referring expressions
- LocNet: Improving Localization Accuracy for Object Detection
- Learning Deep Features for Discriminative Localization
- ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization
- Ensemble of Part Detectors for Simultaneous Classification and Localization
- STNet: Selective Tuning of Convolutional Networks for Object Localization
- Soft Proposal Networks for Weakly Supervised Object Localization
- Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN
- Tutorials / Talks
- Projects
- Detectron
- TensorBox: a simple framework for training neural networks to detect objects in images
- Object detection in torch: Implementation of some object detection frameworks in torch
- Using DIGITS to train an Object Detection network
- FCN-MultiBox Detector
- KittiBox: A car detection model implemented in Tensorflow.
- Deformable Convolutional Networks + MST + Soft-NMS
- How to Build a Real-time Hand-Detector using Neural Networks (SSD) on Tensorflow
- Metrics for object detection
- Leaderboard
- Tools
- Blogs
- Convolutional Neural Networks for Object Detection
- Introducing automatic object detection to visual search (Pinterest)
- Deep Learning for Object Detection with DIGITS
- Analyzing The Papers Behind Facebook’s Computer Vision Approach
- Easily Create High Quality Object Detectors with Deep Learning
- How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit
- Object Detection in Satellite Imagery, a Low Overhead Approach
- You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks
- Faster R-CNN Pedestrian and Car Detection
- Small U-Net for vehicle detection
- Region of interest pooling explained
- Supercharge your Computer Vision models with the TensorFlow Object Detection API
- Understanding SSD MultiBox — Real-Time Object Detection In Deep Learning
- One-shot object detection
- An overview of object detection: one-stage methods
Method | backbone | test size | VOC2007 | VOC2010 | VOC2012 | ILSVRC 2013 | MSCOCO 2015 | Speed |
---|---|---|---|---|---|---|---|---|
OverFeat | | | | | | 24.3% | | |
R-CNN | AlexNet | | 58.5% | 53.7% | 53.3% | 31.4% | | |
R-CNN | VGG16 | | 66.0% | | | | | |
SPP_net | ZF-5 | | 54.2% | | | 31.84% | | |
DeepID-Net | | | 64.1% | | | 50.3% | | |
NoC | | | 73.3% | | 68.8% | | | |
Fast-RCNN | VGG16 | | 70.0% | 68.8% | 68.4% | | 19.7%(@[0.5-0.95]), 35.9%(@0.5) | |
MR-CNN | | | 78.2% | | 73.9% | | | |
Faster-RCNN | VGG16 | | 78.8% | | 75.9% | | 21.9%(@[0.5-0.95]), 42.7%(@0.5) | 198ms |
Faster-RCNN | ResNet101 | | 85.6% | | 83.8% | | 37.4%(@[0.5-0.95]), 59.0%(@0.5) | |
YOLO | | | 63.4% | | 57.9% | | | 45 fps |
YOLO | VGG-16 | | 66.4% | | | | | 21 fps |
YOLOv2 | | 448x448 | 78.6% | | 73.4% | | 21.6%(@[0.5-0.95]), 44.0%(@0.5) | 40 fps |
SSD | VGG16 | 300x300 | 77.2% | | 75.8% | | 25.1%(@[0.5-0.95]), 43.1%(@0.5) | 46 fps |
SSD | VGG16 | 512x512 | 79.8% | | 78.5% | | 28.8%(@[0.5-0.95]), 48.5%(@0.5) | 19 fps |
SSD | ResNet101 | 300x300 | | | | | 28.0%(@[0.5-0.95]) | 16 fps |
SSD | ResNet101 | 512x512 | | | | | 31.2%(@[0.5-0.95]) | 8 fps |
DSSD | ResNet101 | 300x300 | | | | | 28.0%(@[0.5-0.95]) | 8 fps |
DSSD | ResNet101 | 500x500 | | | | | 33.2%(@[0.5-0.95]) | 6 fps |
ION | | | 79.2% | | 76.4% | | | |
CRAFT | | | 75.7% | | 71.3% | 48.5% | | |
OHEM | | | 78.9% | | 76.3% | | 25.5%(@[0.5-0.95]), 45.9%(@0.5) | |
R-FCN | ResNet50 | | 77.4% | | | | | 0.12sec(K40), 0.09sec(TitanX) |
R-FCN | ResNet101 | | 79.5% | | | | | 0.17sec(K40), 0.12sec(TitanX) |
R-FCN(ms train) | ResNet101 | | 83.6% | | 82.0% | | 31.5%(@[0.5-0.95]), 53.2%(@0.5) | |
PVANet 9.0 | | | 84.9% | | 84.2% | | | 750ms(CPU), 46ms(TitanX) |
RetinaNet | ResNet101-FPN | | | | | | | |
Light-Head R-CNN | Xception* | 800/1200 | | | | | 31.5%@[0.5:0.95] | 95 fps |
Light-Head R-CNN | Xception* | 700/1100 | | | | | 30.7%@[0.5:0.95] | 102 fps |
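The MSCOCO entries in the table use two notations: `@0.5` counts a detection as correct when its IoU (intersection over union) with the ground-truth box is at least 0.5, while `@[0.5-0.95]` averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The following is a minimal sketch of those two ideas, not the official evaluation code. The function names `iou`, `matched_thresholds` and `ms_to_fps` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration. The last helper only shows how a per-image latency such as the 198ms entry compares with the fps entries.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind the "@[0.5-0.95]" notation: 0.50, 0.55, ..., 0.95.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which this prediction would count as a correct match."""
    overlap = iou(pred_box, gt_box)
    return [t for t in COCO_IOU_THRESHOLDS if overlap >= t]


def ms_to_fps(milliseconds_per_image):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / milliseconds_per_image
```

A prediction that overlaps its ground truth with IoU 0.72 passes the 0.5 threshold but fails the stricter ones from 0.75 upward, which is why the averaged `@[0.5-0.95]` numbers are consistently lower than the `@0.5` numbers in the same row.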
Papers
Deep Neural Networks for Object Detection
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
- arxiv: [1312.6229] OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
- github: https://github.com/sermanet/OverFeat
- code: CILVR at NYU
R-CNN
Rich feature hierarchies for accurate object detection and semantic segmentation
- intro: R-CNN
- arxiv: [1311.2524] Rich feature hierarchies for accurate object detection and semantic segmentation
- supp: rbg's home page
- slides: http://www.image-net.org/challenges/LSVRC/2013/slides/r-cnn-ilsvrc2013-workshop.pdf
- slides: rbg's home page
- github: https://github.com/rbgirshick/rcnn
- notes: http://zhangliliang.com/2014/07/23/paper-note-rcnn/
- caffe-pr(“Make R-CNN the Caffe detection example”): https://github.com/BVLC/caffe/pull/482
Fast R-CNN
Fast R-CNN
- arxiv: [1504.08083] Fast R-CNN
- slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf
- github: https://github.com/rbgirshick/fast-rcnn
- github(COCO-branch): https://github.com/rbgirshick/fast-rcnn/tree/coco
- webcam demo: https://github.com/rbgirshick/fast-rcnn/pull/29
- notes: http://zhangliliang.com/2015/05/17/paper-note-fast-rcnn/
- notes: http://blog.youkuaiyun.com/linj_m/article/details/48930179
- github(“Fast R-CNN in MXNet”): https://github.com/precedenceguo/mx-rcnn
- github: https://github.com/mahyarnajibi/fast-rcnn-torch
- github: https://github.com/apple2373/chainer-simple-fast-rnn
- github: https://github.com/zplizzi/tensorflow-fast-rcnn
A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection
- intro: CVPR 2017
- arxiv: [1704.03414] A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection
- paper: http://abhinavsh.info/papers/pdfs/adversarial_object_detection.pdf
- github(Caffe): https://github.com/xiaolonw/adversarial-frcnn
Faster R-CNN
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- intro: NIPS 2015
- arxiv: [1506.01497] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- gitxiv: gitxiv.com
- slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w05-FasterR-CNN.pdf
- github(official, Matlab): https://github.com/ShaoqingRen/faster_rcnn
- github: https://github.com/rbgirshick/py-faster-rcnn
- github(MXNet): https://github.com/msracver/Deformable-ConvNets/tree/master/faster_rcnn
- github: https://github.com/jwyang/faster-rcnn.pytorch
- github: https://github.com/mitmul/chainer-faster-rcnn
- github: https://github.com/andreaskoepf/faster-rcnn.torch
- github: https://github.com/ruotianluo/Faster-RCNN-Densecap-torch
- github: https://github.com/smallcorgi/Faster-RCNN_TF
- github: https://github.com/CharlesShang/TFFRCNN
- github(C++ demo): https://github.com/YihangLou/FasterRCNN-Encapsulation-Cplusplus
- github: https://github.com/yhenon/keras-frcnn
- github: https://github.com/Eniac-Xie/faster-rcnn-resnet
- github(C++): https://github.com/D-X-Y/caffe-faster-rcnn/tree/dev
R-CNN minus R
- intro: BMVC 2015
- arxiv: [1506.06981] R-CNN minus R
Faster R-CNN in MXNet with distributed implementation and data parallelization
Contextual Priming and Feedback for Faster R-CNN
- intro: ECCV 2016. Carnegie Mellon University
- paper: http://abhinavsh.info/context_priming_feedback.pdf
An Implementation of Faster RCNN with Study for Region Sampling
- intro: Technical Report, 3 pages. CMU
- arxiv: [1702.02138] An Implementation of Faster RCNN with Study for Region Sampling
- github: https://github.com/endernewton/tf-faster-rcnn
Interpretable R-CNN
- intro: North Carolina State University & Alibaba
- keywords: AND-OR Graph (AOG)
- arxiv: [1711.05226] Towards Interpretable R-CNN by Unfolding Latent Structures
Light-Head R-CNN
Light-Head R-CNN: In Defense of Two-Stage Object Detector
- intro: Tsinghua University & Megvii Inc
- arxiv: [1711.07264] Light-Head R-CNN: In Defense of Two-Stage Object Detector
- github(official, Tensorflow): https://github.com/zengarden/light_head_rcnn
- github: https://github.com/terrychenism/Deformable-ConvNets/blob/master/rfcn/symbols/resnet_v1_101_rfcn_light.py#L784
Cascade R-CNN
Cascade R-CNN: Delving into High Quality Object Detection
- intro: CVPR 2018. UC San Diego
- arxiv: [1712.00726] Cascade R-CNN: Delving into High Quality Object Detection
- github(Caffe, official): https://github.com/zhaoweicai/cascade-rcnn
MultiBox
Scalable Object Detection using Deep Neural Networks
- intro: first MultiBox. Train a CNN to predict Region of Interest.
- arxiv: [1312.2249] Scalable Object Detection using Deep Neural Networks
- github: https://github.com/google/multibox
- blog: https://research.googleblog.com/2014/12/high-quality-object-detection-at-scale.html
Scalable, High-Quality Object Detection
- intro: second MultiBox
- arxiv: [1412.1441] Scalable, High-Quality Object Detection
- github: https://github.com/google/multibox
SPP-Net
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
- intro: ECCV 2014 / TPAMI 2015
- arxiv: [1406.4729] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
- github: https://github.com/ShaoqingRen/SPP_net
- notes: http://zhangliliang.com/2014/09/13/paper-note-sppnet/
DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
- intro: PAMI 2016
- intro: an extension of R-CNN with box pre-training, a cascade on region proposals, deformation layers, and context representations
- project page: http://www.ee.cuhk.edu.hk/~wlouyang/projects/imagenetDeepId/index.html
- arxiv: [1412.5661] DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
Object Detectors Emerge in Deep Scene CNNs
- intro: ICLR 2015
- arxiv: [1412.6856] Object Detectors Emerge in Deep Scene CNNs
- paper: https://www.robots.ox.ac.uk/~vgg/rg/papers/zhou_iclr15.pdf
- paper: https://people.csail.mit.edu/khosla/papers/iclr2015_zhou.pdf
- slides: http://places.csail.mit.edu/slide_iclr2015.pdf
segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection
- intro: CVPR 2015
- project(code+data): Yukun Zhu | University of Toronto
- arxiv: [1502.04275] segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection
- github: https://github.com/YknZhu/segDeepM
Object Detection Networks on Convolutional Feature Maps
- intro: TPAMI 2015
- keywords: NoC
- arxiv: [1504.06066] Object Detection Networks on Convolutional Feature Maps
Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
- arxiv: [1504.03293] Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
- slides: http://www.ytzhang.net/files/publications/2015-cvpr-det-slides.pdf
- github: https://github.com/YutingZhang/fgs-obj
DeepBox: Learning Objectness with Convolutional Networks
- keywords: DeepBox
- arxiv: [1505.02146] DeepBox: Learning Objectness with Convolutional Networks
- github: https://github.com/weichengkuo/DeepBox
MR-CNN
Object detection via a multi-region & semantic segmentation-aware CNN model
- intro: ICCV 2015. MR-CNN
- arxiv: [1505.01749] Object detection via a multi-region & semantic segmentation-aware CNN model
- github: https://github.com/gidariss/mrcnn-object-detection
- notes: http://zhangliliang.com/2015/05/17/paper-note-ms-cnn/
- notes: http://blog.cvmarcher.com/posts/2015/05/17/multi-region-semantic-segmentation-aware-cnn/
YOLO
You Only Look Once: Unified, Real-Time Object Detection
- arxiv: [1506.02640] You Only Look Once: Unified, Real-Time Object Detection
- code: YOLO: Real-Time Object Detection
- github: https://github.com/pjreddie/darknet
- blog: You Only Look Once: Unified, Real-Time Object Detection - Reviews
- slides: https://docs.google.com/presentation/d/1aeRvtKG21KHdD5lg6Hgyhx5rPq_ZOsGjG5rJ1HP7BbA/pub?start=false&loop=false&delayms=3000&slide=id.p
- reddit: https://www.reddit.com/r/MachineLearning/comments/3a3m0o/realtime_object_detection_with_yolo/
- github: https://github.com/gliese581gg/YOLO_tensorflow
- github: https://github.com/xingwangsfu/caffe-yolo
- github: https://github.com/frankzhangrui/Darknet-Yolo
- github: https://github.com/BriSkyHekun/py-darknet-yolo
- github: https://github.com/tommy-qichang/yolo.torch
- github: https://github.com/frischzenger/yolo-windows
- github: https://github.com/AlexeyAB/yolo-windows
- github: https://github.com/nilboy/tensorflow-yolo
darkflow - translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++
- blog: https://thtrieu.github.io/notes/yolo-tensorflow-graph-buffer-cpp
- github: https://github.com/thtrieu/darkflow
Start Training YOLO with Our Own Data
- intro: train with customized data and class numbers/labels. Linux / Windows version for darknet.
- blog: http://guanghan.info/blog/en/my-works/train-yolo/
- github: https://github.com/Guanghan/darknet
YOLO: Core ML versus MPSNNGraph
- intro: Tiny YOLO for iOS implemented using CoreML but also using the new MPS graph API.
- blog: YOLO: Core ML versus MPSNNGraph
- github: https://github.com/hollance/YOLO-CoreML-MPSNNGraph
TensorFlow YOLO object detection on Android
- intro: Real-time object detection on Android using the YOLO network with TensorFlow
- github: https://github.com/natanielruiz/android-yolo
Computer Vision in iOS – Object Detection
- blog: https://sriraghu.com/2017/07/12/computer-vision-in-ios-object-detection/
- github: https://github.com/r4ghu/iOS-CoreML-Yolo
YOLOv2
YOLO9000: Better, Faster, Stronger
- arxiv: [1612.08242] YOLO9000: Better, Faster, Stronger
- code: YOLO: Real-Time Object Detection
- github(Chainer): https://github.com/leetenki/YOLOv2
- github(Keras): https://github.com/allanzelener/YAD2K
- github(PyTorch): https://github.com/longcw/yolo2-pytorch
- github(Tensorflow): https://github.com/hizhangp/yolo_tensorflow
- github(Windows): https://github.com/AlexeyAB/darknet
- github: https://github.com/choasUp/caffe-yolo9000
- github: https://github.com/philipperemy/yolo-9000
darknet_scripts
- intro: Auxiliary scripts to work with the (YOLO) darknet deep learning framework. AKA -> How to generate YOLO anchors?
- github: https://github.com/Jumabek/darknet_scripts
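As a rough illustration of the anchor question above: the YOLOv2 paper picks anchor shapes by running k-means over the training boxes' widths and heights with distance d = 1 - IoU. The sketch below follows that idea in plain Python. It is not the code from the darknet_scripts repository; the function names, the mean-based centroid update, and the `iters` and `seed` parameters are assumptions made for illustration.

```python
import random

def wh_iou(wh, centroid):
    """IoU of two boxes that share a corner, so only width and height matter."""
    inter = min(wh[0], centroid[0]) * min(wh[1], centroid[1])
    union = wh[0] * wh[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(box_whs, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors using 1 - IoU as distance.

    Hypothetical helper: box_whs is a list of (w, h) tuples in any consistent unit.
    """
    rng = random.Random(seed)
    centroids = rng.sample(box_whs, k)  # start from k distinct training boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for wh in box_whs:
            # Highest IoU is the same as smallest 1 - IoU distance.
            nearest = max(range(k), key=lambda c: wh_iou(wh, centroids[c]))
            clusters[nearest].append(wh)
        new_centroids = []
        for c in range(k):
            if clusters[c]:
                w = sum(b[0] for b in clusters[c]) / len(clusters[c])
                h = sum(b[1] for b in clusters[c]) / len(clusters[c])
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return sorted(centroids, key=lambda c: c[0] * c[1])  # smallest area first

if __name__ == "__main__":
    boxes = [(10, 10), (12, 11), (11, 12), (100, 50), (110, 55), (105, 45)]
    print(kmeans_anchors(boxes, k=2))
```

In practice the widths and heights would first be scaled to the network's output grid, which this sketch leaves out.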
Yolo_mark: GUI for marking bounded boxes of objects in images for training Yolo v2
LightNet: Bringing pjreddie’s DarkNet out of the shadows
YOLO v2 Bounding Box Tool
- intro: Bounding box labeler tool to generate the training data in the format YOLO v2 requires.
- github: https://github.com/Cartucho/yolo-boundingbox-labeler-GUI
YOLOv3
YOLOv3: An Incremental Improvement
- project page: YOLO: Real-Time Object Detection
- arxiv: [1804.02767] YOLOv3: An Incremental Improvement
AttentionNet: Aggregating Weak Directions for Accurate Object Detection
- intro: ICCV 2015
- intro: state-of-the-art performance of 65% (AP) on PASCAL VOC 2007/2012 human detection task
- arxiv: [1506.07704] AttentionNet: Aggregating Weak Directions for Accurate Object Detection
- slides: https://www.robots.ox.ac.uk/~vgg/rg/slides/AttentionNet.pdf
- slides: http://image-net.org/challenges/talks/lunit-kaist-slide.pdf
DenseBox
DenseBox: Unifying Landmark Localization with End to End Object Detection
- arxiv: [1509.04874] DenseBox: Unifying Landmark Localization with End to End Object Detection
- demo: Current_Result.avi (Baidu Netdisk)
- KITTI result: The KITTI Vision Benchmark Suite
SSD
SSD: Single Shot MultiBox Detector
- intro: ECCV 2016 Oral
- arxiv: [1512.02325] SSD: Single Shot MultiBox Detector
- paper: http://www.cs.unc.edu/~wliu/papers/ssd.pdf
- slides: http://www.cs.unc.edu/~wliu/papers/ssd_eccv2016_slide.pdf
- github(Official): https://github.com/weiliu89/caffe/tree/ssd
- video: Sina Visitor System
- github: https://github.com/zhreshold/mxnet-ssd
- github: https://github.com/zhreshold/mxnet-ssd.cpp
- github: https://github.com/rykov8/ssd_keras
- github: https://github.com/balancap/SSD-Tensorflow
- github: https://github.com/amdegroot/ssd.pytorch
- github(Caffe): https://github.com/chuanqi305/MobileNet-SSD
What’s the diffience in performance between this new code you pushed and the previous code? #327
https://github.com/weiliu89/caffe/issues/327
DSSD
DSSD : Deconvolutional Single Shot Detector
- intro: UNC Chapel Hill & Amazon Inc
- arxiv: [1701.06659] DSSD : Deconvolutional Single Shot Detector
- github: https://github.com/chengyangfu/caffe/tree/dssd
- github: https://github.com/MTCloudVision/mxnet-dssd
- demo: http://120.52.72.53/www.cs.unc.edu/c3pr90ntc0td/~cyfu/dssd_lalaland.mp4
Enhancement of SSD by concatenating feature maps for object detection
- intro: rainbow SSD (R-SSD)
- arxiv: [1705.09587] Enhancement of SSD by concatenating feature maps for object detection
Context-aware Single-Shot Detector
- keywords: CSSD, DiCSSD, DeCSSD, effective receptive fields (ERFs), theoretical receptive fields (TRFs)
- arxiv: [1707.08682] Context-Aware Single-Shot Detector
Feature-Fused SSD: Fast Detection for Small Objects
- arxiv: [1709.05054] Feature-Fused SSD: Fast Detection for Small Objects
FSSD
FSSD: Feature Fusion Single Shot Multibox Detector
- arxiv: [1712.00960] FSSD: Feature Fusion Single Shot Multibox Detector
Weaving Multi-scale Context for Single Shot Detector
- intro: WeaveNet
- keywords: fuse multi-scale information
- arxiv: [1712.03149] Weaving Multi-scale Context for Single Shot Detector
ESSD
Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network
Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for Real-time Embedded Object Detection
MDSSD: Multi-scale Deconvolutional Single Shot Detector for small objects
- intro: Zhengzhou University
- arxiv: [1805.07009] MDSSD: Multi-scale Deconvolutional Single Shot Detector for Small Objects
Inside-Outside Net (ION)
Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks
- intro: “0.8s per image on a Titan X GPU (excluding proposal generation) without two-stage bounding-box regression and 1.15s per image with it”.
- arxiv: [1512.04143] Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks
- slides: http://www.seanbell.ca/tmp/ion-coco-talk-bell2015.pdf
- coco-leaderboard: http://mscoco.org/dataset/#detections-leaderboard
Adaptive Object Detection Using Adjacency and Zoom Prediction
- intro: CVPR 2016. AZ-Net
- arxiv: [1512.07711] Adaptive Object Detection Using Adjacency and Zoom Prediction
- github: https://github.com/luyongxi/az-net
- youtube: https://www.youtube.com/watch?v=YmFtuNwxaNM
G-CNN: an Iterative Grid Based Object Detector
Factors in Finetuning Deep Model for object detection
Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution
- intro: CVPR 2016. Ranked 3rd with provided data and 2nd with external data on ILSVRC 2015 object detection
- project page: http://www.ee.cuhk.edu.hk/~wlouyang/projects/ImageNetFactors/CVPR16.html
- arxiv: [1601.05150] Factors in Finetuning Deep Model for object detection
We don’t need no bounding-boxes: Training object class detectors using only human verification
HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection
- arxiv: [1604.00600] HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection
A MultiPath Network for Object Detection
- intro: BMVC 2016. Facebook AI Research (FAIR)
- arxiv: [1604.02135] A MultiPath Network for Object Detection
- github: https://github.com/facebookresearch/multipathnet
CRAFT
CRAFT Objects from Images
- intro: CVPR 2016. Cascade Region-proposal-network And FasT-rcnn. an extension of Faster R-CNN
- project page: http://byangderek.github.io/projects/craft.html
- arxiv: [1604.03239] CRAFT Objects from Images
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Yang_CRAFT_Objects_From_CVPR_2016_paper.pdf
- github: https://github.com/byangderek/CRAFT
OHEM
Training Region-based Object Detectors with Online Hard Example Mining
- intro: CVPR 2016 Oral. Online hard example mining (OHEM)
- arxiv: [1604.03540] Training Region-based Object Detectors with Online Hard Example Mining
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf
- github(Official): https://github.com/abhi2610/ohem
- author page: Abhinav Shrivastava's Academic Website
S-OHEM: Stratified Online Hard Example Mining for Object Detection
Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers
- intro: CVPR 2016
- keywords: scale-dependent pooling (SDP), cascaded rejection classifiers (CRC)
- paper: U-M Web Hosting
R-FCN
R-FCN: Object Detection via Region-based Fully Convolutional Networks
- arxiv: [1605.06409] R-FCN: Object Detection via Region-based Fully Convolutional Networks
- github: https://github.com/daijifeng001/R-FCN
- github(MXNet): https://github.com/msracver/Deformable-ConvNets/tree/master/rfcn
- github: https://github.com/Orpine/py-R-FCN
- github: https://github.com/PureDiors/pytorch_RFCN
- github: https://github.com/bharatsingh430/py-R-FCN-multiGPU
- github: https://github.com/xdever/RFCN-tensorflow
R-FCN-3000 at 30fps: Decoupling Detection and Classification
Recycle deep features for better object detection
MS-CNN
A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection
- intro: ECCV 2016
- intro: 640×480: 15 fps, 960×720: 8 fps
- arxiv: [1607.07155] A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection
- github: https://github.com/zhaoweicai/mscnn
- poster: Welcome eccv2016.org - Justhost.com
Multi-stage Object Detection with Group Recursive Learning
- intro: VOC2007: 78.6%, VOC2012: 74.9%
- arxiv: [1608.05159] Multi-stage Object Detection with Group Recursive Learning
Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection
- intro: WACV 2017. SubCNN
- arxiv: [1604.04693] Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection
- github: https://github.com/tanshen/SubCNN
PVANET
PVANet: Lightweight Deep Neural Networks for Real-time Object Detection
- intro: Presented at NIPS 2016 Workshop on Efficient Methods for Deep Neural Networks (EMDNN). Continuation of arXiv:1608.08021
- arxiv: [1611.08588] PVANet: Lightweight Deep Neural Networks for Real-time Object Detection
- github: https://github.com/sanghoon/pva-faster-rcnn
- leaderboard(PVANet 9.0): http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4
GBD-Net
Gated Bi-directional CNN for Object Detection
- intro: The Chinese University of Hong Kong & Sensetime Group Limited
- paper: Gated Bi-directional CNN for Object Detection | SpringerLink
- mirror: Gated_Bi-directional_CNN_for_Object_Detection.pdf (Baidu Netdisk)
Crafting GBD-Net for Object Detection
- intro: winner of the ImageNet object detection challenge of 2016. CUImage and CUVideo
- intro: gated bi-directional CNN (GBD-Net)
- arxiv: [1610.02579] Crafting GBD-Net for Object Detection
- github: https://github.com/craftGBD/craftGBD
StuffNet: Using ‘Stuff’ to Improve Object Detection
Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene
Hierarchical Object Detection with Deep Reinforcement Learning
- intro: Deep Reinforcement Learning Workshop (NIPS 2016)
- project page: Hierarchical Object Detection with Deep Reinforcement Learning by imatge-upc
- arxiv: [1611.03718] Hierarchical Object Detection with Deep Reinforcement Learning
- slides: http://www.slideshare.net/xavigiro/hierarchical-object-detection-with-deep-reinforcement-learning
- github: https://github.com/imatge-upc/detection-2016-nipsws
- blog: http://jorditorres.org/nips/
Learning to detect and localize many objects from few examples
Speed/accuracy trade-offs for modern convolutional object detectors
- intro: CVPR 2017. Google Research
- arxiv: [1611.10012] Speed/accuracy trade-offs for modern convolutional object detectors
SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving
- arxiv: [1612.01051] SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving
- github: https://github.com/BichenWuUCB/squeezeDet
- github: https://github.com/fregu856/2D_detection
Feature Pyramid Network (FPN)
Feature Pyramid Networks for Object Detection
- intro: Facebook AI Research
- arxiv: [1612.03144] Feature Pyramid Networks for Object Detection
Action-Driven Object Detection with Top-Down Visual Attentions
Beyond Skip Connections: Top-Down Modulation for Object Detection
- intro: CMU & UC Berkeley & Google Research
- arxiv: [1612.06851] Beyond Skip Connections: Top-Down Modulation for Object Detection
Wide-Residual-Inception Networks for Real-time Object Detection
- intro: Inha University
- arxiv: [1702.01243] Wide-Residual-Inception Networks for Real-time Object Detection
Attentional Network for Visual Object Detection
- intro: University of Maryland & Mitsubishi Electric Research Laboratories
- arxiv: [1702.01478] Attentional Network for Visual Object Detection
Learning Chained Deep Features and Classifiers for Cascade in Object Detection
- keywords: CC-Net
- intro: chained cascade network (CC-Net). 81.1% mAP on PASCAL VOC 2007
- arxiv: [1702.07054] Learning Chained Deep Features and Classifiers for Cascade in Object Detection
DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling
- intro: ICCV 2017 (poster)
- arxiv: [1703.10295] DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
- intro: CVPR 2017
- arxiv: [1704.03944] Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Spatial Memory for Context Reasoning in Object Detection
Accurate Single Stage Detector Using Recurrent Rolling Convolution
- intro: CVPR 2017. SenseTime
- keywords: Recurrent Rolling Convolution (RRC)
- arxiv: [1704.05776] Accurate Single Stage Detector Using Recurrent Rolling Convolution
- github: https://github.com/xiaohaoChen/rrc_detection
Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection
LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems
- intro: Embedded Vision Workshop in CVPR. UC San Diego & Qualcomm Inc
- arxiv: [1705.05922] LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems
Point Linking Network for Object Detection
- intro: Point Linking Network (PLN)
- arxiv: [1706.03646] Point Linking Network for Object Detection
Perceptual Generative Adversarial Networks for Small Object Detection
Few-shot Object Detection
Yes-Net: An effective Detector Based on Global Information
SMC Faster R-CNN: Toward a scene-specialized multi-object detector
Towards lightweight convolutional neural networks for object detection
RON: Reverse Connection with Objectness Prior Networks for Object Detection
- intro: CVPR 2017
- arxiv: [1707.01691] RON: Reverse Connection with Objectness Prior Networks for Object Detection
- github: https://github.com/taokong/RON
Mimicking Very Efficient Network for Object Detection
- intro: CVPR 2017. SenseTime & Beihang University
- paper: http://openaccess.thecvf.com/content_cvpr_2017/papers/Li_Mimicking_Very_Efficient_CVPR_2017_paper.pdf
Residual Features and Unified Prediction Network for Single Stage Detection
- arxiv: [1707.05031] Residual Features and Unified Prediction Network for Single Stage Detection
Deformable Part-based Fully Convolutional Network for Object Detection
- intro: BMVC 2017 (oral). Sorbonne Universités & CEDRIC
- arxiv: [1707.06175] Deformable Part-based Fully Convolutional Network for Object Detection
Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors
- intro: ICCV 2017
- arxiv: [1707.06399] Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors
Recurrent Scale Approximation for Object Detection in CNN
- intro: ICCV 2017
- keywords: Recurrent Scale Approximation (RSA)
- arxiv: [1707.09531] Recurrent Scale Approximation for Object Detection in CNN
- github: https://github.com/sciencefans/RSA-for-object-detection
DSOD
DSOD: Learning Deeply Supervised Object Detectors from Scratch
- intro: ICCV 2017. Fudan University & Tsinghua University & Intel Labs China
- arxiv: [1708.01241] DSOD: Learning Deeply Supervised Object Detectors from Scratch
- github: https://github.com/szq0214/DSOD
Object Detection from Scratch with Deep Supervision
RetinaNet
Focal Loss for Dense Object Detection
- intro: ICCV 2017 Best student paper award. Facebook AI Research
- keywords: RetinaNet
- arxiv: [1708.02002] Focal Loss for Dense Object Detection
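For readers skimming this entry, the focal loss from the paper is FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. Below is a minimal per-example sketch in plain Python. The function names are made up for illustration, and the defaults alpha = 0.25 and gamma = 2 are the values usually quoted from the paper, not a tuned recommendation.

```python
import math

def cross_entropy(p, y):
    """Binary cross-entropy for one prediction; p is P(class = 1), y is 0 or 1."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction (hypothetical helper).

    The (1 - p_t) ** gamma factor shrinks the loss of well-classified examples,
    so the many easy negatives in dense detection contribute less to training.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

With gamma = 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a quick sanity check for any implementation.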
Focal Loss Dense Detector for Vehicle Surveillance
CoupleNet: Coupling Global Structure with Local Parts for Object Detection
- intro: ICCV 2017
- arxiv: [1708.02863] CoupleNet: Coupling Global Structure with Local Parts for Object Detection
Incremental Learning of Object Detectors without Catastrophic Forgetting
- intro: ICCV 2017. Inria
- arxiv: [1708.06977] Incremental Learning of Object Detectors without Catastrophic Forgetting
Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection
StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection
Dynamic Zoom-in Network for Fast Object Detection in Large Images
- arxiv: [1711.05187] Dynamic Zoom-in Network for Fast Object Detection in Large Images
Zero-Annotation Object Detection with Web Knowledge Transfer
- intro: NTU, Singapore & Amazon
- keywords: multi-instance multi-label domain adaption learning framework
- arxiv: [1711.05954] Zero-Annotation Object Detection with Web Knowledge Transfer
MegDet
MegDet: A Large Mini-Batch Object Detector
- intro: Peking University & Tsinghua University & Megvii Inc
- arxiv: [1711.07240] MegDet: A Large Mini-Batch Object Detector
Single-Shot Refinement Neural Network for Object Detection
- arxiv: [1711.06897] Single-Shot Refinement Neural Network for Object Detection
- github: https://github.com/sfzhang15/RefineDet
- github: https://github.com/MTCloudVision/RefineDet-Mxnet
Receptive Field Block Net for Accurate and Fast Object Detection
- intro: RFBNet
- arxiv: [1711.07767] Receptive Field Block Net for Accurate and Fast Object Detection
- github: https://github.com/ruinmessi/RFBNet
An Analysis of Scale Invariance in Object Detection - SNIP
- intro: CVPR 2018
- arxiv: [1711.08189] An Analysis of Scale Invariance in Object Detection - SNIP
- github: https://github.com/bharatsingh430/snip
Feature Selective Networks for Object Detection
Learning a Rotation Invariant Detector with Rotatable Bounding Box
- arxiv: [1711.09405] Learning a Rotation Invariant Detector with Rotatable Bounding Box
- github(official, Caffe): https://github.com/liulei01/DRBox
Scalable Object Detection for Stylized Objects
- intro: Microsoft AI & Research Munich
- arxiv: [1711.09822] Scalable Object Detection for Stylized Objects
Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids
- arxiv: [1712.00886] Improving Object Detection from Scratch via Gated Feature Reuse
- github: https://github.com/szq0214/GRP-DSOD
Deep Regionlets for Object Detection
- keywords: region selection network, gating network
- arxiv: [1712.02408] Deep Regionlets for Object Detection
Training and Testing Object Detectors with Virtual Images
- intro: IEEE/CAA Journal of Automatica Sinica
- arxiv: [1712.08470] Training and Testing Object Detectors with Virtual Images
Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video
- keywords: object mining, object tracking, unsupervised object discovery by appearance-based clustering, self-supervised detector adaptation
- arxiv: [1712.08832] Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video
Spot the Difference by Object Detection
- intro: Tsinghua University & JD Group
- arxiv: [1801.01051] Spot the Difference by Object Detection
Localization-Aware Active Learning for Object Detection
Object Detection with Mask-based Feature Encoding
LSTD: A Low-Shot Transfer Detector for Object Detection
- intro: AAAI 2018
- arxiv: [1803.01529] LSTD: A Low-Shot Transfer Detector for Object Detection
Domain Adaptive Faster R-CNN for Object Detection in the Wild
- intro: CVPR 2018. ETH Zurich & ESAT/PSI
- arxiv: [1803.03243] Domain Adaptive Faster R-CNN for Object Detection in the Wild
- github(official. Caffe): https://github.com/yuhuayc/da-faster-rcnn
Pseudo Mask Augmented Object Detection
Revisiting RCNN: On Awakening the Classification Power of Faster RCNN
- intro: ECCV 2018
- keywords: DCR V1
- arxiv: [1803.06799] Revisiting RCNN: On Awakening the Classification Power of Faster RCNN
- github(official, MXNet): https://github.com/bowenc0221/Decoupled-Classification-Refinement
Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection
- keywords: DCR V2
- arxiv: [1810.04002] Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection
- github(official, MXNet): https://github.com/bowenc0221/Decoupled-Classification-Refinement
Learning Region Features for Object Detection
- intro: Peking University & MSRA
- arxiv: [1803.07066] Learning Region Features for Object Detection
Single-Shot Bidirectional Pyramid Networks for High-Quality Object Detection
- intro: Singapore Management University & Zhejiang University
- arxiv: [1803.08208] Single-Shot Bidirectional Pyramid Networks for High-Quality Object Detection
Object Detection for Comics using Manga109 Annotations
- intro: University of Tokyo & National Institute of Informatics, Japan
- arxiv: [1803.08670] Object Detection for Comics using Manga109 Annotations
Task-Driven Super Resolution: Object Detection in Low-resolution Images
Transferring Common-Sense Knowledge for Object Detection
Multi-scale Location-aware Kernel Representation for Object Detection
- intro: CVPR 2018
- arxiv: [1804.00428] Multi-scale Location-aware Kernel Representation for Object Detection
- github: https://github.com/Hwang64/MLKP
Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors
- intro: National University of Defense Technology
- arxiv: [1804.04606] Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors
DetNet: A Backbone network for Object Detection
- intro: Tsinghua University & Megvii Inc
- arxiv: [1804.06215] DetNet: A Backbone network for Object Detection
Robust Physical Adversarial Attack on Faster R-CNN Object Detector
- arxiv: [1804.05810] ShapeShifter: Robust Physical Adversarial Attack on Faster R-CNN Object Detector
AdvDetPatch: Attacking Object Detectors with Adversarial Patches
Attacking Object Detectors via Imperceptible Patches on Background
Physical Adversarial Examples for Object Detectors
- intro: WOOT 2018
- arxiv: [1807.07769] Physical Adversarial Examples for Object Detectors
Quantization Mimic: Towards Very Tiny CNN for Object Detection
Object detection at 200 Frames Per Second
- intro: United Technologies Research Center-Ireland
- arxiv: [1805.06361] Object detection at 200 Frames Per Second
Object Detection using Domain Randomization and Generative Adversarial Refinement of Synthetic Images
- intro: CVPR 2018 Deep Vision Workshop
- arxiv: [1805.11778] Object Detection using Domain Randomization and Generative Adversarial Refinement of Synthetic Images
SNIPER: Efficient Multi-Scale Training
- arxiv: [1805.09300] SNIPER: Efficient Multi-Scale Training
- github: https://github.com/mahyarnajibi/SNIPER
Soft Sampling for Robust Object Detection
MetaAnchor: Learning to Detect Objects with Customized Anchors
- intro: Megvii Inc (Face++) & Fudan University
- arxiv: [1807.00980] MetaAnchor: Learning to Detect Objects with Customized Anchors
Localization Recall Precision (LRP): A New Performance Metric for Object Detection
- intro: ECCV 2018. Middle East Technical University
- arxiv: [1807.01696] Localization Recall Precision (LRP): A New Performance Metric for Object Detection
- github: https://github.com/cancam/LRP
Auto-Context R-CNN
- intro: Rejected by ECCV18
- arxiv: [1807.02842] Auto-Context R-CNN
Pooling Pyramid Network for Object Detection
- intro: Google AI Perception
- arxiv: [1807.03284] Pooling Pyramid Network for Object Detection
Modeling Visual Context is Key to Augmenting Object Detection Datasets
- intro: ECCV 2018
- arxiv: [1807.07428] Modeling Visual Context is Key to Augmenting Object Detection Datasets
Dual Refinement Network for Single-Shot Object Detection
Acquisition of Localization Confidence for Accurate Object Detection
- intro: ECCV 2018
- keywords: IoU-Net, PreciseRoIPooling
- arxiv: [1807.11590] Acquisition of Localization Confidence for Accurate Object Detection
- github: https://github.com/vacancy/PreciseRoIPooling
CornerNet: Detecting Objects as Paired Keypoints
- intro: ECCV 2018
- arxiv: [1808.01244] CornerNet: Detecting Objects as Paired Keypoints
- github: https://github.com/umich-vl/CornerNet
Unsupervised Hard Example Mining from Videos for Improved Object Detection
- intro: ECCV 2018
- arxiv: [1808.04285] Unsupervised Hard Example Mining from Videos for Improved Object Detection
SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection
A Survey of Modern Object Detection Literature using Deep Learning
Tiny-DSOD: Lightweight Object Detection for Resource-Restricted Usages
- intro: BMVC 2018
- arxiv: [1807.11013] Tiny-DSOD: Lightweight Object Detection for Resource-Restricted Usages
- github: https://github.com/lyxok1/Tiny-DSOD
Deep Feature Pyramid Reconfiguration for Object Detection
- intro: ECCV 2018
- arxiv: [1808.07993] Deep Feature Pyramid Reconfiguration for Object Detection
MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection
- intro: ICPR 2018
- arxiv: [1809.01791] MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection
Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks
Deep Learning for Generic Object Detection: A Survey
Non-Maximum Suppression (NMS)
End-to-End Integration of a Convolutional Network, Deformable Parts Model and Non-Maximum Suppression
- intro: CVPR 2015
- arxiv: [1411.5309] End-to-End Integration of a Convolutional Network, Deformable Parts Model and Non-Maximum Suppression
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wan_End-to-End_Integration_of_2015_CVPR_paper.pdf
A convnet for non-maximum suppression
- arxiv: [1511.06437] A convnet for non-maximum suppression
Improving Object Detection With One Line of Code
Soft-NMS – Improving Object Detection With One Line of Code
- intro: ICCV 2017. University of Maryland
- keywords: Soft-NMS
- arxiv: [1704.04503] Soft-NMS -- Improving Object Detection With One Line of Code
- github: https://github.com/bharatsingh430/soft-nms
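The idea behind Soft-NMS is that boxes overlapping the current top detection have their scores decayed instead of being discarded outright. Below is a minimal pure-Python sketch of the Gaussian variant, where each score is multiplied by exp(-IoU^2 / sigma). It is not the code from the repository above; the function names and the defaults `sigma=0.5` and `score_thresh=0.001` are illustrative assumptions.

```python
import math

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS sketch. Returns (box_index, rescored_score) pairs
    in the order the boxes were selected."""
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the highest-scoring box among those still under consideration.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))
        decayed = []
        for i, s in remaining:
            overlap = iou(boxes[best_idx], boxes[i])
            s = s * math.exp(-(overlap ** 2) / sigma)  # decay instead of delete
            if s >= score_thresh:  # drop boxes whose score has become negligible
                decayed.append((i, s))
        remaining = decayed
    return kept

if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(soft_nms(boxes, scores))
```

In this toy input the second box overlaps the first heavily, so it survives with a reduced score and ends up ranked below the distant third box, whereas hard NMS with a typical IoU threshold would have removed it.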
Learning non-maximum suppression
- intro: CVPR 2017
- project page: Learning Non-Maximum Suppression - Max Planck Institute for Informatics
- arxiv: [1705.02950] Learning non-maximum suppression
- github: https://github.com/hosang/gossipnet
Relation Networks for Object Detection
- intro: CVPR 2018 oral
- arxiv: [1711.11575] Relation Networks for Object Detection
- github(official, MXNet): https://github.com/msracver/Relation-Networks-for-Object-Detection
Adversarial Examples
Adversarial Examples that Fool Detectors
- intro: University of Illinois
- arxiv: [1712.02494] Adversarial Examples that Fool Detectors
Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods
- project page: Breaking Neural Network Detection Schemes
- arxiv: [1705.07263] Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods
- github: https://github.com/carlini/nn_breaking_detection
Weakly Supervised Object Detection
Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection
- intro: CVPR 2016
- arxiv: [1604.05766] Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection
Weakly supervised object detection using pseudo-strong labels
Saliency Guided End-to-End Learning for Weakly Supervised Object Detection
- intro: IJCAI 2017
- arxiv: [1706.06768] Saliency Guided End-to-End Learning for Weakly Supervised Object Detection
Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection
- intro: TPAMI 2017. National Institutes of Health (NIH) Clinical Center
- arxiv: [1801.03145] Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection
Video Object Detection
Learning Object Class Detectors from Weakly Annotated Video
- intro: CVPR 2012
- paper: https://www.vision.ee.ethz.ch/publications/papers/proceedings/eth_biwi_00905.pdf
Analysing domain shift factors between videos and images for object detection
Video Object Recognition
- slides: http://vision.princeton.edu/courses/COS598/2015sp/slides/VideoRecog/Video Object Recognition.pptx
Deep Learning for Saliency Prediction in Natural Video
- intro: Submitted on 12 Jan 2016
- keywords: Deep learning, saliency map, optical flow, convolution network, contrast features
- paper: https://hal.archives-ouvertes.fr/hal-01251614/document
T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos
- intro: Winning solution in ILSVRC2015 Object Detection from Video(VID) Task
- arxiv: [1604.02532] T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos
- github: https://github.com/myfavouritekk/T-CNN
Object Detection from Video Tubelets with Convolutional Neural Networks
- intro: CVPR 2016 Spotlight paper
- arxiv: [1604.04053] Object Detection from Video Tubelets with Convolutional Neural Networks
- paper: http://www.ee.cuhk.edu.hk/~wlouyang/Papers/KangVideoDet_CVPR16.pdf
- github: https://github.com/myfavouritekk/vdetlib
Object Detection in Videos with Tubelets and Multi-context Cues
- intro: SenseTime Group
- slides: http://www.ee.cuhk.edu.hk/~xgwang/CUvideo.pdf
- slides: http://image-net.org/challenges/talks/Object Detection in Videos with Tubelets and Multi-context Cues - Final.pdf
Context Matters: Refining Object Detection in Video with Recurrent Neural Networks
- intro: BMVC 2016
- keywords: pseudo-labeler
- arxiv: [1607.04648] Context Matters: Refining Object Detection in Video with Recurrent Neural Networks
- paper: http://vision.cornell.edu/se3/wp-content/uploads/2016/07/video_object_detection_BMVC.pdf
CNN Based Object Detection in Large Video Images
- intro: WangTao @ iQIYI
- keywords: object retrieval, object detection, scene classification
- slides: NVIDIA On-Demand
Object Detection in Videos with Tubelet Proposal Networks
Flow-Guided Feature Aggregation for Video Object Detection
- intro: MSRA
- arxiv: [1703.10025] Flow-Guided Feature Aggregation for Video Object Detection
Video Object Detection using Faster R-CNN
- blog: http://andrewliao11.github.io/object_detection/faster_rcnn/
- github: https://github.com/andrewliao11/py-faster-rcnn-imagenet
Improving Context Modeling for Video Object Detection and Tracking
- slides: http://image-net.org/challenges/talks_2017/ilsvrc2017_short(poster).pdf
Temporal Dynamic Graph LSTM for Action-driven Video Object Detection
- intro: ICCV 2017
- arxiv: [1708.00666] Temporal Dynamic Graph LSTM for Action-driven Video Object Detection
Mobile Video Object Detection with Temporally-Aware Feature Maps
Towards High Performance Video Object Detection
Impression Network for Video Object Detection
Spatial-Temporal Memory Networks for Video Object Detection
3D-DETNet: a Single Stage Video-Based Vehicle Detector
Object Detection in Videos by Short and Long Range Object Linking
Object Detection in Video with Spatiotemporal Sampling Networks
- intro: University of Pennsylvania & Dartmouth College
- arxiv: [1803.05549] Object Detection in Video with Spatiotemporal Sampling Networks
Towards High Performance Video Object Detection for Mobiles
- intro: Microsoft Research Asia
- arxiv: [1804.05830] Towards High Performance Video Object Detection for Mobiles
Optimizing Video Object Detection via a Scale-Time Lattice
- intro: CVPR 2018
- project page: Scale-Time Lattice (CVPR2018)
- arxiv: [1804.05472] Optimizing Video Object Detection via a Scale-Time Lattice
- github: https://github.com/hellock/scale-time-lattice
Pack and Detect: Fast Object Detection in Videos Using Region-of-Interest Packing
Object Detection on Mobile Devices
Pelee: A Real-Time Object Detection System on Mobile Devices
- intro: ICLR 2018 workshop track
- intro: based on the SSD
- arxiv: [1804.06882] Pelee: A Real-Time Object Detection System on Mobile Devices
- github: https://github.com/Robert-JunWang/Pelee
Object Detection in 3D
Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks
Complex-YOLO: Real-time 3D Object Detection on Point Clouds
- intro: Valeo Schalter und Sensoren GmbH & Ilmenau University of Technology
- arxiv: [1803.06199] Complex-YOLO: Real-time 3D Object Detection on Point Clouds
Focal Loss in 3D Object Detection
Object Detection on RGB-D
Learning Rich Features from RGB-D Images for Object Detection and Segmentation
Differential Geometry Boosts Convolutional Neural Networks for Object Detection
- intro: CVPR 2016
- paper: CVPR 2016 Open Access Repository
A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation
Zero-Shot Object Detection
Zero-Shot Detection
- intro: Australian National University
- keywords: YOLO
- arxiv: [1803.07113] Zero-Shot Detection
Zero-Shot Object Detection
Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts
- intro: Australian National University
- arxiv: [1803.06049] Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts
Zero-Shot Object Detection by Hybrid Region Embedding
- intro: Middle East Technical University & Hacettepe University
- arxiv: [1805.06157] Zero-Shot Object Detection by Hybrid Region Embedding
Salient Object Detection
This task involves predicting which regions of an image are salient, as indicated by human eye fixations.
Best Deep Saliency Detection Models (CVPR 2016 & 2015)
Large-scale optimization of hierarchical features for saliency prediction in natural images
Predicting Eye Fixations using Convolutional Neural Networks
Saliency Detection by Multi-Context Deep Learning
DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection
SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection
Shallow and Deep Convolutional Networks for Saliency Prediction
- intro: CVPR 2016
- arxiv: [1603.00845] Shallow and Deep Convolutional Networks for Saliency Prediction
- github: https://github.com/imatge-upc/saliency-2016-cvpr
Recurrent Attentional Networks for Saliency Detection
- intro: CVPR 2016. recurrent attentional convolutional-deconvolution network (RACDNN)
- arxiv: [1604.03227] Recurrent Attentional Networks for Saliency Detection
Two-Stream Convolutional Networks for Dynamic Saliency Prediction
Unconstrained Salient Object Detection
Unconstrained Salient Object Detection via Proposal Subset Optimization
- intro: CVPR 2016
- project page: http://cs-people.bu.edu/jmzhang/sod.html
- paper: http://cs-people.bu.edu/jmzhang/SOD/CVPR16SOD_camera_ready.pdf
- github: https://github.com/jimmie33/SOD
- caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-object-proposal-models-for-salient-object-detection
DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection
Salient Object Subitizing
- intro: CVPR 2015
- intro: predicting the existence and the number of salient objects in an image using holistic cues
- project page: http://cs-people.bu.edu/jmzhang/sos.html
- arxiv: [1607.07525] Salient Object Subitizing
- paper: http://cs-people.bu.edu/jmzhang/SOS/SOS_preprint.pdf
- caffe model zoo: https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-models-for-salient-object-subitizing
Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection
- intro: ACMMM 2016. deeply-supervised recurrent convolutional neural network (DSRCNN)
- arxiv: [1608.05177] Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection
Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs
- intro: ECCV 2016
- arxiv: [1608.05186] Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs
Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection
- arxiv: [1608.08029] Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection
A Deep Multi-Level Network for Saliency Prediction
Visual Saliency Detection Based on Multiscale Deep CNN Features
- intro: IEEE Transactions on Image Processing
- arxiv: [1609.02077] Visual Saliency Detection Based on Multiscale Deep CNN Features
A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection
- intro: DSCLRCN
- arxiv: [1610.01708] A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection
Deeply supervised salient object detection with short connections
- intro: IEEE TPAMI 2018 (IEEE CVPR 2017)
- arxiv: [1611.04849] Deeply supervised salient object detection with short connections
- github(official, Caffe): https://github.com/Andrew-Qibin/DSS
- github(Tensorflow): https://github.com/Joker316701882/Salient-Object-Detection
Weakly Supervised Top-down Salient Object Detection
- intro: Nanyang Technological University
- arxiv: [1611.05345] Backtracking Spatial Pyramid Pooling (SPP)-based Image Classifier for Weakly Supervised Top-down Salient Object Detection
SalGAN: Visual Saliency Prediction with Generative Adversarial Networks
- project page: https://imatge-upc.github.io/saliency-salgan-2017/
- arxiv: [1701.01081] SalGAN: Visual Saliency Prediction with Generative Adversarial Networks
Visual Saliency Prediction Using a Mixture of Deep Neural Networks
A Fast and Compact Salient Score Regression Network Based on Fully Convolutional Network
Saliency Detection by Forward and Backward Cues in Deep-CNNs
Supervised Adversarial Networks for Image Saliency Detection
Group-wise Deep Co-saliency Detection
Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection
- intro: University of Maryland College Park & eBay Inc
- arxiv: [1708.00079] Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection
Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection
- intro: ICCV 2017
- arxiv: [1708.02001] Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection
Learning Uncertain Convolutional Features for Accurate Saliency Detection
- intro: Accepted as a poster in ICCV 2017
- arxiv: [1708.02031] Learning Uncertain Convolutional Features for Accurate Saliency Detection
Deep Edge-Aware Saliency Detection
Self-explanatory Deep Salient Object Detection
- intro: National University of Defense Technology, China & National University of Singapore
- arxiv: [1708.05595] Self-explanatory Deep Salient Object Detection
PiCANet: Learning Pixel-wise Contextual Attention in ConvNets and Its Application in Saliency Detection
DeepFeat: A Bottom Up and Top Down Saliency Model Based on Deep Features of Convolutional Neural Nets
Recurrently Aggregating Deep Features for Salient Object Detection
- intro: AAAI 2018
- paper: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16775/16281
Deep saliency: What is learnt by a deep network about saliency?
- intro: 2nd Workshop on Visualisation for Deep Learning in the 34th International Conference On Machine Learning
- arxiv: [1801.04261] Deep saliency: What is learnt by a deep network about saliency?
Contrast-Oriented Deep Neural Networks for Salient Object Detection
Salient Object Detection by Lossless Feature Reflection
- intro: IJCAI 2018
- arxiv: [1802.06527] Salient Object Detection by Lossless Feature Reflection
HyperFusion-Net: Densely Reflective Fusion for Salient Object Detection
Video Saliency Detection
Deep Learning For Video Saliency Detection
Video Salient Object Detection Using Spatiotemporal Deep Features
Predicting Video Saliency with Object-to-Motion CNN and Two-layer Convolutional LSTM
- arxiv: [1709.06316] Predicting Video Saliency with Object-to-Motion CNN and Two-layer Convolutional LSTM
Visual Relationship Detection
Visual Relationship Detection with Language Priors
- intro: ECCV 2016 oral
- paper: https://cs.stanford.edu/people/ranjaykrishna/vrd/vrd.pdf
- github: https://github.com/Prof-Lu-Cewu/Visual-Relationship-Detection
ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection
- intro: Visual Phrase reasoning Convolutional Neural Network (ViP-CNN), Visual Phrase Reasoning Structure (VPRS)
- arxiv: [1702.07191] ViP-CNN: Visual Phrase Guided Convolutional Neural Network
Visual Translation Embedding Network for Visual Relation Detection
Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection
- intro: CVPR 2017 spotlight paper
- arxiv: [1703.03054] Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection
Detecting Visual Relationships with Deep Relational Networks
- intro: CVPR 2017 oral. The Chinese University of Hong Kong
- arxiv: [1704.03114] Detecting Visual Relationships with Deep Relational Networks
Identifying Spatial Relations in Images using Convolutional Neural Networks
PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN
- intro: ICCV 2017
- arxiv: [1708.01956] PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN
Natural Language Guided Visual Relationship Detection
Detecting Visual Relationships Using Box Attention
- intro: Google AI & IST Austria
- arxiv: [1807.02136] Detecting Visual Relationships Using Box Attention
Google AI Open Images - Visual Relationship Track
- intro: Detect pairs of objects in particular relationships
- kaggle: Google AI Open Images - Visual Relationship Track | Kaggle
Context-Dependent Diffusion Network for Visual Relationship Detection
- intro: 2018 ACM Multimedia Conference
- arxiv: [1809.06213] Context-Dependent Diffusion Network for Visual Relationship Detection
A Problem Reduction Approach for Visual Relationships Detection
- intro: ECCV 2018 Workshop
- arxiv: [1809.09828] A Problem Reduction Approach for Visual Relationships Detection
Face Detection
Multi-view Face Detection Using Deep Convolutional Neural Networks
- intro: Yahoo
- arxiv: [1502.02766] Multi-view Face Detection Using Deep Convolutional Neural Networks
- github: https://github.com/guoyilin/FaceDetection_CNN
From Facial Parts Responses to Face Detection: A Deep Learning Approach
- intro: ICCV 2015. CUHK
- project page: http://personal.ie.cuhk.edu.hk/~ys014/projects/Faceness/Faceness.html
- arxiv: [1509.06451] From Facial Parts Responses to Face Detection: A Deep Learning Approach
- paper: http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Yang_From_Facial_Parts_ICCV_2015_paper.pdf
Compact Convolutional Neural Network Cascade for Face Detection
- arxiv: [1508.01292] Compact Convolutional Neural Network Cascade for Face Detection
- github: https://github.com/Bkmz21/FD-Evaluation
- github: https://github.com/Bkmz21/CompactCNNCascade
Face Detection with End-to-End Integration of a ConvNet and a 3D Model
- intro: ECCV 2016
- arxiv: [1606.00850] Face Detection with End-to-End Integration of a ConvNet and a 3D Model
- github(MXNet): https://github.com/tfwu/FaceDetection-ConvNet-3D
CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection
- intro: CMU
- arxiv: [1606.05413] CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection
Towards a Deep Learning Framework for Unconstrained Face Detection
- intro: overlap with CMS-RCNN
- arxiv: [1612.05322] Towards a Deep Learning Framework for Unconstrained Face Detection
Supervised Transformer Network for Efficient Face Detection
UnitBox: An Advanced Object Detection Network
- intro: ACM MM 2016
- keywords: IOULoss
- arxiv: [1608.01471] UnitBox: An Advanced Object Detection Network
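The "IOULoss" keyword above refers to training box regression directly on intersection-over-union. A minimal sketch follows, assuming corner-format boxes `(x1, y1, x2, y2)` and the loss form `-ln(IoU)`. The function names and the `eps` stabilizer are illustrative assumptions, not the authors' code; the paper itself works with per-pixel distances to the four box sides.

```python
import math


def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target, eps=1e-7):
    """-ln(IoU); eps (an assumption here) keeps the log finite when IoU is 0."""
    return -math.log(iou(pred, target) + eps)
```

Unlike a coordinate-wise L2 loss, this treats the four box coordinates jointly, so the loss value does not grow with the scale of the box.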
Bootstrapping Face Detection with Hard Negative Examples
- author: Wan Shaohua @ Xiaomi
- intro: Faster R-CNN, hard negative mining. state-of-the-art on the FDDB dataset
- arxiv: [1608.02236] Bootstrapping Face Detection with Hard Negative Examples
Grid Loss: Detecting Occluded Faces
- intro: ECCV 2016
- arxiv: [1609.00129] Grid Loss: Detecting Occluded Faces
- paper: http://lrs.icg.tugraz.at/pubs/opitz_eccv_16.pdf
A Multi-Scale Cascade Fully Convolutional Network Face Detector
- intro: ICPR 2016
- arxiv: [1609.03536] A Multi-Scale Cascade Fully Convolutional Network Face Detector
MTCNN
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks
- project page: Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks
- arxiv: [1604.02878] Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks
- github(official, Matlab): https://github.com/kpzhang93/MTCNN_face_detection_alignment
- github: https://github.com/pangyupo/mxnet_mtcnn_face_detection
- github: https://github.com/DaFuCoding/MTCNN_Caffe
- github(MXNet): https://github.com/Seanlinx/mtcnn
- github: https://github.com/Pi-DeepLearning/RaspberryPi-FaceDetection-MTCNN-Caffe-With-Motion
- github(Caffe): https://github.com/foreverYoungGitHub/MTCNN
- github: https://github.com/CongWeilin/mtcnn-caffe
- github(OpenCV+OpenBlas): https://github.com/AlphaQi/MTCNN-light
- github(Tensorflow+golang): https://github.com/jdeng/goface
Face Detection using Deep Learning: An Improved Faster RCNN Approach
- intro: DeepIR Inc
- arxiv: [1701.08289] Face Detection using Deep Learning: An Improved Faster RCNN Approach
Faceness-Net: Face Detection through Deep Facial Part Responses
- intro: An extended version of ICCV 2015 paper
- arxiv: [1701.08393] Faceness-Net: Face Detection through Deep Facial Part Responses
Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained “Hard Faces”
- intro: CVPR 2017. MP-RCNN, MP-RPN
- arxiv: [1703.09145] Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained "Hard Faces"
End-To-End Face Detection and Recognition
Face R-CNN
- arxiv: [1706.01061] Face R-CNN
Face Detection through Scale-Friendly Deep Convolutional Networks
Scale-Aware Face Detection
- intro: CVPR 2017. SenseTime & Tsinghua University
- arxiv: [1706.09876] Scale-Aware Face Detection
Detecting Faces Using Inside Cascaded Contextual CNN
- intro: CVPR 2017. Tencent AI Lab & SenseTime
Multi-Branch Fully Convolutional Network for Face Detection
SSH: Single Stage Headless Face Detector
- intro: ICCV 2017. University of Maryland
- arxiv: [1708.03979] SSH: Single Stage Headless Face Detector
- github(official, Caffe): https://github.com/mahyarnajibi/SSH
Dockerface: an easy to install and use Faster R-CNN face detector in a Docker container
- arxiv: [1708.04370] Dockerface: an Easy to Install and Use Faster R-CNN Face Detector in a Docker Container
FaceBoxes: A CPU Real-time Face Detector with High Accuracy
- intro: IJCB 2017
- keywords: Rapidly Digested Convolutional Layers (RDCL), Multiple Scale Convolutional Layers (MSCL)
- intro: the proposed detector runs at 20 FPS on a single CPU core and 125 FPS using a GPU for VGA-resolution images
- arxiv: [1708.05234] FaceBoxes: A CPU Real-time Face Detector with High Accuracy
- github(Caffe): https://github.com/zeusees/FaceBoxes
S3FD: Single Shot Scale-invariant Face Detector
- intro: ICCV 2017. Chinese Academy of Sciences
- intro: can run at 36 FPS on a Nvidia Titan X (Pascal) for VGA-resolution images
- arxiv: [1708.05237] S$^3$FD: Single Shot Scale-invariant Face Detector
- github(Caffe, official): https://github.com/sfzhang15/SFD
- github: https://github.com/clcarwin/SFD_pytorch
Detecting Faces Using Region-based Fully Convolutional Networks
AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection
- arxiv: [1709.07326] AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection
Face Attention Network: An effective Face Detector for the Occluded Faces
Feature Agglomeration Networks for Single Stage Face Detection
Face Detection Using Improved Faster RCNN
- intro: Huawei Cloud BU
- arxiv: [1802.02142] Face Detection Using Improved Faster RCNN
PyramidBox: A Context-assisted Single Shot Face Detector
- intro: Baidu, Inc
- arxiv: [1803.07737] PyramidBox: A Context-assisted Single Shot Face Detector
A Fast Face Detection Method via Convolutional Neural Network
- intro: Neurocomputing
- arxiv: [1803.10103] A Fast Face Detection Method via Convolutional Neural Network
Beyond Trade-off: Accelerate FCN-based Face Detector with Higher Accuracy
- intro: CVPR 2018. Beihang University & CUHK & Sensetime
- arxiv: [1804.05197] Beyond Trade-off: Accelerate FCN-based Face Detector with Higher Accuracy
Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
- intro: CVPR 2018
- arxiv: [1804.06039] Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
- github: https://github.com/Jack-CV/PCN
SFace: An Efficient Network for Face Detection in Large Scale Variations
- intro: Beihang University & Megvii Inc. (Face++)
- arxiv: [1804.06559] SFace: An Efficient Network for Face Detection in Large Scale Variations
Survey of Face Detection on Low-quality Images
Anchor Cascade for Efficient Face Detection
- intro: The University of Sydney
- arxiv: [1805.03363] Anchor Cascade for Efficient Face Detection
Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization
- intro: IEEE MMSP
- arxiv: [1805.12302] Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization
Selective Refinement Network for High Performance Face Detection
Detect Small Faces
Finding Tiny Faces
- intro: CVPR 2017. CMU
- project page: Peiyun Hu
- arxiv: [1612.04402] Finding Tiny Faces
- github(official, Matlab): https://github.com/peiyunh/tiny
- github(inference-only): https://github.com/chinakook/hr101_mxnet
- github: https://github.com/cydonia999/Tiny_Faces_in_Tensorflow
Detecting and counting tiny faces
- intro: ENS Paris-Saclay. ExtendedTinyFaces
- intro: Detecting and counting small objects - Analysis, review and application to counting
- arxiv: [1801.06504] Detecting and counting tiny faces
- github: https://github.com/alexattia/ExtendedTinyFaces
Seeing Small Faces from Robust Anchor’s Perspective
- intro: CVPR 2018
- arxiv: [1802.09058] Seeing Small Faces from Robust Anchor's Perspective
Face-MagNet: Magnifying Feature Maps to Detect Small Faces
- intro: WACV 2018
- keywords: Face Magnifier Network (Face-MageNet)
- arxiv: [1803.05258] Face-MagNet: Magnifying Feature Maps to Detect Small Faces
- github: https://github.com/po0ya/face-magnet
Person Head Detection
Context-aware CNNs for person head detection
- intro: ICCV 2015
- project page: Context-aware CNNs for person head detection
- arxiv: [1511.07917] Context-aware CNNs for person head detection
- github: https://github.com/aosokin/cnn_head_detection
Detecting Heads using Feature Refine Net and Cascaded Multi-scale Architecture
A Comparison of CNN-based Face and Head Detectors for Real-Time Video Surveillance Applications
FCHD: A fast and accurate head detector
- arxiv: [1809.08766] FCHD: Fast and accurate head detection in crowded scenes
- github(PyTorch, official): https://github.com/aditya-vora/FCHD-Fully-Convolutional-Head-Detector
Pedestrian Detection / People Detection
Pedestrian Detection aided by Deep Learning Semantic Tasks
- intro: CVPR 2015
- project page: Multimedia Laboratory
- arxiv: [1412.0069] Pedestrian Detection aided by Deep Learning Semantic Tasks
Deep Learning Strong Parts for Pedestrian Detection
- intro: ICCV 2015. CUHK. DeepParts
- intro: Achieving 11.89% average miss rate on Caltech Pedestrian Dataset
- paper: http://personal.ie.cuhk.edu.hk/~pluo/pdf/tianLWTiccv15.pdf
Taking a Deeper Look at Pedestrians
- intro: CVPR 2015
- arxiv: [1501.05790] Taking a Deeper Look at Pedestrians
Convolutional Channel Features
- intro: ICCV 2015
- arxiv: [1504.07339] Convolutional Channel Features
- github: https://github.com/byangderek/CCF
End-to-end people detection in crowded scenes
- arxiv: [1506.04878] End-to-end people detection in crowded scenes
- github: https://github.com/Russell91/reinspect
- notebook: http://nbviewer.ipython.org/github/Russell91/ReInspect/blob/master/evaluation_reinspect.ipynb
- youtube: https://www.youtube.com/watch?v=QeWl0h3kQ24
Learning Complexity-Aware Cascades for Deep Pedestrian Detection
- intro: ICCV 2015
- arxiv: [1507.05348] Learning Complexity-Aware Cascades for Deep Pedestrian Detection
Deep convolutional neural networks for pedestrian detection
- arxiv: [1510.03608] Deep convolutional neural networks for pedestrian detection
- github: https://github.com/DenisTome/DeepPed
Scale-aware Fast R-CNN for Pedestrian Detection
New algorithm improves speed and accuracy of pedestrian detection
Pushing the Limits of Deep CNNs for Pedestrian Detection
- intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
- arxiv: [1603.04525] Pushing the Limits of Deep CNNs for Pedestrian Detection
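The log-average miss rate quoted above is the usual Caltech summary metric: miss rates are sampled at nine false-positives-per-image (FPPI) values spaced evenly in log space between 10^-2 and 10^0, then averaged geometrically. A minimal sketch under stated assumptions: the curve is given as parallel lists sorted by ascending FPPI, and a reference point below the first curve point counts as miss rate 1.0 (that fallback is an assumption here; evaluation toolkits may handle it differently).

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI references.

    fppi and miss_rate are parallel lists, sorted by ascending fppi.
    """
    # Reference FPPI values: 10^-2 ... 10^0, evenly spaced in log space.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    samples = []
    for ref in refs:
        # Use the curve point with the largest FPPI not exceeding the reference.
        reached = [m for f, m in zip(fppi, miss_rate) if f <= ref]
        samples.append(reached[-1] if reached else 1.0)  # assumed fallback
    # Average in log space; the floor avoids log(0) for a perfect detector.
    return math.exp(sum(math.log(max(m, 1e-10)) for m in samples) / len(samples))
```

Lower is better, and because the average is taken in log space, improvements at low FPPI count as much as improvements at high FPPI.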
A Real-Time Deep Learning Pedestrian Detector for Robot Navigation
A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation
Is Faster R-CNN Doing Well for Pedestrian Detection?
- intro: ECCV 2016
- arxiv: [1607.07032] Is Faster R-CNN Doing Well for Pedestrian Detection?
- github: https://github.com/zhangliliang/RPN_BF/tree/RPN-pedestrian
Unsupervised Deep Domain Adaptation for Pedestrian Detection
- intro: ECCV Workshop 2016
- arxiv: [1802.03269] Unsupervised Deep Domain Adaptation for Pedestrian Detection
Reduced Memory Region Based Deep Convolutional Neural Network Detection
- intro: IEEE 2016 ICCE-Berlin
- arxiv: [1609.02500] Reduced Memory Region Based Deep Convolutional Neural Network Detection
Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection
Detecting People in Artwork with CNNs
- intro: ECCV 2016 Workshops
- arxiv: [1610.08871] Detecting People in Artwork with CNNs
Multispectral Deep Neural Networks for Pedestrian Detection
- intro: BMVC 2016 oral
- arxiv: [1611.02644] Multispectral Deep Neural Networks for Pedestrian Detection
Deep Multi-camera People Detection
Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters
- intro: CVPR 2017
- project page: http://ml.cs.tsinghua.edu.cn:5000/publications/synunity/
- arxiv: [1703.06283] Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters
- github(Tensorflow): https://github.com/huangshiyu13/RPNplus
What Can Help Pedestrian Detection?
- intro: CVPR 2017. Tsinghua University & Peking University & Megvii Inc.
- keywords: Faster R-CNN, HyperLearner
- arxiv: [1705.02757] What Can Help Pedestrian Detection?
- paper: http://openaccess.thecvf.com/content_cvpr_2017/papers/Mao_What_Can_Help_CVPR_2017_paper.pdf
Illuminating Pedestrians via Simultaneous Detection & Segmentation
Rotational Rectification Network for Robust Pedestrian Detection
- intro: CMU & Volvo Construction
- arxiv: [1706.08917] Rotational Rectification Network: Enabling Pedestrian Detection for Mobile Vision
STD-PD: Generating Synthetic Training Data for Pedestrian Detection in Unannotated Videos
- intro: The University of North Carolina at Chapel Hill
- arxiv: [1707.09100] MixedPeds: Pedestrian Detection in Unannotated Videos using Synthetically Generated Human-agents for Training
Too Far to See? Not Really! — Pedestrian Detection with Scale-aware Localization Policy
Repulsion Loss: Detecting Pedestrians in a Crowd
Aggregated Channels Network for Real-Time Pedestrian Detection
Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection
- intro: State Key Lab of CAD&CG, Zhejiang University
- arxiv: [1803.05347] Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection
Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection
Pedestrian-Synthesis-GAN: Generating Pedestrian Data in Real Scene and Beyond
PCN: Part and Context Information for Pedestrian Detection with CNNs
- intro: British Machine Vision Conference (BMVC) 2017
- arxiv: [1804.04483] PCN: Part and Context Information for Pedestrian Detection with CNNs
Small-scale Pedestrian Detection Based on Somatic Topology Localization and Temporal Feature Aggregation
- intro: ECCV 2018. Hikvision Research Institute
- arxiv: [1807.01438] Small-scale Pedestrian Detection Based on Somatic Topology Localization and Temporal Feature Aggregation
Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd
- intro: ECCV 2018
- arxiv: [1807.08407] Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd
Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation
- intro: BMVC 2018
- arxiv: [1808.04818] Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation
Vehicle Detection
DAVE: A Unified Framework for Fast Vehicle Detection and Annotation
- intro: ECCV 2016
- arxiv: [1607.04564] DAVE: A Unified Framework for Fast Vehicle Detection and Annotation
Evolving Boxes for fast Vehicle Detection
Fine-Grained Car Detection for Visual Census Estimation
- intro: AAAI 2016
- arxiv: [1709.02480] Fine-Grained Car Detection for Visual Census Estimation
SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection
- intro: IEEE Transactions on Intelligent Transportation Systems (T-ITS)
- arxiv: [1804.00433] SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection
Label and Sample: Efficient Training of Vehicle Object Detector from Sparsely Labeled Data
- intro: UC Berkeley
- arxiv: [1808.08603] Label and Sample: Efficient Training of Vehicle Object Detector from Sparsely Labeled Data
Traffic-Sign Detection
Traffic-Sign Detection and Classification in the Wild
- intro: CVPR 2016
- project page(code+dataset): index
- paper: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zhu_Traffic-Sign_Detection_and_CVPR_2016_paper.pdf
- code & model: http://cg.cs.tsinghua.edu.cn/traffic-sign/data_model_code/newdata0411.zip
Evaluating State-of-the-art Object Detector on Challenging Traffic Light Data
- intro: CVPR 2017 workshop
- paper: http://openaccess.thecvf.com/content_cvpr_2017_workshops/w9/papers/Jensen_Evaluating_State-Of-The-Art_Object_CVPR_2017_paper.pdf
Detecting Small Signs from Large Images
- intro: IEEE Conference on Information Reuse and Integration (IRI) 2017 oral
- arxiv: [1706.08574] Detecting Small Signs from Large Images
Localized Traffic Sign Detection with Multi-scale Deconvolution Networks
Detecting Traffic Lights by Single Shot Detection
- intro: ITSC 2018
- arxiv: [1805.02523] Detecting Traffic Lights by Single Shot Detection
A Hierarchical Deep Architecture and Mini-Batch Selection Method For Joint Traffic Sign and Light Detection
- intro: IEEE 15th Conference on Computer and Robot Vision
- arxiv: [1806.07987] A Hierarchical Deep Architecture and Mini-Batch Selection Method For Joint Traffic Sign and Light Detection
- demo: https://www.youtube.com/watch?v=_YmogPzBXOw&feature=youtu.be
Skeleton Detection
Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs
- arxiv: [1603.09446] Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs
- github: https://github.com/zeakey/DeepSkeleton
DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images
SRN: Side-output Residual Network for Object Symmetry Detection in the Wild
- intro: CVPR 2017
- arxiv: [1703.02243] SRN: Side-output Residual Network for Object Symmetry Detection in the Wild
- github: https://github.com/KevinKecc/SRN
Hi-Fi: Hierarchical Feature Integration for Skeleton Detection
Fruit Detection
Deep Fruit Detection in Orchards
Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards
- intro: The Journal of Field Robotics in May 2016
- project page: http://confluence.acfr.usyd.edu.au/display/AGPub/
- arxiv: [1610.08120] Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards
Shadow Detection
Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network
- arxiv: [1709.09283] Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network
A+D-Net: Shadow Detection with Adversarial Shadow Attenuation
Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal
Direction-aware Spatial Context Features for Shadow Detection
- intro: CVPR 2018
- arxiv: [1712.04142] Direction-aware Spatial Context Features for Shadow Detection
Direction-aware Spatial Context Features for Shadow Detection and Removal
- intro: The Chinese University of Hong Kong & The Hong Kong Polytechnic University
- arxiv: [1805.04635] Direction-aware Spatial Context Features for Shadow Detection and Removal
Other Detection Tasks
Deep Deformation Network for Object Landmark Localization
Fashion Landmark Detection in the Wild
- intro: ECCV 2016
- project page: http://personal.ie.cuhk.edu.hk/~lz013/projects/FashionLandmarks.html
- arxiv: [1608.03049] Fashion Landmark Detection in the Wild
- github(Caffe): https://github.com/liuziwei7/fashion-landmarks
Deep Learning for Fast and Accurate Fashion Item Detection
- intro: Kuznech Inc.
- intro: MultiBox and Fast R-CNN
- paper: https://kddfashion2016.mybluemix.net/kddfashion_finalSubmissions/Deep%20Learning%20for%20Fast%20and%20Accurate%20Fashion%20Item%20Detection.pdf
OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)
Selfie Detection by Synergy-Constraint Based Convolutional Neural Network
- intro: IEEE SITIS 2016
- arxiv: [1611.04357] Selfie Detection by Synergy-Constraint Based Convolutional Neural Network
Associative Embedding: End-to-End Learning for Joint Detection and Grouping
Deep Cuboid Detection: Beyond 2D Bounding Boxes
- intro: CMU & Magic Leap
- arxiv: [1611.10010] Deep Cuboid Detection: Beyond 2D Bounding Boxes
Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection
- arxiv: [1612.03019] Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection
Deep Learning Logo Detection with Data Expansion by Synthesising Context
Scalable Deep Learning Logo Detection
Pixel-wise Ear Detection with Convolutional Encoder-Decoder Networks
Automatic Handgun Detection Alarm in Videos Using Deep Learning
- arxiv: [1702.05147] Automatic Handgun Detection Alarm in Videos Using Deep Learning
- results: https://github.com/SihamTabik/Pistol-Detection-in-Videos
Objects as context for part detection
Using Deep Networks for Drone Detection
- intro: AVSS 2017
- arxiv: [1706.05726] Using Deep Networks for Drone Detection
Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection
- intro: ICCV 2017
- arxiv: [1708.01642] Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection
Target Driven Instance Detection
DeepVoting: An Explainable Framework for Semantic Part Detection under Partial Occlusion
VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition
- intro: ICCV 2017
- arxiv: [1710.06288] VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition
- github: https://github.com/SeokjuLee/VPGNet
Grab, Pay and Eat: Semantic Food Detection for Smart Restaurants
ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos
- intro: WACV 2018
- arxiv: [1801.02031] ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos
Deep Learning Object Detection Methods for Ecological Camera Trap Data
- intro: Conference on Computer and Robot Vision. University of Guelph
- arxiv: [1803.10842] Deep Learning Object Detection Methods for Ecological Camera Trap Data
EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection
- arxiv: [1806.05525] EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection
Towards End-to-End Lane Detection: an Instance Segmentation Approach
iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
- intro: BMVC 2018
- project page: iCAN
- arxiv: [1808.10437] iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
- github: https://github.com/vt-vl-lab/iCAN
Densely Supervised Grasp Detector (DSGD)
Object Proposal
DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers
- arxiv: [1510.04445] DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers
- github: https://github.com/aghodrati/deepproposal
Scale-aware Pixel-wise Object Proposal Networks
- intro: IEEE Transactions on Image Processing
- arxiv: [1601.04798] Scale-aware Pixel-wise Object Proposal Networks
Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization
- intro: BMVC 2016. AttractioNet
- arxiv: [1606.04446] Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization
- github: https://github.com/gidariss/AttractioNet
Learning to Segment Object Proposals via Recursive Neural Networks
Learning Detection with Diverse Proposals
- intro: CVPR 2017
- keywords: differentiable Determinantal Point Process (DPP) layer, Learning Detection with Diverse Proposals (LDDP)
- arxiv: [1704.03533] Learning Detection with Diverse Proposals
ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond
- keywords: product detection
- arxiv: [1704.06752] ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond
Improving Small Object Proposals for Company Logo Detection
- intro: ICMR 2017
- arxiv: [1704.08881] Improving Small Object Proposals for Company Logo Detection
Open Logo Detection Challenge
- intro: BMVC 2018
- keywords: QMUL-OpenLogo
- project page: https://hangsu0730.github.io/qmul-openlogo/
- arxiv: [1807.01964] Open Logo Detection Challenge
Localization
Beyond Bounding Boxes: Precise Localization of Objects in Images
- intro: PhD Thesis
- homepage: Tech Reports | EECS at UC Berkeley
- phd-thesis: http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.pdf
- github(“SDS using hypercolumns”): https://github.com/bharath272/sds
Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
- arxiv: [1503.00949] Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
Weakly Supervised Object Localization Using Size Estimates
Active Object Localization with Deep Reinforcement Learning
- intro: ICCV 2015
- keywords: Markov Decision Process
- arxiv: [1511.06015] Active Object Localization with Deep Reinforcement Learning
Localizing objects using referring expressions
- intro: ECCV 2016
- keywords: LSTM, multiple instance learning (MIL)
- paper: http://www.umiacs.umd.edu/~varun/files/refexp-ECCV16.pdf
- github: https://github.com/varun-nagaraja/referring-expressions
LocNet: Improving Localization Accuracy for Object Detection
- intro: CVPR 2016 oral
- arxiv: [1511.07763] LocNet: Improving Localization Accuracy for Object Detection
- github: https://github.com/gidariss/LocNet
Learning Deep Features for Discriminative Localization
- homepage: CNN Discriminative Localization and Saliency - MIT
- arxiv: [1512.04150] Learning Deep Features for Discriminative Localization
- github(Tensorflow): https://github.com/jazzsaxmafia/Weakly_detector
- github: https://github.com/metalbubble/CAM
- github: https://github.com/tdeboissiere/VGG16CAM-keras
ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization
- intro: ECCV 2016
- project page: ContextLocNet: Context-aware Deep Network Models for Weakly Supervised Localization
- arxiv: [1609.04331] ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization
- github: https://github.com/vadimkantorov/contextlocnet
Ensemble of Part Detectors for Simultaneous Classification and Localization
STNet: Selective Tuning of Convolutional Networks for Object Localization
Soft Proposal Networks for Weakly Supervised Object Localization
- intro: ICCV 2017
- arxiv: [1709.01829] Soft Proposal Networks for Weakly Supervised Object Localization
Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN
- intro: ACM MM 2017
- arxiv: [1709.08295] Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN
Tutorials / Talks
Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection
Towards Good Practices for Recognition & Detection
- intro: Hikvision Research Institute. Supervised Data Augmentation (SDA)
- slides: http://image-net.org/challenges/talks/2016/Hikvision_at_ImageNet_2016.pdf
Work in progress: Improving object detection and instance segmentation for small objects
Object Detection with Deep Learning: A Review
Projects
Detectron
- intro: FAIR’s research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
- github: https://github.com/facebookresearch/Detectron
TensorBox: a simple framework for training neural networks to detect objects in images
- intro: “The basic model implements the simple and robust GoogLeNet-OverFeat algorithm. We additionally provide an implementation of the ReInspect algorithm”
- github: https://github.com/Russell91/TensorBox
Object detection in torch: Implementation of some object detection frameworks in torch
Using DIGITS to train an Object Detection network
FCN-MultiBox Detector
- intro: Fully convolutional MultiBox detector (like SSD) implemented in Torch.
- github: https://github.com/teaonly/FMD.torch
KittiBox: A car detection model implemented in Tensorflow.
- keywords: MultiNet
- intro: KittiBox is a collection of scripts to train our model FastBox on the Kitti Object Detection Dataset
- github: https://github.com/MarvinTeichmann/KittiBox
Deformable Convolutional Networks + MST + Soft-NMS
How to Build a Real-time Hand-Detector using Neural Networks (SSD) on Tensorflow
- blog: https://towardsdatascience.com/how-to-build-a-real-time-hand-detector-using-neural-networks-ssd-on-tensorflow-d6bac0e4b2ce
- github: https://github.com/victordibia/handtracking
Metrics for object detection
- intro: Most popular metrics used to evaluate object detection algorithms
- github: https://github.com/rafaelpadilla/Object-Detection-Metrics
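Detection metrics such as precision, recall, and average precision are typically built on the intersection over union (IoU) between a predicted box and a ground-truth box. The snippet below is a minimal sketch of that building block, not code from the repository above; the function name `iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero
    # so that disjoint boxes give an empty intersection.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

A detection is then usually counted as a true positive when its IoU with a ground-truth box reaches a chosen threshold (0.5 is a common choice in PASCAL VOC-style evaluation).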
Leaderboard
Detection Results: VOC2012
- intro: Competition “comp4” (train on additional data)
- homepage: http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4
Tools
BeaverDam: Video annotation tool for deep learning training labels
- github: https://github.com/antingshen/BeaverDam
Blogs
Convolutional Neural Networks for Object Detection
- blog: http://rnd.azoft.com/convolutional-neural-networks-object-detection/
Introducing automatic object detection to visual search (Pinterest)
- keywords: Faster R-CNN
- blog: https://engineering.pinterest.com/blog/introducing-automatic-object-detection-visual-search
- demo: https://engineering.pinterest.com/sites/engineering/files/Visual%20Search%20V1%20-%20Video.mp4
- review: Pinterest Introduces the Future of Visual Search | NVIDIA Technical Blog
Deep Learning for Object Detection with DIGITS
Analyzing The Papers Behind Facebook’s Computer Vision Approach
- keywords: DeepMask, SharpMask, MultiPathNet
- blog: https://adeshpande3.github.io/adeshpande3.github.io/Analyzing-the-Papers-Behind-Facebook’s-Computer-Vision-Approach/
Easily Create High Quality Object Detectors with Deep Learning
- intro: dlib v19.2
- blog: dlib C++ Library: Easily Create High Quality Object Detectors with Deep Learning
How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit
- blog: How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit | Microsoft Learn
- github: https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Detection/FastRCNN
Object Detection in Satellite Imagery, a Low Overhead Approach
- part 1: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-i-cbd96154a1b7#.2csh4iwx9
- part 2: https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-ii-893f40122f92#.f9b7dgf64
You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks
- part 1: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-38dad1cf7571#.fmmi2o3of
- part 2: https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-34f72f659588#.nwzarsz1t
Faster R-CNN Pedestrian and Car Detection
- blog: Faster R-CNN Pedestrian and Car Detection | BigSnarf blog
- ipynb: https://gist.github.com/bigsnarfdude/2f7b2144065f6056892a98495644d3e0#file-demo_faster_rcnn_notebook-ipynb
- github: https://github.com/bigsnarfdude/Faster-RCNN_TF
Small U-Net for vehicle detection
Region of interest pooling explained
- blog: Region of interest pooling explained - deepsense.ai
- github: https://github.com/deepsense-io/roi-pooling
Supercharge your Computer Vision models with the TensorFlow Object Detection API
- blog: https://research.googleblog.com/2017/06/supercharge-your-computer-vision-models.html
- github: https://github.com/tensorflow/models/tree/master/object_detection