Large-Scale CNN Video Processing -- NoScope: 1000x Faster Deep Learning Queries over Video

License: CC 4.0 BY-SA

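To meet the need for real-time detection over a large number of video streams, NoScope combines a difference detector with specialized CNN models. Its specialized models run at over 15,000 frames per second, compared with YOLOv2's 80 frames per second. NoScope first uses the difference detector to decide whether the current frame has changed. If nothing has changed, the frame is dropped; if something has changed, a small CNN trained for that particular camera performs the detection. For difficult frames, NoScope can fall back to the full CNN.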

http://dawn.cs.stanford.edu/2017/06/22/noscope/
https://arxiv.org/abs/1703.02529

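YOLOv2 works quite well for video detection, but a single GPU only reaches a few dozen frames per second. How, then, can YOLOv2 handle detection and retrieval over hundreds of video feeds? Dedicating one GPU to every feed is not practical. The main idea is to run motion detection first, checking whether the current frame contains any moving object with "models that detect differences (to exploit temporal locality)", and then to train a small CNN for each camera to do the detection, that is, "models that are specialized to a given feed and object (to exploit scene-specific locality)".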
NoScope’s specialized models can run at over 15,000 frames per second compared to YOLOv2’s 80 frames per second.
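
To make the "detect differences" step concrete, here is a minimal sketch of one possible difference detector. It assumes frames are flat sequences of grayscale pixel intensities and uses mean squared error against a reference frame as the change score. The function names and the threshold value are hypothetical, chosen for illustration only; the text above does not specify which metric or threshold NoScope uses.

```python
from typing import Sequence


def mean_squared_difference(frame: Sequence[float], reference: Sequence[float]) -> float:
    """Average squared per-pixel difference between two equally sized frames."""
    if len(frame) != len(reference):
        raise ValueError("frames must have the same number of pixels")
    if len(frame) == 0:
        raise ValueError("frames must not be empty")
    return sum((a - b) ** 2 for a, b in zip(frame, reference)) / len(frame)


def frame_changed(frame: Sequence[float], reference: Sequence[float],
                  threshold: float = 25.0) -> bool:
    """Return True when the frame differs enough from the reference.

    The threshold is an assumed value; in practice it would be tuned per feed.
    """
    return mean_squared_difference(frame, reference) > threshold
```

A frame for which `frame_changed` returns `False` never needs to reach a CNN at all, which is where the savings from temporal locality would come from.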

If the difference detector is confident that nothing has changed, NoScope drops the frame; otherwise, if the specialized model is confident in its label, NoScope outputs the label. And, for particularly tricky frames, NoScope can always fall back to the full CNN.
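
The cascade described in the quote above can be sketched roughly as follows. This is an illustration under stated assumptions, not the NoScope implementation: the three models are passed in as plain callables, the specialized model is assumed to return a probability that the target object is present, and the names `cascade`, `diff_threshold`, `low` and `high` are hypothetical.

```python
from typing import Any, Callable, Optional, Tuple


def cascade(frame: Any,
            reference: Any,
            difference_score: Callable[[Any, Any], float],
            specialized_model: Callable[[Any], float],
            full_model: Callable[[Any], bool],
            diff_threshold: float = 1.0,
            low: float = 0.1,
            high: float = 0.9) -> Tuple[str, Optional[bool]]:
    """Return (stage that decided, label); the label is None for dropped frames."""
    # Stage 1: difference detector. If nothing changed, drop the frame.
    if difference_score(frame, reference) < diff_threshold:
        return ("dropped", None)

    # Stage 2: small per-camera model. Accept its answer only when it is
    # confident, i.e. its probability lies outside the (low, high) band.
    probability = specialized_model(frame)
    if probability >= high:
        return ("specialized", True)
    if probability <= low:
        return ("specialized", False)

    # Stage 3: tricky frame, fall back to the expensive full CNN.
    return ("full_cnn", full_model(frame))
```

The point of the design is that the expensive model only runs on the frames the two cheaper stages cannot settle. How a caller treats a dropped frame, for example by reusing the previous label, is left open here.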