arXiv 2017: Face Detection using Deep Learning: An Improved Faster RCNN Approach


License: CC 4.0 BY-SA

Tags: face detection, deep learning


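This post summarizes a face detection method based on Faster R-CNN that achieves strong results on FDDB and other datasets. The main improvements are feature concatenation, hard negative mining, multi-scale training, and a pretraining strategy. According to the experiments, pretraining on the WIDER FACE dataset and then fine-tuning on the FDDB dataset significantly improves detection accuracy.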

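A face detection paper from 深图智服, based on Faster RCNN. The paper reports ranking the best among all the published approaches on FDDB as of 2017-01.

The improvement strategies are:

1 feature concatenation

2 hard negative mining

3 multi-scale training: images are trained at multiple scales, but the scale for each image is picked at random from three candidate scales, which adds randomness;

4 model pretraining: the model is first pretrained on WIDER FACE;

5 proper calibration of key parameters: an extra 64*64 anchor scale is added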

Background: the VJ (Viola-Jones) framework was the first one to apply rectangular Haar-like features in a cascaded Adaboost classifier for achieving real-time face detection.

Because the method builds on Faster RCNN, the paper gives a brief overview of its two components:

1 RPN for generating a list of region proposals which likely contain objects, or called regions of interest (RoIs); 

2 a Fast RCNN network for classifying a region of image into objects (and background) and refining the boundaries of those regions.

The training steps are fairly simple:

1 train the CNN model (VGG16 pretrained on ImageNet) of Faster RCNN using the WIDER FACE dataset.

2 use the same dataset to test the pre-trained model so as to generate hard negatives.

3 These hard negatives are fed into the network as the second step of our training procedure. In other words, the hard negatives generated in step 2 are used for another round of training on WIDER FACE;

4 The resulting model will be further fine-tuned on the FDDB dataset. 

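The final fine-tuning process uses two tricks, plus an optional post-processing step:

1 multi-scale training process

2 feature concatenation strategy

3 converting the detected bounding boxes from rectangles to ellipses (optional);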

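feature concatenation strategy (feature fusion):

This is fairly simple. In the forward pass, for both training and testing, the RPN produces the RoIs. The original RoI pooling is applied only to the last conv feature map; the paper instead applies RoI pooling on conv3, conv4, and conv5, L2-normalizes each pooled result, and concatenates them. Everything after that follows the usual Fast RCNN pipeline;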
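The fusion step described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the paper's implementation: each RoI-pooled feature is treated as a flat list of floats, normalization is done over the whole vector, and the function names `l2_normalize` and `concat_roi_features` are hypothetical.

```python
import math

def l2_normalize(vec, eps=1e-12):
    """Scale a feature vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

def concat_roi_features(pooled_per_layer):
    """L2-normalize each layer's RoI-pooled feature, then concatenate them.

    pooled_per_layer: list of flat feature vectors, one per conv layer
    (e.g. conv3, conv4, conv5), all pooled from the same RoI.
    """
    fused = []
    for feat in pooled_per_layer:
        fused.extend(l2_normalize(feat))
    return fused

# Toy features with very different magnitudes (made-up values).
conv3 = [3.0, 4.0]
conv4 = [0.0, 10.0]
conv5 = [100.0, 0.0]
fused = concat_roi_features([conv3, conv4, conv5])
```

The normalization matters because activations from different layers usually live on different scales; without it, the layer with the largest magnitudes would dominate the concatenated feature.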

hard negative mining: only false positives are mined, whereas OHEM mines both false positives and false negatives;

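multi-scale training:

The Faster RCNN architecture typically adopts a fixed scale for all the training images, i.e. it trains at a single scale. The paper uses three scales: each image is resized to one randomly chosen scale before being fed to the model, which makes the detector more robust to scale changes;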
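A minimal sketch of the per-image random scale selection might look like this. The concrete values in `TRAIN_SCALES` and `MAX_LONG_SIDE` are assumptions for illustration (the notes above only say that three scales are used), and `pick_resize_factor` is a hypothetical helper name.

```python
import random

# Assumed values: target lengths for the shorter image side, and a cap
# on the longer side so very elongated images do not blow up in size.
TRAIN_SCALES = (480, 600, 750)
MAX_LONG_SIDE = 1250

def pick_resize_factor(width, height, rng=random):
    """Pick one training scale at random and return (scale, resize factor)."""
    target_short = rng.choice(TRAIN_SCALES)
    short_side = min(width, height)
    long_side = max(width, height)
    factor = target_short / short_side
    # Shrink the factor if the longer side would exceed the cap.
    if factor * long_side > MAX_LONG_SIDE:
        factor = MAX_LONG_SIDE / long_side
    return target_short, factor

rng = random.Random(0)  # seeded only to make the example repeatable
scale_a, factor_a = pick_resize_factor(1024, 768, rng)
scale_b, factor_b = pick_resize_factor(2000, 500, rng)
```

Because the scale is drawn independently for every image in every epoch, the same image is seen at different sizes over the course of training.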

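Experiments:

In WIDER FACE, each face is assigned a difficulty value. Faces with a difficulty greater than 2 are discarded and not used for training; images containing more than 1000 faces are also excluded from training. The number of anchors is 3*4;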
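To make the 3*4 anchor count concrete, here is a small sketch of generating 12 anchors from three aspect ratios and four scales. The specific values are assumptions: the scales are the usual Faster RCNN defaults (128, 256, 512) plus the added 64, and the ratios are the usual 1:2, 1:1, 2:1. `make_anchors` is a hypothetical helper, not code from the paper.

```python
import math

ANCHOR_SCALES = (64, 128, 256, 512)  # assumed: defaults plus the extra 64
ASPECT_RATIOS = (0.5, 1.0, 2.0)      # assumed: height / width

def make_anchors(scales=ANCHOR_SCALES, ratios=ASPECT_RATIOS):
    """Return anchors as (x1, y1, x2, y2) boxes centered on the origin."""
    anchors = []
    for s in scales:
        area = float(s * s)
        for r in ratios:
            # Keep the area fixed at s*s while changing the aspect ratio.
            w = math.sqrt(area / r)
            h = w * r
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors

anchors = make_anchors()
```

In a real RPN, these origin-centered boxes are shifted to every position of the feature map; the sketch stops at the per-position template.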

Hard negatives: detections with score > 0.8 and IoU with the ground truth < 0.5
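The selection rule above can be sketched as a simple filter. The box format `(x1, y1, x2, y2)` and the function names are assumptions for illustration; only the two thresholds come from the notes.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_hard_negatives(detections, gt_boxes,
                          score_thresh=0.8, iou_thresh=0.5):
    """Keep confident detections that overlap no ground-truth face enough."""
    hard = []
    for box, score in detections:
        best_iou = max((iou(box, gt) for gt in gt_boxes), default=0.0)
        if score > score_thresh and best_iou < iou_thresh:
            hard.append((box, score))
    return hard

gt_boxes = [(0, 0, 10, 10)]
detections = [
    ((0, 0, 10, 10), 0.95),    # true positive, not a hard negative
    ((20, 20, 30, 30), 0.90),  # confident false positive
    ((20, 20, 30, 30), 0.50),  # false positive, but score too low
    ((0, 0, 10, 20), 0.85),    # IoU is exactly 0.5, so not below threshold
]
hard_negatives = select_hard_negatives(detections, gt_boxes)
```

Only the second detection survives the filter: it is scored highly by the model yet overlaps no annotated face.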

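The third training stage, i.e. fine-tuning on FDDB, uses 10-fold cross-validation together with multi-scale training (multi-scale training is used in both stage 1 and stage 3);

Testing: images are likewise fed to the model at multiple scales;

The ablation studies illustrate well the contribution of each method adopted in the paper: ID7 gives the best result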

Paper reference

1 Face Detection using Deep Learning: An Improved Faster RCNN Approach (arXiv, 2017)