Face Detection using Deep Learning: An Improved Faster RCNN Approach (paper notes)

This post covers an improved Faster R-CNN face detection model with three main changes: feature concatenation, which combines feature maps from different layers to improve detection accuracy; hard negative mining, which retrains on misclassified regions to reduce false positives; and multi-scale training, which makes the model more robust to faces of different sizes.


Flowchart of the training procedure


First of all, we train the CNN model of Faster RCNN using the WIDER FACE dataset [30]. We further use the same dataset to test the pre-trained model so as to generate hard negatives. These hard negatives are fed into the network as the second step of our training procedure. The resulting model will be further fine-tuned on the FDDB dataset. During the final fine-tuning process, we apply the multi-scale training process, and adopt a feature concatenation strategy to further boost the performance of our model. For the whole training processes, we follow the similar end-to-end training strategy as Faster RCNN.

The paper improves on Faster R-CNN in three main ways:

1. Feature Concatenation

Network architecture of the proposed feature concatenation scheme

As I understand it, the RoIs are mapped onto conv3_3, conv4_3 and conv5_3, followed by a 1x1 convolution layer to keep the channel depth consistent. These RoIs are then fed into the RoI pooling layer, producing ROI_pool3, ROI_pool4 and ROI_pool5, which are concatenated (the paper does not spell out exactly how the concatenation is performed). Notably, the RoI-pooled features from the shallower layers are L2-normalized before concatenation.
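A minimal NumPy sketch of the normalize-then-concatenate step described above. The shapes, the function names, and the omission of the 1x1 depth-restoring convolution are my own illustrative assumptions; the paper does not give these details.

```python
import numpy as np

def l2_normalize(x, axis=1, eps=1e-12):
    """L2-normalize feature maps along the channel axis."""
    norm = np.sqrt(np.sum(x ** 2, axis=axis, keepdims=True))
    return x / (norm + eps)

def concat_roi_features(pool3, pool4, pool5):
    """Concatenate RoI-pooled features from conv3_3/conv4_3/conv5_3 along
    the channel axis. The shallower maps are L2-normalized first; the 1x1
    convolution that restores the original channel depth is omitted here."""
    return np.concatenate([l2_normalize(pool3), l2_normalize(pool4), pool5],
                          axis=1)

# Hypothetical RoI-pooled features: (num_rois, channels, 7, 7)
p3 = np.random.rand(4, 256, 7, 7)
p4 = np.random.rand(4, 512, 7, 7)
p5 = np.random.rand(4, 512, 7, 7)
fused = concat_roi_features(p3, p4, p5)
print(fused.shape)  # (4, 1280, 7, 7)
```

After this fusion, each RoI carries both fine-grained detail from conv3_3 (useful for small faces) and semantic context from conv5_3.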

2. Hard Negative Mining

Hard negatives are the regions where the network has failed to make correct prediction. Thus, the hard negatives are fed into the network again as a reinforcement for improving our trained model.

The hard negatives are fed back into the RPN pre-trained on WIDER FACE, keeping the positive-to-negative sample ratio at 1:3.
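A sketch of that 1:3 sampling, assuming a 256-RoI minibatch (a common Faster R-CNN default; the paper's exact batch size is not stated here). All names are illustrative.

```python
import numpy as np

def sample_minibatch(pos_idx, hard_neg_idx, batch_size=256,
                     pos_fraction=0.25, rng=None):
    """Sample RoIs keeping roughly a 1:3 positive/negative ratio, drawing
    the negatives from the mined hard-negative pool."""
    rng = rng if rng is not None else np.random.default_rng(0)
    num_pos = min(len(pos_idx), int(batch_size * pos_fraction))
    num_neg = min(len(hard_neg_idx), batch_size - num_pos)
    pos = rng.choice(pos_idx, size=num_pos, replace=False)
    neg = rng.choice(hard_neg_idx, size=num_neg, replace=False)
    return pos, neg

pos, neg = sample_minibatch(np.arange(100), np.arange(100, 1000))
print(len(pos), len(neg))  # 64 192
```

Because the negatives are the regions the detector previously got wrong, the gradient signal concentrates on exactly the false-positive patterns the model still makes.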

3. Multi-Scale Training

Each image is trained at three scales: the shorter side is capped at 480, 600 and 750 respectively, and the longer side at 1250. The experiments show that multi-scale training makes the model more robust to images of different sizes and improves detection performance.
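The resize rule above can be sketched as follows; the function name and the rounding behavior are my own assumptions, only the 480/600/750 short-side targets and the 1250 long-side cap come from the paper.

```python
import random

def pick_scale(height, width, short_sides=(480, 600, 750), max_long=1250):
    """Pick one of the three training scales at random and resize so the
    short side matches it, clamping the long side to max_long."""
    target = random.choice(short_sides)
    short, long_ = min(height, width), max(height, width)
    scale = target / short
    if long_ * scale > max_long:        # long-side cap wins over short-side target
        scale = max_long / long_
    return round(height * scale), round(width * scale)

print(pick_scale(500, 1500))  # (417, 1250): the long-side cap dominates
```

For very wide or tall images the 1250 cap dominates regardless of which short-side target was drawn, so the output is the same for all three scales.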

Experiments

1. VGG16, pre-trained on ImageNet, was selected as the backbone CNN.

2. The training data is the WIDER FACE training + validation datasets.

3. Each ground-truth annotation was given a difficulty value; ground truths whose scores sum to more than 2 are ignored.

4. See the paper for the detailed parameter settings.

5. After training on WIDER FACE, detections with confidence scores above 0.8 and IoU below 0.5 against every ground truth are fed back into the network as hard negatives.
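That selection rule can be sketched directly; the IoU helper and function names are illustrative, while the 0.8 score and 0.5 IoU thresholds come from the post.

```python
import numpy as np

def iou(box, gts):
    """IoU of one (x1, y1, x2, y2) box against an array of ground-truth boxes."""
    x1 = np.maximum(box[0], gts[:, 0]); y1 = np.maximum(box[1], gts[:, 1])
    x2 = np.minimum(box[2], gts[:, 2]); y2 = np.minimum(box[3], gts[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    gt_area = (gts[:, 2] - gts[:, 0]) * (gts[:, 3] - gts[:, 1])
    return inter / (area + gt_area - inter)

def select_hard_negatives(boxes, scores, gts, score_thr=0.8, iou_thr=0.5):
    """Confident detections (score > 0.8) that overlap no ground-truth face
    by more than IoU 0.5 are treated as hard negatives."""
    return [b for b, s in zip(boxes, scores)
            if s > score_thr and iou(b, gts).max() < iou_thr]

gts = np.array([[0, 0, 10, 10]], dtype=float)
boxes = np.array([[0, 0, 10, 10], [50, 50, 60, 60], [40, 40, 55, 55]],
                 dtype=float)
scores = np.array([0.99, 0.95, 0.5])
hard = select_hard_negatives(boxes, scores, gts)
print(len(hard))  # 1: the confident box far from any face
```

The first box matches a ground truth, the third is not confident enough; only the confident false positive survives the filter.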
