[Paper] Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild

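Pixel-in-Pixel Net is a method for efficient facial landmark detection in the wild that examines the connection between heatmap regression and coordinate regression. Heatmap regression is accurate but computationally expensive and sensitive to outliers, whereas coordinate regression is fast but less accurate. The work aims to combine the strengths of both: it proposes PIP regression, a neighbor regression module, and a self-training with curriculum framework, and exploits an implicit prior observed in CNN-based facial landmark detectors. This improves cross-domain generalization and detection accuracy while preserving inference speed.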


IJCV

 

Abstract

Related Work

For deep learning based facial landmark detection, there are two widely used detection heads, namely heatmap regression and coordinate regression. Heatmap regression can achieve good results, but it has two drawbacks: (1) it is computationally expensive; (2) it is sensitive to outliers (see Figure 5(b)). In contrast, coordinate regression is fast and robust, but not accurate enough (see Figure 5(a)). Although coordinate regression can be used in a multi-stage manner to yield better performance, its inference speed becomes slow as a result.

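Two detection heads for DL-based facial landmark detection: heatmap regression and coordinate regression

Heatmap: √ good results; × computationally expensive, sensitive to outliers

Coordinate: √ fast and robust; × not accurate enough (multi-stage use improves performance but slows inference)

⇒ The goal is to combine the advantages of both (the first study in this area that discusses the connection between heatmap and coordinate regression.)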
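To make the trade-off above concrete, here is a minimal pure-Python sketch of how each head's output is typically turned into a landmark position. It is an illustration under stated assumptions, not the paper's code: the function names, the 4x4 heatmap, the 256-pixel input size, and the normalized-coordinate convention are all hypothetical.

```python
def decode_heatmap(heatmap, stride):
    """Return (x, y) in input-image pixels for the peak of one landmark heatmap.

    heatmap: H x W list of rows with per-cell scores (assumed layout).
    stride: input-image pixels per heatmap cell (assumed parameter).
    """
    best_score, best_x, best_y = float("-inf"), 0, 0
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best_score:
                best_score, best_x, best_y = score, x, y
    # The peak is only known to the nearest heatmap cell, so precision is tied
    # to heatmap resolution, and a larger heatmap costs more computation.
    return best_x * stride, best_y * stride


def decode_coordinates(normalized_xy, image_size):
    """Coordinate regression: the head outputs (x, y) directly, here in [0, 1]."""
    x, y = normalized_xy
    return x * image_size, y * image_size


# Hypothetical outputs for a single landmark on a 256-pixel input.
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.2, 0.9, 0.1],
    [0.0, 0.1, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
heatmap_xy = decode_heatmap(heatmap, stride=64)
coord_xy = decode_coordinates((0.55, 0.30), image_size=256)
print(heatmap_xy, coord_xy)
```

The heatmap path needs one score map per landmark and a search over every cell, while the coordinate path is a single multiplication per axis; that difference is the speed gap the notes refer to.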

Coordinate Regression Models

Heatmap Regression Models

Cross-Domain Generalization

Semi-Supervised Facial Landmark Detection

 

Method

PIP regression

neighbor regression module

self-training with curriculum framework

implicit prior we observe from CNN-based facial landmark detectors
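These notes name the components without describing how they work. The sketch below shows one plausible reading, and everything in it should be treated as an assumption rather than the paper's implementation: PIP regression is taken to pick a cell on a low-resolution grid (classification) and then refine the position with x/y offsets inside that cell (regression), and the neighbor regression module is taken to average a landmark's own estimate with estimates of it produced by neighboring landmarks. All function names, shapes, and values are hypothetical.

```python
def decode_pip(score_map, offset_x, offset_y, stride):
    """Decode one landmark from a low-resolution grid (assumed PIP-style output).

    score_map: H x W scores saying which grid cell holds the landmark.
    offset_x, offset_y: H x W offsets in [0, 1) inside each cell.
    stride: input-image pixels per grid cell.
    """
    height, width = len(score_map), len(score_map[0])
    # Classification step: choose the most likely grid cell.
    best_y, best_x = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda cell: score_map[cell[0]][cell[1]],
    )
    # Regression step: refine the position inside the chosen cell.
    x = (best_x + offset_x[best_y][best_x]) * stride
    y = (best_y + offset_y[best_y][best_x]) * stride
    return x, y


def merge_with_neighbors(own_xy, neighbor_estimates):
    """Average a landmark's own estimate with estimates made by its neighbors."""
    points = [own_xy] + list(neighbor_estimates)
    mean_x = sum(p[0] for p in points) / len(points)
    mean_y = sum(p[1] for p in points) / len(points)
    return mean_x, mean_y


# Hypothetical 2x2 grid on a 256-pixel input, so each cell spans 128 pixels.
score_map = [[0.1, 0.8], [0.05, 0.05]]
offset_x = [[0.0, 0.25], [0.0, 0.0]]
offset_y = [[0.0, 0.5], [0.0, 0.0]]

own_xy = decode_pip(score_map, offset_x, offset_y, stride=128)
merged_xy = merge_with_neighbors(own_xy, [(158.0, 66.0), (162.0, 62.0)])
print(own_xy, merged_xy)
```

Under this reading, the grid can stay small (cheap, like coordinate regression) while the in-cell offsets recover sub-cell precision (accurate, like heatmap regression), which matches the stated goal of combining the two heads.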

 

Key Points + Vocabulary

Generalization capability across domains

domain gaps

Stacked hourglass networks

hybrids of classification and regression

leverage: to make use of

Related Links
