Notes on Group Sampling + FreeAnchor

This post looks at two imbalance problems in face detection training data: positive vs. negative samples, and samples across different scales. It covers how Group Sampling balances the training data by grouping anchors and sampling at a fixed ratio, and how FreeAnchor brings clear gains on slender objects and densely packed objects.


1 Group Sampling for scale invariant face detection

Problem: the training samples suffer from two kinds of imbalance:
1 Positive/negative imbalance: the number of negative samples far exceeds the number of positives.
2 Imbalance across scales: for example, RetinaFace detects faces at three scales, and the number of training samples at each scale is unbalanced (with the IoU-based anchor matching strategy, small objects are less likely to find a suitable anchor):
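As a numerical illustration of why IoU-based matching starves small faces (the box coordinates below are made up for illustration, not taken from the paper): even the best-aligned anchor over a small face can fall below the usual 0.5 positive threshold.

```python
import numpy as np

def iou(box, anchors):
    """IoU between one ground-truth box and an array of anchors, all (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], anchors[:, 0])
    y1 = np.maximum(box[1], anchors[:, 1])
    x2 = np.minimum(box[2], anchors[:, 2])
    y2 = np.minimum(box[3], anchors[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_anchors = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    return inter / (area_box + area_anchors - inter)

# Two anchors centered on the same grid point (illustrative sizes).
anchors = np.array([[24, 24, 40, 40],    # 16x16 anchor
                    [16, 16, 48, 48]])   # 32x32 anchor
small_face = np.array([28, 28, 38, 38])  # a 10x10 face, fully inside both anchors

print(iou(small_face, anchors))  # best IoU is 100/256 ≈ 0.39, below a 0.5 threshold
```

Even though the 10x10 face lies entirely inside the 16x16 anchor, the size mismatch alone caps the IoU at ~0.39, so this face would get no positive anchor under a 0.5 threshold.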
[Figure: anchor and stride settings for the different schemes]

[Figure: positive and negative anchor counts at each scale for the different schemes on the WIDER FACE training set, normalized by the total number of training samples]

The anchor settings in RetinaFace are as follows:

[Figure: RetinaFace anchor settings]
In RetinaFace, 75% of the anchors are placed on the P2 level, but the training samples are not necessarily concentrated on P2. So this paper leaves out a further imbalance: the imbalance of anchor counts across scales. (That imbalance arises naturally: the maximum number of large faces an image can hold is necessarily smaller than the maximum number of small faces.) Still, does anchor imbalance by itself affect training?

The solution given in the Group Sampling paper is as follows:
1 Group all samples by anchor scale.
2 Within each group, sample positive and negative samples at a 1:3 ratio, keeping the number of training samples the same across groups.
3 If a scale has no positives, or too few to reach the required ratio, add extra negatives to that group so that every group still contains the same number of training samples.
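The three steps above can be sketched as follows (a minimal sketch with made-up function and parameter names, not the authors' code):

```python
import random
from collections import defaultdict

def group_sample(samples, n_per_group, pos_neg_ratio=3, seed=0):
    """Group Sampling sketch. `samples` is a list of (scale_group, is_positive) pairs.

    Step 1: bucket anchors by scale group.
    Step 2: within each group, sample positives and negatives at 1:pos_neg_ratio.
    Step 3: if positives are scarce, fill the quota with extra negatives so every
            group contributes exactly n_per_group samples.
    Returns {scale_group: [sample indices]}.
    """
    rng = random.Random(seed)
    groups = defaultdict(lambda: {"pos": [], "neg": []})
    for idx, (group, is_pos) in enumerate(samples):
        groups[group]["pos" if is_pos else "neg"].append(idx)

    selected = {}
    for group, g in groups.items():
        n_pos = min(len(g["pos"]), n_per_group // (pos_neg_ratio + 1))
        n_neg = min(n_per_group - n_pos, len(g["neg"]))  # pad with negatives
        selected[group] = rng.sample(g["pos"], n_pos) + rng.sample(g["neg"], n_neg)
    return selected
```

With `n_per_group=16`, a group with many positives contributes 4 positives + 12 negatives, while a group with only 2 positives contributes 2 positives + 14 negatives: ratios differ, but every scale gets an equal share of the training batch.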

The final results:
[Figure: Group Sampling results]

2 FreeAnchor (not anchor-free: each ground truth's anchor is no longer fixed, but freely selected)

Core idea: manually assigned anchors are problematic in the following two cases:
1 Slender objects without a central feature.
2 Multiple objects crowded together.

1 Candidate anchors per ground truth: during training, for each ground truth, rank the anchors by their IoU with that ground truth and take the top k as its candidate bag.
2 Learn the best-matching anchor from the candidates, using the following formula (from the original paper):
[Figure: FreeAnchor learning objective]
The learned anchor is not necessarily the best-aligned one, but it is the one with the most representative features.
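A key ingredient of that objective is the paper's Mean-max function, which scores a candidate bag by interpolating between the mean and the max of the per-anchor scores: early in training, when all scores are similar, it behaves like a mean, and as one anchor becomes confident it approaches the max, so the model gradually commits to a single best anchor. A small sketch (the `eps` clipping is my own addition for numerical safety, not from the paper):

```python
import numpy as np

def mean_max(x, eps=1e-12):
    """Mean-max(X) = sum(x / (1 - x)) / sum(1 / (1 - x)) over the candidate bag."""
    x = np.clip(np.asarray(x, dtype=float), 0.0, 1.0 - eps)
    weights = 1.0 / (1.0 - x)               # confident anchors get large weights
    return float(np.sum(weights * x) / np.sum(weights))

print(mean_max([0.2, 0.25, 0.3]))   # near the plain mean 0.25 (early training)
print(mean_max([0.05, 0.1, 0.95]))  # pulled toward the max 0.95 (late training)
```

The weight 1/(1-x) grows without bound as a score approaches 1, which is what lets a single confident anchor dominate the bag's score.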

Final results:
Improvements on slender objects and on multiple objects crowded together.
[Figures: qualitative results]
