Analysis of gt_overlaps in the COCO Dataset

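This article examines the role gt_overlaps plays in object detection evaluation and walks through how it is computed: for each ground-truth box, the candidate box with the highest IoU is selected, and the evaluation can be restricted to different object-area ranges.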

gt_overlaps originates from the following code:

def evaluate_recall(self, candidate_boxes=None, thresholds=None,
                      area='all', limit=None):
    """Evaluate detection proposal recall metrics.

    Returns:
        results: dictionary of results with keys
            'ar': average recall
            'recalls': vector recalls at each IoU overlap threshold
            'thresholds': vector of IoU overlap thresholds
            'gt_overlaps': vector of all ground-truth overlaps
    """
    # Record max overlap value for each gt box
    # Return vector of overlap values
    areas = {'all': 0, 'small': 1, 'medium': 2, 'large': 3,
             '96-128': 4, '128-256': 5, '256-512': 6, '512-inf': 7}
    area_ranges = [[0 ** 2, 1e5 ** 2],  # all
                   [0 ** 2, 32 ** 2],  # small
                   [32 ** 2, 96 ** 2],  # medium
                   [96 ** 2, 1e5 ** 2],  # large
                   [96 ** 2, 128 ** 2],  # 96-128
                   [128 ** 2, 256 ** 2],  # 128-256
                   [256 ** 2, 512 ** 2],  # 256-512
                   [512 ** 2, 1e5 ** 2],  # 512-inf
                   ]
    assert area in areas, 'unknown area range: {}'.format(area)
    area_range = area_ranges[areas[area]]
    gt_overlaps = np.zeros(0)
    num_pos = 0
    for i in range(self.num_images):
      # Checking for max_overlaps == 1 avoids including crowd annotations
      # (...pretty hacky :/)
      max_gt_overlaps = self.roidb[i]['gt_overlaps'].toarray().max(axis=1)
      gt_inds = np.where((self.roidb[i]['gt_classes'] > 0) &
                         (max_gt_overlaps == 1))[0]
      gt_boxes = self.roidb[i]['boxes'][gt_inds, :]
      gt_areas = self.roidb[i]['seg_areas'][gt_inds]
      valid_gt_inds = np.where((gt_areas >= area_range[0]) &
                               (gt_areas <= area_range[1]))[0]
      gt_boxes = gt_boxes[valid_gt_inds, :]
      num_pos += len(valid_gt_inds)

      if candidate_boxes is None:
        # If candidate_boxes is not supplied, the default is to use the
        # non-ground-truth boxes from this roidb
        non_gt_inds = np.where(self.roidb[i]['gt_classes'] == 0)[0]
        boxes = self.roidb[i]['boxes'][non_gt_inds, :]
      else:
        boxes = candidate_boxes[i]
      if boxes.shape[0] == 0:
        continue
      if limit is not None and boxes.shape[0] > limit:
        boxes = boxes[:limit, :]

      # np.float was removed in NumPy 1.24; np.float64 is the explicit equivalent
      overlaps = bbox_overlaps(boxes.astype(np.float64),
                               gt_boxes.astype(np.float64))

      _gt_overlaps = np.zeros((gt_boxes.shape[0]))
      for j in range(gt_boxes.shape[0]):
        # find which proposal box maximally covers each gt box
        argmax_overlaps = overlaps.argmax(axis=0)
        # and get the iou amount of coverage for each gt box
        max_overlaps = overlaps.max(axis=0)
        # find which gt box is 'best' covered (i.e. 'best' = most iou)
        gt_ind = max_overlaps.argmax()
        gt_ovr = max_overlaps.max()
        assert (gt_ovr >= 0)
        # find the proposal box that covers the best covered gt box
        box_ind = argmax_overlaps[gt_ind]
        # record the iou coverage of this gt box
        _gt_overlaps[j] = overlaps[box_ind, gt_ind]
        assert (_gt_overlaps[j] == gt_ovr)
        # mark the proposal box and the gt box as used
        overlaps[box_ind, :] = -1
        overlaps[:, gt_ind] = -1
      # append recorded iou coverage level
      gt_overlaps = np.hstack((gt_overlaps, _gt_overlaps))

    gt_overlaps = np.sort(gt_overlaps)
    if thresholds is None:
      step = 0.05
      thresholds = np.arange(0.5, 0.95 + 1e-5, step)
    recalls = np.zeros_like(thresholds)
    # compute recall for each iou threshold
    for i, t in enumerate(thresholds):
      recalls[i] = (gt_overlaps >= t).sum() / float(num_pos)
    # ar = 2 * np.trapz(recalls, thresholds)
    ar = recalls.mean()
    return {'ar': ar, 'recalls': recalls, 'thresholds': thresholds,
            'gt_overlaps': gt_overlaps}
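
The listing calls bbox_overlaps, which is not shown in this article. As a rough illustration of what such a helper returns, here is a minimal NumPy sketch that builds the M×N IoU matrix. The name pairwise_iou and the sample boxes are hypothetical, and the "+ 1" assumes boxes are [x1, y1, x2, y2] with inclusive pixel coordinates; the real helper may differ in both respects.

```python
import numpy as np


def pairwise_iou(boxes, gt_boxes):
    """Return an (M, N) IoU matrix for M candidate and N ground-truth boxes.

    Hypothetical stand-in for bbox_overlaps. Boxes are rows of
    [x1, y1, x2, y2]; the "+ 1" treats coordinates as inclusive pixel
    indices, which is an assumption of this sketch.
    """
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    gt_boxes = np.asarray(gt_boxes, dtype=np.float64).reshape(-1, 4)

    # Box areas, shapes (M,) and (N,)
    area_b = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1)
    area_g = ((gt_boxes[:, 2] - gt_boxes[:, 0] + 1) *
              (gt_boxes[:, 3] - gt_boxes[:, 1] + 1))

    # Intersection width and height via broadcasting: (M, 1) against (1, N)
    iw = (np.minimum(boxes[:, None, 2], gt_boxes[None, :, 2]) -
          np.maximum(boxes[:, None, 0], gt_boxes[None, :, 0]) + 1)
    ih = (np.minimum(boxes[:, None, 3], gt_boxes[None, :, 3]) -
          np.maximum(boxes[:, None, 1], gt_boxes[None, :, 1]) + 1)
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)

    union = area_b[:, None] + area_g[None, :] - inter
    return inter / union


# Made-up boxes: 3 candidates (rows) against 2 ground-truth boxes (columns)
candidates = np.array([[0, 0, 9, 9], [5, 5, 14, 14], [20, 20, 29, 29]])
gts = np.array([[0, 0, 9, 9], [100, 100, 109, 109]])
iou = pairwise_iou(candidates, gts)
print(iou.shape)
print(iou)
```

Rows index candidate boxes and columns index gt boxes, which matches how the listing later uses max(axis=0) to get the best IoU per gt box.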

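The code shows that the IoU values between candidate_boxes (say there are M of them) and gt_boxes (say there are N of them) form an M×N matrix, overlaps, and gt_overlaps records one IoU value from that matrix for each gt box. The code picks, from the M candidate boxes, the one with the largest IoU for each gt box. Two cases arise, as shown in the figure below: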

Figure: COCO gt_overlaps

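The analysis in the figure above shows cases where gt_overlaps = -1 (the value the matching loop writes into overlaps to mark a candidate box or a gt box as used). This means the number of candidate boxes is greater than the number of gt_boxes, and this is where the is_crowded flag comes from.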
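
To make the matching loop and the -1 marking concrete, here is a small sketch that reruns the same greedy selection on a hand-written 3×2 IoU matrix and then computes recall the way the listing does. The function names and the numbers are made up for illustration, and the asserts from the original loop are left out.

```python
import numpy as np


def greedy_gt_overlaps(overlaps):
    """Greedily pair each gt box with one candidate, best-covered gt first.

    overlaps: (M, N) IoU matrix, rows = candidates, columns = gt boxes.
    Returns the IoU recorded at each step and the matrix after marking.
    """
    overlaps = np.array(overlaps, dtype=np.float64)  # work on a copy
    num_gt = overlaps.shape[1]
    recorded = np.zeros(num_gt)
    for j in range(num_gt):
        # Best remaining IoU for every gt box, then the best-covered gt box
        gt_ind = overlaps.max(axis=0).argmax()
        # Candidate that covers that gt box best
        box_ind = overlaps[:, gt_ind].argmax()
        recorded[j] = overlaps[box_ind, gt_ind]
        # Mark the candidate (row) and the gt box (column) as used
        overlaps[box_ind, :] = -1
        overlaps[:, gt_ind] = -1
    return recorded, overlaps


def recall_from_overlaps(gt_overlaps, num_pos, thresholds=None):
    """Recall at each IoU threshold and their mean (the 'ar' value)."""
    if thresholds is None:
        thresholds = np.arange(0.5, 0.95 + 1e-5, 0.05)
    recalls = np.array([(gt_overlaps >= t).sum() / float(num_pos)
                        for t in thresholds])
    return recalls.mean(), recalls, thresholds


# Made-up IoU values: 3 candidates (rows) against 2 gt boxes (columns)
toy = [[0.92, 0.80],
       [0.30, 0.20],
       [0.10, 0.05]]
recorded, marked = greedy_gt_overlaps(toy)
ar, recalls, thresholds = recall_from_overlaps(np.sort(recorded), num_pos=2)
print(recorded)
print(marked)
print(ar)
```

Candidate 0 is the best match for both gt boxes, but once it is paired with gt box 0 its row is set to -1, so gt box 1 ends up with 0.20 rather than 0.80. Each candidate is used at most once, and after the loop every entry of the working matrix is -1.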

 

 
