[YOLO Improvements] Trying Every IoU Loss Function: ShapeIoU Loss (Based on MMYOLO)

Latest recommended article published on 2025-04-22 20:58:22

Original

Licensed under CC 4.0 BY-SA

Tags: #YOLO

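The Shape IoU loss is a bounding-box regression loss for object detection. It measures how well a predicted box overlaps the ground-truth box while also taking the shape and scale of the box into account. This article covers why it was proposed, its design principle, and how it is computed.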

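Why it was proposed

In object detection, a bounding box is the usual way to represent an object's location. Existing bounding-box regression methods, however, mostly consider only the geometric relationship between the ground-truth box (GT box) and the predicted box, and ignore how the box's own properties, such as its shape and scale, affect regression. Shape IoU was proposed to account for the shape and scale of the box itself, so that the loss is computed more accurately and regression becomes more precise.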

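Design principle

Shape IoU computes the loss with the shape and scale of the bounding box in mind. Different objects can have boxes of very different shapes and scales. A traditional IoU loss looks only at how much two boxes overlap and ignores those differences. Shape IoU brings shape and scale into the loss, so that it reflects the properties of the box more completely.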
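
To make "shape and scale" concrete, here is a minimal sketch of the two weights that the source code later in this article calls ww and hh. The helper name shape_weights is hypothetical and exists only for this illustration. Plain Python floats are used instead of tensors, and the sketch assumes positive box sizes.

```python
def shape_weights(w_gt, h_gt, scale=0.0):
    """Return (ww, hh) computed from the GT box width and height.

    Hypothetical helper: it mirrors the ww / hh expressions in the
    shape_iou listing, but works on plain floats instead of tensors.
    """
    ws = w_gt ** scale
    hs = h_gt ** scale
    ww = 2 * ws / (ws + hs)
    hh = 2 * hs / (ws + hs)
    return ww, hh


# With scale=0 both weights are 1, so the GT shape has no influence.
print(shape_weights(4.0, 1.0, scale=0))
# With scale=1 a wide GT box (w=4, h=1) gets ww > 1 and hh < 1.
print(shape_weights(4.0, 1.0, scale=1))
```

The two weights always sum to 2. In the listing, hh multiplies the horizontal terms and ww the vertical ones, so for a wide GT box and a positive scale, errors along the short (vertical) side are weighted more heavily.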

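Concretely, the loss starts from the standard IoU. First, the overlap area (Area of Overlap) between the GT box and the predicted box is computed, then the area of their union (Area of Union). Dividing the overlap by the union gives the IoU, a value between 0 and 1 that expresses how much the two boxes overlap. The IoU is then turned into a loss, for example as 1 minus the IoU, which guides training and optimization. On top of this IoU base, the source code further below also computes a center-distance term and a shape term, both weighted by the shape of the GT box.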
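
For reference, the terms computed in the source code further below can be written out as follows. The first four lines are read directly off that listing, with box2 taken to be the GT box and c^2 the squared diagonal of the smallest enclosing box. The last line is an assumption: it is the combination given in the Shape-IoU paper as best I recall, and should be checked against the code you actually use.

```latex
\begin{aligned}
ww &= \frac{2\,(w^{gt})^{scale}}{(w^{gt})^{scale} + (h^{gt})^{scale}}, \qquad
hh = \frac{2\,(h^{gt})^{scale}}{(w^{gt})^{scale} + (h^{gt})^{scale}} \\
distance^{shape} &= \frac{hh\,(x_c - x_c^{gt})^2 + ww\,(y_c - y_c^{gt})^2}{c^2} \\
\omega_w &= hh\,\frac{|w - w^{gt}|}{\max(w, w^{gt})}, \qquad
\omega_h = ww\,\frac{|h - h^{gt}|}{\max(h, h^{gt})} \\
\Omega^{shape} &= \left(1 - e^{-\omega_w}\right)^{4} + \left(1 - e^{-\omega_h}\right)^{4} \\
L_{Shape\text{-}IoU} &= 1 - IoU + distance^{shape} + 0.5\,\Omega^{shape} \quad \text{(assumed)}
\end{aligned}
```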

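Calculation steps

The IoU part of the Shape IoU loss is computed as follows:

1. Compute the overlap area (Area of Overlap) between the GT box and the predicted box, i.e. the area of their intersection.
2. Compute the area of the union (Area of Union) of the GT box and the predicted box.
3. Divide the overlap area by the union area to get the IoU, a value between 0 and 1 that expresses how much the two boxes overlap.
4. Turn the IoU into a loss value, for example as 1 minus the IoU. This loss value guides training and optimization.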
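
The four steps above can be sketched for a single pair of boxes. This is an illustration only: iou_loss_xywh is a hypothetical helper, boxes are assumed to be (cx, cy, w, h) tuples of plain floats, and the shape-weighted terms of Shape IoU are deliberately left out.

```python
def iou_loss_xywh(box_pred, box_gt, eps=1e-7):
    """Plain IoU and IoU loss for two (cx, cy, w, h) boxes.

    Hypothetical helper that covers only the IoU base of the loss.
    """
    x1, y1, w1, h1 = box_pred
    x2, y2, w2, h2 = box_gt

    # Step 1: intersection area (clamped at 0 when the boxes do not overlap)
    inter_w = max(0.0, min(x1 + w1 / 2, x2 + w2 / 2) - max(x1 - w1 / 2, x2 - w2 / 2))
    inter_h = max(0.0, min(y1 + h1 / 2, y2 + h2 / 2) - max(y1 - h1 / 2, y2 - h2 / 2))
    inter = inter_w * inter_h

    # Step 2: union area (eps avoids division by zero)
    union = w1 * h1 + w2 * h2 - inter + eps

    # Step 3: IoU in [0, 1]
    iou = inter / union

    # Step 4: turn the IoU into a loss
    return iou, 1.0 - iou


# Two 2x2 boxes shifted by 1 along x: intersection 2, union 6, IoU about 1/3.
iou, loss = iou_loss_xywh((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 2.0, 2.0))
print(iou, loss)
```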

Shape IoU source code

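import torch


def shape_iou(box1, box2, xywh=True, scale=0, eps=1e-7):
    # Boxes are read as (cx, cy, w, h) along the last dimension; the xywh flag is not used below.
    # box2 supplies the shape weights (ww, hh), so it plays the role of the GT box.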
    (x1, y1, w1, h1), (x2, y2, w2, h2) = box1.chunk(4, -1), box2.chunk(4, -1)
    w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
    b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
    b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_

    # Intersection area
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)

    # Union Area
    union = w1 * h1 + w2 * h2 - inter + eps

    # IoU
    iou = inter / union

    # Shape-Distance: center distance weighted by the shape of box2 (ww, hh)
    ww = 2 * torch.pow(w2, scale) / (torch.pow(w2, scale) + torch.pow(h2, scale))
    hh = 2 * torch.pow(h2, scale) / (torch.pow(w2, scale) + torch.pow(h2, scale))
    cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # convex width
    ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)  # convex height
    c2 = cw ** 2 + ch ** 2 + eps                            # convex diagonal squared
    center_distance_x = ((b2_x1 + b2_x2 - b1_x1 - b1_x2) ** 2) / 4
    center_distance_y = ((b2_y1 + b2_y2 - b1_y1 - b1_y2) ** 2) / 4
    center_distance = hh * center_distance_x + ww * center_distance_y
    distance = center_distance / c2

    # Shape-Shape: width/height mismatch cost weighted by the shape of box2
    omiga_w = hh * torch.abs(w1 - w2) / torch.max(w1, w2)
    omiga_h = ww * torch.abs(h1 - h2) / torch.max(h1, h2)
    shape_cost = torch.pow(1 - torch.exp(-1 * omiga_w), 4) + torch.pow(1 - torch.exp(-1 * omiga_h), 4)
    
    # Shape-IoU