[Object Detection] IoU, GIoU, DIoU, CIoU, EIoU

Article link: https://blog.youkuaiyun.com/beginner1207/article/details/137365482

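This article explains IoU, GIoU, DIoU, and CIoU as used in object detection: what each one measures, how to compute it, and what goes wrong when it is used as a loss function. Each refinement addresses a limitation of IoU by taking into account the degree of overlap, the center-point distance, and the aspect ratio, with the aim of more accurate box predictions.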

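I. IoU

1. Concept

In object detection, IoU (Intersection over Union) is widely used to measure how much a predicted bbox overlaps the ground-truth bbox. Specifically, it is the ratio of the area of their intersection to the area of their union; the larger the ratio, the better the prediction.

2. Computation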

import torch


def IoU(box1, box2, x1y1x2y2=False):
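    # box1, box2: 1-D tensors of 4 values each, i.e. shape (4,)
    # x1y1x2y2=True: boxes are (x1, y1, x2, y2); False: boxes are (cx, cy, w, h)
    # returns a 0-dim (scalar) tensor holding the IoU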

    if x1y1x2y2:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
    else:
        b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
        b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
        b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
        b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2

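    # Top-left corner of the intersection: the larger of the two top-left coordinates
    # (image coordinates, with the origin (0, 0) at the top-left of the image)
    cap_x1, cap_y1 = torch.max(b1_x1, b2_x1), torch.max(b1_y1, b2_y1)
    # Bottom-right corner of the intersection: the smaller of the two bottom-right coordinates
    cap_x2, cap_y2 = torch.min(b1_x2, b2_x2), torch.min(b1_y2, b2_y2)

    # clamp to 0: a negative width or height means the boxes do not intersect, so the area is 0
    in_area = torch.clamp(cap_x2 - cap_x1, min=0) * torch.clamp(cap_y2 - cap_y1, min=0)
    # Sum of the two box areas; the union is un_area - in_area
    un_area = (b1_x2 - b1_x1) * (b1_y2 - b1_y1) + (b2_x2 - b2_x1) * (b2_y2 - b2_y1)

    # clamp the denominator to avoid division by zero
    return in_area / torch.clamp(un_area - in_area, min=1e-10)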


if __name__ == "__main__":
    # Float tensors, so that the division and the 1e-10 clamp behave as intended
    boxs1 = torch.tensor([1.0, 1.0, 3.0, 3.0])
    boxs2 = torch.tensor([2.0, 2.0, 4.0, 4.0])
    print(IoU(boxs1, boxs2, x1y1x2y2=True))
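
As a hand check of the script above (a worked calculation added for illustration, using only the two corner-format boxes from the script): the boxes $(1,1,3,3)$ and $(2,2,4,4)$ each have area 4 and overlap on the unit square from $(2,2)$ to $(3,3)$, so

$IoU=\frac{(3-2)\times(3-2)}{4+4-1}=\frac{1}{7}\approx 0.1429$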

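3. Problems

When 1-IoU is used as the loss, the following problems arise:
(1) Whenever the two boxes do not intersect, the IoU is 0 and the loss is 1. We would like different IoU = 0 cases to produce different losses, for example a larger loss the farther apart the two boxes are.
(2) All cases in which the two boxes intersect with the same intersection area have the same IoU and the same loss. We would like the case with the better alignment to have the smaller loss.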
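
Problem (1) can be checked numerically. The following is a minimal sketch in plain Python floats rather than torch tensors; the helper name `iou_xyxy` and the sample boxes are assumptions made up for this illustration, not part of the original code.

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) tuples (hypothetical helper)."""
    # Width and height of the intersection, clamped at 0 for disjoint boxes
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / max(union, 1e-10)


target = (0.0, 0.0, 1.0, 1.0)
near = (2.0, 0.0, 3.0, 1.0)    # gap of 1 to the right of target
far = (10.0, 0.0, 11.0, 1.0)   # gap of 9 to the right of target

for name, box in (("near", near), ("far", far)):
    iou = iou_xyxy(target, box)
    print(name, "IoU =", iou, "loss =", 1 - iou)
```

Both boxes get IoU 0 and loss 1, so the loss cannot tell the near miss from the far one.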

In the figure below, the former should be considered better than the latter:

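II. GIoU

1. Concept

To address the problems above, GIoU adds to IoU the notion of the smallest enclosing convex set, which for axis-aligned boxes is the smallest box that contains both boxes. The smaller the fraction of this enclosing box that is not covered by the union of the two boxes, the better the prediction.

2. Computation

As shown in the figure, $GIoU=IoU-\frac{C-B}{C}$ , where $GIoU\in(-1,1]$ , $C$ is the area of the smallest enclosing box, and $B$ is the area of the union.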

import torch


def GIoU(box1, box2, x1y1x2y2=False):
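    # box1, box2: 1-D tensors of 4 values each, i.e. shape (4,)
    # x1y1x2y2=True: boxes are (x1, y1, x2, y2); False: boxes are (cx, cy, w, h)
    # returns a 0-dim (scalar) tensor holding the GIoU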

    if x1y1x2y2:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
    else:
        b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
        b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
        b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
        b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2

    cap_x1, cap_y1 = torch.max(b1_x1, b2_x1), torch.max(b1_y1, b2_y1)
    cap_x2, cap_y2 = torch.min(b1_x2, b2_x2), torch.min(b1_y2, b2_y2)

    # clamp to 0 so that non-intersecting boxes get an intersection area of 0
    in_area = torch.clamp(cap_x2 - cap_x1, min=0) * torch.clamp(cap_y2 - cap_y1, min=0)
    # Sum of the two box areas; the union B is un_area - in_area
    un_area = (b1_x2 - b1_x1) * (b1_y2 - b1_y1) + (b2_x2 - b2_x1) * (b2_y2 - b2_y1)
    iou = in_area / torch.clamp(un_area - in_area, min=1e-10)

    # Smallest axis-aligned box enclosing both boxes, with area C
    C_x1, C_y1 = torch.min(b1_x1, b2_x1), torch.min(b1_y1, b2_y1)
    C_x2, C_y2 = torch.max(b1_x2, b2_x2), torch.max(b1_y2, b2_y2)
    C_area = (C_x2 - C_x1) * (C_y2 - C_y1)

    # GIoU = IoU - (C - B) / C
    return iou - (C_area - (un_area - in_area)) / torch.clamp(C_area, min=1e-10)


if __name__ == "__main__":
    # Float tensors, so that the division and the 1e-10 clamp behave as intended
    boxs1 = torch.tensor([1.0, 1.0, 3.0, 3.0])
    boxs2 = torch.tensor([2.0, 2.0, 4.0, 4.0])

    print(GIoU(boxs1, boxs2, x1y1x2y2=True))
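
To see how the enclosing-box term separates boxes that IoU treats alike, here is a minimal sketch in plain Python floats (not the torch function above). The name `giou_xyxy` and the sample boxes are hypothetical and chosen only for illustration.

```python
def giou_xyxy(a, b):
    """GIoU of two (x1, y1, x2, y2) boxes, following GIoU = IoU - (C - B) / C."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Area of the smallest axis-aligned box enclosing both boxes
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / max(union, 1e-10) - (c_area - union) / max(c_area, 1e-10)


target = (0.0, 0.0, 1.0, 1.0)
near = (2.0, 0.0, 3.0, 1.0)    # disjoint, gap of 1
far = (10.0, 0.0, 11.0, 1.0)   # disjoint, gap of 9

print("near:", giou_xyxy(target, near))  # 0 - (3 - 2) / 3
print("far: ", giou_xyxy(target, far))   # 0 - (11 - 2) / 11
```

Both pairs have IoU 0, yet the farther box gets the lower GIoU and therefore the larger 1-GIoU loss. For the two boxes in the script above, the same formula gives $\frac{1}{7}-\frac{9-7}{9}=-\frac{5}{63}$.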

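3. Problems

When 1-GIoU is used as the loss, the following problems remain:
(1) As shown in the figure below, when the two boxes overlap at different positions, the loss does not change. For example, when one box lies entirely inside the other, $C=B$ , so GIoU reduces to IoU no matter where the inner box sits.

(2) Convergence is slow.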
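
A small numeric check of problem (1), as a sketch in plain Python floats under the assumption that one box fully contains the other; `giou_xyxy` and the boxes below are hypothetical names and values used only for this illustration.

```python
def giou_xyxy(a, b):
    """GIoU of two (x1, y1, x2, y2) boxes (hypothetical helper)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / max(union, 1e-10) - (c_area - union) / max(c_area, 1e-10)


outer = (0.0, 0.0, 4.0, 4.0)
inner_center = (1.0, 1.0, 3.0, 3.0)  # 2x2 box in the middle of outer
inner_corner = (0.0, 0.0, 2.0, 2.0)  # same 2x2 box pushed into a corner

print(giou_xyxy(outer, inner_center))  # 0.25
print(giou_xyxy(outer, inner_corner))  # 0.25
```

In both cases the enclosing box equals the outer box, the penalty term vanishes, and GIoU equals the IoU of 4/16, so the loss gives no signal to move the inner box toward the center.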

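III. DIoU

1. Concept

To address the problems above, DIoU no longer uses the share of non-overlapping area as its penalty and instead introduces the distance between the box centers. The larger the overlap and the closer the centers, the better the prediction. Because it directly minimizes the distance between the two boxes, DIoU converges faster than GIoU.

2. Computation

As shown in the figure, $DIoU=IoU-\frac{d^2}{c^2}$ , where $DIoU\in(-1,1]$ , $d$ is the distance between the centers of the predicted box and the ground-truth box, and $c$ is the diagonal length of the smallest enclosing box.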

import torch


def DIoU(box1, box2, x1y1x2y2=False):
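    # box1, box2: 1-D tensors of 4 values each, i.e. shape (4,)
    # x1y1x2y2=True: boxes are (x1, y1, x2, y2); False: boxes are (cx, cy, w, h)
    # returns a 0-dim (scalar) tensor holding the DIoU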

    if x1y1x2y2:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
    else:
        b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
        b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
        b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
        b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2

    cap_x1, cap_y1 = torch.max(b1_x1, b2_x1), torch.max(b1_y1, b2_y1)
    cap_x2, cap_y2 = torch.min(b1_x2, b2_x2), torch.min(b1_y2, b2_y2)

    # clamp to 0 so that non-intersecting boxes get an intersection area of 0
    in_area = torch.clamp(cap_x2 - cap_x1, min=0) * torch.clamp(cap_y2 - cap_y1, min=0)
    # Sum of the two box areas; the union is un_area - in_area
    un_area = (b1_x2 - b1_x1) * (b1_y2 - b1_y1) + (b2_x2 - b2_x1) * (b2_y2 - b2_y1)
    iou = in_area / torch.clamp(un_area - in_area, min=1e-10)

    # Centers of the two boxes
    b1_cx, b1_cy = (b1_x1 + b1_x2) / 2, (b1_y1 + b1_y2) / 2
    b2_cx, b2_cy = (b2_x1 + b2_x2) / 2, (b2_y1 + b2_y2) / 2

    # Squared distance between the centers (d^2)
    d2 = (b2_cx - b1_cx)**2 + (b2_cy - b1_cy)**2

    # Squared diagonal of the smallest enclosing box (c^2)
    C_x1, C_y1 = torch.min(b1_x1, b2_x1), torch.min(b1_y1, b2_y1)
    C_x2, C_y2 = torch.max(b1_x2, b2_x2), torch.max(b1_y2, b2_y2)
    c2 = (C_x2 - C_x1)**2 + (C_y2 - C_y1)**2

    return iou - d2 / torch.clamp(c2, min=1e-10)


if __name__ == "__main__":
    # Float tensors, so that the division and the 1e-10 clamp behave as intended
    boxs1 = torch.tensor([1.0, 1.0, 3.0, 3.0])
    boxs2 = torch.tensor([2.0, 2.0, 4.0, 4.0])

    print(DIoU(boxs1, boxs2, x1y1x2y2=True))
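
The center-distance term can be illustrated on a pair of contained boxes, where the overlap alone is identical. This is a sketch in plain Python floats, not the torch function above; `diou_xyxy` and the boxes are hypothetical names and values picked for the example.

```python
def diou_xyxy(a, b):
    """DIoU of two (x1, y1, x2, y2) boxes, following DIoU = IoU - d^2 / c^2."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Squared distance between the two box centers
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest enclosing box
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return inter / max(union, 1e-10) - d2 / max(c2, 1e-10)


outer = (0.0, 0.0, 4.0, 4.0)
inner_center = (1.0, 1.0, 3.0, 3.0)  # centers coincide, d^2 = 0
inner_corner = (0.0, 0.0, 2.0, 2.0)  # centers (1, 1) vs (2, 2), d^2 = 2

print(diou_xyxy(outer, inner_center))  # 0.25 - 0 / 32
print(diou_xyxy(outer, inner_corner))  # 0.25 - 2 / 32
```

The centered box scores 0.25 and the corner box 0.1875, so 1-DIoU prefers the centered one even though both have the same IoU. For the two boxes in the script above, $d^2=2$ and $c^2=18$ , giving $\frac{1}{7}-\frac{2}{18}=\frac{2}{63}$.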

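3. Problems

Three factors matter for the quality of bbox regression: overlap area, center-point distance, and aspect ratio. When 1-DIoU is used as the loss, how well the aspect ratio of the predicted box matches that of the ground-truth box is not taken into account. In the figure below, we would like state 1 to have the smaller loss.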
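
The missing aspect-ratio sensitivity can be reproduced with two predictions that share the same center and area. This is a sketch in plain Python floats; `diou_xyxy` and the three boxes are hypothetical, constructed so that the overlap and the center distance coincide.

```python
def diou_xyxy(a, b):
    """DIoU of two (x1, y1, x2, y2) boxes (hypothetical helper)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return inter / max(union, 1e-10) - d2 / max(c2, 1e-10)


gt = (0.0, 0.0, 4.0, 4.0)           # square ground truth, center (2, 2)
pred_square = (1.0, 1.0, 3.0, 3.0)  # 2x2, same aspect ratio as gt
pred_tall = (1.5, 0.0, 2.5, 4.0)    # 1x4, same area and center, different ratio

print(diou_xyxy(pred_square, gt))  # 0.25
print(diou_xyxy(pred_tall, gt))    # 0.25
```

Both predictions have IoU 0.25 and zero center distance, so DIoU cannot prefer the one whose shape matches the ground truth.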

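IV. CIoU

1. Concept

To address the problem above, CIoU introduces the aspect ratio and evaluates bbox regression by combining overlap area, center-point distance, and aspect ratio.

2. Computation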

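$CIoU=IoU-\frac{d^2}{c^2}-\alpha\times v$ , where $v=\frac{4}{\pi^2}(\arctan\frac{w^{gt}}{h^{gt}}-\arctan\frac{w}{h})^2$ . Here $w$ and $h$ are the width and height of the predicted box, and $w^{gt}$ and $h^{gt}$ those of the ground-truth box. The graph of $\arctan$ is shown below: for positive arguments its value lies between 0 and $\frac{\pi}{2}$ , so $v$ takes values in $[0, 1)$ . The closer the two aspect ratios, the smaller $v$ , with $v=0$ when they are equal.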
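
The behavior of $v$ can be checked directly from its formula. The following is a minimal sketch using only the standard `math` module; the function name `aspect_term` and the sample sizes are assumptions for illustration, and all widths and heights are assumed to be positive.

```python
import math


def aspect_term(w_gt, h_gt, w, h):
    """v = 4 / pi^2 * (arctan(w_gt / h_gt) - arctan(w / h))^2 for positive sizes."""
    return 4 / math.pi ** 2 * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2


same_ratio = aspect_term(4.0, 4.0, 2.0, 2.0)        # both boxes are square
mismatch = aspect_term(4.0, 4.0, 1.0, 4.0)          # square vs 1:4 box
extreme = aspect_term(1000.0, 1.0, 1.0, 1000.0)     # very wide vs very tall

print(same_ratio, mismatch, extreme)
```

Equal aspect ratios give exactly 0, a moderate mismatch gives a small positive value, and even an extreme mismatch stays below 1.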

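In addition, $\alpha=\frac{v}{1-IoU+v}$ . The graphs of $\frac{x}{a+x}$ (left: $a = 0.1$ , right: $a = 0.9$ ) are shown below: for a fixed $v$ , the larger the $IoU$ , the smaller $1 - IoU$ and the larger $\alpha$ .

Taken together, in $\alpha\times v$ the factor $v$ pushes the predicted box toward the aspect ratio of the ground-truth box, while $\alpha$ makes the aspect ratio matter more as the two boxes overlap more: the larger the IoU, the more strongly a mismatch is penalized (for the same $v$ , a larger IoU gives a larger penalty term $\alpha\times v$ ).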
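
To make the role of $\alpha$ concrete, here is a minimal sketch that evaluates the weight for one fixed $v$ at several IoU values. The function name `alpha_weight` and the value $v=0.1$ are hypothetical choices for this illustration.

```python
def alpha_weight(iou, v):
    """alpha = v / (1 - IoU + v), with the denominator clamped away from zero."""
    return v / max(1 - iou + v, 1e-10)


v = 0.1  # fixed aspect-ratio mismatch (illustrative value)
alphas = [alpha_weight(iou, v) for iou in (0.2, 0.5, 0.9)]

for iou, alpha in zip((0.2, 0.5, 0.9), alphas):
    print(f"IoU={iou}: alpha={alpha:.4f}, alpha*v={alpha * v:.4f}")
```

With $v$ held at 0.1, $\alpha$ rises from about 0.11 at IoU 0.2 to about 0.5 at IoU 0.9, so the same mismatch costs more once the boxes already overlap well.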

import torch
import math


def CIoU(box1, box2, x1y1x2y2=False):
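    # box1, box2: 1-D tensors of 4 values each, i.e. shape (4,)
    # x1y1x2y2=True: boxes are (x1, y1, x2, y2); False: boxes are (cx, cy, w, h)
    # returns a 0-dim (scalar) tensor holding the CIoU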

    if x1y1x2y2:
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
    else:
        b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
        b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
        b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
        b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2

    cap_x1, cap_y1 = torch.max(b1_x1, b2_x1), torch.max(b1_y1, b2_y1)
    cap_x2, cap_y2 = torch.min(b1_x2, b2_x2), torch.min(b1_y2, b2_y2)

    # clamp to 0 so that non-intersecting boxes get an intersection area of 0
    in_area = torch.clamp(cap_x2 - cap_x1, min=0) * torch.clamp(cap_y2 - cap_y1, min=0)
    un_area = (b1_x2 - b1_x1) * (b1_y2 - b1_y1) + (b2_x2 - b2_x1) * (b2_y2 - b2_y1)
    iou = in_area / torch.clamp(un_area - in_area, min=1e-10)

    b1_cx, b1_cy = (b1_x1 + b1_x2) / 2, (b1_y1 + b1_y2) / 2
    b2_cx, b2_cy = (b2_x1 + b2_x2) / 2, (b2_y1 + b2_y2) / 2

    d2 = (b2_cx - b1_cx)**2 + (b2_cy - b1_cy)**2

    C_x1, C_y1 = torch.min(b1_x1, b2_x1), torch.min(b1_y1, b2_y1)
    C_x2, C_y2 = torch.max(b1_x2, b2_x2), torch.max(b1_y2, b2_y2)
    c2 = (C_x2 - C_x1)**2 + (C_y2 - C_y1)**2

    # Widths and heights of the two boxes
    b1_w, b1_h = b1_x2 - b1_x1, b1_y2 - b1_y1
    b2_w, b2_h = b2_x2 - b2_x1, b2_y2 - b2_y1
    # Aspect-ratio term v; heights are clamped to avoid division by zero
    v = 4 / math.pi ** 2 * (torch.atan(b1_w / torch.clamp(b1_h, min=1e-10)) - torch.atan(b2_w / torch.clamp(b2_h, min=1e-10))) ** 2
    # Weight alpha = v / (1 - IoU + v)
    alpha = v / torch.clamp(1 - iou + v, min=1e-10)

    return iou - d2 / torch.clamp(c2, min=1e-10) - alpha * v


if __name__ == "__main__":
    # Float tensors, so that the division and the 1e-10 clamp behave as intended
    boxs1 = torch.tensor([2.0, 2.0, 4.0, 4.0])
    boxs2 = torch.tensor([1.0, 1.0, 3.0, 3.0])

    print(CIoU(boxs1, boxs2, x1y1x2y2=True))
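
As a closing check, the sketch below applies the CIoU formula to two predictions that share center, area, and IoU but differ in shape. It uses plain Python floats rather than the torch function above; `ciou_xyxy` and the boxes are hypothetical names and values chosen for the example.

```python
import math


def ciou_xyxy(pred, gt):
    """CIoU of two (x1, y1, x2, y2) boxes: IoU - d^2 / c^2 - alpha * v."""
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    pw, ph = pred[2] - pred[0], pred[3] - pred[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / max(pw * ph + gw * gh - inter, 1e-10)
    # Squared center distance and squared diagonal of the enclosing box
    d2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
        + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    c2 = (max(pred[2], gt[2]) - min(pred[0], gt[0])) ** 2 \
        + (max(pred[3], gt[3]) - min(pred[1], gt[1])) ** 2
    # Aspect-ratio term and its weight
    v = 4 / math.pi ** 2 * (math.atan(gw / max(gh, 1e-10))
                            - math.atan(pw / max(ph, 1e-10))) ** 2
    alpha = v / max(1 - iou + v, 1e-10)
    return iou - d2 / max(c2, 1e-10) - alpha * v


gt = (0.0, 0.0, 4.0, 4.0)
pred_square = (1.0, 1.0, 3.0, 3.0)  # same aspect ratio as gt, so v = 0
pred_tall = (1.5, 0.0, 2.5, 4.0)    # same IoU and center, aspect ratio 1:4

print(ciou_xyxy(pred_square, gt))
print(ciou_xyxy(pred_tall, gt))
```

The square prediction keeps a CIoU of 0.25, while the tall one is pushed lower by the $\alpha\times v$ term, so 1-CIoU assigns the smaller loss to the prediction whose shape matches the ground truth.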