DW_目标检测基础
目标检测基本概念
-
目标检测:需要在识别出图片中目标类别的基础上(图像分类),还要精确定位到目标的具体位置,并用外接矩形框标出。
-
物体的位置:通过滑窗的方式确定众多候选框,罗列图中各种可能的区域,再对候选框进行分类和微调。这样对于图像中每个区域都能得到(class,x1,y1,x2,y2)五个属性,汇总后最终就得到了图中物体的类别和坐标信息。
除此之外,每个框送入到分类网络分类都有一个得分(代表当前框的置信度),那么得分最高的就代表识别的最准确的框,其位置就是最终要检测的目标的位置。 -
目标框定义:目标检测的标签信息有5个,除了类别label以外,需要同时包含目标的位置信息,也就是目标的外接矩形框bounding box。
用来表达bbox的格式通常有两种,(x1, y1, x2, y2) 和 (x_c, y_c, w, h)
两种格式会分别在后续不同场景下更加便于计算。
两种格式互相转换的实现utils.py
def xy_to_cxcy(xy):
"""
Convert bounding boxes from boundary coordinates (x_min, y_min, x_max, y_max) to center-size coordinates (c_x, c_y, w, h).
:param xy: bounding boxes in boundary coordinates, a tensor of size (n_boxes, 4)
:return: bounding boxes in center-size coordinates, a tensor of size (n_boxes, 4)
"""
return torch.cat([(xy[:, 2:] + xy[:, :2]) / 2, # c_x, c_y
xy[:, 2:] - xy[:, :2]], 1) # w, h
def cxcy_to_xy(cxcy):
"""
Convert bounding boxes from center-size coordinates (c_x, c_y, w, h) to boundary coordinates (x_min, y_min, x_max, y_max).
:param cxcy: bounding boxes in center-size coordinates, a tensor of size (n_boxes, 4)
:return: bounding boxes in boundary coordinates, a tensor of size (n_boxes, 4)
"""
return torch.cat([cxcy[:, :2] - (cxcy[:, 2:] / 2), # x_min, y_min
cxcy[:, :2] + (cxcy[:, 2:] / 2)], 1) # x_max, y_max
- 交并比
流程:
1.首先获取两个框的坐标,红框坐标: 左上(red_x1, red_y1), 右下(red_x2, red_y2),绿框坐标: 左上(green_x1, green_y1),右下(green_x2, green_y2)
2.计算两个框左上点的坐标最大值:(max(red_x1, green_x1), max(red_y1, green_y1)), 和右下点坐标最小值:(min(red_x2, green_x2), min(red_y2, green_y2))
3.利用2算出的信息计算黄框面积:yellow_area
4.计算红绿框的面积:red_area 和 green_area
5.iou = yellow_area / (red_area + green_area - yellow_area)
def find_intersection(set_1, set_2):
"""
Find the intersection of every box combination between two sets of boxes that are in boundary coordinates.
:param set_1: set 1, a tensor of dimensions (n1, 4)
:param set_2: set 2, a tensor of dimensions (n2, 4)
:return: intersection of each of the boxes in set 1 with respect to each of the boxes in set 2, a tensor of dimensions (n1, n2)
"""
# PyTorch auto-broadcasts singleton dimensions
lower_bounds = torch.max(set_1[:, :