maskrcnn_benchmark Code Walkthrough: bounding_box.py

leijieZhang

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/leijieZhang/article/details/91446506

This article walks through the code of bounding_box.py in the maskrcnn_benchmark framework, covering the role the bounding box plays in object detection networks and how it is implemented.

Preface:

In object detection models, the bounding box is the most frequently used data structure. In maskrcnn_benchmark, bounding_box.py implements it; its annotated source is listed below.
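Before the listing, here is a minimal sketch of the idea behind the class: a list of boxes, the size of the image they belong to, a coordinate mode, and a dictionary of extra per-box fields. `TinyBoxList` is a hypothetical stand-in written for this illustration, not part of maskrcnn_benchmark. It assumes plain Python lists instead of a torch tensor, so it skips the device and dtype handling that the real constructor performs.

```python
# Hypothetical stand-in for illustration only; the real BoxList stores a torch.Tensor.
class TinyBoxList:
    def __init__(self, boxes, image_size, mode="xyxy"):
        # Every box must carry exactly four coordinates.
        for box in boxes:
            if len(box) != 4:
                raise ValueError(
                    "each box should have 4 values, got {}".format(len(box))
                )
        # Only the two coordinate layouts used by the original class are accepted.
        if mode not in ("xyxy", "xywh"):
            raise ValueError("mode should be 'xyxy' or 'xywh'")
        self.bbox = [[float(v) for v in box] for box in boxes]
        self.size = image_size  # (image_width, image_height)
        self.mode = mode
        self.extra_fields = {}

    def add_field(self, field, field_data):
        self.extra_fields[field] = field_data

    def get_field(self, field):
        return self.extra_fields[field]

    def has_field(self, field):
        return field in self.extra_fields

    def fields(self):
        return list(self.extra_fields.keys())


# Two boxes on a 640x480 image, with one class label per box.
boxes = TinyBoxList([[0, 0, 10, 10], [5, 5, 20, 30]], image_size=(640, 480))
boxes.add_field("labels", [1, 2])
```

Keeping the image size next to the boxes is what lets later operations interpret the coordinates without a separate reference to the image. The original source of bounding_box.py follows.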

# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import torch

# transpose
FLIP_LEFT_RIGHT = 0
FLIP_TOP_BOTTOM = 1


class BoxList(object):
    """
    This class represents a set of bounding boxes.
    The bounding boxes are represented as a Nx4 Tensor.
    In order to uniquely determine the bounding boxes with respect
    to an image, we also store the corresponding image dimensions.
    They can contain extra information that is specific to each bounding box, such as
    labels. This extra information is kept in a dictionary of named fields.
    """

    def __init__(self, bbox, image_size, mode="xyxy"):
        # If bbox is a tensor, use the device it lives on; otherwise default to the CPU
        device = bbox.device if isinstance(bbox, torch.Tensor) else torch.device("cpu")
        # Convert bbox to a float32 tensor on that device
        bbox = torch.as_tensor(bbox, dtype=torch.float32, device=device)
        # bbox must be 2-dimensional: Nx4
        if bbox.ndimension() != 2:
            raise ValueError(
                "bbox should have 2 dimensions, got {}".format(bbox.ndimension())
            )
        # The last dimension of bbox must hold the 4 box coordinates; otherwise raise an error
        if bbox.size(-1) != 4:
            raise ValueError(
                "last dimension of bbox should have a "
                "size of 4, got {}".format(bbox.size(-1))
            )
        # The box format must be one of "xyxy" or "xywh"; otherwise raise an error
        if mode not in ("xyxy", "xywh"):
            raise ValueError("mode should be 'xyxy' or 'xywh'")
        # Initialize the attributes of the BoxList
        self.bbox = bbox
        self.size = image_size  # (image_width, image_height)
        self.mode = mode
        self.extra_fields = {}

    # Add an extra field
    def add_field(self, field, field_data):
        self.extra_fields[field] = field_data

    # Get the data stored under the name field in extra_fields
    def get_field(self, field):
        return self.extra_fields[field]

    # Check whether extra_fields contains data named field
    def has_field(self, field):
        return field in self.extra_fields

    # Return the keys of all data stored in extra_fields
    def fields(self):
        return list(self.extra_fields.keys())

    # Copy the extra_fields data of another BoxList (bbox) into this BoxList