Project Walkthrough: Implementing the YOLOv5 Object Detection Algorithm in PyTorch

Latest recommended article published on 2025-02-13 02:57:47

CV算法攻城狮


Tags: pytorch, YOLO, object detection, computer vision, deep learning

Article link: https://blog.youkuaiyun.com/qq_40980981/article/details/141791466


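Project Overview

This project implements the YOLOv5 algorithm in PyTorch as a reproduction of YOLOv5. It is meant to help readers better understand the network structure, the training pipeline, and the loss computation. Readers can also use the repository to train on their own datasets; the project code is available on GitHub.

For an introduction to the YOLOv5 algorithm, see the blog post: 零基础Yolov5学习-优快云博客

Project GitHub repository: GitHub - wzl639/yolov5-pytorch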

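Dataset Loading

The dataset class is defined in dataset.yolo_dataset.py. Its __getitem__() method does three things: 1) data augmentation on the image, including mosaic augmentation, scaling, color-space jitter, and rotation; 2) preprocessing, namely dimension transposition and normalization; 3) label processing: the original VOC boxes are given as top-left and bottom-right corners, and they are converted to the center/width/height form [cx, cy, w, h], normalized by the input size.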
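The label-processing step can be tried in isolation. The following is a minimal sketch, not the repository's code: the function name `corners_to_normalized_center` is hypothetical, and it assumes each box row is `[x1, y1, x2, y2, class_id]` in pixel coordinates and that `input_shape` is `(height, width)`.

```python
import numpy as np


def corners_to_normalized_center(boxes, input_shape):
    """Convert [x1, y1, x2, y2, class_id] rows (pixels) to normalized
    [cx, cy, w, h, class_id] rows.

    The function name and the trailing class_id column are assumptions
    made for this illustration.
    """
    boxes = np.array(boxes, dtype=np.float32)  # copies the input
    if len(boxes) == 0:
        return boxes
    height, width = input_shape
    # Normalize x coordinates by width and y coordinates by height.
    boxes[:, [0, 2]] = boxes[:, [0, 2]] / width
    boxes[:, [1, 3]] = boxes[:, [1, 3]] / height
    # (x2, y2) becomes (w, h); then (x1, y1) becomes (cx, cy).
    boxes[:, 2:4] = boxes[:, 2:4] - boxes[:, 0:2]
    boxes[:, 0:2] = boxes[:, 0:2] + boxes[:, 2:4] / 2
    return boxes


example = corners_to_normalized_center([[160, 160, 480, 320, 0]], (640, 640))
print(example)  # one row: cx=0.5, cy=0.375, w=0.5, h=0.25, class_id=0
```

The order of the last two steps matters: width and height have to be computed while columns 0 and 1 still hold the top-left corner.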

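from random import sample, shuffle

import numpy as np
from torch.utils.data import Dataset

# preprocess_input, self.rand and the get_random_data* helpers are defined
# elsewhere in the repository and are not shown in this excerpt.


class YoloDataset(Dataset):
    def __init__(self, annotation_lines, input_shape, num_classes, epoch_length, mosaic, train, mosaic_ratio=0.7):
        super(YoloDataset, self).__init__()
        self.annotation_lines = annotation_lines  # image annotation lines, read from a .txt file
        self.input_shape = input_shape  # size of the images fed to the model
        self.num_classes = num_classes  # number of object classes
        self.epoch_length = epoch_length  # total number of training epochs
        self.mosaic = mosaic  # whether to use mosaic augmentation
        self.train = train  # training data (True) or test data (False)
        self.mosaic_ratio = mosaic_ratio  # fraction of epochs, counted from the start, in which mosaic is used
        self.epoch_now = -1  # current epoch; gates mosaic augmentation and is updated during training
        self.length = len(self.annotation_lines)  # dataset size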

    def __len__(self):
        return self.length

    def __getitem__(self, index):
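        """
        Fetch a single sample and its labels. Random augmentation is applied
        during training and skipped during validation.
        Augmentation: mosaic, scaling, color-space jitter, rotation.
        Preprocessing: dimension transposition and normalization.
        Labels: the original VOC boxes are top-left/bottom-right corners; they
        are converted to normalized center/width/height form [cx, cy, w, h].
        index: index of the sample to fetch
        return:
            image: the processed image, as a NumPy array
            box: the labels for that image, as a NumPy array
        """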
        index = index % self.length
        if self.mosaic:
            if self.rand() < 0.5 and self.epoch_now < self.epoch_length * self.mosaic_ratio:
                lines = sample(self.annotation_lines, 3)
                lines.append(self.annotation_lines[index])
                shuffle(lines)
                image, box = self.get_random_data_with_Mosaic(lines, self.input_shape)
            else:
                image, box = self.get_random_data(self.annotation_lines[index], self.input_shape, random=self.train)
        else:
            image, box = self.get_random_data(self.annotation_lines[index], self.input_shape, random=self.train)
        # Preprocess the image, then reorder the axes from HWC to CHW.
        image = np.transpose(preprocess_input(np.array(image, dtype=np.float32)), (2, 0, 1))
        box = np.array(box, dtype=np.float32)
        if len(box) != 0:
            # Normalize x by the input width and y by the input height.
            box[:, [0, 2]] = box[:, [0, 2]] / self.input_shape[1]
            box[:, [1, 3]] = box[:, [1, 3]] / self.input_shape[0]

            # (x2, y2) -> (w, h), then (x1, y1) -> (cx, cy).
            box[:, 2:4] = box[:, 2:4] - box[:, 0:2]
            box[:, 0:2] = box[:, 0:2] + box[:, 2:4] / 2
        return image, box
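The condition that decides whether a sample gets mosaic augmentation can be read as a small schedule: mosaic is only eligible during the first `mosaic_ratio` fraction of epochs, and even then it is applied to roughly half of the samples. The sketch below restates that condition as a standalone function. The name `use_mosaic` and the `rng` parameter are hypothetical, and it assumes that `self.rand()` in the class draws a uniform number in [0, 1), which the excerpt does not show.

```python
import random


def use_mosaic(epoch_now, epoch_length, mosaic=True, mosaic_ratio=0.7, rng=random):
    """Return True if a sample would be built with mosaic augmentation.

    Hypothetical helper mirroring the check in __getitem__; assumes the
    dataset's rand() is a uniform draw from [0, 1).
    """
    if not mosaic:
        return False
    # Mosaic is only eligible in the first mosaic_ratio fraction of epochs.
    in_mosaic_phase = epoch_now < epoch_length * mosaic_ratio
    # Within that phase, about half of the samples use it.
    return in_mosaic_phase and rng.random() < 0.5


rng = random.Random(0)
early = sum(use_mosaic(0, 100, rng=rng) for _ in range(10000))
late = sum(use_mosaic(70, 100, rng=rng) for _ in range(10000))
print(early, late)  # early is about half of 10000; late is 0
```

Because `epoch_now` starts at -1, the schedule only behaves as intended if the training loop updates it each epoch, as the comment on that attribute indicates.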

Model Definition

The model-related definitions live in the models module: CSPdarknet.py defines the backbone network, and yolo.py defines the complete model.
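The YoloBody constructor that follows picks a model variant through `phi`, which scales the channel width and the block depth. As a rough illustration, the sketch below reproduces only that arithmetic. The function name `scaled_sizes` is hypothetical, and the ×4/×8/×16 multipliers for the three feature layers are inferred from the shape comments in the constructor (256, 512, 1024 at a base of 64).

```python
# Multipliers copied from the YoloBody constructor.
DEPTH_MULTIPLIERS = {'s': 0.33, 'm': 0.67, 'l': 1.00, 'x': 1.33}
WIDTH_MULTIPLIERS = {'s': 0.50, 'm': 0.75, 'l': 1.00, 'x': 1.25}


def scaled_sizes(phi):
    """Return (base_channels, base_depth, feature_channels) for a variant.

    Hypothetical helper; feature_channels assumes the three backbone
    outputs have 4x, 8x and 16x the base channel count.
    """
    base_channels = int(WIDTH_MULTIPLIERS[phi] * 64)
    base_depth = max(round(DEPTH_MULTIPLIERS[phi] * 3), 1)
    feature_channels = tuple(base_channels * k for k in (4, 8, 16))
    return base_channels, base_depth, feature_channels


for phi in ('s', 'm', 'l', 'x'):
    print(phi, scaled_sizes(phi))
# s (32, 1, (128, 256, 512))
# m (48, 2, (192, 384, 768))
# l (64, 3, (256, 512, 1024))
# x (80, 4, (320, 640, 1280))
```

Only the 'l' variant matches the 64-channel, depth-3 figures quoted in the constructor's comments; the other variants shrink or grow every layer by the same factors.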

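import torch.nn as nn

# CSPDarknet, Conv and C3 come from the repository's models module
# (not shown in this excerpt).


class YoloBody(nn.Module):
    def __init__(self, anchors_mask, num_classes, phi):
        super(YoloBody, self).__init__()
        depth_dict          = {'s' : 0.33, 'm' : 0.67, 'l' : 1.00, 'x' : 1.33,}
        width_dict          = {'s' : 0.50, 'm' : 0.75, 'l' : 1.00, 'x' : 1.25,}
        dep_mul, wid_mul    = depth_dict[phi], width_dict[phi]

        base_channels       = int(wid_mul * 64)  # 64 when phi == 'l'
        base_depth          = max(round(dep_mul * 3), 1)  # 3 when phi == 'l'
        #---------------------------------------------------#
        #   The input image is 640, 640, 3 and the initial base channel count is 64
        #   Build the CSPdarknet53 backbone
        #   It yields three effective feature layers with shapes (for phi == 'l'):
        #   80,80,256
        #   40,40,512
        #   20,20,1024
        #---------------------------------------------------#
        self.backbone   = CSPDarknet(base_channels, base_depth)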

        self.upsample   = nn.Upsample(scale_factor=2, mode="nearest")

        self.conv_for_feat3         = Conv(base_channels * 16, base_channels * 8, 1, 1)
        self.conv3_for_upsample1    = C3(base_channels * 16, base_channels * 8, base_depth, shortcut=False)

        self.conv_for_feat2         = Conv(base_channels * 8, base_channels * 4, 1, 1)
        self.conv3_for_upsample2    = C3(base_channels * 8, base_channels * 4, base_depth, shortcut=False)

        self.down_sample1           = Conv(base_channels * 4, base_channels * 4, 3, 2)
        self.conv3_for_downsam
