新手学习yolov8目标检测小记2--对比实验中经典模型库MMDetection使用方法（使用自己的数据集训练，并转换为yolo格式评价指标）

本文链接：https://blog.youkuaiyun.com/qq_42041632/article/details/144926614

一、按照步骤环境配置

pip install timm==1.0.7 thop efficientnet_pytorch==0.7.1 einops grad-cam==1.4.8 dill==0.3.6 albumentations==1.4.11 pytorch_wavelets==1.3.0 tidecv PyWavelets -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install -U openmim -i https://pypi.tuna.tsinghua.edu.cn/simple
mim install mmengine -i https://pypi.tuna.tsinghua.edu.cn/simple
mim install "mmcv==2.1.0" -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install YOLO
pip install ultralytics
pip install -v -e.

二、自定义数据集放置

我这里已经将数据集按照训练集、验证集、测试集=8:1:1划分好，具体的存放目录结构如下图所示。其中test2017、train2017、val2017存放图片，testlabels、trainlabels、vallabels存放标注文件txt，数据集格式转换后，在annotations文件中。

YOLO格式转coco格式代码如下

import os
import cv2
import json
from tqdm import tqdm
from sklearn.model_selection import train_test_split
import argparse

# visdrone2019
classes = ['beibie1',
           'beibie2',
           'beibie3'
           ]

parser = argparse.ArgumentParser()
parser.add_argument('--image_path', default=r'E:\mmde\mmdetection-3.0.0\mmdetection-3.0.0\data\coco\val2017', type=str, help="path of images")
parser.add_argument('--label_path', default=r'E:\mmde\mmdetection-3.0.0\mmdetection-3.0.0\data\coco\vallabels', type=str, help="path of labels .txt")
parser.add_argument('--save_path', default='val.json', type=str,
                    help="if not split the dataset, give a path to a json file")
arg = parser.parse_args()


def yolo2coco(arg):
    print("Loading data from ", arg.image_path, arg.label_path)

    assert os.path.exists(arg.image_path)
    assert os.path.exists(arg.label_path)

    originImagesDir = arg.image_path
    originLabelsDir = arg.label_path
    # images dir name
    indexes = os.listdir(originImagesDir)

    dataset = {'categories': [], 'annotations': [], 'images': []}
    for i, cls in enumerate(classes, 0):
        dataset['categories'].append({'id': i, 'name': cls, 'supercategory': 'mark'})

    # 标注的id
    ann_id_cnt = 0
    for k, index in enumerate(tqdm(indexes)):
        # 支持 png jpg 格式的图片.
        txtFile = f'{index[:index.rfind(".")]}.txt'
        stem = index[:index.rfind(".")]
        # 读取图像的宽和高
        try:
            im = cv2.imread(os.path.join(originImagesDir, index))
            height, width, _ = im.shape
        except Exception as e:
            print(f'{os.path.join(originImagesDir, index)} read error.\nerror:{e}')
        # 添加图像的信息
        if not os.path.exists(os.path.join(originLabelsDir, txtFile)):
            # 如没标签，跳过，只保留图片信息.
            continue
        dataset['images'].append({'file_name': index,
                                  'id': stem,
                                  'width': width,
                                  'height': height})
        with open(os.path.join(originLabelsDir, txtFile), 'r') as fr:
            labelList = fr.readlines()
            for label in labelList:
                label = label.strip().split()
                x = float(label[1])
                y = float(label[2])
                w = float(label[3])
                h = float(label[4])

                # convert x,y,w,h to x1,y1,x2,y2
                H, W, _ = im.shape
                x1 = (x - w / 2) * W
                y1 = (y - h / 2) * H
                x2 = (x + w / 2) * W
                y2 = (y + h / 2) * H
                # 标签序号从0开始计算, coco2017数据集标号混乱，不管它了。
                cls_id = int(label[0])
                width = max(0, x2 - x1)
                height = max(0, y2 - y1)
                dataset['annotations'].append({
                    'area': width * height,
                    'bbox': [x1, y1, width, height],
                    'category_id': cls_id,
                    'id': ann_id_cnt,
                    'image_id': stem,
                    'iscrowd': 0,
                    # mask, 矩形是从左上角点按顺时针的四个顶点
                    'segmentation': [[x1, y1, x2, y1, x2, y2, x1, y2]]
                })
                ann_id_cnt += 1

    # 保存结果
    with open(arg.save_path, 'w') as f:
        json.dump(dataset, f)
        print('Save annotation to {}'.format(arg.save_path))


if __name__ == "__main__":
    yolo2coco(arg)

三、参数修改

以faster-rcnn为例，查看文件configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py内容如下：

_base_ = [
    '../_base_/models/faster-rcnn_r50_fpn.py',
    '../_base_/datasets/coco_detection.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]

根据显示内容，修改具体配置。（‘../_base_/default_runtime.py’无需修改）

（1）到‘../_base_/models/faster-rcnn_r50_fpn.py’修改

num_classes为自己的实际数据集类别数。

（2）到‘../_base_/datasets/coco_detection.py’，修改

dataset_type = 'CocoDataset'
data_root = 'data/coco/'

由于我使用的是coco数据集格式，只需要改data_root为自己数据集的位置即可。并修改

scale为自己的图像尺寸大小，