Semi-automatic data annotation with X-AnyLabeling: converting AnyLabeling data to the YOLOv8 training set format

neter.asia

Last modified 2025-03-17 21:23:19

License: CC 4.0 BY-SA

Tags: python, programming languages

First published 2024-07-11 17:30:20

Article link: https://blog.youkuaiyun.com/neterrrr/article/details/140325533

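1 Installing X-AnyLabeling

I recently wanted to build an image recognition project, which requires an annotation tool to label the training data. This post records how to do the labeling with X-AnyLabeling, which supports semi-automatic annotation and makes the work noticeably easier.

Let's get started:

# Create a conda environment for the annotation tool
conda create -n label python=3.9.13

# Activate the annotation tool environment
conda activate label

Download the source code of the annotation tool. The one I use here is X-AnyLabeling: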

GitHub - CVHub520/X-AnyLabeling: Effortless data labeling with AI support from Segment Anything and other awesome models.

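In the same command-line window, change into the X-AnyLabeling folder and install the dependencies:

# Change into the project folder
cd X-AnyLabeling
# Install the dependencies
pip install -r requirements-gpu.txt -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com

Once the installation finishes, run the following command to open the application: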

python anylabeling/app.py

2 Annotating with X-AnyLabeling

The label files are in JSON format and are saved in the same folder as the images, one JSON file per image.
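To check what was labeled before converting anything, the JSON files can be inspected directly. The snippet below is a minimal sketch, not part of X-AnyLabeling itself: the helper name `count_labels` is hypothetical, and it assumes only the fields that the conversion script in section 3 also relies on, namely a top-level `shapes` list whose entries carry a `label` key.

```python
import json
from collections import Counter


def count_labels(json_path):
    """Return a Counter of the label names found in one annotation file."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: a top-level "shapes" list whose items have a "label" key,
    # which is the same structure the conversion script in section 3 reads.
    return Counter(shape["label"] for shape in data.get("shapes", []))
```

Calling `count_labels` on a single file gives a quick view of which classes it contains and how many objects of each were annotated.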

3 Converting AnyLabeling data to the YOLOv8 training set format
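The core of the conversion is turning the pixel coordinates of an annotated shape into a normalized center/width/height box. The following is a small worked example of that arithmetic, pulled out on its own; the function name `to_yolo_box` and the sample numbers are made up for illustration.

```python
def to_yolo_box(points, image_width, image_height):
    """Convert a list of [x, y] pixel points to (x_center, y_center, width, height),
    each normalized by the image size."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return x_center, y_center, width, height


# A box from (100, 50) to (300, 250) in a 400 x 500 image
box = to_yolo_box([[100, 50], [300, 250]], 400, 500)
print(box)  # (0.5, 0.3, 0.5, 0.4)
```

Because the box is derived from the minimum and maximum of all points, the same calculation works whether a rectangle is stored as two corner points or as four.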

import os
import json
import shutil
import random
from PIL import Image


def convert_to_yolov8(json_path, output_dir, image_size):
    """Write one YOLOv8 label file (.txt) for the given annotation JSON."""
    with open(json_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

    image_file = os.path.splitext(os.path.basename(json_path))[0]
    txt_file = os.path.join(output_dir, f"{image_file}.txt")

    image_width, image_height = image_size

    with open(txt_file, 'w') as out_file:
        for shape in data['shapes']:
            label = shape['label']  # read but not used: the class index is fixed below
            points = shape['points']
            # Bounding box that encloses all points of the shape
            x_min = min(p[0] for p in points)
            y_min = min(p[1] for p in points)
            x_max = max(p[0] for p in points)
            y_max = max(p[1] for p in points)

            # Compute the values required by the YOLOv8 format (normalized by image size)
            x_center = (x_min + x_max) / 2 / image_width
            y_center = (y_min + y_max) / 2 / image_height
            width = (x_max - x_min) / image_width
            height = (y_max - y_min) / image_height

            # Assume every object has class index 0 (single-class dataset)
            label_number = 0

            out_file.write(f"{label_number} {x_center} {y_center} {width} {height}\n")


if __name__ == "__main__":
    # Define the directory holding the images and JSON files, and the output directory
    images_dir = "H:\\pyworkspace\\modiantu\\4month\\result"  # directory with images and JSON labels
    output_dir = "H:\\pyworkspace\\modiantu\\4month\\yolo"  # output directory

    # Create the YOLOv8 directory structure
    for subdir in ['train', 'val']:
        os.makedirs(os.path.join(output_dir, 'images', subdir), exist_ok=True)
        os.makedirs(os.path.join(output_dir, 'labels', subdir), exist_ok=True)

    # Split the files into a training set and a validation set
    json_files = [f for f in os.listdir(images_dir) if f.endswith(".json")]
    random.shuffle(json_files)
    split_index = int(0.8 * len(json_files))  # 80% for training, 20% for validation
    train_files = json_files[:split_index]
    val_files = json_files[split_index:]

    # Convert both splits: write the label files and copy the images
    for subdir, files in (('train', train_files), ('val', val_files)):
        for json_file in files:
            json_path = os.path.join(images_dir, json_file)
            # Assumes each JSON file has a matching .jpg with the same base name
            image_name = os.path.splitext(json_file)[0] + '.jpg'
            image_path = os.path.join(images_dir, image_name)

            # Read the image size as (width, height)
            with Image.open(image_path) as img:
                image_size = img.size

            convert_to_yolov8(json_path, os.path.join(output_dir, 'labels', subdir), image_size)
            shutil.copy(image_path, os.path.join(output_dir, 'images', subdir, image_name))
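The script above writes class index 0 for every object, which is fine for a single-class dataset. For more than one class, one possible approach is to scan all JSON files first and assign each distinct label an index. This is only a sketch: the helper `build_label_map` is hypothetical, and sorting the labels alphabetically is an arbitrary choice made so the indices stay stable between runs.

```python
import json
import os


def build_label_map(json_dir):
    """Map every label name found in json_dir to a class index (alphabetical order)."""
    labels = set()
    for name in os.listdir(json_dir):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(json_dir, name), "r", encoding="utf-8") as f:
            data = json.load(f)
        # Assumption: same structure as in the script above ("shapes" -> "label")
        labels.update(shape["label"] for shape in data.get("shapes", []))
    return {label: index for index, label in enumerate(sorted(labels))}
```

With such a map, the constant `label_number = 0` in `convert_to_yolov8` could be replaced by a lookup such as `label_map[shape['label']]`, with the map passed in as an extra argument.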

After running the conversion script, the generated dataset has the following structure: