Yolact Data Annotation Tools: Labelme and COCO Format Conversion

Project: yolact, a simple, fully convolutional model for real-time instance segmentation. Repository: https://gitcode.com/gh_mirrors/yo/yolact

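1. Instance Segmentation Annotation: Pain Points and Solution

1.1 Annotation Workflow Difficulties

When training Yolact (You Only Look At CoefficienTs, a real-time instance segmentation model), data annotation poses two challenges:

Format incompatibility: mainstream annotation tools such as Labelme write one JSON file per image in their own schema, whereas Yolact reads datasets in the COCO (Common Objects in Context) annotation format, which is also JSON but uses a different, dataset-level structure.
Conversion complexity: the COCO format nests image metadata, object bounding boxes, and segmentation masks in several layers.

1.2 Solution Architecture

This article walks through a complete annotation-to-conversion pipeline.

[Figure: pipeline diagram (Mermaid source not available)]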

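2. Annotating with Labelme

2.1 Annotation Guidelines

[Figure: annotation guidelines diagram (Mermaid source not available)]

2.2 Key Steps

Install Labelme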
```
pip install labelme==4.5.7
```
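Polygon annotation rules
- Place vertices along the object edge no more than 5 pixels apart
- Use one label per class ("person", not "person_1")
- Keep the complete contour of an object even where it is occluded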
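
The first two rules above can be checked mechanically. The sketch below is one possible check and is not part of Labelme: `normalize_label` and `max_vertex_gap` are hypothetical helper names, and the assumption that instance suffixes look like `_1`, `_2` is taken from the "person_1" example. On long straight edges a gap above 5 pixels may be harmless, so treat the result as a hint rather than an error.

```python
import math
import re


def normalize_label(label):
    """Strip a trailing instance suffix such as "_1" so "person_1" becomes "person"."""
    return re.sub(r"_\d+$", "", label.strip())


def max_vertex_gap(points):
    """Return the largest distance between consecutive vertices of a closed polygon."""
    gaps = []
    for i, (x0, y0) in enumerate(points):
        x1, y1 = points[(i + 1) % len(points)]  # wrap around to close the polygon
        gaps.append(math.hypot(x1 - x0, y1 - y0))
    return max(gaps)


square = [[0, 0], [4, 0], [4, 3], [0, 3]]
print(normalize_label("person_1"))
print(max_vertex_gap(square) <= 5)
```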

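Generate the annotation file
Labelme writes a JSON file when you save an annotation. Its core structure looks like this (other fields omitted):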

{
  "imagePath": "20230921_001.jpg",
  "shapes": [
    {
      "label": "cat",
      "points": [[120.5, 34.2], [189.1, 34.2], [189.1, 105.8], [120.5, 105.8]],
      "shape_type": "polygon"
    }
  ],
  "imageHeight": 480,
  "imageWidth": 640
}
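
Before converting, it can help to confirm that each file has the fields shown above and that every point lies inside the image. The following is a minimal sketch under that assumption; `check_labelme_record` is a hypothetical helper, and it only looks at the four keys from the example.

```python
def check_labelme_record(data):
    """Return a list of human-readable problems found in one Labelme record."""
    problems = []
    for key in ("imagePath", "shapes", "imageHeight", "imageWidth"):
        if key not in data:
            problems.append(f"missing key: {key}")
    if problems:
        return problems  # the remaining checks need all four keys

    width, height = data["imageWidth"], data["imageHeight"]
    for i, shape in enumerate(data["shapes"]):
        if shape.get("shape_type") == "polygon" and len(shape["points"]) < 3:
            problems.append(f"shape {i}: polygon needs at least 3 points")
        for x, y in shape["points"]:
            if not (0 <= x <= width and 0 <= y <= height):
                problems.append(f"shape {i}: point ({x}, {y}) is outside the image")
    return problems


record = {
    "imagePath": "20230921_001.jpg",
    "shapes": [{
        "label": "cat",
        "points": [[120.5, 34.2], [189.1, 34.2], [189.1, 105.8], [120.5, 105.8]],
        "shape_type": "polygon",
    }],
    "imageHeight": 480,
    "imageWidth": 640,
}
print(check_labelme_record(record))
```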

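3. The COCO Format in Depth

3.1 Data Structure

A COCO dataset stores its annotations in one JSON file with five top-level fields:

Field	Type	Description
images	array	Image metadata
annotations	array	Object annotations
categories	array	Category definitions
licenses	array	License information
info	object	General information about the dataset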
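
As a quick illustration of the five fields, the sketch below builds an empty dataset skeleton and reports which fields a given dict lacks. Both function names are hypothetical, and the `description` key inside `info` is only an example.

```python
COCO_TOP_LEVEL_FIELDS = ("info", "licenses", "images", "annotations", "categories")


def empty_coco_dataset(description=""):
    """Return a COCO-style dict with all five top-level fields present but empty."""
    return {
        "info": {"description": description},
        "licenses": [],
        "images": [],
        "annotations": [],
        "categories": [],
    }


def missing_top_level_fields(coco):
    """List the top-level COCO fields that are absent from the given dict."""
    return [name for name in COCO_TOP_LEVEL_FIELDS if name not in coco]


print(missing_top_level_fields(empty_coco_dataset("demo")))
print(missing_top_level_fields({"images": [], "annotations": [], "categories": []}))
```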

3.2 Core Fields in Detail

Image metadata:

{
  "id": 1,
  "width": 640,
  "height": 480,
  "file_name": "20230921_001.jpg"
}

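Object annotation (shown here with an RLE-compressed mask; COCO also accepts a list of polygons in the segmentation field):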

{
  "id": 101,
  "image_id": 1,
  "category_id": 1,
  "bbox": [120, 34, 69, 72],
  "area": 4968,
  "segmentation": {
    "counts": "nY3Fs89...",
    "size": [480, 640]
  },
  "iscrowd": 0
}
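
The `bbox` field uses the order `[x, y, width, height]`, with `x, y` as the top-left corner, which differs from the corner-pair form many tools use. A small sketch of the conversion in both directions (the function names are illustrative):

```python
def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def xyxy_to_xywh(box):
    """Convert [x_min, y_min, x_max, y_max] back to COCO [x, y, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min, y_max - y_min]


corners = xywh_to_xyxy([120, 34, 69, 72])
print(corners)
print(xyxy_to_xywh(corners))
```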

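4. Implementing the Format Conversion

4.1 Writing the Conversion Script

Building on the Yolact project's existing tooling (scripts/convert_sbd.py), a Labelme-to-COCO converter can be written as follows: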

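import json
import os

import numpy as np
from PIL import Image


def labelme_to_coco(labelme_dir, output_json):
    """
    Convert a directory of Labelme JSON files into one COCO JSON file.

    Args:
        labelme_dir: directory holding the Labelme JSON files and their images
        output_json: path of the COCO JSON file to write
    """
    # Initialize the COCO structure
    coco = {
        "images": [],
        "annotations": [],
        "categories": []
    }

    img_id = 1
    ann_id = 1
    categories = {}

    # Sort the listing so image and category IDs are reproducible across runs
    for json_file in sorted(os.listdir(labelme_dir)):
        if not json_file.endswith('.json'):
            continue

        with open(os.path.join(labelme_dir, json_file), 'r', encoding='utf-8') as f:
            data = json.load(f)

        # Image metadata: read the actual size from the image file
        with Image.open(os.path.join(labelme_dir, data['imagePath'])) as img:
            width, height = img.size
        coco['images'].append({
            "id": img_id,
            "width": width,
            "height": height,
            "file_name": data['imagePath']
        })

        # Annotations
        for shape in data['shapes']:
            # Only polygons with at least 3 vertices give a valid COCO segmentation
            if shape.get('shape_type', 'polygon') != 'polygon' or len(shape['points']) < 3:
                continue

            # Map the label name to a category ID (assigned in first-seen order)
            if shape['label'] not in categories:
                categories[shape['label']] = len(categories) + 1

            # Axis-aligned bounding box as [x, y, width, height]
            points = np.asarray(shape['points'], dtype=float)
            xmin, ymin = points.min(axis=0)
            xmax, ymax = points.max(axis=0)
            bbox = [float(xmin), float(ymin), float(xmax - xmin), float(ymax - ymin)]

            # Polygon area via the shoelace formula (the bbox area would overestimate it)
            x, y = points[:, 0], points[:, 1]
            area = 0.5 * abs(float(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1))))

            coco['annotations'].append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": categories[shape['label']],
                "bbox": bbox,
                "area": area,
                "segmentation": [points.flatten().tolist()],
                "iscrowd": 0
            })
            ann_id += 1

        img_id += 1

    # Category definitions
    coco['categories'] = [
        {"id": v, "name": k} for k, v in categories.items()
    ]

    # Write the COCO JSON
    with open(output_json, 'w', encoding='utf-8') as f:
        json.dump(coco, f, indent=2)


# Usage example
if __name__ == '__main__':
    labelme_to_coco('path/to/labelme_jsons', 'coco_annotations.json')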
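
Before training on the file the converter writes, it is worth confirming that the IDs line up. The sketch below is a hypothetical helper (`check_coco_integrity` is not part of Yolact or pycocotools) that checks annotation IDs for duplicates, verifies that every annotation points at an existing image and category, and rejects polygons with fewer than three vertices or an odd number of coordinates.

```python
def check_coco_integrity(coco):
    """Return a list of problems found in a COCO-style dict (empty if none)."""
    problems = []
    image_ids = {img["id"] for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    seen_ann_ids = set()

    for ann in coco["annotations"]:
        if ann["id"] in seen_ann_ids:
            problems.append(f"duplicate annotation id {ann['id']}")
        seen_ann_ids.add(ann["id"])
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        segmentation = ann.get("segmentation", [])
        if isinstance(segmentation, list):  # polygon form; RLE dicts are skipped
            for poly in segmentation:
                if len(poly) < 6 or len(poly) % 2 != 0:
                    problems.append(f"annotation {ann['id']}: malformed polygon")
    return problems


sample = {
    "images": [{"id": 1}],
    "categories": [{"id": 1, "name": "cat"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "segmentation": [[0, 0, 4, 0, 4, 3]]},
    ],
}
print(check_coco_integrity(sample))
```

To check a file on disk, load it with `json.load` and pass the resulting dict to the function.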

4.2 Key Algorithms

Bounding-box computation
The box is the smallest axis-aligned rectangle that encloses the mask:

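import numpy as np

def mask2bbox(mask):
    """Compute a bounding box from a mask (from scripts/convert_sbd.py in the Yolact project)."""
    rows = np.any(mask, axis=1)  # rows containing at least one foreground pixel
    cols = np.any(mask, axis=0)  # columns containing at least one foreground pixel
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    # Returned in COCO order: x, y, width, height
    return cmin, rmin, cmax - cmin, rmax - rmin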
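
To make the arithmetic concrete, here is a pure-Python re-implementation of the same idea on a nested list, written for illustration only (`mask_to_bbox_py` is a hypothetical name, and it assumes the mask has at least one foreground pixel). Note that width and height are computed as index differences, so a region covering columns 1 to 3 is reported with width 2; add 1 to each if you need the inclusive pixel extent.

```python
def mask_to_bbox_py(mask):
    """Return (x, y, width, height) for a binary mask given as a list of rows."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    rmin, rmax = rows[0], rows[-1]
    cmin, cmax = cols[0], cols[-1]
    # Same convention as mask2bbox: sizes are max index minus min index
    return cmin, rmin, cmax - cmin, rmax - rmin


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
print(mask_to_bbox_py(mask))
```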

RLE compression
This step relies on pycocotools:

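from pycocotools import mask as maskUtils

def polygon_to_rle(polygon, height, width):
    """Convert polygons to a single RLE mask.

    polygon: list of flat coordinate lists, e.g. [[x1, y1, x2, y2, ...]]
    """
    rles = maskUtils.frPyObjects(polygon, height, width)  # one RLE per polygon
    rle = maskUtils.merge(rles)                            # union into one mask
    # rle['counts'] is bytes; decode it to str before passing it to json.dump
    return rle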
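
To show what run-length encoding means here without depending on pycocotools, the sketch below produces the uncompressed form: pixels are read column by column, and `counts` alternates between runs of zeros and runs of ones, always starting with zeros. This is an illustration only; `encode_uncompressed_rle` is a hypothetical name, and the output is not the compressed string that pycocotools stores in `counts`.

```python
def encode_uncompressed_rle(mask):
    """Encode a binary mask (list of rows) as COCO-style uncompressed RLE."""
    height, width = len(mask), len(mask[0])
    counts = []
    current, run = 0, 0  # runs alternate and the first run counts zeros
    for col in range(width):        # column-major traversal
        for row in range(height):
            value = 1 if mask[row][col] else 0
            if value == current:
                run += 1
            else:
                counts.append(run)  # close the current run (may be 0 at the start)
                current, run = value, 1
    counts.append(run)
    return {"counts": counts, "size": [height, width]}


print(encode_uncompressed_rle([[0, 1, 1],
                               [0, 1, 0]]))
```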

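5. Integrating with Yolact Training

5.1 Dataset Configuration

Register the converted dataset in data/config.py. Yolact defines datasets there as copies of dataset_base rather than as standalone classes. In the entry below, my_labelme_dataset is a placeholder name, and class_names must list the categories in the same order as the category IDs in the converted file:

my_labelme_dataset = dataset_base.copy({
    'name': 'My Labelme Dataset',

    'train_images': 'path/to/train2017',
    'train_info':   'path/to/coco_annotations.json',  # converted annotation file

    'valid_images': 'path/to/val2017',
    'valid_info':   'path/to/coco_val_annotations.json',

    'has_gt': True,
    'class_names': ('cat',),  # one entry per category, ordered by category ID
})

Then point the 'dataset' entry of yolact_base_config at my_labelme_dataset and set 'num_classes' to len(my_labelme_dataset.class_names) + 1.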
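
Yolact's dataset definitions in data/config.py carry a `class_names` tuple, and its order has to agree with the category IDs in the annotation file. Rather than typing the names by hand, they can be read from the converted file. A minimal sketch, assuming category IDs are contiguous and start at 1 as the converter in section 4.1 assigns them (`class_names_from_coco` is a hypothetical helper):

```python
def class_names_from_coco(coco):
    """Return category names ordered by ascending category ID."""
    ordered = sorted(coco["categories"], key=lambda cat: cat["id"])
    return tuple(cat["name"] for cat in ordered)


coco = {"categories": [{"id": 2, "name": "dog"}, {"id": 1, "name": "cat"}]}
print(class_names_from_coco(coco))
```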

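5.2 Example Training Commands

# Single-GPU training
python train.py --config=yolact_base_config --batch_size=8

# Multi-GPU training: train.py uses every GPU listed in CUDA_VISIBLE_DEVICES,
# so select the GPUs and scale the batch size (8 per GPU) instead of using torch.distributed.launch
export CUDA_VISIBLE_DEVICES=0,1,2,3
python train.py --config=yolact_base_config --batch_size=32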

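6. Quality Control and Optimization

6.1 Annotation Quality Checklist

[Figure: annotation quality checklist diagram (Mermaid source not available)]

6.2 Performance Optimization

Batch processing: reuse the batch-processing logic in scripts/convert_sbd.py for large datasets
Memory optimization: downsample images larger than 1000x1000 pixels
Parallelism: speed up mask conversion with multiple processes (see the batch implementation in optimize_bboxes.py)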
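
For the downsampling step, the annotation coordinates have to be scaled by the same factor as the image, otherwise masks and pixels drift apart. The sketch below shows only the arithmetic; both helpers are hypothetical, and the 1000-pixel limit on the longer side is one reading of the threshold mentioned above.

```python
def downscale_factor(width, height, max_side=1000):
    """Scale factor that fits the longer side within max_side (never upscales)."""
    return min(1.0, max_side / max(width, height))


def scale_points(points, factor):
    """Apply the image scale factor to a list of [x, y] polygon vertices."""
    return [[x * factor, y * factor] for x, y in points]


factor = downscale_factor(2000, 1000)
print(factor)
print(scale_points([[100, 200], [1500, 800]], factor))
```

Remember to write the scaled width and height into the COCO `images` entry as well.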

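7. Common Problems and Solutions

Problem	Cause	Solution
JSON parse error	Inconsistent file encoding (for example a UTF-8 byte-order mark in files with Chinese labels or paths)	Open the file with `encoding='utf-8-sig'`
Mask and image size mismatch	The image was resized after annotation, shifting the coordinates	Bring all images to their final size before converting
Wrong category IDs during training	Inconsistent category mapping	Make `class_names` (and, for non-contiguous IDs, `label_map`) in `data/config.py` match the category IDs in the converted file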
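
One way the category mapping becomes inconsistent is converting the training and validation sets separately: a converter that assigns IDs in first-seen order can give the same label different IDs in the two files. The sketch below compares two COCO dicts and reports the labels whose IDs differ (`category_mismatches` is a hypothetical helper).

```python
def category_mismatches(train_coco, val_coco):
    """Map each label whose ID differs to its (train_id, val_id) pair; None means absent."""
    train = {cat["name"]: cat["id"] for cat in train_coco["categories"]}
    val = {cat["name"]: cat["id"] for cat in val_coco["categories"]}
    return {
        name: (train.get(name), val.get(name))
        for name in sorted(set(train) | set(val))
        if train.get(name) != val.get(name)
    }


train = {"categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}]}
val = {"categories": [{"id": 1, "name": "dog"}, {"id": 2, "name": "cat"}]}
print(category_mismatches(train, val))
```

An empty result means the two files agree on every label.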

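8. Extended Applications

8.1 Semi-Automatic Annotation

Use Yolact's inference results to pre-annotate images:

# Generate pseudo-labels as COCO-style detection files
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.3 --output_coco_json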
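
Once predictions are available, they usually need to be filtered and grouped per image before an annotator reviews them. The sketch below assumes the predictions were saved as a JSON list in the COCO results format, that is, records with `image_id`, `category_id`, `score`, and a `bbox` or `segmentation`; the helper name and the 0.3 default are illustrative.

```python
from collections import defaultdict


def group_confident_detections(detections, min_score=0.3):
    """Keep detections scoring at least min_score and group them by image_id."""
    grouped = defaultdict(list)
    for det in detections:
        if det["score"] >= min_score:
            grouped[det["image_id"]].append(det)
    return dict(grouped)


detections = [
    {"image_id": 1, "category_id": 1, "score": 0.9},
    {"image_id": 1, "category_id": 2, "score": 0.1},
    {"image_id": 2, "category_id": 1, "score": 0.3},
]
for image_id, dets in group_confident_detections(detections).items():
    print(image_id, len(dets))
```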

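8.2 Cross-Dataset Migration

Format conversion also makes the data usable with other segmentation frameworks.

[Figure: cross-framework conversion diagram (Mermaid source not available)]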

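9. Summary and Outlook

The Labelme-to-COCO conversion approach described here builds on the Yolact project's existing tooling (convert_sbd.py, compute_masks.py, and others) and covers the full path from annotation to training. Possible future work:

An annotation recommendation system based on active learning
Multimodal annotation tools that combine text descriptions with visual information
A plugin that gives real-time feedback on annotation quality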

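To reuse the converter, save the script from section 4.1 in your Yolact checkout, for example as scripts/labelme2coco.py. As written, it takes its input and output paths as function arguments rather than command-line options.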

Authoring statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.