Visualizing YOLO-seg TXT Annotations and Saving the Images [Label Visualization] [YOLO Segmentation]

道心

Last modified 2024-12-03 15:22:50

Tags: YOLO, object detection, artificial intelligence, label visualization

First published 2024-12-03 15:21:18

Article link: https://blog.youkuaiyun.com/heart_warmonger/article/details/144215722

How to visualize YOLO-seg TXT annotations and save the images

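In object detection and image segmentation, YOLO-seg models are widely used: they not only detect objects but also segment each instance at the pixel level. The TXT annotation files handled in this article record, for each object, its class, the center coordinates, width, and height of its bounding box, and its segmentation polygon. To better understand and debug annotations and model predictions, it often helps to draw these labels on the image and save the result.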

In this post, I will walk you through a piece of Python code that visualizes YOLO-seg TXT annotation files and saves the resulting images.

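Goals

Read annotation files in the YOLO-seg .txt format.
Draw the bounding boxes and segmentation polygons described by the annotations.
Give each class its own color so that objects of different classes are easy to tell apart.
Save the processed images to a specified folder.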

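A brief introduction to the YOLO-seg TXT format

A YOLO-seg annotation file is a .txt file in which each line describes one object. This article assumes the following layout for each line:

class_id x_center y_center width height polygon_points

class_id: the class index of the object (starting from 0).
x_center, y_center: the center coordinates of the bounding box, normalized to the range [0, 1].
width, height: the width and height of the bounding box, normalized to the range [0, 1].
polygon_points: the vertices of the segmentation polygon, given as space-separated x, y pairs that are also normalized to the range [0, 1].
Note that label files in the standard Ultralytics segmentation dataset layout contain only class_id followed by the polygon vertices, with no bounding-box fields. For such files the polygon starts at the second value on each line, so the parsing offsets used in this article need to be adjusted.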
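Before drawing anything, it can be worth checking that a label line really follows the layout described above. The following is a minimal sketch, assuming that layout; the function name `validate_label_line` and the rule that a polygon needs at least three vertices are assumptions made for illustration, not part of the original script.

```python
def validate_label_line(line):
    """Return a list of problems found in one label line (empty if none)."""
    problems = []
    fields = line.split()
    if len(fields) < 5:
        return ["expected class_id plus four box values"]
    try:
        class_id = int(fields[0])
        values = [float(v) for v in fields[1:]]
    except ValueError:
        return ["non-numeric field"]
    if class_id < 0:
        problems.append("class_id must be >= 0")
    # All box and polygon values are expected to be normalized.
    if any(not 0.0 <= v <= 1.0 for v in values):
        problems.append("coordinate outside [0, 1]")
    # Everything after the four box values is polygon data: x, y pairs.
    polygon_values = values[4:]
    if len(polygon_values) % 2 != 0:
        problems.append("polygon has an unpaired coordinate")
    elif len(polygon_values) < 6:
        # Assumption: a usable polygon has at least three vertices.
        problems.append("polygon needs at least three vertices")
    return problems


print(validate_label_line("0 0.5 0.5 0.5 0.5 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75"))
```

An empty list means the line passed every check; otherwise each string names one problem, which makes it easy to log bad lines and skip them.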

Example annotation file contents:

0 0.5 0.5 0.5 0.5 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75
1 0.75 0.75 0.5 0.5 0.65 0.65 0.85 0.65 0.85 0.85 0.65 0.85

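The first line describes an object of class 0 whose bounding box is centered at (0.5, 0.5), with width and height each half of the image. Its polygon vertices form a rectangle that coincides with this box.
The second line describes an object of class 1 whose bounding box is centered at (0.75, 0.75), with width and height each half of the image; its polygon is a smaller rectangle inside that box.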
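To make the normalization concrete, here is a small sketch that converts the first example line to pixel coordinates. The helper name `label_line_to_pixels` and the 640x480 image size are assumptions chosen for illustration; the arithmetic is the same as in the full script (multiply by the image width or height and truncate with `int`).

```python
def label_line_to_pixels(line, image_width, image_height):
    """Convert one normalized label line to pixel coordinates."""
    fields = line.split()
    class_id = int(fields[0])
    x_center, y_center, width, height = (float(v) for v in fields[1:5])

    # Box corners: center minus/plus half the size, scaled to pixels.
    x1 = int((x_center - width / 2) * image_width)
    y1 = int((y_center - height / 2) * image_height)
    x2 = int((x_center + width / 2) * image_width)
    y2 = int((y_center + height / 2) * image_height)

    # Polygon: the remaining values are x, y pairs.
    coords = [float(v) for v in fields[5:]]
    polygon = [
        (int(x * image_width), int(y * image_height))
        for x, y in zip(coords[0::2], coords[1::2])
    ]
    return class_id, (x1, y1, x2, y2), polygon


line = "0 0.5 0.5 0.5 0.5 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75"
class_id, box, polygon = label_line_to_pixels(line, 640, 480)
print(class_id, box, polygon)
# → 0 (160, 120, 480, 360) [(160, 120), (480, 120), (480, 360), (160, 360)]
```

On a 640x480 image the box and the polygon land on the same four corners, which matches the description of the first example line.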

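Implementation steps

1. Read the image and the annotation file

We use OpenCV (cv2) to read the image and plain Python file handling to read the .txt annotation file. The annotation file holds each object's class, bounding box, and segmentation polygon.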
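Each image is matched to its label file by name: `cat.jpg` goes with `cat.txt`. The sketch below shows one way to collect those pairs up front with `pathlib`; the function name `pair_images_with_labels` and the set of accepted suffixes are assumptions for illustration, and the full script does the same matching inline with `os.path`.

```python
from pathlib import Path

# Assumption: only these image types are of interest, matched case-insensitively.
IMAGE_SUFFIXES = {".jpg", ".png"}


def pair_images_with_labels(image_folder, annotation_folder):
    """Return (pairs, missing): image/label path pairs and images without a label."""
    pairs, missing = [], []
    for image_path in sorted(Path(image_folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # The label file shares the image's base name and uses the .txt extension.
        label_path = Path(annotation_folder) / (image_path.stem + ".txt")
        if label_path.is_file():
            pairs.append((image_path, label_path))
        else:
            missing.append(image_path)
    return pairs, missing
```

Collecting the unmatched images separately makes it easy to report every missing label at once instead of printing a warning per file.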

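2. Parse the YOLO-seg annotations

Parse the .txt file and extract each object's class_id, x_center, y_center, width, height, and polygon vertex coordinates. Then convert the normalized coordinates back to pixel coordinates for drawing.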
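Label files in the standard Ultralytics segmentation layout carry only the class index followed by the polygon vertices, without the four box values. If your files look like that, a sketch along the following lines may be a better starting point: it reads the polygon from the second field onward and derives the bounding box from the polygon's extent. The function name `parse_polygon_only_line` is an assumption for illustration and does not appear in the original script.

```python
def parse_polygon_only_line(line, image_width, image_height):
    """Parse 'class_id x1 y1 x2 y2 ...' and derive the box from the polygon."""
    fields = line.split()
    class_id = int(fields[0])
    coords = [float(v) for v in fields[1:]]

    # Pair up x, y values and scale them to pixel coordinates.
    points = [
        (int(x * image_width), int(y * image_height))
        for x, y in zip(coords[0::2], coords[1::2])
    ]

    # The tightest axis-aligned box around the polygon.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    box = (min(xs), min(ys), max(xs), max(ys))
    return class_id, box, points


print(parse_polygon_only_line("0 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75", 100, 200))
# → (0, (25, 50, 75, 150), [(25, 50), (75, 50), (75, 150), (25, 150)])
```

The returned box uses corner coordinates (x1, y1, x2, y2), so it can be passed straight to `cv2.rectangle`.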

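3. Draw the bounding boxes and segmentation polygons

Each object gets a color based on its class_id; the bounding box, the segmentation polygon, and the class label are all drawn in that color.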
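A fixed color table only covers the classes listed in it. If your dataset has many classes, one option is to compute a color from the class index instead. The sketch below is one possible approach, not the method used in the full script: it spreads hues with the golden-ratio trick and returns the color in BGR order, which is what OpenCV drawing functions expect. The name `color_for_class` is hypothetical.

```python
import colorsys

GOLDEN_RATIO_CONJUGATE = 0.618033988749895


def color_for_class(class_id):
    """Return a deterministic BGR color tuple for any non-negative class index."""
    # Stepping the hue by the golden-ratio conjugate keeps nearby ids visually distinct.
    hue = (class_id * GOLDEN_RATIO_CONJUGATE) % 1.0
    r, g, b = colorsys.hsv_to_rgb(hue, 0.9, 1.0)
    # OpenCV expects BGR, with integer channels in 0..255.
    return (int(b * 255), int(g * 255), int(r * 255))


for class_id in range(3):
    print(class_id, color_for_class(class_id))
```

Because the color depends only on the class index, the same class is drawn in the same color across every image and every run.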

4. Save the visualization

Use cv2.imwrite() to save the processed image to the specified folder.
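The full script saves each result under the same file name as the input image. If the output folder could ever be the same as the image folder, that would overwrite the originals, so you may prefer to add a marker to the file name. The following sketch shows one way to build such a path; the function name `build_output_path` and the `_vis` suffix are assumptions for illustration.

```python
from pathlib import Path


def build_output_path(image_path, output_folder, suffix="_vis"):
    """Return output_folder/<name><suffix><ext> for the given image path."""
    image_path = Path(image_path)
    return Path(output_folder) / f"{image_path.stem}{suffix}{image_path.suffix}"


print(build_output_path("data/images/cat.jpg", "out"))
```

The result can be passed to `cv2.imwrite()` after converting it with `str()`. Note that `cv2.imwrite()` returns a boolean, so checking its return value is a simple way to notice a failed save.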

Full code

import cv2
import numpy as np
import os

# Assign a different color to each class (OpenCV colors are in BGR order)
def get_class_color(class_id):
    color_map = {
        0: (0, 255, 0),   # class 0: green
        1: (0, 0, 255),   # class 1: red
        2: (255, 0, 0),   # class 2: blue
        3: (0, 255, 255), # class 3: yellow
        4: (255, 0, 255), # class 4: magenta
        5: (255, 255, 0), # class 5: cyan
    }
    return color_map.get(class_id, (255, 255, 255))  # white by default

# Parse a YOLO-seg TXT annotation file
def parse_yolov5seg_txt(txt_file, image_width, image_height):
    annotations = []

    # Read the annotation lines from the TXT file
    with open(txt_file, 'r') as f:
        lines = f.readlines()

    for line in lines:
        line = line.strip().split()

        # Skip blank or incomplete lines (fewer than class_id plus four box values)
        if len(line) < 5:
            continue

        # Parse the fields of the line
        class_id = int(line[0])
        x_center = float(line[1])
        y_center = float(line[2])
        width = float(line[3])
        height = float(line[4])

        # Collect the polygon vertices, scaled to pixel coordinates;
        # a trailing value without a partner is ignored
        polygon_points = []
        for i in range(5, len(line) - 1, 2):
            x = float(line[i]) * image_width
            y = float(line[i+1]) * image_height
            polygon_points.append((int(x), int(y)))

        # Store the parsed annotation
        annotations.append((class_id, x_center, y_center, width, height, polygon_points))

    return annotations

# Draw the bounding boxes and segmentation polygons on the image, then save the result
def visualize_annotations(image_path, txt_file, output_folder):
    # Read the image; cv2.imread returns None if the file cannot be read
    img = cv2.imread(image_path)
    if img is None:
        print(f"Warning: Could not read image {image_path}.")
        return
    image_height, image_width, _ = img.shape

    # Parse the YOLO-seg annotation file
    annotations = parse_yolov5seg_txt(txt_file, image_width, image_height)

    # Draw each annotation
    for annotation in annotations:
        class_id, x_center, y_center, width, height, polygon_points = annotation

        # Look up the color for this class
        color = get_class_color(class_id)

        # Compute the top-left and bottom-right corners of the bounding box
        x1 = int((x_center - width / 2) * image_width)
        y1 = int((y_center - height / 2) * image_height)
        x2 = int((x_center + width / 2) * image_width)
        y2 = int((y_center + height / 2) * image_height)

        # Draw the bounding box in the class color (comment this out to hide the box)
        cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)

        # Draw the polygon in the class color, if the line contains one
        if len(polygon_points) > 0:
            polygon_points = np.array(polygon_points, np.int32)
            polygon_points = polygon_points.reshape((-1, 1, 2))
            cv2.polylines(img, [polygon_points], isClosed=True, color=color, thickness=2)

        # Put the class label just above the top-left corner of the box, in the same color
        # (comment this out to hide the label); max() keeps it visible near the top edge
        label = f"Class {class_id}"
        cv2.putText(img, label, (x1, max(y1 - 10, 15)), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1, cv2.LINE_AA)

    # Build the output path
    output_filename = os.path.join(output_folder, os.path.basename(image_path))

    # Save the visualized image to the output folder
    cv2.imwrite(output_filename, img)

# Visualize every image and annotation file in a folder and save the results
def visualize_folder(image_folder, annotation_folder, output_folder):
    # Create the output folder if it does not exist
    os.makedirs(output_folder, exist_ok=True)

    for image_file in os.listdir(image_folder):
        # Match .jpg and .png files regardless of letter case
        if image_file.lower().endswith((".jpg", ".png")):
            image_path = os.path.join(image_folder, image_file)
            txt_file = os.path.join(annotation_folder, os.path.splitext(image_file)[0] + ".txt")

            if os.path.exists(txt_file):
                visualize_annotations(image_path, txt_file, output_folder)
            else:
                print(f"Warning: Annotation file for {image_file} not found.")

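# Set the image, annotation, and output folder paths
image_folder = "path_to_your_images"  # replace with the path to your image folder
annotation_folder = "path_to_your_annotations"  # replace with the path to the folder holding your TXT annotation files
output_folder = "path_to_output_folder"  # replace with the path to the folder where the visualized images should be saved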

# Visualize the images and annotations, and save the results
visualize_folder(image_folder, annotation_folder, output_folder)

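How to use

Install the dependencies: make sure OpenCV and NumPy are installed:
```
pip install opencv-python numpy
```
Set the folder paths:
- Set image_folder to the path of your image folder.
- Set annotation_folder to the path of the folder that holds the .txt annotation files.
- Set output_folder to the path of the folder where the visualized images should be saved.
Run the code: the script reads the images and .txt files in batch, draws the annotations, and saves each processed image to output_folder.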
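If you would rather not edit the three path variables by hand, the folders can be passed on the command line. The sketch below shows one way to do that with `argparse`; the flag names `--images`, `--labels`, and `--output` and the function name `build_arg_parser` are assumptions chosen for illustration and are not part of the original script.

```python
import argparse


def build_arg_parser():
    """Create a parser for the three folder paths used by the script."""
    parser = argparse.ArgumentParser(
        description="Visualize YOLO-seg TXT annotations and save the images."
    )
    parser.add_argument("--images", required=True, help="folder containing the images")
    parser.add_argument("--labels", required=True, help="folder containing the .txt annotation files")
    parser.add_argument("--output", required=True, help="folder for the visualized images")
    return parser


# Parse an explicit argument list here so the example runs anywhere.
args = build_arg_parser().parse_args(
    ["--images", "imgs", "--labels", "labels", "--output", "out"]
)
print(args.images, args.labels, args.output)
# → imgs labels out
```

In the actual script you would call `parse_args()` with no arguments so that it reads the real command line, then pass `args.images`, `args.labels`, and `args.output` to `visualize_folder`.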