From Segmentation to Classification: Preparing the Data

Article link: https://blog.youkuaiyun.com/weixin_55406683/article/details/136019656

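This post describes how the author preprocessed a drone image dataset: cropping the images into patches suitable for a classification task, using the sge_clas_binary function to split the segmentation dataset by class, and organizing the result into train, val, and test sets.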


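1. Preface

These notes record how I turned a segmentation dataset into one suited to a classification task. The first step is always the hardest, and a written note beats a good memory.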

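2. Background

I originally wanted to try image segmentation, but my supervisor felt I still needed to build up the basics and suggested starting with the simpler task of image classification. So, of course, I followed that advice.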

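2.1 Dataset

I had collected a drone dataset in advance; each image has shape (44592, 3072, 3). Because the number of samples is limited, I cropped each image into smaller patches to get more of them.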
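
To get a feel for how many patches one image yields, here is a minimal sketch that uses the same floor division as the cropping code in section 3.1. The helper name count_patches and the (512, 512) patch size are assumptions for illustration only; the post does not state the patch size that was actually used.

```python
from typing import Tuple


def count_patches(image_shape: Tuple[int, int, int],
                  patch_size: Tuple[int, int]) -> Tuple[int, int, int, int]:
    """Return (rows, cols, leftover_height, leftover_width) for non-overlapping patches.

    Hypothetical helper: it mirrors the floor division used when cropping,
    so any pixels that do not fill a whole patch are simply left unused.
    """
    height, width = image_shape[0], image_shape[1]
    rows = height // patch_size[0]
    cols = width // patch_size[1]
    leftover_height = height - rows * patch_size[0]
    leftover_width = width - cols * patch_size[1]
    return rows, cols, leftover_height, leftover_width


# Image shape as stated above; the patch size is an assumed example value.
rows, cols, leftover_h, leftover_w = count_patches((44592, 3072, 3), (512, 512))
print(f"{rows} x {cols} = {rows * cols} patches per image")
print(f"unused border: {leftover_h} pixel rows, {leftover_w} pixel columns")
```

If the unused border matters, it can be reduced by choosing a patch size that divides the image dimensions evenly.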

2.2 Data organization

The collected data is stored in the layout below, where ****.jpg is an image and ****_m.png is its corresponding mask.

------seg
-------6627.jpg
-------6627_m.png
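My images and masks are mixed together in one folder, which is not ideal; later I will try separating the images from the masks before processing them.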
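
One possible way to separate the two, sketched under the naming convention shown above (an image ****.jpg with its mask ****_m.png in the same folder). The function names find_image_mask_pairs and separate_images_and_masks, and the idea of copying into two target folders, are assumptions for illustration and not the method used later in this post.

```python
import os
import shutil
from typing import List, Tuple


def find_image_mask_pairs(folder: str, mask_suffix: str = "_m.png") -> List[Tuple[str, str]]:
    """Return (image_name, mask_name) pairs for every .jpg that has a matching mask."""
    pairs = []
    for filename in sorted(os.listdir(folder)):
        stem, ext = os.path.splitext(filename)
        if ext.lower() != ".jpg":
            continue  # skip masks and any other non-image files
        mask_name = stem + mask_suffix
        if os.path.exists(os.path.join(folder, mask_name)):
            pairs.append((filename, mask_name))
    return pairs


def separate_images_and_masks(folder: str, image_dir: str, mask_dir: str) -> int:
    """Copy paired images and masks into separate folders; return the number of pairs."""
    os.makedirs(image_dir, exist_ok=True)
    os.makedirs(mask_dir, exist_ok=True)
    pairs = find_image_mask_pairs(folder)
    for image_name, mask_name in pairs:
        # copy2 keeps the originals in place, so the source folder is untouched
        shutil.copy2(os.path.join(folder, image_name), os.path.join(image_dir, image_name))
        shutil.copy2(os.path.join(folder, mask_name), os.path.join(mask_dir, mask_name))
    return len(pairs)
```

For example, separate_images_and_masks("seg", "seg_images", "seg_masks") would copy each pair out of the seg folder; the two target folder names here are placeholders. Images without a matching mask are skipped rather than copied.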

3. Code

3.1 Image cropping

Here, the split_images_in_folder function iterates over all images in source_folder and calls split_image to crop each one.

from typing import Tuple
import os
import shutil
from PIL import Image
import numpy as np

def split_images_in_folder(source_folder: str, destination_folder: str, patch_size: Tuple[int, int]):
    # Use assert to check if the source folder exists
    assert os.path.exists(source_folder), f"The source folder '{source_folder}' does not exist."

    # Create the destination folder if it does not exist yet
    os.makedirs(destination_folder, exist_ok=True)

    # Loop through all files in the source folder
    for filename in os.listdir(source_folder):
        if filename.endswith(".jpg"):  # Assuming your images are in JPG format
            # Full path to the image
            image_path = os.path.join(source_folder, filename)

            # Call the split_image function for each image
            split_image(image_path, destination_folder, patch_size)

Here, image_path, destination_folder, and patch_size are all passed in from split_images_in_folder.

# Function to split a single image
def split_image(image_path: str, destination_folder: str, patch_size: Tuple[int, int]):
    # Use assert to check if the image file exists
    assert os.path.exists(image_path), f"The image file '{image_path}' does not exist."

    # Create the destination folder if it does not exist yet
    os.makedirs(destination_folder, exist_ok=True)

    # Load the image
    image = Image.open(image_path).convert("RGB")

    # Convert the image to a NumPy array
    image_np = np.array(image)

    # Get the dimensions of the original image
    height, width, _ = image_np.shape

    # Calculate the number of patches in each dimension
    num_patches_height = height // patch_size[0]
    num_patches_width = width // patch_size[1]

    # Loop through each patch
    for i in range(num_patches_height):
        for j in range(num_patches_width):
            # Calculate the coordinates of the patch
            start_height = i * patch_size