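Converting a JSON-annotated dataset to TFRecord (TT100K)

Problem:

I have the TT100K dataset, whose image annotations are stored in a JSON file (the authors labeled it with an annotation tool they built in Qt). I want to train on it with the TensorFlow Object Detection API, but all my earlier demos used XML annotations. So how do I convert TT100K into the TFRecord files the API needs?

This builds on my previous post (converting the VOC dataset to TFRecord files): https://blog.youkuaiyun.com/m0_37970224/article/details/89305787

Only a few changes are needed.
The main difference is that the earlier script read its annotations from XML files, whereas this one reads them from the JSON data.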
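
To make that difference concrete, here is a minimal sketch of pulling labels and pixel boxes out of a TT100K-style annotation dict. The inline sample values are made up for illustration; the key names (`imgs`, `types`, `objects`, `category`, `bbox`) follow what the conversion script in this post reads, and `read_objects` is a hypothetical helper rather than part of any library.

```python
import json

# Made-up sample that mimics the keys read from annotations.json.
SAMPLE = json.loads('''
{
  "types": ["pl50", "pn"],
  "imgs": {
    "10056": {
      "objects": [
        {"category": "pn",
         "bbox": {"xmin": 512.0, "ymin": 1024.0, "xmax": 1024.0, "ymax": 1536.0}}
      ]
    }
  }
}
''')


def read_objects(annos, name):
    """Return (category, 1-based class id, pixel box) tuples for one image."""
    results = []
    for obj in annos['imgs'][name]['objects']:
        category = obj['category']
        # Class ids start at 1, mirroring types.index(label) + 1 in the script.
        class_id = annos['types'].index(category) + 1
        box = obj['bbox']
        results.append((category, class_id,
                        (box['xmin'], box['ymin'], box['xmax'], box['ymax'])))
    return results


objects = read_objects(SAMPLE, '10056')
print(objects)
```

Compared with the XML version, only this lookup step changes; the rest of the pipeline (building the `tf.train.Example` and writing it) stays the same.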
Here is the code: tt100k_to_tfrecord.py

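# coding=utf-8
# Note: this script uses the TensorFlow 1.x API (tf.gfile, tf.python_io).
# Under TensorFlow 2.x the same calls are available via tf.compat.v1.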
import os
import sys
import random
import tensorflow as tf
import json
from PIL import Image

# DIRECTORY_IMAGES = './train/'
DIRECTORY_IMAGES = './test/'
RANDOM_SEED = 4242
SAMPLES_PER_FILES = 1600


def int64_feature(values):
    """Returns a TF-Feature of int64s.
    Args:
    values: A scalar or list of values.
    Returns:
    a TF-Feature.
    """
    if not isinstance(values, (tuple, list)):
        values = [values]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))


def float_feature(value):
    """Wrapper for inserting float features into Example proto.
    """
    if not isinstance(value, list):
        value = [value]
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))


def bytes_feature(value):
    """Wrapper for inserting bytes features into Example proto.
    """
    if not isinstance(value, list):
        value = [value]
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))


def _process_image(directory, name):
    # Read the raw JPEG bytes; the context manager closes the file handle.
    filename = os.path.join(directory, DIRECTORY_IMAGES, name + '.jpg')
    with tf.gfile.FastGFile(filename, 'rb') as f:
        image_data = f.read()

    # Read the JSON annotation file.
    # Note: it is reloaded for every image; cache it if conversion is slow.
    anno_path = os.path.join(directory, 'annotations.json')
    with open(anno_path) as f:
        annos = json.load(f)

    # Image shape as [height, width, channels]; 3 channels are assumed.
    with Image.open(filename) as img:
        shape = [img.height, img.width, 3]

    # Collect the label and bounding box of every object in the image.
    bboxes = []
    labels = []
    labels_text = []
    for obj in annos['imgs'][name]['objects']:
        label = obj['category']
        # Class ids start at 1, following the order of annos['types'].
        labels.append(annos['types'].index(label) + 1)
        labels_text.append(label.encode('utf8'))

        # Normalize pixel coordinates to [0, 1] as (ymin, xmin, ymax, xmax).
        bbox = obj['bbox']
        bboxes.append((float(bbox['ymin']) / shape[0],
                       float(bbox['xmin']) / shape[1],
                       float(bbox['ymax']) / shape[0],
                       float(bbox['xmax']) / shape[1]
                       ))
    return image_data, shape, bboxes, labels, labels_text


def _convert_to_example(image_data, labels, labels_text, bboxes, shape):
    xmin = []
    ymin = []
    xmax = []
    ymax = []
    for b in bboxes:
        assert len(b) == 4
        # pylint: disable=expression-not-assigned
        [l.append(point) for l, point in zip([ymin, xmin, ymax, xmax], b)]
        # pylint: enable=expression-not-assigned

    image_format = b'JPEG'
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': int64_feature(shape[0]),
        'image/width': int64_feature(shape[1]),
        'image/channels': int64_feature(shape[2]),
        'image/shape': int64_feature(shape),
        'image/object/bbox/xmin': float_feature(xmin),
        'image/object/bbox/xmax': float_feature(xmax),
        'image/object/bbox/ymin': float_feature(ymin),
        'image/object/bbox/ymax': float_feature(ymax),
        'image/object/class/label': int64_feature(labels),
        'image/object/class/text': bytes_feature(labels_text),
        'image/format': bytes_feature(image_format),
        'image/encoded': bytes_feature(image_data)}))
    return example


def _add_to_tfrecord(dataset_dir, name, tfrecord_writer):
    image_data, shape, bboxes, labels, labels_text = \
        _process_image(dataset_dir, name)
    print(shape, bboxes, labels, labels_text)
    example = _convert_to_example(image_data,
                                  labels,
                                  labels_text,
                                  bboxes,
                                  shape)
    tfrecord_writer.write(example.SerializeToString())


def run(tt100k_root, split, output_dir, shuffling=False):
    # Create output_dir if it does not exist.
    if not tf.gfile.Exists(output_dir):
        tf.gfile.MakeDirs(output_dir)
    # TT100K/data/train/ids.txt lists the names of all training samples
    # across the 221 categories, 6105 in total.
    split_file_path = os.path.join(tt100k_root, split, 'ids.txt')
    print('>> ', split_file_path)
    with open(split_file_path) as f:
        filenames = f.readlines()
    # When shuffling is True, shuffle the sample order (seeded for repeatability).
    if shuffling:
        random.seed(RANDOM_SEED)
        random.shuffle(filenames)
    # Process dataset files.
    i = 0
    fidx = 0
    while i < len(filenames):
        # Open new TFRecord file.
        # Name the output after the split instead of hard-coding 'test'.
        tf_filename = '%s/%s_%03d.tfrecord' % (output_dir, split, fidx)
        with tf.python_io.TFRecordWriter(tf_filename) as tfrecord_writer:
            j = 0
            while i < len(filenames) and j < SAMPLES_PER_FILES:
                sys.stdout.write('\r>> Converting image %d/%d' % (i + 1, len(filenames)))
                sys.stdout.flush()
                filename = filenames[i].strip()
                _add_to_tfrecord(tt100k_root, filename, tfrecord_writer)
                i += 1
                j += 1
            fidx += 1
    print('\n>> Finished converting the TT100K %s dataset!' % (split))


if __name__ == '__main__':
    # Raw string so the backslashes in the Windows path are not treated as escapes.
    run(r'E:\data\TT100K\data', 'test', './data/tt100k/test')

Adapt the script to your own data by following the same pattern, and you can convert it to TFRecord files as well.
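
Training with the Object Detection API also needs a label map whose ids match the ones written into the records. The following is a minimal sketch, assuming the ids follow `annos['types'].index(label) + 1` as in the script above; `build_label_map` is a hypothetical helper (not part of the API) and the two category names are only sample values.

```python
def build_label_map(types):
    """Render label-map text (pbtxt style) with 1-based ids in `types` order."""
    items = []
    for idx, name in enumerate(types, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (idx, name))
    return "\n".join(items)


# Sample category list; in practice this would be annos['types'].
label_map_text = build_label_map(['pl50', 'pn'])
print(label_map_text)
```

Writing `label_map_text` to a `.pbtxt` file next to the TFRecord files keeps the class ids and names in one place.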

### TT100K dataset format conversion

TT100K is a public dataset widely used for traffic sign detection and recognition[^1]. Its annotations are usually provided in a specific format such as JSON or XML. Converting it to another format (COCO, Pascal VOC, or YOLO) means parsing the dataset's structure and annotation content and rebuilding them in the target layout. A general conversion workflow with code examples follows.

#### 1. Understand the original TT100K format

TT100K annotation files are usually stored as JSON and contain the image path, the class labels, and the bounding boxes. For example, a typical JSON file might have the following structure:

```json
{
  "images": [
    {
      "filename": "image_0001.jpg",
      "width": 1920,
      "height": 1080,
      "regions": [
        { "category": "speedlimit_50", "bbox": [100, 200, 300, 400] },
        { "category": "stop", "bbox": [400, 500, 600, 700] }
      ]
    }
  ]
}
```

In this structure, the `regions` list holds the category and the bounding box of each object, with `bbox` read as `[xmin, ymin, xmax, ymax]` in pixels. This layout is illustrative: the `annotations.json` read by tt100k_to_tfrecord.py above instead uses the top-level keys `imgs` and `types`, and each image's `objects` carry a `bbox` dict with `xmin`, `ymin`, `xmax`, and `ymax`, so adjust the key names in the examples below accordingly.

#### 2. Convert to the target format

##### (1) Convert to COCO format

COCO organizes annotations into fields such as `annotations` and `categories`. The following Python example shows how to convert the TT100K data to COCO format:

```python
import json

def convert_to_coco(tt100k_json_path, output_path):
    with open(tt100k_json_path, 'r') as f:
        tt100k_data = json.load(f)

    images = []
    annotations = []
    categories = {}
    annotation_id = 1

    for image_info in tt100k_data['images']:
        image_id = len(images) + 1
        images.append({
            "id": image_id,
            "file_name": image_info["filename"],
            "width": image_info["width"],
            "height": image_info["height"]
        })

        for region in image_info["regions"]:
            category = region["category"]
            if category not in categories:
                categories[category] = len(categories) + 1

            # COCO boxes are [x, y, width, height] in pixels.
            bbox = region["bbox"]
            x, y, w, h = bbox[0], bbox[1], bbox[2] - bbox[0], bbox[3] - bbox[1]
            annotations.append({
                "id": annotation_id,
                "image_id": image_id,
                "category_id": categories[category],
                "bbox": [x, y, w, h],
                "area": w * h,
                "iscrowd": 0
            })
            annotation_id += 1

    coco_data = {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": v, "name": k} for k, v in categories.items()]
    }

    with open(output_path, 'w') as f:
        json.dump(coco_data, f, indent=4)

convert_to_coco("tt100k.json", "coco_output.json")
```

##### (2) Convert to Pascal VOC format

Pascal VOC stores annotations in XML files, one per image. The following Python example shows how to convert the TT100K data to Pascal VOC format:

```python
import json
import os
import xml.etree.ElementTree as ET

def create_voc_xml(image_info, output_dir):
    root = ET.Element("annotation")
    folder = ET.SubElement(root, "folder")
    folder.text = "TT100K"
    filename = ET.SubElement(root, "filename")
    filename.text = image_info["filename"]

    size = ET.SubElement(root, "size")
    width = ET.SubElement(size, "width")
    width.text = str(image_info["width"])
    height = ET.SubElement(size, "height")
    height.text = str(image_info["height"])
    depth = ET.SubElement(size, "depth")
    depth.text = "3"

    for region in image_info["regions"]:
        obj = ET.SubElement(root, "object")
        name = ET.SubElement(obj, "name")
        name.text = region["category"]
        bndbox = ET.SubElement(obj, "bndbox")
        xmin = ET.SubElement(bndbox, "xmin")
        xmin.text = str(region["bbox"][0])
        ymin = ET.SubElement(bndbox, "ymin")
        ymin.text = str(region["bbox"][1])
        xmax = ET.SubElement(bndbox, "xmax")
        xmax.text = str(region["bbox"][2])
        ymax = ET.SubElement(bndbox, "ymax")
        ymax.text = str(region["bbox"][3])

    tree = ET.ElementTree(root)
    output_file = f"{output_dir}/{image_info['filename'].split('.')[0]}.xml"
    tree.write(output_file)

# Load the TT100K data (the same illustrative file as in the COCO example).
with open("tt100k.json", "r") as f:
    tt100k_data = json.load(f)

os.makedirs("voc_output", exist_ok=True)
for image_info in tt100k_data['images']:
    create_voc_xml(image_info, "voc_output")
```

##### (3) Convert to YOLO format

YOLO stores annotations in `.txt` files, where each line gives one object's class id and its normalized bounding box (center x, center y, width, height). The following Python example shows how to convert the TT100K data to YOLO format:

```python
import json

def convert_to_yolo(tt100k_json_path, output_dir, class_mapping):
    with open(tt100k_json_path, 'r') as f:
        tt100k_data = json.load(f)

    for image_info in tt100k_data['images']:
        output_file = f"{output_dir}/{image_info['filename'].split('.')[0]}.txt"
        with open(output_file, 'w') as out:
            for region in image_info["regions"]:
                category = region["category"]
                # Skip categories that are not in the mapping.
                if category not in class_mapping:
                    continue
                class_id = class_mapping[category]
                bbox = region["bbox"]
                x_center = (bbox[0] + bbox[2]) / (2 * image_info["width"])
                y_center = (bbox[1] + bbox[3]) / (2 * image_info["height"])
                width = (bbox[2] - bbox[0]) / image_info["width"]
                height = (bbox[3] - bbox[1]) / image_info["height"]
                out.write(f"{class_id} {x_center} {y_center} {width} {height}\n")

class_mapping = {"speedlimit_50": 0, "stop": 1}
convert_to_yolo("tt100k.json", "yolo_output", class_mapping)
```

### Notes

When converting between formats, check the following:

- Look for missing or incorrect annotations in the original dataset.
- Make sure the class mapping is correct, especially when many classes are involved.
- Keep bounding box units consistent throughout the conversion (pixel values versus normalized values).
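
The checks listed above can be folded into the normalization step itself. Here is a minimal sketch, assuming pixel-space boxes given as `xmin, ymin, xmax, ymax` and the `(ymin, xmin, ymax, xmax)` normalized order used by tt100k_to_tfrecord.py; `normalize_bbox` is a hypothetical helper, and rejecting out-of-range boxes outright is only one possible policy (clamping to the image border may suit real data better).

```python
def normalize_bbox(xmin, ymin, xmax, ymax, width, height):
    """Convert a pixel-space box to normalized (ymin, xmin, ymax, xmax).

    Raises ValueError when the image size is not positive, or when the box
    is degenerate or falls outside the image.
    """
    if width <= 0 or height <= 0:
        raise ValueError('image size must be positive')
    if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
        raise ValueError('invalid box: %r' % ((xmin, ymin, xmax, ymax),))
    return (ymin / height, xmin / width, ymax / height, xmax / width)


# A box inside a 2048 x 2048 image (sample values).
box = normalize_bbox(512, 1024, 1024, 1536, width=2048, height=2048)
print(box)
```

Running every box through one function like this keeps the pixel-to-normalized conversion in a single place, which makes unit mix-ups easier to spot.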