Converting a Dataset to VOC2007 Format

Latest recommended article published on 2024-01-29 14:40:41

晴天stick

Category: Deep Learning. Tags: VOC2007, dataset, Faster RCNN

Article link: https://blog.youkuaiyun.com/qq_42805483/article/details/89740665

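To learn Faster RCNN, the author documents in detail how to convert a dataset to VOC2007 format, including renaming the images, generating the XML annotations, and creating the txt files in ImageSets. This preparation is essential for the later Faster RCNN training.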

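To learn Faster RCNN, I needed to convert my own dataset to VOC2007 format. Most of the material I found was incomplete, so I am recording my own process here for possible later use.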

The top-level folder is named VOC2007 and contains three subfolders: JPEGImages, Annotations, and ImageSets. They are handled in that order.
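
As a starting point, the folder layout can be created with a few lines of MATLAB. This is only a minimal sketch: the root location under `tempdir` is a placeholder chosen for illustration, and the `ImageSets/Main` subfolder is included on the assumption that the split files for detection will be stored there, as in the original VOC2007 release.

```matlab
% Sketch: create the VOC2007 folder skeleton.
% 'root' is a hypothetical location; point it at the real dataset folder.
root = fullfile(tempdir, 'VOC2007');

% Parent folders are listed before their children so that each mkdir
% call only has to create one level.
subdirs = {'', 'JPEGImages', 'Annotations', 'ImageSets', ...
           fullfile('ImageSets', 'Main')};

for k = 1:numel(subdirs)
    d = fullfile(root, subdirs{k});
    if ~exist(d, 'dir')
        mkdir(d);
    end
end
```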

1. Renaming the images

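Image names in VOC2007 are six-digit numbers, such as 000001.jpg, so it is best to rename the dataset images in the JPEGImages folder to the same format.
The corresponding MATLAB code is: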

% Image renaming script
clc;
clear;
maindir = 'F:\数据集\positive image set';      % source folder
newdir = 'F:\数据集\new positive image set';   % output folder
name_long = 6;  % number of digits in the name, e.g. 6 for 000123.jpg; at most 9, adjustable
num_begin = 1;  % first number to use, e.g. 123 if naming starts at 000123.jpg

if ~exist(newdir, 'dir')
    mkdir(newdir);  % imwrite fails if the output folder does not exist
end

subdir = dir(maindir);  % contents of the source folder

for i = 1:length(subdir)
    % skip '.', '..' and any other subfolders
    if ~subdir(i).isdir
        A = fullfile(maindir, subdir(i).name);
        B = imread(A);
        %imshow(B)
        str = num2str(num_begin, '%09d');
        newname = strcat(str, '.jpg');
        % keep the last name_long digits plus '.jpg', e.g. 000001.jpg
        newname = newname(end-(name_long+3):end);
        C = fullfile(newdir, newname);
        imwrite(B, C);
        num_begin = num_begin + 1;
    end
end
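
If the source images are already JPEG files, a possible variant is to copy them instead of decoding and re-encoding them, since `imwrite` recompresses the image. The sketch below illustrates that idea under stated assumptions: the two temporary folders, the dummy images, and the `rename_map.txt` log are all hypothetical and exist only so the example runs on its own. The log records which original file became which number, which can help when the annotations still refer to the old names.

```matlab
% Sketch: rename by copying, and log old name -> new name.
% Demo folders (hypothetical); replace with the real paths in practice.
srcdir = tempname;  mkdir(srcdir);
dstdir = tempname;  mkdir(dstdir);

% Two tiny dummy JPEGs so that the sketch is self-contained.
imwrite(uint8(zeros(8, 8, 3)), fullfile(srcdir, 'cat.jpg'));
imwrite(uint8(zeros(8, 8, 3)), fullfile(srcdir, 'dog.jpg'));

files = dir(fullfile(srcdir, '*.jpg'));   % only pick up .jpg files
newnames = cell(numel(files), 1);

mapfile = [dstdir '_rename_map.txt'];     % hypothetical log file
fid = fopen(mapfile, 'w');
for k = 1:numel(files)
    newnames{k} = sprintf('%06d.jpg', k); % six digits, zero padded
    copyfile(fullfile(srcdir, files(k).name), ...
             fullfile(dstdir, newnames{k}));
    fprintf(fid, '%s %s\n', files(k).name, newnames{k});
end
fclose(fid);
```

Copying only works when the inputs are already in JPEG format; for other formats the `imread`/`imwrite` route in the script above is still needed to convert them.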

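2. Generating the Annotations folder
If the dataset was annotated by someone else in a different format, you need to convert those annotations into XML files yourself.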
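
One way to produce such a file is to write the tags directly with `fprintf`. The following is a rough sketch, not the author's conversion script: the `ann` struct, its field names, and the second object's values are made up for illustration, and the real fields depend on how the existing annotations are stored. The tag names follow the format listed in this section.

```matlab
% Sketch: write one VOC-style XML annotation file.
% 'ann' is a hypothetical in-memory annotation; bbox = [xmin ymin xmax ymax].
ann.filename = '000001.jpg';
ann.width = 500;  ann.height = 332;  ann.depth = 3;
ann.objects = struct('name', {'horse', 'dog'}, ...
                     'bbox', {[100 96 355 324], [10 20 110 220]});

outdir = tempname;  mkdir(outdir);        % stand-in for the Annotations folder
[~, base] = fileparts(ann.filename);
xmlpath = fullfile(outdir, [base '.xml']);

fid = fopen(xmlpath, 'w');
fprintf(fid, '<annotation>\n');
fprintf(fid, '    <folder>VOC2007</folder>\n');
fprintf(fid, '    <filename>%s</filename>\n', ann.filename);
fprintf(fid, '    <size>\n');
fprintf(fid, '        <width>%d</width>\n', ann.width);
fprintf(fid, '        <height>%d</height>\n', ann.height);
fprintf(fid, '        <depth>%d</depth>\n', ann.depth);
fprintf(fid, '    </size>\n');
fprintf(fid, '    <segmented>0</segmented>\n');
for k = 1:numel(ann.objects)
    obj = ann.objects(k);
    fprintf(fid, '    <object>\n');
    fprintf(fid, '        <name>%s</name>\n', obj.name);
    fprintf(fid, '        <pose>Unspecified</pose>\n');
    fprintf(fid, '        <truncated>0</truncated>\n');
    fprintf(fid, '        <difficult>0</difficult>\n');
    fprintf(fid, '        <bndbox>\n');
    fprintf(fid, '            <xmin>%d</xmin>\n', obj.bbox(1));
    fprintf(fid, '            <ymin>%d</ymin>\n', obj.bbox(2));
    fprintf(fid, '            <xmax>%d</xmax>\n', obj.bbox(3));
    fprintf(fid, '            <ymax>%d</ymax>\n', obj.bbox(4));
    fprintf(fid, '        </bndbox>\n');
    fprintf(fid, '    </object>\n');
end
fprintf(fid, '</annotation>\n');
fclose(fid);
```

Writing the tags by hand keeps the sketch free of XML toolboxes; class names containing characters such as `<` or `&` would need escaping, which is not handled here.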

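% <annotation> // format
%     <folder>VOC2012</folder>
%     <filename>2007_000392.jpg</filename>              // file name
%     <source>                                         // image source (not important)
%         <database>The VOC2007 Database</database>
%         <annotation>PASCAL VOC2007</annotation>
%         <image>flickr</image>
%     </source>
%     <size>                        // image size (width, height, and number of channels)
%         <width>500</width>
%         <height>332</height>
%         <depth>3</depth>
%     </size>
%     <segmented>1</segmented>        // whether segmentation annotations exist (0 or 1 makes no difference for object detection)
%     <object>                        // an annotated object
%         <name>horse</name>          // object class
%         <pose>Right</pose>          // pose (viewing direction) of the object
%         <truncated>0</truncated>    // whether the object is truncated (0 means fully visible)
%         <difficult>0</difficult>    // whether the object is hard to recognize (0 means easy)
%         <bndbox>                    // bounding box: (xmin, ymin) is the top-left corner, (xmax, ymax) the bottom-right corner
%             <xmin>100</xmin>
%             <ymin>96</ymin>
%             <xmax>355</xmax>
%             <ymax>324</ymax>
%         </bndbox>
%     </object>
%     <object>                        // each additional object gets its own <object> element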
%         <name>person</name>
%         <pose>Unspecified</pose>
%         <truncated>0</truncated>
%         <difficult>0</difficult>
%         <bndbox>
%             <xmin>198</xmin>
%             <ymin>58</ymin>
%             <xmax>286</xmax>
%