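Reading and displaying the CIFAR-10 dataset: writing a CIFAR-10 data reader from the API documentation, getting the dataset size and other information, and displaying images - 优快云 Blog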

Article link: https://blog.youkuaiyun.com/Vertira/article/details/127178310

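This article covers the basic facts about the CIFAR-10 dataset (image size, classes, and image counts), then shows how to read it in Python and display an image.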


Official download page for the dataset:

CIFAR-10 and CIFAR-100 datasets (utoronto.ca)

CIFAR-10
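Size: 32×32 RGB images. Each image is stored as a flat row with the red plane first, then green, then blue (RGB order, not BGR)
Num: 50,000 training images and 10,000 test images, 60,000 images in total
Classes (label indices 0 to 9, in this order): airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck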

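After extraction, the directory contains five training batch files (data_batch_1 through data_batch_5), a test batch file (test_batch), and a metadata file (batches.meta).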

Reading the dataset

1) The reading method given on the official website (Python 3 version):

def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict

The function returns a dictionary.

2) Print the dictionary

dict = unpickle('./data_batch_1')
print(dict)

{b'batch_label': b'training batch 1 of 5', 
 b'labels': [6, 9 ... 1, 5], 
 b'data': array([[ 59,  43,  50, ..., 140,  84,  72],
                                 ...
                 [ 62,  61,  60, ..., 130, 130, 131]], dtype=uint8),
 b'filenames': [b'leptodactylus_pentadactylus_s_000004.png', b'camion_s_000148.png',
                                 ...
                b'estate_car_s_001433.png', b'cur_s_000170.png']}

b'batch_label': the batch this file belongs to
b'labels': the image labels (integer class indices)
b'data': the image data
b'filenames': the image file names
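
The keys come back as bytes objects because the file is loaded with encoding='bytes'. If plain string keys are more convenient, the top-level keys can be decoded after loading. The following is a minimal sketch, not part of the official reader: the function name unpickle_str_keys and the tiny synthetic batch written to a temporary file are assumptions made for illustration, so the example runs without the real dataset.

```python
import os
import pickle
import tempfile


def unpickle_str_keys(path):
    """Load a pickled batch and decode its top-level bytes keys to str."""
    with open(path, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    # Only the top-level keys are decoded; the values are left untouched.
    return {
        (key.decode('utf-8') if isinstance(key, bytes) else key): value
        for key, value in batch.items()
    }


# Synthetic stand-in for a real batch file (hypothetical contents).
fake_batch = {b'batch_label': b'training batch 1 of 5', b'labels': [6, 9]}
with tempfile.NamedTemporaryFile(suffix='.pkl', delete=False) as tmp:
    pickle.dump(fake_batch, tmp)
    tmp_path = tmp.name

decoded = unpickle_str_keys(tmp_path)
os.remove(tmp_path)
print(sorted(decoded.keys()))
```

Note that values such as the batch label and the file names remain bytes; they would need their own decode step if strings are wanted there too.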

3) Print the type of each entry

print(type(dict[b'batch_label']))
print(type(dict[b'labels']))
print(type(dict[b'data']))
print(type(dict[b'filenames']))

<class 'bytes'>
<class 'list'>
<class 'numpy.ndarray'>
<class 'list'>
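
Because b'labels' is an ordinary Python list, the number of images in a batch and the number of images per class can be obtained with the standard library alone. The sketch below assumes a batch dictionary shaped like the one printed above; summarize_batch is a hypothetical helper name, and the small dictionary is synthetic so the example runs without the dataset files.

```python
from collections import Counter


def summarize_batch(batch):
    """Return (number of images, per-class label counts) for one batch dict."""
    labels = batch[b'labels']
    return len(labels), Counter(labels)


# Synthetic batch with four labels (hypothetical data, not from the dataset).
fake_batch = {b'labels': [6, 9, 9, 1]}
num_images, per_class = summarize_batch(fake_batch)
print(num_images)
print(sorted(per_class.items()))
```

With the real files, summing the counts over the five training batches and the test batch should reproduce the 50,000 and 10,000 totals given earlier.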

4) Print the shape of the image data

img = dict[b'data']
print(img.shape)

(10000, 3072)

Each of the 10000 rows is one image, and 3072 = 32 * 32 * 3 (32×32 pixels times 3 color channels).
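
According to the dataset's documentation, each 3072-value row holds the 1024 red values first, then the 1024 green values, then the 1024 blue values, with each plane stored in row-major order. Under that layout, a whole batch can be converted to image arrays in one step. This is a minimal sketch, assuming NumPy is installed; rows_to_images is a hypothetical helper name and the single input row is synthetic.

```python
import numpy as np


def rows_to_images(data):
    """Convert (N, 3072) flattened rows to (N, 32, 32, 3) images (height, width, channel)."""
    return data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)


# Synthetic row: red plane filled with 1, green with 2, blue with 3.
row = np.concatenate([
    np.full(1024, 1, dtype=np.uint8),
    np.full(1024, 2, dtype=np.uint8),
    np.full(1024, 3, dtype=np.uint8),
])
images = rows_to_images(row[np.newaxis, :])
print(images.shape)
print(images[0, 0, 0])
```

Every pixel of the resulting image carries the triple (1, 2, 3), which confirms that the last axis is ordered red, green, blue.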

5) Display an image

import matplotlib.pyplot as plt

show_image = img[666]                          # one flattened image, shape (3072,)
img_reshape = show_image.reshape(3, 32, 32)    # split into 3 channel planes
pic = img_reshape.transpose(1, 2, 0)    # (3, 32, 32) --> (32, 32, 3)
plt.imshow(pic)
plt.show()

label = dict[b'labels']
image_label = label[666]
print(image_label)

The printed label corresponds to truck (class index 9).
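
To turn an integer label into a readable class name, a lookup list is enough. The sketch below assumes the official class order for indices 0 to 9; CIFAR10_CLASSES and label_to_name are hypothetical names introduced for illustration.

```python
# Class names in label-index order (index 0 is airplane, index 9 is truck).
CIFAR10_CLASSES = [
    'airplane', 'automobile', 'bird', 'cat', 'deer',
    'dog', 'frog', 'horse', 'ship', 'truck',
]


def label_to_name(label):
    """Map an integer label (0 to 9) to its class name."""
    return CIFAR10_CLASSES[label]


print(label_to_name(9))
```

The dataset's batches.meta file also stores these names in its label_names entry, which avoids hard-coding the list.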