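MNIST is a commonly used database for handwritten digit recognition. The dataset can be downloaded from: http://yann.lecun.com/exdb/mnist/
The dataset consists of the following four files: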
train-images-idx3-ubyte: training set images
train-labels-idx1-ubyte: training set labels
t10k-images-idx3-ubyte: test set images
t10k-labels-idx1-ubyte: test set labels
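The training set contains 60,000 samples and the test set contains 10,000. In the test set, the first 5,000 samples are cleaner and easier to recognize, while the last 5,000 are somewhat harder.
The image files have the following format: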
IMAGE FILE (train-images-idx3-ubyte):
[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000803(2051) magic number
0004     32 bit integer  60000            number of images
0008     32 bit integer  28               number of rows
0012     32 bit integer  28               number of columns
0016     unsigned byte   ??               pixel
0017     unsigned byte   ??               pixel
........
xxxx     unsigned byte   ??               pixel
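When reading an image file, a program must therefore skip the 16-byte header before it reaches the actual pixel data. The images are already size-normalized to 28*28 pixels, which removes much of the usual preprocessing work.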
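The header layout can also be decoded explicitly instead of being skipped. The following is a minimal sketch, assuming the header has already been read into a 16-byte buffer. The names idx3_header, be32, parse_idx3_header and pixel_offset are hypothetical helpers introduced here for illustration. The 32-bit integers are stored most significant byte first and the pixels are stored row by row, so the sketch decodes the integers by hand and derives the offset of a pixel from the header fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the four fields of the 16-byte image file header. */
typedef struct {
    uint32_t magic; /* 2051 (0x00000803) for image files */
    uint32_t count; /* number of images */
    uint32_t rows;  /* rows per image */
    uint32_t cols;  /* columns per image */
} idx3_header;

/* Decode a 32-bit integer stored most significant byte first. */
uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Fill *h from the first 16 bytes of an image file.
   Returns 0 if the magic number is 2051, 1 otherwise. */
int parse_idx3_header(const unsigned char buf[16], idx3_header *h)
{
    assert(buf != NULL && h != NULL);
    h->magic = be32(buf);
    h->count = be32(buf + 4);
    h->rows = be32(buf + 8);
    h->cols = be32(buf + 12);
    return h->magic == 2051u ? 0 : 1;
}

/* Byte offset in the file of pixel (row, col) of image number `index`,
   counting images, rows and columns from 0. */
size_t pixel_offset(const idx3_header *h, size_t index, size_t row, size_t col)
{
    return 16 + (index * h->rows + row) * h->cols + col;
}
```

For the training file this gives 16 for the first pixel of the first image and 16 + 784 for the first pixel of the second image, since each image occupies 28*28 = 784 bytes.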
The label files use a slightly different format:
LABEL FILE (train-labels-idx1-ubyte):
[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000801(2049) magic number (MSB first)
0004     32 bit integer  60000            number of items
0008     unsigned byte   ??               label
0009     unsigned byte   ??               label
........
xxxx     unsigned byte   ??               label
Label values range from 0 to 9. To reach the actual label data, a program must skip the 8-byte header.
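The label layout lends itself to a quick consistency check. The sketch below assumes the whole label file has been loaded into memory. The function check_label_buffer is a hypothetical helper, not part of the original program. It verifies the magic number, confirms that the buffer holds as many labels as the header announces, rejects any label outside 0 to 9, and counts how often each digit occurs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: validate an in-memory copy of a label file.
   buf points to the file contents, len is their length in bytes.
   On success, fills histogram[d] with the number of labels equal to d
   and returns the label count; returns -1 if the data is malformed. */
long check_label_buffer(const unsigned char *buf, size_t len, size_t histogram[10])
{
    assert(buf != NULL && histogram != NULL);
    if (len < 8)
        return -1; /* too short to hold the 8-byte header */

    /* Both header integers are stored most significant byte first. */
    uint32_t magic = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                     ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    uint32_t count = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                     ((uint32_t)buf[6] << 8) | (uint32_t)buf[7];

    if (magic != 2049u)
        return -1; /* not a label file */
    if (len - 8 < count)
        return -1; /* fewer labels than the header announces */

    for (int d = 0; d < 10; d++)
        histogram[d] = 0;

    for (uint32_t i = 0; i < count; i++) {
        unsigned char label = buf[8 + i]; /* labels start at offset 8 */
        if (label > 9)
            return -1; /* labels must be digits 0 to 9 */
        histogram[label]++;
    }
    return (long)count;
}
```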
Example program in C:
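#include <stdio.h>

// Reads count images and labels into data and label. Returns 0 on success, 1 on failure.
int read_data(unsigned char (*data)[28][28], unsigned char label[], const int count, const char data_file[], const char label_file[])
{
    FILE *fp_image = fopen(data_file, "rb"); // image file
    FILE *fp_label = fopen(label_file, "rb"); // label file
    int ok = 0;
    if (fp_image && fp_label && count >= 0
        && fseek(fp_image, 16, SEEK_SET) == 0 // skip the 16-byte header to reach the pixel data
        && fseek(fp_label, 8, SEEK_SET) == 0) // skip the 8-byte header to reach the label data
    {
        // Read item by item so that fread returns the number of complete items read.
        ok = fread(data, sizeof(*data), (size_t)count, fp_image) == (size_t)count
            && fread(label, 1, (size_t)count, fp_label) == (size_t)count;
    }
    if (fp_image) fclose(fp_image); // close whichever files were opened
    if (fp_label) fclose(fp_label);
    return ok ? 0 : 1;
}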
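One simple way to check that the pixel data was loaded correctly is to render an image as text. The following is a sketch under stated assumptions: image_to_ascii is a hypothetical helper, and the threshold that separates ink from background is chosen by the caller, not defined by the dataset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: render one 28*28 image as 28 lines of text.
   pixels points to the 784 bytes of one image, stored row by row.
   out must have room for 28 * 29 + 1 characters.
   Pixels greater than threshold are drawn as '#', all others as '.'. */
void image_to_ascii(const unsigned char *pixels, unsigned char threshold, char *out)
{
    assert(pixels != NULL && out != NULL);
    size_t k = 0;
    for (size_t row = 0; row < 28; row++) {
        for (size_t col = 0; col < 28; col++)
            out[k++] = pixels[row * 28 + col] > threshold ? '#' : '.';
        out[k++] = '\n'; /* end of one image row */
    }
    out[k] = '\0';
}
```

With an array filled by read_data, a call such as image_to_ascii(&data[0][0][0], 127, text) followed by puts(text) would print the first image, where text is a char array of at least 28 * 29 + 1 elements and 127 is an arbitrary mid-range threshold.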