Google Flax项目实战：三种方式加载MNIST数据集-优快云博客

本文链接：https://blog.youkuaiyun.com/gitblog_01002/article/details/148466188

Google Flax项目实战：三种方式加载MNIST数据集

flax Flax is a neural network library for JAX that is designed for flexibility. 项目地址: https://gitcode.com/gh_mirrors/fl/flax

前言

在深度学习项目中，数据预处理是模型训练前的重要环节。本文将详细介绍如何在Google Flax框架下，使用三种主流方式加载经典的MNIST手写数字数据集。Flax是基于JAX的神经网络库，要求输入数据必须是JAX数组格式，因此我们需要掌握将不同来源的数据转换为JAX数组的方法。

MNIST数据集简介

MNIST是一个广泛使用的手写数字识别数据集，包含：

60,000张训练图像
10,000张测试图像
每张图像为28×28像素的灰度图
标签为0-9的数字类别

对于CNN模型，输入数据需要处理为(B,28,28,1)的形状，其中B是批次大小，最后的1表示单通道灰度图。

准备工作

首先确保已安装必要的库：

import numpy as np
import jax.numpy as jnp

方法一：使用Torchvision加载

Torchvision是PyTorch的计算机视觉库，提供了便捷的数据集接口。

import torchvision

def load_mnist_torch():
    # 下载并加载数据集
    mnist = {
        'train': torchvision.datasets.MNIST('./data', train=True, download=True),
        'test': torchvision.datasets.MNIST('./data', train=False, download=True)
    }

    dataset = {}
    
    for split in ['train', 'test']:
        # 转换为numpy数组
        images = mnist[split].data.numpy()
        labels = mnist[split].targets.numpy()
        
        # 转换为JAX数组并归一化
        dataset[split] = {
            'image': jnp.float32(images) / 255.0,
            'label': jnp.int16(labels)
        }
        
        # 添加通道维度
        dataset[split]['image'] = jnp.expand_dims(dataset[split]['image'], -1)
    
    return dataset['train'], dataset['test']

特点：

直接使用PyTorch的数据加载器
需要手动添加通道维度
内存占用较低

方法二：使用TensorFlow Datasets加载

TensorFlow Datasets提供了标准化的数据集接口。

import tensorflow_datasets as tfds

def load_mnist_tf():
    # 构建并准备数据集
    builder = tfds.builder('mnist')
    builder.download_and_prepare()
    
    dataset = {}
    
    for split in ['train', 'test']:
        # 转换为numpy数组
        data = tfds.as_numpy(builder.as_dataset(split=split, batch_size=-1))
        
        # 转换为JAX数组并归一化
        dataset[split] = {
            'image': jnp.float32(data['image']) / 255.0,
            'label': jnp.int16(data['label'])
        }
    
    return dataset['train'], dataset['test']

特点：

自动处理数据集下载和缓存
支持多种数据格式转换
内置数据预处理功能

方法三：使用Hugging Face Datasets加载

Hugging Face提供了统一的数据集接口。

from datasets import load_dataset

def load_mnist_hf():
    # 加载数据集
    mnist = load_dataset("mnist")
    
    dataset = {}
    
    for split in ['train', 'test']:
        # 转换为numpy数组
        images = np.array([np.array(img) for img in mnist[split]['image']])
        labels = np.array(mnist[split]['label'])
        
        # 转换为JAX数组并归一化
        dataset[split] = {
            'image': jnp.float32(images) / 255.0,
            'label': jnp.int16(labels)
        }
        
        # 添加通道维度
        dataset[split]['image'] = jnp.expand_dims(dataset[split]['image'], -1)
    
    return dataset['train'], dataset['test']

特点：

统一的API接口
支持流式加载大数据集
丰富的社区数据集资源

数据验证

无论使用哪种方法加载，最终都应该得到相同格式的数据：

train, test = load_mnist_torch()  # 或其他加载方法

print("训练集图像形状:", train['image'].shape)  # 应为(60000,28,28,1)
print("训练集标签形状:", train['label'].shape)  # 应为(60000,)
print("测试集图像形状:", test['image'].shape)   # 应为(10000,28,28,1)
print("测试集标签形状:", test['label'].shape)   # 应为(10000,)