Common built-in datasets in PyTorch/torchvision.datasets [Summary]

Licensed under CC 4.0 BY-SA

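This article introduces the datasets available in the PyTorch framework, including MNIST, COCO, LSUN, ImageNet, CIFAR, STL10, SVHN, and PhotoTour, which cover tasks ranging from handwritten digit recognition to complex image recognition. It also shows how to load the datasets and offers fixes for some common problems.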

1. PyTorch environment

@PyTorch 1.0
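@How to install torchvision, and a workaround if the install fails:
@Install command 1: pip3 install torchvision #may fail with an error
@Install command 2: pip install --no-deps torchvision #installs torchvision without installing its dependencies
@On Linux, sudo may be required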
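
After running either install command, it can help to confirm which versions actually ended up in the environment, since `--no-deps` leaves the dependencies untouched. The following is a minimal sketch that uses only the Python standard library (Python 3.8 or later is assumed); the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the state of the two packages this article relies on.
for name in ("torch", "torchvision"):
    print(name, installed_version(name) or "not installed")
```

If `torchvision` is reported as installed but `torch` is not, the `--no-deps` route was probably used and PyTorch itself still needs to be installed separately.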

MNIST
#A handwritten digit dataset with 60,000 training examples and 10,000 test examples

The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples.
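
A minimal sketch of loading MNIST through `torchvision.datasets.MNIST` is shown here. The `./data` directory and the helper name `load_mnist` are assumptions for illustration, and the import sits inside the function so the file can still be imported when torchvision is absent.

```python
def load_mnist(root="./data", download=True):
    """Return the (train, test) MNIST splits as torchvision datasets."""
    # Imported lazily so that importing this file does not require torchvision.
    from torchvision import datasets, transforms

    to_tensor = transforms.ToTensor()  # PIL image -> float tensor scaled to [0, 1]
    train_set = datasets.MNIST(root=root, train=True, download=download, transform=to_tensor)
    test_set = datasets.MNIST(root=root, train=False, download=download, transform=to_tensor)
    return train_set, test_set
```

Calling `train_set, test_set = load_mnist()` downloads the files on first use. `len(train_set)` and `len(test_set)` should then match the 60,000 and 10,000 examples quoted above, and `train_set[0]` yields an image tensor of shape `(1, 28, 28)` together with its integer label.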
COCO
#COCO is generally regarded as a dataset geared more toward image segmentation

COCO-Text is a new large scale dataset for text detection and recognition in natural images.
Version 1.3 of the dataset is out!
63,686 images, 145,859 text instances, 3 fine-grained text attributes.
This dataset is based on the MSCOCO dataset.
•Text localizations as bounding boxes
•Text transcriptions for legible text
•Multiple text instances per image
•More than 63,000 images
•More than 145,000 text instances
•Text instances categorized into machine printed and handwritten text
•Text instances categorized into legible and illegible text
•Text instances categorized into English script and non-English script
Captions
#As the name suggests, this is mainly image-caption data; it is usually counted as part of COCO
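
In torchvision, caption data is exposed through `torchvision.datasets.CocoCaptions`, which reads a folder of MS COCO images plus an annotation JSON file and needs the `pycocotools` package. The sketch below assumes both files have already been downloaded by hand; the helper name `load_coco_captions` is hypothetical.

```python
def load_coco_captions(image_dir, annotation_file):
    """Build a CocoCaptions dataset from a local image folder and annotation JSON."""
    # Imported lazily so that importing this file does not require torchvision.
    from torchvision import datasets, transforms

    return datasets.CocoCaptions(
        root=image_dir,              # folder containing the COCO images
        annFile=annotation_file,     # captions annotation file in COCO JSON format
        transform=transforms.ToTensor(),
    )
```

Each item of the returned dataset is a pair `(image, captions)`, where `captions` is a list of strings describing that image.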
Detection
#Object detection dataset, usually also included in COCO
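
Detection annotations can be loaded in a similar way with `torchvision.datasets.CocoDetection`, which also needs `pycocotools` and locally downloaded files. The following is a sketch, not a full pipeline: the helper names are made up for illustration, and the box conversion relies on the COCO convention of storing boxes as `[x, y, width, height]`.

```python
def load_coco_detection(image_dir, annotation_file):
    """Build a CocoDetection dataset from a local image folder and annotation JSON."""
    # Imported lazily so that importing this file does not require torchvision.
    from torchvision import datasets

    return datasets.CocoDetection(root=image_dir, annFile=annotation_file)


def xywh_to_xyxy(bbox):
    """Convert a COCO box [x, y, width, height] to corner form [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def boxes_for_sample(target):
    """Collect corner-form boxes from one target (a list of annotation dicts)."""
    return [xywh_to_xyxy(ann["bbox"]) for ann in target]
```

Each item of the dataset is a pair `(image, target)`, where `target` is a list of annotation dictionaries, so `boxes_for_sample(target)` gives the boxes for one image in corner form.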
LSUN
#Scene understanding dataset (it seems pretty cool)

While there has been remarkable progress in the performance of visual recognition algorithms, the state-of-the-art models tend to be exceptionally data-hungry. Large labeled training datasets, expensive and tedious to produce, are required to optimize millions of parameters in deep network models.
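
LSUN can be read with `torchvision.datasets.LSUN`, which does not download anything itself: it expects the LMDB folders (for example `bedroom_train_lmdb`) to already be present under `root`, and it needs the `lmdb` package. A minimal sketch, with a hypothetical helper name and an arbitrary 256-pixel crop chosen only for illustration:

```python
def load_lsun(root, classes=("bedroom_train",)):
    """Build an LSUN dataset from LMDB folders that already exist under root."""
    # Imported lazily so that importing this file does not require torchvision.
    from torchvision import datasets, transforms

    preprocess = transforms.Compose([
        transforms.Resize(256),      # shorter side scaled to 256 pixels
        transforms.CenterCrop(256),  # square crop so images can be batched
        transforms.ToTensor(),
    ])
    return datasets.LSUN(root=root, classes=list(classes), transform=preprocess)
```

Each item is an `(image, label)` pair, where the label is the index of the scene category within the `classes` list that was passed in.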
For details, click: