Notes on the PyTorch official documentation for the DataLoader data iterator

This article describes the functionality and usage of DataLoader in PyTorch, including how its parameters control batching, shuffling, and multi-process loading, to help readers better understand and use DataLoader.


class torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, batch_sampler=None, num_workers=0, collate_fn=<function default_collate>, pin_memory=False, drop_last=False) [source]

Data loader. Combines a dataset and a sampler, and provides single- or multi-process iterators over the dataset.

Parameters:
  • dataset (Dataset) – dataset from which to load the data.
  • batch_size (int, optional) – how many samples per batch to load (default: 1).
  • shuffle (bool, optional) – set to True to have the data reshuffled at every epoch (default: False).
  • sampler (Sampler, optional) – defines the strategy to draw samples from the dataset. If specified, shuffle must be False.
  • batch_sampler (Sampler, optional) – like sampler, but returns a batch of indices at a time. Mutually exclusive with batch_size, shuffle, sampler, and drop_last.
  • num_workers (int, optional) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process (default: 0).
  • collate_fn (callable, optional) – merges a list of samples to form a mini-batch.
  • pin_memory (bool, optional) – If True, the data loader will copy tensors into CUDA pinned memory before returning them.
  • drop_last (bool, optional) – set to True to drop the last incomplete batch, if the dataset size is not divisible by the batch size. If False and the size of the dataset is not divisible by the batch size, then the last batch will be smaller (default: False).
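To make the parameters above concrete, here is a minimal sketch of constructing and iterating a DataLoader; the dataset contents and sizes are illustrative assumptions, not taken from the documentation itself.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# An assumed toy dataset: 10 samples of 3 features each, with integer labels.
features = torch.arange(30, dtype=torch.float32).reshape(10, 3)
labels = torch.arange(10)
dataset = TensorDataset(features, labels)

# batch_size=4 with drop_last=True keeps only the two full batches
# (10 // 4 == 2); the trailing incomplete batch of 2 samples is discarded.
# With shuffle=False the samples are drawn in order.
loader = DataLoader(dataset, batch_size=4, shuffle=False, drop_last=True)

batches = list(loader)
print(len(batches))         # 2 full batches
print(batches[0][0].shape)  # torch.Size([4, 3])
```

Setting shuffle=True instead would reshuffle the sample order at the start of every epoch, and num_workers > 0 would move loading into subprocesses; neither changes the batch shapes shown here.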