PyTorch Study Notes (Part 2)

Article link: https://blog.youkuaiyun.com/weixin_44680341/article/details/135054671

I. Data Loading

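PyTorch reads data through two main classes: Dataset and DataLoader.

To borrow the uploader's garbage-recycling example:

A Dataset assigns an index to every item and provides a way to fetch each data sample together with its label.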

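Dataset	DataLoader
Provides a way to fetch each sample and its label: how to get any single sample with its label, and how many samples there are in total	Serves the data to the downstream network in different forms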
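
The division of labor in the table can be sketched in plain Python. This is a toy stand-in and not PyTorch's implementation: ToyDataset and toy_loader are hypothetical names, and the real DataLoader additionally handles shuffling, collating samples into tensors, and worker processes.

```python
class ToyDataset:
    """Toy map-style dataset: knows how to fetch one item and how many items exist."""

    def __init__(self, samples, labels):
        self.samples = samples
        self.labels = labels

    def __getitem__(self, idx):
        # How to get a single sample together with its label.
        return self.samples[idx], self.labels[idx]

    def __len__(self):
        # How many samples there are in total.
        return len(self.samples)


def toy_loader(dataset, batch_size):
    """Yield lists of (sample, label) pairs, a stand-in for DataLoader batching."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield [dataset[i] for i in range(start, stop)]


data = ToyDataset(
    ["img0", "img1", "img2", "img3", "img4"],
    ["ant", "bee", "ant", "bee", "ant"],
)
batches = list(toy_loader(data, batch_size=2))
for batch in batches:
    print(batch)
```

The dataset only answers "give me item i" and "how many items"; everything about grouping items for the network lives in the loader.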

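To understand these two classes better, first look at a few common ways a dataset can be organized:

Training images stored together with their training labels

Labels stored in .txt files

Labels attached directly to the images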
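
How the label is recovered depends on the layout. The sketch below covers two of the cases under stated assumptions: labels kept in a .txt file that shares the image's file stem, and labels encoded in the image file name before the first underscore (one possible reading of "attached directly to the images"). The function names, directory names, and file names are all hypothetical.

```python
import os
import tempfile


def label_from_txt(image_path, label_dir):
    """Read the label from <label_dir>/<image stem>.txt (assumed layout)."""
    stem = os.path.splitext(os.path.basename(image_path))[0]
    with open(os.path.join(label_dir, stem + ".txt"), encoding="utf-8") as f:
        return f.read().strip()


def label_from_filename(image_path):
    """Take the label from the text before the first underscore (assumed naming)."""
    return os.path.basename(image_path).split("_", 1)[0]


# Tiny demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    image_dir = os.path.join(root, "images")
    label_dir = os.path.join(root, "labels")
    os.makedirs(image_dir)
    os.makedirs(label_dir)
    image_path = os.path.join(image_dir, "0013035.jpg")
    open(image_path, "w").close()  # empty placeholder, not a real image
    with open(os.path.join(label_dir, "0013035.txt"), "w", encoding="utf-8") as f:
        f.write("ants\n")
    txt_label = label_from_txt(image_path, label_dir)

name_label = label_from_filename("bees_0042.jpg")
print(txt_label, name_label)
```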

II. Practice

Binary classification of ants versus bees.

1. Before using a tool, check the official documentation to see how it is used

from torch.utils.data import Dataset

There are two ways to see how Dataset is used:
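
These notes do not name the two ways. Two common options, assumed here, are Python's built-in help(Dataset) and, in IPython or Jupyter, Dataset??; the listing that follows has the layout that ?? prints. As a rough programmatic equivalent, the sketch below uses the standard inspect module; describe is a hypothetical helper, and the demo runs on a standard-library class so that it works without PyTorch.

```python
import inspect


def describe(cls):
    """Return (docstring, source) for a class, roughly what help() and ?? display."""
    doc = inspect.getdoc(cls) or ""
    try:
        source = inspect.getsource(cls)
    except (OSError, TypeError):
        source = ""  # source unavailable, e.g. for classes implemented in C
    return doc, source


# Demo on a standard-library class.
doc, source = describe(inspect.Signature)
print(doc.splitlines()[0])

# With PyTorch installed, the same helper would be used like this:
#   from torch.utils.data import Dataset
#   doc, source = describe(Dataset)
```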

Init signature: Dataset(*args, **kwds)
Source:        
class Dataset(Generic[T_co]):
    r"""An abstract class representing a :class:`Dataset`.

    All datasets that represent a map from keys to data samples should subclass
    it. All subclasses should overwrite :meth:`__getitem__`, supporting fetching a
    data sample for a given key. Subclasses could also optionally overwrite
    :meth:`__len__`, which is expected to return the size of the dataset by many
    :class:`~torch.utils.data.Sampler` implementations and the default options
    of :class:`~torch.utils.data.DataLoader`.

    .. note::
      :class:`~torch.utils.data.DataLoader` by default constructs a index
      sampler that yields integral indices.  To make it work with a map-style
      dataset with non-integral indices/keys, a custom sampler must be provided.
    """

    def __getitem__(self, index) -> T_co:
        raise NotImplementedError

    def __add__(self, other: 'Dataset[T_co]') -> 'ConcatDataset[T_co]':
        return ConcatDataset([self, other])
File:           d:\work_app\anconda\envs\motionbert\lib\site-packages\torch\utils\data\dataset.py
Type:           type
Subclasses:     IterableDataset, TensorDataset, ConcatDataset, Subset, MapDataPipe

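The key is to override two methods (they are methods, not classes): __getitem__(), which is required, and __len__(), which is optional but expected by the default DataLoader settings. __add__() is already implemented in the base class and returns a ConcatDataset, so it normally does not need to be overridden.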
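
A minimal sketch of such a subclass for the ants/bees task, assuming one folder per class whose name is the label. FolderDataset and the directory layout are hypothetical, the item returned is the image path rather than a decoded image (a real version would load the image inside __getitem__), and the import is guarded only so that the sketch also runs where PyTorch is not installed.

```python
import os
import tempfile

try:
    from torch.utils.data import Dataset
except ImportError:
    Dataset = object  # fallback so the sketch still runs without PyTorch


class FolderDataset(Dataset):
    """Hypothetical map-style dataset: one folder of images, folder name = label."""

    def __init__(self, root_dir, label_dir):
        self.label_dir = label_dir
        self.path = os.path.join(root_dir, label_dir)
        # Sort so that index -> file is stable across runs.
        self.img_names = sorted(os.listdir(self.path))

    def __getitem__(self, idx):
        # Return the image path and its label; decoding the image is left out.
        img_path = os.path.join(self.path, self.img_names[idx])
        return img_path, self.label_dir

    def __len__(self):
        return len(self.img_names)


# Demo on a throwaway directory with two empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "ants"))
    for name in ("b.jpg", "a.jpg"):
        open(os.path.join(root, "ants", name), "w").close()
    ants = FolderDataset(root, "ants")
    n_samples = len(ants)
    first_path, first_label = ants[0]

print(n_samples, os.path.basename(first_path), first_label)
```

With PyTorch installed, two such datasets can be combined with +, because the inherited __add__ in the listing above returns a ConcatDataset.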

The paths below are relative. Note that on Windows each backslash in a path string must be doubled (\\) so that it is escaped.
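
The escaping rule can be checked directly in Python. The directory names here (dataset, train, ants) are hypothetical placeholders and are not taken from these notes; a raw string and os.path.join are shown as two alternatives to doubling the backslashes.

```python
import os

escaped = "dataset\\train\\ants"  # each backslash doubled
raw = r"dataset\train\ants"       # raw string: backslashes are kept literally
portable = os.path.join("dataset", "train", "ants")  # separator chosen by the OS

# Without escaping, "\t" is read as a tab character, so the path is silently wrong.
broken = "dataset\train"

print(escaped == raw)
print(repr(broken))
print(portable)
```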