Padding Images and Tensors in PyTorch

In PyTorch, both images and Tensors can be padded, for example with constant values, mirror (reflection) padding, or replication padding. During image preprocessing, border padding for images can be set up as follows:

```python
import torchvision.transforms as transforms

img_to_pad = transforms.Compose([
    transforms.Pad(padding=2, padding_mode='symmetric'),
    transforms.ToTensor(),
])
```
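As a quick check, the composed transform can be applied to a PIL image and the padded shape inspected. This is a minimal sketch: the blank 32×32 image created with Image.new is just a placeholder, not part of the original setup.

```python
from PIL import Image
import torchvision.transforms as transforms

img_to_pad = transforms.Compose([
    transforms.Pad(padding=2, padding_mode='symmetric'),
    transforms.ToTensor(),
])

img = Image.new('RGB', (32, 32))   # placeholder PIL image; any PIL image works
padded = img_to_pad(img)           # pad 2 pixels on every side, then convert to a Tensor
print(padded.shape)                # torch.Size([3, 36, 36])
```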

A Tensor can be padded as follows:

```python
import torch.nn.functional as F

# Expand the 2D feature map to 4D (N, C, H, W) so that 'replicate' mode is supported
feature = feature.unsqueeze(0).unsqueeze(0)
avg_feature = F.pad(feature, pad=[1, 1, 1, 1], mode='replicate')
```

One thing to note: transforms.Pad only pads images in PIL format, whereas F.pad pads Tensors. For the non-constant modes ('reflect' and 'replicate'), F.pad does not accept a plain 2D Tensor; it expects 3D, 4D, or 5D input, so a 2D feature map can first be expanded to a 4D Tensor with unsqueeze, as in the snippet above.
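The sketch below illustrates this with a made-up 4×4 feature map: constant padding accepts the 2D Tensor directly, while 'replicate' padding is applied after expanding it to 4D, mirroring the snippet above.

```python
import torch
import torch.nn.functional as F

feature = torch.arange(16, dtype=torch.float32).reshape(4, 4)  # toy 2D feature map

# Constant padding works for tensors of any dimension:
print(F.pad(feature, [1, 1, 1, 1], mode='constant', value=0).shape)  # torch.Size([6, 6])

# 'replicate' (and 'reflect') need higher-dimensional input, so expand to (N, C, H, W) first:
feature_4d = feature.unsqueeze(0).unsqueeze(0)                 # torch.Size([1, 1, 4, 4])
padded = F.pad(feature_4d, [1, 1, 1, 1], mode='replicate')
print(padded.shape)                                            # torch.Size([1, 1, 6, 6])
```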

Part of the source code of F.pad is shown below:

```python
@torch._jit_internal.weak_script
def pad(input, pad, mode='constant', value=0):
    # type: (Tensor, List[int], str, float) -> Tensor
    r"""Pads tensor.

    Padding size:
        The number of dimensions to pad is :math:`\left\lfloor\frac{\text{len(pad)}}{2}\right\rfloor`
        and the dimensions that get padded begins with the last dimension and moves forward.
        For example, to pad the last dimension of the input tensor, then `pad` has form
        `(padLeft, padRight)`; to pad the last 2 dimensions of the input tensor, then use
        `(padLeft, padRight, padTop, padBottom)`; to pad the last 3 dimensions, use
        `(padLeft, padRight, padTop, padBottom, padFront, padBack)`.

    Padding mode:
        See :class:`torch.nn.ConstantPad2d`, :class:`torch.nn.ReflectionPad2d`, and
        :class:`torch.nn.ReplicationPad2d` for concrete examples on how each of the
        padding modes works. Constant padding is implemented for arbitrary dimensions.
        Replicate padding is implemented for padding the last 3 dimensions of 5D input
        tensor, or the last 2 dimensions of 4D input tensor, or the last dimension of
        3D input tensor. Reflect padding is only implemented for padding the last 2
        dimensions of 4D input tensor, or the last dimension of 3D input tensor.

    .. include:: cuda_deterministic_backward.rst

    Args:
        input (Tensor): `Nd` tensor
        pad (tuple): m-elem tuple, where :math:`\frac{m}{2} \leq` input dimensions and :math:`m` is even.
        mode: 'constant', 'reflect' or 'replicate'. Default: 'constant'
        value: fill value for 'constant' padding. Default: 0

    Examples::

        >>> t4d = torch.empty(3, 3, 4, 2)
        >>> p1d = (1, 1) # pad last dim by 1 on each side
        >>> out = F.pad(t4d, p1d, "constant", 0)  # effectively zero padding
        >>> print(out.data.size())
        torch.Size([3, 3, 4, 4])
        >>> p2d = (1, 1, 2, 2) # pad last dim by (1, 1) and 2nd to last by (2, 2)
        >>> out = F.pad(t4d, p2d, "constant", 0)
        >>> print(out.data.size())
        torch.Size([3, 3, 8, 4])
        >>> t4d = torch.empty(3, 3, 4, 2)
        >>> p3d = (0, 1, 2, 1, 3, 3) # pad by (0, 1), (2, 1), and (3, 3)
        >>> out = F.pad(t4d, p3d, "constant", 0)
        >>> print(out.data.size())
        torch.Size([3, 9, 7, 3])

    """
    assert len(pad) % 2 == 0, 'Padding length must be divisible by 2'
    assert len(pad) // 2 <= input.dim(), 'Padding length too large'
    if mode == 'constant':
        ret = _VF.constant_pad_nd(input, pad, value)
    else:
        assert value == 0, 'Padding mode "{}"" doesn\'t take in value argument'.format(mode)
        if input.dim() == 3:
            assert len(pad) == 2, '3D tensors expect 2 values for padding'
            if mode == 'reflect':
                ret = torch._C._nn.reflection_pad1d(input, pad)
            elif mode == 'replicate':
                ret = torch._C._nn.replication_pad1d(input, pad)
            else:
                ret = input  # TODO: remove this when jit raise supports control flow
                raise NotImplementedError

        elif input.dim() == 4:
            assert len(pad) == 4, '4D tensors expect 4 values for padding'
            if mode == 'reflect':
                ret = torch._C._nn.reflection_pad2d(input, pad)
            elif mode == 'replicate':
                ret = torch._C._nn.replication_pad2d(input, pad)
            else:
                ret = input  # TODO: remove this when jit raise supports control flow
                raise NotImplementedError

        elif input.dim() == 5:
            assert len(pad) == 6, '5D tensors expect 6 values for padding'
            if mode == 'reflect':
                ret = input  # TODO: remove this when jit raise supports control flow
                raise NotImplementedError
            elif mode == 'replicate':
                ret = torch._C._nn.replication_pad3d(input, pad)
            else:
                ret = input  # TODO: remove this when jit raise supports control flow
                raise NotImplementedError
        else:
            ret = input  # TODO: remove this when jit raise supports control flow
            raise NotImplementedError("Only 3D, 4D, 5D padding with non-constant padding are supported for now")
    return ret
```
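To make the difference between the padding modes concrete, the small sketch below pads the same made-up 1×1×3×3 tensor with 'constant', 'reflect', and 'replicate' so the border values can be compared.

```python
import torch
import torch.nn.functional as F

x = torch.arange(1., 10.).reshape(1, 1, 3, 3)  # values 1..9 in a 3x3 map

# Border filled with the constant value 0:
print(F.pad(x, [1, 1, 1, 1], mode='constant', value=0)[0, 0])

# Border mirrors the interior values, excluding the edge row/column itself:
print(F.pad(x, [1, 1, 1, 1], mode='reflect')[0, 0])

# Border repeats the nearest edge value:
print(F.pad(x, [1, 1, 1, 1], mode='replicate')[0, 0])
```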

References:

1. Source code for torch.nn.functional, PyTorch documentation.
2. torch.nn.functional, PyTorch documentation.

### Normalizing or adjusting image sizes in PyTorch

In PyTorch, images often need to be resized or otherwise normalized so that the input data matches what a neural network expects. This can be done in several ways.

#### Preprocessing with `torchvision.transforms`

When image sizes are inconsistent, one of the most common approaches is to use the transforms provided by the `torchvision.transforms` module. For example, the Resize transform changes the height and width of an image, while CenterCrop or RandomResizedCrop cuts out a fixed-size region for further processing [^1].

```python
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((256, 256)),  # change the resolution to the given size
    transforms.CenterCrop(224),     # center-crop the required region
    transforms.ToTensor()           # convert to a tensor scaled to [0, 1]
])
```

#### Adjusting inside custom layers

Besides relying on an external library, a custom convolutional neural network (CNN) can include mechanisms of its own to cope with data at different scales, for example max pooling (MaxPool), which reduces the spatial dimensions of the feature maps without discarding too much important information [^2]:

```python
import torch.nn as nn

class CustomCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=7, stride=2)
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2)

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool(x)
        return x
```

This approach not only simplifies the preparation work up front, but also keeps the whole project more compact and consistent.

#### Dynamically adjusting the shapes of samples in a batch

When the individual samples in a batch have different original sizes, more flexible techniques such as padding or interpolation can be used to bring every element to the same shape before computation. The example below uses bilinear interpolation to achieve this:

```python
import torch
import torch.nn.functional as F

def adjust_batch_shape(batch_images, target_size=(224, 224)):
    adjusted_imgs = []
    for img in batch_images:
        resized_img = F.interpolate(img.unsqueeze(0), size=target_size,
                                    mode='bilinear', align_corners=False).squeeze()
        adjusted_imgs.append(resized_img)
    return torch.stack(adjusted_imgs)
```

This function takes a batch of images that have not yet been standardized and returns the rescaled versions, all with the same target size. A short usage sketch follows below.
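As a possible way to exercise the adjust_batch_shape helper defined above (the two image sizes here are arbitrary placeholders), it can be fed a list of differently sized 3-channel tensors and returns a single stacked batch:

```python
import torch

images = [torch.rand(3, 200, 180), torch.rand(3, 300, 240)]  # two images of different sizes
batch = adjust_batch_shape(images)                            # resize both to 224x224
print(batch.shape)                                            # torch.Size([2, 3, 224, 224])
```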