日常学习,给自己挖坑,and造轮子
class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
如下是MaxPool2d的解释:
class MaxPool2d(_MaxPoolNd):
r"""Applies a 2D max pooling over an input signal composed of several input
planes.
In the simplest case, the output value of the layer with input size :math:`(N, C, H, W)`,output :math:`(N, C, H_{out}, W_{out})` and :attr:`kernel_size` :math:`(kH, kW)`can be precisely described as:
.. math::
\begin{aligned}
out(N_i, C_j, h, w) ={} & \max_{m=0, \ldots, kH-1} \max_{n=0, \ldots, kW-1} \\
& \text{input}(N_i, C_j, \text{stride[0]} \times h + m,
\text{stride[1]} \times w + n)
\end{aligned}
If :attr:`padding` is non-zero, then the input is implicitly zero-padded on both sides
for :attr:`padding` number of points. :attr:`dilation` controls the spacing between the kernel points.
It is harder to describe, but this `link`_ has a nice visualization of what :attr:`dilation` does.
The parameters :attr:`kernel_size`, :attr:`stride`, :attr:`padding`, :attr:`dilation` can either be:
- a single ``int`` -- in which case the same value is used for the height and width dimension
- a ``tuple`` of two ints -- in which case, the first `int` is used for the height dimension, and the second `int` for the width dimension
Args:
kernel_size: the size of the window to take a max over
stride: the stride of the window. Default value is :attr:`kernel_size`
padding: implicit zero padding to be added on both sides
dilation: a parameter that controls the stride of elements in the window
return_indices: if ``True``, will return the max indices along with the outputs.
Useful for :class:`torch.nn.MaxUnpool2d` later
ceil_mode: when True, will use `ceil` instead of `floor` to compute the output shape
Shape:
- Input: :math:`(N, C, H_{in}, W_{in})`
- Output: :math:`(N, C, H_{out}, W_{out})`, where
.. math::
H_{out} = \left\lfloor\frac{H_{in} + 2 * \text{padding[0]} - \text{dilation[0]}
\times (\text{kernel\_size[0]} - 1) - 1}{\text{stride[0]}} + 1\right\rfloor
.. math::
W_{out} = \left\lfloor\frac{W_{in} + 2 * \text{padding[1]} - \text{dilation[1]}
\times (\text{kernel\_size[1]} - 1) - 1}{\text{stride[1]}} + 1\right\rfloor
Examples::
>>> # pool of square window of size=3, stride=2
>>> m = nn.MaxPool2d(3, stride=2)
>>> # pool of non-square window
>>> m = nn.MaxPool2d((3, 2), stride=(2, 1))
>>> input = torch.randn(20, 16, 50, 32)
>>> output = m(input)
.. _link:
https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md
"""
def forward(self, input):
return F.max_pool2d(input, self.kernel_size, self.stride,
self.padding, self.dilation, self.ceil_mode,
self.return_indices)
大致解释为:
在由多个输入通道组成的输入信号上应用2D max池。
在最简单的情况下,具有输入大小的层的输出值:(N, C, H, W),
输出:(N, C, H_{out}, W_{out}), kernel_size,和 (kH, kW)可以准确地描述为:
out(N_i, C_j, h, w) ={} & \max_{m=0, \ldots, kH-1} \max_{n=0, \ldots, kW-1} \\
& \text{input}(N_i, C_j, \text{stride[0]} \times h + m,
\text{stride[1]} \times w + n)
如果 padding 非零,那么输入的两边都隐式地被0填充;dilation用于控制内核点之间的距离
但这里很好地展示了 diagration 的作用。
这些参数:kernel_size,stride,padding, ,dilation 可以为:
-单个int值–在这种情况下,高度和宽度标注使用相同的值
-两个整数组成的数组——在这种情况下,第一个int用于高度维度,第二个int表示宽度
参数:
kernel_size(int or tuple) :max pooling的窗口大小
stride(int or tuple, optional):max pooling的窗口移动的步长。默认值是kernel_size
padding(int or tuple, optional) :输入的每一条边补充0的层数
dilation(int or tuple, optional):一个控制窗口中元素步幅的参数
return_indices :如果等于True,会返回输出最大值的序号,对于上采样操作会有帮助
ceil_mode :如果等于True,计算输出信号大小的时候,会使用向上取整,代替默认的向下取整的操作
Shape:
- Input: :math:`(N, C, H_{in}, W_{in})`
- Output: :math:`(N, C, H_{out}, W_{out})`, where
.. math::
H_{out} = \left\lfloor\frac{H_{in} + 2 * \text{padding[0]} - \text{dilation[0]}
\times (\text{kernel\_size[0]} - 1) - 1}{\text{stride[0]}} + 1\right\rfloor
.. math::
W_{out} = \left\lfloor\frac{W_{in} + 2 * \text{padding[1]} - \text{dilation[1]}
\times (\text{kernel\_size[1]} - 1) - 1}{\text{stride[1]}} + 1\right\rfloor
Examples::
>>> # pool of square window of size=3, stride=2
>>> m = nn.MaxPool2d(3, stride=2)
>>> # pool of non-square window
>>> m = nn.MaxPool2d((3, 2), stride=(2, 1))
>>> input = torch.randn(20, 16, 50, 32)
>>> output = m(input)
理解PyTorch中的MaxPool2d操作
本文详细介绍了PyTorch中MaxPool2d模块的工作原理,包括其在2D输入上的作用方式,参数如kernel_size、stride、padding和dilation的影响,以及如何计算输出尺寸。通过实例展示了不同参数设置下的池化操作,并强调了在需要最大值索引时return_indices参数的重要性。
1万+

被折叠的 条评论
为什么被折叠?



