0. Padding calculation for PyTorch convolution layers

HRex39

Last modified: 2022-07-03 17:43:53

Category: DQN. Tags: pytorch, python

First published: 2022-04-05 16:27:40

Article link: https://blog.youkuaiyun.com/weixin_47047999/article/details/123970718

Contents

Introduction
Reference

Introduction

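I have recently been building a simple DQN network in PyTorch, in which the input images go through convolution and pooling layers.
My impression is that PyTorch is a library that expects the programmer to understand exactly what happens to the data at every step, and the GitHub repository also shows how much its developers care about performance. This point comes up again below.
First, let's look at the interface in the official PyTorch documentation.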

The official PyTorch interface

torch.nn documentation
Open the docs and go to the Convolution Layers section.

class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)

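in_channels: the number of input channels. For example, with 4 stacked image frames, in_channels=4.
out_channels: the number of output channels. For example, to produce 32 feature maps, out_channels=32.
kernel_size: an int or a tuple. An int gives a square kernel; a tuple gives a rectangular kernel (height, width).
stride: the step size of the kernel.
padding: controls the amount of padding applied to the input. It can be either a string {'valid', 'same'} or a tuple of ints giving the amount of implicit padding applied on both sides.
Well, since the official docs say so, that is how I wrote it.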

import numpy as np
import torch
import torch.nn as nn

input = np.random.rand(80,80,4) # 4 frames of 80*80 images, shape (80, 80, 4)
input = input.transpose(2,0,1) # transpose to (4, 80, 80) so the channels come first
stateinput = input[None,:] # add a batch dimension: (1, 4, 80, 80)
x = torch.from_numpy(stateinput).to(torch.float32) # convert the numpy array to a float32 torch tensor

conv1 = nn.Conv2d(in_channels=4, out_channels=32, kernel_size=(8,8), stride=4, padding='same')
out = conv1(x)
print(out.shape)

So far every step follows the official docs, but if you run this, an error is raised when the Conv2d is constructed.

ValueError: padding='same' is not supported for strided convolutions

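[Something is off here!]
Searching the PyTorch issues turned up the cause:
PyTorch issue #67551
It seems that, for performance reasons, PyTorch expects users to compute the padding value themselves for strided convolutions. Fine, that is a learning opportunity too.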
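The restriction can be checked with a short sketch, assuming a PyTorch version that accepts string values for padding. The stride-1 layer with kernel_size=3 is an arbitrary choice made for this illustration and is not part of the DQN network; the strided layer uses the same arguments as the failing code above.

```python
import torch
import torch.nn as nn

x = torch.rand(1, 4, 80, 80)  # one sample, 4 channels, 80*80 pixels

# With stride=1, padding='same' is accepted and the spatial size is preserved.
# kernel_size=3 is an arbitrary choice for this illustration.
conv_ok = nn.Conv2d(in_channels=4, out_channels=32, kernel_size=3, stride=1, padding='same')
out_shape = tuple(conv_ok(x).shape)
print(out_shape)  # (1, 32, 80, 80)

# With stride=4, the constructor itself raises, before any data is passed in.
strided_error = None
try:
    nn.Conv2d(in_channels=4, out_channels=32, kernel_size=(8, 8), stride=4, padding='same')
except ValueError as err:
    strided_error = err
print(strided_error)
```

So the string form is only usable when the stride is 1; for a strided layer the padding has to be given as a number.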

Computing the padding

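In practice there are two padding modes, "SAME" and "VALID". With padding = "SAME", zeros are padded around the image; with padding = "VALID" no padding is added, i.e. P = 0.
"SAME" is the usual choice, first to slow down how quickly the image shrinks, and second to avoid losing information at the borders (otherwise pixels near the image border contribute less to the output).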
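When padding = "SAME", the output size is:
$N=\frac{W-F+2P}{S}+1$
where:
N is the output size
W is the input image size
F is the kernel size
S is the stride
P is the padding, which is the value we need to compute
When $W-F+2P$ is not divisible by $S$, the quotient is rounded down.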
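The formula can be sketched as a small helper. The name `conv_output_size` is made up for this illustration; it assumes dilation 1 and the same padding P on both sides, and it rounds the division down, as the output-shape formula in the Conv2d documentation does.

```python
def conv_output_size(w: int, f: int, s: int, p: int) -> int:
    """Output size N along one dimension for input w, kernel f, stride s, padding p.

    Hypothetical helper for illustration; assumes dilation=1 and symmetric padding.
    """
    return (w - f + 2 * p) // s + 1


# 80-pixel input, 8-pixel kernel, stride 4: no padding (P=0) versus P=2.
print(conv_output_size(80, 8, 4, 0))  # 19
print(conv_output_size(80, 8, 4, 2))  # 20
```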

Example

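All other variables are known, and we want to solve for P.
The desired output is a 20*20 image, N=20;
the input is an 80*80 image, W=80;
the kernel is 8*8, F=8;
the stride is 4, S=4.
$N=\frac{W-F+2P}{S}+1$
$20=\frac{80-8+2P}{4}+1$
$19 \cdot 4 = 72 + 2P$
Solving gives $P = 2$.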
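The same rearrangement can be written as code. This is only a sketch: `solve_padding` is a hypothetical helper, not a PyTorch function. It solves the equation exactly (no rounding) and assumes dilation 1, so it rejects cases where the required total padding is negative or odd and cannot be split evenly between the two sides.

```python
def solve_padding(w: int, f: int, s: int, n: int) -> int:
    """Solve N = (W - F + 2P) / S + 1 for P (hypothetical helper, dilation=1)."""
    total = (n - 1) * s + f - w  # this is 2P
    if total < 0:
        raise ValueError("requested output size cannot be reached by adding padding")
    if total % 2 != 0:
        raise ValueError("total padding is odd; it cannot be split evenly on both sides")
    return total // 2


# The example above: W=80, F=8, S=4, N=20.
print(solve_padding(w=80, f=8, s=4, n=20))  # 2
```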

Verification

import numpy as np
import torch
import torch.nn as nn

input = np.random.rand(80,80,4) # 4 frames of 80*80 images, shape (80, 80, 4)
input = input.transpose(2,0,1) # transpose to (4, 80, 80) so the channels come first
stateinput = input[None,:] # add a batch dimension: (1, 4, 80, 80)
x = torch.from_numpy(stateinput).to(torch.float32) # convert the numpy array to a float32 torch tensor

conv1 = nn.Conv2d(in_channels=4, out_channels=32, kernel_size=(8,8), stride=4, padding=2)
out = conv1(x)
print(out.shape) # torch.Size([1, 32, 20, 20])