This project can be run online on Baidu AI Studio.
Transposed convolution goes by several names:
- Transposed convolution
- Deconvolution
- Fractionally strided convolution
- Backwards convolution
- Sub-pixel convolution
In general, transposed convolution ≈ deconvolution ≈ fractionally strided convolution ≈ backwards convolution ≈ sub-pixel convolution; the names all refer to essentially the same operation.
Computation Principle
- In deep-learning frameworks, the stride, padding, and dilation set on a transposed-convolution layer are not the parameters of the transposed convolution as it is actually carried out; they are the parameters of the corresponding forward convolution. The output of a transposed-convolution layer can therefore be obtained by inverting the convolution formula; see the next section.
The Output Formula of Transposed Convolution
Paddle API: https://www.paddlepaddle.org.cn/documentation/docs/zh/api_cn/dygraph_cn/Conv2DTranspose_cn.html#conv2dtranspose
Supplement: the formula used by PyTorch's transposed convolution
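For reference, the output-size relation documented for PyTorch's ConvTranspose2d (Paddle's Conv2DTranspose documents essentially the same relation, with output_size taking the place of output_padding) is:

Hout = (Hin - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1

For example, a 2x2 input with kernel_size=3, stride=1, padding=0 (the Paddle example in the code-verification section) gives Hout = (2 - 1) * 1 - 0 + 1 * (3 - 1) + 0 + 1 = 4.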
Understanding and Deriving the Formula
Understanding
- Running an input feature map Hin through a convolution layer can be rewritten as a matrix product between the flattened input and a convolution matrix, giving the output feature map Hout; for a worked example of the relation between convolution and deconvolution, see: 抽丝剥茧,带你理解转置卷积(反卷积)
- The kernel size, the input feature map, and the convolution parameters (s, p, d, etc.) uniquely determine the convolution matrix C.
- In matrix form (with the feature maps flattened into column vectors), Hout = C · Hin. Multiplying both sides by Cᵀ gives Cᵀ · Hout = Cᵀ · C · Hin; since Cᵀ is in general not the inverse of C, Cᵀ · Hout matches Hin only in shape, not in value (Cᵀ is the transpose of the convolution matrix C). A small NumPy sketch follows this list.
- In general, a transposed convolution therefore only restores the shape of the input feature map, not its values. (I have not fully worked this through yet.)
- The parameters configured for a transposed-convolution layer in the Paddle framework are those of the corresponding forward convolution, not of the operation the transposed convolution actually performs; they can, however, be plugged into the convolution formula to work backwards to the transposed convolution's output.
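A minimal NumPy sketch of this matrix view (an illustrative example with assumed sizes, not framework code): it builds the convolution matrix C for a 4x4 input and a 3x3 kernel with stride 1 and no padding, then checks that multiplying the flattened output by Cᵀ gives back a tensor with the input's shape but not its values.

import numpy as np

# Convolution as a matrix product: a 3x3 kernel over a 4x4 input,
# stride 1, no padding, produces a 2x2 output.
kernel = np.arange(1, 10, dtype=np.float64).reshape(3, 3)
H_in, k = 4, 3
H_out = H_in - k + 1                       # 2

# Convolution matrix C, shape (H_out*H_out, H_in*H_in): each row is the kernel
# scattered over the input positions it covers for one output location.
C = np.zeros((H_out * H_out, H_in * H_in))
for i in range(H_out):
    for j in range(H_out):
        patch = np.zeros((H_in, H_in))
        patch[i:i + k, j:j + k] = kernel
        C[i * H_out + j] = patch.ravel()

x = np.random.randn(H_in * H_in)           # flattened input feature map Hin
y = C @ x                                  # forward convolution -> flattened Hout
x_rec = C.T @ y                            # "transposed convolution" -> same size as Hin

print(y.reshape(H_out, H_out).shape)       # (2, 2)
print(x_rec.reshape(H_in, H_in).shape)     # (4, 4): the input's shape is recovered
print(np.allclose(x_rec, x))               # False: the input's values are not

This also bears on the last two items in the list below: C is fixed entirely by the kernel, the convolution parameters, and the input shape, and CᵀC is not the identity, so only the shape round-trips.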
Still to Understand
- Besides the matrix view, how is the transposed convolution carried out and visualized directly on the feature map?
- The convolution matrix is determined by the convolution parameters and the shape of the feature map it acts on; how can this be verified?
- Prove that the feature map recovered by the transpose matches the original only in shape, not in values.
Formula Derivation
Main references
- Inverting the formula directly: 卷积与转置卷积——输出特征图边长公式推导
- A worked example of the relation between convolution and deconvolution: 抽丝剥茧,带你理解转置卷积(反卷积)
- A translation and discussion of the original papers, with a one-dimensional convolution example: PyTorch中的转置卷积详解——全网最细
- The paper PyTorch's documentation refers to:
- Is the deconvolution layer the same as a convolutional layer? A note on Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network
Code Verification
Verifying the Paddle transposed-convolution API
import numpy as np
import paddle.fluid as fluid

'''
Conv2DTranspose(
    num_channels, num_filters,
    filter_size, output_size=None,
    padding=0, stride=1, dilation=1, groups=None, param_attr=None,
    bias_attr=None, use_cudnn=True, act=None, dtype="float32")
'''
device = fluid.CPUPlace()  # run in dygraph (dynamic graph) mode on CPU
with fluid.dygraph.guard(device):
    data = np.array([[1, 2],
                     [3, 4]]).astype('float32')
    data = data[np.newaxis, np.newaxis, :, :]   # NCHW: (1, 1, 2, 2)
    data = fluid.dygraph.base.to_variable(data)
    print("input_feature_tensor:", data)
    # param_attr = fluid.ParamAttr(name="param",
    #                              initializer=fluid.initializer.Constant(1.0))
    conv2DTranspose = fluid.dygraph.Conv2DTranspose(
        num_channels=1, num_filters=1,
        padding=0,
        stride=1,
        filter_size=3)  # , param_attr=param_attr
    ret = conv2DTranspose(data)                  # (1, 1, 2, 2) -> (1, 1, 4, 4)
    print("out_feature_tensor:", ret)
    print("out_feature_numpy:\n", ret.numpy())
    print(conv2DTranspose.weight.numpy())
    conv = fluid.dygraph.Conv2D(
        num_channels=1, num_filters=1,
        padding=0,
        stride=1,
        filter_size=3
    )  # , param_attr=param_attr
    conv_ret = conv(ret)                         # (1, 1, 4, 4) -> (1, 1, 2, 2)
    print(conv_ret)
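With these settings the 2x2 input is upsampled to a 4x4 feature map ((2 - 1) * 1 - 2 * 0 + (3 - 1) + 1 = 4), and the Conv2D that follows, using the same kernel size, stride, and padding, maps it back to 2x2. Both layers are randomly initialized here, so only the shape round-trips, not the values.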
PyTorch transposed-convolution API
https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html
class transposed_conv():
    def pytorch_convTranspose2d(self):
        """
        torch.nn.ConvTranspose2d(
            in_channels: int,
            out_channels: int,
            kernel_size: Union[T, Tuple[T, T]],
            stride: Union[T, Tuple[T, T]] = 1,
            padding: Union[T, Tuple[T, T]] = 0,
            output_padding: Union[T, Tuple[T, T]] = 0,
            groups: int = 1,
            bias: bool = True,
            dilation: int = 1,
            padding_mode: str = 'zeros')

        Hout = (Hin - 1) * stride[0] - 2 * padding[0]
               + dilation[0] * (kernel_size[0] - 1)
               + output_padding[0] + 1
        """
        import torch
        import torch.nn as nn
        # NCHW tensor filled with samples from the standard normal distribution
        # (mean 0, variance 1, i.e. Gaussian white noise)
        input = torch.randn(20, 16, 50, 100)
        # With square kernels and equal stride
        m = nn.ConvTranspose2d(16, 33, 3, stride=2)
        output = m(input)
        print("ConvTranspose2d(16, 33, kernel=3, stride=2)", output.size())
        # torch.Size([20, 33, 101, 201])
        # 101 = (50 - 1) * 2 - 2 * 0 + 1 * (3 - 1) + 0 + 1 = 98 + 2 + 1
        # 201 = (100 - 1) * 2 - 2 * 0 + 1 * (3 - 1) + 0 + 1 = 198 + 2 + 1
        # Non-square kernels, unequal stride, and padding
        m = nn.ConvTranspose2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
        output = m(input)
        print("ConvTranspose2d(16, 33, kernel=(3, 5), stride=(2, 1), padding=(4, 2))", output.size())
        # torch.Size([20, 33, 93, 100])
        # 93  = (50 - 1) * 2 - 2 * 4 + 1 * (3 - 1) + 0 + 1
        # 100 = (100 - 1) * 1 - 2 * 2 + 1 * (5 - 1) + 0 + 1
        input = torch.randn(1, 16, 12, 12)
        print("input.size", input.size())
        downsample = nn.Conv2d(16, 16, 3, stride=2, padding=1)
        upsample = nn.ConvTranspose2d(16, 16, 3, stride=2, padding=1)
        h = downsample(input)
        print("downsample:", h.size(), " result from: Conv2d(16, 16, 3, stride=2, padding=1)")
        # downsample: torch.Size([1, 16, 6, 6])
        output = upsample(input)
        print("upsample:", output.size(), " result from: ConvTranspose2d(16, 16, 3, stride=2, padding=1)")
        # upsample: torch.Size([1, 16, 23, 23])
        # 23 = (12 - 1) * 2 - 2 * 1 + 1 * (3 - 1) + 0 + 1


if __name__ == '__main__':
    trans_conv = transposed_conv()
    trans_conv.pytorch_convTranspose2d()
    # trans_conv.tf_nn_Conv2DTranspose()
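One detail worth noting about the downsample/upsample pair above: several input sizes (here both 11x11 and 12x12) map to the same 6x6 output, so the transposed convolution cannot know which size to restore. The PyTorch docs example resolves this by passing output_size to the layer's forward call; a short sketch:

import torch
import torch.nn as nn

x = torch.randn(1, 16, 12, 12)
downsample = nn.Conv2d(16, 16, 3, stride=2, padding=1)
upsample = nn.ConvTranspose2d(16, 16, 3, stride=2, padding=1)

h = downsample(x)                       # torch.Size([1, 16, 6, 6])
y = upsample(h, output_size=x.size())   # torch.Size([1, 16, 12, 12]) instead of 11x11
print(h.size(), y.size())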
TensorFlow transposed-convolution API
https://blog.youkuaiyun.com/mao_xiao_feng/article/details/71713358
    # The second method of the transposed_conv class above (TF 1.x session-style API)
    def tf_nn_Conv2DTranspose(self):
        '''
        tf.nn.conv2d_transpose(
            input,
            filters,
            output_shape,
            strides,
            padding='SAME',
            data_format='NHWC',
            dilations=None,
            name=None
        )
        padding: one of "VALID" or "SAME" (case-insensitive). "VALID" means no padding;
        "SAME" pads evenly left/right and up/down so that, for the forward convolution,
        the output keeps the input's height/width (up to the stride).
        https://zhuanlan.zhihu.com/p/31988761

        output_shape must be consistent with the forward convolution, i.e.
        if padding == 'SAME':
            input_size == ceil(output_size / stride)
        elif padding == 'VALID':
            input_size == ceil((output_size - kernel + 1) / stride)

        [code from] https://blog.youkuaiyun.com/mao_xiao_feng/article/details/71713358
        '''
        import tensorflow as tf
        x1 = tf.constant(1.0, shape=[1, 3, 3, 1])
        x2 = tf.constant(1.0, shape=[1, 6, 6, 3])
        x3 = tf.constant(1.0, shape=[1, 5, 5, 3])
        # [h, w, out_channels, in_channels] for conv2d_transpose;
        # the same tensor serves as [h, w, in_channels, out_channels] for conv2d.
        kernel = tf.constant(1.0, shape=[3, 3, 3, 1])
        y1 = tf.nn.conv2d_transpose(x1, kernel, output_shape=[1, 6, 6, 3],
                                    data_format='NHWC',  # default
                                    # strides format: [1, stride, stride, 1]
                                    strides=[1, 2, 2, 1],
                                    padding="SAME")
        y2 = tf.nn.conv2d(x3, kernel, strides=[1, 2, 2, 1], padding="SAME")
        y3 = tf.nn.conv2d_transpose(y2, kernel, output_shape=[1, 5, 5, 3],
                                    strides=[1, 2, 2, 1], padding="SAME")
        y4 = tf.nn.conv2d(x2, kernel, strides=[1, 2, 2, 1], padding="SAME")
        '''
        Wrong!! This is impossible: ceil(10 / 2) = 5 != 3, so a 3x3 input cannot
        produce a 10x10 output with stride 2 and SAME padding (only 5x5 or 6x6 can).
        y5 = tf.nn.conv2d_transpose(x1, kernel, output_shape=[1, 10, 10, 3], strides=[1, 2, 2, 1], padding="SAME")
        '''
        sess = tf.Session()
        tf.global_variables_initializer().run(session=sess)
        x1_run, x1_decov, x3_cov, y2_decov, x2_cov = sess.run([x1, y1, y2, y3, y4])
        print(x1_run)
        '''
        [[[[1.]
           [1.]
           [1.]]
          [[1.]
           [1.]
           [1.]]
          [[1.]
           [1.]
           [1.]]]]
        '''
        print(x1_decov.shape)  # (1, 3, 3, 1) -> (1, 6, 6, 3)
        print(x3_cov.shape)    # (1, 5, 5, 3) -> (1, 3, 3, 1)
        print(y2_decov.shape)  # (1, 3, 3, 1) -> (1, 5, 5, 3)
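To make the "impossible" case above concrete, here is a small hypothetical helper (not part of TensorFlow) that applies the compatibility rule from the docstring: an output_shape is valid only if the forward convolution with the same stride and padding would map it back to the input size.

import math

def transpose_output_ok(input_size, output_size, kernel, stride, padding):
    """Check whether conv2d_transpose can produce output_size from input_size."""
    if padding == 'SAME':
        return math.ceil(output_size / stride) == input_size
    return math.ceil((output_size - kernel + 1) / stride) == input_size

print(transpose_output_ok(3, 6, 3, 2, 'SAME'))    # True:  the y1 case (3x3 -> 6x6)
print(transpose_output_ok(3, 10, 3, 2, 'SAME'))   # False: the impossible y5 case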
tf.keras transposed-convolution API
Official API: https://tensorflow.google.cn/api_docs/python/tf/keras/layers/Conv2DTranspose
Example application: https://tensorflow.google.cn/tutorials/generative/dcgan
import tensorflow as tf
from tensorflow.keras import layers, initializers
from tensorflow.keras.layers import Conv2DTranspose
'''
API from https://tensorflow.google.cn/api_docs/python/tf/keras/layers/Conv2DTranspose
The generator uses tf.keras.layers.Conv2DTranspose (upsampling) layers to produce
an image from a seed (random noise). It starts with a Dense layer that takes the
seed as input, then upsamples repeatedly until the desired 28x28x1 image size is reached.
'''
# Example 1 (FCN-style upsampling head; x, nb_classes and bilinear_upsample_weights
# are defined elsewhere in the original example)
x = Conv2DTranspose(filters=nb_classes,
                    kernel_size=(64, 64),
                    strides=(32, 32),
                    padding='same',
                    activation='sigmoid',
                    kernel_initializer=initializers.Constant(bilinear_upsample_weights(32, nb_classes)))(x)

# Example 2 (DCGAN generator from the TensorFlow tutorial)
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7 * 7 * 256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # note: the batch size is unconstrained

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2),
                                     padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model
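With padding='same' and no explicit output_padding, Keras chooses the output padding so that each Conv2DTranspose simply multiplies the spatial size by its stride: 7x7 stays 7x7 under stride (1, 1), then grows to 14x14 and 28x28 under stride (2, 2), which is exactly what the asserts above check.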