Convolution (Conv)

yore0531

Last modified 2023-07-29 14:49:59

Tags: deep learning, neural networks, cnn

First published 2023-06-07 21:06:50

Article link: https://blog.youkuaiyun.com/yore0531/article/details/131096111

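Output H and W after convolution:

$H_{out} = \left\lfloor \frac{H_{in} - k + 2\times p}{stride} \right\rfloor + 1$ , and likewise $W_{out} = \left\lfloor \frac{W_{in} - k + 2\times p}{stride} \right\rfloor + 1$ . If the quotient has a fractional part, it is rounded down.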
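As a quick sanity check on the formula, here is a minimal Python sketch. The helper name `conv_output_size` is made up for illustration, and it assumes a dilation of 1, which is the case the formula above covers.

```python
def conv_output_size(size_in: int, k: int, p: int = 0, stride: int = 1) -> int:
    """Output height or width of a convolution, assuming dilation = 1."""
    # Floor division implements the "round down" rule.
    return (size_in - k + 2 * p) // stride + 1


# 10x10 input, 3x3 kernel, no padding, stride 1
print(conv_output_size(10, k=3, p=0, stride=1))   # 8
# 224x224 input, 7x7 kernel, padding 3, stride 2 (the quotient 111.5 is rounded down)
print(conv_output_size(224, k=7, p=3, stride=2))  # 112
```

The same helper works for H and W separately when the kernel, padding, or stride differ per axis.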

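Parameter: groups

Ungrouped:

Input image shape: $H\times W \times c_1$ , target output shape: $H\times W \times c_2$ , ungrouped kernel shape: $h\times w \times c_1$ . One kernel is needed per output channel, so the parameter count (ignoring bias) is: $c_2 \times (h \times w \times c_1)$

Grouped:

Setting groups=g splits the input feature map along the channel axis into g groups, so each group has shape $H\times W \times \frac{c_1}{g}$ , the corresponding kernels have shape $h\times w \times \frac{c_1}{g}$ , and each group produces an output feature map of shape $H\times W \times \frac{c_2}{g}$ .

Finally, the g output feature maps are concatenated to give the $H\times W \times c_2$ output.

Parameters required: $c_2 \times h \times w \times \frac{c_1}{g}$

A grouped convolution therefore has $\frac{1}{g}$ of the parameters of a standard convolution.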
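The two parameter counts can be compared with a few lines of Python. This is only a sketch of the formulas above: `conv_param_count` is a hypothetical helper, bias terms are ignored, and both channel counts are assumed to be divisible by the number of groups.

```python
def conv_param_count(c1: int, c2: int, h: int, w: int, groups: int = 1) -> int:
    """Weight count of a conv layer (bias ignored): c2 kernels of shape h x w x (c1 / groups)."""
    if c1 % groups or c2 % groups:
        raise ValueError("c1 and c2 must both be divisible by groups")
    return c2 * h * w * (c1 // groups)


standard = conv_param_count(64, 128, 3, 3)           # ungrouped
grouped = conv_param_count(64, 128, 3, 3, groups=4)  # groups = 4
print(standard, grouped, standard // grouped)        # 73728 18432 4
```

With g = 4 the grouped layer holds exactly a quarter of the weights, matching the $\frac{1}{g}$ ratio.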

Depthwise Convolution

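When groups = in_channel = out_channel = $c_1$ , each feature map (input channel) is paired with exactly one kernel, so the parameter count is $c_1 \times h \times w \times \frac{c_1}{c_1} = c_1 \times h \times w$ , which reduces the parameters even further.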
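In PyTorch this corresponds to passing `groups` equal to the channel count to `nn.Conv2d`. The sketch below picks arbitrary sizes (32 channels, 3x3 kernel) and drops the bias so that only the kernel weights are counted.

```python
import torch
from torch import nn

c1 = 32
# groups == in_channels == out_channels makes this a depthwise convolution
depthwise = nn.Conv2d(c1, c1, kernel_size=3, padding=1, groups=c1, bias=False)
standard = nn.Conv2d(c1, c1, kernel_size=3, padding=1, bias=False)

print(depthwise.weight.shape)    # torch.Size([32, 1, 3, 3])
print(depthwise.weight.numel())  # 288  = c1 * h * w
print(standard.weight.numel())   # 9216 = c1 * h * w * c1

x = torch.rand(1, c1, 16, 16)
print(depthwise(x).shape)        # torch.Size([1, 32, 16, 16])
```

Each kernel sees a single input channel, which is why the second dimension of the weight tensor is 1.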

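Deformable Convolution

CNNs are limited in modeling unknown geometric transformations, because CNN modules have a fixed geometric structure, i.e. the receptive field is fixed. As a result they perform poorly on tasks that require precise localization, such as segmentation. Adding learnable offsets to the convolution, so that the kernel's sampling points shift across the feature map, lets the network learn ROI features better.

(a) is the usual 3x3 kernel; (b) is deformable conv, where adding the offsets changes the sampling points; (c) and (d) are special cases of deformable conv.

The green box is the original convolution window. Deformable conv can be viewed as two branches: branch 1 learns the offsets with an extra conv (HxWx2N, where 2N means each of the N kernel sampling points gets an offset in both the x and y directions), and the resulting offsets are fed together with the feature map into branch 2 (which is equivalent to convolving the feature map inside the blue box).

Note: the offsets are learned for each position in the feature map, not for the kernel weights!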

Forward call of torchvision.ops.DeformConv2d: forward(input: Tensor, offset: Tensor, mask: Optional[Tensor] = None)

input (Tensor[batch_size, in_channels, in_height, in_width]): input tensor
offset (Tensor[batch_size, 2 * offset_groups * kernel_height * kernel_width, out_height, out_width]): offsets to be applied for each position in the convolution kernel.
mask (Tensor[batch_size, offset_groups * kernel_height * kernel_width, out_height, out_width]): masks to be applied for each position in the convolution kernel.

import torch
from torchvision.ops import DeformConv2d, deform_conv2d

# Module form: define it first in __init__
# (dim, kernel_size and deform_groups come from the surrounding model):
# self.deform_conv = DeformConv2d(dim, dim, kernel_size, padding=2, groups=deform_groups)

# Functional form: the weight is passed in explicitly.
input = torch.rand(4, 3, 10, 10)
kh, kw = 3, 3
weight = torch.rand(5, 3, kh, kw)

# offset and mask should have the same spatial size as the output of the convolution.
# if input h,w = 10, k=3, s=1, p=0 -> output h,w = 8
offset = torch.rand(4, 2 * kh * kw, 8, 8)
mask = torch.rand(4, kh * kw, 8, 8)
out = deform_conv2d(input, offset, weight, mask=mask)
print(out.shape)
# returns torch.Size([4, 5, 8, 8])

Functional implementation of Conv2d: torch.nn.functional.conv2d

torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)

Parameters:

input – input tensor of shape (minibatch, in_channels, inH, inW)
weight – filters of shape (out_channels, in_channels / groups, kernel_H, kernel_W)
bias – optional bias tensor of shape (out_channels). Default: None
stride – the stride of the convolving kernel. Can be a single number or a tuple (sH, sW). Default: 1.
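A short usage sketch of the functional form, with arbitrarily chosen sizes. It mainly illustrates that the second dimension of `weight` is `in_channels / groups`.

```python
import torch
import torch.nn.functional as F

groups = 4
x = torch.rand(2, 8, 10, 10)                # (minibatch, in_channels, inH, inW)
weight = torch.rand(16, 8 // groups, 3, 3)  # (out_channels, in_channels / groups, kH, kW)
bias = torch.zeros(16)                      # (out_channels)

out = F.conv2d(x, weight, bias=bias, stride=2, padding=1, groups=groups)
# floor((10 - 3 + 2 * 1) / 2) + 1 = 5
print(out.shape)  # torch.Size([2, 16, 5, 5])
```

Unlike `nn.Conv2d`, the functional version holds no parameters of its own, so the caller creates and manages `weight` and `bias`.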