Understanding Convolution Operations in Neural Networks

阿光light

Article link: https://blog.youkuaiyun.com/weixin_42098609/article/details/107953664

Reference: official PyTorch documentation

1D Convolution

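For a 1D convolution layer with input of size $(N, C_{in}, L)$ and output of size $(N, C_{out}, L_{out})$, the output is computed as:
$out(N_i, C_{out_j}) = bias(C_{out_j}) + \sum_{k=0}^{C_{in}-1} weight(C_{out_j}, k) \bigotimes input(N_i, k)$

Here $\bigotimes$ denotes the sliding cross-correlation of one kernel slice with one input channel, $N$ is the batch size, $C$ is the number of channels, and $L$ is the length of the 1D signal.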
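To see what the formula above means in practice, the following minimal sketch evaluates it with explicit loops and compares the result with nn.Conv1d. The helper name naive_conv1d and the tensor sizes are illustrative choices, not part of PyTorch, and the sketch assumes stride=1, padding=0 and dilation=1.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv1d(in_channels=2, out_channels=3, kernel_size=3)
x = torch.randn(4, 2, 8)  # (N, C_in, L)

def naive_conv1d(x, weight, bias):
    """Hypothetical helper: direct evaluation of the formula (stride=1, no padding)."""
    n, c_in, l_in = x.shape
    c_out, _, k = weight.shape
    l_out = l_in - k + 1
    out = torch.zeros(n, c_out, l_out)
    for i in range(n):                # sample N_i
        for j in range(c_out):        # output channel C_out_j
            out[i, j] = bias[j]       # start from the bias term
            for c in range(c_in):     # sum over input channels k
                for t in range(l_out):
                    # cross-correlation: slide the kernel, multiply, and sum
                    out[i, j, t] += (weight[j, c] * x[i, c, t:t + k]).sum()
    return out

with torch.no_grad():
    naive = naive_conv1d(x, conv.weight, conv.bias)
    formula_matches = torch.allclose(naive, conv(x), atol=1e-5)

print(naive.shape, formula_matches)
```

The loops are far slower than the built-in layer; they are only meant to mirror the formula term by term.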

class torch.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)

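Parameters:
in_channels (int) – number of channels in the input signal
out_channels (int) – number of channels produced by the convolution
kernel_size (int or tuple) – size of the convolution kernel
stride (int or tuple, optional) – stride of the convolution
padding (int or tuple, optional) – number of layers of zeros added to each side of the input
dilation (int or tuple, optional) – spacing between kernel elements
groups (int, optional) – number of blocked connections from input channels to output channels. With groups=1, every output channel is computed from all input channels; with groups=2, the layer behaves like two convolution layers side by side, each seeing half of the input channels and producing half of the output channels, and the two outputs are then concatenated. Both in_channels and out_channels must be divisible by groups.
bias (bool, optional) – if bias=True, a learnable bias is added to the output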
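The groups behaviour described above can be checked with a small sketch. The channel counts (4 in, 6 out) and the input size are arbitrary choices for illustration; the manual version splits the input and the weights in half and uses torch.nn.functional.conv1d on each half.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
grouped = nn.Conv1d(4, 6, kernel_size=3, groups=2)
x = torch.randn(2, 4, 10)  # (N, C_in, L)

# With groups=2 each kernel only spans in_channels / groups = 2 input channels.
weight_shape = tuple(grouped.weight.shape)  # (6, 2, 3)

# Emulate the two side-by-side convolutions by hand.
x_a, x_b = x[:, :2], x[:, 2:]                        # halves of the input channels
w_a, w_b = grouped.weight[:3], grouped.weight[3:]    # halves of the output channels
b_a, b_b = grouped.bias[:3], grouped.bias[3:]
manual = torch.cat([F.conv1d(x_a, w_a, b_a), F.conv1d(x_b, w_b, b_b)], dim=1)

groups_match = torch.allclose(grouped(x), manual, atol=1e-6)
print(weight_shape, groups_match)
```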

Shape:
Input: $(N, C_{in}, L_{in})$
Output: $(N, C_{out}, L_{out})$
The output length follows from the input length as:
$L_{out} = \lfloor (L_{in} + 2 \times padding - dilation \times (kernel\_size - 1) - 1) / stride + 1 \rfloor$
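The length formula is easy to turn into a small helper and compare against an actual layer. The function name conv1d_out_len and the chosen hyperparameters are illustrative assumptions; integer floor division stands in for the floor in the formula.

```python
import torch
import torch.nn as nn

def conv1d_out_len(l_in, kernel_size, stride=1, padding=0, dilation=1):
    """Hypothetical helper implementing the L_out formula."""
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

conv = nn.Conv1d(1, 1, kernel_size=3, stride=2, padding=1, dilation=2)
x = torch.randn(1, 1, 10)

predicted_len = conv1d_out_len(10, kernel_size=3, stride=2, padding=1, dilation=2)
actual_len = conv(x).shape[-1]
print(predicted_len, actual_len)  # both are 4
```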

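Variables:
weight (tensor) – the learnable weights of the convolution, of shape (out_channels, in_channels, kernel_size) when groups=1; in general the second dimension is in_channels / groups
bias (tensor) – the learnable bias of the convolution, of shape (out_channels)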

2D Convolution

For a 2D convolution layer with input of size $(N, C_{in}, H, W)$ and output of size $(N, C_{out}, H_{out}, W_{out})$, the output is computed as:
$out(N_i, C_{out_j}) = bias(C_{out_j}) + \sum_{k=0}^{C_{in}-1} weight(C_{out_j}, k) \bigotimes input(N_i, k)$

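Example: the input is $3\times3\times3$, i.e. 3 input channels with image height and width both equal to 3; the kernel is $1\times1$ and there are 2 output channels, so the output is $2\times3\times3$.
The number of kernels equals the number of output channels, and each kernel has as many channels as the input. For a 2D convolution layer with groups=1, the number of parameters is therefore $\text{kernel\_size}\times C_{in}\times C_{out} + C_{out}$, where $\text{kernel\_size}$ here stands for the number of elements in one kernel slice (height times width) and the final $C_{out}$ term counts the biases.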
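For the $1\times1$ example above, the parameter count works out to $1\times3\times2+2=8$. The short sketch below checks this, along with the output shape, using a batch size of 1 (an assumption, since the example does not state one).

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=2, kernel_size=1)
x = torch.randn(1, 3, 3, 3)  # (N, C_in, H, W)

# Weights: 1*1 * 3 * 2 = 6, biases: 2
n_params = sum(p.numel() for p in conv.parameters())
out_shape = tuple(conv(x).shape)
print(n_params, out_shape)  # 8 (1, 2, 3, 3)
```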

class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)

The parameters have the same meaning as for 1D convolution.

Shape:
Input: $(N, C_{in}, H_{in}, W_{in})$
Output: $(N, C_{out}, H_{out}, W_{out})$
$H_{out} = \lfloor (H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1) - 1) / stride[0] + 1 \rfloor$

$W_{out} = \lfloor (W_{in} + 2 \times padding[1] - dilation[1] \times (kernel\_size[1] - 1) - 1) / stride[1] + 1 \rfloor$
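Because each 2D hyperparameter may be an int or a pair, a sketch of the two formulas needs a small normalisation step. The helpers _pair and conv2d_out_hw below are hypothetical names written for this illustration (PyTorch has its own internal equivalents), and the layer configuration is an arbitrary non-square example.

```python
import torch
import torch.nn as nn

def _pair(v):
    """Expand an int to (v, v); leave a 2-tuple unchanged."""
    return (v, v) if isinstance(v, int) else tuple(v)

def conv2d_out_hw(h_in, w_in, kernel_size, stride=1, padding=0, dilation=1):
    """Hypothetical helper implementing the H_out and W_out formulas."""
    k, s, p, d = map(_pair, (kernel_size, stride, padding, dilation))
    h_out = (h_in + 2 * p[0] - d[0] * (k[0] - 1) - 1) // s[0] + 1
    w_out = (w_in + 2 * p[1] - d[1] * (k[1] - 1) - 1) // s[1] + 1
    return h_out, w_out

conv = nn.Conv2d(3, 8, kernel_size=(3, 5), stride=(2, 1), padding=(1, 2))
x = torch.randn(1, 3, 32, 32)

predicted_hw = conv2d_out_hw(32, 32, kernel_size=(3, 5), stride=(2, 1), padding=(1, 2))
actual_hw = tuple(conv(x).shape[-2:])
print(predicted_hw, actual_hw)  # both are (16, 32)
```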

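Variables:
weight (tensor) – the learnable weights of the convolution, of shape (out_channels, in_channels, kernel_size[0], kernel_size[1]) when groups=1
bias (tensor) – the learnable bias of the convolution, of shape (out_channels)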

import torch
import torch.nn as nn
# Conv1d: 1 input channel, 3 output channels, kernel size 2, stride 2
m = nn.Conv1d(1, 3, 2, stride=2)
# Conv2d: same channels, a 1x2 kernel, stride 2 in both directions
n = nn.Conv2d(1, 3, (1, 2), stride=2)
input_m = torch.randn(1, 1, 4)     # (N, C_in, L)
input_n = torch.randn(1, 1, 1, 4)  # (N, C_in, H, W)
output_m = m(input_m)
output_n = n(input_n)
print('output of Conv1d:', output_m)
print('output size of 1d:', output_m.size())
print('output of Conv2d:', output_n)
print('output size of 2d:', output_n.size())

# Output:
output of Conv1d: tensor([[[ 2.2308,  0.8016],
         [ 0.8213, -0.3508],
         [ 0.3208,  0.0174]]], grad_fn=<SqueezeBackward1>)
output size of 1d: torch.Size([1, 3, 2])
output of Conv2d: tensor([[[[-0.7004,  0.6022]],

         [[-0.6448,  0.1165]],

         [[-0.0601,  0.4772]]]], grad_fn=<ThnnConv2DBackward>)
output size of 2d: torch.Size([1, 3, 1, 2])
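
As a quick check, the printed sizes can be reproduced by hand from the shape formulas, using the defaults padding=0 and dilation=1 that the code relies on. For the Conv1d layer (input length 4, kernel size 2, stride 2):

$L_{out} = \lfloor (4 + 2 \times 0 - 1 \times (2 - 1) - 1) / 2 + 1 \rfloor = \lfloor 2 \rfloor = 2$

For the Conv2d layer (input $1 \times 4$, kernel $(1, 2)$, stride 2 in both directions):

$H_{out} = \lfloor (1 + 2 \times 0 - 1 \times (1 - 1) - 1) / 2 + 1 \rfloor = \lfloor 1 \rfloor = 1$

$W_{out} = \lfloor (4 + 2 \times 0 - 1 \times (2 - 1) - 1) / 2 + 1 \rfloor = \lfloor 2 \rfloor = 2$

These agree with the reported sizes [1, 3, 2] and [1, 3, 1, 2], where 3 is the number of output channels.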

3D Convolution

[Figure: 3D convolution. Image source: 人工智能研究院]

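For a 3D convolution layer with input of size $(N, C_{in}, D, H, W)$ and output of size $(N, C_{out}, D_{out}, H_{out}, W_{out})$, the output is computed as:
$out(N_i, C_{out_j}) = bias(C_{out_j}) + \sum_{k=0}^{C_{in}-1} weight(C_{out_j}, k) \bigotimes input(N_i, k)$

$N$ is the batch size, $C$ is the number of channels, $D$ is the feature depth, $H$ is the feature height, and $W$ is the feature width.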

class torch.nn.Conv3d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)

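The parameters kernel_size, stride, padding and dilation can each be a single int, in which case the same value is used for the depth, height and width dimensions, or a tuple of three ints, where the first element applies to depth, the second to height and the third to width.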
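The int and tuple forms can be compared directly, as in the sketch below. The third layer, with different values per dimension, uses sizes picked purely for illustration.

```python
import torch
import torch.nn as nn

# A single int is expanded to the same value for depth, height and width.
cubic_int = nn.Conv3d(1, 1, 3)
cubic_tuple = nn.Conv3d(1, 1, (3, 3, 3))
same_kernel = cubic_int.kernel_size == cubic_tuple.kernel_size  # both (3, 3, 3)

# A tuple allows a different value per dimension: (depth, height, width).
mixed = nn.Conv3d(1, 2, kernel_size=(1, 3, 5), stride=(1, 2, 2), padding=(0, 1, 2))
x = torch.rand(1, 1, 4, 8, 8)  # (N, C, D, H, W)
out_shape = tuple(mixed(x).shape)
print(same_kernel, out_shape)  # True (1, 2, 4, 4, 4)
```

The depth is untouched here because the kernel has size 1 along that dimension with stride 1 and no padding, while height and width are halved by the stride of 2.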

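Variables:
weight (tensor) – the learnable weights of the convolution, of shape (out_channels, in_channels, kernel_size[0], kernel_size[1], kernel_size[2]) when groups=1
bias (tensor) – the learnable bias of the convolution, of shape (out_channels)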

import torch
import torch.nn as nn
# Cubic kernel and equal stride, given as single ints
m = nn.Conv3d(1, 1, 3, stride=1)
# The same layer written with explicit (depth, height, width) tuples;
# this assignment replaces the layer created above
m = nn.Conv3d(1, 1, (3, 3, 3), stride=(1, 1, 1))
input = torch.rand(1, 1, 4, 4, 4)  # N, C, D, H, W

output = m(input)
print(output.size())

# Output:
torch.Size([1, 1, 2, 2, 2])