[Deep Learning] Notes on the (2+1)D Model Architecture

夹猪逃

Category: Deep Learning. Tags: convolutional neural networks, deep learning, neural networks

Article link: https://blog.youkuaiyun.com/WadeDu3/article/details/115350280

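This article describes the structure of the (2+1)D model, covering SpatioTemporalConv, SpatioTemporalResBlock, and SpatioTemporalResLayer. The SpatioTemporalConv module is based on 3D convolution and is implemented as a Spatial_conv followed by a Temporal_conv. SpatioTemporalResBlock follows the residual-network design and can optionally downsample. SpatioTemporalResLayer builds a layer from its input parameters.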

Notes on the (2+1)D Model Architecture

SpatioTemporalConv module structure

Constructor parameters of SpatioTemporalConv: (in_channels, out_channels, kernel_size, stride=1, padding=0, bias=False, first_conv=False)

Args:
in_channels (int): Number of channels in the input tensor.

out_channels (int): Number of channels produced by the convolution.

kernel_size (int or tuple): Size of the convolving kernel.

stride (int or tuple, optional): Stride of the convolution. Default: 1.

padding (int or tuple, optional): Zero-padding added to the sides of the input during the spatial and temporal convolutions respectively. Default: 0.

bias (bool, optional): If True, adds a learnable bias to the output. The docstring states Default: True, but the signature above sets bias=False.

first_conv (bool, optional): Not listed in the docstring. It appears in the signature above and selects how intermed_channels is chosen. Default: False.
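The phrase "their respective convolutions" is easier to follow with the split written out. The sketch below is a plain-Python illustration and is not the module's actual code: as_triple and split_conv_args are hypothetical helper names, and the (time, height, width) ordering of each triple is an assumption, chosen to match the use of kernel_size[0] as the temporal extent later in these notes.

```python
def as_triple(value):
    """Expand an int to a 3-tuple; pass a 3-element sequence through."""
    if isinstance(value, int):
        return (value, value, value)
    value = tuple(value)
    if len(value) != 3:
        raise ValueError("expected an int or a 3-element tuple")
    return value


def split_conv_args(kernel_size, stride=1, padding=0):
    """Split 3D conv arguments into spatial and temporal parts.

    Hypothetical helper. Assumes each triple is ordered
    (time, height, width).
    """
    kt, kh, kw = as_triple(kernel_size)
    st, sh, sw = as_triple(stride)
    pt, ph, pw = as_triple(padding)

    # Spatial convolution: acts on height and width only.
    spatial = {
        "kernel_size": (1, kh, kw),
        "stride": (1, sh, sw),
        "padding": (0, ph, pw),
    }
    # Temporal convolution: acts on the time axis only.
    temporal = {
        "kernel_size": (kt, 1, 1),
        "stride": (st, 1, 1),
        "padding": (pt, 0, 0),
    }
    return spatial, temporal


spatial, temporal = split_conv_args(3, stride=(1, 2, 2), padding=1)
print(spatial["kernel_size"], temporal["kernel_size"])  # (1, 3, 3) (3, 1, 1)
```

Under these assumptions each of the two dictionaries could be passed as keyword arguments to a 3D convolution layer, the first mapping in_channels to intermed_channels and the second mapping intermed_channels to out_channels.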

In the code, intermed_channels = 45 when first_conv=True; otherwise intermed_channels = (kernel_size[0] * kernel_size[1] * kernel_size[2] * in_channels * out_channels) / (kernel_size[1] * kernel_size[2] * in_channels + kernel_size[0] * out_channels).
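A small worked example makes the rule concrete. This is a minimal sketch, not the module's actual code: the function name is hypothetical, and rounding down to an integer is an assumption (the expression above does not state how the quotient is rounded, but a channel count has to be an integer).

```python
def intermed_channels(in_channels, out_channels, kernel_size, first_conv=False):
    """Hypothetical helper: channel count between the two convolutions.

    kernel_size is a (time, height, width) triple. Rounding down is an
    assumption made for this sketch.
    """
    if first_conv:
        return 45  # fixed value used for the first convolution
    kt, kh, kw = kernel_size
    numerator = kt * kh * kw * in_channels * out_channels
    denominator = kh * kw * in_channels + kt * out_channels
    return numerator // denominator


print(intermed_channels(64, 64, (3, 3, 3)))   # 144
print(intermed_channels(64, 128, (3, 3, 3)))  # 230
print(intermed_channels(3, 64, (3, 7, 7), first_conv=True))  # 45
```

For a 3x3x3 kernel with 64 input and 64 output channels the quotient is exactly 110592 / 768 = 144. With 128 output channels it is 221184 / 960 = 230.4, which this sketch rounds down to 230.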

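The intermed_channels expression comes from the calculation in the paper. In words, it is (3D kernel size x input channels x output channels) / (spatial kernel size x input channels + temporal kernel size x output channels).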
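The expression can be derived by matching parameter counts, which is my understanding of the paper's motivation. The symbols below are notation assumed for this note rather than taken from the source: k_t, k_h, k_w stand for kernel_size[0], kernel_size[1], kernel_size[2]; C_in and C_out are the input and output channel counts; M is intermed_channels. Bias terms are ignored.

```latex
% Weights in a full 3D convolution
P_{3D} = k_t \, k_h \, k_w \, C_{in} \, C_{out}

% Weights in the factored (2+1)D pair:
% spatial conv (C_in -> M) plus temporal conv (M -> C_out)
P_{(2+1)D} = k_h \, k_w \, C_{in} \, M + k_t \, M \, C_{out}
           = M \left( k_h \, k_w \, C_{in} + k_t \, C_{out} \right)

% Setting P_{(2+1)D} = P_{3D} and solving for M
M = \frac{k_t \, k_h \, k_w \, C_{in} \, C_{out}}
         {k_h \, k_w \, C_{in} + k_t \, C_{out}}
```

With this choice of M, the factored block has roughly the same number of weights as the 3D convolution it replaces. The match is only approximate once M is rounded to an integer.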