Depthwise Separable Convolution

@清雅绝尘

Published 2025-03-19 15:53:31

Tags: python

Article link: https://blog.youkuaiyun.com/qq_45560461/article/details/146372903

Adding Depthwise Separable Convolution to an Existing Network Architecture

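1. Understanding the structure of depthwise separable convolution

A depthwise separable convolution consists of two parts:

Depthwise convolution: applies a separate spatial convolution to each input channel.
Pointwise convolution: mixes information across channels with a 1x1 convolution.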

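Computation cost comparison:

Standard convolution: \( K \times K \times C_{in} \times C_{out} \times H \times W \)
Depthwise separable convolution: \( K \times K \times C_{in} \times H \times W + C_{in} \times C_{out} \times H \times W \)
(where \(K\) is the kernel size, \(C_{in}\) and \(C_{out}\) are the numbers of input and output channels, and \(H \times W\) is the spatial size of the output feature map; costs are counted in multiply-accumulate operations)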
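
Dividing the second expression by the first gives a cost ratio of \( 1/C_{out} + 1/K^2 \). The sketch below plugs numbers into both formulas to make that concrete. The layer sizes (3x3 kernel, 64 to 128 channels, 56x56 output map) and the function names are illustrative assumptions, not values from a particular network.

```python
def standard_conv_macs(k, c_in, c_out, h, w):
    """Multiply-accumulates of a standard K x K convolution on an H x W output map."""
    return k * k * c_in * c_out * h * w


def depthwise_separable_macs(k, c_in, c_out, h, w):
    """Multiply-accumulates of a depthwise K x K conv followed by a 1x1 pointwise conv."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 64 -> 128 channels, 56x56 output feature map.
std = standard_conv_macs(3, 64, 128, 56, 56)
dsc = depthwise_separable_macs(3, 64, 128, 56, 56)
ratio = dsc / std

print(std, dsc, round(ratio, 4))  # 231211008 27496448 0.1189
```

For a 3x3 kernel the ratio is dominated by the \( 1/K^2 = 1/9 \) term, so the saving is roughly 8x to 9x once \(C_{out}\) is reasonably large.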

2. Implementing a depthwise separable convolution module

Using PyTorch as an example, define a custom module:

import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # Depthwise convolution (keeps the channel count unchanged)
        self.depthwise = nn.Conv2d(
            in_channels,
            in_channels,
            kernel_size=3,
            stride=stride,
            padding=1,
            groups=in_channels  # key argument: one filter group per input channel
        )
        # Pointwise convolution (1x1 convolution that sets the output channel count)
        self.pointwise = nn.Conv2d(
            in_channels,
            out_channels,
            kernel_size=1,
            stride=1,
            padding=0
        )
        # Optional: BatchNorm and activation
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.depthwise(x)
        x = self.pointwise(x)
        x = self.bn(x)
        x = self.relu(x)
        return x
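
A quick sanity check helps confirm that the module is a drop-in replacement for a 3x3 convolution. The sketch below repeats the module in compact form so it runs on its own, then compares output shape and parameter count against a standard convolution. The helper `count_params` and the 64 to 128 channel, 56x56 input sizes are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """Compact copy of the module above: depthwise 3x3, pointwise 1x1, BN, ReLU."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1, groups=in_channels)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))


def count_params(module):
    """Total number of learnable parameters in a module (hypothetical helper)."""
    return sum(p.numel() for p in module.parameters())


standard = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1)
separable = DepthwiseSeparableConv(64, 128, stride=2)

x = torch.randn(1, 64, 56, 56)
with torch.no_grad():
    y_std = standard(x)
    y_sep = separable(x)

print(y_std.shape, y_sep.shape)
print(count_params(standard), count_params(separable))
```

Both layers map the input to a 128-channel 28x28 output. With biases and the BatchNorm affine parameters included, the standard layer has 73,856 parameters and the separable module has 9,216.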

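3. Replacing or inserting the module in an existing network

Scenario 1: replacing a standard convolution layer

Suppose a convolution layer in the original network looks like this:

self.conv = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=2, padding=1)

Replace it with a depthwise separable convolution: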

self.conv = DepthwiseSeparableConv(in_channels=64, out_channels=128, stride=2)
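
When many layers need replacing, editing each definition by hand gets tedious. One possible approach, sketched below, is to walk the module tree and swap every plain 3x3, padding-1 convolution in place. The function `replace_conv3x3` and the toy `net` are hypothetical names introduced here. The replaced layers start from fresh random weights, and because the module carries its own BatchNorm and ReLU, a network that already applies them after each convolution would end up with duplicates.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """Compact copy of the module defined earlier: depthwise 3x3, pointwise 1x1, BN, ReLU."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1, groups=in_channels)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.pointwise(self.depthwise(x))))


def replace_conv3x3(module):
    """Recursively swap plain 3x3, padding-1 convolutions for DepthwiseSeparableConv."""
    # list(...) takes a snapshot so children can be reassigned while looping.
    for name, child in list(module.named_children()):
        is_plain_3x3 = (
            isinstance(child, nn.Conv2d)
            and child.kernel_size == (3, 3)
            and child.padding == (1, 1)
            and child.dilation == (1, 1)
            and child.groups == 1
        )
        if is_plain_3x3:
            setattr(module, name, DepthwiseSeparableConv(
                child.in_channels, child.out_channels, stride=child.stride))
        else:
            replace_conv3x3(child)
    return module


# Toy network standing in for an existing model (hypothetical).
net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
)
replace_conv3x3(net)
out = net(torch.randn(2, 3, 32, 32))
print(out.shape)
```

Matching on kernel size, padding, dilation, and groups keeps the swap limited to layers whose output shape the replacement reproduces exactly.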

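Scenario 2: inserting a new module into the network

Insert a depthwise separable convolution into a residual block (using ResNet as an example):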

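class CustomResBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # 1x1 convolution from the original ResNet block
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride)
        # Inserted depthwise separable convolution
        self.ds_conv = DepthwiseSeparableConv(out_channels, out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels * 4, kernel_size=1)
        # The main branch outputs out_channels * 4 channels and may downsample,
        # so project the shortcut whenever its shape would not match.
        if stride != 1 or in_channels != out_channels * 4:
            self.shortcut = nn.Conv2d(in_channels, out_channels * 4, kernel_size=1, stride=stride)
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        residual = self.shortcut(x)
        x = self.conv1(x)
        x = self.ds_conv(x)  # newly added module
        x = self.conv2(x)
        x = x + residual
        return x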

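4. Adjusting the training strategy

Fine-tune existing parameters: if a critical layer was replaced (for example, a convolution near the input), fine-tune the whole network with a lower learning rate.
Freeze some layers: to test only the effect of the new module, freeze the original network's parameters and train only the newly added modules.

Optimizer setup example: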

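import torch.optim as optim

model = ExistingNetwork()  # placeholder for an existing network instance
# Give the newly added modules a higher learning rate and the original layers a lower one
# (base_layers and ds_conv_layers are placeholder attribute names)
optimizer = optim.SGD([
    {'params': model.base_layers.parameters(), 'lr': 0.001},  # original layers
    {'params': model.ds_conv_layers.parameters(), 'lr': 0.01} # new modules
], momentum=0.9)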
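
The two strategies above can be sketched end to end on a toy model. `TinyNet` below is a hypothetical stand-in for an existing network; its `base_layers` and `ds_conv_layers` attributes only mirror the placeholder names used in the optimizer example and are not part of any real architecture.

```python
import torch.nn as nn
import torch.optim as optim


class TinyNet(nn.Module):
    """Hypothetical stand-in for an existing network with one newly added block."""

    def __init__(self):
        super().__init__()
        self.base_layers = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.ds_conv_layers = nn.Sequential(
            nn.Conv2d(16, 16, kernel_size=3, padding=1, groups=16),  # depthwise
            nn.Conv2d(16, 32, kernel_size=1),                        # pointwise
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.ds_conv_layers(self.base_layers(x))


model = TinyNet()

# Strategy 1: fine-tune everything, with a lower learning rate on the original layers.
optimizer = optim.SGD([
    {'params': model.base_layers.parameters(), 'lr': 0.001},
    {'params': model.ds_conv_layers.parameters(), 'lr': 0.01},
], momentum=0.9)

# Strategy 2: freeze the original layers and train only the new modules.
for p in model.base_layers.parameters():
    p.requires_grad = False
frozen_optimizer = optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=0.01, momentum=0.9,
)
```

Setting `requires_grad = False` stops gradient updates only. If the frozen part contains BatchNorm layers, their running statistics still change in training mode unless those layers are also switched to `eval()`.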

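5. Validation and debugging

Parameter and compute analysis

Use the torchsummary library to list per-layer output shapes and parameter counts (it does not report FLOPs, so compute has to be measured separately):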

from torchsummary import summary

model = YourModifiedNetwork()
summary(model, input_size=(3, 224, 224))  # adjust the input size to your task
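
To estimate compute without an extra dependency, one option is to count multiply-accumulates with forward hooks on every `nn.Conv2d`. The sketch below is a minimal version under stated assumptions: `count_conv_macs` is a hypothetical helper, it covers only convolution layers for a single input sample, and it ignores biases, BatchNorm, and activations.

```python
import torch
import torch.nn as nn


def count_conv_macs(model, input_size):
    """Count multiply-accumulates of all nn.Conv2d layers for one input sample."""
    total = 0
    handles = []

    def hook(module, inputs, output):
        nonlocal total
        k_h, k_w = module.kernel_size
        out_h, out_w = output.shape[-2:]
        # Each output value needs k_h * k_w * (in_channels / groups) multiplies.
        per_position = k_h * k_w * (module.in_channels // module.groups) * module.out_channels
        total += per_position * out_h * out_w

    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            handles.append(m.register_forward_hook(hook))

    was_training = model.training
    model.eval()
    with torch.no_grad():
        model(torch.zeros(1, *input_size))
    model.train(was_training)

    for h in handles:
        h.remove()
    return total


standard = nn.Conv2d(64, 128, kernel_size=3, padding=1)
separable = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64),  # depthwise
    nn.Conv2d(64, 128, kernel_size=1),                       # pointwise
)
std_macs = count_conv_macs(standard, (64, 56, 56))
sep_macs = count_conv_macs(separable, (64, 56, 56))
print(std_macs, sep_macs)
```

The counts follow the formulas from section 1, so they can be cross-checked by hand for any single layer.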

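Performance validation

Metric comparison: compare validation accuracy and inference speed (FPS) against the original model.
Ablation study: replace convolution layers at different positions step by step and observe how performance changes.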
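
Inference speed can be measured with a simple timing loop. The sketch below is one way to do it, assuming batch size 1. `measure_fps` and its `warmup` and `iters` parameters are hypothetical names, and the numbers it returns depend heavily on hardware, so the original and modified models should be timed on the same device.

```python
import time

import torch
import torch.nn as nn


def measure_fps(model, input_size=(3, 224, 224), warmup=3, iters=10, device="cpu"):
    """Return forward passes per second at batch size 1 (rough estimate)."""
    device = torch.device(device)
    model = model.to(device).eval()
    x = torch.randn(1, *input_size, device=device)

    with torch.no_grad():
        for _ in range(warmup):          # warm-up runs are excluded from timing
            model(x)
        if device.type == "cuda":
            torch.cuda.synchronize()     # wait for queued GPU work before timing
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        if device.type == "cuda":
            torch.cuda.synchronize()
        elapsed = time.perf_counter() - start

    return iters / elapsed


# Small illustrative layer; substitute the original and the modified model here.
fps = measure_fps(nn.Conv2d(3, 8, kernel_size=3, padding=1),
                  input_size=(3, 32, 32), warmup=1, iters=2)
print(f"{fps:.1f} FPS")
```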

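6. Common problems and solutions

Problem	Likely cause	Solution
Model output size does not match	The stride differs after replacement	Check the `stride` and `padding` settings
Loss oscillates during training	Learning rate too high, or no BN layer added	Lower the learning rate and make sure the module includes a BN layer
Inference speed does not improve	Too few convolution layers were replaced	Replace standard convolutions in more layers, and measure on the target hardware, since fewer FLOPs do not always mean lower latency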