PyTorch implementations of attention mechanisms: CBAM and Dual pooling

*pprp*

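License: CC 4.0 BY-SA

Categories: Deep Learning, pytorch. Tags: Attention, pytorch, CBAM, Dual pooling

This is an original article. Do not repost it without permission; unauthorized reposting will be pursued. To repost, contact WeChat: topeijie.

Article link: https://blog.youkuaiyun.com/DD_PP_JJ/article/details/103318617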

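Preface: I know some of the basic operations in PyTorch, but I still find many implementations hard to write from scratch. The code in this post draws on the channel-wise weighting in SENet, the channel attention and spatial attention in CBAM, and the Dual pooling shared by Kaggle Master @gray. There is no particular thread running through it, so I write things down as I come across them.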

1. Implementing channel-wise weighting in SENet

The implementation is adapted from: senet.pytorch

SELayer:

from torch import nn

class SELayer(nn.Module):
    def __init__(self, channel, reduction=16):
        super(SELayer, self).__init__()
        # Squeeze: global average pooling down to one value per channel
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # Excitation: bottleneck MLP that outputs one weight per channel
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)    # (b, c, 1, 1) -> (b, c)
        y = self.fc(y).view(b, c, 1, 1)    # (b, c) -> (b, c, 1, 1)
        return x * y.expand_as(x)          # rescale every channel of x by its weight

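APIs used in the code above:

AdaptiveAvgPool2d: adaptive average pooling. With output size (n, m), a feature map of spatial size (h, w) is pooled down to (n, m); with a single value n, it is pooled down to (n, n).
Sequential: a torch container that holds layers and applies them in order.
Linear: a linear (fully connected) layer. Its arguments are (in_features, out_features), and it maps in_features input features to out_features output features.
ReLU: an activation layer. inplace=True modifies the input tensor in place, which saves memory.
Sigmoid: an activation layer that squashes its input into the range 0 to 1.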
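
The behavior of these layers can be checked on a small tensor. The snippet below is only an illustrative sketch: the tensor sizes (2, 8, 5, 7) and the variable names are arbitrary choices for the demonstration, not values from the original code.

```python
import torch
from torch import nn

# (batch, channel, height, width); the sizes are arbitrary, chosen for illustration
x = torch.randn(2, 8, 5, 7)

# A single int n gives an (n, n) output; a tuple (n, m) gives an (n, m) output.
pooled_1 = nn.AdaptiveAvgPool2d(1)(x)
pooled_23 = nn.AdaptiveAvgPool2d((2, 3))(x)
print(pooled_1.shape)   # torch.Size([2, 8, 1, 1])
print(pooled_23.shape)  # torch.Size([2, 8, 2, 3])

# With output size 1, adaptive average pooling is the mean over height and width.
global_mean = x.mean(dim=(2, 3), keepdim=True)
print(torch.allclose(pooled_1, global_mean, atol=1e-6))  # True

# Linear maps the last dimension from in_features (8) to out_features (4).
fc = nn.Linear(8, 4, bias=False)
squeezed = fc(pooled_1.view(2, 8))
print(squeezed.shape)  # torch.Size([2, 4])

# Sigmoid squashes every value into the range 0 to 1.
gate = nn.Sigmoid()(squeezed)
print(bool(((gate >= 0) & (gate <= 1)).all()))  # True
```

The mean comparison is why AdaptiveAvgPool2d(1) is often described as global average pooling: each channel is reduced to a single number regardless of the input resolution.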

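Walking through forward to see how the model is built:

After x passes through AdaptiveAvgPool2d(1), its shape is (batch size, channel, 1, 1). view(b, c) then reshapes it to (b, c), dropping the two trailing dimensions of size 1.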

>>> import torch
>>> x = torch.zeros((16,256,256,256))
>>> import torch.nn as nn
>>> avg_pool = nn.AdaptiveAvgPool2d(1)
>>> avg_pool(x).shape
torch.Size([16, 256, 1, 1])
>>> avg_pool(x).view((16,256)).shape
torch.Size([16, 256])
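
The remaining steps of forward can be traced in the same way. The following is a minimal sketch that rebuilds the SELayer operations outside the class so each intermediate shape is visible; the sizes (b=4, c=32, h=w=8) are assumptions chosen to keep the example small, and the final comparison with plain broadcasting is an added check rather than part of the original code.

```python
import torch
from torch import nn

# Illustrative sizes, not taken from the original post
b, c, h, w, reduction = 4, 32, 8, 8, 16
x = torch.randn(b, c, h, w)

# Same layer stack as SELayer.fc
fc = nn.Sequential(
    nn.Linear(c, c // reduction, bias=False),
    nn.ReLU(inplace=True),
    nn.Linear(c // reduction, c, bias=False),
    nn.Sigmoid(),
)

y = nn.AdaptiveAvgPool2d(1)(x)   # (b, c, 1, 1): one average per channel
y = y.view(b, c)                 # (b, c): flattened for the linear layers
y = fc(y)                        # (b, c): one weight between 0 and 1 per channel
y = y.view(b, c, 1, 1)           # (b, c, 1, 1): ready to scale the feature map
out = x * y.expand_as(x)         # (b, c, h, w): each channel scaled by its weight

print(out.shape)                 # torch.Size([4, 32, 8, 8])
# Broadcasting gives the same result without the explicit expand_as.
print(torch.equal(out, x * y))   # True
```

The output has the same shape as the input, so a layer like this can be dropped between existing convolutional blocks without changing the surrounding shapes.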