Paper notes, 2021-3-13: SENet, StairNet, Generalized Focal Loss, R3Det, CARAFE


License: CC 4.0 BY-SA

Article tags:

#object detection #computer vision #deep learning

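This post looks at several advanced visual detection techniques, including Squeeze-and-Excitation Networks (SENet), StairNet, and Generalized Focal Loss. It focuses on SENet's attention mechanism, StairNet's top-down semantic aggregation, and R3Det's application to rotated object detection. It also discusses CARAFE, a lightweight, general-purpose upsampling operator that gives notable gains across a range of vision tasks.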

[1]Squeeze-and-Excitation Networks

Paper: https://arxiv.org/abs/1709.01507

Code: https://github.com/moskomule/senet.pytorch/blob/master/senet

The paper was published at CVPR 2018 and was also submitted to IEEE TPAMI (2019).


Structure diagram

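A global average pooling layer first squeezes the feature map into a 1×1×C vector. An MLP then applies a transformation to it, producing another 1×1×C vector, which is passed through a Sigmoid activation. The resulting values are used as per-channel weights to rescale the input feature map.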
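For reference, the computation described above can be written out as below. This follows the notation of the SENet paper as I understand it: $u_c$ is channel $c$ of the input feature map with spatial size $H \times W$, $\delta$ is ReLU, $\sigma$ is the sigmoid, and $r$ is the reduction ratio of the MLP bottleneck.

```latex
\begin{aligned}
z_c &= \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j) && \text{(squeeze)} \\
s &= \sigma\bigl(W_2\, \delta(W_1 z)\bigr), \quad W_1 \in \mathbb{R}^{\frac{C}{r} \times C},\; W_2 \in \mathbb{R}^{C \times \frac{C}{r}} && \text{(excitation)} \\
\tilde{x}_c &= s_c \cdot u_c && \text{(scale)}
\end{aligned}
```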


Variants of the SE block
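As a rough illustration of one placement, the sketch below puts the SE gate on the residual branch just before the skip connection is added, which is how the paper's SE-ResNet module is arranged. The class names `SEGate` and `SEBasicBlock`, the two 3×3 convolutions, and the channel count are my own assumptions for the example and are not taken from the paper's code.

```python
import torch
from torch import nn


class SEGate(nn.Module):
    """Compact SE gate: global pooling -> bottleneck MLP -> sigmoid."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))   # [b, c] channel weights
        return x * w.view(b, c, 1, 1)     # broadcast over H and W


class SEBasicBlock(nn.Module):
    """Hypothetical residual block with the SE gate on the residual branch."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        self.se = SEGate(channels, reduction)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = self.se(out)                # recalibrate channels before adding the skip
        return self.relu(out + x)


block = SEBasicBlock(channels=64)
y = block(torch.randn(2, 64, 32, 32))
print(y.shape)  # same shape as the input: torch.Size([2, 64, 32, 32])
```

Other placements (for example, gating before the residual unit or after the addition) would only change where `self.se` is called in `forward`.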


The experimental results demonstrate the effectiveness of SENet.

In my view, it works by applying an attention mechanism over the channels, which lets the network extract more useful information.
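To make the "attention over channels" reading concrete, here is a minimal sketch with plain tensor operations. The weights `w1` and `w2` are random stand-ins for learned parameters, and the tensor sizes are arbitrary assumptions. The point is only that every channel gets one scalar weight between 0 and 1, and the whole channel is multiplied by it.

```python
import torch

torch.manual_seed(0)
b, c, h, w, r = 2, 32, 8, 8, 16
x = torch.randn(b, c, h, w)

# Randomly initialised excitation weights (stand-ins for learned parameters)
w1 = torch.randn(c // r, c) * 0.1
w2 = torch.randn(c, c // r) * 0.1

z = x.mean(dim=(2, 3))                              # squeeze: [b, c]
gate = torch.sigmoid(torch.relu(z @ w1.T) @ w2.T)   # excitation: [b, c]
out = x * gate[:, :, None, None]                    # scale: one scalar per channel

print(gate.shape)                                   # torch.Size([2, 32])
print(torch.allclose(out[0, 3], x[0, 3] * gate[0, 3]))
```

A channel with a weight near 1 passes through almost unchanged, while one with a weight near 0 is suppressed.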

PyTorch code

import torch
from torch import nn

class SELayer(nn.Module):
    """Squeeze-and-Excitation layer: rescales each channel by a learned weight."""

    def __init__(self, channel, reduction=16):
        super().__init__()
        # Squeeze: global average pooling down to 1x1 per channel
        self.avg_pool = nn.AdaptiveAvgPool2d(output_size=1)
        # Excitation: bottleneck MLP followed by a sigmoid
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid()
        )

    def forward(self, x):   # example: x.size() = [8, 128, 256, 256]
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)     # [8, 128]
        y = self.fc(y).view(b, c, 1, 1)     # [8, 128, 1, 1]
        return x * y.expand_as(x)           # [8, 128, 256, 256]


# Usage example
if __name__ == "__main__":
    x = torch.randn(8, 128, 256, 256)
    print(x.size())
    se = SELayer(128, 16)
    print(se(x).size())
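One thing worth checking is how the `reduction` argument affects the size of the layer. With two bias-free `nn.Linear` layers, the MLP holds 2·C²/r weights. The sketch below counts them for a few reduction ratios; `se_param_count` is a hypothetical helper name of mine, and the channel count of 128 just mirrors the example above.

```python
from torch import nn


def se_param_count(channel, reduction=16):
    """Count the weights of the two bias-free Linear layers in the SE MLP."""
    fc = nn.Sequential(
        nn.Linear(channel, channel // reduction, bias=False),
        nn.ReLU(inplace=True),
        nn.Linear(channel // reduction, channel, bias=False),
        nn.Sigmoid(),
    )
    return sum(p.numel() for p in fc.parameters())


for r in (4, 8, 16, 32):
    # Expected: 2 * 128 * 128 / r weights for each ratio
    print(r, se_param_count(128, r))
```

A larger `reduction` makes the bottleneck narrower and the layer cheaper, at the cost of less capacity in the gating MLP.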