RFB (Receptive Field Block)

The ECCV 2018 paper Receptive Field Block Net for Accurate and Fast Object Detection proposes a new feature-extraction module, RFB. Its starting point is to mimic the receptive field of human vision in order to strengthen the network's feature extraction. Structurally, RFB borrows from Inception, mainly adding dilated convolutions on top of the Inception design to effectively enlarge the receptive field.

The figure below illustrates the effect of RFB; the dashed box in the middle is the RFB structure. RFB has two main characteristics:
1. A multi-branch structure built from conv layers with different kernel sizes, following the Inception design. In the RFB diagram (Figure 2), circles of different sizes represent conv layers with different kernel sizes.
2. Dilated conv layers. Dilated convolution was previously used in the segmentation algorithm DeepLab, also mainly to enlarge the receptive field, and is similar in spirit to deformable convolution.

In the RFB structure, different rates denote the dilation parameters of the dilated conv layers. At the end, the outputs of the conv layers with different kernel sizes and rates are concatenated, fusing features of different scales; the figure shows this by overlaying outputs drawn in three different sizes and colors. The last column compares the fused features with the receptive field of human vision, and the two are very close. This is exactly the paper's starting point: the RFB structure is designed to mimic the receptive field of human vision.
[Figure: the RFB module and its comparison with the human visual receptive field]
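As a quick sanity check on how dilation enlarges coverage (a minimal sketch, not from the paper): the area covered by a dilated k×k convolution is k_eff = k + (k − 1)(d − 1). With the rates used in the code later in this post (dilations 1, 2, 3 in BasicRFB when visual=1, and 1, 3, 5 in BasicRFB_a), a 3×3 kernel covers:

```python
# Effective kernel size of a k x k convolution with dilation d:
# k_eff = k + (k - 1) * (d - 1)
def effective_kernel_size(k: int, d: int) -> int:
    return k + (k - 1) * (d - 1)

for d in (1, 2, 3, 5):
    k_eff = effective_kernel_size(3, d)
    print(f"3x3 conv, dilation={d} -> covers a {k_eff}x{k_eff} area")  # 3, 5, 7, 11
```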
The figure below shows the two RFB variants. (a) is RFB; its overall structure follows Inception, the main difference being the three dilated conv layers (e.g., 3×3 conv, rate=1), which is one of the paper's main ways of enlarging the receptive field. (b) is RFB-s. Compared with RFB, RFB-s makes two changes: it replaces the 5×5 conv layer with a 3×3 conv layer, and replaces a 3×3 conv layer with 1×3 and 3×1 conv layers. The main goal is presumably to reduce computation, similar to the refinements made to the Inception structure in its later versions; see the rough parameter count after the figure below.
[Figure: the RFB (a) and RFB-s (b) structures]
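To see why the RFB-s substitutions save computation, here is a rough weight count for a conv layer with C input and C output channels (bias ignored; C = 256 is an arbitrary example):

```python
C = 256  # example channel count
print("5x5 conv:       ", 5 * 5 * C * C)            # 25 * C^2 weights
print("3x3 conv:       ", 3 * 3 * C * C)            #  9 * C^2 weights
print("1x3 + 3x1 convs:", (1 * 3 + 3 * 1) * C * C)  #  6 * C^2 weights
```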
RFB code implementation

A PyTorch implementation of BasicConv, BasicRFB, and RFB-s (BasicRFB_a), with imports and brief comments added:

```python
import torch
import torch.nn as nn


class BasicConv(nn.Module):
    """Conv2d followed by optional BatchNorm and optional ReLU."""

    def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0, dilation=1, groups=1, relu=True, bn=True, bias=False):
        super(BasicConv, self).__init__()
        self.out_channels = out_planes
        self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=groups, bias=bias)
        self.bn = nn.BatchNorm2d(out_planes,eps=1e-5, momentum=0.01, affine=True) if bn else None
        self.relu = nn.ReLU(inplace=True) if relu else None

    def forward(self, x):
        x = self.conv(x)
        if self.bn is not None:
            x = self.bn(x)
        if self.relu is not None:
            x = self.relu(x)
        return x


class BasicRFB(nn.Module):
    """RFB module: parallel conv branches with different kernel sizes and dilation rates."""

    def __init__(self, in_planes, out_planes, stride=1, scale = 0.1, visual = 1):
        super(BasicRFB, self).__init__()
        self.scale = scale
        self.out_channels = out_planes
        inter_planes = in_planes // 8  # bottleneck width shared by the branches
        # branch 0: 1x1 conv -> 3x3 conv with dilation = visual
        self.branch0 = nn.Sequential(
                BasicConv(in_planes, 2*inter_planes, kernel_size=1, stride=stride),
                BasicConv(2*inter_planes, 2*inter_planes, kernel_size=3, stride=1, padding=visual, dilation=visual, relu=False)
                )
        # branch 1: 1x1 conv -> 3x3 conv -> 3x3 conv with dilation = visual + 1
        self.branch1 = nn.Sequential(
                BasicConv(in_planes, inter_planes, kernel_size=1, stride=1),
                BasicConv(inter_planes, 2*inter_planes, kernel_size=(3,3), stride=stride, padding=(1,1)),
                BasicConv(2*inter_planes, 2*inter_planes, kernel_size=3, stride=1, padding=visual+1, dilation=visual+1, relu=False)
                )
        # branch 2: 1x1 conv -> two stacked 3x3 convs (5x5-like receptive field)
        #           -> 3x3 conv with dilation = 2 * visual + 1
        self.branch2 = nn.Sequential(
                BasicConv(in_planes, inter_planes, kernel_size=1, stride=1),
                BasicConv(inter_planes, (inter_planes//2)*3, kernel_size=3, stride=1, padding=1),
                BasicConv((inter_planes//2)*3, 2*inter_planes, kernel_size=3, stride=stride, padding=1),
                BasicConv(2*inter_planes, 2*inter_planes, kernel_size=3, stride=1, padding=2*visual+1, dilation=2*visual+1, relu=False)
                )

        # 1x1 conv to fuse the concatenated branches, plus a 1x1 shortcut for the residual
        self.ConvLinear = BasicConv(6*inter_planes, out_planes, kernel_size=1, stride=1, relu=False)
        self.shortcut = BasicConv(in_planes, out_planes, kernel_size=1, stride=stride, relu=False)
        self.relu = nn.ReLU(inplace=False)

    def forward(self, x):
        x0 = self.branch0(x)
        x1 = self.branch1(x)
        x2 = self.branch2(x)

        out = torch.cat((x0, x1, x2), 1)  # concatenate branches along the channel dim
        out = self.ConvLinear(out)
        short = self.shortcut(x)
        out = out * self.scale + short    # scaled residual connection
        out = self.relu(out)

        return out



class BasicRFB_a(nn.Module):
    """The RFB-s variant: smaller kernels and 1x3/3x1 factorized convs."""

    def __init__(self, in_planes, out_planes, stride=1, scale = 0.1):
        super(BasicRFB_a, self).__init__()
        self.scale = scale
        self.out_channels = out_planes
        inter_planes = in_planes // 4  # bottleneck width shared by the branches

        # branch 0: 1x1 conv -> plain 3x3 conv (dilation 1)
        self.branch0 = nn.Sequential(
                BasicConv(in_planes, inter_planes, kernel_size=1, stride=1),
                BasicConv(inter_planes, inter_planes, kernel_size=3, stride=1, padding=1, relu=False)
                )
        # branch 1: 1x1 conv -> 3x1 conv -> 3x3 conv with dilation = 3
        self.branch1 = nn.Sequential(
                BasicConv(in_planes, inter_planes, kernel_size=1, stride=1),
                BasicConv(inter_planes, inter_planes, kernel_size=(3,1), stride=1, padding=(1,0)),
                BasicConv(inter_planes, inter_planes, kernel_size=3, stride=1, padding=3, dilation=3, relu=False)
                )
        # branch 2: 1x1 conv -> 1x3 conv -> 3x3 conv with dilation = 3
        self.branch2 = nn.Sequential(
                BasicConv(in_planes, inter_planes, kernel_size=1, stride=1),
                BasicConv(inter_planes, inter_planes, kernel_size=(1,3), stride=stride, padding=(0,1)),
                BasicConv(inter_planes, inter_planes, kernel_size=3, stride=1, padding=3, dilation=3, relu=False)
                )
        # branch 3: 1x1 conv -> 1x3 conv -> 3x1 conv (factorized 3x3) -> 3x3 conv with dilation = 5
        self.branch3 = nn.Sequential(
                BasicConv(in_planes, inter_planes//2, kernel_size=1, stride=1),
                BasicConv(inter_planes//2, (inter_planes//4)*3, kernel_size=(1,3), stride=1, padding=(0,1)),
                BasicConv((inter_planes//4)*3, inter_planes, kernel_size=(3,1), stride=stride, padding=(1,0)),
                BasicConv(inter_planes, inter_planes, kernel_size=3, stride=1, padding=5, dilation=5, relu=False)
                )

        # 1x1 conv to fuse the concatenated branches, plus a 1x1 shortcut for the residual
        self.ConvLinear = BasicConv(4*inter_planes, out_planes, kernel_size=1, stride=1, relu=False)
        self.shortcut = BasicConv(in_planes, out_planes, kernel_size=1, stride=stride, relu=False)
        self.relu = nn.ReLU(inplace=False)

    def forward(self, x):
        x0 = self.branch0(x)
        x1 = self.branch1(x)
        x2 = self.branch2(x)
        x3 = self.branch3(x)

        out = torch.cat((x0, x1, x2, x3), 1)  # concatenate branches along the channel dim
        out = self.ConvLinear(out)
        short = self.shortcut(x)
        out = out * self.scale + short        # scaled residual connection
        out = self.relu(out)

        return out
```
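A quick shape check (a minimal sketch; the 512-channel, 38×38 input is just an example size, chosen to resemble an SSD conv4_3 feature map):

```python
if __name__ == "__main__":
    x = torch.randn(1, 512, 38, 38)  # dummy feature map
    rfb = BasicRFB(512, 512, stride=1, scale=1.0, visual=1)
    rfb_s = BasicRFB_a(512, 512, stride=1, scale=1.0)
    print(rfb(x).shape)    # torch.Size([1, 512, 38, 38])
    print(rfb_s(x).shape)  # torch.Size([1, 512, 38, 38])
```

With stride=1 both modules preserve the spatial size and only change the channel count to out_planes.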

### The role and implementation of the RFB module in SAM-Net

The RFB (Receptive Field Block) module also plays an important role in the overall SAM2-UNet architecture, where it is mainly used to enlarge the model's receptive field and thereby improve detection and segmentation of multi-scale targets.

#### 1. Function

The core purpose of the RFB module is to enlarge the network's receptive field so that the model can better capture features of targets at different scales. Concretely, the module runs several parallel convolutions to simulate feature extraction at different spatial resolutions. This design alleviates the receptive-field limitation caused by the fixed window size of a conventional conv layer.

#### 2. Implementation details

The RFB module is implemented as follows:

- **Multi-branch structure**: following the Inception design, several convolutions with different kernel sizes (3×3, 5×5, and other spatial transforms) are applied to the same input. Each branch captures information at a particular scale, and the branch outputs are fused into a combined feature representation.
- **Dilated convolutions**: to enlarge the receptive field further without significantly raising the computational cost, some RFB variants add dilated convolutions, which cover a larger effective area while keeping the parameter count low.
- **Channel interaction**: influenced by SE-style attention, RFB may be combined with channel re-weighting that emphasizes important channels and suppresses redundant information, improving both efficiency and robustness.

Below is a simplified Python implementation based on the description above:

```python
import torch
import torch.nn as nn


class RFBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(RFBlock, self).__init__()
        # Branches with different kernel sizes and dilations
        self.branch0 = nn.Sequential(
            nn.Conv2d(in_channels, out_channels // 4, kernel_size=1),
            nn.ReLU()
        )
        self.branch1 = nn.Sequential(
            nn.Conv2d(in_channels, out_channels // 4, kernel_size=(1, 3), padding=(0, 1)),
            nn.ReLU(),
            nn.Conv2d(out_channels // 4, out_channels // 4, kernel_size=(3, 1), padding=(1, 0))
        )
        self.branch2 = nn.Sequential(
            nn.Conv2d(in_channels, out_channels // 4, kernel_size=3, dilation=3, padding=3),
            nn.ReLU()
        )
        # Fusion layer over the concatenated branch outputs
        self.conv_linear = nn.Conv2d(3 * (out_channels // 4), out_channels, kernel_size=1)

    def forward(self, x):
        branch0 = self.branch0(x)
        branch1 = self.branch1(x)
        branch2 = self.branch2(x)
        outputs = torch.cat([branch0, branch1, branch2], dim=1)
        return self.conv_linear(outputs)
```

This snippet shows how to build a multi-path structure that adapts to objects of various sizes.
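A quick check of the sketch above (the channel counts and input size are arbitrary examples):

```python
x = torch.randn(1, 64, 32, 32)
block = RFBlock(64, 128)
print(block(x).shape)  # torch.Size([1, 128, 32, 32])
```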