Supercharging YOLOv8 | Adding a Small-Object Detection Layer with AFPN (Replacing the Detection Head)

1. Introduction

This improvement uses the recently proposed Asymptotic Feature Pyramid Network (AFPN) to rework the YOLOv8 detection head. The core of AFPN is a progressive feature-fusion strategy that gradually integrates low-level and high-level features into the detection process. This narrows the semantic gap between features from different levels, improves the quality of feature fusion, and lets the detector make better use of semantic information at every scale.
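To make the progressive idea concrete, here is a tiny toy sketch (my own illustration under simplifying assumptions, not the paper's architecture): the two adjacent low-level features are fused first, and the higher-level feature is folded in afterwards, so non-adjacent levels never have to bridge a large semantic gap in a single jump. The `fuse` helper below is a hypothetical stand-in for AFPN's learned adaptive spatial fusion.

```python
import torch
import torch.nn.functional as F

def fuse(a, b):
    # Stand-in for AFPN's adaptive spatial fusion: bring b to a's resolution
    # and average; the real block learns per-pixel fusion weights instead.
    b = F.interpolate(b, size=a.shape[-2:], mode='bilinear', align_corners=False)
    return (a + b) / 2

# Toy backbone features with equal channels for simplicity:
# P3 (high resolution, low level) ... P5 (low resolution, high level)
p3, p4, p5 = (torch.randn(1, 64, s, s) for s in (80, 40, 20))

x = fuse(p3, p4)   # step 1: fuse the two adjacent low-level features first
x = fuse(x, p5)    # step 2: progressively fold in the higher-level feature
print(x.shape)     # torch.Size([1, 64, 80, 80])
```

In the real network each merge is an ASFF block with learned per-pixel weights, as shown in the core code below.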

For a detailed description of AFPN, see the paper: https://arxiv.org/pdf/2306.15988.pdf

This post explains how to integrate AFPN into YOLOv8 to improve small-object detection performance.

Without further ado, let's get to the code!

2. Integrating AFPN into YOLOv8

2.1 Step One

First, go to the directory 'ultralytics/nn/modules' and create a file named afpn.py there (you can name it however you like), then paste the AFPN core code below into it.

# AFPN core code

import math
from collections import OrderedDict
import torch
import torch.nn as nn
import torch.nn.functional as F
from ultralytics.nn.modules import DFL
from ultralytics.nn.modules.conv import Conv
from ultralytics.utils.tal import dist2bbox, make_anchors
 
__all__ = ['Detect_AFPN']
 
def BasicConv(filter_in, filter_out, kernel_size, stride=1, pad=None):
    # Conv2d + BatchNorm + ReLU; defaults to "same"-style padding when pad is not given
    if pad is None:
        pad = (kernel_size - 1) // 2 if kernel_size else 0
    return nn.Sequential(OrderedDict([
        ("conv", nn.Conv2d(filter_in, filter_out, kernel_size=kernel_size, stride=stride, padding=pad, bias=False)),
        ("bn", nn.BatchNorm2d(filter_out)),
        ("relu", nn.ReLU(inplace=True)),
    ]))
 
 
class BasicBlock(nn.Module):
    expansion = 1
 
    def __init__(self, filter_in, filter_out):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(filter_in, filter_out, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(filter_out, momentum=0.1)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = nn.Conv2d(filter_out, filter_out, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(filter_out, momentum=0.1)
 
    def forward(self, x):
        residual = x
 
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
 
        out = self.conv2(out)
        out = self.bn2(out)
 
        out += residual
        out = self.relu(out)
 
        return out
 
 
class Upsample(nn.Module):
    def __init__(self, in_channels, out_channels, scale_factor=2):
        super(Upsample, self).__init__()
 
        self.upsample = nn.Sequential(
            BasicConv(in_channels, out_channels, 1),
            nn.Upsample(scale_factor=scale_factor, mode='bilinear')
        )
 
 
    def forward(self, x):
        x = self.upsample(x)
 
        return x
 
 
class Downsample_x2(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Downsample_x2, self).__init__()
 
        self.downsample = nn.Sequential(
            BasicConv(in_channels, out_channels, 2, 2, 0)
        )
 
    def forward(self, x):
        x = self.downsample(x)
 
        return x
 
 
class Downsample_x4(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Downsample_x4, self).__init__()
 
        self.downsample = nn.Sequential(
            BasicConv(in_channels, out_channels, 4, 4, 0)
        )
 
    def forward(self, x):
        x = self.downsample(x)
 
        return x
 
 
class Downsample_x8(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Downsample_x8, self).__init__()
 
        self.downsample = nn.Sequential(
            BasicConv(in_channels, out_channels, 8, 8, 0)
        )
 
    def forward(self, x):
        x = self.downsample(x)
 
        return x
 
 
class ASFF_2(nn.Module):
    def __init__(self, inter_dim=512):
        super(ASFF_2, self).__init__()
 
        self.inter_dim = inter_dim
        compress_c = 8
 
        self.weight_level_1 = BasicConv(self.inter_dim, compress_c, 1, 1)
        self.weight_level_2 = BasicConv(self.inter_dim, compress_c, 1, 1)
 
        self.weight_levels = nn.Conv2d(compress_c * 2, 2, kernel_size=1, stride=1, padding=0)
 
        self.conv = BasicConv(self.inter_dim, self.inter_dim, 3, 1)
 
    def forward(self, input1, input2):
        level_1_weight_v = self.weight_level_1(input1)
        level_2_weight_v = self.weight_level_2(input2)
 
        levels_weight_v = torch.cat((level_1_weight_v, level_2_weight_v), 1)
        levels_weight = self.weight_levels(levels_weight_v)
        levels_weight = F.softmax(levels_weight, dim=1)
 
        fused_out_reduced = input1 * levels_weight[:, 0:1, :, :] + \
                            input2 * levels_weight[:, 1:2, :, :]

        out = self.conv(fused_out_reduced)

        return out
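As an optional sanity check (not part of the original post), the blocks above can be exercised on dummy tensors. Assuming the classes are importable from the new afpn.py, two same-size feature maps should pass through ASFF_2 with their shape preserved:

```python
import torch
# Assumes the classes above are importable, e.g.:
# from ultralytics.nn.modules.afpn import ASFF_2, Downsample_x2

x1 = torch.randn(2, 256, 40, 40)
x2 = torch.randn(2, 256, 40, 40)

asff = ASFF_2(inter_dim=256)
print(asff(x1, x2).shape)    # torch.Size([2, 256, 40, 40])

down = Downsample_x2(256, 512)
print(down(x1).shape)        # torch.Size([2, 512, 20, 20])
```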
### AFPN Overview

The Asymptotic Feature Pyramid Network (AFPN) is designed to support direct interaction between non-adjacent feature levels, improving performance on object detection tasks. Traditional pyramid designs are usually limited to simple connections or fusion between adjacent levels, whereas AFPN handles cross-level information flow in a more deliberate way.

#### Feature fusion mechanism

The core of AFPN is its fusion strategy: it fuses two adjacent low-level feature maps first and then progressively brings higher-level information into the mix. This reduces the mismatch in meaning that arises when levels are far apart, the so-called "semantic gap", so the model can capture useful visual patterns across scales.

#### Adaptive spatial fusion

To keep features from multiple sources consistent, AFPN applies an adaptive spatial fusion operation. It addresses the conflict that can arise when several objects fall in the same region, ensuring each instance receives enough representation weight instead of being drowned out by nearby objects. The snippet below is a minimal, runnable sketch of the idea; in AFPN itself the per-level weights come from small learned 1x1-conv branches, as in the ASFF_2 class above (a short usage example follows at the end of this section).

```python
import torch

def calculate_weights(feature_map, fused_feature):
    # Toy per-pixel weights from channel means; AFPN learns these with
    # 1x1-conv branches followed by a softmax (see ASFF_2 above).
    scores = torch.stack([feature_map.mean(1, keepdim=True),
                          fused_feature.mean(1, keepdim=True)])
    return torch.softmax(scores, dim=0)   # (2, B, 1, H, W)

def weighted_sum(fused_feature, feature_map, weights):
    return weights[0] * feature_map + weights[1] * fused_feature

def adaptive_spatial_fusion(features):
    # features: list of feature maps from different levels, already resized
    # to a common (B, C, H, W) shape
    fused_feature = None
    for feature_map in features:
        if fused_feature is None:
            fused_feature = feature_map
        else:
            # Per-pixel adaptive weighting between the running fusion result
            # and the next level's feature map
            weights = calculate_weights(feature_map, fused_feature)
            fused_feature = weighted_sum(fused_feature, feature_map, weights)
    return fused_feature
```

#### Performance evaluation

Experiments on the MS COCO 2017 validation and test sets show that AFPN outperforms many existing state-of-the-art approaches on multiple metrics, demonstrating its effectiveness in practice.
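Here is the short usage example promised above, purely illustrative: three feature maps that already share a common shape are fused into a single map of the same shape.

```python
import torch

# Illustrative call of the adaptive_spatial_fusion sketch above; all inputs
# are assumed to already share the same (B, C, H, W) shape.
levels = [torch.randn(1, 128, 40, 40) for _ in range(3)]
fused = adaptive_spatial_fusion(levels)
print(fused.shape)   # torch.Size([1, 128, 40, 40])
```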