YOLOv8 Improvements: Replacing C2f with C2f-Faster-EMA

YOLOv8 Improvements

Chapter 3: Replacing C2f with C2f-Faster-EMA. (C2f-Faster-EMA is recommended for the backbone; the neck and head can use C2f-Faster instead.)

I. The yolov8-C2f-Faster-EMA model

1. yolov8-C2f-Faster-EMA model structure

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80  # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f_Faster_EMA, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f_Faster_EMA, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f_Faster_EMA, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f_Faster_EMA, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f_Faster_EMA, [512]]  # 12

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f_Faster_EMA, [256]]  # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f_Faster_EMA, [512]]  # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f_Faster_EMA, [1024]]  # 21 (P5/32-large)

  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)
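
To see what the `scales` entry does to the layer table above, here is a minimal sketch of the depth and width arithmetic that ultralytics' `parse_model` applies, as far as I understand it: repeat counts greater than 1 are multiplied by the depth factor and rounded (minimum 1), and output channels are capped at `max_channels`, multiplied by the width factor, and rounded up to a multiple of 8. The helper names `scaled_repeats`, `scaled_channels`, and `resolve` are made up for this illustration and are not part of ultralytics.

```python
import math

# [depth, width, max_channels], copied from the `scales` block of the config
SCALES = {
    "n": (0.33, 0.25, 1024),
    "s": (0.33, 0.50, 1024),
    "m": (0.67, 0.75, 768),
    "l": (1.00, 1.00, 512),
    "x": (1.00, 1.25, 512),
}

# (repeats, channels) of the four C2f_Faster_EMA rows in the backbone
BACKBONE_ROWS = [(3, 128), (6, 256), (6, 512), (3, 1024)]


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scaled_repeats(n, depth):
    """Scale a repeat count by the depth factor (hypothetical helper)."""
    return max(round(n * depth), 1) if n > 1 else n


def scaled_channels(c2, width, max_channels):
    """Scale an output channel count by the width factor (hypothetical helper)."""
    return make_divisible(min(c2, max_channels) * width, 8)


def resolve(scale):
    """Return the effective (repeats, channels) of each backbone block."""
    depth, width, max_channels = SCALES[scale]
    return [
        (scaled_repeats(n, depth), scaled_channels(c, width, max_channels))
        for n, c in BACKBONE_ROWS
    ]


for scale in SCALES:
    print(scale, resolve(scale))
```

For scale `n`, for instance, the row `[-1, 6, C2f_Faster_EMA, [256, True]]` ends up with 2 inner blocks and 64 output channels, which is why the same YAML file can describe all five model sizes.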

2. C2f-Faster-EMA code

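import torch
import torch.nn as nn
from timm.layers import DropPath  # stochastic depth (timm.models.layers in older timm releases)

# Conv, C3 and C2f come from ultralytics. Partial_conv3 (partial convolution) and
# EMA (attention) are not part of stock ultralytics and must be defined in block.py as well.

class Faster_Block_EMA(nn.Module):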
    def __init__(self,
                 inc,
                 dim,
                 n_div=4,
                 mlp_ratio=2,
                 drop_path=0.1,
                 layer_scale_init_value=0.0,
                 pconv_fw_type='split_cat'
                 ):
        super().__init__()
        self.dim = dim
        self.mlp_ratio = mlp_ratio
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        self.n_div = n_div

        mlp_hidden_dim = int(dim * mlp_ratio)

        mlp_layer = [
            Conv(dim, mlp_hidden_dim, 1),
            nn.Conv2d(mlp_hidden_dim, dim, 1, bias=False)
        ]

        self.mlp = nn.Sequential(*mlp_layer)

        self.spatial_mixing = Partial_conv3(
            dim,
            n_div,
            pconv_fw_type
        )
        self.attention = EMA(dim)
        
        self.adjust_channel = None
        if inc != dim:
            self.adjust_channel = Conv(inc, dim, 1)

        # Switch to the layer-scale forward pass only when layer scale is enabled
        if layer_scale_init_value > 0:
            self.layer_scale = nn.Parameter(layer_scale_init_value * torch.ones((dim)), requires_grad=True)
            self.forward = self.forward_layer_scale

    def forward(self, x):
        if self.adjust_channel is not None:
            x = self.adjust_channel(x)
        shortcut = x
        x = self.spatial_mixing(x)
        x = shortcut + self.attention(self.drop_path(self.mlp(x)))
        return x

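    def forward_layer_scale(self, x):
        # Used only when layer_scale_init_value > 0; this path does not apply self.attention
        if self.adjust_channel is not None:
            x = self.adjust_channel(x)  # match channels so the residual add is valid
        shortcut = x
        x = self.spatial_mixing(x)
        x = shortcut + self.drop_path(self.layer_scale.unsqueeze(-1).unsqueeze(-1) * self.mlp(x))
        return x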

class C3_Faster_EMA(C3):
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        self.m = nn.Sequential(*(Faster_Block_EMA(c_, c_) for _ in range(n)))

class C2f_Faster_EMA(C2f):
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        super().__init__(c1, c2, n, shortcut, g, e)
        self.m = nn.ModuleList(Faster_Block_EMA(self.c, self.c) for _ in range(n))
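
Faster_Block_EMA relies on two classes that this post does not list, Partial_conv3 and EMA. The sketch below gives one plausible definition of each, assuming they follow the partial convolution from FasterNet and the Efficient Multi-Scale Attention module. The author's actual versions may differ, and the `factor=8` default in particular is an assumption.

```python
import torch
import torch.nn as nn


class Partial_conv3(nn.Module):
    """ASSUMED definition: FasterNet-style partial 3x3 convolution."""

    def __init__(self, dim, n_div, forward):
        super().__init__()
        self.dim_conv3 = dim // n_div              # channels that get the 3x3 conv
        self.dim_untouched = dim - self.dim_conv3  # channels passed through unchanged
        self.partial_conv3 = nn.Conv2d(self.dim_conv3, self.dim_conv3, 3, 1, 1, bias=False)
        if forward == 'slicing':
            self.forward = self.forward_slicing
        elif forward == 'split_cat':
            self.forward = self.forward_split_cat
        else:
            raise NotImplementedError(forward)

    def forward_slicing(self, x):
        # Overwrite the first dim_conv3 channels of a copy (suited to inference)
        x = x.clone()
        x[:, :self.dim_conv3] = self.partial_conv3(x[:, :self.dim_conv3])
        return x

    def forward_split_cat(self, x):
        # Convolve the first chunk, then concatenate it with the untouched chunk
        x1, x2 = torch.split(x, [self.dim_conv3, self.dim_untouched], dim=1)
        return torch.cat((self.partial_conv3(x1), x2), 1)


class EMA(nn.Module):
    """ASSUMED definition: Efficient Multi-Scale Attention over channel groups."""

    def __init__(self, channels, factor=8):
        super().__init__()
        self.groups = factor
        assert channels // self.groups > 0
        cg = channels // self.groups  # channels per group
        self.softmax = nn.Softmax(-1)
        self.agp = nn.AdaptiveAvgPool2d((1, 1))
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))
        self.gn = nn.GroupNorm(cg, cg)
        self.conv1x1 = nn.Conv2d(cg, cg, kernel_size=1)
        self.conv3x3 = nn.Conv2d(cg, cg, kernel_size=3, padding=1)

    def forward(self, x):
        b, c, h, w = x.size()
        g = x.reshape(b * self.groups, -1, h, w)          # fold groups into batch
        x_h = self.pool_h(g)                              # (bg, cg, h, 1)
        x_w = self.pool_w(g).permute(0, 1, 3, 2)          # (bg, cg, w, 1)
        hw = self.conv1x1(torch.cat([x_h, x_w], dim=2))
        x_h, x_w = torch.split(hw, [h, w], dim=2)
        x1 = self.gn(g * x_h.sigmoid() * x_w.permute(0, 1, 3, 2).sigmoid())
        x2 = self.conv3x3(g)
        # Cross-spatial weighting between the 1x1 branch and the 3x3 branch
        x11 = self.softmax(self.agp(x1).reshape(b * self.groups, -1, 1).permute(0, 2, 1))
        x12 = x2.reshape(b * self.groups, c // self.groups, -1)
        x21 = self.softmax(self.agp(x2).reshape(b * self.groups, -1, 1).permute(0, 2, 1))
        x22 = x1.reshape(b * self.groups, c // self.groups, -1)
        weights = (torch.matmul(x11, x12) + torch.matmul(x21, x22)).reshape(b * self.groups, 1, h, w)
        return (g * weights.sigmoid()).reshape(b, c, h, w)


# Quick shape check on a dummy feature map
x = torch.rand(2, 32, 16, 16)
pconv = Partial_conv3(32, n_div=4, forward='split_cat')
y = pconv(x)
print(y.shape, EMA(32)(x).shape)
```

With `n_div=4`, only a quarter of the channels pass through the 3x3 convolution and the rest are copied unchanged, which is where the block saves computation. Both modules preserve the input shape, so they can sit inside the residual branch of Faster_Block_EMA.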

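II. How to add the module

1. Add the C2f-Faster-EMA code above to ultralytics/nn/modules/block.py.

2. Register the module name at the top of block.py by adding it to the __all__ tuple:

'C2f_Faster_EMA'

3. Register the module name in ultralytics/nn/modules/__init__.py, both in the import from .block and in __all__:

C2f_Faster_EMA     'C2f_Faster_EMA'

4. Add the module name to the import list at the top of ultralytics/nn/tasks.py:

C2f_Faster_EMA

5. In def parse_model in ultralytics/nn/tasks.py, add the module name next to C2f, both in the set of modules whose channel arguments are resolved and in the set of modules that receive the repeat count:

C2f_Faster_EMA

6. Start training with the configuration file above.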
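
Step 5 is the one that is easiest to get wrong, so here is a simplified stand-in for what `parse_model` does with each YAML row. It is not the ultralytics source: the set names `CHANNEL_AWARE` and `REPEAT_AWARE` and the function `resolve_args` are hypothetical, and only the argument handling is modeled.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


# Hypothetical stand-ins for the two module sets inside parse_model
CHANNEL_AWARE = {"Conv", "SPPF", "C2f", "C2f_Faster_EMA"}  # get [c_in, c_out, ...]
REPEAT_AWARE = {"C2f", "C2f_Faster_EMA"}                   # also get the repeat count


def resolve_args(module, args, c_in, n, width=0.25, max_channels=1024, nc=80):
    """Return (constructor_args, c_out) for one YAML row (simplified model)."""
    if module in CHANNEL_AWARE:
        c_out = args[0]
        if c_out != nc:
            c_out = make_divisible(min(c_out, max_channels) * width, 8)
        args = [c_in, c_out, *args[1:]]
        if module in REPEAT_AWARE:
            args.insert(2, n)  # the block builds its n inner layers itself
    else:
        c_out = c_in  # unknown module: YAML args are passed through untouched
    return args, c_out


# Registered: the block is built as C2f_Faster_EMA(c1=32, c2=32, n=1, shortcut=True)
print(resolve_args("C2f_Faster_EMA", [128, True], c_in=32, n=1))

# Not registered: the raw YAML args [128, True] would reach the constructor
print(resolve_args("Unregistered_Block", [128, True], c_in=32, n=1))
```

If the name is missing from the sets, the constructor receives the raw YAML arguments, so `c1` becomes 128 and `c2` becomes `True`, and the model either fails to build or builds with the wrong channel counts.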

III. yolov8-C2f-Faster-EMA

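### Visualizing intermediate feature maps of the improved YOLOv8 C2f-Faster model

The intermediate feature maps of the improved YOLOv8 model can be visualized with the following common techniques.

#### 1. Extracting a specific layer's output with PyTorch hooks

PyTorch's `register_forward_hook` captures the activations of a chosen layer during training or inference:

- First load the YOLOv8 model whose C2f modules have been rebuilt with FasterBlock[^1].
- Then locate the target layers (the FasterBlock parts that were swapped in) and register a forward hook on each of them.

```python
import torch
from ultralytics import YOLO

activations = {}  # layer name -> captured output

def make_hook(name):
    """Return a forward hook that stores the layer's output under `name`."""
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Load the custom YOLOv8 model
model = YOLO('path_to_your_model.pt')

# named_modules() yields attribute paths such as 'model.2.m.0', not class names,
# so match on the class name (covers Faster_Block and Faster_Block_EMA)
for name, layer in model.model.named_modules():
    if 'Faster_Block' in type(layer).__name__:
        layer.register_forward_hook(make_hook(name))

# Run inference on a dummy input with values in [0, 1]
input_tensor = torch.rand(1, 3, 640, 640)
output = model(input_tensor)

# Pick the first captured layer for the plot below
activation_output = next(iter(activations.values()))
print(f"FasterBlock Layer Output Shape: {activation_output.shape}")
```

The code above locates the FasterBlock layers and stores their outputs for later analysis.

#### 2. Plotting heatmaps with TensorBoard or other tools

Once the intermediate feature maps of interest are available, Matplotlib or Seaborn can render them as images. For more complex scenarios, a dedicated tool such as TensorBoard is recommended for multi-dimensional data display.

For example, after obtaining the multi-channel response of a layer, a grayscale rendering of one channel can be created as follows:

```python
import matplotlib.pyplot as plt

# Take the first channel of the first sample as an example
feature_map = activation_output[0][0].cpu().numpy()

plt.figure(figsize=(10, 10))
plt.imshow(feature_map, cmap='gray')  # show a single feature map
plt.axis('off')  # hide the axes
plt.show()
```

To view every channel of a whole batch, loop over the individual channels and arrange them in a grid to form the final figure.

Note also that the native YOLO architectures already contain many cross-scale fusion operations, so even a simple replacement of a basic component, such as swapping the block inside C2f for FasterBlock as described here, may noticeably change the overall information flow[^2]. In practice, it may therefore be necessary to pay extra attention to how the interactions between stages change.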
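
The grid layout for viewing all channels at once can be sketched as follows. This is a minimal illustration that uses only torch; the function name `tile_channels` and the `max_channels=64` cap are assumptions made for the example, and each channel is rescaled to [0, 1] independently so that weak channels remain visible.

```python
import math
import torch


def tile_channels(feature, max_channels=64):
    """Arrange the channels of a (C, H, W) feature map into one 2D image.

    Hypothetical helper: returns a tensor of shape (rows * H, cols * W).
    """
    feature = feature[:max_channels].float()
    c, h, w = feature.shape

    # Min-max normalize every channel on its own
    flat = feature.reshape(c, -1)
    lo = flat.min(dim=1).values.view(c, 1, 1)
    hi = flat.max(dim=1).values.view(c, 1, 1)
    feature = (feature - lo) / (hi - lo).clamp_min(1e-8)

    # Choose a near-square grid and pad it with blank tiles
    cols = math.ceil(math.sqrt(c))
    rows = math.ceil(c / cols)
    pad = rows * cols - c
    if pad:
        feature = torch.cat([feature, feature.new_zeros(pad, h, w)])

    # (rows, cols, H, W) -> (rows, H, cols, W) -> (rows * H, cols * W)
    grid = feature.reshape(rows, cols, h, w).permute(0, 2, 1, 3)
    return grid.reshape(rows * h, cols * w)


# Example with a dummy activation of 10 channels, each 4x4
demo = torch.arange(10 * 4 * 4).reshape(10, 4, 4)
grid = tile_channels(demo)
print(grid.shape)
```

The returned tensor can be passed to `plt.imshow(grid.cpu().numpy(), cmap='gray')` in the same way as the single-channel example, using one sample of the captured activation (for instance `activation_output[0]`) as input.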