YOLOv8 Improvement | Effective Accuracy Gain | Improving the YOLOv8 Neck with HyperC2Net, the Scale-Fusion Scheme from Hyper-YOLO (TPAMI 2025)

Latest recommended article published on 2025-11-24 17:01:47


License: CC 4.0 BY-SA

Introduction

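To improve YOLOv8's multi-scale feature fusion, this post borrows HyperC2Net, the scale-fusion scheme proposed in Hyper-YOLO (TPAMI 2025), to rework the Neck of YOLOv8. HyperC2Net passes high-order messages across both semantic levels and spatial positions, which strengthens the Neck's ability to extract high-order features. It fuses the feature maps of the five backbone stages to build a hypergraph structure, merges the resulting hypergraph features into $B_3$, $B_4$, and $B_5$ separately, and finally propagates the high-order information through a bottom-up path. The experimental results are as follows (evaluated on the VOC dataset with 100 epochs, batch size 32, and 640×640 input images):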

Model	mAP50-95	mAP50	run time (h)	params (M)	inference time (ms)
YOLOv8	0.549	0.760	1.051	3.01	0.2+0.3(postprocess)
YOLO11	0.553	0.757	1.142	2.59	0.2+0.3(postprocess)
yolov8_HyperC2Net	0.561	0.770	1.229	3.29	0.4+0.3(postprocess)


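Important note: the effect of this modification is dataset-dependent. A change that does not help on the dataset I used may still be effective on other datasets, and vice versa.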

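The purpose of this post is to lower the difficulty of porting recent research code to YOLO, as a reference for readers interested in the latest work.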
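
To make the five-stage fusion concrete, the sketch below computes, for a 640×640 input, the spatial size of each backbone stage and the resampling factor needed to bring it onto the stride-16 grid of B4, where the features are concatenated. The stage strides and the choice of B4 as the common resolution are read off the configuration file in Step 4; the helper name `align_to_stride` is hypothetical and not part of Hyper-YOLO or ultralytics.

```python
# Strides of the five backbone stages B1..B5, as annotated in the Step 4 config.
STAGE_STRIDES = {"B1": 2, "B2": 4, "B3": 8, "B4": 16, "B5": 32}


def align_to_stride(img_size, target_stride=16):
    """Return {stage: (size, op, factor)} describing how each stage reaches the target stride."""
    plan = {}
    for name, stride in STAGE_STRIDES.items():
        size = img_size // stride
        if stride < target_stride:
            # Higher-resolution stages are shrunk by average pooling.
            plan[name] = (size, "avgpool", target_stride // stride)
        elif stride > target_stride:
            # Lower-resolution stages are enlarged by nearest-neighbour upsampling.
            plan[name] = (size, "upsample", stride // target_stride)
        else:
            plan[name] = (size, "identity", 1)
    return plan


img_size, target_stride = 640, 16
plan = align_to_stride(img_size, target_stride)
target = img_size // target_stride
for name, (size, op, factor) in plan.items():
    print(f"{name}: {size}x{size} -> {op} x{factor} -> {target}x{target}")
```

The factors (8, 4, 2 for pooling and 2 for upsampling) match the `nn.AvgPool2d` and `nn.Upsample` arguments of layers 10 to 13 in the configuration.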

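Code Migration

Key Points

Step 1: Port the code

In the ultralytics framework, module code lives mainly under the ultralytics/nn folder. To keep our additions separate from the official code, create a new extra_modules folder there and save the code below in it as hyper_yolo.py (the file name that Step 2 imports from).

The code is as follows: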

import torch
import torch.nn as nn

__all__ = ['HyperComputeModule']

class MessageAgg(nn.Module):
    def __init__(self, agg_method="mean"):
        super().__init__()
        self.agg_method = agg_method

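    def forward(self, X, path):
        """
            X: [batch, n_source, dim] features of the source nodes
            path: [batch, n_target, n_source] 0/1 connectivity, col(source) -> row(target)
        """
        # Sum the features of all sources connected to each target.
        X = torch.matmul(path, X)
        if self.agg_method == "mean":
            # Divide by each target's degree; targets with no sources get 0 instead of inf.
            norm_out = 1 / torch.sum(path, dim=2, keepdim=True)
            norm_out[torch.isinf(norm_out)] = 0
            X = norm_out * X
        # "sum" needs no normalisation; any other value also falls through unchanged.
        return X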


class HyPConv(nn.Module):
    def __init__(self, c1, c2):
        super().__init__()
        self.fc = nn.Linear(c1, c2)
        self.v2e = MessageAgg(agg_method="mean")
        self.e2v = MessageAgg(agg_method="mean")


    def forward(self, x, H):
        x = self.fc(x)
        # v -> e
        E = self.v2e(x, H.transpose(1, 2).contiguous())
        # e -> v
        x = self.e2v(E, H)

        return x


class HyperComputeModule(nn.Module):
    def __init__(self, c1, c2, threshold):
        super().__init__()
        self.threshold = threshold
        self.hgconv = HyPConv(c1, c2)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU()


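    def forward(self, x):
        b, c, h, w = x.shape
        # Flatten the spatial grid: [b, c, h, w] -> [b, h*w, c], one node per pixel.
        x = x.view(b, c, -1).transpose(1, 2).contiguous()
        feature = x.clone()
        # Pairwise Euclidean distance between all nodes: [b, h*w, h*w].
        distance = torch.cdist(feature, feature)
        # Incidence matrix: node i belongs to the hyperedge centred on node j
        # when their distance is below the threshold.
        hg = distance < self.threshold
        hg = hg.float().to(x.device).to(x.dtype)
        # Hypergraph convolution plus residual; the residual and the reshape below
        # both require c1 == c2.
        x = self.hgconv(x, hg).to(x.device).to(x.dtype) + x
        x = x.transpose(1, 2).contiguous().view(b, c, h, w)
        x = self.act(self.bn(x))

        return x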
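
The following pure-Python sketch traces what `HyperComputeModule` computes on a tiny example, leaving out the linear layer, the batch dimension, BatchNorm, and the activation so that the arithmetic stays visible. Each node spawns one hyperedge containing every node within `threshold` of it (Euclidean distance); features are then averaged from vertices to hyperedges and back to vertices, and the input is added as a residual. The three 2-D points, the threshold value, and the helper names are made up for illustration.

```python
import math


def build_incidence(feats, threshold):
    """H[v][e] = 1.0 if node v lies within `threshold` of node e (the hyperedge's centre)."""
    n = len(feats)
    return [[1.0 if math.dist(feats[v], feats[e]) < threshold else 0.0
             for e in range(n)] for v in range(n)]


def mean_aggregate(path, X):
    """Row-normalised path @ X: each target averages the sources it is connected to."""
    dim = len(X[0])
    out = []
    for row in path:
        deg = sum(row)
        acc = [sum(w * x[d] for w, x in zip(row, X)) for d in range(dim)]
        # A target with no sources gets zeros, mirroring the inf -> 0 guard in MessageAgg.
        out.append([a / deg if deg else 0.0 for a in acc])
    return out


def transpose(M):
    return [list(col) for col in zip(*M)]


# Nodes 0 and 1 are close together; node 2 is far from both.
feats = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]
H = build_incidence(feats, threshold=2.0)

edge_feats = mean_aggregate(transpose(H), feats)  # v -> e
node_feats = mean_aggregate(H, edge_feats)        # e -> v
out = [[a + b for a, b in zip(n, f)] for n, f in zip(node_feats, feats)]  # residual

print(H)           # [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(node_feats)  # [[0.5, 0.0], [0.5, 0.0], [10.0, 0.0]]
print(out)         # [[0.5, 0.0], [1.5, 0.0], [20.0, 0.0]]
```

Nodes 0 and 1 share their hyperedges, so their messages are pulled to a common mean, while the isolated node 2 only ever sees itself.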

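Step 2: Create the package and import the module

Next, add an __init__.py file to the extra_modules folder and import the new module in it, so that callers can simply write from extra_modules import *. The __init__.py file should contain: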

from .hyper_yolo import HyperComputeModule

The resulting directory structure is:

nn/
└── extra_modules/
    ├── __init__.py
    └── hyper_yolo.py
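
As a quick way to see why the `__init__.py` re-export is enough, the sketch below rebuilds the same layout in a temporary directory with a stub class and checks which names a star-import would pick up. Everything here is a stand-in: the stub replaces the real `HyperComputeModule`, and the package is imported from a temp folder rather than from `ultralytics.nn`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the layout above; the class body is a stand-in stub.
root = Path(tempfile.mkdtemp())
pkg = root / "extra_modules"
pkg.mkdir()
(pkg / "hyper_yolo.py").write_text("class HyperComputeModule:\n    pass\n")
(pkg / "__init__.py").write_text("from .hyper_yolo import HyperComputeModule\n")

# Make the temporary folder importable and load the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
extra_modules = importlib.import_module("extra_modules")

# The package defines no __all__, so `from extra_modules import *` brings in
# every public (non-underscore) name bound in the package namespace.
exported = [name for name in vars(extra_modules) if not name.startswith("_")]
print(exported)
```

Because the class is bound directly in the package namespace, `tasks.py` can reach it through the package without knowing which file it lives in.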

Step 3: Modify the `tasks.py` file

First, add the following import to tasks.py:

from ultralytics.nn.extra_modules import *

Then locate the parse_model() function and find the following code inside it:

        if m in base_modules:
            c1, c2 = ch[f], args[0]
            if c2 != nc:  # if c2 not equal to number of classes (i.e. for Classify() output)
                c2 = make_divisible(min(c2, max_channels) * width, 8)

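In older ultralytics versions this condition may not use base_modules but an inline collection of module classes; in that case, add HyperComputeModule to that collection directly. Otherwise, find the set that base_modules refers to and add the module there, as follows: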

    base_modules = frozenset(
        {
            Classify, Conv, ConvTranspose, GhostConv, Bottleneck, GhostBottleneck,
            SPP, SPPF, C2fPSA, C2PSA, DWConv, Focus, BottleneckCSP, C1, C2, C2f, C3k2,
            RepNCSPELAN4, ELAN1, ADown, AConv, SPPELAN, C2fAttn, C3, C3TR, C3Ghost,
            torch.nn.ConvTranspose2d, DWConvTranspose2d, C3x, RepC3, PSA, SCDown, C2fCIB,
            A2C2f,
            # custom module
            HyperComputeModule,
        }
    )
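
Registering the class in base_modules matters because that branch rewrites the YAML arguments before the module is constructed: it scales the output channels and prepends the input channels, which is what lets a YAML entry of `[512, 6]` satisfy the constructor signature `(c1, c2, threshold)`. The sketch below is a simplified re-implementation of that step as I understand it from the snippet above, not the actual ultralytics function; `resolve_args` is a hypothetical name, and the real `parse_model()` handles more cases.

```python
import math


def make_divisible(x, divisor):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def resolve_args(c1, args, width, max_channels, nc=80):
    """Simplified base_modules branch: scale c2, then prepend the input channels c1."""
    c2 = args[0]
    if c2 != nc:  # class-count outputs are left unscaled
        c2 = make_divisible(min(c2, max_channels) * width, 8)
    return [c1, c2, *args[1:]]


# Layer 16 in the Step 4 config: HyperComputeModule with args [512, 6], scale 'n'.
# Its input is layer 15, a Conv with 512 nominal channels, i.e. 128 after scaling.
width, max_channels = 0.25, 1024
c1 = make_divisible(min(512, max_channels) * width, 8)
module_args = resolve_args(c1, [512, 6], width, max_channels)
print(module_args)  # [128, 128, 6]
```

The result also shows that the input and output channel counts agree (128 each) at this position, which the residual connection inside `HyperComputeModule` relies on.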

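Step 4: Modify the configuration file

Add the following to the model configuration file. The backbone is unchanged from the stock YOLOv8 configuration; the HyperC2Net neck lives in the head section.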

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024] # YOLOv8n summary: 129 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPS
  s: [0.33, 0.50, 1024] # YOLOv8s summary: 129 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPS
  m: [0.67, 0.75, 768] # YOLOv8m summary: 169 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPS
  l: [1.00, 1.00, 512] # YOLOv8l summary: 209 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPS
  x: [1.00, 1.25, 512] # YOLOv8x summary: 209 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPS

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]   # 0-B1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1
  - [-1, 3, C2f, [128, True]]   # 2-B2/4
  - [-1, 1, Conv, [256, 3, 2]]  # 3
  - [-1, 6, C2f, [256, True]]   # 4-B3/8
  - [-1, 1, Conv, [512, 3, 2]]  # 5
  - [-1, 6, C2f, [512, True]]   # 6-B4/16
  - [-1, 1, Conv, [1024, 3, 2]] # 7
  - [-1, 3, C2f, [1024, True]]  # 8
  - [-1, 1, SPPF, [1024, 5]]    # 9-B5/32

# YOLOv8.0n head
head:
  # Semantic Collecting
  - [0, 1, nn.AvgPool2d, [8, 8, 0]]           # 10
  - [2, 1, nn.AvgPool2d, [4, 4, 0]]           # 11
  - [4, 1, nn.AvgPool2d, [2, 2, 0]]           # 12
  - [9, 1, nn.Upsample, [None, 2, 'nearest']] # 13
  - [[10, 11, 12, 6, 13], 1, Concat, [1]]     # cat 14

  # Hypergraph Computation
  - [-1, 1, Conv, [512, 1, 1]]                # 15
  - [-1, 1, HyperComputeModule, [512, 6]]     # 16
  - [-1, 3, C2f, [512, True]]                 # 17

  # Semantic Collecting
  - [-1, 1, nn.AvgPool2d, [2, 2, 0]]          # 18
  - [[-1, 9], 1, Concat, [1]]                 # cat 19
  - [-1, 1, Conv, [1024, 1, 1]]               # 20 P5

  - [[17, 6], 1, Concat, [1]]                 # cat 21
  - [-1, 3, C2f, [512]]                       # 22 P4

  - [17, 1, nn.Upsample, [None, 2, 'nearest']] # 23
  - [[-1, 4], 1, Concat, [1]]                  # cat 24
  - [-1, 3, C2f, [256]]                        # 25 P3/N3

  - [-1, 1, Conv, [256, 3, 2]]                 # 26
  - [[-1, 22], 1, Concat, [1]]                 # 27 cat 
  - [-1, 3, C2f, [512]]                        # 28 N4

  - [-1, 1, Conv, [512, 3, 2]]                 # 29
  - [[-1, 20], 1, Concat, [1]]                 # 30 cat
  - [-1, 3, C2f, [1024]]                       # 31 N5

  - [[25, 28, 31], 1, Detect, [nc]]  # Detect(N3, N4, N5)
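
With the configuration saved, training can be launched through the usual ultralytics Python API. The sketch below reuses the settings reported in the introduction (VOC, 100 epochs, batch size 32, 640×640 images). The file name `yolov8-HyperC2Net.yaml` is hypothetical, and `VOC.yaml` assumes the dataset config bundled with ultralytics is the one intended; adjust both to your setup.

```python
# Settings reported in the introduction: VOC, 100 epochs, batch size 32, 640x640 images.
TRAIN_ARGS = {"data": "VOC.yaml", "epochs": 100, "batch": 32, "imgsz": 640}

# Hypothetical file name for the configuration above.
MODEL_CFG = "yolov8-HyperC2Net.yaml"


def train(cfg=MODEL_CFG, **overrides):
    """Build the model from the YAML and train it; requires ultralytics to be installed."""
    # Imported lazily so the settings can be inspected without ultralytics present.
    from ultralytics import YOLO

    model = YOLO(cfg)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

Calling `train()` starts a full run; `train(epochs=1)` is a cheaper way to confirm that the YAML parses and that `HyperComputeModule` is found by `parse_model()`.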