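Experimental validation
On the VOC dataset, trained without pretrained weights, YOLOv13 falls short of both YOLOv12 and YOLOv11: it adds noticeable inference overhead without improving accuracy (mAP50-95 is on par, while mAP50 drops by 0.006 to 0.011).
Test results (epochs: 100; image size: 640; batch size: 32; dataset: VOC):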
| Model | mAP50-95 | mAP50 | run time (h) | params (M) | inference time (ms) |
|---|---|---|---|---|---|
| YOLOv8n | 0.549 | 0.760 | 1.051 | 3.01 | 0.2+0.3(postprocess) |
| YOLOv11n | 0.553 | 0.757 | 1.142 | 2.59 | 0.2+0.3(postprocess) |
| YOLOv12n | 0.553 | 0.762 | 1.965 | 2.51 | 0.4+0.2(postprocess) |
| YOLOv13n | 0.553 | 0.751 | 2.263 | 2.45 | 0.5+0.2(postprocess) |
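
To put the table in relative terms, the short calculation below compares YOLOv13n with each of the other rows. It is plain arithmetic on the figures reported above; the dictionary layout and the `compare` helper are illustrative names, not part of any library.

```python
# Figures copied from the table above:
# (mAP50-95, mAP50, run time in hours, parameters in millions).
results = {
    "YOLOv8n": (0.549, 0.760, 1.051, 3.01),
    "YOLOv11n": (0.553, 0.757, 1.142, 2.59),
    "YOLOv12n": (0.553, 0.762, 1.965, 2.51),
    "YOLOv13n": (0.553, 0.751, 2.263, 2.45),
}


def compare(model, baseline):
    """Return the mAP50 difference and the relative run-time change of model vs. baseline."""
    _, map50_m, hours_m, _ = results[model]
    _, map50_b, hours_b, _ = results[baseline]
    return {
        "map50_delta": round(map50_m - map50_b, 3),
        "run_time_change_pct": round(100.0 * (hours_m - hours_b) / hours_b, 1),
    }


for baseline in ("YOLOv8n", "YOLOv11n", "YOLOv12n"):
    print(baseline, compare("YOLOv13n", baseline))
```

In this run YOLOv13n took roughly 15% longer than YOLOv12n and about twice as long as YOLOv11n, while its mAP50 is lower than every other row.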
A guided reading of YOLOv13

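YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception proposes the following main improvements:
- HyperACE: Hypergraph-based Adaptive Correlation Enhancement
  - The C3AH module learns global high-order perceptual information, mainly through adaptive hypergraph computation, which has only linear complexity.
  - The DS-C3k module learns local low-order perceptual information.
- FullPAD: Full-Pipeline Aggregation-and-Distribution Paradigm
  - Multi-scale feature maps are collected from the backbone and passed to HyperACE; the enhanced features are then redistributed to positions throughout the pipeline (the neck) via different FullPAD strategies.
  - The paper credits this with fine-grained information flow and representation synergy, significantly better gradient propagation, and stronger detection performance.
  - Gold-YOLO (NeurIPS 2023) also proposed an aggregate-and-distribute strategy: it gathers and aligns features from different layers, then injects the aggregated information into every level through a simple attention mechanism.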
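
The "distribute" half of the strategy can be sketched in a few lines. The NumPy example below is only an illustration under stated assumptions: it supposes that the enhanced feature is resized to each target scale and added through a scalar gate, and the names `avg_pool2`, `upsample2`, `distribute` and `gates` are hypothetical rather than taken from the YOLOv13 code.

```python
import numpy as np


def avg_pool2(x):
    """2x2 average pooling on a (C, H, W) array; H and W must be even."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def upsample2(x):
    """Nearest-neighbour 2x upsampling on a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def distribute(enhanced, targets, gates):
    """Inject one enhanced mid-scale feature into features at three scales.

    enhanced: (C, H, W) feature produced by the enhancement block.
    targets:  [high-res (C, 2H, 2W), mid-res (C, H, W), low-res (C, H/2, W/2)].
    gates:    one scalar per target; 0.0 leaves that target unchanged.
              (The gated residual form is an assumption of this sketch.)
    """
    resized = [upsample2(enhanced), enhanced, avg_pool2(enhanced)]
    return [t + g * r for t, g, r in zip(targets, gates, resized)]


rng = np.random.default_rng(0)
enhanced = rng.standard_normal((8, 4, 4))
targets = [
    rng.standard_normal((8, 8, 8)),
    rng.standard_normal((8, 4, 4)),
    rng.standard_normal((8, 2, 2)),
]
outputs = distribute(enhanced, targets, gates=[0.0, 0.5, 1.0])
print([o.shape for o in outputs])
```

Each output keeps the shape of its target, so the injection can be placed anywhere in the neck without changing the layers that follow it.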
HyperACE

- Downsample B3 and upsample B5 from the backbone, concatenate both with B4, and then adjust the channels with a 1x1 convolution.
import torch
import torch.nn as nn
from ultralytics.nn.modules.conv import Conv  # assumed: the Ultralytics Conv block (Conv2d + BatchNorm + activation)


class FuseModule(nn.Module):
    """
    A module to fuse multi-scale features for the HyperACE block.

    This module takes a list of three feature maps from different scales, aligns them to a common
    spatial resolution by downsampling the first and upsampling the third, and then concatenates
    and fuses them with a convolution layer.

    Attributes:
        c_in (int): The number of channels of the input feature maps.
        channel_adjust (bool): Whether to adjust the channel count of the concatenated features.

    Methods:
        forward: Fuses a list of three multi-scale feature maps.

    Examples:
        >>> import torch
        >>> model = FuseModule(c_in=64, channel_adjust=False)
        >>> # Input is a list of features from different backbone stages
        >>> x_list = [torch.randn(2, 64, 64, 64), torch.randn(2, 64, 32, 32), torch.randn(2, 64, 16, 16)]
        >>> output = model(x_list)
        >>> print(output.shape)
        torch.Size([2, 64, 32, 32])
    """

    def __init__(self, c_in, channel_adjust):
        super().__init__()
        self.downsample = nn.AvgPool2d(kernel_size=2)  # halves H and W of the high-resolution map
        self.upsample = nn.Upsample(scale_factor=2, mode='nearest')  # doubles H and W of the low-resolution map
        if channel_adjust:
            # The concatenated input is expected to carry 4 * c_in channels in this case.
            self.conv_out = Conv(4 * c_in, c_in, 1)
        else:
            self.conv_out = Conv(3 * c_in, c_in, 1)

    def forward(self, x):
        x1_ds = self.downsample(x[0])
        x3_up = self.upsample(x[2])
        x_cat = torch.cat([x1_ds, x[1], x3_up], dim=1)  # concatenate along the channel axis
        out = self.conv_out(x_cat)  # 1x1 convolution back to c_in channels
        return out
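- Learn global high-order perceptual information with the C3AH module: 1) AdaHyperedgeGen generates the hyperedge participation matrix; 2) AdaHGConv builds the hypergraph from the hyperedges; 3) AdaHGComputation uses the hypergraph to capture high-order perceptual information.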
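
The core of the hypergraph step is a two-stage message pass through a soft participation matrix, which also shows where the linear complexity comes from. The NumPy sketch below is a simplified illustration, not the author's `AdaHGConv`: it omits the learned projections, and the residual connection and the name `hypergraph_conv` are assumptions of this example.

```python
import numpy as np


def hypergraph_conv(X, A):
    """Two-stage message passing through a soft participation matrix.

    X: (N, D) vertex features (for an image, N = H * W positions).
    A: (N, M) participation of each vertex in each of M hyperedges.
    Cost is O(N * M * D), i.e. linear in N for a fixed number of hyperedges.
    """
    hyperedge_feats = A.T @ X       # (M, D): vertices -> hyperedges
    messages = A @ hyperedge_feats  # (N, D): hyperedges -> vertices
    return X + messages             # residual connection (an assumption of this sketch)


rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))       # N = 6 vertices, D = 3 channels
A = rng.random((6, 2))                # M = 2 hyperedges, soft memberships
A = A / A.sum(axis=0, keepdims=True)  # normalise each hyperedge over the vertices
out = hypergraph_conv(X, A)
print(out.shape)
```

Because every vertex talks only to the M hyperedges rather than to all other vertices, no N x N attention matrix is ever formed.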

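2.1 AdaHyperedgeGen generates an adaptive hyperedge participation matrix: it dynamically builds hyperedge prototypes from the global context of the node features ('mean', 'max', or 'both') and computes a continuous participation matrix between every node and every hyperedge. (The code below is annotated.)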
class AdaHyperedgeGen(nn.Module):
    """
    Generates an adaptive hyperedge participation matrix from a set of vertex features.
    """

    def __init__(self, node_dim, num_hyperedges, num_heads=4, dropout=0.1, context="both"):
        super().__init__()
        self.num_heads = num_heads
        self.num_hyperedges = num_hyperedges
        self.head_dim = no
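
The listing above is cut off, so the idea described in 2.1 is sketched here independently in NumPy. This is not the author's module: it uses a single head, and the prototype offset `W_ctx`, the scaled dot-product similarity and the softmax over vertices are assumptions of this example; only the three context modes come from the text.

```python
import numpy as np


def global_context(X, mode="both"):
    """Summarise (N, D) vertex features into one context vector."""
    if mode == "mean":
        return X.mean(axis=0)
    if mode == "max":
        return X.max(axis=0)
    if mode == "both":
        return np.concatenate([X.mean(axis=0), X.max(axis=0)])
    raise ValueError(f"unknown context mode: {mode}")


def participation_matrix(X, base_prototypes, W_ctx, mode="both"):
    """Return an (N, M) soft participation matrix.

    X:               (N, D) vertex features.
    base_prototypes: (M, D) hyperedge prototypes shared across inputs.
    W_ctx:           (context_dim, M * D) map from context to prototype offsets
                     (hypothetical parameter; context_dim is D or 2 * D).
    """
    n, d = X.shape
    m = base_prototypes.shape[0]
    ctx = global_context(X, mode)
    # Input-dependent prototypes: shared base plus a context-driven offset.
    prototypes = base_prototypes + (ctx @ W_ctx).reshape(m, d)
    logits = X @ prototypes.T / np.sqrt(d)       # (N, M) similarities
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    # Softmax over the vertex axis (an assumption): each hyperedge's weights sum to 1.
    return weights / weights.sum(axis=0, keepdims=True)


rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))             # N = 6 vertices, D = 4 channels
base = rng.standard_normal((3, 4))          # M = 3 hyperedges
W_ctx = 0.1 * rng.standard_normal((8, 12))  # context_dim = 2 * D for mode "both"
A = participation_matrix(X, base, W_ctx, mode="both")
print(A.shape)
```

Every entry of `A` is a continuous weight rather than a hard 0/1 membership, which is what makes the hyperedges differentiable and learnable end to end.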
