
arXiv-2024
Code:
- https://github.com/ultralytics/ultralytics
Documentation:
1、Background and Motivation
Background
- Rapid progress in computer vision
- Evolution of the YOLO model family
- Release of YOLOv11
Motivation
- Improve object detection performance
- Improve model efficiency and scalability
- Advance real-time computer vision applications
This study presents an architectural analysis of YOLOv11, the latest iteration in the YOLO (You Only Look Once) series of object detection models.
Mango YOLO11 algorithm analysis: the latest YOLO11 architecture diagram and detailed diagrams of each YOLO11 component




2、Related Work
YOLOv1 to YOLOv10
3、Evolution of YOLO models

I am not sure the contributions listed here are accurate: early YOLOv5 should be anchor-based, though reference [10] does not point to the official YOLOv5 page.

4、Architectural footprint of YOLOv11

The core changes are C3k2 (Cross Stage Partial with kernel size 2) and C2PSA (Convolutional block with Parallel Spatial Attention).
Judging from the description in the source code, however, the "2" seems to mean two convolutions (two C3k blocks), not kernel size.
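To make the C3k2 reading concrete, here is a minimal channel-bookkeeping sketch. It assumes C3k2 follows a C2f-style layout: a 1x1 convolution produces two chunks of a hidden width, n inner blocks each add one more chunk, and a final 1x1 convolution fuses the concatenation. The helper `c3k2_channels` and the dataclass are hypothetical names for illustration, not part of the ultralytics API, and the model-scale width and depth multipliers are ignored.

```python
from dataclasses import dataclass


@dataclass
class C3k2Channels:
    hidden: int       # width of each chunk inside the block
    split_out: int    # channels after the first 1x1 conv (two chunks)
    concat_in: int    # channels entering the final 1x1 conv
    out: int          # output channels of the block
    inner_block: str  # which inner block is repeated n times


def c3k2_channels(c2: int, n: int = 1, c3k: bool = False, e: float = 0.5) -> C3k2Channels:
    """Channel counts for a C2f-style block (assumed layout, not verified API)."""
    hidden = int(c2 * e)  # hidden width is a fraction e of the output width
    return C3k2Channels(
        hidden=hidden,
        split_out=2 * hidden,        # first conv output, later split in two
        concat_in=(2 + n) * hidden,  # two chunks plus one per inner block
        out=c2,
        inner_block="C3k" if c3k else "Bottleneck",  # assumed meaning of the flag
    )


if __name__ == "__main__":
    # Rows taken from the config: [256, False, 0.25] and [1024, True], repeats = 2.
    for c2, n, c3k, e in [(256, 2, False, 0.25), (1024, 2, True, 0.5)]:
        print(c3k2_channels(c2, n, c3k, e))
```

Under these assumptions, the boolean in the config args only switches the inner block type, which is consistent with the "2" not referring to a kernel size.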
# YOLO11n backbone
backbone:
# [from, repeats, module, args]
- [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
- [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
- [-1, 2, C3k2, [256, False, 0.25]]
- [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
- [-1, 2, C3k2, [512, False, 0.25]]
- [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
- [-1, 2, C3k2, [512, True]]
- [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
- [-1, 2, C3k2, [1024, True]]
- [-1, 1, SPPF, [1024, 5]] # 9
- [-1, 2, C2PSA, [1024]] # 10
# YOLO11n head
head:
- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
- [[-1, 6], 1, Concat, [1]] # cat backbone P4
- [-1, 2, C3k2, [512, False]] # 13
- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
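The P1/2 to P5/32 comments in the backbone can be checked with a few lines of Python. The sketch below copies the backbone rows from the config and assumes that Conv args are [out_channels, kernel, stride] and that only the stride-2 Conv layers change resolution (C3k2, SPPF and C2PSA are assumed to preserve it). The function name `pyramid_levels` is hypothetical.

```python
# Backbone rows copied from the config: (from, repeats, module, args).
BACKBONE = [
    (-1, 1, "Conv", [64, 3, 2]),
    (-1, 1, "Conv", [128, 3, 2]),
    (-1, 2, "C3k2", [256, False, 0.25]),
    (-1, 1, "Conv", [256, 3, 2]),
    (-1, 2, "C3k2", [512, False, 0.25]),
    (-1, 1, "Conv", [512, 3, 2]),
    (-1, 2, "C3k2", [512, True]),
    (-1, 1, "Conv", [1024, 3, 2]),
    (-1, 2, "C3k2", [1024, True]),
    (-1, 1, "SPPF", [1024, 5]),
    (-1, 2, "C2PSA", [1024]),
]


def pyramid_levels(rows):
    """Map 'P<k>' to (layer index, cumulative stride) for each strided Conv."""
    levels = {}
    stride = 1
    for index, (_, _, module, args) in enumerate(rows):
        # Assumption: Conv args are [out_channels, kernel, stride].
        if module == "Conv" and args[2] > 1:
            stride *= args[2]
            levels[f"P{len(levels) + 1}"] = (index, stride)
    return levels


if __name__ == "__main__":
    image_size = 640  # illustrative input size, not taken from the config
    for name, (index, stride) in pyramid_levels(BACKBONE).items():
        print(f"{name}: layer {index}, stride {stride}, "
              f"feature map {image_size // stride}x{image_size // stride}")
```

The layer indices this produces (0, 1, 3, 5, 7) line up with the comments in the config, and the head's Concat with layer 6 picks up the P4 features at stride 16.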




