On Anchor Boxes, RPN, and ROIAlign

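This blog post takes an in-depth look at anchor boxes in object detection and the role they play in Faster R-CNN. It walks through the source code to explain how anchors are generated, and gives a clear account of the RPN (Region Proposal Network) and ROIAlign. It also explains what the related configuration parameters in the mmdetection framework mean, for readers who want to study object detection algorithms in more depth.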
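To make the anchor generation step more concrete, here is a minimal NumPy sketch. It is not the mmdetection source code. The function names, the default `scales` and `ratios`, and the convention that a ratio means height / width are all assumptions chosen for illustration. The idea is to build a small set of base anchors centered at the origin, then shift them to every cell of the feature map.

```python
import numpy as np

def generate_base_anchors(base_size=16, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Base anchors centered at (0, 0), as [x1, y1, x2, y2].

    Assumption for this sketch: ratio = height / width, and each anchor
    keeps the area (base_size * scale) ** 2 regardless of its ratio.
    """
    anchors = []
    for ratio in ratios:
        for scale in scales:
            area = float(base_size * scale) ** 2
            w = np.sqrt(area / ratio)
            h = w * ratio
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors, dtype=np.float32)

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Tile the base anchors over a feat_h x feat_w feature map.

    Returns an array of shape (feat_h * feat_w * num_base, 4) in image
    coordinates, ordered by location first, then by base anchor.
    """
    base = generate_base_anchors(stride, scales, ratios)
    # Offset of each feature-map cell in input-image pixels.
    shift_x = np.arange(feat_w) * stride
    shift_y = np.arange(feat_h) * stride
    sx, sy = np.meshgrid(shift_x, shift_y)
    shifts = np.stack([sx.ravel(), sy.ravel(), sx.ravel(), sy.ravel()], axis=1)
    # Broadcast: (num_locations, 1, 4) + (1, num_base, 4)
    return (shifts[:, None, :] + base[None, :, :]).reshape(-1, 4)

if __name__ == "__main__":
    anchors = generate_anchors(feat_h=2, feat_w=3)
    print(anchors.shape)
```

With 3 scales and 3 ratios there are 9 base anchors per location, so a 2 x 3 feature map yields 54 anchors. The RPN then scores each of them as foreground or background and regresses box offsets.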