MultiScaleRoIAlign

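This post looks at how PyTorch's MultiScaleRoIAlign is used and at a confusion about its output shape. Using the code example from the official docs, it shows how the op handles feature maps of different sizes together with a set of RoIs, and explains why the output shape is [6, 5, 3, 3] rather than the 12-row output the author expected. The key is how the RoIs are distributed across the feature maps and how the per-map results are gathered into a single output. The author worked this out with the help of the references listed at the end.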


Over the past few days I was reading some code and came across this method. The official docs give this example:

        >>> import torch
        >>> import torchvision
        >>> from collections import OrderedDict
        >>> # pool from 'feat1' and 'feat3' only; output_size=3, sampling_ratio=2
        >>> m = torchvision.ops.MultiScaleRoIAlign(['feat1', 'feat3'], 3, 2)
        >>> i = OrderedDict()
        >>> i['feat1'] = torch.rand(1, 5, 64, 64)
        >>> i['feat2'] = torch.rand(1, 5, 32, 32)  # this feature won't be used in the pooling
        >>> i['feat3'] = torch.rand(1, 5, 16, 16)
        >>> # create some random bounding boxes
        >>> boxes = torch.rand(6, 4) * 256; boxes[:, 2:] += boxes[:, :2]
        >>> # original image size, before computing the feature maps
        >>> image_sizes = [(512, 512)]
        >>> output = m(i, [boxes], image_sizes)
        >>> print(output.shape)
        torch.Size([6, 5, 3, 3])

This puzzled me. Among the inputs, feat2 is skipped, but feat1 and feat3 are both used. With two feature maps in play, I expected the output shape to be [12, 5, 3, 3], so why is it [6, 5, 3, 3]?

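My thinking at the time was that, with two feature maps, each of the 6 boxes would be pooled from every feature map. That idea is wrong. Each box is assigned to exactly one feature map according to its size and is pooled only from that map, so 6 boxes give 6 outputs, as the figure below explains: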
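To make the one-level-per-box idea concrete, here is a minimal pure-Python sketch of the FPN-style level assignment heuristic. The function names `infer_scale` and `assign_level` and the box coordinates are made up for illustration. The constants 224 and 4 are torchvision's documented defaults for `canonical_scale` and `canonical_level`. This is a sketch of the heuristic, not the library's actual implementation.

```python
import math

def infer_scale(feature_size, image_size):
    """Approximate a feature map's scale as the nearest power of two."""
    return 2.0 ** round(math.log2(feature_size / image_size))

def assign_level(box, k_min, k_max, canonical_scale=224, canonical_level=4):
    """Return the single pyramid level for a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    size = math.sqrt((x2 - x1) * (y2 - y1))  # sqrt of the box area
    k = math.floor(canonical_level + math.log2(size / canonical_scale) + 1e-6)
    return min(max(k, k_min), k_max)  # clamp to the available levels

# Scales of the two pooled maps from the example: 64/512 and 16/512.
scales = [infer_scale(64, 512), infer_scale(16, 512)]
k_min = -int(math.log2(scales[0]))   # finest level
k_max = -int(math.log2(scales[-1]))  # coarsest level

# Six hypothetical boxes in image coordinates (assumed values).
boxes = [(0, 0, 50, 50), (10, 10, 120, 150), (0, 0, 250, 250),
         (30, 40, 90, 60), (5, 5, 300, 300), (100, 100, 180, 260)]
levels = [assign_level(b, k_min, k_max) for b in boxes]
print(levels)
```

Every box ends up with exactly one level, which is why the first dimension of the output equals the number of boxes rather than boxes times feature maps.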

  

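Also, the number of RoIs coming from the RPN is already fixed when they are passed to MultiScaleRoIAlign, and `result` is allocated with exactly one entry per RoI. Each feature map then fills in only the entries of the boxes assigned to it, so nothing gets overwritten.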
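The "no overwriting" point can be sketched in plain Python as follows. The helper `pool_multiscale` and its `pool_fn` argument are hypothetical stand-ins for the real per-level RoIAlign call, and the level indices are assumed values; the sketch only mirrors the allocate-then-fill pattern described above.

```python
def pool_multiscale(levels, num_levels, pool_fn):
    """Fill one result slot per RoI, visiting each level once."""
    num_rois = len(levels)
    result = [None] * num_rois  # one slot per RoI, allocated up front
    for level in range(num_levels):
        # indices of the RoIs assigned to this level
        idx_in_level = [i for i, lvl in enumerate(levels) if lvl == level]
        for i in idx_in_level:
            result[i] = pool_fn(level, i)  # each slot is written exactly once
    return result

# Assumed level index (0 or 1) for each of 6 RoIs.
levels = [0, 0, 1, 0, 1, 0]
result = pool_multiscale(levels, 2, lambda level, i: f"roi{i}@level{level}")
print(len(result), result)
```

Because the index sets of different levels are disjoint, the writes never collide and the result keeps the original RoI order.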

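This is only a brief note and I am not sure others will be able to follow it, but the question had bothered me for a long time, and between last night and today I finally figured it out.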

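The few references I managed to find:
Learning Faster-RCNN from the source code - 刘知安's blog | LiuZhian's Blog

A line-by-line walkthrough of the faster rcnn source code (5): roi_head part1 - Zhihu

Walking through the official PyTorch FasterRCNN code - Zhihu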
