From stepping through the code, the backbone is 'ShuffleNetV2'.
Per the defaults in nanodet_custom_xml_dataset.yml, the input size is 320×320.
My laptop camera delivers 480×640 frames, which scale proportionally to 240×320.
Since 240 is not evenly divisible by 32 (the backbone's total downsampling factor), the height is rounded up, giving 256×320.
That is the input size used throughout this debugging session.
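A minimal sketch of that size arithmetic (the real NanoDet preprocessing pipeline does more; divisor = 32 here is an assumption matching the backbone's total stride):

import math

def fit_input_size(src_h, src_w, dst=320, divisor=32):
    # Scale proportionally so the longer side matches the target size.
    scale = dst / max(src_h, src_w)                    # 320 / 640 = 0.5
    h, w = round(src_h * scale), round(src_w * scale)  # 240, 320
    # Round each side up to a multiple of the network's total stride.
    h = math.ceil(h / divisor) * divisor               # 240 -> 256
    w = math.ceil(w / divisor) * divisor               # 320 -> 320
    return h, w

print(fit_input_size(480, 640))  # (256, 320)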
The backbone runs first; summarizing it with torchsummary:
summary(self.backbone, input_size=(3, 256, 320))
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 24, 128, 160] 648
BatchNorm2d-2 [-1, 24, 128, 160] 48
LeakyReLU-3 [-1, 24, 128, 160] 0
MaxPool2d-4 [-1, 24, 64, 80] 0
Conv2d-5 [-1, 24, 32, 40] 216
BatchNorm2d-6 [-1, 24, 32, 40] 48
Conv2d-7 [-1, 58, 32, 40] 1,392
BatchNorm2d-8 [-1, 58, 32, 40] 116
LeakyReLU-9 [-1, 58, 32, 40] 0
Conv2d-10 [-1, 58, 64, 80] 1,392
BatchNorm2d-11 [-1, 58, 64, 80] 116
LeakyReLU-12 [-1, 58, 64, 80] 0
Conv2d-13 [-1, 58, 32, 40] 522
BatchNorm2d-14 [-1, 58, 32, 40] 116
Conv2d-15 [-1, 58, 32, 40] 3,364
BatchNorm2d-16 [-1, 58, 32, 40] 116
LeakyReLU-17 [-1, 58, 32, 40] 0
ShuffleV2Block-18 [-1, 116, 32, 40] 0
Conv2d-19 [-1, 58, 32, 40] 3,364
BatchNorm2d-20 [-1, 58, 32, 40] 116
LeakyReLU-21 [-1, 58, 32, 40] 0
Conv2d-22 [-1, 58, 32, 40] 522
BatchNorm2d-23 [-1, 58, 32, 40] 116
Conv2d-24 [-1, 58, 32, 40] 3,364
BatchNorm2d-25 [-1, 58, 32, 40] 116
LeakyReLU-26 [-1, 58, 32, 40] 0
ShuffleV2Block-27 [-1, 116, 32, 40] 0
Conv2d-28 [-1, 58, 32, 40] 3,364
BatchNorm2d-29 [-1, 58, 32, 40] 116
LeakyReLU-30 [-1, 58, 32, 40] 0
Conv2d-31 [-1, 58, 32, 40] 522
BatchNorm2d-32 [-1, 58, 32, 40] 116
Conv2d-33 [-1, 58, 32, 40] 3,364
BatchNorm2d-34 [-1, 58, 32, 40] 116
LeakyReLU-35 [-1, 58, 32, 40] 0
ShuffleV2Block-36 [-1, 116, 32, 40] 0
Conv2d-37 [-1, 58, 32, 40] 3,364
BatchNorm2d-38 [-1, 58, 32, 40] 116
LeakyReLU-39 [-1, 58, 32, 40] 0
Conv2d-40 [-1, 58, 32, 40] 522
BatchNorm2d-41 [-1, 58, 32, 40] 116
Conv2d-42 [-1, 58, 32, 40] 3,364
BatchNorm2d-43 [-1, 58, 32, 40] 116
LeakyReLU-44 [-1, 58, 32, 40] 0
ShuffleV2Block-45 [-1, 116, 32, 40] 0
Conv2d-46 [-1, 116, 16, 20] 1,044
BatchNorm2d-47 [-1, 116, 16, 20] 232
Conv2d-48 [-1, 116, 16, 20] 13,456
BatchNorm2d-49 [-1, 116, 16, 20] 232
LeakyReLU-50 [-1, 116, 16, 20] 0
Conv2d-51 [-1, 116, 32, 40] 13,456
BatchNorm2d-52 [-1, 116, 32, 40] 232
LeakyReLU-53 [-1, 116, 32, 40] 0
Conv2d-54 [-1, 116, 16, 20] 1,044
BatchNorm2d-55 [-1, 116, 16, 20] 232
Conv2d-56 [-1, 116, 16, 20] 13,456
BatchNorm2d-57 [-1, 116, 16, 20] 232
LeakyReLU-58 [-1, 116, 16, 20] 0
ShuffleV2Block-59 [-1, 232, 16, 20] 0
Conv2d-60 [-1, 116, 16, 20] 13,456
BatchNorm2d-61 [-1, 116, 16, 20] 232
LeakyReLU-62 [-1, 116, 16, 20] 0
Conv2d-63 [-1, 116, 16, 20] 1,044
BatchNorm2d-64 [-1, 116, 16, 20] 232
Conv2d-65 [-1, 116, 16, 20] 13,456
BatchNorm2d-66 [-1, 116, 16, 20] 232
LeakyReLU-67 [-1, 116, 16, 20] 0
ShuffleV2Block-68 [-1, 232, 16, 20] 0
Conv2d-69 [-1, 116, 16, 20] 13,456
BatchNorm2d-70 [-1, 116, 16, 20] 232
LeakyReLU-71 [-1, 116, 16, 20] 0
Conv2d-72 [-1, 116, 16, 20] 1,044
BatchNorm2d-73 [-1, 116, 16, 20] 232
Conv2d-74 [-1, 116, 16, 20] 13,456
BatchNorm2d-75 [-1, 116, 16, 20] 232
LeakyReLU-76 [-1, 116, 16, 20] 0
ShuffleV2Block-77 [-1, 232, 16, 20] 0
Conv2d-78 [-1, 116, 16, 20] 13,456
BatchNorm2d-79 [-1, 116, 16, 20] 232
LeakyReLU-80 [-1, 116, 16, 20] 0
Conv2d-81 [-1, 116, 16, 20] 1,044
BatchNorm2d-82 [-1, 116, 16, 20] 232
Conv2d-83 [-1, 116, 16, 20] 13,456
BatchNorm2d-84 [-1, 116, 16, 20] 232
LeakyReLU-85 [-1, 116, 16, 20] 0
ShuffleV2Block-86 [-1, 232, 16, 20] 0
Conv2d-87 [-1, 116, 16, 20] 13,456
BatchNorm2d-88 [-1, 116, 16, 20] 232
LeakyReLU-89 [-1, 116, 16, 20] 0
Conv2d-90 [-1, 116, 16, 20] 1,044
BatchNorm2d-91 [-1, 116, 16, 20] 232
Conv2d-92 [-1, 116, 16, 20] 13,456
BatchNorm2d-93 [-1, 116, 16, 20] 232
LeakyReLU-94 [-1, 116, 16, 20] 0
ShuffleV2Block-95 [-1, 232, 16, 20] 0
Conv2d-96 [-1, 116, 16, 20] 13,456
BatchNorm2d-97 [-1, 116, 16, 20] 232
LeakyReLU-98 [-1, 116, 16, 20] 0
Conv2d-99 [-1, 116, 16, 20] 1,044
BatchNorm2d-100 [-1, 116, 16, 20] 232
Conv2d-101 [-1, 116, 16, 20] 13,456
BatchNorm2d-102 [-1, 116, 16, 20] 232
LeakyReLU-103 [-1, 116, 16, 20] 0
ShuffleV2Block-104 [-1, 232, 16, 20] 0
Conv2d-105 [-1, 116, 16, 20] 13,456
BatchNorm2d-106 [-1, 116, 16, 20] 232
LeakyReLU-107 [-1, 116, 16, 20] 0
Conv2d-108 [-1, 116, 16, 20] 1,044
BatchNorm2d-109 [-1, 116, 16, 20] 232
Conv2d-110 [-1, 116, 16, 20] 13,456
BatchNorm2d-111 [-1, 116, 16, 20] 232
LeakyReLU-112 [-1, 116, 16, 20] 0
ShuffleV2Block-113 [-1, 232, 16, 20] 0
Conv2d-114 [-1, 116, 16, 20] 13,456
BatchNorm2d-115 [-1, 116, 16, 20] 232
LeakyReLU-116 [-1, 116, 16, 20] 0
Conv2d-117 [-1, 116, 16, 20] 1,044
BatchNorm2d-118 [-1, 116, 16, 20] 232
Conv2d-119 [-1, 116, 16, 20] 13,456
BatchNorm2d-120 [-1, 116, 16, 20] 232
LeakyReLU-121 [-1, 116, 16, 20] 0
ShuffleV2Block-122 [-1, 232, 16, 20] 0
Conv2d-123 [-1, 232, 8, 10] 2,088
BatchNorm2d-124 [-1, 232, 8, 10] 464
Conv2d-125 [-1, 232, 8, 10] 53,824
BatchNorm2d-126 [-1, 232, 8, 10] 464
LeakyReLU-127 [-1, 232, 8, 10] 0
Conv2d-128 [-1, 232, 16, 20] 53,824
BatchNorm2d-129 [-1, 232, 16, 20] 464
LeakyReLU-130 [-1, 232, 16, 20] 0
Conv2d-131 [-1, 232, 8, 10] 2,088
BatchNorm2d-132 [-1, 232, 8, 10] 464
Conv2d-133 [-1, 232, 8, 10] 53,824
BatchNorm2d-134 [-1, 232, 8, 10] 464
LeakyReLU-135 [-1, 232, 8, 10] 0
ShuffleV2Block-136 [-1, 464, 8, 10] 0
Conv2d-137 [-1, 232, 8, 10] 53,824
BatchNorm2d-138 [-1, 232, 8, 10] 464
LeakyReLU-139 [-1, 232, 8, 10] 0
Conv2d-140 [-1, 232, 8, 10] 2,088
BatchNorm2d-141 [-1, 232, 8, 10] 464
Conv2d-142 [-1, 232, 8, 10] 53,824
BatchNorm2d-143 [-1, 232, 8, 10] 464
LeakyReLU-144 [-1, 232, 8, 10] 0
ShuffleV2Block-145 [-1, 464, 8, 10] 0
Conv2d-146 [-1, 232, 8, 10] 53,824
BatchNorm2d-147 [-1, 232, 8, 10] 464
LeakyReLU-148 [-1, 232, 8, 10] 0
Conv2d-149 [-1, 232, 8, 10] 2,088
BatchNorm2d-150 [-1, 232, 8, 10] 464
Conv2d-151 [-1, 232, 8, 10] 53,824
BatchNorm2d-152 [-1, 232, 8, 10] 464
LeakyReLU-153 [-1, 232, 8, 10] 0
ShuffleV2Block-154 [-1, 464, 8, 10] 0
Conv2d-155 [-1, 232, 8, 10] 53,824
BatchNorm2d-156 [-1, 232, 8, 10] 464
LeakyReLU-157 [-1, 232, 8, 10] 0
Conv2d-158 [-1, 232, 8, 10] 2,088
BatchNorm2d-159 [-1, 232, 8, 10] 464
Conv2d-160 [-1, 232, 8, 10] 53,824
BatchNorm2d-161 [-1, 232, 8, 10] 464
LeakyReLU-162 [-1, 232, 8, 10] 0
ShuffleV2Block-163 [-1, 464, 8, 10] 0
================================================================
Total params: 776,420
Trainable params: 776,420
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.94
Forward/backward pass size (MB): 76.38
Params size (MB): 2.96
Estimated Total Size (MB): 80.28
----------------------------------------------------------------
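One pattern worth noticing in the table: each ShuffleV2Block works on two half-width branches (e.g. two 58-channel halves of a 116-channel feature) and itself reports 0 params, since it only splits, concatenates, and shuffles channels. A minimal sketch of the channel-shuffle step (standard ShuffleNetV2 logic, not NanoDet's exact code):

import torch

def channel_shuffle(x, groups=2):
    # Interleave channels from the two branches so information mixes
    # across the split in the next block.
    b, c, h, w = x.shape
    x = x.view(b, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(b, c, h, w)

x = torch.randn(1, 116, 32, 40)
left, right = x.chunk(2, dim=1)   # two 58-channel halves
# ...right would go through the 1x1 -> depthwise 3x3 -> 1x1 branch...
out = channel_shuffle(torch.cat([left, right], dim=1))
print(out.shape)                  # torch.Size([1, 116, 32, 40])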
As usual for an FPN-style neck, the backbone returns feature maps at three scales:
torch.Size([1, 116, 32, 40])
torch.Size([1, 232, 16, 20])
torch.Size([1, 464, 8, 10])
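These correspond to strides 8, 16, and 32 of the 256×320 input (256 / 8 = 32, 320 / 8 = 40, and so on). A toy stand-in for the three ShuffleNetV2 stages, only to show how the stride-8/16/32 features are collected (channel counts match the summary; the real stages are stacks of ShuffleV2Blocks, not single convs):

import torch
import torch.nn as nn

stem   = nn.Sequential(nn.Conv2d(3, 24, 3, stride=2, padding=1),
                       nn.MaxPool2d(3, stride=2, padding=1))  # stride 4
stage2 = nn.Conv2d(24, 116, 3, stride=2, padding=1)           # stride 8
stage3 = nn.Conv2d(116, 232, 3, stride=2, padding=1)          # stride 16
stage4 = nn.Conv2d(232, 464, 3, stride=2, padding=1)          # stride 32

x = stem(torch.randn(1, 3, 256, 320))
feats = []
for stage in (stage2, stage3, stage4):
    x = stage(x)
    feats.append(x)   # [1,116,32,40], [1,232,16,20], [1,464,8,10]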
These are then fed to the FPN, whose type is 'GhostPAN'. The three elements of the input x first pass through the corresponding entries of reduce_layers:
ModuleList(
(0): ConvModule(
(conv): Conv2d(116, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act): LeakyReLU(negative_slope=0.1, inplace=True)
)
(1): ConvModule(
(conv): Conv2d(232, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act): LeakyReLU(negative_slope=0.1, inplace=True)
)
(2): ConvModule(
(conv): Conv2d(464, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(act): LeakyReLU(negative_slope=0.1, inplace=True)
)
)
yielding three outputs, all projected to 96 channels:
torch.Size([1, 96, 32, 40])
torch.Size([1, 96, 16, 20])
torch.Size([1, 96, 8, 10])
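A minimal reproduction of one of those reduce layers, matching the ConvModule printout above (1×1 conv without bias, then BN, then LeakyReLU):

import torch
import torch.nn as nn

reduce0 = nn.Sequential(
    nn.Conv2d(116, 96, kernel_size=1, bias=False),  # project 116 -> 96 channels
    nn.BatchNorm2d(96),
    nn.LeakyReLU(0.1, inplace=True),
)
print(reduce0(torch.randn(1, 116, 32, 40)).shape)   # torch.Size([1, 96, 32, 40])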
inner_outs starts from the last (coarsest) of these, torch.Size([1, 96, 8, 10]). After the top-down and bottom-up passes, the FPN finally outputs outs with four elements, sized:
torch.Size([1, 96, 32, 40])
torch.Size([1, 96, 16, 20])
torch.Size([1, 96, 8, 10])
torch.Size([1, 96, 4, 5])
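The first three match the original scales; the fourth, [1, 96, 4, 5], is the extra level that GhostPAN appends by downsampling the coarsest feature once more (num_extra_level: 1 in the default NanoDet-Plus config). A simplified sketch of just that spatial halving, with a plain stride-2 conv standing in for GhostPAN's actual extra-level convs:

import torch
import torch.nn as nn

extra_downsample = nn.Conv2d(96, 96, kernel_size=3, stride=2, padding=1)
top = torch.randn(1, 96, 8, 10)      # coarsest PAN output
print(extra_downsample(top).shape)   # torch.Size([1, 96, 4, 5])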
These go through the head, of type 'NanoDetPlusHead'; its final output is
torch.Size([1, 1700, 52])
where 52 = 20 + 4 × (7 + 1): 20 is the number of classes and 7 is reg_max, so each of the 4 box edges is predicted as a distribution over reg_max + 1 = 8 bins.
1700 = 1280 + 320 + 80 + 20 is the total number of prior points, one per cell of each of the four feature maps (32×40 + 16×20 + 8×10 + 4×5).
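A quick check of both numbers:

num_classes, reg_max = 20, 7
print(num_classes + 4 * (reg_max + 1))      # 52
# One prior per feature-map cell, summed over the four levels:
print(32 * 40 + 16 * 20 + 8 * 10 + 4 * 5)   # 1700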
The remaining 32 regression values per prior then go through a softmax over each edge's 8 bins and a linear layer with fixed weights [0, 1, ..., 7] (the Integral projection from GFL), which turns each distribution into an expected distance; multiplying by the stride of the prior's level and offsetting from the prior centers yields all the boxes, shape (1, 1700, 4), which are sent to NMS.
That more or less gives the final result.
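A hedged sketch of that decoding step (it mirrors the GFL-style Integral projection described above; function and variable names here are illustrative, not NanoDet's exact API):

import torch
import torch.nn.functional as F

def decode_boxes(reg_preds, centers, strides, reg_max=7):
    # reg_preds: (N, 4 * (reg_max + 1)) raw logits, one row per prior.
    # centers: (N, 2) prior (x, y) coords; strides: (N,) per-prior stride.
    n = reg_preds.shape[0]
    project = torch.arange(reg_max + 1, dtype=torch.float32)   # [0, 1, ..., 7]
    dist = F.softmax(reg_preds.reshape(n * 4, reg_max + 1), dim=1)
    dist = F.linear(dist, project.unsqueeze(0)).reshape(n, 4)  # expected bin index
    dist = dist * strides.unsqueeze(1)                         # to input-image pixels
    # Distances are (left, top, right, bottom) from each prior center.
    x1y1 = centers - dist[:, :2]
    x2y2 = centers + dist[:, 2:]
    return torch.cat([x1y1, x2y2], dim=1)                      # (N, 4), ready for NMS

boxes = decode_boxes(torch.randn(2, 32),
                     centers=torch.tensor([[16., 16.], [48., 16.]]),
                     strides=torch.tensor([8., 8.]))
print(boxes.shape)  # torch.Size([2, 4])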