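Preface
As is well known, SlowFast performs multi-label classification here: a single person can be doing several actions at once, not just one.
Blogs I referred to: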
Multi-Label Classification with Deep Learning
Multi-Label Image Classification (多标签图像分类)
Multi-Label Classification (Part 1) | CNN-RNN: A Unified Framework for Multi-label Image Classification
Multi-Label Image Classification - HCP: A Flexible CNN Framework for Multi-Label Image Classification
Guide to multi-class multi-label classification with neural networks in python (this one includes code)
1. Code walkthrough
1.1 video_model_builder.py
Located at: SlowFast-master/slowfast/models/video_model_builder.py
In class SlowFast(nn.Module):
def forward(self, x, bboxes=None):
    x = self.s1(x)
    x = self.s1_fuse(x)
    x = self.s2(x)
    x = self.s2_fuse(x)
    for pathway in range(self.num_pathways):
        pool = getattr(self, "pathway{}_pool".format(pathway))
        x[pathway] = pool(x[pathway])
    x = self.s3(x)
    x = self.s3_fuse(x)
    x = self.s4(x)
    x = self.s4_fuse(x)
    x = self.s5(x)
    if self.enable_detection:
        x = self.head(x, bboxes)
    else:
        x = self.head(x)
    # Debug code: print every class score in the first row of the head
    # output, then print their sum. Named `total` to avoid shadowing sum().
    total = 0
    for i in x.detach().cpu().numpy()[0]:
        print(i)
        total = i + total
    print("\n====sum====\n", total)
    return x
The final x is the output of the action prediction:
Command:
(torch17) lxn@lxn-System-Product-Name:~/0yangfan/Slowfast2/SlowFast-master$ python tools/run_net.py --cfg demo/AVA/SLOWFAST_32x2_R101_50_50s4.yaml
The output is as follows:
2it [00:09, 4.13s/it]
0.0006857314
1.126401e-05
0.000896903
9.283956e-06
1.3434112e-05
5.1868254e-05
1.4454018e-05
0.00020378748
1.7117023e-05
4.5036345e-06
0.93631124
0.030427974
2.1429083e-05
0.00013174543
0.00023135419
3.1755062e-05
0.10004238
5.04442e-05
1.1602075e-05
1.011116e-05
0.000145544
7.539917e-06
1.8375302e-05
2.3885143e-05
6.660219e-06
7.979267e-06
0.083057225
7.2511386e-05
0.0035209632
9.1447055e-06
1.4738828e-05
9.837757e-06
2.013603e-05
2.2849692e-05
1.1397525e-05
5.4423002e-05
0.00051283394
7.8566545e-06
4.3257776e-05
0.00051684293
0.00023136412
2.4495803e-05
6.456105e-05
8.8253415e-05
2.101141e-05
1.1654662e-05
0.00029671428
0.00039023798
0.00012704037
1.8536752e-05
9.058921e-05
8.61598e-06
6.4948144e-06
0.0047166185
5.1742085e-05
0.00013905742
0.00010724684
2.6692143e-05
0.39213014
8.9092566e-05
0.002478492
0.00026908738
0.0019222762
1.7974466e-05
0.0011669936
0.00014598113
0.00017028252
7.2456096e-05
7.615097e-05
6.273652e-05
1.0934462e-05
6.717625e-05
1.9446294e-05
0.76119375
2.5729018e-05
1.0991138e-05
4.2511077e-05
0.00048743526
0.017380886
0.86945987
====sum====
3.210983705057515
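The values above the ====sum==== line are the predicted probabilities for the 80 actions, and the value below it is the sum of those 80 probabilities. The sum is greater than 1, whereas softmax outputs always sum to exactly 1, so the actions are not being classified with softmax.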
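To see why a sum above 1 rules out softmax, here is a minimal pure-Python sketch. The four logits are made-up values, not taken from the SlowFast run, and the two helper functions are written only for illustration; they are not SlowFast code.

```python
import math


def softmax(logits):
    """Normalise logits into one distribution; the outputs always sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Squash each logit independently into (0, 1); the sum is unconstrained."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


# Hypothetical logits for four classes (an assumption, not real model output).
logits = [2.7, -4.0, 1.2, 1.9]

softmax_probs = softmax(logits)
sigmoid_probs = sigmoid(logits)

print("softmax sum:", sum(softmax_probs))  # 1, up to floating-point rounding
print("sigmoid sum:", sum(sigmoid_probs))  # larger than 1 for these logits
```

Softmax makes the classes compete for a single unit of probability, while sigmoid scores each class on its own, so several classes can be close to 1 at the same time.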
1.2 head_helper.py
In home/lxn/0yangfan/Slowfast2/SlowFast-master/slowfast/models/head_helper.py,
inside class ResNetRoIHead(nn.Module):
if act_func == "softmax":
    self.act = nn.Softmax(dim=1)
elif act_func == "sigmoid":
    print("act_func:", act_func)  # debug print
    self.act = nn.Sigmoid()
else:
    raise NotImplementedError(
        "{} is not supported as an activation "
        "function.".format(act_func)
    )
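This block is the key to multi-label classification: SlowFast uses sigmoid rather than softmax here, so each action gets its own independent probability.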
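Because each sigmoid score is independent, one common way to turn the scores into labels is to keep every class whose score reaches a threshold. The sketch below is an assumption for illustration, not SlowFast's own post-processing: select_labels and the 0.5 threshold are hypothetical, the five scores are copied from the output printed in section 1.1, and all other classes are set to 0.0 for brevity.

```python
def select_labels(scores, threshold=0.5):
    """Return the 0-based indices of all classes scoring at least `threshold`."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# (0-based position in the 80-value list) -> score, copied from section 1.1.
printed = {
    10: 0.93631124,
    16: 0.10004238,
    58: 0.39213014,
    73: 0.76119375,
    79: 0.86945987,
}

# Simplification (assumption): every class not listed above is set to 0.0.
scores = [0.0] * 80
for index, score in printed.items():
    scores[index] = score

active = select_labels(scores)
print(active)  # [10, 73, 79]
```

Three classes pass the threshold for the same person, which is exactly the multi-label behaviour that a softmax head, with its single winning class, would not express.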
Command:
(torch17) lxn@lxn-System-Product-Name:~/0yangfan/Slowfast2/SlowFast-master$ python tools/run_net.py --cfg demo/AVA/SLOWFAST_32x2_R101_50_50s4.yaml
The result is as follows: