classify_video.py
classify_video.py will classify a video using
(1) singleFrame RGB model
(2) singleFrame flow model
(3) 0.5/0.5 singleFrame RGB/singleFrame flow fusion
(4) 0.33/0.67 singleFrame RGB/singleFrame flow fusion
(5) LRCN RGB model
(6) LRCN flow model
(7) 0.5/0.5 LRCN RGB/LRCN flow model
(8) 0.33/0.67 LRCN RGB/LRCN flow model
The script outputs eight predictions, one for each model or fusion of models. So the question is: how exactly are these models fused?
action_hash[compute_fusion(predictions_RGB_singleFrame, predictions_flow_singleFrame, 0.33)]
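1. The compute_fusion function
The function that computes the fusion is very simple.
Input: two prediction matrices and a weight.
predictions_flow_singleFrame: a prediction matrix of size 156×101. UCF101 has 101 classes, and there are 156 rows because there are 156 images (frames).
Output: the fused result, i.e. the index of the highest-scoring class.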
def compute_fusion(RGB_pred, flow_pred, p):
    return np.argmax(p*np.mean(RGB_pred,0) + (1-p)*np.mean(flow_pred,0))
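To make the weighting concrete, here is a minimal sketch that runs compute_fusion on toy data. The 2×3 arrays and the three-entry action_hash are hypothetical stand-ins chosen for illustration (the real matrices are 156×101, and the real action_hash is loaded by the script); only compute_fusion itself comes from the original code.

```python
import numpy as np

def compute_fusion(RGB_pred, flow_pred, p):
    # Average each stream over frames (axis 0), blend with weight p,
    # and return the index of the highest-scoring class.
    return np.argmax(p * np.mean(RGB_pred, 0) + (1 - p) * np.mean(flow_pred, 0))

# Hypothetical toy predictions: 2 frames x 3 classes per stream.
rgb_pred = np.array([[0.7, 0.2, 0.1],
                     [0.7, 0.2, 0.1]])   # RGB stream favors class 0
flow_pred = np.array([[0.3, 0.6, 0.1],
                      [0.3, 0.6, 0.1]])  # flow stream favors class 1

# Hypothetical index-to-name mapping standing in for action_hash.
action_hash = {0: 'ApplyEyeMakeup', 1: 'ApplyLipstick', 2: 'Archery'}

equal_idx = compute_fusion(rgb_pred, flow_pred, 0.5)        # 0.5/0.5 fusion
flow_heavy_idx = compute_fusion(rgb_pred, flow_pred, 0.33)  # 0.33/0.67 fusion

print(action_hash[equal_idx])       # blended scores [0.5, 0.4, 0.1]
print(action_hash[flow_heavy_idx])  # blended scores [0.432, 0.468, 0.1]

# Averaging over axis 0 collapses the frame dimension: (156, 101) -> (101,)
mean_shape = np.mean(np.zeros((156, 101)), 0).shape
```

With equal weights the RGB stream wins (class 0), while the 0.33/0.67 weighting lets the flow stream tip the decision to class 1. This shows why the two fusion settings can produce different labels for the same video.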
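2. The caffe.Net(a, b, c) function
Input: the deploy prototxt, the caffemodel, and the phase
a: the .prototxt file, which defines the network structure
b: the caffemodel file, which holds the pretrained learnable parameters (here, from the models of "Long-term Recurrent Convolutional Networks for Visual Recognition and Description")
c: the phase; caffe.TEST has the value 1 and selects the test (inference) phase
Output: a net, i.e. the network structure with the weights loaded into it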
Models and weights
singleFrame_model = 'deploy_singleFrame.prototxt'
lstm_model = 'deploy_lstm.prototxt'
RGB_singleFrame = 'single_frame_all_layers_hyb_RGB_iter_5000.caffemodel'
flow_singleFrame = 'single_frame_all_layers_hyb_flow_iter_50000.caffemodel'
RGB_lstm = 'RGB_lstm_model_iter_30000.caffemodel'
flow_lstm = 'flow_lstm_model_iter_50000.caffemodel'
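The two single-frame nets share one singleFrame_model, but the weights passed to caffe.Net differ. Before the trained weights (the caffemodel) are loaded, the network structure can only be called a model; once the learnable parameters are loaded, it is called a net.
From this we can see that the role of caffe.Net is to combine a network structure with network weights.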
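The model/net distinction can be illustrated without Caffe by a small NumPy analogy. ToyNet, its dictionary-based "model", and the weight matrices below are all hypothetical and are not part of Caffe's API; the sketch only mirrors the idea that one structure can be bound to different weights.

```python
import numpy as np

class ToyNet:
    """Hypothetical stand-in for caffe.Net: binds a structure to weights."""

    def __init__(self, model, weights):
        # model holds structure only (layer sizes); weights holds learned values.
        assert weights["W"].shape == (model["n_in"], model["n_out"])
        self.model = model
        self.weights = weights

    def forward(self, x):
        # A single linear layer producing one score per class.
        return x @ self.weights["W"]

# One shared structure, analogous to singleFrame_model.
single_frame_model = {"n_in": 2, "n_out": 3}

# Two different weight sets, analogous to the RGB and flow caffemodels.
rgb_weights = {"W": np.array([[1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0]])}
flow_weights = {"W": np.array([[0.0, 0.0, 1.0],
                               [0.0, 1.0, 0.0]])}

rgb_net = ToyNet(single_frame_model, rgb_weights)
flow_net = ToyNet(single_frame_model, flow_weights)

x = np.array([1.0, 0.0])
rgb_scores = rgb_net.forward(x)    # same input, same structure
flow_scores = flow_net.forward(x)  # different weights, different scores
```

Both toy nets share the same structure object, yet they score the same input differently because their weights differ. The real calls in classify_video.py follow the same pattern: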
RGB_singleFrame_net = caffe.Net(singleFrame_model, RGB_singleFrame, caffe.TEST)
flow_singleFrame_net = caffe.Net(singleFrame_model, flow_singleFrame, caffe.TEST)
RGB_lstm_net = caffe.Net<