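Preface:
The previous post extracted features for all of the scene videos. This post describes how to cluster similar scene videos together.
Code implementation:
1. Read the h5 feature files: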
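import os
import h5py

h5FileName_feature = dict()  # maps each h5 file path to its feature array

def H5Filepocess():
    # args is assumed to be parsed elsewhere in the script (e.g. with argparse)
    h5_file_path_dir = '/opt/data/private/xuyunyang/EasyCut/' + args.ID + '/' + args.ID_VideoName + '/SceneFeature'
    dirs = os.listdir(h5_file_path_dir)
    for file in dirs:
        h5_file_path = h5_file_path_dir + '/' + file
        # the with block closes the file once the features are copied into memory
        with h5py.File(h5_file_path, 'r') as h5:
            data = h5['video_1']['features'][...]
        h5FileName_feature[h5_file_path] = data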
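2. Similarity comparison:
The idea is as follows: for each video, compute its similarity to every other video (excluding itself) using cosine similarity. The similarity threshold is set to 0.95, and any video whose similarity exceeds the threshold is appended to a list. That list is then stored as a value in a dictionary, keyed by the path of the corresponding target video.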
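Before the actual function, here is a minimal, dependency-free sketch of the same idea, just to make the steps concrete. It is not the code used in this project: the helper names (mean_vector, cosine_similarity, similar_videos), the toy file names and the toy feature values are all made up for illustration, and returning 0.0 for a zero-length vector is my own assumption. With NumPy the dot product and norms would normally be computed with np.dot and np.linalg.norm instead.

```python
import math

THRESHOLD = 0.95  # similarity threshold quoted in the text


def mean_vector(frames):
    """Average a list of per-frame feature vectors column by column."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (assumption: 0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_videos(features, threshold=THRESHOLD):
    """Map each video path to the other paths whose similarity exceeds threshold."""
    means = {path: mean_vector(frames) for path, frames in features.items()}
    result = {}
    for path, a in means.items():
        result[path] = [
            other for other, b in means.items()
            if other != path and cosine_similarity(a, b) > threshold
        ]
    return result


# Hypothetical toy data: three "videos", two frames each, 3-dimensional features.
toy_features = {
    "scene_a.h5": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "scene_b.h5": [[1.0, 0.1, 0.0], [0.8, 0.0, 0.0]],
    "scene_c.h5": [[0.0, 0.0, 1.0], [0.0, 0.1, 1.0]],
}
clusters = similar_videos(toy_features)
```

In this toy data scene_a.h5 and scene_b.h5 point in almost the same direction, so each ends up in the other's list, while scene_c.h5 gets an empty list. Note that every pair is compared twice (once from each side), which is fine for a handful of scenes but grows quadratically with the number of videos.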
import numpy as np

def CalculateDistance(path):
    distance = dict()
    A = h5FileName_feature[path]
    a = np.average(A, axis=0)  # column-wise mean (averages over the rows of A)
    for i in h5FileName_feature.keys():
        if i == path:
            continue  # skip the video itself
        B = h5FileName_feature[i]
        b = B
b