【CVPR2020】Temporal Pyramid Network for Action Recognition

Latest recommended article published on 2022-04-21 11:48:44

Amazingren




Categories: HumanActionRecognition PaperReading

Article link: https://blog.youkuaiyun.com/Amazingren/article/details/105631183


〇、Basic information:

Title：Temporal Pyramid Network for Action Recognition
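Author: a joint work from CUHK (The Chinese University of Hong Kong) and SenseTime, with the well-known researcher Bolei Zhou among the authors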
One-sentence summary: This work highlights the importance of visual tempo in action recognition. Motivated by this, the authors propose TPN, a feature-level module for tempo modeling. Its two components, the source of features and the fusion of features, form a feature hierarchy for a given backbone, so that the model can capture action instances at various tempos. (If you have read Kaiming He's SlowFast work: my current understanding is that TPN is essentially SlowFast at the feature level. The network's overall input therefore stays at a single fixed frame rate, for example 8 frames or 16 frames at a time.)
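To make the "SlowFast at the feature level" reading concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the strided subsampling, the time-averaging, and the concatenation below are simple stand-ins I chose for illustration, and the names `temporal_subsample`, `average_over_time`, `feature_level_pyramid`, and the rates `(1, 2, 4)` are all hypothetical. The only point is that a single fixed-length clip goes through the backbone once, and the different tempos are created afterwards on its features.

```python
from typing import List, Sequence

Feature = List[float]  # toy stand-in: one feature vector per time step


def temporal_subsample(features: List[Feature], rate: int) -> List[Feature]:
    """Keep every `rate`-th time step, imitating a slower sampling tempo."""
    if rate < 1:
        raise ValueError("rate must be >= 1")
    return features[::rate]


def average_over_time(features: List[Feature]) -> Feature:
    """Average the feature vectors along the time axis."""
    steps = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / steps for d in range(dim)]


def feature_level_pyramid(
    features: List[Feature], rates: Sequence[int] = (1, 2, 4)
) -> Feature:
    """Pool one descriptor per (hypothetical) tempo rate and concatenate them."""
    fused: Feature = []
    for rate in rates:
        level = temporal_subsample(features, rate)
        fused.extend(average_over_time(level))
    return fused


# Pretend the backbone ran once on a fixed 8-frame clip and produced
# 2-dimensional features for each of the 8 time steps.
clip_features = [[float(t), float(t * t)] for t in range(8)]
descriptor = feature_level_pyramid(clip_features)
print(descriptor)
```

With three rates and 2-dimensional features, the fused descriptor has 6 values, one pooled pair per tempo level, while the input clip itself was sampled only once.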

一、Questions after reading the abstract:

What is visual tempo?

二、Some thoughts after reading the Introduction:

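My impression is the same as before: this work is SlowFast at the feature level, so the input frame rate stays fixed, for example 8 frames or 16 frames.
What is the motivation for the method?
The paper puts it this way:
（1）Previous work fed inputs at different frame rates through multiple branches, but this is very unfriendly in terms of compute, and defining the tempos in advance is also unrealistic. (Each action may have its own tempo. With hundreds of action classes, would we have to define hundreds of branches? That is not practical. Of course, I am not sure whether my understanding here is correct.)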
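The compute argument in point (1) can be illustrated with a bit of arithmetic. The sketch below is my own rough proxy, not a measurement from the paper: it only counts how many frames pass through a backbone, and the 64-frame window and the strides `[16, 8, 2]` are hypothetical numbers. It also ignores that branches can differ in width (SlowFast's Fast pathway, for instance, is deliberately lightweight), so treat it as intuition only.

```python
from typing import Sequence


def frames_per_branch(window: int, stride: int) -> int:
    """Frames sampled from a `window`-frame segment when taking every `stride`-th frame."""
    return len(range(0, window, stride))


def input_pyramid_frames(window: int, strides: Sequence[int]) -> int:
    """Total frames processed when each stride gets its own input branch."""
    return sum(frames_per_branch(window, s) for s in strides)


window = 64            # hypothetical raw segment length
strides = [16, 8, 2]   # hypothetical input-level frame pyramid

multi_branch = input_pyramid_frames(window, strides)  # one backbone pass per stride
single_branch = frames_per_branch(window, 8)          # one fixed-rate clip

print(f"input-level pyramid: {len(strides)} branches, {multi_branch} frames")
print(f"feature-level pyramid: 1 branch, {single_branch} frames")
```

Every additional tempo adds another branch and another set of frames on the input side, whereas on the feature side the input stays at one clip and only the feature processing grows.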
（2）Common methods such as C3D often stack a series of temporal convolutions...
