Temporal Action Detection and Classification

1. Datasets

ActivityNet
Development overview

ActivityNet was published at CVPR 2015, and the challenge has been held since 2016.
From 2016 to 2019:

  • 2016: only detection and classification (untrimmed)
  • 2017: classification (trimmed and untrimmed), proposals, captioning
  • 2018: untrimmed classification was dropped starting in 2018

Task A: video action classification (trimmed) switched to the Kinetics dataset.

Newly added Task B

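Spatio-temporal Action Localization is based on the AVA dataset and aims to evaluate how well algorithms localize human actions in both space and time. Each annotated video clip is continuous and longer than 15 minutes, contains multiple subjects, and each subject performs multiple actions.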

Task B is divided into two sub-challenges: #1 (Vision Only) and #2 (Full).

Overall, the biggest difficulty of this task is that actions are refined down to the atomic level: the algorithm must determine where each human subject is, what action is taking place, and what interaction occurs with other objects or people.

The two AVA challenges (AVA: Atomic Visual Actions dataset): The long-term goal of this dataset is to enable modeling of complex activities by building on top of current work in recognizing atomic actions. This task will be divided into two challenges.
Challenge #1 is strictly computer vision, i.e. participants are requested not to use signals derived from audio, metadata, etc. Challenge #2 lifts this restriction, allowing creative solutions that leverage any input modalities. We ask only that users document the additional data and features they use. Performance will be ranked separately for the two challenges.

2019

Task C was changed to egocentric activity understanding. This task fits the online setting: the input is a video stream, and the action class must be detected online (or future actions predicted) without access to anything that happens after the detection time. The online setting matches surveillance needs, where detection or alerting has to happen in real time, for example anomaly detection. The offline setting matches video search needs, for example the highlight detection / preview generation that YouTube might use. This article focuses mainly on online action detection & anticipation.

This task is intended to evaluate the ability of algorithms to understand daily activities in egocentric videos. There will be three tracks, focusing on classifying actions on trimmed segments, detecting objects in egocentric videos, and anticipating future actions:

  • Challenge #1 - Object Detection: Detect and localise objects in individual images out of 290 classes, with a long-tail distribution.
  • Challenge #2 - Action Recognition: Given a start-end time in an untrimmed video, classify the varying-length segments into verb classes (125 verb classes, 331 noun classes).
  • Challenge #3 - Action Anticipation: Given an action segment, predict the action class (125 verb classes and 331 noun classes) by observing the video segment preceding the action start time by a preselected anticipation time duration of 1 second.

Task D: Activity Detection in Extended Videos (ActEV-PC). The ActEV-PC task has two phases:

  • ActEV-PC Open Leaderboard Evaluation (Phase 1): challenge participants will run their activity detection software on their compute hardware and submit system output defined by the ActEV-PC evaluation plan to the NIST ActEV Scoring Server. This phase will serve as a qualifying stage where the top 6 participants will proceed to phase 2.
  • ActEV-PC Independent Evaluation (Phase 2): invited challenge participants