[Video Understanding Dataset Roundup] "A collection of recent video understanding datasets, under construction!" by Yao Zhou
Original: https://github.com//yoosan/video-understanding-dataset
Video-understanding-dataset
Video Classification
Dataset | Paper | Website | Category | Examples | Classes | Duration | Organizer | SOTA performance |
---|---|---|---|---|---|---|---|---|
UCF101 | | Link | human action | 13,320 | 101 | <10s | UCF | 98% (DeepMind I3D) |
HMDB51 | | Link | human action | 6,766 | 51 | <10s | SERRE LAB, Brown | - |
ActivityNet v1.3 | | Link | human activities | ~20,000 | 200 | - | ActivityNet | 8.83% error (iBUG) |
Charades | | Link | daily human activities | 9,848 | 157 | - | AI2 | - |
Kinetics | | Link | human action | ~300,000 | 400 | 10s | DeepMind | - |
Sports-1M | | Link | sports | ~1 million | 487 | 5m36s | Google & Stanford | - |
YouTube-8M | | Link | visual contents | ~7 million | 4,716 | 120-500s | Google Cloud | 85% GAP (WILLOW) |
FCVID | | Link | visual contents | 91,223 | 239 | 100s+ | Fudan-Columbia | - |
Something-Something | | Link | action with objects | 108,499 | 174 | ~4s | TwentyBN | - |
Moments in Time | | Link | action or activity | ~1 million | 339 | 3s | MIT-IBM Watson | - |
Temporal Action Detection
Dataset | Paper | Website | Examples | Organizer | SOTA performance |
---|---|---|---|---|---|
THUMOS2014 | PDF | Link | 9,682 | UCF | - |
ActivityNet v1.3 | PDF | Link | ~20,000 | ActivityNet | 0.344 (SJTU & Columbia) |
Video Captioning
Dataset | Paper | Website | Context | Examples | Organizer | SOTA performance |
---|---|---|---|---|---|---|
MPII-MD | | Link | movie | 68,337 clips with 68,375 sentences | MPII | - |
MSR-VTT | | Link | 20 categories | 10,000 clips with 200,000 sentences | MSR | - |
Charades | | Link | human activity | 9,848 clips with 27,847 sentences | AI2 | - |
Densevid | | Link | event | 20k clips and 100k sentences | Stanford, ActivityNet | - |
Video Question Answering
Dataset | Paper | Website | Task | Examples | Organizer | SOTA performance |
---|---|---|---|---|---|---|
MovieQA | | Link | question-answering in movies | 408 movies & 14,944 QAs | UToronto | - |
MarioQA | | Link | reasoning events in game videos | 187,757 examples with 92,874 QAs | POSTECH | - |