[Video Understanding Datasets Roundup] "A collection of recent video understanding datasets, under construction!" by Yao Zhou
Original source: https://github.com//yoosan/video-understanding-dataset
Video-understanding-dataset
Video Classification
| Dataset | Paper | Website | Category | Examples | Classes | Duration | Organizer | SOTA performance |
|---|---|---|---|---|---|---|---|---|
| UCF101 | | Link | human action | 13,320 | 101 | <10s | UCF | 98% (DeepMind I3D) |
| HMDB51 | | Link | human action | 6,766 | 51 | <10s | SERRE LAB, Brown | - |
| ActivityNet v1.3 | | Link | human activities | ~20,000 | 200 | - | ActivityNet | 8.83% err (iBUG) |
| Charades | | Link | daily human activities | 9,848 | 157 | - | AI2 | - |
| Kinetics | | Link | human action | ~300,000 | 400 | 10s | DeepMind | - |
| Sports-1M | | Link | sports | ~1 million | 478 | 5m36s | Google & Stanford | - |
| YouTube-8M | | Link | visual contents | ~7 million | 4,716 | 120-500s | Google Cloud | 85% GAP (WILLOW) |
| FCVID | | Link | visual contents | 91,223 | 239 | 100s+ | Fudan-Columbia | - |
| Something-Something | | Link | action with objects | 108,499 | 174 | ~4s | TwentyBN | - |
| Moments in Time | | Link | action or activity | ~1 million | 339 | 3s | MIT-IBM Watson | - |
Temporal Action Detection
| Dataset | Paper | Website | Examples | Organizer | SOTA performance |
|---|---|---|---|---|---|
| THUMOS2014 | PDF | Link | 9,682 | UCF | - |
| ActivityNet (v1.3) | PDF | Link | ~20,000 | ActivityNet | 0.344 (SJTU & Columbia) |
Video Captioning
| Dataset | Paper | Website | Context | Examples | Organizer | SOTA performance |
|---|---|---|---|---|---|---|
| MPII-MD | | Link | movie | 68,337 clips with 68,375 sentences | MPII | - |
| MSR-VTT | | Link | 20 categories | 10,000 clips with 200,000 sentences | MSR | - |
| Charades | | Link | human activity | 9,848 clips with 27,847 sentences | AI2 | - |
| Densevid | | Link | event | 20k clips and 100k sentences | Stanford, ActivityNet | - |
Video Question Answering
| Dataset | Paper | Website | Task | Examples | Organizer | SOTA performance |
|---|---|---|---|---|---|---|
| MovieQA | | Link | question-answering in movies | 408 movies & 14,944 QAs | UToronto | - |
| MarioQA | | Link | reasoning events in game videos | 187,757 examples with 92,874 QAs | POSTECH | - |




