Robust Multi-Person Tracking from Mobile Platforms

In all cases, data was recorded using a pair of AVT Marlin F033C cameras mounted on a chariot or a car, with a resolution of 640 x 480 (bayered) and a frame rate of 13--14 FPS. For each dataset, we provide the unbayered images for both cameras, the camera calibration, and, if available, the set of bounding box annotations. Depth maps were created from this data using the publicly available belief-propagation-based stereo algorithm of Huttenlocher and Felzenszwalb (note: this has no occlusion handling built in; if you know of a better, publicly available stereo algorithm, please contact me).
Note: The annotation files available here contain a few very small pedestrians. If you want to compare to our results, please note that we filter out bounding boxes with a height smaller than 60 pixels, which is close to the detection limit of HOG.
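The height filter described in the note can be sketched as follows. This is a minimal illustration, not the code used for the published results: the function names are hypothetical, and it assumes that box height is measured as the absolute difference of the two y coordinates (corner order may vary in the annotation files).

```python
MIN_HEIGHT_PX = 60  # threshold stated in the note above


def box_height(box):
    """Height in pixels of a box (x1, y1, x2, y2); corner order may vary."""
    _, y1, _, y2 = box
    return abs(y2 - y1)


def filter_small_boxes(boxes, min_height=MIN_HEIGHT_PX):
    """Drop boxes whose height is smaller than min_height pixels."""
    return [b for b in boxes if box_height(b) >= min_height]


# Made-up example boxes with heights 120, 40 and 80 pixels.
boxes = [(10, 20, 50, 140), (300, 200, 320, 240), (400, 260, 360, 180)]
kept = filter_small_boxes(boxes)  # the 40-pixel box is removed
```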
We deeply appreciate the help of Martin Vogt in annotating this large amount of data.
We hope that you can use the provided data in your research. If you are performing comparisons, we would love to learn about your results. Please reference our CVPR'08 paper when using the data:
@InProceedings{eth_biwi_00534,
author = {A. Ess and B. Leibe and K. Schindler and L. van Gool},
title = {A Mobile Vision System for Robust Multi-Person Tracking},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08)},
year = {2008},
month = {June},
publisher = {IEEE Press},
keywords = {}
}
Data
Below you can find the sequences used in our publications. For each sequence, we provide the images, the calibration files, as well as single frame annotations. The most current video results are the ones from ICRA '09, based on the HOG pedestrian detector and a more advanced tracker.
Update Feb 2010: Christian Wojek from the vision group at TU Darmstadt updated the annotations of our original ICCV sequences to also contain pedestrians down to a size of roughly 48 pixels. If you intend to compare with his results, please use these updated annotations.
ETH01 ETH02 ETH03
Setup 1 (chariot Mk I) | |
![]() | Sequence BAHNHOF (999 frames) images NOT undistorted Used in: ICCV '07, CVPR '08, PAMI '09, ICRA '09 Images (left, 500 MB) Images (right, 500 MB) Annotations Annotations (Feb 2010, TU Darmstadt) Calibration Odometry Result (ICRA '09) |
<img src="https://data.vision.ee.ethz.ch/cvl/aess/dataset/jelmoli.png"> | Sequence JELMOLI (936 frames) Used in: ICCV '07, CVPR '08 Images (left, frames 1-450) Images (left, frames 451 - 936) Images (right, frames 1-450) Images (right, frames 451 - 936) Annotations Annotations (Feb 2010, TU Darmstadt) Calibration Result (ICRA '09) |
<img src="https://data.vision.ee.ethz.ch/cvl/aess/dataset/part13.png"> | Sequence SUNNY DAY (354 frames) Used in: ICCV '07 Images (left) Images (right) Annotations Calibration |
Setup 2 (chariot Mk II) | |
<img src="https://data.vision.ee.ethz.ch/cvl/aess/dataset/linthescher.png"> | Sequence LINTHESCHER (1,208 frames) Used in: CVPR '08, PAMI '09 Images (left, 500 MB) Images (right, 500 MB) Annotations (every 4th frame) Calibration Result (ICRA '09) |
<img src="https://data.vision.ee.ethz.ch/cvl/aess/dataset/parade.png"> | Sequence PARADEPLATZ (231 frames) Used in: ICRA '09 Result (ICRA '09) |
<img src="https://data.vision.ee.ethz.ch/cvl/aess/dataset/crossing.png"> | Sequence CROSSING (220 frames) Used in: CVPR '08 Images (left, 86 MB) Images (right, 86 MB) Annotations (every 4th frame) Calibration Result (CVPR '08) |
<img src="https://data.vision.ee.ethz.ch/cvl/aess/dataset/crossing2.png"> | Sequence PEDCROSS 2 (840 frames) Used in: CVPR '08, PAMI '09 Images (left, 340 MB) Images (right, 340 MB) Calibration Result (CVPR '08) |
Setup 3 (car) | |
<img src="https://data.vision.ee.ethz.ch/cvl/aess/dataset/loewenplatz.png"> | Sequence LOEWENPLATZ Used in: PAMI '09, ICRA '09 Images (left, 330 MB) Images (right, 330 MB) Annotations Calibration Odometry Result (ICRA '09) |
We thank Kristijan Macek, Luciano Spinello, and Prof. Roland Siegwart from the ASL lab for giving us the opportunity to record from the SmartTer platform (Sequence LOEWENPLATZ).
Calibration
Calibration files contain the calibration for both the left and the right camera (K [3x3], rad [1x2], tan [1x2], R [3x3], t [1x3]), with K the internal calibration, rad/tan the radial/tangential distortion coefficients, and R/t the external calibration, world -> camera (i.e. X_cam = R X_world + t).
The cameras are installed about 0.95m above ground for the chariot, and about 1.4m for the car sequence.
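As a rough sketch of how the external calibration can be applied, the pure-Python snippet below maps a point from world to camera coordinates with X_cam = R X_world + t, inverts that mapping, and projects with K. Inverting the relation gives X_world = R^T (X_cam - t), so the camera center is C = -R^T t, which has the same form as the convention stated for the odometry files. All function names and numeric values are illustrative assumptions, the rad/tan distortion terms are ignored, and reading the calibration files themselves is not shown because their exact text layout is not described here.

```python
def mat_vec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(M):
    """Transpose of a 3x3 matrix."""
    return [[M[j][i] for j in range(3)] for i in range(3)]


def world_to_camera(R, t, X_world):
    """X_cam = R X_world + t (the convention of the calibration files)."""
    RX = mat_vec(R, X_world)
    return [RX[i] + t[i] for i in range(3)]


def camera_to_world(R, t, X_cam):
    """Inverse mapping: X_world = R^T (X_cam - t)."""
    d = [X_cam[i] - t[i] for i in range(3)]
    return mat_vec(transpose(R), d)


def camera_center(R, t):
    """Camera center in world coordinates: C = -R^T t."""
    return [-c for c in mat_vec(transpose(R), t)]


def project(K, X_cam):
    """Ideal pinhole projection; lens distortion (rad/tan) is ignored."""
    x = mat_vec(K, X_cam)
    return (x[0] / x[2], x[1] / x[2])


# Made-up values for illustration only (not taken from any calibration file).
R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 degree rotation about the z axis
t = [1, 2, 3]
K = [[500, 0, 320], [0, 500, 240], [0, 0, 1]]

X_cam = world_to_camera(R, t, [1, 0, 5])
X_back = camera_to_world(R, t, X_cam)
C = camera_center(R, t)
pixel = project(K, X_cam)
```

Since the images of at least one sequence are marked as not undistorted, a real pipeline would also need to account for the distortion coefficients before or after projection.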
Odometry
Odometry contains one file for each image, containing the calibration of the left camera (K [3x3], rad [1x3], R^T [3x3], C [1x3]). Note that, as opposed to the calibration files, R^T and C transform X into world coordinates: X_world = R^T * X_cam + C, with C the camera center.
IDL files
An IDL file is used for storing the annotations of a sequence. For each image, it lists a set of bounding boxes, separated by commas. Each box is given by its upper-left and lower-right corners, but the two corners are not necessarily stored in that order. A semicolon ends the list of bounding boxes for a single file, and a period ends the file.
"filename": (x1, y1, x2, y2), (x1, y1, x2, y2), ...;
A simple MATLAB reader is available: readIDL.m
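For readers who do not use MATLAB, the format described above can be parsed with a few lines of Python. This is a minimal sketch written from the format description only, not a port of readIDL.m: the function names and the sample filenames are hypothetical, an image without annotations is assumed to appear as a bare quoted filename, and corners are reordered so that each box is returned as (left, top, right, bottom).

```python
import re

_NUM = r'\s*([-+]?\d+(?:\.\d+)?)\s*'
# A quoted filename followed by everything up to the next quote.
_ENTRY_RE = re.compile(r'"([^"]+)"([^"]*)')
# One box: four comma-separated numbers in parentheses.
_BOX_RE = re.compile(r'\(' + _NUM + ',' + _NUM + ',' + _NUM + ',' + _NUM + r'\)')


def normalize(box):
    """Reorder corners so the box is (left, top, right, bottom)."""
    x1, y1, x2, y2 = box
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def parse_idl(text):
    """Return a dict mapping each filename to a list of normalized boxes."""
    annotations = {}
    for match in _ENTRY_RE.finditer(text):
        name, tail = match.group(1), match.group(2)
        annotations[name] = [
            normalize(tuple(float(v) for v in groups))
            for groups in _BOX_RE.findall(tail)
        ]
    return annotations


# Hypothetical sample in the format described above.
sample = '''"left/image_00000001_0.png": (212, 204, 238, 270), (300, 250, 280, 190);
"left/image_00000002_0.png";
"left/image_00000003_0.png": (10, 20, 30, 90).'''

annotations = parse_idl(sample)
```

To read an annotation file from disk, pass its contents to the parser, for example `parse_idl(open(path).read())`, where `path` points to one of the downloaded IDL files.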
References
[Ess et al, 2007] A. Ess, B. Leibe, and L. van Gool. Depth and Appearance for Mobile Scene Analysis, Proceedings of ICCV 2007.
[Ess et al, 2008] A. Ess, B. Leibe, K. Schindler, and L. van Gool. A Mobile Vision System for Robust Multi-Person Tracking, Proceedings of CVPR 2008.
[Ess et al, 2009a] A. Ess, B. Leibe, K. Schindler, and L. van Gool. Moving Obstacle Detection in Highly Dynamic Scenes, Proceedings of ICRA 2009 (best vision paper award).
[Ess et al, 2009b] A. Ess, B. Leibe, K. Schindler, and L. van Gool. Robust Multi-Person Tracking from a Mobile Platform, Transactions PAMI 2009.