PETS-ICVS Datasets

This dataset provides annotated video of four scenarios in a smart meeting room, aimed at automatic annotation of meeting activities, including face localisation, facial expression recognition, and gesture and action recognition.



Warning: you are strongly advised to view the smart meeting specification file available here before downloading any data.  This will allow you to determine which part of the data is most appropriate for you.  The total size of the dataset is 5.9 GB.

The JPEG images for the PETS-ICVS may be obtained from here.

You can also download all files under one directory using wget; see http://www.gnu.org/software/wget/wget.html for more details.  A minimal invocation is sketched below.
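For example, a recursive fetch of a single directory might look like this (the URL is a placeholder for the actual address of the dataset directory; -r requests recursion and -np stops wget from ascending to the parent directory):

    wget -r -np ftp://ftp.example.org/pets-icvs/jpegs/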

Note: there appear to be some problems accessing the ftp site using Netscape. If you are having problems, please try using Internet Explorer instead, or access the site directly via ftp as shown above.

Important instructions are given at the bottom of this page on processing the datasets - please read these carefully.



Annotation of Datasets (Ground Truth)

The following annotations are available for the datasets:

1.  Eye positions of people in Scenarios A, B and D.  The format is described in the specification file linked above, i.e. 
image0001.jpg  3  left_eye_center_x left_eye_center_y right_eye_center_x right_eye_center_y 
(where left and right are as seen by the camera, rather than the person's left/right eyes). 
Image coordinates: the origin is in the top left. 
Every 10th frame is annotated.  A parsing sketch is given after this list. 
The annotation is available here.

2. Facial expression and gaze estimation for Scenarios A and D, Cameras 1-2. 
The annotation is available here.

3. Gesture/action annotations for Scenarios B and D, Cameras 1-2. 
The annotation is available here.
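As a reading aid, the following is a minimal Python sketch of a parser for the eye-position lines in annotation 1 above.  It assumes the integer field after the image name gives the number of annotated people, each contributing four image coordinates; that reading is an assumption, and the specification file remains authoritative.

    # Sketch: read eye-position lines of the assumed form
    #   image0001.jpg  N  lx ly rx ry  [lx ly rx ry ...]
    # where N is taken to be the number of annotated people and the
    # coordinates use an origin in the top left of the image.
    def parse_eye_annotations(path):
        """Yield (image_name, [(lx, ly, rx, ry), ...]) per annotated frame."""
        with open(path) as f:
            for line in f:
                fields = line.split()
                if not fields:
                    continue                       # skip blank lines
                image_name = fields[0]
                n_people = int(fields[1])          # assumed meaning of this field
                coords = [int(v) for v in fields[2:2 + 4 * n_people]]
                yield image_name, [tuple(coords[i:i + 4])
                                   for i in range(0, 4 * n_people, 4)]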

You are strongly encouraged to evaluate your results against the appropriate data above and report the comparison in your paper submission.



PETS-ICVS consists of datasets for a smart meeting.

 
[Figure: two views of the smart meeting room without participants.]  The environment consists of three cameras: one mounted on each of two opposing walls, and an omnidirectional camera positioned at the centre of the room. 

[Figure: view from Camera 1.]

[Figure: view from Camera 2.]

[Figure: view from Camera 3.]

The measurements for the smart meeting room may be found in the following calibration file (PowerPoint).

The Task

The overall task is to automatically annotate the smart meeting.

The dataset consists of four scenarios A, B, C and D.

Each scenario consists of a number of separate sub-tasks.  For each frame, the requirement is to perform:

  • face localisation (centre location of eyes)
  • recognition of facial expression
  • recognition of face/hand gesture
  • estimation of face/head direction (gaze)
  • recognition of actions
Note that it is not a requirement of your paper to address all of the tasks stated above.  You may address one or more of the tasks in any of the scenarios.  For example, if you specialise in action recognition, you may wish to submit a paper which addresses this aspect alone, i.e. annotation on a frame-by-frame basis of actions performed within one of the scenarios.  Your annotation may be based on one or more of the three camera views.

A full specification of the dataset is available here, including details of scenarios A-D and the list of actions/gestures and facial expressions.

The results in your paper can be based on any of the data supplied in the dataset. 
The images may be converted to any other format as appropriate, e.g. subsampled or converted to monochrome; a short sketch of such preprocessing is given below. All results reported in the paper should clearly indicate which part of the test data is used, ideally with reference to frame numbers where appropriate, e.g. Scenario B, ...
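Purely as an illustration of the conversions permitted above, here is a minimal sketch using the Pillow library; the file names are placeholders:

    # Subsample a frame by a factor of two and convert it to monochrome.
    from PIL import Image

    img = Image.open("image0001.jpg")
    w, h = img.size
    img = img.resize((w // 2, h // 2))   # subsample to half resolution
    img = img.convert("L")               # 8-bit greyscale (monochrome)
    img.save("image0001_mono.jpg")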

There is no requirement to report results on all the test data, however you are encouraged to test your algorithms on as much of the test data as possible.

The results must be submitted along with the paper, generated in XML format.
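The required XML layout is defined in the specification file; the element and attribute names below are invented purely for illustration.  A minimal sketch of emitting such a file with the Python standard library:

    # Illustration only: this schema is hypothetical, not the official
    # PETS-ICVS results format (see the specification file for that).
    import xml.etree.ElementTree as ET

    root = ET.Element("results", scenario="B", camera="1")
    frame = ET.SubElement(root, "frame", number="0001")
    ET.SubElement(frame, "action", label="stand_up", person="1")
    ET.ElementTree(root).write("results.xml",
                               encoding="utf-8", xml_declaration=True)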

The paper that you submit may be based on previously published tracking methods/algorithms (including papers submitted to the main ICVS conference).  The important point is that your paper MUST report results using the datasets above.

You are strongly encouraged to evaluate your results against the ground truth given above and report the comparison in your paper submission.

Acknowledgements 
The sequences have been provided by the consortium of Project FGnet (IST-2000-26434) http://www.fg-net.org 
with additional support provided by the Swiss National Centre of Competence in Research (NCCR) on Interactive MultiModal Information Management (IM)2.  The NCCR is managed by the Swiss National Science Foundation on behalf of the Federal Authorities. 


If you have any queries please email pets-icvs@visualsurveillance.org.


from: http://www-prima.inrialpes.fr/FGnet/data/08-Pets2003/pets-icvs-db.html
