Gaze and Visual Focus of Attention in Conversation and Manipulation Settings


Description

This database contains automatically extracted features (head pose, gaze, speaking status, people/object positions) as well as manual annotations (visual focus of attention, grasp/release actions) from three different datasets:

  • UBImpressed: 5-10 minute dyadic interactions in a formal setting (8 sessions, i.e. 16 videos used, partially annotated)
    • Website: https://www.idiap.ch/project/ubimpressed
    • Reference: Muralidhar et al. 2016. Training on the Job: Behavioral Analysis of Job Interviews in Hospitality. In ACM International Conference on Multimodal Interaction (ICMI’16).
  • KTH-Idiap Group-Interviewing Corpus: 1-hour-long four-party meetings in a more relaxed setting (5 sessions, i.e. 20 videos used, partially annotated)
    • Reference: Oertel et al. 2014. Who Will Get the Grant? A Multimodal Corpus of the Conversational Dynamics of Group Interviews. In ICMI Workshop on Understanding and Modeling Multiparty, Multimodal Interactions (UM3I’14).
  • ManiGaze: a set of short gazing and manipulation tasks performed in front of a robot (16 sessions used, partially annotated)
    • Website: https://www.idiap.ch/en/dataset/manigaze
    • Reference: Siegfried et al. 2020. ManiGaze: A Dataset for Evaluating Remote Gaze Estimator in Object Manipulation Situations. In ACM Symposium on Eye Tracking Research and Applications (ETRA’20).

The conversation datasets were manually annotated with the visual focus of attention (around 3000 annotated frames per video in the UBImpressed dataset and 9000 annotated frames per video in the KTH-Idiap Group-Interviewing dataset). The ManiGaze dataset provides VFOA ground truth for its gazing sessions (51 targets per subject) and action annotations for its object manipulation session (22 actions per subject). See the reference paper for details on feature extraction.
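Since only a subset of frames carries VFOA labels, a typical first step is to align the per-frame features with the annotated frames. The sketch below is a minimal illustration in Python; the file names (`head_pose.csv`, `vfoa.csv`), the shared `frame` column, and the per-session layout are all assumptions for the example, so check the distributed archive for the actual format.

```python
import pandas as pd

# Hypothetical file names and columns: adapt to the actual archive layout.
features = pd.read_csv("session01/head_pose.csv")  # e.g. columns: frame, pan, tilt, roll
vfoa = pd.read_csv("session01/vfoa.csv")           # e.g. columns: frame, target

# Only a subset of frames is annotated ("partially annotated"), so keep
# the feature rows that have a matching VFOA label.
annotated = vfoa.merge(features, on="frame", how="inner")
print(f"{len(annotated)} annotated frames out of {len(features)} feature frames")
```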

This database was mainly collected to support experiments on visual focus of attention (VFOA) estimation and on the calibration of gaze estimators.
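For VFOA estimation, a common geometric baseline is to assign attention to the target whose direction from the subject's eye is angularly closest to the estimated gaze direction. The sketch below illustrates this idea under assumed conventions (3D positions in a shared reference frame, a hypothetical 15° acceptance threshold); it is not the method of the reference paper.

```python
import numpy as np

def vfoa_nearest_target(gaze_dir, eye_pos, targets, max_angle_deg=15.0):
    """Return the name of the target angularly closest to the gaze direction,
    or None if no target lies within max_angle_deg of it.

    gaze_dir: 3D gaze direction; eye_pos: 3D eye position; targets: dict
    mapping target name to 3D position. All values here are illustrative.
    """
    gaze_dir = np.asarray(gaze_dir, dtype=float)
    gaze_dir /= np.linalg.norm(gaze_dir)
    best, best_angle = None, max_angle_deg
    for name, pos in targets.items():
        direction = np.asarray(pos, dtype=float) - eye_pos
        direction /= np.linalg.norm(direction)
        angle = np.degrees(np.arccos(np.clip(gaze_dir @ direction, -1.0, 1.0)))
        if angle < best_angle:
            best, best_angle = name, angle
    return best

# Example with made-up coordinates (meters, eye at the origin).
targets = {"robot_head": [0.0, 0.3, 1.2], "object_A": [0.2, -0.2, 0.8]}
print(vfoa_nearest_target([0.2, -0.25, 0.75], np.zeros(3), targets))  # -> object_A
```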

 

Acknowledgement

Rémy Siegfried and Jean-Marc Odobez. 2021. Robust Unsupervised Gaze Calibration Using Conversation and Manipulation Attention Priors. ACM Transactions on Multimedia Computing, Communications, and Applications.
https://publications.idiap.ch/index.php/publications/show/4611

https://publications.idiap.ch/index.php/publications/show/4294