Current projects

3D2Cut - Machine Learning for Tailor Made Vine Pruning

Role : Principal Investigator
Funding agency : The Ark foundation, company funding
Dates : 2020-2023
Partners: 3D2cut SA

Description : The main objective of this project is to study and design innovative AI algorithms for the analysis of vine trees from images. The aim is to extract the main components of the tree, with the further goal of using them to recommend to field workers how the tree should be pruned.

MuMMER - MultiModal Mall Entertainment Robot (website)

Role : Principal Investigator
Funding agency : H2020, RIA
Dates : 2016 - 2020
Partners: University of Glasgow (UK), Idiap Research Institute (CH), Aldebaran Robotics (F), Heriot-Watt University (UK), LAAS-CNRS (F), VTT Technical Research Centre (Finland), Ideapark (Finland).

Description : The overall goal is the development of a humanoid robot (based on Aldebaran's Pepper platform) that can interact autonomously and naturally in the dynamic environments of a public shopping mall, providing an engaging and entertaining experience to the general public. Using co-design methods, we will work together with stakeholders including customers, retailers, and business managers to develop truly engaging robot behaviours. Crucially, our robot will exhibit behaviour that is socially appropriate: combining speech-based interaction with non-verbal communication and human-aware navigation. To support this behaviour, we will develop and integrate new methods from audiovisual scene processing, social-signal processing, high-level action selection, and human-aware robot navigation.

ROSALIS - Robot Skill Acquisition through Active Learning and Social Interaction Strategies

Role : Principal Investigator
Funding agency : SNF (Swiss National Foundation)
Dates : 2018-2022
Partners: IDIAP RLI (Robot Learning and Interaction group)

Description : ROSALIS proposes to rely on social interactions to teach gestures and skills to a robot. Interactions can involve requests from the robot for a demonstration by the human teacher, asking questions about the skill, scaffolding the environment, and exploiting reproduction attempts to assess what the robot has learned. The research will advance on several fronts. First, for skill representation, the robot learners will require an appropriate level of plasticity, allowing them to adapt, refine, or freeze the skill primitives currently being learned. Active learning methodologies will be developed, relying on heterogeneous sources of information (demonstrations, feedback labels, properties) that allow the robot to make hypotheses about the skill invariants and to suggest demonstrations or queries. Second, to allow natural interactions, we will design perception algorithms that provide a higher-level understanding of people's behaviors and intentions, including gaze information and multimodal action recognition and segmentation. The different mechanisms will be integrated within the definition of interaction units, which imply coordination (selection, timing) between different components: interpretation of the different multimodal inputs; synthesis of demonstrations, queries and social signals expressed through verbal and non-verbal behaviors; and selection of the interaction units to build scaffolded interactions (progressive learning using a combination of different learning strategies). We target applications of robots in both manufacturing and home/office environments, both of which require the ability to re-program robots in an efficient and personalized manner.

Past projects

Sport Profiling

Role : Principal Investigator
Funding agency : The Ark foundation
Dates : 2019-2021
Partners: ActionTypes Swiss Sarl, ProKeyCoach

Description : The main objective of this project is to assess the feasibility of using artificial intelligence for sports profiling by analyzing videos of athletes. Two typical, complementary situations identified by the company will be considered.

REGENN - Robust Eye Gaze Deep Neural Networks

Role : Principal Investigator
Funding agency : Innovation Promotion Agency - CTI
Dates : 2017-2018
Partners: Eyeware SA (https://eyeware.tech)

Description : Eyeware creates natural and intuitive human-machine interaction experiences using eye-gaze tracking. We propose to design a novel, robust gaze estimation system based on multi-task deep learning that can adapt to challenging conditions, targeting new applications in home automation and advertising.

UNICITY - 3D Scene Understanding through Machine Learning to Secure Entrance Zones

Role : Principal Investigator
Funding agency : Innovation Promotion Agency - CTI
Dates : 2017-2019
Partners: FASTCOM Technology SA, HES-SO Fribourg

Description : The UNICITY project targets the research and development of a versatile anti-tailgating solution relying on the off-line learning of statistical classifiers: (a) a view-dependent single-person detector learned from synthetic and real depth maps covering a wide range of body builds and poses; (b) an object identification module learned on a captured 3D representation of the object. These two components will allow Fastcom to match the various configurations of its customers.

VIEW - Visibility Improvement for Event Webcasting

Role : Principal Investigator
Funding agency : Innovation Promotion Agency - CTI
Dates : 2016-2018
Partners: Klewel (www.klewel.com/), Haute Ecole Valais/Wallis (HES-SO)

Description : Klewel records presentations and publishes them on a web platform. This project aims to improve their search engine ranking, giving customers a better return on investment and helping Klewel convert more customers. Innovative methods will be developed to improve the quality of the extracted data (deep learning for OCR) and to automatically generate semantic keywords (active learning on ASR and OCR) that will be exposed to search engines. IDIAP is involved in the design of deep learning models for OCR and semantic slide segmentation.

EUMSSI - Event Understanding through Multimodal Social Stream Interpretation (website)

Role : Principal Investigator
Funding agency : European community (FP 7-ICT, STREP)
Dates : 2014-2016
Partners: University Pompeu Fabra (SP), Laboratoire d'Informatique de l'Université du Mans (LIUM, F), GFAI (D), Leibniz University Hannover (D), Deutsche Welle (D), Video Stream Network (SP).

Description : Development of technologies for identifying and aggregating unstructured information sources of very different natures (video, image, audio, speech, text and social context), including both online media (e.g., YouTube) and traditional media (e.g., audiovisual repositories). The multimodal analytics will help organize, classify and cluster cross-media streams by enriching their associated metadata within a cross-modal, interoperable semantic representation framework. The resulting platform will be exploited for intelligent content management systems, and more specifically to help journalists write articles.

UBImpressed - Ubiquitous First Impressions and Ubiquitous Awareness (website)

Role : Co-Principal Investigator
Funding agency : SNF (Swiss National Fund for scientific research)
Dates : 2014-2017
Partners: University of Lausanne (CH), Cornell University (USA)
Description :

G3E - Geometric Generative Gaze Estimation (website)

Role : Principal Investigator
Funding agency : SNF (Swiss National Fund for scientific research)
Dates : 2014-2015
Partners: Idiap Research Institute
Description :

VANAHEIM - Video/Audio Networked Surveillance System Enhancement through Human-Centered Adaptive Monitoring (website)

Role : Principal Investigator
Funding agency : European community (FP 7-ICT, IP)
Dates : 2010 - 2013
Partners: Multitel (B), Idiap Research Institute (CH), Vienna University (AUT), INRIA-Pulsar (F), Thales TCF (F), Thales Italia (I), GTT (I), RATP (France).

Description : The goal is to develop autonomous computer vision and learning algorithms that contribute to the automatic selection of camera views displayed in the surveillance control room, the extraction and use of non-verbal cues for understanding human social behavior in surveillance scenes, and the online building of collective behavior models from long-term recordings.

HUMAVIPS - Humanoids with auditory and visual abilities in populated spaces. (website)

Role : Principal Investigator
Funding agency : European community (FP 7-ICT, STREP)
Dates : 2010-2013
Partners: INRIA-Perception (F), Idiap Research Institute (CH), Czech Technical University (CZ), Bielefeld University (D), Aldebaran Robotics (F).

Description : Endow humanoid robots with audio-visual perception abilities (exploration, recognition, interaction) that allow them to adopt autonomous and well-adapted social behavior in the presence of a group of people.

SONVB - Sensing and Analyzing Organizational Nonverbal Behavior (website)

Role : Co-Principal Investigator
Funding agency : SNF (Swiss National Fund for scientific research)
Dates : 2010-2013
Partners: University of Neuchâtel (CH), Dartmouth College (USA)
Description :

SODA - perSon recOgnition in Debates and broAdcast news.

Role : Principal Investigator
Funding agency : ANR (Agence Nationale pour la Recherche, France)
Dates : 2011-2014
Partners: Laboratoire d'Informatique de l'Université du Mans (LIUM), Idiap Research Institute

Description : automatic audio-visual person clustering and recognition

TRACOME - Robust face tracking, feature extraction and multimodal fusion for audiovisual speech recognition and visual attention modeling in complex environment (website)

Role : Principal Investigator
Funding agency : SNF (Swiss National Fund for scientific research)
Dates : 2010-2014
Partners: EPFL-LTS5, Idiap Research Institute
Description :

ASLEEP

Role : Principal Investigator
Funding agency : Thales Italia
Dates : 2012-2013

Description : Development and evaluation of a static luggage detection module.

HAI - Human Activity and Interactivity modeling (website)

Role : Principal Investigator
Funding agency : SNF (Swiss National Fund for scientific research)
Dates : 2010-2014

Description : investigate unsupervised data mining techniques relying on Bayesian topic models for activity and interactivity mining in three different application settings (video surveillance, cell-phones, social networks).

PROMOVAR - Probabilistic Motifs for Action Recognition

Role : Principal Investigator
Funding agency : SNF (Swiss National Fund for scientific research)
Dates : 2012-2013
Description :

CODICES - Automatic Analysis of Mexican Codex Collections (website)

Role : Co-Principal Investigator
Funding agency : SNF (Swiss National Fund for scientific research)
Dates : 2008-2012
Partners: Mexico's National Institute of Anthropology and History (INAH)
Description :

CARETAKER - Content Analysis and Retrieval Technologies to Apply Knowledge Extraction to massive Recording (website)

Role : Principal Investigator
Funding agency : European community (IST FP6-027231, STREP)
Dates : 2006 - 2008
Partners: Thales TCF (F), Idiap Research Institute (CH), INRIA-Orion (F), Multitel (B), Kingston University (GB), BUT (CZ), Solid/IBM (Fin), ATAC (I), GTT (I).

Description : study, develop and evaluate content extraction algorithms applied to large collections of multimedia data captured from networks of cameras and microphones in real sites (metro stations). Idiap was in charge of the development of robust and efficient algorithms for multi-camera multi-person tracking, the recognition of events (left-luggage, queue detection), and long-term data mining (several camera recordings over 15 days).