Human–AI Teaming
This research program capitalizes on Idiap’s well-established expertise in multimodal interaction. It leverages Idiap’s unique ability to undertake in-depth multidisciplinary research across verbal and nonverbal communication, language processing, perceptual and cognitive systems, and human–robot interaction.
The program aims to expand human capabilities in several areas: creativity, overcoming cognitive limitations, collaboration, and knowledge. The research seeks to improve machines’ sensing and understanding of human activities, improve information access (e.g., through chatbots serving as on-demand domain experts), use human feedback to improve learning systems, and use robots to assist humans in everyday tasks at work and at home.
Expertise domains
#Bioinformatics&HealthInformatics
#DataScience&SocialComputing
#HumanComputerInteraction
#Imaging&ComputerVision
#MachineLearning
#NaturalLanguageProcessing
#Robotics&AutonomousSystems
#Security&Privacy
#SignalProcessing
#Speech&AudioProcessing
This program contributes to the following UN SDG
People
Publication highlights
Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation, W. He, P. Motlicek and J.-M. Odobez, IEEE/ACM Trans. on Audio, Speech and Language Processing, 29, pp. 1303-1317, 2021.
The first viable deep learning framework (task definition, network architecture, training paradigm) for solving fundamental auditory tasks such as sound source localization, speaker identification and speech/non-speech classification. The framework is suitable for highly noisy environments and overcomes limitations of previous methods, which relied heavily on idealized sound and environment models and were inadequate for everyday situations with multiple sound sources, background noise, short utterances, and no prior knowledge of the number of sound sources. The method learns sound source localization models with limited training resources by leveraging simulated and weakly labeled real audio data.
Active Learning by Feature Mixing, A. Parvaneh, E. Abbasnejad, D. Teney, G. R. Haffari, A. Van Den Hengel and J. Q. Shi, In Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 12227-12236, 2022.
A method to train deep learning models with humans in the loop. Current approaches to machine learning depend on large amounts of data that are costly or difficult to acquire. This paper presents an active learning approach in which human experts interact with the learning algorithm, labeling a small set of training examples to iteratively refine the model and resolve its inconsistencies. This approach helps make machine learning technologies more accessible to small organizations.
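The human-in-the-loop idea described above can be illustrated with a generic pool-based active learning loop. This is only a minimal sketch under stated assumptions: it uses a nearest-centroid classifier with margin-based uncertainty sampling, not the feature-mixing criterion of the paper, and every name in it (`fit`, `margin`, `active_learning`, `oracle`) is hypothetical. The `oracle` function stands in for the human expert who labels each queried example, and the seed set is assumed to contain at least one example of each class.

```python
import random


def centroid(points):
    """Mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def fit(labeled):
    """Nearest-centroid model: one centroid per class label."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_class.items()}


def predict(model, x):
    """Return the label of the closest centroid."""
    return min(model, key=lambda y: sq_dist(x, model[y]))


def margin(model, x):
    """Gap between the two closest centroids; a small gap means uncertain.

    Assumes the model holds at least two classes.
    """
    d = sorted(sq_dist(x, c) for c in model.values())
    return d[1] - d[0]


def active_learning(pool, oracle, seed_labeled, budget):
    """Query the most uncertain pool example `budget` times (illustrative only)."""
    labeled = list(seed_labeled)
    unlabeled = list(pool)
    for _ in range(budget):
        if not unlabeled:
            break
        model = fit(labeled)
        # Pick the example the current model is least sure about.
        query = min(unlabeled, key=lambda x: margin(model, x))
        unlabeled.remove(query)
        # The oracle plays the role of the human expert.
        labeled.append((query, oracle(query)))
    return fit(labeled), labeled


# Synthetic example: the "expert" labels a point 1 if it lies right of x = 0.
rng = random.Random(0)
pool = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
expert = lambda x: 1 if x[0] > 0 else 0
seed = [((-0.5, 0.0), 0), ((0.5, 0.0), 1)]
model, labeled = active_learning(pool, expert, seed, budget=10)
print(len(labeled), "labeled examples;", "prediction at (0.8, 0.1):", predict(model, (0.8, 0.1)))
```

Only the ten queried points are ever shown to the expert; the other 190 pool points stay unlabeled, which is the cost saving that active learning targets.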
Learning Joint Space Reference Manifold for Reliable Physical Assistance, A. Razmjoo, T. Brecelj, K. Savevska, A. Ude, T. Petric and S. Calinon, In Proc. IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS), 2023.
Project highlights
C-LING, 2022-2026, SNSF, Van der Plas: TOWARDS CREATIVE SYSTEMS WITH LINGUISTIC MODELLING
This project investigates what computational models need in order to perform creative cognitive tasks, from generating relatively simple novel concepts to more complex and structured ideas, across multiple domains and languages. In particular, it aims to determine what types of structured and unstructured knowledge are needed and which models best integrate these types of knowledge.
NeuMath, 2022-2024, SNSF, Freitas: NEUMATH: NEURAL DISCOURSE INFERENCE OVER MATHEMATICAL TEXTS
NeuMath will develop models which can jointly represent and reason over two symbolic modalities (natural language and mathematical expressions) and will build the foundations to deliver embedding models which can interpret and support the generation of mathematical arguments (by leveraging available large-scale scientific corpora).
SMILE-II, 2021-2024, SNSF Sinergia, Magimai Doss: SMILE-II: SCALABLE MULTIMODAL SIGN LANGUAGE TECHNOLOGY FOR SIGN LANGUAGE LEARNING AND ASSESSMENT PHASE-II
SMILE-II aims to research and build advanced technology for sign language learning. It builds on the groundwork laid by the SNSF Sinergia project SMILE, which dealt with assessing the manual activity of Swiss German Sign Language (Deutschschweizerische Gebärdensprache, DSGS) in isolated signs produced by early learners and L2 learners. SMILE-II will extend this technology to continuous sign language assessment, covering both manual and non-manual components of signs, so that a DSGS learner’s sentence-level production can be assessed automatically.
Full list of related projects
C-LING, 2022-2026, SNSF, Van der Plas
Building computational models of human creative thinking to help with creative tasks
NeuMath, 2022-2024, SNSF, Freitas
Neuro-symbolic architectures for supporting mathematical discovery
NAST, 2020-2024, SNSF, Garner
Neural architectures for speech technology
SteADI, 2021-2025, SNSF, Garner
Storytelling algorithms for digital interviews
NKBP, 2020-2024, SNSF, Henderson
Deep learning models for continual extraction of knowledge from text
SINFONIA, 2023-2027, Innosuisse, Teney, Freitas
Generalization and domain adaptation of large language models
LUCIDELES, 2020-2023, SFOE, Kämpf
Research at the interface between humans and building control systems
CODIMAN, 2020-2024, National Research Programme "Digital Transformation", SNSF, Calinon
Cobotics, digital skills and the re-humanization of the workplace
SESTOSENSO, 2022-2025, Horizon Europe, Calinon
Physical cognition for intelligent control and safe human-robot interaction
SMILE-II, 2021-2024, SNSF Sinergia, Magimai Doss
Assistive technology for sign language learning and testing
Amazon Research Award, 2023, Teney
Addressing underspecification for improved fairness and robustness in conversational AI
MALORCA, 2016-2018, EU, Motlicek
Machine learning of speech recognition models for controller assistance
HAAWAII, 2020-2022, EU, Motlicek
Highly automated air-traffic controller workstations with artificial intelligence integration
ATCO2, 2019-2022, EU, Motlicek
Automatic collection and processing of voice data from air-traffic communications
EUROCONTROL, 2023-2024, France, Motlicek
Automatic speech recognition in air-traffic control simulation