
Past Projects

List of completed projects, ordered by most recent end date


ADDG2SU_EXT - Flexible Acoustic Data-Driven Grapheme to Subword Unit Conversion
MIRROR - Mobile Data to Improve Decision Making in African Cities
Africa's population is booming: according to the United Nations and the World Bank, almost 75% of the 30 countries with the highest population growth rates are African. However, phenomena in African cities often cannot be analyzed due to the lack of accurate census data or limited urban infrastructure. Fundamental facts, such as actual city boundaries and real population figures, are unknown in many cases.
MODERN - Modeling Discourse Entities and Relations for Coherent Machine Translation
MODERN is an SNSF Sinergia project that started in August 2013 for a duration of three years (grant no. CRSII2_147653). MODERN is led by the Idiap Research Institute, with participants from the Universities of Geneva, Utrecht (the Netherlands), and Zurich.
SODS - Semantically Self-Organized Distributed Web Search
In this project we wish to develop a new search engine distributed over available web servers, in contrast to existing search engines centralized at a single company site.
PUNK - Punktuation
The PUNK project aims to enrich the ASR results provided by recapp IT with punctuation marks and proper formatting of named entities (dates, numbers, ...).
FAVEO - Accelerating online information discovery through context-driven and behaviour-based personalization of search
Faveeo helps businesses gather strategic information from publishers and social media, using real-time magazines constructed from complex queries.
SP2 - SCOPES Project on Speech Prosody
This is a proposal for a Joint Research Project (JRP) under the SNSF SCOPES mechanism.
ADDG2SU - Flexible Acoustic Data-Driven Grapheme to Subword Unit Conversion
Current state-of-the-art automatic speech recognition (ASR) systems commonly use hidden Markov models (HMMs), in which phonemes (phones) are assumed to be the intermediate subword units and each word to be recognized is explicitly modeled as a sequence of phonemes. Thus, despite the availability of sophisticated statistical modeling and machine learning techniques, developing an ASR system requires prior knowledge, such as lexical resources (e.g., a phoneme set and a lexicon) and a minimum of phonetic expertise.
BEAT - Biometrics Evaluation and Testing
SCOREL2 - Automatic scoring and adaptive pedagogy for oral language learning
SENSECITYVITY - Mobile Sensing, Urban Awareness, and Collective Action
The project goal is to engage citizens as factors of social change through the use of mobile technologies as tools that can improve the understanding of socio-urban problems in cities, neighborhoods, and communities.
MCSC - Mi Casa es Su Casa: Understanding Peer Accommodation in Developed and Developing Countries
DRACULA - Detect and track people/objects in order to deliver personalised movies
VIDEOPROTECTOR - Morphean VideoProtector
VideoProtector is a Software-as-a-Service proposition based on cameras coupled with video analysis.
VideoProtector - Behavior detection with active learning
This project aims at integrating an intelligent, predictive, and adaptive event detection system into the VideoProtector platform (a Software-as-a-Service proposition based on cameras coupled with video analysis).
VALAIS-2015 - Valais*Wallis Digital
SIWIS - Spoken Interaction with Interpretation in Switzerland
ROCKIT - Roadmap for Conversational Interaction Technologies
SIVI - Situated Vision to Perceive Object Shape and Affordances
DBOX - D-Box: A generic dialog box for multilingual conversational applications
GOOGLE_MOBILE - Mobile Face and Voice Anti Spoofing
DEEPSTD-EXT - Universal Spoken Term Detection with Deep Learning (extension)
The DeepSTD project investigates the application of deep learning methods to speech processing.
Geneemo - An Expressive Audio Content Generation Tool
A-MUSE - Adaptive Multilingual Speech Processing
DEMO_NAO - NAO Demonstrator
COHFACE - COntactless Heartbeat detection for trustworthy FACE Biometrics
Identity spoofing is a serious concern for high-security face recognition applications.
DASH - Object Detection with Active Sample Harvesting
EMMA1 - Expression Mimics Marker Analysis
OMSI_ARMASUISSE - Objective Measurement of Speech Intelligibility
During the last meeting at the armasuisse offices in March, armasuisse expressed interest in measurement methods for testing the quality of tactical telecommunication systems.
MULTIVEO - High Accuracy Speaker-Independent Multilingual Automatic Speech Recognition System
SUVA - Recomed: Integration of voice transcription into the CRR electronic patient record
The goal of the project "Integration of voice transcription into the CRR electronic patient record" is to develop, through a partnership between Idiap and the Clinique romande de réadaptation (CRR), a solution that generates medical reports from voice dictation, and to integrate it as a module into the CRR's electronic patient record system (used by many clinics, CMS, and EMS in Valais).
RECOMED - Recomed: Integration of voice transcription into the CRR electronic patient record
MCM-FF - Multimodal Computational Modeling of Nonverbal Social Behavior in Face to Face Interaction
SESAME - SEarching Swiss Audio MEmories
G3E - Geometric Generative Gaze Estimation model from RGB-D sensors
AMASSE - Acoustic Model Adaptation toward Spontaneous Speech and Environment
Speech user interfaces typically rely on being able to place a microphone close to the user’s mouth. This maximizes the volume and clarity of the speech signal, whilst minimizing the effect of other noise in the vicinity. Such an interface is natural for, say, a telephone. However, many applications do not lend themselves to this type of interface. Examples include most home electronics, where the user might typically be in the center of a room, but the device is near a wall. In the case of televisions, a useful intermediate device is the remote control. Nevertheless, it is still inconvenient to hold a remote control like a telephone in order to talk to it.
DEEPSTD - Universal Spoken Term Detection with Deep Learning
REMUS - Re-ranking Multiple Search Results for Just-in-Time Document Recommendation
RODI - Role-based speaker diarization
InEvent - Accessing Dynamic Networked Multimedia Events
The main goal of inEvent is to develop new means to structure, retrieve, and share large archives of networked and dynamically changing multimedia recordings, consisting mainly of meetings, video-conferences, and lectures.
DOMOCARE - DomoCare - A new Home Care Preventive Protocol
DomoSafety develops intelligent care systems that allow elderly persons to maintain their autonomy longer.
SODA - Person Recognition in debate and broadcast news
LOBI - Low Complexity Binary Features for Robust-to-Noise Speaker Recognition
DIMHA - Diarizing Massive Amounts of Heterogeneous Audio
AROLES - Automatic Recommendation of Lectures and Snippets
NISHA - NTT - Idiap Social beHavior Analysis Initiative
RECOD - Low bit-rate speech coding
NINAPRO - Non-Invasive Adaptive Hand Prosthetics
SONVB - Sensing and Analyzing Organizational Nonverbal Behavior
TABULA RASA - Trusted Biometrics under Spoofing Attacks
NEC - NEC collaboration
FlexASR - Flexible Grapheme-Based Automatic Speech Recognition
There has always been an interest in directly using the grapheme (orthographic) transcription of a word, without explicit phonetic modeling. However, while this limits variability at the word representation level, the link to the acoustic waveform becomes weaker (depending on the language), since standard acoustic features characterize phonemes. Most recent attempts were based on mapping the orthography of words onto HMM states using phonetic information, or on extending conventional HMM-based ASR systems by improving context-dependent modelling for grapheme units.
HAI-2010 - Human activity and interactivity modeling
SSPNet - Social Signal Processing Network
IMAGECLEF - The Robot Vision Task @ ImageCLEF: Towards Web-Robotics
MediaParl - MEDIAPARL
VlogSense - Understanding Nonverbal Behavior in Social Media
VlogSense is funded by the NCCR IM2 (Swiss National Science Foundation)
PANDA - Perceptual Background Noise Analysis for the Newest Generation of Telecommunication Systems
TRACOME - Robust face tracking, feature extraction and multimodal fusion for audio-visual speech recognition and visual attention modeling in complex environment
IM2 (Phase III) - Interactive Multimodal Information Management
IM2 is one of the 20 Swiss National Centres of Competence in Research (NCCR) aiming at boosting research and development in several areas considered of strategic importance to the Swiss economy. The National Centres of Competence in Research are a research instrument managed by the Swiss National Science Foundation on behalf of the Federal Authorities. Granted for a maximum duration of 12 years, they are evaluated every year by a review panel and renewed every four years. In December 2009, the SNSF approved the next and last four-year period (2010-2013), which started on January 1st, 2010. Success of the NCCRs is measured in terms of research achievements, training of young scientists (PhD students and postdocs), knowledge and technology transfer (including spin-offs), and advancement of women.
UBM - CRSII3_127456 - Understanding Brain morphogenesis
MASH - Massive Sets of Heuristics for Machine Learning
BBfor2 - Bayesian Biometrics for Forensics
CLEAR - Online Cloud-based Platform for Efficient and Robust Face Recognition Services
DAUM - Multi-Lingual and Cross-Lingual Adaptation for Automatic Speech Recognition
VELASH - Very Large Sets of Heuristics for Scene Interpretation
PROMOVAR - Probabilistic Motifs for Video Action Recognition
VANAHEIM - Video/Audio networked surveillance system enhancement through human-centered adaptIve monitoring
ASLEEP - Adapting the Static Luggage dEtEction module for the ProtectRail demonstrator
UBSL - User-Based Similarity Learning for Interactive Image Retrieval
COMTIS - Improving the coherence of machine translation output by modeling intersentential relations
PASCAL2 - Pattern Analysis, Statistical Modelling and Computational Learning
HUMAVIPS - Humanoids with Auditory and Visual Abilities in Populated Spaces
SCALE - Speech communication with adaptive learning
ICS-2010 - Interactive Cognitive Systems
TOBI - Tools for Brain-Computer Interaction
TAO-CSR - Task Adaptation and Optimisation for Conversational Speech Recognition
LS-CONTEXT - Large-Scale Human Context Discovery from Mobile Phones
LS-CONTEXT (Large-Scale Human Context Discovery from Mobile Phones) will investigate probabilistic methods to discover personal and social behavioral patterns from cell phone data. The project addresses three goals.
CODICES - Automatic Analysis of Mexican Codex Collections
YOUBLOG - YOUBLOG
TA2 - Together Anywhere, Together Anytime
PIRMIN - Personalized Information Recommendation for Multimedia Archive Navigation
CONTEXT - Context-based Modelling for Object Detection
CCPP - Cross Cultural Personality Perception
Psychologists have shown that there is a correlation between the nonverbal characteristics of speaking on one side and the personality traits perceived by listeners on the other. For example, individuals who speak loudly are perceived as more extroverted than individuals who speak softly, and individuals who speak fast are perceived as more brilliant than individuals who speak slowly. The problem is that the mapping between nonverbal characteristics of speaking and perceived personality traits is, in many cases, culture dependent. In other words, the above examples are known to apply in southern Europe, but they can be wrong when applied in other cultural areas.
NOVICOM - Automatic Analysis of Group Conversations via Visual Cues in Non-Verbal Communication
REPLAY - Face Authentication Robust to Replay Attacks
BCID - Brain-Coupled Interactive Devices
EMIME - Effective Multilingual Interaction in Mobile Environments
DIRAC - Detection and Identification of Rare Audio-visual Cues
MOBIO - Mobile Biometry
DM3 - Distributed MultiModal Media Server
airvea - Augmented Visit
MULTI 08 - MULTImodal Interaction and MULTImedia Data Mining
IMPACT - Image Spam Classification
BACS - Bayesian Approach to Cognitive Systems
TT - Truetime
