Seminars are lectures given by external visitors and hosted at Idiap.

Next event


Solving hard problems in robotics – with a little help from semidefinite relaxations, nullspaces, and sparsity

Dr Frederike Dümbgen, Robotics Institute of the University of Toronto

March 20, 11:00pm


Many state estimation and planning tasks in robotics are formulated as non-convex optimization problems, and commonly deployed efficient solvers may converge to poor local minima. Recent years have seen promising developments in so-called certifiably optimal estimation, showing that many problems can in fact be solved to global optimality or certified through the use of tight semidefinite relaxations.

In this talk, I present our efforts to make such methods – for the field of state estimation in particular – more practical for roboticists. Among those efforts, I will present novel efficient optimality certificates as a low-cost add-on to off-the-shelf local solvers, which apply to a variety of problems including range-only, stereo-camera and, more generally, matrix-weighted localization. Then, I present our approach to automatically certify almost any state estimation problem, using a sampling-based method to automatically find tight relaxations through nullspace characterizations. I end with an overview of our most recent work, which makes it possible to create solvers that are both fast and certifiably optimal by exploiting the sparse problem structure.


Frederike Dümbgen is currently a postdoctoral researcher at the Robotics Institute of the University of Toronto, working with Prof. Tim Barfoot. She received her Ph.D. in 2021 from the Laboratory of AudioVisual Communications (LCAV) with Prof. Martin Vetterli and Dr. Adam Scholefield in Computer and Communication Sciences at École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. Before that, she obtained her B.Sc. and M.Sc. in Mechanical Engineering from EPFL in 2013 and 2016, respectively, with a minor in Computational Science and Engineering, and completed her Master's thesis at the Autonomous Systems Lab of ETH Zürich. Her research has ranged from novel localization methods, using in particular acoustic, radio-frequency and ultra-wideband signals, to, most recently, global optimization for robotics.


Past events


Isaac Asimov, robots and planet Earth

Pierre-Brice Wieber

February 1, 2:30pm


Isaac Asimov was a scientist and a science-fiction writer who invented the words "robotics" and "roboticist". In doing so, he proposed the famous Three Laws of Robotics: 1) A robot may not injure a human being or, through inaction, allow a human being to come to harm. 2) A robot must obey the orders given it by human beings except where such orders would conflict with the First Law. 3) A robot must protect its own existence as long as such protection does not conflict with the First or Second Law. He later added a Zeroth Law, superseding the others: A robot may not injure humanity or, through inaction, allow humanity to come to harm. I propose to discuss how these different statements can be put to use when doing robotics today.


Pierre-Brice Wieber is a full-time researcher at INRIA Grenoble. He graduated from Ecole Polytechnique in 1996 and received his PhD degree in Robotics from Ecole des Mines de Paris in 2000. He was a visiting researcher at AIST/CNRS Joint Research Lab in Tsukuba in 2008–2010. Pierre-Brice has been serving as Associate Editor for IEEE Transactions on Robotics, Robotics and Automation Letters and conferences such as ICRA and Humanoids. His research interests include the modeling and control of humanoid and manipulator robots.


Deep Surface Meshes

Prof. Pascal Fua, EPFL

December 12, 2023, 2:30pm


Geometric Deep Learning has made striking progress with the advent of Deep Implicit Fields. They allow for detailed modeling of surfaces of arbitrary topology while not relying on a 3D Euclidean grid, resulting in a learnable 3D surface parameterization that is not limited in resolution. Unfortunately, they have not yet reached their full potential for applications that require an explicit surface representation in terms of vertices and facets, because converting the implicit representation to such an explicit representation requires a marching-cubes algorithm, whose output cannot be easily differentiated with respect to the implicit surface parameters. In this talk, I will present our approach to overcoming this limitation and implementing convolutional neural nets that output complex 3D surface meshes while remaining fully differentiable and end-to-end trainable. I will also present applications to single-view reconstruction, physically driven shape optimization, and biomedical image segmentation.


Pascal Fua received an engineering degree from Ecole Polytechnique, Paris, in 1984 and a Ph.D. in Computer Science from the University of Orsay in 1989. He joined EPFL (Swiss Federal Institute of Technology) in 1996, where he is a Professor in the School of Computer and Communication Sciences and head of the Computer Vision Lab. Before that, he worked at SRI International and at INRIA Sophia-Antipolis as a Computer Scientist. His research interests include shape modeling and motion recovery from images, analysis of microscopy images, and machine learning. He has (co)authored over 300 publications in refereed journals and conferences. He has received several ERC grants. He is an IEEE Fellow and has been an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence. He often serves as program committee member, area chair, and program chair of major vision conferences and has cofounded three spinoff companies.


Disentangling Linguistic intelligence: automatic generalisation of structure and meaning across languages

Prof. Paola Merlo, UNIGE

October 20, 2023


The current reported success of large language models is based on computationally (and environmentally) expensive algorithms and prohibitively large amounts of data that are available for only a few, non-representative languages. This limitation restricts access to natural language processing technology to a few dominant languages and modalities and leads to the development of systems that are not human-like, with great potential for unfairness and bias. To reach better, possibly human-like, abilities in neural networks' abstraction and generalisation, we need to develop tasks and data that train the networks toward more complex and compositional linguistic abilities. We identify these abilities as the intelligent ability to infer patterns of regularities in unstructured data and to generalise from few examples, using abstractions that are valid across possibly very different languages. We have developed a new task and a set of problems inspired by IQ intelligence tests. These problems are developed specifically for language and aim to learn disentangled linguistic representations of underlying linguistic rules of grammar. These investigations can lead to three beneficial improvements of methods and practices: (i) deep, compositional representations would be learnt, thus reducing the amount of data needed; (ii) current machine learning methods would be extended to low-resource languages or low-resource modalities and scenarios; (iii) higher-level abstractions would be learnt, avoiding the use of superficial, associative cues (possibly reducing bias and potential harm in the representations learned by current artificial linguistic systems).


Early-exits models for automatic speech recognition on resource-constrained devices

Alessio Brutti, Fondazione Bruno Kessler, Trento, Italy

October 20, 2023


The possibility of dynamically modifying the computational load of neural models at inference time is crucial for on-device processing, where computational power is limited and time-varying. Established approaches for neural model compression exist, but they provide static models. Relying on intermediate exit branches, early-exit architectures allow for the development of dynamic models that adjust their computational cost to resources and performance. This talk will present an experimental analysis of the use of early-exit architectures in large-vocabulary speech recognition scenarios, showing that properly training the models not only preserves performance levels when using fewer layers, but also improves accuracy compared to using single-exit models or pre-trained models. In addition, the talk will discuss the application of early-exit architectures in federated learning frameworks with heterogeneous devices.


Investigating the overheating risk in a free-running building in Thailand using CIBSE TM52 and Annual Sun Exposure (ASE)

Apiparn Borisuit, EPFL

September 18, 2023


Annual Sunlight Exposure (ASE) is widely used to assess direct sunlight exposure in buildings as a proxy for detecting potential visual discomfort. Even though ASE was not designed to address thermal comfort, the relationship between direct sunlight and thermal sensation is known. The study aims to explore the associations between ASE and thermal comfort criteria through an improvement of thermal comfort in a Child Development Centre (CDC) in Thailand. The existing condition of a CDC building and a simplified version were simulated using the IESVE simulation tool. Overhangs, external shutters, and double glazing were integrated into the computer models to improve thermal comfort. CIBSE TM52 overheating criteria were used to indicate thermal comfort. We found significant correlations between ASE and the criteria of CIBSE TM52 (r=0.28 -0.56; p.


The Regularization of the Presentation Attack Detection (PAD) Systems By Explainability

Gökhan Özbulak, Dokuz Eylül University

June 5, 2023


A Presentation Attack Detection (PAD) system is a crucial sub-component of biometric systems when it comes to recognizing or verifying someone for further processing. Without such a PAD system, an attacker can gain unauthorized access to protected areas and defeat the biometric validation. Therefore, a PAD system must exist and be robust against all kinds of attacks, including paper, tablet-screen, and 2D or 3D mask attacks. In this talk, I will present my past study on the generalization of PAD systems. I will propose an explainability-based regularization method for PAD systems and share the generalizability performance of the proposed method in public and cross-dataset experiments. I will also give a brief introduction to my other studies on hard and soft biometrics.


Automatic analysis of Parkinson's disease: unimodal and multimodal perspectives

Prof. Juan Rafael Orozco-Arroyave

March 23, 2023


Parkinson's disease (PD) is mainly a movement disorder that arises from the progressive death of dopaminergic neurons in the substantia nigra of the midbrain (part of the basal ganglia). Diagnosis and monitoring of PD patients are still highly subjective, time-consuming, and expensive. Existing medical scales used to evaluate the neurological state of PD patients cover many different aspects, including activities of daily living, motor skills, speech, and depression. This makes the task of automatically reproducing experts' evaluations very difficult because several bio-signals and methods are required to produce clinically acceptable and practical results.

This talk shows how different bio-signals (e.g., speech, gait, handwriting, and facial expressions) can be used to find suitable models for PD diagnosis and monitoring. Results with classical feature extraction and classification methods will be presented along with CNN- and GRU-based architectures.


Understanding Neural Speech Embeddings for Speech Assessment

Prof. Elmar Nöth

January 20, 2023


In this talk, we present preliminary results of experiments performed to understand what information is represented in which layer of deep neural networks. We will motivate our experiments with an image processing problem (identification of orca individuals based on the dorsal fin), where we show that the result of unsupervised clustering of previously unseen individuals strongly depends on the underlying embedding and on the task for which that embedding was trained in a supervised manner. We then present preliminary results on t-SNE projections of different pathological and control corpora based on the different layers of a pre-trained wav2vec2 module and end with an outlook on current and future research.


The e-David project: Painting strategies and their influence on robotic painting

Prof. Dr. Oliver Deussen, University of Konstanz

August 2, 2022


Our drawing robot e-David is able to create paintings using visual feedback. So far, our paintings have been created using a stroke-based metaphor. In my talk I will speak about the development of a number of stroke-based styles. However, being in close contact with artists, we realized at some point that painting can be modeled much better by interacting and contrasting areas instead of strokes, which are more the basis of drawings. This paradigm shift allows us to construct paintings from a different perspective; the interaction between areas enables us to model different forms of abstraction and reshape areas according to style settings. We will also be able to integrate machine-learning-based tools for analyzing and deconstructing input images. This enhances our creative space and will allow us to find our own forms of machine abstraction and representation.


Artificial Intelligence meets Digital Forensics: a panorama

Prof. Anderson Rocha

July 14, 2022


In this talk, we will discuss a panoramic view of digital forensics in the last 10 years and how it needed to evolve from basic computer vision and simple natural language processing techniques to powerful AI-driven methods to deal with the signs of the new age. We will discuss tampering detection, fact-checking, deepfakes, and authorship analysis as well as recent advances in self-supervised learning to deal with large-scale search in some forensics problems.