Adaptive and Asynchronous Detection and Segmentation

Large neural networks, and more generally so-called "deep models", are currently the most effective technical solution for processing high-dimensional natural signals such as images and videos. They have been put to use in particular for the general task of "scene understanding", which aims at automatically extracting a semantic description of an image or a video as an arrangement of identified and localized components. This can now be done at a level of performance that seemed unreachable five years ago, which opens the way to automating many complex tasks, such as content-based image and video retrieval, event detection, and autonomous driving or flying. This performance comes with two heavy requirements: first, training the models requires large-scale annotated datasets, and second, both training and inference are computationally extremely demanding, often requiring several million floating-point operations per pixel. The objective of this project is to address both issues in order to improve the performance of autonomous flying and wheeled drones in their context of use.
Idiap Research Institute
Oct 01, 2018
Sep 30, 2021