This website talks about the projects I'm working on, at work (at the Idiap Research Institute) or in my free time.

Introduction

The goal of this application is to display a visual representation of the events happening in the Idiap Showroom in real time. The room is equipped with a microphone array (8 microphones, located in the center of the room) and 4 cameras (one in each corner). A 3D representation of the room is projected on a screen.

The video/audio processing part of the application is made of separate processing blocks chained together. Most of these blocks run in their own thread. Examples of blocks include video capture, distortion correction, background subtraction, people detection, people tracking, audio capture, speaker localization, rendering, and stream synchronization.
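To illustrate the idea of chained blocks that each run in their own thread, here is a minimal Python sketch. It is not the application's actual code: the `Block` class, the `run_chain` helper, and the use of queues between blocks are all assumptions made for the example.

```python
import queue
import threading

_STOP = object()  # sentinel passed downstream to signal the end of the stream


class Block(threading.Thread):
    """One processing block: reads from `inbox`, applies `func`, writes to `outbox`."""

    def __init__(self, name, func, inbox, outbox):
        super().__init__(name=name, daemon=True)
        self.func = func
        self.inbox = inbox
        self.outbox = outbox

    def run(self):
        while True:
            item = self.inbox.get()
            if item is _STOP:
                self.outbox.put(_STOP)  # propagate shutdown to the next block
                return
            self.outbox.put(self.func(item))


def run_chain(funcs, items):
    """Chain `funcs` as threaded blocks and push `items` through them in order."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    blocks = [
        Block(f"block-{i}", func, queues[i], queues[i + 1])
        for i, func in enumerate(funcs)
    ]
    for block in blocks:
        block.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_STOP)

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for block in blocks:
        block.join()
    return results
```

For example, `run_chain([str.strip, str.upper], [" a ", " b "])` returns `["A", "B"]`. In a real pipeline each function would be a stage such as distortion correction or people detection, and a block could have several inputs (for stream synchronization), which this sketch does not cover.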

Screenshots


The current 3D representation of the room. The dark yellow area on the floor indicates that a speaker was detected there. The more confident the detection, the brighter the yellow. The grid on the floor is the actual one used for the detection. In that sequence, each cell measures 25 x 25 cm.
The 4 video streams captured, after distortion correction (hence the strange curves at the borders). The red rectangles are the result of the people detection algorithm.
Composite view of one video stream and the corresponding virtual camera. My avatar (I'm the guy in the white shirt) is speaking, thanks to the speaker localization process.
Another perspective. Each change of view triggers an animation: of the camera of the 3D view, of the views themselves, or a combination of the two.
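The floor grid in the first screenshot can be sketched in a few lines of Python. This is only an illustration: the 25 cm cell size comes from the caption, but the function names, the linear color blend, and the two RGB endpoints are assumptions, not the application's actual rendering code.

```python
CELL_SIZE_M = 0.25  # cell edge length in meters, from the 25 x 25 cm grid


def cell_index(x_m, y_m, cell_size=CELL_SIZE_M):
    """Return the (column, row) of the grid cell containing floor point (x_m, y_m)."""
    return int(x_m // cell_size), int(y_m // cell_size)


def confidence_to_rgb(confidence, dark=(102, 102, 0), bright=(255, 255, 0)):
    """Blend linearly from dark yellow to bright yellow as confidence goes 0 -> 1.

    The two endpoint colors are hypothetical values chosen for the example.
    """
    c = min(max(confidence, 0.0), 1.0)  # clamp to [0, 1]
    return tuple(round(d + (b - d) * c) for d, b in zip(dark, bright))
```

With these definitions, a speaker detected at 1.0 m by 0.5 m from the grid origin falls in cell `(4, 2)`, and a fully confident detection is drawn in bright yellow `(255, 255, 0)`.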

Details

  • 4 video streams at 7.5fps
  • 8 audio streams
  • 38 processing blocks
  • Libraries used: Ogre3D, OpenCV, pom (The Probabilistic Occupancy Map)
  • Running on a MacPro 2.8GHz Quad-core
  • 4 developers

Current features

  • People tracking
  • Speaker localization

Planned features/future work

  • Improve some of the algorithms (background subtraction, people tracking)
  • Improve the 3D view (replace the avatar with a better one, add the missing furniture, add some special effects, collision detection)
  • Customization of the avatars' appearance
  • Speech recognition (to control the application: the toolbar at the top will then be removed, and the application will go fullscreen at that point)

The Swiss TV show nouvo did a report on Smoovee, in which Jean-Pierre Gehrig (from Cinetis SA) is interviewed. The pitch is:

Shaky footage? A video shot in extreme conditions? Today it is possible to retouch these sequences over the Internet. The Valais-based company Cinetis has developed an image stabilizer available on the Web. The principle is simple: you film, you transfer the footage to the computer, and the software does the rest.

"Our main work consists of copying old footage shot on film onto DVDs. These video sequences are often shaky. So we developed a small computer program to smooth them." On the strength of this result, Jean-Pierre Gehrig and his team decided to make the program public.

Today the image stabilizer only works on low-resolution video sequences, shot for example with a mobile phone. In the future, however, its designers want to make it accessible to imaging professionals, for example by including "high definition" processing.

You can see it here: http://www.nouvo.ch/138-4

Smoovee is an online video stabilization service that takes shaky movies and makes them look as if they were shot with a Steadicam.

This video stabilization technology doesn't need any user input: it segments the video automatically, tracks the motion between frames, and computes a smooth virtual camera movement with minimum zooming.
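As a rough illustration of the "smooth virtual camera" idea, the sketch below uses a plain moving average over the camera trajectory. This is a generic textbook technique, not Smoovee's actual algorithm. It assumes the per-frame motion has already been estimated as simple `(dx, dy)` translations, and all function names are made up for the example.

```python
def accumulate(motions):
    """Turn per-frame (dx, dy) motions into an absolute camera trajectory."""
    x = y = 0.0
    path = []
    for dx, dy in motions:
        x += dx
        y += dy
        path.append((x, y))
    return path


def smooth(path, radius=2):
    """Moving average over up to 2*radius+1 frames (window clipped at the ends)."""
    out = []
    for i in range(len(path)):
        window = path[max(0, i - radius): i + radius + 1]
        out.append((
            sum(p[0] for p in window) / len(window),
            sum(p[1] for p in window) / len(window),
        ))
    return out


def corrections(motions, radius=2):
    """Per-frame shift that moves each frame from the shaky path onto the smooth one."""
    path = accumulate(motions)
    return [(sx - px, sy - py)
            for (px, py), (sx, sy) in zip(path, smooth(path, radius))]


def zoom_needed(shifts, width, height):
    """Uniform zoom factor that hides the borders exposed by the given shifts.

    Assumes every shift is smaller than half the frame size.
    """
    max_x = max((abs(sx) for sx, _ in shifts), default=0.0)
    max_y = max((abs(sy) for _, sy in shifts), default=0.0)
    return max(width / (width - 2 * max_x), height / (height - 2 * max_y))
```

The trade-off behind "minimum zooming" shows up directly here: a larger `radius` gives a smoother path but larger corrections, and larger corrections require more zoom to hide the exposed borders. This sketch only reports the zoom a given smoothing requires; it does not try to minimize it.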

You can try it for free or see examples at http://smoovee.net

FaceOnIt is a robust, fast and powerful software library to find faces in pictures and videos. The library can be freely downloaded for non-commercial use (free registration is required).

In addition to the library, two Windows demos are provided, in binary and source code forms:

  • Face Detector: detects faces in an image
  • Face Tracker: detects faces in live video (using your webcam or a similar video capture device)

URL: http://www.faceonit.ch/

Note: When an image bigger than the screen is loaded in Face Detector, the image is resized for display, but face detection still runs on the full-size image. To speed up the process, the demo also uses a different minimum face width on big images than on smaller ones.
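The behaviour described in this note can be sketched as follows. This is not the demo's source code: the function names are invented, and the pixel threshold and the two minimum widths are hypothetical values, since the real ones are not stated here.

```python
def display_scale(image_size, screen_size):
    """Scale factor (<= 1) that fits the image on screen; 1.0 if it already fits."""
    (iw, ih), (sw, sh) = image_size, screen_size
    return min(1.0, sw / iw, sh / ih)


def min_face_width(image_size, big_image_pixels=1_000_000, small=20, large=40):
    """Pick a larger minimum face width for big images.

    The threshold and both widths are hypothetical, chosen only for illustration.
    """
    iw, ih = image_size
    return large if iw * ih > big_image_pixels else small


def to_display(box, scale):
    """Map a detection box (x, y, w, h) from full-resolution to display coordinates."""
    return tuple(round(v * scale) for v in box)
```

Detection runs on the full-size image, so a box found at `(100, 200, 50, 50)` in a 3200 x 2400 image shown on a 1600 x 1200 screen is drawn at `(50, 100, 25, 25)`. A larger minimum face width means fewer candidate windows to scan, which is where the speed-up on big images comes from.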

Screenshots: FaceOnIt - Face Detector, FaceOnIt - Face Tracker

Google Portrait is a web-based demonstration of IDIAP face detection technology.

The user enters the name of a person. This query is then sent to Google Images, and the resulting images are downloaded and processed (face detection) on the fly before being displayed. Each detected face can then be tagged by the user.

URL: http://www.idiap.ch/googleportrait

Note: Apart from using Google Images to retrieve the URLs of the images, this demonstration has no official link with Google. The name will very likely change in the future.

Example of the result of the query "Tom Cruise":

Google Portrait - Main page

Interface used to tag the faces:

Google Portrait - Tagging of faces