Idiap Speaker Series


Archives

Tue, 27 Sep 2016
11:00:00
Prof. Mark Gales
from Cambridge University
Talk place: Idiap Research Institute

Deep Learning for Speech Processing: An NST Perspective

Abstract:
The Natural Speech Technology EPSRC Programme Grant was a 5-year collaboration between Edinburgh, Cambridge and Sheffield Universities, with the aim of improving core speech recognition and synthesis technology. During the lifetime of the project, dramatic changes took place in the underlying technology for speech processing with the introduction of deep learning. This has yielded significant performance improvements, as well as offering a very rich space of models to investigate. This talk discusses the general area of deep learning for speech processing, with a particular emphasis on sequence-to-sequence models: in speech recognition, waveform to text; and in synthesis, text to waveform. Both generative and discriminative sequence-to-sequence models are described, along with variants on the standard topologies and the implications for both training and inference. Rather than focusing on results for particular models, the talk aims to describe the connections and differences between sequence-to-sequence models and their underlying assumptions.
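As a rough illustration, the attention mechanism at the heart of many sequence-to-sequence models can be sketched in a few lines of Python (a generic textbook construction, not any specific NST system; the dimensions and names below are invented):

import numpy as np

def attention_step(encoder_states, decoder_state):
    """One dot-product attention step: score each encoder frame against
    the current decoder state, softmax the scores over time, and return
    the weighted average (context vector) used to predict the next output."""
    scores = encoder_states @ decoder_state        # (T,) alignment scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                       # softmax over time steps
    context = weights @ encoder_states             # (D,) context vector
    return context, weights

# Toy example: 5 encoder frames with 4-dimensional hidden states.
rng = np.random.default_rng(0)
enc = rng.standard_normal((5, 4))
dec = rng.standard_normal(4)
context, weights = attention_step(enc, dec)
print(weights.round(3), context.round(3))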
               
 
Fri, 15 Jul 2016
11:00:00
Dr. Freek Stulp
from the German Aerospace Center (DLR) in Oberpfaffenhofen, Germany
Talk place: Idiap Research Institute

TALK - Robot Skill Learning: From Reinforcement Learning to Evolution Strategies

Abstract:
A popular approach to robot skill learning is to initialize a skill through imitation learning, and to then refine and improve the skill through reinforcement learning. In this presentation, I highlight three contributions to this approach:
1) Enabling skills to adapt to task variations by using multiple demonstrations for imitation learning;
2) Improving skills through reinforcement learning based on reward-weighted averaging and black-box optimization with evolution strategies;
3) Using covariance matrix adaptation to automatically tune exploration during reinforcement learning (see the sketch below).
Throughout the presentation I show several applications to challenging manipulation tasks on several humanoid robots.
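As a hedged sketch of the reward-weighted averaging of point 2 combined with the covariance adaptation of point 3 (a toy black-box variant; the sampling setup, weighting constant and cost function are illustrative assumptions, not Dr. Stulp's exact algorithm):

import numpy as np

def reward_weighted_update(theta, Sigma, rollout_cost, K=20, h=10.0):
    """One policy-improvement iteration: perturb the policy parameters,
    weight each perturbation by its (exponentiated, negated) cost, and
    average. The exploration covariance Sigma is re-estimated from the
    same weighted samples, so exploration shrinks as the search converges."""
    rng = np.random.default_rng()
    samples = rng.multivariate_normal(theta, Sigma, size=K)  # explore
    costs = np.array([rollout_cost(s) for s in samples])
    z = (costs - costs.min()) / max(np.ptp(costs), 1e-12)    # 0 = best rollout
    weights = np.exp(-h * z)
    weights /= weights.sum()
    theta_new = weights @ samples                            # weighted mean
    diff = samples - theta_new
    Sigma_new = (weights[:, None] * diff).T @ diff + 1e-6 * np.eye(len(theta))
    return theta_new, Sigma_new

# Toy problem: find the parameters minimizing a quadratic cost.
theta, Sigma = np.zeros(3), np.eye(3)
for _ in range(50):
    theta, Sigma = reward_weighted_update(theta, Sigma,
                                          lambda s: np.sum((s - 1.0) ** 2))
print(theta.round(2))  # should approach [1. 1. 1.]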
               
 
Fri, 15 Jul 2016
15:00:00
Dr. Freek Stulp
from the German Aerospace Center (DLR) in Oberpfaffenhofen, Germany
Talk place: Idiap Research Institute

TUTORIAL - Tutorial on Regression

Abstract:
Tutorial on Regression based on the article: 
Freek Stulp and Olivier Sigaud (2015). Many Regression Algorithms, One Unified Model - A Review. Neural Networks, 69:60-79.
Link: http://freekstulp.net/publications/pdfs/stulp15many.pdf
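The unifying view in the article expresses many regression algorithms as a weighted sum of basis functions, f(x) = sum_i w_i phi_i(x); a minimal sketch in that spirit (RBF features plus regularized least squares; the widths, centers and toy data are arbitrary choices, not the paper's notation):

import numpy as np

def rbf_features(x, centers, width=0.2):
    """Radial basis function features phi_i(x) = exp(-(x - c_i)^2 / (2 w^2))."""
    return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))

def fit_weights(Phi, y, lam=1e-3):
    """Regularized least squares: w = (Phi^T Phi + lam I)^-1 Phi^T y."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

# Toy 1-D problem: learn y = sin(2 pi x) from noisy samples.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)
centers = np.linspace(0, 1, 10)
w = fit_weights(rbf_features(x, centers), y)
print(rbf_features(np.array([0.25]), centers) @ w)  # approx sin(pi/2) = 1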
               
 
Thu, 7 Jul 2016
14:00:00
Harry Witchel* & Carina Westling#
from *Discipline Leader in Physiology, Brighton and Sussex Medical School --- #School of Media, Film and Music
Talk place: Idiap Research Institute

Eliciting and recognising complex emotions and mental states including engagement and boredom

Abstract:
Complex emotions are any emotional states other than Ekman's six basic emotions: happiness, sadness, fear, anger, surprise and disgust. Complex emotions can include mixtures of the basic emotions (e.g. horror), emotions outside the basic emotions (e.g. musical "tension"), and emotions mixed with mental states that are not emotions (e.g. engagement and boredom). Eliciting and recognising complex emotions, and allowing systems to respond to them, will be useful for eLearning, human factors (including vigilance), and responsive systems including human-robot interaction.

In this talk we will present our work towards the elicitation and recognition of conscious or subconscious responses. Engineering and psychological solutions to non-invasively determine such mental states and complex emotions may use movement, posture, facial expression, physiology, and sound.  Furthermore, our team has shown that what people suppress is as revealing as what they do. We consider aspects of music listening, movie watching, game playing, quiz-taking, reading, and walking to untangle the complex emotions that can arise.  The mental states of engagement and boredom are considered in relation to fidgeting and to Non-Instrumental Movement Inhibition (NIMI), in order to clarify fundamental research problems and direct research design toward improved solutions.
               
 
Wed, 22 Jun 2016
11:00:00
Asst Prof Gregoire Mariethoz
from University of Lausanne, Institute of Earth Surface Dynamics
Talk place: Idiap Research Institute

Training models with images: algorithms and applications

Abstract:
Multiple-point geostatistics (MPS) has received a lot of attention in the last decade for modeling complex spatial patterns. The underlying principle consists in representing spatial variability using training images. A common conception is that a training image can be seen as a prior for the desired spatial variability. As a result, a variety of algorithmic tools have been developed to generate stochastic realizations of spatial processes based on what can be seen broadly as texture generation algorithms.

While the initial applications of MPS were dedicated to the characterization of 3D subsurface structures and the study of geological/hydrogeological reservoirs, a new trend is to use MPS for the modeling of earth surface processes. In this domain, the availability of remote sensing data as a basis to construct training images offers new possibilities for representing complexity with such non-parametric, data-driven approaches. Repeated satellite observations or climate model outputs, available at a daily frequency for periods of several years, provide the repeated patterns required for robust statistics on high-order patterns that vary in both space and time.

This presentation will outline recent results in this direction, including MPS applications to the stochastic downscaling of climate models, the completion of partially informed satellite images, the removal of noise in remote sensing data, and the modeling of complex spatio-temporal phenomena such as precipitation.
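A crude sketch of the core MPS step, simulating an unknown pixel by searching the training image for a neighbourhood that matches the known conditioning data (a toy direct-sampling flavour; the 3x3 window, random scan and squared-difference distance are simplifications chosen for brevity):

import numpy as np

def sample_pixel(train, cond, rng, n_tries=200):
    """Draw a value for the unknown centre of the 3x3 window `cond`
    (NaN marks unknown cells) by scanning random locations in the
    training image and keeping the best-matching neighbourhood."""
    H, W = train.shape
    best, best_dist = None, np.inf
    known = ~np.isnan(cond)
    for _ in range(n_tries):
        i = rng.integers(1, H - 1)
        j = rng.integers(1, W - 1)
        patch = train[i - 1:i + 2, j - 1:j + 2]
        d = np.sum((patch[known] - cond[known]) ** 2)  # compare known cells only
        if d < best_dist:
            best, best_dist = patch[1, 1], d
    return best

# Toy binary training image made of horizontal stripes.
train = np.tile(np.array([[0.0], [1.0]]), (10, 20))
cond = np.full((3, 3), np.nan)
cond[0, :] = 0.0  # the row above is all zeros, so the centre should be 1
rng = np.random.default_rng(2)
print(sample_pixel(train, cond, rng))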
               
 
Thu, 12 May 2016
10:30:00
Prof. Steve Renals
from University of Edinburgh, UK
Talk place: Idiap Research Institute

Adaptation of Neural Network Acoustic Models

Abstract:
Neural networks can learn invariances through many layers of non-linear transformations. Explicit adaptation to speaker or acoustic characteristics can further improve accuracy. A good adaptation technique should: (1) have a compact representation, allowing the speaker-dependent parameters to be estimated from small amounts of adaptation data and minimising storage requirements; (2) operate in an unsupervised fashion, without requiring labelled adaptation data; and (3) allow for both test-only adaptation and speaker-adaptive training.

In this talk I'll discuss some approaches to the adaptation of neural network acoustic models - for both speech recognition and speech synthesis - with a focus on some approaches that we have explored in the "Natural Speech Technology" programme: factorised i-vectors, LDA domain codes, learning hidden unit contributions (LHUC), and differentiable pooling.
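For instance, LHUC adapts a trained network by learning one amplitude per hidden unit for each speaker; a minimal numpy sketch (the 2*sigmoid(r) reparameterisation follows the published LHUC formulation, while the layer size and data are invented):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lhuc_layer(h, r):
    """Rescale hidden activations h by speaker-dependent amplitudes.
    r is unconstrained; 2*sigmoid(r) keeps each amplitude in (0, 2),
    and r = 0 leaves the speaker-independent network unchanged."""
    return 2.0 * sigmoid(r) * h

# Speaker-independent hidden activations for one frame (e.g. after a ReLU).
rng = np.random.default_rng(3)
h = np.maximum(rng.standard_normal(8), 0.0)
r = np.zeros(8)                          # per-speaker LHUC parameters
print(np.allclose(lhuc_layer(h, r), h))  # True: identity at r = 0
# During adaptation only r is updated on the speaker's data, so just
# 8 parameters per layer are estimated here -- a compact representation.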
               
 
Mon, 21 Mar 2016
14:00:00
Prof. Réjean Plamondon
from Laboratoire Scribens, Département de Génie Électrique, École Polytechnique de Montréal
Talk place: Idiap Research Institute

The Lognormality Principle

Abstract:
The Kinematic Theory of rapid human movements and its family of lognormal models provide analytical representations of pen-tip strokes, often considered the basic unit of handwriting. This paradigm has not only been experimentally confirmed in numerous predictive and physiologically significant tests, but has also been shown to be the ideal mathematical description of the impulse response of a neuromuscular system. This proof led to the postulation of the LOGNORMALITY PRINCIPLE. In its simplest form, this fundamental premise states that the lognormality of the neuromuscular impulse responses is the result of an asymptotic convergence, a basic global feature reflecting the behaviour of individuals who are in perfect control of their movements. As a corollary, motor control learning in young children can be interpreted as a migration toward lognormality. For the larger part of their lives, healthy human adults take advantage of lognormality to control their movements. Finally, as aging and health issues intensify, a progressive departure from lognormality occurs. To illustrate this principle, we present various software tools and psychophysical tests used to investigate the fine motor control of subjects, with respect to these ideal lognormal behaviors, from childhood to old age. In the latter case, we focus particularly on investigations dealing with strokes, Parkinson's disease and Alzheimer's disease. We also show how lognormality can be exploited in many pattern recognition applications: the automatic generation of gestures, signatures, words and script-independent patterns, as well as CAPTCHA production, graffiti generation, anthropomorphic robot control and even speech modelling. Among other things, this lecture aims to elaborate a theoretical background for many handwriting applications and to provide some basic knowledge that could be integrated or taken into account in the development of new automatic pattern recognition systems for e-Learning, e-Security and e-Health.
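For reference, a single stroke in the lognormal model has the bell-shaped speed profile evaluated below (the parameter values are arbitrary, chosen only to produce a plausible curve):

import numpy as np

def lognormal_speed(t, D=1.0, t0=0.0, mu=-1.5, sigma=0.4):
    """Speed |v(t)| of one lognormal stroke:
    D / (sigma * sqrt(2 pi) * (t - t0)) * exp(-(ln(t - t0) - mu)^2 / (2 sigma^2)),
    where D is the stroke amplitude, t0 the command time, and mu and sigma
    the log-time delay and log-response time of the neuromuscular system."""
    dt = t - t0
    out = np.zeros_like(t)
    m = dt > 0
    out[m] = (D / (sigma * np.sqrt(2 * np.pi) * dt[m])
              * np.exp(-(np.log(dt[m]) - mu) ** 2 / (2 * sigma ** 2)))
    return out

t = np.linspace(0.0, 1.0, 11)
v = lognormal_speed(t)
print(t[np.argmax(v)])  # peak near t0 + exp(mu - sigma^2), about 0.19 s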
               
 
Tue, 19 Jan 2016
11:00:00
Gareth Morlais
from Welsh Government, Cardiff, Wales
Talk place: Idiap Research Institute

How technology is opening up new potential for democracy, participation and collaboration

Abstract:
The barriers to production are being lowered so it's a good time to build platforms which make it as simple as possible for everyone to join in and help train and refine language technologies, share their stories and spread the word. Gareth draws on digital storytelling with the BBC, democratic activism via hyperlocal journalism and tools for citizenship to see if there's a new way to corral people's enthusiasm for languages to help build better, more relevant resources.
               
 
Mon, 14 Dec 2015
14:30:00
Dr. Baptiste Caramiaux
from Goldsmiths, University of London
Talk place: Idiap Research Institute

Probabilistic Models for Music Performance: Interaction, Creation, Cognition

Abstract:
Music performance is an epitome of complex and creative motor skills. It is indeed striking that musicians can continuously show more physical virtuosity in playing their instrument and more creativity in varying their interpretation. Technology-mediated music performance has naturally explored the potential of interfaces and interactions for enhancing musical expression. It is, however, a difficult (and ill-posed) problem, and musical interactive systems cannot yet challenge traditional instruments in terms of expressive control and skill learning.
I believe that an important aspect of the problem lies in the understanding of variability in the performer's movements. I will start my talk by presenting a computational approach based on probabilistic models, particularly suited to handling the uncertainty in motion data that stems from noise or from intentional variations by the performers. I will then illustrate the potential of the approach in the design of expressive music interactions, through experiments with proofs of concept developed and evaluated in the lab, as well as real-world applications in artistic projects and in industrial products for consumer devices. Finally, I will present my upcoming EU-funded research project, which takes a more theoretical perspective by examining how this approach could be used to infer an understanding of the cognitive processes underlying sensorimotor learning in music performance.
               
 
Thu, 3 Sep 2015
14:00:00
Prof. Frederic Fol Leymarie
from Goldsmiths, University of London
Talk place: Idiap Research Institute

Shape, Medialness and Applications

Abstract:
I will present on-going research in my group with a focus on shape understanding with applications to computer vision, robotics and the creative industries. I will principally discuss our recent work on building an algorithmic chain exploiting models of shape derived from the cognitive science literature but relating closely to well-known approaches in computer vision and computational geometry: that of medial descriptors of shape.

Recent relevant publications:

[1] Point-based medialness for 2D shape description and identification
P. Aparajeya and F. F. Leymarie
Multimedia Tools and Applications, May 2015
http://link.springer.com/article/10.1007%2Fs11042-015-2605-6

[2] Portrait drawing by Paul the robot
P. Tresset and F. F. Leymarie
Computers & Graphics, April 2013
Special Section on Expressive Graphics
http://www.sciencedirect.com/science/article/pii/S0097849313000149
               
 
Mon, 8 Jun 2015
16:00:00
Prof. Fausto Giunchiglia
from University of Trento, Italy
Talk place: Idiap Research Institute

Discovering Life Patterns

Abstract:
The main goal of this proposal is to discover a person's life patterns (e.g., where she goes, what she does, how she is and feels, and whom she spends time with), namely those situations that repeat themselves, almost but not exactly identical, with regularity, and to exploit this knowledge for improving her quality of life.

The challenge is how to synchronize a sensor- and data-driven representation of the world, which is noisy, imprecise and agnostic of the user's needs, with a knowledge-level representation of the world, which should be: (i) general, by allowing for the representation and integration of different combinations of sensors and interesting aspects of the user's life; and (ii) adaptive, by representing life happenings at the desired level of abstraction, capturing their progress, and adapting to changes in the life dynamics.

The solution exploits three main components: (i) a methodology and mechanisms for an incremental evolution of a knowledge level representation of the world (e.g., ontologies), (ii) an extension of deep learning to take into account and adapt to the constraints coming from the knowledge level and (iii) a Question Answering (Q/A) service which allows the user to interact with the computer according to her needs and terminology.
               
 
Mon, 11 May 2015
11:00:00
Prof. Louis-Philippe Morency
from Language Technology Institute, Carnegie Mellon University
Talk place: Idiap Research Institute

Modeling Human Communication Dynamics

Abstract:
Human face-to-face communication is a little like a dance, in that participants continuously adjust their behaviors based on verbal and nonverbal cues from the social context. Today's computers and interactive devices still lack many of these human-like abilities to hold fluid and natural interactions. Leveraging recent advances in machine learning, audio-visual signal processing and computational linguistics, my research focuses on creating human-computer interaction (HCI) technologies able to analyze, recognize and predict subtle human communicative behaviors in social context. I formalize this new research endeavor with a Human Communication Dynamics framework, addressing four key computational challenges: behavioral dynamics, multimodal dynamics, interpersonal dynamics and societal dynamics. Central to this research effort is the introduction of new probabilistic models able to learn the temporal and fine-grained latent dependencies across behaviors, modalities and interlocutors. In this talk, I will present some of our recent achievements in modeling multiple aspects of human communication dynamics, motivated by applications in healthcare (depression, PTSD, suicide, autism), education (learning analytics), business (negotiation, interpersonal skills) and social multimedia (opinion mining, social influence).
               
 
Fri, 24 Apr 2015
11:00:00
Prof. Vincent Lepetit
from TU Graz, Austria
Talk place: Idiap Research Institute

Robust image feature extraction learning and object registration

Abstract:
Extracting image features such as feature points or edges is a critical step in many Computer Vision systems; however, it is still typically performed with carefully handcrafted methods. In this talk, I will first present a new Machine Learning-based approach to detecting local image features, with applications to contour detection in natural images, as well as in biomedical and aerial images, and to feature point extraction under drastic weather and lighting changes. I will then show that it is also possible to learn efficient object descriptions based on low-level features for scalable 3D object detection.
               
 
Thu, 19 Feb 2015
11:00:00
Prof. Henning Mueller
from HES-SO Sierre, Switzerland
Talk place: Idiap Research Institute

Medical visual information retrieval: techniques & evaluation

Abstract:
Medical imaging, particularly 3D tomographic imaging, has increased enormously in importance and volume in medical institutions. Via digital analysis, the knowledge stored in medical cases can be used beyond a single patient to support decision-making.

This presentation will highlight several challenges in medical image data processing, starting with the VISCERAL EU project, which evaluates segmentation, lesion detection and similar-case retrieval on large amounts of medical 3D data using a cloud-based infrastructure for participants. The description of the MANY project highlights techniques for 3D texture analysis that can be used in a variety of contexts. Finally, an overview of the radiology search system of the Khresmoi project will show a combination of the 3D data and the 3D analyses in a multi-modal environment.
               
 
Thu, 5 Feb 2015
14:00:00
Prof. Yann Gousseau
from ENST Telecom Paris
Talk place: Idiap Research Institute

Video Inpainting of Complex Scenes

Abstract:
While image inpainting is a relatively mature subject whose numerical results are often visually striking, the automatic filling-in of video is still prone to yield incoherent results in many situations. Moreover, the subject is impaired by strong computational bottlenecks. In this talk, we present a patch-based approach to inpainting videos, relying on a global, multi-scale optimization heuristic. Contrary to previous approaches, the best patch candidates are selected using texture attributes that are built within a multi-scale video representation. We show that this rationale prevents the usual wash-out of textured and cluttered parts of video. Combined with an appropriate nearest-neighbor search and a simple stabilization-like procedure, the resulting approach is able to successfully and automatically inpaint complex situations, including high-resolution sequences with dynamic textures and multiple moving objects.
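To make the patch-based rationale concrete, here is a toy single-image version of the elementary matching step (sum-of-squared-differences over known pixels only; the actual method adds texture attributes, a global multi-scale optimization and the temporal dimension):

import numpy as np

def best_patch(image, mask, i, j, p=2):
    """Find the fully-known patch that best matches the known pixels
    around (i, j), where mask is True for missing pixels, comparing
    (2p+1)x(2p+1) patches by sum of squared differences."""
    H, W = image.shape
    target = image[i - p:i + p + 1, j - p:j + p + 1]
    known = ~mask[i - p:i + p + 1, j - p:j + p + 1]
    best, best_d = None, np.inf
    for a in range(p, H - p):
        for b in range(p, W - p):
            if mask[a - p:a + p + 1, b - p:b + p + 1].any():
                continue  # candidate patches must be fully known
            cand = image[a - p:a + p + 1, b - p:b + p + 1]
            d = np.sum((cand[known] - target[known]) ** 2)
            if d < best_d:
                best, best_d = cand, d
    return best

# Toy example: a vertical gradient image with a small hole in the middle.
img = np.tile(np.arange(16.0)[:, None], (1, 16))
mask = np.zeros_like(img, dtype=bool)
mask[7:9, 7:9] = True
img[mask] = 0.0
patch = best_patch(img, mask, 8, 8)
print(patch[2, 2])  # a plausible value for the missing centre pixel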
               
 
Thu, 8 Jan 2015
11:00:00
Dr. Mary Ellen Foster
from Interaction Lab, Heriot-Watt University Edinburgh, UK
Talk place: Idiap Research Institute

Trainable Interaction Models for Embodied Conversational Agents

Abstract:
Human communication is inherently multimodal: when we communicate with one another, we use a wide variety of channels, including speech, facial expressions, body postures, and gestures. An embodied conversational agent (ECA) is an interactive character -- virtual or physically embodied -- with a human-like appearance, which uses its face and body to communicate in a natural way. Giving such an agent the ability to understand and produce natural, multimodal communicative behaviour will allow humans to interact with such agents as naturally and freely as they interact with one another, enabling the agents to be used in applications as diverse as service robots, manufacturing, personal companions, automated customer support, and therapy.

To develop an agent capable of such natural, multimodal communication, we must first record and analyse how humans communicate with one another.  Based on that analysis, we then develop models of human multimodal interaction and integrate those models into the reasoning process of an ECA.  Finally, the models are tested and validated through human-agent interactions in a range of contexts.

In this talk, I will give three examples where the above steps have been followed to create interaction models for ECAs. First, I will describe how human-like referring expressions improve user satisfaction with a collaborative robot; then I will show how data-driven generation of facial displays affects interactions with an animated virtual agent; finally, I will describe how trained classifiers can be used to estimate engagement for customers of a robot bartender.
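As a toy illustration of the last example, a trained classifier estimating engagement from simple interaction features (a generic logistic-regression sketch; the features, labels and data are invented and are not those of the robot-bartender study):

import numpy as np

def train_logreg(X, y, lr=0.1, epochs=500):
    """Fit w, b for P(engaged | x) = sigmoid(x . w + b) by gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y                        # gradient of the log loss
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

# Invented features per customer: [facing the robot, distance (m), speaking].
X = np.array([[1, 0.5, 1], [1, 0.8, 0], [0, 2.0, 0],
              [0, 1.5, 1], [1, 0.6, 1], [0, 2.5, 0]], dtype=float)
y = np.array([1, 1, 0, 0, 1, 0], dtype=float)  # engaged or not
w, b = train_logreg(X, y)
p = 1.0 / (1.0 + np.exp(-(np.array([1, 0.7, 1]) @ w + b)))
print(round(p, 2))  # high probability: close, facing the robot, speaking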
               
 
Fri, 17 Oct 2014
11:00:00
Prof. Christian Wolf
from LIRIS team, INSA Lyon, France
Talk place: Idiap Research Institute

Pose estimation and gesture recognition using structured deep learning

Abstract:
In this talk I will address the problem of gesture recognition and pose estimation from videos, following two different strategies: 
(i) estimation of articulated pose (full body or hand pose) alleviates subsequent recognition steps in many conditions and allows smooth interaction modes and tight coupling between object and manipulator; 
(ii) in situations of low image quality (e.g. large distances between hand and camera), obtaining an articulated pose is hard. Training a deep model directly on video data can give excellent results in these situations.

We tackle both cases by training deep architectures capable of learning discriminative intermediate representations. The main goal is to integrate structural information into the model in order to decrease the dependency on large amounts of training data. To achieve this, we propose an approach for hand pose estimation that requires very little labelled data. It leverages both unlabeled data and synthetic data produced by a rendering pipeline. The key to making it work is to integrate structural information not into the model architecture, which would slow down inference, but into the training objective. We show that adding unlabeled real-world samples significantly improves results compared to a purely supervised setting.

In the context of multi-modal gesture detection and recognition, we propose a deep recurrent architecture that iteratively learns and integrates discriminative data representations from individual channels (pose, video, audio), modeling complex cross-modality correlations and temporal dependencies. It is based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at two temporal scales. Key to our technique is a training strategy which exploits i) careful initialization of individual modalities; and ii) gradual fusion of modalities from strongest to weakest cross-modality structure. 

We present experiments on the "ChaLearn 2014 Looking at People Challenge" gesture recognition track, organized in conjunction with ECCV 2014, in which we placed 1st out of 17 teams. The objective of the challenge was to detect, localize and classify Italian conversational gestures from a large database of 13,858 gestures. The multimodal data included color video, range maps and a skeleton stream.

The talk will be preceded by a brief introduction to the work done in my LIRIS team.
 
Site: http://liris.cnrs.fr/christian.wolf/research/gesturerec.html
               
 
Tue, 1 Jul 2014
11:00:00
Dr. Gabrielle Vail
from New College of Florida
Talk place: Idiap Research Institute

Fitting Ancient Texts into Modern Technology: The Maya Hieroglyphic Codices Database Project

Abstract:
The Maya hieroglyphic codices provide a rich dataset concerning astronomical beliefs, divinatory practices, and the ritual life of the prehispanic Maya cultures inhabiting the Yucatán Peninsula in the years leading up to the Spanish conquest in the early sixteenth century. Structurally, the codices are organized in terms of almanacs and astronomical tables, both of which incorporate several types of data—calendrical, iconographic, and textual—that together allowed Maya scribes to encode complex relationships among deities, dates having ritual and/or celestial significance, and associated activities. In order to better understand these relationships, the Maya Hieroglyphic Codices Database project was initiated to develop sophisticated online research tools to aid in the analysis of these manuscripts. Because the Maya scribes did not live in a culture that demanded strict adherence to the paradigms we take for granted when organizing information for electronic search and retrieval, discovering how the data contained in these ancient manuscripts could be converted into data structures that facilitate computer searching and information retrieval posed a significant challenge. This presentation discusses the approaches taken by the author and the architect of the database project to find compromises that enable computer analysis of a set of texts created by scribes more than half a millennium ago, while avoiding the biases inherent in translating knowledge across spatial and cultural divides. The presentation will be made by Dr. Vail; the technical architect of the project, William Giltinan, will be available to answer questions at the conclusion of the lecture.
               
 
Wed, 11 Jun 2014
11:00:00
Prof. Richard Bowden
from University of Surrey
Talk place: Idiap Research Institute

Recognising people, motion and actions in video

Abstract:
Learning to recognise the motion or actions of people in video has wide applications, covering topics from sign and gesture recognition through to surveillance and HCI. This talk will discuss approaches to video mining, allowing the discovery of weakly supervised spatio-temporal signatures such as actions embedded in video, or signs and facial motion weakly supervised by language. Whether the task is recognising an atomic action of an individual or their implied activity, the continuous multichannel nature of sign language recognition, or the appearance of words on the lips, all these approaches can be categorised at the most basic level as the learning and recognition of spatio-temporal patterns. However, in all cases, inaccuracies in labelling and the curse of dimensionality lead us to explore new learning approaches that can operate in a weakly supervised setting. This talk will discuss the adaptation of mining to the video domain and new approaches to learning spatio-temporal signatures, covering a broad range of application areas such as facial feature extraction and regression, lip reading, activity recognition, and sign and gesture recognition in both 2D and 3D.
               
 
Wed, 21 May 2014
15:00:00
Prof. Ricardo Baeza-Yates
from Yahoo! Labs
Talk place: Idiap Research Institute

The Web: Wisdom of Crowds or Wisdom of a Few?

Abstract:
The Web continues to grow and evolve very fast, changing our daily lives. This activity represents the collaborative work of the millions of institutions and people that contribute content to the Web, as well as the more than two billion people that use it. In this ocean of hyperlinked data there is explicit and implicit information and knowledge. But what is the Web like? What are people's activities? How is content generated? Web data mining is the main approach to answering these questions. Web data comes in three main flavors: content (text, images, etc.), structure (hyperlinks) and usage (navigation, queries, etc.), implying different techniques such as text, graph or log mining. Each case reflects the wisdom of some group of people that can be used to make the Web better, for example the user-generated tags in Web 2.0 sites. In this presentation we explore the wisdom of crowds in relation to several dimensions such as bias, privacy, scalability, and spam. We also cover related concepts such as the long tail of the special interests of people, or the digital desert: content that nobody sees.
               
 
Tue, 15 Apr 2014
11:00:00
Prof. Mohamed Chetouani
from University Pierre and Marie Curie-Paris 6
Talk place: Idiap Research Institute

Interpersonal synchrony: social signal processing and social robotics for revealing social signatures

Abstract:
Social signal processing is an emerging research domain with rich and open fundamental and applied challenges. In this talk, I'll focus on the development of social signal processing techniques for real applications in the field of psychopathology. I'll overview recent research and investigation methods that allow neuroscience, psychology and developmental science to move from isolated-individual paradigms to interactive contexts by jointly analyzing the behaviors and social signals of partners. Starting from the concept of interpersonal synchrony, we'll show how to address the complex problem of evaluating children with pervasive developmental disorders. These techniques are also demonstrated in the context of human-robot interaction, through a new way of using robots in autism (moving from assistive devices to clinical investigation tools). I will finish by closing the loop between behaviors and physiological states, presenting new results on oxytocin and proxemics during early parent-infant interactions.
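One common way to quantify interpersonal synchrony is windowed cross-correlation between the two partners' behavioral signals; a toy sketch (the window length, lag range and synthetic signals are arbitrary illustrative choices):

import numpy as np

def windowed_xcorr(x, y, win=50, max_lag=10):
    """For each window, return the lag (in samples) at which the
    correlation between x and y is highest -- a crude synchrony measure."""
    lags = []
    for s in range(0, len(x) - win - max_lag, win):
        xw = x[s:s + win]
        corrs = [np.corrcoef(xw, y[s + l:s + l + win])[0, 1]
                 for l in range(max_lag + 1)]
        lags.append(int(np.argmax(corrs)))
    return lags

# Toy signals: partner B follows partner A with a 5-sample delay.
rng = np.random.default_rng(5)
a = np.cumsum(rng.standard_normal(500))  # smooth-ish random movement
b = np.roll(a, 5) + 0.1 * rng.standard_normal(500)
print(windowed_xcorr(a, b))              # lags should cluster around 5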
               
 
Fri, 14 Feb 2014
11:00:00
Dr. Ivan Laptev
from INRIA, Paris
Talk place: Idiap Research Institute

Recent trends and future challenges in action recognition

Abstract:
This talk will overview recent progress and open challenges in human action recognition. Specifically, I will focus on three problems: (i) action representation in video, (ii) weakly-supervised action learning and (iii) the ambiguity of action vocabulary. For the first problem, I will overview local feature methods providing state-of-the-art results on current action recognition benchmarks. Motivated by the difficulty of large-scale video annotation, I will next present our recent work on weakly-supervised action learning from videos and corresponding video scripts. I will finish by highlighting limitations of the standard action classification paradigm and will show some of our work addressing this problem.
               
 
Tue, 21 Jan 2014
11:00:00
Prof. Marc Langheinrich
from Università della Svizzera italiana (USI)
Talk place: Idiap Research Institute

Privacy & Trust Challenges in Open Public Display Networks

Abstract:
Future public displays have the potential to become much more than simple digital signage -- they can form the basis for a novel communication medium. By interconnecting displays and opening them up to applications and content from a wide range of sources, they can not only support individuals and their communities, but also increase their relevance and ultimately their economic benefits. Ultimately, open display networks could have the same impact on society as radio, television and the Internet. In this talk, I will briefly summarize this vision and its related challenges, in particular with respect to privacy and trust, and present the work that we did in this area in the context of a recently finished FET-Open project titled "PD-Net".
               
 
Thu, 31 Oct 2013
11:00:00
Prof. Francis Quek
from Texas A&M University
Talk place: Idiap Research Institute

Interacting with the Embodied Mind

Abstract:
Humans do not think like computers. Our minds are ‘designed’ for us to function as embodied beings in the world in ways that are: 1. Physical-Spatial; 2. Temporal-Dynamic; 3. Social-Cultural; and 4. Affective-Emotional. These aspects of embodiment give us four lenses through which to understand the embodied mind and how computation/technology may support its function. I adopt a two-pronged approach to human-computer interaction research: first, harnessing technological means to contribute to the understanding of how embodiment ultimately ascends into mind, and second, informing the design and engineering of technologies that support and augment the human higher psychological functions of learning, sensemaking, creating, and experiencing.

In line with the first approach, I shall first show how language, as a core human capacity, is rooted in human embodied function. We will see that mental imagery shapes multimodal (gesture, gaze, and speech) human discourse. In line with the second approach, I shall then present an assemblage of interactive projects that illustrate how our concept of human embodiment can inform technology design through the light of our four lenses. Projects cluster around three application domains, namely 1. Technology for special populations (e.g. mathematics instruction and reading for the blind, games for older adults); 2. Learning and Education (e.g. learning and knowledge discovery through device/display ecologies, creativity support for children); and 3. Experience (e.g. socially-based information access, experience of images, affective communication).
               
 
Thu, 19 Sep 2013
15:00:00
Nuria Oliver
from Telefonica Research, Barcelona, Spain
Talk place: Idiap Research Institute

The power of the cellphone: small devices for big impact

Abstract:
There are almost as many mobile phones in the world as humans. The mobile phone is the piece of technology with the highest level of adoption in human history. We carry them with us all through the day (and night, in many cases). Mobile phones have therefore become large-scale sensors of human activity, as well as our most personal devices.

In my talk, I will present some of the work that we are doing at Telefonica Research in the area of mobile computing, both in terms of analyzing and understanding large-scale human behavioral data from mobile traces and in designing novel mobile systems in the areas of healthcare, education and information access.
               
 
Tue, 3 Sep 2013
14:00:00
Prof. Anil K. Jain
from Michigan State University
Talk place: Idiap Research Institute

Biometric Recognition: Sketch to Photo Matching, Tattoo Matching and Fingerprint Obfuscation

Abstract:
http://biometrics.cse.msu.edu
http://scholar.google.com/citations?user=g-_ZXGsAAAAJ&hl=en

If you are like many people, navigating the complexities of everyday life depends on an array of cards and passwords that confirm your identity. But lose a card, and your ATM will refuse to give you money. Forget a password, and your own computer may balk at your command. Allow your card or passwords to fall into the wrong hands, and what were intended to be security measures can become the tools of fraud or identity theft. Biometrics - the automated recognition of people via distinctive anatomical and behavioral traits - has the potential to overcome many of these problems.

Biometrics is not a new idea. Pioneering work by several British scholars, including Faulds, Galton and Henry in the late 19th century, established that fingerprints exhibit a unique pattern that persists over time. This set the stage for the development of the Automatic Fingerprint Identification Systems that are now used by law enforcement agencies worldwide. The success of fingerprints in law enforcement, coupled with growing concerns related to homeland security, financial fraud and identity theft, has generated renewed interest in the research and development of biometric systems. It is, therefore, not surprising to see biometrics permeating our society (laptops and mobile phones, border crossing, civil registration, and access to secure facilities). Despite these successful deployments, biometrics is not a panacea for human recognition. There are challenges related to data acquisition, image quality, robust matching, multibiometrics, biometric system security and user privacy. This talk will introduce three challenging problems of particular interest to law enforcement and border crossing agencies: (i) face sketch to photo matching, (ii) scars, marks & tattoos (SMT) and (iii) fingerprint obfuscation.

Short bio
Anil K. Jain is a University Distinguished Professor in the Department of Computer Science at Michigan State University, where he conducts research in pattern recognition, computer vision and biometrics. He has received the Guggenheim fellowship, the Humboldt Research award, the Fulbright fellowship, the IEEE Computer Society Technical Achievement award, the W. Wallace McDowell award, the IAPR King-Sun Fu Prize, and the ICDM Research Award for contributions to pattern recognition and biometrics. He served as the Editor-in-Chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence and is a Fellow of the ACM, IEEE, AAAS, IAPR and SPIE. Holder of eight patents in biometrics, he is the author of several books. ISI has designated him a highly cited author. He served as a member of the National Academies panels on Information Technology, Whither Biometrics, and Improvised Explosive Devices (IEDs). He also served as a member of the Defense Science Board.

His H-index is 137 (Source: Google Scholar). 
               
 
Thu, 29 Aug 2013
11:00:00
Dr. Fernando De la Torre
from Robotics Institute, CMU
Talk place: Idiap Research Institute

Component Analysis for Human Sensing

Abstract:
Enabling computers to understand human behavior has the potential to revolutionize many areas that benefit society such as clinical diagnosis, human computer interaction, and social robotics. A critical element in the design of any behavioral sensing system is to find a good representation of the data for encoding, segmenting, classifying and predicting subtle human behavior. In this talk I will propose several extensions of Component Analysis (CA) techniques (e.g., kernel principal component analysis, support vector machines, spectral clustering) that are able to learn spatio-temporal representations or components useful in many human sensing tasks.

In the first part of the talk I will give an overview of several ongoing projects in the CMU Human Sensing Laboratory, including our current work on depression assessment from videos. In the second part, I will show how several extensions of CA methods outperform state-of-the-art algorithms in problems such as facial feature detection and tracking, temporal clustering of human behavior, early detection of activities, weakly-supervised visual labeling, and robust classification. The talk will be adaptive, and I will discuss the topics of major interest to the audience.
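As one concrete member of the CA family, a minimal kernel PCA sketch in numpy (the standard textbook construction with a Gaussian kernel, not the extensions proposed in the talk; the data and parameters are invented):

import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    """Project data onto the top principal components of the feature
    space induced by the Gaussian kernel k(x, y) = exp(-gamma ||x - y||^2)."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                   # centre the kernel in feature space
    vals, vecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]
    alphas = vecs[:, :n_components] / np.sqrt(np.maximum(vals[:n_components], 1e-12))
    return Kc @ alphas               # embedded training points

# Toy data: two noisy clusters, separated along the first component.
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.1, (20, 3)), rng.normal(1, 0.1, (20, 3))])
Z = kernel_pca(X)
print(Z[:3, 0].round(2), Z[-3:, 0].round(2))  # opposite signs per cluster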

Biography:

Fernando De la Torre received his B.Sc. degree in Telecommunications (1994), and his M.Sc. (1996) and Ph.D. (2002) degrees in Electronic Engineering, from La Salle School of Engineering at Ramon Llull University, Barcelona, Spain. In 2003 he joined the Robotics Institute at Carnegie Mellon University, and since 2010 he has been a Research Associate Professor. Dr. De la Torre's research interests include computer vision and machine learning, in particular face analysis, optimization and component analysis methods, and their applications to human sensing. He is an Associate Editor of IEEE PAMI and leads the Component Analysis Laboratory (http://ca.cs.cmu.edu) and the Human Sensing Laboratory (http://humansensing.cs.cmu.edu).