- the development of novel data fusion mechanisms to improve tracking
  (see the first sketch after this list), via
  - visual cue fusion (shape/color with motion) for stable tracking [1],
  - audio-visual fusion for speaker tracking with one or more cameras [2,3].
- the development of mixed-state models (which combine continuous motion
  parameters and discrete labels in a joint distribution) for tracking and
  recognition (see the second sketch after this list), including
  - multi-camera speaker tracking (the discrete variable represents the
    specific camera view in which the speaker appears) [3],
  - joint head tracking and head pose estimation (the discrete variable
    denotes a head pose/appearance exemplar) [4].
- the development of new sampling methods for improving tracking efficiency
  and performance (see the third sketch after this list), including
  - the use of motion as a proposal distribution for single-object
    tracking [5],
  - a distributed partitioned sampling strategy for multi-object
    tracking [6].
- data collection and development of performance evaluation procedures,
  including
  - a data set for the evaluation of multiple-people and audio-visual
    speaker tracking algorithms with precise 3-D ground truth,
  - a data set for the evaluation of joint head tracking and head pose
    recognition,
  - a protocol for the performance evaluation of multi-object tracking
    algorithms.
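As an illustration of the first item above, here is a minimal sketch of
multiplicative visual cue fusion in the weighting step of a particle filter,
assuming the cues are conditionally independent given the object state. The
function names and toy scores are illustrative only and do not come from [1].

    import numpy as np

    def fuse_cue_likelihoods(color_lik, shape_lik):
        # Multiplicative fusion of per-particle cue likelihoods, valid under
        # the assumption that the cues are conditionally independent given
        # the object state.
        weights = color_lik * shape_lik
        return weights / weights.sum()  # normalized particle weights

    # Toy example: five particles scored independently by two visual cues.
    color_lik = np.array([0.9, 0.2, 0.5, 0.1, 0.7])  # e.g. histogram similarity
    shape_lik = np.array([0.8, 0.3, 0.6, 0.2, 0.1])  # e.g. contour match score
    print(fuse_cue_likelihoods(color_lik, shape_lik))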
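For the mixed-state models in the second item, the sketch below shows the
prediction step of a mixed-state particle filter in which each particle
carries a continuous 2-D position together with a discrete camera label. The
transition matrix, dimensions, and noise levels are assumptions chosen for
illustration, not the values used in [3].

    import numpy as np

    rng = np.random.default_rng(0)

    N_PARTICLES, N_CAMERAS = 100, 3

    # Discrete transition matrix: probability of the speaker switching views.
    T = np.full((N_CAMERAS, N_CAMERAS), 0.05)
    np.fill_diagonal(T, 0.90)

    # Mixed state: continuous 2-D position x and discrete camera label k.
    x = rng.normal(size=(N_PARTICLES, 2))             # continuous part
    k = rng.integers(0, N_CAMERAS, size=N_PARTICLES)  # discrete part

    def predict(x, k):
        # Jointly propagate the mixed state: sample a new discrete label
        # from the transition matrix, then diffuse the continuous motion.
        k_new = np.array([rng.choice(N_CAMERAS, p=T[ki]) for ki in k])
        x_new = x + rng.normal(scale=0.1, size=x.shape)  # random-walk dynamics
        return x_new, k_new

    x, k = predict(x, k)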
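Finally, for the sampling methods in the third item, here is a rough sketch
of how a motion detector can serve as part of the proposal distribution: a
mixture proposal that reseeds some particles near motion-activity pixels and
propagates the rest through random-walk dynamics. All names and parameters
are hypothetical, and the importance-weight correction required by a mixture
proposal is omitted for brevity; see [5] for the actual method.

    import numpy as np

    rng = np.random.default_rng(1)

    def propose(particles, motion_pixels, alpha=0.3, noise=2.0):
        # Mixture proposal: with probability alpha, reseed a particle near
        # a pixel flagged by a motion detector (e.g. frame differencing);
        # otherwise propagate it through the usual random-walk dynamics.
        n = len(particles)
        out = particles + rng.normal(scale=noise, size=particles.shape)
        reseed = rng.random(n) < alpha
        m = reseed.sum()
        idx = rng.integers(0, len(motion_pixels), size=m)
        out[reseed] = motion_pixels[idx] + rng.normal(scale=noise, size=(m, 2))
        return out

    # Toy data: 50 particles and a few pixels flagged as "moving".
    particles = rng.uniform(0, 100, size=(50, 2))
    motion_pixels = np.array([[40.0, 60.0], [42.0, 58.0], [41.0, 61.0]])
    particles = propose(particles, motion_pixels)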
Current work focuses on
- the extension of multiple-people tracking algorithms (multi-camera,
  visual, and audio-visual).
- the development of algorithms combining HMMs and particle filters to
  recognize, while tracking, more precise and complex activities (e.g. head
  and body gestures); see the sketch after this list.
- the definition, with other IM2.SA partners, of a common data set for the
  evaluation of multiple-people tracking algorithms in surveillance
  scenarios.
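As a rough illustration of the second point above, the sketch below shows one
forward-recursion step of a discrete HMM over activity labels, reweighted by
observation likelihoods that a particle filter tracker would supply from the
tracked motion features. The two activity labels, the transition matrix, and
the likelihood values are hypothetical.

    import numpy as np

    # Two hypothetical activity labels inferred alongside the motion state.
    ACTIVITIES = ["head-nod", "head-shake"]
    A = np.array([[0.9, 0.1],   # HMM transition matrix over activity labels
                  [0.1, 0.9]])
    belief = np.array([0.5, 0.5])  # initial belief over activities

    def hmm_forward_step(belief, obs_lik):
        # One forward-recursion step: predict through the transition matrix,
        # then reweight by per-activity observation likelihoods (which the
        # particle filter would supply from the tracked motion features).
        predicted = A.T @ belief
        posterior = predicted * obs_lik
        return posterior / posterior.sum()

    # Toy likelihoods, e.g. derived from vertical vs. horizontal head motion.
    belief = hmm_forward_step(belief, np.array([0.8, 0.2]))
    print(dict(zip(ACTIVITIES, belief.round(3))))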