Getting started with Bob
This tutorial is a starting point for learning how to use Bob’s packages and for understanding their fundamental concepts.
Multi-dimensional Arrays
The fundamental data structure of Bob is a multi-dimensional array. In signal processing and machine learning, arrays are a suitable representation for many different types of digital signals such as images, audio data and extracted features. For multi-dimensional arrays, we rely on NumPy. For an introduction and tutorials about NumPy ndarrays, just visit the NumPy Reference website.
Digital signals as multi-dimensional arrays
For Bob, we have decided to represent digital signals directly as numpy.ndarray rather than having dedicated classes for each type of signal. This means that some conventions are needed to interpret the dimensions of an array; they are described below.
Vectors and matrices
A vector is represented as a 1D NumPy array, whereas a matrix is represented by a 2D array whose first dimension corresponds to the rows, and second dimension to the columns.
>>> import numpy
>>> A = numpy.array([[1, 2, 3], [4, 5, 6]], dtype='uint8') # A is a matrix 2x3
>>> print(A)
[[1 2 3]
 [4 5 6]]
>>> b = numpy.array([1, 2, 3], dtype='uint8') # b is a vector of length 3
>>> print(b)
[1 2 3]
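As a short illustration of the row/column convention, the sketch below reads the shape of the matrix defined above and picks out one row, one column and one element. It uses only standard NumPy indexing; the variable names are chosen for illustration.

```python
import numpy

# A 2x3 matrix: the first axis indexes rows, the second indexes columns.
A = numpy.array([[1, 2, 3], [4, 5, 6]], dtype='uint8')
n_rows, n_cols = A.shape

second_row = A[1]        # 1D array holding the row at index 1
third_column = A[:, 2]   # 1D array holding the column at index 2
element = A[1, 2]        # scalar at row 1, column 2

print(n_rows, n_cols)
print(second_row)
print(third_column)
print(element)
```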
Images
Grayscale images are represented as 2D arrays, the first dimension being the height (number of rows) and the second dimension being the width (number of columns). For instance:
>>> img = numpy.ndarray((480,640), dtype='uint8')
img, which is a 2D array, can be seen as a grayscale image of dimension 640 (width) by 480 (height). In addition, img can be seen as a matrix with 480 rows and 640 columns. This is the reason why we have decided that for images, the first dimension is the height and the second one the width, such that it matches the matrix convention as well.
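A practical consequence of this convention is that pixels are addressed as (row, column), that is (y, x), and not (x, y). The following sketch, which uses only NumPy and illustrative variable names, shows how the height and width are read from the shape and how a rectangular region is cropped:

```python
import numpy

# A blank grayscale image: 480 rows (height) by 640 columns (width).
img = numpy.zeros((480, 640), dtype='uint8')
height, width = img.shape

# Pixels are addressed as [row, column], i.e. [y, x].
y, x = 10, 20
img[y, x] = 255

# Crop rows 100..199 and columns 50..349: the result is 100 high, 300 wide.
crop = img[100:200, 50:350]
print(height, width, crop.shape)
```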
Color images are represented as 3D arrays, the first dimension being the number of color planes, the second dimension the height and the third the width. As an image is just an array, it is the responsibility of the user to know in which color space the content is stored. bob.io.image provides functions to convert Bob format images into Matplotlib and other formats and back:
>>> import bob.io.image
>>> colored_bob_format = numpy.ndarray((3,480,640), dtype='uint8')
>>> colored_matplotlib_format = bob.io.image.to_matplotlib(colored_bob_format)
>>> print(colored_matplotlib_format.shape)
(480, 640, 3)
>>> colored_bob_format = bob.io.image.to_bob(colored_matplotlib_format)
>>> print(colored_bob_format.shape)
(3, 480, 640)
>>> pillow_img = bob.io.image.bob_to_pillow(colored_bob_format)
>>> opencv_bgr = bob.io.image.bob_to_opencv(colored_bob_format)
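To make the difference between the two layouts concrete, the sketch below moves the color-plane axis with plain NumPy. It only illustrates the change in axis order between (planes, height, width) and (height, width, planes); it is not necessarily how bob.io.image implements its conversions, and it does not handle channel reordering such as RGB to BGR.

```python
import numpy

# Bob-style color image: (planes, height, width).
bob_img = numpy.zeros((3, 480, 640), dtype='uint8')
bob_img[0] = 255  # fill the first color plane

# Move the plane axis to the end: (height, width, planes).
hwc_img = numpy.moveaxis(bob_img, 0, -1)

# Move it back to the front: (planes, height, width).
chw_img = numpy.moveaxis(hwc_img, -1, 0)

print(bob_img.shape, hwc_img.shape, chw_img.shape)
```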
Note
In Open Source Face Recognition Library, images are assumed to be in the range [0, 255] irrespective of their data type.
Videos
A video can be seen as a sequence of images over time. By convention, the first dimension is for the frame indices (time index), whereas the remaining ones are related to the corresponding image frame. Videos have the shape (N,C,H,W), where N is the number of frames, C the number of color planes, H the height and W the width.
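The (N,C,H,W) layout can be explored with a small synthetic array. The sketch below uses only NumPy; the frame sizes are arbitrary, and treating plane 0 as "red" assumes RGB ordering, which is the user's responsibility to track.

```python
import numpy

n_frames, n_planes, height, width = 5, 3, 48, 64

# An empty color video: (frames, planes, height, width).
video = numpy.zeros((n_frames, n_planes, height, width), dtype='uint8')

first_frame = video[0]       # one color image, shape (C, H, W)
plane0_over_time = video[:, 0]  # plane 0 of every frame, shape (N, H, W)

# A list of color frames can be stacked along a new leading axis.
frames = [numpy.full((n_planes, height, width), i, dtype='uint8')
          for i in range(n_frames)]
stacked = numpy.stack(frames, axis=0)

print(first_frame.shape, plane0_over_time.shape, stacked.shape)
```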
Input and output
Bob’s Core I/O Routines provides two generic functions, bob.io.base.load and bob.io.base.save, to load and save data of various types, based on the filename extension. For example, to load a .jpg image, simply call:
>>> import bob.io.base
>>> img = bob.io.base.load("myimg.jpg")
HDF5 format, through h5py, and images, through imageio, are supported. For loading videos, use imageio-ffmpeg directly.
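The idea of choosing a reader or writer from the filename extension can be sketched as follows. This is an illustration of the dispatch pattern only, not Bob's implementation: the registry, the helper names and the use of NumPy's own .npy and .txt formats (instead of HDF5 and image files) are assumptions made to keep the example self-contained.

```python
import os
import tempfile

import numpy

# Hypothetical registry: filename extension -> (loader, saver).
_HANDLERS = {
    ".npy": (numpy.load, numpy.save),
    ".txt": (numpy.loadtxt, numpy.savetxt),
}


def _handler_for(filename):
    """Return the (loader, saver) pair registered for the file extension."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in _HANDLERS:
        raise ValueError(f"unsupported extension: {ext!r}")
    return _HANDLERS[ext]


def save(array, filename):
    """Save an array, choosing the format from the filename extension."""
    _, saver = _handler_for(filename)
    saver(filename, array)


def load(filename):
    """Load an array, choosing the format from the filename extension."""
    loader, _ = _handler_for(filename)
    return loader(filename)


# Round trip through both registered formats in a temporary directory.
data = numpy.arange(6, dtype="float64").reshape(2, 3)
restored = {}
with tempfile.TemporaryDirectory() as tmp:
    for name in ("data.npy", "data.txt"):
        path = os.path.join(tmp, name)
        save(data, path)
        restored[name] = load(path)
```

The calling code stays the same whatever the file type is; supporting a new format only requires a new entry in the registry.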
Machine learning
Expectation Maximization Machine Learning Tools provides implementations of the following methods:
K-Means clustering
Gaussian Mixture Modeling (GMM)
Joint Factor Analysis (JFA)
Inter-Session Variability (ISV)
Total Variability (TV, also known as i-vector)
Probabilistic Linear Discriminant Analysis (PLDA)
All implementations use dask to parallelize the training computation.
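To give a feel for the simplest method in this list, here is a minimal NumPy sketch of K-Means clustering (Lloyd's algorithm). It is a plain, single-process illustration: the function name and parameters are hypothetical, it does not use dask, and it is not the implementation or API provided by Bob.

```python
import numpy


def kmeans(data, n_clusters, n_iterations=20, seed=0):
    """Cluster the rows of `data` with Lloyd's algorithm."""
    rng = numpy.random.default_rng(seed)
    # Initialize the means with distinct, randomly chosen samples.
    indices = rng.choice(len(data), size=n_clusters, replace=False)
    means = data[indices].astype(float)

    def assign(current_means):
        # Euclidean distance of every sample to every mean: (n_samples, n_clusters).
        diff = data[:, None, :] - current_means[None, :, :]
        return numpy.linalg.norm(diff, axis=2).argmin(axis=1)

    for _ in range(n_iterations):
        labels = assign(means)
        # Move each mean to the centroid of the samples assigned to it.
        for k in range(n_clusters):
            members = data[labels == k]
            if len(members) > 0:
                means[k] = members.mean(axis=0)

    return means, assign(means)


# Two well-separated synthetic 2D blobs of 50 samples each.
rng = numpy.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
data = numpy.vstack([blob_a, blob_b])

means, labels = kmeans(data, n_clusters=2)
print(means)
```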
Database interfaces
Bob provides an API on top of CSV files to easily query databases. A generic implementation is provided in Bob Pipelines but packages such as Resources for biometric experiments and Running Presentation Attack Detection Experiments provide their own implementations.
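The general idea of querying a database described by a CSV file can be sketched with the Python standard library alone. The column names (path, subject_id, group), the file contents and the query function below are hypothetical and chosen only for illustration; the actual file layout and API are defined by Bob Pipelines and the packages built on it.

```python
import csv
import io

# Hypothetical CSV listing of samples; a real database would be a file on disk.
CSV_TEXT = """path,subject_id,group
data/s1_01.png,s1,train
data/s1_02.png,s1,dev
data/s2_01.png,s2,train
"""


def query(csv_file, **criteria):
    """Return the rows whose columns match all the given criteria."""
    reader = csv.DictReader(csv_file)
    return [
        row for row in reader
        if all(row[column] == value for column, value in criteria.items())
    ]


train_samples = query(io.StringIO(CSV_TEXT), group="train")
s1_dev_samples = query(io.StringIO(CSV_TEXT), group="dev", subject_id="s1")

for sample in train_samples:
    print(sample["path"], sample["subject_id"])
```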
Performance evaluation
Methods in the Bob’s Metric Routines module can be used to evaluate error for multi-class or binary classification problems. Several evaluation techniques such as:
Root Mean Squared Error (RMSE)
F-score
Precision and Recall
False Positive Rate (FPR)
False Negative Rate (FNR)
Equal Error Rate (EER)
can be computed. Moreover, functionality for plotting
ROC
DET
CMC
EPC
curves is described in more detail in Bob’s Metric Routines.
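For a binary problem, the relationship between the FPR, the FNR and the EER can be sketched in a few lines of NumPy. The function names, the decision rule (a sample is accepted when its score is greater than or equal to the threshold) and the simple threshold search are assumptions for illustration; Bob’s Metric Routines provides its own functions, which may use different conventions.

```python
import numpy


def fpr_fnr(negatives, positives, threshold):
    """False positive and false negative rates at a given threshold."""
    # A sample is accepted when its score is >= threshold (assumed convention).
    fpr = numpy.mean(negatives >= threshold)  # negatives wrongly accepted
    fnr = numpy.mean(positives < threshold)   # positives wrongly rejected
    return float(fpr), float(fnr)


def eer(negatives, positives):
    """Approximate the EER by scanning the observed scores as thresholds."""
    thresholds = numpy.unique(numpy.concatenate([negatives, positives]))
    rates = [fpr_fnr(negatives, positives, t) for t in thresholds]
    # Keep the threshold where FPR and FNR are closest to each other.
    best = min(range(len(rates)), key=lambda i: abs(rates[i][0] - rates[i][1]))
    fpr, fnr = rates[best]
    return (fpr + fnr) / 2, float(thresholds[best])


# Toy scores: higher means "more likely positive".
negatives = numpy.array([0.1, 0.2, 0.3, 0.6])
positives = numpy.array([0.4, 0.7, 0.8, 0.9])

fpr, fnr = fpr_fnr(negatives, positives, threshold=0.5)
eer_value, eer_threshold = eer(negatives, positives)
print(fpr, fnr, eer_value, eer_threshold)
```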