Tools implemented in bob.pad.base

Please note that some parts of the code in this package depend on, and are reused from, the bob.bio.base package.

Summary

Base Classes

Most of the base classes are reused from bob.bio.base. Only one base class specific to presentation attack detection, Algorithm, is implemented in this package.

bob.pad.base.algorithm.Algorithm([…])

This is the base class for all anti-spoofing algorithms.

bob.pad.base.algorithm.Predictions(**kwargs)

An algorithm that takes the precomputed predictions and uses them for scoring.

Implementations

bob.pad.base.database.PadDatabase(name[, …])

This class represents the basic API for database access.

bob.pad.base.database.PadFile(client_id, path)

A simple base class that defines the basic properties of a File object for use in PAD experiments.

Preprocessors and Extractors

Preprocessors and Extractors from the bob.bio.base package can also be used in this package. Please see Tools implemented in bob.bio.base for more details.

Algorithms

class bob.pad.base.algorithm.Algorithm(performs_projection=False, requires_projector_training=True, **kwargs)

Bases: object

This is the base class for all anti-spoofing algorithms. It defines the minimum requirements for all derived algorithm classes.

Call the constructor in derived class implementations. If your derived algorithm performs feature projection, please register this here. If it needs training for the projector, please set this here, too.

Parameters:

performs_projection (bool)

Set to True if your derived algorithm performs a projection. Also implement the project() function, and load_projector() if necessary.

requires_projector_training (bool)

Only valid when performs_projection = True. Set this flag to False when a projection is applied but the projector does not need to be trained.

kwargs (key=value pairs)

A list of keyword arguments to be written in the __str__ function.
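As a minimal sketch of how a derived class might register these flags: the Algorithm class below is a hypothetical stand-in that only mirrors the documented constructor arguments (so the snippet runs without Bob installed), and ScaledProjection, its scale argument, and the mean-based score are invented for illustration.

```python
class Algorithm:
    """Hypothetical stand-in for bob.pad.base.algorithm.Algorithm.

    Only the documented constructor arguments are mirrored here.
    """

    def __init__(self, performs_projection=False,
                 requires_projector_training=True, **kwargs):
        self.performs_projection = performs_projection
        # The training flag is only meaningful when projection is enabled.
        self.requires_projector_training = (
            performs_projection and requires_projector_training
        )
        self._kwargs = kwargs

    def __str__(self):
        args = ", ".join(f"{k}={v}" for k, v in self._kwargs.items())
        return f"{self.__class__.__name__}({args})"


class ScaledProjection(Algorithm):
    """Toy algorithm: projects by scaling and needs no projector training."""

    def __init__(self, scale=2.0):
        super().__init__(
            performs_projection=True,
            requires_projector_training=False,
            scale=scale,  # forwarded so that it shows up in __str__
        )
        self.scale = scale

    def project(self, feature):
        return [self.scale * value for value in feature]

    def score(self, toscore):
        # Illustrative score: the mean of the projected feature.
        return [sum(toscore) / len(toscore)]


algorithm = ScaledProjection(scale=0.5)
projected = algorithm.project([1.0, 3.0])
score = algorithm.score(projected)
print(algorithm, score)
```

Forwarding scale through kwargs is what makes the parameter visible in the string representation of the algorithm.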

load_projector(projector_file)[source]

Loads the parameters required for feature projection from file. This function usually is useful in combination with the train_projector() function. In this base class implementation, it does nothing.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file (str)

The file to read the projector from.

project(feature) → projected[source]

This function will project the given feature. It must be overwritten by derived classes, as soon as performs_projection = True was set in the constructor. It is assured that the load_projector() was called once before the project function is executed.

Parameters:

feature (object)

The feature to be projected.

Returns:

projected (object)

The projected features. Must be writable with the write_feature() function and readable with the read_feature() function.

read_feature(feature_file) → feature[source]

Reads the projected feature from file. In this base class implementation, it uses bob.io.base.load() to do that. If you have a different format, please overwrite this function.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

feature_file (str or bob.io.base.HDF5File)

The file open for reading, or the file name to read from.

Returns:

feature (object)

The feature that was read from file.

score(toscore) → score[source]

This function will compute the score for the given object toscore. It must be overwritten by derived classes.

Parameters:

toscore (object)

The object to compute the score for. This will be the output of extractor if performs_projection is False, otherwise this will be the output of project method of the algorithm.

Returns:

score (float)

A score value for the object toscore.

score_for_multiple_projections(toscore)[source]

score_for_multiple_projections(toscore) -> score

This function will compute the score for a list of objects in toscore. It must be overwritten by derived classes.

Parameters:

toscore ([object])

A list of objects to compute the score for.

Returns:

score (float)

A score value for the object toscore.

train_projector(training_features, projector_file)[source]

This function can be overwritten to train the feature projector. If you do this, please also register the function by calling this base class constructor and enabling the training by requires_projector_training = True.

Parameters:

training_features ([object] or [[object]])

A list of extracted features that can be used for training the projector. Features will be provided in a single list.

projector_file (str)

The file to write. This file should be readable with the load_projector() function.

write_feature(feature, feature_file)[source]

Saves the given projected feature to a file with the given name. In this base class implementation:

  • If the given feature has a save attribute, it calls feature.save(bob.io.base.HDF5File(feature_file, 'w')). In this case, the given feature_file might be either a file name or a bob.io.base.HDF5File.

  • Otherwise, it uses bob.io.base.save() to do that.

If you have a different format, please overwrite this function.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

feature (object)

A feature as returned by the project() function, which should be written.

feature_file (str or bob.io.base.HDF5File)

The file open for writing, or the file name to write to.
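The dispatch rule used by write_feature() (prefer the feature's own save method, otherwise fall back to a generic writer) can be sketched without Bob. This is only an illustration: JSON files stand in for HDF5, and the Histogram class is hypothetical.

```python
import json
import os
import tempfile


class Histogram:
    """Hypothetical feature type that knows how to save itself."""

    def __init__(self, counts):
        self.counts = counts

    def save(self, handle):
        json.dump({"counts": self.counts}, handle)


def write_feature(feature, feature_file):
    """Duck-typed dispatch: use the feature's own save() if it has one."""
    with open(feature_file, "w") as handle:
        if hasattr(feature, "save"):
            feature.save(handle)
        else:
            # Generic fallback; the documented base class uses
            # bob.io.base.save() at this point instead of JSON.
            json.dump(feature, handle)


with tempfile.TemporaryDirectory() as folder:
    custom_path = os.path.join(folder, "custom.json")
    plain_path = os.path.join(folder, "plain.json")
    write_feature(Histogram([1, 2]), custom_path)
    write_feature([0.1, 0.2], plain_path)
    with open(custom_path) as handle:
        custom_saved = json.load(handle)
    with open(plain_path) as handle:
        plain_saved = json.load(handle)
```

A matching read_feature() override would be needed for any custom format, so that written features can be read back.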

class bob.pad.base.algorithm.GMM(number_of_gaussians, kmeans_training_iterations=25, gmm_training_iterations=10, training_threshold=0.0005, variance_threshold=0.0005, update_weights=True, update_means=True, update_variances=True, responsibility_threshold=0, INIT_SEED=5489, performs_projection=True, requires_projector_training=True, **kwargs)[source]

Bases: bob.pad.base.algorithm.Algorithm

Trains two GMMs for two classes of PAD and calculates log likelihood ratio during evaluation.

train_gmm(array)[source]
save_gmms(projector_file)[source]

Saves the projector to file.

train_projector(training_features, projector_file)[source]

This function can be overwritten to train the feature projector. If you do this, please also register the function by calling this base class constructor and enabling the training by requires_projector_training = True.

Parameters:

training_features ([object] or [[object]])

A list of extracted features that can be used for training the projector. Features will be provided in a single list.

projector_file (str)

The file to write. This file should be readable with the load_projector() function.

load_projector(projector_file)[source]

Loads the parameters required for feature projection from file. This function usually is useful in combination with the train_projector() function. In this base class implementation, it does nothing.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file (str)

The file to read the projector from.

project(feature) → projected[source]

Projects the given feature into GMM space.

Parameters:

feature (1D numpy.ndarray)

The 1D feature to be projected.

Returns:

projected (1D numpy.ndarray)

The feature projected into GMM space.

score(toscore)[source]

Returns the difference between the log likelihoods of being real or attack.

score_for_multiple_projections(toscore)[source]

Returns the difference between the log likelihoods of being real or attack.
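To illustrate what a log likelihood ratio score looks like, here is a minimal sketch that replaces each GMM with a single 1D Gaussian. The model parameters are hypothetical, and the sign convention (real minus attack, so that higher means more likely real) is an assumption rather than something this page states.

```python
import math


def gaussian_log_likelihood(x, mean, variance):
    """Log density of a 1D Gaussian evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * variance)
                   + (x - mean) ** 2 / variance)


def llr_score(x, real_model, attack_model):
    """Log likelihood under the real model minus that under the attack model."""
    return (gaussian_log_likelihood(x, *real_model)
            - gaussian_log_likelihood(x, *attack_model))


real_model = (0.0, 1.0)     # hypothetical (mean, variance) of the real class
attack_model = (3.0, 1.0)   # hypothetical (mean, variance) of the attack class

near_real = llr_score(0.0, real_model, attack_model)
near_attack = llr_score(3.0, real_model, attack_model)
```

With this convention a sample close to the real-class mean gets a positive score and a sample close to the attack-class mean gets a negative one.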

class bob.pad.base.algorithm.LogRegr(C=1, frame_level_scores_flag=False, subsample_train_data_flag=False, subsampling_step=10, subsample_videos_flag=False, video_subsampling_step=3)

Bases: bob.pad.base.algorithm.Algorithm

This class is designed to train a Logistic Regression classifier given Frame Containers with features of the real and attack classes. The procedure is the following:

  1. First, the input data is mean-std normalized using mean and std of the real class only.

  2. Second, the Logistic Regression classifier is trained on normalized input features.

  3. The input features are next classified using pre-trained LR machine.
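Steps 1 and 3 above can be sketched in plain Python, with lists of rows standing in for 2D arrays. The data, the weights, and the bias below are hypothetical, and step 2 (the actual training) is skipped by assuming an already trained model.

```python
import math
import statistics


def fit_mean_std(real_rows):
    """Per-feature mean and standard deviation of the real class only."""
    columns = list(zip(*real_rows))
    means = [statistics.mean(column) for column in columns]
    stds = [statistics.pstdev(column) for column in columns]
    return means, stds


def normalize(rows, means, stds):
    """Apply the real-class statistics to any set of samples."""
    return [[(value - m) / s for value, m, s in zip(row, means, stds)]
            for row in rows]


def real_probability(row, weights, bias):
    """Logistic model output for one normalized sample."""
    activation = sum(w * x for w, x in zip(weights, row)) + bias
    return 1.0 / (1.0 + math.exp(-activation))


real = [[1.0, 10.0], [3.0, 14.0]]
attack = [[5.0, 12.0]]

means, stds = fit_mean_std(real)          # statistics come from real data only
attack_normalized = normalize(attack, means, stds)

weights, bias = [-1.0, 0.0], 0.0          # hypothetical trained parameters
attack_probability = real_probability(attack_normalized[0], weights, bias)
```

Because the statistics come from the real class only, attack samples are expressed as deviations from typical real data.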

Parameters:

C (float)

Inverse of regularization strength in the LR classifier; must be a positive float. Like in support vector machines, smaller values specify stronger regularization. Default: 1.0.

frame_level_scores_flag (bool)

Return scores for each frame individually if True. Otherwise, return a single score per video. Default: False.

subsample_train_data_flag (bool)

Uniformly subsample the training data if True. Default: False.

subsampling_step (int)

Training data subsampling step, only valid if subsample_train_data_flag = True. Default: 10.

subsample_videos_flag (bool)

Uniformly subsample the training videos if True. Default: False.

video_subsampling_step (int)

Training videos subsampling step, only valid if subsample_videos_flag = True. Default: 3.

load_lr_machine_and_mean_std(projector_file)[source]

Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in projector_file string.

Parameters:

projector_file (str)

Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework.

Returns:

machine (object)

The loaded LR machine. As returned by sklearn.linear_model module.

features_mean (1D numpy.ndarray)

Mean of the features.

features_std (1D numpy.ndarray)

Standard deviation of the features.

load_projector(projector_file)[source]

Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in projector_file string.

This function sets the arguments self.lr_machine, self.features_mean and self.features_std of this class with loaded machines.

The function must be capable of reading the data saved with the train_projector() method of this class.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file (str)

The file to read the projector from, as returned by the bob.pad.base framework. In this class the names of the files to read the projectors from are modified, see load_machine and load_cascade_of_machines methods of this class for more details.

project(feature)[source]

This function computes a vector of scores for each sample in the input array of features. The following steps are applied:

  1. First, the input data is mean-std normalized using mean and std of the real class only.

  2. The input features are next classified using pre-trained LR machine.

Set performs_projection = True in the constructor to enable this function. It is assured that the load_projector() was called before the project function is executed.

Parameters:

feature (FrameContainer or 2D numpy.ndarray)

Two types of inputs are accepted: a Frame Container containing the features of an individual (see bob.bio.video.utils.FrameContainer), or a 2D feature array of size (N_samples x N_features).

Returns:

scores (1D numpy.ndarray)

Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class. In this case scores are probabilities.

save_lr_machine_and_mean_std(projector_file, machine, features_mean, features_std)[source]

Saves the LR machine, features mean and std to the hdf5 file. The absolute name of the file is specified in projector_file string.

Parameters:

projector_file (str)

Absolute name of the file to save the data to, as returned by bob.pad.base framework.

machine (object)

The LR machine to be saved. As returned by sklearn.linear_model module.

features_mean (1D numpy.ndarray)

Mean of the features.

features_std (1D numpy.ndarray)

Standard deviation of the features.

score(toscore)[source]

Returns a probability of a sample being a real class.

Parameters:

toscore (1D numpy.ndarray)

Vector with scores for each frame/sample defining the probability of the frame being a sample of the real class.

Returns:

score ([float])

If frame_level_scores_flag = False a single score is returned. One score per video. This score is placed into a list, because the score must be an iterable. Score is a probability of a sample being a real class. If frame_level_scores_flag = True a list of scores is returned. One score per frame/sample.
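The effect of frame_level_scores_flag on the return value can be sketched as follows. How the per-frame probabilities are combined into one video score is not stated here; the mean used below is an assumption, and video_score is a hypothetical helper name.

```python
def video_score(frame_probabilities, frame_level_scores_flag=False):
    """Return per-frame scores, or one score per video wrapped in a list."""
    if frame_level_scores_flag:
        return list(frame_probabilities)
    # Assumed aggregation: average the frame-level probabilities.
    mean = sum(frame_probabilities) / len(frame_probabilities)
    return [mean]  # a list, because the score must be an iterable


per_video = video_score([0.25, 0.75])
per_frame = video_score([0.25, 0.75], frame_level_scores_flag=True)
```

Both branches return a list, so downstream code can iterate over the result without checking the flag.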

subsample_train_videos(training_features, step)[source]

Uniformly selects a subset of frame containers from the input list.

Parameters:

training_features ([FrameContainer])

A list of FrameContainers

step (int)

Data selection step.

Returns:

training_features_subset ([FrameContainer])

A list with selected FrameContainers
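Uniform selection with a fixed step can be sketched with list slicing. Plain strings stand in for FrameContainers, uniform_subset is a hypothetical helper name, and starting at the first item is an assumption.

```python
def uniform_subset(training_features, step):
    """Keep every step-th item, starting with the first one."""
    return training_features[::step]


videos = ["video_%d" % index for index in range(10)]
subset = uniform_subset(videos, 3)
```

With step = 3, a list of ten items is reduced to four.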

train_lr(real, attack, C)[source]

Train LR classifier given real and attack classes. Prior to training the data is mean-std normalized.

Parameters:

real (2D numpy.ndarray)

Training features for the real class.

attack (2D numpy.ndarray)

Training features for the attack class.

C (float)

Inverse of regularization strength in the LR classifier; must be a positive float. Like in support vector machines, smaller values specify stronger regularization. Default: 1.0.

Returns:

machine (object)

A trained LR machine.

features_mean (1D numpy.ndarray)

Mean of the features.

features_std (1D numpy.ndarray)

Standard deviation of the features.

train_projector(training_features, projector_file)[source]

Trains the LR machine for feature projection and saves it to file. The requires_projector_training flag must be set to True to enable this function.

Parameters:

training_features ([[FrameContainer], [FrameContainer]])

A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.

projector_file (str)

The file to save the trained projector to, as returned by the bob.pad.base framework.

class bob.pad.base.algorithm.MLP(hidden_units=(10, 10), max_iter=1000, precision=0.001, **kwargs)

Bases: bob.pad.base.algorithm.Algorithm

Interfaces an MLP classifier used for PAD

hidden_units

The number of hidden units in each hidden layer

Type

tuple of int

max_iter

The maximum number of training iterations

Type

int

precision

Criterion to stop the training: if the difference between the current and the last loss is smaller than this number, training stops.

Type

float

project(feature)[source]

Project the given feature

Parameters

feature (numpy.ndarray) – The feature to classify

Returns

The value of the two units in the last layer of the MLP.

Return type

numpy.ndarray

score(toscore)[source]

Returns the probability of the real class.

Parameters

toscore (numpy.ndarray) –

Returns

Probability that the authentication attempt is real.

Return type

float

train_projector(training_features, projector_file)[source]

Trains the MLP

Parameters
  • training_features (list of numpy.ndarray) – Data used to train the MLP. The real attempts are in training_features[0] and the attacks are in training_features[1]

  • projector_file (str) – Filename where to save the trained model.

class bob.pad.base.algorithm.OneClassGMM(n_components=1, random_state=3, frame_level_scores_flag=False, covariance_type='full', reg_covar=1e-06, normalize_features=False)

Bases: bob.pad.base.algorithm.Algorithm

This class is designed to train a OneClassGMM based PAD system. The OneClassGMM is trained using data of one class (real class) only. The procedure is the following:

  1. First, the training data is mean-std normalized using mean and std of the real class only.

  2. Second, the OneClassGMM with n_components Gaussians is trained using samples of the real class.

  3. The input features are next classified using pre-trained OneClassGMM machine.
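Scoring a sample against a mixture trained on real data only amounts to evaluating its log likelihood under that mixture. The sketch below does this for a 1D mixture in plain Python; the weights, means, and variances are hypothetical, and this is not the scikit-learn implementation.

```python
import math


def log_gaussian(x, mean, variance):
    """Log density of a 1D Gaussian evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * variance)
                   + (x - mean) ** 2 / variance)


def mixture_log_likelihood(x, weights, means, variances):
    """Log of the weighted sum of component densities (log-sum-exp)."""
    terms = [math.log(w) + log_gaussian(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    peak = max(terms)  # subtract the largest term for numerical stability
    return peak + math.log(sum(math.exp(term - peak) for term in terms))


# Hypothetical real-class model with two components.
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]

typical_sample = mixture_log_likelihood(0.0, weights, means, variances)
outlier_sample = mixture_log_likelihood(8.0, weights, means, variances)
```

Samples that resemble the real training data get a higher log likelihood than outliers, which is what lets a model trained on one class separate real samples from attacks.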

Parameters:

n_components (int)

Number of Gaussians in the OneClassGMM. Default: 1 .

random_state (int)

A seed for the random number generator used in the initialization of the OneClassGMM. Default: 3 .

frame_level_scores_flag (bool)

Return scores for each frame individually if True. Otherwise, return a single score per video. Default: False.

load_gmm_machine_and_mean_std(projector_file)[source]

Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in projector_file string.

Parameters:

projector_file (str)

Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework.

Returns:

machine (object)

The loaded OneClassGMM machine. As returned by sklearn.mixture module.

features_mean (1D numpy.ndarray)

Mean of the features.

features_std (1D numpy.ndarray)

Standard deviation of the features.

load_projector(projector_file)[source]

Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in projector_file string.

This function sets the arguments self.machine, self.features_mean and self.features_std of this class with loaded machines.

The function must be capable of reading the data saved with the train_projector() method of this class.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file (str)

The file to read the projector from, as returned by the bob.pad.base framework. In this class the names of the files to read the projectors from are modified, see load_machine and load_cascade_of_machines methods of this class for more details.

project(feature)[source]

This function computes a vector of scores for each sample in the input array of features. The following steps are applied:

  1. First, the input data is mean-std normalized using mean and std of the real class only.

  2. The input features are next classified using pre-trained OneClassGMM machine.

Set performs_projection = True in the constructor to enable this function. It is assured that the load_projector() was called before the project function is executed.

Parameters:

feature (FrameContainer or 2D numpy.ndarray)

Two types of inputs are accepted: a Frame Container containing the features of an individual (see bob.bio.video.utils.FrameContainer), or a 2D feature array of size (N_samples x N_features).

Returns:

scores (1D numpy.ndarray)

Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class. In this case scores are the weighted log probabilities.

save_gmm_machine_and_mean_std(projector_file, machine, features_mean, features_std)[source]

Saves the OneClassGMM machine, features mean and std to the hdf5 file. The absolute name of the file is specified in projector_file string.

Parameters:

projector_file (str)

Absolute name of the file to save the data to, as returned by bob.pad.base framework.

machine (object)

The OneClassGMM machine to be saved. As returned by sklearn.mixture module.

features_mean (1D numpy.ndarray)

Mean of the features.

features_std (1D numpy.ndarray)

Standard deviation of the features.

score(toscore)[source]

Returns a probability of a sample being a real class.

Parameters:

toscore (1D numpy.ndarray)

Vector with scores for each frame/sample defining the probability of the frame being a sample of the real class.

Returns:

score ([float])

If frame_level_scores_flag = False a single score is returned. One score per video. This score is placed into a list, because the score must be an iterable. Score is a probability of a sample being a real class. If frame_level_scores_flag = True a list of scores is returned. One score per frame/sample.

train_gmm(real)[source]

Train OneClassGMM classifier given real class. Prior to the training the data is mean-std normalized.

Parameters:

real (2D numpy.ndarray)

Training features for the real class.

Returns:

machine (object)

A trained OneClassGMM machine.

features_mean (1D numpy.ndarray)

Mean of the features.

features_std (1D numpy.ndarray)

Standard deviation of the features.

train_projector(training_features, projector_file)[source]

Trains the OneClassGMM for feature projection and saves it to file. The requires_projector_training flag must be set to True to enable this function.

Parameters:

training_features ([[FrameContainer], [FrameContainer]])

A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.

projector_file (str)

The file to save the trained projector to, as returned by the bob.pad.base framework.

class bob.pad.base.algorithm.OneClassGMM2(number_of_gaussians, kmeans_training_iterations=25, gmm_training_iterations=25, training_threshold=0.0005, variance_threshold=0.0005, update_weights=True, update_means=True, update_variances=True, n_threads=40, preprocessor=None, **kwargs)

Bases: bob.pad.base.algorithm.Algorithm

A one class GMM implementation based on Bob’s GMM implementation which is more stable than scikit-learn’s one.

load_projector(projector_file)[source]

Loads the parameters required for feature projection from file. This function usually is useful in combination with the train_projector() function. In this base class implementation, it does nothing.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file (str)

The file to read the projector from.

project(feature) → projected[source]

This function will project the given feature. It must be overwritten by derived classes, as soon as performs_projection = True was set in the constructor. It is assured that the load_projector() was called once before the project function is executed.

Parameters:

feature (object)

The feature to be projected.

Returns:

projected (object)

The projected features. Must be writable with the write_feature() function and readable with the read_feature() function.

score(toscore) → score[source]

This function will compute the score for the given object toscore. It must be overwritten by derived classes.

Parameters:

toscore (object)

The object to compute the score for. This will be the output of extractor if performs_projection is False, otherwise this will be the output of project method of the algorithm.

Returns:

score (float)

A score value for the object toscore.

train_projector(training_features, projector_file)[source]

This function can be overwritten to train the feature projector. If you do this, please also register the function by calling this base class constructor and enabling the training by requires_projector_training = True.

Parameters:

training_features ([object] or [[object]])

A list of extracted features that can be used for training the projector. Features will be provided in a single list.

projector_file (str)

The file to write. This file should be readable with the load_projector() function.

class bob.pad.base.algorithm.PadLDA(lda_subspace_dimension=None, pca_subspace_dimension=None, use_pinv=False, **kwargs)

Bases: bob.bio.base.algorithm.LDA

Wrapper for bob.bio.base.algorithm.LDA.

Here, LDA is used in a PAD context. This means that the feature will be projected onto a single-dimension subspace, which acts as a score.

For more details, you may want to have a look at bob.learn.linear Documentation

lda_subspace_dimension

The dimension of the LDA subspace. In the PAD case, the default value is always used, and corresponds to the number of classes in the training set (i.e. 2).

Type

int

pca_subspace_dimension

The dimension of the PCA subspace to be applied to the data before applying LDA.

Type

int

use_pinv

Use the pseudo-inverse in LDA computation.

Type

bool

score(model, probe) → float[source]

Computes the distance of the model to the probe using the distance function specified in the constructor.

Parameters:

model (2D numpy.ndarray)

The model storing all enrollment features.

probe (1D numpy.ndarray)

The probe feature vector in Fisher space.

Returns:

score (float)

A similarity value between model and probe.

class bob.pad.base.algorithm.Predictions(**kwargs)

Bases: bob.pad.base.algorithm.Algorithm

An algorithm that takes the precomputed predictions and uses them for scoring.

score(toscore) → score[source]

This function will compute the score for the given object toscore. It must be overwritten by derived classes.

Parameters:

toscore (object)

The object to compute the score for. This will be the output of extractor if performs_projection is False, otherwise this will be the output of project method of the algorithm.

Returns:

score (float)

A score value for the object toscore.

class bob.pad.base.algorithm.SVM(machine_type='C_SVC', kernel_type='RBF', n_samples=10000, trainer_grid_search_params={'cost': [0.03125, 0.125, 0.5, 2, 8, 32, 128, 512, 2048, 8192, 32768], 'gamma': [3.0517578125e-05, 0.0001220703125, 0.00048828125, 0.001953125, 0.0078125, 0.03125, 0.125, 0.5, 2, 8]}, mean_std_norm_flag=False, frame_level_scores_flag=False, save_debug_data_flag=True, reduced_train_data_flag=False, n_train_samples=50000)

Bases: bob.pad.base.algorithm.Algorithm

This class is designed to train an SVM given features (either numpy arrays or Frame Containers) from the real and attack classes. The trained SVM is then used to classify the testing data as either real or attack. The SVM is trained in two stages. First, the best parameters for the SVM are estimated using train and cross-validation subsets. The size of the subsets used in hyper-parameter tuning is defined by the n_samples parameter of this class. Once the best parameters are determined, the SVM machine is trained using the complete training set.

Parameters:

machine_type (str)

A type of the SVM machine. Please check bob.learn.libsvm for more details. Default: ‘C_SVC’.

kernel_type (str)

A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details. Default: ‘RBF’.

n_samples (int)

Number of uniformly selected feature vectors per class defining the sizes of sub-sets used in the hyper-parameter grid search.

trainer_grid_search_params (dict)

Dictionary containing the hyper-parameters of the SVM to be tested in the grid-search. Default: {'cost': [2**p for p in range(-5, 16, 2)], 'gamma': [2**p for p in range(-15, 4, 2)]}.

mean_std_norm_flag (bool)

Perform mean-std normalization of data if set to True. Default: False.

frame_level_scores_flag (bool)

Return scores for each frame individually if True. Otherwise, return a single score per video. Should be used only when features are in Frame Containers. Default: False.

save_debug_data_flag (bool)

Save the data, which might be useful for debugging, if True. Default: True.

reduced_train_data_flag (bool)

Reduce the amount of final training samples if set to True. Default: False.

n_train_samples (int)

Number of uniformly selected feature vectors per class defining the sizes of sub-sets used in the final training of the SVM. Default: 50000.
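The default value of trainer_grid_search_params is easier to read as powers of two. The sketch below rebuilds it from the two list comprehensions and enumerates the (cost, gamma) pairs; the variable names are illustrative.

```python
from itertools import product

grid = {
    "cost": [2 ** p for p in range(-5, 16, 2)],
    "gamma": [2 ** p for p in range(-15, 4, 2)],
}

# Every (cost, gamma) pair in the grid.
candidates = list(product(grid["cost"], grid["gamma"]))
print(len(grid["cost"]), len(grid["gamma"]), len(candidates))
```

A grid of this size is one reason the hyper-parameter search runs on sub-sets of n_samples vectors per class rather than on the full training set.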

comp_prediction_precision(machine, real, attack)[source]

This function computes the precision of the predictions as a ratio of correctly classified samples to the total number of samples.

Parameters:

machine (object)

A pre-trained SVM machine.

real (2D numpy.ndarray)

Array of features representing the real class.

attack (2D numpy.ndarray)

Array of features representing the attack class.

Returns:

precision (float)

The precision of the predictions.
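The ratio described above can be sketched as follows. The predict callable and the label convention (+1 for real, -1 for attack) are assumptions made for this illustration, not the interface of the SVM machine.

```python
def prediction_precision(predict, real, attack):
    """Fraction of samples from both classes that are classified correctly."""
    correct_real = sum(1 for sample in real if predict(sample) == 1)
    correct_attack = sum(1 for sample in attack if predict(sample) == -1)
    return (correct_real + correct_attack) / (len(real) + len(attack))


def threshold_classifier(sample):
    """Hypothetical classifier: real if the first feature exceeds 0.5."""
    return 1 if sample[0] > 0.5 else -1


real = [[0.9], [0.8], [0.3]]
attack = [[0.1], [0.7]]
precision = prediction_precision(threshold_classifier, real, attack)
```

In this toy example two of the three real samples and one of the two attack samples are classified correctly, giving 3 correct out of 5.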

load_projector(projector_file)[source]

Load the pretrained projector/SVM from file to perform a feature projection. This function usually is useful in combination with the train_projector() function.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_file (str)

The file to read the projector from.

project(feature)[source]

This function computes class probabilities for the input feature using pretrained SVM. The feature in this case is a Frame Container with features for each frame. The probabilities will be computed and returned for each frame.

Set performs_projection = True in the constructor to enable this function. It is assured that the load_projector() was called before the project function is executed.

Parameters:

feature (object)

A Frame Container containing the features of an individual, see bob.bio.video.utils.FrameContainer.

Returns:

probabilities (1D or 2D numpy.ndarray)

2D in the case of two-class SVM. An array containing class probabilities for each frame. First column contains probabilities for each frame being a real class. Second column contains probabilities for each frame being an attack class. 1D in the case of one-class SVM. Vector with scores for each frame defining belonging to the real class. Must be writable with the write_feature function and readable with the read_feature function.

score(toscore)[source]

Returns a probability of a sample being a real class.

Parameters:

toscore (1D or 2D numpy.ndarray)

2D in the case of two-class SVM. An array containing class probabilities for each frame. First column contains probabilities for each frame being a real class. Second column contains probabilities for each frame being an attack class. 1D in the case of one-class SVM. Vector with scores for each frame defining belonging to the real class.

Returns:

score (float or a 1D numpy.ndarray)

If frame_level_scores_flag = False a single score is returned. One score per video. Score is a probability of a sample being a real class. If frame_level_scores_flag = True a 1D array of scores is returned. One score per frame. Score is a probability of a sample being a real class.

score_for_multiple_projections(toscore)[source]

Returns a list of scores computed by the score method of this class.

Parameters:

toscore (1D or 2D numpy.ndarray)

2D in the case of two-class SVM. An array containing class probabilities for each frame. First column contains probabilities for each frame being a real class. Second column contains probabilities for each frame being an attack class. 1D in the case of one-class SVM. Vector with scores for each frame defining belonging to the real class.

Returns:

list_of_scores ([float])

A list containing the scores.

train_projector(training_features, projector_file)[source]

Trains the SVM feature projector and saves the trained SVM to the given file. The requires_projector_training flag must be set to True to enable this function.

Parameters:

training_features ([[FrameContainer], [FrameContainer]])

A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.

projector_file (str)

The file to save the trained projector to. This file should be readable with the load_projector() function.

train_svm(training_features, n_samples=10000, machine_type='C_SVC', kernel_type='RBF', trainer_grid_search_params={'cost': [0.03125, 0.125, 0.5, 2, 8, 32, 128, 512, 2048, 8192, 32768], 'gamma': [3.0517578125e-05, 0.0001220703125, 0.00048828125, 0.001953125, 0.0078125, 0.03125, 0.125, 0.5, 2, 8]}, mean_std_norm_flag=False, projector_file='', save_debug_data_flag=True, reduced_train_data_flag=False, n_train_samples=50000)[source]

First, this function tunes the hyper-parameters of the SVM classifier using grid search on the sub-sets of training data. Train and cross-validation subsets for both classes are formed from the available input training_features.

Once the best parameters are determined, the SVM is trained on the whole training data set. The resulting machine is returned by the function.

Parameters:

training_features ([[FrameContainer], [FrameContainer]])

A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.

n_samples (int)

Number of uniformly selected feature vectors per class defining the sizes of sub-sets used in the hyper-parameter grid search.

machine_type (str)

A type of the SVM machine. Please check bob.learn.libsvm for more details.

kernel_type (str)

A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.

trainer_grid_search_params (dict)

Dictionary containing the hyper-parameters of the SVM to be tested in the grid-search.

mean_std_norm_flag (bool)

Perform mean-std normalization of data if set to True. Default: False.

projector_file (str)

The name of the file to save the trained projector to. Only the path of this file is used in this function. The file debug_data.hdf5 will be saved in this path. This file contains information which might be useful for debugging.

save_debug_data_flagbool

Save the data that might be useful for debugging if True. Default: True.

reduced_train_data_flagbool

Reduce the amount of final training samples if set to True. Default: False.

n_train_samplesint

Number of uniformly selected feature vectors per class defining the sizes of sub-sets used in the final training of the SVM. Default: 50000.

Returns:

machineobject

A trained SVM machine.
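The default values of trainer_grid_search_params in the signature above are powers of two. As a rough illustration only (this is not the package's own code, and the helper name iter_grid is hypothetical), the grid and the list of parameter combinations it implies could be built like this:

```python
import itertools

# Default grid from the train_svm signature: exponents increase in steps of 2.
cost_grid = [2.0 ** p for p in range(-5, 16, 2)]    # 2^-5 ... 2^15, 11 values
gamma_grid = [2.0 ** p for p in range(-15, 4, 2)]   # 2^-15 ... 2^3, 10 values


def iter_grid(params):
    """Yield one dict per combination of hyper-parameter values (hypothetical helper)."""
    names = sorted(params)
    for values in itertools.product(*(params[name] for name in names)):
        yield dict(zip(names, values))


grid = {"cost": cost_grid, "gamma": gamma_grid}
combinations = list(iter_grid(grid))
```

Each dictionary in combinations corresponds to one candidate setting to be evaluated on the train and cross-validation subsets.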

class bob.pad.base.algorithm.SVMCascadePCA(machine_type='C_SVC', kernel_type='RBF', svm_kwargs={'cost': 1, 'gamma': 0}, N=2, pos_scores_slope=0.01, frame_level_scores_flag=False)

Bases: bob.pad.base.algorithm.Algorithm

This class is designed to train the cascade of SVMs given Frame Containers with features of real and attack classes. The procedure is the following:

  1. First, the input data is mean-std normalized.

  2. Second, the PCA is trained on normalized input features. Only the features of the real class are used in PCA training, both for one-class and two-class SVMs.

  3. The features are next projected given trained PCA machine.

  4. Prior to SVM training the features are again mean-std normalized.

  5. Next, an SVM machine is trained for each group of N projected features. The projected features corresponding to the highest eigenvalues are selected first. N is usually small, N = (2, 3). So, if N = 2, the first SVM is trained for projected features 1 and 2, the second SVM is trained for projected features 3 and 4, and so on.

  6. These SVMs then form a cascade of classifiers. The input feature vector is then projected using PCA machine and passed through all classifiers in the cascade. The decision is then made by majority voting.

Both one-class SVM and two-class SVM cascades can be trained. In this implementation the grid search of SVM parameters is not supported.
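Steps 1 to 3 of the procedure above can be sketched with plain NumPy. This is a minimal illustration and not the class's implementation: the helper names are hypothetical, the PCA is computed by eigen-decomposition of the covariance matrix, and normalizing the attack features with the statistics of the real class is an assumption.

```python
import numpy as np


def mean_std_normalize(data, mean=None, std=None):
    """Normalize columns to zero mean and unit std; reuse statistics if given."""
    if mean is None:
        mean = data.mean(axis=0)
    if std is None:
        std = data.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant features
    return (data - mean) / std, mean, std


def train_pca_sketch(real):
    """Return eigenvectors (as columns) and eigenvalues, largest eigenvalue first."""
    cov = np.cov(real, rowvar=False)
    eig_vals, eig_vecs = np.linalg.eigh(cov)
    order = np.argsort(eig_vals)[::-1]
    return eig_vecs[:, order], eig_vals[order]


rng = np.random.default_rng(0)
real = rng.normal(size=(200, 6))               # toy real-class features
attack = rng.normal(loc=1.0, size=(150, 6))    # toy attack-class features

real_norm, mean, std = mean_std_normalize(real)             # step 1
attack_norm, _, _ = mean_std_normalize(attack, mean, std)   # assumption: reuse real statistics
components, eig_vals = train_pca_sketch(real_norm)          # step 2: real class only
real_proj = real_norm @ components                          # step 3
attack_proj = attack_norm @ components
```

The projected columns are ordered by decreasing eigenvalue, so taking consecutive groups of N columns matches the feature grouping described in step 5.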

Parameters:

machine_typestr

A type of the SVM machine. Please check bob.learn.libsvm for more details. Default: ‘C_SVC’.

kernel_typestr

A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details. Default: ‘RBF’.

svm_kwargsdict

Dictionary containing the hyper-parameters of the SVM. Default: {‘cost’: 1, ‘gamma’: 0}.

Nint

The number of features to be used for training a single SVM machine in the cascade. Default: 2.

pos_scores_slopefloat

The positive scores returned by SVM cascade will be multiplied by this constant prior to majority voting. Default: 0.01.

frame_level_scores_flagbool

Return scores for each frame individually if True. Otherwise, return a single score per video. Default: False.

combine_scores_of_svm_cascade(scores_array, pos_scores_slope)[source]

First, multiply positive scores by the constant pos_scores_slope in the input 2D array. The constant is usually small, making the impact of negative scores more significant. Second, a single score per sample is obtained by averaging the pre-modified scores of the cascade.

Parameters:

scores_array2D numpy.ndarray

2D score array of the size (N_samples x N_scores).

pos_scores_slopefloat

The positive scores returned by SVM cascade will be multiplied by this constant prior to majority voting. Default: 0.01.

Returns:

scores1D numpy.ndarray

Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class.
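The two steps described for this method (scale the positive scores, then average per sample) can be sketched as follows. This is an illustrative stand-alone function written from the description above, not the method's actual code; the name combine_scores is hypothetical.

```python
import numpy as np


def combine_scores(scores_array, pos_scores_slope=0.01):
    """Scale positive scores by pos_scores_slope, then average across the cascade."""
    scores = np.array(scores_array, dtype=float)  # copy, so the input is not modified
    positive = scores > 0
    scores[positive] *= pos_scores_slope
    return scores.mean(axis=1)  # one score per sample (row)


# Two samples scored by a cascade of two SVMs.
example = combine_scores([[1.0, -1.0], [2.0, 4.0]], pos_scores_slope=0.01)
```

For the first sample the negative score dominates after scaling, which is the effect the small slope is meant to produce.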

comp_prediction_precision(machine, real, attack)[source]

This function computes the precision of the predictions as a ratio of correctly classified samples to the total number of samples.

Parameters:

machineobject

A pre-trained SVM machine.

real2D numpy.ndarray

Array of features representing the real class.

attack2D numpy.ndarray

Array of features representing the attack class.

Returns:

precisionfloat

The precision of the predictions.
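The ratio described above can be sketched without a trained SVM. In this illustration, predict is a hypothetical stand-in for the machine, and the label convention (+1 for real, -1 for attack) is an assumption rather than something stated in this documentation.

```python
import numpy as np


def prediction_precision(predict, real, attack):
    """Ratio of correctly classified samples to the total number of samples.

    predict is a hypothetical callable returning +1 (real) or -1 (attack) per row.
    """
    real_pred = np.asarray(predict(real))
    attack_pred = np.asarray(predict(attack))
    n_correct = np.sum(real_pred == 1) + np.sum(attack_pred == -1)
    return float(n_correct) / (len(real) + len(attack))


def threshold_predict(features):
    """Toy classifier: a sample is 'real' if its first feature is positive."""
    return np.where(features[:, 0] > 0, 1, -1)


real = np.array([[1.0], [2.0], [-0.5]])
attack = np.array([[-1.0], [-2.0]])
precision = prediction_precision(threshold_predict, real, attack)
```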

get_cascade_file_names(projector_file, projector_file_name)[source]

Get the list of file-names storing the cascade of machines. The location of the files is specified in the path component of the projector_file argument.

Parameters:

projector_filestr

Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework. In this function only the path component is used.

projector_file_namestr

The common string in the names of files storing the cascade of pretrained machines. Name without extension.

Returns:

cascade_file_names[str]

A list of relative file-names storing the cascade of machines.
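One way to picture this lookup is a directory listing filtered by the common name. The sketch below is an assumption-laden illustration: the function name, the "_<number>.hdf5" naming pattern, and the use of glob are not taken from the package.

```python
import glob
import os
import tempfile


def cascade_file_names(projector_file, projector_file_name):
    """List file names in the folder of projector_file that start with projector_file_name."""
    folder = os.path.dirname(projector_file)  # only the path component is used
    pattern = os.path.join(folder, projector_file_name + "*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


# Usage with a temporary folder holding three cascade files and one unrelated file.
folder = tempfile.mkdtemp()
for number in range(3):
    # Hypothetical naming scheme: common name, underscore, machine number.
    open(os.path.join(folder, "svm_machine_{}.hdf5".format(number)), "w").close()
open(os.path.join(folder, "pca_machine.hdf5"), "w").close()

names = cascade_file_names(os.path.join(folder, "Projector.hdf5"), "svm_machine")
```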

get_data_start_end_idx(data, N)[source]

Get indexes to select the subsets of data related to the cascades. The first (n_machines - 1) SVMs will be trained using N features each. The last SVM will be trained using the remaining features, whose number is less than or equal to N.

Parameters:

data2D numpy.ndarray

Data array containing the training features. The dimensionality is (N_samples x N_features).

Nint

Number of features per single SVM.

Returns:

idx_start[int]

Starting indexes for data subsets.

idx_end[int]

End indexes for data subsets.

n_machinesint

Number of SVMs to be trained.
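The index computation described above can be sketched as follows. This is a minimal stand-alone version written from the description, with end indexes assumed to be exclusive (Python slicing convention); the actual method may differ in detail.

```python
import numpy as np


def data_start_end_idx(data, N):
    """Split the feature columns of data into consecutive groups of at most N."""
    n_features = data.shape[1]
    n_machines = int(np.ceil(n_features / float(N)))
    idx_start = [i * N for i in range(n_machines)]
    # The last group may hold fewer than N features.
    idx_end = [min((i + 1) * N, n_features) for i in range(n_machines)]
    return idx_start, idx_end, n_machines


data = np.zeros((10, 5))  # 10 samples, 5 features
idx_start, idx_end, n_machines = data_start_end_idx(data, N=2)
# The features for the k-th SVM would then be data[:, idx_start[k]:idx_end[k]].
```

With 5 features and N = 2, three SVMs are needed, and the last one receives a single feature.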

load_cascade_of_machines(projector_file, projector_file_name)[source]

Loads a cascade of machines from the hdf5 files. The name of the file is specified in the projector_file_name string and will be augmented with the number of the machine. The location is specified in the path component of the projector_file string.

Parameters:

projector_filestr

Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework. In this function only the path component is used.

projector_file_namestr

The relative name of the file to load the machine from. This name will be augmented with the number of the machine. Name without extension.

Returns:

machinesdict

A cascade of machines. The key in the dictionary is the number of the machine, value is the machine itself.

load_machine(projector_file, projector_file_name)[source]

Loads the machine from the hdf5 file. The name of the file is specified in projector_file_name string. The location is specified in the path component of the projector_file string.

Parameters:

projector_filestr

Absolute name of the file to load the trained projector from, as returned by bob.pad.base framework. In this function only the path component is used.

projector_file_namestr

The relative name of the file to load the machine from. Name without extension.

Returns:

machineobject

A machine loaded from file.

load_projector(projector_file)[source]

Load the pretrained PCA machine and a cascade of SVM classifiers from files to perform feature projection. This function sets the arguments self.pca_machine and self.svm_machines of this class with loaded machines.

The function must be capable of reading the data saved with the train_projector() method of this class.

Please register performs_projection = True in the constructor to enable this function.

Parameters:

projector_filestr

The file to read the projector from, as returned by the bob.pad.base framework. In this class the names of the files to read the projectors from are modified, see load_machine and load_cascade_of_machines methods of this class for more details.

project(feature)[source]

This function computes a vector of scores for each sample in the input array of features. The following steps are applied:

  1. Convert input array to numpy array if necessary.

  2. Project features using pretrained PCA machine.

  3. Apply the cascade of SVMs to the projected features.

  4. Compute a single score per sample by combining the scores produced by the cascade of SVMs. The combination is done using combine_scores_of_svm_cascade method of this class.

Set performs_projection = True in the constructor to enable this function. It is assured that the load_projector() was called before the project function is executed.

Parameters:

featureFrameContainer or 2D numpy.ndarray

Two types of inputs are accepted: a Frame Container containing the features of an individual (see bob.bio.video.utils.FrameContainer), or a 2D feature array of the size (N_samples x N_features).

Returns:

scores1D numpy.ndarray

Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class.

save_cascade_of_machines(projector_file, projector_file_name, machines)[source]

Saves a cascade of machines to the hdf5 files. The name of the file is specified in the projector_file_name string and will be augmented with the number of the machine. The location is specified in the path component of the projector_file string.

Parameters:

projector_filestr

Absolute name of the file to save the trained projector to, as returned by bob.pad.base framework. In this function only the path component is used.

projector_file_namestr

The relative name of the file to save the machine to. This name will be augmented with the number of the machine. Name without extension.

machinesdict

A cascade of machines. The key in the dictionary is the number of the machine, value is the machine itself.

save_machine(projector_file, projector_file_name, machine)[source]

Saves the machine to the hdf5 file. The name of the file is specified in projector_file_name string. The location is specified in the path component of the projector_file string.

Parameters:

projector_filestr

Absolute name of the file to save the trained projector to, as returned by bob.pad.base framework. In this function only the path component is used.

projector_file_namestr

The relative name of the file to save the machine to. Name without extension.

machineobject

The machine to be saved.

score(toscore)[source]

Returns a probability of a sample being a real class.

Parameters:

toscore1D or 2D numpy.ndarray

2D in the case of two-class SVM. An array containing class probabilities for each frame. First column contains probabilities for each frame being a real class. Second column contains probabilities for each frame being an attack class. 1D in the case of one-class SVM. Vector with scores for each frame defining belonging to the real class.

Returns:

score[float]

If frame_level_scores_flag = False a single score is returned. One score per video. This score is placed into a list, because the score must be an iterable. Score is a probability of a sample being a real class. If frame_level_scores_flag = True a list of scores is returned. One score per frame/sample.
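The return convention described above can be sketched as follows. This illustration is written from the parameter description only; in particular, using the mean of the frame scores as the single per-video score is an assumption, and the function is a stand-alone helper rather than the method itself.

```python
import numpy as np


def score_sketch(toscore, frame_level_scores_flag=False):
    """Return per-frame scores, or a one-element list with a per-video score."""
    toscore = np.asarray(toscore, dtype=float)
    if toscore.ndim == 2:
        frame_scores = toscore[:, 0]  # first column: probability of the real class
    else:
        frame_scores = toscore        # 1D case: scores already refer to the real class
    if frame_level_scores_flag:
        return list(frame_scores)
    # Assumption: the per-video score is the mean over frames.
    return [float(np.mean(frame_scores))]


video_score = score_sketch([0.2, 0.4])
frame_scores = score_sketch([[0.9, 0.1], [0.7, 0.3]], frame_level_scores_flag=True)
```

In both cases the result is a list, which matches the requirement that the score be an iterable.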

train_pca(data)[source]

Train PCA given input array of feature vectors. The data is mean-std normalized prior to PCA training.

Parameters:

data2D numpy.ndarray

Array of feature vectors of the size (N_samples x N_features). The features must be already mean-std normalized.

Returns:

machinebob.learn.linear.Machine

The PCA machine that has been trained. The mean-std normalizers are also set in the machine.

eig_vals1D numpy.ndarray

The eigen-values of the PCA projection.

train_pca_svm_cascade(real, attack, machine_type, kernel_type, svm_kwargs, N)[source]

This function is designed to train the cascade of SVMs given features of real and attack classes. The procedure is the following:

  1. First, the PCA machine is trained also incorporating mean-std feature normalization. Only the features of the real class are used in PCA training, both for one-class and two-class SVMs.

  2. The features are next projected given trained PCA machine.

  3. Next, an SVM machine is trained for each group of N projected features. Prior to SVM training the features are again mean-std normalized. The projected features corresponding to the highest eigenvalues are selected first. N is usually small, N = (2, 3). So, if N = 2, the first SVM is trained for projected features 1 and 2, the second SVM is trained for projected features 3 and 4, and so on.

Both one-class SVM and two-class SVM cascades can be trained. In this implementation the grid search of SVM parameters is not supported.

Parameters:

real2D numpy.ndarray

Training features for the real class.

attack2D numpy.ndarray

Training features for the attack class. If machine_type == ‘ONE_CLASS’, this argument can be anything; it will be ignored.

machine_typestr

A type of the SVM machine. Please check bob.learn.libsvm for more details.

kernel_typestr

A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.

svm_kwargsdict

Dictionary containing the hyper-parameters of the SVM.

Nint

The number of features to be used for training a single SVM machine in the cascade.

Returns:

pca_machineobject

A trained PCA machine.

svm_machinesdict

A cascade of SVM machines.

train_projector(training_features, projector_file)[source]

Train PCA and cascade of SVMs for feature projection and save them to files. The requires_projector_training flag must be set to True to enable this function.

Parameters:

training_features[[FrameContainer], [FrameContainer]]

A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.

projector_filestr

The file to save the trained projector to, as returned by the bob.pad.base framework. In this class the names of the files to save the projectors to are modified, see save_machine and save_cascade_of_machines methods of this class for more details.

train_svm(real, attack, machine_type, kernel_type, svm_kwargs)[source]

A one-class or two-class SVM is trained in this method given input features. The value of the attack argument is not important in the case of a one-class SVM. Prior to training, the data is mean-std normalized.

Parameters:

real2D numpy.ndarray

Training features for the real class.

attack2D numpy.ndarray

Training features for the attack class. If machine_type == ‘ONE_CLASS’, this argument can be anything; it will be ignored.

machine_typestr

A type of the SVM machine. Please check bob.learn.libsvm for more details.

kernel_typestr

A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.

svm_kwargsdict

Dictionary containing the hyper-parameters of the SVM.

Returns:

machineobject

A trained SVM machine. The mean-std normalizers are also set in the machine.

train_svm_cascade(real, attack, machine_type, kernel_type, svm_kwargs, N)[source]

Train a cascade of SVMs, one SVM machine per N features. N is usually small N = (2, 3). So, if N = 2, the first SVM is trained for features 1 and 2, second SVM is trained for features 3 and 4, and so on.

Both one-class and two-class SVM cascades can be trained. The value of attack argument is not important in the case of one-class SVM.

The data is mean-std normalized prior to SVM cascade training.

Parameters:

real2D numpy.ndarray

Training features for the real class.

attack2D numpy.ndarray

Training features for the attack class. If machine_type == ‘ONE_CLASS’, this argument can be anything; it will be ignored.

machine_typestr

A type of the SVM machine. Please check bob.learn.libsvm for more details.

kernel_typestr

A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.

svm_kwargsdict

Dictionary containing the hyper-parameters of the SVM.

Nint

The number of features to be used for training a single SVM machine in the cascade.

Returns:

machinesdict

A dictionary containing a cascade of trained SVM machines.

class bob.pad.base.algorithm.VideoPredictions(axis=1, frame_level_scoring=False, **kwargs)

Bases: bob.pad.base.algorithm.Algorithm

An algorithm that takes the precomputed predictions and uses them for scoring.

score(toscore) → score[source]

This function will compute the score for the given object toscore. It must be overwritten by derived classes.

Parameters:

toscoreobject

The object to compute the score for. This will be the output of extractor if performs_projection is False, otherwise this will be the output of project method of the algorithm.

Returns:

scorefloat

A score value for the object toscore.

Databases

class bob.pad.base.database.Client(client_id)

Bases: object

The clients of this database contain ONLY client ids. Nothing special.

class bob.pad.base.database.FileListPadDatabase(filelists_directory, name, protocol=None, pad_file_class=<class 'bob.pad.base.database.PadFile'>, original_directory=None, original_extension=None, annotation_directory=None, annotation_extension='', annotation_type=None, train_subdir=None, dev_subdir=None, eval_subdir=None, real_filename=None, attack_filename=None, keep_read_lists_in_memory=True, **kwargs)

Bases: bob.pad.base.database.PadDatabase, bob.bio.base.database.FileListBioDatabase

This class provides a user-friendly interface to databases that are given as file lists.

Keyword parameters:

filelists_directorystr

The directory that contains the filelists defining the protocol(s). If you use the protocol attribute when querying the database, it will be appended to the base directory, such that several protocols are supported by the same class instance of bob.pad.base.

namestr

The name of the database

protocolstr

The protocol of the database. This should be a folder inside filelists_directory.

pad_file_classclass

The class that should be used to return the files. This can be PadFile, PadVoiceFile, or anything similar.

original_directorystr or None

The directory, where the original data can be found

original_extensionstr or [str] or None

The filename extension of the original data, or multiple extensions

annotation_directorystr or None

The directory, where additional annotation files can be found

annotation_extensionstr or None

The filename extension of the annotation files

annotation_typestr

The type of the annotation file to read, see bob.db.base.read_annotation_file for accepted formats.

train_subdirstr or None

Specify a custom subdirectory for the filelists of the training set (default is ‘train’)

dev_subdirstr or None

Specify a custom subdirectory for the filelists of the development set (default is ‘dev’)

eval_subdirstr or None

Specify a custom subdirectory for the filelists of the evaluation set (default is ‘eval’)

keep_read_lists_in_memorybool

If set to true, the lists are read only once and stored in memory
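The way the directory parameters combine can be pictured with a small path helper. This is a sketch under assumptions: the helper name filelist_directory is hypothetical, and it only mirrors what the parameter descriptions above state (the protocol is a folder inside filelists_directory, and each group has a default subdirectory name that can be overridden).

```python
import os


def filelist_directory(filelists_directory, protocol=None, group="train",
                       train_subdir=None, dev_subdir=None, eval_subdir=None):
    """Compose the folder expected to hold the file lists of one group."""
    subdirs = {
        "train": train_subdir or "train",  # documented default: 'train'
        "dev": dev_subdir or "dev",        # documented default: 'dev'
        "eval": eval_subdir or "eval",     # documented default: 'eval'
    }
    base = filelists_directory
    if protocol:
        base = os.path.join(base, protocol)  # protocol is a folder inside the base directory
    return os.path.join(base, subdirs[group])


dev_dir = filelist_directory("lists", protocol="grandtest", group="dev")
custom_train_dir = filelist_directory("lists", group="train", train_subdir="world")
```

The directory and protocol names used here ("lists", "grandtest", "world") are made up for the example.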

annotations(file)[source]

Returns the annotations for the given File object, if available. You need to override this method in your high-level implementation. If your database does not have annotations, it should return None.

Parameters:

filebob.pad.base.database.PadFile

The file for which annotations should be returned.

Returns:

annotsdict or None

The annotations for the file, if available.

client_ids(protocol=None, groups=None)[source]

Returns a list of client ids for the specific query by the user.

Keyword Parameters:

protocolstr or None

The protocol to consider

groupsstr or [str] or None

The groups to which the clients belong (“dev”, “eval”, “train”).

Returns: A list containing all the client ids which have the given properties.

groups(protocol=None, add_world=False, add_subworld=False)[source]

This function returns the list of groups for this database.

protocolstr or None

The protocol for which the groups should be retrieved.

Returns: a list of groups

objects(groups=None, protocol=None, purposes=None, model_ids=None, **kwargs)[source]

Returns a set of PadFile objects for the specific query by the user.

Keyword Parameters:

groupsstr or [str] or None

One of the groups (“dev”, “eval”, “train”) or a tuple with several of them. If ‘None’ is given (this is the default), it is considered the same as a tuple with all possible values.

protocolstr or None

The protocol to consider

purposesstr or [str] or None

The purposes required to be retrieved (“real”, “attack”) or a tuple with several of them. If ‘None’ is given (this is the default), it is considered the same as a tuple with all possible values.

model_ids[various type]

This parameter is not supported in PAD databases yet

Returns: A list of PadFile objects considering all the filtering criteria.
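The filtering semantics described for groups and purposes (a single string, several values, or None meaning all possible values) can be sketched with plain Python. The FileEntry type and the select function are hypothetical stand-ins for illustration and do not reproduce the class's file-list handling.

```python
from collections import namedtuple

# Hypothetical minimal record standing in for a PadFile plus its list membership.
FileEntry = namedtuple("FileEntry", ["path", "group", "purpose"])

ALL_GROUPS = ("train", "dev", "eval")
ALL_PURPOSES = ("real", "attack")


def as_tuple(value, default):
    """None means all possible values; a single string becomes a one-element tuple."""
    if value is None:
        return default
    if isinstance(value, str):
        return (value,)
    return tuple(value)


def select(files, groups=None, purposes=None):
    """Keep the entries matching the requested groups and purposes."""
    groups = as_tuple(groups, ALL_GROUPS)
    purposes = as_tuple(purposes, ALL_PURPOSES)
    return [f for f in files if f.group in groups and f.purpose in purposes]


files = [
    FileEntry("a", "train", "real"),
    FileEntry("b", "train", "attack"),
    FileEntry("c", "dev", "real"),
    FileEntry("d", "eval", "attack"),
]
dev_and_eval = select(files, groups=("dev", "eval"))
train_attacks = select(files, groups="train", purposes="attack")
```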

tobjects(groups=None, protocol=None, model_ids=None, **kwargs)[source]

Returns a list of bob.bio.base.database.BioFile objects for enrolling T-norm models for score normalization.

Parameters
  • protocol (str or None) – The protocol to consider

  • model_ids (str or [str] or None) – Only retrieves the files for the provided list of model ids (claimed client id). If None is given (this is the default), no filter over the model_ids is performed.

  • groups (str or [str] or None) – The groups to which the models belong ('dev', 'eval').

Returns

A list of BioFile objects considering all the filtering criteria.

Return type

[BioFile]

zobjects(groups=None, protocol=None, **kwargs)[source]

Returns a list of BioFile objects to perform Z-norm score normalization.

Parameters
  • protocol (str or None) – The protocol to consider

  • groups (str or [str] or None) – The groups to which the clients belong ('dev', 'eval').

Returns

A list of File objects considering all the filtering criteria.

Return type

[BioFile]

class bob.pad.base.database.HighBioDatabase(filelists_directory=None, original_directory='[DB_DATA_DIRECTORY]', original_extension='.wav', db_name='', file_class=None, **kwargs)

Bases: bob.bio.base.database.FileListBioDatabase

Implements verification API for querying High database.

annotations(file)[source]

Reads the annotations for the given file id from file and returns them in a dictionary.

Parameters

file (BioFile) – The BioFile object for which the annotations should be read.

Returns

The annotations as a dictionary, e.g.: {'reye':(re_y,re_x), 'leye':(le_y,le_x)}

Return type

dict

arrange_by_client(files) → files_by_client[source]

Arranges the given list of files by client id. This function returns a list of lists of File objects.

Parameters:

filesbob.bio.base.database.BioFile

A list of files that should be split up by BioFile.client_id.

Returns:

files_by_client[[bob.bio.base.database.BioFile]]

The list of lists of files, where each sub-list groups the files with the same BioFile.client_id
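Grouping by client id can be sketched in a few lines. This is an illustration only: SimpleFile is a hypothetical stand-in for BioFile, and keeping clients in order of first appearance is an assumption.

```python
from collections import OrderedDict, namedtuple

# Hypothetical stand-in exposing only the attributes used here.
SimpleFile = namedtuple("SimpleFile", ["client_id", "path"])


def arrange_by_client_sketch(files):
    """Group files into sub-lists that share the same client_id."""
    by_client = OrderedDict()
    for f in files:
        by_client.setdefault(f.client_id, []).append(f)
    return list(by_client.values())


files = [
    SimpleFile(1, "a"),
    SimpleFile(2, "b"),
    SimpleFile(1, "c"),
]
files_by_client = arrange_by_client_sketch(files)
```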

client_id_from_model_id(model_id, group='dev')[source]

This wrapper around the PAD database has no knowledge of the model ids used in verification experiments, so we just assume that the client_id is the same as model_id, which is actually true for most of the verification databases as well.

model_ids_with_protocol(groups=None, protocol=None, **kwargs)[source]

This wrapper around the PAD database has no knowledge of the model ids used in verification experiments, so we just assume that the model_ids are the same as client ids, which is actually true for most of the verification databases as well.

objects(protocol=None, purposes=None, model_ids=None, groups=None, **kwargs)[source]

Maps objects method of PAD databases into objects method of Verification database

Parameters
  • protocol (str) – To distinguish two vulnerability scenarios, the protocol name should have either ‘-licit’ or ‘-spoof’ appended to it. For instance, if the DB has the protocol ‘general’, the name passed to this method should be ‘general-licit’ if we want to run verification experiments on bona fide data only, but it should be ‘general-spoof’ if we want to run it for the spoof scenario (the probes are attacks).

  • purposes ([str]) – This parameter is passed by the bob.bio.base verification experiment

  • model_ids ([object]) – This parameter is passed by the bob.bio.base verification experiment

  • groups ([str]) – We map the groups from (‘world’, ‘dev’, ‘eval’) used in verification experiments to (‘train’, ‘dev’, ‘eval’)

  • **kwargs – The rest of the parameters valid for a given database

Returns

Set of BioFiles that verification experiments expect.

Return type

[object]
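The protocol-name convention and the group mapping described above can be sketched as follows. Both helpers are hypothetical illustrations written from the parameter descriptions; they are not the class's code.

```python
def split_protocol(protocol):
    """Split e.g. 'general-licit' into the PAD protocol name and the scenario."""
    for suffix in ("-licit", "-spoof"):
        if protocol.endswith(suffix):
            return protocol[: -len(suffix)], suffix[1:]
    raise ValueError("protocol name must end with '-licit' or '-spoof'")


# Verification groups on the left, PAD groups on the right.
GROUP_MAP = {"world": "train", "dev": "dev", "eval": "eval"}


def map_groups(groups):
    """Translate verification group names into PAD group names."""
    return [GROUP_MAP[group] for group in groups]


pad_protocol, scenario = split_protocol("general-spoof")
pad_groups = map_groups(["world", "dev"])
```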

class bob.pad.base.database.HighPadDatabase(filelists_directory=None, original_directory='[DB_DATA_DIRECTORY]', original_extension='.wav', file_class=None, db_name='', **kwargs)

Bases: bob.pad.base.database.FileListPadDatabase

class bob.pad.base.database.PadDatabase(name, protocol='Default', original_directory=None, original_extension=None, **kwargs)

Bases: bob.bio.base.database.BioDatabase

This class represents the basic API for database access. Please use this class as a base class for your database access classes. Do not forget to call the constructor of this base class in your derived class.

Parameters:

name : str A unique name for the database.

protocol : str or None The name of the protocol that defines the default experimental setup for this database.

original_directory : str The directory where the original data of the database are stored.

original_extension : str The file name extension of the original data.

kwargs : key=value pairs The arguments of the bob.bio.base.database.BioDatabase base class constructor.

all_files(groups=('train', 'dev', 'eval'), flat=False)[source]

Returns all files of the database, respecting the current protocol. The files can be limited using the all_files_options in the constructor.

Parameters
  • groups (str or tuple or None) – The groups to get the data for. It should be one or more of ('train', 'dev', 'eval') or None

  • flat (bool) – if True, it will merge the real and attack files into one list.

Returns

files – The sorted and unique list of all files of the database.

Return type

[bob.pad.base.database.PadFile]

abstract annotations(file)[source]

Returns the annotations for the given File object, if available. You need to override this method in your high-level implementation. If your database does not have annotations, it should return None.

Parameters:

filebob.pad.base.database.PadFile

The file for which annotations should be returned.

Returns:

annotsdict or None

The annotations for the file, if available.

model_ids_with_protocol(groups = None, protocol = None, **kwargs) → ids[source]

Client-based PAD is not implemented.

abstract objects(groups=None, protocol=None, purposes=None, model_ids=None, **kwargs)[source]

This function returns lists of File objects, which fulfill the given restrictions.

Keyword parameters:

groupsstr or [str]

The groups of which the clients should be returned. Usually, groups are one or more elements of (‘train’, ‘dev’, ‘eval’)

protocol

The protocol for which the clients should be retrieved. The protocol is dependent on your database. If you do not have protocols defined, just ignore this field.

purposesstr or [str]

The purposes for which File objects should be retrieved. Usually it is either ‘real’ or ‘attack’.

model_ids[various type]

This parameter is not supported in PAD databases yet

original_file_names(files) → paths[source]

Returns the full paths of the real and attack data of the given PadFile objects.

Parameters:

files[[bob.pad.base.database.PadFile], [bob.pad.base.database.PadFile]]

The list of lists ([real, attack]) of file objects to retrieve the original data file names for.

Returns:

paths[str] or [[str]]

The paths extracted for the concatenated real+attack files, in the preserved order.
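The concatenation described above can be sketched as follows. This is an illustration under assumptions: SimplePadFile is a hypothetical stand-in for PadFile, and the full path is assumed to be the original directory joined with the file's path plus the original extension.

```python
import os
from collections import namedtuple

# Hypothetical stand-in exposing only the attributes used here.
SimplePadFile = namedtuple("SimplePadFile", ["client_id", "path"])


def original_file_names_sketch(files, original_directory, original_extension):
    """Return full paths for the real files followed by the attack files."""
    real, attack = files
    return [
        os.path.join(original_directory, f.path + original_extension)
        for f in real + attack  # order is preserved: real first, then attack
    ]


real = [SimplePadFile(1, "real/client001")]
attack = [SimplePadFile(1, "attack/client001")]
paths = original_file_names_sketch([real, attack], "data", ".wav")
```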

training_files(step = None, arrange_by_client = False) → files[source]

Returns all training File objects. This function needs to be implemented in derived class implementations.

Parameters:

The parameters are not applicable in this version of anti-spoofing experiments

Returns:

files[bob.pad.base.database.PadFile] or [[bob.pad.base.database.PadFile]]

The (arranged) list of files used for the training.

class bob.pad.base.database.PadFile(client_id, path, attack_type=None, file_id=None)

Bases: bob.bio.base.database.BioFile

A simple base class that defines basic properties of File object for the use in PAD experiments

Grid Configuration

Code related to grid is reused from bob.bio.base package. Please see the corresponding documentation.