Tools implemented in bob.pad.base¶
Please note that some parts of the code in this package depend on, and are reused from, the bob.bio.base package.
Summary¶
Base Classes¶
Most of the base classes are reused from bob.bio.base.
Only one base class specific to presentation attack detection, Algorithm, is implemented in this package.
bob.pad.base.algorithm.Algorithm | This is the base class for all anti-spoofing algorithms.
bob.pad.base.algorithm.Predictions | An algorithm that takes the precomputed predictions and uses them for scoring.
Implementations¶
This class represents the basic API for database access.
A simple base class that defines the basic properties of a File object for use in PAD experiments.
Preprocessors and Extractors¶
Preprocessors and Extractors from the bob.bio.base package can also be used in this package. Please see Tools implemented in bob.bio.base for more details.
Algorithms¶
-
class
bob.pad.base.algorithm.Algorithm(performs_projection=False, requires_projector_training=True, **kwargs)¶
Bases: object
This is the base class for all anti-spoofing algorithms. It defines the minimum requirements for all derived algorithm classes.
Call the constructor in derived class implementations. If your derived algorithm performs feature projection, register this in the constructor call. If the projector needs training, set that there as well.
Parameters:
- performs_projection (bool) – Set to True if your derived algorithm performs a projection. In that case, also implement the project() function, and load_projector() if necessary.
- requires_projector_training (bool) – Only valid when performs_projection = True. Set this flag to False when the projection is applied but the projector does not need to be trained.
- kwargs (key=value pairs) – A list of keyword arguments to be written in the __str__ function.
-
load_projector(projector_file)[source]¶
Loads the parameters required for feature projection from file. This function is usually useful in combination with the train_projector() function. In this base class implementation, it does nothing.
Please register performs_projection = True in the constructor to enable this function.
Parameters:
- projector_file (str) – The file to read the projector from.
-
project(feature) → projected[source]¶
This function will project the given feature. It must be overwritten by derived classes as soon as performs_projection = True is set in the constructor. It is assured that load_projector() was called once before the project function is executed.
Parameters:
- feature (object) – The feature to be projected.
Returns:
- projected (object) – The projected features. Must be writable with the write_feature() function and readable with the read_feature() function.
-
read_feature(feature_file) → feature[source]¶
Reads the projected feature from file. In this base class implementation, it uses bob.io.base.load() to do that. If you have a different format, please overwrite this function.
Please register performs_projection = True in the constructor to enable this function.
Parameters:
- feature_file (str or bob.io.base.HDF5File) – The file open for reading, or the file name to read from.
Returns:
- feature (object) – The feature that was read from file.
-
score(toscore) → score[source]¶
This function will compute the score for the given object toscore. It must be overwritten by derived classes.
Parameters:
- toscore (object) – The object to compute the score for. This will be the output of the extractor if performs_projection is False; otherwise it will be the output of the project method of the algorithm.
Returns:
- score (float) – A score value for the object toscore.
-
score_for_multiple_projections(toscore) → score[source]¶
This function will compute the score for a list of objects in toscore. It must be overwritten by derived classes.
Parameters:
- toscore ([object]) – A list of objects to compute the score for.
Returns:
- score (float) – A score value for the objects in toscore.
-
train_projector(training_features, projector_file)[source]¶
This function can be overwritten to train the feature projector. If you do this, please also register the function by calling this base class constructor and enabling the training with requires_projector_training = True.
Parameters:
- training_features ([object] or [[object]]) – A list of extracted features that can be used for training the projector. Features will be provided in a single list.
- projector_file (str) – The file to write. This file should be readable with the load_projector() function.
-
write_feature(feature, feature_file)[source]¶
Saves the given projected feature to a file with the given name. In this base class implementation:
If the given feature has a save attribute, it calls feature.save(bob.io.base.HDF5File(feature_file, 'w')). In this case, the given feature_file might be either a file name or a bob.io.base.HDF5File.
Otherwise, it uses bob.io.base.save() to do that.
If you have a different format, please overwrite this function.
Please register performs_projection = True in the constructor to enable this function.
Parameters:
- feature (object) – A feature as returned by the project() function, which should be written.
- feature_file (str or bob.io.base.HDF5File) – The file open for writing, or the file name to write to.
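The constructor flags and the project()/score() contract described above can be illustrated with a small sketch. To keep it runnable without Bob installed, the `Algorithm` class below is a stand-in written for this example, not the real bob.pad.base.algorithm.Algorithm, and `MeanOffsetPAD` with its `offset` parameter is purely hypothetical.

```python
class Algorithm:
    """Stand-in for the base class described above (interface only, an assumption)."""

    def __init__(self, performs_projection=False, requires_projector_training=True, **kwargs):
        self.performs_projection = performs_projection
        # Projector training only makes sense when a projection is performed.
        self.requires_projector_training = performs_projection and requires_projector_training
        self._kwargs = kwargs

    def score(self, toscore):
        raise NotImplementedError("Please overwrite this function in your derived class")


class MeanOffsetPAD(Algorithm):
    """Hypothetical toy algorithm: projects a feature vector onto its mean."""

    def __init__(self, offset=0.0):
        # Register the projection; this toy projector needs no training.
        super().__init__(performs_projection=True, requires_projector_training=False, offset=offset)
        self.offset = offset

    def project(self, feature):
        return sum(feature) / len(feature)

    def score(self, toscore):
        # toscore is the output of project() because performs_projection is True.
        return toscore - self.offset


algo = MeanOffsetPAD(offset=0.5)
projected = algo.project([0.2, 0.8, 1.1])
final_score = algo.score(projected)
```

A real derived class would inherit from bob.pad.base.algorithm.Algorithm instead of the stand-in and would typically also implement load_projector() and train_projector() when training is required.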
-
class
bob.pad.base.algorithm.LogRegr(C=1, frame_level_scores_flag=False, subsample_train_data_flag=False, subsampling_step=10, subsample_videos_flag=False, video_subsampling_step=3)¶
Bases: bob.pad.base.algorithm.Algorithm
This class is designed to train a Logistic Regression (LR) classifier given Frame Containers with features of the real and attack classes. The procedure is the following:
First, the input data is mean-std normalized using the mean and std of the real class only.
Second, the Logistic Regression classifier is trained on the normalized input features.
The input features are then classified using the pre-trained LR machine.
Parameters:
C (float) – Inverse of regularization strength in the LR classifier; must be positive. As in support vector machines, smaller values specify stronger regularization. Default: 1.0.
frame_level_scores_flag (bool) – Return scores for each frame individually if True. Otherwise, return a single score per video. Default: False.
subsample_train_data_flag (bool) – Uniformly subsample the training data if True. Default: False.
subsampling_step (int) – Training data subsampling step, only valid if subsample_train_data_flag = True. Default: 10.
subsample_videos_flag (bool) – Uniformly subsample the training videos if True. Default: False.
video_subsampling_step (int) – Training videos subsampling step, only valid if subsample_videos_flag = True. Default: 3.
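The first step of the procedure, normalizing both classes with statistics of the real class only, can be sketched as follows. This is a plain-Python illustration under stated assumptions: the helper name `mean_std_normalize` is hypothetical, features are lists of rows rather than numpy arrays, and the population standard deviation is used.

```python
from statistics import mean, pstdev


def mean_std_normalize(real, attack):
    """Normalize both classes column-wise with the mean/std of the real class only."""
    columns = list(zip(*real))  # transpose: one tuple per feature dimension
    features_mean = [mean(col) for col in columns]
    # Population std; a constant feature (std == 0) would need special handling.
    features_std = [pstdev(col) for col in columns]

    def normalize(rows):
        return [
            [(x - m) / s for x, m, s in zip(row, features_mean, features_std)]
            for row in rows
        ]

    return normalize(real), normalize(attack), features_mean, features_std


real = [[1.0, 10.0], [3.0, 14.0]]
attack = [[5.0, 12.0]]
real_norm, attack_norm, features_mean, features_std = mean_std_normalize(real, attack)
```

The same mean and std must be stored with the trained machine so that test features can be normalized identically before classification.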
-
load_lr_machine_and_mean_std(projector_file)[source]¶
Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in the projector_file string.
Parameters:
projector_file (str) – Absolute name of the file to load the trained projector from, as returned by the bob.pad.base framework.
Returns:
machine (object) – The loaded LR machine, as returned by the sklearn.linear_model module.
features_mean (1D numpy.ndarray) – Mean of the features.
features_std (1D numpy.ndarray) – Standard deviation of the features.
-
load_projector(projector_file)[source]¶
Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in the projector_file string.
This function sets the attributes self.lr_machine, self.features_mean and self.features_std of this class with the loaded values.
The function must be capable of reading the data saved with the train_projector() method of this class.
Please register performs_projection = True in the constructor to enable this function.
Parameters:
projector_file (str) – The file to read the projector from, as returned by the bob.pad.base framework. In this class the names of the files to read the projectors from are modified, see the load_machine and load_cascade_of_machines methods of this class for more details.
-
project(feature)[source]¶
This function computes a vector of scores for each sample in the input array of features. The following steps are applied:
First, the input data is mean-std normalized using the mean and std of the real class only.
The input features are then classified using the pre-trained LR machine.
Set performs_projection = True in the constructor to enable this function. It is assured that load_projector() was called before the project function is executed.
Parameters:
feature (FrameContainer or 2D numpy.ndarray) – Two types of inputs are accepted: a Frame Container containing the features of an individual, see bob.bio.video.utils.FrameContainer, or a 2D feature array of the size (N_samples x N_features).
Returns:
scores (1D numpy.ndarray) – Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class. In this case scores are probabilities.
-
save_lr_machine_and_mean_std(projector_file, machine, features_mean, features_std)[source]¶
Saves the LR machine, features mean and std to the hdf5 file. The absolute name of the file is specified in the projector_file string.
Parameters:
projector_file (str) – Absolute name of the file to save the data to, as returned by the bob.pad.base framework.
machine (object) – The LR machine to be saved, as returned by the sklearn.linear_model module.
features_mean (1D numpy.ndarray) – Mean of the features.
features_std (1D numpy.ndarray) – Standard deviation of the features.
-
score(toscore)[source]¶
Returns the probability of a sample belonging to the real class.
Parameters:
toscore (1D numpy.ndarray) – Vector with scores for each frame/sample defining the probability of the frame being a sample of the real class.
Returns:
score ([float]) – If frame_level_scores_flag = False, a single score is returned, one score per video. This score is placed into a list, because the score must be an iterable. The score is the probability of a sample belonging to the real class. If frame_level_scores_flag = True, a list of scores is returned, one score per frame/sample.
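The two return shapes controlled by frame_level_scores_flag can be sketched as below. The text above does not say how per-frame scores are reduced to one score per video; averaging is an assumption made for this illustration, and `aggregate_scores` is a hypothetical helper name.

```python
def aggregate_scores(frame_scores, frame_level_scores_flag=False):
    """Return per-frame scores, or a one-element list with a video-level score."""
    frame_scores = list(frame_scores)
    if frame_level_scores_flag:
        # One score per frame/sample.
        return frame_scores
    # Assumption: the video-level score is the mean of the frame scores.
    # It is wrapped in a list because the score must be an iterable.
    return [sum(frame_scores) / len(frame_scores)]


per_video = aggregate_scores([0.5, 1.0])
per_frame = aggregate_scores([0.5, 1.0], frame_level_scores_flag=True)
```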
-
subsample_train_videos(training_features, step)[source]¶
Uniformly selects a subset of Frame Containers from the input list.
Parameters:
training_features ([FrameContainer]) – A list of FrameContainers.
step (int) – Data selection step.
Returns:
training_features_subset ([FrameContainer]) – A list with the selected FrameContainers.
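Uniform selection with a fixed step amounts to keeping every step-th element of the list. A minimal sketch, assuming selection starts at the first element (the starting offset is not stated above) and using a hypothetical helper name:

```python
def subsample_uniformly(items, step):
    """Keep every `step`-th item, starting from the first one (assumed offset)."""
    return items[::step]


videos = ["video_%d" % i for i in range(7)]
subset = subsample_uniformly(videos, step=3)
```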
-
train_lr(real, attack, C)[source]¶
Trains the LR classifier given the real and attack classes. Prior to training, the data is mean-std normalized.
Parameters:
real (2D numpy.ndarray) – Training features for the real class.
attack (2D numpy.ndarray) – Training features for the attack class.
C (float) – Inverse of regularization strength in the LR classifier; must be positive. As in support vector machines, smaller values specify stronger regularization. Default: 1.0.
Returns:
machine (object) – A trained LR machine.
features_mean (1D numpy.ndarray) – Mean of the features.
features_std (1D numpy.ndarray) – Standard deviation of the features.
-
train_projector(training_features, projector_file)[source]¶
Trains the LR machine for feature projection and saves it to file. The flag requires_projector_training = True must be set to enable this function.
Parameters:
training_features ([[FrameContainer], [FrameContainer]]) – A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
projector_file (str) – The file to save the trained projector to, as returned by the bob.pad.base framework.
-
class
bob.pad.base.algorithm.MLP(hidden_units=(10, 10), max_iter=1000, precision=0.001, **kwargs)¶
Bases: bob.pad.base.algorithm.Algorithm
Interfaces an MLP classifier used for PAD.
-
precision¶
Criterion to stop the training: if the difference between the current and the last loss is smaller than this number, training stops.
- Type
-
project(feature)[source]¶
Project the given feature.
- Parameters
feature (numpy.ndarray) – The feature to classify.
- Returns
The value of the two units in the last layer of the MLP.
- Return type
-
score(toscore)[source]¶
Returns the probability of the real class.
- Parameters
toscore (numpy.ndarray) –
- Returns
Probability of the authentication attempt being real.
- Return type
-
train_projector(training_features, projector_file)[source]¶
Trains the MLP.
- Parameters
training_features (list of numpy.ndarray) – Data used to train the MLP. The real attempts are in training_features[0] and the attacks are in training_features[1].
projector_file (str) – Filename where to save the trained model.
-
-
class
bob.pad.base.algorithm.OneClassGMM(n_components=1, random_state=3, frame_level_scores_flag=False, covariance_type='full', reg_covar=1e-06)¶
Bases: bob.pad.base.algorithm.Algorithm
This class is designed to train a OneClassGMM based PAD system. The OneClassGMM is trained using data of one class (the real class) only. The procedure is the following:
First, the training data is mean-std normalized using the mean and std of the real class only.
Second, the OneClassGMM with n_components Gaussians is trained using samples of the real class.
The input features are then classified using the pre-trained OneClassGMM machine.
Parameters:
n_components (int) – Number of Gaussians in the OneClassGMM. Default: 1.
random_state (int) – A seed for the random number generator used in the initialization of the OneClassGMM. Default: 3.
frame_level_scores_flag (bool) – Return scores for each frame individually if True. Otherwise, return a single score per video. Default: False.
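The one-class idea can be illustrated with the simplest possible case: a single Gaussian (n_components = 1) fitted to one-dimensional real-class samples, with the log-likelihood used as the score. This is only a conceptual sketch in plain Python, not the sklearn.mixture-based machine used by the class; the function names and the toy data are hypothetical, and the normalization step is omitted because it does not change the ordering of scores in this one-dimensional case.

```python
import math
from statistics import mean, pstdev


def train_one_class_gaussian(real):
    """Fit a single 1D Gaussian to samples of the real class only."""
    return mean(real), pstdev(real)


def log_likelihood(x, mu, sigma):
    """Log-density of x under N(mu, sigma^2); higher means 'more like the real class'."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


real = [0.9, 1.0, 1.1, 1.0]  # toy real-class feature values
mu, sigma = train_one_class_gaussian(real)

bona_fide_score = log_likelihood(1.0, mu, sigma)  # close to the training data
attack_score = log_likelihood(3.0, mu, sigma)     # far from the training data
```

No attack samples are needed for training; an attack is detected only because it is unlikely under the model of the real class.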
-
load_gmm_machine_and_mean_std(projector_file)[source]¶
Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in the projector_file string.
Parameters:
projector_file (str) – Absolute name of the file to load the trained projector from, as returned by the bob.pad.base framework.
Returns:
machine (object) – The loaded OneClassGMM machine, as returned by the sklearn.mixture module.
features_mean (1D numpy.ndarray) – Mean of the features.
features_std (1D numpy.ndarray) – Standard deviation of the features.
-
load_projector(projector_file)[source]¶
Loads the machine, features mean and std from the hdf5 file. The absolute name of the file is specified in the projector_file string.
This function sets the attributes self.machine, self.features_mean and self.features_std of this class with the loaded values.
The function must be capable of reading the data saved with the train_projector() method of this class.
Please register performs_projection = True in the constructor to enable this function.
Parameters:
projector_file (str) – The file to read the projector from, as returned by the bob.pad.base framework. In this class the names of the files to read the projectors from are modified, see the load_machine and load_cascade_of_machines methods of this class for more details.
-
project(feature)[source]¶
This function computes a vector of scores for each sample in the input array of features. The following steps are applied:
First, the input data is mean-std normalized using the mean and std of the real class only.
The input features are then classified using the pre-trained OneClassGMM machine.
Set performs_projection = True in the constructor to enable this function. It is assured that load_projector() was called before the project function is executed.
Parameters:
feature (FrameContainer or 2D numpy.ndarray) – Two types of inputs are accepted: a Frame Container containing the features of an individual, see bob.bio.video.utils.FrameContainer, or a 2D feature array of the size (N_samples x N_features).
Returns:
scores (1D numpy.ndarray) – Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class. In this case scores are the weighted log probabilities.
-
save_gmm_machine_and_mean_std(projector_file, machine, features_mean, features_std)[source]¶
Saves the OneClassGMM machine, features mean and std to the hdf5 file. The absolute name of the file is specified in the projector_file string.
Parameters:
projector_file (str) – Absolute name of the file to save the data to, as returned by the bob.pad.base framework.
machine (object) – The OneClassGMM machine to be saved, as returned by the sklearn.mixture module.
features_mean (1D numpy.ndarray) – Mean of the features.
features_std (1D numpy.ndarray) – Standard deviation of the features.
-
score(toscore)[source]¶
Returns a probability of a sample being a real class.
Parameters:
toscore (1D numpy.ndarray) – Vector with scores for each frame/sample defining the probability of the frame being a sample of the real class.
Returns:
score ([float]) – If frame_level_scores_flag = False, a single score is returned, one score per video. This score is placed into a list, because the score must be an iterable. The score is a probability of a sample being a real class. If frame_level_scores_flag = True, a list of scores is returned, one score per frame/sample.
-
train_gmm(real)[source]¶
Trains the OneClassGMM classifier given the real class. Prior to training, the data is mean-std normalized.
Parameters:
real (2D numpy.ndarray) – Training features for the real class.
Returns:
machine (object) – A trained OneClassGMM machine.
features_mean (1D numpy.ndarray) – Mean of the features.
features_std (1D numpy.ndarray) – Standard deviation of the features.
-
train_projector(training_features, projector_file)[source]¶
Trains the OneClassGMM for feature projection and saves it to file. The flag requires_projector_training = True must be set to enable this function.
Parameters:
training_features ([[FrameContainer], [FrameContainer]]) – A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
projector_file (str) – The file to save the trained projector to, as returned by the bob.pad.base framework.
-
class
bob.pad.base.algorithm.OneClassGMM2(number_of_gaussians, kmeans_training_iterations=25, gmm_training_iterations=25, training_threshold=0.0005, variance_threshold=0.0005, update_weights=True, update_means=True, update_variances=True, n_threads=40, **kwargs)¶
Bases: bob.pad.base.algorithm.Algorithm
A one-class GMM implementation based on Bob's GMM implementation, which is more stable than the scikit-learn one.
-
load_projector(projector_file)[source]¶
Loads the parameters required for feature projection from file. This function is usually useful in combination with the train_projector() function. In this base class implementation, it does nothing.
Please register performs_projection = True in the constructor to enable this function.
Parameters:
- projector_file (str) – The file to read the projector from.
-
project(feature) → projected[source]¶
This function will project the given feature. It must be overwritten by derived classes as soon as performs_projection = True is set in the constructor. It is assured that load_projector() was called once before the project function is executed.
Parameters:
- feature (object) – The feature to be projected.
Returns:
- projected (object) – The projected features. Must be writable with the write_feature() function and readable with the read_feature() function.
-
score(toscore) → score[source]¶
This function will compute the score for the given object toscore. It must be overwritten by derived classes.
Parameters:
- toscore (object) – The object to compute the score for. This will be the output of the extractor if performs_projection is False; otherwise it will be the output of the project method of the algorithm.
Returns:
- score (float) – A score value for the object toscore.
-
train_projector(training_features, projector_file)[source]¶
This function can be overwritten to train the feature projector. If you do this, please also register the function by calling this base class constructor and enabling the training with requires_projector_training = True.
Parameters:
- training_features ([object] or [[object]]) – A list of extracted features that can be used for training the projector. Features will be provided in a single list.
- projector_file (str) – The file to write. This file should be readable with the load_projector() function.
-
-
class
bob.pad.base.algorithm.PadLDA(lda_subspace_dimension=None, pca_subspace_dimension=None, use_pinv=False, **kwargs)¶
Bases: bob.bio.base.algorithm.LDA
Wrapper for bob.bio.base.algorithm.LDA.
Here, LDA is used in a PAD context. This means that the feature is projected onto a one-dimensional subspace, and the projected value acts as a score.
For more details, you may want to have a look at the bob.learn.linear documentation.
-
lda_subspace_dimension¶
The dimension of the LDA subspace. In the PAD case, the default value is always used, and corresponds to the number of classes in the training set (i.e. 2).
- Type
-
pca_subspace_dimension¶
The dimension of the PCA subspace applied to the data before applying LDA.
- Type
-
score(model, probe) → float[source]¶
Computes the distance of the model to the probe using the distance function specified in the constructor.
Parameters:
- model (2D numpy.ndarray) – The model storing all enrollment features.
- probe (1D numpy.ndarray) – The probe feature vector in Fisher space.
Returns:
- score (float) – A similarity value between model and probe.
-
-
class
bob.pad.base.algorithm.Predictions(**kwargs)¶
Bases: bob.pad.base.algorithm.Algorithm
An algorithm that takes the precomputed predictions and uses them for scoring.
-
score(toscore) → score[source]¶
This function will compute the score for the given object toscore. It must be overwritten by derived classes.
Parameters:
- toscore (object) – The object to compute the score for. This will be the output of the extractor if performs_projection is False; otherwise it will be the output of the project method of the algorithm.
Returns:
- score (float) – A score value for the object toscore.
-
-
class
bob.pad.base.algorithm.SVM(machine_type='C_SVC', kernel_type='RBF', n_samples=10000, trainer_grid_search_params={'cost': [0.03125, 0.125, 0.5, 2, 8, 32, 128, 512, 2048, 8192, 32768], 'gamma': [3.0517578125e-05, 0.0001220703125, 0.00048828125, 0.001953125, 0.0078125, 0.03125, 0.125, 0.5, 2, 8]}, mean_std_norm_flag=False, frame_level_scores_flag=False, save_debug_data_flag=True, reduced_train_data_flag=False, n_train_samples=50000)¶
Bases: bob.pad.base.algorithm.Algorithm
This class is designed to train an SVM given features (either numpy arrays or Frame Containers) from the real and attack classes. The trained SVM is then used to classify the testing data as either real or attack. The SVM is trained in two stages. First, the best parameters for the SVM are estimated using train and cross-validation subsets. The size of the subsets used in hyper-parameter tuning is defined by the n_samples parameter of this class. Once the best parameters are determined, the SVM machine is trained using the complete training set.
Parameters:
machine_type (str) – A type of the SVM machine. Please check bob.learn.libsvm for more details. Default: 'C_SVC'.
kernel_type (str) – A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details. Default: 'RBF'.
n_samples (int) – Number of uniformly selected feature vectors per class defining the sizes of the subsets used in the hyper-parameter grid search.
trainer_grid_search_params (dict) – Dictionary containing the hyper-parameters of the SVM to be tested in the grid search. Default: {'cost': [2**p for p in range(-5, 16, 2)], 'gamma': [2**p for p in range(-15, 4, 2)]}.
mean_std_norm_flag (bool) – Perform mean-std normalization of data if set to True. Default: False.
frame_level_scores_flag (bool) – Return scores for each frame individually if True. Otherwise, return a single score per video. Should be used only when features are in Frame Containers. Default: False.
save_debug_data_flag (bool) – Save the data, which might be useful for debugging, if True. Default: True.
reduced_train_data_flag (bool) – Reduce the amount of final training samples if set to True. Default: False.
n_train_samples (int) – Number of uniformly selected feature vectors per class defining the sizes of the subsets used in the final training of the SVM. Default: 50000.
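The default trainer_grid_search_params is written as two list comprehensions over powers of two; expanding them gives exactly the values shown in the class signature. The short sketch below does that expansion and, assuming the grid search tries every cost/gamma pair as grid searches conventionally do, enumerates the candidates. The variable names are illustrative.

```python
import itertools

# Default grid as documented: powers of two with exponents in steps of 2.
default_grid = {
    "cost": [2 ** p for p in range(-5, 16, 2)],
    "gamma": [2 ** p for p in range(-15, 4, 2)],
}

# Assumption: every (cost, gamma) combination is a candidate in the grid search.
candidates = [
    {"cost": cost, "gamma": gamma}
    for cost, gamma in itertools.product(default_grid["cost"], default_grid["gamma"])
]
```

With 11 cost values and 10 gamma values this yields 110 candidate settings, which is why the search is run on subsets of n_samples vectors per class rather than on the full training set.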
-
comp_prediction_precision(machine, real, attack)[source]¶
This function computes the precision of the predictions as a ratio of correctly classified samples to the total number of samples.
Parameters:
machine (object) – A pre-trained SVM machine.
real (2D numpy.ndarray) – Array of features representing the real class.
attack (2D numpy.ndarray) – Array of features representing the attack class.
Returns:
precision (float) – The precision of the predictions.
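The ratio described above can be sketched without an actual SVM machine. In this illustration the machine is replaced by a hypothetical `predict` callable, and the label convention (+1 for real, -1 for attack) is an assumption, not something stated in the text.

```python
def prediction_precision(predict, real, attack):
    """Ratio of correctly classified samples to the total number of samples."""
    correct_real = sum(1 for sample in real if predict(sample) == 1)
    correct_attack = sum(1 for sample in attack if predict(sample) == -1)
    return (correct_real + correct_attack) / (len(real) + len(attack))


def threshold_classifier(sample):
    """Hypothetical stand-in for a trained machine: +1 (real) above 0.5, else -1."""
    return 1 if sample > 0.5 else -1


real = [0.9, 0.8, 0.4]   # one real sample falls below the threshold
attack = [0.1, 0.6]      # one attack sample falls above the threshold
precision = prediction_precision(threshold_classifier, real, attack)
```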
-
load_projector(projector_file)[source]¶
Loads the pretrained projector/SVM from file to perform a feature projection. This function is usually useful in combination with the train_projector() function.
Please register performs_projection = True in the constructor to enable this function.
Parameters:
projector_file (str) – The file to read the projector from.
-
project(feature)[source]¶
This function computes class probabilities for the input feature using the pretrained SVM. The feature in this case is a Frame Container with features for each frame. The probabilities will be computed and returned for each frame.
Set performs_projection = True in the constructor to enable this function. It is assured that load_projector() was called before the project function is executed.
Parameters:
feature (object) – A Frame Container containing the features of an individual, see bob.bio.video.utils.FrameContainer.
Returns:
probabilities (1D or 2D numpy.ndarray) – 2D in the case of a two-class SVM: an array containing class probabilities for each frame. The first column contains the probabilities of each frame belonging to the real class. The second column contains the probabilities of each frame belonging to the attack class. 1D in the case of a one-class SVM: a vector with scores for each frame describing how well it belongs to the real class. Must be writable with the write_feature function and readable with the read_feature function.
-
score(toscore)[source]¶
Returns the probability of a sample belonging to the real class.
Parameters:
toscore (1D or 2D numpy.ndarray) – 2D in the case of a two-class SVM: an array containing class probabilities for each frame. The first column contains the probabilities of each frame belonging to the real class. The second column contains the probabilities of each frame belonging to the attack class. 1D in the case of a one-class SVM: a vector with scores for each frame describing how well it belongs to the real class.
Returns:
score (float or a 1D numpy.ndarray) – If frame_level_scores_flag = False, a single score is returned, one score per video. If frame_level_scores_flag = True, a 1D array of scores is returned, one score per frame. In both cases the score is the probability of a sample belonging to the real class.
-
score_for_multiple_projections(toscore)[source]¶
Returns a list of scores computed by the score method of this class.
Parameters:
toscore (1D or 2D numpy.ndarray) – 2D in the case of a two-class SVM: an array containing class probabilities for each frame. The first column contains the probabilities of each frame belonging to the real class. The second column contains the probabilities of each frame belonging to the attack class. 1D in the case of a one-class SVM: a vector with scores for each frame describing how well it belongs to the real class.
Returns:
list_of_scores ([float]) – A list containing the scores.
-
train_projector(training_features, projector_file)[source]¶
Trains the SVM feature projector and saves the trained SVM to the given file. The flag requires_projector_training = True must be set to enable this function.
Parameters:
training_features ([[FrameContainer], [FrameContainer]]) – A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
projector_file (str) – The file to save the trained projector to. This file should be readable with the load_projector() function.
-
train_svm(training_features, n_samples=10000, machine_type='C_SVC', kernel_type='RBF', trainer_grid_search_params={'cost': [0.03125, 0.125, 0.5, 2, 8, 32, 128, 512, 2048, 8192, 32768], 'gamma': [3.0517578125e-05, 0.0001220703125, 0.00048828125, 0.001953125, 0.0078125, 0.03125, 0.125, 0.5, 2, 8]}, mean_std_norm_flag=False, projector_file='', save_debug_data_flag=True, reduced_train_data_flag=False, n_train_samples=50000)[source]¶
First, this function tunes the hyper-parameters of the SVM classifier using a grid search on subsets of the training data. Train and cross-validation subsets for both classes are formed from the available input training_features.
Once successful parameters are determined, the SVM is trained on the whole training data set. The resulting machine is returned by the function.
Parameters:
training_features ([[FrameContainer], [FrameContainer]]) – A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
n_samples (int) – Number of uniformly selected feature vectors per class defining the sizes of the subsets used in the hyper-parameter grid search.
machine_type (str) – A type of the SVM machine. Please check bob.learn.libsvm for more details.
kernel_type (str) – A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.
trainer_grid_search_params (dict) – Dictionary containing the hyper-parameters of the SVM to be tested in the grid search.
mean_std_norm_flag (bool) – Perform mean-std normalization of data if set to True. Default: False.
projector_file (str) – The name of the file to save the trained projector to. Only the path of this file is used in this function. The file debug_data.hdf5 will be saved in this path. This file contains information which might be useful for debugging.
save_debug_data_flag (bool) – Save the data, which might be useful for debugging, if True. Default: True.
reduced_train_data_flag (bool) – Reduce the amount of final training samples if set to True. Default: False.
n_train_samples (int) – Number of uniformly selected feature vectors per class defining the sizes of the subsets used in the final training of the SVM. Default: 50000.
Returns:
machine (object) – A trained SVM machine.
-
class
bob.pad.base.algorithm.SVMCascadePCA(machine_type='C_SVC', kernel_type='RBF', svm_kwargs={'cost': 1, 'gamma': 0}, N=2, pos_scores_slope=0.01, frame_level_scores_flag=False)¶
Bases: bob.pad.base.algorithm.Algorithm
This class is designed to train a cascade of SVMs given Frame Containers with features of the real and attack classes. The procedure is the following:
First, the input data is mean-std normalized.
Second, the PCA is trained on the normalized input features. Only the features of the real class are used in PCA training, both for one-class and two-class SVMs.
The features are then projected using the trained PCA machine.
Prior to SVM training, the features are again mean-std normalized.
Next, an SVM machine is trained for each group of N projected features. The projected features corresponding to the highest eigenvalues are selected first. N is usually small, N = (2, 3). So, if N = 2, the first SVM is trained on projected features 1 and 2, the second SVM is trained on projected features 3 and 4, and so on.
These SVMs then form a cascade of classifiers. The input feature vector is projected using the PCA machine and passed through all classifiers in the cascade. The decision is then made by majority voting.
Both one-class SVM and two-class SVM cascades can be trained. In this implementation the grid search of SVM parameters is not supported.
Parameters:
machine_type (str) – A type of the SVM machine. Please check bob.learn.libsvm for more details. Default: 'C_SVC'.
kernel_type (str) – A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details. Default: 'RBF'.
svm_kwargs (dict) – Dictionary containing the hyper-parameters of the SVM. Default: {'cost': 1, 'gamma': 0}.
N (int) – The number of features to be used for training a single SVM machine in the cascade. Default: 2.
pos_scores_slope (float) – The positive scores returned by the SVM cascade will be multiplied by this constant prior to majority voting. Default: 0.01.
frame_level_scores_flag (bool) – Return scores for each frame individually if True. Otherwise, return a single score per video. Default: False.
-
combine_scores_of_svm_cascade(scores_array, pos_scores_slope)[source]¶ First, multiply positive scores by the constant pos_scores_slope in the input 2D array. The constant is usually small, making the impact of negative scores more significant. Second, a single score per sample is obtained by averaging the pre-modified scores of the cascade.
Parameters:
scores_array : 2D numpy.ndarray 2D score array of the size (N_samples x N_scores).
pos_scores_slope : float The positive scores returned by the SVM cascade will be multiplied by this constant prior to majority voting. Default: 0.01.
Returns:
scores : 1D numpy.ndarray Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class.
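The two steps described for this method can be sketched as follows. This is a minimal illustration on plain Python lists rather than numpy arrays; the name combine_scores_sketch is hypothetical and the code is not the library's implementation.

```python
def combine_scores_sketch(scores_array, pos_scores_slope=0.01):
    """Scale positive scores, then average the scores of each sample."""
    combined = []
    for sample_scores in scores_array:
        # Step 1: damp positive scores so that negative ones weigh more.
        scaled = [s * pos_scores_slope if s > 0 else s for s in sample_scores]
        # Step 2: one score per sample, the mean over the cascade outputs.
        combined.append(sum(scaled) / len(scaled))
    return combined


# Two samples, each scored by a cascade of two SVMs.
scores = combine_scores_sketch([[1.0, -1.0], [2.0, 4.0]], pos_scores_slope=0.5)
```

With a small slope, a single negative vote from one SVM in the cascade is enough to pull the combined score of a sample below zero.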
-
comp_prediction_precision(machine, real, attack)[source]¶ This function computes the precision of the predictions as a ratio of correctly classified samples to the total number of samples.
Parameters:
machine : object A pre-trained SVM machine.
real : 2D numpy.ndarray Array of features representing the real class.
attack : 2D numpy.ndarray Array of features representing the attack class.
Returns:
precision : float The precision of the predictions.
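The ratio described above can be sketched with a stand-in predictor. The label convention (+1 for real, -1 for attack) and the names prediction_precision_sketch and toy_predict are assumptions made for illustration, not part of bob.pad.base.

```python
def prediction_precision_sketch(predict, real, attack):
    """Ratio of correctly classified samples to the total number of samples."""
    # Assumption: `predict` returns +1 for "real" and -1 for "attack".
    correct = sum(1 for x in real if predict(x) == 1)
    correct += sum(1 for x in attack if predict(x) == -1)
    return correct / (len(real) + len(attack))


def toy_predict(x):
    """Toy stand-in for a trained machine: thresholds the first feature."""
    return 1 if x[0] > 0 else -1


precision = prediction_precision_sketch(
    toy_predict, real=[[1.0], [2.0], [-0.5]], attack=[[-1.0]])
```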
-
get_cascade_file_names(projector_file, projector_file_name)[source]¶ Get the list of file names storing the cascade of machines. The location of the files is specified in the path component of the projector_file argument.
Parameters:
projector_file : str Absolute name of the file to load the trained projector from, as returned by the bob.pad.base framework. In this function only the path component is used.
projector_file_name : str The common string in the names of files storing the cascade of pretrained machines. Name without extension.
Returns:
cascade_file_names : [str] A list of relative file names storing the cascade of machines.
-
get_data_start_end_idx(data, N)[source]¶ Get the indexes used to select the subsets of data related to the cascade. The first (n_machines - 1) SVMs will be trained using N features each. The last SVM will be trained using the remaining features, whose number is less than or equal to N.
Parameters:
data : 2D numpy.ndarray Data array containing the training features. The dimensionality is (N_samples x N_features).
N : int Number of features per single SVM.
Returns:
idx_start : [int] Starting indexes for data subsets.
idx_end : [int] End indexes for data subsets.
n_machines : int Number of SVMs to be trained.
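As an illustration of the splitting rule, the sketch below computes the index lists from the number of feature columns. It takes the feature count directly instead of the data array, treats the end indexes as exclusive, and uses the hypothetical name start_end_idx_sketch; all three are assumptions.

```python
import math


def start_end_idx_sketch(n_features, N):
    """Split n_features columns into consecutive groups of at most N."""
    n_machines = math.ceil(n_features / N)
    idx_start = [i * N for i in range(n_machines)]
    # Assumption: end indexes are exclusive, as in Python slicing.
    idx_end = [min((i + 1) * N, n_features) for i in range(n_machines)]
    return idx_start, idx_end, n_machines


# Five projected features, two features per SVM: the last SVM gets one.
idx_start, idx_end, n_machines = start_end_idx_sketch(n_features=5, N=2)
```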
-
load_cascade_of_machines(projector_file, projector_file_name)[source]¶ Loads a cascade of machines from the hdf5 files. The name of the file is specified in the projector_file_name string and will be augmented with the number of the machine. The location is specified in the path component of the projector_file string.
Parameters:
projector_file : str Absolute name of the file to load the trained projector from, as returned by the bob.pad.base framework. In this function only the path component is used.
projector_file_name : str The relative name of the file to load the machine from. This name will be augmented with the number of the machine. Name without extension.
Returns:
machines : dict A cascade of machines. The key in the dictionary is the number of the machine, the value is the machine itself.
-
load_machine(projector_file, projector_file_name)[source]¶ Loads the machine from the hdf5 file. The name of the file is specified in the projector_file_name string. The location is specified in the path component of the projector_file string.
Parameters:
projector_file : str Absolute name of the file to load the trained projector from, as returned by the bob.pad.base framework. In this function only the path component is used.
projector_file_name : str The relative name of the file to load the machine from. Name without extension.
Returns:
machine : object A machine loaded from file.
-
load_projector(projector_file)[source]¶ Load the pretrained PCA machine and a cascade of SVM classifiers from files to perform feature projection. This function sets the attributes self.pca_machine and self.svm_machines of this class with the loaded machines.
The function must be capable of reading the data saved with the train_projector() method of this class.
Please register performs_projection = True in the constructor to enable this function.
Parameters:
projector_file : str The file to read the projector from, as returned by the bob.pad.base framework. In this class the names of the files to read the projectors from are modified, see the load_machine and load_cascade_of_machines methods of this class for more details.
-
project(feature)[source]¶ This function computes a vector of scores, one for each sample in the input array of features. The following steps are applied:
Convert the input array to a numpy array if necessary.
Project the features using the pretrained PCA machine.
Apply the cascade of SVMs to the projected features.
Compute a single score per sample by combining the scores produced by the cascade of SVMs. The combination is done using the combine_scores_of_svm_cascade method of this class.
Set performs_projection = True in the constructor to enable this function. It is assured that load_projector() was called before the project function is executed.
Parameters:
feature : FrameContainer or 2D numpy.ndarray Two types of inputs are accepted. A Frame Container containing the features of an individual, see bob.bio.video.utils.FrameContainer. Or a 2D feature array of the size (N_samples x N_features).
Returns:
scores : 1D numpy.ndarray Vector of scores. Scores for the real class are expected to be higher than the scores of the negative / attack class.
-
save_cascade_of_machines(projector_file, projector_file_name, machines)[source]¶ Saves a cascade of machines to the hdf5 files. The name of the file is specified in the projector_file_name string and will be augmented with the number of the machine. The location is specified in the path component of the projector_file string.
Parameters:
projector_file : str Absolute name of the file to save the trained projector to, as returned by the bob.pad.base framework. In this function only the path component is used.
projector_file_name : str The relative name of the file to save the machine to. This name will be augmented with the number of the machine. Name without extension.
machines : dict A cascade of machines. The key in the dictionary is the number of the machine, the value is the machine itself.
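The file naming used when saving and loading the cascade can be pictured with the sketch below. The underscore separator, the ".hdf5" extension, zero-based numbering and the name cascade_file_paths_sketch are all assumptions; the documentation only states that the name is augmented with the machine number and that the directory comes from projector_file.

```python
import os


def cascade_file_paths_sketch(projector_file, projector_file_name, n_machines):
    """Build one file path per machine in the cascade."""
    # Only the directory part of projector_file is used.
    directory = os.path.dirname(projector_file)
    # Assumption: "<name>_<number>.hdf5"; the real naming scheme may differ.
    return [
        os.path.join(directory, "{}_{}.hdf5".format(projector_file_name, i))
        for i in range(n_machines)
    ]


paths = cascade_file_paths_sketch("/tmp/exp/Projector.hdf5", "svm_machine", 2)
```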
-
save_machine(projector_file, projector_file_name, machine)[source]¶ Saves the machine to the hdf5 file. The name of the file is specified in the projector_file_name string. The location is specified in the path component of the projector_file string.
Parameters:
projector_file : str Absolute name of the file to save the trained projector to, as returned by the bob.pad.base framework. In this function only the path component is used.
projector_file_name : str The relative name of the file to save the machine to. Name without extension.
machine : object The machine to be saved.
-
score(toscore)[source]¶ Returns the probability of a sample belonging to the real class.
Parameters:
toscore : 1D or 2D numpy.ndarray 2D in the case of a two-class SVM: an array containing class probabilities for each frame. The first column contains the probability of each frame belonging to the real class. The second column contains the probability of each frame belonging to the attack class. 1D in the case of a one-class SVM: a vector with one score per frame, indicating membership of the real class.
Returns:
score : [float] If frame_level_scores_flag = False a single score is returned, one score per video. This score is placed into a list, because the score must be an iterable. The score is the probability of a sample belonging to the real class. If frame_level_scores_flag = True a list of scores is returned, one score per frame/sample.
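The difference between frame-level and video-level scoring can be sketched as below. Taking the mean over frames for the video-level score is an assumption, since the text above does not say how the single score is derived; score_sketch is a hypothetical name and plain lists stand in for numpy arrays.

```python
def score_sketch(toscore, frame_level_scores_flag=False):
    """Return per-frame scores, or a one-element list with a video score."""
    # 2D input (two-class SVM): keep the first column, the "real" probability.
    # 1D input (one-class SVM): use the values as they are.
    frame_scores = [
        row[0] if isinstance(row, (list, tuple)) else row for row in toscore
    ]
    if frame_level_scores_flag:
        return frame_scores
    # Assumption: the video-level score is the mean over all frames.
    return [sum(frame_scores) / len(frame_scores)]


probabilities = [[0.75, 0.25], [0.25, 0.75]]
video = score_sketch(probabilities)
frames = score_sketch(probabilities, frame_level_scores_flag=True)
```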
-
train_pca(data)[source]¶ Train PCA given an input array of feature vectors. The data is mean-std normalized prior to PCA training.
Parameters:
data : 2D numpy.ndarray Array of feature vectors of the size (N_samples x N_features). The features must be already mean-std normalized.
Returns:
machine : bob.learn.linear.Machine The PCA machine that has been trained. The mean-std normalizers are also set in the machine.
eig_vals : 1D numpy.ndarray The eigenvalues of the PCA projection.
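Mean-std normalization, which is referred to throughout this class, can be sketched per feature column as below. The use of the population standard deviation and the name mean_std_normalize_sketch are assumptions; constant columns (zero standard deviation) are not handled in this sketch.

```python
import statistics


def mean_std_normalize_sketch(data):
    """Subtract the column mean and divide by the column std."""
    columns = list(zip(*data))
    means = [statistics.mean(c) for c in columns]
    # Assumption: population standard deviation (divide by N_samples).
    stds = [statistics.pstdev(c) for c in columns]
    normalized = [
        [(x - m) / s for x, m, s in zip(row, means, stds)] for row in data
    ]
    return normalized, means, stds


normalized, means, stds = mean_std_normalize_sketch([[1.0, 10.0], [3.0, 30.0]])
```

Keeping the means and standard deviations alongside the machine matters because the same normalizers have to be applied to new features at projection time.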
-
train_pca_svm_cascade(real, attack, machine_type, kernel_type, svm_kwargs, N)[source]¶ This function is designed to train the cascade of SVMs given features of the real and attack classes. The procedure is the following:
First, the PCA machine is trained, also incorporating mean-std feature normalization. Only the features of the real class are used in PCA training, both for one-class and two-class SVMs.
The features are next projected using the trained PCA machine.
Next, an SVM machine is trained for each group of N projected features. Prior to SVM training the features are again mean-std normalized. Projected features corresponding to the highest eigenvalues are selected first. N is usually small, N = (2, 3). So, if N = 2, the first SVM is trained for projected features 1 and 2, the second SVM is trained for projected features 3 and 4, and so on.
Both one-class SVM and two-class SVM cascades can be trained. In this implementation the grid search of SVM parameters is not supported.
Parameters:
real : 2D numpy.ndarray Training features for the real class.
attack : 2D numpy.ndarray Training features for the attack class. If machine_type == ‘ONE_CLASS’ this argument can be anything, it will be skipped.
machine_type : str A type of the SVM machine. Please check bob.learn.libsvm for more details.
kernel_type : str A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.
svm_kwargs : dict Dictionary containing the hyper-parameters of the SVM.
N : int The number of features to be used for training a single SVM machine in the cascade.
Returns:
pca_machine : object A trained PCA machine.
svm_machines : dict A cascade of SVM machines.
-
train_projector(training_features, projector_file)[source]¶ Train PCA and a cascade of SVMs for feature projection and save them to files. The requires_projector_training flag must be set to True to enable this function.
Parameters:
training_features : [[FrameContainer], [FrameContainer]] A list containing two elements: [0] - a list of Frame Containers with feature vectors for the real class; [1] - a list of Frame Containers with feature vectors for the attack class.
projector_file : str The file to save the trained projector to, as returned by the bob.pad.base framework. In this class the names of the files to save the projectors to are modified, see the save_machine and save_cascade_of_machines methods of this class for more details.
-
train_svm(real, attack, machine_type, kernel_type, svm_kwargs)[source]¶ A one-class or two-class SVM is trained in this method given input features. The value of the attack argument is not important in the case of a one-class SVM. Prior to training the data is mean-std normalized.
Parameters:
real : 2D numpy.ndarray Training features for the real class.
attack : 2D numpy.ndarray Training features for the attack class. If machine_type == ‘ONE_CLASS’ this argument can be anything, it will be skipped.
machine_type : str A type of the SVM machine. Please check bob.learn.libsvm for more details.
kernel_type : str A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.
svm_kwargs : dict Dictionary containing the hyper-parameters of the SVM.
Returns:
machine : object A trained SVM machine. The mean-std normalizers are also set in the machine.
-
train_svm_cascade(real, attack, machine_type, kernel_type, svm_kwargs, N)[source]¶ Train a cascade of SVMs, one SVM machine per N features. N is usually small, N = (2, 3). So, if N = 2, the first SVM is trained for features 1 and 2, the second SVM is trained for features 3 and 4, and so on.
Both one-class and two-class SVM cascades can be trained. The value of the attack argument is not important in the case of a one-class SVM.
The data is mean-std normalized prior to SVM cascade training.
Parameters:
real : 2D numpy.ndarray Training features for the real class.
attack : 2D numpy.ndarray Training features for the attack class. If machine_type == ‘ONE_CLASS’ this argument can be anything, it will be skipped.
machine_type : str A type of the SVM machine. Please check bob.learn.libsvm for more details.
kernel_type : str A type of kernel for the SVM machine. Please check bob.learn.libsvm for more details.
svm_kwargs : dict Dictionary containing the hyper-parameters of the SVM.
N : int The number of features to be used for training a single SVM machine in the cascade.
Returns:
machines : dict A dictionary containing a cascade of trained SVM machines.
-
class bob.pad.base.algorithm.VideoPredictions(axis=1, frame_level_scoring=False, **kwargs)¶
Bases: bob.pad.base.algorithm.Algorithm
An algorithm that takes the precomputed predictions and uses them for scoring.
-
score(toscore) → score[source]¶ This function will compute the score for the given object toscore. It must be overwritten by derived classes.
Parameters:
toscore : object The object to compute the score for. This will be the output of the extractor if performs_projection is False, otherwise this will be the output of the project method of the algorithm.
Returns:
score : float A score value for the object toscore.
-
Databases¶
-
class bob.pad.base.database.Client(client_id)¶
Bases: object
The clients of this database contain ONLY client ids. Nothing special.
-
class bob.pad.base.database.FileListPadDatabase(filelists_directory, name, protocol=None, pad_file_class=<class 'bob.pad.base.database.PadFile'>, original_directory=None, original_extension=None, annotation_directory=None, annotation_extension='', annotation_type=None, train_subdir=None, dev_subdir=None, eval_subdir=None, real_filename=None, attack_filename=None, keep_read_lists_in_memory=True, **kwargs)¶
Bases: bob.pad.base.database.PadDatabase, bob.bio.base.database.FileListBioDatabase
This class provides a user-friendly interface to databases that are given as file lists.
Keyword parameters:
filelists_directory : str The directory that contains the filelists defining the protocol(s). If you use the protocol attribute when querying the database, it will be appended to the base directory, such that several protocols are supported by the same class instance of bob.pad.base.
name : str The name of the database.
protocol : str The protocol of the database. This should be a folder inside filelists_directory.
pad_file_class : class The class that should be used for returning the files. This can be PadFile, PadVoiceFile, or anything similar.
original_directory : str or None The directory where the original data can be found.
original_extension : str or [str] or None The filename extension of the original data, or multiple extensions.
annotation_directory : str or None The directory where additional annotation files can be found.
annotation_extension : str or None The filename extension of the annotation files.
annotation_type : str The type of the annotation file to read, see bob.db.base.read_annotation_file for accepted formats.
train_subdir : str or None Specify a custom subdirectory for the filelists of the training set (default is ‘train’).
dev_subdir : str or None Specify a custom subdirectory for the filelists of the development set (default is ‘dev’).
eval_subdir : str or None Specify a custom subdirectory for the filelists of the evaluation set (default is ‘eval’).
keep_read_lists_in_memory : bool If set to true, the lists are read only once and stored in memory.
-
annotations(file)[source]¶ Returns the annotations for the given File object, if available. You need to override this method in your high-level implementation. If your database does not have annotations, it should return None.
Parameters:
file : bob.pad.base.database.PadFile The file for which annotations should be returned.
Returns:
annots : dict or None The annotations for the file, if available.
-
client_ids(protocol=None, groups=None)[source]¶ Returns a list of client ids for the specific query by the user.
Keyword Parameters:
protocol : str or None The protocol to consider.
groups : str or [str] or None The groups to which the clients belong (“dev”, “eval”, “train”).
Returns: A list containing all the client ids which have the given properties.
-
groups(protocol=None, add_world=False, add_subworld=False)[source]¶ This function returns the list of groups for this database.
protocol : str or None The protocol for which the groups should be retrieved.
Returns: a list of groups
-
objects(groups=None, protocol=None, purposes=None, model_ids=None, **kwargs)[source]¶ Returns a set of PadFile objects for the specific query by the user.
Keyword Parameters:
groups : str or [str] or None One of the groups (“dev”, “eval”, “train”) or a tuple with several of them. If ‘None’ is given (this is the default), it is considered the same as a tuple with all possible values.
protocol : str or None The protocol to consider.
purposes : str or [str] or None The purposes required to be retrieved (“real”, “attack”) or a tuple with several of them. If ‘None’ is given (this is the default), it is considered the same as a tuple with all possible values.
model_ids : [various type] This parameter is not supported in PAD databases yet.
Returns: A list of PadFile objects considering all the filtering criteria.
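The handling of groups and purposes described above (a single string, a sequence, or None meaning all possible values) can be sketched as follows. The helper names, the dictionary-based entries and the constants are hypothetical and only illustrate the filtering rule, not the actual file-list parsing.

```python
GROUPS = ("train", "dev", "eval")
PURPOSES = ("real", "attack")


def as_tuple_sketch(value, all_values):
    """Normalize a None / str / sequence filter argument to a tuple."""
    if value is None:
        return tuple(all_values)  # None selects every possible value
    if isinstance(value, str):
        return (value,)
    return tuple(value)


def select_sketch(entries, groups=None, purposes=None):
    """Keep the entries matching the requested groups and purposes."""
    groups = as_tuple_sketch(groups, GROUPS)
    purposes = as_tuple_sketch(purposes, PURPOSES)
    return [
        e for e in entries
        if e["group"] in groups and e["purpose"] in purposes
    ]


entries = [
    {"path": "a", "group": "train", "purpose": "real"},
    {"path": "b", "group": "dev", "purpose": "attack"},
]
dev_only = select_sketch(entries, groups="dev")
everything = select_sketch(entries)
```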
-
tobjects(groups=None, protocol=None, model_ids=None, **kwargs)[source]¶ Returns a list of bob.bio.base.database.BioFile objects for enrolling T-norm models for score normalization.
- Parameters
protocol (str or None) – The protocol to consider
model_ids (str or [str] or None) – Only retrieves the files for the provided list of model ids (claimed client id). If None is given (this is the default), no filter over the model_ids is performed.
groups (str or [str] or None) – The groups to which the models belong ('dev', 'eval').
- Returns
A list of BioFile objects considering all the filtering criteria.
- Return type
[BioFile]
-
zobjects(groups=None, protocol=None, **kwargs)[source]¶ Returns a list of BioFile objects to perform Z-norm score normalization.
- Parameters
protocol (str or None) – The protocol to consider
groups (str or [str] or None) – The groups to which the clients belong ('dev', 'eval').
- Returns
A list of File objects considering all the filtering criteria.
- Return type
[BioFile]
-
class bob.pad.base.database.HighBioDatabase(filelists_directory=None, original_directory='[DB_DATA_DIRECTORY]', original_extension='.wav', db_name='', file_class=None, **kwargs)¶
Bases: bob.bio.base.database.FileListBioDatabase
Implements verification API for querying High database.
-
annotations(file)[source]¶ Reads the annotations for the given file id from file and returns them in a dictionary.
-
arrange_by_client(files) → files_by_client[source]¶ Arranges the given list of files by client id. This function returns a list of lists of Files.
Parameters:
files : [bob.bio.base.database.BioFile] A list of files that should be split up by BioFile.client_id.
Returns:
files_by_client : [[bob.bio.base.database.BioFile]] The list of lists of files, where each sub-list groups the files with the same BioFile.client_id.
-
client_id_from_model_id(model_id, group='dev')[source]¶ This wrapper around the PAD database has no knowledge of the model ids used in verification experiments, so we just assume that the client_id is the same as the model_id, which is actually true for most of the verification databases as well.
-
model_ids_with_protocol(groups=None, protocol=None, **kwargs)[source]¶ This wrapper around the PAD database has no knowledge of the model ids used in verification experiments, so we just assume that the model_ids are the same as the client ids, which is actually true for most of the verification databases as well.
-
objects(protocol=None, purposes=None, model_ids=None, groups=None, **kwargs)[source]¶ Maps the objects method of PAD databases onto the objects method of a verification database.
- Parameters
protocol (str) – To distinguish the two vulnerability scenarios, the protocol name should have either ‘-licit’ or ‘-spoof’ appended to it. For instance, if the DB has protocol ‘general’, the name passed to this method should be ‘general-licit’ if we want to run verification experiments on bona fide data only, but it should be ‘general-spoof’ if we want to run it for the spoof scenario (the probes are attacks).
purposes ([str]) – This parameter is passed by the bob.bio.base verification experiment
model_ids ([object]) – This parameter is passed by the bob.bio.base verification experiment
groups ([str]) – We map the groups from (‘world’, ‘dev’, ‘eval’) used in verification experiments to (‘train’, ‘dev’, ‘eval’)
**kwargs – The rest of the parameters valid for a given database
- Returns
Set of BioFiles that verification experiments expect.
- Return type
[object]
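The protocol suffix convention and the group mapping described above can be sketched as follows. The function names and the choice to raise ValueError for a missing suffix are assumptions made for illustration; they are not taken from the bob.pad.base source.

```python
# Mapping from verification group names to PAD group names.
GROUP_MAP = {"world": "train", "dev": "dev", "eval": "eval"}


def split_protocol_sketch(protocol):
    """Split 'general-licit' into ('general', 'licit')."""
    for suffix in ("-licit", "-spoof"):
        if protocol.endswith(suffix):
            return protocol[: -len(suffix)], suffix[1:]
    # Assumption: a protocol without a scenario suffix is rejected.
    raise ValueError("protocol must end with '-licit' or '-spoof'")


def map_groups_sketch(groups):
    """Translate verification groups into PAD groups."""
    return [GROUP_MAP[g] for g in groups]


base_protocol, scenario = split_protocol_sketch("general-spoof")
pad_groups = map_groups_sketch(["world", "dev"])
```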
-
-
class bob.pad.base.database.HighPadDatabase(filelists_directory=None, original_directory='[DB_DATA_DIRECTORY]', original_extension='.wav', file_class=None, db_name='', **kwargs)¶
-
class bob.pad.base.database.PadDatabase(name, protocol='Default', original_directory=None, original_extension=None, **kwargs)¶
Bases: bob.bio.base.database.BioDatabase
This class represents the basic API for database access. Please use this class as a base class for your database access classes. Do not forget to call the constructor of this base class in your derived class.
Parameters:
name : str A unique name for the database.
protocol : str or None The name of the protocol that defines the default experimental setup for this database.
original_directory : str The directory where the original data of the database are stored.
original_extension : str The file name extension of the original data.
kwargs : key=value pairs The arguments of the bob.bio.base.database.BioDatabase base class constructor.
-
all_files(groups=('train', 'dev', 'eval'), flat=False)[source]¶ Returns all files of the database, respecting the current protocol. The files can be limited using the all_files_options in the constructor.
- Parameters
- Returns
files – The sorted and unique list of all files of the database.
- Return type
-
abstract annotations(file)[source]¶ Returns the annotations for the given File object, if available. You need to override this method in your high-level implementation. If your database does not have annotations, it should return None.
Parameters:
file : bob.pad.base.database.PadFile The file for which annotations should be returned.
Returns:
annots : dict or None The annotations for the file, if available.
-
model_ids_with_protocol(groups = None, protocol = None, **kwargs) → ids[source]¶ Client-based PAD is not implemented.
-
abstract objects(groups=None, protocol=None, purposes=None, model_ids=None, **kwargs)[source]¶ This function returns lists of File objects, which fulfill the given restrictions.
Keyword parameters:
groups : str or [str] The groups of which the clients should be returned. Usually, groups are one or more elements of (‘train’, ‘dev’, ‘eval’).
protocol The protocol for which the clients should be retrieved. The protocol is dependent on your database. If you do not have protocols defined, just ignore this field.
purposes : str or [str] The purposes for which File objects should be retrieved. Usually it is either ‘real’ or ‘attack’.
model_ids : [various type] This parameter is not supported in PAD databases yet.
-
original_file_names(files) → paths[source]¶ Returns the full paths of the real and attack data of the given PadFile objects.
Parameters:
files : [[bob.pad.base.database.PadFile], [bob.pad.base.database.PadFile]] The list of lists ([real, attack]) of file objects to retrieve the original data file names for.
Returns:
paths : [str] or [[str]] The paths extracted for the concatenated real+attack files, in the preserved order.
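The concatenation of real and attack files can be sketched as below. FileStub is a hypothetical stand-in for PadFile, and building each path as directory + relative path + extension is an assumption about how the full paths are formed.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for PadFile; only the `path` attribute is used here.
FileStub = namedtuple("FileStub", ["client_id", "path"])


def original_file_names_sketch(files, original_directory, original_extension):
    """Return paths for the real files followed by the attack files."""
    real, attack = files
    # Assumption: a full path is directory + relative path + extension.
    return [
        os.path.join(original_directory, f.path + original_extension)
        for f in list(real) + list(attack)
    ]


paths = original_file_names_sketch(
    [[FileStub("c1", "real/a")], [FileStub("c1", "attack/b")]],
    original_directory="data",
    original_extension=".wav",
)
```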
-
training_files(step = None, arrange_by_client = False) → files[source]¶ Returns all training File objects. This function needs to be implemented in derived class implementations.
Parameters:
The parameters are not applicable in this version of anti-spoofing experiments.
Returns:
files : [bob.pad.base.database.PadFile] or [[bob.pad.base.database.PadFile]] The (arranged) list of files used for the training.
-
-
class bob.pad.base.database.PadFile(client_id, path, attack_type=None, file_id=None)¶
Bases: bob.bio.base.database.BioFile
A simple base class that defines basic properties of a File object for use in PAD experiments.
Grid Configuration¶
Code related to grid is reused from bob.bio.base package. Please see the corresponding documentation.