# Python API¶

## Classes¶

| Class | Description |
| --- | --- |
| bob.ip.facedetect.BoundingBox | A bounding box class storing top, left, height and width of a rectangle |
| bob.ip.facedetect.FeatureExtractor | This class extracts LBP features of several types from a given image patch of a certain size |
| bob.ip.facedetect.Cascade([cascade_file, …]) | This class defines a cascade of strong classifiers bob.learn.boosting.BoostedMachine |
| bob.ip.facedetect.Sampler([patch_size, …]) | This class generates (samples) bounding boxes for different scales and locations in the image |
| bob.ip.facedetect.TrainingSet([feature_directory]) | A set of images including bounding boxes that are used as a training set |

## Functions¶

| Function | Description |
| --- | --- |
| bob.ip.facedetect.detect_single_face(image, …) | Detects a single face in the given image, i.e., the one with the highest prediction value |
| bob.ip.facedetect.detect_all_faces(image, …) | Detects all faces in the given image whose prediction values are higher than the given threshold |
| bob.ip.facedetect.default_cascade() | Returns the bob.ip.facedetect.Cascade that is loaded from the pre-trained cascade file provided by this package |
| bob.ip.facedetect.best_detection(detections, …) | Computes the best detection for the given detections and according predictions |
| bob.ip.facedetect.overlapping_detections(detections, …) | Returns the detections and predictions that overlap with the best detection |
| bob.ip.facedetect.prune_detections(detections, …) | Prunes the given detected bounding boxes according to their predictions and returns the pruned bounding boxes and their predictions |
| bob.ip.facedetect.expected_eye_positions(bounding_box, …) | Computes the expected eye positions based on the relative coordinates of the bounding box |
| bob.ip.facedetect.bounding_box_from_annotation(source, …) | Creates a bounding box from the given parameters, which are, in general, annotations read using bob.ip.facedetect.read_annotation_file() |
| bob.ip.facedetect.read_annotation_file(annotation_file, annotation_type) | Reads annotations from the given annotation_file |

## Detailed Information¶

class bob.ip.facedetect.Bootstrap(number_of_rounds=7, number_of_weak_learners_in_first_round=8, number_of_positive_examples_per_round=5000, number_of_negative_examples_per_round=5000)[source]

Bases: object

This class deals with selecting new training examples for each boosting round.

Bootstrapping a classifier works as follows:

1. round = 1

2. the classifier is trained on a random subset of the training data, where number_of_positive_examples_per_round and number_of_negative_examples_per_round define the numbers of positive and negative examples used in the first round

3. add number_of_weak_learners_in_first_round**round weak classifiers (selected using boosting)

4. evaluate the whole training set using the current set of classifiers

5. add the new data that is mis-classified by the largest margin to the training set

6. round = round + 1

7. if round < number_of_rounds, go to step 3

Constructor Documentation

Creates a new Bootstrap object that can be used to start or continue the training of a bootstrapped boosted classifier.

Parameters:

number_of_rounds : int

The number of bootstrapping rounds, where each round adds more weak learners to the model

number_of_weak_learners_in_first_round : int

The number of weak classifiers chosen in the first round; later rounds raise this number to the power of the round index (see step 3 above), so don't choose it too large

number_of_positive_examples_per_round, number_of_negative_examples_per_round : int

The number of positive and negative samples added in each bootstrapping round; these numbers should be balanced, but do not necessarily need to be

run(training_set, trainer[, filename][, force]) → model[source]

Runs the bootstrapped training of a strong classifier using the given training data and a strong classifier trainer. The training set needs to contain extracted features already, as this function will need the features several times.

Parameters:

training_set : TrainingSet

The training set containing pre-extracted feature files

trainer : bob.learn.boosting.Boosting

A strong boosting trainer to use for selecting the weak classifiers and their weights for each round.

filename : str

A filename, where to write the resulting strong classifier to. This filename is also used as a base to compute filenames of intermediate files, which store results of each of the bootstrapping steps.

force : bool

If set to False (the default), the bootstrapping will resume from the round where it stopped during the last run (reading the current stage from the respective files). If set to True, the training will start from the beginning.

Returns:

model : bob.learn.boosting.BoostedMachine

The resulting strong classifier, a weighted combination of weak classifiers.
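A minimal usage sketch follows. The trainer setup is an assumption (check the bob.learn.boosting documentation for the exact constructor arguments), and all file and directory names are hypothetical:

```python
import bob.learn.boosting
import bob.ip.facedetect

# A TrainingSet whose features were already extracted into 'features/'
# (a hypothetical scratch directory, see TrainingSet below).
training_set = bob.ip.facedetect.TrainingSet(feature_directory='features')

# A strong-classifier trainer; the weak trainer and loss shown here are
# assumptions and may require different constructor arguments.
weak_trainer = bob.learn.boosting.LUTTrainer(
    training_set.feature_extractor().number_of_labels)
trainer = bob.learn.boosting.Boosting(
    weak_trainer, bob.learn.boosting.ExponentialLoss())

bootstrap = bob.ip.facedetect.Bootstrap(number_of_rounds=7)
# 'detector.hdf5' is a hypothetical output name; intermediate files written
# next to it allow an interrupted training to resume when force=False.
model = bootstrap.run(training_set, trainer, filename='detector.hdf5')
```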

class bob.ip.facedetect.BoundingBox

Bases: object

A bounding box class storing top, left, height and width of a rectangle

Constructor Documentation:

• bob.ip.facedetect.BoundingBox (topleft, size)

• bob.ip.facedetect.BoundingBox (bounding_box)

Constructs a new BoundingBox, either from the given top-left position and the size of the rectangle, or as a copy of the given bounding_box

Parameters:

topleft : (float, float)

The top-left position of the bounding box

size : (float, float)

The size of the bounding box

bounding_box : BoundingBox

The BoundingBox object to use for copy-construction

Class Members:

area

float <– The area (height x width) of the bounding box, read access only

bottom

int <– The bottom position of the bounding box (which is just outside the bounding box) as int, read access only

bottom_f

float <– The bottom position of the bounding box (which is just outside the bounding box) as float, read access only

bottomright

(int, int) <– The bottom-right position of the bounding box (which is just outside the bounding box) as integral values, read access only

bottomright_f

(float, float) <– The bottom-right position of the bounding box (which is just outside the bounding box) as float values, read access only

center

(float, float) <– The center of the bounding box (as float values), read access only

contains(point) → contained

Checks if the bounding box contains the given point

Parameters:

point : (float, float)

The point to test

Returns:

contained : bool

True if the bounding box contains the given point, False otherwise

is_valid_for(size) → valid

Checks if the bounding box is inside the given image size

Parameters:

size : (int, int)

The size of the image to test

Returns:

valid : bool

True if the bounding box is inside the image boundaries, False otherwise

left

int <– The left position of the bounding box as int, read access only

left_f

float <– The left position of the bounding box as float, read access only

mirror_x(width) → bounding_box

This function returns a horizontally mirrored version of this BoundingBox

Parameters:

width : int

The width of the image at which this bounding box should be mirrored

Returns:

bounding_box : BoundingBox

The mirrored version of this bounding box

overlap(other) → bounding_box

This function returns the overlapping bounding box between this and the given bounding box

Parameters:

other : BoundingBox

The other bounding box to compute the overlap with

Returns:

bounding_box : BoundingBox

The overlap between this and the other bounding box

right

int <– The right position of the bounding box (which is just outside the bounding box) as int, read access only

right_f

float <– The right position of the bounding box (which is just outside the bounding box) as float, read access only

scale(scale[, centered]) → bounding_box

This function returns a scaled version of this BoundingBox

When the centered parameter is set to True, the transformation center will be in the center of this bounding box, otherwise it will be at (0,0)

Parameters:

scale : float

The scale with which this bounding box should be scaled

centered : bool

[Default: False] Should the scaling be done with respect to the center of the bounding box?

Returns:

bounding_box : BoundingBox

The scaled version of this bounding box

shift(offset) → bounding_box

This function returns a shifted version of this BoundingBox

Parameters:

offset : (float, float)

The offset with which this bounding box should be shifted

Returns:

bounding_box : BoundingBox

The shifted version of this bounding box

similarity(other) → sim

This function computes the Jaccard similarity index between this and the given BoundingBox

The Jaccard similarity coefficient between two bounding boxes is defined as their intersection divided by their union:

$J(A,B) = \frac{|A\cap B|}{|A\cup B|}$

Parameters:

other : BoundingBox

The other bounding box to compute the overlap with

Returns:

sim : float

The Jaccard similarity index between this and the given BoundingBox

size

(int, int) <– The size of the bounding box as integral values, read access only

size_f

(float, float) <– The size of the bounding box as float values, read access only

top

int <– The top position of the bounding box as int, read access only

top_f

float <– The top position of the bounding box as float, read access only

topleft

(int, int) <– The top-left position of the bounding box as integral values, read access only

topleft_f

(float, float) <– The top-left position of the bounding box as float values, read access only
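A short sketch of the geometric operations above; the coordinate values are illustrative, and the printed results follow from the definitions (e.g., area = height × width, bottomright lies just outside the box):

```python
import bob.ip.facedetect

# A box with top-left corner (10, 20) and size (height=100, width=80).
bb = bob.ip.facedetect.BoundingBox((10, 20), (100, 80))

print(bb.topleft)       # (10, 20)
print(bb.bottomright)   # (110, 100), just outside the box
print(bb.area)          # 8000.0 (height x width)

# Transformations return new BoundingBox objects.
shifted = bb.shift((5, 5))
scaled = bb.scale(0.5, centered=True)
mirrored = bb.mirror_x(320)   # mirror inside an image of width 320

# Jaccard similarity (intersection over union) with another box.
other = bob.ip.facedetect.BoundingBox((20, 30), (100, 80))
print(bb.similarity(other))
```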

class bob.ip.facedetect.Cascade(cascade_file=None, feature_extractor=None)[source]

Bases: object

This class defines a cascade of strong classifiers bob.learn.boosting.BoostedMachine.

For each strong classifier, a threshold exists. When the weighted sum of predictions of classifiers gets below this threshold, the classification is stopped.

Constructor Documentation:

The constructor has two different ways to be called. The first and most obvious way is to load the cascade from the given cascade_file.

The second way instantiates an empty cascade, with the given feature_extractor. Please use the add() function to add new strong classifiers with according thresholds.

Parameters:

cascade_file : bob.io.base.HDF5File

An HDF5 file open for reading

feature_extractor : FeatureExtractor

A feature extractor that will be used to extract features for the strong classifiers.

add(classifier, threshold, begin=None, end=None)[source]

Adds a new strong classifier with the given threshold to the cascade.

Parameters:

classifier : bob.learn.boosting.BoostedMachine

threshold : float

The classification threshold for this cascade step

begin, end : int or None

If specified, only the weak machines with the indices range(begin,end) will be added.

create_from_boosted_machine(boosted_machine, classifiers_per_round, classification_threshold=-5.0)[source]

Creates this cascade from the given boosted machine, by simply splitting off strong classifiers that have classifiers_per_round weak classifiers.

Parameters:

boosted_machine : bob.learn.boosting.BoostedMachine

The strong classifier to split into a regular cascade.

classifiers_per_round : int

The number of classifiers that each cascade step should contain.

classification_threshold : float

A single threshold that will be applied in all rounds of the cascade.

generate_boosted_machine() → strong[source]

Creates a single strong classifier from this cascade by concatenating all strong classifiers.

Returns:

strong : bob.learn.boosting.BoostedMachine

The strong classifier as a combination of all classifiers in this cascade.

prepare(image, scale)[source]

Prepares the cascade for extracting features of the given image in the given scale.

Parameters:

image : array_like (2D, float)

The image from which features will be extracted

scale : float

The scale of the image, for which features will be extracted

save(hdf5)[source]

Saves this cascade into the given HDF5 file.

Parameters:

hdf5 : bob.io.base.HDF5File

An HDF5 file open for writing

load(hdf5)[source]

Loads this cascade from the given HDF5 file.

Parameters:

hdf5 : bob.io.base.HDF5File

An HDF5 file open for reading
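A brief sketch of the two construction paths and the save/load round-trip; the file names are hypothetical:

```python
import bob.io.base
import bob.ip.facedetect

# Use the cascade shipped with the package ...
cascade = bob.ip.facedetect.default_cascade()

# ... or load a cascade from an HDF5 file.
cascade = bob.ip.facedetect.Cascade(
    cascade_file=bob.io.base.HDF5File('my_cascade.hdf5'))

# Collapse the cascade into a single strong classifier.
strong = cascade.generate_boosted_machine()

# Write the cascade back to disk.
cascade.save(bob.io.base.HDF5File('copy.hdf5', 'w'))
```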

class bob.ip.facedetect.FeatureExtractor

Bases: object

This class extracts LBP features of several types from a given image patch of a certain size

LBP features are extracted using different variants of bob.ip.base.LBP feature extractors. All LBP features of one patch are stored in a single long feature vector of type numpy.uint16.

Constructor Documentation:

• bob.ip.facedetect.FeatureExtractor (patch_size)

• bob.ip.facedetect.FeatureExtractor (patch_size, extractors)

• bob.ip.facedetect.FeatureExtractor (patch_size, template, [overlap], [square], [min_size], [max_size])

• bob.ip.facedetect.FeatureExtractor (other)

• bob.ip.facedetect.FeatureExtractor (hdf5)

Generates a new feature extractor for the given patch_size using one or several feature extractors

The constructor can be called in different ways:

• The first constructor initializes a feature extractor with no LBP extractor. Please use the append() function to add LBP extractors.

• In the second constructor, a given list of LBP extractors is specified.

• The third constructor initializes a tight set of LBP extractors for different bob.ip.base.LBP.radii, by adding all possible combinations of x- and y- radii, until the patch_size is too small, or min_size (start) or max_size (end) is reached.

• The fourth constructor copies all LBP extractors from the given FeatureExtractor

• The last constructor reads the configuration from the given bob.io.base.HDF5File.

Parameters:

patch_size : (int, int)

The size of the patch to extract from the images

extractors : [bob.ip.base.LBP]

The LBP classes to use as extractors

template : bob.ip.base.LBP

The LBP classes to use as template for all extractors

overlap : bool

[default: False] Should overlapping LBPs be created?

square : bool

[default: False] Should only square LBPs be created?

min_size : int

[default: 1] The minimum radius of LBP

max_size : int

[default: MAX_INT] The maximum radius of LBP (limited by patch size)

other : FeatureExtractor

The feature extractor to use for copy-construction

hdf5 : bob.io.base.HDF5File

The HDF5 file to read the extractors from

Class Members:

append()

• append(other) → None

• append(lbp, offsets) → None

Appends the given feature extractor or LBP class to this one

With this function you can either append a complete feature extractor, or a partial extractor (i.e., a single LBP class) including the offset positions at which it will be extracted

Parameters:

other : FeatureExtractor

All LBP classes and offset positions of the given extractor will be appended

lbp : bob.ip.base.LBP

The LBP extractor that will be added

offsets : [(int,int)]

The offset positions at which the given LBP will be extracted

extract_all(bounding_box, dataset, dataset_index) → None

Extracts all features into the given dataset of (training) features at the given index

This function exists to extract training features for several training patches. To avoid data copying, the full training dataset, and the current training feature index need to be provided.

Parameters:

bounding_box : BoundingBox

The bounding box for which the features should be extracted

dataset : array_like <2D, uint16>

The (training) dataset, into which the features should be extracted; must be of shape (#training_patches, number_of_features)

dataset_index : int

The index of the current training patch

extract_indexed(bounding_box, feature_vector[, indices]) → None

Extracts the features only at the required locations, which defaults to model_indices

Parameters:

bounding_box : BoundingBox

The bounding box for which the features should be extracted

feature_vector : array_like <1D, uint16>

The feature vector, into which the features should be extracted; must be of size number_of_features

indices : array_like<1D,int32>

The indices, for which the features should be extracted; if not given, model_indices is used (must be set beforehand)

extractor(index) → lbp

Get the LBP feature extractor associated with the given feature index

Parameters:

index : int

The feature index for which the extractor should be retrieved

Returns:

lbp : bob.ip.base.LBP

The feature extractor for the given feature index

extractors

[bob.ip.base.LBP] <– The LBP extractors, read access only

image

array_like <2D, uint8> <– The (prepared) image the next features will be extracted from, read access only

load(hdf5) → None

Loads the extractors from the given HDF5 file

Parameters:

hdf5 : bob.io.base.HDF5File

mean_variance(bounding_box[, compute_variance]) → mv

Computes the mean (and the variance) of the pixel gray values in the given bounding box

Parameters:

bounding_box : BoundingBox

The bounding box for which the mean (and variance) should be calculated

compute_variance : bool

[Default: False] If enabled, the variance is computed as well; requires compute_integral_square_image to be enabled in the prepare() function

Returns:

mv : float or (float, float)

The mean (or the mean and the variance) of the pixel gray values for the given bounding box

model_indices

array_like <1D, int32> <– The indices at which the features are extracted, read and write access

number_of_features

int <– The length of the feature vector that will be extracted by this class, read access only

number_of_labels

int <– The maximum label for the features in this class, read access only

offset(index) → offset

Get the offset position associated with the given feature index

Parameters:

index : int

The feature index for which the offset should be retrieved

Returns:

offset : (int,int)

The offset position for the given feature index

patch_size

(int, int) <– The expected size of the patch that this extractor can handle, read access only

prepare(image, scale[, compute_integral_square_image]) → None

Takes the given image and prepares the next extraction steps for the given scale

If compute_integral_square_image is enabled, the (internally stored) integral square image is computed as well. This image is required to compute the variance of the pixels in a given patch, see mean_variance()

Parameters:

image : array_like <2D, uint8 or float>

The image that should be used in the next extraction step

scale : float

The scale of the image to extract

compute_integral_square_image : bool

[Default: False] Enable the computation of the integral square image

save(hdf5) → None

Saves the extractors to the given HDF5 file

Parameters:

hdf5 : bob.io.base.HDF5File

The file to write to
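A minimal sketch of the third constructor variant and a single extraction step; the LBP template is an example choice and the image is a random stand-in:

```python
import numpy
import bob.ip.base
import bob.ip.facedetect

# Build a tight set of LBP extractors from a template (8-neighbor LBP assumed).
extractor = bob.ip.facedetect.FeatureExtractor(
    patch_size=(24, 20), template=bob.ip.base.LBP(8))
print(extractor.number_of_features)

# Prepare a (random stand-in) gray image at scale 1 and extract one patch.
image = numpy.random.rand(120, 100)
extractor.prepare(image, 1.0)

feature_vector = numpy.zeros(extractor.number_of_features, dtype=numpy.uint16)
indices = numpy.arange(extractor.number_of_features, dtype=numpy.int32)
bb = bob.ip.facedetect.BoundingBox((0, 0), (24, 20))
extractor.extract_indexed(bb, feature_vector, indices)
```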

class bob.ip.facedetect.Sampler(patch_size=(24, 20), scale_factor=0.9576032806985737, lowest_scale=0.015625, distance=2)[source]

Bases: object

This class generates (samples) bounding boxes for different scales and locations in the image.

It computes different scales of the image and provides a tight set of BoundingBox of a given patch size for the given image.

Constructor Documentation:

Generates a patch-sampler, which will scan images and sample bounding boxes.

Parameters:

patch_size : (int, int)

the size of the patch (i.e., the bounding box) to sample

scale_factor : float

image pyramids are computed using the given scale factor between two scales

lowest_scale : float or None

patches which would be smaller than the given scale times the image resolution will not be taken into account; if 0, all possible patches will be considered

distance : int

the distance in both horizontal and vertical direction to generate samples

scales(image) → scale, shape[source]

Computes all possible scales for the given image and yields a tuple of the scale and the scaled image shape as an iterator.

Parameters:

image : array_like (2D or 3D)

The image, for which the scales should be computed

Yields:

scale : float

The next scale of the image to be considered

shape : (int, int) or (int, int, int)

The shape of the image, when scaled with the current scale

sample_scaled(shape) → bounding_box[source]

Yields an iterator that iterates over all sampled bounding boxes in the given (scaled) image shape.

Parameters:

shape : (int, int) or (int, int, int)

The (current) shape of the (scaled) image

Yields:

bounding_box : BoundingBox

An iterator iterating over all bounding boxes that are valid for the given shape

sample(image) → bounding_box[source]

Yields an iterator over all bounding boxes in different scales that are sampled for the given image.

Parameters:

image : array_like (2D or 3D)

The image, for which the bounding boxes should be generated

Yields:

bounding_box : BoundingBox

An iterator iterating over all bounding boxes for the given image

iterate(image, feature_extractor, feature_vector) → bounding_box[source]

Scales the given image, and extracts features from all possible bounding boxes.

For each of the sampled bounding boxes, this function fills the given pre-allocated feature vector and yields the current bounding box.

Parameters:

image : array_like (2D)

The given image to extract features for

feature_extractor : FeatureExtractor

The feature extractor to use to extract the features for the sampled patches

feature_vector : numpy.ndarray (1D, uint16)

The pre-allocated feature vector that will be filled inside this function; needs to be of size FeatureExtractor.number_of_features

Yields:

bounding_box : BoundingBox

The bounding box for which the current features are extracted

iterate_cascade(cascade, image[, threshold]) → prediction, bounding_box[source]

Iterates over the given image and computes the cascade of classifiers. This function will compute the cascaded classification result for the given image using the given cascade. It yields a tuple of prediction value and the according bounding box. If a threshold is specified, only those predictions are returned, which exceed the given threshold.

Note

The threshold does not override the cascade thresholds Cascade.thresholds, but only thresholds the final prediction. Specifying the threshold here is just slightly faster than thresholding the yielded prediction.

Parameters:

cascade : Cascade

The cascade that performs the predictions

image : array_like (2D)

The image for which the predictions should be computed

threshold : float

The threshold, which limits the number of predictions

Yields:

prediction : float

The prediction value for the current bounding box

bounding_box : BoundingBox

An iterator over all possible sampled bounding boxes (which exceed the prediction threshold, if given)
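A short sketch of the sampling interface; the sampler parameters are example choices and the image is a random stand-in for real data:

```python
import numpy
import bob.ip.facedetect

# Sampler for 24x20 patches; the arguments are example choices.
sampler = bob.ip.facedetect.Sampler(patch_size=(24, 20), distance=2)

image = numpy.random.rand(240, 320)   # stand-in for a real gray-level image

# Iterate over all bounding boxes in all scales ...
count = sum(1 for _ in sampler.sample(image))
print("sampled %d bounding boxes" % count)

# ... or inspect scales and per-scale samples separately.
for scale, shape in sampler.scales(image):
    boxes = list(sampler.sample_scaled(shape))
```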

class bob.ip.facedetect.TrainingSet(feature_directory=None)[source]

Bases: object

A set of images including bounding boxes that are used as a training set

The TrainingSet incorporates information about the data used to train the face detector. It is heavily bound to the scripts to re-train the face detector, which are documented in section Retrain the Detector.

The training set can be in several stages, which are optimized for speed. First, training data is collected in different ways and stored in one or more list files. These list files contain the location of the image files and the face bounding boxes in the corresponding images. Then, positive and negative features from one or more file lists are extracted and stored in the given feature_directory, where 'positive' features represent faces and 'negative' features represent the background. Finally, the training is performed using only these features, without keeping track of where they actually stem from.

Constructor Documentation

Creates an empty training set.

Parameters:

feature_directory : str

The name of a temporary directory, where (intermediate) features will be stored. This directory should be able to store several 100GB of data.

add_image(image_path, annotations)[source]

Adds an image and its bounding boxes to the current list of files

The bounding boxes are automatically estimated based on the given annotations.

Parameters:

image_path : str

The file name of the image, including its full path

annotations : [dict]

A list of annotations, i.e., where each annotation can be anything that bounding_box_from_annotation() can handle; this list can be empty, in case the image does not contain any faces

add_from_db(database, files)[source]

Adds images and bounding boxes for the given files of a database that follows the bob.bio.base.database.BioDatabase interface.

Parameters:

database : a derivative of bob.bio.base.database.BioDatabase

The database interface, which provides file names and annotations for the given files

files : [bob.bio.base.database.BioFile] or compatible

The files (as returned by bob.bio.base.database.BioDatabase.objects()) which should be added to the training list

save(list_file)[source]

Saves the current list of annotations to the given file.

Parameters:

list_file : str

The name of a list file to write the currently stored list into

load(list_file)[source]

Loads the list of annotations from the given file and appends it to the current list.

Parameters:

list_file : str

The name of a list file to load and append

iterate([max_number_of_files]) → image, bounding_boxes, image_file[source]

Yields the image and the bounding boxes stored in the training set as an iterator.

This function loads the images and converts them to gray-scale. It yields the image, the list of bounding boxes and the original image file name.

Parameters:

max_number_of_files : int or None

If specified, limit the number of returned data by sub-selection using quasi_random_indices()

Yields:

image : array_like (2D)

The image loaded from file and converted to gray scale

bounding_boxes : [BoundingBox]

A list of bounding boxes, where faces are found in the image; might be empty (in case of pure background images)

image_file : str

The name of the original image that was read

extract(sampler, feature_extractor, number_of_examples_per_scale=(100, 100), similarity_thresholds=(0.5, 0.8), parallel=None, mirror=False, use_every_nth_negative_scale=1)[source]

Extracts features from all images in all scales and writes them to file.

This function iterates over all images that are present in the internally stored list, and extracts features using the given feature_extractor for every image patch that the given sampler returns. The final features will be stored in the feature_directory that is set in the constructor.

For each image, the sampler samples patch locations, which cover the whole image in different scales. Each patch location is tested for how similar it is to the face bounding boxes that belong to that image, using the Jaccard BoundingBox.similarity(). The similarity is compared to the similarity_thresholds. If it is smaller than the first threshold, the patch is considered as background; when it is greater than the second threshold, it is considered as a face; otherwise it is rejected. Depending on the image resolution and the number of bounding boxes, this will usually result in some positive and thousands of negative patches per image. To limit the total amount of training data, for all scales, only up to a given number of positive and negative patches are kept. Also, to further limit the number of negative samples, only every use_every_nth_negative_scale scale is considered (for the positives, all scales are always processed).

To increase the number (especially of positive) examples, features can also be extracted for horizontally mirrored images. Simply set the mirror parameter to True. Furthermore, this function is designed to be run using several parallel processes, e.g., using the GridTK. Each of the processes will run on a particular subset of the images, which is defined by the SGE_TASK_ID environment variable. The parallel parameter defines the total number of parallel processes that are used.

Parameters:

sampler : Sampler

The sampler to use to sample patches of the images. Please assure that the sampler is set up such that it samples patch locations which can overlap with the face locations.

feature_extractor : FeatureExtractor

The feature extractor to be used to extract features from image patches

number_of_examples_per_scale : (int, int)

The maximum number of positive and negative examples to extract for each scale of the image

similarity_thresholds : (float, float)

The Jaccard similarity threshold, below which patch locations are considered to be negative, and above which patch locations are considered to be positive examples.

parallel : int or None

If given, the total number of parallel processes, which are used to extract features (the current process index is read from the SGE_TASK_ID environment variable)

mirror : bool

Extract positive and negative samples also from horizontally mirrored images?

use_every_nth_negative_scale : int

Skip some negative scales to decrease the number of negative examples, i.e., only extract and store negative features, when scale_counter % use_every_nth_negative_scale == 0

Note

The scale_counter is not reset between images, so that we might get features from different scales in subsequent images.

sample([model][, maximum_number_of_positives][, maximum_number_of_negatives][, positive_indices][, negative_indices]) → positives, negatives[source]

Returns positive and negative samples from the set of positives and negatives.

This reads the previously extracted feature file (or all of them, in case features were extracted in parallel) and returns features. If the model is not specified, a random sub-selection of positive and negative features is returned. When the model is given, all patches are first classified with the given model, and the ones that are mis-classified most are returned. The number of returned positives and negatives can be limited by specifying the maximum_number_of_positives and maximum_number_of_negatives.

This function keeps track of the positives and negatives that it has already returned, so it does not return the same positive or negative feature twice. However, when you have to restart training from a given point, you can set the positive_indices and negative_indices parameters to retrieve the features for the given indices. In this case, no additional features are selected, but the given sets of indices are stored internally.

Note

The positive_indices and negative_indices only have an effect, when model is None.

Parameters:

model : bob.learn.boosting.BoostedMachine or None

If given, the model is used to predict the training features, and the highest mis-predicted features are returned

maximum_number_of_positives, maximum_number_of_negatives : int

The maximum number of positive and negative features to be returned

positive_indices, negative_indices : set(int) or None

The set of positive and negative indices to extract features for, instead of randomly choosing indices; only considered when model = None

Returns:

positives, negatives : array_like (2D, uint16)

The new set of training features for the positive class (faces) and negative class (background).

feature_extractor() → extractor[source]

Returns the feature extractor used to extract the positive and negative features.

This feature extractor is stored to file while the extract() method runs, so this function reads that file (from the feature_directory set in the constructor) and returns its content.

Returns:

extractor : FeatureExtractor

The feature extractor used to extract the features stored in the feature_directory
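A sketch of the typical workflow; the directory, image path, annotations, and threshold values are illustrative only:

```python
import bob.ip.base
import bob.ip.facedetect

# 'features' is a hypothetical scratch directory for extracted features.
training_set = bob.ip.facedetect.TrainingSet(feature_directory='features')

# Add one image with one annotated face (path and eye positions are examples).
training_set.add_image('faces/img_0001.png',
                       [{'reye': (130, 110), 'leye': (130, 160)}])

# Extract positive (face) and negative (background) features for all images.
sampler = bob.ip.facedetect.Sampler(patch_size=(24, 20))
extractor = bob.ip.facedetect.FeatureExtractor(
    patch_size=(24, 20), template=bob.ip.base.LBP(8))
training_set.extract(sampler, extractor, similarity_thresholds=(0.5, 0.8))

# Draw an initial random subset of training features.
positives, negatives = training_set.sample(
    maximum_number_of_positives=5000, maximum_number_of_negatives=5000)
```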

bob.ip.facedetect.average_detections(detections, predictions[, relative_prediction_threshold]) → bounding_box, prediction[source]

Computes the weighted average of the given detections, where the weights are computed based on the prediction values.

Parameters:

detections : [BoundingBox]

The overlapping bounding boxes.

predictions : [float]

The predictions for the detections.

relative_prediction_threshold : float between 0 and 1

Limits the bounding boxes to those that have a prediction value higher than relative_prediction_threshold * max(predictions)

Returns:

bounding_box : BoundingBox

The bounding box which has been merged from the detections

prediction : float

The prediction value of the bounding box, which is a weighted sum of the predictions with minimum overlap

bob.ip.facedetect.best_detection(detections, predictions[, minimum_overlap][, relative_prediction_threshold]) → bounding_box, prediction[source]

Computes the best detection for the given detections and according predictions.

This is achieved by computing a weighted sum of detections that overlap with the best detection (the one with the highest prediction), where the weights are based on the predictions. Only detections with according prediction values > 0 are considered.

Parameters:

detections : [BoundingBox]

The detected bounding boxes.

predictions : [float]

The predictions for the detections.

minimum_overlap : float between 0 and 1

The minimum overlap (in terms of Jaccard BoundingBox.similarity()) of bounding boxes with the best detection to be considered.

relative_prediction_threshold : float between 0 and 1

Limits the bounding boxes to those that have a prediction value higher than relative_prediction_threshold * max(predictions)

Returns:

bounding_box : BoundingBox

The bounding box which has been merged from the detections

prediction : float

The prediction value of the bounding box, which is a weighted sum of the predictions with minimum overlap

bob.ip.facedetect.bounding_box_from_annotation(source, padding, **kwargs) → bounding_box[source]

Creates a bounding box from the given parameters, which are, in general, annotations read using bob.ip.facedetect.read_annotation_file(). Different kinds of annotations are supported, given by the source keyword:

• direct : bounding boxes are directly specified by keyword arguments topleft and bottomright

• eyes : the left and right eyes are specified by keyword arguments leye and reye

• left-profile : the left eye and the mouth are specified by keyword arguments eye and mouth

• right-profile : the right eye and the mouth are specified by keyword arguments eye and mouth

• ellipse : the face ellipse as well as face angle and axis radius is provided by keyword arguments center, angle and axis_radius

If a source is specified, the according keywords must be given as well. Otherwise, the source is estimated from the given keyword parameters if possible.

If ‘topleft’ and ‘bottomright’ are given (i.e., the ‘direct’ source), they are taken as is. Note that the ‘bottomright’ is NOT included in the bounding box. Please assure that the aspect ratio of the bounding box is 6:5 (height : width).

For source ‘ellipse’, the bounding box is computed to capture the whole ellipse, even if it is rotated.

For other sources (i.e., 'eyes'), the center of the two given positions is computed, and the padding is applied, which is relative to the distance between the two given points. If padding is None (the default), the default paddings of this source are used instead. This padding is required to keep an aspect ratio of 6:5.

Parameters:

source : str or None

The type of annotations present in the list of keyword arguments, see above.

padding : {'top':float, 'bottom':float, 'left':float, 'right':float}

This padding is added to the center between the given points, to define the top left and bottom right positions in the bounding box; values are relative to the distance between the two given points; ignored for some of the sources

kwargs : key=value

Further keyword arguments specifying the annotations.

Returns:

bounding_box : BoundingBox

The bounding box that was estimated from the given annotations.
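Two short sketches for the 'eyes' and 'direct' sources; all coordinate values are illustrative:

```python
import bob.ip.facedetect

# 'eyes' source: derive the box from eye positions (given as (y, x)) and the
# default padding, which keeps the 6:5 aspect ratio.
bb = bob.ip.facedetect.bounding_box_from_annotation(
    source='eyes', reye=(130, 110), leye=(130, 160))

# 'direct' source: topleft/bottomright are taken as-is (bottomright excluded);
# this example keeps the expected 6:5 (height : width) ratio.
bb = bob.ip.facedetect.bounding_box_from_annotation(
    source='direct', topleft=(100, 100), bottomright=(220, 200))
```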

bob.ip.facedetect.default_cascade()[source]

Returns the bob.ip.facedetect.Cascade that is loaded from the pre-trained cascade file provided by this package.

bob.ip.facedetect.detect_all_faces(image[, cascade][, sampler][, threshold][, overlaps][, minimum_overlap][, relative_prediction_threshold]) → bounding_boxes, qualities[source]

Detects all faces in the given image, whose prediction values are higher than the given threshold.

If the given minimum_overlap is lower than 1, overlapping bounding boxes are grouped, with minimum_overlap being the minimum Jaccard similarity between two boxes to be considered overlapping. Afterwards, all groups which have fewer than overlaps elements are discarded (this measure is similar to the Viola-Jones face detector). Finally, average_detections() is used to compute the average bounding box for each of the groups, including averaging the detection value (which will, hence, usually decrease in value).

Parameters:

image : array_like (2D gray or 3D RGB)

The image to detect faces in.

cascade : str or Cascade or None

If given, the cascade file name or the loaded cascade to be used to classify image patches. If not given, the default_cascade() is used.

sampler : Sampler or None

The sampler that defines the sampling of bounding boxes to search for the face. If not specified, a default Sampler is instantiated.

threshold : float

The threshold of the quality of detected faces. Detections with a quality lower than this value will not be considered. Higher thresholds will not detect all faces, while lower thresholds will generate false detections.

overlaps : int

The number of overlapping boxes that must exist for a bounding box to be considered. Higher values will remove a lot of false-positives, but might increase the chance of a face to be missed. The default value 1 will not limit the boxes.

minimum_overlap : float between 0 and 1

Groups detections based on the given minimum bounding box overlap, see group_detections().

relative_prediction_threshold : float between 0 and 1

Limits the bounding boxes to those that have a prediction value higher than relative_prediction_threshold * max(predictions)

Returns:

bounding_boxes : [BoundingBox]

The bounding boxes containing the detected faces.

qualities : [float]

The qualities of the bounding_boxes, values greater than threshold.
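A minimal sketch; the file name is hypothetical, and the threshold and overlap values are example choices, not package defaults:

```python
import bob.io.base
import bob.ip.facedetect

# 'group.png' is a hypothetical image containing several faces.
image = bob.io.base.load('group.png')

bounding_boxes, qualities = bob.ip.facedetect.detect_all_faces(
    image, threshold=20, overlaps=3)

for bb, quality in zip(bounding_boxes, qualities):
    print(bb.topleft, bb.size, quality)
```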

bob.ip.facedetect.detect_single_face(image[, cascade][, sampler][, minimum_overlap][, relative_prediction_threshold]) → bounding_box, quality[source]

Detects a single face in the given image, i.e., the one with the highest prediction value.

Parameters:

image : array_like (2D gray or 3D RGB)

The image to detect a face in.

cascade : str or Cascade or None

If given, the cascade file name or the loaded cascade to be used. If not given, the default_cascade() is used.

sampler : Sampler or None

The sampler that defines the sampling of bounding boxes to search for the face. If not specified, a default Sampler is instantiated, which will perform a tight sampling.

minimum_overlap : float between 0 and 1

Computes the best detection using the given minimum overlap, see best_detection()

relative_prediction_threshold : float between 0 and 1

Limits the bounding boxes to those that have a prediction value higher than relative_prediction_threshold * max(predictions)

Returns:

bounding_box : BoundingBox

The bounding box containing the detected face.

quality : float

The quality of the detected face, a value greater than 0.
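A minimal sketch of the most common use case; the file name is hypothetical, and the call to expected_eye_positions() (documented below) assumes its default padding:

```python
import bob.io.base
import bob.ip.facedetect

image = bob.io.base.load('face.png')   # hypothetical input image

bounding_box, quality = bob.ip.facedetect.detect_single_face(image)
print("face at", bounding_box.topleft, "with quality", quality)

# Approximate eye locations derived from the detected bounding box.
eyes = bob.ip.facedetect.expected_eye_positions(bounding_box)
print(eyes['reye'], eyes['leye'])
```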

bob.ip.facedetect.expected_eye_positions(bounding_box, padding) → eyes[source]

Computes the expected eye positions based on the relative coordinates of the bounding box.

This function can be used to translate between bounding-box-based image cropping and eye-location-based alignment. The returned positions are average eye locations; no landmark detection is performed.

Parameters:

bounding_box : BoundingBox

The face bounding box as detected by one of the functions in bob.ip.facedetect.

padding : {'top':float, 'bottom':float, 'left':float, 'right':float}

The padding that was used for the eyes source in bounding_box_from_annotation(), has a proper default.

Returns:

eyes : {'reye' : (rey, rex), 'leye' : (ley, lex)}

A dictionary containing the average left and right eye annotation.

bob.ip.facedetect.get_config()[source]

Returns a string containing the configuration information.

bob.ip.facedetect.group_detections(detections, predictions, overlap_threshold, prediction_threshold, box_count_threshold) → grouped_detections, grouped_predictions

Groups the given detected bounding boxes according to their overlap and returns a list of lists of detections, and their according list of predictions

Each of the returned lists of bounding boxes contains all boxes that overlap with the first box in the list with at least the given overlap_threshold.

Parameters:

detections : [BoundingBox]

A list of detected bounding boxes

predictions : array_like <1D, float>

The prediction (quality, weight, …) values for the detections

overlap_threshold : float

The overlap threshold (Jaccard similarity), for which detections should be considered to overlap

prediction_threshold : float

[Default: 0] The prediction threshold, below which the bounding boxes should be disregarded and not added to any group

box_count_threshold : int

[Default: 1] Only bounding boxes with at least the given number of overlapping boxes are considered

Returns:

grouped_detections : [[BoundingBox]]

The lists of bounding boxes that are grouped by their overlap; each list contains all bounding boxes that overlap with the first entry in the list

grouped_predictions : [array_like <float, 1D>]

The according list of grouped predictions (qualities, weights, …)

bob.ip.facedetect.overlapping_detections(detections, predictions, threshold) → overlapped_detections, overlapped_predictions

Returns the detections and predictions that overlap with the best detection

For threshold >= 1., all detections will be returned (i.e., no pruning is performed), but the list will be sorted by descending predictions.

Parameters:

detections : [BoundingBox]

A list of detected bounding boxes

predictions : array_like <1D, float>

The prediction (quality, weight, …) values for the detections

threshold : float

The overlap threshold (Jaccard similarity) which should be considered

Returns:

overlapped_detections : [BoundingBox]

The list of overlapping bounding boxes

overlapped_predictions : array_like <float, 1D>

The according predictions (qualities, weights, …)

bob.ip.facedetect.parallel_part(data, parallel) → part[source]

Splits the given data list into the given number of parallel parts and returns the part for the current job, based on the SGE_TASK_ID environment variable.

Parameters:

data : [object]

A list of data that should be split up into parallel parts

parallel : int or None

The total number of parts into which the data should be split

Returns:

part : [object]

The desired partition of the data

bob.ip.facedetect.prune_detections(detections, predictions, threshold[, number_of_detections]) → pruned_detections, pruned_predictions

Prunes the given detected bounding boxes according to their predictions and returns the pruned bounding boxes and their predictions

For threshold >= 1., all detections will be returned (i.e., no pruning is performed), but the list will be sorted by descending predictions.

Parameters:

detections : [BoundingBox]

A list of detected bounding boxes

predictions : array_like <1D, float>

The prediction (quality, weight, …) values for the detections

threshold : float

The overlap threshold (Jaccard similarity), for which detections should be pruned

number_of_detections : int

[default: MAX_INT] The number of detections that should be returned

Returns:

pruned_detections : [BoundingBox]

The list of pruned bounding boxes

pruned_predictions : array_like <float, 1D>

The according predictions (qualities, weights, …)
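A short sketch of pruning overlapping raw detections; the boxes, prediction values, and the Jaccard threshold are all illustrative:

```python
import numpy
import bob.ip.facedetect

# Hypothetical raw detector output: two overlapping boxes plus a separate one.
detections = [
    bob.ip.facedetect.BoundingBox((10, 10), (60, 50)),
    bob.ip.facedetect.BoundingBox((12, 11), (60, 50)),
    bob.ip.facedetect.BoundingBox((200, 150), (60, 50)),
]
predictions = numpy.array([4.2, 3.9, 1.5])

# Keep the best boxes, removing overlaps above a Jaccard similarity of 0.3.
pruned, pruned_predictions = bob.ip.facedetect.prune_detections(
    detections, predictions, 0.3)
```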

bob.ip.facedetect.quasi_random_indices(number_of_total_items[, number_of_desired_items]) → index[source]

Yields an iterator to a quasi-random list of indices that will contain exactly the number of desired indices (or the number of total items in the list, if this is smaller).

This function can be used to retrieve a consistent and reproducible list of indices of the data, in case the number_of_desired_items is lower than the number_of_total_items.

Parameters:

number_of_total_items : int

The total number of elements in the collection, which should be sub-sampled

number_of_desired_items : int or None

The number of items that should be used; if None or greater than number_of_total_items, all indices are yielded

Yields:

index : int

An iterator to indices, which will span number_of_total_items evenly.

bob.ip.facedetect.read_annotation_file(annotation_file, annotation_type) → annotations[source]

Reads annotations from the given annotation_file.

How annotations are read depends on the given annotation_type. Depending on the type, one or several annotations might be present in the annotation file. Currently, these variants are implemented:

• 'lr-eyes': Only the eye positions are stored, in a single row, like: le_x le_y re_x re_y, comment lines starting with '#' are ignored.

• 'named': Each line of the file contains a name and two floats, like reye x y; empty lines separate between sets of annotations.

• 'idiap': A special 22 point format, where each line contains the index and the locations, like 1 x y.

• 'fddb': a special format for the FDDB database; empty lines separate between sets of annotations

Finally, a list of annotations is returned in the format: [{name: (y,x)}].

Parameters:

annotation_file : str

The file name of the annotation file to read

annotation_type : str (see above)

The style of annotation file, in which the given annotation_file is

Returns:

annotations : [dict]

A list of annotations read from the given file, grouped by annotated objects (faces). Each annotation is generally specified as the two eye coordinates, i.e., {'reye' : (rey, rex), 'leye' : (ley, lex)}, but other types of annotations might occur as well.
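A short sketch of reading a 'named' annotation file and converting the result into a bounding box; the file name and its contents are hypothetical:

```python
import bob.ip.facedetect

# 'face.pos' is a hypothetical annotation file in the 'named' format,
# i.e., one annotation per line such as: reye <x> <y>
annotations = bob.ip.facedetect.read_annotation_file('face.pos', 'named')

# Turn the first annotated face into a bounding box; the source is estimated
# from the available keys (here: the eye positions).
bb = bob.ip.facedetect.bounding_box_from_annotation(**annotations[0])
```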

class bob.ip.facedetect.mtcnn.MTCNN(min_size=40, factor=0.709, thresholds=(0.6, 0.7, 0.7), **kwargs)[source]

Bases: object

MTCNN v1 wrapper for Tensorflow 2. See https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html for more details on MTCNN.

factor : float

Factor is a trade-off between performance and speed.

min_size : int

Minimum face size to be detected.

thresholds : list

Thresholds are a trade-off between false positives and missed detections.

property mtcnn_fun

detect(image)[source]

Detects all faces in the image.

Parameters

image (numpy.ndarray) – An RGB image in Bob format.

Returns

A tuple of boxes, probabilities, and landmarks.

Return type

tuple

annotations(image)[source]

Detects all faces in the image and returns annotations in bob format.

Parameters

image (numpy.ndarray) – An RGB image in Bob format.

Returns

A list of annotations. Annotations are dictionaries that contain the following keys: topleft, bottomright, reye, leye, nose, mouthright, mouthleft, and quality.

Return type

list
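A minimal usage sketch; the file name is hypothetical, and the image is expected in Bob format, i.e., an RGB array of shape (3, height, width):

```python
import bob.io.base
from bob.ip.facedetect.mtcnn import MTCNN

detector = MTCNN(min_size=40)

# 'face.png' is a hypothetical input file, loaded in Bob's RGB format.
image = bob.io.base.load('face.png')

boxes, probabilities, landmarks = detector.detect(image)

# Or get bob-style annotation dictionaries directly.
for annotation in detector.annotations(image):
    print(annotation['topleft'], annotation['bottomright'], annotation['quality'])
```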