Python API ¶

bob.measure.rmse(estimation, target)[source]¶

Calculates the root mean square error between a set of outputs and target

Uses the formula:

\[RMSE(\hat{\Theta}) = \sqrt{E[(\hat{\Theta} - \Theta)^2]}\]

Estimation (\(\hat{\Theta}\)) and target (\(\Theta\)) are expected to be 2-dimensional arrays. Different examples are organized as rows, while different features in the estimated values or targets are organized as columns.
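As a minimal sketch of the formula above (not the library's own implementation), the following NumPy function assumes that the expectation is taken over the examples (rows), so that one RMSE value is produced per feature column. The name rmse_sketch is hypothetical.

```python
import numpy as np


def rmse_sketch(estimation, target):
    """Root mean square error per column, averaging over rows (assumption)."""
    estimation = np.asarray(estimation, dtype=float)
    target = np.asarray(target, dtype=float)
    # Squared error per element, mean over examples (axis 0), then square root.
    return np.sqrt(np.mean((estimation - target) ** 2, axis=0))


# Two examples (rows) with two features (columns).
estimation = [[1.0, 2.0], [3.0, 4.0]]
target = [[1.0, 0.0], [3.0, 0.0]]
result = rmse_sketch(estimation, target)
```

In this example the first feature is estimated perfectly, while the second has squared errors of 4 and 16, whose mean is 10.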

Parameters

estimation (array) – an N-dimensional array that corresponds to the value estimated by your procedure
target (array) – an N-dimensional array that corresponds to the expected value

Returns

The square-root of the average of the squared error between the estimated value and the target

Return type

bob.measure.relevance(input, machine)[source]¶

Calculates the relevance of every input feature to the estimation process

Uses the formula from the following reference:

Neural Triggering System Operating on High Resolution Calorimetry Information, Anjos et al, April 2006, Nuclear Instruments and Methods in Physics Research, volume 559, pages 134-138

\[R(x_{i}) = |E[(o(x) - o(x|x_{i}=E[x_{i}]))^2]|\]

In other words, the relevance of a certain input feature i is the change in the machine output value when that feature is replaced by its mean for all input vectors. For this to work, the input parameter has to be a 2D array with features arranged column-wise and examples arranged row-wise.
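The formula can be sketched directly in NumPy as follows. This is an illustration of the formula as written above, not the library's code, and the result may be scaled differently from what bob.measure.relevance() returns. The names relevance_sketch and toy_machine are hypothetical; any callable that maps a 2D array to outputs can play the role of the machine.

```python
import numpy as np


def relevance_sketch(data, machine):
    """Mean squared output change when each column is replaced by its mean."""
    data = np.asarray(data, dtype=float)
    reference = machine(data)  # o(x) for the unmodified input
    scores = np.zeros(data.shape[1])
    for i in range(data.shape[1]):
        modified = data.copy()
        modified[:, i] = data[:, i].mean()  # x_i replaced by E[x_i]
        scores[i] = np.abs(np.mean((reference - machine(modified)) ** 2))
    return scores


def toy_machine(x):
    # Hypothetical machine: the output depends on the first feature only.
    return 2.0 * x[:, 0]


relevance = relevance_sketch([[1.0, 5.0], [3.0, 7.0]], toy_machine)
```

Because the toy machine ignores the second column, its relevance is zero, while the first column obtains a positive value.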

Parameters

input (array) – a 2D array that contains the input examples to be processed by your machine, with features arranged column-wise
machine (object) – A machine that can be called to “process” your input

Returns

A 1D float array as large as the number of columns (second dimension) of your input array, estimating the “relevance” of each input column (or feature) to the score provided by the machine.

Return type

array

bob.measure.recognition_rate(cmc_scores, threshold=None, rank=1)[source]¶

Calculates the recognition rate from the given input

It is identical to the CMC value for the given rank.

The input has a specific format, which is a list of two-element tuples. Each of the tuples contains the negative \(\{S_p^-\}\) and the positive \(\{S_p^+\}\) scores for one probe item \(p\), or None in case of open set recognition.

If threshold is set to None, the rank 1 recognition rate is defined as the number of probe items for which the highest positive score \(\max\{S_p^+\}\) is greater than or equal to all negative scores, divided by the number of all probe items \(P\):

\[\begin{split}\mathrm{RR} = \frac{1}{P} \sum_{p=1}^{P} \begin{cases} 1 & \mathrm{if}\ \max\{S_p^+\} \geq \max\{S_p^-\}\\ 0 & \mathrm{otherwise} \end{cases}\end{split}\]

For a given rank \(r>1\), up to \(r\) negative scores that are higher than the highest positive score are allowed to still count as correctly classified in the top \(r\) rank.

If threshold \(\theta\) is given, all scores below threshold will be filtered out. Hence, if all positive scores are below threshold \(\max\{S_p^+\} < \theta\), the probe will be misclassified at any rank.

For open set recognition, i.e., when there exists a tuple including negative scores without corresponding positive scores (None), and all negative scores are below threshold \(\max\{S_p^-\} < \theta\), the probe item is correctly rejected, and it does not count into the denominator \(P\). When no threshold is provided, the open set probes will always count as misclassified, regardless of the rank.
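For the simplest case (closed set, no threshold, rank 1), the formula for RR can be sketched as below. This is only an illustration of the formula as written; it handles neither thresholds, higher ranks, nor open set probes, and the function name is hypothetical.

```python
import numpy as np


def rank1_recognition_rate_sketch(cmc_scores):
    """Closed-set rank-1 rate: fraction of probes with max(pos) >= max(neg)."""
    correct = 0
    for negatives, positives in cmc_scores:
        if np.max(positives) >= np.max(negatives):
            correct += 1
    return correct / len(cmc_scores)


cmc_scores = [
    ([0.1, 0.2], [0.9]),  # positive wins: counted as correct
    ([0.8, 0.3], [0.5]),  # a negative is higher: counted as wrong
    ([0.4, 0.1], [0.4]),  # tie: correct under the ">=" in the formula
]
rate = rank1_recognition_rate_sketch(cmc_scores)
```

Two of the three probes are counted as correct here, so the rate is 2/3.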

Parameters

cmc_scores (list) –
A list in the format [(negatives, positives), ...] containing the CMC scores.

Each pair contains the negative and the positive scores for one probe item. Each pair can contain up to one empty array (or None), i.e., in case of open set recognition.
threshold (float, optional) – Decision threshold. If not None, all scores will be filtered by the threshold. In an open set recognition problem, all open set scores (negatives with no corresponding positive) for which all scores are below threshold, will be counted as correctly rejected and removed from the probe list (i.e., the denominator).
rank (int, optional) – The rank for which the recognition rate should be computed, 1 by default.

Returns

The (open set) recognition rate for the given rank, a value between 0 and 1.

Return type

bob.measure.cmc(cmc_scores)[source]¶

Calculates the cumulative match characteristic (CMC) from the given input.

The input has a specific format, which is a list of two-element tuples. Each of the tuples contains the negative and the positive scores for one probe item.

For each probe item, the rank \(r\) of the positive score is calculated. The rank is computed as the number of negative scores that are higher than the positive score. If several positive scores for one test item exist, the highest positive score is taken. The CMC finally computes how many test items have rank r or higher, divided by the total number of test values.

Note

The CMC is not available for open set classification. Please use the detection_identification_rate() and false_alarm_rate() instead.
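The rank computation described above can be sketched as follows. The sketch assumes a closed set, that every probe is compared against the same number of gallery items, and that the curve is cumulative, i.e., entry r counts the probes whose best positive score is exceeded by at most r negative scores. The function name is hypothetical.

```python
import numpy as np


def cmc_sketch(cmc_scores):
    """Cumulative fraction of probes beaten by at most r negatives."""
    # Rank index of each probe: number of negatives above the best positive.
    ranks = [int(np.sum(np.asarray(neg) > np.max(pos))) for neg, pos in cmc_scores]
    # Assumption: gallery size is the number of negatives plus one positive.
    n_gallery = max(len(neg) for neg, _ in cmc_scores) + 1
    counts = np.bincount(ranks, minlength=n_gallery)
    return np.cumsum(counts) / len(cmc_scores)


cmc_scores = [
    ([0.1, 0.2, 0.3], [0.9]),  # no negative is higher
    ([0.8, 0.3, 0.1], [0.5]),  # one negative is higher
    ([0.9, 0.8, 0.7], [0.5]),  # three negatives are higher
]
curve = cmc_sketch(cmc_scores)
```

The first entry of the resulting curve is the rank 1 recognition rate, and the last entry is always 1 for a closed set.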

Parameters

cmc_scores (list) –

A list in the format [(negatives, positives), ...] containing the CMC scores.

Each pair contains the negative and the positive scores for one probe item. Each pair can contain up to one empty array (or None), i.e., in case of open set recognition.

Returns

A 1D float array representing the CMC curve. The rank 1 recognition rate can be found in array[0], rank 2 rate in array[1], and so on. The number of ranks (array.shape[0]) is the number of gallery items. Values are in range [0,1].

Return type

1D numpy.ndarray of float

bob.measure.detection_identification_rate(cmc_scores, threshold, rank=1)[source]¶

Computes the detection and identification rate for the given threshold.

This value is designed to be used in an open set identification protocol, and defined in Chapter 14.1 of [LiJain2005].

Although the detection and identification rate is designed to be computed on an open set protocol, it uses only the probe elements for which a corresponding gallery element exists. For closed set identification protocols, this function is identical to recognition_rate(). The only difference is that for this function, a threshold for the scores needs to be defined, while for recognition_rate() it is optional.

Parameters

cmc_scores (list) –
A list in the format [(negatives, positives), ...] containing the CMC scores.

Each pair contains the negative and the positive scores for one probe item. Each pair can contain up to one empty array (or None), i.e., in case of open set recognition.
threshold (float) – The decision threshold \(\tau\).
rank (int, optional) – The rank for which the rate should be computed, 1 by default.

Returns

The detection and identification rate for the given threshold.

Return type

bob.measure.false_alarm_rate(cmc_scores, threshold)[source]¶

Computes the false alarm rate for the given threshold.

This value is designed to be used in an open set identification protocol, and defined in Chapter 14.1 of [LiJain2005].

The false alarm rate is designed to be computed on an open set protocol. It uses only the probe elements for which no corresponding gallery element exists.
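A minimal sketch of this idea is shown below. It assumes that a false alarm is raised whenever at least one negative score of an unknown probe reaches the threshold; the handling of scores exactly on the threshold is an assumption, and the function name is hypothetical. The sketch fails with a division by zero if no unknown probe is present.

```python
import numpy as np


def false_alarm_rate_sketch(cmc_scores, threshold):
    """Fraction of unknown probes with a negative score >= threshold."""
    alarms = 0
    unknown = 0
    for negatives, positives in cmc_scores:
        # Only probes without a gallery counterpart (no positives) are used.
        if positives is None or len(positives) == 0:
            unknown += 1
            if np.max(negatives) >= threshold:
                alarms += 1
    return alarms / unknown


cmc_scores = [
    ([0.2, 0.3], None),   # unknown probe, correctly rejected
    ([0.7, 0.1], None),   # unknown probe, falsely accepted
    ([0.5], [0.9]),       # known probe, ignored here
]
rate = false_alarm_rate_sketch(cmc_scores, threshold=0.5)
```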

Parameters

cmc_scores (list) –
A list in the format [(negatives, positives), ...] containing the CMC scores.

Each pair contains the negative and the positive scores for one probe item. Each pair can contain up to one empty array (or None), i.e., in case of open set recognition.
threshold (float) – The decision threshold \(\tau\).

Returns

The false alarm rate.

Return type

bob.measure.eer(negatives, positives, is_sorted=False, also_farfrr=False)[source]¶

Calculates the Equal Error Rate (EER).

Please note that it is possible that eer != fpr != fnr. This function returns (fpr + fnr) / 2 as eer. If you also need the fpr and fnr values, set also_farfrr to True.
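The following sketch illustrates why the three values may differ when only a finite set of scores is available. It simply tries every observed score as a threshold and keeps the one where FPR and FNR are closest; this is an assumption made for illustration and not necessarily how bob.measure.eer() searches for the threshold. The function name is hypothetical.

```python
import numpy as np


def eer_sketch(negatives, positives):
    """Return (eer, fpr, fnr) at the threshold where |fpr - fnr| is smallest."""
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    # Candidate thresholds: every observed score value.
    thresholds = np.unique(np.concatenate([negatives, positives]))
    fpr = np.array([np.mean(negatives >= t) for t in thresholds])
    fnr = np.array([np.mean(positives < t) for t in thresholds])
    best = np.argmin(np.abs(fpr - fnr))
    return (fpr[best] + fnr[best]) / 2, fpr[best], fnr[best]


# FPR and FNR coincide for these scores.
eer, fpr, fnr = eer_sketch([0.1, 0.2, 0.3, 0.6], [0.4, 0.7, 0.8, 0.9])
# With few scores, no threshold yields FPR == FNR exactly.
eer2, fpr2, fnr2 = eer_sketch([0.1, 0.2, 0.3], [0.25, 0.7])
```

In the second call the closest pair is FPR = 1/3 and FNR = 1/2, and the reported value is their average.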

Parameters

negatives (array_like (1D, float)) – The scores for comparisons of objects of different classes.
positives (array_like (1D, float)) – The scores for comparisons of objects of the same class.
is_sorted (bool) – Are both sets of scores already sorted in ascending order?
also_farfrr (bool) – If True, it will also return fpr and fnr.

Returns

eer (float) – The Equal Error Rate (EER).
fpr (float) – The False Positive Rate (FPR). Returned only when also_farfrr is True.
fnr (float) – The False Negative Rate (FNR). Returned only when also_farfrr is True.

bob.measure.roc_auc_score(negatives, positives, npoints=2000, min_far=-8, log_scale=False)[source]¶

Area Under the ROC Curve. Computes the area under the ROC curve. This is useful when you want to report one number that represents an ROC curve. This implementation uses the trapezoidal rule for the integration of the ROC curve. For more information, see: https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve
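A minimal sketch of the trapezoidal integration is given below. It evaluates the curve at every observed score instead of a fixed number of points and ignores the npoints, min_far and log_scale options, so it only illustrates the principle. The function name is hypothetical.

```python
import numpy as np


def roc_auc_sketch(negatives, positives):
    """Area under the TPR-over-FPR curve, using the trapezoidal rule."""
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    # Thresholds from high to low so that FPR and TPR grow monotonically;
    # +inf adds the (0, 0) corner of the curve.
    scores = np.unique(np.concatenate([negatives, positives]))[::-1]
    thresholds = np.concatenate([[np.inf], scores])
    fpr = np.array([np.mean(negatives >= t) for t in thresholds])
    tpr = np.array([np.mean(positives >= t) for t in thresholds])
    # Trapezoidal rule written out explicitly.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))


auc_separable = roc_auc_sketch([0.1, 0.2], [0.8, 0.9])
auc_chance = roc_auc_sketch([0.5], [0.5])
```

Perfectly separated scores give an area of 1, while indistinguishable scores give 0.5.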

Parameters

negatives (array_like) – The negative scores.
positives (array_like) – The positive scores.
npoints (int, optional) – Number of points in the ROC curve. Higher numbers lead to a more accurate ROC.
min_far (float, optional) – Min FAR and FRR values to consider when calculating ROC.
log_scale (bool, optional) – If True, converts the x axis (FPR) to log10 scale before calculating AUC. This is useful in cases where len(negatives) >> len(positives)

Returns

The ROC AUC. If log_scale is False, the value should be between 0 and 1.

Return type

bob.measure.get_config()[source]¶: Returns a string containing the configuration information.

bob.measure.correctly_classified_negatives(negatives, threshold) → classified¶

This method returns an array of booleans that indicates which negatives were correctly classified for the given threshold

The pseudo-code for this function is:

foreach (k in negatives)
  if negatives[k] < threshold: classified[k] = true
  else: classified[k] = false

Parameters:

negatives : array_like(1D, float)

The scores generated by comparing objects of different classes

threshold : float

The threshold, for which scores should be considered to be correctly classified

Returns:

classified : array_like(1D, bool)

The decision for each of the negatives

bob.measure.correctly_classified_positives(positives, threshold) → classified¶

This method returns an array of booleans that indicates which positives were correctly classified for the given threshold

The pseudo-code for this function is:

foreach (k in positives)
  if positives[k] >= threshold: classified[k] = true
  else: classified[k] = false

Parameters:

positives : array_like(1D, float)

The scores generated by comparing objects of the same classes

threshold : float

The threshold, for which scores should be considered to be correctly classified

Returns:

classified : array_like(1D, bool)

The decision for each of the positives
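The pseudo-code of correctly_classified_negatives() and correctly_classified_positives() corresponds to simple element-wise comparisons. The following NumPy lines are an equivalent sketch of that logic and do not call the library; the example scores are made up.

```python
import numpy as np

negatives = np.array([0.1, 0.5, 0.7])
positives = np.array([0.4, 0.5, 0.9])
threshold = 0.5

# Negatives are correct when strictly below the threshold.
negatives_ok = negatives < threshold
# Positives are correct when at or above the threshold.
positives_ok = positives >= threshold
```

Note that a score exactly on the threshold counts as correct for a positive, but as incorrect for a negative.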

bob.measure.det(negatives, positives, n_points[, min_far]) → curve¶

Calculates points of a Detection Error-Tradeoff (DET) curve

Calculates the DET curve given a set of negative and positive scores and a desired number of points. Returns a two-dimensional array of doubles that express on its rows:

[0] X axis values in the normal deviate scale for the false-accepts

[1] Y axis values in the normal deviate scale for the false-rejections

You can plot the results using your preferred tool to first create a plot using rows 0 and 1 from the returned value and then replace the X/Y axis annotation using a pre-determined set of tickmarks as recommended by NIST. The deviate scales are computed with the bob.measure.ppndf() function.

Parameters:

negatives, positives : array_like(1D, float)

The list of negative and positive scores to compute the DET for

n_points : int

The number of points on the DET curve, for which the DET should be evaluated

min_far : int

Minimum FAR in terms of 10^(min_far). This value is also used for min_frr. Default value is -8. Values should be negative.

Returns:

curve : array_like(2D, float)

The DET curve, with the FPR in the first and the FNR in the second row

bob.measure.eer_rocch(negatives, positives) → threshold¶

Calculates the equal-error-rate (EER) given the input data, on the ROC Convex Hull (ROCCH)

It replicates the EER calculation from the Bosaris toolkit (https://sites.google.com/site/bosaristoolkit/).

Parameters:

negatives, positives : array_like(1D, float)

The set of negative and positive scores to compute the threshold

Returns:

threshold : float

The threshold for the equal error rate

bob.measure.eer_threshold(negatives, positives[, is_sorted]) → threshold¶

Calculates the threshold that is as close as possible to the equal-error-rate (EER) for the given input data

The EER should be the point where the FPR equals the FNR. Graphically, this would be equivalent to the intersection between the ROC (or DET) curves and the identity.

Note

The scores will be sorted internally, requiring the scores to be copied. To avoid this copy, you can sort both sets of scores externally in ascending order, and set the is_sorted parameter to True

Parameters:

negatives, positives : array_like(1D, float)

The set of negative and positive scores to compute the threshold

is_sorted : bool

[Default: False] Are both sets of scores already sorted in ascending order?

Returns:

threshold : float

The threshold (i.e., as used in bob.measure.farfrr()) where FPR and FNR are as close as possible

bob.measure.epc(dev_negatives, dev_positives, test_negatives, test_positives, n_points[, is_sorted][, thresholds]) → curve¶

Calculates points of an Expected Performance Curve (EPC)

Calculates the EPC curve given a set of positive and negative scores and a desired number of points. Returns a two-dimensional numpy.ndarray of type float with the shape of (2, points) or (3, points) depending on the thresholds argument. The rows correspond to the X (cost), Y (weighted error rate on the test set given the min. threshold on the development set), and the thresholds which were used to calculate the error (if the thresholds argument was set to True), respectively. Please note that, in order to calculate the EPC curve, one needs two sets of data comprising a development set and a test set. The minimum weighted error is calculated on the development set and then applied to the test set to evaluate the weighted error rate at that position.

The EPC curve plots the HTER on the test set for various values of ‘cost’. For each value of ‘cost’, a threshold is found that provides the minimum weighted error (see bob.measure.min_weighted_error_rate_threshold()) on the development set. Each threshold is consecutively applied to the test set and the resulting weighted error values are plotted in the EPC.

The cost points at which the EPC curve is calculated are distributed uniformly in the range \([0.0, 1.0]\).
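The procedure described above can be sketched as follows. For each cost value, the threshold with the smallest weighted error on the development set is selected and then evaluated on the test set. The sketch assumes that candidate thresholds are the observed development scores, and it omits the is_sorted and thresholds options. The function names are hypothetical.

```python
import numpy as np


def weighted_error(negatives, positives, threshold, cost):
    fpr = np.mean(negatives >= threshold)
    fnr = np.mean(positives < threshold)
    return cost * fpr + (1 - cost) * fnr


def epc_sketch(dev_neg, dev_pos, test_neg, test_pos, n_points):
    """Rows: cost values, weighted error on the test set."""
    dev_neg, dev_pos, test_neg, test_pos = (
        np.asarray(a, dtype=float) for a in (dev_neg, dev_pos, test_neg, test_pos)
    )
    costs = np.linspace(0.0, 1.0, n_points)
    candidates = np.unique(np.concatenate([dev_neg, dev_pos]))
    errors = []
    for cost in costs:
        # Threshold with the smallest weighted error on the development set.
        dev_errors = [weighted_error(dev_neg, dev_pos, t, cost) for t in candidates]
        threshold = candidates[int(np.argmin(dev_errors))]
        # The same threshold is then evaluated on the test set.
        errors.append(weighted_error(test_neg, test_pos, threshold, cost))
    return np.vstack([costs, errors])


curve = epc_sketch([0.1, 0.2], [0.8, 0.9], [0.15, 0.3], [0.7, 0.95], 5)
```

The development set is perfectly separable here, but one test positive falls below the selected threshold, so the test error is non-zero for intermediate cost values.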

Note

It is more memory efficient, when sorted arrays of scores are provided and the is_sorted parameter is set to True.

Parameters:

dev_negatives, dev_positives, test_negatives, test_positives : array_like(1D, float)

The scores for negatives and positives of the development and test set

n_points : int

The number of weights for which the EPC curve should be computed

is_sorted : bool

[Default: False] Set this to True if the scores are already sorted. If False, scores will be sorted internally, which will require more memory

thresholds : bool

[Default: False] If True the function returns an array with the shape of (3, points) where the third row contains the thresholds that were calculated on the development set.

Returns:

curve : array_like(2D, float)

The EPC curve, with the first row containing the weights and the second row containing the weighted errors on the test set. If thresholds is True, there is also a third row which contains the thresholds that were calculated on the development set.

bob.measure.f_score(negatives, positives, threshold[, weight]) → f_score¶

This method computes the F-score of the accuracy of the classification

The F-score is a weighted mean of precision and recall measurements, see bob.measure.precision_recall(). It is computed as:

\[\mathrm{f-score} = (1 + w^2)\frac{\mathrm{precision}\cdot{}\mathrm{recall}}{w^2\cdot{}\mathrm{precision} + \mathrm{recall}}\]

The weight \(w\) needs to be a non-negative real value. In case the weight parameter is 1 (the default), the F-score is called F1 score and is the harmonic mean of the precision and recall values.
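As an illustration of the formula, the sketch below computes precision and recall from raw scores and combines them into the F-score. It assumes the same threshold convention as bob.measure.farfrr() (scores at or above the threshold are accepted) and does not guard against divisions by zero. The function names are hypothetical.

```python
import numpy as np


def precision_recall_sketch(negatives, positives, threshold):
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    tp = np.sum(positives >= threshold)  # positives accepted
    fn = np.sum(positives < threshold)   # positives rejected
    fp = np.sum(negatives >= threshold)  # negatives accepted
    return tp / (tp + fp), tp / (tp + fn)


def f_score_sketch(negatives, positives, threshold, weight=1.0):
    precision, recall = precision_recall_sketch(negatives, positives, threshold)
    w2 = weight ** 2
    return (1 + w2) * precision * recall / (w2 * precision + recall)


negatives = [0.1, 0.6, 0.8]
positives = [0.4, 0.7, 0.9, 0.95]
f1 = f_score_sketch(negatives, positives, threshold=0.5)
f2 = f_score_sketch(negatives, positives, threshold=0.5, weight=2.0)
```

With a precision of 0.6 and a recall of 0.75, a weight of 2 favours recall and therefore yields a higher value than the F1 score.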

Parameters:

negatives, positives : array_like(1D, float)

The set of negative and positive scores to compute the precision and recall

threshold : float

The threshold to compute the precision and recall for

weight : float

[Default: 1] The weight \(w\) between precision and recall

Returns:

f_score : float

The computed f-score for the given scores and the given threshold

bob.measure.far_threshold(negatives, positives[, far_value][, is_sorted]) → threshold¶

Computes the threshold such that the real FPR is at most the requested far_value if possible

Note

The scores will be sorted internally, requiring the scores to be copied. To avoid this copy, you can sort the negative scores externally in ascending order, and set the is_sorted parameter to True

Parameters:

negatives : array_like(1D, float)

The set of negative scores to compute the FPR threshold

positives : array_like(1D, float)

Ignored, but needs to be specified – may be given as []

far_value : float

[Default: 0.001] The FPR value, for which the threshold should be computed

is_sorted : bool

[Default: False] Set this to True if the negatives are already sorted in ascending order. If False, scores will be sorted internally, which will require more memory

Returns:

threshold : float

The threshold such that the real FPR is at most far_value
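One way to obtain such a threshold is sketched below: the candidates are the negative scores themselves, plus a value just above the largest one, and the smallest candidate that satisfies the FPR limit is returned. This is an assumption made for illustration; the library may select a different threshold that satisfies the same condition. The function name is hypothetical.

```python
import numpy as np


def far_threshold_sketch(negatives, far_value=0.001):
    """Smallest candidate threshold whose FPR does not exceed far_value."""
    negatives = np.sort(np.asarray(negatives, dtype=float))
    # Candidates: each negative score, plus a value just above the largest
    # one, which rejects every negative (FPR = 0).
    above_max = np.nextafter(negatives[-1], np.inf)
    candidates = np.append(np.unique(negatives), above_max)
    for t in candidates:
        if np.mean(negatives >= t) <= far_value:
            return float(t)


negatives = [0.1, 0.2, 0.3, 0.4]
t_quarter = far_threshold_sketch(negatives, far_value=0.25)
t_zero = far_threshold_sketch(negatives, far_value=0.0)
```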

bob.measure.farfrr(negatives, positives, threshold) → far, frr¶

Calculates the false-acceptance (FA) ratio and the false-rejection (FR) ratio for the given positive and negative scores and a score threshold

positives holds the score information for samples that are labeled to belong to a certain class (a.k.a., ‘signal’ or ‘client’). negatives holds the score information for samples that are labeled not to belong to the class (a.k.a., ‘noise’ or ‘impostor’). It is expected that ‘positive’ scores are, at least by design, greater than ‘negative’ scores. So, every ‘positive’ value that falls below the threshold is considered a false-rejection (FR). ‘Negative’ samples that fall above the threshold are considered a false-accept (FA).

Positives that fall on the threshold (exactly) are considered correctly classified. Negatives that fall on the threshold (exactly) are considered incorrectly classified. This is equivalent to setting the comparison like this pseudo-code:

foreach (positive as K)
  if K < threshold: falseRejectionCount += 1

foreach (negative as K)
  if K >= threshold: falseAcceptCount += 1

The output is in form of a tuple of two double-precision real numbers. The numbers range from 0 to 1. The first element of the pair is the false positive ratio (FPR), the second element the false negative ratio (FNR).

The threshold value does not necessarily have to fall in the range covered by the input scores (negatives and positives altogether), but if it does not, the output will be either (1.0, 0.0) or (0.0, 1.0), depending on the side the threshold falls.
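The counting rules and the behaviour for thresholds outside the score range can be reproduced with a few lines of NumPy. This is a sketch of the documented rules, not the library's implementation, and the function name is hypothetical.

```python
import numpy as np


def farfrr_sketch(negatives, positives, threshold):
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    # Negatives at or above the threshold are false accepts.
    far = float(np.mean(negatives >= threshold))
    # Positives strictly below the threshold are false rejections.
    frr = float(np.mean(positives < threshold))
    return far, frr


negatives = [0.1, 0.3, 0.5]
positives = [0.5, 0.7, 0.9]
inside = farfrr_sketch(negatives, positives, 0.5)
below_all = farfrr_sketch(negatives, positives, -1.0)
above_all = farfrr_sketch(negatives, positives, 2.0)
```

The score 0.5 lies exactly on the threshold in the first call: it counts as a false accept for the negatives, but as correctly classified for the positives.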

It is possible that scores are inverted in the negative/positive sense. In some setups the designer may have set up the system so that ‘positive’ samples have a smaller score than the ‘negative’ ones. In this case, make sure you normalize the scores so positive samples have greater scores before feeding them into this method.

Parameters:

negatives : array_like(1D, float)

The scores for comparisons of objects of different classes

positives : array_like(1D, float)

The scores for comparisons of objects of the same class

threshold : float

The threshold to separate correctly and incorrectly classified scores

Returns:

far : float

The False Positive Rate (FPR) for the given threshold

frr : float

The False Negative Rate (FNR) for the given threshold

bob.measure.frr_threshold(negatives, positives[, frr_value][, is_sorted]) → threshold¶

Computes the threshold such that the real FNR is at most the requested frr_value if possible

Note

The scores will be sorted internally, requiring the scores to be copied. To avoid this copy, you can sort the positive scores externally in ascending order, and set the is_sorted parameter to True

Parameters:

negatives : array_like(1D, float)

Ignored, but needs to be specified – may be given as []

positives : array_like(1D, float)

The set of positive scores to compute the FNR threshold

frr_value : float

[Default: 0.001] The FNR value, for which the threshold should be computed

is_sorted : bool

[Default: False] Set this to True if the positives are already sorted in ascending order. If False, scores will be sorted internally, which will require more memory

Returns:

threshold : float

The threshold such that the real FNR is at most frr_value

bob.measure.min_hter_threshold(negatives, positives[, is_sorted]) → threshold¶

Calculates the bob.measure.min_weighted_error_rate_threshold() with cost=0.5

Parameters:

negatives, positives : array_like(1D, float)

The set of negative and positive scores to compute the threshold

is_sorted : bool

[Default: False] Are both sets of scores already sorted in ascending order?

Returns:

threshold : float

The threshold for which the weighted error rate is minimal

bob.measure.min_weighted_error_rate_threshold(negatives, positives, cost[, is_sorted]) → threshold¶

Calculates the threshold that minimizes the error rate for the given input data

The cost parameter determines the relative importance between false-accepts and false-rejections. This number should be between 0 and 1 and will be clipped to those extremes. The value to minimize becomes: \(ER_{cost} = cost * FPR + (1-cost) * FNR\). The higher the cost, the higher the importance given to not making mistakes classifying negatives/noise/impostors.
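The minimization of \(ER_{cost}\) can be sketched with a brute-force search over the observed scores. The choice of candidate thresholds and the tie-breaking (the lowest threshold wins) are assumptions of this sketch, and the function name is hypothetical.

```python
import numpy as np


def min_weighted_error_rate_threshold_sketch(negatives, positives, cost):
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    cost = float(np.clip(cost, 0.0, 1.0))  # clipped to [0, 1] as documented
    candidates = np.unique(np.concatenate([negatives, positives]))
    errors = [
        cost * np.mean(negatives >= t) + (1 - cost) * np.mean(positives < t)
        for t in candidates
    ]
    return float(candidates[int(np.argmin(errors))])


negatives = [0.1, 0.2, 0.3, 0.6]
positives = [0.4, 0.7, 0.8, 0.9]
strict = min_weighted_error_rate_threshold_sketch(negatives, positives, 0.9)
lenient = min_weighted_error_rate_threshold_sketch(negatives, positives, 0.1)
```

A high cost moves the threshold up to avoid accepting the negative score 0.6, whereas a low cost moves it down to avoid rejecting the positive score 0.4.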

Note

The scores will be sorted internally, requiring the scores to be copied. To avoid this copy, you can sort both sets of scores externally in ascending order, and set the is_sorted parameter to True

Parameters:

negatives, positives : array_like(1D, float)

The set of negative and positive scores to compute the threshold

cost : float

The relative cost over FPR with respect to FNR in the threshold calculation

is_sorted : bool

[Default: False] Are both sets of scores already sorted in ascending order?

Returns:

threshold : float

The threshold for which the weighted error rate is minimal

bob.measure.ppndf(value) → ppndf¶

Returns the Deviate Scale equivalent of a false rejection/acceptance ratio

The algorithm that calculates the deviate scale is based on the function ppndf() from the NIST package DETware version 2.1, which is freely available on the internet. Please consult it for more details. As of 20.04.2011, the package could be found online.
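The normal deviate scale is the inverse of the cumulative distribution function of the standard normal distribution. A minimal sketch using only the Python standard library is shown below; it is not the DETware algorithm, and it raises an error for the values 0 and 1, which the library may treat differently. The function name is hypothetical.

```python
from statistics import NormalDist


def deviate_sketch(value):
    """Map a probability in (0, 1) to the standard normal deviate scale."""
    return NormalDist().inv_cdf(value)


middle = deviate_sketch(0.5)    # an error rate of 50% maps to 0
upper = deviate_sketch(0.975)   # close to the familiar 1.96
lower = deviate_sketch(0.025)   # symmetric around 0
```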

Parameters:

value : float

The value (usually FPR or FNR) for which the ppndf should be calculated

Returns:

ppndf : float

The deviate scale of the given value

bob.measure.precision_recall(negatives, positives, threshold) → precision, recall¶

Calculates the precision and recall (sensitivity) values given negative and positive scores and a threshold

Precision and recall are computed as:

\[ \begin{align}\begin{aligned}\mathrm{precision} = \frac{tp}{tp + fp}\\\mathrm{recall} = \frac{tp}{tp + fn}\end{aligned}\end{align} \]

where \(tp\) are the true positives, \(fp\) are the false positives and \(fn\) are the false negatives.

positives holds the score information for samples that are labeled to belong to a certain class (a.k.a., ‘signal’ or ‘client’). negatives holds the score information for samples that are labeled not to belong to the class (a.k.a., ‘noise’ or ‘impostor’). For more precise details about how the method considers error rates, see bob.measure.farfrr().

Parameters:

negatives, positives : array_like(1D, float)

The set of negative and positive scores to compute the measurements

threshold : float

The threshold to compute the measures for

Returns:

precision : float

The precision value for the given negatives and positives

recall : float

The recall value for the given negatives and positives

bob.measure.precision_recall_curve(negatives, positives, n_points) → curve¶

Calculates the precision-recall curve given a set of positive and negative scores and a number of desired points

The points in which the curve is calculated are distributed uniformly in the range [min(negatives, positives), max(negatives, positives)]

Parameters:

negatives, positives : array_like(1D, float)

The set of negative and positive scores to compute the measurements

n_points : int

The number of thresholds for which precision and recall should be evaluated

Returns:

curve : array_like(2D, float)

2D array of floats that express the X (precision) and Y (recall) coordinates

bob.measure.roc(negatives, positives, n_points[, min_far]) → curve¶

Calculates points of a Receiver Operating Characteristic (ROC)

Calculates the ROC curve given a set of negative and positive scores and a desired number of points.

Parameters:

negatives, positives : array_like(1D, float)

The negative and positive scores, for which the ROC curve should be calculated

n_points : int

The number of points at which the ROC curve is calculated; the points are distributed uniformly in the range [min(negatives, positives), max(negatives, positives)]

min_far : int

Minimum FAR in terms of 10^(min_far). This value is also used for min_frr. Default value is -8. Values should be negative.

Returns:

curve : array_like(2D, float)

A two-dimensional array of doubles that express the X (FPR) and Y (FNR) coordinates in this order
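The uniform sampling of thresholds over the score range can be sketched as follows. The sketch ignores the min_far option and only illustrates the layout of the returned array, with the FPR in row 0 and the FNR in row 1. The function name is hypothetical.

```python
import numpy as np


def roc_sketch(negatives, positives, n_points):
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    all_scores = np.concatenate([negatives, positives])
    # Thresholds spread uniformly between the smallest and largest score.
    thresholds = np.linspace(all_scores.min(), all_scores.max(), n_points)
    fpr = [np.mean(negatives >= t) for t in thresholds]
    fnr = [np.mean(positives < t) for t in thresholds]
    return np.array([fpr, fnr])


curve = roc_sketch([0.1, 0.2], [0.8, 0.9], n_points=3)
```

The lowest threshold accepts everything (FPR of 1, FNR of 0), and the highest threshold rejects all negatives but also the positive score 0.8.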

bob.measure.roc_for_far(negatives, positives, far_list[, is_sorted]) → curve¶

Calculates the ROC curve for a given set of positive and negative scores and the FPR values, for which the FNR should be computed

Note

The scores will be sorted internally, requiring the scores to be copied. To avoid this copy, you can sort both sets of scores externally in ascending order, and set the is_sorted parameter to True

Parameters:

negatives, positives : array_like(1D, float)

The set of negative and positive scores to compute the curve

far_list : array_like(1D, float)

A list of FPR values, for which the FNR values should be computed

is_sorted : bool

[Default: False] Set this to True if both sets of scores are already sorted in ascending order. If False, scores will be sorted internally, which will require more memory

Returns:

curve : array_like(2D, float)

The ROC curve, which holds a copy of the given FPR values in row 0, and the corresponding FNR values in row 1

bob.measure.rocch(negatives, positives) → curve¶

Calculates the ROC Convex Hull (ROCCH) curve given a set of positive and negative scores

Parameters:

negatives, positives : array_like(1D, float)

The set of negative and positive scores to compute the curve

Returns:

curve : array_like(2D, float)

The ROC curve, with the first row containing the FPR, and the second row containing the FNR

bob.measure.rocch2eer(pmiss_pfa) → threshold¶

Calculates the threshold that is as close as possible to the equal-error-rate (EER) given the input data

Todo

The parameter(s) ‘pmiss_pfa’ are used, but not documented.

Returns:

threshold : float

The computed threshold, at which the EER can be obtained

Measures for calibration

bob.measure.calibration.cllr(negatives, positives)[source]¶

Cost of log likelihood ratio as defined by the Bosaris toolkit

Computes the ‘cost of log likelihood ratio’ (\(C_{llr}\)) measure as given in the Bosaris toolkit
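For orientation, the commonly used definition of \(C_{llr}\) averages \(\log_2(1 + e^{-s})\) over the positive scores and \(\log_2(1 + e^{s})\) over the negative scores, and takes the mean of both terms. The sketch below implements this definition under the assumption that the scores are log-likelihood ratios; it is not taken from the library's source, and the function name is hypothetical.

```python
import numpy as np


def cllr_sketch(negatives, positives):
    negatives = np.asarray(negatives, dtype=float)
    positives = np.asarray(positives, dtype=float)
    # log2(1 + exp(x)) computed in a numerically stable way via logaddexp.
    pos_cost = np.mean(np.logaddexp(0.0, -positives)) / np.log(2.0)
    neg_cost = np.mean(np.logaddexp(0.0, negatives)) / np.log(2.0)
    return float((pos_cost + neg_cost) / 2)


uninformative = cllr_sketch([0.0, 0.0], [0.0, 0.0])
well_calibrated = cllr_sketch([-20.0, -25.0], [20.0, 25.0])
```

Scores that carry no information (all log-likelihood ratios equal to 0) give a value of 1, while confident and correct scores give a value close to 0.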

Parameters

negatives (array) – 1D float array that contains the scores of the “negative” (noise, non-class) samples of your classifier.
positives (array) – 1D float array that contains the scores of the “positive” (signal, class) samples of your classifier.

Returns

The computed \(C_{llr}\) value.

Return type

bob.measure.calibration.min_cllr(negatives, positives)[source]¶

Minimum cost of log likelihood ratio as defined by the Bosaris toolkit

Computes the ‘minimum cost of log likelihood ratio’ (\(C_{llr}^{min}\)) measure as given in the bosaris toolkit

Parameters

negatives (array) – 1D float array that contains the scores of the “negative” (noise, non-class) samples of your classifier.
positives (array) – 1D float array that contains the scores of the “positive” (signal, class) samples of your classifier.

Returns

The computed \(C_{llr}^{min}\) value.

Return type

bob.measure.plot.log_values(min_step=-4, counts_per_step=4)[source]¶

Computes log-scaled values between \(10^{M}\) and 1

This function computes log-scaled values between \(10^{M}\) and 1 (inclusive), where \(M\) is the min_step argument, which needs to be a negative integer. The integer counts_per_step value defines how many values between two adjacent powers of 10 will be created. The total number of values will be -min_step * counts_per_step + 1.
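The construction of these values can be sketched in plain Python as follows. The sketch follows the description above and is not the library's code; the function name is hypothetical.

```python
def log_values_sketch(min_step=-4, counts_per_step=4):
    """Log-spaced values from 10**min_step up to and including 1."""
    count = -min_step * counts_per_step + 1
    # The exponent grows in steps of 1 / counts_per_step from min_step to 0.
    return [10.0 ** (min_step + i / counts_per_step) for i in range(count)]


values = log_values_sketch()
```

With the default arguments the list has 17 entries, and every fourth entry is an exact power of 10.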

Parameters

min_step (int, optional) – The power of 10 that will be the minimum value. E.g., the default -4 will result in the first number being \(10^{-4}\) = 0.0001 or 0.01%
counts_per_step (int, optional) – The number of values that will be put between two adjacent powers of 10. With the default value 4 (and default values of min_step), we will get log_list[0] == 1e-4, log_list[4] == 1e-3, …, log_list[16] == 1.

Returns

A list of logarithmically scaled values between \(10^{M}\) and 1.

Return type

list

bob.measure.plot.roc(negatives, positives, npoints=2000, CAR=None, min_far=-8, tpr=False, semilogx=False, **kwargs)[source]¶

Plots Receiver Operating Characteristic (ROC) curve.

This method will call matplotlib to plot the ROC curve for a system which contains a particular set of negatives (impostors) and positives (clients) scores. We use the standard matplotlib.pyplot.plot() command. All parameters passed with exception of the three first parameters of this method will be directly passed to the plot command.

The plot will represent the false-alarm on the horizontal axis and the false-rejection on the vertical axis. The values for the axis will be computed using bob.measure.roc().

Note

This function does not initiate and save the figure instance, it only issues the plotting command. You are responsible for setting up and saving the figure as you see fit.

Parameters

negatives (array) – 1D float array that contains the scores of the “negative” (noise, non-class) samples of your classifier. See (bob.measure.roc())
positives (array) – 1D float array that contains the scores of the “positive” (signal, class) samples of your classifier. See (bob.measure.roc())
npoints (int, optional) – The number of points for the plot. See (bob.measure.roc())
min_far (float, optional) – The minimum value of FPR and FNR that is used for ROC computations.
tpr (bool, optional) – If True, will plot TPR (TPR = 1 - FNR) on the y-axis instead of FNR.
semilogx (bool, optional) – If True, will use pyplot.semilogx to plot the ROC curve.
CAR (bool, optional) – This option is deprecated. Please use the tpr and semilogx options instead. If set to True, it will plot the CPR (CAR) over FPR using matplotlib.pyplot.semilogx(), otherwise the FPR over FNR linearly using matplotlib.pyplot.plot().
**kwargs – Extra plotting parameters, which are passed directly to matplotlib.pyplot.plot().

Returns

list of matplotlib.lines.Line2D: The lines that were added as defined by the return value of matplotlib.pyplot.plot().

Return type

object

bob.measure.plot.roc_for_far(negatives, positives, far_values=[0.0001, 0.00017782794100389227, 0.00031622776601683794, 0.0005623413251903491, 0.001, 0.0017782794100389228, 0.0031622776601683794, 0.005623413251903491, 0.01, 0.01778279410038923, 0.03162277660168379, 0.05623413251903491, 0.1, 0.1778279410038923, 0.31622776601683794, 0.5623413251903491, 1.0], CAR=True, **kwargs)[source]¶

Plots the ROC curve for the given list of False Positive Rates (FAR).

This method calls matplotlib to plot the ROC curve for a system with a given set of negative (impostor) and positive (client) scores. It uses the standard matplotlib.pyplot.semilogx() command. All parameters other than the first three are passed directly to the plot command.

The plot will represent the False Positive Rate (FPR) on the horizontal axis and the Correct Positive Rate (CPR) on the vertical axis. The values for the axis will be computed using bob.measure.roc_for_far().

Note

This function does not create or save the figure instance; it only issues the plotting command. You are responsible for setting up and saving the figure as you see fit.

Parameters

negatives (array) – 1D float array that contains the scores of the “negative” (noise, non-class) samples of your classifier. See (bob.measure.roc())
positives (array) – 1D float array that contains the scores of the “positive” (signal, class) samples of your classifier. See (bob.measure.roc())
far_values (list, optional) – The values for the FPR, where the CPR (CAR) should be plotted; each value should be in range [0,1].
CAR (bool, optional) – If set to True, it will plot the CPR (CAR) over FPR using matplotlib.pyplot.semilogx(), otherwise the FPR over FNR linearly using matplotlib.pyplot.plot().
kwargs (dict, optional) – Extra plotting parameters, which are passed directly to matplotlib.pyplot.plot().

Returns

The lines that were added as defined by the return value of matplotlib.pyplot.semilogx().

Return type

bob.measure.plot.precision_recall_curve(negatives, positives, npoints=2000, **kwargs)[source]¶

Plots a Precision-Recall curve.

This method calls matplotlib to plot the precision-recall curve for a system with a given set of negative (impostor) and positive (client) scores. It uses the standard matplotlib.pyplot.plot() command. All parameters other than the first three are passed directly to the plot command.

Note

This function does not create or save the figure instance; it only issues the plotting command. You are responsible for setting up and saving the figure as you see fit.

Parameters

negatives (array) – 1D float array that contains the scores of the “negative” (noise, non-class) samples of your classifier. See (bob.measure.precision_recall_curve())
positives (array) – 1D float array that contains the scores of the “positive” (signal, class) samples of your classifier. See (bob.measure.precision_recall_curve())
npoints (int, optional) – The number of points for the plot. See (bob.measure.precision_recall_curve())
kwargs (dict, optional) – Extra plotting parameters, which are passed directly to matplotlib.pyplot.plot().

Returns

The lines that were added as defined by the return value of matplotlib.pyplot.plot().

Return type

bob.measure.plot.epc(dev_negatives, dev_positives, test_negatives, test_positives, npoints=100, **kwargs)[source]¶

Plots Expected Performance Curve (EPC) as defined in the paper:

Bengio, S., Keller, M., Mariéthoz, J. (2004). The Expected Performance Curve. International Conference on Machine Learning ICML Workshop on ROC Analysis in Machine Learning, 136(1), 1963–1966. IDIAP RR. Available: http://eprints.pascal-network.org/archive/00000670/

This method calls matplotlib to plot the EPC for a system with a given set of negative (impostor) and positive (client) scores for both the development and test sets. It uses the standard matplotlib.pyplot.plot() command. All parameters other than the first five are passed directly to the plot command.

The plot will represent the minimum HTER on the vertical axis and the cost on the horizontal axis.
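As a rough illustration of the idea behind these two axes, the sketch below picks, for each cost value, the development-set threshold that minimises cost * FPR + (1 - cost) * FNR, and then reports the HTER obtained on the test set at that threshold. It is not the implementation of bob.measure.epc(); the helper names, the use of the development scores as candidate thresholds, and the ">=" acceptance rule are assumptions of this sketch.

```python
def error_rates(negatives, positives, threshold):
    """Return (fpr, fnr) at a threshold, accepting scores >= threshold (assumed rule)."""
    fpr = sum(s >= threshold for s in negatives) / len(negatives)
    fnr = sum(s < threshold for s in positives) / len(positives)
    return fpr, fnr


def epc_points(dev_neg, dev_pos, test_neg, test_pos, npoints=3):
    """Return (cost, test HTER) pairs; a hypothetical helper for illustration only."""
    # Candidate thresholds are the observed development scores (an assumption).
    candidates = sorted(set(dev_neg) | set(dev_pos))
    points = []
    for i in range(npoints):
        cost = i / (npoints - 1)

        def weighted_error(t, cost=cost):
            fpr, fnr = error_rates(dev_neg, dev_pos, t)
            return cost * fpr + (1.0 - cost) * fnr

        # Threshold chosen a priori on the development set.
        threshold = min(candidates, key=weighted_error)
        # Performance reported a posteriori on the test set.
        fpr, fnr = error_rates(test_neg, test_pos, threshold)
        points.append((cost, (fpr + fnr) / 2.0))
    return points


points = epc_points([1.0, 2.0, 3.0], [3.2, 4.0, 5.0],
                    [1.0, 2.0, 3.5], [3.0, 4.0, 5.0])
```

The first element of each pair would go on the horizontal axis and the second on the vertical axis.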

Note

This function does not create or save the figure instance; it only issues the plotting commands. You are responsible for setting up and saving the figure as you see fit.

Parameters

dev_negatives (array) – 1D float array that contains the scores of the “negative” (noise, non-class) samples of your classifier, from the development set. See (bob.measure.epc())
dev_positives (array) – 1D float array that contains the scores of the “positive” (signal, class) samples of your classifier, from the development set. See (bob.measure.epc())
test_negatives (array) – 1D float array that contains the scores of the “negative” (noise, non-class) samples of your classifier, from the test set. See (bob.measure.epc())
test_positives (array) – 1D float array that contains the scores of the “positive” (signal, class) samples of your classifier, from the test set. See (bob.measure.epc())
npoints (int, optional) – The number of points for the plot. See (bob.measure.epc())
kwargs (dict, optional) – Extra plotting parameters, which are passed directly to matplotlib.pyplot.plot().

Returns

The lines that were added as defined by the return value of matplotlib.pyplot.plot().

Return type

http://www.itl.nist.gov/iad/mig/tools/

bob.measure.plot.det(negatives, positives, npoints=2000, min_far=-8, **kwargs)[source]¶

Plots Detection Error Trade-off (DET) curve as defined in the paper:

Martin, A., Doddington, G., Kamm, T., Ordowski, M., & Przybocki, M. (1997). The DET curve in assessment of detection task performance. Fifth European Conference on Speech Communication and Technology (pp. 1895-1898). Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.117.4489&rep=rep1&type=pdf

This method calls matplotlib to plot the DET curve(s) for a system with a given set of negative (impostor) and positive (client) scores. It uses the standard matplotlib.pyplot.plot() command. All parameters other than the first three are passed directly to the plot command.

The plot will represent the false-alarm on the horizontal axis and the false-rejection on the vertical axis.

This method is strongly inspired by the NIST implementation for Matlab, called DETware, version 2.1 and available for download at the NIST website: http://www.itl.nist.gov/iad/mig/tools/

Note

This function does not create or save the figure instance; it only issues the plotting commands. You are responsible for setting up and saving the figure as you see fit.

Note

If you wish to reset the axis zoom, you must use the Gaussian scale rather than the tick labels shown on the plot, which are there for display purposes only. The real axis scale is based on bob.measure.ppndf(). For example, if you wish to set the x and y axes to display data between 1% and 40%, here is the recipe:

import bob.measure
from matplotlib import pyplot
bob.measure.plot.det(...) #call this as many times as you need
#AFTER you plot the DET curve, just set the axis in this way:
pyplot.axis([bob.measure.ppndf(k/100.0) for k in (1, 40, 1, 40)])

We provide a convenient way for you to do the above in this module. So, optionally, you may use the bob.measure.plot.det_axis() method like this:

import bob.measure
bob.measure.plot.det(...)
# please note we convert percentage values in det_axis()
bob.measure.plot.det_axis([1, 40, 1, 40])

Parameters

negatives (array) – 1D float array that contains the scores of the “negative” (noise, non-class) samples of your classifier. See (bob.measure.det())
positives (array) – 1D float array that contains the scores of the “positive” (signal, class) samples of your classifier. See (bob.measure.det())
npoints (int, optional) – The number of points for the plot. See (bob.measure.det())
axisfontsize (str, optional) – The size to be used by x/y-tick-labels to set the font size on the axis
kwargs (dict, optional) – Extra plotting parameters, which are passed directly to matplotlib.pyplot.plot().

Returns

The lines that were added as defined by the return value of matplotlib.pyplot.plot().

Return type

bob.measure.plot.det_axis(v, **kwargs)[source]¶

Sets the axis in a DET plot.

This method wraps the matplotlib.pyplot.axis() by calling bob.measure.ppndf() on the values passed by the user so they are meaningful in a DET plot as performed by bob.measure.plot.det().

Parameters

v (sequence) – A sequence (list, tuple, array or the like) containing the X and Y limits in the order (xmin, xmax, ymin, ymax). Expected values should be in percentage (between 0 and 100%). If v is not a list or tuple that contains 4 numbers it is passed without further inspection to matplotlib.pyplot.axis().
kwargs (dict, optional) – Extra plotting parameters, which are passed directly to matplotlib.pyplot.axis().

Returns

Whatever is returned by matplotlib.pyplot.axis().

Return type

object
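To make the percentage-to-axis conversion concrete, here is a small sketch that maps percentages to deviate-scale limits. It assumes that bob.measure.ppndf() behaves like the inverse of the standard normal cumulative distribution (the probit function), which is the usual scale for DET plots, and uses statistics.NormalDist from the Python standard library as a stand-in. The helper names probit and det_limits are hypothetical.

```python
from statistics import NormalDist


def probit(p):
    """Inverse standard normal CDF; used here as an assumed stand-in for ppndf()."""
    return NormalDist().inv_cdf(p)


def det_limits(percentages):
    """Convert (xmin, xmax, ymin, ymax) given in percent to deviate-scale limits."""
    return [probit(k / 100.0) for k in percentages]


limits = det_limits([1, 40, 1, 40])
# Under the assumption above, these four values are what would be
# handed to matplotlib.pyplot.axis().
```

Because the probit of 50% is zero, limits below 50% are negative on this scale, which is why the tick labels on a DET plot do not match the underlying axis coordinates.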

bob.measure.plot.cmc(cmc_scores, logx=True, **kwargs)[source]¶

Plots the (cumulative) match characteristics and returns the maximum rank.

This function plots a CMC curve using the given CMC scores: a list of tuples, where each tuple contains the negative and positive scores for one probe of the database.

Parameters

cmc_scores (array) – 1D float array containing the CMC values (See bob.measure.cmc())
logx (bool, optional) – If set (the default), plots the rank axis in logarithmic scale using matplotlib.pyplot.semilogx(); otherwise, plots it in linear scale using matplotlib.pyplot.plot()
kwargs (dict, optional) – Extra plotting parameters, which are passed directly to matplotlib.pyplot.plot().

Returns

The number of classes (clients) in the given scores.

Return type

int
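The sketch below shows one way a cumulative match characteristic could be computed from a list of (negatives, positives) tuples. It is an illustration only, not the code behind bob.measure.cmc(): the function names are hypothetical, ties are counted against the probe, and every probe is assumed to have at least one positive score (closed-set case).

```python
def recognition_rate_at_rank(cmc_scores, rank=1):
    """Fraction of probes whose best positive score is within the given rank."""
    hits = 0
    for negatives, positives in cmc_scores:
        best_positive = max(positives)
        # Rank is 1 plus the number of negatives scoring at least as high
        # as the best positive (tie handling is an assumption of this sketch).
        probe_rank = 1 + sum(n >= best_positive for n in negatives)
        if probe_rank <= rank:
            hits += 1
    return hits / len(cmc_scores)


def cmc_curve(cmc_scores):
    """Recognition rates for ranks 1..max_rank, where max_rank is the largest possible rank."""
    max_rank = max(len(negatives) for negatives, _ in cmc_scores) + 1
    return [recognition_rate_at_rank(cmc_scores, r) for r in range(1, max_rank + 1)]


cmc_scores = [
    ([0.1, 0.2], [0.9]),  # correct identity ranked first
    ([0.8, 0.3], [0.5]),  # correct identity ranked second
    ([0.7, 0.9], [0.6]),  # correct identity ranked third
]
curve = cmc_curve(cmc_scores)
```

The resulting list is non-decreasing and reaches 1.0 at the maximum rank, which is the shape a CMC plot displays against the rank axis.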

bob.measure.plot.detection_identification_curve(cmc_scores, far_values=[0.0001, 0.00017782794100389227, 0.00031622776601683794, 0.0005623413251903491, 0.001, 0.0017782794100389228, 0.0031622776601683794, 0.005623413251903491, 0.01, 0.01778279410038923, 0.03162277660168379, 0.05623413251903491, 0.1, 0.1778279410038923, 0.31622776601683794, 0.5623413251903491, 1.0], rank=1, logx=True, **kwargs)[source]¶

Plots the Detection & Identification curve over the FPR

This curve is designed to be used in an open-set identification protocol, and is defined in Chapter 14.1 of [LiJain2005]. It requires at least one open-set probe item, i.e., one with no corresponding gallery item, such that the positives for that pair are None.

The detection and identification curve first computes FPR thresholds based on the out-of-set probe scores (negative scores). For each probe item, the maximum negative score is used. Then, it plots the detection and identification rates for those thresholds, which are based on the in-set probe scores only. See [LiJain2005] for more details.

[LiJain2005] Stan Li and Anil K. Jain, Handbook of Face Recognition, Springer, 2005

Parameters

cmc_scores (array) – 1D float array containing the CMC values (See bob.measure.cmc())
rank (int, optional) – The rank for which the curve should be plotted
far_values (list, optional) – The values for the FPR (FAR), where the CPR (CAR) should be plotted; each value should be in range [0,1].
logx (bool, optional) – If set (the default), plots the rank axis in logarithmic scale using matplotlib.pyplot.semilogx(); otherwise, plots it in linear scale using matplotlib.pyplot.plot()
kwargs (dict, optional) – Extra plotting parameters, which are passed directly to matplotlib.pyplot.plot().

Returns

The lines that were added as defined by the return value of matplotlib.pyplot.plot().

Return type

A set of utilities to load score files with different formats.

bob.measure.load.split(filename) → negatives, positives[source]¶

Loads the scores from the given file and splits them into positive and negative arrays. The file must be a two-column file where the first column contains -1 or 1 (for negative or positive, respectively) and the second column contains the corresponding scores.

Parameters

filename (str) – The name of the file containing the scores.

Returns

negatives (1D numpy.ndarray of type float) – This array contains the list of negative scores
positives (1D numpy.ndarray of type float) – This array contains the list of positive scores
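A minimal sketch of parsing the two-column layout described above is shown here. It does not reproduce the library's loader: the function name split_scores is hypothetical, it returns plain lists rather than numpy arrays, it assumes whitespace-separated columns, and it skips blank lines as a convenience.

```python
import io


def split_scores(lines):
    """Split two-column score lines into (negatives, positives) lists.

    Hypothetical helper: whitespace-separated columns and blank-line
    skipping are assumptions of this sketch.
    """
    negatives, positives = [], []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label, score = float(fields[0]), float(fields[1])
        # First column: -1 marks a negative sample, 1 marks a positive sample.
        if label > 0:
            positives.append(score)
        else:
            negatives.append(score)
    return negatives, positives


# Any iterable of lines works, including an open file object.
example = io.StringIO("-1 0.2\n1 0.9\n-1 0.1\n")
negatives, positives = split_scores(example)
```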

bob.measure.load.split_files(filenames)[source]¶

Parse a list of files using split()

Parameters

filenames (list) – A list of file paths

Returns

list – A list of tuples, where each tuple contains the negative and positive scores for one probe of the database. Both negatives and positives can be either a 1D numpy.ndarray of type float, or None.

Utility functions for bob.measure

bob.measure.utils.remove_nan(scores)[source]¶

Remove NaN(s) in the given array

Parameters

scores (numpy.ndarray) – array

Returns

numpy.ndarray – array without NaN(s)
int – number of NaN(s) in the input array
int – length of the input array

bob.measure.utils.get_fta(scores)[source]¶

Calculates the Failure To Acquire (FtA) rate, i.e., the proportion of NaN(s) in the input scores

Parameters

scores – Tuple of (positive, negative) numpy.ndarray.

Returns

(numpy.ndarray, numpy.ndarray) – scores without NaN(s)
float – failure to acquire rate
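The two helpers above can be pictured with the following sketch, which removes NaN entries and reports their proportion. It is not the library code: the function names are hypothetical, plain lists are used instead of numpy arrays, and the FtA rate is assumed to be the number of NaN scores divided by the total number of scores across both arrays.

```python
import math


def remove_nan_sketch(scores):
    """Return (scores without NaN, number of NaN entries, original length)."""
    cleaned = [s for s in scores if not math.isnan(s)]
    return cleaned, len(scores) - len(cleaned), len(scores)


def fta_sketch(scores):
    """Return ((cleaned arrays), FtA rate) for a tuple of score arrays.

    Assumption: the rate is computed over all arrays in the tuple combined.
    """
    cleaned_arrays = []
    nan_count = 0
    total = 0
    for array in scores:
        cleaned, n_nan, length = remove_nan_sketch(array)
        cleaned_arrays.append(cleaned)
        nan_count += n_nan
        total += length
    return tuple(cleaned_arrays), nan_count / total


nan = float("nan")
cleaned, fta = fta_sketch(([0.9, nan], [0.1, 0.2]))
```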

bob.measure.utils.get_fta_list(scores)[source]¶

Get FTAs for a list of scores

Parameters

scores (list) – list of scores

Returns

neg_list (list) – list of negatives
pos_list (list) – list of positives
fta_list (list) – list of FTAs

bob.measure.utils.get_thres(criter, neg, pos, far=None)[source]¶

Get threshold for the given positive/negatives scores and criterion

Parameters

criter – Criterion (eer or hter or far)
neg (numpy.ndarray) – array of negative scores
pos (numpy.ndarray) – array of positive scores

Returns

threshold

Return type

float

bob.measure.utils.get_colors(n)[source]¶

Get a list of matplotlib colors

Parameters: n (int) – Number of colors to output
Returns: list of colors
Return type: list

bob.measure.utils.get_linestyles(n, on=True)[source]¶

Get a list of matplotlib linestyles

Parameters: n (int) – Number of linestyles to output
Returns: list of linestyles
Return type: list

bob.measure.utils.confidence_for_indicator_variable(x, n, alpha=0.05)[source]¶

Calculates the confidence interval for proportion estimates. The Clopper-Pearson interval method is used for estimating the confidence intervals.

Parameters

x (int) – The number of successes.
n (int) – The number of trials.
alpha (float, optional) – The 1-confidence value that you want. For example, alpha should be 0.05 to obtain 95% confidence intervals.

Returns

a tuple of (lower_bound, upper_bound) which shows the limit of your success rate: lower_bound < x/n < upper_bound

Return type

(float, float)
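For readers unfamiliar with the Clopper-Pearson method, the sketch below computes the interval from its definition: the lower bound is the success probability at which observing x or more successes has probability alpha/2, and the upper bound is the one at which observing x or fewer successes has probability alpha/2. This is a standard-library illustration that solves both equations by bisection; it is not how the library computes the interval, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(h, iterations=100):
    """Bisection on [0, 1] for a function that rises from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x, n, alpha=0.05):
    """Return (lower_bound, upper_bound) of the exact binomial interval."""
    half = alpha / 2.0
    if x == 0:
        lower = 0.0
    else:
        # Solve P(X >= x | p) = alpha / 2 for p.
        lower = _root_of_increasing(lambda p: (1.0 - binom_cdf(x - 1, n, p)) - half)
    if x == n:
        upper = 1.0
    else:
        # Solve P(X <= x | p) = alpha / 2 for p.
        upper = _root_of_increasing(lambda p: half - binom_cdf(x, n, p))
    return lower, upper


lower, upper = clopper_pearson(5, 10)
```

With no successes the lower bound is 0 and the upper bound has the closed form 1 - (alpha/2)^(1/n), which is a convenient sanity check for any implementation.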

Runs error analysis on score sets, outputs metrics and plots

bob.measure.script.figure.check_list_value(values, desired_number, name, name2='systems')[source]¶

class bob.measure.script.figure.MeasureBase(ctx, scores, evaluation, func_load)[source]¶

Bases: object

Base class for metrics and plots. This abstract class defines the framework to plot or compute metrics from a list of (positive, negative) score tuples.

func_load¶: Function that is used to load the input files

run()[source]¶: Generate outputs (e.g. metrics, files, pdf plots). This function calls abstract methods init_process() (before loop), compute() (in the loop iterating through the different systems) and end_process() (after the loop).

init_process()[source]¶: Called in MeasureBase().run before iterating through the different systems. Should be reimplemented in derived classes

abstract compute(idx, input_scores, input_names)[source]¶

Compute metrics or plots from the given scores provided by run(). Should be reimplemented in derived classes

Parameters

idx (int) – index of the system
input_scores (list) – list of scores returned by the loading function
input_names (list) – list of base names for the input file of the system

abstract end_process()[source]¶: Called in MeasureBase().run after iterating through the different systems. Should be reimplemented in derived classes
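The control flow described for run(), init_process(), compute() and end_process() follows the template-method pattern. The sketch below mirrors that flow with stand-in classes; MeasureSketch and MeanScore are hypothetical names, the constructor takes a simple list of (name, scores) pairs instead of the real (ctx, scores, evaluation, func_load) arguments, and nothing here is taken from the library's source.

```python
from abc import ABC, abstractmethod


class MeasureSketch(ABC):
    """Hypothetical stand-in that mimics the documented call order of run()."""

    def __init__(self, systems):
        # systems: list of (name, scores) pairs, one per system (assumed layout)
        self.systems = systems

    def run(self):
        self.init_process()  # before the loop
        for idx, (name, scores) in enumerate(self.systems):
            self.compute(idx, scores, name)  # once per system
        return self.end_process()  # after the loop

    def init_process(self):
        """Optional hook; derived classes may override it."""

    @abstractmethod
    def compute(self, idx, input_scores, input_names):
        """Process one system."""

    @abstractmethod
    def end_process(self):
        """Finalize and return the outputs."""


class MeanScore(MeasureSketch):
    """Toy subclass that records the mean score of each system."""

    def init_process(self):
        self.rows = []

    def compute(self, idx, input_scores, input_names):
        self.rows.append((idx, input_names, sum(input_scores) / len(input_scores)))

    def end_process(self):
        return self.rows


report = MeanScore([("sys-a", [1.0, 3.0]), ("sys-b", [2.0, 2.0, 5.0])]).run()
```

Keeping the loop in the base class means a derived class only decides what happens per system and how the results are finalized.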

class bob.measure.script.figure.Metrics(ctx, scores, evaluation, func_load, names=('False Positive Rate', 'False Negative Rate', 'Precision', 'Recall', 'F1-score', 'Area Under ROC Curve', 'Area Under ROC Curve (log scale)'))[source]¶

Bases: bob.measure.script.figure.MeasureBase

Compute metrics from score files

log_file¶

output stream

Type: str

get_thres(criterion, dev_neg, dev_pos, far)[source]¶

compute(idx, input_scores, input_names)[source]¶: Compute metrics thresholds and tables (FPR, FNR, precision, recall, f1_score) for given system inputs

end_process()[source]¶: Close log file if needed

class bob.measure.script.figure.MultiMetrics(ctx, scores, evaluation, func_load, names=('NaNs Rate', 'False Positive Rate', 'False Negative Rate', 'False Accept Rate', 'False Reject Rate', 'Half Total Error Rate'))[source]¶

Bases: bob.measure.script.figure.Metrics

Computes average of metrics based on several protocols (cross validation)

log_file¶

output stream

Type: str

names¶

List of names for the metrics.

Type: tuple

compute(idx, input_scores, input_names)[source]¶: Computes the average of metrics over several protocols.

end_process()[source]¶: Close log file if needed

class bob.measure.script.figure.PlotBase(ctx, scores, evaluation, func_load)[source]¶

Bases: bob.measure.script.figure.MeasureBase

Base class for plots. Groups several options and code shared by the different plots

init_process()[source]¶: Open pdf and set axis font size if provided

end_process()[source]¶: Set title, legend, axis labels, grid colors, save figures, draw lines and close pdf if needed

class bob.measure.script.figure.Roc(ctx, scores, evaluation, func_load)[source]¶

Handles the plotting of ROC

compute(idx, input_scores, input_names)[source]¶: Plot ROC for dev and eval data using bob.measure.plot.roc()

class bob.measure.script.figure.Det(ctx, scores, evaluation, func_load)[source]¶

Handles the plotting of DET

compute(idx, input_scores, input_names)[source]¶: Plot DET for dev and eval data using bob.measure.plot.det()

class bob.measure.script.figure.Epc(ctx, scores, evaluation, func_load, hter='HTER')[source]¶

Handles the plotting of EPC

compute(idx, input_scores, input_names)[source]¶: Plot EPC using bob.measure.plot.epc()

class bob.measure.script.figure.GridSubplot(ctx, scores, evaluation, func_load)[source]¶