Python API

This section includes information for using the pure Python API of bob.learn.em.

Classes

`bob.learn.em.KMeansMachine`(n_clusters[, ...])	Stores the k-means cluster parameters (the centroid of each cluster).
`bob.learn.em.GMMStats`(n_gaussians, n_features)	Stores accumulated statistics of a GMM.
`bob.learn.em.GMMMachine`(n_gaussians[, ...])	Transformer that stores the parameters of a Gaussian Mixture Model (GMM).
`bob.learn.em.WCCN`([pinv])	Trains a linear machine to perform Within-Class Covariance Normalization (WCCN).
`bob.learn.em.Whitening`([pinv])	Trains an estimator to perform Cholesky whitening.

Functions

bob.learn.em.linear_scoring(models_means, ...)

Estimation of the LLR between a target model and the UBM for a test instance.

Detailed Information

class bob.learn.em.GMMMachine(n_gaussians: int, trainer: str = 'ml', ubm: Optional[GMMMachine] = None, convergence_threshold: float = 1e-05, max_fitting_steps: Optional[int] = 200, random_state: Union[int, RandomState] = 0, weights: Optional[ndarray['n_gaussians', float]] = None, k_means_trainer: Optional[KMeansMachine] = None, update_means: bool = True, update_variances: bool = False, update_weights: bool = False, mean_var_update_threshold: float = 2.220446049250313e-16, map_alpha: float = 0.5, map_relevance_factor: Union[None, float] = 4, **kwargs)

Bases: BaseEstimator

Transformer that stores the parameters of a Gaussian Mixture Model (GMM).

This class implements the statistical model of a mixture of multivariate Gaussian distributions with diagonal covariance matrices (GMM), as well as ways to train such a model on data.

A GMM is defined as \(\sum_{c=1}^{C} \omega_c \mathcal{N}(x | \mu_c, \sigma_c)\), where \(C\) is the number of Gaussian components, and \(\mu_c\), \(\sigma_c\) and \(\omega_c\) are respectively the mean, variance and weight of each Gaussian component \(c\). See Section 2.3.9 of Bishop, “Pattern recognition and machine learning”, 2006.

Two types of training are available, MLE and MAP, selected with the trainer parameter.

Maximum Likelihood Estimation (MLE, ML)

The mixtures are initialized (with k-means by default). The means, variances, and weights of the mixtures are then trained on the data to increase the likelihood.
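As a rough illustration of the maximization step, the sketch below re-estimates weights, means, and diagonal variances from accumulated statistics of the kind stored in GMMStats (t, n, sum_px, sum_pxx). It is plain Python written for this page, not the library's implementation; the function name m_step_from_stats and the variance_floor argument are hypothetical.

```python
def m_step_from_stats(t, n, sum_px, sum_pxx, variance_floor=1e-12):
    """Re-estimate GMM parameters from accumulated statistics.

    t: number of samples; n: per-Gaussian sum of responsibilities;
    sum_px / sum_pxx: per-Gaussian first- and second-order statistics.
    Assumes every n[c] is strictly positive (hypothetical helper).
    """
    weights = [n_c / t for n_c in n]
    means = [[f / n_c for f in f_c] for f_c, n_c in zip(sum_px, n)]
    variances = [
        # E[x^2] - E[x]^2, floored so that a variance never reaches zero.
        [max(s / n_c - m * m, variance_floor) for s, m in zip(s_c, m_c)]
        for s_c, m_c, n_c in zip(sum_pxx, means, n)
    ]
    return weights, means, variances


# Two Gaussians, one feature, four samples in total.
weights, means, variances = m_step_from_stats(
    t=4, n=[2.0, 2.0], sum_px=[[2.0], [10.0]], sum_pxx=[[4.0], [52.0]]
)
```

In a full EM loop this step would alternate with an expectation step that recomputes the statistics from the updated parameters.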

Maximum a Posteriori (MAP)

The MAP machine takes another GMM machine as a prior, called the Universal Background Model (UBM). The means, variances, and weights of the MAP mixtures are then trained on the data as an adaptation of the UBM.

Both training methods use an Expectation-Maximization (EM) algorithm to iteratively train the GMM.
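To make the idea of adapting a UBM concrete, here is a minimal sketch of mean-only MAP adaptation in the relevance-factor form commonly attributed to Reynolds et al. It assumes the per-Gaussian statistics n and sum_px have already been computed against the UBM; the function name map_adapt_means is hypothetical and the library's own update may differ in its details.

```python
def map_adapt_means(ubm_means, n, sum_px, relevance_factor=4.0):
    """Interpolate between the data mean and the UBM mean for each Gaussian.

    alpha = n_c / (n_c + relevance_factor) grows with the amount of data
    assigned to Gaussian c, so well-observed Gaussians move further from the UBM.
    """
    adapted = []
    for mu_ubm, n_c, f_c in zip(ubm_means, n, sum_px):
        alpha = n_c / (n_c + relevance_factor)
        # With no data assigned, fall back to the UBM mean itself.
        data_mean = [f / n_c for f in f_c] if n_c > 0 else list(mu_ubm)
        adapted.append(
            [alpha * d + (1.0 - alpha) * m for d, m in zip(data_mean, mu_ubm)]
        )
    return adapted


ubm_means = [[0.0, 0.0], [4.0, 4.0]]
n = [4.0, 0.0]                       # the second Gaussian saw no data
sum_px = [[8.0, 4.0], [0.0, 0.0]]
adapted = map_adapt_means(ubm_means, n, sum_px)
```

The first Gaussian moves halfway towards its data mean (alpha is 0.5), while the second keeps the UBM mean because no samples were assigned to it.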

Note

When any of the means, variances, or variance thresholds is set manually, the k-means initialization is skipped in fit.

means, variances, variance_thresholds: Gaussian parameters.

acc_stats(X): Returns the statistics for X.

fit(X, y=None): Trains the GMM on data until convergence or the maximum number of steps is reached.

classmethod from_hdf5(hdf5, ubm=None): Creates a new GMMMachine object from an HDF5File object.

property g_norms: Precomputed g_norms (depends on variances and feature shape).

initialize_gaussians(data: Optional[ndarray['n_samples', 'n_features', float]] = None): Populates the Gaussian parameters with either k-means or the UBM values.

is_similar_to(other, rtol=1e-05, atol=1e-08): Returns True if other has the same Gaussians (within a tolerance).

load(hdf5): Overwrites the current state with the state stored in an HDF5File object.

log_likelihood(data: ndarray['n_samples', 'n_features', float])

Returns the current log likelihood for a set of data in this Machine.

Parameters: data – Data to compute the log likelihood on.
Returns: The log likelihood of each sample.
Return type: array of shape (n_samples,)

log_weighted_likelihood(data: ndarray['n_samples', 'n_features', float])

Returns the weighted log likelihood for each Gaussian for a set of data.

Parameters: data – Data to compute the log likelihood on.
Returns: The weighted log likelihood of each sample of each Gaussian.
Return type: array of shape (n_gaussians, n_samples)
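The relation between the two quantities above can be sketched in plain Python: the weighted log likelihood is computed per Gaussian and per sample, and the per-sample log likelihood is its log-sum-exp over the Gaussians. This is an illustration of the math for a diagonal-covariance GMM, not the library's code; the free functions below only borrow the method names.

```python
import math


def log_weighted_likelihood(data, weights, means, variances):
    """Return a nested list of shape (n_gaussians, n_samples) holding
    log(w_c) + log N(x | mu_c, diag(var_c))."""
    result = []
    for w, mu, var in zip(weights, means, variances):
        # Log of the normalization constant of a diagonal Gaussian.
        log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        row = []
        for x in data:
            mahalanobis = sum((xi - m) ** 2 / v for xi, m, v in zip(x, mu, var))
            row.append(math.log(w) + log_norm - 0.5 * mahalanobis)
        result.append(row)
    return result


def log_likelihood(data, weights, means, variances):
    """Return the log likelihood of each sample, shape (n_samples,)."""
    per_gaussian = log_weighted_likelihood(data, weights, means, variances)
    out = []
    for column in zip(*per_gaussian):  # one column per sample
        peak = max(column)             # log-sum-exp trick for stability
        out.append(peak + math.log(sum(math.exp(v - peak) for v in column)))
    return out


weights = [0.4, 0.6]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 2.0]]
data = [[0.0, 0.0], [3.0, 3.0], [1.5, 1.5]]

per_gaussian = log_weighted_likelihood(data, weights, means, variances)
per_sample = log_likelihood(data, weights, means, variances)
```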

property log_weights: Retrieves the logarithm of the weights.

property means: The means of each Gaussian.

save(hdf5): Saves the current state in an HDF5File object.

property shape: Shape of the Gaussians in the GMM machine.

stats_per_sample(X)

transform(X): Returns the statistics for X.

property variance_thresholds: Threshold below which variances are clamped to prevent precision losses.

property variances: The (diagonal) variances of the Gaussians.

property weights: The weights of each Gaussian mixture.

class bob.learn.em.GMMStats(n_gaussians: int, n_features: int, like=None, **kwargs)

Bases: object

Stores accumulated statistics of a GMM.

log_likelihood

The sum of the log likelihoods of each sample on a GMM.

Type: float

t

The number of considered samples.

Type: int

n

Sum of responsibilities.

Type: array of shape (n_gaussians,)

sum_px

First-order statistics.

Type: array of shape (n_gaussians, n_features)

sum_pxx

Second-order statistics.

Type: array of shape (n_gaussians, n_features)
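The attributes above can be read as running sums over the samples. The sketch below accumulates t, n, sum_px and sum_pxx from data and precomputed responsibilities (the posterior probability of each Gaussian for each sample). It is a plain-Python illustration under that assumption; accumulate_stats is a hypothetical helper, and the log_likelihood field is left out for brevity.

```python
def accumulate_stats(data, responsibilities):
    """Accumulate zeroth-, first- and second-order statistics.

    data: n_samples x n_features nested list.
    responsibilities: n_samples x n_gaussians, each row summing to 1.
    """
    n_gaussians = len(responsibilities[0])
    n_features = len(data[0])
    t = len(data)
    n = [0.0] * n_gaussians
    sum_px = [[0.0] * n_features for _ in range(n_gaussians)]
    sum_pxx = [[0.0] * n_features for _ in range(n_gaussians)]
    for x, posteriors in zip(data, responsibilities):
        for c, p in enumerate(posteriors):
            n[c] += p                        # sum of responsibilities
            for d, xd in enumerate(x):
                sum_px[c][d] += p * xd       # weighted sum of x
                sum_pxx[c][d] += p * xd * xd  # weighted sum of x squared
    return t, n, sum_px, sum_pxx


data = [[1.0, 2.0], [3.0, 4.0]]
responsibilities = [[1.0, 0.0], [0.5, 0.5]]
t, n, sum_px, sum_pxx = accumulate_stats(data, responsibilities)
```

Because every field is a sum, statistics computed on separate chunks of data can be combined by adding them field by field.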

classmethod from_hdf5(hdf5): Creates a new GMMStats object from an HDF5File object.

init_fields(log_likelihood=0.0, t=0, n=None, sum_px=None, sum_pxx=None): Initializes the statistics values to a defined value, or zero by default.

is_similar_to(other, rtol=1e-05, atol=1e-08): Returns True if other has the same values (within a tolerance).

load(hdf5): Overwrites the current statistics with those in an HDF5File object.

property nbytes: The number of bytes used by the statistics n, sum_px, sum_pxx.

reset(): Sets all statistics to zero.

resize(n_gaussians, n_features): Reinitializes the machine with new dimensions.

save(hdf5): Saves the current statistics in an HDF5File object.

property shape: The number of Gaussians and their dimensionality.

class bob.learn.em.ISVMachine(r_U, em_iterations=10, relevance_factor=4.0, random_state=0, ubm=None, ubm_kwargs=None, **kwargs)

Bases: FactorAnalysisBase

Implements the Inter-Session Variability (ISV) modeling hypothesis on top of GMMs.

Inter-Session Variability (ISV) modeling is a session variability modeling technique built on top of the Gaussian mixture modeling approach. It hypothesizes that within-class variations are embedded in a linear subspace of the GMM means space and that these variations can be suppressed by an offset with respect to each mean during MAP adaptation. For more information, see [McCool2013].

Parameters

r_U (int) – Dimension of the subspace U
em_iterations (int) – Number of EM iterations
relevance_factor (float) – Factor analysis relevance factor
random_state (int) – random_state for the random number generator
ubm (bob.learn.em.GMMMachine or None) – A trained UBM (Universal Background Model). If None, the UBM is trained with a new bob.learn.em.GMMMachine when fit is called, with ubm_kwargs as parameters.

e_step(X, y, n_samples_per_class, n_acc, f_acc): E-step of the EM algorithm.

enroll(X)

Enrolls a new client. In ISV, the enrolment is defined as \(m + Dz\), with the latent variable z representing the enrolled model.

Parameters: X (list of bob.learn.em.GMMStats) – List of statistics to be enrolled
Returns: self – z
Return type: object

enroll_using_array(X)

Enrolls a new client using a numpy array as input.

Parameters

X (array) – features to be enrolled
iterations (int) – Number of iterations to perform

Returns

self – z

Return type

object

fit(X, y)

Trains the U matrix (session variability matrix).

Parameters

X (numpy.ndarray) – Nxd features of N GMM statistics
y (numpy.ndarray) – The input labels, a 1D numpy array of shape (number of samples, )

Returns

self – Returns self.

Return type

object

m_step(acc_U_A1_acc_U_A2_list)

ISV M-step. This updates the U matrix.

Parameters

acc_U_A1 (array) – Accumulated statistics for U_A1, of shape (n_gaussians, r_U, r_U)
acc_U_A2 (array) – Accumulated statistics for U_A2, of shape (n_gaussians * feature_dimension, r_U)

score(latent_z, data)

Computes the ISV score.

Parameters

latent_z (numpy.ndarray) – Latent representation of the client (E[z_i])
data (list of bob.learn.em.GMMStats) – List of statistics to be scored

Returns

score – The linear score

Return type

float

transform(X)

class bob.learn.em.IVectorMachine(ubm: GMMMachine, dim_t: int = 2, convergence_threshold: Optional[float] = None, max_iterations: int = 25, update_sigma: bool = True, variance_floor: float = 1e-10, **kwargs)

Bases: BaseEstimator

Trains and projects data using I-Vector.

Dimensions:

dim_c: number of Gaussians
dim_d: number of features
dim_t: dimension of the i-vector

Attributes

T (c,d,t): The total variability matrix \(T\)
sigma (c,d): The diagonal covariance matrix \(\Sigma\)

fit(X: Union[List[ndarray], Bag], y=None) → IVectorMachine

Trains the IVectorMachine.

Repeats the EM steps until max_iterations is reached.

project(stats: GMMStats) → ndarray

Projects the GMMStats on the IVectorMachine.

This takes data already projected onto the UBM.

Returns:

The IVector of the input stats.
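For orientation, an i-vector is usually computed as the posterior mean of the latent variable given the statistics. The expression below is the standard formulation from the i-vector literature and is not taken from this library's source; \(N\) and \(F\) denote the zeroth- and first-order statistics arranged as a block-diagonal matrix and a supervector, and \(m\) the UBM mean supervector (all three are notation introduced here).

```latex
w = \left(I + T^T \Sigma^{-1} N T\right)^{-1} T^T \Sigma^{-1} \left(F - N m\right)
```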

transform(X: List[GMMStats]) → List[ndarray]

Transforms the data using the trained IVectorMachine.

This takes MFCC data, projects it onto the UBM, and computes the IVector statistics.

Parameters:

data: The data (MFCC features) to transform. Arrays of shape (n_samples, n_features).

Returns:

The IVector for each sample. Arrays of shape (dim_t,)

class bob.learn.em.JFAMachine(r_U, r_V, em_iterations=10, relevance_factor=4.0, random_state=0, ubm=None, ubm_kwargs=None, **kwargs)

Bases: FactorAnalysisBase

Joint Factor Analysis (JFA) is an extension of ISV. Besides the within-class assumption (modeled with \(U\)), it also hypothesizes that between-class variations are embedded in a low-rank rectangular matrix \(V\). In the supervector notation, this model has the following form: \(\mu_{i, j} = m + Ux_{i, j} + Vy_{i} + Dz_{i}\).

For more information, see [McCool2013].

Parameters

ubm (bob.learn.em.GMMMachine) – A trained UBM (Universal Background Model)
r_U (int) – Dimension of the subspace U
r_V (int) – Dimension of the subspace V
em_iterations (int) – Number of EM iterations
relevance_factor (float) – Factor analysis relevance factor
random_state (int) – random_state for the random number generator

e_step_d(X, y, n_samples_per_class, latent_x, latent_y, n_acc, f_acc)

JFA E-step for the D matrix.

Parameters

X (list of bob.learn.em.GMMStats) – List of statistics
y (list of int) – List of labels
n_classes (int) – Number of classes
latent_x (array) – E(x) latent variable
latent_y (array) – E(y) latent variable
latent_z (array) – E(z) latent variable
n_acc (array) – Accumulated 0th-order statistics
f_acc (array) – Accumulated 1st-order statistics

Returns

acc_D_A1 (array) – Accumulated statistics for D_A1, of shape (n_gaussians * feature_dimension,)
acc_D_A2 (array) – Accumulated statistics for D_A2, of shape (n_gaussians * feature_dimension,)

e_step_u(X, y, n_samples_per_class, latent_y)

JFA E-step for the U matrix.

Parameters

X (list of bob.learn.em.GMMStats) – List of statistics
y (list of int) – List of labels
latent_y (array) – E(y) latent variable

Returns

acc_U_A1 (array) – Accumulated statistics for U_A1, of shape (n_gaussians, r_U, r_U)
acc_U_A2 (array) – Accumulated statistics for U_A2, of shape (n_gaussians * feature_dimension, r_U)

e_step_v(X, y, n_samples_per_class, n_acc, f_acc)

JFA E-step for the V matrix.

Parameters

X (list of bob.learn.em.GMMStats) – List of statistics
y (list of int) – List of labels
n_classes (int) – Number of classes
n_acc (array) – Accumulated 0th-order statistics
f_acc (array) – Accumulated 1st-order statistics

Returns

acc_V_A1 (array) – Accumulated statistics for V_A1, of shape (n_gaussians, r_V, r_V)
acc_V_A2 (array) – Accumulated statistics for V_A2, of shape (n_gaussians * feature_dimension, r_V)

enroll(X)

Enrolls a new client. In JFA the enrolment is defined as: \(m + Vy + Dz\) with the latent variables y and z representing the enrolled model.

Parameters: X (list of bob.learn.em.GMMStats) – List of statistics
Returns: self – z, y latent variables
Return type: array

finalize_u(X, y, n_samples_per_class, latent_y)

Computes E[x] one last time.

Parameters

X (list of bob.learn.em.GMMStats) – List of statistics
y (list of int) – List of labels
n_classes (int) – Number of classes
latent_y (array) – E[y] latent variable

Returns

latent_x – E[x]

Return type

array

finalize_v(X, y, n_samples_per_class, n_acc, f_acc)

Computes E[y] one last time.

Parameters

X (list of bob.learn.em.GMMStats) – List of statistics
y (list of int) – List of labels
n_classes (int) – Number of classes
n_acc (array) – Accumulated 0th-order statistics
f_acc (array) – Accumulated 1st-order statistics

Returns

latent_y – E[y]

Return type

array

fit(X, y)

Trains the JFA model (the U, V, and D matrices).

Parameters

X (numpy.ndarray) – Nxd features of N GMM statistics
y (numpy.ndarray) – The input labels, a 1D numpy array of shape (number of samples, )

Returns

self – Returns self.

Return type

object

m_step_d(acc_D_A1_acc_D_A2_list)

D Matrix M-step. This updates the D matrix.

Parameters

acc_D_A1 (array) – Accumulated statistics for D_A1, of shape (n_gaussians * feature_dimension,)
acc_D_A2 (array) – Accumulated statistics for D_A2, of shape (n_gaussians * feature_dimension,)

m_step_u(acc_U_A1_acc_U_A2_list)

U Matrix M-step. This updates the U matrix.

Parameters

acc_U_A1 (array) – Accumulated statistics for U_A1, of shape (n_gaussians, r_U, r_U)
acc_U_A2 (array) – Accumulated statistics for U_A2, of shape (n_gaussians * feature_dimension, r_U)

m_step_v(acc_V_A1_acc_V_A2_list)

V Matrix M-step. This updates the V matrix.

Parameters

acc_V_A1 (array) – Accumulated statistics for V_A1, of shape (n_gaussians, r_V, r_V)
acc_V_A2 (array) – Accumulated statistics for V_A2, of shape (n_gaussians * feature_dimension, r_V)

score(model, data)

Computes the JFA score.

Parameters

latent_z (numpy.ndarray) – Latent representation of the client (E[z_i])
data (list of bob.learn.em.GMMStats) – List of statistics to be scored

Returns

score – The linear score

Return type

float

class bob.learn.em.KMeansMachine(n_clusters: int, init_method: Union[str, ndarray] = 'k-means||', convergence_threshold: float = 1e-05, max_iter: int = 20, random_state: Union[int, RandomState] = 0, init_max_iter: Optional[int] = 5, oversampling_factor: float = 2, **kwargs)

Bases: BaseEstimator

Stores the k-means cluster parameters (the centroid of each cluster).

Allows the clustering of data with the fit method.

The training works in two phases:

An initialization (setting the initial values of the centroids).
An EM loop reducing the total distance between the data points and their closest centroid.

The initialization can use an iterative process to find the best set of coordinates, use random starting points, or take specified coordinates. The init_method parameter specifies which of these behaviors is used.
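The second phase can be sketched as the classic alternation between assigning samples to their closest centroid and moving each centroid to the mean of its samples. The code below is a plain-Python illustration that takes the initial centroids as given (the "specified coordinates" case); the function names and the stopping rule based on the change in total distance are assumptions, not the library's implementation.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(data, init_centroids, max_iter=20, convergence_threshold=1e-5):
    """Refine init_centroids with alternating assignment and update steps."""
    centroids = [list(c) for c in init_centroids]
    labels = []
    previous_total = float("inf")
    for _ in range(max_iter):
        # Assignment step: index of the closest centroid for each sample.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(x, centroids[k]))
            for x in data
        ]
        total = sum(squared_distance(x, centroids[k])
                    for x, k in zip(data, labels))
        # Update step: move each centroid to the mean of its samples.
        for k in range(len(centroids)):
            members = [x for x, label in zip(data, labels) if label == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
        # Stop once the total distance no longer changes noticeably.
        if abs(previous_total - total) < convergence_threshold:
            break
        previous_total = total
    return centroids, labels


data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids, labels = kmeans(data, init_centroids=[[0.0, 0.0], [10.0, 10.0]])
```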

centroids_

The current cluster centroids. Available after fitting.

Type: ndarray of shape (n_clusters, n_features)

fit(X, y=None): Fits this machine on data samples.

get_variances_and_weights_for_each_cluster(data: ndarray)

Returns the variance and weight of each cluster for data clustered by the machine.

For each cluster, finds the subset of the samples that is closest to that centroid, and calculates: 1) the variance of that subset (the cluster variance) 2) the proportion of samples represented by that subset (the cluster weight)

Parameters

data – The data to compute the variance of.

Returns

variances: ndarray of shape (n_clusters, n_features): For each cluster, the variance in each dimension of the data.
weights: ndarray of shape (n_clusters,): Weight (proportion of data points) of each cluster.

Return type

Tuple of arrays
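The computation described above can be sketched as follows. This plain-Python version assumes the variance is the population variance (dividing by the subset size) taken around the subset's own mean, and it returns zeros for an empty cluster; both choices, and the function name, are assumptions made for the illustration.

```python
def variances_and_weights(data, centroids):
    """Per-cluster variance (per feature) and share of the samples."""
    n_features = len(data[0])
    # Index of the closest centroid for each sample.
    labels = [
        min(range(len(centroids)),
            key=lambda k: sum((xi - ci) ** 2
                              for xi, ci in zip(x, centroids[k])))
        for x in data
    ]
    variances, weights = [], []
    for k in range(len(centroids)):
        members = [x for x, label in zip(data, labels) if label == k]
        weights.append(len(members) / len(data))
        if not members:
            variances.append([0.0] * n_features)
            continue
        columns = list(zip(*members))  # one tuple of values per feature
        means = [sum(col) / len(col) for col in columns]
        variances.append([
            sum((v - m) ** 2 for v in col) / len(col)
            for col, m in zip(columns, means)
        ])
    return variances, weights


data = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0]]
centroids = [[0.0, 1.0], [10.0, 10.0]]
variances, weights = variances_and_weights(data, centroids)
```

Values of this kind are what a GMM needs, besides the centroids themselves, when it is initialized from a k-means result.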

initialize(data: ndarray): Assigns the means to an initial value using a specified method or randomly.

is_similar_to(obj, r_epsilon=1e-05, a_epsilon=1e-08) → bool

property means: ndarray: An alias for centroids_.

predict(X)

Returns the labels of the closest cluster centroid to the data.

Parameters: X (ndarray of shape (n_samples, n_features)) – Series of data points.
Returns: indices – The indices of the closest cluster for each data point.
Return type: ndarray of shape (n_samples,)

transform(X)

Returns all the distances between the data and each cluster’s mean.

Parameters: X (ndarray of shape (n_samples, n_features)) – Series of data points.
Returns: distances – For each mean, for each point, the squared Euclidean distance between them.
Return type: ndarray of shape (n_clusters, n_samples)
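The sketch below shows how the (n_clusters, n_samples) layout of the distances relates to the labels returned by predict: each label is the row index of the smallest distance in that sample's column. It is a plain-Python illustration using free functions that only borrow the method names.

```python
def transform(X, centroids):
    """Squared Euclidean distances, laid out as (n_clusters, n_samples)."""
    return [
        [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for x in X]
        for c in centroids
    ]


def predict(X, centroids):
    """Index of the closest centroid for each sample."""
    distances = transform(X, centroids)
    # zip(*distances) yields one column (all cluster distances) per sample.
    return [
        min(range(len(centroids)), key=lambda k: column[k])
        for column in zip(*distances)
    ]


X = [[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]]
centroids = [[0.0, 0.0], [10.0, 10.0]]
distances = transform(X, centroids)
labels = predict(X, centroids)
```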

class bob.learn.em.WCCN(pinv=False, **kwargs)

Bases: TransformerMixin, BaseEstimator

Trains a linear machine to perform Within-Class Covariance Normalization (WCCN). WCCN finds the projection matrix W that allows us to linearly project the data matrix X to another (sub)space such that:

\[[(1/K) S_{w}]^{-1} = W W^T\]

where \(W\) is an upper triangular matrix computed using Cholesky decomposition:

\[W = cholesky([(1/K) S_{w} ]^{-1})\]

where:

\(K\) the number of classes
\(S_w\) the within-class scatter; it has dimensions (X.shape[1], X.shape[1]) and is defined as \(S_w = \sum_{k=1}^K \sum_{n \in C_k} (x_n-m_k)(x_n-m_k)^T\), with \(C_k\) being a set representing all samples for class k.
\(m_k\) the class k empirical mean, defined as \(m_k = \frac{1}{N_k}\sum_{n \in C_k} x_n\), with \(N_k\) the number of samples in class k
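The definitions of \(m_k\) and \(S_w\) can be sketched in plain Python as below. The sketch assumes that samples are stored as rows of X, so the scatter matrix has one row and one column per feature; it stops before the inversion and Cholesky steps, and the function name is hypothetical.

```python
def within_class_scatter(X, y):
    """Return (S_w, K): the within-class scatter and the number of classes."""
    n_features = len(X[0])
    classes = sorted(set(y))
    s_w = [[0.0] * n_features for _ in range(n_features)]
    for k in classes:
        members = [x for x, label in zip(X, y) if label == k]
        # Empirical mean m_k of class k.
        m_k = [sum(col) / len(members) for col in zip(*members)]
        for x in members:
            diff = [xi - mi for xi, mi in zip(x, m_k)]
            # Add the outer product (x - m_k)(x - m_k)^T.
            for i in range(n_features):
                for j in range(n_features):
                    s_w[i][j] += diff[i] * diff[j]
    return s_w, len(classes)


X = [[0.0, 0.0], [2.0, 2.0], [10.0, 0.0], [12.0, -2.0]]
y = [0, 0, 1, 1]
s_w, n_classes = within_class_scatter(X, y)
```

Dividing s_w by n_classes, inverting the result, and taking its Cholesky factor would then give the projection described above.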

References

1. Within-class covariance normalization for SVM-based speaker recognition, Andrew O. Hatch, Sachin Kajarekar, and Andreas Stolcke, In INTERSPEECH, 2006.
2. http://en.wikipedia.org/wiki/Cholesky_decomposition

fit(X, y)

transform(X)

class bob.learn.em.Whitening(pinv: bool = False, **kwargs)

Bases: TransformerMixin, BaseEstimator

Trains an estimator to perform Cholesky whitening.

The whitening transformation is a decorrelation method that converts the covariance matrix of a set of samples into the identity matrix \(I\). This effectively linearly transforms random variables such that the resulting variables are uncorrelated and have unit variance.

This transformation is invertible. The method is called the whitening transform because it transforms the input matrix \(X\) closer towards white noise (let’s call it \(\tilde{X}\)):

\[Cov(\tilde{X}) = I\]

with:

\[\tilde{X} = X W\]

where \(W\) is the projection matrix that allows us to linearly project the data matrix \(X\) to another (sub)space such that:

\[[Cov(X)]^{-1} = W W^T\]

\(W\) is computed using Cholesky decomposition:

\[W = cholesky([Cov(X)]^{-1})\]
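As a quick consistency check, the identity covariance follows directly once \(W\) is taken to satisfy \(W W^T = [Cov(X)]^{-1}\), which is what the Cholesky expression implies. The derivation assumes that samples are rows of \(X\) and that \(Cov(X)\) is symmetric positive definite, so that \(W\) is invertible.

```latex
\begin{aligned}
Cov(\tilde{X}) &= Cov(X W) = W^T \, Cov(X) \, W \\
               &= W^T \left(W W^T\right)^{-1} W \\
               &= W^T W^{-T} W^{-1} W = I
\end{aligned}
```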

fit(X, y=None)

transform(X)

bob.learn.em.get_config(): Returns a string containing the configuration information.

bob.learn.em.linear_scoring(models_means: Union[list[bob.learn.em.GMMMachine], ndarray['n_models', 'n_gaussians', 'n_features', float]], ubm: GMMMachine, test_stats: Union[list[bob.learn.em.GMMStats], GMMStats], test_channel_offsets: ndarray['n_test_stats', 'n_gaussians', float] = 0, frame_length_normalization: bool = False) → ndarray['n_models', 'n_test_stats', float]

Estimation of the LLR between a target model and the UBM for a test instance.

Linear scoring is an approximation of the log-likelihood ratio (LLR) that was shown to be as accurate and up to two orders of magnitude more efficient to compute [Glembek2009].

Parameters

models_means – The model(s) to score against. If a list of GMMMachine is given, the means of each model are considered.
ubm – The Universal Background Model. Accepts a GMMMachine object. If the GMMMachine uses MAP, its ubm attribute is used.
test_stats – The instances to score.
test_channel_offsets – Offset values added to the test instances.

Returns

The scores of each probe against each model.

Return type

Array of shape (n_models, n_probes)
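For a single model and a single set of test statistics, linear scoring is commonly written as the product of the model's mean offset from the UBM, scaled by the UBM variances, with the centered first-order statistics. The plain-Python sketch below follows that common form for diagonal covariances; it is an assumption about the formula rather than a copy of the library's code, and the argument layout (including a per-Gaussian, per-feature channel_offset) is hypothetical.

```python
def linear_score(model_means, ubm_means, ubm_variances, n, sum_px, t,
                 channel_offset=None, frame_length_normalization=False):
    """Linear approximation of the LLR for one model and one test instance.

    n, sum_px and t are the zeroth-order statistics, first-order statistics
    and sample count of the test instance, computed against the UBM.
    """
    score = 0.0
    for c in range(len(ubm_means)):
        for d in range(len(ubm_means[c])):
            offset = channel_offset[c][d] if channel_offset is not None else 0.0
            # First-order statistics centered on the (offset) UBM mean.
            centered = sum_px[c][d] - n[c] * (ubm_means[c][d] + offset)
            shift = model_means[c][d] - ubm_means[c][d]
            score += shift / ubm_variances[c][d] * centered
    if frame_length_normalization and t > 0:
        score /= t  # normalize by the number of frames
    return score


ubm_means = [[0.0], [2.0]]
ubm_variances = [[1.0], [4.0]]
model_means = [[1.0], [2.0]]          # only the first Gaussian was adapted
n, sum_px, t = [2.0, 1.0], [[3.0], [2.0]], 3

raw = linear_score(model_means, ubm_means, ubm_variances, n, sum_px, t)
normalized = linear_score(model_means, ubm_means, ubm_variances, n, sum_px, t,
                          frame_length_normalization=True)
```

Only Gaussians whose model mean differs from the UBM mean contribute to the score, which is part of why the approximation is cheap to evaluate.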