Python API
This section includes information for using the pure Python API of bob.ap.
class bob.ap.Ceps(sampling_frequency[, win_length_ms=20.[, win_shift_ms=10.[, n_filters=24[, n_ceps=19[, f_min=0.[, f_max=4000.[, delta_win=2[, pre_emphasis_coeff=0.95[, mel_scale=True[, dct_norm=True]]]]]]]]]]) → new Ceps
Ceps(other) → new Ceps
Bases: bob.ap.Spectrogram
Objects of this class, after configuration, can extract the cepstral coefficients from a 1D audio array/signal.
Parameters:
- sampling_frequency: [float] the sampling frequency (sampling rate)
- win_length_ms: [float] the window length in milliseconds
- win_shift_ms: [float] the window shift in milliseconds
- n_filters: [int] the number of filter bands
- n_ceps: [int] the number of cepstral coefficients
- f_min: [double] the minimum frequency of the filter bank
- f_max: [double] the maximum frequency of the filter bank
- delta_win: [int] the integer delta value used for computing the first and second order derivatives
- pre_emphasis_coeff: [double] the coefficient used for the pre-emphasis
- mel_scale: [bool] tells whether cepstral features are extracted on a linear (LFCC, set it to False) or Mel (MFCC, set it to True, the default) scale
- dct_norm: [bool] a factor by which the cepstral coefficients are multiplied
- other: [Ceps] an object that is, or inherits from, Ceps and that will be deep-copied into a new instance
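The mel_scale flag switches the filter bank between a linear and a Mel frequency axis. As an illustration only, the sketch below spaces band edges equally on either axis, using the widely used conversion mel = 2595 * log10(1 + f / 700). Whether bob.ap uses exactly this Mel variant is an assumption, and the helper names hz_to_mel, mel_to_hz and band_edges are hypothetical, not part of the library.

```python
import math

def hz_to_mel(f_hz):
    """Convert Hz to mels (assumed variant: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def band_edges(f_min, f_max, n_filters, mel_scale=True):
    """Return n_filters + 2 edge frequencies in Hz, equally spaced on the chosen axis."""
    if mel_scale:
        lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    else:
        lo, hi = f_min, f_max
    step = (hi - lo) / (n_filters + 1)
    points = [lo + i * step for i in range(n_filters + 2)]
    return [mel_to_hz(p) for p in points] if mel_scale else points

# Same defaults as in the constructor signature: f_min=0., f_max=4000., n_filters=24
edges_mel = band_edges(0.0, 4000.0, 24, mel_scale=True)
edges_lin = band_edges(0.0, 4000.0, 24, mel_scale=False)
```

On the Mel axis the low-frequency bands come out narrower than on the linear axis, which is the practical difference between MFCC and LFCC features.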
dct_norm
    A factor by which the cepstral coefficients are multiplied
delta_win
    The integer delta value used for computing the first and second order derivatives
energy_bands
    Tells whether we compute a spectrogram or energy bands
energy_filter
    Tells whether we use the energy or the square root of the energy
energy_floor
    The energy flooring threshold
f_max
    The maximum frequency of the filter bank
f_min
    The minimum frequency of the filter bank
get_shape(input) → tuple
    Computes the shape of the output features, given either the size of an input array or the input array itself.
    Parameters:
    - input: [int|array] either an integral value or an array for which the output shape of this extractor is going to be computed
    This method always returns a 2-tuple containing the shape of the output features produced by this extractor.
log_filter
    Tells whether we use the log triangular filter or the triangular filter
mel_scale
    Tells whether cepstral features are extracted on a linear (LFCC) or Mel (MFCC) scale
n_ceps
    The number of cepstral coefficients
n_filters
    The number of filter bands
pre_emphasis_coeff
    The coefficient used for the pre-emphasis
sampling_frequency
    The sampling frequency (sampling rate)
win_length
    The normalized window length w.r.t. the sample frequency
win_length_ms
    The window length of the cepstral analysis in milliseconds
win_shift
    The normalized window shift w.r.t. the sample frequency
win_shift_ms
    The window shift of the cepstral analysis in milliseconds
with_delta
    Tells if we add the first derivatives to the output feature
with_delta_delta
    Tells if we add the second derivatives to the output feature
with_energy
    Tells if we add the energy to the output feature
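The delta_win, with_delta and with_delta_delta settings control how derivatives are appended to the cepstral features. A minimal sketch of one common way to compute such derivatives is the regression formula d[t] = sum over k = 1..delta_win of k * (c[t+k] - c[t-k]), divided by 2 * sum of k^2. Whether bob.ap uses this exact formula or this edge handling is an assumption; the function name delta is hypothetical and not part of the library.

```python
def delta(features, delta_win=2):
    """First-order derivatives of a list of per-frame feature vectors.

    Edge frames are repeated at the boundaries (an assumption; the
    library may treat the first and last frames differently).
    """
    n_frames = len(features)
    denom = 2.0 * sum(k * k for k in range(1, delta_win + 1))
    out = []
    for t in range(n_frames):
        row = []
        for d in range(len(features[t])):
            acc = 0.0
            for k in range(1, delta_win + 1):
                ahead = features[min(t + k, n_frames - 1)][d]
                behind = features[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out

# Toy input: 5 frames with a single coefficient that grows by 1.0 per frame.
ceps = [[0.0], [1.0], [2.0], [3.0], [4.0]]
d1 = delta(ceps)   # what with_delta would append
d2 = delta(d1)     # what with_delta_delta would append
stacked = [c + a + b for c, a, b in zip(ceps, d1, d2)]
```

In this sketch each stacked frame holds the static coefficient followed by its first and second derivative, so enabling both flags triples the feature dimension.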
class bob.ap.Energy(sampling_frequency[, win_length_ms=20.[, win_shift_ms=10.]]) → new Energy
Energy(other) → new Energy
Bases: bob.ap.FrameExtractor
Objects of this class, after configuration, can extract the energy of frames extracted from a 1D audio array/signal.
Parameters:
- sampling_frequency: [float] the sampling frequency (sampling rate)
- win_length_ms: [float] the window length in milliseconds
- win_shift_ms: [float] the window shift in milliseconds
- other: [Energy] an object that is, or inherits from, Energy and that will be deep-copied into a new instance
energy_floor
    The energy flooring threshold
get_shape(input) → tuple
    Computes the shape of the output features, given either the size of an input array or the input array itself.
    Parameters:
    - input: [int|array] either an integral value or an array for which the output shape of this extractor is going to be computed
    This method always returns a 2-tuple containing the shape of the output features produced by this extractor.
sampling_frequency
    The sampling frequency (sampling rate)
win_length
    The normalized window length w.r.t. the sample frequency
win_length_ms
    The window length of the cepstral analysis in milliseconds
win_shift
    The normalized window shift w.r.t. the sample frequency
win_shift_ms
    The window shift of the cepstral analysis in milliseconds
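To make the role of energy_floor concrete, here is a small sketch of a floored log-energy computed on one frame. The use of the natural logarithm, the sum of squared samples as the energy, and the floor value 1.0 are all assumptions for illustration; frame_log_energy is a hypothetical helper, not a bob.ap function.

```python
import math

def frame_log_energy(frame, energy_floor=1.0):
    """Log of the sum of squared samples, floored so that log(0) never occurs."""
    energy = sum(s * s for s in frame)
    return math.log(max(energy, energy_floor))

silent = frame_log_energy([0.0, 0.0, 0.0])   # energy 0 is raised to the floor
loud = frame_log_energy([3.0, 4.0])          # energy 25 is above the floor
```

Without the floor a silent frame would produce log(0); with it, every frame below the threshold maps to the same finite value.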
class bob.ap.FrameExtractor(sampling_frequency[, win_length_ms=20.[, win_shift_ms=10.]]) → new FrameExtractor
FrameExtractor(other) → new FrameExtractor
Bases: object
This class is a base type for classes that perform audio processing on a frame basis. It can be instantiated from Python.
Objects of this class, after configuration, can extract audio frames from a 1D audio array/signal. You can instantiate objects of this class by passing a set of construction parameters or another object whose base type is FrameExtractor.
Parameters:
- sampling_frequency: [float] the sampling frequency (sampling rate)
- win_length_ms: [float] the window length in milliseconds
- win_shift_ms: [float] the window shift in milliseconds
- other: [FrameExtractor] an object that is, or inherits from, FrameExtractor and that will be deep-copied into a new instance
get_shape(input) → tuple
    Computes the shape of the output features, given either the size of an input array or the input array itself.
    Parameters:
    - input: [int|array] either an integral value or an array for which the output shape of this extractor is going to be computed
    This method always returns a 2-tuple containing the shape of the output features produced by this extractor.
sampling_frequency
    The sampling frequency (sampling rate)
win_length
    The normalized window length w.r.t. the sample frequency
win_length_ms
    The window length of the cepstral analysis in milliseconds
win_shift
    The normalized window shift w.r.t. the sample frequency
win_shift_ms
    The window shift of the cepstral analysis in milliseconds
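The relationship between win_length_ms, win_length and the 2-tuple returned by get_shape can be sketched in plain Python. This assumes that the normalized values are sample counts obtained by truncation, that only full windows are kept, and that the second dimension of the shape is the window length; none of these details is stated above, and frame_params, n_frames and extract_frames are hypothetical helpers rather than bob.ap functions.

```python
def frame_params(sampling_frequency, win_length_ms=20.0, win_shift_ms=10.0):
    """Convert the window length and shift from milliseconds to samples."""
    win_length = int(sampling_frequency * win_length_ms / 1000.0)
    win_shift = int(sampling_frequency * win_shift_ms / 1000.0)
    return win_length, win_shift

def n_frames(n_samples, win_length, win_shift):
    """Number of full windows that fit in n_samples (no padding assumed)."""
    if n_samples < win_length:
        return 0
    return 1 + (n_samples - win_length) // win_shift

def extract_frames(signal, win_length, win_shift):
    """Slice a 1D sequence into overlapping frames."""
    count = n_frames(len(signal), win_length, win_shift)
    return [signal[i * win_shift:i * win_shift + win_length] for i in range(count)]

# One second of audio at 8 kHz with the default 20 ms window and 10 ms shift.
win_length, win_shift = frame_params(8000.0)
shape = (n_frames(8000, win_length, win_shift), win_length)
frames = extract_frames(list(range(400)), win_length, win_shift)
```

Under these assumptions the shape depends only on the number of samples, which is why a method like get_shape can accept either an integer size or the array itself.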
class bob.ap.Spectrogram(sampling_frequency[, win_length_ms=20.[, win_shift_ms=10.[, n_filters=24[, f_min=0.[, f_max=4000.[, pre_emphasis_coeff=0.95[, mel_scale=True]]]]]]]) → new Spectrogram
Spectrogram(other) → new Spectrogram
Bases: bob.ap.Energy
Objects of this class, after configuration, can extract the spectrogram from a 1D audio array/signal.
Parameters:
- sampling_frequency: [float] the sampling frequency (sampling rate)
- win_length_ms: [float] the window length in milliseconds
- win_shift_ms: [float] the window shift in milliseconds
- n_filters: [int] the number of filter bands
- f_min: [double] the minimum frequency of the filter bank
- f_max: [double] the maximum frequency of the filter bank
- pre_emphasis_coeff: [double] the coefficient used for the pre-emphasis
- mel_scale: [bool] tells whether cepstral features are extracted on a linear (LFCC, set it to False) or Mel (MFCC, set it to True, the default) scale
- other: [Spectrogram] an object that is, or inherits from, Spectrogram and that will be deep-copied into a new instance
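As a rough illustration of what pre_emphasis_coeff controls, the sketch below applies the textbook first-order pre-emphasis filter y[n] = x[n] - coeff * x[n-1]. Passing the first sample through unchanged is an assumption, as is the idea that bob.ap applies the filter in exactly this form; pre_emphasis is a hypothetical helper name.

```python
def pre_emphasis(signal, coeff=0.95):
    """Apply y[n] = x[n] - coeff * x[n-1]; the first sample is kept as-is."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out

constant = pre_emphasis([1.0, 1.0, 1.0, 1.0])       # slowly varying input
alternating = pre_emphasis([1.0, -1.0, 1.0, -1.0])  # fastest possible variation
```

The constant input is attenuated to roughly 0.05 after the first sample while the alternating input is boosted to about 1.95 in magnitude, which shows the high-pass behaviour of the filter.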
energy_bands
    Tells whether we compute a spectrogram or energy bands
energy_filter
    Tells whether we use the energy or the square root of the energy
energy_floor
    The energy flooring threshold
f_max
    The maximum frequency of the filter bank
f_min
    The minimum frequency of the filter bank
get_shape(input) → tuple
    Computes the shape of the output features, given either the size of an input array or the input array itself.
    Parameters:
    - input: [int|array] either an integral value or an array for which the output shape of this extractor is going to be computed
    This method always returns a 2-tuple containing the shape of the output features produced by this extractor.
log_filter
    Tells whether we use the log triangular filter or the triangular filter
mel_scale
    Tells whether cepstral features are extracted on a linear (LFCC) or Mel (MFCC) scale
n_filters
    The number of filter bands
pre_emphasis_coeff
    The coefficient used for the pre-emphasis
sampling_frequency
    The sampling frequency (sampling rate)
win_length
    The normalized window length w.r.t. the sample frequency
win_length_ms
    The window length of the cepstral analysis in milliseconds
win_shift
    The normalized window shift w.r.t. the sample frequency
win_shift_ms
    The window shift of the cepstral analysis in milliseconds
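The energy_filter and log_filter flags are described only briefly above. One possible reading, sketched here under stated assumptions, is that a triangular filter is applied either to the squared magnitudes (energy) or to the magnitudes themselves (square root of the energy), and that the filter output is optionally passed through a floored logarithm. This interpretation, the floor value, and the helper names triangular_weights and filter_output are all hypothetical and not taken from bob.ap.

```python
import math

def triangular_weights(n_bins, left, center, right):
    """Weights of one triangular filter over the bin indices 0..n_bins-1."""
    weights = []
    for k in range(n_bins):
        if left < k <= center:
            weights.append((k - left) / (center - left))   # rising slope
        elif center < k < right:
            weights.append((right - k) / (right - center))  # falling slope
        else:
            weights.append(0.0)
    return weights

def filter_output(magnitudes, weights, energy_filter=False, log_filter=True, floor=1.0):
    """Weighted sum of one band, on energies or magnitudes, optionally in the log domain."""
    values = [m * m for m in magnitudes] if energy_filter else list(magnitudes)
    total = sum(w * v for w, v in zip(weights, values))
    return math.log(max(total, floor)) if log_filter else total

w = triangular_weights(5, 0, 2, 4)
spectrum = [2.0, 2.0, 2.0, 2.0, 2.0]
linear_mag = filter_output(spectrum, w, energy_filter=False, log_filter=False)
linear_energy = filter_output(spectrum, w, energy_filter=True, log_filter=False)
log_energy = filter_output(spectrum, w, energy_filter=True, log_filter=True)
```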