Bob 2.0 extraction of cepstral features (MFCC or LFCC) from audio

This is a legacy algorithm: the platform API has changed since it was implemented, so new versions and forks will need to be updated.
This algorithm is splittable.

Algorithms have at least one input and one output. All algorithm endpoints are organized in groups. Groups are used by the platform to indicate which inputs and outputs are synchronized together. The first group is automatically synchronized with the channel defined by the block in which the algorithm is deployed.

Group: main

Endpoint Name | Data Format | Nature
speech | system/array_1d_floats/1 | Input
vad | system/array_1d_integers/1 | Input
features | system/array_2d_floats/1 | Output
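
To illustrate how the three endpoints of the main group fit together, here is a minimal sketch of an algorithm that reads `speech` and `vad` and writes `features`. It assumes the legacy convention of a class with `setup` and `process` methods, which this page does not show, so treat those method names and the `.data.value` access pattern as assumptions. `FakeOutput` and `make_input` are hypothetical stand-ins for the objects the platform would supply, and the feature computation is a placeholder, not the real cepstral extraction.

```python
from types import SimpleNamespace


class Algorithm:
    """Hypothetical skeleton; the real implementation is not shown on this page."""

    def __init__(self):
        self.n_ceps = 19  # default taken from the parameter table

    def setup(self, parameters):
        # `parameters` is assumed to be a dict of the values chosen
        # when the experiment was scheduled.
        self.n_ceps = parameters.get("n_ceps", self.n_ceps)
        return True

    def process(self, inputs, outputs):
        # Both inputs belong to the same group, so they are assumed to
        # refer to the same audio sample on every call.
        speech = inputs["speech"].data.value  # 1D floats (unused by the placeholder)
        vad = inputs["vad"].data.value        # 1D integers
        # Placeholder: one all-zero row of n_ceps values per VAD label.
        features = [[0.0] * self.n_ceps for _ in vad]
        outputs["features"].write({"value": features})
        return True


class FakeOutput:
    """Hypothetical stand-in that records what the algorithm writes."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


def make_input(values):
    """Hypothetical stand-in exposing values as `.data.value`."""
    return SimpleNamespace(data=SimpleNamespace(value=values))
```

With these stand-ins, calling `process` once with a `speech` and a `vad` input leaves a single `{"value": ...}` entry in `FakeOutput.written`, which mirrors the one 2D float array declared for the `features` endpoint.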

Parameters allow users to change the configuration of an algorithm when scheduling an experiment

Name | Description | Type | Default | Range/Choices
f_max | Maximum frequency of the range used in bandpass filtering | float64 | 8000.0 |
delta_win | Window size used in delta and delta-delta computation | uint32 | 2 |
withDelta | Compute deltas (with the window size specified by delta_win) | bool | True |
pre_emphasis_coef | Pre-emphasis coefficient | float64 | 0.95 |
win_shift_ms | Shift between the starts of neighboring windows, in ms; typically half the window length | float64 | 10.0 |
win_length_ms | Length of the sliding processing window, in ms; typically about 20 ms | float64 | 20.0 |
dct_norm | Use normalized DCT | bool | False |
normalizeFeatures | Normalize the computed cepstral features (shift by the mean and divide by the standard deviation) | bool | True |
filter_frames | Filter the frames of computed cepstral features based on the VAD labels: trim leading and trailing silence, keep only silence, or keep only speech | string | trim_silence | trim_silence, silence_only, speech_only
rate | Sampling rate of the speech signal | float64 | 16000.0 |
n_filters | Number of filter bands | uint32 | 24 |
f_min | Minimum frequency of the range used in bandpass filtering | float64 | 0.0 |
withDeltaDelta | Compute delta-deltas (with the window size specified by delta_win) | bool | True |
withEnergy | Use the power of the FFT magnitude, otherwise just the absolute value of the magnitude | bool | True |
mel_scale | If true, use Mel-scaled triangular filters; otherwise use a linear scale | bool | True |
n_ceps | Number of cepstral coefficients | uint32 | 19 |
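
As a rough illustration of what a few of these parameters control, the sketch below works through the window arithmetic implied by rate, win_length_ms and win_shift_ms, the normalization described for normalizeFeatures, and the three filter_frames modes. It is not the algorithm's actual code, which is not reproduced here. The function names are hypothetical, the frame-count formula is the usual one for full windows only, and the VAD input is assumed to hold one label per frame with 1 meaning speech.

```python
import statistics


def frame_geometry(n_samples, rate=16000.0, win_length_ms=20.0, win_shift_ms=10.0):
    """Return (window length, window shift, number of full frames), in samples."""
    win_length = int(rate * win_length_ms / 1000.0)
    win_shift = int(rate * win_shift_ms / 1000.0)
    if n_samples < win_length:
        return win_length, win_shift, 0
    # Count only windows that fit entirely inside the signal (assumption).
    return win_length, win_shift, 1 + (n_samples - win_length) // win_shift


def normalize(features):
    """Shift each feature column by its mean and divide by its standard deviation."""
    columns = list(zip(*features))
    means = [statistics.fmean(c) for c in columns]
    # Population standard deviation; fall back to 1.0 for constant columns
    # so that they do not cause a division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in features]


def filter_frames(features, vad, mode="trim_silence"):
    """Select feature rows using per-frame VAD labels (1 = speech, assumed)."""
    if len(features) != len(vad):
        raise ValueError("one VAD label per frame is required")
    if mode == "speech_only":
        return [f for f, v in zip(features, vad) if v == 1]
    if mode == "silence_only":
        return [f for f, v in zip(features, vad) if v != 1]
    if mode == "trim_silence":
        speech = [i for i, v in enumerate(vad) if v == 1]
        if not speech:
            return []
        # Keep everything between the first and last speech frame,
        # including any silence in the middle.
        return features[speech[0]:speech[-1] + 1]
    raise ValueError(f"unknown filter_frames mode: {mode}")
```

With the default values, a 20 ms window at 16000 Hz spans 320 samples and a 10 ms shift spans 160 samples, so consecutive windows overlap by half their length.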

The code for this algorithm in Python

Extract cepstral features (MFCC or LFCC) from audio

Experiments

Name | Databases/Protocols | Analyzers
pkorshunov/pkorshunov/isv-asv-pad-fusion-complete/1/asv_isv-pad_lbp_hist_ratios_lr-fusion_lr-pa_aligned | avspoof/2@physicalaccess_verification, avspoof/2@physicalaccess_verification_spoof, avspoof/2@physicalaccess_antispoofing, avspoof/2@physicalaccess_verify_train_spoof, avspoof/2@physicalaccess_verify_train | pkorshunov/spoof-score-fusion-roc_hist/1
pkorshunov/pkorshunov/isv-asv-pad-fusion-complete/1/asv_isv-pad_gmm-fusion_lr-pa | avspoof/2@physicalaccess_verification, avspoof/2@physicalaccess_verification_spoof, avspoof/2@physicalaccess_antispoofing, avspoof/2@physicalaccess_verify_train_spoof, avspoof/2@physicalaccess_verify_train | pkorshunov/spoof-score-fusion-roc_hist/1
pkorshunov/pkorshunov/speech-pad-simple/1/speech-pad_gmm-pa | avspoof/2@physicalaccess_antispoofing | pkorshunov/simple_antispoofing_analyzer/4
pkorshunov/pkorshunov/isv-speaker-verification-spoof/1/isv-speaker-verification-spoof-pa | avspoof/2@physicalaccess_verification_spoof, avspoof/2@physicalaccess_verification | pkorshunov/eerhter_postperf_iso_spoof/1
pkorshunov/pkorshunov/isv-speaker-verification/1/isv-speaker-verification-licit | avspoof/2@physicalaccess_verification | pkorshunov/eerhter_postperf_iso/1

This table shows the number of times this algorithm has been successfully run in the given environment. Note that this is not sufficient information to determine whether the algorithm will run under different conditions.

BEAT platform version 2.2.1b0 | © Idiap Research Institute - 2013-2024