Python API

This section documents the pure Python API of bob.io.audio.

Classes

bob.io.audio.reader

Use this object to read samples from audio files

bob.io.audio.writer

Use this object to write samples to audio files

Details

bob.io.audio.get_config()

Returns a string containing the configuration information.
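A minimal sketch of querying the configuration string. The helper name `audio_config` is made up for this illustration, and the `ImportError` fallback is an assumption added so the snippet still runs where the package is not installed.

```python
def audio_config():
    """Return the bob.io.audio configuration string, or None if unavailable."""
    try:
        # Imported lazily: the package may not be installed everywhere.
        import bob.io.audio
    except ImportError:
        return None
    return bob.io.audio.get_config()


config = audio_config()
print(config if config is not None else "bob.io.audio is not installed")
```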

class bob.io.audio.reader

Bases: object

Use this object to read samples from audio files

Audio reader objects can read data from audio files. The current implementation uses SoX, a stable, freely available audio encoding and decoding library designed specifically for these tasks. You can read an entire audio file into memory by using the load() method.

Constructor Documentation:

reader (filename)

Opens an audio file for reading

Opens the audio file with the given filename for reading. Its contents can then be retrieved with the load() method.

Parameters:

filename : str

The file path to the file you want to read data from
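The constructor and load() can be combined as in the sketch below. The helper `load_audio` and its `reader_cls` parameter are hypothetical names introduced here; `reader_cls` exists only so the helper can be tried with a stand-in class when bob.io.audio is not installed.

```python
def load_audio(path, reader_cls=None):
    """Open `path` and return (rate, data) for the whole audio stream."""
    if reader_cls is None:
        # Imported lazily so the helper can be exercised with a stand-in class.
        from bob.io.audio import reader as reader_cls
    audio = reader_cls(path)  # opens the file for reading
    data = audio.load()       # whole stream in one array
    return audio.rate, data
```

A call such as `rate, data = load_audio("speech.wav")` would then return the sampling rate and the samples; the file name is a placeholder.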

Class Members:

bits_per_sample

int <– The number of bits per sample in this audio stream

compression_factor

float <– Compression factor on the audio stream

duration

float <– Total duration of this audio file in seconds

encoding

str <– Name of the encoding in which this audio file was recorded

filename

str <– The full path to the file that will be decoded by this object

load() -> data

Loads the entire audio stream into a numpy.ndarray

The returned array is organized as (channels, samples).

Returns:

data : numpy.ndarray

The data read from this file

number_of_channels

int <– The number of channels on the audio stream

number_of_samples

int <– The number of samples in this audio stream

rate

float <– The sampling rate of the audio stream

type

tuple <– Typing information to load all of the file at once
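The read-only members listed above can be gathered in one pass, for example to log what a file contains before loading it. This is a sketch: `READER_FIELDS` and `describe` are illustrative names, and the attribute list simply mirrors the documented members.

```python
# Attribute names taken from the reader's documented class members.
READER_FIELDS = (
    "filename",
    "encoding",
    "rate",
    "duration",
    "number_of_channels",
    "number_of_samples",
    "bits_per_sample",
    "compression_factor",
    "type",
)


def describe(audio):
    """Collect the documented metadata attributes of a reader into a dict."""
    return {name: getattr(audio, name) for name in READER_FIELDS}
```

Passing an open reader, as in `describe(bob.io.audio.reader(path))`, yields a plain dictionary that can be printed or serialized.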

class bob.io.audio.writer

Bases: object

Use this object to write samples to audio files

Audio writer objects can write data to audio files. The current implementation uses SoX.

Audio files are objects potentially composed of multiple channels. Their numerical representation is a 2-D array in which the first dimension corresponds to the channels of the audio stream and the second dimension to the samples through time.
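To make the (channels, samples) layout concrete, the sketch below builds a short sine tone as nested lists using only the standard library. The function name `make_tone` and its default values are assumptions chosen for illustration.

```python
import math


def make_tone(rate=8000.0, seconds=0.01, freq=440.0, channels=2):
    """Return a sine tone as nested lists laid out as (channels, samples)."""
    n = int(round(rate * seconds))
    one_channel = [math.sin(2.0 * math.pi * freq * i / rate) for i in range(n)]
    # Every channel carries the same signal; copy so rows are independent lists.
    return [list(one_channel) for _ in range(channels)]


tone = make_tone()
print(len(tone), len(tone[0]))  # number of channels, samples per channel
```

Before handing such data to a writer, it would typically be converted with `numpy.asarray(tone, dtype="float64")` so that it becomes a 2-D array of 64-bit floats.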

Constructor Documentation:

writer (filename, [rate], [encoding], [bits_per_sample])

Opens an audio file for writing

Opens the audio file with the given filename for writing. Samples can then be added with the append() method.

Parameters:

filename : str

The file path to the file you want to write data to

rate : float

[Default: 8000.] The number of samples per second

encoding : str

[Default: 'UNKNOWN'] The encoding to use

bits_per_sample : int

[Default: 16] The number of bits per sample to be recorded

Class Members:

append(sample) -> None

Writes a new sample or set of samples to the file

A single sample should be set up as a 1D array in which each entry corresponds to one stream channel. Sets of samples should be set up as a 2D array organized as (channels, samples). Arrays should contain only 64-bit float numbers.

Note

At present, only arrays with C-style storage are supported. If you pass reversed arrays or arrays with Fortran-style storage, the result is undefined.

Parameters:

sample : array-like (1D or 2D, float)

The sample(s) that should be appended to the file
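Because append() expects C-style, 64-bit float arrays, it can help to normalize the input first. The sketch below does this with NumPy; the helper name `as_writable` is hypothetical, and NumPy is imported inside the function so the definition loads even where NumPy is absent.

```python
def as_writable(samples):
    """Return `samples` as a C-contiguous float64 array with the same shape."""
    import numpy  # imported here so the module loads even without NumPy

    # Copies only when the dtype or the memory layout has to change.
    return numpy.ascontiguousarray(samples, dtype=numpy.float64)
```

A transposed view or a Fortran-ordered array passed through `as_writable` comes back with the same shape but in C order, which avoids the undefined behavior described in the note above.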

bits_per_sample

int <– The number of bits per sample in this audio stream

close() -> None

Closes the current audio stream and forces writing the trailer

After this point the audio is finalized and cannot be written to anymore.

compression_factor

float <– Compression factor on the audio stream

duration

float <– Total duration of this audio file in seconds

encoding

str <– Name of the encoding in which this audio file will be written

filename

str <– The full path to the file that will be written by this object

is_opened

bool <– A flag indicating whether the audio file is still open for writing or has already been closed by the user using close()

number_of_channels

int <– The number of channels on the audio stream

number_of_samples

int <– The number of samples in this audio stream

rate

float <– The sampling rate of the audio stream

type

tuple <– Typing information to load all of the file at once
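Putting the writer members together, a typical write sequence is open, append, close. The sketch below wraps that sequence; `write_audio` and its `writer_cls` parameter are hypothetical names, and `writer_cls` is there only so the helper can be tried with a stand-in class when bob.io.audio is not installed.

```python
def write_audio(path, samples, rate=8000.0, writer_cls=None):
    """Write `samples`, laid out as (channels, samples), to `path`."""
    if writer_cls is None:
        # Imported lazily so the helper can be exercised with a stand-in class.
        from bob.io.audio import writer as writer_cls
    # Arguments follow the documented order; encoding and bits_per_sample
    # are left at their defaults.
    out = writer_cls(path, rate)
    try:
        out.append(samples)
    finally:
        # Always finalize the file, even if append() raises.
        out.close()
```

The try/finally block is a design choice rather than a requirement of the API: it ensures close() runs so the trailer is written even when appending fails.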