A speech processing system inspired by the human brain

Idiap researchers have published a paper describing an approach to speech processing based on the properties of the human brain. Their method proved competitive with the current standard, whilst conserving the advantage of energy efficiency. Moreover, their work is replicable thanks to open access software, paving the way for future applications.

Without noticing, you have probably already used speech processing technologies. They are the backbone of voice command devices. Although widely used, these technologies are still being improved to obtain better performance. One of the recent successful methods uses computing systems called artificial neural networks. These systems traditionally operate with real numbers that can have an arbitrarily large number of significant digits. Using such numbers allows a high degree of precision to process speech efficiently. This method comes with a downside: computing energy costs grow as precision increases. To circumvent this issue, Idiap’s researchers mimicked the functioning of the human brain to develop another approach to speech processing.

Artificial vs human neurons

The human brain shines when it comes to speech processing. Even with billions of neurons, the computing resources of the brain are still limited. Furthermore, as humans, we are able to listen to a person and to perform other tasks at the same time. To achieve such wonders, the human brain functions with discrete signals, which demand less energy than real numbers. When a neuron of the brain reaches a given threshold of stimulation, it fires a single electric signal to convey binary information.
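
The threshold behaviour described above can be illustrated with a minimal sketch of an integrate-and-fire neuron. This is a textbook simplification, not the model used in the paper: the function name `integrate_and_fire` and the `threshold` and `leak` values are assumptions chosen for illustration.

```python
def integrate_and_fire(inputs, threshold=1.0, leak=0.9):
    """Turn a sequence of stimulation values into binary spikes (0 or 1).

    Hypothetical toy model: threshold and leak are illustrative values.
    """
    potential = 0.0
    spikes = []
    for stimulation in inputs:
        # Accumulate stimulation; the leak makes older input fade away.
        potential = leak * potential + stimulation
        if potential >= threshold:
            spikes.append(1)   # threshold reached: fire a single signal
            potential = 0.0    # reset after firing
        else:
            spikes.append(0)   # stay silent
    return spikes


print(integrate_and_fire([0.3, 0.3, 0.3, 0.3, 0.0, 1.2]))
# prints [0, 0, 0, 1, 0, 1]
```

The output contains only zeros and ones: the neuron stays silent until enough stimulation has accumulated, then emits a single spike.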

To process speech that is made of multiple consecutive sounds, human neurons must process series of single electric signals. Transposing this approach to artificial neural networks is a challenge, as an important part of the information is encoded not only in the signal itself, but also in the time sequence of signals. “We wanted to recreate a similar method and to compare it to classic artificial neural networks in terms of performance and reliability,” Alexandre Bittar, first author of the paper and research assistant at Idiap, explains.

In a classic artificial network, an artificial neuron’s functioning can be seen as an approximation of a biological neuron’s electric signal rates. To better take into account the variations of this rate, which conveys information, researchers are using another type of artificial neuron, called a spiking neuron. The main issue with these spiking neurons is that they usually exhibit lower performance. “By carefully selecting appropriate techniques, we have established a method that, on top of being compatible with standard deep learning frameworks, is capable of competing with classic artificial neural networks on the same speech processing tasks, whilst conserving the advantage of energy efficiency,” Phil Garner, senior researcher in the Speech & Audio Processing group, explains.
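
The title of the paper refers to a "surrogate gradient", which is one reason spiking neurons can be trained with standard deep learning tools. A spike is an all-or-nothing step, so its true derivative is zero almost everywhere and gives the training procedure nothing to work with. The idea is to keep the step when computing the output, but to use a smooth stand-in for the derivative when learning. The sketch below only illustrates that idea: the names `spike` and `surrogate_grad`, the parameter `beta` and the particular smooth function are assumptions, not necessarily the choices made by the authors.

```python
def spike(potential, threshold=1.0):
    """Forward pass: output 1.0 if the potential reaches the threshold, else 0.0."""
    return 1.0 if potential >= threshold else 0.0


def surrogate_grad(potential, threshold=1.0, beta=5.0):
    """Stand-in derivative used for learning (illustrative choice).

    It peaks at the threshold and decays smoothly on both sides,
    whereas the true derivative of the step is zero almost everywhere.
    """
    return 1.0 / (1.0 + beta * abs(potential - threshold)) ** 2


for potential in (0.0, 0.9, 1.0, 1.1, 2.0):
    print(potential, spike(potential), round(surrogate_grad(potential), 3))
```

Near the threshold the stand-in derivative is large, so small changes in a neuron's input can still influence learning, while the output of the neuron itself remains binary.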

A tool to model the brain

In addition to their paper, the Idiap researchers also published the software they used to test their methods. Their aim is to provide an open access tool to improve this method, and to offer materials for potential multidisciplinary applications.

Beyond the speech-processing field, this approach could prove to be an interesting tool to further research how the brain works. “The experiments do not attempt to say anything about biological function. However, they show that a neurological capability to represent a sensory stimulus, that is available to biological entities, is capable of solving the same problems as artificial networks that are known to be capable of exceeding human performance on many tasks. This provides a strong hypothesis for future understanding of the biological mechanisms of the brain,” Garner concludes.



More information

- Speech & Audio Processing research group at Idiap
- “A surrogate gradient spiking baseline for speech command recognition”, Alexandre Bittar and Philip N. Garner, in Front. Neurosci., 22 August 2022, Sec. Neural Technology
- Software release