Machine learning improves human speech recognition

Hearing loss is a rapidly growing area of scientific research, as the number of baby boomers experiencing hearing loss continues to rise with age.

To understand how hearing loss affects people, researchers study their ability to recognize speech. Speech is harder to recognize when a listener has hearing loss, when there is reverberation, or when there is significant background noise, such as traffic or multiple competing speakers.

Hearing aid algorithms are therefore often used to improve human speech recognition. To evaluate such algorithms, researchers perform experiments that determine the signal-to-noise ratio at which a specific percentage of words (usually 50%) is recognized. However, these tests are costly and time-consuming.
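The 50%-correct point is commonly called the speech reception threshold (SRT). The following is a minimal sketch, under assumptions not taken from the study, of how an SRT could be read off from word-recognition scores by fitting a logistic psychometric function; the data points, the logistic form, and the fitting routine are illustrative choices only.

```python
# Minimal sketch (not the study's procedure): fit a logistic psychometric
# function to word-recognition scores measured at several SNRs and read off
# the 50%-correct point, i.e. the speech reception threshold (SRT).
import numpy as np
from scipy.optimize import curve_fit

def psychometric(snr_db, srt_db, slope):
    """Logistic word-recognition curve; 50% correct when snr_db == srt_db."""
    return 1.0 / (1.0 + np.exp(-slope * (snr_db - srt_db)))

# Hypothetical measurements: fraction of words recognized at each SNR (dB).
snr_db = np.array([-12.0, -9.0, -6.0, -3.0, 0.0, 3.0])
fraction_correct = np.array([0.05, 0.15, 0.40, 0.70, 0.90, 0.97])

(srt_db, slope), _ = curve_fit(psychometric, snr_db, fraction_correct, p0=[-5.0, 1.0])
print(f"Estimated SRT (50% words correct): {srt_db:.1f} dB SNR")
```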

In The Journal of the Acoustical Society of America, published by the Acoustical Society of America via AIP Publishing, German researchers explore a human speech recognition model based on machine learning and deep neural networks.

“The novelty of our model is that it provides good predictions for hearing-impaired listeners for noise types of vastly different complexity and shows both low errors and high correlations with the measured data,” said author Jana Roßbach of Carl von Ossietzky University.
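As a rough illustration of what "low errors and high correlations" can mean in this setting, the sketch below compares predicted and measured SRTs using the root-mean-square error and the Pearson correlation coefficient; the values are placeholders, not data from the study.

```python
# Sketch: quantifying agreement between predicted and measured SRTs.
# The numbers below are hypothetical placeholders, not study results.
import numpy as np

measured_srt = np.array([-7.2, -4.1, -1.5, 0.8, 2.3])   # dB SNR, hypothetical
predicted_srt = np.array([-6.8, -4.5, -1.0, 1.2, 2.0])  # dB SNR, hypothetical

rmse = np.sqrt(np.mean((predicted_srt - measured_srt) ** 2))
pearson_r = np.corrcoef(predicted_srt, measured_srt)[0, 1]
print(f"RMSE: {rmse:.2f} dB, Pearson r: {pearson_r:.2f}")
```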

The researchers calculated how many words per sentence a listener understands using automatic speech recognition (ASR). Most people are familiar with ASR from speech recognition tools such as Alexa and Siri.
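For intuition, one simple way to turn an ASR transcript into a word-recognition score is to count how many words of the reference sentence appear, in order, in the recognizer's output. The sketch below does only that; the example sentences and the matching metric are illustrative assumptions, not the authors' exact evaluation, and the transcript string would come from whatever ASR system is used.

```python
# Sketch: score an ASR transcript against the reference sentence by counting
# reference words that appear, in order, in the hypothesis (illustrative only).
from difflib import SequenceMatcher

def words_correct(reference: str, hypothesis: str) -> float:
    """Return the fraction of reference words matched, in order, by the ASR hypothesis."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    matched = sum(block.size for block in SequenceMatcher(None, ref, hyp).get_matching_blocks())
    return matched / len(ref)

# Example: reference sentence vs. a made-up transcript of the same sentence in noise.
print(words_correct("the boy rode his bike to school",
                    "the boy rode his bike"))  # -> 0.714...
```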

The study involved eight normal-hearing and 20 hearing-impaired listeners who were exposed to a variety of complex noises that mask speech. The hearing-impaired listeners were categorized into three groups with varying levels of age-related hearing loss.

The model allowed the researchers to predict the speech recognition performance of hearing-impaired listeners with different degrees of hearing loss for a variety of noise maskers of increasing complexity in temporal modulation and similarity to real speech. Each person's individual hearing loss could be taken into account.

“We were very surprised that the predictions worked well for all types of noise. We expected the model to have problems when using a single concurrent speaker. However, this was not the case,” Roßbach said.

The model makes predictions for hearing with one ear. In the future, the researchers plan to develop a binaural model, since speech understanding is affected by hearing with both ears.

In addition to predicting speech intelligibility, the model could also be used to predict listening effort or speech quality, as these topics are closely related.

Story source:

Materials provided by the American Institute of Physics. Note: Content may be edited for style and length.