Physicists from Altai State University are working on a project to remove background noise from voice messages

10 December 2021 Department of Information and Media Communications
The Russian Science Foundation has supported a project to remove background noise from voice messages by Andrey Lependin, Candidate of Physical and Mathematical Sciences and Associate Professor of the Department of Information Security at the Institute of Digital Technologies, Electronics and Physics.

The project, entitled "Development of New Methods for Improving the Quality of Speech Signals Using Deep Neural Networks", was among the winners of the Russian Science Foundation's 2021 grant competition in the priority area "Conducting fundamental scientific research and exploratory research by small individual scientific groups."

“We take a recording of a person's speech made in real conditions, where something in the background may be making noise, shouting or singing, and we try to remove the unwanted background distortion from the recording. From noisy, “dirty” speech we get a clear and beautiful recording that can then be used further,” explains A. Lependin. “Such methods of improving the quality of speech signals are already used in modern software, for example in video chats. However, they cope only with uniform background noise that does not change over time, such as the hum of cars or machinery. As soon as a sharp sound appears, the system does not have time to respond and the noise gets into the recording. Our task, therefore, is to modify these methods so that they remove all background sounds and leave only human speech.”
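The limitation described in the quote can be illustrated with a minimal sketch of classical spectral subtraction. This is not the team's neural-network method. It is a textbook technique used here only as a stand-in, and the function names, frame size and synthetic signals (a 500 Hz tone in place of speech, white noise as the steady background, a short pulse as the abrupt sound) are all assumptions made for the illustration.

```python
import numpy as np

FRAME = 256  # samples per frame (illustrative choice)

def spectral_subtract(signal, noise_ref, frame=FRAME):
    """Subtract a fixed average noise magnitude spectrum from every frame.

    Assumes the noise is stationary, so one estimate taken from a
    noise-only stretch (noise_ref) is valid for the whole recording.
    Samples past the last full frame are left as zeros.
    """
    n_ref = len(noise_ref) // frame
    ref_frames = noise_ref[:n_ref * frame].reshape(n_ref, frame)
    noise_mag = np.abs(np.fft.rfft(ref_frames, axis=1)).mean(axis=0)

    out = np.zeros(len(signal))
    for start in range(0, len(signal) - frame + 1, frame):
        spec = np.fft.rfft(signal[start:start + frame])
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)  # floor at zero
        phase = np.exp(1j * np.angle(spec))              # keep the noisy phase
        out[start:start + frame] = np.fft.irfft(mag * phase, n=frame)
    return out

# Synthetic stand-ins (hypothetical data): a 500 Hz tone plays the role of speech.
rng = np.random.default_rng(0)
sr, n = 8000, 32 * FRAME
t = np.arange(n) / sr
tone = 0.5 * np.sin(2 * np.pi * 500 * t)
hiss = 0.1 * rng.standard_normal(n)           # steady background noise
noise_ref = 0.1 * rng.standard_normal(4096)   # separate noise-only stretch
click = np.zeros(n)
click[4000:4020] = 1.0                        # abrupt 2.5 ms sound

cleaned = spectral_subtract(tone + hiss + click, noise_ref)

def frame_error(x):
    """Squared error against the clean tone, summed per frame."""
    return ((x - tone) ** 2).reshape(-1, FRAME).sum(axis=1)

before = frame_error(tone + hiss + click)
after = frame_error(cleaned)
click_frame = 4000 // FRAME
print("typical frame error before/after:", np.median(before), np.median(after))
print("click frame error before/after:", before[click_frame], after[click_frame])
```

In this sketch the error in a typical frame drops after subtraction, while the frame containing the pulse stays badly distorted, because a noise estimate that is fixed in time has nothing to say about a sound that appears suddenly.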

According to the scientist, clear sound is required in many areas: in speech recognition systems that convert spoken words into text, in voice messages and video chats such as Zoom and Skype, and in the creation of audio and video content. High-quality sound recording is also necessary for solving information security problems.

“The project is in active development. We already have some results, which we demonstrated in the grant application. Our team has built a good model that improves the quality of speech signals in real time, that is, it manages to process speech as fast as a person speaks. We also have some interesting ideas on how to refine this model and make a better version. But it is too early to talk about the completion of our research,” says the project developer.

Scientists from Altai State University have been working on the project for a couple of years and, according to A. Lependin, have two more years of hard work ahead of them. The researcher adds that few specialists in the country work in this area.

You can count them on your fingers: there is the Speech Technology Center in St. Petersburg, plus several groups working in large companies such as Yandex and Sberbank.
