An Apparatus for Providing a Processed Audio Signal, a Method for Providing a Processed Audio Signal, an Apparatus for Providing Neural Network Parameters and a Method for Providing Neural Network Parameters

Publication: WO2022083900A1

Published: 2022-04-28

Family Size: 9

Granted: Yes (2/9)

Simple Summary

Content extracted from patent full text and abstract with AI.

This invention describes an apparatus and method for processing and enhancing audio signals, particularly speech, using a normalizing-flow-based neural network approach. The system takes in a noisy or distorted audio signal and, using a series of neural network-controlled 'flow blocks' that process a noise-like signal, generates a clear, enhanced version of the original speech. By training the neural networks to model the distribution of clean audio conditioned on noisy input, and by applying special techniques such as nonlinear companding, the invention achieves effective speech enhancement with reduced computational requirements.
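The summary mentions nonlinear companding but does not say which scheme is used. As a rough illustration only, the sketch below assumes mu-law companding without quantization, with an assumed constant of 255. It shows how such a mapping stretches low-amplitude samples before processing and can be undone exactly afterwards. The names compress, expand and MU are hypothetical and are not taken from the patent.

```python
import math

MU = 255.0  # assumed companding constant; the summary does not state a value


def compress(x, mu=MU):
    """Mu-law compression of a sample in [-1, 1], without quantization."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=MU):
    """Inverse of compress: map a companded sample back to linear amplitude."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# Illustrative samples spanning quiet and loud amplitudes.
samples = [-1.0, -0.1, -0.001, 0.0, 0.001, 0.1, 1.0]
companded = [compress(s) for s in samples]
restored = [expand(c) for c in companded]
```

Under this assumption, small amplitudes occupy a much larger share of the companded range, which is one plausible reason a learned model would resolve quiet speech samples more easily.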

Use Cases

Improving voice clarity in telecommunications (e.g., phone calls, VoIP).
Enhancing speech in devices for people with hearing impairments (hearing aids).
Speech enhancement in consumer electronics (smartphones, voice assistants, conference systems).
Pre-processing audio for automatic speech recognition (ASR) systems to improve robustness in noisy conditions.
Removing background noise from recorded audio for media production or dictation software.
Supporting real-time communication systems to deliver clearer speech in noisy environments.
Enhancing audio feeds in security/surveillance systems for better speech intelligibility.

Benefits

Provides high-quality speech enhancement even in noisy environments.
Processes audio directly in the time domain without needing computationally expensive transformations.
Reduces neural network parameter count (and thus computational load) via architectural choices such as depthwise separable convolutions.
Can operate in real-time, making it suitable for live communications and devices.
Is compatible with a wide range of devices, from mobile phones to hearing aids, due to its efficiency.
Demonstrates comparable or superior performance to state-of-the-art generative adversarial network (GAN)-based approaches in objective tests.
Can be trained efficiently, and the method for providing neural network parameters is integrated, reducing reliance on external systems.
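The parameter saving attributed to depthwise separable convolutions can be checked with simple arithmetic. The sketch below counts weights and biases for a standard 1-D convolution and for its depthwise separable replacement (a per-channel depthwise filter followed by a 1x1 pointwise convolution). The channel count of 64 and kernel size of 3 are illustrative assumptions, not values from the patent.

```python
def conv1d_params(c_in, c_out, kernel, bias=True):
    """Parameters of a standard 1-D convolution."""
    return c_in * c_out * kernel + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, kernel, bias=True):
    """Parameters of a depthwise convolution followed by a 1x1 pointwise one."""
    depthwise = c_in * kernel + (c_in if bias else 0)   # one filter per channel
    pointwise = c_in * c_out + (c_out if bias else 0)   # mixes the channels
    return depthwise + pointwise


# Assumed layer size, chosen only for illustration.
standard = conv1d_params(64, 64, 3)                # 12352
separable = depthwise_separable_params(64, 64, 3)  # 4416
ratio = standard / separable
```

For this assumed layer the separable variant needs roughly a third of the parameters, and the saving grows with the kernel size.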

Technical Classifications (CPCs)

Main Classifications

Physics & Measurement

Sub Classifications

Computing & Calculating

Musical Instruments & Acoustics

CPC Codes

G06N3/045, G10L21/0208, G10L21/0264, G10L25/30

Inventors & Applicants

Inventors

Martin Strauss

Bernd Edler

Applicants

Fraunhofer Ges Forschung

Univ Friedrich Alexander Er

Patent Abstract

The invention describes an apparatus for providing a processed audio signal on the basis of an input audio signal, wherein the apparatus is configured to process a noise signal, or a signal derived from the noise signal, using one or more flow blocks, in order to obtain the processed audio signal, wherein the apparatus is configured to adapt a processing performed using the one or more flow blocks in dependence on the input audio signal and using a neural network. The invention further describes an apparatus for providing neural network parameters for an audio processing, wherein the apparatus is configured to process a training audio signal, or a processed version thereof, using one or more flow blocks in order to obtain a training result signal, wherein the apparatus is configured to adapt a processing performed using the one or more flow blocks in dependence on a distorted version of the training audio signal and using a neural network; wherein the apparatus is configured to determine neural network parameters of the neural networks, such that a characteristic of the training result audio signal approximates or comprises a predetermined characteristic. A method for providing a processed audio signal and a method for providing neural network parameters for an audio processing are also provided. The invention provides a trade-off between an effective modeling of a flow-based audio signal processing using neural networks and audio signal enhancement capabilities.
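The two directions described in the abstract can be illustrated with a single invertible block. The sketch below is a minimal toy under several assumptions that the abstract does not confirm: the flow block is an affine coupling layer, the predetermined characteristic is a standard Gaussian distribution, and the neural network is replaced by a fixed hypothetical function (toy_conditioner). It is not the patented architecture. In the training direction, clean audio is mapped to a latent signal while conditioned on the distorted signal. In the enhancement direction, the same block is run in reverse on a noise signal.

```python
import math
import random


def toy_conditioner(x_a, cond):
    """Hypothetical stand-in for the neural network: returns a log-scale and
    a shift per sample from the untouched half and the conditioning signal."""
    log_s = [math.tanh(0.5 * a + 0.3 * c) for a, c in zip(x_a, cond)]
    shift = [0.2 * a - 0.4 * c for a, c in zip(x_a, cond)]
    return log_s, shift


def flow_forward(x, cond):
    """Training direction: audio -> latent, plus log|det Jacobian|."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, shift = toy_conditioner(x_a, cond[:half])
    z_b = [b * math.exp(s) + t for b, s, t in zip(x_b, log_s, shift)]
    return x_a + z_b, sum(log_s)


def flow_inverse(z, cond):
    """Enhancement direction: latent (noise) -> audio, same conditioning."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    log_s, shift = toy_conditioner(z_a, cond[:half])
    x_b = [(b - t) * math.exp(-s) for b, s, t in zip(z_b, log_s, shift)]
    return z_a + x_b


def gaussian_nll(z, log_det):
    """Negative log-likelihood under an assumed standard Gaussian target."""
    n = len(z)
    return 0.5 * sum(v * v for v in z) + 0.5 * n * math.log(2 * math.pi) - log_det


rng = random.Random(0)
clean = [rng.uniform(-1.0, 1.0) for _ in range(8)]   # toy "training audio"
noisy = [c + rng.gauss(0.0, 0.1) for c in clean]     # toy distorted version

z, log_det = flow_forward(clean, noisy)    # training direction
loss = gaussian_nll(z, log_det)            # quantity a trainer would minimize
recovered = flow_inverse(z, noisy)         # exact inverse of the forward pass

noise = [rng.gauss(0.0, 1.0) for _ in range(8)]
enhanced = flow_inverse(noise, noisy)      # enhancement direction
```

Because the block is exactly invertible, parameters fitted in the training direction can be reused unchanged when sampling from noise. A real system would stack several such blocks and learn the conditioner from data.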

Key Information

Publication No.

WO2022083900A1

Family ID

75850213

Publication Date

2022-04-28

Application No.

EP2021062076W

Application Date

2021-05-06

Priority Date

2020-10-20

Granted

Yes (2/9)

Possible Cooperation

For further information please contact the transfer office.
