System, apparatus and method for consistent acoustic scene reproduction based on adaptive functions

Publication: EP2942981A1

Published: 2015-11-11

Family Size: 30

Granted: Yes (12/30)

Simple Summary

Content extracted from patent full text and abstract with AI.

This patent describes a system and method for accurately reproducing acoustic scenes—especially those accompanying video—by processing audio signals for spatial consistency. It combines direct (like voices or instruments) and diffuse (ambient or reverberant) audio components using adaptive gain functions that take into account the direction of arrival and other spatial or visual information. The system makes it possible to closely align an audio scene with a corresponding visual scenario, such as matching audio focus with a camera's zoom or rotation. The approach is applicable in real time and at playback, supporting scenarios like teleconferencing, multimedia production, and advanced hearing aids.

Use Cases

Content extracted from patent full text and abstract with AI.

Video recording on smartphones or cameras where audio is matched to the video focus and zoom.
Teleconferencing systems that spatially align voices with participants' video positions, improving clarity and directionality.
Hearing aids or assistive listening devices that help users focus on sounds coming from a specific direction or person.
Post-production editing for film and TV to synchronize audio scenes with dynamic visual scenes, such as zooms or camera pans.
Augmented and virtual reality platforms that require realistic spatial audio to match user perspective changes in immersive experiences.
Wearable devices such as smart glasses, improving audio focus and directionality based on user gaze or movement.
Public address or sound reinforcement systems that dynamically tailor output based on the position and focus within a venue.

Benefits

Content extracted from patent full text and abstract with AI.

Provides highly consistent and realistic spatial audio reproduction, enhancing immersion and intelligibility.
Allows flexible post-processing or live adaptation of audio scenes to match visual changes (like camera zoom or rotation), improving audio-visual coherence.
Reduces the need to transmit or store all original multichannel microphone signals, enabling bandwidth and storage efficiency.
Improves focus and separation of desired sounds from background noise, which is beneficial in hearing aids, conferencing, and noisy environments.
Supports customizable sound field control, leading to better user experiences in consumer electronics, entertainment, and accessibility devices.
Can be implemented via software or hardware, fitting a wide range of applications and devices.

Technical Classifications (CPCs)

Main Classifications

Electrical & Electronic Tech

Physics & Measurement

Sub Classifications

Electric Communication Technique

Musical Instruments & Acoustics

CPC Codes

G10L19/00

G10L19/008

H04R3/00

H04R25/00

H04R25/407

H04S5/005

H04S7/00

H04S7/30

H04S7/307

Inventors & Applicants

Inventors

Emanuel Habets

Oliver Thiergart

Konrad Kowalczyk

Applicants

Fraunhofer Ges Forschung

Friedrich Alexander Universität Erlangen Nürnberg

Patent Abstract

A system for generating one or more audio output signals is provided. The system comprises a decomposition module (101), a signal processor (105), and an output interface (106). The signal processor (105) is configured to receive the direct component signal, the diffuse component signal and direction information, said direction information depending on a direction of arrival of the direct signal components of the two or more audio input signals. Moreover, the signal processor (105) is configured to generate one or more processed diffuse signals depending on the diffuse component signal. For each audio output signal of the one or more audio output signals, the signal processor (105) is configured to determine, depending on the direction of arrival, a direct gain, the signal processor (105) is configured to apply said direct gain on the direct component signal to obtain a processed direct signal, and the signal processor (105) is configured to combine said processed direct signal and one of the one or more processed diffuse signals to generate said audio output signal. The output interface (106) is configured to output the one or more audio output signals. The signal processor (105) comprises a gain function computation module (104) for calculating one or more gain functions, wherein each gain function of the one or more gain functions comprises a plurality of gain function argument values, wherein a gain function return value is assigned to each of said gain function argument values, and wherein, when said gain function receives one of said gain function argument values, said gain function is configured to return the gain function return value being assigned to said one of said gain function argument values.
Moreover, the signal processor (105) further comprises a signal modifier (103) for selecting, depending on the direction of arrival, a direction dependent argument value from the gain function argument values of a gain function of the one or more gain functions, for obtaining the gain function return value being assigned to said direction dependent argument value from said gain function, and for determining the gain value of at least one of the one or more audio output signals depending on said gain function return value obtained from said gain function.
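
The lookup-and-mix flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the raised-cosine window shape, the 1-degree table resolution, the fixed diffuse gain, and all function names (`build_gain_table`, `lookup_gain`, `synthesize_output`) are assumptions made for illustration only. The sketch precomputes a gain function as a table of argument values with assigned return values, selects the argument value nearest to the estimated direction of arrival, and combines the gained direct component with a processed diffuse component.

```python
import math
from bisect import bisect_left


def build_gain_table(num_entries=181, look_deg=0.0, width_deg=30.0, floor=0.1):
    """Precompute a gain function as (argument values, return values).

    Argument values are directions of arrival from -90 to +90 degrees.
    The raised-cosine window around look_deg is an illustrative assumption.
    """
    step = 180.0 / (num_entries - 1)
    args = [-90.0 + i * step for i in range(num_entries)]
    gains = []
    for angle in args:
        distance = abs(angle - look_deg)
        if distance >= width_deg:
            gains.append(floor)  # outside the window: attenuate to the floor
        else:
            window = 0.5 * (1.0 + math.cos(math.pi * distance / width_deg))
            gains.append(floor + (1.0 - floor) * window)
    return args, gains


def lookup_gain(table, doa_deg):
    """Select the stored argument value nearest to the DOA; return its gain."""
    args, gains = table
    i = bisect_left(args, doa_deg)
    if i == 0:
        return gains[0]
    if i == len(args):
        return gains[-1]
    # Pick whichever neighbouring argument value is closer to the DOA.
    if doa_deg - args[i - 1] <= args[i] - doa_deg:
        i -= 1
    return gains[i]


def synthesize_output(direct, diffuse, doa_deg, table, diffuse_gain=0.5):
    """Form one output channel: direct gain * direct + diffuse gain * diffuse.

    Each list element stands for one time-frequency bin (real or complex).
    The fixed diffuse_gain is a stand-in for the processed diffuse signal.
    """
    return [
        lookup_gain(table, theta) * x_dir + diffuse_gain * x_diff
        for x_dir, x_diff, theta in zip(direct, diffuse, doa_deg)
    ]


table = build_gain_table()
output = synthesize_output([1.0, 1.0], [0.2, 0.2], [0.0, 60.0], table)
```

In this sketch a source arriving from the look direction passes at full gain, while a source 60 degrees off-axis is attenuated to the floor value. Precomputing the table means the per-bin cost is a lookup rather than a function evaluation.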

Key Information

Publication No.: EP2942981A1

Family ID: 51485417

Publication Date: 2015-11-11

Application No.: EP14183854A

Application Date: 2014-09-05

Priority Date: 2014-05-05

Granted: Yes (12/30)

Possible Cooperation

For further information please contact the transfer office.
