Apparatus and Method for Encoding a Plurality of Audio Objects Using Direction Information During a Downmixing or Apparatus and Method for Decoding Using an Optimized Covariance Synthesis
Simple Summary
This patent presents a method and device for encoding and decoding multiple audio objects (such as voices or instruments in a sound scene) that uses each object's direction information during downmixing, together with an optimized decoding scheme for reproducing immersive audio. The encoder analyzes which objects are most significant in each time-frequency segment, carries their direction as metadata, and downmixes the objects according to that direction information to produce a lower-bitrate signal. The decoder reconstructs the audio scene using an optimized covariance synthesis, a matrix-based technique intended to place sounds accurately in space while reducing computational complexity.
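The per-segment analysis described above can be illustrated with a minimal sketch. It assumes that "most significant" means highest power within a time-frequency tile and that a fixed number `k` of objects is kept per tile; both are assumptions made for illustration, and the function name `dominant_objects` and the sample data are hypothetical rather than taken from the patent.

```python
def dominant_objects(powers, k=2):
    """Return, for each time-frequency tile, the indices of the k
    highest-power objects (assumed selection criterion, not the patent's)."""
    result = []
    for tile in powers:
        # Rank object indices by descending power within this tile.
        ranked = sorted(range(len(tile)), key=lambda i: tile[i], reverse=True)
        result.append(ranked[:k])
    return result


# Hypothetical powers: powers[tile][obj] for 2 tiles and 3 objects.
powers = [
    [0.9, 0.1, 0.4],
    [0.0, 0.7, 0.2],
]
selected = dominant_objects(powers, k=2)
print(selected)
```

In a scheme of this kind, only the selected indices and their directions would need to be signaled per tile, which is one way the side information could stay small.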
Use Cases
- 3D/immersive audio playback in home entertainment systems, cinemas, or VR/AR applications.
- Efficient transmission and rendering of multi-talker teleconference calls or live events with spatialized audio.
- Streaming or broadcasting high-fidelity spatial audio over networks with limited bandwidth, such as mobile or internet radio.
- Content creation tools for music and film, allowing producers to encode complex multi-object audio scenes for interactive or adaptive playback platforms.
- Hearing aids or assistive listening devices, where spatial cues help users localize sound sources more naturally.
Benefits
- Delivers immersive audio experiences with accurate spatial rendering, enhancing realism and user engagement.
- Efficiently codes complex sound scenes with multiple objects, using less data than traditional methods, enabling low-bitrate applications.
- Optimized decoding reduces computational requirements, allowing deployment on devices with limited processing power (like mobile phones or smart speakers).
- Flexible: works for various numbers of audio objects and target output formats, supporting everything from stereo to sophisticated 3D speaker setups.
- Improved audio quality compared to prior art, with minimized artifacts (such as those caused by generic decorrelators).
- Object position (direction) metadata enables interactive or personalized renders (e.g., rotating or customizing sound scenes for the listener).
Technical Classifications (CPCs)
Main Classifications
Electrical & Electronic Tech
Physics & Measurement
Sub Classifications
Computing & Calculating
Electric Communication Technique
Musical Instruments & Acoustics
CPC Codes
Inventors & Applicants
Inventors
Applicants
Fraunhofer Ges Forschung
Univ Friedrich Alexander Er
Patent Abstract
An apparatus for encoding a plurality of audio objects and related metadata indicating direction information on the plurality of audio objects, comprises: a downmixer (400) for downmixing the plurality of audio objects to obtain one or more transport channels; a transport channel encoder (300) for encoding one or more transport channels to obtain one or more encoded transport channels; and an output interface (200) for outputting an encoded audio signal comprising the one or more encoded transport channels, wherein the downmixer (400) is configured to downmix the plurality of audio objects in response to the direction information on the plurality of audio objects.
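As a rough illustration of a downmix performed "in response to the direction information", the sketch below weights each object by a direction-dependent gain before summing it into two transport channels. The gain law (virtual cardioid pickups pointing to +90 and -90 degrees azimuth), the function names, and the sample data are assumptions chosen for the example; the abstract does not specify how the gains are derived.

```python
import math


def cardioid_gain(azimuth_deg, mic_azimuth_deg):
    """Gain of a virtual cardioid aimed at mic_azimuth_deg for a source
    at azimuth_deg (assumed gain law, for illustration only)."""
    return 0.5 * (1.0 + math.cos(math.radians(azimuth_deg - mic_azimuth_deg)))


def downmix(objects, azimuths_deg, mic_azimuths_deg=(90.0, -90.0)):
    """Mix equal-length object signals into one transport channel per
    virtual microphone, weighting each object by its direction."""
    n = len(objects[0])
    channels = []
    for mic in mic_azimuths_deg:
        gains = [cardioid_gain(az, mic) for az in azimuths_deg]
        channels.append(
            [sum(g * obj[t] for g, obj in zip(gains, objects)) for t in range(n)]
        )
    return channels


# Hypothetical scene: one object hard left (+90 deg), one hard right (-90 deg).
objects = [[1.0, 0.5], [0.2, -0.2]]
azimuths = [90.0, -90.0]
left, right = downmix(objects, azimuths)
print(left, right)
```

With these inputs each object lands almost entirely in the channel facing it, so the transport channels themselves retain a coarse spatial separation that a decoder could exploit.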
Key Information
Publication No.
WO2022079044A1
Family ID
78087389
Publication Date
2022-04-21
Application No.
EP2021078209W
Application Date
2021-10-12
Priority Date
2020-10-13
Granted
Yes (4/15)
Possible Cooperation
For further information please contact the transfer office.