Encoder, decoder and methods for signal-dependent zoom-transform in spatial audio object coding

Publication: EP2717262A1

Published: 2014-04-09

Family Size: 53

Granted: Yes (24/53)

Simple Summary

Content extracted from patent full text and abstract with AI.

This invention relates to an improved method and system for encoding and decoding spatial audio object signals. It introduces an encoder and decoder that can dynamically adapt the time-frequency resolution of audio signals depending on the characteristics (such as the presence of transients or stationary tones) of different audio objects within a sound mix. This is achieved by switching between fine and coarse resolutions, allowing for better separation and rendering of audio objects, while maintaining backward compatibility with existing standards.

Use Cases

High-quality 3D or spatial audio playback in virtual reality (VR) and augmented reality (AR) systems.
Advanced audio editing and mixing tools that require precise separation and manipulation of different sound sources or objects.
Live broadcasting or streaming of multi-object audio with efficient bandwidth usage and customizable rendering for users.
Hearing aids or audio enhancement devices that improve intelligibility in noisy environments by separating and focusing on specific audio objects.
Gaming applications requiring immersive and dynamic sound positioning and separation.
Karaoke and music education platforms where users can adjust levels or isolate specific instruments or vocals.
Teleconferencing systems improving speech recognition and clarity by handling multiple speakers as separate audio objects.

Benefits

Improves perceptual audio quality by dynamically adapting the time-frequency resolution to each signal's needs, minimizing artifacts like pre/post echoes or crosstalk.
Enables fine separation and manipulation of individual audio objects for personalized or interactive audio experiences.
Maintains backward compatibility with existing Spatial Audio Object Coding (SAOC) systems, allowing seamless integration and gradual adoption.
Enhances coding efficiency, reducing required bitrate or storage while maintaining or improving audio quality.
Facilitates flexible and scalable audio rendering across diverse playback environments, such as headphones, multi-speaker setups, or binaural playback.
Supports both real-time and offline processing, adaptable to hardware or software implementations.

Technical Classifications (CPCs)

Main Classifications

Physics & Measurement

Sub Classifications

Musical Instruments & Acoustics

CPC Codes

G10L19/008, G10L19/02, G10L19/0204, G10L19/0208, G10L19/025, G10L19/20

Inventors & Applicants

Inventors

Applicants

Fraunhofer Ges Forschung

Univ Friedrich Alexander Er

Patent Abstract

A decoder for generating an audio output signal comprising one or more audio output channels from a downmix signal is provided. The downmix signal encodes one or more audio object signals. The decoder comprises a control unit (181) for setting an activation indication to an activation state depending on a signal property of at least one of the one or more audio object signals. Moreover, the decoder comprises a first analysis module (182) for transforming the downmix signal to obtain a first transformed downmix comprising a plurality of first subband channels. Furthermore, the decoder comprises a second analysis module (183) for generating, when the activation indication is set to the activation state, a second transformed downmix by transforming at least one of the first subband channels to obtain a plurality of second subband channels, wherein the second transformed downmix comprises the first subband channels which have not been transformed by the second analysis module and the second subband channels. Moreover, the decoder comprises an un-mixing unit (184), wherein the un-mixing unit (184) is configured to un-mix the second transformed downmix, when the activation indication is set to the activation state, based on parametric side information on the one or more audio object signals to obtain the audio output signal, and to un-mix the first transformed downmix, when the activation indication is not set to the activation state, based on the parametric side information on the one or more audio object signals to obtain the audio output signal. Furthermore, an encoder is provided.
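
The decoder data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the naive DFT filterbanks, the function names, the `zoom_bands` and `zoom_len` parameters, and the per-channel gains standing in for the parametric side information are all assumptions made for illustration. The activation indication is passed in directly, since the abstract does not say how the control unit (181) derives it from the signal property, and the synthesis stage that would return the result to the time domain is omitted.

```python
import cmath


def dft(x):
    """Naive DFT, used here as a stand-in for a real analysis filterbank."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def first_analysis(downmix, num_bands):
    """First analysis module (182): downmix -> first subband channels.

    Splits the downmix into non-overlapping frames of num_bands samples and
    returns one time series of complex values per subband channel.
    """
    frames = [
        downmix[i:i + num_bands]
        for i in range(0, len(downmix) - num_bands + 1, num_bands)
    ]
    spectra = [dft(frame) for frame in frames]
    return [[s[k] for s in spectra] for k in range(num_bands)]


def second_analysis(subbands, zoom_bands, zoom_len):
    """Second analysis module (183): transform selected first subbands again.

    Each selected subband is replaced by zoom_len second subband channels
    (finer in frequency, coarser in time). Untouched first subband channels
    are passed through, as the abstract requires.
    """
    out = {}
    for k, series in enumerate(subbands):
        if k in zoom_bands:
            blocks = [
                series[i:i + zoom_len]
                for i in range(0, len(series) - zoom_len + 1, zoom_len)
            ]
            spectra = [dft(block) for block in blocks]
            for m in range(zoom_len):
                out[(k, m)] = [s[m] for s in spectra]
        else:
            out[(k, None)] = list(series)
    return out


def unmix(transformed, side_info):
    """Un-mixing unit (184), reduced to a hypothetical per-channel gain.

    side_info maps a channel key to a gain; missing keys default to 1.0.
    """
    return {
        key: [side_info.get(key, 1.0) * v for v in series]
        for key, series in transformed.items()
    }


def decode(downmix, activation, side_info,
           num_bands=4, zoom_bands=(1,), zoom_len=2):
    """Un-mix the second transformed downmix if activated, else the first."""
    first = first_analysis(downmix, num_bands)
    if activation:
        transformed = second_analysis(first, set(zoom_bands), zoom_len)
    else:
        transformed = {(k, None): s for k, s in enumerate(first)}
    return unmix(transformed, side_info)
```

With a 16-sample downmix and the default parameters, the inactive path yields four first subband channels, while the active path yields three untouched first subband channels plus two second subband channels derived from subband 1.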

Key Information

Publication No.: EP2717262A1

Family ID: 48325509

Publication Date: 2014-04-09

Application No.: EP13167487A

Application Date: 2013-05-13

Priority Date: 2012-10-05

Granted: Yes (24/53)

Possible Cooperation

For further information please contact the transfer office.