Encoder, Decoder, System and Method Employing a Residual Concept for Parametric Audio Object Coding

Publication: WO2014023443A1
Published: 2014-02-13
Family Size: 33
Granted: Yes (17/33)

Simple Summary

Content extracted from patent full text and abstract with AI.

This invention describes a method and system for encoding and decoding audio that contains multiple separate audio objects (such as individual instruments or voices in a mix), based on a 'residual' concept. The encoder combines the audio objects into three or more downmix signals and supplements them with parametric side information and with residual signals, which represent the differences between the original objects and their initial parametric estimates. The decoder uses the downmix signals, the parametric side information, and the residuals to reconstruct the original audio objects more accurately, even for complex scenes with many channels or objects. The approach overcomes limitations of earlier audio object coding, such as support for only a small number of objects or output channels, or for two-channel downmixes only.
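The encode/decode round trip described in the summary can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not taken from the patent: the function names, the fixed `DOWNMIX_MATRIX` and `UPMIX_MATRIX` (a real parametric decoder would derive the upmix from the side information), the purely additive residual correction, and the absence of any quantization or time/frequency processing.

```python
# Hypothetical sketch of the residual idea; not the patented implementation.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

# Four original object signals (rows), five samples each.
OBJECTS = [
    [1.0, 0.0, -1.0, 0.5, 0.0],
    [0.0, 2.0, 0.0, -2.0, 1.0],
    [0.5, 0.5, 0.5, 0.5, 0.5],
    [0.0, -1.0, 1.0, 0.0, -1.0],
]

# Assumed 3x4 downmix matrix: four objects mixed into three downmix signals.
DOWNMIX_MATRIX = [
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 1.0],
]

# Assumed 4x3 upmix matrix, a fixed stand-in for the parametric upmix.
UPMIX_MATRIX = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.25, 0.25, 0.25],
    [0.0, 0.0, 1.0],
]

def encode(objects, downmix_matrix, upmix_matrix, residual_ids):
    """Return the downmix signals and residuals for the selected objects."""
    downmix = matmul(downmix_matrix, objects)
    # The encoder runs the same upmix the decoder will run, so it knows
    # the estimation error and can transmit it as a residual.
    first_estimate = matmul(upmix_matrix, downmix)
    residuals = {
        i: [o - e for o, e in zip(objects[i], first_estimate[i])]
        for i in residual_ids
    }
    return downmix, residuals

def decode(downmix, upmix_matrix, residuals):
    """Return the first (parametric) and second (residual-corrected) estimates."""
    first_estimate = matmul(upmix_matrix, downmix)
    second_estimate = [list(row) for row in first_estimate]
    for i, residual in residuals.items():
        second_estimate[i] = [e + r for e, r in zip(first_estimate[i], residual)]
    return first_estimate, second_estimate

downmix, residuals = encode(OBJECTS, DOWNMIX_MATRIX, UPMIX_MATRIX, [0, 2])
first, second = decode(downmix, UPMIX_MATRIX, residuals)
print("first estimate of object 0: ", first[0])
print("second estimate of object 0:", second[0])
```

In this toy setting only objects 0 and 2 carry residuals, so only those two are corrected; the other objects keep their parametric estimate. A real codec would quantize and code the residuals, so the correction would be approximate rather than exact.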

Use Cases

  • Professional music production and post-production, enabling precise manipulation of individual instruments or voices after mixing.
  • Broadcast and live event audio, allowing separate mixing or enhancement of elements like commentators, crowd noise, and soundtrack for different platforms or languages.
  • Personalized audio experiences, such as allowing end users to remix music or adjust audio scene elements (e.g., muting vocals in karaoke).
  • Virtual reality (VR), gaming, and immersive media, where accurate spatial placement and interactivity with individual audio sources is needed.
  • Hearing aids and assistive listening devices, enabling focus or suppression of specific sound sources in complex environments.
  • Advanced teleconferencing where participants' voices can be isolated, enhanced, or spatially rendered.

Benefits

  • Enables precise separation and reconstruction of individual audio objects from complex downmixes using more than two channels, overcoming traditional limitations of SAOC systems.
  • Improves perceived audio quality and naturalness, especially when 'soloing' (isolating) individual objects or creating personalized audio mixes.
  • Reduces the need to compute complex channel prediction coefficients, simplifying and accelerating the decoding process.
  • Scales to a larger number of audio objects and output channels (including surround and object-based audio formats), supporting modern immersive audio systems.
  • Supports both joint and cascaded processing strategies, allowing trade-offs between computational complexity and output audio quality.
  • Facilitates flexible, object-based audio rendering for diverse applications, from professional studios to consumer devices.

Technical Classifications (CPCs)

Main Classifications

Electrical & Electronic Tech

Physics & Measurement

Sub Classifications

Electric Communication Technique

Musical Instruments & Acoustics

CPC Codes

G10L19/008, G10L19/04, G10L19/20, H04S3/00

Inventors & Applicants

Applicants

Fraunhofer Ges Forschung

Univ Friedrich Alexander Er

Patent Abstract

A decoder is provided. The decoder comprises a parametric decoding unit (110) for generating a plurality of first estimated audio object signals by upmixing three or more downmix signals, wherein the three or more downmix signals encode a plurality of original audio object signals, wherein the parametric decoding unit (110) is configured to upmix the three or more downmix signals depending on parametric side information indicating information on the plurality of original audio object signals. Moreover, the decoder comprises a residual processing unit (120) for generating a plurality of second estimated audio object signals by modifying one or more of the first estimated audio object signals, wherein the residual processing unit (120) is configured to modify said one or more of the first estimated audio object signals depending on one or more residual signals.
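One way to read the two decoder stages in the abstract is as the following pair of equations. The notation is assumed for illustration and does not come from the patent: X holds the M downmix signals over T samples, G is an upmix matrix derived from the parametric side information (PSI), and the set R indexes the objects for which a residual r_i is available. The abstract only says the estimates are modified "depending on" the residuals, so the additive form is also an assumption.

```latex
\begin{align}
  \hat{\mathbf{S}} &= \mathbf{G}(\mathrm{PSI})\,\mathbf{X},
    & \mathbf{X} &\in \mathbb{R}^{M \times T},\; M \ge 3, \\
  \tilde{\mathbf{s}}_i &=
    \begin{cases}
      \hat{\mathbf{s}}_i + \mathbf{r}_i, & i \in \mathcal{R}, \\
      \hat{\mathbf{s}}_i, & \text{otherwise.}
    \end{cases}
\end{align}
```

Here the rows of the first estimate (hatted S) correspond to the output of the parametric decoding unit (110), and the tilded signals correspond to the output of the residual processing unit (120).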

Key Information

Publication No.: WO2014023443A1
Family ID: 48092997
Publication Date: 2014-02-13
Application No.: EP2013057932W
Application Date: 2013-04-16
Priority Date: 2012-08-10
Granted: Yes (17/33)
