Perceptual Audio Coding with Adaptive Non-Uniform Time/frequency Tiling Using Subband Merging and Time Domain Aliasing Reduction

Publication: EP3644313A1

Published: 2020-04-29

Family Size: 17

Granted: Yes (6/17)

Simple Summary

Content extracted from patent full text and abstract with AI.

This invention describes a method for perceptual audio coding that uses adaptive, non-uniform time/frequency tiling, achieved through subband merging and time domain aliasing reduction. The audio processor applies a cascaded (multi-stage) transform: the audio signal is processed in partially overlapping blocks, the resulting frequency bins are segmented with window functions and transformed again, and corresponding subband samples from consecutive blocks are then combined with weights to reduce aliasing. This lets the filterbank adapt its time and frequency resolution more closely to human auditory perception, which is reported to improve coding efficiency and audio quality at lower bitrates compared to classic uniform filterbanks.
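As a rough illustration of processing "partially overlapping blocks" with a lapped, critically sampled transform, the sketch below uses an MDCT with a sine window and 50% overlap. This is only the kind of first stage such a cascade could build on, not the patented method: the choice of MDCT, the sine window, the zero padding, and all function names (`sine_window`, `mdct`, `imdct`, `analysis`, `synthesis`) are assumptions made for illustration. Subband merging and the aliasing reduction step are not shown.

```python
import math
import random

def sine_window(length):
    """Sine window (assumed here); it satisfies w[t]^2 + w[t+N]^2 = 1."""
    return [math.sin(math.pi * (t + 0.5) / length) for t in range(length)]

def mdct(block, window):
    """Windowed MDCT: 2N input samples -> N coefficients."""
    n = len(block) // 2
    return [
        sum(window[t] * block[t]
            * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N aliased samples."""
    n = len(coeffs)
    return [
        window[t] * (2.0 / n)
        * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
              for k in range(n))
        for t in range(2 * n)
    ]

def analysis(signal, n):
    """Transform blocks of 2N samples taken with hop size N (50% overlap).

    Assumes len(signal) is a multiple of n.
    """
    window = sine_window(2 * n)
    padded = [0.0] * n + list(signal) + [0.0] * n
    n_frames = len(padded) // n - 1
    return [mdct(padded[i * n:i * n + 2 * n], window) for i in range(n_frames)]

def synthesis(frames, n):
    """Overlap-add the inverse transforms; aliasing cancels between neighbours."""
    window = sine_window(2 * n)
    out = [0.0] * ((len(frames) + 1) * n)
    for i, coeffs in enumerate(frames):
        for t, value in enumerate(imdct(coeffs, window)):
            out[i * n + t] += value
    return out[n:-n]

random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(64)]
frames = analysis(x, 8)
x_hat = synthesis(frames, 8)
max_error = max(abs(a - b) for a, b in zip(x, x_hat))
print(len(frames), "frames of", len(frames[0]), "coefficients; max error:", max_error)
```

Each block of 16 samples yields only 8 coefficients, which is what "critically sampled" refers to, and the time domain aliasing inside a single block cancels once neighbouring blocks are overlap-added, so the reconstruction error should be at the level of floating-point rounding.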

Use Cases

Content extracted from patent full text and abstract with AI.

Audio codecs for music streaming or digital music distribution (e.g. streaming services, online music stores)
Speech and audio compression in telecommunications (VoIP, mobile calls)
Digital broadcasting (DAB, satellite radio, television broadcasting)
Storage of large music or speech libraries with minimized file sizes (e.g. in professional archiving or music libraries)
Efficient audio data transmission in bandwidth-limited systems (such as IoT devices or wireless headphones)
Audio analysis and feature extraction systems where alias-reduced high-fidelity representations are needed

Benefits

Content extracted from patent full text and abstract with AI.

Achieves better audio quality at the same or lower bitrate compared to traditional methods, due to perceptual alignment and flexible tiling.
Reduces perceptible coding artifacts by following human auditory system resolution more closely (better frequency and time adaptation).
Allows greater coding efficiency: tests are reported to show 5–13% fewer bits needed for similar quality compared to standard approaches.
Enables fine-grained control and adaptation for different audio content types (speech, music, percussive sounds).
Backward compatible and extendable to multichannel (stereo or surround) systems and suitable for real-world audio codecs.
Can dynamically optimize transform parameters for each frequency band and each signal block, enhancing robustness to diverse input audio.

Technical Classifications (CPCs)

Main Classifications

Physics & Measurement

Sub Classifications

Computing & Calculating

Musical Instruments & Acoustics

CPC Codes

G06F17/147, G10L19/008, G10L19/0204, G10L19/0212, G10L19/022, G10L19/26

Inventors & Applicants

Inventors

Nils Werner

Bernd Edler

Sascha Disch

Applicants

Fraunhofer Ges Forschung

Univ Friedrich Alexander Er

Patent Abstract

Embodiments provide an audio processor for processing an audio signal to obtain a subband representation of the audio signal. The audio processor is configured to perform a cascaded lapped critically sampled transform on at least two partially overlapping blocks of samples of the audio signal, to obtain a set of subband samples on the basis of a first block of samples of the audio signal, and to obtain a corresponding set of subband samples on the basis of a second block of samples of the audio signal. Further, the audio processor is configured to perform a weighted combination of two corresponding sets of subband samples, one obtained on the basis of the first block of samples of the audio signal and one obtained on the basis of the second block of samples of the audio signal, to obtain an aliasing reduced subband representation of the audio signal; wherein performing a cascaded lapped critically sampled transform comprises segmenting a set of bins obtained on the basis of the first block of samples using at least two window functions, and to obtain at least two segmented sets of bins based on the segmented set of bins corresponding to the first block of samples; wherein performing a cascaded lapped critically sampled transform comprises segmenting a set of bins obtained on the basis of the second block of samples using the at least two window functions, and to obtain at least two sets of bins based on the segmented set of bins corresponding to the second block of samples; and wherein the sets of bins are processed using a second lapped critically sampled transform of the cascaded lapped critically sampled transform, wherein the second lapped critically sampled transform comprises performing lapped critically sampled transforms having the same framelength for at least one set of bins.
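One way to picture the "weighted combination of two corresponding sets of subband samples" is as a pairwise mixing step that a decoder can undo exactly. The sketch below is a hedged illustration only: the pairing of sample m from the current set with the time-reversed sample from the previous set, the use of plane rotations as weights, and the angle values are all assumptions, and the names `combine` and `uncombine` are hypothetical. The abstract does not give the actual weights, and these placeholder angles do not by themselves reduce aliasing.

```python
import math

def combine(prev_set, cur_set, angles):
    """Mix each pair (cur[m], prev[K-1-m]) with a 2x2 rotation (assumed weights)."""
    k = len(cur_set)
    new_prev, new_cur = list(prev_set), list(cur_set)
    for m in range(k):
        c, s = math.cos(angles[m]), math.sin(angles[m])
        a, b = cur_set[m], prev_set[k - 1 - m]
        new_cur[m] = c * a + s * b
        new_prev[k - 1 - m] = -s * a + c * b
    return new_prev, new_cur

def uncombine(prev_set, cur_set, angles):
    """Undo combine(): rotating by the negated angles inverts each 2x2 mix."""
    return combine(prev_set, cur_set, [-theta for theta in angles])

# Two corresponding sets of subband samples from consecutive blocks (made-up values).
prev_set = [1.0, 2.0, 3.0, 4.0]
cur_set = [0.5, -1.5, 2.5, -3.5]
angles = [0.1, 0.2, 0.3, 0.4]  # hypothetical weights, not taken from the patent

mixed_prev, mixed_cur = combine(prev_set, cur_set, angles)
restored_prev, restored_cur = uncombine(mixed_prev, mixed_cur, angles)
print(restored_prev, restored_cur)
```

Because each 2x2 rotation is orthogonal, the step preserves the energy of every sample pair and is exactly invertible, which are properties a coder needs from such a post-processing stage regardless of how the weights are chosen.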

Key Information

Publication No.

EP3644313A1

Family ID

64316263

Publication Date

2020-04-29

Application No.

EP19169635A

Application Date

2019-04-16

Priority Date

2018-10-26

Granted

Yes (6/17)

Possible Cooperation

For further information please contact the transfer office.
