Embodiments according to the invention are related to an audio decoder for providing a decoded audio information on the basis of an encoded audio information and to an audio encoder for providing an encoded audio information on the basis of an input audio information. Further embodiments are related to a method for providing a decoded representation of an audio content on the basis of an encoded representation of the audio content and to a method for providing an encoded representation of an audio content on the basis of an input representation of the audio content. Yet further embodiments according to the invention are related to computer programs for performing the inventive methods.
Embodiments according to the invention are related to improvements of a transition from a frequency-domain mode to a linear-prediction-domain mode.
In the following, some background information will be explained in order to facilitate the understanding of the invention and its advantages. During the past decade, considerable effort has been put into making it possible to digitally store and distribute audio content. One important achievement along this path is the definition of the International Standard ISO/IEC 14496-3. Part 3 of ISO/IEC 14496 relates to the encoding and decoding of audio content, and subpart 4 of part 3 relates to general audio coding. ISO/IEC 14496 part 3, subpart 4 defines a concept for encoding and decoding general audio content. In addition, further improvements have been proposed in order to improve the quality and/or reduce the necessary bitrate.
According to the concept described in said standard, a time-domain audio signal is converted into a time-frequency representation. The transform from the time domain to the time-frequency domain is typically performed using transform blocks, which are also designated as “audio frames” or briefly “frames”.
It has been found to be advantageous to use overlapping frames, which are shifted, for example, by half a frame, because the overlap makes it possible to efficiently avoid artifacts. In addition, it has been found that windowing should be performed in order to avoid the artifacts originating from the processing of temporally limited frames. Also, the windowing allows for an optimization of the overlap-and-add process applied to subsequent temporally shifted but overlapping frames.
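For illustration only, the interplay of half-overlapping frames, windowing and overlap-and-add may be sketched as follows. The sketch assumes a sine window applied once before and once after a (here omitted) transform stage; the function names `sine_window` and `overlap_add`, the frame length of 16 samples and the test signal are hypothetical choices made for this example and are not taken from the above-mentioned standard.

```python
import math

def sine_window(n):
    """Sine window of length n. Its squared values add up to 1 with a copy shifted by n/2."""
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]

def overlap_add(signal, frame_len):
    """Cut the signal into frames shifted by half a frame, window each frame
    twice (analysis and synthesis), and overlap-and-add the results."""
    hop = frame_len // 2
    win = sine_window(frame_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Analysis windowing of one temporally limited frame.
        frame = [signal[start + i] * win[i] for i in range(frame_len)]
        # A real codec would transform, quantize and inverse-transform here (omitted).
        for i in range(frame_len):
            # Synthesis windowing and overlap-and-add into the output.
            out[start + i] += frame[i] * win[i]
    return out

# Hypothetical test signal: 64 samples of a sinusoid.
signal = [math.sin(0.1 * n) for n in range(64)]
rebuilt = overlap_add(signal, frame_len=16)

# Samples covered by two overlapping frames are reconstructed;
# the first and the last half frame are covered by one frame only.
max_err = max(abs(a - b) for a, b in zip(signal[8:56], rebuilt[8:56]))
```

In this sketch, the window shape is chosen such that the contributions of two overlapping frames add up to the original sample values, which is one way in which windowing may support the overlap-and-add process.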
In addition, techniques for an efficient encoding of speech signals have been proposed. For example, concepts for speech coding have been defined in the International Standards 3GPP TS 26.090, 3GPP TS 26.190 and 3GPP TS 26.290. Many further concepts for the encoding of speech signals have also been discussed in the literature.
However, it has been found that it is difficult to combine the concepts for general audio coding (as defined, for example, in the International Standard ISO/IEC 14496, part 3, subpart 4) with the concepts for speech coding (as defined, for example, in the above-mentioned 3GPP Standards).
In view of this situation, there is a desire to create concepts which allow for a sufficiently smooth yet bitrate-efficient transition between audio frames encoded in the frequency-domain and audio frames encoded in the linear-prediction-domain.