Two important signal processing tools are applied in systems for source coding of audio signals, namely critically sampled filterbanks and linear prediction. Critically sampled filterbanks (e.g. filterbanks based on the modified discrete cosine transform, MDCT) enable direct access to time-frequency representations in which perceptual irrelevancy and signal redundancy can be exploited. Linear prediction enables efficient source modeling of audio signals, in particular of speech signals. The combination of the two tools, i.e. the use of prediction in the subbands of a filterbank, has mainly been used for high bit rate audio coding. For low bit rate coding, one challenge with prediction in the subbands is to keep the cost (i.e. the bit rate) of describing the predictors low. Another challenge is to control the resulting noise shaping of the prediction error signal obtained by a subband predictor.
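By way of illustration only, the notion of a critically sampled filterbank may be sketched with a direct (non-fast) MDCT. The sketch below is not taken from the present document: the function names, the sine window, the hop size and the zero padding at the signal edges are assumptions chosen for brevity. It shows that each hop of N new input samples yields exactly N subband coefficients, and that windowed overlap-add of the inverse transforms cancels the time domain aliasing of the individual blocks.

```python
import math

def sine_window(hop):
    """Sine window of length 2*hop; it satisfies w[n]^2 + w[n+hop]^2 = 1."""
    length = 2 * hop
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(block, window):
    """Map a windowed block of 2N samples to N coefficients."""
    n_coef = len(block) // 2
    n0 = 0.5 + n_coef / 2
    return [
        sum(window[n] * block[n]
            * math.cos(math.pi / n_coef * (n + n0) * (k + 0.5))
            for n in range(len(block)))
        for k in range(n_coef)
    ]

def imdct(coefs, window):
    """Map N coefficients to a windowed, time-aliased block of 2N samples."""
    n_coef = len(coefs)
    n0 = 0.5 + n_coef / 2
    return [
        window[n] * (2.0 / n_coef)
        * sum(coefs[k] * math.cos(math.pi / n_coef * (n + n0) * (k + 0.5))
              for k in range(n_coef))
        for n in range(2 * n_coef)
    ]

def analyze_and_synthesize(x, hop):
    """Return (list of coefficient frames, reconstructed signal).

    Assumes len(x) is a multiple of hop. The signal is zero padded by one
    hop on each side so that every input sample is covered by two blocks.
    """
    window = sine_window(hop)
    padded = [0.0] * hop + list(x) + [0.0] * hop
    out = [0.0] * len(padded)
    frames = []
    for start in range(0, len(padded) - 2 * hop + 1, hop):
        coefs = mdct(padded[start:start + 2 * hop], window)
        frames.append(coefs)
        # Overlap-add: aliasing terms of adjacent blocks cancel here.
        for n, value in enumerate(imdct(coefs, window)):
            out[start + n] += value
    return frames, out[hop:hop + len(x)]

# Hypothetical test signal: 64 samples, hop size of 8 samples.
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(64)]
frames, rebuilt = analyze_and_synthesize(signal, 8)
```

In this sketch, a predictor operating in the subbands would act on the successive entries of `frames` rather than on the time domain samples.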
For the challenge of encoding the description of the subband predictor in a bit-efficient manner, a possible path is to estimate the predictor from previously decoded portions of the audio signal and to thereby avoid the cost of a predictor description altogether. If the predictor can be determined from previously decoded portions of the audio signal, the predictor can be determined at the encoder and at the decoder, without the need to transmit a predictor description from the encoder to the decoder. This scheme is referred to as a backwards adaptive prediction scheme. However, the performance of the backwards adaptive prediction scheme typically degrades significantly when the bit rate of the encoded audio signal decreases. An alternative or additional path to the efficient encoding of a subband predictor is to identify a more natural predictor description, e.g. a description which exploits the inherent structure of the to-be-encoded audio signal. For instance, low bit rate speech coding typically applies a forward adaptive scheme based on a compact representation of a short term predictor (exploiting short term correlations) and a long term predictor (exploiting long term correlations due to an underlying pitch of the speech signal).
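The backwards adaptive principle described above may be illustrated by the following minimal sketch. It is not the scheme of the present document: the class and function names, the normalized LMS update rule, the predictor order and the uniform quantizer step size are assumptions made for illustration only. The point shown is that the encoder and the decoder adapt the predictor from the same previously decoded samples, so that only quantized residual indices are transmitted and no predictor description is needed.

```python
import math

class BackwardAdaptivePredictor:
    """Hypothetical predictor adapted only from previously decoded samples."""

    def __init__(self, order=2, mu=0.5, eps=1e-9):
        self.weights = [0.0] * order
        self.history = [0.0] * order  # decoded samples, newest first
        self.mu = mu
        self.eps = eps

    def predict(self):
        return sum(w * h for w, h in zip(self.weights, self.history))

    def update(self, decoded):
        # Normalized LMS step driven by the decoded sample, which is
        # available at both the encoder and the decoder.
        err = decoded - self.predict()
        norm = sum(h * h for h in self.history) + self.eps
        self.weights = [w + self.mu * err * h / norm
                        for w, h in zip(self.weights, self.history)]
        self.history = [decoded] + self.history[:-1]

def encode(samples, step):
    """Return quantizer indices of the prediction residual."""
    predictor = BackwardAdaptivePredictor()
    indices = []
    for s in samples:
        pred = predictor.predict()
        q = round((s - pred) / step)
        indices.append(q)
        # The encoder tracks the decoder by adapting on the decoded value.
        predictor.update(pred + q * step)
    return indices

def decode(indices, step):
    """Rebuild the signal from indices alone; no predictor is transmitted."""
    predictor = BackwardAdaptivePredictor()
    decoded = []
    for q in indices:
        value = predictor.predict() + q * step
        decoded.append(value)
        predictor.update(value)
    return decoded

# Hypothetical usage with a sinusoidal test signal and step size 0.01.
signal = [math.sin(2 * math.pi * n / 40) for n in range(200)]
reconstructed = decode(encode(signal, 0.01), 0.01)
```

Because the adaptation is driven by the decoded signal, a coarser quantizer (i.e. a lower bit rate) feeds noisier data into the predictor estimate, which is consistent with the degradation at low bit rates noted above.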
For the challenge of controlling the noise shaping of the prediction error signal, it is observed that while the noise shaping of a predictor may be well controlled inside a subband, the final output audio signal of the encoder typically exhibits alias artifacts (except for audio signals exhibiting a substantially flat spectral noise shape).
An important case of a subband predictor is the implementation of long term prediction in a filterbank with overlapping windows. A long term predictor typically exploits the redundancies in periodic and near periodic audio signals (such as speech signals exhibiting an inherent pitch), and may be described with a single prediction parameter or a low number of prediction parameters. The long term predictor may be defined in continuous time by means of a delay which reflects the periodicity of the audio signal. When this delay is large compared to the length of the filterbank window, the long term predictor can be implemented in the discrete time domain by means of a shift or a fractional delay and may be converted back into a causal predictor in the subband domain. Such a long term predictor typically does not exhibit alias artifacts, but there is a significant penalty in computational complexity caused by the need for additional filterbank operations for the conversion from the time domain to the subband domain. Furthermore, the approach of determining the delay in the time domain and of converting the delay into a subband predictor is not applicable for the case where the period of the to-be-encoded audio signal is comparable to or smaller than the filterbank window size.
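The time domain form of a long term predictor may be sketched as follows, by way of illustration only. The function names, the restriction to integer lags, the single gain parameter and the exhaustive search over a lag range are assumptions and are not taken from the present document; the conversion into the subband domain is not shown. The sketch describes the predictor by two parameters, a delay (lag) and a gain, and selects the lag that gives the largest least-squares reduction of the prediction error energy.

```python
import math

def estimate_long_term_predictor(x, min_lag, max_lag):
    """Search integer lags for the one-tap predictor x[n] ~ gain * x[n - lag].

    Returns (lag, gain). For a fixed lag, the least-squares gain is
    num / den and the resulting reduction of error energy is num**2 / den.
    """
    start = max_lag  # first sample for which every candidate lag has history
    best_lag, best_gain, best_score = min_lag, 0.0, 0.0
    for lag in range(min_lag, max_lag + 1):
        num = sum(x[n] * x[n - lag] for n in range(start, len(x)))
        den = sum(x[n - lag] ** 2 for n in range(start, len(x)))
        if den <= 0.0 or num <= 0.0:
            continue  # no usable positive correlation at this lag
        score = num * num / den
        if score > best_score:
            best_lag, best_gain, best_score = lag, num / den, score
    return best_lag, best_gain

def prediction_residual(x, lag, gain):
    """Prediction error of the one-tap long term predictor."""
    return [x[n] - gain * x[n - lag] for n in range(lag, len(x))]

# Hypothetical usage: a periodic test signal with a period of 50 samples.
signal = [math.sin(2 * math.pi * n / 50) for n in range(400)]
lag, gain = estimate_long_term_predictor(signal, 20, 80)
residual = prediction_residual(signal, lag, gain)
```

An integer lag is a simplification; a period that is not an integer number of samples would call for the fractional delay mentioned above, and the search range is kept below twice the period here so that multiples of the period do not compete with the period itself.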
The present document addresses the above mentioned shortcomings of subband prediction. In particular, the present document describes methods and systems which allow for a bit-rate efficient description of subband predictors and/or which allow for a reduction of alias artifacts caused by subband predictors. As a result, the methods and systems described in the present document enable the implementation of low bit rate audio coders using subband prediction, which cause a reduced level of aliasing artifacts.