1. Field of the Invention
The present invention relates to signal processing and particularly to signal processing of sequential values, such as audio samples or video samples, which is particularly suitable for lossless coding applications.
2. Description of the Related Art
The present invention is further suitable for compression algorithms for discrete values comprising audio and/or image information, and particularly for coding algorithms including a transform in the frequency domain or time domain or location domain, which are followed by a coding, such as an entropy coding in the form of a Huffman or arithmetic coding.
Modern audio coding methods, such as MPEG Layer3 (MP3) or MPEG AAC, use transforms, such as the so-called modified discrete cosine transform (MDCT), to obtain a block-wise frequency representation of an audio signal. Such an audio coder usually obtains a stream of time-discrete audio samples. The stream of audio samples is windowed to obtain a windowed block of for example 1,024 or 2,048 windowed audio samples. For the windowing, various window functions are employed, such as a sine window, etc.
The windowed time-discrete audio samples are then converted to a spectral representation by means of a filter bank. In principle, a Fourier transform or, for special reasons, a variety of the Fourier transform, such as an FFT or, as discussed, an MDCT, may be employed for this. The block of audio spectral values at the output of the filter bank may then be processed further, as necessary. In the above audio coders, a quantization of the audio spectral values follows, wherein the quantization stages are typically chosen so that the quantization noise introduced by the quantizing is below the psychoacoustic masking threshold, i.e. is “masked away”. The quantization is a lossy coding. In order to obtain further data amount reduction, the quantized spectral values are then entropy-coded, for example by means of Huffman coding. By adding side information, such as scale factors etc., a bit stream, which may be stored or transmitted, is formed from the entropy-coded quantized spectral values by means of a bit stream multiplexer.
In the audio decoder, the bit stream is split up into coded quantized spectral values and side information by means of a bit stream demultiplexer. The entropy-coded quantized spectral values are first entropy-decoded to obtain the quantized spectral values. The quantized spectral values are then inversely quantized to obtain decoded spectral values comprising quantization noise, which, however, is below the psychoacoustic masking threshold and will thus be inaudible. These spectral values are then converted to a temporal representation by means of a synthesis filter bank to obtain time-discrete decoded audio samples. In the synthesis filter bank, a transform algorithm inverse to the transform algorithm of the coder has to be employed. Moreover, the windowing has to be reversed after the frequency-time backward transform.
In order to achieve good frequency selectivity, modern audio coders typically use block overlap. Such a case is illustrated in FIG. 6a. First for example 2,048 time-discrete audio samples are taken and windowed by means of means 402. The window embodying means 402 has a window length of 2N samples and provides a block of 2N windowed samples on the output side. In order to achieve a window overlap, a second block of 2N windowed samples is formed by means of means 404, which is illustrated separate from means 402 in FIG. 6a only for reasons of clarity. The 2,048 samples fed to means 404, however, are not the time-discrete audio samples immediately subsequent to the first window, but contain the second half of the samples windowed by means 402 and additionally contain only 1,024 “new” samples. The overlap is symbolically illustrated by means 406 in FIG. 6a, causing an overlapping degree of 50%. Both the 2N windowed samples output by means 402 and the 2N windowed samples output by means 404 are then subjected to the MDCT algorithm by means of means 408 and 410, respectively. Means 408 provides N spectral values for the first window according to the known MDCT algorithm, whereas means 410 also provides N spectral values, but for the second window, wherein there is an overlap of 50% between the first window and the second window.
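The 50% block overlap described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `overlapping_blocks` and the toy stream are hypothetical, and a small N is used instead of 1,024 so that the blocks are easy to inspect.

```python
def overlapping_blocks(samples, n):
    """Split a sample stream into blocks of length 2N that advance by N samples.

    Each block therefore shares its first N samples with the second half
    of the preceding block, i.e. the overlapping degree is 50%.
    """
    blocks = []
    for start in range(0, len(samples) - 2 * n + 1, n):
        blocks.append(samples[start:start + 2 * n])
    return blocks


N = 4                      # toy value; the text uses N = 1,024
stream = list(range(16))   # stand-in for time-discrete audio samples
blocks = overlapping_blocks(stream, N)

for block in blocks:
    print(block)
```

Every block after the first contributes only N "new" samples, which is why N spectral values per block suffice for critical sampling.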
In the decoder, the N spectral values of the first window, as shown in FIG. 6b, are fed to means 412 performing an inverse modified discrete cosine transform. The same applies to the N spectral values of the second window. They are fed to means 414 also performing an inverse modified discrete cosine transform. Both means 412 and means 414 each provide 2N samples for the first window and 2N samples for the second window, respectively.
In means 416, designated TDAC (time domain aliasing cancellation) in FIG. 6b, the fact is taken into account that the two windows are overlapping. In particular, a sample y1 of the second half of the first window, i.e. with an index N+k, is summed with a sample y2 from the first half of the second window, i.e. with an index k, so that N decoded temporal samples result on the output side, i.e. in the decoder.
It is to be noted that, by the function of means 416, which is also referred to as add function, the windowing performed in the coder schematically illustrated by FIG. 6a is taken into account somewhat automatically, so that no explicit “inverse windowing” has to take place in the decoder illustrated by FIG. 6b. 
If the window function implemented by means 402 or 404 is designated w(k), wherein the index k represents the time index, the condition has to be met that the squared window weight w(k) added to the squared window weight w(N+k) together are 1, wherein k runs from 0 to N−1. If a sine window is used whose window weightings follow the first half-wave of the sine function, this condition is always met, since the square of the sine and the square of the cosine together result in the value 1 for each angle.
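The window condition can be checked numerically. The following minimal sketch assumes the commonly used sine window with a half-sample offset, w(k) = sin(π(k + 1/2)/(2N)); the offset and the function names are assumptions made for illustration and are not specified in the text.

```python
import math


def sine_window(two_n):
    """Sine window of length 2N following the first half-wave of the sine.

    The half-sample offset (k + 0.5) is an assumption of this sketch.
    """
    return [math.sin(math.pi * (k + 0.5) / two_n) for k in range(two_n)]


def tdac_residuals(w):
    """Return w(k)^2 + w(N+k)^2 - 1 for k = 0 .. N-1."""
    n = len(w) // 2
    return [w[k] ** 2 + w[n + k] ** 2 - 1.0 for k in range(n)]


N = 8
window = sine_window(2 * N)
max_residual = max(abs(r) for r in tdac_residuals(window))
print(max_residual)  # stays at floating-point rounding level
```

With this window, w(N+k) equals the cosine of the angle whose sine is w(k), so the sum of the squares is 1 up to floating-point rounding.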
In the window method with subsequent MDCT function described in FIG. 6a, it is disadvantageous that the windowing by multiplication of a time-discrete sample, when thinking of a sine window, is achieved with a floating-point number, since the sine of an angle between 0 and 180 degrees does not yield an integer, apart from the angle of 90 degrees. Even when integer time-discrete samples are windowed, floating-point numbers result after the windowing.
Therefore, even if no psychoacoustic coder is used, i.e. if lossless coding is to be achieved, quantization will be necessary at the output of means 408 and 410, respectively, to be able to perform reasonably manageable entropy coding.
Generally, currently known integer transforms for lossless audio and/or video coding are obtained by a decomposition of the transforms used therein into Givens rotations and by applying the lifting scheme to each Givens rotation. Thus a rounding error is introduced in each step. For subsequent stages of Givens rotations, the rounding error continues to accumulate. The resulting approximation error becomes problematic particularly for lossless audio coding approaches, particularly when long transforms are used providing, for example, 1,024 spectral values, as is the case in the known MDCT with overlap and add (MDCT=modified discrete cosine transform). Particularly in the higher frequency range, where the audio signal typically has a very low energy amount anyway, the approximation error may quickly become larger than the actual signal, so that these approaches are problematic with respect to lossless coding and particularly with respect to the coding efficiency that may be achieved by it.
With respect to audio coding, integer transforms, i.e. transform algorithms generating integer output values, are particularly based on the known DCT-IV, which does not take into account a DC component, while integer transforms for image applications are rather based on the DCT-II, which especially contains the provisions for the DC component. Such integer transforms are known, for example, from Y. Zeng, G. Bi and Z. Lin, “Integer sinusoidal transforms based on lifting factorization”, in Proc. ICASSP'01, May 2001, pp. 1181-1184; K. Komatsu and K. Sezaki, “Reversible Discrete Cosine Transform”, in Proc. ICASSP, 1998, vol. 3, pp. 1769-1772; P. Hao and Q. Shi, “Matrix factorizations for reversible integer mapping”, IEEE Trans. Signal Processing, vol. 49, pp. 2314-2324; and J. Wang, J. Sun and S. Yu, “1-d and 2-d transforms from integers to integers”, in Proc. ICASSP'03, Hong Kong, April 2003.
As mentioned above, the integer transforms described there are based on the decomposition of the transform into Givens rotations and on the application of the known lifting scheme to the Givens rotations, which results in the problem of accumulating rounding errors. This is particularly due to the fact that, within a transform, roundings must be performed many times, i.e. after each lifting step, so that, particularly in long transforms requiring a correspondingly large number of lifting steps, there must be a particularly large number of roundings. As described, this results in an accumulated error and particularly also in a relatively complex processing, because rounding is performed after every lifting step before the next lifting step is performed.
Subsequently, the decomposition of the MDCT windowing will be illustrated again with respect to FIGS. 9 to 11, as described in DE 10129240 A1, wherein this decomposition of the MDCT windowing into Givens rotations with lifting matrices and corresponding roundings is advantageously combinable with the concept discussed in FIG. 1 for the conversion and in FIG. 2 for the inverse conversion, to obtain a complete integer MDCT approximation, i.e. an integer MDCT (IntMDCT) according to the present invention, wherein both a forward and a backward transform concept are given for the example of an MDCT.
FIG. 3 shows an overview diagram for the inventive preferred device for processing time-discrete samples representing an audio signal to obtain integer values based on which the Int-MDCT integer transform algorithm is operative. The time-discrete samples are windowed by the device shown in FIG. 3 and optionally converted to a spectral representation. The time-discrete samples supplied to the device at an input 10 are windowed with a window w with a length corresponding to 2N time-discrete samples to achieve, at an output 12, integer windowed samples suitable to be converted to a spectral representation by means of a transform and particularly the means 14 for performing an integer DCT. The integer DCT is designed to generate N output values from N input values which is in contrast to the MDCT function 408 of FIG. 6a which only generates N spectral values from 2N windowed samples due to the MDCT equation.
For windowing the time-discrete samples, first two time-discrete samples are selected in means 16 which together represent a vector of time-discrete samples. A time-discrete sample selected by the means 16 is in the first quarter of the window. The other time-discrete sample is in the second quarter of the window, as discussed in more detail with respect to FIG. 5. The vector generated by the means 16 is now provided with a rotation matrix of the dimension 2×2, wherein this operation is not performed directly, but by means of several so-called lifting matrices.
A lifting matrix has the property that it comprises only one element that depends on the window w and is unequal to “1” or “0”.
The factorization of wavelet transforms in lifting steps is presented in the specialist publication “Factoring Wavelet Transforms Into Lifting Steps”, Ingrid Daubechies and Wim Sweldens, Preprint, Bell Laboratories, Lucent Technologies, 1996. Generally, a lifting scheme is a simple relation between perfectly reconstructing filter pairs having the same low-pass or high-pass filters. Each pair of complementary filters may be factorized into lifting steps. This applies particularly to Givens rotations. Consider the case in which the polyphase matrix is a Givens rotation. The following then applies:
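$$
\begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}
=
\begin{pmatrix} 1 & \dfrac{\cos\alpha - 1}{\sin\alpha} \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ \sin\alpha & 1 \end{pmatrix}
\begin{pmatrix} 1 & \dfrac{\cos\alpha - 1}{\sin\alpha} \\ 0 & 1 \end{pmatrix}
\tag{1}
$$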
Each of the three lifting matrices on the right-hand side of the equal sign has the value “1” as main diagonal element. There is further, in each lifting matrix, a secondary diagonal element equal to 0 and a secondary diagonal element depending on the rotation angle α.
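The factorization of the Givens rotation into three lifting matrices can be verified numerically. The sketch below is an illustration only; the helper names `matmul2`, `lifting_matrices` and `rotation` are hypothetical, and the angle 0.3 rad is an arbitrary test value with sin α ≠ 0, which the factorization requires.

```python
import math


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lifting_matrices(alpha):
    """Return the first, second and third lifting matrix for angle alpha."""
    c = (math.cos(alpha) - 1.0) / math.sin(alpha)  # requires sin(alpha) != 0
    upper = [[1.0, c], [0.0, 1.0]]
    lower = [[1.0, 0.0], [math.sin(alpha), 1.0]]
    return upper, lower, upper


def rotation(alpha):
    """Givens rotation matrix for angle alpha."""
    return [[math.cos(alpha), -math.sin(alpha)],
            [math.sin(alpha), math.cos(alpha)]]


alpha = 0.3
first, second, third = lifting_matrices(alpha)
product = matmul2(matmul2(first, second), third)
target = rotation(alpha)
max_error = max(abs(product[i][j] - target[i][j])
                for i in range(2) for j in range(2))
print(max_error)  # stays at floating-point rounding level
```

The upper-right entry of the product works out to (cos²α − 1)/sin α = −sin α, which is where the off-diagonal sign of the rotation comes from.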
The vector is now multiplied by the third lifting matrix, i.e. the lifting matrix on the far right in the above equation, to obtain a first result vector. This is illustrated in FIG. 3 by means 18. Now the first result vector is rounded with any rounding function mapping the set of real numbers into the set of integers, as illustrated in FIG. 3 by means 20. At the output of the means 20, a rounded first result vector is obtained. The rounded first result vector is now supplied to means 22 for multiplying it by the central, i.e. second, lifting matrix to obtain a second result vector which is again rounded in means 24 to obtain a rounded second result vector. The rounded second result vector is now supplied to means 26 for multiplying it by the lifting matrix shown on the left in the above equation, i.e. the first one, to obtain a third result vector which is finally rounded by means of means 28 to finally obtain integer windowed samples at the output 12 which, if a spectral representation of the same is desired, now have to be processed by means 14 to obtain integer spectral values at a spectral output 30.
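The sequence of lifting multiplications and roundings can be sketched as follows. This is a minimal illustration under stated assumptions: the inputs are integers, only the non-integer lifting term is rounded (which leaves the already-integer component untouched), and Python's built-in `round` stands in for "any rounding function". The function names and the test values are hypothetical.

```python
import math


def rounded_rotation(x, y, alpha, r=round):
    """Integer approximation of a Givens rotation by three rounded lifting steps."""
    c = (math.cos(alpha) - 1.0) / math.sin(alpha)
    s = math.sin(alpha)
    x = x + r(c * y)  # third lifting matrix (means 18), rounding (means 20)
    y = y + r(s * x)  # second lifting matrix (means 22), rounding (means 24)
    x = x + r(c * y)  # first lifting matrix (means 26), rounding (means 28)
    return x, y


def exact_rotation(x, y, alpha):
    """Floating-point Givens rotation, used only for comparison."""
    return (x * math.cos(alpha) - y * math.sin(alpha),
            x * math.sin(alpha) + y * math.cos(alpha))


rounded = rounded_rotation(1000, -250, 0.3)
exact = exact_rotation(1000, -250, 0.3)
print(rounded, exact)
```

The integer result stays close to the exact rotation because each of the three roundings contributes an error of at most one half, scaled by lifting coefficients of small magnitude.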
Preferably, the means 14 is implemented as integer DCT.
The discrete cosine transform according to type 4 (DCT-IV) with a length N is given by the following equation:
$$
X_t(m) = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} x(k) \cos\left( \frac{\pi}{4N} (2k+1)(2m+1) \right)
\tag{2}
$$
The coefficients of the DCT-IV form an orthonormal N×N matrix. Each orthogonal N×N matrix may be decomposed into N(N−1)/2 Givens rotations, as discussed in the specialist publication P. P. Vaidyanathan, “Multirate Systems And Filter Banks”, Prentice Hall, Englewood Cliffs, 1993. It is to be noted that other decompositions also exist.
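The orthonormality of the DCT-IV coefficient matrix can be checked for a small N. The sketch below assumes the scale factor √(2/N) in front of the cosine terms, which is the normalization under which the matrix is orthonormal; the function names are hypothetical.

```python
import math


def dct_iv_matrix(n):
    """N x N DCT-IV matrix with entries sqrt(2/N) * cos(pi/(4N) (2k+1)(2m+1))."""
    scale = math.sqrt(2.0 / n)
    return [[scale * math.cos(math.pi / (4 * n) * (2 * k + 1) * (2 * m + 1))
             for k in range(n)]
            for m in range(n)]


def gram_error(c):
    """Largest deviation of C * C^T from the identity matrix."""
    n = len(c)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(c[i][k] * c[j][k] for k in range(n))
            worst = max(worst, abs(dot - (1.0 if i == j else 0.0)))
    return worst


N = 8
C = dct_iv_matrix(N)
orthonormality_error = gram_error(C)
givens_count = N * (N - 1) // 2  # number of rotations in the decomposition
print(orthonormality_error, givens_count)
```

For N = 8 the decomposition mentioned above would use 28 Givens rotations, which illustrates how quickly the number of rounded lifting steps grows with the transform length.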
With respect to the classifications of the various DCT algorithms, see H. S. Malvar, “Signal Processing With Lapped Transforms”, Artech House, 1992. Generally, the DCT algorithms differ in the kind of their basis functions. While the DCT-IV preferred herein includes non-symmetric basis functions, i.e. a cosine quarter wave, a cosine ¾ wave, a cosine 5/4 wave, a cosine 7/4 wave, etc., the discrete cosine transform of, for example, type II (DCT-II) has axisymmetric and point symmetric basis functions. The 0th basis function has a DC component, the first basis function is half a cosine wave, the second basis function is a whole cosine wave, etc. Due to the fact that the DCT-II gives special emphasis to the DC component, it is used in video coding, but not in audio coding, because the DC component is not relevant in audio coding in contrast to video coding.
In the following, there will be a discussion how the rotation angle α of the Givens rotation depends on the window function.
An MDCT with a window length of 2N may be reduced to a discrete cosine transform of the type IV with a length N. This is achieved by explicitly performing the TDAC operation in the time domain and then applying the DCT-IV. In the case of a 50% overlap, the left half of the window for a block t overlaps with the right half of the preceding block, i.e. block t−1. The overlapping part of two consecutive blocks t−1 and t is preprocessed in the time domain, i.e. prior to the transform, as follows, i.e. it is processed between the input 10 and the output 12 of FIG. 3:
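$$
\begin{pmatrix} \tilde{x}_t(k) \\ \tilde{x}_{t-1}(N-1-k) \end{pmatrix}
=
\begin{pmatrix}
w\left(\frac{N}{2}+k\right) & -w\left(\frac{N}{2}-1-k\right) \\
w\left(\frac{N}{2}-1-k\right) & w\left(\frac{N}{2}+k\right)
\end{pmatrix}
\begin{pmatrix} x_t\left(\frac{N}{2}+k\right) \\ x_t\left(\frac{N}{2}-1-k\right) \end{pmatrix}
\tag{3}
$$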
The values marked with the tilde are the values at the output 12 of FIG. 3, while the x values not marked with a tilde in the above equation are the values at the input 10 and/or following the means 16 for selecting. The running index k runs from 0 to N/2−1, while w represents the window function.
From the TDAC condition for the window function w, the following applies:
$$
w\left(\frac{N}{2}+k\right)^2 + w\left(\frac{N}{2}-1-k\right)^2 = 1
\tag{4}
$$
For certain angles αk, k=0, . . . , N/2−1, this preprocessing in the time domain may be written as Givens rotation, as discussed.
The angle α of the Givens rotation depends on the window function w as follows:
$$
\alpha = \arctan\left[ \frac{w(N/2-1-k)}{w(N/2+k)} \right]
\tag{5}
$$
It is to be noted that any window functions w may be employed as long as they fulfill this TDAC condition.
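The relation between the window and the rotation angle can be made concrete with a short numerical check. The sketch assumes a sine window with a half-sample offset (an assumption, any window fulfilling the TDAC condition would do) and confirms that cos α and sin α reproduce the two window weights that appear in the time-domain preprocessing matrix. The function names are hypothetical.

```python
import math


def sine_window(two_n):
    """Sine window of length 2N; the half-sample offset is an assumption."""
    return [math.sin(math.pi * (i + 0.5) / two_n) for i in range(two_n)]


def rotation_angles(w):
    """Angles alpha_k = arctan[w(N/2-1-k) / w(N/2+k)] for k = 0 .. N/2-1."""
    n = len(w) // 2
    return [math.atan(w[n // 2 - 1 - k] / w[n // 2 + k]) for k in range(n // 2)]


N = 8
w = sine_window(2 * N)
angles = rotation_angles(w)

# With the TDAC condition fulfilled, the rotation entries are the window weights.
max_deviation = max(
    max(abs(math.cos(a) - w[N // 2 + k]), abs(math.sin(a) - w[N // 2 - 1 - k]))
    for k, a in enumerate(angles)
)
print(max_deviation)  # stays at floating-point rounding level
```

Because the two weights have unit squared sum, they can always be read as the cosine and sine of one angle, which is what allows the preprocessing to be written as a Givens rotation.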
In the following, a cascaded coder and decoder are described with respect to FIG. 4. The time-discrete samples x(0) to x(2N−1), which are “windowed” together by a window, are first selected by the means 16 of FIG. 3 such that the sample x(0) and the sample x(N−1), i.e. a sample from the first quarter of the window and a sample from the second quarter of the window, are selected to form the vector at the output of the means 16. The crossing arrows schematically represent the lifting multiplications and subsequent roundings of the means 18, 20 and 22, 24 and 26, 28, respectively, to obtain the integer windowed samples at the input of the DCT-IV blocks.
When the first vector has been processed as described above, a second vector is further selected from the samples x(N/2−1) and x(N/2), i.e. again a sample from the first quarter of the window and a sample from the second quarter of the window, and is again processed by the algorithm described in FIG. 3. Analogously, all other sample pairs from the first and second quarters of the window are processed. The same processing is performed for the third and fourth quarters of the first window. Now there are 2N windowed integer samples at the output 12 which are now supplied to a DCT-IV transform as illustrated in FIG. 4. In particular, the integer windowed samples of the second and third quarters are supplied to a DCT. The windowed integer samples of the first quarter of the window are processed into a preceding DCT-IV together with the windowed integer samples of the fourth quarter of the preceding window. Analogously, in FIG. 4, the fourth quarter of the windowed integer samples is supplied to a DCT-IV transform together with the first quarter of the next window. The central integer DCT-IV transform 32 shown in FIG. 4 now provides N integer spectral values y(0) to y(N−1). These integer spectral values may now, for example, be simply entropy-coded without an interposed quantization being necessary, because the windowing and transform yield integer output values.
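The pairwise processing of the first and second quarters of a window can be sketched as follows. This is an illustration under several assumptions that are not taken from the text: a sine window with half-sample offset, Python's built-in `round` as rounding function, rotated values written back in place at the positions of their inputs, and hypothetical function names. The subsequent DCT-IV and the routing of the quarters to the neighbouring transforms are left out.

```python
import math


def sine_window(two_n):
    """Sine window of length 2N; the half-sample offset is an assumption."""
    return [math.sin(math.pi * (i + 0.5) / two_n) for i in range(two_n)]


def lifted_rotation(x, y, alpha, sign):
    """Three rounded lifting steps; sign=+1 rotates, sign=-1 undoes the rotation."""
    c = (math.cos(alpha) - 1.0) / math.sin(alpha)
    s = math.sin(alpha)
    x += sign * round(c * y)
    y += sign * round(s * x)
    x += sign * round(c * y)
    return x, y


def window_half(samples, w, sign):
    """Process the pairs (N/2+k, N/2-1-k) of the first and second quarters."""
    n = len(samples)  # N samples: first half of a window of length 2N
    out = list(samples)
    for k in range(n // 2):
        hi, lo = n // 2 + k, n // 2 - 1 - k
        alpha = math.atan(w[lo] / w[hi])
        out[hi], out[lo] = lifted_rotation(out[hi], out[lo], alpha, sign)
    return out


N = 8
w = sine_window(2 * N)
pcm = [12, -7, 30000, -32768, 5, 0, 1234, -999]  # toy 16-bit samples
windowed = window_half(pcm, w, +1)
restored = window_half(windowed, w, -1)
print(windowed, restored == pcm)
```

The pair for k = N/2−1 consists of x(0) and x(N−1) and the pair for k = 0 consists of x(N/2−1) and x(N/2), matching the two vectors named in the text.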
In the right half of FIG. 4, a decoder is illustrated. The decoder consisting of backward transform and “inverse windowing” operates inversely to the coder. It is known that an inverse DCT-IV may be used for the backward transform of a DCT-IV, as illustrated in FIG. 4. The output values of the decoder DCT-IV 34 are now inversely processed with the corresponding values of the preceding transform and/or the following transform, as illustrated in FIG. 4, in order to generate again time-discrete audio samples x(0) to x(2N−1) from the integer windowed samples at the output of the means 34 and/or the preceding and following transform.
The operation on the output side takes place by an inverse Givens rotation, i.e. such that the blocks 26, 28 and 22, 24 and 18, 20, respectively, are traversed in the opposite direction. This will be illustrated in more detail with respect to the second lifting matrix of equation 1. When (in the coder) the second result vector is formed by multiplication of the rounded first result vector by the second lifting matrix (means 22), the following expression results:
$$
(x, y) \mapsto (x, y + x \sin\alpha)
\tag{6}
$$
The values x, y on the right-hand side of equation 6 are integers. This, however, does not apply to the value x sin α. Here, the rounding function r must be introduced, as illustrated in the following equation:
$$
(x, y) \mapsto (x, y + r(x \sin\alpha))
\tag{7}
$$
This operation is performed by the means 24.
The inverse mapping (in the decoder) is defined as follows:
$$
(x', y') \mapsto (x', y' - r(x' \sin\alpha))
\tag{8}
$$
Due to the minus sign in front of the rounding operation, it becomes apparent that the integer approximation of the lifting step may be reversed without introducing an error. The application of this approximation to each of the three lifting steps leads to an integer approximation of the Givens rotation. The rounded rotation (in the coder) may be reversed (in the decoder) without introducing an error by traversing the inverse rounded lifting steps in reverse order, i.e. if in decoding the algorithm of FIG. 3 is performed from bottom to top.
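The error-free reversal of a single rounded lifting step can be demonstrated directly. The sketch below is an illustration with hypothetical function names; it tries several rounding functions from Python's standard library to show that the reversal does not depend on which rounding function is chosen, as long as coder and decoder use the same one.

```python
import math


def lift(x, y, alpha, r):
    """Rounded lifting step of the coder: (x, y) -> (x, y + r(x sin(alpha)))."""
    return x, y + r(x * math.sin(alpha))


def unlift(x, y, alpha, r):
    """Inverse step of the decoder: (x', y') -> (x', y' - r(x' sin(alpha)))."""
    return x, y - r(x * math.sin(alpha))


alpha = 0.3
roundings = [round, math.floor, math.ceil, math.trunc]
all_reversible = all(
    unlift(*lift(x, y, alpha, r), alpha, r) == (x, y)
    for r in roundings
    for x in range(-50, 51)
    for y in (-32768, -1, 0, 7, 32767)
)
print(all_reversible)
```

The step is reversible because x passes through unchanged, so the decoder recomputes exactly the same rounded term that the coder added and subtracts it again.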
If the rounding function r is point symmetric, the inverse rounded rotation is identical to the rounded rotation with the angle −α and is expressed as follows:
$$
\begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}
\tag{9}
$$
The lifting matrices for the decoder, i.e. for the inverse Givens rotation, in this case result directly from equation (1) by merely replacing the expression “sin α” by the expression “−sin α”.
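The equivalence between the inverse rounded rotation and the rounded rotation with the angle −α can be tried out numerically. The sketch below uses Python's built-in `round`, which rounds halves to the nearest even integer and is therefore point symmetric, i.e. round(−z) = −round(z). The function names are hypothetical, and the check relies on the assumption that the math library evaluates cos(−α) = cos α and sin(−α) = −sin α exactly, which is the usual behaviour.

```python
import math


def rounded_rotation(x, y, alpha, r=round):
    """Rounded Givens rotation: three lifting steps, each followed by rounding."""
    c = (math.cos(alpha) - 1.0) / math.sin(alpha)
    s = math.sin(alpha)
    x += r(c * y)
    y += r(s * x)
    x += r(c * y)
    return x, y


def inverse_rounded_rotation(x, y, alpha, r=round):
    """Undo the rounded rotation by traversing the lifting steps in reverse order."""
    c = (math.cos(alpha) - 1.0) / math.sin(alpha)
    s = math.sin(alpha)
    x -= r(c * y)
    y -= r(s * x)
    x -= r(c * y)
    return x, y


alpha = 0.3
pairs = [(x, y) for x in range(-40, 41, 3) for y in range(-40, 41, 7)]

# Perfect reconstruction: forward rotation followed by the inverse.
reconstructs = all(
    inverse_rounded_rotation(*rounded_rotation(x, y, alpha), alpha) == (x, y)
    for x, y in pairs
)

# With a point-symmetric rounding function, the inverse is the rotation by -alpha.
same_as_negative_angle = all(
    inverse_rounded_rotation(x, y, alpha) == rounded_rotation(x, y, -alpha)
    for x, y in pairs
)
print(reconstructs, same_as_negative_angle)
```

With a rounding function that is not point symmetric, such as `math.floor`, the reconstruction by reverse traversal still holds, but the shortcut of reusing the forward routine with −α is no longer guaranteed to give the same integers.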
In the following, the decomposition of a common MDCT with overlapping windows 40 to 46 is illustrated again with respect to FIG. 5. The windows 40 to 46 each have a 50% overlap. First, Givens rotations are performed per window within the first and second quarters of a window and/or within the third and fourth quarters of a window, as illustrated schematically by arrows 48. The rotated values, i.e. the windowed integer samples, are then supplied to an N-to-N DCT such that the second and third quarters of a window, and the fourth quarter of a window together with the first quarter of the subsequent window, respectively, are always converted to a spectral representation together by means of a DCT-IV algorithm.
The common Givens rotations are therefore decomposed into lifting matrices which are executed sequentially. After each lifting matrix multiplication, a rounding step is inserted so that the floating point numbers are rounded immediately after being generated. Consequently, prior to each multiplication of a result vector with a lifting matrix, the result vector contains only integers.
The output values thus always remain integer, wherein it is preferred to also use integer input values. This does not represent a limitation, because PCM samples, for example as they are stored on a CD, are integer numerical values whose value range varies depending on bit width, i.e. depending on whether the time-discrete digital input values are 16-bit values or 24-bit values. Despite the rounding, the whole process is invertible, as discussed above, by performing the inverse rotations in reverse order. There is thus an integer approximation of the MDCT with perfect reconstruction, i.e. a lossless transform.
The shown transform provides integer output values instead of floating point values. It provides a perfect reconstruction so that no error is introduced when a forward and then a backward transform are performed. According to a preferred embodiment of the present invention, the transform is a substitution for the modified discrete cosine transform. Other transform methods, however, may also be performed with integers as long as a decomposition into rotations and a decomposition of the rotations into lifting steps is possible.
The integer MDCT has most of the favorable properties of the MDCT. It has an overlapping structure, whereby a better frequency selectivity is obtained than with non-overlapping block transforms. Due to the TDAC function which is already taken into account in windowing prior to the transform, a critical sampling is maintained so that the total number of spectral values representing an audio signal is equal to the total number of input samples.
Compared to a normal MDCT providing floating point values, the described preferred integer transform increases the noise only in the spectral range in which there is little signal level, while this noise increase does not become noticeable at significant signal levels. In return, the integer processing lends itself to an efficient hardware implementation, because only multiplication steps are used which may readily be decomposed into shift/add steps which may be hardware-implemented in a simple and quick way. Of course, a software implementation is also possible.
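The decomposition of a multiplication into shift/add steps may be sketched as follows. The sketch assumes that the lifting coefficient is quantized to a fixed-point constant. The function name, the 8-bit fractional precision and the constant 181/256 (approximately sin(π/4)) are illustrative assumptions, not values taken from the specification.

```python
def shift_add_multiply(x, coeff_bits, frac_bits):
    """Multiply integer x by coeff_bits / 2**frac_bits using only shifts and adds.

    coeff_bits must be a non-negative integer and frac_bits >= 1.
    The result is rounded to the nearest integer (halves round upward).
    """
    acc = 0
    bit = 0
    c = coeff_bits
    while c:
        if c & 1:
            acc += x << bit          # add x shifted by the position of each set bit
        c >>= 1
        bit += 1
    # add half of the scaling factor, then shift right to round
    return (acc + (1 << (frac_bits - 1))) >> frac_bits

if __name__ == "__main__":
    # 181 = 128 + 32 + 16 + 4 + 1, so five shifted additions replace the multiplier
    print(shift_add_multiply(100, 181, 8))
```

As long as the coder and the decoder use the same quantized coefficient and the same rounding, a lifting step built on such a product remains exactly reversible.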
The integer transform provides a good spectral representation of the audio signal and yet remains in the area of integers. When it is applied to tonal parts of an audio signal, this results in good energy concentration. With this, an efficient lossless coding scheme may be built up by simply cascading the windowing/transform illustrated in FIG. 3 with an entropy coder. In particular, stacked coding using escape values, as it is employed in MPEG AAC, is advantageous. It is preferred to scale down all values by a certain power of two until they fit in a desired code table, and then additionally code the omitted least significant bits. In comparison with the alternative of the use of larger code tables, the described alternative is more favorable with regard to the storage consumption for storing the code tables. An almost lossless coder could also be obtained by simply omitting certain ones of the least significant bits.
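The scaling by a power of two with separate coding of the omitted least significant bits may be sketched as follows. The sketch assumes a block-wise common shift and a sign/magnitude representation. The function names and the parameter table_max (the largest magnitude covered by the code table) are hypothetical.

```python
def split_block(values, table_max):
    """Scale a block of integers down by a common power of two.

    Returns the shift, the coarse magnitudes (all <= table_max), the omitted
    least significant bits, and the signs.
    """
    peak = max((abs(v) for v in values), default=0)
    shift = 0
    while (peak >> shift) > table_max:
        shift += 1
    mask = (1 << shift) - 1
    coarse = [abs(v) >> shift for v in values]   # to be entropy-coded via the table
    lsbs = [abs(v) & mask for v in values]       # coded additionally, shift bits each
    signs = [v < 0 for v in values]
    return shift, coarse, lsbs, signs

def merge_block(shift, coarse, lsbs, signs):
    """Exact inverse of split_block."""
    return [(-1 if neg else 1) * ((c << shift) | l)
            for c, l, neg in zip(coarse, lsbs, signs)]

if __name__ == "__main__":
    block = [0, 5, -9, 130, -1]
    parts = split_block(block, table_max=15)
    print(parts, merge_block(*parts))
```

Passing zeros instead of the stored least significant bits to merge_block would correspond to the almost lossless variant mentioned above.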
In particular for tonal signals, entropy coding of the integer spectral values allows a high coding gain. For transient parts of the signal, the coding gain is low, namely due to the flat spectrum of transient signals, i.e. due to a small number of spectral values equal to or almost 0. As described in J. Herre, J. D. Johnston: “Enhancing the Performance of Perceptual Audio Coders by Using Temporal Noise Shaping (TNS)”, 101st AES Convention, Los Angeles, 1996, preprint 4384, this flatness may nevertheless be exploited by using a linear prediction in the frequency domain. One alternative is a prediction with open loop; another alternative is a prediction with closed loop. The first alternative, i.e. the predictor with open loop, is called TNS. The quantization after the prediction leads to an adaptation of the resulting quantization noise to the temporal structure of the audio signal and thus prevents pre-echoes in psychoacoustic audio coders. For lossless audio coding, the second alternative, i.e. the predictor with closed loop, is more suitable, since the prediction with closed loop allows accurate reconstruction of the input signal. When this technique is applied to a generated spectrum, a rounding step has to be performed after each step of the prediction filter in order to stay in the area of the integers. By using the inverse filter and the same rounding function, the original spectrum may accurately be reproduced.
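A prediction across the spectral values with rounding after each filter step may be sketched as follows. The sketch assumes a simple FIR predictor over the preceding spectral values. The function names and the predictor coefficients are hypothetical, and Python's built-in round() stands in for the rounding function.

```python
def prediction_residual(spectrum, coeffs):
    """Coder side: replace each integer spectral value by its prediction error.

    The prediction is rounded so that the residual stays in the integers.
    """
    residual = []
    for k, x in enumerate(spectrum):
        pred = sum(a * spectrum[k - 1 - i]
                   for i, a in enumerate(coeffs) if k - 1 - i >= 0)
        residual.append(x - round(pred))
    return residual

def reconstruct_spectrum(residual, coeffs):
    """Decoder side: inverse filter with the same rounding function."""
    spectrum = []
    for k, e in enumerate(residual):
        # the prediction uses values that have already been reconstructed exactly
        pred = sum(a * spectrum[k - 1 - i]
                   for i, a in enumerate(coeffs) if k - 1 - i >= 0)
        spectrum.append(e + round(pred))
    return spectrum

if __name__ == "__main__":
    spec = [120, 118, 115, 117, 121, 119, 116]
    coeffs = [0.9, -0.2]              # illustrative values only
    res = prediction_residual(spec, coeffs)
    print(res, reconstruct_spectrum(res, coeffs) == spec)
```

Because the decoder forms each prediction from values it has already reconstructed exactly, and rounds in the same way as the coder, the rounding offsets cancel and the original spectrum is recovered.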
In order to make use of the redundancy between two channels for data reduction, center-side coding (commonly known as mid/side or M/S coding) may also be employed in a lossless manner if a rounded rotation with an angle of π/4 is used. In comparison to the alternative of calculating the sum and difference of the left and the right channel of a stereo signal, the rounded rotation has the advantage of energy conservation. The use of so-called joint stereo coding techniques may be switched on or off for each band, as is also done in the standard MPEG AAC. Further rotation angles may also be considered in order to reduce redundancy between two channels more flexibly.
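The energy conservation property may be illustrated by the following short calculation for the unrounded case. It assumes that the forward rotation is the transpose of the matrix in equation (9) and denotes the left and right channel samples by L and R; these symbols are introduced here for illustration only. With rounding, the relation holds only approximately, up to the rounding error.

```latex
\[
\begin{aligned}
y_1 &= \tfrac{1}{\sqrt{2}}\,(L - R), \qquad y_2 = \tfrac{1}{\sqrt{2}}\,(L + R) \\
y_1^2 + y_2^2 &= \tfrac{1}{2}\left[(L - R)^2 + (L + R)^2\right] = L^2 + R^2 \\
(L + R)^2 + (L - R)^2 &= 2\,\left(L^2 + R^2\right)
\end{aligned}
\]
```

The rotation thus leaves the total energy of the two channels unchanged, whereas the plain sum and difference double it.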
In particular, the transform concept illustrated with respect to FIG. 3 provides an integer implementation of the MDCT, i.e. an IntMDCT, which operates losslessly with respect to forward transform and subsequent backward transform. Owing to the rounding steps 20, 24, 28 and the corresponding rounding steps in the integer DCT (block 14 in FIG. 3), integer processing is furthermore always possible, i.e. processing with more coarsely quantized values than those generated, for example, by floating point multiplication with a lifting matrix (blocks 18, 22, 26 of FIG. 3).
The result is that the whole IntMDCT may be performed in a computationally efficient manner.
The losslessness of this IntMDCT or, generally speaking, the losslessness of all coding algorithms referred to as lossless is related to the fact that the signal, when it is coded to obtain a coded signal and afterwards decoded again to obtain a coded/decoded signal, “looks” exactly like the original signal. In other words, the original signal is identical to the coded/decoded original signal. This is in obvious contrast to so-called lossy coding, in which, as in the case of audio coders operating on a psychoacoustic basis, data are irretrievably lost by the coding process and particularly by the quantizing process controlled by a psychoacoustic model.
Of course, rounding errors are still introduced. As shown with respect to FIG. 3 in the blocks 20, 24, 28, rounding steps are performed which introduce a rounding error that is only “eliminated” in the decoder when the inverse operations are performed. Lossless coding/decoding concepts thus differ essentially from lossy coding/decoding concepts in that, in lossless coding/decoding concepts, the rounding error is introduced in such a way that it may be eliminated again, while this is not the case in lossy coding/decoding concepts.
However, if one considers the coded signal, i.e., in the example of transform coders, the spectrum of a block of temporal samples, the rounding in the forward transform and/or generally the quantization of such a signal results in an error being introduced in the signal. Thus, a rounding error is superimposed on the ideal error-free spectrum of the signal, the error typically being, for example in the case of FIG. 3, white noise equally including all frequency components of the considered spectral range. This white noise superimposed on the ideal spectrum thus represents the rounding error which occurs, for example, by the rounding in the blocks 20, 24, 28 during windowing, i.e. the pre-processing of the signal prior to the actual DCT in block 14. It is particularly to be noted that, for a losslessness requirement, the whole rounding error must necessarily be coded, i.e. transmitted to the decoder, because the decoder requires the whole rounding error introduced in the coder to achieve a correct lossless reconstruction.
The rounding error may not be problematic when nothing is “done” with the spectral representation, i.e. when the spectral representation is only stored, transmitted and decoded again by a correctly matching inverse decoder. In that case, the losslessness criterion will always be met, irrespective of how much rounding error has been introduced into the spectrum. If, however, something is done with the spectral representation, i.e. with the ideal spectral representation of an original signal containing a rounding error, for example if scalability layers are generated, etc., all of these operations work better the smaller the rounding error is.
Thus, lossless coding/decoding is subject to two requirements: on the one hand, a signal should be losslessly reconstructable by special decoders; on the other hand, the signal should have a minimal rounding error in its spectral representation, to preserve flexibility in that non-ideal lossless decoders may also be fed with the spectral representation or that scaling layers, etc. may be generated.
As discussed above, the rounding error is expressed as white noise across the entire considered spectrum. On the other hand, particularly in high quality applications, which are especially interesting for the lossless case, i.e. in audio applications with very high sampling frequencies, such as 96 kHz, the audio signal only has reasonable signal content in a certain spectral range, which typically only reaches up to, at the most, 20 kHz. Typically, the range in which most signal energy of the audio signal is concentrated will be the range between 0 and 10 kHz, while the signal energy will considerably decrease in the range above 10 kHz. However, this does not matter to the white noise introduced by rounding. It superimposes itself across the entire considered spectral range, independently of the signal energy. The result is that, in certain spectral ranges, typically the high spectral ranges where there is no or only very little audio signal energy, there will be only the rounding error. At the same time, particularly due to its non-deterministic nature, the rounding error is also difficult to code, i.e. it is only codeable with relatively high bit requirements. In some lossless applications, the bit requirements do not play the decisive role. However, for lossless coding applications to become more and more widespread, it is very important to operate very bit-efficiently here as well, in order to combine the advantage of the absent quality reduction inherent in lossless applications with a corresponding bit efficiency, as it is known from lossy coding concepts.
Although a rounding error is thus unproblematic in a lossless context in that it may be eliminated in the decoding, it is still of considerable significance for allowing the lossless decoding and/or reconstruction to be performed in the first place. On the other hand, as already discussed, the rounding error is responsible for the spectral representation becoming defective, i.e. being distorted as compared to an ideal spectral representation of the unrounded signal. For special cases of application, in which the spectral representation, i.e. the coded signal, is actually important, i.e. when, for example, various scaling layers are generated from the coded signal, it is still desirable to obtain a coded representation with a rounding error as small as possible from which, however, no rounding error has been eliminated that is required for a reconstruction.