1. Field of the Invention
The present invention relates to an information coding apparatus, and more particularly, to an information coding apparatus that quantizes difference data between an input audio signal and a predicted audio signal.
2. Description of the Related Art
In the related art, adaptive differential pulse code modulation (ADPCM) is one of the known time-domain waveform coding methods. Other waveform coding methods, such as adaptive predictive coding (APC), are also known. In many cases, coding methods such as ADPCM and APC are used in combination with a noise shaping technique. The noise shaping technique as used herein refers to a technique that modulates the frequency characteristic of the quantization noise after decoding by feeding back quantization errors so as to obtain auditory masking effects. An example of an ADPCM method combined with the noise shaping technique is briefly described below.
FIGS. 13A and 13B are block diagrams showing one exemplary configuration of an audio transmission system based on ADPCM methods according to the related art. Specifically, FIGS. 13A and 13B, respectively, show an audio coding apparatus that codes an input audio signal X(z) to output a quantized signal Xq(z), and an audio decoding apparatus that decodes the quantized signal Xq(z).
FIG. 13A is a block diagram showing one exemplary configuration of an audio coding apparatus 700 based on ADPCM methods according to the related art. The audio coding apparatus 700 is configured to receive an input audio signal X(z) of each frame from a signal line 701 and output a quantized signal Xq(z) from a signal line 709. The frame as used herein refers to a predetermined number of sample values of a sampled discrete-time signal.
The audio coding apparatus 700 includes a predictive filter P(z) 710, subtractors 720 and 730, a quantizer 740, a subtractor 750, and a feedback calculator R(z) 760.
The predictive filter P(z) 710 is configured to predict the present audio signal based on the past audio signal in the input audio signal X(z) and predictive filter coefficients for generating predictive signals. The predictive filter P(z) 710 predicts the present sample values by performing a product-sum operation on the past sample values and the predictive filter coefficients. That is to say, the predictive filter P(z) 710 generates the predictive signals based on Equation 1.
$$P(z) = \sum_{i=1}^{\mathit{Np}} p_i \, z^{-i} \qquad \text{(Equation 1)}$$
In this equation, P(z) is a predictive filter based on an all-pole model of the input audio signal X(z), and p_i is the i-th predictive filter coefficient for generating the predictive signals. The predictive filter coefficients p_i can be calculated by, for example, linear predictive coding (LPC) analysis of the input audio signal X(z). The LPC analysis as used herein is a method of estimating the frequency characteristic of the input audio signal by using the correlation between neighboring audio samples. That is, the LPC analysis is a method of estimating, from the input audio signal, the coefficients of a filter approximating the characteristics of a vocal tract in a voice generation model. Np is the order of the predictive filter P(z).
The predictive filter P(z) 710 outputs the generated predictive signals to the subtractor 720.
The subtractor 720 is configured to calculate a difference between the present audio signal supplied from the signal line 701 and the predictive signal supplied from the predictive filter P(z) 710. The subtractor 720 generates a predictive residual signal by subtracting the predictive signal supplied from the predictive filter P(z) 710 from the present audio signal supplied from the signal line 701. The subtractor 720 outputs the generated predictive residual signal to the subtractor 730.
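The product-sum of Equation 1 and the subtraction performed by the subtractor 720 can be sketched in the time domain as follows. This is only an illustrative sketch: the function names `predict` and `prediction_residual` are hypothetical, and treating samples before the start of the signal as zero is an assumption not stated in the description.

```python
def predict(past, coeffs):
    """Predict the present sample as sum_{i=1..Np} p_i * x[n-i].

    past   -- earlier samples, most recent last (past[-1] is x[n-1])
    coeffs -- predictive filter coefficients [p_1, ..., p_Np]
    Samples before the start of the signal are treated as zero (assumption).
    """
    total = 0.0
    for i, p in enumerate(coeffs, start=1):
        if i <= len(past):
            total += p * past[-i]
    return total


def prediction_residual(x, coeffs):
    """Predictive residual: each sample minus its prediction (subtractor 720)."""
    return [x[n] - predict(x[:n], coeffs) for n in range(len(x))]


# Hypothetical coefficients chosen only for easy arithmetic: they extrapolate
# a straight line, so a ramp leaves a residual only at its first sample.
print(prediction_residual([1.0, 2.0, 3.0, 4.0], [2.0, -1.0]))
```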
The subtractor 730 is configured to feed back the output of the feedback calculator R(z) 760 to the predictive residual signal supplied from the subtractor 720. The subtractor 730 calculates a difference between the predictive residual signal output from the subtractor 720 and the output of the feedback calculator R(z) 760. The subtractor 730 generates a modified predictive residual signal by subtracting the output of the feedback calculator R(z) 760 from the predictive residual signal output from the subtractor 720. The subtractor 730 outputs the generated modified predictive residual signal to the quantizer 740 and the subtractor 750.
The quantizer 740 is configured to quantize the modified predictive residual signal generated by the subtractor 730 into a predetermined number of bits. The quantizer 740 outputs the quantized signal Xq(z) to the signal line 709 and the subtractor 750.
The subtractor 750 is configured to calculate a difference between the modified predictive residual signal generated by the subtractor 730 and the quantized signal Xq(z) quantized by the quantizer 740. The subtractor 750 generates a quantization error signal E(z) by subtracting the modified predictive residual signal generated by the subtractor 730 from the quantized signal Xq(z) quantized by the quantizer 740. The subtractor 750 outputs the generated quantization error signal E(z) to the feedback calculator R(z) 760.
The feedback calculator R(z) 760 is a noise shaping filter that generates a feedback signal Es(z) for controlling the frequency characteristic of the quantization noise after decoding based on the quantization error signal E(z) from the subtractor 750. The feedback calculator R(z) 760 is configured based on the predictive filter P(z) 710. That is to say, the feedback calculator R(z) 760 performs arithmetic processing based on Equation 2 to generate the processing results as the feedback signal Es(z).
$$R(z) = P(\lambda^{-1} z) = \sum_{i=1}^{\mathit{Np}} \lambda^{i} \, p_i \, z^{-i} \qquad \text{(Equation 2)}$$
In this equation, λ is an adjustment parameter for adjusting the peak level in the frequency characteristic of the quantization noise after decoding.
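In terms of coefficients, Equation 2 says that the i-th tap of R(z) is the i-th predictive filter coefficient scaled by λ raised to the power i. A minimal sketch, with the hypothetical helper name `shaping_coefficients`:

```python
def shaping_coefficients(coeffs, lam):
    """Return [lam**1 * p_1, ..., lam**Np * p_Np], the taps of R(z).

    coeffs -- predictive filter coefficients [p_1, ..., p_Np]
    lam    -- adjustment parameter (lambda in Equation 2)
    """
    return [(lam ** i) * p for i, p in enumerate(coeffs, start=1)]


# Hypothetical second-order example.
print(shaping_coefficients([1.0, -0.5], 0.5))
```

With λ = 1 the taps equal those of P(z), and with λ = 0 all taps vanish, so R(z) = 0.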
The feedback calculator R(z) 760 supplies the generated feedback signal Es(z) to the subtractor 730.
As described above, the feedback calculator R(z) 760 of the audio coding apparatus 700 is configured based on the predictive filter P(z) 710.
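The signal flow through the subtractors 720, 730, and 750, the quantizer 740, and the feedback calculator R(z) 760 can be sketched sample by sample as follows. The sketch makes several assumptions that are not part of the description: the function name `encode` is hypothetical, the quantizer is modeled as a uniform rounding quantizer with step size `step` and no clipping, and all filter states start at zero.

```python
def encode(x, coeffs, lam, step):
    """Return (xq, err): quantized signal and quantization error per sample.

    x      -- input samples
    coeffs -- predictive filter coefficients [p_1, ..., p_Np]
    lam    -- adjustment parameter of the feedback calculator
    step   -- step size of the assumed uniform quantizer
    """
    r = [(lam ** i) * p for i, p in enumerate(coeffs, start=1)]  # taps of R(z)
    xq, err = [], []
    for n in range(len(x)):
        # Predictive filter 710: prediction from past input samples.
        pred = sum(p * x[n - i] for i, p in enumerate(coeffs, start=1) if n - i >= 0)
        d = x[n] - pred                      # subtractor 720: predictive residual
        # Feedback calculator 760: filter past quantization errors with R(z).
        es = sum(c * err[n - i] for i, c in enumerate(r, start=1) if n - i >= 0)
        u = d - es                           # subtractor 730: modified residual
        q = step * round(u / step)           # quantizer 740 (assumed uniform)
        xq.append(q)
        err.append(q - u)                    # subtractor 750: E = Xq - modified residual
    return xq, err


xq, err = encode([0.3, 0.9, 1.2, 0.7], [0.5], 0.5, 0.25)
print(xq)
print(err)
```

Because the quantizer in this sketch never clips, every quantization error stays within half a step; a quantizer with a fixed number of bits would additionally saturate for large inputs.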
FIG. 13B is a block diagram showing one exemplary configuration of an audio decoding apparatus 800 that decodes the quantized signal Xq(z) output from the audio coding apparatus 700. The audio decoding apparatus 800 includes an adder 810 and a predictive filter P(z) 820.
The adder 810 is configured to add the quantized signal Xq(z) supplied via a signal line 801 and the output of the predictive filter P(z) 820. The adder 810 generates a decoded signal Y(z) by adding the quantized signal Xq(z) and the output of the predictive filter P(z) 820. The adder 810 outputs the generated decoded signal Y(z) to a signal line 809 and the predictive filter P(z) 820.
The predictive filter P(z) 820 is configured to perform arithmetic processing on the decoded signal Y(z) output from the adder 810. The predictive filter P(z) 820 has the same configuration as the predictive filter P(z) 710 of the audio coding apparatus 700. That is to say, the predictive filter P(z) 820 uses the same predictive filter coefficients p_i as used by the predictive filter P(z) 710. Moreover, the predictive filter P(z) 820 performs arithmetic processing based on Equation 1 and supplies the processing results to the adder 810.
As described above, the audio decoding apparatus 800 decodes the quantized signal Xq(z) by using only the adder 810 and the predictive filter P(z) 820, which has the same configuration as the predictive filter P(z) 710 of the audio coding apparatus 700. Therefore, it can be understood that the configuration of the audio decoding apparatus 800 is not affected by the configuration of the feedback calculator R(z) 760.
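The decoding loop formed by the adder 810 and the predictive filter P(z) 820 can be sketched as a simple recursion. The function name `decode` is hypothetical, and starting the filter from an all-zero state is an assumption.

```python
def decode(xq, coeffs):
    """Reconstruct y[n] = xq[n] + sum_{i=1..Np} p_i * y[n-i].

    xq     -- quantized signal samples
    coeffs -- predictive filter coefficients [p_1, ..., p_Np]
    """
    y = []
    for n in range(len(xq)):
        # Predictive filter 820: prediction from past decoded samples.
        pred = sum(p * y[n - i] for i, p in enumerate(coeffs, start=1) if n - i >= 0)
        y.append(xq[n] + pred)  # adder 810
    return y


# Hypothetical coefficients chosen only for easy arithmetic; a single
# nonzero residual sample is extrapolated into a ramp.
print(decode([1.0, 0.0, 0.0, 0.0], [2.0, -1.0]))
```

Note that the recursion uses only the predictive filter coefficients; neither λ nor R(z) appears anywhere in it.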
Next, the characteristic of the quantization noise included in the decoded signal Y(z) which is output from the audio decoding apparatus 800 will be described.
First, the characteristic of the quantized signal Xq(z) output from the audio coding apparatus 700 can be expressed by the following equation in which E(z) represents the quantization error in the audio coding apparatus 700.
$$\mathit{Xq}(z) = [1 - P(z)] \cdot X(z) + E(z) - \mathit{Es}(z) = [1 - P(z)] \cdot X(z) + [1 - R(z)] \cdot E(z)$$
The characteristic of the decoded signal Y(z) output from the audio decoding apparatus 800 can be expressed by Equation 3 based on the above equation.
$$Y(z) = \frac{1}{1 - P(z)} \cdot \mathit{Xq}(z) = X(z) + \frac{1 - R(z)}{1 - P(z)} \cdot E(z) \qquad \text{(Equation 3)}$$
It can be understood from the above equation that the quantization noise characteristic of the decoded signal Y(z) output from the audio decoding apparatus 800 can be controlled by P(z) and R(z). The frequency characteristic of the quantization noise output from the audio decoding apparatus 800 in the case of P(z)=R(z) will be described.
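The step leading to Equation 3 can be spelled out using only the relations already stated: the feedback calculator R(z) 760 produces Es(z) = R(z)E(z), and the adder 810 with the predictive filter P(z) 820 in its loop gives Y(z) = Xq(z) + P(z)Y(z).

$$
\begin{aligned}
Y(z) &= \mathit{Xq}(z) + P(z)\,Y(z) \;\Longrightarrow\; Y(z) = \frac{\mathit{Xq}(z)}{1 - P(z)} \\
     &= \frac{[1 - P(z)]\,X(z) + [1 - R(z)]\,E(z)}{1 - P(z)} \\
     &= X(z) + \frac{1 - R(z)}{1 - P(z)}\,E(z)
\end{aligned}
$$

The second term is the quantization noise after decoding. When R(z) = P(z), the ratio equals 1 and the noise term reduces to E(z) itself.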
FIG. 14 is a diagram showing an example of the quantization noise output from the audio decoding apparatus 800 in the case of P(z)=R(z). In the drawing, the frequency characteristic of the input audio signal is represented by the solid line 780, and the frequency characteristic of the quantization noise is represented by the broken line 881. The horizontal axis represents frequency and the vertical axis represents intensity.
The frequency characteristic 780 of the input audio signal is the frequency characteristic of the audio signal input to the audio coding apparatus 700. The waveform of the frequency characteristic 780 of the input audio signal has three peaks (poles), and the peak level decreases as the frequency decreases.
The frequency characteristic 881 of the quantization noise is the frequency characteristic of the quantization noise included in the decoded signal Y(z) when the input audio signal coded by the audio coding apparatus 700 is decoded by the audio decoding apparatus 800.
As described above, in the case of P(z)=R(z), the quantization noise shows a flat frequency characteristic regardless of the frequency characteristic 780 of the input audio signal. In this case, the S/N which is the ratio of the level of the input audio signal (Signal) to the level of the quantization noise (Noise) will be poor in the valley portions of the input audio signal waveform, and thus annoying noise is likely to be heard. Therefore, it is important to match the frequency characteristic of the quantization noise to the waveform of the frequency characteristic of the input audio signal, thus reducing the auditory noise by the auditory masking effects. An example of the frequency characteristic of the quantization noise modulated by the feedback calculator R(z) 760 of the audio coding apparatus 700 will be described below.
FIG. 15 is a diagram showing an example of the frequency characteristic of the quantization noise modulated by the feedback calculator R(z) 760 of the audio coding apparatus 700. In the drawing, the frequency characteristic of the input audio signal is represented by the solid line 780, and the frequency characteristic of the quantization noise is represented by the broken lines 882 to 884. The horizontal axis represents frequency and the vertical axis represents intensity. The frequency characteristic 780 of the input audio signal has the same characteristic as that in FIG. 14 and thus will not be described herein.
The frequency characteristics 882 to 884 of the quantization noise are the frequency characteristics of the quantization noise after decoding when the adjustment parameter λ of the feedback calculator R(z) 760 was set to “0.0,” “0.5,” and “1.0,” respectively. The frequency characteristic 884 of the quantization noise when the adjustment parameter λ is set to “1.0,” namely P(z)=R(z), shows the same flat frequency characteristic as the frequency characteristic 881 of the quantization noise shown in FIG. 14.
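The dependence on λ can be checked numerically by evaluating the gain of the noise term of Equation 3, (1 − R(z))/(1 − P(z)), on the unit circle. The following is a rough sketch under stated assumptions: the function name `noise_gain_db` is hypothetical, and the second-order coefficients are made-up values chosen only to give a stable filter, not values taken from the figures.

```python
import cmath
import math


def noise_gain_db(coeffs, lam, omega):
    """Gain in dB of (1 - R(z)) / (1 - P(z)) at z = exp(j * omega).

    coeffs -- predictive filter coefficients [p_1, ..., p_Np]
    lam    -- adjustment parameter (lambda in Equation 2)
    omega  -- normalized angular frequency in radians per sample
    """
    z_inv = cmath.exp(-1j * omega)
    p = sum(c * z_inv ** i for i, c in enumerate(coeffs, start=1))
    r = sum((lam ** i) * c * z_inv ** i for i, c in enumerate(coeffs, start=1))
    return 20.0 * math.log10(abs(1 - r) / abs(1 - p))


coeffs = [1.2, -0.72]  # hypothetical predictor with poles inside the unit circle
for lam in (0.0, 0.5, 1.0):
    gains = [noise_gain_db(coeffs, lam, math.pi * k / 8) for k in range(9)]
    print(lam, [round(g, 1) for g in gains])
```

For λ = 1.0 the gain is 0 dB at every frequency, which corresponds to the flat characteristic; for λ = 0.0 the gain is that of 1/(1 − P(z)) alone.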
As described above, the peak level in the frequency characteristic of the quantization noise can be adjusted by decreasing the value of the adjustment parameter λ of the feedback calculator R(z) 760. That is to say, it is preferable to decrease the adjustment parameter λ as much as possible to obtain the auditory masking effects. However, if the adjustment parameter λ is too small, the level of the feedback signal Es(z) generated by the feedback calculator R(z) 760 will become too high. In such a case, signals at levels exceeding the quantization range will be input to the quantizer 740, and thus the decoded signals will produce an unnatural sound. For this reason, the adjustment parameter λ is typically set to a range of “0.4” to “0.8.” When signals at levels exceeding the quantization range are input to the quantizer and the output of the quantizer is thereby saturated, those input signals are referred to as having been clipped.
Therefore, in order to appropriately control the quantization noise after decoding, an audio coding apparatus has been proposed in which the feedback calculator R(z) 760 is configured based on the predictive filter P(z) 710. Such a proposal is described, for example, in B. S. Atal, M. R. Schroeder: “Predictive coding of speech signals and subjective error criteria,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-27, pp. 247-254, June 1979.