This application is based on Application No. 2001-052113, filed in Japan on Feb. 27, 2001, the contents of which are hereby incorporated by reference.
1. Field of the Invention
The present invention relates to an audio signal encoding apparatus for encoding a wide-band audio signal and multiplexing and transmitting an encoded bit string generated by the encoding processing to a transmission line. More specifically, the present invention relates to a technique of preventing deterioration in objective characteristics such as an S/N ratio (signal-to-noise ratio), etc., in cases where a frequency component of a signal to be processed, such as a sine wave, exists in a narrow band.
2. Description of the Related Art
As a typical example of conventional audio signal encoding apparatuses, reference is made to one illustrated in the ISO/IEC 13818-7 standard (hereinafter, referred to as an MPEG-2 AAC method). Here, note that the MPEG-2 AAC method is defined in detail in that standard.
FIG. 15 illustrates a block diagram of the MPEG-2 AAC method as such a conventional audio signal encoding apparatus. In this figure, the conventional audio signal encoding apparatus includes a psychoacoustic model section 1, an MDCT (Modified Discrete Cosine Transform) processing section 2, an iterative loop processing section 3, and a multiplexer section 4. The psychoacoustic model section 1 includes an FFT (Fast Fourier Transform) operation section 11, a block type determination section 12 and an SMR (Signal-to-Mask Ratio) operation section 13. The iterative loop processing section 3 includes an allowable error amount calculation section 31, a bit amount/error amount control section 32, a normalization processing section 33, a quantization section 34, and a Huffman encoding section 35.
Next, the operation of this audio signal encoding apparatus will be described below.
An input signal input to the psychoacoustic model section 1 is subjected to FFT operation processing in the FFT operation section 11 to generate an FFT frequency spectrum.
Now, the processing block type will be briefly described prior to an explanation of the block type determination section 12. When a signal on a time base is converted into a signal on a frequency base, there are two kinds of processing block types, one being a long type in which a signal to be analyzed is expanded in time for improved frequency resolution, the other being a short type in which a signal to be analyzed is shortened in time for improved time resolution. The former type is used in the case where there exists only a stationary signal, whereas the latter is used when there is a rapid signal change. In the MPEG-2 AAC method, by properly using these two kinds of processing block types according to the characteristics of a signal to be analyzed, it is possible to prevent the generation of unpleasant noise called a pre-echo, which would otherwise result from an insufficient time resolution.
The block type determination section 12 calculates a masking threshold from an FFT frequency spectrum from the FFT operation section 11, determines the block type of the input signal based on the masking threshold thus obtained, and passes the result of determination to the MDCT processing section 2 and the multiplexer section 4 as a processing block type.
Then, the SMR operation section 13 calculates an SMR based on the FFT frequency spectrum from the FFT operation section 11 and the masking threshold in the block type determination section 12, and sends the SMR thus generated to the allowable error amount calculation section 31 in the iterative loop processing section 3.
The MDCT processing section 2 performs conversion processing, i.e., frequency orthogonal transformation processing, from the time base to the frequency base based on the processing block type received from the block type determination section 12. As a result, the MDCT frequency spectrum thus generated is passed to the allowable error amount calculation section 31 and the normalization processing section 33 in the iterative loop processing section 3.
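As a rough illustration of the time-base to frequency-base conversion described above, the following sketch evaluates the MDCT directly from its defining sum. It is not taken from this document: the function name `mdct` and the two constants are illustrative, the frame lengths (2048 samples for the long type, 256 for the short type) are the ones commonly associated with MPEG-2 AAC, and the windowing that a practical encoder applies before the transform is omitted.

```python
import math

# Assumed frame lengths (in input samples) for the two processing block types.
LONG_BLOCK = 2048   # long type: better frequency resolution
SHORT_BLOCK = 256   # short type: better time resolution


def mdct(frame):
    """Direct O(N^2) MDCT: 2N time samples in, N frequency coefficients out."""
    two_n = len(frame)
    if two_n == 0 or two_n % 2:
        raise ValueError("frame length must be a positive even number")
    n = two_n // 2
    phase = 0.5 + n / 2.0
    return [
        sum(frame[t] * math.cos(math.pi / n * (t + phase) * (k + 0.5))
            for t in range(two_n))
        for k in range(n)
    ]
```

A frame of `LONG_BLOCK` samples would yield 1024 coefficients and a frame of `SHORT_BLOCK` samples would yield 128; the direct sum is only meant to show the mapping, since a real encoder would use a fast algorithm.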
The allowable error amount calculation section 31 in the iterative loop processing section 3 multiplies the MDCT frequency spectrum by the reciprocal (1/SMR) of the SMR to obtain an allowable amount of error. The amount of error mentioned here indicates the difference between the MDCT frequency spectrum from the MDCT processing section 2 and the dequantized value generated through quantization/dequantization, that is, a quantizing error. If this quantizing error is within the allowable range, the noise cannot be perceived by the human ear.
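The multiplication by 1/SMR can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoder's actual routine: the SMR is taken to be a linear ratio (not decibels) with one value per coefficient, the magnitude of each coefficient is used, and the names `allowable_error` and `within_allowance` are hypothetical.

```python
def allowable_error(mdct_spectrum, smr):
    """Allowable error per coefficient: |X[k]| * (1 / SMR[k])."""
    if len(mdct_spectrum) != len(smr):
        raise ValueError("spectrum and SMR must have the same length")
    return [abs(x) / s for x, s in zip(mdct_spectrum, smr)]


def within_allowance(mdct_spectrum, dequantized, allowed):
    """True if every quantizing error stays within its allowable amount."""
    return all(abs(x - y) <= a
               for x, y, a in zip(mdct_spectrum, dequantized, allowed))


allowed = allowable_error([8.0, -4.0], [4.0, 2.0])   # [2.0, 2.0]
ok = within_allowance([8.0, -4.0], [7.0, -5.5], allowed)
```

A larger SMR gives a smaller allowable error for the same coefficient, so components that stand well above the masking threshold are quantized more finely.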
The amount of error calculated in the allowable error amount calculation section 31 is passed to the bit amount/error amount control section 32 where this amount of error is used as an index of determination as to whether the MDCT frequency spectrum generated through quantization/dequantization satisfies the allowable amount of error.
The normalization processing section 33 normalizes the MDCT frequency spectrum passed from the MDCT processing section 2 by using a scale factor selected in the bit amount/error amount control section 32.
The quantization section 34 quantizes the MDCT frequency spectrum normalized by the normalization processing section 33, and passes the result of quantization to the Huffman encoding section 35. In addition, the quantization section 34 performs dequantization so as to calculate an amount of error in the quantization, and the value thus obtained by the dequantization is passed to the bit amount/error amount control section 32.
The quantized MDCT frequency spectrum is subjected to Huffman encoding in the Huffman encoding section 35, so that the amount of bits actually needed is supplied to the bit amount/error amount control section 32, and a Huffman code book number and a Huffman code are passed to the multiplexer section 4.
The bit amount/error amount control section 32 calculates a difference between the MDCT frequency spectrum from the MDCT processing section 2 and the dequantized MDCT frequency spectrum obtained from the quantization section 34, that is, an amount of error due to quantization, which is then compared with the amount of error calculated by the allowable error amount calculation section 31. As a result, when it is determined that the amount of error due to the quantization is greater than the amount of error calculated by the allowable error amount calculation section 31, the value of the scale factor is reduced and then passed to the normalization processing section 33.
On the other hand, when it is determined that the amount of error due to the quantization is smaller than the amount of error calculated by the allowable error amount calculation section 31, a comparison is made between an amount of used bits obtained from the Huffman encoding section 35 and an allowable amount of bits calculated from the bit rate specified upon encoding. As a result, when it is determined that the amount of the used bits is greater than the allowable amount of bits, the value of the scale factor is increased and then passed to the normalization processing section 33. On the other hand, when it is determined that the amount of used bits is smaller than the allowable amount of bits, the processing in the iterative loop processing section 3 is ended, and the process is shifted to multiplex processing.
As described above, the processing in the iterative loop processing section 3, which is constituted by the allowable error amount calculation section 31, the bit amount/error amount control section 32, the normalization processing section 33, the quantization section 34 and the Huffman encoding section 35, is repeated until the amount of error due to quantization falls below the allowable amount of error and the amount of bits required for quantization falls below the allowable amount of bits.
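The control flow of the iterative loop can be sketched as below. Several parts are assumptions made only for illustration: a uniform quantizer with step 2^(scale_factor/4) stands in for the normalization and quantization sections, `count_bits` is a crude stand-in for the bit count reported by the Huffman encoding section, and the `max_iter` guard is added because nothing in the description guarantees that both conditions can be met at once.

```python
def quantize(spectrum, scale_factor):
    step = 2.0 ** (scale_factor / 4.0)   # assumed step-size law
    return [round(x / step) for x in spectrum]


def dequantize(quantized, scale_factor):
    step = 2.0 ** (scale_factor / 4.0)
    return [q * step for q in quantized]


def count_bits(quantized):
    """Hypothetical stand-in for the Huffman encoder's used-bit count."""
    return sum(1 + 2 * abs(q).bit_length() for q in quantized)


def iterative_loop(spectrum, allowed_error, bit_budget,
                   scale_factor=16, max_iter=64):
    """Return (scale_factor, quantized, converged)."""
    for _ in range(max_iter):
        quantized = quantize(spectrum, scale_factor)
        restored = dequantize(quantized, scale_factor)
        too_noisy = any(abs(x - y) > a for x, y, a
                        in zip(spectrum, restored, allowed_error))
        if too_noisy:
            scale_factor -= 1      # error too large: reduce the scale factor
        elif count_bits(quantized) > bit_budget:
            scale_factor += 1      # too many bits: increase the scale factor
        else:
            return scale_factor, quantized, True   # both conditions met
    return scale_factor, quantize(spectrum, scale_factor), False
```

With a generous bit budget the loop settles on the coarsest step that still keeps every error within its allowance; with an unreachable budget it alternates between the two branches until the guard stops it and reports that it did not converge.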
Thereafter, the quantized and Huffman-encoded MDCT frequency spectrum, together with ancillary information such as a header, the processing block type determined by the block type determination section 12, the scale factor selected by the bit amount/error amount control section 32 and the Huffman code book number selected by the Huffman encoding section 35, is subjected to the multiplex processing in the multiplexer section 4, so that it is transformed into a coded stream and then sent out to a transmission line.
In general, an encoding system using a psychoacoustic model is characterized by good auditory quality of a voice/music signal. However, objective characteristics such as, for example, the S/N ratio (signal-to-noise ratio), etc., tend to deteriorate. In the above-mentioned conventional audio signal encoding apparatus, etc., even when the width of the frequency band in which a frequency component of a signal to be encoded, such as a sine wave, exists is narrow, the signal is subjected to the encoding processing by using parameters calculated in a psychoacoustic model in consideration of the human auditory characteristics, thus giving rise to a problem in that the objective characteristics of the signal deteriorate.
The present invention is intended to obviate the problem as referred to above, and has for its object to provide an audio signal encoding apparatus which is capable of preventing deterioration in the objective characteristics of a signal to be encoded without using parameters from a psychoacoustic model generated based on the human auditory characteristics or by replacing such parameters with those by which the signal can be effectively quantized in cases where the width of a frequency band in which the frequency component such as a sine wave of the signal concerned exists is narrow.
Bearing the above object in mind, according to a first aspect of the present invention, there is provided an audio signal encoding apparatus comprising: an FFT operation section for performing FFT operation processing of an input signal; a block type determination section for determining a processing block type of an MDCT processing section by using an FFT frequency spectrum calculated by the FFT operation section; a sine wave discrimination section for discriminating whether or not the input signal is a sine wave, by using the FFT frequency spectrum calculated by the FFT operation section; an SMR operation section for calculating an SMR by using the FFT frequency spectrum calculated by the FFT operation section; a switching section for switching between use and nonuse of an output value from the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section; an MDCT processing section for calculating an MDCT frequency spectrum by performing frequency orthogonal transformation processing of the input signal based on the processing block type received from the block type determination section; an allowable error amount calculation section for calculating an allowable amount of error by using the SMR and the MDCT frequency spectrum; a bit amount/error amount control section for determining a scale factor by performing bit amount/error amount control based on an amount of error from the allowable error amount calculation section, a dequantized value from a quantization section and an amount of used bits from a Huffman encoding section; a normalization processing section for normalizing the MDCT frequency spectrum from the MDCT processing section based on the scale factor from the bit amount/error amount control section; the quantization section for quantizing and dequantizing the MDCT frequency spectrum normalized by the normalization processing section; the Huffman encoding section for performing Huffman encoding of the quantized MDCT frequency spectrum to output a Huffman code book number and a Huffman code and to calculate an amount of used bits; and a multiplexer section for multiplexing the processing block type from the block type determination section, the scale factor from the bit amount/error amount control section, the Huffman code book number and the Huffman code from the Huffman encoding section.
In a preferred form of the first aspect of the present invention, the audio signal encoding apparatus further comprises a switching section for switching between execution and stop of the calculation processing of the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section.
In another preferred form of the first aspect of the present invention, the switching section for switching between use and nonuse of an output value from the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section uses a preset SMR value when the output value from the SMR operation section is not used.
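The two switching behaviours of the first aspect (use or nonuse of the SMR output, and execution or stop of the SMR calculation) can be pictured with the short sketch below. The function name, the idea of passing the SMR calculation as a callable, the one-value-per-bin layout and the preset value are all assumptions made for illustration; the document does not specify them at this point.

```python
def smr_for_frame(fft_spectrum, is_sine, compute_smr, preset_smr):
    """Select the SMR handed to the allowable error amount calculation."""
    if is_sine:
        # Switch set to "nonuse"/"stop": the psychoacoustic calculation is
        # skipped and a preset value is supplied instead.
        return [preset_smr] * len(fft_spectrum)
    # Ordinary signal: run the SMR operation and use its output.
    return compute_smr(fft_spectrum)
```

Because `compute_smr` is never called on the sine-wave branch, the sketch covers both the substitution of a preset value and the saving of the SMR calculation itself.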
According to a second aspect of the present invention, there is provided an audio signal encoding apparatus comprising: an FFT operation section for performing FFT operation processing of an input signal; a block type determination section for determining a processing block type of an MDCT processing section by using an FFT frequency spectrum calculated by the FFT operation section; a sine wave discrimination section for discriminating whether or not the input signal is a sine wave, by using the FFT frequency spectrum calculated by the FFT operation section; an SMR operation section for calculating an SMR by using the FFT frequency spectrum calculated by the FFT operation section; an MDCT processing section for calculating an MDCT frequency spectrum by performing frequency orthogonal transformation processing of the input signal based on the processing block type received from the block type determination section; an allowable error amount calculation section for calculating an allowable amount of error by using the SMR and the MDCT frequency spectrum; a switching section for switching between use and nonuse of an output value from the allowable error amount calculation section based on the result of sine wave discrimination in the sine wave discrimination section; a bit amount/error amount control section for determining a scale factor by performing bit amount/error amount control based on an amount of error from the allowable error amount calculation section, a dequantized value from a quantization section and an amount of used bits from a Huffman encoding section; a normalization processing section for normalizing the MDCT frequency spectrum from the MDCT processing section based on the scale factor from the bit amount/error amount control section; the quantization section for quantizing and dequantizing the MDCT frequency spectrum normalized by the normalization processing section; the Huffman encoding section for performing Huffman encoding of the quantized MDCT frequency spectrum to output a Huffman code book number and a Huffman code and to calculate an amount of used bits; and a multiplexer section for multiplexing the processing block type from the block type determination section, the scale factor from the bit amount/error amount control section, the Huffman code book number and the Huffman code from the Huffman encoding section.
In a preferred form of the second aspect of the present invention, the audio signal encoding apparatus further comprises: a switching section for switching between execution and stop of the calculation processing of the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section; and a switching section for switching between execution and stop of the calculation processing of the allowable error amount calculation section based on the result of sine wave discrimination in the sine wave discrimination section.
In another preferred form of the second aspect of the present invention, when the output value from the allowable error amount calculation section is not used, a preset allowable error amount value is used in the switching section for switching between use and nonuse of an output value from the allowable error amount calculation section based on the result of sine wave discrimination in the sine wave discrimination section.
In a further preferred form of the second aspect of the present invention, when the calculation processing of the SMR operation section is stopped, a preset SMR value is used in the switching section for switching between execution and stop of the calculation processing of the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section.
In a still further preferred form of the first or second aspect of the present invention, the FFT frequency spectrum is an amplitude spectrum.
In a yet further preferred form of the first or second aspect of the present invention, the FFT frequency spectrum is a power spectrum.
In a further preferred form of the first or second aspect of the present invention, the FFT frequency spectrum is a real number component or an imaginary number component of the FFT operation result.
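The spectrum forms named above (amplitude, power, real component, imaginary component) can all be derived from one set of complex FFT bins, as the sketch below shows. The discrimination rule in `looks_like_sine` is purely hypothetical: it treats the input as a sine wave when most of the power lies in a few bins around the peak, with `half_width` and `threshold` chosen arbitrarily. A plain DFT is used in place of an FFT to keep the example self-contained.

```python
import cmath
import math


def dft(x):
    """Non-redundant DFT bins (0..N/2) of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n // 2 + 1)]


def spectrum_forms(bins):
    return {
        "amplitude": [abs(b) for b in bins],
        "power": [abs(b) ** 2 for b in bins],
        "real": [b.real for b in bins],
        "imag": [b.imag for b in bins],
    }


def looks_like_sine(power, half_width=1, threshold=0.9):
    """Hypothetical rule: is the power concentrated in a narrow band?"""
    total = sum(power)
    if total == 0:
        return False
    peak = max(range(len(power)), key=power.__getitem__)
    lo = max(0, peak - half_width)
    hi = min(len(power), peak + half_width + 1)
    return sum(power[lo:hi]) / total >= threshold


tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
click = [1.0] + [0.0] * 63   # impulse: power spread over all bins
tone_is_sine = looks_like_sine(spectrum_forms(dft(tone))["power"])
click_is_sine = looks_like_sine(spectrum_forms(dft(click))["power"])
```

The same concentration test could be applied to the amplitude spectrum with a different threshold; which form is cheapest to compute depends on what the encoder already has available.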
According to a third aspect of the present invention, there is provided an audio signal encoding apparatus comprising: an FFT operation section for performing FFT operation processing of an input signal; a block type determination section for determining a processing block type of an MDCT processing section by using an FFT frequency spectrum calculated by the FFT operation section; an MDCT processing section for calculating an MDCT frequency spectrum by performing frequency orthogonal transformation processing of the input signal based on the processing block type received from the block type determination section; a sine wave discrimination section for discriminating whether or not the input signal is a sine wave, by using the MDCT frequency spectrum calculated by the MDCT processing section; an SMR operation section for calculating an SMR by using the FFT frequency spectrum calculated by the FFT operation section; a switching section for switching between use and nonuse of an output value from the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section; an allowable error amount calculation section for calculating an allowable amount of error by using the SMR and the MDCT frequency spectrum; a bit amount/error amount control section for determining a scale factor by performing bit amount/error amount control based on an amount of error from the allowable error amount calculation section, a dequantized value from a quantization section and an amount of used bits from a Huffman encoding section; a normalization processing section for normalizing the MDCT frequency spectrum from the MDCT processing section based on the scale factor from the bit amount/error amount control section; the quantization section for quantizing and dequantizing the MDCT frequency spectrum normalized by the normalization processing section; the Huffman encoding section for performing Huffman encoding of the quantized MDCT frequency spectrum to output a Huffman code book number and a Huffman code and to calculate an amount of used bits; and a multiplexer section for multiplexing the processing block type from the block type determination section, the scale factor from the bit amount/error amount control section, the Huffman code book number and the Huffman code from the Huffman encoding section.
In a preferred form of the third aspect of the present invention, the audio signal encoding apparatus further comprises a switching section for switching between execution and stop of the calculation processing of the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section.
In another preferred form of the third aspect of the present invention, when the output value from the SMR operation section is not used, a preset SMR value is used in the switching section for switching between use and nonuse of an output value from the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section.
According to a fourth aspect of the present invention, there is provided an audio signal encoding apparatus comprising: an FFT operation section for performing FFT operation processing of an input signal; a block type determination section for determining a processing block type of an MDCT processing section by using an FFT frequency spectrum calculated by the FFT operation section; an MDCT processing section for calculating an MDCT frequency spectrum by performing frequency orthogonal transformation processing of the input signal based on the processing block type received from the block type determination section; a sine wave discrimination section for discriminating whether or not the input signal is a sine wave, by using the MDCT frequency spectrum calculated by the MDCT processing section; an SMR operation section for calculating an SMR by using the FFT frequency spectrum calculated by the FFT operation section; an allowable error amount calculation section for calculating an allowable amount of error by using the SMR and the MDCT frequency spectrum; a switching section for switching between use and nonuse of an output value from the allowable error amount calculation section based on the result of sine wave discrimination in the sine wave discrimination section; a bit amount/error amount control section for determining a scale factor by performing bit amount/error amount control based on an amount of error from the allowable error amount calculation section, a dequantized value from a quantization section and an amount of used bits from a Huffman encoding section; a normalization processing section for normalizing the MDCT frequency spectrum from the MDCT processing section based on the scale factor from the bit amount/error amount control section; the quantization section for quantizing and dequantizing the MDCT frequency spectrum normalized by the normalization processing section; the Huffman encoding section for performing Huffman encoding of the quantized MDCT frequency spectrum to output a Huffman code book number and a Huffman code and to calculate an amount of used bits; and a multiplexer section for multiplexing the processing block type from the block type determination section, the scale factor from the bit amount/error amount control section, the Huffman code book number and the Huffman code from the Huffman encoding section.
In a preferred form of the fourth aspect of the present invention, the audio signal encoding apparatus further comprises a switching section for switching between execution and stop of the calculation processing of the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section.
In another preferred form of the fourth aspect of the present invention, when the output value from the allowable error amount calculation section is not used, a preset allowable error amount value is used in the switching section for switching between use and nonuse of an output value from the allowable error amount calculation section based on the result of sine wave discrimination in the sine wave discrimination section.
In a further preferred form of the third or fourth aspect of the present invention, the MDCT frequency spectrum used for sine wave discrimination in the sine wave discrimination section is a power spectrum.
According to a fifth aspect of the present invention, there is provided an audio signal encoding apparatus comprising: an FFT operation section for performing FFT operation processing of an input signal; a block type determination section for determining a processing block type of an MDCT processing section by using an FFT frequency spectrum calculated by the FFT operation section; an MDCT processing section for calculating an MDCT frequency spectrum by performing frequency orthogonal transformation processing of the input signal based on the processing block type received from the block type determination section; a sine wave discrimination section for discriminating whether or not the input signal is a sine wave, by using the input signal; an SMR operation section for calculating an SMR by using the FFT frequency spectrum calculated by the FFT operation section; a switching section for switching between use and nonuse of an output value from the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section; an allowable error amount calculation section for calculating an allowable amount of error by using the SMR and the MDCT frequency spectrum; a bit amount/error amount control section for determining a scale factor by performing bit amount/error amount control based on an amount of error from the allowable error amount calculation section, a dequantized value from a quantization section and an amount of used bits from a Huffman encoding section; a normalization processing section for normalizing the MDCT frequency spectrum from the MDCT processing section based on the scale factor from the bit amount/error amount control section; the quantization section for quantizing and dequantizing the MDCT frequency spectrum normalized by the normalization processing section; the Huffman encoding section for performing Huffman encoding of the quantized MDCT frequency spectrum to output a Huffman code book number and a Huffman code and to calculate an amount of used bits; and a multiplexer section for multiplexing the processing block type from the block type determination section, the scale factor from the bit amount/error amount control section, the Huffman code book number and the Huffman code from the Huffman encoding section.
In a preferred form of the fifth aspect of the present invention, the audio signal encoding apparatus further comprises a switching section for switching between execution and stop of the calculation processing of the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section.
In another preferred form of the fifth aspect of the present invention, when the output value from the SMR operation section is not used, a preset SMR value is used in the switching section for switching between use and nonuse of an output value from the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section.
According to a sixth aspect of the present invention, there is provided an audio signal encoding apparatus comprising: an FFT operation section for performing FFT operation processing of an input signal; a block type determination section for determining a processing block type of an MDCT processing section by using an FFT frequency spectrum calculated by the FFT operation section; an MDCT processing section for calculating an MDCT frequency spectrum by performing frequency orthogonal transformation processing of the input signal based on the processing block type received from the block type determination section; a sine wave discrimination section for discriminating whether or not the input signal is a sine wave, by using the input signal; an SMR operation section for calculating an SMR by using the FFT frequency spectrum calculated by the FFT operation section; an allowable error amount calculation section for calculating an allowable amount of error by using the SMR and the MDCT frequency spectrum; a switching section for switching between use and nonuse of an output value from the allowable error amount calculation section based on the result of sine wave discrimination in the sine wave discrimination section; a bit amount/error amount control section for determining a scale factor by performing bit amount/error amount control based on an amount of error from the allowable error amount calculation section, a dequantized value from a quantization section and an amount of used bits from a Huffman encoding section; a normalization processing section for normalizing the MDCT frequency spectrum from the MDCT processing section based on the scale factor from the bit amount/error amount control section; the quantization section for quantizing and dequantizing the MDCT frequency spectrum normalized by the normalization processing section; the Huffman encoding section for performing Huffman encoding of the quantized MDCT frequency spectrum to output a Huffman code book number and a Huffman code and to calculate an amount of used bits; and a multiplexer section for multiplexing the processing block type from the block type determination section, the scale factor from the bit amount/error amount control section, the Huffman code book number and the Huffman code from the Huffman encoding section.
In a preferred form of the sixth aspect of the present invention, the audio signal encoding apparatus further comprises a switching section for switching between execution and stop of the calculation processing of the SMR operation section based on the result of sine wave discrimination in the sine wave discrimination section.
In another preferred form of the sixth aspect of the present invention, when the output value from the allowable error amount calculation section is not used, a preset allowable error amount value is used in the switching section for switching between use and nonuse of an output value from the allowable error amount calculation section based on the result of sine wave discrimination in the sine wave discrimination section.
The above and other objects, features and advantages of the present invention will become more readily apparent to those skilled in the art from the following detailed description of preferred embodiments of the present invention taken in conjunction with the accompanying drawings.