1. Field of the Invention
The present invention relates to a method for determining quantization parameters, and more particularly to a method for determining quantization parameters in a bit allocation process.
2. Description of the Related Art
Since Thomas Alva Edison invented the phonograph, music has played an important role in people's lives. Driven by the demand for music, engineers have continued to advance the methods of recording and reproducing audio signals, from early analog systems to the digital systems popular today. The CD (compact disc) is currently a popular format for storing audio signals. However, as the Internet gains popularity, the traditional CD recording format is gradually being replaced by compressed coding formats, such as MPEG-audio Layer-3 or AAC (Advanced Audio Coding), because CD-format recordings generally occupy far more data.
The traditional process of transforming analog music into digital form has three steps: Sampling, Quantization and Pulse Code Modulation (PCM). Sampling means reading the signal level of the music at equal time intervals. Quantization means representing the amplitude of each sampled signal with one of a limited number of numerical values. Pulse Code Modulation (PCM) means representing the quantized value as a binary number. Traditional music CDs employ the aforementioned PCM technique to record analog music in digital format, but this demands large storage space and communication bandwidth. For example, present-day music CDs adopt 16-bit quantization, so each minute of recorded music needs about 10 MB of storage space. Because the data transmission bandwidth available for digital TV, wireless communication and the Internet is limited, encoding techniques that achieve a higher compression ratio on music signals have been invented and developed.
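The three steps above, and the figure of about 10 MB per minute, can be illustrated with a minimal sketch. The 44.1 kHz sampling rate and the two channels are assumptions based on the standard audio CD format and are not stated in the text above; the function names and the 1 kHz test tone are purely illustrative.

```python
import math

def pcm_encode(signal, sample_rate, duration_s, bits=16):
    """Sample a continuous signal, quantize each sample, and emit binary codes."""
    max_code = 2 ** (bits - 1) - 1            # largest positive code (32767 for 16 bits)
    n_samples = int(sample_rate * duration_s)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate                   # sampling: read at equal time intervals
        amplitude = signal(t)                 # assumed to lie in [-1.0, 1.0]
        q = round(amplitude * max_code)       # quantization: a limited set of integer values
        codes.append(q & (2 ** bits - 1))     # PCM: two's-complement code stored as an unsigned int
    return codes

def bytes_per_minute(sample_rate=44100, bits=16, channels=2):
    """Uncompressed PCM storage for one minute (CD parameters assumed as defaults)."""
    return sample_rate * (bits // 8) * channels * 60

def tone(t):
    """Illustrative 1 kHz sine wave with amplitude 1.0."""
    return math.sin(2 * math.pi * 1000 * t)

codes = pcm_encode(tone, 44100, 0.001)
cd_bytes_per_minute = bytes_per_minute()
print(len(codes), cd_bytes_per_minute)
```

Under these assumptions one minute of stereo audio occupies 44100 × 2 × 2 × 60 = 10,584,000 bytes, which is consistent with the figure of about 10 MB quoted above.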
FIG. 1 shows a functional block diagram of an audio encoding system 10 of the prior art. In the audio encoding system 10 of FIG. 1, an encoder such as the aforementioned MPEG-audio Layer-3 or AAC encodes PCM samples into an MPEG-audio Layer-3 or AAC audio bitstream. The traditional audio encoding system 10 comprises a Modified Discrete Cosine Transform module (MDCT module) 12, a psychoacoustic module 14, a quantization module 16, an encoding module 18, and a bitstream packing module 19.
The PCM samples are input to both the MDCT module 12 and the psychoacoustic module 14. The samples are first analyzed by the psychoacoustic module 14 to generate a masking curve and a window message. The masking curve delineates the range of audio signals that ordinary human ears can perceive: only audio signals above the masking curve are perceptible.
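As a minimal illustration of how a masking curve separates perceptible from imperceptible components, the sketch below compares per-component signal levels with a masking curve. The function name and all decibel values are invented for the example; a real psychoacoustic model derives the curve from the signal itself.

```python
def audible_components(levels_db, masking_db):
    """Return indices of spectral components whose level exceeds the masking curve."""
    return [i for i, (level, mask) in enumerate(zip(levels_db, masking_db)) if level > mask]

# Hypothetical signal levels and masking curve, one value per spectral component (dB).
levels_db = [60.0, 20.0, 45.0, 10.0]
masking_db = [30.0, 35.0, 40.0, 25.0]

audible = audible_components(levels_db, masking_db)
print(audible)
```

Components that fall below the curve can be coded coarsely, or not at all, without an audible penalty, which is what the bit allocation process described later exploits.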
According to the window message transmitted from the psychoacoustic module 14, the MDCT module 12 performs a modified discrete cosine transform on the PCM samples. The PCM samples are transformed into a plurality of MDCT samples, which are then grouped, according to the characteristics of human auditory perception, into a plurality of frequency subbands of unequal bandwidth; each frequency subband is associated with a masking threshold. The quantization module 16 cooperates with the encoding module 18 to repeatedly perform a bit allocation process on every frequency subband; this procedure ensures that every MDCT sample in the frequency subbands conforms to the coding distortion standard. For instance, using a limited number of available bits, the final encoding distortion of every MDCT sample is kept below the corresponding masking threshold determined by the psychoacoustic module 14. After the bit allocation procedure, the encoding module 18 performs Huffman encoding on all MDCT samples in each frequency subband. Finally, the bitstream packing module 19 combines all encoded frequency subbands and packs them with the corresponding side information to generate an audio bitstream. The side information contains information related to the entire audio encoding process, for example, the window message, the stepsize factor, and Huffman encoding information.
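The transform and grouping described above can be sketched as follows. This is a naive, direct evaluation of the textbook MDCT definition, with no analysis window applied, so it ignores the window message; the subband boundaries are hypothetical and are not taken from any standard's tables.

```python
import math

def mdct(block):
    """Naive O(N^2) MDCT: 2N time samples in, N coefficients out (no window applied)."""
    two_n = len(block)
    n_half = two_n // 2
    return [
        sum(
            block[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(two_n)
        )
        for k in range(n_half)
    ]

def group_into_subbands(coeffs, boundaries):
    """Split coefficients into subbands; boundaries[i] is the first index of subband i."""
    return [coeffs[lo:hi] for lo, hi in zip(boundaries[:-1], boundaries[1:])]

# Illustrative 16-sample block, giving 8 MDCT coefficients.
block = [math.sin(2 * math.pi * 3 * n / 16) for n in range(16)]
coeffs = mdct(block)

# Hypothetical boundaries: narrow subbands at low frequencies, wider ones higher up.
subbands = group_into_subbands(coeffs, [0, 1, 2, 4, 8])
print([len(band) for band in subbands])
```

The unequal subband widths mirror the unequal bandwidths mentioned above: human hearing resolves low frequencies more finely, so the low subbands contain fewer coefficients.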
FIG. 2 shows a flow chart of conventional audio encoding. Conventional audio encoding, such as MPEG-audio Layer-3 (MP3) or AAC, includes the following steps:
STEP 200: Start.
STEP 202: Receive PCM samples. Then go to step 204 and step 206.
STEP 204: Analyze the PCM samples using the psychoacoustic module to determine the corresponding masking curve.
STEP 206: Perform the modified discrete cosine transform on the PCM samples to generate a plurality of MDCT samples, which are grouped into several frequency subbands; each frequency subband may include a different number of MDCT samples.
STEP 208: According to the masking threshold of each corresponding frequency subband, perform a bit allocation process on every MDCT sample in the frequency subband, so that the MDCT samples in the frequency subband conform to the encoding distortion standard.
STEP 210: Pack all of the encoded frequency subbands with the corresponding side information so as to generate a corresponding audio bitstream of the PCM samples.
STEP 212: End.
The bit allocation procedure performed by the quantization module 16 and the encoding module 18 in FIG. 1 further includes many complicated steps. FIG. 3 shows a flow chart of a conventional bit allocation procedure, which includes the following steps:
STEP 300: Start.
STEP 302: Perform quantization of all the frequency subbands nonlinearly (disproportionately) according to a stepsize factor corresponding to each audio frame.
STEP 304: Look up the Huffman table to calculate the number of bits needed by every MDCT sample of the corresponding frequency subband.
STEP 306: Determine if the number of needed bits is lower than the number of available bits. If YES, go to STEP 310. If NO, go to STEP 308.
STEP 308: Increase the stepsize factor, and go back to STEP 302.
STEP 310: De-quantize the quantized frequency subbands.
STEP 312: Calculate the distortion of the frequency subbands.
STEP 314: Store the scalefactor of the frequency subbands and the stepsize factor of the audio frame.
STEP 316: Determine if there is any frequency subband whose distortion exceeds the corresponding masking threshold. If NO, go to STEP 322. If YES, go to STEP 317.
STEP 317: Determine if any other termination condition is met (for example, the scalefactor has reached its upper limit). If YES, go to STEP 318. If NO, go to STEP 320.
STEP 318: Increase the value of the scalefactor.
STEP 319: Amplify all the MDCT samples of the frequency subband according to the scalefactor, and then go to STEP 302.
STEP 320: Determine if the current scalefactor and stepsize factor are better values or the most preferable values. If YES, go to STEP 322. If NO, go to STEP 321.
STEP 321: Restore the previously stored better scalefactor and stepsize factor, and then go to STEP 322.
STEP 322: End.
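The overall shape of this procedure can be sketched as two nested loops. The code below is a simplified illustration under several stated assumptions, not a reproduction of FIG. 3: the quantizer uses a 3/4 power law of the kind used in MPEG Layer-3 and AAC with an assumed rounding offset of 0.4054; `count_bits` is a hypothetical cost model standing in for the Huffman table lookup; the amplification gain `2 ** (sf / 2)`, the mean-squared-error distortion measure, and the termination handling are all assumptions chosen for brevity.

```python
def quantize(x, stepsize):
    """Nonlinear (3/4 power law) quantization of one coefficient (offset is an assumption)."""
    q = int((abs(x) / 2 ** (stepsize / 4)) ** 0.75 + 0.4054)
    return -q if x < 0 else q

def dequantize(q, stepsize):
    """Inverse of quantize, up to the quantization error."""
    mag = abs(q) ** (4 / 3) * 2 ** (stepsize / 4)
    return -mag if q < 0 else mag

def count_bits(quantized):
    """Hypothetical cost model standing in for the Huffman table lookup."""
    return sum(0 if q == 0 else 2 * abs(q).bit_length() + 1 for q in quantized)

def inner_loop(bands, available_bits, stepsize=0):
    """Bit rate control loop: raise the stepsize factor until the frame fits."""
    while True:
        quantized = [[quantize(x, stepsize) for x in band] for band in bands]
        if count_bits([q for band in quantized for q in band]) <= available_bits:
            return stepsize, quantized
        stepsize += 1                              # increment by one per pass

def allocate_bits(bands, thresholds, available_bits, max_scalefactor=4):
    """Distortion control loop: amplify subbands whose distortion exceeds the threshold."""
    scalefactors = [0] * len(bands)
    best = None                                    # (bands over threshold, scalefactors, stepsize)
    while True:
        gains = [2 ** (sf / 2) for sf in scalefactors]   # assumed amplification rule
        amplified = [[x * g for x in band] for band, g in zip(bands, gains)]
        stepsize, quantized = inner_loop(amplified, available_bits)
        distortion = [
            sum((x - dequantize(q, stepsize) / g) ** 2 for x, q in zip(band, qband)) / len(band)
            for band, qband, g in zip(bands, quantized, gains)
        ]
        over = [i for i, (d, t) in enumerate(zip(distortion, thresholds)) if d > t]
        if best is None or len(over) < best[0]:
            best = (len(over), list(scalefactors), stepsize)   # store the better parameters
        if not over or any(scalefactors[i] >= max_scalefactor for i in over):
            return best[1], best[2]                # done, or limit reached: restore best values
        for i in over:
            scalefactors[i] += 1                   # increase the scalefactor and try again

# Illustrative data: two subbands of MDCT samples with made-up masking thresholds.
bands = [[10.0, -8.0], [3.0, 1.0, -2.0, 0.5]]
thresholds = [0.5, 0.5]
scalefactors, stepsize = allocate_bits(bands, thresholds, available_bits=30)
print(scalefactors, stepsize)
```

Every pass of the outer `while` loop in `allocate_bits` calls `inner_loop` from scratch, which is where the repeated quantization work of the conventional procedure comes from.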
From the discussion above, the bit allocation procedure contains two loops for determining the quantization parameters. The first loop, from STEP 302 to STEP 308, is usually called the inner loop or the bit rate control loop and is used for determining the stepsize factor. The second loop, from STEP 302 to STEP 322, is usually called the outer loop or the distortion control loop and is used for determining the scalefactor. Each run of the traditional bit allocation method therefore usually requires many runs of the outer loop, and every run of the outer loop includes many runs of the inner loop. Such repeated operation makes the prior art inefficient. To improve encoding efficiency, it is important to reduce the number of loop iterations and operations. Moreover, because the bit allocation loop of the prior art increases the stepsize factor by only one each time, the bit rate control loop must be repeated many times.
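The cost of increasing the stepsize factor by one per pass can be made concrete by counting quantization passes. The sketch below is illustrative only: the 3/4 power-law quantizer with a 0.4054 rounding offset, the bit cost model standing in for the Huffman lookup, and the coefficient values are all assumptions.

```python
def quantize(x, stepsize):
    """Magnitude-only 3/4 power-law quantizer (rounding offset is an assumption)."""
    return int((abs(x) / 2 ** (stepsize / 4)) ** 0.75 + 0.4054)

def count_bits(quantized):
    """Hypothetical cost model standing in for the Huffman table lookup."""
    return sum(0 if q == 0 else 2 * q.bit_length() + 1 for q in quantized)

def inner_loop_passes(coeffs, available_bits, increment=1):
    """Run the bit rate control loop and count how many quantization passes it takes."""
    stepsize, passes = 0, 0
    while True:
        passes += 1                                # one full quantize-and-count pass
        if count_bits([quantize(x, stepsize) for x in coeffs]) <= available_bits:
            return stepsize, passes
        stepsize += increment

# Made-up MDCT coefficients spanning a wide dynamic range.
coeffs = [1000.0, -750.0, 400.0, 120.0, -60.0, 15.0, 3.0, 1.0]
stepsize, passes = inner_loop_passes(coeffs, available_bits=40)
print(stepsize, passes)
```

Starting from zero with an increment of one, the number of passes is always the final stepsize factor plus one, and that count is incurred again on every run of the outer loop.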
Some related references are listed below:
[1] Information technology—coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s. part 3: Audio. Technical report, ISO/IEC, MPEG 11172-3, 1993.
[2] Information technology—generic coding of moving pictures and associated audio information. Part 3: Audio. Technical report, ISO/IEC MPEG 13818-3, 1998.
[3] Information technology—generic coding of moving pictures and associated audio information. Part 7: Advanced audio coding (AAC). Technical report, ISO/IEC MPEG 13818-7, 1997.
[4] Information technology—very low bitrate audio-visual coding. Part 3: Audio. Technical report, ISO/IEC MPEG 14496-3, 1998.
[5] US2001/0032086 A1, Fast convergence method for bit allocation stage of MPEG audio layer 3 encoders.
[6] EP 0967593 B1, Audio coding and quantization method.
[7] H. Oh, J. Kim, C. Song, Y. Park and D. Youn. “Low power MPEG/audio encoders using simplified psychoacoustic model and fast bit allocation”. IEEE transactions on Vol. 47, pp. 613-621, 2001.
[8] C. Liu, C. Chen, W. Lee and S. Lee. “A fast bit allocation method for MPEG layer III”. Proc. of ICCE, pp. 22-23, 1999.
[9] Alberto D. Duenas, Rafael Perez, Begona Rivas, Enrique Alexandre, Antonio S. Pena. “A robust and efficient implementation of MPEG-2/4 AAC Natural Audio Coders”. In AES 112th Convention, 2002.