The present invention relates to a method and a device for defining a table of bit allocation, and more particularly to a method and a device for defining the table of bit allocation used in processing audio signals.
Recent subband encoders, designed around the human auditory system, can compress audio signals whose frequency content varies widely; music is a typical example. The compression ratio has become increasingly important because data are transmitted between computers so frequently over the Internet. The basic principle of a subband encoder is to divide the audio spectrum into several subbands and then to encode the audio signals in each subband separately.
A filter bank is often used to divide the audio signals. The band-pass filters in the filter bank restrict the frequency range of the audio signals in each subband. The subband signals are then sampled at the Nyquist rate, quantized, encoded, multiplexed, and transmitted. These steps are indirectly controlled by a psychoacoustic model. The psychoacoustic model defines a table of bit allocation that determines the number of bits used to store the audio signals in the respective subbands. The audio signals are then converted into digital signals for transmission. The table of bit allocation therefore plays an important role in transmitting audio signals. Where possible, a masking threshold estimate is used to control the quantizer.
After the digital signals are transmitted, the receiving end must reconstruct them to reproduce the original music. The subband decoder demultiplexes, decodes, up-samples, and mixes the digital signals to restore the audio signals. These steps are also based on the table of bit allocation.
Please refer to FIG. 1, which is a block diagram showing a conventional subband encoder. The audio signals s(n) are inputted into the band-pass filters 11 to become several subband signals B1 ... BN. The symbol n denotes the nth signal frame at a specific moment. The subband signals B1 ... BN represent the amplitude of the audio signals in the respective subbands. The subband signals B1 ... BN are then respectively decimated, that is, sampled, by the decimating units 12. The encoders 15 then encode the obtained signals. The table of bit allocation 13 provided by the psychoacoustic model 14 tells the encoders 15 the number of bits for storing the data in different subbands and at different moments. After the encoding step, the multiplexer 16 multiplexes all the encoded signals to generate the digital signals x(n). The digital signals x(n) can easily be transmitted to other operating systems or computers by means of cables or telephone lines. In addition, the digital signals x(n) can be stored easily and conveniently because their size is smaller than that of the audio signals s(n).
A key to the system is how to determine the table of bit allocation 13. The psychoacoustic model 14 does so based on the human auditory system. Human ears can only perceive sound within a limited frequency range. We cannot hear audio signals whose frequency is too high or too low even if their amplitude is great, but we can clearly hear audio signals of middle frequency even if their amplitude is not so great. Hence, more bits should be used to store the audio signals in the middle subbands. On the other hand, fewer bits, or even no bits at all, should be used for the subbands with low weight.
The encoders 15 quantize the decimated signals according to the table of bit allocation 13. For example, if the table of bit allocation 13 indicates that the signals in subband 1 can use 2 bits, the encoded data may be one of 00, 01, 10, and 11, indicating the quiet, loud, louder, and loudest sounds respectively.
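The 2-bit example above can be illustrated with a minimal sketch. It assumes a uniform quantizer acting on the sample magnitude, which the specification does not state; the function name `quantize` and the parameter `full_scale` are hypothetical.

```python
def quantize(sample: float, bits: int, full_scale: float = 1.0) -> str:
    """Map a sample magnitude onto a `bits`-wide binary code.

    Assumption: uniform step size over [0, full_scale]. The encoders 15
    described in the text may use a different quantization law.
    """
    if bits == 0:
        return ""  # a subband allocated 0 bits stores nothing
    levels = 2 ** bits
    # Clamp the magnitude, then scale it onto the integer range 0 .. levels-1.
    magnitude = min(abs(sample), full_scale)
    index = min(int(magnitude / full_scale * levels), levels - 1)
    return format(index, f"0{bits}b")


# With 2 bits, four loudness classes are available.
codes = [quantize(x, 2) for x in (0.1, 0.3, 0.6, 0.9)]
```

Under these assumptions the four magnitudes map to the codes 00, 01, 10, and 11, from quietest to loudest.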
Please refer to FIG. 2, which is a block diagram showing the conventional subband decoder. The reconstruction process is the reverse of the encoding process. First, the digital signals x(n) are demultiplexed by the demultiplexer 21 to extract the signals in each subband and at each moment. The decoders 22 decode these signals to generate the decoded signals b1 ... bN according to the information stored in the table of bit allocation 23. The decoded signals b1 ... bN are up-sampled by the expanding units 24. After passing through the band-pass filters 25, all the signals are mixed by the mixer 26 and combined into reconstructed audio signals, which are similar to the original audio signals s(n).
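Why the decoder needs the same table of bit allocation can be shown with a small sketch. It assumes that one signal frame is multiplexed as a plain concatenation of the subband codes, which is a simplification not stated in the specification; `multiplex` and `demultiplex` are hypothetical names.

```python
def multiplex(codes):
    """Concatenate the per-subband binary codes of one signal frame."""
    return "".join(codes)


def demultiplex(bitstream, bit_allocation):
    """Split a frame back into subband codes using the bit allocation values.

    Assumption: fields are stored back to back with no headers, so the
    field widths can only be recovered from the table of bit allocation.
    """
    fields, pos = [], 0
    for bits in bit_allocation:
        fields.append(bitstream[pos:pos + bits])
        pos += bits
    return fields


allocation = [2, 0, 3, 2]          # illustrative bits per subband
frame = multiplex(["10", "", "011", "10"])
recovered = demultiplex(frame, allocation)
```

Without the allocation values the 7-bit string in `frame` could not be split unambiguously, which is why the decoding steps depend on the table.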
The quality of audio signals reconstructed by the conventional method is not high enough. The principle of the conventional method is to find the minimum noise-to-mask ratio in the respective signal frames (about 10-30 ms). The number of bits "adb" used for each signal frame is calculated from the following equation:
adb = B ÷ 1000 × K
wherein B is the bit rate (bits/sec) and K is the frame interval (s). Signal frames of the same interval are therefore allocated the same number of bits. In practice, many signal frames cannot be perceived because of masking effects. Such allocation wastes bits on storing those frames, so the quality of the audio signals cannot be raised. It also increases the production cost. Hence, it is desirable either to use fewer bits to provide the same audio quality or to use the same number of bits to provide higher audio quality.
An objective of the present invention is to disclose a method for defining the table of bit allocation in processing audio signals. This method can allocate bits to effective signal frames and subbands. Such bit allocation can both increase transmission efficiency and reduce production cost.
Another objective of the present invention is to disclose a device for defining the table of bit allocation in processing audio signals. This device can allocate bits to effective signal frames and subbands. Such a device can both increase transmission efficiency and reduce production cost.
In accordance with the present invention, the defining method includes the following steps. In the first step, the total number of bits used for storing the audio signals is determined. In this specification, the words "bit allocation value" indicate the number of bits used for storing the audio signals. Then, the psychoacoustic model finds several signal-to-mask ratios in different subbands and at different moments according to the original audio signals. All the signal-to-mask ratios are quantized to generate several quantized levels. Each quantized level includes at least one signal-to-mask ratio and corresponds to a bit allocation value and a sample signal-to-mask ratio. Hence, the table of bit allocation composed of the bit allocation values is defined.
In accordance with another aspect of the present invention, the table of bit allocation includes a time axis and a band axis. Therefore, each given moment and subband corresponds to a bit allocation value. Non-effective signal frames and subbands correspond to a bit allocation value of 0. The sum of the bit allocation values in one signal frame may be different from that in another signal frame. Therefore, the bit allocation is optimized.
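The two-axis table can be pictured with a short sketch. The numbers are purely illustrative and are not taken from the specification; `table` and `frame_totals` are hypothetical names.

```python
# Rows follow the time axis (signal frames); columns follow the band axis.
# All values are made up for illustration only.
table = [
    [0, 4, 6, 2],   # frame 0: subband 0 is non-effective and gets 0 bits
    [0, 0, 0, 0],   # frame 1: a fully masked frame gets no bits at all
    [2, 5, 3, 0],   # frame 2
]

# One bit allocation value per (moment, subband) pair.
bits_frame0_band2 = table[0][2]

# Unlike a fixed per-frame budget, the per-frame sums may differ.
frame_totals = [sum(row) for row in table]
```

In this sketch the three frames use 12, 0, and 10 bits respectively, in contrast with the conventional method, which gives every frame of the same interval the same number of bits.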
In accordance with another aspect of the present invention, the quantizing step is briefly explained as follows. First, all the bit allocation values are initialized, that is, they are assigned a value of 0. Then, the signal-to-mask ratios are classified into several quantized levels so that each quantized level has at least one signal-to-mask ratio. In each quantized level, a signal-to-mask ratio suitable for representing the quantized level is selected as the sample signal-to-mask ratio; the middle value is a good choice. Then, the mask-to-noise ratios of the quantized levels are calculated according to the sample signal-to-mask ratios. The quantized level corresponding to the minimum mask-to-noise ratio is the quantized level with the greatest weight. Therefore, the bit allocation values of all the signal frames and subbands included in this quantized level are increased, and the total bit allocation value, that is, the number of bits still to be allocated, is decreased accordingly. These steps are repeated until the total bit allocation value becomes 0, at which point all the bit allocation values have been obtained.
An equation is provided to calculate the mask-to-noise ratios.
MNR = BQL × 6.02 − SMR
wherein MNR is the mask-to-noise ratio, BQL is the bit allocation value, and SMR is the sample signal-to-mask ratio.
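The quantizing step and the equation above can be combined into a minimal sketch. Several details are assumptions, because the specification leaves them open: the levels are formed by equal-width bins over the range of signal-to-mask ratios, the "middle value" is taken as the lower median, each iteration adds one bit per cell, and the loop stops once no level can be raised with the bits that remain. The function name `define_bit_allocation_table` and its parameters are hypothetical.

```python
from statistics import median_low


def define_bit_allocation_table(smr, total_bits, num_levels=4):
    """Greedy sketch: smr[frame][subband] holds signal-to-mask ratios in dB."""
    frames, bands = len(smr), len(smr[0])
    table = [[0] * bands for _ in range(frames)]      # initialize to 0

    # Classify every (frame, subband) cell into a quantized level.
    # Assumption: equal-width bins; empty bins are simply not created,
    # so each level holds at least one signal-to-mask ratio.
    values = [v for row in smr for v in row]
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_levels or 1.0
    levels = {}
    for f in range(frames):
        for b in range(bands):
            idx = min(int((smr[f][b] - lo) / width), num_levels - 1)
            levels.setdefault(idx, []).append((f, b))

    # Sample SMR per level (assumption: lower median as the middle value).
    sample_smr = {k: median_low([smr[f][b] for f, b in cells])
                  for k, cells in levels.items()}
    level_bits = {k: 0 for k in levels}               # BQL per level

    remaining = total_bits
    while True:
        # Only levels whose cells can all receive one more bit are eligible.
        affordable = [k for k, cells in levels.items()
                      if len(cells) <= remaining]
        if not affordable:
            break
        # MNR = BQL x 6.02 - SMR; the minimum MNR has the greatest weight.
        worst = min(affordable,
                    key=lambda k: level_bits[k] * 6.02 - sample_smr[k])
        level_bits[worst] += 1
        for f, b in levels[worst]:
            table[f][b] += 1
        remaining -= len(levels[worst])
    return table, remaining


smr = [[20.0, 5.0], [-3.0, 18.0]]                     # illustrative values
table, remaining = define_bit_allocation_table(smr, total_bits=8, num_levels=2)
```

With these illustrative inputs, the two cells with high signal-to-mask ratios share one level and absorb all 8 bits, while the two weakly audible cells stay at 0 bits. A different binning rule or increment size would give a different table, so the sketch should be read as one possible reading of the steps rather than the definitive procedure.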
In accordance with the present invention, and with reference to the foregoing paragraphs, the device includes a psychoacoustic model, a digital storage unit, and a quantizer. The psychoacoustic model provides the signal-to-mask ratios according to the audio signals. The digital storage unit, electrically connected to the psychoacoustic model, stores the signal-to-mask ratios. The quantizer, electrically connected to the digital storage unit, quantizes the signal-to-mask ratios to generate several quantized levels.
In accordance with the present invention, an apparatus adopting the present method and device is also disclosed. The apparatus includes a bit allocation device and an audio processor. The bit allocation device has been described in the foregoing paragraphs. The audio processor, i.e. an encoding processor or a decoding processor, is used for processing the audio signals according to the present table of bit allocation.
The present invention may best be understood through the following description with reference to the accompanying drawings, in which: