1. Field of the Invention
This invention relates to an apparatus and a method for encoding a signal by quantizing an input signal through time base/frequency base conversion, and to an apparatus and a method for decoding the encoded signal. More particularly, the present invention relates to an apparatus and a method for encoding a signal that can be suitably used for encoding audio signals in a highly efficient way, and to a corresponding apparatus and method for decoding.
2. Prior Art
Various methods for encoding an audio signal are known to date, including those adapted to compress the signal by utilizing the statistical characteristics of audio signals (including voice signals and music signals) in terms of time and frequency and the characteristics of the human hearing sense. Such coding methods can be roughly classified into encoding in the time region, encoding in the frequency region and analytic/synthetic encoding.
In transform coding, in which an input signal on the time base is encoded by orthogonally transforming it into a signal on the frequency base, it is desirable from the viewpoint of coding efficiency that the characteristic or correlative aspects of the time base waveform of the input signal are removed before subjecting it to transform coding.
Additionally, when quantizing the coefficient data on the orthogonally transformed frequency base, the data are more often than not weighted for bit allocation. However, it is not desirable to transmit the information on the bit allocation as additional information or side information because it inevitably increases the bit rate.
In view of these circumstances, it is therefore an object of the present invention to provide an apparatus and a method for encoding a signal that are adapted to remove the characteristic or correlative aspects of the time base waveform prior to orthogonal transform in order to improve the coding efficiency and, at the same time, reduce the bit rate by making the corresponding decoder able to know the bit allocation without directly transmitting the information on the bit allocation used for the quantizing operation.
Meanwhile, for the operation of transform coding of encoding an input signal on the time base by orthogonally transforming it into a signal on the frequency base, techniques have been proposed to quantize the coefficient data on the frequency base by dynamically allocating bits in response to the input signal in order to realize a low coding rate. However, cumbersome arithmetic operations are required for the bit allocation particularly when the bit allocation changes for each coefficient in the operation of dividing coefficient data on the frequency base in order to produce sub-vectors for vector quantization.
Additionally, the reproduced sound can become highly unstable when the bit allocation changes extremely for each frame that provides a unit for orthogonal transform.
In view of these circumstances, it is therefore another object of the present invention to provide an apparatus and a method for encoding a signal, by a process involving orthogonal transform, that are adapted to dynamically allocate bits in response to the input signal with simple arithmetic operations and to reproduce sound stably even if the bit allocation changes remarkably among frames, as well as an apparatus and a method for decoding a signal encoded by such an apparatus and method.
Additionally, in transform coding, in which an input signal on the time base is encoded by orthogonally transforming it into a signal on the frequency base, quantization takes place after bits are allocated to the coefficients on the frequency base such as MDCT coefficients. Quantization errors therefore spread over the entire orthogonal transform block length on the time base and give rise to harsh noises such as pre-echo and post-echo. This tendency is particularly remarkable for sounds that attenuate relatively quickly between pitch peaks. This problem is conventionally addressed by switching the transform window size (so-called window switching). However, this technique involves cumbersome processing operations because it is not easy to determine a window of the right size.
In view of the above circumstances, it is therefore still another object of the present invention to provide an apparatus and a method for encoding a signal adapted to reduce harsh noises such as pre-echo and post-echo without modifying the transform window size as well as an apparatus and a method for decoding a signal encoded by such an apparatus and a method.
According to a first aspect of the invention, the above objectives are achieved by providing a method for encoding an input signal on the time base through orthogonal transform, said method comprising:
a step of removing the correlation of signal waveform on the basis of the parameters obtained by means of linear predictive coding (LPC) analysis and pitch analysis of the input signal on the time base prior to the orthogonal transform.
Preferably, the input time base signal is transformed to coefficient data on the frequency base by means of modified discrete cosine transform (MDCT) in said orthogonal transform step. Preferably, in said correlation removing step, the LPC prediction residue of said input signal is output on the basis of the LPC coefficient obtained through LPC analysis of said input signal and the correlation of the pitch of said LPC prediction residue is removed on the basis of the parameters obtained through pitch analysis of said LPC prediction residue. Preferably, the coefficient data are quantized according to the number of allocated bits determined on the basis of the outcome of said LPC analysis and said pitch analysis.
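By way of illustration only, the removal of waveform correlation prior to the orthogonal transform can be sketched as follows in Python. The sketch assumes the autocorrelation method with the Levinson-Durbin recursion for the LPC analysis and a single-tap long-term predictor for the pitch analysis; these choices, and all function and parameter names, are assumptions made for the example and are not taken from the present specification.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag] for the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order], with x[n] predicted as
    sum_k a[k] * x[n - k] (sign convention assumed for this sketch)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # signal is fully predicted or silent
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:]


def lpc_residual(x, coeffs):
    """Short-term prediction residue: e[n] = x[n] - sum_k a[k] * x[n - k]."""
    out = []
    for n in range(len(x)):
        pred = sum(c * x[n - k - 1]
                   for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        out.append(x[n] - pred)
    return out


def pitch_residual(e, min_lag, max_lag):
    """Remove pitch correlation with a one-tap long-term predictor.

    The lag maximising corr^2 / energy is selected; the gain is the
    least-squares gain for that lag. Returns (residue, lag, gain).
    """
    best_lag, best_gain, best_score = min_lag, 0.0, 0.0
    for lag in range(min_lag, max_lag + 1):
        corr = sum(e[n] * e[n - lag] for n in range(lag, len(e)))
        energy = sum(e[n - lag] ** 2 for n in range(lag, len(e)))
        if energy > 0.0 and corr > 0.0:
            score = corr * corr / energy
            if score > best_score:
                best_lag, best_gain, best_score = lag, corr / energy, score
    out = [e[n] - (best_gain * e[n - best_lag] if n >= best_lag else 0.0)
           for n in range(len(e))]
    return out, best_lag, best_gain
```

In such a sketch the output of `pitch_residual` would be the signal supplied to the orthogonal transform, and the LPC coefficients, pitch lag and pitch gain would be the parameters made available to the decoder.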
According to a second aspect of the invention, there is provided a method for encoding an input signal on the time base through orthogonal transform, said method comprising:
a calculating step of calculating weights as a function of said input signal; and
a quantizing step of determining an order for the coefficient data obtained through the orthogonal transform according to the order of the calculated weights and carrying out an accurate quantizing operation according to the determined order.
Preferably, in said quantizing step, a larger number of allocated bits are used for quantization for the coefficient data of a higher order.
Preferably, the coefficient data obtained through said orthogonal transform are divided into a plurality of bands on the frequency base and the coefficient data of each of the bands are quantized according to said determined order of said weights independently from the remaining bands.
Preferably, the coefficient data of each of the bands are divided into a plurality of groups in the descending order of the bands to define respective coefficient vectors and each of the obtained coefficient vectors is subjected to vector quantization.
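The ordering and grouping described for the second aspect can be sketched as follows in Python. This is a minimal illustration under stated assumptions: the coefficients of one band are ranked by descending weight, cut into sub-vectors of a fixed dimension, and paired with a predetermined bit count per sub-vector. The function names, the fixed dimension and the bit table are hypothetical, and the vector quantizer itself is not shown.

```python
def rank_by_weight(weights):
    """Coefficient indices sorted by descending weight.

    sorted() is stable, so equal weights keep their original index order,
    which lets a decoder holding the same weights rebuild the same ranking.
    """
    return sorted(range(len(weights)), key=lambda i: -weights[i])


def build_subvectors(coeffs, weights, dim, bits_table):
    """Group the ranked coefficients of one band into sub-vectors.

    bits_table[k] is the (assumed, predetermined) number of bits for the
    k-th sub-vector; earlier sub-vectors hold the higher-weight coefficients.
    Returns (order, [(subvector, bits), ...]).
    """
    order = rank_by_weight(weights)
    ranked = [coeffs[i] for i in order]
    groups = [ranked[s:s + dim] for s in range(0, len(ranked), dim)]
    if len(groups) > len(bits_table):
        raise ValueError("bits_table is too short for this band")
    return order, list(zip(groups, bits_table))


def restore_order(groups, order):
    """Decoder side: put the ranked coefficients back in frequency order."""
    flat = [c for group in groups for c in group]
    out = [0.0] * len(flat)
    for rank, index in enumerate(order):
        out[index] = flat[rank]
    return out
```

Because the bit count depends only on the position of a sub-vector in the ranking, no per-coefficient bit calculation is needed in this sketch; only the ranking changes from frame to frame.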
According to a third aspect of the invention, there is provided a method for encoding an input signal on the time base through orthogonal transform on a frame by frame basis, each frame providing a coding unit, said method comprising:
an envelope extracting step of extracting an envelope within each frame of said input signal; and
a gain smoothing step of carrying out a gain smoothing operation on said input signal on the basis of the envelope extracted by said envelope extracting step and supplying the input signal for said orthogonal transform.
Preferably, the input time base signal is transformed to coefficient data on the frequency base by means of modified discrete cosine transform (MDCT) for said orthogonal transform. Preferably, the information on said envelope is quantized and output. Preferably, said frame is divided into a plurality of sub-frames and said envelope is determined as the root mean square (rms) value of each of the divided sub-frames. Preferably, the rms value of each of the divided sub-frames is quantized and output.
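The envelope extraction and gain smoothing of the third aspect can be sketched as follows in Python. The sketch assumes a frame length that is a multiple of the number of sub-frames, a uniform quantizer on the rms value in decibels, and a gain held constant over each sub-frame; the step size, the floor value and all function names are assumptions made for the example, and a practical encoder might interpolate the gain between sub-frames.

```python
import math


def subframe_rms(frame, num_subframes):
    """Envelope of one frame: the rms value of each sub-frame."""
    size = len(frame) // num_subframes
    return [math.sqrt(sum(v * v for v in frame[i * size:(i + 1) * size]) / size)
            for i in range(num_subframes)]


def quantize_rms(rms, step_db=1.5, floor=1e-4):
    """Quantize each rms value to an integer index on a dB grid
    (step_db and floor are hypothetical values)."""
    return [round(20.0 * math.log10(max(r, floor)) / step_db) for r in rms]


def dequantize_rms(indices, step_db=1.5):
    """Rebuild the envelope from the transmitted indices."""
    return [10.0 ** (i * step_db / 20.0) for i in indices]


def smooth_gain(frame, envelope):
    """Encoder: divide each sample by the quantized envelope of its sub-frame."""
    size = len(frame) // len(envelope)
    last = len(envelope) - 1
    return [v / envelope[min(n // size, last)] for n, v in enumerate(frame)]


def restore_gain(smoothed, envelope):
    """Decoder: multiply by the same quantized envelope to undo the smoothing."""
    size = len(smoothed) // len(envelope)
    last = len(envelope) - 1
    return [v * envelope[min(n // size, last)] for n, v in enumerate(smoothed)]
```

Here the encoder smooths the gain with the dequantized envelope rather than the exact rms values, so that the decoder, which only receives the indices, applies exactly the inverse gain.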
Thus, according to the first aspect of the invention, there is provided a method for encoding an input signal on the time base through orthogonal transform, said method comprising:
a step of removing the correlation of signal waveform on the basis of the parameters obtained by means of linear predictive coding (LPC) analysis and pitch analysis of the input signal on the time base prior to the orthogonal transform.
With this arrangement, a residual signal that resembles white noise is subjected to orthogonal transform to improve the coding efficiency. Additionally, in a method for encoding an input signal on the time base through orthogonal transform, preferably a quantization operation is conducted according to the number of allocated bits determined on the basis of the outcome of said linear predictive coding (LPC) analysis and said pitch analysis. Then, the corresponding decoder is able to reproduce the bit allocation of the encoder from the parameters of the LPC analysis and the pitch analysis, which makes it possible to suppress the rate of transmitting side information, and hence the overall bit rate, and to improve the coding efficiency.
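The idea that a decoder can reproduce the bit allocation from transmitted analysis parameters can be sketched as follows in Python. The allocation rule shown, which shares bits in proportion to the log magnitude of the LPC spectral envelope, is a hypothetical stand-in chosen for the example and is not the rule of the present specification; pitch parameters are omitted for brevity and all names are assumptions.

```python
import cmath
import math


def lpc_envelope(coeffs, num_bins):
    """Sample |1 / A(e^{jw})| at num_bins frequencies in (0, pi),
    with A(z) = 1 - sum_k a[k] * z^-k."""
    env = []
    for b in range(num_bins):
        w = math.pi * (b + 0.5) / num_bins
        a = 1.0 - sum(c * cmath.exp(-1j * w * (k + 1))
                      for k, c in enumerate(coeffs))
        env.append(1.0 / max(abs(a), 1e-9))
    return env


def allocate_bits(envelope, total_bits):
    """Hypothetical rule: split total_bits in proportion to the log
    envelope measured above its minimum, then hand leftover bits to the
    highest-scoring bins (ties broken by index, so the result is
    deterministic)."""
    logs = [math.log2(e) for e in envelope]
    base = min(logs)
    scores = [value - base + 1.0 for value in logs]
    total = sum(scores)
    bits = [int(total_bits * s / total) for s in scores]
    leftover = max(0, total_bits - sum(bits))
    ranked = sorted(range(len(bits)), key=lambda i: -scores[i])
    for i in ranked[:leftover]:
        bits[i] += 1
    return bits


def bit_allocation_from_lpc(quantized_coeffs, num_bins, total_bits):
    """Run identically in the encoder and the decoder on the same
    quantized LPC coefficients, so no allocation table is transmitted."""
    return allocate_bits(lpc_envelope(quantized_coeffs, num_bins), total_bits)
```

The essential point of the sketch is that the function depends only on values the decoder also holds (the quantized coefficients and fixed constants), so both sides arrive at the same allocation.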
Still additionally, the operation of encoding high quality audio signals can be carried out highly efficiently by using a technique of modified discrete cosine transform (MDCT) for orthogonal transform.
According to the second aspect of the invention, there is provided a method for encoding an input signal on the time base through orthogonal transform, said method comprising:
a calculating step of calculating weights as a function of said input signal; and
a quantizing step of determining an order for the coefficient data obtained through the orthogonal transform according to the order of the calculated weights and carrying out an accurate quantizing operation according to the determined order.
With this arrangement, it is possible to dynamically allocate bits in response to the input signal with simple arithmetic operations for calculating the number of bits to be allocated to each coefficient.
In particular, when the coefficient data obtained through said orthogonal transform are divided into a plurality of sub-vectors, the coefficient data can be sorted in the descending order of the weights before they are divided into sub-vectors. The number of bits to be allocated to each sub-vector can then be determined by calculating the weight for it, which reduces the arithmetic operations even if the number of bits to be allocated changes for each coefficient.
Additionally, when the coefficient data on the frequency base are divided into bands and the number of bits to be allocated to each band is predetermined, the number of allocated bits is reliably determined for each band. Any abrupt change in the quantization distortion can therefore be prevented, and sound can be reproduced on a stable basis, even if the weight of each coefficient changes extremely from frame to frame.
Still additionally, when the parameters to be used for the arithmetic operations of bit allocation are predetermined and transmitted to the decoder, it is no longer necessary to transmit the information on bit allocation to the decoder so that it is possible to suppress the rate of transmitting side information and hence the overall bit rate and improve the coding efficiency. Still additionally, the operation of encoding high quality audio signals can be carried out highly efficiently by using a technique of modified discrete cosine transform (MDCT) for orthogonal transform.
According to the third aspect of the invention, there is provided a method for encoding an input signal on the time base through orthogonal transform on a frame by frame basis, each frame providing a coding unit, said method comprising:
an envelope extracting step of extracting an envelope within each frame of said input signal; and
a gain smoothing step of carrying out a gain smoothing operation on said input signal on the basis of the envelope extracted by said envelope extracting step and supplying the input signal for said orthogonal transform.
With this arrangement, it is possible to reduce harsh noises such as pre-echo and post-echo without modifying the transform window size as in the case of the prior art.
Additionally, when the information on said envelope is quantized and output to the decoder and the gain is smoothed by using the quantized envelope value, the decoder can accurately restore the gain.
Still additionally, the operation of encoding high quality audio signals can be carried out highly efficiently by using a technique of modified discrete cosine transform (MDCT) for orthogonal transform.