This invention relates to a high efficiency encoding method and device and to a high efficiency decoding method and device, in which input digital data are encoded by high efficiency encoding and are then transmitted, recorded, reproduced and decoded to produce replay data.
There are a variety of high efficiency encoding methods for audio or voice signals. Typical of these are sub-band coding (SBC), a non-block-forming frequency band dividing system in which time-domain audio signals are divided into signals of plural frequency bands without first being divided into plural blocks, and transform coding, a block-forming frequency band dividing system in which time-domain signals are divided into plural blocks, the resulting blocks are transformed into frequency-domain signals by orthogonal transform, and the resulting frequency-domain signals are encoded from one frequency band to another. There is also known a high efficiency encoding method combining sub-band coding and transform coding, in which the time-domain signals are divided into signals of plural frequency bands by sub-band coding and the signals of the respective bands are orthogonally transformed into frequency-domain signals, which are then encoded from one band to another.
A filter employed in the above methods may be exemplified by a quadrature mirror filter (QMF), discussed in R. E. Crochiere, Digital Coding of Speech in Subbands, Bell Syst. Tech. J., Vol. 55, No. 8, 1976. An equal-bandwidth filter dividing method using a polyphase quadrature filter is discussed in Joseph H. Rothweiler, Polyphase Quadrature Filter--A New Subband Coding Technique, ICASSP 83, Boston.
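By way of illustration only, the two-band splitting performed by a QMF may be sketched as follows. The sketch assumes the shortest possible filter pair (two taps, with the high-pass filter being the mirror image of the low-pass filter); the filters used in practical coders are considerably longer, and the function names qmf_analysis and qmf_synthesis are hypothetical.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # tap value of the assumed 2-tap filter pair


def qmf_analysis(signal):
    """Split an even-length signal into low and high bands, each decimated by 2."""
    pairs = range(len(signal) // 2)
    low = [SCALE * (signal[2 * i] + signal[2 * i + 1]) for i in pairs]
    high = [SCALE * (signal[2 * i] - signal[2 * i + 1]) for i in pairs]
    return low, high


def qmf_synthesis(low, high):
    """Recombine the two decimated bands into the full-rate signal."""
    out = []
    for lo, hi in zip(low, high):
        out.append(SCALE * (lo + hi))
        out.append(SCALE * (lo - hi))
    return out
```

Each band carries half as many samples as the input, so that the total amount of data is unchanged by the band division, and with this filter pair the synthesis step restores the input exactly.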
Examples of the orthogonal transform include transforms in which input audio signals are formed into blocks at intervals of a pre-set unit time period (frame) and the resulting blocks are processed with the fast Fourier transform (FFT), discrete cosine transform (DCT) or modified DCT (MDCT) to transform the time-domain signals into frequency-domain signals. As for the MDCT, reference is made to J. P. Princen and A. B. Bradley (Univ. of Surrey, Royal Melbourne Inst. of Tech.), Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation, ICASSP 1987. The concrete techniques of the MDCT are discussed in detail in our co-pending U.S. Ser. No. 07/950,945 filed on Sept. 24, 1992, now U.S. Pat. No. 5,349,549.
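By way of illustration only, the block-based MDCT and the time domain aliasing cancellation on which it relies may be sketched as follows. The sketch assumes a sine window, a direct (non-fast) evaluation of the transform and a signal length that is a multiple of the hop size; the function names are hypothetical, and the sketch is not taken from the above-referenced patent.

```python
import math


def sine_window(n_half):
    """Window of length 2*n_half satisfying w[n]^2 + w[n + n_half]^2 = 1."""
    length = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block, window):
    """Transform 2N windowed time samples into N coefficients."""
    n_half = len(block) // 2
    n0 = 0.5 + n_half / 2.0
    return [
        sum(window[n] * block[n]
            * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]


def imdct(coeffs, window):
    """Transform N coefficients back into 2N windowed, aliased time samples."""
    n_half = len(coeffs)
    n0 = 0.5 + n_half / 2.0
    return [
        window[n] * (2.0 / n_half)
        * sum(coeffs[k] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
              for k in range(n_half))
        for n in range(2 * n_half)
    ]


def analysis_synthesis(signal, n_half):
    """Run MDCT and IMDCT on 50% overlapped blocks and overlap-add the result."""
    window = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        block = padded[start:start + 2 * n_half]
        decoded = imdct(mdct(block, window), window)
        for n, value in enumerate(decoded):
            out[start + n] += value  # aliasing cancels between neighbours
    return out[n_half:n_half + len(signal)]
```

Each block of 2N samples yields only N coefficients, so that the 50% overlap between blocks does not increase the amount of data. The aliasing contained in each decoded block is cancelled when neighbouring blocks are added together.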
In quantizing the frequency components, the frequency bands are selected to take into account the characteristics of the human aural sense. That is, the audio signals are divided into a plurality of, for example 25, bands, known as critical bands, whose bandwidths become broader with increasing frequency. In encoding the data of the respective frequency bands, either a pre-set number of bits is allocated to each band, or a variable number of bits is allocated to each band by adaptive bit allocation. For example, when coefficient data obtained by MDCT are encoded with bit allocation, the MDCT coefficient data of the respective bands obtained by block-based MDCT are encoded with adaptively allocated numbers of bits. The following two bit allocation methods are known.
In IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, No. 4, August 1977, bit allocation is made based on the magnitudes of the signals of the respective bands. With this system, the quantization noise spectrum becomes flatter and the noise energy is minimized. However, the noise as actually perceived by the ear is not optimum because the aural masking effect is not utilized.
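By way of illustration only, a bit allocation based solely on the signal magnitudes of the respective bands may be sketched as follows. The sketch assumes a greedy procedure in which each bit is given to the band having the largest estimated quantization noise, and assumes that each additional bit reduces the noise energy of a band by a factor of four (about 6 dB); it is not the exact procedure of the above-cited publication, and the function name allocate_bits is hypothetical.

```python
def allocate_bits(band_energies, total_bits):
    """Distribute total_bits over the bands using signal energy alone."""
    bits = [0] * len(band_energies)
    # With zero bits a band is not transmitted, so its noise equals its energy.
    noise = list(band_energies)
    for _ in range(total_bits):
        # Give the next bit to the band whose estimated noise is largest.
        worst = max(range(len(noise)), key=lambda i: noise[i])
        bits[worst] += 1
        noise[worst] /= 4.0  # assumed 6 dB noise reduction per bit
    return bits


print(allocate_bits([1024.0, 64.0, 4.0, 1.0], 8))  # prints [5, 3, 0, 0]
```

The allocation drives the estimated noise of all bands towards a common level, which corresponds to the flat noise spectrum referred to above. Since masking is not considered, bits may be spent on noise which the ear would not perceive in any case.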
In M. A. Krasner (MIT), The Critical Band Coder--Digital Encoding of the Perceptual Requirements of the Auditory System, ICASSP 1980, there is discussed a method in which the aural masking effect is used to find the signal to noise ratio required for each band, and fixed bit allocation is made accordingly. However, a satisfactory characteristic value cannot be obtained with this method, because the bit allocation remains fixed even when the characteristic is measured with a sine wave input.
With the above-described methods, temporal characteristics of the input information signals, such as fluctuations with time, are not taken into account. As a result, these methods cannot solve the problem of the highly jarring pre-echo which is produced when the input information signal changes abruptly in amplitude, above all when a small information signal changes to a larger information signal. By pre-echo is meant a phenomenon in which the quantization noise produced directly before the abrupt change from the small information signal to the larger information signal is heard without being covered by backward masking, thus causing deterioration in the sound quality.
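By way of illustration only, the origin of the pre-echo may be sketched as follows. The sketch assumes a single block which is silent up to a signal onset, and substitutes a plain orthonormal DCT with uniform quantization for the MDCT and the adaptive quantization of an actual coder; the function names and the step sizes are hypothetical.

```python
import math


def dct(block):
    """Orthonormal DCT-II of one block."""
    n = len(block)
    return [
        math.sqrt((1.0 if k == 0 else 2.0) / n)
        * sum(block[i] * math.cos(math.pi * (i + 0.5) * k / n)
              for i in range(n))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct()."""
    n = len(coeffs)
    return [
        sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * coeffs[k]
            * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n))
        for i in range(n)
    ]


def quantize(coeffs, step):
    """Uniform quantization of the frequency-domain coefficients."""
    return [step * round(c / step) for c in coeffs]


def pre_onset_noise(block, onset, step):
    """Energy of the decoding error in the samples preceding the onset."""
    decoded = idct(quantize(dct(block), step))
    return sum((decoded[i] - block[i]) ** 2 for i in range(onset))


# A 32-sample block: 24 samples of silence followed by an abrupt attack.
attack = [1.0, -0.8, 0.9, -0.7, 0.8, -0.6, 0.7, -0.5]
block = [0.0] * 24 + attack
print(pre_onset_noise(block, 24, 0.2))     # coarse quantization
print(pre_onset_noise(block, 24, 0.0001))  # fine quantization
```

Since each frequency-domain coefficient extends over the entire block, the quantization error introduced in the frequency domain is spread over the whole block on decoding, including the silent portion preceding the onset, where no signal is present to mask it.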
As a method of decreasing the pre-echo to a level imperceptible to the ear, the present Applicant has already proposed in U.S. Ser. No. 07/553,608 filed on Jul. 18, 1990, now U.S. Pat. No. 5,197,087, a method consisting in adaptively changing the block length. Specifically, the method consists in sub-dividing any block in which the signal changes acutely and in which there is therefore a high risk of occurrence of pre-echo. Although the pre-echo may be effectively suppressed with this method, pre-echo still exists in the subdivided block portions, albeit to a limited extent.
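By way of illustration only, the adaptive selection of the block length may be sketched as follows. The sketch assumes a simple detector which compares the energies of successive sub-blocks; the number of sub-blocks, the energy ratio, the energy floor and the function name choose_block_lengths are all hypothetical and are not taken from the above-referenced patent.

```python
def choose_block_lengths(block, n_sub=4, ratio=10.0, floor=1e-9):
    """Return the block lengths with which the given block is to be transformed.

    The block is kept whole unless the energy of some sub-block exceeds
    `ratio` times that of the preceding sub-block, in which case it is
    sub-divided into n_sub short blocks.
    """
    size = len(block) // n_sub
    energies = [
        sum(v * v for v in block[i * size:(i + 1) * size])
        for i in range(n_sub)
    ]
    for previous, current in zip(energies, energies[1:]):
        # The floor keeps digital silence from disabling the comparison.
        if current > ratio * max(previous, floor):
            return [size] * n_sub
    return [len(block)]


print(choose_block_lengths([0.5] * 32))               # prints [32]
print(choose_block_lengths([0.0] * 24 + [1.0] * 8))   # prints [8, 8, 8, 8]
```

With the shorter blocks, the quantization noise is confined to the short block containing the attack, and therefore precedes the attack by at most one short block length. This is why the pre-echo is reduced, but not eliminated.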
If bit allocation is made in consideration only of frequency characteristics, it is difficult to avoid the deterioration of the sound quality due to pre-echo during abrupt transitions of the information signals. Thus there is a demand for a method of effectively preventing the pre-echo from occurring.