1. Field of the Invention
The present invention relates to an encoding apparatus and method, adapted to encode a second code string conforming to a second format based on a second coding method with a higher efficiency than that of a first code string conforming to a first format based on a first coding method.
2. Description of the Related Art
Techniques for recording information on a recording medium capable of recording an encoded audio or speech signal, such as a magneto-optical disc, are widely used. For highly efficient coding of an audio or speech signal, various methods have been proposed, such as the subband coding (SBC) method, in which an audio signal or the like on the time base is divided into a plurality of frequency bands without blocking, and the so-called transform coding method, in which a signal on the time base is transformed to a signal on the frequency base (spectrum transform) and divided into a plurality of frequency bands, and the signal in each of the frequency bands is then encoded. A high efficiency coding method combining the SBC method and the transform coding method has also been proposed. In this third method, for example, after an audio or speech signal is divided into a plurality of frequency bands by the SBC method, the signal in each frequency band is spectrum-transformed to a signal on the frequency base, and the signal is encoded in each spectrum-transformed frequency band. One filter used for such band division is the QMF filter, which is described in R. E. Crochiere: "Digital Coding of Speech Subbands", Bell Syst. Tech. Journal, Vol. 55, No. 8, 1976. Also, the method for equal-bandwidth division by filter is described in Joseph H. Rothweiler: "Polyphase Quadrature Filters - A New Subband Coding Technique", ICASSP 83, BOSTON.
In an example of the above-mentioned spectrum transform, an input audio signal is blocked at predetermined unit times (frames), and each of the blocks is subjected to the discrete Fourier transform (DFT), discrete cosine transform (DCT) or modified discrete cosine transform (MDCT) to transform the time base to the frequency base. The MDCT is described in J. P. Princen and A. B. Bradley, Univ. of Surrey, Royal Melbourne Inst. of Tech.: "Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation", ICASSP, 1987.
When the above-mentioned DFT or DCT is used to transform a waveform signal to a spectrum, a time block consisting of M samples yields M independent real data. Normally, a time block is arranged to overlap each of its neighboring blocks by M1 samples to suppress distortion at the connection between time blocks. Therefore, in the DFT and DCT, a signal is encoded by quantizing, on average, M real data for every (M-M1) samples.
When the MDCT is used to transform a waveform signal to a spectrum, M independent real data are obtained from 2M samples arranged to overlap each of the neighboring blocks by M samples. Therefore, in the MDCT, the signal is encoded by quantizing, on average, M real data for every M samples. In a decoder, the waveform elements obtained by inverse-transforming, block by block, the codes produced with the MDCT are added together while being made to interfere with each other, thereby reconstructing the waveform signal.
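By way of illustration only, the overlap of 2M-sample blocks and the reconstruction by adding inverse-transformed blocks may be sketched in Python as follows. The sine window, the 2/M scaling of the inverse transform, the zero padding of M samples at both ends, and the names sine_window, mdct, imdct, analyze and synthesize are assumptions made for this sketch; they are not taken from the cited reference or from any particular format.

```python
import math

def sine_window(M):
    """Assumed window of length 2M with w[n]^2 + w[n+M]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]

def mdct(block, M):
    """Transform 2M windowed samples into M real coefficients."""
    n0 = (M + 1) / 2.0
    return [sum(block[n] * math.cos(math.pi / M * (n + n0) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]

def imdct(coeffs, M):
    """Inverse transform: M coefficients back to 2M aliased samples."""
    n0 = (M + 1) / 2.0
    return [(2.0 / M) * sum(coeffs[k] * math.cos(math.pi / M * (n + n0) * (k + 0.5))
                            for k in range(M))
            for n in range(2 * M)]

def analyze(signal, M):
    """Split the signal into 2M-sample blocks advancing by M samples."""
    if len(signal) % M != 0:
        raise ValueError("signal length must be a multiple of M in this sketch")
    w = sine_window(M)
    padded = [0.0] * M + list(signal) + [0.0] * M
    blocks = []
    for start in range(0, len(padded) - 2 * M + 1, M):
        seg = padded[start:start + 2 * M]
        blocks.append(mdct([w[n] * seg[n] for n in range(2 * M)], M))
    return blocks

def synthesize(blocks, M):
    """Window each inverse-transformed block and overlap-add the halves."""
    w = sine_window(M)
    out = [0.0] * (M * (len(blocks) + 1))
    for i, coeffs in enumerate(blocks):
        y = imdct(coeffs, M)
        for n in range(2 * M):
            out[i * M + n] += w[n] * y[n]
    return out[M:-M]  # drop the zero padding added in analyze()
```

In this sketch each block contributes M coefficients for every M new input samples, and the aliasing introduced in one block is cancelled by the overlapping halves of its two neighbors.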
Generally, increasing the length of the time block increases the frequency resolution of the spectrum and concentrates energy on specific spectrum components. Therefore, by transforming a waveform signal to a spectrum with a long block length in which each time block overlaps each of its neighboring time blocks by one half, and by using the MDCT, in which the number of spectrum signals obtained does not increase relative to the number of original time samples, coding can be made more efficient than with the DFT or DCT.
By quantizing a signal divided into a plurality of frequency bands by filtering or spectrum transform as described above, it is possible to control the frequency bands in which quantization noise occurs and, by using a property such as the masking effect, to encode an audio signal with a higher auditory efficiency. Also, by normalizing the signal in each frequency band with the maximum absolute value of the signal components in that band before quantization, a still higher coding efficiency can be attained.
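The normalization and quantization of one frequency band may be pictured with the following minimal Python sketch. The symmetric uniform quantizer and the names quantize_band and dequantize_band are assumptions made for illustration; they do not represent the quantizer of any particular format.

```python
def quantize_band(band, bits):
    """Normalize one band by its peak magnitude, then quantize uniformly."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits per sample")
    # Normalizing coefficient: the maximum absolute value in the band.
    # Falling back to 1.0 avoids dividing by zero for an all-zero band.
    scale = max(abs(x) for x in band) or 1.0
    levels = (1 << (bits - 1)) - 1  # codes lie in -levels..+levels
    codes = [round(x / scale * levels) for x in band]
    return scale, codes

def dequantize_band(scale, codes, bits):
    """Reverse the quantization and the normalization."""
    levels = (1 << (bits - 1)) - 1
    return [c * scale / levels for c in codes]
```

Because the step size is proportional to the normalizing coefficient, a quiet band is quantized with a proportionally smaller absolute error than a loud band for the same number of bits.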
The bandwidths used for quantizing the frequency components resulting from the frequency band division are selected with, for example, the human auditory characteristic taken into consideration. That is, an audio signal may be divided into a plurality of frequency bands (25 bands, for example) whose bandwidth is larger as the frequency is higher; such bands are generally called "critical bands". At this time, the data in each band is encoded with a predetermined bit distribution to each band or with an adaptive bit allocation to each band. For example, when coefficient data obtained using the MDCT is encoded with the above bit allocation, the MDCT coefficient data in each band, obtained using the MDCT for each block, is encoded with an adaptively allocated number of bits. The adaptive bit allocation information can be specified in advance as part of the code string, so that the sound quality can be improved by improving the encoding method even after the format for decoding has been determined. The known bit allocation techniques include the following two:
One of them is disclosed in R. Zelinski and P. Noll: "Adaptive Transform Coding of Speech Signals", IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-25, No. 4, August 1977. In this technique, the bit allocation is made based on the size of the signal in each frequency band. With this technique, the quantization noise spectrum is flat and the noise energy is at a minimum, but since no masking effect is used, the actual noise is not auditorily optimum.
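A bit allocation driven only by the signal size in each band may be sketched as follows. The greedy procedure and the rule that each additional bit lowers the quantization noise power by a factor of four (about 6 dB) are common textbook approximations assumed for this sketch; they are not the exact procedure of the cited reference, and the name allocate_bits is hypothetical.

```python
def allocate_bits(band_energies, total_bits):
    """Hand out bits one at a time to the band with the largest noise."""
    bits = [0] * len(band_energies)
    # With zero bits a band is not transmitted, so its noise is taken
    # to equal its signal energy.
    noise = list(band_energies)
    for _ in range(total_bits):
        k = max(range(len(noise)), key=noise.__getitem__)
        bits[k] += 1
        noise[k] /= 4.0  # assumed: one more bit cuts noise power by 4
    return bits
```

The procedure always spends exactly the given bit budget with integer bit counts, and it tends to level the noise across the bands, which corresponds to the flat noise spectrum mentioned above. No masking is modeled.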
The other one is disclosed in M. A. Krasner, MIT: "The Critical Band Coder - Digital Encoding of the Perceptual Requirements of the Auditory System", ICASSP, 1980. In this technique, auditory masking is used to obtain the necessary signal-to-noise ratio for each frequency band, and a fixed bit allocation is made. With this technique, however, since the bit allocation is fixed, the measured characteristic is not so good even for a sine wave input.
To solve the above problems, a high efficiency encoder has been proposed in which all the bits usable for bit allocation are divided between a fixed bit allocation pattern predetermined for each small block and a bit distribution dependent upon the signal size in each block. The division ratio depends upon a signal related to the input signal, and the share of bits for the fixed bit allocation pattern is larger as the spectrum of the signal is smoother.
With the above method, the overall signal-to-noise ratio can be considerably improved by allocating more bits to a block including a specific spectrum component on which energy is concentrated, as with a sine wave input. Generally, since human ears are extremely sensitive to a signal having a steep spectrum component, improving the signal-to-noise ratio with the above method improves not only the measured value but also, effectively, the sound quality.
Many other bit allocation methods have also been proposed. If the auditory model is further elaborated and the capability of the encoder permits, coding with a higher auditory efficiency becomes possible. Generally, in these methods, a real-valued bit allocation reference that realizes the computed signal-to-noise ratio as faithfully as possible is determined, and an integer approximating that value is taken as the number of allocated bits.
For example, the applicant of the present invention has proposed an encoding method in which a signal component having an auditorily important tone component, namely, a signal component having energy concentrated around a specific frequency, is separated from the spectrum signal and encoded separately from the other spectrum components. This method allows an audio signal or the like to be encoded efficiently, at a high compression rate and with little auditory deterioration.
To form an actual code string, it suffices to first encode the quantizing precision information and normalizing coefficient information with a predetermined number of bits for each frequency band in which the normalization and quantization are effected, and then encode the normalized and quantized signals. Also, ISO/IEC 11172-3: 1993 (E) defines a high efficiency coding method in which the number of bits indicating the quantizing precision information varies from one frequency band to another, such that the higher the frequency, the smaller the number of bits indicating the quantizing precision information.
It has also been proposed to determine, for example, the quantizing precision information in a decoder based on the normalizing coefficient information instead of directly encoding the quantizing precision information. In this method, however, since the relation between the normalizing coefficient information and the quantizing precision information is fixed when the format is set, it is not possible to introduce control of the quantizing precision based on a more advanced auditory model that may become available in the future. Also, when the compression rates to be realized range widely, it is necessary to determine the relation between the normalizing coefficient information and quantizing precision information for each compression rate.
Also, an encoding method is known in which a quantized spectrum signal is encoded with a higher efficiency using a variable-length code such as that described in D. A. Huffman: "A Method for Construction of Minimum Redundancy Codes", Proc. I.R.E., 40, p. 1098 (1952).
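The construction of such a variable-length code for quantized spectrum values may be sketched in Python as follows. The function names and the representation of code words as strings of "0" and "1" are assumptions made for readability; a practical encoder would pack the bits and would normally use code tables fixed by the format.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code giving shorter code words to frequent symbols."""
    freq = Counter(symbols)
    if not freq:
        raise ValueError("no symbols to encode")
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, {symbol: partial code word}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)  # two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

def huffman_encode(symbols, code):
    return "".join(code[s] for s in symbols)

def huffman_decode(bits, code):
    inverse = {word: sym for sym, word in code.items()}
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:  # prefix-free: first match is the symbol
            out.append(inverse[word])
            word = ""
    return out
```

Since quantized spectrum values are typically concentrated around zero, the frequent small values receive short code words, which is the source of the efficiency gain.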
As in the above, techniques for a higher-efficiency coding have been developed one after another. By employing a format incorporating a newly developed technique, it is possible to record for a longer time, and also record an audio signal having a higher sound quality for the same length of recording time.
However, if players capable of playing back only signals recorded in a predetermined format (referred to as the "first format" hereinafter) prevail (such a player will be referred to as a "first format-conforming player" hereinafter), the first format-conforming players will not be able to read a recording medium in which signals are recorded in a format using a higher-efficiency coding method (referred to as the "second format" hereinafter). More specifically, even if the recording medium had a flag indicating the format when the first format was determined, a first format-conforming player adapted to read signals with disregard for the flag will read signals from the recording medium on the assumption that all signals in the recording medium have been recorded in the first format. Therefore, not all first format-conforming players will recognize that signals in the recording medium have been recorded in the second format. Thus, if a first format-conforming player plays back a signal recorded in the second format on the assumption that the signal has been recorded in the first format, a terrible noise may occur.
It is therefore an object of the present invention to overcome the above-mentioned drawbacks of the prior art by providing an encoding apparatus and method in which a second code string, which conforms to a second format and has been encoded with a higher efficiency than a first code string conforming to a first format, is played back silently by a player intended for playing back the first code string conforming to the first format.
The above object can be attained by providing an encoder including, according to the present invention:
means for generating a dummy string;
a first encoding means for generating a first code string by forming a blank area in a frame based on the dummy string;
a second encoding means for generating a second code string by encoding an input signal; and
a code string synthesizing means for generating a synthetic code string by embedding the second code string generated by the second encoding means in the blank area in the first code string.
Also the above object can be attained by providing an encoding method including, according to the present invention:
a step of generating a dummy string;
a first encoding step of generating a first code string by forming a blank area in a frame based on the dummy string;
a second encoding step of generating a second code string by encoding an input signal; and
a code string synthesizing step of generating a synthetic code string by embedding the second code string generated at the second encoding step in the blank area in the first code string.
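The four steps above may be pictured with the following Python sketch. Everything specific in it is hypothetical and chosen only for illustration: the 32-byte frame, the 4-byte dummy string, the placement of the blank area immediately after the dummy string, and the stand-in second encoder that merely packs samples as signed 8-bit values. The actual frame layout and coding methods are those of the embodiments, not of this sketch.

```python
FRAME_SIZE = 32                            # hypothetical frame length in bytes
DUMMY = bytes([0x00, 0x00, 0xA5, 0x5A])    # hypothetical dummy string

def generate_dummy_string():
    """Step of generating a dummy string."""
    return DUMMY

def encode_first(dummy, frame_size=FRAME_SIZE):
    """First encoding step: a frame whose remainder is a blank area."""
    frame = bytearray(frame_size)          # all zeros
    frame[:len(dummy)] = dummy
    blank_offset = len(dummy)              # blank area starts after the dummy
    return frame, blank_offset

def encode_second(samples):
    """Second encoding step (stand-in): signed 8-bit packing, -128..127."""
    return bytes(s & 0xFF for s in samples)

def synthesize_code_string(first, blank_offset, second):
    """Synthesizing step: embed the second code string in the blank area."""
    if len(second) > len(first) - blank_offset:
        raise ValueError("second code string does not fit in the blank area")
    out = bytearray(first)
    out[blank_offset:blank_offset + len(second)] = second
    return bytes(out)

def build_frame(samples):
    first, offset = encode_first(generate_dummy_string())
    return synthesize_code_string(first, offset, encode_second(samples))
```

The frame length does not change when the second code string is embedded, so a reader that parses only the first code string still sees a frame of the expected size.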
Also the above object can be attained by providing an encoder including, according to the present invention:
a first encoding means for generating a first code string;
a second encoding means for generating a second code string; and
a code string synthesizing means for generating a synthetic code string in such a manner that a part of the second code string generated by the second encoding means forms a part of the first code string.
Also the above object can be attained by providing an encoding method including, according to the present invention:
a first encoding step of generating a first code string;
a second encoding step of generating a second code string; and
a code string synthesizing step of generating a synthetic code string in such a manner that a part of the second code string generated at the second encoding step forms a part of the first code string.
Also the above object can be attained by providing a recording medium having recorded therein, according to the present invention, a synthetic code string obtained by embedding a second code string in a blank area formed in a first code string based on a dummy string formed in the first code string.
Also the above object can be attained by providing a recording medium having recorded therein, according to the present invention, a code string synthesized so that a part of a second code string forms a part of a first code string.
Also the above object can be attained by providing a decoder including, according to the present invention:
means for receiving a synthetic code string obtained by embedding a second code string in a blank area formed in a first code string based on a dummy string generated in the first code string;
means for detecting the dummy string from the synthetic code string received by the synthetic code string receiving means;
means for decoding the second code string; and
means for controlling output of a signal generated by decoding the second code string according to whether the dummy string detecting means has detected a predetermined dummy string.
Also the above object can be attained by providing a decoding method including, according to the present invention, steps of:
receiving a synthetic code string obtained by embedding a second code string in a blank area formed in a first code string based on a dummy string generated in the first code string;
detecting the dummy string from the synthetic code string received at the synthetic code string receiving step;
decoding the second code string; and
controlling output of a signal generated by decoding the second code string depending upon whether a predetermined dummy string has been detected at the dummy string detecting step.
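The decoding steps above may likewise be sketched in Python. The frame layout, the dummy string value, the stand-in decoding of signed 8-bit samples, and the choice to output silence when the dummy string is absent are all assumptions made for this sketch; the text above requires only that the output be controlled according to the result of the detection.

```python
FRAME_SIZE = 32                            # hypothetical frame length in bytes
DUMMY = bytes([0x00, 0x00, 0xA5, 0x5A])    # hypothetical dummy string

def detect_dummy_string(frame):
    """Detecting step: is the predetermined dummy string present?"""
    return frame[:len(DUMMY)] == DUMMY

def decode_second(frame, n_samples):
    """Decoding step (stand-in): read signed 8-bit samples after the dummy."""
    payload = frame[len(DUMMY):len(DUMMY) + n_samples]
    return [b - 256 if b > 127 else b for b in payload]

def decode_frame(frame, n_samples):
    """Receive a synthetic code string and control the output."""
    if len(frame) != FRAME_SIZE:
        raise ValueError("unexpected frame length")
    samples = decode_second(frame, n_samples)
    if not detect_dummy_string(frame):
        # Assumed policy: mute rather than output a doubtful signal.
        return [0] * n_samples
    return samples
```

Gating the output on the dummy string keeps the decoder from emitting a signal decoded from a frame that does not actually carry a second code string.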
Also the above object can be attained by providing a decoder including, according to the present invention:
means for receiving a code string synthesized so that a part of a second code string forms a part of a first code string;
means for detecting a predetermined dummy string from the synthetic code string received by the synthetic code string receiving means;
means for decoding the second code string; and
means for controlling output of a signal generated by decoding the second code string depending upon whether the dummy string detecting means has detected the predetermined string.
Also the above object can be attained by providing a decoding method including, according to the present invention, steps of:
receiving a code string synthesized so that a part of a second code string forms a part of a first code string;
detecting a predetermined dummy string from the synthetic code string received at the synthetic code string receiving step;
decoding the second code string; and
controlling output of a signal generated by decoding the second code string depending upon whether the predetermined dummy string has been detected at the dummy string detecting step.
These objects and other objects, features and advantages of the present invention will become more apparent from the following detailed description of the preferred embodiments of the present invention when taken in conjunction with the accompanying drawings.