1. Field of the Invention
The present invention relates to an encoding apparatus and method, adapted to encode a second code string which can be encoded with a higher efficiency than that with which a first code string can be encoded.
2. Description of the Related Art
Techniques for recording information to a recording medium capable of recording an encoded audio or speech signal, such as a magneto-optical disc, are widely used. For highly efficient coding of an audio or speech signal, various methods have been proposed, such as the subband coding (SBC) method, in which an audio signal or the like on the time base is divided into a plurality of frequency bands without blocking, and the so-called transform coding method, in which a signal on the time base is transformed to one on the frequency base (spectrum transform) and divided into a plurality of frequency bands, and the signal in each of the frequency bands is then encoded. A high efficiency coding method combining the SBC method and the transform coding method has also been proposed. In this third method, for example, after an audio or speech signal is divided into a plurality of frequency bands by the SBC method, the signal in each frequency band is spectrum-transformed to a signal on the frequency base, and the signal is encoded in each spectrum-transformed frequency band. The QMF filter, for example, is used as the band-dividing filter in this coding method. The QMF filter is described in "R. E. Crochiere: Digital Coding of Speech in Subbands, Bell Syst. Tech. Journal, Vol. 55, No. 8, 1976". Also, a method for equal-bandwidth division by filter is described in "Joseph H. Rothweiler: Polyphase Quadrature Filters - A New Subband Coding Technique, ICASSP 83, BOSTON".
In an example of the above-mentioned spectrum transform, an input audio signal is blocked at predetermined unit times (encoding frames), and each block is subjected to the discrete Fourier transform (DFT), discrete cosine transform (DCT) or modified discrete cosine transform (MDCT) to transform the time base to the frequency base. The MDCT is described in "J. P. Princen and A. B. Bradley, Univ. of Surrey, Royal Melbourne Inst. of Tech.: Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation, ICASSP, 1987".
When the above-mentioned DFT or DCT is used to transform a waveform signal to a spectrum, a transform of a time block consisting of M samples yields M independent real data. Normally, a time block is arranged to overlap each of its neighboring blocks by M1 samples to suppress distortion at the connection between time blocks. Therefore, in the DFT and DCT, the signal is encoded by quantizing on average M real data for (M-M1) samples.
When the MDCT is used to transform a waveform signal to a spectrum, M independent real data are obtained from 2M samples arranged to overlap each of the neighboring blocks by M samples. Therefore, in the MDCT, the signal is encoded by quantizing on average M real data for M samples. In a decoder, the waveform elements obtained by inverse-transforming the MDCT code in each block are added together while being made to interfere with each other, whereby the waveform signal can be reconstructed.
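The sample accounting described above can be illustrated with a minimal sketch. It assumes a sine window and the textbook MDCT/IMDCT formulas; the function names and the zero padding at both ends are illustrative choices, not part of the cited reference. Each block of 2M samples yields M coefficients, and overlap-adding the inverse-transformed blocks cancels the time-domain aliasing.

```python
import math

def sine_window(M):
    # Window of length 2M with w[n]^2 + w[n+M]^2 = 1 (Princen-Bradley condition).
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]

def mdct(block, window):
    # 2M windowed time samples -> M real coefficients.
    M = len(block) // 2
    n0 = 0.5 + M / 2
    return [
        sum(window[n] * block[n] * math.cos(math.pi / M * (n + n0) * (k + 0.5))
            for n in range(2 * M))
        for k in range(M)
    ]

def imdct(coeffs, window):
    # M coefficients -> 2M windowed, time-aliased samples.
    M = len(coeffs)
    n0 = 0.5 + M / 2
    return [
        window[n] * (2.0 / M) * sum(
            coeffs[k] * math.cos(math.pi / M * (n + n0) * (k + 0.5))
            for k in range(M))
        for n in range(2 * M)
    ]

def analyze(signal, M):
    # Assumes len(signal) is a multiple of M; pads M zeros at each end
    # so that every sample is covered by exactly two blocks.
    w = sine_window(M)
    padded = [0.0] * M + list(signal) + [0.0] * M
    return [mdct(padded[s:s + 2 * M], w)
            for s in range(0, len(padded) - 2 * M + 1, M)]

def synthesize(frames, M):
    # Overlap-add: the aliasing terms of neighboring blocks cancel.
    w = sine_window(M)
    out = [0.0] * (M * (len(frames) + 1))
    for i, coeffs in enumerate(frames):
        for n, value in enumerate(imdct(coeffs, w)):
            out[i * M + n] += value
    return out[M:-M]
```

Apart from the one extra block caused by the padding, the number of coefficients equals the number of input samples, which is the property the text attributes to the MDCT.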
Generally, increasing the length of the time block increases the frequency resolution of the spectrum and concentrates energy on specific spectrum components. Therefore, by transforming a waveform signal to a spectrum with the MDCT, in which each time block overlaps each of its neighboring blocks by half so that the block length is increased while the number of spectrum signals obtained does not increase relative to the number of original time samples, coding can be performed with a higher efficiency than is attainable with the DFT or DCT.
By quantizing a signal divided into a plurality of frequency bands by filtering or spectrum transform as above, it is possible to control the frequency bands in which quantization noise occurs and, using properties such as the masking effect, to encode an audio signal with a higher efficiency in the auditory sense. Also, by normalizing the signal in each frequency band with the maximum absolute value of the signal components in that band before quantization, a still higher coding efficiency can be attained.
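The per-band normalization and quantization described above can be sketched as follows. This is a minimal illustration under stated assumptions: a uniform symmetric quantizer, at least 2 bits per sample, and hypothetical function names; the text does not prescribe a particular quantizer.

```python
def quantize_band(samples, bits):
    # Normalizing coefficient: maximum absolute value in the band.
    scale = max(abs(s) for s in samples)
    if scale == 0.0:
        return 0.0, [0] * len(samples)
    # Symmetric integer range -levels..+levels (assumes bits >= 2).
    levels = (1 << (bits - 1)) - 1
    codes = [round(s / scale * levels) for s in samples]
    return scale, codes

def dequantize_band(scale, codes, bits):
    # Inverse of quantize_band, up to the quantization error.
    levels = (1 << (bits - 1)) - 1
    return [c / levels * scale for c in codes]
```

Because every band is scaled to the same normalized range before quantization, a low-level band uses the full range of its quantizer instead of only the few smallest steps.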
The bandwidths for quantization of the frequency components resulting from the frequency band division are selected with, for example, the human auditory characteristics taken into consideration. That is, an audio signal may be divided into a plurality of frequency bands (25 bands, for example) whose bandwidth is larger as the frequency is higher, generally called "critical bands". At this time, data in each band is encoded with a predetermined bit distribution to each band or with an adaptive bit allocation to each band. For example, when coefficient data obtained using the MDCT are encoded with the above bit allocation, the MDCT coefficient data in each band, obtained by the MDCT of each block, are encoded with an adaptively allocated number of bits. The adaptive bit allocation information can be arranged to be included in the code string in advance, whereby the sound quality can be improved by improving the coding method even after the format for decoding has been determined. The known bit allocation techniques include the following two:
One of them is disclosed in "R. Zelinski and P. Noll: Adaptive Transform Coding of Speech Signals, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-25, No. 4, August 1977". In this technique, the bit allocation is made based on the magnitude of the signal in each frequency band. With this technique, the quantization noise spectrum is flat and the noise energy is minimized, but since no masking effect is used, the actual noise is not auditorily optimum.
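A magnitude-based allocation of this kind can be sketched with the textbook log-energy rule, in which each band receives the average number of bits plus half the base-2 logarithm of its energy relative to the geometric mean. This rule is used here as an assumption for illustration and is not quoted from the cited paper; the sketch also leaves the allocations real-valued and assumes strictly positive band energies.

```python
import math

def allocate_bits(band_energies, total_bits):
    # Real-valued allocation: b_k = B/N + 0.5 * (log2(e_k) - mean(log2(e))).
    n = len(band_energies)
    log_e = [math.log2(e) for e in band_energies]
    mean_log = sum(log_e) / n
    return [total_bits / n + 0.5 * (le - mean_log) for le in log_e]
```

With this rule the modeled noise energy per band, proportional to e_k * 2^(-2 * b_k), is the same in every band, which corresponds to the flat noise spectrum mentioned above.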
The other is disclosed in "M. A. Krasner, MIT: The Critical Band Coder - Digital Encoding of the Perceptual Requirements of the Auditory System, ICASSP, 1980". In this technique, auditory masking is used to obtain the necessary signal-to-noise ratio for each frequency band, and a fixed bit allocation is made. With this technique, however, since the bit allocation is fixed, the measured characteristic is not so good even for a sine wave input.
To solve the above problems, a high efficiency encoder has been proposed in which all the bits usable for the bit allocation are divided between a fixed bit allocation pattern predetermined for each small block and a bit distribution dependent upon the signal magnitude of each block. The division ratio depends upon a signal related to the input signal, and the share of bits for the fixed bit allocation pattern is larger as the spectrum of the signal is smoother.
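The division of the bit budget can be sketched as below. The text does not say how smoothness is measured or how the signal-dependent part is distributed, so this sketch makes two loudly labeled assumptions: smoothness is taken to be the spectral flatness (geometric mean over arithmetic mean of the band energies), and the signal-dependent part is distributed in proportion to log2(1 + energy). All names are hypothetical, and band energies are assumed to be strictly positive.

```python
import math

def spectral_flatness(energies):
    # Assumed smoothness measure: 1.0 for a flat spectrum, near 0 for a peaky one.
    n = len(energies)
    geometric = math.exp(sum(math.log(e) for e in energies) / n)
    arithmetic = sum(energies) / n
    return geometric / arithmetic

def hybrid_allocation(energies, fixed_pattern, total_bits):
    # The smoother the spectrum, the larger the share given to the fixed pattern.
    fixed_share = spectral_flatness(energies)
    fixed_budget = fixed_share * total_bits
    adaptive_budget = total_bits - fixed_budget
    pattern_sum = sum(fixed_pattern)
    # Assumed signal-dependent weighting (not taken from the source).
    weights = [math.log2(1.0 + e) for e in energies]
    weight_sum = sum(weights)
    return [
        fixed_budget * p / pattern_sum + adaptive_budget * w / weight_sum
        for p, w in zip(fixed_pattern, weights)
    ]
```

For a flat spectrum the result reduces to the fixed pattern scaled to the budget, while for an input dominated by one band, such as a sine wave, most of the bits move to that band.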
With the above method adopted in the encoder, the overall signal-to-noise ratio can be considerably improved by allocating more bits to a block containing a specific spectrum component on which energy is concentrated, as with a sine wave input. Generally, since the human ear is extremely sensitive to a signal having a steep spectrum component, improving the signal-to-noise ratio with the above method improves not only the measured value but also, effectively, the sound quality.
Many other bit allocation methods have also been proposed. If the auditory model is further elaborated and the capability of the encoder is raised, coding with a still higher efficiency becomes possible. Generally, in these methods, a real-valued bit allocation reference that realizes the computed signal-to-noise ratio with the highest possible fidelity is determined, and an integer approximating that value is taken as the number of allocated bits.
For example, the Applicant of the present invention has proposed an encoding method in which a signal component having an auditorily important tone component, namely, a signal component whose energy is concentrated around a particular frequency, is separated from the spectrum signal and encoded separately from the other spectrum components. This method allows an audio signal or the like to be encoded efficiently at a high compression rate with little auditory deterioration.
To form an actual code string, it suffices first to encode the quantizing precision information and normalizing coefficient information with a predetermined number of bits for each frequency band in which the normalization and quantization are effected, and then to encode the normalized and quantized signals. Also, ISO/IEC 11172-3:1993 (E) defines a high efficiency coding method in which the number of bits indicating the quantizing precision information varies from one frequency band to another, such that the number of bits is smaller as the frequency is higher.
It has also been proposed to determine the quantizing precision information from, for example, the normalizing coefficient information in a decoder instead of directly encoding the quantizing precision information. In this method, however, since the relation between the normalizing coefficient information and the quantizing precision information is fixed when the format is set, it is not possible to introduce control of the quantizing precision based on a more advanced auditory model that may become available in the future. Also, when the compression rates to be realized range widely, it is necessary to determine the relation between the normalizing coefficient information and the quantizing precision information for each compression rate.
Also, an encoding method is known in which a quantized spectrum signal is encoded with a higher efficiency using a variable-length code described in, for example, "D. A. Huffman: A Method for the Construction of Minimum-Redundancy Codes, Proc. I.R.E., 40, p. 1098 (1952)".
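As a minimal sketch of such a variable-length code, the following builds a Huffman code table for a sequence of quantized spectrum values using the standard greedy merge of the two least frequent subtrees. The function name and the representation of codewords as bit strings are illustrative assumptions.

```python
import heapq
from collections import Counter

def huffman_table(symbols):
    # Map each distinct symbol to a prefix-free bit string.
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie-breaker, partial code table for the subtree).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        # Prefix one subtree with "0" and the other with "1", then merge them.
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]
```

Quantized spectra typically contain many small values, so frequent values such as 0 receive short codewords and the total number of bits falls below that of a fixed-length code.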
As in the above, techniques for a higher-efficiency coding have been developed one after another. By employing a format incorporating a newly developed technique, it is possible to record for a longer time, and also record an audio signal having a higher sound quality for the same length of recording time.
However, if players capable of playing back only signals recorded in a predetermined format (hereinafter referred to as the "first format") have come into wide use (such a player is hereinafter referred to as a "first format-conforming player"), the first format-conforming players cannot read a recording medium on which signals are recorded in a format using a higher-efficiency coding method (hereinafter referred to as the "second format"). More specifically, even if the recording medium has had a flag indicating the format since the first format was determined, a first format-conforming player adapted to read signals with disregard for the flag will read signals from the recording medium on the assumption that all signals on the recording medium have been recorded in the first format. Therefore, not all first format-conforming players will recognize that signals on the recording medium have been recorded in the second format when that is the case.
Thus, if a first format-conforming player plays back a signal recorded in the second format on the assumption that the signal has been recorded in the first format, a terrible noise may occur.
To avoid the above, the Applicant of the present invention has also filed a patent application for an improved method of recording data in a so-called TOC area, in which, when a music piece is recorded by the second format codec, the first format-conforming player actually plays back a warning message recorded by the first format codec in an area other than the TOC area.
However, the above method proposed by the Applicant requires an ample spare area in the TOC area in the first format, and has the disadvantage that playback by a second format-conforming player is complicated.
It is therefore an object of the present invention to overcome the above-mentioned drawbacks of the prior art by providing an encoding apparatus and method which need no ample spare area in the TOC area and with which playback by a second format-conforming player is not complicated.
The above object can be attained by providing, according to the present invention, an encoder including:
a first encoding means for generating a first code string by encoding a warning message signal or silent signal;
a second encoding means for generating, when the first encoding means is encoding a silent signal, a second code string by encoding an input signal; and
means for generating a synthetic code string by combining the first and second code strings together.
Also the above object can be attained by providing, according to the present invention, an encoding method including:
a first encoding step of generating a first code string by encoding a warning message signal or silent signal;
a second encoding step of generating, when a silent signal is being encoded at the first encoding step, a second code string by encoding an input signal; and
a step of generating a synthetic code string by combining the first and second code strings together.
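The encoding steps above can be sketched as follows. Both codecs are replaced by trivial stand-ins, and the frame length, the byte patterns, and the placement of the second code string at the end of the frame are hypothetical choices made only for illustration.

```python
FRAME_SIZE = 16  # hypothetical fixed frame length in bytes

def first_encode(kind):
    # Stand-in for the first-format codec (hypothetical byte patterns):
    # silence compresses to a short code, a warning message fills the frame.
    return b"\x00\x00" if kind == "silence" else b"\xA5" * FRAME_SIZE

def second_encode(samples):
    # Stand-in for the second-format codec: one byte per sample.
    return bytes(s & 0xFF for s in samples)

def make_frame(kind, samples):
    first = first_encode(kind)
    # The input signal is encoded only while the first codec encodes silence.
    second = second_encode(samples) if kind == "silence" else b""
    spare = FRAME_SIZE - len(first) - len(second)
    if spare < 0:
        raise ValueError("code strings do not fit in one frame")
    # Synthetic code string: first code string, zero padding, second code string.
    return first + bytes(spare) + second
```

In this sketch a player that understands only the first code string decodes either the warning message or silence, while the second code string occupies the space that the short silent code leaves free.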
Also the above object can be attained by providing a recording medium on which is recorded a synthetic signal generated by combining a first code string and a second code string, in which the first code string is generated by encoding a warning message signal or silent signal, while the second code string is generated by encoding an input signal when the first code string is an encoded silent signal.
Also the above object can be attained by providing, according to the present invention, a decoder including:
means for receiving a code string synthesized by combining a code string encoded by a first encoding means and a code string encoded by a second encoding means;
means for detecting a predetermined bit pattern in the first code string; and
means for decoding the second code string;
the second code string decoding means providing a predetermined sound when the predetermined bit pattern has not been detected by the bit pattern detecting means.
Also the above object can be attained by providing a decoding method including, according to the present invention, steps of:
receiving a code string synthesized by combining a code string encoded by a first encoding and a code string encoded by a second encoding;
detecting a predetermined bit pattern in the first code string; and
decoding the second code string;
at the second code string decoding step, there being provided a predetermined sound when the predetermined bit pattern has not been detected at the bit pattern detecting step.
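The decoding behavior above can be sketched as follows. The bit pattern, its position at the start of the frame, the predetermined sound, and the stand-in second decoder are all hypothetical; the text only requires that a predetermined sound be provided when the pattern is not detected.

```python
PATTERN = b"\x00\x00"               # hypothetical pattern expected in the first code string
PREDETERMINED_SOUND = [0, 0, 0, 0]  # hypothetical fallback output, here silence

def decode_frame(frame, payload_len):
    # Without the expected pattern, the second code string is not trusted.
    if not frame.startswith(PATTERN):
        return list(PREDETERMINED_SOUND)
    # Stand-in second decoder: one sample per byte at the end of the frame.
    return list(frame[len(frame) - payload_len:])
```

Checking the pattern first keeps the decoder from interpreting a frame that carries, for example, an encoded warning message as if it contained a second code string.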
Also the above object can be attained by providing, according to the present invention, a decoder including:
means for receiving a code string synthesized by recording, in a predetermined-length encoding frame, a first code string from the top of the encoding frame and a second code string from the bottom of the encoding frame; and
means for decoding the second code string recorded from the bottom of the encoding frame.
Also the above object can be attained by providing a decoding method including, according to the present invention, steps of:
receiving a code string synthesized by recording, in a predetermined-length encoding frame, a first code string from the top of the encoding frame and a second code string from the bottom of the encoding frame; and
decoding the second code string recorded from the bottom of the encoding frame.
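One way to read a second code string from the bottom of a fixed-length frame is sketched below. The text does not specify how the decoder learns the extent of the second code string, so this sketch assumes a hypothetical one-byte length field stored in the last byte of the frame; the function names are likewise illustrative.

```python
def pack_frame(first, second, frame_size):
    # First code string from the top, second code string from the bottom.
    # The final byte is an assumed length field for the second code string.
    spare = frame_size - len(first) - len(second) - 1
    if spare < 0 or len(second) > 255:
        raise ValueError("code strings do not fit in one frame")
    return first + bytes(spare) + second + bytes([len(second)])

def read_second(frame):
    # Locate the second code string from the end of the frame only,
    # without parsing the first code string.
    length = frame[-1]
    end = len(frame) - 1
    return frame[end - length:end]
```

Because the second code string is located relative to the end of the frame, its decoder does not need to know how many bytes the first code string occupies.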
These objects and other objects, features and advantages of the present invention will become more apparent from the following detailed description of the preferred embodiments of the present invention when taken in conjunction with the accompanying drawings.