1. Field of the Invention
This invention relates to an information encoding method for encoding information, an associated information decoding apparatus and an information recording medium having the encoded information recorded thereon.
2. Description of the Related Art
As to what can record the encoded acoustic information or the encoded speech information, referred to herein as audio signals, an information recording medium, such as a magneto-optical disc, has so far been proposed. A variety of high-efficiency encoding techniques exist for encoding audio or speech signals. Examples of these techniques include so-called transform coding as a blocking frequency spectrum splitting system and a so-called sub-band coding system (SBC) as a non-blocking frequency spectrum splitting system. In the transform coding, audio signals on the time axis are blocked every pre-set time interval, the blocked time-domain signals are transformed into signals on the frequency axis, and the resulting frequency-domain signals are encoded from band to band. In the sub-band coding system, the audio signals on the time axis are split into plural frequency bands and encoded without blocking. In a combination of the sub-band coding system and the transform coding system, the audio signals on the time axis are split into plural frequency bands by the sub-band coding system, and the resulting band-based signals are transformed into frequency-domain signals by orthogonal transform for encoding.
As band-splitting filters used in the sub-band coding system, there is a so-called quadrature mirror filter (QMF) discussed in R. E. Crochiere, "Digital Coding of Speech in Sub-bands", Bell Syst. Tech. J., Vol.55, No.8, 1976. This QMF filter divides the frequency spectrum into two bands of equal bandwidths. With the QMF filter, so-called aliasing is not produced on subsequent synthesis of the band-split signals. The technique of splitting the frequency spectrum into equal frequency bands is discussed in Joseph H. Rothweiler, "Polyphase Quadrature Filters--A New Subband Coding Technique", ICASSP 83 BOSTON. With the polyphase quadrature filter, the signal can be split at a time into plural frequency bands of equal bandwidths.
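The two-band split and alias-free synthesis described above may be illustrated by the following minimal sketch. It assumes the shortest possible QMF pair (the two-tap Haar pair, written in block form) and an even-length input; the function names are hypothetical, and practical coders use much longer filters.

```python
import math

def qmf_analysis(x):
    # Block form of the two-tap (Haar) QMF pair: the low band is a scaled sum and
    # the high band a scaled difference of each sample pair. Each band is
    # decimated by 2, so the total number of samples does not increase.
    s = 1 / math.sqrt(2)
    low = [s * (x[i] + x[i + 1]) for i in range(0, len(x) - 1, 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, len(x) - 1, 2)]
    return low, high

def qmf_synthesis(low, high):
    # Recombine the two bands; the aliasing introduced by decimation cancels.
    s = 1 / math.sqrt(2)
    out = []
    for l, h in zip(low, high):
        out.extend([s * (l + h), s * (l - h)])
    return out

if __name__ == "__main__":
    x = [1.0, 4.0, -2.0, 3.0, 0.0, 5.0]
    low, high = qmf_analysis(x)
    print(qmf_synthesis(low, high))
```

Each band carries half as many samples as the input, and the synthesis stage returns the original signal up to rounding error.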
Among the techniques for orthogonal transform, there is known such a technique in which the input audio signal is split into frames of a predetermined time duration and the resulting frames are processed by discrete Fourier transform (DFT), discrete cosine transform (DCT) or modified DCT (MDCT) to convert the signals from the time axis to the frequency axis. Discussions of a MDCT may be found in J. P. Princen and A. B. Bradley, "Subband/Transform Coding Using Filter Bank Based on Time Domain Aliasing Cancellation", ICASSP 1987.
If the above-mentioned DFT or DCT is used as a method for orthogonal transform of waveform signals, using a time block made up of M samples, M independent real-number data are obtained. For reducing the junction distortion between these time blocks, M1 sample data each are usually overlapped between both neighboring time blocks. Thus, in DFT or DCT, M real-number data are obtained on an average for (M-M1) sample data, so that these M real-number data are subsequently quantized and encoded.
Conversely, if the above-mentioned MDCT is used as the method for orthogonal transform, M independent real-number data are obtained from 2M samples obtained on overlapping M sample data between two neighboring time blocks. That is, if MDCT is used, M real-number data are obtained for M sample data on an average. These M real-number data are then quantized and encoded. In the decoding apparatus, waveform elements obtained on inverse transform in each block are summed together with interference to re-construct the waveform signals.
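The sample-count property of the MDCT described above may be checked with the following minimal sketch. The sine window, the block size M = 8, the zero padding at both signal ends and all function names are assumptions made for illustration and are not taken from this specification.

```python
import math

def sine_window(M):
    # Length-2M sine window; w[n]**2 + w[n + M]**2 == 1 (Princen-Bradley condition).
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]

def mdct(block, window):
    # 2M windowed time samples in, M real spectral coefficients out.
    M = len(block) // 2
    n0 = (M + 1) / 2
    return [sum(window[n] * block[n] * math.cos(math.pi / M * (n + n0) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]

def imdct(coeffs, window):
    # M coefficients in, 2M windowed samples out; each block alone still
    # contains time-domain aliasing.
    M = len(coeffs)
    n0 = (M + 1) / 2
    return [window[n] * (2.0 / M) *
            sum(coeffs[k] * math.cos(math.pi / M * (n + n0) * (k + 0.5))
                for k in range(M))
            for n in range(2 * M)]

def analyze_and_resynthesize(signal, M):
    # len(signal) is assumed to be a multiple of M.
    window = sine_window(M)
    padded = [0.0] * M + list(signal) + [0.0] * M
    out = [0.0] * len(padded)
    n_coeffs = 0
    for start in range(0, len(padded) - M, M):   # hop of M samples, blocks of 2M
        coeffs = mdct(padded[start:start + 2 * M], window)
        n_coeffs += len(coeffs)
        for n, y in enumerate(imdct(coeffs, window)):
            out[start + n] += y                  # overlap-add cancels the aliasing
    return out[M:-M], n_coeffs

if __name__ == "__main__":
    sig = [math.sin(0.3 * n) for n in range(32)]
    rec, n_coeffs = analyze_and_resynthesize(sig, 8)
    print(max(abs(a - b) for a, b in zip(sig, rec)), n_coeffs)
```

Every hop of M new samples yields exactly M coefficients, and summing the overlapping halves of neighboring inverse-transformed blocks restores the waveform.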
Meanwhile, if the time block for orthogonal transform is lengthened, the frequency resolution is increased, with the result that the signal energy is concentrated in specified spectral signal components. Thus, with the MDCT in which orthogonal transform is carried out using a long time block obtained on overlapping one-half sample data between both neighboring time blocks, and in which the number of the spectral signal components is not increased as compared to the number of the original time-domain sample data, a higher encoding efficiency can be realized than if the DFT or DCT is used. Moreover, if neighboring time blocks are overlapped with each other with a sufficiently long overlap, junction distortion between time blocks of waveform signals can be reduced.
By quantizing signal components split from band to band by a filter or orthogonal transform, it becomes possible to control the quantization noise, thus enabling encoding with perceptually higher encoding efficiency by exploiting masking effects. By normalizing respective sample data with the maximum value of the absolute values of the signal components in each band prior to quantization, the encoding efficiency can be improved further.
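The normalization and quantization of one band may be sketched as follows. The uniform symmetric quantizer, the representation of the quantization fineness as a word length in bits, and the function names are assumptions made for illustration only.

```python
def normalize_and_quantize(band, bits):
    # Normalization coefficient: maximum absolute value of the band's components.
    scale = max(abs(x) for x in band)
    if scale == 0.0 or bits < 2:
        return scale, [0] * len(band)
    levels = (1 << (bits - 1)) - 1          # integer codes lie in [-levels, +levels]
    return scale, [round(x / scale * levels) for x in band]

def dequantize(scale, codes, bits):
    # Inverse operation as it would be carried out on the decoding side.
    if scale == 0.0 or bits < 2:
        return [0.0] * len(codes)
    levels = (1 << (bits - 1)) - 1
    return [scale * q / levels for q in codes]

if __name__ == "__main__":
    band = [0.5, -2.0, 1.1, 0.25]
    scale, codes = normalize_and_quantize(band, 4)
    print(scale, codes, dequantize(scale, codes, 4))
```

Because each band is scaled to its own maximum before quantization, the quantization error of every component is bounded by half a step of that band's scale, whatever the absolute signal level.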
As the band splitting width used for quantizing the signal components resulting from splitting of the frequency spectrum of the audio signals, the band width taking into account the psychoacoustic characteristics of the human being is preferably used. That is, it is preferred to divide the frequency spectrum of the audio signals into a plurality of, for example, 25, critical bands. The width of the critical bands increases with increasing frequency. In encoding the band-based data in such case, bits are fixedly or adaptively allocated among the various critical bands. For example, when applying adaptive bit allocation to the spectral coefficient data resulting from a MDCT, the spectral coefficient data generated by the MDCT within each of the critical bands is quantized using an adaptively allocated number of bits. The following two techniques are known as the bit allocation technique.
In R. Zelinsky and P. Noll, "Adaptive Transform Coding of Speech Signals", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-25, August 1977, bit allocation is carried out on the basis of the amplitude of the signal in each critical band. This technique produces a flat quantization spectrum and minimizes noise energy, but the noise level perceived by the listener is not optimum because the technique does not exploit the psychoacoustic masking effect.
In M. A. Krassener, "The Critical Band Coder--Digital Encoding of the Perceptual Requirements of the Auditory System", MIT, ICASSP 1980, there is described a technique in which the psychoacoustic masking effect is used to determine a fixed bit allocation that produces the necessary bit allocation for each critical band. However, with this technique, since the bit allocation is fixed, non-optimum results are obtained even for a strongly tonal signal such as a sine wave.
For overcoming this problem, it has been proposed to divide the bits that may be used for bit allocation into a fixed pattern allocation fixed for each band or each small block subdivided from the band and a bit allocation portion dependent on the amplitude of the signal in each block. The division ratio is set depending on a signal related to the input signal such that the division ratio for the fixed allocation pattern portion becomes higher the smoother the pattern of the signal spectrum.
With this method, if the audio signal has high energy concentration in a specified spectral signal component, as in the case of a sine wave, abundant bits are allocated to a block containing the signal spectral component for significantly improving the signal-to-noise ratio as a whole. In general, the hearing sense of the human being is highly sensitive to a signal having sharp spectral signal components, so that, if the signal-to-noise ratio is improved by using this method, not only the numerical values as measured can be improved, but also the audio signal as heard may be improved in quality.
Various other bit allocation methods have been proposed and the perceptual models have become refined, such that, if the encoding device is of high ability, a perceptually higher encoding efficiency may be achieved.
With these methods, the usual practice is to find a real-number bit allocation reference value which will realize the theoretical S/N ratio as faithfully as possible and to use an integer approximating it as an allocated number of bits.
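The step from a real-number reference value to an integer number of bits may be sketched as follows. The energy-based formula used here is the classical rate-distortion allocation and is only an assumed example of a reference value; this specification does not prescribe it. The function names and the 16-bit upper limit are likewise hypothetical.

```python
import math

def real_valued_allocation(band_energies, average_bits):
    # b_i = average_bits + 0.5 * log2(E_i / geometric mean of all band energies).
    # All energies are assumed to be strictly positive.
    logs = [math.log2(e) for e in band_energies]
    mean_log = sum(logs) / len(logs)
    return [average_bits + 0.5 * (l - mean_log) for l in logs]

def integer_allocation(reference_values, max_bits=16):
    # Use the nearest integer, limited to the range of valid word lengths.
    return [min(max_bits, max(0, round(b))) for b in reference_values]

if __name__ == "__main__":
    ref = real_valued_allocation([16.0, 4.0, 1.0, 1.0], 4)
    print(ref, integer_allocation(ref))
```

In this example the reference values sum to the budget of 16 bits while the rounded values sum to 15, which shows why a practical encoder would normally follow the rounding with a budget adjustment step.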
In the U.S. patent application Ser. No. 08/374518 now U.S. Pat. No. 5,717,821 by the present Assignee, there is proposed a method of separating perceptually crucial tonal components from spectral signal components and encoding these tonal components separately from the other spectral components. This assures efficient encoding of the audio signals with a high compression ratio without substantially producing perceptual deterioration.
For constructing an actual codestring, it suffices to encode the quantization fineness information and the normalization coefficient information with a pre-set number of bits for each band for which normalization and quantization are performed, and to encode the normalized and quantized spectral signal components. In the ISO standard (ISO/IEC 11172-3:1993 (E), 1993), there is described a high-efficiency method in which the number of bits representing the quantization fineness information is set so as to be different from band to band. The number of bits representing the quantization fineness information is set so as to be smaller with increased frequency.
There is also known a method in which the quantization fineness information is determined from, for example, the normalization coefficient information in the decoding device instead of directly encoding the quantization fineness information. However, since the relation between the normalization coefficient information and the quantization fineness information is determined with this method upon setting the standard, quantization fineness control based on an advanced perceptual model cannot be introduced in future. Moreover, if there is a width in the compression rate to be realized, it becomes necessary to set the relation between the normalization coefficient information and the quantization fineness information from one compression rate to another.
There is also known a method of encoding the quantized spectral signal components more efficiently by using variable length codes as discussed in D. A. Huffman, Proc. I. R.E., 40, p.1098 (1952), "A Method for Construction of Minimum redundancy Codes".
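Variable length coding of quantized spectral values may be sketched as follows, using the standard Huffman construction. The sample data and the function name are hypothetical, and the codebooks of an actual standard are normally fixed in advance rather than built per signal.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    # Build a prefix code in which frequent symbols receive short codewords.
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)      # the two least frequent subtrees
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]

if __name__ == "__main__":
    quantized = [0] * 8 + [1] * 4 + [-1] * 2 + [2, -2]
    code = huffman_code(quantized)
    print(code, sum(len(code[s]) for s in quantized))
```

For these 16 values the variable length code needs 30 bits in total, against 48 bits for a fixed 3-bit code over the five distinct values.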
The above are merely illustrative examples of the methods for raising the encoding efficiency, which are being developed one after another. Therefore, by use of the standard which has incorporated the newly developed method, recording of longer time duration or recording of audio signals with higher quality for the same recording time becomes possible.
In determining the above-described standard, a method is used in which there is room left for recording flag information concerning the standard on the information recording medium in preparation for a future standard modification or expansion. For example, on initial standardization, `0` is recorded as a 1-bit flag information and, in case of a standard modification, `1` is recorded as the flag information. The reproducing device conforming to the modified standard checks if the flag information is `0` or `1` and, if the flag information is `1`, signals are read out and reproduced from the information recording medium. If the flag information is `0`, and the reproducing device also conforms to the initially determined standard, the signals are read out and reproduced from the information recording medium based on this standard. If otherwise, the signals are not reproduced.
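The flag check described above may be sketched as follows. The constant and function names are hypothetical, and the set of standards supported by a device is an assumed way of modelling whether that device conforms to one or both standards.

```python
INITIAL_STANDARD = 0    # flag value recorded on initial standardization
MODIFIED_STANDARD = 1   # flag value recorded after a standard modification

def select_standard(flag, supported_standards):
    """Return the standard to reproduce with, or None if reproduction is refused."""
    if flag in supported_standards:
        return flag
    return None

if __name__ == "__main__":
    print(select_standard(1, {INITIAL_STANDARD, MODIFIED_STANDARD}))
    print(select_standard(0, {MODIFIED_STANDARD}))
```

A device supporting both standards reproduces either kind of medium, while a device that does not support the standard indicated by the flag refrains from reproduction.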
The above-described conventional information encoding method is carried out on an information encoding and/or decoding device (compressed data recording and/or reproducing device, referred to herein simply as a recording/reproducing device) shown for example in FIG. 1.
The recording/reproducing device uses, as an information recording medium, a magneto-optical disc 1, which is run in rotation by a spindle motor 51. During data recording on the magneto-optical disc 1, a modulated magnetic field corresponding to the recording data is applied from the magnetic head 54 while the laser light is radiated from the optical head 53 for recording, that is for recording by magnetic field modulation, thereby recording data along a recording track of the magneto-optical disc 1. For reproducing data from the magneto-optical disc 1, the recording track of the disc 1 is traced by the optical head 53 and changes caused in the direction of polarization of the reflected laser light from the magneto-optical disc 1 are detected for photomagnetic reproduction.
The optical head 53 is made up of optical components, such as a laser light source, for example, a laser diode, a collimator lens, an objective lens, a polarization beam splitter or a cylindrical lens, and a photodetector having light-receiving segments of a pre-set pattern. The optical head 53 is provided facing the magnetic head 54 with the magneto-optical disc 1 in-between. For recording data on the magneto-optical disc 1, the magnetic head 54 is driven by a head driving circuit 66 of a recording system, as later explained, for applying a modulated magnetic field corresponding to the recording data, at the same time as the laser light is radiated on a target track of the magneto-optical disc 1, by way of performing thermo-magnetic recording in accordance with the magnetic field modulation system. In addition, the optical head 53 detects the reflected laser light from the target track for detecting focusing error signals and tracking error signals by the astigmatic method and by the push-pull method, respectively. For reproducing data from the magneto-optical disc 1, the optical head 53 detects the focusing error signals and tracking error signals while also detecting difference in the angle of polarization (Kerr rotation) of the reflected light from the target track of the laser light for generating the playback signals.
An output of the optical head 53 is sent to an RF circuit 55, which then extracts the focusing error signals and tracking error signals from the output of the optical head 53 to send the extracted signals to a servo control circuit 56 while converting the playback signals into bi-level signals and routing the bi-level signals to a decoder 71 of a reproducing system as later explained.
The servo control circuit 56 is made up of, for example, a focusing servo circuit, a tracking servo circuit, a spindle motor servo control circuit and a thread servo control circuit. The focusing servo circuit controls the optical system of the optical head 53 so that the focusing error signal will be zero. The tracking servo circuit controls the optical system of the optical head 53 so that the tracking error signal will be zero. The spindle motor servo control circuit controls the spindle motor 51 for rotationally driving the magneto-optical disc 1 to rotate at a pre-set rotational velocity, such as constant linear velocity. The thread servo control circuit moves the optical head 53 and the magnetic head 54 to a target track position of the magneto-optical disc 1 designated by a system controller 57. The servo control circuit 56, performing these various control operations, sends to the system controller 57 the information specifying the operating states of various components controlled by the servo control circuit 56.
To the system controller 57 are connected a key input unit 58 and a display 59. The system controller 57 controls the recording system and the reproducing system by the operating input information from the key input unit 58. The system controller 57 also controls the recording position or the reproducing position on the recording track traced by the optical head 53 and the magnetic head 54 based on the sector-based address information read out from the recording track of the magneto-optical disc 1, such as the header time or the sub-code Q data. The system controller 57 also causes the reproducing time to be displayed on the display 59 based on the data compression rate by the recording/reproducing device and the reproducing position information on the recording track.
For this reproducing time display, the sector-based address information reproduced from the recording track of the magneto-optical disc 1 by, for example, the header time or the sub-code Q data (absolute time information), is multiplied by the reciprocal of the data compression rate, such as 4 in case of 1/4 compression, to find the actual time information, which is displayed on the display 59. During recording, if the absolute time information is recorded (pre-formatted) on the recording track of, for example, a magneto-optical disc, this pre-formatted absolute time information may be read and multiplied by a reciprocal of the data compression rate for displaying the current position in terms of the actual recording time.
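The time conversion described above may be sketched as follows. The function names and the minutes:seconds display format are assumptions made for illustration.

```python
from fractions import Fraction

def actual_time(absolute_seconds, compression_rate):
    # Multiply the absolute time by the reciprocal of the data compression
    # rate, for example by 4 in the case of 1/4 compression.
    return absolute_seconds * (1 / Fraction(compression_rate))

def format_mmss(seconds):
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

if __name__ == "__main__":
    print(format_mmss(actual_time(90, Fraction(1, 4))))
```

With 1/4 compression, an absolute time of 90 seconds read from the disc corresponds to an actual time of 360 seconds, displayed as 06:00.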
In the recording system of the recording/reproducing device, an analog audio input signal Ain from an input terminal 60 is fed via a low-pass filter 61 to an analog/digital (A/D) converter 62 which then quantizes the analog audio input signal Ain. On the other hand, a digital audio input signal Din from an input terminal 67 is supplied via a digital input interfacing circuit 68 to an ATC encoder 63. The ATC encoder 63 performs bit compression (data compression) associated with a pre-set data compression rate on digital audio PCM data of a pre-set transfer rate corresponding to the input signal Ain quantized by the A/D converter 62. The compressed data outputted by the ATC encoder 63 (ATC data) is routed to a memory (RAM) 64. If the data compression rate is 1/8, the data transfer rate in this area is reduced to one-eighth of the data transfer rate of the standard format (CD-DA format) of 75 sectors/second, or to 9.375 sectors/second.
The memory 64 has its data writing and data readout controlled by the system controller 57, and is used for temporarily storing the ATC data supplied from the ATC encoder 63 for recording on the disc when the necessity arises. That is, for the data compression rate of, for example, 1/8, the compressed audio data supplied from the ATC encoder 63 has its data transfer rate reduced to one-eighth of the data transfer rate of the standard CD-DA format of 75 sectors/second, or to 9.375 sectors/second. It is this compressed data that is continuously recorded on the memory 64. It is sufficient to record one of eight sectors, as explained previously. However, since the recording of every eighth sector is virtually impossible, sector-continuous recording is performed, as will be explained subsequently.
This recording is done in a burst fashion, that is intermittently, at a data transfer rate equal to that of the standard CD-DA format or 75 sectors/second, with a `cluster` as a recording unit with the interposition of a non-recording period. The cluster is comprised of a pre-set plural number of sectors, such as 32, and several sectors each ahead and at back of the cluster. That is, in the memory 64, the ATC audio data, with the data compression rate of 1/8, continuously written at a low transfer rate of 9.375 (=75/8) sectors/second conforming to the above bit compression rate, is read out in a burst fashion as the recording data at the above-mentioned transfer rate of 75 sectors/second. The overall data transfer rate of the read-out and recorded data, inclusive of the non-recording period, is the above-mentioned low rate of 9.375 sectors/second. However, the instantaneous data transfer rate within the burst-like recording time is the above-mentioned instantaneous rate of 75 sectors/second. Therefore, if the rotational velocity of the disc is the same as the velocity of the standard CD-DA format (constant linear velocity), the recording performed is of the same recording density and the same storage pattern as those of the CD-DA format.
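The relation between the continuous and the burst transfer rates may be worked through with the following sketch. It ignores the cluster-connecting sectors for simplicity, and the function and variable names are hypothetical.

```python
from fractions import Fraction

CD_DA_RATE = Fraction(75)   # sectors/second of the standard CD-DA format

def burst_schedule(compression_rate, cluster_sectors=32):
    # Rate at which compressed data is written continuously into the memory.
    average_rate = CD_DA_RATE * Fraction(compression_rate)
    # Time needed to accumulate one cluster in the memory.
    fill_time = cluster_sectors / average_rate
    # Time needed to record that cluster on the disc at the instantaneous rate.
    burst_time = cluster_sectors / CD_DA_RATE
    # Fraction of the time during which recording actually takes place.
    duty = burst_time / fill_time
    return average_rate, fill_time, burst_time, duty

if __name__ == "__main__":
    avg, fill, burst, duty = burst_schedule(Fraction(1, 8))
    print(float(avg), float(fill), float(burst), duty)
```

For a compression rate of 1/8 the memory fills at 9.375 sectors/second, so a 32-sector cluster accumulates in about 3.41 seconds and is recorded in about 0.43 seconds, the head being active one-eighth of the time.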
The ATC audio data read out in a burst fashion from the memory 64 at the transfer rate (instantaneous transfer rate) of 75 sectors/second, that is recording data, is sent to an encoder 65. In the data string supplied from the memory 64 to the encoder 65, the unit of continuous recording by one recording is the `cluster` made up of a plurality of, for example, 32, sectors, and several cluster-connecting sectors arrayed before and after the `cluster`. This cluster connecting sector is set so as to be longer than the interleaving length in the encoder 65 so that interleaving cannot affect data of other clusters.
The encoder 65 performs error correction processing, that is parity appendage, interleaving or EFM encoding, on the recording data supplied thereto in a burst fashion as described above. It is this recording data, thus processed by the encoder 65, that is routed to a magnetic head driving circuit 66. The magnetic head 54 is connected to this magnetic head driving circuit 66 which drives the magnetic head 54 for applying the modulated magnetic field corresponding to the recording data to the magneto-optical disc 1.
The system controller 57, controlling the memory 64 as described above, also controls the recording position so that the recording data read out in the burst fashion from the memory 64 will be continuously recorded on the recording track of the magneto-optical disc 1. The recording position is controlled by managing the recording position of the recording data read out in the burst fashion from the memory 64 by the system controller 57 for supplying a control signal designating the recording position on the recording track of the magneto-optical disc 1 to the servo control circuit 56.
The reproducing system is hereinafter explained. This reproducing system is designed for reproducing recording data continuously recorded on the recording track of the magneto-optical disc 1 by the above recording system. The reproducing system includes a decoder 71 fed from the RF circuit 55 with a bi-level version of the playback output obtained on tracing the recording track of the magneto-optical disc 1 by the laser light from the optical head 53. This reproducing system can read out information signals from not only the magneto-optical disc 1 but also from a read-only optical disc such as a Compact Disc (trade mark).
The decoder 71 is a counterpart device of the encoder 65 of the above-described recording system. Specifically, the decoder 71 processes the bi-level playback output from the RF circuit 55, such as with the above-mentioned decoding for error correction or EFM decoding, and reproduces the ATC audio data having the data compression rate of 1/8 at a transfer rate of 75 sectors/second which is faster than the regular transfer rate. The playback data, obtained by the decoder 71, is routed to a memory (RAM) 72.
The memory 72 has data writing and data readout controlled by the system controller 57, such that the playback data supplied from the decoder 71 at the transfer rate of 75 sectors/second is written therein in a burst fashion at the same transfer rate of 75 sectors/second. The playback data, written in the memory 72 at the transfer rate of 75 sectors/second, is continuously read out from the memory 72 at a transfer rate of 9.375 sectors/second corresponding to the data compression rate of 1/8.
The system controller 57 performs memory control of writing the playback data in a burst fashion in the memory 72 at the transfer rate of 75 sectors/second and of continuously reading out the written playback data from the memory 72 at the transfer rate of 9.375 sectors/second. In addition to performing the above-mentioned memory control for the memory 72, the system controller 57 performs playback position control of continuously reproducing the playback data written in a burst fashion in the memory 72 by this memory control from the recording track of the magneto-optical disc 1. The playback position is controlled by the system controller 57 supplying a control signal designating the playback position on the recording track of the magneto-optical disc 1 or the optical disc to the servo control circuit 56.
The ATC audio data, obtained as playback data continuously read out at the transfer rate of 9.375 sectors/second, are routed to an ATC decoder 73. The ATC decoder 73 can cope with both the A-codec and the B-codec. This ATC decoder 73 is a counterpart of the ATC encoder 63 of the recording system and reproduces the 16-bit digital audio data by 8-fold data expansion (bit expansion) of the ATC data. The digital audio data from the ATC decoder 73 is routed to a digital/analog (D/A) converter 74.
The D/A converter 74 converts digital audio data supplied from the ATC decoder 73 into analog signals for forming analog audio output signals Aout. The analog audio output signals Aout, thus obtained from the D/A converter 74, are outputted via low-pass filter 75 at an output terminal 76.
However, if a reproducing device capable of reproducing only signals recorded by a pre-set standard, referred to herein as `old standard` or `first encoding method`, comes into widespread use, this type of the reproducing device, referred to herein as a `reproducing device conforming to the old standard`, cannot reproduce an information recording medium recorded with an upper order standard employing a higher efficiency encoding system, thus inconveniencing the user of the device. The upper order standard is referred to herein as a `new standard` or `a second encoding method`. In particular, in certain reproducing devices developed at the time point of formulation of the old standard, the flag information recorded on the information recording medium is disregarded and the signals recorded on the information recording medium are reproduced on the assumption that these signals are all recorded by the old standard. That is, if the information is recorded on an information recording medium in accordance with the new standard, not all reproducing devices conforming to the old standard can recognize it. Thus, if a reproducing device conforming to the old standard construes the information recording medium having recorded thereon signals conforming to the new standard as being an information recording medium having recorded thereon signals conforming to the old standard and proceeds to reproduction, the device may fail to operate normally or may generate objectionable noise.
FIG. 2 shows a conventional formatting example in the case of recording signals encoded as described above on a magneto-optical disc. In the example of FIG. 2, it is assumed that four audio signal data (four musical numbers) have been recorded on the disc.
In FIG. 2, not only the four audio signal data but also the management data used for recording/reproducing the audio signal data are recorded on the disc. In an address 0 and an address 1 of the management data area, a leading data number and a trailing data number are recorded, respectively. In the example of FIG. 2, 1 and 4 are recorded as the value of the leading data number and the trailing data number, respectively. This indicates that four audio signal data of from number 1 to number 4 have been recorded on the disc.
In the addresses 5 to 8 of the management data area, there is recorded the information on the address storage positions specifying in which portion of the management data area `data specifying in which portion of the disc each audio signal data is recorded`, that is the address information, is recorded. The information on the address storage position is recorded in the sequence of the audio signal data, that is in the number playing sequence, such that the information on the address storage position for audio signal data played first is stored in the address 5, while the information on the address storage position for audio signal data played second is stored in the address 6, and so forth. By using this management data, the reproducing sequence of the first number and the second number can be easily interchanged by exchanging the contents of the addresses 5 and 6 instead of by exchanging the actual recording positions of the audio signal data. In the management data area is reserved a spare area, stuffed with 0s, for enabling future expansion.
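The indirection provided by the management data may be sketched as follows. Only the roles of the addresses 0, 1 and 5 to 8 follow the description above; the storage positions 100 to 112, the start and end addresses and the function names are hypothetical values chosen for illustration.

```python
def build_management_area():
    # Address 0: leading data number, address 1: trailing data number.
    area = {0: 1, 1: 4}
    # Addresses 5 to 8: where the address information of each number is stored,
    # listed in playing sequence (storage positions are hypothetical).
    area.update({5: 100, 6: 104, 7: 108, 8: 112})
    # Address information: (start address, end address) on the disc (hypothetical).
    area.update({100: (1000, 1999), 104: (2000, 2999),
                 108: (3000, 3999), 112: (4000, 4999)})
    return area

def play_order(area):
    # Follow the pointers in addresses 5, 6, ... to obtain the playing sequence.
    count = area[1] - area[0] + 1
    return [area[area[5 + i]] for i in range(count)]

def swap_numbers(area, i, j):
    # Exchange two entries of the playing sequence; number k is indexed at
    # address 4 + k. The recorded audio signal data itself is not moved.
    a, b = 4 + i, 4 + j
    area[a], area[b] = area[b], area[a]

if __name__ == "__main__":
    area = build_management_area()
    swap_numbers(area, 1, 2)
    print(play_order(area))
```

Exchanging the contents of the addresses 5 and 6 changes the playing sequence while the address information and the audio signal data stay where they were recorded.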
It is assumed that a certain encoding method, referred to herein as an A-codec, is developed, a recording format for a disc is standardized using this encoding method, and that later an encoding method of a higher efficiency, representing the expansion of the A-codec, referred to herein as a B-codec, has been developed. In such case, the signals encoded by the B-codec can be recorded on the same sort of the disc as that on which signals by the A-codec are recorded. If the signals by the B-codec can be recorded similarly to those by the A-codec, signals can be recorded for a longer time on the disc, or to a higher signal quality, thus conveniently expanding the field of application of the disc.
For recording on the disc the signals encoded by the B-codec representing expansion of the A-codec, the mode designation information shown in FIG. 3 is recorded in the address 2 set as the spare area on a disc designed to cope with only the old standard (A-codec) shown in FIG. 2. The mode designation information `0` specifies that recording is made in accordance with the old standard, while the mode designation information `1` specifies that recording is made in accordance with the new standard (B-codec). Thus, if the mode designation information is `1` during disc reproduction, it is seen that the recording according to the new standard, that is recording by the B-codec, has been done on the disc.
In addition, if the signals by the B-codec are recorded on the disc, one of the spare areas formed next to the area for recording the address information (start and end addresses) of each audio signal data as shown in FIG. 2 is used as the area for the codec designation information. The codec designation information of `0` specifies that the audio signal data specified by the address information made up of the start and end addresses has been encoded in accordance with the old standard, while the codec designation information of `1` specifies that the audio signal data specified by the address information has been encoded in accordance with the new standard (B-codec).
In this manner, audio signal data encoded by the A-codec can be recorded on the same disc so as to co-exist with that encoded by the B-codec such that the disc can be reproduced by a reproducing device designed to cope with the new standard.
However, with the above disc, it cannot be discriminated from the appearance whether the recording has been done in accordance with the old standard or the new standard. Thus there is a risk for the user to reproduce the disc with a reproducing device designed to cope with only the old standard. At this time, if the reproducing device is designed for reproducing all of the recorded signals on the assumption that these recorded signals are encoded in accordance with the old standard, the reproducing device attempts to reproduce the signals on the assumption that the recorded signals have been encoded in accordance with the A-codec, without checking the address 2, which is set to 0 at all times in the old standard. Thus the risk is high that the disc cannot be reproduced, or that random or haphazard noise is produced, thus inconveniencing the user.
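The difference in behavior between the two kinds of reproducing device may be sketched as follows. The function names and the list representation of the per-number codec designation information are assumptions made for illustration.

```python
A_CODEC, B_CODEC = 0, 1

def decoders_for_new_device(mode_designation, codec_designations):
    # A device coping with the new standard checks the address 2 first and
    # then selects a decoder for each number from its codec designation.
    if mode_designation == 0:
        return ["A"] * len(codec_designations)
    return ["B" if c == B_CODEC else "A" for c in codec_designations]

def misdecoded_by_old_device(codec_designations):
    # A device that disregards both flags decodes every number with the
    # A-codec; return the numbers (counted from 1) for which this is wrong.
    return [i + 1 for i, c in enumerate(codec_designations) if c != A_CODEC]

if __name__ == "__main__":
    flags = [A_CODEC, B_CODEC, A_CODEC, B_CODEC]
    print(decoders_for_new_device(1, flags))
    print(misdecoded_by_old_device(flags))
```

On a disc where the second and fourth numbers are encoded by the B-codec, the device coping with the new standard selects the proper decoder for each number, whereas the device that disregards the flags decodes those two numbers incorrectly.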