1. Field of the Invention
This invention relates to an information management method and an information management apparatus for ensuring the compatibility of a recording medium storing signals that are coded by different methods.
2. Related Background Art
Recording media such as magneto-optic disks that are adapted to record coded signals of acoustic or sound information (to be referred to as audio signals hereinafter) have been gaining an expanding market.
Meanwhile, when recording audio signals on a magneto-optic disk, it is a popular practice to compress the information of the audio signals to reduce the amount thereof by processing them for high-efficiency coding.
Various techniques are known to date for high-efficiency coding of audio signals. They include, for example, the blocking/frequency band splitting system, which is also referred to as transform coding, of blocking the audio signals on a time base by using a predetermined time unit, transforming (spectrum transform) the signals of each block on the time base into signals on a frequency base, splitting them into a plurality of frequency subbands and coding the signals in each band, and the non-blocking frequency band splitting system, which is also referred to as subband coding (SBC), of splitting the audio signals on a time base into a plurality of frequency subbands and coding the signals without blocking the audio signals. Additionally, high-efficiency coding techniques realized by combining transform coding and subband coding have been proposed. With such a technique, for example, the frequency band is divided into subbands by means of subband coding and the signals of each subband are subjected to spectrum transform so as to be transformed into signals on a frequency base, which are then coded on a subband by subband basis.
Filters that are used as band splitting filters for subband coding include so-called QMFs (quadrature mirror filters). “Digital Coding of Speech in Subbands”, R. E. Crochiere, Bell Syst. Tech. J., Vol. 55, No. 8, 1976 describes a QMF. The QMF described in the above document is devised to utilize the phenomenon that, if an aliasing noise is generated by thinning out (decimating) the signals that are subjected to subband coding using the QMF so as to halve their signal rate, the aliasing noise generated by the decimation is cancelled by the aliasing noise generated in the subsequent band synthesis. Therefore, the coding loss can be substantially eliminated by using a QMF as the band splitting filter so long as the signals of each subband are coded with a satisfactory level of accuracy.
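The alias cancellation described above can be illustrated with a minimal sketch. It assumes the shortest possible QMF pair (two-tap sum and difference filters), which is chosen here only for brevity and is not the filter design of the cited document; the function names are hypothetical.

```python
import math

def qmf_analysis(x):
    """Split x (even length) into a low band and a high band, each at half rate."""
    s = 1.0 / math.sqrt(2.0)
    half = len(x) // 2
    low = [s * (x[2 * n] + x[2 * n + 1]) for n in range(half)]   # low-pass, decimated by 2
    high = [s * (x[2 * n] - x[2 * n + 1]) for n in range(half)]  # mirrored high-pass, decimated by 2
    return low, high

def qmf_synthesis(low, high):
    """Recombine the two half-rate bands; the aliasing of each band cancels."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for l, h in zip(low, high):
        x.append(s * (l + h))
        x.append(s * (l - h))
    return x

signal = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.25, 0.0]
low, high = qmf_analysis(signal)
restored = qmf_synthesis(low, high)
```

Each band on its own is an aliased, half-rate signal. Only when both bands are synthesized together, without coarse quantization, is the input restored.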
“Polyphase Quadrature Filters—A New Subband Coding Technique”, Joseph H. Rothweiler, ICASSP 83, BOSTON describes a band splitting technique using a PQF. The PQF described in the above paper is devised to utilize the phenomenon that, if the signals that are subjected to subband coding using the PQF are thinned out to a signal rate corresponding to the related bandwidth and consequently aliasing noises are generated between adjacent subbands, the generated aliasing noises are cancelled by the aliasing noises that are generated between adjacent subbands in the subsequent band synthesis. Therefore, again, the coding loss can be substantially eliminated by using a PQF as the band splitting filter so long as the signals of each subband are coded with a satisfactory level of accuracy.
Spectrum transform techniques include those adapted to split the input audio signals into blocks on the basis of a predetermined time unit (frame) and transform the signals on a time base into those on a frequency base by subjecting them to discrete Fourier transform (DFT), discrete cosine transform (DCT) or modified discrete cosine transform (MDCT) on a block by block basis. For MDCT, refer to “Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation”, J. P. Princen, A. B. Bradley, Univ. of Surrey Royal Melbourne Inst. of Tech. ICASSP 1987.
When DFT or DCT is used for the purpose of spectrum transform of waveform signals on the basis of a time block of M sample data (hereinafter to be referred to as transform block), a total of M independent real number data will be obtained. Then, normally, M1 sample data are made to overlap between two adjacent transform blocks in order to alleviate the connection distortion between transform blocks. Thus, with DFT or DCT, a total of M real number data are obtained on average for every (M-M1) sample data. The M real number data will be subsequently quantized and coded.
When, on the other hand, MDCT is used for the purpose of spectrum transform of waveform signals, a total of M independent real number data will be obtained for each transform block out of 2M samples produced by causing M samples thereof to overlap between two adjacent transform blocks. In other words, when MDCT is used, a total of M real number data are obtained on average for every M sample data. The M real number data will then be quantized and coded. With a decoder adapted to use MDCT for spectrum transform and decode quantized and coded signals, the original waveform signal can be reconstructed by adding the waveform elements obtained through inverse transformation of the coded signals for each block, causing them to interfere with each other.
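The overlap and reconstruction property of MDCT can be checked with a direct, unoptimized sketch. The sine window, the 2/M scaling of the inverse transform, the zero padding at both ends and all function names are assumptions made for this illustration; practical codecs use fast algorithms and their own window shapes.

```python
import math

def sine_window(M):
    # Satisfies w[n]^2 + w[n + M]^2 = 1, which the overlap-add below relies on.
    return [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]

def mdct(block, window):
    """Transform 2M windowed samples into M coefficients."""
    M = len(block) // 2
    return [sum(window[n] * block[n]
                * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                for n in range(2 * M))
            for k in range(M)]

def imdct(coeffs, window):
    """Turn M coefficients back into 2M windowed, time-aliased samples."""
    M = len(coeffs)
    return [window[n] * (2.0 / M)
            * sum(coeffs[k] * math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
                  for k in range(M))
            for n in range(2 * M)]

def analyze_and_resynthesize(signal, M):
    """Code a signal whose length is a multiple of M in blocks of 2M with hop M."""
    window = sine_window(M)
    padded = [0.0] * M + list(signal) + [0.0] * M
    out = [0.0] * len(padded)
    num_coeffs = 0
    for start in range(0, len(padded) - 2 * M + 1, M):
        coeffs = mdct(padded[start:start + 2 * M], window)
        num_coeffs += len(coeffs)
        for n, value in enumerate(imdct(coeffs, window)):
            out[start + n] += value  # overlapping halves add; their aliasing cancels
    return out[M:M + len(signal)], num_coeffs
```

Each block of 2M samples yields only M coefficients, and each hop of M new samples adds one block, so the number of coefficients does not grow relative to the number of time samples (apart from the one extra block caused by the padding in this sketch).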
Generally, the frequency resolution is enhanced to give rise to a phenomenon of concentration of energy on a specific spectrum signal component if the transform blocks for spectrum transform are made long. Therefore, a coding operation can be conducted more efficiently by using MDCT than by using DFT or DCT because, if a long transform block length is used for spectrum transform with MDCT, a half of the total number of sample data are made to overlap between two adjacent transform blocks and the number of the obtained spectrum signal components is not increased relative to the number of the original sample data on the time base. Additionally, the connection distortion between transform blocks of waveform signals can be alleviated by causing adjacent transform blocks to overlap by a sufficiently long span. However, it should be noted that a long transform block means that a larger work area is required for the transform, which may hamper efforts to down-size the signal reproduction means. Particularly, the use of a long transform block can entail a cost rise when it is difficult to raise the degree of integration of semiconductors.
Meanwhile, with the above described technique of splitting the signal frequency bands by means of a filter and spectrum transform, the quantization noise generation band can be limited when quantizing the signal components obtained by the band division. In other words, it is possible to perform a coding operation highly efficiently in terms of the auditory perception by limiting the quantization noise generation band, typically utilizing the masking effect. The masking effect refers to an effect that a large sound hides a small sound to the ears. Thus, the signal sound itself can be made to hide the quantization noise generated as a result of quantization due to the masking effect. Therefore, if audio signals are compressed in a way that maximally exploits the masking effect, the sound reproduced from the audio signals obtained by expanding the compressed audio signals will be almost the same as the original sound to the ears in terms of sound quality. However, it should be noted that the generation of quantization noise has to be controlled in terms of both time and frequency in order to maximally exploit the masking effect. More specifically, the masking effect can vary along the time base in terms of the duration of the effect and as far as an attack where the signal level abruptly rises from a relatively low level to a high level is concerned, the masking effect works only several milliseconds temporally before the attack whereas it works for a considerably long time after the attack. 
Therefore, assuming a transform block containing an attack and low level signals located before and after the attack, if the low level signal lasts for more than several milliseconds before the attack and the quantization noise generated in the transform block is higher in level than the low level signal, the quantization noise is not hidden by the small sound of the low level signal and falls outside the short period in which the attack masks preceding sounds, so that there arises a phenomenon of so-called pre-echo that is very harsh to the ears.
In view of this problem, there are occasions where a technique is used of switching the length of the transform block to be used for spectrum transform depending on the signals contained in the transform block. More specifically, if the transform block contains an attack and low level signals located before and after the attack, the transform block is shortened so that no pre-echo may occur there. It will be appreciated that the coding operation can be conducted more efficiently if the largest one of the absolute values of the signal components in each subband is determined prior to the quantization and the signal components of the band are normalized by referring to the largest value.
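The normalization step mentioned at the end of the preceding paragraph can be sketched as follows. The uniform symmetric quantizer, the word length of at least 2 bits and the function names are assumptions made for illustration only.

```python
def encode_band(values, bits):
    """Normalize a subband by its largest absolute value, then quantize uniformly."""
    scale = max(abs(v) for v in values) or 1.0   # normalization coefficient (1.0 for an all-zero band)
    levels = (1 << (bits - 1)) - 1               # codes span -levels..+levels
    codes = [round(v / scale * levels) for v in values]
    return scale, codes

def decode_band(scale, codes, bits):
    """Undo the normalization using the transmitted scale and word length."""
    levels = (1 << (bits - 1)) - 1
    return [c / levels * scale for c in codes]

band = [0.02, -0.5, 0.3, 0.1]
scale, codes = encode_band(band, bits=4)
approx = decode_band(scale, codes, bits=4)
```

Because every band is scaled so that its peak uses the full code range, a quiet band and a loud band both make full use of the bits they are given; the quantization error is bounded by half a step, that is, scale / (2 * levels).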
When each of the signal components obtained by splitting the frequency band of the audio signal is quantized in a manner as described above, the subbands obtained by splitting the frequency band preferably have bandwidths that match the human sense of hearing. In other words, when splitting the frequency band of an audio signal, it is preferable to divide the audio signal into a plurality of subbands (e.g., 25 subbands) having respective bandwidths that increase as a function of frequency (critical bands).
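A band layout of this kind can be sketched as a table of edges plus a lookup from coefficient index to band. The edge frequencies below are the commonly tabulated critical band limits, closed off at 22.05 kHz to give 25 bands; they are an assumption of this sketch and are not taken from the present text, as are the 44.1 kHz sample rate and the function names.

```python
# Assumed band edges in Hz; the last edge is half of an assumed 44.1 kHz sample rate.
CRITICAL_BAND_EDGES_HZ = [
    0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480, 1720,
    2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000,
    15500, 22050,
]

def band_of_coefficient(k, num_coeffs, sample_rate=44100):
    """Return the index of the band that holds spectral coefficient k."""
    freq = (k + 0.5) * sample_rate / (2 * num_coeffs)  # centre frequency of bin k
    for band in range(len(CRITICAL_BAND_EDGES_HZ) - 1):
        if freq < CRITICAL_BAND_EDGES_HZ[band + 1]:
            return band
    return len(CRITICAL_BAND_EDGES_HZ) - 2  # clamp anything above the last edge

def coefficients_per_band(num_coeffs, sample_rate=44100):
    """Count how many of the num_coeffs spectral coefficients fall in each band."""
    counts = [0] * (len(CRITICAL_BAND_EDGES_HZ) - 1)
    for k in range(num_coeffs):
        counts[band_of_coefficient(k, num_coeffs, sample_rate)] += 1
    return counts

counts = coefficients_per_band(512)
```

With equally spaced transform coefficients, the low bands hold only a few coefficients each while the highest band holds many, which reflects the widening of the bands with frequency.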
Additionally, the operation of coding the data of the subbands obtained by frequency splitting is preferably carried out by allocating a predetermined number of bits or by adaptively allocating an appropriate number of bits to each of the subbands (bit allocation). For instance, the technique of adaptively allocating an appropriate number of bits to the MDCT coefficient data of each subband obtained by MDCT conducted on each transform block will be used for the operation of coding the coefficient data obtained by MDCT.
Two types of techniques are known to date for bit allocation.
“Adaptive Transform Coding of Speech Signals”, R. Zelinski and P. Noll, IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-25, No. 4, August 1977 describes a technique of bit allocation based on the signal size of each subband. However, while a flat quantization noise spectrum is produced to minimize the noise energy with this technique, the noise as actually heard is not minimized in terms of the auditory sense because the technique does not utilize the masking effect.
On the other hand, “The Critical Band Coder—Digital Encoding of the Perceptual Requirements of the Auditory Systems”, M. A. Krasner, MIT, ICASSP 1980 describes a technique of allocating bits to subbands in a fixed manner by determining the necessary S/N ratio for each subband, utilizing the auditory masking effect. However, with this technique, the characteristics observed for a sine wave input are not particularly encouraging because the bit allocation is stationary and invariable.
In an attempt to resolve the above identified problems, there has been proposed a high-efficiency coding technique of splitting the entire allocatable bits into those for a fixed bit allocation pattern predetermined for each small block and those to be allocated depending on the signal size of each block and selecting the splitting ratio depending on a signal related to the input signal so that the fixed bit allocation pattern takes a large ratio when the signal shows a smooth spectrum pattern.
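The splitting of the bit budget can be sketched under stated assumptions. Here the smoothness of the spectrum is measured by the spectral flatness (geometric mean over arithmetic mean of the band energies), the signal-dependent share is distributed in proportion to the log energy above the weakest band, and the result is a real-valued allocation. None of these choices is specified in the text; they only make the idea concrete.

```python
import math

def spectral_flatness(energies):
    """Return a value in (0, 1]; 1 means a perfectly smooth (flat) spectrum."""
    geo = math.exp(sum(math.log(e) for e in energies) / len(energies))
    arith = sum(energies) / len(energies)
    return geo / arith

def hybrid_allocation(energies, fixed_pattern, total_bits):
    """Split total_bits between a fixed pattern and a signal-dependent part.

    energies must be positive. The fixed pattern takes a larger share when
    the spectrum is smooth, the signal-dependent part when it is peaky.
    """
    ratio = spectral_flatness(energies)          # assumed choice of splitting ratio
    fixed_bits = ratio * total_bits
    adaptive_bits = total_bits - fixed_bits
    pattern_sum = sum(fixed_pattern)
    log_e = [math.log2(e) for e in energies]
    weights = [v - min(log_e) for v in log_e]    # log energy above the weakest band
    weight_sum = sum(weights) or 1.0
    return [fixed_bits * p / pattern_sum + adaptive_bits * w / weight_sum
            for p, w in zip(fixed_pattern, weights)]

pattern = [4, 4, 3, 3, 2, 2, 1, 1]
flat = hybrid_allocation([1.0] * 8, pattern, total_bits=40)
tonal = hybrid_allocation([1e-3] * 7 + [10.0], pattern, total_bits=40)
```

For the flat input the allocation follows the fixed pattern; for the tonal input nearly all bits move to the band that holds the dominant component.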
This technique can remarkably improve the overall S/N characteristics in the case of a signal where energy is concentrated on a specific spectrum signal component such as a sine wave because, with this technique, a large number of bits are allocated to the block containing the spectrum signal component. Generally, the human auditory sense is highly sensitive to a signal containing a steep spectrum signal component. Therefore, the use of this technique of improving the S/N characteristics is effective for improving not only the numerical values obtained by observation but also the sound quality as sensed by the auditory perception.
There are many other techniques proposed for bit allocation, according to which models that are by far more sophisticated than the one used with the above described technique can be formed to improve the ability of the coding device of highly efficiently carrying out a coding operation in terms of the human auditory sense.
When allocating bits, it is a general practice to determine a reference value of a real number for bit allocation in order to reliably produce the computationally obtained S/N characteristics and select an integer approximating the reference value for the number of bits that are actually allocated.
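One simple way of turning real-valued reference values into integer word lengths is sketched below. The largest-remainder rule and the function name are assumptions; the sketch also assumes non-negative reference values whose sum is close to the bit budget.

```python
import math

def integer_allocation(reference, budget):
    """Round real-valued bit references to integers that sum to the budget."""
    floors = [math.floor(r) for r in reference]
    leftover = budget - sum(floors)
    # Hand the remaining bits to the bands whose references were cut the most.
    order = sorted(range(len(reference)),
                   key=lambda i: reference[i] - floors[i],
                   reverse=True)
    bits = list(floors)
    for i in order[:max(leftover, 0)]:
        bits[i] += 1
    return bits

reference = [3.7, 2.2, 1.6, 0.5]
bits = integer_allocation(reference, budget=8)
```

Each band ends up within one bit of its reference value while the total stays on budget.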
When forming an actual code string, firstly the quantization accuracy information and the normalization coefficient information are coded in a predetermined number of bits for each subband that is subjected to normalization and quantization. Then, the spectrum signal component that is normalized and quantized is coded.
The ISO Standards (ISO/IEC 11172-3:1993 (E), 1993) describe a high-efficiency coding system that is so devised as to differentiate the number of bits expressing the quantization accuracy information from subband to subband, with which the number of bits expressing the quantization accuracy information is decreased as a function of frequency.
There is also known a technique of determining the quantization accuracy information typically from the normalization coefficient information in a decoder instead of directly encoding the quantization accuracy information. However, with this technique, the relationship between the normalization coefficient information and the quantization accuracy information becomes fixed when the standards are installed so that it is no longer possible to introduce an improved system for controlling the quantization accuracy on the basis of an enhanced auditory model in the future.
Additionally, “A Method for Construction of Minimum Redundancy Codes”, D. A. Huffman, Proc. I.R.E., 40, p. 1098 (1952) describes a method of efficiently coding quantized spectrum signal components by using variable length codes.
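Variable length coding of quantized values can be sketched with a compact Huffman code builder. The sample data and function names are made up for illustration; real codecs typically use fixed, pre-tabulated code books rather than building a code per block.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get short code words."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: a single symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, {symbol: partial code word}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c0, _, t0 = heapq.heappop(heap)      # two least frequent subtrees
        c1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (c0 + c1, tie, merged))
        tie += 1
    return heap[0][2]

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:               # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return out

quantized = [0, 0, 0, 0, 0, 0, 1, 1, -1, -1, 2, -2, 0, 0]
code = huffman_code(quantized)
bitstring = encode(quantized, code)
```

Quantized spectra are usually dominated by small values, so giving those values the shortest code words reduces the bit count compared with a fixed word length.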
Still additionally, Japanese Patent Application Laid-Open No. 6-828633 filed by the applicant of this patent application proposes in its specification and drawings a method of isolating tone-related components that are important to the human auditory sense from the spectrum signal components and coding them separately from the remaining spectrum signal components. With the proposed method, it is possible to efficiently encode audio signals to a high compression ratio practically without degrading the sound quality to the auditory sense.
Note that any of the above listed coding techniques is applicable to each channel of an acoustic signal constituted by a plurality of channels. For instance, any of them may be applied separately to the L channel that corresponds to the left-side loudspeaker and also to the R channel that corresponds to the right-side loudspeaker. Furthermore, any of them may be applied to the (L+R)/2 signal obtained by adding the signal of the L channel and that of the R channel or both of the (L+R)/2 and (L−R)/2 signals for efficient coding. For example, Japanese Patent Application Laid-Open No. 10-336039 filed by the applicant of this patent application proposes in its specification and drawings a method of reducing the bandwidth of the (L−R)/2 signal relative to that of the (L+R)/2 signal, paying attention to the fact that the feeling of stereophony is dominantly affected by low frequency side signals. With this technique, it is possible to efficiently carry out a coding operation, using a reduced number of bits, while maintaining the feeling of stereophony as perceived by the auditory sense. It should be noted here that, since the amount of data required for coding signals of a single channel is half of that required for coding signals of two channels independently, a set of standards providing both a mode for recording monaural signals of a single channel and a mode for recording stereo signals of two channels is popularly established so that signals may be recorded as monaural signals when a long recording time is required on a recording medium.
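The (L+R)/2 and (L−R)/2 signals and their inverse can be written down directly. The sketch below only shows the sum and difference conversion; the bandwidth reduction of the (L−R)/2 signal is not modeled, and the function names are hypothetical.

```python
def to_sum_difference(left, right):
    """Convert L and R channels into (L+R)/2 and (L-R)/2 signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_sum_difference(mid, side):
    """Recover L = mid + side and R = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

left = [0.5, 0.25, -1.0]
right = [0.5, 0.75, -0.5]
mid, side = to_sum_difference(left, right)
restored_left, restored_right = from_sum_difference(mid, side)
```

When the two channels are similar, the (L−R)/2 signal is small and can be coded with few bits; a decoder that uses only the (L+R)/2 signal obtains a monaural mix.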
As described above, novel techniques for improving the coding efficiency have been developed almost incessantly so that, if a set of standards accommodating a newly developed coding technique is used, it will normally be possible to record signals for a prolonged period of time on an information recording medium or, if the recording time is the same, record higher quality audio signals.
When establishing a new set of standards, provisions are normally made to accommodate possible revisions and/or extensions in the future so that flag information and other necessary pieces of information relating to the standards may be recorded on the recording medium in advance. For instance, 1-bit flag information of “0” may be recorded on the recording medium when the standards are established for the first time and the flag information may be turned to “1” when the standards are revised. With this arrangement, an apparatus that is adapted to the revised standards checks if the flag information recorded on the recording medium is equal to “0” or “1” and reads and reproduces signals from the information recording medium according to the revised standards if the flag information is “1”, whereas it reads and reproduces signals from the information recording medium according to the original standards if the flag information is “0” and the apparatus is also adapted to the original standards.
However, if apparatus that can reproduce signals that are recorded according to a set of standards (which is to be referred to as “the old standards” or “the first coding system” hereinafter) become popular and widely used and a new set of standards accommodating a more efficient coding system, which may be superseding standards, (which is to be referred to as “the new standards” or “the second coding system” hereinafter) is established, the users of the apparatus will have to experience the inconvenience of not being able to replay any information recording medium where signals are recorded according to the new standards. Apparatus that can reproduce and/or record signals according to the old standards will be referred to as apparatus adapted to the old standards hereinafter.
Particularly, there may be apparatus that are adapted to the old standards and try to reproduce all the signals recorded on the information recording medium as if they are coded according to the old standards, disregarding the flag information recorded on the information recording medium. In other words, if the information recording medium stores signals coded according to the new standards, the apparatus adapted to the old standards cannot recognize it. Then, if the apparatus adapted to the old standards tries to reproduce signals recorded according to the new standards as if they are signals recorded according to the old standards, the apparatus may not operate properly and/or give rise to terrible noises.
Additionally, if signals coded according to the old standards and those coded according to the new standards are recorded on the same recording medium, a smaller storage area will inevitably be allocated to each of them, making it difficult to maintain a required level of quality for the signals that are recorded and reproduced.
On the other hand, Japanese Patent Application Laid-Open No. 10-302405 filed by the applicant of the present patent application proposes a technique with which an apparatus adapted to the old standards can reproduce signals coded according to the old standards if the recording medium stores both signals coded according to the old standards and those coded according to the new standards, while an apparatus adapted to the new standards can reproduce from the recording medium both signals coded according to the old standards and those coded according to the new standards, and any possible degradation of signal quality that can arise when signals coded according to different sets of standards are recorded on the same information recording medium can be minimized. Note that, in the following description, an apparatus that can reproduce and/or record signals coded according to the new standards, which may be superseding standards, is referred to as apparatus adapted to the new standards.
However, a variety of problems can arise to confuse the user when signals coded according to the old standards are added, by means of an apparatus adapted to the old standards, to an information recording medium storing signals coded according to the old standards and those coded according to the new standards, or when an operation of track splitting and/or track coupling by way of track erasing and track editing is repeatedly conducted.
To be more accurate, while management data (so-called TOC) including track replay mode information, start address information and end address information have to be stored in the management data area of the recording medium defined by the old standards so that they may be referred to by an apparatus adapted to the old standards, data on the additional information (extended information) such as the information on the replay mode adapted to the new standards and necessary for an apparatus adapted to the new standards to reproduce value-added data have to be stored in an area (extended management data area) that can be referred to only by an apparatus adapted to the new standards so that they may not be referred to nor erased by an apparatus adapted to the old standards.
More specifically, assume here that apparatus adapted to the new standards can accommodate both mode a and mode c while apparatus adapted to the old standards can accommodate only mode a and the signals stored in an information recording medium are adapted to both the features of mode a and those of mode c. Also assume that the above signals are divided into two parts by using the editing feature of the apparatus adapted to the old standards and the replay mode information for the signals of the latter part is stored as mode a in the management data area of the information recording medium provided for the old standards. Then, if the information recording medium is replayed by the apparatus adapted to the new standards, the signals stored in the information recording medium can be reproduced only in mode a adapted only to the old standards, although they are actually signals (code string) adapted to both the features of mode a and those of mode c. In such a case, the quality of the signals is no longer maintained and the user of the apparatus adapted to the new standards will be very confused.
Assume now that the signals stored in the information recording medium are adapted to both the features of mode a and those of mode c and that extended replay mode information indicating that the signals are adapted to both mode a and mode c is stored in the extended management data area for the new standards. Also assume that the above signals are erased by an apparatus adapted to the old standards and additional signals are recorded by the apparatus adapted to the old standards in mode a. Then, the extended replay mode information indicating that the signals are adapted to both mode a and mode c is left unerased in the extended management data area for the new standards on the information recording medium. Therefore, when the information recording medium is replayed by an apparatus adapted to the new standards, the apparatus will wrongly recognize that the signals stored on the information recording medium are those adapted to both mode a and mode c on the basis of the extended replay mode information left unerased in the extended management data area. Then, in the worst case, the apparatus adapted to the new standards may run out of control, terribly degrading the signal quality and confusing the user.
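The second scenario can be traced with a toy model of the recording medium. The dictionary layout, the mode labels "a" and "a+c", and the function names are hypothetical and serve only to show how the two management data areas fall out of step.

```python
def new_disc():
    # "toc" stands for the management data area of the old standards;
    # "extended" stands for the extended management data area of the new standards.
    return {"audio": {}, "toc": {}, "extended": {}}

def record_with_new_device(disc, track):
    disc["audio"][track] = "a+c"       # code string usable in mode a and mode c
    disc["toc"][track] = "a"           # what an old-standard apparatus sees
    disc["extended"][track] = "a+c"    # visible only to a new-standard apparatus

def rerecord_with_old_device(disc, track):
    disc["audio"][track] = "a"         # erased and recorded again in mode a
    disc["toc"][track] = "a"
    # The extended area is unknown to this apparatus and is left untouched.

def mode_assumed_by_new_device(disc, track):
    # A new-standard apparatus trusts the extended information when it exists.
    return disc["extended"].get(track, disc["toc"][track])

disc = new_disc()
record_with_new_device(disc, 1)
rerecord_with_old_device(disc, 1)
stale = mode_assumed_by_new_device(disc, 1) != disc["audio"][1]
```

After the old-standard apparatus has re-recorded the track, the new-standard apparatus still reads "a+c" from the extended area although the track now holds mode a signals only, which is the inconsistency described above.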